Implementing Apache Spark on AWS

Introduction

  • Apache Spark link
  • EMR link
  • Redshift link

Warnings

  • AWS offers a managed service that simplifies running Spark: Elastic MapReduce (EMR). AWS also provides a data warehouse service called Redshift. Neither is required to implement a complete Apache Spark data analysis framework on AWS; both are optional. See, for example, Tim Durham’s genomics case study.

Apache Spark