March 19, 2015 | Facebook Presto Meetup Steve McPherson

Amazon EMR Facebook Presto Meetup

Page 1: Amazon EMR Facebook Presto Meetup

March 19, 2015 | Facebook Presto Meetup

Steve McPherson

Page 2: Amazon EMR Facebook Presto Meetup

[Figure: AWS Simple Icons sheet (service and resource icon labels)]

AMAZON EMR

Page 3: Amazon EMR Facebook Presto Meetup

Amazon EMR makes Cluster Management easy

• Setup and configuration

• Node monitoring and replacement

• Log aggregation

• CloudWatch integration

• Expand and shrink on demand

• Integration with Spot

• AWS Support
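
As an illustration of "expand and shrink on demand", a running cluster's instance group can be resized from the AWS CLI with `aws emr modify-instance-groups`. The sketch below is not from the slides: the instance group ID and target count are hypothetical placeholders, and the command is only printed unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch only: the ID and count below are placeholders, not values from the deck.
set -eu

INSTANCE_GROUP_ID="ig-XXXXXXXXXXXX"   # hypothetical instance group ID
TARGET_COUNT=4                        # hypothetical new size for that group

# Build the argument list in "$@" so each option stays a single word.
set -- aws emr modify-instance-groups \
  --instance-groups "InstanceGroupId=$INSTANCE_GROUP_ID,InstanceCount=$TARGET_COUNT"

# Print the command for review; it only runs when DRY_RUN=0.
echo "$*"
if [ "${DRY_RUN:-1}" = "0" ]; then
  "$@"
fi
```

Raising `TARGET_COUNT` expands the group and lowering it shrinks the group; the real group ID can be looked up with `aws emr describe-cluster`.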

Page 4: Amazon EMR Facebook Presto Meetup

Extract, Transform & Load

• MapReduce API

• Sqoop

• Spark

• Cascading

• Pig

Data Warehouse

• MR

• Hive

• Spark

• Cascading

• Pig

Report Generation & Ad Hoc Analysis

• Presto

• Hive

• Spark-SQL

• Lingual

Amazon S3 (write / read)

• Parquet

• ORC

• SEQ

• Text

Page 5: Amazon EMR Facebook Presto Meetup

Different clusters for different workloads

Hive, Pig, Cascading

Presto

Spark

HBase

Amazon S3

Page 6: Amazon EMR Facebook Presto Meetup
Page 7: Amazon EMR Facebook Presto Meetup

[Slide: cluster launch command; only the option names below survive]

name, ami-version, applications, ec2-attributes, instance-groups, bootstrap-action

#wait 5 minutes
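
The option names on this slide correspond to options of `aws emr create-cluster`, but the full command did not survive. A minimal sketch of what such a launch might look like follows. Every value (cluster name, AMI version, key pair, instance types and counts, and especially the bootstrap script location) is a hypothetical placeholder, not something recovered from the deck. Note that the CLI spells the flag `--bootstrap-actions`. The command is only printed unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch only: all values are placeholders, not recovered from the slide.
set -eu

CLUSTER_NAME="presto-demo"                          # hypothetical cluster name
AMI_VERSION="3.3.2"                                 # assumed 3.x AMI version
KEY_NAME="my-ec2-key"                               # hypothetical EC2 key pair
BOOTSTRAP_PATH="s3://my-bucket/install-presto.sh"   # hypothetical install script

# Build the argument list in "$@" so each option stays a single word.
set -- aws emr create-cluster \
  --name "$CLUSTER_NAME" \
  --ami-version "$AMI_VERSION" \
  --applications Name=Hive \
  --ec2-attributes "KeyName=$KEY_NAME" \
  --instance-groups \
    InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m3.xlarge \
    InstanceGroupType=CORE,InstanceCount=2,InstanceType=m3.xlarge \
  --bootstrap-actions "Path=$BOOTSTRAP_PATH,Name=InstallPresto"

# Print the command for review; it only launches a cluster when DRY_RUN=0.
echo "$*"
if [ "${DRY_RUN:-1}" = "0" ]; then
  "$@"
fi
```

A real launch may also need IAM roles and networking options that depend on the account. After launching, the slide's comment suggests waiting about five minutes for the cluster to come up.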

Page 8: Amazon EMR Facebook Presto Meetup

hive

presto-cli --catalog hive
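
The `presto-cli --catalog hive` command above opens an interactive session against the Hive catalog. As a hedged sketch, the same client can also run a single statement non-interactively with the Presto CLI's `--schema` and `--execute` options. The schema name and the query here are illustrative assumptions, not taken from the slides, and the command only runs if `presto-cli` is found on the PATH.

```shell
#!/bin/sh
# Sketch only: schema and query are illustrative, not from the deck.
set -eu

SCHEMA="default"        # assumed Hive schema
QUERY="SHOW TABLES"     # illustrative statement

# Build the argument list in "$@" so the query stays a single argument.
set -- presto-cli --catalog hive --schema "$SCHEMA" --execute "$QUERY"

# Print the command, then run it only where the client is installed.
echo "$*"
if command -v presto-cli >/dev/null 2>&1; then
  "$@"
else
  echo "presto-cli not found on PATH; run this on the cluster's master node" >&2
fi
```

Tables created through `hive` in the step before should then be visible to Presto through the same Hive catalog.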

Page 9: Amazon EMR Facebook Presto Meetup
Page 10: Amazon EMR Facebook Presto Meetup

Get started today

http://aws.amazon.com/elasticmapreduce/