EMR AWS Demo

ELASTIC MAP REDUCE OF AWS

Rim Moussa

ENI-Carthage

University of Carthage

2017/2018

DEMO OUTLINE

Simple Storage Service: S3

Job

Jar: MapReduce code

Input: input data files

Output: output data files

All data must be on S3, including the jar and the input data

Create Hadoop Cluster

Size: number of workers

Hardware configuration

Start the job
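The requirement that the jar, the input, and the output all live on S3 can be checked before any cluster is created. The following is a minimal sketch, assuming a hypothetical `JobSpec` container and placeholder bucket names that are not part of the demo.

```python
from dataclasses import dataclass, fields
from urllib.parse import urlparse


@dataclass(frozen=True)
class JobSpec:
    """Hypothetical container for the three locations a job needs."""
    jar: str     # MapReduce code
    input: str   # input data files
    output: str  # output data files


def is_s3_uri(uri):
    """Return True if uri has the form s3://bucket/... with a non-empty bucket."""
    parsed = urlparse(uri)
    return parsed.scheme == "s3" and bool(parsed.netloc)


def non_s3_locations(spec):
    """Return the names of the JobSpec fields that do not point to S3."""
    return [f.name for f in fields(spec) if not is_s3_uri(getattr(spec, f.name))]


if __name__ == "__main__":
    # Placeholder locations, for illustration only.
    spec = JobSpec(
        jar="s3://demo-bucket/jobs/wordcount.jar",
        input="s3://demo-bucket/input/",
        output="/tmp/output",
    )
    print(non_s3_locations(spec))
```

An empty list means every location is on S3 and the job can be submitted as described in the outline.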

S3

CREATE A BUCKET FOR INPUT DATA ON S3

CREATE A NEW S3 BUCKET

UPLOAD DATA INTO THE BUCKET

UPLOAD IN PROGRESS

UPLOAD JAR INTO S3

CREATE EMR HADOOP CLUSTER

CREATE A NEW KEY PAIR

DOWNLOAD NEW KEY AND CHANGE ACCESS RIGHTS
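The SSH client refuses a private key file that other users can read, which is why the access rights must be changed after the download. A minimal sketch of the equivalent of `chmod 400`, assuming the key was saved to a local path of your choice (the function name is hypothetical):

```python
import os
import stat


def restrict_key_permissions(key_path):
    """Make the private key readable by its owner only (same as `chmod 400`).

    Returns the resulting permission bits so the caller can verify them.
    """
    os.chmod(key_path, stat.S_IRUSR)
    return stat.S_IMODE(os.stat(key_path).st_mode)
```

On Linux or macOS, the returned value should be `0o400` for a key that was downloaded with default permissions.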

CLUSTER PROVISIONING

HARDWARE TAB
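The choices made in the console (cluster size, instance type, key pair) can also be expressed as a request for the boto3 EMR client. The sketch below only builds the request dictionary; the instance type, the release label, and the function name are placeholder assumptions, not values taken from the demo.

```python
def build_cluster_request(name, key_name, worker_count,
                          instance_type="m4.large",      # placeholder type
                          release_label="emr-5.12.0"):   # placeholder release
    """Build keyword arguments for boto3's EMR `run_job_flow` call."""
    if worker_count < 1:
        raise ValueError("a cluster needs at least one worker")
    return {
        "Name": name,
        "ReleaseLabel": release_label,
        "Applications": [{"Name": "Hadoop"}],
        "Instances": {
            "MasterInstanceType": instance_type,
            "SlaveInstanceType": instance_type,
            # InstanceCount is the total: one master plus the workers.
            "InstanceCount": worker_count + 1,
            "Ec2KeyName": key_name,
            # Keep the cluster alive so that jobs can be submitted later.
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        # Default role names created by EMR; your account may use others.
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }
```

Assuming boto3 is installed and credentials are configured, the request could be sent with `boto3.client("emr").run_job_flow(**request)`, whose response contains the new cluster identifier under `"JobFlowId"`.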

BOOTSTRAPPING

CONNECT TO MASTER USING SSH
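On EMR, the login user on the master node is `hadoop`, and the connection uses the private key of the key pair chosen at cluster creation. A minimal sketch that assembles the command; the key file name and the master DNS name in the example are placeholders:

```python
import shlex


def ssh_command(key_path, master_dns, user="hadoop"):
    """Return the argument list of the ssh command for the EMR master node."""
    return ["ssh", "-i", key_path, f"{user}@{master_dns}"]


if __name__ == "__main__":
    # Placeholder values; copy the real master public DNS from the console.
    cmd = ssh_command("demo-key.pem", "ec2-203-0-113-10.compute-1.amazonaws.com")
    print(shlex.join(cmd))
```

The list form can be passed directly to `subprocess.run`, which avoids quoting problems when the key path contains spaces.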

RESIZE A CLUSTER

Add or drop instances
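Resizing can also be scripted: the boto3 EMR client offers `modify_instance_groups`, which sets the target number of instances of a group. A sketch under the assumption that the cluster uses instance groups and that you already know the group identifier (the value in the comment is a placeholder, and the function name is hypothetical):

```python
def resize_instance_group(emr_client, instance_group_id, new_count):
    """Ask EMR to grow or shrink one instance group to new_count instances.

    emr_client is expected to be a boto3 EMR client, e.g. boto3.client("emr");
    instance_group_id looks like "ig-XXXXXXXX" (placeholder).
    """
    if new_count < 0:
        raise ValueError("instance count cannot be negative")
    emr_client.modify_instance_groups(
        InstanceGroups=[
            {"InstanceGroupId": instance_group_id, "InstanceCount": new_count}
        ]
    )
```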

INSTANCES ARE RUNNING

SUBMIT A MAPREDUCE JOB

JOB DETAILS
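The job details entered in the console (jar location and arguments) correspond to a "step" in the EMR API. The sketch below builds such a step; it assumes that the jar takes the input and output locations as its two arguments, which depends on how the MapReduce driver was written.

```python
def build_jar_step(name, jar_uri, input_uri, output_uri, main_class=None):
    """Build one custom-jar step for boto3's EMR `add_job_flow_steps` call."""
    hadoop_jar_step = {
        "Jar": jar_uri,
        # Assumption: the driver expects <input> <output> as its arguments.
        "Args": [input_uri, output_uri],
    }
    if main_class:
        # Only needed when the jar manifest does not name a main class.
        hadoop_jar_step["MainClass"] = main_class
    return {
        "Name": name,
        # CONTINUE keeps the cluster running if the step fails.
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": hadoop_jar_step,
    }
```

With a boto3 EMR client, the step could be submitted with `emr.add_job_flow_steps(JobFlowId=cluster_id, Steps=[step])`, where `cluster_id` is the identifier of the running cluster.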

JOB IS RUNNING

/!\ TERMINATE CLUSTER
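A cluster keeps being billed for as long as its instances run, so termination is worth scripting. A minimal sketch using the boto3 EMR client's `terminate_job_flows` call; the wrapper function name is hypothetical:

```python
def terminate_cluster(emr_client, cluster_id):
    """Request termination of one EMR cluster.

    emr_client is expected to be a boto3 EMR client, e.g. boto3.client("emr");
    cluster_id is the identifier shown in the console (it starts with "j-").
    """
    emr_client.terminate_job_flows(JobFlowIds=[cluster_id])
```

Results written to S3 remain available after termination; data stored only on the cluster nodes is lost.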

FIND OUT

S3

How to upload data from a terminal to S3

Scenario where the data is somewhere on the Internet
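One possible answer, offered as a starting point rather than the expected solution: from a terminal, the AWS CLI command `aws s3 cp <local-file> s3://<bucket>/<key>` uploads a local file. For data that sits on a remote server, the download can be streamed to S3 without saving it locally first. The sketch below relies on the boto3 S3 client's `upload_fileobj` method; the function name is hypothetical.

```python
def stream_to_s3(s3_client, source, bucket, key):
    """Upload a binary file-like object to S3 and return its s3:// URI.

    s3_client is expected to be a boto3 S3 client, e.g. boto3.client("s3");
    source is any object with a binary read() method.
    """
    s3_client.upload_fileobj(source, bucket, key)
    return f"s3://{bucket}/{key}"
```

For remote data, `source` could be the response object returned by `urllib.request.urlopen(url)`, assuming the server allows a plain HTTP download.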

Hadoop Master

Compile the job on the master

Submit the job from a terminal on the master

Performance Tuning

Hadoop cluster configuration

RAM allocated to each mapper and reducer

Data Compression

Code

Input Split Size

Adjust the number of reducers
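As a starting point for the tuning questions, the sketch below maps the items above to standard Hadoop 2 property names and shows how the split size drives the number of map tasks. The numeric values are placeholders, and the `-D` options are only honoured if the job driver parses generic options (for example through `ToolRunner`).

```python
def estimated_map_tasks(input_bytes, split_bytes):
    """Rough number of map tasks for one splittable file: ceil(size / split)."""
    if split_bytes <= 0:
        raise ValueError("split size must be positive")
    return -(-input_bytes // split_bytes)  # integer ceiling division


def tuning_args(map_mb, reduce_mb, reducers, split_bytes, compress_map_output=True):
    """Return `-D key=value` arguments for a `hadoop jar` command line."""
    props = {
        "mapreduce.map.memory.mb": map_mb,            # RAM per mapper
        "mapreduce.reduce.memory.mb": reduce_mb,      # RAM per reducer
        "mapreduce.job.reduces": reducers,            # number of reducers
        "mapreduce.input.fileinputformat.split.maxsize": split_bytes,
        "mapreduce.map.output.compress": str(compress_map_output).lower(),
    }
    args = []
    for key, value in props.items():
        args += ["-D", f"{key}={value}"]
    return args


if __name__ == "__main__":
    split = 128 * 1024 * 1024  # placeholder: 128 MiB splits
    print(estimated_map_tasks(1_000_000_000, split))
    print(" ".join(tuning_args(2048, 4096, 4, split)))
```

Halving the split size roughly doubles the number of map tasks, which helps only while the cluster has idle containers to run them.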