Introduction to MapReduce



- Hadoop
- What is MapReduce
- An example
- MapReduce Process
- Job Tracker & Task Tracker
- Anatomy of File Write
- Anatomy of File Read
- Replication & Rack Awareness

Hadoop is a framework that allows for distributed processing of large data sets across clusters of commodity computers using a simple programming model.

Hadoop was designed to let applications make the most of the cluster architecture by addressing two key points:
1. Layout of data across the cluster, ensuring data is evenly distributed
2. Design of applications so they benefit from data locality

This brings us to the two main mechanisms of Hadoop: HDFS and Hadoop MapReduce.

What is MapReduce

Example: Election
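
One way to read the election analogy in MapReduce terms: each polling station tallies its own ballots independently (the map phase), and the per-candidate counts are then combined into overall totals (the reduce phase). Below is a minimal sketch of that idea as Hadoop MapReduce code; the class names, and the assumption that the input is one candidate name per line, are illustrative rather than taken from the original slides.

```java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class VoteCount {

  // Map phase: each mapper processes a local split of the ballot data,
  // emitting (candidate, 1) for every vote it reads.
  public static class VoteMapper
      extends Mapper<LongWritable, Text, Text, IntWritable> {

    private static final IntWritable ONE = new IntWritable(1);
    private final Text candidate = new Text();

    @Override
    protected void map(LongWritable offset, Text line, Context context)
        throws IOException, InterruptedException {
      candidate.set(line.toString().trim());
      context.write(candidate, ONE);
    }
  }

  // Reduce phase: the shuffle groups all counts for one candidate
  // together, and the reducer sums them into a final tally.
  public static class VoteReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    protected void reduce(Text candidate, Iterable<IntWritable> counts,
        Context context) throws IOException, InterruptedException {
      int total = 0;
      for (IntWritable count : counts) {
        total += count.get();
      }
      context.write(candidate, new IntWritable(total));
    }
  }
}
```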

MapReduce Process
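
At the job level, the map, shuffle, and reduce stages are wired together by a driver that submits the work to the cluster (in classic MapReduce, the JobTracker accepts the submission and hands individual tasks to TaskTrackers, as the following slides discuss). A sketch of such a driver using the standard org.apache.hadoop.mapreduce API, reusing the mapper and reducer sketched above; the HDFS paths are illustrative.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class VoteCountJob {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "vote count");
    job.setJarByClass(VoteCountJob.class);

    // Map phase: tasks are scheduled close to the blocks holding the ballots.
    job.setMapperClass(VoteCount.VoteMapper.class);
    // Reduce phase: receives the shuffled, grouped counts per candidate.
    job.setReducerClass(VoteCount.VoteReducer.class);

    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);

    // Illustrative HDFS paths; adjust to the actual data layout.
    FileInputFormat.addInputPath(job, new Path("/elections/ballots"));
    FileOutputFormat.setOutputPath(job, new Path("/elections/results"));

    // Submit the job and wait; the framework schedules map and reduce
    // tasks across the cluster and handles the shuffle in between.
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```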

Job Tracker

Job Tracker (contd.)

Job Tracker (contd.)

Anatomy of File Write
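
From the client's point of view a write is just an output stream; underneath, the HDFS client asks the NameNode to create the file and allocate blocks, then streams packets through a pipeline of DataNodes (one per replica), with acknowledgements flowing back up the pipeline. A minimal sketch using the standard FileSystem API; the path is illustrative.

```java
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    // create() asks the NameNode to record the new file; the returned
    // stream writes data in packets through a pipeline of DataNodes
    // (one per replica), with acks flowing back to the client.
    try (FSDataOutputStream out = fs.create(new Path("/tmp/demo/output.txt"))) {
      out.write("hello hdfs".getBytes(StandardCharsets.UTF_8));
    }
    // Closing the stream flushes the last packet and tells the
    // NameNode the file is complete.
  }
}
```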

Anatomy of File Read
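
Reading is the mirror image: open() retrieves the block locations from the NameNode, and the stream then reads each block directly from the closest DataNode holding a replica. A minimal sketch, again with an illustrative path.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsReadExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    // open() fetches the block locations from the NameNode; the stream
    // then connects directly to DataNodes, preferring the replica
    // closest to the client (same node, then same rack).
    try (FSDataInputStream in = fs.open(new Path("/tmp/demo/output.txt"));
         BufferedReader reader = new BufferedReader(
             new InputStreamReader(in, StandardCharsets.UTF_8))) {
      String line;
      while ((line = reader.readLine()) != null) {
        System.out.println(line);
      }
    }
  }
}
```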

Replication and Rack Awareness
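
Each HDFS file carries a replication factor (3 by default), and the NameNode's default placement policy is rack-aware: the first replica goes on the writer's node (or a random node for off-cluster clients), and the second and third on two different nodes of another rack. Rack topology is configured on the cluster side, not in client code, but a client can inspect or change a file's replication factor, as this small sketch shows; the path is illustrative.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicationExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path file = new Path("/tmp/demo/output.txt");

    // Current replication factor recorded by the NameNode for this file.
    FileStatus status = fs.getFileStatus(file);
    System.out.println("replication = " + status.getReplication());

    // Request more replicas; the NameNode schedules the extra copies,
    // spreading them across racks where the topology allows.
    fs.setReplication(file, (short) 5);
  }
}
```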