Hadoop
- What is MapReduce
- An example MapReduce process
- Job Tracker & Task Tracker
- Anatomy of a file write
- Anatomy of a file read
- Replication & rack awareness
Hadoop is a framework that allows for distributed processing of large data sets across clusters of commodity computers using a simple programming model.
Hadoop was designed to let applications make the most of a cluster architecture by addressing two key points:
1. Layout of data across the cluster, ensuring data is evenly distributed.
2. Design of applications to benefit from data locality.
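The first point, even data layout, can be sketched with a toy placement function. This is a simplification for illustration only: the hypothetical `place_blocks` helper below just assigns blocks round-robin, whereas HDFS's real placement policy also accounts for replication and rack topology.

```python
# Toy sketch (NOT HDFS's actual policy): split a file into fixed-size
# blocks and spread them round-robin across cluster nodes, so no single
# node holds much more data than any other.

def place_blocks(file_size, block_size, nodes):
    """Return a mapping of node -> list of block indices (hypothetical helper)."""
    num_blocks = -(-file_size // block_size)  # ceiling division
    placement = {node: [] for node in nodes}
    for block in range(num_blocks):
        placement[nodes[block % len(nodes)]].append(block)
    return placement

# A 640 MB file with 128 MB blocks yields 5 blocks spread over 3 nodes.
placement = place_blocks(file_size=640, block_size=128,
                         nodes=["node1", "node2", "node3"])
for node, blocks in placement.items():
    print(node, blocks)
```

Because blocks are spread evenly, a later computation can be scheduled on whichever node already holds its input block, which is the data-locality benefit the second point refers to.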
This brings us to the two main mechanisms of Hadoop: HDFS and Hadoop MapReduce.
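The MapReduce mechanism can be illustrated with the classic word-count example. The following is a minimal, pure-Python sketch of the map/shuffle/reduce phases, not Hadoop code; real Hadoop jobs are typically written in Java against the `org.apache.hadoop.mapreduce` API and run distributed across the cluster.

```python
# Toy word-count in the MapReduce style:
#   map     - emit a (word, 1) pair for every word in the input
#   shuffle - group all pairs that share the same key (word)
#   reduce  - sum the counts for each word
from collections import defaultdict

def map_phase(lines):
    for line in lines:
        for word in line.split():
            yield (word, 1)

def shuffle(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["the quick brown fox", "the lazy dog", "the fox"]
result = reduce_phase(shuffle(map_phase(lines)))
print(result)  # {'the': 3, 'quick': 1, 'brown': 1, 'fox': 2, 'lazy': 1, 'dog': 1}
```

In a real cluster the map tasks run in parallel on the nodes that store the input blocks, and the shuffle moves intermediate pairs over the network to the reducers, which is what makes the model scale.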