An Introduction to MapReduce 2 and YARN
Tom White
April 25, 2012
Seattle Hadoop / Scalability / NoSQL Meetup
Wednesday, April 25, 2012
First, what's MapReduce 1?
What's wrong with MR1?
Motivation
• Scaling beyond 4,000 nodes
• High availability (HA) of the Job Tracker
• Poor resource utilization
YARN: Yet Another Resource Negotiator
Node Manager is a generalized Task Tracker
• Task Tracker
  • fixed number of map or reduce slots
• Node Manager
  • containers with variable resource limits
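The difference between fixed slots and variable-size containers can be sketched with a toy model. This is only an illustration under stated assumptions: the class and method names (SlotsVsContainers, tasksStartedWithSlots, containersStarted) are hypothetical and not part of the Hadoop or YARN API, the node sizes and task counts are made up, and memory is treated as the only resource.

```java
// Toy model only: these names are illustrative and are not the Hadoop or YARN API.
final class SlotsVsContainers {

    // MR1-style Task Tracker: a fixed number of map slots and reduce slots.
    // A map task can only use a map slot, so idle reduce slots go unused.
    static int tasksStartedWithSlots(int mapSlots, int reduceSlots,
                                     int mapTasks, int reduceTasks) {
        return Math.min(mapSlots, mapTasks) + Math.min(reduceSlots, reduceTasks);
    }

    // YARN-style Node Manager: each request names its own memory limit, and
    // requests are granted in order while they still fit in the node's memory.
    static int containersStarted(int nodeMemoryMb, int[] requestedMb) {
        int freeMb = nodeMemoryMb;
        int started = 0;
        for (int mb : requestedMb) {
            if (mb <= freeMb) {
                freeMb -= mb;
                started++;
            }
        }
        return started;
    }

    public static void main(String[] args) {
        // Assumed workload: 8 map tasks of 1024 MB each, no reduce tasks yet,
        // on a node with 8192 MB configured as 4 map slots + 4 reduce slots.
        int[] requests = {1024, 1024, 1024, 1024, 1024, 1024, 1024, 1024};
        System.out.println("slots:      " + tasksStartedWithSlots(4, 4, 8, 0));
        System.out.println("containers: " + containersStarted(8192, requests));
    }
}
```

With these assumed numbers the slot model starts 4 tasks while the container model starts 8, which is one way to read the "poor resource utilization" point from the Motivation slide.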
MR is user space
YARN is kernel
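One way to picture the kernel analogy is a scheduler that hands out containers without knowing which framework runs inside them. The sketch below is a toy model under that assumption; ToyYarn, App, and Scheduler are hypothetical names and do not correspond to real YARN classes, and the container counts are made up.

```java
// Toy model only: illustrative names, not the YARN API.
final class ToyYarn {

    // Any framework ("user space") is just something that asks for containers.
    interface App {
        String name();
        int containersNeeded();
    }

    // The "kernel": hands out containers without knowing what runs in them.
    static final class Scheduler {
        private int freeContainers;

        Scheduler(int totalContainers) {
            this.freeContainers = totalContainers;
        }

        // Grants as many containers as requested, capped by what is still free.
        int allocate(App app) {
            int granted = Math.min(app.containersNeeded(), freeContainers);
            freeContainers -= granted;
            return granted;
        }

        int free() {
            return freeContainers;
        }
    }

    // Small helper to build an App for the demo.
    static App app(final String name, final int containersNeeded) {
        return new App() {
            public String name() { return name; }
            public int containersNeeded() { return containersNeeded; }
        };
    }

    public static void main(String[] args) {
        Scheduler scheduler = new Scheduler(10);
        App[] apps = { app("mapreduce", 6), app("mpi", 6) };
        for (App a : apps) {
            System.out.println(a.name() + " granted " + scheduler.allocate(a));
        }
    }
}
```

In this sketch MapReduce is just one App among others, and the scheduler treats an MPI-style job the same way.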
Bonus Apps
• Distributed shell
• MPI (MAPREDUCE-2911)
• Master-worker (MAPREDUCE-3315)
• Apache Giraph, Hama
Old API ≠ MR1
New API ≠ MR2
       Old API                      New API
       org.apache.hadoop.mapred     org.apache.hadoop.mapreduce
MR1    ✓                            ✓
MR2    ✓                            ✓
Try out MR2
• Apache Hadoop 0.23.1
• CDH4 Beta 2
MR1
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <version>1.0.2</version>
</dependency>

MR2
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <version>0.23.1</version>
</dependency>
TODO
• Still alpha status
• Performance tuning
• Usability bug fixes
• Resource Manager (RM) recovery
• Security in MR2 not complete
Further Reading
Thank You!