How to Debug and Tune Hadoop

Alex Rovner, Proclivity Systems

Tune Your Cluster

Choose the optimal number of mappers / reducers per node

mapred.tasktracker.map.tasks.maximum

mapred.tasktracker.reduce.tasks.maximum

Oversubscribe the CPU by 20-30% (8 cores can generally handle 10 slots)

Keep the ratio of map slots to reduce slots at about 4:3
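
The slot arithmetic above can be sketched as follows. This is a minimal illustration, not a sizing tool: the function name `slots_per_node` and the choice of 25% (the midpoint of the 20-30% range) are assumptions made for the example.

```python
def slots_per_node(cores, oversubscribe=0.25, map_share=4, reduce_share=3):
    """Split a node's task slots between mappers and reducers.

    oversubscribe: fraction of extra slots beyond the core count
                   (0.25 is an assumed midpoint of the 20-30% range).
    map_share / reduce_share: the 4:3 mapper-to-reducer ratio.
    """
    total = round(cores * (1 + oversubscribe))
    maps = round(total * map_share / (map_share + reduce_share))
    reduces = total - maps
    return maps, reduces


maps, reduces = slots_per_node(8)
print("mapred.tasktracker.map.tasks.maximum =", maps)
print("mapred.tasktracker.reduce.tasks.maximum =", reduces)
```

With 8 cores this gives 10 slots in total, split as 6 map slots and 4 reduce slots, which is the nearest whole-number split to 4:3.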

Tune Hadoop

Adjust memory allocations

mapred.child.java.opts=-Xmx512M

Use 80% of available memory

Do not oversubscribe memory to avoid swapping

Total Memory = Map Slots + Reduce Slots + TT (TaskTracker) + DN (DataNode) + Other Services + OS
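
A rough way to check the memory budget is sketched below. The daemon and OS figures (1024 MB each for the TaskTracker and DataNode, 2048 MB for the OS) and the 16 GB node are illustrative assumptions, not recommendations, and comparing the sum against 80% of physical RAM is one reading of the guidance above. Note that -Xmx caps only the Java heap, so the real footprint of each task JVM is somewhat higher.

```python
def memory_budget_mb(map_slots, reduce_slots, child_heap_mb,
                     tasktracker_mb, datanode_mb, other_mb, os_mb):
    """Sum the terms of the Total Memory formula, in megabytes."""
    task_mb = (map_slots + reduce_slots) * child_heap_mb
    return task_mb + tasktracker_mb + datanode_mb + other_mb + os_mb


def fits(budget_mb, physical_mb, target_fraction=0.8):
    """True if the budget stays within the target share of physical RAM."""
    return budget_mb <= physical_mb * target_fraction


# Assumed example: 6 map + 4 reduce slots at -Xmx512M on a 16 GB node.
budget = memory_budget_mb(6, 4, 512, 1024, 1024, 0, 2048)
print(budget, "MB committed; fits in 80% of 16 GB:", fits(budget, 16384))
```

If the check fails, the options are fewer slots or a smaller child heap; adding slots beyond what memory allows leads to swapping.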

Tune Hadoop

Increase buffers for sorting and shuffling

io.sort.mb & fs.inmemory.size.mb

Set to 60-70% of Java heap size

Set it large enough to avoid disk spills
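
As a small worked example of the 60-70% guideline, the sketch below derives a buffer range from the child heap size. The helper name `sort_buffer_range_mb` is hypothetical, and the 512 MB heap is taken from the -Xmx512M setting shown earlier.

```python
def sort_buffer_range_mb(heap_mb):
    """Return (low, high) buffer sizes at 60% and 70% of the Java heap."""
    return heap_mb * 60 // 100, heap_mb * 70 // 100


low, high = sort_buffer_range_mb(512)
print("io.sort.mb between", low, "and", high, "MB for a 512 MB heap")
```

Whether a value in that range is large enough to avoid spills depends on the size of each map task's output, which the job counters report.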

Compress intermediate data

mapred.compress.map.output

Install native libraries for performance

Use LZO to minimize CPU cycles

Set compression to use BLOCK compression
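
The compression settings could be rendered into a configuration fragment as sketched below. Only mapred.compress.map.output appears in the slides; the codec and compression-type property names and the hadoop-lzo codec class are assumptions based on Hadoop 1.x naming and should be checked against the installed version. The helper `render_properties` is hypothetical.

```python
import xml.etree.ElementTree as ET


def render_properties(props):
    """Render a dict of Hadoop properties as a <configuration> XML string."""
    config = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(config, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(config, encoding="unicode")


settings = {
    # Named in the slides.
    "mapred.compress.map.output": "true",
    # Assumed: Hadoop 1.x property name and hadoop-lzo codec class.
    "mapred.map.output.compression.codec": "com.hadoop.compression.lzo.LzoCodec",
    # Assumed: Hadoop 1.x property for the BLOCK compression type.
    "mapred.output.compression.type": "BLOCK",
}
print(render_properties(settings))
```

The LZO codec only works once the native libraries are installed on every node, as the slide notes.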

Tune Your Job

Use a Combiner where possible!

Combiner is a mini reduce phase on the map side

Reduces the amount of data sent to the reducers

Does not need to be the same class as the reducer
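
The effect of a combiner can be illustrated with a plain-Python simulation of word count. This is a sketch of the idea only and does not use the Hadoop API; the function names and the two input lines are made up for the example.

```python
from collections import Counter


def map_words(line):
    """Map step: emit one (word, 1) pair per word."""
    return [(word, 1) for word in line.split()]


def combine(pairs):
    """Combiner: pre-sum the counts per word on the map side."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return sorted(totals.items())


def reduce_counts(pairs):
    """Reduce step: sum the counts per word across all map outputs."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


lines = ["to be or not to be", "to do is to be"]
raw = [pair for line in lines for pair in map_words(line)]
combined = combine(raw)

print(len(raw), "records shuffled without a combiner")
print(len(combined), "records shuffled with a combiner")
print(reduce_counts(raw) == reduce_counts(combined))
```

Here the combiner and reducer share the same logic because summing is associative. For something like an average, the combiner would have to emit partial (sum, count) pairs instead, which is one case where it cannot be the same class as the reducer.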

Tune Your Job

Set up an appropriate number of reducers

Check job stats to figure out how many reducers are needed

Map output bytes will drive how many reducers you need

Rule of thumb is 1 GB of map output per reducer

Tune Your Job

7.4GB = 7 Reducers
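
The rule of thumb can be written as a one-line calculation. This sketch assumes the "Map output bytes" counter from a previous run of the job is the input, and that rounding to the nearest whole reducer is intended; the function name `reducers_for` is hypothetical.

```python
GB = 1024 ** 3


def reducers_for(map_output_bytes, bytes_per_reducer=GB):
    """Estimate the reducer count at roughly 1 GB of map output each."""
    return max(1, round(map_output_bytes / bytes_per_reducer))


print(reducers_for(int(7.4 * GB)))  # the 7.4 GB example from the slide
```

The `max(1, ...)` guard keeps small jobs from ending up with zero reducers.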
