Adam Kawa Data Engineer @ Spotify Hadoop Operations Powered By … Hadoop

Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

DESCRIPTION

At Spotify we collect huge volumes of data for many purposes. Reporting to labels, powering our product features, and analyzing user growth are some of our most common ones. Additionally, we collect many operational metrics related to the responsiveness, utilization and capacity of our servers. To store and process this data, we use scalable and fault-tolerant multi-system infrastructure, and Apache Hadoop is a key part of it. Surprisingly or not, Apache Hadoop generates large amounts of data in the form of logs and metrics that describe its behaviour and performance. To process this data in a scalable and performant manner we use … also Hadoop! During this presentation, I will talk about how we analyze various logs generated by Apache Hadoop using custom scripts (written in Pig or Java/Python MapReduce) and available open-source tools to get data-driven answers to many questions related to the behaviour of our 690-node Hadoop cluster. At Spotify we frequently leverage these tools to learn how fast we are growing, when to buy new nodes, how to calculate the empirical retention policy for each dataset, optimize the scheduler, benchmark the cluster, find its biggest offenders (both people and datasets) and more.

Page 1: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

Adam Kawa, Data Engineer @ Spotify

Hadoop Operations

Powered By … Hadoop

Page 2: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

1. How many times has Coldplay been streamed this month?

2. How many times was “Get Lucky” streamed during the first 24h?

3. Who was the most popular artist in NYC last week?

Labels, Advertisers, Partners

Page 3: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

1. What song to recommend Jay-Z when he wakes up?

2. Is Adam Kawa bored with Coldplay today?

3. How to get Arun to subscribe to Spotify Premium?

Data Scientists

Page 4: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)
Page 5: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

(Big) Data At Spotify

■ Data generated by +24M monthly active users and for users!
- 2.2 TB of compressed data from users per day
- 64 TB of data generated in Hadoop each day (triplicated)

Page 6: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

Data Infrastructure At Spotify

■ Apache Hadoop YARN
■ Many other systems including
- Kafka, Cassandra, Storm, Luigi in production
- Giraph, Tez, Spark in the evaluation mode

Page 7: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

■ Probably the largest commercial Hadoop cluster in Europe!
- 694 heterogeneous nodes
- 14.25 PB of data consumed
- ~12,000 jobs each day

Apache Hadoop

Page 8: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

March 2013: Tricky questions were asked!

Page 9: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

1. How many servers do you need to buy to survive one year?

2. What will you do to use them efficiently?

3. If we agree, don’t come back to us this year! OK?

Finance Department

Page 10: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

■ One of the Data Engineers responsible for answering these questions!

Adam Kawa

Page 11: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

■ Examples of how to analyze various metrics, logs and files

- generated by Hadoop
- using Hadoop
- to understand Hadoop
- to avoid guesstimates!

The Topic Of This Talk

Page 12: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

■ This knowledge can be useful to
- measure how fast HDFS is growing
- define an empirical retention policy
- measure the performance of jobs
- optimize the scheduler
- and more

What To Use It For

Page 13: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

1. Analyzing HDFS

2. Analyzing MapReduce and YARN

Agenda

Page 14: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

HDFS: Garbage Collection On The NameNode

Page 15: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

“We don’t have any full GC pauses on the NN. Our GC stops the NN for less than 100 msec, on average! :)”

Adam Kawa @ Hadoop User Mailing List, December 16th, 2013

Page 16: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

“Today, between 12:05 and 13:00 we had 5 full GC pauses on the NN. They stopped the NN for 34min47sec in total! :(”

Adam Kawa @ Spotify office, Stockholm, January 13th, 2014

Page 17: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

What happened between 12:05 and 13:00?

Page 18: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

The NameNode was receiving the block reports from all the DataNodes

Quick Answer!

Page 22: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

1. We started the NN when the DNs were running

2. 502 DNs immediately registered to the NN
■ Within 1.2 sec (based on logs from the DNs)

3. 502 DNs started sending the block reports
■ dfs.blockreport.initialDelay = 30 minutes
■ 17 block reports per minute (on average)
■ +831K blocks in each block report (on average)

4. This generated a high memory pressure on the NN
■ The NN ran into Full GC !!!

Detailed Answer
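As a quick sanity check, the figures on this slide are consistent with each other. The sketch below only redoes the arithmetic with the numbers quoted above; the variable names are illustrative, and "+831K" is treated as a lower bound.

```python
# Numbers quoted on the slide.
datanodes = 502
reports_per_minute = 17          # average rate seen by the NameNode
blocks_per_report = 831_000      # "+831K", so this is a lower bound

# Time needed for every DataNode to deliver its first block report.
minutes_to_report = datanodes / reports_per_minute

# Total number of block replicas the NameNode had to process.
total_block_replicas = datanodes * blocks_per_report

print(f"{minutes_to_report:.1f} minutes, {total_block_replicas:,} block replicas")
```

The roughly half an hour it takes matches dfs.blockreport.initialDelay = 30 minutes: the DataNodes spread their first reports over that window, and the NameNode had to absorb more than 400 million block replicas within it.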

Page 23: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

Hadoop told us everything!

Page 24: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

■ Enable GC logging for the NameNode
■ Visualize it with e.g. GCViewer
■ Analyze memory usage patterns, GC pauses, misconfiguration

Collecting The GC Stats
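Besides a visual tool, the total Full GC pause time can be extracted from the GC log with a few lines of Python. This is a minimal sketch, not the script used at Spotify. It assumes HotSpot-style logging (for example -XX:+PrintGCDetails with -Xloggc:<file>) where each Full GC event is a single line that contains the text `Full GC` and ends with a `real=<seconds> secs` field. The function names are hypothetical.

```python
import re

# ASSUMPTION: one Full GC event per line, with a trailing "real=<n> secs" field.
FULL_GC = re.compile(r"Full GC.*?real=(\d+(?:\.\d+)?) secs")


def full_gc_pauses(lines):
    """Return the wall-clock duration (in seconds) of every Full GC event."""
    pauses = []
    for line in lines:
        match = FULL_GC.search(line)
        if match:
            pauses.append(float(match.group(1)))
    return pauses


def summarize(lines):
    """Aggregate the Full GC pauses found in an iterable of log lines."""
    pauses = full_gc_pauses(lines)
    return {
        "count": len(pauses),
        "total_secs": sum(pauses),
        "max_secs": max(pauses, default=0.0),
    }
```

Called as `summarize(open(path))` on the NameNode GC log (the path is site-specific), it gives the kind of numbers quoted earlier: how many Full GC pauses there were and how long they stopped the NN in total.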

Page 25: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

Time

Page 26: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

This blue line shows the heap used by the NN

Page 27: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

Loading FsImage

Page 28: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

Start replaying Edit logs

Page 29: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

First block report processed

Page 30: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

25 block reports processed

Page 31: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

131 block reports processed

Page 32: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

5min 39sec of Full GC

Page 33: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

40 block reports processed

Page 34: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

Next Full GC

Page 35: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

Next Full GC !!!

Page 36: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

CMS collector starts at 98.5% of heap…

We fixed that !

Page 37: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

What happened in HDFS between mid-December 2013 and mid-January 2014?

Page 38: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

HDFS: HDFS Metadata

Page 39: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

■ A persistent checkpoint of HDFS metadata
■ It contains information about files + directories
■ A binary file

HDFS FsImage File

Page 40: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

■ Converts the content of FsImage to text formats
- e.g. a tab-separated file or XML
■ Output is easily analyzed by any tools
- e.g. Pig, Hive

HDFS Offline Image Viewer
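For small images, or for prototyping before writing a Pig or Hive job, the tab-separated output can also be processed with plain Python. This is a sketch under assumptions: the column order in `COLUMNS` is what the delimited output is assumed to look like here and should be checked against your Hadoop version. The function names are hypothetical.

```python
from collections import defaultdict

# ASSUMED column order of the tab-separated Offline Image Viewer output.
COLUMNS = [
    "path", "replication", "modification_time", "access_time",
    "block_size", "num_blocks", "file_size",
    "ns_quota", "ds_quota", "permission", "user", "group",
]


def parse_line(line):
    """Turn one tab-separated line into a dict keyed by column name."""
    record = dict(zip(COLUMNS, line.rstrip("\n").split("\t")))
    record["file_size"] = int(record["file_size"])
    return record


def stats_by_top_dir(lines):
    """Count files and bytes under each top-level directory."""
    totals = defaultdict(lambda: {"files": 0, "bytes": 0})
    for line in lines:
        record = parse_line(line)
        if record["permission"].startswith("d"):
            continue  # skip directories, count only files
        top_dir = "/" + record["path"].lstrip("/").split("/", 1)[0]
        totals[top_dir]["files"] += 1
        totals[top_dir]["bytes"] += record["file_size"]
    return {
        top_dir: dict(t, avg_file_size=t["bytes"] / t["files"])
        for top_dir, t in totals.items()
    }
```

A low `avg_file_size` combined with a very high file count is the signature of a small-files problem.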

Page 41: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

50% of the data was created during the last 3 months

Page 42: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)
Page 43: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

Anything interesting?

Page 44: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

1. NO data added that day

2. Many more files added after

Page 45: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

The migration to YARN

Page 46: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)
Page 47: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

Where did the small files come from?

Page 48: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

■ An interactive visualization of data in HDFS

Twitter's HDFS-DU

/app-logs
- avg. file size = 253 KB
- no. of dirs = 595K
- no. of files = 60.6M

Page 49: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

■ Statistics broken down by user/group name
■ Candidates for duplicate datasets
■ Inefficient MapReduce jobs
- Small files
- Skewed files

More Uses Of FsImage File

Page 50: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

■ You can analyze FsImage to learn how fast HDFS grows
■ You can combine it with “external” datasets
- number of daily/monthly active users
- total size of logs generated by users
- number of queries / day run by data analysts

Advanced HDFS Capacity Planning
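Once daily HDFS usage has been derived from FsImage snapshots, the simplest projection is a straight line through the observations. This is a minimal sketch that assumes linear growth, which is an illustration rather than the model used in the talk; combining usage with external datasets such as active users would replace the time axis with a better predictor. The function names are hypothetical.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def days_until_full(days, used_tb, capacity_tb):
    """Days after the last observation until usage reaches capacity.

    Returns None when usage is flat or shrinking.
    """
    slope, intercept = linear_fit(days, used_tb)
    if slope <= 0:
        return None
    day_full = (capacity_tb - intercept) / slope
    return max(0.0, day_full - days[-1])
```

For example, usage of 100, 110 and 120 TB on days 0, 10 and 20 with 200 TB of capacity gives a growth of 1 TB per day and 80 days of headroom.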

Page 51: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

■ You can also use the “trend button” in Ganglia

Simplified HDFS Capacity Planning

If we do NOTHING, we might fill the cluster in September ...

Page 52: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

What will we do to survive longer than September?

Page 53: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

HDFS: Retention

Page 54: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

Question: How many days after creation is a dataset no longer accessed?

Retention Policy

Page 55: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

Question: How many days after creation is a dataset no longer accessed?

Possible Solution
■ You can use modification_time and access_time from FsImage

Empirical Retention Policy
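A sketch of how the two timestamps can be turned into a retention figure. It assumes that files are written once, so modification_time approximates creation time, and that timestamps are formatted as `YYYY-MM-DD HH:MM`; both are assumptions to verify on your data. The function names and the 95% quantile are illustrative choices.

```python
import math
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M"  # ASSUMED timestamp format


def days_between(modification_time, access_time):
    """Days between (approximate) creation and the last recorded access."""
    created = datetime.strptime(modification_time, TIME_FORMAT)
    accessed = datetime.strptime(access_time, TIME_FORMAT)
    return max(0.0, (accessed - created).total_seconds() / 86400)


def empirical_retention_days(file_times, quantile=0.95):
    """Age (in days) by which `quantile` of the files saw their last access.

    `file_times` is an iterable of (modification_time, access_time) pairs
    that belong to one dataset.
    """
    ages = sorted(days_between(m, a) for m, a in file_times)
    if not ages:
        return None
    index = max(0, math.ceil(quantile * len(ages)) - 1)
    return ages[index]
```

Note that access times are only as precise as the NameNode records them (dfs.namenode.accesstime.precision), so very short retention figures should be read with care.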

Page 56: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

■ Logs and core datasets are accessed even many years after creation
■ Many reports are not accessed even an hour after creation
■ Most intermediate datasets are needed for less than a week
■ 10% of data has not been accessed for a year

Our Retention Facts

Page 57: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

HDFS: Hot Datasets

Page 58: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

■ Some files/directories will be accessed more often than others, e.g.:
- fresh logs, core datasets, dictionary files

Idea
■ To process it faster, increase its replication factor while it’s “hot”
■ To save disk space, decrease its replication factor when it becomes “cold”

Hot Dataset

Page 59: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

How to find them?

Page 60: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

■ Logs all filesystem access requests sent to the NN
■ Easy to parse and aggregate
- a tab-separated line for each request

HDFS Audit Log

2014-01-18 15:16:12,023 INFO FSNamesystem.audit: allowed=true ugi=kawaa (auth:SIMPLE) ip=/10.254.28.4 cmd=open src=/metadata/artist/2013-11-27/part-00061.avro dst=null perm=null
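Counting `open` requests per directory is enough to rank candidate hot datasets. The sketch below parses the `key=value` fields of an audit line like the one above. It is an illustration rather than the script used at Spotify, and the function names are hypothetical.

```python
import posixpath
import re
from collections import Counter

# Each field looks like key=value; values end at the next whitespace.
FIELD = re.compile(r"(\w+)=(\S+)")


def parse_audit_line(line):
    """Extract the key=value fields of one audit log line into a dict."""
    return dict(FIELD.findall(line))


def count_opens_by_directory(lines):
    """Count successful 'open' requests per parent directory."""
    counts = Counter()
    for line in lines:
        fields = parse_audit_line(line)
        if (fields.get("allowed") == "true"
                and fields.get("cmd") == "open"
                and "src" in fields):
            counts[posixpath.dirname(fields["src"])] += 1
    return counts


sample = ("2014-01-18 15:16:12,023 INFO FSNamesystem.audit: allowed=true "
          "ugi=kawaa (auth:SIMPLE) ip=/10.254.28.4 cmd=open "
          "src=/metadata/artist/2013-11-27/part-00061.avro dst=null perm=null")
print(count_opens_by_directory([sample]).most_common(1))
```

On a full day of audit logs the same aggregation is better expressed as a Pig or MapReduce job; the parsing logic stays the same.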

Page 61: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

■ JAR files stored in HDFS and used by Pig scripts
■ A dictionary file with metadata about log messages
■ Core datasets: playlists, users, top tracks

Our Hot Datasets

Page 62: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

YARN: MapReduce Jobs Autotuning

Page 63: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

■ There are jobs that we schedule regularly
- e.g. top lists for each country

Idea
■ Before submitting it next time, use statistics from the previous executions of a job
- To learn about its historical performance
- To tweak its configuration settings

Recurring MapReduce Jobs

Page 64: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

We implemented
■ A pre-execution hook that automatically sets
- Maximum size of an input split
- Number of Reduce tasks
■ More settings can be tweaked
- Memory
- Combiner

Jobs Autotuning

Page 65: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

■ Here, the goal is that a task runs approx. 10 min, on average
- Inspired by LinkedIn at Hadoop Summit 2013
- Helpful in extreme cases (short/long running tasks)

A Small PoC ;)
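The idea behind such a hook can be sketched as follows. This is not Spotify's implementation: it assumes that task duration scales roughly linearly with the amount of input per task, and the function name and return shape are hypothetical. The two property names are the standard MapReduce settings for the maximum split size and the number of reducers.

```python
import math

TARGET_TASK_SECS = 600  # the ~10 minute goal from the slide


def autotune(prev_split_bytes, avg_map_secs, prev_reduces, avg_reduce_secs,
             target_secs=TARGET_TASK_SECS):
    """Suggest settings for the next run from the previous run's statistics.

    ASSUMPTION: task time grows linearly with the input given to each task.
    """
    # Grow or shrink the split so that an average map task hits the target.
    split_bytes = int(prev_split_bytes * target_secs / avg_map_secs)
    # Keep the total reduce work constant, spread over target-sized tasks.
    reduces = max(1, math.ceil(prev_reduces * avg_reduce_secs / target_secs))
    return {
        "mapreduce.input.fileinputformat.split.maxsize": split_bytes,
        "mapreduce.job.reduces": reduces,
    }
```

For a job whose map tasks took 1 minute on 128 MB splits and whose 100 reducers took 20 minutes each, this suggests 10 times larger splits and 200 reducers.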

Page 66: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

Another Example - Job Optimized Over Time

Page 67: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

Even perfect manual settings may become outdated when an input dataset grows!

Page 68: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

YARN: MapReduce Statistics

Page 69: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

■ Extracts the statistics from historical MapReduce jobs
- Supports MRv1 and YARN
■ Stores them as Avro files
- Enables easy analysis using e.g. Pig and Hive
■ Similar projects
- Replephant, hRaven

Zlatanitor = Zlatan + Monitor

Zlatanitor

Page 70: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

Low Medium High

Page 71: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

A Slow Node
- 40% lower throughput than the average

Low Medium High

Page 72: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

NIC negotiated 100MbE instead of 1GbE

Low Medium High
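With per-node task throughput extracted from the job statistics, outliers like this one can be flagged automatically. This is a minimal sketch: the input mapping (hostname to average throughput, in any consistent unit) and the threshold rule are assumptions made for illustration.

```python
def find_slow_nodes(throughput_by_node, max_drop=0.4):
    """Return nodes whose throughput is more than `max_drop` below average.

    `throughput_by_node` maps a hostname to its average task throughput.
    """
    if not throughput_by_node:
        return []
    average = sum(throughput_by_node.values()) / len(throughput_by_node)
    threshold = average * (1 - max_drop)
    return sorted(node for node, value in throughput_by_node.items()
                  if value < threshold)
```

Note that the slow node itself lowers the cluster average; on a cluster with hundreds of nodes the effect is negligible, but on a small one the median is a more robust baseline.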

Page 73: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

According to Facebook
■ “Small percentage of machines are responsible for large percentage of failures”
- Worse performance
- More alerts
- More manual intervention

Repeat Offenders

Page 74: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

Adding nodes to the cluster increases performance.

Sometimes, removing (crappy) nodes does too!

Page 75: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

Fixing slow and failing tasks as well!

Page 76: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

YARN: Application Logs

Page 77: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

■ YARN
- can be moved to HDFS
- They are stored as TFiles … :(
- Small and many of them!

Location Of Application Logs

Page 78: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

■ Frequent exceptions and bugs
- Just looking at the last line of stderr shows a lot!
■ Possible optimizations
- Memory and size of map input buffer

What Might Be Checked

a) AttributeError: 'int' object has no attribute 'iteritems'
b) ValueError: invalid literal for int() with base 10: 'spotify'
c) ValueError: Expecting , delimiter: line 1 column 3257 (char 3257)
d) ImportError: No module named db_statistics
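Grouping failed tasks by the exception type on the last line of stderr takes only a few lines. This is a sketch that assumes the stderr text of each task has already been extracted from the aggregated logs; the function names are hypothetical.

```python
from collections import Counter


def last_line(stderr_text):
    """Return the last non-empty line of a task's stderr."""
    lines = [line.strip() for line in stderr_text.splitlines() if line.strip()]
    return lines[-1] if lines else ""


def exception_type(line):
    """Return the text before the first colon, e.g. 'ValueError'."""
    head, separator, _ = line.partition(":")
    return head.strip() if separator else "unknown"


def count_exception_types(stderr_texts):
    """Count how often each exception type ends a task's stderr."""
    return Counter(exception_type(last_line(text)) for text in stderr_texts)
```

Applied to the four messages above, this reports `ValueError` twice and `AttributeError` and `ImportError` once each.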

Page 79: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

YARN: The Capacity Scheduler

Page 80: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

■ We specified capacities and elasticity based on a combination of
- “some” data
- intuition
- desire to shape future usage (!)

Our Initial Capacities

Page 81: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

■ Basic information available on the Scheduler Web UI
■ Take print-screens!
- Otherwise, you will lose the history of what you saw :(

Overutilization And Underutilization

Page 82: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

■ Capacity Scheduler exposes these metrics via JMX
■ Ganglia does NOT display the metrics related to utilization of queues (by default)

Visualizing Utilization Of Queue
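For a quick look without extra tooling, Hadoop daemons also serve their JMX beans as JSON from the `/jmx` path of their web UI. The sketch below extracts per-queue allocated memory from such a payload. The bean naming (`name=QueueMetrics` with `q0`, `q1`, … keys for the queue path) and the `AllocatedMB` attribute are assumptions about the ResourceManager's metrics that should be checked against your version. The function name is hypothetical.

```python
import json


def allocated_mb_by_queue(jmx_json_text):
    """Map queue path (e.g. 'root.production') to its allocated memory in MB.

    ASSUMPTION: queue beans are named like
    '...,name=QueueMetrics,q0=root,q1=<queue>' and carry 'AllocatedMB'.
    """
    result = {}
    for bean in json.loads(jmx_json_text).get("beans", []):
        name = bean.get("name", "")
        if "name=QueueMetrics" not in name:
            continue
        parts = dict(p.split("=", 1) for p in name.split(",") if "=" in p)
        levels = sorted(
            (int(key[1:]), value) for key, value in parts.items()
            if key.startswith("q") and key[1:].isdigit()
        )
        queue = ".".join(value for _, value in levels)
        result[queue] = bean.get("AllocatedMB", 0)
    return result
```

Polling this periodically and appending the result to a file gives a history of queue usage that print-screens of the Scheduler Web UI cannot.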

Page 83: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

■ It collects JMX metrics from Java processes
■ It can send metrics to multiple destinations
- Graphite, cacti/rrdtool, Ganglia
- tab-separated text file
- STDOUT
- and more

Jmxtrans

Page 84: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

■ Our Production queue often borrows resources
- Usually from the Queue3 and Queue4 queues

Overutilization And Underutilization

Page 85: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)
Page 86: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

The Best Time For The Downtime?

Page 87: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

Three Crowns

Page 88: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

Three Crowns = Sweden

Page 89: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

BONUS: Some Cool Stuff From The Community

Page 90: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

■ Aggregates and visualizes Hadoop cluster utilization across users

LinkedIn's White Elephant

Page 91: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

■ Collects run-time statistics from MR jobs
- Stores them in HBase
■ Does not provide built-in visualization layer
- The picture below comes from Twitter's blog

Twitter's hRaven

Page 92: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

That’s all!

Page 93: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

■ Analyzing Hadoop is also a “business” problem
- Save money
- Iterate faster
- Avoid downtimes

Summary

Page 94: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

Thank you!

Page 95: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

■ To my awesome colleagues for great technical review:

Piotr Krewski, Josh Baer, Rafal Wojdyla, Anna Dackiewicz, Magnus Runesson, Gustav Landén, Guido Urdaneta, Uldis Barbans

More Thanks

Page 96: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

Questions?

Page 97: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

Check out spotify.com/jobs or @Spotifyjobs for more information

[email protected]

Check out my blog: HakunaMapData.com

Want to join the band?

Page 98: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

Backup

Page 99: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

■ Tricky question!
■ Use production jobs that represent your workload
■ Use a metric that is independent of the size of the data that you process
■ Optimize one setting at a time

Benchmarking

Page 100: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

Benchmarking

Page 101: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

Benchmarking

Page 102: Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)