The Past, Present, and Future of Hadoop at LinkedIn


Carl Steinbach

Senior Staff Software Engineer

Data Analytics Infrastructure Group

LinkedIn

The (Not So) Distant Past

PYMK (People You May Know)

First version implemented in 2006

6-8 million members

Ran on Oracle (foreshadowing!)

Found various overlaps: school, work, etc.

Used common connections: triangle closing

Triangle Closing

[Diagram: three members (Mary, Dave, Steve), with a question mark on the connection that is not yet confirmed]
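As a rough illustration of triangle closing, the sketch below suggests new connections by counting shared connections. It is a minimal in-memory example, not the original PYMK implementation; the function name `suggest_connections`, the adjacency-set representation, and the assumption that Mary is the member Dave and Steve have in common are all hypothetical.

```python
from collections import Counter


def suggest_connections(graph, member):
    """Rank members who are not yet connected to `member` by shared connections."""
    direct = graph.get(member, set())
    counts = Counter()
    for friend in direct:
        for candidate in graph.get(friend, set()):
            # Skip the member themselves and anyone already connected.
            if candidate != member and candidate not in direct:
                counts[candidate] += 1
    # Most shared connections first; ties broken by name for stable output.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Assumed reading of the diagram: Mary knows both Dave and Steve.
graph = {
    "Mary": {"Dave", "Steve"},
    "Dave": {"Mary"},
    "Steve": {"Mary"},
}

print(suggest_connections(graph, "Dave"))
```

Each suggestion closes a triangle: the candidate and the member already share at least one connection, and the count gives a simple ranking signal.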

PYMK Problems

By 2008, 40-50 million members

Still running on Oracle

Failed often

Infrequent data refresh

6 weeks – 6 months!

Humble Beginnings Back in ’08

Success! (circa 2009)

Apache Hadoop 0.20

20-node cluster (repurposed hardware)

PYMK in 3 days!

The Present

Hadoop @ LinkedIn Circa 2016

> 10 Clusters

> 10,000 Nodes

> 1,000 Users

Thousands of workflows, datasets, and ad-hoc queries

MR, Pig, Hive, Gobblin, Cubert, Scalding, Tez, Spark, Presto, …

Two Types of Scaling Challenges

Machines

People and Processes

Scaling Machines

Some Tough Talk About HDFS

Conventional wisdom holds that HDFS
• Scales to > 4k nodes without federation*
• Scales to > 8k nodes with federation*

What’s been our experience?
• Many Apache releases won’t scale past a couple thousand nodes
• Vendor distros usually aren’t much better

Why?
• Scale testing happens after the release, not before
• Most vendors have only a handful of customers with clusters larger than 1k nodes

* Heavily dependent on NN RPC workload, block size, average file size, average container size, etc.

March 2015 Was Not a Good Month

What Happened?

We rapidly added 500 nodes to a 2000 node cluster

(don’t do this!)

NameNode RPC queue length and wait time skyrocketed

Jobs crawled to a halt

What Was the Cause?

A subtle performance/scale regression was introduced upstream

The bug was included in multiple releases

Increased time to allocate a new file

The more nodes you had, the worse it got

How We Used to Do Scale Testing

1. Deploy the release to a small cluster (num_nodes = 100)
2. See if anything breaks
3. If no, then deploy to next largest cluster and goto step 2
4. If yes, figure out what went wrong and fix it

Problems with this approach

• Expensive: developer time + hardware
• Risky: sometimes you can’t roll back!
• Doesn’t always work: overlooks non-linear regressions
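To see why a small-cluster test can miss a non-linear regression, consider the toy model below. Every number in it is made up for illustration: the two cost functions and their coefficients are hypothetical and do not describe the actual upstream bug. The point is only that a cost term growing faster than linearly is nearly invisible at 100 nodes and dominant at a few thousand.

```python
def old_release_cost(num_nodes):
    """Hypothetical per-request cost (ms) that grows linearly with cluster size."""
    return 1.0 + 0.001 * num_nodes


def new_release_cost(num_nodes):
    """Hypothetical regression: the same cost plus a small quadratic term."""
    return old_release_cost(num_nodes) + 0.000001 * num_nodes ** 2


def slowdown(num_nodes):
    """How many times slower the new release is at a given cluster size."""
    return new_release_cost(num_nodes) / old_release_cost(num_nodes)


for n in (100, 500, 2000, 2500):
    print(f"{n:>5} nodes: {slowdown(n):.2f}x slower")
```

In this model the 100-node test cluster shows a slowdown of about one percent, which is easy to dismiss as noise, while the same release is several times slower at the largest size.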


HDFS Dynamometer

• Scale testing and performance investigation tool for HDFS
• High fidelity in all the dimensions that matter
• Focused on the NameNode
• Completely black-box
• Accurately fakes thousands of DNs on a small fraction of the hardware
• More details in forthcoming blog post

Scaling People and Processes

Hadoop Performance Tuning

Why Are Most User Jobs Poorly Tuned?

Too many dials!

Lots of frameworks: each one is slightly different.

Performance can change over time.

Tuning requires constant monitoring and maintenance!

* Tuning decision tree from “Hadoop In Practice”


Dr Elephant: Running Light Without Overbyte

Automated Performance Troubleshooting for Hadoop Workflows

● Detects Common MR and Spark Pathologies:
○ Mapper Data Skew
○ Reducer Data Skew
○ Mapper Input Size
○ Mapper Speed
○ Reducer Time
○ Shuffle & Sort
○ More!
● Explains Cause of Disease
● Guided Treatment Process
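A check for mapper data skew could look something like the sketch below. This is not Dr Elephant's actual heuristic; the function name `detect_skew`, the max-to-median comparison, and the default threshold of 2.0 are assumptions chosen to keep the example short.

```python
from statistics import median


def detect_skew(task_input_bytes, ratio_threshold=2.0):
    """Flag a job whose largest task reads far more input than the typical task.

    Returns (is_skewed, ratio), where ratio is max input / median input.
    """
    if not task_input_bytes:
        return False, 0.0
    typical = median(task_input_bytes)
    if typical == 0:
        return False, 0.0
    ratio = max(task_input_bytes) / typical
    return ratio >= ratio_threshold, ratio


# Hypothetical per-mapper input sizes in MB.
balanced = [100, 110, 95, 105]
skewed = [100, 100, 100, 900]

print(detect_skew(balanced))
print(detect_skew(skewed))
```

A skewed job is held up by its slowest task, so reporting the ratio alongside the verdict gives the user a starting point for treatment, such as repartitioning the input.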


Dr Elephant is Now Open Source

Grab the source code: http://github.com/linkedin/dr-elephant

Read the blog post: http://engineering.linkedin.com/blog/2016/04/dr-elephant-open-source-self-serve-performance-tuning-hadoop-spark

Upgrades are Hard

A totally fictional story:
• The Hadoop team pushes a new Pig upgrade
• The next day thirty flows fail with ClassNotFoundExceptions
• Angry users riot
• Property damage exceeds $30mm

What happened?
• The flows depended on a third-party UDF that depended on a transitive dependency provided by the old version of Pig, but not the new version of Pig

Bringing Shading Out of the Shadows

What most people think it is
• Package artifact and all dependencies in the same JAR + rename some or all of the package names

What it really is
• Static linking for Java

Unfairly maligned by many people

We built an improved Gradle plugin that makes shading easier for inexperienced users


Byte-Ray: “X-Ray Goggles for JAR Files”

Audit Hadoop flows for incompatible and unnecessary dependencies.

Predict failures before they happen by scanning for dependencies that won’t be satisfied post-upgrade.

Proved extremely useful during Hadoop2 migration
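The idea of predicting post-upgrade failures can be sketched with plain set arithmetic. This is not Byte-Ray's implementation, which inspects JAR files; here the class references are assumed to be already extracted, and the function name `predict_failures`, the flow names, and the class names are all hypothetical.

```python
def predict_failures(flows, provided_before, provided_after):
    """Report, per flow, the classes it needs that the upgrade stops providing.

    flows: mapping of flow name -> set of class names the flow references
    provided_before / provided_after: class names the platform supplies
    before and after the upgrade.
    """
    dropped = provided_before - provided_after
    report = {}
    for name, required in flows.items():
        missing = sorted(required & dropped)
        if missing:
            report[name] = missing
    return report


# Hypothetical inputs: flow_a leans on a class the old platform happened to ship.
flows = {
    "flow_a": {"org.example.text.Tokenizer", "org.example.udf.Clean"},
    "flow_b": {"org.example.udf.Clean"},
}
provided_before = {"org.example.text.Tokenizer", "org.example.io.Loader"}
provided_after = {"org.example.io.Loader"}

print(predict_failures(flows, provided_before, provided_after))
```

Running a report like this before the upgrade turns a next-day ClassNotFoundException into a list of flows to fix in advance.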

Byte-Ray in Action

SoakCycle: Real World Integration Testing

The Future?

Dali

2015 was the year of the table

We want to make 2016 the year of the view

Learn more at the Dali talk tomorrow

©2014 LinkedIn Corporation. All Rights Reserved.