Transcript
Page 1: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Netflix Data Pipeline

Sudhir Tonse (@stonse), Danny Yuan (@g9yuayon)

Page 2: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

photo credit: http://www.flickr.com/photos/decade_null/142235888/sizes/o/in/photostream/

Netflix is a log-generating company that also happens to stream movies

- Adrian Cockcroft

Page 3: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Data Is the most important asset at Netflix

Page 4: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

If all the data is easily available to all teams, it can be leveraged in new and exciting ways

Page 5: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Dashboard

Page 6: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

~1000 Device Types

Dashboard

Page 7: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

~1000 Device Types

~500 Apps/Web Services

Dashboard

Page 8: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

~1000 Device Types

~500 Apps/Web Services

~100 Billion Events/Day

3.2M messages per second at peak time

3 GB per second at peak time

Dashboard
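To put the figures on this slide in proportion, here is a small back-of-the-envelope calculation. It is only a sketch under stated assumptions: it treats "events" and "messages" as the same unit, takes 1 GB as 10^9 bytes, and the class and method names are made up for illustration.

```java
// Back-of-the-envelope arithmetic on the figures quoted on the slide.
// Assumptions: events and messages are the same unit, 1 GB = 10^9 bytes,
// and a day has 86,400 seconds.
class PipelineThroughput {
    static final double EVENTS_PER_DAY = 100e9;     // ~100 billion events/day
    static final double PEAK_MSGS_PER_SEC = 3.2e6;  // 3.2M messages/s at peak
    static final double PEAK_BYTES_PER_SEC = 3e9;   // 3 GB/s at peak

    /** Daily volume spread evenly over the day. */
    static double averageEventsPerSecond() {
        return EVENTS_PER_DAY / 86_400;
    }

    /** Mean message size implied by the two peak figures. */
    static double averageBytesPerPeakMessage() {
        return PEAK_BYTES_PER_SEC / PEAK_MSGS_PER_SEC;
    }

    /** How much higher the peak rate is than the daily average rate. */
    static double peakToAverageRatio() {
        return PEAK_MSGS_PER_SEC / averageEventsPerSecond();
    }

    public static void main(String[] args) {
        System.out.printf("average events/s:   %.0f%n", averageEventsPerSecond());
        System.out.printf("bytes per message:  %.1f%n", averageBytesPerPeakMessage());
        System.out.printf("peak/average ratio: %.2f%n", peakToAverageRatio());
    }
}
```

Under those assumptions the daily volume works out to roughly 1.16 million events per second on average, the peak is a bit under three times that, and a message at peak is a little under 1 KB.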

Page 9: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Type of Events

• User Interface Events
  • Search Event (‘Matrix’ using PS3 …)
  • Star Rating Event (HoC: 5 stars, Xbox, US, …)

• Infrastructural Events
  • RPC Call (API -> Billing Service, ‘/bill/..’, 200, …)
  • Log Errors (NPE, “Movie is null”, …, …)

• Other Events …

Page 10: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Making Sense of Billions of Events

Page 11: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

http://netflix.github.io + ElasticSearch, Druid

Page 12: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro
Page 13: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

A Humble Beginning

Page 14: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro
Page 15: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro
Page 16: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro
Page 17: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Evolution …Scale!

Page 18: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro
Page 19: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro
Page 20: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

[Diagram: a fleet of Application instances]

Page 21: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

We Want to Process App Data in Hadoop

Page 22: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro
Page 23: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro
Page 24: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro
Page 25: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro
Page 26: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro
Page 27: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Our Hadoop Ecosystem

Page 28: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

@NetflixOSS Big Data Tools

Page 29: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Hadoop as a Service

Page 30: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Pig Scripting on Steroids

Page 31: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Pig Married to Clojure

Page 32: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

S3MPER

S3mper is a library that provides an additional layer of consistency checking on top of Amazon's S3 index through use of a consistent, secondary index.
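The idea of checking a listing against a consistent secondary index can be sketched as follows. This is not S3mper's API: the class and method names are hypothetical, and both the S3 listing and the secondary index are stood in for by plain in-memory sets.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of a listing consistency check; not the S3mper API.
class ListingConsistencyCheck {
    /**
     * Returns the keys that the secondary index says were written
     * but that the (possibly stale) listing does not show yet.
     */
    static Set<String> missingFromListing(Set<String> listing, Set<String> secondaryIndex) {
        Set<String> missing = new TreeSet<>(secondaryIndex);
        missing.removeAll(listing);
        return missing;
    }

    public static void main(String[] args) {
        Set<String> index = new HashSet<>(Arrays.asList("part-0", "part-1", "part-2"));
        Set<String> listing = new HashSet<>(Arrays.asList("part-0", "part-1"));
        Set<String> missing = missingFromListing(listing, index);
        if (!missing.isEmpty()) {
            // A real job might wait and re-list, or fail fast, at this point.
            System.out.println("listing is behind the index, missing: " + missing);
        }
    }
}
```

A caller that finds a non-empty result knows the listing cannot be trusted yet, and can retry or abort before a job silently processes incomplete input.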

Page 33: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Efficient ETL with Cassandra

Cassandra

Page 34: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Offline Analysis

Page 35: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Evolution … Speed!

Page 36: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

hgrep -C 10 -k 5,2,3 'users.*[1-9]{3}' *catalina.out s3://bucket

Page 37: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro
Page 38: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro
Page 39: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro
Page 40: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro
Page 41: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro
Page 42: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

We Want to Aggregate, Index, and Query Data in Real Time

Page 43: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Interactive Exploration

Page 44: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Let’s walk through some use cases

Page 45: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

client activity event

*/name = “movieStarts”

Page 46: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Pipeline Challenges

Page 47: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Pipeline Challenges

• App owners: send and forget

Page 48: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Pipeline Challenges

• App owners: send and forget

• Data scientists: validation, ETL, batch processing

Page 49: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Pipeline Challenges

• App owners: send and forget

• Data scientists: validation, ETL, batch processing

• DevOps: stream processing, targeted search

Page 50: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Message Routing

Page 51: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro
Page 52: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro
Page 53: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

We Want to Consume Data Selectively in Different Ways
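One way to picture selective consumption is a router that hands each event to every sink whose filter matches it. The sketch below is an illustration only: `MessageRouter`, `addRoute`, and `route` are hypothetical names rather than Suro's API, and events are modelled as simple string maps.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical filter-based router; names are illustrative, not Suro's API.
class MessageRouter {
    private static final class Route {
        final Predicate<Map<String, String>> filter;
        final Consumer<Map<String, String>> sink;

        Route(Predicate<Map<String, String>> filter, Consumer<Map<String, String>> sink) {
            this.filter = filter;
            this.sink = sink;
        }
    }

    private final List<Route> routes = new ArrayList<>();

    /** Registers a sink that receives only the events accepted by the filter. */
    void addRoute(Predicate<Map<String, String>> filter, Consumer<Map<String, String>> sink) {
        routes.add(new Route(filter, sink));
    }

    /** Delivers the event to every matching sink and returns how many received it. */
    int route(Map<String, String> event) {
        int delivered = 0;
        for (Route r : routes) {
            if (r.filter.test(event)) {
                r.sink.accept(event);
                delivered++;
            }
        }
        return delivered;
    }

    public static void main(String[] args) {
        MessageRouter router = new MessageRouter();
        List<Map<String, String>> archive = new ArrayList<>();
        List<Map<String, String>> realTime = new ArrayList<>();
        router.addRoute(e -> true, archive::add);                              // everything goes to batch storage
        router.addRoute(e -> "movieStarts".equals(e.get("name")), realTime::add); // only selected events are indexed
        router.route(Collections.singletonMap("name", "movieStarts"));
        router.route(Collections.singletonMap("name", "search"));
        System.out.println(archive.size() + " archived, " + realTime.size() + " indexed");
    }
}
```

The same event can land in several sinks, so a batch store can keep everything while a real-time index sees only the subset it cares about.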

Page 54: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro
Page 55: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

• Message broker

• High-throughput

• Persistent and replicated

Page 56: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

There Is More

Page 57: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Intelligent Alerts

Page 58: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Intelligent Alerts

Page 59: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Guided Debugging in the Right Context

Page 60: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Guided Debugging in the Right Context

Page 61: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Guided Debugging in the Right Context

Page 62: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Guided Debugging in the Right Context

Page 63: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Guided Debugging in the Right Context

Page 64: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Guided Debugging in the Right Context

Page 65: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Guided Debugging in the Right Context

Page 66: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Guided Debugging in the Right Context

Page 67: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

What We Need

Page 68: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

• Ad-hoc query with different dimensions

What We Need

Page 69: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

• Ad-hoc query with different dimensions

• Quick aggregations and Top-N queries

What We Need

Page 70: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

• Ad-hoc query with different dimensions

• Quick aggregations and Top-N queries

• Time series with flexible filters

What We Need

Page 71: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

• Ad-hoc query with different dimensions

• Quick aggregations and Top-N queries

• Time series with flexible filters

• Quick access to raw data using boolean queries

What We Need
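As a reminder of what a Top-N query over one dimension computes, here is a minimal in-memory sketch. It only illustrates the semantics; it says nothing about how Druid or ElasticSearch execute such queries, and the class name and sample values are invented.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// In-memory illustration of Top-N semantics; not how a real store executes it.
class TopN {
    /** Counts occurrences of each dimension value and returns the n most frequent. */
    static List<Map.Entry<String, Long>> topN(List<String> dimensionValues, int n) {
        Map<String, Long> counts = dimensionValues.stream()
                .collect(Collectors.groupingBy(v -> v, Collectors.counting()));
        return counts.entrySet().stream()
                .sorted((a, b) -> {
                    // Highest count first; ties broken alphabetically for stable output.
                    int byCount = Long.compare(b.getValue(), a.getValue());
                    return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
                })
                .limit(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> devices = Arrays.asList("PS3", "Xbox", "PS3", "iPad", "Xbox", "PS3");
        for (Map.Entry<String, Long> e : topN(devices, 2)) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

At pipeline scale the grouping step is the expensive part, which is why this requirement points toward a store built for fast aggregations rather than scans over raw logs.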

Page 72: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Druid

• Rapid exploration of high-dimensional data

• Fast ingestion and querying

• Time series

Page 73: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

ElasticSearch

• Real-time indexing of event streams

• Killer feature: boolean search

• Great UI: Kibana

Page 74: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

The Old Pipeline

Page 75: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

The New Pipeline

Page 76: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro
Page 77: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

There Is More

Page 78: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

It’s Not All About Counters and Time Series

Page 79: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro
Page 80: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

RequestId    Parent Id   Node Id   Service Name   Status
4965-4a74    0           123       Edge Service   200
4965-4a74    123         456       Gateway        200
4965-4a74    456         789       Service A      200
4965-4a74e   456         abc       Service B      200

Status:200
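Rows like these carry enough information to rebuild the call tree for a request: each row's Parent Id points at the Node Id of its caller. The sketch below shows one way to do that, assuming a Parent Id of "0" marks the root; the `TraceTree` and `Span` names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: rebuild a call tree from (parentId, nodeId) trace rows.
class TraceTree {
    static final class Span {
        final String parentId, nodeId, service;
        final int status;

        Span(String parentId, String nodeId, String service, int status) {
            this.parentId = parentId;
            this.nodeId = nodeId;
            this.service = service;
            this.status = status;
        }
    }

    /** Groups spans by parent id, keeping the original row order. */
    static Map<String, List<Span>> childrenByParent(List<Span> spans) {
        Map<String, List<Span>> children = new LinkedHashMap<>();
        for (Span s : spans) {
            children.computeIfAbsent(s.parentId, k -> new ArrayList<>()).add(s);
        }
        return children;
    }

    /** Renders the tree depth-first, two spaces of indent per level. Assumes "0" is the root parent id. */
    static String render(List<Span> spans) {
        StringBuilder out = new StringBuilder();
        append(childrenByParent(spans), "0", 0, out);
        return out.toString();
    }

    private static void append(Map<String, List<Span>> children, String parentId, int depth, StringBuilder out) {
        for (Span s : children.getOrDefault(parentId, Collections.emptyList())) {
            for (int i = 0; i < depth; i++) out.append("  ");
            out.append(s.service).append(" (").append(s.status).append(")\n");
            append(children, s.nodeId, depth + 1, out);
        }
    }

    public static void main(String[] args) {
        List<Span> rows = Arrays.asList(
                new Span("0", "123", "Edge Service", 200),
                new Span("123", "456", "Gateway", 200),
                new Span("456", "789", "Service A", 200),
                new Span("456", "abc", "Service B", 200));
        System.out.print(render(rows));
    }
}
```

For the four rows on the slide this prints Edge Service at the top, Gateway beneath it, and Service A and Service B as siblings under Gateway. This kind of parent-child structure is what a counter or a time series cannot express.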

Page 81: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Distributed Tracing

Page 82: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Distributed Tracing

Page 83: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Distributed Tracing

Page 84: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Distributed Tracing

Page 85: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Distributed Tracing

Page 86: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Distributed Tracing

Page 87: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

A System that Supports All These

Page 88: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

A Data Pipeline To Glue Them All

Page 89: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Make It Simple

Page 90: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Message Producing

Page 91: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Message Producing

• Simple and Uniform API

• messageBus.publish(event)

Page 92: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Consumption Is Simple Too

    consumer.observe().subscribe(new Subscriber<Ackable<IncomingMessage>>() {
        @Override
        public void onNext(Ackable<IncomingMessage> ackable) {
            process(ackable.getEntity(MyEventType.class));
            ackable.ack();
        }
    });

    consumer.pause();
    consumer.resume();

Page 93: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

RxJava

• Functional reactive programming model

• Powerful streaming API

• Separation of logic and threading model

Page 94: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Design Decisions

Page 95: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Design Decisions

• Top Priority: app stability and throughput

Page 96: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Design Decisions

• Top Priority: app stability and throughput

• Asynchronous operations

Page 97: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Design Decisions

• Top Priority: app stability and throughput

• Asynchronous operations

• Aggressive buffering

Page 98: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Design Decisions

• Top Priority: app stability and throughput

• Asynchronous operations

• Aggressive buffering

• Drops messages if necessary
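The combination of these decisions can be sketched with a bounded in-memory buffer whose `publish` never blocks the application and counts a drop when the buffer is full. This is a minimal illustration, not the actual client: `DroppingPublisher` and its methods are hypothetical names, and the background sender thread that would call `drainBatch` is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of "buffer aggressively, drop if necessary".
class DroppingPublisher<T> {
    private final BlockingQueue<T> buffer;
    private final AtomicLong dropped = new AtomicLong();

    DroppingPublisher(int capacity) {
        this.buffer = new ArrayBlockingQueue<>(capacity);
    }

    /** Never blocks the caller: returns false and counts a drop when the buffer is full. */
    boolean publish(T event) {
        boolean accepted = buffer.offer(event);
        if (!accepted) {
            dropped.incrementAndGet();
        }
        return accepted;
    }

    /** Meant to be called from a background sender thread; removes up to max buffered events. */
    List<T> drainBatch(int max) {
        List<T> batch = new ArrayList<>();
        buffer.drainTo(batch, max);
        return batch;
    }

    long droppedCount() {
        return dropped.get();
    }

    public static void main(String[] args) {
        DroppingPublisher<String> publisher = new DroppingPublisher<>(2);
        publisher.publish("a");
        publisher.publish("b");
        publisher.publish("c"); // buffer is full, so this event is dropped
        System.out.println("sent batch " + publisher.drainBatch(10)
                + ", dropped " + publisher.droppedCount());
    }
}
```

The trade-off is deliberate: losing a log event is treated as cheaper than stalling or destabilizing the application that produced it, and the drop counter keeps the loss visible.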

Page 99: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Anything Can Fail

Page 100: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Cloud Resiliency

Page 101: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Fault Tolerance Features

Page 102: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Fault Tolerance Features

• Write and forward with auto-reattached EBS (Amazon’s Elastic Block Store)

Page 103: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Fault Tolerance Features

• Write and forward with auto-reattached EBS (Amazon’s Elastic Block Store)

• disk-backed queue: big-queue

Page 104: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Fault Tolerance Features

• Write and forward with auto-reattached EBS (Amazon’s Elastic Block Store)

• disk-backed queue: big-queue

• Customized scaling down
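The write-and-forward idea can be illustrated with a tiny file-backed spool: a message is appended to disk first and removed only after it has been handed to the downstream sink. This is a simplified sketch, not big-queue or the real server: `WriteAndForward` is a hypothetical name, messages are assumed to be single-line strings, and delivery is at-least-once (a failure halfway through a forward causes the whole spool to be re-sent).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of write-then-forward using a plain file as the spool.
class WriteAndForward {
    private final Path spool;

    WriteAndForward(Path spool) {
        this.spool = spool;
    }

    /** Creates a spool file path inside a fresh temporary directory. */
    static WriteAndForward inTempDir() {
        try {
            return new WriteAndForward(Files.createTempDirectory("spool").resolve("queue.log"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Persist first: the message is on disk before any forwarding is attempted. */
    void write(String message) {
        try {
            Files.write(spool, Collections.singletonList(message), StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Forwards everything spooled so far; the spool is deleted only if the sink never threw. */
    int forward(Consumer<String> sink) {
        try {
            if (!Files.exists(spool)) {
                return 0;
            }
            List<String> pending = Files.readAllLines(spool, StandardCharsets.UTF_8);
            pending.forEach(sink);
            Files.delete(spool);
            return pending.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        WriteAndForward queue = WriteAndForward.inTempDir();
        queue.write("event-1");
        queue.write("event-2");
        List<String> delivered = new ArrayList<>();
        System.out.println("forwarded " + queue.forward(delivered::add) + " messages: " + delivered);
    }
}
```

Because the spool lives on a volume that outlives the process, a restarted instance with the same volume reattached can pick up where the previous one left off.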

Page 105: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro
Page 106: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

There’s More to Do

• Contribute to @NetflixOSS

• Join us :-)

Page 107: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Summary

http://netflix.github.io + ElasticSearch, Druid

Page 108: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

You can build your own web-scale data pipeline using open source components

Page 109: DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro

Thank You!

Sudhir Tonse http://www.linkedin.com/in/sudhirtonse Twitter: @stonse

Danny Yuan http://www.linkedin.com/pub/danny-yuan/4/374/862 Twitter: @g9yuayon

