The Way to Deal with Big Data Problems
Monal Daxini
March 2016
Monal Daxini
Real Time Data Infrastructure
Senior Software Engineer, Netflix
https://www.linkedin.com/in/monaldaxini
@monaldax
#Netflix #Keystone
We help Produce, Store, Process, Move
Events @ scale
Tell me more...
● Big Data Ecosystem @ Netflix
● How we built a scalable event pipeline - Keystone - in a year
○ Replaced legacy system without service disruption
○ Small team of 8 + 1
● Netflix Culture
○ Relevant tenets tagged on the slides
Global Launch - Jan 6, 2016
Over 75M Members
190 Countries
125M hours/day → 11B hours / quarter
14,269 years / day → 1,255,707 years / quarter
1000+ devices
37% of Internet traffic at peak
Netflix Is a Data Driven Company
Content
Product
Marketing
Finance
Business Development
Talent
Infrastructure
← Culture of Analytics →
Data @ Netflix
Data at Rest (batch)
Data in Motion (streaming)
Big Data Systems - batch
Ingestion / Kafka -> Ursula, Aegisthus
Storage / S3, Teradata, Redshift, Druid
Processing / Pig, Hive, Presto, Spark
Reporting / Microstrategy, Tableau, Sting
Scheduling / UC4
Interface / Big Data Portal, Kragle
Open source & Community Driven
Scale - batch
AWS S3 (instead of HDFS)
40 PB compressed (in S3)
Of which 13 PB is event data
Big Data Systems - streaming
Data Pipeline - Keystone
Playback & operational insight - Mantis
Stream Processing* - Spark Streaming
Metrics & monitoring - Atlas
Loosely Coupled, Highly Aligned
Open source & Community Driven
What does culture have to do with big data?
Netflix Culture Deck
Netflix Culture: Freedom & Responsibility
"It may well be the most important document ever to come out of the Valley." 1
Sheryl Sandberg, COO, Facebook
1 Business Insider, 2013
A NETFLIX ORIGINAL SERVICE
How we built an internal-facing, 1 trillion events/day stream processing cloud platform in a year, and how culture played a pivotal role
Freedom & Responsibility
Years ago...
In the Old Days ...
[Diagram: Event Producers → Chukwa → EMR]
About a year ago ...
Chukwa / Suro + Real-Time Branch
[Diagram: Event Producer → Suro Proxy → Suro/Kafka → Suro Router → EMR, Druid, and Consumer Kafka → Stream Consumers]
● Support at-least-once processing
● Scale, ease of operations
● Replace dormant open source software - Chukwa
● Enable future value-adds - Stream Processing as a Service
● Seamless transition to the new platform
Context Not Control
Migrate events to a new pipeline in flight,
while not losing more than 0.1% of them
Context Not Control
Highly Aligned, Loosely Coupled
Jan 2016
Keystone
[Diagram: Event Producer → KS Proxy → Fronting Kafka → Samza Router → EMR and Consumer Kafka → Stream Consumers, all managed by the Control Plane]
Keystone - Scale - Streaming
● 1+ trillion events processed every day
● 1 trillion events ingested per day during holiday season
● 600+ billion events ingested per day (350 billion a year ago)
● 11 million events per second (24 GB per second) at peak
● Up to 10 MB payload / 4 KB average
● 1.3 PB / day
Events & Producers
[Keystone pipeline diagram, as above]
Event Payload is Immutable
At-least-once semantics*
* Once the event makes it to Kafka, there are disaster scenarios where this breaks.
Injected Event Metadata
● GUID
● Timestamp
● Host
● App
Keystone Extensible Wire Protocol
● Backwards and forwards compatible
● Supports JSON; Avro on the horizon
● Invisible to sources & sinks
● Efficient - 10 bytes of overhead per message (framing sketched below)
○ negligible, since message sizes range from hundreds of bytes to 10 MB
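To make the overhead claim concrete, here is a minimal sketch of such framing, assuming the ~10 bytes are a fixed header in front of the injected metadata and the opaque payload. The class and layout below are hypothetical illustrations, not the actual Keystone wire format.

```java
import java.nio.ByteBuffer;

// Hypothetical framing sketch - NOT the actual Keystone wire format.
// Header: version(1) + flags(1) + metadataLen(4) + payloadLen(4) = 10 bytes,
// matching the stated per-message overhead. The metadata section carries the
// injected fields (GUID, timestamp, host, app); the payload stays opaque.
public final class KeystoneFraming {
    static final byte WIRE_VERSION = 1; // bump only for incompatible changes

    public static byte[] frame(byte[] metadata, byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(10 + metadata.length + payload.length);
        buf.put(WIRE_VERSION);
        buf.put((byte) 0);            // flags, e.g. payload format = JSON (Avro later)
        buf.putInt(metadata.length);  // length-prefixed sections let old readers
        buf.putInt(payload.length);   //   skip unknown fields: forward compatible
        buf.put(metadata).put(payload);
        return buf.array();
    }
}
```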
Netflix Kafka Producer
● Best effort delivery - ack = 1
● Prefer dropping events to disrupting the producer app (see the sketch below)
● Resume event production after the Kafka cluster is restored
● Integration with the Netflix ecosystem
● Configurable topic-to-Kafka-cluster routing
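A hedged sketch of that posture using the standard Kafka producer API of the era; the Netflix wrapper itself is internal, so the broker address, topic, and exact settings below are illustrative assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.util.Properties;
import java.util.concurrent.atomic.AtomicLong;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

// Illustrative "best effort, never hurt the app" producer; not Netflix's code.
public final class BestEffortProducer {
    private static final AtomicLong DROPPED = new AtomicLong();

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // placeholder address
        props.put("acks", "1");                            // leader ack only
        props.put("block.on.buffer.full", "false");        // 0.8.2-era setting: fail fast rather than block the app
        props.put("key.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");

        try (KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props)) {
            byte[] event = "example-event".getBytes(StandardCharsets.UTF_8);
            producer.send(new ProducerRecord<>("example-topic", event), (meta, err) -> {
                if (err != null) {
                    // Prefer dropping an event to disrupting the producing app:
                    // count the drop and carry on, never propagate the failure.
                    DROPPED.incrementAndGet();
                }
            });
        }
    }
}
```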
Fronting Kafka Clusters
[Keystone pipeline diagram, as above - Fronting Kafka stage]
Kafka in the Cloud
● Pioneer Tax
● Started with 0.7
● In prod with 0.8.2
● Move to 0.9 & VPC in progress
Fronting Kafka Topic Classification
Based on topics assigned:
● Normal-priority (majority)
● High-priority (streaming activities etc.)
Scale - Kafka (prod)
● ≅3200 d2.xl brokers for regular, failover, & consumer clusters
● 125 Zookeeper nodes
○ Independent Zookeeper cluster per Kafka cluster
● 24 island clusters, 8 per region
○ 3 ASGs per cluster, 1 ASG per zone
○ 24 warm standby 3-node failover clusters
Fronting Kafka Topics
● No dynamic topic creation
● Two copies
● Zone-aware assignment of topic partitions and replicas (sketched below)
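A toy illustration of zone-aware placement with two copies: spread each partition's two replicas across different AWS zones, so a zone outage never takes out both. The function and its inputs are invented for illustration; this is not Netflix's assignment code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical zone-aware replica placement with two copies per partition.
public final class ZoneAwareAssignment {
    /** brokersByZone.get(z) lists the broker ids located in zone z. */
    public static List<int[]> assign(List<List<Integer>> brokersByZone, int partitions) {
        int zones = brokersByZone.size();
        List<int[]> assignment = new ArrayList<>();
        for (int p = 0; p < partitions; p++) {
            int zoneA = p % zones;            // leader zone rotates per partition
            int zoneB = (p + 1) % zones;      // follower always in a different zone
            List<Integer> a = brokersByZone.get(zoneA);
            List<Integer> b = brokersByZone.get(zoneB);
            int leader = a.get((p / zones) % a.size());
            int follower = b.get((p / zones) % b.size());
            assignment.add(new int[] { leader, follower });
        }
        return assignment;
    }
}
```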
In a distributed system, make sure you understand limitations and failures,
even if you don’t know all the features.
- Monal
Fronting Kafka Failover
● Self-service tool
● Blameless culture
● In addition, we do Kafka Kong once a week.
Kafka Management UI (Beta)
Open sourcing is on the roadmap
Open source & Community Driven
Kafka Auditor
Open sourcing is on the roadmap
Open source & Community Driven
Kafka Auditor - one per cluster
● Broker monitoring
● Consumer monitoring
● Heart-beat & continuous message latency (probe sketched below)
● On-demand Broker performance testing
● Built as a service deployable on single or multiple instances
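The heart-beat check lends itself to a small sketch: a companion producer writes its send timestamp to a designated topic, and this probe consumes it back and reports end-to-end latency. The topic and class names are hypothetical and a modern Kafka consumer API is used for brevity; the real auditor is internal to Netflix.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

// Hypothetical heart-beat latency probe; not the Netflix auditor itself.
public final class HeartbeatProbe {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // placeholder address
        props.put("group.id", "kafka-auditor-probe");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("auditor-heartbeat"));
            while (true) {
                // A companion producer writes the send timestamp as the value;
                // the gap to "now" is the continuous end-to-end message latency.
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> r : records) {
                    long latencyMs = System.currentTimeMillis() - Long.parseLong(r.value());
                    System.out.printf("heartbeat latency: %d ms%n", latencyMs);
                }
            }
        }
    }
}
```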
Kafka Cluster Size - Tips
● Per cluster, stay under 10k partitions & 200 brokers
● Leave approx. 40% free disk space on each broker
● Started with AWS zone-aware partition assignments
● We have discovered and filed several bugs
○ Details - Upcoming in Netflix Tech blog
Kafka Contributions
Open source & Community Driven
Routing Service
[Keystone pipeline diagram, as above - Samza Router stage]
Routing Infrastructure
+ Checkpointing Cluster
+ Samza 0.9.1, Go, C
[Diagram: Router Job Manager (Control Plane) schedules Jobs onto ksnodes in an ASG; Zookeeper handles instance id assignment; each ksnode's Custom Go Executor launches ./runJob processes, attaches volumes, and ships logs and snapshots to the Checkpointing Cluster; a reconcile loop with health checks runs every 1 min (sketched below)]
What’s running in a ksnode?
[Diagram: Custom Go Executor running ./runJob processes; logs on ZFS volumes with snapshots; Zookeeper for instance id assignment]
Ksnode Tooling
● Go Tools Server & Client Tools
● Stream logs
● Browse through rotated logs by date
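The reconcile loop from the diagram is a simple pattern to sketch: every minute, diff the jobs the control plane assigned to this ksnode against the ./runJob processes actually alive and healthy, then converge. The real executor is written in Go; purely for consistency with the other snippets this sketch is in Java, and every interface in it is invented for illustration.

```java
import java.util.Set;
import java.util.concurrent.TimeUnit;

// Hypothetical reconcile loop; all interfaces are invented for illustration.
public final class ReconcileLoop {
    interface JobManagerClient { Set<String> assignedJobs(String nodeId); }
    interface ProcessSupervisor {
        Set<String> runningJobs();
        boolean isHealthy(String jobId);   // per-job health check
        void start(String jobId);          // fork a ./runJob process
        void stop(String jobId);           // no-op if not running
    }

    public static void run(String nodeId, JobManagerClient manager,
                           ProcessSupervisor supervisor) throws InterruptedException {
        while (true) {
            Set<String> desired = manager.assignedJobs(nodeId);
            Set<String> running = supervisor.runningJobs();
            for (String job : desired) {
                if (!running.contains(job) || !supervisor.isHealthy(job)) {
                    supervisor.stop(job);   // restart missing or unhealthy jobs
                    supervisor.start(job);
                }
            }
            for (String job : running) {
                if (!desired.contains(job)) supervisor.stop(job); // drop orphans
            }
            TimeUnit.MINUTES.sleep(1);      // "Reconcile Loop - 1 min"
        }
    }
}
```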
Yes! You inferred right!
No Mesos & No Yarn
Distributed Systems are Hard
Keep it Simple, Minimize Moving Parts
Scale - Routing Service
● 13,000 Docker containers (Samza jobs; task sketched below)
○ 7,000 - S3 sink
○ 4,500 - Consumer Kafka sink
○ 1,500 - Elasticsearch sink
● 1,300 AWS C3-4XL instances
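Conceptually, each router job is a small Samza task that moves events from a Fronting Kafka topic to one sink. Below is a minimal sketch against the public Samza 0.9 StreamTask API; the stream names, and the simplification that a job is a plain pass-through to a single Kafka sink, are assumptions on my part.

```java
import org.apache.samza.system.IncomingMessageEnvelope;
import org.apache.samza.system.OutgoingMessageEnvelope;
import org.apache.samza.system.SystemStream;
import org.apache.samza.task.MessageCollector;
import org.apache.samza.task.StreamTask;
import org.apache.samza.task.TaskCoordinator;

// Sketch of a routing job: move events from a Fronting Kafka topic to a
// consumer-facing topic. Real router jobs also target S3 and Elasticsearch
// sinks; the stream names here are placeholders.
public class RouterTask implements StreamTask {
    private static final SystemStream OUTPUT =
            new SystemStream("kafka", "consumer-topic"); // hypothetical sink topic

    @Override
    public void process(IncomingMessageEnvelope envelope,
                        MessageCollector collector,
                        TaskCoordinator coordinator) {
        // Pass the immutable payload through unchanged; offsets are
        // checkpointed externally, giving at-least-once delivery.
        collector.send(new OutgoingMessageEnvelope(OUTPUT, envelope.getMessage()));
    }
}
```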
More Info - Samza Meetup (10/2015)
Samza ver 0.9.1 Contributions
Open source & Community Driven
Chukwa & Keystone Pipeline Shadowing
Target & achieved: <= 0.1% diff between the Chukwa & Keystone pipelines,
over 2.6 PB of data / day
Metrics & Monitoring
[Keystone pipeline diagram, as above]
Customer-facing per-topic end-to-end dashboard
Dev-facing infrastructure end-to-end dashboard
Scaling Avenues
Scaling Up by Scaling Down
● Exposed cost attribution per event producer & topic
○ E.g. one producer reduced its throughput by 600%
● Automation - frees up additional resources
● No dedicated product or project managers
● No separate devops or operational team
● This does not mean we are constantly overworked
○ we make wise and simple choices, and
○ lean towards automation & self-healing systems
We build and run what you saw today!
You build it! You run it!
High Performance
Not DevOps, but a move towards NoOps
You build it! You run it!
● High Performance culture
● Communication
● No culture of process adherence
○ Creativity & Self-Discipline
○ Freedom and Responsibility
Looking into the future?
Stream Processing as a Service
● Multi-tenant polyglot support of streaming engines like Spark Streaming, Mantis, Samza, and maybe Flink
Future steps
Open source & Community Driven
Messaging as a Service
● Kafka & others
Future steps
Open source & Community Driven
Data thruway
● Support for schemas - registry, discovery, validation.
Self Service Tooling
Future steps
Open source & Community Driven
More brain food...
Netflix OSS
Samza Meetup Presentation
Netflix Tech Blog
Spark Summit 2015 Talk