Guozhang Wang Kafka Meetup Beijing, April 15, 2017
Apache Kafka Development Experience and Lessons Learned
A Short History of Kafka
A Short History
• 2010.10: First commit of Kafka
• 2011.07: Enters Apache Incubator
• Release 0.7.0: compression, mirror-maker
• 2012.10: Graduated to top-level project
• Release 0.8.0: intra-cluster replication
• 2014.11: Confluent founded
• Release 0.8.2: new producer, quota
• Release 0.9.0: Kafka Connect, new consumer, security
• Release 0.10.0: Kafka Streams, timestamps, rack awareness
What is Kafka, Really?
a scalable pub-sub messaging system.. [NetDB 2011]
Example: Pub-Sub Messaging
[Diagram labels: Tracking Logs / Metrics; Apache Kafka; Hadoop / DW; …]
What is Kafka, Really?
a scalable pub-sub messaging system.. [NetDB 2011]
a real-time data pipeline.. [Hadoop Summit 2013]
Example: Centralized Data Pipeline
[Diagram labels: KV-Store; Doc-Store; RDBMS; Tracking Logs / Metrics; Apache Kafka; Hadoop / DW; Monitoring; Rec. Engine; Social Graph; Searching; Security; …]
What is Kafka, Really?
a scalable pub-sub messaging system.. [NetDB 2011]
a real-time data pipeline.. [Hadoop Summit 2013]
a distributed and replicated log.. [VLDB 2015]
Example: Data Store Geo-Replication
[Diagram: Region 1 and Region 2 each contain User Apps, Local Stores, and Apache Kafka. Flow labels: write, read, append log, mirroring, apply log.]
What is Kafka, Really?
a scalable pub-sub messaging system.. [NetDB 2011]
a real-time data pipeline.. [Hadoop Summit 2013]
a distributed and replicated log.. [VLDB 2015]
a unified data integration stack.. [CIDR 2015]
Example: Async. Micro-Services
What is Kafka, Really?
a scalable pub-sub messaging system.. [NetDB 2011]
a real-time data pipeline.. [Hadoop Summit 2013]
a distributed and replicated log.. [VLDB 2015]
a unified data integration stack.. [CIDR 2015]
All of them!
Kafka: Streaming Platform
• Publish / Subscribe
  • Move data around as online streams
• Store
  • “Source-of-truth” continuous data
• Process
  • React / process data in real-time
How did we get here?
Lesson 1: Build evolvable systems
Upgrading Your Kafka Cluster is like ..
Kafka @ LI
• Release from trunk
  • Push frequency: daily
• Staging cluster
  • Full traffic of production
  • Full monitoring / alerting
• Production ramp-up
  • Prepare to roll back anytime
Kafka: Evolvable System
• Zero down-time
  • Maintenance outage? No such thing.
• All protocols versioned
  • Brokers can talk to older versioned clients
  • And vice versa since 0.10.2!
• One should do no more than rolling bounces
  • Staging before production
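The "all protocols versioned" point can be illustrated with a small sketch of version negotiation: each side supports a range of versions for a request type, and the two use the highest version both understand. This is only an illustration under stated assumptions, not Kafka's actual implementation; `VersionRange` and `negotiate` are hypothetical names, and the version numbers are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VersionRange:
    """Inclusive range of protocol versions one side supports (hypothetical type)."""
    min_version: int
    max_version: int


def negotiate(client: VersionRange, broker: VersionRange) -> Optional[int]:
    """Return the highest version both sides support, or None if the ranges do not overlap."""
    highest_common = min(client.max_version, broker.max_version)
    lowest_common = max(client.min_version, broker.min_version)
    return highest_common if highest_common >= lowest_common else None


# Illustrative numbers only, not real Kafka request versions.
old_client = VersionRange(0, 1)
new_client = VersionRange(0, 3)
old_broker = VersionRange(0, 2)
new_broker = VersionRange(0, 3)
```

With these made-up ranges, a new broker serving an old client falls back to the client's highest version, and a new client talking to an old broker falls back to the broker's highest version, which is the two-way compatibility the slide describes.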
Server-Client Compatibility
[Diagram: version labels 0.10.0, 0.8.2, 0.10.2, 0.8.0, 0.9.0, 0.8.2, 0.10.1, 0.8.1]
Lesson 2: What gets measured gets fixed
Audit Trail @ LI
.. and a LOT of Them
Metrics Reporter
• JMXReporter
  • Included in AK: sensor / metrics -> mbean / attributes
• KafkaReporter
  • Send metrics back to Kafka!
• More..
  • Ganglia, Graphite, statsd ..
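The pluggable-reporter idea above can be sketched as a registry that fans each metrics snapshot out to every registered backend. This is a minimal sketch, not Kafka's reporter API; `Reporter`, `InMemoryReporter`, and `MetricsRegistry` are hypothetical names, and the in-memory reporter stands in for a real backend such as JMX, Graphite, or a Kafka topic.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List


class Reporter(ABC):
    """Hypothetical reporter interface: receives a snapshot of metric name -> value."""

    @abstractmethod
    def report(self, snapshot: Dict[str, float]) -> None:
        ...


class InMemoryReporter(Reporter):
    """Stand-in backend that just remembers every snapshot it was given."""

    def __init__(self) -> None:
        self.received: List[Dict[str, float]] = []

    def report(self, snapshot: Dict[str, float]) -> None:
        self.received.append(dict(snapshot))


class MetricsRegistry:
    """Holds named gauges and sends each snapshot to all registered reporters."""

    def __init__(self) -> None:
        self._gauges: Dict[str, Callable[[], float]] = {}
        self._reporters: List[Reporter] = []

    def add_gauge(self, name: str, read: Callable[[], float]) -> None:
        self._gauges[name] = read

    def add_reporter(self, reporter: Reporter) -> None:
        self._reporters.append(reporter)

    def publish(self) -> Dict[str, float]:
        # Read every gauge once, then hand the same snapshot to each reporter.
        snapshot = {name: read() for name, read in self._gauges.items()}
        for reporter in self._reporters:
            reporter.report(snapshot)
        return snapshot
```

Because the registry only depends on the `Reporter` interface, adding another backend means writing one more subclass; the code that records metrics does not change.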
Confluent Enterprise: Delivery Tracking
Confluent Enterprise: Cluster Health
Lesson 3: APIs stay forever
The Story of KAFKA-1481
Hey, we should stop allowing dashes / underscores in MBean names since they are used in hostnames / topics / etc.
Makes sense, let’s do this!
DONE! (a few lines of key changes)
OMG what happened?
We need to tell our SRE to change their monitoring metrics names now..
That’s N clusters across M data centers ..
Lesson 4: Service needs gatekeepers
One naughty client can bother everyone ..
[Diagram: version labels 0.10.0, 0.8.2, 0.10.2, 0.8.0, 0.9.0, 0.8.2, 0.10.1, 0.8.1]
Multi-tenancy Services
• Security from ground up [Release 0.9.0+]
  • Authentication
  • Authorization
• Resources under control [Release 0.9.0+]
  • Quota on byte rate / request rate per clientId / GroupId etc.
  • Quota on CPU resources
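A byte-rate quota can be enforced by delaying a client that exceeds it. The sketch below computes how long to delay so that the average rate over the measurement window plus the delay falls back to the quota. It is a simplified illustration under that assumption, not Kafka's exact quota code; the function name and parameters are hypothetical.

```python
def throttle_delay_seconds(
    bytes_in_window: float,
    window_seconds: float,
    quota_bytes_per_second: float,
) -> float:
    """Delay needed so the average rate over (window + delay) drops to the quota."""
    if window_seconds <= 0 or quota_bytes_per_second <= 0:
        raise ValueError("window and quota must be positive")
    observed_rate = bytes_in_window / window_seconds
    if observed_rate <= quota_bytes_per_second:
        return 0.0  # Within quota: no throttling.
    # Solve bytes_in_window / (window + delay) == quota for delay.
    overshoot = observed_rate - quota_bytes_per_second
    return overshoot / quota_bytes_per_second * window_seconds
```

A client that sends twice its quota during a one-second window would be delayed for one more second, while a client under its quota is not delayed at all, so one misbehaving client slows only itself.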
Lesson 5: Ecosystems are the key
Remember Hadoop-Producer/Consumer?
What Should Go into AK?
• Container Image?
• Kafka Manager GUI?
• REST Proxy APIs?
• Operation Tooling?
• Other Language Clients?
• Schema Registry?
• Hadoop / etc Integration?
• Stream Processing?
Layered Architecture, not Monolithic
[Diagram components: Messaging; Cross-DC; Data Schemas; Fast ETL; Search & Query; Processing; Consoles]
[Diagram layers: Apache Kafka; Enterprise Offerings; OS Eco-Systems; Third-party Tools; Framework Impls]
API, coding
“Full stack” evaluation
Operations, debugging, …
Simple is Beautiful
Take-aways
• Build evolvable systems
• What gets measured gets fixed
• APIs stay forever
• (Multi-tenant) Services need gatekeepers
• Ecosystems are the key
THANKS!
Guozhang Wang | @guozhangwang
Kafka Summit 2017 @ NYC & SF