ben-stopford
Today
• What is Kafka? (High level fluffy stuff)
• What makes it tick? (Low level geeky stuff)
• How can you use it? (Architect oriented stuff)
The Log: scalable, fault tolerant, concurrent, strongly ordered, stateful
The Log Connectors Connectors
Producer Consumer
Streaming Engine
Clients: JVM and C native implementations, plus Go, Python and many more (open source)
Connectors: plug into your database of choice
Streaming Engine: the declarative power of a database, wrapped into a Kafka client
What is messaging in essence?
• Take a message, keep it safe, make it available to consumers.
• Track which messages have been consumed.
Kafka attacks these problems separately
The log is a simple idea
Messages are added at the end of the log
Just think of the log as a file
(Diagram: the log runs from old messages to new ones. It is read sequentially from disk, with no random access and no index.)
Kafka avoids indexes by keeping the approach simple (indexes impede scalability in this context).
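The two concerns named earlier (keeping messages safe, and tracking what has been consumed) can be sketched separately. The following is a minimal in-memory illustration, not Kafka's storage format: the `Log` and `Consumer` classes and their method names are hypothetical. The log only appends and reads sequentially from an offset, while each consumer remembers its own position.

```python
class Log:
    """Append-only sequence of messages; a message's offset is its position."""

    def __init__(self):
        self._messages = []

    def append(self, message):
        """Add a message at the end of the log and return its offset."""
        self._messages.append(message)
        return len(self._messages) - 1

    def read(self, offset, max_messages=10):
        """Read sequentially from `offset`, oldest message first."""
        return self._messages[offset:offset + max_messages]


class Consumer:
    """Tracks its own position, so the log holds no per-consumer state."""

    def __init__(self, log):
        self.log = log
        self.offset = 0

    def poll(self, max_messages=10):
        batch = self.log.read(self.offset, max_messages)
        self.offset += len(batch)  # advance only past what was returned
        return batch


log = Log()
for message in ["a", "b", "c"]:
    log.append(message)

fast = Consumer(log)
slow = Consumer(log)
fast_batch = fast.poll()                # reads everything available
slow_batch = slow.poll(max_messages=1)  # reads one message and stops
```

Because the position lives with the consumer, a slow reader does not hold back a fast one, and either can rewind simply by resetting its offset.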
Shard data to get scalability
Messages are sent to different partitions
Producer (1) Producer (2) Producer (3)
Cluster of machines: partitions live on different machines
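One way to picture how producers spread messages across partitions is to hash the message key. This is a sketch under assumptions: `partition_for` is a hypothetical helper, and CRC32 is only a stand-in for whatever hash a real client uses. The point it shows is that the same key always lands on the same partition, so ordering holds per key.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition; equal keys always map to the same one."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


NUM_PARTITIONS = 3
partitions = [[] for _ in range(NUM_PARTITIONS)]

# Three producers could run this independently and still agree on placement.
for key, value in [("alice", 1), ("bob", 2), ("alice", 3)]:
    partitions[partition_for(key, NUM_PARTITIONS)].append((key, value))

alice_messages = [m for p in partitions for m in p if m[0] == "alice"]
```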
Replicate to get fault tolerance
(Diagram: (1) a message is written to the leader on Machine A and replicated to Machine B. (2) When a machine is lost, mastership moves machines and Machine B becomes the leader.)
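The replication and failover steps above can be sketched as follows. This is a deliberately simplified model, not Kafka's replication protocol: `Replica`, `Partition` and `elect_leader` are hypothetical names, replication is shown as a synchronous copy, and the new leader is simply the first surviving replica.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.log = []
        self.alive = True


class Partition:
    def __init__(self, replicas):
        self.replicas = replicas
        self.leader = replicas[0]

    def elect_leader(self):
        """Move leadership to the first replica that is still alive."""
        survivors = [r for r in self.replicas if r.alive]
        if not survivors:
            raise RuntimeError("no live replica left for this partition")
        self.leader = survivors[0]

    def write(self, message):
        """Write via the leader, then copy to every other live replica."""
        if not self.leader.alive:
            self.elect_leader()
        self.leader.log.append(message)
        for replica in self.replicas:
            if replica.alive and replica is not self.leader:
                replica.log.append(message)


machine_a = Replica("Machine A")
machine_b = Replica("Machine B")
partition = Partition([machine_a, machine_b])

partition.write("msg-1")   # (1) leader is Machine A, copy goes to Machine B
machine_a.alive = False    # Machine A is lost
partition.write("msg-2")   # (2) leadership moves to Machine B
```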
Kafka goes a step further
A single topic can be spread over multiple consumers
(4 consuming machines process a single topic)
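Spreading one topic over several consumers amounts to giving each consuming machine its own subset of the topic's partitions. A minimal sketch, assuming a plain round-robin split: the `assign` function and the consumer names are hypothetical and do not reproduce any specific Kafka assignment strategy.

```python
def assign(partitions, consumers):
    """Distribute partitions over consumers round-robin.

    Each partition goes to exactly one consumer, so ordering within a
    partition is preserved while the topic is processed in parallel.
    """
    assignment = {consumer: [] for consumer in consumers}
    for i, partition in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(partition)
    return assignment


topic_partitions = list(range(8))
four_consumers = assign(topic_partitions, ["c1", "c2", "c3", "c4"])

# If one machine leaves, the same partitions are re-split over the rest.
three_consumers = assign(topic_partitions, ["c1", "c2", "c3"])
```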
Linearly Scalable Architecture
Single topic:
- Many producer machines
- Many consumer machines
- Many broker machines
No Bottleneck!!
Strong Consistency
Send Message
3 replicas on different machines
• Only 1 elected leader
• Only the leader can be written to or read from
Replication Protocol
Number of replicas is a soft quorum (set min/max tolerable values)
Writer
Reader
Replication is used for resiliency. There is no need to flush to disk synchronously. You can flush if you wish, but no one does.
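The soft quorum mentioned above can be read as a lower bound on how many replicas must be in sync before a write is accepted. A rough sketch under that reading: `replicate`, `NotEnoughReplicas` and the `min_in_sync` parameter are hypothetical names chosen for illustration, and each replica is modelled as a plain in-memory list, reflecting that durability comes from copies rather than from a synchronous flush.

```python
class NotEnoughReplicas(Exception):
    """Raised when fewer replicas are in sync than the configured minimum."""


def replicate(message, in_sync_replicas, min_in_sync):
    """Append `message` to every in-sync replica, or reject the write."""
    if len(in_sync_replicas) < min_in_sync:
        raise NotEnoughReplicas(
            f"{len(in_sync_replicas)} in sync, need at least {min_in_sync}"
        )
    for replica_log in in_sync_replicas:
        replica_log.append(message)  # held in memory; no synchronous flush
    return len(in_sync_replicas)


replica_1, replica_2, replica_3 = [], [], []

# All three replicas healthy: the write is accepted and copied three times.
copies = replicate("order-1", [replica_1, replica_2, replica_3], min_in_sync=2)

# Two replicas have fallen behind: the write is refused rather than
# accepted with a single copy.
try:
    replicate("order-2", [replica_1], min_in_sync=2)
    rejected = False
except NotEnoughReplicas:
    rejected = True
```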
Compacted Topics (Tabular View)
(Diagram: two views of the same topic. "All versions": every version of each key is retained, shown for three keys with versions 1 to 3, 1 to 2 and 1 to 5. "Latest Key only": only the most recent version of each key is kept, here versions 3, 2 and 5.)
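The "latest key only" view can be illustrated by compacting a list of keyed records. This is a sketch of the idea rather than of Kafka's background compaction process: the `compact` function and the sample keys are hypothetical.

```python
def compact(log):
    """Keep only the most recent record for each key, in log order."""
    latest_offset = {}
    for offset, (key, _value) in enumerate(log):
        latest_offset[key] = offset  # later records overwrite earlier ones
    return [
        record
        for offset, record in enumerate(log)
        if latest_offset[record[0]] == offset
    ]


all_versions = [
    ("k1", "v1"),
    ("k2", "v1"),
    ("k1", "v2"),
    ("k3", "v1"),
    ("k1", "v3"),
    ("k2", "v2"),
]
latest_only = compact(all_versions)
table_view = dict(latest_only)  # the compacted log read as a key/value table
```

Reading the compacted log into a dictionary gives the tabular view: one row per key, holding its latest value.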
Always on, Event-Driven Services
The Log (streams & tables)
Ingestion Services
Services with polyglot persistence
Simple Services
Streaming Services
Kafka Streams Example
Orders
Customer (Compacted)
Join
Customer Stream
Join, aggregate; intermediary state stored in Kafka
Kafka Kafka Streams
Orders Stream
Dashboard
Query
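The join in this example can be approximated in a few lines. Kafka Streams is a Java library and the code below does not use its API. It is a plain Python sketch of the idea, with hypothetical topic names, field names and a hypothetical `join_orders_with_customers` function. The compacted customer topic is materialised as a table, and each order is enriched with the customer row current at the time the order arrives.

```python
def join_orders_with_customers(events):
    """Join an orders stream against a customer table built from events.

    `events` is an iterable of (topic, key, value) tuples in arrival order.
    Orders whose customer is not yet known are skipped (an inner join).
    """
    customers = {}  # latest value per customer key, like a compacted topic
    for topic, key, value in events:
        if topic == "customers":
            customers[key] = value
        elif topic == "orders":
            customer = customers.get(value["customer_id"])
            if customer is not None:
                yield {
                    "order_id": key,
                    "amount": value["amount"],
                    "customer_name": customer["name"],
                }


events = [
    ("customers", "c1", {"name": "Ada"}),
    ("orders", "o1", {"customer_id": "c1", "amount": 30}),
    ("customers", "c1", {"name": "Ada L."}),   # customer record updated
    ("orders", "o2", {"customer_id": "c1", "amount": 12}),
    ("orders", "o3", {"customer_id": "c9", "amount": 5}),  # unknown customer
]
enriched = list(join_orders_with_customers(events))
```

In this sketch the `customers` dictionary plays the role of the intermediary state, which the slides describe as being stored back in Kafka rather than in local memory alone.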