1Confidential
Introducing Kafka’s Streams API
Stream processing made simple
Target audience: technical staff, developers, architects
Expected duration for full deck: 45 minutes
Apache Kafka: birthed as a messaging system, now a streaming platform
• 0.7 (2012): Cluster mirroring, data compression
• 0.8 (2013): Intra-cluster replication
• 0.9 (2015): Data integration (Connect API)
• 0.10 (2016): Data processing (Streams API)
Kafka’s Streams API: the easiest way to process data in Apache Kafka
Key Benefits of Apache Kafka’s Streams API
• Build Apps, Not Clusters: no additional cluster required
• Cluster to go: elastic, scalable, distributed, fault-tolerant, secure
• Database to go: tables, local state, interactive queries
• Equally viable for S / M / L / XL / XXL use cases
• “Runs Everywhere”: integrates with your existing deployment strategies such as containers, automation, cloud
Part of open source Apache Kafka, introduced in 0.10+
• Powerful client library to build stream processing apps
• Apps are standard Java applications that run on client machines
• https://github.com/apache/kafka/tree/trunk/streams
[Diagram: Your App, embedding the Streams API library, talks to a Kafka Cluster]
Kafka’s Streams API: Unix analogy
$ cat < in.txt | grep "apache" | tr a-z A-Z > out.txt
[Diagram: the analogous pipeline on a Kafka Cluster, where the Connect API moves data in and out and the Streams API does the processing steps]
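The same pipeline, sketched with the Streams API DSL. This is a sketch only: the topic names, broker address, and application id are illustrative, it uses the 0.10-era `KStreamBuilder` API, and running it requires an actual Kafka cluster.

```java
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KStreamBuilder;

public class UnixAnalogy {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "unix-analogy-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
        props.put(StreamsConfig.VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());

        KStreamBuilder builder = new KStreamBuilder();

        // cat < in.txt  ->  read records from an input topic
        KStream<String, String> lines = builder.stream("in-topic");

        lines.filter((key, line) -> line.contains("apache"))  // grep "apache"
             .mapValues(line -> line.toUpperCase())           // tr a-z A-Z
             .to("out-topic");                                // > out.txt

        new KafkaStreams(builder, props).start();
    }
}
```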
Streams API in the context of Kafka
[Diagram: Other Systems → Connect API → Kafka Cluster → Connect API → Other Systems; Your App runs alongside the cluster, embedding the Streams API]
When to use Kafka’s Streams API
• Mainstream Application Development
• To build core business applications
• Microservices
• Fast Data apps for small and big data
• Reactive applications
• Continuous queries and transformations
• Event-triggered processes
• The “T” in ETL
• <and more>
Use case examples
• Real-time monitoring and intelligence
• Customer 360-degree view
• Fraud detection
• Location-based marketing
• Fleet management
• <and more>
Some public use cases in the wild & external articles
• Applying Kafka’s Streams API for internal message delivery pipeline at LINE Corp.
  • http://developers.linecorp.com/blog/?p=3960
  • Kafka Streams in production at LINE, a social platform based in Japan with 220+ million users
• Microservices and reactive applications at Capital One
  • https://speakerdeck.com/bobbycalderwood/commander-decoupled-immutable-rest-apis-with-kafka-streams
• User behavior analysis
  • https://timothyrenner.github.io/engineering/2016/08/11/kafka-streams-not-looking-at-facebook.html
• Containerized Kafka Streams applications in Scala
  • https://www.madewithtea.com/processing-tweets-with-kafka-streams.html
• Geo-spatial data analysis
  • http://www.infolace.com/blog/2016/07/14/simple-spatial-windowing-with-kafka-streams/
• Language classification with machine learning
  • https://dzone.com/articles/machine-learning-with-kafka-streams
Do more with less
Architecture comparison: use case example
Real-time dashboard for security monitoring
“Which of my data centers are under attack?”
Architecture comparison: use case example
Before: Undue complexity, heavy footprint, many technologies, split ownership with conflicting priorities
1 Capture business events in Kafka
2 Must process events with separate cluster (e.g. Spark) – your “Job”
3 Must share latest results through separate systems (e.g. MySQL)
4 Other apps access latest results by querying these DBs

With Kafka Streams: simplified, app-centric architecture, puts app owners in control
1 Capture business events in Kafka
2 Process events with standard Java apps that use Kafka Streams
3 Now other apps can directly query the latest results
How do I install the Streams API?
• There is and there should be no “installation” – Build Apps, Not Clusters!
• It’s a library. Add it to your app like any other library.
<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka-streams</artifactId>
    <version>0.10.1.1</version>
</dependency>
“But wait a minute – where’s THE CLUSTER to process the data?”
• No cluster needed – Build Apps, Not Clusters!
• Unlearn bad habits: “do cool stuff with data ≠ must have cluster”
Ok. Ok. Ok.
Organizational benefits: decouple teams and roadmaps, scale people
Infrastructure Team (Kafka as a shared, multi-tenant service)
• Payments team: fraud detection app
• Mobile team: recommendations app
• Operations team: security alerts app
• …more apps…
How do I package, deploy, monitor my apps? How do I …?
• Whatever works for you. Stick to what you/your company think is the best way.
• No magic needed.
• Why? Because an app that uses the Streams API is… a normal Java app.
Available APIs
The API is but the tip of the iceberg
[Iceberg diagram: “API, coding” is the visible tip; below the waterline (Reality™) sit deployment, operations, security, architecture, debugging, org. processes, …]
• API option 1: DSL (declarative)
KStream<Integer, Integer> input = builder.stream("numbers-topic");

// Stateless computation
KStream<Integer, Integer> doubled = input.mapValues(v -> v * 2);

// Stateful computation
KTable<Integer, Integer> sumOfOdds = input
    .filter((k, v) -> v % 2 != 0)
    .selectKey((k, v) -> 1)
    .groupByKey()
    .reduce((v1, v2) -> v1 + v2, "sum-of-odds");
The preferred API for most use cases.
Particularly appeals to:
• Fans of Scala, functional programming
• Users familiar with e.g. Spark
• API option 2: Processor API (imperative)
class PrintToConsoleProcessor<K, V> implements Processor<K, V> {

  @Override
  public void init(ProcessorContext context) {}

  @Override
  public void process(K key, V value) {
    System.out.println("Got value " + value);
  }

  @Override
  public void punctuate(long timestamp) {}

  @Override
  public void close() {}
}
Full flexibility but more manual work
Appeals to:
• Users who require functionality that is
not yet available in the DSL
• Users familiar with e.g. Storm, Samza
• Still, check out the DSL!
When to use Kafka Streams vs. Kafka’s “normal” consumer clients
Kafka Streams
• Basically all the time
• Basically all the time
• Basically all the time
• Basically all the time
• Basically all the time
• Basically all the time
• Basically all the time
• Basically all the time
• Basically all the time
• Basically all the time
• Basically all the time
Kafka consumer clients (Java, C/C++, Python, Go, …)
• When you must interact with Kafka at a very low level and/or in a very special way
• Example: When integrating your own stream processing tool (Spark, Storm) with Kafka.
Code comparison
Featuring Kafka with Streams API <-> Spark Streaming
”My WordCount is better than your WordCount” (?)
[Code screenshots: WordCount with Kafka’s Streams API, and with Spark Streaming]
These isolated code snippets are nice (and actually quite similar), but they are not very meaningful. In practice, we also need to read data from somewhere, write data back to somewhere, etc. – but we can see none of this here.
WordCount in Kafka
[Code screenshot: the complete WordCount application using the Streams API]
Compared to: WordCount in Spark 2.0
[Code screenshot, callouts 1–3: Runtime model leaks into processing logic (here: interfacing from Spark with Kafka)]
Compared to: WordCount in Spark 2.0
[Code screenshot, callouts 4–5: Runtime model leaks into processing logic (driver vs. executors)]
Key concepts
[Comparison table: Kafka Core concepts vs. Kafka Streams concepts]
Streams and Tables
Stream Processing meets Databases
Key observation: close relationship between Streams and Tables
http://www.confluent.io/blog/introducing-kafka-streams-stream-processing-made-simple
http://docs.confluent.io/current/streams/concepts.html#duality-of-streams-and-tables
Example: Streams and Tables in Kafka
Word Count
hello 2
kafka 1
world 1
… …
Example: continuously compute current users per geo-region
[Diagram: real-time dashboard “How many users younger than 30y, per region?”; when an update for alice (alice → Europe) arrives from the user-locations topic (mobile team) and is joined with user-prefs (web team), the per-region counts change by -1 (Asia) and +1 (Europe)]
Example: continuously compute current users per geo-region

KTable<UserId, Location> userLocations = builder.table("user-locations-topic");
KTable<UserId, Prefs> userPrefs = builder.table("user-preferences-topic");
Example: continuously compute current users per geo-region

KTable<UserId, Location> userLocations = builder.table("user-locations-topic");
KTable<UserId, Prefs> userPrefs = builder.table("user-preferences-topic");

// Merge into detailed user profiles (continuously updated)
KTable<UserId, UserProfile> userProfiles =
    userLocations.join(userPrefs, (loc, prefs) -> new UserProfile(loc, prefs));

[Diagram: a new record in user-locations (alice → Europe) flows into the continuously updated userProfiles KTable]
Example: continuously compute current users per geo-region

KTable<UserId, Location> userLocations = builder.table("user-locations-topic");
KTable<UserId, Prefs> userPrefs = builder.table("user-preferences-topic");

// Merge into detailed user profiles (continuously updated)
KTable<UserId, UserProfile> userProfiles =
    userLocations.join(userPrefs, (loc, prefs) -> new UserProfile(loc, prefs));

// Compute per-region statistics (continuously updated)
KTable<Location, Long> usersPerRegion = userProfiles
    .filter((userId, profile) -> profile.age < 30)
    .groupBy((userId, profile) -> profile.location)
    .count();

[Diagram: alice’s update changes usersPerRegion from (Africa 3, Asia 8, Europe 5, …) to (Africa 3, Asia 7, Europe 6, …)]
Example: continuously compute current users per geo-region
[Diagram (repeated): the dashboard’s per-region counts update by -1 (Asia) and +1 (Europe) as alice’s location changes from Asia to Europe, fed by the user-locations (mobile team) and user-prefs (web team) topics]
Streams meet Tables – in the DSL
Streams meet Tables
• Most use cases for stream processing require both Streams and Tables
  • Essential for any stateful computations
• Kafka ships with first-class support for Streams and Tables
  • Scalability, fault tolerance, efficient joins and aggregations, …
• Benefits include: simplified architectures, fewer moving pieces, less Do-It-Yourself work
Key features
Key features in 0.10
• Native, 100%-compatible Kafka integration
Native, 100% compatible Kafka integration
Read from Kafka
Write to Kafka
Key features in 0.10
• Native, 100%-compatible Kafka integration
• Secure stream processing using Kafka’s security features
Secure stream processing with the Streams API
• Your applications can leverage all client-side security features in Apache Kafka
• Security features include:
  • Encrypting data-in-transit between applications and Kafka clusters
  • Authenticating applications against Kafka clusters (“only some apps may talk to the production cluster”)
  • Authorizing applications against Kafka clusters (“only some apps may read data from sensitive topics”)
Configuring security settings
• In general, you can configure both Kafka Streams and the underlying Kafka clients in your apps
Configuring security settings
• Example: encrypting data-in-transit + client authentication to Kafka cluster
Full demo application at https://github.com/confluentinc/examples
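Such a configuration might look as follows. This is a sketch based on Kafka’s standard client-side SSL settings; the file paths and password placeholders are illustrative, not from the original slide.

```java
import java.util.Properties;

import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.common.config.SslConfigs;

public class SecureStreamsConfig {
    public static Properties secureProps() {
        Properties props = new Properties();
        // ... application.id, bootstrap.servers, etc. ...

        // Encrypt data-in-transit and authenticate this app to the brokers via SSL
        props.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SSL");
        props.put(SslConfigs.SSL_TRUSTSTORE_LOCATION_CONFIG, "/etc/security/tls/kafka.client.truststore.jks");
        props.put(SslConfigs.SSL_TRUSTSTORE_PASSWORD_CONFIG, "<truststore-password>");
        props.put(SslConfigs.SSL_KEYSTORE_LOCATION_CONFIG, "/etc/security/tls/kafka.client.keystore.jks");
        props.put(SslConfigs.SSL_KEYSTORE_PASSWORD_CONFIG, "<keystore-password>");
        props.put(SslConfigs.SSL_KEY_PASSWORD_CONFIG, "<key-password>");
        return props;
    }
}
```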
Key features in 0.10
• Native, 100%-compatible Kafka integration
• Secure stream processing using Kafka’s security features
• Elastic and highly scalable
• Fault-tolerant
Key features in 0.10
• Native, 100%-compatible Kafka integration
• Secure stream processing using Kafka’s security features
• Elastic and highly scalable
• Fault-tolerant
• Stateful and stateless computations
Stateful computations
• Stateful computations like aggregations (e.g. counting), joins, or windowing require state
• State stores are the backbone of state management
  • … are local for best performance
  • … are backed up to Kafka for elasticity and for fault-tolerance
  • … are per stream task for isolation – think: share-nothing
• Pluggable storage engines
  • Default: RocksDB (a key-value store) to allow for local state that is larger than available RAM
  • You can also use your own, custom storage engine
• From the user perspective:
  • DSL: no need to worry about anything, state management is automatically done for you
  • Processor API: direct access to state stores – very flexible but more manual work
Use case: real-time, distributed joins at large scale
Stateful computations
• Use the Processor API to interact directly with state stores
Get the store
Use the store
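The two steps above ("get the store", "use the store") might look like this in a Processor. This is a sketch: the store name "counts-store" and the word-counting logic are illustrative, the store must have been registered with the topology beforehand, and the single-argument punctuate matches the 0.10-era API.

```java
import org.apache.kafka.streams.processor.Processor;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.state.KeyValueStore;

public class WordCountProcessor implements Processor<String, String> {
    private KeyValueStore<String, Long> store;

    @Override
    @SuppressWarnings("unchecked")
    public void init(ProcessorContext context) {
        // Get the store (registered with the topology under this name)
        store = (KeyValueStore<String, Long>) context.getStateStore("counts-store");
    }

    @Override
    public void process(String key, String word) {
        // Use the store: read the current count and write back the increment
        Long count = store.get(word);
        store.put(word, count == null ? 1L : count + 1);
    }

    @Override
    public void punctuate(long timestamp) {}

    @Override
    public void close() {}
}
```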
Key features in 0.10
• Native, 100%-compatible Kafka integration
• Secure stream processing using Kafka’s security features
• Elastic and highly scalable
• Fault-tolerant
• Stateful and stateless computations
• Interactive queries
Interactive Queries: architecture comparison
Before (0.10.0):
1 Capture business events in Kafka
2 Process the events with Kafka Streams
! Must use external systems to share latest results
4 Other apps query external systems for latest results

After (0.10.1): simplified, more app-centric architecture
1 Capture business events in Kafka
2 Process the events with Kafka Streams
3 Now other apps can directly query the latest results
Key features in 0.10
• Native, 100%-compatible Kafka integration• Secure stream processing using Kafka’s security features• Elastic and highly scalable• Fault-tolerant• Stateful and stateless computations• Interactive queries• Time model
Time
• You configure the desired time semantics through timestamp extractors
• Default extractor yields event-time semantics
  • Extracts embedded timestamps of Kafka messages (introduced in v0.10)
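A custom extractor is a small class. In the sketch below, the `Order` type and its `timestampMs` field are hypothetical, and the single-argument `extract()` signature matches the 0.10-era `TimestampExtractor` interface.

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.streams.processor.TimestampExtractor;

// Hypothetical payload type carrying a business timestamp
class Order {
    long timestampMs;
}

// Event-time semantics driven by a timestamp embedded in the message payload
public class OrderTimestampExtractor implements TimestampExtractor {
    @Override
    public long extract(ConsumerRecord<Object, Object> record) {
        if (record.value() instanceof Order) {
            return ((Order) record.value()).timestampMs;
        }
        // Fall back to the timestamp Kafka embedded in the message (v0.10+)
        return record.timestamp();
    }
}
```

The extractor is then registered via the `StreamsConfig.TIMESTAMP_EXTRACTOR_CLASS_CONFIG` setting.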
Key features in 0.10
• Native, 100%-compatible Kafka integration
• Secure stream processing using Kafka’s security features
• Elastic and highly scalable
• Fault-tolerant
• Stateful and stateless computations
• Interactive queries
• Time model
• Windowing
Windowing
• Group events in a stream using time-based windows
• Use case examples:
  • Time-based analysis of ad impressions (“number of ads clicked in the past hour”)
  • Monitoring statistics of telemetry data (“1min/5min/15min averages”)
[Diagram: input data, where colors represent different users’ events (alice, bob, dave), plotted on processing-time vs. event-time; rectangles denote different event-time windows]
Windowing in the DSL
TimeWindows.of(3000)
TimeWindows.of(3000).advanceBy(1000)
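The semantics of such hopping windows can be illustrated in plain Java, without the library: each window covers [start, start + size), with starts aligned to multiples of the advance. The helper below is illustrative, not Kafka’s own code.

```java
import java.util.ArrayList;
import java.util.List;

public class HoppingWindows {
    // Returns the start timestamps of all windows containing `timestamp`.
    // A window covers [start, start + sizeMs); starts are multiples of advanceMs.
    static List<Long> windowsFor(long timestamp, long sizeMs, long advanceMs) {
        List<Long> starts = new ArrayList<>();
        long firstStart = (Math.max(0, timestamp - sizeMs + advanceMs) / advanceMs) * advanceMs;
        for (long start = firstStart; start <= timestamp; start += advanceMs) {
            starts.add(start);
        }
        return starts;
    }

    public static void main(String[] args) {
        // TimeWindows.of(3000).advanceBy(1000): an event falls into three windows
        System.out.println(windowsFor(4500L, 3000L, 1000L)); // [2000, 3000, 4000]
        // TimeWindows.of(3000) alone is a tumbling window: advance == size
        System.out.println(windowsFor(4500L, 3000L, 3000L)); // [3000]
    }
}
```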
Key features in 0.10
• Native, 100%-compatible Kafka integration
• Secure stream processing using Kafka’s security features
• Elastic and highly scalable
• Fault-tolerant
• Stateful and stateless computations
• Interactive queries
• Time model
• Windowing
• Supports late-arriving and out-of-order data
Out-of-order and late-arriving data
• Is very common in practice, not a rare corner case
• Related to time model discussion
Out-of-order and late-arriving data: example when this will happen
Users with mobile phones enterairplane, lose Internet connectivity
Emails are being writtenduring the 10h flight
Internet connectivity is restored,phones will send queued emails now
Out-of-order and late-arriving data
• Is very common in practice, not a rare corner case
• Related to time model discussion
• We want control over how out-of-order data is handled, and handling must be efficient
  • Example: We process data in 5-minute windows, e.g. compute statistics
  • Option A: When an event arrives 1 minute late: update the original result!
  • Option B: When an event arrives 2 hours late: discard it!
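The two options can be illustrated with a plain-Java sketch of a windowed counter with a retention period (the 5-minute window and 1-hour retention are illustrative values; this is not Kafka’s implementation):

```java
import java.util.HashMap;
import java.util.Map;

public class LateEvents {
    static final long WINDOW_MS = 5 * 60 * 1000L;      // 5-minute windows
    static final long RETENTION_MS = 60 * 60 * 1000L;  // keep old windows around for 1 hour

    final Map<Long, Long> countsByWindowStart = new HashMap<>();
    long streamTime = 0; // highest event timestamp seen so far

    /** Returns true if the event was applied, false if it was too late and dropped. */
    boolean process(long eventTimestamp) {
        streamTime = Math.max(streamTime, eventTimestamp);
        long windowStart = (eventTimestamp / WINDOW_MS) * WINDOW_MS;
        long windowEnd = windowStart + WINDOW_MS;
        if (windowEnd <= streamTime - RETENTION_MS) {
            return false;                                  // Option B: far too late, discard
        }
        countsByWindowStart.merge(windowStart, 1L, Long::sum); // Option A: update the original result
        return true;
    }
}
```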
Key features in 0.10
• Native, 100%-compatible Kafka integration
• Secure stream processing using Kafka’s security features
• Elastic and highly scalable
• Fault-tolerant
• Stateful and stateless computations
• Interactive queries
• Time model
• Windowing
• Supports late-arriving and out-of-order data
• Millisecond processing latency, no micro-batching
• At-least-once processing guarantees (exactly-once is in the works as we speak)
Roadmap Outlook
Roadmap outlook for Kafka Streams
• Exactly-Once processing semantics
• Unified API for real-time processing and “batch” processing
• Global KTables
• Session windows
• … and more …
Wrapping Up
Where to go from here
• Kafka Streams is available in Confluent Platform 3.1 and in Apache Kafka 0.10.1
  • http://www.confluent.io/download
• Kafka Streams demos: https://github.com/confluentinc/examples
  • Java 7, Java 8+ with lambdas, and Scala
  • WordCount, Interactive Queries, Joins, Security, Windowing, Avro integration, …
• Confluent documentation: http://docs.confluent.io/current/streams/
  • Quickstart, Concepts, Architecture, Developer Guide, FAQ
• Recorded talks
  • Introduction to Kafka Streams: http://www.youtube.com/watch?v=o7zSLNiTZbA
  • Application Development and Data in the Emerging World of Stream Processing (higher-level talk): https://www.youtube.com/watch?v=JQnNHO5506w
Thank You
Appendix: Streams and Tables
A closer look
Motivating example: continuously compute current users per geo-region

[Diagram, built up across four slides: a real-time dashboard (“How many users younger than 30y, per region?”) is fed from the user-locations topic (mobile team) and the user-prefs topic (web team); when an update (alice → Europe) arrives in user-locations, the joined profiles change from (alice: Asia, 25y, …) to (alice: Europe, 25y, …) and the per-region counts update by -1 (Asia) and +1 (Europe)]
Same data, but different use cases require different interpretations
alice San Francisco
alice New York City
alice Rio de Janeiro
alice Sydney
alice Beijing
alice Paris
alice Berlin
Use case 1: Frequent traveler status?
“Alice has been to SFO, NYC, Rio, Sydney, Beijing, Paris, and finally Berlin.”

Use case 2: Current location?
“Alice is in SFO, NYC, Rio, Sydney, Beijing, Paris, Berlin right now.”
Streams meet Tables
• When you need all the values of a key (example: all the places Alice has ever been to), messages are interpreted as INSERTs (append), so the topic is interpreted as a record stream, and you’d read the Kafka topic into a KStream.
• When you need the latest value of a key (example: where Alice is right now), messages are interpreted as UPSERTs (overwrite existing), so the topic is interpreted as a changelog stream, and you’d read the Kafka topic into a KTable.
Same data, but different use cases require different interpretations
“Alice has been to SFO, NYC, Rio, Sydney,Beijing, Paris, and finally Berlin.”
“Alice is in SFO, NYC, Rio, Sydney,Beijing, Paris, Berlin right now.”
Use case 1: Frequent traveler status? Use case 2: Current location?
KStream KTable
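The two interpretations can be illustrated in plain Java (illustrative only: a list models the record stream / KStream reading, a map models the changelog stream / KTable reading):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class StreamTableDuality {
    // KStream view: every record is a new, independent fact (INSERT / append)
    static List<String> asRecordStream(String[][] records) {
        List<String> stream = new ArrayList<>();
        for (String[] r : records) stream.add(r[1]);
        return stream;
    }

    // KTable view: a record overwrites the previous value for its key (UPSERT)
    static Map<String, String> asChangelogStream(String[][] records) {
        Map<String, String> table = new LinkedHashMap<>();
        for (String[] r : records) table.put(r[0], r[1]);
        return table;
    }

    public static void main(String[] args) {
        String[][] travels = {
            {"alice", "San Francisco"}, {"alice", "New York City"},
            {"alice", "Rio de Janeiro"}, {"alice", "Sydney"},
            {"alice", "Beijing"}, {"alice", "Paris"}, {"alice", "Berlin"},
        };
        // Use case 1 (KStream): all the places Alice has ever been to
        System.out.println(asRecordStream(travels).size());          // 7
        // Use case 2 (KTable): where Alice is right now
        System.out.println(asChangelogStream(travels).get("alice")); // Berlin
    }
}
```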
Motivating example: continuously compute current users per geo-region
[Diagram (repeated): the dashboard’s per-region counts update by -1 (Asia) and +1 (Europe) as alice’s location changes from Asia to Europe, fed by the user-locations (mobile team) and user-prefs (web team) topics]
Motivating example: continuously compute current users per geo-region

KTable<UserId, Location> userLocations = builder.table("user-locations-topic");
KTable<UserId, Prefs> userPrefs = builder.table("user-preferences-topic");
Motivating example: continuously compute current users per geo-region

KTable<UserId, Location> userLocations = builder.table("user-locations-topic");
KTable<UserId, Prefs> userPrefs = builder.table("user-preferences-topic");

// Merge into detailed user profiles (continuously updated)
KTable<UserId, UserProfile> userProfiles =
    userLocations.join(userPrefs, (loc, prefs) -> new UserProfile(loc, prefs));

[Diagram: a new record in user-locations (alice → Europe) flows into the continuously updated userProfiles KTable]
Motivating example: continuously compute current users per geo-region

KTable<UserId, Location> userLocations = builder.table("user-locations-topic");
KTable<UserId, Prefs> userPrefs = builder.table("user-preferences-topic");

// Merge into detailed user profiles (continuously updated)
KTable<UserId, UserProfile> userProfiles =
    userLocations.join(userPrefs, (loc, prefs) -> new UserProfile(loc, prefs));

// Compute per-region statistics (continuously updated)
KTable<Location, Long> usersPerRegion = userProfiles
    .filter((userId, profile) -> profile.age < 30)
    .groupBy((userId, profile) -> profile.location)
    .count();

[Diagram: alice’s update changes usersPerRegion from (Africa 3, Asia 8, Europe 5, …) to (Africa 3, Asia 7, Europe 6, …)]
Motivating example: continuously compute current users per geo-region
[Diagram (repeated): the dashboard’s per-region counts update by -1 (Asia) and +1 (Europe) as alice’s location changes from Asia to Europe, fed by the user-locations (mobile team) and user-prefs (web team) topics]
Another common use case: continuous transformations
• Example: to enrich an input stream (user clicks) with side data (current user profile)
KStream alice /rental/p8454vb, 06:59 PM PDT
user-clicks-topics (at 1M msgs/s)
“facts”
Another common use case: continuous transformations
• Example: to enrich an input stream (user clicks) with side data (current user profile)

[Diagram: stream.JOIN(table) – the “facts” KStream from user-clicks-topics (at 1M msgs/s), e.g. (alice: /rental/p8454vb, 06:59 PM PDT), is joined against the “dimensions” KTable from user-profiles-topic, e.g. (alice: Asia, 25y); the enriched output is (alice: /rental/p8454vb, 06:59 PDT, Asia, 25y); a new update for alice from the user-locations topic changes the KTable entry to (alice: Europe, 25y)]
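In the DSL this enrichment is a stream-table join. A sketch (topic names are from the slides; the String value types and the concatenation joiner are illustrative simplifications, and running it requires a Kafka cluster):

```java
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KStreamBuilder;
import org.apache.kafka.streams.kstream.KTable;

public class ClickEnrichment {
    public static void main(String[] args) {
        KStreamBuilder builder = new KStreamBuilder();

        // "facts": the high-volume click stream
        KStream<String, String> clicks = builder.stream("user-clicks-topic");
        // "dimensions": the continuously updated profile table
        KTable<String, String> profiles = builder.table("user-profiles-topic");

        // stream.JOIN(table): each click is enriched with the user's latest profile
        KStream<String, String> enriched =
            clicks.leftJoin(profiles, (click, profile) -> click + ", " + profile);

        enriched.to("enriched-clicks-topic");
    }
}
```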
Appendix: Interactive Queries
A closer look
Interactive Queries
• New API to access local state stores of an app instance
• New API to discover running app instances

[Diagram, built up across three slides: state (alice 2, bob 5, charlie 3) is spread across app instances reachable at “host1:4460”, “host5:5307”, “host3:4777”]
• You provide: inter-app communication (RPC layer)