Concord: Simple & Flexible
Stream Processing on Apache Mesos
Shinji Kim
Co-founder, Concord Systems
@concord @databythebay #datagrid
Overview
• What is Stream Processing?
• Today’s Stream Processing
• Introducing Concord
1. Concepts & API
2. Job Topology Management
3. Operations, Tooling, Performance
4. Message Delivery Guarantees
• Future Development Plans
What is stream processing?
• Processing Data in motion
• Sits between message queues and databases
• Used for faster:
– Data enrichment
– Aggregation
– Filtering / deduplication
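To make the three use cases above concrete, here is a minimal sketch in plain Python. It is not Concord code: the event shape, the `USER_COUNTRY` lookup table, and the `process` function are all assumptions made up for illustration.

```python
from collections import Counter

# Hypothetical lookup table used for enrichment (not from the talk).
USER_COUNTRY = {"u1": "US", "u2": "JP"}


def process(events):
    """Deduplicate, enrich, and aggregate a stream of click events."""
    seen_ids = set()
    clicks_per_country = Counter()
    for event in events:
        # Filtering / deduplication: skip events whose id was already seen.
        if event["id"] in seen_ids:
            continue
        seen_ids.add(event["id"])
        # Data enrichment: attach a country looked up from the user id.
        country = USER_COUNTRY.get(event["user"], "unknown")
        enriched = dict(event, country=country)
        # Aggregation: keep a running count per country.
        clicks_per_country[enriched["country"]] += 1
    return clicks_per_country


events = [
    {"id": 1, "user": "u1"},
    {"id": 2, "user": "u2"},
    {"id": 1, "user": "u1"},  # duplicate delivery, dropped by the filter
    {"id": 3, "user": "u3"},  # unknown user, enriched as "unknown"
]
counts = process(iter(events))
print(dict(counts))
```

In a real stream processor the `seen_ids` set and the counters would live in operator state rather than in local variables, so that they survive across records and can be sized or expired.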
Today’s Stream Processing
• Faster MapReduce jobs → end up running core business logic on top
– Fraudulent click detection
– Real-time budget updates
– Trigger-based trading
• Your stream processing jobs are more like microservices
• Need support for services / application management: Cluster mgmt, Monitoring, Debuggability
Introducing Concord
Concord is a distributed stream processing framework
built in C++ on top of Apache Mesos, designed for
high-performance, real-time applications that require
flexibility & control.
Pub / Sub Operator Model
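• Jobs are composed through operator Metadata: each operator subscribes to the streams listed in istreams and publishes to the streams listed in ostreams
A → words → B
A: Metadata(Name='A', istreams=[], ostreams=['words'])
B: Metadata(Name='B', istreams=['words', StreamGrouping.GROUP_BY], ostreams=[])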
C: Metadata(Name='C', istreams=['words', StreamGrouping.SHUFFLE], ostreams=[])
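The slides do not say how Concord routes records for each grouping. Under the usual reading of these terms, GROUP_BY sends every record with the same key to the same operator instance, while SHUFFLE spreads records across instances without regard to key. The sketch below illustrates that reading only; `route_group_by` and `route_shuffle` are hypothetical helpers, not part of any Concord API.

```python
import random
import zlib


def route_group_by(key: bytes, num_instances: int) -> int:
    """Pick an instance from a stable hash, so equal keys always co-locate."""
    return zlib.crc32(key) % num_instances


def route_shuffle(num_instances: int, rng: random.Random) -> int:
    """Pick any instance; the key plays no role."""
    return rng.randrange(num_instances)


rng = random.Random(0)  # seeded only to make the sketch repeatable
words = [b"corgi", b"poodle", b"corgi", b"corgi"]

group_by_targets = [route_group_by(w, 4) for w in words]
shuffle_targets = [route_shuffle(4, rng) for _ in words]

print("GROUP_BY:", group_by_targets)
print("SHUFFLE: ", shuffle_targets)
```

The practical consequence is that a GROUP_BY consumer can keep per-key state locally (for example, a word count), whereas a SHUFFLE consumer should be stateless or tolerate seeing only part of each key's records.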
Simple API in Multiple Languages
• ProcessRecord, ProduceRecord, ProcessTimer
• GetState, SetState backed by RocksDB
• API available in Python, Ruby, Go, Java/Scala, C++
B: Metadata(Name='B', istreams=['words', StreamGrouping.GROUP_BY], ostreams=['wordcount'])
Streams: words (in), wordcount (out)
Key         Value
Corgi       2
Chihuahua   4
Dachshund   5
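The word-count operator described on this slide can be sketched as follows. This is a self-contained illustration, not the Concord client library: `FakeContext` is a hypothetical in-memory stand-in whose method names merely mirror the ProcessRecord / ProduceRecord / GetState / SetState calls listed above, and a plain dict takes the place of the RocksDB-backed store.

```python
class FakeContext:
    """Hypothetical in-memory stand-in for an operator context."""

    def __init__(self):
        self.state = {}     # stands in for the RocksDB-backed key/value store
        self.produced = []  # records emitted to downstream streams

    def get_state(self, key):
        return self.state.get(key)

    def set_state(self, key, value):
        self.state[key] = value

    def produce_record(self, stream, key, value):
        self.produced.append((stream, key, value))


class WordCounter:
    """Consumes 'words' and emits running totals on 'wordcount'."""

    def process_record(self, ctx, key, value):
        # Read the current count for this word, defaulting to zero.
        count = (ctx.get_state(key) or 0) + 1
        ctx.set_state(key, count)
        ctx.produce_record("wordcount", key, count)


ctx = FakeContext()
op = WordCounter()
for word in ["Corgi", "Chihuahua", "Corgi"]:
    op.process_record(ctx, word, "")

print(ctx.state)
print(ctx.produced[-1])
```

Keeping the count in keyed state only gives correct totals if every record for a given word reaches the same instance, which is why the operator subscribes with GROUP_BY.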
Native Integration with Apache Mesos
• Dynamic resource scheduling
• Task Isolation
• Task supervision
• High Availability
Containerized Execution Environment
• Horizontal scaling
• Multi-tenancy
• Hot code deployment & dynamic topology
[Diagram labels: Mesos Agent, RocksDB]
Concord supports Transparent Debugging
[2015-11-02 15:36:44.770] [dispatcher_latencies] [info] 127.0.0.1:31000: traceId: -8816532120874703981, parentId: 0, id: -6816766813334129096, p50: 388179us, p95: 519668us, p99: 524812us, p999: 526425us
[2015-11-02 15:37:13.929] [principal_latencies] [info] 127.0.0.1:31001: traceId: -4811311467074699790, parentId: -7681059555040553620, id: -1899872683843643522, p50: 73355us, p95: 145626us, p99: 210345us, p999: 272018us
[2015-11-02 15:36:43.323] [incoming_throughput] [info] 12288 req in 1045515us. total: 367616 req
[2015-11-02 15:36:30.240] [outgoing_throughput] [info] 100000 req in 4804526us. total: 600000 req
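Log lines in this shape are easy to feed into a dashboard or alert. The sketch below parses the first latency line shown above with Python's standard `re` module. The regular expressions and the `parse_latency_line` helper are assumptions based only on the sample lines in this deck, not a documented Concord log format.

```python
import re

LINE = (
    "[2015-11-02 15:36:44.770] [dispatcher_latencies] [info] 127.0.0.1:31000: "
    "traceId: -8816532120874703981, parentId: 0, id: -6816766813334129096, "
    "p50: 388179us, p95: 519668us, p99: 524812us, p999: 526425us"
)

# "[timestamp] [metric] [level]" prefix, as seen in the sample lines.
HEADER = re.compile(r"\[(?P<ts>[^\]]+)\] \[(?P<metric>[^\]]+)\] \[(?P<level>[^\]]+)\]")
# "p50: 388179us" style percentile fields.
PERCENTILE = re.compile(r"(p\d+): (\d+)us")


def parse_latency_line(line):
    """Return the metric name and percentiles (microseconds), or None."""
    header = HEADER.match(line)
    if header is None:
        return None
    percentiles_us = {name: int(us) for name, us in PERCENTILE.findall(line)}
    return {
        "timestamp": header.group("ts"),
        "metric": header.group("metric"),
        "percentiles_us": percentiles_us,
    }


parsed = parse_latency_line(LINE)
print(parsed["metric"], parsed["percentiles_us"])
```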
Concord performs well at scale
• Word count benchmark (1.13B msgs)
– Concord: 500K QPS/node at 10ms/event
– Storm: 16K QPS/node at 100ms/event
– Spark Streaming: 100K QPS/node at 1s batch window
• Server log processing (29G server log, ~260M msgs)
– 4 nodes, 8 vCPU, 32GB RAM each
– Concord: 1M – 1.8M QPS
– Spark Streaming: 72K – 2M QPS
• Consistent performance
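For a rough sense of what these rates mean in wall-clock time, the Concord figures can be turned into durations with simple division. This is back-of-the-envelope arithmetic only: it assumes the quoted rate is sustained for the whole run and, for the server-log case, that the QPS figure is cluster-wide, which the slide does not state.

```python
def seconds_to_process(num_messages, msgs_per_second):
    """Time to drain a backlog at a constant processing rate."""
    return num_messages / msgs_per_second


# Word count: 1.13B messages at 500K QPS on a single node.
word_count_node_seconds = seconds_to_process(1_130_000_000, 500_000)

# Server logs: ~260M messages at the low and high ends of the quoted range.
log_slow_seconds = seconds_to_process(260_000_000, 1_000_000)
log_fast_seconds = seconds_to_process(260_000_000, 1_800_000)

print(f"word count, one node: {word_count_node_seconds:.0f} s")
print(f"server logs: {log_fast_seconds:.0f} s to {log_slow_seconds:.0f} s")
```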
Concord is designed for Predictability
• As you scale, JVM reconfiguration and GC pauses are inevitable (Framework GC vs. Application GC)
• Cluster abstracted as CPU, Memory, Disk numbers → cluster optimization & overall runtime
• Fast Compile → Test → Deploy cycle without downtime
Message Delivery Guarantees
Today: Fast > Complete or Perfect
• Best-effort / at-most-once processing
– When an operator or node crashes, its local cache is lost
– The failed operator is retried automatically (the number of retries is configurable)
– Recommended: implement check mechanisms in operators (e.g., the Concord Kafka consumer)
Soon: Fast + Complete > Perfect
• At-least-once delivery with Kafka is in development
– Kafka acts as a message bus between operators
– Kafka replays data from the last checked offset (so records may be duplicated)
Eventually: Fast + Complete + Perfect
• Transactional datastore in design phase
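The trade-off between these levels comes down to when the consumer records its position. The sketch below simulates at-least-once delivery with a plain Python list standing in for a Kafka topic; no Kafka client is used, and `consume` and its `crash_after` parameter are hypothetical names for illustration. The offset is committed only after a record has been processed, so a crash in between causes that record to be replayed.

```python
def consume(log, start_offset, crash_after=None):
    """Process records from start_offset; return (processed, committed_offset).

    Committing after processing means a crash between the two steps leaves
    the committed offset behind, and the record is delivered again on restart.
    """
    processed = []
    committed = start_offset
    for offset in range(start_offset, len(log)):
        processed.append(log[offset])  # the side effect happens first
        if crash_after is not None and offset == crash_after:
            return processed, committed  # simulated crash before the commit
        committed = offset + 1  # checkpoint: next offset to read
    return processed, committed


log = ["a", "b", "c", "d"]

first_run, offset = consume(log, 0, crash_after=2)  # handles a, b, c
second_run, offset = consume(log, offset)           # restarts at offset 2

delivered = first_run + second_run
# Downstream deduplication (valid here because the records are unique).
deduplicated = list(dict.fromkeys(delivered))

print(delivered)
print(deduplicated)
```

Committing before processing instead would give at-most-once behavior: the crash would lose "c" rather than duplicate it. Removing duplicates without losing records is what the transactional datastore mentioned above would have to provide.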
Future plans
• “At least once” guarantee support with Kafka
• DC/OS integration
• More data source / data sink connector support
• Higher level DSL
Concord: Simple & Flexible streaming application framework on Apache Mesos
• Operator model that works across multiple languages
→ Fast development and iteration time for multiple teams using the same data
• Dynamic topology, run-time deployment and scaling
→ Decoupled development & DevOps work
• High performance at scale
→ Predictable system for real-time applications
• Low-latency / Real-time applications:
– Real-time fraud detection
– Financial market data processing for real-time risks and triggers
– Real-time campaign management for real-time bidding (RTB)
Thank You!
Get Started: http://concord.io
@shinjikim