Alexander Kolb, Otto Group BI Hamburg, Germany, 2015
Flink, Yet another Streaming Framework?
Evaluation Streaming Frameworks Alexander Kolb, Otto Group BI, Hamburg, Germany, 2015
Alexander Kolb, Otto Group BI
@lofifnc
Introduction
Evaluation
Usability
Functionality
Architecture
Support
Non-Functional-Requirements
Approach
Rating based on:
- Research
- Hands-on
Use-case
Frameworks
SQLStream
Pulsar
SPQR
Apache Spark
Apache Flink
SQLStream
Architecture
source: sqlstream.com
Window Aggregation
SELECT STREAM
    B.pagetype,
    B.ecid,
    productid,
    SUM(QUANTITY) AS "views"
FROM VIEWS AS B
GROUP BY FLOOR((B.ROWTIME - TIMESTAMP '1970-01-01 00:00:00')
    MINUTE / 5 TO MINUTE),
    PRODUCTID, PAGETYPE, ECID;
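The GROUP BY expression in the query floors each row's ROWTIME to a five-minute boundary, so all rows in the same five-minute bucket fall into one group. A minimal Python sketch of that bucketing idea follows; the event tuples, the function names and the sample data are hypothetical and only illustrate the logic, not SQLStream itself.

```python
from collections import defaultdict
from datetime import datetime, timezone

WINDOW_SECONDS = 5 * 60  # five-minute tumbling window, as in the query


def window_start(rowtime: datetime) -> datetime:
    """Floor a timestamp to the start of its five-minute window."""
    epoch_seconds = int(rowtime.timestamp())
    floored = epoch_seconds - epoch_seconds % WINDOW_SECONDS
    return datetime.fromtimestamp(floored, tz=timezone.utc)


def aggregate_views(events):
    """Sum quantity per (window start, productid, pagetype, ecid)."""
    totals = defaultdict(int)
    for rowtime, pagetype, ecid, productid, quantity in events:
        key = (window_start(rowtime), productid, pagetype, ecid)
        totals[key] += quantity
    return dict(totals)


# Hypothetical sample events: (rowtime, pagetype, ecid, productid, quantity)
events = [
    (datetime(2015, 6, 1, 12, 1, 10, tzinfo=timezone.utc), "detail", "e1", "p1", 1),
    (datetime(2015, 6, 1, 12, 4, 59, tzinfo=timezone.utc), "detail", "e1", "p1", 2),
    (datetime(2015, 6, 1, 12, 5, 0, tzinfo=timezone.utc), "detail", "e1", "p1", 4),
]
result = aggregate_views(events)
```

The first two events share the 12:00 window and the third opens the 12:05 window, which is the tumbling (non-overlapping) behaviour the FLOOR expression produces.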
Pulsar
Architecture
source: github.com/pulsarIO
Window Aggregation
create context MCContext start @now end after 60 seconds;

context MCContext
insert into ViewAgg select count(*) as views, prid
from PageView group by prid output snapshot when terminated;
SPQR
source: github.com/ottogroup/SPQR
Window Aggregation
select productid, ecid, sum(quantity)
from views.win:time_batch(5 min)
group by productid, ecid
Apache Spark
Architecture
source: spark.apache.org
Aggregation
val aggViews = views.reduceByKeyAndWindow({
  case ((pageType, ecid, sum, price), (_, _, quant, _)) =>
    (pageType, ecid, sum + quant, price)
}, Minutes(5), Minutes(5))
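The function passed to reduceByKeyAndWindow combines two values of the same key at a time: it keeps the fields of the running value and adds the quantity of the next one. Because the window length and the slide interval are both five minutes, the windows do not overlap. The plain-Python sketch below mimics that pairwise reduction for the contents of a single window. It is not Spark code; the key (a product id), the helper names and the sample records are assumptions made for illustration.

```python
from collections import defaultdict
from functools import reduce


def reduce_views(left, right):
    """Combine two values of one key: keep the left fields, add the quantities."""
    page_type, ecid, total, price = left
    _, _, quantity, _ = right
    return (page_type, ecid, total + quantity, price)


def reduce_by_key(pairs, func):
    """Group (key, value) pairs by key and fold each group with func."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return {key: reduce(func, values) for key, values in grouped.items()}


# Hypothetical records of one five-minute window, keyed by product id:
# (key, (pageType, ecid, quantity, price))
window = [
    ("p1", ("detail", "e1", 1, 9.99)),
    ("p2", ("search", "e2", 5, 4.50)),
    ("p1", ("detail", "e1", 2, 9.99)),
]
agg_views = reduce_by_key(window, reduce_views)
```

Note that the reduce function discards the page type, ecid and price of the right-hand value, so it only gives meaningful results if those fields are identical for all values of a key.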
Apache Flink
Architecture
source: spark.apache.org
source: flink.apache.org
Window Aggregation
val aggViews = input.window(Time.of(5, TimeUnit.MINUTES))
  .groupBy("productId").sum("quantity");
Result Evaluation
Summary
Use-case
Topic                               Unit    Pulsar.io   SQLStream   SPQR       Flink   Spark
Time for building the stream        hours   40          35+ (POC)   8+ (POC)   13      4
Time for adding missing connector   hours   3           8           1          3       0.5
Points                                      3.14        2.06        3.44       4.16    4.45
List of Rating Aspects
DSL/DDL/UI for creating pipelines / Required know-how to define new pipelines / Project documentation / Workflow / Testing workflows / Hot deploying / Redeploying of pipelines / Dynamic topology changes / Monitoring / Deployment / Dashboard for data visualization / Ease of defining UDFs / Merge / Sum / Count / Min/max/avg / Aggregate / Transform / Parsing (XML/JSON/CSV) / Group-by / Join / Ease of defining new connectors / Kafka / WebSocket / JDBC / JMS / HDFS / File / Effort for cluster deployment / Configuration effort / Supports YARN / Supports Mesos / Scalability / Resilience / Predefined communication framework / Dependencies / Flexibility / Expandability / Buffering/Pressure handling / Partitioning/Parallelism / Strategy for Partitioning/Parallelism? / Ordering / Guarantees / State-Management / Fault tolerance / Licensing model / Professional support available / Community Activity / License / Maturity / Manageable code-base / Community Size
Summary
Topic                     SQLStream   Pulsar.io   SPQR   Spark   Flink   Weight
Usability                 2.6         3           2.2    3.6     3.9     15
Flexibility               2.5         4           3      1.5     1.5     8
User-Interface            3.3         1.8         1      3       3.3     6
Operators                 4.8         4.3         4.3    3.9     4.7     10
Connectors                4.7         1.9         1.9    2.4     2.6     6
Deployment                2.4         3.2         2.8    3.8     3.7     10
Architecture/Concepts     2           3.2         3.3    4       4       12
Functional Requirements   2           3.2         2.4    4       4       14
Costs                     0           5           5      5       5       5
Service/Support           2.5         0.5         1      4.5     3.5     8
Project                   1.8         3.3         2.5    4.3     3.8     6
Sum                       2.58        3.11        2.74   3.74    3.80    100
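The weights in the table add up to 100, which suggests that the Sum row is a weighted average of the topic scores. The sketch below assumes exactly that; the slides do not state the formula. Recomputing from the rounded values in the table lands near the published sums but not exactly on them (about 2.62 against 2.58 for SQLStream), so the original calculation may have used unrounded or more fine-grained ratings. The function name and the dictionary layout are illustrative choices.

```python
# Weights and SQLStream scores copied from the summary table.
WEIGHTS = {
    "Usability": 15,
    "Flexibility": 8,
    "User-Interface": 6,
    "Operators": 10,
    "Connectors": 6,
    "Deployment": 10,
    "Architecture/Concepts": 12,
    "Functional Requirements": 14,
    "Costs": 5,
    "Service/Support": 8,
    "Project": 6,
}

SQLSTREAM = {
    "Usability": 2.6,
    "Flexibility": 2.5,
    "User-Interface": 3.3,
    "Operators": 4.8,
    "Connectors": 4.7,
    "Deployment": 2.4,
    "Architecture/Concepts": 2.0,
    "Functional Requirements": 2.0,
    "Costs": 0.0,
    "Service/Support": 2.5,
    "Project": 1.8,
}


def weighted_score(scores, weights):
    """Weighted average of topic scores (assumed formula, not from the slides)."""
    total_weight = sum(weights.values())
    return sum(scores[topic] * weight for topic, weight in weights.items()) / total_weight


sqlstream_score = weighted_score(SQLSTREAM, WEIGHTS)
print(round(sqlstream_score, 2))
```

Dividing by the total weight rather than a hard-coded 100 keeps the function usable if topics are added or reweighted.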
[Chart: scores per topic (Usability, Flexibility, User-Interface, Operators, Connectors, Deployment, Architecture/Concepts, Functional Requirements) for SQLStream, Pulsar.io, SPQR, Spark and Flink]
Final Scores
             SQLStream   Pulsar.io   SPQR   Spark   Flink
Evaluation   2.58        3.11        2.74   3.74    3.80
Use-case     2.06        3.14        3.44   4.45    4.16