Stratosphere 0.4
[Diagram: Pact API (Java) and DataSet API (Scala) on top of the Stratosphere Optimizer and the Stratosphere Runtime, with Local and Remote execution]
Batch processing on a pipelining engine, with iterations …
What is Apache Flink?
[Diagram: Flink takes historic data (HDFS, JDBC, ...) for ETL, graphs, machine learning, relational, … and real-time data streams and event logs (Kafka, RabbitMQ, ...) for low latency, windowing, aggregations, ...]
What is Apache Flink? (master)
[Diagram: Flink stack
Libraries and front ends: Python, Gelly, Table, ML, SAMOA, Stream Builder, Hadoop M/R, Dataflow
APIs: DataSet (Java/Scala) with the Flink Optimizer, DataStream (Java/Scala)
Engine: Flink Dataflow Runtime
Deployment: Local, Remote, Yarn, Tez, Embedded
Data sources: HDFS, HBase, Kafka, RabbitMQ, Flume, HCatalog, JDBC]
Batch / Streaming APIs
case class Word (word: String, frequency: Int)

DataStream API (streaming):
val lines: DataStream[String] = env.fromSocketStream(...)
lines.flatMap { line => line.split(" ").map(word => Word(word, 1)) }
  .window(Count.of(1000)).every(Count.of(100))
  .groupBy("word").sum("frequency")
  .print()

DataSet API (batch):
val lines: DataSet[String] = env.readTextFile(...)
lines.flatMap { line => line.split(" ").map(word => Word(word, 1)) }
  .groupBy("word").sum("frequency")
  .print()
Technology inside Flink
case class Path (from: Long, to: Long)
val tc = edges.iterate(10) {
  paths: DataSet[Path] =>
    val next = paths
      .join(edges)
      .where("to")
      .equalTo("from") {
        (path, edge) => Path(path.from, edge.to)
      }
      .union(paths)
      .distinct()
    next
}
[Diagram: a Program becomes a Dataflow Graph and passes through three stages
Pre-flight (Client): cost-based optimizer, type extraction stack
Master: task scheduling, recovery metadata; deploys operators and tracks intermediate results
Workers: memory manager, out-of-core algos, batch & streaming, state & checkpoints
Example plan: DataSource orders.tbl, Filter, Map and DataSource lineitem.tbl feed a Hybrid Hash Join (build HT / probe, hash-part [0] on both inputs), followed by GroupRed (sort, forward)]
Life of data streams
Create: create streams from event sources (machines, databases, logs, sensors, …)
Collect: collect and make streams available for consumption (e.g., Apache Kafka)
Process: process streams, possibly generating derived streams (e.g., Apache Flink)
Defining windows in Flink
Trigger policy
• When to trigger the computation on current window
Eviction policy
• When data points should leave the window
• Defines window width/size
E.g., count-based policy
• evict when #elements > n
• start a new window every n-th element
Built-in: Count, Time, Delta policies
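A minimal sketch of a count-based policy in plain Scala, with no Flink dependency. `CountWindowSketch` and `windows` are hypothetical names, not Flink API. The eviction policy keeps at most `size` elements and the trigger policy emits the window every `slide` elements, which loosely mirrors `.window(Count.of(1000)).every(Count.of(100))` from the streaming word count.

```scala
object CountWindowSketch {
  // Hypothetical helper (not Flink API).
  // Eviction policy: keep at most `size` elements in the window.
  // Trigger policy: emit the current window every `slide` elements.
  def windows[A](input: Seq[A], size: Int, slide: Int): Seq[Seq[A]] = {
    require(size > 0 && slide > 0, "size and slide must be positive")
    val out = Seq.newBuilder[Seq[A]]
    var buffer = Vector.empty[A]
    var sinceTrigger = 0
    for (x <- input) {
      buffer = buffer :+ x
      // evict the oldest elements when #elements > size
      if (buffer.size > size) buffer = buffer.drop(buffer.size - size)
      sinceTrigger += 1
      // trigger the computation on every slide-th element
      if (sinceTrigger == slide) {
        out += buffer
        sinceTrigger = 0
      }
    }
    out.result()
  }
}
```

For example, `CountWindowSketch.windows(1 to 6, size = 4, slide = 2)` yields the windows `(1, 2)`, `(1, 2, 3, 4)` and `(3, 4, 5, 6)`.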
Checkpointing / Recovery
Flink acknowledges batches of records
• Less overhead in failure-free case
• Currently tied to fault tolerant data sources (e.g., Kafka)
Flink operators can keep state
• State is checkpointed
• Checkpointing and record acks go together
Exactly-once semantics for state
Checkpointing / Recovery
Chandy-Lamport algorithm for consistent asynchronous distributed snapshots
Pushes checkpoint barriers through the data flow
[Diagram: a barrier travels along the data stream. Before barrier = part of the snapshot. After barrier = not in snapshot (backup till next snapshot). Operator states shown: operator checkpoint starting, checkpoint in progress, checkpoint done]
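A simplified sketch of the barrier idea in plain Scala. It assumes one operator with a single input channel and a synchronous snapshot; the actual algorithm also has to handle several input channels and snapshots asynchronously, which is not shown here. All names (`BarrierSketch`, `Record`, `Barrier`, `run`) are hypothetical.

```scala
object BarrierSketch {
  sealed trait Element
  final case class Record(value: Int) extends Element
  final case class Barrier(checkpointId: Long) extends Element

  // A running-sum operator. When a barrier arrives, the current state is
  // stored under the barrier's checkpoint id, so a snapshot reflects exactly
  // the records that arrived before that barrier.
  // Returns (final state, snapshots by checkpoint id).
  def run(stream: Seq[Element]): (Int, Map[Long, Int]) =
    stream.foldLeft((0, Map.empty[Long, Int])) {
      case ((sum, snapshots), Record(v))   => (sum + v, snapshots)
      case ((sum, snapshots), Barrier(id)) => (sum, snapshots + (id -> sum))
    }
}
```

With the stream `Record(1), Record(2), Barrier(1), Record(3), Barrier(2), Record(4)`, checkpoint 1 holds the sum 3, checkpoint 2 holds 6, and the final state is 10. On recovery, the operator would restart from the last completed snapshot and the source would replay the records after that barrier.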
Heavy Data Pipelines
Complex ETL programs
Apology: Graph had to be blurred for
online slides, due to confidentiality
Memory Management
public class WC {
  public String word;
  public int count;
}
[Diagram: memory split into a managed and an unmanaged part. Managed: pool of memory pages (some empty) used for sorting, hashing, caching and for shuffling, broadcasts. Unmanaged: user code objects]
Flink contains its own memory management stack. Memory is
allocated, de-allocated, and used strictly using an internal buffer pool
implementation. To do that, Flink contains its own type extraction and
serialization components.
More at: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=53741525
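To illustrate the idea of serializing records into pooled pages, here is a rough sketch in plain Scala. It is not Flink's implementation: `PagePoolSketch`, `PagePool`, `write` and `readAll` are hypothetical, and the record layout (length-prefixed UTF-8 word followed by the count) is an assumption chosen for the example.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets

object PagePoolSketch {
  final case class WC(word: String, count: Int)

  // Hypothetical fixed-size pool: all pages are allocated once up front.
  final class PagePool(pageSize: Int, numPages: Int) {
    private var free: List[ByteBuffer] =
      List.fill(numPages)(ByteBuffer.allocate(pageSize))

    def request(): Option[ByteBuffer] = free match {
      case page :: rest =>
        free = rest
        page.clear()
        Some(page)
      case Nil => None // pool exhausted: a caller would spill to disk here
    }
    def release(page: ByteBuffer): Unit = free = page :: free
    def available: Int = free.size
  }

  // Serializes a record into the page; returns false if it does not fit.
  def write(page: ByteBuffer, wc: WC): Boolean = {
    val bytes = wc.word.getBytes(StandardCharsets.UTF_8)
    if (page.remaining() < 8 + bytes.length) false
    else {
      page.putInt(bytes.length)
      page.put(bytes)
      page.putInt(wc.count)
      true
    }
  }

  // Deserializes every record written to the page so far.
  def readAll(page: ByteBuffer): List[WC] = {
    val view = page.duplicate()
    view.flip()
    val out = List.newBuilder[WC]
    while (view.hasRemaining) {
      val bytes = new Array[Byte](view.getInt())
      view.get(bytes)
      out += WC(new String(bytes, StandardCharsets.UTF_8), view.getInt())
    }
    out.result()
  }
}
```

Because records live as bytes in a fixed number of pages, memory use is bounded by the pool size rather than by the number of objects, and a full pool is a signal to go to disk instead of an OutOfMemoryError.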
Smooth out-of-core performance
More at: http://flink.apache.org/news/2015/03/13/peeking-into-Apache-Flinks-Engine-Room.html
Single-core join of 1KB Java objects beyond memory (4 GB)
Blue bars are in-memory, orange bars (partially) out-of-core
Benefits of managed memory
More reliable and stable performance (less GC effects, easy to go to disk)
Table API
val customers = env.readCsvFile(…)
  .as('id, 'mktSegment)
  .filter( 'mktSegment === "AUTOMOBILE" )

val orders = env.readCsvFile(…)
  .filter( o => dateFormat.parse(o.orderDate).before(date) )
  .as('orderId, 'custId, 'orderDate, 'shipPrio)

val items = orders.join(customers)
  .where('custId === 'id)
  .join(lineitems)
  .where('orderId === 'id)
  .select('orderId, 'orderDate, 'shipPrio,
    'extdPrice * (Literal(1.0f) - 'discount) as 'revenue)

val result = items
  .groupBy('orderId, 'orderDate, 'shipPrio)
  .select('orderId, 'revenue.sum, 'orderDate, 'shipPrio)
Iterate by looping
for/while loop in client submits one job per iteration step
Data reuse by caching in memory and/or disk
[Diagram: the Client submits a sequence of Step jobs, one per iteration]
Iterate in the Dataflow
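[Diagram: bulk iteration inside the dataflow. The initial solution becomes the first partial solution. The step function (operators X and Y, which may also read other datasets) computes the next partial solution, which replaces the previous one. The last partial solution is the iteration result]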
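The replace-style loop can be sketched on plain Scala collections. This is an illustration only: `iterate` below is a hypothetical helper, not Flink's `iterate` operator, and sets stand in for distributed data sets. The step function is the transitive-closure step from the "Technology inside Flink" slide.

```scala
object BulkIterationSketch {
  final case class Path(from: Long, to: Long)

  // Hypothetical helper: applies the step function a fixed number of times,
  // feeding each result back in as the next partial solution.
  def iterate[A](initial: A, maxIterations: Int)(step: A => A): A =
    (1 to maxIterations).foldLeft(initial)((partial, _) => step(partial))

  def transitiveClosure(edges: Set[Path], maxIterations: Int): Set[Path] =
    iterate(edges, maxIterations) { paths =>
      // join paths with edges where path.to == edge.from
      val next = for {
        path <- paths
        edge <- edges
        if path.to == edge.from
      } yield Path(path.from, edge.to)
      // union with the previous partial solution; a Set removes duplicates
      next union paths
    }
}
```

For the edges 1→2, 2→3, 3→4 the result contains six paths: the three edges plus 1→3, 2→4 and 1→4. Unlike the client-side loop, the whole iteration is a single program, so nothing has to be resubmitted per step.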
Large-Scale Machine Learning
Factorizing a matrix with 28 billion ratings for recommendations
(Scale of Netflix or Spotify)
More at: http://data-artisans.com/computing-recommendations-with-flink.html
Iterate natively with deltas
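[Diagram: delta iteration. Inputs: initial solution and initial workset. Inside the loop, step functions A and B (with operators X and Y, which may also read other datasets) consume the workset and the partial solution and produce a delta set and the next workset. The deltas are merged into the partial solution, the workset is replaced, and the final partial solution is the iteration result]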
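A small sketch of the workset idea on plain Scala collections, using connected components as the example problem (an assumption; the slides do not name the algorithm). Only vertices whose component id changed in the previous round are processed again. `DeltaIterationSketch` and `connectedComponents` are hypothetical names, and every edge endpoint is assumed to be listed in `vertices`.

```scala
object DeltaIterationSketch {
  // Returns (component id per vertex, number of updated elements per iteration).
  def connectedComponents(
      vertices: Set[Int],
      edges: Set[(Int, Int)]): (Map[Int, Int], List[Int]) = {
    // undirected adjacency lists
    val neighbors: Map[Int, Set[Int]] =
      (edges ++ edges.map(_.swap)).groupBy(_._1)
        .map { case (v, es) => v -> es.map(_._2) }

    var solution: Map[Int, Int] = vertices.map(v => v -> v).toMap // initial solution
    var workset: Map[Int, Int] = solution                         // initial workset
    var updates = List.empty[Int]

    while (workset.nonEmpty) {
      // each changed vertex proposes its component id to its neighbors
      val candidates = for {
        (v, comp) <- workset.toSeq
        n <- neighbors.getOrElse(v, Set.empty[Int]).toSeq
      } yield n -> comp
      // delta set: smallest proposal per vertex, kept only if it improves
      val delta = candidates.groupBy(_._1)
        .map { case (v, cs) => v -> cs.map(_._2).min }
        .filter { case (v, comp) => comp < solution(v) }
      solution = solution ++ delta // merge deltas into the partial solution
      workset = delta              // replace the workset
      updates = updates :+ delta.size
    }
    (solution, updates)
  }
}
```

For vertices 1 to 5 with edges (1,2), (2,3), (4,5), the update counts per iteration are 3, 1, 0: the work shrinks quickly as the solution converges, which is the effect delta iterations exploit.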
Effect of delta iterations…
[Chart: # of elements updated (y-axis, 0 to 45,000,000) per iteration (x-axis, 1 to 61)]
… very fast graph analysis
… and mix and match ETL-style and graph analysis in one program
Performance competitive with dedicated graph analysis systems
More at: http://data-artisans.com/data-analysis-with-flink.html
Flink Roadmap for 2015
Out-of-core state in Streaming
Monitoring and scaling for streaming
Streaming Machine Learning with SAMOA
More additions to the libraries
• Batch Machine Learning
• Graph library additions (more algorithms)
SQL on top of expression language
Master failover
Flink community
[Chart: # unique contributor ids by git commits (y-axis, 0 to 120), Aug-10 to Jul-15]
Cornerpoints of Flink Design
Robust Algorithms on Managed Memory
• No OutOfMemory Errors
• Scales to very large JVMs
• Efficient and robust processing
Pipelined Execution of Batch Programs
• Better shuffle performance
• Scales to very large groups
Flexible Data Streaming Engine
• Low Latency Stream Proc.
• Highly flexible windows
Native Iterations
• Very fast Graph Processing
• Stateful Iterations for ML
High-level APIs, beyond key/value pairs
• Java/Scala/Python (upcoming)
• Relational-style optimizer
Active Library Development
• Graphs / Machine Learning
• Streaming ML (coming)
A simple program
val orders = …
val lineitems = …

val filteredOrders = orders
  .filter(o => dateFormat.parse(o.shipDate).after(date))
  .filter(o => o.shipPrio > 2)

val lineitemsOfOrders = filteredOrders
  .join(lineitems)
  .where("orderId").equalTo("orderId")
  .apply((o, l) => new SelectedItem(o.orderDate, l.extdPrice))

val priceSums = lineitemsOfOrders
  .groupBy("orderDate")
  .sum("l.extdPrice")
Two execution plans
[Diagram: two alternative execution plans for the program. Both read DataSource orders.tbl (then Filter, Map) and DataSource lineitem.tbl into a Hybrid Hash Join (build HT / probe), followed by GroupRed (sort). One plan ships the join inputs with broadcast and forward and adds a Combine before the GroupRed. The other hash-partitions both join inputs on [0]. Further edge labels: hash-part [0,1], forward]
Best plan depends on relative sizes of input files
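Why input sizes matter can be shown with a toy cost model. This is an assumption made for illustration, not the cost function of the Flink optimizer: broadcasting ships the smaller input to every parallel instance, while hash-partitioning ships both inputs once. All names are hypothetical.

```scala
object JoinStrategySketch {
  sealed trait Strategy
  case object BroadcastSmallerInput extends Strategy
  case object HashPartitionBothInputs extends Strategy

  // Toy model (assumption): cost = bytes sent over the network.
  def choose(leftBytes: Long, rightBytes: Long, parallelism: Int): Strategy = {
    // broadcast: the smaller input is copied to every parallel instance
    val broadcastCost = math.min(leftBytes, rightBytes) * parallelism
    // repartition: each input is shipped once
    val repartitionCost = leftBytes + rightBytes
    if (broadcastCost < repartitionCost) BroadcastSmallerInput
    else HashPartitionBothInputs
  }
}
```

Under this model, a 1 MB input joined with a 10 GB input at parallelism 16 favors the broadcast plan, while a 5 GB input joined with a 10 GB input favors hash-partitioning both sides.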
Examples of optimization
Task chaining
• Coalesce map/filter/etc tasks
Join optimizations
• Broadcast/partition, build/probe side, hash or sort-merge
Interesting properties
• Re-use partitioning and sorting for later operations
Automatic caching
• E.g., for iterations