
Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Page 1: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Stefan Richter (@stefanrrichter)

29.10.2016

A look at Flink 1.2 and beyond

Page 2: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Agenda

▪ Flink 1.2 feature overview & walkthrough
▪ Taking a closer look at two features:
  ▪ Queryable state
  ▪ Dynamic scaling

Page 3: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Feature Overview
Flink Release 1.2

Page 4: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Flink 1.1+ ongoing development

Categories: Operations, Ecosystem, Application Features, Broader Audience

Topics: Session Windows; (Stream) SQL; Library enhancements; Metric System; Metrics & Visualization; Dynamic Scaling; Savepoint compatibility; Checkpoints to savepoints; Connectors in Flink; Stream SQL Windows; Large state maintenance; Fine-grained recovery; Side in-/outputs; Window DSL; Security; Mesos & others; Dynamic Resource Management; Authentication; Queryable State; Apache Bahir connectors

Page 9: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Flink 1.2 Improvements

Categories: Operations, Ecosystem, Application Features, Broader Audience

Topics: Session Windows; (Stream) SQL; Library enhancements; Metric System; Metrics & Visualization; Dynamic Scaling; Savepoint compatibility; Checkpoints to savepoints; Connectors in Flink; Stream SQL Windows; Large state maintenance; Fine-grained recovery; Side in-/outputs; Window DSL; Security; Mesos & others; Dynamic Resource Management; Authentication; Queryable State; Apache Bahir connectors

Page 10: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Security / Authentication - Flink 1.2

Prevent malicious users from hooking into Flink jobs

Authorized data access: secured clusters with Kerberos-based authentication
• Kafka, ZooKeeper, HDFS, YARN, HBase, …

Encrypted traffic between Flink processes
• RPC, Data Exchange, Web UI, … - "SSL for all connections"

Largely contributed by [company logo]

Page 11: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Cluster Management - Flink 1.1

• Standalone
• Flink on YARN

Page 12: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Cluster Management - Flink 1.2

• Standalone
• Flink on YARN
• Flink on Mesos

Mesos integration contributed by [company logo]

Page 13: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Cluster Management - Beyond 1.2

Efforts to seamlessly interoperate with various cluster managers.

Generalized abstraction (FLIP-6).

Driven by [company logos]

Page 14: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Cluster Management - Beyond (ct’d)

Flow with two components (TaskManager, JobManager):
(1) Register
(2) Deploy Tasks

Flow with Dispatcher, JobManager, ResourceManager, and TaskManager:
(0) Start JobManager
(1) Request slots
(2) Start TaskManager
(3) Register
(4) Deploy Tasks

Page 19: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Metrics

▪ Rates
▪ Latency (operator)
▪ Visualization in WebUI

Page 20: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Savepoint / Checkpoint Robustness

▪ Resume job from checkpoints (diagram labels: C, S)
▪ Use older checkpoint on failed recovery (diagram labels: C1, C2, C3 on a time axis t)
▪ Skip failed checkpoints
▪ Backwards compatible (diagram labels: S, 1.1, 1.2)
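The "use older checkpoint on failed recovery" bullet can be illustrated with a small, self-contained sketch. This is not Flink's recovery code; the class name `CheckpointFallback`, the method `pickRestorable`, and the `canRestore` predicate are hypothetical and only show the fallback order (newest first, then older ones).

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical helper, not part of Flink: picks the newest checkpoint that
// can actually be restored, falling back to older ones when restore fails.
final class CheckpointFallback {
    static Optional<String> pickRestorable(List<String> completedOldestFirst,
                                           Predicate<String> canRestore) {
        // Walk from the newest completed checkpoint towards the oldest.
        for (int i = completedOldestFirst.size() - 1; i >= 0; i--) {
            String candidate = completedOldestFirst.get(i);
            if (canRestore.test(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty(); // nothing restorable
    }
}
```

With completed checkpoints C1, C2, C3 and a predicate that rejects C3, the sketch returns C2.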

Page 25: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Processing Function

Layers shown: Stream SQL, Streaming API, Processing Function, Window Operator, Timer Handling

Problem: Implement custom windowing?

Page 26: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Processing Function

Layers shown: Stream SQL, Streaming API, Processing Function, Window Operator, Timer Handling

Interface ProcessingFunction:

void flatMap(I value, Context ctx, Collector<O> out) throws Exception;

void onTimer(long timestamp, OnTimerContext ctx, Collector<O> out) throws Exception;
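To make the two callbacks concrete, here is a minimal sketch of custom windowing built on an element callback plus a timer callback. All types below (`Collector`, `Context`, `ProcessingFunctionSketch`, `CountWithTimeout`, `TimerDemo`) are simplified, hypothetical stand-ins written for this example; they are not Flink's classes, and the toy driver is single-threaded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical stand-ins that only mirror the two callbacks on the slide.
interface Collector<O> { void collect(O record); }

interface Context { void registerTimer(long timestamp); }

interface ProcessingFunctionSketch<I, O> {
    void flatMap(I value, Context ctx, Collector<O> out) throws Exception;
    void onTimer(long timestamp, Context ctx, Collector<O> out) throws Exception;
}

// Custom "window": count elements and emit the count when a timer fires
// a fixed delay after the first element of each batch.
final class CountWithTimeout implements ProcessingFunctionSketch<Long, String> {
    private final long delay;
    private long count = 0;

    CountWithTimeout(long delay) { this.delay = delay; }

    @Override
    public void flatMap(Long eventTime, Context ctx, Collector<String> out) {
        if (count == 0) {
            ctx.registerTimer(eventTime + delay); // first element arms the timer
        }
        count++;
    }

    @Override
    public void onTimer(long timestamp, Context ctx, Collector<String> out) {
        out.collect("count=" + count + " at t=" + timestamp);
        count = 0; // start a new batch
    }
}

// Toy driver: fires all due timers before handing over each element.
final class TimerDemo {
    static List<String> run(long[] eventTimes, long delay, long endOfInput) {
        List<String> output = new ArrayList<>();
        TreeSet<Long> timers = new TreeSet<>();
        CountWithTimeout fn = new CountWithTimeout(delay);
        Context ctx = timers::add;
        Collector<String> out = output::add;
        for (long t : eventTimes) {
            fire(fn, timers, t, ctx, out);
            fn.flatMap(t, ctx, out);
        }
        fire(fn, timers, endOfInput, ctx, out);
        return output;
    }

    private static void fire(CountWithTimeout fn, TreeSet<Long> timers, long now,
                             Context ctx, Collector<String> out) {
        while (!timers.isEmpty() && timers.first() <= now) {
            fn.onTimer(timers.pollFirst(), ctx, out);
        }
    }
}
```

Calling `TimerDemo.run(new long[]{1, 2, 3, 20}, 10, 100)` emits one count for the first three elements when the timer at t=11 fires, and a second count for the last element at t=30.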

Page 27: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Table API & Stream SQL

▪ Group-windows
  Example: table.groupBy('user').window(Session withGap 10.minutes on 'rowtime').select('uid', 'product.count')
▪ More SQL operations
  Example: EXISTS, VALUES, LIMIT
▪ More built-in scalar functions
  Example: CURRENT_DATE, INITCAP, NULLIF
▪ More datatypes & better integration
  Example: pojo.get('field'), pojo.flatten()
▪ User-defined scalar functions
  Example: table.select('uid', parseName('userJson'))
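The session window in the group-window example closes after a period of inactivity (the gap). The following plain-Java sketch illustrates that grouping rule only; it does not use the Table API, and the class `SessionWindows` and its boundary handling (a distance exactly equal to the gap stays in the same session) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: groups ascending timestamps into sessions.
final class SessionWindows {
    // A new session starts whenever the distance to the previous
    // timestamp is strictly larger than the gap.
    static List<List<Long>> sessionize(long[] sortedTimestamps, long gap) {
        List<List<Long>> sessions = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        for (int i = 0; i < sortedTimestamps.length; i++) {
            if (i > 0 && sortedTimestamps[i] - sortedTimestamps[i - 1] > gap) {
                sessions.add(current);
                current = new ArrayList<>();
            }
            current.add(sortedTimestamps[i]);
        }
        if (!current.isEmpty()) {
            sessions.add(current);
        }
        return sessions;
    }
}
```

For timestamps 0, 5, 12, 40, 45 and a gap of 10, the sketch yields two sessions: (0, 5, 12) and (40, 45). An aggregate such as a count would then be computed once per session and user.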

Page 38: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Many more improvements…

▪ Kafka 0.10 (with watermarks)
▪ Bucketing Sink: divides output into different files w.r.t. user logic
▪ Detached execution: first step in programmatically controlled jobs
▪ Async IO operator: non-blocking queries to external systems
▪ Improved scalability, robustness + bugfixes

Page 39: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Queryable State
Flink 1.2

Page 40: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Queryable State - Motivation

Realtime Queries

Periodically (every second) flush new aggregates to Redis

Page 41: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Queryable State - Motivation

[Chart; axis label: Number of Keys]

Page 43: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Queryable State - Motivation

Realtime Queries: Where is the bottleneck?

Writes to the key/value store take too long

Page 44: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Queryable State - Idea

"Stream processor as a database"

Realtime Queries

Archive Database (optional + only at end of windows)

Page 45: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Queryable State - Performance

[Chart; axis label: Number of Keys]

Page 46: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Queryable State - Implementation

Components: Query Client; Job Manager (ExecutionGraph, State Location Server); Task Managers (each with a State Registry and local state of operators such as window() / sum())

Query: /job/operation/state-name/key

(1) Get location of "key-partition" for "operator" of "job"
(2) Look up location
(3) Respond location
(4) Query state-name and key

Other arrows in the diagram: deploy, status, register
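The four query steps can be sketched as a two-hop lookup: first resolve which task manager holds the key, then ask that task manager for the value. The classes below (`TaskManagerSketch`, `StateLocationServerSketch`, `QueryClientSketch`) are hypothetical stand-ins, not Flink's queryable state API. The job and operator parts of the query path are left out, and deriving the key partition from the key's hash code is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a task manager with its local state registry.
final class TaskManagerSketch {
    // state name -> (key -> value)
    private final Map<String, Map<String, Long>> registry = new HashMap<>();

    void register(String stateName, String key, long value) {
        registry.computeIfAbsent(stateName, n -> new HashMap<>()).put(key, value);
    }

    Long query(String stateName, String key) {
        Map<String, Long> state = registry.get(stateName);
        return state == null ? null : state.get(key);
    }
}

// Hypothetical stand-in for the location lookup on the job manager side.
final class StateLocationServerSketch {
    private final TaskManagerSketch[] taskManagers;

    StateLocationServerSketch(int parallelism) {
        taskManagers = new TaskManagerSketch[parallelism];
        for (int i = 0; i < parallelism; i++) {
            taskManagers[i] = new TaskManagerSketch();
        }
    }

    // Assumption: the key partition is derived from the key's hash code.
    TaskManagerSketch locate(String key) {
        int partition = Math.floorMod(key.hashCode(), taskManagers.length);
        return taskManagers[partition];
    }
}

final class QueryClientSketch {
    static Long query(StateLocationServerSketch server, String stateName, String key) {
        TaskManagerSketch owner = server.locate(key); // steps (1) to (3)
        return owner.query(stateName, key);           // step (4)
    }
}
```

Because the job writes through the same `locate` call that the client reads through, a value registered for a key is found again without scanning the other task managers.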

Page 52: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Queryable State Enablers

▪ Flink has state as a first class citizen
▪ State is fault tolerant (exactly once semantics)
▪ State is partitioned (sharded) together with the operators that create/update it
▪ State is continuous (not mini batched)
▪ State is scalable (e.g., embedded RocksDB state backend)

Page 53: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Dynamic Scaling
Flink 1.2

Page 54: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Motivation - Changing Workloads

Page 59: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Motivation - Resource Adaption

[Two charts, each plotting Workload and Resources over time]

Page 60: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Basic Idea

• Spread work across more workers to decrease the workload per worker

Page 61: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Scaling Stateless Jobs

Pipeline: Source, Mapper, Sink (diagrams: Scale Up, Scale Down)

• Scale up: Deploy new tasks
• Scale down: Cancel running tasks

Page 62: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Scaling Stateful Jobs

• Problem 1: Which state to assign to new task?
• Problem 2: Read + filter whole state?

Page 63: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Non-keyed vs Keyed State

Non-keyed
• State bound only to operator
• E.g. Source state

Keyed
• State bound to an operator + key
• E.g. Keyed UDF and window state
• "SELECT count(*) FROM t GROUP BY t.key"

Page 65: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Repartitioning Non-keyed state

Idea: break up state into finer granules that can be redistributed independently

Flink 1.1:
T snapshot()
void restore(T)

Flink 1.2:
List<T> snapshot()
void restore(List<T>)

[Diagram: sub-states #1 to #4 before and after redistribution]
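Once each task returns a list of sub-states instead of one opaque object, the sub-states can be reassigned when the parallelism changes. The sketch below redistributes them round-robin. The class `ListStateRedistribution` is hypothetical, the round-robin policy is an assumption chosen for illustration, and the strings stand for Kafka partition offsets such as "partition 1 at offset 42".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reassigns snapshotted sub-states to a new number of tasks.
final class ListStateRedistribution {
    // snapshots.get(i) is the list returned by snapshot() of old task i;
    // result.get(j) is the list passed to restore(...) of new task j.
    static <T> List<List<T>> redistribute(List<List<T>> snapshots, int newParallelism) {
        List<List<T>> result = new ArrayList<>();
        for (int i = 0; i < newParallelism; i++) {
            result.add(new ArrayList<>());
        }
        int next = 0;
        for (List<T> snapshot : snapshots) {
            for (T subState : snapshot) {
                result.get(next % newParallelism).add(subState); // round-robin
                next++;
            }
        }
        return result;
    }
}
```

Scaling two tasks holding ("p1@42", "p3@10") and ("p6@27") out to three tasks gives each task one sub-state; scaling in to one task gives that task all three.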

Page 66: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Example: Kafka Source Flink 1.1

partitionId: 1, offset: 42
partitionId: 3, offset: 10
partitionId: 6, offset: 27

Operator state is a black box. How to repartition?

Page 67: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Example: Kafka Source Flink 1.2

partitionId: 1, offset: 42
partitionId: 3, offset: 10
partitionId: 6, offset: 27

Return a list of sub-states which can be freely repartitioned.

Page 68: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Example: Kafka Source Flink 1.2

Scale Out

partitionId: 1, offset: 42
partitionId: 6, offset: 27
partitionId: 3, offset: 10

Page 70: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Example: Kafka Source Flink 1.2

Scale In

partitionId: 1, offset: 42
partitionId: 6, offset: 27
partitionId: 3, offset: 10

Page 73: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Repartitioning Keyed State

▪ Split key space into key groups

▪ Every key falls into exactly one key group

▪ Assign key groups to tasks

[Diagram: the key space is split into key groups #1 to #4; one key is shown inside a single key group]

Page 74: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Repartitioning Keyed State (ct’d)

▪ Rescaling changes key group assignment

▪ Maximum parallelism defined by #key groups
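A minimal sketch of the key group idea, assuming the number of key groups equals the maximum parallelism: a key is hashed to a key group once, and only the mapping from key groups to tasks changes when the job is rescaled. The class `KeyGroups` and both formulas are illustrative assumptions; Flink's actual hashing may differ, and key groups are numbered from 0 here rather than from 1 as on the slide.

```java
// Hypothetical helper illustrating key groups; not Flink's implementation.
final class KeyGroups {
    // Every key falls into exactly one key group, independent of parallelism.
    static int keyGroupFor(Object key, int maxParallelism) {
        return Math.floorMod(key.hashCode(), maxParallelism);
    }

    // Key groups are handed out to tasks in contiguous ranges.
    static int taskFor(int keyGroup, int maxParallelism, int parallelism) {
        return keyGroup * parallelism / maxParallelism;
    }
}
```

With 4 key groups and parallelism 2, key groups 0 and 1 go to task 0 and key groups 2 and 3 go to task 1. With parallelism 4, each task owns one key group, and no higher parallelism can be used because there are no further key groups to hand out.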


Page 75: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Current State in Flink 1.2

▪ Manual rescaling
  1. Take savepoint
  2. Restart job with adjusted parallelism and savepoint

Page 76: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Next Steps beyond Flink 1.2

▪ Rescaling individual operators w/o restart
▪ Refactor Flink deployment and process model (previously discussed)
▪ On-the-fly scaling

Page 77: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Autoscaling Policies

• Latency
• Throughput
• Resource utilization

• Kubernetes on GCE, EC2 and Mesos (marathon-autoscale) already support auto-scaling
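A threshold-based policy over such metrics could look like the sketch below. It is purely illustrative: the class `AutoscalePolicySketch`, the thresholds, and the step size of one worker are assumptions, and throughput is left out to keep the example short.

```java
// Hypothetical policy, not taken from Flink or any cluster manager.
final class AutoscalePolicySketch {
    // Assumed thresholds; real values would come from measurements.
    static final double MAX_LATENCY_MS = 500.0;
    static final double HIGH_UTILIZATION = 0.80;
    static final double LOW_UTILIZATION = 0.30;

    static int targetParallelism(int current, int maxParallelism,
                                 double latencyMs, double utilization) {
        int target = current;
        if (latencyMs > MAX_LATENCY_MS || utilization > HIGH_UTILIZATION) {
            target = current + 1; // overloaded: add a worker
        } else if (utilization < LOW_UTILIZATION) {
            target = current - 1; // mostly idle: release a worker
        }
        // Never go below one worker or above the maximum parallelism.
        return Math.max(1, Math.min(maxParallelism, target));
    }
}
```

The upper bound matters for keyed state: the parallelism cannot exceed the number of key groups, so the policy clamps its result to `maxParallelism`.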

Page 80: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Conclusion

▪ Many great features in Flink 1.2
  ▪ Walkthrough
  ▪ Queryable State & Dynamic Scaling
▪ Glimpse beyond the 1.2 release

Page 81: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Thank you!

@stefanrrichter @ApacheFlink @dataArtisans

Page 82: Stefan Richter - A look at Flink 1.2 and beyond @ Berlin Meetup

Questions?
