31
Università degli Studi di Roma Tor VergataDipartimento di Ingegneria Civile e Ingegneria Informatica DSP Frameworks Corso di Sistemi e Architetture per Big Data A.A. 2017/18 Valeria Cardellini DSP frameworks we consider Apache Storm (with lab) Twitter Heron From Twitter as Storm and compatible with Storm Apache Spark Streaming (lab) Reduce the size of each stream and process streams of data (micro-batch processing) Apache Flink Apache Samza Cloud-based frameworks Google Cloud Dataflow Amazon Kinesis Streams 1 Valeria Cardellini - SABD 2017/18

DSP Frameworks · 2018-05-26 · Apache Storm • Apache Storm – Open-source, real-time, scalable streaming system – Provides an abstraction layer to execute DSP applications

  • Upload
    others

  • View
    7

  • Download
    0

Embed Size (px)

Citation preview

Page 1: DSP Frameworks · 2018-05-26 · Apache Storm • Apache Storm – Open-source, real-time, scalable streaming system – Provides an abstraction layer to execute DSP applications

Università degli Studi di Roma “Tor Vergata” Dipartimento di Ingegneria Civile e Ingegneria Informatica

DSP Frameworks

Corso di Sistemi e Architetture per Big Data A.A. 2017/18

Valeria Cardellini

DSP frameworks we consider •  Apache Storm (with lab) •  Twitter Heron

–  From Twitter as Storm and compatible with Storm

•  Apache Spark Streaming (lab) –  Reduce the size of each stream and process streams

of data (micro-batch processing) •  Apache Flink •  Apache Samza •  Cloud-based frameworks

–  Google Cloud Dataflow –  Amazon Kinesis Streams

1 Valeria Cardellini - SABD 2017/18

Page 2: DSP Frameworks · 2018-05-26 · Apache Storm • Apache Storm – Open-source, real-time, scalable streaming system – Provides an abstraction layer to execute DSP applications

Apache Storm •  Apache Storm

–  Open-source, real-time, scalable streaming system –  Provides an abstraction layer to execute DSP

applications –  Initially developed by Twitter

•  Topology –  DAG of spouts (sources of streams) and bolts

(operators and data sinks)

2 Valeria Cardellini - SABD 2017/18

Stream grouping in Storm

•  Data parallelism in Storm: how are streams partitioned among multiple tasks (threads of execution)?

•  Shuffle grouping –  Randomly partitions the tuples

•  Field grouping –  Hashes on a subset of the tuple attributes

Valeria Cardellini - SABD 2017/18

3

Page 3: DSP Frameworks · 2018-05-26 · Apache Storm • Apache Storm – Open-source, real-time, scalable streaming system – Provides an abstraction layer to execute DSP applications

Stream grouping in Storm

•  All grouping (i.e., broadcast) –  Replicates the entire stream to all the consumer

tasks

•  Global grouping –  Sends the entire stream to a single task of a bolt

•  Direct grouping –  The producer of the tuple decides which task of

the consumer will receive this tuple

Valeria Cardellini - SABD 2017/18

4

Storm architecture

5 Valeria Cardellini - SABD 2017/18

•  Master-worker architecture

Page 4: DSP Frameworks · 2018-05-26 · Apache Storm • Apache Storm – Open-source, real-time, scalable streaming system – Provides an abstraction layer to execute DSP applications

Storm components: Nimbus and Zookeeper

•  Nimbus –  The master node –  Clients submit topologies to it –  Responsible for distributing and coordinating the

topology execution

•  Zookeeper –  Nimbus uses a combination of the local disk(s)

and Zookeeper to store state about the topology

Valeria Cardellini - SABD 2017/18

6

worker process

executor executorTHREAD THREAD

JAVA PROCESS

task

task

task

task

task

Storm components: worker

•  Task: operator instance –  The actual work for a bolt or a spout is done in the

task •  Executor: smallest schedulable entity

–  Execute one or more tasks related to same operator •  Worker process: Java process running one or

more executors •  Worker node: computing

resource, a container for one or more worker processes

7 Valeria Cardellini - SABD 2017/18

Page 5: DSP Frameworks · 2018-05-26 · Apache Storm • Apache Storm – Open-source, real-time, scalable streaming system – Provides an abstraction layer to execute DSP applications

Storm components: supervisor

•  Each worker node runs a supervisor

The supervisor: •  receives assignments from Nimbus (through

ZooKeeper) and spawns workers based on the assignment

•  sends to Nimbus (through ZooKeeper) a periodic heartbeat;

•  advertises the topologies that they are currently running, and any vacancies that are available to run more topologies

Valeria Cardellini - SABD 2017/18

8

Twitter Heron

•  Realtime, distributed, fault-tolerant stream processing engine from Twitter

•  Developed as direct successor of Storm –  Released as open source in 2016

https://twitter.github.io/heron/ –  De facto stream data processing engine inside

Twitter •  Goal of overcoming Storm’s performance,

reliability, and other shortcomings •  Compatibility with Storm

–  API compatible with Storm: no code change is required for migration

Valeria Cardellini - SABD 2017/18

9

Page 6: DSP Frameworks · 2018-05-26 · Apache Storm • Apache Storm – Open-source, real-time, scalable streaming system – Provides an abstraction layer to execute DSP applications

Heron: in common with Storm

•  Same terminology of Storm –  Topology, spout, bolt

•  Same stream groupings –  Shuffle, fields, all, global

•  Example: WordCount topology

Valeria Cardellini - SABD 2017/18

10

Heron: design goals •  Isolation

–  Process-based topologies rather than thread-based –  Each process should run in isolation (easy

debugging, profiling, and troubleshooting) –  Goal: overcoming Storm’s performance, reliability,

and other shortcomings •  Resource constraints

–  Safe to run in shared infrastructure: topologies use only initially allocated resources and never exceed bounds

•  Compatibility –  Fully API and data model compatible with Storm

Valeria Cardellini - SABD 2017/18

11

Page 7: DSP Frameworks · 2018-05-26 · Apache Storm • Apache Storm – Open-source, real-time, scalable streaming system – Provides an abstraction layer to execute DSP applications

Heron: design goals •  Backpressure

–  Built-in rate control mechanism to ensure that topologies can self-adjust in case components lag

–  Heron dynamically adjusts the rate at which data flows through the topology using backpressure

•  Performance –  Higher throughput and lower latency than Storm –  Enhanced configurability to fine-tune potential

latency/throughput trade-offs •  Semantic guarantees

–  Support for both at-most-once and at-least-once processing semantics

•  Efficiency –  Minimum possible resource usage

Valeria Cardellini - SABD 2017/18

12

Heron topology architecture

•  Master-work architecture •  One Topology Master (TM)

–  Manages a topology throughout its entire lifecycle

•  Multiple Containers –  Each Container multiple Heron Instances, a

Stream Manager, and a Metrics Manager –  A Heron Instance is a process that handles a

single task of a spout or bolt –  Containers communicate with TM to ensure that

the topology forms a fully connected graph

Valeria Cardellini - SABD 2017/18

13

Page 8: DSP Frameworks · 2018-05-26 · Apache Storm • Apache Storm – Open-source, real-time, scalable streaming system – Provides an abstraction layer to execute DSP applications

Heron topology architecture

Valeria Cardellini - SABD 2017/18

14

Heron topology architecture

•  Stream Manager (SM): routing engine for data streams –  Each Heron connects to its local SM, while all of the SMs in

a given topology connect to one another to form a network –  Responsbile for propagating back pressure

Valeria Cardellini - SABD 2017/18

15

Page 9: DSP Frameworks · 2018-05-26 · Apache Storm • Apache Storm – Open-source, real-time, scalable streaming system – Provides an abstraction layer to execute DSP applications

Topology submit sequence

Valeria Cardellini - SABD 2017/18

16

Self-adaptation in Heron •  Dhalion: framework on top of Heron to autonomously

reconfigure topologies to meet throughput SLOs, scaling resource consumption up and down as needed

Valeria Cardellini - SABD 2017/18

17

•  Phases in Dhalion: -  Symptom detection

(backpressure, skew, …)

-  Diagnosis generation -  Resolution

•  Adaptation actions: parallelism changes

Page 10: DSP Frameworks · 2018-05-26 · Apache Storm • Apache Storm – Open-source, real-time, scalable streaming system – Provides an abstraction layer to execute DSP applications

Heron environment •  Heron supports deployment on Apache

Mesos •  Can also run on Mesos using Apache Aurora

as a scheduler or using a local scheduler

Valeria Cardellini - SABD 2017/18

18

Batch processing vs. stream processing

•  Batch processing is just a special case of stream processing

Valeria Cardellini - SABD 2017/18

19

Page 11: DSP Frameworks · 2018-05-26 · Apache Storm • Apache Storm – Open-source, real-time, scalable streaming system – Provides an abstraction layer to execute DSP applications

Batch processing vs. stream processing

•  Batched/stateless: scheduled in batches –  Short-lived tasks (Hadoop, Spark) –  Distributed streaming over batches (Spark

Streaming)

•  Dataflow/stateful: continuous/scheduled once (Storm, Flink, Heron) –  Long-lived task execution –  State is kept inside tasks

Valeria Cardellini - SABD 2017/18

20

Native vs. non-native streaming

Valeria Cardellini - SABD 2017/18

21

Page 12: DSP Frameworks · 2018-05-26 · Apache Storm • Apache Storm – Open-source, real-time, scalable streaming system – Provides an abstraction layer to execute DSP applications

Apache Flink

•  Distributed data flow processing system •  One common runtime for DSP applications

and batch processing applications –  Batch processing applications run efficiently

as special cases of DSP applications •  Integrated with many other projects in the

open-source data processing ecosystem

Valeria Cardellini - SABD 2017/18

22

•  Derives from Stratosphere project by TU Berlin, Humboldt University and Hasso Plattner Institute

•  Support a Storm-compatible API

Flink: software stack

Valeria Cardellini - SABD 2017/18

23

•  On top: libraries with high-level APIs for different use cases, still in beta

Page 13: DSP Frameworks · 2018-05-26 · Apache Storm • Apache Storm – Open-source, real-time, scalable streaming system – Provides an abstraction layer to execute DSP applications

Flink: programming model

•  Data stream –  An unbounded, partitioned immutable sequence

of events •  Stream operators

–  Stream transformations that generate new output data streams from input ones

Valeria Cardellini - SABD 2017/18

24

Flink: some features

•  Supports stream processing and windowing with Event Time semantics –  Event time makes it easy to compute over streams

where events arrive out of order, and where events may arrive delayed

•  Exactly-once semantics for stateful computations

•  Highly flexible streaming windows

Valeria Cardellini - SABD 2017/18

25

Page 14: DSP Frameworks · 2018-05-26 · Apache Storm • Apache Storm – Open-source, real-time, scalable streaming system – Provides an abstraction layer to execute DSP applications

Flink: some features

•  Continuous streaming model with backpressure

•  Flink's streaming runtime has natural flow control: slow data sinks backpressure faster sources

Valeria Cardellini - SABD 2017/18

26

Flink: APIs and libraries

•  Streaming data applications: DataStream API –  Supports functional transformations on data

streams, with user-defined state, and flexible windows

–  Example: how to compute a sliding histogram of word occurrences of a data stream of texts

Valeria Cardellini - SABD 2017/18

27

WindowWordCount in Flink's DataStream API

Sliding time window of 5 sec length and 1 sec trigger interval

Page 15: DSP Frameworks · 2018-05-26 · Apache Storm • Apache Storm – Open-source, real-time, scalable streaming system – Provides an abstraction layer to execute DSP applications

Flink: APIs and libraries •  Batch processing applications: DataSet API •  Supports a wide range of data types beyond

key/value pairs, and a wealth of operators

Valeria Cardellini - SABD 2017/18

28

Core loop of the PageRank algorithm for graphs

Flink: program optimization

•  Batch programs are automatically optimized to exploit situations where expensive operations (like shuffles and sorts) can be avoided, and when intermediate data should be cached

Valeria Cardellini - SABD 2017/18

29

Page 16: DSP Frameworks · 2018-05-26 · Apache Storm • Apache Storm – Open-source, real-time, scalable streaming system – Provides an abstraction layer to execute DSP applications

Flink: control events

•  Control events: special events injected in the data stream by operators

•  Periodically, the data source injects checkpoint barriers into the data stream by dividing the stream into pre-checkpoint and post-checkpoint •  More coarse-grained approach than Storm: acks sequences of

records instead of individual records

•  Watermarks signal the progress of event-time within a stream partition

•  Flink does not provide ordering guarantees after any form of stream repartitioning or broadcasting –  Dealing with out-of-order records is left to the operator

implementation Valeria Cardellini - SABD 2017/18

30

Flink: fault-tolerance •  Based on Chandy-Lamport distributed

snapshots •  Lightweight mechanism

–  Allows to maintain high throughput rates and provide strong consistency guarantees at the same time

Valeria Cardellini - SABD 2017/18

31

Page 17: DSP Frameworks · 2018-05-26 · Apache Storm • Apache Storm – Open-source, real-time, scalable streaming system – Provides an abstraction layer to execute DSP applications

Flink: performance and memory management •  High performance and low latency

•  Memory management

–  Flink implements its own memory management inside the JVM

Valeria Cardellini - SABD 2017/18

32

Flink: architecture

Valeria Cardellini - SABD 2017/18

33

•  The usual master-worker architecture

Page 18: DSP Frameworks · 2018-05-26 · Apache Storm • Apache Storm – Open-source, real-time, scalable streaming system – Provides an abstraction layer to execute DSP applications

Flink: architecture

Valeria Cardellini - SABD 2017/18

34

•  Master (Job Manager): schedules tasks, coordinates checkpoints, coordinates recovery on failures, etc.

•  Workers (Task Managers): JVM processes that execute tasks of a dataflow, and buffer and exchange the data streams –  Workers use task slots to control the number of

tasks it accepts –  Each task slot represents a fixed subset of

resources of the worker

Flink: application execution

•  Jobs are expressed as data flows •  The job graph is transformed into the

execution graph •  The execution graph contain information to

schedule and execute a job

Valeria Cardellini - SABD 2017/18

35

Page 19: DSP Frameworks · 2018-05-26 · Apache Storm • Apache Storm – Open-source, real-time, scalable streaming system – Provides an abstraction layer to execute DSP applications

Flink: infrastructure

•  Designed to run on large-scale clusters with many thousands of nodes

•  Provides support for YARN and Mesos

Valeria Cardellini - SABD 2017/18

36

A recent need

•  A common need for many companies –  Run both batch and stream processing

•  Alternative solutions –  Lambda architecture –  Unified frameworks: Spark, Flink, Samza –  Unified programming model: Apache Beam

Valeria Cardellini - SABD 2017/18

37

Page 20: DSP Frameworks · 2018-05-26 · Apache Storm • Apache Storm – Open-source, real-time, scalable streaming system – Provides an abstraction layer to execute DSP applications

Lambda architecture •  Data-processing design pattern to integrate batch and

real-time processing •  Streaming framework used to process real-time events,

and, in parallel, batch framework (e.g., Hadoop/Spark) deployed to process the entire dataset (perhaps periodically)

•  Results from the two parallel pipelines are then merged

38 Source: https://voltdb.com/products/alternatives/lambda-architecture

Valeria Cardellini - SABD 2017/18

Lambda architecture: example

Valeria Cardellini - SABD 2017/18

39

•  Lambda architecture used at LinkedIn before Samza development

Page 21: DSP Frameworks · 2018-05-26 · Apache Storm • Apache Storm – Open-source, real-time, scalable streaming system – Provides an abstraction layer to execute DSP applications

Lambda architecture

•  Pros: –  Flexibility in the frameworks’ choice

•  Cons: –  Implementing and maintaining two separate

frameworks for batch and stream processing can be hard and error-prone

–  Overhead of developing and managing multiple source codes

•  The logic in each fork evolves over time, and keeping them in sync involves duplicated and complex manual effort, often with different languages

Valeria Cardellini - SABD 2017/18

40

Unified frameworks

•  Use a unified (Lambda-less) design for processing both real-time as well as batch data using the same data structure

•  Spark, Flink and Samza follow this trend

Valeria Cardellini - SABD 2017/18

41

Page 22: DSP Frameworks · 2018-05-26 · Apache Storm • Apache Storm – Open-source, real-time, scalable streaming system – Provides an abstraction layer to execute DSP applications

Apache Beam •  A new layer of abstraction •  Provides advanced unified programming model

–  Allows to define batch and streaming data processing pipelines that run on any execution engine (for now: Flink, Spark, Google Cloud Dataflow)

–  Java, Python and Go as programming languages •  Translates the data processing pipeline defined

by the user with the Beam program into the API compatible with the chosen distributed processing engine

•  Developed by Google and released as open-source top-level project

Valeria Cardellini - SABD 2017/18

42

Apache Samza

•  A distributed system for stateful and fault-tolerant stream processing –  A unified framework for batch and stream

processing: similarly to Flink, streams as first-class citizen, batch as special case of streaming

–  Production at LinkedIn since 2014

•  Uses Kafka for messaging and YARN to provide fault tolerance, processor isolation, security, and resource management

Valeria Cardellini - SABD 2017/18

43

Page 23: DSP Frameworks · 2018-05-26 · Apache Storm • Apache Storm – Open-source, real-time, scalable streaming system – Provides an abstraction layer to execute DSP applications

Apache Samza

•  Why stateful and fault-tolerant processing? User profiles, email digests, aggregate counts, …

•  Example: Email Digestion System –  Production application running at LinkedIn using Samza to

digest updates into one email

Valeria Cardellini - SABD 2017/18

44

Apache Samza: features

•  Provides distributed and scalable data processing with –  Configurable and heterogeneous data sources

and sinks (e.g., Kafka, HDFS, AWS Kinesis) –  Efficient state management: local state (in memory

or on disk) partitioned among tasks (rather than using a remote data store) and incremental checkpointing (more efficient than full state checkpointing)

–  Unified processing API for stream and batch processing

•  Differently for Flink

–  Flexible deployment models

Valeria Cardellini - SABD 2017/18

45

Page 24: DSP Frameworks · 2018-05-26 · Apache Storm • Apache Storm – Open-source, real-time, scalable streaming system – Provides an abstraction layer to execute DSP applications

Apache Samza: architecture

•  Samza is made up of three layers: –  A streaming layer: Kafka –  An execution layer: YARN –  A processing layer: Samza API

Valeria Cardellini - SABD 2017/18

46

Apache Samza: architecture

•  Samza, YARN and Kafka interaction

Valeria Cardellini - SABD 2017/18

47

•  RM: YARN ResourceManager •  NM: YARN NodeManager •  AM: ApplicationMaster

•  Samza client uses YARN to run a Samza job

•  YARN starts and supervises one or more SamzaContainers, and the processing code (using the StreamTask API) runs inside those containers

•  The input and output for the Samza StreamTasks come from Kafka brokers

Page 25: DSP Frameworks · 2018-05-26 · Apache Storm • Apache Storm – Open-source, real-time, scalable streaming system – Provides an abstraction layer to execute DSP applications

Apache Samza: running an application

•  Count the number of page views aggregated by user ID

•  Two Samza jobs: one to group messages by user ID, the other to do the counting

Valeria Cardellini - SABD 2017/18

48

Apache Samza: state management

•  Common approach (e.g., in Storm) to deal with large amounts of state: use a remote data store (e.g., Redis)

•  Samza approach: keep state local to each node and make it robust to machine failures by replicating state changes across multiple machines

Valeria Cardellini - SABD 2017/18

49

Page 26: DSP Frameworks · 2018-05-26 · Apache Storm • Apache Storm – Open-source, real-time, scalable streaming system – Provides an abstraction layer to execute DSP applications

Apache Samza: high-level API

Valeria Cardellini - SABD 2017/18

50

Apache Samza: example application

•  Samza job to find top-k trending tags –  DAG for the logical representation of the

application

Valeria Cardellini - SABD 2017/18

51

Page 27: DSP Frameworks · 2018-05-26 · Apache Storm • Apache Storm – Open-source, real-time, scalable streaming system – Provides an abstraction layer to execute DSP applications

Apache Samza: example application

Valeria Cardellini - SABD 2017/18

52

Towards strict delivery guarantees •  Most frameworks provide at-least-once delivery

guarantees (e.g., Storm, Samza) –  For stateful non-idempotent operators such as counting, at-

least-once delivery guarantees can give incorrect results

•  Flink, Storm with Trident, and Google’s MillWheel offer stronger delivery guarantees (i.e., exactly-once) –  Exactly-once low latency stream processing in MillWheel works

as follows: •  The record is checked against de-duplication data from previous

deliveries; duplicates are discarded •  User code is run for the input record, possibly resulting in pending

changes to timers, state, and productions •  Pending changes are committed to the backing store •  Senders are acked •  Pending downstream productions are sent

Valeria Cardellini - SABD 2017/18

53

Source: MillWheel: Fault-Tolerant Stream Processing at Internet Scale, VLDB 2013.

Page 28: DSP Frameworks · 2018-05-26 · Apache Storm • Apache Storm – Open-source, real-time, scalable streaming system – Provides an abstraction layer to execute DSP applications

DSP in the Cloud

•  Data streaming systems are also offered as Cloud services –  Amazon Kinesis Data Streams –  Google Cloud Dataflow –  IBM Streaming Analytics –  Microsoft Azure Stream Analytics

•  Abstract the underlying infrastructure and support dynamic scaling of the computing resources

•  Appear to execute in a single data center

Valeria Cardellini - SABD 2017/18

54

Google Cloud Dataflow

•  Fully-managed data processing service, supporting both stream and batch execution of pipelines –  Transparently handles resource lifetime and can dynamically

provision resources to minimize latency while maintaining high utilization efficiency

–  On-demand and auto-scaling

•  Provides a unified programming model and a managed service for developing and executing a wide range of data processing patterns including ETL, batch computation, and continuous computation –  Apache Beam model

•  Provides exactly-once processing –  MillWheel is Google’s internal version of Cloud Dataflow

Valeria Cardellini - SABD 2017/18

55

Page 29: DSP Frameworks · 2018-05-26 · Apache Storm • Apache Storm – Open-source, real-time, scalable streaming system – Provides an abstraction layer to execute DSP applications

Google Cloud Dataflow

•  Seamlessly integrates with other Google cloud services –  Cloud Storage, Cloud Pub/Sub, Cloud Datastore, Cloud

Bigtable, and BigQuery

•  Apache Beam SDKs, available in Java and Python –  Enable developers to implement custom extensions and

choose other execution engines

Valeria Cardellini - SABD 2017/18

56

Amazon Kinesis Data Streams

•  Allows to build applications that process and analyze streaming data

Valeria Cardellini - SABD 2017/18

57

Page 30: DSP Frameworks · 2018-05-26 · Apache Storm • Apache Storm – Open-source, real-time, scalable streaming system – Provides an abstraction layer to execute DSP applications

Kinesis Data Streams: components •  Data record

–  Unit of data stored by Kinesis

•  Stream: group of data records –  Data producers write data records to Kinesis streams –  Data records in the stream are distributed into shards

•  Shard: sequence of data records in a stream –  Number of shards in a stream specified when the stream is

created •  Can then be increased or decreased as needed but manually

–  Base unit of capacity: up to 1MB/s of written data and 2MB/s of read data

–  Partition key used to group data by shard within a stream –  Used also for service pricing http://amzn.to/2szRTkG –  Data records are stored in shards temporarily (24 hours by

default) Valeria Cardellini - SABD 2017/18

58

Kinesis Data Streams: consuming data •  Kinesis Data Streams is used to capture streaming

data •  An application reads data from a Kinesis stream as

data records, then uses the Kinesis Client Library (KCL) for the processing logic –  KCL takes care of: load-balancing across multiple EC2

instances, responding to instance failures, check-pointing processed records, reacting to re-sharding (that adjusts the number of shards)

Valeria Cardellini - SABD 2017/18

59

Page 31: DSP Frameworks · 2018-05-26 · Apache Storm • Apache Storm – Open-source, real-time, scalable streaming system – Provides an abstraction layer to execute DSP applications

References

•  Overview on DSP frameworks –  Wingerath et al. “Real-time stream processing for Big Data”,

2016. http://bit.ly/2rUhXXG

•  Kulkarni et al., “Twitter Heron: stream processing at scale”, ACM SIGMOD'15. http://bit.ly/2rUXkux

•  Carbone et al., “Apache Flink: Stream and batch processing in a single engine”, Bulletin IEEE Comp. Soc. Tech. Comm. on Data Eng., 2015. http://bit.ly/2sYzoGb

•  Noghabi et al., “Samza: Stateful scalable stream processing at LinkedIn”, VLDB Endowment, 2017. https://bit.ly/2LushvF

•  Akidau et al., “MillWheel: fault-tolerant stream processing at Internet scale”, VLDB'13. http://bit.ly/2rE7Fa3

Valeria Cardellini - SABD 2017/18

60