37
2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference Karlsruhe , October 11.-12., 2017 Presenting at Efficiently Handling Streams from Millions of Sensors Jonas Traub – TU Berlin / DFKI 1

Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

Embed Size (px)

Citation preview

Page 1: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference Karlsruhe , October 11.-12., 2017

Presenting at

Efficiently Handling Streams from Millions of Sensors

Jonas Traub – TU Berlin / DFKI

1

Page 2: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

The Growth of the Internet of Things

Gartner says 6.4 billion connected

"Things" will be in use in 2016 and

more than 20 billion in 2020.

Year

# Devices (in billions)

Jonas Traub – TU Berlin / DFKI – Efficiently Handling Streams from Millions of Sensors 2

Page 3: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

Goal

Provide real-time insights based on IoT data.

Jonas Traub – TU Berlin / DFKI – Efficiently Handling Streams from Millions of Sensors 3

Page 4: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

Problem

• Billions of devices provide real-time data

• Result: Vast amount of data streams

Heavy Network Utilization Scalability Challenges Increasing Latencies

Financial Costs

Jonas Traub – TU Berlin / DFKI – Efficiently Handling Streams from Millions of Sensors 4

Page 5: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

Solution

Produce and process data streams

based on the data demand of applications.

Jonas Traub – TU Berlin / DFKI – Efficiently Handling Streams from Millions of Sensors 5

Page 6: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

State of the Art Approach

Data Stream Production with Periodic Sampling

Major Challenges: • Oversampling • Missing Adaptivity

Jonas Traub – TU Berlin / DFKI – Efficiently Handling Streams from Millions of Sensors 6

Page 7: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

Solution

On-Demand Data Streaming from Sensor Nodes

Optimized On-Demand Data Streaming from Sensor Nodes

Jonas Traub – TU Berlin / DFKI – Efficently Handling Streams from Millions of Sensors 7

Page 8: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

State of the Art Approach

Provide all Data to Front-End Applications

Optimized On-Demand Data Streaming from Sensor Nodes

Major Challenge: • Front End Overload

Jonas Traub – TU Berlin / DFKI – Efficently Handling Streams from Millions of Sensors 8

Page 9: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

Solution

Adaptive Data Reduction with Streaming Engines

Optimized On-Demand Data Streaming from Sensor Nodes

I²: Interactive Real-Time Visualization for Streaming Data

Jonas Traub – TU Berlin / DFKI – Efficiently Handling Streams from Millions of Sensors 9

Page 10: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

Solution

Adaptive Data Reduction with Streaming Engines

Optimized On-Demand Data Streaming from Sensor Nodes

I²: Interactive Real-Time Visualization for Streaming Data

Jonas Traub – TU Berlin / DFKI – Efficiently Handling Streams from Millions of Sensors 10

Page 11: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

Solution

Efficient Processing of user-defined Windows

Optimized On-Demand Data Streaming from Sensor Nodes

I²: Interactive Real-Time Visualization for Streaming Data

Cutty: Aggregate Sharing for User-Defined Windows

Jonas Traub – TU Berlin / DFKI – Efficiently Handling Streams from Millions of Sensors 11

Page 12: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

Publications

Jonas Traub – TU Berlin / DFKI – Efficiently Handling Streams from Millions of Sensors 12

Page 13: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

Optimized On-Demand Data Streaming from Sensor Nodes

Jonas Traub, Sebastian Breß, Asterios Katsifodimos, Tilmann Rabl, Volker Markl

Santa Clara, California, September 25-27, 2017

13

Page 14: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

Architecture Overview

14 Jonas Traub et al. – Optimized On-Demand Data Streaming from Sensor Nodes – ACM SoCC 2017

Page 15: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

Architecture Overview

14 Jonas Traub et al. – Optimized On-Demand Data Streaming from Sensor Nodes – ACM SoCC 2017

Page 16: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

Architecture Overview

14 Jonas Traub et al. – Optimized On-Demand Data Streaming from Sensor Nodes – ACM SoCC 2017

Page 17: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

Architecture Overview

14 Jonas Traub et al. – Optimized On-Demand Data Streaming from Sensor Nodes – ACM SoCC 2017

Page 18: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

Architecture Overview

14 Jonas Traub et al. – Optimized On-Demand Data Streaming from Sensor Nodes – ACM SoCC 2017

Page 19: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

User-Defined Sampling Functions

19

• Provide an abstraction to define the data demand of applications.

• Upon a sensor read, request the next sensor read.

Jonas Traub et al. – Optimized On-Demand Data Streaming from Sensor Nodes – ACM SoCC 2017

Page 20: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

User-Defined Sampling Functions

20

• Provide an abstraction to define the data demand of applications.

• Upon a sensor read, request the next sensor read. • Make read time tolerances explicit.

Jonas Traub et al. – Optimized On-Demand Data Streaming from Sensor Nodes – ACM SoCC 2017

Page 21: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

User-Defined Sampling Functions

21

Enable adaptive sampling techniques to reduce data transmission

e.g., Adam [Trihinas ‘15], FAST [Fan ‘14], L-SIP [Gaura ’13]

Jonas Traub et al. – Optimized On-Demand Data Streaming from Sensor Nodes – ACM SoCC 2017

Page 22: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

Sensor Read Fusion

22 Jonas Traub et al. – Optimized On-Demand Data Streaming from Sensor Nodes – ACM SoCC 2017

Page 23: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

Sensor Read Fusion

23

1) Minimize Sensor Reads and Data Transfer:

Latest possible read time

2) Optimize Sensor Read Times:

● Check the paper for all details on the read time optimizer!

Jonas Traub et al. – Optimized On-Demand Data Streaming from Sensor Nodes – ACM SoCC 2017

Page 24: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

24 Jonas Traub et al. – Optimized On-Demand Data Streaming from Sensor Nodes – ACM SoCC 2017

Page 25: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

Local Filtering

25 Jonas Traub et al. – Optimized On-Demand Data Streaming from Sensor Nodes – ACM SoCC 2017

Page 26: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

Optimized On-Demand Data Streaming from Sensor Nodes

Wrap-Up:

Tailor Data Streams to the Demand of Applications

• Define data demand: User-Defined Sampling Functions • Schedule sensor reads and data transfer on-demand • Optimize read times globally - for all users and queries

Jonas Traub, Sebastian Breß, Asterios Katsifodimos, Tilmann Rabl, Volker Markl

26

Page 27: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

Cutty: Aggregate Sharing for User-Defined Windows

Paris Cabone, Jonas Traub, Asterios Katsifodimos, Seif Haridi, Volker Markl

27

Page 28: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

Streaming Window Aggragation

Paris Carbone et al. – Cutty: Aggregate Sharing for User-Defined Windows – CIKM 2017 28

Page 29: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

Stream Slicing

Paris Carbone et al. – Cutty: Aggregate Sharing for User-Defined Windows – CIKM 2017 29

Page 30: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

Applicability of Stream Slicing

Paris Carbone et al. – Cutty: Aggregate Sharing for User-Defined Windows – CIKM 2017 30

Page 31: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

Yes, we can do better!

Paris Carbone et al. – Cutty: Aggregate Sharing for User-Defined Windows – CIKM 2017 31

Page 32: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

Cutty Overview

Paris Carbone et al. – Cutty: Aggregate Sharing for User-Defined Windows – CIKM 2017 32

Page 33: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

Cutty: Aggregate Sharing for User-Defined Windows

Wrap-Up:

Enable Stream Slicing beyond Simple Tumbling and Sliding Windows

• Cutty enables Stream Slicing for a broad class of windows • Cutty combines Stream Slicing, On-the-fly Aggregation,

Aggregate Sharing, and Aggregate Trees

Paris Cabone, Jonas Traub, Asterios Katsifodimos, Seif Haridi, Volker Markl

33

Page 34: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

I²: Interactive Real-Time Visualization for Streaming Data

Jonas Traub, Nikolaas Steenbergen, Philipp Grulich, Tilmann Rabl, Volker Markl

34

Page 35: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

Architecture Overview

Jonas Traub et al. – I²: Interactive Real-Time Visualization for Streaming Data – EDBT 2017 35

Page 37: Efficiently Handling Streams from Millions of Sesors (@KIT - 2nd BMBF Big Data All Hands Meeting and 2nd Smart Data Innovation Conference)

The Big Picture

Optimized On-Demand Data Streaming from Sensor Nodes

Traub et al.; ACM SoCC’17

I²: Interactive Real-Time Visualization for Streaming Data

Traub et al.; EDBT’17

Cutty: Aggregate Sharing for User-Defined Windows Carbone et al.; CIKM’16

Jonas Traub – TU Berlin / DFKI – Efficiently Handling Streams from Millions of Sensors