27
AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices IEEE International Conference on Big Data 2015 (BigData 2015) Oct 29 – Nov 01, 2015 @ Santa Clara, CA, USA Demetris Trihinas, George Pallis, Marios D. Dikaiakos University of Cyprus {trihinas, gpallis, mdd}@cs.ucy.ac.cy

AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices

Embed Size (px)

Citation preview

Page 1: AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices

AdaM: an Adaptive Monitoring Framework for Sampling and Filtering

on IoT Devices

IEEE International Conference on Big Data 2015 (BigData 2015)

Oct 29 – Nov 01, 2015 @ Santa Clara, CA, USA

Demetris Trihinas, George Pallis, Marios D. DikaiakosUniversity of Cyprus

{trihinas, gpallis, mdd}@cs.ucy.ac.cy

Page 2: AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices

Demetris Trihinas 2

IoT was initially devices sensing and exchanging data streams with humans or other network-enabled devices

IEEE BIGDATA CONFERENCE 2015

Page 3: AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices

Edge-Mining

Demetris Trihinas 3

A term coined to reflect data processing and decision-making on

“smart” devices that sit at the edge of IoT networks

Buy more detergent

…our devices just got a little bit more “smarter”…

IEEE BIGDATA CONFERENCE 2015

Page 4: AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices

The “Big Data” in IoT

Demetris Trihinas 4

• Taming data volume and data velocity with

limited processing and network capabilities

• IoT devices are usually battery-powered which means intense

processing leads to increased energy consumption (less battery-life)

Zhang et al., Usenix HotCloud, 2015

Challenges

[IDC, Big Data in IoT, 2014]

IEEE BIGDATA CONFERENCE 2015

Page 5: AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices

Adaptive Sampling and Filtering

Demetris Trihinas 5IEEE BIGDATA CONFERENCE 2015

Page 6: AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices

Metric Stream

Demetris Trihinas 6

• A metric stream 𝑀 = {𝑠𝑖}𝑖=0𝑛 is a large sequence of collected samples,

denoted as 𝑠𝑖 where 𝑖 = 0, 1, … , 𝑛 and 𝑛 → ∞

• Each sample 𝑠𝑖 is a tuple 𝑡𝑖 , 𝑣𝑖 described by a timestamp 𝑡𝑖 and a

value 𝑣𝑖

𝑠𝑖

𝑠𝑖+1

Metric Stream M

IEEE BIGDATA CONFERENCE 2015

Page 7: AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices

Periodic Sampling

Demetris Trihinas 7

• The process of triggering the collection mechanism of a monitored source every

𝑇 time units such that the 𝑖𝑡ℎ sample is collected at time 𝑡𝑖 = 𝑖 ∙ 𝑇

𝑠𝑖

𝑠𝑖+1

Metric Stream M sampled every T = 1s

Compute resources and energy are wasted while generating large data volumes at a high velocity

Metric Stream M sampled every T = 10s

Sudden events and significant insights are missed

IEEE BIGDATA CONFERENCE 2015

Page 8: AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices

Adaptive Sampling

Demetris Trihinas 8

• Dynamically adjust the sampling period 𝑇𝑖 based on some function,

containing information of the metric stream evolution

𝑠𝑖𝑠𝑖+1

𝑇𝑖+1

Metric Stream 𝑀′ dynamically sampledMetric Stream 𝑀 with 𝑇 = 1𝑠

IEEE BIGDATA CONFERENCE 2015

• Adapting the sampling rate reduces unnecessary processing accompanied

by computing a new sample and consequently preserves energy!

Page 9: AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices

Metric Filtering

Demetris Trihinas 9

• “Cleansing” the metric stream by not transmitting over the network

consecutive collected samples of “little” difference

• Fixed Range Metric Filters

Metric Stream M with fixed filter Range 𝑅 = 1%

if (curValue ∈ [prevValue – R, prevValue + R ])

filter(curValue)

Although stable phases exist, only 2 metric values are filtered

Can we know this R beforehand?

Should it keep the same value forever?

Is it the same for each and every metric?

IEEE BIGDATA CONFERENCE 2015

𝑅𝑠𝑖+1

𝐹𝑖𝑙𝑡𝑒𝑟𝑒𝑑!𝑠𝑖

x x

Page 10: AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices

Adaptive Filtering

Demetris Trihinas 10

• Dynamically adjusting the filter range 𝑅 based on some function containing

information of the current variability of the metric stream

Metric Stream 𝑀′ with adaptive RangeMetric Stream 𝑀

𝑠𝑖

𝑠𝑖+1

𝑅𝑖+1

𝑅𝑖

IEEE BIGDATA CONFERENCE 2015

• Adapting the filter range to the variability of the metric streams allows

for data volume and network traffic to be reduced

Page 11: AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices

The Adaptive Monitoring FrameworkAdaM

Demetris Trihinas 11

SensingUnit

ProcessingUnit

Communication

Unit

AdaptiveSampling

AdaptiveFiltering

Knowledge Base

AdaM

IoT Device

𝑠𝑖

𝑇𝑖+1

𝑅𝑖+1𝑠𝑖

• Software library with no external dependencies embeddable on IoT devices

• Reduces processing, network traffic and energy consumption by adapting the

monitoring intensity based on the metric stream evolution and variability

• Achieves a balance between efficiency and accuracyIt’s open-source! Give it a try!

https://github.com/dtrihinas/AdaM

Raspberry Pi

Arduino

Beacons

IEEE BIGDATA CONFERENCE 2015

Page 12: AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices

Adaptive Sampling Algorithm• Step 1: Compute the distance 𝛿𝑖 between the current two consecutive

sample values

• Step 2: Compute metric stream evolution based on a Probabilistic

Exponential Weighted Moving Average (PEWMA) to estimate 𝛿𝑖+1

and the standard deviation 𝜎𝑖+1:

𝛿𝑖+1 = 𝜇𝑖 = 𝑎 ∙ 𝜇𝑖−1 + 1 − 𝑎 𝛿𝑖

Demetris Trihinas 12

Looks like an exponential moving average, right?

But weighting is probabilistically applied!

𝑎 = 𝛼 1 − 𝛽P𝑖

IEEE BIGDATA CONFERENCE 2015

Page 13: AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices

Why use a Probabilistic EWMA?

Simple EWMA is volatile to abrupt transient changes

• Slow to acknowledge spike after “stable” periods

• If “stable” phases follow sudden spikes, then subsequent

values are overestimated

Demetris Trihinas 13

Sounds a lot like a job for

a Gaussian distribution!

IEEE BIGDATA CONFERENCE 2015

Page 14: AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices

Adaptive Sampling Algorithm• Step 3: Compute actual standard deviation 𝜎𝑖

• Step 4: Compute the current confidence (𝑐𝑖 ≤ 1) of our approach

based on the actual and estimated standard deviation

𝑐𝑖 = 1 −| 𝜎𝑖 − 𝜎𝑖|

𝜎𝑖

• Step 5: Compute the sampling period 𝑇𝑖+1 based on the current

confidence and the user-defined imprecision γ

Demetris Trihinas 14

… the more “confident” the algorithm is, the larger the

outputted sampling period 𝑇𝑖+1 can be…

O(1) complexity as all steps only use their previous values!

IEEE BIGDATA CONFERENCE 2015

Page 15: AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices

Adaptive Filtering Algorithm

• Step 1: Compute the dispersion of the stream based on average 𝜇𝑖

and standard deviation 𝜎𝑖 already computed from adaptive sampling

𝐹𝑖 =𝜎𝑖

2

𝜇𝑖

• Step 2: Compute the new filter range 𝑅𝑖+1 based on 𝐹𝑖 and the user-

defined imprecision 𝛾

Demetris Trihinas 15

…a low 𝐹𝑖 indicates a non dispersed data stream which

means a low variability in the metric stream…

O(1) complexity as all steps only use their previous values!

IEEE BIGDATA CONFERENCE 2015

𝑠𝑖

𝑠𝑖+1𝑅𝑖+1

𝑠𝑖+2

𝑠𝑖+3

𝐹𝑖𝑙𝑡𝑒𝑟𝑒𝑑!𝑅𝑖+2𝑅𝑖

𝐹 < 𝛾

Page 16: AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices

Evaluation

Demetris Trihinas 16IEEE BIGDATA CONFERENCE 2015

Page 17: AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices

AdaM vs State-of-the-Art Adaptive Techniques

• i-EWMA: an EWMA adapting sampling period by 1 time unit when the

estimated error (ε) is under/over user-defined imprecision

• L-SIP[1]: a linear algorithm using a double EWMA to produce estimates of

current data distribution based on the rate sample values change

=> Slow to react to highly transient and abrupt fluctuations in the metric stream

• FAST[2]: an (aggressive) framework using a PID controller to compute (large)

sampling periods accompanied by a Kalman Filter to predict values at non

sampling points

Demetris Trihinas 17

[1] Gaura et al., IEEE Sensors Journal, 2013 [2] Fan et al., ACM CIKM, 2012

IEEE BIGDATA CONFERENCE 2015

Page 18: AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices

Datasets

Demetris Trihinas 18

Memory TraceJava Sorting Benchmark

CPU TraceCarnegie Mellon RainMon Project

Disk I/O TraceCarnegie Mellon RainMon Project

TCP Port Monitoring TraceCyber Defence SANS Tech Institute

Fitbit TraceCoursera Data Science Course

IEEE BIGDATA CONFERENCE 2015

Page 19: AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices

Our Small “Big Data” Testbeds

Demetris Trihinas 19

Raspberry Pi 1st Gen. Model B512MB RAM, Single Core 700MHz

Daemon on OS emulates traces while feeding samples to each algorithm

Android Emulator + SensorSimulator512MB RAM, Single Core

SensorSimulator script emulates traces by feeding samples to Android emulator for each

algorithm

Daemon

IEEE BIGDATA CONFERENCE 2015

Page 20: AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices

Evaluation Metrics• Estimation Accuracy – Mean Absolute Percentage Error (MAPE)

• CPU Cycles

• Outgoing Network Traffic

• Edge Device Energy Consumption*

Demetris Trihinas 20

*Xiao et al., CPSCom, 2010

perf

nethogs

powerstat

Other than error evaluation no other system goes through an overhead study!

Page 21: AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices

Error Comparison

Demetris Trihinas 21

Memory Trace CPU Trace Disk I/O Trace

TCP Port Monitoring Trace Fitbit Trace

For all traces AdaM has the smallest inaccuracy (<10%)

Even in a more aggressive configuration (λ=2) AdaM is still comparable to L-SIP

AdaM AdaM AdaM

AdaM AdaM

User-defined accepted imprecision set to γ = 0.1

IEEE BIGDATA CONFERENCE 2015

Page 22: AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices

So How Well Does AdaM Perform?

Demetris Trihinas 22

Memory TraceJava Sorting Benchmark

CPU TraceCarnegie Mellon RainMon Project

Disk I/O TraceCarnegie Mellon RainMon Project

TCP Port Monitoring TraceCyber Defence SANS Tech Institute

Fitbit TraceCoursera Data Science Course

IEEE BIGDATA CONFERENCE 2015

Page 23: AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices

Overhead Comparison (1)

Demetris Trihinas 23

Network Traffic

Energy Consumption

CPU Cycles

• Data volume reduced by 74%

• Energy consumption by 71%

• Accuracy is always above 89% and 83%

with filtering enabled

IEEE BIGDATA CONFERENCE 2015

Page 24: AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices

Overhead Comparison (2)

Demetris Trihinas 24

Network Traffic Energy Consumption

Error (MAPE)

FAST’s aggressiveness results in slightly lower

energy consumption and network traffic

But, this does not come for free…

FAST features a large inaccuracy

AdaM achieves a balance between

efficiency and accuracy

IEEE BIGDATA CONFERENCE 2015

Page 25: AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices

Scalability Evaluation• Integrated AdaM to a data streaming

system

• JCatascopia Cloud Monitoring*

• JCatascopia Agents (data sources) use AdaM

to adapt monitoring intensity

• Archiving time is measured at JCatascopia

Server to evaluate data velocity

• Compare AdaM over Periodic

Sampling

Demetris Trihinas 25

* [Trihinas et al, CCGrid 2014] https://github.com/dtrihinas/JCatascopia

DB

AdaMAgent

Agent

Agent

.

.

.

JCatascopia

AdaM

AdaM

metrics

Data velocity is reduced to a linear growth

IEEE BIGDATA CONFERENCE 2015

Page 26: AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices

Conclusion and Future Work

Demetris Trihinas 26

• Data volume and velocity of IoT-generated data

keeps increasing

• AdaM reduces processing, network traffic and

energy consumption of IoT device by adapting

monitoring intensity

• AdaM achieves a balance between efficiency and

accuracy

• AdaM v2.0 support data streaming engines such

as Apache Spark and Flink

IEEE BIGDATA CONFERENCE 2015

It’s open-source! Give it a try! https://github.com/dtrihinas/AdaM

Page 27: AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices

Demetris Trihinas 27

Laboratory for Internet ComputingDepartment of Computer Science

University of Cyprus

http://linc.ucy.ac.cy

IEEE BIGDATA CONFERENCE 2015