From Intelligent Transportation in Madrid to Smart Homes in Taipei: An IoT Data Analytics architecture applicable to multiple real world use cases 28/06/22 Adnan Akbar Institute for Communication Systems (ICS) 5G Innovation Centre (5GIC) University of Surrey, UK [email protected] Joint work with: Paula Ta-Shma, IBM Research Michael Factor, IBM Research Guy Hadash, IBM Research Juan Sancho, ATOS

COSMOS Data Analytics Architecture


Page 1: COSMOS Data Analytics Architecture

Monday 1 May 2023

From Intelligent Transportation in Madrid to Smart Homes in Taipei: An IoT Data Analytics architecture applicable to multiple real world use cases

Adnan Akbar
Institute for Communication Systems (ICS)
5G Innovation Centre (5GIC)
University of Surrey, UK
[email protected]

Joint work with:
Paula Ta-Shma, IBM Research
Michael Factor, IBM Research
Guy Hadash, IBM Research
Juan Sancho, ATOS

Page 2: COSMOS Data Analytics Architecture

What is the Internet of Things?

• “Internet of Things is based on the vision of connecting everyday objects to internet to form a cyber-physical system, where every object will be represented by its virtual representation enabling the control of physical world remotely” (F. Mattern and C. Floerkemeier)

• Connecting Everyday Objects

– Physical things containing chips/sensors
– Capture and communicate all types of data

• Virtual Representation

• Control of Physical World

– interact with other devices, computing systems and the external environment, including people


Page 3: COSMOS Data Analytics Architecture

IoT Data Analytics

• More data, more opportunities, but more challenges for analyzing and extracting knowledge from this data


Which is the right set of tools?
Which processing model should be used to analyze this data?
Which analytic methods are available to get more value from this data?

IoT Data

Page 4: COSMOS Data Analytics Architecture

Which processing model to use?


Batch Processing vs Event Processing or Real-time vs Historical

IoT Data

Batch Processing

Event Processing

Complex Event Processing

Machine Learning

Statistical Methods

Hybrid Solutions

Page 5: COSMOS Data Analytics Architecture

Right combination of tools for IoT data?


Plethora of open source projects for storing and processing big data

Swift, Secor, Elasticsearch

Page 6: COSMOS Data Analytics Architecture

Generic IoT Architecture – Data Flow


Ingestion

1. Collect historical time series data
– Collect data from devices
– Aggregate into objects
– Index and/or partition

Secor

IoT

Swift

Page 7: COSMOS Data Analytics Architecture

Generic IoT Architecture – Data Flow


Historical Data Access and Analytics

Secor

Swift

2. Learn patterns in data
– May be time/location dependent
– Generate thresholds, classifiers etc.

Page 8: COSMOS Data Analytics Architecture

Generic IoT Architecture – Data Flow


Real-Time Data Analytics

IoT

Secor

CEP

Swift

3. Apply what was learned on real time data stream
– Take action

Page 9: COSMOS Data Analytics Architecture

Proposed Solution: A Lambda Architecture for IoT

1) Ingestion
2) Historical Data Analytics (Batch Processing)
3) Real-time Data Analytics (Event Processing)


A generic IoT Analytics architecture

IoT

CEP

Secor

Swift

Green Flows: Real time

Purple Flows: Batch
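The three stages listed on this slide can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the COSMOS implementation: an in-memory list stands in for Kafka/Secor/Swift, the batch step is reduced to a mean, and the function and field names (`ingest`, `batch_learn`, `realtime_check`, `speed`) are assumptions made for the example.

```python
from statistics import mean


def ingest(messages, store):
    """Ingestion: append raw device messages to the historical store."""
    store.extend(messages)


def batch_learn(store):
    """Batch flow: derive a threshold from historical data (here simply the mean speed)."""
    return mean(m["speed"] for m in store)


def realtime_check(message, threshold):
    """Real-time flow: apply the learned threshold to a single live event."""
    return "slow" if message["speed"] < threshold else "normal"


store = []  # hypothetical stand-in for the object store
ingest([{"speed": 60.0}, {"speed": 20.0}, {"speed": 40.0}], store)
threshold = batch_learn(store)
status = realtime_check({"speed": 25.0}, threshold)
print(threshold, status)
```

The point of the split is that the expensive learning step runs offline on stored data, while the per-event check stays cheap enough to run on every incoming message.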

Page 10: COSMOS Data Analytics Architecture

Use Case 1: Intelligent Transportation System for Madrid Council
• Problem
– Over 3000 traffic sensors are deployed throughout the city of Madrid
– EMT needs to staff control rooms where employees manually analyze Madrid traffic sensor output. This can be slow and costly.
• Objective
– Improve customer satisfaction and reduce costs by responding more efficiently and quickly to real-time traffic problems
• Approach
– Ingest data from up to 3000 sensors into our architecture, learn patterns from historical data, apply them to real-time data using CEP, and react by alerting drivers, calling emergency vehicles, rerouting buses, modifying traffic lights, etc.


Today Tomorrow

Page 11: COSMOS Data Analytics Architecture

IoT Architecture – Madrid Traffic – Ingestion Flow

Aim: Collect historical time series data for analysis
– Continuously collect data from up to 3000 Madrid council traffic sensors via web service
  • Data includes traffic speeds and intensities, updated every 5 mins
– Push the messages to Kafka
– Use Secor to aggregate multiple messages into a single Swift object
  • According to policy, e.g., every 60 mins
  • Possibly partition the data, e.g., according to date
  • Convert to Parquet format
  • Annotate with metadata, e.g., min/max speed, start/end time
– Index Swift objects according to their metadata using Elasticsearch
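The aggregation and annotation steps of this flow can be illustrated with a small, self-contained sketch. It does not use Kafka, Secor, Swift or Parquet; plain dictionaries stand in for messages and objects, and the field names (`ts`, `sensor`, `speed`), the sample timestamps and the `aggregate` helper are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime


def aggregate(messages, window_minutes=60):
    """Group messages into one object per date and time window, annotated with metadata."""
    groups = defaultdict(list)
    for msg in messages:
        ts = datetime.fromisoformat(msg["ts"])
        window = (ts.hour * 60 + ts.minute) // window_minutes
        groups[(ts.date().isoformat(), window)].append(msg)

    objects = []
    for (date, _window), rows in sorted(groups.items()):
        speeds = [r["speed"] for r in rows]
        stamps = [r["ts"] for r in rows]  # ISO strings of equal format sort chronologically
        objects.append({
            "partition": f"date={date}",  # partition by date
            "rows": rows,
            "metadata": {
                "min_speed": min(speeds), "max_speed": max(speeds),
                "start": min(stamps), "end": max(stamps),
            },
        })
    return objects


# Hypothetical sample messages, one reading every few minutes.
messages = [
    {"ts": "2015-06-01T08:00:00", "sensor": "s1", "speed": 42.0},
    {"ts": "2015-06-01T08:05:00", "sensor": "s1", "speed": 35.0},
    {"ts": "2015-06-01T09:10:00", "sensor": "s1", "speed": 58.0},
]
objects = aggregate(messages)
for obj in objects:
    print(obj["partition"], obj["metadata"])
```

Keeping the min/max and start/end values next to each object is what later allows a query to skip objects that cannot contain matching rows.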

Secor

Swift

IoT


Page 12: COSMOS Data Analytics Architecture

IoT Architecture – Madrid Traffic – Data Access

Aim: Access data efficiently and cost effectively
– Store IoT data in OpenStack Swift object storage
  • Open source, low cost deployment, and highly scalable
– Parquet data is accessible via Spark SQL
– Optimized predicate pushdown
  • Custom Spark SQL external data source driver
  • Uses object metadata indexes
  • Searches for Swift objects whose min/max values overlap requested ranges

Get all data for morning traffic:
SELECT codigo, intensidad, velocidad FROM madridtraffic WHERE tf >= '08:00:00' AND tf <= '12:00:00'

Brute force method: 13245 Swift requests
Optimized predicate pushdown: 616 Swift requests
21.5 times improvement
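The idea behind the metadata-based pushdown can be sketched as a range-overlap test: an object needs to be fetched only if its [min, max] interval intersects the requested interval. The sketch below is a plain-Python illustration, not the Spark SQL driver described on the slide; the index contents, object names and the `min_tf`/`max_tf` keys are hypothetical.

```python
def overlaps(meta, lo, hi):
    """True if the object's [min_tf, max_tf] range intersects the requested [lo, hi]."""
    return meta["min_tf"] <= hi and meta["max_tf"] >= lo


def select_objects(index, lo, hi):
    """Return only the object names that can contain rows in the requested range."""
    return [name for name, meta in index.items() if overlaps(meta, lo, hi)]


# Hypothetical metadata index: one entry per stored object.
# 'HH:MM:SS' strings compare correctly as plain strings.
index = {
    "obj-0000": {"min_tf": "00:00:00", "max_tf": "05:59:59"},
    "obj-0600": {"min_tf": "06:00:00", "max_tf": "11:59:59"},
    "obj-1200": {"min_tf": "12:00:00", "max_tf": "17:59:59"},
    "obj-1800": {"min_tf": "18:00:00", "max_tf": "23:59:59"},
}
needed = select_objects(index, "08:00:00", "12:00:00")
print(needed)

# The improvement quoted on the slide is the ratio of the two request counts.
speedup = 13245 / 616
print(round(speedup, 1))
```

With this toy index, only two of the four objects are requested for the morning query; the ratio 13245 / 616 rounds to the 21.5 quoted above.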

Swift


Page 13: COSMOS Data Analytics Architecture

IoT Architecture – Madrid Traffic – Machine Learning

Aim: Learn to differentiate between ‘good’ and ‘bad’ traffic
– Depends on context
  • Time (morning/evening), Day (weekday/weekend)
  • Location
– Use Spark MLlib k-means clustering
– Produce threshold values for real-time decision making
– Re-run algorithm when quality of clusters decreases
  • Can use silhouette index to measure quality

Swift
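As a rough illustration of the re-run criterion, the silhouette index can be computed directly from its definition. The sketch below works on 1-D values in plain Python rather than Spark; the sample speeds and the re-training cut-off of 0.5 are assumptions chosen for the example, not values from the project.

```python
def silhouette(points, labels):
    """Mean silhouette coefficient for 1-D points with cluster labels."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:  # singleton cluster: score is defined as 0
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(abs(p - q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(abs(p - q) for q in members) / len(members)
                for other, members in clusters.items() if other != lab)
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


def needs_rerun(score, cutoff=0.5):
    """Hypothetical policy: re-run clustering when quality drops below the cut-off."""
    return score < cutoff


speeds = [10.0, 12.0, 11.0, 50.0, 52.0, 51.0]
good = silhouette(speeds, [0, 0, 0, 1, 1, 1])  # labels match the two speed groups
bad = silhouette(speeds, [0, 1, 0, 1, 0, 1])   # labels ignore the structure
print(good, bad, needs_rerun(bad))
```

A score near 1 means points sit close to their own cluster and far from the other; a score near or below 0 suggests the stored clusters no longer describe the data.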


Page 14: COSMOS Data Analytics Architecture

IoT Architecture – Madrid Traffic – Machine Learning

Event Detection:

• Use Spark MLlib k-means clustering to separate data into 2 clusters

• Find the midpoint between the 2 cluster centres

• Use this midpoint to generate the thresholds

• Repeat for each context e.g. time period (morning, afternoon, evening, night)
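The event-detection steps above can be sketched as follows. To keep the example self-contained, a small 1-D Lloyd's iteration in plain Python stands in for Spark MLlib k-means; the speed values, the context names and the helper names (`two_means_1d`, `threshold_for`) are assumptions for illustration.

```python
def two_means_1d(values, iterations=20):
    """Lloyd's algorithm for k=2 on 1-D data, seeded with the min and max values."""
    lo, hi = min(values), max(values)
    for _ in range(iterations):
        low = [v for v in values if abs(v - lo) <= abs(v - hi)]
        high = [v for v in values if abs(v - lo) > abs(v - hi)]
        if not high:  # all values identical: nothing to split
            break
        new_lo, new_hi = sum(low) / len(low), sum(high) / len(high)
        if (new_lo, new_hi) == (lo, hi):  # centres stopped moving
            break
        lo, hi = new_lo, new_hi
    return lo, hi


def threshold_for(values):
    """Midpoint between the two cluster centres."""
    lo, hi = two_means_1d(values)
    return (lo + hi) / 2


# Hypothetical historical speeds (km/h) per context.
history = {
    "morning": [12.0, 15.0, 18.0, 55.0, 60.0, 65.0],
    "night": [70.0, 72.0, 74.0, 90.0, 92.0, 94.0],
}
thresholds = {context: threshold_for(speeds) for context, speeds in history.items()}
print(thresholds)
```

For this toy data the morning threshold comes out at 37.5 and the night threshold at 82.0, which shows why a single city-wide threshold would misclassify traffic in one of the two contexts.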

Anomaly Detection:

• Use a single cluster and define an anomaly to be further than a certain distance from the cluster centre
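A minimal sketch of the single-cluster variant follows. The slide only says "a certain distance"; expressing that distance as `k` standard deviations from the centre is an assumption made here, as are the sample values and function names.

```python
from statistics import mean, pstdev


def fit_single_cluster(values, k=3.0):
    """Return the cluster centre and an anomaly radius of k standard deviations (assumed rule)."""
    centre = mean(values)
    radius = k * pstdev(values)
    return centre, radius


def is_anomaly(value, centre, radius):
    """A reading is anomalous if it lies further than `radius` from the centre."""
    return abs(value - centre) > radius


normal_speeds = [48.0, 50.0, 52.0, 49.0, 51.0]  # hypothetical historical readings
centre, radius = fit_single_cluster(normal_speeds)
print(centre, radius, is_anomaly(51.0, centre, radius), is_anomaly(20.0, centre, radius))
```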

Morning Traffic on Weekdays


Page 15: COSMOS Data Analytics Architecture

IoT Architecture – Madrid Traffic – Real Time Decision Making

Aim: Respond in real time to traffic conditions
– Use Complex Event Processing (CEP) approach
  • Rule based
  • Process events record by record
  • CEP rules are typically defined manually but in many cases it is difficult to get them right
– We automate this process and make it smart

CEP
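To make the rule-based, record-by-record style concrete, here is a small sketch of one rule whose threshold is supplied by the batch analytics rather than written by hand. It is not tied to any particular CEP engine; the `CongestionRule` class, the event fields and the "three consecutive readings" condition are assumptions for illustration.

```python
from collections import deque


class CongestionRule:
    """Fires when the last `window` speed readings of a sensor are all below the threshold."""

    def __init__(self, thresholds, window=3):
        self.thresholds = thresholds  # per-context thresholds learned offline
        self.window = window
        self.recent = {}              # sensor id -> most recent speeds

    def on_event(self, event):
        """Process one event; return an alert dict or None."""
        buf = self.recent.setdefault(event["sensor"], deque(maxlen=self.window))
        buf.append(event["speed"])
        limit = self.thresholds[event["context"]]
        if len(buf) == self.window and all(s < limit for s in buf):
            return {"alert": "congestion", "sensor": event["sensor"]}
        return None


rule = CongestionRule({"morning": 37.5})  # hypothetical learned threshold
stream = [50.0, 30.0, 28.0, 25.0]
alerts = [
    rule.on_event({"sensor": "s1", "context": "morning", "speed": s}) for s in stream
]
print(alerts)
```

Updating the rule after a new batch run only means replacing the entries in `thresholds`; the rule logic itself stays unchanged.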

IoT

Prediction

Proactive approach:

• Use Spark streaming linear regression to predict traffic behavior (e.g. speed, intensity) for near future

• Apply CEP on predicted data

• Respond pro-actively to predicted events such as traffic congestion

– e.g. EMT can proactively re-route buses
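The proactive approach can be sketched by predicting a near-future value and then applying the same threshold check to the prediction. In this sketch a least-squares line fitted over a short window of recent readings stands in for Spark streaming linear regression, which is a deliberate simplification; the readings, the look-ahead of two steps and the threshold are hypothetical.

```python
def fit_line(ys):
    """Ordinary least squares of y against t = 0..n-1; needs at least two readings."""
    n = len(ys)
    t_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(ys))
    slope = sxy / sxx
    return slope, y_mean - slope * t_mean


def predict_ahead(ys, steps=1):
    """Extrapolate the fitted line `steps` readings past the last observation."""
    slope, intercept = fit_line(ys)
    return intercept + slope * (len(ys) - 1 + steps)


recent_speeds = [60.0, 55.0, 50.0, 45.0]  # hypothetical readings, one every 5 mins
predicted = predict_ahead(recent_speeds, steps=2)
threshold = 37.5                          # hypothetical learned threshold
proactive_alert = predicted < threshold
print(predicted, proactive_alert)
```

Here the current reading (45.0) is still above the threshold, but the value extrapolated two steps ahead (35.0) is not, so a rule applied to predicted data would fire before the congestion is actually observed.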


Page 16: COSMOS Data Analytics Architecture

Use Case 2: Taipei Smart Homes


Smart plugs

Home Gateway

Real-time monitoring, control, and report of home appliances energy usage

• The Taipei test scenario comprised 50 volunteer households

• Each was equipped with a Smart Energy kit (incl. home gateway, smart plugs, and smart strips)

• Real-time Energy usage

Goal: Real time Monitoring of Appliances in order to detect anomalies

Page 17: COSMOS Data Analytics Architecture

Taipei Smart Homes

• Examples of anomalies
– Short circuit of a device
– Devices being operated at unusual times
– An anomaly at night might not be an anomaly in the daytime
• The same architecture is used for monitoring energy data
– The only difference lies in the type of analytics and rules
• Historical data analytics
– Learn normal patterns from historical data
– Use CEP rules to detect deviation from normal
• Different models for different contexts
– Time of day (morning, afternoon, evening, night)
– Weekday or weekend
– Winter or summer
– Rainy or sunny
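The per-context idea can be sketched as one simple model per context, with each new reading checked against the model for its own context. This sketch covers only time of day and weekday/weekend (season and weather are omitted); the hour boundaries, the `watts` field, the 3-standard-deviation rule and the 1 W floor on the spread are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean, pstdev


def context_of(hour, weekday):
    """Map a reading to (period, day type); weekday uses Monday=0 .. Sunday=6."""
    if 6 <= hour < 12:
        period = "morning"
    elif 12 <= hour < 18:
        period = "afternoon"
    elif 18 <= hour < 23:
        period = "evening"
    else:
        period = "night"
    return period, "weekend" if weekday >= 5 else "weekday"


def learn_profiles(readings):
    """Learn (mean, standard deviation) of power draw for each context."""
    by_context = defaultdict(list)
    for r in readings:
        by_context[context_of(r["hour"], r["weekday"])].append(r["watts"])
    return {ctx: (mean(v), pstdev(v)) for ctx, v in by_context.items()}


def is_anomalous(reading, profiles, k=3.0):
    """Flag a reading that deviates more than k spreads from its context's normal."""
    centre, spread = profiles[context_of(reading["hour"], reading["weekday"])]
    return abs(reading["watts"] - centre) > k * max(spread, 1.0)


# Hypothetical history for one appliance on weekdays.
history = (
    [{"hour": 2, "weekday": 1, "watts": w} for w in (5.0, 6.0, 4.0, 5.0)]
    + [{"hour": 20, "weekday": 1, "watts": w} for w in (120.0, 130.0, 110.0, 125.0)]
)
profiles = learn_profiles(history)
at_night = is_anomalous({"hour": 2, "weekday": 2, "watts": 120.0}, profiles)
in_evening = is_anomalous({"hour": 20, "weekday": 2, "watts": 120.0}, profiles)
print(at_night, in_evening)
```

The same 120 W reading is flagged at 02:00 but accepted at 20:00, which is the behaviour the slide describes for anomalies that depend on the time of day.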


Page 18: COSMOS Data Analytics Architecture

Real-Time Anomaly detection using COSMOS Data Analytics Architecture

[Figure: real-time anomaly detection with the COSMOS Data Analytics architecture. Labels: PC/monitor, refrigerator, fan/lighting, sensor, istrip, Node-Red, Secor, Swift, CEP, COSMOS Data Analytics, real-time warning messages]

Page 19: COSMOS Data Analytics Architecture

Our Architecture Applies to Many IoT Use cases

• Healthcare
– Healthcare patient monitoring/alert/response
• Logistics
– Monitoring of sensitive goods
• Social Media
– Event detection if a high number of posts is detected compared to normal behavior
• Insurance
– Driver behavior and location monitoring
• Transportation
– Connected vehicles, engine diagnostics, automated service scheduling


Page 20: COSMOS Data Analytics Architecture

COSMOS

Funding: EU FP7 at level of 2PY x 3 years
Started: Sept 2013
Coordinator: ATOS
Technical partners: University of Surrey, IBM, NTUA, Siemens, ATOS
Use Case Partners: Hildebrand/Camden, EMT Madrid Bus Transport/Madrid Council, III Taiwan – Smart Cities use cases
Project Vision: Enable ‘things’ to interact with each other based on shared experience, trust, reputation etc.


Page 21: COSMOS Data Analytics Architecture

Thank you. Any questions?


For more details, Email: [email protected]