Monday 1 May 2023
From Intelligent Transportation in Madrid to Smart Homes in Taipei: An IoT Data Analytics architecture
applicable to multiple real world use cases
Adnan Akbar
Institute for Communication Systems (ICS)
5G Innovation Centre (5GIC)
University of Surrey, UK
Joint work with:
Paula Ta-Shma, IBM Research
Michael Factor, IBM Research
Guy Hadash, IBM Research
Juan Sancho, ATOS
What is the Internet of Things?
• “Internet of Things is based on the vision of connecting everyday objects to internet to form a cyber-physical system, where every object will be represented by its virtual representation enabling the control of physical world remotely” (F. Mattern and C. Floerkemeier)
• Connecting Everyday Objects
– Physical things containing chips/sensors
– Capture and communicate all types of data
• Virtual Representation
• Control of Physical World
– interact with other devices, computing systems and the external environment, including people
IoT Data Analytics
• More data brings more opportunities, but also more challenges in analyzing it and extracting knowledge from it
• Which is the right set of tools?
• Which processing model should be used to analyze this data?
• Which analytic methods are available to get more value from this data?
IoT Data
Which processing model to use?
Batch Processing vs Event Processing or Real-time vs Historical
IoT Data
Batch Processing
Event Processing
Complex Event Processing
Machine Learning
Statistical Methods
Hybrid Solutions
Right combination of tools for IoT data?
A plethora of open source projects for storing and processing big data
[Logos shown: Swift, Secor, Elasticsearch]
Generic IoT Architecture – Data Flow
Ingestion
1. Collect historical time series data
– Collect data from devices
– Aggregate into objects
– Index and/or partition
[Diagram components: IoT, Secor, Swift]
Generic IoT Architecture – Data Flow
Historical Data Access and Analytics
[Diagram components: Secor, Swift]
2. Learn patterns in data
– May be time/location dependent
– Generate thresholds, classifiers etc.
Generic IoT Architecture – Data Flow
Real-Time Data Analytics
[Diagram components: IoT, Secor, CEP, Swift]
3. Apply what was learned on real time data stream
– Take action
Proposed Solution: A Lambda Architecture for IoT
1) Ingestion
2) Historical Data Analytics (Batch Processing)
3) Real-time Data Analytics (Event Processing)
A generic IoT Analytics architecture
[Diagram components: IoT, CEP, Secor, Swift]
Green Flows: Real time
Purple Flows: Batch
Use Case 1: Intelligent Transportation System for Madrid Council
• Problem
  – Over 3000 traffic sensors are deployed throughout the city of Madrid
  – EMT needs to staff control rooms where employees manually analyze Madrid traffic sensor output, which can be slow and costly
• Objective
  – Improve customer satisfaction and reduce costs by responding more efficiently and quickly to real-time traffic problems
• Approach
  – Ingest data from up to 3000 sensors into our architecture, learn patterns from historical data, apply them to real-time data using CEP, and react by alerting drivers, calling emergency vehicles, rerouting buses, modifying traffic lights, etc.
Today Tomorrow
IoT Architecture – Madrid Traffic – Ingestion Flow
Aim: Collect historical time series data for analysis
– Continuously collect data from up to 3000 Madrid council traffic sensors via web service
  • Data includes traffic speeds and intensities, updated every 5 mins
– Push the messages to Kafka
– Use Secor to aggregate multiple messages into a single Swift object
  • According to policy, e.g., every 60 mins
  • Possibly partition the data, e.g., according to date
  • Convert to Parquet format
  • Annotate with metadata, e.g., min/max speed, start/end time
– Index Swift objects according to their metadata using ElasticSearch
[Diagram components: IoT, Secor, Swift]
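The aggregation and annotation steps above can be sketched in a few lines of plain Python. This is only an illustration, not the Secor configuration used in the project: the function name `aggregate_by_hour`, the object naming scheme, the timestamp field `ts`, and the sample readings (including the sensor code) are all made up for the example. The field names `codigo`, `intensidad`, and `velocidad` are taken from the query shown on the data access slide.

```python
from collections import defaultdict
from datetime import datetime


def aggregate_by_hour(messages):
    """Group sensor messages into hourly batches and attach min/max metadata."""
    batches = defaultdict(list)
    for msg in messages:
        ts = datetime.fromisoformat(msg["ts"])
        # Window key: the date plus the hour the reading falls in.
        batches[ts.strftime("%Y-%m-%dT%H")].append(msg)

    objects = []
    for window, records in sorted(batches.items()):
        speeds = [r["velocidad"] for r in records]
        times = [r["ts"] for r in records]  # ISO strings sort chronologically
        objects.append({
            "name": f"madridtraffic/{window}",  # hypothetical object name
            "records": records,
            "metadata": {
                "min_speed": min(speeds),
                "max_speed": max(speeds),
                "start_time": min(times),
                "end_time": max(times),
            },
        })
    return objects


# Made-up sample readings, one every few minutes from a single sensor.
messages = [
    {"codigo": "PM10001", "ts": "2016-03-01T08:00:00", "intensidad": 420, "velocidad": 61},
    {"codigo": "PM10001", "ts": "2016-03-01T08:05:00", "intensidad": 510, "velocidad": 48},
    {"codigo": "PM10001", "ts": "2016-03-01T09:00:00", "intensidad": 630, "velocidad": 35},
]
objects = aggregate_by_hour(messages)
```

The metadata dictionary is what a later indexing step would store, so that queries can skip objects whose ranges cannot match.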
IoT Architecture – Madrid Traffic – Data Access
Aim: Access data efficiently and cost effectively
– Store IoT data in OpenStack Swift object storage
  • Open source, low cost deployment, and highly scalable
– Parquet data is accessible via Spark SQL
– Optimized predicate pushdown
  • Custom Spark SQL external data source driver
  • Uses object metadata indexes
  • Searches for Swift objects whose min/max values overlap requested ranges
Get all data for morning traffic:
SELECT codigo, intensidad, velocidad FROM madridtraffic WHERE tf >= '08:00:00' AND tf <= '12:00:00'
Brute force method: 13245 Swift requests
Optimized predicate pushdown: 616 Swift requests
21.5 times improvement
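The saving comes from consulting the min/max metadata before touching any object. A minimal sketch of that overlap test follows, assuming a hypothetical in-memory index that maps object names to the smallest and largest `tf` value they contain. The actual driver and index are not shown in these slides, so `overlaps`, `select_objects`, and the object names are illustrative only.

```python
def overlaps(obj_min, obj_max, query_min, query_max):
    """True when [obj_min, obj_max] intersects [query_min, query_max]."""
    return obj_min <= query_max and obj_max >= query_min


def select_objects(index, query_min, query_max):
    """Return names of objects whose range may contain matching rows."""
    return [
        name
        for name, (obj_min, obj_max) in index.items()
        if overlaps(obj_min, obj_max, query_min, query_max)
    ]


# Hypothetical metadata index: object name -> (min tf, max tf).
# Zero-padded HH:MM:SS strings compare correctly as plain strings.
index = {
    "obj-00": ("00:00:00", "05:55:00"),
    "obj-06": ("06:00:00", "08:55:00"),
    "obj-09": ("09:00:00", "11:55:00"),
    "obj-12": ("12:00:00", "17:55:00"),
    "obj-18": ("18:00:00", "23:55:00"),
}

# Same range as the morning traffic query above.
needed = select_objects(index, "08:00:00", "12:00:00")
```

Only the objects in `needed` have to be requested; the others cannot contain rows in the requested range and are skipped.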
IoT Architecture – Madrid Traffic – Machine Learning
Aim: Learn to differentiate between ‘good’ and ‘bad’ traffic
– Depends on context
  • Time (morning/evening), Day (weekday/weekend)
  • Location
– Use Spark MLlib k-means clustering
– Produce threshold values for real-time decision making
– Re-run algorithm when quality of clusters decreases
  • Can use silhouette index to measure quality
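To make the retraining trigger concrete, here is a small pure-Python sketch of the silhouette index for one-dimensional readings such as speeds. It is not the Spark code used in the project, and the retraining cut-off `RETRAIN_BELOW` is a hypothetical value chosen only for illustration. The function expects at least two clusters.

```python
def silhouette_score(points, labels):
    """Mean silhouette coefficient for 1-D points with cluster labels."""
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)

    scores = []
    for p, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point's distance to itself is 0, so it drops out of the sum).
        a = sum(abs(p - q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(abs(p - q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


RETRAIN_BELOW = 0.5  # hypothetical quality cut-off

speeds = [10, 12, 60, 62]
good = silhouette_score(speeds, [0, 0, 1, 1])  # well separated clusters
bad = silhouette_score(speeds, [0, 1, 0, 1])   # clusters that mix both groups
needs_retraining = bad < RETRAIN_BELOW
```

Values close to 1 indicate compact, well separated clusters; values near or below 0 suggest the learned clusters no longer describe the data.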
IoT Architecture – Madrid Traffic – Machine Learning
Event Detection:
• Use Spark MLlib k-means clustering to separate data into 2 clusters
• Find the midpoint between the 2 cluster centres
• Use this midpoint to generate the thresholds
• Repeat for each context e.g. time period (morning, afternoon, evening, night)
Anomaly Detection:
• Use a single cluster and define an anomaly to be further than a certain distance from the cluster centre
Morning Traffic on Weekdays
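The two detection schemes above can be sketched without Spark. The example below swaps Spark MLlib for a plain one-dimensional k-means with k = 2 so that it runs on its own; the speed values and the `max_distance` parameter are made up for illustration.

```python
def two_means_1d(values, iterations=20):
    """Plain 1-D k-means with k=2, initialised at the min and max values."""
    lo, hi = min(values), max(values)
    for _ in range(iterations):
        low = [v for v in values if abs(v - lo) <= abs(v - hi)]
        high = [v for v in values if abs(v - lo) > abs(v - hi)]
        if not high:  # all values identical: nothing to split
            break
        new_lo, new_hi = sum(low) / len(low), sum(high) / len(high)
        if (new_lo, new_hi) == (lo, hi):  # converged
            break
        lo, hi = new_lo, new_hi
    return lo, hi


def midpoint_threshold(values):
    """Event detection: threshold halfway between the two cluster centres."""
    lo, hi = two_means_1d(values)
    return (lo + hi) / 2


def is_anomaly(value, values, max_distance):
    """Anomaly detection: single cluster, flag points far from its centre."""
    centre = sum(values) / len(values)
    return abs(value - centre) > max_distance


# Made-up speeds for one context, e.g. weekday mornings.
speeds = [22, 25, 28, 30, 58, 60, 63, 65]
centres = two_means_1d(speeds)
threshold = midpoint_threshold(speeds)
```

Running the same functions on the readings of each context (morning, afternoon, evening, night) would yield one threshold per context.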
IoT Architecture – Madrid Traffic – Real Time Decision Making
Aim: Respond in real time to traffic conditions
– Use Complex Event Processing (CEP) approach
  • Rule based
  • Process events record by record
  • CEP rules are typically defined manually but in many cases it is difficult to get them right
– We automate this process and make it smart
[Diagram components: CEP, IoT]
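As a rough illustration of a rule whose threshold is learned rather than hand-written, the sketch below processes events one record at a time and raises a congestion event when a sensor reports speeds below its context threshold several times in a row. It is not the CEP engine used in the project: the class `CongestionRule`, the event fields, the context name, and the window of three readings are all assumptions made for the example.

```python
from collections import defaultdict, deque


class CongestionRule:
    """Fires when a sensor reports low speed in `window` consecutive events."""

    def __init__(self, thresholds, window=3):
        self.thresholds = thresholds  # context -> learned speed threshold
        self.window = window
        # Per-sensor history of the last `window` low-speed flags.
        self.recent = defaultdict(lambda: deque(maxlen=window))

    def on_event(self, event):
        """Process one event; return a complex event or None."""
        limit = self.thresholds[event["context"]]
        history = self.recent[event["sensor"]]
        history.append(event["speed"] < limit)
        if len(history) == self.window and all(history):
            return {"type": "congestion", "sensor": event["sensor"]}
        return None


# Threshold as it might come out of the batch learning step (made-up value).
rule = CongestionRule({"weekday-morning": 43.875}, window=3)
events = [
    {"sensor": "s1", "context": "weekday-morning", "speed": s}
    for s in (60, 40, 38, 35)
]
alerts = [rule.on_event(e) for e in events]
```

Because the thresholds are passed in as data, the batch layer can refresh them without the rule itself being rewritten.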
Prediction
Proactive approach:
• Use Spark streaming linear regression to predict traffic behavior (e.g. speed, intensity) for near future
• Apply CEP on predicted data
• Respond proactively to predicted events such as traffic congestion
– e.g. EMT can proactively re-route buses
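The proactive idea can be sketched as: predict the next reading, then run the usual threshold check on the prediction instead of the observation. The example below replaces Spark's streaming linear regression with an ordinary least-squares line fitted over a short window of recent readings, purely to stay self-contained. The window contents and the threshold value are made up.

```python
def predict_next(values):
    """Fit y = a + b*t by least squares over the window, extrapolate one step."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two readings")
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return intercept + slope * n  # value expected at the next time step


def predicted_congestion(recent_speeds, threshold):
    """Apply the threshold rule to the predicted speed, not the observed one."""
    return predict_next(recent_speeds) < threshold


recent_speeds = [60, 55, 50, 45]   # made-up readings, still above the threshold
predicted = predict_next(recent_speeds)
alert = predicted_congestion(recent_speeds, threshold=43.875)
```

Here every observed speed is still above the threshold, yet the downward trend pushes the predicted value below it, so a response could start one interval earlier.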
Use Case 2: Taipei Smart Homes
Smart plugs
Home Gateway
Real-time monitoring, control, and report of home appliances energy usage
• Taipei test scenario comprises 50 volunteer households
• Installed with Smart Energy kit (incl. home gateway, smart plugs, and smart strips)
• Real-time Energy usage
Goal: Real time Monitoring of Appliances in order to detect anomalies
Taipei Smart Homes
• Example of Anomalies
  – Short circuit of a device
  – Devices being operated at unusual times
  – An anomaly at night might not be an anomaly at daytime
• Same Architecture is used for monitoring Energy data
  – Only difference lies in the type of Analytics and Rules
• Historical Data Analytics
  – Learn normal patterns from historical data
  – Use CEP rules to detect the deviation from normal
• Different Models for different context
  – Time of day (Morning, Afternoon, Evening, Night)
  – Weekday or weekend
  – Winter or summer
  – Rainy or sunny
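A minimal sketch of context-dependent anomaly detection for appliance power readings follows. It covers only two of the contexts listed above (time of day and weekday/weekend). The hour boundaries, the `max_deviation` parameter, and the wattage values are assumptions for illustration, not figures from the Taipei trial.

```python
from datetime import datetime
from statistics import mean


def context_of(ts):
    """Map a timestamp to a (day type, time of day) context key."""
    day = "weekend" if ts.weekday() >= 5 else "weekday"
    if 6 <= ts.hour < 12:       # hour boundaries are assumed
        part = "morning"
    elif 12 <= ts.hour < 18:
        part = "afternoon"
    elif 18 <= ts.hour < 23:
        part = "evening"
    else:
        part = "night"
    return day, part


def learn_profiles(history):
    """Batch step: learn the normal power level for each context."""
    by_context = {}
    for ts, watts in history:
        by_context.setdefault(context_of(ts), []).append(watts)
    return {ctx: mean(vals) for ctx, vals in by_context.items()}


def is_anomalous(profiles, ts, watts, max_deviation):
    """Real-time step: flag readings far from the context's normal level."""
    centre = profiles.get(context_of(ts))
    if centre is None:
        return False  # no model learned for this context yet
    return abs(watts - centre) > max_deviation


# Made-up history for one appliance: in use in the evening, idle at night.
history = [
    (datetime(2016, 3, 1, 20, 0), 100.0),
    (datetime(2016, 3, 1, 21, 0), 110.0),
    (datetime(2016, 3, 2, 2, 0), 2.0),
    (datetime(2016, 3, 2, 3, 0), 4.0),
]
profiles = learn_profiles(history)
night_alert = is_anomalous(profiles, datetime(2016, 3, 3, 2, 30), 100.0, 20.0)
evening_alert = is_anomalous(profiles, datetime(2016, 3, 3, 20, 30), 100.0, 20.0)
```

The same 100 W reading is flagged at night but not in the evening, which is the point of keeping a separate model per context.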
Real-Time Anomaly detection using COSMOS Data Analytics Architecture
[Diagram: home appliances (PC/monitor, refrigerator, fan/lighting) with sensor and istrip devices; components Node-Red, Secor, Swift, CEP; output: real-time warning messages]
COSMOS Data Analytics
Our Architecture Applies to Many IoT Use cases
• Healthcare
  – Healthcare patient monitoring/alert/response
• Logistics
  – Monitoring of sensitive goods
• Social Media
  – Event detection if high number of posts detected as compared to normal behavior
• Insurance
  – Driver behavior and location monitoring
• Transportation
  – Connected vehicles, engine diagnostics, automated service scheduling
COSMOS
Funding: EU FP7 at level of 2PY x 3 years
Started: Sept 2013
Coordinator: ATOS
Technical partners: University of Surrey, IBM, NTUA, Siemens, ATOS
Use Case Partners: Hildebrand/Camden, EMT Madrid Bus Transport/Madrid Council, III Taiwan – Smart Cities use cases
Project Vision: Enable ‘things’ to interact with each other based on shared experience, trust, reputation etc.