Marco Parenzan Implementing a canonical IoT backend in Azure with Azure Stream Analytics IoT Experiments @Intel - Assago (Milano) DotNetLombardia Wednesday, May 27, 2015 from 9:00 AM to 6:00 PM (CEST) Milano Fiori, Italy


Page 1: Implementing a canonical IoT backend in Azure with Azure Stream Analytics

Marco Parenzan

Implementing a canonical IoT backend in Azure with Azure Stream Analytics

IoT Experiments @Intel - Assago (Milano)
DotNetLombardia
Wednesday, May 27, 2015 from 9:00 AM to 6:00 PM (CEST)
Milano Fiori, Italy

Page 2: Implementing a canonical IoT backend in Azure with Azure Stream Analytics

Speaker info/Marco Parenzan

www.slideshare.net/marco.parenzan
www.github.com/marcoparenzan
marco [dot] parenzan [at] 1nn0va [dot] it
www.1nnova.it
@marco_parenzan

Training, outreach and consulting with 1nn0va
Microsoft MVP 2014 for Microsoft Azure
Cloud Architect, .NET developer
Loves functional programming, HTML5 game programming and the Internet of Things

AZURE COMMUNITY BOOTCAMP 2015

IoT Day - 08/05/2015

@1nn0va #microservicesconf2015, 9 May 2015

Page 3: Implementing a canonical IoT backend in Azure with Azure Stream Analytics

IoT as a hobby (now…?)

Page 4: Implementing a canonical IoT backend in Azure with Azure Stream Analytics

Canonical Stream Analytics Pattern

Pipeline stages, in order: Event production, Collection, Ingestion, Stream Analysis, Storage and Batch Analysis, Presentation and action.

Event production
• Applications
• Legacy IoT (custom protocols)
• Devices
  • IP-capable devices (Windows/Linux)
  • Low-power devices (RTOS)

Collection
• Cloud gateways (web APIs)
• Field gateways

Ingestion
• Event Hubs

Presentation and action
• Search and query
• Data analytics (Power BI)
• Web/thick client dashboards
• Devices to take action

Azure services shown in the diagram
• Event Hubs
• Stream Analytics
• Machine Learning
• SQL DB
• Storage Tables
• Storage Blobs
• Power BI
• more to come…

Page 5: Implementing a canonical IoT backend in Azure with Azure Stream Analytics

Scenario

• Analytics on data in motion
• Focus on building solutions
• … not on solution infrastructure
• … and get there faster

ARCHIVING
DASHBOARDING
TRIGGERING WORKFLOWS

Page 6: Implementing a canonical IoT backend in Azure with Azure Stream Analytics

What is Streaming Data?

Data in Motion
Data at Rest

Page 7: Implementing a canonical IoT backend in Azure with Azure Stream Analytics

Introducing Azure Stream Analytics

Mission critical reliability and scale

Enables rapid development

Fully managed real-time analytics

Page 8: Implementing a canonical IoT backend in Azure with Azure Stream Analytics

Real-time analytics

Fully managed real-time analytics

Real-time Analytics

• Intake millions of events per second (up to 1 GB/s)

• Low processing latency, auto adaptive (sub-second to seconds)

• Correlate between different streams, or with reference data

• Find patterns or lack of patterns in data in real-time

Fully Managed Cloud Service

• No hardware acquisition and maintenance

• No platform/infrastructure deployment and maintenance

• Easily expand your business globally leveraging Azure regions

Page 9: Implementing a canonical IoT backend in Azure with Azure Stream Analytics

Mission critical

Mission Critical Reliability

• Guaranteed event delivery

• Guaranteed business continuity: Automatic and fast recovery

Effective Audits

• Privacy and security properties of solutions are evident

• Azure integration for monitoring and ops alerting

Easy To Scale

• Scale from small to large on demand

Mission critical reliability and scale

Page 10: Implementing a canonical IoT backend in Azure with Azure Stream Analytics

Rapid development

Rapid Development with a SQL-like language

• High-level: focus on stream analytics solution

• Concise: less code to maintain

• Fast test: Rapid development and debugging

• First-class support for event streams and reference data

Built-in temporal semantics

• Built-in temporal windowing and joining

• Simple policy configuration to manage out-of-order events and late arrivals

Enables rapid development

Page 11: Implementing a canonical IoT backend in Azure with Azure Stream Analytics

SAQL – Language & Library

DML
• SELECT
• FROM
• WHERE
• GROUP BY
• HAVING
• CASE WHEN THEN ELSE
• INNER/LEFT OUTER JOIN
• UNION
• CROSS/OUTER APPLY
• CAST
• INTO
• ORDER BY ASC, DESC

Scaling Extensions
• WITH
• PARTITION BY
• OVER

Date and Time Functions
• DateName
• DatePart
• Day
• Month
• Year
• DateTimeFromParts
• DateDiff
• DateAdd

Windowing Extensions
• TumblingWindow
• HoppingWindow
• SlidingWindow

Aggregate Functions
• Sum
• Count
• Avg
• Min
• Max
• StDev
• StDevP
• Var
• VarP

String Functions
• Len
• Concat
• CharIndex
• Substring
• PatIndex

Temporal Functions
• Lag, IsFirst
• CollectTop

Page 12: Implementing a canonical IoT backend in Azure with Azure Stream Analytics

Pipeline

SELECT UserName, TimeZone
INTO OutputTable
FROM InputStream

Put the data in a static data container

Page 13: Implementing a canonical IoT backend in Azure with Azure Stream Analytics

Filters

SELECT UserName, TimeZone
FROM InputStream
WHERE Topic = 'XBox'

Show me the user name and time zone of tweets on the topic XBox

Input events (along the time axis):
“Zach Dotseth”, “London”, “Football”, (…)
“Haroon”, “Eastern Time (US & Canada)”, “XBox”, (…)
“XO”, “London”, “XBox”, (…)

Output events:
“Haroon”, “Eastern Time (US & Canada)”
“XO”, “London”
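As a rough illustration of what the filter does to each event as it arrives, here is a minimal plain-Python sketch. It is not SAQL and not how Stream Analytics is implemented; the `events` list and the `filter_topic` helper are hypothetical names that simply mirror the fields on the slide.

```python
# Hypothetical tweet events, mirroring the fields used on the slide.
events = [
    {"UserName": "Zach Dotseth", "TimeZone": "London", "Topic": "Football"},
    {"UserName": "Haroon", "TimeZone": "Eastern Time (US & Canada)", "Topic": "XBox"},
    {"UserName": "XO", "TimeZone": "London", "Topic": "XBox"},
]

def filter_topic(stream, topic):
    """Yield (UserName, TimeZone) for each event whose Topic matches."""
    for event in stream:
        if event["Topic"] == topic:                      # WHERE Topic = 'XBox'
            yield event["UserName"], event["TimeZone"]   # SELECT UserName, TimeZone

result = list(filter_topic(events, "XBox"))
print(result)
# [('Haroon', 'Eastern Time (US & Canada)'), ('XO', 'London')]
```

The generator handles one event at a time, which is the point of the slide: the filter never needs the whole data set at rest.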

Page 14: Implementing a canonical IoT backend in Azure with Azure Stream Analytics

Windowing Concepts

• Windows can be tumbling, hopping, or sliding

• Windows are fixed length

• Must be used in a GROUP BY clause

• Output event will have the timestamp of the end of the window

[Figure: events with values 1, 5, 4, 2, 6, 8, 6, 4 on a time axis (t1 to t6), grouped into Window 1, Window 2 and Window 3. The aggregate function (Sum) produces the output events 18 and 14.]
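The two rules "windows are fixed length" and "the output event has the timestamp of the end of the window" can be sketched in plain Python. This is only an illustration under stated assumptions: the timestamps are invented, the window length of 10 is arbitrary, each window is assumed to cover (end - size, end], and `closed_window_sums` is a hypothetical helper, not a Stream Analytics API.

```python
import math
from collections import defaultdict

def closed_window_sums(events, size, now):
    """Sum values per fixed-length window and emit only windows that have closed.

    events: iterable of (timestamp, value) pairs.
    Each window is assumed to cover (end - size, end]; the output carries `end`.
    """
    sums = defaultdict(int)
    for ts, value in events:
        window_end = math.ceil(ts / size) * size   # end of the window containing ts
        sums[window_end] += value
    # A window can only produce its output once its end time has passed.
    return [(end, total) for end, total in sorted(sums.items()) if end <= now]

# Values from the slide (1, 5, 4, 2, 6 | 8, 6 | 4) with made-up timestamps.
events = [(1, 1), (3, 5), (5, 4), (7, 2), (9, 6), (12, 8), (15, 6), (21, 4)]
outputs = closed_window_sums(events, size=10, now=25)
print(outputs)
# [(10, 18), (20, 14)]
```

With `now=25` the third window (ending at 30) is still open, so only the sums 18 and 14 are emitted.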

Page 15: Implementing a canonical IoT backend in Azure with Azure Stream Analytics

Tumbling Windows

SELECT Topic, Count(*) AS TotalTweets
FROM TwitterStream TIMESTAMP BY CreatedAt
GROUP BY Topic, TumblingWindow(second, 10)

“Give me the count of tweets every 10 seconds”

[Figure: a 10-second tumbling window on a time axis from 0 to 30 seconds. The events fall into three non-overlapping windows: (1, 5, 4, 2, 6), (8, 6) and (5, 3, 6, 1).]
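To make the tumbling-window query concrete, the following plain-Python sketch counts events per topic in consecutive 10-second windows. It only simulates the semantics on an in-memory list: the sample events are invented, `tumbling_counts` is a hypothetical helper, and the assumption that a window covers (end - size, end] should be checked against the Stream Analytics documentation.

```python
import math
from collections import Counter

def tumbling_counts(events, size):
    """Count events per (window end, topic); every event lands in exactly one window."""
    counts = Counter()
    for ts, topic in events:
        window_end = math.ceil(ts / size) * size   # assumed window: (end - size, end]
        counts[(window_end, topic)] += 1
    return dict(counts)

# Hypothetical (timestamp in seconds, topic) pairs.
events = [(2, "XBox"), (4, "XBox"), (9, "Football"),
          (11, "XBox"), (19, "Football"), (20, "Football")]
counts = tumbling_counts(events, size=10)
print(counts)
# {(10, 'XBox'): 2, (10, 'Football'): 1, (20, 'XBox'): 1, (20, 'Football'): 2}
```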

Page 16: Implementing a canonical IoT backend in Azure with Azure Stream Analytics

Hopping Windows

SELECT Topic, Count(*) AS TotalTweets
FROM TwitterStream TIMESTAMP BY CreatedAt
GROUP BY Topic, HoppingWindow(second, 10, 5)

“Every 5 seconds give me the count of tweets over the last 10 seconds”

[Figure: a 10-second hopping window with a 5-second “hop” on a time axis from 0 to 30 seconds. Successive windows overlap, so the same event can appear in more than one window.]
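The overlap between hopping windows is easier to see in a small simulation. The sketch below is plain Python, not SAQL; the timestamps are invented, `hopping_counts` is a hypothetical helper, and it assumes windows end at multiples of the hop and cover (end - size, end].

```python
import math
from collections import Counter

def hopping_counts(timestamps, size, hop):
    """Count events per hopping window, keyed by window end time."""
    counts = Counter()
    for ts in timestamps:
        end = math.ceil(ts / hop) * hop    # first window end at or after ts
        while end < ts + size:             # ts stays inside (end - size, end]
            counts[end] += 1
            end += hop
    return dict(sorted(counts.items()))

# Hypothetical event timestamps in seconds.
timestamps = [1, 4, 6, 12, 18]
counts = hopping_counts(timestamps, size=10, hop=5)
print(counts)
# {5: 2, 10: 3, 15: 2, 20: 2, 25: 1}
```

With a 10-second window and a 5-second hop each event is counted twice, once in each of the two windows that overlap it.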

Page 17: Implementing a canonical IoT backend in Azure with Azure Stream Analytics

Sliding Windows

SELECT Topic, Count(*) AS TotalTweets
FROM TwitterStream TIMESTAMP BY CreatedAt
GROUP BY Topic, SlidingWindow(second, 10)

“Give me the count of tweets in every distinct 10-second window”

[Figure: every 10-second sliding window with changes, on a time axis from 0 to 30 seconds.]
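A sliding window only yields a result when its content changes. The plain-Python sketch below illustrates that idea under simplifying assumptions: an output is considered whenever an event enters the window (at its timestamp) or leaves it (at timestamp + size), the window at point p covers (p - size, p], and empty windows are skipped. The timestamps and the `sliding_counts` helper are hypothetical.

```python
def sliding_counts(timestamps, size):
    """Return (point in time, count) for each moment the window content changes."""
    # Content changes when an event enters (ts) or exits (ts + size) the window.
    points = sorted({p for ts in timestamps for p in (ts, ts + size)})
    results = []
    for p in points:
        count = sum(1 for ts in timestamps if p - size < ts <= p)
        if count > 0:                      # skip windows that contain no events
            results.append((p, count))
    return results

# Hypothetical event timestamps in seconds.
timestamps = [1, 4, 12]
counts = sliding_counts(timestamps, size=10)
print(counts)
# [(1, 1), (4, 2), (11, 1), (12, 2), (14, 1)]
```

Unlike tumbling and hopping windows, the output times here are driven by the events themselves rather than by a fixed schedule.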

Page 18: Implementing a canonical IoT backend in Azure with Azure Stream Analytics

Using Windowing

• Tumbling
  • Sample that cannot repeat
  • Sampling in a production line (an item exists in just one window)

• Hopping
  • Sample that can repeat
  • Sampling in a “fixed group” (an item exists in multiple windows)

• Sliding
  • Every sample counts
  • Sampling

Page 19: Implementing a canonical IoT backend in Azure with Azure Stream Analytics

Reference Data

Seamless correlation of event streams with reference data

Static or slowly-changing data stored in blobs

CSV and JSON files in Azure Blobs; scanned for new snapshots on a settable cadence

JOIN (INNER or LEFT OUTER) between streams and reference data sources

Reference data appears like another input:

SELECT myRefData.Name, myStream.Value
FROM myStream
JOIN myRefData
ON myStream.myKey = myRefData.myKey
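Conceptually, joining a stream with reference data is a lookup performed per event. The plain-Python sketch below shows the INNER JOIN case; the device keys, names and values are invented for illustration, and `join_reference` is a hypothetical helper rather than anything Stream Analytics exposes.

```python
# Hypothetical reference data (myKey -> Name), e.g. loaded from a CSV or JSON blob.
reference = {"dev-1": "Boiler room", "dev-2": "Lobby"}

# Hypothetical stream events.
stream = [
    {"myKey": "dev-1", "Value": 21.5},
    {"myKey": "dev-3", "Value": 19.0},   # no matching reference row
    {"myKey": "dev-2", "Value": 22.1},
]

def join_reference(stream, reference):
    """Yield (Name, Value) for events whose key exists in the reference data."""
    for event in stream:
        name = reference.get(event["myKey"])
        if name is not None:             # INNER JOIN: unmatched events are dropped
            yield name, event["Value"]

joined = list(join_reference(stream, reference))
print(joined)
# [('Boiler room', 21.5), ('Lobby', 22.1)]
```

A LEFT OUTER JOIN would instead keep the unmatched event and emit it with an empty name.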

Page 20: Implementing a canonical IoT backend in Azure with Azure Stream Analytics

Multiple steps, multiple outputs

WITH Step1 AS (
    SELECT Count(*) AS CountTweets, Topic
    FROM TwitterStream PARTITION BY PartitionId
    GROUP BY TumblingWindow(second, 3), Topic, PartitionId
),
Step2 AS (
    SELECT Avg(CountTweets)
    FROM Step1
    GROUP BY TumblingWindow(minute, 3)
)
SELECT * INTO Output1 FROM Step1
SELECT * INTO Output2 FROM Step2
SELECT * INTO Output3 FROM Step2

• A query can have multiple steps to enable pipeline execution

• A step is a sub-query defined using WITH (“common table expression”)

• Can be used to develop complex queries more elegantly by creating an intermediate named result

• Creates unit of execution for scaling out when PARTITION BY is used

• Each step’s output can be sent to multiple output targets using INTO
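The way steps chain together and fan out to several outputs can be mimicked in plain Python. This is a sketch of the data flow only, not of the service: the sample events are invented, `step1` and `step2` are hypothetical helpers, and windows are assumed to cover (end - size, end] with each Step1 row timestamped at its window end.

```python
import math
from collections import Counter, defaultdict

def step1(events, size=3):
    """Count tweets per (3-second window, topic, partition)."""
    counts = Counter()
    for ts, topic, partition in events:
        end = math.ceil(ts / size) * size
        counts[(end, topic, partition)] += 1
    return [(end, topic, partition, n)
            for (end, topic, partition), n in sorted(counts.items())]

def step2(step1_rows, size=180):
    """Average the Step1 counts per 3-minute window (180 seconds)."""
    buckets = defaultdict(list)
    for end, _topic, _partition, n in step1_rows:
        buckets[math.ceil(end / size) * size].append(n)
    return [(end, sum(ns) / len(ns)) for end, ns in sorted(buckets.items())]

# Hypothetical (timestamp in seconds, topic, partition id) events.
events = [(1, "XBox", 0), (2, "XBox", 0), (2, "Football", 1), (5, "XBox", 0)]

rows1 = step1(events)
rows2 = step2(rows1)

# One step can feed several outputs, as with the three INTO statements.
output1 = list(rows1)
output2 = list(rows2)
output3 = list(rows2)
print(output1)
print(output2)
```

Step2 consumes the rows produced by Step1, so the intermediate result is computed once and reused by every output that reads from it.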

Page 21: Implementing a canonical IoT backend in Azure with Azure Stream Analytics

Stream Analytics

Scaling using Partitions

Partitioning allows for parallel execution over scaled-out resources

SELECT Count(*) AS Count, Topic
FROM TwitterStream PARTITION BY PartitionId
GROUP BY TumblingWindow(minute, 3), Topic, PartitionId

[Figure: an Event Hub with three partitions (PartitionId = 1, 2, 3); each partition feeds its own instance of the query, producing Query Result 1, Query Result 2 and Query Result 3.]
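The idea behind PARTITION BY is that each partition can be aggregated independently, so the work can run in parallel. The plain-Python sketch below imitates that with a thread pool; it leaves out windowing for brevity, the sample events are invented, and `count_topics` is a hypothetical helper rather than part of any Azure SDK.

```python
from collections import Counter, defaultdict
from concurrent.futures import ThreadPoolExecutor

def count_topics(topics):
    """Aggregate a single partition: count events per topic."""
    return Counter(topics)

# Hypothetical (partition id, topic) events as they might arrive from an Event Hub.
events = [(1, "XBox"), (2, "Football"), (1, "XBox"),
          (3, "XBox"), (2, "Football"), (1, "Football")]

# Split the stream by partition id; no partition needs data from another.
by_partition = defaultdict(list)
for partition_id, topic in events:
    by_partition[partition_id].append(topic)

partition_ids = sorted(by_partition)
with ThreadPoolExecutor() as pool:
    # Each partition is processed by its own task, giving one result per partition.
    per_partition = pool.map(count_topics, (by_partition[p] for p in partition_ids))
    results = dict(zip(partition_ids, per_partition))

print(results)
```

Because PartitionId is part of the grouping key, the per-partition results never have to be merged, which is what lets the query scale out.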

Page 22: Implementing a canonical IoT backend in Azure with Azure Stream Analytics

Demo

Page 23: Implementing a canonical IoT backend in Azure with Azure Stream Analytics

Pricing

• Volume of data processed by the streaming job
  • €0.0008/GB

• Streaming Unit (blended measure of CPU, memory, throughput)
  • €0.0231/hr
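As a worked example of how the two prices combine, the sketch below estimates the cost of a job under the assumption that both meters are simply added together. The workload figures (100 GB, one streaming unit, a 30-day month) are invented, and `estimate_cost` is a hypothetical helper; the prices are the ones quoted on the slide and may no longer be current.

```python
PRICE_PER_GB = 0.0008        # EUR per GB processed (from the slide)
PRICE_PER_SU_HOUR = 0.0231   # EUR per streaming unit per hour (from the slide)

def estimate_cost(gb_processed, streaming_units, hours):
    """Estimate job cost in EUR, assuming data volume and streaming units add up."""
    data_cost = gb_processed * PRICE_PER_GB
    compute_cost = streaming_units * hours * PRICE_PER_SU_HOUR
    return data_cost + compute_cost

# Hypothetical workload: 100 GB in a 30-day month on a single streaming unit.
monthly = estimate_cost(gb_processed=100, streaming_units=1, hours=24 * 30)
print(f"{monthly:.3f} EUR")
# 16.712 EUR
```

In this example the streaming unit dominates the bill: 720 hours cost about €16.63, while the data volume adds only €0.08.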

Page 25: Implementing a canonical IoT backend in Azure with Azure Stream Analytics

Marco Parenzan

Thank you
