Building an ECA Rules engine for IoT using CDAP Big Data On Tap 03/29/2017 Bhooshan Mogal

#BDAM: Building an ECA Rules Engine for IoT with CDAP, by Bhooshan Mogal, Cask



Building an ECA Rules engine for IoT using CDAP

Big Data On Tap

03/29/2017

Bhooshan Mogal


Event-Condition-Action (ECA) Basics

ECA is made up of three major components:

• Event Parsing and Schema Management

• Boolean expressions: Conditions (or Rules)

• Ability to take one or more Actions based on the result of conditions

• Common paradigm in traditional Complex Event Processing (CEP) or Event-Driven architectures, relatively new to the scale-out Apache Hadoop world.


ECA Use-cases and Characteristics

• Use Cases

• Smart Home - Security Systems, Appliance Monitoring, …

• Wearables - Monitoring Vital Stats, Fitness Goals, …

• Typical Characteristics

• Data arrives in continuous, real-time streams

• Data has varying schema

• Metadata (Schemas, Rules) can be registered and managed


Cask Data Application Platform (CDAP)

• CDAP is a unified integration platform that provides higher-level abstractions such as ingest, storage, compute, egress, and visual pipelines for Big Data applications

• Ties together data preparation, data integration, data discovery, data science as well as complex, custom data applications with metadata management, security, operations and governance.


ECA via CDAP

• CDAP ECA Application for Schema and Rules Management

• RESTful APIs + UI for Schema and Rules Management

• CDAP Streams/Apache Kafka for high-throughput event ingestion

• Data Preparation directives for transforming data, generating measurements, selecting rules

• Real-time streaming pipelines using Apache Spark Streaming for processing events

• Generic Event Parser plugin to parse events from a known set of schemas

• Rule Executor plugin to apply rules on parsed events, and generate actions


ECA Concepts

• Events are telemetry data sent from devices or aggregators that run at the edge

• Schema defines the fields, types, rules and transformations that parse the ingested Events

• Measurements are one or more quantitative measures of any kind in an event. Measurements can also be generated by applying transformations on events.

• Conditions (a.k.a. Rules) are boolean expressions applied on fields or measurements in an event. Expressions can combine conditions with logical operators such as 'and' and 'or'. A set of rules can be defined for a given schema.

• Actions define the concrete external notifications that are generated based on the result of executing a condition.

• Schema Hash is an MD5 digest of the field names of an event that uniquely identifies the event type.
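The Schema Hash idea can be illustrated with a minimal Python sketch. The slides only say that the hash is an MD5 digest of the field names; sorting the names and joining them with a comma is an assumption made here so that field order does not change the result, and `schema_hash` is a hypothetical helper name.

```python
import hashlib


def schema_hash(field_names):
    """Return an MD5 hex digest computed over an event's field names.

    Assumption: names are sorted and comma-joined before hashing, so two
    events with the same set of fields get the same hash regardless of order.
    """
    canonical = ",".join(sorted(field_names))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


# Same field names, different values and order: same event type.
event_a = {"Battery": 85, "Steps": 1000, "WatchImei": "123456789012345"}
event_b = {"WatchImei": "999", "Steps": 5, "Battery": 10}
# One field fewer: a different event type.
event_c = {"Battery": 85, "Steps": 1000}

hash_a = schema_hash(event_a.keys())
hash_b = schema_hash(event_b.keys())
hash_c = schema_hash(event_c.keys())
```

Because only the names are hashed, values never influence the event type, which is what lets the hash act as a lookup key for rules and transformations.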


ECA Architecture


Schema and Rules Management User Flow


ECA Architecture Description

• (A) Event Parser parses incoming events. It combines generic parsing capabilities with the ability to generate a schema hash for an event. The schema hash is then used to retrieve the user-defined transformations that are applied to the event to enhance it or extract measurements.

• (B) Rules Executor is responsible for executing the conditions on the event to generate a boolean value to be associated with an action. Rules and Actions for an event are uniquely identified using a schema hash.

• (C) Schema Registry is a repository of definitions of the schema types that are parseable by the Event Parser. It's a CDAP Service backed by a dataset.

• (D) Rules Registry is a repository of Rules (conditions) to be executed for an Event type. Rules are indexed on a schema hash. It's a CDAP Service backed by a dataset.

• (E) Reliable Notification Dispatcher is a daemon process that is responsible for reading events from a priority queue dataset to trigger external notifications. It uses plugin capabilities to define different external communication points.

• (F) Event transport or ingestion is achieved by using either Kafka or CDAP Streams. Other mechanisms, such as Amazon SQS or Azure Event Hubs, are also possible.
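As a rough illustration of component (E), the following Python sketch models a dispatcher that drains a priority queue and hands each event to a plugin chosen by action type. Everything here is hypothetical: the class and method names, the convention that a lower number means higher priority, and the in-memory heap standing in for the priority queue dataset. Retries and persistence, which a reliable dispatcher would need, are left out.

```python
import heapq
import itertools


class NotificationDispatcher:
    """Hypothetical in-memory sketch of a plugin-based notification dispatcher."""

    def __init__(self):
        self._queue = []                    # heap of (priority, seq, action, event)
        self._counter = itertools.count()   # tie-breaker: FIFO within a priority
        self._plugins = {}                  # action name -> callable(event)

    def register_plugin(self, action, send):
        """Register the callable that delivers notifications for an action."""
        self._plugins[action] = send

    def enqueue(self, priority, action, event):
        """Queue an event; a lower priority number is dispatched first."""
        heapq.heappush(self._queue, (priority, next(self._counter), action, event))

    def drain(self):
        """Dispatch all queued events in priority order; return what was sent."""
        delivered = []
        while self._queue:
            _, _, action, event = heapq.heappop(self._queue)
            self._plugins[action](event)
            delivered.append((action, event["MessageId"]))
        return delivered


sent = []
dispatcher = NotificationDispatcher()
dispatcher.register_plugin("sms", lambda e: sent.append(("sms", e["MessageId"])))
dispatcher.register_plugin("email", lambda e: sent.append(("email", e["MessageId"])))

dispatcher.enqueue(2, "email", {"MessageId": "m-2"})
dispatcher.enqueue(1, "sms", {"MessageId": "m-1"})
delivered = dispatcher.drain()
```

Keeping delivery behind a per-action callable is one way to mirror the plugin idea: adding a new communication point means registering another callable, not changing the dispatch loop.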


ECA Event Flow

Incoming Telemetry Event:

{ "Alert": { "Id": "25", "Time": "2016-09-22T07:41:59.2486611+01:00", "Type": "SOS" }, "Battery": 85, "CallerId": "+44123456789", "Calories": 100, "LastContactTime": "2016-09-22T07:41:59.2486611+01:00", "MessageId": "a32d4883-1d0e-489c-bf74-706ffa4b9e62", "MessageTime": "2016-09-22T07:42:06.2486611+01:00", "Position": { "Accuracy": 10, "Latitude": "51.507351", "Longitude": "-0.127758", "Time": "2016-09-22T07:41:58.2486611+01:00" }, "Steps": 1000, "WatchImei": "123456789012345" }

Parsing Directives Applied:

{ "Alert_Id": "25", "Alert_Time": "2016-09-22T07:41:59.2486611+01:00", "Alert_Type": "SOS", "Battery": 85, "CallerId": "+44123456789", "Calories": 100, "LastContactTime": "2016-09-22T07:41:59.2486611+01:00", "MessageId": "a32d4883-1d0e-489c-bf74-706ffa4b9e62", "MessageTime": "2016-09-22T07:42:06.2486611+01:00", "Position_Accuracy": 10, "Position_Latitude": "51.507351", "Position_Longitude": "-0.127758", "Position_Time": "2016-09-22T07:41:58.2486611+01:00", "Steps": 1000, "WatchImei": "123456789012345" }

Hash Generation & User Transformation:

{ "Alert_Id": "25", "Alert_Time": "2016-09-22T07:41:59.2486611+01:00", "Alert_Type": "SOS", "Battery": 85, "CallerId": "+44123456789", "Calories": 100, "LastContactTime": "2016-09-22T07:41:59.2486611+01:00", "MessageId": "a32d4883-1d0e-489c-bf74-706ffa4b9e62", "MessageTime": "2016-09-22T07:42:06.2486611+01:00", "Position_Accuracy": 10, "Position_Latitude": "51.507351", "Position_Longitude": "-0.127758", "Position_Time": "2016-09-22T07:41:58.2486611+01:00", "Steps": 1000, "WatchImei": "123456789012345", "CaloriesPerStep": 0.1, "hash": "ABABBASBAB342442ABABABAAB234ABABA67867" }

• Generic directives applied

• If the payload is an array of events, multiple records are created

• Flattening on each record

• Hash generated based on field names; all fields are considered for hash generation (e.g. hash)

• User directives looked up based on hash

• User directives applied (e.g. CaloriesPerStep)
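The parsing steps listed above (splitting an array into records, flattening, then applying a user directive) can be sketched in Python as follows. This is not the CDAP Data Preparation directive syntax; `flatten`, `parse`, and `add_calories_per_step` are hypothetical helpers, and the formula `Calories / Steps` is an assumption inferred from the sample values (100 calories over 1000 steps giving 0.1).

```python
def flatten(record, parent_key="", sep="_"):
    """Flatten nested objects, joining key names with an underscore."""
    flat = {}
    for key, value in record.items():
        name = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


def parse(payload):
    """Turn a single event or an array of events into flattened records."""
    records = payload if isinstance(payload, list) else [payload]
    return [flatten(record) for record in records]


def add_calories_per_step(record):
    """Example user directive (assumed formula): Calories divided by Steps."""
    enriched = dict(record)
    if enriched.get("Steps"):
        enriched["CaloriesPerStep"] = enriched["Calories"] / enriched["Steps"]
    return enriched


# Trimmed-down version of the telemetry event from the slide.
event = {
    "Alert": {"Id": "25", "Type": "SOS"},
    "Battery": 85,
    "Calories": 100,
    "Position": {"Accuracy": 10},
    "Steps": 1000,
}

records = [add_calories_per_step(r) for r in parse(event)]
batch = parse([event, event])  # an array payload yields one record per element
```

Here `records[0]` contains the flattened keys `Alert_Id`, `Alert_Type`, and `Position_Accuracy`, plus the derived `CaloriesPerStep` of 0.1.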


ECA Event Flow

Hash Generation:

{ "Alert_Id": "25", "Alert_Time": "2016-09-22T07:41:59.2486611+01:00", "Alert_Type": "SOS", "Battery": 85, "CallerId": "+44123456789", "Calories": 100, "LastContactTime": "2016-09-22T07:41:59.2486611+01:00", "MessageId": "a32d4883-1d0e-489c-bf74-706ffa4b9e62", "MessageTime": "2016-09-22T07:42:06.2486611+01:00", "Position_Accuracy": 10, "Position_Latitude": "51.507351", "Position_Longitude": "-0.127758", "Position_Time": "2016-09-22T07:41:58.2486611+01:00", "Steps": 1000, "WatchImei": "123456789012345", "CaloriesPerStep": 0.1, "hash": "ABABBASBAB342442ABABABAAB234ABABA67867" }

Apply Rules & Post Directives:

(Battery > 85 && Alert_Type == "SOS") => sms
(CaloriesPerStep > 10) => email

Output Event Stored in Dataset:

{ "output": <key-value-of-event>, "sms": true, "email": false }

• Applies conditions (boolean expressions) on the incoming fields

• Action types (sms, email) are predefined

• Conditions can be complex

• Each event generates an action result

• Includes the event as key-value pairs for debugging purposes
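The rule application step can be sketched in Python as below. The two conditions are taken from the slide, but expressing them as Python lambdas is a stand-in for whatever expression language the Rule Executor plugin actually uses; `RULES` and `apply_rules` are hypothetical names. Note that the sample event carries a Battery value of exactly 85, so the first condition as written (`Battery > 85`) is false for it; a reading above 85 is needed for the sms action to fire.

```python
# Each rule pairs a boolean condition with one of the predefined action types.
RULES = [
    (lambda e: e["Battery"] > 85 and e["Alert_Type"] == "SOS", "sms"),
    (lambda e: e["CaloriesPerStep"] > 10, "email"),
]


def apply_rules(event, rules=RULES):
    """Evaluate every rule and return the event plus one boolean per action."""
    result = {"output": dict(event)}  # event kept as key-value pairs for debugging
    for condition, action in rules:
        fired = bool(condition(event))
        # Several rules may map to the same action; any match sets it to True.
        result[action] = result.get(action, False) or fired
    return result


sample = {"Battery": 85, "Alert_Type": "SOS", "CaloriesPerStep": 0.1}
higher_battery = {"Battery": 90, "Alert_Type": "SOS", "CaloriesPerStep": 0.1}

sample_result = apply_rules(sample)
higher_result = apply_rules(higher_battery)
```

With these inputs, `sample_result` has both actions set to false, while `higher_result` has sms set to true and email set to false, matching the shape of the output event shown on the slide.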


Demo


Summary - Schema and Rules Management APIs

Schema Management APIs provide the ability to create, delete, list and update schemas in the Schema Registry:

• Add New Schema: Adds a new schema to the registry

• Delete Schema: Deletes a schema from the registry

• View A Schema: Provides details of a schema

• List Schemas: Lists all the schemas in the schema registry

• Update Schema: Updates a schema in the registry

Rules and Action Management APIs provide the ability to create, delete, list and update Rules and associated actions:

• Add New Rule(s): Adds one or more rules to the rules registry. Rules are associated with a Schema Hash or Key field as specified by the user directives

• Delete A Rule: Deletes a rule

• List All Rules for a key: The key could be a schema hash or a user-generated key
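The operations listed above can be mirrored by a small in-memory Python sketch. It does not show the actual RESTful endpoints, which the slides do not spell out; the class names, method names, schema id, and rule ids below are all hypothetical, and plain dictionaries stand in for the CDAP datasets that back the real services.

```python
class SchemaRegistry:
    """Hypothetical in-memory stand-in for the dataset-backed Schema Registry."""

    def __init__(self):
        self._schemas = {}

    def add(self, schema_id, definition):
        if schema_id in self._schemas:
            raise ValueError(f"schema {schema_id!r} already exists")
        self._schemas[schema_id] = definition

    def update(self, schema_id, definition):
        if schema_id not in self._schemas:
            raise KeyError(schema_id)
        self._schemas[schema_id] = definition

    def view(self, schema_id):
        return self._schemas[schema_id]

    def list_schemas(self):
        return sorted(self._schemas)

    def delete(self, schema_id):
        del self._schemas[schema_id]


class RulesRegistry:
    """Hypothetical rules store, indexed by schema hash or user-generated key."""

    def __init__(self):
        self._rules = {}

    def add_rules(self, key, rules):
        self._rules.setdefault(key, {}).update(rules)

    def list_rules(self, key):
        return dict(self._rules.get(key, {}))

    def delete_rule(self, key, rule_id):
        del self._rules[key][rule_id]


schemas = SchemaRegistry()
schemas.add("wearable-v1", {"fields": ["Battery", "Steps"]})
schemas.update("wearable-v1", {"fields": ["Battery", "Calories", "Steps"]})

rules = RulesRegistry()
rules.add_rules("wearable-v1", {
    "sos-sms": ('Battery > 85 && Alert_Type == "SOS"', "sms"),
    "calories-email": ("CaloriesPerStep > 10", "email"),
})
rules.delete_rule("wearable-v1", "calories-email")
```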


Summary - Processing events in real-time

• CDAP real-time pipeline using Apache Spark Streaming

• Reads events from a CDAP Stream

• Generic Event Parser parses the events and applies transformations stored in the schema registry. It generates measurements, and a key to look up the rules to be applied to the event.

• Rules Executor looks up rules from the rules registry and applies them to generate actions

• Events with an sms action are stored in the SMS Kafka topic, and those with an email action in the Email Kafka topic
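The final routing step, where action results decide which topic an event lands in, can be sketched in Python as follows. Plain lists stand in for the Kafka topics, and the topic names and the `route` helper are hypothetical; a real pipeline would publish through a Kafka producer instead.

```python
from collections import defaultdict

# Hypothetical topic names; the slides only refer to an SMS and an Email topic.
TOPIC_FOR_ACTION = {"sms": "sms-topic", "email": "email-topic"}


def route(results):
    """Group events by topic according to the actions that fired for them."""
    topics = defaultdict(list)
    for result in results:
        for action, topic in TOPIC_FOR_ACTION.items():
            if result.get(action):
                topics[topic].append(result["output"])
    return topics


results = [
    {"output": {"MessageId": "m-1"}, "sms": True, "email": False},
    {"output": {"MessageId": "m-2"}, "sms": False, "email": True},
    {"output": {"MessageId": "m-3"}, "sms": False, "email": False},
]

topics = route(results)
```

An event for which no action fired, such as `m-3` here, is not written to either topic in this sketch.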


Questions? Thank You