AAAI 2011 - Tutorial: Introduction to event processing and challenges for the next generations of event processing of interest to the AI community
IBM Haifa Research Lab – Event Processing
© 2011 IBM Corporation
Event processing – State of the art and research challenges
AAAI 2011 Tutorial, San Francisco, August 7th, 2011
Opher Etzion ([email protected]), Yagil Engel ([email protected])
Slides available at:
ie.technion.ac.il/~yagile/EP_Tutorial.pdf
Imagine that…
A driver gets a notification on the car screen: the person crossing the street is an Alzheimer's patient who is off his regular route; he lives at 5 King Street.
A national park gets information from the car computers on all cars heading to the park; it can open more parking lots, or notify cars that the park will be closed.
Agenda
I: Introduction and roots of event processing
II: Players and architecture of event processing
III: Current state of the art in event processing
IV: Challenges in event processing systems
V: Summary
I: Introduction and roots of event processing
What is “event processing” anyway?
Event processing is a form of computing that performs operations on events
In computing, we have processed events since the early days
Network and System Management
Emerging technologies in enterprise computing (Gartner Hype Cycle, Summer 2009)
What’s new? The analogy: moving from files to DBMS
In recent years, architectures, abstractions, and dedicated commercial products have emerged to support functionality that was traditionally carried out within regular programming. For some applications this is an improvement in TCO; for others it breaks the cost-effectiveness barrier.
What is an event – three views
The happening view: An event is anything that happens, or is contemplated as happening.
The state change view: An event is a change of state of anything.
The detectable condition view: An event is a detectable condition that can trigger a notification.
In daily life we often react to events.
Many times we react to a combination of events within a context
The house sensor detects that the child did not arrive home within 2 hours of the scheduled end of classes for the day.
I want to be notified when my own investment portfolio is down 5% since the start of the trading day; have an agent call me when I am available, send an SMS when I am in a meeting, and send an email when I am out of the office.
Event Patterns
Pattern detection is one of the notable functions of event processing
What we actually want to react to are – situations
Examples pictured: toll violator, frustrated customer
Sometimes the situation is determined by detecting that some pattern occurred in the flowing events.
Sometimes the events can approximate or indicate with some certainty that the situation has occurred.
Event processing is being used for various reasons
EP Solution Segments – Business Value
Detect, Decide, Respond
Real-Time Operational: Reactions to events are done as part of business transactions, achieving low latency decisions and quick reaction to threats and opportunities
Information Dissemination: Getting the right information in the right granularity to the right person at the right time
Active Diagnostics: Diagnose problems based on symptoms and resolve them
Observation: Quick observation into exceptional business behavior and notification to the appropriate people
Predictive Processing: Mitigate or eliminate predicted events
Ancestor: Production Rules
When Precondition
Fire Action
The precondition is an implicit event when activated in forward chaining
Ancestor: active databases
On event
When condition
Do action
With coupling mode
Composite events were inherited by event processing
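The On / When / Do / With shape of an active database rule can be sketched in a few lines. This is only an illustration: the names `EcaRule`, `RuleEngine`, `signal`, and `commit` are hypothetical, and only two coupling modes are modeled (the action runs immediately, or is deferred until `commit` is called).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Event = Dict[str, object]


@dataclass
class EcaRule:
    """On <event type>, when <condition>, do <action>, with <coupling mode>."""
    on: str
    when: Callable[[Event], bool]
    do: Callable[[Event], None]
    coupling: str = "immediate"  # assumed modes: "immediate" or "deferred"


class RuleEngine:
    def __init__(self, rules: List[EcaRule]) -> None:
        self.rules = rules
        self.deferred: List[Tuple[EcaRule, Event]] = []

    def signal(self, event: Event) -> None:
        """Evaluate every rule whose event type matches the signaled event."""
        for rule in self.rules:
            if rule.on == event["type"] and rule.when(event):
                if rule.coupling == "immediate":
                    rule.do(event)
                else:
                    self.deferred.append((rule, event))

    def commit(self) -> None:
        """Run the deferred actions, as at the end of a transaction."""
        pending, self.deferred = self.deferred, []
        for rule, event in pending:
            rule.do(event)


log: List[str] = []
engine = RuleEngine([
    EcaRule("withdrawal", lambda e: e["amount"] > 1000, lambda e: log.append("alert")),
    EcaRule("withdrawal", lambda e: True, lambda e: log.append("audit"), "deferred"),
])
engine.signal({"type": "withdrawal", "amount": 5000})
log_before_commit = list(log)
engine.commit()
```

The immediate rule fires inside `signal`, while the deferred rule waits for `commit`, which is the distinction a coupling mode expresses.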
Ancestor: Data Stream management system
Source: Ankur Jain’s website
Event processing and Data stream management?
Aliases?
One of them subset of the other?
Totally unrelated concepts?
Ancestor: Temporal databases
There is a substantial temporal nature to event processing. Recently, spatial and spatio-temporal functions are also being added.
Ancestor: Discrete event simulation
Ancestor: Formal Verification
Ancestor: Network and system management
Ancestor: Messaging – pub/sub middleware
II: Players and architecture of event processing
Event Driven Architecture
Event driven architecture: asynchronous, decoupled; each component is autonomic.
Fast Flower Delivery (http://www.ep-ts.com/EventProcessingInAction)
[Diagram: the Flower Store, Van Driver, and Driver’s Guild interact with the Bid System, Assignment System, Control System, Location Service, and Ranking and Reporting System. Labeled flows: Store Preferences, Delivery Request, Bid Request, Delivery Bid, Manual Assignment, Assignment, Assignments, Ranked drivers / automatic assignment, Bid alerts, Assign Alerts, Pick Up Alert, Delivery Alert, Pick Up confirmation, Delivery confirmation, GPS Location, Location, Ranking and reports]
The seven building blocks
Event Producer, Event Consumer, Event Processing Agent, Event Type, Event Channel, Context, Global State
Event processing network
[Diagram: a network connecting Event Producers 1 and 2, Agents 1 to 3, a Channel, State, and Event Consumers 1 to 3]
Example of EPN – part of the FFD example
[Diagram elements: Driver, Store reference, Driver status, Bid Request channel, Bid enrichment, Bid routing, Driver enrichment, Manual assignment preparation, Automatic assignment, Assignment manager, Assignment request channel, Assignment channel, Alerts channel]
Event type definition
Header: system defined event attributes (detection time, occurrence time, source, certainty…)
Payload: attributes specific to the event type (stock id, quote, volume…)
Open content: additional free format data included in the event instance (free comments…)
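The three parts of an event type can be sketched as a small data structure. The class and field names below are hypothetical, chosen to mirror the attributes listed on the slide, and the sample values are made up for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict


@dataclass(frozen=True)
class Header:
    """System defined attributes carried by every event instance."""
    event_type: str
    occurrence_time: datetime  # when the event happened in reality
    detection_time: datetime   # when the system became aware of it
    source: str
    certainty: float = 1.0


@dataclass(frozen=True)
class Event:
    header: Header
    payload: Dict[str, Any]                       # attributes specific to the type
    open_content: Dict[str, Any] = field(default_factory=dict)  # free format data


quote = Event(
    header=Header(
        event_type="StockQuote",
        occurrence_time=datetime(2011, 8, 7, 9, 30, 0),
        detection_time=datetime(2011, 8, 7, 9, 30, 1),
        source="exchange-feed",
        certainty=0.95,
    ),
    payload={"stock_id": "XYZ", "quote": 10.5, "volume": 1200},
    open_content={"comment": "free comments"},
)

# Occurrence time and detection time are distinct header attributes.
detection_lag = (quote.header.detection_time - quote.header.occurrence_time).total_seconds()
```

Keeping occurrence time and detection time separate lets later processing reason about the order in which events happened rather than the order in which they arrived.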
Producer – State Observer in workflows
State observer
Push: instrumentation points
Pull: query the state
Producer – Code instrumentation
Producer – syndication
Producers – video streams to events
Producer – sensors
Producer and consumer - Sixth sense
Twitter as a producer and consumer
Consumer - Performance monitoring dashboard
Consumer - Ambient Orb
Event Processing Agents
Agent types: Filter, Transform, Detect Pattern
Transform subtypes: Translate, Aggregate, Split, Compose, Enrich, Project
The EPA picture
Pattern detect EPA, with its steps:
Instance selection: context expression
Relevance filtering: relevant event types, input terminal filter expression (other events are not selected)
Matching: pattern signature (pattern type, pattern parameters, relevant event types, pattern policies), yielding the pattern matching set of participant events
Derivation: derivation expression, yielding the output
Filter EPA
A filter EPA is an EPA that performs filtering only, and has no matching or derivation steps, so it does not transform the input event.
Filtering: principal filter expression
Output terminals: Filtered In, Filtered Out, Non-Filterable
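A filter EPA with its three output terminals can be sketched as below. The function name `filter_epa` is hypothetical, and so is the reading of "non-filterable" used here: an event on which the filter expression cannot be evaluated, for example because an attribute is missing.

```python
from typing import Callable, Dict, Iterable, List

Event = Dict[str, object]


def filter_epa(events: Iterable[Event],
               expression: Callable[[Event], bool]) -> Dict[str, List[Event]]:
    """Route each event, unchanged, to one of three output terminals."""
    terminals: Dict[str, List[Event]] = {
        "filtered_in": [], "filtered_out": [], "non_filterable": [],
    }
    for event in events:
        try:
            passed = bool(expression(event))
        except (KeyError, TypeError):
            # Assumption: the expression cannot be evaluated on this event.
            terminals["non_filterable"].append(event)
            continue
        terminals["filtered_in" if passed else "filtered_out"].append(event)
    return terminals


bids = [
    {"driver": "d1", "ranking": 7},
    {"driver": "d2", "ranking": 2},
    {"driver": "d3"},  # no ranking attribute
]
routed = filter_epa(bids, lambda e: e["ranking"] >= 5)
```

The events are passed through as the same objects, which reflects the point that a filter EPA does not transform its input.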
Transform EPA sub types
Translate, Compose, Aggregate, Enrich, Split, Project
Sample of pattern types
all pattern is satisfied when the relevant event set contains at least one instance of each event type in the participant set
any pattern is satisfied if the relevant event set contains an instance of any of the event types in the participant set
absence pattern is satisfied when there are no relevant events
relative N highest values pattern is satisfied by the events which have the N highest value of a specific attribute over all the relevant events, where N is an argument
value average pattern is satisfied when the value of a specific attribute, averaged over all the relevant events, satisfies the value average threshold assertion.
always pattern is satisfied when all the relevant events satisfy the always pattern assertion
sequence pattern is satisfied when the relevant event set contains at least one event instance for each event type in the participant set, and the order of the event instances is identical to the order of the event types in the participant set.
increasing pattern is satisfied by an attribute A if for all the relevant events, e1 << e2 ⇒ e1.A < e2.A
relative max distance pattern is satisfied when the maximal distance between any two relevant events satisfies the max threshold assertion
moving toward pattern is satisfied when for any pair of relevant events e1, e2 we have e1 << e2 ⇒ the location of e2 is closer to a certain object than the location of e1.
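Several of the pattern types above reduce to short predicates over the relevant event set. The sketch below is a simplified reading, not a reference implementation: events are plain dictionaries with hypothetical `type` and `time` keys, timestamps are assumed distinct, and the sequence pattern is interpreted as "instances of the listed types can be found in that order".

```python
from typing import Dict, List, Sequence

Event = Dict[str, object]


def all_pattern(events: List[Event], participant_types: Sequence[str]) -> bool:
    """At least one instance of each participant event type is present."""
    present = {e["type"] for e in events}
    return all(t in present for t in participant_types)


def any_pattern(events: List[Event], participant_types: Sequence[str]) -> bool:
    """An instance of any participant event type is present."""
    return any(e["type"] in participant_types for e in events)


def absence_pattern(events: List[Event]) -> bool:
    """There are no relevant events."""
    return len(events) == 0


def sequence_pattern(events: List[Event], participant_types: Sequence[str]) -> bool:
    """Instances of the participant types occur in the listed order."""
    position = 0
    for e in sorted(events, key=lambda e: e["time"]):
        if position < len(participant_types) and e["type"] == participant_types[position]:
            position += 1
    return position == len(participant_types)


def increasing_pattern(events: List[Event], attribute: str) -> bool:
    """For all relevant events, e1 before e2 implies e1.A < e2.A."""
    ordered = sorted(events, key=lambda e: e["time"])
    return all(a[attribute] < b[attribute] for a, b in zip(ordered, ordered[1:]))


relevant = [
    {"type": "A", "time": 1, "value": 10},
    {"type": "B", "time": 2, "value": 12},
    {"type": "C", "time": 3, "value": 15},
]
```

For `increasing_pattern`, checking consecutive pairs in time order is enough, because `<` is transitive.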
Pattern detection example
Pattern name: Manual Assignment Preparation
Pattern Type: relative N highest
Context: Bid Interval
Relevant event types: Delivery Bid
Pattern parameter: N = 5; value = Ranking
Cardinality: Single deferred
Find the five highest bids within the bid interval
Taken from the Fast Flower Delivery use case
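The example above can be sketched as a function that is evaluated once, when the bid interval closes, over the Delivery Bid events collected in that context partition. The function name and the sample bids are hypothetical.

```python
import heapq
from typing import Dict, List

Event = Dict[str, object]


def relative_n_highest(events: List[Event], attribute: str, n: int) -> List[Event]:
    """Return the n events with the highest value of the given attribute."""
    return heapq.nlargest(n, events, key=lambda e: e[attribute])


# Delivery Bid events gathered during one bid interval (made-up data).
bids = [{"driver": f"d{i}", "ranking": r} for i, r in enumerate([3, 9, 1, 7, 8, 5, 6])]

# "Single deferred": one evaluation at the end of the context partition.
top_five = relative_n_highest(bids, "ranking", 5)
```

Here `top_five` holds the bids ranked 9, 8, 7, 6, and 5, in that order.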
Our entire culture is context sensitive
In the play “The Teahouse of the August Moon” one of the characters says: “Pornography question of geography”
•This says that in different geographical contexts people view things differently
•Furthermore, the syntax of the language (no verbs) is typical of the way the people of Okinawa talk
When listening to a concert, people do not talk or eat, and they keep their mobile phones on “silent”.
Context has three distinct roles (which may be combined)
Partition the incoming events: the events that relate to each customer are processed separately
Grouping events together: grouping together events that happened in the same hour at the same location
Determining the processing: different processing for different context partitions
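The first two roles can be sketched as functions that assign events to context partitions. The function names and the event keys (`customer`, `location`, `time`) are hypothetical, and the sample events are made up.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, List

Event = Dict[str, object]


def segmentation_context(events: List[Event], key: str) -> Dict[object, List[Event]]:
    """Partition events by an attribute value, e.g. one partition per customer."""
    partitions = defaultdict(list)
    for e in events:
        partitions[e[key]].append(e)
    return dict(partitions)


def hour_and_location_context(events: List[Event]) -> Dict[tuple, List[Event]]:
    """Group events that happened in the same hour at the same location."""
    partitions = defaultdict(list)
    for e in events:
        hour = e["time"].replace(minute=0, second=0, microsecond=0)
        partitions[(hour, e["location"])].append(e)
    return dict(partitions)


events = [
    {"customer": "c1", "location": "north", "time": datetime(2011, 8, 7, 9, 5)},
    {"customer": "c2", "location": "north", "time": datetime(2011, 8, 7, 9, 40)},
    {"customer": "c1", "location": "south", "time": datetime(2011, 8, 7, 10, 15)},
]
by_customer = segmentation_context(events, "customer")
by_hour_location = hour_and_location_context(events)
```

The second function combines a temporal dimension with a spatial one, so a single context definition yields one partition per (hour, location) pair. The third role would then attach different processing to different partitions.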
Context Definition
A context is a named specification of conditions that groups event instances so that they can be processed in a related way. It assigns each event instance to one or more context partitions.
A context may have one or more context dimensions.
Context dimensions: Temporal, Spatial, State Oriented, Segmentation Oriented
Context Types
Context
Temporal: Fixed interval, Event interval, Sliding fixed interval, Sliding event interval
Spatial: Fixed location, Entity distance location, Event distance location
State Oriented
Segmentation Oriented
Context Types Examples
Context
Temporal: “Every day between 08:00 and 10:00 AM”; “A week after borrowing a disk”; “A time window bounded by TradingDayStart and TradingDayEnd events”
Spatial: “3 miles from the traffic accident location”; “Within an authorized zone in a manufactory”
Segmentation Oriented: “All Children 2-5 years old”; “All platinum customers”
State Oriented: “Airport security level is red”; “Weather is stormy”
III: The current state of the art in event processing
An Observation
The Babylon Tower symbolizes the tendency of humanity to talk in multiple languages.
The event processing area is no different: most languages in the industry follow the hammer and nails syndrome and extend existing approaches:
• imperative script language
• SQL extensions
• extension of inference rule language
The EPTS language analysis workgroup aims to understand the various styles and extract common functions that can be used to define what an event processing language is; this tutorial is an interim report.
It does not seem that we will succeed in settling on a single programming style in the near future.
The Babylon tower and current state of the practice
StreamBase Studio
StreamBase Pattern Matching
CCL Studio (Coral8 Sybase)
CCL – Pattern Matching
RFID monitoring application
Checks if a tag has been seen by readers A and B, then C, but not D, within a 10 second window.
Insert into StreamAlerts
Select StreamA.id
From StreamA a, StreamB b, StreamC c, StreamD d
Matching [10 seconds: a && b, c, !d]
On a.id = b.id = c.id = d.id
Microsoft StreamInsight
var topfive = (from window in inputStream.Snapshot()
               from e in window
               orderby e.f ascending, e.i descending
               select e).Take(5);
var avgCount = from v in inputStream
               group v by v.i % 4 into eachGroup
               from window in eachGroup.Snapshot()
               select new { avgNumber = window.Avg(e => e.number) };
Esper EPL – FFD Example
/** Not delivered after 10 mins (600 secs) of the request target delivery time */
insert into AlertW(requestId, message, driver, timestamp)
select a.requestId, "not delivered", a.driver, current_timestamp()
from pattern[
  every a=Assignment -> (timer:interval(600 + (a.deliveryTime-current_timestamp)/1000) and
  not DeliveryConfirmation(requestId = a.requestId) and
  not NoOneToReceiveMSG(requestId = a.requestId))];
ruleCore - Reakt
Event stream view - a unique context of events; a view contains a window into the inbound stream of events and commonly contains only semantically related events
Situation - an interesting combination of multiple events as they occur over time
An item with an RFID tag being picked up from the shelf and then moving past the checkout without being paid for
Rule - an active event processing entity reacting to specific combinations of inbound events over time
Action - the last part of a rule's evaluation in response to a detected situation
Amit - Situation
IBM WebSphere Business Events
Apama EPL – FFD Examples
Performance benchmarks
There is a large variance among applications, thus a collection of benchmarks should be devised, and each application should be classified to a benchmark
Some classification criteria:
Application complexity
Filtering rate
Required performance metrics
Performance benchmarks – cont.
Adi A., Etzion O. Amit - the situation manager. The VLDB Journal – The International Journal on Very Large Databases, Volume 13, Issue 2, 2004.
Mendes M., Bizarro P., Marques P. Benchmarking event processing systems: current state and future directions. WOSP/SIPEW 2010: 259-260.
[Chart: event processing system benchmark; throughput by category (standby world, noisy world, filtered world, complex world)]
[Chart: event processing system benchmark; total processing time (ms) by category (standby world, noisy world, filtered world, complex world)]
[Chart: performance study of event processing systems; throughput (× 10^3) by category (selection and projection, aggregation over windows, joins, pattern detection) for system 1, system 2, and system 3]
Previous studies indicate that there is a major performance degradation as application complexity increases.
Throughput
Input throughput: number of input events that the system can digest within a given time interval
Processing throughput: total processing time / number of events processed within a given time interval
Output throughput: number of events that were emitted to consumers within a given time interval
Latency
The latency definition
At the E2E level, latency is defined as the elapsed time FROM the time-point when the producer emits an input event TO the time-point when the consumer receives an output event
But an input event may not result in an output event: it may be filtered out, participate in a pattern that does not result in pattern detection, or participate in a deferred operation (e.g. aggregation)
Similar definitions apply at the EPA level or the path level
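The throughput and end-to-end latency measures can be sketched over a log of emit and receive times. The function names and the log layout are hypothetical: each output is assumed to be keyed by the identifier of the input event that triggered it, which sidesteps the harder case of outputs derived from several inputs.

```python
from statistics import mean
from typing import Dict


def throughput(event_count: int, interval_seconds: float) -> float:
    """Events per second over a measurement interval (input or output side)."""
    return event_count / interval_seconds


def end_to_end_latencies(emit_times: Dict[str, float],
                         receive_times: Dict[str, float]) -> Dict[str, float]:
    """Latency per input event that produced an output; other inputs are skipped."""
    return {
        event_id: receive_times[event_id] - emitted
        for event_id, emitted in emit_times.items()
        if event_id in receive_times
    }


# Made-up measurements, in seconds. Event e2 was filtered out, so it has no
# receive time and therefore no end-to-end latency.
emit_times = {"e1": 0.0, "e2": 1.0, "e3": 2.0}
receive_times = {"e1": 0.5, "e3": 3.5}

latencies = end_to_end_latencies(emit_times, receive_times)
avg_latency = mean(latencies.values())
input_rate = throughput(len(emit_times), 10.0)
output_rate = throughput(len(receive_times), 10.0)
```

Skipping inputs without an output is one possible convention for the caveat above; a benchmark could instead report them separately.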
Performance goals and metrics
Multi-objective optimization function: min(α · avg latency + (1 − α) · (1 / throughput))
[Charts: three panels titled minmax latency, minavg latency, and latency leveling]
Max throughput
All / 80% have max / avg latency < δ
All / 90% of time units have throughput > Ω
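The multi-objective function can be evaluated for a handful of candidate configurations. The weight symbol did not survive in the source, so it is called `alpha` here as an assumption, taken to lie between 0 and 1; the configuration names and numbers are made up.

```python
from typing import Dict, Tuple


def objective(alpha: float, avg_latency: float, throughput: float) -> float:
    """alpha * avg latency + (1 - alpha) * (1 / throughput); lower is better."""
    return alpha * avg_latency + (1 - alpha) * (1 / throughput)


# Hypothetical configurations: name -> (avg latency, throughput).
configurations: Dict[str, Tuple[float, float]] = {
    "batching": (2.0, 50.0),
    "streaming": (0.5, 20.0),
}


def best_configuration(alpha: float) -> str:
    """Pick the configuration that minimizes the weighted objective."""
    return min(configurations, key=lambda name: objective(alpha, *configurations[name]))
```

With `alpha = 0.5` the lower-latency configuration wins, while with `alpha = 0` only throughput counts and the choice flips, which is the trade-off the weight is meant to expose.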
Scalability in event processing: various dimensions
# of producers
# of input events
# of EPA types
# of concurrent runtime instances
# of concurrent runtime contexts
Internal state size
# of consumers
# of derived events
Processing complexity
Scalability solutions
Significant progress in scalability enablers makes systems based on large-scale event sources, event quantities, computations, and actuators feasible
Smart placements of processing elements with dynamic load balancing
Fault tolerance techniques enable trustable automatic processing
Virtualization (scale-in)
Use of parallel processing – multi-core and GPU processors – without extra programming efforts
IV: Challenges in event processing systems
Challenges
Inexact Event Processing
Predictive Event Processing
Use of Machine Learning
From Reactive to Proactive
Correctness
Inexact event processing
[Diagram: source malfunction, malicious source, projection of temporal anomalies, imprecise source, sampling or estimate, and propagation of inexactness, shown together with uncertain event and inexact event content]
Uncertain situations
False positive: the pattern is matched; the real-world situation does not occur
False negative: the pattern is not matched; the real-world situation occurs
Temporal indeterminacy
Inexact indicator Probability
Event did not occur 0.4
Event occurred before T1 0.1
Event occurred in [T1, T2] 0.45
Event occurred after T2 0.05
[Figure: timeline marked with T1 and T2]
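The table above is a discrete distribution over four mutually exclusive outcomes, so simple queries follow by arithmetic. The dictionary keys below are hypothetical labels for the table rows.

```python
# Probabilities taken from the table above.
distribution = {
    "did_not_occur": 0.40,
    "before_T1": 0.10,
    "in_T1_T2": 0.45,
    "after_T2": 0.05,
}

# Sanity check: the four outcomes cover all cases.
assert abs(sum(distribution.values()) - 1.0) < 1e-9

# Probability that the event occurred at all.
p_occurred = 1.0 - distribution["did_not_occur"]

# Probability that it fell inside [T1, T2], given that it occurred.
p_in_interval_given_occurred = distribution["in_T1_T2"] / p_occurred
```

So the event occurred with probability 0.6, and if it did occur, it fell inside [T1, T2] with probability 0.75.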
Predictive Event Processing (1)
VS.
Photo by Michael Gray, Flickr
Predictive Event Processing (2)
Predictive Event Patterns
Pattern → Future event, probability, time interval
“4 high value deposits from different geographic locations within 3 days” → “0.6 chance for a large transfer abroad, in 1 day”
“Output event will occur with distribution D over interval (t1,t2)”
Stock decrease of > 5% in 3 hours → Good chance for 2% increase within 2 hours
Limitations of the use of rules in specifying predictive event patterns
Limitations:
Partial patterns
Uncertain input events
Complex relationship between random variables
Rule = hard-coded probabilistic relationship
Dynamic event prediction
Time Series Prediction
Graphical models
Temporal Graphical models
Graphical Model for Missing a Flight (Logistics Scenario)
Predictive Model for Missing a Flight (Logistics Scenario)
Continuous Time Bayesian Networks (CTBN, Nodelman et al, 2002)
Can be used to model probabilistic and temporal relationships between events. E.g., applied to the problem of detecting host-level attacks in network traffic (Xu and Shelton, 2008)
Anomaly Detection in Networks (Xu and Shelton, 2008)
CTBN model (Xu and Shelton, 2008)
Machine Learning in EP Systems
Required for training predictive capabilities:
Learn parameters / structure of graphical models
Learn predictive rules
Discover the patterns used by EPAs
Event Pattern Discovery
Most (almost all) deployed systems today rely on user input to obtain complex event patterns
How can (business) users obtain these patterns?
Users do not know all the patterns that are relevant
System must be built and maintained by domain experts
Requirements of Data Mining Algorithms
What should DM algorithms be able to handle?
Low frequency patterns
Temporal Windows
Assertions and Thresholds
Non-Standard patterns
Low Frequency Patterns
Detecting rare events: Frauds, attacks
Predict crashes
Equipment failure
Natural disasters
Solutions: Low support mining
Unsupervised learning for anomaly detection
Temporal Windows
The time window should be an output of the DM process
Work by Mannila et al., 1997: WINEPI
Assertions and Thresholds
Pattern “3 cash deposits in one day” may have no predictive value
BUT “3 cash deposits above $10000 from 3 different locations” does
Multiattribute mining (Hellerstein et al.)
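The difference between the two patterns is an assertion on the amount plus a threshold on distinct locations. A sketch, assuming the deposits passed in already belong to one day's context partition; the function name, keys, and sample data are hypothetical.

```python
from typing import Dict, List

Deposit = Dict[str, object]


def count_only(deposits: List[Deposit], min_count: int = 3) -> bool:
    """The plain pattern: at least min_count cash deposits in the day."""
    return len(deposits) >= min_count


def count_with_assertions(deposits: List[Deposit], min_amount: float = 10000,
                          min_count: int = 3, min_locations: int = 3) -> bool:
    """At least min_count deposits above min_amount, from min_locations distinct places."""
    large = [d for d in deposits if d["amount"] > min_amount]
    locations = {d["location"] for d in large}
    return len(large) >= min_count and len(locations) >= min_locations


ordinary_day = [
    {"amount": 200, "location": "branch-1"},
    {"amount": 150, "location": "branch-1"},
    {"amount": 300, "location": "branch-1"},
]
suspicious_day = [
    {"amount": 12000, "location": "branch-1"},
    {"amount": 15000, "location": "branch-2"},
    {"amount": 11000, "location": "branch-3"},
]
```

Both days satisfy `count_only`, but only `suspicious_day` satisfies `count_with_assertions`, which is why a mining algorithm has to discover the thresholds and not just the event counts.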
Other kinds of patterns
We may be interested in patterns which are not sequential:
“All”, “Absence”, “Max Value”, “Sometime”
“If there is no deposit to this account in the last year,…”
“If the maximal value of deposit to this account in the last year is $5,…”
“If at least one of the deposits was made from abroad,…”
From Reactive to Proactive
[Figure: a timeline from now to later, labeled with stable states, an action, and a shift left]
Example: Call Center Queue Assignment
MDP Model:
States (S): queue status
Actions (a): assignments
Reward (R): penalty for waiting and blocking
Transition (T): call arrival, call ending
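A toy version of such an MDP can be solved with value iteration. Everything numeric below is a made-up assumption: a single queue with capacity 3, one arrival per step with probability 0.5, and an action that either assigns an agent (who completes one call at a cost) or does not. It is a sketch of the modeling idea, not the model used in the tutorial.

```python
from typing import List, Tuple

K = 3                      # queue capacity (hypothetical)
P_ARRIVAL = 0.5            # probability that a call arrives in a step
WAIT_COST = 1.0            # penalty per waiting call per step
AGENT_COST = 1.0           # cost of assigning an agent for a step
BLOCK_COST = 10.0          # penalty when an arriving call finds the queue full
GAMMA = 0.9                # discount factor
STATES = range(K + 1)      # state: number of calls in the queue
ACTIONS = (0, 1)           # 0: no assignment, 1: assign an agent who ends one call


def step(q: int, a: int) -> Tuple[float, List[Tuple[float, int]]]:
    """Return (expected reward, [(probability, next state), ...]) for state q, action a."""
    after_service = max(q - a, 0)
    reward = -WAIT_COST * q - AGENT_COST * a
    if after_service == K:
        # An arrival would be blocked, so the queue stays full either way.
        return reward - P_ARRIVAL * BLOCK_COST, [(1.0, K)]
    return reward, [(P_ARRIVAL, after_service + 1), (1 - P_ARRIVAL, after_service)]


def q_value(q: int, a: int, values: List[float]) -> float:
    reward, transitions = step(q, a)
    return reward + GAMMA * sum(p * values[s] for p, s in transitions)


def value_iteration(sweeps: int = 200) -> Tuple[List[float], List[int]]:
    values = [0.0] * (K + 1)
    for _ in range(sweeps):
        values = [max(q_value(q, a, values) for a in ACTIONS) for q in STATES]
    policy = [max(ACTIONS, key=lambda a: q_value(q, a, values)) for q in STATES]
    return values, policy


values, policy = value_iteration()
```

In this toy model the resulting policy assigns no agent to an empty queue and does assign one when the queue is full; the interesting decisions are in between and depend on the assumed costs.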
Proactivity: Call Center Example
[Diagram elements: Event Processing System; Real-time Optimization; Policy Adaptation; skill based routing policies; employees. Inputs: request traffic events, related events, events that trigger many requests (outage, bad weather…), resource or demand changes]
Proactive Event-Driven Computing (1)
Detect / Derive: event processing (filter, transform, match patterns) over incoming events
Predict: predict (states, events)
Decide: real-time decision
Act: proactive action
Proactive event-driven computing is a new paradigm aimed at predicting the occurrence of problems or opportunities before they occur, and changing the course of actions to mitigate or leverage them
Energy Scenario
Detect: Consumption Level, Production Level, Generator Failure, Generator Fixed, Weather Forecast (sun, wind, temp, storm)
Predict: Consumption Forecast, Production Forecast, Outage Prediction, Many Failed Generators Prediction
Decide / Act: Call for Urgent Generators Fix, Activate Expensive Diesel Generators, Declare “Peak Hours” for Tomorrow, Activate Rolling Blackout
[The diagram also shows a State element]
Critical Shipment Logistics
Detect: Monitor shipment progress and various related alerts (traffic, cargo handling time at airport, carriers being late)
Predict: According to the current route, the shipment will be 3 hours late and we will incur a high penalty
Decide: Find an alternative route which (given the new conditions) is faster than the previous route
Act: Generate cargo reservations, reroute shipment
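The four stages of the shipment scenario can be sketched as a chain of small functions. All names, the additive delay model, and the numbers are hypothetical; a real system would replace `predict` with a predictive model and `decide` with a route planner.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Route:
    name: str
    remaining_hours: float


def detect(alerts: List[Dict[str, float]]) -> float:
    """Derive the accumulated delay (hours) from the monitored alerts."""
    return sum(a["delay_hours"] for a in alerts)


def predict(route: Route, delay_hours: float, hours_to_deadline: float) -> float:
    """Predicted lateness in hours; 0 if the shipment is expected on time."""
    return max(route.remaining_hours + delay_hours - hours_to_deadline, 0.0)


def decide(lateness: float, alternatives: List[Route],
           hours_to_deadline: float) -> Optional[Route]:
    """Pick the fastest alternative that still meets the deadline, if one is needed."""
    if lateness == 0:
        return None
    feasible = [r for r in alternatives if r.remaining_hours <= hours_to_deadline]
    return min(feasible, key=lambda r: r.remaining_hours) if feasible else None


def act(route: Optional[Route]) -> List[str]:
    """Turn the decision into concrete actions."""
    if route is None:
        return []
    return [f"reserve cargo on {route.name}", f"reroute shipment via {route.name}"]


current = Route("via hub A", remaining_hours=10.0)
alerts = [{"delay_hours": 1.0}, {"delay_hours": 2.0}]   # e.g. traffic, cargo handling
alternatives = [Route("via hub B", 9.0), Route("via hub C", 12.0)]

delay = detect(alerts)
lateness = predict(current, delay, hours_to_deadline=10.0)
choice = decide(lateness, alternatives, hours_to_deadline=10.0)
actions = act(choice)
```

With these made-up inputs the predicted lateness is 3 hours, the route via hub B is chosen, and two actions are issued before the delay actually materializes.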
Personal reschedule
Detect: I left the house 20 minutes late; there are three spots of traffic congestion on the way to the office; it is raining; and I have an important meeting in 25 minutes!
Predict: I am not going to get to the meeting, not even close!
Decide: Check whether there is a qualified person who can replace me in this meeting and who has a lower-priority task for its duration, and reschedule his/her other obligations; alternatively, check whether there is another time slot later in the day to which the meeting can be rescheduled, and get a decision!
Act: Notify all involved of their reschedule.
Electric car – battery replacement overload
Detect: Track the cars driving within a certain area and their battery status.
Predict: In 2 hours the service stations in the area will be out of charged batteries.
Decide: Whether there are available spare batteries nearby that can be shipped by car, or whether a helicopter needs to be dispatched to ship batteries from the central store.
Act: Load batteries on the selected means of transportation and start the journey!
Background: A company leases electric cars that can drive up to 100 miles; it provides both personal and public battery charge spots, and robotic battery replacement service stations as part of the lease.
Portfolio tuning
Detect: Track corporate actions, news, exchange prices, and rumors about all securities in my portfolio
Predict: My portfolio is going to exceed my personal risk limit within 1 hour
Decide: Mark the securities to be sold and the best timing to sell; find an alternative to buy that retains the risk limit.
Act: Buy/Sell orders
Proactive Event-Driven Computing (2)
[Diagram components: Events; Predict; Probabilistic events; Decide; Actions; Analytics]
Predict:
• Uncertain rules
• Bayesian network
• Classifiers:
  • Decision trees
  • Naïve Bayes
  • …
• …
Decide:
• Temporal decision process
• Optimization tools (black box)
Event Processing DM vs. AI DM
EP: scalable decision making over large streams of online information
AI: state-based, decision-theoretic deliberation
EP+AI: EP synthesizes streams into meaningful bits of information; AI operates on the reduced state space
Decision Making for Proactive E-D Computing
• Decision rules: EPAs that react to future events
• Markov Decision Process
  • Model for policy optimization under uncertainty
  • Model must be updated when the predictive EP modules predict relevant future events
  • Requires online adjustment of the policy
  • Brafman, Domshlak, Engel, and Feldman, AAAI 2011
• External optimization tools
  • E.g., route planner for the logistics scenario
  • Parameterization, or shared resource information
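The Markov Decision Process bullet can be illustrated with a minimal sketch. It assumes a toy version of the shipment scenario: the state and action names, the probabilities, and the penalties are all hypothetical, and the policy is simply re-solved from scratch with plain value iteration whenever a predicted event changes the model. This is not the method of the cited AAAI 2011 paper, only an illustration of why a predicted event can change the optimal policy.

```python
def value_iteration(transitions, rewards, gamma=0.95, sweeps=50):
    """Return (policy, values) for a small, explicitly enumerated MDP.

    transitions[state][action] -> list of (probability, next_state)
    rewards[(state, action, next_state)] -> immediate reward (default 0.0)
    States that never appear as keys of `transitions` are terminal.
    """
    values = {s: 0.0 for s in transitions}
    for actions in transitions.values():
        for outcomes in actions.values():
            for _, nxt in outcomes:
                values.setdefault(nxt, 0.0)

    def q(state, action):
        # Expected immediate reward plus discounted value of the successor.
        return sum(p * (rewards.get((state, action, nxt), 0.0) + gamma * values[nxt])
                   for p, nxt in transitions[state][action])

    for _ in range(sweeps):
        for state, actions in transitions.items():
            values[state] = max(q(state, a) for a in actions)

    policy = {s: max(acts, key=lambda a: q(s, a)) for s, acts in transitions.items()}
    return policy, values


def shipment_model(p_late):
    """Hypothetical one-decision model; p_late comes from the predictive module."""
    transitions = {"en_route": {
        "keep_route": [(1 - p_late, "on_time"), (p_late, "late")],
        "reroute": [(0.95, "on_time"), (0.05, "late")],
    }}
    rewards = {
        ("en_route", "keep_route", "late"): -10.0,  # lateness penalty
        ("en_route", "reroute", "on_time"): -2.0,   # rerouting cost
        ("en_route", "reroute", "late"): -12.0,     # rerouting cost + penalty
    }
    return transitions, rewards


# Before any prediction, delays are assumed to be unlikely.
policy_before, _ = value_iteration(*shipment_model(p_late=0.1))
# A predicted "shipment will be late" event raises the delay probability.
policy_after, _ = value_iteration(*shipment_model(p_late=0.8))
print(policy_before["en_route"], policy_after["en_route"])
```

Re-solving the whole model is only viable for tiny state spaces; the slide's point about online adjustment of the policy is precisely that real systems need something cheaper.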
Proactive EP: Challenges to the EPN
Event Life Span
Response from Actuators
Multiple Proactive Agents
State Driven vs. Event-Driven
Challenges
Inexact Event Processing
Predictive Event Processing
Use of Machine Learning
From Reactive to Proactive
Correctness
Correctness
The ability of a developer to create a correct implementation for all cases (including the boundaries)
Observation: In many of today's tools, a substantial amount of effort is invested in working around the inability of the language to easily create correct solutions.
Some correctness topics
The right interpretation of language constructs
The right order of events
The right classification of events to windows
The right interpretation of language constructs – example
All (E1, E2) – what do we mean?
Example 1: A customer both sells and buys the same security in value of more than $1M within a single day.
  10:00 Buy, amount: $2M
  11:02 Sell, amount: $7.8M
  13:35 Buy, amount: $10.6M
Example 2: Deal fulfillment: package arrival and payment arrival.
  [Timeline: 6/3 10:00; 7/3 11:00; 8/3 11:00; 8/3 14:00]
Fine tuning of the semantics (I)
When should the derived event be emitted?
When the Pattern is matched?
At the window end?
Fine tuning of the semantics (II)
How many instances of derived events should be emitted?
Only once?
Every time there is a match?
Fine tuning of the semantics (III)
What happens if the same type of event occurs several times?
Does only one instance participate: the first, the last, or the one with the higher/lower value on some predicate?
Do all of them participate in a match?
Fine tuning of the semantics (IV)
Can we consume or reuse events that participate in a match?
Fine tuning of semantics – conclusion
Some languages have explicit policies. Example: CCL KEEP policies
– KEEP LAST PER Id
– KEEP 3 MINUTES
– KEEP EVERY 3 MINUTES
– KEEP UNTIL ("MON 17:00:00")
– KEEP 10 ROWS
– KEEP LAST ROW
– KEEP 10 ROWS PER Symbol
In other cases, explicit programming and workarounds are used if the intended semantics differ from the default semantics.
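The semantic choices raised on the "fine tuning" slides can be made concrete with a small sketch. It evaluates an All(E1, E2) pattern at window end over the buy/sell example, under a hypothetical `selection` policy (first, last, or each instance) and a hypothetical `reuse` flag (whether an event may participate in more than one match). The function name, parameter names, and event layout are illustrative assumptions, not the API of any particular event processing language.

```python
from itertools import product


def match_all(events, type_a, type_b, selection="first", reuse=False):
    """Evaluate All(type_a, type_b) over one window and return the matches.

    selection: "first" or "last" picks a single instance of each type;
               "each" lets every instance participate.
    reuse:     with "each", True lets an event take part in several matches,
               False consumes an event once it has been matched.
    """
    a = [e for e in events if e["type"] == type_a]
    b = [e for e in events if e["type"] == type_b]
    if not a or not b:
        return []  # the pattern needs at least one event of each type
    if selection == "first":
        return [(a[0], b[0])]
    if selection == "last":
        return [(a[-1], b[-1])]
    if selection == "each":
        return list(product(a, b)) if reuse else list(zip(a, b))
    raise ValueError(f"unknown selection policy: {selection}")


# The three trades from the example slide (amounts in millions of dollars).
window = [
    {"type": "Buy", "time": "10:00", "amount": 2.0},
    {"type": "Sell", "time": "11:02", "amount": 7.8},
    {"type": "Buy", "time": "13:35", "amount": 10.6},
]

first = match_all(window, "Buy", "Sell", selection="first")
last = match_all(window, "Buy", "Sell", selection="last")
every = match_all(window, "Buy", "Sell", selection="each", reuse=True)
consumed = match_all(window, "Buy", "Sell", selection="each", reuse=False)
print(len(first), len(last), len(every), len(consumed))
```

The same three input events yield one or two derived events, built from different buy orders, depending only on the policy. That is the ambiguity the slides ask a language to resolve explicitly.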
The right order of events - scenario
Bid scenario – ground rules:
1. All bidders that issued a bid within the validity interval participate in the bid.
2. The highest bid wins. In the case of a tie between bids, the first accepted bid wins the auction.

Trace:
===Input Bids===
Bid Start 12:55:00
credit bid id=2,occurrence time=12:55:32,price=4
cash bid id=29,occurrence time=12:55:33,price=4
cash bid id=33,occurrence time=12:55:34,price=3
credit bid id=66,occurrence time=12:55:36,price=4
credit bid id=56,occurrence time=12:55:59,price=5
Bid End 12:56:00
===Winning Bid===
cash bid id=29,occurrence time=12:55:33,price=4

Race conditions:
Between events
Between events and window start/end
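As a rough illustration of the two race conditions, the sketch below applies the ground rules to the trace in two ways. The first orders bids strictly by occurrence time. The second processes them in a purely hypothetical arrival order, in which bids 2 and 29 swap places and bid 56 arrives after the Bid End event. The arrival order is an assumption chosen to reproduce the winner shown in the trace; the slide does not state what actually happened. Treating the validity interval as half-open is also an assumption.

```python
BIDS = [
    {"id": 2, "kind": "credit", "time": "12:55:32", "price": 4},
    {"id": 29, "kind": "cash", "time": "12:55:33", "price": 4},
    {"id": 33, "kind": "cash", "time": "12:55:34", "price": 3},
    {"id": 66, "kind": "credit", "time": "12:55:36", "price": 4},
    {"id": 56, "kind": "credit", "time": "12:55:59", "price": 5},
]
BID_START, BID_END = "12:55:00", "12:56:00"


def winner_by_occurrence(bids, start, end):
    """Highest price wins; ties go to the earliest occurrence time."""
    valid = [b for b in bids if start <= b["time"] < end]
    return min(valid, key=lambda b: (-b["price"], b["time"]))


def winner_by_arrival(arrivals):
    """Process items in arrival order; anything after "BID_END" is ignored."""
    best = None
    for item in arrivals:
        if item == "BID_END":
            break
        # A strictly higher price is required, so the first arrival wins ties.
        if best is None or item["price"] > best["price"]:
            best = item
    return best


by_id = {b["id"]: b for b in BIDS}
# Hypothetical arrival order: 29 overtakes 2, and 56 arrives after Bid End.
arrivals = [by_id[29], by_id[2], by_id[33], by_id[66], "BID_END", by_id[56]]

print(winner_by_occurrence(BIDS, BID_START, BID_END)["id"])
print(winner_by_arrival(arrivals)["id"])
```

Under occurrence-time ordering the price-5 bid wins, whereas the arrival-order run ends with bid 29, the winner reported in the trace.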
Ordering in a distributed environment - possible issues
Even if the occurrence time of an event is accurate, the event might arrive after some processing has already been done.
If we use the occurrence time of an event as reported by the source, it might not be accurate, due to the accuracy of the clock at the source.
Most systems order events by detection time, but events may switch their order on the way.
Clock accuracy in the source
Clock synchronization
Time server, example: http://tf.nist.gov/service/its.htm
Buffering technique
Assumptions:
Events are reported by the producers as soon as they occur;
The delay in reporting events to the system is relatively small, and can be bounded by a time-out offset;
Events arriving after this time-out can be ignored.
[Diagram: Producers → Sorted buffer (by occurrence time) → Event processing; an event with occurrence time To is released when t > To + τ]
Principles: Let τ be the time-out offset. According to the assumptions, it is safe to assume that at any time point t, all events whose occurrence time is earlier than t − τ have already arrived. Each event whose occurrence time is To is then kept in the buffer until To + τ, at which time the buffer can be sorted by occurrence time, and the events can be processed in this sorted order.
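The buffering principle might be sketched as follows. This is a minimal illustration assuming numeric timestamps and a caller that supplies the current time; the class name, method names, and the heap-based implementation are all assumptions rather than part of the tutorial. Here `tau` stands for the time-out offset.

```python
import heapq
from itertools import count


class SortedBuffer:
    """Hold events until their time-out has passed, then release them
    in occurrence-time order."""

    def __init__(self, tau):
        self.tau = tau          # time-out offset
        self._heap = []         # min-heap keyed by occurrence time
        self._seq = count()     # tie-breaker so payloads are never compared
        self.ignored = []       # events that arrived after their time-out

    def offer(self, occurrence_time, payload, now):
        """Accept an event reported at time `now`."""
        if now > occurrence_time + self.tau:
            self.ignored.append(payload)  # too late: ignore, per the assumptions
            return
        heapq.heappush(self._heap, (occurrence_time, next(self._seq), payload))

    def release(self, now):
        """Return, in occurrence order, all events with now > To + tau."""
        ready = []
        while self._heap and now > self._heap[0][0] + self.tau:
            occurrence_time, _, payload = heapq.heappop(self._heap)
            ready.append((occurrence_time, payload))
        return ready


buf = SortedBuffer(tau=5)
buf.offer(12, "b", now=13)       # arrives first, occurred second
buf.offer(10, "a", now=14)       # arrives second, occurred first
early = buf.release(now=16)      # only "a" has passed its time-out
later = buf.release(now=20)      # now "b" is released as well
buf.offer(1, "stale", now=20)    # far beyond its time-out: ignored
print(early, later, buf.ignored)
```

The price of the technique is latency: every event is delayed by at least the time-out offset, even when nothing arrived out of order.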
Retrospective compensation
[Diagram: an event processing network with Event Producers 1–2, Agents 1–3, a channel, state, and Event Consumers 1–3. An out-of-order event triggers recalculation and retraction of previous EPA results.]
Not always possible!
Classification to windows - scenario
Calculate statistics for each player (aggregate per quarter)
Calculate statistics for each team (aggregate per quarter)
Window classification:
Player statistics are calculated at the end of each quarter.
Team statistics are calculated at the end of each quarter, based on the player events that arrived within the same quarter.
All instances of player statistics that occur within a quarter window must be classified to the same window, even if they are derived after the window termination.
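One way to satisfy this requirement is to classify derived events by the occurrence time they refer to rather than by the time they were derived. The sketch below assumes hypothetical 12-minute quarters measured in game minutes and a hypothetical event layout with `occurrence` and `derived_at` fields; none of these details come from the slides.

```python
from bisect import bisect_right
from collections import defaultdict

# Start minute of each quarter (assumed 12-minute quarters).
QUARTER_STARTS = [0, 12, 24, 36]


def quarter_of(minute):
    """Map a game minute to its quarter number (1-4)."""
    return bisect_right(QUARTER_STARTS, minute)


def team_points_per_quarter(player_stats):
    """Aggregate player statistics per (team, quarter).

    Classification uses `occurrence`, the time the statistic refers to,
    and deliberately ignores `derived_at`, the time it was produced.
    """
    totals = defaultdict(int)
    for event in player_stats:
        totals[(event["team"], quarter_of(event["occurrence"]))] += event["points"]
    return dict(totals)


player_stats = [
    {"team": "A", "player": 7, "points": 7, "occurrence": 5.0, "derived_at": 5.0},
    # Refers to the first quarter but was derived after that quarter ended.
    {"team": "A", "player": 9, "points": 10, "occurrence": 11.5, "derived_at": 12.2},
    {"team": "A", "player": 7, "points": 3, "occurrence": 13.0, "derived_at": 13.0},
]

totals = team_points_per_quarter(player_stats)
print(totals)
```

Classifying by `derived_at` instead would move the second event into the second quarter, which is exactly the misclassification the slide warns against.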
V: Summary
Event processing is an emerging technology
Already attracted coverage by analysts and all major software vendors
It has barely scratched the surface of its potential
Potential for mutually beneficial interaction with AI
Make the next generation
A vehicle to substantially change the world
[Diagram label: Event Patterns]
REFERENCES (StoA of Event Processing)
Opher Etzion and Peter Niblett, Event Processing in Action, Manning, 2010.
Mani Chandy and Roy Schulte, Event Processing: Designing IT Systems for Agile Companies, McGraw Hill, 2009.
David Luckham, The Power of Events: An Introduction to Complex Event Processing in Distributed Enterprise Systems, Addison-Wesley, 2002.
Gianpaolo Cugola and Alessandro Margara, Processing Flows of Information: From Data Stream to Complex Event Processing, to appear in ACM Computing Surveys. Available through: http://home.dei.polimi.it/margara/papers/survey.pdf
REFERENCES (Challenges Section)
H. Mannila, H. Toivonen, and A. Inkeri Verkamo, Discovery of frequent episodes in event sequences, Data Mining and Knowledge Discovery, 1997.
J.L. Hellerstein, S. Ma, and C.S. Perng, Discovering actionable patterns in event data, IBM Systems Journal, 2002.
R.I. Brafman, C. Domshlak, Y. Engel, and Z. Feldman, Planning for Operational Control Systems with Predictable Exogenous Events, AAAI 2011.
Y. Engel and O. Etzion, Towards Proactive Event Driven Computing, DEBS 2011.