Daniele Dell’Aglio
On Unified Stream Reasoning
The RDF Stream Processing realm
Daniele Dell’Aglio
WU Vienna, 18/02/2016
Problem setting
Real time integration of huge volumes of dynamic data from
heterogeneous sources
– Traffic Prediction
– Social media analytics
– Personalised services
Stream Reasoning
Stream Reasoning (SR): inference over streams of data
– Stream and Event Processing: real-time processing of
highly dynamic data
• Aggregations, filters
• Complex event detection
– Reasoning
• Access and integration of heterogeneous data
• Make explicit hidden information
A non-comprehensive view on the
Stream Reasoning research so far
[Figure: map of Europe, from Wikipedia]
The initial problem (1)
Where are Alice and Bob,
when they are together?
Let’s consider a tumbling
window W(ω=β=5)
Let’s execute the
experiment 4 times
Execution 1° answer 2° answer
1 :hall [6] :kitchen [11]
2 :hall [5] :kitchen [10]
3 :hall [6] :kitchen [11]
4 - [7] - [12]
Stream S (items along the time axis t):
S1 at t=1: {:alice :isIn :hall}
S2 at t=3: {:bob :isIn :hall}
S3 at t=6: {:alice :isIn :kitchen}
S4 at t=9: {:bob :isIn :kitchen}
Which is the correct answer?
The initial problem (2)
Stream S: four items S1 to S4 at t = 1, 3, 6 and 9
{:alice :isIn :hall} {:bob :isIn :hall} {:alice :isIn :kitchen} {:bob :isIn :kitchen}
System 1
Execution 1° answer 2° answer
1 :hall [6] :kitchen [11]
2 :hall [5] :kitchen [10]
3 :hall [6] :kitchen [11]
4 - [7] - [12]
System 2
Execution 1° answer 2° answer
1 :hall [3] :kitchen [9]
2 No answers
3 :hall [3] :kitchen [9]
4 No answers
Which system behaves in the correct way?
Problem
How to unify current Stream Reasoning techniques?
Why do we need it?
• Comparison and contrast
• Interoperability
• Study RDF Stream Processing related problems
• Standard RSP query language
Contribution – RSEP-QL
A comprehensive model that formally defines the semantics of
RDF Stream Processing engines
[Figure: overview of the model, with these labels]
• Input data: streams, ontology, background data
• RSP-QL: BGP evaluation over streams, BGP evaluation over BKG
• RSEP-QL: event pattern detection operators
• Model to express continuous queries
• Entailment regimes: they require an ontology and provide more answers w.r.t. both RSP-QL and RSEP-QL (not part of today’s talk!)
• Applications
From SPARQL…
Q (E, DS, QF)
[Figure: SPARQL query processing, with these labels]
• Query Interface: receives Q
• Evaluator: E
• Data layer: DS (RDF graphs)
• Result Formatter: QF, returns Ans(Q)
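…to RSEP-QL
The query definition changes step by step:
Q (E, DS, QF)
Q (E, SDS, QF)
Q (E, SDS, ET, QF)
Q (SE, SDS, ET, QF)
[Figure: RSEP-QL query processing, with these labels]
• Query Interface: receives Q
• Continuous Evaluator: SE, ET
• Data layer: SDS (RDF graphs and RDF streams)
• Result Formatter: QF, returns Ans(Q)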
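Streaming Dataset – Time-based sliding window: 𝕎(ω,β,t0)
The stream S is a sequence of timestamped graphs (stream items) S1, S2, ..., S11
Example: 𝕎(3,1,1)
• ω is the width
• β is the slide
• t0 is the starting time instant
[Figure: time axis t with the items S1 to S11 of the stream S and the windows of 𝕎(3,1,1), starting at t0]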
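A minimal sketch of how 𝕎(ω,β,t0) can be read operationally. It assumes that each window is the half-open interval [o, c) with o = t0 + k·β and c = o + ω, a choice that agrees with the answers of the Alice and Bob example in this talk. The function names and the one-item-per-time-unit stream are made up for illustration.

```python
from typing import Iterator, List, Tuple

Item = Tuple[str, int]  # (item name, timestamp); hypothetical representation


def windows(width: int, slide: int, t0: int, until: int) -> Iterator[Tuple[int, int]]:
    """Yield the (open, close) bounds of every window that opens before `until`."""
    o = t0
    while o < until:
        yield (o, o + width)
        o += slide


def content(stream: List[Item], o: int, c: int) -> List[str]:
    """Names of the stream items whose timestamp falls in [o, c) (assumed bounds)."""
    return [name for name, ts in stream if o <= ts < c]


# Made-up stream: S1 at t=1, S2 at t=2, ..., S6 at t=6.
stream = [(f"S{i}", i) for i in range(1, 7)]

# W(3,1,1): width 3, slide 1, first window opens at t0 = 1.
bounds = list(windows(width=3, slide=1, t0=1, until=4))
contents = [content(stream, o, c) for o, c in bounds]
```

With ω=β the same functions describe a tumbling window, since consecutive windows then do not overlap.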
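Streaming Dataset – Landmark window: 𝕃(t0)
The stream S is a sequence of timestamped graphs (stream items) S1, S2, ..., S11
Example: 𝕃(2)
[Figure: time axis t with the items S1 to S11 of the stream S and the landmark window 𝕃(2), which opens at t0]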
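A minimal sketch of the landmark window, assuming that at time t it contains every item with timestamp between t0 and t, both included. The bounds, the function name and the stream are assumptions made for illustration.

```python
from typing import List, Tuple

Item = Tuple[str, int]  # (item name, timestamp); hypothetical representation


def landmark_content(stream: List[Item], t0: int, t: int) -> List[str]:
    """Items seen from the fixed start t0 up to the current time t (assumed inclusive)."""
    return [name for name, ts in stream if t0 <= ts <= t]


# Made-up stream: S1 at t=1, S2 at t=2, ..., S6 at t=6.
stream = [(f"S{i}", i) for i in range(1, 7)]

# L(2): the window opens at t0 = 2 and only its end moves with time.
at_3 = landmark_content(stream, t0=2, t=3)
at_5 = landmark_content(stream, t0=2, t=5)
```

Unlike the sliding window, the opening bound never moves, so the content can only grow.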
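From SPARQL dataset to RSEP-QL Streaming Dataset
SPARQL dataset: RDF graphs G, H
Time-Varying Graph: G: T → R, with T ⊆ ℕ and R = {RDF graph}
Instantaneous Graph: G(t1) ∈ R, the RDF graph at the time instant t1
RSP-QL dataset: time-varying graphs and windows over streams, e.g. 𝕎(S)
[Figure: the stream S with the items S1 to S12 and the window 𝕎(S)]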
Evaluation
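The SPARQL evaluation function is defined as
$\llbracket P \rrbracket_{DS(G)}$
The RSEP-QL evaluation function extends the SPARQL one by
introducing the evaluation time instant t
$\llbracket P \rrbracket^{t}_{SDS(A)}$
SPARQL operators are extended straightforwardly to the new evaluation
function
Example: JOIN
$\llbracket JOIN(P_1, P_2) \rrbracket^{t}_{SDS(A)} = \llbracket P_1 \rrbracket^{t}_{SDS(A)} \bowtie \llbracket P_2 \rrbracket^{t}_{SDS(A)}$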
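The JOIN case can be sketched as follows. The join of two bags of compatible solution mappings is the standard SPARQL one; the only point made here is that both operands are results computed at the same time instant t. The bindings are made up for illustration.

```python
from typing import Dict, List

Mapping = Dict[str, str]  # solution mapping: variable name -> value


def compatible(m1: Mapping, m2: Mapping) -> bool:
    """Two mappings are compatible when they agree on every shared variable."""
    return all(m2[var] == val for var, val in m1.items() if var in m2)


def join(left: List[Mapping], right: List[Mapping]) -> List[Mapping]:
    """Bag join: merge every pair of compatible mappings."""
    return [{**m1, **m2} for m1 in left for m2 in right if compatible(m1, m2)]


# Hypothetical results of P1 and P2, both evaluated at the same instant t.
p1_at_t = [{"?p": ":alice", "?room": ":hall"}]
p2_at_t = [
    {"?q": ":bob", "?room": ":hall"},
    {"?q": ":carl", "?room": ":kitchen"},
]
joined = join(p1_at_t, p2_at_t)
```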
Instantaneous evaluation
The main difference is in the BGP evaluation:
$\llbracket BGP \rrbracket^{t}_{SDS(A)} = \llbracket BGP \rrbracket_{SDS(A,t)}$
SDS(A,t) is:
SDS(G,t)= SDS(G(t)) if A is a time-varying graph G
SDS(𝕎(S),t)=SDS(m(𝕎(S,t))) if A is from a sliding window 𝕎
SDS(𝕃(S),t)=SDS(m(𝕃(S,t))) if A is from a landmark window 𝕃
where m denotes a merge function
$m(\mathbb{W}(S,t)) = \bigcup_{(d_i, t_i) \in \mathbb{W}(S,t)} d_i$
– takes as input a window content i.e. a sequence of timestamped
RDF graphs
– produces an RDF graph
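The case analysis for SDS(A,t) can be sketched as below. The classes `TimeVaryingGraph` and `Window`, and the way the window content is computed in the example, are assumptions made for illustration; the merge function is taken to be the union of the graphs in the window content.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Tuple, Union

Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]
WindowContent = List[Tuple[Graph, int]]  # sequence of (graph, timestamp)


@dataclass
class TimeVaryingGraph:
    at: Callable[[int], Graph]  # G: T -> R


@dataclass
class Window:
    content_at: Callable[[int], WindowContent]  # window content at time t


def merge(window_content: WindowContent) -> Graph:
    """m: union of all the graphs in the window content."""
    merged = set()
    for graph, _timestamp in window_content:
        merged |= graph
    return frozenset(merged)


def active_graph(a: Union[TimeVaryingGraph, Window], t: int) -> Graph:
    """Graph against which a BGP is evaluated at time t."""
    if isinstance(a, TimeVaryingGraph):
        return a.at(t)               # G(t)
    return merge(a.content_at(t))    # m(W(S,t)) or m(L(S,t))


alice_hall = frozenset({(":alice", ":isIn", ":hall")})
bob_hall = frozenset({(":bob", ":isIn", ":hall")})
stream = [(alice_hall, 1), (bob_hall, 3)]

# Hypothetical window that keeps every item seen up to t.
w = Window(content_at=lambda t: [(g, ts) for g, ts in stream if ts <= t])
g1 = active_graph(w, 1)
g3 = active_graph(w, 3)
```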
Continuous evaluation
For each evaluation time t ∈ ET: $\llbracket SE \rrbracket^{t}_{SDS(A)}$
– The continuous evaluation is a sequence of instantaneous
evaluations
It is not always possible to compute ET a priori
– Can be data dependent
– ET is expressed through a Report Policy
A Report Policy is a set of conditions associated with one or more
window operators in SDS
– Initially defined in SECRET for Stream Processing engines
Continuous evaluation – Report Policies
Report Policy examples:
– P (Periodic): the window reports only at regular intervals
– WC (Window Close): the window reports when the active
window closes
– CC (Content Change): the window reports when its content
changes
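The three policies can be read as predicates that decide whether a time instant belongs to ET. The sketch below is a simplified reading, not the SECRET definitions: it assumes integer time, windows of the form [o, o + width), and a window content that only grows. All names and the arrival times are made up for illustration.

```python
from typing import FrozenSet


def periodic(t: int, period: int, t0: int = 0) -> bool:
    """P: report every `period` time units, starting at t0."""
    return t >= t0 and (t - t0) % period == 0


def window_close(t: int, width: int, slide: int, t0: int = 0) -> bool:
    """WC: report when a window [o, o + width) with o = t0 + k*slide closes at t."""
    o = t - width
    return o >= t0 and (o - t0) % slide == 0


def content_change(previous: FrozenSet[int], current: FrozenSet[int]) -> bool:
    """CC: report when the window content differs from the previous instant."""
    return previous != current


ARRIVALS = [1, 3, 6, 9]  # hypothetical timestamps of the stream items


def content(t: int) -> FrozenSet[int]:
    """Simplified window content: every item that arrived up to t."""
    return frozenset(ts for ts in ARRIVALS if ts <= t)


et_periodic = [t for t in range(13) if periodic(t, period=4)]
et_close = [t for t in range(13) if window_close(t, width=5, slide=5, t0=1)]
et_change = [t for t in range(1, 13) if content_change(content(t - 1), content(t))]
```

Under content change ET depends on the arrival times of the data, which is one reason why ET cannot always be computed a priori.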
Event Processing – Basic Event Pattern
Support to Complex Event Processing operators
The minimal element is the Basic Event Pattern:
EVENT 𝑤 𝑃
Intuitively, the Basic Graph Pattern 𝑃 should match against one
stream item of the window identified by 𝑤
BEP can be combined through complex operators
• SEQ, LAST, EVERY
Example:
EVENT 𝑤1 𝑃1 SEQ EVERY EVENT 𝑤2 𝑃2
Event Processing – Evaluation semantics
Formally, we use a new evaluation function $\llparenthesis \cdot \rrparenthesis^{t}_{o,c}$
• t is the evaluation time instant,
• the pair o, c delimits an additional window that identifies the portion of the
data in which the event may happen
Event pattern evaluation produces event mappings $(\mu, t_1, t_2)$
• $\mu$ is a solution mapping
• $t_1$ and $t_2$ denote the time interval justifying $\mu$
Event Processing – Evaluation semantics - Examples
The evaluation of EVENT 𝑤1 𝑃1 SEQ EVERY EVENT 𝑤2 𝑃2 is
[Figure: time axis t from 10 to 16 with the stream items S1 to S12, the match of EVENT 𝑤1 𝑃1 and the matches of EVERY EVENT 𝑤2 𝑃2]
Matches found (items, time interval):
S1 S10 [11, 13]
S1 S12 [11, 15]
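A simplified sketch of how the two matches above can arise. It assumes that SEQ requires the first event to end strictly before the second one starts, that the two solution mappings must be compatible, and that EVERY keeps all the matches of the second pattern. This is an informal reading and not the formal RSEP-QL semantics; the bindings are made up for illustration.

```python
from typing import Dict, List, Tuple

Mapping = Dict[str, str]
EventMapping = Tuple[Mapping, int, int]  # (mu, t1, t2)


def compatible(m1: Mapping, m2: Mapping) -> bool:
    """Two mappings are compatible when they agree on every shared variable."""
    return all(m2[var] == val for var, val in m1.items() if var in m2)


def seq_every(first: List[EventMapping], second: List[EventMapping]) -> List[EventMapping]:
    """Pair each match of `first` with every later, compatible match of `second`."""
    result = []
    for mu1, start1, end1 in first:
        for mu2, start2, end2 in second:
            if end1 < start2 and compatible(mu1, mu2):
                result.append(({**mu1, **mu2}, start1, end2))
    return result


# Hypothetical matches: P1 matches one item at t=11, P2 matches items at t=13 and t=15.
p1_matches = [({"?x": ":a"}, 11, 11)]
p2_matches = [({"?y": ":b"}, 13, 13), ({"?y": ":c"}, 15, 15)]

result = seq_every(p1_matches, p2_matches)
intervals = [(t1, t2) for _mu, t1, t2 in result]
```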
Event Processing – MATCH graph pattern
Event patterns are enclosed in MATCH graph patterns
• Event mappings exist only in the context of event patterns
• The evaluation of a MATCH graph pattern produces a bag of
solution mappings
$\llbracket MATCH\ E \rrbracket^{t}_{SDS(A)} = \{\mu \mid (\mu, t_1, t_2) \in \llparenthesis E \rrparenthesis^{t}_{0,t}\}$
It is possible to combine the MATCH graph pattern with other
SPARQL graph patterns
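In code, MATCH amounts to dropping the time interval from each event mapping. The sketch below uses a list as a bag, so two event mappings with the same μ but different intervals yield two copies of μ. The sample event mappings are made up for illustration.

```python
from typing import Dict, List, Tuple

Mapping = Dict[str, str]
EventMapping = Tuple[Mapping, int, int]  # (mu, t1, t2)


def match(event_mappings: List[EventMapping]) -> List[Mapping]:
    """Project event mappings onto their solution mappings, keeping duplicates."""
    return [mu for mu, _t1, _t2 in event_mappings]


# Hypothetical result of an event pattern evaluation.
event_mappings = [
    ({"?room": ":hall"}, 11, 13),
    ({"?room": ":hall"}, 11, 15),
]
solutions = match(event_mappings)
```

Since the output is a bag of plain solution mappings, it can be joined with the results of other SPARQL graph patterns in the usual way.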
RSP output correctness
Stream S: four items S1 to S4 at t = 1, 3, 6 and 9
{:alice :isIn :hall} {:bob :isIn :hall} {:alice :isIn :kitchen} {:bob :isIn :kitchen}
Observed answers:
Execution 1° answer 2° answer
1 :hall [6] :kitchen [11]
2 :hall [5] :kitchen [10]
3 :hall [6] :kitchen [11]
4 - [7] - [12]
Answers of the model, by starting time t0 of the window:
Window 1° answer 2° answer
t0=0 :hall [5] :kitchen [10]
t0=1 :hall [6] :kitchen [11]
t0=2 - [7] - [12]
[Figure: the stream S with the windows obtained for t0=0, t0=1 and t0=2]
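The table by t0 can be reproduced with a few lines of code. The sketch assumes a tumbling window of width 5 with half-open bounds [o, c), a report at each window close, and the item timestamps 1, 3, 6 and 9 in the order alice/hall, bob/hall, alice/kitchen, bob/kitchen. These are assumptions that happen to agree with the figures above, not a statement about how any engine is implemented.

```python
from typing import List, Tuple

# (person, room, timestamp); the assignment of items to timestamps is assumed.
STREAM = [
    (":alice", ":hall", 1),
    (":bob", ":hall", 3),
    (":alice", ":kitchen", 6),
    (":bob", ":kitchen", 9),
]


def together(items: List[Tuple[str, str, int]]) -> List[str]:
    """Rooms where both Alice and Bob are, according to the merged window content."""
    alice = {room for who, room, _ts in items if who == ":alice"}
    bob = {room for who, room, _ts in items if who == ":bob"}
    return sorted(alice & bob)


def answers(t0: int, width: int = 5, n_windows: int = 2) -> List[Tuple[List[str], int]]:
    """Answer and report time (window close) for the first tumbling windows."""
    out = []
    for k in range(n_windows):
        o = t0 + k * width
        c = o + width
        items = [item for item in STREAM if o <= item[2] < c]
        out.append((together(items), c))
    return out


table = {t0: answers(t0) for t0 in (0, 1, 2)}
```

The three values of t0 give the three behaviours seen in the four executions, which suggests that the executions differ only in the instant at which the first window opens.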
RSP output correctness
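Stream S: four items S1 to S4 at t = 1, 3, 6 and 9
{:alice :isIn :hall} {:bob :isIn :hall} {:alice :isIn :kitchen} {:bob :isIn :kitchen}
System 1
Execution 1° answer 2° answer
1 :hall [6] :kitchen [11]
2 :hall [5] :kitchen [10]
3 :hall [6] :kitchen [11]
4 - [7] - [12]
System 2
Execution 1° answer 2° answer
1 :hall [3] :kitchen [9]
2 No answers
3 :hall [3] :kitchen [9]
4 No answers
The two systems differ in:
• Report policy: window close (System 1 answers when a window closes) vs content change (System 2 answers when an item arrives)
• Empty relation notification (yes|no): System 1 also reports empty answers, System 2 does not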
Correctness Assessment
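[Figure: correctness assessment workflow, with these labels]
• Online: R
• Offline: RSP-QL model of R, with a data importer, a query transformer, a SPARQL engine (over DS) and a result matcher
• Query: Q (E, SDS, ET, QF)
• Output: correctness assessment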
What’s next?
An RSEP-QL query language
• W3C RSP CG ongoing activities
Implementations
• Yet another RSP engine
• Framework to let existing RSP engines interoperate
Streams are getting popular – applications want more and more
sophisticated features
• Different timestamps, out-of-order items
• Inductive reasoning to cope with noise
• Permanent storage of portions of data (raw or inferred)
Conclusions
The dynamics introduced in the continuous query evaluation
process have not been fully understood
• Not fully captured by existing models
• RSEP-QL captures those dynamics
• All of them? Let’s find out!
We need to push implementations and applications on use cases
• To understand which helpful operators are missing
• To find new unexpected behaviours
People I am grateful to...
Emanuele Della Valle
and:
Marco Balduini
Jean-Paul Calbimonte
Oscar Corcho
Minh Dao-Tran
Danh Le Phuoc
Freddy Lecue
... without forgetting you!
Thank you! Questions?
http://dellaglio.org