Daniele Dell’Aglio On Unified Stream Reasoning The RDF Stream Processing realm Daniele Dell’Aglio WU Vienna, 18/02/2016



Problem setting

Real-time integration of huge volumes of dynamic data from heterogeneous sources

– Traffic Prediction

– Social media analytics

– Personalised services


Stream Reasoning

Stream Reasoning (SR): inference over streams of data

– Stream and Event Processing: real-time processing of highly dynamic data

• Aggregations, filters

• Complex event detection

– Reasoning

• Access and integration of heterogeneous data

• Make hidden information explicit


A non-comprehensive view of Stream Reasoning research so far

[Figure: map of Europe (from Wikipedia)]


The initial problem (1)

Where are Alice and Bob, when they are together?

Let’s consider a tumbling window W(ω=β=5)

Let’s execute the experiment 4 times

Execution   1st answer   2nd answer
1           :hall [6]    :kitchen [11]
2           :hall [5]    :kitchen [10]
3           :hall [6]    :kitchen [11]
4           - [7]        - [12]

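[Figure: stream S with items S1, S2, S3, S4 at t = 1, 3, 6, 9, carrying {:alice :isIn :hall}, {:bob :isIn :hall}, {:alice :isIn :kitchen} and {:bob :isIn :kitchen}; the window width and slide are marked on the time axis.]

Which is the correct answer?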


The initial problem (2)

System 1

Execution   1st answer   2nd answer
1           :hall [6]    :kitchen [11]
2           :hall [5]    :kitchen [10]
3           :hall [6]    :kitchen [11]
4           - [7]        - [12]

System 2

Execution   1st answer   2nd answer
1           :hall [3]    :kitchen [9]
2           No answers
3           :hall [3]    :kitchen [9]
4           No answers

[Figure: stream S with items S1, S2, S3, S4 at t = 1, 3, 6, 9, carrying the {:alice :isIn …} and {:bob :isIn …} triples for :hall and :kitchen.]

Which system behaves in the correct way?


Problem

How to unify current Stream Reasoning techniques?

Why do we need it?

• Comparison and contrast

• Interoperability

• Study RDF Stream Processing related problems

• Standard RSP query language


Contribution – RSEP-QL

A comprehensive model that formally defines the semantics of RDF Stream Processing engines

[Figure: layered diagram. Inputs: streams, background data, ontology. Boxes: RSP-QL, RSEP-QL, entailment regimes. On top: applications.]

Diagram annotations:

– BGP evaluation over streams

– BGP evaluation over background data (BKG)

– Event pattern detection operators

– Model to express continuous queries

– The entailment regimes require an ontology and provide more answers w.r.t. both RSP-QL and RSEP-QL (not part of today's talk)


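From SPARQL…

Q (E, DS, QF)

[Figure: SPARQL engine architecture. The query interface receives Q; the evaluator processes E, the data layer holds DS (RDF graphs), and the result formatter applies QF to produce Ans(Q).]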


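…to RSEP-QL

Q (E, DS, QF) → Q (E, SDS, QF) → Q (E, SDS, ET, QF) → Q (SE, SDS, ET, QF)

[Figure: the SPARQL architecture extended step by step. The data layer holds RDF graphs and RDF streams (SDS replaces DS), the evaluator becomes a continuous evaluator driven by ET, and SE replaces E; the result formatter still applies QF to produce Ans(Q).]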



Streaming Dataset – Time-based sliding window: 𝕎(ω,β,t0)

Sequence of timestamped graphs (stream items)

[Figure: stream S with items S1–S11 on a time axis t; the window 𝕎(3,1,1) starts at t0, with width ω and slide β marked.]
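The window parameters can be made concrete with a small Python sketch. It assumes left-closed, right-open scopes [o, c) and that the k-th window of 𝕎(ω,β,t0) opens at t0 + k·β; the names `window_scope` and `window_content` and the toy stream are illustrative, not taken from the slides.

```python
from typing import List, Tuple

# A stream item is a (graph, timestamp) pair; a graph is a set of triples.
StreamItem = Tuple[frozenset, int]


def window_scope(omega: int, beta: int, t0: int, k: int) -> Tuple[int, int]:
    """Scope [o, c) of the k-th window (k = 0, 1, ...) of W(omega, beta, t0).

    Assumption: windows are left-closed and right-open.
    """
    o = t0 + k * beta
    return (o, o + omega)


def window_content(stream: List[StreamItem], scope: Tuple[int, int]) -> List[StreamItem]:
    """Stream items whose timestamp falls inside the given scope."""
    o, c = scope
    return [(g, ts) for (g, ts) in stream if o <= ts < c]


# Toy stream: one item per time instant 1..6, each with one placeholder triple.
stream = [(frozenset({(f":s{ts}", ":p", ":o")}), ts) for ts in range(1, 7)]

# First three windows of W(3, 1, 1), the window shown on the slide.
scopes = [window_scope(3, 1, 1, k) for k in range(3)]
third = window_content(stream, scopes[2])
```

With ω > β, as here, consecutive windows overlap; with ω = β the window is tumbling and every item falls in exactly one scope.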


Streaming Dataset – Landmark window: 𝕃(t0)

Sequence of timestamped graphs (stream items)

[Figure: stream S with items S1–S11 on a time axis t; the landmark window 𝕃(2) starts at t0 and extends over the stream.]
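A landmark window has a fixed start and no width, so its content only grows. The following Python sketch illustrates this; treating the upper bound t as inclusive is an assumption, and `landmark_content` is a hypothetical helper name.

```python
from typing import List, Tuple

# A stream item is a (graph, timestamp) pair.
StreamItem = Tuple[frozenset, int]


def landmark_content(stream: List[StreamItem], t0: int, t: int) -> List[StreamItem]:
    """Items of the landmark window L(t0) at time t: everything since t0.

    Assumption: both bounds are inclusive.
    """
    return [(g, ts) for (g, ts) in stream if t0 <= ts <= t]


# Toy stream with one item per time instant 1..6.
stream = [(frozenset({(f":s{ts}", ":p", ":o")}), ts) for ts in range(1, 7)]

at_4 = [ts for _, ts in landmark_content(stream, 2, 4)]
at_6 = [ts for _, ts in landmark_content(stream, 2, 6)]
```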


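From SPARQL dataset to RSEP-QL Streaming Dataset

– SPARQL dataset: RDF graphs (G, H)

– Time-Varying Graph: G: T → R, with T ⊆ ℕ and R = {RDF graph}

– Instantaneous Graph: G(t1) ∈ R, the graph at time instant t1

– RSP-QL dataset: includes windows over streams, 𝕎(S)

[Figure: stream S with items S1–S12 and a window 𝕎(S) over it.]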


Evaluation

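The SPARQL evaluation function is defined as

$\llbracket P \rrbracket_{DS(G)}$

The RSEP-QL evaluation function extends the SPARQL one by introducing the evaluation time instant t

$\llbracket P \rrbracket^{t}_{SDS(A)}$

SPARQL operators extend straightforwardly to the new evaluation function

Example: JOIN

$\llbracket JOIN(P_1, P_2) \rrbracket^{t}_{SDS(A)} = \llbracket P_1 \rrbracket^{t}_{SDS(A)} \bowtie \llbracket P_2 \rrbracket^{t}_{SDS(A)}$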
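Because both operands are evaluated at the same time instant t, the join itself can be the usual SPARQL join of compatible solution mappings. A minimal Python sketch, representing bags of solution mappings as lists of dicts (a representation chosen for illustration; the variable names are made up):

```python
from typing import Dict, List

Mapping = Dict[str, str]  # variable name -> RDF term


def compatible(m1: Mapping, m2: Mapping) -> bool:
    """Two mappings are compatible if they agree on every shared variable."""
    return all(m2[var] == term for var, term in m1.items() if var in m2)


def join(omega1: List[Mapping], omega2: List[Mapping]) -> List[Mapping]:
    """Join of two bags of solution mappings (duplicates are preserved)."""
    return [{**m1, **m2} for m1 in omega1 for m2 in omega2 if compatible(m1, m2)]


# Hypothetical results of evaluating P1 and P2 at the same instant t.
p1_at_t = [{"?room": ":hall"}, {"?room": ":kitchen"}]
p2_at_t = [{"?room": ":hall", "?who": ":bob"}]

joined = join(p1_at_t, p2_at_t)
```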


Instantaneous evaluation

The main difference is in the BGP evaluation:

$\llbracket BGP \rrbracket^{t}_{SDS(A)} = \llbracket BGP \rrbracket_{SDS(A,t)}$

SDS(A,t) is:

– SDS(G,t) = SDS(G(t)) if A is a time-varying graph G

– SDS(𝕎(S),t) = SDS(m(𝕎(S,t))) if A is from a sliding window 𝕎

– SDS(𝕃(S),t) = SDS(m(𝕃(S,t))) if A is from a landmark window 𝕃

where m denotes a merge function

$m(\mathbb{W}(S,t)) = \bigcup_{(d_i, t_i) \in \mathbb{W}(S,t)} d_i$

– takes as input a window content, i.e. a sequence of timestamped RDF graphs

– produces an RDF graph
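The merge function can be sketched in Python as a plain set union over the graphs in the window content. This sketch assumes graphs without blank nodes (a full RDF merge would have to keep blank nodes from different graphs apart); the type aliases are illustrative.

```python
from typing import FrozenSet, List, Tuple

Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]


def merge(window_content: List[Tuple[Graph, int]]) -> Graph:
    """Union of the graphs d_i of all (d_i, t_i) pairs in the window content.

    Timestamps are dropped: the result is an ordinary RDF graph.
    Assumption: no blank nodes, so set union is sufficient.
    """
    merged = set()
    for graph, _timestamp in window_content:
        merged |= graph
    return frozenset(merged)


content = [
    (frozenset({(":alice", ":isIn", ":hall")}), 1),
    (frozenset({(":bob", ":isIn", ":hall")}), 3),
]
graph = merge(content)
```

Once the timestamps are merged away, a standard BGP evaluation can run over the resulting graph.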


Continuous evaluation

For each evaluation time t ∈ ET: $\llbracket SE \rrbracket^{t}_{SDS(A)}$

– The continuous evaluation is a sequence of instantaneous evaluations

It is not always possible to compute ET a priori

– Can be data dependent

– ET is expressed through a Report Policy

A Report Policy is a set of conditions applied to one or more window operators in the SDS

– Initially defined in SECRET for Stream Processing engines


Continuous evaluation – Report Policies

Report Policy examples:

– P (Periodic): the window reports only at regular intervals

– WC (Window Close): the window reports when the active window closes

– CC (Content Change): the window reports when the content changes
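As a rough illustration of how a report policy determines ET, the Python sketch below computes evaluation instants up to a finite horizon. It assumes windows that close at t0 + ω + k·β and, for Content Change, counts only item arrivals (items leaving the window are ignored); the function names and the horizon parameter are hypothetical.

```python
from typing import List


def periodic(period: int, start: int, horizon: int) -> List[int]:
    """P: report at regular intervals, independently of the data."""
    return list(range(start, horizon + 1, period))


def window_close(omega: int, beta: int, t0: int, horizon: int) -> List[int]:
    """WC: report whenever a window closes (assumed at t0 + omega + k * beta)."""
    return list(range(t0 + omega, horizon + 1, beta))


def content_change(timestamps: List[int], horizon: int) -> List[int]:
    """CC: report whenever an item arrives (simplification: expiry is ignored)."""
    return sorted({ts for ts in timestamps if ts <= horizon})


arrivals = [1, 3, 6, 9]
et_p = periodic(4, 0, 12)
et_wc = window_close(5, 5, 0, 12)
et_cc = content_change(arrivals, 12)
```

Only the Content Change policy depends on the data, which is why ET cannot always be computed a priori.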


Event Processing – Basic Event Pattern

Support for Complex Event Processing operators

The minimal element is the Basic Event Pattern (BEP):

EVENT w P

Intuitively, the Basic Graph Pattern P should match against one stream item of the window identified by w

BEPs can be combined through complex operators

• SEQ, LAST, EVERY

Example:

EVENT w1 P1 SEQ EVERY EVENT w2 P2


Event Processing – Evaluation semantics

Formally, we use a new evaluation function $\llparenthesis \cdot \rrparenthesis^{t}_{o,c}$

• t is the evaluation time instant,

• o, c is an additional window to identify the portion of the data on which the event may happen

Event pattern evaluation produces event mappings (μ, t1, t2)

• μ is a solution mapping

• t1 and t2 denote the time interval justifying μ



Event Processing – Evaluation semantics – Examples

The evaluation of EVENT w1 P1 SEQ EVERY EVENT w2 P2 is:

[Figure: stream items S1–S12 on a time axis t from 10 to 16, with the items matched by EVENT w1 P1 and by EVERY EVENT w2 P2 highlighted.]

Matches shown in the figure:

– S1, S10 with time interval 11, 13

– S1, S12 with time interval 11, 15
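The two matches can be reproduced with a simplified Python sketch of SEQ over event mappings (μ, t1, t2). It is an illustration only: it pairs every match of the first pattern with every later, compatible match of the second (the EVERY behaviour), and it ignores window restrictions and the other selection operators. The variable names `?a` and `?b` are made up.

```python
from typing import Dict, List, Tuple

# An event mapping is (mu, t1, t2): a solution mapping and its time interval.
EventMapping = Tuple[Dict[str, str], int, int]


def seq(first: List[EventMapping], second: List[EventMapping]) -> List[EventMapping]:
    """Pair each match of `first` with every later, compatible match of `second`.

    Simplified reading of `E1 SEQ EVERY E2`; not the full RSEP-QL semantics.
    """
    result = []
    for mu1, start1, end1 in first:
        for mu2, start2, end2 in second:
            agree = all(mu2[v] == x for v, x in mu1.items() if v in mu2)
            if end1 < start2 and agree:
                # The combined mapping is justified by the interval [start1, end2].
                result.append(({**mu1, **mu2}, start1, end2))
    return result


# Matches read from the slide: the first pattern matches at 11,
# the second at 13 and at 15.
first = [({"?a": "S1"}, 11, 11)]
second = [({"?b": "S10"}, 13, 13), ({"?b": "S12"}, 15, 15)]

matches = seq(first, second)
intervals = [(t1, t2) for _, t1, t2 in matches]
```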


Event Processing – MATCH graph pattern

Event patterns are enclosed in MATCH graph patterns

• Event mappings exist only in the context of event patterns

• The evaluation of a MATCH graph pattern produces a bag of solution mappings

$\llbracket MATCH\ E \rrbracket^{t}_{SDS(A)} = \{\mu \mid (\mu, t_1, t_2) \in \llparenthesis E \rrparenthesis^{t}_{0,t}\}$

It is possible to combine the MATCH graph pattern with other SPARQL graph patterns


RSP output correctness

[Figure: stream S with items S1, S2, S3, S4 at t = 1, 3, 6, 9, carrying the {:alice :isIn …} and {:bob :isIn …} triples for :hall and :kitchen; three alignments of the tumbling window are marked, starting at t0=0, t0=1 and t0=2.]

Answers by execution:

Execution   1st answer   2nd answer
1           :hall [6]    :kitchen [11]
2           :hall [5]    :kitchen [10]
3           :hall [6]    :kitchen [11]
4           - [7]        - [12]

Answers by window start t0:

Window      1st answer   2nd answer
t0=0        :hall [5]    :kitchen [10]
t0=1        :hall [6]    :kitchen [11]
t0=2        - [7]        - [12]
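The dependence of the answers on t0 can be checked with a short Python sketch. It assumes the stream items arrive at t = 1, 3, 6, 9 in the order Alice/hall, Bob/hall, Alice/kitchen, Bob/kitchen (as read from the figure), left-closed and right-open windows, and one report per window at its closing instant; `together` and `run` are illustrative names.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]

# Stream items as (triple, timestamp); timestamps are read from the figure.
stream: List[Tuple[Triple, int]] = [
    ((":alice", ":isIn", ":hall"), 1),
    ((":bob", ":isIn", ":hall"), 3),
    ((":alice", ":isIn", ":kitchen"), 6),
    ((":bob", ":isIn", ":kitchen"), 9),
]


def together(triples: List[Triple]) -> List[str]:
    """Rooms in which both :alice and :bob are located."""
    alice = {o for s, p, o in triples if s == ":alice" and p == ":isIn"}
    bob = {o for s, p, o in triples if s == ":bob" and p == ":isIn"}
    return sorted(alice & bob)


def run(t0: int, omega: int = 5, beta: int = 5, n_windows: int = 2):
    """Evaluate the query on the first windows of W(omega, beta, t0).

    Returns (closing instant, rooms) pairs; windows are [o, c).
    """
    answers = []
    for k in range(n_windows):
        o = t0 + k * beta
        c = o + omega
        content = [triple for triple, ts in stream if o <= ts < c]
        answers.append((c, together(content)))
    return answers


results = {t0: run(t0) for t0 in (0, 1, 2)}
```

Under these assumptions the three values of t0 give the three rows of the second table: the same query over the same stream yields different, equally valid answers depending on when the first window opens.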


RSP output correctness

System 1

Execution   1st answer   2nd answer
1           :hall [6]    :kitchen [11]
2           :hall [5]    :kitchen [10]
3           :hall [6]    :kitchen [11]
4           - [7]        - [12]

System 2

Execution   1st answer   2nd answer
1           :hall [3]    :kitchen [9]
2           No answers
3           :hall [3]    :kitchen [9]
4           No answers

[Figure: stream S with items S1, S2, S3, S4 at t = 1, 3, 6, 9, carrying the {:alice :isIn …} and {:bob :isIn …} triples for :hall and :kitchen.]

– Window-close vs content-change

– Empty relation notification (yes|no)


Correctness Assessment

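[Figure: correctness assessment workflow with an online part and an offline part. Labelled components: data importer, query transformer, SPARQL engine, result matcher. Labelled inputs: the query Q (E, SDS, ET, QF), DS, R, and the RSP-QL model of R. Output: correctness assessment.]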


What’s next?

An RSEP-QL query language

• W3C RSP CG ongoing activities

Implementations

• Yet another RSP engine

• Framework to let existing RSP engines interoperate

Streams are getting popular – applications want more and more sophisticated features

• Different timestamps, out-of-order items

• Inductive reasoning to cope with noise

• Permanent storage of portions of data (raw or inferred)


Conclusions

The dynamics introduced in the continuous query evaluation process are not yet fully understood

• Not fully captured by existing models

• RSEP-QL captures those dynamics

• All of them? Let’s find out!

We need to push implementations and applications on use cases

• To understand which helpful operators are missing

• To find new unexpected behaviours


People I am grateful to...

Emanuele Della Valle

and:

Marco Balduini

Jean-Paul Calbimonte

Oscar Corcho

Minh Dao-Tran

Danh Le Phuoc

Freddy Lecue


... without forgetting you!

Thank you! Questions?


Daniele Dell’Aglio

[email protected]

http://dellaglio.org
