Optimizing the Performance for Concurrent RDF Stream Processing Queries
Chan Le Van, Feng Gao, Muhammad Intizar Ali
The INSIGHT Centre for Data Analytics – NUI Galway, Ireland
May, 2017
Outline
I. Introduction
II. Foundations
III. Optimization of Concurrent CQELS Queries
IV. Evaluations
V. Conclusion and Future Works
Data Streams are Everywhere!
RDF Stream Processing
• RDF Stream Processing (RSP) engines: C-SPARQL, SPARQL-stream, CQELS
• Concurrent query processing is still a challenge with these engines
• CQELS+: an extension of CQELS that aims to optimize multiple-query processing
II. Foundations
• CQELS – RDF Stream Processing Framework
• Multi-way Join Operator
• Shared Join Operator
• Network of Shared Join Operators (NSJO)
Reference: D. Le-Phuoc. A Native and Adaptive Approach for Linked Stream Data Processing. PhD thesis, National University of Ireland Galway, IDA Business Park, Lower Dangan, Galway, Ireland, 2012.
CQELS – RDF Stream Processing Framework
• Accepts the declarative CQELS language (an extension of SPARQL)
• Follows an eager-execution approach
• Can process both static RDF data and RDF stream data
[Figure: the CQELS engine over streams S1, S2 and S3 with registered queries Q1(S1, S2, S3), Q2(S2, S3) and Q3(S1, S3); each query has a join operator (j) over its buffers: B11, B12, B13 for Q1; B22, B23 for Q2; B31, B33 for Q3.]
Multi-way Join Operator
• Incremental evaluation
[Figure: multi-way join operator; two elements of the diagram are labelled "(indexed)".]
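As a rough illustration of incremental evaluation, the sketch below probes hash-indexed buffers with each newly arrived item instead of recomputing the whole join. It is a simplified symmetric hash join on a single shared key, not the CQELS operator itself; the class name, the single-key assumption, and the unbounded (window-less) buffers are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import product


class IncrementalMultiwayJoin:
    """Toy multi-way equi-join on one shared key (hypothetical, not CQELS code)."""

    def __init__(self, stream_names):
        self.stream_names = list(stream_names)
        # One hash index per stream buffer: key -> list of items seen so far.
        self.buffers = {name: defaultdict(list) for name in self.stream_names}

    def insert(self, stream, key, item):
        """Add one item and return only the join results it newly produces."""
        # Probe the indexes of all other buffers with the new item's key.
        others = [n for n in self.stream_names if n != stream]
        matches = [self.buffers[n].get(key, []) for n in others]
        new_results = []
        for combo in product(*matches):
            row = dict(zip(others, combo))
            row[stream] = item
            new_results.append(row)
        # Store the item afterwards so later arrivals can join with it.
        self.buffers[stream][key].append(item)
        return new_results


join = IncrementalMultiwayJoin(["S1", "S2", "S3"])
join.insert("S1", "room1", "a")          # no complete match yet
join.insert("S2", "room1", "b")          # still waiting for S3
print(join.insert("S3", "room1", "c"))   # the new item completes one result
```

Each call does work proportional to the matches for the new item only, which is the point of evaluating incrementally over indexed buffers.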
Shared Join Operator
Network of Shared Join Operators (NSJO)
[Figure: NSJO diagram with four components labelled "Join Graph".]
• A join graph contains the best-cost join sequences of the queries involved
• Join sequence = the order in which data buffers are joined
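To make "best-cost join sequence" concrete, the sketch below enumerates the possible orders of a query's buffers and keeps the cheapest one. The cost model (sum of estimated intermediate result sizes with one fixed selectivity per join), the selectivity value, and the function names are assumptions for illustration only; the slides do not state how CQELS+ estimates cost. Brute-force enumeration is only practical for a handful of buffers.

```python
from itertools import permutations


def sequence_cost(sequence, sizes, selectivity=0.1):
    """Estimated cost of joining buffers in the given order (assumed model)."""
    intermediate = sizes[sequence[0]]
    cost = 0.0
    for name in sequence[1:]:
        # Assumed estimate: each join keeps a fixed fraction of the cross product.
        intermediate = intermediate * sizes[name] * selectivity
        cost += intermediate
    return cost


def best_join_sequence(sizes, selectivity=0.1):
    """Return the buffer order with the lowest estimated cost."""
    return min(
        permutations(sorted(sizes)),
        key=lambda seq: sequence_cost(seq, sizes, selectivity),
    )


# Hypothetical buffer sizes (number of items currently held).
buffer_sizes = {"B1": 1000, "B2": 10, "B3": 100}
print(best_join_sequence(buffer_sizes))
```

Under this model the largest buffer is joined last, because joining small buffers first keeps the intermediate results small.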
III. Optimization of Concurrent CQELS Queries
• CQELS+: extending CQELS with the network of shared join operators
o Output reutilization heuristic over the join graph
o Join graph example
• Load balancing for parallel CQELS+ instances
o Rotation
o Minimum Average Latency
o Minimum Average Buffer Size
Output Reutilization Heuristic
CQELS+: Join Graph – Example
QUERY 1: inform a participant about the name and description of the location he just entered.
QUERY 2: notify two people when they can reach each other from two different and directly connected (nearby) locations.
QUERY 3: notify an author of his co-authors who have been in his current location during the last 5 seconds.
QUERY 4: count the number of co-authors appearing in nearby locations in the last 30 seconds, grouped by location.
Reference (conference scenario: integrating physical stream with online profiles): Danh Le-Phuoc, Josiane Xavier Parreira, Manfred Hauswirth. Linked Stream Data Processing. 2012.
Load Balancing for Parallel CQELS+ Instances
Register queries using load-balancing strategies:
1. Rotation: round-robin registration
2. Minimum Average Latency: choose the engine with the lowest average latency to register
3. Minimum Average Buffer Size: choose the engine with the lowest average buffer size to register
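The three strategies can be sketched as interchangeable selection functions over a pool of engines. This is a minimal sketch, not the CPFederation code: the `Engine` and `LoadBalancer` classes, the metric field names, and the assumption that each engine's average latency and buffer size are already measured and supplied by the caller are all hypothetical.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Engine:
    """Stand-in for one CQELS+ instance (hypothetical; metrics set by the caller)."""
    name: str
    avg_latency_ms: float = 0.0
    avg_buffer_size: float = 0.0
    queries: list = field(default_factory=list)


class LoadBalancer:
    def __init__(self, engines):
        self.engines = list(engines)
        self._turn = count()  # counter used by the rotation strategy

    def rotation(self):
        """Round-robin: pick engines in a fixed cyclic order."""
        return self.engines[next(self._turn) % len(self.engines)]

    def min_average_latency(self):
        """Pick the engine with the lowest average latency."""
        return min(self.engines, key=lambda e: e.avg_latency_ms)

    def min_average_buffer_size(self):
        """Pick the engine with the lowest average buffer size."""
        return min(self.engines, key=lambda e: e.avg_buffer_size)

    def register(self, query, strategy):
        """Register the query on the engine chosen by the given strategy method."""
        engine = strategy()
        engine.queries.append(query)
        return engine


engines = [Engine("e1", 12.0, 300), Engine("e2", 8.0, 900)]
balancer = LoadBalancer(engines)
balancer.register("Q1", balancer.rotation)                 # first engine in the cycle
balancer.register("Q2", balancer.min_average_latency)      # engine with lowest latency
balancer.register("Q3", balancer.min_average_buffer_size)  # engine with smallest buffers
```

Rotation needs no runtime metrics, while the other two strategies depend on how fresh the latency and buffer-size measurements are.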
IV. Evaluation
• Shared Join Operator Evaluation
• Load Balancing over CQELS+ engines
• Query Registration Time
Join Performance between CQELS and CQELS+
[Figure: join performance plots for Query 2, Query 3, Query 5 and Query 6.]
Experimentation: https://github.com/chanlevan/CqelsplusExperiment
Source code: https://github.com/chanlevan/CQELSPLUS
Load Balancing
[Figure: load-balancing results for Query 5 in two panels, "Scale instances" and "Scale streams".]
Experimentation: https://github.com/chanlevan/CqelsplusLoadBalancingExperiment
Source code: https://github.com/chanlevan/CPFederation
Query Registration Time
Experimentation: https://github.com/chanlevan/CqelsplusLoadBalancingExperiment
Source code: https://github.com/chanlevan/CPFederation
V. Conclusion and Future Works
• Better performance when handling multiple queries
• Federating CQELS+ engines with different load-balancing strategies
• CQELS+: reduce query registration time
• Distributed model: more efficient load-balancing strategies