YABench: A Comprehensive Framework for RDF
Stream Processor Correctness and Performance Assessment
Maxim Kolchin, Peter Wetz, Elmar Kiesling, A Min Tjoa
ITMO University, Russia | TU Wien, Austria
The 16th International Conference on Web Engineering 2016, Lugano, Switzerland
RDF Stream Processing (RSP)
RDF Stream - a potentially infinite sequence of time-varying data elements encoded in RDF
Continuous query - a query registered over streams, which are in most cases observed through windows
Query results - as in SPARQL, results can be tuples, an RDF dataset, or a new RDF stream
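The window semantics sketched above can be illustrated in Python. This is a hypothetical sketch, not any engine's API: `Triple`, `sliding_windows`, and the integer time units are assumptions made for illustration.

```python
from collections import namedtuple

# A stream element: an RDF triple annotated with a timestamp.
Triple = namedtuple("Triple", "s p o t")

def sliding_windows(stream, size, slide):
    """Group timestamped triples into overlapping windows of
    `size` time units, advancing by `slide` units each step."""
    if not stream:
        return []
    end = stream[0].t + size
    last_t = stream[-1].t
    windows = []
    while end - size <= last_t:
        windows.append([e for e in stream if end - size <= e.t < end])
        end += slide
    return windows

# One station emitting a temperature observation per time unit.
stream = [Triple("st1", "hasTemp", 20 + i, i) for i in range(6)]
# Window size 3, slide 2: windows cover [0,3), [2,5), [4,7)
wins = sliding_windows(stream, size=3, slide=2)
```

A continuous query would be evaluated once per window over exactly these element sets, which is why window placement directly determines result correctness.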
State of the art
■ LSBench (2012)
■ SRBench (2012)
■ CSRBench (2013)
■ CityBench (2015)
Details can be found at W3C RSP Community Group’s Wiki: https://www.w3.org/community/rsp/wiki/RSP_Benchmarking
Our contribution
■ We propose a benchmarking framework for RDF Stream Processing engines that focuses on correctness and performance
○ Stream generator (generates configurable RDF streams)
○ Oracle (validates the correctness of the results)
○ Runner (measures the performance of an RSP engine)
■ We run the benchmark with two window-based RDF stream processing engines:
○ C-SPARQL
○ CQELS
Requirements
■ Scalable and configurable input
■ Comprehensive correctness checking
■ Flexible queries
■ Reproducibility
Architecture

1. Define tests,
2. Generate data streams,
3. Run the tests with a given engine,
   a. Performance metrics are collected in a separate process,
4. At the end, validate the results with the oracle.
Architecture: Reporting tool
Validation against CSRBench
We validated the correctness checking functionality of YABench by reproducing the CSRBench* benchmark.
CSRBench defines 7 queries for the C-SPARQL, CQELS, and SPARQLStream engines.
Datasets, test configurations and results are available online: github.com/YABench/csrbench-validation
*Daniele Dell’Aglio, et al. “On Correctness in RDF Stream Processor Benchmarking”, 2013
Validation against CSRBench (C-SPARQL)
Query    CSRBench    YABench
Q1       ✓           ✓
Q2       ✓           ✓
Q3       ✓           ✓*
Q4       ✓           ✓
Q5       ✗           ✗
Q6       ✓           ✓*
Q7       ✓           ✓*
* - the results are the same, but due to timing discrepancies some results occasionally appear in the subsequent window
Validation against CSRBench (CQELS)
Query    CSRBench    YABench
Q1       ✓           ✓
Q2       ✓           ✓
Q3       ✓           ✓
Q4       ✗           ✗
Q5       ✓           ✓
Q6       ✗           ✗
Q7       ✗           ✗**
** - the query did not execute successfully on the CQELS engine; the engine crashed before returning any results
Benchmark
We reuse the queries introduced by CSRBench, but we are able to parametrize them, e.g. by window size, window slide, and filter values.
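Such parametrization can be sketched as a query template. The template below is a hypothetical illustration written in C-SPARQL-style syntax; the URIs, property names, and parameter names are assumptions, not the actual CSRBench queries.

```python
# Hypothetical parametrized query template in the style of the
# CSRBench queries; size, slide, and filter_value are the
# tunable benchmark parameters.
QUERY_TEMPLATE = """
REGISTER QUERY q1 AS
SELECT ?station ?value
FROM STREAM <http://example.org/stream> [RANGE {size}s STEP {slide}s]
WHERE {{
  ?obs <http://example.org/observedBy> ?station ;
       <http://example.org/hasValue>   ?value .
  FILTER(?value > {filter_value})
}}
"""

def instantiate(size, slide, filter_value):
    """Fill in the window and filter parameters of the template."""
    return QUERY_TEMPLATE.format(size=size, slide=slide,
                                 filter_value=filter_value)

q = instantiate(size=5, slide=1, filter_value=30)
```

Varying these parameters lets one benchmark produce a whole family of workloads from a single query definition.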
Measure:
- Precision and recall,
- Window size, result size, and delay,
- Memory and CPU usage, number of threads
We run each test 10 times to compute the distribution of precision and recall.
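The per-window correctness check can be sketched as a simple set comparison between the oracle's expected results and the engine's actual output. This is a minimal illustration of the precision/recall metric, not YABench's actual implementation.

```python
def precision_recall(expected, actual):
    """Compare an engine's actual window results against the
    oracle's expected results:
      precision = |expected ∩ actual| / |actual|
      recall    = |expected ∩ actual| / |expected|
    Empty sets are treated as trivially correct."""
    expected, actual = set(expected), set(actual)
    tp = len(expected & actual)  # true positives
    precision = tp / len(actual) if actual else 1.0
    recall = tp / len(expected) if expected else 1.0
    return precision, recall

# One spurious result ("d") and one missing result ("a"):
p, r = precision_recall({"a", "b", "c"}, {"b", "c", "d"})
```

A window shifted by even one stream element lowers both metrics at once, which is the effect the timing-discrepancy footnotes above describe.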
Detailed results are available online: github.com/YABench/yabench-one
Benchmark: Data Stream Model
A data stream is generated based on:
■ Number of weather stations,
■ Time interval between two observations of a single station,
■ Duration of the stream,
■ A seed for the randomize function.
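The four parameters above can be sketched as a generator function. This is a hypothetical illustration of the stream model; the value range and tuple layout are assumptions, not YABench's actual generator.

```python
import random

def generate_stream(stations, interval, duration, seed):
    """Generate a reproducible observation stream: each of
    `stations` weather stations emits one temperature reading
    every `interval` time units over `duration` time units.
    The seed makes the randomized values reproducible."""
    rng = random.Random(seed)
    observations = []
    for t in range(0, duration, interval):
        for s in range(stations):
            temp = round(rng.uniform(-10.0, 35.0), 1)
            observations.append((f"station{s}", t, temp))
    return observations

# 3 stations, one reading every 5 time units, for 20 time units
# (t = 0, 5, 10, 15) -> 12 observations
obs = generate_stream(stations=3, interval=5, duration=20, seed=42)
```

Fixing the seed is what makes the reproducibility requirement from the earlier slide achievable: the same configuration always yields the same stream.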
Benchmark: Queries
Experiment 1: SELECT + FILTER
Experiment 2: SELECT + AVG + FILTER
Experiment 3: joining of triples from different timestamps
Experiment 4: demonstrates the use of gracious mode, which is implemented by the oracle to compensate for the engines' timing discrepancies
Experiment 1 (precision/recall): 50 stations
Experiment 1 (precision/recall): 1000 stations
Experiment 1 (precision/recall): 10000 stations
Experiment 1 (memory usage): 50 stations
Experiment 1 (memory usage): 1000 stations
Experiment 1 (memory usage): 10000 stations
Experiment 1 (delay): 50 stations
Experiment 1 (delay): 1000 stations
Experiment 1 (delay): 10000 stations
Experiment 1 (C-SPARQL): delay vs result size
Architecture: Gracious mode
In this mode, the oracle tries to adjust its window scope to match the scope of the engine's actual window by moving the left and right borders back and/or forth as long as precision and recall improve.
This allows us to:
(a) confirm our assumption about why precision and recall are low,
(b) reconstruct and visualize the actual window borders.
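The border-adjustment idea can be sketched as a small search. This is a simplified, hypothetical take on gracious mode: `f1`, the shift bounds, and the `(element, timestamp)` stream layout are assumptions made for illustration, not the oracle's actual algorithm.

```python
def f1(expected, actual):
    """Harmonic mean of precision and recall over two result sets."""
    expected, actual = set(expected), set(actual)
    tp = len(expected & actual)
    if not expected or not actual:
        return 0.0
    p, r = tp / len(actual), tp / len(expected)
    return 2 * p * r / (p + r) if p + r else 0.0

def gracious_window(stream, start, end, actual, step=1, max_shift=5):
    """Shift the oracle window's left and right borders by up to
    `max_shift` steps and keep the placement that best matches the
    engine's actual output (highest F1)."""
    def expected(s, e):
        return {x for (x, t) in stream if s <= t < e}
    best = (f1(expected(start, end), actual), start, end)
    for ds in range(-max_shift, max_shift + 1):
        for de in range(-max_shift, max_shift + 1):
            s, e = start + ds * step, end + de * step
            score = f1(expected(s, e), actual)
            if score > best[0]:
                best = (score, s, e)
    return best

stream = [(f"obs{t}", t) for t in range(20)]
# The engine's window actually covered [3, 8) instead of the
# oracle's expected [2, 7):
score, s, e = gracious_window(stream, 2, 7,
                              {"obs3", "obs4", "obs5", "obs6", "obs7"})
```

The size of the shift that maximizes the match is exactly the "extent of the timing discrepancy" mentioned in the conclusion.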
Experiment 4: gracious vs non-gracious modes
(a) In non-gracious (default) mode (b) In gracious mode
C-SPARQL
Experiment 4: gracious vs non-gracious modes
(a) In non-gracious (default) mode (b) In gracious mode
CQELS
Conclusion
■ We built a framework for benchmarking RSP engines that allows assessing their correctness and performance
■ We ran a benchmark which revealed some insights:
○ CQELS shows better precision/recall for simple queries,
○ C-SPARQL is slightly more memory-efficient than CQELS,
○ C-SPARQL outperforms CQELS in terms of delay for more complex queries, which is mainly caused by a different reporting strategy
■ By introducing gracious mode, we are able to estimate the extent of the timing discrepancies
Thank you!
github.com/YABench