DESCRIPTION
PAGOdA (Pay-as-you-go OWL Query Answering Using a Triple Store), presentation by Bernardo Cuenca Grau
Abstract: We present an enhanced hybrid approach to OWL query answering that combines an RDF triple-store with an OWL reasoner in order to provide scalable pay-as-you-go performance. The enhancements presented here include an extension to deal with arbitrary OWL ontologies, and optimisations that significantly improve scalability. We have implemented these techniques in a prototype system, a preliminary evaluation of which has produced very encouraging results.
Pay-as-you-go Query Answering with PAGOdA
BERNARDO CUENCA GRAU
Ontology-mediated Query Answering
[Figure: query Q, ontology and RDF Data]
• (Meta)-data published in RDF
• RDF resources reference an OWL 2 ontology
• The ontology describes the meaning of data
RDF and OWL 2 well-established
• Thousands of available OWL 2 ontologies
• RDF ubiquitous on the Web
Ontology-mediated Query Answering
Ontology languages offer a wide range of modelling constructs
High expressive power → high worst-case complexity of reasoning
How can we provide scalable query answering?
• Restrict our ontology to a lightweight fragment of OWL (the EL, QL or RL profiles)
• Tolerate incompleteness
• Rely on highly optimised pay-as-you-go systems
  • Worst-case optimal for lightweight fragments
  • Rapidly compute easy answers
  • Performance degrades gracefully with harder instances
Datalog and the OWL 2 Profiles
Datalog is the quintessential rule-based KR language
• Reasoning typically implemented via materialisation
• Our in-house system RDFox shows excellent performance
Query answering within the OWL 2 profiles
• RL ontologies equivalent to Datalog programs
• EL and QL ontologies can be strengthened using Datalog
Query answering requires an additional filtration step
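As an illustration of the materialisation mentioned above, here is a minimal Python sketch of naive forward chaining over Datalog rules. It is not how RDFox works internally; the tuple encoding of facts and rules, the `?`-prefix for variables and the names `match`, `substitute` and `materialise` are assumptions made for this example.

```python
# Facts are tuples such as ("Student", "alice") or ("takes", "alice", "db101").
# A rule is (head, body); arguments starting with "?" are variables.
# This encoding is an assumption of the sketch, not an RDFox format.

def match(atom, fact, binding):
    """Extend `binding` so that `atom` equals `fact`; return None on failure."""
    if atom[0] != fact[0] or len(atom) != len(fact):
        return None
    binding = dict(binding)
    for term, value in zip(atom[1:], fact[1:]):
        if term.startswith("?"):
            if binding.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return binding

def substitute(atom, binding):
    """Replace the variables of `atom` by their bound values."""
    return (atom[0],) + tuple(binding.get(t, t) for t in atom[1:])

def materialise(rules, facts):
    """Apply the rules until no new fact can be derived (naive evaluation)."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            bindings = [{}]
            for atom in body:  # join the body atoms one by one
                bindings = [b2 for b in bindings for f in facts
                            if (b2 := match(atom, f, b)) is not None]
            for b in bindings:
                new_fact = substitute(head, b)
                if new_fact not in facts:
                    facts.add(new_fact)
                    changed = True
    return facts

rules = [
    (("Student", "?x"), [("GradStudent", "?x")]),  # GradStudent(x) -> Student(x)
    (("Person", "?x"), [("Student", "?x")]),       # Student(x) -> Person(x)
]
closure = materialise(rules, {("GradStudent", "alice")})
```

Once the closure is computed, answering an atomic query such as Person(x) is a lookup over the materialised facts.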
Incomplete Reasoning
• RL / EL reasoning w.r.t. an arbitrary OWL ontology $O$, dataset $D$ and query $q$ gives (in general) an incomplete answer $L$:
$L = \mathrm{cert}(q, \langle O', D \rangle) \subseteq \mathrm{cert}(q, \langle O, D \rangle)$ with $O \models O'$
✓ Profile-specific reasoning via Datalog is (relatively) scalable
✗ Answers may be incomplete
✗ Degree of incompleteness unknown
✗ Incompleteness may be pathological (empty answers)
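A small worked example may help to see how the lower bound can come out empty. The concept names $A$, $B$, $C$, $E$ and the individual $a$ are hypothetical. The profile fragment $O'$ drops the disjunctive axiom, so it no longer entails $E(a)$:

```latex
\begin{align*}
O  &= \{\, A \sqsubseteq B \sqcup C,\;\; B \sqsubseteq E,\;\; C \sqsubseteq E \,\} \\
O' &= \{\, B \sqsubseteq E,\;\; C \sqsubseteq E \,\} \qquad (O \models O') \\
D  &= \{\, A(a) \,\}, \qquad q(x) = E(x) \\
\mathrm{cert}(q, \langle O', D \rangle) &= \emptyset \;\subseteq\; \{a\} = \mathrm{cert}(q, \langle O, D \rangle)
\end{align*}
```

Under $O$, every model makes $a$ an instance of $B$ or of $C$, and hence of $E$ in both cases, so $a$ is a certain answer that the Datalog fragment alone misses.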
The idea behind PAGOdA
Redistribute the reasoning workload between a Datalog reasoner and a fully-fledged OWL 2 reasoner
Resort to expensive OWL 2 reasoning as little as possible (if at all)
Ensure sound and complete answers
Do not restrict the ontology language
Step 1: Lower and Upper Bounds
[Figure: Ontology, Query and Data are passed to a Datalog Engine, which produces the Lower (ELHO Lower) and Upper bounds]
Profile-specific reasoning via Datalog gives a lower bound
$L$ gives a subset of $\mathrm{cert}(q, \langle O, D \rangle)$
We transform $O$ into a strictly stronger Datalog ontology $O^u$
• Normalise the ontology into Datalog$^{\pm,\vee}$ rules
• Eliminate ∨ by transforming to ∧
• Replace existential variables with Skolem constants
Datalog reasoning w.r.t. $O^u$ gives the upper bound answer $U$
$\mathrm{cert}(q, \langle O, D \rangle) \subseteq \mathrm{cert}(q, \langle O^u, D \rangle) = U$
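The two strengthening steps listed above (turning disjunction into conjunction and replacing existential variables by Skolem constants) can be sketched in Python as follows. This is a simplified illustration under an assumed rule encoding, not PAGOdA's actual code; the function name `strengthen` and the `sk_` naming scheme for Skolem constants are hypothetical.

```python
def strengthen(rules):
    """Turn rules with disjunctive or existential heads into plain Datalog rules.

    Each input rule is (body, disjuncts): `body` is a list of atoms and
    `disjuncts` is a list of alternatives, each alternative a list of atoms.
    Arguments starting with "?" are variables (an assumption of this sketch).
    """
    datalog = []
    for index, (body, disjuncts) in enumerate(rules):
        body_vars = {t for atom in body for t in atom[1:] if t.startswith("?")}
        for disjunct in disjuncts:      # "or" becomes "and": keep every alternative
            for atom in disjunct:
                args = tuple(
                    # a head variable not bound in the body is existential:
                    # replace it by one fixed Skolem constant per rule and variable
                    f"sk_{index}_{t[1:]}" if t.startswith("?") and t not in body_vars else t
                    for t in atom[1:]
                )
                datalog.append(((atom[0],) + args, body))
    return datalog

rules = [
    # Student(x) -> Undergrad(x) or Grad(x)
    ([("Student", "?x")], [[("Undergrad", "?x")], [("Grad", "?x")]]),
    # Student(x) -> exists y. hasAdvisor(x, y) and Professor(y)
    ([("Student", "?x")], [[("hasAdvisor", "?x", "?y"), ("Professor", "?y")]]),
]
datalog_rules = strengthen(rules)
```

Every rule produced this way derives at least what the original rule entails, which is why materialising the result over-approximates the certain answers.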
Step 2: Module extraction
Checking possible answers in U \ L is expensive
Compute a fragment of ontology + data sufficient to check each answer in U \ L
Fragment computation involves proof tracing in $O^u$
Achieved also using Datalog materialisation
Relevant fragments are typically much smaller
Size of the problem substantially reduced
[Figure: the Datalog Engine takes D and U and produces the Fragment]
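The idea of proof tracing can be illustrated with a backward traversal over recorded derivations. Note that this sketch swaps in a direct graph walk for the Datalog-based encoding the slides refer to; the `derivations` map, the rule names and the function `extract_fragment` are assumptions made for the example.

```python
def extract_fragment(goal, derivations, data):
    """Collect the rules and data facts that take part in some derivation of `goal`.

    `derivations` maps each derived fact to a list of (rule_name, premises)
    pairs, as might be recorded while materialising the upper-bound program.
    """
    rules_used, facts_used, seen, stack = set(), set(), set(), [goal]
    while stack:
        fact = stack.pop()
        if fact in seen:
            continue
        seen.add(fact)
        if fact in data:                  # an explicit data fact is part of the fragment
            facts_used.add(fact)
        for rule_name, premises in derivations.get(fact, []):
            rules_used.add(rule_name)     # the rule participates in a proof of the goal
            stack.extend(premises)        # trace its premises in turn
    return rules_used, facts_used

data = {("A", "a"), ("R", "a", "b"), ("C", "c")}
derivations = {
    ("B", "a"): [("r1", [("A", "a")])],
    ("D", "a"): [("r2", [("B", "a"), ("R", "a", "b")])],
    ("E", "c"): [("r3", [("C", "c")])],
}
rules_used, facts_used = extract_fragment(("D", "a"), derivations, data)
```

In this toy input the rule `r3` and the fact C(c) play no part in deriving D(a), so they are left out of the fragment handed to the full reasoner.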
Step 3: Summarisation
[Figure: Summarisation turns the Fragment into a Summary, which the Full Reasoner checks against the query Q]
Further reduce problem size by summarising the fragment
• Technique introduced by the SHER team at IBM
• “Merge” constants that are instances of the same concepts
• Check answers against the summary using the OWL 2 reasoner
• The summary of the fragment is typically very small
This is an orthogonal over-approximation to the previous ones
We further reduce the size of U \ L
Sometimes we even make it empty!
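The merging step can be sketched as below. This is only a rough reading of the bullet "merge constants that are instances of the same concepts"; the actual SHER technique has further details not covered by the slides, and the function `summarise` and the fact encoding are assumptions.

```python
from collections import defaultdict

def summarise(facts):
    """Merge constants that are instances of exactly the same set of concepts."""
    concepts = defaultdict(set)
    constants = set()
    for fact in facts:
        constants.update(fact[1:])
        if len(fact) == 2:                # unary fact, i.e. a concept assertion
            concepts[fact[1]].add(fact[0])
    representative, mapping = {}, {}
    for c in sorted(constants):           # sorted only to make the result deterministic
        key = frozenset(concepts[c])
        mapping[c] = representative.setdefault(key, c)
    # rewrite every fact over the representatives; duplicates collapse in the set
    summary = {(f[0],) + tuple(mapping[t] for t in f[1:]) for f in facts}
    return summary, mapping

facts = {
    ("Student", "a"), ("Student", "b"), ("Professor", "p"),
    ("advises", "p", "a"), ("advises", "p", "b"),
}
summary, mapping = summarise(facts)
```

Because the summary is a homomorphic image of the fragment, a candidate whose representative is not an answer over the summary can be discarded without checking it against the fragment itself.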
Step 4: Dependency analysis
[Figure: Dependency Analysis over the fragment F, followed by the Full Reasoner with Q and F, producing the Output]
Group remaining candidate answers
• If a and b are in the same group then a is an answer iff b is
• We can also establish dependencies between groups
Check group representatives against the fragment using the fully-fledged reasoner.
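The saving from grouping can be sketched as follows. How groups are actually computed is not described in the slides, so `group_key` is a hypothetical stand-in for that analysis, and `is_answer` is a hypothetical callback standing in for a call to the fully-fledged reasoner.

```python
def check_candidates(candidates, group_key, is_answer):
    """Check one representative per group and propagate the verdict to the group."""
    groups = {}
    for c in candidates:
        groups.setdefault(group_key(c), []).append(c)
    confirmed, calls = [], 0
    for members in groups.values():
        calls += 1                         # one expensive reasoner call per group
        if is_answer(members[0]):          # a is an answer iff b is, within a group
            confirmed.extend(members)
    return confirmed, calls

# Toy stand-ins: group by first letter, and pretend only "a..." candidates are answers.
confirmed, calls = check_candidates(
    ["a1", "a2", "b1"],
    group_key=lambda c: c[0],
    is_answer=lambda c: c.startswith("a"),
)
```

Here three candidates cost two reasoner calls; when all remaining candidates fall into a single group, one call settles the whole gap.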
Features of PAGOdA
PAGOdA provides PAYG query answering for OWL 2:
• Uses Datalog reasoner “out of the box”
• Efficiently computes sound partial answers
• In “easy” cases, efficiently computes complete answers
• In “harder” cases, applies increasingly powerful but less scalable reasoning techniques as needed to completely answer the query
• The last step involving the full reasoner is rarely needed in practice
• Recent improvements
  • Better and better upper bounds
  • Smaller and smaller modules
Queries answered by each technique

          LUBM  UOBM  FLY  DBPedia  NPD
Total       24    15    6      441  329
Bounds      22    12    5      439  326
Sum         22    14    5      440  329
Full        24    15    6      441  329

Scalability for lower and upper bound computation

           Importing  Lower Mat  Upper Mat  Ave QA
LUBM1000        313s       190s       269s     12s
UOBM500         356s       346s       734s      4s

Queries that require full reasoning

               Lower  Upper  Gap  Sum  Groups
LUBM100_q20        0     26   26   26       1
LUBM100_q22        0     14   14   14       1
UOBM1_q14       6271   6535  264  264       1
FLY_q5             0    344  344  344       1
DBPedia_q404       0      2    2    2       1

Time distribution and fragment size

              Lower  Upper   Frag  Size (%)   Sum    Full
LUBM100_q20    0.2s   0.3s  14.5s  .005/.04  1.2s  190.1s
LUBM100_q22    0.3s   0.2s  10.0s  .005/.04  0.8s   46.1s
UOBM1_q14      0.1s   0.1s   0.7s  .17/.076  0.5s    5.4s
FLY_q5         0.0s   0.0s  16.0s   .34/.01  0.1s    0.2s
PAGOdA Team
• Yujiao Zhou
• Yavor Nenov
• Bernardo Cuenca Grau
• Ian Horrocks