First talk where I introduce the semantic index technique for query answering with inferences, the T-mappings technique (a mapping transformation/optimisation technique to avoid exponential blow-ups during query rewriting) and the role of dependencies in query answering by query rewriting.
Dependencies: Making Ontology Based Data Access Work in Practice
Mariano Rodriguez-Muro and Diego Calvanese
{rodriguez,calvanese}@inf.unibz.it
KRDB Research Centre
Free University of Bozen-Bolzano
May 11, 2011
Rodriguez-Muro and Calvanese (UNIBZ) Dependencies and OBDA May 11, 2011 1 / 33
The context
DL Ontologies
Description Logics:
• Formalisms for knowledge representation.
• Decidable fragments of FOL
• Base of OWL
• World is described by means of Concepts and Roles
Ontologies
• Intensional knowledge: TBox T.
• Extensional knowledge: ABox A.
OBDA with DL-Lite
A family of light-weight ontology languages
• DL-Lite_F concepts: B ::= A | ∃R
• DL-Lite_F roles: R ::= P | P⁻
• DL-Lite_F TBoxes: B ⊑ B | B ⊑ ¬B | (funct R)
• DL-Lite_F ABoxes: A(a) | R(a, b)
Query Answering
TBox: Man ⊑ Person, Woman ⊑ Person, Person ⊑ ∃hasFather, ∃hasFather⁻ ⊑ Person
ABox: Man(mariano)
Queries: q(x) ← Person(x), hasFather(x, y), Person(y)
Problem: Compute the certain answers of Q, denoted cert(Q, O).
The promise
We can do this as efficiently as answering DB queries, also in the virtual setting.
Query Answering with PerfectRef (2005)
Query: q(x) ← Person(x), hasFather(x, y), Person(y)
Reformulation:
q(x) ← Person(x), hasFather(x, y), Person(y)
q(x) ← Person(x), hasFather(x, y), hasFather(z, y)
q(x) ← Person(x), hasFather(x, y)
q(x) ← Person(x), Person(x)
q(x) ← Person(x)
q(x) ← Person(x), hasFather(x, y), Man(y)
q(x) ← Person(x), hasFather(x, y), Woman(y)
q(x) ← hasFather(x, m), hasFather(x, y), Person(y)
q(x) ← hasFather(x, m), hasFather(x, y), hasFather(z, y)
q(x) ← hasFather(x, m), hasFather(x, y)
q(x) ← hasFather(x, m), Person(x)
q(x) ← hasFather(x, m), hasFather(x, t)
q(x) ← hasFather(x, m)
q(x) ← hasFather(x, m), hasFather(x, y), Man(y)
q(x) ← hasFather(x, m), hasFather(x, y), Woman(y)
q(x) ← Man(x), hasFather(x, y), Person(y)
q(x) ← Man(x), hasFather(x, y), hasFather(y, z)
q(x) ← Man(x), hasFather(x, y), Man(y)
q(x) ← Man(x), hasFather(x, y), Woman(y)
q(x) ← Woman(x), hasFather(x, y), Person(y)
q(x) ← Woman(x), hasFather(x, y), hasFather(y, z)
q(x) ← Woman(x), hasFather(x, y), Man(y)
q(x) ← Woman(x), hasFather(x, y), Woman(y)
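A reformulation like the one above grows because each query atom can be replaced using any applicable TBox inclusion. The following is a minimal Python sketch of that single atom-rewriting step only; it is not the full PerfectRef algorithm (which also unifies and reduces atoms). The tuple encoding of basic concepts and the names `atom_for` and `rewrite_atom` are assumptions made for illustration, not part of the talk.

```python
# Illustrative encoding of basic concepts (an assumption, not from the slides):
#   ("concept", "Person")        stands for Person
#   ("exists", "hasFather")      stands for ∃hasFather
#   ("exists_inv", "hasFather")  stands for ∃hasFather⁻
TBOX = [
    (("concept", "Man"), ("concept", "Person")),           # Man ⊑ Person
    (("concept", "Woman"), ("concept", "Person")),         # Woman ⊑ Person
    (("concept", "Person"), ("exists", "hasFather")),      # Person ⊑ ∃hasFather
    (("exists_inv", "hasFather"), ("concept", "Person")),  # ∃hasFather⁻ ⊑ Person
]

def atom_for(basic, var):
    """Build the query atom stating that `var` belongs to a basic concept."""
    kind, name = basic
    if kind == "concept":
        return (name, var)
    if kind == "exists":
        return (name, var, "_")   # "_" marks an unbound variable
    return (name, "_", var)       # exists_inv

def rewrite_atom(atom, tbox):
    """Return the atoms obtained by applying one inclusion B1 ⊑ B2 to `atom`."""
    if len(atom) == 2:            # concept atom A(x)
        target, var = ("concept", atom[0]), atom[1]
    elif atom[2] == "_":          # role atom P(x, _)
        target, var = ("exists", atom[0]), atom[1]
    elif atom[1] == "_":          # role atom P(_, x)
        target, var = ("exists_inv", atom[0]), atom[2]
    else:
        return []                 # both arguments bound: no inclusion applies
    return [atom_for(lhs, var) for lhs, rhs in tbox if rhs == target]

print(rewrite_atom(("Person", "y"), TBOX))
print(rewrite_atom(("hasFather", "x", "_"), TBOX))
```

Applying such steps to every atom of every generated query, and repeating until nothing new appears, is what makes the number of conjunctive queries in the union grow so quickly.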
Alternatives
• Improved version of PerfectRef (2007-2011)
• RQR (Pérez-Urbina et al., 2007)
Too many unions, cannot execute!
• PRESTO (Rosati et al., 2010)
Better, but eventually it breaks.
• Combined Approach (Kontchakov et al., 2010)
Fast, but too much data and too much time.
What can we do?
?
Query Answering: It is not only about existential constants
Query: q(x, y) ← Person(x), hasFather(x, y), Person(y)
Reformulation:
q(x, y) ← Person(x), hasFather(x, y), Person(y)
q(x, y) ← Person(x), hasFather(x, y), hasFather(z, y)
q(x, y) ← Person(x), hasFather(x, y), Man(y)
q(x, y) ← Person(x), hasFather(x, y), Woman(y)
q(x, y) ← hasFather(x, m), hasFather(x, y), Person(y)
q(x, y) ← hasFather(x, m), hasFather(x, y), hasFather(z, y)
q(x, y) ← hasFather(x, m), hasFather(x, y), Man(y)
q(x, y) ← hasFather(x, m), hasFather(x, y), Woman(y)
q(x, y) ← Man(x), hasFather(x, y), Person(y)
q(x, y) ← Man(x), hasFather(x, y), hasFather(z, y)
q(x, y) ← Man(x), hasFather(x, y), Man(y)
q(x, y) ← Man(x), hasFather(x, y), Woman(y)
q(x, y) ← Woman(x), hasFather(x, y), Person(y)
q(x, y) ← Woman(x), hasFather(x, y), hasFather(z, y)
q(x, y) ← Woman(x), hasFather(x, y), Man(y)
q(x, y) ← Woman(x), hasFather(x, y), Woman(y)
The full picture: Ontology Based Data Access
[Figure: OBDA architecture with the labels User, Queries, Ontology, Mappings and Source.]
To deal with OBDA we need to consider:
• If in the backend we have RDBMSs, we cannot go beyond their capabilities.
• All systems are composed of a TBox T, a database D = 〈R, I〉 (schema R and instance I) and mappings M.
First Observation: Is my data complete?
Completeness of A
The TBox says: Manager ⊑ Employee
In the ABox: all Managers are already employees.
In any realistic scenario:
• We don't use arbitrary sources;
• Intersection of semantics is reflected in completeness (e.g., no need to chase, expand or rewrite)
• This happens a lot!
Keyword
Redundancy
Second Observation: There are no ABoxes
THERE ARE NO ABOXES!
Any ontology-based query answering system today:
• Uses relational DBs to store the ABox data;
• In such D, both R and I can be manipulated;
• Implementors may choose any M for their system.
Opportunity
To complete an ABox we can do more than expansion.
How to approach the problem: Two-level approach
How to approach OBDA in practice?
• Efficient ways to deal with redundancy due to completeness.
• Efficient ways to complete (virtual) ABoxes.
Contributions: Dealing with redundancy
Characterizing completeness
ABox Dependencies
Definition
An assertion B1 ⊑_A B2 that restricts valid ABoxes.
Syntax: B1 ⊑_A B2
Semantics: A ⊨ Manager ⊑_A Employee if Manager(x) ∈ A implies Employee(x) ∈ A.
ABox dependencies are fundamentally different from TBox assertions. Think open world.
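The semantics of a dependency is a plain containment check on what is explicitly stored. A minimal Python sketch, assuming an ABox represented as a set of (concept, individual) pairs and restricted to atomic concepts; the function name `satisfies` and the individuals are hypothetical:

```python
def satisfies(abox, lhs, rhs):
    """Check A ⊨ lhs ⊑_A rhs for atomic concepts: every individual explicitly
    asserted to be in `lhs` is also explicitly asserted to be in `rhs`."""
    def members(concept):
        return {ind for c, ind in abox if c == concept}
    return members(lhs) <= members(rhs)

# Hypothetical ABoxes: the first one is complete w.r.t. Manager ⊑ Employee.
complete = {("Manager", "ana"), ("Employee", "ana"), ("Employee", "luis")}
incomplete = {("Manager", "ana"), ("Employee", "luis")}

print(satisfies(complete, "Manager", "Employee"))    # True
print(satisfies(incomplete, "Manager", "Employee"))  # False
```

In the second ABox the TBox assertion Manager ⊑ Employee still entails Employee(ana) under open-world semantics, but the dependency does not hold because the fact is not stored. This is the sense in which the two kinds of assertion differ.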
Where to deal with redundancy?
Given a TBox T, an ABox A, a set of dependencies Σ and a query Q, what do we do?
Available Options:
• Optimize the query reformulation algorithm to deal with Σ.
• Optimize the TBox T with respect to Σ.
When is an assertion redundant?
Direct Redundancy: Case 1
Let T imply the following hierarchy:
[Diagram: hierarchy over ∃hasFather, Person and Human]
Redundant if Σ is:
[Diagram: dependencies over ∃hasFather, Person and Human]
Σ says: hasFather(mariano, ramon) ∈ A → Human(mariano) ∈ A.
When is an assertion redundant?
Direct Redundancy: Case 2
Let T be the following TBox:
[Diagram: TBox over Person, ∃hasFather⁻, ∃hasFather and Man]
Redundant if Σ is:
[Diagram: dependencies over Person, ∃hasFather⁻, ∃hasFather and Man]
Σ says: Man(ramon) ∈ A → there is some a′ such that hasFather(ramon, a′) ∈ A and Person(a′) ∈ A.
When is an assertion redundant?
Indirect Redundancy
Let T be the following TBox:
[Diagram: TBox over Animal, Man and Human]
Redundant if Σ is:
[Diagram: dependencies over Animal, Man and Human]
Σ says: Man(mariano) ∈ A → Animal(mariano) ∈ A.
Formalization: Redundancy
Given a TBox T and a set of dependencies Σ over T, the optimized version of T w.r.t. Σ, denoted optim(T, Σ), is the set of inclusion assertions
{α ∈ sat(T) | α is not redundant in sat(T) w.r.t. sat(Σ)}
We can compute optim(T, Σ) in linear time.
Contributions: Completing ABoxes
General considerations
OBDA systems have no ABoxes; instead they have virtual ABoxes V = 〈D, M〉 with D = 〈R, I〉.
If we want V ⊨ A ⊑_A B, we make sure that the mappings for B include all the data coming from the mappings of A.
Trade-off:
• Degree of completeness (# of dependencies),
• Cost of the procedure,
• Performance of query answering.
We can complete virtual ABoxes up to B ⊑ ∃R without the need for new data.
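The idea of making the mappings for B include the data of the mappings for A can be sketched as follows. This is a simplified illustration under stated assumptions, not the T-mappings procedure itself: mappings are a dictionary from concept names to lists of SQL source queries, only atomic concept inclusions are handled, and the table names (`managers`, `staff`) and the function `complete_mappings` are hypothetical.

```python
def complete_mappings(mappings, inclusions):
    """Propagate source queries upwards along inclusions A ⊑ B until the
    mappings of every B contain all source queries of each A below it."""
    completed = {concept: list(queries) for concept, queries in mappings.items()}
    changed = True
    while changed:
        changed = False
        for sub, sup in inclusions:
            target = completed.setdefault(sup, [])
            for query in completed.get(sub, []):
                if query not in target:
                    target.append(query)
                    changed = True
    return completed

# Hypothetical source tables and mappings (not from the talk).
mappings = {
    "Manager": ["SELECT id FROM managers"],
    "Employee": ["SELECT id FROM staff"],
}
inclusions = [("Manager", "Employee"), ("Employee", "Person")]

for concept, queries in complete_mappings(mappings, inclusions).items():
    print(concept, "<-", " UNION ".join(queries))
```

After this step the virtual ABox satisfies Manager ⊑_A Employee and Employee ⊑_A Person without touching the data, at the price of larger mappings. This is one side of the trade-off listed above.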
Semantic Index for OBDA
General Idea
• Encode the semantics of T in numeric indexes and ranges for concept names and roles.
• Store the ABox in the database using those indexes and ranges.
• Make mappings for the system that take the ranges into account.
We can do this by using the implied hierarchy of T to generate the indexes and ranges!
Semantic Index Example
T = {B ⊑ A, C ⊑ A, C ⊑ D}
Indexes and ranges assigned from the hierarchy:
• A: index 1, ranges {(1, 3)}
• B: index 2, ranges {(2, 2)}
• C: index 3, ranges {(3, 3)}
• D: index 4, ranges {(3, 4)}
We create a table TC with constant and idx columns. To insert the data we use the indexes, e.g., if B(mariano) ∈ A then we put (mariano, 2) ∈ TC.
We create the mappings using the ranges, e.g.,
SELECT constant FROM TC WHERE idx >= 1 AND idx <= 3 ⇝ A(constant)
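The example can be reproduced end to end with an in-memory SQLite database. The sketch below takes the index assignment from the slide as given (how indexes are chosen in general is not shown), derives the ranges by merging the indexes of all subsumed concepts into contiguous intervals, and answers a concept with a range query. The helper names and the individuals `diego` and `ana` are assumptions added for illustration.

```python
import sqlite3

# Index assignment and TBox taken from the example above.
INDEX = {"A": 1, "B": 2, "C": 3, "D": 4}
TBOX = [("B", "A"), ("C", "A"), ("C", "D")]   # (sub, super) pairs

def descendants(concept):
    """All concepts subsumed by `concept`, including itself."""
    found, stack = {concept}, [concept]
    while stack:
        current = stack.pop()
        for sub, sup in TBOX:
            if sup == current and sub not in found:
                found.add(sub)
                stack.append(sub)
    return found

def ranges(concept):
    """Merge the indexes of all descendants into contiguous intervals."""
    idxs = sorted(INDEX[c] for c in descendants(concept))
    merged = [[idxs[0], idxs[0]]]
    for i in idxs[1:]:
        if i == merged[-1][1] + 1:
            merged[-1][1] = i
        else:
            merged.append([i, i])
    return [tuple(r) for r in merged]

def members(conn, concept):
    """Answer concept(x) with one range condition per interval."""
    intervals = ranges(concept)
    where = " OR ".join("(idx >= ? AND idx <= ?)" for _ in intervals)
    params = [bound for interval in intervals for bound in interval]
    rows = conn.execute(
        f"SELECT DISTINCT constant FROM TC WHERE {where} ORDER BY constant", params
    )
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TC (constant TEXT, idx INTEGER)")
# ABox: B(mariano), C(diego), D(ana); each assertion is stored exactly once.
conn.executemany(
    "INSERT INTO TC VALUES (?, ?)",
    [("mariano", INDEX["B"]), ("diego", INDEX["C"]), ("ana", INDEX["D"])],
)

print(ranges("A"), members(conn, "A"))
print(ranges("D"), members(conn, "D"))
```

No inferred facts are materialised: A(mariano) and A(diego) are retrieved only because the range (1, 3) covers the indexes of B and C.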
Experimentation I
The Resource Index features:
• Search over 22 document collections
• Semantics given by the hierarchies of 200 ontologies (SNOMED, GO)
Implementation in a nutshell:
(i) Understand documents with natural language processing and annotate
Cervical Cancer('doc224')
(ii) Expand the ABox
(iii) Pose queries that retrieve documents as
q(x)← A1(x) ∧ · · · ∧ An(x)
Experimentation II
The challenge:
• ≈ 3 million concepts and ≈ 2.5 million is-a assertions
• Split second responses
• 150 GB of data
• Expansion data: 1.5 TB
The experimentation data:
• ClinicalTrials.gov (CT)
• 181 million assertions (≈ 14 GB of data, ≈ 140 GB when expanded)
Results
The query:
q(x)← DNA Repair Gene(x) ∧ Antigen Gene(x) ∧ Cancer Gene(x)
Results:
• Traditional reformulation: union of 467874 SQL SPJ queries;
• Semantic Index: 1 SQL query; execution 3.582s (0.082s if warm); time to compute the semantic index: 1 min; size of data: +≈ 4 GB.
• ABox expansion: 1 SQL query; execution 3s (0.6s if warm); expansion time ≈ 7 days; size of data: +≈ 126 GB.
The Query
The query:
q(x) ← DNA Repair Gene(x) ∧ Antigen Gene(x) ∧ Cancer Gene(x)
SELECT DISTINCT r0.element_id AS element_id
FROM RESOURCE_INDEX.CT_ANN r0
JOIN RESOURCE_INDEX.CT_ANN r1
  ON r0.element_id = r1.element_id
JOIN RESOURCE_INDEX.CT_ANN r2
  ON r1.element_id = r2.element_id
WHERE (r0.idx >= 1783559 AND r0.idx <= 1783657)
  AND (r1.idx >= 1782996 AND r1.idx <= 1783029)
  AND (r2.idx >= 1783115 AND r2.idx <= 1783253);
This is a standard SQL query, a self-join of one table with range conditions on idx, which any DBMS can evaluate efficiently.
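As a consistency check, the widths of the three idx ranges in this query multiply to exactly the number of SPJ queries reported for traditional reformulation. The sketch below assumes that each index inside a range stands for one concept subsumed by the corresponding query atom, so that reformulation has to enumerate every combination of subconcepts. That reading is an interpretation; the slides do not state it.

```python
from math import prod

# (low, high) idx bounds copied from the WHERE clause of the SQL query.
ranges = [(1783559, 1783657), (1782996, 1783029), (1783115, 1783253)]

# Number of indexes covered by each inclusive range.
widths = [high - low + 1 for low, high in ranges]

# One SPJ query per combination of subconcepts, one subconcept per atom.
union_size = prod(widths)

print(widths, union_size)
```

Under that assumption, the single range query replaces a union whose size is the product of the number of subconcepts of each atom.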
Conclusions
Contributions
• We indicated that efficient OBDA requires taking into account more than just T, A, and Q.
• We provided means to deal with redundancy at the level of the TBox.
• We showed that expansion is not necessary: ABoxes can be completed without it.
• We presented two efficient ways to complete ABoxes, one for the general OBDA setting and one for the virtual setting.
Future work
• Exploring more expressive languages.
• Exploring the RDFS/SPARQL setting.
• Handling updates of T and A.
Extra examples
First Observation (cont.)
Mappings will introduce dependencies over ABoxes
Let R be a DB schema with the relation schema employee with attributes id, dept, and salary. Let M be the following mappings:
SELECT id, dept FROM employee
⇝ q(id, dept) ← Employee(id) ∧ WORKS-FOR(id, dept)
SELECT id, dept FROM employee WHERE salary > 1000
⇝ q(id, dept) ← Manager(id) ∧ MANAGES(id, dept)
Then for any instance I, if Manager(John) ∈ A, we have that Employee(John) ∈ A.
This is an indicator of completeness of all ABoxes A for M and R, e.g., A is complete w.r.t. Manager ⊑_A Employee.
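The observation can be tried out on a small instance with Python and sqlite3. The rows, the helper virtual_abox, and the encoding of a mapping as a (SQL, concept, role) triple are assumptions made for this sketch; only the schema and the two SQL queries are taken from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id TEXT, dept TEXT, salary INTEGER)")
# Hypothetical database instance.
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                 [("John", "sales", 2500), ("Mary", "sales", 900), ("Ann", "hr", 1200)])

# Each mapping: SQL source query, target concept, target role.
mappings = [
    ("SELECT id, dept FROM employee", "Employee", "WORKS-FOR"),
    ("SELECT id, dept FROM employee WHERE salary > 1000", "Manager", "MANAGES"),
]

def virtual_abox(conn, mappings):
    """Materialise the ABox that the mappings induce over the database."""
    concepts, roles = {}, {}
    for sql, concept, role in mappings:
        for emp_id, dept in conn.execute(sql):
            concepts.setdefault(concept, set()).add(emp_id)
            roles.setdefault(role, set()).add((emp_id, dept))
    return concepts, roles

concepts, roles = virtual_abox(conn, mappings)

# Every Manager is also an Employee, and every MANAGES pair is a WORKS-FOR pair.
print(concepts["Manager"] <= concepts["Employee"])
print(roles["MANAGES"] <= roles["WORKS-FOR"])
```

The check covers a single instance only. The dependency holds for every instance because the second SQL query adds a filter to the first, so its answers are always contained in those of the first.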
Formalization: Chains
Let T be a TBox, B, C basic concepts, and Σ a set of dependencies over T. A T-chain from B to C in T (resp., a Σ-chain from B to C in Σ) is a sequence of concept inclusion assertions (B_i ⊑ B'_i), i = 0, ..., n, in T (resp., a sequence of inclusion dependencies (B_i ⊑_A B'_i), i = 0, ..., n, in Σ), for some n ≥ 0, such that:
1. B_0 = B, B'_n = C, and
2. for 1 ≤ i ≤ n, we have that B'_{i−1} and B_i are basic concepts s.t. either
(i) B'_{i−1} = B_i, or
(ii) B'_{i−1} = ∃R and B_i = ∃R⁻, for some basic role R.
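The two conditions translate into a small checker, sketched here in Python. The encoding is an assumption of this sketch: an atomic concept is a string, a basic role is a pair (name, is_inverse), and ∃R is the tuple ("exists", R). The function names and the sample TBox are hypothetical.

```python
def inverse(role):
    """R -> R⁻ for a basic role encoded as (name, is_inverse)."""
    name, is_inverse = role
    return (name, not is_inverse)

def is_exists(concept):
    return isinstance(concept, tuple) and concept[0] == "exists"

def links(prev_rhs, next_lhs):
    """Condition 2: B'_{i-1} = B_i, or B'_{i-1} = ∃R and B_i = ∃R⁻."""
    if prev_rhs == next_lhs:
        return True
    return (is_exists(prev_rhs) and is_exists(next_lhs)
            and next_lhs[1] == inverse(prev_rhs[1]))

def is_chain(inclusions, start, end, tbox):
    """Check that a list of (B_i, B'_i) pairs is a chain from start to end in tbox."""
    if not inclusions or any(inc not in tbox for inc in inclusions):
        return False
    # Condition 1: B_0 = start and B'_n = end.
    if inclusions[0][0] != start or inclusions[-1][1] != end:
        return False
    return all(links(inclusions[i - 1][1], inclusions[i][0])
               for i in range(1, len(inclusions)))

# Sample TBox: A ⊑ ∃P, ∃P⁻ ⊑ B, C ⊑ B.
exists_p = ("exists", ("P", False))
exists_p_inv = ("exists", ("P", True))
tbox = {("A", exists_p), (exists_p_inv, "B"), ("C", "B")}

print(is_chain([("A", exists_p), (exists_p_inv, "B")], "A", "B", tbox))
print(is_chain([("A", exists_p), ("C", "B")], "A", "B", tbox))
print(is_chain([("C", "B")], "C", "B", tbox))
```

The same function checks Σ-chains if the set of inclusion dependencies of Σ is passed in place of the TBox.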
Formalization: Redundancy
Let T be a TBox, B, C basic concepts, and Σ a set of dependencies. The concept inclusion assertion B ⊑ C is directly redundant in T w.r.t. Σ if
(i) Σ ⊨ B ⊑_A C, and
(ii) for every T-chain (B_i ⊑ B'_i), i = 0, ..., n, with B'_n = B in T, there is a Σ-chain (B_i ⊑_A B'_i), i = 0, ..., n.
Then, B ⊑ C is redundant in T w.r.t. Σ if
(a) it is directly redundant, or
(b) there exists B' ≠ B s.t.
(i) T ⊨ B' ⊑ C,
(ii) B' ⊑ C is not redundant in T w.r.t. Σ, and
(iii) B ⊑ B' is directly redundant in T w.r.t. Σ.