An Efficient Method for Computing Alignment Diagnoses
Christian Meilicke, Heiner Stuckenschmidt
University of Mannheim, Lehrstuhl für Künstliche Intelligenz
{christian, heiner}@informatik.uni-mannheim.de
Computing a local optimal diagnosis 2
Problem Statement
• Automatically and manually (!) generated ontology alignments are often incoherent
– See the OAEI 2008 results of the conference track
• => Incoherent alignments are a problem in many application scenarios*
– Instance migration results in inconsistent ontologies
– Query translation results in 'a priori' empty result sets
• Goal: find a very efficient way to repair incoherent alignments automatically, because …
– 'Agents on the web' require coherent alignments on the fly
– Large ontologies require efficient algorithms
* C. Meilicke and H. Stuckenschmidt. Incoherence as a Basis for Measuring the Quality of Ontology Mappings. OM-08.
Outline
• Alignment Semantics: incoherence of an alignment, MIPS alignments
• Alignment Diagnosis: diagnosis, minimal hitting set, local optimal diagnosis
• Computing a Local Optimal Diagnosis (LOD): brute-force LOD and efficient LOD
• Experimental Results: runtime, quality of the diagnosis
"Natural" Semantics
An alignment A and two ontologies O1 and O2:
<1#Person, 2#Person, =, 0.98>
<1#hasName, 2#name, =, 0.87>
<1#writtenBy, 2#docWrittenBy, =, 0.7>
<1#authorOf, 2#hasWritten, =, 0.56>
<1#firstAuthor, 2#Author, ⊑, 0.56>
Merged ontology O1 ∪A O2: the correspondences are read as axioms, e.g.
1#Person ≡ 2#Person
1#firstAuthor ⊑ 2#Author
…
Incoherence of an Alignment
Definition: Incoherence of an Alignment
An alignment A between ontologies O1 and O2 is incoherent iff there exists a concept i#C or property i#R that is satisfiable in Oi, i ∈ {1,2}, but unsatisfiable in O1 ∪A O2.
(The satisfiability of a property i#R can be reduced to the satisfiability of the concept ∃i#R.⊤.)
Definition: MIPS Alignment (minimal conflict set)
Let A be an incoherent alignment between ontologies O1 and O2.
A subalignment M ⊆ A is a MIPS alignment (= minimal incoherence preserving subalignment) iff M is incoherent and there exists no M' ⊂ M such that M' is incoherent.
"Terminology"
[Figure legend: a correspondence; an alignment; an alignment as a sequence ordered by confidence; MIPS depicted by red dotted links; an alignment with MIPS shown as subsets]
Alignment Diagnosis
Definition: Alignment Diagnosis
An alignment ∆ ⊆ A is an alignment diagnosis for O1 and O2 iff A \ ∆ is coherent with respect to O1 and O2 and, for each ∆' ⊂ ∆, the alignment A \ ∆' is incoherent with respect to O1 and O2.
Proposition: Alignment Diagnosis and Minimal Hitting Sets
An alignment ∆ ⊆ A is an alignment diagnosis for O1 and O2 iff ∆ is a minimal hitting set over all MIPS in A.
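The proposition can be made concrete with a small sketch. It assumes the MIPS are already known and given as plain Python sets of correspondence identifiers; the function names are hypothetical and not taken from the talk.

```python
def is_hitting_set(delta, mips_list):
    """True if delta shares at least one correspondence with every MIPS."""
    return all(delta & mips for mips in mips_list)


def is_minimal_hitting_set(delta, mips_list):
    """True if delta hits every MIPS and no proper subset of delta does.

    Supersets of hitting sets are hitting sets, so it is enough to
    test the subsets obtained by dropping one single element.
    """
    if not is_hitting_set(delta, mips_list):
        return False
    return all(not is_hitting_set(delta - {c}, mips_list) for c in delta)


# Toy MIPS over correspondence ids (assumed data, for illustration only).
mips_list = [{1, 3}, {3, 4}, {5, 6}]
print(is_minimal_hitting_set({3, 5}, mips_list))     # True: a diagnosis
print(is_minimal_hitting_set({3, 4, 5}, mips_list))  # False: 4 is redundant
print(is_minimal_hitting_set({3}, mips_list))        # False: misses {5, 6}
```

In this toy setting, removing {3, 5} from the alignment breaks every MIPS, and putting either correspondence back restores at least one of them.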
Local Optimal Diagnosis (LOD)
[Figure: alignment ordered from low confidence to high confidence]
Definition: Accused correspondence
A correspondence c ∈ A is accused by A iff there exists a MIPS M in A with c ∈ M such that for all c' ≠ c in M it holds that
• (1) conf(c') > conf(c) and
• (2) c' is not accused by A.
Definition: Local optimal diagnosis (LOD)
The set of all accused correspondences is referred to as the local optimal diagnosis (LOD).
[Callout on the slide: important!]
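The recursive definition above can be evaluated directly if the MIPS are known. The following is a minimal sketch under that assumption: confidences are given as a dict, MIPS as sets of correspondence ids, and all names are hypothetical. Processing correspondences in descending confidence order guarantees that every c' with conf(c') > conf(c) has already been decided when c is examined.

```python
def local_optimal_diagnosis(conf, mips_list):
    """Return the set of accused correspondences.

    conf:      dict mapping correspondence id -> confidence
    mips_list: list of sets of correspondence ids (assumed to be known)
    """
    accused = set()
    for c in sorted(conf, key=conf.get, reverse=True):
        for mips in mips_list:
            if c not in mips:
                continue
            # Conditions (1) and (2) for every other member of this MIPS.
            if all(conf[o] > conf[c] and o not in accused
                   for o in mips if o != c):
                accused.add(c)
                break
    return accused


conf = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.6}
mips_list = [{"a", "b"}, {"b", "c"}, {"c", "d"}]
print(sorted(local_optimal_diagnosis(conf, mips_list)))  # ['b', 'd']
```

In the example, c is not accused: the only MIPS in which it has the lowest confidence is {b, c}, and b is itself accused, so condition (2) fails.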
Algorithm 1
[Animation over correspondences 1 to 10, ordered by confidence. The alignment is extended one correspondence at a time and checked for coherence after each step. Checks shown on the slides: "Coherent? YES!", "YES!", "NO!" (followed by "Now it is!" once a correspondence has been removed), "YES!", "YES!", "NO!" (followed by "Now it is!").]
… continue the same way
Algorithm 1: Result
… and after a few more slides we would end up like this:
[Figure: correspondences 1 to 10]
Note:
• Constructing a local optimal diagnosis, which is a minimal hitting set over all MIPS, took 10 coherence checks
• We have not computed a single MIPS alignment!
First sketch: Meilicke, Völker, Stuckenschmidt. Learning Disjointness for Debugging Mappings between Lightweight Ontologies (EKAW-08)
The relation to belief revision is discussed in: Qi, Ji, Haase. A Conflict-based Operator for Mapping Revision (ISWC-09)
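The brute-force procedure can be sketched as follows. This is an illustration, not the authors' implementation: `is_coherent` stands in for a call to a complete reasoner, and `make_toy_oracle` is a hypothetical helper that simulates such a reasoner from a hand-written list of MIPS so that the example runs on its own.

```python
def brute_force_lod(alignment, is_coherent):
    """Greedy pass over an alignment sorted by descending confidence.

    Each correspondence is added tentatively; if the extended alignment
    is incoherent, the correspondence goes into the diagnosis instead.
    """
    kept, diagnosis = [], []
    for c in alignment:
        if is_coherent(kept + [c]):
            kept.append(c)
        else:
            diagnosis.append(c)
    return kept, diagnosis


def make_toy_oracle(mips_list):
    """Toy stand-in for a reasoner: incoherent iff some MIPS is contained."""
    calls = {"n": 0}

    def is_coherent(subalignment):
        calls["n"] += 1
        present = set(subalignment)
        return not any(set(m) <= present for m in mips_list)

    return is_coherent, calls


oracle, calls = make_toy_oracle([{1, 3}, {2, 5, 7}, {3, 4}])
kept, diagnosis = brute_force_lod(list(range(1, 11)), oracle)
print(diagnosis)   # [3, 7]
print(calls["n"])  # 10 coherence checks, one per correspondence
```

The oracle is only ever asked yes/no questions about subalignments, so no MIPS has to be computed explicitly, which mirrors the note on the slide.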
"Pattern-based" reasoning
• Idea: use an incomplete method for incoherence detection in A' ⊆ A
– Classify O1 and O2 once, then check for each pair of correspondences in A' whether a certain pattern occurs
• If a pattern occurs for some pair of correspondences in A', then A' is incoherent
– If no pattern occurs, A' can nevertheless be incoherent!
[Figure: pattern between Oi and Oj]
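A pairwise pattern check of this kind might look like the sketch below. The concrete pattern is an assumption chosen for illustration: two equivalence correspondences A1 ≡ A2 and B1 ≡ B2, where A1 ⊑ B1 holds in O1 and A2, B2 are disjoint in O2, which makes A1 unsatisfiable in the merged ontology. The lookup tables stand in for the results of classifying O1 and O2 once; only this one direction is checked, and the talk may use other or additional patterns.

```python
from itertools import permutations


def pair_matches_pattern(c1, c2, subsumed_in_o1, disjoint_in_o2):
    """c1 = (a1, a2) and c2 = (b1, b2) are equivalence correspondences.

    Pattern (assumed for illustration): a1 is subsumed by b1 in O1
    while a2 and b2 are disjoint in O2.
    """
    (a1, a2), (b1, b2) = c1, c2
    return (a1, b1) in subsumed_in_o1 and frozenset((a2, b2)) in disjoint_in_o2


def pattern_incoherent(subalignment, subsumed_in_o1, disjoint_in_o2):
    """Sound but incomplete: True proves incoherence, False proves nothing."""
    return any(
        pair_matches_pattern(c1, c2, subsumed_in_o1, disjoint_in_o2)
        for c1, c2 in permutations(subalignment, 2)
    )


# Precomputed once per ontology (toy data).
subsumed_in_o1 = {("1#Reviewer", "1#Person")}
disjoint_in_o2 = {frozenset(("2#Review", "2#Person"))}

alignment = [("1#Person", "2#Person"), ("1#Reviewer", "2#Review")]
print(pattern_incoherent(alignment, subsumed_in_o1, disjoint_in_o2))      # True
print(pattern_incoherent(alignment[:1], subsumed_in_o1, disjoint_in_o2))  # False
```

The check only performs set lookups per pair, so no reasoning in the merged ontology is needed once the two ontologies have been classified.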
That doesn't work …
• Use the efficient coherence test instead of complete reasoning in the algorithm described above
– Reasoning about A' ⊆ A then no longer requires reasoning in O1 ∪A' O2; it is replaced by iterating over all pairs in A'
– However: the resulting alignment might still be incoherent, and ∆ is not a LOD
– Missing one MIPS might result in a chain of incorrect follow-up decisions!
– Thus, removing the missed MIPS afterwards does not work!
• How can we exploit the efficient method while still constructing a LOD?
Algorithm 2: Example
[Figure: correspondences 1 to 10 with MIPS. Legend: detectable by efficient method; only detectable by complete method; resolved due to removal of correspondence]
Run the brute-force (BF) algorithm with efficient reasoning. Still incoherent?
Verification step: use binary search to detect the correspondence k such that A[0 … k-1] is coherent and A[0 … k] is incoherent.
In the example, k = 8:
– Safe part: efficient reasoning did not fail up to k
– Incorrect part: recompute!
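The verification step can be sketched as a standard binary search over prefixes. It relies on the fact that every subset of a coherent alignment is coherent, so prefix coherence flips from "yes" to "no" exactly once. The function name and the toy oracle are hypothetical; `is_coherent` stands for the complete reasoner, and indices are 0-based as in the A[0 … k] notation.

```python
def first_incoherent_index(kept, is_coherent):
    """Return the smallest k such that kept[:k + 1] is incoherent.

    Returns None if the whole list is coherent. Uses O(log n) calls
    to is_coherent after the initial check of the full list.
    """
    if is_coherent(kept):
        return None
    lo, hi = 0, len(kept) - 1
    # Invariant: kept[:lo] is coherent and kept[:hi + 1] is incoherent.
    while lo < hi:
        mid = (lo + hi) // 2
        if is_coherent(kept[:mid + 1]):
            lo = mid + 1
        else:
            hi = mid
    return lo


# Toy complete test: the only conflict is the set {2, 5, 7}.
def complete(subalignment):
    return not {2, 5, 7} <= set(subalignment)


kept = [1, 2, 4, 5, 6, 7, 8, 9, 10]
k = first_incoherent_index(kept, complete)
print(k, kept[k])  # 5 7
```

Everything before index k was decided correctly; the correspondence at index k is the first one the incomplete test wrongly let through.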
Run the main algorithm again with efficient reasoning for A[k+1 … n], where ∆1-k ∪ A[k], the diagnosis for A[1 … k], is a fixed part of the resulting diagnosis.
Still incoherent? If yes, we have k_new > k_old; repeat the same verification step.
[Figure: the alignment split into A[1 … k] and A[k+1 … n]]
Final result is a LOD.
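Putting the pieces together, the whole loop might be sketched as below. This is one reading of the slides under stated assumptions, not the authors' code: the fast test must be sound (it never reports a coherent alignment as incoherent), correspondences are unique and sorted by descending confidence, and `toy_oracle` is a hypothetical stand-in for both reasoners.

```python
def first_incoherent_index(kept, is_coherent, lo=0):
    """Smallest k with kept[:k + 1] incoherent.

    Assumes kept[:lo] is coherent and kept as a whole is incoherent.
    """
    hi = len(kept) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_coherent(kept[:mid + 1]):
            lo = mid + 1
        else:
            hi = mid
    return lo


def efficient_lod(alignment, fast_coherent, complete_coherent):
    fixed_kept, diagnosis = [], []   # verified part of the result
    rest = list(alignment)           # part that still has to be decided
    while True:
        # Greedy pass over the undecided part, using only the fast test.
        kept, removed = list(fixed_kept), []
        for c in rest:
            if fast_coherent(kept + [c]):
                kept.append(c)
            else:
                removed.append(c)
        # Verification with the complete test.
        if complete_coherent(kept):
            return kept, diagnosis + removed
        k = first_incoherent_index(kept, complete_coherent, lo=len(fixed_kept))
        culprit = kept[k]
        pos = rest.index(culprit)
        # Decisions before the culprit are safe; the culprit is accused.
        diagnosis += [c for c in rest[:pos] if c in removed] + [culprit]
        fixed_kept = kept[:k]
        rest = rest[pos + 1:]        # incorrect part: recompute


def toy_oracle(conflict_sets):
    """Incoherent iff one of the given conflict sets is fully contained."""
    def is_coherent(subalignment):
        present = set(subalignment)
        return not any(set(m) <= present for m in conflict_sets)
    return is_coherent


complete = toy_oracle([{1, 3}, {2, 5, 7}, {7, 9}])
fast = toy_oracle([{1, 3}, {7, 9}])      # misses the three-element MIPS
kept, diagnosis = efficient_lod(list(range(1, 11)), fast, complete)
print(diagnosis)  # [3, 7]
```

The example also shows why patching up afterwards is not enough: the first fast pass keeps 7 and therefore removes 9, so simply removing 7 at the end would yield {3, 7, 9}, whereas the LOD is {3, 7}.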
Runtime Considerations (Theory)
n = size of alignment A
m = number of times the binary search is applied
• The "more complete" the pattern-based reasoning is, the fewer verification steps/iterations are necessary
– The runtime of the pattern-based reasoning itself hardly matters for the overall runtime!
• Runtime comparison
– Brute-force LOD: O(n)
– Efficient LOD: O(log(n) * m)
Do we have m << n?
Results: Runtime
• Based on experiments with OAEI conference ontologies and submissions from 2007/08
– Expressivity SHIN(D), ELI(D), SIF(D), ALCIF(D)
– Four different state-of-the-art matching systems
[Table with columns n and m not preserved]
• Better results for benchmark datasets: 5 to 10 times faster
Results: Quality of Diagnosis
• Removing the LOD results in an alignment with increased precision and slightly decreased recall => slightly increased f-measure
• For alignments with low precision, the positive effects are very strong.
• In rare cases, an incorrect correspondence annotated with high confidence has negative effects
Summary
• Algorithm 1: algorithm for computing a LOD
– Without computing MIPS or MUPS!
• Algorithm 2: general approach for improving algorithms of type 1
– Shown for the natural interpretation of correspondences as axioms and a specific type of incomplete reasoning
– In principle applicable to any semantics for which we can find a similar efficient reasoning approach!
• Good results for natural interpretation + pattern-based reasoning: between 2 and 10 times faster!
Thanks for your attention. Questions?
Back-Up Slides
Property Pattern Example
Correspondences:
readPaper ≡ reviewOfPaper
Document ≡ Document
O1: ∃readPaper.⊤ ⊑ Reviewer, Reviewer ⊑ Person, Document ⊑ ¬Person
O2: ∃reviewOfPaper.⊤ ⊑ Review ⊑ Document
[Figure labels: disjoint (shown twice)]