An Efficient Method for Computing Alignment Diagnoses Christian Meilicke, Heiner Stuckenschmidt University of Mannheim Lehrstuhl für Künstliche Intelligenz {christian, heiner}@informatik.uni-mannheim.de


Page 1: An Efficient Method for Computing Alignment Diagnoses

An Efficient Method for Computing Alignment Diagnoses

Christian Meilicke, Heiner Stuckenschmidt
University of Mannheim

Lehrstuhl für Künstliche Intelligenz
{christian, heiner}@informatik.uni-mannheim.de

Page 2: An Efficient Method for Computing Alignment Diagnoses

Computing a local optimal diagnosis 2

Problem Statement

• Automatically and manually (!) generated ontology alignments are often incoherent

– See the OAEI-2008 results of the conference track

• => Incoherent alignments are a problem in many application scenarios*

– Instance migration results in inconsistent ontologies

– Query translation results in 'a priori' empty result sets

• Goal: find a way to repair incoherent alignments automatically and very efficiently, because …

– 'Agents on the web' require coherent alignments on the fly

– Large ontologies require efficient algorithms

* C. Meilicke and H. Stuckenschmidt. Incoherence as a Basis for Measuring the Quality of Ontology Mappings. OM-08.

Page 3: An Efficient Method for Computing Alignment Diagnoses

Outline

Alignment Semantics: Incoherence of an alignment, MIPS alignments

Alignment Diagnosis: Diagnosis, Minimal Hitting Set, Local Optimal Diagnosis

Computing a Local Optimal Diagnosis (LOD): Brute-Force LOD and Efficient LOD

Experimental Results: Runtime, Quality of the Diagnosis

Page 4: An Efficient Method for Computing Alignment Diagnoses

"Natural" Semantics

An alignment A and two ontologies O1 and O2

Correspondences (alignment A):

<1#Person, 2#Person, =, 0.98>
<1#hasName, 2#name, =, 0.87>
<1#writtenBy, 2#docWrittenBy, =, 0.7>
<1#authorOf, 2#hasWritten, =, 0.56>
<1#firstAuthor, 2#Author, ⊑, 0.56>

Merged ontology O1 ∪A O2: the correspondences are interpreted as axioms, for example

1#Person ≡ 2#Person
1#firstAuthor ⊑ 2#Author
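The natural semantics can be illustrated with a minimal Python sketch, assuming each correspondence is the 4-tuple <source, target, relation, confidence> listed on this slide. The names `Correspondence`, `AXIOM_SYMBOL` and `to_axiom` are hypothetical and not part of any alignment API; the sketch only renders the axioms as strings and does no reasoning.

```python
from typing import NamedTuple


class Correspondence(NamedTuple):
    """Hypothetical container for <source, target, relation, confidence>."""
    source: str
    target: str
    relation: str      # "=" (equivalence) or "⊑" (subsumption)
    confidence: float


# Assumed mapping from the relation of a correspondence to an axiom symbol.
AXIOM_SYMBOL = {"=": "≡", "⊑": "⊑"}


def to_axiom(c: Correspondence) -> str:
    """Render the axiom that a correspondence contributes to the merged ontology."""
    return f"{c.source} {AXIOM_SYMBOL[c.relation]} {c.target}"


# The five correspondences from the slide.
alignment = [
    Correspondence("1#Person", "2#Person", "=", 0.98),
    Correspondence("1#hasName", "2#name", "=", 0.87),
    Correspondence("1#writtenBy", "2#docWrittenBy", "=", 0.7),
    Correspondence("1#authorOf", "2#hasWritten", "=", 0.56),
    Correspondence("1#firstAuthor", "2#Author", "⊑", 0.56),
]

# The merged ontology consists of O1, O2 and these axioms.
axioms = [to_axiom(c) for c in alignment]
```

Keeping the confidence on the correspondence rather than on the axiom matters later, because the diagnosis is driven by the confidence order.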

Page 5: An Efficient Method for Computing Alignment Diagnoses

Incoherence of an Alignment

Definition: Incoherence of an Alignment

An alignment A between ontologies O1 and O2 is incoherent iff there exists a concept i#C or property i#R, with i ∈ {1,2}, that is satisfiable in Oi and unsatisfiable in O1 ∪A O2.

Note: the satisfiability of a property i#R can be reduced to the satisfiability of the concept ∃i#R.⊤

Definition: MIPS Alignment (minimal conflict set)

Given an incoherent alignment A between ontologies O1 and O2.

A subalignment M ⊆ A is a MIPS alignment (= minimal incoherence preserving subalignment) iff M is incoherent and there exists no M' ⊂ M such that M' is incoherent.
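The MIPS definition can be checked directly against any coherence test. The following Python sketch assumes a black-box function `is_coherent` and replaces the reasoner with a toy oracle built from explicit conflict sets; `make_toy_oracle`, `is_mips` and the ids `c1` … `c8` are hypothetical names used only for illustration.

```python
from typing import Callable, FrozenSet, Iterable

Alignment = FrozenSet[str]


def make_toy_oracle(conflicts: Iterable[Iterable[str]]) -> Callable[[Alignment], bool]:
    """Stand-in for a reasoner (assumption): a subalignment is coherent iff it
    does not completely contain any of the given conflict sets."""
    conflict_sets = [frozenset(c) for c in conflicts]
    return lambda sub: not any(cs <= sub for cs in conflict_sets)


def is_mips(m: Alignment, is_coherent: Callable[[Alignment], bool]) -> bool:
    """M is a MIPS alignment iff M is incoherent and no proper subset is incoherent.

    Coherence is monotone (every subset of a coherent alignment is coherent),
    so it suffices to test the subsets that drop exactly one correspondence.
    """
    if is_coherent(m):
        return False
    return all(is_coherent(m - {c}) for c in m)


toy_is_coherent = make_toy_oracle([{"c1", "c4"}, {"c2", "c5", "c8"}])
```

With this oracle, `{"c1", "c4"}` is a MIPS, while `{"c1", "c4", "c6"}` is incoherent but not minimal.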

Page 6: An Efficient Method for Computing Alignment Diagnoses

"Terminology"

[Figure legend: a correspondence; an alignment; an alignment as a sequence ordered by confidence; MIPS depicted by red dotted links; an alignment with MIPS shown as subsets]

Page 7: An Efficient Method for Computing Alignment Diagnoses

Alignment Diagnosis

Definition: Alignment Diagnosis

An alignment ∆ ⊆ A is an alignment diagnosis for O1 and O2 iff A \ ∆ is coherent with respect to O1 and O2 and for each ∆' ⊂ ∆ the alignment A \ ∆' is incoherent with respect to O1 and O2.

Proposition: Alignment Diagnosis and Minimal Hitting Sets

An alignment ∆ ⊆ A is an alignment diagnosis for O1 and O2 iff ∆ is a minimal hitting set over all MIPS in A.
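The proposition can be checked mechanically once all MIPS are known. A minimal Python sketch, assuming the MIPS are given explicitly as sets of correspondence ids; the ids, `EXAMPLE_MIPS` and both function names are hypothetical.

```python
from typing import Iterable, Set


def is_hitting_set(delta: Set[int], mips: Iterable[Set[int]]) -> bool:
    """True if delta contains at least one correspondence of every MIPS."""
    return all(delta & m for m in mips)


def is_diagnosis(delta: Set[int], mips: Iterable[Set[int]]) -> bool:
    """True if delta is a minimal hitting set over the given MIPS.

    Supersets of hitting sets are hitting sets, so minimality only needs
    the subsets that drop exactly one element.
    """
    mips = list(mips)
    if not is_hitting_set(delta, mips):
        return False
    return not any(is_hitting_set(delta - {c}, mips) for c in delta)


# Hypothetical MIPS over an alignment with correspondences 1..10.
EXAMPLE_MIPS = [{1, 4}, {3, 8}, {2, 5, 8}, {6, 9}]
```

Both `{4, 8, 9}` and `{1, 6, 8}` are diagnoses for `EXAMPLE_MIPS`, which shows that an alignment generally has several diagnoses and that some selection criterion is needed.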

Page 8: An Efficient Method for Computing Alignment Diagnoses

Local Optimal Diagnosis (LOD)

[Figure: correspondences ordered from low confidence to high confidence]

Definition: Accused correspondence

A correspondence c ∈ A is accused by A iff there exists a MIPS M in A with c ∈ M such that for all c' ≠ c in M it holds that

• (1) conf(c') > conf(c) and

• (2) c' is not accused by A.

Definition: Local optimal diagnosis (LOD)

The set of all accused correspondences is referred to as the local optimal diagnosis (LOD).

important!
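The recursive definition of "accused" is well-founded because whether c is accused depends only on correspondences with strictly higher confidence. A minimal Python sketch of the definition, assuming the MIPS are known explicitly (which the algorithms in this talk avoid); `CONF`, `MIPS` and the function name are hypothetical.

```python
from typing import Dict, Iterable, Set


def local_optimal_diagnosis(conf: Dict[int, float], mips: Iterable[Set[int]]) -> Set[int]:
    """Return the set of accused correspondences, following the definition literally."""
    mips = list(mips)
    accused: Set[int] = set()
    # Visit correspondences from high to low confidence, so every c' with
    # conf(c') > conf(c) has already been decided when c is examined.
    for c in sorted(conf, key=conf.get, reverse=True):
        for m in mips:
            if c in m and all(
                conf[d] > conf[c] and d not in accused  # conditions (1) and (2)
                for d in m if d != c
            ):
                accused.add(c)
                break
    return accused


# Hypothetical alignment 1..10 with strictly decreasing confidence.
CONF = {i: (11 - i) / 10 for i in range(1, 11)}
# Hypothetical MIPS; {4, 7} illustrates condition (2).
MIPS = [{1, 4}, {3, 8}, {2, 5, 8}, {6, 9}, {4, 7}]
```

In this toy input, correspondence 7 is not accused although it is the weakest element of `{4, 7}`: correspondence 4 is already accused through `{1, 4}`, so condition (2) fails.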

Page 9: An Efficient Method for Computing Alignment Diagnoses

Algorithm 1

1 2 3 4 5 6 7 8 9 10

Page 10: An Efficient Method for Computing Alignment Diagnoses

Algorithm 1

1 2 3 4 5 6 7 8 9 10

Coherent? YES!

Page 11: An Efficient Method for Computing Alignment Diagnoses

Algorithm 1

1 2 3 4 5 6 7 8 9 10

Coherent? YES!

Page 12: An Efficient Method for Computing Alignment Diagnoses

Algorithm 1

1 2 3 4 5 6 7 8 9 10

Coherent? NO!

Page 13: An Efficient Method for Computing Alignment Diagnoses

Algorithm 1

1 2 3 4 5 6 7 8 9 10

Coherent? Now it is!

Page 14: An Efficient Method for Computing Alignment Diagnoses

Algorithm 1

1 2 3 4 5 6 7 8 9 10

Coherent? YES!

Page 15: An Efficient Method for Computing Alignment Diagnoses

Algorithm 1

1 2 3 4 5 6 7 8 9 10

Coherent? YES!

Page 16: An Efficient Method for Computing Alignment Diagnoses

Algorithm 1

1 2 3 4 5 6 7 8 9 10

Coherent? NO!

Page 17: An Efficient Method for Computing Alignment Diagnoses

Algorithm 1

1 2 3 4 5 6 7 8 9 10

Coherent? Now it is!

… continue the same way

Page 18: An Efficient Method for Computing Alignment Diagnoses

Algorithm 1: Result

… and after a few more slides we would end up like this:

1 2 3 4 5 6 7 8 9 10

Note:

• Coherence was checked 10 times to construct a local optimal diagnosis, which is a minimal hitting set over all MIPS

• We have not computed a single MIPS alignment!

First sketch: Meilicke, Völker, Stuckenschmidt. Learning Disjointness for Debugging Mappings between Lightweight Ontologies (EKAW-08)

The relation to belief revision is discussed in: Qi, Ji, Haase. A Conflict-based Operator for Mapping Revision (ISWC-09)
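The walkthrough suggests the following reading of Algorithm 1: go through the alignment in descending order of confidence, tentatively add each correspondence, and drop it again if the result is incoherent. The Python sketch below is a reconstruction under that assumption, not the authors' code. The reasoner is replaced by a toy oracle over hypothetical MIPS (`TOY_MIPS`), and `calls` counts the coherence checks.

```python
from typing import Callable, List, Sequence


def brute_force_lod(alignment: Sequence[int],
                    is_coherent: Callable[[List[int]], bool]) -> List[int]:
    """alignment must be sorted by descending confidence."""
    kept: List[int] = []        # coherent part built so far
    diagnosis: List[int] = []   # removed correspondences
    for c in alignment:
        if is_coherent(kept + [c]):   # "Coherent? YES!"
            kept.append(c)
        else:                         # "Coherent? NO!" -> remove c, "Now it is!"
            diagnosis.append(c)
    return diagnosis


# Toy stand-in for complete reasoning (assumption): a set of correspondences
# is coherent iff it contains none of these hypothetical MIPS completely.
TOY_MIPS = [{1, 4}, {3, 8}, {2, 5, 8}, {6, 9}]
calls = 0


def toy_is_coherent(sub: List[int]) -> bool:
    global calls
    calls += 1
    return not any(m <= set(sub) for m in TOY_MIPS)


lod = brute_force_lod(list(range(1, 11)), toy_is_coherent)
```

The function itself never looks at `TOY_MIPS`; it only asks the oracle one question per correspondence, which is why no MIPS has to be computed.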

Page 19: An Efficient Method for Computing Alignment Diagnoses

"Pattern-based" reasoning

• Idea: Use an incomplete method for incoherence detection in A' ⊆ A

– Classify O1 and O2 once, then check for each pair of correspondences in A' whether a certain pattern occurs

• If a pattern occurs for some pair of correspondences in an alignment A', then A' is incoherent

– If no pattern occurs, A' can nevertheless be incoherent!

[Figure: pattern between ontologies Oi and Oj]
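The slides do not spell out the pattern, so the Python sketch below assumes one plausible instance: two equivalence correspondences conflict if their concepts are related by subsumption in one ontology and are disjoint in the other. The dictionary layout (`"sub"`, `"dis"`), the function names and the toy ontologies are hypothetical; the subsumption and disjointness sets stand for the result of classifying each ontology once.

```python
from itertools import combinations
from typing import Dict, Sequence, Tuple

# (concept in O1, concept in O2), read as an equivalence correspondence.
Correspondence = Tuple[str, str]


def pattern_fires(c: Correspondence, d: Correspondence, o1: Dict, o2: Dict) -> bool:
    """Assumed pattern: subsumption in one ontology, disjointness in the other."""
    (a1, a2), (b1, b2) = c, d
    sub1 = (a1, b1) in o1["sub"] or (b1, a1) in o1["sub"]
    sub2 = (a2, b2) in o2["sub"] or (b2, a2) in o2["sub"]
    dis1 = frozenset((a1, b1)) in o1["dis"]
    dis2 = frozenset((a2, b2)) in o2["dis"]
    return (sub1 and dis2) or (sub2 and dis1)


def possibly_coherent(alignment: Sequence[Correspondence], o1: Dict, o2: Dict) -> bool:
    """Incomplete test: False means certainly incoherent, True means 'no pattern found'."""
    return not any(pattern_fires(c, d, o1, o2) for c, d in combinations(alignment, 2))


# Toy classification results: (x, y) in "sub" means x ⊑ y.
O1 = {"sub": {("Reviewer", "Person")}, "dis": set()}
O2 = {"sub": {("Review", "Document")},
      "dis": {frozenset(("Review", "Person")), frozenset(("Document", "Person"))}}

c = ("Reviewer", "Review")
d = ("Person", "Person")
e = ("Paper", "Document")
```

Here `possibly_coherent([c, d], O1, O2)` is false, because Reviewer ⊑ Person in O1 while Review and Person are disjoint in O2. A true result only means that no pattern was found, which is exactly the incompleteness the slide warns about.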

Page 20: An Efficient Method for Computing Alignment Diagnoses

That doesn't work …

• Use the efficient coherence test instead of complete reasoning in the algorithm described above

– Reasoning about A' ⊆ A does not require reasoning in O1 ∪A' O2, but is replaced by iterating over all pairs in A'

– However: the resulting alignment might still be incoherent, and ∆ is not a LOD

– Missing one MIPS might result in a chain of incorrect follow-up decisions!

– Thus, removing the missed MIPS afterwards does not work!

• How to exploit the efficient method while still constructing a LOD?

Page 21: An Efficient Method for Computing Alignment Diagnoses

Algorithm 2: Example

1 2 3 4 5 6 7 8 9 10

Figure legend:

– Detectable by efficient method

– Only detectable by complete method

– Resolved due to removal of correspondence

Page 22: An Efficient Method for Computing Alignment Diagnoses

Algorithm 2: Example

1 2 3 4 5 6 7 8 9 10

Run the brute-force (BF) algorithm with efficient reasoning. Still incoherent?

Verification step: Use binary search to detect the correspondence k such that A[0 … k-1] is coherent and A[0 … k] is incoherent

k = 8

– Safe part: efficient reasoning did not fail up to k

– Incorrect part: recompute!

Page 23: An Efficient Method for Computing Alignment Diagnoses

Algorithm 2: Example

1 2 3 4 5 6 7 8 9 10

Run the main algorithm again with efficient reasoning for A[k+1 … n], where ∆1-k ∪ A[k] is the fixed part of the resulting diagnosis for A[1 … k].

Still incoherent? If yes, we have k_new > k_old.

Repeat the same verification step.

[Figure: the alignment split into A[1 … k] and A[k+1 … n]]

Page 24: An Efficient Method for Computing Alignment Diagnoses

Algorithm 2: Example

1 2 3 4 5 6 7 8 9 10

The final result is a LOD.
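The example can be condensed into a Python sketch. It is a reconstruction under assumptions, not the authors' implementation. `pair_conflict` stands for the sound but incomplete pattern test, `is_coherent` stands for complete reasoning, and both are toy oracles here. The binary search runs over the correspondences that survived the efficient pass, and everything before the detected position is treated as verified.

```python
from typing import Callable, List, Sequence, Set


def efficient_lod(alignment: Sequence[int],
                  pair_conflict: Callable[[int, int], bool],
                  is_coherent: Callable[[List[int]], bool]) -> Set[int]:
    """alignment must be sorted by descending confidence."""
    alignment = list(alignment)
    fixed: Set[int] = set()   # verified part of the diagnosis
    k = 0                     # alignment[:k] has been verified by complete reasoning
    while True:
        kept = [c for c in alignment[:k] if c not in fixed]
        verified = len(kept)              # kept[:verified] is known to be coherent
        candidate = set(fixed)
        for c in alignment[k:]:           # brute-force pass with efficient reasoning
            if any(pair_conflict(c, d) for d in kept):
                candidate.add(c)
            else:
                kept.append(c)
        if is_coherent(kept):             # verification with complete reasoning
            return candidate
        lo, hi = verified, len(kept)      # kept[:lo] coherent, kept[:hi] incoherent
        while hi - lo > 1:                # binary search for the first failure
            mid = (lo + hi) // 2
            if is_coherent(kept[:mid]):
                lo = mid
            else:
                hi = mid
        culprit = kept[hi - 1]            # first correspondence kept by mistake
        k = alignment.index(culprit) + 1
        fixed = {c for c in candidate if alignment.index(c) < k} | {culprit}


# Toy oracles (assumptions): hypothetical MIPS, of which only two are
# detectable by the pairwise pattern test.
TOY_MIPS = [{1, 4}, {3, 8}, {2, 5, 8}, {6, 9}]
DETECTABLE = [{1, 4}, {6, 9}]
complete_calls = 0


def toy_pair_conflict(c: int, d: int) -> bool:
    return {c, d} in DETECTABLE


def toy_is_coherent(sub: List[int]) -> bool:
    global complete_calls
    complete_calls += 1
    return not any(m <= set(sub) for m in TOY_MIPS)


result = efficient_lod(list(range(1, 11)), toy_pair_conflict, toy_is_coherent)
```

On this toy input the first pass misses the conflicts that involve correspondence 8, the binary search locates it, and the second pass confirms the diagnosis {4, 8, 9}. The run uses 5 complete coherence checks, where checking every correspondence with complete reasoning would take 10.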

Page 25: An Efficient Method for Computing Alignment Diagnoses

Runtime Considerations (Theory)

n = size of alignment A

m = number of times the binary search is applied

• The more complete the pattern-based reasoning is, the fewer verification steps/iterations are necessary

– The runtime of the pattern-based reasoning itself hardly matters for the overall runtime!

• Runtime Comparison

– Brute Force LOD: O(n)

– Efficient LOD: O(log(n) * m)

Do we have m << n?
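To get a feeling for the comparison, the number of calls to the complete reasoner can be estimated. The constants in the Python sketch below are assumptions rather than figures from the talk: one check per correspondence for brute force, and for the efficient variant one verification check per pass (m + 1 passes) plus at most ⌈log2 n⌉ checks per binary search.

```python
import math


def complete_checks_brute_force(n: int) -> int:
    """One complete coherence check per correspondence."""
    return n


def complete_checks_efficient(n: int, m: int) -> int:
    """Assumed upper estimate: (m + 1) verification checks plus m binary searches."""
    return (m + 1) + m * math.ceil(math.log2(n))


# Hypothetical sizes, for illustration only.
estimate_small = complete_checks_efficient(10, 1)
estimate_large = complete_checks_efficient(100, 5)
```

Under these assumptions an alignment with n = 100 and m = 5 needs at most 41 complete checks instead of 100, so the gain depends on m staying well below n / log2(n).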

Page 26: An Efficient Method for Computing Alignment Diagnoses

Results: Runtime

• Based on experiments with OAEI conference ontologies and submissions from 2007/08

– Expressivity SHIN(D), ELI(D), SIF(D), ALCIF(D)

– Four different state-of-the-art matching systems

[Table with columns n and m: values not preserved]

• Better results for benchmark datasets: 5 to 10 times faster

Page 27: An Efficient Method for Computing Alignment Diagnoses

Results: Quality of Diagnosis

• Removing the LOD results in an alignment with increased precision and slightly decreased recall => slightly increased f-measure

• For alignments with low precision, the positive effects are very strong.

• In rare cases, an incorrect correspondence annotated with high confidence has negative effects
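The effect on precision, recall and f-measure can be made concrete with a small Python sketch. All data below is invented for illustration and is not taken from the experiments: a hypothetical alignment 1..10, a hypothetical reference alignment, and a LOD that removes two incorrect correspondences and one correct one.

```python
from typing import Set, Tuple


def precision_recall_f(alignment: Set[int], reference: Set[int]) -> Tuple[float, float, float]:
    """Standard precision, recall and f-measure of an alignment against a reference."""
    correct = len(alignment & reference)
    precision = correct / len(alignment) if alignment else 0.0
    recall = correct / len(reference) if reference else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Invented toy data (assumption).
ALIGNMENT = set(range(1, 11))
REFERENCE = {1, 2, 3, 5, 6, 7, 9, 11}
LOD = {4, 8, 9}   # 4 and 8 are incorrect, 9 is correct but removed anyway

before = precision_recall_f(ALIGNMENT, REFERENCE)
after = precision_recall_f(ALIGNMENT - LOD, REFERENCE)
```

In this toy case precision rises, recall drops because the correct correspondence 9 is removed, and the f-measure increases slightly, which is the pattern described in the first bullet.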

Page 28: An Efficient Method for Computing Alignment Diagnoses

Summary

• Algorithm 1: Algorithm for computing a LOD

– Without computing MIPS or MUPS!

• Algorithm 2: General approach for improving algorithms of type 1

– Shown for the natural interpretation of correspondences as axioms and a specific type of incomplete reasoning

– In principle applicable to any semantics for which we can find a similar efficient reasoning approach!

• Good results for natural interpretation + pattern-based reasoning: between 2 and 10 times faster!

Page 29: An Efficient Method for Computing Alignment Diagnoses

Thanks for your attention

Questions?

Page 30: An Efficient Method for Computing Alignment Diagnoses

Back-Up Slides

Page 31: An Efficient Method for Computing Alignment Diagnoses

Property Pattern Example

[Figure: the properties readPaper and reviewOfPaper, the concept Document in both ontologies, the domain concepts ∃readPaper.⊤ and ∃reviewOfPaper.⊤, and "disjoint" links, across O1 and O2]

Axioms shown in the figure:

∃readPaper.⊤ ⊑ Reviewer, Reviewer ⊑ Person, Document ⊑ ¬Person

∃reviewOfPaper.⊤ ⊑ Review ⊑ Document