University of the Aegean, AI – LAB, ESWC 2008
From Conceptual to Instance Matching
George A. Vouros
AI Lab, Department of Information and Communication Systems Eng.
University of the Aegean, 83200 Karlovassi, Samos, Greece
Ontology Matching at the Conceptual Level
Given two ontologies (S1, A1, I1) and (S2, A2, I2), find a mapping (i.e. equivalences) between their signatures such that the translation of A1 with respect to this mapping is satisfied by A2.
Instance Matching
Given two ontologies (S1, A1, I1) and (S2, A2, I2), find a mapping between their
- signatures (i.e. equivalences), and
- instances (i.e. “same as” assertions),
such that the assertions in I2, together with the “translated” assertions in I1, are consistent with A2 and the translated axioms in A1.
OAEI 09 Instance Matching Track
The instance matching contest was composed of two tracks:
- The ISLab Instance Matching Benchmark (IIMB) is a benchmark generated automatically from a single data source, which is modified automatically according to various criteria.
- The AKT-Rexa-DBLP test case tests the capability of the tools to match individuals. All three datasets were structured using the same schema. The challenges for the matchers included ambiguous labels (person names and paper titles) and noisy data (some sources contained incorrect information).
Issues
(a) Scalability
(b) Different methods exploit different information concerning instances, or different facets of the same type of information
(c) Assumptions concerning the structure of the “search space”
Scalability
Our first approach:
- Compute clusters of “same as” instances, where each cluster is represented by a “model” of the cluster.
- Clusters and models are stored in files on disk.
- New instances are compared with each cluster by exploiting the “models”.
- The highest similarity above a specific threshold indicates the cluster of the new instance.
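The cluster-assignment step of this approach can be sketched as follows. This is a minimal illustration, not the implementation behind the slides: the names `similarity` and `assign`, the 0.8 threshold, the use of `difflib.SequenceMatcher` as the similarity measure, and the choice of a cluster's first member as its “model” are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Stand-in string similarity in [0, 1] (assumption, not the slides' measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def assign(label, clusters, threshold=0.8):
    """Compare `label` with every cluster model and return the chosen cluster index.

    The label joins the most similar cluster whose score reaches the threshold;
    if no cluster qualifies, a new cluster is created with the label as its model.
    """
    best_index, best_score = None, threshold
    for index, cluster in enumerate(clusters):
        score = similarity(label, cluster["model"])
        if score >= best_score:
            best_index, best_score = index, score
    if best_index is None:
        # Hypothetical model choice: the first member represents the cluster.
        clusters.append({"model": label, "members": [label]})
        return len(clusters) - 1
    clusters[best_index]["members"].append(label)
    return best_index


clusters = []
for name in ["George A. Vouros", "George A Vouros", "Semantic Web"]:
    print(name, "->", assign(name, clusters))
```

Because each new instance is compared only with one model per cluster, the cost of an assignment grows with the number of clusters rather than with the number of instances seen so far. The `clusters` list is a plain structure of strings and lists, so it could be written to and read from disk files, for example as JSON.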
Methods
COCLU: It aims at discovering typographic similarities between sequences of characters over an alphabet (ASCII or UTF character set), in order to reveal the similarity of class instances’ lexicalizations during ontology population. It is a partition-based clustering algorithm that divides the data into clusters and searches the space of possible clusters using a greedy heuristic.
Methods
Vector Space Model (VSM) based method: It computes the matching of two pseudo-documents. In our case each such document corresponds to an instance and is produced from the words in the vicinity of that instance. The “vicinity” includes all words occurring (i) in the local name, label and comments of the instance, (ii) in any of its properties (exploiting the properties’ local names, labels and comments), and (iii) in any of its related concepts or instances. Each document is represented by a vector of n weighted index words, where the weight of a word is the frequency of its appearance in the document. The similarity between two vectors is computed by means of the cosine similarity measure.
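The frequency weighting and cosine measure described for the VSM method can be sketched as follows. The function names `pseudo_document` and `cosine`, the tokenization rule, and the sample text fragments are assumptions made for the example; how the vicinity of an instance is actually collected from an ontology is not shown.

```python
import math
import re
from collections import Counter


def pseudo_document(texts):
    """Build a term-frequency vector from the text fragments in an instance's vicinity."""
    words = []
    for text in texts:
        # Hypothetical tokenization: lowercase alphanumeric runs.
        words.extend(re.findall(r"[a-z0-9]+", text.lower()))
    return Counter(words)


def cosine(u, v):
    """Cosine similarity of two term-frequency vectors; 0.0 if either is empty."""
    dot = sum(weight * v[word] for word, weight in u.items() if word in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Illustrative fragments standing in for local names, labels and comments.
first = pseudo_document(["ontology matching", "matching"])
second = pseudo_document(["ontology matching"])
print(round(cosine(first, second), 4))
```

Here the first vector is {ontology: 1, matching: 2} and the second is {ontology: 1, matching: 1}, so the similarity is 3 / (√5 · √2) = 3/√10.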
Synthesis of different methods
- Simple: e.g. the union/intersection of clusters with at least one common member.
- Biased: The clusters of one method may be used as input by another.
- Model based: Set the constraints that must be satisfied according to the axioms of the schemas and run a generic method (e.g. the max-sum algorithm or a DCOP method) that reconciles the preferences and conflicts among the individual methods.
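One possible reading of the “simple” synthesis is sketched below. It is an interpretation, not the method on the slides: `merge_overlapping` takes the union of clusters (from any method) that share at least one member, and `intersect_overlapping` keeps only the members on which two methods agree. Both function names and the instance identifiers are hypothetical.

```python
def merge_overlapping(clusterings):
    """Union variant: merge clusters, from any method, that share a member."""
    merged = []
    for clustering in clusterings:
        for cluster in clustering:
            current = set(cluster)
            disjoint = []
            for existing in merged:
                if existing & current:
                    current |= existing  # absorb every overlapping cluster
                else:
                    disjoint.append(existing)
            merged = disjoint + [current]
    return merged


def intersect_overlapping(first, second):
    """Intersection variant: keep only the members both methods put together."""
    result = []
    for a in first:
        for b in second:
            common = set(a) & set(b)
            if common:
                result.append(common)
    return result


# Illustrative outputs of two matching methods over instance identifiers.
method_a = [{"i1", "i2"}, {"i3"}]
method_b = [{"i2", "i4"}, {"i5"}]
print(sorted(sorted(c) for c in merge_overlapping([method_a, method_b])))
print(sorted(sorted(c) for c in intersect_overlapping(method_a, method_b)))
```

The union variant favours recall, since any link proposed by one method is propagated, while the intersection variant favours precision. Neither checks the result against the axioms of the schemas, which is what the model-based option addresses.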
To be done
- Get results from the individual methods.
- Implement the synthesis of different methods.
- Investigate the interaction between conceptual mapping and instance mapping for a sophisticated but scalable synthesis method.