9
Ontology matching erˆ ome Euzenat & Montbonnot, France [email protected] Thanks to Pavel Shvaiko and Natasha Noy for our collaboration on former versions of these slides What you have learned so far I Data can be expressed in RDF I Linked through URIs I Modelled with OWL ontologies I Retrieved through SPARQL queries erˆ ome Euzenat Ontology matching 2 / 36 Being serious about the semantic web I It is not one person’s ontology I It is not several people common ontology I It is many people’s many ontologies I So it is a mess, but a meaningful mess. Heterogeneity is not a bug, it is a feature erˆ ome Euzenat Ontology matching 3 / 36 Ontology heterogeneity Item DVD Book Paperback Hardcover CD price title doi creator pp author integer string uri Person Monograph Essay Literary critics Politics Biography Autobiography Literature pages isbn author title subject Human Writer erˆ ome Euzenat Ontology matching 4 / 36

jerome Euzenat - Ontology Matching

Embed Size (px)

DESCRIPTION

Jerome Euzenat's presentation at SSSW 2012

Citation preview

Page 1: jerome Euzenat - Ontology Matching

Ontology matching

Jerome Euzenat

&

Montbonnot, [email protected]

Thanks to Pavel Shvaiko and Natasha Noy for our collaboration on former versions of these

slides

What you have learned so far

I Data can be expressed in RDF

I Linked through URIs

I Modelled with OWL ontologies

I Retrieved through SPARQL queries

Jerome Euzenat Ontology matching 2 / 36

Being serious about the semantic web

I It is not one person’s ontology

I It is not several people common ontology

I It is many people’s many ontologies

I So it is a mess, but a meaningful mess.

Heterogeneity is not a bug, it is a feature

Jerome Euzenat Ontology matching 3 / 36

Ontology heterogeneity

Item

DVD

Book

Paperback

Hardcover

CD

pricetitledoicreatorpp

author

integer

string

uri

Person

Monograph

Essay

Literary critics

Politics

Biography

Autobiography

Literature

pages

isbnauthor

title

subject

Human

Writer

Jerome Euzenat Ontology matching 4 / 36

Page 2: jerome Euzenat - Ontology Matching

Heterogeneity problem

Resources being expressed in different ways must be reconciled before beingused.Mismatch between formalized knowledge can occur when:

I different languages are used (OWL vs. Topic maps);I different terminologies are used:

I English vs. Chinese;I Book vs. Monograph.

I different models are used:I different classes: Autobiography vs. Paperback;I classes vs. property: Essay vs. literarygenre;I classes vs. instances: One physical book as an instance vs. one work as

an instance.

I different scopes and granularity are used.I Only books vs. cultural items vs. any product;I Books detailed to the print and translation level vs. books as works.

Jerome Euzenat Ontology matching 5 / 36

How can we address the problem?

First ontology

Second ontology

matching Resulting alignmentInitial alignment

parameters

resources

Jerome Euzenat Ontology matching 6 / 36

Ontology alignment

Item

DVD

Book

Paperback

Hardcover

CD

pricetitledoicreatorpp

author

integer

string

uri

Person

Monograph

Essay

Literary critics

Politics

Biography

Autobiography

Literature

pages

isbnauthor

title

subject

Human

Writer

Jerome Euzenat Ontology matching 7 / 36

Expressive alignments (EDOAL)

Pocket

Booktopic

author=

Volume

size14≥

Autobiography

v

=

∀x , Pocket(x)⇐ Volume(x) ∧ size(x , y) ∧ y ≤ 14

∀x , Book(x) ∧ author(x , y) ∧ topic(x , y) ≡ Autobiography(x)

Jerome Euzenat Ontology matching 8 / 36

Page 3: jerome Euzenat - Ontology Matching

Transformation and mediation

SELECT x.isbnWHERE x : Autobiography

AND x.author = ”Bertrand Russell”

mediator

SELECT x.doiWHERE x : BookAND x.author = ”Bertrand Russell”

AND x.topic = ”Bertrand Russell”

x.doi=http://dx.doi.org/10.1080/041522862X x.isbn=041522862X

Jerome Euzenat Ontology matching 9 / 36

Ontology networks

o1

a1

b1 c1

d1 e1 o3

a3

b3

f3 g3

c3

d3 e3

o2

a2

b2

f2 g2

c2

d2 e2

o5

b5

f5

h5 j5

g5

o4

a4

b4

f4 g4

c4

d4 e4

A1,3

A1,2

A2,3

w

v

A2,4

A3,4

Jerome Euzenat Ontology matching 10 / 36

Why should we deal with this?

Applications of semantic integration

I Catalogue integration

I Schema and data integration

I Query answering

I Peer-to-peer information sharing

I Web service composition

I Agent communication

I Data transformation

I Ontology evolution

I Data interlinking

Jerome Euzenat Ontology matching 11 / 36

Applications: Query answering

Firstpeer

Firstontology

Secondpeer

Secondontology

Matcher

Alignment

Generator

mediatorquery reformulated query

answerreformulated answer

Jerome Euzenat Ontology matching 12 / 36

Page 4: jerome Euzenat - Ontology Matching

Applications: Agent communication

Firstagent

Firstontology

Secondagent

Secondontology

message

Matcher

Alignment

Generator

Translator

Transformedontology

Transformed message

Jerome Euzenat Ontology matching 13 / 36

Data interlinking

Firstdataset

Firstontology

Seconddataset

Secondontology

Matcher

Alignment

Generator

links

Jerome Euzenat Ontology matching 14 / 36

Ontology matching in three steps

Reconciliation can be performed in 3 steps o o ′

Match, Matcher

thereby determines the alignment A

Generate Generator

a processor (for merging, transforming, etc.) Transformation

Apply

Jerome Euzenat Ontology matching 15 / 36

On what basis can we match?

I Content: relying on what is inside the ontologyI Name, comments, alternate names, names of related entities: NLP, IR,

etc.I Internal structure: constraints on relations, typingI External structure: relations between entities: Data mining, Discrete

mathematicsI Extension: Statistics, data analysis, data mining, machine learningI Semantics (models): Reasoning techniques

I Context: the relations of the ontology with the outsideI Annotated resources:I The webI External ontologies: dbpedia, etc.I External resources: wordnet, etc.

Jerome Euzenat Ontology matching 16 / 36

Page 5: jerome Euzenat - Ontology Matching

Name similarity

Item

DVD

Book

Paperback

Hardcover

CD

pricetitledoicreatorpp

author

Person

Monograph

Essay

Literary critics

Politics

Biography

Autobiography

Literature

pages

isbnauthor

title

subject

Human

Writer

Jerome Euzenat Ontology matching 17 / 36

Structure similarity

Item

creator

DVD

Book

pricetitledoipp

Paperback

Hardcover

CD

author

integer

string

uri

Person

Monograph

Essay

Literary critics

Politics

Biography

Autobiography

Literature

pages

isbnauthor

title

subject

Human

Writer

Jerome Euzenat Ontology matching 18 / 36

Instance similarity

Item

DVD

Book

Paperback

Hardcover

CD

Monograph

Essay

Literary critics

Politics

Biography

Autobiography

LiteratureBertrand Russell: My life

Albert Camus: La chute

Jerome Euzenat Ontology matching 19 / 36

Combining different techniques

Basic matchers provide candidate correspondences, most of the systems useseveral such matchers and further combine and filter their results.

o

o ′

M A′

M ′′ A′′′

M ′ A′′

Matcher composition Aggregation

A′′′′

Filtering

A′′′′′

Iteration

A

Jerome Euzenat Ontology matching 20 / 36

Page 6: jerome Euzenat - Ontology Matching

How well do these approaches work?

Ontology Alignment Evaluation Initiative (OAEI)

I Formal comparative evaluation of different ontology-matching tools;

I Run every year since 2004;

I Variety of test cases (in size, in formalism, in content);

I Results consistent across test cases;

I Results very dependent on the tasks and the data (from under 50% ofprecision and recall to well over 80% if ontologies are relatively similar)

I Progress every year!

http://oaei.ontologymatching.org

Now involved in the SEALS (Semantics Evaluation At Large Scale) project.

Jerome Euzenat Ontology matching 21 / 36

Evaluation process

o

o ′

matching

parameters

resources

A

R

evaluator m

Jerome Euzenat Ontology matching 22 / 36

Benchmark results (precision and recallcurves)

recall0. 1.0.

prec

isio

n

1.

2010ASMOV

2009Lily

2008Lily

2007ASMOV

2006RiMOM

2005Falcon

edna

Jerome Euzenat Ontology matching 23 / 36

Tools you should be aware of

I Frameworks

I Alignment API: used by many tools; provides an exchange format andevaluation tools for OAEI. Alignment server for sharing.

I PROMPT (a Protege plug-in): includes a user interface and a plug-inarchitecture.

I COMA++: oriented toward database integration (many basic algorithmsimplemented).

I Matching systems

I OAEI best performers (Falcon, RiMOM, ASMOV, etc.)I Available systems (FOAM, Falcon, COMA++, Aroma, etc.)

Jerome Euzenat Ontology matching 24 / 36

Page 7: jerome Euzenat - Ontology Matching

The data interlinking problem

URI1 URI2

o o ′

Data interlinking

owl:sameAs

Jerome Euzenat Ontology matching 25 / 36

Example: Linking INSEE and NUTS

NUTS: Nomenclature of territorial units for statistics

#INSEE INSEE name NUTS Level #NUTS1 Pays 0 34

1 14226 Region 2 344

100 Departement 3 1488342 Arrondissement

4036 Canton 452422 Commune 5

Jerome Euzenat Ontology matching 26 / 36

INSEE and NUTS: ontology alignment

Territoire FR

Pays

Region

Departement

Arrondissement

Commune

codenom

chef-lieusubdivision

integer

string

Region

Country

NUTSRegion

LAURegion

name

level

code

hasSubRegion=

≤≤

=

Jerome Euzenat Ontology matching 27 / 36

Simple alignments are not sufficient

Territoire FR

Region

Departement

Commune

nom

DEP 75

nom

COM 75056

nom

Region

NUTSRegion

name

FR101

name

Paris

=

=

=

≤≤

=

=

=

Jerome Euzenat Ontology matching 28 / 36

Page 8: jerome Euzenat - Ontology Matching

Expressive alignments are necessary

Region

NUTSRegion

level

hasParentRegion

2 =

FR1=

=

subdivision hasSubRegion=

nom name=

Jerome Euzenat Ontology matching 29 / 36

Query generation

SELECT ?rPREFIX insee: <http://rdf.insee.fr/ontologie-geo-2006.rdf#>PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>FROM <http://rdf.insee.fr/geo/regions-2011.rdf>WHERE {

?r rdf:type insee:Region .}

SELECT ?nPREFIX nuts: <http://ec.europa.eu/eurostat/ramon/ontologies/geographic.rdf#>PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>FROM <http://ec.europa.eu/eurostat/ramon/rdfdata/nuts2008/>WHERE {

?n rdf:type nuts:NUTSRegion .?n nuts:level 2^^xsd:int .?n nuts:hasParentRegion nuts:FR1 .

}

Jerome Euzenat Ontology matching 30 / 36

Query generation

CONSTRUCT { ?r owl:sameAs ?n . }PREFIX insee: <http://rdf.insee.fr/ontologie-geo-2006.rdf#>PREFIX nuts: <http://ec.europa.eu/eurostat/ramon/ontologies/geographic.rdf#>PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>FROM <http://rdf.insee.fr/geo/regions-2011.rdf>FROM <http://ec.europa.eu/eurostat/ramon/rdfdata/nuts2008/>WHERE {

?r rdf:type insee:Region .?r insee:nom ?l .?n rdf:type nuts:NUTSRegion .?n nuts:name ?l .?n nuts:level 2^^xsd:int .?n nuts:hasParentRegion nuts:FR1 .

}

Jerome Euzenat Ontology matching 31 / 36

What does this mean?

I Ontology alignments are schema-level expression of correspondences;

I They are useful for focussing the search;

I Expressive alignments are necessary;

I They can be turned into SPARQL-based link generators.

but it is also necessary to express instance level constraints:

I for converting data (e.g., mph vs. m/s);

I for expressing matching constraint on data (e.g., similarity).

Jerome Euzenat Ontology matching 32 / 36

Page 9: jerome Euzenat - Ontology Matching

General framework

o o ′

URI1 URI2

Ontology matching

A

Data interlinking

owl:sameAs

Jerome Euzenat Ontology matching 33 / 36

Selected challenges

I Scalability and efficiencyI Current matchers can be fast, scale and accurate, but not all at once.

I New sources of matchingI Context-based matching,

I General purpose matching (vs. special purpose matching)I Matcher combination,I Matcher selection and self-configuration,

I User involvement,I Matching (serendipitously) while working,I How to explain alignments?I Social and collaborative ontology matching,

I Alignment management: infrastructure and support,I How do we maintain alignments when ontologies evolve?I Reasoning with alignments,I Being robust to incorrect alignments.

and, of course, many others,

Jerome Euzenat Ontology matching 34 / 36

Further reading

I “Ontology Matching” by Euzenat andShvaiko

I Proceedings of ISWC, ASWC, ESWC,WWW conferences, etc.

I Journal of web semantics, Semantic webjournal, Journal on data semantics, etc.

I http://www.ontologymatching.org

Jerome Euzenat Ontology matching 35 / 36

[email protected]

http://exmo.inrialpes.fr