Learning to Map between Ontologies on the Semantic Web

AnHai Doan, Jayant Madhavan, Pedro Domingos, and Alon Halevy
Databases and Data Mining group, University of Washington


Page 1: Learning to Map between Ontologies on the Semantic Web

Learning to Map between Ontologies on the Semantic Web

AnHai Doan, Jayant Madhavan, Pedro Domingos, and Alon Halevy

Databases and Data Mining group

University of Washington

Page 2: Learning to Map between Ontologies on the Semantic Web

Semantic Web

Mark-up data on the web using ontologies

Enable intelligent information processing over the web
  • Personal software agents
  • Queries over multiple web pages
  • …

Page 3: Learning to Map between Ontologies on the Semantic Web

An Example

Semantic mappings allow information processing across ontologies

Query: Find Prof. Cook, a professor in a Seattle college, earlier an assoc. professor at his alma mater in Australia

[Figure: two taxonomies connected by a semantic mapping.
www.cs.washington.edu: People → Staff, Faculty; Faculty → Professor, Assoc. Professor, Asst. Professor
www.cs.usyd.edu.au: Staff → Academic, Technical; Academic → Professor, Senior Lecturer, Lecturer
Data instance: James Cook (Name); PhD, U Sydney (Education)]

Page 4: Learning to Map between Ontologies on the Semantic Web

Semantic Web: State of the Art

Languages for ontologies
  • RDF, DAML+OIL, …

Ontology learning and ontology design tools
  • [Maedche’02], Protégé, Ontolingua, …

Semantic mappings are crucial to the SW vision
  • [Uschold’01, Berners-Lee, et al.’01]

Without semantic mappings… Tower of Babel!!!

Page 5: Learning to Map between Ontologies on the Semantic Web

Semantic Mapping Challenges

Ontologies can be very different
  • Different vocabularies, different design principles
  • Overlap, but not coincide

Semantic Mapping information
  • Data instances marked up with ontologies
  • Concept names and taxonomic structure
  • Constraints on the mapping

Page 6: Learning to Map between Ontologies on the Semantic Web

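Overview

[Figure: the two example taxonomies (People → Staff, Faculty; Faculty → Professor, Assoc. Professor, Asst. Professor, and Staff → Academic, Technical; Academic → Professor, Senior Lecturer, Lecturer), with the three steps applied to the concept Faculty.]

  • Define Similarity
  • Compute Similarity: Sim(Fac, Staff), Sim(Fac, Prof), Sim(Fac, Acad)
  • Satisfy Constraints: Faculty maps to Academic?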

Page 7: Learning to Map between Ontologies on the Semantic Web

Our Contributions

An automatic solution to taxonomy matching
  • Handles different similarity notions
  • Exploits information in data instances and taxonomic structure, using multi-strategy learning

Extend solution to handle a wide variety of constraints, using Relaxation Labeling

An implementation, our GLUE system, and experiments on real-world taxonomies
  • High accuracy (68-98%) on large taxonomies (100-330 concepts)

Page 8: Learning to Map between Ontologies on the Semantic Web

Defining Similarity

Multiple similarity measures in terms of the JPD

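[Figure: Venn diagram of Assoc. Prof (A) and Snr. Lecturer (S) over a hypothetical common marked-up domain, with regions (A, ¬S), (A, S), (¬A, S), and (¬A, ¬S).]

Joint Probability Distribution: $P(A,S),\ P(A,\bar{S}),\ P(\bar{A},S),\ P(\bar{A},\bar{S})$

$$\mathrm{Sim}(\text{Assoc. Prof.}, \text{Snr. Lect.}) = \frac{P(A \cap S)}{P(A \cup S)} = \frac{P(A,S)}{P(A,S) + P(A,\bar{S}) + P(\bar{A},S)}$$

[Jaccard, 1908]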
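As a minimal sketch of the Jaccard measure on this slide, the function below computes the similarity from three entries of the joint distribution. The function name and the probability values are illustrative assumptions, not part of the slides.

```python
def jaccard_sim(p_a_s: float, p_a_not_s: float, p_not_a_s: float) -> float:
    """Jaccard similarity P(A and S) / P(A or S) from three JPD entries."""
    union = p_a_s + p_a_not_s + p_not_a_s  # P(A or S)
    if union == 0.0:
        return 0.0  # neither concept has any instances; treat as dissimilar
    return p_a_s / union


# Hypothetical JPD for Assoc. Prof (A) and Snr. Lecturer (S).
sim = jaccard_sim(p_a_s=0.25, p_a_not_s=0.125, p_not_a_s=0.125)
print(sim)
```

The fourth entry, P(¬A, ¬S), does not appear in the Jaccard measure, which is why the function ignores it.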

Page 9: Learning to Map between Ontologies on the Semantic Web

No common data instances

In practice, it is not easy to find data tagged with both ontologies!

[Figure: United States instances are labeled only A or ¬A; Australia instances are labeled only S or ¬S.]

Solution: Use Machine Learning

Page 10: Learning to Map between Ontologies on the Semantic Web

Machine Learning for computing similarities

JPD estimated by counting the sizes of the partitions

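[Figure: a classifier CL_S, trained on the Australia instances (S vs. ¬S), is applied to the United States instances in A and ¬A; a classifier CL_A, trained on the United States instances (A vs. ¬A), is applied to the Australia instances in S and ¬S. Each side is split into the four partitions (A, S), (A, ¬S), (¬A, S), (¬A, ¬S).]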
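The counting step can be sketched as follows. This is an illustration under stated assumptions: the two classifiers are passed in as plain functions, and the keyword-based stand-ins and the toy instances below are hypothetical, not the learners or data used in GLUE.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

Labeled = List[Tuple[str, bool]]  # (instance text, member of the taxonomy's own concept?)


def estimate_jpd(
    u1: Labeled,                       # taxonomy 1 instances, labeled A / not A
    u2: Labeled,                       # taxonomy 2 instances, labeled S / not S
    predict_s: Callable[[str], bool],  # classifier for S (assumed trained on u2)
    predict_a: Callable[[str], bool],  # classifier for A (assumed trained on u1)
) -> Dict[Tuple[bool, bool], float]:
    """Estimate P(A,S), P(A,not S), P(not A,S), P(not A,not S) by counting."""
    counts: Counter = Counter()
    for text, in_a in u1:              # known A label, predicted S label
        counts[(in_a, predict_s(text))] += 1
    for text, in_s in u2:              # predicted A label, known S label
        counts[(predict_a(text), in_s)] += 1
    total = len(u1) + len(u2)
    if total == 0:
        raise ValueError("need at least one data instance")
    return {(a, s): counts[(a, s)] / total
            for a in (True, False) for s in (True, False)}


# Toy data and keyword classifiers (hypothetical stand-ins for trained learners).
us = [("associate professor of databases", True), ("associate professor of AI", True),
      ("systems administrator", False), ("lab technician", False),
      ("visiting professor", False)]
au = [("senior lecturer in databases", True), ("technical officer", False),
      ("lecturer emeritus", False)]
jpd = estimate_jpd(us, au,
                   predict_s=lambda t: "professor" in t,
                   predict_a=lambda t: "lecturer" in t)
print(jpd[(True, True)])
```

Each dictionary key is a pair (in A, in S), so `jpd[(True, True)]` is the estimate of P(A, S).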

Page 11: Learning to Map between Ontologies on the Semantic Web

Improve Predictive Accuracy: Use Multi-Strategy Learning

A single classifier cannot exploit all available information

Combine the predictions of multiple classifiers

[Figure: base classifiers CL_A1 … CL_AN each predict A vs. ¬A; a Meta-Learner combines their predictions.]

  • Content Learner: frequencies of different words in the text of the data instances
  • Name Learner: words used in the names of concepts in the taxonomy
  • Others …
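The slides do not spell out how the Meta-Learner combines the base predictions. The sketch below assumes a weighted average of per-learner confidence scores; the learner names, weights, and scores are hypothetical.

```python
from typing import Mapping


def meta_learner(predictions: Mapping[str, float],
                 weights: Mapping[str, float]) -> float:
    """Combine base-learner confidences for 'instance is in A' by weighted average.

    Assumption: each prediction is a confidence in [0, 1] and each learner
    has a non-negative weight; this is one simple combination rule.
    """
    total_weight = sum(weights[name] for name in predictions)
    if total_weight <= 0:
        raise ValueError("weights of the participating learners must sum to > 0")
    weighted = sum(weights[name] * score for name, score in predictions.items())
    return weighted / total_weight


# Hypothetical confidences from the Content Learner and the Name Learner.
combined = meta_learner(
    predictions={"content": 0.75, "name": 0.25},
    weights={"content": 3.0, "name": 1.0},
)
print(combined)
```

A learner that is trusted more for a given concept gets a larger weight, so its score dominates the combined confidence.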

Page 12: Learning to Map between Ontologies on the Semantic Web

So far…

  • Define Similarity: Joint Probability Distribution
  • Compute Similarity: Multi-strategy Learning
  • Satisfy Constraints

Page 13: Learning to Map between Ontologies on the Semantic Web

Next Step: Exploit Constraints

Constraints due to the taxonomy structure

Domain-specific constraints
  • Department-Chair can only map to a unique concept

Numerous constraints of different types

[Figure: the two example taxonomies (People → Staff, Fac; Fac → Prof, Assoc. Prof, Asst. Prof, and Staff → Acad, Tech; Acad → Prof, Snr. Lect., Lect.), with People and Staff highlighted and the Parents and Children levels marked.]

Extended Relaxation Labeling to ontology matching

Page 14: Learning to Map between Ontologies on the Semantic Web

Solution: Relaxation Labeling

Find the best label assignment given a set of constraints
  • Start with an initial label assignment
  • Iteratively improve labels, given constraints

Standard Relaxation Labeling is not applicable
  • Extended in many ways

[Figure: the two example taxonomies, with the nodes of the first (People, Staff, Fac, Prof, Assoc. Prof, Asst. Prof) being assigned labels from the second (Staff, Acad, Tech, Prof, Snr. Lect., Lect.); labels still undecided are marked "?".]
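For orientation, the sketch below shows a textbook-style relaxation labeling update, in which each node's label probabilities are reweighted by the support they receive from neighboring nodes. It is not GLUE's extended algorithm, which the slides say differs from the standard scheme; the compatibility function, node names, and initial probabilities are hypothetical.

```python
from typing import Callable, Dict, List

Probs = Dict[str, Dict[str, float]]  # node -> {candidate label -> probability}


def relax(probs: Probs, neighbors: Dict[str, List[str]],
          compat: Callable[[str, str], float], iterations: int = 10) -> Probs:
    """Iteratively reweight label probabilities by neighbor support.

    compat(l1, l2) is assumed to lie in [-1, 1]: positive when the two labels
    are consistent on neighboring nodes, negative when they conflict.
    """
    for _ in range(iterations):
        updated: Probs = {}
        for node, dist in probs.items():
            nbrs = neighbors.get(node, [])
            scores = {}
            for label, p in dist.items():
                support = sum(compat(label, other) * q
                              for nb in nbrs for other, q in probs[nb].items())
                if nbrs:
                    support /= len(nbrs)  # average support over neighbors
                scores[label] = p * (1.0 + support)
            z = sum(scores.values())
            updated[node] = ({l: s / z for l, s in scores.items()}
                             if z > 0 else dict(dist))
        probs = updated  # synchronous update: all nodes use the previous round
    return probs


# Hypothetical constraint: labels agree if one is the parent of the other
# in the second taxonomy (Acad is the parent of Prof, Snr. Lect., Lect.).
parent_of = {"Prof": "Acad", "Snr. Lect.": "Acad", "Lect.": "Acad"}


def compat(a: str, b: str) -> float:
    return 1.0 if parent_of.get(a) == b or parent_of.get(b) == a else 0.0


initial = {"Fac": {"Acad": 0.5, "Tech": 0.5},
           "Assoc. Prof": {"Snr. Lect.": 0.75, "Prof": 0.25}}
result = relax(initial, {"Fac": ["Assoc. Prof"], "Assoc. Prof": ["Fac"]}, compat)
print(max(result["Fac"], key=result["Fac"].get))
```

In this toy run the child's likely labels are all children of Acad, so the initially undecided Fac node drifts toward Acad.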

Page 15: Learning to Map between Ontologies on the Semantic Web

Putting it all together: GLUE System

[Figure: GLUE architecture, from input to output.]

  • Input: Taxonomy O1 (structure + data instances), Taxonomy O2 (structure + data instances)
  • Distribution Estimator (Learner CL1 … Learner CLN, combined by the Meta Learner) produces the Joint Distributions: P(A,B), P(A,¬B), …
  • Similarity Estimator (given a similarity function) produces the Similarity Matrix
  • Relaxation Labeler (given generic & domain constraints) produces the Mappings for O1, Mappings for O2

Page 16: Learning to Map between Ontologies on the Semantic Web

Real World Experiments

Taxonomies on the web
  • University classes (UW and Cornell)
  • Companies (Yahoo and The Standard)

For each taxonomy
  • Extracted data instances: course descriptions and company profiles
  • Trivial data cleaning
  • 100-300 concepts per taxonomy
  • Taxonomy depth of 3-4
  • 10-90 data instances per concept on average

Evaluation against manual mappings as the gold standard
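One plausible way to score a predicted mapping against the manual gold standard is the percentage of manually mapped concepts that receive the same match. The slides do not give the exact formula, so this definition and the concept names below are assumptions.

```python
from typing import Mapping


def matching_accuracy(predicted: Mapping[str, str], gold: Mapping[str, str]) -> float:
    """Percentage of gold-standard concepts whose predicted match equals the manual one."""
    if not gold:
        raise ValueError("gold standard must contain at least one mapping")
    correct = sum(1 for concept, target in gold.items()
                  if predicted.get(concept) == target)
    return 100.0 * correct / len(gold)


# Hypothetical manual and predicted mappings from one taxonomy to the other.
gold = {"Faculty": "Academic", "Staff": "Technical",
        "Assoc. Professor": "Senior Lecturer", "Asst. Professor": "Lecturer"}
predicted = {"Faculty": "Academic", "Staff": "Technical",
             "Assoc. Professor": "Senior Lecturer", "Asst. Professor": "Professor"}
print(matching_accuracy(predicted, gold))
```

Concepts missing from the predicted mapping count as errors, since `predicted.get` returns `None` for them.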

Page 17: Learning to Map between Ontologies on the Semantic Web

Results

[Figure: bar chart of matching accuracy (%), on a 0-100 scale, for the Name Learner, Content Learner, Meta Learner, and Relaxation Labeler. Groups: University I (Cornell to Wash., Wash. to Cornell), University II (Cornell to Wash., Wash. to Cornell), Companies (Standard to Yahoo, Yahoo to Standard).]

Page 18: Learning to Map between Ontologies on the Semantic Web

Related Work

Our LSD schema matching system [Doan, Domingos, Halevy ’01]
  • GLUE handles taxonomies, richer models, and a much richer set of constraints

Other ontology and schema matching work
  • [Noy, Musen’01], [Melnik, et al.’02], [Ichise, et al.’01]
  • Mostly heuristics, or single machine learning techniques

Relaxation Labeling for constraint satisfaction
  • [Hummel, Zucker’83], [Chakrabarti, et al.’00]
  • Significantly extend this approach

Page 19: Learning to Map between Ontologies on the Semantic Web

Conclusions & Future Work

An automated solution to taxonomy matching
  • Handles multiple notions of similarity
  • Exploits data instances and taxonomy structure
  • Incorporates generic and domain-specific constraints
  • Produces high accuracy results

Future Work
  • More expressive models
  • Complex mappings
  • Automated reasoning about mappings between models