A Panoramic Approach to Integrated Evaluation of Ontologies in the Semantic Web S. Dasgupta, D. Dinakarpandian, Y. Lee School of Computing and Engineering University of Missouri-Kansas City

Overview

• Motivation
• Approach - Pan-Onto-Eval
  1. Triple Centricity
  2. Theme Centricity
  3. Structure Centricity
  4. Domain Centricity
• Experiments
• Evaluation
• Conclusion

Related Work

• Ontology ranking by cross-references: Swoogle [3,6], OntoSelect [7] and OntoKhoj [4]
• Structural richness
  – Tartir et al. [8]: distribution and generic/specific super/sub concepts
  – Alani et al. [16-18]: density measure [16], centrality measure [18]
• Relational richness
  – Tartir et al. [8]: ratio of #non-IS-A to #rels
  – Sabou et al. [2]: no consideration of the roles of concepts of relationships
• Very limited work on thematic richness: multiple hierarchies in a single ontology

Are they similar?

NO!
Actually, they are similar:
• They live in the same house
• They have the same last name
• They have the same children…

You cannot judge them all by their "covers".

Ontology Evaluation

• How to evaluate an ontology?

– Some ontologies are strong in terms of structure while their relationships are weak.

• We need to evaluate ontologies considering different perspectives.

OntoSnap Framework

[Figure: the OntoSnap framework, with the components Ontology Summarization, Ontology Evaluation, Ontology Categorization, Ontology Query & Reasoning, and Ontology Integration.]

Summary - WINE Ontology

• http://www.w3.org/2002/03owlt/miscellaneous/consistent001
• Total Number of Classes: 138 (Defined: 77, Imported: 61)
• Total Number of Datatype Properties: 1
• Total Number of Object Properties: 16 (Defined: 13, Imported: 3)
• Total Number of Annotation Properties: 2
• Total Number of Individuals: 206 (Defined: 161, Imported: 45)

Category | Selected RN | Relation | Associated Theme Nodes (TN) | SE
Functional | CORBANS | hasMaker | Riesling | 0.172
Functional | MARIETTA | hasMaker | CabernetSauvignon, PetiteSyrah, Zinfandel | 0.21
Functional | MOUNTADAM | hasMaker | Chardonnay, PinotNoir, DryRiesling | 0.19
Functional | MOUNT-EDEN-VINEYARD | hasMaker | Chardonnay, PinotNoir | 0.121
Functional | WHITEHALL-LANE | hasMaker | DessertWine | 0.117
Attributive | DRY | hasSugar | Chardonnay, WhiteBurgundy, Zinfandel, CabernetSauvignon, CheninBlanc, Merlot, PinotNoir, Meritage, PetiteSyrah, Riesling, CabernetFranc | 0.52
Attributive | MODERATE | FLAVOR | Chardonnay, Meursault, Riesling, Zinfandel, CheninBlanc, Merlot, CabernetSauvignon, PetiteSyrah, PinotNoir, WhiteBurgundy, IceWine, CabernetFranc | 0.37
Attributive | STRONG | FLAVOR | WhiteBurgundy, Zinfandel, CabernetSauvignon, PinotNoir, Chardonnay, PetiteSyrah | 0.25
Attributive | MEDIUM | BODY | Chardonnay, Chianti, Riesling, Merlot, Meritage, CabernetSauvignon, PetiteSyrah, Zinfandel, PinotNoir, DryRiesling, WhiteBurgundy, IceWine, CheninBlanc, CabernetFranc | 0.4
Attributive | FULL | BODY | WhiteBurgundy, Zinfandel, CabernetSauvignon, Chardonnay, CheninBlanc, PinotNoir, PetiteSyrah | 0.31
Spatial | NAPA | locatedIn | Chardonnay, Zinfandel, CabernetSauvignon, PetiteSyrah, CabernetFranc | 0.394
Spatial | NEW-ZEALAND | locatedIn | SauvignonBlanc | 0.23
Spatial | SONOMA | locatedIn | Zinfandel, Merlot, CabernetSauvignon, PetiteSyrah, Chardonnay | 0.42
Spatial | GERMANY | locatedIn | Sweet Riesling | 0.302
Spatial | SOUTH-AUSTRALIA | locatedIn | Chardonnay, Pinot-Noir, Riesling | 0.241

Summary - Wine 3 Ontology

Pan-Onto-Eval

A comprehensive approach to evaluating an ontology by considering its structure, semantics, and domain

1. Triple Centricity:
   • <domain (S), property (R), range (O)>
   • Information sources

2. Theme Centricity: Relation Classification

3. Structure Centricity: Relationship Inheritance

4. Domain Centricity

Triple Centricity: capturing information sources

[Figure: an information source shown as a triple of Subject (Domain), Relation (Property), and Object (Range), labeled with the relation isMadeFrom.]
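The triple view can be made concrete with a small data structure. The sketch below is illustrative only: the `Triple` type, the `group_by_range` helper, and the direction of the example triples (wine as domain, maker or region as range) are assumptions, with names borrowed from the wine summary table.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """One <domain (S), property (R), range (O)> statement."""
    domain_concept: str
    relation: str
    range_concept: str


def group_by_range(triples):
    """Map each range concept to the (domain, relation) pairs that point at it."""
    sources = defaultdict(list)
    for t in triples:
        sources[t.range_concept].append((t.domain_concept, t.relation))
    return dict(sources)


# Hypothetical triples; the direction of each relation is an assumption.
triples = [
    Triple("CabernetSauvignon", "hasMaker", "MARIETTA"),
    Triple("Zinfandel", "hasMaker", "MARIETTA"),
    Triple("Chardonnay", "hasMaker", "MOUNTADAM"),
    Triple("Zinfandel", "locatedIn", "NAPA"),
]

grouped = group_by_range(triples)
```

Grouping by range concept is one simple way to see which range concepts gather the most associations, which is the question the information-source view asks.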

Theme Centricity: Classification of Relations in the Wine Domain

• Compositional: madeFrom (madeFromFruit, madeFromGrape), blendWith
• Functional: hasMaker, drink, cause
• Attributive: hasFlavor, hasColor, hasSugar, hasBody
• Spatial: hasRegion, isLocatedIn, adjacentTo
• Temporal: madeInYear
• Comparative: tasteBetter, Expensive
• Conceptual

Relations between domain and range concepts carry different semantic 'senses'. Classifying them gives a better understanding of the thematic categories of the ontology.
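A relation classification of this kind can be sketched as a lookup table. The assignment of each relation to a category below is read off the slide layout and should be treated as an assumption; `RELATION_CATEGORY` and `theme_profile` are hypothetical names.

```python
from collections import Counter

# Assumed category assignments for the wine example relations.
RELATION_CATEGORY = {
    "madeFrom": "compositional",
    "madeFromFruit": "compositional",
    "madeFromGrape": "compositional",
    "blendWith": "compositional",
    "hasMaker": "functional",
    "drink": "functional",
    "cause": "functional",
    "hasFlavor": "attributive",
    "hasColor": "attributive",
    "hasSugar": "attributive",
    "hasBody": "attributive",
    "hasRegion": "spatial",
    "isLocatedIn": "spatial",
    "adjacentTo": "spatial",
    "madeInYear": "temporal",
    "tasteBetter": "comparative",
    "Expensive": "comparative",
}


def theme_profile(relations):
    """Count how many of the given relations fall into each thematic category."""
    return Counter(RELATION_CATEGORY.get(r, "unclassified") for r in relations)


profile = theme_profile(["hasMaker", "hasSugar", "hasBody", "isLocatedIn", "unknownRel"])
```

Relations missing from the table are counted as "unclassified" rather than guessed, since the type assignment is a manual preprocessing step in the approach.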

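Structure Centricity: Relationship Inheritance

Distribution of non-IS-A relations

[Figure: an IS-A hierarchy ordered from generic to specific, with the concepts beverage, Beer, and Wine, annotated with the non-IS-A relations isMadeFrom, hasColor, hasSugar, hasMaker (winery), and Cause (Cirrhosis).]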

Domain Centricity

Wikipedia topics for Wine:
• History
• Grape varieties
• Classification
• Vintages
• Testing
• Collecting
• Production
• Exporting countries
• Uses
• Health effects
• Packaging & Storage

The semantic implication of each hierarchy is different: each contributes differently to the semantics of the ontology as a whole.

Pan-Onto-Eval

[Figure: an ontology O1 is split into hierarchies H1, H2, and H3. For each hierarchy, the panoramic metrics IC, IR, DR, and RR feed DMF and DMI (DMF1/DMI1, DMF2/DMI2, DMF3/DMI3), and the domain importance values are combined into the evaluation score ρ.]

Information Content (IC)

[Figure: domain concepts D1 and D2 linked to range concepts R1, R2, and R3. Triple: Domain-Property-Range.]

Information sources:
• Which information sources are important
• How range concepts are associated
  – with which domain concepts
  – through which relation types

[Figure: a domain concept linked to a range concept through relation type 1, relation type 2, ..., relation type 7.]

\[ IA(RC) = \frac{1}{M \cdot R} \sum_{t=1}^{Q} R_t(RC) \]

Information entropy is used to measure the significance of information sources:
• the overall uncertainty of range concept association

\[ HE(R) = -\sum_{i=1}^{M} IA(RC_i) \log_2 IA(RC_i) \]

\[ IC(H) = \frac{1}{HE(R)} \]
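The entropy-based measure can be sketched as follows. This is a minimal illustration under two assumptions: the association values IA(RC_i) are already normalized so they sum to 1, and IC is taken as the reciprocal of the entropy (one reading of the slide's formula). The function names are hypothetical.

```python
import math


def entropy(associations):
    """Shannon entropy (base 2) of a probability-like association vector."""
    # Terms with zero association contribute nothing to the sum.
    return -sum(p * math.log2(p) for p in associations if p > 0)


def information_content(associations):
    """Reciprocal of the entropy; infinite when the entropy is zero."""
    he = entropy(associations)
    return math.inf if he == 0 else 1.0 / he


uniform_two = entropy([0.5, 0.5])
ic_uniform_four = information_content([0.25, 0.25, 0.25, 0.25])
ic_single = information_content([1.0])
```

Under this reading, a hierarchy whose associations concentrate on a few range concepts has low entropy and therefore high information content.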

Inheritance Richness (IR)

• N: number of domain concepts in H
• R(DCi): number of relations associated with the domain concept DCi
• S(DCi): number of children under the domain concept DCi

For each domain concept X, IR(X) = R(X) * S(X); IR(H) is the average of these values over all domain concepts.

[Figure legend: domain concept, range concept, IS-A, non-IS-A.]

\[ IR(H) = \frac{1}{N} \sum_{i=1}^{N} R(DC_i) \cdot S(DC_i) \]
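The averaging step described on this slide (IR(X) = R(X) * S(X) for each domain concept X, then the mean) can be sketched as below. The dictionary-based inputs and the example counts are assumptions made for illustration.

```python
def inheritance_richness(relations, children):
    """Average of R(X) * S(X) over all domain concepts X.

    relations: dict mapping a domain concept to its number of relations, R(X)
    children:  dict mapping a domain concept to its number of children, S(X)
    """
    if not relations:
        return 0.0
    # A concept with no entry in `children` is treated as a leaf (S = 0).
    total = sum(r * children.get(concept, 0) for concept, r in relations.items())
    return total / len(relations)


# Hypothetical counts for a two-concept hierarchy.
ir = inheritance_richness(
    relations={"Beverage": 2, "Wine": 3},
    children={"Beverage": 2, "Wine": 1},
)
```

Multiplying by the number of children rewards relations attached high in the IS-A hierarchy, since they are inherited by more concepts.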

Dimensional Richness (DR)

The dimensional coverage of relationships in a hierarchy. The richness of these relationships is measured by the selected range concepts and their corresponding domain concepts.

[Equation: DR(H), a sum over i = 1, ..., Q in terms of Q, N, and M_i.]

[Figure: four sets of the form {DCi, RCj, DCk, RCl, ...}.]

Relational Richness (RR)

\[ RR(H) = \frac{1}{Q} \sum_{t=1}^{Q} R_t \]

The dimensional coverage of relations in a hierarchy. The richness of these relations is measured by the selected relations for each category in the hierarchy:

{Ri, Rj, ...}, {Rk, Rl, ...}, {Rm, Rn, ...}, {Ro, Rp, ...}
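One possible reading of this measure is the average number of selected relations per thematic category. The sketch below implements only that reading; it is an assumption, not a confirmed definition, and the function name and example sets are hypothetical.

```python
def relational_richness(relations_by_category):
    """Average number of selected relations per thematic category."""
    if not relations_by_category:
        return 0.0
    total = sum(len(rels) for rels in relations_by_category.values())
    return total / len(relations_by_category)


# Hypothetical relation sets, one per category.
rr = relational_richness({
    "functional": {"hasMaker"},
    "attributive": {"hasSugar", "hasBody", "hasFlavor"},
    "spatial": {"locatedIn", "adjacentTo"},
})
```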

Domain Importance (DMI)

• The richness of the core domain(s) of hierarchy H_k compared to other hierarchies.

\[ DMF(H_k) = IC(H_k) \cdot IR(H_k) \cdot DR(H_k) \cdot RR(H_k) \]

\[ DMI(H_k) = \frac{DMF(H_k)}{\mathrm{MAX}_{i=1}^{k}\big(DMF(H_i)\big)} \]
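A minimal sketch of the comparison across hierarchies follows. It assumes that DMF combines the four metrics as a product and that DMI divides each hierarchy's DMF by the largest DMF in the ontology; both are readings of the slide, and the function names are hypothetical.

```python
def dmf(ic, ir, dr, rr):
    """Combine the four per-hierarchy metrics (assumed here to be a product)."""
    return ic * ir * dr * rr


def dmi(dmf_values):
    """Scale each hierarchy's DMF by the largest DMF in the ontology."""
    top = max(dmf_values)
    if top == 0:
        return [0.0 for _ in dmf_values]
    return [value / top for value in dmf_values]


# Hypothetical metric values for one hierarchy, and DMF values for three.
single = dmf(ic=0.5, ir=4.0, dr=2.0, rr=1.0)
importance = dmi([2.0, 4.0, 1.0])
```

With this normalization the strongest hierarchy always receives an importance of 1, and the others are expressed as a fraction of it.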

Ontology Evaluation Score

• Combine the richness of hierarchies together into a single model that can effectively evaluate ontologies.

\[ \rho(o) = \mathrm{MAX}_{i=1}^{k}\big(DMF(H_i)\big) \cdot \frac{1}{K} \sum_{i=1}^{K} DMI(H_i) \]

K: the number of hierarchies in a given ontology
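The combination step can be sketched as below, assuming the score multiplies the largest per-hierarchy DMF by the average domain importance over the K hierarchies, with DMI taken as DMF divided by that maximum. This is an interpretation of the slide, and `evaluation_score` is a hypothetical name.

```python
def evaluation_score(dmf_values):
    """Largest DMF times the mean DMI over all K hierarchies (assumed form)."""
    if not dmf_values:
        raise ValueError("an ontology needs at least one hierarchy")
    top = max(dmf_values)
    if top == 0:
        return 0.0
    dmi_values = [value / top for value in dmf_values]
    return top * sum(dmi_values) / len(dmf_values)


# Hypothetical DMF values for an ontology with three hierarchies.
rho = evaluation_score([2.0, 4.0, 6.0])
```

Under these assumptions the score reduces algebraically to the mean DMF across hierarchies, so an ontology with one strong hierarchy and several weak ones scores lower than one with uniformly rich hierarchies.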

Experiments

• We analyze three related university ontologies
  – http://www.ksl.stanford.edu/projects/DAML/ksl-daml-desc.daml
  – http://www.ksl.stanford.edu/projects/DAML/ksl-daml-instances.daml
  – http://www.cs.umd.edu/projects/plus/DAML/onts/univ1.0.daml
• Preprocessing
  – convert the DAML files to OWL using a mindswap converting tool
  – assign a type to the relations in these ontologies
  – generate summaries
• The application is implemented in Java using the Protégé OWL 3.3 beta API.

H5: Document - attributive, functional and temporal
H7: Organization - conceptual and attributive
H6: Organism

The evaluation score of the University-I (ρ) is 6.109.

The best hierarchy in O2 is H6, whereas in O1 it is H5.
The evaluation score of the ontology (ρ) is 3.909.

The evaluation score of the University-III (ρ) is 4.567.

Comparison of the three ontologies

Conclusions

• Pan-Onto-Eval
  – A comprehensive approach to evaluating an ontology considering various aspects: structure, semantics, and domain.
  – A formal treatment of the model
• The experimental results demonstrate benefits of the proposed model.
• Overall, the model has great potential for the evaluation of distributed knowledge in the Semantic Web.
• Limitations
  – Lack of rigorous evaluation by experts.
  – Preprocessing: summarization, relation type assignment
  – Verified for real applications.