ONTOLOGY EVALUATION AND RANKING USING ONTOQA By. Samir Tatir and I.Budak Arpinar Department of Industrial Engineering Park Jihye


ONTOLOGY EVALUATION AND RANKING USING ONTOQA

By Samir Tatir and I. Budak Arpinar

Department of Industrial Engineering

Park Jihye


WHY “ONTOQA”?

More and more ontologies are being introduced
Difficult to find a good ontology related to the user’s work
Need tools for evaluating and ranking the ontologies

Provides a flexible technique to rank ontologies based on their contents and their relevance to the user’s keywords

OntoQA is the first approach that evaluates ontologies using their instances as well as schemas


CONTENTS

Architecture

Terminology

The Metrics
  Schema Metrics
  Instance Metrics

Ontology Score Calculation

Experiments and Evaluation

Conclusion


ARCHITECTURE

1. Input: Ontology
  a. OntoQA calculates metric values


ARCHITECTURE

2. Input: Ontology and Keywords
  a. OntoQA calculates metric values
  b. Uses WordNet to expand the keywords to include any related keywords that might exist in the ontology
  c. Uses metric values to evaluate the overall contents of the ontology and obtain its relevance to the keywords


ARCHITECTURE

3. Input: Keywords
  a. OntoQA uses Swoogle to find the RDF and OWL ontologies in the top 20 results returned by Swoogle
  b. OntoQA then evaluates each of the ontologies
  c. OntoQA finally displays the list of ontologies ranked by their score


TERMINOLOGY

Schema
  A set of classes, $C$
  A set of relationships, $P$
  A set of class-ancestor pairs, $H$

Knowledgebase
  A set of instances, $I$
  A class instantiation function, $inst(C_i)$
  A relationship instantiation function, $instr(I_i, I_j)$
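The terminology above can be held in two small containers. This is only a sketch: the class names `Schema` and `KnowledgeBase`, the tuple layouts, and the toy university data are assumptions made for illustration, not OntoQA's actual data model.

```python
from dataclasses import dataclass


@dataclass
class Schema:
    classes: set[str]                          # C
    relationships: set[tuple[str, str, str]]   # P: (name, domain class, range class)
    ancestors: set[tuple[str, str]]            # H: (class, ancestor) pairs


@dataclass
class KnowledgeBase:
    instances: set[str]                        # I
    class_of: dict[str, str]                   # instance -> the class it instantiates
    triples: set[tuple[str, str, str]]         # (subject, relationship name, object)

    def inst(self, cls: str) -> set[str]:
        """inst(C_i): instances that directly instantiate cls."""
        return {i for i, c in self.class_of.items() if c == cls}

    def instr(self, subj: str, obj: str) -> set[str]:
        """instr(I_i, I_j): names of the relationships linking subj to obj."""
        return {r for s, r, o in self.triples if s == subj and o == obj}


# Hypothetical toy data.
schema = Schema(
    classes={"Person", "Professor", "Student", "Course"},
    relationships={("teaches", "Professor", "Course")},
    ancestors={("Professor", "Person"), ("Student", "Person")},
)
kb = KnowledgeBase(
    instances={"alice", "bob", "db101"},
    class_of={"alice": "Professor", "bob": "Student", "db101": "Course"},
    triples={("alice", "teaches", "db101")},
)
print(kb.inst("Professor"))        # {'alice'}
print(kb.instr("alice", "db101"))  # {'teaches'}
```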


METRICS

Two dimensions

Schema
  Ontology design and its potential for rich knowledge representation

Instances
  Placement of instance data and distribution of the data
  Overall knowledgebase metrics
  Class-specific metrics
  Relationship-specific metrics


SCHEMA METRICS (1)

Relationship Diversity (RD)
  Shows whether the ontology is mostly a taxonomy or has diverse relationships; the user may prefer either
  If the RD value is close to 0, most of the relationships are inheritance relationships
  If the RD value is close to 1, most of the relationships are non-inheritance relationships

$$RD = \frac{|P|}{|H| + |P|}$$
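A small worked example may help with reading RD as the share of non-inheritance relationships, $|P| / (|H| + |P|)$. The function name and the toy sets are assumptions for illustration.

```python
def relationship_diversity(inheritance_pairs, relationships):
    """RD = |P| / (|H| + |P|)."""
    total = len(inheritance_pairs) + len(relationships)
    if total == 0:
        raise ValueError("schema has no relationships of either kind")
    return len(relationships) / total


# Hypothetical schema: three inheritance pairs, one non-inheritance relationship.
H = {("Professor", "Person"), ("Student", "Person"), ("Course", "Thing")}
P = {("teaches", "Professor", "Course")}
rd = relationship_diversity(H, P)
print(rd)  # 0.25
```

A value of 0.25 sits nearer to 0 than to 1, so this toy schema is mostly a taxonomy.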


SCHEMA METRICS (2)

Schema Deepness (SD)
  Distinguishes a shallow ontology from a deep ontology
  If the SD value is low, the ontology is deep and covers a specific domain in a detailed manner
  If the SD value is high, the ontology is shallow and represents a wide range of general knowledge

$$SD = \frac{|H|}{|C|}$$
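SD can be computed directly from the two sets in the formula, $|H| / |C|$. The sketch below simply counts the pairs it is given; the function name and data are assumptions.

```python
def schema_deepness(inheritance_pairs, classes):
    """SD = |H| / |C|."""
    if not classes:
        raise ValueError("schema has no classes")
    return len(inheritance_pairs) / len(classes)


# Hypothetical schema with four classes and two class-ancestor pairs.
C = {"Person", "Professor", "Student", "Course"}
H = {("Professor", "Person"), ("Student", "Person")}
sd = schema_deepness(H, C)
print(sd)  # 0.5
```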


INSTANCE METRICS (1) OVERALL KB METRICS

Class Utilization (CU)
  Indicates how the classes defined in the schema are being utilized in the knowledgebase
  $C'$ is the set of populated classes
  If the CU value is low, the knowledgebase does not have data that exemplifies all the knowledge that exists in the schema

$$CU = \frac{|C'|}{|C|}$$
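As a sketch of CU, the populated set $C'$ can be derived from an instance-to-class mapping. The function name and data are assumptions. A class counts as populated only when it has direct instances, which is an interpretation the slide does not spell out.

```python
def class_utilization(classes, class_of):
    """CU = |C'| / |C|, where C' is the set of classes with at least one instance."""
    if not classes:
        raise ValueError("schema has no classes")
    populated = {c for c in class_of.values() if c in classes}  # C'
    return len(populated) / len(classes)


# Hypothetical data: only Professor and Student have instances.
classes = {"Person", "Professor", "Student", "Course"}
class_of = {"alice": "Professor", "bob": "Student", "carol": "Student"}
cu = class_utilization(classes, class_of)
print(cu)  # 0.5
```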


INSTANCE METRICS (1) OVERALL KB METRICS

Cohesion (Coh)
  Represents the number of connected components ($CC$) in the KB

$$Coh = |CC|$$

Class Instance Distribution (CID)
  Indicates how instances are spread across the classes of the schema
  Standard deviation in the number of instances per class

$$CID = StdDev(Inst(C_i))$$
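Both overall metrics can be sketched with the standard library. Two assumptions are made here that the slide does not state: relationship links are treated as undirected when counting connected components, and the population standard deviation (`statistics.pstdev`) is used for CID. Function names and data are hypothetical.

```python
from statistics import pstdev


def cohesion(instances, links):
    """Coh: number of connected components of the instance graph (links undirected)."""
    neighbours = {i: set() for i in instances}
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen = set()
    components = 0
    for start in instances:
        if start in seen:
            continue
        components += 1           # found an unvisited component
        stack = [start]
        while stack:              # depth-first walk over that component
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(neighbours[node] - seen)
    return components


def class_instance_distribution(classes, class_of):
    """CID: standard deviation of the number of instances per class."""
    counts = [sum(1 for c in class_of.values() if c == cls) for cls in classes]
    return pstdev(counts)


instances = {"alice", "bob", "carol", "dave"}
links = [("alice", "bob"), ("alice", "carol")]   # dave is not linked to anyone
class_of = {"alice": "Professor", "bob": "Student", "carol": "Student", "dave": "Student"}

coh = cohesion(instances, links)
cid = class_instance_distribution(["Professor", "Student"], class_of)
print(coh)  # 2
print(cid)  # 1.0
```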


INSTANCE METRICS (2) CLASS SPECIFIC METRICS

Class Connectivity (Conn)
  Indicates the centrality of a class
  $NIREL(C_i)$ is the set of relationships that instances of the class have with instances of other classes

$$Conn(C_i) = |NIREL(C_i)|$$
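One possible reading of Conn is sketched below. It counts the relationship instances whose subject belongs to the class and whose object belongs to a different class. Counting only outgoing links is an assumption, and the function name and data are hypothetical.

```python
def class_connectivity(cls, class_of, triples):
    """Conn(C_i) = |NIREL(C_i)|, counting links from instances of cls to other classes."""
    nirel = {
        (s, r, o)
        for s, r, o in triples
        if class_of[s] == cls and class_of[o] != cls
    }
    return len(nirel)


class_of = {"alice": "Professor", "bob": "Student", "carol": "Student", "db101": "Course"}
triples = {
    ("alice", "teaches", "db101"),
    ("alice", "advises", "bob"),
    ("bob", "enrolled_in", "db101"),
    ("bob", "friend_of", "carol"),   # same class on both ends, so not counted
}
conn_professor = class_connectivity("Professor", class_of, triples)
conn_student = class_connectivity("Student", class_of, triples)
print(conn_professor)  # 2
print(conn_student)    # 1
```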


INSTANCE METRICS (2) CLASS SPECIFIC METRICS

Class Importance (Imp)
  Indicates what parts of the ontology are considered focal and what parts are on the edge
  Number of instances that belong to the inheritance subtree rooted at $C_i$ in the KB, $Inst(C_i)$, compared to the total number of class instances in the KB ($CI$)

$$Imp(C_i) = \frac{|Inst(C_i)|}{|KB(CI)|}$$
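Class importance can be sketched from the class-ancestor pairs and an instance-to-class mapping. The sketch assumes the pairs list every ancestor of a class, so the subtree rooted at a class is the class plus everything that names it as an ancestor. Names and data are hypothetical.

```python
def class_importance(cls, ancestors, class_of):
    """Imp(C_i): share of all class instances that fall in the subtree rooted at cls."""
    if not class_of:
        raise ValueError("knowledgebase has no instances")
    subtree = {cls} | {c for c, anc in ancestors if anc == cls}
    in_subtree = sum(1 for c in class_of.values() if c in subtree)
    return in_subtree / len(class_of)


ancestors = {("Professor", "Person"), ("Student", "Person")}
class_of = {"alice": "Professor", "bob": "Student", "carol": "Student", "db101": "Course"}

imp_person = class_importance("Person", ancestors, class_of)
imp_student = class_importance("Student", ancestors, class_of)
print(imp_person)   # 0.75
print(imp_student)  # 0.5
```

In this toy KB, Person is the focal part of the ontology even though it has no direct instances, because three of the four instances sit in its subtree.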


INSTANCE METRICS (2) CLASS SPECIFIC METRICS

Relationship Utilization (RU)
  Reflects how the relationships defined for each class in the schema are being used at the instance level
  $IREL(C_i)$ is the set of distinct relationships used by instances of a class $C_i$
  $CREL(C_i)$ is the set of relationships a class $C_i$ has with another class $C_j$

$$RU(C_i) = \frac{|IREL(C_i)|}{|CREL(C_i)|}$$

$$IREL(C_i) = \{\, instr(I_i, I_j) \ \text{where}\ I_i \in inst(C_i) \,\}$$

$$CREL(C_i) = \{\, P(C_i, C_j) \,\}$$
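The ratio of used to defined relationships can be sketched as follows. Relationships are identified by name, and only names that the schema defines for the class are counted as used. Both choices are assumptions, as are the function name and data.

```python
def relationship_utilization(cls, schema_relationships, class_of, triples):
    """RU(C_i) = |IREL(C_i)| / |CREL(C_i)|."""
    # CREL: relationships the schema defines with cls as the domain class.
    crel = {name for name, domain, _range in schema_relationships if domain == cls}
    if not crel:
        raise ValueError(f"schema defines no relationships for {cls}")
    # IREL: distinct relationships actually used by instances of cls.
    irel = {r for s, r, _o in triples if class_of[s] == cls} & crel
    return len(irel) / len(crel)


schema_relationships = {
    ("teaches", "Professor", "Course"),
    ("advises", "Professor", "Student"),
}
class_of = {"alice": "Professor", "db101": "Course", "ml201": "Course"}
triples = {("alice", "teaches", "db101"), ("alice", "teaches", "ml201")}

ru = relationship_utilization("Professor", schema_relationships, class_of, triples)
print(ru)  # 0.5
```

Here "teaches" is used twice but counts once, and "advises" is never used, so half of the relationships defined for Professor are utilized.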


INSTANCE METRICS (3) RELATIONSHIP-SPECIFIC METRICS

Relationship Importance (Imp)
  Measures the percentage of importance of the current relationship
  Number of instances of relationship $R_i$ in the KB, $Inst(R_i)$, compared to the total number of property instances in the KB ($RI$)

$$Imp(R_i) = \frac{|Inst(R_i)|}{|KB(RI)|}$$
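Relationship importance is a plain share of relationship instances. The sketch below assumes the KB is available as a list of (subject, relationship, object) triples; names and data are hypothetical.

```python
def relationship_importance(relationship, triples):
    """Imp(R_i): instances of the relationship divided by all relationship instances."""
    if not triples:
        raise ValueError("knowledgebase has no relationship instances")
    count = sum(1 for _s, r, _o in triples if r == relationship)
    return count / len(triples)


triples = [
    ("alice", "teaches", "db101"),
    ("alice", "teaches", "ml201"),
    ("dan", "teaches", "os301"),
    ("alice", "advises", "bob"),
]
imp_teaches = relationship_importance("teaches", triples)
imp_advises = relationship_importance("advises", triples)
print(imp_teaches)  # 0.75
print(imp_advises)  # 0.25
```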


ONTOLOGY SCORE CALCULATION

Evaluation of the ontology based on the entered keywords

I. The terms entered by the user are extended by adding any related terms
II. OntoQA determines the classes and relationships whose names contain any term of the extended set of terms
III. The overall metrics are aggregated to get an overall score for the ontology

$$Score = \sum_i W_i \cdot Metric_i$$
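The three steps can be sketched end to end. The `RELATED_TERMS` dictionary is a hypothetical stand-in for the WordNet lookup, and the weights, metric values, and function names are made up for illustration. Only the weighted sum itself comes from the slide.

```python
# Hypothetical stand-in for WordNet: maps a keyword to related terms.
RELATED_TERMS = {"paper": {"publication", "article"}}


def expand_terms(keywords):
    """Step I: extend the user's terms with related terms."""
    expanded = set()
    for keyword in keywords:
        keyword = keyword.lower()
        expanded.add(keyword)
        expanded |= RELATED_TERMS.get(keyword, set())
    return expanded


def matching_names(names, terms):
    """Step II: class or relationship names that contain any of the terms."""
    return {n for n in names if any(t in n.lower() for t in terms)}


def ontology_score(weights, metrics):
    """Step III: Score = sum of W_i * Metric_i."""
    return sum(weights[name] * metrics[name] for name in weights)


terms = expand_terms(["Paper"])
matches = matching_names({"Publication", "Author", "JournalArticle"}, terms)
score = ontology_score({"RD": 0.5, "SD": 0.5}, {"RD": 0.25, "SD": 0.75})
print(sorted(matches))  # ['JournalArticle', 'Publication']
print(score)            # 0.5
```

Ranking then amounts to computing this score for each candidate ontology and sorting in descending order.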


EXPERIMENTS AND EVALUATION

Compare the rankings of OntoQA, OntoRank of Swoogle, and a group of expert users.

OntoRank 1)
  Similar to Google’s PageRank approach
  Gives preference to popular ontologies
  $wPR(a)$ is a weighted PageRank variation

$$OntoRank(a) = wPR(a) + \sum_{x \in OTC(a)} wPR(x)$$

1) Finin T., et al. Swoogle: Searching for knowledge on the Semantic Web
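The OntoRank sum can be illustrated with precomputed inputs. The slide does not define how $wPR$ or $OTC(a)$ are obtained, so both are taken here as given dictionaries with made-up values. This is only the final summation, not Swoogle's implementation.

```python
def onto_rank(ontology, wpr, otc):
    """OntoRank(a) = wPR(a) + sum of wPR(x) for every x in OTC(a)."""
    return wpr[ontology] + sum(wpr[x] for x in otc[ontology])


# Hypothetical weighted PageRank values and OTC sets.
wpr = {"a": 0.5, "b": 0.25, "c": 0.125}
otc = {"a": {"b", "c"}, "b": set(), "c": set()}

rank_a = onto_rank("a", wpr, otc)
rank_b = onto_rank("b", wpr, otc)
print(rank_a)  # 0.875
print(rank_b)  # 0.25
```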


EXPERIMENTS AND EVALUATION

Problem of OntoRank 1)

If two copies of the same ontology are placed in two different locations and one of these locations is cited more than the other, OntoRank will rank the copy at the more popular location higher than the other copy

OntoQA will give both copies the same ranking

1) Finin T., et al. Swoogle: Searching for knowledge on the Semantic Web


EXPERIMENTS AND EVALUATION

[Figure: two charts comparing the rankings given by the users, OntoQA, and Swoogle. Panels: "With Balanced Weight" and "With Higher Weight for Schema Size". Axis ticks run from 1 to 9 and from 0 to 10.]


CONCLUSION

Different from other approaches in that it is tunable and requires minimal user involvement

Considers both the schema and the instances of a populated ontology


REVIEW

Ranking result depends highly on the weights

Difficult to decide proper weights
  Because the metrics are inconsistent, every metric has its own range
  => “same weight” doesn’t mean “same preference”
  About 10 kinds of metrics, so too many combinations to try
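The range problem can be seen in a tiny numeric sketch. CU lies between 0 and 1, while CID is a standard deviation with no upper bound, so equal weights let CID decide the ranking. The two ontologies and all metric values below are made up for illustration.

```python
def ontology_score(weights, metrics):
    """Weighted sum of metric values, as in the score formula."""
    return sum(weights[name] * metrics[name] for name in weights)


# Equal weights, but the metrics live on very different scales.
weights = {"CU": 1.0, "CID": 1.0}
ontology_a = {"CU": 0.9, "CID": 40.0}   # almost every class is populated
ontology_b = {"CU": 0.1, "CID": 45.0}   # almost no class is populated

score_a = ontology_score(weights, ontology_a)
score_b = ontology_score(weights, ontology_b)
print(score_b > score_a)  # True
```

Ontology B wins despite its much lower CU, because a difference of 5 in CID outweighs a difference of 0.8 in CU under equal weights.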