PROMPT: Algorithm and Tool for Automated Ontology Merging and Alignment
Natalya F. Noy
Stanford Medical Informatics
Stanford University
Outline
Definitions and motivation
The PROMPT ontology-merging algorithm
  Incremental algorithm (PROMPT)
  Statistical algorithm (Anchor-PROMPT)
The tools
Evaluation
Future work
Ontologies
Characterize concepts and relationships in an application area, providing a domain of discourse
Enumerate concepts, attributes of concepts, and relationships among concepts
Define constraints on relationships among concepts
Why do we need ontologies?
An ontology provides a shared vocabulary for different applications in a domain
An ontology enables interoperation among applications using disparate data sources from the same domain
Ontologies Are Everywhere
Ontologies have been used in academic projects for a long time
  Knowledge sharing and reuse
  Reuse of problem-solving methods
Ontologies are becoming widely used outside of academia
  Categorization of Web sites (e.g., Yahoo!)
  Product catalogs
Need for Ontology Merging
There is significant overlap in existing ontologies
  Yahoo! and DMOZ Open Directory
  Product catalogs for similar domains
Need for Ontology Merging and Integration
Need to merge or align overlapping ontologies
  Chemdex™: a portal for accessing life-science supply catalogs
Workshop on “Ontologies and Information Sharing” at IJCAI’2001
  6 out of 18 papers (1/3) are about ontology merging and integration
What Is Ontology Merging
Existing Approaches
Ontology design and integration
  term matching (Stanford SKC, ISI)
  graph-based analysis (Stanford SKC)
  transformation operators (Ontomorph at ISI)
  merging tools (Chimaera at Stanford KSL)
Object-oriented programming
  subject-oriented programming (IBM)
    “subjective” views of classes
    transformation operations
    concentrates on methods rather than relations
Existing Approaches (II)
Databases
  develop mediators and provide wrappers
  define a common data model and mappings
  define matching rules to translate directly
Most of these approaches
  do not provide any guidance to the user
  do not use structural information
PROMPT
Our approach is:
  Partial automation
  Algorithms based on
    concept-representation structure
    relations between concepts
    user’s actions
Our approach is not:
  Complete automation
  Algorithm for matching concept names
Knowledge Model
A generic knowledge model of OKBC (Open Knowledge-Base Connectivity Protocol)
Classes
  Collections of objects with similar properties
  Arranged in a subclass–superclass hierarchy
Instances
Slots
  First-class objects in a knowledge base
  Binary relations describing properties of classes and instances
Facets
  Constraints on slot values (cardinality, min, max)
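The knowledge model above can be illustrated with a few plain Python data classes. This is only a sketch: the class and field names (`Facet`, `Slot`, `OntClass`, `Instance`, `min_cardinality`, and so on) are assumptions made for illustration and are not the OKBC API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Facet:
    """Constraints on a slot's values (field names are illustrative)."""
    min_cardinality: int = 0
    max_cardinality: Optional[int] = None  # None means no upper bound
    allowed_classes: Set[str] = field(default_factory=set)


@dataclass
class Slot:
    """A first-class binary relation that can be attached to classes."""
    name: str
    facet: Facet = field(default_factory=Facet)


@dataclass
class OntClass:
    """A class in a subclass-superclass hierarchy with template slots."""
    name: str
    superclasses: Set[str] = field(default_factory=set)
    template_slots: Dict[str, Slot] = field(default_factory=dict)


@dataclass
class Instance:
    """An individual that belongs to one class and holds slot values."""
    name: str
    of_class: str
    slot_values: Dict[str, List[str]] = field(default_factory=dict)


def violates_cardinality(instance: Instance, slot: Slot) -> bool:
    """Check the cardinality facet of one slot against an instance."""
    count = len(instance.slot_values.get(slot.name, []))
    upper = slot.facet.max_cardinality
    return count < slot.facet.min_cardinality or (upper is not None and count > upper)


# Hypothetical data loosely based on the travel example in these slides.
has_client = Slot("has client", Facet(min_cardinality=1, allowed_classes={"Traveler"}))
agent = OntClass("Agent", superclasses={"Employee"}, template_slots={"has client": has_client})
idle_agent = Instance("agent-1", of_class="Agent")  # no clients yet
busy_agent = Instance("agent-2", of_class="Agent", slot_values={"has client": ["traveler-7"]})
```

Here `violates_cardinality(idle_agent, has_client)` flags the instance with no clients, while `busy_agent` satisfies the facet.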
The PROMPT Algorithm
[Flow diagram: Make initial suggestions; Select the next operation; Perform automatic updates; Find conflicts; Make suggestions]
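The five steps of the PROMPT algorithm can be read as an interactive loop. The sketch below assumes that the cycle repeats until no suggestions remain or the user stops; the function name `run_merge_session` and the four callbacks are hypothetical stand-ins, not part of the PROMPT tool.

```python
def run_merge_session(initial_suggestions, choose, perform, find_conflicts, make_suggestions):
    """Drive a suggestion/operation cycle and return the log of operations.

    choose(pending) stands in for the user picking the next operation;
    perform(op) applies it and returns the set of affected concept names;
    find_conflicts and make_suggestions inspect only those affected concepts.
    """
    pending = list(initial_suggestions)
    log = []
    while pending:
        op = choose(pending)
        if op is None:  # the user decided to stop
            break
        pending.remove(op)
        affected = perform(op)               # operation plus automatic updates
        log.append(op)
        conflicts = find_conflicts(affected)
        for suggestion in make_suggestions(affected, conflicts):
            if suggestion not in pending and suggestion not in log:
                pending.append(suggestion)
    return log


# Toy callbacks, invented for this example.
def choose_first(pending):
    return pending[0] if pending else None


def perform(op):
    _kind, first, second = op
    return {first, second}


def find_conflicts(affected):
    return []


def make_suggestions(affected, conflicts):
    if "Agent" in affected:
        return [("merge-slots", "agent for", "has client")]
    return []


log = run_merge_session(
    [("merge-classes", "Agent", "Agent")],
    choose_first, perform, find_conflicts, make_suggestions,
)
```

In this toy run, merging the two Agent classes triggers a new merge-slots suggestion, which the simulated user then accepts.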
Example: merge-classes
[Diagram labels. Ontology 1: Agency employee, Agent, Customer; relations: subclass of, agent for. Ontology 2: Agent, Employee, Traveler; relations: subclass of, has client. Merged: Agency employee, Agent, Employee, Customer, Traveler; relations: subclass of, subclass of, agent for, has client]
Example: merge-classes (II)
[Diagram labels. First panel: Agency employee, Agent, Employee, Customer, Traveler; relations: subclass of, subclass of, agent for, has client. Second panel: the same classes; relations: subclass of, subclass of, agent for]
Analyzing Global Properties Locally
Global properties
  classes that have the same sets of slots
  classes that refer to the same set of classes
  slots that are attached to the same classes
Local context
  incremental analysis
  consider only the concepts that were affected by the last operation
The PROMPT Operation Set
Extends the OKBC operation set with ontology-merging operations
  merge classes
  merge slots
  merge instances
  copy of a class
    deep or shallow
    with or without subclasses
    with or without instances
  …
After a User Performs an Operation
For each operation
  perform the operation
  consider possible conflicts
    identify conflicts
    propose solutions
  analyze local context
    create new suggestions
    reinforce or downgrade existing suggestions
Conflicts
Conflicts that PROMPT identifies
  name conflicts
  dangling references
  redundancy in a class hierarchy
  slot-value restrictions that violate class inheritance
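As an illustration of one conflict type from the list above, the following sketch detects dangling references: slots whose allowed class is not (yet) present in the merged ontology. The dictionary layout and the function name `find_dangling_references` are assumptions for this example and do not reflect PROMPT's internal representation.

```python
def find_dangling_references(classes):
    """Return (class, slot, missing class) triples.

    `classes` maps a class name to a dict of slot name -> allowed class name.
    A reference dangles when the allowed class is not itself a key of `classes`.
    """
    dangling = []
    for class_name, slots in classes.items():
        for slot_name, target in slots.items():
            if target not in classes:
                dangling.append((class_name, slot_name, target))
    return sorted(dangling)


# Hypothetical merged ontology: Agent was copied, Customer was not.
merged = {
    "Agent": {"agent for": "Customer"},
    "Employee": {},
}
dangling = find_dangling_references(merged)
```

With this data the only dangling reference is the "agent for" slot on Agent, which still points to the missing class Customer.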
Example: merge-classes
[Diagram labels: Agent, Agent (source classes); Agent (merged class)]
Operation Steps: merge-classes
Own slots and their values for the new class: ask the user in case of conflicts or use preferences
Template slots for the new class: union of template slots of the original classes
Subclasses and superclasses for the new class
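The merge-classes steps can be sketched as a small function over plain dictionaries. This is a simplified illustration: the dictionary keys, the `preferred` parameter (standing in for the user's preferred ontology), and the rule of keeping only superclasses that already exist in the merged ontology are assumptions based on these slides, not the tool's actual code.

```python
def merge_classes(first, second, merged_name, existing_classes, preferred="first"):
    """Merge two class descriptions given as plain dicts.

    Each dict has "own_slots" (name -> value), "template_slots" (list of
    names) and "superclasses" (list of names). Returns the merged class and
    the names of own slots whose values conflicted.
    """
    winner, loser = (first, second) if preferred == "first" else (second, first)

    # Own slots: on a conflict, keep the value from the preferred ontology.
    own_slots = dict(loser["own_slots"])
    own_slots.update(winner["own_slots"])
    conflicts = sorted(
        name for name in first["own_slots"]
        if name in second["own_slots"]
        and first["own_slots"][name] != second["own_slots"][name]
    )

    # Template slots: union of the template slots of the original classes.
    template_slots = sorted(set(first["template_slots"]) | set(second["template_slots"]))

    # Superclasses: re-establish only links to classes already in the merged ontology.
    candidates = set(first["superclasses"]) | set(second["superclasses"])
    superclasses = sorted(candidates & set(existing_classes))

    merged = {
        "name": merged_name,
        "own_slots": own_slots,
        "template_slots": template_slots,
        "superclasses": superclasses,
    }
    return merged, conflicts


# Hypothetical source classes based on the travel example.
agent_1 = {"own_slots": {"documentation": "A travel agent"},
           "template_slots": ["agent for"], "superclasses": ["Agency employee"]}
agent_2 = {"own_slots": {"documentation": "Sells trips"},
           "template_slots": ["has client"], "superclasses": ["Employee"]}
merged, conflicts = merge_classes(agent_1, agent_2, "Agent", existing_classes={"Employee"})
```

The returned `conflicts` list is what a tool would show the user when no preference is set.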
[Diagram labels: Conflicts, Suggestions; Agent, Agent, Agent; agent for]
Template Slots
Copy template slots that don’t exist in the merged ontology
[Diagram labels: agent for; Agent, Agent, Agent; has client; client, client]
Template Slots
Attach the slots that have already been mapped
Employee
Subclasses And Superclasses
If a superclass (subclass) exists, re-establish the links
[Diagram labels: Agent, Agent, Agent; Agency employee; superclass, superclass; Agent]
Dangling References
[Diagram labels: Agent, Agent; agent for; Customer; facet value (for example, allowed class); agent for; facet value; Customer_temp (dummy frame)]
[Diagram labels: Agent; client; has client]
Additional Suggestions: Merge Slots
If slot names at the merged class are similar, suggest to merge the slots
Agent
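The merge-slots suggestion relies on slot names being "similar". The slides do not say which similarity measure is used, so the sketch below substitutes Python's `difflib.SequenceMatcher` with an arbitrary threshold of 0.7; both choices are assumptions for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def suggest_slot_merges(slot_names, threshold=0.7):
    """Suggest pairs of slots at one class whose names look alike.

    The similarity measure and the threshold are illustrative choices.
    """
    suggestions = []
    for first, second in combinations(sorted(slot_names), 2):
        score = SequenceMatcher(None, first.lower(), second.lower()).ratio()
        if score >= threshold:
            suggestions.append((first, second, round(score, 2)))
    return suggestions


# Slots attached to the merged class Agent in the running example.
suggestions = suggest_slot_merges(["agent for", "client", "has client"])
```

For these names only the pair "client" and "has client" passes the threshold.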
Additional Suggestions: Merge Classes
If the set of classes referenced by the merged class is the same as the set of classes referenced by another class, suggest a merge
[Diagram labels: Reservation, Client; has clients; handles reservations; Agency employee]
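The "same set of referenced classes" heuristic can be sketched as a set comparison. The ontology layout (class name mapped to slot name mapped to referenced class) and the slot names "serves", "books", and "holds" are invented for this example.

```python
def referenced_classes(slots):
    """Set of classes that a class refers to through its slots."""
    return set(slots.values())


def suggest_class_merges(merged_class, ontology):
    """Suggest classes that reference exactly the same classes as merged_class."""
    target = referenced_classes(ontology[merged_class])
    if not target:
        return []  # an empty reference set carries no evidence
    return sorted(
        name for name, slots in ontology.items()
        if name != merged_class and referenced_classes(slots) == target
    )


# Hypothetical ontology fragment based on the labels in the slide's diagram.
ontology = {
    "Agent": {"has clients": "Client", "handles reservations": "Reservation"},
    "Agency employee": {"serves": "Client", "books": "Reservation"},
    "Traveler": {"holds": "Reservation"},
}
candidates = suggest_class_merges("Agent", ontology)
```

Here Agency employee refers to the same two classes as Agent, so it is the only merge candidate.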
[Diagram labels: Employee, Agency employee; Agent]
If names of superclasses (subclasses) of the merged class are similar, suggest merging the classes
[Diagram labels: superclass, superclass]
Additional Suggestions: Merge Classes
Check for Cycles
[Diagram labels: Person; Employee, Agency employee; Agent; superclass, superclass]
If there is a cycle, suggest removing one of the parents
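A cycle check over superclass links can be done with a depth-first search. The sketch below is a generic graph routine, not PROMPT's own code, and the example hierarchy is hypothetical.

```python
def find_cycle(superclasses):
    """Return a list of classes forming a cycle, or None if there is none.

    `superclasses` maps each class name to a list of its direct superclasses.
    The returned list starts and ends with the same class.
    """
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {}
    stack = []

    def visit(node):
        color[node] = GRAY          # node is on the current search path
        stack.append(node)
        for parent in superclasses.get(node, []):
            state = color.get(parent, WHITE)
            if state == GRAY:       # reached a class already on the path
                return stack[stack.index(parent):] + [parent]
            if state == WHITE:
                cycle = visit(parent)
                if cycle:
                    return cycle
        stack.pop()
        color[node] = BLACK         # fully explored, no cycle through node
        return None

    for node in superclasses:
        if color.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Hypothetical hierarchy after a merge that introduced a loop.
cyclic = {
    "Agent": ["Employee"],
    "Employee": ["Agency employee"],
    "Agency employee": ["Agent"],
}
acyclic = {"Agent": ["Employee"], "Employee": ["Person"], "Person": []}
cycle = find_cycle(cyclic)
```

Any link on the returned path is a candidate parent for the user to remove.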
To Summarize
Perform the actual operation
For the concepts (classes, slots, and instances) directly attached to the operation arguments
  perform global analysis for new suggestions
Perform global analysis for new conflicts
Non-local context
Classes directly referenced by C
Slots in C
Context
C
Anchor-PROMPT: Using Non-Local Contexts
Input: A set of anchor pairs
Output: A set of related terms with similarity scores
Where do anchors come from?
  Lexical matching
  Interactive tools
  User-specified
Ontology 1 Ontology 2
Generating Paths in the Graph
Similarity Score
Generate a set of all paths (of length < L)
Generate a set of all possible pairs of paths of equal length
For each pair of paths and for each pair of nodes in the identical positions in the paths, increment the similarity score
Combine the similarity score for all the paths
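The scoring steps above can be sketched as follows. Several details are assumptions, since the slides do not spell them out: paths are taken between pairs of anchors, only simple paths (no repeated node) are enumerated, each matching position adds 1 to the score, and scores are combined by summation. The graphs and class names are hypothetical.

```python
from collections import Counter
from itertools import permutations


def simple_paths(graph, start, end, length_limit):
    """Yield simple paths from start to end with fewer than length_limit edges."""
    stack = [[start]]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == end and len(path) > 1:
            yield path
            continue
        if len(path) >= length_limit:  # one more edge would reach the limit
            continue
        for neighbor in graph.get(node, []):
            if neighbor not in path:
                stack.append(path + [neighbor])


def anchor_similarity(graph_1, graph_2, anchors, length_limit):
    """Score pairs of intermediate nodes found at the same path positions."""
    scores = Counter()
    for (a1, b1), (a2, b2) in permutations(anchors, 2):
        paths_1 = list(simple_paths(graph_1, a1, a2, length_limit))
        paths_2 = list(simple_paths(graph_2, b1, b2, length_limit))
        for p1 in paths_1:
            for p2 in paths_2:
                if len(p1) == len(p2):
                    # Skip the endpoints: they are the anchors themselves.
                    for n1, n2 in zip(p1[1:-1], p2[1:-1]):
                        scores[(n1, n2)] += 1
    return scores


# Hypothetical directed graphs: class -> classes it refers to.
graph_1 = {"Agent": ["Reservation"], "Reservation": ["Customer"]}
graph_2 = {"Agent": ["Booking"], "Booking": ["Traveler"]}
anchors = [("Agent", "Agent"), ("Customer", "Traveler")]
scores = anchor_similarity(graph_1, graph_2, anchors, length_limit=3)
```

In this toy case the single pair of equal-length paths between the anchors yields one related pair, Reservation and Booking.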
Equivalence Groups
Anchor-PROMPT: Initial Results
TRIAL Trial
PERSON Person
CROSSOVER Crossover
PROTOCOL Design
TRIAL-SUBJECT Person
INVESTIGATORS Person
POPULATION Action_Spec
PERSON Character
TREATMENT-POPULATION Crossover_arm
Knowledge Model Assumptions
The only assumption:
An OKBC-compliant knowledge model
Protégé-2000
An environment for
  Ontology development
  Knowledge acquisition
Intuitive direct-manipulation interface
Extensibility
  Ability to plug in new components
Ontologies in Protégé-2000
Protégé-2000 plugins
Domain-specific user-interface plugins
Alternative back ends for archival storage
Utility programs for knowledge-acquisition tasks
End-user applications
Protégé-based PROMPT tool
Protégé-2000
  has an OKBC-compatible knowledge model
  allows building extensions through a plug-in mechanism
  can work as a knowledge-base server for the plug-ins
The PROMPT tool
The PROMPT tool features
Setting a preferred ontology
Maintaining the user’s focus
Providing feedback to the user
Preserving original relations
  subclass-superclass relations
  slot attachment
  facet values
Linking to the direct-manipulation ontology editor
Logging operations
Evaluation
Knowledge-based systems are rarely evaluated
We can use software-engineering approaches to empirical evaluation of tools
We need to develop additional knowledge-base measurements
Questions we asked
How good are PROMPT’s suggestions and conflict-resolution strategies?
Does PROMPT provide any benefit when compared to a generic ontology-editing tool (Protégé-2000)?
What we were trying to find out
The benefit that the tool provides
  Productivity benefit
  Quality improvement in the resulting ontologies
  User satisfaction
Precision and recall of the tool’s suggestions
Source ontologies for the experiments
Two ontologies of problem-solving methods
  the ontology for the Unified Problem-solving Method Development Language (UPML)
  the ontology for the Method-Description Language (MDL)
Experiment 1: Evaluate the quality of PROMPT’s suggestions
Metrics
  Precision
  Recall
Method
  Automatic logging
  Automatic data reporting
[Diagram: suggestions that the tool produced; operations that the user performed; suggestions that the user followed]
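Given the three sets named in the diagram, precision and recall can be computed as below. The sketch assumes that precision is the share of the tool's suggestions that the user followed and that recall is the share of the user's operations that had been suggested; the operation names are made-up sample data, not results from the experiment.

```python
def precision_recall(suggested, performed):
    """Compute precision and recall of suggestions against performed operations."""
    followed = suggested & performed  # suggestions that the user followed
    precision = len(followed) / len(suggested) if suggested else 0.0
    recall = len(followed) / len(performed) if performed else 0.0
    return precision, recall


# Made-up sample data for illustration only.
suggested = {"merge Agent", "merge Customer", "merge client slots", "copy Trip"}
performed = {"merge Agent", "merge Customer", "merge client slots",
             "rename Person", "remove Tour"}
precision, recall = precision_recall(suggested, performed)
```

With this sample, three of four suggestions were followed (precision 0.75) and three of five performed operations had been suggested (recall 0.6).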
Results: the quality of PROMPT’s suggestions
Suggestions that users followed: 90%
Conflict-resolution strategies that users followed: 75%
Knowledge-base operations generated automatically: 74%
Experiment 2: PROMPT versus generic Protégé-2000
Metrics
  content of the resulting ontologies
  number of explicit knowledge-base operations
PROMPT
Results: PROMPT versus generic Protégé-2000
The resulting ontologies had only one difference
Specifying operations explicitly
[Bar chart: number of operations specified explicitly, axis 0 to 60; bars: PROMPT, Protégé; data labels: 16, 60]
Results
Experts followed most of PROMPT’s suggestions
Using PROMPT has improved the efficiency of ontology merging
Anchor-PROMPT Evaluation
Experiment setup
  Two ontologies from the DAML ontology library
  Varying parameters
    maximum path length
    number of anchor pairs
Experiment results
  Ratio of correct results above the median similarity score
Anchor-PROMPT: Evaluation Results
Max path length   Number of anchors   Result precision
4                 4                   67%
4                 3                   67%
4                 2                   61%
3                 4                   67%
3                 3                   61%
3                 2                   56%
2                 4                   100%
2                 3                   100%
Anchor-PROMPT Evaluation Results
Equivalence groups of size <= 2 are required
A maximum path length of 2 provides extremely high precision (but low recall)
75% precision with maximum path lengths 3 and 4
Future work
Extend the set of heuristics that PROMPT uses for guiding the experts
Extend the techniques to ontology alignment and ontology refactoring
Develop protocols and metrics for a more detailed evaluation of the tools
http://protege.stanford.edu