
Ontology Mapping Tool for Diabetes

By Madhuri Gopal

Topics covered:

• Project Overview
• Design Principles
• Technology Stack
• Approach and Methodology
• Execution Framework
• Modules Covered
• Results

Project Overview: Background

• The aim of the project is to overcome semantic heterogeneity on the WWW by using ontology mapping techniques that find the semantic correspondences between similar elements of two ontologies.
• We aim to map ontologies created from standard documents in the diabetes medical domain.
• Our approach will enable better decision support for queries on these documents.

Challenges in the existing systems

• Identification of a safer drug regimen requires searching through a space of indicated regimens that outnumbers the pages Google searches 1000 to 1.
• A single criterion is insufficient to guide the selection of a safer regimen.
• Clinical data is gathered and stored in a fragmented way.
• Clinical data lacks a formal, standardized knowledge representation.

Design Principles

Open/Closed Principle: Software entities such as classes, modules and functions should be open for extension but closed for modification.

Dependency Inversion Principle:
a) High-level modules should not depend on low-level modules. Both should depend on abstractions.
b) Abstractions should not depend on details. Details should depend on abstractions.

Interface Segregation Principle: Clients should not be forced to depend upon interfaces that they don't use.

Design Principles contd…

Single Responsibility Principle: A class should have only one reason to change.

Liskov Substitution Principle: Derived types must be completely substitutable for their base types.
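As an illustration of how these principles could apply to a similarity-based mapping tool, the sketch below puts every similarity measure behind one small interface, so new measures are added as new classes and the pipeline depends only on the abstraction. All names here (SimilarityMeasure, ExactMatchSimilarity, MappingPipeline) are hypothetical and are not taken from the project's code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical abstraction: any measure that scores every pair of terms. */
interface SimilarityMeasure {
    double[][] compute(List<String> terms1, List<String> terms2);
}

/** One concrete measure. Further measures are added as new classes (open/closed). */
final class ExactMatchSimilarity implements SimilarityMeasure {
    @Override
    public double[][] compute(List<String> terms1, List<String> terms2) {
        double[][] m = new double[terms1.size()][terms2.size()];
        for (int i = 0; i < terms1.size(); i++) {
            for (int j = 0; j < terms2.size(); j++) {
                // 1.0 for a case-insensitive exact match, 0.0 otherwise
                m[i][j] = terms1.get(i).equalsIgnoreCase(terms2.get(j)) ? 1.0 : 0.0;
            }
        }
        return m;
    }
}

/** High-level module that depends only on the abstraction (dependency inversion). */
final class MappingPipeline {
    private final List<SimilarityMeasure> measures;

    MappingPipeline(List<SimilarityMeasure> measures) {
        this.measures = measures;
    }

    /** Runs every configured measure and returns one matrix per measure. */
    List<double[][]> run(List<String> terms1, List<String> terms2) {
        List<double[][]> matrices = new ArrayList<>();
        for (SimilarityMeasure measure : measures) {
            matrices.add(measure.compute(terms1, terms2));
        }
        return matrices;
    }
}
```

Any implementation of SimilarityMeasure can be passed to MappingPipeline without changing the pipeline itself, which is the substitutability the Liskov principle asks for.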

Technology Stack

The architecture is a two-tier architecture.

Front-end: Java
Back-end: Ontology (.owl files)

Development Hardware

Processor: Intel(R) Core™ 2 Duo CPU T6400 @ 2.00 GHz
Memory (RAM): 4 GB
System type: 32-bit operating system

Tools used

Protégé: ontology creation (Stanford open source tool)
PDPTools: neural network simulator (Stanford open source tool)

Approach and Methodology

• The software prototyping (incremental prototyping) methodology is used for development.
• The final product is built as separate prototypes.
• At the end, the separate prototypes are merged into an overall design.
• The steps are:
  a) Identification of basic requirements
  b) Development of the initial prototype
  c) Review of the prototype
  d) Revision and enhancement of the prototype

Execution Framework

• Eclipse IDE is used as the execution framework.
• All the required plugins (jar files) from protégé/plugins/edu.stanford.smi.protegex.owl and the OWL API (an open source API) are included in the build path of the Java project to access the ontologies built using Protégé (Stanford open source tool).
• The IAC neural network is implemented using the PDPTools suite of neural network software (Stanford tool for Parallel Distributed Processing), which runs in Matlab. All required inputs are passed from the Java environment through a connection between Eclipse and Matlab.

Overall Architecture

Modules covered

1) Creation of diabetes ontologies from the American Association of Clinical Endocrinologists document (benchmark document) and from Wikipedia.
2) Name similarity matrix calculated for all terms in both ontologies using the Levenshtein distance formula (dynamic programming technique).
3) Profile similarity matrix calculated using term frequency-inverse document frequency (tf-idf, a statistical data mining algorithm).
4) Conversion of ontology terms to a vector space model and computation of the cosine similarity matrix.

Modules covered contd…

5) Structural similarity matrix capturing the structural similarity between the ontologies, using basic structural features such as depth from root, number of children and number of instances.
6) Similarity aggregator for aggregating the name similarity, profile similarity and structural similarity.
7) Harmony function estimation for selecting the most useful similarities and eliminating erroneous ones.
8) IAC neural network algorithm that solves a constraint satisfaction problem to improve the mapping between the two ontologies.
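To give a feel for module 8, here is a minimal Java sketch of an interactive activation and competition (IAC) style update, in which each unit stands for one candidate mapping and negative weights encode constraints between competing candidates. The project itself runs the network in PDPTools under Matlab. The class name, the parameter values and the single input scale ALPHA are assumptions chosen for illustration, not values taken from the slides.

```java
import java.util.Arrays;

final class IacNetwork {
    // Assumed illustrative parameters (not taken from the slides).
    static final double MAX = 1.0;    // maximum activation
    static final double MIN = -0.2;   // minimum activation
    static final double REST = -0.1;  // resting activation
    static final double DECAY = 0.1;  // pull back toward REST
    static final double ESTR = 0.4;   // scale on external input
    static final double ALPHA = 0.1;  // scale on input from other units (simplification)

    /** One synchronous update of all unit activations. */
    static double[] step(double[] a, double[][] w, double[] ext) {
        int n = a.length;
        double[] next = new double[n];
        for (int i = 0; i < n; i++) {
            double net = ESTR * ext[i];
            for (int j = 0; j < n; j++) {
                // only units with positive activation send output
                net += ALPHA * w[i][j] * Math.max(0.0, a[j]);
            }
            double delta = (net > 0)
                    ? (MAX - a[i]) * net - DECAY * (a[i] - REST)
                    : (a[i] - MIN) * net - DECAY * (a[i] - REST);
            // keep the activation inside [MIN, MAX]
            next[i] = Math.min(MAX, Math.max(MIN, a[i] + delta));
        }
        return next;
    }

    /** Starts all units at REST and runs the given number of update cycles. */
    static double[] run(double[][] w, double[] ext, int cycles) {
        double[] a = new double[ext.length];
        Arrays.fill(a, REST);
        for (int c = 0; c < cycles; c++) {
            a = step(a, w, ext);
        }
        return a;
    }
}
```

In this sketch the external input of a unit would be the aggregated similarity of its candidate mapping, so two mutually inhibiting candidates settle with the better-supported one at the higher activation.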

Ontology Creation

- Using Protégé

Ontology 1

Ontology 2

Ontology Mapping

Ontology Mapping

Input: two homogeneous ontologies $O_1$ and $O_2$ expressed in a formal ontology language (OWL/RDF).

Output: a 4-tuple $M(e_{1i}, e_{2j}, r, s)$

where $M$ is the mapping and:

$e_{1i}$ is an element in $O_1$
$e_{2j}$ is an element in $O_2$
$r$ is the mapping relation between $e_{1i}$ and $e_{2j}$
$s$ is the confidence measure of the mapping, normalized to $[0, 1]$
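The output tuple could be held in a small immutable Java class such as the one sketched below. The class name, the field names and the use of plain strings for the elements and the relation are assumptions made for illustration. The only rule taken from the slide is that the confidence lies in [0, 1].

```java
/** Hypothetical holder for one mapping tuple M(e1i, e2j, r, s). */
final class Mapping {
    final String element1;   // e1i, an element of O1
    final String element2;   // e2j, an element of O2
    final String relation;   // r, the relation between the two elements
    final double confidence; // s, normalized to [0, 1]

    Mapping(String element1, String element2, String relation, double confidence) {
        // reject NaN and anything outside the normalized range
        if (Double.isNaN(confidence) || confidence < 0.0 || confidence > 1.0) {
            throw new IllegalArgumentException("confidence must be in [0, 1]: " + confidence);
        }
        this.element1 = element1;
        this.element2 = element2;
        this.relation = relation;
        this.confidence = confidence;
    }

    @Override
    public String toString() {
        return "M(" + element1 + ", " + element2 + ", " + relation + ", " + confidence + ")";
    }
}
```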

IR-Based Similarity Generator

Input: ontologies $O_1$, $O_2$

Output: three similarity matrices that contain similarity scores for each pair of elements in the ontologies.

Similarity matrices:
• Name Similarity
• Profile Similarity
• Structural Similarity

Name Similarity

This is calculated based on the edit distance between the names (IDs) of the elements:

$$\mathrm{NameSim}(e_{1i}, e_{2j}) = 1 - \frac{\mathrm{EditDist}(e_{1i}, e_{2j})}{\max(l(e_{1i}), l(e_{2j}))}$$

where:
$\mathrm{EditDist}$ is the Levenshtein distance between the element names.
$l(e_{1i})$ and $l(e_{2j})$ are the lengths of the strings $e_{1i}$ and $e_{2j}$.
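A minimal Java sketch of this measure follows: Levenshtein distance computed by dynamic programming, then normalized by the longer name. The class and method names are hypothetical, and returning 1.0 for two empty names is an assumption the slides do not cover.

```java
final class NameSimilarity {

    /** Levenshtein edit distance using two rolling rows of the DP table. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = (a.charAt(i - 1) == b.charAt(j - 1)) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** NameSim = 1 - EditDist / max(length). Two empty names count as identical (assumption). */
    static double nameSim(String a, String b) {
        int maxLen = Math.max(a.length(), b.length());
        if (maxLen == 0) {
            return 1.0;
        }
        return 1.0 - (double) levenshtein(a, b) / maxLen;
    }

    /** Name similarity matrix for all pairs of element names. */
    static double[][] matrix(String[] names1, String[] names2) {
        double[][] m = new double[names1.length][names2.length];
        for (int i = 0; i < names1.length; i++) {
            for (int j = 0; j < names2.length; j++) {
                m[i][j] = nameSim(names1[i], names2[j]);
            }
        }
        return m;
    }
}
```

For the textbook pair "kitten" and "sitting" the edit distance is 3 and the longer name has 7 characters, so the similarity is 1 - 3/7.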

Sample Output for two Ontologies with 6 elements each

Name similarity matrix of dimension 37 × 26

Profile Similarity:

The profile similarity is defined in 3 steps:

• Profile Enrichment
• Profile Propagation
• Profile Mapping

Profile Enrichment and Propagation

• Profile of a class: Class ID + Comments + Property Profiles + Instance Profiles
• Profile of a property: Property ID + Property Domain + Property Range
• Profile of an instance: Instance ID + Descriptive Information

Profile Mapping

• The cosine similarity between the profiles of the two elements $e_{1i}$ and $e_{2j}$ is calculated in a vector space model:

$$\mathrm{ProfileSim}(e_{1i}, e_{2j}) = \frac{\vec{V}_{e_{1i}} \cdot \vec{V}_{e_{2j}}}{|\vec{V}_{e_{1i}}|\,|\vec{V}_{e_{2j}}|}$$

where $\vec{V}_{e_{1i}}$ and $\vec{V}_{e_{2j}}$ are the two vectors representing the profiles of elements $e_{1i}$ and $e_{2j}$ respectively.
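The sketch below shows one way to turn tokenized profiles into tf-idf vectors and compare them by cosine similarity. The slides do not say which tf-idf variant the project uses, so the raw term count and the smoothed idf, log((1 + N) / (1 + df)) + 1, are assumptions, as are the class and method names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

final class ProfileSimilarity {

    /** Builds one sparse tf-idf vector per profile (a profile is a list of tokens). */
    static List<Map<String, Double>> tfIdf(List<List<String>> profiles) {
        int n = profiles.size();
        Map<String, Integer> docFreq = new HashMap<>();
        for (List<String> profile : profiles) {
            for (String term : new HashSet<>(profile)) {
                docFreq.merge(term, 1, Integer::sum); // count each term once per profile
            }
        }
        List<Map<String, Double>> vectors = new ArrayList<>();
        for (List<String> profile : profiles) {
            Map<String, Double> vector = new HashMap<>();
            for (String term : profile) {
                vector.merge(term, 1.0, Double::sum); // raw term frequency
            }
            for (Map.Entry<String, Double> e : vector.entrySet()) {
                // assumed smoothed idf variant
                double idf = Math.log((1.0 + n) / (1.0 + docFreq.get(e.getKey()))) + 1.0;
                e.setValue(e.getValue() * idf);
            }
            vectors.add(vector);
        }
        return vectors;
    }

    /** Cosine of the angle between two sparse vectors; 0.0 if either vector is all zeros. */
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            normA += e.getValue() * e.getValue();
            Double other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
            }
        }
        for (double v : b.values()) {
            normB += v * v;
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

Profiles with the same tokens score close to 1, and profiles with no token in common score 0.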

Property Domain Range of Ontology 1

Property Domain Range of Ontology 2

Cosine Similarity Matrix

Structural similarity

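• This applies to classes only, as they carry hierarchical information:

$$\mathrm{StructSim}(e_{1i}, e_{2j}) = \frac{\sum_{k=1}^{N} \bigl(1 - \mathrm{diff}_k(e_{1i}, e_{2j})\bigr)}{N}$$

where:
$e_{1i}$, $e_{2j}$ are two class elements in the ontologies $O_1$ and $O_2$ respectively.
$N$ is the total number of structural features.
$\mathrm{diff}_k(e_{1i}, e_{2j})$ denotes the difference for feature $k$:

$$\mathrm{diff}_k(e_{1i}, e_{2j}) = \frac{|sf_k(e_{1i}) - sf_k(e_{2j})|}{\max(sf_k(e_{1i}), sf_k(e_{2j}))}$$

where $sf_k(e_{1i})$ and $sf_k(e_{2j})$ denote the values of structural feature $k$ for the two elements. For non-negative feature values, the absolute value keeps $\mathrm{diff}_k$ in $[0, 1]$ regardless of which element has the larger value.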
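A minimal Java sketch of the structural similarity for one pair of classes is given below. It takes the absolute difference per feature so that each term stays between 0 and 1, and it treats a feature that is 0 for both classes as no difference. Both choices, and the class and method names, are assumptions.

```java
final class StructuralSimilarity {

    /**
     * StructSim for one pair of classes. Each array holds the values of the same
     * structural features in the same order, for example depth from root,
     * number of children and number of instances.
     */
    static double structSim(double[] features1, double[] features2) {
        if (features1.length != features2.length || features1.length == 0) {
            throw new IllegalArgumentException("feature arrays must be non-empty and of equal length");
        }
        double sum = 0.0;
        for (int k = 0; k < features1.length; k++) {
            double max = Math.max(features1[k], features2[k]);
            // assumption: a feature that is 0 on both sides contributes no difference
            double diff = (max == 0.0) ? 0.0 : Math.abs(features1[k] - features2[k]) / max;
            sum += 1.0 - diff;
        }
        return sum / features1.length;
    }
}
```

As a worked case, features (2, 4, 0) against (4, 4, 0) give differences 0.5, 0 and 0, so the similarity is (0.5 + 1 + 1) / 3.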

Identical Ontologies Similarity Calculation

Structural Similarity Matrix

Harmony

• Harmony estimates the importance and reliability of the different similarities:

$$h = \frac{\#s_{\max}}{\min(\#e_1, \#e_2)}$$

where:
$\#s_{\max}$ is the number of pairs of elements whose similarity is the highest in both its row and its column of the similarity matrix.
$\#e_i$ is the number of elements of ontology $O_i$.
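The harmony of one similarity matrix could be computed as sketched below. The slides do not say how ties are handled, so this sketch counts a cell only when it is the single highest value in both its row and its column, which keeps the result at or below 1. That tie rule and the names used are assumptions.

```java
final class Harmony {

    /** Harmony h = #s_max / min(#e1, #e2) for one similarity matrix. */
    static double harmony(double[][] sim) {
        int rows = sim.length;
        int cols = sim[0].length;
        int count = 0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                if (isUniqueMaxInRow(sim, i, j) && isUniqueMaxInColumn(sim, i, j)) {
                    count++;
                }
            }
        }
        return (double) count / Math.min(rows, cols);
    }

    /** True if sim[i][j] is strictly greater than every other cell in row i. */
    private static boolean isUniqueMaxInRow(double[][] sim, int i, int j) {
        for (int c = 0; c < sim[i].length; c++) {
            if (c != j && sim[i][c] >= sim[i][j]) {
                return false;
            }
        }
        return true;
    }

    /** True if sim[i][j] is strictly greater than every other cell in column j. */
    private static boolean isUniqueMaxInColumn(double[][] sim, int i, int j) {
        for (int r = 0; r < sim.length; r++) {
            if (r != i && sim[r][j] >= sim[i][j]) {
                return false;
            }
        }
        return true;
    }
}
```

A matrix in which each element has one clear best partner yields a harmony of 1, while a matrix with many competing or tied scores yields a value closer to 0.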

Similarity matrices Harmony Estimation

Adaptive Similarity Aggregator

Input: Individual similarity matrices

Output : Aggregated similarity matrix

$$\mathrm{FinalSim}(e_{1i}, e_{2j}) = \frac{\sum_{k=1}^{n} h_k \cdot \mathrm{Sim}_k(e_{1i}, e_{2j})}{n}$$

where:
$h_k$ is the harmony of the $k$-th similarity matrix.
$n$ is the total number of similarity matrices.
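The aggregation step can be sketched in Java as a harmony-weighted sum of the individual matrices divided by their number. The class and method names are hypothetical, and the sketch assumes all matrices have the same dimensions.

```java
import java.util.List;

final class SimilarityAggregator {

    /** FinalSim = (sum over k of h_k * Sim_k) / n, computed cell by cell. */
    static double[][] aggregate(List<double[][]> sims, double[] harmony) {
        if (sims.isEmpty() || sims.size() != harmony.length) {
            throw new IllegalArgumentException("need one harmony value per similarity matrix");
        }
        int n = sims.size();
        int rows = sims.get(0).length;
        int cols = sims.get(0)[0].length;
        double[][] result = new double[rows][cols];
        for (int k = 0; k < n; k++) {
            double[][] sim = sims.get(k);
            for (int i = 0; i < rows; i++) {
                for (int j = 0; j < cols; j++) {
                    result[i][j] += harmony[k] * sim[i][j]; // weight each matrix by its harmony
                }
            }
        }
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                result[i][j] /= n;
            }
        }
        return result;
    }
}
```

For example, scores 0.8 and 0.4 with harmonies 1.0 and 0.5 combine to (0.8 + 0.2) / 2 = 0.5.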

Final Aggregated Similarity Matrix

IAC Neural Network with Constraint Satisfaction

[Figure: Architecture. Diagram labels: units H11, H12, …, H1n; SYNAPSIS 1; units H21, H22, …, H2n; SYNAPSIS 2; units H31, H32, …, H3n.]

Neural Networks Constraint Satisfaction Sample Output

Thank You