Project 2 Ontology alignment


SIGNAL-ONTOLOGY (SigO)

Immune Response
  i- Allergic Response
  i- Antigen Processing and Presentation
  i- B Cell Activation
  i- B Cell Development
  i- Complement Signaling (synonym: complement activation)
  i- Cytokine Response
  i- Immune Suppression
  i- Inflammation
  i- Intestinal Immunity
  i- Leukotriene Response
    i- Leukotriene Metabolism
  i- Natural Killer Cell Response
  i- T Cell Activation
  i- T Cell Development
  i- T Cell Selection in Thymus

GENE ONTOLOGY (GO)

immune response
i- acute-phase response
i- anaphylaxis
i- antigen presentation
i- antigen processing
i- cellular defense response
i- cytokine metabolism
i- cytokine biosynthesis (synonym: cytokine production)
…
p- regulation of cytokine biosynthesis
…
…
i- B-cell activation
i- B-cell differentiation
i- B-cell proliferation
i- cellular defense response
…
i- T-cell activation
i- activation of natural killer cell activity
…

Ontology Alignment

equivalent concepts

equivalent relations

is-a relation

Defining the relations between the terms in different ontologies

An Alignment Framework

Matcher Strategies

- Strategies based on linguistic matching
- Structure-based strategies
- Constraint-based approaches
- Instance-based strategies
- Use of auxiliary information

Aim

- Gain an understanding of the ontology alignment process
- Gain an understanding of the advantages and disadvantages of different strategies for ontology alignment: matching strategies and single threshold filtering
- Gain an understanding of how strategies are evaluated using precision, recall, and f-measure
- Learn about the Ontology Alignment Evaluation Initiative (OAEI)
- Learn to use the tools and data sets of the OAEI
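
As a rough illustration of the evaluation measures named above, the sketch below computes precision, recall, and f-measure for a set of found correspondences against a reference alignment. The function name `evaluate` and the example pairs are hypothetical and chosen only for illustration; they are not an actual OAEI reference alignment, and this is not the OAEI evaluation tool.

```python
def evaluate(found, reference):
    """Return (precision, recall, f_measure) for found vs. reference correspondences."""
    found, reference = set(found), set(reference)
    correct = len(found & reference)  # correspondences that are both found and expected
    precision = correct / len(found) if found else 0.0
    recall = correct / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # f-measure is the harmonic mean of precision and recall
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical example data (assumption, for illustration only).
reference = {
    ("Immune Response", "immune response"),
    ("B Cell Activation", "B-cell activation"),
    ("T Cell Activation", "T-cell activation"),
}
found = {
    ("Immune Response", "immune response"),
    ("B Cell Activation", "B-cell activation"),
    ("Inflammation", "anaphylaxis"),
}

precision, recall, f_measure = evaluate(found, reference)
print(f"p={precision:.2f} r={recall:.2f} f={f_measure:.2f}")
```

Two of the three found pairs are in the reference and two of the three reference pairs are found, so all three measures come out at two thirds here.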

Tasks

1. Tutorial - learn how to use the tools provided by the OAEI (matching, single threshold filtering, evaluation)

2. Run existing algorithms on the benchmark test and discuss results

3. Implement own matcher, evaluate and discuss results

4. (Study other test cases and discuss what kinds of matchers would be appropriate)

Task 2 – using task 1 data

Method        Precision  Recall  F-score  Threshold
SMOA          0.69       0.96    0.80     0.5
SMOA          0.73       0.96    0.83     0.6
SMOA          0.79       0.96    0.87     0.7
SMOA          0.80       0.94    0.87     0.8
SMOA          0.88       0.73    0.80     0.9
Levenshtein   0.80       0.94    0.87     0.5
Levenshtein   0.86       0.77    0.81     0.6
Levenshtein   0.92       0.46    0.61     0.7
Levenshtein   0.94       0.35    0.52     0.8
Levenshtein   0.99       0.35    0.52     0.9
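
The results above show the usual trade-off: a higher threshold raises precision and lowers recall. Single threshold filtering itself can be sketched as follows. The function name `filter_by_threshold`, the choice of `>=` as the comparison, and the similarity values are assumptions made for illustration; they are not taken from the OAEI tools or from the measurements in the slides.

```python
def filter_by_threshold(scored_pairs, threshold):
    """Keep only the candidate pairs whose similarity reaches the threshold."""
    return {pair for pair, score in scored_pairs.items() if score >= threshold}


# Hypothetical similarity values (assumption, for illustration only).
scored_pairs = {
    ("Immune Response", "immune response"): 1.00,
    ("B Cell Activation", "B-cell activation"): 0.94,
    ("T Cell Development", "T-cell activation"): 0.62,
    ("Inflammation", "anaphylaxis"): 0.25,
}

for threshold in (0.5, 0.7, 0.9):
    kept = filter_by_threshold(scored_pairs, threshold)
    print(threshold, len(kept))
```

Each surviving set can then be scored against a reference alignment to produce one row of precision, recall, and F-score per threshold.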

Task 3

Implementation of own matchers.

- Definition of similarity computation
- Testing using thresholds
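
To make "definition of similarity computation" concrete, here is a minimal sketch of one possible string matcher: a Levenshtein edit distance normalized to a similarity in [0, 1]. The normalization by the longer label and the lowercasing step are assumptions for this sketch; the Levenshtein matcher shipped with the OAEI tools may normalize differently.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # deletion
                cur[j - 1] + 1,            # insertion
                prev[j - 1] + (ca != cb),  # substitution (free if characters match)
            ))
        prev = cur
    return prev[-1]


def edit_similarity(label1, label2):
    """Similarity in [0, 1]: 1 minus the distance divided by the longer length (assumed normalization)."""
    a, b = label1.lower(), label2.lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


print(edit_similarity("B Cell Activation", "B-cell activation"))
```

After lowercasing, the two labels differ in a single character out of 17, so the pair scores close to 1 and survives any reasonable threshold.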

Task 3

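[Equation not recoverable from the extraction. It defines sim(C1, C2) using editdist(C1, C2), ncommon(C1, C2), and Mix(n(C1), n(C2)), together with the constants 1 and 2.]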

Task 3

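\[
\mathrm{BOWOverlap}(c_1, c_2) = \frac{\mathrm{match}(c_1, c_2)}{\max(\mathrm{count}(c_1), \mathrm{count}(c_2))}
\]

\[
\mathrm{WordOverlap}(c_1, c_2) = \frac{\mathrm{forwardMatch}(c_1, c_2) + \mathrm{backwardMatch}(c_1, c_2)}{2 \cdot \max(\mathrm{count}(c_1), \mathrm{count}(c_2))}
\]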
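
A minimal sketch of BOWOverlap, read as the number of matching words divided by the larger word count. The slides do not define how labels are split into words or how repeated words are counted, so the tokenizer (lowercased alphanumeric runs) and the multiset intersection used for `match` are assumptions. WordOverlap is not sketched because forwardMatch and backwardMatch are not defined in the slides.

```python
import re
from collections import Counter


def words(label):
    """Split a label into lowercase alphanumeric words (assumed tokenization)."""
    return re.findall(r"[a-z0-9]+", label.lower())


def bow_overlap(c1, c2):
    """Bag-of-words overlap: shared words / larger word count."""
    bag1, bag2 = Counter(words(c1)), Counter(words(c2))
    if not bag1 or not bag2:
        return 0.0
    match = sum((bag1 & bag2).values())  # multiset intersection (assumption)
    return match / max(sum(bag1.values()), sum(bag2.values()))


print(bow_overlap("B Cell Activation", "B-cell activation"))
print(bow_overlap("Antigen Processing and Presentation", "antigen presentation"))
```

The first pair shares all three words and scores 1.0; the second shares two words out of a maximum of four and scores 0.5, which shows how word order and punctuation are ignored by this measure.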

Task 3

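\[
\mathrm{BOTOverlap}(c_1, c_2) = \frac{\mathrm{match1}(c_1, c_2)}{\max(\mathrm{count1}(c_1), \mathrm{count1}(c_2))}
\]

\[
\mathrm{TrigramOverlap}(c_1, c_2) = \frac{\mathrm{forwardMatch1}(c_1, c_2) + \mathrm{backwardMatch1}(c_1, c_2)}{2 \cdot \max(\mathrm{count1}(c_1), \mathrm{count1}(c_2))}
\]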
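
A minimal sketch of BOTOverlap, read as the bag-of-trigrams counterpart of the word-based measure: the number of shared trigrams divided by the larger trigram count. The slides do not say how trigrams are extracted, so character trigrams over the lowercased label and the multiset intersection are assumptions made here.

```python
from collections import Counter


def trigrams(label):
    """All overlapping character trigrams of the lowercased label (assumed definition)."""
    s = label.lower()
    return [s[i:i + 3] for i in range(len(s) - 2)]


def bot_overlap(c1, c2):
    """Bag-of-trigrams overlap: shared trigrams / larger trigram count."""
    bag1, bag2 = Counter(trigrams(c1)), Counter(trigrams(c2))
    if not bag1 or not bag2:
        return 0.0  # labels shorter than three characters have no trigrams
    match = sum((bag1 & bag2).values())  # multiset intersection (assumption)
    return match / max(sum(bag1.values()), sum(bag2.values()))


print(bot_overlap("cytokine", "cytokines"))
```

Because trigrams tolerate small spelling differences, "cytokine" and "cytokines" share six of at most seven trigrams, whereas a word-based measure would treat them as entirely different words.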

Task 3 – performance on task1 data

Method          Precision  Recall  F-score  Threshold
equal           1.00       0.23    0.37     1
SMOA            0.69       0.96    0.80     0.5
Levenshtein     0.68       0.98    0.80     0.33
Levenshtein     0.53       1.00    0.69     0
BOWOverlap      0.87       0.85    0.86     0.3
WordOverlap     1.00       0.60    0.75     0.3
BOTOverlap      0.76       0.85    0.80     0.3
TrigramOverlap  0.80       0.92    0.85     0.1
FinalScore      0.87       0.94    0.90     0.2
NEW             0.48       0.92    0.63     0
NEW             0.74       0.88    0.80     0.88

Task 2 - benchmark

Benchmark

1xx (4): same ontology, no overlap, language generalization, language restriction

2xx (ca. 40): base ontology compared with a modified base ontology (e.g. changed names, removed relations, spelling mistakes, use of synonyms, change of natural language)

3xx (4): real cases

Task 2 – benchmark data

Best system OAEI 2007: p: 0.95; r: 0.9

Range:
- p: 0.98 (with r: 0.64) - 0.76 (with r: 0.7)
- r: 0.90 (with p: 0.95) - 0.21 (with p: 0.92)

According to category:
- 1xx: several systems p: 1, r: 1
- 2xx: p: 0.95, r: 0.9; p: 0.97, r: 0.89
- 3xx: p: 0.94, r: 0.68

Task 2 - benchmark

Approximate string matching

(WordNet)

Multilingual WordNet

Structure

Instances

Task 2/3

[Chart: values on a scale from 0 to 0.8 per benchmark test case (103, 104, 201, 201-2, 201-4, 201-6, 201-8, 202, 202-2, 202-4, 202-6, 202-8, 203, 204, 205, 206, 207, 208, 209, 210) for the matchers equal, proposed, and SMOA.]