Project 2: Ontology alignment
SIGNAL-ONTOLOGY (SigO)
Immune Response
  i- Allergic Response
  i- Antigen Processing and Presentation
  i- B Cell Activation
  i- B Cell Development
  i- Complement Signaling (synonym: complement activation)
  i- Cytokine Response
  i- Immune Suppression
  i- Inflammation
  i- Intestinal Immunity
  i- Leukotriene Response
  i- Leukotriene Metabolism
  i- Natural Killer Cell Response
  i- T Cell Activation
  i- T Cell Development
  i- T Cell Selection in Thymus
GENE ONTOLOGY (GO)
immune response
  i- acute-phase response
  i- anaphylaxis
  i- antigen presentation
  i- antigen processing
  i- cellular defense response
  i- cytokine metabolism
  i- cytokine biosynthesis (synonym: cytokine production)
  …
  p- regulation of cytokine biosynthesis
  …
  …
  i- B-cell activation
  i- B-cell differentiation
  i- B-cell proliferation
  i- cellular defense response
  …
  i- T-cell activation
  i- activation of natural killer cell activity
  …
Ontology Alignment
equivalent concepts
equivalent relations
is-a relation
Defining the relations between the terms in different ontologies
An Alignment Framework
Matcher Strategies
- Strategies based on linguistic matching
- Structure-based strategies
- Constraint-based approaches
- Instance-based strategies
- Use of auxiliary information
Aim
Gain understanding about the ontology alignment process
Gain understanding about the advantages and disadvantages of different strategies for ontology alignment: matching strategies and single threshold filtering
Gain understanding about evaluation of strategies using precision, recall, f-measure
Learn about the Ontology Alignment Evaluation Initiative (OAEI)
Learn to use the tools and data sets of the OAEI
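As a minimal sketch of the evaluation measures named above: precision is the fraction of found correspondences that are correct, recall is the fraction of reference correspondences that were found, and the f-measure is their harmonic mean. The function name `evaluate` and the tiny reference alignment are assumptions made up for illustration; they are not OAEI data or OAEI tooling.

```python
def evaluate(found, reference):
    """Return (precision, recall, f_measure) for two sets of correspondences."""
    found, reference = set(found), set(reference)
    correct = len(found & reference)
    precision = correct / len(found) if found else 0.0
    recall = correct / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical reference alignment (SigO term, GO term), for illustration only.
reference = {
    ("Immune Response", "immune response"),
    ("B Cell Activation", "B-cell activation"),
    ("T Cell Activation", "T-cell activation"),
}
# Hypothetical matcher output: two correct pairs and one wrong pair.
found = {
    ("Immune Response", "immune response"),
    ("B Cell Activation", "B-cell activation"),
    ("Inflammation", "anaphylaxis"),
}

precision, recall, f_measure = evaluate(found, reference)
print(f"p={precision:.2f} r={recall:.2f} f={f_measure:.2f}")
```

Two of the three found pairs are correct and two of the three reference pairs are found, so all three measures coincide in this toy case.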
Tasks
1. Tutorial - learn how to use the tools provided by the OAEI (matching, single threshold filtering, evaluation)
2. Run existing algorithms on the benchmark test and discuss results
3. Implement own matcher, evaluate and discuss results
4. (Study other test cases and discuss what kind of matchers would be appropriate)
Task 2 – using task 1 data
Method       Precision  Recall  F-score  Threshold
SMOA         0.69       0.96    0.80     0.5
SMOA         0.73       0.96    0.83     0.6
SMOA         0.79       0.96    0.87     0.7
SMOA         0.80       0.94    0.87     0.8
SMOA         0.88       0.73    0.80     0.9
Levenshtein  0.80       0.94    0.87     0.5
Levenshtein  0.86       0.77    0.81     0.6
Levenshtein  0.92       0.46    0.61     0.7
Levenshtein  0.94       0.35    0.52     0.8
Levenshtein  0.99       0.35    0.52     0.9
Task 3
Implementation of own matchers.
- Definition of similarity computation
- Testing using thresholds
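The two steps above can be sketched as follows: a similarity function over term labels, then single threshold filtering that keeps only pairs scoring at or above the threshold. This is an illustration under stated assumptions, not the OAEI tooling: the edit distance is normalized by the longer label's length (other normalizations exist), labels are lowercased first, and the names `similarity` and `align` are hypothetical.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # substitution
        prev = cur
    return prev[-1]


def similarity(a, b):
    """Assumed normalization: 1 - distance / length of the longer label."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def align(terms1, terms2, threshold):
    """Single threshold filtering: keep pairs with similarity >= threshold."""
    result = []
    for t1 in terms1:
        for t2 in terms2:
            score = similarity(t1, t2)
            if score >= threshold:
                result.append((t1, t2, score))
    return result


sigo = ["B Cell Activation", "Inflammation"]
go = ["B-cell activation", "anaphylaxis"]
for t1, t2, score in align(sigo, go, threshold=0.9):
    print(t1, "<->", t2, round(score, 3))
```

Raising the threshold tends to trade recall for precision, which is the pattern the Task 2 table shows for both SMOA and Levenshtein.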
Task 3
sim(C1, C2), built from edit dist(C1, C2), ncommon(C1, C2) and Mix(n(C1), n(C2))
Task 3
\[
\mathrm{BOWOverlap}(c_1, c_2) = \frac{\mathrm{match}(c_1, c_2)}{\max(\mathrm{count}(c_1), \mathrm{count}(c_2))}
\]
\[
\mathrm{WordOverlap}(c_1, c_2) = \frac{\mathrm{forwardMatch}(c_1, c_2) + \mathrm{backwardMatch}(c_1, c_2)}{2 \cdot \max(\mathrm{count}(c_1), \mathrm{count}(c_2))}
\]
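A minimal sketch of the two word-level measures, under assumptions the slides do not spell out: `count` is taken as the number of words in a label, `match` as the size of the bag-of-words intersection, `forwardMatch` as the number of words that agree position by position from the start, and `backwardMatch` as the same count taken from the end. The tokenizer and all function names are hypothetical.

```python
import re
from collections import Counter


def words(label):
    """Lowercase a label and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", label.lower())


def bow_overlap(c1, c2):
    """Shared words (as a bag) divided by the longer label's word count."""
    w1, w2 = words(c1), words(c2)
    if not w1 or not w2:
        return 0.0
    match = sum((Counter(w1) & Counter(w2)).values())
    return match / max(len(w1), len(w2))


def positional_matches(seq1, seq2):
    """Number of positions at which the two sequences agree."""
    return sum(1 for a, b in zip(seq1, seq2) if a == b)


def word_overlap(c1, c2):
    """Assumed reading: forward and backward positional matches, averaged."""
    w1, w2 = words(c1), words(c2)
    if not w1 or not w2:
        return 0.0
    forward = positional_matches(w1, w2)
    backward = positional_matches(w1[::-1], w2[::-1])
    return (forward + backward) / (2 * max(len(w1), len(w2)))


a, b = "Antigen Processing and Presentation", "antigen presentation"
print(bow_overlap(a, b), word_overlap(a, b))
```

Under these assumptions the order-sensitive measure is stricter than the bag-based one: the pair in the example shares two of four words, but only one word lines up from each end.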
Task 3
\[
\mathrm{BOTOverlap}(c_1, c_2) = \frac{\mathrm{match}_1(c_1, c_2)}{\max(\mathrm{count}_1(c_1), \mathrm{count}_1(c_2))}
\]
\[
\mathrm{TrigramOverlap}(c_1, c_2) = \frac{\mathrm{forwardMatch}_1(c_1, c_2) + \mathrm{backwardMatch}_1(c_1, c_2)}{2 \cdot \max(\mathrm{count}_1(c_1), \mathrm{count}_1(c_2))}
\]
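The trigram measures appear to mirror the word-level ones with character trigrams in place of words. The sketch below rests on that reading and on further assumptions: labels are lowercased with punctuation collapsed to single spaces, `count` is the number of trigrams, `match` is the bag intersection, and the forward and backward matches compare trigrams position by position from the start and from the end. All function names are hypothetical.

```python
import re
from collections import Counter


def trigrams(label):
    """Character trigrams of a normalized label (empty if shorter than 3)."""
    text = re.sub(r"[^a-z0-9]+", " ", label.lower()).strip()
    return [text[i:i + 3] for i in range(len(text) - 2)]


def bot_overlap(c1, c2):
    """Shared trigrams (as a bag) divided by the larger trigram count."""
    t1, t2 = trigrams(c1), trigrams(c2)
    if not t1 or not t2:
        return 0.0
    match = sum((Counter(t1) & Counter(t2)).values())
    return match / max(len(t1), len(t2))


def positional_matches(seq1, seq2):
    """Number of positions at which the two sequences agree."""
    return sum(1 for a, b in zip(seq1, seq2) if a == b)


def trigram_overlap(c1, c2):
    """Assumed reading: forward and backward positional trigram matches."""
    t1, t2 = trigrams(c1), trigrams(c2)
    if not t1 or not t2:
        return 0.0
    forward = positional_matches(t1, t2)
    backward = positional_matches(t1[::-1], t2[::-1])
    return (forward + backward) / (2 * max(len(t1), len(t2)))


print(bot_overlap("abcd", "xabcd"), trigram_overlap("abcd", "xabcd"))
```

Because trigrams survive small spelling changes, measures of this kind can still score labels that share no complete word.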
Task 3 – performance on task1 data
Method          Precision  Recall  F-score  Threshold
equal           1.00       0.23    0.37     1
SMOA            0.69       0.96    0.80     0.5
Levenshtein     0.68       0.98    0.80     0.33
Levenshtein     0.53       1.00    0.69     0
BOWOverlap      0.87       0.85    0.86     0.3
WordOverlap     1.00       0.60    0.75     0.3
BOTOverlap      0.76       0.85    0.80     0.3
TrigramOverlap  0.80       0.92    0.85     0.1
FinalScore      0.87       0.94    0.90     0.2
NEW             0.48       0.92    0.63     0
NEW             0.74       0.88    0.80     0.88
Task 2 - benchmark
Benchmark
1xx (4): same ontology, no overlap, language generalization, language restriction
2xx (ca. 40): base ontology with modified base ontology (e.g. changed names, removed relations, spelling mistakes, use of synonyms, change of natural language)
3xx (4): real cases
Task 2 – benchmark data
Best system OAEI 2007: p: 0.95; r: 0.9
Range:
  p: 0.98 (with r: 0.64) - 0.76 (with r: 0.7)
  r: 0.90 (with p: 0.95) - 0.21 (with p: 0.92)
According to category:
  1xx: several systems p: 1, r: 1
  2xx: p: 0.95, r: 0.9; p: 0.97, r: 0.89
  3xx: p: 0.94, r: 0.68
Task 2 - benchmark
Approximate string matching
(WordNet)
Multilingual WordNet
Structure
Instances
Task 2/3
[Chart: scores (axis 0 to 0.8) per benchmark test 103, 104, 201, 201-2, 201-4, 201-6, 201-8, 202, 202-2, 202-4, 202-6, 202-8 and 203 to 210, for the matchers equal, proposed and SMOA]