Applications
Chapter 9, Cimiano Ontology Learning Textbook
Presented by Aaron Stewart
Typical Applications of Ontologies
• Agent communication
• Data integration
• Description of service capabilities for matching and composition purposes
• Formal verification of process descriptions
• Unification of terminology across communities
Text Applications of Ontologies
• Information Retrieval (IR)
• Clustering and Classification of Documents
• Semantic Annotation
• Natural Language Processing
Task-Based Evaluation (Porzel and Malaka 2005)
Task-Based Evaluation Requirements
1. Algorithm output can be quantified
2. Task can use background knowledge
3. Ontology is an additional parameter
4. Output can be traced to the ontology
Contents
1. Text Clustering and Classification
2. Information Highlighting for Supporting Search
3. Related Work
Text Clustering and Classification
• What is the difference?
Text Clustering
Text Classification
(Figure labels: Arrows, Weather, Flat shapes, 3-D forms, Smile!)
Dot Kom Project
• One of many competitions
Approaches
• Bag of words
• Manually engineered MeSH Tree Structures
• Automatically constructed ontologies
What is a “Bag of Words” anyway?
the quick brown fox
Bag of Words
the quick brown fox jumps over the lazy dog
(“the” appears twice, so its count is 2)
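The counting idea can be sketched in a few lines of Python. This is a minimal illustration, not the preprocessing used in the experiments: the function name `bag_of_words` is hypothetical, and splitting on whitespace after lowercasing is an assumed tokenization.

```python
from collections import Counter

def bag_of_words(text: str) -> Counter:
    """Count how often each lowercase, whitespace-separated token occurs."""
    return Counter(text.lower().split())

# Word order is discarded; only the per-word counts remain.
bag = bag_of_words("the quick brown fox jumps over the lazy dog")
the_count = bag["the"]  # the only word that occurs more than once
```

Two documents with the same words in a different order get the same bag, which is the property the name refers to.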
Building Hierarchies
Note on Ontologies
• Our ontologies (“micro”)
  – Like a database record schema
• Their ontologies (“macro”)
  – Like WordNet
Clustering
• Hierarchical Agglomerative Clustering
• Bi-Section K-means
• “A Comparison of Document Clustering Techniques”
  – www.cs.sfu.ca/~wangk/894report/chen1.pdf
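As a rough illustration of the first technique, here is a small agglomerative clustering sketch in Python. It assumes average-link merging over cosine similarity and dense list vectors; the slides do not say which linkage or representation was used, and the names `cosine` and `agglomerative` are made up for this example.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def agglomerative(vectors, k):
    """Start with one cluster per vector; merge the most similar pair until k remain."""
    clusters = [[i] for i in range(len(vectors))]

    def link(a, b):
        # Average-link: mean pairwise similarity between the two clusters (assumed choice).
        return sum(cosine(vectors[i], vectors[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > max(k, 1):
        pairs = ((x, y) for x in range(len(clusters)) for y in range(x + 1, len(clusters)))
        x, y = max(pairs, key=lambda p: link(clusters[p[0]], clusters[p[1]]))
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters

# Toy vectors: two documents near each axis.
toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_clusters = agglomerative(toy, 2)
```

The result is a list of clusters, each a list of indices into the input. Bi-Section K-means works in the opposite direction, repeatedly splitting one cluster in two.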
Document Representations
• Bag of Words
• Certain words + ontology -> extended features
• Strategies: add, replace, only
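The three strategies can be sketched as follows. This assumes the usual reading of the names: "add" appends concept features to the terms, "replace" swaps mapped terms for their concepts while keeping unmapped terms, and "only" keeps concept features alone. The mapping `TERM_TO_CONCEPT` is a hypothetical toy stand-in for an ontology lookup, and `extend_features` is an illustrative name.

```python
from collections import Counter

# Hypothetical toy term-to-concept mapping standing in for an ontology lookup.
TERM_TO_CONCEPT = {"fox": "canine", "dog": "canine"}

def extend_features(bag: Counter, strategy: str) -> Counter:
    """Combine term counts with concept counts under one of three strategies."""
    concepts = Counter()
    for term, count in bag.items():
        if term in TERM_TO_CONCEPT:
            # Prefix concept features so they never collide with plain terms.
            concepts["concept:" + TERM_TO_CONCEPT[term]] += count
    if strategy == "add":       # keep every term, append concept features
        return bag + concepts
    if strategy == "replace":   # mapped terms give way to their concepts
        unmapped = Counter({t: c for t, c in bag.items() if t not in TERM_TO_CONCEPT})
        return unmapped + concepts
    if strategy == "only":      # concept features alone
        return concepts
    raise ValueError(f"unknown strategy: {strategy}")

doc = Counter({"quick": 1, "fox": 1, "dog": 2})
added = extend_features(doc, "add")
replaced = extend_features(doc, "replace")
concepts_only = extend_features(doc, "only")
```

With "replace" and "only", "fox" and "dog" collapse into a single feature, so documents that use different words for the same concept look more alike.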
Vectors and Cosine Similarity
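A minimal sketch of cosine similarity between two documents, each held as a sparse vector of word counts. Using `Counter` as the vector type and returning 0.0 for an empty document are assumptions made for this example.

```python
import math
from collections import Counter

def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    # Only terms present in both documents contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumed convention for empty documents
    return dot / (norm_a * norm_b)

same = cosine_similarity(Counter({"fox": 2, "dog": 1}), Counter({"fox": 2, "dog": 1}))
disjoint = cosine_similarity(Counter({"fox": 1}), Counter({"dog": 1}))
```

With non-negative counts the value ranges from 0 (no shared words) to 1 (same direction), and it does not depend on document length.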
Classification Results (Categories)
Classification Results (Documents)
Cluster Metrics
P : computer-generated clusters
L : human-created clusters
P, L : sets of clusters (partitioning)
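The metric formulas themselves did not survive in this transcript. As one illustration of comparing a computed partitioning with a human one, the sketch below computes purity, a widely used measure. Whether the slides used exactly this measure is an assumption, and the function name `purity` and the set-of-document-ids representation are chosen only for this example.

```python
def purity(computed, gold):
    """Fraction of documents that land in their cluster's best-matching gold cluster."""
    total = sum(len(p) for p in computed)
    if total == 0:
        return 0.0
    # For each computed cluster, take its largest overlap with any gold cluster.
    matched = sum(max((len(p & l) for l in gold), default=0) for p in computed)
    return matched / total

computed_clusters = [{1, 2, 3}, {4, 5}]   # computer-generated clusters
gold_clusters = [{1, 2}, {3, 4, 5}]       # human-created clusters
score = purity(computed_clusters, gold_clusters)
```

Purity alone rewards many tiny clusters, so it is normally reported alongside a complementary measure.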
Clustering Results
Clustering Results
Information Highlighting for Supporting Search
• Challenge:
  – 10 minute limit
  – KMi Planet News web site
  – Compile a list of important
    • People
    • Technologies
Information Highlighting for Supporting Search
• Tools:
  – Regular browser
  – Magpie
  – ESpotter
  – C-PANKOW
Teams
• A : web browser only
• B : web browser with AKT information
• C : web browser with AKT++ information
AKT++ Lexicon
Scores
Conclusions (for this section)
• Generated ontologies can be comparable to hand-crafted ontologies
• Humans can trust the computer too much! (Group C drop in score)
Related Work
• Query Expansion
• Information Retrieval
• Text Clustering and Classification
• Natural Language Processing
Natural Language Processing
• Ambiguity resolution
  – Bank
• Compounds
  – Headache medicine
• Vague words
  – With, of, has
  – Selectional restrictions
• Anaphora
More Applications
• Word sense disambiguation
• Classification of unknown words
• Named Entity Recognition (NER)
• Anaphora Resolution
• Question Answering
  – Who wrote the Hobbit?
  – Tolkien is the author of the Hobbit.
• Information Extraction
  – AUTOSLOG, ASIUM
Analysis/Conclusion
• Pro/con:
  – Focused on two systems
  – Passing survey of others