Related terms search based on WordNet / Wiktionary and its application in ontology matching. RCDL'2009. St. Petersburg Institute for Informatics and Automation of RAS. Jönköping University, Sweden. Feiyu Lin, A. Krizhanovsky (andrew.krizhanovsky at gmail.com).
Related terms search based on WordNet / Wiktionary
and its application in ontology matching
RCDL'2009
St. Petersburg Institute
for Informatics and Automation of RAS
Feiyu Lin, A. Krizhanovsky (andrew.krizhanovsky at gmail.com)
Jönköping University, Sweden
Contents
Wiki and Wiktionary intro
MRD, parser and Wiktionaries comparison
Correlation of relatedness measures: experiment scheme, results and comparison
Results, applications and future
Goal
Is it possible to find related terms using the current version of Wiktionary as successfully as with WordNet?
For ontology matching, for application in text search systems, etc.
What are the advantages?
Wiki-resources
Distributed users and authors (edit pages)
Centralized storage (e.g. MySQL, Apache, PHP)
Set of hyperlinked articles
Each article has one or more categories (tree)
* Example: http://en.wikipedia.org
Wiktionary data: +, -, simplicity & complexity
− Different Wiktionaries have different levels of standardization
− Fast-growing data, but it is created by a huge community (so a parser must be very stable)
+ Rich data
+ Thesaurus (synonyms, antonyms)
+ Phrase books
+ Etymologies
+ Pronunciations
+ Sample quotations
+ Translations
+ Fast-growing data
+ Interwiki (additional data)
+ GNU FDL
Correlation of relatedness measures
Correlation between human judgments of relatedness (353-TC) and relatedness measures based on WordNet, English Wikipedia, and Russian Wiktionary
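The slide does not say which correlation coefficient was used. As an illustration only, the sketch below assumes Spearman rank correlation, a common choice when comparing a relatedness measure against human ratings. The class name, method names, and sample scores are hypothetical and are not taken from the project.

```java
import java.util.Arrays;
import java.util.Comparator;

public class RelatednessCorrelation {

    /** Assigns 1-based ranks; tied values share the average of their ranks. */
    public static double[] ranks(double[] values) {
        int n = values.length;
        Integer[] order = new Integer[n];
        for (int k = 0; k < n; k++) {
            order[k] = k;
        }
        // Sort the indices by the value they point to.
        Arrays.sort(order, Comparator.comparingDouble(idx -> values[idx]));

        double[] ranks = new double[n];
        int i = 0;
        while (i < n) {
            int j = i;
            // Extend j over the run of values tied with the one at position i.
            while (j + 1 < n && values[order[j + 1]] == values[order[i]]) {
                j++;
            }
            double averageRank = (i + j) / 2.0 + 1.0;
            for (int k = i; k <= j; k++) {
                ranks[order[k]] = averageRank;
            }
            i = j + 1;
        }
        return ranks;
    }

    /** Pearson correlation of two equally long samples. */
    public static double pearson(double[] x, double[] y) {
        double meanX = Arrays.stream(x).average().orElse(0.0);
        double meanY = Arrays.stream(y).average().orElse(0.0);
        double cov = 0.0, varX = 0.0, varY = 0.0;
        for (int i = 0; i < x.length; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }
        return cov / Math.sqrt(varX * varY);
    }

    /** Spearman correlation: Pearson correlation computed on the ranks. */
    public static double spearman(double[] human, double[] measure) {
        if (human.length != measure.length) {
            throw new IllegalArgumentException("Samples must have the same length");
        }
        return pearson(ranks(human), ranks(measure));
    }

    public static void main(String[] args) {
        // Hypothetical scores for four word pairs (not real 353-TC data).
        double[] human = {7.35, 10.0, 0.92, 5.77};
        double[] measure = {0.60, 0.95, 0.05, 0.40};
        System.out.println(spearman(human, measure));
    }
}
```

A coefficient close to 1 would mean the measure orders the word pairs much as the human raters do; a value near 0 would mean no monotonic agreement.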
Application of Machine-readable dictionary (MRD)
Thesaurus data:
Related Terms Search
Search request extension (by synonyms) / request reformulation (in search systems)
Request recognition in question-answering systems
Word sense disambiguation
Media data (audio + pictures)
Language learning
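Search request extension by synonyms, listed above among the thesaurus applications, can be sketched as follows. This is a minimal illustration, assuming the thesaurus is available as an in-memory map from a word to its synonyms; the class name, method name, and sample data are hypothetical and do not reflect the project's database API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class QueryExpansion {

    /**
     * Adds the synonyms of every query term right after the term itself.
     * Original term order is kept and duplicates are dropped.
     */
    public static List<String> expand(List<String> queryTerms,
                                      Map<String, List<String>> synonyms) {
        Set<String> expanded = new LinkedHashSet<>();
        for (String term : queryTerms) {
            expanded.add(term);
            expanded.addAll(synonyms.getOrDefault(term, Collections.emptyList()));
        }
        return new ArrayList<>(expanded);
    }

    public static void main(String[] args) {
        // Hypothetical thesaurus data standing in for a machine-readable dictionary.
        Map<String, List<String>> synonyms = new HashMap<>();
        synonyms.put("car", Arrays.asList("automobile", "auto"));

        System.out.println(expand(Arrays.asList("car", "rent"), synonyms));
    }
}
```

A real search system would likely also weight the added terms lower than the original ones, since synonyms taken from a dictionary are not always interchangeable in context.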
Work plan: done and todo
Russian Wiktionary
• Extraction (by RE)
  – Definition
  – Relations (synonyms…)
  – Translation
  – Audio
  – Graphics
• Database API
• Visualization (MRD browser)
• Quiz & tests (test application)
Russian Wiktionary
• Database scheme
  – Definition
  – Relations (synonyms…)
  – Translation
  – Audio
  – Graphics
• Database API
English Wiktionary
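To illustrate what extraction by regular expressions might look like, the sketch below pulls the targets of wiki links ([[target]] or [[target|label]], the standard MediaWiki link syntax) out of one line of wiki text, such as a line listing synonyms. The class name, method name, and sample line are hypothetical; the actual parser and the exact layout of Wiktionary entries are not shown in the slides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WikiLinkExtractor {

    // Matches [[target]] and [[target|label]]; group 1 captures the target title.
    private static final Pattern WIKI_LINK =
            Pattern.compile("\\[\\[([^\\]|]+)(?:\\|[^\\]]*)?\\]\\]");

    /** Returns the link targets found in the given wiki text, in order of appearance. */
    public static List<String> extractLinkTargets(String wikiText) {
        List<String> targets = new ArrayList<>();
        Matcher matcher = WIKI_LINK.matcher(wikiText);
        while (matcher.find()) {
            targets.add(matcher.group(1).trim());
        }
        return targets;
    }

    public static void main(String[] args) {
        // Hypothetical line from a synonyms section of an entry.
        String line = "# [[automobile]], [[motor car|car]]";
        System.out.println(extractLinkTargets(line));
    }
}
```

Because the data is edited by a large community, a production parser would need to tolerate many deviations from this pattern (templates, nested markup, missing brackets), which is why the slides stress parser stability.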
Implementation
Software based on Synarcher code
Java
MySQL or SQLite database
JUnit test framework
Results
The scheme of the experiment for calculating the semantic relatedness measure based on Russian Wiktionary data
The parser of Russian Wiktionary
Database scheme designed
Database API implemented in Java
Compared the results of related terms search based on Wiktionary and WordNet
Project site (Wiki tool kit)
http://code.google.com/p/wikokit/
Future work
Finish creating the MRD
Database and software
Russian Wiktionary and English Wiktionary
Visualization (JavaFX)
MRD browser
Quiz & tests (learning application)
Online application (Java Web-start)