Unsupervised Word Sense Disambiguation
REU, Summer, 2009
Word Sense Disambiguation
E.g., “The soldiers drove the tank.”
Possible senses of “tank”:
• armored combat vehicle
• large vessel for holding gases or liquids
Context Knowledge Base
“Many companies hire computer programmers”
[Dependency tree: hire → company, programmer; company → many; programmer → computer]
+
“Computer programmers write software”
[Dependency tree: write → programmer, software; programmer → computer]
Result of merging dependency trees:
[Merged tree: hire → company (1), programmer (1); company → many (1); write → programmer (1), software (1); programmer → computer (2)]
Weights are the number of dependency relation instances found.
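The merging step can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the edge lists are a hand-written reading of the two example sentences (the slides obtain them from Minipar), and the name `merge_trees` is hypothetical.

```python
from collections import Counter

def merge_trees(trees):
    """Merge dependency trees into one weighted edge table.

    Each tree is a list of (head, dependent) pairs. The weight of an
    edge is the number of times that relation was seen across all trees.
    """
    knowledge_base = Counter()
    for edges in trees:
        knowledge_base.update(edges)
    return knowledge_base

# Hand-written edges for the two example sentences (assumed parse).
tree_a = [("hire", "company"), ("hire", "programmer"),
          ("company", "many"), ("programmer", "computer")]
tree_b = [("write", "programmer"), ("write", "software"),
          ("programmer", "computer")]

kb = merge_trees([tree_a, tree_b])
for (head, dependent), weight in sorted(kb.items()):
    print(f"{head} -> {dependent}: {weight}")
```

The relation programmer → computer occurs in both sentences, so it is the only edge with weight 2.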
WSD Algorithm
Parse the original sentence using Minipar to get a weighted dependency tree.
“A large software company hires computer programmers.”
[Dependency tree: hire → company, programmer; company → large, software; programmer → computer]
To-be-disambiguated word: company
Weights are reciprocals of the distance from the to-be-disambiguated word: hire 1, large 1, software 1, programmer 0.5, computer 0.33.
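One way to obtain the weights 1, 0.5 and 0.33 is to take the reciprocal of each word's tree distance from the to-be-disambiguated word. The sketch below assumes that reading of the slide; the edge list is hand-written rather than produced by Minipar, and `reciprocal_distance_weights` is a hypothetical helper name.

```python
from collections import deque

def reciprocal_distance_weights(edges, target):
    """Weight every other word by 1 / (number of tree edges to target)."""
    # Treat the dependency tree as an undirected graph.
    neighbors = {}
    for head, dependent in edges:
        neighbors.setdefault(head, set()).add(dependent)
        neighbors.setdefault(dependent, set()).add(head)

    # Breadth-first search gives the edge distance from the target word.
    distance = {target: 0}
    queue = deque([target])
    while queue:
        word = queue.popleft()
        for other in neighbors.get(word, ()):
            if other not in distance:
                distance[other] = distance[word] + 1
                queue.append(other)

    return {word: 1.0 / d for word, d in distance.items() if d > 0}

# "A large software company hires computer programmers." (assumed parse)
sentence_edges = [("hire", "company"), ("hire", "programmer"),
                  ("company", "large"), ("company", "software"),
                  ("programmer", "computer")]

weights = reciprocal_distance_weights(sentence_edges, "company")
for word in sorted(weights):
    print(word, round(weights[word], 2))
```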
WSD Algorithm
Parse each gloss of the to-be-disambiguated word to get weighted dependency trees.
Gloss 1: an institution created to conduct business
[Dependency tree over the words: institution, create, conduct, business]
Gloss 2: a small military unit
[Dependency tree: unit → small, military]
WSD Algorithm
For each word in a gloss tree, find that word’s dependent words in the context knowledge base. We are looking for words in the knowledge base that match words in the original sentence; these matches are the context clues used to disambiguate the word.
A score is generated from the weights of those dependency relations in the knowledge base and from the dependent words of the to-be-disambiguated word in the original sentence. The more matches we find, the higher the score.
The gloss with the highest score is selected as the correct sense of the word.
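The slides do not give the exact scoring formula, so the following is only a sketch under an assumed rule: each match between a gloss word's dependents in the knowledge base and a word of the original sentence adds (sentence weight × knowledge-base weight) to the gloss score. The knowledge-base counts in `toy_kb` are made up for illustration, and `score_gloss` and `pick_sense` are hypothetical names.

```python
def score_gloss(gloss_words, sentence_weights, kb):
    """Sum sentence weight * knowledge-base weight over all matches (assumed rule)."""
    score = 0.0
    for gloss_word in gloss_words:
        for sentence_word, weight in sentence_weights.items():
            # kb maps (head, dependent) to a relation count; 0 means no match.
            score += weight * kb.get((gloss_word, sentence_word), 0)
    return score

def pick_sense(glosses, sentence_weights, kb):
    """Return the name of the gloss with the highest score."""
    return max(glosses, key=lambda name: score_gloss(glosses[name], sentence_weights, kb))

# Weights of the words around "company" in the example sentence.
sentence_weights = {"hire": 1.0, "large": 1.0, "software": 1.0,
                    "programmer": 0.5, "computer": 0.33}

# Content words of the two glosses of "company".
glosses = {
    "gloss 1": ["institution", "create", "conduct", "business"],
    "gloss 2": ["small", "military", "unit"],
}

# Made-up knowledge-base counts, for illustration only.
toy_kb = {("institution", "large"): 2,
          ("business", "software"): 3,
          ("unit", "large"): 1}

best = pick_sense(glosses, sentence_weights, toy_kb)
print(best)
```

With these toy counts, gloss 1 collects two matches and gloss 2 only one, so gloss 1 is selected.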
Synonym Matching
If no direct matches are found between a gloss word and the dependency relations in the context knowledge base, we can replace the gloss word with one of its synonyms, since synonyms are semantically equivalent words.
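A minimal sketch of this fallback, assuming the knowledge base is a mapping from (head, dependent) pairs to counts. The synonym list and the counts below are made up for illustration (the slides do not name a synonym source for this step), and `dependents_with_synonym_fallback` is a hypothetical name.

```python
def dependents_of(word, kb):
    """Collect the dependents of `word` and their weights from the knowledge base."""
    return {dep: weight for (head, dep), weight in kb.items() if head == word}

def dependents_with_synonym_fallback(word, kb, synonyms):
    """Look up `word` directly; if nothing is found, try its synonyms in order."""
    direct = dependents_of(word, kb)
    if direct:
        return direct
    for synonym in synonyms.get(word, []):
        found = dependents_of(synonym, kb)
        if found:
            return found
    return {}

# Made-up data: "institution" has no entry of its own in the knowledge base.
toy_kb = {("organization", "large"): 2}
toy_synonyms = {"institution": ["establishment", "organization"]}

result = dependents_with_synonym_fallback("institution", toy_kb, toy_synonyms)
print(result)
```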
Hypernym/hyponym Matching
E.g., animal → mammal → dog → poodle (hypernyms to the left, hyponyms to the right)
• Extract hypernyms and hyponyms of words from WordNet database.
• Store these in a data structure.
• Strategies:
  • use all “levels”
  • use only levels close to the original word
  • apply the above strategies to synonym matching, as well
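The “all levels” versus “close levels only” strategies can be illustrated with a small stored hierarchy. This sketch hard-codes the animal/mammal/dog/poodle chain from the example instead of querying WordNet, and the `max_levels` parameter and function names are assumptions made for illustration.

```python
# child -> parent links for the example chain (not read from WordNet).
HYPERNYM = {"poodle": "dog", "dog": "mammal", "mammal": "animal"}

def hypernyms(word, max_levels=None):
    """Walk upward from `word`; max_levels=None means use all levels."""
    found = []
    while word in HYPERNYM and (max_levels is None or len(found) < max_levels):
        word = HYPERNYM[word]
        found.append(word)
    return found

def hyponyms(word, max_levels=None):
    """Walk downward from `word`, one level at a time."""
    children = {}
    for child, parent in HYPERNYM.items():
        children.setdefault(parent, []).append(child)

    found, frontier, level = [], [word], 0
    while frontier and (max_levels is None or level < max_levels):
        frontier = [c for w in frontier for c in children.get(w, [])]
        found.extend(frontier)
        level += 1
    return found

print(hypernyms("dog"))
print(hypernyms("poodle", max_levels=1))
print(hyponyms("animal", max_levels=1))
```

Limiting `max_levels` keeps the replacement words close in meaning to the original word, at the cost of fewer candidate matches.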
Word Similarity
• Use the WordNet::Similarity Perl module to calculate a “similarity score” between a gloss word and the dependent words in the knowledge base.
• The most similar word found is treated as the closest thing to an actual match.
WordNet::Similarity similarity scores:
dog, animal: 0.780
dog, desk: 0.162
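Selecting the closest match from such scores can be sketched as follows. The code does not call WordNet::Similarity; it uses a hard-coded table holding the two scores shown above, and `similarity` and `closest_match` are hypothetical helpers.

```python
# The two scores from the slide; any other pair is treated as 0.0 here.
SIMILARITY = {("dog", "animal"): 0.780, ("dog", "desk"): 0.162}

def similarity(word_a, word_b):
    """Look up a score in either order, defaulting to 0.0 for unknown pairs."""
    return SIMILARITY.get((word_a, word_b), SIMILARITY.get((word_b, word_a), 0.0))

def closest_match(gloss_word, candidates):
    """Return the candidate most similar to gloss_word, or None if there are none."""
    return max(candidates, key=lambda c: similarity(gloss_word, c), default=None)

best = closest_match("dog", ["desk", "animal"])
print(best)
```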