Presents comparative evaluations of graph-based word sense disambiguation techniques using several measures of word semantic similarity and several ranking algorithms. Unsupervised word sense disambiguation has received a lot of attention lately because of its fast execution time and its ability to make the most of a small input corpus. Recent state-of-the-art graph-based systems have tried to close the gap between the supervised and the unsupervised approaches.
A Survey on Unsupervised Graph-based Word Sense Disambiguation
Elena-Oana Tabaranu
UAIC, Iasi
Plan
1. Introduction
2. State of the Art
3. Experiments and Results
4. Conclusions
5. References
Introduction
● WSD = automatically assign the most appropriate meaning to a polysemous word within a given context (Sinha et al, 2007)
● Use Cases:
  ● Machine translation
  ● Speech processing
  ● Boosting the performance of tasks like text retrieval, document classification and document clustering
State of the Art
● Supervised WSD vs Unsupervised WSD
● GWSD and Semantic Graph Construction
● SAN Method
● Page-Rank Method
● HITS Method
● P-Rank Method
Supervised WSD vs Unsupervised WSD
Supervised WSD
● Most approaches transform the sense of the word into a feature vector
● Low execution time
● Accuracy of 60%-70%
● Major disadvantage: knowledge acquisition bottleneck (accuracy depends on the amount of manually annotated data)
Unsupervised WSD
● Identify the best sense candidate for a model of the word sense dependency in text
● Ranking algorithm to choose their most likely combination
● Window or graph-based representation of the model
● Fast execution time
● Accuracy of 40%-60%
Graph-based WSD
● GWSD = graph representation used to model word sense dependencies in text (WSD with graphs, not just a word window)
● Goal: identify the most probable sense (label) for each word
● Advantage: takes into account information drawn from the entire graph
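A minimal sketch of how such a graph might be assembled, assuming each word comes with a list of candidate sense labels. The sense labels, the `GLOSSES` table and the `gloss_overlap` similarity are hypothetical toy stand-ins for a real sense inventory and similarity measure; they are not taken from the cited systems.

```python
from itertools import combinations

# Hypothetical toy glosses; a real system would read them from a sense inventory.
GLOSSES = {
    "bank#1": {"money", "institution"},
    "bank#2": {"river", "slope"},
    "deposit#1": {"money", "account"},
    "deposit#2": {"sediment", "river"},
}

def gloss_overlap(sense_a, sense_b):
    """Toy similarity: number of gloss words the two senses share."""
    return len(GLOSSES[sense_a] & GLOSSES[sense_b])

def build_sense_graph(word_senses, similarity, window=2):
    """Return {((i, sense_a), (j, sense_b)): weight} for an undirected graph.

    word_senses: list of (word, [candidate senses]) in text order.
    window:      maximum distance between word positions that may be linked.
    """
    edges = {}
    for i, j in combinations(range(len(word_senses)), 2):
        if j - i > window:
            continue  # words too far apart are not connected
        for sense_a in word_senses[i][1]:
            for sense_b in word_senses[j][1]:
                weight = similarity(sense_a, sense_b)
                if weight > 0:  # keep only edges that carry some evidence
                    edges[((i, sense_a), (j, sense_b))] = weight
    return edges

text = [("bank", ["bank#1", "bank#2"]), ("deposit", ["deposit#1", "deposit#2"])]
edges = build_sense_graph(text, gloss_overlap)
```

Senses of the same word are never linked to each other here, so a ranking algorithm run on this graph scores each sense only by its support from the senses of neighbouring words.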
Semantic Graph Construction (I)
● Example (Sinha et al, 2007)
Semantic Graph Construction (II)
● Example (Tsatsaronis et al, 2010)
The Page-Rank Method (Brin and Page, 1998)
● Ranking algorithm based on the idea of voting: when one node links to another it offers a vote to that other node
● The higher the number of votes for a node, the higher the importance of the node
● Recursively score the candidate nodes for a weighted undirected graph
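The recursive scoring can be sketched as follows, assuming the commonly used weighted update score(a) = (1 - d) + d * Σ_b (w_ab / Σ_c w_bc) * score(b). The function name, the damping value of 0.85 and the fixed iteration count are illustrative choices, not settings reported in the slides.

```python
def weighted_pagerank(edges, damping=0.85, iterations=50):
    """Score the nodes of a weighted undirected graph.

    edges: dict mapping (u, v) -> positive weight; each pair is listed once.
    """
    neighbours = {}
    for (u, v), weight in edges.items():
        neighbours.setdefault(u, {})[v] = weight
        neighbours.setdefault(v, {})[u] = weight  # undirected: store both directions
    # Total edge weight around each node, used to normalise its outgoing votes.
    strength = {n: sum(nb.values()) for n, nb in neighbours.items()}
    score = {n: 1.0 for n in neighbours}
    for _ in range(iterations):
        score = {
            n: (1 - damping) + damping * sum(
                weight / strength[m] * score[m]
                for m, weight in neighbours[n].items()
            )
            for n in neighbours
        }
    return score

# Toy graph: "b" is connected to both "a" (weight 2) and "c" (weight 1).
scores = weighted_pagerank({("a", "b"): 2.0, ("b", "c"): 1.0})
best = max(scores, key=scores.get)
```

For disambiguation, the nodes would be candidate senses, and the highest-scoring sense of each word would be selected.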
The P-Rank Method (Zhao et al, 2009)
● Check the structural similarity of nodes in an information network
● Based on the idea that two nodes are similar if they are referenced by and also reference similar nodes
● Represents a generalization of other state-of-the-art measures like CoCitation, Coupling, Amsler, SimRank
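A naive iterative sketch of this idea, assuming the usual formulation in which the similarity of two distinct nodes is a decayed mix of the average similarity of their in-neighbours (weight `lam`) and of their out-neighbours (weight `1 - lam`). The parameter names and default values are illustrative assumptions, and the quadratic all-pairs loop is only suitable for small graphs.

```python
def p_rank(nodes, edges, lam=0.5, decay=0.8, iterations=10):
    """Return {(a, b): similarity} for every ordered pair of nodes.

    nodes: iterable of node ids; edges: iterable of directed (source, target) pairs.
    """
    nodes = list(nodes)
    in_nb = {n: [] for n in nodes}
    out_nb = {n: [] for n in nodes}
    for source, target in edges:
        out_nb[source].append(target)
        in_nb[target].append(source)

    # Start from the identity: a node is fully similar only to itself.
    sim = {(a, b): 1.0 if a == b else 0.0 for a in nodes for b in nodes}

    def average(nb_a, nb_b, previous):
        """Mean similarity over all neighbour pairs; 0 if either side is empty."""
        if not nb_a or not nb_b:
            return 0.0
        total = sum(previous[(x, y)] for x in nb_a for y in nb_b)
        return total / (len(nb_a) * len(nb_b))

    for _ in range(iterations):
        updated = {}
        for a in nodes:
            for b in nodes:
                if a == b:
                    updated[(a, b)] = 1.0
                else:
                    updated[(a, b)] = decay * (
                        lam * average(in_nb[a], in_nb[b], sim)
                        + (1 - lam) * average(out_nb[a], out_nb[b], sim)
                    )
        sim = updated
    return sim

# "a" and "b" are both referenced by "p" and both reference "q".
sim = p_rank("pabq", [("p", "a"), ("p", "b"), ("a", "q"), ("b", "q")])
```

In this sketch, setting `lam=1` uses in-links only, which corresponds to a SimRank-style score, while `lam=0` uses out-links only.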
The HITS Method (Kleinberg, 1999)
● Identify authorities = the most important nodes in the graph
● Identify hubs = the nodes which point to authorities
● The sense with the highest authority is chosen as the most likely one for each word
● Major disadvantage: densely connected nodes can attract the highest score (clique attack)
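The hub and authority scores can be sketched with the standard mutual-reinforcement iteration on an unweighted directed graph. The function name, the toy edge list and the fixed iteration count are illustrative assumptions.

```python
import math

def hits(edges, iterations=50):
    """Return (hub, authority) score dicts for a directed graph.

    edges: iterable of (source, target) pairs.
    """
    edges = list(edges)
    nodes = {n for edge in edges for n in edge}
    hub = {n: 1.0 for n in nodes}
    authority = {n: 1.0 for n in nodes}

    def normalise(values):
        """Scale scores to unit Euclidean length so they stay bounded."""
        norm = math.sqrt(sum(x * x for x in values.values())) or 1.0
        return {n: x / norm for n, x in values.items()}

    for _ in range(iterations):
        # A node's authority is the sum of the hub scores of nodes pointing to it.
        authority = {n: 0.0 for n in nodes}
        for source, target in edges:
            authority[target] += hub[source]
        authority = normalise(authority)
        # A node's hub score is the sum of the authorities it points to.
        hub = {n: 0.0 for n in nodes}
        for source, target in edges:
            hub[source] += authority[target]
        hub = normalise(hub)
    return hub, authority

hub, authority = hits([("h1", "a"), ("h2", "a"), ("h1", "b")])
top_authority = max(authority, key=authority.get)
top_hub = max(hub, key=hub.get)
```

Here "a" receives links from both hubs and ends up as the top authority, while "h1" points to both authorities and ends up as the top hub.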
Experiments and Results (I)
● Senseval 2 and 3 data sets often used for testing
● Occurrences for Senseval 2 using WordNet 2
● Occurrences for Senseval 3 using WordNet 2
Experiments and Results (II)
● Accuracies on the Senseval 2 and 3 English All Words Task data sets (Tsatsaronis et al)
Conclusions
● Recent systems minimise the gap between supervised and unsupervised approaches.
● The graph-based methods make the most of the rich semantic model they employ.
● Unsupervised approaches seek the optimal value for the parameters using as little training data as possible and testing on as large a dataset as possible.
● Future work: implement P-Rank using a different graph representation, for example that of Sinha et al.
References
1. Tsatsaronis, G., Varlamis, I., Norvag, K.: An Experimental Study on Unsupervised Graph-based Word Sense Disambiguation. In Proc. of CICLing (2010).
2. Sinha, R., Mihalcea, R.: Unsupervised graph-based word sense disambiguation using measures of semantic similarity. In Proc. of ICSC (2007).
3. Mihalcea, R., Csomai, A.: Senselearner: Word sense disambiguation for all words in unrestricted text. In Proc. of ACL, pages 53-56 (2005).
4. Tsatsaronis, G., Vazirgiannis, M., Androutsopoulos, I.: Word Sense Disambiguation with Spreading Activation Networks Generated from Thesauri. In Proc. of IJCAI (2007).
Questions?