A Survey on Unsupervised Graph-based Word Sense Disambiguation

Elena-Oana Tabaranu

UAIC, Iasi

Plan

1. Introduction
2. State of the Art
3. Experiments and Results
4. Conclusions
5. References


Introduction

● WSD = automatically assign the most appropriate meaning to a polysemous word within a given context (Sinha and Mihalcea, 2007)
● Use cases:
  ● Machine translation
  ● Speech processing
  ● Boosting the performance of tasks like text retrieval, document classification and document clustering


State of the Art

● Supervised WSD vs Unsupervised WSD
● GWSD and Semantic Graph Construction
● SAN Method
● PageRank Method
● HITS Method
● P-Rank Method


Supervised WSD vs Unsupervised WSD

Supervised WSD:
● Most approaches transform the sense of the word into a feature vector
● Low execution time
● Accuracy of 60%-70%
● Major disadvantage: knowledge acquisition bottleneck (accuracy depends on the amount of manually annotated data)

Unsupervised WSD:
● Identify the best sense candidates for a model of the word sense dependencies in text
● Ranking algorithm to choose their most likely combination
● Window-based or graph-based representation of the model
● Fast execution time
● Accuracy of 40%-60%


Graph-based WSD

● GWSD = graph representation used to model word sense dependencies in text (WSD with graphs, not just a word window)
● Goal: identify the most probable sense (label) for each word
● Advantage: takes into account information drawn from the entire graph
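
A minimal sketch of such a graph representation, in Python. The toy sense inventory `SENSES`, the stop-word list, the gloss-overlap similarity and the weighted-degree scoring are all illustrative assumptions; none of them is taken from the surveyed systems, which rely on WordNet and richer similarity measures.

```python
from itertools import combinations

# Hypothetical toy sense inventory: word -> {sense label: gloss}.
SENSES = {
    "bank": {
        "bank#finance": "institution that accepts money deposits and gives loans",
        "bank#river": "sloping land beside a river or lake",
    },
    "deposit": {
        "deposit#money": "money given to a financial institution for safekeeping",
        "deposit#sediment": "matter laid down by water such as a river",
    },
    "loan": {
        "loan#money": "money lent by an institution and repaid with interest",
    },
}

# Assumed stop-word list, so that function words do not create edges.
STOP = {"a", "an", "and", "as", "by", "for", "or", "such", "that", "to", "with"}

def overlap(gloss_a, gloss_b):
    """Assumed similarity: number of content words shared by two glosses."""
    words_a = set(gloss_a.split()) - STOP
    words_b = set(gloss_b.split()) - STOP
    return len(words_a & words_b)

def build_sense_graph(words):
    """Return {(sense_u, sense_v): weight}, linking senses of different words."""
    edges = {}
    for w1, w2 in combinations(words, 2):
        for s1, g1 in SENSES[w1].items():
            for s2, g2 in SENSES[w2].items():
                weight = overlap(g1, g2)
                if weight > 0:
                    edges[(s1, s2)] = weight
    return edges

def label_words(words):
    """Pick, for each word, the sense node with the largest weighted degree."""
    degree = {}
    for (u, v), w in build_sense_graph(words).items():
        degree[u] = degree.get(u, 0) + w
        degree[v] = degree.get(v, 0) + w
    return {w: max(SENSES[w], key=lambda s: degree.get(s, 0)) for w in words}

context = ["bank", "deposit", "loan"]
graph = build_sense_graph(context)
labels = label_words(context)
```

Weighted degree is only the simplest possible scoring; any ranking algorithm that scores sense nodes could take its place in `label_words`.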


Semantic Graph Construction (I)

● Example (Sinha and Mihalcea, 2007)


Semantic Graph Construction (II)

● Example (Tsatsaronis et al., 2010)


The PageRank Method (Brin and Page, 1998)

● Ranking algorithm based on the idea of voting: when one node links to another, it casts a vote for that node
● The higher the number of votes for a node, the higher the importance of the node
● Recursively score the candidate nodes for a weighted undirected graph
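
The voting idea can be sketched in Python as follows. This is a sketch under stated assumptions, not the implementation used in the surveyed papers: it assumes a normalised formulation in which scores sum to 1, a damping factor of 0.85, a fixed number of iterations, and strictly positive edge weights.

```python
def weighted_pagerank(edges, damping=0.85, iterations=50):
    """Score the nodes of a weighted undirected graph.

    edges: {(u, v): weight}; each edge is treated as bidirectional
    and every weight is assumed to be strictly positive.
    """
    neighbours = {}
    for (u, v), w in edges.items():
        neighbours.setdefault(u, {})[v] = w
        neighbours.setdefault(v, {})[u] = w

    n = len(neighbours)
    # Total edge weight attached to each node, used to split its vote.
    strength = {u: sum(nbrs.values()) for u, nbrs in neighbours.items()}
    score = {u: 1.0 / n for u in neighbours}

    for _ in range(iterations):
        new_score = {}
        for u, nbrs in neighbours.items():
            # Each neighbour v passes on a share of its score proportional
            # to the weight of the edge (v, u).
            votes = sum(score[v] * w / strength[v] for v, w in nbrs.items())
            new_score[u] = (1.0 - damping) / n + damping * votes
        score = new_score
    return score

# Star graph: "a" is linked to every other node, so it collects the most votes.
edges = {("a", "b"): 1.0, ("a", "c"): 1.0, ("a", "d"): 1.0}
scores = weighted_pagerank(edges)
```

For WSD, the nodes would be candidate senses and the sense with the highest score would be kept for each word.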


The P-Rank Method (Zhao et al., 2009)

● Check the structural similarity of nodes in an information network
● Based on the idea that two nodes are similar if they are referenced by, and also reference, similar nodes
● Represents a generalization of other state-of-the-art measures like Co-citation, Coupling, Amsler and SimRank
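
A minimal Python sketch of this recursive similarity, assuming an unweighted directed graph. The parameter names (`lam`, `decay`), the single damping factor shared by both link directions and the fixed iteration count are assumptions made for illustration only.

```python
from itertools import product

def p_rank(out_links, lam=0.5, decay=0.8, iterations=10):
    """Structural similarity based on both in-links and out-links.

    out_links: {node: set of nodes it references}.
    lam: weight of the in-link term; (1 - lam) weights the out-link term.
    decay: damping factor, shared by both directions (an assumption).
    """
    nodes = set(out_links)
    for targets in out_links.values():
        nodes |= set(targets)
    outs = {n: set(out_links.get(n, ())) for n in nodes}
    ins = {n: set() for n in nodes}
    for src, targets in outs.items():
        for dst in targets:
            ins[dst].add(src)

    # Every node is fully similar to itself; all other pairs start at 0.
    sim = {(a, b): 1.0 if a == b else 0.0 for a in nodes for b in nodes}

    def average(group_a, group_b, current):
        """Mean similarity over all pairs drawn from the two groups."""
        if not group_a or not group_b:
            return 0.0
        total = sum(current[pair] for pair in product(group_a, group_b))
        return total / (len(group_a) * len(group_b))

    for _ in range(iterations):
        new_sim = {}
        for a, b in sim:
            if a == b:
                new_sim[(a, b)] = 1.0
            else:
                new_sim[(a, b)] = decay * (
                    lam * average(ins[a], ins[b], sim)
                    + (1.0 - lam) * average(outs[a], outs[b], sim)
                )
        sim = new_sim
    return sim

# "p" and "q" reference the same nodes; "r" references an unrelated one.
similarity = p_rank({"p": {"x", "y"}, "q": {"x", "y"}, "r": {"z"}})
```

In this sketch, setting `lam=1.0` keeps only the in-link term, which corresponds to a SimRank-style measure; `lam=0.0` keeps only the out-link term.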


The HITS Method (Kleinberg, 1999)

● Identify authorities = the most important nodes in the graph
● Identify hubs = the nodes which point to authorities
● The sense with the highest authority is chosen as the most likely one for each word
● Major disadvantage: densely connected nodes can attract the highest score (clique attack)
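
The hub and authority updates can be sketched in Python as below. The directed toy graph, the fixed iteration count and the Euclidean normalisation are assumptions for illustration; for an undirected sense graph, each edge would be entered in both directions.

```python
import math

def hits(links, iterations=50):
    """links: {node: set of nodes it points to}. Returns (authority, hub)."""
    nodes = set(links)
    for targets in links.values():
        nodes |= set(targets)
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}

    def normalise(scores):
        # Scale to unit Euclidean length; leave an all-zero vector unchanged.
        norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
        return {n: v / norm for n, v in scores.items()}

    for _ in range(iterations):
        # Authority of a node: sum of the hub scores of nodes pointing to it.
        auth = {n: 0.0 for n in nodes}
        for src, targets in links.items():
            for dst in targets:
                auth[dst] += hub[src]
        auth = normalise(auth)
        # Hub score of a node: sum of the authority scores it points to.
        hub = {n: sum(auth[dst] for dst in links.get(n, ())) for n in nodes}
        hub = normalise(hub)
    return auth, hub

links = {"h1": {"a1", "a2"}, "h2": {"a1"}, "h3": {"a1"}}
authority, hub = hits(links)
best_authority = max(authority, key=authority.get)
best_hub = max(hub, key=hub.get)
```

Here `a1` is pointed to by all three hubs and therefore ends up with the highest authority, while `h1` points to both authorities and becomes the strongest hub.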


Experiments and Results (I)

● Senseval 2 and 3 data sets often used for testing
● Occurrences for Senseval 2 using WordNet 2
● Occurrences for Senseval 3 using WordNet 2


Experiments and Results (II)

● Accuracies on the Senseval 2 and 3 English All Words Task data sets (Tsatsaronis et al., 2010)


Conclusions

● Recent systems narrow the gap between supervised and unsupervised approaches.
● The graph-based methods make the most of the rich semantic model they employ.
● Unsupervised approaches seek the optimal value for the parameters using as little training data as possible and testing on as large a dataset as possible.
● Future work: implement P-Rank using a different representation, for example the semantic graph of Sinha and Mihalcea (2007).


References

1. Tsatsaronis, G., Varlamis, I., Norvag, K.: An Experimental Study on Unsupervised Graph-based Word Sense Disambiguation. In Proc. of CICLing (2010).
2. Sinha, R., Mihalcea, R.: Unsupervised graph-based word sense disambiguation using measures of semantic similarity. In Proc. of ICSC (2007).
3. Mihalcea, R., Csomai, A.: Senselearner: Word sense disambiguation for all words in unrestricted text. In Proc. of ACL, pages 53-56 (2005).
4. Tsatsaronis, G., Vazirgiannis, M., Androutsopoulos, I.: Word Sense Disambiguation with Spreading Activation Networks Generated from Thesauri. In Proc. of IJCAI (2007).


Questions?
