Word Sense Disambiguation (WSD) By Asma Kausar and Joshim Uddin

Page 1: Word Sense Disambiguation (WSD)

Word Sense Disambiguation (WSD)

By Asma Kausar and Joshim Uddin

Page 2: Word Sense Disambiguation (WSD)

Introduction to WSD

● In computational linguistics, WSD is the process of identifying which meaning of a word is intended in a given context.

● An open problem in Natural Language Processing.

● Determines which 'sense' of a word is used, since a single word, as written or pronounced, can have several different meanings.

Page 3: Word Sense Disambiguation (WSD)

Problems

Take the following examples:

● “The rebel seized the opportunity to rebel”

'Rebel' has two meanings here: the spelling is the same, but the pronunciation differs.

The first use is a noun denoting the person who resists authority, control or convention; the second is a verb denoting the action taken by that person.

Page 4: Word Sense Disambiguation (WSD)

Problems

● “I read a book and it had a red cover”

'Read' and 'red' have the same pronunciation, but are spelled differently and have different meanings.

● To a human, common sense makes the intended meaning obvious.

● Developing algorithms to replicate this human ability can be a difficult task, as further exemplified by the implicit equivocation between “read” (Best book I've read) and “read” (I read the newspaper).

In Machine Translation, WSD was identified as a distinct task.

One of the first problems faced by these systems was word sense ambiguity; it became apparent that semantic disambiguation was needed at the lexical level.

Page 5: Word Sense Disambiguation (WSD)

Early Approaches To WSD

● Three earlier attempts to solve WSD:

● Preference Semantics: In simple terms, a representation of the entire sentence is built up from the representations of the individual words through a process of semantic interpretation.

● Word Expert Parsing: This approach is more highly lexicalized. It is based on the assumption that human knowledge of language is mainly knowledge about words rather than knowledge about rules.

● Polaroid Words: A system which is less lexicalized. It contains the same modules as a conventional NLP system, such as a grammar, parser, lexicon and semantic interpreter.

● Issues with all three: lack of a lexical hierarchy and a high degree of lexicalization.

Page 6: Word Sense Disambiguation (WSD)

Approaches To WSD

● Dictionary-based approach, by Michael E. Lesk

● First machine-readable dictionary method

● Uses the Lesk algorithm, based on the assumption that words in a given "neighborhood" will tend to share a common subject.

● For example, the algorithm applied to 'pine' 'cone'.

● 'Pine' has two major senses in the Oxford dictionary: 'tree with needle shaped leaves' and 'waste away through sorrow or illness'.

● Whereas 'cone' has three meanings: 'solid body which narrows to a point', 'something of this shape whether solid or hollow' and 'fruit of certain evergreen trees'.
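
The overlap idea behind the 'pine' 'cone' example can be sketched in a few lines of Python. This is a simplified illustration, not Lesk's original implementation: it uses only the glosses quoted on this slide, and the stopword list, the crude plural stripping and all function names are assumptions made for the example.

```python
from itertools import product

# Glosses as quoted on the slide; the short sense labels are hypothetical.
PINE = {
    "tree": "tree with needle shaped leaves",
    "waste": "waste away through sorrow or illness",
}
CONE = {
    "solid": "solid body which narrows to a point",
    "shape": "something of this shape whether solid or hollow",
    "fruit": "fruit of certain evergreen trees",
}

# Assumed stopword list, chosen by hand for this toy example.
STOPWORDS = {"a", "of", "to", "with", "this", "or", "which", "whether",
             "through", "away"}


def gloss_words(gloss):
    """Return the set of content words in a gloss, with crude plural stripping."""
    words = [w for w in gloss.lower().split() if w not in STOPWORDS]
    # Strip a trailing 's' so that 'trees' matches 'tree' (an assumption,
    # not a real stemmer).
    return {w[:-1] if w.endswith("s") and len(w) > 3 else w for w in words}


def best_sense_pair(senses_a, senses_b):
    """Pick the pair of senses whose glosses share the most words."""
    def overlap(pair):
        (_, gloss_a), (_, gloss_b) = pair
        return len(gloss_words(gloss_a) & gloss_words(gloss_b))

    best = max(product(senses_a.items(), senses_b.items()), key=overlap)
    return best[0][0], best[1][0], overlap(best)


print(best_sense_pair(PINE, CONE))  # ('tree', 'fruit', 1)
```

With these glosses the only overlapping word is 'tree' (from 'trees'), so the tree sense of 'pine' is paired with the fruit sense of 'cone'; every other pair of senses shares no content words.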

Page 7: Word Sense Disambiguation (WSD)

Machine Learning Approaches to WSD

● Tagging with thesaurus categories: developed by Masterman in 1960

-- Used a simple algorithm based on the repetition of categories associated with the words in the same sentence. The approach was unsuccessful at first, but was later improved by using a statistical model of the categories.

-- The later method was successful when tested on 12 polysemous words (92% correct disambiguation).
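
The category-repetition idea can be illustrated with a small Python sketch. It is not the original system: the toy thesaurus, its category names and the function name are all hypothetical, and the vote counting is only one plausible reading of "repetition of categories".

```python
from collections import Counter

# Hypothetical toy thesaurus (word -> categories); not taken from a real thesaurus.
THESAURUS = {
    "crane": {"ANIMAL", "MACHINE"},
    "lifted": {"MACHINE", "MOTION"},
    "steel": {"MATERIAL", "MACHINE"},
    "nest": {"ANIMAL"},
}


def tag_by_category(target, sentence):
    """Choose the category of `target` that is repeated most by the other words."""
    candidates = THESAURUS[target]
    votes = Counter()
    for word in sentence:
        if word == target:
            continue
        # Each context word votes for every candidate category it shares.
        for category in THESAURUS.get(word, ()):
            if category in candidates:
                votes[category] += 1
    # No shared category in the sentence: leave the word untagged.
    return votes.most_common(1)[0][0] if votes else None


print(tag_by_category("crane", ["the", "crane", "lifted", "steel"]))   # MACHINE
print(tag_by_category("crane", ["the", "crane", "built", "a", "nest"]))  # ANIMAL
```

A sentence with no category in common with the target returns `None`, which hints at why simple repetition alone had limited success.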

Page 8: Word Sense Disambiguation (WSD)

Machine Learning Approaches to WSD

● Clustering word usages

● Uses the Yarowsky algorithm for unsupervised learning to find hidden structure in unlabeled data. It relies on the 'one sense per collocation' and 'one sense per discourse' properties of human language.

● The decision list is based on the 'one sense per collocation' property: it starts with a large set of possible collocations and calculates the log-likelihood ratio of the word-sense probabilities for each collocation. The higher the log-likelihood ratio, the more predictive the evidence.
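
The decision-list step can be sketched as follows, assuming two senses labelled "A" and "B". This is a minimal illustration only: it builds the list from already-labelled examples and omits the bootstrapping from seed collocations that makes the Yarowsky algorithm unsupervised. The smoothing constant `alpha`, the sample data and the function names are assumptions for the example.

```python
import math
from collections import defaultdict


def build_decision_list(examples, alpha=0.1):
    """Rank collocations by |log(P(A | collocation) / P(B | collocation))|.

    `examples` is an iterable of (collocation, sense) pairs, sense in {"A", "B"}.
    `alpha` is an assumed smoothing constant that avoids division by zero.
    """
    counts = defaultdict(lambda: {"A": 0, "B": 0})
    for collocation, sense in examples:
        counts[collocation][sense] += 1

    rules = []
    for collocation, c in counts.items():
        # The ratio of smoothed counts equals the ratio of the two
        # conditional probabilities for this collocation.
        llr = math.log((c["A"] + alpha) / (c["B"] + alpha))
        rules.append((abs(llr), collocation, "A" if llr > 0 else "B"))
    # Strongest evidence first.
    return sorted(rules, reverse=True)


def classify(rules, context, default="A"):
    """Apply the first (most predictive) rule whose collocation is in the context."""
    for _, collocation, sense in rules:
        if collocation in context:
            return sense
    return default


# Hypothetical data for 'plant': A = living thing, B = factory.
examples = ([("life", "A")] * 3 + [("manufacturing", "B")] * 4
            + [("growth", "A")] * 2 + [("growth", "B")])
rules = build_decision_list(examples)
print([collocation for _, collocation, _ in rules])
print(classify(rules, {"manufacturing", "growth"}))  # B
```

Because only the single highest-ranked matching rule is used, the strong 'manufacturing' evidence decides the sense even though the weaker 'growth' rule points the other way.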

Page 9: Word Sense Disambiguation (WSD)

Summary

● WSD is a long-standing problem in language processing.

● Early systems suffered from a lack of coverage due to a lack of lexical resources.

● The earlier approaches were not fully successful because of issues such as the lack of a lexical hierarchy and the high degree of lexicalization required by a large vocabulary.

● The most successful were the machine learning approaches.