WORD SENSE DISAMBIGUATION Presented By Roshan R. Karwa Guided By Dr. M.B.Chandak A Technical Seminar on


A Technical Seminar on

WORD SENSE DISAMBIGUATION

Presented By Roshan R. Karwa

Guided By Dr. M.B.Chandak


MOTIVATION

• One of the open problems in NLP.

• Computationally determining which sense of a word is activated by its use in a particular context.

• E.g. I went to the bank to withdraw some money.

• Needed in:

• Machine Translation: For correct lexical choice.

• Information Retrieval: Resolving ambiguity in queries.

• Information Extraction: For accurate analysis of text.

OUTLINE

• Knowledge Based Approaches
  • WSD using Selectional Preferences
  • Overlap Based Approaches

• Machine Learning Based Approaches
  • Supervised Approaches
  • Unsupervised Algorithms

• Hybrid Approaches

• Summary

• Conclusion

• Future Work


Knowledge Based Approaches

WSD USING SELECTIONAL PREFERENCES AND ARGUMENTS

SENSE 1

• This airline serves dinner on the evening flight.

• serve (Verb)
  • agent
  • object – edible

SENSE 2

• This airline serves the sector between Agra & Delhi.

• serve (Verb)
  • agent
  • object – sector

• Requires exhaustive enumeration of:
  • Argument-structure of verbs.
  • Description of properties of words such that meeting the selectional preference criteria can be decided.

E.g. This flight serves the “region” between Mumbai and Delhi.
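
The selectional-preference check can be sketched in Python. This is only an illustration: the verb frames and the word-property table are hypothetical stand-ins for the exhaustively enumerated resources the approach requires, and the sense labels are made up for this example.

```python
# Hypothetical argument structure: each sense of "serve" constrains its object.
VERB_SENSES = {
    "serve": {
        "sense1 (offer food)": "edible",
        "sense2 (operate in an area)": "sector",
    }
}

# Hypothetical word properties used to test a selectional preference.
WORD_PROPERTIES = {
    "dinner": {"edible"},
    "sector": {"sector"},
    "region": {"sector"},
}


def disambiguate_verb(verb, obj):
    """Return the senses of `verb` whose object preference is met by `obj`."""
    properties = WORD_PROPERTIES.get(obj, set())
    return [sense for sense, required in VERB_SENSES[verb].items()
            if required in properties]


print(disambiguate_verb("serve", "dinner"))
print(disambiguate_verb("serve", "region"))
```

An object that is missing from the property table matches no sense at all, which is why the enumeration has to be exhaustive.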


Knowledge Based Approaches: OVERLAP BASED APPROACHES

• Require a Machine Readable Dictionary (MRD).

• Find the overlap between the features of different senses of an ambiguous word (sense bag) and the features of the words in its context (context bag).

• The sense which has the maximum overlap is selected as the contextually appropriate sense.


OVERLAP BASED APPROACHES: LESK’S ALGORITHM

Target word: ASH

• Sense 1: The residue that remains when something is burned.

• Sense 2: Timber tree . . .

• Sense 3: Strong elastic wood of any of various ash trees; used for furniture . . .

“On burning coal we get ash.”

In this case Sense 1 of ash would be the winner sense.

“On burning the ash, we found that its roots were deep in the ground.”

The winner sense will again be Sense 1 by simple Lesk, but the intended meaning is the tree.
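
A minimal sketch of simplified Lesk on the ash example, assuming a hand-made stopword list and a crude suffix stripper (both are illustrative choices, not part of the original algorithm description). The glosses are the three senses listed on this slide.

```python
import re

# Hypothetical stopword list, just large enough for this example.
STOPWORDS = {"the", "that", "when", "is", "of", "any", "for", "on", "we",
             "its", "were", "in", "a"}

GLOSSES = {
    "sense 1": "the residue that remains when something is burned",
    "sense 2": "timber tree",
    "sense 3": "strong elastic wood of any of various ash trees; used for furniture",
}


def stem(word):
    """Crude suffix stripping so that 'burning' and 'burned' both become 'burn'."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def bag(text, exclude=()):
    """Bag of stemmed content words, leaving out stopwords and `exclude`."""
    words = re.findall(r"[a-z]+", text.lower())
    return {stem(w) for w in words if w not in STOPWORDS and w not in exclude}


def simple_lesk(target, sentence, glosses):
    """Pick the sense whose gloss shares the most words with the context."""
    context = bag(sentence, exclude={target})
    overlaps = {sense: len(bag(gloss) & context)
                for sense, gloss in glosses.items()}
    return max(overlaps, key=overlaps.get), overlaps


print(simple_lesk("ash", "On burning coal we get ash.", GLOSSES))
print(simple_lesk(
    "ash",
    "On burning the ash, we found that its roots were deep in the ground.",
    GLOSSES))
```

Both sentences share only the stem "burn" with a gloss, so both are assigned Sense 1, which reproduces the failure on the second sentence.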


OVERLAP BASED APPROACHES: WALKER’S ALGORITHM

• A Thesaurus Based approach.

• Step 1: For each sense of the target word find the thesaurus category to which that sense belongs.

• Step 2: Calculate the score for each sense by using the context words. A context word adds 1 to the score of a sense if the thesaurus category of the word matches that of the sense.

• E.g. The money in this bank fetches an interest of 8% per annum.
  • Target word: bank
  • Clue words from the context: money, interest, annum, fetch

Word       Sense 1: Finance   Sense 2: Location
Money      +1                 0
Interest   +1                 0
Fetch      0                  0
Annum      +1                 0
Total      3                  0
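
The scoring in Step 2 can be sketched as follows. The thesaurus entries are hypothetical and only cover the clue words of the bank example.

```python
# Hypothetical thesaurus: word -> category (None when no category applies).
THESAURUS = {
    "money": "finance",
    "interest": "finance",
    "annum": "finance",
    "fetch": None,
}

# Step 1: thesaurus category of each sense of the target word "bank".
SENSE_CATEGORY = {"sense1": "finance", "sense2": "location"}


def walker(context_words, sense_category, thesaurus):
    """Step 2: a context word adds 1 to every sense sharing its category."""
    scores = {sense: 0 for sense in sense_category}
    for word in context_words:
        category = thesaurus.get(word)
        for sense, sense_cat in sense_category.items():
            if category == sense_cat:
                scores[sense] += 1
    return max(scores, key=scores.get), scores


print(walker(["money", "interest", "fetch", "annum"], SENSE_CATEGORY, THESAURUS))
```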


WSD USING CONCEPTUAL DENSITY

• Select a sense based on the relatedness of that word-sense to the context.

• Relatedness is measured in terms of conceptual distance

• (i.e. how close the concept represented by the word and the concept represented by its context words are)

• This approach uses a structured hierarchical semantic net (WordNet) for finding the conceptual distance.

• The smaller the conceptual distance, the higher the conceptual density.

• (i.e. if all words in the context are strong indicators of a particular concept then that concept will have a higher density.)


CONCEPTUAL DENSITY (EXAMPLE)

The jury(2) praised the administration(3) and operation(8) of Atlanta Police Department(1). (The number after each noun is its number of senses.)

• Step 1: Make the lattice.

• Step 2: Compute CD.

• Step 3: Select the highest CD.

• Step 4: Select the concept.

[Figure: WordNet lattice with the nodes operation, division, administrative unit, jury, committee, police department, local department, government department, department, administration and body. The two sub-hierarchies shown have CD = 0.256 and CD = 0.062.]


Supervised Approach: NAÏVE BAYES

$$\hat{s} = \arg\max_{s \in \text{senses}} \Pr(s \mid V_w)$$

‘V_w’ is a feature vector consisting of:

• POS of w

• Semantic & syntactic features of w

• Collocation vector (set of words around it): typically consists of the next word (+1), the next-to-next word (+2), the words at -2 and -1, and their POS's

• Co-occurrence vector (number of times w occurs in bag of words around it)

Applying Bayes rule and naive independence assumption:

$$\hat{s} = \arg\max_{s \in \text{senses}} \Pr(s) \prod_{i=1}^{n} \Pr(V_w^i \mid s)$$


NAÏVE BAYES Example

I went to the bank to withdraw some money

• Collocation vector: < I, went, withdraw, money>

• Co-occurrence vector: considering a window of 2 words before bank and 2 words after bank, bank appears once.

• Vsense of bank: <N, org, plural-s, went, withdraw, money>


NAÏVE BAYES Example (CONT . .)

• P(Vbank|sense of bank) = ?

• P(<N, org, plural-s, went, withdraw, money>|sense of bank) = P(N|sense of bank). P(org|sense of bank). P(plural-s|sense of bank). P(went|sense of bank). P(withdraw|sense of bank). P(money|sense of bank)

• Say, P(org| sense1 bank)

P(org| sense2 bank)


ESTIMATING PARAMETERS

• Parameters in the probabilistic WSD are:
  • Pr(s)
  • Pr(V_w^i | s)

• Senses are marked with respect to sense repository (WordNet)

$$\Pr(s) = \frac{\text{count}(s, w)}{\text{count}(w)}$$

$$\Pr(V_w^i \mid s) = \frac{\Pr(V_w^i, s)}{\Pr(s)} = \frac{c(V_w^i, s, w)}{c(s, w)}$$
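
A minimal sketch of estimating these parameters from counts and classifying with them. The tagged examples are hypothetical, and add-one smoothing is an assumption added here so that unseen features do not yield a zero probability; the slides use the plain count ratios.

```python
import math
from collections import Counter, defaultdict

# Hypothetical sense-tagged occurrences of "bank": (sense, context features).
TRAIN = [
    ("finance", ["went", "withdraw", "money"]),
    ("finance", ["deposit", "money", "account"]),
    ("river", ["water", "flow", "fishing"]),
]


def train(examples):
    """Collect c(s, w) and c(V_w^i, s, w) from the tagged examples."""
    sense_counts = Counter()
    feature_counts = defaultdict(Counter)
    for sense, features in examples:
        sense_counts[sense] += 1
        feature_counts[sense].update(set(features))
    return sense_counts, feature_counts


def classify(features, model):
    """argmax_s log Pr(s) + sum_i log Pr(V_w^i | s)."""
    sense_counts, feature_counts = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, -math.inf
    for sense, count in sense_counts.items():
        score = math.log(count / total)  # Pr(s) = count(s, w) / count(w)
        for feature in features:
            # Add-one smoothed c(V_w^i, s, w) / c(s, w); smoothing is an assumption.
            score += math.log((feature_counts[sense][feature] + 1) / (count + 2))
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


model = train(TRAIN)
print(classify(["withdraw", "money"], model))
print(classify(["water"], model))
```

Working in log space avoids underflow when many features are multiplied together.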


Supervised Approach: DECISION LIST ALGORITHM

• Based on ‘One sense per collocation’ property.

• Nearby words provide strong and consistent clues as to the sense of a target word.

• Collect a large set of collocations for the ambiguous word.

• Calculate word-sense probability distributions for all such collocations.

• Calculate the log-likelihood ratio

$$\log\left(\frac{\Pr(\text{Sense-A} \mid \text{Collocation}_i)}{\Pr(\text{Sense-B} \mid \text{Collocation}_i)}\right)$$

  (Assuming there are only two senses for the word. Of course, this can easily be extended to ‘k’ senses.)

• Higher log-likelihood = more predictive evidence

• Collocations are ordered in a decision list, with most predictive collocations ranked highest.

DECISION LIST ALGORITHM (CONTD.)

[Figure: training data and the resultant decision list.]

Classification of a test sentence is based on the highest ranking collocation found in the test sentence. E.g.

…plucking flowers affects plant growth…
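
A minimal sketch of building and applying a decision list for the two-sense case. The training sentences and sense labels are hypothetical, every context word is treated as a collocation, and the smoothing constant `alpha` and the ranking by absolute log-likelihood are assumptions made for this illustration.

```python
import math
from collections import Counter

# Hypothetical sense-tagged sentences for the target word "plant".
TRAIN = [
    ("living", "plant growth needs water and light"),
    ("living", "flowers of the plant bloom"),
    ("factory", "the manufacturing plant employs workers"),
    ("factory", "plant equipment and manufacturing cost"),
]


def build_decision_list(examples, target, sense_a, sense_b, alpha=0.1):
    """Rank collocations by |log(Pr(A|colloc) / Pr(B|colloc))|."""
    counts = {sense_a: Counter(), sense_b: Counter()}
    for sense, sentence in examples:
        counts[sense].update(set(sentence.split()) - {target})
    rules = []
    for word in set(counts[sense_a]) | set(counts[sense_b]):
        # alpha keeps the ratio finite when a word was seen with one sense only.
        llr = math.log((counts[sense_a][word] + alpha)
                       / (counts[sense_b][word] + alpha))
        if llr != 0:  # a word seen equally often with both senses is no evidence
            rules.append((abs(llr), word, sense_a if llr > 0 else sense_b))
    return sorted(rules, reverse=True)


def classify(sentence, rules):
    """Apply the highest-ranked rule whose collocation occurs in the sentence."""
    words = set(sentence.split())
    for _, word, sense in rules:
        if word in words:
            return sense
    return None


rules = build_decision_list(TRAIN, "plant", "living", "factory")
print(classify("plucking flowers affects plant growth", rules))
```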


Unsupervised Approach: HYPERLEX

• KEY IDEA

• Instead of using “dictionary defined senses” extract the “senses from the corpus” itself

• These “corpus senses” or “uses” correspond to clusters of similar contexts for a word.

[Figure: clusters of co-occurring words for the French word “barrage”. (river), (water), (flow), (electricity) form one use; (victory), (team), (cup), (world) form another.]

Example: “Outre la production d'électricité, le BARRAGE permettra de réguler le cours du fleuve” (In addition to the production of electricity, the dam will regulate the river flow)
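
The key idea of inducing “uses” from the corpus can be sketched as follows. This is a deliberate simplification: it groups co-occurring words by connected components of a co-occurrence graph, and it is not the full HyperLex procedure. The contexts are hypothetical English stand-ins for sentences containing “barrage”.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical content words from four contexts of the target word "barrage".
CONTEXTS = [
    ["river", "water", "flow"],
    ["water", "electricity", "river"],
    ["team", "cup", "victory"],
    ["world", "cup", "team"],
]


def build_graph(contexts):
    """Link every pair of words that co-occur in the same context."""
    graph = defaultdict(set)
    for words in contexts:
        for a, b in combinations(set(words), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def components(graph):
    """Connected components of the graph; each one is treated as a 'use'."""
    seen, uses = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(graph[node] - component)
        seen |= component
        uses.append(frozenset(component))
    return uses


def assign(context, uses):
    """Pick the use that shares the most words with a new context."""
    return max(uses, key=lambda use: len(use & set(context)))


uses = components(build_graph(CONTEXTS))
print(sorted(assign(["electricity", "river", "flow"], uses)))
```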


HYBRID: AN ITERATIVE APPROACH TO WSD

• Extracts collocational and contextual information from WordNet (gloss) and a small amount of tagged data.

• Monosemic words in the context serve as a seed set of disambiguated words.

• It would be interesting to exploit other semantic relations available in WordNet.

• Combine information obtained from multiple knowledge sources.


STRUCTURAL SEMANTIC INTERCONNECTIONS (SSI)

• AN ITERATIVE APPROACH.

• Uses the following relations

• hypernymy (car#1 is a kind of vehicle#1) denoted by (kind-of )

• hyponymy (the inverse of hypernymy) denoted by (has-kind)

• meronymy (room#1 has-part wall#1) denoted by (has-part )

• holonymy (the inverse of meronymy) denoted by (part-of )

• attribute (dry#1 value-of wetness#1) denoted by (attr)

• gloss denoted by (gloss)

• context denoted by (context)

• domain denoted by (dl)

• Monosemic words serve as the seed set for disambiguation.
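
The seed-and-iterate idea can be sketched as below. This is a simplification, not the SSI algorithm itself: it only counts direct links to already-chosen senses, and the sense inventory and the links between senses are hypothetical.

```python
# Hypothetical sense inventory; words with a single sense are monosemic.
SENSES = {
    "bus": ["bus#vehicle", "bus#connector"],
    "stop": ["stop#halting_place", "stop#organ_knob"],
    "passenger": ["passenger#1"],
    "computer": ["computer#1"],
}

# Hypothetical semantic relations between senses (undirected).
LINKS = {
    frozenset({"bus#vehicle", "passenger#1"}),
    frozenset({"bus#vehicle", "stop#halting_place"}),
    frozenset({"bus#connector", "computer#1"}),
}


def disambiguate(words):
    """Seed with monosemic words, then fix one best-connected sense per round."""
    chosen = {w: SENSES[w][0] for w in words if len(SENSES[w]) == 1}
    pending = [w for w in words if w not in chosen]
    while pending:
        best = None  # (score, word, sense)
        for word in pending:
            for sense in SENSES[word]:
                score = sum(frozenset({sense, other}) in LINKS
                            for other in chosen.values())
                if score > 0 and (best is None or score > best[0]):
                    best = (score, word, sense)
        if best is None:  # no remaining sense is connected to the chosen ones
            break
        _, word, sense = best
        chosen[word] = sense
        pending.remove(word)
    return chosen


print(disambiguate(["bus", "passenger", "stop"]))
```

In this run "stop" can only be resolved in the second round, after "bus" has been fixed to its vehicle sense.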


Structural Semantic Interconnections (SSI) contd.

A SEMANTIC RELATIONS GRAPH FOR THE TWO SENSES OF THE WORD BUS (I.E. VEHICLE AND CONNECTOR)


RESOURCES FOR WSD

• Sense Inventory:
  • Dictionaries
  • Thesauri (Roget’s Thesaurus . . .)
  • Lexical KB (WordNet . . .)

• Corpora:
  • Raw (Brown corpus . . .)
  • Sense Tagged (SemCor . . .)
  • Automatically Tagged Corpora (Open Dictionary Project . . .)


Conclusion

• Using a diverse set of features improves WSD accuracy.

• WSD results are better when the degree of polysemy is reduced.

• Hyperlex (unsupervised corpus based), SSI (hybrid) look promising for resource-poor Indian languages.


FUTURE IMPLEMENTATION

• Use unsupervised or hybrid approaches to develop a WSD engine. (focusing on MT)

• Automatically generate sense tagged data.

• Explore whether it is possible to evaluate the role of WSD in MT.


REFERENCES

• Agirre, Eneko & German Rigau. 1996. "Word sense disambiguation using conceptual density", Proceedings of the 16th International Conference on Computational Linguistics (COLING), Copenhagen, Denmark.

• Ng, Hwee T. & Hian B. Lee. 1996. "Integrating multiple knowledge sources to disambiguate word senses: An exemplar-based approach", Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics (ACL), Santa Cruz, U.S.A., 40-47.

• Ng, Hwee T. 1997. "Exemplar-based word sense disambiguation: Some recent improvements", Proceedings of the 2nd Conference on Empirical Methods in Natural Language Processing (EMNLP), Providence, U.S.A., 208-213.

• Mihalcea, Rada & Dan Moldovan. 2000. "An Iterative Approach to Word Sense Disambiguation", Proceedings of the Florida Artificial Intelligence Research Society Conference (FLAIRS 2000), Orlando, FL, 219-223.

• Navigli, Roberto & Paola Velardi. 2005. "Structural Semantic Interconnections: A Knowledge-Based Approach to Word Sense Disambiguation", IEEE Transactions on Pattern Analysis and Machine Intelligence, July 2005.

• Chan, Yee Seng, Hwee Tou Ng & David Chiang. 2007. "Word Sense Disambiguation Improves Statistical Machine Translation", Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, Prague.

• Chen, Ping, Chris Bowes, Wei Ding & Max Choly. 2012. "Word Sense Disambiguation with Automatically Acquired Knowledge", IEEE Intelligent Systems, IEEE Computer Society.


THANK YOU!


APPENDIX

Conceptual Density

$$CD(c, m) = \frac{\sum_{i=0}^{m-1} nhyp^{i}}{\sum_{i=0}^{h-1} nhyp^{i}}$$
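
A small numeric sketch of the density ratio, assuming it is read as the sum of nhyp^i for i = 0 … m-1 divided by the sum of nhyp^i for i = 0 … h-1. The reading of the symbols is also an assumption: nhyp is taken as the mean number of hyponyms per node, h as the height of the sub-hierarchy under concept c, and m as the number of relevant senses it contains.

```python
def conceptual_density(nhyp, h, m):
    """Ratio of two geometric sums: m levels over the full height h."""
    expected = sum(nhyp ** i for i in range(m))  # i = 0 .. m-1
    total = sum(nhyp ** i for i in range(h))     # i = 0 .. h-1
    return expected / total


# With 2 hyponyms per node and height 3: (1 + 2) / (1 + 2 + 4)
print(conceptual_density(nhyp=2, h=3, m=2))
```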


Unsupervised: LIN’S APPROACH

Two different words are likely to have similar meanings if they occur in identical local contexts.

E.g. The facility will employ 500 new employees.

SENSES OF FACILITY

• Installation
• Proficiency
• Adeptness
• Readiness
• Toilet/bathroom

SUBJECTS OF “EMPLOY”

Word            Freq   Log Likelihood
ORG             64     50.4
Plant           14     31.0
Company         27     28.6
Industry        9      14.6
Unit            9      9.32
Aerospace       2      5.81
Memory device   1      5.79
Pilot           2      5.37

In this case Sense 1 of facility (installation) would be the winner sense.
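
One way to picture the selection is to score each sense of facility by its similarity to the usual subjects of “employ”, weighted by their log likelihood. The sketch below does that under loud assumptions: the log-likelihood values come from the table on this slide, but the similarity values and the weighted-sum scoring are hypothetical and chosen only for illustration.

```python
# Log likelihoods of the subjects of "employ", taken from the slide's table.
SELECTORS = {
    "org": 50.4, "plant": 31.0, "company": 28.6, "industry": 14.6,
    "unit": 9.32, "aerospace": 5.81, "memory device": 5.79, "pilot": 5.37,
}

# Hypothetical similarity in [0, 1] between a sense and a selector word;
# pairs that are not listed count as 0.
SIMILARITY = {
    ("installation", "plant"): 0.8,
    ("installation", "company"): 0.5,
    ("installation", "unit"): 0.4,
    ("toilet/bathroom", "unit"): 0.2,
    ("proficiency", "pilot"): 0.1,
}

SENSES = ["installation", "proficiency", "adeptness", "readiness",
          "toilet/bathroom"]


def best_sense(senses, selectors, similarity):
    """Pick the sense with the highest likelihood-weighted similarity."""
    scores = {
        sense: sum(weight * similarity.get((sense, word), 0.0)
                   for word, weight in selectors.items())
        for sense in senses
    }
    return max(scores, key=scores.get), scores


winner, scores = best_sense(SENSES, SELECTORS, SIMILARITY)
print(winner)
```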