Word Sense Disambiguation
Another NLP working problem for learning with constraints…

Lluís Màrquez
TALP, LSI, Technical University of Catalonia
UIUC, June 10 2004


Word Sense Disambiguation

• The problem
  – WSD is the problem of assigning the correct meaning to the words occurring in a text or discourse (sense tagging)
  – Example:
    “He was mad about stars at the age#1 of nine”
    “About 20,000 years ago the last ice age#2 ended”
    age#1: the length of time something (or someone) has existed
    age#2: a historic period
  – Origins in the early days of AI (1960s), around the first MT models
  – Renewed interest with the explosion of statistical and ML-based approaches to NLP (1990s)

Word Sense Disambiguation

• Usual approaches
  – Supervised learning (ML): a multiclass classification problem with one classifier per word (“word-experts”). About 75% accuracy on subsets of selected polysemous words; sometimes better (over 90%) on specific words
  – “Unsupervised”, “knowledge-based” = heuristic rules based on preexisting knowledge sources (WordNet, MRDs, multilingual aligned corpora, etc.). Accuracy: around 60% (allwords WSD)
  – Combined approaches: 65% (allwords WSD)
  – Supervised methods are better but difficult to apply to “allwords” WSD

WSD: ML Approach

• Usual features:
  – Local context patterns (POS, words, lemmas)
    • the <age> of, <age> CD
    • <age> limit, mean <age>
  – Broad context features: bag of (relevant) words
    • Atomic occurs in the sentence
    • Dark occurs in the sentence
  – Also syntactic features capturing predicate-argument relations
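The local-pattern and bag-of-words features listed above can be sketched roughly as follows. This is only an illustration, not the feature extractor of any actual system: the function name `extract_features`, the feature-name strings, the window of two tokens, and the `relevant_words` set are all assumptions made for the example.

```python
from typing import Dict, List, Set, Tuple


def extract_features(tokens: List[Tuple[str, str]], target: int,
                     relevant_words: Set[str]) -> Dict[str, int]:
    """Binary features for the word at index `target`.

    tokens: (word, POS) pairs for one sentence (hypothetical input format).
    """
    feats: Dict[str, int] = {}

    # Local context: words and POS tags at offsets -2..+2 around the target.
    for off in (-2, -1, 1, 2):
        i = target + off
        if 0 <= i < len(tokens):
            word, pos = tokens[i]
            feats[f"w[{off:+d}]={word.lower()}"] = 1
            feats[f"pos[{off:+d}]={pos}"] = 1

    # Local pattern such as "the <T> of", with <T> standing for the target.
    left = tokens[target - 1][0].lower() if target > 0 else "<s>"
    right = tokens[target + 1][0].lower() if target + 1 < len(tokens) else "</s>"
    feats[f"pattern={left} <T> {right}"] = 1

    # Broad context: which of the "relevant" words occur in the sentence.
    for i, (word, _) in enumerate(tokens):
        if i != target and word.lower() in relevant_words:
            feats[f"bow={word.lower()}"] = 1

    return feats


sentence = [("He", "PRP"), ("was", "VBD"), ("mad", "JJ"), ("about", "IN"),
            ("stars", "NNS"), ("at", "IN"), ("the", "DT"), ("age", "NN"),
            ("of", "IN"), ("nine", "CD")]
feats = extract_features(sentence, target=7, relevant_words={"stars", "ice", "atomic"})
```

For the target "age" this yields, among others, the pattern feature `the <T> of`, the POS feature `pos[+2]=CD`, and the bag-of-words feature `bow=stars`. A real system would add lemmas and syntactic relations as well.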

WSD: ML Approach

• Main difficulties:
  – Each word is a separate classification problem => data scarceness
  – High granularity of the sense repositories used => many classes
  – Difficulty in capturing the semantic information present in the context: context words are sparse and are themselves ambiguous (no interactions between word-classifiers have been exploited)

WSD: Difficulties

• Example (from WSJ)

The jury further said in term end presentments that the City Executive Committee, which had over-all charge of the election, “deserves the praise and thanks of the City of Atlanta” for the manner in which the election was conducted.

WSD: Difficulties

• Example (from WSJ, WordNet senses)

The jury#NN#1 further#RB#2 said#VB#1 in term#NN#2 end#NN#2 presentments#NN#1 that the City_Executive_Committee#1 , which had#VB#4 over-all#JJ#2 charge#NN#6 of the election#NN#1 , “ deserves#VB#1 the praise#NN#1 and thanks#NN#1 of the City_of_Atlanta#1 ” for the manner#NN#1 in which the election#NN#1 was conducted#VB#1 .

WSD: Difficulties

• Example (from WSJ, WordNet senses)

jury#NN#1 further#RB#2 said#VB#1 term#NN#2 end#NN#2 presentments#NN#1 had#VB#4 over-all#JJ#2 charge#NN#6 election#NN#1 deserves#VB#1 praise#NN#1 thanks#NN#1 manner#NN#1 election#NN#1 conducted#VB#1 .

WSD: Difficulties

• Example (from WSJ, WordNet senses)

The jury(2) further(5) said(11) in term(6) end(15) presentments(3) that the City_Executive_Committee , which had(21) over-all(2) charge(15) of the election(2) , “ deserves the praise(2) and thanks(2) of the City_of_Atlanta ” for the manner(3) in which the election(2) was conducted(5) .
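To get a feeling for the size of the problem, the sketch below assumes that each number in parentheses is the number of candidate senses of that word (the slide does not say so explicitly). It compares the number of candidates scored when every word is classified on its own with the number of joint sense assignments for the whole sentence.

```python
import math
import re

# The example sentence, with the per-word counts copied from the slide.
annotated = (
    "The jury(2) further(5) said(11) in term(6) end(15) presentments(3) "
    "that the City_Executive_Committee , which had(21) over-all(2) charge(15) "
    "of the election(2) , deserves the praise(2) and thanks(2) of the "
    "City_of_Atlanta for the manner(3) in which the election(2) was conducted(5) ."
)

# Assumption: the number in parentheses is the count of candidate senses.
counts = [int(n) for _, n in re.findall(r"([\w-]+)\((\d+)\)", annotated)]

independent = sum(counts)   # candidates scored by independent word-experts
joint = math.prod(counts)   # joint assignments if all words interact

print(len(counts), independent, joint)
```

Under that assumption, classifying the words independently scores a number of candidates equal to the sum of the counts, while the joint assignments multiply, so exhaustive joint inference grows far too quickly to be practical beyond short contexts.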

WSD: ML Approach

• Utility?
  – Useful for IR / IE / Semantic parsing / Knowledge acquisition?
  – Accurately resolving WSD is more difficult than most of the NLP tasks for which it is potentially helpful
• Evaluation exercises for WSD: Senseval-1/2/3
  – Senseval-3 co-located with ACL-2004
  – 2 major types of task: “lexical sample”, “allwords”
  – 10 different languages + 1 multilingual lexical sample task
  – Several new tasks: automatic subcategorization acquisition, WSD of WordNet glosses, Semantic Roles (English and Swedish), Logic Forms, etc.

Word Sense Disambiguation

• Our involvement in Senseval-3 (TALP research group)
  – As organizers:
    • Lexical sample tasks for Catalan and Spanish:
      – Coarse sense dictionary developed for the tasks, with additional information (collocations, examples, etc.)
      – Manual annotation of about 300 examples for 50 different words in each language. Context of 3 sentences. Also POS and lemma annotation
      – Large corpus of about 1,500 unannotated examples for each word
      – Best results: 85% accuracy
      – But nothing new was presented!!!

Word Sense Disambiguation

  – As participants:
    • English lexical sample task: SVMs, constraint classification, thorough feature optimization and parameter tuning, (semantically) rich feature set. Accuracy: 71.6% - 78.2%, state-of-the-art.
    • English allwords task: combination (cascade + weighted voting scheme) of several supervised and knowledge-based modules. Supervised modules trained on frequent words of the SemCor corpus. Knowledge-based modules rely on WordNet and WordNet Domains. Accuracy: 62.40% (67.4%)
    • Disambiguation of WordNet glosses (best results)
  – Five papers already available. Resources (datasets and dictionaries) will also be available after the workshop in July.
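The slides do not describe how the cascade and the weighted voting were arranged in the allwords system, so the following is only a generic sketch of the two ideas. The function names, the abstention convention (`None`), the module weights, and the sense labels in the example are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Prediction = Tuple[Optional[str], float]  # (sense or None if abstaining, weight)


def weighted_vote(predictions: List[Prediction]) -> Optional[str]:
    """Return the sense with the largest total weight, or None if all abstain."""
    totals: Dict[str, float] = defaultdict(float)
    for sense, weight in predictions:
        if sense is not None:
            totals[sense] += weight
    if not totals:
        return None
    return max(totals, key=totals.get)


def cascade(supervised_sense: Optional[str],
            fallback_predictions: List[Prediction]) -> Optional[str]:
    """Stage 1: use the supervised module when it covers the word.
    Stage 2: otherwise fall back to a weighted vote of the other modules."""
    if supervised_sense is not None:
        return supervised_sense
    return weighted_vote(fallback_predictions)


# A word covered by the supervised module keeps its prediction.
covered = cascade("charge#6", [("charge#1", 0.9)])

# An uncovered word is decided by the (hypothetical) knowledge-based modules.
uncovered = cascade(None, [("charge#1", 0.4), ("charge#6", 0.3),
                           ("charge#6", 0.3), (None, 0.9)])
```

In the second call two lower-weight modules agree on `charge#6` and together outvote the single module proposing `charge#1`; the abstaining module contributes nothing.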

New Direction

• Allwords WSD in context

...The jury#NN#1 further#RB#2 said#VB#1 in term#NN#2 end#NN#2 presentments#NN#1 that the City_Executive_Committee#1 , which had#VB#4 over-all#JJ#2 charge#NN#6 of the election#NN#1 , “ deserves#VB#1 the praise#NN#1 and thanks#NN#1 of the City_of_Atlanta#1 ” for the manner#NN#1 in which the election#NN#1 was conducted#VB#1 ....

Allwords WSD in context

• Example (WSJ, only nouns)

jury          term     end
presentments  charge   election
praise        thanks   manner
election

• “One sense per discourse” constraint
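One simple way to apply a “one sense per discourse” constraint is to relabel every occurrence of a word with the sense most often predicted for it in the same discourse. The sketch below treats the constraint as hard and uses a plain majority vote; that is a choice made for illustration, not something the slides prescribe, and the function name and sense labels are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


def one_sense_per_discourse(tagged: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """tagged: (lemma, predicted sense) pairs for one discourse, in order.

    Every occurrence of a lemma is relabeled with its majority sense.
    """
    votes: Dict[str, Counter] = defaultdict(Counter)
    for lemma, sense in tagged:
        votes[lemma][sense] += 1
    majority = {lemma: c.most_common(1)[0][0] for lemma, c in votes.items()}
    return [(lemma, majority[lemma]) for lemma, _ in tagged]


predicted = [("jury", "jury#1"), ("election", "election#1"),
             ("election", "election#2"), ("election", "election#1")]
smoothed = one_sense_per_discourse(predicted)
```

Here the single `election#2` prediction is overridden by the two `election#1` predictions. Weighting the votes by classifier confidence would be a natural refinement.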

Allwords WSD in context

• Example (WSJ, only nouns), with candidate senses:
  – jury: body of citizens...; committee, panel
  – term: word or expression; limited period of time
  – end: point in time in which something ends; surface of a three-dimensional object
  – presentments: an accusation of crime...; the act of presenting something
  – charge: electrical charge; an impetuous rush toward someone...; a pleading; a command to do something
  – thanks: acknowledgement of appreciation; with the help or owing to
  – election, praise, manner (no glosses shown)

• Sense pairs likely to occur together
• Incompatible sense pairs
• Lots of irrelevant/unknown sense pairs

Allwords WSD in context

• Selectional preferences
  – To produce compatibility constraints between verbs and subject/object head nouns
  – For instance: “when money#1 appears as object, the preferred verbs are: raise#4 (1.44), {take_in#5, collect#2} (0.45), {earn#2, garner#2} (0.23), …”
  – Requires syntactic information
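As a rough illustration of how such preferences could act as compatibility scores, the sketch below stores the `money#1` example in a lookup table and picks, among the candidate senses of a verb, the one preferred by its object. The table layout, the function name `best_verb_sense`, and the decision to give every member of a braced set the same score are assumptions made for the example; only the numbers come from the slide.

```python
from typing import Dict, List, Optional

# object noun sense -> {verb sense: preference score}; scores from the slide.
PREFS: Dict[str, Dict[str, float]] = {
    "money#1": {
        "raise#4": 1.44,
        "take_in#5": 0.45, "collect#2": 0.45,  # assumed to share one score
        "earn#2": 0.23, "garner#2": 0.23,      # assumed to share one score
    }
}


def best_verb_sense(object_sense: str, candidate_verb_senses: List[str],
                    prefs: Dict[str, Dict[str, float]] = PREFS) -> Optional[str]:
    """Return the candidate verb sense with the highest preference score,
    or None when the table offers no evidence for any candidate."""
    if not candidate_verb_senses:
        return None
    scores = prefs.get(object_sense, {})
    best_score, best = max((scores.get(v, 0.0), v) for v in candidate_verb_senses)
    return best if best_score > 0 else None


choice = best_verb_sense("money#1", ["raise#1", "raise#4"])
```

With `money#1` as the object, `raise#4` is chosen over `raise#1`. When no candidate appears in the table, the function abstains so that another module can decide.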

Allwords WSD in context

• A very good starting point
  – Funding: MEANING, European research project
  – Resources: MCR (Multilingual Central Repository), including WordNets from different languages, “ontologies” (Domains, SUMO, TopOntology, SemFile) linked to WordNet synsets, selectional preferences, etc.
  – Tools: the Senseval-3 allwords WSD system and all its components
  – People: Lluís Villarejo (PhD student at TALP)
  – ML approach: Inference & Learning with Linear Constraints

Allwords WSD in context

• Potential problems
  – Computational requirements
  – Soft constraints
  – Lots of irrelevant sense pairs
  – Can compatibility constraints be reliably estimated from existing labeled corpora?
  – …
  – We have to codify only the most relevant constraints between pairs of “related” words at a coarse level of granularity (very general semantic class labels)

Allwords WSD in context

• Current status
  – Semantic-class attributes of the context words have already been incorporated as features for capturing “interactions”: gain of 1-2 points (but context words are very ambiguous…)
  – Training/testing the system assuming that we know the actual senses of context words (upper bounds)
• (Near) future
  – Inference on top of classifiers’ output
  – Learning with global feedback (coming from inference)
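“Inference on top of classifiers’ output” can be pictured as choosing the joint sense assignment that maximizes the per-word classifier scores plus soft compatibility scores between sense pairs. The sketch below does this by exhaustive search, which is a stand-in chosen for brevity and not the linear-constraint formulation the slides refer to. All sense labels, scores, and the `weight` parameter are invented for the example.

```python
from itertools import product
from typing import Dict, List, Tuple


def joint_inference(local_scores: List[Dict[str, float]],
                    compat: Dict[Tuple[str, str], float],
                    weight: float = 1.0) -> Tuple[Tuple[str, ...], float]:
    """local_scores: one {sense: classifier score} dict per word.
    compat: soft constraints, {(sense_a, sense_b): bonus (>0) or penalty (<0)}.
    Returns the best joint assignment and its score (exhaustive search)."""

    def pair(a: str, b: str) -> float:
        # Pairs are treated as unordered; unknown pairs are neutral.
        return compat.get((a, b), compat.get((b, a), 0.0))

    best, best_score = (), float("-inf")
    for assignment in product(*(s.keys() for s in local_scores)):
        score = sum(s[a] for s, a in zip(local_scores, assignment))
        score += weight * sum(pair(assignment[i], assignment[j])
                              for i in range(len(assignment))
                              for j in range(i + 1, len(assignment)))
        if score > best_score:
            best, best_score = assignment, score
    return best, best_score


# Invented scores: taken alone, "charge" slightly prefers the electrical sense.
local = [
    {"jury:body_of_citizens": 0.6, "jury:committee": 0.4},
    {"charge:electrical": 0.55, "charge:pleading": 0.45},
]
compat = {
    ("jury:body_of_citizens", "charge:pleading"): 0.5,     # likely together
    ("jury:body_of_citizens", "charge:electrical"): -0.5,  # incompatible
}
best, best_score = joint_inference(local, compat)
```

With these numbers the compatibility terms flip the decision for "charge" to the legal sense. Since the search space is the product of the sense counts, a practical system would restrict the pairs considered or use a dedicated solver.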

Thanks again for your attention!!!