30
LING 581: Advanced Computational Linguistics Lecture Notes April 27th

LING 581: Advanced Computational Linguistics Lecture Notes April 27th

Embed Size (px)

Citation preview

LING 581: Advanced Computational Linguistics

Lecture NotesApril 27th

TCE

• 2nd last class today…

WordNet Homework

If you haven’t already, you should have emailed me this report

QA Homework

• Idea: evaluate the feasibility of QA on the web– using TREC 9 QA examples– programming it up is optional

– see and appreciate why it’s hard to do...

• Steps:– Pick 3 query groups– Simulate (programmatically) QA– use the Collins parser and WordNet to find answers to the queries– submit report (before final class next week)

• Example• Question groupWhat kind of animal was Winnie the Pooh?Winnie the Pooh is what kind of animal? What species was Winnie the Pooh?Winnie the Pooh is an imitation of which animal?What was the species of Winnie the Pooh?

Example

• trees

• reformulate Qs into declarative sentences with missing wh-phrase– ____ (kind of animal) is winnie the pooh– winnie the pooh is ____ (species)– winnie the pooh is an imitation of ____ (animal)– the species of winnie the pooh is ____

Example

• Answers:• Winnie the Pooh is such a

popular character in Poland• Winnie-the-Pooh Is My Co-

worker• Winnie the Pooh is a little,

adorable and cute bear obsessed by honey.

• Winnie-the-Pooh is so fat.• Winnie the Pooh is one of the

things most closest to my heart• Winnie the Pooh is his usual

befuddled self

Example

• Original declarative form:– winnie the pooh is ____

(species) • Check semantic relatedness of

extracted head words using WordNet:– character– co-worker– little– one– self

• Here, look at shortest paths

SummaryHeadword Length #nodes–bear 6 9258–character 6 1072–one 7 734 –co-worker 7 6488–self 7 14456–little 10 28706

Constraints:length < #nodes

Other resources

• XWN: http://xwn.hlt.utdallas.edu/

glosses in Logical Form

XWN

• Applications– The Extended WordNet may be used as a Core Knowledge

Base for applications such as Question Answering, Information Retrieval, Information Extraction, Summarization, Natural Language Generation, Inferences, and other knowledge intensive applications.

– The glosses contain a part of the world knowledge since they define the most common concepts of the English language.

XWN

Example: • Dan Moldovan and Adrian Novischi, Lexical Chains for Question Answering, COLING 2002

COALS

• Take a look at an alternative to WordNet for computing similarity– WordNet: handbuilt system– COALS:

• the correlated occurrence analogue to lexical semantics• (Rohde et al. 2004)• a instance of a vector-based statistical model for similarity

– e.g., see also Latent Semantic Analysis (LSA) – Singular Valued Decomposition (SVD)

» sort by singular values, take top k and reduce the dimensionality of the co-occurrence matrix to rank k

• based on weighted co-occurrence data from large corpora

COALS

• Basic Idea:– compute co-occurrence counts for (open class) words from a large

corpora– corpora:

• Usenet postings over 1 month• 9 million (distinct) articles• 1.2 billion word tokens• 2.1 million word types

– 100,000th word occurred 98 times

– co-occurrence counts• based on a ramped weighting system with window size 4

– excluding closed-class items

4 4

wi

332 2 11

wi-1wi-2wi-3wi-4 wi+4wi+3wi+2wi+1

COALS

• Example:

COALS

• available online• http://dlt4.mit.edu/~dr/COALS/similarity.php

Computing Similarity

Worked Example: zealous• run connectbf/3

– ?- connectbf(impassioned,zealous,X).– X = 10 ?– ?- connectbf(zealous,impassioned,X).– X = 9 ?

• compare to b. ravenous– ?- connectbf(ravenous,zealous,X).– no– ?- connectbf(zealous,ravenous,X).

• shortest link between impassioned and zealous

Old Code: WordNet 1.7.1

Worked Example: zealous

• shortest path between ravenous and zealous

Task: Match each word in the first column with its definition in the second column

accolade

abateaberrant

abscondacumen

abscissionacerbic

accretionabjureabrogate

deviation

abolishkeen insight

lessen in intensitysour or bitter

building updepart secretly

renounceremovalpraise

Task: Match each word in the first column with its definition in the second column

accolade

abateaberrant

abscondacumen

abscissionacerbic

accretionabjureabrogate

deviation

abolishkeen insight

lessen in intensitysour or bitter

building updepart secretly

renounceremovalpraise3

2

3

2

2

2

2

COALS and the GREACCOLADE

-0.05

0

0.05

0.1

0.15

0.2

0.25

0.3

0.35

DEVIATIONINSIGHT ABOLISH LESSEN

SOURDEPART BUILD

RENOUNCEREMOVAL

PRAISE

Correlation

COALS and the GREABERRANT

-0.04

-0.02

0

0.02

0.04

0.06

0.08

0.1

0.12

0.14

DEVIATIONINSIGHT ABOLISH LESSEN

SOURDEPART BUILD

RENOUNCEREMOVAL

PRAISE

Correlation

COALS and the GREABATE

-0.1

-0.05

0

0.05

0.1

0.15

0.2

DEVIATIONINSIGHT ABOLISH LESSEN

SOURDEPART BUILD

RENOUNCEREMOVAL

PRAISE

Correlation

COALS and the GREABSCOND

-0.06

-0.04

-0.02

0

0.02

0.04

0.06

0.08

0.1

DEVIATIONINSIGHT ABOLISH LESSEN

SOURDEPART BUILD

RENOUNCEREMOVAL

PRAISE

Correlation

COALS and the GREACUMEN

-0.05

0

0.05

0.1

0.15

0.2

0.25

DEVIATIONINSIGHT ABOLISH LESSEN

SOURDEPART BUILD

RENOUNCEREMOVAL

PRAISE

Correlation

COALS and the GREACERBIC

-0.1

-0.05

0

0.05

0.1

0.15

DEVIATIONINSIGHT ABOLISH LESSEN

SOURDEPART BUILD

RENOUNCEREMOVAL

PRAISE

Correlation

COALS and the GREACCRETION

-0.05

-0.04

-0.03

-0.02

-0.01

0

0.01

0.02

0.03

0.04

0.05

DEVIATIONINSIGHT ABOLISH LESSEN

SOURDEPART BUILD

RENOUNCEREMOVAL

PRAISECorrelation

COALS and the GREABJURE

-0.1

-0.05

0

0.05

0.1

0.15

0.2

0.25

0.3

0.35

DEVIATIONINSIGHT ABOLISH LESSEN

SOURDEPART BUILD

RENOUNCEREMOVAL

PRAISE

Correlation

COALS and the GREABROGATE

-0.1

-0.05

0

0.05

0.1

0.15

0.2

0.25

0.3

DEVIATIONINSIGHT ABOLISH LESSEN

SOURDEPART BUILD

RENOUNCEREMOVAL

PRAISE

Correlation

Task: Match each word in the first column with its definition in the second column

accolade

abateaberrant

abscondacumen

abscissionacerbic

accretionabjureabrogate

deviation

abolishkeen insight

lessen in intensitysour or bitter

building updepart secretly

renounceremovalpraise

Heuristic: competing words, pick the strongest

accolade

abateaberrant

abscondacumen

abscissionacerbic

accretionabjureabrogate

deviation

abolishkeen insight

lessen in intensitysour or bitter

building updepart secretly

renounceremovalpraise

7 out of 10