Automatic Sense Prediction for Implicit Discourse Relations in Text

Emily Pitler, Annie Louis, Ani Nenkova
University of Pennsylvania

Implicit Discourse Relations


I am in Singapore, but I live in the United States.
◦ Explicit Comparison

The main conference is over Wednesday. I am staying for EMNLP.
◦ Implicit Comparison

Implicit discourse relations are hard


I am here because I have a presentation to give at ACL.
◦ Explicit Contingency

I am a little tired; there is a 13-hour time difference.
◦ Implicit Contingency


Focus on implicit discourse relations
◦ in a realistic distribution
Better understanding of lexical features
◦ Showed they do not capture semantic oppositions
Empirical validation of new and old features
◦ Polarity, verb classes, context, and some lexical features indicate discourse relations

First experiments on implicits


Classify both implicits and explicits
◦ Same sentence [Soricut and Marcu, 2003]
◦ GraphBank corpus: doesn't distinguish implicit and explicit [Wellner et al., 2006]
Create artificial implicits by deleting the connective
◦ I am in Singapore, but I live in the United States.
◦ [Marcu and Echihabi, 2001; Blair-Goldensohn et al., 2007; Sporleder and Lascarides, 2008]

Related work on relation sense


Word Pairs Investigation


Most basic feature for implicits

I_there, I_is, …, tired_time, tired_difference

Word pairs as features
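As a rough illustration, here is a minimal Python sketch of this cross-product featurization (function and variable names are ours, not the paper's; tokenization is naive whitespace splitting):

```python
from itertools import product

def word_pair_features(arg1: str, arg2: str) -> set:
    """Cross product of Arg1 and Arg2 tokens, encoded as w1_w2 features."""
    tokens1 = arg1.lower().split()
    tokens2 = arg2.lower().split()
    return {f"{w1}_{w2}" for w1, w2 in product(tokens1, tokens2)}

# The example from the slides:
print(sorted(word_pair_features(
    "I am a little tired",
    "there is a 13 hour time difference",
)))  # ['a_13', ..., 'i_there', 'i_is', ..., 'tired_difference', ...]
```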


[Figure: every word of Arg1 "I am a little tired" is paired with every word of Arg2 "there is a 13-hour time difference".]

Marcu and Echihabi, 2001

The recent explosion of country funds mirrors the "closed-end fund mania" of the 1920s, Mr. Foot says, when narrowly focused funds grew wildly popular.

They fell into oblivion after the 1929 crash.

Intuition: with large amounts of data, we will find semantically related pairs.


Using just content words reduces performance (but has a steeper learning curve)
◦ Marcu and Echihabi, 2001
Nouns and adjectives don't help at all
◦ Lapata and Lascarides, 2004
Filtering out stopwords lowers results
◦ Blair-Goldensohn et al., 2007

Meta error analysis of prior work


Synthetic implicits: Cause/Contrast/None sentences
◦ Explicit instances from Gigaword with the connective deleted
◦ "Because" → Cause, "But" → Contrast
◦ At least 3 sentences apart → None
◦ Blair-Goldensohn et al., 2007
Random selection
◦ 5,000 Cause
◦ 5,000 Other
Computed information gain of word pairs

Word pairs experiments
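A minimal sketch of scoring a binary word-pair feature by information gain (this is our formulation; the slides do not spell out the exact computation):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label list, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_present, labels):
    """IG(class; feature) = H(class) - H(class | feature present/absent)."""
    gain = entropy(labels)
    for value in (True, False):
        subset = [y for x, y in zip(feature_present, labels) if x == value]
        if subset:
            gain -= len(subset) / len(labels) * entropy(subset)
    return gain

# Toy example: does some word pair separate Cause from Other?
labels = ["Cause", "Cause", "Other", "Other"]
present = [True, True, False, False]
print(information_gain(present, labels))  # 1.0: perfectly informative
```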


The government says it has reached most isolated townships by now, but because roads are blocked, getting anything but basic food supplies to people remains difficult.

"but" → Comparison
"because" → Contingency

“but” signals “Not-Comparison” in synthetic data


Maybe even with lots and lots of data, we won’t see “popular…but…oblivion” that often

What are we trying to get at?

Popular ↔ Oblivion
Desirable ↔ Abhorrent
Mollify ↔ Enrage

Sentiment orientation relieves lexical sparsity


Features for sense prediction


Multi-perspective Question Answering (MPQA) Opinion Corpus
◦ Wilson et al., 2005
Sentiment words annotated as
◦ Positive
◦ Negative
◦ Both
◦ Neutral

Resource for Polarity Tags


Similar to word pairs, but with words replaced by their polarity tags

Arg1: Executives at Time Inc. Magazine Co., a subsidiary of Time Warner, have said the joint venture with Mr. Lang wasn’t a good one.

Arg2: The venture, formed in 1986, was supposed to be Time’s low-cost, safe entry into women’s magazines.

Arg1: NegatePositive; Arg2: Positive → feature Arg1NegatePositive_Arg2Positive

Polarity Tag pairs
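A rough sketch of polarity-tag pair extraction, with a toy dictionary standing in for the MPQA lexicon and a deliberately simplistic negation rule (the paper's actual negation handling may differ):

```python
# Toy stand-in for the MPQA subjectivity lexicon (the real resource is far larger).
POLARITY = {"good": "Positive", "safe": "Positive", "bad": "Negative"}
NEGATIONS = {"not", "n't", "never", "no"}

def polarity_tags(tokens):
    """Map sentiment words to polarity tags, prefixing 'Negate' after a negation."""
    tags, negated = [], False
    for tok in tokens:
        if tok.lower() in NEGATIONS:
            negated = True
            continue
        tag = POLARITY.get(tok.lower())
        if tag:
            tags.append(("Negate" if negated else "") + tag)
            negated = False
    return tags

def polarity_pair_features(arg1, arg2):
    """Cross product of the two arguments' polarity tags."""
    return {f"Arg1{t1}_Arg2{t2}"
            for t1 in polarity_tags(arg1) for t2 in polarity_tags(arg2)}

arg1 = "the joint venture was n't a good one".split()
arg2 = "supposed to be Time 's low-cost , safe entry".split()
print(polarity_pair_features(arg1, arg2))  # {'Arg1NegatePositive_Arg2Positive'}
```

The Inquirer Tag pairs on the next slide follow the same cross-product pattern, with General Inquirer semantic categories in place of polarity tags and only verbs considered.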


General Inquirer lexicon
◦ Stone et al., 1966
◦ Semantic categories of words
Complementary classes
◦ "Understatement" vs. "Overstatement"
◦ "Rise" vs. "Fall"
◦ "Pleasure" vs. "Pain"
Features: tag pairs, verbs only

Inquirer Tags


Newsweek's circulation for the first six months of 1989 was 3,288,453, flat from the same period last year

U.S. News' circulation in the same time was 2,303,328, down 2.6%

Probably WSJ-specific

Money/Percent/Num
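A hypothetical sketch of counting money, percent, and number tokens per argument; the exact token definitions are not given on the slides, so the regexes below are assumptions:

```python
import re

# Assumed token patterns; the slides do not define them precisely.
MONEY_RE   = re.compile(r"^\$\d[\d,.]*$")
PERCENT_RE = re.compile(r"^(\d[\d,.]*%|%|percent)$", re.IGNORECASE)
NUMBER_RE  = re.compile(r"^\d[\d,.]*$")

def money_percent_num_features(tokens):
    """Count money, percent, and plain-number tokens in one argument."""
    counts = {"money": 0, "percent": 0, "number": 0}
    for tok in tokens:
        if MONEY_RE.match(tok):
            counts["money"] += 1
        elif PERCENT_RE.match(tok):
            counts["percent"] += 1
        elif NUMBER_RE.match(tok):
            counts["number"] += 1
    return counts

print(money_percent_num_features("circulation was 2,303,328 , down 2.6 %".split()))
# {'money': 0, 'percent': 1, 'number': 2}
```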


Levin verb class level in the LCS database
◦ Levin, 1993; Dorr, 2001
◦ More related verbs ~ Expansion
Average length of verb chunk
◦ They [are allowed to proceed] ~ Contingency
◦ They [proceed] ~ Expansion, Temporal
POS tags of the main verb
◦ Same tense ~ Expansion
◦ Different tense ~ Contingency, Temporal

Verbs
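An illustrative sketch of the three verb features, with a toy dictionary standing in for the Levin/LCS verb-class database (class IDs below are made up for the example):

```python
# Toy stand-in for the LCS database's verb -> Levin-class mapping.
LEVIN_CLASS = {"proceed": {"51.1"}, "go": {"51.1"}, "say": {"37.7"}}

def shared_levin_classes(verbs1, verbs2):
    """Count verb pairs across the two arguments that share a Levin class."""
    return sum(1 for v1 in verbs1 for v2 in verbs2
               if LEVIN_CLASS.get(v1, set()) & LEVIN_CLASS.get(v2, set()))

def avg_verb_chunk_length(chunks):
    """Average words per verb chunk, e.g. [are, allowed, to, proceed] -> 4."""
    return sum(len(c) for c in chunks) / len(chunks)

def same_tense(main_verb_pos1, main_verb_pos2):
    """Compare POS tags of the two main verbs (e.g., VBD vs. VBZ)."""
    return main_verb_pos1 == main_verb_pos2

print(shared_levin_classes(["proceed"], ["go"]))                      # 1
print(avg_verb_chunk_length([["are", "allowed", "to", "proceed"]]))  # 4.0
print(same_tense("VBD", "VBZ"))                                      # False
```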


Prior work found first and last words very helpful in predicting sense
◦ Wellner et al., 2006
◦ Often explicit connectives

First-Last, First3
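A minimal sketch of these positional features (the feature names are illustrative; the paper may encode them differently):

```python
def first_last_features(arg1, arg2):
    """First word, last word, and first three words of each argument."""
    return {
        "arg1_first": arg1[0],
        "arg1_last": arg1[-1],
        "arg2_first": arg2[0],
        "arg2_last": arg2[-1],
        "arg1_first3": " ".join(arg1[:3]),
        "arg2_first3": " ".join(arg2[:3]),
    }

print(first_last_features(
    "I am a little tired".split(),
    "there is a 13 hour time difference".split(),
))
```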


Was preceding/following relation explicit?

◦ If so, which sense?

◦ If so, which connective?

Does Arg1 begin a paragraph?

Context
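A sketch of how the context features might be assembled, assuming each neighboring relation is available as an (is_explicit, sense, connective) tuple (this representation is our assumption):

```python
def context_features(prev_rel, next_rel, arg1_starts_paragraph):
    """Features from neighboring relations; prev_rel/next_rel are
    (is_explicit, sense, connective) tuples, or None at document edges."""
    feats = {"arg1_starts_paragraph": arg1_starts_paragraph}
    for name, rel in (("prev", prev_rel), ("next", next_rel)):
        if rel is None:
            continue
        is_explicit, sense, connective = rel
        feats[f"{name}_explicit"] = is_explicit
        if is_explicit:
            feats[f"{name}_sense"] = sense
            feats[f"{name}_connective"] = connective
    return feats

print(context_features((True, "Comparison", "but"), None, True))
```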


Largest available annotated corpus of discourse relations
◦ Penn Treebank WSJ articles
◦ 16,224 implicit relations between adjacent sentences

I am a little tired; [because] there is a 13-hour time difference.
◦ Contingency.Cause.Reason

Penn Discourse Treebank


Relation sense   Proportion of implicits
Expansion        53%
Contingency      26%
Comparison       15%
Temporal          6%

Top level senses in PDTB


Developed features on sections 0-1
Trained on sections 2-20
Tested on sections 21-22
Binary classification task for each sense
◦ Trained on equal numbers of positive and negative examples
◦ Tested on the natural distribution
Naïve Bayes classifier

Classification Experiments on PDTB Implicits
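A sketch of this training protocol: downsample the training data to balanced classes, fit a Naïve Bayes classifier over binary features, then evaluate by f-score on the natural test distribution. scikit-learn is used purely for illustration; the slides do not name the toolkit actually used:

```python
import random
from sklearn.feature_extraction import DictVectorizer
from sklearn.metrics import f1_score
from sklearn.naive_bayes import BernoulliNB

def downsample_balanced(X, y, seed=0):
    """Keep equal numbers of positive and negative training examples."""
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    rng = random.Random(seed)
    n = min(len(pos), len(neg))
    keep = rng.sample(pos, n) + rng.sample(neg, n)
    return [X[i] for i in keep], [y[i] for i in keep]

# Toy data: each instance is a dict of binary features (e.g., word pairs).
X = [{"i_there": 1}, {"good_safe": 1}, {"i_there": 1}, {"a_b": 1}, {"c_d": 1}]
y = [1, 0, 1, 0, 0]

X_bal, y_bal = downsample_balanced(X, y)
vec = DictVectorizer()
clf = BernoulliNB().fit(vec.fit_transform(X_bal), y_bal)

X_test, y_test = X, y  # in reality: held-out sections 21-22
pred = clf.predict(vec.transform(X_test))
print(f1_score(y_test, pred))
```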


Results


Motivation in prior work
◦ Train on synthetic implicits
What works better
◦ Train on actual implicits
Synthetic examples can still help!
◦ With only the best features selected from synthetic implicits

Word-pair f-scores:

Training setup                        Comp.   Cont.
Synthetic implicits                   17.13   31.10
Actual implicits                      20.96   43.79
Actual + synthetic-selected features  21.96   45.60

Results: Word pairs for comparison and contingency

Features             f-score
First-Last, First3   21.01
Context              19.32
Money/Percent/Num    19.04
Random                9.91

Results: Comparison


Polarity is actually the worst feature: 16.63

Instances with Positive-Negative or Negative-Positive pairs:

Comparison       30%
Not Comparison   31%

Distribution of Opposite Polarity Pairs


Features             f-score
First-Last, First3   36.75
Verbs                36.59
Context              29.55
Random               19.11

Results: Contingency


Features        f-score
Polarity Tags   71.29
Inquirer Tags   70.21
Context         67.77
Random          64.74

Results: Expansion


• Expansion is the majority class
• Precision is more problematic than recall
• These features all help other senses

Features             f-score
First-Last, First3   15.93
Verbs                12.61
Context              12.34
Random                5.38

Results: Temporal


Temporals often end with words like “Monday” or “yesterday”

Comparison
◦ Selected word pairs
Contingency
◦ Polarity, Verbs, First/Last, Modality, Context, Selected word pairs


Expansion
◦ Polarity, Inquirer Tags, Context
Temporal
◦ First/Last + word pairs

Best feature sets


Sense         Best f-score (baseline)
Comparison    21.96 (17.13)
Contingency   47.13 (31.10)
Expansion     76.41 (63.84)
Temporal      16.76 (16.21)

Best Results: f-scores


Comparison/Contingency baseline: word pairs trained on synthetic implicits
Expansion/Temporal baseline: word pairs trained on real implicits

Results above are from classifying each relation independently
◦ Naïve Bayes, MaxEnt, AdaBoost
Since context features were helpful, tried a CRF
6-way classification, word pairs as features
◦ Naïve Bayes accuracy: 43.27%
◦ CRF accuracy: 44.58%

Further experiments using context
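A sketch of recasting the task as sequence labeling over the adjacent relations in a document, using sklearn-crfsuite purely as an illustrative implementation (the slides do not name the CRF package used):

```python
import sklearn_crfsuite

# Each document is a sequence of adjacent-sentence-pair instances; each
# instance is a dict of (e.g., word-pair) features, and each label is a sense.
docs_X = [
    [{"i_there": 1.0}, {"good_safe": 1.0}],
    [{"a_b": 1.0}],
]
docs_y = [
    ["Contingency", "Comparison"],
    ["Expansion"],
]

# A linear-chain CRF lets the predicted sense of one relation inform
# its neighbors, which is the motivation for using context here.
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=50)
crf.fit(docs_X, docs_y)
print(crf.predict(docs_X))
```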


Focus on implicit discourse relations
◦ in a realistic distribution
Better understanding of word pairs
◦ Showed they do not capture semantic oppositions
Empirical validation of new and old features
◦ Polarity, verb classes, context, and some lexical features indicate discourse relations

Conclusion
