Toward Dependency Path based Entailment Rodney Nielsen, Wayne Ward, and James Martin


Page 1: Toward Dependency Path based Entailment

Toward Dependency Path based Entailment

Rodney Nielsen, Wayne Ward, and James Martin

Page 2: Toward Dependency Path based Entailment

Dependency Path-based Entailment

DIRT (Lin and Pantel, 2001)

  • Unsupervised method to discover inference rules
      “X is author of Y ≈ X wrote Y”
      “X solved Y ≈ X found a solution to Y”

  • If two dependency paths tend to link the same sets of words, they hypothesize that their meanings are similar
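The idea that paths linking the same sets of words are similar can be sketched as follows. This is a simplified illustration, not DIRT itself: it scores slot overlap with plain Jaccard similarity, whereas DIRT weights slot fillers by mutual information. The function names and the toy triples are hypothetical.

```python
from collections import defaultdict

def slot_fillers(triples):
    """Map each path to the sets of words seen in its X and Y slots."""
    fillers = defaultdict(lambda: {"X": set(), "Y": set()})
    for x, path, y in triples:
        fillers[path]["X"].add(x)
        fillers[path]["Y"].add(y)
    return fillers

def jaccard(a, b):
    """Plain set overlap; a stand-in for DIRT's mutual-information weighting."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def path_similarity(fillers, p1, p2):
    """Geometric mean of the X-slot and Y-slot overlaps of two paths."""
    sim_x = jaccard(fillers[p1]["X"], fillers[p2]["X"])
    sim_y = jaccard(fillers[p1]["Y"], fillers[p2]["Y"])
    return (sim_x * sim_y) ** 0.5

# Toy (X, path, Y) triples, invented for illustration only.
triples = [
    ("Tolstoy", "X is author of Y", "War and Peace"),
    ("Tolstoy", "X wrote Y", "War and Peace"),
    ("Austen", "X is author of Y", "Emma"),
    ("Austen", "X wrote Y", "Emma"),
    ("Euler", "X solved Y", "the problem"),
]
fillers = slot_fillers(triples)
same = path_similarity(fillers, "X is author of Y", "X wrote Y")
different = path_similarity(fillers, "X wrote Y", "X solved Y")
```

Here the two authorship paths share all slot fillers and score highest, while the "wrote" and "solved" paths share none.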

Page 3: Toward Dependency Path based Entailment

ML Classification Approach

Features derived from corpus statistics

  • Unigram co-occurrence
  • Surface form bigram co-occurrence
  • Dependency-derived bigram co-occurrence

Mixture of experts: about 18 ML classifiers from the Weka toolkit

  • Classify by majority vote or average probability

[Figure: Bag of Words, Graph Matching, Dependency Path Based Entailment]
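The two ways of combining the experts, majority vote and average probability, can be sketched as below. This is a minimal illustration in Python rather than Weka; the 0.5 threshold, the tie handling (a tie counts as "not entailed"), and the five example probabilities are assumptions, not values from the slides.

```python
def majority_vote(probabilities, threshold=0.5):
    """True if more than half of the experts vote for entailment.

    Each expert votes 'entailed' when its probability reaches the threshold.
    Ties are treated as 'not entailed' (an assumption of this sketch).
    """
    votes = sum(1 for p in probabilities if p >= threshold)
    return votes * 2 > len(probabilities)

def average_probability(probabilities, threshold=0.5):
    """True if the mean probability over all experts reaches the threshold."""
    return sum(probabilities) / len(probabilities) >= threshold

# Hypothetical outputs of five experts for one text-hypothesis pair.
expert_probs = [0.9, 0.8, 0.45, 0.4, 0.4]
by_vote = majority_vote(expert_probs)            # two of five vote 'entailed'
by_average = average_probability(expert_probs)   # mean is about 0.59
```

The example is chosen so that the two rules disagree: a few confident experts can carry the average even when they are outvoted.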

Page 4: Toward Dependency Path based Entailment

Corpora

7.4M articles, 2.5B words, 347 words/doc

  • Gigaword (Graff, 2003): 77% of documents
  • Reuters Corpus (Lewis et al., 2004)
  • TIPSTER

Lucene IR engine

  • Two indices
      Word surface form
      Porter stem filter
  • Stop words = {a, an, the}
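A rough idea of how an index yields the co-occurrence counts used later can be sketched with a small in-memory inverted index. This is not Lucene: it is a stand-in that covers only the surface-form index (the stem index would apply a Porter stemmer to each token first), and counting co-occurrence at the document level is an assumption. The function names and the three toy documents are hypothetical.

```python
from collections import defaultdict

STOP_WORDS = {"a", "an", "the"}  # the stop list given on the slide

def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]

def build_index(docs):
    """Inverted index: token -> set of ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for token in tokenize(text):
            index[token].add(doc_id)
    return index

def doc_count(index, term):
    """Number of documents containing the term."""
    return len(index.get(term, set()))

def cooccurrence_count(index, w, v):
    """Number of documents containing both w and v."""
    return len(index.get(w, set()) & index.get(v, set()))

# Toy documents, invented for illustration only.
docs = [
    "The cost of paper is rising",
    "Paper costs choke newspapers",
    "A newspaper reported the rising cost",
]
index = build_index(docs)
n_cost_rising = cooccurrence_count(index, "cost", "rising")
n_paper = doc_count(index, "paper")
```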

Page 5: Toward Dependency Path based Entailment

Core Features

Core Repeated Features

Product of MLEs

Average of MLEs

Geometric Mean of MLEs

Worst Non-Zero MLE

Entailing Ngrams for the Lowest Non-Zero MLE

Largest Entailing Ngram Count with a Zero MLE

Smallest Entailing Ngram Count with a Non-Zero MLE

Count of Ngrams in h that do not Co-occur with any Ngrams from t

Count of Ngrams in h that do Co-occur with Ngrams in t
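Several of the features above are simple aggregates over the per-ngram MLE values of a hypothesis. A minimal sketch follows, assuming one MLE per hypothesis ngram and treating a zero MLE as "does not co-occur with any ngram from t". The dictionary keys and the sample values are hypothetical, and the features that depend on raw ngram counts are not covered.

```python
import math

def core_repeated_features(mles):
    """Aggregate a non-empty list of per-ngram MLE values into summary features."""
    nonzero = [m for m in mles if m > 0]
    product = math.prod(mles)
    return {
        "product": product,
        "average": sum(mles) / len(mles),
        "geometric_mean": product ** (1 / len(mles)),
        # Lowest MLE among the ngrams that co-occur with t at all.
        "worst_nonzero": min(nonzero) if nonzero else 0.0,
        "count_not_cooccurring": len(mles) - len(nonzero),
        "count_cooccurring": len(nonzero),
    }

# Hypothetical MLE values for four hypothesis ngrams.
features = core_repeated_features([0.5, 0.25, 0.0, 1.0])
```

A single zero drives the product and the geometric mean to zero, which is one reason to also keep the average and the worst non-zero value as separate features.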

Page 6: Toward Dependency Path based Entailment

Dependency Features

Dependency bigram features

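[Equation: definition of the dependency bigram statistic P(w_p, w_c, t) as a maximum over word pairs (v_p, v_c) in t; the full expression is not recoverable from the extracted slide text]

[Figure: dependency trees for Hypothesis h (The cost of paper is rising) and Text t (Newspapers choke on rising paper costs and falling revenues)]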

Page 7: Toward Dependency Path based Entailment

Dependency Features

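[Equation: definition of the descendent relation statistic P(s_w, t), built recursively from the terms P(w, w_c, t) and P(s_{w_c}, t); the full expression is not recoverable from the extracted slide text]

[Figure: dependency trees for Hypothesis h (The cost of paper is rising) and Text t (Newspapers choke on rising paper costs and falling revenues)]

Descendent relation statistics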

Page 8: Toward Dependency Path based Entailment

Dependency Features

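[Equation: worked example for P(s_of, t), computed from P(w_of, w_paper, t); the full expression is not recoverable from the extracted slide text]

[Figure: dependency trees for Hypothesis h (The cost of paper is rising) and Text t (Newspapers choke on rising paper costs and falling revenues)]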

Descendent relation statistics

Page 9: Toward Dependency Path based Entailment

Dependency Features

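[Equation: worked example for P(s_cost, t), computed from P(w_cost, w_the, t), P(w_cost, w_of, t), and P(s_of, t); the full expression is not recoverable from the extracted slide text]

[Figure: dependency trees for Hypothesis h (The cost of paper is rising) and Text t (Newspapers choke on rising paper costs and falling revenues)]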

Descendent relation statistics

Page 10: Toward Dependency Path based Entailment

Dependency Features

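[Equation: worked example for P(s_rising, t), computed from P(w_rising, w_cost, t), P(s_cost, t), and P(w_rising, w_is, t); the full expression is not recoverable from the extracted slide text]

[Figure: dependency trees for Hypothesis h (The cost of paper is rising) and Text t (Newspapers choke on rising paper costs and falling revenues)]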

Descendent relation statistics

Page 11: Toward Dependency Path based Entailment

Verb Dependency Features

[Figure: dependency trees for Hypothesis h (The cost of paper is rising) and Text t (Newspapers choke on rising paper costs and falling revenues)]

Combined verb descendent relation features

Worst verb descendent relation features

Page 12: Toward Dependency Path based Entailment

Subject Dependency Features

Combined and worst subject descendent relations

Combined and worst subject-to-verb paths

[Figure: dependency trees for Hypothesis h (The cost of paper is rising) and Text t (Newspapers choke on rising paper costs and falling revenues)]

Page 13: Toward Dependency Path based Entailment

Other Dependency Features

Repeat these same features for:

  • Object
  • pcomp-n
  • Other descendent relations

Page 14: Toward Dependency Path based Entailment

Results

RTE2 by Task: IE IR QA SUM Overall

Accuracy 55.5 64.0 55.0 70.0 61.1

Average Precision 49.4 73.0 57.3 80.7 65.2

RTE2 Accuracy SUM NonSUM Overall

Test Set 70.0 58.2 61.1

Training Set CV 84.5 62.7 68.1
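The RTE2 test-set figures above are mutually consistent if the four tasks are weighted equally. The short check below makes that arithmetic explicit; equal task sizes are an assumption of this sketch, and the variable names are hypothetical.

```python
# Per-task RTE2 test accuracies as reported in the table.
task_accuracy = {"IE": 55.5, "IR": 64.0, "QA": 55.0, "SUM": 70.0}

# Overall accuracy as an unweighted mean (assumes equally sized tasks).
overall = sum(task_accuracy.values()) / len(task_accuracy)

# NonSUM accuracy: mean over every task except summarization.
non_sum = [acc for task, acc in task_accuracy.items() if task != "SUM"]
non_sum_accuracy = sum(non_sum) / len(non_sum)

print(round(overall, 1), round(non_sum_accuracy, 1))
```

Under that assumption the means round to the reported Overall (61.1) and NonSUM (58.2) values.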

RTE1 Accuracy CD NonCD Overall

Test Set 83.3 56.8 61.8

(Best submission) (83.3) (52.8) (58.6)

Training Set CV 83.7 56.9 61.6

Page 15: Toward Dependency Path based Entailment

Feature Analysis

All feature sets are contributing according to cross validation on the training set

Most significant feature set: Unigram stem based word alignment

Most significant core repeated feature: Average MLE

Page 16: Toward Dependency Path based Entailment

Questions

  • Mixture of experts classifier using corpus co-occurrence statistics
  • Moving in the direction of DIRT
  • Domain of Interest: Student response analysis in intelligent tutoring systems


Page 17: Toward Dependency Path based Entailment

Why Entailment

Intelligent Tutoring Systems

Student Interaction Analysis

  • Are all aspects of the student’s answer entailed by the text and the gold standard answer?

  • Are all aspects of the desired answer entailed by the student’s response?

Page 18: Toward Dependency Path based Entailment

Word Alignment Features

Unigram word alignment

P(Tr_h = 1 \mid t) = \prod_{w \in h} \max_{v \in t} P(Tr_w = 1 \mid v)

P(Tr_w = 1 \mid v) \approx \frac{n_{w,v}}{n_v}

\mathrm{MLE}(w, t) = \max_{v \in t} \frac{n_{w,v}}{n_v}
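A minimal sketch of the unigram alignment score follows. It assumes MLE(w, t) is the maximum over words v in t of n_{w,v} / n_v, the same count-ratio form as the bigram example MLE(cost, t) = n_{cost of, costs of} / n_{costs of}. The function names and all counts below are hypothetical.

```python
def mle(w, text_words, cooc, count):
    """Best co-occurrence ratio n_{w,v} / n_v over the words v of the text.

    cooc maps (w, v) pairs to co-occurrence counts; count maps v to n_v.
    """
    best = 0.0
    for v in text_words:
        n_v = count.get(v, 0)
        if n_v:
            best = max(best, cooc.get((w, v), 0) / n_v)
    return best

def product_of_mles(hyp_words, text_words, cooc, count):
    """Multiply the per-word MLEs over all hypothesis words."""
    p = 1.0
    for w in hyp_words:
        p *= mle(w, text_words, cooc, count)
    return p

# Invented counts for illustration only.
count = {"costs": 100, "rising": 50}
cooc = {
    ("cost", "costs"): 40, ("cost", "rising"): 10,
    ("rising", "costs"): 20, ("rising", "rising"): 50,
}
text_words = ["costs", "rising"]
mle_cost = mle("cost", text_words, cooc, count)
score = product_of_mles(["cost", "rising"], text_words, cooc, count)
```

Each hypothesis word is aligned to whichever text word gives it the highest ratio, so "cost" aligns with "costs" and "rising" aligns with itself.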

Page 19: Toward Dependency Path based Entailment

Word Alignment Features

Bigram word alignment

[Equation: bigram MLE, a maximum over count ratios n/n indexed 1 to 4 and built from hypothesis words w_i and text words v_j; the full expression is not recoverable from the extracted slide text]

Example:

<t>Newspapers choke on rising paper costs and falling revenue.</t>
<h>The cost of paper is rising.</h>

MLE(cost, t) = n_{cost of, costs of} / n_{costs of} = 6086 / 35800 = 0.17
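The arithmetic in the bigram example, MLE(cost, t) = 6086 / 35800, can be checked directly. The two counts come from the slide; the function and variable names are hypothetical.

```python
def count_ratio(n_joint, n_text):
    """Ratio of the joint co-occurrence count to the text-side count."""
    return n_joint / n_text if n_text else 0.0

n_joint = 6086         # n_{cost of, costs of}
n_text_bigram = 35800  # n_{costs of}
mle_value = count_ratio(n_joint, n_text_bigram)
print(round(mle_value, 2))
```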

Page 20: Toward Dependency Path based Entailment

Word Alignment Features

Average unigram and bigram

Stem-based tokens

Page 21: Toward Dependency Path based Entailment

Corpora

7.4M articles/docs & 2.5B words, 347 words/doc

  • Gigaword (Graff, 2003): 5.7M articles, 2.1B words, 375 words/article; 77% of documents and 83% of indexed words
  • Reuters Corpus (Lewis et al., 2004): 0.8M articles, 0.17B words, 213 words/article
  • TIPSTER: 0.9M articles, 0.26B words, 291 words/article
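The Gigaword shares quoted above follow from the per-corpus totals, as the short check below shows. It uses the rounded figures from the slide, so recomputed words-per-article values only approximate the reported ones; the variable names are hypothetical.

```python
# (articles, words) per corpus, using the rounded figures from the slide.
corpora = {
    "Gigaword": (5.7e6, 2.1e9),
    "Reuters": (0.8e6, 0.17e9),
    "TIPSTER": (0.9e6, 0.26e9),
}

total_docs = sum(docs for docs, _ in corpora.values())
total_words = sum(words for _, words in corpora.values())

giga_docs, giga_words = corpora["Gigaword"]
giga_doc_share = giga_docs / total_docs      # share of documents
giga_word_share = giga_words / total_words   # share of indexed words

print(round(giga_doc_share * 100), round(giga_word_share * 100))
```

The three corpora sum to about 7.4M articles and about 2.5B words, with Gigaword contributing roughly 77% of the documents and 83% of the words.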