
NSF-ULA Sense tagging and Eventive Nouns

Martha Palmer, Miriam Eckert, Jena D. Hwang, Susan Windisch Brown, Dmitriy Dligach, Jinho Choi, Nianwen Xue

University of Colorado
Departments of Linguistics and Computer Science
Institute for Cognitive Science

OntoNotes

Participating Sites: BBN, University of Colorado, University of Pennsylvania, USC/ISI

http://www.bbn.com/NLP/OntoNotes

OntoNotes goals

Develop a skeletal representation of the literal meaning of sentences
– Add to a frame-based (PropBank) representation of predicates and their arguments:
  • Referring expressions and the textual phrases they refer to
  • Terms disambiguated by coarse-grained word sense in an ontology
– Encodes the core, skeletal meaning
– Moves away from strings to terms that a reasoning system can use

Find a "sweet spot" in the space of
– Inter-tagger agreement
– Productivity
– Depth of representation

[Diagram: text annotated with Treebank, PropBank, word sense with respect to an ontology, and co-reference layers, combined into OntoNotes Annotated Text]

Creating a Sense Inventory that Supports High-Quality Annotation

A large-scale annotation effort as part of the OntoNotes project

Two steps:
– Grouping subtle, fine-grained WordNet senses into coherent semantic sense groups based on syntactic and semantic criteria (see the sketch after this list)
  For example, WordNet Sense 1 ("I called my son David") and WordNet Sense 12 ("You can call me Sir") are grouped together.
– Annotation
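As a rough illustration of the grouping step, the sketch below (Python, assuming NLTK and its WordNet data are installed) lists the fine-grained WordNet verb senses of "call" and applies a hand-built mapping from sense numbers to a coarse group, mirroring the Sense 1 / Sense 12 example above. The group label and the mapping are illustrative assumptions, not the actual OntoNotes grouping.

```python
# Illustrative only: list fine-grained WordNet senses of "call" and collapse
# some of them into a coarse group, as in the Sense 1 / Sense 12 example.
# Assumes NLTK and its WordNet data are available (nltk.download("wordnet")).
from nltk.corpus import wordnet as wn

# Enumerate the fine-grained WordNet verb senses of "call".
for i, synset in enumerate(wn.synsets("call", pos=wn.VERB), start=1):
    print(f"Sense {i:2d}: {synset.name():<20} {synset.definition()}")

# A hand-built (hypothetical) coarse grouping over WordNet sense numbers:
# Senses 1 and 12 ("I called my son David" / "You can call me Sir")
# fall into one "name or label someone" group.
coarse_groups = {
    "call: name/label someone": {1, 12},
    # ... further groups would cover the remaining senses ...
}

def coarse_sense(wn_sense_number: int) -> str:
    """Map a fine-grained WordNet sense number to its coarse group, if any."""
    for group, members in coarse_groups.items():
        if wn_sense_number in members:
            return group
    return "ungrouped"

print(coarse_sense(12))  # -> "call: name/label someone"
```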

Example Grouping: Order

Group 1: "give a command"
  Syntactic frame: NP1[+human] ORDER NP2[+animate] to V, where NP1 has some authority over NP2
  Example: The victim says that the owner ordered the dogs to attack.

Group 2: "request something to be made, supplied, or delivered"
  Syntactic frame: NP1[+human] ORDER NP2
  Example: I just ordered pizza from Panhandle Pizza.

Group 3: "organize"
  Example: I ordered the papers before the meeting.
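Purely as an illustration, a grouped inventory entry like the one above could be stored along these lines; the field names and structure are assumptions, not the actual OntoNotes sense-inventory format.

```python
# Hypothetical in-memory representation of one grouped sense inventory entry.
# Field names and layout are illustrative, not the OntoNotes file format.
from dataclasses import dataclass, field

@dataclass
class SenseGroup:
    group_id: int
    gloss: str                                   # e.g. "give a command"
    frames: list = field(default_factory=list)   # syntactic frames/criteria
    examples: list = field(default_factory=list)

order_inventory = [
    SenseGroup(1, "give a command",
               frames=["NP1[+human] ORDER NP2[+animate] to V, NP1 has authority over NP2"],
               examples=["The victim says that the owner ordered the dogs to attack."]),
    SenseGroup(2, "request something to be made, supplied, or delivered",
               frames=["NP1[+human] ORDER NP2"],
               examples=["I just ordered pizza from Panhandle Pizza."]),
    SenseGroup(3, "organize",
               examples=["I ordered the papers before the meeting."]),
]
```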

Annotation Process

Verb sense groups are created based on WordNet senses and on-line resources (VerbNet, PropBank, FrameNet, dictionaries).

Newly created verb sense groups are subject to sample annotation.

Verbs with ITA higher than 90% (or 85% after regrouping) go to actual annotation; a sketch of this gating logic follows below.

Verbs with less than 90% ITA are regrouped and sent back into sample-annotation tasks.

Regroupings and re-sample annotations are not done by the original grouper and taggers.

Verbs that complete actual annotation are adjudicated.
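A minimal sketch of that gating logic, assuming ITA is measured as simple observed agreement between two taggers (the project's actual agreement measure and tooling are not specified on these slides):

```python
# Minimal sketch of the sample-annotation gate.
# ITA here is plain observed agreement between two taggers; the actual
# OntoNotes measure and workflow tooling may differ.

def ita(tags_a, tags_b):
    """Fraction of instances on which two taggers chose the same sense."""
    assert len(tags_a) == len(tags_b) and tags_a
    return sum(a == b for a, b in zip(tags_a, tags_b)) / len(tags_a)

def next_step(tags_a, tags_b, already_regrouped=False):
    """Send a verb to actual annotation or back to regrouping based on ITA."""
    threshold = 0.85 if already_regrouped else 0.90
    agreement = ita(tags_a, tags_b)
    if agreement >= threshold:
        return f"actual annotation (ITA={agreement:.2f})"
    return f"regroup and re-run sample annotation (ITA={agreement:.2f})"

# Example: two taggers label ten sample instances of a verb.
tagger1 = ["g1", "g1", "g2", "g1", "g3", "g2", "g1", "g2", "g1", "g1"]
tagger2 = ["g1", "g1", "g2", "g1", "g3", "g2", "g1", "g1", "g1", "g1"]
print(next_step(tagger1, tagger2))  # 9/10 agreement -> actual annotation
```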


The Grouping and Annotation Process

WSD with English OntoNotes Verbs

Picked 217 sense-group-annotated verbs with 50+ instances each (out of 1300+ verbs)
– 35K instances total (almost half the data)
– WN polysemy range: 59 to 2; coarse polysemy range: 16 to 2
– Test: 5-fold cross-validation
– Automatic performance approaches human performance!

WN Avg. Polysemy   Onto Avg. Polysemy   Baseline   ITA     MaxEnt   SVM
10.4               5.1                  0.68       0.825   0.827    0.822
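For a concrete picture of the experimental setup, here is a minimal sketch of 5-fold cross-validation with a MaxEnt (logistic regression) and an SVM classifier over simple bag-of-words context features, using scikit-learn. The instances are toy data and the real system used richer, linguistically motivated features (see the ablation in the Discussion), so this is only a schematic stand-in.

```python
# Schematic 5-fold cross-validation for verb sense classification.
# Bag-of-words context features stand in for the actual linguistically
# motivated feature set; the instances below are invented toy data.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression  # MaxEnt
from sklearn.svm import LinearSVC                    # SVM
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

# Toy instances of the verb "order": context sentence -> coarse sense group.
contexts = [
    "the owner ordered the dogs to attack",
    "the general ordered the troops to advance",
    "i just ordered pizza from panhandle pizza",
    "she ordered a new laptop online",
    "i ordered the papers before the meeting",
    "he ordered the files by date",
] * 5   # repeat so every fold contains every class
labels = ["g1", "g1", "g2", "g2", "g3", "g3"] * 5

for name, clf in [("MaxEnt", LogisticRegression(max_iter=1000)),
                  ("SVM", LinearSVC())]:
    pipeline = make_pipeline(CountVectorizer(), clf)
    scores = cross_val_score(pipeline, contexts, labels, cv=5)
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
```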

ITA > System Performance

[Bar chart of ITA, baseline, and system performance (0 to 1) for the verbs form, count, deal, and order; for these verbs ITA exceeds system performance]

ITA < System Performance

[Bar chart of ITA, baseline, and system performance (0 to 1) for the verbs decide, keep, throw, and mean; for these verbs system performance exceeds ITA]

Discussion

Coarse-grained sense distinctions improve both ITA and system performance.

Linguistically motivated features contributed to high system accuracy; a sketch of such a feature ablation follows below.

Feature ablation (system accuracy):
  ALL     w/o SEM   w/o SEM+SYN
  0.827   0.816     0.789

Data set            Baseline Acc.   System Acc.   ITA
SENSEVAL-2 verbs    0.407           0.646         0.713
OntoNotes verbs     0.680           0.827         0.825
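Purely as an illustrative companion to the ablation table, the sketch below shows one way such an ablation can be run: train the same classifier with all features, then with the semantic feature group removed, then with both the semantic and syntactic groups removed. The feature names and data are invented; the actual OntoNotes feature set is not shown on these slides.

```python
# Illustrative feature-group ablation for a sense classifier.
# Feature groups and values are invented; only the methodology is shown:
# drop a group of features, retrain, and compare accuracy.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

# Each toy instance is a dict of named features; prefixes mark the group.
instances = [
    {"lex:ordered": 1, "syn:obj_animate": 1, "sem:subj_human": 1},
    {"lex:ordered": 1, "syn:obj_animate": 1, "sem:subj_human": 1},
    {"lex:pizza": 1, "syn:pp_from": 1, "sem:obj_concrete": 1},
    {"lex:laptop": 1, "syn:obj_np": 1, "sem:obj_concrete": 1},
    {"lex:papers": 1, "syn:obj_np": 1, "sem:obj_concrete": 1},
    {"lex:files": 1, "syn:pp_by": 1, "sem:obj_concrete": 1},
] * 5
labels = ["g1", "g1", "g2", "g2", "g3", "g3"] * 5

def drop_groups(features, prefixes):
    """Remove every feature whose name starts with one of the given prefixes."""
    return {k: v for k, v in features.items() if not k.startswith(prefixes)}

for setting, prefixes in [("ALL", ()),
                          ("w/o SEM", ("sem:",)),
                          ("w/o SEM+SYN", ("sem:", "syn:"))]:
    ablated = [drop_groups(f, prefixes) for f in instances]
    model = make_pipeline(DictVectorizer(), LogisticRegression(max_iter=1000))
    acc = cross_val_score(model, ablated, labels, cv=5).mean()
    print(f"{setting}: accuracy = {acc:.3f}")
```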

Eventive nouns

ISI sense-tags nouns.

Some nouns have eventive senses:
– party
– development

Given a list of nouns and tagged instances, we will NomBank-annotate just those: a few thousand at most.

At the last meeting we reported very poor ITA with Adam's NomBank annotation.

Comparison of NomBank and PropBank Frames

107 frames were examined: the ISI eventive nouns that have frame files in NomBank.

47 of those showed differences between the NomBank and the PropBank frames.

Types of differences (a sketch of how such rolesets might be compared follows the list):

1. No PropBank equivalent

2. No PropBank equivalent for some NomBank senses

3. No NomBank equivalent for some PropBank senses

4. NomBank equivalent has extra Args
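The slides describe a manual comparison; purely as an illustration, rolesets could be represented and diffed along these lines. The dict layout is an assumption, and the example args are taken from the allegation/allege case shown later in the slides.

```python
# Illustrative roleset comparison; not the tooling actually used.
# A roleset is modeled as a mapping from arg label to description.
propbank = {
    "allege.01": {"Arg0": "speaker, alleger",
                  "Arg1": "utterance, allegation",
                  "Arg2": "hearer"},
}
nombank = {
    "allegation.01": {"Arg0": "speaker, alleger",
                      "Arg1": "utterance, allegation",
                      "Arg2": "hearer",
                      "Arg3": "person against whom something is alleged"},
}

def compare(nb_id, pb_id):
    """Report args present in a NomBank roleset but not in its PropBank source."""
    if pb_id not in propbank:
        return f"{nb_id}: no PropBank equivalent"
    extra = set(nombank[nb_id]) - set(propbank[pb_id])
    if extra:
        return f"{nb_id}: NomBank equivalent has extra Args {sorted(extra)}"
    return f"{nb_id}: rolesets align"

print(compare("allegation.01", "allege.01"))  # -> extra Args ['Arg3']
```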

No PropBank equivalent

15 cases:

breakdown; downturn; illness; oath; outcome; pain; repercussion; stress; transition; turmoil; unrest

No related PropBank equivalent for some NomBank senses

7 cases:

• No PB equivalent
  start: start.02 "attribute/housing-starts"

• PB equivalent of unrelated name
  appointment: appointment.02 "have a date"
  Equivalent: meet.03; there is no equivalent sense of "to appoint"

• PB equivalent of related name has different sense numbering
  plea: plea.02 "beg"
  Equivalent: plead.01, but the source is listed as appeal.02

solution

solution.02 "mix, combine"; source: mix.01
  Arg0: agent, mixer
  Arg1: ingredient one
  Arg2: ingredient two

Related PB equivalent: dissolve.01 "cause to come apart"??
  Arg0: causer, agent
  Arg1: thing dissolving
  Arg2: medium

"salt water solution"; "rubber solution"; "chemical solution"

No related NomBank equivalent for some PropBank rolesets

10 cases

harassment

harassment.01 "bother"
  Source: harass.01 "bother"
  "Police and soldiers continue to harass Americans."
  "the harassment of diplomats and their families"

harass.02 "cause an action"
  "John harassed Mary into giving him some ice cream."
  No NB equivalent.

NomBank equivalent has extra Args

6 cases

allegation

PB: allege.01
  Arg0: speaker, alleger
  Arg1: utterance, allegation
  Arg2: hearer

NB: allegation.01
  Arg3: person against whom something is alleged
  "Fraud allegation against Wei-Chyung Wang"
  "Abuse alleged against accused murderer."

Conclusion: Consider adding Arg3 to VN frame.

answer

PB: answer.01
  Arg0: replier
  Arg1: in response to
  Arg2: answer

NB: answer.01
  Arg3: asker/recipient of answer
  "Wang's marketing department provided the sales force[Arg3] answers to [the] questions"

In PropBank, this role is often fulfilled by the Arg1:
  "'I've read Balzac,' he answers critics[Arg1]"

attachment

NB: attachment.01
  Arg0: agent
  Arg1: theme
  Arg2: theme2
  Arg3: instrument

PB: attach.01
  Arg0: agent
  Arg1: thing being tied
  Arg2: instrument

attach.01 allows two Arg1's:
  "John attached the apology note[Arg1] to his dissertation[Arg1]."

Other issues

Roles described differently (different label or true difference):
  utterance.01 (NB): Arg0 "agent"
  utter.01 (PB): Arg0 "speaker"

  score.01 (VN): Arg2 "opponent"
  score.01 (NB): Arg2 "test/game"
  "scored against the B team[Arg2]"; "high test[Arg2] scores"

Different frame numbers:
  "attach, as with glue"
  bond.02 (NB)
  bond.01 (VN)


Conclusion

We can fix these!