NSF-ULA: Sense Tagging and Eventive Nouns
Martha Palmer, Miriam Eckert, Jena D. Hwang, Susan Windisch Brown, Dmitriy Dligach, Jinho
Choi, Nianwen Xue
University of Colorado, Departments of Linguistics and Computer Science
Institute for Cognitive Science
OntoNotes
Participating sites: BBN, University of Colorado, University of Pennsylvania, USC/ISI
http://www.bbn.com/NLP/OntoNotes
OntoNotes goals
Develop a skeletal representation of the literal meaning of sentences
– Add to a frame-based (PropBank) representation of predicates and their arguments:
• Referring expressions and the textual phrases they refer to
• Terms disambiguated by coarse-grained word sense in an ontology
– Encodes the core, skeletal meaning
– Moves away from strings to terms that a reasoning system can use
Find a “sweet spot” in the space of:
– Inter-tagger agreement
– Productivity
– Depth of representation
[Diagram: Text is annotated with Treebank, PropBank, word sense with respect to an ontology, and co-reference, yielding OntoNotes Annotated Text]
Creating a Sense Inventory that Supports High-Quality Annotation
A large-scale annotation effort as part of the OntoNotes project
Two steps:
– Grouping subtle, fine-grained WordNet senses into coherent semantic sense groups based on syntactic and semantic criteria. For example, WordNet sense 1 (“I called my son David”) and WordNet sense 12 (“You can call me Sir”) are grouped together.
– Annotation
Example Grouping: Order
Group 1: “give a command”; NP1[+human] ORDER NP2[+animate] to V, where NP1 has some authority over NP2. Example: “The victim says that the owner ordered the dogs to attack.”
Group 2: “request something to be made, supplied, or delivered”; NP1[+human] ORDER NP2. Example: “I just ordered pizza from Panhandle Pizza.”
Group 3: “organize”. Example: “I ordered the papers before the meeting.”
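A grouping like the one above can be captured in a small sense-inventory structure. The following is a minimal sketch, assuming a simple dict-based layout; the actual OntoNotes sense inventories are XML frame files, and the names `order_groups` and `gloss` here are hypothetical:

```python
# Hypothetical dict-based representation of the grouped senses of "order";
# glosses, syntax patterns, and examples are taken from the grouping above.
order_groups = {
    "1": {"gloss": "give a command",
          "syntax": "NP1[+human] ORDER NP2[+animate] to V",
          "example": "The victim says that the owner ordered the dogs to attack."},
    "2": {"gloss": "request something to be made, supplied, or delivered",
          "syntax": "NP1[+human] ORDER NP2",
          "example": "I just ordered pizza from Panhandle Pizza."},
    "3": {"gloss": "organize",
          "example": "I ordered the papers before the meeting."},
}

def gloss(group_id):
    """Look up the gloss for a coarse sense group."""
    return order_groups[group_id]["gloss"]

print(gloss("2"))  # request something to be made, supplied, or delivered
```

Each coarse group bundles several fine-grained WordNet senses; an annotator only has to choose among the three groups rather than the full WordNet inventory.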
Annotation Process
Verb sense groups are created based on WordNet senses and on-line resources (VerbNet, PropBank, FrameNet, dictionaries)
Newly created verb sense groups are subject to sample annotation.
Verbs with higher than 90% ITA (or 85% after regrouping) go to actual annotation.
Verbs with less than 90% ITA are regrouped and sent back into sample-annotation tasks.
Regroupings and re-sample annotations are not done by the original grouper and taggers.
Verbs that complete actual annotation are adjudicated.
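The routing in the pipeline above can be sketched directly. This is a minimal sketch, assuming pairwise agreement as the ITA measure; the function names `pairwise_ita` and `route_verb` are hypothetical, and the real pipeline may compute agreement differently:

```python
# Sketch of sample-annotation routing: compute pairwise ITA for a verb,
# then apply the 90% (or 85% after regrouping) threshold from the slides.

def pairwise_ita(tags_a, tags_b):
    """Fraction of instances on which two taggers assign the same sense."""
    assert len(tags_a) == len(tags_b)
    agree = sum(1 for a, b in zip(tags_a, tags_b) if a == b)
    return agree / len(tags_a)

def route_verb(ita, regrouped=False):
    """Decide the next pipeline step for a verb from its sample-annotation ITA."""
    threshold = 0.85 if regrouped else 0.90
    return "actual annotation" if ita >= threshold else "regroup"

ita = pairwise_ita(["order.01", "order.02", "order.01"],
                   ["order.01", "order.02", "order.02"])  # 2 of 3 agree
print(route_verb(ita))  # -> regroup
```

A regrouped verb gets the lower 85% bar, and, per the slides, is handled by a different grouper and different taggers than the first round.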
WSD with English OntoNotes Verbs
Picked 217 sense-group-annotated verbs with 50+ instances each (out of 1300+ verbs)
– 35K instances total (almost half the data)
– WordNet polysemy range: 59 to 2; coarse polysemy range: 16 to 2
– Test: 5-fold cross-validation
– Automatic performance approaches human performance!

WN Avg. Polysemy | Onto Avg. Polysemy | Baseline | ITA | MaxEnt | SVM
10.4 | 5.1 | 0.68 | 0.825 | 0.827 | 0.822
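A common WSD baseline is always predicting the most frequent sense; assuming that is the kind of baseline reported here, the evaluation setup can be sketched as follows. The toy label list is hypothetical, and the actual experiments ran MaxEnt and SVM classifiers over 35K instances:

```python
# Sketch of a most-frequent-sense (MFS) baseline evaluated with simple
# 5-fold cross-validation over contiguous folds. Toy data only.
from collections import Counter

def mfs_baseline_accuracy(train_labels, test_labels):
    """Accuracy of always predicting the most frequent training sense."""
    mfs = Counter(train_labels).most_common(1)[0][0]
    return sum(1 for y in test_labels if y == mfs) / len(test_labels)

def five_fold_accuracy(labels, k=5):
    """Average MFS accuracy over k contiguous folds."""
    n = len(labels)
    fold = n // k
    scores = []
    for i in range(k):
        test = labels[i * fold:(i + 1) * fold]
        train = labels[:i * fold] + labels[(i + 1) * fold:]
        scores.append(mfs_baseline_accuracy(train, test))
    return sum(scores) / k

# Hypothetical skewed sense distribution, as is typical for many verbs.
labels = ["order.01"] * 7 + ["order.02"] * 2 + ["order.03"]
print(five_fold_accuracy(labels))  # -> 0.7
```

A skewed sense distribution is what makes the coarse-grained baseline as high as 0.68; the MaxEnt and SVM systems then only need to recover the minority senses to approach the 0.825 ITA ceiling.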
ITA > System Performance

[Bar chart: ITA, baseline, and system performance (0 to 1) for the verbs form, count, deal, and order, where ITA exceeds system performance]
ITA < System Performance

[Bar chart: ITA, baseline, and system performance (0 to 1) for the verbs decide, keep, throw, and mean, where system performance exceeds ITA]
Discussion
Coarse-grained sense distinctions improve both ITA and system performance
Linguistically motivated features contributed to high system accuracy
Features | Accuracy
ALL | 0.827
w/o SEM | 0.816
w/o SEM+SYN | 0.789

Data set | Baseline Acc. | System Acc. | ITA
SENSEVAL-2 verbs | 0.407 | 0.646 | 0.713
OntoNotes verbs | 0.680 | 0.827 | 0.825
Eventive nouns
ISI sense tags nouns
Some nouns have eventive senses:
– party
– development
Given a list of nouns and tagged instances, we NomBank just those: a few thousand at most.
At the last meeting, we reported very poor ITA with Adam’s NomBank annotation.
Comparison of NomBank and PropBank Frames
107 Frames examined: the ISI eventive nouns that have frame files in NomBank
47 of those showed differences between the NomBank and the PropBank frames.
Types of differences:
1. No PropBank equivalent
2. No PropBank equivalent for some NomBank senses
3. No NomBank equivalent for some PropBank senses
4. NomBank equivalent has extra Args
No PropBank equivalent
15 cases
breakdown; downturn; illness; oath;
outcome; pain; repercussion; stress;
transition; turmoil; unrest
No related PropBank equivalent for some NomBank senses
7 cases
No PB equivalent
start
start.02 “attribute/housing-starts”
PB equivalent of unrelated name
appointment
appointment.02 “have a date”
Equivalent: meet.03, no equivalent sense of “to appoint”
PB equivalent of related name has different sense numbering
plea
plea.02 “beg”
Equivalent: plead.01, source is listed as appeal.02
solution
solution.02 “mix, combine”; source mix.01
Arg0: agent, mixer
Arg1: ingredient one
Arg2: ingredient two
Related PB equivalent: dissolve.01 “cause to come apart”??
Arg0: causer, agent
Arg1: thing dissolving
Arg2: medium
“salt water solution”
“rubber solution”
“chemical solution”
No related NomBank equivalent for some PropBank rolesets
10 cases
harassment
harassment.01 “bother”
Source: harass.01 “bother”
“Police and soldiers continue to harass Americans.”
“the harassment of diplomats and their families”
harass.02 “cause an action”
“John harassed Mary into giving him some ice cream.”
No NB equivalent.
NomBank equivalent has extra Args
6 cases
allegation
PB: allege.01
Arg0: speaker, alleger
Arg1: utterance, allegation
Arg2: hearer
NB: allegation.01
Arg3: person against whom something is alleged
“Fraud allegation against Wei-Chyung Wang”
“Abuse alleged against accused murderer.”
Conclusion: Consider adding Arg3 to VN frame.
answer
PB: answer.01
Arg0: replier
Arg1: in response to
Arg2: answer
NB: answer.01
Arg3: asker/recipient of answer
“Wang’s marketing department provided the sales force [Arg3] answers to [the] questions”
In PropBank, this role is often fulfilled by the Arg1.
“‘I’ve read Balzac’, he answers critics [Arg1]”
attachment
attachment.01
Arg0: agent
Arg1: theme
Arg2: theme2
Arg3: instrument
attach.01
Arg0: agent
Arg1: thing being tied
Arg2: instrument
attach.01 allows two Arg1s:
“John attached the apology note [Arg1] to his dissertation [Arg1].”
Other issues
Roles described differently (different label or true difference)
utterance.01: Arg0 “agent”
utter.01: Arg0 “speaker”
score.01 (VN): Arg2 “opponent”
score.01 (NB): Arg2 “test/game”
“scored against the B team [Arg2]”; “high test [Arg2] scores”
Different frame numbers
“attach, as with glue”
bond.02 (NB)
bond.01 (VN)