From Verbal Argument Structures to Nominal Ones:
A Data-Mining Approach
Olya Gurevich
1 December 2010
© 2010 Microsoft Page 2
Talk Outline
Powerset: a natural language search engine (acquired by Microsoft in 2008)
Deverbal nouns and their arguments
Data collection and corpus-based modeling
Baseline system
Experiments
Conclusions
Powerset: Natural Language Search
Queries and documents undergo syntactic and semantic parsing
Semantic representations allow both more constrained and more expansive matching compared to keywords
► Who invaded Rome ≠ Who did Rome invade
► Who did Rome invade ≈ Who was invaded by Rome
► Who invaded Rome ≈ Who attacked Rome
► Who invaded Rome ≈ Who was the invader of Rome
Worked on English-language Wikipedia
NL technology initially developed at Xerox PARC (XLE)
Deverbal Nouns
Events often realized as nouns, not verbs
► Armstrong’s return after his retirement == Armstrong returned after he retired
► The destruction of Rome by the Huns was devastating == The Huns destroyed Rome
► The Yankees’ defeat over the Mets == The Yankees defeated the Mets
► Kasparov’s defense of his knight == Kasparov defended his knight
In search, need to map deverbal expression to the verb (or vice versa)
Deverbal Types
Eventive
► destruction, return, death
Agent-like
► Henri IV was the ruler of France == Henri IV ruled France
Patient-like
► Mary is an IBM employee == IBM employs Mary
Deverbal Role Ambiguity
Deverbal syntax doesn’t always determine argument role
► They jumped to the support of the Queen ==> They supported the Queen
► They enjoyed the support of the Queen ==> The Queen supported them
► We talked about the Merrill Lynch acquisition ==> Was Merrill Lynch acquired? Or did it acquire something?
Particularly problematic if underlying verb is transitive but the deverbal noun has only one argument
Baseline system
LFG-based syntactic parser (XLE)
► Grammar is rule-based
► Disambiguation component is statistically trained
List of ~4000 deverbals and corresponding verbs, from:
► WordNet derivational morphology
► NomLex, NomLex Plus
► Hand curation
Verb lexicon with subcategorization frames
Baseline system cont.
Parse sentence using XLE
If a noun is in the list of ~4000 deverbals, map its arguments into those of a corresponding verb using rule-based heuristics. For transitive verbs:
► X’s DV of Y ==> subj(V, X); obj(V, Y), etc.
Obama’s support of reform ==> subj(support, Obama); obj(support, reform)
► X’s DV ==> subj(V, X) [default to most-frequent pattern]
Obama’s support ==> subj(support, Obama)
► DV of X ==> obj(V, X) [default to most-frequent pattern]
support of reform ==> obj(support, reform)
► X DV ==> no role
► Subject-sharing support verbs: make, take
Goal: to improve over default assignments
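A minimal sketch of these default heuristics (function name and data shapes are illustrative assumptions, not Powerset’s actual code):

```python
# Hypothetical sketch of the baseline defaults for a transitive deverbal;
# roles are (relation, verb, argument) triples.

def baseline_roles(verb, possessor=None, of_arg=None, prenominal=None):
    """Apply the default mappings: X's DV ==> subj(V, X);
    DV of X ==> obj(V, X); prenominal X DV ==> no role."""
    roles = []
    if possessor is not None:
        roles.append(("subj", verb, possessor))
    if of_arg is not None:
        roles.append(("obj", verb, of_arg))
    # Prenominal arguments (X DV) get no role in the baseline.
    return roles

# "Obama's support of reform", deverbal support -> verb support:
print(baseline_roles("support", possessor="Obama", of_arg="reform"))
# [('subj', 'support', 'Obama'), ('obj', 'support', 'reform')]
```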
Baseline system cont.
Agent-like deverbals
► X’s DVer ==> obj(V, X)
■ the project’s director == subj(direct, director); obj(direct, project)
► DVer of X ==> obj(V, X)
■ composer of the song == subj(compose, composer); obj(compose, song)
Patient-like deverbals
► X’s DVee ==> subj(V, X)
■ IBM’s employee == subj(employ, IBM); obj(employ, employee)
► DVee of X ==> subj(V, X)
■ captive of the rebels == subj(capture, rebels); obj(capture, captive)
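The agent-like and patient-like mappings can be sketched the same way (illustrative names, not the actual system):

```python
def agentive_roles(verb, dver, x):
    """X's DVer / DVer of X: the -er noun is the verb's subject, X its
    object (e.g. the project's director, composer of the song)."""
    return [("subj", verb, dver), ("obj", verb, x)]

def patientive_roles(verb, dvee, x):
    """X's DVee / DVee of X: X is the verb's subject, the -ee noun its
    object (e.g. IBM's employee, captive of the rebels)."""
    return [("subj", verb, x), ("obj", verb, dvee)]

print(agentive_roles("direct", "director", "project"))
# [('subj', 'direct', 'director'), ('obj', 'direct', 'project')]
print(patientive_roles("employ", "employee", "IBM"))
# [('subj', 'employ', 'IBM'), ('obj', 'employ', 'employee')]
```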
Deverbal Task
Goal: predict the relation between transitive V and argument X, given {X’s DV}, {DV of X}, or {X DV}
► the program’s renewal ==> obj(renew, program)
► the king’s promise ==> subj(promise, king)
► the destruction of Rome ==> obj(destroy, Rome)
► the rule of Henri IV ==> subj(rule, Henri IV)
► the Congress decision ==> subj(decide, Congress)
► castle protection ==> obj(protect, castle)
► domain adaptation ==> ?(adapt, domain)
Inference from verb usage
Large corpus data can indicate lexical preferences
► Armstrong’s return == Armstrong returned
► return of the book == the book was returned
If X is more often a subject of V than an object,
► then X’s DV or DV of X ==> subj(V, X)
Need to count subj(V,X) | obj(V,X) occurrences for all possible pairs (V,X)
Need lots of parsed sentences!
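Counting those occurrences over parsed (verb, role, argument) triples might look like this (toy data; the real counts come from millions of parsed sentences):

```python
from collections import Counter

def role_counts(triples):
    """Count subj(V, X) and obj(V, X) occurrences over parsed triples."""
    return Counter(triples)

def prefers_subject(counts, verb, arg):
    """True if X occurs as subject of V more often than as object."""
    return counts[(verb, "subj", arg)] > counts[(verb, "obj", arg)]

# Toy parsed data: "the book" is mostly an object of "return".
triples = [("return", "obj", "book")] * 5 + [("return", "subj", "book")]
counts = role_counts(triples)
print(prefers_subject(counts, "return", "book"))
# False: so "return of the book" ==> obj(return, book)
```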
Data Sparseness
Where to get enough parsed data to count all occurrences to model any pair (V,X)?
We have parsed all of the English Wikipedia (2M docs, 121M sentences)
■ cf. Penn TreeBank (~50,000 sentences)
Oceanography: distributed architecture for fast extraction / analysis of huge parsed data sets
72M Role(Verb, Role, Arg) examples
■ 69% of these appear just once, 13% just twice!
Not enough data to make a good prediction for each individual argument
■ need to generalize across arguments
Deverbal-only model
For each deverbal DV and related verb V:
► find corpus occurrences of overlapping arguments: X SUBJ V, X OBJ V, and X’s DV, for all X
► if (X_SUBJ / X_OBJ) > 1.5, consider X “subject-preferring” for this DV
► if DV has more subject-preferring than object-preferring arguments, then map X’s DV ==> subj(V, X) for all X (and conversely for object preference)
► if the majority of overlapping arguments for a given V are neither subjects nor objects, DV is “other-preferring”
In effect, each DV averages over all of its arguments X
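The aggregation above can be sketched as follows (a toy illustration with counts shaped as {argument: (subj_count, obj_count)}; the 1.5 ratio is from the slide):

```python
from collections import Counter

def argument_preference(subj_n, obj_n, ratio=1.5):
    """Classify one overlapping argument X of V: subject-preferring if
    subj/obj > 1.5, object-preferring if obj/subj > 1.5, else other."""
    if subj_n > ratio * obj_n:
        return "subj"
    if obj_n > ratio * subj_n:
        return "obj"
    return "other"

def deverbal_preference(arg_counts, ratio=1.5):
    """Majority vote over all overlapping arguments of one deverbal;
    the winner becomes the default mapping for X's DV."""
    votes = Counter(argument_preference(s, o, ratio)
                    for s, o in arg_counts.values())
    return votes.most_common(1)[0][0]

# Toy counts loosely modeled on the renewal walk-through:
counts = {"program": (9, 72), "contract": (3, 40), "he": (615, 42)}
print(deverbal_preference(counts))
# obj: X's renewal defaults to obj(renew, X)
```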
Walk-through example
renewal : renew
► Argument: program
■ program’s renewal: 2 occurrences
■ obj(renew, program): 72 occurrences
■ subj(renew, program): 9 occurrences
■ {renewal, program} is object-preferring
► Argument: he
■ his renewal: 18 occurrences
■ subj(renew, he): 615 occurrences
■ obj(renew, he): 42 occurrences
■ {renewal, he} is subject-preferring
► Object-preferring arguments: 15
► Subject-preferring arguments: 9
► Overall preference for X’s renewal: obj
► But is there a way to model non-majority preferences?
Overall preferences
For possessive arguments, e.g. X’s renewal, X’s possession
► Subj-preferring: 1786 deverbals (67%)
► Obj-preferring: 884 (33%)
► Default: subj
For ‘of’ arguments, e.g. renewal of X, possession of X
► Subj-preferring: 839 (29%)
► Obj-preferring: 2036 (71%)
► Default: obj
For prenominal arguments, e.g. X protection, X discovery
► Subj-preferring: 373 (11%)
► Obj-preferring: 1037 (31%)
► Other-preferring: 1933 (58%)
► Default: other (= no role)
Incidence: subjects
[Charts: composition of subject heads (N=1000), deverbal vs. verb; and subject errors (N=220) broken down by deverbal type: agent, patient, 2-argument, “of”, poss, prenom, other deverbal, verb]
Incidence: objects
[Charts: composition of object heads (N=1000), deverbal vs. verb; and object errors (N=260) broken down by deverbal type: agent, patient, 2-argument, “of”, poss, prenom, other deverbal, verb]
Evaluation Data
X’s DV:
► 1000 hand-annotated sentences
► Possible judgments: subj, obj, other
► Evaluate classification between subj and non-subj

                  System: Subj   System: Non-subj
Judged Subj       Correct        Incorrect
Judged Non-subj   Incorrect      Correct
Evaluation
DV of X:
► 750 hand-annotated sentences
► Evaluate classification between obj and non-obj

                 System: Obj    System: Non-obj
Judged Obj       Correct        Incorrect
Judged Non-obj   Incorrect      Correct
Evaluation
DV X
► 999 hand-annotated sentences
► Evaluate classification between subj, obj, and none

               System: Subj   System: Obj   System: Other
Judged Subj    Correct        Incorrect     Incorrect
Judged Obj     Incorrect      Correct       Incorrect
Judged Other   Incorrect      Incorrect     Correct
Evaluation measures
Error = Incorrect / (Correct + Incorrect)
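Assuming the measure is the usual error rate (incorrect judgments over total judged), it can be computed as:

```python
def error_rate(correct, incorrect):
    """Error = Incorrect / (Correct + Incorrect)."""
    return incorrect / (correct + incorrect)

# e.g. 220 subject errors among 1000 judged subject heads:
print(error_rate(780, 220))  # 0.22
```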
Deverbal-only Results: Possessives
Combining all arguments for each deverbal reduces role-labeling error by 39% for possessive arguments

[Chart: possessive arguments: subj error, non-subj error, and overall error, baseline vs. deverbal-only model]
Deverbal-only Results: ‘of’ args
Error rate is reduced by 44%

[Chart: ‘of’ arguments: obj error, non-obj error, and overall error, baseline vs. deverbal-only model]
Deverbal-only Results: prenominal args
Error rate is reduced by 28%
Too much smoothing?
Combining all arguments is fairly drastic
Possible features of arguments that may impact behavior:
► Ontological class: hard to get reliable classifications
► Animacy: subjects are more animate than objects (crosslinguistically true): the program’s renewal vs. his renewal
Possible features of deverbals and verbs that may impact behavior:
► Ontological class
► Active vs. passive use of verbs
Animacy-based model
Split the model into separate predictors for animate(X) and inanimate(X)
► Animate: pronouns (I, you, he, she, they)
► Inanimate: common nouns
► Ignored proper names due to poor classification into people vs. places vs. organizations
If the model has no prediction for the class of the argument encountered, fall back to the deverbal-only model
Results:
► more accurate subject labeling for animate arguments
► lower recall and less accurate object labeling
► overall error rate about the same as the deverbal-only model, possibly due to insufficient training data
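An illustrative sketch of the animacy split with its fallback; the crude classifier and the dict-shaped models are assumptions, not the actual system:

```python
# Animate: a few pronouns; inanimate: common nouns; proper names
# (capitalized) are skipped because of poor person/place/org typing.
ANIMATE_PRONOUNS = {"i", "you", "he", "she", "we", "they"}

def animacy_class(arg):
    """Very crude animacy classifier per the slide; None = skip."""
    if arg.lower() in ANIMATE_PRONOUNS:
        return "animate"
    if arg[:1].isupper():
        return None  # proper name: unreliable, ignore
    return "inanimate"

def predict_role(dv, arg, animacy_model, deverbal_model):
    """Per-class prediction when available, else deverbal-only default."""
    cls = animacy_class(arg)
    if cls is not None and (dv, cls) in animacy_model:
        return animacy_model[(dv, cls)]
    return deverbal_model.get(dv, "other")

animacy_model = {("renewal", "animate"): "subj",
                 ("renewal", "inanimate"): "obj"}
deverbal_model = {"renewal": "obj"}
print(predict_role("renewal", "he", animacy_model, deverbal_model))       # subj
print(predict_role("renewal", "program", animacy_model, deverbal_model))  # obj
```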
Lexicalized model
Try to make predictions for individual DV+argument pairs
If the model has insufficient evidence for the pair, default to deverbal-only model
Results:
► For possessive args, performance about the same as deverbal-only
► For ‘of’ args, performance slightly worse than deverbal-only
► For prenominal args, much worse performance
► Model is vulnerable to data sparseness and systematic parsing errors (e.g. weather conditions)
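The lexicalized back-off can be sketched as follows (the evidence threshold and data shapes are assumptions):

```python
def lexicalized_role(dv, arg, pair_counts, deverbal_default, min_obs=5):
    """Predict for the specific (DV, argument) pair when there is enough
    corpus evidence; otherwise default to the deverbal-only model.
    pair_counts: {(dv, arg): {'subj': n, 'obj': n, 'other': n}}."""
    counts = pair_counts.get((dv, arg))
    if counts and sum(counts.values()) >= min_obs:
        return max(counts, key=counts.get)
    return deverbal_default.get(dv, "other")

pair_counts = {("renewal", "he"): {"subj": 615, "obj": 42, "other": 10}}
defaults = {"renewal": "obj"}
print(lexicalized_role("renewal", "he", pair_counts, defaults))       # subj
print(lexicalized_role("renewal", "program", pair_counts, defaults))  # obj (fallback)
```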
DV+animacy / lex results: possessives
[Chart: possessive arguments: subj error, non-subj error, and overall error for baseline, deverbal-only, animacy, and lexicalized models]
DV+animacy / lex results: ‘of’ arguments
[Chart: ‘of’ arguments: obj error, non-obj error, and overall error for baseline, deverbal-only, animacy, and lexicalized models]
DV+lex results: prenominal args
[Chart: prenominal arguments: subj error, obj error, other error, and overall error for baseline, deverbal-only, and lexicalized models]
Training data size: 10K vs. 2M docs
[Charts: coverage vs. size of training data (possessive and “of” arguments, 10K vs. 2M docs), and error rate vs. size of training data (baseline, 10K, 2M)]
Support (“light”) verbs
Tried using the same method to derive support verbs
► e.g. make a decision, take a walk, receive a gift
Look for patterns like:
► John decided vs. John made a decision ==> lv(decision, make, sb)
► We agreed vs. We had an agreement ==> lv(agreement, have, sb)
Initial predictions had quite a few spurious patterns
After manual curation:
► 96 DV-V pairs got a support verb
► 25 unique support verbs
► 28 support verb / argument patterns
Default model is fairly fragile
The tight semantic relationship between light verbs and deverbals makes this method less applicable
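The pattern search described above could be sketched roughly like this (toy clauses; the real system mines parsed Wikipedia):

```python
from collections import Counter

def mine_support_verbs(verb_clauses, lv_clauses, dv_to_verb):
    """verb_clauses: (subject, verb) pairs; lv_clauses: (subject,
    light_verb, noun) triples. Count lv(DV, light_verb) patterns whose
    subject also appears with the underlying verb."""
    verb_subjects = set(verb_clauses)
    patterns = Counter()
    for subj, lv, noun in lv_clauses:
        verb = dv_to_verb.get(noun)
        if verb and (subj, verb) in verb_subjects:
            patterns[(noun, lv)] += 1
    return patterns

verb_clauses = [("John", "decide"), ("We", "agree")]
lv_clauses = [("John", "make", "decision"), ("We", "have", "agreement"),
              ("Mary", "take", "walk")]
dv_to_verb = {"decision": "decide", "agreement": "agree", "walk": "walk"}
print(mine_support_verbs(verb_clauses, lv_clauses, dv_to_verb))
```

In practice (per the slide) the raw predictions contained many spurious patterns and required manual curation.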
Directions for future work
Less ad hoc parameter setting
Further lexicalization of the model
► Predictions for ontological classes of arguments
► Use properties of verbal constructions (e.g. passive vs. active, tense, etc.)
More fine-grained classification of non-subj/obj roles
► director of 12 years
► Bill Gates’ foundation
► the Delhi Declaration
Conclusions
Knowing how arguments typically participate in events allows interpretation of ambiguous deverbal syntax
Large parsed corpora are a valuable resource
Even the simplest models greatly reduce error
More data is better
Thanks to:
Scott Waterman
Dick Crouch
Tracy Holloway King
Powerset NLE Team
References
M. Banko and E. Brill, Scaling to very very large corpora for natural language disambiguation, ACL 2001.
S. Riezler, T. H. King, R. Kaplan, J. T. Maxwell. III, R. Crouch, and M. Johnson, Parsing the Wall Street Journal using a Lexical-Functional Grammar and discriminative estimation techniques, ACL 2002.
S. A. Waterman, Distributed parse mining, in SETQA-NLP 2009.
O. Gurevich, R. Crouch, T. H. King, and V. de Paiva, Deverbal nouns in knowledge representation, Journal of Logic and Computation, vol. 18, pp. 385-404, 2008.
O. Gurevich and S. A. Waterman, Mapping verbal argument preferences to deverbal nouns, IJSC 4(1), 2010.
M. Nunes, Argument linking in English derived nominals, in Advances in Role and Reference Grammar, R. V. Valin, Ed. John Benjamins, 1993, pp. 375-432.
R. S. Crouch and T. H. King, Semantics via f-structure rewriting, LFG 2006.
C. Macleod, R. Grishman, A. Meyers, L. Barrett, and R. Reeves, NOMLEX: A lexicon of nominalizations, EURALEX 1998.
A. Meyers, R. Reeves, C. Macleod, R. Szekely, V. Zielinska, B. Young, and R. Grishman, The cross-breeding of dictionaries, LREC-2004.
C. Walker and H. Copperman, Evaluating complex semantic artifacts, LREC 2010.
S. Pradhan, H. Sun, W. Ward, J. H. Martin, and D. Jurafsky, Parsing arguments of nominalizations in English and Chinese, HLT-NAACL 2004.
C. Liu and H. T. Ng, Learning predictive structures for semantic role labeling of Nombank, ACL 2007.
M. Lapata, The disambiguation of nominalizations, Computational Linguistics, 28(3), 357-388, 2002.
S. Pado, M. Pennacchiotti, and C. Sporleder, Semantic role assignment for event nominalisations by leveraging verbal data, CoLing 2008.