CIS630 1
Penn
Putting Meaning Into Your Trees
Martha Palmer
Collaborators: Paul Kingsbury, Olga Babko-Malaya, Bert Xue, Scott Cotton, Karin Kipper, Hoa Dang, Szuting Yi, Edward Loper, Jinying Chen, Tom Morton, William Schuler, Fei Xia, Joseph Rosenzweig, Dan Gildea, Christiane Fellbaum
September 8, 2003
CIS630 2
Penn
Elusive nature of “meaning”
Natural Language Understanding
Natural Language Processing or Natural Language Engineering
Empirical techniques rule!
CIS630 3
Penn
Statistical Machine Translation results

[CHINESE TEXT]
The japanese court before china photo trade huge & lawsuit.
A large amount of the proceedings before the court dismissed workers.
japan’s court, former chinese servant industrial huge disasters lawsuit.
Japanese Court Rejects Former Chinese Slave Workers’ Lawsuit for Huge Compensation.
CIS630 4
Penn
Leverage from shallow techniques?
  Still need an approximation of meaning for accurate MT, IR, Q&A, IE
    Sense tagging
    Labeled dependency structures
What do we have as available resources?
What can we do with them?
CIS630 5
Penn
Outline
  Introduction – need for semantics
  Sense tagging
    Issues highlighted by Senseval1
  VerbNet
  Senseval2 – groupings, impact on ITA
  Automatic WSD, impact on scores
  Proposition Bank
    Framesets, automatic role labellers
  Hierarchy of sense distinctions
  Mapping VerbNet to PropBank
CIS630 6
Penn
WordNet - Princeton
  On-line lexical reference (dictionary)
  Words organized into synonym sets <=> concepts
  Hypernyms (ISA), antonyms, meronyms (PART)
  Useful for checking selectional restrictions (doesn’t tell you what they should be)
  Typical top nodes - 5 out of 25: (act, action, activity), (animal, fauna), (artifact), (attribute, property), (body, corpus)
CIS630 7
Penn
WordNet – president, 6 senses
1. president -- (an executive officer of a firm or corporation) --> CORPORATE EXECUTIVE, BUSINESS EXECUTIVE ... LEADER
2. President of the United States, President, Chief Executive -- (the person who holds the office of head of state of the United States government; "the President likes to jog every morning") --> HEAD OF STATE, CHIEF OF STATE
3. president -- (the chief executive of a republic) --> HEAD OF STATE, CHIEF OF STATE
4. president, chairman, chairwoman, chair, chairperson -- (the officer who presides at the meetings of an organization; "address your remarks to the chairperson") --> PRESIDING OFFICER, LEADER
5. president -- (the head administrative officer of a college or university) --> ACADEMIC ADMINISTRATOR ... LEADER
6. President of the United States, President, Chief Executive -- (the office of the United States head of state; "a President is elected every four years") --> PRESIDENCY, PRESIDENTSHIP, POSITION
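Here is a minimal sketch of this kind of lookup using NLTK's WordNet interface (an assumption on my part: NLTK ships a later WordNet than the 1.7 pre-release used in this talk, so sense numbering and glosses may differ):

# Sketch: list the noun senses of "president" with their hypernyms.
# Requires: pip install nltk, then nltk.download('wordnet').
from nltk.corpus import wordnet as wn

for i, synset in enumerate(wn.synsets('president', pos=wn.NOUN), start=1):
    lemmas = ", ".join(lemma.name() for lemma in synset.lemmas())
    hypernyms = ", ".join(h.name() for h in synset.hypernyms())
    print(f"{i}. {lemmas} -- ({synset.definition()}) --> {hypernyms}")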
CIS630 8
Penn
Limitations to WordNet
  Poor inter-annotator agreement (73%)
  Just sense tags - no representations
  Very little mapping to syntax
  No predicate argument structure, no selectional restrictions
  No generalizations about sense distinctions
  No hierarchical entries
CIS630 9
Penn
SIGLEX98/SENSEVAL: Workshop on Word Sense Disambiguation
  54 attendees, 24 systems, 3 languages
  34 Words (Nouns, Verbs, Adjectives)
  Both supervised and unsupervised systems
  Training data, test data
  Hector senses - very corpus based (mapping to WordNet)
  Lexical samples - instances, not running text
  Inter-annotator agreement over 90%
ACL-SIGLEX98, SIGLEX99, CHUM00
CIS630 10
Penn
Hector - bother, 10 senses
1. intransitive verb - (make an effort), after negation, usually with to-infinitive; (of a person) to take the trouble or effort needed (to do something). Ex. "About 70 percent of the shareholders did not bother to vote at all."
1.1 (can't be bothered), idiomatic, be unwilling to make the effort needed (to do something). Ex. "The calculations needed are so tedious that theorists cannot be bothered to do them."
2. vi; after neg; with `about' or `with'; rarely cont - (of a person) to concern oneself (about something or someone). "He did not bother about the noise of the typewriter because Danny could not hear it above the sound of the tractor."
2.1 v-passive; with `about' or `with' - (of a person) to be concerned about or interested in (something). "The only thing I'm bothered about is the well-being of the club."
CIS630 11
Penn
Mismatches between lexicons: Hector - WordNet, shake
CIS630 12
Penn
Levin classes (3100 verbs)
  47 top level classes, 193 second and third level
  Based on pairs of syntactic frames:
    John broke the jar. / Jars break easily. / The jar broke.
    John cut the bread. / Bread cuts easily. / *The bread cut.
    John hit the wall. / *Walls hit easily. / *The wall hit.
  Reflect underlying semantic components: contact, directed motion, exertion of force, change of state
  Synonyms, syntactic patterns (conative), relations
CIS630 13
Penn
Confusions in Levin classes?
  Not semantically homogeneous
    {braid, clip, file, powder, pluck, etc...}
  Multiple class listings: homonymy or polysemy?
  Alternation contradictions?
    Carry verbs disallow the Conative, but include {push, pull, shove, kick, draw, yank, tug}, which are also in the Push/Pull class, which does take the Conative
CIS630 14
Penn
Intersective Levin classes
CIS630 15
Penn
Regular Sense Extensions
John pushed the chair. +force, +contact
John pushed the chairs apart. +ch-state
John pushed the chairs across the room. +ch-loc
John pushed at the chair. -ch-loc
The train whistled into the station. +ch-loc
The truck roared past the weigh station. +ch-loc
AMTA98,ACL98,TAG98
CIS630 16
Penn
Intersective Levin Classes
  More syntactically and semantically coherent
    sets of syntactic patterns
    explicit semantic components
    relations between senses
VERBNET: www.cis.upenn.edu/verbnet
CIS630 17
Penn
VerbNet
  Computational verb lexicon
  Clear association between syntax and semantics
    Syntactic frames (LTAGs) and selectional restrictions (WordNet)
    Lexical semantic information – predicate argument structure
    Semantic components represented as predicates
    Links to WordNet senses
  Entries based on refinement of Levin Classes
  Inherent temporal properties represented explicitly: during(E), end(E), result(E)
TAG00, AAAI00, Coling00
CIS630 18
Penn
VerbNet
Class entries:
  Verb classes allow us to capture generalizations about verb behavior
  Verb classes are hierarchically organized
  Members have common semantic elements, thematic roles, syntactic frames and coherent aspect
Verb entries:
  Each verb can refer to more than one class (for different senses)
  Each verb sense has a link to the appropriate synsets in WordNet (but not all senses of WordNet may be covered)
  A verb may add more semantic information to the basic semantics of its class
Hit class – hit-18.1
MEMBERS: [bang(1,3), bash(1), ... hit(2,4,7,10), kick(3), ...]
THEMATIC ROLES: Agent, Patient, Instrument
SELECT RESTRICTIONS: Agent(int_control), Patient(concrete), Instrument(concrete)
FRAMES and PREDICATES:
  Basic Transitive: A V P
    cause(Agent,E) /\ manner(during(E),directedmotion,Agent) /\ manner(end(E),forceful,Agent) /\ contact(end(E),Agent,Patient)
  Conative: A V at P
    manner(during(E),directedmotion,Agent) /\ ¬contact(end(E),Agent,Patient)
  With/against alternation: A V I against/on P
    cause(Agent,E) /\ manner(during(E),directedmotion,Instr) /\ manner(end(E),forceful,Instr) /\ contact(end(E),Instr,Patient)
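Purely as an illustration, the entry above could be encoded along these lines; the field names are hypothetical and do not follow the actual VerbNet XML schema:

# Hypothetical encoding of the hit-18.1 class entry shown above.
hit_18_1 = {
    "members": {"bang": [1, 3], "bash": [1], "hit": [2, 4, 7, 10], "kick": [3]},
    "thematic_roles": ["Agent", "Patient", "Instrument"],
    "selectional_restrictions": {"Agent": "int_control",
                                 "Patient": "concrete",
                                 "Instrument": "concrete"},
    "frames": [
        {"name": "Basic Transitive", "syntax": "Agent V Patient",
         "semantics": ["cause(Agent,E)",
                       "manner(during(E),directedmotion,Agent)",
                       "manner(end(E),forceful,Agent)",
                       "contact(end(E),Agent,Patient)"]},
        {"name": "Conative", "syntax": "Agent V at Patient",
         "semantics": ["manner(during(E),directedmotion,Agent)",
                       "not contact(end(E),Agent,Patient)"]},
        {"name": "With/against alternation", "syntax": "Agent V Instrument against/on Patient",
         "semantics": ["cause(Agent,E)",
                       "manner(during(E),directedmotion,Instr)",
                       "manner(end(E),forceful,Instr)",
                       "contact(end(E),Instr,Patient)"]},
    ],
}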
CIS630 20
Penn
VERBNET
CIS630 21
Penn
VerbNet/WordNet
CIS630 22
Penn
Mapping WN-Hector via VerbNet
SIGLEX99, LREC00
CIS630 23
Penn
SENSEVAL2 – ACL’01
Adam Kilgarriff, Phil Edmonds and Martha Palmer
  All-words task: Czech, Dutch, English, Estonian
  Lexical sample task: Basque, Chinese, English, Italian, Japanese, Korean, Spanish, Swedish
CIS630 24
Penn
English Lexical Sample - Verbs
  Preparation for Senseval 2
    manual tagging of 29 highly polysemous verbs (call, draw, drift, carry, find, keep, turn, ...)
    WordNet (pre-release version 1.7)
  To handle unclear sense distinctions
    detect and eliminate redundant senses
    detect and cluster closely related senses
NOT ALLOWED
CIS630 25
Penn
WordNet – call, 28 senses
1. name, call -- (assign a specified, proper name to; "They named their son David"; "The new school was named after the famous Civil Rights leader") --> LABEL
2. call, telephone, call up, phone, ring -- (get or try to get into communication (with someone) by telephone; "I tried to call you all night"; "Take two aspirin and call me in the morning") --> TELECOMMUNICATE
3. call -- (ascribe a quality to or give a name of a common noun that reflects a quality; "He called me a bastard"; "She called her children lazy and ungrateful") --> LABEL
CIS630 26
Penn
WordNet – call, 28 senses
4. call, send for -- (order, request, or command to come; "She was called into the director's office"; "Call the police!") --> ORDER
5. shout, shout out, cry, call, yell, scream, holler, hollo, squall -- (utter a sudden loud cry; "she cried with pain when the doctor inserted the needle"; "I yelled to her from the window but she couldn't hear me") --> UTTER
6. visit, call in, call -- (pay a brief visit; "The mayor likes to call on some of the prominent citizens") --> MEET
CIS630 27
Penn
Groupings Methodology
  Double blind groupings, adjudication
  Syntactic Criteria (VerbNet was useful)
    Distinct subcategorization frames:
      call him a bastard / call him a taxi
    Recognizable alternations – regular sense extensions:
      play an instrument / play a song / play a melody on an instrument
CIS630 28
Penn
Groupings Methodology (cont.)
  Semantic Criteria
    Differences in semantic classes of arguments
      Abstract/concrete, human/animal, animate/inanimate, different instrument types, ...
    Differences in the number and type of arguments
      Often reflected in subcategorization frames
      John left the room. / I left my pearls to my daughter-in-law in my will.
    Differences in entailments
      Change of prior entity or creation of a new entity?
    Differences in types of events
      Abstract/concrete/mental/emotional/...
    Specialized subject domains
CIS630 29
Penn
WordNet: - call, 28 senses
WN2 , WN13,WN28 WN15 WN26
WN3 WN19 WN4 WN 7 WN8 WN9
WN1 WN22
WN20 WN25
WN18 WN27
WN5 WN 16 WN6 WN23
WN12
WN17 , WN 11 WN10, WN14, WN21, WN24
CIS630 30
Penn
WordNet: - call, 28 senses, groups
WN2, WN13,WN28 WN15 WN26
WN3 WN19 WN4 WN 7 WN8 WN9
WN1 WN22
WN20 WN25
WN18 WN27
WN5 WN 16 WN6 WN23
WN12
WN17 , WN 11 WN10, WN14, WN21, WN24,
Phone/radio
Label
Loud cry
Bird or animal cry
Request
Call a loan/bond
Visit
Challenge
Bid
CIS630 31
Penn
WordNet – call, 28 senses, Group 1
1. name, call -- (assign a specified, proper name to; "They named their son David"; "The new school was named after the famous Civil Rights leader") --> LABEL
3. call -- (ascribe a quality to or give a name of a common noun that reflects a quality; "He called me a bastard"; "She called her children lazy and ungrateful") --> LABEL
19. call -- (consider or regard as being; "I would not call her beautiful") --> SEE
22. address, call -- (greet, as with a prescribed form, title, or name; "He always addresses me with `Sir'"; "Call me Mister"; "She calls him by first name") --> ADDRESS
CIS630 32
Penn
Sense Groups: verb ‘develop’
WN1 WN2 WN3 WN4
WN6 WN7 WN8 WN5 WN 9 WN10
WN11 WN12 WN13 WN 14
WN19 WN20
CIS630 33
Penn
Results – averaged over 28 verbs

            Call    Develop   Total
WN/corpus   28/14   21/16     16.28/10.83
Grp/corp    11/7    9/6       8.07/5.90
Entropy     3.68    3.17      2.81
ITA-fine    69%     67%       71%
ITA-coarse  89%     85%       82%
CIS630 34
Penn
Maximum Entropy WSD – Hoa Dang (in progress)
  Maximum entropy framework
    combines different features with no assumption of independence
    estimates the conditional probability that W has sense X in context Y (where Y is a conjunction of linguistic features)
    feature weights are determined from training data
    weights produce a maximum entropy probability distribution
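In the standard maximum entropy formulation (assumed here, not spelled out above), that conditional probability is

  p(X | Y) = exp( Σ_i λ_i f_i(X, Y) ) / Σ_{X'} exp( Σ_i λ_i f_i(X', Y) )

where the f_i are (typically binary) features of sense X and context Y, and the weights λ_i are estimated from the training data.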
CIS630 35
Penn
Features used
  Topical contextual linguistic feature for W:
    presence of automatically determined keywords in S
  Local contextual linguistic features for W:
    presence of subject, complements
    words in subject, complement positions, particles, preps
    noun synonyms and hypernyms for subjects, complements
    named entity tag (PERSON, LOCATION, ...) for proper Ns
    words within +/- 2 word window
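A minimal sketch of such a classifier, assuming scikit-learn (multinomial logistic regression is equivalent to a conditional maximum entropy model); the feature templates below are illustrative, not the exact feature set used in these experiments:

# Sketch: maxent-style WSD over simple topical and local features.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def instance_features(tokens, target_index):
    """Topical + local contextual features for one occurrence of the target verb."""
    feats = {}
    for tok in tokens:                         # topical: bag of keywords in the sentence
        feats[f"kw={tok.lower()}"] = 1
    for offset in (-2, -1, 1, 2):              # local: words within a +/- 2 word window
        i = target_index + offset
        if 0 <= i < len(tokens):
            feats[f"w{offset}={tokens[i].lower()}"] = 1
    return feats

def train_wsd(train_instances, train_senses):
    """train_instances: list of (tokens, target_index); train_senses: gold sense labels."""
    X = [instance_features(toks, i) for toks, i in train_instances]
    model = make_pipeline(DictVectorizer(), LogisticRegression(max_iter=1000))
    model.fit(X, train_senses)
    return model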
CIS630 36
Penn
Grouping improved sense identification for MxWSD
  75% with training and testing on grouped senses vs. 43% with training and testing on fine-grained senses
  Most commonly confused senses suggest grouping:
    (1) name, call -- assign a specified proper name to; "They called their son David"
    (2) call -- ascribe a quality to or give a name that reflects a quality; "He called me a bastard"
    (3) call -- consider or regard as being; "I would not call her beautiful"
    (4) address, call -- greet, as with a prescribed form, title, or name; "Call me Mister"; "She calls him by his first name"
CIS630 37
Penn
Results – averaged over 28 verbs

            Total
WN/corpus   16.28/10.83
Grp/corp    8.07/5.90
Entropy     2.81
ITA-fine    71%
ITA-coarse  82%
MX-fine     59%
MX-coarse   69%
CIS630 38
Penn
Results - first 5 Senseval2 verbs

            Begin   Call    Carry   Develop  Draw    Dress
WN/corpus   10/9    28/14   39/22   21/16    35/21   15/8
Grp/corp    10/9    11/7    16/11   9/6      15/9    7/4
Entropy     1.76    3.68    3.97    3.17     4.60    2.89
ITA-fine    .812    .693    .607    .678     .767    .865
ITA-coarse  .814    .892    .753    .852     .825    1.00
MX-fine     .832    .470    .379    .493     .366    .610
MX-coarse   .832    .636    .485    .681     .512    .898
CIS630 39
Penn
Summary of WSD
  Choice of features is more important than choice of machine learning algorithm
  Importance of syntactic structure (English WSD but not Chinese)
  Importance of dependencies
  Importance of a hierarchical approach to sense distinctions, and quick adaptation to new usages
CIS630 40
Penn
Outline
  Introduction – need for semantics
  Sense tagging
    Issues highlighted by Senseval1
  VerbNet
  Senseval2 – groupings, impact on ITA
  Automatic WSD, impact on scores
  Proposition Bank
    Framesets, automatic role labellers
  Hierarchy of sense distinctions
  Mapping VerbNet to PropBank
CIS630 41
Penn
Proposition Bank: From Sentences to Propositions

Powell met Zhu Rongji
Powell met with Zhu Rongji
Powell and Zhu Rongji met
Powell and Zhu Rongji had a meeting
. . .
Proposition: meet(Powell, Zhu Rongji)

When Powell met Zhu Rongji on Thursday they discussed the return of the spy plane.
meet(Powell, Zhu)
discuss([Powell, Zhu], return(X, plane))

debate, consult, join, wrestle, battle
meet(Somebody1, Somebody2)
CIS630 42
Penn
Capturing semantic roles*

Charles broke [ARG1 the LCD Projector].
[ARG1 The windows] were broken by the hurricane.
[ARG1 The vase] broke into pieces when it toppled over.

SUBJ in each sentence: Charles / The windows / The vase
*See also Framenet, http://www.icsi.berkeley.edu/~framenet/
CIS630 43
Penn
A TreeBanked Sentence

[Tree diagram of the parse; the bracketed structure follows]
(S (NP-SBJ Analysts) (VP have (VP been (VP expecting
(NP (NP a GM-Jaguar pact) (SBAR (WHNP-1 that)
(S (NP-SBJ *T*-1) (VP would
(VP give (NP the U.S. car maker)
(NP (NP an eventual (ADJP 30 %) stake) (PP-LOC in (NP the British
company))))))))))))
Analysts have been expecting a GM-Jaguar pact that would give the U.S. car maker an eventual 30% stake in the British company.
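As an aside, the bracketed structure above can be read directly with NLTK's Tree class; a minimal sketch (whitespace rearranged, function tags and traces kept verbatim):

# Sketch: parse the bracketed Treebank string and pull out the subject NPs.
from nltk.tree import Tree

parse = Tree.fromstring("""
(S (NP-SBJ Analysts)
   (VP have
       (VP been
           (VP expecting
               (NP (NP a GM-Jaguar pact)
                   (SBAR (WHNP-1 that)
                         (S (NP-SBJ *T*-1)
                            (VP would
                                (VP give
                                    (NP the U.S. car maker)
                                    (NP (NP an eventual (ADJP 30 %) stake)
                                        (PP-LOC in (NP the British company))))))))))))
""")

for subtree in parse.subtrees(lambda t: t.label() == "NP-SBJ"):
    print(subtree.leaves())     # ['Analysts'] and ['*T*-1']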
CIS630 44
Penn
The same sentence, PropBanked

[Diagram: Arg0 (Analysts), REL (have been expecting), Arg1 (a GM-Jaguar pact); labeled bracketing below]
(S Arg0 (NP-SBJ Analysts) (VP have (VP been (VP expecting
Arg1 (NP (NP a GM-Jaguar pact) (SBAR (WHNP-1 that)
(S Arg0 (NP-SBJ *T*-1) (VP would
(VP give Arg2 (NP the U.S. car maker)
Arg1 (NP (NP an eventual (ADJP 30 %) stake) (PP-LOC in (NP the British
company))))))))))))
[Diagram: for "that would give" — Arg0 (*T*-1), Arg2 (the US car maker), Arg1 (an eventual 30% stake in the British company)]

expect(Analysts, GM-J pact)
give(GM-J pact, US car maker, 30% stake)
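Purely for exposition, those two propositions could be stored as simple records like these (the layout is hypothetical, not the PropBank file format):

# Hypothetical record layout for the two propositions read off the tree above.
propositions = [
    {"rel": "expect",
     "Arg0": "Analysts",
     "Arg1": "a GM-Jaguar pact that would give the U.S. car maker "
             "an eventual 30% stake in the British company"},
    {"rel": "give",
     "Arg0": "*T*-1 (a GM-Jaguar pact)",
     "Arg2": "the U.S. car maker",
     "Arg1": "an eventual 30% stake in the British company"},
]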
CIS630 45
Penn
English PropBank: http://www.cis.upenn.edu/~ace/
  1M words of Treebank over 2 years, May’01-03
  New semantic augmentations
    Predicate-argument relations for verbs
    Label arguments: Arg0, Arg1, Arg2, ...
  First subtask: 300K word financial subcorpus (12K sentences, 29K predicates, 1700 lemmas)
  Spin-off: Guidelines and FRAMES FILES (necessary for annotators)
    3500+ verbs with labeled examples, rich semantics, 118K predicates
CIS630 46
Penn
Frames Example: expect
Roles:
  Arg0: expecter
  Arg1: thing expected
Example: Transitive, active:
  Portfolio managers expect further declines in interest rates.
  Arg0: Portfolio managers
  REL: expect
  Arg1: further declines in interest rates
CIS630 47
Penn
Frames File example: give
Roles:
  Arg0: giver
  Arg1: thing given
  Arg2: entity given to
Example: double object
  The executives gave the chefs a standing ovation.
  Arg0: The executives
  REL: gave
  Arg2: the chefs
  Arg1: a standing ovation
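For exposition only, the two rolesets above could be summarized as below (actual PropBank frames files are XML; this dictionary layout and the .01 suffixes are assumptions on my part):

# Hypothetical summary of the expect and give rolesets shown above.
rolesets = {
    "expect.01": {"Arg0": "expecter", "Arg1": "thing expected"},
    "give.01":   {"Arg0": "giver", "Arg1": "thing given", "Arg2": "entity given to"},
}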
CIS630 48
Penn
How are arguments numbered?
  Examination of example sentences
  Determination of required / highly preferred elements
  Sequential numbering; Arg0 is the typical first argument, except for ergative/unaccusative verbs (shake example)
  Arguments mapped for "synonymous" verbs
CIS630 49
Penn
Trends in Argument Numbering
  Arg0 = agent
  Arg1 = direct object / theme / patient
  Arg2 = indirect object / benefactive / instrument / attribute / end state
  Arg3 = start point / benefactive / instrument / attribute
  Arg4 = end point
CIS630 50
Penn
Additional tags (arguments or adjuncts?)
Variety of ArgM’s (Arg# > 4):
  TMP - when?
  LOC - where at?
  DIR - where to?
  MNR - how?
  PRP - why?
  REC - himself, themselves, each other
  PRD - this argument refers to or modifies another
  ADV - others
CIS630 51
Penn
Inflection
  Verbs also marked for tense/aspect
    Passive/Active
    Perfect/Progressive
    Third singular (is, has, does, was)
    Present/Past/Future
    Infinitives/Participles/Gerunds/Finites
  Modals and negation marked as ArgMs
CIS630 52
Penn
Phrasal Verbs
  Put together, Put in, Put off, Put on, Put out, Put up, ...
CIS630 53
Penn
Ergative/Unaccusative Verbs: rise
Roles:
  Arg1 = Logical subject, patient, thing rising
  Arg2 = EXT, amount risen
  Arg3* = start point
  Arg4 = end point

Sales rose 4% to $3.28 billion from $3.16 billion.

*Note: Have to mention the prep explicitly, Arg3-from, Arg4-to, or could have used ArgM-Source, ArgM-Goal. Arbitrary distinction.
CIS630 54
Penn
Synonymous Verbs: add in sense "rise"
Roles:
  Arg1 = Logical subject, patient, thing rising/gaining/being added to
  Arg2 = EXT, amount risen
  Arg4 = end point
The Nasdaq composite index added 1.01 to 456.6 on paltry volume.
CIS630 55
Penn
Annotation procedure
  Extraction of all sentences with given verb
  First pass: automatic tagging (Joseph Rosenzweig)
    http://www.cis.upenn.edu/~josephr/TIDES/index.html#lexicon
  Second pass: double blind hand correction
    Variety of backgrounds
    Less syntactic training than for treebanking
    Tagging tool highlights discrepancies
  Third pass: Solomonization (adjudication)
CIS630 56
Penn
Inter-Annotator Agreement

[Bar chart: Percentage Agreement, scale 0-100]
CIS630 57
Penn
Solomonization

Also, substantially lower Dutch corporate tax rates helped the company keep its tax outlay flat relative to earnings growth.

*** Kate said:
  arg0: the company
  arg1: its tax outlay
  arg3-PRD: flat
  argM-MNR: relative to earnings growth
*** Katherine said:
  arg0: the company
  arg1: its tax outlay
  arg3-PRD: flat
  argM-ADV: relative to earnings growth
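A minimal sketch of how per-constituent agreement might be computed for an example like this one (observed agreement only; the deck does not spell out the exact metric behind the chart above):

# Sketch: fraction of annotated spans on which two annotators assign the same label.
def agreement(ann_a, ann_b):
    spans = set(ann_a) | set(ann_b)
    return sum(ann_a.get(s) == ann_b.get(s) for s in spans) / len(spans)

kate      = {"the company": "arg0", "its tax outlay": "arg1",
             "flat": "arg3-PRD", "relative to earnings growth": "argM-MNR"}
katherine = {"the company": "arg0", "its tax outlay": "arg1",
             "flat": "arg3-PRD", "relative to earnings growth": "argM-ADV"}
print(agreement(kate, katherine))   # 0.75 -- one label disagreement out of four spans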
CIS630 58
Penn
Automatic Labelling of Semantic Relations
Features:
  Predicate
  Phrase Type
  Parse Tree Path
  Position (before/after predicate)
  Voice (active/passive)
  Head Word
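Of these, the parse tree path is the least obvious; a minimal sketch with NLTK follows (the ^/! notation and the helper are mine, following a Gildea & Jurafsky-style path from the predicate up to the lowest common ancestor and back down to the constituent):

# Sketch: parse tree path feature between a predicate token and a candidate constituent.
from nltk.tree import Tree

def tree_path(tree, pred_leaf, constituent_pos):
    """pred_leaf: leaf index of the predicate; constituent_pos: tree position (tuple)
    of the candidate argument. Returns e.g. 'VBG^VP^VP^VP^S!NP-SBJ'."""
    pred_pos = tree.leaf_treeposition(pred_leaf)[:-1]      # node just above the word
    i = 0                                                  # longest common prefix = LCA
    while (i < min(len(pred_pos), len(constituent_pos))
           and pred_pos[i] == constituent_pos[i]):
        i += 1
    up = [tree[pred_pos[:j]].label() for j in range(len(pred_pos), i - 1, -1)]
    down = [tree[constituent_pos[:j]].label() for j in range(i + 1, len(constituent_pos) + 1)]
    return "^".join(up) + ("!" + "!".join(down) if down else "")

t = Tree.fromstring("(S (NP-SBJ (NNS Analysts)) (VP (VBP have) (VP (VBN been) "
                    "(VP (VBG expecting) (NP (DT a) (NN pact))))))")
print(tree_path(t, 3, (0,)))    # VBG^VP^VP^VP^S!NP-SBJ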
CIS630 59
Penn
Labelling Accuracy - Known Boundaries

Parses          Framenet   PropBank   PropBank > 10 instances
Automatic       82.0       73.6       79.6
Gold Standard   -          77.0       83.1

Accuracy of semantic role prediction for known boundaries -- the system is given the constituents to classify. Framenet examples (training/test) are handpicked to be unambiguous.
CIS630 60
Penn
Labelling Accuracy – Unknown Boundaries
Parses          PropBank              Framenet
                Precision   Recall    Precision   Recall
Automatic       57.7        50.0      64.6        61.2
Gold Standard   71.1        64.4      -           -
Accuracy of semantic role prediction for unknown boundaries--the system must identify the constituents as arguments and give them the correct roles.
CIS630 61
Penn
Additional Automatic Role Labelers
  Szuting Yi – EM clustering (unsupervised); Conditional Random Fields
  Jinying Chen – using role labels as features for WSD; decision trees (supervised); EM clustering (unsupervised)
CIS630 62
Penn
Outline
  Introduction – need for semantics
  Sense tagging
    Issues highlighted by Senseval1
  VerbNet
  Senseval2 – groupings, impact on ITA
  Automatic WSD, impact on scores
  Proposition Bank
    Framesets, automatic role labellers
  Hierarchy of sense distinctions
  Mapping VerbNet to PropBank
CIS630 63
Penn
Frames: Multiple Framesets
  Framesets are not necessarily consistent between different senses of the same verb
  A verb with multiple senses can have multiple frames, but not necessarily
  Roles and mappings onto argument labels are consistent between different verbs that share similar argument structures (similar to Framenet)
  Levin / VerbNet classes: http://www.cis.upenn.edu/~dgildea/VerbNet
  Out of the 720 most frequent verbs:
    1 frameset: 470
    2 framesets: 155
    3+ framesets: 95 (includes light verbs)
CIS630 64
Penn
Word Senses in PropBank
  Orders to ignore word sense not feasible for 700+ verbs
    Mary left the room
    Mary left her daughter-in-law her pearls in her will
  Frameset leave.01 "move away from":
    Arg0: entity leaving
    Arg1: place left
  Frameset leave.02 "give":
    Arg0: giver
    Arg1: thing given
    Arg2: beneficiary
How do these relate to traditional word senses as in WordNet?
CIS630 65
Penn
WordNet: - leave, 14 senses
WN1 WN5 WN3 WN7
WN8
WN2 WN12 WN9 WN10 WN13
WN14
WN4
WN6 WN11
CIS630 66
Penn
WordNet: - leave, groups
WN1 WN5 WN3 WN7
WN8
WN2 WN12 WN9 WN10 WN13
WN14
WN4
WN6 WN11
CIS630 67
Penn
WordNet: - leave, framesets
WN1 WN5 WN3 WN7
WN8
WN2 WN12 WN9 WN10 WN13
WN14
WN4
WN6 WN11
CIS630 68
Penn
Overlap between Groups and Framesets – 95%
WN1 WN2 WN3 WN4
WN6 WN7 WN8 WN5 WN 9 WN10
WN11 WN12 WN13 WN 14
WN19 WN20
Frameset1
Frameset2
develop
CIS630 69
Penn
Sense Hierarchy
  Framesets – coarse grained distinctions
  Sense Groups (Senseval-2) – intermediate level (includes Levin classes) – 95% overlap
  WordNet – fine grained distinctions
CIS630 70
Penn
leave.01 - move away from
  VerbNet Levin class: escape-51.1-1; WordNet Senses: WN 1, 5, 8
  Thematic Roles: Location[+concrete], Theme[+concrete]
  Frames with Semantics
    Basic Intransitive: "The convict escaped"
      motion(during(E),Theme) direction(during(E),Prep,Theme,?Location)
    Intransitive (+ path PP): "The convict escaped from the prison"
    Locative Preposition Drop: "The convict escaped the prison"
CIS630 71
Penn
leave.02 - give
  VerbNet Levin class: future_having-13.3; WordNet Senses: WN 2, 10, 13
  Thematic Roles: Agent[+animate OR +organization], Recipient[+animate OR +organization], Theme[]
  Frames with Semantics
    Dative: "I promised somebody my time"  Agent V Recipient Theme
      has_possession(start(E),Agent,Theme) future_possession(end(E),Recipient,Theme) cause(Agent,E)
    Transitive (+ Recipient PP): "We offered our paycheck to her"  Agent V Theme Prep(to) Recipient
    Transitive (Theme Object): "I promised my house (to somebody)"  Agent V Theme
CIS630 72
Penn
PropBank to VN mapping (from Text Meaning workshop)
  Cluster verbs based on frames of arg labels
    K-nearest neighbors
    EM
  Compare derived clusters to VerbNet classes
    sim(X, Y) = |X ∩ Y| / |X ∪ Y|
  Only a rough measure:
    Not all verbs in VerbNet are attested in PropBank
    Not all verbs in PropBank are treated in VerbNet
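A minimal sketch of the comparison, on the assumption that the similarity above is plain set overlap (Jaccard); the verb sets are toy examples:

# Sketch: overlap between an induced verb cluster and a VerbNet class.
def jaccard(cluster, verbnet_class):
    x, y = set(cluster), set(verbnet_class)
    return len(x & y) / len(x | y) if x | y else 0.0

derived_cluster  = {"hit", "kick", "bash", "bang"}
hit_18_1_members = {"hit", "kick", "bash", "bang", "batter", "beat"}
print(jaccard(derived_cluster, hit_18_1_members))   # 4 shared / 6 total ≈ 0.67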
PropBank Frame for Clustering
For [Arg4 Mr. Sherwin], [Arg0 a conviction] could [Rel carry] [Arg1 penalties of five years in prison and a $250,000 fine on each count] (wsj_1331)

reduces to:

arg4 arg0 rel arg1
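A minimal sketch of this reduce-and-cluster step, assuming scikit-learn and using KMeans as a stand-in for the EM / k-nearest-neighbour methods mentioned above; the instance data is invented for illustration:

# Sketch: cluster verbs by the distribution of their argument-label signatures.
from collections import Counter, defaultdict
from sklearn.feature_extraction import DictVectorizer
from sklearn.cluster import KMeans

# Toy (verb, signature) pairs, as if extracted from PropBank annotations.
instances = [("carry", "arg4 arg0 rel arg1"), ("carry", "arg0 rel arg1"),
             ("give", "arg0 rel arg2 arg1"), ("hand", "arg0 rel arg2 arg1"),
             ("fall", "arg1 rel arg4")]

by_verb = defaultdict(Counter)
for verb, signature in instances:
    by_verb[verb][signature] += 1

verbs = sorted(by_verb)
X = DictVectorizer().fit_transform([by_verb[v] for v in verbs])
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(dict(zip(verbs, labels)))     # verbs with similar frame distributions cluster together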
Frameset tags: ~7K annotations, 200 schemas, 921 verbs
  1 = transitive, 2 = ditransitive, 3 = unaccusative
[Plot: Average Similarity vs. Number of Clusters (3-150); similarity values range from 0.00 to 0.11]
Adding to VerbNet Classes
  36.3 'combative meetings': fight, consult, ...
  Clustering analysis adds hedge
    Hedge one's bets against ...
    But some investors might prefer a simpler strategy than hedging their individual holdings (wsj_1962)
    Thus, buying puts after a big market slide can be an expensive way to hedge against risk (wsj_2415)
CIS630 77
Penn
Lexical Semantics at Penn
Annotation of Penn Treebank with semantic role labels (propositions) and sense tags
Links to VerbNet and WordNet
Provides additional semantic information that clearly distinguishes verb senses
Class based to facilitate extension to previously unseen usages
CIS630 78
Penn
PropBank I

Also, [Arg0 substantially lower Dutch corporate tax rates] helped [Arg1 [Arg0 the company] keep [Arg1 its tax outlay] [Arg3-PRD flat] [ArgM-ADV relative to earnings growth]].

help: REL = helped, Arg0 = tax rates, Arg1 = the company keep its tax outlay flat
keep: REL = keep, Arg0 = the company, Arg1 = its tax outlay, Arg3-PRD = flat, ArgM-ADV = relative to earnings growth

PropBank I additions: event variables (ID# h23, k16); nominal reference; sense tags (help2,5, tax rate1, keep1, company1); discourse connectives