

Semantic role labeling

Christopher Potts

CS 244U: Natural language understanding
Feb 2

With diagrams and ideas from Scott Wen-tau Yih, Kristina Toutanova, Dan Jurafsky, Sameer Pradhan, Chris Manning, Charles Fillmore, Martha Palmer, and Ed Loper.

1 / 46


Plan and goals

“There is perhaps no concept in modern syntactic and semantic theorywhich is so often involved in so wide a range of contexts, but on whichthere is so little agreement as to its nature and definition, as thematicrole” (Dowty 1991:547)

1. Semantic roles as a useful shallow semantic representation
2. Resources for studying semantic roles:
   • FrameNet
   • PropBank
3. Semantic role labeling:
   • Identification: which phrases are role-bearing?
   • Classification: for role-bearing phrases, what roles do they play?
   • Evaluation
   • Tools
4. Approaches to semantic role labeling:
   • Models
   • Local features
   • Global and joint features

2 / 46



Common high-level roles

Definitions adapted from http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsASemanticRole.htm

• Agent: a person or thing who is the doer of an event

• Patient/Theme: affected entity in the event; undergoes the action

• Experiencer: receives, accepts, experiences, or undergoes the effect of an action

• Stimulus: the thing that is felt or perceived

• Goal: place to which something moves, or thing toward which an action is directed.

• Recipient (sometimes grouped with Goal): the one who receives something

• Source (sometimes grouped with Goal): place or entity of origin

• Instrument: an inanimate thing that an Agent uses to implement an event

• Location: identifies the location or spatial orientation of a state or action

• Manner: how the action, experience, or process of an event is carried out.

• Measure: notes the quantification of an event

(Dowty 1991:§3 on how, ahh, extensive and particular these lists can become)

3 / 46


Examples

1 [Agent Doris] caught [Theme the ball] with [Instrument her mitt].

2 [Agent Sotheby’s] offered [Recipient the heirs] [Theme a money-back guarantee].

3 [Stimulus The response] dismayed [Experiencer the group].

4 [Experiencer The group] disliked [Stimulus the response].

5 [Agent Kim] sent [Theme a stern letter] to [Goal the company].
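These bracketed annotations are just labeled spans over a token sequence. A minimal sketch of one way to represent and render them in Python (the token indices and the helper are illustrative, not from any of the corpora discussed below):

# Example (1) as a predicate plus role-labeled token spans (half-open indices).
sentence = "Doris caught the ball with her mitt".split()
roles = [("Agent", (0, 1)), ("Theme", (2, 4)), ("Instrument", (5, 7))]

def bracketed(tokens, roles):
    """Render role-labeled spans in the bracketed style used on these slides."""
    out = tokens[:]
    for role, (start, end) in roles:
        out[start] = "[%s %s" % (role, out[start])
        out[end - 1] = out[end - 1] + "]"
    return " ".join(out)

print(bracketed(sentence, roles))
# [Agent Doris] caught [Theme the ball] with [Instrument her mitt]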

4 / 46


Roles and morpho-syntactic diversity

Kim sent Sandy a letter.
Kim sent a letter to Sandy.
A letter was sent to Sandy by Kim.
Sandy was sent a letter by Kim.
Agent: Kim, Theme: a letter, Goal: Sandy

Kim criticized the administration.
Kim demanded the resignation.
The compromise was rejected by Kim.
Kim paid the check.
Agent: Kim, Theme: *

The storm frightened Sandy.
Sandy feared the storm.
Experiencer: Sandy, Stimulus: the storm

Sam froze the ice cubes.
The ice cubes froze.

Jed ate the pizza.
Jed ate.

Edith cut the bread easily.
The bread cut easily.

5 / 46


Applications

The applications tend to center around places where we want a semantics that abstracts away from syntactic differences:

• Question answering (abstract Q/A alignment)

• Translation (abstract source/target alignment)

• Information extraction (grouping conceptually related events)

• High-level semantic summarization (what role does Obama/Gingrich/Romney typically play in media coverage?)

• . . .

6 / 46


Let’s annotate some data!

• Agent
• Patient/Theme
• Experiencer
• Stimulus
• Goal
• Recipient
• Source
• Instrument
• Location
• Manner
• Measure

1. [Agent Doris] hid [Theme the money] [Location in the jar].
2. [Agent Sam] broke [Theme the flowerpot].
3. [Theme The flowerpot] broke.
4. [Stimulus The storm] frightened [Experiencer Sam].
5. [Agent The speaker] told [Theme a story].
6. [Source The watch] told [Theme the time]. ???
7. [Agent Italians] make [Theme great desserts].
8. [Source? Cookies] make [Pred? great desserts]. ???

7 / 46


Challenges and responses

Challenges (from Dowty 1991:§3)
• Roles are hard to define/delimit.
• It can be hard to know which meaning contrasts are role-related and which belong to other domains, especially:
  • lexical influences that subdivide roles very finely;
  • conceptual domains that cross-cut role distinctions;
  • information structuring.

Responses
• Dowty (1991): argue forcefully for a tiny set of very general roles.
• PropBank: adopt a small set of roles as a matter of convenience, or to change the subject.
• FrameNet: different role sets for different semantic domains, with some abstract connections between domains.

8 / 46


A brief history of semantic roles

1. Common in descriptive grammars dating back to the origins of linguistics, where they are used to informally classify predicates and case morphology.
2. Fillmore (1968) proposes an abstract theory of Case to capture underlying semantic relationships that affect/guide syntactic expression.
3. Syntacticians seek to discover patterns in how thematic (theta) roles are expressed syntactically (linking theory), and in how roles relate to each other and to other properties (e.g., animacy).
4. In linguistics, lexical semantics is currently a thriving area in which one of the central concerns is to find systematic connections between different argument realizations (Levin and Rappaport Hovav 2005).
5. Early SRL systems based on rule sets designed for specific texts (Simmons 1973; Riesbeck 1975).
6. The FrameNet project (Baker et al. 1998; Fillmore and Baker 2001) continues the research line begun by Fillmore.
7. Gildea and Jurafsky (2000, 2002) are among the very first to use resources like FrameNet to train general-purpose SRL systems.
8. PropBank (Palmer et al. 2005) provides comprehensive annotations for a section of the Penn Treebank, facilitating experiments of the sort that dominate NLP currently.

9 / 46


PropBank 1 (Palmer et al. 2005)

• A subset of the Wall Street Journal section of the Penn Treebank 2:
  • the version number is important; v1 and v3 will be misaligned in places
  • the subdirectory is combined/wsj/, which contains subdirectories of .mrg files
• 112,917 annotated examples (all centered around verbs)
• 3,257 unique verbs
• Core arguments numbered; peripheral arguments labeled
• Contains only verbs and their arguments
• Stand-off annotations:
  • data/prop.txt: one example per row, indexed to the Treebank files
  • data/verbs.txt: the list of verbs (by type)
  • data/vloc.txt: format — filename, tree number, string index, verb lemma
  • data/frames: directory containing verbal frame files (XML)

10 / 46


PropBank frames and labels

Frame: increase.01
• name: go up incrementally
• vncls: 45.4, 45.6
• ARG0: causer of increase (vntheta: Agent)
• ARG1: thing increasing (vntheta: Patient)
• ARG2: amount increased by, EXT or MNR (vntheta: Extent)
• ARG3: start point (vntheta: –)
• ARG4: end point (vntheta: –)

Examples

1. [ARG0 The Polish government] [rel increased] [ARG1 home electricity charges] [ARG2-EXT by 150%].
2. [ARG1 The nation's exports] [rel increased] [ARG2-EXT 4%] [ARG4 to $50.45 billion].
3. [ARG1 Output] will be [ARG2-MNR gradually] [rel increased].

11 / 46


Example: first row of prop.txt

Field          Value
wsj-filename   wsj/00/wsj_0001.mrg
sentence       0
terminal       8
tagger         gold
frameset       join.01
inflection     vf--a
proplabel      0:2-ARG0
proplabel      7:0-ARGM-MOD
proplabel      8:0-rel
proplabel      9:1-ARG1
proplabel      11:1-ARGM-PRD
proplabel      15:1-ARGM-TMP

(only a subset of the ARGs labeled to avoid clutter)

rel (verb) inflection fields ('-' means no value):
1. form: i=inf, g=gerund, p=part, v=finite
2. tense: f=future, p=past, n=present
3. aspect: p=perfect, o=progressive, b=both perfect & progressive
4. person: 3=3rd person
5. voice: a=active, p=passive

Labels:
rel   the verb
ARGA  causative agent
ARGM  adjuncts
ARG0  generally subj
ARG1  generally dobj
ARG2  generally iobj
...

Function tags:
EXT  extent
DIR  direction
LOC  location
TMP  temporal
REC  reciprocal
PRD  predication
NEG  negation
MOD  modal
ADV  adverbial
MNR  manner
CAU  cause
PNC  purpose not cause
DIS  discourse
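Putting the format together, here is a minimal sketch of a parser for one prop.txt row, assuming the whitespace-separated field layout above (the function names are illustrative). It also handles the ','-joined split arguments and '*'-linked trace chains shown on the next two slides:

def parse_pointer(ptr):
    """'28:1,30:1*32:1*33:0' -> [[(28, 1), (30, 1)], [(32, 1)], [(33, 0)]].
    '*' links the members of a trace chain; ',' joins the pieces of a
    split argument; each piece is terminal-index:height-in-tree."""
    return [[tuple(int(x) for x in piece.split(':')) for piece in link.split(',')]
            for link in ptr.split('*')]

def parse_prop_line(line):
    fields = line.split()
    args = []
    for proplabel in fields[6:]:
        ptr, label = proplabel.split('-', 1)  # labels can contain '-' (ARGM-TMP)
        args.append((label, parse_pointer(ptr)))
    return {
        'file': fields[0],        # e.g. wsj/00/wsj_0001.mrg
        'sentence': int(fields[1]),
        'terminal': int(fields[2]),
        'tagger': fields[3],      # 'gold'
        'frameset': fields[4],    # e.g. 'join.01'
        'inflection': fields[5],  # form/tense/aspect/person/voice, e.g. 'vf--a'
        'args': args,
    }

row = 'wsj/00/wsj_0001.mrg 0 8 gold join.01 vf--a 0:2-ARG0 8:0-rel 9:1-ARG1'
print(parse_prop_line(row)['args'])
# [('ARG0', [[(0, 2)]]), ('rel', [[(8, 0)]]), ('ARG1', [[(9, 1)]])]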

12 / 46


Trace paths and discontinuity

Traces: 2:1*0:1-ARG0
Split args: 1:0,2:0-rel
(tree diagrams not reproduced)

13 / 46


Trace chains and discontinuity combined

28:1,30:1*32:1*33:0-ARG0
(tree diagram not reproduced)
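The parse_pointer sketch above handles this combined case directly:

print(parse_pointer('28:1,30:1*32:1*33:0'))
# [[(28, 1), (30, 1)], [(32, 1)], [(33, 0)]]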

14 / 46


PropBank tools

• Web browser: http://verbs.colorado.edu/verb-index/index.php
• Stanford JavaNLP: http://nlp.stanford.edu/software/framenet.shtml
• Python NLTK:
  http://nltk.sourceforge.net/corpus.html#propbank-corpus
  http://nltk.googlecode.com/svn/trunk/doc/api/nltk.corpus.reader.propbank-module.html

15 / 46


NLTK interface to PropBank: example level

>>> import nltk.data; nltk.data.path = ['/path/to/penn-treebank2/'] + nltk.data.path
>>> from nltk.corpus import propbank
>>> pb = propbank.instances()
>>> len(pb)
112917
>>> len(propbank.verbs())
3257

########## Grab the first sentence, the one we looked at before:
>>> i0 = pb[0]
>>> i0.fileid
'wsj_0001.mrg'
>>> i0.sentnum
0
>>> i0.wordnum
8
>>> i0.inflection.tense
'f'
>>> i0.inflection.aspect
'-'
>>> i0.inflection.person
'-'
>>> i0.inflection.voice
'a'
>>> i0.roleset
'join.01'
>>> i0.arguments
((PropbankTreePointer(0, 2), 'ARG0'), (PropbankTreePointer(7, 0), 'ARGM-MOD'),
 (PropbankTreePointer(9, 1), 'ARG1'), (PropbankTreePointer(11, 1), 'ARGM-PRD'),
 (PropbankTreePointer(15, 1), 'ARGM-TMP'))

16 / 46


NLTK interface to PropBank: example level (continued)

>>> i0.tree.pprint()
'(S
  (NP-SBJ
    (NP (NNP Pierre) (NNP Vinken))
    (, ,)
    (ADJP (NP (CD 61) (NNS years)) (JJ old))
    (, ,))
  (VP
    (MD will)
    (VP
      (VB join)
      (NP (DT the) (NN board))
      (PP-CLR (IN as) (NP (DT a) (JJ nonexecutive) (NN director)))
      (NP-TMP (NNP Nov.) (CD 29))))
  (. .))'

>>> i0.predicate.select(i0.tree)
Tree('VB', ['join'])

>>> i0.arguments[0][0].select(i0.tree).pprint()
'(NP-SBJ
  (NP (NNP Pierre) (NNP Vinken))
  (, ,)
  (ADJP (NP (CD 61) (NNS years)) (JJ old))
  (, ,))'

17 / 46


NLTK interface to PropBank: frame level

>>> from nltk.etree import ElementTree
>>> j = propbank.roleset('join.01')
>>> j
<Element 'roleset' at 0x3b781a0>
>>> ElementTree.tostring(j)
<roleset id="join.01" name="attach" vncls="22.1-2">
  <roles>
    <role descr="agent, entity doing the tying" n="0">
      <vnrole vncls="22.1-2" vntheta="Agent" /></role>
    <role descr="patient, thing(s) being tied" n="1">
      <vnrole vncls="22.1-2" vntheta="Patient1" /></role>
    <role descr="instrument, string" n="2">
      <vnrole vncls="22.1-2" vntheta="Patient2" /></role>
  </roles>
  <example name="straight transitive">
  ...

>>> for r in j.findall('roles/role'): print 'ARG' + r.attrib['n'], r.attrib['descr']
ARG0 agent, entity doing the tying
ARG1 patient, thing(s) being tied
ARG2 instrument, string

18 / 46


A more advanced example: argument number and theta role alignment

from collections import defaultdict
from operator import itemgetter

import nltk.data; nltk.data.path = ['/path/to/penn-treebank2/'] + nltk.data.path
from nltk.corpus import propbank

def role_iterator():
    for verb in propbank.verbs():
        index = 1
        while True:
            roleset_id = '%s.%s' % (verb, str(index).zfill(2))
            try:
                for role in propbank.roleset(roleset_id).findall('roles/role'):
                    yield role
                index += 1
            except ValueError:  # no roleset with this index for this verb
                break

def view_arg_theta_alignment(n):
    counts = defaultdict(int)
    for role in role_iterator():
        if role.attrib['n'] == n:
            counts[role.attrib['descr']] += 1
    # View the result, sorted from most to least common theta role:
    total = float(sum(counts.values()))
    for vtheta, count in sorted(counts.items(), key=itemgetter(1), reverse=True):
        print vtheta, count, round(count / total, 2)

19 / 46


Argument number and theta role alignment: examples

view_arg_theta_alignment('0'):
  causer 96 0.023
  speaker 66 0.016
  agent, causer 46 0.011
  causal agent 45 0.011
  entity in motion 41 0.01
  giver 35 0.008
  causer, agent 31 0.007
  cause, agent 29 0.007
  creator 29 0.007
  agent 20 0.005
  thinker 19 0.005
  cutter 19 0.005
  agent, hitter - animate only! 18 0.004
  builder 17 0.004
  describer 16 0.004
  Agent 15 0.004
  ... (2,454 vtheta types)

view_arg_theta_alignment('1'):
  utterance 77 0.017
  path 41 0.009
  entity in motion 26 0.006
  thing hit 25 0.006
  victim 22 0.005
  commodity 21 0.005
  impelled agent 21 0.005
  experiencer 19 0.004
  thing given 19 0.004
  topic 17 0.004
  thing changing 17 0.004
  Logical subject, patient, thing falling 17 0.004
  thing in motion 17 0.004
  food 16 0.004
  construction 15 0.003
  subject 14 0.003
  ... (2,842 vtheta types)

view_arg_theta_alignment('2'):
  instrument 93 0.04
  hearer 61 0.026
  benefactive 53 0.023
  EXT 42 0.018
  attribute 40 0.017
  source 36 0.015
  destination 32 0.014
  attribute of arg1 29 0.012
  instrument, if separate from arg0 26 0.011
  impelled action 22 0.009
  listener 21 0.009
  end state 20 0.009
  instrument, thing hit by or with 19 0.008
  location 19 0.008
  EXT, amount fallen 18 0.008
  recipient 17 0.007
  ... (1,125 vtheta types)

20 / 46


Dependency relations and PropBank core semantic roles

Dep       ARG0    ARG1   ARG2   ARG3   ARG4
nsubj   32,564  13,034    995     42      1
dobj       340  16,416    971     79      9
iobj         4      65    195     24      1
pobj        53     246     14      0      0

21 / 46


PropBank summary

Virtues
• Full gold-standard parses.
• Full coverage of a single collection of documents — one of the most heavily annotated document collections in the world.
• Different levels of role granularity.

Limitations
• ARG2-5 are overloaded. FrameNet and VerbNet both provide more fine-grained role labels.
• The WSJ is too domain specific and too financial.
• Only verbs are covered; in language, nouns and adjectives also have role arguments.

22 / 46


FrameNet
Data source: https://framenet.icsi.berkeley.edu/fndrupal/current_status

• Database of 12,379 lexical units (7,963 fully annotated).
• 1,135 distinct semantic frames (1,020 lexical; 115 non-lexical).
• 188,682 annotation sets (162,643 lexicographic; 26,039 full text).
• The 'net' part: words are related in numerous ways via their frames.

FrameNet example [Fillmore et al. 01]:

Frame: Hit_target
• Lexical units (LUs): words that evoke the frame (usually verbs): hit, pick off, shoot
• Frame elements (FEs): the involved semantic roles
  • Core: Agent, Target, Instrument
  • Non-Core: Manner, Means, Place, Purpose, Subregion, Time

[Agent Kristina] hit [Target Scott] [Instrument with a baseball] [Time yesterday].

23 / 46


Background ideas (see Ruppenhofer et al. 2006)

Theoretical assumptions
• Word meanings are best understood in terms of the semantic/conceptual structures (frames) which they presuppose.
• Words and grammatical constructions evoke frames and their elements.

Goals
• To discover and describe the frames that support lexical meanings.
• To provide names for the relevant elements of those frames.
• To describe the syntactic/semantic valence of the words that fit the frames.
• To base the whole process on attestations from a corpus.

The focus is on the frames and their connections. Role labeling is necessary but secondary.

24 / 46


Example domains and frames

[Figure 1 of Gildea and Jurafsky 2002: sample domains and frames from the FrameNet lexicon. The Communication domain includes the frames Conversation (frame elements: Protagonist-1, Protagonist-2, Protagonists, Topic, Medium; lexical units: argue.v, banter.v, debate.v, converse.v, gossip.v, dispute.n, discussion.n, tiff.n), Questioning, and Statement (frame elements for both: Speaker, Addressee, Message, Topic, Medium). The Cognition domain includes Judgment (frame elements: Judge, Evaluee, Reason, Role; lexical units: blame.v, blame.n, fault.n, admire.v, admiration.n, disapprove.v, appreciate.v) and Categorization (frame elements: Cognizer, Item, Category, Criterion).]

Many of these sets of roles have been proposed by linguists as part of theories of linking, the part of grammatical theory that describes the relationship between semantic roles and their syntactic realization. Other sets have been used by computer scientists in implementing natural language understanding systems. As a rule, the more abstract roles have been proposed by linguists, who are more concerned with explaining generalizations across verbs in the syntactic realization of their arguments, whereas the more specific roles have more often been proposed by computer scientists, who are more concerned with the details of the realization of the arguments of specific verbs.

The FrameNet project (Baker, Fillmore, and Lowe 1998) proposes roles that are neither as general as the 10 abstract thematic roles, nor as specific as the thousands of potential verb-specific roles. FrameNet roles are defined for each semantic frame. A frame is a schematic representation of situations involving various participants, props, and other conceptual roles (Fillmore 1976). For example, the frame Conversation, shown in Figure 1, is invoked by the semantically related verbs argue, banter, debate, converse, and gossip, as well as the nouns dispute, discussion, and tiff, and is defined as follows:

(7) Two (or more) people talk to one another. No person is construed as only a speaker or only an addressee. Rather, it is understood that both (or all) participants do some speaking and some listening: the process is understood to be symmetrical or reciprocal.

The roles defined for this frame, and shared by all its lexical entries, include Protagonist-1 and Protagonist-2 or simply Protagonists for the participants in the conversation, as well as Medium and Topic. Similarly, the Judgment frame mentioned above has the roles Judge, Evaluee, and Reason and is invoked by verbs such as blame, admire, and praise and nouns such as fault and admiration. We refer to the roles for a given frame as frame elements. A number of hand-annotated examples from the Judgment frame are included below to give a flavor of the FrameNet database:

(8) [Judge She] blames [Evaluee the Government] [Reason for failing to do enough to help].

(9) Holman would characterise this as blaming [Evaluee the poor].

(From Gildea and Jurafsky 2002:249)

25 / 46


From strings to frames

(slide figures not reproduced)

26 / 46



Full text annotations

From https://framenet.icsi.berkeley.edu/fndrupal/index.php?q=fulltextIndex

• American National Corpus Texts

• AQUAINT Knowledge-Based Evaluation Texts

• LUCorpus-v0.3

• Miscellaneous

• Texts from Nuclear Threat Initiative website, created by Center for Non-Proliferation Studies

• Wall Street Journal Texts from the PropBank Project

27 / 46


Gildea and Jurafsky (2000, 2002) FrameNet experiment format

From their training set:

body/action/arch.v.ar:<S TPOS="30621249"> <C TARGET="y"> Arch/VVB </C> <C FE="Agt" PT="CNI"> </C> <C FE="BPrt"> your/DPS back/NN1 </C> <C FE="Degr"> as/AV0 high/AJ0-AV0 as/CJS you/PNP can/VM0 </C> and/CJC drop/VVB your/DPS head/NN1 </S>(TOP (S (VP (VP (VB Arch) (NP (PRP$ your) (NN back)) (ADVP (ADVP (RB as) (JJ high)) (SBAR (IN as) (S (NP (PRP you)) (VP (MD can)))))) (CC and) (VP (VB drop) (NP (PRP$ your) (NN head))))))

body/action/arch.v.ar:<S TPOS="67141515"> <T TYPE="Canonical"> </T> She/PNP snatched/VVD Buster/NN1-NP0 from/PRP his/DPS play/NN1 and/CJC we/PNP went/VVD back/AVP into/PRP the/AT0 house/NN1 where/AVQ-CJS she/PNP held/VVD him/PNP close/AV0 to/PRP her/DPS face/NN1 laughing/VVG as/PRP <C FE="Agt"> the/AT0 big/AJ0 cat/NN1 </C> purred/VVD-VVN and/CJC <C TARGET="y"> arched/VVD </C> <C FE="BPrt"> himself/PNX </C> ecstatically/AV0 <C FE="Goal"> against/PRP her/DPS cheek/NN1 </C> </S>(TOP (S (S (NP (PRP She)) (VP (VBD snatched) (NP (NN Buster)) (PP (IN from) (NP (PRP$ his) (NN play))))) (CC and) (S (NP (PRP we)) (VP (VBD went) (ADVP (RB back) (PP (IN into) (NP (NP (DT the) (NN house)) (SBAR (WHADVP (WRB where)) (S (NP (PRP she)) (VP (VBD held) (NP (PRP him)) (ADVP (RB close) (PP (TO to) (NP (PRP$ her) (NN face)))) (S (VP (VBG laughing) (SBAR (IN as) (S (NP (DT the) (JJ big) (NN cat)) (VP (VP (VBD purred)) (CC and) (VP (VBD arched) (NP (PRP himself)) (ADVP (RB ecstatically)) (PP (IN against) (NP (PRP$ her) (NN cheek)))))))))))))))))))

...

body/action/bat.v.ar:<S TPOS="77171143"> <C TYPE="Blend"> </C> <C FE="Agt"> The/AT0 receptionist/NN1 </C> had/VHD obviously/AV0 recognised/VVN him/PNP too/AV0 had/VHD practically/AV0 fallen/VVN over/PRP herself/PNX to/TO0 <C TARGET="y"> bat/VVI </C> <C FE="BPrt"> her/DPS long/AJ0-AV0 dark/AJ0-NN1 eyelashes/NN2-VVZ </C> <C FE="Add"> at/PRP him/PNP </C> </S>(TOP (S (NP (DT The) (NN receptionist)) (VP (VBD had) (VP (ADVP (RB obviously)) (VBN recognised) (NP (PRP him)) (SBAR (S (ADVP (RB too)) (VP (VBD had) (VP (ADVP (RB practically)) (VBN fallen) (PP (IN over) (NP (PRP herself))) (S (VP (TO to) (VP (VB bat) (NP (PRP$ her) (JJ long) (JJ dark) (NNS eyelashes)) (PP (IN at) (NP (PRP him))))))))))))))

body/action/bat.v.ar:<S TPOS="69048344"> Did/VDD <C FE="Agt"> saints/NN2 </C> ever/AV0 <C TARGET="y"> bat/VVI </C> <C FE="BPrt"> their/DPS eyelids/NN2 </C> and/CJC look/NN1-VVB sleepily/AV0 self-satisfied/AJ0 as/CJS-PRP cats/NN2 </S>(TOP (SQ (VBD Did) (NP (NNS saints)) (ADVP (RB ever)) (VP (VP (VB bat) (NP (PRP$ their) (NNS eyelids))) (CC and) (VP (VB look) (ADJP (RB sleepily) (JJ self-satisfied)) (PP (IN as) (NP (NNS cats)))))))

...

body/action/bend.v.ar:<S TPOS="25399472"> <C FE="Agt"> You/PNP </C> may/VM0 <C TARGET="y"> bend/VVI </C> <C FE="BPrt"> the/AT0 lower/AJC arm/NN1 </C> <C FE="Degr"> a/AV0 little/AV0 </C> if/CJS you/PNP wish/VVB </S>(TOP (S (NP (PRP You)) (VP (MD may) (VP (VB bend) (NP (DT the) (JJR lower) (NN arm)) (NP (DT a) (RB little)) (SBAR (IN if) (S (NP (PRP you)) (VP (VBP wish))))))))
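A minimal sketch of pulling the frame-element spans out of one of these SGML-style lines (the regex and helper name are illustrative):

import re

def frame_elements(s_line):
    """Extract (FE, text) pairs from <C FE="..."> ... </C> regions,
    dropping the word/POS tags (e.g. 'back/NN1' -> 'back')."""
    pairs = []
    for m in re.finditer(r'<C FE="([^"]+)"[^>]*>(.*?)</C>', s_line):
        words = [tok.split('/')[0] for tok in m.group(2).split()]
        pairs.append((m.group(1), ' '.join(words)))
    return pairs

line = ('<C TARGET="y"> Arch/VVB </C> <C FE="Agt" PT="CNI"> </C> '
        '<C FE="BPrt"> your/DPS back/NN1 </C> '
        '<C FE="Degr"> as/AV0 high/AJ0-AV0 as/CJS you/PNP can/VM0 </C>')
print(frame_elements(line))
# [('Agt', ''), ('BPrt', 'your back'), ('Degr', 'as high as you can')]

The empty Agt span reflects the null-instantiated agent (PT="CNI") in the first arch.v example above.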

28 / 46


FrameNet summary

Virtues
• Many levels of analysis.
• Different parts of speech (not just verbs).
• Diverse document collection.
• A rich lexical resource, not just for SRL.

Limitations (some addressed by the new full-text annotations)
• Example sentences are chosen by hand (non-random).
• Complete sentences are not labeled.
• No gold-standard parses or other annotations.
• A work in progress with sometimes surprising gaps.

29 / 46


Other corpora

• FrameNets in other languages:
  https://framenet.icsi.berkeley.edu/fndrupal/framenets_in_other_languages
• VerbNet:
  http://verbs.colorado.edu/~mpalmer/projects/verbnet.html
• NomBank (extends PropBank with NP-internal annotations):
  http://nlp.cs.nyu.edu/meyers/NomBank.html
• Korean PropBank:
  http://www.ldc.upenn.edu/Catalog/catalogEntry.jsp?catalogId=LDC2006T03
• Chinese PropBanks:
  http://www.ldc.upenn.edu/Catalog/catalogEntry.jsp?catalogId=LDC2005T23
  http://www.ldc.upenn.edu/Catalog/catalogEntry.jsp?catalogId=LDC2008T07
• CoNLL-2005 shared task PropBank subset (tabular format):
  http://www.lsi.upc.edu/~srlconll/soft.html
• Senseval 3 SRL (FrameNet subset):
  http://www.clres.com/SensSemRoles.html
• SemEval 2007 (FrameNet, NomBank, PropBank, Arabic):
  http://nlp.cs.swarthmore.edu/semeval/tasks/index.php

30 / 46


SRL tasks

Identification: which phrases are role-bearing?
• Necessary for real-world tasks, where phrases do not come pre-identified as role-bearing.
• Role-bearing phrases need not be constituents, or even necessarily contiguous, making the search space enormous (2^n for n words, though most candidates will be absurd; see the sketch after this list).

Classification: for role-bearing phrases, what roles do they play?
• Highly dependent on the underlying role set.
• Also a very large search space: ≈ 20^m for m arguments, assuming 20 candidate labels.

Evaluation: very involved and tricky to get right
• In identification, how do we score overlap/containment/subsumption?
• Should classification scores be influenced by identification errors?
• Are some argument types more important than others?
• Are some mis-classifications worse than others?
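In practice, identification is usually restricted to nodes of a parse tree, which shrinks the candidate space dramatically relative to arbitrary word subsets. A toy illustration (the sentence and parse are invented):

from nltk.tree import Tree

t = Tree.fromstring(
    "(S (NP (NNP Doris)) (VP (VBD caught) (NP (DT the) (NN ball))))")
n = len(t.leaves())
word_subsets = 2 ** n              # every subset of words is a candidate span
constituents = list(t.subtrees())  # candidates if we trust the parse
print("%d words, %d word subsets, %d constituents"
      % (n, word_subsets, len(constituents)))
# 4 words, 16 word subsets, 8 constituents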

31 / 46


Evaluation in Toutanova et al. 2008:§3.2

Our argument-based measures do not require exact bracketing (if the set of words constituting an argument is correct, there is no need to know how this set is broken into constituents) and do not give partial credit for labeling correctly only some of several constituents in a multi-constituent argument. For these measures, a semantic role labeling of a sentence is viewed as a labeling on sets of words; these sets can encompass several non-contiguous spans. To convert a labeling on parse tree nodes to this form, we create a labeled set for each possibly multi-constituent argument; all remaining sets of words are implicitly labeled with NONE. We will refer to word sets as "spans." To compute the measures, we compare a guessed set of labeled spans to a correct set of labeled spans.

[Figure 2 of Toutanova et al. 2008: argument-based scoring measures for an example guessed labeling.]

Scoring variants:
• Acc: whole frame accuracy
• CORE: only core args
• CoarseARGM: adjuncts all collapsed to ARGM
• ALL: all args
• Argument Id: classify word sets as role-bearing or not; all labels mapped to ARG or NONE
• Argument Cls: assign roles to role-bearing phrases

Span-level counts and measures:
tp: gold ≠ NONE & guess = gold
fp: guess ≠ NONE & guess ≠ gold
fn: gold ≠ NONE & guess ≠ gold
p = tp / (tp + fp)
r = tp / (tp + fn)
F1 = (2 × p × r) / (p + r)


Worked example from Figure 2 (2 true positives, 2 false positives, 1 false negative):
p = tp / (tp + fp) = 2 / (2 + 2) = 0.5
r = tp / (tp + fn) = 2 / (2 + 1) ≈ 0.67
F1 = (2 × 0.5 × 0.67) / (0.5 + 0.67) = 0.571
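A minimal sketch of this span-based scoring in Python (spans as frozensets of word indices; the toy labelings are invented so as to reproduce the counts above):

def span_prf(gold, guess):
    """Argument-based P/R/F1 over labeled word-set 'spans'.
    gold and guess map frozenset-of-word-indices -> role label;
    spans missing from a mapping implicitly carry the label NONE."""
    spans = set(gold) | set(guess)
    tp = sum(1 for s in spans
             if gold.get(s, 'NONE') != 'NONE' and guess.get(s, 'NONE') == gold[s])
    fp = sum(1 for s in spans
             if guess.get(s, 'NONE') != 'NONE' and guess[s] != gold.get(s, 'NONE'))
    fn = sum(1 for s in spans
             if gold.get(s, 'NONE') != 'NONE' and guess.get(s, 'NONE') != gold[s])
    p = float(tp) / (tp + fp) if tp + fp else 0.0
    r = float(tp) / (tp + fn) if tp + fn else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

gold = {frozenset([0, 1]): 'ARG0', frozenset([3, 4]): 'ARG1',
        frozenset([6]): 'ARGM-TMP'}
guess = {frozenset([0, 1]): 'ARG0', frozenset([3, 4]): 'ARG1',
         frozenset([6]): 'ARGM-LOC', frozenset([8]): 'ARG2'}
print("p=%.2f r=%.2f f1=%.3f" % span_prf(gold, guess))
# p=0.50 r=0.67 f1=0.571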

32 / 46


CoNLL evaluation (Carreras and Marquez 2005)

• Distributed as a Perl script from http://www.lsi.upc.edu/~srlconll/soft.html
• Essentially the same as the Argument Id&Cls metric of Toutanova et al. 2008: "For an argument to be correctly recognized, the words spanning the argument as well as its semantic role have to be correct."
• Verbs are excluded from the evaluation, since they are generally the targets.
• For CoNLL, co-indexed arguments are treated as separate arguments:

  [ARG1 The deregulation] of railroads [R-ARG1 that] [PRED began] enabled shippers to bargain for transportation.

  whereas for Toutanova et al. they are treated as a single argument, linked via the C- prefix and assigned a single role:

  [ARG1 The deregulation] of railroads [C-ARG1 that] [PRED began] enabled shippers to bargain for transportation.

33 / 46


Tools (pause here for demos)

SwiRL: http://www.surdeanu.name/mihai/swirl/ (Surdeanu and Turmo 2005; Surdeanu et al. 2007)

Input: The glass broke .

( S1 0
  ( S 1
    ( NP 1 { B-A1-2 }
      ( DT 0 The the O )
      ( NN 1 glass glass O ) )
    ( VP 0
      ( VBD 2 broke break O ) )
    ( . 3 . . O ) ) )

DT   (S1(S(NP*   "the"    0   (A1*
NN   *)          "glass"  0   *)
VBD  (VP*)       "break"  1   *
.    *))         "."      0   *

Illinois: http://cogcomp.cs.illinois.edu/demo/srl/

Input: The glass broke.

The     breaker [A0]   (S1 (S (NP (DT the)
glass                  (NN glass))
broke   V: break       (VB (VBD broke)
.                      (. .)))

34 / 46


Approaches to SRL

Many different kinds of models have been used for SRL:

• Gildea and Jurafsky (2002): direct Bayesian estimates using rich morpho-syntactic features
• Pradhan et al. (2004): SVMs with very rich features
• Punyakanok et al. (2004, 2005): systems of hand-built, categorical rules with an integer linear programming solver
• Shallow morpho-syntactic features (CoNLL-2005 systems)
• Toutanova et al. (2008): inter-label dependencies (discussed extensively here)

For many additional references, see Yih and Toutanova 2007.

35 / 46


Basic architecture

Basic architecture of a generic SRL system (from Yih and Toutanova 2007):

[Pipeline diagram: sentence s and predicate p → annotations → local scoring → joint scoring → semantic roles; each scoring stage adds features to (s, p, A) and computes score(l | c, s, p, A).]

• Local scores for phrase labels do not depend on labels of other phrases.
• Joint scores take into account dependencies among the labels of multiple phrases.

36 / 46


Local classifiers

Definition (Local SRL classifier)
• t: a tree
• v: a target predicate node in t
• L: a mapping from nodes in t to semantic roles (including NONE)
• Id(L): the mapping that is just like L except all non-NONE values are ARG

The probability of L is given by

P_SRL^LOCAL(L | t, v) = ∏_{n_i ∈ t} P_ID(Id(l_i) | t, v) × ∏_{n_i ∈ t} P_CLS(l_i | t, v, Id(l_i))

For classification, pick the L that maximizes this product.

• Toutanova et al. (2008:§4) train MaxEnt models for each term in the product and then multiply the predicted distributions together to obtain P_SRL^LOCAL(L | t, v). The feature sets are the same for both models.
• Because the maximal labeling could involve overlapping spans and role assignments, they develop a dynamic programming algorithm that memoizes scores moving from the leaves to the root (§4.2). The gains are modest, though.
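A toy sketch of this decomposition, with hypothetical probability tables standing in for the two trained MaxEnt models (the brute-force max below ignores the non-overlap constraint that their dynamic program enforces):

import math

def local_log_score(nodes, labeling, p_id, p_cls):
    """log P_SRL^LOCAL(L | t, v): every node contributes an identification
    term (ARG vs NONE); argument nodes add a classification term."""
    total = 0.0
    for node in nodes:
        label = labeling[node]
        if label == 'NONE':
            total += math.log(p_id[node]['NONE'])
        else:
            total += math.log(p_id[node]['ARG'])
            total += math.log(p_cls[node][label])
    return total

# Hypothetical distributions for two candidate nodes:
p_id = {'np1': {'ARG': 0.9, 'NONE': 0.1}, 'np2': {'ARG': 0.6, 'NONE': 0.4}}
p_cls = {'np1': {'ARG0': 0.7, 'ARG1': 0.3}, 'np2': {'ARG0': 0.2, 'ARG1': 0.8}}

labels = ('ARG0', 'ARG1', 'NONE')
best = max(({'np1': l1, 'np2': l2} for l1 in labels for l2 in labels),
           key=lambda L: local_log_score(['np1', 'np2'], L, p_id, p_cls))
print(best)  # {'np1': 'ARG0', 'np2': 'ARG1'}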

37 / 46


Baseline features


Then, like the Gildea and Jurafsky (2002) system, we decompose the probability of a labeling L into probabilities according to an identification model P_ID and a classification model P_CLS.

    P_SRL(L | t, v) = P_ID(Id(L) | t, v) P_CLS(L | t, v, Id(L))    (1)

This decomposition does not encode any independence assumptions, but is a useful way of thinking about the problem. Our local models for semantic role labeling use this decomposition. We use the same features for local identification and classification models, but use the decomposition for efficiency of training. The identification models are trained to classify each node in a parse tree as ARG or NONE, and the classification models are trained to label each argument node in the training set with its specific label. In this way the training set for the classification models is smaller. Note that we do not do any hard pruning at the identification stage in testing and can find the exact labeling of the complete parse tree, which is the maximizer of Equation (1).

We use log-linear models for multi-class classification for the local models. Because they produce probability distributions, identification and classification models can be chained in a principled way, as in Equation (1). The baseline features we used for the local identification and classification models are outlined in Figure 3. These features are a subset of the features used in previous work. The standard features at the top of the figure were defined by Gildea and Jurafsky (2002), and the rest are other useful lexical and structural features identified in more recent work (Surdeanu et al. 2003; Pradhan et al. 2004; Xue and Palmer 2004). We also incorporated several novel features which we describe next.

Figure 3: Baseline features (feature table not reproduced here).

(Toutanova et al. 2008:172)
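Of these baseline features, PATH is the one that is easiest to mis-picture, so here is an illustrative sketch of the standard computation (my code, not the authors'; it assumes each node is given as the list of constituent labels from that node up to the root):

    def path_feature(arg_to_root, pred_to_root):
        """Gildea-and-Jurafsky-style PATH: climb from the argument node to the
        lowest common ancestor, then descend to the predicate."""
        # Length of the shared root-ward suffix = number of common ancestors.
        k = 0
        while (k < min(len(arg_to_root), len(pred_to_root))
               and arg_to_root[-1 - k] == pred_to_root[-1 - k]):
            k += 1
        up = arg_to_root[:len(arg_to_root) - k + 1]    # argument ... LCA
        down = pred_to_root[:len(pred_to_root) - k]    # nodes strictly below the LCA
        return "↑".join(up) + "".join("↓" + label for label in reversed(down))

    print(path_feature(["NP", "S"], ["VBD", "VP", "S"]))  # NP↑S↓VP↓VBD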


Handling displaced constituents (Toutanova et al. 2008:§4.1)

Figure 4: Example of displaced arguments (tree not reproduced here).

4.1 Additional Features for Displaced Constituents

We found that a large source of errors for ARG0 and ARG1 stemmed from cases such as those illustrated in Figure 4, where arguments were dislocated by raising or control verbs. Here, the predicate, expected, does not have a subject in the typical position—indicated by the empty NP—because the auxiliary is has raised the subject to its current position. In order to capture this class of examples, we use a binary feature, MISSING SUBJECT, indicating whether the predicate is "missing" its subject, and use this feature in conjunction with the PATH feature, so that we learn typical paths to raised subjects conditioned on the absence of the subject in its typical position.[6]

In the particular case of Figure 4, there is another instance of an argument being quite far from its predicate. The predicate widen shares the phrase the trade gap with expect as an ARG1 argument. However, as expect is a raising verb, widen's subject is not in its typical position either, and we should expect to find it in the same position as expected's subject. This indicates it may be useful to use the path relative to expected to find arguments for widen. In general, to identify certain arguments of predicates embedded in auxiliary and infinitival VPs we expect it to be helpful to take the path from the maximum extended projection of the predicate—the highest VP in the chain of VPs dominating the predicate. We introduce a new path feature, PROJECTED PATH, which takes the path from the maximal extended projection to an argument node. This feature applies only when the argument is not dominated by the maximal projection (e.g., direct objects). These features also handle other cases of discontinuous and non-local dependencies, such as those arising due to control verbs. The performance gain from these new features was notable, especially in identification. The performance on ALL arguments for the model using only the features in Figure 3, and the model using the additional features as well, are shown in Figure 5. For these results, the constraint that argument phrases do not overlap was enforced using the algorithm presented in Section 4.2.

4.2 Enforcing the Non-Overlapping Constraint

The most direct way to use trained local identification and classification models in testing is to select a labeling L of the parse tree that maximizes the product of the local identification and classification probabilities, that is, the maximizer of Equation (1).

[6] We consider a verb to be missing its subject if the highest VP in the chain of VPs dominating the verb does not have an NP or S(BAR) as its immediate left sister.


Numerous errors are caused by displaced constituents. The response is to have a MISSING SUBJECT feature and a PATH feature, so that the model establishes the associations (footnote [6]'s rule is sketched in code below).

Basic Stanford dependencies for "The trade gap is expected to widen":

    det(gap, The)   nn(gap, trade)   nsubjpass(expected, gap)   auxpass(expected, is)   xcomp(expected, widen)   aux(widen, to)

Collapsed Stanford dependencies: the same, plus xsubj(widen, gap), which makes the displaced subject of widen explicit.
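Footnote [6] is mechanical enough to sketch in code. The toy implementation below (mine, with a hypothetical tree class) flags a verb as missing its subject when the highest VP dominating it has no NP or S(BAR) as its immediate left sister:

    from dataclasses import dataclass, field
    from typing import Optional

    @dataclass(eq=False)  # identity-based equality, so list.index works on trees
    class Tree:
        label: str
        children: list = field(default_factory=list)
        parent: Optional["Tree"] = None

        def __post_init__(self):
            for child in self.children:
                child.parent = self

    def highest_vp(verb):
        """Highest VP in the chain of VPs dominating the verb (footnote [6])."""
        node, top = verb, None
        while node.parent is not None and node.parent.label == "VP":
            node = node.parent
            top = node
        return top

    def missing_subject(verb):
        """MISSING SUBJECT: no NP or S(BAR) immediately left of the highest VP."""
        vp = highest_vp(verb)
        if vp is None or vp.parent is None:
            return True  # no dominating VP found; treat as missing in this sketch
        siblings = vp.parent.children
        i = siblings.index(vp)
        return i == 0 or siblings[i - 1].label not in ("NP", "S", "SBAR")

    # "The glass broke": the VP's left sister is an NP, so nothing is missing.
    vbd = Tree("VBD")
    s = Tree("S", [Tree("NP"), Tree("VP", [vbd])])
    print(missing_subject(vbd))  # False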


Joint model features (Toutanova et al. 2008:174–176)

• Higher precision than recall.

• Most mistakes involve NONE. (Not surprising to me; I am often surprised at what does and doesn't get role-labeled.)

• Few Core ARG labels are swapped.

• More Modifier labels are swapped.

• Few Core ARG/Modifier swaps.


Joint model (Toutanova et al. 2008:§5)

1 Use the local model to generate the top n non-overlapping labeling functions L, via a variant of the dynamic programming algorithm used to ensure non-overlap (§4.2).

2 Use a MaxEnt model to re-rank the top n labeling sequences via values P^r_SRL(L | t, v).

3 Obtain final scores:

Definition (Joint model scoring)

    P_SRL(L | t, v) = (P^LOCAL_SRL(L | t, v))^α × P^r_SRL(L | t, v)

where α is a tunable parameter (they used 1.0).

4 Classification: pick the L that maximizes this scoring function.
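Step 4 then reduces to an argmax over the n candidates; a small sketch with made-up numbers (the triples stand in for the outputs of the local model and the re-ranker):

    import math

    def pick_labeling(candidates, alpha=1.0):
        """Choose among top-n local labelings by
        alpha * log P_local + log P_rerank."""
        return max(
            candidates,
            key=lambda c: alpha * math.log(c[1]) + math.log(c[2]),
        )[0]

    # Toy usage: the re-ranker overturns the local model's narrow preference.
    top_n = [
        ({"NP1": "ARG0"}, 0.40, 0.20),  # (labeling, P_local, P_rerank)
        ({"NP1": "ARG1"}, 0.35, 0.60),
    ]
    print(pick_labeling(top_n))  # {'NP1': 'ARG1'}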


Joint model features (Toutanova et al. 2008:§5.2)

• All the features from the local models.

• Whole Label Sequence features of arbitrary length:

Figure 9: An example tree from PropBank with semantic role annotations, for the sentence "Final-hour trading accelerated to 108.1 million shares yesterday." (Tree not reproduced here.)

node and the labels of other nodes, but also dependencies between the label of a node and input features of other argument nodes. The features are specified by instantiation of templates and the value of a feature is the number of times a particular pattern occurs in the labeled tree.

For a tree t, predicate v, and joint assignment L of labels to the nodes of the tree, we define the candidate argument sequence as the sequence of non-NONE labeled nodes [n1, l1, . . . , vPRED, . . . , nm, lm] (li is the label of node ni). A reasonable candidate argument sequence usually contains very few of the nodes in the tree—about 2 to 7—as this is the typical number of arguments for a verb. To make it more convenient to express our feature templates, we include the predicate node v in the sequence. This sequence of labeled nodes is defined with respect to the left-to-right order of constituents in the parse tree. Because non-NONE labeled nodes do not overlap, there is a strict left-to-right order among these nodes. The candidate argument sequence that corresponds to the correct assignment in Figure 9 is then:

[NP1-ARG1, VBD1-PRED, PP1-ARG4, NP3-ARGM-TMP]

Features from Local Models. All features included in the local models are also included in our joint models. In particular, each template for local features is included as a joint template that concatenates the local template and the node label. For example, for the local feature PATH, we define a joint feature template that extracts PATH from every node in the candidate argument sequence and concatenates it with the label of the node. Both a feature with the specific argument label and a feature with the generic back-off ARG label are created. This is similar to adding features from identification and classification models. In the case of the example candidate argument sequence provided, for the node NP1 we have the features:

{(NP↑S↓VP↓VBD)-ARG1, (NP↑S↓VP↓VBD)-ARG}

When comparing a local and a joint model, we use the same set of local feature templates in the two models. If these were the only features that a joint model used, we would expect its performance to be roughly the same as the performance of a local model. This is because the two models will in fact be in the same parametric family but will only differ slightly in the way the parameters are estimated. In particular, the likelihood of an assignment according to the joint model with local features will differ from the likelihood of the same assignment according to the local model only in the denominator (the partition function). The joint model sums over only the top n assignments.


Basic: [voice:active, ARG1, PRED, ARG4, ARGM-TMP]
Lemma: [voice:active, lemma:accelerate, ARG1, PRED, ARG4, ARGM-TMP]
Generic: [voice:active, ARG, PRED, ARG, ARG]
POS: [voice:active, NP-ARG0, PRED, NP-ARG1, PP-ARG2]
POS+lemma: [voice:active, lemma:offer, NP-ARG0, PRED, NP-ARG1, PP-ARG2]

• Repetition features: POS-annotated features indicating when the same ARG occurs multiple times.
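As a sketch of how such template instantiations might be produced (illustrative code, not the authors'; it reproduces the Basic, Lemma, Generic, and POS variants shown above):

    def label_sequence_features(seq, voice, lemma):
        """Whole-label-sequence feature strings for a candidate argument
        sequence `seq` = [(phrase_type, label), ...], with the predicate
        appearing as (POS, "PRED")."""
        labels = [lab for _, lab in seq]
        generic = [lab if lab == "PRED" else "ARG" for lab in labels]
        pos_labels = ["PRED" if lab == "PRED" else f"{pt}-{lab}" for pt, lab in seq]
        return {
            "Basic":     [f"voice:{voice}"] + labels,
            "Lemma":     [f"voice:{voice}", f"lemma:{lemma}"] + labels,
            "Generic":   [f"voice:{voice}"] + generic,
            "POS":       [f"voice:{voice}"] + pos_labels,
            "POS+lemma": [f"voice:{voice}", f"lemma:{lemma}"] + pos_labels,
        }

    seq = [("NP", "ARG1"), ("VBD", "PRED"), ("PP", "ARG4"), ("NP", "ARGM-TMP")]
    for name, feats in label_sequence_features(seq, "active", "accelerate").items():
        print(name, feats)
    # Basic ['voice:active', 'ARG1', 'PRED', 'ARG4', 'ARGM-TMP'] ...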


Joint model results (Toutanova et al. 2008:§5.4)

• Local: the local model and results given above

• JointLocal: a joint model using only the Local features

• LabelSeq: a joint model using only the Local features and the whole label sequence features

• AllJoint: a joint model using the Local features, the whole label sequence features, and the repetition features


few good argument frames, it cannot easily rule out very bad frames. It makes sense then to incorporate the local model into our final score. Our final score is given by:

    P_SRL(L | t, v) = (P^ℓ_SRL(L | t, v))^α · P^r_SRL(L | t, v)

where α is a tunable parameter determining the amount of influence the local score has on the final score (we found α = 1.0 to work best). Such interpolation with a score from a first-pass model was also used for parse re-ranking in (Collins 2000). Given this score, at test time we choose among the top n local assignments L1, . . . , Ln according to:

    argmax_{L ∈ {L1, . . . , Ln}}  α log P^ℓ_SRL(L | t, v) + log P^r_SRL(L | t, v)    (4)

5.4 Joint Model Results

We compare the performance of joint re-ranking models and local models. We used n = 10 joint assignments for training re-ranking models, and n = 15 for testing. The weight α of the local model was set to 1. Using different numbers of joint assignments in training and testing is in general not ideal, but due to memory requirements, we could not experiment with larger values of n for training.

Figure 10 shows the summary performance of the local model (LOCAL), repeated from earlier figures, a joint model using only local features (JOINTLOCAL), a joint model using local + whole label sequence features (LABELSEQ), and a joint model using all described types of features (ALLJOINT). The evaluation is on gold-standard parse trees. In addition to performance measures, the figure shows the number of binary features included in the model. The number of features is a measure of the complexity of the hypothesis space of the parametric model.

We can see that a joint model using only local features outperforms a local model by .5 points of F-measure. The joint model using local features estimates the feature weights only using the top n consistent assignments, thus making the labels of different nodes non-independent according to the estimation procedure, which may be a cause of the improved performance. Another factor could be that the model JOINTLOCAL is a combination of two models as specified in Equation (4), which may lead to gains (as is usual for classifier combination).

The label sequence features added in model LABELSEQ result in another 1.5-point jump in F-measure on all arguments. An additional .8 gain results from the inclusion of syntactic–semantic and repetition features. The error reduction of model ALLJOINT …

Figure 10: Performance of local and joint models on ID&CLS on Section 23, using gold-standard parse trees. The number of features of each model is shown in thousands. (Table not reproduced here.)

• The pattern of errors for the Joint models is broadly the same as for the Local models, though there are notable points of improvement (p. 183).

• Toutanova et al. (2008:§6) show that the Joint-model approach is robust for automatic (and therefore error-ridden) parses as well.


Conclusions

• Semantic roles are distinct from syntactic roles.

• Semantic roles capture usefully abstract semantic information (despite thechallenges of assigning them).

• SRL reached a peak of popularity around 2005–2006, and it is currently on the wane, but this is probably just because system performance is still not great.

• There are many SRL models, but a lot of commonalities in the underlyingfeature sets.

• Even if we manage to do complete and accurate semantic composition (stay tuned for Bill, Percy Liang, and Richard Socher!), SRL will remain valuable where a coarse-grained semantics is called for.


References I

Baker, Collin F.; Charles J. Fillmore; and John B. Lowe. 1998. The Berkeley FrameNet project. In Proceedings of COLING-ACL, 86–90. Montreal: Association for Computational Linguistics.

Carreras, Xavier and Lluís Màrquez. 2005. Introduction to the CoNLL-2005 shared task: Semantic role labeling. In Proceedings of CoNLL, 152–164. Ann Arbor, MI.

Dowty, David. 1991. Thematic proto-roles and argument selection. Language 67(3):547–619.

Fillmore, Charles J. 1968. The case for Case. In Emmon Bach and Robert T. Harms, eds., Universals in Linguistic Theory, 1–88. New York: Holt, Rinehart, and Winston.

Fillmore, Charles J. and Collin F. Baker. 2001. Frame semantics for text understanding. In Proceedings of the WordNet and Other Lexical Resources Workshop, 59–64. Pittsburgh, PA: Association for Computational Linguistics.

Gildea, Daniel and Daniel Jurafsky. 2000. Automatic labeling of semantic roles. In Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics, 512–520. Hong Kong: Association for Computational Linguistics. doi:10.3115/1075218.1075283. URL http://www.aclweb.org/anthology/P00-1065.

Gildea, Daniel and Daniel Jurafsky. 2002. Automatic labeling of semantic roles. Computational Linguistics 28(3):245–288.

Levin, Beth and Malka Rappaport Hovav. 2005. Argument Realization. Cambridge: Cambridge University Press.

Palmer, Martha; Paul Kingsbury; and Daniel Gildea. 2005. The Proposition Bank: An annotated corpus of semantic roles. Computational Linguistics 31(1):71–105.

Pradhan, Sameer S.; Wayne H. Ward; Kadri Hacioglu; James H. Martin; and Dan Jurafsky. 2004. Shallow semantic parsing using support vector machines. In Susan Dumais, Daniel Marcu, and Salim Roukos, eds., HLT-NAACL 2004: Main Proceedings, 233–240. Boston, MA: Association for Computational Linguistics.

Punyakanok, Vasin; Dan Roth; and Scott Wen-tau Yih. 2005. The necessity of syntactic parsing for semantic role labeling. In Proceedings of IJCAI, 1117–1123. Edinburgh, Scotland.


References II

Punyakanok, Vasin; Dan Roth; Scott Wen-tau Yih; Dav Zimak; and Yuancheng Tu. 2004. Semantic role labeling via generalized inference over classifiers. In Proceedings of CoNLL, 130–133. Boston, MA.

Riesbeck, Christopher K. 1975. Conceptual analysis. In Roger C. Schank, ed., Conceptual Information Processing. Amsterdam: North-Holland and Elsevier.

Ruppenhofer, Josef; Michael Ellsworth; Miriam R. L. Petruck; Christopher R. Johnson; and Jan Scheffczyk. 2006. FrameNet II: Extended Theory and Practice. Berkeley, CA: International Computer Science Institute.

Simmons, Robert F. 1973. Semantic networks: Their computation and use for understanding English sentences. In Roger Schank and Kenneth Mark Colby, eds., Computer Models of Thought and Language, 61–113. San Francisco: W. H. Freeman and Company.

Surdeanu, Mihai; Sanda Harabagiu; John Williams; and Paul Aarseth. 2003. Using predicate-argument structures for information extraction. In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics, 8–15. Sapporo, Japan: Association for Computational Linguistics. doi:10.3115/1075096.1075098. URL http://www.aclweb.org/anthology/P03-1002.

Surdeanu, Mihai; Lluís Màrquez; Xavier Carreras; and Pere R. Comas. 2007. Combination strategies for semantic role labeling. Journal of Artificial Intelligence Research 29:105–151.

Surdeanu, Mihai and Jordi Turmo. 2005. Semantic role labeling using complete syntactic analysis. In Proceedings of the CoNLL 2005 Shared Task.

Toutanova, Kristina; Aria Haghighi; and Christopher D. Manning. 2008. A global joint model for semantic role labeling. Computational Linguistics 34(2):161–191.

Xue, Nianwen and Martha Palmer. 2004. Calibrating features for semantic role labeling. In Proceedings of EMNLP, 88–94. Barcelona, Spain.

Yih, Scott Wen-tau and Kristina Toutanova. 2007. Automatic semantic role labeling. Tutorial at AAAI-07. URL http://research.microsoft.com/apps/pubs/default.aspx?id=101987.
