DESCRIPTION
Extracting temporal information from raw text is fundamental for deep language understanding, and key to many applications like question answering, information extraction, and document summarization. In this paper, we describe two systems we submitted to the TempEval 2 challenge, for extracting temporal information from raw text. The systems use a combination of deep semantic parsing, Markov Logic Networks and Conditional Random Field classifiers. Our two submitted systems, TRIPS and TRIOS, approached all tasks and outperformed all teams in two tasks. Furthermore, TRIOS mostly had second-best performances in other tasks. TRIOS also outperformed the other teams that attempted all the tasks. Our systems are notable in that, for tasks C-F, they operated on raw text while all other systems used tagged events and temporal expressions in the corpus as input.
TRIPS and TRIOS System for TempEval-2: Extracting Temporal Information from Text
Naushad UzZaman and James F. Allen
Computer Science Department
University of Rochester
Rochester, NY, USA
International Workshop on Semantic Evaluations (SemEval-2010), Association for Computational
Linguistics (ACL), Sweden, July 2010.
Naushad UzZaman July 16, 2010
Task B: Event extraction
The New York Times said in an editorial on Saturday, April 25: The Supreme Court took a detour this week from the core principle of gender fairness it vindicated two years ago in its ruling invalidating the use of sexual stereotypes to justify denying women admission to the Virginia Military Institute.
By a 6-3 vote, the court upheld a discriminatory immigration law that gives a child born overseas to an unmarried American woman a better chance at citizenship than a child born to an unmarried American man.
Task A: Temporal Expression extraction
The New York Times said in an editorial on Saturday, April 25: The Supreme Court took a detour this week from the core principle of gender fairness it vindicated two years ago in its ruling invalidating the use of sexual stereotypes to justify denying women admission to the Virginia Military Institute.
By a 6-3 vote, the court upheld a discriminatory immigration law that gives a child born overseas to an unmarried American woman a better chance at citizenship than a child born to an unmarried American man.
Tasks C-F: Temporal relations extraction
The New York Times said in an editorial on Saturday, April 25: The Supreme Court took a detour this week from the core principle of gender fairness it vindicated two years ago in its ruling invalidating the use of sexual stereotypes to justify denying women admission to the Virginia Military Institute.
By a 6-3 vote, the court upheld a discriminatory immigration law that gives a child born overseas to an unmarried American woman a better chance at citizenship than a child born to an unmarried American man.
Our Systems
• TRIPS: based on TRIPS Parser
• TRIOS: hybrid between TRIPS Parser and machine learning classifiers (MLN, CRF)
Outline
• Our System Modules
• TRIPS Parser
• Markov Logic Network
• Task B: Event extraction
• Task A: Temporal Expression extraction
• Tasks C-F: Identify temporal relations
• Future Work and Summary
TRIPS Parser: Broad coverage deep parsing
[Figure: TRIPS parser architecture diagram. Labels recovered from the slide: Input; preprocessing; named entity recognizers; address recognizer; statistical parser; POS tagging; Input Chart; name hypotheses; address hypotheses; bracketing preferences; POS hyps; word hypotheses; lexicon on demand; Wordnet; Wordfinder; Comlex; new lexical entries; Core Lexicon & LF Ontology; Grammar; Parser; search guidance; semantic preferences; LF form preferences; Output Chart; heuristic extraction; Content Extractor; Final Logical Form.]
Example input: Advanced Medical paid $ 106 million in cash for its share in a unit of Henley 's Fisher Scientific subsidiary .
Markov Logic Network
• Problems with rule-based systems and machine learning techniques
• Markov logic = first-order logic + Markov network (probabilistic graphical model)
• FOL with weights
• weights determine how heavily a violated formula is penalized
Example: It is not going to change
tense(e1, INFINITIVE) & aspect(e1, NONE) => class(e1, OCCURRENCE)
weight = 0.319913
TheBeast: http://code.google.com/p/thebeast/
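To make the role of the weight concrete, here is a minimal Python sketch of how one weighted formula shifts probability between candidate worlds. It does not use TheBeast or its API. The function names and the restriction to a single formula are assumptions made for illustration; the rule and weight come from the example above, and the seven labels are the TimeML event classes.

```python
import math

# TimeML event classes (candidate values for class(e1, ...)).
CLASSES = ["OCCURRENCE", "STATE", "REPORTING", "I_ACTION",
           "I_STATE", "ASPECTUAL", "PERCEPTION"]

# Rule and weight from the slide:
# tense(e1, INFINITIVE) & aspect(e1, NONE) => class(e1, OCCURRENCE)
WEIGHT = 0.319913


def formula_holds(tense, aspect, event_class):
    """Truth value of the implication for one grounding."""
    antecedent = tense == "INFINITIVE" and aspect == "NONE"
    return (not antecedent) or event_class == "OCCURRENCE"


def class_distribution(tense, aspect):
    """P(class) proportional to exp(sum of weights of satisfied formulas).

    With a single formula, a world that violates it simply misses
    the weight: it is penalized, not ruled out.
    """
    scores = {c: math.exp(WEIGHT * formula_holds(tense, aspect, c))
              for c in CLASSES}
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}


probs = class_distribution("INFINITIVE", "NONE")
best = max(probs, key=probs.get)
print(best, round(probs[best], 3))
```

With only this one soft rule, OCCURRENCE becomes slightly more likely than the other classes for an infinitive event with no aspect. A trained model would combine many such formulas, each with its own learned weight.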
Sentence:
He fought in the war
TRIPS parser output:
(SPEECHACT V1 SA-TELL :CONTENT V2)
(F V2 (:* FIGHTING FIGHT) :AGENT V3 :MODS (V4) :TMA ((TENSE PAST)))
(PRO V3 (:* PERSON HE) :CONTEXT-REL HE)
(F V4 (:* SITUATED-IN IN) :OF V2 :VAL V5)
(THE V5 (:* ACTION WAR))
100+ extraction rules, for example:
((THE ?x (? type SITUATION-ROOT)) -extract-noms>
 (EVENT ?x (? type SITUATION-ROOT) :pos NOUN :class OCCURRENCE ))
Extracted with extraction rules:
<EVENT eid=V2 word=FIGHT pos=VERBAL ont-type=FIGHTING tense=PAST class=OCCURRENCE voice=ACTIVE aspect=NONE polarity=POSITIVE nf-morph=NONE>
<RLINK eventInstanceID=V2 ref-word=HE ref-ont-type=PERSON relType=AGENT>
<SLINK signal=IN eventInstanceID=V2 subordinatedEventInstance=V5 relType=SITUATED-IN>
<EVENT eid=V5 word=WAR pos=NOUN ont-type=ACTION class=OCCURRENCE voice=ACTIVE polarity=POSITIVE aspect=NONE tense=NONE>
Events and event features extraction using TRIPS parser
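The nominal-event rule shown on this slide can be mimicked in a few lines of Python. This is a sketch only, not the TRIPS rule engine. The tuple encoding of logical-form terms, the function name, and the `SITUATION_ROOT_SUBTYPES` set are hypothetical. ACTION is assumed to fall under SITUATION-ROOT because the slide's output marks WAR as an event.

```python
# Logical-form terms from the slide, reduced to
# (specifier, variable, ont-type, word).
LF_TERMS = [
    ("F", "V2", "FIGHTING", "FIGHT"),
    ("PRO", "V3", "PERSON", "HE"),
    ("F", "V4", "SITUATED-IN", "IN"),
    ("THE", "V5", "ACTION", "WAR"),
]

# Hypothetical stand-in for an ontology query: types assumed to fall
# under SITUATION-ROOT. The real TRIPS ontology is far larger.
SITUATION_ROOT_SUBTYPES = {"ACTION", "FIGHTING"}


def extract_nominal_events(terms):
    """Mimic the -extract-noms> rule: (THE ?x <situation type>) -> EVENT."""
    events = []
    for specifier, var, ont_type, word in terms:
        if specifier == "THE" and ont_type in SITUATION_ROOT_SUBTYPES:
            events.append({
                "eid": var,
                "word": word,
                "ont-type": ont_type,
                "pos": "NOUN",
                "class": "OCCURRENCE",
            })
    return events


nominal_events = extract_nominal_events(LF_TERMS)
print(nominal_events)
```

Only the definite noun phrase V5 (WAR) matches this rule. The verbal event V2 would be produced by a different rule among the 100+ mentioned above.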
TRIPS and TRIOS System for event extraction (Task B)
• TRIPS event and event feature extraction
• Uses TRIPS parser
• High recall, lower precision
• Event features performance not the best
• TRIOS event and event feature extraction
• Take TRIPS events
• MLN for filtering TRIPS events
• MLNs to classify event features
Event Extraction Performance
System          Precision  Recall  Fscore
TRIOS           0.80       0.74    0.77
TRIPS           0.55       0.88    0.68
Best (TIPSem)   0.81       0.86    0.84
Table 1: Performance of Event Extraction (Task B) in TempEval-2
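As a quick check on how the three columns relate, the sketch below recomputes the F-score as the harmonic mean of precision and recall. It assumes the table reports the balanced F-score (F1). Because the inputs are the rounded values from the table, a recomputed score can differ in the last digit from one derived from unrounded counts.

```python
def f_score(precision, recall):
    """Balanced F-score: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported (already rounded) in the table above.
reported = {"TRIOS": (0.80, 0.74), "TRIPS": (0.55, 0.88)}
for system, (p, r) in reported.items():
    print(system, round(f_score(p, r), 2))
```

The comparison shows the trade-off noted earlier: TRIPS has the higher recall, but its lower precision pulls its F-score below that of TRIOS.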
Feature    TRIPS  TRIOS  Best
Class      0.67   0.77   0.79 (TIPSem)
Tense      0.67   0.91   0.92 (Edinburgh-LTG)
Aspect     0.97   0.98   0.98
Pos        0.88   0.96   0.97 (TIPSem, Edinburgh-LTG)
Polarity   0.99   0.99   0.99
Modality   0.95   0.95   0.99 (Edinburgh-LTG)
Table 1: Performance of Event Features on TempEval-2 (Task B)
Temporal Expression Extraction (Task A)
• Recognizing Temporal Expression
• TRIPS parser
• Conditional Random Field classifier (CRF++)
• Features: word, shape, is year, is date of week, is month, is number, is time string, is day, is quarter, is punctuation, membership in word lists such as init-list, follow-list, etc.
• Determining the Normalized Value and Type
• Rule-based technique
• Matches regular expressions
• Released as open source
Feature:    init-list   is number   is time string
Position:   cw-2        cw-1        current word
Example:    last        five        months
Example:    next        ten         years
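The feature window above can be sketched as a per-token feature function. This shows only the feature side and does not use the CRF++ toolkit or its template format. The word lists are hypothetical placeholders, since the slide does not give the contents of init-list or follow-list, and the feature names are made up for this example.

```python
import re

# Hypothetical word lists; the actual init-list and follow-list
# contents are not given on the slide.
INIT_LIST = {"last", "next", "this", "each"}
MONTHS = {"january", "february", "march", "april", "may", "june", "july",
          "august", "september", "october", "november", "december"}
NUMBER_WORDS = {"one", "two", "three", "four", "five", "six", "seven",
                "eight", "nine", "ten"}
TIME_STRINGS = {"day", "days", "week", "weeks", "month", "months",
                "year", "years", "hour", "hours"}


def word_shape(token):
    """Collapse character runs into a coarse shape: 'April' -> 'Xx'."""
    shape = re.sub(r"[A-Z]+", "X", token)
    shape = re.sub(r"[a-z]+", "x", shape)
    return re.sub(r"\d+", "d", shape)


def token_features(tokens, i):
    """Features for tokens[i] over the window cw-2, cw-1, current word."""
    features = {}
    for offset in (-2, -1, 0):
        j = i + offset
        if j < 0:
            continue  # window runs off the start of the sentence
        word = tokens[j]
        lower = word.lower()
        prefix = f"cw{offset:+d}" if offset else "cw"
        features[f"{prefix}:word"] = lower
        features[f"{prefix}:shape"] = word_shape(word)
        features[f"{prefix}:is_year"] = bool(
            re.fullmatch(r"(1[89]|20)\d{2}", word))
        features[f"{prefix}:is_month"] = lower in MONTHS
        features[f"{prefix}:is_number"] = (
            word.isdigit() or lower in NUMBER_WORDS)
        features[f"{prefix}:is_time_string"] = lower in TIME_STRINGS
        features[f"{prefix}:in_init_list"] = lower in INIT_LIST
    return features


tokens = "over the last five months".split()
feats = token_features(tokens, 4)
print(feats["cw-2:in_init_list"], feats["cw-1:is_number"],
      feats["cw:is_time_string"])
```

For the current word "months", the window picks up "last" as an init-list word two positions back and "five" as a number one position back. That is the pattern the illustration above suggests.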
Temporal expression                       Type      Value
DCT (given): March 1, 1998; 14:11 hours   TIME      1998-03-01T14:11:00
Sunday                                    DATE      1998-03-01
last week                                 DATE      1998-W08
mid afternoon                             TIME      1998-03-01TAF
nearly two years                          DURATION  P2Y
each month                                SET       P1M
Table 1: Examples of normalized values and types for temporal expressions according to TimeML
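The rule-based normalization step can be illustrated with a small regular-expression sketch that reproduces four rows of the table, given the document creation time (DCT). This is not the released normalizer. The function name, the patterns, and the choice to resolve a bare weekday to the most recent such day are assumptions made for this example.

```python
import re
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
                "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10}
UNIT_CODES = {"day": "D", "week": "W", "month": "M", "year": "Y"}


def normalize(expression, dct):
    """Return (type, value) for a few patterns, or None if no rule matches."""
    text = expression.lower().strip()

    if text in WEEKDAYS:
        # Assumption: resolve to the most recent such day, DCT included.
        back = (dct.weekday() - WEEKDAYS.index(text)) % 7
        return "DATE", (dct - timedelta(days=back)).isoformat()

    if text == "last week":
        # ISO week of the day one week before the DCT.
        year, week, _ = (dct - timedelta(weeks=1)).isocalendar()
        return "DATE", f"{year}-W{week:02d}"

    match = re.fullmatch(r"each (day|week|month|year)", text)
    if match:
        return "SET", f"P1{UNIT_CODES[match.group(1)]}"

    match = re.fullmatch(
        r"(?:nearly |about )?(\w+) (day|week|month|year)s?", text)
    if match and match.group(1) in NUMBER_WORDS:
        count = NUMBER_WORDS[match.group(1)]
        return "DURATION", f"P{count}{UNIT_CODES[match.group(2)]}"

    return None


dct = date(1998, 3, 1)  # document creation time from the table above
for expr in ["Sunday", "last week", "nearly two years", "each month"]:
    print(expr, normalize(expr, dct))
```

Expressions such as "mid afternoon" would need further rules. The sketch returns None for anything it does not recognize.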
Performance on Temporal Expression Extraction
                         TRIPS/TRIOS   Best (HeidelTime-1)
Temp Exp extraction
  Precision              0.85          0.90
  Recall                 0.85          0.82
  Fscore                 0.85          0.86
Normalization
  type                   0.94          0.96
  value                  0.76          0.85
Table 1: Performance on Temporal Expression extraction (Task A)
Overview
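[Figure: system overview diagram. Labels recovered from the slide: Text; TRIPS parser; MLN event filtering; MLN event feature extractor; CRF timex extractor; timex normalizer. Outputs: events, event features, timex, timex features. The event outputs are marked Task B: TRIPS and Task B: TRIOS; the timex outputs are marked Task A.]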
Identifying Temporal Relations (Tasks C-F)
• TempEval-2 Tasks:
• Task C: Temporal relation between event and temporal expression in the same sentence
• Task D: Temporal relation between event and document creation time
• Task E: Temporal relation between the main events of adjacent sentences
• Task F: Temporal relation between two events where one event syntactically dominates the other
• Our approach
• Based on MLN
• Features from TRIPS parser and TRIOS system
Features Task C Task D Task E Task F
Event Class YES YES e1 x e2 e1 x e2
Event Tense YES YES e1 x e2 e1 x e2
Event Aspect YES YES e1 x e2 e1 x e2
Event Polarity YES YES e1 x e2 e1 x e2
Event Stem YES YES e1 x e2 e1 x e2
Event Word YES YES YES YES
Event Constituent YES e1 x e2 e1 x e2
Event Ont-type YES YES e1 x e2 e1 x e2
Event LexAspect x Tense YES YES e1 x e2 e1 x e2
Event Pos YES YES e1 x e2 e1 x e2
Timex Word YES
Timex Type YES YES
Timex Value YES YES
Timex DCT relation YES YES
Event’s semantic role YES YES
Event’s argument’s ont-type YES YES
TLINK event-time signal YES YES
SLINK event-event relation type YES
Table 1: Features used for Tasks C, D, E and F.
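For the event-event tasks (E and F), the table lists most features as "e1 x e2". The sketch below reads that as one combined feature built from the values of both events, which is an interpretation, not something the slide spells out. The function name, the feature naming scheme, and the subset of attributes are hypothetical. The two example events reuse the attribute values extracted for "He fought in the war".

```python
# Attributes combined across the two events (a subset of the table rows).
PAIRED_FEATURES = ["class", "tense", "aspect", "polarity", "pos"]


def pair_features(e1, e2):
    """Build one conjoined feature per attribute, plus each event's word."""
    features = {f"{name}_e1xe2": f"{e1[name]}|{e2[name]}"
                for name in PAIRED_FEATURES}
    # Event Word is listed as a plain feature for every task, so the
    # two words are kept separate instead of being conjoined.
    features["word_e1"] = e1["word"]
    features["word_e2"] = e2["word"]
    return features


fight = {"word": "FIGHT", "class": "OCCURRENCE", "tense": "PAST",
         "aspect": "NONE", "polarity": "POSITIVE", "pos": "VERBAL"}
war = {"word": "WAR", "class": "OCCURRENCE", "tense": "NONE",
       "aspect": "NONE", "polarity": "POSITIVE", "pos": "NOUN"}

relation_features = pair_features(fight, war)
print(relation_features)
```

Conjoining the values lets a classifier learn weights for specific combinations, such as a past-tense verbal event paired with a tenseless nominal event, instead of treating each event's attributes independently.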
Performance on Task C-F
          TRIPS               TRIOS               Best (with corpus features)
Task      Precision  Recall   Precision  Recall   Precision
Task C    0.63       0.52     0.65       0.52     0.63 (JU-CSE, UCFD, NCSU-indi)
Task D    0.76       0.69     0.79       0.67     0.82 (TIPSem)
Task E    0.58       0.50     0.56       0.42     0.55 (TIPSem)
Task F    0.59       0.54     0.6        0.46     0.66 (NCSU-individual)
Table 1: Performance of Temporal Relations on TempEval-2 (Task C-F)
Overall Performance in TempEval-2
Task     Description                             Best
Task A   Temporal expression extraction          TRIOS
Task B   Event Extraction                        TIPSem
Task C   Event-Timex relationship                TRIOS
Task D   Event-DCT relationship                  TIPSem
Task E   Main event-event relationship           TRIOS
Task F   Subordinate event-event relationship    TRIOS
Table 1: Head-to-head comparison of TRIOS, TIPSem and JU-CSE-TEMP (teams that approached all tasks) in TempEval-2 challenge
Future Work
• Automatically build larger temporally annotated corpus:
• for news domain, with human reviews
• explore other domains
• Temporal Structure in Discourse
Summary
• Approached all six tasks in TempEval-2 (2010)
• Systems: hybrid between semantic parser and ML classifiers
• TRIOS System: Performed better than any other system approaching all tasks
• TRIPS System: Higher recall; could be used for automatic temporal annotation with human review
Questions?
• Temporal Expression Normalizer:
http://www.cs.rochester.edu/u/naushad/temporal
Benefits of TRIPS ontology
• Superior semantic ontology; better abstraction
• No problem with word sense disambiguation
• Considers semantic roles for disambiguation
• Helps to generate better links
Table 1: Most common relTypes used in SLINKs and RLINKs
Our Role  VerbNet equivalents  Lirics equivalents  SLINK Count  RLINK Count
Agent  Agent, Actor  Agent  19  709
Theme  Theme, Stimulus  Theme  336  1137
Affected  Patient  Patient  13  92
Cause  Cause  Cause  49
Goal-as-Loc  Destination  finalLocation  47
To-Loc  Recipient  Goal  46
At-Loc  Location  Location  42
In-Loc  Location  Location  28
On  Location  Location  20
Situated-In  Location?  Location?  39
Purpose  –  Purpose  226