Automatic temporal ordering of events described in discourse has been of great interest in recent years. Event orderings are conveyed in text via various linguistic mechanisms including the use of expressions such as “before”, “after” or “during” that explicitly assert a temporal relation – temporal signals. We investigate the role of temporal signals in temporal relation extraction and provide a quantitative analysis of these expressions in the TimeBank annotated corpus.
Introduction Temporal links Temporal signals Improving annotation Summary
A Corpus-based Study of Temporal Signals
Leon Derczynski
University of Sheffield
20 July, 2011
Outline
1 Introduction
2 Temporal links
3 Temporal signals
4 Improving annotation
5 Summary
Motivation
Language for time helps us describe:
changes
planning
history
Time is not always explicit in natural language – we don’t include a timestamp with every action
Goals:
Try to automatically extract temporal information from documents, so that we can build a model that connects information in a text with time
Temporal Entities
What elements can we try to extract from discourse?
Each document might contain:
Basic primitives:
Events – occurrences, states, reports
Times – dates and times, durations, sets
Linkages between primitives:
general temporal link
aspectual links and subordination
We can use the basic primitives as nodes on a graph, and links as its arcs.
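The graph view above can be sketched in a few lines of Python. This is only an illustration: the class names `Entity` and `TemporalGraph`, the identifiers `e1`, `e2`, `t1`, and the example links are all assumptions made for the sketch and are not taken from TimeML tooling or from the talk.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    """A basic primitive: an event or a time expression (hypothetical type)."""
    id: str
    kind: str   # "EVENT" or "TIMEX"
    text: str


@dataclass
class TemporalGraph:
    """Entities are nodes; temporal links are labelled, directed arcs."""
    nodes: dict = field(default_factory=dict)
    arcs: list = field(default_factory=list)   # (source_id, relation, target_id)

    def add_entity(self, entity):
        self.nodes[entity.id] = entity

    def add_link(self, source_id, relation, target_id):
        # Only allow links between entities that are already in the graph.
        if source_id not in self.nodes or target_id not in self.nodes:
            raise KeyError("both ends of a link must be known entities")
        self.arcs.append((source_id, relation, target_id))

    def links_from(self, source_id):
        return [(rel, tgt) for src, rel, tgt in self.arcs if src == source_id]


# Illustrative content only: two events and one time expression.
graph = TemporalGraph()
graph.add_entity(Entity("e1", "EVENT", "ate"))
graph.add_entity(Entity("e2", "EVENT", "went"))
graph.add_entity(Entity("t1", "TIMEX", "9 o'clock"))
graph.add_link("e2", "AFTER", "e1")
graph.add_link("e1", "IS_INCLUDED", "t1")
```

Storing arcs as labelled triples keeps the sketch small; a real system would probably also index arcs by target and validate the relation label.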
Temporal link labelling
How do we label the links between temporal entities?
First, choose a relation set: TimeML gives us 13, including before, simultaneous, includes...
Some relations have transitive and commutative properties:
If “a before b” and “b before c” then we can infer “a before c”
This means that consistency can be important
Develop a gold-standard corpus – TimeBank
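The transitivity rule and the consistency concern above can be illustrated with a small closure computation. This is a minimal sketch that handles only the before relation; the function names `before_closure` and `is_consistent` are hypothetical, and a full TimeML reasoner would need composition rules for every relation type.

```python
def before_closure(pairs):
    """Return all (x, y) with "x before y" that follow by transitivity."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                # "a before b" and "b before d" give "a before d".
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def is_consistent(closure):
    """A set of before links is inconsistent if anything precedes itself."""
    return all(a != b for a, b in closure)


annotated = {("a", "b"), ("b", "c")}
inferred = before_closure(annotated)
```

Here `inferred` contains the extra pair `("a", "c")`. Running `is_consistent` on the closure of a contradictory annotation such as `{("a", "b"), ("b", "a")}` returns `False`, which is the kind of check that makes consistency matter for a gold-standard corpus.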
Automated temporal link labelling
How can we automatically label links?
Machine learning approaches: learn how to label a link based on the times and events it may connect
Use TimeBank and other annotated corpora as examples of how links are labelled
A difficult task: notable research effort, including various evaluation exercises, has been devoted to it
Overall accuracy remains around 60%–70%: too low (see Chambers & Jurafsky, 2008; Mirroshandel et al., 2010; TempEval-2010)
Source of temporal linking information
What information can we use to label links?
If a human can manage to understand temporal relations, the information must be somewhere
Possible sources:
– tense and aspect
– world knowledge
– discourse structure
– specific time information (at 9 o’clock)
– explicit signals: temporal conjunctions
Temporal conjunctions
Are these words/phrases useful for automatic understanding?
A baseline system could learn to label links with 62% accuracy
With simple modification, links in TimeBank that had associated signals could be annotated with 83% accuracy
Clear indication that signals are an accessible source of temporal information
Temporal conjunctions in newswire
What do temporal conjunctions look like in TimeBank?
11.2% of temporal links are annotated as having one (718 instances)
Top words:
– prepositions (in, for, on)
– conjunctions (after, before, since)
Temporal conjunctions in newswire
Phrase         Corpus freq.   Occurrences as signal   Likelihood of being a signal
subsequently   3              3                       100%
after          72             67                      93%
follows        4              3                       75%
before         33             23                      70%
until          36             25                      69%
during         19             13                      68%
as soon as     3              2                       67%
Table: A sample of the phrases most likely to be annotated as a signal when they occur in TimeBank, restricted to phrases that occur more than once in the corpus.
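The likelihood column appears to be the number of occurrences as a signal divided by the corpus frequency. A short sketch that recomputes it from the counts in the table (the variable and function names are chosen here for illustration):

```python
# (corpus frequency, occurrences as signal), copied from the table above.
counts = {
    "subsequently": (3, 3),
    "after": (72, 67),
    "follows": (4, 3),
    "before": (33, 23),
    "until": (36, 25),
    "during": (19, 13),
    "as soon as": (3, 2),
}


def signal_likelihood(corpus_freq, as_signal):
    """Fraction of a phrase's occurrences that are annotated as a signal."""
    return as_signal / corpus_freq


likelihoods = {phrase: signal_likelihood(*c) for phrase, c in counts.items()}

# Print phrases from most to least likely to be a signal.
for phrase, p in sorted(likelihoods.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{phrase:<14}{p:.0%}")
```

With only three or four occurrences behind some rows, the percentages for rare phrases such as subsequently or follows are much less reliable than those for after.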
Discrimination of temporal signal words
What else are these temporal signal words used for?
Some words are very likely to have a temporal sense:
subsequently – 3 instances, all temporal;
after – 72 instances, 93% temporal.
Other words are versatile:
from – 366 instances, 5% temporal;
between – 33 instances, 1 temporal.
Signal-to-link relations
What temporal relations do these words signify?
after doesn’t always signify a temporal after relation
Word order is important
After I ate, I went to bed
I ate after I went to bed
Signal phrase   TimeML relation   Frequency
after           AFTER             56
after           ENDS              6
after           BEGINS            4
after           IAFTER            1
already         BEFORE            6
already         INCLUDES          4
already         IS_INCLUDED       3
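The table can be read as a per-signal distribution over relation types. The sketch below computes that distribution from the listed counts; the helper names are hypothetical. Note that the four rows for after sum to 67, which matches the number of times after occurs as a signal in TimeBank.

```python
from collections import defaultdict

# (signal phrase, TimeML relation, frequency), copied from the table above.
rows = [
    ("after", "AFTER", 56),
    ("after", "ENDS", 6),
    ("after", "BEGINS", 4),
    ("after", "IAFTER", 1),
    ("already", "BEFORE", 6),
    ("already", "INCLUDES", 4),
    ("already", "IS_INCLUDED", 3),
]

by_signal = defaultdict(dict)
for signal, relation, freq in rows:
    by_signal[signal][relation] = freq


def relation_distribution(signal):
    """Relative frequency of each relation type for one signal phrase."""
    relation_counts = by_signal[signal]
    total = sum(relation_counts.values())
    return {rel: n / total for rel, n in relation_counts.items()}


def most_likely_relation(signal):
    """The relation most often associated with a signal in these counts."""
    relation_counts = by_signal[signal]
    return max(relation_counts, key=relation_counts.get)
```

Always guessing the majority relation would be right for roughly 84% of the after links counted here but for under half of the already links, which is one way to see that a signal word alone does not fix the relation type; argument order and context still matter.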
Signal class
How can we characterise temporal signals?
Signals are likely to belong to a closed class of words
Common prepositions as seen earlier
Some adverbs – previously, subsequently
Set phrases – as soon as, so far
Spatial/Temporal overlap
Time and space are related and events are constrained in terms of both
Language for space and time has some similarities
before has both temporal and spatial senses
Spatially annotated corpora – SpatialML
Relative spatial links in this corpus are much more likely to employ a signal (97.5%)
Possible explanation – temporal language is more diverse (tense, auxiliaries)
Re-annotation
Are these signals correctly annotated in TimeBank?
Manual examination: start with words that are likely to be temporal signals
before: found 33 times in the corpus, 23 are signals
Many under-annotated cases:
before the war began
was scheduled to return to port before hostilities erupted
Re-annotation
How could we improve signal annotation?
Linguistic description of temporal conjunctions may be weak
Annotation guidelines may be insufficient
Solution: provide an enhanced signal description, and revise TimeBank accordingly
Formal signal description
A temporal signal is a word or phrase that indicates the type of temporal relation between two intervals
Signal surface forms have a head and an optional quantifier
shortly after – quantified temporal signal
Temporal signals have exactly two arguments (events and/or times)
One argument may be implicit (e.g. for “later”)
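One way to picture this description is as a small record type. The sketch below is an assumption-laden illustration: the class `TemporalSignal`, its field names, and the identifiers `e1`, `e2`, `e3` are invented here, and representing an implicit argument as `None` is a design choice of the sketch, not something stated in the talk.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TemporalSignal:
    """A signal with a head, an optional quantifier, and two argument slots."""
    head: str
    quantifier: Optional[str] = None
    arg1: Optional[str] = None   # id of an event or time; None means implicit
    arg2: Optional[str] = None   # id of an event or time; None means implicit

    def __post_init__(self):
        # Exactly two arguments exist, and at most one of them may be implicit.
        if self.arg1 is None and self.arg2 is None:
            raise ValueError("at most one argument may be implicit")

    @property
    def surface_form(self):
        if self.quantifier:
            return f"{self.quantifier} {self.head}"
        return self.head

    @property
    def has_implicit_argument(self):
        return self.arg1 is None or self.arg2 is None


# "shortly after": quantified signal with both arguments explicit.
shortly_after = TemporalSignal(head="after", quantifier="shortly",
                               arg1="e1", arg2="e2")

# "later": one argument is left implicit.
later = TemporalSignal(head="later", arg1="e3")
```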
Augmented TimeBank
We examined 30 of the most frequent signal words and phrases that were not annotated as temporal
This comprised around 1,000 instances in text
We annotated any missed temporal signals, including EVENT and TLINK annotations where required
This resulted in 15.8% of TLINKs using a signal
Conclusion
Temporal signals are a usable and important source of information
We have provided a definition for temporal signals
Existing corpora have been upgraded with better annotation
Future work
Automatic signal discrimination
Signal association
Applying findings to spatial language
Thank you. Are there any questions?