1
Anaphora, Discourse and Information Structure
Oana [email protected]
EGK Colloquium
April 29, 2004
2
Overview
Anaphora Resolution
Discourse (parsing)
Balkanet
Information Structure
Joint work with Prof. Dan Cristea & Prof. Dan Tufis; Univ. of Iasi
3
Anaphora Resolution
“If an incendiary bomb drops next to you, don’t lose your head. Put it in a bucket and cover it with sand.”
Ruslan Mitkov (p.c.)
4
Anaphora Resolution
“Anaphora represents the relation between a term (named anaphor) and another (named antecedent), when the interpretation of the anaphor is somehow determined by the interpretation of the antecedent”.
Barbara Lust, Introduction to Studies of Anaphora Acquisition, D. Reidel, 1986
5
Anaphora Resolution Types
Coreference resolution: the anaphor and the antecedent refer to the same entity in the real world.
Three blind mice, three blind mice. See how they run! See how they run!
Functional anaphora resolution: the anaphor and the antecedent refer to two distinct entities that stand in a certain relation.
When the car stopped, the driver got scared.
Halliday & Hasan 1976
6
Types of Coreference
Pronominal coreference
The butterflies were dancing in the air. They offered an amazing coloured show.
Common nouns with different lemmas
Amenophis IV's wife was looking through the window. The beautiful queen was sad.
Common nouns with different lemmas and number
A patrol was marching in the street. The soldiers were very well trained.
Proper names
The President of the U.S. gave a very touching speech. Bush talked about the antiterrorist war.
Appositions
Mrs. Parsons, the wife of a neighbour on the same floor, was looking for help.
Nominal predicates
Maria is the best student of the whole class.
Function-value coreference
The visitors agreed on the ticket price. They concluded that $100 was not that much.
7
RARE – Robust Anaphora Resolution Engine
[Diagram: a text and one or more AR-models (AR-model1, AR-model2, AR-model3) are fed into RARE, which outputs coreference chains.]
8
RARE: Two main principles
1. Coreferential relations are semantic, not textual.
Coreferential anaphoric relation
[Diagram: on the text layer, expression a proposes center-a on the semantic layer; a later expression b evokes the same center-a, establishing the coreferential anaphoric relation.]
9
RARE: Two main principles
2. Processing is incremental
[Diagram: each referential expression is processed as it is read: RE a projects PSa on the projection layer; PSa proposes center-a on the semantic layer; later, RE b projects PSb, and PSb evokes center-a.]
10
Terminology
text layer: reference expressions (REa, REb, REc, REd, ..., REx)
projection layer: projected structures (PSx, ...)
semantic layer: discourse entities (DE1, ..., DEj, ..., DEm)
11
What is an AR-model?
An AR-model consists of:
• knowledge sources
• primary attributes
• heuristics/rules
• a domain of referential accessibility
12
Primary attributes
1. Morphological (number, lexical gender, person)
2. Syntactic (REs as constituents of a syntactic tree, quality of being adjunct, embedded or complement of a preposition, inclusion or not in an existential construction, syntactic patterns in which the RE is involved)
3. Semantic and lexical (RE’s head position in a conceptual hierarchy, animacy, sex/natural gender, concreteness, inclusion in a synonymy class, semantic roles)
4. Positional (RE’s offset in the text, inclusion in a discourse unit)
5. Surface realisation (zero/clitic/full/reflexive/possessive/ demonstrative/reciprocal pronoun, expletive “it”, bare noun, indefinite NP, definite NP, proper noun)
6. Other (domain concept, frequency of the term in the text, occurrence of the term in a heading)
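As an illustration, the six attribute groups above could be carried by a record like the following. This is a hypothetical layout, not RARE's actual data structure; every field name here is ours.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ProjectedStructure:
    """Hypothetical attribute record for one projected RE; field names are ours."""
    # 1. morphological
    number: Optional[str] = None        # "sg" / "pl"
    lexical_gender: Optional[str] = None
    person: Optional[int] = None
    # 2. syntactic
    is_adjunct: bool = False
    in_existential: bool = False
    # 3. semantic and lexical
    animate: Optional[bool] = None
    natural_gender: Optional[str] = None
    # 4. positional
    offset: int = -1
    discourse_unit: Optional[int] = None
    # 5. surface realisation
    realisation: Optional[str] = None   # e.g. "definite NP", "clitic pronoun"
    # 6. other
    frequency: int = 0
    in_heading: bool = False

ps = ProjectedStructure(number="sg", lexical_gender="fem", person=3,
                        offset=42, realisation="full pronoun")
```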
13
Knowledge sources
• A knowledge source: a (virtual) processor able to compute values for attributes on the projection layer
Minimum set: POS-tagger + shallow parser
14
Matching Rules
• Certifying Rules (applied first): certify without ambiguity a possible candidate.
• Demolishing Rules (applied afterwards): rule out a possible candidate.
• Scored Rules: increase/decrease a resolution score associated with a pair <PS, DE>.
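The ordering of the three rule classes can be sketched as follows. This is a minimal sketch; `match` and the rule signatures are our own names, not RARE's API, and rules are modelled as plain callables over (PS, DE) pairs.

```python
def match(ps, candidates, certifying, demolishing, scored):
    """Apply the three rule classes in order; return the chosen DE or None."""
    # 1. certifying rules settle the choice without ambiguity
    for de in candidates:
        if any(rule(ps, de) for rule in certifying):
            return de
    # 2. demolishing rules discard impossible candidates
    survivors = [de for de in candidates
                 if not any(rule(ps, de) for rule in demolishing)]
    if not survivors:
        return None
    # 3. scored rules rank whatever is left
    return max(survivors,
               key=lambda de: sum(rule(ps, de) for rule in scored))
```

For example, a number-agreement demolishing rule would discard a plural candidate for a singular pronoun before any scoring takes place.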
15
Domain of referential accessibility
Filter and order the candidate discourse entities:
a. Linearly (Dorepaal, Mitkov, ...)
b. Hierarchically (Grosz & Sidner; Cristea, Ide & Romary, ...)
16
The engine
for_each RE in RESequence:
    projection(RE)
    proposing/evoking(PS)
    completion(DE, PS)
    re-evaluation
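A hedged sketch of this four-phase loop, with stub helpers standing in for the real components (all names are illustrative, and re-evaluation is left as a no-op):

```python
def project(re_, knowledge_sources):
    """Projection: copy the RE's attributes and let each knowledge source add more."""
    ps = dict(re_)
    for source in knowledge_sources:
        source(ps)
    return ps

def propose_or_evoke(ps, des, rules):
    """Return an existing DE that all rules accept, or None to propose a new one."""
    for de in des:
        if all(rule(ps, de) for rule in rules):
            return de
    return None

def complete(de, ps):
    """Completion: fold the PS into the DE it resolved to."""
    de["mentions"].append(ps)

def reevaluate(des):
    """Re-evaluation: postponed decisions would be retried here (no-op in this sketch)."""

def resolve(re_sequence, knowledge_sources, rules):
    des = []                                   # discourse entities built so far
    for re_ in re_sequence:
        ps = project(re_, knowledge_sources)   # 1. projection
        de = propose_or_evoke(ps, des, rules)  # 2. proposing/evoking
        if de is None:
            de = {"mentions": []}
            des.append(de)
        complete(de, ps)                       # 3. completion
        reevaluate(des)                        # 4. re-evaluation
    return des
```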
17
The engine: Projection
[Diagram: REx on the text layer is projected as PSx on the projection layer; the knowledge sources fill in its primary attributes.]
18
The engine: Proposing
[Diagram: the heuristics/rules match PSx against the discourse entities in its domain of referential accessibility, either evoking an existing DE or proposing a new one, DEn.]
19
The engine: Proposing (2)
• apply certifying rules
• apply demolishing rules
• apply scored rules
• sort the candidates in descending order of their scores
• use thresholds to:
– propose a new DE
– link the current PS to an existing DE
– postpone the decision
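The threshold policy in the proposing step might look like this; the threshold values are invented for illustration, since the actual values used by RARE are not given here.

```python
def decide(scores, t_link=0.5, t_new=0.2):
    """scores maps candidate DEs to their resolution score.
    Both thresholds (t_link, t_new) are illustrative, not RARE's."""
    ranked = sorted(scores.items(), key=lambda kv: -kv[1])
    if not ranked or ranked[0][1] < t_new:
        return ("new-DE", None)          # no plausible antecedent: propose a new DE
    best_de, best_score = ranked[0]
    if best_score >= t_link:
        return ("link", best_de)         # confident: link the PS to the existing DE
    return ("postpone", best_de)         # ambiguous zone: postpone the decision
```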
20
The engine: Completion
[Diagram: the attributes gathered on PSx are merged into the discourse entity DEn it was resolved to.]
21
The engine: Completion (2)
[Diagram, continued: after completion, DEn on the semantic layer carries the merged information.]
22
The engine: Re-evaluation
[Diagram: in the light of the newly completed DEn, earlier postponed decisions are reconsidered.]
23
The engine: Re-evaluation (2)
[Diagram: the final configuration of the three layers after re-evaluation.]
24
The Coref Corpus
• 4 chapters from George Orwell’s novel “1984”, summing to approx. 19,500 words.
• Preprocessed using a POS-tagger & an FDG-parser.
• The NPs were automatically extracted from the FDG structure (some manual corrections were necessary, as well as adding other types of referential expressions).
• Manual annotation of the coreferential links (each text was assigned to two annotators).
• Inter-annotator agreement: as low as 60%.
Our annotation is conformant with MUC & ACE.
25
The Coref Corpus
                          Text 1   Text 2   Text 3   Text 4   Total
No. of sentences             311      175      169      328     983
No. of words                6935     3317     3260     6008   19520
No. of REs                  1942      914      916     1702    5472
Avg. REs per sentence        6.2      5.2      5.4      5.1     5.4
Pronouns                     645      281      362      614    1902
No. of DEs                   921      520      464      863
26
Evaluation
Success Rate = # correctly solved anaphors / # all anaphors
For the four texts we obtained values between 60% and 70%.
(Mitkov 2000)
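The measure above is straightforward to compute:

```python
def success_rate(correctly_solved, all_anaphors):
    """Success rate as defined above; returns 0.0 for an empty test set."""
    return correctly_solved / all_anaphors if all_anaphors else 0.0
```

For instance, 65 correctly solved anaphors out of 100 gives a success rate of 0.65, i.e. within the 60-70% range reported for the four texts.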
28
Discourse Parsing
Input: plain text
Goal:
- Automatically obtain a discourse structure of the text (resembling RST trees).
- Apply Veins Theory to produce focused summaries.
Cristea, Ide & Romary 1998
29
Veins Theory: Quick Intro
Cristea, Ide & Romary 1998
[Diagram: a discourse tree over units 1-5, with satellites 2 and 4, annotated with head expressions (root H = 1 3 5) and vein expressions (V = 1 3 5 for the nuclear units; unit 2 gets V = 1 2 3 5, unit 4 gets V = 1 3 4 5).]
Head expression: the sequence of the most important units within the corresponding span of text
Vein expression: the sequence of units that are required to understand the span of text covered by the node, in the context of the whole discourse
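These two definitions can be turned into a small recursive computation. The sketch below simplifies full Veins Theory (it omits the special parenthesized treatment of left satellites) but reproduces the example tree over units 1-5, with satellites 2 and 4:

```python
def leaf(unit, nuclear):
    return {"unit": unit, "nuc": nuclear, "children": []}

def node(nuclear, *children):
    return {"unit": None, "nuc": nuclear, "children": list(children)}

def head(n):
    """Head expression: a leaf's own unit; otherwise the concatenated
    heads of the nuclear children."""
    if not n["children"]:
        return [n["unit"]]
    return [u for c in n["children"] if c["nuc"] for u in head(c)]

def veins(n, parent_vein=None):
    """Vein expressions, simplified: the root's vein is its head; a nucleus
    inherits its parent's vein; a satellite adds its own head to it."""
    if parent_vein is None:
        v = head(n)
    elif n["nuc"]:
        v = parent_vein
    else:
        v = sorted(set(head(n)) | set(parent_vein))
    n["vein"] = v
    for c in n["children"]:
        veins(c, v)

# the example tree: nuclei 1, 3, 5; satellites 2 and 4
u1, u2 = leaf(1, True), leaf(2, False)
u3, u4, u5 = leaf(3, True), leaf(4, False), leaf(5, True)
tree = node(True, node(True, node(True, u1, u2), node(True, u3, u4)), u5)
veins(tree)
```

Running this yields the values in the slide: the root's head is 1 3 5, the nuclear units all get vein 1 3 5, and the satellites 2 and 4 get veins 1 2 3 5 and 1 3 4 5 respectively.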
30
Focused Summaries
We call a focused summary on an entity X a coherent excerpt presenting how X is involved in the story that constitutes the content of the text.
- It is given by the vein expression of the unit to which X belongs.
31
The method
[Pipeline: plain text → FDG parser → NP detector + segments detector → sentence-tree extractor → AR-engine → (tagged text, coref, s-trees) → Discourse Parser → discourse structure → Veins Theory → focused summary]
32
The method
FDG parser: the Conexor FDG parser (http://www.connexor.com/m_syntax.html).
33
The method
NP detector: extracts the NPs from the FDG structure.
34
The method
AR-engine: RARE (presented above).
35
The method
Segments detector: detects the boundaries of clauses, based on learning methods.
Georgiana Puscasu (2004): A Multilingual Method for Clause Splitting.
36
The method
Sentence-tree extractor:
- proposes one or more tree structures at the sentence level;
- the leaves are the previously detected clauses;
- uses the FDG structure and cue phrases.
38
The Discourse Parser
– We have a tree for each sentence.
– The goal is to incrementally integrate these trees into a single structure covering the entire text.
The current tree is inserted at each node on the right frontier; each resulting structure is scored considering:
– the coreference links
– Centering Theory
– Veins Theory
Cristea, Postolache, Pistol (2004): Summarization through Discourse Structure (submitted to COLING)
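The right-frontier insertion step can be sketched as follows. Here `score` stands in for the combined coreference/Centering/Veins criteria, and the adjoining operation is simplified to appending a child; both simplifications are ours.

```python
import copy

def right_frontier(tree):
    """Nodes on the path from the root down through the rightmost children."""
    frontier = [tree]
    while tree["children"]:
        tree = tree["children"][-1]
        frontier.append(tree)
    return frontier

def attach_best(tree, incoming, score):
    """Attach `incoming` at every right-frontier node in turn and keep the
    highest-scoring resulting structure (the argmax over attachment points)."""
    best, best_score = None, float("-inf")
    for i in range(len(right_frontier(tree))):
        candidate = copy.deepcopy(tree)
        target = right_frontier(candidate)[i]
        target["children"] = target["children"] + [copy.deepcopy(incoming)]
        s = score(candidate)
        if s > best_score:
            best, best_score = candidate, s
    return best
```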
39
The Discourse Parser
– At the end of the process: a set of trees corresponding to the input text, each with a score.
– T* = argmax_i score(T_i)
– Compute Veins(T*) and extract the summary.
40
Discussion & Evaluation
- We do obtain coherent summaries automatically!
- How to evaluate?
- We have 90 summaries made by humans...
1) Construct a gold summary out of the 90 human summaries and compare it with the system output?
2) Compare the system output with all 90 summaries and take the best result?
42
Information Structure
Many approaches to IS:
- Prague School approach (Sgall et al.)
- Formal account of English intonation (Steedman)
- Integrating different means of IS realization within one grammar framework (Kruijff)
- Formal semantics of focus (Krifka, Rooth)
- Formal semantics of topic (Hendriks)
- Integrating IS within a theory of discourse interpretation (Vallduvi)
- IS-sensitive discourse context updating (Kruijff-Korbayova)
43
Information Structure
Goals:
- Improve/create/enlarge a corpus annotated with IS (and more);
- Investigate means of continuing the annotation (at least partially) automatically;
- Investigate how the (major) NLP tasks can benefit from IS;
- Find correlations between different features;
- Build a system that detects IS.
44
Summary
Anaphora Resolution: RARE
Discourse Parsing: Veins theory
Balkanet: Multilingual WordNet
Information Structure