Page 1: ACE Annotation

4/14/2005 4/14/2005 1

ACE Annotation

Ralph Grishman

New York University

Page 2: ACE Annotation

ACE

(Automatic Content Extraction)

• Government evaluation task for information extraction

• 6 evaluations since 2000

– next one Nov. 2005

– incremental increases in task complexity

• (Current) criteria for what to annotate:

– interest to Government sponsors

– good inter-annotator agreement

– reasonable density of annotations

• initially for news, now for wider range of genres

(trade-off between coverage and agreement)

Page 3: ACE Annotation

Types of Annotations

Entities

Relations

Events

• Inter-annotator agreement measured by ‘value’ metric

– roughly 1.00 - % missing - % spurious
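The rough form of the value metric given above can be sketched in a few lines. This is only the slide's approximation, not the official ACE scorer. The function name `approx_value` and the choice to express both percentages as fractions of the reference annotation count are assumptions made for illustration.

```python
def approx_value(n_reference: int, n_missing: int, n_spurious: int) -> float:
    """Rough value score: 1.00 - fraction missing - fraction spurious.

    Assumption: both fractions are taken relative to the number of
    reference annotations. The slide does not specify the denominator.
    """
    if n_reference <= 0:
        raise ValueError("n_reference must be positive")
    return 1.0 - n_missing / n_reference - n_spurious / n_reference


# Hypothetical counts: 100 reference annotations, 6 missed, 4 spurious.
score = approx_value(100, 6, 4)
print(f"{score:.2f}")
```

With these hypothetical counts the score comes out at about 0.90. Note that under this approximation a very noisy annotator can score below zero.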

Page 4: ACE Annotation

Entities

• Objects of the discourse

• (Semantic) Types:

– persons, organizations, geo-political entities, [non-political] locations, facilities, vehicles, weapons

• Two levels of annotation:

– mentions (individual names, nominals, pronouns)

– entities (sets of coreferring mentions)

• Inter-annotator agreement around 0.90
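A minimal sketch of the two annotation levels, assuming a simple in-memory representation: a mention is a text span with a level (name, nominal, or pronoun), and an entity is a typed set of coreferring mentions. The class names, field names, and the sample sentence are all hypothetical and are not taken from the ACE file format.

```python
from dataclasses import dataclass, field

MENTION_LEVELS = {"name", "nominal", "pronoun"}


@dataclass(frozen=True)
class Mention:
    text: str
    level: str   # one of MENTION_LEVELS
    start: int   # character offset into the document
    end: int     # exclusive end offset


@dataclass
class Entity:
    entity_type: str                      # e.g. "person", "organization"
    mentions: list = field(default_factory=list)

    def add_mention(self, doc: str, text: str, level: str) -> Mention:
        """Locate the first occurrence of `text` in `doc` and record it."""
        if level not in MENTION_LEVELS:
            raise ValueError(f"unknown mention level: {level}")
        start = doc.index(text)
        mention = Mention(text, level, start, start + len(text))
        self.mentions.append(mention)
        return mention


# Hypothetical two-sentence document.
doc = "Microsoft hired a new CEO. She said the company would expand."

person = Entity("person")
person.add_mention(doc, "a new CEO", "nominal")
person.add_mention(doc, "She", "pronoun")

org = Entity("organization")
org.add_mention(doc, "Microsoft", "name")
org.add_mention(doc, "the company", "nominal")
```

Here the document has two entities with two mentions each. Deciding which mentions corefer is the annotator's job, and the sketch only stores the result.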

Page 5: ACE Annotation

Relations

• Binary, generally static relationships between entities

• Main types:

– physical (location), part-whole, personal-social, org-affiliation, gen-affiliation, and agent-artifact

• Example: the CEO of Microsoft (org-affiliation)

• Inter-annotator agreement (given entities) around 0.75-0.80
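The relation example can be written down as a typed pair of arguments. This is a sketch under assumptions: the `Relation` tuple, the `make_relation` helper, and the argument order (person first, organization second) are illustrative choices, not the ACE representation.

```python
from typing import NamedTuple

# Main relation types as listed on the slide.
RELATION_TYPES = {
    "physical", "part-whole", "personal-social",
    "org-affiliation", "gen-affiliation", "agent-artifact",
}


class Relation(NamedTuple):
    relation_type: str
    arg1: str   # first entity mention
    arg2: str   # second entity mention


def make_relation(relation_type: str, arg1: str, arg2: str) -> Relation:
    """Build a binary relation, rejecting types not on the list above."""
    if relation_type not in RELATION_TYPES:
        raise ValueError(f"unknown relation type: {relation_type}")
    return Relation(relation_type, arg1, arg2)


# "the CEO of Microsoft": the person mention is affiliated with the organization.
rel = make_relation("org-affiliation", "the CEO", "Microsoft")
```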

Page 6: ACE Annotation

Events

• New for 2005

• Types:

– life (born/marry/die), movement, transaction, business (start / end), personnel (hire / fire), conflict (attack), contact (meet), justice

• Example: China purchased two subs from Russia in 1998.

– transfer-ownership: China = buyer, purchased = trigger, two subs = artifact, Russia = seller, 1998 = time

• Inter-annotator agreement (given entities) around 0.55-0.60

– some events (born, hire/fire, justice) fairly clear-cut

– others (attack, meet, move) hard to delimit

– coreference sometimes hard

• No causal / subevent linkage -- too hard (maybe in 2006?)
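The transfer-ownership example on this slide can be sketched as an event record with a trigger and role-labelled arguments. The `Event` class and its field names are hypothetical. The role labels (buyer, artifact, seller, time) are the ones shown on the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    event_type: str
    trigger: str                                    # word that evokes the event
    arguments: dict = field(default_factory=dict)   # role -> text span


sentence = "China purchased two subs from Russia in 1998."

event = Event(
    event_type="transfer-ownership",
    trigger="purchased",
    arguments={
        "buyer": "China",
        "artifact": "two subs",
        "seller": "Russia",
        "time": "1998",
    },
)

# Sanity check: the trigger and every argument are literal spans of the sentence.
spans = [event.trigger, *event.arguments.values()]
all_in_text = all(span in sentence for span in spans)
```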

Page 7: ACE Annotation

Corpora

• Genres

– newswire and broadcast news

– adding weblogs, conversational telephone, talk shows, usenet this year

• Multi-lingual

– English, Chinese, Arabic (since 2003)

• Volume

– 2004 set: 140 KW training, 50 KW test per language

• Distributed by LDC

Page 8: ACE Annotation

A (Nearly) Semantic Annotation

• Annotation criteria primarily truth-conditional, not linguistic

– although annotations are linked back to text (e.g., event triggers)

– and some constraints are included to improve inter-annotator agreement (e.g., event arguments must be in same sentence as trigger)

• Event arguments are filled in using ‘true beyond a reasonable doubt’ rule

– Example: “An attack in the Middle East killed two Israelis.”

– Both the attack and die events are tagged as occurring in the Middle East
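The same-sentence constraint can be checked mechanically once sentence boundaries are known. The sketch below assumes character offsets and a precomputed list of sentence bounds. The function names and the second sentence of the sample text are hypothetical and are there only to give the check something to reject.

```python
def sentence_index(offset: int, sentence_bounds: list) -> int:
    """Return the index of the sentence whose [start, end) range holds offset."""
    for i, (start, end) in enumerate(sentence_bounds):
        if start <= offset < end:
            return i
    raise ValueError(f"offset {offset} is outside every sentence")


def arguments_in_trigger_sentence(trigger_offset: int,
                                  argument_offsets: list,
                                  sentence_bounds: list) -> bool:
    """True if every argument starts in the same sentence as the trigger."""
    home = sentence_index(trigger_offset, sentence_bounds)
    return all(sentence_index(a, sentence_bounds) == home
               for a in argument_offsets)


# First sentence is from the slide; the second is a hypothetical filler.
text = "An attack in the Middle East killed two Israelis. A second sentence follows."
first_end = text.index(".") + 1
bounds = [(0, first_end), (first_end, len(text))]

place = text.index("the Middle East")
attack_trigger = text.index("attack")
die_trigger = text.index("killed")

# Both events take the same place argument, and both pass the constraint.
ok_attack = arguments_in_trigger_sentence(attack_trigger, [place], bounds)
ok_die = arguments_in_trigger_sentence(die_trigger, [place], bounds)

# An argument taken from the second sentence would be rejected.
rejected = arguments_in_trigger_sentence(attack_trigger, [text.index("second")], bounds)
```

The check covers only the sentence constraint. Whether the place is true of the die event "beyond a reasonable doubt" remains a judgment for the annotator.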