Linguistic Resources for the 2013 TAC KBP Cold Start Evaluation

Joe Ellis (presenter), Jeremy Getman, Jonathan Wright, Stephanie Strassel

Linguistic Data Consortium, University of Pennsylvania, USA

Query Selection

Annotators search and annotate chains of entities connected by KBP slots

Cold Start queries comprised of Entity – Slot 0 – Slot 1

Inverse slots to increase connectivity, e.g.

• per:cities_of_residence – gpe:residents_of_city
• org:founded_by – {per,org,gpe}:organizations_founded
• org:top_members_employees – per:top_member_employee_of

Cold Start corpus: KBA output comprised of web documents from Ocala, FL; Kentucky; Guyana

TAC KBP Evaluation Workshop – NIST, November 18-19, 2013

Appleton Museum of Art – org:top_members_employees – John Lofgren – per:title – director
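The Entity – Slot 0 – Slot 1 query shape and the inverse slot pairs can be sketched in a few lines of Python. This is only an illustration: the names `ColdStartQuery`, `inverse_slot`, and `invert_triple` are hypothetical, and keying the `org:founded_by` inverse on the founder's entity type is an assumption read off the `{per,org,gpe}` notation on the slide.

```python
from dataclasses import dataclass

# Inverse pairs taken from the slide; each pair is stored in both directions.
_PAIRS = [
    ("per:cities_of_residence", "gpe:residents_of_city"),
    ("org:top_members_employees", "per:top_member_employee_of"),
]
INVERSE = {}
for forward, backward in _PAIRS:
    INVERSE[forward] = backward
    INVERSE[backward] = forward

# Entity types that can found an organization, per the slide's {per,org,gpe}.
FOUNDER_TYPES = ("per", "org", "gpe")


@dataclass(frozen=True)
class ColdStartQuery:
    """A query of the form Entity - Slot 0 - Slot 1 (hypothetical container)."""
    entity: str
    slot0: str
    slot1: str


def inverse_slot(slot: str, filler_type: str = "") -> str:
    """Return the inverse of a slot; org:founded_by needs the founder's type."""
    if slot == "org:founded_by":
        if filler_type not in FOUNDER_TYPES:
            raise ValueError("org:founded_by needs a filler type of per, org, or gpe")
        return f"{filler_type}:organizations_founded"
    if slot.endswith(":organizations_founded"):
        return "org:founded_by"
    return INVERSE[slot]


def invert_triple(triple, filler_type: str = ""):
    """Swap subject and filler and replace the slot with its inverse."""
    subject, slot, filler = triple
    return (filler, inverse_slot(slot, filler_type), subject)


query = ColdStartQuery("Appleton Museum of Art", "org:top_members_employees", "per:title")
inverted = invert_triple(("Appleton Museum of Art", "org:top_members_employees", "John Lofgren"))
```

Here `inverted` links John Lofgren back to the museum through `per:top_member_employee_of`, which is the kind of extra edge the inverse slots are meant to supply.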

Annotation

Unlike other SF tasks, Cold Start annotation is performed concurrently with query development

Multiple fillers at each “hop” level, all of which must be annotated and correctly connected to one another

London – gpe:residents_of_city – per:charges

• Lance Barrett
  • first-degree attempted burglary
  • theft of a firearm
  • carrying a concealed weapon

• Lesa Bailey
  • criminal conspiracy to make meth
  • unlawful possession of meth precursors
  • possession of a controlled substance
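One way to picture the "multiple fillers at each hop" requirement is a nested mapping from each hop-0 filler to its hop-1 fillers. The sketch below uses the London example from the slide; the `annotation` layout and the `chains` helper are hypothetical and not LDC's actual annotation format.

```python
from typing import Dict, List, Tuple

# Query from the slide: Entity - Slot 0 - Slot 1.
query = ("London", "gpe:residents_of_city", "per:charges")

# Hop-0 fillers (residents) mapped to their hop-1 fillers (charges).
annotation: Dict[str, List[str]] = {
    "Lance Barrett": [
        "first-degree attempted burglary",
        "theft of a firearm",
        "carrying a concealed weapon",
    ],
    "Lesa Bailey": [
        "criminal conspiracy to make meth",
        "unlawful possession of meth precursors",
        "possession of a controlled substance",
    ],
}


def chains(entity: str, fillers: Dict[str, List[str]]) -> List[Tuple[str, str, str]]:
    """List every complete entity -> hop-0 filler -> hop-1 filler chain."""
    return [
        (entity, hop0, hop1)
        for hop0, hop1_fillers in fillers.items()
        for hop1 in hop1_fillers
    ]


all_chains = chains(query[0], annotation)
```

Because each charge is stored under the person it belongs to, a hop-1 filler can never be recorded without its connection to a hop-0 filler.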

Assessment

Assess validity of fillers & justification from humans & systems

Filler
• Correct – meets the slot requirements and supported in document
• Wrong – doesn’t meet slot requirements and/or not supported in doc
• Inexact – otherwise correct, but is incomplete, includes extraneous text, or is not the most informative string in the document

Predicate
• Correct, Wrong, Inexact-Short, Inexact-Long

Subject/Object
• Correct, Wrong, Inexact

Ignore
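The label sets above can be written down as a small lookup with a validity check. This is a minimal sketch: `check_assessment` and the dictionary layout are hypothetical, and "Ignore" is left out because the slide does not say which judgment it belongs to.

```python
# Allowed labels per judged component, copied from the slide.
# "Ignore" is omitted: the slide lists it without saying where it applies.
ALLOWED_LABELS = {
    "filler": frozenset({"Correct", "Wrong", "Inexact"}),
    "predicate": frozenset({"Correct", "Wrong", "Inexact-Short", "Inexact-Long"}),
    "subject": frozenset({"Correct", "Wrong", "Inexact"}),
    "object": frozenset({"Correct", "Wrong", "Inexact"}),
}


def check_assessment(filler: str, predicate: str, subject: str, object_: str) -> None:
    """Raise ValueError if any label is not allowed for its component."""
    given = {
        "filler": filler,
        "predicate": predicate,
        "subject": subject,
        "object": object_,
    }
    for part, label in given.items():
        if label not in ALLOWED_LABELS[part]:
            raise ValueError(f"{label!r} is not a valid {part} label")


# Passes silently: every label is allowed for its component.
check_assessment("Correct", "Inexact-Short", "Correct", "Inexact")
```

The check only validates labels; how the judgments combine into a score is not covered by the slide and is not modeled here.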

Justification

Justification is the string(s) of text that show a relation is true

• Predicate: includes all three pieces of information necessary to justify the entity/slot/filler relation
• Subject: proves the entity’s involvement in the relation
• Object: proves the filler’s involvement in the relation
• Each part can be comprised of up to two discontiguous strings

<Harkat-ul-Mujahideen - org:country_of_headquarters - Pakistan>
• Predicate 1: the Islamabad headquarters of Harkat-ul-Mujahideen
• Predicate 2: Islamabad, the capital city of Pakistan
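The three-part justification, with its limit of two strings per part, could be modeled as follows. The `Justification` class and its field names are hypothetical; only the two-string limit and the example strings come from the slide.

```python
from dataclasses import dataclass, field
from typing import List

# Per the slide, each part may hold up to two discontiguous strings.
MAX_STRINGS_PER_PART = 2


@dataclass
class Justification:
    """Text spans supporting one entity/slot/filler relation (a sketch)."""
    predicate: List[str] = field(default_factory=list)
    subject: List[str] = field(default_factory=list)
    object_: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Enforce the two-string limit on every part.
        for name in ("predicate", "subject", "object_"):
            if len(getattr(self, name)) > MAX_STRINGS_PER_PART:
                raise ValueError(f"{name} has more than {MAX_STRINGS_PER_PART} strings")


# The slide's example: two predicate strings jointly justify
# <Harkat-ul-Mujahideen - org:country_of_headquarters - Pakistan>.
example = Justification(
    predicate=[
        "the Islamabad headquarters of Harkat-ul-Mujahideen",
        "Islamabad, the capital city of Pakistan",
    ]
)
```

Neither predicate string establishes the relation alone; the first ties the organization to Islamabad and the second ties Islamabad to Pakistan, which is why a second string is allowed.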

New in 2013:

Ronnie James Dio - per:date_of_death: Sunday [2010-05-16]

2013 Discoveries

New justification scheme used in unexpected, creative ways

• Additional predicate strings used to disambiguate entities

<Vitaly Ginzburg - per:cause_of_death - cardiac arrest>
• Predicate 1: Ginzburg died late Sunday of cardiac arrest.
• Predicate 2: Vitaly Ginzburg, a Nobel Prize-winning Russian physicist and one of the fathers of the Soviet hydrogen bomb

Delivered 2013 Resources

Corpus Title | Type | LDC Catalog | Language | Size
TAC 2013 KBP English Cold Start Evaluation Queries and Annotations V1.1 | Evaluation | LDC2013E87 | English | 326 Queries
TAC 2013 KBP English Cold Start Evaluation Assessment Results | Evaluation | LDC2013E101 | English | 6,755 Assessments