PRIS at Slot Filling in KBP 2012: An Enhanced Adaboost Pattern-Matching System
Yan Li
Beijing University of Posts and Telecommunications
[email protected]


Page 1: PRIS at Slot Filling in KBP 2012:  An Enhanced  Adaboost  Pattern-Matching System

PRIS at Slot Filling in KBP 2012: An Enhanced Adaboost Pattern-Matching System

Yan Li
Beijing University of Posts and Telecommunications

[email protected]

Page 2: PRIS at Slot Filling in KBP 2012:  An Enhanced  Adaboost  Pattern-Matching System

Outline
  • Introduction
  • Preprocessing
  • Entity Expansion
  • Pattern bootstrapping
  • Post-processing
  • Evaluation results
  • Conclusion

Page 3: PRIS at Slot Filling in KBP 2012:  An Enhanced  Adaboost  Pattern-Matching System

Introduction: the framework

Page 4: PRIS at Slot Filling in KBP 2012:  An Enhanced  Adaboost  Pattern-Matching System

Preprocessing
NLP (the Stanford CoreNLP toolkit):
  • POS tagger
  • NER
  • Date and time expression recognition
  • Dependency parser
  • Coreference resolution

Page 5: PRIS at Slot Filling in KBP 2012:  An Enhanced  Adaboost  Pattern-Matching System

Preprocessing (cont'd)
  • Example: Takeshi Watanabe, the first president of the ADB, died in his native Japan.
  • The categorizations of slots

Page 6: PRIS at Slot Filling in KBP 2012:  An Enhanced  Adaboost  Pattern-Matching System

PER
Domain   Slots
PER      alternate_names; spouses; children; parents; siblings; other_family
ORG      member_of; employee_of
LOC      country/state/city_of_birth/death/residence
DATE     date_of_birth/death
NUM      age
ORI      origin
REL      religion
SCHOOL   schools_attended
CAUSE    cause_of_death
TITLE    titles
CHARGE   charges

ORG
Domain   Slots
PER      alternate_names; members; shareholders; founded_by; top_members/employees
ORG      parents; members; member_of; shareholders; subsidiaries
LOC      member_of; country/state/city_of_headquarters
DATE     founded; dissolved
NUM      number_of_employees/members
URL      website
REL      political/religious_affiliation
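The categorization above can be read as a lookup from the type of a candidate value to the slots it might fill. The following is a minimal sketch under that assumption, covering only the PER half of the table; the names `PER_SLOTS_BY_DOMAIN` and `candidate_slots` are hypothetical, and the slot labels follow the shorthand on the slide rather than the official KBP slot names.

```python
# Slot categories for PER queries, transcribed from the slide's shorthand.
# Assumption: each domain label corresponds to the type tag of a candidate value.
PER_SLOTS_BY_DOMAIN = {
    "PER": ["alternate_names", "spouses", "children", "parents",
            "siblings", "other_family"],
    "ORG": ["member_of", "employee_of"],
    # "country/state/city_of_birth/death/residence" expanded into 9 labels.
    "LOC": [place + "_of_" + event
            for place in ("country", "state", "city")
            for event in ("birth", "death", "residence")],
    "DATE": ["date_of_birth", "date_of_death"],
    "NUM": ["age"],
    "ORI": ["origin"],
    "REL": ["religion"],
    "SCHOOL": ["schools_attended"],
    "CAUSE": ["cause_of_death"],
    "TITLE": ["titles"],
    "CHARGE": ["charges"],
}


def candidate_slots(value_type):
    """Return the PER slots that a value of the given type could fill."""
    return PER_SLOTS_BY_DOMAIN.get(value_type, [])
```

With such a lookup, a value tagged NUM would only be considered for `age`, which narrows the set of patterns that need to be tried for each candidate.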

Page 7: PRIS at Slot Filling in KBP 2012:  An Enhanced  Adaboost  Pattern-Matching System
Page 8: PRIS at Slot Filling in KBP 2012:  An Enhanced  Adaboost  Pattern-Matching System

Entity Expansion
  • Coreferent mentions and alternate names of an entity appear throughout the relevant documents.
  • Purpose: to improve recall.
  • Scheme 1 (PER & ORG): coreference resolution
      • Uses the coreference chains produced by Stanford CoreNLP.
      • Example: [figure not preserved]

Page 9: PRIS at Slot Filling in KBP 2012:  An Enhanced  Adaboost  Pattern-Matching System

Entity Expansion (cont'd)
  • Scheme 2 (PER & ORG): identifying alternate names
      • Rule-based information extraction
      • Interpretative entities in parentheses
      • Example: Starr International Co., known as SICO, ……
  • Scheme 3 (ORG)
      • Removing the corporate suffixes in queries
      • Finding the acronyms or full expressions
      • Example: Norwegian University of Science and Technology (NTNU)
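Schemes 2 and 3 can be illustrated with a few regular-expression rules. This is only a sketch of the general idea, not the rules used by the PRIS system: the suffix list, the function names and the exact patterns are all assumptions made for illustration.

```python
import re

# Hypothetical suffix list; the slides do not say which suffixes were removed.
CORPORATE_SUFFIXES = ("Co.", "Corp.", "Inc.", "Ltd.", "LLC")


def strip_corporate_suffix(name):
    """Drop one trailing corporate suffix from an ORG query name."""
    for suffix in CORPORATE_SUFFIXES:
        if name.endswith(" " + suffix):
            return name[: -len(suffix) - 1].rstrip(" ,")
    return name


def alternate_names(entity, sentence):
    """Collect aliases introduced by 'known as' or by a parenthesis after the entity."""
    aliases = []
    escaped = re.escape(entity)
    # Rule 1: "<entity>, known as <Alias>"
    known_as = re.search(
        escaped + r",?\s+(?:also\s+)?known as\s+([A-Z][\w&.-]*)", sentence)
    if known_as:
        aliases.append(known_as.group(1).rstrip(".,"))
    # Rule 2: "<entity> (<Alias>)"
    parenthetical = re.search(escaped + r"\s*\(([^)]+)\)", sentence)
    if parenthetical:
        aliases.append(parenthetical.group(1).strip())
    return aliases
```

Applied to the two slide examples, the first rule would pick up `SICO` and the second `NTNU`; each alias can then be added to the query so that sentences mentioning only the short form are still retrieved.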

Page 10: PRIS at Slot Filling in KBP 2012:  An Enhanced  Adaboost  Pattern-Matching System
Page 11: PRIS at Slot Filling in KBP 2012:  An Enhanced  Adaboost  Pattern-Matching System

Pattern Bootstrapping: Workflow
Ralph Grishman and Bonan Min, “New York University KBP 2010 Slot-Filling System”, 2010.

Page 12: PRIS at Slot Filling in KBP 2012:  An Enhanced  Adaboost  Pattern-Matching System

Pattern Bootstrapping: Seed Pairs
  • Source: the KBP English Monolingual Slot Filling Evaluation Data from the past three years
  • 92 PER entities
  • 106 ORG entities
  • 1,627 entity-value pairs

Page 13: PRIS at Slot Filling in KBP 2012:  An Enhanced  Adaboost  Pattern-Matching System

Pattern Bootstrapping: Pattern Generation
  • Word sequence pattern: the middle context between an entity-value pair
      • Example:
          PER:countries_of_residence  <PER> native <LOC>
  • Dependency path pattern: the shortest dependency path which connects an entity-value pair
      • Examples:
          PER:title             <PER> appos <TITLE>
          PER:member_of         <PER> appos president prep_of <ORG>
          PER:country_of_death  <PER> nsubj-1 died prep_in <LOC>
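Both pattern types can be sketched in a few lines. The code below assumes the dependency parse is already available as (governor, relation, dependent) triples; the triples for the example sentence are written by hand to mirror collapsed Stanford dependencies and are not actual parser output. The function names are hypothetical.

```python
from collections import deque


def word_sequence_pattern(tokens, entity_index, value_index, entity_type, value_type):
    """Use the tokens strictly between entity and value as the middle context."""
    left, right = sorted((entity_index, value_index))
    middle = tokens[left + 1:right]
    if entity_index < value_index:
        first, last = entity_type, value_type
    else:
        first, last = value_type, entity_type
    return " ".join(["<%s>" % first] + middle + ["<%s>" % last])


def shortest_dependency_pattern(edges, source, target, source_type, target_type):
    """Render the shortest dependency path between two tokens as a pattern.

    An edge walked from dependent to governor is marked with a "-1" suffix.
    """
    graph = {}
    for governor, relation, dependent in edges:
        graph.setdefault(governor, []).append((relation, dependent))
        graph.setdefault(dependent, []).append((relation + "-1", governor))

    queue = deque([(source, [])])  # breadth-first search finds a shortest path
    visited = {source}
    while queue:
        node, path = queue.popleft()
        if node == target:
            # path alternates relation, word, ...; drop the final word (the target)
            return " ".join(["<%s>" % source_type] + path[:-1] + ["<%s>" % target_type])
        for relation, neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, path + [relation, neighbor]))
    return None


# Hand-written triples for "Takeshi Watanabe, the first president of the ADB,
# died in his native Japan." (an assumption, not parser output).
EXAMPLE_EDGES = [
    ("died", "nsubj", "Watanabe"),
    ("Watanabe", "appos", "president"),
    ("president", "prep_of", "ADB"),
    ("died", "prep_in", "Japan"),
    ("Japan", "amod", "native"),
    ("Japan", "poss", "his"),
]
```

Calling `shortest_dependency_pattern(EXAMPLE_EDGES, "Watanabe", "Japan", "PER", "LOC")` walks up the `nsubj` edge to "died" and down `prep_in` to "Japan", which corresponds to the PER:country_of_death example on the slide.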

Page 14: PRIS at Slot Filling in KBP 2012:  An Enhanced  Adaboost  Pattern-Matching System

Pattern Bootstrapping: Pattern Evaluation
  • Purpose: to improve precision.
  • Pattern frequency
  • Trigger phrase
  • High-confidence patterns
  • New entity-value pairs
  • Iteration
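The evaluate-and-iterate loop can be sketched as follows. The slides do not give the scoring formula or thresholds, so everything here is an assumption: confidence is taken as the share of a pattern's matches that agree with already known pairs, and the thresholds and the function name `bootstrap` are hypothetical. The AdaBoost-style weighting suggested by the title is not modeled, and the trigger-phrase check is omitted.

```python
def bootstrap(matches, seed_pairs, min_frequency=2, min_confidence=0.5,
              max_iterations=5):
    """Iteratively select high-confidence patterns and harvest new pairs.

    matches: list of (entity, pattern, value) triples observed in the corpus.
    seed_pairs: set of (entity, value) pairs known to be correct.
    """
    known_pairs = set(seed_pairs)
    accepted = set()
    for _ in range(max_iterations):
        # Count how often each pattern fires and how often it hits a known pair.
        stats = {}
        for entity, pattern, value in matches:
            total, hits = stats.get(pattern, (0, 0))
            stats[pattern] = (total + 1, hits + ((entity, value) in known_pairs))
        # Keep patterns that are frequent enough and agree with known pairs.
        confident = {
            pattern for pattern, (total, hits) in stats.items()
            if total >= min_frequency and hits / total >= min_confidence
        }
        # Pairs extracted by confident patterns become seeds for the next round.
        new_pairs = {(e, v) for e, p, v in matches if p in confident} - known_pairs
        if confident <= accepted and not new_pairs:
            break
        accepted |= confident
        known_pairs |= new_pairs
    return accepted, known_pairs
```

A pattern that is rejected in the first round can be accepted later, once pairs harvested by other patterns raise its confidence. This is also how errors can propagate, which is why the frequency and confidence thresholds matter for precision.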

Page 15: PRIS at Slot Filling in KBP 2012:  An Enhanced  Adaboost  Pattern-Matching System
Page 16: PRIS at Slot Filling in KBP 2012:  An Enhanced  Adaboost  Pattern-Matching System

Post-processing
  • Purpose: to improve precision.
  • DATE
      • The SUTime module of CoreNLP
      • TIMEX2 normalization
  • PER: spouses, children and parents
      • Last name complement
      • Example: John Doe’s first wife, Ruth
      • “Ruth Doe” is better than “Ruth”.
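The last name complement can be written as a small rule. This sketch assumes that the family name is the final token of the query name and that the relative shares it; both assumptions fail in many real cases, and the function name is hypothetical.

```python
def complement_last_name(filler, query_name):
    """Append the query entity's last name when the filler is a single given name.

    Assumption: the family name is the last whitespace-separated token.
    """
    filler_tokens = filler.split()
    query_tokens = query_name.split()
    if len(filler_tokens) == 1 and len(query_tokens) >= 2:
        return filler + " " + query_tokens[-1]
    return filler
```

For the slide example, `complement_last_name("Ruth", "John Doe")` yields the fuller filler, while a filler that already has two or more tokens is returned unchanged.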

Page 17: PRIS at Slot Filling in KBP 2012:  An Enhanced  Adaboost  Pattern-Matching System

Post-processing (cont'd)
  • Identifying countries, states/provinces and cities for LOC slots
      • A Wikipedia list containing all countries and states or provinces.
  • Adding modifiers to fillers of per:title
      • Adjectival modifier: financial minister
      • Noun compound modifier: police chief
      • Prepositional modifier: chief of military operations
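One way to attach such modifiers is to walk the dependency edges around the title's head word. The sketch below assumes collapsed Stanford-style relation names (`amod`, `nn`, `prep_of`), handles only the preposition "of", and keeps modifiers in the order the edges are listed; the function name and the hand-written edges are hypothetical.

```python
def expand_title(head, edges):
    """Build a fuller title by attaching modifiers found in the dependency edges.

    edges: list of (governor, relation, dependent) triples from an acyclic parse.
    """
    # Adjectival and noun compound modifiers go in front of the head word.
    before = [expand_title(dep, edges)
              for gov, rel, dep in edges
              if gov == head and rel in ("amod", "nn")]
    # Prepositional modifiers (only "of" here) follow the head word.
    after = ["of " + expand_title(dep, edges)
             for gov, rel, dep in edges
             if gov == head and rel == "prep_of"]
    return " ".join(before + [head] + after)
```

The recursion lets a modifier carry its own modifiers, so a head word "chief" with a `prep_of` edge to "operations", which in turn has an `amod` edge to "military", is rendered as the full phrase rather than the bare head.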

Page 18: PRIS at Slot Filling in KBP 2012:  An Enhanced  Adaboost  Pattern-Matching System

Evaluation Results PRIS

Summary Statistics   LDC         Top-1        Top-2        Median
Precision            0.9278607   0.6757322    0.48955223   0.11392405
Recall               0.7252106   0.41866493   0.21257292   0.0874919
F1                   0.8141142   0.5170068    0.2964302    0.0989736
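The F1 row follows from the precision and recall rows through the harmonic mean, F1 = 2PR / (P + R). As a quick check, the sketch below recomputes F1 for each column of the summary table; the dictionary layout and names are only for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the summary table.
REPORTED = {
    "LDC": (0.9278607, 0.7252106, 0.8141142),
    "Top-1": (0.6757322, 0.41866493, 0.5170068),
    "Top-2": (0.48955223, 0.21257292, 0.2964302),
    "Median": (0.11392405, 0.0874919, 0.0989736),
}

for column, (precision, recall, reported_f1) in REPORTED.items():
    computed = f1_score(precision, recall)
    print(column, round(computed, 4), round(reported_f1, 4))
```

The recomputed values agree with the reported F1 figures to about four decimal places, which confirms that the columns were read in the right order.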

Page 19: PRIS at Slot Filling in KBP 2012:  An Enhanced  Adaboost  Pattern-Matching System

                                    non-NIL
Slot                                correct  redundant  inexact  wrong  missing
Alternate names                     6        0          0        0      23
Date of birth                       16       4          0        1      1
Date of death                       17       1          0        4      2
Age                                 22       0          0        2      2
Country of birth                    1        0          0        0      1
State or province of birth          8        0          2        3      2
City of birth                       13       1          0        5      2
Country of death                    1        0          0        2      0
State or province of death          13       0          2        1      2
City of death                       17       0          0        4      1
Country of residence                10       2          2        7      3
State or province of residence      22       1          4        5      13
City of residence                   35       1          0        14     8
Origin                              16       2          0        17     0
Cause of death                      18       0          0        1      13
Schools attended                    19       7          0        1      14
Titles                              85       13         8        24     4
Member of                           26       2          4        17     10
Employee of                         7        0          2        5      20
Religion                            4        0          0        1      3
Spouses                             16       5          1        3      10
Children                            73       0          3        10     6
Parents                             21       4          0        1      4
Siblings                            20       0          1        8      3
Other family                        2        0          0        0      7
Charges                             5        0          0        4      2

Page 20: PRIS at Slot Filling in KBP 2012:  An Enhanced  Adaboost  Pattern-Matching System

                                    non-NIL
Slot                                correct  redundant  inexact  wrong  missing
Alternate names                     46       4          5        25     5
Political/religious affiliations    7        1          0        6      3
Top members/employees               59       1          2        20     8
Number of employees/members         3        0          0        0      8
Members                             0        0          0        0      4
Member of                           0        0          0        0      7
Subsidiaries                        7        0          0        3      10
Parents                             4        1          0        4      4
Founded by                          5        0          0        3      5
Founded                             5        0          0        1      3
Dissolved                           1        0          0        0      2
Country of headquarters             3        0          0        1      20
State or province of headquarters   1        1          0        7      11
City of headquarters                2        0          0        3      10
Shareholders                        3        0          1        8      0
Website                             7        0          0        1      8

Page 21: PRIS at Slot Filling in KBP 2012:  An Enhanced  Adaboost  Pattern-Matching System

Conclusion
  • For the slot filling task of KBP 2012, we designed an enhanced pattern-matching system consisting of preprocessing, entity expansion, pattern bootstrapping and post-processing.
  • Precision and recall are relatively good for some specific slots.
  • The remaining slots urgently need improvement.

Page 22: PRIS at Slot Filling in KBP 2012:  An Enhanced  Adaboost  Pattern-Matching System

Tips
  • Adequate preparation
  • A harmonious team
  • Active and disciplined environment
  • Be passionate, patient and hardworking
  • ……

Page 23: PRIS at Slot Filling in KBP 2012:  An Enhanced  Adaboost  Pattern-Matching System

Thank you!