18
Entity-Driven Desiderata Computational Linguistics: Jordan Boyd-Graber University of Maryland COREFERENCE Adapted from slides by Vincent Ng Computational Linguistics: Jordan Boyd-Graber | UMD Entity-Driven Desiderata | 1 / 12

Entity-Driven Desideratausers.umiacs.umd.edu/~jbg/teaching/CMSC_723/15b_coref.pdf · Entity-Driven Desiderata Computational Linguistics: Jordan Boyd-Graber University of Maryland

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Entity-Driven Desideratausers.umiacs.umd.edu/~jbg/teaching/CMSC_723/15b_coref.pdf · Entity-Driven Desiderata Computational Linguistics: Jordan Boyd-Graber University of Maryland

Entity-Driven Desiderata

Computational Linguistics: Jordan Boyd-GraberUniversity of MarylandCOREFERENCE

Adapted from slides by Vincent Ng

Computational Linguistics: Jordan Boyd-Graber | UMD Entity-Driven Desiderata | 1 / 12

Page 2: Entity-Driven Desideratausers.umiacs.umd.edu/~jbg/teaching/CMSC_723/15b_coref.pdf · Entity-Driven Desiderata Computational Linguistics: Jordan Boyd-Graber University of Maryland

What is Coref?

Identify the noun phrases (or entity mentions) that refer to the samereal-world entity

Example

Queen Elizabeth set about transforming her husband, King George VI, intoa viable monarch. A renowned speech therapist was summoned to helpthe King overcome his speech impediment . . .

� Inherently a transitive clustering task

� Typical reframing: selecting antecedent for each mention mj

Computational Linguistics: Jordan Boyd-Graber | UMD Entity-Driven Desiderata | 2 / 12

Page 3: Entity-Driven Desideratausers.umiacs.umd.edu/~jbg/teaching/CMSC_723/15b_coref.pdf · Entity-Driven Desiderata Computational Linguistics: Jordan Boyd-Graber University of Maryland

What is Coref?

Identify the noun phrases (or entity mentions) that refer to the samereal-world entity

Example

Queen Elizabeth set about transforming her husband, King George VI, intoa viable monarch. A renowned speech therapist was summoned to helpthe King overcome his speech impediment . . .

� Inherently a transitive clustering task

� Typical reframing: selecting antecedent for each mention mj

Computational Linguistics: Jordan Boyd-Graber | UMD Entity-Driven Desiderata | 2 / 12

Page 4: Entity-Driven Desideratausers.umiacs.umd.edu/~jbg/teaching/CMSC_723/15b_coref.pdf · Entity-Driven Desiderata Computational Linguistics: Jordan Boyd-Graber University of Maryland

What is Coref?

Identify the noun phrases (or entity mentions) that refer to the samereal-world entity

Example

Queen Elizabeth set about transforming her husband, King George VI, intoa viable monarch. A renowned speech therapist was summoned to helpthe King overcome his speech impediment . . .

� Inherently a transitive clustering task

� Typical reframing: selecting antecedent for each mention mj

Computational Linguistics: Jordan Boyd-Graber | UMD Entity-Driven Desiderata | 2 / 12

Page 5: Entity-Driven Desideratausers.umiacs.umd.edu/~jbg/teaching/CMSC_723/15b_coref.pdf · Entity-Driven Desiderata Computational Linguistics: Jordan Boyd-Graber University of Maryland

What is Coref?

Identify the noun phrases (or entity mentions) that refer to the samereal-world entity

Example

Queen Elizabeth set about transforming her husband, King George VI, intoa viable monarch. A renowned speech therapist was summoned to helpthe King overcome his speech impediment . . .

� Inherently a transitive clustering task

� Typical reframing: selecting antecedent for each mention mj

Computational Linguistics: Jordan Boyd-Graber | UMD Entity-Driven Desiderata | 2 / 12

Page 6: Entity-Driven Desideratausers.umiacs.umd.edu/~jbg/teaching/CMSC_723/15b_coref.pdf · Entity-Driven Desiderata Computational Linguistics: Jordan Boyd-Graber University of Maryland

Why it’s hard

� Many sources of information play a role� lexical / word: head noun matches President Clinton = Clinton =? Hillary

Clinton� grammatical: number/gender agreement, . . .� syntactic: syntactic parallelism, binding constraints: John helped himself

to... vs. John helped him to. . .� discourse: discourse focus, salience, recency, . . .� semantic: semantic class agreement, . . .� world knowledge

� Not all knowledge sources can be computed easily

Computational Linguistics: Jordan Boyd-Graber | UMD Entity-Driven Desiderata | 3 / 12

Page 7: Entity-Driven Desideratausers.umiacs.umd.edu/~jbg/teaching/CMSC_723/15b_coref.pdf · Entity-Driven Desiderata Computational Linguistics: Jordan Boyd-Graber University of Maryland

Application: Question Answering

Where was Mozart born?

Computational Linguistics: Jordan Boyd-Graber | UMD Entity-Driven Desiderata | 4 / 12

Page 8: Entity-Driven Desideratausers.umiacs.umd.edu/~jbg/teaching/CMSC_723/15b_coref.pdf · Entity-Driven Desiderata Computational Linguistics: Jordan Boyd-Graber University of Maryland

Application: Question Answering

Where was Mozart born?

Computational Linguistics: Jordan Boyd-Graber | UMD Entity-Driven Desiderata | 4 / 12

Page 9: Entity-Driven Desideratausers.umiacs.umd.edu/~jbg/teaching/CMSC_723/15b_coref.pdf · Entity-Driven Desiderata Computational Linguistics: Jordan Boyd-Graber University of Maryland

Hobb’s Algorithm

Intuition:

� Start with target pronoun

� Climb parse tree to S root� For each NP or S� Do breadth-first, left-to-right search of children� Restricted to left of target� For each NP, check agreement with target

� Repeat on earlier sentences until matching NP found

Computational Linguistics: Jordan Boyd-Graber | UMD Entity-Driven Desiderata | 5 / 12

Page 10: Entity-Driven Desideratausers.umiacs.umd.edu/~jbg/teaching/CMSC_723/15b_coref.pdf · Entity-Driven Desiderata Computational Linguistics: Jordan Boyd-Graber University of Maryland

Hobb’s Algorithm Example

Computational Linguistics: Jordan Boyd-Graber | UMD Entity-Driven Desiderata | 6 / 12

Page 11: Entity-Driven Desideratausers.umiacs.umd.edu/~jbg/teaching/CMSC_723/15b_coref.pdf · Entity-Driven Desiderata Computational Linguistics: Jordan Boyd-Graber University of Maryland

Machine Learning Approach

� Preprocessing

� Mention Detection

� Coreference

Computational Linguistics: Jordan Boyd-Graber | UMD Entity-Driven Desiderata | 7 / 12

Page 12: Entity-Driven Desideratausers.umiacs.umd.edu/~jbg/teaching/CMSC_723/15b_coref.pdf · Entity-Driven Desiderata Computational Linguistics: Jordan Boyd-Graber University of Maryland

Machine Learning Approach

� Preprocessing

� Mention Detection

� Coreference

Not-so-trivial: extract the mentions (pronouns, names, nominals, nestedNPs): Some researchers reported results on gold mentions, not systemmentions

Computational Linguistics: Jordan Boyd-Graber | UMD Entity-Driven Desiderata | 7 / 12

Page 13: Entity-Driven Desideratausers.umiacs.umd.edu/~jbg/teaching/CMSC_723/15b_coref.pdf · Entity-Driven Desiderata Computational Linguistics: Jordan Boyd-Graber University of Maryland

Machine Learning: Pairwise

Features:

� Exact string: are mi and mj same after determiners removed

� Grammatical: gender and number agreement

� Semantic: class agreement (country/company)

� Positional: distance between the two mentions

Computational Linguistics: Jordan Boyd-Graber | UMD Entity-Driven Desiderata | 8 / 12

Page 14: Entity-Driven Desideratausers.umiacs.umd.edu/~jbg/teaching/CMSC_723/15b_coref.pdf · Entity-Driven Desiderata Computational Linguistics: Jordan Boyd-Graber University of Maryland

Problems

� Conflicts

� Constraints

Computational Linguistics: Jordan Boyd-Graber | UMD Entity-Driven Desiderata | 9 / 12

Page 15: Entity-Driven Desideratausers.umiacs.umd.edu/~jbg/teaching/CMSC_723/15b_coref.pdf · Entity-Driven Desiderata Computational Linguistics: Jordan Boyd-Graber University of Maryland

More Advanced Coreference

� Anaphoric classifier

� Rank mentions

� Cluster assignment

� Pipeline approach

Computational Linguistics: Jordan Boyd-Graber | UMD Entity-Driven Desiderata | 10 / 12

Page 16: Entity-Driven Desideratausers.umiacs.umd.edu/~jbg/teaching/CMSC_723/15b_coref.pdf · Entity-Driven Desiderata Computational Linguistics: Jordan Boyd-Graber University of Maryland

More Advanced Coreference

� Anaphoric classifier

� Rank mentions

� Cluster assignment

� Pipeline approach (Hand-crafted?)

Computational Linguistics: Jordan Boyd-Graber | UMD Entity-Driven Desiderata | 10 / 12

Page 17: Entity-Driven Desideratausers.umiacs.umd.edu/~jbg/teaching/CMSC_723/15b_coref.pdf · Entity-Driven Desiderata Computational Linguistics: Jordan Boyd-Graber University of Maryland

More Advanced Coreference

� Anaphoric classifier

� Rank mentions

� Cluster assignment

� Pipeline approach

Harder to evaluate!

Computational Linguistics: Jordan Boyd-Graber | UMD Entity-Driven Desiderata | 10 / 12

Page 18: Entity-Driven Desideratausers.umiacs.umd.edu/~jbg/teaching/CMSC_723/15b_coref.pdf · Entity-Driven Desiderata Computational Linguistics: Jordan Boyd-Graber University of Maryland

Possible Projects

� Improve QA (find mentions of candidate answers in Wikipedia)

� Use world knowledge to improve coref

� Better features / representations

Computational Linguistics: Jordan Boyd-Graber | UMD Entity-Driven Desiderata | 11 / 12