Error Analysis of Two Types of Grammar for the purpose of Automatic Rule Refinement
Ariadna Font Llitjós, Katharina Probst, Jaime Carbonell
Language Technologies Institute, Carnegie Mellon University


Page 1: Error Analysis of Two Types of Grammar for the purpose of Automatic Rule Refinement

Error Analysis of Two Types of Grammar for the purpose of Automatic Rule Refinement

Ariadna Font Llitjós, Katharina Probst, Jaime Carbonell

Language Technologies Institute

Carnegie Mellon University

AMTA 2004

Page 2: Error Analysis of Two Types of Grammar for the purpose of Automatic Rule Refinement

October 1 AMTA 2004 2

Outline

• Automatic Rule Refinement
• AVENUE and resource-poor scenarios
• Experiment
    – Data (eng2spa)
    – Two types of grammar
    – Evaluation results
    – Error analysis
    – RR required for each type
• Conclusions and Future Work

Page 3: Error Analysis of Two Types of Grammar for the purpose of Automatic Rule Refinement


Motivation for Automatic RR

General
- MT output still requires post-editing
- Current systems do not recycle post-editing efforts back into the system, beyond adding them as new training data

Within AVENUE
- Resource-poor scenarios: lack of a manual grammar, or a very small initial grammar
- Need to validate the elicitation corpus and the automatically learned translation rules


Page 5: Error Analysis of Two Types of Grammar for the purpose of Automatic Rule Refinement


AVENUE and resource-poor scenarios

• No e-data available (often spoken tradition), so SMT or EBMT is not an option

• Lack of computational linguists to write a grammar

So how can we even start to think about MT?
– That’s what AVENUE is all about:
  Elicitation Corpus + Automatic Rule Learning + Rule Refinement

What do we usually have available in resource-poor scenarios? Bilingual users.

Page 6: Error Analysis of Two Types of Grammar for the purpose of Automatic Rule Refinement


AVENUE overview

[Figure: AVENUE architecture diagram. Stage labels: Elicitation, Rule Learning, Run-Time System, Rule Refinement. Component labels: Elicitation Tool, Elicitation Corpus, Word-Aligned Parallel Corpus, Learning Module, Handcrafted rules, Transfer Rules, Lexical Resources, Morphology, Morphological analyzer, Run Time Transfer System, Lattice, Translation Correction Tool, Rule Refinement Module.]

Page 7: Error Analysis of Two Types of Grammar for the purpose of Automatic Rule Refinement


Automatic and Interactive RLR

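[Figure: two-step diagram.
1st step: a rule R is learned automatically from sentence pairs (SLSentence1–TLSentence1, SLSentence2–TLSentence2) and is used to translate SLS3 into TLS3.
2nd step: TLS3 is corrected to TLS3’; the RR module refines R into R’, so that SLS3 is translated as TLS3’.]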

Page 8: Error Analysis of Two Types of Grammar for the purpose of Automatic Rule Refinement


Interactive Elicitation of MT errors

Assumptions:

• non-expert bilingual users can reliably detect and minimally correct MT errors, given:
    – SL sentence (I saw you)
    – up to 5 TL sentences (Yo vi tú, ...)
    – word-to-word alignments (I-yo, saw-vi, you-tú)
    – (context)
• using an online GUI: the Translation Correction Tool (TCTool)

Goal: Simplify MT correction task maximally

User studies: 90% error detection accuracy and 73% error classification [LREC 2004]

Page 9: Error Analysis of Two Types of Grammar for the purpose of Automatic Rule Refinement


TCTool v0.1

Actions:
• Add a word
• Delete a word
• Modify a word
• Change word order
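The four correction actions can be pictured as edits on a list of target-language words. The sketch below is only an illustration: the function names and the list-of-tokens representation are assumptions and are not taken from the TCTool itself. It applies two actions to the candidate "Yo vi tú" from the earlier slide to reach the standard Spanish "Yo te vi".

```python
from typing import List


def add_word(tokens: List[str], position: int, word: str) -> List[str]:
    """Return a copy of tokens with word inserted at position."""
    return tokens[:position] + [word] + tokens[position:]


def delete_word(tokens: List[str], position: int) -> List[str]:
    """Return a copy of tokens without the word at position."""
    return tokens[:position] + tokens[position + 1:]


def modify_word(tokens: List[str], position: int, new_word: str) -> List[str]:
    """Return a copy of tokens with the word at position replaced."""
    return tokens[:position] + [new_word] + tokens[position + 1:]


def change_word_order(tokens: List[str], old_position: int, new_position: int) -> List[str]:
    """Move the word at old_position so that it ends up at index new_position."""
    moved = tokens[old_position]
    rest = delete_word(tokens, old_position)
    return add_word(rest, new_position, moved)


# Hypothetical correction session for the SL sentence "I saw you".
tl = ["Yo", "vi", "tú"]
step1 = modify_word(tl, 2, "te")            # wrong form (case): tú -> te
corrected = change_word_order(step1, 2, 1)  # clitic goes before the verb
print(" ".join(corrected))  # Yo te vi
```

Each function returns a new list, so the original MT output stays available for comparison with the corrected sentence.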

Page 10: Error Analysis of Two Types of Grammar for the purpose of Automatic Rule Refinement


RR Framework

• Find best RR operations given a:
    • grammar (G),
    • lexicon (L),
    • (set of) source language sentence(s) (SL),
    • (set of) target language sentence(s) (TL),
    • its parse tree (P), and
    • minimal correction of TL (TL’)
  such that TQ2 > TQ1

• Which can also be expressed as:

    max TQ(TL | TL’, P, SL, RR(G, L))
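One way to read the objective above is as a search over candidate refinements, keeping the one whose output scores highest against the user's correction. The sketch below makes several assumptions that are not in the slides: TQ1 and TQ2 are taken to be translation quality before and after refinement, TQ is replaced by a toy position-overlap score, and the grammar, `toy_translate`, and the candidate names are all hypothetical stand-ins.

```python
from typing import Callable, Dict, List, Optional, Tuple

Grammar = Dict[str, object]              # hypothetical grammar representation
Refinement = Callable[[Grammar], Grammar]


def overlap_tq(hypothesis: List[str], corrected: List[str]) -> float:
    """Toy TQ stand-in: fraction of positions that match the corrected sentence."""
    matches = sum(h == c for h, c in zip(hypothesis, corrected))
    return matches / max(len(hypothesis), len(corrected), 1)


def toy_translate(grammar: Grammar, sl: List[str]) -> List[str]:
    """Hypothetical 3-word translator for 'I saw you'; not the AVENUE transfer engine."""
    lexicon = {"I": "yo", "saw": "vi", "you": grammar["you"]}
    words = [lexicon[w] for w in sl]
    if grammar["clitic_before_verb"]:
        words = [words[0], words[2], words[1]]
    return words


def best_refinement(grammar: Grammar, candidates: Dict[str, Refinement],
                    sl: List[str], corrected: List[str]) -> Tuple[Optional[str], float]:
    """Return the candidate whose TQ2 most improves on TQ1 (None if none does)."""
    tq1 = overlap_tq(toy_translate(grammar, sl), corrected)
    best_name, best_tq = None, tq1
    for name, refine in candidates.items():
        tq2 = overlap_tq(toy_translate(refine(grammar), sl), corrected)
        if tq2 > best_tq:
            best_name, best_tq = name, tq2
    return best_name, best_tq


grammar = {"you": "tú", "clitic_before_verb": False}
candidates = {
    "fix_case": lambda g: {**g, "you": "te"},
    "fix_order": lambda g: {**g, "clitic_before_verb": True},
    "fix_both": lambda g: {**g, "you": "te", "clitic_before_verb": True},
}
best = best_refinement(grammar, candidates, ["I", "saw", "you"], ["yo", "te", "vi"])
print(best)  # ('fix_both', 1.0)
```

The parse tree P is left out of this toy version; in the framework on the slide it would help decide which rules are candidates for refinement in the first place.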

Page 11: Error Analysis of Two Types of Grammar for the purpose of Automatic Rule Refinement


Types of RR operations

• Grammar:
    – R0 → R0 + R1 [= R0’ + constr]    Cov[R0] → Cov[R0, R1]    (bifurcate)
    – R0 → R1 [= R0 + constr]    Cov[R0] → Cov[R1]    (refine)
    – R0 → R1 [= R0 + constr = -] + R2 [= R0’ + constr =c +]    Cov[R0] → Cov[R1, R2]    (bifurcate)

• Lexicon:
    – Lex0 → Lex0 + Lex1 [= Lex0 + constr]
    – Lex0 → Lex1 [= Lex0 + constr]
    – Lex0 → Lex1 [Lex0 + TLword]
    – ∅ → Lex1 (adding lexical item)
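The difference between refining and bifurcating a rule can be sketched with a small rule record. Everything in this sketch is an assumption made for illustration: the `Rule` fields, the constraint strings, and the noun-adjective example are hypothetical and do not reproduce the AVENUE rule formalism. Refining replaces R0 with a more constrained R1; bifurcating keeps R0 and adds a modified, constrained copy to cover an exception, so the grammar grows from {R0} to {R0, R1}.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Rule:
    """Hypothetical transfer rule: target-side constituent order plus constraints."""
    name: str
    tl_order: Tuple[str, ...]
    constraints: FrozenSet[str] = frozenset()


def refine(r0: Rule, constraint: str) -> List[Rule]:
    """R0 -> R1 [= R0 + constr]: R0 is replaced by a more constrained rule."""
    return [Rule("R1", r0.tl_order, r0.constraints | {constraint})]


def bifurcate(r0: Rule, new_order: Tuple[str, ...], constraint: str) -> List[Rule]:
    """R0 -> R0 + R1 [= R0' + constr]: keep R0, add a modified copy for the exception."""
    r1 = Rule("R1", tuple(new_order), r0.constraints | {constraint})
    return [r0, r1]


# Illustrative NP rule: Spanish adjectives usually follow the noun.
r0 = Rule("R0", ("N", "ADJ"))

# Refinement: add an agreement constraint to the existing rule.
after_refine = refine(r0, "N.gender = ADJ.gender")

# Bifurcation: add an exception for adjectives that precede the noun.
after_bifurcate = bifurcate(r0, ("ADJ", "N"), "ADJ.prenominal = +")

print([r.name for r in after_refine])     # ['R1']
print([r.name for r in after_bifurcate])  # ['R0', 'R1']
```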

Page 12: Error Analysis of Two Types of Grammar for the purpose of Automatic Rule Refinement


Data: English - Spanish

Training
• First 200 sentences from the AVENUE Elicitation Corpus
• Lexicon: extracted semi-automatically from the first 400 sentences (442 entries)

Test
• 32 sentences manually selected from the next 200 sentences in the EC to showcase a variety of MT errors

Page 13: Error Analysis of Two Types of Grammar for the purpose of Automatic Rule Refinement


Manual grammar

• 12 rules (2 S, 7 NP, 3 VP)

• Produces 1.6 different translations on average

Page 14: Error Analysis of Two Types of Grammar for the purpose of Automatic Rule Refinement


Learned Grammar + feature constraints

• 316 rules (194 S, 43 NP, 78 VP, 1 PP)

• Emulated decoder by reordering 3 rules

• Produces 18.6 different translations on average

Page 15: Error Analysis of Two Types of Grammar for the purpose of Automatic Rule Refinement


Comparing Grammar Output: Results

• Manually: [figure not preserved]

• Automatic MT Evaluation:

                      NIST    BLEU    METEOR
    Manual grammar    4.3     0.16    0.6
    Learned grammar   3.7     0.14    0.55

Page 16: Error Analysis of Two Types of Grammar for the purpose of Automatic Rule Refinement


Error Analysis

• Most of the errors produced by the manual grammar can be classified into:
    – lack of subj-pred agreement
    – wrong word order of object pronouns (clitic)
    – wrong preposition
    – wrong form (case)
    – OOV words

• On top of these, the learned grammar output exhibited errors of the following types:
    – lack of agreement constraints
    – missing preposition
    – over-generalization

Page 17: Error Analysis of Two Types of Grammar for the purpose of Automatic Rule Refinement


Examples

• Same (both good)

• Manual Grammar better

• Learned Grammar better

• Different (both bad)

Page 18: Error Analysis of Two Types of Grammar for the purpose of Automatic Rule Refinement


Types of RR required

For the Manual Grammar

• Bifurcate a rule to code an exception:
    – R0 → R0 + R1 [= R0’ + constr]    Cov[R0] → Cov[R0, R1]
    – R0 → R1 [= R0 + constr = -] + R2 [= R0’ + constr =c +]    Cov[R0] → Cov[R1, R2]

For the Learned Grammar

• Adjust feature constraints, such as agreement:
    – R0 → R1 [= R0 +|- constr]    Cov[R0] → Cov[R1]

Page 19: Error Analysis of Two Types of Grammar for the purpose of Automatic Rule Refinement


Conclusions

• TCTool + RR can improve both hand-crafted and automatically learned grammars.

• In the current experiment, MT errors differ almost 50% of the time, depending on the type of grammar.

• Manual G will need to be refined to encode exceptions, whereas Learned G will need to be refined to achieve the right level of generalization.

• We expect the RR to give the most leverage when combined with the Learned Grammar.

Page 20: Error Analysis of Two Types of Grammar for the purpose of Automatic Rule Refinement


Future Work

• Experiment where user corrections are used both as new training examples for RL and to refine the existing grammar with the RR module.

• Investigate using reference translations to refine MT grammars automatically. This is much harder, since reference translations are not minimal post-editions.

Page 21: Error Analysis of Two Types of Grammar for the purpose of Automatic Rule Refinement


Questions???

Thank you!

Page 22: Error Analysis of Two Types of Grammar for the purpose of Automatic Rule Refinement


RR Framework

• types of operations: bifurcate, make more specific/general, add blocking constraints, etc.

• formalizing error information (clue word)

• finding triggering features