Machine Translation Day 20

EVALUATING MT

MT Evaluation

Source: ズキズキ 痛み ます 。

16 human translations:

• I have a throbbing pain.
• I am experiencing a throbbing pain.
• I am suffering from a throbbing pain.
• I am feeling a throbbing pain.
• It is a throbbing pain.
• It's throbbing and it really hurts.
• It's painful and it's throbbing.
• It's throbbing with pain.
• It's in throbbing pain.
• It hurts so much it's throbbing.
• I've got a throbbing pain.
• I can feel a throbbing pain.
• I am suffering from a throbbing pain.
• I am experiencing a throbbing pain.
• I have a painful throbbing.
• I feel a painful throbbing.

Data from International Workshop on Spoken Language Translation

MT Evaluation

• No “right answer”!
• What can we test instead?
  – Human adequacy / fluency ratings
  – Human efficacy in an application (e.g. question answering from translated foreign documents vs. native documents)
  – Very accurate, but slow & expensive
• Agreement with reference translations
  – BLEU (BiLingual Evaluation Understudy: IBM)
  – Fast system development

BLEU (Papineni et al., ACL 2002)

• MT output:
  1: It is a guide to action which ensures that the military always obeys the commands of the party.
  2: It is to insure the troops forever hearing the activity guidebook that party direct.

• Human (reference) translations:
  1: It is a guide to action that ensures that the military will forever heed Party commands.
  2: It is the guiding principle which guarantees the military forces always being under the command of the Party.
  3: It is the practical guide for the army always to heed the directions of the party.

BLEU: observations

1: It is a guide to action which ensures that the military always obeys the commands of the party.

2: It is to insure the troops forever hearing the activity guidebook that party direct.

• Observations
  – Word overlap is indicative
  – n-gram (word sequence) overlap is even more distinct
  – Drawing from multiple reference translations helps

BLEU metric

• Compute n-gram precisions:
  $P_n = c(\text{matched n-grams}) / c(\text{n-grams in candidate})$
• Compute a brevity penalty (prevent candidates from deleting difficult words):
  $BP = \exp(\min(1 - r/c,\ 0))$, where $r$ = reference length and $c$ = candidate length
• Combine using geometric mean:
  $\mathrm{BLEU} = BP \cdot \left( \prod_{i=1}^{n} P_i \right)^{1/n}$
• Produces score on a 0-1 scale – often expressed as a “percentage” (e.g., * 100)
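The BLEU computation can be sketched in a few lines of Python. This is a minimal sentence-level sketch, not the reference implementation: the original metric is computed over a whole test corpus, and the names `tokenize`, `ngrams`, `modified_precision` and `bleu` are illustrative. It assumes that "matched" n-grams are clipped (each candidate n-gram is credited at most as often as it appears in any single reference, as in Papineni et al.) and that r is the length of the reference closest in length to the candidate.

```python
import math
from collections import Counter

def tokenize(sentence):
    """Very rough tokenizer (an assumption): lowercase, drop periods, split on spaces."""
    return sentence.lower().replace(".", "").split()

def ngrams(tokens, n):
    """Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def modified_precision(candidate, references, n):
    """P_n with clipping: matched n-grams / n-grams in candidate."""
    cand = ngrams(candidate, n)
    if not cand:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    matched = sum(min(count, max_ref[gram]) for gram, count in cand.items())
    return matched / sum(cand.values())

def bleu(candidate, references, max_n=4):
    """BP times the geometric mean of P_1..P_max_n, on a 0-1 scale."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # the geometric mean is zero if any precision is zero
    c = len(candidate)
    # r: length of the reference closest to the candidate length (ties: shorter)
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    bp = math.exp(min(1.0 - r / c, 0.0))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

On the two MT outputs and three references from the slides, this gives unigram precisions of 17/18 and 8/14, and the second output scores zero overall because it shares no trigram with any reference.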

BLEU results circa 2002

[Figures: BLEU distinguishes humans from machines (from Papineni et al., ACL 2002); BLEU correlates well with human judgments (from G. Doddington, NIST)]

However, nowadays we’re starting to see problems:
– Some systems score better than human translations
– In competitions, some “gaming of BLEU”
– Rule-based systems are at a disadvantage after tuning

MT Evaluation: Human

• Absolute evaluation
  – Given a reference translation, human evaluators are asked to rate translation quality on a scale of 1-4:
    4 = Ideal: grammatically correct, all information included
    3 = Acceptable: not perfect, but definitely comprehensible, AND with accurate transfer of all important information
    2 = Possibly acceptable: may be interpretable given context/time, some information transferred accurately
    1 = Unacceptable: absolutely not comprehensible and/or little or no information transferred accurately
• Relative evaluation
  – Human judges are presented with a reference translation and two machine translations in random order, and must pick the better of the two
  – Criteria for decision are left up to individual judge

Absolute quality: Spanish-English

[Figure: number of sentences (y-axis, 0 to 120) at each quality score (x-axis, 1 to 4 in steps of 0.5) for Babelfish and MSR-MT]

Average quality scores: Babelfish=2.344 MSR-MT=2.727

Extrinsic evaluation: Microsoft product support site

• Microsoft support knowledge base
  – Thousands of customer support articles available at http://support.microsoft.com
  – However, most are only available in English
  – Translating all articles by hand is too expensive
  – Instead we present unedited MT articles
  – Available in Spanish, French, German, Japanese, etc.
• Some of the publicly available data-driven translations (2002-2003)

PSS survey results (Spanish)

• Overall satisfaction with the article (scale: 1 to 9)
  – 86.0% scored between 5 and 9; US English = 74.2%
• Technical accuracy of the article (1 to 9)
  – 75.3% scored between 5 and 9
• Task success
  – “Did the information in the (machine translated) knowledge base article help answer your question?” Yes:
    • Machine translated Spanish = 49.7%
    • Human translated Spanish = 51.2%
    • US English = 53.6%

WORD ALIGNMENT

A very simple MT system

• Get a translation dictionary
• Assign a uniform distribution over all translations of each source word
• Tokenize input sentence, replace each word with its English translation:
  weil er gestern gegangen ist
  because he yesterday gone is
• Not terrible, but not very fluent
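A minimal sketch of this word-for-word system in Python follows. The dictionary is a hypothetical toy that only covers the German example, and `translate_word_for_word` is an illustrative name; sampling with `rng.choice` stands in for the uniform distribution over translations.

```python
import random

# Hypothetical toy dictionary covering only the example sentence.
DICTIONARY = {
    "weil": ["because"],
    "er": ["he", "him", "his"],
    "gestern": ["yesterday"],
    "gegangen": ["gone", "left"],
    "ist": ["is", "had", "are", "has"],
}

def translate_word_for_word(sentence, dictionary, rng):
    """Replace each source word by a translation drawn uniformly at random.

    Unknown words are copied through unchanged (an assumption)."""
    words = sentence.lower().split()
    return " ".join(rng.choice(dictionary.get(word, [word])) for word in words)

if __name__ == "__main__":
    print(translate_word_for_word("weil er gestern gegangen ist",
                                  DICTIONARY, random.Random(0)))
```

The output keeps the German word order and picks each word without looking at its neighbours, which is why it tends not to be fluent.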

Simple Statistical Machine Translation

• Given foreign f, find best English translation e*:
  $e^* = \arg\max_e P(e \mid f)$
• Use Bayes’ rule to get “noisy channel” model:
  $P(e \mid f) = P(f \mid e) \cdot P(e) / P(f)$
  $\arg\max_e P(e \mid f) = \arg\max_e P(f \mid e) \cdot P(e)$
• P(f | e) is the channel or translation model
• P(e) is the language model
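The step from Bayes’ rule to the final argmax relies on P(f) not depending on e, so dividing by it cannot change which e wins. Written out:

```latex
\begin{align*}
e^* &= \arg\max_e P(e \mid f) \\
    &= \arg\max_e \frac{P(f \mid e)\, P(e)}{P(f)} \\
    &= \arg\max_e P(f \mid e)\, P(e)
       && \text{since } P(f) \text{ is constant in } e \\
    &= \arg\max_e \bigl[ \log P(f \mid e) + \log P(e) \bigr]
       && \text{since } \log \text{ is monotonic}
\end{align*}
```

The last line is the form decoders usually work with: scores are summed log-probabilities, which matches the negative scores that appear in search lattices.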

Toy System A

• Channel model reversed, otherwise identical
  – Now gives a probability of source given target
  – Uniform distribution over all source translations of a given target word
• Word-based bigram model as language model
  – Improve translations in context
  – Improves fluency overall
• Looks like an HMM tagger:
  – Find Viterbi path through a lattice or trellis

Toy System A: search

[Figure: search lattice for “weil er gestern gegangen ist”. Translation options per source word: weil: because; er: he, him, his; gestern: yesterday; gegangen: gone, left; ist: is, had, are, has. Nodes pair a word with a score: <s> 0, because -3.2, he -5.6, him -5.4, his -5.9, yesterday -8.3, gone -9.9, left -10.4; two further nodes are labeled eat -10.3 and eat -9.8]

• Only need to keep the best hypothesis ending in some word – bigram LM can’t see beyond that (Viterbi!)
• Each partial hypothesis keeps track of the last word generated (for LM score) and the total score so far
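The Viterbi search over this lattice can be sketched as follows. The translation options mirror the lattice, but the bigram log-probabilities are hypothetical numbers chosen for illustration (they are not the scores shown in the figure), and all function names are assumptions. The channel model is uniform over the source words that can translate to a given target word.

```python
import math

# Translation options per source word, as in the lattice.
OPTIONS = {
    "weil": ["because"],
    "er": ["he", "him", "his"],
    "gestern": ["yesterday"],
    "gegangen": ["gone", "left"],
    "ist": ["is", "had", "are", "has"],
}

# Hypothetical bigram log-probabilities; unseen bigrams get a flat penalty.
BIGRAMS = {
    ("<s>", "because"): -1.0,
    ("because", "he"): -1.0,
    ("he", "yesterday"): -2.0,
    ("yesterday", "left"): -2.0,
    ("left", "had"): -2.0,
}
UNSEEN = -5.0

def bigram_logprob(prev, word):
    return BIGRAMS.get((prev, word), UNSEEN)

def channel_logprob(src, tgt, options):
    """log P(src | tgt): uniform over the source words that can translate to tgt."""
    sources = [s for s, targets in options.items() if tgt in targets]
    return -math.log(len(sources))

def viterbi_translate(source_words, options):
    """Monotone word-by-word decoding; returns (score, target words)."""
    # best maps the last target word to (score, output so far). A bigram LM
    # cannot see further back, so one hypothesis per last word is enough.
    best = {"<s>": (0.0, [])}
    for src in source_words:
        new_best = {}
        for tgt in options[src]:
            step = channel_logprob(src, tgt, options)
            new_best[tgt] = max(
                ((score + bigram_logprob(prev, tgt) + step, output + [tgt])
                 for prev, (score, output) in best.items()),
                key=lambda hyp: hyp[0],
            )
        best = new_best
    return max(best.values(), key=lambda hyp: hyp[0])
```

With these made-up bigram scores, `viterbi_translate("weil er gestern gegangen ist".split(), OPTIONS)` picks “because he yesterday left had”: better word choices in context, but still in German word order.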

Learning the translation model

• Start from seminal work by IBM in the late 1980s and early 1990s
• They developed models for identifying word correspondences (word alignments) in parallel data

Learning the translation model

• Say we had some word aligned parallel data
• How would we estimate a translation model?

[Figure: three word-aligned sentence pairs: the house / la maison; the flower / la fleur; the blue house / la maison bleu]

Learning the translation model

• Say I had a model of P(french | english)
• How can I find alignments?

[Figure: three sentence pairs (the house / la maison; the flower / la fleur; the blue house / la maison bleu) next to two translation tables]

blue
Word  Prob
bleu  0.8
…     …

the
Word  Prob
la    0.3
le    0.3
les   0.2

Parameter estimation

• Given lists of parallel sentences (e, f)
• If we had the hidden alignments a, then we could estimate multinomial parameters based on counts:
  $c(e, f)$ := number of times e was aligned to f
  $c(e)$ := number of occurrences of e
  $t(f \mid e) := c(e, f) / c(e)$
• On the other hand, if we knew the parameters $t(\cdot \mid \cdot)$, we could find the most likely alignments
• Bit of a chicken and an egg problem…
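The first half of the chicken-and-egg problem is easy once alignments are given: counting and normalizing. A minimal sketch, assuming each sentence pair is supplied as a list of (English word, French word) links and that every English word occurrence is aligned exactly once, so c(e) is the number of links involving e. The function name and the example alignments are hypothetical.

```python
from collections import Counter, defaultdict

def estimate_t(aligned_corpus):
    """Relative-frequency estimate t(f | e) = c(e, f) / c(e).

    aligned_corpus: iterable of sentence pairs, each a list of
    (english_word, french_word) alignment links.
    Returns a nested dict with t[e][f]."""
    pair_count = Counter()   # c(e, f)
    word_count = Counter()   # c(e)
    for links in aligned_corpus:
        for e, f in links:
            pair_count[(e, f)] += 1
            word_count[e] += 1
    t = defaultdict(dict)
    for (e, f), n in pair_count.items():
        t[e][f] = n / word_count[e]
    return t

if __name__ == "__main__":
    # Hypothetical hand-aligned data for illustration.
    corpus = [
        [("the", "la"), ("house", "maison")],
        [("the", "la"), ("flower", "fleur")],
        [("the", "la"), ("blue", "bleu"), ("house", "maison")],
        [("the", "le"), ("book", "livre")],
    ]
    print(dict(estimate_t(corpus)))
```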

Expectation-Maximization

• Enter the Expectation-Maximization algorithm
  – Method for optimizing parameters / finding hidden state in unsupervised problems
• A procedural description for now
  – Pick an initial set of parameters $t_0(f \mid e)$, set k = 0
  – Until convergence…
    • Find expected values of the hidden states $a_{k+1}$ for each pair assuming parameters $t_k$ are correct (Expectation)
    • Find the most likely parameters $t_{k+1}$ assuming that hidden states $a_{k+1}$ are correct (Maximization)
    • Increment k
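The EM loop can be sketched for the word-translation model with uniform alignment probabilities (IBM Model 1). This is a minimal sketch under stated assumptions: a [null] word is added to every English sentence, initialization is uniform, and a fixed number of iterations replaces a convergence test. `model1_em` and the `(f, e)` key layout are illustrative choices.

```python
from collections import defaultdict

NULL = "[null]"

def model1_em(corpus, iterations):
    """EM for translation parameters t(f | e) with uniform alignment priors.

    corpus: list of (english_tokens, french_tokens) pairs.
    Returns a dict-like t with t[(f, e)] = t(f | e)."""
    corpus = [([NULL] + list(es), list(fs)) for es, fs in corpus]
    french_vocab = {f for _, fs in corpus for f in fs}
    # Uniform initialization t_0(f | e).
    t = defaultdict(lambda: 1.0 / len(french_vocab))
    for _ in range(iterations):
        count = defaultdict(float)   # expected c(e, f)
        total = defaultdict(float)   # expected c(e)
        # Expectation: distribute each French word over the English words
        # of its sentence in proportion to the current t(f | e).
        for es, fs in corpus:
            for f in fs:
                z = sum(t[(f, e)] for e in es)
                for e in es:
                    posterior = t[(f, e)] / z
                    count[(f, e)] += posterior
                    total[e] += posterior
        # Maximization: re-estimate t from the expected counts.
        t = defaultdict(float,
                        {(f, e): c / total[e] for (f, e), c in count.items()})
    return t

if __name__ == "__main__":
    toy = [
        ("the house".split(), "la maison".split()),
        ("the flower".split(), "la fleur".split()),
        ("the blue house".split(), "la maison bleu".split()),
    ]
    t = model1_em(toy, 10)
    for pair in [("la", "the"), ("maison", "house"), ("fleur", "flower")]:
        print(pair, round(t[pair], 3))
```

On the three toy sentence pairs, one iteration already moves t(la | the) from uniform to 0.44, and further iterations concentrate probability on pairs such as fleur/flower.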

Model 1

[Figure: the three sentence pairs with a [null] word added to each English side: the house [null] / la maison; the flower [null] / la fleur; the blue house [null] / la maison bleu]

Model 1, EM iteration 0

Link weights (each French word’s weights sum to about 1 over the English words and [null]):

the house / la maison
          la    maison
the       0.33  0.33
house     0.33  0.33
[null]    0.33  0.33

the flower / la fleur
          la    fleur
the       0.33  0.33
flower    0.33  0.33
[null]    0.33  0.33

the blue house / la maison bleu
          la    maison  bleu
the       0.25  0.25    0.25
blue      0.25  0.25    0.25
house     0.25  0.25    0.25
[null]    0.25  0.25    0.25

Model 1, EM iteration 1

Link weights (each French word’s weights sum to about 1 over the English words and [null]):

the house / la maison
          la    maison
the       0.34  0.28
house     0.31  0.42
[null]    0.34  0.28

the flower / la fleur
          la    fleur
the       0.32  0.20
flower    0.36  0.60
[null]    0.32  0.20

the blue house / la maison bleu
          la    maison  bleu
the       0.27  0.21    0.16
blue      0.21  0.25    0.45
house     0.25  0.32    0.24
[null]    0.27  0.21    0.16

Model 1, EM iteration 2

Link weights (each French word’s weights sum to about 1 over the English words and [null]):

the house / la maison
          la    maison
the       0.37  0.27
house     0.26  0.46
[null]    0.37  0.27

the flower / la fleur
          la    fleur
the       0.37  0.13
flower    0.26  0.74
[null]    0.37  0.13

the blue house / la maison bleu
          la    maison  bleu
the       0.31  0.21    0.11
blue      0.14  0.21    0.60
house     0.23  0.36    0.18
[null]    0.31  0.21    0.11

Model 1, EM iteration 6

Link weights (each French word’s weights sum to about 1 over the English words and [null]):

the house / la maison
          la    maison
the       0.44  0.18
house     0.11  0.64
[null]    0.44  0.18

the flower / la fleur
          la    fleur
the       0.48  0.02
flower    0.05  0.96
[null]    0.48  0.02

the blue house / la maison bleu
          la    maison  bleu
the       0.44  0.17    0.02
blue      0.02  0.08    0.91
house     0.11  0.58    0.05
[null]    0.44  0.17    0.02

IBM Word-based translation (Brown et al., 1993)

• Model P(f | e): French translations given English

[Figure: word alignment between “I do not speak French” (plus [null]) and “je ne parle pas francais”]

Model 1

• Lots of simplifying assumptions:
  – All lengths are equally likely:
    $P(m \mid e) \sim \text{uniform} = \epsilon$
  – All word alignments are equally likely:
    $P(a_j \mid a_1^{j-1}, f_1^{j-1}, m, e) \sim \text{uniform} = 1 / (l + 1)$
  – French word depends on English word it’s aligned to:
    $P(f_j \mid a_1^{j}, f_1^{j-1}, m, e) \sim t(f_j \mid e_{a_j})$ (multinomial over English words)
• Resulting model:
  $P(f, a \mid e) = \frac{\epsilon}{(l + 1)^m} \prod_{j=1}^{m} t(f_j \mid e_{a_j})$
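The resulting Model 1 probability is simple enough to evaluate directly. A minimal sketch, assuming the translation table is a dict keyed by (French word, English word), that alignment position 0 stands for [null], and that ε defaults to 1; the function name and the example probabilities are hypothetical.

```python
def model1_joint_prob(french, english, alignment, t, epsilon=1.0):
    """P(f, a | e) = epsilon / (l + 1)^m * prod_j t(f_j | e_{a_j}).

    french:    list of m French words f_1..f_m
    english:   list of l English words e_1..e_l
    alignment: list of m integers a_j in 0..l, where 0 means [null]
    t:         dict mapping (french_word, english_word) to t(f | e)"""
    l, m = len(english), len(french)
    padded = ["[null]"] + list(english)       # position 0 is the null word
    prob = epsilon / (l + 1) ** m             # length and alignment terms
    for f_word, a_j in zip(french, alignment):
        prob *= t.get((f_word, padded[a_j]), 0.0)
    return prob

if __name__ == "__main__":
    # Hypothetical parameter values for illustration.
    t = {("la", "the"): 0.5, ("maison", "house"): 0.8}
    print(model1_joint_prob(["la", "maison"], ["the", "house"], [1, 2], t))
```

Because the alignment term is the same constant for every alignment, only the product of t values distinguishes one alignment from another under this model.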

A generative story (IBM Models 1-2, HMM)

$P(f, a \mid e) = P(m \mid e) \cdot \prod_{j=1}^{m} \left( P(a_j \mid a_1^{j-1}, f_1^{j-1}, m, e) \cdot P(f_j \mid a_1^{j}, f_1^{j-1}, m, e) \right)$

Exact – chain rule!

• $P(m \mid e)$: pick the length of the French sentence
• $\prod_{j=1}^{m}$: for each position in the French sentence…
• $P(a_j \mid a_1^{j-1}, f_1^{j-1}, m, e)$: pick the English word aligned to the French word in that position, then…
• $P(f_j \mid a_1^{j}, f_1^{j-1}, m, e)$: pick the French word in that position

Notation:
• $E$, $F$: English, French vocabularies
• $e = e_1^l = (e_1, \ldots, e_l)$: English sentence, $e_i \in E$
• $f = f_1^m = (f_1, \ldots, f_m)$: French sentence, $f_j \in F$
• $a = a_1^m = (a_1, \ldots, a_m)$: word alignment, $a_j \in [0..l]$

Progression of alignment models

• Models of increasing complexity
  – Only Model 1 is convex
• Models 3, 4, 5 each capture new aspects of the sentence
  – Capture “fertility”
  – Different movement models
  – Each model can initialize its successor – helps avoid local minima
• Freely available tools for this task
  – GIZA++
  – Berkeley aligner

Model  Translation  Distortion  Fertility
1      Yes          ---         ---
2      Yes          Abs         ---
HMM    Yes          Rel         ---
3      Yes          Abs         Yes
4      Yes          Rel         Yes
5      Yes          Rel         Yes

Toy System A’

• Our prior toy system used a uniform distribution for translations
• Now we can plug in Model 1 parameters
• Language model helps pick translations that are fluent
• Translation model helps pick translations that are adequate
• Looks just like an HMM!

Toy System A’

[Figure: search lattice for “weil er gestern gegangen ist”, with the same translation options and node scores as in “Toy System A: search”]

• Each translation is like a part-of-speech tag
• Becomes bigram LM + Model 1!

Some questions:

• What about standard translation dictionaries? Should we include them, and how?

• What translation phenomena are we covering and what are we missing?

• Does it work?

Toy System B

• System A: finds better translations in context, but can’t reorder
  “er gestern gegangen ist” → “he yesterday left had” (should be “he had left yesterday”)
• System B: allow all possible permutations
  – Each hypothesis now remembers:
    • Last target word generated
    • Set of source words already translated
  – 5! = 120 permutations, 10! = 3.6M, 20! = 2.43e18
  – No way we can afford to keep all translations!
• Group into stacks based on count of words covered
  – Histogram pruning: limited number of hypotheses on any stack
  – Threshold pruning: only keep hypotheses within d of best on stack
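A stack decoder along these lines can be sketched as follows. This is a simplified illustration, not a full system: hypotheses are scored by a bigram language model only (a uniform translation model would add the same constant per word choice only if every source word had equally many options, so leaving it out is an assumption made for brevity), the bigram scores are hypothetical, and `stack_decode` and its parameters are illustrative names.

```python
import math

def num_orderings(n):
    """Number of ways to order n source words."""
    return math.factorial(n)

def stack_decode(source, options, bigram_logprob, max_size=10, threshold=5.0):
    """Decode with free reordering; returns (score, target words).

    stacks[k] holds hypotheses covering k source words, keyed by what the
    search must remember: (last target word, set of covered positions)."""
    n = len(source)
    stacks = [dict() for _ in range(n + 1)]
    stacks[0][("<s>", frozenset())] = (0.0, [])
    for k in range(n):
        hyps = sorted(stacks[k].items(), key=lambda kv: kv[1][0], reverse=True)
        if not hyps:
            continue
        best_score = hyps[0][1][0]
        # Threshold pruning, then histogram pruning.
        hyps = [h for h in hyps if h[1][0] >= best_score - threshold][:max_size]
        for (last, covered), (score, output) in hyps:
            for i in range(n):
                if i in covered:
                    continue
                for tgt in options[source[i]]:
                    new_score = score + bigram_logprob(last, tgt)
                    key = (tgt, covered | {i})
                    # Recombination: keep only the best hypothesis per key.
                    if key not in stacks[k + 1] or new_score > stacks[k + 1][key][0]:
                        stacks[k + 1][key] = (new_score, output + [tgt])
    return max(stacks[n].values(), key=lambda hyp: hyp[0])

if __name__ == "__main__":
    options = {
        "er": ["he", "him", "his"],
        "gestern": ["yesterday"],
        "gegangen": ["gone", "left"],
        "ist": ["is", "had", "are", "has"],
    }
    # Hypothetical bigram log-probabilities; unseen bigrams score -5.0.
    bigrams = {("<s>", "he"): -1.0, ("he", "had"): -1.0,
               ("had", "left"): -1.0, ("left", "yesterday"): -1.0}
    lm = lambda prev, word: bigrams.get((prev, word), -5.0)
    print(num_orderings(5))
    print(stack_decode("er gestern gegangen ist".split(), options, lm))
```

With these made-up language model scores the decoder recovers “he had left yesterday”, which the monotone search of System A cannot produce.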

Toy System B: search

[Figure: stack decoding for “weil er gestern gegangen ist”. Translation options per source word: weil: because; er: he, him, his; gestern: yesterday; gegangen: gone, left; ist: is, had, are, has. Each hypothesis shows its last target word, a score, and a coverage vector over the five source words. Stack 0: <s> 0 (00000). Stack 1: because -3.2 (10000), he -3.5 (01000), yesterday -1.9 (00100). Stack 2: he -5.8, because -5.2, yesterday -5.6, each covering two source words]

Like an expanded Viterbi search, but each hypothesis also needs to remember which source words have been translated already!

Beyond Toy System B

• Many problems with this system:
  – System allows all possible reorderings, but some are much more likely than others
  – Contextual information is only captured by the target language model, not in the source
• Multiple paths from here:
  – Better word alignment
  – Phrase-based translation: learn bigger translation units – this is crucial!
  – Better reordering models: syntax can help here

Word-based MT results

SRC: 对外经济贸易合作部今天提供的数据表明，今年至十一月中国实际利用外资四百六十九点五九亿美元，其中包括外商直接投资四百点零七亿美元。
REF: According to the data provided today by the Ministry of Foreign Trade and Economic Cooperation, as of November this year, China has actually utilized 46.959 billion US dollars of foreign capital, including 40.007 billion US dollars of direct investment from foreign businessmen.
WB: The Ministry of Foreign Trade and Economic Cooperation, including foreign direct investment 40.007 billion US dollars today provide data include that year to November china actually using 46.959 billion US dollars and

SRC: Le politique de la haine
REF: Politics of hate
WB: The policy of the hatred

SRC: Nous avons signé le protocole.
REF: We did sign the memorandum of agreement.
WB: We have signed the protocol.

SRC: Où était le plan solide?
REF: But where was the solid plan?
WB: Where was the economic base?

Word alignment and phrase extraction (Koehn, Och, Marcu 2003)

Word-aligned sentence pair: the blue house / a casa azul

Extracted phrase pairs:
• the → a
• blue → azul
• house → casa
• blue house → casa azul
• the blue house → a casa azul
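The phrase pairs listed for “the blue house / a casa azul” are the ones consistent with the word alignment: no word inside the pair is aligned to a word outside it. A minimal sketch of that extraction rule follows. It is a simplification of the usual procedure (it does not extend phrases over unaligned target words), and `extract_phrases` and the index-based alignment format are assumptions.

```python
def extract_phrases(src, tgt, alignment, max_len=7):
    """Extract phrase pairs consistent with a word alignment.

    src, tgt:  lists of source and target words
    alignment: set of (source_index, target_index) links
    Returns a set of (source_phrase, target_phrase) strings."""
    pairs = set()
    for s1 in range(len(src)):
        for s2 in range(s1, min(s1 + max_len, len(src))):
            # Target positions linked to the source span [s1, s2].
            linked = [j for (i, j) in alignment if s1 <= i <= s2]
            if not linked:
                continue
            t1, t2 = min(linked), max(linked)
            if t2 - t1 + 1 > max_len:
                continue
            # Consistency: nothing in the target span may be linked to a
            # source word outside the source span.
            if any(t1 <= j <= t2 and not (s1 <= i <= s2)
                   for (i, j) in alignment):
                continue
            pairs.add((" ".join(src[s1:s2 + 1]), " ".join(tgt[t1:t2 + 1])))
    return pairs

if __name__ == "__main__":
    src = "the blue house".split()
    tgt = "a casa azul".split()
    alignment = {(0, 0), (1, 2), (2, 1)}  # the-a, blue-azul, house-casa
    for pair in sorted(extract_phrases(src, tgt, alignment)):
        print(pair)
```

Note that “the blue” is not extracted: its translation would have to span “a casa azul”, but “casa” is aligned to “house”, which lies outside the source span.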

Phrase table

• Extract phrases from all sentence pairs
• Estimate P(src | tgt) with c(src, tgt) / c(tgt)

Portuguese           English                 Prob
ver                  see                     0.533
ver                  view                    0.129
ver                  to see                  0.044
ver                  viewing                 0.009
ver                  seeing                  0.008
ver                  watch                   0.007
…
ver o mundo através  view the world through  1.000
ver e adquirir       browse and purchase     1.000
ver ou editar        view or edit            0.875
ver filmes           watch movies            0.667

Word-based vs. phrase-based (BLEU score vs. training data size)

[Figure: BLEU score (y-axis, 20 to 30) against training data size (x-axis, 40k to 320k) for two systems: phrases from word alignment, and word-based. From Koehn, Och, and Marcu 2003]

These systems, with sufficient data, produce better translations than rule-based systems… mostly.

Syntax in translation

• Phrases capture contextual translation and local reordering surprisingly well
• However this information is brittle:
  – “author of the book → 本書的作者” tells us nothing about how to translate “author of the pamphlet” or “author of the play”
  – The Chinese phrase “NOUN1 的 NOUN2” becomes “NOUN2 of NOUN1” in English
• No information about global reordering
  – In Chinese, prepositional phrases often come before verbs; in English, they come after

Syntax-based source reordering

• Language is hierarchical – our models should capture this
• Phrasal cohesion (Fox, 2002): most often, each source constituent translates to a contiguous target constituent
• Source parse trees can inform reordering
  – First parse the source sentence
  – Then use information about the source to guide reordering

Wang, Collins, Koehn (2007): Parse the Chinese, reorder like English

Some pertinent rules

Syntax-directed translation

• Begin by parsing source sentence
  – Syntactic analysis can guide reordering and inform translation
• One approach: Treelet translation (Quirk, Menezes, and Cherry, 2005)
  – Use dependency trees: minimal amount of syntactic information (just head node)

Treelet and template extraction

• Start from word aligned sentence pairs: the blue house / a casa azul
• Parse source: the/DT blue/JJ house/NN
• Project tree: [Figure: the source dependency tree projected onto “a casa azul”]
• Extract treelet pairs:
  – Treelet: connected subgraph of the dependency tree
  – the → a
  – blue → azul
  – house → casa
  – blue house → casa azul
  – the blue house → a casa azul
  – the house → a casa
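Treelet extraction can be sketched by enumerating the connected subgraphs of the source dependency tree and reading off the aligned target words. The sketch below is a brute-force illustration that assumes “house” heads both “the” and “blue” (consistent with the treelets listed) and a one-to-one word alignment; `treelets`, `treelet_pairs` and the data layout are hypothetical. It relies on the fact that a set of tree nodes is connected exactly when one node in the set has its head outside the set.

```python
from itertools import combinations

def treelets(heads):
    """All connected subgraphs of a dependency tree.

    heads[i] is the index of word i's head, or None for the root.
    Returns sorted tuples of word indices. Brute force over all subsets,
    so only suitable for short sentences."""
    n = len(heads)
    result = []
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            chosen = set(subset)
            # Connected in a tree <=> exactly one node's head lies outside.
            tops = [i for i in subset if heads[i] not in chosen]
            if len(tops) == 1:
                result.append(subset)
    return result

def treelet_pairs(src, tgt, heads, alignment):
    """Pair each source treelet with its aligned target words, in target order.

    alignment maps each source index to one target index (one-to-one)."""
    pairs = []
    for subset in treelets(heads):
        src_side = " ".join(src[i] for i in subset)
        tgt_side = " ".join(tgt[j] for j in sorted(alignment[i] for i in subset))
        pairs.append((src_side, tgt_side))
    return pairs

if __name__ == "__main__":
    src = "the blue house".split()
    tgt = "a casa azul".split()
    heads = [2, 2, None]             # "house" heads "the" and "blue"
    alignment = {0: 0, 1: 2, 2: 1}   # the-a, blue-azul, house-casa
    for pair in treelet_pairs(src, tgt, heads, alignment):
        print(pair)
```

Unlike contiguous phrase extraction, this yields “the house / a casa”, which skips over “blue”; “the blue” is still excluded because the two words are connected only through “house”.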

Treelet and template extraction

• Extract templates: [Figure: the dependency tree over blue/JJ, house/NN, the/DT aligned to “a casa azul”, next to a template in which the words are replaced by wildcards: */JJ, */NN, */DT on the source side and * * * on the target side]

Europarl English-Spanish

[Figure: BLEU (y-axis, 20% to 35%) for the Phrasal and Template systems on three test sets: devtest, in-domain, out-of-domain]

Impact of preserving ambiguity

• Start with treelet systems
  – Technical English-German, English-Japanese
  – Newswire Chinese-English
• Translate each of k-best parses independently
• Keep the translation with the best score
• Evaluate using BLEU

parses  EG    EJ    CE
1       33.6  36.0  28.2
2       33.8  36.1  28.5
4       34.1  36.3  28.9
8       34.3  36.6  29.2
16      34.5  36.8  29.7
32      34.8  37.1  30.0

Target language syntax

• If we want a grammatical translation, shouldn’t we use a grammar?
• Use a parser in the target language instead
  – Translation becomes cross-lingual parsing: find the best English parse tree for a Chinese sentence
  – Great for translating into English or other languages with lots of linguistic resources
• Later approaches capture larger synchronous rules at a time (Marcu et al., 2006) and are pretty successful, though somewhat slow in comparison
  – Ongoing research to speed things up

What about morphology?

• Most of these approaches treat words as indivisible units
• Some recent work addresses this problem:
  – Phrasal translations of morpheme sequences (requires morphological segmentation)

Remaining limitations

• Most systems consider only a single sentence at a time
  – What about discourse phenomena?
  – Coreference?
• How do we handle unknown words?
• Where do we get the data?

Thanks!