Machine Translation Course 8, Diana Trandabăț, Academic year: 2014-2015




Page 1

Machine Translation
Course 8

Diana Trandabăț
Academic year: 2014-2015

Page 2

Reordering

• Different languages put words in different order
• Computational Problem
– during translation we have to compose an output sentence
– when composing a sentence of n words, there are n! possible orderings
– exponential spaces are hard to search
• Linguistic Problem
– how do different languages define the word order?
– how can we model reordering transformations during translation?
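The n! claim above can be made concrete. A short sketch in plain Python (no MT toolkit assumed) shows how quickly the ordering space explodes:

```python
import math

# Number of possible orderings of an output sentence of n words.
def num_orderings(n: int) -> int:
    return math.factorial(n)

# Even modest sentence lengths give an intractably large search space.
for n in (5, 10, 20):
    print(f"{n} words -> {num_orderings(n)} orderings")
```

A 20-word sentence already has roughly 2.4 × 10^18 orderings, which is why decoders never enumerate them exhaustively.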

Page 3

Reordering

• Reordering remains a hard problem for SMT and motivates new models

• Reordering is the most important feature for predicting translation performance (on Indo-European languages)

Page 4

Bag of words

the nobody security Swiss account the of will bank trust

nobody will trust the security of the Swiss bank account

• From the New York Times
• Can you put the words in the correct order?

– Does a good language model suffice?

Page 5

The Language Model/Translation Model Debate

• Language model argument
– Humans can make sense of scrambled sentences, machines should too
– Build really big language models to determine what makes sense in the target language
• Translation model argument
– structure in source sentence defines relationship between words
– we can learn how to map this structure into a target-language structure

Page 6

Example

German: Ich halte eine Vorlesung über die Reihenfolge von Wörtern

English: I teach a class about the order of words

Monotone reordering (no reordering)

Page 7

Example

French: Barack Obama vit dans la Maison Blanche

English: Barack Obama lives in the White House

Adjective/noun swapping

Page 8

Example

German: Ich muss eine Vorlesung über die Reihenfolge von Wörtern halten

English: I must teach a class about the order of words

German main verb movement

Page 9

Example

English: we have taken these considerations very much into account when we designed the budget

French:

lorsque nous avons préparé le budget , nous avons pris cela en considération

Clause reordering, zero alignment of the comma

Page 10

“Free” word order

• The following German sentences mean the same:
– Der Mann gibt der Frau das Buch.
– Das Buch gibt der Mann der Frau.
– Der Frau gibt der Mann das Buch.
– Der Mann gibt das Buch der Frau.
– Das Buch gibt der Frau der Mann.
– Der Frau gibt das Buch der Mann.

Page 11

Amount of reordering (in a couple of European languages)

From (Birch et al., EMNLP 2008)

Page 12

Amount of reordering
Italian it
Swedish sv
English en
Greek el
Portuguese pt
Danish da
Spanish es
French fr
*Dutch nl
*Finnish fi
*German de

* hard languages for reordering

Page 13

Reordering models

• Distortion
• Distance-based models
• Lexicalised models
• Pre-ordering
• Clause reordering

Page 14

Reordering models

• Distortion
• Distance-based models
• Lexicalised models
• Pre-ordering
• Clause reordering

Page 15

Distortion

• Corresponding words do not appear in the same order in source and target languages

• Assign an offset to move words from their original position to their final position

• In IBM Model 3, the distortion factor determines how likely it is that an English word in position i aligns to a foreign word in position j, given the lengths of both sentences: d(j | i, l_e, l_f)

• Note: positions are absolute positions

Page 16

Distortion (Reordering)

• Making the offset dependent on all the words would be too costly, so in IBM Model 3 the offset depends only on:
– the position of the word within the sentence
– the length of the sentences in both languages
• P(offset = o | Position = p, SourceLength = m, ForeignLength = n)
– ex: chien brun --> brown dog
• brown --> offset -1, dog --> offset +1
– offset of brown = P(offset | 1, 2, 2)
– ex: P(+1 | 1, 2, 2) = 0.3, P(0 | 1, 2, 2) = 0.6, P(-1 | 1, 2, 2) = 0.1
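These position-conditioned offset probabilities can be estimated by simple counting over word-aligned data. A minimal sketch with hypothetical toy alignments (the function name and data are illustrative, not from any toolkit):

```python
from collections import Counter, defaultdict

# Each training item: (source_length, target_length, [(src_pos, tgt_pos), ...])
def estimate_offsets(alignments):
    counts = defaultdict(Counter)
    for src_len, tgt_len, links in alignments:
        for s, t in links:
            key = (s, src_len, tgt_len)   # condition on position and both lengths
            counts[key][t - s] += 1       # offset = target position - source position
    # Normalise raw counts into probability distributions.
    return {k: {o: c / sum(cnt.values()) for o, c in cnt.items()}
            for k, cnt in counts.items()}

# "chien brun" -> "brown dog": chien (1) -> dog (2), brun (2) -> brown (1),
# plus one monotone sentence pair to make the distribution non-trivial.
data = [(2, 2, [(1, 2), (2, 1)])] * 3 + [(2, 2, [(1, 1), (2, 2)])]
probs = estimate_offsets(data)
print(probs[(1, 2, 2)])   # offset distribution for source position 1 in a 2/2 pair
```

Real systems estimate these counts from millions of aligned sentence pairs; the normalisation step is the same.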

Page 17

An Example (en|fr)

Given the English:
The brown dog did not go home

Fertility Model: 1 1 1 1 2 1 3

Transformed English:
The brown dog did not not go home home home

Translation Model:
Le brun chien est n' pas allé à la maison

Offset Model: 0 +1 -1 +1 -1 0 0 0 0 0

A possible translation:
Le chien brun n' est pas allé à la maison

• Then use the Language Model P(e) to evaluate the fluency of all possible translations

e* = argmax_e P(f | e) × P(e)
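The offset step of this pipeline can be sketched directly: applying the slide's offsets to the word-by-word translation recovers the final French order. (`apply_offsets` is a hypothetical helper for illustration, not part of any IBM-model implementation.)

```python
# Move each word from position i to position i + offset[i] (0-based),
# then read the words back in their new order.
def apply_offsets(words, offsets):
    placed = sorted(range(len(words)), key=lambda i: i + offsets[i])
    return [words[i] for i in placed]

words = ["Le", "brun", "chien", "est", "n'", "pas", "allé", "à", "la", "maison"]
offsets = [0, +1, -1, +1, -1, 0, 0, 0, 0, 0]
print(" ".join(apply_offsets(words, offsets)))
# -> Le chien brun n' est pas allé à la maison
```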

Page 18

Distortion Probabilities
• The translation model includes probabilities for "distortion"
– Ex.: P(5 | 2, 4, 6)
– the probability that a source word w_s in position 2 (2nd word in the source language) will produce a target word w_t in position 5 when the source sentence has 4 words and the target sentence has 6 words.

Page 19

Distortion Probabilities
• For phrase-based MT: measure the distortion of phrase i as the distance between the start of the f phrase generated by ē_i, a_i and the end of the phrase generated by the previous phrase ē_{i-1} with its alignment a_{i-1}

• Where:
– the f phrase is the phrase in the source language
– ē_i is the phrase in the target language corresponding to f
– a_i is the alignment of phrase i
– ē_{i-1} is the previous phrase in the target language

• Typically assume the probability of a distortion decreases exponentially with the distance of the movement:

d(i) = c · α^(|start_i − end_{i-1} − 1|)

Set 0 < α < 1 based on fit to phrase-aligned training data. Then set c to normalize the distortion of phrase i, d(i), so it sums to 1.
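The exponential distortion penalty above is a one-line function; α = 0.5 here is purely an illustrative value, since in practice α would be fit to phrase-aligned training data:

```python
# Unnormalised exponential distortion penalty for phrase-based MT:
# d(i) = alpha ** |start_i - end_(i-1) - 1|
def distortion(start_i: int, end_prev: int, alpha: float = 0.5) -> float:
    return alpha ** abs(start_i - end_prev - 1)

print(distortion(4, 3))   # monotone: phrase starts right after the previous -> 1.0
print(distortion(7, 3))   # jump over 3 words -> 0.5 ** 3 = 0.125
```

Monotone continuation pays no penalty; every skipped word halves the score, which is exactly why long-distance movement is discouraged.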

Page 20

Reordering models

• Distortion
• Distance-based models
• Lexicalised models
• Pre-ordering
• Clause reordering

Page 21

Distance Model

• Simple reordering model
– Cost of a reordering depends only on the distance of the reordering
– Distribution can be estimated from alignments
– Or just a Gaussian with mean 1
– Or log p(a_j | a_{j-1}, I) = a_j − a_{j-1}, i.e. reordering cost proportional to distance

Page 22

Distance-based Reordering
• Distance-based reordering models have a simplistic view of language
– Consider only how many words a phrase moves (e.g. a maximum of 6 words)
• In reality, movement depends on the words
– German: modal verbs require moving the main verb to the end of the sentence
– French: adjectives require moving the noun before the adjective
• Use the aligned training corpus to estimate a model
• These models do not generalize well, since their parameters are tied to absolute word positions within sentences, which tend to differ for the same words across sentences.

Page 23

Relative distortion model

– defined in terms of the difference between the position of the current phrase and the position of the previous phrase in the source sentence.
– Or, assigning a probability distribution over possible relative jumps conditioned on source words.
– Conditioning on the source words allows for a much more fine-grained model. For instance, words that tend to act as modifiers (e.g., adjectives) would have a different distribution than verbs or nouns => this leads to lexicalized reordering models.
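The word-conditioned jump distribution this points to can also be estimated by counting, exactly as for the lexicalized models discussed next. A toy sketch with hand-made observations (the data and function name are hypothetical, for illustration only):

```python
from collections import Counter, defaultdict

# Observations: (source_word, relative_jump) pairs harvested from alignments.
def jump_distribution(observations):
    counts = defaultdict(Counter)
    for word, jump in observations:
        counts[word][jump] += 1
    return {w: {j: c / sum(cnt.values()) for j, c in cnt.items()}
            for w, cnt in counts.items()}

# Toy intuition: an adjective like "brun" tends to jump backwards when
# translating French -> English, while a noun like "chien" tends to stay put.
obs = [("brun", -1), ("brun", -1), ("brun", 0), ("chien", 0), ("chien", 0)]
dist = jump_distribution(obs)
print(dist["brun"], dist["chien"])
```

The per-word distributions differ, which is precisely the fine-grained signal a distance-only model throws away.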

Page 24

Reordering models

• Distortion
• Distance-based models
• Lexicalised models
• Pre-ordering
• Clause reordering

Page 25

Lexicalized Reordering Models
• Also called Block Distortion model
– Used in the Moses SMT system
– Conditions on actual words
– Different possibilities:
• Condition on source words vs. target words
• Condition on words at the start of the jump (O - outbound) vs. words at the landing point (I - inbound)

p(a_j − a_{j-1} | f_j, f_{j-1}, e_{a_j}, e_{a_{j-1}}, I)

Where:
– a_j − a_{j-1} is the offset
– f_j is the foreign word at position j
– f_{j-1} is the previous foreign word
– e_{a_j} is the source word aligned with the foreign word at position j
– e_{a_{j-1}} is the source word aligned with the previous foreign word
– I or O stands for inbound (end of jump) or outbound (start of jump)

Page 26

Lexicalized model

• Three types of reordering:
– monotone (m)
– swap (s)
– discontinuous (d)
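The three orientations can be read off the source spans of consecutive phrase pairs. A minimal sketch, assuming inclusive 0-based (start, end) spans as an illustrative convention:

```python
# Classify the orientation of the current phrase relative to the previous one,
# following the monotone / swap / discontinuous scheme.
def orientation(prev_src_span, cur_src_span):
    prev_start, prev_end = prev_src_span
    cur_start, cur_end = cur_src_span
    if cur_start == prev_end + 1:
        return "monotone"       # current phrase directly follows the previous one
    if cur_end == prev_start - 1:
        return "swap"           # current phrase directly precedes the previous one
    return "discontinuous"      # anything else: a gap or a long-distance jump

print(orientation((0, 1), (2, 3)))  # monotone
print(orientation((2, 3), (0, 1)))  # swap
print(orientation((0, 1), (5, 6)))  # discontinuous
```

Counting these three events per phrase pair over the aligned training corpus is how the lexicalized reordering tables are built.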

Page 27

Block Distortion Model
• Given the current block, look at the links at the corners
• Top: how did I come from the previous phrase?
• Bottom: how do I continue to the next phrase?

[Diagram: alignment grid (F × E) showing the current block, its four corners (left-top, right-top, left-bottom, right-bottom), and the previous and next blocks.]

Page 28

Block Distortion Model
• Top-left: prev-to-current = monotone

[Diagram: previous block adjacent at the top-left corner of the current block.]

Page 29

Block Distortion Model
• Top-right: prev-to-current = swap

[Diagram: previous block adjacent at the top-right corner of the current block.]

Page 30

Block Distortion Model
• Neither top-left nor top-right: prev-to-current = disjoint

[Diagram: previous block not adjacent to either top corner of the current block.]

Page 31

Block Distortion Model
• Bottom-right: current-to-next = monotone

[Diagram: next block adjacent at the bottom-right corner of the current block.]

Page 32

Block Distortion Model
• Bottom-left: current-to-next = swap

[Diagram: next block adjacent at the bottom-left corner of the current block.]

Page 33

Block Distortion Model
• Neither bottom-left nor bottom-right: current-to-next = disjoint

[Diagram: next block not adjacent to either bottom corner of the current block.]

Page 34

Reordering models

• Distortion
• Distance-based models
• Lexicalised models
• Pre-ordering
• Clause reordering

Page 35

Extracting reordering rules

• Many types of reorderings are systematic
– move the verb group together
– subject - verb - object
– move negation in front of the verb
• Write rules by hand or try to derive them from annotated data

Page 36

Word reordering

• Trying all possible word reorderings is an NP-complete problem, which makes searching for the optimal solution among all possible permutations computationally intractable.

• Therefore, the number of permutations considered is typically limited for efficiency reasons by placing reordering restrictions:
– allow only the reordering of at most n words at any given time
– use contiguity restrictions on the reordering
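The first restriction above, letting each word move at most a fixed distance, can be sketched as a displacement window; brute-force counting (purely for illustration, feasible only for tiny n) shows how much it prunes the n! space:

```python
from itertools import permutations

# Count permutations of n positions in which no word moves more than
# `window` places from its original position.
def restricted_orderings(n: int, window: int) -> int:
    return sum(
        1 for p in permutations(range(n))
        if all(abs(p[i] - i) <= window for i in range(n))
    )

# With a window of 1, only 13 of the 720 orderings of 6 words survive.
print(restricted_orderings(6, 1), "vs", restricted_orderings(6, 6))
```

Real decoders enforce such limits incrementally during search rather than by enumeration, but the pruning effect is the same.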

Page 37

Pre-reordering

• A different approach to allow for limited reordering is to reorder the input sentence so that the source and target sentences have similar word order, and then proceed to monotonically decode the reordered source sentence.

• Pre-reordering places source words in the order they appear in the target language:
– the input and output sentences have the same word order.

• It is very efficient, since the optimal decoding can be found in polynomial time.

Page 38

Rewrite patterns
• A different method is to automatically acquire rewrite patterns to be applied to any given input sentence, so that the rewritten source and target sentences have similar word order.
– rewrite patterns can be automatically extracted by parsing the source and target sides of the training parallel corpus, i.e. using syntactic rules.
– Or, rewriting the input sentence can be done using heuristics.
• Reordering is better handled during the search algorithm and as part of the optimization function, rather than during the decoding phase.

Page 39

Word reorderings restrictions

• Allowing a restricted set of word reorderings:
– word reordering is done only for words belonging to the verb group
– word reordering based on reordering operations extracted from syntactic parse trees

Page 40

Reordering models

• Distortion
• Distance-based models
• Lexicalised models
• Pre-ordering
• Clause reordering

Page 41

Clause Level Reordering

• Why clause structure?
– languages differ vastly in their clause structure (English: SVO, Arabic: VSO, German: fairly free order; many details differ: position of adverbs, subclauses, etc.)
– large-scale restructuring is a problem for phrase models
• Restructuring
– reordering of constituents (main focus)
– add/drop/change of function words

Page 42

Clause Structure

Example from (Koehn, 2006)

Syntax tree from German parser

Page 43

Clause Reordering When Translating

• Reordering when translating into English
– the tree is flattened
– clause-level constituents line up

Page 44

Clause structure

Clause-level reordering is a well-defined task
– label German constituents with their English order

Page 45

Examples of translations improved when using reordering

• we must also this criticism should be taken seriously .
→ we must also take this criticism seriously .

• i am with him that it is necessary , the institutional balance by means of a political revaluation of both the commission and the council to maintain .
→ i agree with him in this , that it is necessary to maintain the institutional balance by means of a political revaluation of both the commission and the council .

• perhaps it would be a constructive dialog between the government and opposition parties , social representative a positive impetus in the right direction .
→ perhaps a constructive dialog between government and opposition parties and social representative could give a positive impetus in the right direction .

Page 46

“I’m trying to translate what my cat says and put it in a book, but how many homonyms are there for meow?”

(Jarod Kintz, The Days of Yay are Here! Wake Me Up When They're Over.)