The Dark Secrets of MT Revealed

Machine Translation
I256: Applied Natural Language Processing

John DeNero
Some slides on loan from Dan Klein & others

Thursday, November 5, 2009


Data-Driven Machine Translation

Sentence-aligned parallel corpus:

  Yo lo haré mañana      I will do it tomorrow
  Hasta pronto           See you soon
  Hasta pronto           See you around

Novel sentence:

  Yo lo haré pronto      I will do it soon

Machine translation system:

  Model of translation

Target language corpus:

  I will get to it soon
  See you later
  He will do it


Uses of Translation

• Assimilation
  • Gist of a document is helpful
• Dissemination
  • High quality expected; may be closed domain
• Communication
  • Wide range of quality requirements

Machine translation is much lower cost, much faster, and much easier to access than conventional translation. However, it’s worse.


A Brief and Biased History

  ’47    MT is the “first” non-numerical computing task
  ’58    Berkeley’s first MT grant
  ’66    ALPAC report deems MT bad
  ’90s   Statistical data-driven approach introduced
  ’00s   Statistical MT thrives

Warren Weaver:
“When I look at an article in Russian, I say: ‘This is really written in English, but it has been coded in some strange symbols. I will now proceed to decode.’”

Warren Weaver:
“Thus it may be true that the way to translate from Chinese to Arabic, or from Russian to Portuguese, is not to attempt the direct route, shouting from tower to tower. Perhaps the way is to descend, from each language, down to the common base of human communication, the real but as yet undiscovered universal language, and then re-emerge by whatever particular route is convenient.”

John Pierce:
“‘Machine Translation’ presumably means going by algorithm from machine-readable source text to useful target text... In this context, there has been no machine translation...”


The Problem with Dictionary Look-ups

  顶部    /top/roof/
  顶端    /summit/peak/top/apex/
  顶头    /coming directly towards one/top/end/
  盖      /lid/top/cover/canopy/build/Gai/
  盖帽    /surpass/top/
  极      /extremely/pole/utmost/top/collect/receive/
  尖峰    /peak/top/
  面      /fade/side/surface/aspect/top/face/flour/
  摘心    /top/topping/

carrot, class, pile, condition, drawer, speed, bikini, lungs, “top dog”, “top brass”, “top of the line”, “big top”, “over the top”, “pop top”, “top off”, “off the top of my head”, “take it from the top”, “I’m on top of it”, ...

Example from Douglas Hofstadter


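Levels of Language Transfer

[Figure: transfer pyramid. Analysis climbs from the source text, generation descends to the target text, and transfer crosses between the two sides. Levels from bottom to top: Morphology, Syntax, Semantics, with Interlingua at the apex.]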


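Translating with Tree Transducers

Input:

  lo haré de muy buen grado .

Grammar:

  ADV → 〈 de muy buen grado ; gladly 〉
  S → 〈 lo haré ADV . ; I will do it ADV . 〉

Output:

  I will do it gladly .

[Figure: the ADV rule rewrites "de muy buen grado" as ADV on the input side and produces "gladly" on the output side; the S rule then covers "lo haré ADV ." and produces "I will do it ADV .". The output side carries a parse tree with the labels S, NP, VP, MD, VB, PRP, and ADV.]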
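The two grammar rules on this slide can be applied mechanically. The following is a minimal Python sketch, assuming the rules are tried in a fixed bottom-up order; the `RULES` table and the `translate` helper are illustrative names, not part of the lecture, and the sketch ignores the parse tree and any scoring of competing derivations.

```python
# Each rule: (nonterminal, source side, target side), all as token lists.
# Rule order matters in this sketch: ADV must be built before S can match.
RULES = [
    ("ADV", "de muy buen grado".split(), "gladly".split()),
    ("S", "lo haré ADV .".split(), "I will do it ADV .".split()),
]

def translate(tokens):
    """Rewrite source spans into nonterminals, building the target side as we go."""
    # Items are either plain source words or (label, target_tokens) pairs.
    items = list(tokens)
    for label, src, tgt in RULES:
        n = len(src)
        i = 0
        while i + n <= len(items):
            window = items[i:i + n]
            # A rule symbol matches a word or an already-built nonterminal.
            symbols = [x[0] if isinstance(x, tuple) else x for x in window]
            if symbols == src:
                # Substitute the children's translations into the target side.
                children = {x[0]: x[1] for x in window if isinstance(x, tuple)}
                out = []
                for sym in tgt:
                    out.extend(children.get(sym, [sym]))
                items[i:i + n] = [(label, out)]
            i += 1
    # Succeed only if the whole input was reduced to a single S.
    if len(items) == 1 and isinstance(items[0], tuple) and items[0][0] == "S":
        return " ".join(items[0][1])
    return None
```

Calling `translate("lo haré de muy buen grado .".split())` reduces the adverbial phrase first and then the whole sentence, returning the output string shown on the slide. An input the grammar does not cover returns `None`.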


A Statistical Translation Model

Synchronous derivation of "lo haré de muy buen grado ." / "I will do it gladly ." under the grammar:

  ADV → 〈 de muy buen grado ; gladly 〉
  S → 〈 lo haré ADV . ; I will do it ADV . 〉

Product of Experts Model

Models that factor over rules (how good is this rule?):

$$\prod_{r} P(e_r \mid f_r)^{\lambda_2} \, P(f_r \mid e_r)^{\lambda_3} \cdots$$

Language model factors over n-grams (how good is this target sentence?):

$$\prod_{i=1}^{I} P(e_i \mid e_{i-1}, \ldots, e_1)^{\lambda_1}$$
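The product-of-experts score is usually easier to handle in log space, where each weighted factor becomes a term in a sum. Below is a minimal sketch of that computation, assuming the per-rule and per-word probabilities have already been looked up; the function name and argument layout are illustrative, and it covers only the three factors written out on the slide.

```python
import math

def derivation_log_score(rule_probs, lm_probs, lam1, lam2, lam3):
    """Weighted log-score of one derivation under the product-of-experts form.

    rule_probs: list of (P(e_r | f_r), P(f_r | e_r)) pairs, one per rule r.
    lm_probs:   list of P(e_i | e_{i-1}, ..., e_1), one per target word i.
    lam1..lam3: exponents on the language model and the two rule models.
    """
    # Language model expert: lam1 * sum_i log P(e_i | history)
    lm_term = lam1 * sum(math.log(p) for p in lm_probs)
    # Rule experts: sum_r (lam2 * log P(e_r|f_r) + lam3 * log P(f_r|e_r))
    rule_term = sum(lam2 * math.log(p_ef) + lam3 * math.log(p_fe)
                    for p_ef, p_fe in rule_probs)
    return lm_term + rule_term
```

Exponentiating the result recovers the product on the slide. Setting a weight to zero switches the corresponding expert off, which is one way to see what each factor contributes.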


Learning to Translate

[Figure: example from Adam Lopez]


Unsupervised Word Alignment

• Input: A large bitext of sentences and their translations

• Approach: Using what we know about the problem and corpus statistics, align words of translations automatically

• Exciting fact: Unsupervised methods perform well enough that very few systems use supervised word alignment


Properties of Cross-Lingual Alignments

English: I declare resumed the session of the european parliament adjourned on Friday 17 December 1999 , ...

Spanish: Declaro reanudado el periodo de sesiones del parlamento europeo interrumpido el Viernes 17 de Diciembre pasado , ...

• Often one-to-one or many-to-one (usually over contiguous phrases)

• Occasionally many-to-many, driven by non-literal translations


Heuristic Estimation

• Two words that co-occur regularly are translations

• Normalize by the word frequencies

• Enforcing competition across words (e.g., finding a one-to-one or many-to-one mapping) is a good idea

Counts:

  c(e, f)   The number of times e and f appear together
  c(e)      Count of word e
  c(f)      Count of word f

Dice coefficient:

$$\frac{2 \cdot c(e, f)}{c(e) + c(f)}$$
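As a rough illustration, the Dice coefficient can be computed directly from a sentence-aligned bitext. The sketch below assumes that counts are taken per sentence pair (a word counts once per sentence it appears in) and that whitespace tokenization is good enough; both are simplifications, and `dice_scores` is an illustrative name.

```python
from collections import Counter

def dice_scores(bitext):
    """Dice coefficient 2*c(e,f) / (c(e) + c(f)) for each co-occurring word pair."""
    c_e, c_f, c_ef = Counter(), Counter(), Counter()
    for e_sent, f_sent in bitext:
        # Sets: each word is counted once per sentence pair.
        e_words, f_words = set(e_sent.split()), set(f_sent.split())
        c_e.update(e_words)
        c_f.update(f_words)
        c_ef.update((e, f) for e in e_words for f in f_words)
    return {(e, f): 2 * n / (c_e[e] + c_f[f]) for (e, f), n in c_ef.items()}

# Toy bitext built from the sentence pairs used earlier in the lecture.
BITEXT = [
    ("I will do it tomorrow", "Yo lo haré mañana"),
    ("See you soon", "Hasta pronto"),
    ("See you around", "Hasta pronto"),
]
SCORES = dice_scores(BITEXT)
```

On this toy data "See" scores equally well against "Hasta" and "pronto", because the two Spanish words always appear together. Co-occurrence counts alone cannot separate them, which is the motivation for enforcing competition across words.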


IBM Model 1 (Brown et al., ’93)

• Probabilistic models naturally impose competition

• Assume that foreign words are generated independently

• Assume a hidden alignment vector a encoding which English word generates each foreign word

English: I declare resumed the session

Spanish: Declaro reanudado el periodo de sesiones

Example: a_6 = 5 (the sixth foreign word, "sesiones", is generated by the fifth English word, "session")

$$P(f, a \mid e) = \prod_{j=1}^{J} P(a_j = i \mid I, J) \, P(f_j \mid e_i) = \prod_{j=1}^{J} \frac{1}{I+1} \, P(f_j \mid e_i)$$

In each factor, i is the value of a_j.
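The Model 1 formula translates almost line for line into code. The sketch below assumes that the extra position in 1/(I+1) is an empty NULL word placed at index 0 of the English sentence, as in Brown et al.'s formulation; the table `t` of lexical probabilities and the function name are illustrative.

```python
def model1_joint(f_words, e_words, alignment, t):
    """P(f, a | e) under Model 1 with a uniform alignment prior 1/(I+1).

    e_words[0] is reserved for the NULL word, so positions 1..I are real words.
    alignment[j] is the English position generating the j-th foreign word
    (j counted from 0 here, from 1 on the slide).
    t maps (f, e) to the lexical probability P(f | e).
    """
    I = len(e_words) - 1  # number of real English words
    prob = 1.0
    for f, i in zip(f_words, alignment):
        # One factor per foreign word: uniform prior times lexical probability.
        prob *= (1.0 / (I + 1)) * t.get((f, e_words[i]), 0.0)
    return prob
```

With the slide's example, the English side would be `["NULL", "I", "declare", "resumed", "the", "session"]`, and the alignment entry for "sesiones" would be 5.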


Estimating Model 1 Parameters

• Free parameters in the model: $P(f \mid e)$

• Goal is to maximize the data likelihood

• E-step computes expected alignments (posteriors)

$$P(a_j = i \mid e, f) = \frac{\frac{1}{I+1} P(f_j \mid e_i)}{\sum_{i'} \frac{1}{I+1} P(f_j \mid e_{i'})}$$

• M-step computes ratios of expected counts

$$P(f \mid e) = \frac{\text{sum of posteriors for } f \text{ aligned to } e}{\text{sum of posteriors of any } f' \text{ aligned to } e}$$

• Repeat E- and M-steps many times (like 5 or 10)
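The E- and M-steps above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's code: the function name `train_model1`, the `<null>` token name, and the uniform initialization are assumptions made for the example.

```python
from collections import defaultdict

def train_model1(pairs, iterations=5):
    """pairs: list of (foreign_tokens, english_tokens).
    Returns a dict t with t[(f, e)] = P(f | e)."""
    NULL = "<null>"  # assumed name for the null English word (position 0)
    f_vocab = {f for fs, _ in pairs for f in fs}
    t = defaultdict(lambda: 1.0 / len(f_vocab))  # uniform starting point (assumption)
    for _ in range(iterations):
        count = defaultdict(float)  # sum of posteriors for f aligned to e
        total = defaultdict(float)  # sum of posteriors of any f' aligned to e
        for fs, es in pairs:
            es = [NULL] + list(es)
            for f in fs:
                # E-step: posterior over English positions; the 1/(I+1) factors cancel
                z = sum(t[(f, e)] for e in es)
                for e in es:
                    p = t[(f, e)] / z
                    count[(f, e)] += p
                    total[e] += p
        # M-step: ratios of expected counts
        t = defaultdict(float, {(f, e): c / total[e] for (f, e), c in count.items()})
    return t
```

On a toy corpus such as ("la maison", "the house") and ("la fleur", "the flower"), a few iterations are enough for P(la | the) to overtake P(maison | the).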


Aligning Words Under the Model

• Viterbi: For every j, select the i that maximizes $P(a_j = i \mid e, f)$
  Gives competition among explanations

• Posterior: Align every (i, j) that has $P(a_j = i \mid e, f) > \tau$ for a chosen threshold $\tau$
  Gives control over how many alignment links to posit
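The two decoding rules can be compared side by side. The sketch below assumes a translation table `t` stored as a dict keyed by (f, e), leaves out the null word for brevity, and uses hypothetical function names.

```python
def posteriors(f_tokens, e_tokens, t):
    """Model 1 posteriors P(a_j = i | e, f): one row per foreign word j,
    one column per English word i (null word omitted for brevity)."""
    rows = []
    for f in f_tokens:
        scores = [t.get((f, e), 0.0) for e in e_tokens]
        z = sum(scores) or 1.0  # avoid dividing by zero for unseen words
        rows.append([s / z for s in scores])
    return rows

def viterbi_links(post):
    """For every j, keep only the single best i."""
    return {(max(range(len(row)), key=row.__getitem__), j)
            for j, row in enumerate(post)}

def posterior_links(post, threshold):
    """Keep every (i, j) whose posterior exceeds the threshold."""
    return {(i, j) for j, row in enumerate(post)
            for i, p in enumerate(row) if p > threshold}
```

Viterbi decoding always returns exactly one link per foreign word, while raising or lowering `threshold` trades recall for precision in the posterior rule.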


Evaluation: Alignment Error Rate

[Figure: alignment grid, image not recovered. Legend: sure alignments, possible alignments, predicted alignments]
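The formula on this slide did not survive extraction. Assuming the standard definition from Och and Ney (2003), with predicted links A, sure links S, and possible links P (where S is a subset of P), precision is |A ∩ P| / |A|, recall is |A ∩ S| / |S|, and AER is 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|). A small Python sketch with a hypothetical function name:

```python
def alignment_scores(predicted, sure, possible):
    """All arguments are sets of (i, j) links. Returns (precision, recall, AER)."""
    possible = possible | sure  # every sure link also counts as possible
    a_s = len(predicted & sure)
    a_p = len(predicted & possible)
    precision = a_p / len(predicted)
    recall = a_s / len(sure)
    aer = 1.0 - (a_s + a_p) / (len(predicted) + len(sure))
    return precision, recall, aer
```

Predicting a link that is only "possible" does not hurt precision, and missing it does not hurt recall, which is why AER is not simply one minus F-measure.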


Problems with IBM Model 1

• Too many alignments to rare words (garbage collection)

• Alignments jump around all over the sentence


Intersected IBM Model 1

Model          P/R     AER
Model 1 E→F    82/58   30.6
Model 1 F→E    85/58   28.7
Model 1 AND    96/46   34.8

• Train Model 1 in both directions, align with each, then intersect the output (Och and Ney, ’03)

• Result is one-to-one with Viterbi alignments

• Second model filters the first, eliminating mistakes
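Intersecting two directional Viterbi alignments is a plain set intersection once both are written as (English position, foreign position) links. A minimal sketch; the function and argument names are hypothetical.

```python
def intersect_viterbi(a_f_to_e, a_e_to_f):
    """a_f_to_e[j] = English position chosen for foreign word j by one model;
    a_e_to_f[i] = foreign position chosen for English word i by the other.
    Returns the links (i, j) that both models agree on."""
    links_1 = {(i, j) for j, i in enumerate(a_f_to_e)}
    links_2 = {(i, j) for i, j in enumerate(a_e_to_f)}
    return links_1 & links_2
```

Because the first model gives each foreign word at most one English word and the second gives each English word at most one foreign word, the intersection is necessarily one-to-one.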


Joint Training for IBM Model 1

Model          P/R     AER
Model 1 E→F    82/58   30.6
Model 1 F→E    85/58   28.7
Model 1 AND    96/46   34.8
Model 1 INT    93/69   19.5

• We can intersect model predictions during training as well

• Modified alignment posterior:

$$P_{e \to f}(a_j = i \mid e, f) \cdot P_{f \to e}(a_i = j \mid e, f)$$

• Models are forced to agree as they select parameters

• Same precision benefits, but higher recall from more agreement
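The modified posterior is an element-wise product of the two directional posterior tables. The sketch below assumes both tables are already computed as nested lists; how the product is fed back into each model's M-step is not shown, and the names are hypothetical.

```python
def agreement_posterior(post_e2f, post_f2e):
    """post_e2f[j][i] = P_{e->f}(a_j = i | e, f)
    post_f2e[i][j] = P_{f->e}(a_i = j | e, f)
    Returns joint[j][i], the product of the two for each link (i, j)."""
    J, I = len(post_e2f), len(post_f2e)
    return [[post_e2f[j][i] * post_f2e[i][j] for i in range(I)]
            for j in range(J)]
```

A link keeps a high score only when both directions assign it high probability, which is the training-time analogue of intersecting the outputs.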


IBM Model 2

• Words at the beginning of sentences should align

• Words at the end of sentences should align

• Alignment probability depends on position, e.g.

$$P(f, a \mid e) = \prod_{j=1}^{J} P(a_j = i \mid I, J) \cdot P(f_j \mid e_i)$$

$$P(a_j = i \mid I, J) \cdot P(f_j \mid e_i) \propto \exp\left(-\alpha \left| a_j - \frac{jI}{J} \right|\right) \cdot P(f_j \mid e_i)$$
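The position-dependent alignment probability of Model 2 can be sketched as a normalized exponential penalty on the distance from the diagonal. The decay rate `alpha`, the function name, and the omission of the null position are assumptions of this example.

```python
import math

def distortion(j, I, J, alpha=1.0):
    """Return [P(a_j = i | I, J) for i = 1..I], with the probability
    proportional to exp(-alpha * |i - j * I / J|)."""
    weights = [math.exp(-alpha * abs(i - j * I / J)) for i in range(1, I + 1)]
    z = sum(weights)
    return [w / z for w in weights]
```

For a foreign word at position j, the most likely English position is the one closest to j * I / J, so first words prefer first words and last words prefer last words.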


Phrase Movement

Des tremblements de terre ont à nouveau touché le Japon jeudi 4 novembre.

On Thursday Nov. 4, earthquakes rocked Japan once again

Absolute position distortion isn’t quite right


IBM Models 1/2

E: Thank you , I shall do so gladly .
   (positions 1 2 3 4 5 6 7 8 9)

F: Gracias , lo haré de muy buen grado .

A: 1 3 7 6 8 8 8 8 9

Model Parameters
Transitions: P( A2 = 3 | I, J )
Emissions: P( F1 = Gracias | E_A1 = Thank )


The HMM Model

E: Thank you , I shall do so gladly .
   (positions 1 2 3 4 5 6 7 8 9)

F: Gracias , lo haré de muy buen grado .

A: 1 3 7 6 8 8 8 8 9

Model Parameters
Transitions: P( A2 = 3 | A1 = 1 )
Emissions: P( F1 = Gracias | E_A1 = Thank )


• Model 2 preferred global monotonicity

• We want local monotonicity (small jumps)

• HMM model (Vogel et al., ’96)

• Re-estimate using the forward-backward algorithm

• Handling nulls requires some care

[Figure not recovered; horizontal axis ticks: -2, -1, 0, 1, 2, 3]
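A preference for small jumps can be expressed by making the transition probability depend only on the jump width i - i_prev, as in Vogel et al. A minimal sketch; the weighting function `prefer_step_forward` is a made-up illustration, not a trained distribution, and null handling is left out.

```python
def transition_matrix(jump_weight, I):
    """Row i_prev (1-based) holds P(a_j = i | a_{j-1} = i_prev) for i = 1..I,
    proportional to jump_weight(i - i_prev)."""
    matrix = []
    for i_prev in range(1, I + 1):
        row = [jump_weight(i - i_prev) for i in range(1, I + 1)]
        z = sum(row)
        matrix.append([w / z for w in row])
    return matrix

def prefer_step_forward(d):
    """Hypothetical jump weights: a jump of +1 is most likely, decaying either side."""
    return 0.5 ** abs(d - 1)
```

Each row is renormalized because jumps that would leave the sentence are impossible, so the same jump weights yield different probabilities near the sentence boundaries.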


HMM Examples


AER for HMMs

Model          AER
Model 1 INT    19.5
HMM E→F        11.4
HMM F→E        10.8
HMM AND         7.1
HMM INT         4.7
GIZA M4 AND     6.9


Aligning Larger Structures

In 1990, we aligned words

Yo lo haré mañana
I will do it tomorrow

English (E)    P( E | mañana )
tomorrow       0.7
morning        0.3

In 1999, we aligned phrases

Yo lo haré mañana
I will do it tomorrow

English (E)    P( E | lo haré )
will do it     0.8
will do so     0.2

In 2004, we aligned trees

Yo lo haré mañana
I will do it tomorrow

[Figure: tree fragment pairing "lo haré NP" with "will do it", with P( fragment ) = 0.8; tree not recovered. Node labels shown: VP, NP, PRN, VB, MD]


Aligning Structural Components

In 2009, we still align words

Yo lo haré mañana
I will do it tomorrow

1. Align words with a probabilistic model

2. Infer presence of larger structures from this alignment

3. Translate with the larger structures

Fragment-level correspondence is derived from word alignments


Estimating Rule Parameters from Words

Word Aligned Sentence Pair

Thank you , I will do it gladly .
Gracias , lo haré de muy buen grado .

Grammar Rules

〈haré ; will do〉
〈lo X de ... grado ; X it gladly〉

Model Parameters

Relative frequency counts:

$$P(es \mid en) = \frac{c(\text{lo X de muy buen grado} \; ; \; \text{X it gladly})}{c(* \; ; \; \text{X it gladly})}$$
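The relative frequency estimate is a ratio of two counts over the extracted rules. A small sketch, assuming the rules arrive as a list of (Spanish side, English side) string pairs; the function name is hypothetical.

```python
from collections import Counter

def relative_frequencies(rule_pairs):
    """rule_pairs: list of (es_side, en_side) extracted rules.
    Returns P(es | en) = c(es ; en) / c(* ; en) for every observed pair."""
    pair_counts = Counter(rule_pairs)
    en_counts = Counter(en for _, en in rule_pairs)
    return {(es, en): c / en_counts[en] for (es, en), c in pair_counts.items()}
```

The estimate depends entirely on which rules the word alignment licenses, so alignment errors propagate directly into these probabilities.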


Learning Grammars for Translation

Thank you , I will do it gladly .

Gracias , lo haré de muy buen grado .

[Parse tree figure not recovered; node labels shown: S, VP, NP, PRP, VB, MD, ADV]

Grammar Rules

VP → 〈lo haré ADV ; will do it ADV〉


What Happens in Practice

Je vois un chat → [Machine translation system: Model of translation] → I see a spade

Sentence-aligned parallel corpus:

... appelez un chat un chat
... call a spade a spade
...


What Happens in Practice

F: Gracias , lo haré de muy buen grado .

E: Thank you , I shall do so gladly .

Gloss: Thanks , that do [first; future] of very good degree .

[Figure: a real word alignment (GIZA++ Model 4 with grow-diag-final combination); alignment grid not recovered]


[Figure: two alignment grids for "Gracias , lo haré de muy buen grado ." / "Thank you , I shall do so gladly ."; grids not recovered. Left: a real word alignment (GIZA++ Model 4 with grow-diag-final combination). Right: a sampled phrase alignment (our system)]


Example Machine Translation Pipeline


A Machine Translation Pipeline

Phrase Model Training (Moses)

Example from CMU INCA System (Vogel et al)


Example Syntax-Based Translation

New Arabic v5.1 base system - sentence 211 (generated by Jens-S. Vöckler, 2008-04-10 21:29)

[ara-tune4600:211] 1-best PoS-Tree

al/NNP @-@/HYPH baz/NNP declined/VBD to/TO give/VB any/DT statements/NNS upon/IN his/PRP$ arrival/NN in/IN the/DT province/NN ./.

[Tree figure not recovered; constituent labels shown: NML, NPB, NP-C, PP, VP-C, VP, SG-C, S-BAR, S, GLUE, TOP]

foreign:
tac-lang: urfD albaz aladla’ baá tSryHat fur uSulh alá almqaT‘e .
bckwltr: wrfD AlbAz AlAdlA’ bAY tSryHAt fwr wSwlh AlY AlmqATEp .

Tune.nw.0: al @-@ baz declined to make any statements upon his arrival in the province .
Tune.nw.1: al @-@ baz refused to give any statements on arriving at al @-@ muqataah .
Tune.nw.2: immediately upon his arrival in the area , al @-@ baz declined to give any statements .
Tune.nw.3: al @-@ baz refused to make any statement upon his arrival at the moqata’ah .
1-best: al @-@ baz declined to give any statements upon his arrival in the province .

[ara-tune4600:211] 1-best Dot Product

feature              weight   value   product
derivation-size        0.41    8        3.30
glue-rule              3.89    2        7.78
green                 -0.08    0        0
gt_prob                0.40   36.18    14.43
identity              -9.97    0        0
is_lexicalized        -0.65    6       -3.91
lex_pef                1.02    5.47     5.60
lex_pfe                0.31    4.44     1.39
lm1                    1      22.76    22.76
lm1-unk               30.08    0        0
lm2                    0.74   26.66    19.79
lm2-unk              -39.18    0        0
missingWord           -1.29    0        0
model1inv              1.02   10.60    10.81
model1nrm              1.35   11.29    15.22
nonmonotone            4.17    0        0
olive                  1.95    0        0
psm1n                  0.50   24.65    12.30
text-length           -3.87   15      -58.05
trivial_cond_prob      0.41    3.34     1.38
unk-rule              19.28    0        0

reported totalcost: 52.82
$\vec{v} \cdot \vec{w}$: 52.82
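The total cost in the table above is the dot product of the weight vector and the feature-value vector. The check below recomputes it in Python from the printed numbers; because the printed weights and values are rounded, the result only approximately matches the reported 52.82.

```python
# (weight, value) pairs copied from the "1-best Dot Product" table
features = {
    "derivation-size": (0.41, 8), "glue-rule": (3.89, 2), "green": (-0.08, 0),
    "gt_prob": (0.40, 36.18), "identity": (-9.97, 0), "is_lexicalized": (-0.65, 6),
    "lex_pef": (1.02, 5.47), "lex_pfe": (0.31, 4.44), "lm1": (1, 22.76),
    "lm1-unk": (30.08, 0), "lm2": (0.74, 26.66), "lm2-unk": (-39.18, 0),
    "missingWord": (-1.29, 0), "model1inv": (1.02, 10.60), "model1nrm": (1.35, 11.29),
    "nonmonotone": (4.17, 0), "olive": (1.95, 0), "psm1n": (0.50, 24.65),
    "text-length": (-3.87, 15), "trivial_cond_prob": (0.41, 3.34), "unk-rule": (19.28, 0),
}

# Linear model score: sum of weight * value over all features
total = sum(weight * value for weight, value in features.values())
```

Features with value 0 (for example `unk-rule`) contribute nothing to this hypothesis even when their weights are large.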


Automatic Translation Evaluation

• Scores how similar an automatically generated hypothesis is to human-generated references

• Dozens of variants — most common is BLEU

Reference: Al - baz declined to make any statement

Hypothesis: Al - baz declined to give any statement

4-gram precision: 2/5

3-gram precision: 3/6

2-gram precision: 5/7

1-gram precision: 7/8

Systems are trained to optimize this metric
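
The four fractions on this slide match the n-gram precisions of the hypothesis against the reference, for n = 4 down to 1. A minimal sketch that reproduces them by counting clipped n-gram matches is below. The function names are illustrative, and the final score uses the usual single-reference BLEU recipe (geometric mean of the four precisions times a brevity penalty), which the slide itself does not spell out.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_precision(hypothesis, reference, n):
    """Return (matched, total): hypothesis n-grams also found in the reference.

    Each hypothesis n-gram is credited at most as often as it occurs
    in the reference (clipped counts).
    """
    hyp_counts = Counter(ngrams(hypothesis, n))
    ref_counts = Counter(ngrams(reference, n))
    matched = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
    total = sum(hyp_counts.values())
    return matched, total


reference = "Al - baz declined to make any statement".split()
hypothesis = "Al - baz declined to give any statement".split()

precisions = {n: ngram_precision(hypothesis, reference, n) for n in range(1, 5)}
for n in range(4, 0, -1):
    matched, total = precisions[n]
    print(f"{n}-gram precision: {matched}/{total}")

# Assumed combination step: geometric mean of the precisions times a
# brevity penalty that punishes hypotheses shorter than the reference.
# This sketch assumes every precision is non-zero (log(0) is undefined).
if len(hypothesis) > len(reference):
    brevity_penalty = 1.0
else:
    brevity_penalty = math.exp(1 - len(reference) / len(hypothesis))
log_mean = sum(math.log(m / t) for m, t in precisions.values()) / 4
bleu = brevity_penalty * math.exp(log_mean)
print(f"sentence-level BLEU (sketch): {bleu:.4f}")
```

A single substituted word ("give" for "make") costs one unigram match but removes three of the five 4-gram matches, which is why the higher-order precisions fall off quickly.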

Integrating MT into Other Systems

• Speech-to-speech translation

• Cross-lingual information retrieval

• Translated optical character recognition

• Mobile device integration

• Text-oriented web services of all kinds
