74
Low Resource Machine Translation Marc’Aurelio Ranzato Facebook AI Research - NYC [email protected] Stanford - CS224N, 10 March 2020

Low Resource Machine Translation - Stanford University

  • Upload
    others

  • View
    5

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Low Resource Machine Translation - Stanford University

Low Resource Machine Translation

Marc’Aurelio RanzatoFacebook AI Research - NYC

[email protected]

Stanford - CS224N, 10 March 2020

Page 2: Low Resource Machine Translation - Stanford University

Machine Translation

2

English French

Training data

NMT SystemTrain NMT

Test NMT NMT Systemlife is beautiful la vie est belle

Ingredients: • seq2seq with attention • SGD

Ingredient: • beam

Page 3: Low Resource Machine Translation - Stanford University

3

• 6000+ languages in the world • 80% of the world population

does not speak English • Less than 5% of the people in

the world are native English speakers.

Some Stats

Page 4: Low Resource Machine Translation - Stanford University

https://www.statista.com/statistics/266808/the-most-spoken-languages-worldwide/source:

The Long Tail of Languages

The top 10 languages are spoken by less than 50% of the people. The remaining ~6500 are spoken by the rest! More than 2000 languages are spoken by less than 1000 people.

Page 5: Low Resource Machine Translation - Stanford University

https://ai.googleblog.com/2019/10/exploring-massively-multilingual.htmlsource:

(X to English)

Page 6: Low Resource Machine Translation - Stanford University

Machine Translation in Practice

6

English Nepali

Training data25M people

Page 7: Low Resource Machine Translation - Stanford University

7

English Nepali

Training data

Parallel training data (collection of sentences with corresponding translation) is small!

25M people

Machine Translation in Practice

Page 8: Low Resource Machine Translation - Stanford University

8

English Nepali

Training data

Let’s represent data with rectangles. The color indicates the language.

Machine Translation in Practice

Page 9: Low Resource Machine Translation - Stanford University

English Nepali

• Some parallel data originates in the source, some in the target language. • Source and target domains may not match.

Dom

ain

Bible

Parliamentary

Let’s represent (human) translations with empty rectangles.

sentences originating in English

corresponding Nepali translations

sentences originating in Nepali

corresponding English translations

Machine Translation in Practice

Page 10: Low Resource Machine Translation - Stanford University

English Nepali

• Test data might be in another domain. • There might exist source side in-domain monolingual data.

Dom

ain

mono

mono

News

Bible

Parliamentary

TEST

mono

Machine Translation in Practice

Page 11: Low Resource Machine Translation - Stanford University

English Nepali

• There might be parallel and monolingual data with a high resource language close to the low resource language of interest. This data may belong to a different domain.

Dom

ain

mono

Hindi

monoBooks

mono

TEST

mono

Machine Translation in PracticeNews

Bible

Parliamentary

Page 12: Low Resource Machine Translation - Stanford University

English NepaliD

omai

nHindi

TEST

Sinhala Bengali Spanish Tamil Gujarati

…the Mondrian like learning setting!

Page 13: Low Resource Machine Translation - Stanford University

13

Low Resource Machine TranslationLoose definition: A language pair can be considered low resource when the number of parallel sentences is in the order of 10,000 or less. Note: modern NMT systems have several hundred million parameters nowadays!

Challenges: - data

- sourcing data to train on - evaluation datasets

- modeling - unclear learning paradigm - domain adaptation - generalization

Page 14: Low Resource Machine Translation - Stanford University

Why Low Resource MT Is Interesting?

• It is about learning with less labeled data.

• It is about modeling structured outputs and compositional learning.

• It is a real problem to solve.

14

Page 15: Low Resource Machine Translation - Stanford University

Outline

15

MODELDATA

ANALYSIS

life of a researcher

“The FLoRes evaluation for low resource MT:…” Guzmán, Chen et al. ’EMNLP 2019

“Phrase-based & Neural Unsup MT” Lample et al. EMNLP 2018 “FBAI WAT’19 My-En translation task submission” Chen et al., WAT@EMNLP 2019 “Investigating Multilingual NMT Representations at Scale” Kudugunta et al., EMNLP 2019 “Multilingual Denoising Pre-training for NMT” Liu et al., arXiv 2001:08210 2020

“Analyzing uncertainty in NMT” Ott et al. ICML 2018 “On the evaluation of MT systems trained with back-translation” Edunov et al. ACL 2020 “The source-target domain mismatch problem in MT” Shen et al. arXiv 1909.13151 2019

Page 16: Low Resource Machine Translation - Stanford University

16 http://opus.nlpl.eu/

A Big “Small-Data” Challenge

Page 17: Low Resource Machine Translation - Stanford University

English NepaliD

omai

n

Wikipedia

Bible, JW300, etc.

GNOME, Ubuntu, etc.

mono

TEST

mono

Common Crawlmono mono

In-domain data: no parallel, little monolingual. Out-of-domain: little parallel, quite a bit monolingual No translation originating from Nepali.

Case Study: En-Ne

Page 18: Low Resource Machine Translation - Stanford University

A Case Study: En-Ne• Parallel Training data: versions of bible and ubuntu handbook (<1M

sentences).

• Nepali Monolingual data: wikipedia (90K), common crawl (few millions).

• English Monolingual data: unlimited almost.

• Test data: ???

18

Page 19: Low Resource Machine Translation - Stanford University

FLoRes Evaluation Benchmark• Validation, test and hidden test set, each with 3000 sentences in

English-Nepali and English-Sinhala. • Sentences taken from Wikipedia documents.

• Very expensive and slow. • Very hard to produce high-quality translations:

• automatic checks (language model filtering, transliteration filtering, length filtering, language id filtering, etc),

• human assessment.

Data Collection Process:

Guzmàn, Chen et al. “The FLoRes evaluation datasets for low resource MT…” EMNLP 2019

Page 20: Low Resource Machine Translation - Stanford University

ExamplesSi-En

En-Si

original

translation

Guzmàn, Chen et al. “The FLoRes evaluation datasets for low resource MT…” EMNLP 2019

Wikipedia originating in Si has differenttopics than Wikipedia originating in En

Page 21: Low Resource Machine Translation - Stanford University

ExamplesNe-En

En-Ne

Guzmàn, Chen et al. “The FLoRes evaluation datasets for low resource MT…” EMNLP 2019

Page 22: Low Resource Machine Translation - Stanford University

• Useful to evaluate truly low resource language pairs.

• WMT 2019 and WMT 2020 shared filtering task.

• Several publications.

• Sustained effort, more to come…

22

https://github.com/facebookresearch/floresdata & baseline models

Page 23: Low Resource Machine Translation - Stanford University

What Did We Learn?

• Data is often as or more important than designing a model.

• Collecting data is not trivial.

• Look at the data!!

23

Page 24: Low Resource Machine Translation - Stanford University

Outline

24

MODELDATA

ANALYSIS

life of a researcher

“The FLoRes evaluation for low resource MT:…” Guzmán, Chen et al. ’EMNLP 2019

“Phrase-based & Neural Unsup MT” Lample et al. EMNLP 2018 “FBAI WAT’19 My-En translation task submission” Chen et al., WAT@EMNLP 2019 “Massively Multilingual NMT” Aharoni et al.,ACL 2019 “Multilingual Denoising Pre-training for NMT” Liu et al., arXiv 2001:08210 2020

“Analyzing uncertainty in NMT” Ott et al. ICML 2018 “On the evaluation of MT systems trained with back-translation” Edunov et al. ACL 2020 “The source-target domain mismatch problem in MT” Shen et al. arXiv 1909.13151 2019

Page 25: Low Resource Machine Translation - Stanford University

English NepaliD

omai

nHindi

TEST

Sinhala Bengali Spanish Tamil Gujarati

Page 26: Low Resource Machine Translation - Stanford University

ML Perspective: Supervised Learning

If N is small, how can we further regularize the model? - dropout [1] - label smoothing [2]

Learning Framework: Supervised Learning.

Encoder

en ne

Training DatasetD = {(x, y)i}i=1,..,N

<latexit sha1_base64="KpnY63s6K47gGjqoo/CKclue32o=">AAACuXicfZHfa9swEMdl70fb7Fe2Pe5FLBQyCMbuSjsYhbLtoS8bGSxtIc6MrMiJEsky0rnECP+PY2/7b6Y4DnRp2YHgy9190N330kJwA2H4x/MfPHz0eG//oPPk6bPnL7ovX10aVWrKRlQJpa9TYpjgORsBB8GuC82ITAW7Spef1/WrG6YNV/kPqAo2kWSW84xTAi6VdH/FksCcEmG/1PgMxxb3V4PqXcJxXCeWn0WDIBh8qzuHt/p+2limamWlyhU2mtYtuUoWDbXYUF8T8x8OZrDlqmTZcMstB/dzBdEEk3K15frl4GZn0mGddHthEDaB74qoFT3UxjDp/o6nipaS5UAFMWYchQVMLNHAqWB1Jy4NKwhdkhkbO5kTyczENs7X+NBlpjhT2r0ccJO9TVgijalk6jrX+5jd2jp5X21cQvZhYnlelMByuvkoKwUG55w7I55yzSiIyglCNXezYjp3/lBwx+44E6Ldle+Ky6Mgeh8cfT/unX9q7dhHb9Bb1EcROkXn6AIN0QhR78SLPeZl/kef+HN/sWn1vZZ5jf4J3/wFCeLTrg==</latexit>

L(✓) = � log p(y|x)<latexit sha1_base64="5sjE0mn5xSjyfKofXXGLV97RWgQ=">AAADP3icdZJNb9NAEIbX5qPFfDSFI5cVUSRbMpZdkMqlUoEeOFAUJNJWihNrvdkkm6w/5F1Xtrb+Z1z4C9y4cuEAQly5sbHTkqQwkqXRO8/svDveMGWUC9f9ouk3bt66vbV9x7h77/6DndbuwxOe5BkmPZywJDsLESeMxqQnqGDkLM0IikJGTsP560X99JxknCbxB1GmZBChSUzHFCOhpGBX63X8CIkpRkweVfAA+hKahV1aAYV+FUh64NmOY7+rjL/ccTXkDVkMeTCruVnDHQd8gxQNWQ5FMK/J+SUpVsmjaij9KEwKifKiujSS2+cbRrqqKTXLi8IyOmo69DmN4JqzlUNfXhk1a6c29EOUybIKZtb/Tas1+DhP4doxqt60Glfy28r0xZQIZKkZT6HPkglU1uAFLKyg1XYdtw54PfGWSRssoxu0PvujBOcRiQVmiPO+56ZiIFEmKGZETc05SRGeownpqzRGEeEDWf//CnaUMoLjJFNfLGCtrnZIFHFeRqEiF+b5Zm0h/qvWz8X4xUDSOM0FiXEzaJwzKBK4eExwRDOCBStVgnBGlVeIpyhDWKgnZ6gleJtXvp6c7DneM2fv/fP24avlOrbBY/AEmMAD++AQvAFd0ANY+6h91b5rP/RP+jf9p/6rQXVt2fMIrIX++w+p+wMb</latexit>

Per-sample loss:

English NepaliTEST

TRAIN //

x<latexit sha1_base64="iWIFvj4hQ56IJLkVNOFByT95Uok=">AAADt3icdZJdb9owFIZNso+OfZR2l7uxhpBAY4h0lbZeVGo3Ju1inZhW2moYIscYMDgfi50qUeqfuJvd7d/MSWAFSi1ZOjrneX1eHx0n4EzIdvtvyTAfPHz0eOdJ+emz5y92K3v7F8KPQkJ7xOd+eOVgQTnzaE8yyelVEFLsOpxeOvNPWf3ymoaC+d65TAI6cPHEY2NGsNQpe6/0u4ZcLKcE87Sj4DFEKazHzaRhM4iUnbJjq9lqNb+p8i13poaiIOOhsGc5Nyu4M1tskLIgk6G05zk5X5JyleyoYYpcx49THMVqaSRqXm8Y6WpRUE9u4ka5prtDJJgL15ytPHr632g9d9qEyMFhmih71rjftB4DIlEA157R9UK6An5VdSSnVOKGbvIWIu5PoPYGb2DmbgVbfk11Tj9v1cSZBr6B3j26H+dbZQtLt+LYrlTbrXZ+4N3AWgRVsDhdu/IHjXwSudSThGMh+lY7kIMUh5IRTlUZRYIGmMzxhPZ16GGXikGa752CNZ0ZwbEf6utJmGdXFSl2hUhcR5PZp8RmLUtuq/UjOf4wSJkXRJJ6pGg0jjiUPsyWGI5YSInkiQ4wCZn2CskUh5hIveplPQRr88t3g4uDlvWudfD9sHrycTGOHfAKvAZ1YIH34AR8AV3QA8Q4NH4axBiZR6Ztjs1pgRqlheYlWDvmr390Xyaq</latexit>

Cross-EntropyLoss

y<latexit sha1_base64="0flskoD0eLhdjGE1HTyI8J9TWSE=">AAADt3icdZJdb9owFIZNso+OfZR2l7uxhpBAY4h0lbZeVGo3Ju1inZhW2moYIscYMDgfi50qUeqfuJvd7d/MSWAFSi1ZOjrneX1eHx0n4EzIdvtvyTAfPHz0eOdJ+emz5y92K3v7F8KPQkJ7xOd+eOVgQTnzaE8yyelVEFLsOpxeOvNPWf3ymoaC+d65TAI6cPHEY2NGsNQpe6/0u4ZcLKcE87Sj4DFEKazHzaRhM4iUnbJjq9lqNb+p8i13poaiIOOhsGc5Nyu4M1tskLIgk6G05zk5X5JyleyoYYpcx49THMVqaSRqXm8Y6WpRUE9u4ka5prtDJJgL15ytPHr632g9d9qEyMFhmih71rjftB4DIlEA157R9UK6An5VdSSnVOKGbvIWIu5PoPYGb2DmbgVbfk11Tj9v1cSZBr6B3j26H+dbZQtLt+LErlTbrXZ+4N3AWgRVsDhdu/IHjXwSudSThGMh+lY7kIMUh5IRTlUZRYIGmMzxhPZ16GGXikGa752CNZ0ZwbEf6utJmGdXFSl2hUhcR5PZp8RmLUtuq/UjOf4wSJkXRJJ6pGg0jjiUPsyWGI5YSInkiQ4wCZn2CskUh5hIveplPQRr88t3g4uDlvWudfD9sHrycTGOHfAKvAZ1YIH34AR8AV3QA8Q4NH4axBiZR6Ztjs1pgRqlheYlWDvmr3914yar</latexit>

human translator

human reference

predictioninput sentenceDecoder

NMT systemDATA

[1] Srivastava et al. “Dropout: a simple way to prevent neural networks from overfitting” JMLR 2014[2] Szegedy et al. “Rethinking the inception architecture for computer vision” CVPR 2016

usual attention-based transformer

Page 27: Low Resource Machine Translation - Stanford University

ML Perspective: Semi-Supervised Learning

Adding source-side monolingual data. Idea: model p(x).

Training Dataset Learning Framework: DAED = {(x, y)i}i=1,..,N

<latexit sha1_base64="KpnY63s6K47gGjqoo/CKclue32o=">AAACuXicfZHfa9swEMdl70fb7Fe2Pe5FLBQyCMbuSjsYhbLtoS8bGSxtIc6MrMiJEsky0rnECP+PY2/7b6Y4DnRp2YHgy9190N330kJwA2H4x/MfPHz0eG//oPPk6bPnL7ovX10aVWrKRlQJpa9TYpjgORsBB8GuC82ITAW7Spef1/WrG6YNV/kPqAo2kWSW84xTAi6VdH/FksCcEmG/1PgMxxb3V4PqXcJxXCeWn0WDIBh8qzuHt/p+2limamWlyhU2mtYtuUoWDbXYUF8T8x8OZrDlqmTZcMstB/dzBdEEk3K15frl4GZn0mGddHthEDaB74qoFT3UxjDp/o6nipaS5UAFMWYchQVMLNHAqWB1Jy4NKwhdkhkbO5kTyczENs7X+NBlpjhT2r0ccJO9TVgijalk6jrX+5jd2jp5X21cQvZhYnlelMByuvkoKwUG55w7I55yzSiIyglCNXezYjp3/lBwx+44E6Ldle+Ky6Mgeh8cfT/unX9q7dhHb9Bb1EcROkXn6AIN0QhR78SLPeZl/kef+HN/sWn1vZZ5jf4J3/wFCeLTrg==</latexit> Either pre-train or add a DAE loss to the supervised cross-entropy term.Ms = {xs

j}j=1,..,Ms<latexit sha1_base64="Wuso7jnamC5w7EBqWeo2cvUsqLM=">AAACx3icbZFNj9MwEIad8LWErwJHLhZVpa5URcmCBJeVFugBDouKRHdXalrLcd1dt04c2ZMqUciBv8iNC78FN+mibpeRLL2aecZ+PRNnUhgIgt+Oe+fuvfsPDh56jx4/efqs8/zFmVG5ZnzMlFT6IqaGS5HyMQiQ/CLTnCax5Ofx6tOmfr7m2giVfocy49OEXqZiIRgFmyKdP70ooXDFqKyGNT7GUYX7xaA8JAJHNanEcTjw/cHX2vuHndYz04LFzJBlgy1b7JSY2uvtktCS5QzIqiFX1yTsksN6VkVJrIqK5kV97SMfrPd8jGxT1i9/FIdez76OIyMSfMPZzqUfamJIpxv4QRP4tgi3oou2MSKdX9FcsTzhKTBJjZmEQQbTimoQTHI7htzwjLIVveQTK1OacDOtmj3UuGczc7xQ2p4UcJPd7ahoYkyZxJbcmDT7tU3yf7VJDov300qkWQ48Ze1Di1xiUHizVDwXmjOQpRWUaWG9YnZFNWVgV+/ZIYT7X74tzo788I1/9O1t9+TjdhwH6BV6jfooRO/QCfqMRmiMmDN0lo5xwP3iKnftFi3qOtuel+hGuD//Am3y1zs=</latexit>

Noise: word drop, swap, etc.

LDAE(✓) = � log p(x|x+ n)<latexit sha1_base64="BBVELMu8ESha+ppzsI/mZ3d7HZo=">AAADd3icdZJLb9NAEMc3Do8SXmm5wYERIchRQxSXSnCp1EKQOFAUJNJWyibWerNJNvFL3nVly92PwJfjxvfgwo2NnUCSlpFWGs38Zuc/o3FClwvZbv8sGeVbt+/c3blXuf/g4aPH1d29MxHEEWU9GrhBdOEQwVzus57k0mUXYcSI57js3Jl/WOTPL1kkeOB/k2nIBh6Z+HzMKZE6ZO+WvtexR+SUEjfrKDgCnIGZNNOGzQErO+NHVrPVan5RlX/cqRqKgkyGwp7l3KzgTm2xRcqCTIfSnufkfEXKdbKjhhn2nCDJSJyolZC4ebklpKuLQjO9ShqVuu4OWHAPNpStfXryV6iZK20CdkiUpcqeNf4vWq8B0ziEjW90vihdAz8rE8spk6Shm7wG7AYT0NrgCrS6NWo1meqcfLyxJFmUwD74Dbtaa7faucF1x1o6NbS0rl39gUcBjT3mS+oSIfpWO5SDjESSU5epCo4FCwmdkwnra9cnHhODLL8bBXUdGcE4iPTzJeTR9YqMeEKknqPJxTRiO7cI3pTrx3L8bpBxP4wl82nRaBy7IANYHCGMeMSodFPtEBpxrRXolESESn2qFb0Ea3vk687ZQct60zr4elg7fr9cxw56hl4gE1noLTpGn1AX9RAt/TKeGjXjpfG7/Lz8qmwWqFFa1jxBG1a2/gBhxBC2</latexit>

enEnglish NepaliTEST

TRAIN //

Cross-EntropyLosspredictioninput sentence

mono

+

n<latexit sha1_base64="EpqUFrDe+QYGhdFpOA+JzvOejPY=">AAADt3icdZJdb9owFIZNso+OfZR2l7uxhpBAY4h0lbZeVGo3Ju1inZhW2moYIscYMDgfi50qUeqfuJvd7d/MSWAFSi1ZOjrneX1eHx0n4EzIdvtvyTAfPHz0eOdJ+emz5y92K3v7F8KPQkJ7xOd+eOVgQTnzaE8yyelVEFLsOpxeOvNPWf3ymoaC+d65TAI6cPHEY2NGsNQpe6/0u4ZcLKcE87Sj4DFEKazHzaRhM4iUnbJjq9lqNb+p8i13poaiIOOhsGc5Nyu4M1tskLIgk6G05zk5X5JyleyoYYpcx49THMVqaSRqXm8Y6WpRUE9u4ka5prtDJJgL15ytPHr632g9d9qEyMFhmih71rjftB4DIlEA157R9UK6An5VdSSnVOKGbvIWIu5PoPYGb2DmbgVbfk11Tj9v1cSZBr6B3j26H+dbZQtLt2LPrlTbrXZ+4N3AWgRVsDhdu/IHjXwSudSThGMh+lY7kIMUh5IRTlUZRYIGmMzxhPZ16GGXikGa752CNZ0ZwbEf6utJmGdXFSl2hUhcR5PZp8RmLUtuq/UjOf4wSJkXRJJ6pGg0jjiUPsyWGI5YSInkiQ4wCZn2CskUh5hIveplPQRr88t3g4uDlvWudfD9sHrycTGOHfAKvAZ1YIH34AR8AV3QA8Q4NH4axBiZR6Ztjs1pgRqlheYlWDvmr39lNyag</latexit>

enEncoder Decoder

NMT system

xs ⇠ Ms<latexit sha1_base64="+D2rKjCVJUv9t1Da29dPHSAUntk=">AAACx3icbZFNbxMxEIa9y0fL8hXgyMUiqpRK0Wq3INFLpQI5wKEoSKStlE0sr+O0TrwfsmejXZk98Be5ceG34OymIk0ZydKrmWc8rz1xLoWGIPjtuPfuP3i4t//Ie/zk6bPnnRcvz3VWKMZHLJOZuoyp5lKkfAQCJL/MFadJLPlFvPy0rl+suNIiS79DlfNJQq9SMReMgk2Rzp+DKKFwzag0gxqf4MjgXtmvDonAUU2MOAn7vt//Wnv/uLN6qluynGqyaLhFy50RvUNCS1ZTIMuGXN6QsE0O6qmJkjgrDS3K+sZI0V/tGBnaprxX/SgPPTscR1ok+JaxrTs/1ESTTjfwgybwXRFuRBdtYkg6v6JZxoqEp8Ak1XocBjlMDFUgmOS1FxWa55Qt6RUfW5nShOuJafZQ4wObmeF5puxJATfZ7Q5DE62rJLbk2qTera2T/6uNC5gfT4xI8wJ4ytpB80JiyPB6qXgmFGcgKysoU8J6xeyaKsrArt6znxDuPvmuOD/yw7f+0bd33dOPm+/YR6/RG9RDIXqPTtFnNEQjxJyBs3C0A+4XN3NXbtmirrPpeYVuhfvzL1Ju1zs=</latexit>

DATA

Vincent et al. “Stacked denoising auto-encoders:…” JMLR 2010 Liu et al. “Multilingual denoising pretraining for NMT” arXiv:2001.08210 2020

E.g.: The cat the on sat mat.The cat sat on the.

no noise noise only

good region

Page 28: Low Resource Machine Translation - Stanford University

Adding source-side monolingual data. An alternative approach to DAE.

Training Dataset Learning Framework: Self-Training (ST).D = {(x, y)i}i=1,..,N

<latexit sha1_base64="KpnY63s6K47gGjqoo/CKclue32o=">AAACuXicfZHfa9swEMdl70fb7Fe2Pe5FLBQyCMbuSjsYhbLtoS8bGSxtIc6MrMiJEsky0rnECP+PY2/7b6Y4DnRp2YHgy9190N330kJwA2H4x/MfPHz0eG//oPPk6bPnL7ovX10aVWrKRlQJpa9TYpjgORsBB8GuC82ITAW7Spef1/WrG6YNV/kPqAo2kWSW84xTAi6VdH/FksCcEmG/1PgMxxb3V4PqXcJxXCeWn0WDIBh8qzuHt/p+2limamWlyhU2mtYtuUoWDbXYUF8T8x8OZrDlqmTZcMstB/dzBdEEk3K15frl4GZn0mGddHthEDaB74qoFT3UxjDp/o6nipaS5UAFMWYchQVMLNHAqWB1Jy4NKwhdkhkbO5kTyczENs7X+NBlpjhT2r0ccJO9TVgijalk6jrX+5jd2jp5X21cQvZhYnlelMByuvkoKwUG55w7I55yzSiIyglCNXezYjp3/lBwx+44E6Ldle+Ky6Mgeh8cfT/unX9q7dhHb9Bb1EcROkXn6AIN0QhR78SLPeZl/kef+HN/sWn1vZZ5jf4J3/wFCeLTrg==</latexit> ALGORITHM • train model on • repeat

• decode to and create additional dataset

• retrain model on:

p(y|x)<latexit sha1_base64="0bhALflm3VcKbOBXhaYvsf98m/8=">AAACnHicbZFdSxtBFIZnt7XVrW1jeyWFMhiECGHZtQV7IxXNhaWkRGiikA3D7GSiY2Y/mDkru2z3V/lPvPPfONmN1MYeGHh5z3OY8xGmUmjwvHvLfvFy7dXr9Q3nzebbd+9bWx9GOskU40OWyERdhFRzKWI+BAGSX6SK0yiU/Dycnyzy5zdcaZHEv6FI+SSil7GYCUbBWKR1uxtEFK4YlWWvwoc4KHEn7xZ7ROCgIqU49Luu2/1VOX+5fkV0Q+bkuqauG6pP9AoHDVeQec3NHzl4yvVMJojCJC9pllePTWTdm5UmBpWTdoo/+R5ptT3XqwM/F/5StNEyBqR1F0wTlkU8Biap1mPfS2FSUgWCSV45QaZ5StmcXvKxkTGNuJ6U9XIrvGucKZ4lyrwYcO0+rShppHURhYZcTKRXcwvzf7lxBrNvk1LEaQY8Zs1Hs0xiSPDiUngqFGcgCyMoU8L0itkVVZSBuadjluCvjvxcjPZd/4u7f/a1fXS8XMc6+oR2UAf56AAdoVM0QEPErG3ru3Vq/bA/2z37p91vUNta1nxE/4Q9egC4K8XU</latexit>

D<latexit sha1_base64="bApTgAzl7lzR9QdmUSctCLbuIdA=">AAACqXicbZFdS+NAFIYn0d3V7IdVL70ZLC6VLSFxBb0RRL3wQqXCtpZtSphMpzp28sHMiSTE/Dd/g3f+G6dJxW7dAwMv531e5syZIBFcgeO8GObS8qfPX1ZWra/fvv9Ya6xv9FScSsq6NBax7AdEMcEj1gUOgvUTyUgYCHYTTE6n/s0Dk4rH0R/IEzYMyW3Ex5wS0C2/8bTjhQTuKBHFWYmPsFfgVtbOd32OvdIv+JHbtu32VWm9c5elr2oy8+8r6r6mLn21wEHN5f6k4iZvHMxzZ9rxwiDOCpJm5dsQafthYYiODiWt/DHbteazjaZjO1Xhj8KdiSaaVcdvPHujmKYhi4AKotTAdRIYFkQCp4KVlpcqlhA6IbdsoGVEQqaGRbXpEu/ozgiPY6lPBLjqzicKEiqVh4EmpyOqRW/a/J83SGF8OCx4lKTAIlpfNE4FhhhPvw2PuGQURK4FoZLrWTG9I5JQ0J9r6SW4i0/+KHp7tvvb3rvebx6fzNaxgrbQNmohFx2gY3SOOqiLqPHTuDC6Rs/8ZV6bffNvjZrGLLOJ/imTvgLIHMr3</latexit>

xs ⇠ Ms<latexit sha1_base64="+D2rKjCVJUv9t1Da29dPHSAUntk=">AAACx3icbZFNbxMxEIa9y0fL8hXgyMUiqpRK0Wq3INFLpQI5wKEoSKStlE0sr+O0TrwfsmejXZk98Be5ceG34OymIk0ZydKrmWc8rz1xLoWGIPjtuPfuP3i4t//Ie/zk6bPnnRcvz3VWKMZHLJOZuoyp5lKkfAQCJL/MFadJLPlFvPy0rl+suNIiS79DlfNJQq9SMReMgk2Rzp+DKKFwzag0gxqf4MjgXtmvDonAUU2MOAn7vt//Wnv/uLN6qluynGqyaLhFy50RvUNCS1ZTIMuGXN6QsE0O6qmJkjgrDS3K+sZI0V/tGBnaprxX/SgPPTscR1ok+JaxrTs/1ESTTjfwgybwXRFuRBdtYkg6v6JZxoqEp8Ak1XocBjlMDFUgmOS1FxWa55Qt6RUfW5nShOuJafZQ4wObmeF5puxJATfZ7Q5DE62rJLbk2qTera2T/6uNC5gfT4xI8wJ4ytpB80JiyPB6qXgmFGcgKysoU8J6xeyaKsrArt6znxDuPvmuOD/yw7f+0bd33dOPm+/YR6/RG9RDIXqPTtFnNEQjxJyBs3C0A+4XN3NXbtmirrPpeYVuhfvzL1Ju1zs=</latexit>

Ms = {xsj}j=1,..,Ms

<latexit sha1_base64="Wuso7jnamC5w7EBqWeo2cvUsqLM=">AAACx3icbZFNj9MwEIad8LWErwJHLhZVpa5URcmCBJeVFugBDouKRHdXalrLcd1dt04c2ZMqUciBv8iNC78FN+mibpeRLL2aecZ+PRNnUhgIgt+Oe+fuvfsPDh56jx4/efqs8/zFmVG5ZnzMlFT6IqaGS5HyMQiQ/CLTnCax5Ofx6tOmfr7m2giVfocy49OEXqZiIRgFmyKdP70ooXDFqKyGNT7GUYX7xaA8JAJHNanEcTjw/cHX2vuHndYz04LFzJBlgy1b7JSY2uvtktCS5QzIqiFX1yTsksN6VkVJrIqK5kV97SMfrPd8jGxT1i9/FIdez76OIyMSfMPZzqUfamJIpxv4QRP4tgi3oou2MSKdX9FcsTzhKTBJjZmEQQbTimoQTHI7htzwjLIVveQTK1OacDOtmj3UuGczc7xQ2p4UcJPd7ahoYkyZxJbcmDT7tU3yf7VJDov300qkWQ48Ze1Di1xiUHizVDwXmjOQpRWUaWG9YnZFNWVgV+/ZIYT7X74tzo788I1/9O1t9+TjdhwH6BV6jfooRO/QCfqMRmiMmDN0lo5xwP3iKnftFi3qOtuel+hGuD//Am3y1zs=</latexit>

As = {(xsj , yj)}j=1,..,Ms

<latexit sha1_base64="WViJMaNEUbzC6NMXxyJtKcfS2lg=">AAAC73icbZJNb9MwGMedwNgIbx0cuVhUlVopipINaVwmDdiBy1CR6DapaS3HdTe3zgu2MyUy+RJcOIAQV74ON74NbtKNLeORLP39PL9Hz98vUcaZVL7/x7Lv3N24t7l133nw8NHjJ53tp8cyzQWhI5LyVJxGWFLOEjpSTHF6mgmK44jTk2j5dlU/uaBCsjT5qMqMTmJ8lrA5I1iZFNq2NnphjNU5wVwfVnAfhhr2C7ccIAbDCmm2H7ie576vnH/cUTWVDVlMJVrU3KLhjpBskaohy6lCy5pcXpLqOnlYTXUYR2mhcV5Ul0Zy96JlZGiasn75uRg4PTMdhpLF8IYz52r3+spnvzbqwjDCQpcVWgxanlGn63t+HfC2CNaiC9YxRJ3f4SwleUwTRTiWchz4mZpoLBQjnFZOmEuaYbLEZ3RsZIJjKie6fq8K9kxmBuepMCtRsM5e79A4lrKMI0OuziLbtVXyf7VxruavJpolWa5oQppB85xDlcLV48MZE5QoXhqBiWDGKyTnWGCizBdxzCUE7SPfFsc7XrDr7Xx42T14s76OLfAcvAB9EIA9cADegSEYAWJx64v1zfpuf7K/2j/snw1qW+ueZ+BG2L/+Ao275JQ=</latexit>

D [As<latexit sha1_base64="LzJmmXdcvEVdwpwvmNy9sV6dCi8=">AAADD3icbZLPb9MwFMedjB+j/OrgyMWiqtRKUZQMJLhMGmMHLkNFotukpo0c193c2klkO1Mik/+Ay/6VXTiAEFeu3PhvcJMO2ownRfrqvc/z+/rFUcqoVJ7327K3bt2+c3f7Xuv+g4ePHrd3nhzLJBOYDHHCEnEaIUkYjclQUcXIaSoI4hEjJ9Hi7bJ+ckGEpEn8URUpGXN0FtMZxUiZVLhjdbsBR+ocI6YPS7gHAw17uVP0QwqDMtR0z3dc13lftv5xR+VE1mQ+keG84uY1dxTKBqlqspiocFGRi2tSrZOH5UQHPEpyjbK8vDaSORcNIwPTlPaKT3m/1TXTYSAphxvO1g5989dor3LqwCBCQhdlOO83Ta9vIcBZCjdOCdsdz/WqgDeFvxIdsIpB2P4VTBOccRIrzJCUI99L1VgjoShmxIzLJEkRXqAzMjIyRpzIsa7+Zwm7JjOFs0SYL1awyq53aMSlLHhkyKVJ2awtk/+rjTI1ez3WNE4zRWJcD5plDKoELh8HnFJBsGKFEQgLarxCfI4Ewso8oZZZgt+88k1xvOv6L9zdDy87+werdWyDZ+A56AEfvAL74B0YgCHA1mfryvpqfbMv7S/2d/tHjdrWqucp2Aj75x/BqfGZ</latexit>

y<latexit sha1_base64="tLy1OgwMuQcWRRWcjddx94KCJ6o=">AAADGHicdZJNb9MwGMed8DbCWwdHLhZVpVaKomRDgsukATtwGSoS3SY1beS47ubWeZHtTImMPwYXvgoXDiDEdTe+DW7SQtvBI1n663l+j/33Y8c5o0L6/i/LvnHz1u07O3ede/cfPHzU2n18IrKCYzLAGcv4WYwEYTQlA0klI2c5JyiJGTmN528W9dNLwgXN0g+yyskoQecpnVKMpElFu5bXCRMkLzBi6kjDAxgq2C3dqhdRGOpI0YPA9Tz3nXb+csd6LBqyHItoVnOzhjuOxBYpG7Iay2hek/MVKdfJIz1WYRJnpUJFqVdGCvdyy0jfNOXd6mPZczrmdBgKmsANZ2ubvvpjtFs7dWEYI64qHc16/zdtxhDiIocb2zirzlbb9/w64HURLEUbLKMfta7CSYaLhKQSMyTEMPBzOVKIS4oZ0U5YCJIjPEfnZGhkihIiRqp+WA07JjOB04yblUpYZ9c7FEqEqJLYkAuzYru2SP6rNizk9OVI0TQvJElxc9C0YFBmcPFL4IRygiWrjECYU+MV4gvEEZbmLzlmCMH2la+Lkz0v2Pf23j9vH75ejmMHPAXPQBcE4AU4BG9BHwwAtj5ZX6xv1nf7s/3V/mH/bFDbWvY8ARthX/0Gco31JA==</latexit>

Key elements: decoding and training noise.

LST (✓) = � log p(y|x+ n)<latexit sha1_base64="F3K+2TxkBw65jeBJbH/60t8pRlA=">AAADtHicdZJdb9MwFIbdhI8Rvjq45MaiqtSKUjUDATeTNigSFwwVsa6T6jY4rtu6dT4UO1OizH+QS+74NzhJO9quWLJ0dM7z+rw+Om7ImZCdzp+KYd65e+/+wQPr4aPHT55WD59diCCOCO2TgAfRpYsF5cynfckkp5dhRLHncjpwl5/y+uCKRoIF/rlMQzry8MxnU0aw1CnnsPKrjjws5wTzrKvgMUQZbCSttOkwiJSTsWO71W63vinrH3emxqIkk7FwFgW3KLkzR+yQsiTTsXSWBblck3KT7Kpxhjw3SDIcJ2ptJG5d7RjpaVHYSK+TplXX3SESzINbzjYePb0x2iictiBycZSlylk0/29ajwGROIRbz+h6Kd0Av6oGknMqcVM3eQ0RD2ZQe4PXMHe3ga2/prqnn/dqklwDX0G/ae2T/Tjfq1o5utE61Vqn3SkOvB3Yq6AGVqfnVH+jSUBij/qScCzE0O6EcpThSDLCqbJQLGiIyRLP6FCHPvaoGGXF0ilY15kJnAaRvr6ERXZTkWFPiNRzNZl/SezW8uS+2jCW0w+jjPlhLKlPykbTmEMZwHyD4YRFlEie6gCTiGmvkMxxhInUe27pIdi7X74dXBy17Tfto+9vaycfV+M4AC/AS9AANngPTsAX0AN9QAzbGBg/DWy+M5FJTFqiRmWleQ62jun/BT2aJeU=</latexit>

enEnglish NepaliTEST

TRAIN //

Cross-EntropyLosspredictioninput sentence

mono

+

n<latexit sha1_base64="EpqUFrDe+QYGhdFpOA+JzvOejPY=">AAADt3icdZJdb9owFIZNso+OfZR2l7uxhpBAY4h0lbZeVGo3Ju1inZhW2moYIscYMDgfi50qUeqfuJvd7d/MSWAFSi1ZOjrneX1eHx0n4EzIdvtvyTAfPHz0eOdJ+emz5y92K3v7F8KPQkJ7xOd+eOVgQTnzaE8yyelVEFLsOpxeOvNPWf3ymoaC+d65TAI6cPHEY2NGsNQpe6/0u4ZcLKcE87Sj4DFEKazHzaRhM4iUnbJjq9lqNb+p8i13poaiIOOhsGc5Nyu4M1tskLIgk6G05zk5X5JyleyoYYpcx49THMVqaSRqXm8Y6WpRUE9u4ka5prtDJJgL15ytPHr632g9d9qEyMFhmih71rjftB4DIlEA157R9UK6An5VdSSnVOKGbvIWIu5PoPYGb2DmbgVbfk11Tj9v1cSZBr6B3j26H+dbZQtLt2LPrlTbrXZ+4N3AWgRVsDhdu/IHjXwSudSThGMh+lY7kIMUh5IRTlUZRYIGmMzxhPZ16GGXikGa752CNZ0ZwbEf6utJmGdXFSl2hUhcR5PZp8RmLUtuq/UjOf4wSJkXRJJ6pGg0jjiUPsyWGI5YSInkiQ4wCZn2CskUh5hIveplPQRr88t3g4uDlvWudfD9sHrycTGOHfAKvAZ1YIH34AR8AV3QA8Q4NH4axBiZR6Ztjs1pgRqlheYlWDvmr39lNyag</latexit>

neEncoder Decoder

NMT system

Encoder Decodery

<latexit sha1_base64="dqcAHIDoQn8vOgakMoO2ou+kvGw=">AAADvXicdZJdb9owFIZNso+OfdHtcjfWEBJoDJFuUqtJ1dqNSbtYJ6aVthKG1DEGDM6HYqdLlPpPblf7N3MSWIFSS5aOznle+z1Hxwk4E7Ld/lsyzHv3HzzceVR+/OTps+eV3Rdnwo9CQnvE53544WBBOfNoTzLJ6UUQUuw6nJ47889Z/fyKhoL53qlMAjpw8cRjY0aw1Cl7t/SnhlwspwTztKPgIUQprMfNpGEziJSdskOr2Wo1v6vyDXeihqIg46GwZzk3K7gTW2yQsiCTobTnOTlfknKV7KhhilzHj1McxWppJGpebRjpalFQT67jRrmmf4dIMBeuOVt59Pi/0XrutAmRg8M0UfascbdpPQZEogCuPaPrhXQF/KbqSE6pxA39yVuIuD+B2hu8hpm7FWzZmuocf9mqiTMNfAO9O3Q/T7fKFpZuxMv2KtV2q50feDuwFkEVLE7XrvxGI59ELvUk4ViIvtUO5CDFoWSEU1VGkaABJnM8oX0detilYpDm26dgTWdGcOyH+noS5tlVRYpdIRLX0WTWmtisZclttX4kxweDlHlBJKlHio/GEYfSh9kqwxELKZE80QEmIdNeIZniEBOpF76sh2Bttnw7ONtrWe9aez/eV48+LcaxA16B16AOLLAPjsBX0AU9QIwPxqXBjJn50aQmN70CNUoLzUuwdsxf/wDiJClw</latexit>

decoded output

xs ⇠ Ms<latexit sha1_base64="+D2rKjCVJUv9t1Da29dPHSAUntk=">AAACx3icbZFNbxMxEIa9y0fL8hXgyMUiqpRK0Wq3INFLpQI5wKEoSKStlE0sr+O0TrwfsmejXZk98Be5ceG34OymIk0ZydKrmWc8rz1xLoWGIPjtuPfuP3i4t//Ie/zk6bPnnRcvz3VWKMZHLJOZuoyp5lKkfAQCJL/MFadJLPlFvPy0rl+suNIiS79DlfNJQq9SMReMgk2Rzp+DKKFwzag0gxqf4MjgXtmvDonAUU2MOAn7vt//Wnv/uLN6qluynGqyaLhFy50RvUNCS1ZTIMuGXN6QsE0O6qmJkjgrDS3K+sZI0V/tGBnaprxX/SgPPTscR1ok+JaxrTs/1ESTTjfwgybwXRFuRBdtYkg6v6JZxoqEp8Ak1XocBjlMDFUgmOS1FxWa55Qt6RUfW5nShOuJafZQ4wObmeF5puxJATfZ7Q5DE62rJLbk2qTera2T/6uNC5gfT4xI8wJ4ytpB80JiyPB6qXgmFGcgKysoU8J6xeyaKsrArt6znxDuPvmuOD/yw7f+0bd33dOPm+/YR6/RG9RDIXqPTtFnNEQjxJyBs3C0A+4XN3NXbtmirrPpeYVuhfvzL1Ju1zs=</latexit>

DATA

He et al. “Revisiting self-training for neural sequence generation” ICLR 2020

ML Perspective: Semi-Supervised Learning

L(✓) = Lsup(✓) + �LST(✓)<latexit sha1_base64="LB2Wg9oUku21WD7fdgfkmjlUEFI=">AAAGP3ichVRNb9NAEHVLAiV8tXDkMqKqlAgTxQEJJFSpLanEgaIimrZSnFjrzabZxl/yritH7v4zLvwFbly5cAAhrtxYfzQ4jhtWijTZeW/nzdvxmp5FGW+1vq6s3qhUb95au127c/fe/QfrGw+PmRv4mHSxa7n+qYkYsahDupxyi5x6PkG2aZETc/Imzp9cEJ9R1zniU4/0bXTm0BHFiMstY6PS3dJtxMcYWVFHwDboEdRDddowKOjCiOi2pjab6ntR+4c7EAOWIsMBM84T3HmKOzBYAclT5HTAjUmCnFwheR7ZEYNIt003jFAQiishgXpREHIoSV59ehk2aluyOuiM2jCnLHfo7kxoPVGqgm4iP5oK47xxvWhpg44DD+aOkfmUmgO+E3WdjwlHDVnkGeiWewZSG1xCrC4Hu2pNdHb3SzlhzIGn4FzD+3hUSssklZF3Z86nqFAYEzW9hcayW1hsnZdL2iuXFHefFWz87+BymxdrcZcjSywxO/kvZ36IDG2WkefPpAx4HtMu+CeH5DIe5caSu2WBbURM5SLVa5rRvnQx+VIKM9gRGVBAD2BO6GuQOqA/54uRNUkc1SGiaNNcfkyX5MdUJc6SvENSfq28xRLbWeDlTAc5YZmBZWA5oTlszVjfbDVbyYLFQMuCTSVbh8b6F33o4sAmDscWYqyntTzej5DPKbaIqOkBIx7CE3RGejJ0kE1YP0rePwFbcmcII9eXP4dDsptnRMhmbGqbEhkrZ8VcvFmW6wV89KofUccLOHFwWmgUWMBdiB9TGFKfYG5NZYCwT6VWwGPkI8zlkxuboBVbXgyO203tebP94cXmzl5mx5ryWHmi1BVNeansKG+VQ6Wr4MqnyrfKj8rP6ufq9+qv6u8UurqScR4pc6v65y9sChKf</latexit>

Page 29: Low Resource Machine Translation - Stanford University

Adding target-side monolingual data. Two benefits: a) Decoder learns a good language model. b) Better generalization via data augmentation. c) Unlike ST, target is correct but input is not.

Training Dataset Learning Framework: Back-Translation (BT).D = {(x, y)i}i=1,..,N

<latexit sha1_base64="KpnY63s6K47gGjqoo/CKclue32o=">AAACuXicfZHfa9swEMdl70fb7Fe2Pe5FLBQyCMbuSjsYhbLtoS8bGSxtIc6MrMiJEsky0rnECP+PY2/7b6Y4DnRp2YHgy9190N330kJwA2H4x/MfPHz0eG//oPPk6bPnL7ovX10aVWrKRlQJpa9TYpjgORsBB8GuC82ITAW7Spef1/WrG6YNV/kPqAo2kWSW84xTAi6VdH/FksCcEmG/1PgMxxb3V4PqXcJxXCeWn0WDIBh8qzuHt/p+2limamWlyhU2mtYtuUoWDbXYUF8T8x8OZrDlqmTZcMstB/dzBdEEk3K15frl4GZn0mGddHthEDaB74qoFT3UxjDp/o6nipaS5UAFMWYchQVMLNHAqWB1Jy4NKwhdkhkbO5kTyczENs7X+NBlpjhT2r0ccJO9TVgijalk6jrX+5jd2jp5X21cQvZhYnlelMByuvkoKwUG55w7I55yzSiIyglCNXezYjp3/lBwx+44E6Ldle+Ky6Mgeh8cfT/unX9q7dhHb9Bb1EcROkXn6AIN0QhR78SLPeZl/kef+HN/sWn1vZZ5jf4J3/wFCeLTrg==</latexit>

ALGORITHM • train model and on • decode to with , create

additional dataset • retrain model on:

D<latexit sha1_base64="bApTgAzl7lzR9QdmUSctCLbuIdA=">AAACqXicbZFdS+NAFIYn0d3V7IdVL70ZLC6VLSFxBb0RRL3wQqXCtpZtSphMpzp28sHMiSTE/Dd/g3f+G6dJxW7dAwMv531e5syZIBFcgeO8GObS8qfPX1ZWra/fvv9Ya6xv9FScSsq6NBax7AdEMcEj1gUOgvUTyUgYCHYTTE6n/s0Dk4rH0R/IEzYMyW3Ex5wS0C2/8bTjhQTuKBHFWYmPsFfgVtbOd32OvdIv+JHbtu32VWm9c5elr2oy8+8r6r6mLn21wEHN5f6k4iZvHMxzZ9rxwiDOCpJm5dsQafthYYiODiWt/DHbteazjaZjO1Xhj8KdiSaaVcdvPHujmKYhi4AKotTAdRIYFkQCp4KVlpcqlhA6IbdsoGVEQqaGRbXpEu/ozgiPY6lPBLjqzicKEiqVh4EmpyOqRW/a/J83SGF8OCx4lKTAIlpfNE4FhhhPvw2PuGQURK4FoZLrWTG9I5JQ0J9r6SW4i0/+KHp7tvvb3rvebx6fzNaxgrbQNmohFx2gY3SOOqiLqPHTuDC6Rs/8ZV6bffNvjZrGLLOJ/imTvgLIHMr3</latexit>

ML Perspective: Semi-Supervised LearningEnglish Nepali

TEST

TRAIN //

mono

Mt = {ytk}k=1,..,Mt<latexit sha1_base64="WPPS0Z7qvI346bxZ6Qp8zIQogqY=">AAADd3icbZJLb9NAEMc3Do8SXinc4MCIEOQIE8UFCS6VWggSB4qCRNpKcWKtN5tkE7/kXVe23P0IfDlufA8u3FjbCThpV7I0mvnNzH/G44Qu46LX+1XT6jdu3rq9d6dx9979Bw+b+49OeRBHhA5J4AbRuYM5dZlPh4IJl56HEcWe49IzZ/Uxj59d0IizwP8u0pCOPTz32YwRLJTL3q/9aFseFguC3awv4RCsDPTESDs2A0vaGTs0jW7X+Cob/7kTOeElmUy4vSy4Zcmd2Fw2qqAowXQi7FUBrjagqJbsy0lmeU6QZDhO5EZHbFzs6BiopFBPL5NOo62ag8WZB1vCKkWP/+nUC6EGWA6OslTay86u5q0tWCQOYauMipepFfCL1C2xoAJ3VJPXYLnBHJQ2uIRcXQXbjCb7x5+uzUnyHHgFfsdutnrdXvHgqmGujRZav4Hd/GlNAxJ71BfExZyPzF4oxhmOBCMuVf8i5jTEZIXndKRMH3uUj7PibiS0lWcKsyBSny+g8FYzMuxxnnqOIvNp+G4sd14XG8Vi9n6cMT+MBfVJ2WgWuyACyI8QpiyiRLipMjCJmNIKZIEjTIQ61YZagrk78lXj9KBrvukefHvbOvqwXsceeoqeIx2Z6B06Qp/RAA0Rqf3Wnmgt7YX2p/6s/rKul6hWW+c8Rluvbv4FlMEQtg==</latexit>

p(x|y)<latexit sha1_base64="CyQwSgLN0ehm4cfS61mfQujnag8=">AAADvnicdZJda9swFIYVex9d9pVul7sRC4GEZcHuBhuMQrplsIt1ZKxpC1FiZEVJlMgfWHKxcfUnx272bybbyZqkqcBwOOd5j14dHzfkTEjL+lsxzHv3Hzw8eFR9/OTps+e1wxfnIogjQgck4EF06WJBOfPpQDLJ6WUYUey5nF64yy95/eKKRoIF/plMQzry8MxnU0aw1CnnsPKngTws5wTzrKfgMUQZbCbttOUwiJSTsWO73em0f6jqDXeqxqIkk7FwFgW3KLlTR+yQsiTTsXSWBblck3KT7Klxhjw3SDIcJ2ptJG5f7Rjpa1HYTK+TVrWhb4dIMA9uOdtoevLfaLNw2obIxVGWKmfRutu0HgMicQi32uh6Kd0Av6smknMqcUtf8hYiHsyg9gavYe5uA1s/TfVOvu7VJLkGvoH+HbpfZ3tlK0s34rKT/nm1utWxigNvB/YqqIPV6Tu132gSkNijviQcCzG0rVCOMhxJRjhVVRQLGmKyxDM61KGPPSpGWbF+CjZ0ZgKnQaQ/X8Iiu6nIsCdE6rmazN8mdmt5cl9tGMvpx1HG/DCW1CflRdOYQxnAfJfhhEWUSJ7qAJOIaa+QzHGEidQbX9VDsHeffDs4P+rY7zpHP9/Xu59X4zgAr8Br0AQ2+AC64BvogwEgxicDGwtjaXbNqemZQYkalZXmJdg6ZvIPxHwo5g==</latexit>

yt ⇠ Mt<latexit sha1_base64="oC/1YnfKTpkRP2NJh0qgsiTvklU=">AAADzHicdZJba9swFMcVe5cuu6Xb417EQiBhWYi7QftSaLcMNlhLxpq2ECVGVpREiW9YcrFR9boPuLe97pNMtpM1tx4QHM75/XX+OsgJXcZFu/2nZJgPHj56vPek/PTZ8xcvK/uvLnkQR4T2SOAG0bWDOXWZT3uCCZdehxHFnuPSK2f+Oetf3dCIs8C/EGlIBx6e+GzMCBa6ZO+X/taQh8WUYFd2FDyGSMJ60kwbNoNI2ZIdW81Wq3muynfcmRrygkyG3J7l3Kzgzmy+QYqCTIfCnufkfEmKVbKjhhJ5TpBIHCdqaSRu3mwY6WpRWE9vk0a5pqdDxJkH15ytXHr632g9d9qEyMGRTJU9a9xvWq8BkTiEa9fofiFdAb+rOhJTKnBDD3kPkRtMoPYGb2HmbgVbPk11Tr/s1CSZBr6D/j26nxc7ZQtLd2K95e2NCLtSbbfaecDtxFokVbCIrl35jUYBiT3qC+JizvtWOxQDiSPBiEtVGcWchpjM8YT2depjj/KBzD+jgjVdGcFxEOnjC5hXVxUSe5ynnqPJzCTf7GXFXb1+LMZHA8n8MBbUJ8WgcexCEcDsZ8MRiygRbqoTTCKmvUIyxREmQv//sl6Ctfnk7eTyoGV9aB38+Fg9+bRYxx54A96COrDAITgBX0EX9AAxvhmBkRipeW4KU5qqQI3SQvMarIX56x/eni+B</latexit>

x<latexit sha1_base64="jDLmMVPu3t12MXHk9cCHPRMrYik=">AAADvXicdZJdb9owFIZNso+OfdHtcjfWEBJoDJFuUqtJ1dqNSbtYJ6aVthKG1DEGDM6HYqdLlPpPblf7N3MSWIFSS5aOznlen9dHxwk4E7Ld/lsyzHv3HzzceVR+/OTps+eV3Rdnwo9CQnvE53544WBBOfNoTzLJ6UUQUuw6nJ47889Z/fyKhoL53qlMAjpw8cRjY0aw1Cl7t/SnhlwspwTztKPgIUQprMfNpGEziJSdskOr2Wo1v6vyDXeihqIg46GwZzk3K7gTW2yQsiCTobTnOTlfknKV7KhhilzHj1McxWppJGpebRjpalFQT67jRrmmu0MkmAvXnK08evzfaD132oTIwWGaKHvWuNu0HgMiUQDXntH1QroCflN1JKdU4oZu8hYi7k+g9gavYeZuBVt+TXWOv2zVxJkGvoHeHbqfp1tlC0s34jwRK7tSbbfa+YG3A2sRVMHidO3KbzTySeRSTxKOhehb7UAOUhxKRjhVZRQJGmAyxxPa16GHXSoGab59CtZ0ZgTHfqivJ2GeXVWk2BUicR1NZl8Tm7Usua3Wj+T4YJAyL4gk9UjRaBxxKH2YrTIcsZASyRMdYBIy7RWSKQ4xkXrhy3oI1uaXbwdney3rXWvvx/vq0afFOHbAK/Aa1IEF9sER+Aq6oAeI8cG4NJgxMz+a1OSmV6BGaaF5CdaO+esf4J8pbw==</latexit>

D [At<latexit sha1_base64="9r4+5t8z/Eutjwqsqwo6D2I5ofs=">AAAECnicfZNNb9NAEIa3Nh8lfKVw5DIiipSIEMUFCS6VGggSB4qCaNpKdWKtN5tkE3/Ju65suXvmwl/hwgGEuPILuPFvWNtJSdKUlSyNZp53552xbQcO46LV+rOl6deu37i5fat0+87de/fLOw+OuB+FhPaI7/jhiY05dZhHe4IJh54EIcWu7dBje/Y6qx+f0ZAz3zsUSUD7Lh57bMQIFipl7WhQNV0sJgQ7aUfCHpgp1OJGUrcYmNJK2Z7RaDYb72XpH3cgB7wg4wG3pjk3LbgDi6+RoiCTgbBmOTlbkGKZ7MhBarq2H6c4iuXCSNQ4WzPSVaKglpzH9VJVdQeTMxdWnC1d2r4wWsudNsC0cZgm0prWrzat1mCSKICVa1S9kC6B72TNFBMqcF01eQqm449BeYNzyNwtYYvRZKf9ZqMmzjTwBLwrdB8PN8rmljaJ2xebL6hYZuMn2RZWR1dv4b+TC6tcaTVb+YHLgTEPKmh+ulb5tzn0SeRSTxAHc35qtALRT3EoGHGoahdxGmAyw2N6qkIPu5T30/xTllBVmSGM/FA9noA8u6xIsct54tqKzEzy9VqW3FQ7jcToZT9lXhAJ6pGi0ShyQPiQ/RcwZCElwklUgEnIlFcgExxiItTfU1JLMNZHvhwc7TaNZ83dD88r+6/m69hGj9BjVEMGeoH20VvURT1EtE/aF+2b9l3/rH/Vf+g/C1TbmmseopWj//oLD/dGQg==</latexit>

p(y|x)<latexit sha1_base64="dBHzK3A8/hcNmG8nE+hfv1ikGd4=">AAAECnichZNNb9NAEIa3Nh8lfKVw5DIiipSIEMUFCS6VGggSB4qCaNpKdWKtN5tkE3/Ju65suXvmwl/hwgGEuPILuPFvWNtJSdJUrGRpNPO8O++MvHbgMC5arT9bmn7t+o2b27dKt+/cvXe/vPPgiPtRSGiP+I4fntiYU4d5tCeYcOhJEFLs2g49tmevs/rxGQ05871DkQS07+Kxx0aMYKFS1o4GVdPFYkKwk3Yk7IGZQi1uJHWLgSmtlO0ZjWaz8V6W/nEHcsALMh5wa5pz04I7sPgaKQoyGQhrlpOzBSmWyY4cpKZr+3GKo1gujESNszUjXVkKasl5XC9VVXMwOXNhxdjSne0Ln7XcaANMG4dpIq1p/WrPagsmiQJYuUbVC+kS+E7WTDGhAtdVk6dgOv4YlDc4h8zdEraYTHbabzZq4kwDT8C7QvfxcKNsbmmTuH2x+IKKZTZ+km1hdXTxv9GFVa60mq38wOXAmAcVND9dq/zbHPokcqkniIM5PzVageinOBSMOFSWzIjTAJMZHtNTFXrYpbyf5r+yhKrKDGHkh+rzBOTZZUWKXc4T11ZkZpKv17LkptppJEYv+ynzgkhQjxSNRpEDwofsXcCQhZQIJ1EBJiFTXoFMcIiJUK+npJZgrI98OTjabRrPmrsfnlf2X83XsY0eoceohgz0Au2jt6iLeohon7Qv2jftu/5Z/6r/0H8WqLY11zxEK0f/9RdRQUZC</latexit>

p(y|x)<latexit sha1_base64="dBHzK3A8/hcNmG8nE+hfv1ikGd4=">AAAECnichZNNb9NAEIa3Nh8lfKVw5DIiipSIEMUFCS6VGggSB4qCaNpKdWKtN5tkE3/Ju65suXvmwl/hwgGEuPILuPFvWNtJSdJUrGRpNPO8O++MvHbgMC5arT9bmn7t+o2b27dKt+/cvXe/vPPgiPtRSGiP+I4fntiYU4d5tCeYcOhJEFLs2g49tmevs/rxGQ05871DkQS07+Kxx0aMYKFS1o4GVdPFYkKwk3Yk7IGZQi1uJHWLgSmtlO0ZjWaz8V6W/nEHcsALMh5wa5pz04I7sPgaKQoyGQhrlpOzBSmWyY4cpKZr+3GKo1gujESNszUjXVkKasl5XC9VVXMwOXNhxdjSne0Ln7XcaANMG4dpIq1p/WrPagsmiQJYuUbVC+kS+E7WTDGhAtdVk6dgOv4YlDc4h8zdEraYTHbabzZq4kwDT8C7QvfxcKNsbmmTuH2x+IKKZTZ+km1hdXTxv9GFVa60mq38wOXAmAcVND9dq/zbHPokcqkniIM5PzVageinOBSMOFSWzIjTAJMZHtNTFXrYpbyf5r+yhKrKDGHkh+rzBOTZZUWKXc4T11ZkZpKv17LkptppJEYv+ynzgkhQjxSNRpEDwofsXcCQhZQIJ1EBJiFTXoFMcIiJUK+npJZgrI98OTjabRrPmrsfnlf2X83XsY0eoceohgz0Au2jt6iLeohon7Qv2jftu/5Z/6r/0H8WqLY11zxEK0f/9RdRQUZC</latexit>

p(x|y)<latexit sha1_base64="CyQwSgLN0ehm4cfS61mfQujnag8=">AAADvnicdZJda9swFIYVex9d9pVul7sRC4GEZcHuBhuMQrplsIt1ZKxpC1FiZEVJlMgfWHKxcfUnx272bybbyZqkqcBwOOd5j14dHzfkTEjL+lsxzHv3Hzw8eFR9/OTps+e1wxfnIogjQgck4EF06WJBOfPpQDLJ6WUYUey5nF64yy95/eKKRoIF/plMQzry8MxnU0aw1CnnsPKngTws5wTzrKfgMUQZbCbttOUwiJSTsWO73em0f6jqDXeqxqIkk7FwFgW3KLlTR+yQsiTTsXSWBblck3KT7Klxhjw3SDIcJ2ptJG5f7Rjpa1HYTK+TVrWhb4dIMA9uOdtoevLfaLNw2obIxVGWKmfRutu0HgMicQi32uh6Kd0Av6smknMqcUtf8hYiHsyg9gavYe5uA1s/TfVOvu7VJLkGvoH+HbpfZ3tlK0s34rKT/nm1utWxigNvB/YqqIPV6Tu132gSkNijviQcCzG0rVCOMhxJRjhVVRQLGmKyxDM61KGPPSpGWbF+CjZ0ZgKnQaQ/X8Iiu6nIsCdE6rmazN8mdmt5cl9tGMvpx1HG/DCW1CflRdOYQxnAfJfhhEWUSJ7qAJOIaa+QzHGEidQbX9VDsHeffDs4P+rY7zpHP9/Xu59X4zgAr8Br0AQ2+AC64BvogwEgxicDGwtjaXbNqemZQYkalZXmJdg6ZvIPxHwo5g==</latexit>

Encoder

en neCross-Entropy

Losspredictioninput sentenceDecoder

NMT system

x<latexit sha1_base64="jDLmMVPu3t12MXHk9cCHPRMrYik=">AAADvXicdZJdb9owFIZNso+OfdHtcjfWEBJoDJFuUqtJ1dqNSbtYJ6aVthKG1DEGDM6HYqdLlPpPblf7N3MSWIFSS5aOznlen9dHxwk4E7Ld/lsyzHv3HzzceVR+/OTps+eV3Rdnwo9CQnvE53544WBBOfNoTzLJ6UUQUuw6nJ47889Z/fyKhoL53qlMAjpw8cRjY0aw1Cl7t/SnhlwspwTztKPgIUQprMfNpGEziJSdskOr2Wo1v6vyDXeihqIg46GwZzk3K7gTW2yQsiCTobTnOTlfknKV7KhhilzHj1McxWppJGpebRjpalFQT67jRrmmu0MkmAvXnK08evzfaD132oTIwWGaKHvWuNu0HgMiUQDXntH1QroCflN1JKdU4oZu8hYi7k+g9gavYeZuBVt+TXWOv2zVxJkGvoHeHbqfp1tlC0s34jwRK7tSbbfa+YG3A2sRVMHidO3KbzTySeRSTxKOhehb7UAOUhxKRjhVZRQJGmAyxxPa16GHXSoGab59CtZ0ZgTHfqivJ2GeXVWk2BUicR1NZl8Tm7Usua3Wj+T4YJAyL4gk9UjRaBxxKH2YrTIcsZASyRMdYBIy7RWSKQ4xkXrhy3oI1uaXbwdney3rXWvvx/vq0afFOHbAK/Aa1IEF9sER+Aq6oAeI8cG4NJgxMz+a1OSmV6BGaaF5CdaO+esf4J8pbw==</latexit>Encoder Decoder

neyt ⇠ Mt

<latexit sha1_base64="oC/1YnfKTpkRP2NJh0qgsiTvklU=">AAADzHicdZJba9swFMcVe5cuu6Xb417EQiBhWYi7QftSaLcMNlhLxpq2ECVGVpREiW9YcrFR9boPuLe97pNMtpM1tx4QHM75/XX+OsgJXcZFu/2nZJgPHj56vPek/PTZ8xcvK/uvLnkQR4T2SOAG0bWDOXWZT3uCCZdehxHFnuPSK2f+Oetf3dCIs8C/EGlIBx6e+GzMCBa6ZO+X/taQh8WUYFd2FDyGSMJ60kwbNoNI2ZIdW81Wq3muynfcmRrygkyG3J7l3Kzgzmy+QYqCTIfCnufkfEmKVbKjhhJ5TpBIHCdqaSRu3mwY6WpRWE9vk0a5pqdDxJkH15ytXHr632g9d9qEyMGRTJU9a9xvWq8BkTiEa9fofiFdAb+rOhJTKnBDD3kPkRtMoPYGb2HmbgVbPk11Tr/s1CSZBr6D/j26nxc7ZQtLd2K95e2NCLtSbbfaecDtxFokVbCIrl35jUYBiT3qC+JizvtWOxQDiSPBiEtVGcWchpjM8YT2depjj/KBzD+jgjVdGcFxEOnjC5hXVxUSe5ynnqPJzCTf7GXFXb1+LMZHA8n8MBbUJ8WgcexCEcDsZ8MRiygRbqoTTCKmvUIyxREmQv//sl6Ctfnk7eTyoGV9aB38+Fg9+bRYxx54A96COrDAITgBX0EX9AAxvhmBkRipeW4KU5qqQI3SQvMarIX56x/eni+B</latexit>

backward NMT system

p(x|y)<latexit sha1_base64="CyQwSgLN0ehm4cfS61mfQujnag8=">AAADvnicdZJda9swFIYVex9d9pVul7sRC4GEZcHuBhuMQrplsIt1ZKxpC1FiZEVJlMgfWHKxcfUnx272bybbyZqkqcBwOOd5j14dHzfkTEjL+lsxzHv3Hzw8eFR9/OTps+e1wxfnIogjQgck4EF06WJBOfPpQDLJ6WUYUey5nF64yy95/eKKRoIF/plMQzry8MxnU0aw1CnnsPKngTws5wTzrKfgMUQZbCbttOUwiJSTsWO73em0f6jqDXeqxqIkk7FwFgW3KLlTR+yQsiTTsXSWBblck3KT7Klxhjw3SDIcJ2ptJG5f7Rjpa1HYTK+TVrWhb4dIMA9uOdtoevLfaLNw2obIxVGWKmfRutu0HgMicQi32uh6Kd0Av6smknMqcUtf8hYiHsyg9gavYe5uA1s/TfVOvu7VJLkGvoH+HbpfZ3tlK0s34rKT/nm1utWxigNvB/YqqIPV6Tu132gSkNijviQcCzG0rVCOMhxJRjhVVRQLGmKyxDM61KGPPSpGWbF+CjZ0ZgKnQaQ/X8Iiu6nIsCdE6rmazN8mdmt5cl9tGMvpx1HG/DCW1CflRdOYQxnAfJfhhEWUSJ7qAJOIaa+QzHGEidQbX9VDsHeffDs4P+rY7zpHP9/Xu59X4zgAr8Br0AQ2+AC64BvogwEgxicDGwtjaXbNqemZQYkalZXmJdg6ZvIPxHwo5g==</latexit>

p(y|x)<latexit sha1_base64="dBHzK3A8/hcNmG8nE+hfv1ikGd4=">AAAECnichZNNb9NAEIa3Nh8lfKVw5DIiipSIEMUFCS6VGggSB4qCaNpKdWKtN5tkE3/Ju65suXvmwl/hwgGEuPILuPFvWNtJSdJUrGRpNPO8O++MvHbgMC5arT9bmn7t+o2b27dKt+/cvXe/vPPgiPtRSGiP+I4fntiYU4d5tCeYcOhJEFLs2g49tmevs/rxGQ05871DkQS07+Kxx0aMYKFS1o4GVdPFYkKwk3Yk7IGZQi1uJHWLgSmtlO0ZjWaz8V6W/nEHcsALMh5wa5pz04I7sPgaKQoyGQhrlpOzBSmWyY4cpKZr+3GKo1gujESNszUjXVkKasl5XC9VVXMwOXNhxdjSne0Ln7XcaANMG4dpIq1p/WrPagsmiQJYuUbVC+kS+E7WTDGhAtdVk6dgOv4YlDc4h8zdEraYTHbabzZq4kwDT8C7QvfxcKNsbmmTuH2x+IKKZTZ+km1hdXTxv9GFVa60mq38wOXAmAcVND9dq/zbHPokcqkniIM5PzVageinOBSMOFSWzIjTAJMZHtNTFXrYpbyf5r+yhKrKDGHkh+rzBOTZZUWKXc4T11ZkZpKv17LkptppJEYv+ynzgkhQjxSNRpEDwofsXcCQhZQIJ1EBJiFTXoFMcIiJUK+npJZgrI98OTjabRrPmrsfnlf2X83XsY0eoceohgz0Au2jt6iLeohon7Qv2jftu/5Z/6r/0H8WqLY11zxEK0f/9RdRQUZC</latexit>

LBT (✓) = � log p(y|x)<latexit sha1_base64="QFoLzUqdj7aJb5peYo68MenzgH0=">AAAERHichZPLbtNAFIanNpdibiks2YyIIiUiRHFBgk2lpgSJBUVBNG1FnFjjySSZxDd5xpUtdx6ODQ/AjidgwwKE2CLGl5QkdcVIIx2d8/1z/nOksXybMt5uf91S1GvXb9zcvqXdvnP33v3KzoNj5oUBJn3s2V5waiFGbOqSPqfcJqd+QJBj2eTEWrxK6ydnJGDUc4947JOhg6YunVCMuEyZO8rHmuEgPsPITroC7kEjgfWoGTdMCg1hJnRPb7ZazXdC+8cdihHLyWjEzHnGzXPu0GQbJM/JeMTNRUYuliRfJbtilBiO5UUJCiOxNBI2zzaM9KTIr8fnUUOrye7QYNSBa85WHu1cGK1nTpvQsFCQxMKcN642Lddg4NCHa8/Iei5dAd+KusFnhKOGbPIUGrY3hdIbPIepuxVsOZrodl6XaqJUA59A9wrdh6NSWWGpTNy52HxORSIdP063sD46/9/oXCtzdFDuKB2+6NfQzEq13WpnB14O9CKoguL0zMoXY+zh0CEuxzZibKC3fT5MUMAptonQjJARH+EFmpKBDF3kEDZMsk8gYE1mxnDiBfK6HGbZVUWCHMZix5JkOg/brKXJstog5JOXw4S6fsiJi/NGk9CG3IPpj4JjGhDM7VgGCAdUeoV4hgKEufx36RL0zZEvB8e7Lf1Za/f98+r+QbGObfAIPAZ1oIMXYB+8AT3QB1j5pHxTfig/1c/qd/WX+jtHla1C8xCsHfXPXzOCXHE=</latexit>

DATA

Sennrich et al. “Improving NMT models with monolingual data” ACL 2016

At = {(xk, ytk)}k=1,..,Mt

<latexit sha1_base64="fgtcm855BUPx8R0Y2MIqfz4NPuA=">AAAF2HichVRdb9MwFM1YC6N8bfDIyxXTpEaEqhlIIKFJ2+gkHhgaYt0mmjZyXHf1mi/FzpQos8QDCPHKT+ONP8FvwPnYlqZdsVTp1vece889dmz5NmW83f6zdGu5Vr99Z+Vu4979Bw8fra49PmJeGGDSxZ7tBScWYsSmLulyym1y4gcEOZZNjq3JuzR/fE4CRj33kMc+6Tvo1KUjihGXW+ba8t8Nw0F8jJGddARsgZFAM9Ji1aRgCDOhW7rWamkfReMaty8GLEdGA2aeZbizHLdvsgqS58h4wM1JhpxcInkZ2RGDxHAsL0pQGIlLIaF2XhFyIEl+M76I1MaG7A4Gow5MKSsV3bkS2syUamBYKEhiYZ6pN4uWNhg49GGqjMzn1BLwg2gafEw4UmWTF2DY3ilIbXABqboS7HI00dnZm8uJUg48B/cG3ufDubRC0jW5LLkwPgdFwpxo+SGoiw5hdnI+X9HufEXp8EVD9X+F57s824t7HNligdfZf3nlh8jUrzKy/pWUAS9jNiv2yTtykd5kdcHRstAxE6Zxkeu1rGRPuph9KJUr2BEFUEAPYEroW5A6oD/li1kMSVzNJaJq01R+TBfkx1Qj7oK8S3K+ubrebrWzBbOBXgTrSrEOzNXfxtDDoUNcjm3EWE9v+7yfoIBTbBPRMEJGfIQn6JT0ZOgih7B+kj1MAjbkzhBGXiB/Lodst8xIkMNY7FgSmUpm1Vy6OS/XC/noTT+hrh9y4uK80Si0gXuQvnIwpAHB3I5lgHBApVbAYxQgzOVb2JAm6NWRZ4OjzZb+srX56dX69m5hx4ryVHmmNBVdea1sK++VA6Wr4Fq3ltS+1b7Xv9S/1n/Uf+bQW0sF54kyteq//gGro+qV</latexit>

L(✓) = Lsup(✓) + �LBT(✓)<latexit sha1_base64="P8VXi2JP7nzKZMIQIatLsVBxvsM=">AAAGP3ichVRNb9NAEHVLAiV8tXDkMqKqlAgTxQEJJFSpLanEgaIimrZSnFjrzabZxl/yritH7v4zLvwFbly5cAAhrtxYfzQ4jhtWijTZeW/nzdvxmp5FGW+1vq6s3qhUb95au127c/fe/QfrGw+PmRv4mHSxa7n+qYkYsahDupxyi5x6PkG2aZETc/Imzp9cEJ9R1zniU4/0bXTm0BHFiMstY6PS3dJtxMcYWVFHwDboEdRDddowKOjCiOi2pjab6ntR+4c7EAOWIsMBM84T3HmKOzBYAclT5HTAjUmCnFwheR7ZEYNIt003jFAQiishgXpREHIoSV59ehk2aluyOuiM2jCnLHfo7kxoPVGqgm4iP5oK47xxvWhpg44DD+aOkfmUmgO+E3WdjwlHDVnkGeiWewZSG1xCrC4Hu2pNdHb3SzlhzIGn4FzD+3hUSssklZF3Z86nqFAYEzW9hcayW1hsnZdL2iuXFHefFWz87+BymxdrcZcjSywxO/kvZ36IDG2WkefPpAx4HtMu+CeH5DIe5caSu2WBbURM5SLVa5rRvnQx+VIKM9gRGVBAD2BO6GuQOqA/54uRNUkc1SGiaNNcfkyX5MdUJc6SvENSfq28xRLbWeDlTAc5YZmBZWA5DjlszVjfbDVbyYLFQMuCTSVbh8b6F33o4sAmDscWYqyntTzej5DPKbaIqOkBIx7CE3RGejJ0kE1YP0rePwFbcmcII9eXP4dDsptnRMhmbGqbEhkrZ8VcvFmW6wV89KofUccLOHFwWmgUWMBdiB9TGFKfYG5NZYCwT6VWwGPkI8zlkxuboBVbXgyO203tebP94cXmzl5mx5ryWHmi1BVNeansKG+VQ6Wr4MqnyrfKj8rP6ufq9+qv6u8UurqScR4pc6v65y9RWBKO</latexit>

Page 30: Low Resource Machine Translation - Stanford University

ALGORITHM • train model and on • repeat

• decode to with , create additional dataset

• decode to with , create additional dataset

• retrain both and on:

D<latexit sha1_base64="bApTgAzl7lzR9QdmUSctCLbuIdA=">AAACqXicbZFdS+NAFIYn0d3V7IdVL70ZLC6VLSFxBb0RRL3wQqXCtpZtSphMpzp28sHMiSTE/Dd/g3f+G6dJxW7dAwMv531e5syZIBFcgeO8GObS8qfPX1ZWra/fvv9Ya6xv9FScSsq6NBax7AdEMcEj1gUOgvUTyUgYCHYTTE6n/s0Dk4rH0R/IEzYMyW3Ex5wS0C2/8bTjhQTuKBHFWYmPsFfgVtbOd32OvdIv+JHbtu32VWm9c5elr2oy8+8r6r6mLn21wEHN5f6k4iZvHMxzZ9rxwiDOCpJm5dsQafthYYiODiWt/DHbteazjaZjO1Xhj8KdiSaaVcdvPHujmKYhi4AKotTAdRIYFkQCp4KVlpcqlhA6IbdsoGVEQqaGRbXpEu/ozgiPY6lPBLjqzicKEiqVh4EmpyOqRW/a/J83SGF8OCx4lKTAIlpfNE4FhhhPvw2PuGQURK4FoZLrWTG9I5JQ0J9r6SW4i0/+KHp7tvvb3rvebx6fzNaxgrbQNmohFx2gY3SOOqiLqPHTuDC6Rs/8ZV6bffNvjZrGLLOJ/imTvgLIHMr3</latexit>

ML Perspective: Semi-Supervised LearningEnglish Nepali

TEST

TRAIN //

mono

p(x|y)<latexit sha1_base64="CyQwSgLN0ehm4cfS61mfQujnag8=">AAADvnicdZJda9swFIYVex9d9pVul7sRC4GEZcHuBhuMQrplsIt1ZKxpC1FiZEVJlMgfWHKxcfUnx272bybbyZqkqcBwOOd5j14dHzfkTEjL+lsxzHv3Hzw8eFR9/OTps+e1wxfnIogjQgck4EF06WJBOfPpQDLJ6WUYUey5nF64yy95/eKKRoIF/plMQzry8MxnU0aw1CnnsPKngTws5wTzrKfgMUQZbCbttOUwiJSTsWO73em0f6jqDXeqxqIkk7FwFgW3KLlTR+yQsiTTsXSWBblck3KT7Klxhjw3SDIcJ2ptJG5f7Rjpa1HYTK+TVrWhb4dIMA9uOdtoevLfaLNw2obIxVGWKmfRutu0HgMicQi32uh6Kd0Av6smknMqcUtf8hYiHsyg9gavYe5uA1s/TfVOvu7VJLkGvoH+HbpfZ3tlK0s34rKT/nm1utWxigNvB/YqqIPV6Tu132gSkNijviQcCzG0rVCOMhxJRjhVVRQLGmKyxDM61KGPPSpGWbF+CjZ0ZgKnQaQ/X8Iiu6nIsCdE6rmazN8mdmt5cl9tGMvpx1HG/DCW1CflRdOYQxnAfJfhhEWUSJ7qAJOIaa+QzHGEidQbX9VDsHeffDs4P+rY7zpHP9/Xu59X4zgAr8Br0AQ2+AC64BvogwEgxicDGwtjaXbNqemZQYkalZXmJdg6ZvIPxHwo5g==</latexit>

yt ⇠ Mt<latexit sha1_base64="oC/1YnfKTpkRP2NJh0qgsiTvklU=">AAADzHicdZJba9swFMcVe5cuu6Xb417EQiBhWYi7QftSaLcMNlhLxpq2ECVGVpREiW9YcrFR9boPuLe97pNMtpM1tx4QHM75/XX+OsgJXcZFu/2nZJgPHj56vPek/PTZ8xcvK/uvLnkQR4T2SOAG0bWDOXWZT3uCCZdehxHFnuPSK2f+Oetf3dCIs8C/EGlIBx6e+GzMCBa6ZO+X/taQh8WUYFd2FDyGSMJ60kwbNoNI2ZIdW81Wq3muynfcmRrygkyG3J7l3Kzgzmy+QYqCTIfCnufkfEmKVbKjhhJ5TpBIHCdqaSRu3mwY6WpRWE9vk0a5pqdDxJkH15ytXHr632g9d9qEyMGRTJU9a9xvWq8BkTiEa9fofiFdAb+rOhJTKnBDD3kPkRtMoPYGb2HmbgVbPk11Tr/s1CSZBr6D/j26nxc7ZQtLd2K95e2NCLtSbbfaecDtxFokVbCIrl35jUYBiT3qC+JizvtWOxQDiSPBiEtVGcWchpjM8YT2depjj/KBzD+jgjVdGcFxEOnjC5hXVxUSe5ynnqPJzCTf7GXFXb1+LMZHA8n8MBbUJ8WgcexCEcDsZ8MRiygRbqoTTCKmvUIyxREmQv//sl6Ctfnk7eTyoGV9aB38+Fg9+bRYxx54A96COrDAITgBX0EX9AAxvhmBkRipeW4KU5qqQI3SQvMarIX56x/eni+B</latexit>

x<latexit sha1_base64="jDLmMVPu3t12MXHk9cCHPRMrYik=">AAADvXicdZJdb9owFIZNso+OfdHtcjfWEBJoDJFuUqtJ1dqNSbtYJ6aVthKG1DEGDM6HYqdLlPpPblf7N3MSWIFSS5aOznlen9dHxwk4E7Ld/lsyzHv3HzzceVR+/OTps+eV3Rdnwo9CQnvE53544WBBOfNoTzLJ6UUQUuw6nJ47889Z/fyKhoL53qlMAjpw8cRjY0aw1Cl7t/SnhlwspwTztKPgIUQprMfNpGEziJSdskOr2Wo1v6vyDXeihqIg46GwZzk3K7gTW2yQsiCTobTnOTlfknKV7KhhilzHj1McxWppJGpebRjpalFQT67jRrmmu0MkmAvXnK08evzfaD132oTIwWGaKHvWuNu0HgMiUQDXntH1QroCflN1JKdU4oZu8hYi7k+g9gavYeZuBVt+TXWOv2zVxJkGvoHeHbqfp1tlC0s34jwRK7tSbbfa+YG3A2sRVMHidO3KbzTySeRSTxKOhehb7UAOUhxKRjhVZRQJGmAyxxPa16GHXSoGab59CtZ0ZgTHfqivJ2GeXVWk2BUicR1NZl8Tm7Usua3Wj+T4YJAyL4gk9UjRaBxxKH2YrTIcsZASyRMdYBIy7RWSKQ4xkXrhy3oI1uaXbwdney3rXWvvx/vq0afFOHbAK/Aa1IEF9sER+Aq6oAeI8cG4NJgxMz+a1OSmV6BGaaF5CdaO+esf4J8pbw==</latexit>

p(y|x)<latexit sha1_base64="dBHzK3A8/hcNmG8nE+hfv1ikGd4=">AAAECnichZNNb9NAEIa3Nh8lfKVw5DIiipSIEMUFCS6VGggSB4qCaNpKdWKtN5tkE3/Ju65suXvmwl/hwgGEuPILuPFvWNtJSdJUrGRpNPO8O++MvHbgMC5arT9bmn7t+o2b27dKt+/cvXe/vPPgiPtRSGiP+I4fntiYU4d5tCeYcOhJEFLs2g49tmevs/rxGQ05871DkQS07+Kxx0aMYKFS1o4GVdPFYkKwk3Yk7IGZQi1uJHWLgSmtlO0ZjWaz8V6W/nEHcsALMh5wa5pz04I7sPgaKQoyGQhrlpOzBSmWyY4cpKZr+3GKo1gujESNszUjXVkKasl5XC9VVXMwOXNhxdjSne0Ln7XcaANMG4dpIq1p/WrPagsmiQJYuUbVC+kS+E7WTDGhAtdVk6dgOv4YlDc4h8zdEraYTHbabzZq4kwDT8C7QvfxcKNsbmmTuH2x+IKKZTZ+km1hdXTxv9GFVa60mq38wOXAmAcVND9dq/zbHPokcqkniIM5PzVageinOBSMOFSWzIjTAJMZHtNTFXrYpbyf5r+yhKrKDGHkh+rzBOTZZUWKXc4T11ZkZpKv17LkptppJEYv+ynzgkhQjxSNRpEDwofsXcCQhZQIJ1EBJiFTXoFMcIiJUK+npJZgrI98OTjabRrPmrsfnlf2X83XsY0eoceohgz0Au2jt6iLeohon7Qv2jftu/5Z/6r/0H8WqLY11zxEK0f/9RdRQUZC</latexit>

p(y|x)<latexit sha1_base64="dBHzK3A8/hcNmG8nE+hfv1ikGd4=">AAAECnichZNNb9NAEIa3Nh8lfKVw5DIiipSIEMUFCS6VGggSB4qCaNpKdWKtN5tkE3/Ju65suXvmwl/hwgGEuPILuPFvWNtJSdJUrGRpNPO8O++MvHbgMC5arT9bmn7t+o2b27dKt+/cvXe/vPPgiPtRSGiP+I4fntiYU4d5tCeYcOhJEFLs2g49tmevs/rxGQ05871DkQS07+Kxx0aMYKFS1o4GVdPFYkKwk3Yk7IGZQi1uJHWLgSmtlO0ZjWaz8V6W/nEHcsALMh5wa5pz04I7sPgaKQoyGQhrlpOzBSmWyY4cpKZr+3GKo1gujESNszUjXVkKasl5XC9VVXMwOXNhxdjSne0Ln7XcaANMG4dpIq1p/WrPagsmiQJYuUbVC+kS+E7WTDGhAtdVk6dgOv4YlDc4h8zdEraYTHbabzZq4kwDT8C7QvfxcKNsbmmTuH2x+IKKZTZ+km1hdXTxv9GFVa60mq38wOXAmAcVND9dq/zbHPokcqkniIM5PzVageinOBSMOFSWzIjTAJMZHtNTFXrYpbyf5r+yhKrKDGHkh+rzBOTZZUWKXc4T11ZkZpKv17LkptppJEYv+ynzgkhQjxSNRpEDwofsXcCQhZQIJ1EBJiFTXoFMcIiJUK+npJZgrI98OTjabRrPmrsfnlf2X83XsY0eoceohgz0Au2jt6iLeohon7Qv2jftu/5Z/6r/0H8WqLY11zxEK0f/9RdRQUZC</latexit>

p(x|y)<latexit sha1_base64="CyQwSgLN0ehm4cfS61mfQujnag8=">AAADvnicdZJda9swFIYVex9d9pVul7sRC4GEZcHuBhuMQrplsIt1ZKxpC1FiZEVJlMgfWHKxcfUnx272bybbyZqkqcBwOOd5j14dHzfkTEjL+lsxzHv3Hzw8eFR9/OTps+e1wxfnIogjQgck4EF06WJBOfPpQDLJ6WUYUey5nF64yy95/eKKRoIF/plMQzry8MxnU0aw1CnnsPKngTws5wTzrKfgMUQZbCbttOUwiJSTsWO73em0f6jqDXeqxqIkk7FwFgW3KLlTR+yQsiTTsXSWBblck3KT7Klxhjw3SDIcJ2ptJG5f7Rjpa1HYTK+TVrWhb4dIMA9uOdtoevLfaLNw2obIxVGWKmfRutu0HgMicQi32uh6Kd0Av6smknMqcUtf8hYiHsyg9gavYe5uA1s/TfVOvu7VJLkGvoH+HbpfZ3tlK0s34rKT/nm1utWxigNvB/YqqIPV6Tu132gSkNijviQcCzG0rVCOMhxJRjhVVRQLGmKyxDM61KGPPSpGWbF+CjZ0ZgKnQaQ/X8Iiu6nIsCdE6rmazN8mdmt5cl9tGMvpx1HG/DCW1CflRdOYQxnAfJfhhEWUSJ7qAJOIaa+QzHGEidQbX9VDsHeffDs4P+rY7zpHP9/Xu59X4zgAr8Br0AQ2+AC64BvogwEgxicDGwtjaXbNqemZQYkalZXmJdg6ZvIPxHwo5g==</latexit>

mono

Adding both source & target-side monolingual data.

Training DatasetD = {(x, y)i}i=1,..,N

<latexit sha1_base64="KpnY63s6K47gGjqoo/CKclue32o=">AAACuXicfZHfa9swEMdl70fb7Fe2Pe5FLBQyCMbuSjsYhbLtoS8bGSxtIc6MrMiJEsky0rnECP+PY2/7b6Y4DnRp2YHgy9190N330kJwA2H4x/MfPHz0eG//oPPk6bPnL7ovX10aVWrKRlQJpa9TYpjgORsBB8GuC82ITAW7Spef1/WrG6YNV/kPqAo2kWSW84xTAi6VdH/FksCcEmG/1PgMxxb3V4PqXcJxXCeWn0WDIBh8qzuHt/p+2limamWlyhU2mtYtuUoWDbXYUF8T8x8OZrDlqmTZcMstB/dzBdEEk3K15frl4GZn0mGddHthEDaB74qoFT3UxjDp/o6nipaS5UAFMWYchQVMLNHAqWB1Jy4NKwhdkhkbO5kTyczENs7X+NBlpjhT2r0ccJO9TVgijalk6jrX+5jd2jp5X21cQvZhYnlelMByuvkoKwUG55w7I55yzSiIyglCNXezYjp3/lBwx+44E6Ldle+Ky6Mgeh8cfT/unX9q7dhHb9Bb1EcROkXn6AIN0QhR78SLPeZl/kef+HN/sWn1vZZ5jf4J3/wFCeLTrg==</latexit>

Mt = {ytk}k=1,..,Mt<latexit sha1_base64="WPPS0Z7qvI346bxZ6Qp8zIQogqY=">AAADd3icbZJLb9NAEMc3Do8SXinc4MCIEOQIE8UFCS6VWggSB4qCRNpKcWKtN5tkE7/kXVe23P0IfDlufA8u3FjbCThpV7I0mvnNzH/G44Qu46LX+1XT6jdu3rq9d6dx9979Bw+b+49OeRBHhA5J4AbRuYM5dZlPh4IJl56HEcWe49IzZ/Uxj59d0IizwP8u0pCOPTz32YwRLJTL3q/9aFseFguC3awv4RCsDPTESDs2A0vaGTs0jW7X+Cob/7kTOeElmUy4vSy4Zcmd2Fw2qqAowXQi7FUBrjagqJbsy0lmeU6QZDhO5EZHbFzs6BiopFBPL5NOo62ag8WZB1vCKkWP/+nUC6EGWA6OslTay86u5q0tWCQOYauMipepFfCL1C2xoAJ3VJPXYLnBHJQ2uIRcXQXbjCb7x5+uzUnyHHgFfsdutnrdXvHgqmGujRZav4Hd/GlNAxJ71BfExZyPzF4oxhmOBCMuVf8i5jTEZIXndKRMH3uUj7PibiS0lWcKsyBSny+g8FYzMuxxnnqOIvNp+G4sd14XG8Vi9n6cMT+MBfVJ2WgWuyACyI8QpiyiRLipMjCJmNIKZIEjTIQ61YZagrk78lXj9KBrvukefHvbOvqwXsceeoqeIx2Z6B06Qp/RAA0Rqf3Wnmgt7YX2p/6s/rKul6hWW+c8Rluvbv4FlMEQtg==</latexit>

Ms = {xsj}j=1,..,Ms

<latexit sha1_base64="Wuso7jnamC5w7EBqWeo2cvUsqLM=">AAACx3icbZFNj9MwEIad8LWErwJHLhZVpa5URcmCBJeVFugBDouKRHdXalrLcd1dt04c2ZMqUciBv8iNC78FN+mibpeRLL2aecZ+PRNnUhgIgt+Oe+fuvfsPDh56jx4/efqs8/zFmVG5ZnzMlFT6IqaGS5HyMQiQ/CLTnCax5Ofx6tOmfr7m2giVfocy49OEXqZiIRgFmyKdP70ooXDFqKyGNT7GUYX7xaA8JAJHNanEcTjw/cHX2vuHndYz04LFzJBlgy1b7JSY2uvtktCS5QzIqiFX1yTsksN6VkVJrIqK5kV97SMfrPd8jGxT1i9/FIdez76OIyMSfMPZzqUfamJIpxv4QRP4tgi3oou2MSKdX9FcsTzhKTBJjZmEQQbTimoQTHI7htzwjLIVveQTK1OacDOtmj3UuGczc7xQ2p4UcJPd7ahoYkyZxJbcmDT7tU3yf7VJDov300qkWQ48Ze1Di1xiUHizVDwXmjOQpRWUaWG9YnZFNWVgV+/ZIYT7X74tzo788I1/9O1t9+TjdhwH6BV6jfooRO/QCfqMRmiMmDN0lo5xwP3iKnftFi3qOtuel+hGuD//Am3y1zs=</latexit>

Learning Framework: Iterative ST & BT.

y<latexit sha1_base64="tLy1OgwMuQcWRRWcjddx94KCJ6o=">AAADGHicdZJNb9MwGMed8DbCWwdHLhZVpVaKomRDgsukATtwGSoS3SY1beS47ubWeZHtTImMPwYXvgoXDiDEdTe+DW7SQtvBI1n663l+j/33Y8c5o0L6/i/LvnHz1u07O3ede/cfPHzU2n18IrKCYzLAGcv4WYwEYTQlA0klI2c5JyiJGTmN528W9dNLwgXN0g+yyskoQecpnVKMpElFu5bXCRMkLzBi6kjDAxgq2C3dqhdRGOpI0YPA9Tz3nXb+csd6LBqyHItoVnOzhjuOxBYpG7Iay2hek/MVKdfJIz1WYRJnpUJFqVdGCvdyy0jfNOXd6mPZczrmdBgKmsANZ2ubvvpjtFs7dWEYI64qHc16/zdtxhDiIocb2zirzlbb9/w64HURLEUbLKMfta7CSYaLhKQSMyTEMPBzOVKIS4oZ0U5YCJIjPEfnZGhkihIiRqp+WA07JjOB04yblUpYZ9c7FEqEqJLYkAuzYru2SP6rNizk9OVI0TQvJElxc9C0YFBmcPFL4IRygiWrjECYU+MV4gvEEZbmLzlmCMH2la+Lkz0v2Pf23j9vH75ejmMHPAXPQBcE4AU4BG9BHwwAtj5ZX6xv1nf7s/3V/mH/bFDbWvY8ARthX/0Gco31JA==</latexit>

xs ⇠ Ms<latexit sha1_base64="+D2rKjCVJUv9t1Da29dPHSAUntk=">AAACx3icbZFNbxMxEIa9y0fL8hXgyMUiqpRK0Wq3INFLpQI5wKEoSKStlE0sr+O0TrwfsmejXZk98Be5ceG34OymIk0ZydKrmWc8rz1xLoWGIPjtuPfuP3i4t//Ie/zk6bPnnRcvz3VWKMZHLJOZuoyp5lKkfAQCJL/MFadJLPlFvPy0rl+suNIiS79DlfNJQq9SMReMgk2Rzp+DKKFwzag0gxqf4MjgXtmvDonAUU2MOAn7vt//Wnv/uLN6qluynGqyaLhFy50RvUNCS1ZTIMuGXN6QsE0O6qmJkjgrDS3K+sZI0V/tGBnaprxX/SgPPTscR1ok+JaxrTs/1ESTTjfwgybwXRFuRBdtYkg6v6JZxoqEp8Ak1XocBjlMDFUgmOS1FxWa55Qt6RUfW5nShOuJafZQ4wObmeF5puxJATfZ7Q5DE62rJLbk2qTera2T/6uNC5gfT4xI8wJ4ytpB80JiyPB6qXgmFGcgKysoU8J6xeyaKsrArt6znxDuPvmuOD/yw7f+0bd33dOPm+/YR6/RG9RDIXqPTtFnNEQjxJyBs3C0A+4XN3NXbtmirrPpeYVuhfvzL1Ju1zs=</latexit>

p(y|x)<latexit sha1_base64="dBHzK3A8/hcNmG8nE+hfv1ikGd4=">AAAECnichZNNb9NAEIa3Nh8lfKVw5DIiipSIEMUFCS6VGggSB4qCaNpKdWKtN5tkE3/Ju65suXvmwl/hwgGEuPILuPFvWNtJSdJUrGRpNPO8O++MvHbgMC5arT9bmn7t+o2b27dKt+/cvXe/vPPgiPtRSGiP+I4fntiYU4d5tCeYcOhJEFLs2g49tmevs/rxGQ05871DkQS07+Kxx0aMYKFS1o4GVdPFYkKwk3Yk7IGZQi1uJHWLgSmtlO0ZjWaz8V6W/nEHcsALMh5wa5pz04I7sPgaKQoyGQhrlpOzBSmWyY4cpKZr+3GKo1gujESNszUjXVkKasl5XC9VVXMwOXNhxdjSne0Ln7XcaANMG4dpIq1p/WrPagsmiQJYuUbVC+kS+E7WTDGhAtdVk6dgOv4YlDc4h8zdEraYTHbabzZq4kwDT8C7QvfxcKNsbmmTuH2x+IKKZTZ+km1hdXTxv9GFVa60mq38wOXAmAcVND9dq/zbHPokcqkniIM5PzVageinOBSMOFSWzIjTAJMZHtNTFXrYpbyf5r+yhKrKDGHkh+rzBOTZZUWKXc4T11ZkZpKv17LkptppJEYv+ynzgkhQjxSNRpEDwofsXcCQhZQIJ1EBJiFTXoFMcIiJUK+npJZgrI98OTjabRrPmrsfnlf2X83XsY0eoceohgz0Au2jt6iLeohon7Qv2jftu/5Z/6r/0H8WqLY11zxEK0f/9RdRQUZC</latexit>

As = {(xsj , yj)}j=1,..,Ms

<latexit sha1_base64="WViJMaNEUbzC6NMXxyJtKcfS2lg=">AAAC73icbZJNb9MwGMedwNgIbx0cuVhUlVopipINaVwmDdiBy1CR6DapaS3HdTe3zgu2MyUy+RJcOIAQV74ON74NbtKNLeORLP39PL9Hz98vUcaZVL7/x7Lv3N24t7l133nw8NHjJ53tp8cyzQWhI5LyVJxGWFLOEjpSTHF6mgmK44jTk2j5dlU/uaBCsjT5qMqMTmJ8lrA5I1iZFNq2NnphjNU5wVwfVnAfhhr2C7ccIAbDCmm2H7ie576vnH/cUTWVDVlMJVrU3KLhjpBskaohy6lCy5pcXpLqOnlYTXUYR2mhcV5Ul0Zy96JlZGiasn75uRg4PTMdhpLF8IYz52r3+spnvzbqwjDCQpcVWgxanlGn63t+HfC2CNaiC9YxRJ3f4SwleUwTRTiWchz4mZpoLBQjnFZOmEuaYbLEZ3RsZIJjKie6fq8K9kxmBuepMCtRsM5e79A4lrKMI0OuziLbtVXyf7VxruavJpolWa5oQppB85xDlcLV48MZE5QoXhqBiWDGKyTnWGCizBdxzCUE7SPfFsc7XrDr7Xx42T14s76OLfAcvAB9EIA9cADegSEYAWJx64v1zfpuf7K/2j/snw1qW+ueZ+BG2L/+Ao275JQ=</latexit>

p(x|y)<latexit sha1_base64="CyQwSgLN0ehm4cfS61mfQujnag8=">AAADvnicdZJda9swFIYVex9d9pVul7sRC4GEZcHuBhuMQrplsIt1ZKxpC1FiZEVJlMgfWHKxcfUnx272bybbyZqkqcBwOOd5j14dHzfkTEjL+lsxzHv3Hzw8eFR9/OTps+e1wxfnIogjQgck4EF06WJBOfPpQDLJ6WUYUey5nF64yy95/eKKRoIF/plMQzry8MxnU0aw1CnnsPKngTws5wTzrKfgMUQZbCbttOUwiJSTsWO73em0f6jqDXeqxqIkk7FwFgW3KLlTR+yQsiTTsXSWBblck3KT7Klxhjw3SDIcJ2ptJG5f7Rjpa1HYTK+TVrWhb4dIMA9uOdtoevLfaLNw2obIxVGWKmfRutu0HgMicQi32uh6Kd0Av6smknMqcUtf8hYiHsyg9gavYe5uA1s/TfVOvu7VJLkGvoH+HbpfZ3tlK0s34rKT/nm1utWxigNvB/YqqIPV6Tu132gSkNijviQcCzG0rVCOMhxJRjhVVRQLGmKyxDM61KGPPSpGWbF+CjZ0ZgKnQaQ/X8Iiu6nIsCdE6rmazN8mdmt5cl9tGMvpx1HG/DCW1CflRdOYQxnAfJfhhEWUSJ7qAJOIaa+QzHGEidQbX9VDsHeffDs4P+rY7zpHP9/Xu59X4zgAr8Br0AQ2+AC64BvogwEgxicDGwtjaXbNqemZQYkalZXmJdg6ZvIPxHwo5g==</latexit>

D [At [As<latexit sha1_base64="e0NAylEw1ooQi7AHFM3rtuBW9V4=">AAAEdnichZNdaxNBFIanSdS6fqX1SgQZGoIJpmG3CvWm0NQIXliJ2LSFbLLMTibJJPvFzmzZZTv/wF/nnb/DGy+d/UhN0q0ODBzOed457zkwpmdRxlX151apXLl3/8H2Q+XR4ydPn1V3ds+ZG/iY9LFruf6liRixqEP6nHKLXHo+QbZpkQtz8SGpX1wRn1HXOeORR4Y2mjp0QjHiMmXslL7XdRvxGUZW3BXwCOoxbIStqGlQqAsjpkdaq91ufRHKX+5UjFhGhiNmzFNunnGnBtsgeUZGI24sUnKxJPkq2RWjWLdNN4xREIqlkaB1tWGkJ0VeI7oOm0pddoc6ozZcc7byaOfGaCN12oK6ifw4Esa8ebdpuQYdBx5ce0bWM+kK+Fk0dD4jHDVlk32oW+4USm/wGibuVrDlaKLb+VioCRMNfAOdO3TfzgpluaUicedm8xkVimT8KNnC+uj8f6PzYksnxZaS6fOGTeXf7xZs2ajW1LaaHng70PKgBvLTM6o/9LGLA5s4HFuIsYGmenwYI59TbBGh6AEjHsILNCUDGTrIJmwYp99GwLrMjOHE9eV1OEyzq4oY2YxFtinJxCTbrCXJotog4JP3w5g6XsCJg7NGk8CC3IXJH4Rj6hPMrUgGCPtUeoV4hnyEufypilyCtjny7eD8oK29bR98fVc7PsnXsQ1egj3QABo4BMfgE+iBPsClX+UX5b1yrfy78qpSr7zO0NJWrnkO1k5F/QNLLWpM</latexit>

Ltotal(✓) = � log p(y|x)� �1 log p(yt|xt)� �2 log p(y

s|xs)<latexit sha1_base64="yVeodM+YYA8v80/XuviO5lyyFlo=">AAAE9HichVRNb9NAEHWbACV8pXDksiKKFIsQxQEJLpWaUiQOFAXRtJXqxFpvNskm/sI7jmw5+zu4cAAhrvwYbvwb1nYSJalLV7I0nnlv5s3T2qZnMQ7N5t+d3ULx1u07e3dL9+4/ePiovP/4jLuBT2iXuJbrX5iYU4s5tAsMLHrh+RTbpkXPzenbpH4+oz5nrnMKkUd7Nh45bMgIBpky9gulqm5jGBNsxccCHSA9RrWwHqkGQ7owYnag1RuN+kexhjsRfZ4hwz43JilukuFODL6FhAwZ9cGYpsjpEgliY3Y/1m3TDWMchGIpJKjPtoR0JMmrRfNQLVXldKRzZqMNZWtN2yuhtVRpHekm9uNIGBP1etHSBp0EHtpoI+sZdQ34QdR0GFPAqhzyAumWO0JSG5qjRN0abLmaOG6/y+WECQc9R841vM+nubSFpDxye+V8hgpFsn6UuLC5Oty0OuRLOsqXlGy/GKje1DjP5pxR4AK2xH+8Tt/llR9gQ1tVZPuVkj6sY1pb9sk7Mk9usmqUK81GMz3oaqAtgoqyOB2j/EcfuCSwqQPEwpxfak0PejH2gRGLipIecOphMsUjeilDB9uU9+L0oxWoKjMDNHR9+TiA0uw6I8Y255FtSmTiCd+uJcm82mUAwze9mDleANQh2aBhYCFwUfIHQAPmUwJWJANMfCa1IjLGPiYg/xMlaYK2vfLV4KzV0F42Wp9eVQ6PFnbsKU+VZ0pN0ZTXyqHyXukoXYUUvhS+Fr4XfhRnxW/Fn8VfGXR3Z8F5omyc4u9/a1CXhg==</latexit>

mono

mono

English Nepali

decoded with:

decoded with:p(x|y)

<latexit sha1_base64="CyQwSgLN0ehm4cfS61mfQujnag8=">AAADvnicdZJda9swFIYVex9d9pVul7sRC4GEZcHuBhuMQrplsIt1ZKxpC1FiZEVJlMgfWHKxcfUnx272bybbyZqkqcBwOOd5j14dHzfkTEjL+lsxzHv3Hzw8eFR9/OTps+e1wxfnIogjQgck4EF06WJBOfPpQDLJ6WUYUey5nF64yy95/eKKRoIF/plMQzry8MxnU0aw1CnnsPKngTws5wTzrKfgMUQZbCbttOUwiJSTsWO73em0f6jqDXeqxqIkk7FwFgW3KLlTR+yQsiTTsXSWBblck3KT7Klxhjw3SDIcJ2ptJG5f7Rjpa1HYTK+TVrWhb4dIMA9uOdtoevLfaLNw2obIxVGWKmfRutu0HgMicQi32uh6Kd0Av6smknMqcUtf8hYiHsyg9gavYe5uA1s/TfVOvu7VJLkGvoH+HbpfZ3tlK0s34rKT/nm1utWxigNvB/YqqIPV6Tu132gSkNijviQcCzG0rVCOMhxJRjhVVRQLGmKyxDM61KGPPSpGWbF+CjZ0ZgKnQaQ/X8Iiu6nIsCdE6rmazN8mdmt5cl9tGMvpx1HG/DCW1CflRdOYQxnAfJfhhEWUSJ7qAJOIaa+QzHGEidQbX9VDsHeffDs4P+rY7zpHP9/Xu59X4zgAr8Br0AQ2+AC64BvogwEgxicDGwtjaXbNqemZQYkalZXmJdg6ZvIPxHwo5g==</latexit>

p(y|x)<latexit sha1_base64="dBHzK3A8/hcNmG8nE+hfv1ikGd4=">AAAECnichZNNb9NAEIa3Nh8lfKVw5DIiipSIEMUFCS6VGggSB4qCaNpKdWKtN5tkE3/Ju65suXvmwl/hwgGEuPILuPFvWNtJSdJUrGRpNPO8O++MvHbgMC5arT9bmn7t+o2b27dKt+/cvXe/vPPgiPtRSGiP+I4fntiYU4d5tCeYcOhJEFLs2g49tmevs/rxGQ05871DkQS07+Kxx0aMYKFS1o4GVdPFYkKwk3Yk7IGZQi1uJHWLgSmtlO0ZjWaz8V6W/nEHcsALMh5wa5pz04I7sPgaKQoyGQhrlpOzBSmWyY4cpKZr+3GKo1gujESNszUjXVkKasl5XC9VVXMwOXNhxdjSne0Ln7XcaANMG4dpIq1p/WrPagsmiQJYuUbVC+kS+E7WTDGhAtdVk6dgOv4YlDc4h8zdEraYTHbabzZq4kwDT8C7QvfxcKNsbmmTuH2x+IKKZTZ+km1hdXTxv9GFVa60mq38wOXAmAcVND9dq/zbHPokcqkniIM5PzVageinOBSMOFSWzIjTAJMZHtNTFXrYpbyf5r+yhKrKDGHkh+rzBOTZZUWKXc4T11ZkZpKv17LkptppJEYv+ynzgkhQjxSNRpEDwofsXcCQhZQIJ1EBJiFTXoFMcIiJUK+npJZgrI98OTjabRrPmrsfnlf2X83XsY0eoceohgz0Au2jt6iLeohon7Qv2jftu/5Z/6r/0H8WqLY11zxEK0f/9RdRQUZC</latexit>

phas

e 1

phas

e 1

phas

e 2

phas

e 2

TRAIN //

mono

mono

generate train models

DATAEnglish Nepali

Shen et al. “The source-target domain mismatch problem in MT” arXiv:1909.13151 2019 Chen et al. “FBAI WAT’19 Myanmar-English translation task submission” WAT@EMNLP 2019

At = {(xk, ytk)}k=1,..,Mt

<latexit sha1_base64="fgtcm855BUPx8R0Y2MIqfz4NPuA=">AAAF2HichVRdb9MwFM1YC6N8bfDIyxXTpEaEqhlIIKFJ2+gkHhgaYt0mmjZyXHf1mi/FzpQos8QDCPHKT+ONP8FvwPnYlqZdsVTp1vece889dmz5NmW83f6zdGu5Vr99Z+Vu4979Bw8fra49PmJeGGDSxZ7tBScWYsSmLulyym1y4gcEOZZNjq3JuzR/fE4CRj33kMc+6Tvo1KUjihGXW+ba8t8Nw0F8jJGddARsgZFAM9Ji1aRgCDOhW7rWamkfReMaty8GLEdGA2aeZbizHLdvsgqS58h4wM1JhpxcInkZ2RGDxHAsL0pQGIlLIaF2XhFyIEl+M76I1MaG7A4Gow5MKSsV3bkS2syUamBYKEhiYZ6pN4uWNhg49GGqjMzn1BLwg2gafEw4UmWTF2DY3ilIbXABqboS7HI00dnZm8uJUg48B/cG3ufDubRC0jW5LLkwPgdFwpxo+SGoiw5hdnI+X9HufEXp8EVD9X+F57s824t7HNligdfZf3nlh8jUrzKy/pWUAS9jNiv2yTtykd5kdcHRstAxE6Zxkeu1rGRPuph9KJUr2BEFUEAPYEroW5A6oD/li1kMSVzNJaJq01R+TBfkx1Qj7oK8S3K+ubrebrWzBbOBXgTrSrEOzNXfxtDDoUNcjm3EWE9v+7yfoIBTbBPRMEJGfIQn6JT0ZOgih7B+kj1MAjbkzhBGXiB/Lodst8xIkMNY7FgSmUpm1Vy6OS/XC/noTT+hrh9y4uK80Si0gXuQvnIwpAHB3I5lgHBApVbAYxQgzOVb2JAm6NWRZ4OjzZb+srX56dX69m5hx4ryVHmmNBVdea1sK++VA6Wr4Fq3ltS+1b7Xv9S/1n/Uf+bQW0sF54kyteq//gGro+qV</latexit>

Page 31: Low Resource Machine Translation - Stanford University

ML Perspective: Multi-Task/Multi-ModalEnglish Nepali

TEST

TRAIN //

Adding parallel data in other languages.

Training Dataset Learning Framework: Multilingual Training

DATA

TRAIN //

TRAIN //

HindiEncoder

Src TgtCross-Entropy

Loss

y<latexit sha1_base64="0flskoD0eLhdjGE1HTyI8J9TWSE=">AAADt3icdZJdb9owFIZNso+OfZR2l7uxhpBAY4h0lbZeVGo3Ju1inZhW2moYIscYMDgfi50qUeqfuJvd7d/MSWAFSi1ZOjrneX1eHx0n4EzIdvtvyTAfPHz0eOdJ+emz5y92K3v7F8KPQkJ7xOd+eOVgQTnzaE8yyelVEFLsOpxeOvNPWf3ymoaC+d65TAI6cPHEY2NGsNQpe6/0u4ZcLKcE87Sj4DFEKazHzaRhM4iUnbJjq9lqNb+p8i13poaiIOOhsGc5Nyu4M1tskLIgk6G05zk5X5JyleyoYYpcx49THMVqaSRqXm8Y6WpRUE9u4ka5prtDJJgL15ytPHr632g9d9qEyMFhmih71rjftB4DIlEA157R9UK6An5VdSSnVOKGbvIWIu5PoPYGb2DmbgVbfk11Tj9v1cSZBr6B3j26H+dbZQtLt+LErlTbrXZ+4N3AWgRVsDhdu/IHjXwSudSThGMh+lY7kIMUh5IRTlUZRYIGmMzxhPZ16GGXikGa752CNZ0ZwbEf6utJmGdXFSl2hUhcR5PZp8RmLUtuq/UjOf4wSJkXRJJ6pGg0jjiUPsyWGI5YSInkiQ4wCZn2CskUh5hIveplPQRr88t3g4uDlvWudfD9sHrycTGOHfAKvAZ1YIH34AR8AV3QA8Q4NH4axBiZR6Ztjs1pgRqlheYlWDvmr3914yar</latexit>

human translator

human reference

predictioninput sentence with target language ID

Decoder

NMT system

(x, LID)<latexit sha1_base64="HmWqM9tiKnFxrNajg0yfX6wBGuo=">AAAE/nichVTLbtNAFHUbA8W8UmDHZkRVKRYhigsSbCo1JUggtaiIvqQ6scaTSTOJX/JcV7ackfgVNixAiC3fwY6/YWynIQ+XjmTp+t5z7j33aGw7cBiHZvPPympFvXHz1tpt7c7de/cfVNcfHnM/Cgk9Ir7jh6c25tRhHj0CBg49DUKKXduhJ/boTVY/uaAhZ753CElAOy4+91ifEQwyZa1XHm+aLoYBwU7aFmgbmSmqxfVEtxgyhZWybaPeaNQ/CO0fbl90eYGMu9wa5rhhgdu3+AISCmTSBWuUI0eXSJhFtkU3NV3bj1McxeJSSFS/WBByIElBLRnHurYppyOTMxfNKZtp2poKreVK68i0cZgmwhrqV4uWNpgkCtBcG1kvqDPAPVEzYUAB63LIc2Q6/jmS2tAYZepmYJeriXbrbSknzjjoGfKu4H06LKVNJJWRW1PnC1QssvWTzIX51eG61aFc0m65pGz7yUD9usblNi/PAh+wI/5jdv4u73wPW8a0IvtPpXRhFrO14J+8JOPsKuuavPlo731bt6obzUYzP2g5MCbBhjI5B1b1t9nzSeRSD4iDOT8zmgF0UhwCIw4VmhlxGmAywuf0TIYedinvpPnnK9CmzPRQ3w/l4wHKs7OMFLucJ64tkZk5fLGWJctqZxH0X3dS5gURUI8Ug/qRg8BH2b8A9VhICTiJDDAJmdSKyACHmID8Y2jSBGNx5eXgeKthvGhsfXy5sbM7sWNNeaI8VWqKobxSdpR3yoFypJBKWvlS+Vb5rn5Wv6o/1J8FdHVlwnmkzB3111+HPJoH</latexit>

L(✓) = �X

s,t

E(x,y)⇠Ds,t[log p(y|x; t)]

<latexit sha1_base64="2m27Zees8z1jOYapGr5U+EniNnU=">AAAFW3ichVTbbtNAEHWbhAZTaAviiZcRVaVEhCguSCChSm1JJR4oKqI3KU6s9WbTOvEN77iy5e5P8gQP/ApifWlIUpeuZGm8c87MmeP1mr5tcex0fi0tV6q1Byv1h+qj1cdP1tY3np5yLwwoO6Ge7QXnJuHMtlx2ghba7NwPGHFMm52Zk49p/uyKBdzy3GOMfdZ3yIVrjSxKUG4ZG5XvW7pD8JISO+kK2AE9gUbUipuGBbowEmtHa7XbrS9C/Yc7FAOeI6MBN8YZbpzjDg2+gMQcGQ/QmGTIyQ0SZ5FdMUh0x/SihISRuBEStq4WhBxJkt+Ir6OmuiW7g84tB+aUzRTdmwptZEpboJskSGJhjJt3i5Y26DT0Ya6MzOfUGeBn0dDxkiFpyiavQbe9C5Da4BpSdTOwm9FEd++glBOlHHgF7h28b8eltEJSGXlv6nyOikQ6fpy6MD863jc6lkvaL5eUTl80bN5XuNzm273QQ2KL/5idvcszPySGNs3I+lMpA5zFbC/4Jw/JdXqUm+qdn5aHjpHwFopcrmkmB9LE7EdZOIJdUQAF9ADmdH4AKQP6xvpmp93JFtwOtCLYVIp1ZKz/0IceDR3mIrUJ5z2t42M/IQFa1GZC1UPOfEIn5IL1ZOgSh/F+kt0NArbkzhBGXiAfFyHbnWUkxOE8dkyJTEfgi7l0syzXC3H0vp9Yrh8ic2neaBTagB6kFw0MrYBRtGMZEBpYUivQSxIQivI6UqUJ2uLIt4PT7bb2pr399e3m7n5hR115obxUGoqmvFN2lU/KkXKi0MrPyp/qSrVe/V2r1NTaag5dXio4z5S5VXv+F4MBuCM=</latexit>

Share the same encoder and the same decoder with all the language pairs. Prepend a target language identifier to the source sentence to inform decoder of desired language. Concatenate all the datasets together. Train using standard cross-entropy loss.

TRAIN //

Den,ne [Den,hi [Dhi,en [Dne,hi<latexit sha1_base64="tUwh6HHQXAg003JxSttpQv2fWQE=">AAAF2HichVTLbtNAFHVpAiW8WliyuaKqlAgTxQUJJFSpLanEgqIimrYiTqzxZNJM45c848qWOxILEGLLp7HjJ/gGxo+G2HXTkSxdzz3n3nOPx2N6FmW80/mzdGu5Vr99Z+Vu4979Bw8fra49PmJu4GPSw67l+icmYsSiDulxyi1y4vkE2aZFjs3puyR/fE58Rl3nkEceGdjo1KFjihGXW8ba8t8N3UZ8gpEVdwVsgR5DM1SjlkFBF0ZMtzS13VY/isZ/3L4YsgwZDplxluLOMty+wUpIniGjITemKXJ6ieTzyK4YxrptumGMglBcCgnU85KQA0nymtFF2GpsyO6gM2pDQdlc0Z2Z0GaqVAXdRH4cCeOsdb1oaYOOAw8KZWQ+o84BP4imzieEo5Zs8gJ0yz0FqQ0uIFE3B7scTXR39io5YcKB5+Bcw/t8WEnLJVWRd2bOZ6hQJONHiQvF0flNo/NqSbvVkpLp84atmwpX23y1F3c5ssQCs9N3eeZHyNBmGVl/JmXI5zGbJf/kIblIjnJrwbdlgW3ETOUi02ua8Z50Mf1TSmewK3KggD5AQehbkDpg0ChgsxmJozpElF0q5Cd0QX5CVeIsyDsk4xur6512J11wNdDyYF3J14Gx+lsfuTiwicOxhRjrax2PD2Lkc4otIhp6wIiH8BSdkr4MHWQTNojTi0nAhtwZwdj15eNwSHfnGTGyGYtsUyITyaycSzarcv2Aj98MYup4AScOzhqNAwu4C8ktByPqE8ytSAYI+1RqBTxBPsJc3oUNaYJWHvlqcLTZ1l62Nz+9Wt/eze1YUZ4qz5SmoimvlW3lvXKg9BRc69Xi2rfa9/qX+tf6j/rPDHprKec8UQqr/usfSBTqkQ==</latexit>

Johnson et al. “Google’s multilingual NMT system…” ACL 2017 Aharoni et al. “Massively multilingual NMT” ACL 2019

Page 32: Low Resource Machine Translation - Stanford University

Conclusion so far…• Assuming no domain effect, there are lots of training paradigms

depending on the available data.

• It is hard to predict what works best.

• In general, DAE pretraining, (iterative) BT and multi-lingual training perform strongly on low resource languages.

• All these methods can be combined together, but it requires some level of craftsmanship…

• Final touch: ensembling, fine-tuning, distillation, etc.32

amount of data

domain

language pair

Page 33: Low Resource Machine Translation - Stanford University

Open Challenges

• Diversity of domains and varying translation quality.

• Wildly varying dataset sizes.

• Diversity of language pairs.

• Training large models to handle large datasets, as soon as we put together lots of language pairs and monolingual data.

33

Page 34: Low Resource Machine Translation - Stanford University

Case Study #1: Unsupervised MTEnglish French

TEST

monomono

DATA

Mt = {ytk}k=1,..,Mt<latexit sha1_base64="WPPS0Z7qvI346bxZ6Qp8zIQogqY=">AAADd3icbZJLb9NAEMc3Do8SXinc4MCIEOQIE8UFCS6VWggSB4qCRNpKcWKtN5tkE7/kXVe23P0IfDlufA8u3FjbCThpV7I0mvnNzH/G44Qu46LX+1XT6jdu3rq9d6dx9979Bw+b+49OeRBHhA5J4AbRuYM5dZlPh4IJl56HEcWe49IzZ/Uxj59d0IizwP8u0pCOPTz32YwRLJTL3q/9aFseFguC3awv4RCsDPTESDs2A0vaGTs0jW7X+Cob/7kTOeElmUy4vSy4Zcmd2Fw2qqAowXQi7FUBrjagqJbsy0lmeU6QZDhO5EZHbFzs6BiopFBPL5NOo62ag8WZB1vCKkWP/+nUC6EGWA6OslTay86u5q0tWCQOYauMipepFfCL1C2xoAJ3VJPXYLnBHJQ2uIRcXQXbjCb7x5+uzUnyHHgFfsdutnrdXvHgqmGujRZav4Hd/GlNAxJ71BfExZyPzF4oxhmOBCMuVf8i5jTEZIXndKRMH3uUj7PibiS0lWcKsyBSny+g8FYzMuxxnnqOIvNp+G4sd14XG8Vi9n6cMT+MBfVJ2WgWuyACyI8QpiyiRLipMjCJmNIKZIEjTIQ61YZagrk78lXj9KBrvukefHvbOvqwXsceeoqeIx2Z6B06Qp/RAA0Rqf3Wnmgt7YX2p/6s/rKul6hWW+c8Rluvbv4FlMEQtg==</latexit>

Ms = {xsj}j=1,..,Ms

<latexit sha1_base64="Wuso7jnamC5w7EBqWeo2cvUsqLM=">AAACx3icbZFNj9MwEIad8LWErwJHLhZVpa5URcmCBJeVFugBDouKRHdXalrLcd1dt04c2ZMqUciBv8iNC78FN+mibpeRLL2aecZ+PRNnUhgIgt+Oe+fuvfsPDh56jx4/efqs8/zFmVG5ZnzMlFT6IqaGS5HyMQiQ/CLTnCax5Ofx6tOmfr7m2giVfocy49OEXqZiIRgFmyKdP70ooXDFqKyGNT7GUYX7xaA8JAJHNanEcTjw/cHX2vuHndYz04LFzJBlgy1b7JSY2uvtktCS5QzIqiFX1yTsksN6VkVJrIqK5kV97SMfrPd8jGxT1i9/FIdez76OIyMSfMPZzqUfamJIpxv4QRP4tgi3oou2MSKdX9FcsTzhKTBJjZmEQQbTimoQTHI7htzwjLIVveQTK1OacDOtmj3UuGczc7xQ2p4UcJPd7ahoYkyZxJbcmDT7tU3yf7VJDov300qkWQ48Ze1Di1xiUHizVDwXmjOQpRWUaWG9YnZFNWVgV+/ZIYT7X74tzo788I1/9O1t9+TjdhwH6BV6jfooRO/QCfqMRmiMmDN0lo5xwP3iKnftFi3qOtuel+hGuD//Am3y1zs=</latexit>

enx

<latexit sha1_base64="jDLmMVPu3t12MXHk9cCHPRMrYik=">AAADvXicdZJdb9owFIZNso+OfdHtcjfWEBJoDJFuUqtJ1dqNSbtYJ6aVthKG1DEGDM6HYqdLlPpPblf7N3MSWIFSS5aOznlen9dHxwk4E7Ld/lsyzHv3HzzceVR+/OTps+eV3Rdnwo9CQnvE53544WBBOfNoTzLJ6UUQUuw6nJ47889Z/fyKhoL53qlMAjpw8cRjY0aw1Cl7t/SnhlwspwTztKPgIUQprMfNpGEziJSdskOr2Wo1v6vyDXeihqIg46GwZzk3K7gTW2yQsiCTobTnOTlfknKV7KhhilzHj1McxWppJGpebRjpalFQT67jRrmmu0MkmAvXnK08evzfaD132oTIwWGaKHvWuNu0HgMiUQDXntH1QroCflN1JKdU4oZu8hYi7k+g9gavYeZuBVt+TXWOv2zVxJkGvoHeHbqfp1tlC0s34jwRK7tSbbfa+YG3A2sRVMHidO3KbzTySeRSTxKOhehb7UAOUhxKRjhVZRQJGmAyxxPa16GHXSoGab59CtZ0ZgTHfqivJ2GeXVWk2BUicR1NZl8Tm7Usua3Wj+T4YJAyL4gk9UjRaBxxKH2YrTIcsZASyRMdYBIy7RWSKQ4xkXrhy3oI1uaXbwdney3rXWvvx/vq0afFOHbAK/Aa1IEF9sER+Aq6oAeI8cG4NJgxMz+a1OSmV6BGaaF5CdaO+esf4J8pbw==</latexit>Encoder Decoder

fryt ⇠ Mt

<latexit sha1_base64="oC/1YnfKTpkRP2NJh0qgsiTvklU=">AAADzHicdZJba9swFMcVe5cuu6Xb417EQiBhWYi7QftSaLcMNlhLxpq2ECVGVpREiW9YcrFR9boPuLe97pNMtpM1tx4QHM75/XX+OsgJXcZFu/2nZJgPHj56vPek/PTZ8xcvK/uvLnkQR4T2SOAG0bWDOXWZT3uCCZdehxHFnuPSK2f+Oetf3dCIs8C/EGlIBx6e+GzMCBa6ZO+X/taQh8WUYFd2FDyGSMJ60kwbNoNI2ZIdW81Wq3muynfcmRrygkyG3J7l3Kzgzmy+QYqCTIfCnufkfEmKVbKjhhJ5TpBIHCdqaSRu3mwY6WpRWE9vk0a5pqdDxJkH15ytXHr632g9d9qEyMGRTJU9a9xvWq8BkTiEa9fofiFdAb+rOhJTKnBDD3kPkRtMoPYGb2HmbgVbPk11Tr/s1CSZBr6D/j26nxc7ZQtLd2K95e2NCLtSbbfaecDtxFokVbCIrl35jUYBiT3qC+JizvtWOxQDiSPBiEtVGcWchpjM8YT2depjj/KBzD+jgjVdGcFxEOnjC5hXVxUSe5ynnqPJzCTf7GXFXb1+LMZHA8n8MBbUJ8WgcexCEcDsZ8MRiygRbqoTTCKmvUIyxREmQv//sl6Ctfnk7eTyoGV9aB38+Fg9+bRYxx54A96COrDAITgBX0EX9AAxvhmBkRipeW4KU5qqQI3SQvMarIX56x/eni+B</latexit>

p(x|y)<latexit sha1_base64="CyQwSgLN0ehm4cfS61mfQujnag8=">AAADvnicdZJda9swFIYVex9d9pVul7sRC4GEZcHuBhuMQrplsIt1ZKxpC1FiZEVJlMgfWHKxcfUnx272bybbyZqkqcBwOOd5j14dHzfkTEjL+lsxzHv3Hzw8eFR9/OTps+e1wxfnIogjQgck4EF06WJBOfPpQDLJ6WUYUey5nF64yy95/eKKRoIF/plMQzry8MxnU0aw1CnnsPKngTws5wTzrKfgMUQZbCbttOUwiJSTsWO73em0f6jqDXeqxqIkk7FwFgW3KLlTR+yQsiTTsXSWBblck3KT7Klxhjw3SDIcJ2ptJG5f7Rjpa1HYTK+TVrWhb4dIMA9uOdtoevLfaLNw2obIxVGWKmfRutu0HgMicQi32uh6Kd0Av6smknMqcUtf8hYiHsyg9gavYe5uA1s/TfVOvu7VJLkGvoH+HbpfZ3tlK0s34rKT/nm1utWxigNvB/YqqIPV6Tu132gSkNijviQcCzG0rVCOMhxJRjhVVRQLGmKyxDM61KGPPSpGWbF+CjZ0ZgKnQaQ/X8Iiu6nIsCdE6rmazN8mdmt5cl9tGMvpx1HG/DCW1CflRdOYQxnAfJfhhEWUSJ7qAJOIaa+QzHGEidQbX9VDsHeffDs4P+rY7zpHP9/Xu59X4zgAr8Br0AQ2+AC64BvogwEgxicDGwtjaXbNqemZQYkalZXmJdg6ZvIPxHwo5g==</latexit>

Lample et al. “Phrase-based and neural unsupervised MT” EMNLP 2018 Artetxe et al. “An effective approach to unsupervised MT” ACL 2019

Page 35: Low Resource Machine Translation - Stanford University

Case Study #1: Unsupervised MT

35

English FrenchTEST

monomono

DATA

Mt = {ytk}k=1,..,Mt<latexit sha1_base64="WPPS0Z7qvI346bxZ6Qp8zIQogqY=">AAADd3icbZJLb9NAEMc3Do8SXinc4MCIEOQIE8UFCS6VWggSB4qCRNpKcWKtN5tkE7/kXVe23P0IfDlufA8u3FjbCThpV7I0mvnNzH/G44Qu46LX+1XT6jdu3rq9d6dx9979Bw+b+49OeRBHhA5J4AbRuYM5dZlPh4IJl56HEcWe49IzZ/Uxj59d0IizwP8u0pCOPTz32YwRLJTL3q/9aFseFguC3awv4RCsDPTESDs2A0vaGTs0jW7X+Cob/7kTOeElmUy4vSy4Zcmd2Fw2qqAowXQi7FUBrjagqJbsy0lmeU6QZDhO5EZHbFzs6BiopFBPL5NOo62ag8WZB1vCKkWP/+nUC6EGWA6OslTay86u5q0tWCQOYauMipepFfCL1C2xoAJ3VJPXYLnBHJQ2uIRcXQXbjCb7x5+uzUnyHHgFfsdutnrdXvHgqmGujRZav4Hd/GlNAxJ71BfExZyPzF4oxhmOBCMuVf8i5jTEZIXndKRMH3uUj7PibiS0lWcKsyBSny+g8FYzMuxxnnqOIvNp+G4sd14XG8Vi9n6cMT+MBfVJ2WgWuyACyI8QpiyiRLipMjCJmNIKZIEjTIQ61YZagrk78lXj9KBrvukefHvbOvqwXsceeoqeIx2Z6B06Qp/RAA0Rqf3Wnmgt7YX2p/6s/rKul6hWW+c8Rluvbv4FlMEQtg==</latexit>

Ms = {xsj}j=1,..,Ms

<latexit sha1_base64="Wuso7jnamC5w7EBqWeo2cvUsqLM=">AAACx3icbZFNj9MwEIad8LWErwJHLhZVpa5URcmCBJeVFugBDouKRHdXalrLcd1dt04c2ZMqUciBv8iNC78FN+mibpeRLL2aecZ+PRNnUhgIgt+Oe+fuvfsPDh56jx4/efqs8/zFmVG5ZnzMlFT6IqaGS5HyMQiQ/CLTnCax5Ofx6tOmfr7m2giVfocy49OEXqZiIRgFmyKdP70ooXDFqKyGNT7GUYX7xaA8JAJHNanEcTjw/cHX2vuHndYz04LFzJBlgy1b7JSY2uvtktCS5QzIqiFX1yTsksN6VkVJrIqK5kV97SMfrPd8jGxT1i9/FIdez76OIyMSfMPZzqUfamJIpxv4QRP4tgi3oou2MSKdX9FcsTzhKTBJjZmEQQbTimoQTHI7htzwjLIVveQTK1OacDOtmj3UuGczc7xQ2p4UcJPd7ahoYkyZxJbcmDT7tU3yf7VJDov300qkWQ48Ze1Di1xiUHizVDwXmjOQpRWUaWG9YnZFNWVgV+/ZIYT7X74tzo788I1/9O1t9+TjdhwH6BV6jfooRO/QCfqMRmiMmDN0lo5xwP3iKnftFi3qOtuel+hGuD//Am3y1zs=</latexit>

Encoder

en frCross-Entropy

Losspredictioninput sentenceDecoder

NMT system

x<latexit sha1_base64="jDLmMVPu3t12MXHk9cCHPRMrYik=">AAADvXicdZJdb9owFIZNso+OfdHtcjfWEBJoDJFuUqtJ1dqNSbtYJ6aVthKG1DEGDM6HYqdLlPpPblf7N3MSWIFSS5aOznlen9dHxwk4E7Ld/lsyzHv3HzzceVR+/OTps+eV3Rdnwo9CQnvE53544WBBOfNoTzLJ6UUQUuw6nJ47889Z/fyKhoL53qlMAjpw8cRjY0aw1Cl7t/SnhlwspwTztKPgIUQprMfNpGEziJSdskOr2Wo1v6vyDXeihqIg46GwZzk3K7gTW2yQsiCTobTnOTlfknKV7KhhilzHj1McxWppJGpebRjpalFQT67jRrmmu0MkmAvXnK08evzfaD132oTIwWGaKHvWuNu0HgMiUQDXntH1QroCflN1JKdU4oZu8hYi7k+g9gavYeZuBVt+TXWOv2zVxJkGvoHeHbqfp1tlC0s34jwRK7tSbbfa+YG3A2sRVMHidO3KbzTySeRSTxKOhehb7UAOUhxKRjhVZRQJGmAyxxPa16GHXSoGab59CtZ0ZgTHfqivJ2GeXVWk2BUicR1NZl8Tm7Usua3Wj+T4YJAyL4gk9UjRaBxxKH2YrTIcsZASyRMdYBIy7RWSKQ4xkXrhy3oI1uaXbwdney3rXWvvx/vq0afFOHbAK/Aa1IEF9sER+Aq6oAeI8cG4NJgxMz+a1OSmV6BGaaF5CdaO+esf4J8pbw==</latexit>Encoder Decoder

fryt ⇠ Mt

<latexit sha1_base64="oC/1YnfKTpkRP2NJh0qgsiTvklU=">AAADzHicdZJba9swFMcVe5cuu6Xb417EQiBhWYi7QftSaLcMNlhLxpq2ECVGVpREiW9YcrFR9boPuLe97pNMtpM1tx4QHM75/XX+OsgJXcZFu/2nZJgPHj56vPek/PTZ8xcvK/uvLnkQR4T2SOAG0bWDOXWZT3uCCZdehxHFnuPSK2f+Oetf3dCIs8C/EGlIBx6e+GzMCBa6ZO+X/taQh8WUYFd2FDyGSMJ60kwbNoNI2ZIdW81Wq3muynfcmRrygkyG3J7l3Kzgzmy+QYqCTIfCnufkfEmKVbKjhhJ5TpBIHCdqaSRu3mwY6WpRWE9vk0a5pqdDxJkH15ytXHr632g9d9qEyMGRTJU9a9xvWq8BkTiEa9fofiFdAb+rOhJTKnBDD3kPkRtMoPYGb2HmbgVbPk11Tr/s1CSZBr6D/j26nxc7ZQtLd2K95e2NCLtSbbfaecDtxFokVbCIrl35jUYBiT3qC+JizvtWOxQDiSPBiEtVGcWchpjM8YT2depjj/KBzD+jgjVdGcFxEOnjC5hXVxUSe5ynnqPJzCTf7GXFXb1+LMZHA8n8MBbUJ8WgcexCEcDsZ8MRiygRbqoTTCKmvUIyxREmQv//sl6Ctfnk7eTyoGV9aB38+Fg9+bRYxx54A96COrDAITgBX0EX9AAxvhmBkRipeW4KU5qqQI3SQvMarIX56x/eni+B</latexit>

backward NMT system

p(x|y)<latexit sha1_base64="CyQwSgLN0ehm4cfS61mfQujnag8=">AAADvnicdZJda9swFIYVex9d9pVul7sRC4GEZcHuBhuMQrplsIt1ZKxpC1FiZEVJlMgfWHKxcfUnx272bybbyZqkqcBwOOd5j14dHzfkTEjL+lsxzHv3Hzw8eFR9/OTps+e1wxfnIogjQgck4EF06WJBOfPpQDLJ6WUYUey5nF64yy95/eKKRoIF/plMQzry8MxnU0aw1CnnsPKngTws5wTzrKfgMUQZbCbttOUwiJSTsWO73em0f6jqDXeqxqIkk7FwFgW3KLlTR+yQsiTTsXSWBblck3KT7Klxhjw3SDIcJ2ptJG5f7Rjpa1HYTK+TVrWhb4dIMA9uOdtoevLfaLNw2obIxVGWKmfRutu0HgMicQi32uh6Kd0Av6smknMqcUtf8hYiHsyg9gavYe5uA1s/TfVOvu7VJLkGvoH+HbpfZ3tlK0s34rKT/nm1utWxigNvB/YqqIPV6Tu132gSkNijviQcCzG0rVCOMhxJRjhVVRQLGmKyxDM61KGPPSpGWbF+CjZ0ZgKnQaQ/X8Iiu6nIsCdE6rmazN8mdmt5cl9tGMvpx1HG/DCW1CflRdOYQxnAfJfhhEWUSJ7qAJOIaa+QzHGEidQbX9VDsHeffDs4P+rY7zpHP9/Xu59X4zgAr8Br0AQ2+AC64BvogwEgxicDGwtjaXbNqemZQYkalZXmJdg6ZvIPxHwo5g==</latexit>

p(y|x)<latexit sha1_base64="dBHzK3A8/hcNmG8nE+hfv1ikGd4=">AAAECnichZNNb9NAEIa3Nh8lfKVw5DIiipSIEMUFCS6VGggSB4qCaNpKdWKtN5tkE3/Ju65suXvmwl/hwgGEuPILuPFvWNtJSdJUrGRpNPO8O++MvHbgMC5arT9bmn7t+o2b27dKt+/cvXe/vPPgiPtRSGiP+I4fntiYU4d5tCeYcOhJEFLs2g49tmevs/rxGQ05871DkQS07+Kxx0aMYKFS1o4GVdPFYkKwk3Yk7IGZQi1uJHWLgSmtlO0ZjWaz8V6W/nEHcsALMh5wa5pz04I7sPgaKQoyGQhrlpOzBSmWyY4cpKZr+3GKo1gujESNszUjXVkKasl5XC9VVXMwOXNhxdjSne0Ln7XcaANMG4dpIq1p/WrPagsmiQJYuUbVC+kS+E7WTDGhAtdVk6dgOv4YlDc4h8zdEraYTHbabzZq4kwDT8C7QvfxcKNsbmmTuH2x+IKKZTZ+km1hdXTxv9GFVa60mq38wOXAmAcVND9dq/zbHPokcqkniIM5PzVageinOBSMOFSWzIjTAJMZHtNTFXrYpbyf5r+yhKrKDGHkh+rzBOTZZUWKXc4T11ZkZpKv17LkptppJEYv+ynzgkhQjxSNRpEDwofsXcCQhZQIJ1EBJiFTXoFMcIiJUK+npJZgrI98OTjabRrPmrsfnlf2X83XsY0eoceohgz0Au2jt6iLeohon7Qv2jftu/5Z/6r/0H8WqLY11zxEK0f/9RdRQUZC</latexit>

…and vice versa starting from English.This is an example of auto-encoding or cycle consistency.

Problem: lack of constrained on x<latexit sha1_base64="jDLmMVPu3t12MXHk9cCHPRMrYik=">AAADvXicdZJdb9owFIZNso+OfdHtcjfWEBJoDJFuUqtJ1dqNSbtYJ6aVthKG1DEGDM6HYqdLlPpPblf7N3MSWIFSS5aOznlen9dHxwk4E7Ld/lsyzHv3HzzceVR+/OTps+eV3Rdnwo9CQnvE53544WBBOfNoTzLJ6UUQUuw6nJ47889Z/fyKhoL53qlMAjpw8cRjY0aw1Cl7t/SnhlwspwTztKPgIUQprMfNpGEziJSdskOr2Wo1v6vyDXeihqIg46GwZzk3K7gTW2yQsiCTobTnOTlfknKV7KhhilzHj1McxWppJGpebRjpalFQT67jRrmmu0MkmAvXnK08evzfaD132oTIwWGaKHvWuNu0HgMiUQDXntH1QroCflN1JKdU4oZu8hYi7k+g9gavYeZuBVt+TXWOv2zVxJkGvoHeHbqfp1tlC0s34jwRK7tSbbfa+YG3A2sRVMHidO3KbzTySeRSTxKOhehb7UAOUhxKRjhVZRQJGmAyxxPa16GHXSoGab59CtZ0ZgTHfqivJ2GeXVWk2BUicR1NZl8Tm7Usua3Wj+T4YJAyL4gk9UjRaBxxKH2YrTIcsZASyRMdYBIy7RWSKQ4xkXrhy3oI1uaXbwdney3rXWvvx/vq0afFOHbAK/Aa1IEF9sER+Aq6oAeI8cG4NJgxMz+a1OSmV6BGaaF5CdaO+esf4J8pbw==</latexit>

Page 36: Low Resource Machine Translation - Stanford University

Case Study #1: Unsupervised MTEnglish French

TEST

monomono

DATA

Mt = {ytk}k=1,..,Mt<latexit sha1_base64="WPPS0Z7qvI346bxZ6Qp8zIQogqY=">AAADd3icbZJLb9NAEMc3Do8SXinc4MCIEOQIE8UFCS6VWggSB4qCRNpKcWKtN5tkE7/kXVe23P0IfDlufA8u3FjbCThpV7I0mvnNzH/G44Qu46LX+1XT6jdu3rq9d6dx9979Bw+b+49OeRBHhA5J4AbRuYM5dZlPh4IJl56HEcWe49IzZ/Uxj59d0IizwP8u0pCOPTz32YwRLJTL3q/9aFseFguC3awv4RCsDPTESDs2A0vaGTs0jW7X+Cob/7kTOeElmUy4vSy4Zcmd2Fw2qqAowXQi7FUBrjagqJbsy0lmeU6QZDhO5EZHbFzs6BiopFBPL5NOo62ag8WZB1vCKkWP/+nUC6EGWA6OslTay86u5q0tWCQOYauMipepFfCL1C2xoAJ3VJPXYLnBHJQ2uIRcXQXbjCb7x5+uzUnyHHgFfsdutnrdXvHgqmGujRZav4Hd/GlNAxJ71BfExZyPzF4oxhmOBCMuVf8i5jTEZIXndKRMH3uUj7PibiS0lWcKsyBSny+g8FYzMuxxnnqOIvNp+G4sd14XG8Vi9n6cMT+MBfVJ2WgWuyACyI8QpiyiRLipMjCJmNIKZIEjTIQ61YZagrk78lXj9KBrvukefHvbOvqwXsceeoqeIx2Z6B06Qp/RAA0Rqf3Wnmgt7YX2p/6s/rKul6hWW+c8Rluvbv4FlMEQtg==</latexit>

Ms = {xsj}j=1,..,Ms

<latexit sha1_base64="Wuso7jnamC5w7EBqWeo2cvUsqLM=">AAACx3icbZFNj9MwEIad8LWErwJHLhZVpa5URcmCBJeVFugBDouKRHdXalrLcd1dt04c2ZMqUciBv8iNC78FN+mibpeRLL2aecZ+PRNnUhgIgt+Oe+fuvfsPDh56jx4/efqs8/zFmVG5ZnzMlFT6IqaGS5HyMQiQ/CLTnCax5Ofx6tOmfr7m2giVfocy49OEXqZiIRgFmyKdP70ooXDFqKyGNT7GUYX7xaA8JAJHNanEcTjw/cHX2vuHndYz04LFzJBlgy1b7JSY2uvtktCS5QzIqiFX1yTsksN6VkVJrIqK5kV97SMfrPd8jGxT1i9/FIdez76OIyMSfMPZzqUfamJIpxv4QRP4tgi3oou2MSKdX9FcsTzhKTBJjZmEQQbTimoQTHI7htzwjLIVveQTK1OacDOtmj3UuGczc7xQ2p4UcJPd7ahoYkyZxJbcmDT7tU3yf7VJDov300qkWQ48Ze1Di1xiUHizVDwXmjOQpRWUaWG9YnZFNWVgV+/ZIYT7X74tzo788I1/9O1t9+TjdhwH6BV6jfooRO/QCfqMRmiMmDN0lo5xwP3iKnftFi3qOtuel+hGuD//Am3y1zs=</latexit>

Encoder

en frCross-Entropy

Losspredictioninput sentenceDecoder

NMT system

x<latexit sha1_base64="jDLmMVPu3t12MXHk9cCHPRMrYik=">AAADvXicdZJdb9owFIZNso+OfdHtcjfWEBJoDJFuUqtJ1dqNSbtYJ6aVthKG1DEGDM6HYqdLlPpPblf7N3MSWIFSS5aOznlen9dHxwk4E7Ld/lsyzHv3HzzceVR+/OTps+eV3Rdnwo9CQnvE53544WBBOfNoTzLJ6UUQUuw6nJ47889Z/fyKhoL53qlMAjpw8cRjY0aw1Cl7t/SnhlwspwTztKPgIUQprMfNpGEziJSdskOr2Wo1v6vyDXeihqIg46GwZzk3K7gTW2yQsiCTobTnOTlfknKV7KhhilzHj1McxWppJGpebRjpalFQT67jRrmmu0MkmAvXnK08evzfaD132oTIwWGaKHvWuNu0HgMiUQDXntH1QroCflN1JKdU4oZu8hYi7k+g9gavYeZuBVt+TXWOv2zVxJkGvoHeHbqfp1tlC0s34jwRK7tSbbfa+YG3A2sRVMHidO3KbzTySeRSTxKOhehb7UAOUhxKRjhVZRQJGmAyxxPa16GHXSoGab59CtZ0ZgTHfqivJ2GeXVWk2BUicR1NZl8Tm7Usua3Wj+T4YJAyL4gk9UjRaBxxKH2YrTIcsZASyRMdYBIy7RWSKQ4xkXrhy3oI1uaXbwdney3rXWvvx/vq0afFOHbAK/Aa1IEF9sER+Aq6oAeI8cG4NJgxMz+a1OSmV6BGaaF5CdaO+esf4J8pbw==</latexit>Encoder Decoder

fryt ⇠ Mt

<latexit sha1_base64="oC/1YnfKTpkRP2NJh0qgsiTvklU=">AAADzHicdZJba9swFMcVe5cuu6Xb417EQiBhWYi7QftSaLcMNlhLxpq2ECVGVpREiW9YcrFR9boPuLe97pNMtpM1tx4QHM75/XX+OsgJXcZFu/2nZJgPHj56vPek/PTZ8xcvK/uvLnkQR4T2SOAG0bWDOXWZT3uCCZdehxHFnuPSK2f+Oetf3dCIs8C/EGlIBx6e+GzMCBa6ZO+X/taQh8WUYFd2FDyGSMJ60kwbNoNI2ZIdW81Wq3muynfcmRrygkyG3J7l3Kzgzmy+QYqCTIfCnufkfEmKVbKjhhJ5TpBIHCdqaSRu3mwY6WpRWE9vk0a5pqdDxJkH15ytXHr632g9d9qEyMGRTJU9a9xvWq8BkTiEa9fofiFdAb+rOhJTKnBDD3kPkRtMoPYGb2HmbgVbPk11Tr/s1CSZBr6D/j26nxc7ZQtLd2K95e2NCLtSbbfaecDtxFokVbCIrl35jUYBiT3qC+JizvtWOxQDiSPBiEtVGcWchpjM8YT2depjj/KBzD+jgjVdGcFxEOnjC5hXVxUSe5ynnqPJzCTf7GXFXb1+LMZHA8n8MBbUJ8WgcexCEcDsZ8MRiygRbqoTTCKmvUIyxREmQv//sl6Ctfnk7eTyoGV9aB38+Fg9+bRYxx54A96COrDAITgBX0EX9AAxvhmBkRipeW4KU5qqQI3SQvMarIX56x/eni+B</latexit>

backward NMT system

p(x|y)<latexit sha1_base64="CyQwSgLN0ehm4cfS61mfQujnag8=">AAADvnicdZJda9swFIYVex9d9pVul7sRC4GEZcHuBhuMQrplsIt1ZKxpC1FiZEVJlMgfWHKxcfUnx272bybbyZqkqcBwOOd5j14dHzfkTEjL+lsxzHv3Hzw8eFR9/OTps+e1wxfnIogjQgck4EF06WJBOfPpQDLJ6WUYUey5nF64yy95/eKKRoIF/plMQzry8MxnU0aw1CnnsPKngTws5wTzrKfgMUQZbCbttOUwiJSTsWO73em0f6jqDXeqxqIkk7FwFgW3KLlTR+yQsiTTsXSWBblck3KT7Klxhjw3SDIcJ2ptJG5f7Rjpa1HYTK+TVrWhb4dIMA9uOdtoevLfaLNw2obIxVGWKmfRutu0HgMicQi32uh6Kd0Av6smknMqcUtf8hYiHsyg9gavYe5uA1s/TfVOvu7VJLkGvoH+HbpfZ3tlK0s34rKT/nm1utWxigNvB/YqqIPV6Tu132gSkNijviQcCzG0rVCOMhxJRjhVVRQLGmKyxDM61KGPPSpGWbF+CjZ0ZgKnQaQ/X8Iiu6nIsCdE6rmazN8mdmt5cl9tGMvpx1HG/DCW1CflRdOYQxnAfJfhhEWUSJ7qAJOIaa+QzHGEidQbX9VDsHeffDs4P+rY7zpHP9/Xu59X4zgAr8Br0AQ2+AC64BvogwEgxicDGwtjaXbNqemZQYkalZXmJdg6ZvIPxHwo5g==</latexit>

p(y|x)<latexit sha1_base64="dBHzK3A8/hcNmG8nE+hfv1ikGd4=">AAAECnichZNNb9NAEIa3Nh8lfKVw5DIiipSIEMUFCS6VGggSB4qCaNpKdWKtN5tkE3/Ju65suXvmwl/hwgGEuPILuPFvWNtJSdJUrGRpNPO8O++MvHbgMC5arT9bmn7t+o2b27dKt+/cvXe/vPPgiPtRSGiP+I4fntiYU4d5tCeYcOhJEFLs2g49tmevs/rxGQ05871DkQS07+Kxx0aMYKFS1o4GVdPFYkKwk3Yk7IGZQi1uJHWLgSmtlO0ZjWaz8V6W/nEHcsALMh5wa5pz04I7sPgaKQoyGQhrlpOzBSmWyY4cpKZr+3GKo1gujESNszUjXVkKasl5XC9VVXMwOXNhxdjSne0Ln7XcaANMG4dpIq1p/WrPagsmiQJYuUbVC+kS+E7WTDGhAtdVk6dgOv4YlDc4h8zdEraYTHbabzZq4kwDT8C7QvfxcKNsbmmTuH2x+IKKZTZ+km1hdXTxv9GFVa60mq38wOXAmAcVND9dq/zbHPokcqkniIM5PzVageinOBSMOFSWzIjTAJMZHtNTFXrYpbyf5r+yhKrKDGHkh+rzBOTZZUWKXc4T11ZkZpKv17LkptppJEYv+ynzgkhQjxSNRpEDwofsXcCQhZQIJ1EBJiFTXoFMcIiJUK+npJZgrI98OTjabRrPmrsfnlf2X83XsY0eoceohgz0Au2jt6iLeohon7Qv2jftu/5Z/6r/0H8WqLY11zxEK0f/9RdRQUZC</latexit>

Problem: lack of modularity. Decoder may behave differently when fed with representations from French

encoder VS English encoder.

enCross-Entropy

Losspredictioninput sentence+

enEncoder Decoder

xs ⇠ Ms<latexit sha1_base64="+D2rKjCVJUv9t1Da29dPHSAUntk=">AAACx3icbZFNbxMxEIa9y0fL8hXgyMUiqpRK0Wq3INFLpQI5wKEoSKStlE0sr+O0TrwfsmejXZk98Be5ceG34OymIk0ZydKrmWc8rz1xLoWGIPjtuPfuP3i4t//Ie/zk6bPnnRcvz3VWKMZHLJOZuoyp5lKkfAQCJL/MFadJLPlFvPy0rl+suNIiS79DlfNJQq9SMReMgk2Rzp+DKKFwzag0gxqf4MjgXtmvDonAUU2MOAn7vt//Wnv/uLN6qluynGqyaLhFy50RvUNCS1ZTIMuGXN6QsE0O6qmJkjgrDS3K+sZI0V/tGBnaprxX/SgPPTscR1ok+JaxrTs/1ESTTjfwgybwXRFuRBdtYkg6v6JZxoqEp8Ak1XocBjlMDFUgmOS1FxWa55Qt6RUfW5nShOuJafZQ4wObmeF5puxJATfZ7Q5DE62rJLbk2qTera2T/6uNC5gfT4xI8wJ4ytpB80JiyPB6qXgmFGcgKysoU8J6xeyaKsrArt6znxDuPvmuOD/yw7f+0bd33dOPm+/YR6/RG9RDIXqPTtFnNEQjxJyBs3C0A+4XN3NXbtmirrPpeYVuhfvzL1Ju1zs=</latexit>

DAE makes sure decoder outputs fluently in the desired language.

n<latexit sha1_base64="EpqUFrDe+QYGhdFpOA+JzvOejPY=">AAADt3icdZJdb9owFIZNso+OfZR2l7uxhpBAY4h0lbZeVGo3Ju1inZhW2moYIscYMDgfi50qUeqfuJvd7d/MSWAFSi1ZOjrneX1eHx0n4EzIdvtvyTAfPHz0eOdJ+emz5y92K3v7F8KPQkJ7xOd+eOVgQTnzaE8yyelVEFLsOpxeOvNPWf3ymoaC+d65TAI6cPHEY2NGsNQpe6/0u4ZcLKcE87Sj4DFEKazHzaRhM4iUnbJjq9lqNb+p8i13poaiIOOhsGc5Nyu4M1tskLIgk6G05zk5X5JyleyoYYpcx49THMVqaSRqXm8Y6WpRUE9u4ka5prtDJJgL15ytPHr632g9d9qEyMFhmih71rjftB4DIlEA157R9UK6An5VdSSnVOKGbvIWIu5PoPYGb2DmbgVbfk11Tj9v1cSZBr6B3j26H+dbZQtLt2LPrlTbrXZ+4N3AWgRVsDhdu/IHjXwSudSThGMh+lY7kIMUh5IRTlUZRYIGmMzxhPZ16GGXikGa752CNZ0ZwbEf6utJmGdXFSl2hUhcR5PZp8RmLUtuq/UjOf4wSJkXRJJ6pGg0jjiUPsyWGI5YSInkiQ4wCZn2CskUh5hIveplPQRr88t3g4uDlvWudfD9sHrycTGOHfAKvAZ1YIH34AR8AV3QA8Q4NH4axBiZR6Ztjs1pgRqlheYlWDvmr39lNyag</latexit> NMT system

Lample et al. “Phrase-based and neural unsupervised MT” EMNLP 2018 Artetxe et al. “An effective approach to unsupervised MT” ACL 2019

Page 37: Low Resource Machine Translation - Stanford University

Case Study #1: Unsupervised MT

37

English FrenchTEST

monomono

DATA

Mt = {ytk}k=1,..,Mt<latexit sha1_base64="WPPS0Z7qvI346bxZ6Qp8zIQogqY=">AAADd3icbZJLb9NAEMc3Do8SXinc4MCIEOQIE8UFCS6VWggSB4qCRNpKcWKtN5tkE7/kXVe23P0IfDlufA8u3FjbCThpV7I0mvnNzH/G44Qu46LX+1XT6jdu3rq9d6dx9979Bw+b+49OeRBHhA5J4AbRuYM5dZlPh4IJl56HEcWe49IzZ/Uxj59d0IizwP8u0pCOPTz32YwRLJTL3q/9aFseFguC3awv4RCsDPTESDs2A0vaGTs0jW7X+Cob/7kTOeElmUy4vSy4Zcmd2Fw2qqAowXQi7FUBrjagqJbsy0lmeU6QZDhO5EZHbFzs6BiopFBPL5NOo62ag8WZB1vCKkWP/+nUC6EGWA6OslTay86u5q0tWCQOYauMipepFfCL1C2xoAJ3VJPXYLnBHJQ2uIRcXQXbjCb7x5+uzUnyHHgFfsdutnrdXvHgqmGujRZav4Hd/GlNAxJ71BfExZyPzF4oxhmOBCMuVf8i5jTEZIXndKRMH3uUj7PibiS0lWcKsyBSny+g8FYzMuxxnnqOIvNp+G4sd14XG8Vi9n6cMT+MBfVJ2WgWuyACyI8QpiyiRLipMjCJmNIKZIEjTIQ61YZagrk78lXj9KBrvukefHvbOvqwXsceeoqeIx2Z6B06Qp/RAA0Rqf3Wnmgt7YX2p/6s/rKul6hWW+c8Rluvbv4FlMEQtg==</latexit>

Ms = {xsj}j=1,..,Ms

<latexit sha1_base64="Wuso7jnamC5w7EBqWeo2cvUsqLM=">AAACx3icbZFNj9MwEIad8LWErwJHLhZVpa5URcmCBJeVFugBDouKRHdXalrLcd1dt04c2ZMqUciBv8iNC78FN+mibpeRLL2aecZ+PRNnUhgIgt+Oe+fuvfsPDh56jx4/efqs8/zFmVG5ZnzMlFT6IqaGS5HyMQiQ/CLTnCax5Ofx6tOmfr7m2giVfocy49OEXqZiIRgFmyKdP70ooXDFqKyGNT7GUYX7xaA8JAJHNanEcTjw/cHX2vuHndYz04LFzJBlgy1b7JSY2uvtktCS5QzIqiFX1yTsksN6VkVJrIqK5kV97SMfrPd8jGxT1i9/FIdez76OIyMSfMPZzqUfamJIpxv4QRP4tgi3oou2MSKdX9FcsTzhKTBJjZmEQQbTimoQTHI7htzwjLIVveQTK1OacDOtmj3UuGczc7xQ2p4UcJPd7ahoYkyZxJbcmDT7tU3yf7VJDov300qkWQ48Ze1Di1xiUHizVDwXmjOQpRWUaWG9YnZFNWVgV+/ZIYT7X74tzo788I1/9O1t9+TjdhwH6BV6jfooRO/QCfqMRmiMmDN0lo5xwP3iKnftFi3qOtuel+hGuD//Am3y1zs=</latexit>

Encoder

en frCross-Entropy

Losspredictioninput sentenceDecoder

NMT system

x<latexit sha1_base64="jDLmMVPu3t12MXHk9cCHPRMrYik=">AAADvXicdZJdb9owFIZNso+OfdHtcjfWEBJoDJFuUqtJ1dqNSbtYJ6aVthKG1DEGDM6HYqdLlPpPblf7N3MSWIFSS5aOznlen9dHxwk4E7Ld/lsyzHv3HzzceVR+/OTps+eV3Rdnwo9CQnvE53544WBBOfNoTzLJ6UUQUuw6nJ47889Z/fyKhoL53qlMAjpw8cRjY0aw1Cl7t/SnhlwspwTztKPgIUQprMfNpGEziJSdskOr2Wo1v6vyDXeihqIg46GwZzk3K7gTW2yQsiCTobTnOTlfknKV7KhhilzHj1McxWppJGpebRjpalFQT67jRrmmu0MkmAvXnK08evzfaD132oTIwWGaKHvWuNu0HgMiUQDXntH1QroCflN1JKdU4oZu8hYi7k+g9gavYeZuBVt+TXWOv2zVxJkGvoHeHbqfp1tlC0s34jwRK7tSbbfa+YG3A2sRVMHidO3KbzTySeRSTxKOhehb7UAOUhxKRjhVZRQJGmAyxxPa16GHXSoGab59CtZ0ZgTHfqivJ2GeXVWk2BUicR1NZl8Tm7Usua3Wj+T4YJAyL4gk9UjRaBxxKH2YrTIcsZASyRMdYBIy7RWSKQ4xkXrhy3oI1uaXbwdney3rXWvvx/vq0afFOHbAK/Aa1IEF9sER+Aq6oAeI8cG4NJgxMz+a1OSmV6BGaaF5CdaO+esf4J8pbw==</latexit>Encoder Decoder

fryt ⇠ Mt

<latexit sha1_base64="oC/1YnfKTpkRP2NJh0qgsiTvklU=">AAADzHicdZJba9swFMcVe5cuu6Xb417EQiBhWYi7QftSaLcMNlhLxpq2ECVGVpREiW9YcrFR9boPuLe97pNMtpM1tx4QHM75/XX+OsgJXcZFu/2nZJgPHj56vPek/PTZ8xcvK/uvLnkQR4T2SOAG0bWDOXWZT3uCCZdehxHFnuPSK2f+Oetf3dCIs8C/EGlIBx6e+GzMCBa6ZO+X/taQh8WUYFd2FDyGSMJ60kwbNoNI2ZIdW81Wq3muynfcmRrygkyG3J7l3Kzgzmy+QYqCTIfCnufkfEmKVbKjhhJ5TpBIHCdqaSRu3mwY6WpRWE9vk0a5pqdDxJkH15ytXHr632g9d9qEyMGRTJU9a9xvWq8BkTiEa9fofiFdAb+rOhJTKnBDD3kPkRtMoPYGb2HmbgVbPk11Tr/s1CSZBr6D/j26nxc7ZQtLd2K95e2NCLtSbbfaecDtxFokVbCIrl35jUYBiT3qC+JizvtWOxQDiSPBiEtVGcWchpjM8YT2depjj/KBzD+jgjVdGcFxEOnjC5hXVxUSe5ynnqPJzCTf7GXFXb1+LMZHA8n8MBbUJ8WgcexCEcDsZ8MRiygRbqoTTCKmvUIyxREmQv//sl6Ctfnk7eTyoGV9aB38+Fg9+bRYxx54A96COrDAITgBX0EX9AAxvhmBkRipeW4KU5qqQI3SQvMarIX56x/eni+B</latexit>

backward NMT system

p(x|y)<latexit sha1_base64="CyQwSgLN0ehm4cfS61mfQujnag8=">AAADvnicdZJda9swFIYVex9d9pVul7sRC4GEZcHuBhuMQrplsIt1ZKxpC1FiZEVJlMgfWHKxcfUnx272bybbyZqkqcBwOOd5j14dHzfkTEjL+lsxzHv3Hzw8eFR9/OTps+e1wxfnIogjQgck4EF06WJBOfPpQDLJ6WUYUey5nF64yy95/eKKRoIF/plMQzry8MxnU0aw1CnnsPKngTws5wTzrKfgMUQZbCbttOUwiJSTsWO73em0f6jqDXeqxqIkk7FwFgW3KLlTR+yQsiTTsXSWBblck3KT7Klxhjw3SDIcJ2ptJG5f7Rjpa1HYTK+TVrWhb4dIMA9uOdtoevLfaLNw2obIxVGWKmfRutu0HgMicQi32uh6Kd0Av6smknMqcUtf8hYiHsyg9gavYe5uA1s/TfVOvu7VJLkGvoH+HbpfZ3tlK0s34rKT/nm1utWxigNvB/YqqIPV6Tu132gSkNijviQcCzG0rVCOMhxJRjhVVRQLGmKyxDM61KGPPSpGWbF+CjZ0ZgKnQaQ/X8Iiu6nIsCdE6rmazN8mdmt5cl9tGMvpx1HG/DCW1CflRdOYQxnAfJfhhEWUSJ7qAJOIaa+QzHGEidQbX9VDsHeffDs4P+rY7zpHP9/Xu59X4zgAr8Br0AQ2+AC64BvogwEgxicDGwtjaXbNqemZQYkalZXmJdg6ZvIPxHwo5g==</latexit>

p(y|x)<latexit sha1_base64="dBHzK3A8/hcNmG8nE+hfv1ikGd4=">AAAECnichZNNb9NAEIa3Nh8lfKVw5DIiipSIEMUFCS6VGggSB4qCaNpKdWKtN5tkE3/Ju65suXvmwl/hwgGEuPILuPFvWNtJSdJUrGRpNPO8O++MvHbgMC5arT9bmn7t+o2b27dKt+/cvXe/vPPgiPtRSGiP+I4fntiYU4d5tCeYcOhJEFLs2g49tmevs/rxGQ05871DkQS07+Kxx0aMYKFS1o4GVdPFYkKwk3Yk7IGZQi1uJHWLgSmtlO0ZjWaz8V6W/nEHcsALMh5wa5pz04I7sPgaKQoyGQhrlpOzBSmWyY4cpKZr+3GKo1gujESNszUjXVkKasl5XC9VVXMwOXNhxdjSne0Ln7XcaANMG4dpIq1p/WrPagsmiQJYuUbVC+kS+E7WTDGhAtdVk6dgOv4YlDc4h8zdEraYTHbabzZq4kwDT8C7QvfxcKNsbmmTuH2x+IKKZTZ+km1hdXTxv9GFVa60mq38wOXAmAcVND9dq/zbHPokcqkniIM5PzVageinOBSMOFSWzIjTAJMZHtNTFXrYpbyf5r+yhKrKDGHkh+rzBOTZZUWKXc4T11ZkZpKv17LkptppJEYv+ynzgkhQjxSNRpEDwofsXcCQhZQIJ1EBJiFTXoFMcIiJUK+npJZgrI98OTjabRrPmrsfnlf2X83XsY0eoceohgz0Au2jt6iLeohon7Qv2jftu/5Z/6r/0H8WqLY11zxEK0f/9RdRQUZC</latexit>

enCross-Entropy

Losspredictioninput sentence+

n<latexit sha1_base64="EpqUFrDe+QYGhdFpOA+JzvOejPY=">AAADt3icdZJdb9owFIZNso+OfZR2l7uxhpBAY4h0lbZeVGo3Ju1inZhW2moYIscYMDgfi50qUeqfuJvd7d/MSWAFSi1ZOjrneX1eHx0n4EzIdvtvyTAfPHz0eOdJ+emz5y92K3v7F8KPQkJ7xOd+eOVgQTnzaE8yyelVEFLsOpxeOvNPWf3ymoaC+d65TAI6cPHEY2NGsNQpe6/0u4ZcLKcE87Sj4DFEKazHzaRhM4iUnbJjq9lqNb+p8i13poaiIOOhsGc5Nyu4M1tskLIgk6G05zk5X5JyleyoYYpcx49THMVqaSRqXm8Y6WpRUE9u4ka5prtDJJgL15ytPHr632g9d9qEyMFhmih71rjftB4DIlEA157R9UK6An5VdSSnVOKGbvIWIu5PoPYGb2DmbgVbfk11Tj9v1cSZBr6B3j26H+dbZQtLt2LPrlTbrXZ+4N3AWgRVsDhdu/IHjXwSudSThGMh+lY7kIMUh5IRTlUZRYIGmMzxhPZ16GGXikGa752CNZ0ZwbEf6utJmGdXFSl2hUhcR5PZp8RmLUtuq/UjOf4wSJkXRJJ6pGg0jjiUPsyWGI5YSInkiQ4wCZn2CskUh5hIveplPQRr88t3g4uDlvWudfD9sHrycTGOHfAKvAZ1YIH34AR8AV3QA8Q4NH4axBiZR6Ztjs1pgRqlheYlWDvmr39lNyag</latexit> en

Encoder Decoder

NMT system

xs ⇠ Ms<latexit sha1_base64="+D2rKjCVJUv9t1Da29dPHSAUntk=">AAACx3icbZFNbxMxEIa9y0fL8hXgyMUiqpRK0Wq3INFLpQI5wKEoSKStlE0sr+O0TrwfsmejXZk98Be5ceG34OymIk0ZydKrmWc8rz1xLoWGIPjtuPfuP3i4t//Ie/zk6bPnnRcvz3VWKMZHLJOZuoyp5lKkfAQCJL/MFadJLPlFvPy0rl+suNIiS79DlfNJQq9SMReMgk2Rzp+DKKFwzag0gxqf4MjgXtmvDonAUU2MOAn7vt//Wnv/uLN6qluynGqyaLhFy50RvUNCS1ZTIMuGXN6QsE0O6qmJkjgrDS3K+sZI0V/tGBnaprxX/SgPPTscR1ok+JaxrTs/1ESTTjfwgybwXRFuRBdtYkg6v6JZxoqEp8Ak1XocBjlMDFUgmOS1FxWa55Qt6RUfW5nShOuJafZQ4wObmeF5puxJATfZ7Q5DE62rJLbk2qTera2T/6uNC5gfT4xI8wJ4ytpB80JiyPB6qXgmFGcgKysoU8J6xeyaKsrArt6znxDuPvmuOD/yw7f+0bd33dOPm+/YR6/RG9RDIXqPTtFnNEQjxJyBs3C0A+4XN3NXbtmirrPpeYVuhfvzL1Ju1zs=</latexit>

DAE makes sure decoder outputs fluently in the desired language.

Encoder

Src Tgtpredictioninput sentence

with target language ID

Decoder

NMT system

(x, LID)<latexit sha1_base64="HmWqM9tiKnFxrNajg0yfX6wBGuo=">AAAE/nichVTLbtNAFHUbA8W8UmDHZkRVKRYhigsSbCo1JUggtaiIvqQ6scaTSTOJX/JcV7ackfgVNixAiC3fwY6/YWynIQ+XjmTp+t5z7j33aGw7cBiHZvPPympFvXHz1tpt7c7de/cfVNcfHnM/Cgk9Ir7jh6c25tRhHj0CBg49DUKKXduhJ/boTVY/uaAhZ753CElAOy4+91ifEQwyZa1XHm+aLoYBwU7aFmgbmSmqxfVEtxgyhZWybaPeaNQ/CO0fbl90eYGMu9wa5rhhgdu3+AISCmTSBWuUI0eXSJhFtkU3NV3bj1McxeJSSFS/WBByIElBLRnHurYppyOTMxfNKZtp2poKreVK68i0cZgmwhrqV4uWNpgkCtBcG1kvqDPAPVEzYUAB63LIc2Q6/jmS2tAYZepmYJeriXbrbSknzjjoGfKu4H06LKVNJJWRW1PnC1QssvWTzIX51eG61aFc0m65pGz7yUD9usblNi/PAh+wI/5jdv4u73wPW8a0IvtPpXRhFrO14J+8JOPsKuuavPlo731bt6obzUYzP2g5MCbBhjI5B1b1t9nzSeRSD4iDOT8zmgF0UhwCIw4VmhlxGmAywuf0TIYedinvpPnnK9CmzPRQ3w/l4wHKs7OMFLucJ64tkZk5fLGWJctqZxH0X3dS5gURUI8Ug/qRg8BH2b8A9VhICTiJDDAJmdSKyACHmID8Y2jSBGNx5eXgeKthvGhsfXy5sbM7sWNNeaI8VWqKobxSdpR3yoFypJBKWvlS+Vb5rn5Wv6o/1J8FdHVlwnmkzB3111+HPJoH</latexit>

Like in multilingual NMT, share encoder and decoder parameters. Encoder is encouraged to produce

shared representations (particularly if pre-trained).

Page 38: Low Resource Machine Translation - Stanford University

Case Study #1: Unsupervised MT

38

English FrenchTEST

monomono

DATA

Mt = {ytk}k=1,..,Mt<latexit sha1_base64="WPPS0Z7qvI346bxZ6Qp8zIQogqY=">AAADd3icbZJLb9NAEMc3Do8SXinc4MCIEOQIE8UFCS6VWggSB4qCRNpKcWKtN5tkE7/kXVe23P0IfDlufA8u3FjbCThpV7I0mvnNzH/G44Qu46LX+1XT6jdu3rq9d6dx9979Bw+b+49OeRBHhA5J4AbRuYM5dZlPh4IJl56HEcWe49IzZ/Uxj59d0IizwP8u0pCOPTz32YwRLJTL3q/9aFseFguC3awv4RCsDPTESDs2A0vaGTs0jW7X+Cob/7kTOeElmUy4vSy4Zcmd2Fw2qqAowXQi7FUBrjagqJbsy0lmeU6QZDhO5EZHbFzs6BiopFBPL5NOo62ag8WZB1vCKkWP/+nUC6EGWA6OslTay86u5q0tWCQOYauMipepFfCL1C2xoAJ3VJPXYLnBHJQ2uIRcXQXbjCb7x5+uzUnyHHgFfsdutnrdXvHgqmGujRZav4Hd/GlNAxJ71BfExZyPzF4oxhmOBCMuVf8i5jTEZIXndKRMH3uUj7PibiS0lWcKsyBSny+g8FYzMuxxnnqOIvNp+G4sd14XG8Vi9n6cMT+MBfVJ2WgWuyACyI8QpiyiRLipMjCJmNIKZIEjTIQ61YZagrk78lXj9KBrvukefHvbOvqwXsceeoqeIx2Z6B06Qp/RAA0Rqf3Wnmgt7YX2p/6s/rKul6hWW+c8Rluvbv4FlMEQtg==</latexit>

Ms = {xsj}j=1,..,Ms

<latexit sha1_base64="Wuso7jnamC5w7EBqWeo2cvUsqLM=">AAACx3icbZFNj9MwEIad8LWErwJHLhZVpa5URcmCBJeVFugBDouKRHdXalrLcd1dt04c2ZMqUciBv8iNC78FN+mibpeRLL2aecZ+PRNnUhgIgt+Oe+fuvfsPDh56jx4/efqs8/zFmVG5ZnzMlFT6IqaGS5HyMQiQ/CLTnCax5Ofx6tOmfr7m2giVfocy49OEXqZiIRgFmyKdP70ooXDFqKyGNT7GUYX7xaA8JAJHNanEcTjw/cHX2vuHndYz04LFzJBlgy1b7JSY2uvtktCS5QzIqiFX1yTsksN6VkVJrIqK5kV97SMfrPd8jGxT1i9/FIdez76OIyMSfMPZzqUfamJIpxv4QRP4tgi3oou2MSKdX9FcsTzhKTBJjZmEQQbTimoQTHI7htzwjLIVveQTK1OacDOtmj3UuGczc7xQ2p4UcJPd7ahoYkyZxJbcmDT7tU3yf7VJDov300qkWQ48Ze1Di1xiUHizVDwXmjOQpRWUaWG9YnZFNWVgV+/ZIYT7X74tzo788I1/9O1t9+TjdhwH6BV6jfooRO/QCfqMRmiMmDN0lo5xwP3iKnftFi3qOtuel+hGuD//Am3y1zs=</latexit>

Encoder

en frCross-Entropy

Losspredictioninput sentenceDecoder

NMT system

x<latexit sha1_base64="jDLmMVPu3t12MXHk9cCHPRMrYik=">AAADvXicdZJdb9owFIZNso+OfdHtcjfWEBJoDJFuUqtJ1dqNSbtYJ6aVthKG1DEGDM6HYqdLlPpPblf7N3MSWIFSS5aOznlen9dHxwk4E7Ld/lsyzHv3HzzceVR+/OTps+eV3Rdnwo9CQnvE53544WBBOfNoTzLJ6UUQUuw6nJ47889Z/fyKhoL53qlMAjpw8cRjY0aw1Cl7t/SnhlwspwTztKPgIUQprMfNpGEziJSdskOr2Wo1v6vyDXeihqIg46GwZzk3K7gTW2yQsiCTobTnOTlfknKV7KhhilzHj1McxWppJGpebRjpalFQT67jRrmmu0MkmAvXnK08evzfaD132oTIwWGaKHvWuNu0HgMiUQDXntH1QroCflN1JKdU4oZu8hYi7k+g9gavYeZuBVt+TXWOv2zVxJkGvoHeHbqfp1tlC0s34jwRK7tSbbfa+YG3A2sRVMHidO3KbzTySeRSTxKOhehb7UAOUhxKRjhVZRQJGmAyxxPa16GHXSoGab59CtZ0ZgTHfqivJ2GeXVWk2BUicR1NZl8Tm7Usua3Wj+T4YJAyL4gk9UjRaBxxKH2YrTIcsZASyRMdYBIy7RWSKQ4xkXrhy3oI1uaXbwdney3rXWvvx/vq0afFOHbAK/Aa1IEF9sER+Aq6oAeI8cG4NJgxMz+a1OSmV6BGaaF5CdaO+esf4J8pbw==</latexit>Encoder Decoder

fryt ⇠ Mt

<latexit sha1_base64="oC/1YnfKTpkRP2NJh0qgsiTvklU=">AAADzHicdZJba9swFMcVe5cuu6Xb417EQiBhWYi7QftSaLcMNlhLxpq2ECVGVpREiW9YcrFR9boPuLe97pNMtpM1tx4QHM75/XX+OsgJXcZFu/2nZJgPHj56vPek/PTZ8xcvK/uvLnkQR4T2SOAG0bWDOXWZT3uCCZdehxHFnuPSK2f+Oetf3dCIs8C/EGlIBx6e+GzMCBa6ZO+X/taQh8WUYFd2FDyGSMJ60kwbNoNI2ZIdW81Wq3muynfcmRrygkyG3J7l3Kzgzmy+QYqCTIfCnufkfEmKVbKjhhJ5TpBIHCdqaSRu3mwY6WpRWE9vk0a5pqdDxJkH15ytXHr632g9d9qEyMGRTJU9a9xvWq8BkTiEa9fofiFdAb+rOhJTKnBDD3kPkRtMoPYGb2HmbgVbPk11Tr/s1CSZBr6D/j26nxc7ZQtLd2K95e2NCLtSbbfaecDtxFokVbCIrl35jUYBiT3qC+JizvtWOxQDiSPBiEtVGcWchpjM8YT2depjj/KBzD+jgjVdGcFxEOnjC5hXVxUSe5ynnqPJzCTf7GXFXb1+LMZHA8n8MBbUJ8WgcexCEcDsZ8MRiygRbqoTTCKmvUIyxREmQv//sl6Ctfnk7eTyoGV9aB38+Fg9+bRYxx54A96COrDAITgBX0EX9AAxvhmBkRipeW4KU5qqQI3SQvMarIX56x/eni+B</latexit>

backward NMT system

p(x|y)<latexit sha1_base64="CyQwSgLN0ehm4cfS61mfQujnag8=">AAADvnicdZJda9swFIYVex9d9pVul7sRC4GEZcHuBhuMQrplsIt1ZKxpC1FiZEVJlMgfWHKxcfUnx272bybbyZqkqcBwOOd5j14dHzfkTEjL+lsxzHv3Hzw8eFR9/OTps+e1wxfnIogjQgck4EF06WJBOfPpQDLJ6WUYUey5nF64yy95/eKKRoIF/plMQzry8MxnU0aw1CnnsPKngTws5wTzrKfgMUQZbCbttOUwiJSTsWO73em0f6jqDXeqxqIkk7FwFgW3KLlTR+yQsiTTsXSWBblck3KT7Klxhjw3SDIcJ2ptJG5f7Rjpa1HYTK+TVrWhb4dIMA9uOdtoevLfaLNw2obIxVGWKmfRutu0HgMicQi32uh6Kd0Av6smknMqcUtf8hYiHsyg9gavYe5uA1s/TfVOvu7VJLkGvoH+HbpfZ3tlK0s34rKT/nm1utWxigNvB/YqqIPV6Tu132gSkNijviQcCzG0rVCOMhxJRjhVVRQLGmKyxDM61KGPPSpGWbF+CjZ0ZgKnQaQ/X8Iiu6nIsCdE6rmazN8mdmt5cl9tGMvpx1HG/DCW1CflRdOYQxnAfJfhhEWUSJ7qAJOIaa+QzHGEidQbX9VDsHeffDs4P+rY7zpHP9/Xu59X4zgAr8Br0AQ2+AC64BvogwEgxicDGwtjaXbNqemZQYkalZXmJdg6ZvIPxHwo5g==</latexit>

p(y|x)<latexit sha1_base64="dBHzK3A8/hcNmG8nE+hfv1ikGd4=">AAAECnichZNNb9NAEIa3Nh8lfKVw5DIiipSIEMUFCS6VGggSB4qCaNpKdWKtN5tkE3/Ju65suXvmwl/hwgGEuPILuPFvWNtJSdJUrGRpNPO8O++MvHbgMC5arT9bmn7t+o2b27dKt+/cvXe/vPPgiPtRSGiP+I4fntiYU4d5tCeYcOhJEFLs2g49tmevs/rxGQ05871DkQS07+Kxx0aMYKFS1o4GVdPFYkKwk3Yk7IGZQi1uJHWLgSmtlO0ZjWaz8V6W/nEHcsALMh5wa5pz04I7sPgaKQoyGQhrlpOzBSmWyY4cpKZr+3GKo1gujESNszUjXVkKasl5XC9VVXMwOXNhxdjSne0Ln7XcaANMG4dpIq1p/WrPagsmiQJYuUbVC+kS+E7WTDGhAtdVk6dgOv4YlDc4h8zdEraYTHbabzZq4kwDT8C7QvfxcKNsbmmTuH2x+IKKZTZ+km1hdXTxv9GFVa60mq38wOXAmAcVND9dq/zbHPokcqkniIM5PzVageinOBSMOFSWzIjTAJMZHtNTFXrYpbyf5r+yhKrKDGHkh+rzBOTZZUWKXc4T11ZkZpKv17LkptppJEYv+ynzgkhQjxSNRpEDwofsXcCQhZQIJ1EBJiFTXoFMcIiJUK+npJZgrI98OTjabRrPmrsfnlf2X83XsY0eoceohgz0Au2jt6iLeohon7Qv2jftu/5Z/6r/0H8WqLY11zxEK0f/9RdRQUZC</latexit>

enCross-Entropy

Losspredictioninput sentence+

n<latexit sha1_base64="EpqUFrDe+QYGhdFpOA+JzvOejPY=">AAADt3icdZJdb9owFIZNso+OfZR2l7uxhpBAY4h0lbZeVGo3Ju1inZhW2moYIscYMDgfi50qUeqfuJvd7d/MSWAFSi1ZOjrneX1eHx0n4EzIdvtvyTAfPHz0eOdJ+emz5y92K3v7F8KPQkJ7xOd+eOVgQTnzaE8yyelVEFLsOpxeOvNPWf3ymoaC+d65TAI6cPHEY2NGsNQpe6/0u4ZcLKcE87Sj4DFEKazHzaRhM4iUnbJjq9lqNb+p8i13poaiIOOhsGc5Nyu4M1tskLIgk6G05zk5X5JyleyoYYpcx49THMVqaSRqXm8Y6WpRUE9u4ka5prtDJJgL15ytPHr632g9d9qEyMFhmih71rjftB4DIlEA157R9UK6An5VdSSnVOKGbvIWIu5PoPYGb2DmbgVbfk11Tj9v1cSZBr6B3j26H+dbZQtLt2LPrlTbrXZ+4N3AWgRVsDhdu/IHjXwSudSThGMh+lY7kIMUh5IRTlUZRYIGmMzxhPZ16GGXikGa752CNZ0ZwbEf6utJmGdXFSl2hUhcR5PZp8RmLUtuq/UjOf4wSJkXRJJ6pGg0jjiUPsyWGI5YSInkiQ4wCZn2CskUh5hIveplPQRr88t3g4uDlvWudfD9sHrycTGOHfAKvAZ1YIH34AR8AV3QA8Q4NH4axBiZR6Ztjs1pgRqlheYlWDvmr39lNyag</latexit> en

Encoder Decoder

NMT system

xs ⇠ Ms<latexit sha1_base64="+D2rKjCVJUv9t1Da29dPHSAUntk=">AAACx3icbZFNbxMxEIa9y0fL8hXgyMUiqpRK0Wq3INFLpQI5wKEoSKStlE0sr+O0TrwfsmejXZk98Be5ceG34OymIk0ZydKrmWc8rz1xLoWGIPjtuPfuP3i4t//Ie/zk6bPnnRcvz3VWKMZHLJOZuoyp5lKkfAQCJL/MFadJLPlFvPy0rl+suNIiS79DlfNJQq9SMReMgk2Rzp+DKKFwzag0gxqf4MjgXtmvDonAUU2MOAn7vt//Wnv/uLN6qluynGqyaLhFy50RvUNCS1ZTIMuGXN6QsE0O6qmJkjgrDS3K+sZI0V/tGBnaprxX/SgPPTscR1ok+JaxrTs/1ESTTjfwgybwXRFuRBdtYkg6v6JZxoqEp8Ak1XocBjlMDFUgmOS1FxWa55Qt6RUfW5nShOuJafZQ4wObmeF5puxJATfZ7Q5DE62rJLbk2qTera2T/6uNC5gfT4xI8wJ4ytpB80JiyPB6qXgmFGcgKysoU8J6xeyaKsrArt6znxDuPvmuOD/yw7f+0bd33dOPm+/YR6/RG9RDIXqPTtFnNEQjxJyBs3C0A+4XN3NXbtmirrPpeYVuhfvzL1Ju1zs=</latexit>

DAE makes sure decoder outputs fluently in the desired language.

Encoder

Src Tgtpredictioninput sentence

with target language ID

Decoder

NMT system

(x, LID)<latexit sha1_base64="HmWqM9tiKnFxrNajg0yfX6wBGuo=">AAAE/nichVTLbtNAFHUbA8W8UmDHZkRVKRYhigsSbCo1JUggtaiIvqQ6scaTSTOJX/JcV7ackfgVNixAiC3fwY6/YWynIQ+XjmTp+t5z7j33aGw7cBiHZvPPympFvXHz1tpt7c7de/cfVNcfHnM/Cgk9Ir7jh6c25tRhHj0CBg49DUKKXduhJ/boTVY/uaAhZ753CElAOy4+91ifEQwyZa1XHm+aLoYBwU7aFmgbmSmqxfVEtxgyhZWybaPeaNQ/CO0fbl90eYGMu9wa5rhhgdu3+AISCmTSBWuUI0eXSJhFtkU3NV3bj1McxeJSSFS/WBByIElBLRnHurYppyOTMxfNKZtp2poKreVK68i0cZgmwhrqV4uWNpgkCtBcG1kvqDPAPVEzYUAB63LIc2Q6/jmS2tAYZepmYJeriXbrbSknzjjoGfKu4H06LKVNJJWRW1PnC1QssvWTzIX51eG61aFc0m65pGz7yUD9usblNi/PAh+wI/5jdv4u73wPW8a0IvtPpXRhFrO14J+8JOPsKuuavPlo731bt6obzUYzP2g5MCbBhjI5B1b1t9nzSeRSD4iDOT8zmgF0UhwCIw4VmhlxGmAywuf0TIYedinvpPnnK9CmzPRQ3w/l4wHKs7OMFLucJ64tkZk5fLGWJctqZxH0X3dS5gURUI8Ug/qRg8BH2b8A9VhICTiJDDAJmdSKyACHmID8Y2jSBGNx5eXgeKthvGhsfXy5sbM7sWNNeaI8VWqKobxSdpR3yoFypJBKWvlS+Vb5rn5Wv6o/1J8FdHVlwnmkzB3111+HPJoH</latexit>

Like in multilingual NMT, share encoder and decoder parameters. Encoder is encouraged to produce

shared representations (particularly if pre-trained).

ITERATIVE BT

DAE

Multi-Lingual

+

+

Page 39: Low Resource Machine Translation - Stanford University

Same ideas can be applied to phrase-based statistical MT systems (PBSMT). NMT and PBSMT can be combined for

even better results.

WMT’14 En-Fr

Lample et al. “Phrase-based and neural unsupervised MT” EMNLP 2018

Since unsupMT was trained on about 10M sentences, each parallel sentence is worth 100 monolingual sentences (for this dataset

and language pair).

Page 40: Low Resource Machine Translation - Stanford University

Case Study #2: FLoRes Ne-En

40

In-domain (Wikipedia) Out-of-domain

Parallel None

500K sentences (Bible, GNOME/Ubuntu,

OpenSubtitle, …)

*Hindi: 1.5M

Monolingual 100K sentences~5M sentences

(CommonCrawl)

*Hindi: 45M

Page 41: Low Resource Machine Translation - Stanford University

Results on FLoRes: Ne-En

41

Page 42: Low Resource Machine Translation - Stanford University

42

Results on FLoRes: Ne-En

Page 43: Low Resource Machine Translation - Stanford University

43

Results on FLoRes: Ne-En

Page 44: Low Resource Machine Translation - Stanford University

44

Results on FLoRes: Ne-En

Page 45: Low Resource Machine Translation - Stanford University

Case-Study #3: English-Burmese

Page 46: Low Resource Machine Translation - Stanford University

Workshop on Asian Translation 2019: English-Burmese

In-domain (News) Out-of-domain

Parallel 20K sentences 200K sentences

Monolingual ~79M sentences (En only)

~23M sentences (My only)

“FBAI WAT’19 My-En translation task submission” Chen et al., WAT@EMNLP 2019

Page 47: Low Resource Machine Translation - Stanford University

Results: Iterative ST+BTMy —> En En —> My

BLE

U

26

29

32

35

38

Parallel Iter. 1 Iter. 2 Iter. 3B

LEU

35

36.5

38

39.5

41

Parallel Iter. 1 Iter. 2 Iter. 3

“FBAI WAT’19 My-En translation task submission” Chen et al., WAT@EMNLP 2019

Page 48: Low Resource Machine Translation - Stanford University

Results: BT vs ST vs BT+STMy —> En, iter. 2

BLEU

30

31.25

32.5

33.75

35

iter. 1 iter. 2 BT iter. 2 ST iter. 2 BT+ST

“FBAI WAT’19 My-En translation task submission” Chen et al., WAT@EMNLP 2019

Page 49: Low Resource Machine Translation - Stanford University

Final Results of 2019 Competition

BLEU

24

28

32

36

40

Ours NICT NICT-NMT UCSMNLP

En —> My

BLEU

18

23.5

29

34.5

40

Ours NICT-NMT NICT UCSYNLP

My —> En

+8 BLEU compared to second best

“FBAI WAT’19 My-En translation task submission” Chen et al., WAT@EMNLP 2019

Page 50: Low Resource Machine Translation - Stanford University

50

Demo (Burmese —> English) 1

23.3 26.32 27.74

slide credit to Peng-Jen Chen

Page 51: Low Resource Machine Translation - Stanford University

Conclusion so far…• Iterative back-translation, multi-lingual training work remarkably well.

• By feeding more data (BT, ST, pre-training, multi-lingual training) we can afford training bigger models. Bigger models train on more data generalize better.

• Low-resource MT requires big compute! Remember that about 100 monolingual sentences give the same training signal as a single pair of parallel sentence.

51

Page 52: Low Resource Machine Translation - Stanford University

Outline

52

MODELDATA

ANALYSIS

life of a researcher

“The FLoRes evaluation for low resource MT:…” Guzmán, Chen et al. ’EMNLP 2019

“Phrase-based & Neural Unsup MT” Lample et al. EMNLP 2018 “FBAI WAT’19 My-En translation task submission” Chen et al., WAT@EMNLP 2019 “Massively Multilingual NMT” Aharoni et al.,ACL 2019 “Multilingual Denoising Pre-training for NMT” Liu et al., arXiv 2001:08210 2020

“Analyzing uncertainty in NMT” Ott et al. ICML 2018 “On the evaluation of MT systems trained with back-translation” Edunov et al. ACL 2020 “The source-target domain mismatch problem in MT” Shen et al. arXiv 1909.13151 2019

Page 53: Low Resource Machine Translation - Stanford University

Simulating Low-Resource MTSimulating low-resource MT with a high resource language: using EuroParl data with 20K parallel sentences and 100K monolingual target sentences.

only parallel data

parallel data + BT

30.4 BLEU

33.8 BLEU+3.4 BLEU!

https://www.statmt.org/europarl/

EuroParl Fr—>En

Page 54: Low Resource Machine Translation - Stanford University

A Worrisome FindingBT sometimes yields very mild improvements.

The FLORES evaluation datasets… Guzmán, Chen et al. EMNLP 2019

Exampleonly parallel data

parallel data + BT

15.2 BLEU

15.3 BLEU

FB public posts En—>My

+0.1 BLEU!

Why is BT not working as well?

Page 55: Low Resource Machine Translation - Stanford University

• football• baseball• basketball

• soccer• cricket• rowing

Sports

55For the same topic: Different distribution over words!

Page 56: Low Resource Machine Translation - Stanford University

• hamburger• clam chowder• apple pie

• kimchi• bulgogi• bibimbap

Food

56For the same topic: Different distribution over words!

Page 57: Low Resource Machine Translation - Stanford University

topic distribution57Domains differ in the topic distribution

• politics• sports

• religion• environment

Page 58: Low Resource Machine Translation - Stanford University

Examples from FLoResSi-En

En-Si

original

translation

Guzmàn, Chen et al. “The FLoRes evaluation datasets for low resource MT…” EMNLP 2019

Wikipedia in Sinhala has different topic distribution.

Page 59: Low Resource Machine Translation - Stanford University

Source-Target Domain Mismatch (STDM)• Def.: Content produced in blogs, social networks, news outlets, etc. varies with the

geographic location.

• STDM is even more pronounced in low resource MT, where source & target geographic locations are typically farther apart and cultures have more distinct traits.

• STDM manifests itself in terms of: Different distribution of topics, and for the same topic different distribution over words.

• STDM is hard to measure because of the unknown effect of translationese.

• STDM makes the MT problem even harder: source originating data and target originating data are not comparable in general. BT won’t work as well. UnsupMT won’t work as well.

Language and place. Johnstone Cambridge Univ. Press 2010 Leech et al. “Computer corpora: What do they tell us about culture?”Journal Computers in English Linguistics 1992

Page 60: Low Resource Machine Translation - Stanford University

Questions

• Is it true that BT is less effective when there is STDM?

• What baselines shall we consider when there is STDM?

• What are general best practices when there is STDM?

• How to study STDM in a controlled setting?

60

Page 61: Low Resource Machine Translation - Stanford University

Controlled Setting

Source Domain EuroParl

Target Domain OpenSubtitles

mono

mono

Source Language: Fr Target Language: En

10K sentences

<1M sentences

61

<1M sentences

human translations

human translations

Page 62: Low Resource Machine Translation - Stanford University

Controlled Setting

Source Domain EuroParl mono

mono

Source Language: Fr Target Language: En

62

(1� ↵)<latexit sha1_base64="hXiGHlRj7nWlfLezr2o5piRxOpQ=">AAAGAHicjVTbbtMwGM5GCyOcNpC44cZiqtTSg5KBAAkmTaBJXIAYsI1JTRs5qdOaOQfFLksVfMOrcMMFCHHLY3DH2/A7SSHt2oGlKPZ/+Pz9JzsRo1wYxq+V1XOV6vkLaxf1S5evXL22vnH9kIfj2CUHbsjC+MjBnDAakANBBSNHUUyw7zDy1jl+qvRv35OY0zDYF5OI9Hw8DKhHXSxAZG9UbtZQVE8ayIriMBIhIv0UWSwcoshOLd8Jk9QSNJikz19IKZVhE3H4Sb2W9NtoG9VP7NTcMmXLGoSCt9TpviEBj1N/IcYSD/QBndhmC2VClJkAjF57WbfEiAjcgLssnwZ1o6UI9JuP0FTRVgK91m+XRT74KqKZm6LghgEXHSSR5YUxZgwlyKI+5JjwPAPby0LWa3lC6oltFvwSe185nJWpWdsmmEJVBlhxPaUqEYTEqosE5COx08dCLic2YwQkvRi7UDwfi5HjpLvSTkFpZT2SxmQgRdMsTg7D7vG0APsyhzGLs4AaQReU+DYkkmcC/xds21wADOHmkVEPSjMX5wALrDoGEj2neRfSQCjVnxKW2S2Gkai7BAWI9LI+WQoxtTwbQ89mqX8HsJI+nZmpUtCggaiRGp+lQwL6ac/9q8/y3lqkbr/effPXpNxjSOGH8C4w4gkcx+FJOpR63YS5sTCLRrhhr28aHSNb6PTGLDabWrH27PWfUGZ37JNAuAxz3jWNSPRSHAvqMiJ1a8xJBO2Bh6QL2wD7hPfSrG8kqoFkgGAu4QsEyqRljxT7nE98ByxVjfi8TgkX6bpj4T3spTSIxoIEbn6RN2YIiqJeQzSgMXEFm8AGuzEFrsgdYZgjAW+mDkkw50M+vTnc6ph3O1uv7m3uPCnSsabd0m5rdc3UHmg72jNtTzvQ3IqsfKp8qXytfqx+rn6rfs9NV1cKnxvazKr++A3w4ANP</latexit>

EuroParl + OpenSubtitles ↵<latexit sha1_base64="KLcj9QsFirN9JCJt1Gli0B4M6Gw=">AAAF+nicjVTbbtMwGM5GCyOcOrjkxmKq1NGDkoIACSZNoElcgBiwk9S0kZM6rZlzUOzSVJkfhRsuQIhbnoQ73obfSQtp1w4sRbH/w+fvP9mJGOXCMH6trV8qlS9f2biqX7t+4+atyubtIx6OYpccuiEL4xMHc8JoQA4FFYycRDHBvsPIsXP6QumPP5KY0zA4EJOIdH08CKhHXSxAZG+WKlUU1ZJtZEVxGIkQkV6KLBYOUGSnlu+ESWoJGkzSV6+llMqwjjj8pF5Nek20g2pjOzXbpmxY/VDwhjo9MiTgceovxVjhgc7Q2DYbKBOizARg9OqbmiWGROBtuMvyaVAzGopAr/4UzRRNJdCrvWZR5IOvIpq5KQpuGHDRQhJZXhhjxlCCLOpDjgnPM7CzKmS9miekltjmlF9iHyiHizI1b1sHU6hKHyuu51QFgpBYdZGAfCR2+kzI1cTmjICkF2MXiudjMXScdE/aKSitrEfSmPSlqJvTk8OwezorwIHMYczpWUCNoAsKfLclkhcC/xds01wCDOHmkVEPSrMQZx8LrDoGEr2g+RDSQCjVnxIW2S2HkaizAgWIdLM+WQkxs7wYQ89mqXcfsJIenZupQtCggaiRGp+VQwL6Wc/9q8/y3lqmbr7be//XpNhjSOGH8C4w4gkcx+E4HUjdwiwaYruyZbSMbKHzG3O62dKma9+u/IQCuyOfBMJlmPOOaUSim+JYUJcRgB1xEkFj4AHpwDbAPuHdNOsYiaog6SOYSPgCgTJp0SPFPucT3wFLVR2+qFPCZbrOSHhPuikNopEggZtf5I0YgnKodxD1aUxcwSawwW5MgStyhxgmSMBrqUMSzMWQz2+O2i3zQav99uHW7vNpOja0u9o9raaZ2mNtV3up7WuHmlsalz6VvpS+ls/Kn8vfyt9z0/W1qc8dbW6Vf/wG2L8CJA==</latexit>

↵ = 0<latexit sha1_base64="ZPnRfubrH7b3OY+yvc0OydClHfo=">AAAF/nicjVTbbtMwGM5GCyMctgF33FhMlTp6UDIQIMGkCTSJCxADdpKaNnJStzVzDopdlspY4lW44QKEuOU5uONt+J2k0HbtwFIU+z98/v6TvZhRLizr19LyhVL54qWVy+aVq9eur66t3zjk0TDxyYEfsSg59jAnjIbkQFDByHGcEBx4jBx5J8+0/ug9STiNwn0xikk7wP2Q9qiPBYjc9dKtCoqr6SZy4iSKRYRIRyKHRX0Uu9IJvCiVjqDhSL54qZTShjXE4afMStppoG1UPXWlvWWrutONBK/r0wNLAR6nwVyMBR7oAzp17TrKhCgzARiz8qrqiAEReBPucgIaVq26JtCpPUZjRUMLzEqnMSkKwFcTzdw0BT8KuWgihZxelGDGUIocGkCOCc8zsL0oZLOSJ6SaunbBL3X3tcN5mZq2rYEpVKWLNdczqgmCkFh9kYB8pK58ItRiYlNGQLKXYB+KF2Ax8Dy5q1wJSifrEZmQrhI1uzh5DPsn4wLsqxzGLs4CagRdMMF3UyF1LvB/wTbsOcAQbh4Z7UFpZuLsYoF1x0CiZzTvIhoKrfpTwkl282EUai1AASLtrE8WQowtz8cws1nq3AWstEOnZmoiaNBA1EiPz8IhAf245/7VZ3lvzVM33uy+/Wsy2WNI40fwLjDSEzhJolPZV6aDWTzAcJ3lrm1YTStb6OzGLjYbRrH23LWfUGR/GJBQ+Axz3rKtWLQlTgT1GQHoIScxNAfukxZsQxwQ3pZZ1yhUAUkXwVTCFwqUSSc9JA44HwUeWOoK8VmdFs7TtYai96gtaRgPBQn9/KLekCEoiX4LUZcmxBdsBBvsJxS4In+AYYoEvJgmJMGeDfns5nCrad9rbr2+v7HztEjHinHbuGNUDdt4aOwYz40948DwS7L0qfSl9LX8sfy5/K38PTddXip8bhpTq/zjN1cdAvk=</latexit>

Target Domain OpenSubtitles

Page 63: Low Resource Machine Translation - Stanford University

Controlled Setting

Source Domain EuroParl

Target Domain

mono

mono

Source Language: Fr Target Language: En

63

(1� ↵)<latexit sha1_base64="hXiGHlRj7nWlfLezr2o5piRxOpQ=">AAAGAHicjVTbbtMwGM5GCyOcNpC44cZiqtTSg5KBAAkmTaBJXIAYsI1JTRs5qdOaOQfFLksVfMOrcMMFCHHLY3DH2/A7SSHt2oGlKPZ/+Pz9JzsRo1wYxq+V1XOV6vkLaxf1S5evXL22vnH9kIfj2CUHbsjC+MjBnDAakANBBSNHUUyw7zDy1jl+qvRv35OY0zDYF5OI9Hw8DKhHXSxAZG9UbtZQVE8ayIriMBIhIv0UWSwcoshOLd8Jk9QSNJikz19IKZVhE3H4Sb2W9NtoG9VP7NTcMmXLGoSCt9TpviEBj1N/IcYSD/QBndhmC2VClJkAjF57WbfEiAjcgLssnwZ1o6UI9JuP0FTRVgK91m+XRT74KqKZm6LghgEXHSSR5YUxZgwlyKI+5JjwPAPby0LWa3lC6oltFvwSe185nJWpWdsmmEJVBlhxPaUqEYTEqosE5COx08dCLic2YwQkvRi7UDwfi5HjpLvSTkFpZT2SxmQgRdMsTg7D7vG0APsyhzGLs4AaQReU+DYkkmcC/xds21wADOHmkVEPSjMX5wALrDoGEj2neRfSQCjVnxKW2S2Gkai7BAWI9LI+WQoxtTwbQ89mqX8HsJI+nZmpUtCggaiRGp+lQwL6ac/9q8/y3lqkbr/effPXpNxjSOGH8C4w4gkcx+FJOpR63YS5sTCLRrhhr28aHSNb6PTGLDabWrH27PWfUGZ37JNAuAxz3jWNSPRSHAvqMiJ1a8xJBO2Bh6QL2wD7hPfSrG8kqoFkgGAu4QsEyqRljxT7nE98ByxVjfi8TgkX6bpj4T3spTSIxoIEbn6RN2YIiqJeQzSgMXEFm8AGuzEFrsgdYZgjAW+mDkkw50M+vTnc6ph3O1uv7m3uPCnSsabd0m5rdc3UHmg72jNtTzvQ3IqsfKp8qXytfqx+rn6rfs9NV1cKnxvazKr++A3w4ANP</latexit>

EuroParl + OpenSubtitles ↵<latexit sha1_base64="KLcj9QsFirN9JCJt1Gli0B4M6Gw=">AAAF+nicjVTbbtMwGM5GCyOcOrjkxmKq1NGDkoIACSZNoElcgBiwk9S0kZM6rZlzUOzSVJkfhRsuQIhbnoQ73obfSQtp1w4sRbH/w+fvP9mJGOXCMH6trV8qlS9f2biqX7t+4+atyubtIx6OYpccuiEL4xMHc8JoQA4FFYycRDHBvsPIsXP6QumPP5KY0zA4EJOIdH08CKhHXSxAZG+WKlUU1ZJtZEVxGIkQkV6KLBYOUGSnlu+ESWoJGkzSV6+llMqwjjj8pF5Nek20g2pjOzXbpmxY/VDwhjo9MiTgceovxVjhgc7Q2DYbKBOizARg9OqbmiWGROBtuMvyaVAzGopAr/4UzRRNJdCrvWZR5IOvIpq5KQpuGHDRQhJZXhhjxlCCLOpDjgnPM7CzKmS9miekltjmlF9iHyiHizI1b1sHU6hKHyuu51QFgpBYdZGAfCR2+kzI1cTmjICkF2MXiudjMXScdE/aKSitrEfSmPSlqJvTk8OwezorwIHMYczpWUCNoAsKfLclkhcC/xds01wCDOHmkVEPSrMQZx8LrDoGEr2g+RDSQCjVnxIW2S2HkaizAgWIdLM+WQkxs7wYQ89mqXcfsJIenZupQtCggaiRGp+VQwL6Wc/9q8/y3lqmbr7be//XpNhjSOGH8C4w4gkcx+E4HUjdwiwaYruyZbSMbKHzG3O62dKma9+u/IQCuyOfBMJlmPOOaUSim+JYUJcRgB1xEkFj4AHpwDbAPuHdNOsYiaog6SOYSPgCgTJp0SPFPucT3wFLVR2+qFPCZbrOSHhPuikNopEggZtf5I0YgnKodxD1aUxcwSawwW5MgStyhxgmSMBrqUMSzMWQz2+O2i3zQav99uHW7vNpOja0u9o9raaZ2mNtV3up7WuHmlsalz6VvpS+ls/Kn8vfyt9z0/W1qc8dbW6Vf/wG2L8CJA==</latexit>

intermediate value of ↵<latexit sha1_base64="KLcj9QsFirN9JCJt1Gli0B4M6Gw=">AAAF+nicjVTbbtMwGM5GCyOcOrjkxmKq1NGDkoIACSZNoElcgBiwk9S0kZM6rZlzUOzSVJkfhRsuQIhbnoQ73obfSQtp1w4sRbH/w+fvP9mJGOXCMH6trV8qlS9f2biqX7t+4+atyubtIx6OYpccuiEL4xMHc8JoQA4FFYycRDHBvsPIsXP6QumPP5KY0zA4EJOIdH08CKhHXSxAZG+WKlUU1ZJtZEVxGIkQkV6KLBYOUGSnlu+ESWoJGkzSV6+llMqwjjj8pF5Nek20g2pjOzXbpmxY/VDwhjo9MiTgceovxVjhgc7Q2DYbKBOizARg9OqbmiWGROBtuMvyaVAzGopAr/4UzRRNJdCrvWZR5IOvIpq5KQpuGHDRQhJZXhhjxlCCLOpDjgnPM7CzKmS9miekltjmlF9iHyiHizI1b1sHU6hKHyuu51QFgpBYdZGAfCR2+kzI1cTmjICkF2MXiudjMXScdE/aKSitrEfSmPSlqJvTk8OwezorwIHMYczpWUCNoAsKfLclkhcC/xds01wCDOHmkVEPSrMQZx8LrDoGEr2g+RDSQCjVnxIW2S2HkaizAgWIdLM+WQkxs7wYQ89mqXcfsJIenZupQtCggaiRGp+VQwL6Wc/9q8/y3lqmbr7be//XpNhjSOGH8C4w4gkcx+E4HUjdwiwaYruyZbSMbKHzG3O62dKma9+u/IQCuyOfBMJlmPOOaUSim+JYUJcRgB1xEkFj4AHpwDbAPuHdNOsYiaog6SOYSPgCgTJp0SPFPucT3wFLVR2+qFPCZbrOSHhPuikNopEggZtf5I0YgnKodxD1aUxcwSawwW5MgStyhxgmSMBrqUMSzMWQz2+O2i3zQav99uHW7vNpOja0u9o9raaZ2mNtV3up7WuHmlsalz6VvpS+ls/Kn8vfyt9z0/W1qc8dbW6Vf/wG2L8CJA==</latexit>

OpenSubtitles

Page 64: Low Resource Machine Translation - Stanford University

Controlled Setting

Source Domain & Target Domain

EuroParlmono

mono

Source Language: Fr Target Language: En

64

(1� ↵)<latexit sha1_base64="hXiGHlRj7nWlfLezr2o5piRxOpQ=">AAAGAHicjVTbbtMwGM5GCyOcNpC44cZiqtTSg5KBAAkmTaBJXIAYsI1JTRs5qdOaOQfFLksVfMOrcMMFCHHLY3DH2/A7SSHt2oGlKPZ/+Pz9JzsRo1wYxq+V1XOV6vkLaxf1S5evXL22vnH9kIfj2CUHbsjC+MjBnDAakANBBSNHUUyw7zDy1jl+qvRv35OY0zDYF5OI9Hw8DKhHXSxAZG9UbtZQVE8ayIriMBIhIv0UWSwcoshOLd8Jk9QSNJikz19IKZVhE3H4Sb2W9NtoG9VP7NTcMmXLGoSCt9TpviEBj1N/IcYSD/QBndhmC2VClJkAjF57WbfEiAjcgLssnwZ1o6UI9JuP0FTRVgK91m+XRT74KqKZm6LghgEXHSSR5YUxZgwlyKI+5JjwPAPby0LWa3lC6oltFvwSe185nJWpWdsmmEJVBlhxPaUqEYTEqosE5COx08dCLic2YwQkvRi7UDwfi5HjpLvSTkFpZT2SxmQgRdMsTg7D7vG0APsyhzGLs4AaQReU+DYkkmcC/xds21wADOHmkVEPSjMX5wALrDoGEj2neRfSQCjVnxKW2S2Gkai7BAWI9LI+WQoxtTwbQ89mqX8HsJI+nZmpUtCggaiRGp+lQwL6ac/9q8/y3lqkbr/effPXpNxjSOGH8C4w4gkcx+FJOpR63YS5sTCLRrhhr28aHSNb6PTGLDabWrH27PWfUGZ37JNAuAxz3jWNSPRSHAvqMiJ1a8xJBO2Bh6QL2wD7hPfSrG8kqoFkgGAu4QsEyqRljxT7nE98ByxVjfi8TgkX6bpj4T3spTSIxoIEbn6RN2YIiqJeQzSgMXEFm8AGuzEFrsgdYZgjAW+mDkkw50M+vTnc6ph3O1uv7m3uPCnSsabd0m5rdc3UHmg72jNtTzvQ3IqsfKp8qXytfqx+rn6rfs9NV1cKnxvazKr++A3w4ANP</latexit>

EuroParl + OpenSubtitles ↵<latexit sha1_base64="KLcj9QsFirN9JCJt1Gli0B4M6Gw=">AAAF+nicjVTbbtMwGM5GCyOcOrjkxmKq1NGDkoIACSZNoElcgBiwk9S0kZM6rZlzUOzSVJkfhRsuQIhbnoQ73obfSQtp1w4sRbH/w+fvP9mJGOXCMH6trV8qlS9f2biqX7t+4+atyubtIx6OYpccuiEL4xMHc8JoQA4FFYycRDHBvsPIsXP6QumPP5KY0zA4EJOIdH08CKhHXSxAZG+WKlUU1ZJtZEVxGIkQkV6KLBYOUGSnlu+ESWoJGkzSV6+llMqwjjj8pF5Nek20g2pjOzXbpmxY/VDwhjo9MiTgceovxVjhgc7Q2DYbKBOizARg9OqbmiWGROBtuMvyaVAzGopAr/4UzRRNJdCrvWZR5IOvIpq5KQpuGHDRQhJZXhhjxlCCLOpDjgnPM7CzKmS9miekltjmlF9iHyiHizI1b1sHU6hKHyuu51QFgpBYdZGAfCR2+kzI1cTmjICkF2MXiudjMXScdE/aKSitrEfSmPSlqJvTk8OwezorwIHMYczpWUCNoAsKfLclkhcC/xds01wCDOHmkVEPSrMQZx8LrDoGEr2g+RDSQCjVnxIW2S2HkaizAgWIdLM+WQkxs7wYQ89mqXcfsJIenZupQtCggaiRGp+VQwL6Wc/9q8/y3lqmbr7be//XpNhjSOGH8C4w4gkcx+E4HUjdwiwaYruyZbSMbKHzG3O62dKma9+u/IQCuyOfBMJlmPOOaUSim+JYUJcRgB1xEkFj4AHpwDbAPuHdNOsYiaog6SOYSPgCgTJp0SPFPucT3wFLVR2+qFPCZbrOSHhPuikNopEggZtf5I0YgnKodxD1aUxcwSawwW5MgStyhxgmSMBrqUMSzMWQz2+O2i3zQav99uHW7vNpOja0u9o9raaZ2mNtV3up7WuHmlsalz6VvpS+ls/Kn8vfyt9z0/W1qc8dbW6Vf/wG2L8CJA==</latexit>

↵ = 1<latexit sha1_base64="xIEQP2x/mHoaZFbz/3+L5Jgy2hk=">AAAF/nicjVTbbtMwGM5GCyMctgF33FhMlTp6UDIQIMGkCTSJCxADdpKaNnJSpzVzDopdlspY4lW44QKEuOU5uONt+J22kHbtwFIU+z98/v6TvYRRLizr19LyhVL54qWVy+aVq9eur66t3zjk8SD1yYEfszg99jAnjEbkQFDByHGSEhx6jBx5J8+0/ug9STmNo30xTEg7xL2IBtTHAkTueulWBSXVbBM5SRonIkakI5HD4h5KXOmEXpxJR9BoKF+8VEppwxri8FNmJes00DaqnrrS3rJV3enGgtf16YGlAI/TcC7GAg/0AZ26dh3lQpSbAIxZeVV1RJ8IvAl3OSGNqlZdE+jUHqOJoqEFZqXTKIpC8NVEczdNwY8jLppIISeIU8wYypBDQ8gx4aMMbC8K2ayMElLNXHvML3P3tcN5mZq2rYEpVKWLNdczqgJBSKy+SEA+Mlc+EWoxsSkjIBmk2IfihVj0PU/uKleC0sl7RKakq0TNHp88hv2TSQH21QjGHp8F1Ai6oMB3UyF1LvB/wTbsOcAQ7igyGkBpZuLsYoF1x0CiZzTvYhoJrfpTwiK7+TAKtRagAJF23icLISaW52OY+Sx17gJW1qFTM1UIGjQQNdLjs3BIQD/puX/12ai35qkbb3bf/jUp9hjS+DG8C4wEAqdpfCp7ynQwS/oYrrPdtQ2raeULnd3Y482GMV577tpPKLI/CEkkfIY5b9lWItoSp4L6jAD0gJMEmgP3SAu2EQ4Jb8u8axSqgKSLYCrhiwTKpUUPiUPOh6EHlrpCfFanhfN0rYEIHrUljZKBIJE/uigYMAQl0W8h6tKU+IINYYP9lAJX5PcxTJGAF9OEJNizIZ/dHG417XvNrdf3N3aejtOxYtw27hhVwzYeGjvGc2PPODD8kix9Kn0pfS1/LH8ufyt/H5kuL419bhpTq/zjN1ihAvo=</latexit>

Page 65: Low Resource Machine Translation - Stanford University

• Back-Translation:

• Self-Training:

target mono data

source mono data

in-domain mono data

Q.: Is it better to have clean targets but out-of-domain data, or noisy targets but in-domain data? Q.: What’s the effect of amount of parallel/monolingual data? Q.: What’s the effect of the quality of the model forward model when training with ST?

65

Page 66: Low Resource Machine Translation - Stanford University

Varying Domain of Target Originating Data

Target originating data is out-of-domain

Target originating data is in-domain

(1� ↵)<latexit sha1_base64="hXiGHlRj7nWlfLezr2o5piRxOpQ=">AAAGAHicjVTbbtMwGM5GCyOcNpC44cZiqtTSg5KBAAkmTaBJXIAYsI1JTRs5qdOaOQfFLksVfMOrcMMFCHHLY3DH2/A7SSHt2oGlKPZ/+Pz9JzsRo1wYxq+V1XOV6vkLaxf1S5evXL22vnH9kIfj2CUHbsjC+MjBnDAakANBBSNHUUyw7zDy1jl+qvRv35OY0zDYF5OI9Hw8DKhHXSxAZG9UbtZQVE8ayIriMBIhIv0UWSwcoshOLd8Jk9QSNJikz19IKZVhE3H4Sb2W9NtoG9VP7NTcMmXLGoSCt9TpviEBj1N/IcYSD/QBndhmC2VClJkAjF57WbfEiAjcgLssnwZ1o6UI9JuP0FTRVgK91m+XRT74KqKZm6LghgEXHSSR5YUxZgwlyKI+5JjwPAPby0LWa3lC6oltFvwSe185nJWpWdsmmEJVBlhxPaUqEYTEqosE5COx08dCLic2YwQkvRi7UDwfi5HjpLvSTkFpZT2SxmQgRdMsTg7D7vG0APsyhzGLs4AaQReU+DYkkmcC/xds21wADOHmkVEPSjMX5wALrDoGEj2neRfSQCjVnxKW2S2Gkai7BAWI9LI+WQoxtTwbQ89mqX8HsJI+nZmpUtCggaiRGp+lQwL6ac/9q8/y3lqkbr/effPXpNxjSOGH8C4w4gkcx+FJOpR63YS5sTCLRrhhr28aHSNb6PTGLDabWrH27PWfUGZ37JNAuAxz3jWNSPRSHAvqMiJ1a8xJBO2Bh6QL2wD7hPfSrG8kqoFkgGAu4QsEyqRljxT7nE98ByxVjfi8TgkX6bpj4T3spTSIxoIEbn6RN2YIiqJeQzSgMXEFm8AGuzEFrsgdYZgjAW+mDkkw50M+vTnc6ph3O1uv7m3uPCnSsabd0m5rdc3UHmg72jNtTzvQ3IqsfKp8qXytfqx+rn6rfs9NV1cKnxvazKr++A3w4ANP</latexit>

EuroParl + OpenSubtitles ↵<latexit sha1_base64="KLcj9QsFirN9JCJt1Gli0B4M6Gw=">AAAF+nicjVTbbtMwGM5GCyOcOrjkxmKq1NGDkoIACSZNoElcgBiwk9S0kZM6rZlzUOzSVJkfhRsuQIhbnoQ73obfSQtp1w4sRbH/w+fvP9mJGOXCMH6trV8qlS9f2biqX7t+4+atyubtIx6OYpccuiEL4xMHc8JoQA4FFYycRDHBvsPIsXP6QumPP5KY0zA4EJOIdH08CKhHXSxAZG+WKlUU1ZJtZEVxGIkQkV6KLBYOUGSnlu+ESWoJGkzSV6+llMqwjjj8pF5Nek20g2pjOzXbpmxY/VDwhjo9MiTgceovxVjhgc7Q2DYbKBOizARg9OqbmiWGROBtuMvyaVAzGopAr/4UzRRNJdCrvWZR5IOvIpq5KQpuGHDRQhJZXhhjxlCCLOpDjgnPM7CzKmS9miekltjmlF9iHyiHizI1b1sHU6hKHyuu51QFgpBYdZGAfCR2+kzI1cTmjICkF2MXiudjMXScdE/aKSitrEfSmPSlqJvTk8OwezorwIHMYczpWUCNoAsKfLclkhcC/xds01wCDOHmkVEPSrMQZx8LrDoGEr2g+RDSQCjVnxIW2S2HkaizAgWIdLM+WQkxs7wYQ89mqXcfsJIenZupQtCggaiRGp+VQwL6Wc/9q8/y3lqmbr7be//XpNhjSOGH8C4w4gkcx+E4HUjdwiwaYruyZbSMbKHzG3O62dKma9+u/IQCuyOfBMJlmPOOaUSim+JYUJcRgB1xEkFj4AHpwDbAPuHdNOsYiaog6SOYSPgCgTJp0SPFPucT3wFLVR2+qFPCZbrOSHhPuikNopEggZtf5I0YgnKodxD1aUxcwSawwW5MgStyhxgmSMBrqUMSzMWQz2+O2i3zQav99uHW7vNpOja0u9o9raaZ2mNtV3up7WuHmlsalz6VvpS+ls/Kn8vfyt9z0/W1qc8dbW6Vf/wG2L8CJA==</latexit>

Page 67: Low Resource Machine Translation - Stanford University

Baseline Approaches• Bitext only:

• Back-Translation:

• Self-Training:

learn:

1) 2) 3) learn:apply

Source Target

1)

learn:

2)source mono

apply

modeltranslation

3) re-learn: source mono

modeltranslation

p(x|y)<latexit sha1_base64="CyQwSgLN0ehm4cfS61mfQujnag8=">AAADvnicdZJda9swFIYVex9d9pVul7sRC4GEZcHuBhuMQrplsIt1ZKxpC1FiZEVJlMgfWHKxcfUnx272bybbyZqkqcBwOOd5j14dHzfkTEjL+lsxzHv3Hzw8eFR9/OTps+e1wxfnIogjQgck4EF06WJBOfPpQDLJ6WUYUey5nF64yy95/eKKRoIF/plMQzry8MxnU0aw1CnnsPKngTws5wTzrKfgMUQZbCbttOUwiJSTsWO73em0f6jqDXeqxqIkk7FwFgW3KLlTR+yQsiTTsXSWBblck3KT7Klxhjw3SDIcJ2ptJG5f7Rjpa1HYTK+TVrWhb4dIMA9uOdtoevLfaLNw2obIxVGWKmfRutu0HgMicQi32uh6Kd0Av6smknMqcUtf8hYiHsyg9gavYe5uA1s/TfVOvu7VJLkGvoH+HbpfZ3tlK0s34rKT/nm1utWxigNvB/YqqIPV6Tu132gSkNijviQcCzG0rVCOMhxJRjhVVRQLGmKyxDM61KGPPSpGWbF+CjZ0ZgKnQaQ/X8Iiu6nIsCdE6rmazN8mdmt5cl9tGMvpx1HG/DCW1CflRdOYQxnAfJfhhEWUSJ7qAJOIaa+QzHGEidQbX9VDsHeffDs4P+rY7zpHP9/Xu59X4zgAr8Br0AQ2+AC64BvogwEgxicDGwtjaXbNqemZQYkalZXmJdg6ZvIPxHwo5g==</latexit>

p(y|x)<latexit sha1_base64="dBHzK3A8/hcNmG8nE+hfv1ikGd4=">AAAECnichZNNb9NAEIa3Nh8lfKVw5DIiipSIEMUFCS6VGggSB4qCaNpKdWKtN5tkE3/Ju65suXvmwl/hwgGEuPILuPFvWNtJSdJUrGRpNPO8O++MvHbgMC5arT9bmn7t+o2b27dKt+/cvXe/vPPgiPtRSGiP+I4fntiYU4d5tCeYcOhJEFLs2g49tmevs/rxGQ05871DkQS07+Kxx0aMYKFS1o4GVdPFYkKwk3Yk7IGZQi1uJHWLgSmtlO0ZjWaz8V6W/nEHcsALMh5wa5pz04I7sPgaKQoyGQhrlpOzBSmWyY4cpKZr+3GKo1gujESNszUjXVkKasl5XC9VVXMwOXNhxdjSne0Ln7XcaANMG4dpIq1p/WrPagsmiQJYuUbVC+kS+E7WTDGhAtdVk6dgOv4YlDc4h8zdEraYTHbabzZq4kwDT8C7QvfxcKNsbmmTuH2x+IKKZTZ+km1hdXTxv9GFVa60mq38wOXAmAcVND9dq/zbHPokcqkniIM5PzVageinOBSMOFSWzIjTAJMZHtNTFXrYpbyf5r+yhKrKDGHkh+rzBOTZZUWKXc4T11ZkZpKv17LkptppJEYv+ynzgkhQjxSNRpEDwofsXcCQhZQIJ1EBJiFTXoFMcIiJUK+npJZgrI98OTjabRrPmrsfnlf2X83XsY0eoceohgz0Au2jt6iLeohon7Qv2jftu/5Z/6r/0H8WqLY11zxEK0f/9RdRQUZC</latexit>

p(y|x)<latexit sha1_base64="dBHzK3A8/hcNmG8nE+hfv1ikGd4=">AAAECnichZNNb9NAEIa3Nh8lfKVw5DIiipSIEMUFCS6VGggSB4qCaNpKdWKtN5tkE3/Ju65suXvmwl/hwgGEuPILuPFvWNtJSdJUrGRpNPO8O++MvHbgMC5arT9bmn7t+o2b27dKt+/cvXe/vPPgiPtRSGiP+I4fntiYU4d5tCeYcOhJEFLs2g49tmevs/rxGQ05871DkQS07+Kxx0aMYKFS1o4GVdPFYkKwk3Yk7IGZQi1uJHWLgSmtlO0ZjWaz8V6W/nEHcsALMh5wa5pz04I7sPgaKQoyGQhrlpOzBSmWyY4cpKZr+3GKo1gujESNszUjXVkKasl5XC9VVXMwOXNhxdjSne0Ln7XcaANMG4dpIq1p/WrPagsmiQJYuUbVC+kS+E7WTDGhAtdVk6dgOv4YlDc4h8zdEraYTHbabzZq4kwDT8C7QvfxcKNsbmmTuH2x+IKKZTZ+km1hdXTxv9GFVa60mq38wOXAmAcVND9dq/zbHPokcqkniIM5PzVageinOBSMOFSWzIjTAJMZHtNTFXrYpbyf5r+yhKrKDGHkh+rzBOTZZUWKXc4T11ZkZpKv17LkptppJEYv+ynzgkhQjxSNRpEDwofsXcCQhZQIJ1EBJiFTXoFMcIiJUK+npJZgrI98OTjabRrPmrsfnlf2X83XsY0eoceohgz0Au2jt6iLeohon7Qv2jftu/5Z/6r/0H8WqLY11zxEK0f/9RdRQUZC</latexit>

p(y|x)<latexit sha1_base64="dBHzK3A8/hcNmG8nE+hfv1ikGd4=">AAAECnichZNNb9NAEIa3Nh8lfKVw5DIiipSIEMUFCS6VGggSB4qCaNpKdWKtN5tkE3/Ju65suXvmwl/hwgGEuPILuPFvWNtJSdJUrGRpNPO8O++MvHbgMC5arT9bmn7t+o2b27dKt+/cvXe/vPPgiPtRSGiP+I4fntiYU4d5tCeYcOhJEFLs2g49tmevs/rxGQ05871DkQS07+Kxx0aMYKFS1o4GVdPFYkKwk3Yk7IGZQi1uJHWLgSmtlO0ZjWaz8V6W/nEHcsALMh5wa5pz04I7sPgaKQoyGQhrlpOzBSmWyY4cpKZr+3GKo1gujESNszUjXVkKasl5XC9VVXMwOXNhxdjSne0Ln7XcaANMG4dpIq1p/WrPagsmiQJYuUbVC+kS+E7WTDGhAtdVk6dgOv4YlDc4h8zdEraYTHbabzZq4kwDT8C7QvfxcKNsbmmTuH2x+IKKZTZ+km1hdXTxv9GFVa60mq38wOXAmAcVND9dq/zbHPokcqkniIM5PzVageinOBSMOFSWzIjTAJMZHtNTFXrYpbyf5r+yhKrKDGHkh+rzBOTZZUWKXc4T11ZkZpKv17LkptppJEYv+ynzgkhQjxSNRpEDwofsXcCQhZQIJ1EBJiFTXoFMcIiJUK+npJZgrI98OTjabRrPmrsfnlf2X83XsY0eoceohgz0Au2jt6iLeohon7Qv2jftu/5Z/6r/0H8WqLY11zxEK0f/9RdRQUZC</latexit>

p(x|y)<latexit sha1_base64="CyQwSgLN0ehm4cfS61mfQujnag8=">AAADvnicdZJda9swFIYVex9d9pVul7sRC4GEZcHuBhuMQrplsIt1ZKxpC1FiZEVJlMgfWHKxcfUnx272bybbyZqkqcBwOOd5j14dHzfkTEjL+lsxzHv3Hzw8eFR9/OTps+e1wxfnIogjQgck4EF06WJBOfPpQDLJ6WUYUey5nF64yy95/eKKRoIF/plMQzry8MxnU0aw1CnnsPKngTws5wTzrKfgMUQZbCbttOUwiJSTsWO73em0f6jqDXeqxqIkk7FwFgW3KLlTR+yQsiTTsXSWBblck3KT7Klxhjw3SDIcJ2ptJG5f7Rjpa1HYTK+TVrWhb4dIMA9uOdtoevLfaLNw2obIxVGWKmfRutu0HgMicQi32uh6Kd0Av6smknMqcUtf8hYiHsyg9gavYe5uA1s/TfVOvu7VJLkGvoH+HbpfZ3tlK0s34rKT/nm1utWxigNvB/YqqIPV6Tu132gSkNijviQcCzG0rVCOMhxJRjhVVRQLGmKyxDM61KGPPSpGWbF+CjZ0ZgKnQaQ/X8Iiu6nIsCdE6rmazN8mdmt5cl9tGMvpx1HG/DCW1CflRdOYQxnAfJfhhEWUSJ7qAJOIaa+QzHGEidQbX9VDsHeffDs4P+rY7zpHP9/Xu59X4zgAr8Br0AQ2+AC64BvogwEgxicDGwtjaXbNqemZQYkalZXmJdg6ZvIPxHwo5g==</latexit>

p(y|x)<latexit sha1_base64="dBHzK3A8/hcNmG8nE+hfv1ikGd4=">AAAECnichZNNb9NAEIa3Nh8lfKVw5DIiipSIEMUFCS6VGggSB4qCaNpKdWKtN5tkE3/Ju65suXvmwl/hwgGEuPILuPFvWNtJSdJUrGRpNPO8O++MvHbgMC5arT9bmn7t+o2b27dKt+/cvXe/vPPgiPtRSGiP+I4fntiYU4d5tCeYcOhJEFLs2g49tmevs/rxGQ05871DkQS07+Kxx0aMYKFS1o4GVdPFYkKwk3Yk7IGZQi1uJHWLgSmtlO0ZjWaz8V6W/nEHcsALMh5wa5pz04I7sPgaKQoyGQhrlpOzBSmWyY4cpKZr+3GKo1gujESNszUjXVkKasl5XC9VVXMwOXNhxdjSne0Ln7XcaANMG4dpIq1p/WrPagsmiQJYuUbVC+kS+E7WTDGhAtdVk6dgOv4YlDc4h8zdEraYTHbabzZq4kwDT8C7QvfxcKNsbmmTuH2x+IKKZTZ+km1hdXTxv9GFVa60mq38wOXAmAcVND9dq/zbHPokcqkniIM5PzVageinOBSMOFSWzIjTAJMZHtNTFXrYpbyf5r+yhKrKDGHkh+rzBOTZZUWKXc4T11ZkZpKv17LkptppJEYv+ynzgkhQjxSNRpEDwofsXcCQhZQIJ1EBJiFTXoFMcIiJUK+npJZgrI98OTjabRrPmrsfnlf2X83XsY0eoceohgz0Au2jt6iLeohon7Qv2jftu/5Z/6r/0H8WqLY11zxEK0f/9RdRQUZC</latexit>

Page 68: Low Resource Machine Translation - Stanford University

• Bitext only:

• Back-Translation:

• Self-Training:

learn:

1) 2) 3) learn:apply

Source Target

1)

learn:

2)source mono

apply

modeltranslation

3) re-learn: source mono

modeltranslation

p(x|y)<latexit sha1_base64="CyQwSgLN0ehm4cfS61mfQujnag8=">AAADvnicdZJda9swFIYVex9d9pVul7sRC4GEZcHuBhuMQrplsIt1ZKxpC1FiZEVJlMgfWHKxcfUnx272bybbyZqkqcBwOOd5j14dHzfkTEjL+lsxzHv3Hzw8eFR9/OTps+e1wxfnIogjQgck4EF06WJBOfPpQDLJ6WUYUey5nF64yy95/eKKRoIF/plMQzry8MxnU0aw1CnnsPKngTws5wTzrKfgMUQZbCbttOUwiJSTsWO73em0f6jqDXeqxqIkk7FwFgW3KLlTR+yQsiTTsXSWBblck3KT7Klxhjw3SDIcJ2ptJG5f7Rjpa1HYTK+TVrWhb4dIMA9uOdtoevLfaLNw2obIxVGWKmfRutu0HgMicQi32uh6Kd0Av6smknMqcUtf8hYiHsyg9gavYe5uA1s/TfVOvu7VJLkGvoH+HbpfZ3tlK0s34rKT/nm1utWxigNvB/YqqIPV6Tu132gSkNijviQcCzG0rVCOMhxJRjhVVRQLGmKyxDM61KGPPSpGWbF+CjZ0ZgKnQaQ/X8Iiu6nIsCdE6rmazN8mdmt5cl9tGMvpx1HG/DCW1CflRdOYQxnAfJfhhEWUSJ7qAJOIaa+QzHGEidQbX9VDsHeffDs4P+rY7zpHP9/Xu59X4zgAr8Br0AQ2+AC64BvogwEgxicDGwtjaXbNqemZQYkalZXmJdg6ZvIPxHwo5g==</latexit>

p(y|x)<latexit sha1_base64="dBHzK3A8/hcNmG8nE+hfv1ikGd4=">AAAECnichZNNb9NAEIa3Nh8lfKVw5DIiipSIEMUFCS6VGggSB4qCaNpKdWKtN5tkE3/Ju65suXvmwl/hwgGEuPILuPFvWNtJSdJUrGRpNPO8O++MvHbgMC5arT9bmn7t+o2b27dKt+/cvXe/vPPgiPtRSGiP+I4fntiYU4d5tCeYcOhJEFLs2g49tmevs/rxGQ05871DkQS07+Kxx0aMYKFS1o4GVdPFYkKwk3Yk7IGZQi1uJHWLgSmtlO0ZjWaz8V6W/nEHcsALMh5wa5pz04I7sPgaKQoyGQhrlpOzBSmWyY4cpKZr+3GKo1gujESNszUjXVkKasl5XC9VVXMwOXNhxdjSne0Ln7XcaANMG4dpIq1p/WrPagsmiQJYuUbVC+kS+E7WTDGhAtdVk6dgOv4YlDc4h8zdEraYTHbabzZq4kwDT8C7QvfxcKNsbmmTuH2x+IKKZTZ+km1hdXTxv9GFVa60mq38wOXAmAcVND9dq/zbHPokcqkniIM5PzVageinOBSMOFSWzIjTAJMZHtNTFXrYpbyf5r+yhKrKDGHkh+rzBOTZZUWKXc4T11ZkZpKv17LkptppJEYv+ynzgkhQjxSNRpEDwofsXcCQhZQIJ1EBJiFTXoFMcIiJUK+npJZgrI98OTjabRrPmrsfnlf2X83XsY0eoceohgz0Au2jt6iLeohon7Qv2jftu/5Z/6r/0H8WqLY11zxEK0f/9RdRQUZC</latexit>

p(y|x)<latexit sha1_base64="dBHzK3A8/hcNmG8nE+hfv1ikGd4=">AAAECnichZNNb9NAEIa3Nh8lfKVw5DIiipSIEMUFCS6VGggSB4qCaNpKdWKtN5tkE3/Ju65suXvmwl/hwgGEuPILuPFvWNtJSdJUrGRpNPO8O++MvHbgMC5arT9bmn7t+o2b27dKt+/cvXe/vPPgiPtRSGiP+I4fntiYU4d5tCeYcOhJEFLs2g49tmevs/rxGQ05871DkQS07+Kxx0aMYKFS1o4GVdPFYkKwk3Yk7IGZQi1uJHWLgSmtlO0ZjWaz8V6W/nEHcsALMh5wa5pz04I7sPgaKQoyGQhrlpOzBSmWyY4cpKZr+3GKo1gujESNszUjXVkKasl5XC9VVXMwOXNhxdjSne0Ln7XcaANMG4dpIq1p/WrPagsmiQJYuUbVC+kS+E7WTDGhAtdVk6dgOv4YlDc4h8zdEraYTHbabzZq4kwDT8C7QvfxcKNsbmmTuH2x+IKKZTZ+km1hdXTxv9GFVa60mq38wOXAmAcVND9dq/zbHPokcqkniIM5PzVageinOBSMOFSWzIjTAJMZHtNTFXrYpbyf5r+yhKrKDGHkh+rzBOTZZUWKXc4T11ZkZpKv17LkptppJEYv+ynzgkhQjxSNRpEDwofsXcCQhZQIJ1EBJiFTXoFMcIiJUK+npJZgrI98OTjabRrPmrsfnlf2X83XsY0eoceohgz0Au2jt6iLeohon7Qv2jftu/5Z/6r/0H8WqLY11zxEK0f/9RdRQUZC</latexit>

p(y|x)<latexit sha1_base64="dBHzK3A8/hcNmG8nE+hfv1ikGd4=">AAAECnichZNNb9NAEIa3Nh8lfKVw5DIiipSIEMUFCS6VGggSB4qCaNpKdWKtN5tkE3/Ju65suXvmwl/hwgGEuPILuPFvWNtJSdJUrGRpNPO8O++MvHbgMC5arT9bmn7t+o2b27dKt+/cvXe/vPPgiPtRSGiP+I4fntiYU4d5tCeYcOhJEFLs2g49tmevs/rxGQ05871DkQS07+Kxx0aMYKFS1o4GVdPFYkKwk3Yk7IGZQi1uJHWLgSmtlO0ZjWaz8V6W/nEHcsALMh5wa5pz04I7sPgaKQoyGQhrlpOzBSmWyY4cpKZr+3GKo1gujESNszUjXVkKasl5XC9VVXMwOXNhxdjSne0Ln7XcaANMG4dpIq1p/WrPagsmiQJYuUbVC+kS+E7WTDGhAtdVk6dgOv4YlDc4h8zdEraYTHbabzZq4kwDT8C7QvfxcKNsbmmTuH2x+IKKZTZ+km1hdXTxv9GFVa60mq38wOXAmAcVND9dq/zbHPokcqkniIM5PzVageinOBSMOFSWzIjTAJMZHtNTFXrYpbyf5r+yhKrKDGHkh+rzBOTZZUWKXc4T11ZkZpKv17LkptppJEYv+ynzgkhQjxSNRpEDwofsXcCQhZQIJ1EBJiFTXoFMcIiJUK+npJZgrI98OTjabRrPmrsfnlf2X83XsY0eoceohgz0Au2jt6iLeohon7Qv2jftu/5Z/6r/0H8WqLY11zxEK0f/9RdRQUZC</latexit>

p(x|y)<latexit sha1_base64="CyQwSgLN0ehm4cfS61mfQujnag8=">AAADvnicdZJda9swFIYVex9d9pVul7sRC4GEZcHuBhuMQrplsIt1ZKxpC1FiZEVJlMgfWHKxcfUnx272bybbyZqkqcBwOOd5j14dHzfkTEjL+lsxzHv3Hzw8eFR9/OTps+e1wxfnIogjQgck4EF06WJBOfPpQDLJ6WUYUey5nF64yy95/eKKRoIF/plMQzry8MxnU0aw1CnnsPKngTws5wTzrKfgMUQZbCbttOUwiJSTsWO73em0f6jqDXeqxqIkk7FwFgW3KLlTR+yQsiTTsXSWBblck3KT7Klxhjw3SDIcJ2ptJG5f7Rjpa1HYTK+TVrWhb4dIMA9uOdtoevLfaLNw2obIxVGWKmfRutu0HgMicQi32uh6Kd0Av6smknMqcUtf8hYiHsyg9gavYe5uA1s/TfVOvu7VJLkGvoH+HbpfZ3tlK0s34rKT/nm1utWxigNvB/YqqIPV6Tu132gSkNijviQcCzG0rVCOMhxJRjhVVRQLGmKyxDM61KGPPSpGWbF+CjZ0ZgKnQaQ/X8Iiu6nIsCdE6rmazN8mdmt5cl9tGMvpx1HG/DCW1CflRdOYQxnAfJfhhEWUSJ7qAJOIaa+QzHGEidQbX9VDsHeffDs4P+rY7zpHP9/Xu59X4zgAr8Br0AQ2+AC64BvogwEgxicDGwtjaXbNqemZQYkalZXmJdg6ZvIPxHwo5g==</latexit>

Baseline Approaches: Only In-Domain Data

p(y|x)<latexit sha1_base64="dBHzK3A8/hcNmG8nE+hfv1ikGd4=">AAAECnichZNNb9NAEIa3Nh8lfKVw5DIiipSIEMUFCS6VGggSB4qCaNpKdWKtN5tkE3/Ju65suXvmwl/hwgGEuPILuPFvWNtJSdJUrGRpNPO8O++MvHbgMC5arT9bmn7t+o2b27dKt+/cvXe/vPPgiPtRSGiP+I4fntiYU4d5tCeYcOhJEFLs2g49tmevs/rxGQ05871DkQS07+Kxx0aMYKFS1o4GVdPFYkKwk3Yk7IGZQi1uJHWLgSmtlO0ZjWaz8V6W/nEHcsALMh5wa5pz04I7sPgaKQoyGQhrlpOzBSmWyY4cpKZr+3GKo1gujESNszUjXVkKasl5XC9VVXMwOXNhxdjSne0Ln7XcaANMG4dpIq1p/WrPagsmiQJYuUbVC+kS+E7WTDGhAtdVk6dgOv4YlDc4h8zdEraYTHbabzZq4kwDT8C7QvfxcKNsbmmTuH2x+IKKZTZ+km1hdXTxv9GFVa60mq38wOXAmAcVND9dq/zbHPokcqkniIM5PzVageinOBSMOFSWzIjTAJMZHtNTFXrYpbyf5r+yhKrKDGHkh+rzBOTZZUWKXc4T11ZkZpKv17LkptppJEYv+ynzgkhQjxSNRpEDwofsXcCQhZQIJ1EBJiFTXoFMcIiJUK+npJZgrI98OTjabRrPmrsfnlf2X83XsY0eoceohgz0Au2jt6iLeohon7Qv2jftu/5Z/6r/0H8WqLY11zxEK0f/9RdRQUZC</latexit>

Page 69: Low Resource Machine Translation - Stanford University

In-Domain Only VS. Mixed Domain

69

Page 70: Low Resource Machine Translation - Stanford University

Varying Amount of Monolingual Data ( )↵ = 0

<latexit sha1_base64="ZPnRfubrH7b3OY+yvc0OydClHfo=">AAAF/nicjVTbbtMwGM5GCyMctgF33FhMlTp6UDIQIMGkCTSJCxADdpKaNnJStzVzDopdlspY4lW44QKEuOU5uONt+J2k0HbtwFIU+z98/v6TvZhRLizr19LyhVL54qWVy+aVq9eur66t3zjk0TDxyYEfsSg59jAnjIbkQFDByHGcEBx4jBx5J8+0/ug9STiNwn0xikk7wP2Q9qiPBYjc9dKtCoqr6SZy4iSKRYRIRyKHRX0Uu9IJvCiVjqDhSL54qZTShjXE4afMStppoG1UPXWlvWWrutONBK/r0wNLAR6nwVyMBR7oAzp17TrKhCgzARiz8qrqiAEReBPucgIaVq26JtCpPUZjRUMLzEqnMSkKwFcTzdw0BT8KuWgihZxelGDGUIocGkCOCc8zsL0oZLOSJ6SaunbBL3X3tcN5mZq2rYEpVKWLNdczqgmCkFh9kYB8pK58ItRiYlNGQLKXYB+KF2Ax8Dy5q1wJSifrEZmQrhI1uzh5DPsn4wLsqxzGLs4CagRdMMF3UyF1LvB/wTbsOcAQbh4Z7UFpZuLsYoF1x0CiZzTvIhoKrfpTwkl282EUai1AASLtrE8WQowtz8cws1nq3AWstEOnZmoiaNBA1EiPz8IhAf245/7VZ3lvzVM33uy+/Wsy2WNI40fwLjDSEzhJolPZV6aDWTzAcJ3lrm1YTStb6OzGLjYbRrH23LWfUGR/GJBQ+Axz3rKtWLQlTgT1GQHoIScxNAfukxZsQxwQ3pZZ1yhUAUkXwVTCFwqUSSc9JA44HwUeWOoK8VmdFs7TtYai96gtaRgPBQn9/KLekCEoiX4LUZcmxBdsBBvsJxS4In+AYYoEvJgmJMGeDfns5nCrad9rbr2+v7HztEjHinHbuGNUDdt4aOwYz40948DwS7L0qfSl9LX8sfy5/K38PTddXip8bhpTq/zjN1cdAvk=</latexit>

Page 71: Low Resource Machine Translation - Stanford University

Conclusion• STDM is particularly significant in low resource language pairs.

• Controlled setting helps studying STDM in isolation.

• ST is more robust than BT to STDM. We have already seen that combining ST & BT worked best in En-My.

• In practice, the influence of STDM depends on several factors, such as the amount of parallel and monolingual data, the domains, etc. In particular, if domains are not too distinct, STDM may even help regularizing!

71

Page 72: Low Resource Machine Translation - Stanford University

What I did not talk about: Filtering

• Data: extract clean version of common crawl.

• Learn a joint embedding space.

• Use approximate nearest neighbor methods to find closest matching sentence in embedding space.

• Translation quality on several languages is even higher than using actual existing bilingual + mono data!

Schwenk et al. “CCMatrix: mining billions of high-quality parallel sentences on the WEB” arXiv:1911.04944 2019

BT pre-training

multilingualfiltering

Page 73: Low Resource Machine Translation - Stanford University

Final Remarks• Low resource MT is a good use case applications of several long standing ML problems:

aligning domains, learning with less supervision, leveraging compositionality, etc.

• The importance and difficulty of data collection should not be under-estimated.

• A healthy cycle of research: data, modeling, analysis.

• Low resource MT key idea: use as many auxiliary tasks and data.

• Low resource MT requires lots of data and compute.

• Lots of open challenges. • Specific to low resource: dealing with all sorts of domain mismatch, learning from little data, quality of evaluation

sets… • General of text generation: better use of context, common sense, striking a good trade-off between accuracy and

speed, controllability, safety, biases…

73

Page 74: Low Resource Machine Translation - Stanford University

Questions? Вопросы?

¿Preguntas? Domande?

Guillaume Lample

Ludovic Denoyer

Alexis Conneau Hervé Jegou

Myle Ott

Michael AuliSergey Edunov

Peng-Jen Chen

Matt Le

Jiajun Shen

Juan-Miguel Pino

Jiatao Gu

Philipp Kohen

Paco Guzmán

Vishrav Chaudhary

Xian Li

Junxian He

Naman Goyal