Meta-Learning for Low-Resource NMT


Introduction

● Historically, machine translation was statistical (SMT)

● Neural machine translation (NMT) has recently outperformed statistical systems

● Statistical models still outperform NMT on low-resource language pairs

Previous Work in NMT

● Single-task training on mixed datasets

● Leveraging monolingual corpora

● Direct transfer learning

Meta-Learning in NMT

Idea: improve on direct transfer learning by better fine-tuning.

[Figure: 17 high-resource languages (including Danish, French, Spanish, Greek, Polish, Portuguese, Italian) and 4 low-resource languages (Turkish, Romanian, Latvian, Finnish)]

MAML for NMT

[Figure: the same 17 high-resource and 4 low-resource languages]

Meta-train on the high-resource pairs (e.g. Spanish→English).

Meta-test on the low-resource pairs (e.g. Turkish→English).

Note: they simulate the low-resource setting by sub-sampling.

Gradient update (task-specific fine-tuning on task $T_k$):

$\theta'_k = \theta - \alpha \nabla_\theta \mathcal{L}^{D_{T_k}}(\theta)$

Meta-gradient update (over the tasks in the meta-batch):

$\theta \leftarrow \theta - \beta \nabla_\theta \sum_k \mathcal{L}^{D'_{T_k}}(\theta'_k)$

1st-order approximate meta-gradient update (drop the second derivatives and use the post-adaptation gradients directly):

$\theta \leftarrow \theta - \beta \sum_k \nabla_{\theta'_k} \mathcal{L}^{D'_{T_k}}(\theta'_k)$
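As a concrete illustration, here is a minimal sketch of one first-order meta-step in PyTorch. It assumes a generic model and a list of per-task (support, query) batches; the model, the task list, and loss_fn are hypothetical placeholders, not the authors' code.

import copy
import torch

def first_order_maml_step(model, tasks, loss_fn,
                          inner_lr=0.01, meta_lr=0.001, inner_steps=1):
    # Accumulate one averaged meta-gradient across the sampled tasks.
    meta_grads = [torch.zeros_like(p) for p in model.parameters()]

    for support_batch, query_batch in tasks:
        # Gradient update: fine-tune a copy, theta'_k = theta - alpha * grad.
        learner = copy.deepcopy(model)
        inner_opt = torch.optim.SGD(learner.parameters(), lr=inner_lr)
        for _ in range(inner_steps):
            inner_opt.zero_grad()
            loss_fn(learner, support_batch).backward()
            inner_opt.step()

        # 1st-order approximation: the gradient of the query loss taken at
        # theta'_k is used directly as the meta-gradient w.r.t. theta.
        learner.zero_grad()
        loss_fn(learner, query_batch).backward()
        for g, p in zip(meta_grads, learner.parameters()):
            if p.grad is not None:
                g += p.grad / len(tasks)

    # Meta-gradient update on the shared initialization theta.
    with torch.no_grad():
        for p, g in zip(model.parameters(), meta_grads):
            p -= meta_lr * g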

Issue: Meta-train and meta-test input spaces should match!

Meta-train (Spanish→English): "En un lugar de la mancha, de cuyo nombre no puedo..." → "In some place in the Mancha, whose name..."

Meta-test (Turkish→English): "Benim adım kırmızı..." → "My name is Red…"

But the Spanish embedding for "nombre" and the Turkish embedding for "adım" come from word embeddings trained independently, so they live in unrelated vector spaces.

Universal Lexical Representation

Word embeddings for each language (Spanish, English, French, Turkish, ...) are trained independently on monolingual corpora.

On top of these, the ULR introduces a shared table of universal embedding keys and universal embedding values.

Universal Lexical Representation

The Spanish embedding for "nombre" is multiplied by a transformation matrix to form a query, which is scored against the universal embedding keys (transposed).

Key: we represent "nombre" as a linear combination of tokens in the ULR! The key-query scores are the weights of the linear combination over the universal embedding values.
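A minimal sketch of this lookup in PyTorch, assuming pre-trained monolingual query embeddings and a shared key/value table; the dimensions and the softmax temperature tau are illustrative assumptions, not values from the paper.

import torch

d, n_universal = 300, 5000
universal_keys = torch.randn(n_universal, d)    # language-independent keys
universal_values = torch.randn(n_universal, d)  # shared output embeddings
A = torch.randn(d, d, requires_grad=True)       # transformation matrix

def ulr_embed(query_emb, tau=0.05):
    # Score the transformed query against every universal key ...
    scores = universal_keys @ (A @ query_emb) / tau
    # ... and turn the scores into the weights of the linear combination.
    alpha = torch.softmax(scores, dim=0)
    return alpha @ universal_values  # weighted sum of universal values

# The Spanish vector for "nombre" and the Turkish vector for "adım"
# (random stand-ins here) are both mapped into the same universal space.
nombre_es = torch.randn(d)
adim_tr = torch.randn(d)
e_nombre = ulr_embed(nombre_es)
e_adim = ulr_embed(adim_tr)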

Universal Lexical Representation

The same lookup applied to the Turkish embedding for "adım" yields a representation in the same embedding space as the Spanish words!

Training

In this lookup, the transformation matrix is trainable, while the pre-trained word embeddings (e.g. the Spanish embedding for "nombre") are kept fixed.
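As a sketch of that split in PyTorch (continuing the assumptions above): only the transformation matrix receives gradient updates, while the pre-trained embeddings stay frozen. Which other components are frozen is not recoverable from the slide, so treat this split as an assumption.

import torch

d = 300
spanish_embeddings = torch.randn(30000, d)  # pre-trained monolingual vectors, fixed
universal_keys = torch.randn(5000, d)       # assumed fixed here
A = torch.nn.Parameter(torch.randn(d, d))   # transformation matrix, trainable

# Only A is handed to the optimizer, so the embeddings never change.
optimizer = torch.optim.Adam([A], lr=1e-4)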

Experiments

Comment: best to leave the decoder be (i.e., don't fine-tune it)! Why?

Experiments

Comment: the gap narrows as more training examples are included.

Critique: they don't evaluate on any real low-resource languages!

Critique: we don't know how many training examples are used per task. It's k-shot, but what is k?