Meta-Learning for Low Resource NMT

Page 1: Meta-Learning for Low Resource NMT

Meta-Learning for Low Resource NMT

Page 2: Meta-Learning for Low Resource NMT

Introduction

● Historically, machine translation was dominated by statistical methods

● Neural machine translation (NMT) has recently outperformed statistical systems on high-resource language pairs

● Statistical models still outperform NMT on low-resource language pairs

Page 3: Meta-Learning for Low Resource NMT

NMT Previous Work

● Single task, mixed datasets

● Monolingual corpora

● Direct transfer learning

Page 4: Meta-Learning for Low Resource NMT

Meta Learning in NMT

Idea:

Improve on direct transfer learning by better fine-tuning

Page 5: Meta-Learning for Low Resource NMT

MAML for NMT

[Figure: 17 high-resource languages (e.g. Danish, French, Spanish, Greek, Polish, Portuguese, Italian) and 4 low-resource languages (Turkish, Romanian, Latvian, Finnish)]

Page 6: Meta-Learning for Low Resource NMT

[Figure, repeated: the same languages, now annotated. Meta-train on the 17 high-resource pairs (e.g. Spanish→English); meta-test on the 4 low-resource pairs (e.g. Turkish→English)]

Note: they simulate the low-resource setting by sub-sampling

Page 7: Meta-Learning for Low Resource NMT

Gradient Update

Meta-Gradient Update
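Written out, the two updates on this slide follow the standard MAML recipe (a sketch in the usual MAML notation; the symbols α, β, and the loss names are assumptions, not copied from the slides):

```latex
% Gradient update: adapt to task T_k with one (or a few) SGD steps
\theta'_k = \theta - \alpha \,\nabla_\theta \mathcal{L}_{T_k}(\theta)

% Meta-gradient update: evaluate the adapted parameters on held-out
% data from each task and backpropagate through the adaptation step
\theta \leftarrow \theta - \beta \,\nabla_\theta \sum_k \mathcal{L}'_{T_k}(\theta'_k)
```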

Page 8: Meta-Learning for Low Resource NMT

Gradient Update

1st-order Approximate Meta-Gradient Update

Meta-Gradient Update
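The first-order approximation drops the second-derivative term that the exact meta-gradient requires (again a sketch in standard MAML notation, with α and the loss symbols assumed):

```latex
% Exact meta-gradient (chain rule through the inner update):
\nabla_\theta \mathcal{L}'_{T_k}(\theta'_k)
  = \bigl(I - \alpha \,\nabla^2_\theta \mathcal{L}_{T_k}(\theta)\bigr)\,
    \nabla_{\theta'_k} \mathcal{L}'_{T_k}(\theta'_k)

% 1st-order approximation: ignore the Hessian term
\nabla_\theta \mathcal{L}'_{T_k}(\theta'_k)
  \approx \nabla_{\theta'_k} \mathcal{L}'_{T_k}(\theta'_k)
```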

Page 9: Meta-Learning for Low Resource NMT

Issue: meta-train and meta-test input spaces should match!

Meta-train (Spanish): "En un lugar de la Mancha, de cuyo nombre no puedo..." → "In some place in la Mancha, whose name..."

Meta-test (Turkish): "Benim adım kırmızı..." → "My name is Red..."

But the Spanish embedding for "nombre" and the Turkish embedding for "adım" are trained independently, in unrelated embedding spaces.

Page 10: Meta-Learning for Low Resource NMT

Universal Lexical Representation

Spanish Word Embeddings

English Word Embeddings

French Word Embeddings

Turkish Word Embeddings

Word embeddings trained independently on monolingual corpora

Page 11: Meta-Learning for Low Resource NMT

Universal Lexical Representation

Spanish Word Embeddings

English Word Embeddings

French Word Embeddings

Turkish Word Embeddings

Word embeddings trained independently on monolingual corpora

Universal Embedding Values

Universal Embedding Keys

Page 12: Meta-Learning for Low Resource NMT

Universal Lexical Representation

Universal Embedding Keys(transposed)

Universal Embedding Values

nombre

Spanish Word Embeddings

Transformation Matrix

Key: We represent "nombre" as a linear combination of tokens in the ULR!

And, these are the weights of the linear combination!
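The lookup described above can be sketched numerically: transform the language-specific embedding, attend over the universal keys, and mix the universal values. This is a minimal NumPy sketch, not the authors' code; the function name, dimensions, and the temperature `tau` are illustrative assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def universal_representation(word_emb, A, keys, values, tau=0.05):
    """Project a language-specific embedding into the universal space.

    word_emb : (d,)   monolingual embedding (e.g. Spanish "nombre")
    A        : (d, d) per-language transformation matrix
    keys     : (M, d) universal embedding keys
    values   : (M, d) universal embedding values
    """
    query = word_emb @ A                   # transform into the key space
    weights = softmax(keys @ query / tau)  # weights over universal tokens
    return weights @ values                # linear combination of values

# Toy example with random vectors standing in for real embeddings.
rng = np.random.default_rng(0)
d, M = 8, 20
keys = rng.normal(size=(M, d))
values = rng.normal(size=(M, d))
A_es = rng.normal(size=(d, d))     # hypothetical Spanish transformation
nombre = rng.normal(size=(d,))     # stand-in for the embedding of "nombre"

ulr_nombre = universal_representation(nombre, A_es, keys, values)
print(ulr_nombre.shape)  # (8,)
```

Because the weights come from a softmax, they sum to 1, and every language's tokens end up expressed in the same universal value space.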

Page 13: Meta-Learning for Low Resource NMT

Universal Lexical Representation

Universal Embedding Keys(transposed)

Universal Embedding Values

adım

Turkish Word Embeddings

Transformation Matrix

Same embedding space as Spanish!

Page 14: Meta-Learning for Low Resource NMT

Training

Universal Embedding Keys(transposed)

Universal Embedding Values

nombre

Spanish Word Embeddings

Transformation Matrix

(Figure labels: the transformation matrix is trainable; the pre-trained embeddings are fixed.)

Page 15: Meta-Learning for Low Resource NMT

Experiments

Page 16: Meta-Learning for Low Resource NMT

Experiments

Comment: best to leave the decoder be! Why?

Page 17: Meta-Learning for Low Resource NMT
Page 18: Meta-Learning for Low Resource NMT

Comment: the gap narrows as more training examples are included

Page 19: Meta-Learning for Low Resource NMT

Critique: they don't evaluate on any real low-resource languages!

Critique: we don't know how many training examples there are per task. It's k-shot, but what is k?