Cross-Lingual Alignment of Contextual Word Embeddings, with Applications to Zero-shot Dependency Parsing Tal Schuster*, Ori Ram*, Regina Barzilay, Amir Globerson

Cross-Lingual Alignment of Contextual Word Embeddings, with …people.csail.mit.edu › tals › publication › crosslingual_elmo › ... · 2019-06-28 · FastText Ours. Cross-Lingual


Cross-Lingual Alignment of Contextual Word Embeddings,

with Applications to Zero-shot Dependency Parsing

Tal Schuster*, Ori Ram*, Regina Barzilay, Amir Globerson


Task: Cross-lingual Zero-shot Dependency Parsing


Goal: Utilize universal space of contextual embeddings

[Bar chart, LAS (axis 60 to 90): Many -> French, non-contextual: 73.9; French -> French, ELMo: 87.1; gap: 13.2]


Idea: Align Contextual Word Embeddings

[Figure: English and Spanish embedding spaces. English: "warm". Spanish: "calentar", "cálido".]


Idea: Align Contextual Word Embeddings

[Figure: English and Spanish embedding spaces after alignment. "warm" sits next to "cálido"; "calentar" is also shown.]


Our Results – zero-shot


By aligning ELMo contextual embeddings

[Bar chart, LAS (axis 60 to 90): Many -> French, non-contextual: 73.9; Ours: 80.8; French -> French, ELMo: 87.1. Gap to French -> French: 13.2 (non-contextual), 6.3 (Ours).]


Problem Definition

English    Spanish

Dictionary:
bear    oso
warm    cálido
…       …

• ELMo embeddings

• POS tags

• ELMo embeddings

• POS tags

Goal: Learn a linear alignment (!)


Problem Definition - Extensions

English    Spanish

Goal: Learn a linear alignment (!)

Dictionary:
bear    oso
warm    cálido
…       …

• ELMo embeddings

• POS tags

• ELMo embeddings

• POS tags


Problem Definition - Extensions

English    Spanish

Dictionary:
bear    oso
warm    cálido
…       …

• ELMo embeddings

• POS tags

• Deficient ELMo embeddings

• POS tags

Goal: Alignment (!) and improve the embeddings

Small


Aligning Embeddings - Static Case

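[Figure: English and Spanish embedding spaces, with "warm" and "cálido"]

$e_i^{t} = W e_i^{s}$

$W = \operatorname{argmin}_{W \in O_d} \sum_i \left\| e_i^{t} - W e_i^{s} \right\|^2$

(Mikolov et al., 2013)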
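
When the map is restricted to orthogonal matrices, a least-squares alignment of this kind has a closed-form solution through an SVD (the orthogonal Procrustes problem). The following is a minimal NumPy sketch of that solution, not the authors' code. The function name `procrustes_align`, the row-per-word layout, and the toy data are assumptions made for illustration.

```python
import numpy as np

def procrustes_align(src, tgt):
    """Return the orthogonal W minimizing sum_i ||tgt[i] - W @ src[i]||^2.

    src, tgt: arrays of shape (n_pairs, dim); row i of each holds the
    embeddings of the i-th dictionary pair (hypothetical layout).
    """
    # Cross-covariance between target and source vectors, shape (dim, dim).
    m = tgt.T @ src
    u, _, vt = np.linalg.svd(m)
    # Dropping the singular values leaves the orthogonal factor of m.
    return u @ vt

# Toy check on made-up data: recover a known rotation from noiseless pairs.
rng = np.random.default_rng(0)
true_w, _ = np.linalg.qr(rng.normal(size=(5, 5)))
src = rng.normal(size=(100, 5))
tgt = src @ true_w.T  # tgt[i] = true_w @ src[i]
w = procrustes_align(src, tgt)
```

With noisy or imperfect dictionary pairs the recovered matrix is still orthogonal, but it only approximates the best map.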


Aligning Embeddings - Contextual Case

[Figure: "warm" with the candidate translations "calentar" and "cálido", marked "?"]

Challenges:
1. Multiple senses per token
2. Many representations per sense


The Contextual Component

Fuzz (electric guitar) , distortion effects to create "warm" and "dirty" sounds.

winning just four matches in her Wimbledon warm up tournaments

Sunday was a glorious day , clear and warm.

He was a warm friend of Pope St. Gregory.

• Contextual embeddings of the word “warm”:


Per Token Anchor

$\bar{e}_i = \mathbb{E}_c[e_{i,c}]$

[Figure label: warm]
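
The per-token anchor is the average of a token's contextual embeddings over all the contexts in which it occurs. A minimal sketch of that averaging, assuming the contextual vectors have already been extracted; the function name `compute_anchors` and the input layout are hypothetical, and the toy vectors are made up.

```python
import numpy as np

def compute_anchors(tokens, vectors):
    """Average the contextual vectors of each token type.

    tokens:  list of token strings, one per occurrence in the corpus.
    vectors: array of shape (n_occurrences, dim), aligned with tokens.
    Returns a dict mapping each token to its anchor (mean vector).
    """
    sums = {}
    counts = {}
    for tok, vec in zip(tokens, vectors):
        sums[tok] = sums.get(tok, 0.0) + vec
        counts[tok] = counts.get(tok, 0) + 1
    return {tok: sums[tok] / counts[tok] for tok in sums}

tokens = ["warm", "river", "warm"]
vectors = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 4.0]])
anchors = compute_anchors(tokens, vectors)
# anchors["warm"] is the mean of rows 0 and 2: [2.0, 2.0]
```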


Utilizing Lexical Anchors for Alignment

[Figure: English and Spanish spaces with paired anchors: river / río, less / menos]

Dictionary:
river   río
less    menos
…       …


Geometry of the Contextual Space

• Contextual representations of the same token are clustered together

• The average distance between tokens is larger than within each token

[Figure: point clouds for "warm" and "river", annotated 0.18 (within "warm") and 0.85 (between "river" and "warm")]
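
A comparison like this can be checked on any set of contextual vectors. The sketch below is one possible way to compute it, on made-up toy data. The slides do not say which distance was used, so cosine distance, the distance-to-anchor definition of "within", and all function names are assumptions.

```python
import numpy as np

def cosine_distance(a, b):
    """1 - cosine similarity between two vectors."""
    return 1.0 - (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

def mean_distance_to_anchor(vectors):
    """Average cosine distance from each contextual vector to its anchor."""
    anchor = vectors.mean(axis=0)
    return float(np.mean([cosine_distance(v, anchor) for v in vectors]))

def anchor_distance(vectors_a, vectors_b):
    """Cosine distance between the anchors of two tokens."""
    return float(cosine_distance(vectors_a.mean(axis=0), vectors_b.mean(axis=0)))

# Synthetic clusters standing in for occurrences of two tokens.
rng = np.random.default_rng(0)
warm = np.array([1.0, 0.0, 0.0]) + 0.1 * rng.normal(size=(50, 3))
river = np.array([0.0, 1.0, 0.0]) + 0.1 * rng.normal(size=(50, 3))
within = mean_distance_to_anchor(warm)
between = anchor_distance(warm, river)
```

On tightly clustered data such as this toy example, `within` comes out much smaller than `between`, which is the pattern the bullets describe.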


Factorizing the Contextual Embedding

Anchor: $\bar{e}_i = \mathbb{E}_c[e_{i,c}]$

Anchor + shift: $e_{i,c} = \bar{e}_i + \hat{e}_{i,c}$

[Figure label: warm]


Anchor Based Alignment

A. Train ELMo model per language


Anchor Based Alignment

B. Extract anchors: $\bar{e}_i = \mathbb{E}_c[e_{i,c}]$


Anchor Based Alignment

C. Compute alignment by anchors

$W = \operatorname{argmin}_{W \in O_d} \sum_i \left\| \bar{e}_i^{t} - W \bar{e}_i^{s} \right\|^2$

Dictionary:
river   río
less    menos
…       …


Anchor Based Alignment

D. Apply alignment on contextual space

$e_{i,c}^{t} = W e_{i,c}^{s} = W(\bar{e}_i + \hat{e}_{i,c})$
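
Because $W$ is linear, applying it to a contextual embedding applies it separately to the anchor and to the shift. The short derivation below spells this out. The superscript $s$ (source language) and the norm identity for an orthogonal $W$ are added here for illustration and are not taken from the slide.

```latex
\begin{align*}
W e_{i,c}^{s} &= W\left(\bar{e}_i^{s} + \hat{e}_{i,c}^{s}\right) \\
              &= W \bar{e}_i^{s} + W \hat{e}_{i,c}^{s}, \\
\left\| W \hat{e}_{i,c}^{s} \right\|_2 &= \left\| \hat{e}_{i,c}^{s} \right\|_2
  \quad \text{when } W^{\top} W = I .
\end{align*}
```

The first term is the aligned anchor. The second is the shift, rotated but not rescaled, so the context-dependent part of the embedding survives the mapping.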


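Anchor Based Alignment

A. Train ELMo model per language

B. Extract anchors: $\bar{e}_i = \mathbb{E}_c[e_{i,c}]$

C. Align by anchors: $W = \operatorname{argmin}_{W \in O_d} \sum_i \left\| \bar{e}_i^{t} - W \bar{e}_i^{s} \right\|^2$

D. Apply alignment on contextual space: $e_{i,c}^{t} = W e_{i,c}^{s}$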


Potential Problem: Multi-sense Words

bear her name

bear the pain

polar bear cub

teddy bear

• Contextual embeddings of the word “bear”:


The Alignment Works for Multi-sense Words

[Figure: aligned space showing English "bear" with Spanish "oso" and "tener"]


The Alignment Works for Multi-sense Words

soil seed bank
battery bank
clue bank

bank of the river
eastern bank of …


No Dictionary

English    Spanish

Dictionary:
bear    oso
warm    cálido
…       …

• ELMo embeddings

• POS tags

• ELMo embeddings

• POS tags


Anchor Based Alignment - Unsupervised

Compute alignment by anchors via adversarial training


Anchor Based Alignment - Unsupervised

A. Train ELMo model per language

B. Extract anchors: $\bar{e}_i = \mathbb{E}_c[e_{i,c}]$

C. Align by anchors: adversarial training

D. Apply alignment on contextual space: $e_{i,c}^{t} = W e_{i,c}^{s}$


Low Resource Languages

Articles per language:
English   2.5M
German    800k
Spanish   400k
Turkish   100k
Kazakh    3k


Low Resource Languages

Dictionary

bear oso

warm cálido

… …

• ELMo embeddings

• POS tags

• Deficient ELMo embeddings

• POS tags

Small

Goal: Alignment (!) and improve the embeddings


Anchored Language Model

A. Extract anchors from English model

$\bar{e}_i = \mathbb{E}_c[e_{i,c}]$


Anchored Language Model

B. Use anchors as seeds for the low resource language

Dictionary:
river   río
less    menos
…       …


Anchored Language Model

C. Learn language model for low resource language

$\left\| e_{i,c} - \bar{e}_i \right\|^2$


Anchored Language Model

D. Learn and apply finer alignment

Dictionary:
river   río
less    menos
…       …

$W = \operatorname{argmin}_{W \in O_d} \sum_i \left\| \bar{e}_i^{t} - W \bar{e}_i^{s} \right\|^2$


Anchored Language Model

A. Extract anchors from English model

B. Use anchors as seeds for the low resource language

C. Learn language model for low resource language: $\left\| e_{i,c} - \bar{e}_i \right\|^2$

D. Learn and apply finer alignment: $e_{i,c}^{t} = W e_{i,c}^{s}$


• Contextual embeddings (Peters et al., 2018; McCann et al., 2017; Howard and Ruder, 2018; Radford et al., 2018; Devlin et al., 2018)

• Cross-lingual alignment (Mikolov et al., 2013; Smith et al., 2017; Artetxe et al., 2017; Conneau et al., 2018)

• Multilingual parsing (Duong et al., 2015; Guo et al., 2016; Ammar et al., 2016; de Lhoneux et al., 2018; Che et al., 2018; Wang et al., 2018; Clark et al., 2018)


Related Work


Dependency Parsing

[Figure: the sentence "I prefer the morning flight" shown twice, around an Encoder block]

• First-order graph-based model (Dozat and Manning, 2017)


Cross-lingual Zero-shot

[Figure: one shared model. Train: English, Spanish, German, … Test: French.]


Cross-lingual Transfer for Dependency Parsing

[Bar chart, average LAS score (axis 60 to 80): Ammar et al.: 70.5; FastText: 74.4]
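
LAS (labeled attachment score) is the share of tokens whose predicted head and dependency label both match the gold annotation. A minimal sketch of the metric follows, assuming each token is given as a hypothetical `(head_index, label)` pair. The example labels are illustrative, and evaluation details behind the reported numbers (such as punctuation handling) are not on the slides and are left out.

```python
def labeled_attachment_score(gold, predicted):
    """Percentage of tokens with both the correct head and the correct label.

    gold, predicted: equal-length lists of (head_index, label) tuples,
    one per token (hypothetical input format).
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must cover the same tokens")
    if not gold:
        return 0.0
    correct = sum(g == p for g, p in zip(gold, predicted))
    return 100.0 * correct / len(gold)

# "I prefer the morning flight": heads are 1-based, 0 marks the root.
gold = [(2, "nsubj"), (0, "root"), (5, "det"), (5, "compound"), (2, "obj")]
pred = [(2, "nsubj"), (0, "root"), (5, "det"), (2, "compound"), (2, "obj")]
score = labeled_attachment_score(gold, pred)  # 4 of 5 tokens correct
```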


Cross-lingual Zero-shot

[Figure: one shared model. Train: English, Spanish, German, … Test: French. Per-language alignment matrices: $W_{en}$, $W_{es}$, $W_{de}$, $W_{fr}$.]
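
In the zero-shot setup one model is trained on several languages and tested on another, so embeddings from every language have to live in one space before they reach the parser. A minimal sketch of that mapping step, assuming a hypothetical `alignments` dictionary keyed by language code. The function name, the toy matrices, and the row-per-token layout are illustrative and not taken from the talk's code.

```python
import numpy as np

def to_shared_space(embeddings, lang, alignments):
    """Map contextual embeddings of one language into the shared space.

    embeddings: array of shape (n_tokens, dim), one row per token.
    lang:       language code used as a key into alignments.
    alignments: dict from language code to a (dim, dim) alignment matrix.
    """
    w = alignments[lang]
    # Row-wise application of W: each output row is w @ embedding.
    return embeddings @ w.T

# Toy 2-d matrices, made up for illustration only.
alignments = {
    "es": np.array([[0.0, -1.0], [1.0, 0.0]]),  # 90-degree rotation
    "fr": np.eye(2),
}
spanish_tokens = np.array([[1.0, 0.0], [0.0, 2.0]])
shared = to_shared_space(spanish_tokens, "es", alignments)
```

Keeping the parser itself language-agnostic and pushing all language-specific handling into the alignment matrices is what lets a language unseen in parser training be used at test time.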


Cross-lingual Transfer for Dependency Parsing

[Bar chart, average LAS score (axis 60 to 80): Ammar et al.: 70.5; FastText: 74.4; Ours: 77.3]


Cross-lingual Transfer for Dependency Parsing

[Bar chart, LAS score per language (axis 45 to 85) for German, Spanish, French, Italian, Portuguese, Swedish. Series: Guo et al., Ammar et al., FastText, Ours.]


Cross-lingual Transfer for Dependency Parsing

[Bar chart, average LAS score (axis 60 to 80): Ammar et al.: 70.5; Ours: 77.3; No dictionary: 75]


Cross-lingual Transfer for Dependency Parsing

[Bar chart, average LAS score (axis 60 to 80): Ammar et al.: 70.5; Ours: 77.3; No dictionary: 75; No POS tags: 73.1]


Low Resource Language

Dictionary:
bear    oso
warm    cálido
…       …

• ELMo embeddings

• POS tags

• Deficient ELMo embeddings

• POS tags

Small

English 10k sentences (vs. 28M)


Low Resource Language (10k sentences)

[Bar charts. LAS score (axis 20 to 45): 33.1 and 42.2. Perplexity (axis 0 to 5000): 4060 and 600.]


• ELMo embeddings are clustered around their anchor

• Anchor based alignment preserves the contextual component

• Effective for cross-lingual transfer learning (not task-specific)


Conclusions

Code available at:

https://github.com/TalSchuster/CrossLingualELMo

https://github.com/TalSchuster/allennlp-MultiLang (soon part of the AllenNLP repo)