65
Word Embeddings Yulia Tsvetkov – CMU Slides: Dan Jurafsky – Stanford, Mike Peters – AI2, Edouard Grave – FAIR Algorithms for NLP

Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Word EmbeddingsYulia Tsvetkov – CMU

Slides: Dan Jurafsky – Stanford, Mike Peters – AI2, Edouard Grave – FAIR

Algorithms for NLP

Page 2: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Brown Clustering

dog [0000]

cat [0001]

ant [001]

river [010]

lake [011]

blue [10]

red [11]

dog cat ant river lake blue red

0

0

00

0 0 11 1 1

11

Page 3: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Brown Clustering

[Brown et al, 1992]

Page 4: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Brown Clustering

[ Miller et al., 2004]

Page 5: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

▪ is a vocabulary

▪ is a partition of the vocabulary into k clusters

▪ is a probability of cluster of wi to follow the cluster of wi-1

Brown Clustering

Quality(C)

The model:

Page 6: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Quality(C)

Slide by Michael Collins

Page 7: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

A Naive Algorithm

▪ We start with |V| clusters: each word gets its own cluster

▪ Our aim is to find k final clusters

▪ We run |V| − k merge steps:

▪ At each merge step we pick two clusters ci and cj , and merge them into a single cluster

▪ We greedily pick merges such that Quality(C) for the clustering C after the merge step is maximized at each stage

▪ Cost? Naive = O(|V|5 ). Improved algorithm gives O(|V|3 ): still too slow for realistic values of |V|

Slide by Michael Collins

Page 8: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Brown Clustering Algorithm▪ Parameter of the approach is m (e.g., m = 1000)▪ Take the top m most frequent words,

put each into its own cluster, c1, c

2, … c

m

▪ For i = (m + 1) … |V| ▪ Create a new cluster, c

m+1, for the i’th most frequent word.

We now have m + 1 clusters ▪ Choose two clusters from c

1 . . . c

m+1 to be merged: pick the merge that gives

a maximum value for Quality(C). We’re now back to m clusters

▪ Carry out (m − 1) final merges, to create a full hierarchy

▪ Running time: O(|V|m2 + n) where n is corpus length

Slide by Michael Collins

Page 9: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Plan for Today

▪ Word2Vec▪ Representation is created by training a classifier to distinguish nearby and

far-away words

▪ FastText▪ Extension of word2vec to include subword information

▪ ELMo▪ Contextual token embeddings

▪ Multilingual embeddings▪ Using embeddings to study history and culture

Page 10: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Word2Vec

▪ Popular embedding method▪ Very fast to train▪ Code available on the web▪ Idea: predict rather than count

Page 11: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Word2Vec

[Mikolov et al.’ 13]

Page 12: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Skip-gram Prediction

▪ Predict vs Count

the cat sat on the mat

Page 13: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

▪ Predict vs Count

Skip-gram Prediction

the cat sat on the mat

context size = 2

wt = the CLASSIFIER

wt-2

= <start-2

>w

t-1 = <start

-1>

wt+1

= catw

t+2 = sat

Page 14: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Skip-gram Prediction

▪ Predict vs Count

the cat sat on the mat

context size = 2

wt = cat CLASSIFIER

wt-2

= <start-1

>w

t-1 = the

wt+1

= satw

t+2 = on

Page 15: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

the cat sat on the mat

▪ Predict vs Count

Skip-gram Prediction

context size = 2

wt = sat CLASSIFIER

wt-2

= thew

t-1 = cat

wt+1

= onw

t+2 = the

Page 16: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

▪ Predict vs Count

the cat sat on the mat

Skip-gram Prediction

context size = 2

wt = on CLASSIFIER

wt-2

= catw

t-1 = sat

wt+1

= thew

t+2 = mat

Page 17: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

▪ Predict vs Count

the cat sat on the mat

Skip-gram Prediction

context size = 2

wt = the CLASSIFIER

wt-2

= satw

t-1 = on

wt+1

= matw

t+2 = <end

+1>

Page 18: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

▪ Predict vs Count

the cat sat on the mat

Skip-gram Prediction

context size = 2

wt = mat CLASSIFIER

wt-2

= onw

t-1 = the

wt+1

= <end+1

>w

t+2 = <end

+2>

Page 19: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

▪ Predict vs Count

Skip-gram Prediction

wt = the CLASSIFIER

wt-2

= <start-2

>w

t-1 = <start

-1>

wt+1

= catw

t+2 = sat

wt = the CLASSIFIER

wt-2

= satw

t-1 = on

wt+1

= matw

t+2 = <end

+1>

Page 20: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Skip-gram Prediction

Page 21: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Skip-gram Prediction

▪ Training data

wt , w

t-2w

t , w

t-1w

t , w

t+1w

t , w

t+2...

Page 22: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Skip-gram Prediction

Page 23: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

▪ For each word in the corpus t= 1 … T

Maximize the probability of any context window given the current center word

Page 24: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Skip-gram Prediction

▪ Softmax

Page 25: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

SGNS

▪ Negative Sampling▪ Treat the target word and a neighboring context word as positive examples.

▪ subsample very frequent words

▪ Randomly sample other words in the lexicon to get negative samples▪ x2 negative samples

Given a tuple (t,c) = target, context

▪ (cat, sat)▪ (cat, aardvark)

Page 26: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Learning the classifier

▪ Iterative process▪ We’ll start with 0 or random weights▪ Then adjust the word weights to

▪ make the positive pairs more likely ▪ and the negative pairs less likely

▪ over the entire training set:

▪ Train using gradient descent

Page 27: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

How to compute p(+|t,c)?

Page 28: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

SGNS

Given a tuple (t,c) = target, context

▪ (cat, sat)▪ (cat, aardvark)

Return probability that c is a real context word:

Page 29: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Choosing noise words

Could pick w according to their unigram frequency P(w)

More common to chosen then according to pα(w)

α= ¾ works well because it gives rare noise words slightly higher probability

To show this, imagine two events p(a)=.99 and p(b) = .01:

Page 30: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Skip-gram Prediction

Page 31: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

FastText

https://fasttext.cc/

Page 32: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

FastText: Motivation

Page 33: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Subword Representation

skiing = {^skiing$, ^ski, skii, kiin, iing, ing$}

Page 34: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

FastText

Page 35: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Details

▪ n-grams between 3 and 6 characters▪ how many possible ngrams?

▪ |character set|n

▪ Hashing to map n-grams to integers in 1 to K=2M

▪ get word vectors for out-of-vocabulary words using subwords.▪ less than 2× slower than word2vec skipgram

▪ short n-grams (n = 4) are good to capture syntactic information▪ longer n-grams (n = 6) are good to capture semantic information

Page 36: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

FastText Evaluation▪ Intrinsic evaluation

▪ Arabic, German, Spanish, French, Romanian, Russian

word1 word2similarity (humans)

vanish disappear 9.8

behave obey 7.3

belief impression 5.95

muscle bone 3.65

modest flexible 0.98

hole agreement 0.3

similarity (embeddings)

1.1

0.5

0.3

1.7

0.98

0.3

Spearman's rho (human ranks, model ranks)

Page 37: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

FastText Evaluation

[Grave et al, 2017]

Page 38: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

FastText Evaluation

Page 39: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

FastText Evaluation

Page 40: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

ELMo

https://allennlp.org/elmo

Page 41: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Motivation

p(play | Elmo and Cookie Monster play a game .)

≠p(play | The Broadway play premiered yesterday .)

Page 42: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Background

Page 43: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

The Broadway play premiered yesterday .

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

??

Page 44: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

The Broadway play premiered yesterday .

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM??

Page 45: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

The Broadway play premiered yesterday .

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

?? ??

Page 46: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

The Broadway play premiered yesterday .

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

Embeddings from Language Models

ELMo= ??

Page 47: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

The Broadway play premiered yesterday .

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

Embeddings from Language Models

ELMo=

Page 48: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

The Broadway play premiered yesterday .

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

ELMo= + +

Embeddings from Language Models

Page 49: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

The Broadway play premiered yesterday .

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

LSTM

λ1

ELMoλ2 λ0= + +( ( () ) )

Embeddings from Language Models

Page 50: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Evaluation: Extrinsic Tasks

Page 51: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Stanford Question Answering Dataset (SQuAD)

[Rajpurkar et al, ‘16, ‘18]

Page 52: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

SNLI

[Bowman et al, ‘15]

Page 53: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Multilingual Embeddings

https://github.com/mfaruqui/crosslingual-ccahttp://128.2.220.95/multilingual/

Page 54: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Motivation

model 1 model 2

?

Page 55: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Motivation

English French

?

Page 56: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Canonical Correlation Analysis (CCA)

Canonical Correlation Analysis (Hotelling, 1936)

Projects two sets of vectors (of equal cardinality) in a space where they are maximally correlated.

Ω ∑CCA

Ω ∑Ω ∑

Page 57: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Canonical Correlation Analysis (CCA)

X Y

W, V = CCA(Ω, ∑)

x xW V

Ω ⊆ X, ∑ ⊆ Y

n1

d1 k

n2

d2k

X’ Y’

X’ and Y’ are now maximally correlated.

n2n1

k k

k = min(r(Ω), r(∑))

[Faruqui & Dyer, ‘14]

Page 58: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Extension: Multilingual Embeddings

[Ammar et al., ‘16]

58

English

French Spanish Arabic Swedish

French-English

Ofrench→english

Ofrench←english

Ofrench→english x

Ofrench←english

-1

Page 59: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Embeddings can help study word history!

Page 60: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Diachronic Embeddings

60



1900 1950 2000

vs.

Word vectors for 1920 Word vectors 1990

“dog” 1920 word vector“dog” 1990 word vector

▪ count-based embeddings w/ PPMI▪ projected to a common space

Page 61: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Project 300 dimensions down into 2

~30 million books, 1850-1990, Google Books data

Page 62: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Negative words change faster than positive words

Page 63: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Embeddings reflect ethnic stereotypes over time

Page 64: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Change in linguistic framing 1910-1990

Change in association of Chinese names with adjectives framed as "othering" (barbaric, monstrous, bizarre)

Page 65: Algorithms for NLPdemo.clab.cs.cmu.edu/11711fa18/slides/FA18 11-711... · FastText Evaluation Intrinsic evaluation Arabic, German, Spanish, French, Romanian, Russian word1 word2 similarity

Conclusion

▪ Concepts or word senses▪ Have a complex many-to-many association with words (homonymy, multiple

senses)▪ Have relations with each other▪ Synonymy, Antonymy, Superordinate▪ But are hard to define formally (necessary & sufficient conditions)

▪ Embeddings = vector models of meaning▪ More fine-grained than just a string or index▪ Especially good at modeling similarity/analogy▪ Just download them and use cosines!!▪ Useful in many NLP tasks▪ But know they encode cultural stereotypes