Linguistic Regularities in Sparse and Explicit Word Representations

Omer Levy    Yoav Goldberg
Bar-Ilan University, Israel
Papers in ACL 2014*

[pie chart: "Neural Networks & Word Embeddings" vs. "Other Topics"]

* Sampling error: +/- 100%
Neural Embeddings
• Dense vectors
• Each dimension is a latent feature
• Common software package: word2vec
• "Magic": king − man + woman ≈ queen (analogies)
Representing words as vectors is not new!
Explicit Representations (Distributional)
• Sparse vectors
• Each dimension is an explicit context
• Common association metric: PMI, PPMI
• Does the same "magic" work for explicit representations too?
• Baroni et al. (2014) showed that embeddings outperform explicit representations, but…
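A minimal sketch of how such an explicit representation is built: a sparse word-context matrix weighted by positive pointwise mutual information (PPMI). The toy corpus and window size are invented for illustration only.

```python
# Sketch: PPMI association between a word and a context word,
# as used for sparse "explicit" distributional representations.
from collections import Counter
import math

corpus = "the king rules the land the queen rules the land".split()
window = 2  # context words within +/-2 positions

word_counts = Counter(corpus)
pair_counts = Counter()
for i, w in enumerate(corpus):
    for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
        if i != j:
            pair_counts[(w, corpus[j])] += 1

total_pairs = sum(pair_counts.values())
total_words = sum(word_counts.values())

def ppmi(word, context):
    """PPMI(w, c) = max(0, log P(w, c) / (P(w) P(c)))."""
    joint = pair_counts[(word, context)] / total_pairs
    if joint == 0:
        return 0.0
    pw = word_counts[word] / total_words
    pc = word_counts[context] / total_words
    return max(0.0, math.log(joint / (pw * pc)))
```

Each word's vector is then the row of PPMI values over all contexts: sparse, and with every dimension directly interpretable as a context word.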
Questions
• Are analogies unique to neural embeddings?
  Compare neural embeddings with explicit representations
• Why does vector arithmetic reveal analogies?
  Unravel the mystery behind neural embeddings and their "magic"
Background
Mikolov et al. (2013a,b,c)
• Neural embeddings have interesting geometries
• These patterns capture "relational similarities"
• Can be used to solve analogies: a is to a* as b is to b*
  (man is to woman as king is to queen)
• Can be recovered by "simple" vector arithmetic:
  a − a* = b − b*
  equivalently: b − a + a* = b*
• Examples (b − a + a* ≈ b*):
  king − man + woman ≈ queen
  Tokyo − Japan + France ≈ Paris
  best − good + strong ≈ strongest
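The recovery procedure above can be sketched in a few lines: compute b − a + a*, then return the vocabulary word whose vector is closest by cosine similarity, excluding the three question words. The tiny 2-D "embeddings" here are invented for illustration only.

```python
# Sketch: analogy recovery "a is to a* as b is to ?" via
# vector arithmetic plus a cosine nearest-neighbor search.
import numpy as np

vocab = {
    "man":   np.array([1.0, 0.0]),
    "woman": np.array([1.0, 1.0]),
    "king":  np.array([3.0, 0.1]),
    "queen": np.array([3.0, 1.1]),
}

def analogy(a, a_star, b):
    target = vocab[b] - vocab[a] + vocab[a_star]
    target = target / np.linalg.norm(target)
    best_word, best_sim = None, -np.inf
    for word, vec in vocab.items():
        if word in (a, a_star, b):  # exclude the question words
            continue
        sim = (vec @ target) / np.linalg.norm(vec)  # cosine similarity
        if sim > best_sim:
            best_word, best_sim = word, sim
    return best_word

print(analogy("man", "woman", "king"))  # prints "queen"
```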
Are analogies unique to neural embeddings?
• Experiment: compare embeddings to explicit representations
• Learn different representations from the same corpus
• Evaluate with the same recovery method (the vector arithmetic above)
Analogy Datasets
• 4 words per analogy: a is to a* as b is to b*
• Given 3 words: a is to a* as b is to ?
• Guess the best-suiting b* from the entire vocabulary
  • Excluding the question words
• MSR: 8,000 syntactic analogies
• Google: 19,000 syntactic and semantic analogies
Embedding vs. Explicit (Round 1)

  Accuracy    MSR    Google
  Embedding   54%    63%
  Explicit    29%    45%

Many analogies are recovered by explicit representations, but many more by embeddings.
Why does vector arithmetic reveal analogies?
• We wish to find the b* closest to b − a + a*
• This is done with cosine similarity:
  argmax over b* of cos(b*, b − a + a*)
• For normalized vectors, this is equivalent to maximizing:
  cos(b*, b) − cos(b*, a) + cos(b*, a*)
• Vector arithmetic = similarity arithmetic
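The equivalence between vector arithmetic and similarity arithmetic can be written out explicitly (a sketch, assuming all word vectors are unit-normalized; the denominator below is constant over candidate b*, so it does not affect the argmax):

```latex
\operatorname*{argmax}_{b^*}\ \cos(b^*,\, b - a + a^*)
  \;=\; \operatorname*{argmax}_{b^*}\ \frac{b^* \cdot (b - a + a^*)}{\lVert b - a + a^* \rVert}
  \;=\; \operatorname*{argmax}_{b^*}\ b^* \cdot b \;-\; b^* \cdot a \;+\; b^* \cdot a^*
  \;=\; \operatorname*{argmax}_{b^*}\ \cos(b^*, b) - \cos(b^*, a) + \cos(b^*, a^*)
```

The last step uses the fact that the dot product of two unit vectors is their cosine similarity.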
What does each similarity term mean?
• Observe the joint features with explicit representations!
• For queen ≈ king − man + woman, shared contexts include:
  with king ("royal?"): uncrowned, majesty, second, …
  with woman ("female?"): Elizabeth, Katherine, impregnate, …
Can we do better?

Let's look at some mistakes…
• London is to England as Baghdad is to ?
  Expected: Iraq
  Predicted: Mosul
The Additive Objective

  cos(b*, b) − cos(b*, a) + cos(b*, a*)

• Problem: one similarity might dominate the rest
• Much more prevalent in explicit representations
• Might explain why explicit representations underperformed
How can we do better?
• Instead of adding similarities, multiply them!
  argmax over b* of cos(b*, b) · cos(b*, a*) / (cos(b*, a) + ε)
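A sketch of the multiplicative objective as a scoring function. One detail to handle when multiplying: cosine similarity can be negative, so similarities are shifted from [−1, 1] to [0, 1] before combining, and a small epsilon guards against division by zero. The toy vectors are invented for illustration.

```python
# Sketch: multiplicative analogy objective vs. the additive one.
import numpy as np

def cos(u, v):
    return (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

def cos_mul_score(b_star, a, a_star, b, eps=1e-3):
    # Shift cosine from [-1, 1] to [0, 1] so products stay meaningful.
    s = lambda u, v: (cos(u, v) + 1.0) / 2.0
    # High similarity to b and a* is rewarded multiplicatively;
    # high similarity to a is penalized by division.
    return s(b_star, b) * s(b_star, a_star) / (s(b_star, a) + eps)
```

Because the terms are multiplied rather than summed, no single large similarity can dominate the score: a candidate must do reasonably well on every term at once.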
Embedding vs. Explicit (Round 2)

Multiplication > Addition

  Accuracy          MSR    Google
  Embedding (Add)   54%    63%
  Embedding (Mul)   59%    67%
  Explicit (Add)    29%    45%
  Explicit (Mul)    57%    68%
Explicit is on-par with Embedding
  Accuracy    MSR    Google
  Embedding   59%    67%
  Explicit    57%    68%
• Embeddings are not "magical"
• Embedding-based similarities have a more uniform distribution
• The additive objective performs better on smoother distributions
• The multiplicative objective overcomes this issue
Conclusion
• Are analogies unique to neural embeddings?
  No! They occur in sparse and explicit representations as well.
• Why does vector arithmetic reveal analogies?
  Because vector arithmetic is equivalent to similarity arithmetic.
• Can we do better?
  Yes! The multiplicative objective is significantly better.
More Results and Analyses (in the paper)
• Evaluation on closed-vocabulary analogy questions (SemEval 2012)
• Experiments with a third objective function (PairDirection)
• Do different representations reveal the same analogies?
• Error analysis
• A feature-level interpretation of how word similarity reveals analogies
Thanks for listening :)