Linguistic Regularities in Sparse and Explicit Word Representations


Linguistic Regularities in Sparse and Explicit Word Representations
Omer Levy and Yoav Goldberg
Bar-Ilan University, Israel

Papers in ACL 2014*
• Neural Networks & Word Embeddings
• Other Topics

* Sampling error: +/- 100%

Neural Embeddings
• Dense vectors
• Each dimension is a latent feature
• Common software package: word2vec
• "Magic" (analogies): king − man + woman ≈ queen (see the sketch below)
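A minimal sketch of that "magic" analogy query, assuming the gensim library and a local copy of pretrained word2vec vectors (the file name below is an assumption, not something from the slides):

# Hypothetical demo: king - man + woman ~ queen with pretrained word2vec vectors.
from gensim.models import KeyedVectors

vectors = KeyedVectors.load_word2vec_format("GoogleNews-vectors-negative300.bin",
                                            binary=True)
# most_similar() adds the "positive" vectors, subtracts the "negative" ones,
# and returns the nearest remaining vocabulary word by cosine similarity.
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1))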

Representing words as vectors is not new!

Explicit Representations (Distributional)
• Sparse vectors
• Each dimension is an explicit context
• Common association metrics: PMI, PPMI (a toy construction is sketched below)
• Does the same "magic" work for explicit representations too?
• Baroni et al. (2014) showed that embeddings outperform explicit representations, but…
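To make "each dimension is an explicit context" concrete, here is a minimal, self-contained sketch of building sparse PPMI vectors; the two-sentence toy corpus and the window size are illustrative assumptions, not the paper's setup:

# Toy sketch: sparse word vectors whose dimensions are explicit contexts,
# weighted by positive pointwise mutual information (PPMI).
import math
from collections import Counter

corpus = [["the", "king", "rules", "the", "kingdom"],
          ["the", "queen", "rules", "the", "kingdom"]]
window = 2

pair_counts = Counter()     # (word, context) co-occurrence counts
word_counts = Counter()
context_counts = Counter()

for sent in corpus:
    for i, w in enumerate(sent):
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if i != j:
                pair_counts[(w, sent[j])] += 1
                word_counts[w] += 1
                context_counts[sent[j]] += 1

total = sum(pair_counts.values())

def ppmi_vector(word):
    """Sparse vector for `word`: {context: PPMI weight}."""
    vec = {}
    for (w, c), n in pair_counts.items():
        if w == word:
            pmi = math.log(n * total / (word_counts[w] * context_counts[c]))
            if pmi > 0:
                vec[c] = pmi
    return vec

print(ppmi_vector("king"))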

Questions
• Are analogies unique to neural embeddings?
  Compare neural embeddings with explicit representations.
• Why does vector arithmetic reveal analogies?
  Unravel the mystery behind neural embeddings and their "magic".

Background

Mikolov et al. (2013a,b,c)
• Neural embeddings have interesting geometries
• These patterns capture "relational similarities"
• Can be used to solve analogies: a is to a* as b is to b*
  (man is to woman as king is to queen)
• Can be recovered by "simple" vector arithmetic:
  a − a* = b − b*
  or, equivalently,
  b − a + a* = b*
• Examples of b − a + a* = b*:
  king − man + woman = queen
  Tokyo − Japan + France = Paris
  best − good + strong = strongest
• All of these are operations on word vectors in the embedding space (sketched below)
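A small sketch of that recovery step, assuming a matrix of length-normalized word vectors; the tiny vocabulary and the random stand-in vectors are placeholders for real trained embeddings:

# Recover b* as the word whose vector is closest (by cosine) to b - a + a*.
import numpy as np

vocab = ["king", "man", "woman", "queen"]        # illustrative vocabulary
index = {w: i for i, w in enumerate(vocab)}
vectors = np.random.randn(len(vocab), 50)        # stand-in for trained embeddings
vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)

def analogy(b, a, a_star):
    """b - a + a* -> nearest word, excluding the three question words."""
    target = vectors[index[b]] - vectors[index[a]] + vectors[index[a_star]]
    target /= np.linalg.norm(target)
    scores = vectors @ target                    # cosine, since rows are unit-norm
    for w in (b, a, a_star):
        scores[index[w]] = -np.inf
    return vocab[int(np.argmax(scores))]

print(analogy("king", "man", "woman"))           # ~ "queen" with real embeddings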

Are analogies unique to neural embeddings?
• Experiment: compare embeddings to explicit representations
• Learn different representations from the same corpus
• Evaluate with the same recovery method

Analogy Datasets
• 4 words per analogy: a is to a* as b is to b*
• Given 3 words: a is to a* as b is to ?
• Guess the best-suiting b* from the entire vocabulary, excluding the question words
  (protocol sketched below)
• MSR: 8,000 syntactic analogies
• Google: 19,000 syntactic and semantic analogies
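A sketch of the accuracy computation under this protocol, reusing the hypothetical analogy() function from the previous sketch and assuming the questions are loaded as (a, a*, b, b*) tuples from the MSR or Google files:

# Fraction of analogy questions where the predicted b* matches the gold answer;
# the question words themselves are excluded inside analogy().
def accuracy(questions):
    correct = 0
    for a, a_star, b, b_star in questions:
        if analogy(b, a, a_star) == b_star:
            correct += 1
    return correct / len(questions)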

Embedding vs Explicit (Round 1)

Accuracy:        MSR     Google
Embedding        54%     63%
Explicit         29%     45%

Many analogies recovered by explicit, but many more by embedding.

Why does vector arithmetic reveal analogies?
• We wish to find the b* closest to b − a + a*
• This is done with cosine similarity:
  arg max over b* of cos(b*, b − a + a*)
• For length-normalized vectors this expands into a sum of similarities:
  arg max over b* of cos(b*, b) − cos(b*, a) + cos(b*, a*)
• vector arithmetic = similarity arithmetic (checked in the sketch below)

What does each similarity term mean?
• queen should be similar to king (royal?) and similar to woman (female?)
• Observe the joint features with explicit representations!

  royal?        female?
  uncrowned     Elizabeth
  majesty       Katherine
  second        impregnate
  …             …
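One way to peek at those joint features, reusing the toy ppmi_vector() sketch from earlier; ranking shared contexts by the product of their PPMI weights is an illustrative choice rather than the paper's exact procedure, and a real corpus is needed for meaningful output:

# Which explicit contexts does `word` share with `other`, and how strongly?
def shared_contexts(word, other, top=5):
    v, u = ppmi_vector(word), ppmi_vector(other)
    common = {c: v[c] * u[c] for c in v.keys() & u.keys()}
    return sorted(common, key=common.get, reverse=True)[:top]

print(shared_contexts("queen", "king"))     # "royal" contexts?
print(shared_contexts("queen", "woman"))    # "female" contexts?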

Can we do better?

Let's look at some mistakes…
• England − London + Baghdad = ?
• Expected answer: Iraq
• Actual answer: Mosul (?)

The Additive Objective
• arg max over b* of cos(b*, b) − cos(b*, a) + cos(b*, a*)
• Problem: one similarity might dominate the rest
  (here, similarity to Baghdad dominates and pulls the answer towards Mosul)
• Much more prevalent in explicit representations
• Might explain why explicit underperformed

How can we do better?
• Instead of adding similarities, multiply them!
• arg max over b* of cos(b*, b) · cos(b*, a*) / (cos(b*, a) + ε)
  (both objectives are sketched in code below)
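A sketch of both objectives over the normalized vectors / index / vocab arrays from the earlier analogy sketch; shifting cosines into [0, 1] (so the product stays non-negative) and the small epsilon are assumptions made here to keep the sketch well-behaved:

# Additive vs. multiplicative recovery objectives, scored over the whole vocabulary.
import numpy as np

EPS = 1e-3

def score_add(b, a, a_star):
    # cos(x, b) - cos(x, a) + cos(x, a*) for every candidate x
    return (vectors @ vectors[index[b]]
            - vectors @ vectors[index[a]]
            + vectors @ vectors[index[a_star]])

def score_mul(b, a, a_star):
    # cos(x, b) * cos(x, a*) / (cos(x, a) + eps), with cosines mapped into [0, 1]
    sim = lambda w: (vectors @ vectors[index[w]] + 1.0) / 2.0
    return sim(b) * sim(a_star) / (sim(a) + EPS)

def best(scores, exclude):
    scores = scores.copy()
    for w in exclude:
        scores[index[w]] = -np.inf
    return vocab[int(np.argmax(scores))]

q = ("king", "man", "woman")
print(best(score_add(*q), q))    # additive objective
print(best(score_mul(*q), q))    # multiplicative objective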

Embedding vs Explicit (Round 2)

Multiplication > Addition

Accuracy:            MSR     Google
Embedding (Add)      54%     63%
Embedding (Mul)      59%     67%
Explicit (Add)       29%     45%
Explicit (Mul)       57%     68%

Explicit is on-par with Embedding

Accuracy (multiplicative objective):
                 MSR     Google
Embedding        59%     67%
Explicit         57%     68%

• Embeddings are not "magical"
• Embedding-based similarities have a more uniform distribution
• The additive objective performs better on smoother distributions
• The multiplicative objective overcomes this issue

Conclusion
• Are analogies unique to neural embeddings?
  No! They occur in sparse and explicit representations as well.
• Why does vector arithmetic reveal analogies?
  Because vector arithmetic is equivalent to similarity arithmetic.
• Can we do better?
  Yes! The multiplicative objective is significantly better.

More Results and Analyses (in the paper)
• Evaluation on closed-vocabulary analogy questions (SemEval 2012)

• Experiments with a third objective function (PairDirection)

• Do different representations reveal the same analogies?

• Error analysis

• A feature-level interpretation of how word similarity reveals analogies

Thanks for listening :)
