
Page 1: Topic models

Topic models

Source: “Topic models”, David Blei, MLSS ‘09

Page 2: Topic models

Topic modeling - Motivation

Page 3: Topic models

Discover topics from a corpus

Page 4: Topic models

Model connections between topics

Page 5: Topic models

Model the evolution of topics over time

Page 6: Topic models

Image annotation

Page 7: Topic models

Extensions*

• Malleable: can be quickly extended for data with tags (side information), class labels, etc.

• The (approximate) inference methods can be readily carried over to these extensions in many cases

• Most datasets can be converted to ‘bag-of-words’ format using a codebook representation, after which LDA-style models can be readily applied (they can also work with continuous observations directly); see the sketch below

*YMMV
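As an illustration of the codebook idea above, here is a minimal sketch (not from the talk; the function name, the codebook size, and the use of scikit-learn's KMeans are all choices made for this example) that quantizes continuous feature vectors into discrete "visual words":

```python
import numpy as np
from sklearn.cluster import KMeans

def continuous_to_bow(feature_sets, codebook_size=256, seed=0):
    """Quantize continuous feature vectors (e.g. image descriptors) into
    discrete 'visual words' so that LDA-style models can be applied.

    feature_sets: list of (n_i, d) arrays, one per document/image.
    Returns a (num_docs, codebook_size) array of word counts.
    """
    # Learn the codebook by clustering all feature vectors pooled together.
    km = KMeans(n_clusters=codebook_size, random_state=seed, n_init=10)
    km.fit(np.vstack(feature_sets))
    # Each document becomes a histogram over codebook entries: a bag of words.
    bow = np.zeros((len(feature_sets), codebook_size), dtype=int)
    for i, feats in enumerate(feature_sets):
        bow[i] = np.bincount(km.predict(feats), minlength=codebook_size)
    return bow
```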

Page 8: Topic models

Connection to ML research

Page 9: Topic models

Latent Dirichlet Allocation

Page 10: Topic models

LDA

Page 11: Topic models

Probabilistic modeling

Page 12: Topic models

Intuition behind LDA

Page 13: Topic models

Generative model
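For reference, the standard LDA generative process (Blei et al., 2003):

1. For each document d, draw topic proportions \theta_d ~ Dirichlet(\alpha)
2. For each word position n in document d:
   – draw a topic assignment z_{dn} ~ Multinomial(\theta_d)
   – draw the word w_{dn} ~ Multinomial(\beta_{z_{dn}})

In the smoothed variant, each topic \beta_k is itself drawn from a Dirichlet(\eta) prior.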

Page 14: Topic models

The posterior distribution

Page 15: Topic models

Graphical models (Aside)

Page 16: Topic models

LDA model

Page 17: Topic models

Dirichlet distribution
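For reference, the density of a K-dimensional Dirichlet with parameter \alpha, over \theta on the probability simplex:

p(\theta \mid \alpha) = \frac{\Gamma(\sum_k \alpha_k)}{\prod_k \Gamma(\alpha_k)} \prod_k \theta_k^{\alpha_k - 1}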

Page 18: Topic models

Dirichlet Examples

(Density plots over the simplex; darker implies lower magnitude.)

\alpha < 1 leads to sparser topics
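A quick way to see this effect numerically (a minimal sketch, not from the talk; K = 10 and the \alpha values are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
# Draws from a symmetric Dirichlet over K = 10 components:
# alpha < 1 puts most of the mass on a few components (sparse),
# alpha > 1 spreads the mass out more evenly.
for alpha in (0.1, 1.0, 10.0):
    theta = rng.dirichlet(np.full(10, alpha))
    print(f"alpha={alpha:5.1f}  max weight={theta.max():.2f}  "
          f"components above 0.05: {(theta > 0.05).sum()}")
```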

Page 19: Topic models

LDA

Page 20: Topic models

Inference in LDA

Page 21: Topic models

Example inference

Page 22: Topic models

Example inference

Page 23: Topic models

Topics vs words

Page 24: Topic models

Explore and browse document collections

Page 25: Topic models

Why does LDA “work”?

Page 26: Topic models

LDA is modular, general, useful

Page 27: Topic models

LDA is modular, general, useful

Page 28: Topic models

LDA is modular, general, useful

Page 29: Topic models

Approximate inference

• An excellent reference is “On smoothing and inference for topic models”, Asuncion et al. (2009).

Page 30: Topic models

Posterior distribution for LDA

The only parameters we need to estimate are \alpha, \beta
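Concretely, the posterior over the latent variables is (a standard identity, stated here for reference):

p(\theta, z \mid w, \alpha, \beta) = \frac{p(\theta, z, w \mid \alpha, \beta)}{p(w \mid \alpha, \beta)}

The denominator p(w \mid \alpha, \beta) requires summing over all topic assignments z and integrating over \theta, which is intractable; hence the approximate inference methods that follow.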

Page 31: Topic models

Posterior distribution

Page 32: Topic models

Posterior distribution for LDA

• Can integrate out either \theta or z, but not both

• Marginalizing \theta gives z ~ Polya(\alpha)

• The Polya distribution is also known as the Dirichlet compound multinomial (it models “burstiness”)

• Most algorithms marginalize out \theta
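For reference, the marginal obtained by integrating out \theta (with n_k the number of words in the document assigned to topic k, and N their total):

p(z \mid \alpha) = \int p(z \mid \theta)\, p(\theta \mid \alpha)\, d\theta = \frac{\Gamma(\sum_k \alpha_k)}{\Gamma(N + \sum_k \alpha_k)} \prod_k \frac{\Gamma(n_k + \alpha_k)}{\Gamma(\alpha_k)}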

Page 33: Topic models

MAP inference

• Integrate out z

• Treat \theta as a random variable

• Can use the EM algorithm

• Updates very similar to those of PLSA (except for additional regularization terms)
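A sketch of those updates, in the notation of Asuncion et al. (2009) (\gamma_{dwk} is the E-step responsibility of topic k for word w in document d; N_{dk} and N_{wk} are \gamma-weighted counts; \eta is the topic smoothing hyperparameter; the exact form on the slides may differ):

E-step: \gamma_{dwk} \propto \theta_{dk} \beta_{kw}
M-step: \theta_{dk} \propto N_{dk} + \alpha - 1, \qquad \beta_{kw} \propto N_{wk} + \eta - 1

where N_{dk} = \sum_w n_{dw} \gamma_{dwk} and N_{wk} = \sum_d n_{dw} \gamma_{dwk}. The “\alpha - 1” and “\eta - 1” offsets from the Dirichlet priors are exactly the extra regularization terms relative to PLSA.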

Page 34: Topic models

Collapsed Gibbs sampling
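Collapsed Gibbs sampling resamples each token's topic from its conditional given all other assignments (the Griffiths & Steyvers update). A minimal sketch, not the talk's code; variable names and hyperparameter defaults are illustrative:

```python
import numpy as np

def collapsed_gibbs_lda(docs, K, V, alpha=0.1, eta=0.01, n_iters=200, seed=0):
    """Collapsed Gibbs sampler for LDA.

    docs: list of documents, each a list of word ids in [0, V).
    Returns topic-word counts N_wk and document-topic counts N_dk.
    """
    rng = np.random.default_rng(seed)
    D = len(docs)
    N_dk = np.zeros((D, K))   # words in doc d assigned to topic k
    N_wk = np.zeros((V, K))   # tokens of word type w assigned to topic k
    N_k = np.zeros(K)         # total tokens assigned to topic k
    z = []                    # current topic assignment of every token
    for d, doc in enumerate(docs):
        zd = rng.integers(K, size=len(doc))   # random initialization
        z.append(zd)
        for w, k in zip(doc, zd):
            N_dk[d, k] += 1; N_wk[w, k] += 1; N_k[k] += 1
    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]   # remove the current assignment from the counts
                N_dk[d, k] -= 1; N_wk[w, k] -= 1; N_k[k] -= 1
                # p(z = k | rest) ∝ (N_wk + eta) / (N_k + V*eta) * (N_dk + alpha)
                p = (N_wk[w] + eta) / (N_k + V * eta) * (N_dk[d] + alpha)
                k = rng.choice(K, p=p / p.sum())
                z[d][i] = k   # record the new assignment and restore counts
                N_dk[d, k] += 1; N_wk[w, k] += 1; N_k[k] += 1
    return N_wk, N_dk
```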

Page 35: Topic models

Variational inference

Can think of this as an extension of EM where we compute expectations w.r.t. a “variational distribution” instead of the true posterior
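The quantity being optimized is the evidence lower bound (ELBO), stated here for reference:

\log p(w \mid \alpha, \beta) \ge E_q[\log p(\theta, z, w \mid \alpha, \beta)] - E_q[\log q(\theta, z)]

Maximizing the right-hand side over q tightens the bound; the gap is exactly KL(q \,\|\, p(\theta, z \mid w, \alpha, \beta)).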

Page 36: Topic models

Mean field variational inference
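In mean field variational inference for LDA, q is fully factorized (standard form, per Blei et al. (2003)):

q(\theta, z \mid \gamma, \phi) = q(\theta \mid \gamma) \prod_n q(z_n \mid \phi_n)

with \gamma a Dirichlet parameter and each \phi_n a multinomial over topics.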

Page 37: Topic models

MFVI and conditional exponential families

Page 38: Topic models

MFVI and conditional exponential families

Page 39: Topic models

Variational inference

Page 40: Topic models

Variational inference for LDA
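For a single document, the coordinate ascent updates of Blei et al. (2003) are, for reference (\psi is the digamma function, w_n the n-th word):

\phi_{nk} \propto \beta_{k w_n} \exp(\psi(\gamma_k))
\gamma_k = \alpha_k + \sum_n \phi_{nk}

iterated until convergence.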

Page 41: Topic models

Variational inference for LDA

Page 42: Topic models

Variational inference for LDA

Page 43: Topic models

Collapsed variational inference

• MFVI: \theta, z assumed to be independent

• \theta can be marginalized out exactly

• A variational inference algorithm operating on the same “collapsed space” as CGS

• Strictly better lower bound than VB

• Can think of it as a “soft” CGS where we propagate uncertainty by using probabilities rather than samples

Page 44: Topic models

Estimating the topics

Page 45: Topic models

Inference comparison

Page 46: Topic models

Comparison of updates

“On smoothing and inference for topic models” Asuncion et al. (2009).

(The slide compares the per-token update equations of MAP, VB, CVB0, and CGS side by side; see the reconstruction below.)
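Up to notational differences, those updates as given in Asuncion et al. (2009) are (\gamma_{dwk} is the responsibility of topic k for word w in document d; the counts N are \gamma-weighted expected counts for MAP/VB/CVB0 and integer counts for CGS; \neg dw denotes excluding the current token; W is the vocabulary size):

MAP:  \gamma_{dwk} \propto (N_{wk} + \eta - 1)(N_{dk} + \alpha - 1) / (N_k + W\eta - W)
VB:   \gamma_{dwk} \propto \exp(\psi(N_{wk} + \eta)) \exp(\psi(N_{dk} + \alpha)) / \exp(\psi(N_k + W\eta))
CVB0: \gamma_{dwk} \propto (N^{\neg dw}_{wk} + \eta)(N^{\neg dw}_{dk} + \alpha) / (N^{\neg dw}_k + W\eta)
CGS:  sample z_{dw} = k with probability \propto (N^{\neg dw}_{wk} + \eta)(N^{\neg dw}_{dk} + \alpha) / (N^{\neg dw}_k + W\eta)

The point of the comparison is that the four algorithms differ mainly in these small offsets (subtracting 1, applying a digamma, or neither).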

Page 47: Topic models

Choice of inference algorithm

• Depends on vocabulary size (V) and the number of words per document (say N_i)

• Collapsed algorithms – not parallelizable

• CGS – needs to draw a separate topic sample for each occurrence of the same word (slow when N_i >> V)

• MAP – fast, but performs poorly when N_i << V

• CVB0 – good tradeoff between computational complexity and perplexity

Page 48: Topic models

Supervised and relational topic models

Page 49: Topic models

Supervised LDA

Page 50: Topic models

Supervised LDA

Page 51: Topic models

Supervised LDA

Page 52: Topic models

Supervised LDA
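For reference, in sLDA (Blei & McAuliffe) the response for document d is generated downstream of the topic assignments; in the basic regression case

y_d \sim \mathcal{N}(\eta^\top \bar{z}_d, \sigma^2), \qquad \bar{z}_d = \frac{1}{N_d} \sum_n z_{dn}

where \bar{z}_d is the empirical topic frequency vector of the document and \eta the regression coefficients.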

Page 53: Topic models

Variational inference in sLDA

Page 54: Topic models

ML estimation

Page 55: Topic models

Prediction

Page 56: Topic models

Example: Movie reviews

Page 57: Topic models

Diverse response types with GLMs

Page 58: Topic models

Example: Multi class classification

Page 59: Topic models

Supervised topic models

Page 60: Topic models

Upstream vs downstream models

Upstream: conditional models.
Downstream: the predictor variable is generated from the actually observed z’s rather than from \theta, which is E(z’s).

Page 61: Topic models

Relational topic models

Page 62: Topic models

Relational topic models

Page 63: Topic models

Relational topic models

Page 64: Topic models

Predictive performance of one type given the other

Page 65: Topic models

Predicting links from documents

Page 66: Topic models

Predicting links from documents

Page 67: Topic models

Things we didn’t address

• Model selection: nonparametric Bayesian approaches

• Hyperparameter tuning

• Evaluation can be a bit tricky for LDA (comparing approximate bounds), but the supervised versions can use traditional metrics

Page 68: Topic models

Thank you!