58
Learning Topic Models – Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and Rong Ge October 21, 2012 Ankur Moitra (IAS) LDA October 21, 2012

Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

  • Upload
    others

  • View
    7

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Learning Topic Models– Going Beyond SVD

Ankur Moitra, IAS

joint with Sanjeev Arora and Rong Ge

October 21, 2012

Ankur Moitra (IAS) LDA October 21, 2012

Page 2: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Topic Models

Large collection of articles, say from the New York Times:

newspaper articles

Question

How can we automatically organize them by topic? (unsupervisedlearning)

Challenge: Develop tools for automatic comprehension of data - e.g.newspaper articles, webpages, images, genetic sequences, userratings...

Page 3: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Topic Models

Large collection of articles, say from the New York Times:

newspaper articles

Question

How can we automatically organize them by topic? (unsupervisedlearning)

Challenge: Develop tools for automatic comprehension of data - e.g.newspaper articles, webpages, images, genetic sequences, userratings...

Page 4: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Topic Models

Large collection of articles, say from the New York Times:

newspaper articles

Question

How can we automatically organize them by topic? (unsupervisedlearning)

Challenge: Develop tools for automatic comprehension of data - e.g.newspaper articles, webpages, images, genetic sequences, userratings...

Page 5: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

topics

per

sonal

fin

ance

bas

ebal

l

... movie

rev

iew

s

wo

rds

A

Page 6: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

W

topics

document #1: (1.0, personal finance)per

sonal

fin

ance

bas

ebal

l

... movie

rev

iew

s

wo

rds to

pic

s

documents

A

Page 7: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

topics

document #1: (1.0, personal finance)per

sonal

fin

ance

bas

ebal

l

... movie

rev

iew

s

wo

rds

=

topic

s

documents

A W

Page 8: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

W

topics

document #2: (0.5, baseball); (0.5, movie reviews)

per

sonal

fin

ance

bas

ebal

l

... movie

rev

iew

s

wo

rds

=

topic

s

documents

A

Page 9: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

topics

document #2: (0.5, baseball); (0.5, movie reviews)

per

sonal

fin

ance

bas

ebal

l

... movie

rev

iew

s

wo

rds

=

topic

s

documents

A W

Page 10: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

topics

per

sonal

fin

ance

bas

ebal

l

... movie

rev

iew

s

wo

rds

=

topic

s

documents

A W = M

Page 11: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

topics

per

sonal

fin

ance

bas

ebal

l

... movie

rev

iew

s

wo

rds

=

topic

s

documents

A W M

Page 12: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

So Many Models!

Pure Topics: one topic per document

[

         

                 Stochastic    

       Fixed          

M

=

A W M

=

A W

Page 13: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

So Many Models!

LDA: [Blei et al] Dirichlet distribution

         

                 Stochastic    

       Fixed          

M

=

A W M

=

A W

Page 14: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

So Many Models!

CTM / Pachinko: structured correlations

         

                 Stochastic    

       Fixed          

M

=

A W M

=

A W

Page 15: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Algorithms

Maximum Likelihood: Find the parameters that maximize thelikelihood of generating the observed data.

Hard to compute!

Spectral: Compute the singular value decomposition of M̂ .[Papadimitriou et al], [Azar et al], ...

But the singular vectors are orthonormal!

Question

Can we use tools from nonnegative matrix factorization instead ofspectral methods?

[AGKM]: fixed parameter intractable but there are easy cases

Page 16: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Algorithms

Maximum Likelihood: Find the parameters that maximize thelikelihood of generating the observed data.

Hard to compute!

Spectral: Compute the singular value decomposition of M̂ .[Papadimitriou et al], [Azar et al], ...

But the singular vectors are orthonormal!

Question

Can we use tools from nonnegative matrix factorization instead ofspectral methods?

[AGKM]: fixed parameter intractable but there are easy cases

Page 17: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Algorithms

Maximum Likelihood: Find the parameters that maximize thelikelihood of generating the observed data.

Hard to compute!

Spectral: Compute the singular value decomposition of M̂ .[Papadimitriou et al], [Azar et al], ...

But the singular vectors are orthonormal!

Question

Can we use tools from nonnegative matrix factorization instead ofspectral methods?

[AGKM]: fixed parameter intractable but there are easy cases

Page 18: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Algorithms

Maximum Likelihood: Find the parameters that maximize thelikelihood of generating the observed data.

Hard to compute!

Spectral: Compute the singular value decomposition of M̂ .[Papadimitriou et al], [Azar et al], ...

But the singular vectors are orthonormal!

Question

Can we use tools from nonnegative matrix factorization instead ofspectral methods?

[AGKM]: fixed parameter intractable but there are easy cases

Page 19: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Algorithms

Maximum Likelihood: Find the parameters that maximize thelikelihood of generating the observed data.

Hard to compute!

Spectral: Compute the singular value decomposition of M̂ .[Papadimitriou et al], [Azar et al], ...

But the singular vectors are orthonormal!

Question

Can we use tools from nonnegative matrix factorization instead ofspectral methods?

[AGKM]: fixed parameter intractable but there are easy cases

Page 20: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Our Results

Let E [WW T ] = R be the topic-topic covariance matrix, let κ be its

condition number and let a = maxi ,jE [Wi ]E [Wj ]

be the topic imbalance.

If the topic matrix A satisfies the “anchor word assumption” forp > 0:

Theorem

We can learn the topic matrix A and covariance matrix R to withinaccuracy ε in time and number of docs poly(log n, r , 1/ε, 1/p, κ, a)with n words and r topics

Suffices to have documents of size two!

Page 21: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Our Results

Let E [WW T ] = R be the topic-topic covariance matrix, let κ be its

condition number and let a = maxi ,jE [Wi ]E [Wj ]

be the topic imbalance.

If the topic matrix A satisfies the “anchor word assumption” forp > 0:

Theorem

We can learn the topic matrix A and covariance matrix R to withinaccuracy ε in time and number of docs poly(log n, r , 1/ε, 1/p, κ, a)with n words and r topics

Suffices to have documents of size two!

Page 22: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Our Results

Let E [WW T ] = R be the topic-topic covariance matrix, let κ be its

condition number and let a = maxi ,jE [Wi ]E [Wj ]

be the topic imbalance.

If the topic matrix A satisfies the “anchor word assumption” forp > 0:

Theorem

We can learn the topic matrix A and covariance matrix R to withinaccuracy ε in time and number of docs poly(log n, r , 1/ε, 1/p, κ, a)with n words and r topics

Suffices to have documents of size two!

Page 23: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Our Results

Let E [WW T ] = R be the topic-topic covariance matrix, let κ be its

condition number and let a = maxi ,jE [Wi ]E [Wj ]

be the topic imbalance.

If the topic matrix A satisfies the “anchor word assumption” forp > 0:

Theorem

We can learn the topic matrix A and covariance matrix R to withinaccuracy ε in time and number of docs poly(log n, r , 1/ε, 1/p, κ, a)with n words and r topics

Suffices to have documents of size two!

Page 24: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

If an anchor word (for a topic) occurs, the document is at leastpartially about the given topic:

                               

       

bunt        

401k    oscar-­‐winning        

         

topics

per

son

al f

inan

ce

bas

ebal

l

... mov

ie r

evie

ws

word

s topic

s

documents

A W

Each topic has an anchor word that occurs with probability ≥ p

Page 25: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

If an anchor word (for a topic) occurs, the document is at leastpartially about the given topic:

                               

       

bunt        

401k    oscar-­‐winning        

         

topics

per

son

al f

inan

ce

bas

ebal

l

... mov

ie r

evie

ws

word

s topic

s

documents

A W

Each topic has an anchor word that occurs with probability ≥ p

Page 26: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Anchor Words as Extreme Points [AGKM]

=

A W M

Can we efficiently determine

if a word is an anchor word?

Page 27: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Anchor Words as Extreme Points [AGKM]

=

A W M

Can we efficiently determine

if a word is an anchor word?

Page 28: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Anchor Words as Extreme Points [AGKM]

=

A W M

Can we efficiently determine

if a word is an anchor word?

Page 29: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Anchor Words as Extreme Points [AGKM]

=

A W M

Can we efficiently determine

if a word is an anchor word?

Page 30: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Anchor Words as Extreme Points [AGKM]

=

A W M

Can we efficiently determine

if a word is an anchor word?

Page 31: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Anchor Words as Extreme Points [AGKM]

=

A W M

Can we efficiently determine

if a word is an anchor word?

Page 32: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Anchor Words as Extreme Points [AGKM]

=

A W M

Can we efficiently determine

if a word is an anchor word?

Page 33: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Problem: Sampling “Noise”

A W M

Can we efficiently determine

if a word is an anchor word?

M

Page 34: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Problem: Sampling “Noise”

MA W M

Can we efficiently determine

if a word is an anchor word?

Page 35: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Problem: Sampling “Noise”

A W M

Can we efficiently determine

if a word is an anchor word?

M

Page 36: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Our Algorithm

M̂ is far from M , but let’s use M̂M̂T instead!

M̂M̂T → MMT and WW T → R as number of documents increase

Step

We can recover the anchor words from MMT .

Step

We can recover A and R given MMT and the anchor words.

And we can use matrix perturbation bounds to quantify how erroraccumulates

Page 37: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Our Algorithm

M̂ is far from M , but let’s use M̂M̂T instead!

M̂M̂T → MMT and WW T → R as number of documents increase

Step

We can recover the anchor words from MMT .

Step

We can recover A and R given MMT and the anchor words.

And we can use matrix perturbation bounds to quantify how erroraccumulates

Page 38: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Our Algorithm

M̂ is far from M , but let’s use M̂M̂T instead!

M̂M̂T → MMT and WW T → R as number of documents increase

Step

We can recover the anchor words from MMT .

Step

We can recover A and R given MMT and the anchor words.

And we can use matrix perturbation bounds to quantify how erroraccumulates

Page 39: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Finding the Anchor Words

   

     

             

       

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

MA W M

Can we efficiently determine

if a word is an anchor word?

M

A

W

M

Can we efficiently determine

if a word is an anchor word?

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

!!

! !

! !

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

!!

! !

! !

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

A W M

Can we efficiently determine

if a word is an anchor word?

MA W M

Can we efficiently determine

if a word is an anchor word?

M

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

!!

! !

! !

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

!!

! !

! !

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

Page 40: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Finding the Anchor Words

   

     

       

     

    Nonnegative!    

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

MA W M

Can we efficiently determine

if a word is an anchor word?

M

A

W

M

Can we efficiently determine

if a word is an anchor word?

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

!!

! !

! !

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

!!

! !

! !

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

A W M

Can we efficiently determine

if a word is an anchor word?

MA W M

Can we efficiently determine

if a word is an anchor word?

M

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

!!

! !

! !

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

!!

! !

! !

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

Page 41: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Finding the Anchor Words

   

     

       

     

    Nonnegative!    

Anchor  words  from:    

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

MA W M

Can we efficiently determine

if a word is an anchor word?

M

A

W

M

Can we efficiently determine

if a word is an anchor word?

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

!!

! !

! !

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

!!

! !

! !

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

A W M

Can we efficiently determine

if a word is an anchor word?

MA W M

Can we efficiently determine

if a word is an anchor word?

M

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

!!

! !

! !

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

!!

! !

! !

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i!!

! !

! !

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

Page 42: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Our Algorithm

M̂ is far from M , but let’s use M̂M̂T instead!

M̂M̂T → MMT and WW T → R as number of documents increase

Step

We can recover the anchor words from MMT .

Step

We can recover A and R given MMT and the anchor words.

And we can use matrix perturbation bounds to quantify how erroraccumulates

Page 43: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Our Algorithm

M̂ is far from M , but let’s use M̂M̂T instead!

M̂M̂T → MMT and WW T → R as number of documents increase

Step

We can recover the anchor words from MMT .

Step

We can recover A and R given MMT and the anchor words.

And we can use matrix perturbation bounds to quantify how erroraccumulates

Page 44: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Using the Anchor Words

   

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

Page 45: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Using the Anchor Words

   

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

Page 46: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Using the Anchor Words

   

   

   

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

Page 47: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Using the Anchor Words

   

   

   

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

Page 48: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Using the Anchor Words

   

   

   

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

Page 49: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Using the Anchor Words

   

   

   

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

DRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

iDRA

A

WT

AT

W(D)

(R)(U)

DR1 = DRAT1

find :

DRD diag( ) = DR1

z

z

DRUT

DRD

DERUT

1 + eps1 ! eps

output: (DRD diag(z))T!1

= M MT

DERE DT

{(E ) 1}T !1

i

Page 50: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Our Algorithm

M̂ is far from M , but let’s use M̂M̂T instead!

M̂M̂T → MMT and WW T → R as number of documents increase

Step

We can recover the anchor words from MMT .

Step

We can recover A and R given MMT and the anchor words.

And we can use matrix perturbation bounds to quantify how erroraccumulates

Page 51: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Our Algorithm

M̂ is far from M , but let’s use M̂M̂T instead!

M̂M̂T → MMT and WW T → R as number of documents increase

Step

We can recover the anchor words from MMT .

Step

We can recover A and R given MMT and the anchor words.

And we can use matrix perturbation bounds to quantify how erroraccumulates

Page 52: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Concluding Remarks

joint work with Arora, Ge, Halpern, Mimno, Sontag, Wu and Zhu

We ran our algorithm on a database of 300,000 New York Timesarticles (from the UCI database) with 30,000 distinct words

Run time: 12 minutes (compared to 10 hours for MALLET andother state-of-the-art topic modeling tools)

Topics are high quality (Ask me if you want to see the results!)

Independently, [Anandkumar et al] gave an algorithm for LDAwithout any assumptions!

Are there other trapdoors – like anchor words – that make machinelearning much easier?

Page 53: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Concluding Remarks

joint work with Arora, Ge, Halpern, Mimno, Sontag, Wu and Zhu

We ran our algorithm on a database of 300,000 New York Timesarticles (from the UCI database) with 30,000 distinct words

Run time: 12 minutes (compared to 10 hours for MALLET andother state-of-the-art topic modeling tools)

Topics are high quality (Ask me if you want to see the results!)

Independently, [Anandkumar et al] gave an algorithm for LDAwithout any assumptions!

Are there other trapdoors – like anchor words – that make machinelearning much easier?

Page 54: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Concluding Remarks

joint work with Arora, Ge, Halpern, Mimno, Sontag, Wu and Zhu

We ran our algorithm on a database of 300,000 New York Timesarticles (from the UCI database) with 30,000 distinct words

Run time: 12 minutes (compared to 10 hours for MALLET andother state-of-the-art topic modeling tools)

Topics are high quality (Ask me if you want to see the results!)

Independently, [Anandkumar et al] gave an algorithm for LDAwithout any assumptions!

Are there other trapdoors – like anchor words – that make machinelearning much easier?

Page 55: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Concluding Remarks

joint work with Arora, Ge, Halpern, Mimno, Sontag, Wu and Zhu

We ran our algorithm on a database of 300,000 New York Timesarticles (from the UCI database) with 30,000 distinct words

Run time: 12 minutes (compared to 10 hours for MALLET andother state-of-the-art topic modeling tools)

Topics are high quality (Ask me if you want to see the results!)

Independently, [Anandkumar et al] gave an algorithm for LDAwithout any assumptions!

Are there other trapdoors – like anchor words – that make machinelearning much easier?

Page 56: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Concluding Remarks

joint work with Arora, Ge, Halpern, Mimno, Sontag, Wu and Zhu

We ran our algorithm on a database of 300,000 New York Timesarticles (from the UCI database) with 30,000 distinct words

Run time: 12 minutes (compared to 10 hours for MALLET andother state-of-the-art topic modeling tools)

Topics are high quality (Ask me if you want to see the results!)

Independently, [Anandkumar et al] gave an algorithm for LDAwithout any assumptions!

Are there other trapdoors – like anchor words – that make machinelearning much easier?

Page 57: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Questions?

Page 58: Learning Topic Models Going Beyond SVD - MIT CSAILpeople.csail.mit.edu/moitra/docs/LDA.pdf · Learning Topic Models { Going Beyond SVD Ankur Moitra, IAS joint with Sanjeev Arora and

Thanks!