
Page 1: Topics in statistical language modeling Tom Griffiths

Topics in statistical language modeling

Tom Griffiths

Page 2: Topics in statistical language modeling Tom Griffiths

Mark Steyvers, UC Irvine

Josh Tenenbaum, MIT

Dave Blei, CMU

Mike Jordan, UC Berkeley

Page 3: Topics in statistical language modeling Tom Griffiths

Latent Dirichlet Allocation (LDA)

• Each document a mixture of topics

• Each word chosen from a single topic

• Introduced by Blei, Ng, and Jordan (2001); a reinterpretation of PLSI (Hofmann, 1999)

• Idea of probabilistic topics widely used (e.g. Bigi et al., 1997; Iyer & Ostendorf, 1996; Ueda & Saito, 2003)

Page 4: Topics in statistical language modeling Tom Griffiths

• Each document a mixture of topics: θ(d) drawn from parameters α

• Each word chosen from a single topic: φ(z) drawn from parameters β

Latent Dirichlet Allocation (LDA)

Page 5: Topics in statistical language modeling Tom Griffiths

Latent Dirichlet Allocation (LDA)

              topic 1              topic 2
w             P(w|z = 1) = φ(1)    P(w|z = 2) = φ(2)
HEART         0.2                  0.0
LOVE          0.2                  0.0
SOUL          0.2                  0.0
TEARS         0.2                  0.0
JOY           0.2                  0.0
SCIENTIFIC    0.0                  0.2
KNOWLEDGE     0.0                  0.2
WORK          0.0                  0.2
RESEARCH      0.0                  0.2
MATHEMATICS   0.0                  0.2

Page 6: Topics in statistical language modeling Tom Griffiths

Choose mixture weights for each document, generate “bag of words”

θ = {P(z = 1), P(z = 2)}

{0, 1}: MATHEMATICS KNOWLEDGE RESEARCH WORK MATHEMATICS RESEARCH WORK SCIENTIFIC MATHEMATICS WORK

{0.25, 0.75}: SCIENTIFIC KNOWLEDGE MATHEMATICS SCIENTIFIC HEART LOVE TEARS KNOWLEDGE HEART

{0.5, 0.5}: MATHEMATICS HEART RESEARCH LOVE MATHEMATICS WORK TEARS SOUL KNOWLEDGE HEART

{0.75, 0.25}: WORK JOY SOUL TEARS MATHEMATICS TEARS LOVE LOVE LOVE SOUL

{1, 0}: TEARS LOVE JOY SOUL LOVE TEARS SOUL SOUL TEARS JOY

Page 7: Topics in statistical language modeling Tom Griffiths

Generating a document

1. Choose θ(d) ~ Dirichlet(α)

2. For each word in the document:
– choose z ~ Multinomial(θ(d))
– choose w ~ Multinomial(φ(z))

(graphical model: for each word in the document, a topic z is drawn and the word w is drawn from that topic, repeated across words)
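As a concrete illustration, here is a minimal numpy sketch of this generative process. The two-topic φ matrix and vocabulary mirror the toy example on the earlier slides; α and the document length are arbitrary choices, not values from the talk.

```python
# Minimal sketch of the LDA generative process.
import numpy as np

rng = np.random.default_rng(0)

vocab = ["HEART", "LOVE", "SOUL", "TEARS", "JOY",
         "SCIENTIFIC", "KNOWLEDGE", "WORK", "RESEARCH", "MATHEMATICS"]
phi = np.array([
    [0.2, 0.2, 0.2, 0.2, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0],  # topic 1
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.2, 0.2, 0.2, 0.2, 0.2],  # topic 2
])
alpha = np.ones(2)  # symmetric Dirichlet hyperparameter (assumed)

def generate_document(n_words):
    theta = rng.dirichlet(alpha)        # 1. theta^(d) ~ Dirichlet(alpha)
    words = []
    for _ in range(n_words):
        z = rng.choice(2, p=theta)      # 2a. z ~ Multinomial(theta^(d))
        w = rng.choice(10, p=phi[z])    # 2b. w ~ Multinomial(phi^(z))
        words.append(vocab[w])
    return words

print(generate_document(10))
```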

Page 8: Topics in statistical language modeling Tom Griffiths

Inverting the generative model

• Generative model gives procedure to obtain corpus from topics, mixing proportions

• Inverting the model extracts topics and mixing proportions from corpus

• Goal: describe content of documents, and be able to identify content of new documents

• All inference completely unsupervised, fixed # of topics T, words W, documents D

Page 9: Topics in statistical language modeling Tom Griffiths

Inverting the generative model

• Maximum likelihood estimation (EM)
– e.g. Hofmann (1999)
– slow, local maxima

• Approximate E-steps
– VB: Blei, Ng & Jordan (2001)
– EP: Minka & Lafferty (2002)

• Bayesian inference(via Gibbs sampling)

Page 10: Topics in statistical language modeling Tom Griffiths

Gibbs sampling in LDA

• Numerator rewards sparsity in the assignment of words to topics and of topics to documents

• The sum in the denominator is over T^n terms

• The full posterior is tractable only up to a constant, so use Markov chain Monte Carlo (MCMC)

Page 11: Topics in statistical language modeling Tom Griffiths

Markov chain Monte Carlo

• Sample from a Markov chain constructed to converge to the target distribution

• Allows sampling from unnormalized posterior, and other complex distributions

• Can compute approximate statistics from intractable distributions

• Gibbs sampling is one such method: construct the Markov chain from the full conditional distributions of the variables

Page 12: Topics in statistical language modeling Tom Griffiths

Gibbs sampling in LDA

• Need full conditional distributions for the variables

• Since we only sample z we need

P(zi = j | z−i, w) ∝ (n(wi)−i,j + β) / (n(·)−i,j + Wβ) · (n(di)−i,j + α) / (n(di)−i,· + Tα)

where n(wi)−i,j is the number of times word wi is assigned to topic j, and n(di)−i,j is the number of times topic j is used in document di (both counts excluding token i)
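A sketch of one such collapsed Gibbs update in code, assuming count matrices nwz (word-topic) and ndz (document-topic) as defined above. The document-side denominator is dropped because it is constant across topics for a fixed token.

```python
# One collapsed Gibbs update for token i (hypothetical setup for illustration).
import numpy as np

def resample_token(i, w, d, z, nwz, ndz, nz, alpha, beta, rng):
    """nwz[w, j]: count of word w assigned to topic j
    ndz[d, j]: count of topic j used in document d
    nz[j]: total tokens assigned to topic j (column sums of nwz)"""
    W, T = nwz.shape
    # remove token i's current assignment from the counts
    nwz[w, z[i]] -= 1
    ndz[d, z[i]] -= 1
    nz[z[i]] -= 1
    # full conditional, up to a constant:
    # (nwz[w, j] + beta) / (nz[j] + W*beta) * (ndz[d, j] + alpha)
    p = (nwz[w, :] + beta) / (nz + W * beta) * (ndz[d, :] + alpha)
    p /= p.sum()
    z[i] = rng.choice(T, p=p)
    # add the new assignment back into the counts
    nwz[w, z[i]] += 1
    ndz[d, z[i]] += 1
    nz[z[i]] += 1
```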

Page 13: Topics in statistical language modeling Tom Griffiths

Gibbs sampling in LDA

i    wi            di   zi
1    MATHEMATICS   1    2
2    KNOWLEDGE     1    2
3    RESEARCH      1    1
4    WORK          1    2
5    MATHEMATICS   1    1
6    RESEARCH      1    2
7    WORK          1    2
8    SCIENTIFIC    1    1
9    MATHEMATICS   1    2
10   WORK          1    1
11   SCIENTIFIC    2    1
12   KNOWLEDGE     2    1
…    …             …    …
50   JOY           5    2

(state after iteration 1)

Page 14: Topics in statistical language modeling Tom Griffiths

Gibbs sampling in LDA

(same table, with an empty column for iteration 2: z1 is removed from the counts and resampled from its full conditional, marked "?")

Page 15: Topics in statistical language modeling Tom Griffiths

Gibbs sampling in LDA


Page 16: Topics in statistical language modeling Tom Griffiths

Gibbs sampling in LDA


Page 17: Topics in statistical language modeling Tom Griffiths

Gibbs sampling in LDA

(same table; z1 = 2 is drawn for iteration 2, and z2 is resampled next, marked "?")

Page 18: Topics in statistical language modeling Tom Griffiths

Gibbs sampling in LDA

(same table; iteration 2 assignments so far: 2, 1; z3 is resampled next)

Page 19: Topics in statistical language modeling Tom Griffiths

Gibbs sampling in LDA

(same table; iteration 2 assignments so far: 2, 1, 1; z4 is resampled next)

Page 20: Topics in statistical language modeling Tom Griffiths

Gibbs sampling in LDA

(same table; iteration 2 assignments so far: 2, 1, 1, 2; z5 is resampled next)

Page 21: Topics in statistical language modeling Tom Griffiths

Gibbs sampling in LDA

i    wi            di   zi (iter. 1)   zi (iter. 2)   zi (iter. 1000)
1    MATHEMATICS   1    2              2              2
2    KNOWLEDGE     1    2              1              2
3    RESEARCH      1    1              1              2
4    WORK          1    2              2              1
5    MATHEMATICS   1    1              2              2
6    RESEARCH      1    2              2              2
7    WORK          1    2              2              2
8    SCIENTIFIC    1    1              1              1
9    MATHEMATICS   1    2              2              2
10   WORK          1    1              2              2
11   SCIENTIFIC    2    1              1              2
12   KNOWLEDGE     2    1              2              2
…    …             …    …              …              …
50   JOY           5    2              1              1

(assignments shown after iterations 1, 2, and 1000)

Page 22: Topics in statistical language modeling Tom Griffiths

Estimating topic distributions

Parameter estimates from posterior predictive distributions
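Concretely, the estimates take the usual smoothed-count form of the posterior predictive means (a sketch, following the notation of the Gibbs sampler above):

```latex
\hat{\phi}_j^{(w)} = \frac{n_j^{(w)} + \beta}{n_j^{(\cdot)} + W\beta}
\qquad
\hat{\theta}_j^{(d)} = \frac{n_j^{(d)} + \alpha}{n_\cdot^{(d)} + T\alpha}
```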

Page 23: Topics in statistical language modeling Tom Griffiths

pixel = word, image = document

sample each pixel from a mixture of topics

A visual example: Bars
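A sketch of how such image "documents" can be generated, with topics taken to be horizontal and vertical bars on a small grid. The grid size, pixels per image, and α are illustrative assumptions, not the talk's exact settings.

```python
# Generate "bars" images: each topic is a bar, each image mixes a few bars.
import numpy as np

rng = np.random.default_rng(0)
side = 5

# topics: 5 horizontal + 5 vertical bars, each a distribution over pixels
topics = []
for r in range(side):
    bar = np.zeros((side, side)); bar[r, :] = 1
    topics.append(bar.ravel() / bar.sum())
for c in range(side):
    bar = np.zeros((side, side)); bar[:, c] = 1
    topics.append(bar.ravel() / bar.sum())
topics = np.array(topics)

def generate_image(n_pixels=100, alpha=0.5):
    theta = rng.dirichlet(alpha * np.ones(len(topics)))  # bar weights
    counts = np.zeros(side * side)
    for _ in range(n_pixels):
        z = rng.choice(len(topics), p=theta)             # pick a bar
        counts[rng.choice(side * side, p=topics[z])] += 1
    return counts.reshape(side, side)

print(generate_image())
```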

Page 24: Topics in statistical language modeling Tom Griffiths
Page 25: Topics in statistical language modeling Tom Griffiths
Page 26: Topics in statistical language modeling Tom Griffiths

Strategy

• Markov chain Monte Carlo (MCMC) is normally slow, so why consider using it?

• In discrete models, use conjugate priors to reduce inference to discrete variables

• Several benefits:
– save memory: need only track sparse counts
– save time: cheap updates, even with complex dependencies between variables

Page 27: Topics in statistical language modeling Tom Griffiths

(not estimating Dirichlet hyperparameters α, β)

Perplexity vs. time

Page 28: Topics in statistical language modeling Tom Griffiths

Strategy

• Markov chain Monte Carlo (MCMC) is normally slow, so why consider using it?

• In discrete models, use conjugate priors to reduce inference to discrete variables

• Several benefits:
– save memory: need only track sparse counts
– save time: cheap updates, even with complex dependencies between variables

These properties let us explore larger, more complex models

Page 29: Topics in statistical language modeling Tom Griffiths

Application to corpus data

• TASA corpus: text from first grade to college

• 26,414 word types, over 37,000 documents, approximately 6 million word tokens used

• Run Gibbs for models with T = 300, 500, …, 1700 topics

Page 30: Topics in statistical language modeling Tom Griffiths

A selection from 500 topics [P(w|z = j)]:

THEORY SCIENTISTS EXPERIMENT OBSERVATIONS SCIENTIFIC EXPERIMENTS HYPOTHESIS EXPLAIN SCIENTIST OBSERVED EXPLANATION BASED OBSERVATION IDEA EVIDENCE THEORIES BELIEVED DISCOVERED OBSERVE FACTS

SPACE EARTH MOON PLANET ROCKET MARS ORBIT ASTRONAUTS FIRST SPACECRAFT JUPITER SATELLITE SATELLITES ATMOSPHERE SPACESHIP SURFACE SCIENTISTS ASTRONAUT SATURN MILES

ART PAINT ARTIST PAINTING PAINTED ARTISTS MUSEUM WORK PAINTINGS STYLE PICTURES WORKS OWN SCULPTURE PAINTER ARTS BEAUTIFUL DESIGNS PORTRAIT PAINTERS

STUDENTS TEACHER STUDENT TEACHERS TEACHING CLASS CLASSROOM SCHOOL LEARNING PUPILS CONTENT INSTRUCTION TAUGHT GROUP GRADE SHOULD GRADES CLASSES PUPIL GIVEN

BRAIN NERVE SENSE SENSES ARE NERVOUS NERVES BODY SMELL TASTE TOUCH MESSAGES IMPULSES CORD ORGANS SPINAL FIBERS SENSORY PAIN IS

CURRENT ELECTRICITY ELECTRIC CIRCUIT IS ELECTRICAL VOLTAGE FLOW BATTERY WIRE WIRES SWITCH CONNECTED ELECTRONS RESISTANCE POWER CONDUCTORS CIRCUITS TUBE NEGATIVE

Page 31: Topics in statistical language modeling Tom Griffiths

A selection from 500 topics [P(w|z = j)]:

STORY STORIES TELL CHARACTER CHARACTERS AUTHOR READ TOLD SETTING TALES PLOT TELLING SHORT FICTION ACTION TRUE EVENTS TELLS TALE NOVEL

MIND WORLD DREAM DREAMS THOUGHT IMAGINATION MOMENT THOUGHTS OWN REAL LIFE IMAGINE SENSE CONSCIOUSNESS STRANGE FEELING WHOLE BEING MIGHT HOPE

FIELD MAGNETIC MAGNET WIRE NEEDLE CURRENT COIL POLES IRON COMPASS LINES CORE ELECTRIC DIRECTION FORCE MAGNETS BE MAGNETISM POLE INDUCED

SCIENCE STUDY SCIENTISTS SCIENTIFIC KNOWLEDGE WORK RESEARCH CHEMISTRY TECHNOLOGY MANY MATHEMATICS BIOLOGY FIELD PHYSICS LABORATORY STUDIES WORLD SCIENTIST STUDYING SCIENCES

BALL GAME TEAM FOOTBALL BASEBALL PLAYERS PLAY FIELD PLAYER BASKETBALL COACH PLAYED PLAYING HIT TENNIS TEAMS GAMES SPORTS BAT TERRY

JOB WORK JOBS CAREER EXPERIENCE EMPLOYMENT OPPORTUNITIES WORKING TRAINING SKILLS CAREERS POSITIONS FIND POSITION FIELD OCCUPATIONS REQUIRE OPPORTUNITY EARN ABLE

Page 32: Topics in statistical language modeling Tom Griffiths


Page 33: Topics in statistical language modeling Tom Griffiths

Evaluation: Word association

Cue: PLANET

(Nelson, McEvoy & Schreiber, 1998)

Page 34: Topics in statistical language modeling Tom Griffiths

Evaluation: Word association

Cue: PLANET

Associates: EARTH, PLUTO, JUPITER, NEPTUNE, VENUS, URANUS, SATURN, COMET, MARS, ASTEROID

(Nelson, McEvoy & Schreiber, 1998)

Page 35: Topics in statistical language modeling Tom Griffiths

Evaluation: Word association

(figure: association norms arranged as a matrix of cues by associates)

Page 36: Topics in statistical language modeling Tom Griffiths

• Comparison with Latent Semantic Analysis (LSA; Landauer & Dumais, 1997)

• Both algorithms applied to TASA corpus (D > 30,000, W > 20,000, n > 6,000,000)

• Compare LSA cosine, inner product, with the “on-topic” conditional probability

Evaluation: Word association
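One way to score associates with the topic model is the conditional probability P(w2 | w1) = Σj P(w2 | z = j) P(z = j | w1). A sketch, assuming a uniform prior over topics (that prior is an assumption of this sketch, not stated on the slide):

```python
import numpy as np

def associate_scores(w1, phi):
    """Score all candidate associates of word index w1.

    phi: T x W array, phi[j, w] = P(w | z = j).
    Returns P(w2 | w1) for every w2.
    """
    p_z_given_w1 = phi[:, w1] / phi[:, w1].sum()  # Bayes' rule, uniform P(z)
    return p_z_given_w1 @ phi                     # sum_j P(w2 | z=j) P(z=j | w1)
```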

Page 37: Topics in statistical language modeling Tom Griffiths

Latent Semantic Analysis (Landauer & Dumais, 1997)

(figure: a word-document co-occurrence matrix X, with rows for words such as "words", "in", "semantic", "spaces" and columns Doc1, Doc2, Doc3, …, is factorized by SVD as X = U D V^T, placing words in a high-dimensional space)

Page 38: Topics in statistical language modeling Tom Griffiths

Latent Semantic Analysis (Landauer & Dumais, 1997)

(figure: X (words × documents) ≈ U (words × dims) D (dims × dims) V (dims × documents))

Dimensionality reduction makes storage efficient, extracts correlation
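A sketch of the LSA pipeline with a truncated SVD; the random matrix X below is a stand-in for a real word-document matrix, and k = 100 is an arbitrary choice.

```python
# LSA sketch: truncated SVD of a word-document matrix, cosine comparisons.
import numpy as np

rng = np.random.default_rng(0)
X = rng.poisson(0.5, size=(2000, 300)).astype(float)  # words x documents

k = 100                                     # retained dimensions
U, s, Vt = np.linalg.svd(X, full_matrices=False)
word_vectors = U[:, :k] * s[:k]             # words in the k-dim space

def cosine(i, j):
    u, v = word_vectors[i], word_vectors[j]
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

print(cosine(0, 1))
```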

Page 39: Topics in statistical language modeling Tom Griffiths
Page 40: Topics in statistical language modeling Tom Griffiths

(figure: probability that the set of top-ranked words contains the first associate, as a function of set size (rank), from 10^0 to 10^3, comparing LSA and TOPICS)

Page 41: Topics in statistical language modeling Tom Griffiths

Problems

• Finding the right number of topics

• No dependencies between topics

• The “bag of words” assumption

• Need for a stop list

Page 42: Topics in statistical language modeling Tom Griffiths

Problems

• Finding the right number of topics
• No dependencies between topics
} CRP models (Blei, Jordan, Tenenbaum)

• The “bag of words” assumption
• Need for a stop list
} HMM syntax (Steyvers, Blei, Tenenbaum)

Page 43: Topics in statistical language modeling Tom Griffiths


Page 44: Topics in statistical language modeling Tom Griffiths

Standard LDA: T corpus topics (1 … T); all T topics are in each document

(figure: doc1, doc2, doc3 each draw on all T topics)

Page 45: Topics in statistical language modeling Tom Griffiths

T corpus topics (1 … T); only L topics are in each document

(figure: doc1, doc2, doc3 each draw on a different subset of L topics)

Page 46: Topics in statistical language modeling Tom Griffiths

T corpus topics (1 … T); only L topics are in each document

(figure: doc1, doc2, doc3 each draw on a different subset of L topics)

topic identities indexed by c

Page 47: Topics in statistical language modeling Tom Griffiths

Richer dependencies

• Nature of topic dependencies comes from prior on assignments to documents p(c)

• Inference with Gibbs is straightforward

• Boring prior: pick L from T uniformly

• Some interesting priors on assignments:
– Chinese restaurant process (CRP)
– nested CRP (for hierarchies)

Page 48: Topics in statistical language modeling Tom Griffiths

Chinese restaurant process

• The mth customer at an infinitely large Chinese restaurant chooses a table with

P(occupied table k) = nk / (m − 1 + γ),   P(next unoccupied table) = γ / (m − 1 + γ)

where nk is the number of customers already seated at table k

• Also Dirichlet process, infinite models (Beal, Ghahramani, Neal, Rasmussen)

• Prior on assignments: one topic on each table, L visits/document, T is unbounded
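A sketch of forward-sampling table assignments from a CRP with concentration parameter γ (gamma in the code):

```python
# Chinese restaurant process: customer m sits at occupied table k with
# probability proportional to n_k, at a new table proportional to gamma.
import numpy as np

def sample_crp(n_customers, gamma, rng):
    counts = []                      # customers per occupied table
    assignments = []
    for m in range(n_customers):
        p = np.array(counts + [gamma], dtype=float)
        p /= p.sum()
        k = rng.choice(len(p), p=p)
        if k == len(counts):
            counts.append(1)         # open a new table
        else:
            counts[k] += 1
        assignments.append(k)
    return assignments

print(sample_crp(20, gamma=1.0, rng=np.random.default_rng(0)))
```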

Page 49: Topics in statistical language modeling Tom Griffiths

Generating a document

1. Choose c by sampling L tables from the Chinese restaurant, without replacement

2. Choose θ(d) ~ Dirichlet(α) (over L slots)

3. For each word in the document:
– choose z ~ Multinomial(θ(d))
– choose w ~ Multinomial(φ(c(z)))

Page 50: Topics in statistical language modeling Tom Griffiths

Inverting the generative model

• Draw z as before, but conditioned on c

• Draw c one at a time from its full conditional, proportional to the CRP prior times the likelihood of the document’s words under the candidate topic

• Need only track occupied tables

• Recover topics, number of occupied tables

Page 51: Topics in statistical language modeling Tom Griffiths

Model selection with the CRP

(figure: number of topics preferred by the Chinese restaurant process prior vs. by the Bayes factor)

Page 52: Topics in statistical language modeling Tom Griffiths

Nested CRP

• Infinitely many infinite-table restaurants

• Every table has a card for another restaurant, forming an infinite-branching tree

• An L-day vacation: visit the root restaurant the first night, the restaurant on its card the next night, etc.

• Once inside the restaurant, choose the table (and the next restaurant) via the standard CRP
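A sketch of drawing paths through the nested CRP, representing the infinite tree lazily as a dictionary from path prefixes to child visit counts; the depth L and γ here are illustrative.

```python
# Nested CRP: at each of L levels, run an ordinary CRP over the child
# restaurants visited so far along the current path.
import numpy as np

def sample_ncrp_path(tree, L, gamma, rng):
    path = []
    for _ in range(L):
        counts = tree.setdefault(tuple(path), [])
        p = np.array(counts + [gamma], dtype=float)
        p /= p.sum()
        k = rng.choice(len(p), p=p)
        if k == len(counts):
            counts.append(1)          # a previously unvisited restaurant
        else:
            counts[k] += 1
        path.append(k)
    return tuple(path)

tree = {}
rng = np.random.default_rng(0)
print([sample_ncrp_path(tree, L=3, gamma=1.0, rng=rng) for _ in range(5)])
```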

Page 53: Topics in statistical language modeling Tom Griffiths

The nested CRP as a prior

• One topic per restaurant; each document has one topic at each of the L levels of a tree

• Each c is a path through the tree

• Collecting these paths from all documents gives a finite subtree of used topics

• Allows unsupervised learning of hierarchies

• Extends Hofmann’s (1999) topic hierarchies

Page 54: Topics in statistical language modeling Tom Griffiths

Generating a document

1. Choose c by sampling a path from the nested Chinese restaurant process

2. Choose θ(d) ~ Dirichlet(α) (over L slots)

3. For each word in the document:
– choose z ~ Multinomial(θ(d))
– choose w ~ Multinomial(φ(c(z)))

Page 55: Topics in statistical language modeling Tom Griffiths

Inverting the generative model

• Draw z as before, but conditioned on c

• Draw c as a block from its conditional, proportional to the nested CRP prior times the likelihood of the document’s words along the candidate path

• Need only track previously taken paths

• Recover topics, set of paths (finite subtree)

Page 56: Topics in statistical language modeling Tom Griffiths

Twelve years of NIPS

Page 57: Topics in statistical language modeling Tom Griffiths

Summary

• Letting document topics be a subset of corpus topics allows richer dependencies

• Using Gibbs sampling makes it possible to have an unbounded number of corpus topics

• The flat model and hierarchies are only two options of many: factorial structures, arbitrary graphs, etc.

Page 58: Topics in statistical language modeling Tom Griffiths

Problems

• Finding the right number of topics
• No dependencies between topics
} CRP models (Blei, Jordan, Tenenbaum)

• The “bag of words” assumption
• Need for a stop list
} HMM syntax (Steyvers, Blei, Tenenbaum)

Page 59: Topics in statistical language modeling Tom Griffiths

Syntax and semantics from statistics

(graphical model: as in LDA, topics z generate words w, but a Markov chain of classes x decides for each word whether it comes from the document’s topics or from a syntactic class)

semantics: probabilistic topics
syntax: probabilistic regular grammar

Factorization of language based on statistical dependency patterns:
– semantics: long-range, document-specific dependencies
– syntax: short-range dependencies constant across all documents

Page 60: Topics in statistical language modeling Tom Griffiths

class x = 1 (topics):
z = 1 (0.4): HEART 0.2, LOVE 0.2, SOUL 0.2, TEARS 0.2, JOY 0.2
z = 2 (0.6): SCIENTIFIC 0.2, KNOWLEDGE 0.2, WORK 0.2, RESEARCH 0.2, MATHEMATICS 0.2

class x = 2: OF 0.6, FOR 0.3, BETWEEN 0.1

class x = 3: THE 0.6, A 0.3, MANY 0.1

(class-transition probabilities on the slide: 0.9, 0.1, 0.2, 0.8, 0.7, 0.3)
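A sketch of generating from this composite model. The slide lists the transition probabilities without fully specifying their arrangement, so the transition matrix below is an assumption for illustration; θ = {0.4, 0.6} follows the slide.

```python
# Composite model sketch: an HMM over classes x; class 1 emits from a
# document topic z, classes 2 and 3 emit function words.
import numpy as np

rng = np.random.default_rng(1)

topics = {1: (["HEART", "LOVE", "SOUL", "TEARS", "JOY"], [0.2] * 5),
          2: (["SCIENTIFIC", "KNOWLEDGE", "WORK", "RESEARCH", "MATHEMATICS"], [0.2] * 5)}
classes = {2: (["OF", "FOR", "BETWEEN"], [0.6, 0.3, 0.1]),
           3: (["THE", "A", "MANY"], [0.6, 0.3, 0.1])}
theta = [0.4, 0.6]                      # P(z = 1), P(z = 2) for this document

# assumed class-transition matrix (rows: from x = 1, 2, 3)
trans = np.array([[0.1, 0.2, 0.7],
                  [0.8, 0.1, 0.1],
                  [0.9, 0.05, 0.05]])

x = 3                                   # start in the determiner class
sentence = []
for _ in range(6):
    if x == 1:
        z = rng.choice([1, 2], p=theta)
        words, probs = topics[z]
    else:
        words, probs = classes[x]
    sentence.append(words[rng.choice(len(words), p=probs)])
    x = rng.choice([1, 2, 3], p=trans[x - 1])
print(" ".join(sentence))
```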

Page 61: Topics in statistical language modeling Tom Griffiths

THE ………………………………


Page 62: Topics in statistical language modeling Tom Griffiths

THE LOVE……………………


Page 63: Topics in statistical language modeling Tom Griffiths

THE LOVE OF………………


Page 64: Topics in statistical language modeling Tom Griffiths

THE LOVE OF RESEARCH ……


Page 65: Topics in statistical language modeling Tom Griffiths

Inverting the generative model

• Sample z conditioned on x, other z
– draw from prior if x > 1

• Sample x conditioned on z, other x

• Inference allows estimation of:
– “semantic” topics
– “syntactic” classes

Page 66: Topics in statistical language modeling Tom Griffiths

Semantic topics:

FOOD FOODS BODY NUTRIENTS DIET FAT SUGAR ENERGY MILK EATING FRUITS VEGETABLES WEIGHT FATS NEEDS CARBOHYDRATES VITAMINS CALORIES PROTEIN MINERALS

MAP NORTH EARTH SOUTH POLE MAPS EQUATOR WEST LINES EAST AUSTRALIA GLOBE POLES HEMISPHERE LATITUDE PLACES LAND WORLD COMPASS CONTINENTS

DOCTOR PATIENT HEALTH HOSPITAL MEDICAL CARE PATIENTS NURSE DOCTORS MEDICINE NURSING TREATMENT NURSES PHYSICIAN HOSPITALS DR SICK ASSISTANT EMERGENCY PRACTICE

BOOK BOOKS READING INFORMATION LIBRARY REPORT PAGE TITLE SUBJECT PAGES GUIDE WORDS MATERIAL ARTICLE ARTICLES WORD FACTS AUTHOR REFERENCE NOTE

GOLD IRON SILVER COPPER METAL METALS STEEL CLAY LEAD ADAM ORE ALUMINUM MINERAL MINE STONE MINERALS POT MINING MINERS TIN

BEHAVIOR SELF INDIVIDUAL PERSONALITY RESPONSE SOCIAL EMOTIONAL LEARNING FEELINGS PSYCHOLOGISTS INDIVIDUALS PSYCHOLOGICAL EXPERIENCES ENVIRONMENT HUMAN RESPONSES BEHAVIORS ATTITUDES PSYCHOLOGY PERSON

CELLS CELL ORGANISMS ALGAE BACTERIA MICROSCOPE MEMBRANE ORGANISM FOOD LIVING FUNGI MOLD MATERIALS NUCLEUS CELLED STRUCTURES MATERIAL STRUCTURE GREEN MOLDS

PLANTS PLANT LEAVES SEEDS SOIL ROOTS FLOWERS WATER FOOD GREEN SEED STEMS FLOWER STEM LEAF ANIMALS ROOT POLLEN GROWING GROW

Page 67: Topics in statistical language modeling Tom Griffiths

Syntactic classes:

GOOD SMALL NEW IMPORTANT GREAT LITTLE LARGE * BIG LONG HIGH DIFFERENT SPECIAL OLD STRONG YOUNG COMMON WHITE SINGLE CERTAIN

THE HIS THEIR YOUR HER ITS MY OUR THIS THESE A AN THAT NEW THOSE EACH MR ANY MRS ALL

MORE SUCH LESS MUCH KNOWN JUST BETTER RATHER GREATER HIGHER LARGER LONGER FASTER EXACTLY SMALLER SOMETHING BIGGER FEWER LOWER ALMOST

ON AT INTO FROM WITH THROUGH OVER AROUND AGAINST ACROSS UPON TOWARD UNDER ALONG NEAR BEHIND OFF ABOVE DOWN BEFORE

SAID ASKED THOUGHT TOLD SAYS MEANS CALLED CRIED SHOWS ANSWERED TELLS REPLIED SHOUTED EXPLAINED LAUGHED MEANT WROTE SHOWED BELIEVED WHISPERED

ONE SOME MANY TWO EACH ALL MOST ANY THREE THIS EVERY SEVERAL FOUR FIVE BOTH TEN SIX MUCH TWENTY EIGHT

HE YOU THEY I SHE WE IT PEOPLE EVERYONE OTHERS SCIENTISTS SOMEONE WHO NOBODY ONE SOMETHING ANYONE EVERYBODY SOME THEN

BE MAKE GET HAVE GO TAKE DO FIND USE SEE HELP KEEP GIVE LOOK COME WORK MOVE LIVE EAT BECOME

Page 68: Topics in statistical language modeling Tom Griffiths

(figure: Bayes factors for different models)

(figure: part-of-speech tagging)

Page 69: Topics in statistical language modeling Tom Griffiths

NIPS Semantics:

MODEL ALGORITHM SYSTEM CASE PROBLEM NETWORK METHOD APPROACH PAPER PROCESS

EXPERTS EXPERT GATING HME ARCHITECTURE MIXTURE LEARNING MIXTURES FUNCTION GATE

DATA GAUSSIAN MIXTURE LIKELIHOOD POSTERIOR PRIOR DISTRIBUTION EM BAYESIAN PARAMETERS

STATE POLICY VALUE FUNCTION ACTION REINFORCEMENT LEARNING CLASSES OPTIMAL *

MEMBRANE SYNAPTIC CELL * CURRENT DENDRITIC POTENTIAL NEURON CONDUCTANCE CHANNELS

IMAGE IMAGES OBJECT OBJECTS FEATURE RECOGNITION VIEWS # PIXEL VISUAL

KERNEL SUPPORT VECTOR SVM KERNELS # SPACE FUNCTION MACHINES SET

NETWORK NEURAL NETWORKS OUTPUT INPUT TRAINING INPUTS WEIGHTS # OUTPUTS

NIPS Syntax:

IS WAS HAS BECOMES DENOTES BEING REMAINS REPRESENTS EXISTS SEEMS

SEE SHOW NOTE CONSIDER ASSUME PRESENT NEED PROPOSE DESCRIBE SUGGEST

USED TRAINED OBTAINED DESCRIBED GIVEN FOUND PRESENTED DEFINED GENERATED SHOWN

IN WITH FOR ON FROM AT USING INTO OVER WITHIN

HOWEVER ALSO THEN THUS THEREFORE FIRST HERE NOW HENCE FINALLY

# * I X T N - C F P

Page 70: Topics in statistical language modeling Tom Griffiths


Function and content words

Page 71: Topics in statistical language modeling Tom Griffiths


Highlighting and templating

Page 72: Topics in statistical language modeling Tom Griffiths

Open questions

• Are MCMC methods useful elsewhere?
– “smoothing with negative weights”
– Markov chains on grammars

• Other nonparametric language models?
– infinite HMM, infinite PCFG, clustering

• Better ways of combining topics and syntax?
– richer syntactic models
– better combination schemes