Page 1: Natural Language Processing

Natural Language Processing

Spring 2007

V. “Juggy” Jagannathan

Page 2: Natural Language Processing

Foundations of Statistical Natural Language Processing

By

Christopher Manning & Hinrich Schutze

Course Book

Page 3: Natural Language Processing

Chapter 9

Markov Models

March 5, 2007

Page 4: Natural Language Processing

Markov models

• Markov assumption
  – Suppose X = (X_1, …, X_T) is a sequence of random variables taking values in some finite set S = {s_1, …, s_N}. The Markov properties are:

• Limited Horizon
  – $P(X_{t+1} = s_k \mid X_1, \dots, X_t) = P(X_{t+1} = s_k \mid X_t)$
  – i.e. the value at time t+1 depends only on the value at time t

• Time invariant (stationary)

• Stochastic transition matrix A:
  – $a_{ij} = P(X_{t+1} = s_j \mid X_t = s_i)$, where $a_{ij} \ge 0$ for all $i, j$ and $\sum_{j=1}^{N} a_{ij} = 1$ for all $i$

Page 5: Natural Language Processing

Markov model example

For a Markov chain, the probability of a state sequence factors as:

$P(X_1, \dots, X_T) = P(X_1)\, P(X_2 \mid X_1)\, P(X_3 \mid X_1, X_2) \cdots P(X_T \mid X_1, \dots, X_{T-1})$
$\qquad = P(X_1)\, P(X_2 \mid X_1)\, P(X_3 \mid X_2) \cdots P(X_T \mid X_{T-1})$
$\qquad = \pi_{X_1} \prod_{t=1}^{T-1} a_{X_t X_{t+1}}$

For example, for the state sequence (t, i, p):

$P(t, i, p) = P(X_1 = t)\, P(X_2 = i \mid X_1 = t)\, P(X_3 = p \mid X_2 = i) = 1.0 \times 0.3 \times 0.6 = 0.18$
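A minimal sketch of the chain-probability formula above. It uses only the start probability and the two transition probabilities that appear in the worked example; the function and variable names are illustrative, and any arc not listed is treated as having zero probability.

```python
# Chain probability: pi_{X1} * product of a_{X_t X_{t+1}} along the sequence.
# Only the values from the slide's example are filled in here.
start = {"t": 1.0}
trans = {("t", "i"): 0.3, ("i", "p"): 0.6}

def chain_prob(states):
    p = start.get(states[0], 0.0)
    for a, b in zip(states, states[1:]):
        p *= trans.get((a, b), 0.0)   # a_{X_t X_{t+1}}
    return p

print(chain_prob(["t", "i", "p"]))    # 1.0 * 0.3 * 0.6 = 0.18
```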

Page 6: Natural Language Processing

Hidden Markov Model Example

Probability of {lem, ice_t} given the machine starts in CP?

0.3 × 0.7 × 0.1 + 0.3 × 0.3 × 0.7 = 0.021 + 0.063 = 0.084
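A small sketch that reproduces the 0.084 figure by summing over state paths. Only the numbers used in the arithmetic above come from the slide; the state and output names (CP/IP, cola/ice_t/lem) and the remaining entries of the transition and emission tables are assumptions filled in for illustration.

```python
# Soft-drink-machine style HMM check; A, B and the label sets are hypothetical
# beyond the values implied by the slide's arithmetic.
import numpy as np

states = ["CP", "IP"]                      # cola-preferring / iced-tea-preferring
outputs = ["cola", "ice_t", "lem"]

A = np.array([[0.7, 0.3],                  # transitions out of CP (assumed)
              [0.5, 0.5]])                 # transitions out of IP (assumed)
B = np.array([[0.6, 0.1, 0.3],             # emissions in CP: cola, ice_t, lem (assumed)
              [0.1, 0.7, 0.2]])            # emissions in IP (assumed)

def prob_from_start(obs, start):
    """Sum over all state paths of the probability of emitting `obs`,
    with the machine known to start in state `start`."""
    o = [outputs.index(x) for x in obs]
    # state-emission form: the output at time t comes from the state at time t
    probs = {start: B[start, o[0]]}
    for sym in o[1:]:
        probs = {j: sum(p * A[i, j] for i, p in probs.items()) * B[j, sym]
                 for j in range(len(states))}
    return sum(probs.values())

print(prob_from_start(["lem", "ice_t"], start=states.index("CP")))  # 0.084
```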

Page 7: Natural Language Processing

Why use HMMs?

• Underlying (hidden) events generate the surface observable events

• E.g. predicting the weather based on the dampness of seaweed
  – http://www.comp.leeds.ac.uk/roger/HiddenMarkovModels/html_dev/main.html

• Linear interpolation in n-gram models (see the sketch below):

$P_{li}(w_n \mid w_{n-1}, w_{n-2}) = \lambda_1 P_1(w_n) + \lambda_2 P_2(w_n \mid w_{n-1}) + \lambda_3 P_3(w_n \mid w_{n-1}, w_{n-2})$
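A minimal sketch of the interpolation formula above. The toy corpus, the helper names, and the lambda weights are hypothetical illustrations, not values from the slides.

```python
# Linearly interpolated trigram estimate from maximum-likelihood counts.
from collections import Counter

corpus = "the cat sat on the mat the cat ate".split()
uni = Counter(corpus)
bi = Counter(zip(corpus, corpus[1:]))
tri = Counter(zip(corpus, corpus[1:], corpus[2:]))

def p1(w):            # unigram MLE
    return uni[w] / sum(uni.values())

def p2(w, w1):        # bigram MLE, 0 when the history is unseen
    return bi[(w1, w)] / uni[w1] if uni[w1] else 0.0

def p3(w, w1, w2):    # trigram MLE, 0 when the history is unseen
    return tri[(w2, w1, w)] / bi[(w2, w1)] if bi[(w2, w1)] else 0.0

def p_li(w, w1, w2, lambdas=(0.2, 0.3, 0.5)):
    """P_li(w_n | w_{n-1}, w_{n-2}) = l1*P1 + l2*P2 + l3*P3, with l1+l2+l3 = 1."""
    l1, l2, l3 = lambdas
    return l1 * p1(w) + l2 * p2(w, w1) + l3 * p3(w, w1, w2)

print(p_li("sat", "cat", "the"))   # probability of "sat" after "the cat"
```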

Page 8: Natural Language Processing
Page 9: Natural Language Processing

Look at notes from David Meir Blei [UC Berkeley]:

http://www-nlp.stanford.edu/fsnlp/hmm-chap/blei-hmm-ch9.ppt (slides 1-13)

Page 10: Natural Language Processing

[Figure: observed states]

Page 11: Natural Language Processing

Forward Procedure

Page 12: Natural Language Processing

Forward Procedure

$\alpha_i(t) = P(o_1 \cdots o_{t-1},\, X_t = i \mid \mu)$

Initialization:
$\alpha_i(1) = \pi_i, \quad 1 \le i \le N$

Induction:
$\alpha_j(t+1) = \sum_{i=1}^{N} \alpha_i(t)\, a_{ij}\, b_{ij o_t}, \quad 1 \le t \le T,\ 1 \le j \le N$

Total computation:
$P(O \mid \mu) = \sum_{i=1}^{N} \alpha_i(T+1)$
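A sketch of the forward procedure exactly as written above, using the slide's arc-emission convention b_ij(o_t). The toy parameters (pi, A, B, obs) and the 0-based indexing are assumptions for illustration.

```python
# Forward procedure: alpha[t, i] = P(o_1..o_t's prefix, X_{t+1} = i) in 0-based terms.
import numpy as np

def forward(pi, A, B, obs):
    """Return the alpha table and P(O) = sum_i alpha_i(T+1)."""
    N, T = len(pi), len(obs)
    alpha = np.zeros((T + 1, N))
    alpha[0] = pi                                    # alpha_i(1) = pi_i
    for t in range(T):                               # induction over o_t
        for j in range(N):
            alpha[t + 1, j] = np.sum(alpha[t] * A[:, j] * B[:, j, obs[t]])
    return alpha, alpha[T].sum()

pi = np.array([1.0, 0.0])                            # hypothetical 2-state model
A = np.array([[0.7, 0.3], [0.5, 0.5]])
B = np.ones((2, 2, 3)) / 3                           # uniform arc emissions over 3 symbols
alpha, likelihood = forward(pi, A, B, obs=[0, 2, 1])
print(likelihood)
```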

Page 13: Natural Language Processing

Backward Procedure

$\beta_i(t) = P(o_t \cdots o_T \mid X_t = i, \mu)$

Initialization:
$\beta_i(T+1) = 1, \quad 1 \le i \le N$

Induction:
$\beta_i(t) = \sum_{j=1}^{N} a_{ij}\, b_{ij o_t}\, \beta_j(t+1), \quad 1 \le t \le T,\ 1 \le i \le N$

Total computation:
$P(O \mid \mu) = \sum_{i=1}^{N} \pi_i\, \beta_i(1)$
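A companion sketch of the backward procedure, reusing the same hypothetical toy parameters as the forward sketch; with them, the backward likelihood matches the forward one.

```python
# Backward procedure: beta[t - 1, i] corresponds to beta_i(t) on the slide.
import numpy as np

def backward(pi, A, B, obs):
    """Return the beta table and P(O) = sum_i pi_i beta_i(1)."""
    N, T = len(pi), len(obs)
    beta = np.zeros((T + 1, N))
    beta[T] = 1.0                                    # beta_i(T+1) = 1
    for t in range(T - 1, -1, -1):                   # induction, backwards in time
        for i in range(N):
            beta[t, i] = np.sum(A[i, :] * B[i, :, obs[t]] * beta[t + 1])
    return beta, np.sum(pi * beta[0])

pi = np.array([1.0, 0.0])                            # same hypothetical model as above
A = np.array([[0.7, 0.3], [0.5, 0.5]])
B = np.ones((2, 2, 3)) / 3
beta, likelihood = backward(pi, A, B, obs=[0, 2, 1])
print(likelihood)                                    # matches the forward likelihood
```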

Page 14: Natural Language Processing

Combining both – forward and backward

$P(O, X_t = i \mid \mu) = P(o_1 \cdots o_T,\, X_t = i \mid \mu)$
$\qquad = P(o_1 \cdots o_{t-1},\, X_t = i \mid \mu)\, P(o_t \cdots o_T \mid X_t = i, \mu)$
$\qquad = \alpha_i(t)\, \beta_i(t)$

$P(O \mid \mu) = \sum_{i=1}^{N} \alpha_i(t)\, \beta_i(t), \quad 1 \le t \le T+1$
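A quick numerical check of this identity, reusing the hypothetical forward() and backward() sketches above: summing alpha_i(t) * beta_i(t) over states gives the same P(O | mu) at every t.

```python
# For every t, sum_i alpha_i(t) * beta_i(t) equals the forward likelihood.
import numpy as np

pi = np.array([1.0, 0.0])
A = np.array([[0.7, 0.3], [0.5, 0.5]])
B = np.ones((2, 2, 3)) / 3
obs = [0, 2, 1]

alpha, p_fwd = forward(pi, A, B, obs)
beta, p_bwd = backward(pi, A, B, obs)
print(np.allclose((alpha * beta).sum(axis=1), p_fwd))   # True for all t
```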

Page 15: Natural Language Processing

Finding the best state sequence

To determine the state sequence that best explains the observations, let:

$\gamma_i(t) = P(X_t = i \mid O, \mu) = \frac{P(X_t = i,\, O \mid \mu)}{P(O \mid \mu)} = \frac{\alpha_i(t)\, \beta_i(t)}{\sum_{j=1}^{N} \alpha_j(t)\, \beta_j(t)}$

Individually, the most likely state at each time is:

$\hat{X}_t = \arg\max_{1 \le i \le N} \gamma_i(t), \quad 1 \le t \le T+1$

This approach, however, does not correctly estimate the most likely state sequence.
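A short sketch of this "individually most likely state" rule, reusing the alpha and beta arrays computed in the check above; note that it picks a best state per time step, which need not form the best overall path.

```python
# Posterior per-state probabilities gamma_i(t) and their per-time argmax.
import numpy as np

gamma = alpha * beta / (alpha * beta).sum(axis=1, keepdims=True)
best_individual_states = gamma.argmax(axis=1)
print(best_individual_states)
```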

Page 16: Natural Language Processing

Finding the best state sequence: Viterbi algorithm

The goal is the state sequence that maximizes:

$\hat{X} = \arg\max_{X} P(X \mid O, \mu)$

Store the probability of the most probable path that leads to a given node:

$\delta_j(t) = \max_{X_1 \cdots X_{t-1}} P(X_1 \cdots X_{t-1},\, o_1 \cdots o_{t-1},\, X_t = j \mid \mu)$

Initialization:
$\delta_j(1) = \pi_j, \quad 1 \le j \le N$

Induction:
$\delta_j(t+1) = \max_{1 \le i \le N} \delta_i(t)\, a_{ij}\, b_{ij o_t}, \quad 1 \le j \le N$

Store backtrace:
$\psi_j(t+1) = \arg\max_{1 \le i \le N} \delta_i(t)\, a_{ij}\, b_{ij o_t}, \quad 1 \le j \le N$

Termination and path readout:
$\hat{X}_{T+1} = \arg\max_{1 \le i \le N} \delta_i(T+1)$
$\hat{X}_t = \psi_{\hat{X}_{t+1}}(t+1)$
$P(\hat{X}) = \max_{1 \le i \le N} \delta_i(T+1)$

Page 17: Natural Language Processing

Parameter Estimation

Page 18: Natural Language Processing

Parameter Estimation

Probability of traversing the arc from state i to state j at time t, given the observation sequence O:

$p_t(i, j) = P(X_t = i,\, X_{t+1} = j \mid O, \mu) = \frac{P(X_t = i,\, X_{t+1} = j,\, O \mid \mu)}{P(O \mid \mu)} = \frac{\alpha_i(t)\, a_{ij}\, b_{ij o_t}\, \beta_j(t+1)}{\sum_{m=1}^{N} \alpha_m(t)\, \beta_m(t)}$

Summing over j gives $\gamma_i(t) = \sum_{j=1}^{N} p_t(i, j)$, and:

$\sum_{t=1}^{T} \gamma_i(t)$ = expected number of transitions from state i in O

$\sum_{t=1}^{T} p_t(i, j)$ = expected number of transitions from state i to state j in O
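A sketch of these expectation quantities, built on the hypothetical forward()/backward() sketches above; p[t, i, j] plays the role of p_t(i, j) and gamma sums it over j.

```python
# Arc posteriors p_t(i, j) and state posteriors gamma_i(t) for one observation sequence.
import numpy as np

def arc_posteriors(pi, A, B, obs):
    alpha, p_obs = forward(pi, A, B, obs)
    beta, _ = backward(pi, A, B, obs)
    T, N = len(obs), len(pi)
    p = np.zeros((T, N, N))                          # p[t, i, j] = p_t(i, j)
    for t in range(T):
        p[t] = alpha[t][:, None] * A * B[:, :, obs[t]] * beta[t + 1][None, :]
    p /= p_obs                                       # divide by P(O | mu)
    gamma = p.sum(axis=2)                            # gamma[t, i] = sum_j p_t(i, j)
    return p, gamma
```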

Page 19: Natural Language Processing

Parameter Estimation

$\hat{a}_{ij} = \frac{\text{expected number of transitions from state } i \text{ to } j}{\text{expected number of transitions from state } i} = \frac{\sum_{t=1}^{T} p_t(i, j)}{\sum_{t=1}^{T} \gamma_i(t)}$

$\hat{b}_{ijk} = \frac{\text{expected number of transitions from } i \text{ to } j \text{ with } k \text{ observed}}{\text{expected number of transitions from } i \text{ to } j} = \frac{\sum_{\{t :\, o_t = k,\ 1 \le t \le T\}} p_t(i, j)}{\sum_{t=1}^{T} p_t(i, j)}$
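A sketch of the corresponding re-estimation step, consuming the arc_posteriors() sketch above; the initial-state update at the end is a common convention that is not shown on the slide.

```python
# One re-estimation (M) step of Baum-Welch from the expected counts.
import numpy as np

def reestimate(pi, A, B, obs):
    p, gamma = arc_posteriors(pi, A, B, obs)
    K = B.shape[2]                                              # number of output symbols
    A_new = p.sum(axis=0) / gamma.sum(axis=0)[:, None]          # a_hat_ij
    B_new = np.zeros_like(B)
    for k in range(K):
        mask = np.array(obs) == k                               # {t : o_t = k}
        B_new[:, :, k] = p[mask].sum(axis=0) / p.sum(axis=0)    # b_hat_ijk
    pi_new = gamma[0]                                           # common convention; not on the slide
    return pi_new, A_new, B_new

# One EM iteration on the toy model from the earlier sketches:
# pi, A, B = reestimate(pi, A, B, obs=[0, 2, 1])
```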