
Part 1: Markov Models for Pattern Recognition – Introduction

CSE717, SPRING 2008

CUBS, Univ at Buffalo

Textbook

Markov Models for Pattern Recognition: From Theory to Applications, by Gernot A. Fink, 1st Edition, Springer, Nov 2007

Textbook contents:

- Foundation of math statistics
- Vector quantization and mixture density models
- Markov models
  - Hidden Markov Model (HMM): model formulation, classic algorithms in the HMM, application domain of the HMM
  - n-Gram
- Systems
  - Character and handwriting recognition
  - Speech recognition
  - Analysis of biological sequences

Preliminary Requirements

Familiarity with probability theory and statistics

Basic concepts in stochastic processes

Part 2: Foundation of Probability Theory, Statistics & Stochastic Processes


Coin Toss Problem

Coin toss result $X$: random variable; head, tail: states; $S_X$: set of states

$S_X = \{\text{head}, \text{tail}\}$

Probabilities: $\Pr_X(\text{head}) = \Pr_X(\text{tail}) = 0.5$

Discrete Random Variable

A discrete random variable's states are discrete: natural numbers, integers, etc.

Described by the probabilities of its states: $\Pr_X(x = s_1), \Pr_X(x = s_2), \ldots$

$s_1, s_2, \ldots$: discrete states (possible values of $x$)

Probabilities over all the states add up to 1:

$\sum_i \Pr_X(s_i) = 1$
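As a quick sketch, a discrete random variable can be modeled as a mapping from states to probabilities; the fair six-sided die below is an assumed example, not one from the slides.

```python
# A discrete random variable as a state -> probability mapping.
# Assumed example for illustration: a fair six-sided die.
die = {s: 1 / 6 for s in range(1, 7)}

# The probabilities over all states must add up to 1.
total = sum(die.values())
print(abs(total - 1.0) < 1e-9)  # True
```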

Continuous Random Variable

A continuous random variable’s states are continuous: real numbers, etc

Described by its probability density function (p.d.f.): pX(s)

The probability of $a < X < b$ is obtained by integrating the p.d.f.:

$\Pr(a < X < b) = \int_a^b p_X(s)\,ds$

$\int_{-\infty}^{+\infty} p_X(s)\,ds = 1$
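The two integral statements can be checked numerically. This sketch assumes a standard normal p.d.f. and a simple trapezoidal rule (both choices are mine, not the slides').

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    # p_X(x) for a normal random variable
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def integrate(f, a, b, n=10_000):
    # trapezoidal approximation of the integral of f over [a, b]
    h = (b - a) / n
    s = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n))
    return s * h

# Pr(a < X < b) = integral of the p.d.f. from a to b
p = integrate(normal_pdf, -1.0, 1.0)
print(round(p, 4))  # ≈ 0.6827

# The p.d.f. integrates to 1 over (effectively) the whole real line
total = integrate(normal_pdf, -10.0, 10.0)
print(round(total, 6))  # ≈ 1.0
```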

Joint Probability and Joint p.d.f.

Joint probability of discrete random variables:

$\Pr_{X_1, X_2, \ldots, X_n}(x_1, x_2, \ldots, x_n)$, where $x_i$ is any possible state of $X_i$

Joint p.d.f. of continuous random variables:

$p_{X_1, X_2, \ldots, X_n}(x_1, x_2, \ldots, x_n)$, where $x_i$ is any possible state of $X_i$

Independence Condition

$\Pr_{X_1, \ldots, X_n}(x_1, x_2, \ldots, x_n) = \Pr_{X_1}(x_1)\,\Pr_{X_2}(x_2)\cdots\Pr_{X_n}(x_n)$

$p_{X_1, \ldots, X_n}(x_1, x_2, \ldots, x_n) = p_{X_1}(x_1)\,p_{X_2}(x_2)\cdots p_{X_n}(x_n)$
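A small sketch of the independence condition, assuming two independent fair coin tosses: the joint probability of every pair of states is the product of the marginals.

```python
from itertools import product

# Two independent coin tosses (assumed fair, for illustration).
pr_x = {"head": 0.5, "tail": 0.5}
pr_y = {"head": 0.5, "tail": 0.5}

# Under independence the joint probability factorizes:
# Pr_{X,Y}(x, y) = Pr_X(x) * Pr_Y(y)
joint = {(x, y): pr_x[x] * pr_y[y] for x, y in product(pr_x, pr_y)}

print(joint[("head", "tail")])  # 0.25
print(abs(sum(joint.values()) - 1.0) < 1e-12)  # True
```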

Conditional Probability and p.d.f.

Conditional probability of discrete random variables:

$\Pr_{X_2|X_1}(x_2 \mid x_1) = \Pr_{X_1,X_2}(x_1, x_2) \,/\, \Pr_{X_1}(x_1)$

Conditional p.d.f. of continuous random variables:

$p_{X_2|X_1}(x_2 \mid x_1) = p_{X_1,X_2}(x_1, x_2) \,/\, p_{X_1}(x_1)$
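The discrete version can be computed directly from a joint table. The table below is a hypothetical joint distribution invented for illustration.

```python
# Hypothetical joint distribution of two random variables X1, X2
# over states {"a", "b"} (numbers chosen for illustration only).
joint = {
    ("a", "a"): 0.10, ("a", "b"): 0.30,
    ("b", "a"): 0.20, ("b", "b"): 0.40,
}

def marginal_x1(x1):
    # Pr_X1(x1) = sum over x2 of the joint probability
    return sum(p for (a, _), p in joint.items() if a == x1)

def conditional(x2, x1):
    # Pr_{X2|X1}(x2 | x1) = Pr_{X1,X2}(x1, x2) / Pr_X1(x1)
    return joint[(x1, x2)] / marginal_x1(x1)

print(round(conditional("b", "a"), 2))  # 0.3 / 0.4 = 0.75
```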

Statistics: Expected Value and Variance

For a discrete random variable:

$E\{X\} = \sum_i s_i \Pr_X(s_i)$

$\mathrm{Var}\{X\} = \sum_i (s_i - E\{X\})^2 \Pr_X(s_i)$

For a continuous random variable:

$E\{X\} = \int x\,p_X(x)\,dx$

$\mathrm{Var}\{X\} = \int (x - E\{X\})^2\,p_X(x)\,dx$
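The discrete formulas translate line for line into code; the fair six-sided die is again an assumed example.

```python
# Expected value and variance of a discrete random variable,
# illustrated with an assumed fair six-sided die.
die = {s: 1 / 6 for s in range(1, 7)}

# E{X} = sum_i s_i * Pr_X(s_i)
mean = sum(s * p for s, p in die.items())

# Var{X} = sum_i (s_i - E{X})^2 * Pr_X(s_i)
var = sum((s - mean) ** 2 * p for s, p in die.items())

print(round(mean, 4))  # 3.5
print(round(var, 4))   # 2.9167  (= 35/12)
```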

Normal Distribution of Single Random Variable

Notation: $X \sim N(\mu, \sigma^2)$

p.d.f.: $p_X(x) = \dfrac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left(-\dfrac{(x-\mu)^2}{2\sigma^2}\right)$

Expected value: $E\{X\} = \mu$

Variance: $\mathrm{Var}\{X\} = \sigma^2$
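A sampling sketch of these facts: drawing many samples from $N(\mu, \sigma^2)$ and checking that the sample mean and variance approach $\mu$ and $\sigma^2$. The parameter values and sample size are arbitrary choices.

```python
import random

# X ~ N(mu, sigma^2): the sample mean and sample variance should
# approach E{X} = mu and Var{X} = sigma^2 as the sample grows.
random.seed(0)
mu, sigma = 2.0, 3.0
xs = [random.gauss(mu, sigma) for _ in range(100_000)]

mean = sum(xs) / len(xs)
var = sum((x - mean) ** 2 for x in xs) / len(xs)

print(round(mean, 1))  # close to mu = 2.0
print(round(var, 1))   # close to sigma^2 = 9.0
```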

Stochastic Process

A stochastic process is a time series of random variables $\{X_t\}$: $X_t$ is a random variable, $t$ is a time stamp

$\{X_t\} = \{\ldots, X_{t-1}, X_t, X_{t+1}, \ldots\}$

Examples: audio signals, stock market prices

Causal Process

A stochastic process is causal if it has a finite history

A causal process can be represented by $X_1, X_2, \ldots, X_t, \ldots$

Stationary Process

A stochastic process $\{X_t\}$ is stationary if its joint distribution is invariant under time shifts, i.e., for any $n$, any $X_{t_1}, X_{t_2}, \ldots, X_{t_n} \in \{X_t\}$, and any $\tau$:

$\Pr_{X_{t_1}, X_{t_2}, \ldots, X_{t_n}}(x_1, x_2, \ldots, x_n) = \Pr_{X_{t_1+\tau}, X_{t_2+\tau}, \ldots, X_{t_n+\tau}}(x_1, x_2, \ldots, x_n)$

A stationary process is sometimes referred to as strictly stationary, in contrast with weak or wide-sense stationarity

Gaussian White Noise

White noise $\{X_t\}$: the $X_t$ obey an independent identical distribution (i.i.d.)

Gaussian white noise: $X_t \sim N(\mu, \sigma^2)$

Gaussian White Noise is a Stationary Process

Proof: for any $n$, any $X_{t_1}, \ldots, X_{t_n} \in \{X_t\}$, and any $\tau$:

$p_{X_{t_1}, \ldots, X_{t_n}}(x_1, \ldots, x_n) = \prod_{i=1}^{n} p_{X_{t_i}}(x_i) = \dfrac{1}{(\sqrt{2\pi}\,\sigma)^n}\exp\!\left(-\sum_{i=1}^{n}\dfrac{(x_i - \mu)^2}{2\sigma^2}\right) = p_{X_{t_1+\tau}, \ldots, X_{t_n+\tau}}(x_1, \ldots, x_n)$

The first equality uses independence, the second the Gaussian p.d.f.; the result does not depend on the time stamps $t_1, \ldots, t_n$, so the process is stationary.
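The factorization step of the proof can be verified numerically: the product of individual Gaussian p.d.f.s equals the closed-form joint p.d.f. The sample points and parameters below are arbitrary.

```python
import math

def normal_pdf(x, mu, sigma):
    # p.d.f. of N(mu, sigma^2)
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def joint_pdf(xs, mu, sigma):
    # closed form from the proof:
    # (sqrt(2 pi) sigma)^(-n) * exp(-sum_i (x_i - mu)^2 / (2 sigma^2))
    expo = -sum((x - mu) ** 2 for x in xs) / (2 * sigma ** 2)
    return math.exp(expo) / (math.sqrt(2 * math.pi) * sigma) ** len(xs)

xs = [0.3, -1.2, 2.5]  # arbitrary observation values
prod = 1.0
for x in xs:
    prod *= normal_pdf(x, 0.0, 1.0)

# the product of the marginals matches the joint closed form
print(abs(joint_pdf(xs, 0.0, 1.0) - prod) < 1e-12)  # True
```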

Temperature

Q1: Is the temperature within a day stationary?

Markov Chains

A causal process $\{X_t\}$ is a Markov chain if, for any $x_1, \ldots, x_t$:

$\Pr_{X_t|X_1,\ldots,X_{t-1}}(x_t \mid x_1, \ldots, x_{t-1}) = \Pr_{X_t|X_{t-k},\ldots,X_{t-1}}(x_t \mid x_{t-k}, \ldots, x_{t-1})$

$k$ is the order of the Markov chain.

First-order Markov chain:

$\Pr_{X_t|X_1,\ldots,X_{t-1}}(x_t \mid x_1, \ldots, x_{t-1}) = \Pr_{X_t|X_{t-1}}(x_t \mid x_{t-1})$

Second-order Markov chain:

$\Pr_{X_t|X_1,\ldots,X_{t-1}}(x_t \mid x_1, \ldots, x_{t-1}) = \Pr_{X_t|X_{t-2},X_{t-1}}(x_t \mid x_{t-2}, x_{t-1})$
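A first-order chain is straightforward to simulate: the next state is drawn from a distribution that depends only on the current state. The two-state transition table below is an assumption for illustration.

```python
import random

# Sampling a first-order homogeneous Markov chain: the next state
# depends only on the current one. Transition probabilities are
# invented for this sketch.
transition = {
    "A": {"A": 0.9, "B": 0.1},
    "B": {"A": 0.5, "B": 0.5},
}

def step(state, rng):
    probs = transition[state]
    return rng.choices(list(probs), weights=list(probs.values()))[0]

rng = random.Random(0)
chain = ["A"]
for _ in range(10):
    chain.append(step(chain[-1], rng))

print(chain)  # a length-11 sequence over {"A", "B"}
```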

Homogeneous Markov Chains

A $k$-th order Markov chain $\{X_t\}$ is homogeneous if the state transition probability is the same over time, i.e., for any $t$, any $\tau \ge 0$, and any $x_0, x_1, \ldots, x_k$:

$\Pr_{X_t|X_{t-k},\ldots,X_{t-1}}(x_0 \mid x_1, \ldots, x_k) = \Pr_{X_{t+\tau}|X_{t+\tau-k},\ldots,X_{t+\tau-1}}(x_0 \mid x_1, \ldots, x_k)$

Q2: Does a homogeneous Markov chain imply a stationary process?

State Transition in Homogeneous Markov Chains

Suppose $\{X_t\}$ is a homogeneous $k$-th order Markov chain and $S$ is the set of all possible states (values) of $x_t$; then for any $k+1$ states $x_0, x_1, \ldots, x_k$, the state transition probability

$\Pr_{X_t|X_{t-k},\ldots,X_{t-1}}(x_0 \mid x_1, \ldots, x_k)$

can be abbreviated to $\Pr(x_0 \mid x_1, \ldots, x_k)$

Example of Markov Chain

[State transition diagram: two states, 'Rain' and 'Dry']

Two states: 'Rain' and 'Dry'. Transition probabilities:

Pr('Rain'|'Rain') = 0.4, Pr('Dry'|'Rain') = 0.6, Pr('Rain'|'Dry') = 0.2, Pr('Dry'|'Dry') = 0.8

Short Term Forecast

Initial (say, Wednesday) probabilities: Pr_Wed('Rain') = 0.3, Pr_Wed('Dry') = 0.7

What's the probability of rain on Thursday?

Pr_Thur('Rain') = Pr_Wed('Rain') × Pr('Rain'|'Rain') + Pr_Wed('Dry') × Pr('Rain'|'Dry') = 0.3 × 0.4 + 0.7 × 0.2 = 0.26
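The one-step forecast is just the state distribution propagated through the transition probabilities; the numbers below mirror the slides' weather example.

```python
# One-step forecast: propagate Wednesday's state distribution
# through the slide's transition probabilities.
pr_wed = {"Rain": 0.3, "Dry": 0.7}
trans = {"Rain": {"Rain": 0.4, "Dry": 0.6},
         "Dry":  {"Rain": 0.2, "Dry": 0.8}}

# Pr_Thur(s') = sum_s Pr_Wed(s) * Pr(s' | s)
pr_thur = {s2: sum(pr_wed[s1] * trans[s1][s2] for s1 in pr_wed)
           for s2 in ("Rain", "Dry")}

print(round(pr_thur["Rain"], 2))  # 0.26
```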

Condition of Stationary: Steady-State Distribution

Pr_t('Rain') = Pr_{t-1}('Rain') × Pr('Rain'|'Rain') + Pr_{t-1}('Dry') × Pr('Rain'|'Dry') = Pr_{t-1}('Rain') × 0.4 + (1 − Pr_{t-1}('Rain')) × 0.2 = 0.2 + 0.2 × Pr_{t-1}('Rain')

Pr_t('Rain') = Pr_{t-1}('Rain') ⇒ Pr_{t-1}('Rain') = 0.25, Pr_{t-1}('Dry') = 1 − 0.25 = 0.75

Steady-State Analysis

Pr_t('Rain') = 0.2 + 0.2 × Pr_{t-1}('Rain')

Pr_t('Rain') − 0.25 = 0.2 × (Pr_{t-1}('Rain') − 0.25)

Pr_t('Rain') = 0.2^{t-1} × (Pr_1('Rain') − 0.25) + 0.25

$\lim_{t \to \infty}$ Pr_t('Rain') = 0.25 (converges to the steady-state distribution)
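A short numerical check of the convergence claim: iterating the recurrence from an arbitrary starting probability drives Pr_t('Rain') to 0.25.

```python
# Iterate Pr_t('Rain') = 0.2 + 0.2 * Pr_{t-1}('Rain'); the 0.2
# contraction factor pulls any starting value to the fixed point 0.25.
p = 0.9  # arbitrary initial Pr_1('Rain')
for _ in range(50):
    p = 0.2 + 0.2 * p

print(round(p, 6))  # 0.25
```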

Periodic Markov Chain

[State transition diagram: 'Rain' and 'Dry' with Pr('Dry'|'Rain') = 1 and Pr('Rain'|'Dry') = 1]

A periodic Markov chain never converges to a steady state.
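A sketch of why this chain fails to converge: with Pr('Dry'|'Rain') = Pr('Rain'|'Dry') = 1, each step simply swaps the two state probabilities, so the distribution oscillates forever.

```python
# Deterministic period-2 chain: all probability mass moves to the
# other state at every step, so Pr_t('Rain') oscillates.
p_rain = 1.0  # start with Pr('Rain') = 1
history = []
for _ in range(6):
    history.append(p_rain)
    p_rain = 1.0 - p_rain  # mass swaps between 'Rain' and 'Dry'

print(history)  # [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
```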