CHAPTER 15 SECTIONS 3–4: Hidden Markov Models



  • Slide 1
  • CHAPTER 15 SECTIONS 3–4: Hidden Markov Models
  • Slide 2
  • Terminology
  • Slide 3
  • It gets big!
  • Slide 4
  • Conditional independence
  • Slide 5
  • P(Toothache, Cavity, Catch): If I have a cavity, the probability that the probe catches in it doesn't depend on whether I have a toothache: P(+catch | +toothache, +cavity) = P(+catch | +cavity). The same independence holds if I don't have a cavity: P(+catch | +toothache, -cavity) = P(+catch | -cavity). Catch is conditionally independent of Toothache given Cavity: P(Catch | Toothache, Cavity) = P(Catch | Cavity). Equivalent statements: P(Toothache | Catch, Cavity) = P(Toothache | Cavity) and P(Toothache, Catch | Cavity) = P(Toothache | Cavity) P(Catch | Cavity). Any one of these can be derived from the others easily
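These independence claims can be checked numerically against a full joint distribution. A minimal Python sketch, assuming the joint table commonly quoted for this toothache/cavity/catch example (the specific numbers are an assumption, not taken from these slides):

```python
# Check conditional independence numerically from the full joint distribution.
# The joint table is the one commonly used for this example (assumed here).
joint = {
    # (toothache, cavity, catch): probability
    (True,  True,  True):  0.108, (True,  True,  False): 0.012,
    (False, True,  True):  0.072, (False, True,  False): 0.008,
    (True,  False, True):  0.016, (True,  False, False): 0.064,
    (False, False, True):  0.144, (False, False, False): 0.576,
}

def prob(pred):
    """P(event) = sum of joint entries whose outcome satisfies pred."""
    return sum(p for outcome, p in joint.items() if pred(*outcome))

def cond(event, given):
    """P(event | given) = P(event and given) / P(given)."""
    return prob(lambda *o: event(*o) and given(*o)) / prob(given)

# P(+catch | +toothache, +cavity) vs. P(+catch | +cavity): both come out to 0.9
print(cond(lambda t, cav, c: c, lambda t, cav, c: t and cav))
print(cond(lambda t, cav, c: c, lambda t, cav, c: cav))
```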
  • Slide 6
  • Probability Recap
  • Slide 7
  • Reasoning over Time or Space Often, we want to reason about a sequence of observations: speech recognition, robot localization, user attention, medical monitoring. We need to introduce time (or space) into our models
  • Slide 8
  • Markov Models Recap
  • Slide 9
  • Example: Markov Chain
  • Slide 10
  • Mini-Forward Algorithm
  • Slide 11
  • Example Run of Mini-Forward Algorithm From initial observations of sun: From initial observations of rain:
  • Slide 12
  • Example Run of Mini-Forward Algorithm From yet another initial distribution P(X_1):
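The mini-forward update simply pushes the current state distribution through the transition model: P(X_{t+1}) = Σ_x P(X_{t+1} | X_t = x) P(X_t = x). A minimal sketch of the example runs above; the sun/rain transition probabilities are illustrative assumptions, not necessarily the numbers on the slides:

```python
# Mini-forward algorithm: push a distribution over states through the Markov chain.
# P(X_{t+1}) = sum_x P(X_{t+1} | X_t = x) * P(X_t = x)
transition = {        # transition[prev][next] = P(next | prev); illustrative numbers
    "sun":  {"sun": 0.9, "rain": 0.1},
    "rain": {"sun": 0.3, "rain": 0.7},
}

def mini_forward(belief, steps):
    for _ in range(steps):
        belief = {
            nxt: sum(transition[prev][nxt] * belief[prev] for prev in belief)
            for nxt in belief
        }
    return belief

# Whatever the initial observation, the distribution converges to the chain's
# stationary distribution (here 0.75 sun / 0.25 rain), which is the point of
# the two example-run slides.
print(mini_forward({"sun": 1.0, "rain": 0.0}, 50))
print(mini_forward({"sun": 0.0, "rain": 1.0}, 50))
```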
  • Slide 13
  • Hidden Markov Models Markov chains are not so useful for most agents: eventually you don't know anything anymore, so you need observations to update your beliefs. Hidden Markov models (HMMs): an underlying Markov chain over states S; you observe outputs (effects) at each time step. As a Bayes net:
  • Slide 14
  • Example
  • Slide 15
  • Hidden Markov Models
  • Slide 16
  • HMM Computations Given parameters and evidence E_{1:n} = e_{1:n}, inference problems include: Filtering: find P(X_t | e_{1:t}) for all t. Smoothing: find P(X_t | e_{1:n}) for all t. Most probable explanation: find x*_{1:n} = argmax_{x_{1:n}} P(x_{1:n} | e_{1:n})
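Filtering is the recursive version of the mini-forward update with an extra evidence step: predict with the transition model, then reweight by the observation likelihood and renormalize. A minimal sketch under an assumed rain/umbrella model (all numbers are illustrative assumptions):

```python
# Filtering P(X_t | e_{1:t}) by alternating a predict step and an update step.
states = ["rain", "sun"]
T = {"rain": {"rain": 0.7, "sun": 0.3},               # P(X_{t+1} | X_t), illustrative
     "sun":  {"rain": 0.3, "sun": 0.7}}
E = {"rain": {"umbrella": 0.9, "no_umbrella": 0.1},   # P(e_t | X_t), illustrative
     "sun":  {"umbrella": 0.2, "no_umbrella": 0.8}}

def filter_beliefs(prior, evidence):
    belief = dict(prior)
    for e in evidence:
        # Predict: P(X_{t+1} | e_{1:t}) = sum_x P(X_{t+1} | x) P(x | e_{1:t})
        predicted = {s2: sum(T[s1][s2] * belief[s1] for s1 in states) for s2 in states}
        # Update: multiply by P(e_{t+1} | X_{t+1}) and renormalize
        unnorm = {s: E[s][e] * predicted[s] for s in states}
        z = sum(unnorm.values())
        belief = {s: p / z for s, p in unnorm.items()}
        yield belief

for b in filter_beliefs({"rain": 0.5, "sun": 0.5}, ["umbrella", "umbrella", "no_umbrella"]):
    print(b)
```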
  • Slide 17
  • Real HMM Examples Speech recognition HMMs: Observations are acoustic signals (continuous valued) States are specific positions in specific words (so, tens of thousands)
  • Slide 18
  • Real HMM Examples Machine translation HMMs: Observations are words (tens of thousands) States are translation options
  • Slide 19
  • Real HMM Examples Robot tracking: Observations are range readings (continuous) States are positions on a map (continuous)
  • Slide 20
  • Conditional Independence HMMs have two important independence properties: Markov hidden process, future depends on past via the present
  • Slide 21
  • Conditional Independence HMMs have two important independence properties: Markov hidden process, future depends on past via the present Current observation independent of all else given current state
  • Slide 22
  • Conditional Independence HMMs have two important independence properties: Markov hidden process, future depends on past via the present Current observation independent of all else given current state Quiz: does this mean that observations are independent given no evidence?
  • Slide 23
  • HMM Notation
  • Slide 24
  • HMM Problem 1 Evaluation: Consider the problem where we have a number of HMMs (that is, a set of (π, A, B) triples) describing different systems, and a sequence of observations. We may want to know which HMM most probably generated the given sequence. Solution: Forward Algorithm
  • Slide 25
  • HMM Problem 2 Decoding: Finding the most probable sequence of hidden states given some observations, i.e. find the hidden states that generated the observed output. In many cases we are interested in the hidden states of the model since they represent something of value that is not directly observable. Solution: Viterbi Algorithm
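A minimal sketch of Viterbi decoding: keep, for each state, the probability of the best path ending there, plus a backpointer so the argmax sequence can be recovered. The model parameters used in the usage line are illustrative assumptions, not the slides' example:

```python
def viterbi(states, start_p, trans_p, emit_p, observations):
    """Most probable hidden state sequence given the observations (max-product DP)."""
    # delta_t(j): probability of the best path that ends in state j at time t
    delta = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    back = []                                  # backpointers for path recovery
    for obs in observations[1:]:
        row, ptr = {}, {}
        for j in states:
            best_prev = max(states, key=lambda i: delta[-1][i] * trans_p[i][j])
            row[j] = delta[-1][best_prev] * trans_p[best_prev][j] * emit_p[j][obs]
            ptr[j] = best_prev
        delta.append(row)
        back.append(ptr)
    # Trace back from the best final state
    path = [max(states, key=lambda s: delta[-1][s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))

# Tiny illustrative rain/sun/umbrella model (assumed numbers)
states = ["rain", "sun"]
start = {"rain": 0.5, "sun": 0.5}
trans = {"rain": {"rain": 0.7, "sun": 0.3}, "sun": {"rain": 0.3, "sun": 0.7}}
emit  = {"rain": {"umbrella": 0.9, "no_umbrella": 0.1},
         "sun":  {"umbrella": 0.2, "no_umbrella": 0.8}}
print(viterbi(states, start, trans, emit, ["umbrella", "umbrella", "no_umbrella"]))
```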
  • Slide 26
  • HMM Problem 3 Learning: Generating an HMM from a sequence of observations. Solution: Forward-Backward (Baum-Welch) Algorithm
  • Slide 27
  • Exhaustive Search Solution Sequence of seaweed observations: dry, damp, soggy (the hidden states are the weather)
  • Slide 28
  • Exhaustive Search Solution Pr(dry,damp,soggy | HMM) = Pr(dry,damp,soggy, sunny,sunny,sunny) + Pr(dry,damp,soggy, sunny,sunny,cloudy) + Pr(dry,damp,soggy, sunny,sunny,rainy) + ... + Pr(dry,damp,soggy, rainy,rainy,rainy): the joint probability of the observations and each possible hidden weather sequence, summed over all such sequences
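A brute-force sketch of this evaluation for the seaweed/weather example. The initial, transition, and emission probabilities below are illustrative placeholders, not the tutorial's actual parameters:

```python
# Brute-force evaluation: Pr(observations | HMM) = sum over every hidden weather
# sequence of Pr(sequence) * Pr(observations | sequence).
from itertools import product

weather = ["sunny", "cloudy", "rainy"]
pi = {"sunny": 0.6, "cloudy": 0.2, "rainy": 0.2}                 # initial (illustrative)
A  = {"sunny":  {"sunny": 0.5, "cloudy": 0.3, "rainy": 0.2},     # transitions
      "cloudy": {"sunny": 0.3, "cloudy": 0.4, "rainy": 0.3},
      "rainy":  {"sunny": 0.2, "cloudy": 0.3, "rainy": 0.5}}
B  = {"sunny":  {"dry": 0.6,  "damp": 0.3,  "soggy": 0.1},       # emissions
      "cloudy": {"dry": 0.25, "damp": 0.5,  "soggy": 0.25},
      "rainy":  {"dry": 0.05, "damp": 0.35, "soggy": 0.6}}

obs = ["dry", "damp", "soggy"]
total = 0.0
for seq in product(weather, repeat=len(obs)):          # 3^3 = 27 hidden sequences
    p = pi[seq[0]] * B[seq[0]][obs[0]]
    for t in range(1, len(obs)):
        p *= A[seq[t - 1]][seq[t]] * B[seq[t]][obs[t]]
    total += p
print(total)   # Pr(dry, damp, soggy | HMM), at a cost exponential in sequence length
```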
  • Slide 29
  • A better solution: dynamic programming We can calculate the probability of reaching an intermediate state in the trellis as the sum of the probabilities of all possible paths to that state.
  • Slide 30
  • A better solution: dynamic programming α_t(j) = Pr(observation at time t | hidden state is j) × Pr(all paths to state j at time t)
  • Slide 31
  • A better solution: dynamic programming The sum of these final partial probabilities is the probability of the observation sequence, summed over all possible paths through the trellis
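The same quantity computed with the forward recursion, reusing the illustrative pi, A, B, and obs from the brute-force sketch above; each step folds all paths into one α table instead of enumerating them:

```python
def forward_likelihood(states, pi, A, B, obs):
    # alpha_1(j) = pi(j) * B[j][first observation]
    alpha = {s: pi[s] * B[s][obs[0]] for s in states}
    for e in obs[1:]:
        # alpha_{t+1}(j) = B[j][e] * sum_i alpha_t(i) * A[i][j]
        alpha = {j: B[j][e] * sum(alpha[i] * A[i][j] for i in states) for j in states}
    return sum(alpha.values())   # Pr(observations | HMM)

print(forward_likelihood(weather, pi, A, B, obs))   # matches the brute-force total
```

The work per time step is a fixed sweep over the states, which is where the exponential-versus-linear contrast on the following slides comes from.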
  • Slide 32
  • A better solution: dynamic programming
  • Slide 33
  • Slide 34
  • Exhaustive search: O(m^T) for m hidden states and T observations (exponential in the sequence length). Dynamic programming: O(T), linear in the sequence length (O(m^2 T) counting the work over the m states at each step)
  • Slide 35
  • References CSE473: Introduction to Artificial Intelligence, http://courses.cs.washington.edu/courses/cse473/ Hidden Markov Models Tutorial, http://www.comp.leeds.ac.uk/roger/HiddenMarkovModels/html_dev/main.html