CHAPTER 15 SECTIONS 3–4: Hidden Markov Models



  • Slide 1
  • CHAPTER 15 SECTIONS 3–4: Hidden Markov Models
  • Slide 2
  • Terminology
  • Slide 3
  • It gets big!
  • Slide 4
  • Conditional independence
  • Slide 5
  • P(Toothache, Cavity, Catch): If I have a cavity, the probability that the probe catches in it doesn't depend on whether I have a toothache: P(+catch | +toothache, +cavity) = P(+catch | +cavity). The same independence holds if I don't have a cavity: P(+catch | +toothache, -cavity) = P(+catch | -cavity). Catch is conditionally independent of Toothache given Cavity: P(Catch | Toothache, Cavity) = P(Catch | Cavity). Equivalent statements: P(Toothache | Catch, Cavity) = P(Toothache | Cavity) and P(Toothache, Catch | Cavity) = P(Toothache | Cavity) P(Catch | Cavity). Any one of these can be derived from the others easily
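These independence claims can be checked numerically against a full joint distribution. A minimal Python sketch, assuming the joint table commonly quoted for this toothache/cavity/catch example (the specific numbers are an assumption, not taken from these slides):

```python
# Check conditional independence numerically from the full joint distribution.
# The joint table is the one commonly used for this example (assumed here).
joint = {
    # (toothache, cavity, catch): probability
    (True,  True,  True):  0.108, (True,  True,  False): 0.012,
    (False, True,  True):  0.072, (False, True,  False): 0.008,
    (True,  False, True):  0.016, (True,  False, False): 0.064,
    (False, False, True):  0.144, (False, False, False): 0.576,
}

def prob(pred):
    """P(event) = sum of joint entries whose outcome satisfies pred."""
    return sum(p for outcome, p in joint.items() if pred(*outcome))

def cond(event, given):
    """P(event | given) = P(event and given) / P(given)."""
    return prob(lambda *o: event(*o) and given(*o)) / prob(given)

# P(+catch | +toothache, +cavity) vs. P(+catch | +cavity): both come out to 0.9
print(cond(lambda t, cav, c: c, lambda t, cav, c: t and cav))
print(cond(lambda t, cav, c: c, lambda t, cav, c: cav))
```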
  • Slide 6
  • Probability Recap
  • Slide 7
  • Reasoning over Time or Space Often, we want to reason about a sequence of observations: speech recognition, robot localization, user attention, medical monitoring. We need to introduce time (or space) into our models
  • Slide 8
  • Markov Models Recap
  • Slide 9
  • Example: Markov Chain
  • Slide 10
  • Mini-Forward Algorithm
  • Slide 11
  • Example Run of Mini-Forward Algorithm From initial observations of sun: From initial observations of rain:
  • Slide 12
  • Example Run of Mini-Forward Algorithm From yet another initial distribution P(X_1):
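The mini-forward update simply pushes the current state distribution through the transition model: P(X_{t+1}) = Σ_x P(X_{t+1} | X_t = x) P(X_t = x). A minimal sketch of the example runs above; the sun/rain transition probabilities are illustrative assumptions, not necessarily the numbers on the slides:

```python
# Mini-forward algorithm: push a distribution over states through the Markov chain.
# P(X_{t+1}) = sum_x P(X_{t+1} | X_t = x) * P(X_t = x)
transition = {        # transition[prev][next] = P(next | prev); illustrative numbers
    "sun":  {"sun": 0.9, "rain": 0.1},
    "rain": {"sun": 0.3, "rain": 0.7},
}

def mini_forward(belief, steps):
    for _ in range(steps):
        belief = {
            nxt: sum(transition[prev][nxt] * belief[prev] for prev in belief)
            for nxt in belief
        }
    return belief

# Whatever the initial observation, the distribution converges to the chain's
# stationary distribution (here 0.75 sun / 0.25 rain), which is the point of
# the two example-run slides.
print(mini_forward({"sun": 1.0, "rain": 0.0}, 50))
print(mini_forward({"sun": 0.0, "rain": 1.0}, 50))
```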
  • Slide 13
  • Hidden Markov Models Markov chains are not so useful for most agents: eventually you don't know anything anymore, so you need observations to update your beliefs. Hidden Markov models (HMMs): an underlying Markov chain over states S; you observe outputs (effects) at each time step. As a Bayes net:
  • Slide 14
  • Example
  • Slide 15
  • Hidden Markov Models
  • Slide 16
  • HMM Computations Given parameters and evidence E_{1:n} = e_{1:n}, inference problems include: Filtering: find P(X_t | e_{1:t}) for all t. Smoothing: find P(X_t | e_{1:n}) for all t. Most probable explanation: find x*_{1:n} = argmax_{x_{1:n}} P(x_{1:n} | e_{1:n})
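Filtering is the recursive version of the mini-forward update with an extra evidence step: predict with the transition model, then reweight by the observation likelihood and renormalize. A minimal sketch under an assumed rain/umbrella model (all numbers are illustrative assumptions):

```python
# Filtering P(X_t | e_{1:t}) by alternating a predict step and an update step.
states = ["rain", "sun"]
T = {"rain": {"rain": 0.7, "sun": 0.3},               # P(X_{t+1} | X_t), illustrative
     "sun":  {"rain": 0.3, "sun": 0.7}}
E = {"rain": {"umbrella": 0.9, "no_umbrella": 0.1},   # P(e_t | X_t), illustrative
     "sun":  {"umbrella": 0.2, "no_umbrella": 0.8}}

def filter_beliefs(prior, evidence):
    belief = dict(prior)
    for e in evidence:
        # Predict: P(X_{t+1} | e_{1:t}) = sum_x P(X_{t+1} | x) P(x | e_{1:t})
        predicted = {s2: sum(T[s1][s2] * belief[s1] for s1 in states) for s2 in states}
        # Update: multiply by P(e_{t+1} | X_{t+1}) and renormalize
        unnorm = {s: E[s][e] * predicted[s] for s in states}
        z = sum(unnorm.values())
        belief = {s: p / z for s, p in unnorm.items()}
        yield belief

for b in filter_beliefs({"rain": 0.5, "sun": 0.5}, ["umbrella", "umbrella", "no_umbrella"]):
    print(b)
```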
  • Slide 17
  • Real HMM Examples Speech recognition HMMs: Observations are acoustic signals (continuous valued) States are specific positions in specific words (so, tens of thousands)
  • Slide 18
  • Real HMM Examples Machine translation HMMs: Observations are words (tens of thousands) States are translation options
  • Slide 19
  • Real HMM Examples Robot tracking: Observations are range readings (continuous) States are positions on a map (continuous)
  • Slide 20
  • Conditional Independence HMMs have two important independence properties: Markov hidden process, future depends on past via the present
  • Slide 21
  • Conditional Independence HMMs have two important independence properties: Markov hidden process, future depends on past via the present Current observation independent of all else given current state
  • Slide 22
  • Conditional Independence HMMs have two important independence properties: Markov hidden process, future depends on past via the present Current observation independent of all else given current state Quiz: does this mean that observations are independent given no evidence?
  • Slide 23
  • HMM Notation
  • Slide 24
  • HMM Problem 1 Evaluation: Consider the problem where we have a number of HMMs (that is, a set of (π, A, B) triples) describing different systems, and a sequence of observations. We may want to know which HMM most probably generated the given sequence. Solution: Forward Algorithm
  • Slide 25
  • HMM Problem 2 Decoding: Finding the most probable sequence of hidden states given some observations, i.e. find the hidden states that generated the observed output. In many cases we are interested in the hidden states of the model since they represent something of value that is not directly observable. Solution: Viterbi Algorithm
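A minimal sketch of Viterbi decoding: keep, for each state, the probability of the best path ending there, plus a backpointer so the argmax sequence can be recovered. The model parameters used in the usage line are illustrative assumptions, not the slides' example:

```python
def viterbi(states, start_p, trans_p, emit_p, observations):
    """Most probable hidden state sequence given the observations (max-product DP)."""
    # delta_t(j): probability of the best path that ends in state j at time t
    delta = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    back = []                                  # backpointers for path recovery
    for obs in observations[1:]:
        row, ptr = {}, {}
        for j in states:
            best_prev = max(states, key=lambda i: delta[-1][i] * trans_p[i][j])
            row[j] = delta[-1][best_prev] * trans_p[best_prev][j] * emit_p[j][obs]
            ptr[j] = best_prev
        delta.append(row)
        back.append(ptr)
    # Trace back from the best final state
    path = [max(states, key=lambda s: delta[-1][s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))

# Tiny illustrative rain/sun/umbrella model (assumed numbers)
states = ["rain", "sun"]
start = {"rain": 0.5, "sun": 0.5}
trans = {"rain": {"rain": 0.7, "sun": 0.3}, "sun": {"rain": 0.3, "sun": 0.7}}
emit  = {"rain": {"umbrella": 0.9, "no_umbrella": 0.1},
         "sun":  {"umbrella": 0.2, "no_umbrella": 0.8}}
print(viterbi(states, start, trans, emit, ["umbrella", "umbrella", "no_umbrella"]))
```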
  • Slide 26
  • HMM Problem 3 Learning: Generating an HMM from a sequence of observations. Solution: Forward-Backward (Baum-Welch) Algorithm
  • Slide 27
  • Exhaustive Search Solution Sequence of seaweed observations: dry, damp, soggy (the hidden states are the weather)
  • Slide 28
  • Exhaustive Search Solution Pr(dry,damp,soggy | HMM) = Pr(dry,damp,soggy, sunny,sunny,sunny) + Pr(dry,damp,soggy, sunny,sunny,cloudy) + Pr(dry,damp,soggy, sunny,sunny,rainy) + ... + Pr(dry,damp,soggy, rainy,rainy,rainy): the joint probability of the observations and each possible hidden weather sequence, summed over all such sequences
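A brute-force sketch of this evaluation for the seaweed/weather example. The initial, transition, and emission probabilities below are illustrative placeholders, not the tutorial's actual parameters:

```python
# Brute-force evaluation: Pr(observations | HMM) = sum over every hidden weather
# sequence of Pr(sequence) * Pr(observations | sequence).
from itertools import product

weather = ["sunny", "cloudy", "rainy"]
pi = {"sunny": 0.6, "cloudy": 0.2, "rainy": 0.2}                 # initial (illustrative)
A  = {"sunny":  {"sunny": 0.5, "cloudy": 0.3, "rainy": 0.2},     # transitions
      "cloudy": {"sunny": 0.3, "cloudy": 0.4, "rainy": 0.3},
      "rainy":  {"sunny": 0.2, "cloudy": 0.3, "rainy": 0.5}}
B  = {"sunny":  {"dry": 0.6,  "damp": 0.3,  "soggy": 0.1},       # emissions
      "cloudy": {"dry": 0.25, "damp": 0.5,  "soggy": 0.25},
      "rainy":  {"dry": 0.05, "damp": 0.35, "soggy": 0.6}}

obs = ["dry", "damp", "soggy"]
total = 0.0
for seq in product(weather, repeat=len(obs)):          # 3^3 = 27 hidden sequences
    p = pi[seq[0]] * B[seq[0]][obs[0]]
    for t in range(1, len(obs)):
        p *= A[seq[t - 1]][seq[t]] * B[seq[t]][obs[t]]
    total += p
print(total)   # Pr(dry, damp, soggy | HMM), at a cost exponential in sequence length
```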
  • Slide 29
  • A better solution: dynamic programming We can calculate the probability of reaching an intermediate state in the trellis as the sum of the probabilities of all possible paths to that state.
  • Slide 30
  • A better solution: dynamic programming α_t(j) = Pr(observation at time t | hidden state is j) × Pr(all paths to state j at time t)
  • Slide 31
  • A better solution: dynamic programming The sum of these final partial probabilities is the probability of the observation sequence, summed over all possible paths through the trellis
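The same quantity computed with the forward recursion, reusing the illustrative pi, A, B, and obs from the brute-force sketch above; each step folds all paths into one α table instead of enumerating them:

```python
def forward_likelihood(states, pi, A, B, obs):
    # alpha_1(j) = pi(j) * B[j][first observation]
    alpha = {s: pi[s] * B[s][obs[0]] for s in states}
    for e in obs[1:]:
        # alpha_{t+1}(j) = B[j][e] * sum_i alpha_t(i) * A[i][j]
        alpha = {j: B[j][e] * sum(alpha[i] * A[i][j] for i in states) for j in states}
    return sum(alpha.values())   # Pr(observations | HMM)

print(forward_likelihood(weather, pi, A, B, obs))   # matches the brute-force total
```

The work per time step is a fixed sweep over the states, which is where the exponential-versus-linear contrast on the following slides comes from.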
  • Slide 32
  • A better solution: dynamic programming
  • Slide 33
  • Slide 34
  • Exhaustive search: O(m^T) for m hidden states and T observations (exponential in the sequence length). Dynamic programming: O(T), linear in the sequence length (O(m^2 T) counting the work over the m states at each step)
  • Slide 35
  • References CSE473: Introduction to Artificial Intelligence, http://courses.cs.washington.edu/courses/cse473/ Hidden Markov Models Tutorial, http://www.comp.leeds.ac.uk/roger/HiddenMarkovModels/html_dev/main.html