CS6825: Recognition
8. Hidden Markov Models
Hidden Markov Model (HMM)
 HMMs allow you to estimate probabilities of unobserved events
 E.g., in speech recognition, the observed data is the acoustic signal and the words are the hidden parameters you are trying to figure out.
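As a concrete sketch, an HMM is fully specified by three probability tables: initial state probabilities, state transition probabilities, and emission probabilities. The states, symbols, and numbers below are invented for illustration, not taken from the slides:

```python
import numpy as np

# Toy HMM: two hidden states, three observable symbols (all values hypothetical).
states = ["word_A", "word_B"]          # hidden
symbols = ["low", "mid", "high"]       # observed (e.g., a discretized acoustic feature)

pi = np.array([0.6, 0.4])              # pi[i]   = P(first hidden state is i)
A = np.array([[0.7, 0.3],              # A[i, j] = P(next state is j | current state is i)
              [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1],         # B[i, k] = P(observe symbol k | state is i)
              [0.1, 0.3, 0.6]])

# Every row of each table is a probability distribution.
assert np.allclose(A.sum(axis=1), 1.0) and np.allclose(B.sum(axis=1), 1.0)
```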
HMMs and their Usage
 HMMs are very common in Computational Linguistics:
• Speech recognition (observed: acoustic signal, hidden: words)
• Handwriting recognition (observed: image, hidden: words)
• Part-of-speech tagging (observed: words, hidden: part-of-speech tags)
• Machine translation (observed: foreign words, hidden: words in target language)
Hidden Markov Models
 Now in each state we could emit a measurement, with probability depending on the state and the measurement
 We observe these measurements
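Because only the measurements are observed, the probability of a measurement sequence is obtained by summing over all hidden state paths; the forward algorithm does this efficiently. A minimal sketch with hypothetical numbers:

```python
import numpy as np

def forward(pi, A, B, obs):
    """P(observation sequence), summing over all hidden state paths."""
    alpha = pi * B[:, obs[0]]          # alpha[i] = P(first obs, state i)
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]  # propagate one time step, then emit o
    return alpha.sum()

# Hypothetical two-state model over three observation symbols.
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])

p = forward(pi, A, B, [0, 2, 1])       # likelihood of observing symbols 0, 2, 1
```

Naively summing over all S^T paths would be exponential; the recursion above is O(T·S²).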
Hidden Markov Models….example
 Elements of sign language understanding:
• the speaker makes a sequence of signs
• some signs are more common than others
• the next sign depends (roughly, and probabilistically) only on the current sign
• there are measurements, which may be inaccurate; different signs tend to generate different probability densities on measurement values
 Many problems share these properties:
• tracking is like this, for example
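These properties are exactly the HMM generative story, and sampling from a toy model makes them concrete (two hypothetical signs, made-up probabilities):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two hypothetical signs; the next sign depends only on the current one.
A = np.array([[0.8, 0.2],              # A[i, j] = P(next sign j | current sign i)
              [0.3, 0.7]])
# Inaccurate measurements: each sign induces its own density over 3 values.
B = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.2, 0.7]])

sign, signs, measurements = 0, [], []
for _ in range(10):
    signs.append(sign)
    measurements.append(int(rng.choice(3, p=B[sign])))  # emit a noisy measurement
    sign = int(rng.choice(2, p=A[sign]))                # move to the next sign
```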
HMM’s - dynamics
HMM’s - the Joint and Inference
Trellises
 Each column corresponds to a measurement in the sequence
 Trellis makes the collection of legal paths obvious
 Now we would like to get the path with the smallest negative log-posterior (i.e., the most probable path)
 Trellis makes this easy, as follows.
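The DP over the trellis is the Viterbi algorithm: each column stores, for every state, the cost (negative log-posterior, up to a constant) of the best path ending in that state, plus a backpointer. A sketch with hypothetical parameters:

```python
import numpy as np

def viterbi(pi, A, B, obs):
    """Most probable hidden path: DP over the trellis in -log space."""
    cost = -np.log(pi) - np.log(B[:, obs[0]])   # first trellis column
    back = []                                   # best predecessor per column
    for o in obs[1:]:
        # step[i, j]: best cost of being in state i, then moving to j and emitting o
        step = cost[:, None] - np.log(A) - np.log(B[:, o])[None, :]
        back.append(step.argmin(axis=0))
        cost = step.min(axis=0)
    path = [int(cost.argmin())]                 # cheapest final state...
    for ptrs in reversed(back):                 # ...then follow the backpointers
        path.append(int(ptrs[path[-1]]))
    return path[::-1], float(cost.min())

# Hypothetical two-state model over three observation symbols.
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])

path, neg_log_post = viterbi(pi, A, B, [0, 2, 2])
```

Because each trellis column only depends on the previous one, the cost is O(T·S²) rather than exponential in the sequence length.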
Fitting an HMM
 I have:
• sequence of measurements
• collection of states
• topology
 I want:
• state transition probabilities
• measurement emission probabilities
 Straightforward application of EM:
• discrete vars give state for each measurement
• M step is just averaging, etc.
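A sketch of why "the M step is just averaging": under hard (Viterbi-style) assignments, given a guessed state for every measurement, re-estimating the transition and emission tables is just counting and row-normalizing. The data below is hypothetical:

```python
import numpy as np

n_states, n_symbols = 2, 3
# Hypothetical E-step output: a guessed hidden state for each measurement.
states = [0, 0, 1, 1, 0, 1]
obs    = [0, 1, 2, 2, 0, 1]

A_counts = np.zeros((n_states, n_states))
B_counts = np.zeros((n_states, n_symbols))
for t in range(len(obs)):
    B_counts[states[t], obs[t]] += 1             # state -> emitted symbol
    if t + 1 < len(obs):
        A_counts[states[t], states[t + 1]] += 1  # state -> next state

# "Just averaging": normalize each row of counts into a distribution.
A = A_counts / A_counts.sum(axis=1, keepdims=True)
B = B_counts / B_counts.sum(axis=1, keepdims=True)
```

Full EM (Baum–Welch) replaces the hard counts with expected counts from the forward-backward algorithm, but the M step stays the same normalization.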
HMM’s for sign language understanding - 1
 Build an HMM for each word
HMM’s for sign language understanding - 2
 Build an HMM for each word
 Then build a language model
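A minimal sketch of how the pieces combine for isolated-word recognition (all word models, words, and numbers here are invented): score the measurement sequence under each word's HMM with the forward algorithm, weight by the language-model prior, and report the best-scoring word:

```python
import numpy as np

def forward(pi, A, B, obs):
    """P(observation sequence | word model)."""
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
    return alpha.sum()

# One tiny left-to-right HMM per word; all parameters are hypothetical.
word_models = {
    "hello": (np.array([1.0, 0.0]),
              np.array([[0.5, 0.5], [0.0, 1.0]]),
              np.array([[0.8, 0.2], [0.2, 0.8]])),
    "world": (np.array([1.0, 0.0]),
              np.array([[0.5, 0.5], [0.0, 1.0]]),
              np.array([[0.2, 0.8], [0.8, 0.2]])),
}
language_model = {"hello": 0.7, "world": 0.3}   # hypothetical word priors

obs = [0, 0, 1]
best = max(word_models,
           key=lambda w: language_model[w] * forward(*word_models[w], obs))
```

A sentence-level language model would instead link the word HMMs into one larger HMM and decode whole sentences with Viterbi.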
Figure from “Real time American sign language recognition using desk and wearable computer based video,” T. Starner, et al. Proc. Int. Symp. on Computer Vision, 1995, copyright 1995, IEEE
User gesturing
For both isolated word recognition tasks and for recognition using a language model that has five-word sentences (words always appearing in the order pronoun verb noun adjective pronoun), Starner and Pentland’s system displays a word accuracy on the order of 90%. Values are slightly larger or smaller, depending on the features, the task, etc.
Example – American Sign Language Detection
gri.gallaudet.edu/~cvogler/research/data/cvdm-iccv98.pdf
HMM’s can be spatial rather than temporal; for example, we have a simple model where the position of the arm depends on the position of the torso, and the position of the leg depends on the position of the torso. We can build a trellis, where each node represents a correspondence between an image token and a body part, and do DP on this trellis.
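A sketch of that spatial DP for a star-shaped model (arm and leg each depend only on the torso; the candidate positions and costs below are invented): for each candidate torso location, add the best arm and best leg matches, then keep the cheapest total:

```python
# Candidate image positions and appearance costs for each part (all hypothetical).
torso_cost = {(5, 5): 1.0, (6, 5): 0.5}
arm_cost   = {(3, 5): 0.2, (8, 5): 0.9}
leg_cost   = {(5, 9): 0.4, (6, 9): 0.3}

def pair_cost(a, b):
    # Deformation cost between two parts; here simply their L1 distance.
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

# Given the torso, the arm and leg can be optimized independently.
best = min(
    torso_cost[t]
    + min(arm_cost[a] + pair_cost(t, a) for a in arm_cost)
    + min(leg_cost[l] + pair_cost(t, l) for l in leg_cost)
    for t in torso_cost
)
```

The conditional independence of arm and leg given the torso makes the search O(T·(A+L)) rather than O(T·A·L) over candidate positions, which is the point of the trellis/DP formulation.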
Figure from “Efficient Matching of Pictorial Structures,” P. Felzenszwalb and D.P. Huttenlocher, Proc. Computer Vision and Pattern Recognition 2000, copyright 2000, IEEE
Another Example – Emotion Detection
http://www.ifp.uiuc.edu/~iracohen/publications/mlhmmemotions.pdf
Advantage of HMM
 Does not just use the current state to do recognition…it also looks at previous state(s) to understand what is going on.
 This is a powerful idea when such temporal dependencies exist.