Upload
svein
View
35
Download
0
Embed Size (px)
DESCRIPTION
Class Hidden Markov Models. Markov Chains. First order Markov process set of states, S i transition probabilities, a ij the probability of each state is dependent only on the previous state trivial constraints (transitions must be true probabilities). a AC. A. C. a AA. a AT. a AG. G. - PowerPoint PPT Presentation
Citation preview
1
Class Hidden Markov Models
2
Markov ChainsFirst order Markov process
• set of states, Si
• transition probabilities, aij
○ the probability of each state is dependent only on the previous state
○ trivial constraints (transitions must be true probabilities)𝑎𝑖𝑗≥0∧∑
𝑖=1
𝑁
𝑎𝑖𝑗=1
G
A C
T
aAA
aAC
aAGaAT
A C G T
A 0.18 0.27 0.43 0.12
C 0.17 0.37 0.27 0.19
G 0.16 0.34 0.38 0.12
T 0.08 0.36 0.38 0.18
𝑎𝑖𝑗=𝑃 (𝑠𝑡=𝑆 𝑗∨𝑠𝑡− 1=𝑆𝑖 )
3
Markov ChainsProbabilities
• for a sequence of observed states O = ( s0, s1, …sT)in general
for the Markov chain
4
Markov ChainsWhat is it good for?
• Model comparison – for a given model (set of transition probabilities), what is the probability of seeing the observed data○ The best model (maximum likelihood model) can be found by simply
counting the observed transitions in a sufficiently large set of data• Generated random sequence according to specific background
models• Birth /death processes – probability of extinction• Branching processes (trees)• Markov chain Monte Carlo (MCMC)
5
Hidden Markov ModelMore complicated model
• let each state emit a character, ck, according to a set of emission probabilities, bik, where bik = P( ck|Si)
• For a set of observed characters O = (o0, …, oT) and states (so, ..., sT)𝑃 (𝑜𝑡 )=𝑏𝑠𝑡𝑜𝑡𝑃 (𝑠𝑡 )
6
Hidden Markov ModelTwo state model for sequences
• maybe S0 is exon and S1 is intron
• what if you have a set of observed characters, but you want to know what the state is (or the most likely state)
• the state information is the hidden part of the hidden Markov model
S0
P(A) = 0.148P(C) = 0.334P(G) = 0.365P(T) = 0.154
S1
P(A) = 0.262P(C) = 0.247P(G) = 0.238P(T) = 0.253
CGCTTAGCTATCGCATTCGGCTACGATCTAGCTACGTAGCTATGCCGATGCATTATTACGCGATCTCGATCG
C G C T T A
S0 S0 S0 S1 S1 S1
7
Hidden Markov ModelRabiner's three basic problems
• what is the probability of an observed sequence, O, given a model?(evaluation)○ joint probability of observations and state sequence – P(O,S|θ)○ as useful as Markov Chain
• what is the optimal sequence of states that "explains" the observed data? (decoding)○ optimality criterion?
• how can one adjust the model parameters to maximize the probability of the observed data given the model(learning)
8
Hidden Markov ModelEvaluation
• for a model with parameters , what is or more simply • model parameters
Q a set of states A the state transition matrix B the emission probabilities ω an initial probability distribution
• observations
9
Hidden Markov ModelEvaluation
• Assume the observations are independent
The probability of a a particular state sequence (or path), π,
• There are NT state sequences and O(T) calculations so the brute force complexity is O(TNT)
10
Hidden Markov ModelForward algorithm
• is the probability of observing the partial sequence given that state
with
• complexity O(N2T)
11
Hidden Markov ModelBackward algorithm
• almost the same as forward algorithm• is the probability of observing the partial sequence given that state
with initial condition
12
Hidden Markov ModelProblem 2: Decoding
• what is the optimal sequence of states that "explains" the observed data?
• optimality criteria○ the path π that maximizes the correct number of individual states, i.e.,
the path where the states are individually most likely○ the most probable single path, maximize or equivalently Viterbi
algorithm• The optimal path is the path that maximizes
let be the highest probability path ending in state i
• keep track of argument that maximizes at each position t
13
Hidden Markov ModelProblem 3: learning
• find the parameters, , that maximizes
• No analytical solution requires iterative solution (Baum-Welch algorithm)1. initial model , repeat2. compute parameters based on and observations O3. if , stop4. else, accept , goto 2.
• With Baum-Welch algorithm likelihood is proven to be greater or equal at each step
14
Hidden Markov ModelTraining
• need to update the transition probabilities. the probability of being in state i at time t and state j at time t+1 is
• the probability of being in state i at time t, given the observed sequence O is
or in terms of ,
15
Hidden Markov ModelTraining
• Derived quantities
expected number of times state i is used
expected number of transitions from state i to state j
• BW parameter updates○ probability of starting in state I
○ where
16
Class Hidden Markov ModelWhy?
• Basic HMM has no way to include the known states of classified training data into the optimization! Generally trained in an unsupervised learning approach.
• HMMs are not discriminitive models• How do you discriminate?
○ Separate training data into classes and train○ for any set of observations, compare probabilities of observed data
given each model and choose the best model (likelihood ratio, for example)
17
Class Hidden Markov ModelHidden Markov Models for Labeled SequencesAnders Krogh, ISMB 1994
• Assume you have a sequence of observed symbols, (s0, s1, …, sT), and a sequence of (observed) classes (c0, c1, …, cT)
• Each state at time t, emits an observed symbol and an observed class (label). The classes are treated similarly to the emission probabilities
probability of emitting character a and class label x• Most often a state will emit a single class, i.e., for two states x, and
y, = 1 and = 0 = 0 and = 1
• What is the probability of the class labels given this model
• is the basic HMM calculation described earlier and solved using the forward/backward algorithm
18
Class Hidden Markov Model
• in the general case, multiple paths through the model can give the same labeling. in the special case where each state emits only a single class label, you can use Viterbi to optimize
the most probable class labeling of sequence s
• Maximum likelihood parameter estimates
19
Class Hidden Markov Model
20
Class Hidden Markov ModelCalculating gradient -
21
Class Hidden Markov ModelCalculating gradient -
• basically the same as
• the overall likelihood and optimization
22
Class Hidden Markov Model
• Overall gradient of the likelihood• Intuitively
○ Permissible paths - the state path given in φ
• mk is the number of times θk is used in permissible paths○ Slightly modified forward/backward○ αt and βt are zero except along the permissible paths
• nk is the number of times θk is used in all possible○ Use standard forward/backward algorithm
• Straightforward application of EM (Baum-Welch) is likely to give negative probabilities (eq. 16)
• Krogh proposes an iterative method, including multiple training sequences, μ
23
ReferencesTo learn about HMMs
• Rabiner L. A tutorial on hidden Markov models and selected applications in speech recognition. Proc. IEEE 77, 257-286, 1989.○ This is the ultimate origin, everything comes back to here including basic nomenclature and
definition of problems• Durbin R, Eddy S, Krogh A, Mitchison G. Biological Sequence analysis, probabilistic
models of proteins and nucleic acids. Cambridge University Press, 1998.○ Detailed applications to sequence alignment, phylogenetic trees, RNA folding, and other
biological models• Mann T. Numerically stable hidden Markov model implementation.
http://bozeman.genome.washington.edu/compbio/mbt599_2006/hmm_scaling_revised.pdf. 2006.○ Detailed pseudocode for implementation of HMM calculation in log space to avoid numerical
underflow. Very useful if you are writing your own code.• De Fonzo V, Aluffi-Pentini F, Parisi V. Hidden Markov Models in Bioinformatics. Current
Bioinformatics 2, 49-61, 2007.○ A more recent review that gives a clear outline of the algorithms and lots of references to
more recent applications in computational biology.• Kanungo T. Hidden Markov Models. http://www.kanungo.com/software/hmmtut.pdf. • Kanungo T. UMDHMM: hidden Markov model toolkit. in "Extended finite state models of
language", Kornai A (ed). Cambridge University Press. Software download http://www.kanungo.com/software/software.html#umdhmm 1999.