7/30/2019 HMM Presentation
Hidden Markov Model and Speech Recognition
Nirav S. Uchat
1 Dec, 2006
Nirav S. Uchat Hidden Markov Model and Speech Recognition
Outline
1 Introduction
2 Motivation - Why HMM?
3 Understanding HMM
4 HMM and Speech Recognition
5 Isolated Word Recognizer
Introduction

What is Speech Recognition?
Understanding what is being said
Mapping speech data to textual information

Speech recognition is indeed challenging:
due to the presence of noise in the input data
variation in the voice data due to the speaker's physical condition, mood, etc.
word boundaries are difficult to identify
Different Types of Speech Recognition

Type of Speaker
Speaker Dependent (SD)
relatively easy to construct
requires less training data (only from a particular speaker)
also known as speaker recognition
Speaker Independent (SI)
requires huge training data (from various speakers)
difficult to construct

Type of Data
Isolated Word Recognizer
recognizes a single word
easy to construct (a pointer toward more difficult speech recognition)
may be speaker dependent or speaker independent
Continuous Speech Recognition
most difficult of all
problem of finding word boundaries
Use of a Signal Model

it helps us characterize the properties of the given signal
it provides a theoretical basis for signal processing systems
it gives us a way to understand how the system works
we can simulate the source, which helps us understand as much as possible about the signal source

Why Hidden Markov Model (HMM)?

very rich in mathematical structure
when applied properly, works very well in practical applications
Components of HMM [2]

1 Number of states = N
2 Number of distinct observation symbols per state = M, with symbol set V = {V_1, V_2, ..., V_M}
3 State transition probabilities a_ij = P[q_{t+1} = S_j | q_t = S_i]
4 Observation symbol probability distribution in state j: b_j(k) = P[V_k at t | q_t = S_j]
5 The initial state distribution π_i = P[q_1 = S_i], 1 ≤ i ≤ N
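A minimal sketch of these five components as arrays; the state and symbol counts and all probability values below are invented toy numbers, not taken from the slides.

```python
import numpy as np

# Toy HMM with N = 2 states and M = 3 observation symbols (made-up values).
N, M = 2, 3

# A[i, j] = a_ij, probability of moving from state S_i to state S_j.
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])

# B[j, k] = b_j(k), probability of emitting symbol V_k in state S_j.
B = np.array([[0.5, 0.4, 0.1],
              [0.1, 0.3, 0.6]])

# pi[i] = P[q_1 = S_i], the initial state distribution.
pi = np.array([0.6, 0.4])

# Every row of A and B, and pi itself, must sum to 1.
assert np.allclose(A.sum(axis=1), 1.0)
assert np.allclose(B.sum(axis=1), 1.0)
assert np.isclose(pi.sum(), 1.0)
```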
Problems For HMM : Problem 1 [2]

Problem 1 : Evaluation Problem. Given the observation sequence O = O_1 O_2 ... O_T and the model λ = (A, B, π), how do we efficiently compute P(O|λ), the probability of the observation sequence given the model?
Figure: Evaluation Problem
Problems 2 and 3 [2]

Problem 2 : Hidden State Determination (Decoding). Given the observation sequence O = O_1 O_2 ... O_T and the model λ = (A, B, π), how do we choose the BEST state sequence Q = q_1 q_2 ... q_T, i.e., one that is optimal in some meaningful sense? (In speech recognition this can be thought of as the states emitting the correct phonemes.)

Problem 3 : Learning. How do we adjust the model parameters λ = (A, B, π) to maximize P(O|λ)? Problem 3 is the one in which we try to optimize the model parameters so as to best describe how a given observation sequence comes about.
Solution For Problem 1 : Forward Algorithm

Direct evaluation:

P(O|λ) = Σ over all q_1, ..., q_T of π_{q_1} b_{q_1}(O_1) a_{q_1 q_2} b_{q_2}(O_2) ... a_{q_{T-1} q_T} b_{q_T}(O_T)

which is an O(N^T) algorithm, i.e., at every step we have N state choices to make and the total length is T.

The Forward algorithm uses dynamic programming to reduce the time complexity. It uses the forward variable α_t(i), defined as

α_t(i) = P(O_1, O_2, ..., O_t, q_t = S_i | λ)

i.e., the probability of the partial observation sequence O_1, O_2, ..., O_t and state S_i at time t, given the model λ.
Figure: Forward Variable

α_{t+1}(j) = [ Σ_{i=1}^{N} α_t(i) a_ij ] b_j(O_{t+1}),   1 ≤ t ≤ T-1,  1 ≤ j ≤ N
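The initialization, induction, and termination steps above can be sketched in a few lines; the toy model values below are invented for illustration.

```python
import numpy as np

def forward(A, B, pi, O):
    """Compute P(O | lambda) with the forward algorithm in O(N^2 T) time."""
    T, N = len(O), len(pi)
    alpha = np.zeros((T, N))
    # Initialization: alpha_1(i) = pi_i * b_i(O_1).
    alpha[0] = pi * B[:, O[0]]
    # Induction: alpha_{t+1}(j) = (sum_i alpha_t(i) a_ij) * b_j(O_{t+1}).
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, O[t]]
    # Termination: P(O | lambda) = sum_i alpha_T(i).
    return alpha[-1].sum()

# Toy model (made-up numbers) and a short observation sequence.
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
pi = np.array([0.6, 0.4])
p = forward(A, B, pi, [0, 1, 2])   # ≈ 0.03628
```

The recursion sums over all N predecessor states at each of the T steps, which is exactly what cuts the brute-force O(N^T) enumeration down to O(N^2 T).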
Solution For Problem 2 : Decoding using Viterbi Algorithm [1]

To find the single best state sequence we define the quantity

δ_t(i) = max over q_1, q_2, ..., q_{t-1} of P[q_1 q_2 ... q_{t-1}, q_t = S_i, O_1 O_2 ... O_t | λ]

i.e., δ_t(i) is the best score along a single path, at time t, which accounts for the first t observations and ends in state S_i. By induction:

δ_{t+1}(j) = [ max_i δ_t(i) a_ij ] b_j(O_{t+1})

The key point is that the Viterbi algorithm is similar (except for the backtracking part) in implementation to the Forward algorithm. The major difference is maximization over the previous states in place of the summing procedure in the forward calculation.
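A sketch of the recursion plus the backtracking part mentioned above; as before, the toy model numbers are invented for illustration.

```python
import numpy as np

def viterbi(A, B, pi, O):
    """Return the best state sequence and its score (max in place of sum)."""
    T, N = len(O), len(pi)
    delta = np.zeros((T, N))            # delta_t(i): best score ending in state i
    psi = np.zeros((T, N), dtype=int)   # backpointers for backtracking
    delta[0] = pi * B[:, O[0]]
    for t in range(1, T):
        # scores[i, j] = delta_{t-1}(i) * a_ij; maximize over previous state i.
        scores = delta[t - 1][:, None] * A
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) * B[:, O[t]]
    # Backtrack from the best final state.
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t][path[-1]]))
    return path[::-1], delta[-1].max()

A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
pi = np.array([0.6, 0.4])
path, score = viterbi(A, B, pi, [0, 1, 2])   # path == [0, 0, 1]
```

Note that the loop body mirrors the forward recursion exactly, with `max`/`argmax` replacing the sum, which is the similarity the slide points out.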
Solution For Problem 3 : Learning (Adjusting model parameters)

Uses the Baum-Welch learning algorithm. The core quantities are:

ξ_t(i,j) = P(q_t = S_i, q_{t+1} = S_j | O, λ), i.e., the probability of being in state S_i at time t and in state S_j at time t+1, given the model and the observation sequence

γ_t(i) = the probability of being in state S_i at time t, given the observation sequence and the model

We can relate them:

γ_t(i) = Σ_{j=1}^{N} ξ_t(i,j)

The re-estimated parameters are:

π_i = expected number of times in state S_i at time t = 1 = γ_1(i)
a_ij = (expected number of transitions from state S_i to S_j) / (expected number of transitions from state S_i)
     = Σ_{t=1}^{T-1} ξ_t(i,j) / Σ_{t=1}^{T-1} γ_t(i)

b_j(k) = (expected number of times in state j observing symbol V_k) / (expected number of times in state j)
       = Σ_{t=1, s.t. O_t = V_k}^{T} γ_t(j) / Σ_{t=1}^{T} γ_t(j)
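One re-estimation pass combining the formulas above can be sketched as follows, assuming a single observation sequence; the forward-backward computation of ξ and γ follows their definitions, and the toy model values are invented.

```python
import numpy as np

def baum_welch_step(A, B, pi, O):
    """One Baum-Welch re-estimation pass over a single observation sequence."""
    T, N = len(O), len(pi)
    # Forward variables alpha_t(i) and backward variables beta_t(i).
    alpha = np.zeros((T, N))
    beta = np.ones((T, N))
    alpha[0] = pi * B[:, O[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, O[t]]
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, O[t + 1]] * beta[t + 1])
    prob = alpha[-1].sum()   # P(O | lambda)
    # xi_t(i, j) and gamma_t(i) as defined on the previous slides.
    xi = np.array([alpha[t][:, None] * A * (B[:, O[t + 1]] * beta[t + 1])[None, :]
                   for t in range(T - 1)]) / prob
    gamma = alpha * beta / prob
    # Re-estimation formulas.
    new_pi = gamma[0]
    new_A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    new_B = np.zeros_like(B)
    for k in range(B.shape[1]):
        mask = np.array(O) == k
        new_B[:, k] = gamma[mask].sum(axis=0) / gamma.sum(axis=0)
    return new_A, new_B, new_pi

A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
pi = np.array([0.6, 0.4])
new_A, new_B, new_pi = baum_welch_step(A, B, pi, [0, 1, 2, 1])
```

Repeating this step until the parameters stop changing is the usual training loop; each pass is guaranteed not to decrease P(O|λ).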
Block Diagram of ASR using HMM
Basic Structure

Phoneme
the smallest unit of information in a speech signal (spanning roughly 10 msec) is the phoneme
ONE : W AH N
the English language has approximately 56 phonemes

HMM structure for a Phoneme
this model is a first-order Markov model
transitions go from the previous state to the next state (no jumping)
Questions to be asked

What represents a state in the HMM?
one HMM for each phoneme
3 states for each HMM
the states are: start, mid and end
ONE : has 3 HMMs, for the phonemes W, AH and N; each HMM has 3 states

What is the output symbol?
a symbol from Vector Quantization is used as the output symbol from a state
concatenation of symbols gives a phoneme
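The 3-state, no-jumping structure can be written down directly as a transition matrix; the stay/advance probabilities below are placeholder values, not figures from the slides.

```python
import numpy as np

# 3 emitting states per phoneme HMM: start, mid, end.
# Left-to-right ("no jumping"): each state may only loop on itself
# or advance to the next state, so A is upper bidiagonal.
stay, advance = 0.6, 0.4   # placeholder probabilities
A = np.array([
    [stay, advance, 0.0],      # start -> start or mid
    [0.0,  stay,    advance],  # mid   -> mid or end
    [0.0,  0.0,     1.0],      # end loops (exit handled by a non-emitting state)
])
# No backward transitions: the strict lower triangle is all zero.
assert np.allclose(np.tril(A, k=-1), 0.0)
```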
Front-End

The purpose is to parameterize an input signal (e.g., audio) into a sequence of feature vectors. Methods for feature vector extraction are:

MFCC - Mel Frequency Cepstral Coefficients
LPC Analysis - Linear Predictive Coding
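As a rough illustration of what a front-end does, the sketch below covers only the first framing-and-windowing step shared by MFCC and LPC pipelines; the 25 ms frame / 10 ms hop at 16 kHz are typical choices, not values given in the slides.

```python
import numpy as np

def frames(signal, frame_len=400, hop=160):
    """Split a signal into overlapping, Hamming-windowed frames.

    Only the first step of a front-end such as MFCC; frame/hop of
    400/160 samples correspond to 25 ms / 10 ms at a 16 kHz rate.
    """
    n = 1 + max(0, (len(signal) - frame_len) // hop)
    window = np.hamming(frame_len)
    return np.stack([signal[i * hop : i * hop + frame_len] * window
                     for i in range(n)])

# A fake 1-second, 16 kHz "signal" just to show the shapes.
sig = np.random.default_rng(0).standard_normal(16000)
F = frames(sig)                                   # one row per frame
spectrum = np.abs(np.fft.rfft(F, axis=1)) ** 2    # per-frame power spectrum
```

A real MFCC front-end would continue with a mel filterbank, a log, and a DCT over each frame's power spectrum.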
Acoustic Modeling [1]

Uses Vector Quantization to map feature vectors to symbols:
create a training set of feature vectors
cluster them into a small number of classes
represent each class by a symbol
for each class V_k, compute the probability that it is generated by a given HMM state
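The clustering step can be sketched with plain k-means (Lloyd's algorithm); the 2-D toy vectors and the cluster count below are invented for illustration.

```python
import numpy as np

def kmeans_codebook(vectors, k, iters=20, seed=0):
    """Cluster feature vectors with plain k-means; each label is a VQ symbol."""
    rng = np.random.default_rng(seed)
    centroids = vectors[rng.choice(len(vectors), size=k, replace=False)]
    for _ in range(iters):
        # Assign each vector to its nearest centroid (its VQ symbol).
        d = np.linalg.norm(vectors[:, None] - centroids[None], axis=2)
        labels = d.argmin(axis=1)
        # Move each centroid to the mean of its class.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = vectors[labels == j].mean(axis=0)
    return centroids, labels

# Made-up 2-D "feature vectors" forming two well-separated clusters.
rng = np.random.default_rng(1)
vecs = np.vstack([rng.normal(0, 0.1, (50, 2)), rng.normal(5, 0.1, (50, 2))])
codebook, symbols = kmeans_codebook(vecs, k=2)
```

The resulting `symbols` sequence is what the discrete HMM consumes; each state's b_j(k) is then estimated over these symbol indices.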
Creation Of Search Graph [3]

The Search Graph represents the vocabulary under consideration
The Acoustic Model, Language Model and Lexicon (used by the Decoder during recognition) work together to produce the Search Graph
The Language Model represents how words are related to each other (which word follows which)
it uses a Bi-Gram model
The Lexicon is a file containing WORD-PHONEME pairs
So we have the whole vocabulary represented as a graph
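A minimal sketch of the bi-gram model idea: maximum-likelihood P(w2 | w1) estimated from counts over a tiny invented corpus.

```python
from collections import Counter

def bigram_probs(sentences):
    """Maximum-likelihood bigram probabilities P(w2 | w1) from raw counts."""
    unigrams, bigrams = Counter(), Counter()
    for s in sentences:
        words = ["<s>"] + s.split() + ["</s>"]
        unigrams.update(words[:-1])               # count w1 contexts
        bigrams.update(zip(words[:-1], words[1:]))  # count (w1, w2) pairs
    return {(w1, w2): c / unigrams[w1] for (w1, w2), c in bigrams.items()}

# Tiny made-up corpus, only to show the mechanics.
corpus = ["call home", "call office", "call home now"]
P = bigram_probs(corpus)
# P[("call", "home")] == 2/3: "call" occurs 3 times, followed by "home" twice.
```

In a real decoder these probabilities weight the word-to-word arcs of the search graph; a production model would also need smoothing for unseen bigrams.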
Complete Example
Training

Training is used to adjust the model parameters to maximize the probability of recognition
Audio data from various different sources is taken
It is given to the prototype HMM
The HMM adjusts its parameters using the Baum-Welch algorithm
Once the model is trained, unknown data is given for recognition
Decoding
It uses the Viterbi algorithm to find the BEST state sequence
Decoding Continued
This is just for a single word; during decoding the whole graph is searched
Each HMM has two non-emitting states for connecting it to other HMMs
Isolated Word Recognizer [4]
Problems With Continuous Speech Recognition
Boundary condition
Large vocabulary
Training time
Efficient Search Graph creation
[1] Dan Jurafsky. CS 224S / LINGUIST 181: Speech Recognition and Synthesis. http://www.stanford.edu/class/cs224s/.
[2] Lawrence R. Rabiner. A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition. Proceedings of the IEEE, 1989.
[3] Willie Walker, Paul Lamere, and Philip Kwok. Sphinx-4: A Flexible Open Source Framework for Speech Recognition. Sun Microsystems, 2004.
[4] Steve Young and Gunnar Evermann. The HTK Book. Microsoft Corporation, 2005.