CS 224S / LINGUIST 285 Spoken Language Processing Andrew Maas Stanford University Spring 2017 Lecture 3: ASR: HMMs, Forward, Viterbi Original slides by Dan Jurafsky


Fun informative read on phonetics
The Art of Language Invention. David J. Peterson. 2015.
http://www.artoflanguageinvention.com/books/

Outline for Today
• ASR Architecture
• Decoding with HMMs
  • Forward
  • Viterbi Decoding
• How this fits into the ASR component of course
  • On your own: N-grams and Language Modeling
  • Apr 12: Training, Advanced Decoding
  • Apr 17: Feature Extraction, GMM Acoustic Modeling
  • Apr 24: Neural Network Acoustic Models
  • May 1: End to end neural network speech recognition

The Noisy Channel Model

• Search through space of all possible sentences.
• Pick the one that is most probable given the waveform.

[Figure: the noisy channel model. A source sentence ("If music be the food of love...") passes through a noisy channel and comes out as a noisy sentence. The decoder scores candidate sources ("Every happy family", "In a hole in the ground", ..., "If music be the food of love") and outputs its guess at the source: "If music be the food of love...".]

The Noisy Channel Model (II)
• What is the most likely sentence out of all sentences in the language L given some acoustic input O?
• Treat acoustic input O as sequence of individual observations
  • $O = o_1, o_2, o_3, \ldots, o_t$
• Define a sentence as a sequence of words:
  • $W = w_1, w_2, w_3, \ldots, w_n$

Noisy Channel Model (III)
• Probabilistic implication: pick the highest-probability sentence W:
  $\hat{W} = \arg\max_{W \in L} P(W \mid O)$
• We can use Bayes' rule to rewrite this:
  $\hat{W} = \arg\max_{W \in L} \frac{P(O \mid W)\, P(W)}{P(O)}$
• Since the denominator is the same for each candidate sentence W, we can ignore it for the argmax:
  $\hat{W} = \arg\max_{W \in L} P(O \mid W)\, P(W)$

Speech Recognition Architecture

[Figure: speech recognition architecture. Cepstral feature extraction turns the input O into MFCC features; a Gaussian acoustic model turns these into phone likelihoods; an HMM lexicon and an N-gram language model feed the Viterbi decoder, which combines P(W) and P(O|W) and outputs the word string W: "if music be the food of love...".]

Noisy channel model

$\hat{W} = \arg\max_{W \in L} \underbrace{P(O \mid W)}_{\text{likelihood}} \; \underbrace{P(W)}_{\text{prior}}$
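As a rough illustration of the argmax above, the sketch below scores a handful of candidate sentences by likelihood times prior and returns the best one. The candidate list and every probability in it are made-up values, and `decode` is a hypothetical helper name, not part of any toolkit; a real decoder searches a far larger space.

```python
import math

# Hypothetical scores per candidate W: (P(O|W) from an acoustic model,
# P(W) from a language model). All numbers are invented for illustration.
candidates = {
    "every happy family": (1e-12, 1e-7),
    "in a hole in the ground": (1e-11, 1e-9),
    "if music be the food of love": (1e-8, 1e-9),
}

def decode(candidates):
    """Return argmax_W P(O|W) * P(W), computed in log space to avoid underflow."""
    def score(sentence):
        likelihood, prior = candidates[sentence]
        return math.log(likelihood) + math.log(prior)
    return max(candidates, key=score)

best = decode(candidates)
print(best)
```

P(O) never appears in the score, which is exactly the simplification the argmax allows.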

The noisy channel model
Ignoring the denominator leaves us with two factors: P(Source) and P(Signal | Source)

[Figure: the noisy channel diagram: source sentence, noisy channel, noisy sentence, decoder, guess at source.]

Speech Architecture meets Noisy Channel


Decoding Architecture: five easy pieces

• Feature Extraction:
  • 39 “MFCC” features
• Acoustic Model:
  • Gaussians for computing $p(o \mid q)$
• Lexicon/Pronunciation Model
  • HMM: what phones can follow each other
• Language Model
  • N-grams for computing $p(w_i \mid w_{i-1})$
• Decoder
  • Viterbi algorithm: dynamic programming for combining all these to get word sequence from speech

Lexicon
• A list of words
• Each one with a pronunciation in terms of phones
• We get these from an on-line pronunciation dictionary
  • CMU dictionary: 127K words
  • http://www.speech.cs.cmu.edu/cgi-bin/cmudict
• We’ll represent the lexicon as an HMM

HMMs for speech


Phones are not homogeneous!

[Figure: spectrogram (0 to 5000 Hz, time 0.48 s to 0.94 s) of the phones [ay] and [k].]

Each phone has 3 subphones

[Figure: a phone as a left-to-right HMM: Start0 → beg1 → mid2 → fin3 → End4, with self-loops a11, a22, a33 and forward transitions a01, a12, a23, a34.]

Resulting HMM word model for “six”

[Figure: HMM for “six”: Start → s1 s2 s3 → ih1 ih2 ih3 → k1 k2 k3 → s1 s2 s3 → End, with three subphone states per phone.]
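One way to picture the word model is as a lexicon lookup followed by subphone expansion. The sketch below is only an illustration: `LEXICON`, `word_states`, and `allowed_transitions` are hypothetical names, the two pronunciations are taken from the slides (s ih k s for "six", f ay v for "five"), and a real system would load a full pronunciation dictionary and attach probabilities to each arc.

```python
# Hypothetical mini-lexicon mapping words to phone strings.
LEXICON = {"six": ["s", "ih", "k", "s"], "five": ["f", "ay", "v"]}

def word_states(word, n_sub=3):
    """Expand a word into its left-to-right chain of subphone state labels."""
    return [f"{phone}{k}" for phone in LEXICON[word] for k in range(1, n_sub + 1)]

def allowed_transitions(n_states):
    """Left-to-right topology over state positions: self-loop or advance by one."""
    arcs = [("start", 0)]
    for i in range(n_states):
        arcs.append((i, i))  # stay in the same subphone
        arcs.append((i, i + 1 if i + 1 < n_states else "end"))  # move on
    return arcs

six = word_states("six")
arcs = allowed_transitions(len(six))
print(six)
print(len(arcs))
```

Arcs are indexed by position rather than by label because "six" contains the phone s twice, so labels such as `s1` are not unique.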


HMM for the digit recognition task


Markov chain for weather

[Figure: Markov chain for weather with a start state (Start0), three weather states (hot, cold, and warm in the later slides), and an end state (End4), connected by transition probabilities a_ij.]

Markov chain for words

[Figure: Markov chain for words with states Start0, white, is, snow, and End4, connected by transition probabilities a_ij.]

Markov chain = First-order observable Markov Model

• a set of states
  • $Q = q_1, q_2, \ldots, q_N$; the state at time $t$ is $q_t$
• Transition probabilities:
  • a set of probabilities $A = a_{01} a_{02} \ldots a_{n1} \ldots a_{nn}$
  • Each $a_{ij}$ represents the probability of transitioning from state $i$ to state $j$
  • The set of these is the transition probability matrix $A$
  $a_{ij} = P(q_t = j \mid q_{t-1} = i), \quad 1 \le i, j \le N$
  $\sum_{j=1}^{N} a_{ij} = 1, \quad 1 \le i \le N$
• Distinguished start and end states

Markov chain = First-order observable Markov Model

Current state only depends on previous state

Markov Assumption: $P(q_i \mid q_1 \ldots q_{i-1}) = P(q_i \mid q_{i-1})$

Another representation for start state

• Instead of start state
  • Special initial probability vector $\pi$
  • An initial distribution over probability of start states
• Constraints:
  $\pi_i = P(q_1 = i), \quad 1 \le i \le N$
  $\sum_{j=1}^{N} \pi_j = 1$

The weather figure using pi


The weather figure: specific example


Markov chain for weather
• What is the probability of 4 consecutive warm days?
• Sequence is warm-warm-warm-warm
• I.e., state sequence is 3-3-3-3
• $P(3,3,3,3) = \pi_3\, a_{33}\, a_{33}\, a_{33} = 0.2 \times (0.6)^3 = 0.0432$
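The same calculation can be sketched in a few lines of code: one initial probability, then one transition probability per step. Only $\pi_3 = 0.2$ and $a_{33} = 0.6$ come from the slide; the remaining entries of `PI` and `A`, the state numbering (1 = hot, 2 = cold, 3 = warm), and the name `sequence_probability` are assumptions made so the example runs.

```python
import math

# Hypothetical 3-state weather chain. Only PI[2] = 0.2 and A[2][2] = 0.6 are
# taken from the slide; the other values are invented so each row sums to 1.
PI = [0.5, 0.3, 0.2]
A = [
    [0.5, 0.2, 0.3],
    [0.2, 0.5, 0.3],
    [0.3, 0.1, 0.6],
]

def sequence_probability(states, pi=PI, a=A):
    """P(q_1..q_T) = pi[q_1] * prod_t a[q_{t-1}][q_t], with 1-based state labels."""
    idx = [s - 1 for s in states]
    prob = pi[idx[0]]
    for prev, cur in zip(idx, idx[1:]):
        prob *= a[prev][cur]
    return prob

# Sanity-check the constraints on pi and on each row of A.
assert math.isclose(sum(PI), 1.0)
assert all(math.isclose(sum(row), 1.0) for row in A)

p_warm = sequence_probability([3, 3, 3, 3])
print(round(p_warm, 4))  # 0.0432
```

A sequence of four states uses three transitions, which is why the exponent on 0.6 is 3.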


How about?

• Hot hot hot hot
• Cold hot cold hot
• What does the difference in these probabilities tell you about the real world weather info encoded in the figure?

HMM for Ice Cream

• You are a climatologist in the year 2799
  • Studying global warming
  • You can’t find any records of the weather in Baltimore, MD for summer of 2008
• But you find Jason Eisner’s diary
  • Which lists how many ice-creams Jason ate every date that summer
• Our job: figure out how hot it was

Hidden Markov Model

• For Markov chains, output symbols = state symbols
  • See hot weather: we’re in state hot
• But not in speech recognition
  • Output symbols: vectors of acoustics (cepstral features)
  • Hidden states: phones
• So we need an extension!
• A Hidden Markov Model is an extension of a Markov chain in which the output symbols are not the same as the states.
• This means we don’t know which state we are in.

Hidden Markov Models


Assumptions
• Markov assumption:
  $P(q_i \mid q_1 \ldots q_{i-1}) = P(q_i \mid q_{i-1})$
• Output-independence assumption:
  $P(o_t \mid O_1^{t-1}, q_1^t) = P(o_t \mid q_t)$

Eisner task

Given observed ice cream sequence:
  1,2,3,2,2,2,3…
Produce hidden weather sequence:
  H,C,H,H,H,C…

HMM for ice cream

[Figure: ice-cream HMM with hidden states HOT and COLD.
Transitions: P(HOT|start) = .8, P(COLD|start) = .2, P(HOT|HOT) = .7, P(COLD|HOT) = .3, P(COLD|COLD) = .6, P(HOT|COLD) = .4
Emissions from HOT: P(1|HOT) = .2, P(2|HOT) = .4, P(3|HOT) = .4
Emissions from COLD: P(1|COLD) = .5, P(2|COLD) = .4, P(3|COLD) = .1]

Different types of HMM structure

Bakis = left-to-right
Ergodic = fully-connected

The Three Basic Problems for HMMs

Problem 1 (Evaluation): Given the observation sequence $O = (o_1 o_2 \ldots o_T)$ and an HMM model $\Phi = (A, B)$, how do we efficiently compute $P(O \mid \Phi)$, the probability of the observation sequence given the model?

Problem 2 (Decoding): Given the observation sequence $O = (o_1 o_2 \ldots o_T)$ and an HMM model $\Phi = (A, B)$, how do we choose a corresponding state sequence $Q = (q_1 q_2 \ldots q_T)$ that is optimal in some sense (i.e., best explains the observations)?

Problem 3 (Learning): How do we adjust the model parameters $\Phi = (A, B)$ to maximize $P(O \mid \Phi)$?

Jack Ferguson at IDA in the 1960s


Problem 1: computing the observation likelihood

Given the following HMM:

[Figure: the ice-cream HMM with hidden states HOT and COLD, as on the slide "HMM for ice cream".]

How likely is the sequence 3 1 3?

How to compute likelihood

• For a Markov chain, we just follow the states 3 1 3 and multiply the probabilities
• But for an HMM, we don’t know what the states are!
• So let’s start with a simpler situation.
  • Computing the observation likelihood for a given hidden state sequence
  • Suppose we knew the weather and wanted to predict how much ice cream Jason would eat.
  • i.e., P(3 1 3 | H H C)
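When the hidden states are known, the output-independence assumption reduces the likelihood to a product of emission probabilities, one per observation. The sketch below assumes the emission values read off the ice-cream HMM figure (P(1|H) = .2, P(3|H) = .4, P(3|C) = .1, and so on); `B` and `likelihood_given_states` are illustrative names.

```python
# Emission probabilities B[state][observation], assumed to match the
# ice-cream HMM figure (H = hot, C = cold; observations are ice creams eaten).
B = {
    "H": {1: 0.2, 2: 0.4, 3: 0.4},
    "C": {1: 0.5, 2: 0.4, 3: 0.1},
}

def likelihood_given_states(observations, states, b=B):
    """P(O | Q) = prod_t P(o_t | q_t) for a known hidden state sequence."""
    prob = 1.0
    for o, q in zip(observations, states):
        prob *= b[q][o]
    return prob

p_313_hhc = likelihood_given_states([3, 1, 3], "HHC")
print(round(p_313_hhc, 6))  # 0.008
```

Here P(3 1 3 | H H C) = .4 × .2 × .1; no transition probabilities are involved because the state sequence is given.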


Computing likelihood of 3 1 3 given hidden state sequence


Computing joint probability of observation and state sequence
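The slide's equation did not survive extraction. A standard way to write the joint probability, under the two HMM assumptions stated earlier, is sketched below; the worked numbers assume the ice-cream HMM parameters P(hot|start) = .8, P(hot|hot) = .7, P(cold|hot) = .3 and the emission values from the figure.

```latex
P(O, Q) = P(O \mid Q)\, P(Q)
        = \prod_{i=1}^{T} P(o_i \mid q_i) \times \prod_{i=1}^{T} P(q_i \mid q_{i-1})

P(3\ 1\ 3,\ \text{hot hot cold})
  = P(\text{hot} \mid \text{start})\, P(\text{hot} \mid \text{hot})\, P(\text{cold} \mid \text{hot})
    \times P(3 \mid \text{hot})\, P(1 \mid \text{hot})\, P(3 \mid \text{cold})
  = .8 \times .7 \times .3 \times .4 \times .2 \times .1
  = 0.001344
```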


Computing total likelihood of 3 1 3
• We would need to sum over
  • Hot hot cold
  • Hot hot hot
  • Hot cold hot
  • ….
• How many possible hidden state sequences are there for this sequence?
• How about in general for an HMM with N hidden states and a sequence of T observations?
  • $N^T$
• So we can’t just do separate computation for each hidden state sequence.
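For a sequence this short the exhaustive sum is still feasible, which makes it a handy reference value. The sketch below enumerates all $N^T = 2^3 = 8$ state sequences; the parameter values are assumed to match the ice-cream HMM figure, the names `PI`, `A`, `B`, `joint`, and `brute_force_likelihood` are illustrative, and no end-state transition probability is applied.

```python
from itertools import product

# Ice-cream HMM parameters, assumed to match the lecture's figure.
PI = {"H": 0.8, "C": 0.2}
A = {"H": {"H": 0.7, "C": 0.3}, "C": {"H": 0.4, "C": 0.6}}
B = {"H": {1: 0.2, 2: 0.4, 3: 0.4}, "C": {1: 0.5, 2: 0.4, 3: 0.1}}

def joint(observations, states):
    """P(O, Q): start, transition, and emission probabilities along one path."""
    prob = PI[states[0]] * B[states[0]][observations[0]]
    for t in range(1, len(observations)):
        prob *= A[states[t - 1]][states[t]] * B[states[t]][observations[t]]
    return prob

def brute_force_likelihood(observations):
    """Sum P(O, Q) over all N**T hidden state sequences Q."""
    paths = list(product("HC", repeat=len(observations)))
    return sum(joint(observations, q) for q in paths), len(paths)

total, n_paths = brute_force_likelihood([3, 1, 3])
print(n_paths, round(total, 6))  # 8 0.026264
```

The number of paths doubles with every extra observation here, which is the cost the dynamic programming approach avoids.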


Instead: the Forward algorithm
• A dynamic programming algorithm
  • Just like Minimum Edit Distance or CKY Parsing
  • Uses a table to store intermediate values
• Idea:
  • Compute the likelihood of the observation sequence
  • By summing over all possible hidden state sequences
  • But doing this efficiently
    • By folding all the sequences into a single trellis

The forward algorithm
• The goal of the forward algorithm is to compute
  $P(o_1, o_2, \ldots, o_T, q_T = q_F \mid \lambda)$
• We’ll do this by recursion

The forward algorithm
• Each cell of the forward algorithm trellis $\alpha_t(j)$
  • Represents the probability of being in state $j$
  • After seeing the first $t$ observations
  • Given the automaton
• Each cell thus expresses the following probability:
  $\alpha_t(j) = P(o_1, o_2, \ldots, o_t, q_t = j \mid \lambda)$

The Forward Recursion
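The recursion itself was an image on the original slide. A common textbook formulation, consistent with the start state 0 and final state $q_F$ used in this lecture, is sketched below; $b_j(o_t)$ denotes the emission probability $P(o_t \mid q_t = j)$.

```latex
\text{Initialization:} \quad \alpha_1(j) = a_{0j}\, b_j(o_1), \qquad 1 \le j \le N

\text{Recursion:} \quad \alpha_t(j) = \sum_{i=1}^{N} \alpha_{t-1}(i)\, a_{ij}\, b_j(o_t),
  \qquad 1 \le j \le N,\ 1 < t \le T

\text{Termination:} \quad P(O \mid \lambda) = \alpha_T(q_F) = \sum_{i=1}^{N} \alpha_T(i)\, a_{iF}
```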


The Forward Trellis

[Figure: forward trellis for the observation sequence o1 o2 o3 = 3 1 3 over states H and C. Recoverable arc weights and cell values:
P(H|start) * P(3|H) = .8 * .4
P(C|start) * P(3|C) = .2 * .1
P(H|H) * P(1|H) = .7 * .2
P(C|H) * P(1|C) = .3 * .5
P(H|C) * P(1|H) = .4 * .2
P(C|C) * P(1|C) = .6 * .5
α1(2) = .32
α1(1) = .02
α2(2) = .32*.14 + .02*.08 = .0464
α2(1) = .32*.15 + .02*.30 = .054]

We update each cell

[Figure: computing one trellis cell from the previous column, for states q1 to q4 and observations o(t-1), o(t), o(t+1): α_t(j) = Σ_i α_{t-1}(i) a_ij b_j(o_t).]

The Forward Algorithm
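The pseudocode for this slide was an image and is not recoverable, so what follows is a minimal sketch of the forward algorithm rather than the slide's own listing. It assumes the ice-cream HMM parameters shown in the trellis, uses hypothetical names (`START`, `TRANS`, `EMIT`, `forward`), and sums the last column at termination because the example has no explicit end-state probabilities.

```python
# Ice-cream HMM; values assumed to match the lecture's trellis figure.
START = {"H": 0.8, "C": 0.2}
TRANS = {"H": {"H": 0.7, "C": 0.3}, "C": {"H": 0.4, "C": 0.6}}
EMIT = {"H": {1: 0.2, 2: 0.4, 3: 0.4}, "C": {1: 0.5, 2: 0.4, 3: 0.1}}

def forward(observations, start=START, trans=TRANS, emit=EMIT):
    """Return (alpha, likelihood); alpha[t][j] covers observations[0..t]."""
    states = list(start)
    # Initialization: first column of the trellis.
    alpha = [{j: start[j] * emit[j][observations[0]] for j in states}]
    # Recursion: each cell sums over every predecessor in the previous column.
    for o in observations[1:]:
        prev = alpha[-1]
        alpha.append({
            j: sum(prev[i] * trans[i][j] for i in states) * emit[j][o]
            for j in states
        })
    # Termination: no end-state probabilities here, so sum the final column.
    return alpha, sum(alpha[-1].values())

alpha, likelihood = forward([3, 1, 3])
print(alpha[1])    # second trellis column
print(likelihood)  # total likelihood of 3 1 3
```

Each column costs on the order of N² multiplications, so the whole table takes O(N²T) work instead of enumerating N^T paths.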


Decoding
• Given an observation sequence
  • 3 1 3
• And an HMM
• The task of the decoder
  • To find the best hidden state sequence
• Given the observation sequence $O = (o_1 o_2 \ldots o_T)$ and an HMM model $\Phi = (A, B)$, how do we choose a corresponding state sequence $Q = (q_1 q_2 \ldots q_T)$ that is optimal in some sense (i.e., best explains the observations)?

Decoding
• One possibility:
  • For each hidden state sequence Q
    • HHH, HHC, HCH, ...
  • Compute P(O|Q)
  • Pick the highest one
• Why not?
  • $N^T$
• Instead:
  • The Viterbi algorithm
  • Is again a dynamic programming algorithm
  • Uses a similar trellis to the Forward algorithm

Viterbi intuition
• We want to compute the joint probability of the observation sequence together with the best state sequence
  $\max_{q_0, q_1, \ldots, q_T} P(q_0, q_1, \ldots, q_T, o_1, o_2, \ldots, o_T, q_T = q_F \mid \lambda)$

Viterbi Recursion
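The recursion on this slide was an image. A common textbook formulation, mirroring the forward recursion with max in place of the sum and with backpointers $bt_t(j)$ added, is sketched below; the notation $bt$ is an assumption, not taken from the slide.

```latex
\text{Initialization:} \quad v_1(j) = a_{0j}\, b_j(o_1), \qquad bt_1(j) = 0, \qquad 1 \le j \le N

\text{Recursion:} \quad v_t(j) = \max_{i=1}^{N} v_{t-1}(i)\, a_{ij}\, b_j(o_t), \qquad
  bt_t(j) = \arg\max_{i=1}^{N} v_{t-1}(i)\, a_{ij}\, b_j(o_t)

\text{Termination:} \quad P^{*} = v_T(q_F) = \max_{i=1}^{N} v_T(i)\, a_{iF}, \qquad
  q_T^{*} = \arg\max_{i=1}^{N} v_T(i)\, a_{iF}
```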


The Viterbi trellis

[Figure: Viterbi trellis for the observation sequence o1 o2 o3 = 3 1 3 over states H and C. Recoverable arc weights and cell values:
P(H|start) * P(3|H) = .8 * .4
P(C|start) * P(3|C) = .2 * .1
P(H|H) * P(1|H) = .7 * .2
P(C|H) * P(1|C) = .3 * .5
P(H|C) * P(1|H) = .4 * .2
P(C|C) * P(1|C) = .6 * .5
v1(2) = .32
v1(1) = .02
v2(2) = max(.32*.14, .02*.08) = .0448
v2(1) = max(.32*.15, .02*.30) = .048]

Viterbi intuition
• Process observation sequence left to right
• Filling out the trellis
• Each cell:
  $v_t(j) = \max_{i=1}^{N} v_{t-1}(i)\, a_{ij}\, b_j(o_t)$

Viterbi Algorithm
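The slide's pseudocode was an image, so the following is a minimal sketch of Viterbi decoding with a backtrace, not the original listing. It assumes the ice-cream HMM parameters from the trellis, uses hypothetical names (`START`, `TRANS`, `EMIT`, `viterbi`), and picks the best final state directly because the example has no explicit end-state probabilities.

```python
# Ice-cream HMM; values assumed to match the lecture's trellis figure.
START = {"H": 0.8, "C": 0.2}
TRANS = {"H": {"H": 0.7, "C": 0.3}, "C": {"H": 0.4, "C": 0.6}}
EMIT = {"H": {1: 0.2, 2: 0.4, 3: 0.4}, "C": {1: 0.5, 2: 0.4, 3: 0.1}}

def viterbi(observations, start=START, trans=TRANS, emit=EMIT):
    """Return (best_path, best_prob, v) for the observation sequence."""
    states = list(start)
    # Initialization: first column, no predecessors yet.
    v = [{j: start[j] * emit[j][observations[0]] for j in states}]
    backptr = [{j: None for j in states}]
    # Recursion: keep only the best predecessor of each cell.
    for o in observations[1:]:
        prev = v[-1]
        col, ptr = {}, {}
        for j in states:
            best_i = max(states, key=lambda i: prev[i] * trans[i][j])
            col[j] = prev[best_i] * trans[best_i][j] * emit[j][o]
            ptr[j] = best_i
        v.append(col)
        backptr.append(ptr)
    # Termination and backtrace: follow backpointers from the best final state.
    last = max(states, key=lambda j: v[-1][j])
    path = [last]
    for ptr in reversed(backptr[1:]):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, v[-1][last], v

path, prob, v = viterbi([3, 1, 3])
print(path, prob)
```

The only differences from the forward sketch are the max in place of the sum and the backpointer table needed to recover the state sequence.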


Viterbi backtrace

[Figure: the Viterbi trellis for 3 1 3 again (v1(2) = .32, v1(1) = .02, v2(2) = .0448, v2(1) = .048), with a backpointer from each cell to its best predecessor.]

HMMs for Speech
• We haven’t yet shown how to learn the A and B matrices for HMMs;
  • we’ll do that on Thursday
  • The Baum-Welch (Forward-Backward alg)
• But let’s return to think about speech

Reminder: a word looks like this:

[Figure: the HMM word model for “six” again: Start, then three subphone states for each of s, ih, k, s, then End.]

HMM for digit recognition task


The Evaluation (forward) problem for speech
• The observation sequence O is a series of MFCC vectors
• The hidden states W are the phones and words
• For a given phone/word string W, our job is to evaluate P(O|W)
• Intuition: how likely is the input to have been generated by just that word string W?

Evaluation for speech: Summing over all different paths!
• f ay ay ay ay v v v v
• f f ay ay ay ay v v v
• f f f f ay ay ay ay v
• f f ay ay ay ay ay ay v
• f f ay ay ay ay ay ay ay ay v
• f f ay v v v v v v v

The forward lattice for “five”


The forward trellis for “five”


Viterbi trellis for “five”


Viterbi trellis for “five”


Search space with bigrams

[Figure: bigram search network over the digit words "one" (w ah n), "two" (t uw), and "zero" (z iy r ow), among others. Each word's phone HMM is linked to the others by bigram transitions: p(one | one), p(two | one), p(zero | one), p(one | two), p(two | two), p(zero | two), p(one | zero), p(two | zero), p(zero | zero).]

Viterbi trellis


Viterbi backtrace


Summary: ASR Architecture
• Five easy pieces: ASR Noisy Channel architecture
  • Feature Extraction:
    • 39 “MFCC” features
  • Acoustic Model:
    • Gaussians for computing $p(o \mid q)$
  • Lexicon/Pronunciation Model
    • HMM: what phones can follow each other
  • Language Model
    • N-grams for computing $p(w_i \mid w_{i-1})$
  • Decoder
    • Viterbi algorithm: dynamic programming for combining all these to get word sequence from speech