Presented at the Main project evaluation at MES College of Engineering
Introduction | Implementation Methodology | HTK Implementation | Analysis and Result | Future Work | Conclusion
Robust ASR system : Malayalam
Carrol Xavier, Mohammed Musfir, Rahmathulla, Supriya, Yasif
Guided by: Mr. Edet Bijoy K
Assistant Professor
Department of ECE, MES College of Engineering
May 3, 2012
Objective
To implement a prototype recognizer for the Malayalam digits 0-9 using hidden Markov models (HMMs) of speech
Contents
1 Introduction
    Speech
    Automatic Speech Recognition
    Approaches of ASR
2 Implementation Methodology
    Implementation Challenges
    Database Preparation
    Feature Extraction
3 HTK Implementation
    What is HTK?
    HTK Familiarisation
4 Analysis and Result
5 Future Work
6 Conclusion
What is Speech?
Produced when air from the lungs passes through the glottis, throat and mouth
Excitation in three ways:
    Voiced excitation
    Unvoiced excitation
    Transient excitation
Some sounds - combinations of the three excitations
Spectral changes - vocal tract
Pictorial Representation of “SHOP”
Characteristics of Speech
Bandwidth - 4 kHz
Fundamental Frequency - Depends on the type of articulation
Peaks in the Spectrum -
Voiced excitation - P(f) - Triangular Pulse
Unvoiced excitation - a white noise generator
Pitch Extraction:
    Rabiner Gold Pitch Tracker
    Autocorrelation Pitch Tracker
Pitch Extraction - Autocorrelation
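A minimal sketch of autocorrelation pitch tracking on a single voiced frame, assuming NumPy is available. The function name, the 60-400 Hz search range, and the synthetic test tone are illustrative assumptions, not values taken from the project.

```python
import numpy as np

def autocorrelation_pitch(frame, sample_rate, f_min=60.0, f_max=400.0):
    """Estimate the fundamental frequency (Hz) of one voiced frame."""
    frame = frame - np.mean(frame)
    # Full autocorrelation; keep only the non-negative lags.
    acf = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    # Search only lags that correspond to plausible pitch periods.
    lag_min = int(sample_rate / f_max)
    lag_max = int(sample_rate / f_min)
    best_lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return sample_rate / best_lag

# Synthetic 40 ms frame: 100 Hz fundamental plus a weaker second harmonic.
sample_rate = 8000
t = np.arange(int(0.04 * sample_rate)) / sample_rate
frame = np.sin(2 * np.pi * 100.0 * t) + 0.5 * np.sin(2 * np.pi * 200.0 * t)
pitch_hz = autocorrelation_pitch(frame, sample_rate)
```

The strongest autocorrelation peak inside the allowed lag range gives the pitch period in samples; dividing the sample rate by that lag gives the frequency estimate.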
Formant Frequency
Concentration of acoustic energy around a particular frequency
Formants occur at roughly 1000 Hz intervals
Caused by resonances of the vocal tract
Spectrogram - darkness indicates the strength of a formant
Spectrogram
Speech Production Model
$$S(f) = \bigl(v\,P(f) + u\,N(f)\bigr)\,H(f)\,R(f) = X(f)\,H(f)\,R(f)$$
The mixture between voiced and unvoiced excitation is determined by v and u
The fundamental frequency is determined by P(f)
The spectral shaping is determined by H(f)
The signal amplitude depends on v and u
About Automatic Speech Recognition
Automatic Speech Recognition - advancing and challenging
Most research work - English, Arabic, Mandarin
Native Indian languages - minimal work
Industry - AT&T, Nuance, IBM
Open source - VoxForge
Classifying ASR system
System contains two subsystems:
    ASR - Transcribe natural speech
    SU - Understand the meaning of transcribed speech
ASR system classified as:
    DVI - Direct Voice Input
    LVCSR - Large Vocabulary Continuous Speech Recognition
Block Diagram of ASR
Acoustic Properties - Linguistic representation
Initial acquisition - Signal transduction or Recording
Feature extraction - Spectral Analysis
Segmentation - Phoneme Boundary Recognition
Approaches of ASR
Template Based Approach
Knowledge Based Approach
Statistical Approach
Conversational Recognition
Recognition using Learning Approach
Artificial Intelligence in Recognition
Implementation Challenges
Successive Recognition - Artificial Pauses
Continuous speech recognition - Coarticulation
Physiological parameters
Prosody and Temporal features
Database Preparation
Most important phase for training and recognition accuracy
50 speakers - 25 male and 25 female
10 words repeated 20 times each
10000 recorded words in total (50 × 10 × 20)
35 speakers used for training and 15 reserved for recognition
Utterances converted to the cepstral domain
Optimization for HMM parameter determination
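The corpus sizes follow directly from the figures on this slide. A small worked check, assuming every speaker recorded every word the full number of times (the variable names are illustrative):

```python
speakers, words, repetitions = 50, 10, 20
train_speakers, test_speakers = 35, 15

# Every speaker says each of the 10 digits 20 times.
total_utterances = speakers * words * repetitions
train_utterances = train_speakers * words * repetitions
test_utterances = test_speakers * words * repetitions
```

Under that assumption the 35/15 speaker split corresponds to 7000 training utterances and 3000 held-out utterances.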
Feature Extraction
Temporal features - SPEAKER recognition
Spectral features - SPEECH recognition
Critical band filter
Cepstral analysis
$$S(k) = \sum_{n=0}^{N-1} s(n)\, e^{-j 2\pi n k / N} \tag{1}$$
$$\hat{S}(k) = \log S(k) \tag{2}$$
$$\hat{s}(n) = \frac{1}{N} \sum_{k=0}^{N-1} \hat{S}(k)\, e^{j 2\pi n k / N} \tag{3}$$
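The three cepstral-analysis steps can be sketched with NumPy's FFT routines. This sketch takes the logarithm of the spectral magnitude (the real cepstrum), which is an assumption; the slide does not say whether the magnitude or the complex logarithm was used. The echo signal is a made-up test input.

```python
import numpy as np

def real_cepstrum(frame):
    """DFT, then log magnitude, then inverse DFT."""
    spectrum = np.fft.fft(frame)                      # step (1): DFT
    log_magnitude = np.log(np.abs(spectrum) + 1e-12)  # step (2): log, magnitude only
    return np.real(np.fft.ifft(log_magnitude))        # step (3): inverse DFT

# Toy input: an impulse followed by a half-amplitude echo 10 samples later.
frame = np.zeros(64)
frame[0] = 1.0
frame[10] = 0.5
cepstrum = real_cepstrum(frame)
echo_quefrency = 1 + int(np.argmax(cepstrum[1:32]))
```

The echo shows up as a peak at quefrency 10, which is the property that makes the cepstral domain useful for separating excitation from vocal-tract shaping.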
MFCC
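Take the Fourier transform of a windowed signal
Map the power spectrum onto the mel scale
Take the log of the power at each mel frequency
Take the DCT of the mel log powers
The MFCCs are the amplitudes of the resulting spectrum
Normalising
Raising log mel amplitudes to higher powers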
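The first five MFCC steps can be sketched for a single frame as follows. This is a minimal illustration assuming NumPy, a Hamming window, 26 triangular filters and 12 cepstral coefficients. These settings and the function names are assumptions, not the project's configuration, and the normalising and power-raising steps are left out.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters with centres spaced evenly on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    bank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, centre, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, centre):        # rising edge
            bank[i - 1, k] = (k - left) / (centre - left)
        for k in range(centre, right):       # falling edge
            bank[i - 1, k] = (right - k) / (right - centre)
    return bank

def mfcc_frame(frame, sample_rate, n_filters=26, n_ceps=12):
    n_fft = len(frame)
    windowed = frame * np.hamming(n_fft)
    power = np.abs(np.fft.rfft(windowed)) ** 2             # power spectrum
    mel_energies = mel_filterbank(n_filters, n_fft, sample_rate) @ power
    log_mel = np.log(mel_energies + 1e-10)                 # log of each mel band
    # Unnormalised DCT-II of the log mel energies; keep coefficients 1..n_ceps.
    j = np.arange(n_filters)
    basis = np.cos(np.pi * np.outer(np.arange(1, n_ceps + 1), j + 0.5) / n_filters)
    return basis @ log_mel

sample_rate = 8000
t = np.arange(256) / sample_rate
frame = np.sin(2 * np.pi * 440.0 * t)
coeffs = mfcc_frame(frame, sample_rate)
```

A real front end would slide this over overlapping frames of each utterance to produce one coefficient vector per frame.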
MFCC for HTK
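Basic MFCCs are static (one vector per frame)
Performance improves when time derivatives are appended
Delta coefficients (D)
Acceleration coefficients (A)
Third differential coefficients
Absolute energy can optionally be suppressed
Vocal Tract Length Normalisation (VTLN)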
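Delta and acceleration coefficients are commonly computed with a regression over neighbouring frames, d_t = Σθ θ (c_{t+θ} - c_{t-θ}) / (2 Σθ θ²). The sketch below assumes that formula, a window of 2 frames, and edge frames padded by repetition; the function name and the toy ramp input are illustrative.

```python
import numpy as np

def delta(features, window=2):
    """Regression-based time derivative of a (frames, coeffs) array."""
    n_frames = features.shape[0]
    # Repeat the first and last frames so every frame has full context.
    padded = np.pad(features, ((window, window), (0, 0)), mode="edge")
    denom = 2.0 * sum(theta ** 2 for theta in range(1, window + 1))
    out = np.zeros_like(features, dtype=float)
    for theta in range(1, window + 1):
        ahead = padded[window + theta: window + theta + n_frames]
        behind = padded[window - theta: window - theta + n_frames]
        out += theta * (ahead - behind)
    return out / denom

static = np.arange(10.0).reshape(-1, 1)   # toy one-coefficient track: a linear ramp
deltas = delta(static)                    # first time derivative
accels = delta(deltas)                    # acceleration = delta of the deltas
observation = np.hstack([static, deltas, accels])
```

Stacking static, delta and acceleration columns triples the dimension of each observation vector, which is how a 13-dimensional static vector becomes the usual 39-dimensional one.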
HMM for isolated word recognition
Conventional method - concatenation of isolated words
The recognizer maps between sequences of speech vectors and symbol sequences
A one-to-one mapping is difficult because different underlying symbols can produce similar sounds
Boundaries between symbols cannot be identified explicitly
The sequence of speech vectors corresponding to each word is assumed to be generated by a Markov model
A Markov model is a finite state machine which changes state once every time unit
A Markov Generation Model
Bayesian interpretation - finite-state Bayesian model with a Markovian prior
$$\Theta^{*} = \arg\max_{\Theta} \Bigl\{ P(\Theta) \sum_{s \in S} P(s \mid \Theta)\, P(Y \mid s, \Theta) \Bigr\} \tag{4}$$
HMM for isolated word recognition
Modeling of HMM - HTK
A six-state model moves through the state sequence X = 1, 2, 2, 3, 4, 4, 5, 6 to generate the observation sequence o1 to o6
$$P(O, X \mid M) = a_{12}\, b_2(o_1)\, a_{22}\, b_2(o_2)\, a_{23}\, b_3(o_3) \cdots$$
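The joint probability of the observations and this one state sequence is a product of transition and output probabilities along the path. The sketch below evaluates it in the log domain. All numeric values are hypothetical placeholders, and states 1 and 6 are treated as non-emitting entry and exit states, which is an assumption consistent with eight states generating six observations.

```python
import math

# Hypothetical transition probabilities a[(i, j)] for a left-to-right model.
a = {
    (1, 2): 1.0,
    (2, 2): 0.6, (2, 3): 0.4,
    (3, 3): 0.5, (3, 4): 0.5,
    (4, 4): 0.7, (4, 5): 0.3,
    (5, 5): 0.8, (5, 6): 0.2,
}

# Hypothetical output likelihoods b[(j, t)] = b_j(o_t) for the frames on the path.
b = {
    (2, 1): 0.9, (2, 2): 0.8,
    (3, 3): 0.7,
    (4, 4): 0.6, (4, 5): 0.5,
    (5, 6): 0.4,
}

def joint_log_prob(path, a, b):
    """log P(O, X | M) for one state sequence X with non-emitting end states."""
    log_p = 0.0
    for t in range(1, len(path)):
        prev, curr = path[t - 1], path[t]
        log_p += math.log(a[(prev, curr)])
        if t < len(path) - 1:              # the exit state emits nothing
            log_p += math.log(b[(curr, t)])
    return log_p

X = [1, 2, 2, 3, 4, 4, 5, 6]
log_p = joint_log_prob(X, a, b)
```

Working with logarithms avoids numerical underflow once utterances are hundreds of frames long.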
What is HTK?
Hidden Markov Model Toolkit
Developed at Cambridge University; rights later held by Microsoft
Used for OCR, WSN and speech recognition
39 tools and customized tools ...
Wide variety of options; because of time limitations only the defaults were used
Portable
HTK Familiarisation
Tool      Function
HParse    Parsing a grammar in Backus-Naur form
HDMan     Dictionary creation in HTK format
HLEd      MLF file manipulation
HCopy     Feature extraction - acoustic analysis
HCompV    HMM prototype creation
HRest     Training - Baum-Welch
HHEd      HMM manipulation
HVite     Viterbi decoding
HResults  Gives the result
Analysis and Result
Database of 50 speakers, split two ways:
    25 training and 25 testing
    35 training and 15 testing
Speaker dependent - 90% accuracy
Speaker independent - 83% accuracy
Confusions
Number      Confusion
0           3
1           3
2           -
3           1
4           -
5           -
6           -
7           8
8           7
9           -
START SIL   -
END SIL     -
Future Work
Extended word recognition system
MS SDK
Acoustic unstable field
System can be easily adapted to continuous speech
Real time recognition
Blocksets to existing tools
Conclusion
Speech - Technical approach
ASR
    Approaches
    Challenges
Feature Extraction
HTK Familiarization
Inaccuracy - Lack of Database
Extended digit recognition
Appendix
Mel Scale and Cepstrum
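Convert Hz to mel
$$m = 2595 \log_{10}\!\left(1 + \frac{f}{700}\right) = 1127 \ln\!\left(1 + \frac{f}{700}\right) \tag{5}$$
Gunnar Fant proposed
$$m = \frac{1000}{\log_{10} 2}\, \log_{10}\!\left(1 + \frac{f}{1000}\right) \tag{6}$$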
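The two conversions can be compared numerically. A minimal sketch, assuming the commonly quoted constants 2595 (base-10 form) and 1127 (natural-log form) for the first formula; the function names are illustrative.

```python
import math

def hz_to_mel(f_hz):
    """Mel value from the base-10 form with the 700 Hz break frequency."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def hz_to_mel_ln(f_hz):
    """Same mapping written with the natural logarithm."""
    return 1127.0 * math.log(1.0 + f_hz / 700.0)

def hz_to_mel_fant(f_hz):
    """Fant's variant with a 1000 Hz break frequency."""
    return (1000.0 / math.log10(2.0)) * math.log10(1.0 + f_hz / 1000.0)

mel_at_1k = hz_to_mel(1000.0)
mel_at_1k_ln = hz_to_mel_ln(1000.0)
fant_at_1k = hz_to_mel_fant(1000.0)
```

Both scales are anchored so that 1000 Hz maps to about 1000 mel; they differ in how quickly the mapping compresses higher frequencies.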
Real time understanding of HMM
Evaluate - Forward Algorithm
Decode - Viterbi
Train - Baum Welch
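The evaluate and decode problems can be sketched on a toy discrete HMM. The two-state model and its probabilities below are made-up illustration values, not parameters from the digit recognizer, and NumPy is assumed.

```python
import numpy as np

def forward(pi, A, B, obs):
    """Evaluate: total likelihood P(O | model) via the forward algorithm."""
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
    return alpha.sum()

def viterbi(pi, A, B, obs):
    """Decode: most likely state sequence, computed in the log domain."""
    delta = np.log(pi) + np.log(B[:, obs[0]])
    backpointers = []
    for o in obs[1:]:
        scores = delta[:, None] + np.log(A)   # scores[i, j]: best path into i, then i -> j
        backpointers.append(scores.argmax(axis=0))
        delta = scores.max(axis=0) + np.log(B[:, o])
    state = int(delta.argmax())
    path = [state]
    for bp in reversed(backpointers):         # trace the best path backwards
        state = int(bp[state])
        path.append(state)
    return path[::-1]

pi = np.array([0.6, 0.4])                     # initial state probabilities
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])                    # transition probabilities
B = np.array([[0.5, 0.4, 0.1],
              [0.1, 0.3, 0.6]])               # emission probabilities per symbol
obs = [0, 1, 2]

likelihood = forward(pi, A, B, obs)
best_path = viterbi(pi, A, B, obs)
```

The forward pass sums over all state sequences, while Viterbi keeps only the best one at each step, which is why decoding returns a single path rather than a total likelihood.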
HMM Example
Viterbi Example
Baum Welch
Generalized Expectation-Maximization (GEM) algorithm
Maximum Likelihood Estimates
Posterior Mode Estimate
Re-estimates the transition and emission probabilities
New transition probability: the expected number of transitions from Si to Sj divided by the expected number of transitions out of Si
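In the usual textbook notation the re-estimation step can be written as below. The symbols are assumptions introduced here for illustration: ξ_t(i, j) is the posterior probability of being in S_i at time t and S_j at time t+1 given the observations, γ_t(i) is the posterior probability of being in S_i at time t, and v_k is the k-th discrete output symbol.

```latex
\gamma_t(i) = \sum_{j} \xi_t(i, j)

\hat{a}_{ij} = \frac{\sum_{t=1}^{T-1} \xi_t(i, j)}{\sum_{t=1}^{T-1} \gamma_t(i)}

\hat{b}_j(k) = \frac{\sum_{t \,:\, o_t = v_k} \gamma_t(j)}{\sum_{t=1}^{T} \gamma_t(j)}
```

The numerator of the transition update is the expected count of S_i to S_j transitions and the denominator is the expected count of all transitions out of S_i, matching the ratio described on the slide.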
Frequency Warping
Performed by applying the unitary warping operator U
A spectral representation on one frequency scale, with a certain frequency resolution, is transformed to another representation on a new frequency scale
Resolution is uniform on the new scale and non-uniform with respect to the old scale
Scale transform of a function
$$D_X(c) = \int_{0}^{\infty} X(f)\, \frac{e^{-j 2\pi c \ln f}}{\sqrt{f}}\, df \tag{7}$$
Inverse scale-transform
$$D_X^{\alpha}(c) = \int_{0}^{\infty} \sqrt{\alpha}\, X(\alpha f)\, \frac{e^{-j 2\pi c \ln f}}{\sqrt{f}}\, df = e^{j 2\pi c \ln \alpha}\, D_X(c) \tag{8}$$
MLE
Maximum Likelihood Estimation
Value of the parameter vector that maximizes the likelihood of the observed data
Found by searching the multi-dimensional parameter space
MLE Estimate
“A technology is a real progress when it is available to anyone” - Henry Ford
THANK YOU