Robust ASR system : Malayalam. Carrol Xavier, Mohammed Musfir, Rahmathulla, Supriya, Yasif. Guided by: Mr. Edet Bijoy K, Assistant Professor, Department of ECE, MES College of Engineering. May 3, 2012.

Robust ASR system : Malayalam


DESCRIPTION

Presented at the Main project evaluation at MES College of Engineering


Page 1: Robust ASR system : Malayalam

Robust ASR system : Malayalam

Carrol Xavier, Mohammed Musfir, Rahmathulla, Supriya, Yasif

Guided By: Mr. Edet Bijoy K

Assistant Professor

Department of ECE, MES College of Engineering

May 3, 2012

Page 2: Robust ASR system : Malayalam


Objective

To implement a prototype recognizer for the Malayalam digits 0-9 using hidden Markov models (HMMs) of speech

Page 3: Robust ASR system : Malayalam

Contents

1 Introduction
  Speech
  Automatic Speech Recognition
  Approaches of ASR

2 Implementation Methodology
  Implementation Challenges
  Database Preparation
  Feature Extraction

3 HTK Implementation
  What is HTK?
  HTK Familiarisation

4 Analysis and Result

5 Future Work

6 Conclusion

Page 10: Robust ASR system : Malayalam

What is Speech?

Produced when air from the lungs passes through the glottis, throat and mouth

Excitation in three ways:

Voiced excitation
Unvoiced excitation
Transient excitation

Some sounds - Combinations of the three excitations

Spectral Changes - Vocal Tract

Page 14: Robust ASR system : Malayalam

Pictorial Representation of “SHOP”

Page 15: Robust ASR system : Malayalam

Characteristics of Speech

Bandwidth - 4 kHz

Fundamental Frequency - Depends on the type of articulation

Peaks in the Spectrum -

Voiced excitation - P(f) - Triangular Pulse
Unvoiced excitation - a white noise generator

Pitch Extraction:

Rabiner Gold Pitch Tracker
Autocorrelation Pitch Tracker

Page 16: Robust ASR system : Malayalam

Pitch Extraction - Autocorrelation
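The autocorrelation approach named on this slide can be sketched as follows. This is a minimal illustration, not the tracker used in the project: the function name `autocorrelation_pitch`, the 50-400 Hz search range and the test tone are all assumptions made for the example. The idea is that a voiced frame resembles a copy of itself shifted by one pitch period, so the lag with the largest autocorrelation gives the period.

```python
import math


def autocorrelation_pitch(signal, sample_rate, f_min=50.0, f_max=400.0):
    """Estimate the fundamental frequency (Hz) of one voiced frame.

    f_min and f_max bound the search and are illustrative defaults.
    Returns None when no positive autocorrelation peak is found.
    """
    n = len(signal)
    mean = sum(signal) / n
    centred = [s - mean for s in signal]  # remove any DC offset

    # Convert the frequency bounds into a range of candidate lags (in samples).
    lag_min = max(1, int(sample_rate / f_max))
    lag_max = min(int(sample_rate / f_min), n - 1)

    best_lag, best_value = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Autocorrelation at this lag: similarity of the frame with a shifted copy.
        value = sum(centred[i] * centred[i + lag] for i in range(n - lag))
        if value > best_value:
            best_lag, best_value = lag, value

    return None if best_lag is None else sample_rate / best_lag


if __name__ == "__main__":
    rate = 8000  # hypothetical sampling rate
    tone = [math.sin(2 * math.pi * 200 * i / rate) for i in range(800)]
    print(autocorrelation_pitch(tone, rate))
```

Real speech would normally be split into short frames first, with a voicing decision made per frame; both steps are left out of this sketch.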

Page 17: Robust ASR system : Malayalam

Formant Frequency

Concentration of acoustic energy around a particular frequency

Spaced at roughly 1000 Hz intervals

Caused by resonance in the vocal tract

Spectrogram - Darkness indicates the strength of a formant

Page 18: Robust ASR system : Malayalam

Spectrogram

Page 19: Robust ASR system : Malayalam

Speech Production Model

$$S(f) = \bigl(v\,P(f) + u\,N(f)\bigr)\,H(f)\,R(f) = X(f)\,H(f)\,R(f)$$

The mixture between voiced and unvoiced excitation is determined by v and u

The fundamental frequency is determined by P(f)

The spectral shaping is determined by H(f)

The signal amplitude depends on v and u

Page 20: Robust ASR system : Malayalam

About Automatic Speech Recognition

Automatic Speech Recognition - Advancing and challenging

Most research work - English, Arabic, Mandarin

Native Indian Languages - Minimal work

Industry - AT&T, Nuance, IBM

Open Source - VoxForge

Page 25: Robust ASR system : Malayalam

Classifying ASR system

System contains two subsystems:

ASR - Transcribe natural speech
SU - Understand the meaning of transcribed speech

ASR system classified as:

DVI - Direct Voice Input
LVCSR - Large Vocabulary Continuous Speech Recognition

Page 26: Robust ASR system : Malayalam

Block Diagram of ASR

Acoustic Properties - Linguistic representation

Initial acquisition - Signal transduction or Recording

Feature extraction - Spectral Analysis

Segmentation - Phoneme Boundary Recognition

Page 27: Robust ASR system : Malayalam

Approaches of ASR

Template Based Approach

Knowledge Based Approach

Statistical Approach

Conversational Recognition

Recognition using Learning Approach

Artificial Intelligence in Recognition

Page 29: Robust ASR system : Malayalam

Implementation Challenges

Successive Recognition - Artificial Pauses

Continuous speech recognition - Coarticulation

Physiological parameters

Prosody and Temporal features

Page 30: Robust ASR system : Malayalam

Database Preparation

Most important phase for training and recognition accuracy

50 speakers - 25 males and 25 females

10 words repeated 20 times each

10,000 utterances in total (50 speakers x 10 words x 20 repetitions)

35 speakers used for training and 15 reserved for recognition

Utterances converted to the cepstral domain

Optimization for HMM parameter determination
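As a quick check of the corpus arithmetic on this slide, the following sketch works out how many utterances fall on each side of the 35/15 speaker split. The function and variable names are hypothetical, and it assumes every speaker recorded the full set of 10 words x 20 repetitions.

```python
def corpus_split(speakers=50, words=10, repetitions=20, train_speakers=35):
    """Count utterances in the whole corpus and in each side of a speaker split."""
    per_speaker = words * repetitions          # utterances recorded by one speaker
    total = speakers * per_speaker             # size of the whole database
    train = train_speakers * per_speaker       # speakers used for training
    test = (speakers - train_speakers) * per_speaker  # speakers held out for recognition
    return {"per_speaker": per_speaker, "total": total, "train": train, "test": test}


if __name__ == "__main__":
    print(corpus_split())
```

Under that assumption the split gives 7,000 training utterances and 3,000 held-out utterances out of 10,000.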

Page 31: Robust ASR system : Malayalam

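Feature Extraction

Temporal - SPEAKER Recognition
Spectral - SPEECH Recognition

Critical band filter
Cepstral Analysis

$$S(k) = \sum_{n=0}^{N-1} s(n)\,\exp\!\left(-j\,\frac{2\pi}{N}\,nk\right) \tag{1}$$

$$\hat{S}(k) = \log\bigl(S(k)\bigr) \tag{2}$$

$$\hat{s}(n) = \frac{1}{N}\sum_{k=0}^{N-1} \hat{S}(k)\,\exp\!\left(j\,\frac{2\pi}{N}\,nk\right) \tag{3}$$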
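Equations (1) to (3) can be traced in a few lines of code. The sketch below is an illustration under two assumptions that are not on the slide: it takes the logarithm of the magnitude |S(k)| (the real cepstrum), and it floors very small magnitudes to avoid log(0). The DFT is written out naively so that each line maps onto one equation; the function names are hypothetical.

```python
import cmath
import math


def dft(samples):
    """Equation (1): forward discrete Fourier transform."""
    n_pts = len(samples)
    return [
        sum(samples[n] * cmath.exp(-2j * math.pi * n * k / n_pts) for n in range(n_pts))
        for k in range(n_pts)
    ]


def inverse_dft(spectrum):
    """Equation (3): inverse transform, with the 1/N factor."""
    n_pts = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * n * k / n_pts) for k in range(n_pts)) / n_pts
        for n in range(n_pts)
    ]


def real_cepstrum(samples, floor=1e-12):
    """Cepstrum of one frame, using log|S(k)| in place of equation (2)."""
    spectrum = dft(samples)
    # Assumption: log of the magnitude, floored so that silent bins do not give log(0).
    log_spectrum = [math.log(max(abs(value), floor)) for value in spectrum]
    return [value.real for value in inverse_dft(log_spectrum)]


if __name__ == "__main__":
    frame = [2.0] + [0.0] * 7  # a scaled unit impulse has a flat spectrum
    print(real_cepstrum(frame))
```

For a scaled impulse the spectrum is flat, so only the zeroth cepstral value is non-zero; that makes a convenient sanity check.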

Page 32: Robust ASR system : Malayalam

MFCC

Fourier transform of a windowed signal

Map the power spectrum onto the mel scale

Take the log of the power at each mel frequency

DCT of the list of mel log powers

MFCCs - Amplitudes of the resulting spectrum

Normalising

Raising log mel amplitudes to higher powers
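The first five steps listed above can be sketched for a single frame as follows. This is an illustrative outline, not the HTK front end: the Hamming window, the triangular filter shape, the 20 filters, the 12 coefficients and all function names are assumptions chosen for the example, and the frame length is assumed to be even. The normalising and power-raising steps are omitted.

```python
import cmath
import math


def hz_to_mel(freq_hz):
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Step 1: Hamming-window the frame and take |DFT|^2 up to the Nyquist bin."""
    n_pts = len(frame)
    windowed = [
        s * (0.54 - 0.46 * math.cos(2 * math.pi * n / (n_pts - 1)))
        for n, s in enumerate(frame)
    ]
    return [
        abs(sum(windowed[n] * cmath.exp(-2j * math.pi * n * k / n_pts) for n in range(n_pts))) ** 2
        for k in range(n_pts // 2 + 1)
    ]


def mel_filterbank(n_filters, n_bins, sample_rate):
    """Step 2: triangular filters whose centres are equally spaced on the mel scale."""
    top_mel = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(top_mel * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = (sample_rate / 2.0) / (n_bins - 1)
    bank = []
    for m in range(1, n_filters + 1):
        low, centre, high = edges[m - 1], edges[m], edges[m + 1]
        weights = []
        for b in range(n_bins):
            freq = b * bin_hz
            if low < freq <= centre:
                weights.append((freq - low) / (centre - low))
            elif centre < freq < high:
                weights.append((high - freq) / (high - centre))
            else:
                weights.append(0.0)
        bank.append(weights)
    return bank


def mfcc(frame, sample_rate, n_filters=20, n_ceps=12):
    spectrum = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(spectrum), sample_rate)
    # Step 3: log of the power collected by each mel filter (floored to avoid log(0)).
    log_energy = [
        math.log(max(sum(w * p for w, p in zip(weights, spectrum)), 1e-10))
        for weights in bank
    ]
    # Steps 4-5: DCT-II of the log energies; coefficients 1..n_ceps are the MFCCs.
    return [
        sum(e * math.cos(math.pi * i * (j + 0.5) / n_filters) for j, e in enumerate(log_energy))
        for i in range(1, n_ceps + 1)
    ]


if __name__ == "__main__":
    rate = 8000  # hypothetical sampling rate
    frame = [math.sin(2 * math.pi * 1000 * n / rate) for n in range(256)]
    print(mfcc(frame, rate))
```

Because coefficient 0 is skipped, scaling the frame by a constant gain leaves these coefficients essentially unchanged, which is one reason the overall energy is usually carried as a separate feature.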

Page 33: Robust ASR system : Malayalam

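MFCC for HTK

Basic MFCCs are usually static

Performance improves when time derivatives are appended

Delta coefficients - qualifier D

Acceleration coefficients - qualifier A

Third differential coefficients

Absolute energy can optionally be suppressed

Vocal Tract Length Normalisation (VTLN)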
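The delta and acceleration coefficients mentioned above are time derivatives estimated by a regression over neighbouring frames. The sketch below uses the regression formula d_t = sum over theta of theta (c_{t+theta} - c_{t-theta}) divided by 2 sum theta^2. The window size of 2 and the repetition of the first and last frames at the edges are assumptions made for this example, and the function name `delta` is hypothetical.

```python
def delta(features, window=2):
    """Regression-based time derivative of a sequence of feature vectors.

    features: list of frames, each a list of static coefficients.
    window:   number of frames looked at on each side (assumed value).
    """
    n_frames = len(features)
    denominator = 2.0 * sum(theta * theta for theta in range(1, window + 1))
    result = []
    for t in range(n_frames):
        row = []
        for d in range(len(features[t])):
            total = 0.0
            for theta in range(1, window + 1):
                # Assumption: frames beyond either end are replaced by the edge frame.
                ahead = features[min(t + theta, n_frames - 1)][d]
                behind = features[max(t - theta, 0)][d]
                total += theta * (ahead - behind)
            row.append(total / denominator)
        result.append(row)
    return result


if __name__ == "__main__":
    static = [[float(t)] for t in range(6)]  # a ramp rising by 1 per frame
    first_order = delta(static)              # delta coefficients
    second_order = delta(first_order)        # acceleration coefficients
    print(first_order, second_order)
```

Applying the same function to its own output gives the acceleration coefficients, and a third pass would give the third differential.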

Page 34: Robust ASR system : Malayalam

HMM for isolated word recognition

Normal method - Isolated word concatenation

The recognizer maps between sequences of speech vectors and symbol sequences

A one-to-one mapping is difficult because different underlying symbols can produce similar sounds

Boundaries between symbols cannot be identified explicitly

The sequence of speech vectors corresponding to each word is assumed to be generated by a Markov model

A Markov model is a finite state machine which changes state once every time unit

Page 35: Robust ASR system : Malayalam

A Markov Generation Model

Bayesian Interpretation - Finite state Bayesian model with Markovian prior

$$\Theta^{*} = \underset{\Theta}{\arg\max}\,\Bigl\{ P(\Theta) \sum_{s \in S} P(s \mid \Theta)\,P(Y \mid s, \Theta) \Bigr\} \tag{4}$$

Page 36: Robust ASR system : Malayalam

HMM for isolated word recognition

Modeling of HMM - HTK

A six-state model moves through the state sequence X = 1, 2, 2, 3, 4, 4, 5, 6 to generate the sequence o_1 to o_6

$$P(O, X \mid M) = a_{12}\,b_2(o_1)\,a_{22}\,b_2(o_2)\,a_{23}\,b_3(o_3)\,\dots$$
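The product above can be evaluated mechanically once the transition probabilities a_ij and output probabilities b_j(o) are known. The sketch below does this for the state sequence on the slide, treating states 1 and 6 as non-emitting entry and exit states. All probability values, the two-symbol alphabet and the function name are made up for illustration; a real model would use trained, usually continuous, output densities.

```python
def joint_probability(transitions, emissions, states, observations):
    """P(O, X | M) for one fixed state sequence X.

    states:       full path including the non-emitting entry and exit states.
    observations: one symbol per emitting state visited, in order.
    """
    probability = 1.0
    # Pair each transition (previous -> current) with the symbol emitted in `current`.
    for previous, current, symbol in zip(states, states[1:-1], observations):
        probability *= transitions[(previous, current)] * emissions[current][symbol]
    # Final transition into the non-emitting exit state.
    probability *= transitions[(states[-2], states[-1])]
    return probability


# Hypothetical parameters for a left-to-right model with emitting states 2..5.
TRANSITIONS = {
    (1, 2): 1.0,
    (2, 2): 0.5, (2, 3): 0.5,
    (3, 3): 0.5, (3, 4): 0.5,
    (4, 4): 0.5, (4, 5): 0.5,
    (5, 5): 0.5, (5, 6): 0.5,
}
EMISSIONS = {state: {"a": 0.5, "b": 0.5} for state in (2, 3, 4, 5)}

if __name__ == "__main__":
    path = [1, 2, 2, 3, 4, 4, 5, 6]
    symbols = ["a", "a", "b", "b", "a", "b"]
    print(joint_probability(TRANSITIONS, EMISSIONS, path, symbols))
```

In recognition the state sequence is unknown, so this quantity is either summed over all paths or maximised over them.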

Page 38: Robust ASR system : Malayalam

What is HTK?

HMM Toolkit

Cambridge University - Initially by MS

Used for OCR, WSN and Speech Recognition

39 tools and customized tools ...

Wide variety of options - only the defaults were used because of time limitations

Portable

Page 39: Robust ASR system : Malayalam

HTK Familiarisation

Tool      Function
HParse    Parsing a grammar in Backus-Naur form
HDMan     Dictionary creation in HTK format
HLEd      MLF file manipulation
HCopy     Feature extraction - Acoustic analysis
HCompV    HMM prototype creation
HRest     Training - Baum Welch
HHEd      HMM manipulation
HVite     Viterbi decoding
HResults  Gives the result

Page 40: Robust ASR system : Malayalam

Analysis and Result

50 types of database - 25 training and 25 Testing

35 training and 15 testing

Speaker dependent - 90%

Speaker Independent - 83%

Page 41: Robust ASR system : Malayalam

Confusions

Number      Confusion
0           3
1           3
2           -
3           1
4           -
5           -
6           -
7           8
8           7
9           -
START SIL   -
END SIL     -

Page 42: Robust ASR system : Malayalam

Future Work

Extended word recognition system

MS SDK

Acoustic unstable field

System can be easily adapted to continuous speech

Real time recognition

Blocksets to existing tools

Page 48: Robust ASR system : Malayalam

Conclusion

Speech - Technical approach

ASR

Approaches
Challenges

Feature Extraction

HTK Familiarization

Inaccuracy - Lack of Database

Extended digit recognition

Page 56: Robust ASR system : Malayalam

Appendix

Page 57: Robust ASR system : Malayalam

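Mel Scale and Cepstrum

Convert Hz to Mel

$$m = 2595\,\log_{10}\!\left(1 + \frac{f}{700}\right) = 1127\,\log_{e}\!\left(1 + \frac{f}{700}\right) \tag{5}$$

Gunnar Fant proposed

$$m = \frac{1000}{\log_{10} 2}\,\log_{10}\!\left(1 + \frac{f}{1000}\right) \tag{6}$$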
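The two conversions can be compared numerically with a short sketch. The function names are hypothetical; the constants are those of the common 2595 log10(1 + f/700) form (equivalently 1127 ln(1 + f/700)) and of Fant's 1000 log2(1 + f/1000) form. Both are constructed so that 1000 Hz maps to about 1000 mel.

```python
import math


def hz_to_mel(freq_hz):
    """Mel value from the 2595 * log10(1 + f/700) form."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def hz_to_mel_natural_log(freq_hz):
    """Same curve written with the natural logarithm (constant rounded to 1127)."""
    return 1127.0 * math.log(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def hz_to_mel_fant(freq_hz):
    """Fant's form: 1000 / log(2) * log(1 + f/1000)."""
    return 1000.0 / math.log10(2.0) * math.log10(1.0 + freq_hz / 1000.0)


if __name__ == "__main__":
    for freq in (100.0, 500.0, 1000.0, 4000.0):
        print(freq, hz_to_mel(freq), hz_to_mel_natural_log(freq), hz_to_mel_fant(freq))
```

The two forms agree near 1 kHz and drift apart towards the ends of the speech band, which matters mainly when filter centres are placed on the scale.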

Page 58: Robust ASR system : Malayalam

Real time understanding of HMM

Evaluate - Forward Algorithm

Decode - Viterbi

Train - Baum Welch
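The evaluation problem, that is the probability of an observation sequence given a model, is what the forward algorithm solves. Below is a minimal sketch for a discrete HMM. The two-state model and its numbers are invented for the example, the function name is hypothetical, and probabilities are multiplied directly rather than handled in the log domain as a practical recognizer would.

```python
def forward_likelihood(initial, transitions, emissions, observations):
    """P(O | model) for a discrete HMM via the forward recursion.

    initial[i]        probability of starting in state i
    transitions[i][j] probability of moving from state i to state j
    emissions[i][k]   probability that state i emits symbol k
    """
    n_states = len(initial)
    # Initialisation: alpha_1(i) = pi_i * b_i(o_1)
    alpha = [initial[i] * emissions[i][observations[0]] for i in range(n_states)]
    # Induction: alpha_{t+1}(j) = (sum_i alpha_t(i) * a_ij) * b_j(o_{t+1})
    for symbol in observations[1:]:
        alpha = [
            sum(alpha[i] * transitions[i][j] for i in range(n_states)) * emissions[j][symbol]
            for j in range(n_states)
        ]
    # Termination: sum over the final states.
    return sum(alpha)


# Hypothetical two-state model with a two-symbol alphabet.
INITIAL = [0.6, 0.4]
TRANSITIONS = [[0.7, 0.3], [0.4, 0.6]]
EMISSIONS = [[0.5, 0.5], [0.1, 0.9]]

if __name__ == "__main__":
    print(forward_likelihood(INITIAL, TRANSITIONS, EMISSIONS, [0, 1]))
```

For isolated-word recognition, one such likelihood would be computed per word model and the largest one chosen.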

Page 59: Robust ASR system : Malayalam

HMM Example

Page 60: Robust ASR system : Malayalam

Viterbi Example
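Since the figure for this slide is not reproduced here, a small numerical example may help. The sketch below runs Viterbi decoding on a hypothetical two-state discrete HMM; the model values and the function name are assumptions for illustration only. It keeps, for every state and time step, the probability of the best path ending there and a back-pointer to the previous state.

```python
def viterbi(initial, transitions, emissions, observations):
    """Most likely state sequence and its probability for a discrete HMM."""
    n_states = len(initial)
    # best[i]: probability of the best path that ends in state i at the current time.
    best = [initial[i] * emissions[i][observations[0]] for i in range(n_states)]
    back_pointers = []
    for symbol in observations[1:]:
        step_best, step_back = [], []
        for j in range(n_states):
            # Choose the predecessor that maximises the path probability into state j.
            prev = max(range(n_states), key=lambda i: best[i] * transitions[i][j])
            step_back.append(prev)
            step_best.append(best[prev] * transitions[prev][j] * emissions[j][symbol])
        best = step_best
        back_pointers.append(step_back)
    # Trace back from the most probable final state.
    state = max(range(n_states), key=lambda i: best[i])
    probability = best[state]
    path = [state]
    for step_back in reversed(back_pointers):
        state = step_back[state]
        path.append(state)
    path.reverse()
    return path, probability


# Hypothetical two-state model with a two-symbol alphabet.
INITIAL = [0.6, 0.4]
TRANSITIONS = [[0.7, 0.3], [0.4, 0.6]]
EMISSIONS = [[0.5, 0.5], [0.1, 0.9]]

if __name__ == "__main__":
    print(viterbi(INITIAL, TRANSITIONS, EMISSIONS, [0, 1, 1]))
```

The only difference from the forward recursion is that the sum over predecessor states is replaced by a maximum, plus the bookkeeping needed to recover the path.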

Page 61: Robust ASR system : Malayalam

Baum Welch

Generalized Expectation-Maximization (GEM) algorithm

Maximum Likelihood Estimates

Posterior Mode Estimate

Transition and Emission probabilities

Transition probability re-estimated by dividing the expected number of transitions from Si to Sj by the expected number of transitions from Si
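The ratio of expected counts described on this slide can be sketched for the transition probabilities alone. The code below performs one re-estimation pass on a hypothetical two-state discrete HMM using forward and backward variables. It is a simplified illustration: emission and initial probabilities are left unchanged, a single observation sequence is used, no scaling is applied, and all names and numbers are assumptions.

```python
def forward(initial, transitions, emissions, observations):
    """alpha[t][i]: probability of o_1..o_t with state i at time t."""
    n = len(initial)
    alpha = [[initial[i] * emissions[i][observations[0]] for i in range(n)]]
    for symbol in observations[1:]:
        prev = alpha[-1]
        alpha.append([
            sum(prev[i] * transitions[i][j] for i in range(n)) * emissions[j][symbol]
            for j in range(n)
        ])
    return alpha


def backward(transitions, emissions, observations):
    """beta[t][i]: probability of o_{t+1}..o_T given state i at time t."""
    n = len(transitions)
    beta = [[1.0] * n]
    for symbol in reversed(observations[1:]):
        nxt = beta[0]
        beta.insert(0, [
            sum(transitions[i][j] * emissions[j][symbol] * nxt[j] for j in range(n))
            for i in range(n)
        ])
    return beta


def reestimate_transitions(initial, transitions, emissions, observations):
    """New a_ij = expected transitions i -> j / expected transitions out of i."""
    alpha = forward(initial, transitions, emissions, observations)
    beta = backward(transitions, emissions, observations)
    n, steps = len(initial), len(observations) - 1
    updated = []
    for i in range(n):
        # Both counts omit the common factor 1 / P(O), which cancels in the ratio.
        out_of_i = sum(alpha[t][i] * beta[t][i] for t in range(steps))
        updated.append([
            sum(
                alpha[t][i] * transitions[i][j]
                * emissions[j][observations[t + 1]] * beta[t + 1][j]
                for t in range(steps)
            ) / out_of_i
            for j in range(n)
        ])
    return updated


# Hypothetical two-state model with a two-symbol alphabet.
INITIAL = [0.6, 0.4]
TRANSITIONS = [[0.7, 0.3], [0.4, 0.6]]
EMISSIONS = [[0.5, 0.5], [0.1, 0.9]]

if __name__ == "__main__":
    print(reestimate_transitions(INITIAL, TRANSITIONS, EMISSIONS, [0, 1, 1, 0, 1, 1, 1, 0]))
```

Each row of the updated matrix still sums to one, and repeating the pass should not lower the likelihood of the training sequence.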

Page 62: Robust ASR system : Malayalam

Frequency Warping

Performed by applying the unitary warping operator U

One spectral representation, on a certain frequency scale and with a certain frequency resolution, is transformed to another representation on a new frequency scale

Resolution is uniform on the new scale - Non-uniform with respect to the old scale

Scale transform of a function

$$D_X(c) = \int_{0}^{\infty} X(f)\,\frac{e^{-j 2\pi c \ln f}}{\sqrt{f}}\,df \tag{7}$$

Page 63: Robust ASR system : Malayalam

Inverse scale-transform

$$D_X^{\alpha}(c) = \int_{0}^{\infty} \sqrt{\alpha}\,X(\alpha f)\,\frac{e^{-j 2\pi c \ln f}}{\sqrt{f}}\,df = e^{j 2\pi c \ln \alpha}\,D_X(c) \tag{8}$$

Page 64: Robust ASR system : Malayalam

MLE

Maximum Likelihood Estimation

Value of parameter vector maximizing the probability

Searching the multi-dimensional parameter space

MLE Estimate



Page 66: Robust ASR system : Malayalam

“A technology is a real progress when it is available to anyone” - Henry Ford

THANK YOU
