GMM BASED SPEECH RECOGNITION
Objective
Speech recognition is the process of recognizing a predefined word spoken by a speaker on the basis of information contained in the speech waveform. The Gaussian Mixture Model (GMM) algorithm compares the cepstral coefficients generated from speech samples in the training and testing phases. Furthermore, this technique makes it possible to use the speaker's voice to verify their identity. This project is implemented on the ADSP 2181 processor.
Project Description
Speech recognition can be divided into two phases:
1) Training phase
2) Testing phase
In the training phase, the frequency components of the given speech signal are extracted. Each registered speaker has to provide samples of their speech (the given words) so that the system can build, or train, a reference model for that speaker. In addition, a speaker-specific threshold is computed from the training samples.
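The training step can be sketched as follows. This is a minimal illustration, not the project's code: it assumes each training utterance has already been turned into a list of feature vectors, fits a diagonal-covariance GMM with plain EM, and derives a threshold as the training score minus a fixed margin. The source does not say how the threshold is computed, so `reject_threshold` and its `margin` parameter are assumptions, as are all function names and default values.

```python
import math
import random

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return -0.5 * sum(math.log(2 * math.pi * v) + (xi - m) ** 2 / v
                      for xi, m, v in zip(x, mean, var))

def logsumexp(values):
    """Numerically stable log(sum(exp(v)))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def train_gmm(frames, num_components=2, iterations=20, var_floor=1e-3, seed=0):
    """Fit mixture weights, means, and diagonal variances with plain EM."""
    rng = random.Random(seed)
    count, dim = len(frames), len(frames[0])
    # Initialise means from random frames and variances from the global spread.
    means = [list(f) for f in rng.sample(frames, num_components)]
    g_mean = [sum(f[d] for f in frames) / count for d in range(dim)]
    g_var = [max(sum((f[d] - g_mean[d]) ** 2 for f in frames) / count, var_floor)
             for d in range(dim)]
    variances = [list(g_var) for _ in range(num_components)]
    weights = [1.0 / num_components] * num_components
    for _ in range(iterations):
        # E-step: responsibility of each component for each frame.
        resp = []
        for f in frames:
            logs = [math.log(w) + log_gauss_diag(f, m, v)
                    for w, m, v in zip(weights, means, variances)]
            norm = logsumexp(logs)
            resp.append([math.exp(l - norm) for l in logs])
        # M-step: re-estimate parameters from the responsibilities.
        for k in range(num_components):
            nk = sum(r[k] for r in resp) + 1e-10
            weights[k] = nk / count
            means[k] = [sum(r[k] * f[d] for r, f in zip(resp, frames)) / nk
                        for d in range(dim)]
            variances[k] = [max(sum(r[k] * (f[d] - means[k][d]) ** 2
                                    for r, f in zip(resp, frames)) / nk, var_floor)
                            for d in range(dim)]
    return weights, means, variances

def avg_log_likelihood(frames, model):
    """Average per-frame log-likelihood of the frames under the model."""
    weights, means, variances = model
    total = sum(logsumexp([math.log(w) + log_gauss_diag(f, m, v)
                           for w, m, v in zip(weights, means, variances)])
                for f in frames)
    return total / len(frames)

def reject_threshold(frames, model, margin=5.0):
    """Assumed rule: training score minus a fixed margin."""
    return avg_log_likelihood(frames, model) - margin
```

One model and one threshold would be stored per registered speaker and word. The variance floor keeps a component from collapsing onto a single frame, which is a common safeguard when training data is scarce.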
In the testing phase, the input speech is matched against the stored reference model(s), and the recognition decision is made on the basis of the Mel Frequency Cepstral Coefficients (MFCC) and the Gaussian Mixture Model (GMM).
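A minimal sketch of the matching step, assuming each stored reference model is a diagonal-covariance GMM given as a `(weights, means, variances)` tuple and each test utterance is a list of feature vectors. The word labels, model parameters, and threshold values below are hypothetical placeholders chosen for illustration; they are not taken from the project.

```python
import math

def frame_log_likelihood(frame, model):
    """log p(frame | model) for a diagonal-covariance GMM."""
    weights, means, variances = model
    logs = []
    for w, mean, var in zip(weights, means, variances):
        log_density = -0.5 * sum(math.log(2 * math.pi * v) + (x - m) ** 2 / v
                                 for x, m, v in zip(frame, mean, var))
        logs.append(math.log(w) + log_density)
    top = max(logs)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(l - top) for l in logs))

def utterance_score(frames, model):
    """Average per-frame log-likelihood of an utterance under one model."""
    return sum(frame_log_likelihood(f, model) for f in frames) / len(frames)

def recognize(frames, models, thresholds):
    """Return (best word or None, all scores); None means rejected."""
    scores = {word: utterance_score(frames, model)
              for word, model in models.items()}
    best = max(scores, key=scores.get)
    if scores[best] < thresholds[best]:
        return None, scores
    return best, scores

# Hypothetical two-word vocabulary with hand-set 2-D models.
MODELS = {
    "yes": ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]),
    "no":  ([0.5, 0.5], [[4.0, 4.0], [5.0, 5.0]], [[1.0, 1.0], [1.0, 1.0]]),
}
THRESHOLDS = {"yes": -10.0, "no": -10.0}

word, scores = recognize([[0.1, -0.2], [0.9, 1.1]], MODELS, THRESHOLDS)
```

Averaging over frames makes scores comparable between utterances of different length, so a single threshold per model can be applied regardless of how long the word was spoken.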
Block Diagram
Speech Input -> Framing -> Windowing -> |FFT| -> Mel-Filtering -> DCT -> Static coefficients -> GMM Classifier -> Recognized O/P
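The feature-extraction stages of the block diagram can be sketched in a few functions. This is an illustrative sketch only: the sample rate, frame length, hop size, filter count, and coefficient count are assumed values, a Hamming window is assumed for the windowing stage, a logarithm is applied before the DCT as is conventional for MFCCs, and a naive DFT stands in for the FFT to keep the example dependency-free.

```python
import cmath
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def frame_signal(signal, frame_len, hop):
    """Framing: split the signal into overlapping frames."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def hamming(frame):
    """Windowing: taper the frame edges with a Hamming window."""
    n = len(frame)
    return [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
            for i, x in enumerate(frame)]

def magnitude_spectrum(frame):
    """|FFT|: magnitudes of bins 0..N/2, computed with a naive DFT."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame)))
            for k in range(n // 2 + 1)]

def mel_filterbank(num_filters, n_fft, sample_rate):
    """Mel filtering: triangular filters spaced evenly on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    mels = [top * i / (num_filters + 1) for i in range(num_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
            for m in mels]
    bank = []
    for j in range(1, num_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):   # rising edge
            filt[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):   # falling edge, peak of 1.0 at mid
            filt[k] = (hi - k) / (hi - mid)
        bank.append(filt)
    return bank

def dct2(values, num_coeffs):
    """DCT: unnormalised DCT-II, keeping the first num_coeffs terms."""
    n = len(values)
    return [sum(v * math.cos(math.pi * c * (i + 0.5) / n)
                for i, v in enumerate(values))
            for c in range(num_coeffs)]

def mfcc(signal, sample_rate=8000, frame_len=64, hop=32,
         num_filters=12, num_coeffs=8):
    """Return one list of static cepstral coefficients per frame."""
    bank = mel_filterbank(num_filters, frame_len, sample_rate)
    features = []
    for frame in frame_signal(signal, frame_len, hop):
        power = [m * m for m in magnitude_spectrum(hamming(frame))]
        log_energies = [math.log(sum(w * p for w, p in zip(filt, power)) + 1e-10)
                        for filt in bank]
        features.append(dct2(log_energies, num_coeffs))
    return features

# Example input: a 440 Hz tone sampled at the assumed 8 kHz rate.
tone = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(256)]
features = mfcc(tone)
```

Each inner list in `features` corresponds to the static coefficients that the diagram passes on to the GMM classifier. A real implementation would replace the naive DFT with an FFT routine, since the DFT here costs O(N^2) per frame.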
Implementation
a. The speech signal is sampled through the PC port.
b. The sampled speech signals are passed to the Matlab code.
c. In the training phase, the sampled speech signal is converted to MFCCs.
d. In the testing phase, the given test signal is compared with the stored models and recognized by the GMM algorithm.
e. The recognized word is displayed on the PC.