GMM Based Speech Recognition

Objective

Speech recognition is the process of recognizing a predefined word spoken by a speaker on the basis of the information contained in the speech waveform. The Gaussian Mixture Model (GMM) algorithm compares the cepstral coefficients generated from the speech samples in the training and testing phases. The same technique also makes it possible to use the speaker’s voice to verify their identity. This project is implemented on the ADSP-2181 processor.

Project Description

Speech recognition can be divided into two phases:

1) Training Phase
2) Testing Phase

In the training phase, the frequency components of the given speech signal are extracted. Each registered speaker has to provide samples of their speech (the given words) so that the system can build, or train, a reference model for that speaker. In addition, a speaker-specific threshold is computed from the training samples.
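As an illustration of how a reference model could be trained, the sketch below fits a diagonal-covariance GMM to a matrix of feature vectors (one row per frame) with the EM algorithm, using NumPy. It is a minimal sketch in Python, not the project's own code. The function names, the number of mixture components, the iteration count, and in particular the threshold rule (training score minus a fixed margin) are assumptions made for illustration; the source does not say how the threshold is computed.

```python
import numpy as np

def log_gauss_diag(x, means, variances):
    """Log density of each row of x under each diagonal Gaussian; shape (N, K)."""
    diff = x[:, None, :] - means[None, :, :]
    return -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                   + np.sum(np.log(2.0 * np.pi * variances), axis=1))

def avg_log_likelihood(x, weights, means, variances):
    """Average per-frame log-likelihood of x under the mixture."""
    log_p = np.log(weights) + log_gauss_diag(x, means, variances)
    return float(np.mean(np.logaddexp.reduce(log_p, axis=1)))

def train_gmm(features, n_components=2, n_iter=50, seed=0):
    """Fit a diagonal-covariance GMM with EM; returns (weights, means, variances)."""
    rng = np.random.default_rng(seed)
    n, _ = features.shape
    # Initialise means from random frames and variances from the global variance.
    means = features[rng.choice(n, n_components, replace=False)]
    variances = np.tile(features.var(axis=0) + 1e-6, (n_components, 1))
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each frame.
        log_p = np.log(weights) + log_gauss_diag(features, means, variances)
        resp = np.exp(log_p - np.logaddexp.reduce(log_p, axis=1, keepdims=True))
        # M-step: re-estimate weights, means and variances.
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / n
        means = (resp.T @ features) / nk[:, None]
        variances = np.maximum((resp.T @ features ** 2) / nk[:, None] - means ** 2, 1e-6)
    return weights, means, variances

def speaker_threshold(features, model, margin=5.0):
    """Hypothetical rule (an assumption): training score minus a fixed margin."""
    return avg_log_likelihood(features, *model) - margin
```

In use, one model and one threshold would be stored per registered speaker and word, for example `model = train_gmm(mfcc_frames)` followed by `speaker_threshold(mfcc_frames, model)`, where `mfcc_frames` is a hypothetical array of training features.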

In the testing phase, the input speech is matched against the stored reference model(s), and the recognition decision is made on the basis of the Mel Frequency Cepstrum Coefficients (MFCC) and the Gaussian Mixture Model (GMM).
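The matching step can be sketched as follows: the test features are scored against every stored GMM, the best-scoring word is selected, and the result is rejected if its score falls below that model's threshold. This is a minimal Python sketch under stated assumptions, not the project's own code. The names `gmm_score` and `recognize`, the use of the average per-frame log-likelihood as the score, and the dictionary layout of `models` and `thresholds` are all hypothetical.

```python
import numpy as np

def gmm_score(features, weights, means, variances):
    """Average per-frame log-likelihood of features (N, D) under a diagonal GMM."""
    diff = features[:, None, :] - means[None, :, :]
    log_comp = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                       + np.sum(np.log(2.0 * np.pi * variances), axis=1))
    return float(np.mean(np.logaddexp.reduce(np.log(weights) + log_comp, axis=1)))

def recognize(features, models, thresholds):
    """Return (best_word, scores); best_word is None if the match is rejected.

    models maps a word to its (weights, means, variances) tuple and
    thresholds maps the same word to its rejection threshold.
    """
    scores = {word: gmm_score(features, *params) for word, params in models.items()}
    best = max(scores, key=scores.get)
    if scores[best] < thresholds[best]:
        return None, scores  # no stored model matches well enough
    return best, scores
```

Comparing log-likelihoods rather than raw probabilities keeps the scores numerically stable when many frames are combined.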

Block Diagram

Speech Input -> Framing -> Windowing -> |FFT| -> Mel Filtering -> DCT -> Static Coefficients -> GMM Classifier -> Recognized O/P
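The feature-extraction stages of the block diagram (framing, windowing, |FFT|, mel filtering, DCT) can be sketched in Python with NumPy as shown below. This is an illustrative sketch, not the project's Matlab or DSP code. The sample rate, frame length, hop size, number of filters, number of coefficients, the Hamming window, and the logarithm applied between mel filtering and the DCT (the conventional MFCC step, not shown in the diagram) are all assumptions.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale; shape (n_filters, n_fft // 2 + 1)."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):      # rising edge of the triangle
            fbank[m - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):     # falling edge of the triangle
            fbank[m - 1, k] = (right - k) / max(right - center, 1)
    return fbank

def mfcc(signal, sample_rate=8000, frame_len=256, hop=128, n_filters=20, n_ceps=12):
    """Return static cepstral coefficients, one row per frame (coefficient 0 included)."""
    # Framing: split the signal into overlapping frames.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx]
    # Windowing: taper each frame with a Hamming window.
    frames = frames * np.hamming(frame_len)
    # |FFT|: magnitude spectrum of each frame.
    mag = np.abs(np.fft.rfft(frames, n=frame_len, axis=1))
    # Mel filtering, followed by log compression (assumed step).
    fbank = mel_filterbank(n_filters, frame_len, sample_rate)
    energies = np.log(mag ** 2 @ fbank.T + 1e-10)
    # DCT (type II) decorrelates the log filterbank energies.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_ceps)[:, None] * (2 * n + 1) / (2 * n_filters))
    return energies @ basis.T
```

Calling `mfcc(samples)` on a one-dimensional NumPy array of speech samples yields the per-frame static coefficients that would be passed on to the GMM classifier.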

Implementation

a. The speech signal is sampled by means of the PC port.
b. The sampled speech signals are given to the Matlab code.
c. In the training phase, the sample speech signal is converted to MFCC codes.
d. In the testing phase, the given test signal is compared and recognized by the GMM algorithm.
e. The recognized word is displayed on the PC.
