My Project

ROBOTIC CONTROL THROUGH SPEECH

INTRODUCTION

• This voice recognition project consists of two major components, a speech recognition module and a motorized robot.

• The programmable module lets us write the program in Visual DSP++ (a programming application for the ADSP 2181 architecture).

• The motorized robot consists of two DC motors, which drive the robot in the forward and backward directions.

DEPARTMENT OF ECE 2

PROJECT DESCRIPTION

The Speaker Recognition can be classified into two phases.

1. Training Phase

2. Testing Phase

Training Phase.

• In the training phase, the frequency components of the given speech signal are extracted.

• Each registered speaker has to provide samples of their speech (the given words) so that the system can build or train a reference model for that speaker.

Testing phase

• In the testing phase, the input speech is matched with the stored reference model(s).

• The recognition decision is made on the basis of Mel Frequency Cepstrum Coefficients (MFCC).

• The recognized command is observed through the operation of the stepper motor and DC motor and the control signals sent to the DC motor.

ARCHITECTURE OF ADSP 2181

FEATURES OF ADSP 2181 PROCESSOR

• 25 ns Instruction Cycle Time from 20 MHz Crystal at 5.0 Volts

• Single-Cycle Instruction Execution

• Multifunction Instructions

• Low Power Dissipation in Idle Mode

• 16K Words On-Chip Program Memory RAM

• 16K Words On-Chip Data Memory RAM

• Independent ALU, Multiplier/Accumulator, and Barrel Shifter Units

• 3-Bus Architecture Allows Dual Operand Fetches in every Instruction Cycle

ALU and MAC

• The ALU performs a standard set of arithmetic and logic operations in addition to division primitives.

• The MAC performs single-cycle multiply, multiply/add and multiply/subtract operations.

SHIFTER

• The shifter performs logical and arithmetic shifts, normalization, de-normalization, and derive exponent operations.

• The shifter implements numeric format control including multiword floating-point representations.

SPEECH

• The input speech is given in the form of spoken numbers such as 1, 2, 3.

• The human voice band is taken as 4 kHz, so by the Nyquist criterion the sampling frequency is taken as 8 kHz.

• In the code only 2000 samples are considered, because each character is allotted 0.25 sec (8000 samples/sec × 0.25 sec = 2000 samples).
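
The sample count follows directly from the figures on this slide. A short Python sketch of the arithmetic (the variable names are illustrative and not taken from the project code):

```python
# Sampling arithmetic for one spoken character (names are illustrative).
max_voice_freq_hz = 4000                    # voice band assumed on the slide
sampling_rate_hz = 2 * max_voice_freq_hz    # Nyquist: twice the highest frequency
duration_s = 0.25                           # time allotted to one character

samples_per_character = int(sampling_rate_hz * duration_s)
print(sampling_rate_hz, samples_per_character)  # 8000 2000
```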

REPRESENTATION OF SPEECH SIGNAL

[Figure: speech signal waveform; horizontal axis in msec (0 to 40), vertical axis amplitude (-0.2 to 0.3)]

Block Diagram

Input speech via mic → CODEC → FRAMING → WINDOWING → FFT → MEL FREQUENCY WRAPPING → MEL SPECTRUM → MEL CEPSTRUM → DC MOTOR

(the processing stages run on the ADSP 2181)

FRAMING

• The speech signal is blocked into frames of N samples (N = 256).

• Adjacent frames are separated by M samples (M = 100), so consecutive frames overlap by N - M = 156 samples.

• Frame 1 = samples 0 to 255

• Frame 2 = samples 100 to 355

• 18 such frames are required for the 2000 samples of one character.
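
The frame count can be checked with a short sketch. This is a minimal Python illustration that assumes only full 256-sample frames are kept; the function names are hypothetical and do not come from the project's DSP code.

```python
def frame_starts(total_samples, frame_len=256, frame_shift=100):
    """Return the start index of every full frame that fits in the signal."""
    return list(range(0, total_samples - frame_len + 1, frame_shift))

def split_into_frames(signal, frame_len=256, frame_shift=100):
    """Cut the signal into overlapping frames of frame_len samples."""
    starts = frame_starts(len(signal), frame_len, frame_shift)
    return [signal[s:s + frame_len] for s in starts]

signal = [0.0] * 2000            # placeholder for 2000 speech samples
frames = split_into_frames(signal)
print(len(frames))               # 18
print(frame_starts(2000)[:2])    # [0, 100]
```

The last full frame starts at sample 1700; the remaining 44 samples do not fill another frame.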

Windowing

• Minimizes signal discontinuity at the beginning and end of each frame

• Reduces spectral distortion

• The windowed signal is obtained by

y1(n) = x1(n) * w(n); 0 <= n <= N-1

• where w(n) is the Hamming window, given by

w(n) = 0.54 - 0.46 cos(2πn/(N-1)); 0 <= n <= N-1
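
A minimal Python sketch of the Hamming window applied to one frame, assuming N = 256 as in the framing step. The constant-amplitude frame is only a placeholder, and the function names are illustrative.

```python
import math

def hamming_window(n_points=256):
    """w(n) = 0.54 - 0.46*cos(2*pi*n/(N-1)) for n = 0 .. N-1."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_points - 1))
            for n in range(n_points)]

def apply_window(frame):
    """Multiply a frame sample by sample with the Hamming window."""
    w = hamming_window(len(frame))
    return [x * wn for x, wn in zip(frame, w)]

frame = [1.0] * 256              # placeholder frame of constant amplitude
windowed = apply_window(frame)
print(round(windowed[0], 2), round(windowed[-1], 2))  # both frame edges are tapered to 0.08
```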

Result of Windowing

• 256 values are the output of this process.

• These values are given as the input to the FFT.

• Some windowed values for a 1 kHz input are shown:

0x0000 0x0826 0x0BE6 0x08B7 0x000F 0xF6C7 0xF26C 0xF5FC 0xFFE8 0x0AA9 0x0FC7
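
The listed words can be read as signed numbers. The sketch below assumes they are 16-bit two's-complement words in the 1.15 fractional format commonly used on the ADSP-21xx family; the slide does not state the format, so treat the scaling as an assumption.

```python
# Windowed output words as listed on the slide.
words = [0x0000, 0x0826, 0x0BE6, 0x08B7, 0x000F, 0xF6C7,
         0xF26C, 0xF5FC, 0xFFE8, 0x0AA9, 0x0FC7]

def to_signed16(word):
    """Interpret a 16-bit word as a two's-complement integer."""
    return word - 0x10000 if word & 0x8000 else word

def to_fraction(word):
    """Assumed 1.15 format: divide the signed integer by 2**15."""
    return to_signed16(word) / 32768.0

for w in words:
    print(f"0x{w:04X} {to_signed16(w):6d} {to_fraction(w):+.4f}")
```

Read this way, the values swing between positive and negative, as expected for a windowed sinusoid.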

Fast Fourier Transform

• Converts the time-domain signal into a frequency-domain signal

• The power spectrum is obtained from the real and imaginary parts of the frequency-domain speech signal.
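
A small Python sketch of the power spectrum computation. It uses a direct DFT for clarity rather than an FFT (both give the same values), and a short 64-sample 1 kHz test tone instead of a 256-sample speech frame; these choices are assumptions made for the illustration.

```python
import cmath
import math

def dft(frame):
    """Direct DFT; an FFT gives the same values faster."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def power_spectrum(frame):
    """Power at each bin: real part squared plus imaginary part squared."""
    return [c.real ** 2 + c.imag ** 2 for c in dft(frame)]

# 1 kHz tone sampled at 8 kHz, 64 samples (kept short for the direct DFT).
fs, f0, n = 8000, 1000, 64
tone = [math.sin(2 * math.pi * f0 * t / fs) for t in range(n)]
power = power_spectrum(tone)
peak_bin = max(range(n // 2), key=lambda k: power[k])
print(peak_bin, peak_bin * fs / n)  # 8 1000.0
```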

Wrapping

• A subjective pitch for each frequency is computed using the mel scale.

• The mel frequency scale is given by mel(f)=2595*log10(1+f/700), with f in Hz.
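
The mel formula can be evaluated directly. A minimal Python sketch follows; the inverse function is not on the slide and is obtained by solving the same formula for f.

```python
import math

def hz_to_mel(f_hz):
    """mel(f) = 2595 * log10(1 + f / 700), with f in Hz."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel, obtained by solving the formula for f."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

for f in (500, 1000, 2000, 4000):
    print(f, round(hz_to_mel(f), 1))
```

With these constants 1000 Hz maps to approximately 1000 mel, and the scale grows more slowly than frequency above that point.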

Mel Frequency Coefficients

MFCC

• MFCC stands for Mel Frequency Cepstrum Coefficients.

• It consists of various frequency coefficient components.

• It contains:

Mel Spectrum (frequency domain)

Mel Cepstrum (time domain)

SPECTRUM

• The samples are passed through the mel filter bank to obtain the mel frequency spectrum.

• The mel frequency spectrum is given by

s(n) = y(n) * f(n)

where

s(n) = mel frequency spectrum

y(n) = samples

f(n) = filter coefficients
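
One common way to realize this step is to weight the power spectrum with triangular filters whose centres are equally spaced on the mel scale. The Python sketch below follows that reading. The triangular shape, the filter count of 20 and the function names are assumptions for illustration; the slides do not specify the filter bank.

```python
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filter_bank(num_filters, n_fft, fs):
    """Triangular filters with centres equally spaced on the mel scale."""
    n_bins = n_fft // 2 + 1
    top = hz_to_mel(fs / 2.0)
    # num_filters + 2 edge frequencies, converted to (fractional) FFT bin positions.
    edges = [mel_to_hz(top * i / (num_filters + 1)) * n_fft / fs
             for i in range(num_filters + 2)]
    bank = []
    for j in range(1, num_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for b in range(n_bins):
            if lo < b <= mid:
                row.append((b - lo) / (mid - lo))   # rising slope
            elif mid < b < hi:
                row.append((hi - b) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def mel_spectrum(power, bank):
    """Each mel coefficient is the filter-weighted sum of the power bins."""
    return [sum(p * w for p, w in zip(power, row)) for row in bank]

bank = mel_filter_bank(num_filters=20, n_fft=256, fs=8000)
flat_power = [1.0] * 129          # placeholder power spectrum (bins 0 .. 128)
print(len(mel_spectrum(flat_power, bank)))  # 20
```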

Inverse Discrete Cosine Transformation

• The mel frequency power spectrum is a frequency-domain function.

• To obtain a time-domain function, the signal undergoes the IDCT.

• The mel frequency spectrum is thereby converted into the mel frequency cepstrum.

CEPSTRUM

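• The mel spectrum coefficients are real numbers and are converted to the time domain using the IDCT.

• The time-domain coefficients are called mel frequency cepstrum coefficients (MFCC).

• MFCC is given by c(n) = sum over k = 1 to K of log(S_k) * cos(n * (k - 0.5) * pi / K), where S_k is the k-th mel spectrum coefficient and K is the number of mel spectrum coefficients.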
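
The cepstrum formula translates into a few lines of Python. This sketch assumes the natural logarithm (the slide only says "log") and uses made-up placeholder values for the mel spectrum.

```python
import math

def mel_cepstrum(mel_spec):
    """c(n) = sum over k = 1..K of log(S_k) * cos(n * (k - 0.5) * pi / K)."""
    K = len(mel_spec)
    log_s = [math.log(s) for s in mel_spec]
    return [sum(log_s[k - 1] * math.cos(n * (k - 0.5) * math.pi / K)
                for k in range(1, K + 1))
            for n in range(1, K + 1)]

mel_spec = [4.0, 2.0, 1.0, 0.5]   # placeholder mel spectrum values (must be > 0)
coeffs = mel_cepstrum(mel_spec)
print([round(c, 3) for c in coeffs])
```

A flat mel spectrum gives coefficients of zero for every n from 1 to K, which is a quick sanity check on the cosine terms.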

LEAST MEAN SQUARE ALGORITHM (LMS)

• This algorithm is used to find the minimum deviation between sets of values.

• During the testing phase, the input speech is compared with the 4 stored values.

• The stored value with the least deviation is selected and sent.
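
The comparison described here can be sketched as a minimum mean-square-deviation search over the stored references. The Python below follows that description; the feature vectors, labels and function names are hypothetical, and the project's actual DSP implementation may differ.

```python
def mean_square_deviation(a, b):
    """Average squared difference between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def recognize(features, references):
    """Return the label of the stored reference with the least deviation."""
    return min(references,
               key=lambda label: mean_square_deviation(features, references[label]))

# Hypothetical stored references for 4 commands (values are made up).
references = {
    "1": [1.0, 0.2, -0.3],
    "2": [0.1, 0.9, 0.4],
    "3": [-0.5, 0.0, 0.8],
    "4": [0.7, -0.6, 0.1],
}
test_features = [0.15, 0.85, 0.35]
print(recognize(test_features, references))  # 2
```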

INTERFACING PC WITH KIT

PC ↔ RS-232 SERIAL CABLE ↔ DSP PROCESSOR

DSP TO DC MOTOR


CIRCUIT DIAGRAM


HARDWARE DETAILS

• The latched output from the latch IC is given to the relays through a resistor and a transistor.

• According to the predefined input, the relay coil is energized and the relay switches to the ON position.

• An SPDT relay is used here.

• The switched relay causes current to flow in the DC motor.


Details of DC motor

• Speed: 300 rpm

• Current: 750 mA

• Voltage: 7.5 V


Advantages

• It is controlled by speech recognition

• Processing time is short

• Easy and efficient

• Useful for physically disabled people

• Low cost

• Maintenance is easy


Limitations

• A frequency mismatch may affect compatibility with the hardware.

• Every user's voice must be trained before testing.


APPLICATIONS

• A device friendly to physically and visually impaired users, since only the user's speech signals are required.

• In acute situations such as system crashes, this method can be used as an emergency control.

CONCLUSION and FUTURE MODIFICATIONS

• Speech recognition is still an active research area.

• Speech recognition enables communication between human and machine.

• This project recognizes the given speech signal and displays the word on the PC.


THANK YOU

