8/4/2019 Minor Proj Report
http://slidepdf.com/reader/full/minor-proj-report 1/26
A
PROJECT REPORT
ON
SPEECH RECOGNITION
Submitted in partial fulfillment of the requirements for the award of
Bachelor of Technology
In
Electronics & Communications
By:
Hitesh Garg (0181312807)
Prateek Gupta (0571312807)
Rachit Kumar Gupta (0741312807)
Vandana Chauhan (0481312807)
Under the guidance of
Ms. Neha Gupta
Lecturer
Department of Electronics and Communications
Guru Premsukh Memorial College of Engineering
245, Budhpur Village, G.T Karnal Road, Delhi-36.
Guru Gobind Singh Indraprastha University, Delhi
2010-2011
MINOR PROJECT
ON
SPEECH RECOGNITION
Submitted by:
Hitesh Garg (0181312807)
Prateek Gupta (0571312807)
Rachit Kumar Gupta (0741312807)
Vandana Chauhan (0481312807)
Certificate
This is to certify that the dissertation/project report (course code)
entitled SPEECH RECOGNITION, done by Mr. Hitesh Garg (0181312807),
Mr. Prateek Gupta (0571312807), Mr. Rachit Kumar Gupta (0741312807),
and Ms. Vandana Chauhan (0481312807), is an authentic work carried out
by them at Guru Premsukh Memorial College of Engineering under my
guidance. The matter embodied in this report has not been submitted
earlier for the award of any degree or diploma, to the best of my
knowledge and belief.
Date:
(Ms. Neha Gupta)
Lecturer
Acknowledgments
In the Acknowledgments page, the writer recognizes his indebtedness for
the guidance and assistance of the thesis adviser and other members of
the faculty. Courtesy demands that he also recognize specific
contributions by other persons or institutions, such as libraries and
research foundations. Acknowledgments should be expressed simply,
tastefully, and tactfully, duly signed; the e-mail address should also
be given at the end.
Hitesh Garg (0181312807)
Prateek Gupta (0571312807)
Rachit Kumar Gupta (0741312807)
Vandana Chauhan (0481312807)
Abstract
Many voice recognition algorithms have been developed in the past. We have
referenced Tor's speech recognition algorithm to perform speech recognition
in real time.
We have used the frequency variation in speech patterns to map an
individual's voice to the respective operation. The challenges faced were
the detection of voice patterns and compensating for ambient noise so that
words could be detected successfully. To deal with the limited memory and
processing power, we had to remove the redundancies in the speech pattern
data; hence we developed fingerprinting as a solution. Noise compensation
was done by accurately approximating the ambient noise and calculating a
threshold level based on it.
The fingerprint is compared with the stored fingerprints in the dictionary
to detect which word has been spoken. For comparison we use a
pseudo-Euclidean distance method. If the word is recognized, the
corresponding control signal is generated for the respective device.
The recognition algorithm is simulated in MATLAB. The voice signal is read
into MATLAB using the wavread command. To convert the analog signal to
digital form we first sample, then quantize the signal. The fingerprints
are obtained by passing the signal through the filters; they are then
compared with the stored fingerprints to generate the control signal.
Table of Contents
List of Tables
List of Figures
List of Symbols, Abbreviations and Nomenclature
INTRODUCTION
We all use remote-controlled devices in our daily life. These remotes have
to be operated manually by the user, usually use infrared communication to
transmit the control signal, and can be operated by anyone.
To address this we used a speaker-dependent voice recognition algorithm.
Using this algorithm we can recognize which word has been spoken, and the
corresponding control signals are generated. Many voice recognition
algorithms have been developed in the past; we have referenced Tor's speech
recognition algorithm to perform speech recognition in real time. This
helped us implement the voice recognition algorithm in MATLAB 6.0.
In our project, the fingerprints of the words are stored in a header file,
the dictionary. The speaker speaks into the microphone, which acts as a
transducer converting the voice signal into an electrical signal. This
signal is passed through an analog-to-digital converter (ADC), which
converts the incoming analog signal to a digital signal.
The ADC's output is sampled at a rate of 4 kHz and passed through 10
filters, with the filter outputs accumulated over every 125 samples. Thus
we get 16 points from each filter, giving a total of 160 data points for
each word. This is called the fingerprint of the word.
This fingerprint is compared with the stored fingerprints in the dictionary
to detect which word has been spoken. For comparison we use a
pseudo-Euclidean distance method. If the word is recognized, the
corresponding control signal is generated for the respective operation.
The filters used are 4th-order Chebyshev filters. To implement the
4th-order filters we cascaded two 2nd-order filters whose coefficients were
obtained using MATLAB 6.0. We used Chebyshev type 2 filters as they are
monotonic in the passband and equiripple in the stopband.
The recognition algorithm is simulated in MATLAB. The voice signal is read
into MATLAB using the wavread command. To convert the analog signal to
digital form we first sample, then quantize the signal. The fingerprints
are obtained by passing the signal through the filters; they are then
compared with the stored fingerprints to generate the control signal.
Overview of Methodology
Figure 1: Basic block diagram of speech recognition
The basic algorithm of our code checks the ADC input at a rate of 4 kHz. If
the value of the ADC is greater than the threshold value, it is interpreted
as the beginning of a half-second-long word. The sampled word passes
through 8 band-pass filters and is converted into a fingerprint. The words
to be matched are stored as fingerprints in a dictionary so that
sampled-word fingerprints can be compared against them later. Once a
fingerprint is generated from a sampled word, it is compared against the
dictionary fingerprints, and a modified Euclidean distance calculation
finds the fingerprint in the dictionary that is the closest match. Based on
the word that matched best, the program sends a PWM signal to the car to
perform basic operations like left, right, go, stop, or reverse.
Initial Threshold Calculation:
At start-up, as part of the initialization, the program reads the ADC input
using timer/counter0 and accumulates its value 256 times. By interpreting
each ADC reading as a fixed-point number between 0 and 1 (in units of
1/256) and accumulating 256 times, the average ADC value is calculated
without doing a multiply or divide. Three average values are taken, with a
16.4 ms delay between them. After receiving the three averages, the
threshold value is set to four times the median value. The threshold value
is used to detect whether or not a word has been spoken.
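The original implementation is AVR assembly and is not reproduced in the report; the start-up procedure above can be sketched in Python (an illustrative port — `average_adc` and `startup_threshold` are names introduced here, not from the original code):

```python
import statistics

def average_adc(samples):
    """Average 256 ADC readings by pure accumulation.

    Summing 256 readings and right-shifting by 8 yields the average with
    no multiply or divide, mirroring the fixed-point trick described
    above (each 8-bit reading treated as a fraction in units of 1/256)."""
    assert len(samples) == 256
    return sum(samples) >> 8  # accumulate, then shift instead of dividing

def startup_threshold(adc_bursts):
    """adc_bursts: three lists of 256 ADC samples taken ~16.4 ms apart.
    The threshold is four times the median of the three averages."""
    averages = [average_adc(burst) for burst in adc_bursts]
    return 4 * statistics.median(averages)

# Example: three quiet-room bursts with slightly different noise floors.
bursts = [[8] * 256, [10] * 256, [9] * 256]
print(startup_threshold(bursts))  # median average is 9, threshold 36
```

Taking the median of three averages makes the estimate robust to a single burst being corrupted by a transient sound.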
Fingerprint Generation:
The program considers a word detected if a sample value from the ADC is
greater than the threshold value. Every ADC sample is typecast to an int
and stored in a dummy variable
Ain. Once a word has been detected, the Ain value passes through eight
4th-order Chebyshev band-pass filters with a 40 dB stopband for 2000
samples (half a second). Each filter's output is squared, and that value is
accumulated with the previous squares of the filter output. After 125
samples the accumulated value is stored as a data point in the fingerprint
of that word. The accumulator is then cleared and the process begins again.
After 2000 samples, 16 points have been generated from each filter; thus
every sampled word is divided into 16 parts. Our code is based around using
10 filters, and since each one outputs 16 data points, every fingerprint is
made up of 160 data points.
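The fingerprint pipeline above can be sketched with SciPy as a stand-in for the original MATLAB/assembly code. This is an illustrative reading of the design (eight 200 Hz band-passes plus a low-pass and a high-pass, squared outputs accumulated over 125-sample blocks); `make_filterbank` and `fingerprint` are names introduced here:

```python
import numpy as np
from scipy.signal import cheby2, sosfilt

FS = 4000            # sampling rate (Hz)
WORD_SAMPLES = 2000  # half a second of audio
BLOCK = 125          # samples accumulated per fingerprint point

def make_filterbank():
    """Ten Chebyshev type II filters, 40 dB stopband: eight 4th-order
    bandpasses in 200 Hz bands from 200 Hz to 1.8 kHz, plus a low-pass
    and a high-pass (band edges follow the filter-design section)."""
    bank = []
    for lo in range(200, 1800, 200):  # 8 bandpass bands: 200-400 ... 1600-1800
        bank.append(cheby2(2, 40, [lo / (FS / 2), (lo + 200) / (FS / 2)],
                           btype='bandpass', output='sos'))
    bank.append(cheby2(4, 40, 200 / (FS / 2), btype='lowpass', output='sos'))
    bank.append(cheby2(4, 40, 1800 / (FS / 2), btype='highpass', output='sos'))
    return bank

def fingerprint(word, bank):
    """Square each filter's output and accumulate over 125-sample blocks,
    giving 16 points per filter -> a 160-point fingerprint."""
    word = np.asarray(word, dtype=float)[:WORD_SAMPLES]
    points = []
    for sos in bank:
        energy = sosfilt(sos, word) ** 2
        points.extend(energy.reshape(-1, BLOCK).sum(axis=1))
    return np.array(points)

rng = np.random.default_rng(0)
fp = fingerprint(rng.standard_normal(WORD_SAMPLES), make_filterbank())
print(fp.shape)  # (160,)
```

Squaring before accumulation stores band energy rather than amplitude, exactly as the report describes.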
Body of Thesis
The Speech/Voice Recognition Method:
The speech/voice recognition method is similar to the template generation
method. The only difference is that the voice print is loaded into the
random-access memory of the microcontroller and compared with the templates
available in the database. The appropriate action is taken once a match is
found.
Filter Implementation:
Figure 2: Flowchart for Template Generation (160-point data)
We chose a 4th-order Chebyshev filter with a 40 dB stopband since it has
very sharp transitions after the cutoff frequency. We designed 10 filters:
a low-pass with a cutoff of 200 Hz, a high-pass with a cutoff of 1.8 kHz,
and eight band-passes, each with a 200 Hz bandwidth, evenly distributed
from 200 Hz to 1.8 kHz. Thus we had band-pass filters that went from
200-400 Hz, 400-600 Hz, 600-800 Hz, and so on, all the way to the filter
that covered 1.6-1.8 kHz. We designed our filters this way because we felt
that most of the important frequency content in words lies within the first
2 kHz, since this range usually contains the first and second speech
formants (resonant frequencies). This also allowed us to sample at 4 kHz
and gave us almost enough time to implement 10 filters. We thought we
needed ten filters, each with approximately a 200 Hz bandwidth, so that we
would have enough frequency resolution to properly identify words.
Originally we had 5 filters that spanned 0-4 kHz and were sampling at
8 kHz, but this scheme did not produce consistent word recognition.
In order to implement the 4th-order Chebyshev filters, we cascaded two
second-order IIR filters in series, using Prof. Land's sample assembly code
for 2nd-order IIR filters. We generated the 4th-order IIR filter
coefficients using MATLAB 6.0 as described in the math section above. The
coefficients, though, are floating-point numbers; to convert them to fixed
point we multiply them by 256 and round to the nearest integer, instead of
using the float2fix macro, which does not round. We needed to call all our
filters at a rate of 4 kHz, the sampling frequency.
In order to make a fingerprint from a word, we had to pass the ADC output
through all the filters faster than the ADC sample time of 250 µs. We also
modified our filters slightly, by altering the gain coefficient, to have a
maximum filter gain of 20 instead of 1. This was done to prevent the filter
output from underflowing and going to zero when squared. The output of the
filter was squared in order to store the intensity of the sound rather than
just the amplitude. To reduce the time the squaring took, it was combined
with the filter function, and the accumulation of the previous squared
filter outputs was also moved into assembly to reduce the cycle time.
Nonetheless, the reduction in cycle time was not enough to implement all 10
filters, so we stopped calling the high-pass filter. Later on the low-pass
filter was also removed, because low-frequency noise seemed to be
interfering with the filter output.
Fingerprint Comparison:
Once the fingerprints are created and stored in the dictionary, a spoken
word is compared against the dictionary fingerprints. To do the comparison,
we call a lookup() function, which computes a pseudo-Euclidean distance:
the sum of the absolute values of the differences between each point of the
sample fingerprint and the corresponding point of a fingerprint from the
dictionary. The dictionary has multiple words in it; the lookup goes
through all of them and picks the word with the smallest calculated number.
We had originally used the square of the correct Euclidean distance
calculation, d = sum over i of (pi - qi)^2. We had originally used the
English words go, left, right, stop, and back, but many of these words
seemed very similar in frequency as far as our algorithm was concerned. We
then went to vowels and had better success, but we still wanted to use
words that were directions, and so we switched to Hindi. The set of words
we used was mostly orthogonal, but in Hindi "left" is baiya, which sounds
very similar to daiya, and so it could not be used. We had previously had
success with snapping, so we used that for "left".
Voice Sampling:
The human voice consists of frequency components from 100-2000 Hz.
Sampling is the process of converting a signal (for example, a function of
continuous time or space) into a numeric sequence (a function of discrete
time or space). The Nyquist sampling theorem states, in the original words
of Shannon (where he uses "cps" for "cycles per second" instead of the
modern unit hertz): if a function f(t) contains no frequencies higher than
W cps, it is completely determined by giving its ordinates at a series of
points spaced 1/(2W) seconds apart.
Hence we need to sample at a rate of at least 4000 Hz, and the sampling
period should ideally be 1/4000 s = 250 µs.
Voice sampling is achieved by setting the ADC control registers and a timer
which interrupts (triggers) the microcontroller to generate ADC data every
232 µs.
The method for sampling consists of the following steps:
- Set up the interrupts for Counter 0 and Counter 1.
- Initialize the ADC data-available flag to 0.
- Configure the ADMUX register to obtain data from ADC channel 1.
- Set up the timer to interrupt the microcontroller to acquire data from
the ADC every 232 µs.
- Receive the ADC data into the accumulator.
- Process the data further as per the voiceprint generation process.
Algorithm:
1. Set the TIMSK timer interrupt mask register to 0b00000010 so that an
interrupt fires every time the timer hits the count.
2. Set the ADC input data register to zero.
3. Set the ADMUX value to 0b00100000 to select channel 1.
4. Set the ADCSR value to 0b11000111 to start the conversion and set the
ADC conversion status flag to 1.
5. Set the timer control register TCCR0 = 0b00001011 and OCR0 = 62 so that
we get a sampling rate of 4300 Hz.
Bandpass Filtering:
The filter used here is Chebyshev Filter (IIR).
Infinite impulse response (IIR) is a property of signal processing systems.
Systems with this property are known as IIR systems or, when dealing with
electronic filter systems, as IIR filters. They have an impulse response
function which is non-zero over an infinite length of time. This is in
contrast to finite impulse response (FIR) filters, which have
fixed-duration impulse responses.
The simplest analog IIR filter is an RC filter made up of a single resistor
(R) feeding into a node shared with a single capacitor (C). This filter has
an exponential impulse response characterized by an RC time constant.
IIR filters may be implemented as either analog or digital filters. In
digital IIR filters, the output feedback is immediately apparent in the
equations defining the output. Note that unlike with FIR filters, in
designing IIR filters it is necessary to carefully consider the "time zero"
case, in which the outputs of the filter have not yet been clearly defined.
Design of digital IIR filters is heavily dependent on that of their analog
counterparts: there are plenty of resources, works, and straightforward
design methods for analog feedback filter design, while there are hardly
any for digital IIR filters. As a result, when a digital IIR filter is to
be implemented, an analog filter (e.g. Chebyshev, Butterworth, or elliptic)
is usually designed first and then converted to digital form by applying
discretization techniques such as the bilinear transform or impulse
invariance. Example IIR filters include the Chebyshev filter, Butterworth
filter, and Bessel filter.
Bandpass Filter Design:
This is an important part of generating a voice template. This step removes
the redundancies in the voice and stores the signature as a 160-point
vector.
The bandpass filter is a second-order Chebyshev IIR filter. The
coefficients for the filter are calculated using MATLAB 6.0.
In order to analyze speech, we needed to look at the frequency content of the
detected word. To do this we used several 4th order Chebyshev band pass
filters. To create the 4th-order filters, we cascaded two second-order
filters using the following "Direct Form II Transposed" implementation of a
difference equation.
Figure 3: Transposed Direct Form II implementation of a second-order IIR
digital filter (input on the right, output on the left)
The filter expression can now be written as
a(1)*y(n) = b(1)*x(n) + b(2)*x(n-1) + b(3)*x(n-2)
            - a(2)*y(n-1) - a(3)*y(n-2)
The assembly language code implementing the filter is written taking care
that the filter can finish its calculation within 2100 system cycles, that
is, before the next sample arrives. To meet this budget we changed the data
format from float to fixed-point 2's-complement form, which improves the
performance of the program and lets it compute within the required number
of system clock cycles.
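A minimal sketch of one such fixed-point biquad follows (Python standing in for the original assembly; the coefficient scaling by 256 is from the report, while `df2t_biquad` and the two-tap example are illustrative):

```python
def df2t_biquad(x, b, a, scale=256):
    """One second-order IIR section in transposed Direct Form II with
    integer coefficients pre-multiplied by `scale` (256, matching the
    report's float-to-fixed conversion). Assumes a[0] == scale; the
    floor division by `scale` stands in for the assembly right-shift."""
    b0, b1, b2 = b
    _, a1, a2 = a
    z1 = z2 = 0
    y = []
    for xn in x:
        yn = (b0 * xn + z1) // scale
        z1 = b1 * xn - a1 * yn + z2   # delay states kept in the scaled domain
        z2 = b2 * xn - a2 * yn
        y.append(yn)
    return y

# b = [0.5, 0.5, 0], a = [1, 0, 0] scaled by 256: a two-tap average.
print(df2t_biquad([256, 256, 256], (128, 128, 0), (256, 0, 0)))
# [128, 256, 256]
```

Cascading two such sections yields the 4th-order filter; keeping the delay states in the scaled domain defers the rounding to a single shift per sample.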
Filter Coefficient Calculation.
The filter coefficients are calculated using the Signal Processing Toolbox
of MATLAB 6.0. The passbands for the 8 bandpass filters are 100-200 Hz,
200-400 Hz, and so on up to 1800-2000 Hz. The gain for the passband is
20 dB, and the rolloff is quite steep, as two second-order Chebyshev
bandpass filters are cascaded in series.
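The same design can be reproduced with SciPy's `cheby2`, which mirrors the MATLAB function documented below (the 400-600 Hz band is one illustrative choice; band edges are normalized to the Nyquist frequency):

```python
import numpy as np
from scipy.signal import cheby2

FS = 4000  # sampling rate used in the report

# One bandpass filter: Chebyshev type II, 40 dB stopband, 400-600 Hz band.
# A bandpass of order n=2 yields an order-2*n (4th-order) filter.
sos = cheby2(2, 40, [400 / (FS / 2), 600 / (FS / 2)],
             btype='bandpass', output='sos')
print(sos.shape)  # two cascadable second-order sections: (2, 6)

# Fixed-point coefficients as the report describes: multiply by 256, round.
fixed = np.round(sos * 256).astype(int)
```

The two rows of `sos` are exactly the two second-order sections that the report cascades in assembly.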
Matlab Function Description
cheby2 - Chebyshev Type II filter design (stopband ripple)
Syntax
[z,p,k] = cheby2(n,R,Wst)
[z,p,k] = cheby2(n,R,Wst,'ftype')
[b,a] = cheby2(n,R,Wst)
[b,a] = cheby2(n,R,Wst,'ftype')
[A,B,C,D] = cheby2(n,R,Wst)
[A,B,C,D] = cheby2(n,R,Wst,'ftype')
[z,p,k] = cheby2(n,R,Wst,'s')
[z,p,k] = cheby2(n,R,Wst,'ftype','s')
[b,a] = cheby2(n,R,Wst,'s')
[b,a] = cheby2(n,R,Wst,'ftype','s')
[A,B,C,D] = cheby2(n,R,Wst,'s')
[A,B,C,D] = cheby2(n,R,Wst,'ftype','s')
Description
cheby2 designs lowpass, highpass, bandpass, and bandstop digital and analog
Chebyshev Type II filters. Chebyshev Type II filters are monotonic in the
passband and equiripple in the stopband. Type II filters do not roll off as
fast as Type I filters, but are free of passband ripple.
Digital Domain
[z,p,k] = cheby2(n,R,Wst) designs an order n lowpass digital Chebyshev Type II filter with normalized stopband edge frequency Wst and stopband ripple R dB down from the peak
passband value. It returns the zeros and poles in length n column vectors z and p and the gain
in the scalar k.
The normalized stopband edge frequency is the beginning of the stopband,
where the magnitude response of the filter is equal to -R dB. For cheby2,
the normalized stopband edge frequency Wst is a number between 0 and 1,
where 1 corresponds to half the sample rate. Larger values of stopband
attenuation R lead to wider transition widths (shallower rolloff
characteristics).
If Wst is a two-element vector, Wst = [w1 w2], cheby2 returns an order 2*n
bandpass filter with passband w1 < ω < w2.
[z,p,k] = cheby2(n,R,Wst,'ftype') designs a highpass, lowpass, or bandstop
filter, where the string 'ftype' is one of the following:
- 'high' for a highpass digital filter with normalized stopband edge
frequency Wst
- 'low' for a lowpass digital filter with normalized stopband edge
frequency Wst
- 'stop' for an order 2*n bandstop digital filter if Wst is a two-element
vector, Wst = [w1 w2]. The stopband is w1 < ω < w2.
With different numbers of output arguments, cheby2 directly obtains other
realizations of the filter. To obtain the transfer-function form, use two
output arguments as shown below.
[b,a] = cheby2(n,R,Wst) designs an order n lowpass digital Chebyshev Type II filter with
normalized stopband edge frequency Wst and stopband ripple R dB down from the peak
passband value. It returns the filter coefficients in the length n+1 row vectors b and a, with
coefficients in descending powers of z.
[b,a] = cheby2(n,R,Wst,'ftype') designs a highpass, lowpass, or bandstop filter, where
the string 'ftype' is 'high', 'low', or 'stop', as described above.
To obtain state-space form, use four output arguments as shown below.
[A,B,C,D] = cheby2(n,R,Wst) or
[A,B,C,D] = cheby2(n,R,Wst,'ftype') designs the filter in state-space form,
where A, B, C, and D are the matrices of the discrete-time state-space
system
x[n+1] = A x[n] + B u[n]
y[n] = C x[n] + D u[n]
and u is the input, x is the state vector, and y is the output.
tf2sos - Convert digital filter transfer function data to second-order
sections form
Syntax
[sos,g] = tf2sos(b,a)
[sos,g] = tf2sos(b,a,'order')
[sos,g] = tf2sos(b,a,'order','scale')
sos = tf2sos(...)
Description
tf2sos converts a transfer function representation of a given digital filter to an equivalent
second-order section representation.
[sos,g] = tf2sos(b,a) finds a matrix sos in second-order section form, with
gain g, that is equivalent to the digital filter represented by the
transfer-function coefficient vectors a and b.
sos is an L-by-6 matrix
sos = [ b01 b11 b21 1 a11 a21
        b02 b12 b22 1 a12 a22
        ...
        b0L b1L b2L 1 a1L a2L ]
whose rows contain the numerator and denominator coefficients bik and aik
of the second-order sections of H(z).
[sos,g] = tf2sos(b,a,'order') specifies the order of the rows in sos, where
'order' is
- 'down', to order the sections so the first row of sos contains the poles
closest to the unit circle
- 'up', to order the sections so the first row of sos contains the poles
farthest from the unit circle (default)
[sos,g] = tf2sos(b,a,'order','scale') specifies the desired scaling of the
gain and numerator coefficients of all second-order sections, where 'scale'
is:
- 'none', to apply no scaling (default)
- 'inf', to apply infinity-norm scaling
- 'two', to apply 2-norm scaling
Using infinity-norm scaling in conjunction with up-ordering minimizes the
probability of overflow in the realization. Using 2-norm scaling in
conjunction with down-ordering minimizes the peak round-off noise.
sos = tf2sos(...) embeds the overall system gain, g, in the first section,
H1(z), so that H(z) is simply the product of the sections H1(z) through
HL(z).
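For reference, SciPy's `tf2sos` gives the same conversion (note one API difference: SciPy always embeds the overall gain in the first section rather than returning it separately as MATLAB's `[sos,g]` form does); the 4th-order lowpass here is an illustrative input:

```python
import numpy as np
from scipy.signal import cheby2, tf2sos

# A 4th-order Chebyshev type II lowpass in transfer-function form,
# converted to second-order sections.
b, a = cheby2(4, 40, 0.3)
sos = tf2sos(b, a)
print(sos.shape)  # L-by-6, one row [b0 b1 b2 1 a1 a2] per biquad: (2, 6)
```

Each row of `sos` is one biquad ready to be cascaded, which is exactly the structure the report's assembly implementation uses.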
sos2tf - Convert digital filter second-order section data to transfer
function form
Syntax
[b,a] = sos2tf(sos)
[b,a] = sos2tf(sos,g)
Description
sos2tf converts a second-order section representation of a given digital filter to an
equivalent transfer function representation.
[b,a] = sos2tf(sos) returns the numerator coefficients b and denominator
coefficients a of the transfer function that describes a discrete-time
system given by sos in second-order section form. The second-order section
format of H(z) is given by
H(z) = product over k = 1..L of
       (b0k + b1k*z^-1 + b2k*z^-2) / (1 + a1k*z^-1 + a2k*z^-2)
sos is an L-by-6 matrix that contains the coefficients of each second-order
section stored in its rows.
Row vectors b and a contain the numerator and denominator coefficients of
H(z) stored in descending powers of z.
[b,a] = sos2tf(sos,g) returns the transfer function that describes a
discrete-time system given by sos in second-order section form with gain g.
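SciPy's `sos2tf` performs the same conversion, so the tf2sos/sos2tf pair round-trips (the 4th-order lowpass is an illustrative input):

```python
import numpy as np
from scipy.signal import cheby2, tf2sos, sos2tf

b, a = cheby2(4, 40, 0.3)   # 4th-order transfer function
sos = tf2sos(b, a)          # to second-order sections...
b2, a2 = sos2tf(sos)        # ...and back to numerator/denominator form
print(np.allclose(b, b2) and np.allclose(a, a2))  # the round trip matches
```

For high-order filters the cascaded sos form is numerically much better behaved than the flat polynomial form, which is why the report cascades biquads rather than implementing one 4th-order difference equation directly.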
freqz - Frequency response of digital filter
Syntax
[h,w] = freqz(b,a,n)
h = freqz(b,a,w)
[h,w] = freqz(b,a,n,'whole')
[h,f] = freqz(b,a,n,fs)
h = freqz(b,a,f,fs)
[h,f] = freqz(b,a,n,'whole',fs)
freqz(b,a,...)
freqz(Hd)
Description
[h,w] = freqz(b,a,n) returns the frequency response vector h and the corresponding
angular frequency vector w for the digital filter whose transfer function is determined by the
(real or complex) numerator and denominator polynomials represented in the vectors b and a,
respectively. The vectors h and w are both of length n. The angular
frequency vector w has values ranging from 0 to π radians per sample. If
you do not specify the integer n, or you specify it as the empty vector [],
the frequency response is calculated using the default value of 512
samples.
h = freqz(b,a,w) returns the frequency response vector h calculated at the frequencies (in
radians per sample) supplied by the vector w. The vector w can have any length.
[h,w] = freqz(b,a,n,'whole') uses n sample points around the entire unit
circle to calculate the frequency response. The frequency vector w has
length n and has values ranging from 0 to 2π radians per sample.
[h,f] = freqz(b,a,n,fs) returns the frequency response vector h and the
corresponding frequency vector f for the digital filter whose transfer
function is determined by the (real or
complex) numerator and denominator polynomials represented in the vectors b and a,
respectively. The vectors h and f are both of length n. For this syntax, the frequency response
is calculated using the sampling frequency specified by the scalar fs (in hertz). The
frequency vector f is calculated in units of hertz (Hz). The frequency vector f has values
ranging from 0 to fs/2 Hz.
h = freqz(b,a,f,fs) returns the frequency response vector h calculated at the frequencies
(in Hz) supplied in the vector f. The vector f can be any length.
[h,f] = freqz(b,a,n,'whole',fs) uses n points around the entire unit circle
to calculate the frequency response. The frequency vector f has length n
and has values ranging from 0 to fs Hz.
freqz(b,a,...) plots the magnitude and unwrapped phase of the frequency response
of the filter. The plot is displayed in the current figure window.
freqz(Hd) plots the magnitude and unwrapped phase of the frequency response of the filter.
The plot is displayed in fvtool. The input Hd is a dfilt filter object or an array of dfilt
filter objects.
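SciPy's `freqz` exposes the same two conventions, radians per sample or hertz (the 4th-order lowpass is an illustrative filter):

```python
import numpy as np
from scipy.signal import cheby2, freqz

b, a = cheby2(4, 40, 0.3)                   # any digital filter will do
w, h = freqz(b, a, worN=512)                # 512 points from 0 toward pi rad/sample
f, h2 = freqz(b, a, worN=512, fs=4000)      # same response on a hertz axis
print(len(w), w[-1] < np.pi, f[-1] < 2000)  # grids stop just below Nyquist
```

Plotting `20*log10(abs(h))` against `f` is the usual way to verify that a designed bandpass actually has its 40 dB stopband where intended.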
reshape - Reshape array
Syntax
B = reshape(A,m,n)
B = reshape(A,m,n,p,...)
B = reshape(A,[m n p ...])
B = reshape(A,...,[],...)
B = reshape(A,siz)
Description
B = reshape(A,m,n) returns the m-by-n matrix B whose elements are taken
column-wise from A. An error results if A does not have m*n elements.
B = reshape(A,m,n,p,...) or B = reshape(A,[m n p ...]) returns an n-dimensional
array with the same elements as A but reshaped to have the size m-by-n-by-p-by-.... The product of the specified dimensions, m*n*p*..., must be the same as prod(size(A)).
B = reshape(A,...,[],...) calculates the length of the dimension represented by the
placeholder [], such that the product of the dimensions equals prod(size(A)) . The value of
prod(size(A)) must be evenly divisible by the product of the specified dimensions. You can
use only one occurrence of [].
B = reshape(A,siz) returns an n-dimensional array with the same elements as A, but
reshaped to siz, a vector representing the dimensions of the reshaped array. The quantity
prod(siz) must be the same as prod(size(A)) .
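The NumPy equivalent behaves the same way, with two details worth noting: `-1` plays the role of MATLAB's `[]` placeholder, and NumPy fills row-wise by default where MATLAB fills column-wise (pass `order='F'` for the MATLAB behavior):

```python
import numpy as np

A = np.arange(12)                     # 0..11
B = A.reshape(3, 4)                   # errors if A does not have 3*4 elements
C = A.reshape(2, -1)                  # -1 = "work this dimension out for me"
F = np.reshape(A, (3, 4), order='F')  # column-wise, matching MATLAB's reshape
print(B.shape, C.shape)               # (3, 4) (2, 6)
print(F[0])                           # first row under column-wise fill
```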
wavread - Read WAVE (.wav) sound file
Graphical Interface
As an alternative to wavread, use the Import Wizard. To activate the Import Wizard, select
File > Import Data.
Syntax
y = wavread(filename)
[y, Fs] = wavread(filename)
[y, Fs, nbits] = wavread(filename)
[y, Fs, nbits, opts] = wavread(filename)
[...] = wavread(filename, N)
[...] = wavread(filename, [N1 N2])
[...] = wavread(..., fmt)
siz = wavread(filename,'size')
Description
y = wavread(filename) loads a WAVE file specified by the string filename,
returning the sampled data in y. If filename does not include an extension,
wavread appends .wav.
[y, Fs] = wavread(filename) returns the sample rate (Fs) in hertz used to
encode the data in the file.
[y, Fs, nbits] = wavread(filename) returns the number of bits per sample
(nbits).
[y, Fs, nbits, opts] = wavread(filename) returns a structure opts of
additional information contained in the WAV file. The content of this
structure differs from file to file.
Typical structure fields include opts.fmt (audio format information) and
opts.info (text that describes the title, author, etc.).
[...] = wavread(filename, N) returns only the first N samples from each
channel in the file.
[...] = wavread(filename, [N1 N2]) returns only samples N1 through N2 from
each channel in the file.
[...] = wavread(..., fmt) specifies the data format of y used to represent
samples read from the file. fmt can be either of the following values, or a
partial match (case-insensitive):
'double' Double-precision normalized samples (default).
'native' Samples in the native data type found in the file.
siz = wavread(filename,'size') returns the size of the audio data contained
in filename instead of the actual audio data, returning the vector
siz = [samples channels].
wavread supports multi-channel data, with up to 32 bits per sample.
wavread supports Pulse-code Modulation (PCM) data format only.
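The same read step can be sketched with Python's standard-library `wave` module; here a 0.1 s, 4 kHz, 16-bit PCM tone is written and read back, mirroring the round trip the report performs with wavread (the file name is illustrative):

```python
import math
import os
import struct
import tempfile
import wave

path = os.path.join(tempfile.gettempdir(), 'word.wav')
fs, n = 4000, 400
samples = [int(30000 * math.sin(2 * math.pi * 440 * t / fs))
           for t in range(n)]

with wave.open(path, 'wb') as w:
    w.setnchannels(1)       # mono
    w.setsampwidth(2)       # 16 bits per sample (PCM)
    w.setframerate(fs)
    w.writeframes(struct.pack('<%dh' % n, *samples))

with wave.open(path, 'rb') as w:
    rate = w.getframerate()
    data = struct.unpack('<%dh' % w.getnframes(),
                         w.readframes(w.getnframes()))

print(rate, len(data))  # 4000 400
```

Like wavread, this handles PCM data only; dividing the int16 samples by 32768 reproduces wavread's default 'double' normalization.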
mean - Average or mean value of array
Syntax
M = mean(A)
M = mean(A,dim)
Description
M = mean(A) returns the mean values of the elements along different dimensions of an array.
If A is a vector, mean(A) returns the mean value of A.
If A is a matrix, mean(A) treats the columns of A as vectors, returning a row vector of mean
values.
If A is a multidimensional array, mean(A) treats the values along the first non-singleton
dimension as vectors, returning an array of mean values.
M = mean(A,dim) returns the mean values for elements along the dimension of
A specified by scalar dim. For matrices, mean(A,2) is a column vector
containing the mean value of each row.
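The NumPy counterpart is `np.mean`, with one behavioral difference worth flagging: with no axis argument NumPy averages all elements, whereas MATLAB's mean(A) is column-wise:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])
print(np.mean(A))          # 3.5: all elements (MATLAB's mean(A(:)))
print(np.mean(A, axis=0))  # column means, like MATLAB's mean(A)
print(np.mean(A, axis=1))  # row means, like MATLAB's mean(A,2)
```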
decimate - Decimation: decrease sampling rate
Syntax
y = decimate(x,r)
y = decimate(x,r,n)
y = decimate(x,r,'fir')
y = decimate(x,r,n,'fir')
Description
Decimation reduces the original sampling rate for a sequence to a lower
rate; it is the opposite of interpolation. The decimation process filters
the input data with a lowpass filter and then resamples the resulting
smoothed signal at a lower rate.
y = decimate(x,r) reduces the sample rate of x by a factor r. The decimated vector y is r
times shorter in length than the input vector x. By default, decimate employs an eighth-order
lowpass Chebyshev Type I filter with a cutoff frequency of 0.8*(Fs/2)/r. It filters the input
sequence in both the forward and reverse directions to remove all phase distortion, effectively
doubling the filter order.
y = decimate(x,r,n) uses an order n Chebyshev filter. Orders above 13 are not
recommended because of numerical instability. In this case, a warning is displayed.
y = decimate(x,r,'fir') uses an order 30 FIR filter, instead of the Chebyshev IIR filter.
Here decimate filters the input sequence in only one direction. This technique conserves
memory and is useful for working with long sequences.
y = decimate(x,r,n,'fir') uses an order n FIR filter.
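SciPy's `decimate` mirrors this (an anti-alias lowpass followed by downsampling); the example halves 8 kHz data to 4 kHz, the same rate change the report made:

```python
import numpy as np
from scipy.signal import decimate

x = np.sin(2 * np.pi * 50 * np.arange(8000) / 8000)  # 1 s of 50 Hz at 8 kHz
y = decimate(x, 2)                   # default: order-8 Chebyshev-I IIR, zero phase
y_fir = decimate(x, 2, ftype='fir')  # FIR variant, like decimate(x,r,'fir')
print(len(y), len(y_fir))            # 4000 4000
```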
abs - Absolute value and complex magnitude
Syntax
abs(X)
Description
abs(X) returns an array Y such that each element of Y is the absolute value of the
corresponding element of X.
If X is complex, abs(X) returns the complex modulus (magnitude), which is the same as
sqrt(real(X).^2 + imag(X).^2)
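`np.abs` behaves the same way, returning the complex modulus element-wise:

```python
import numpy as np

z = np.array([3 + 4j, -5 + 0j, 0.5 + 0j])
print(np.abs(z))  # complex modulus: sqrt(real^2 + imag^2) -> [5.  5.  0.5]
```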
filter - 1-D digital filter
Syntax
y = filter(b,a,X)
[y,zf] = filter(b,a,X)
[y,zf] = filter(b,a,X,zi)
y = filter(b,a,X,zi,dim)
[...] = filter(b,a,X,[],dim)
Description
The filter function filters a data sequence using a digital filter that
works for both real and complex inputs. The filter is a Direct Form II
Transposed implementation of the standard difference equation.
y = filter(b,a,X) filters the data in vector X with the filter described by numerator
coefficient vector b and denominator coefficient vector a. If a(1) is not equal to 1, filter
normalizes the filter coefficients by a(1). If a(1) equals 0, filter returns an error.
If X is a matrix, filter operates on the columns of X. If X is a multidimensional array,
filter operates on the first nonsingleton dimension.
[y,zf] = filter(b,a,X) returns the final conditions, zf, of the filter
delays. If X is a row or column vector, output zf is a column vector of
length max(length(a),length(b))-1. If X is a matrix, zf is an array of such
vectors, one for each column of X, and similarly for multidimensional
arrays.
[y,zf] = filter(b,a,X,zi) accepts initial conditions, zi, and returns the
final conditions, zf, of the filter delays. Input zi is a vector of length
max(length(a),length(b))-1, or an array with the leading dimension of size
max(length(a),length(b))-1 and with remaining dimensions matching those of
X.
y = filter(b,a,X,zi,dim) and [...] = filter(b,a,X,[],dim) operate across the
dimension dim.
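SciPy's `lfilter` is the counterpart of MATLAB's filter, with the same Direct Form II Transposed realization of the difference equation (the two-tap moving average is an illustrative filter):

```python
import numpy as np
from scipy.signal import lfilter

# a(1)y(n) = b(1)x(n) + b(2)x(n-1) + ... - a(2)y(n-1) - ...
b = [0.5, 0.5]                           # two-tap moving average
a = [1.0]
y = lfilter(b, a, [1.0, 1.0, 1.0, 1.0])
print(y)  # [0.5 1.  1.  1. ]
```

The first output sample is 0.5 because the filter's delay state starts at zero, the same "time zero" concern noted in the IIR discussion above.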
sum - Sum of array elements
Syntax
B = sum(A)
B = sum(A,dim)
B = sum(..., 'double')
B = sum(..., dim,'double')
B = sum(..., 'native')
B = sum(..., dim,'native')
Description
B = sum(A) returns sums along different dimensions of an array. If A is floating point, that is
double or single, B is accumulated natively, that is in the same class as A, and B has the same
class as A. If A is not floating point, B is accumulated in double and B has class double.
If A is a vector, sum(A) returns the sum of the elements.
If A is a matrix, sum(A) treats the columns of A as vectors, returning a row vector of the sums
of each column.
If A is a multidimensional array, sum(A) treats the values along the first
non-singleton dimension as vectors, returning an array of row vectors.
B = sum(A,dim) sums along the dimension of A specified by scalar dim. The
dim input is an integer value from 1 to N, where N is the number of
dimensions in A. Set dim to 1 to compute the sum of each column, 2 to sum
rows, etc.
B = sum(..., 'double') and B = sum(..., dim,'double') perform additions in
double precision and return an answer of type double, even if A has data
type single or an integer data type. This is the default for integer data
types.
B = sum(..., 'native') and B = sum(..., dim,'native') perform additions in
the native data type of A and return an answer of the same data type. This
is the default for single and double.
If A = int8(1:20), then sum(A) accumulates in double and the result is
double(210), while sum(A,'native') accumulates in int8 but overflows and
saturates to int8(127).
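The NumPy counterpart shows the same accumulator-width issue, with one difference in overflow behavior: forcing a narrow dtype wraps around rather than saturating as MATLAB's 'native' does:

```python
import numpy as np

A = np.arange(1, 21, dtype=np.int8)   # the int8(1:20) example above
print(int(A.sum()))                   # 210: NumPy widens small ints by default
print(int(A.sum(dtype=np.int8)))      # -46: int8 wraps (210 - 256) on overflow,
                                      # where MATLAB saturates to 127
```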