
Assignment 7


MSRSAS - Postgraduate Engineering and Management Programme - PEMP


ASSIGNMENT

Module Code ESD 521

Module Name Embedded DSP

Course M.Sc. [Engg.] in RTES

Department Computer Engineering

Name of the Student Bhargav Shah

Reg. No CHB0911001

Batch Full-Time 2011.

Module Leader Mr. Chandan N.

POSTGRADUATE ENGINEERING AND MANAGEMENT PROGRAMME (PEMP)

M.S.Ramaiah School of Advanced Studies Postgraduate Engineering and Management Programmes(PEMP)

#470-P Peenya Industrial Area, 4th Phase, Peenya, Bengaluru-560 058

Tel: 080 4906 5555, website: www.msrsas.org


Declaration Sheet

Student Name Bhargav Shah

Reg. No CHB0911001

Course RTES

Batch Full-Time 2011

Module Code ESD 521

Module Title Embedded DSP

Module Date 11-06-2012 to 07-07-2012

Module Leader Mr. Chandan N.

Extension requests: Extensions can only be granted by the Head of the Department in consultation with the module leader.

Extensions granted by any other person will not be accepted and hence the assignment will incur a penalty.

Extensions MUST be requested by using the 'Extension Request Form', which is available with the ARO.

A copy of the extension approval must be attached to the assignment submitted.

Penalty for late submission Unless you have submitted proof of mitigating circumstances or have been granted an extension, the

penalties for a late submission of an assignment shall be as follows:

Up to one week late: Penalty of 5 marks

One-Two weeks late: Penalty of 10 marks

More than Two weeks late: Fail - 0% recorded (F)

All late assignments: must be submitted to Academic Records Office (ARO). It is your responsibility to

ensure that the receipt of a late assignment is recorded in the ARO. If an extension was agreed, the

authorization should be submitted to ARO during the submission of assignment.

To ensure assignment reports are written concisely, the length should be restricted to a limit

indicated in the assignment problem statement. Assignment reports greater than this length may

incur a penalty of one grade (5 marks). Each delegate is required to retain a copy of the assignment

report.

Declaration

The assignment submitted herewith is a result of my own investigations and I have conformed to the guidelines against plagiarism as laid out in the PEMP Student Handbook. All sections of the text and results which have been obtained from other sources are fully referenced. I understand that cheating and plagiarism constitute a breach of University regulations and will be dealt with accordingly.

Signature of the student Bhargav Shah Date 07-07-2012

Submission date stamp (by ARO)

Signature of the Module Leader and date Signature of Head of the Department and date


M. S. Ramaiah School of Advanced Studies Postgraduate Engineering and Management Programme - Coventry University (UK)

Assessment Sheet

Department Computer Engineering

Course RTES Batch Full-Time 2011

Module Code ESD 521 Module Title Embedded DSP

Module Leader Chandan N. Module Completion

Date 07-07-2012

Student Name Bhargav Shah ID Number CHB0911001

Attendance Details Theory Laboratory Fine Paid (if any for shortage of attendance)

Remarks

Written Examination – Marks – Sheet (Assessor to Fill)

Q. No a b c d Total Remarks

1

2

3

4

5

6

Marks Scored for 100 Marks Scored out of 50

Result PASS FAIL

Assignment – Marks-Sheet (Assessor to Fill)

Part a b c d Total Remarks

A

B

C

Marks Scored for 100 Marks Scored out of 50

Result PASS FAIL

PMAR- form completed for student feedback (Assessor has to mark) Yes No

Overall-Result

Components Assessor Reviewer

Written Examination (Max 50) Pass / Fail

Assignment (Max 50) Pass / Fail

Total Marks (Max 100) (Before Late Penalty) Grade

Total Marks (Max 100) (After Late Penalty) Grade

IMPORTANT

1. The assignment and examination marks have to be rounded off to the nearest integer and entered in the respective fields

2. A minimum of 40% required for a pass in both assignment and written test individually

3. A student cannot fail on application of late penalty (i.e. on application of late penalty if the marks are below 40, cap at 40 marks)

Signature of Reviewer with date Signature of Module Leader with date


Abstract

DSPs are processors or microcomputers whose hardware, software, and instruction sets are optimized for high-speed numeric processing applications, which is essential for processing digital data representing analog signals in real time. When acting as an audio or video compressor, for example, the DSP receives digital values based on samples of a signal, calculates the result of a compression function operating on these values, and provides digital values that represent the compressed output. The DSP's high-speed arithmetic and logic hardware is programmed to rapidly execute algorithms modeling audio compression and decompression.

In Chapter 1, mechanisms for a driver vision enhancement system for rainy conditions are discussed. To remove raindrops from an image, it is essential to understand raindrop properties, so the different properties of rain are discussed first. After establishing the nature of a raindrop, an algorithm to eliminate it is proposed, based on a detailed analysis of existing algorithms. The block diagram of the proposed system is explained.

Chapter 2 expects the student to carry out a literature study on existing audio compression algorithms. The work involves the design and implementation of a MATLAB algorithm for FFT-based audio compression. The performance of the compression technique is analyzed using the average SNR.

In Chapter 3, the implementation of the audio compression algorithm on a TMS320C6xx-series DSP is documented. To implement MATLAB code on the processor, it is essential to generate processor-specific C code; to obtain this, a SIMULINK model is created. The basic flow of the SIMULINK model is documented, along with the process to port and optimize the algorithm.


Contents

Declaration Sheet
Abstract
Contents
List of Tables
List of Figures
Nomenclature

CHAPTER 1
Driver Vision Enhancement System for rainy condition
  1.1 Introduction
  1.2 Literature survey
  1.3 Rain drop properties and its effect on image
    1.3.1 Spatio-temporal Property
    1.3.2 Chromatic Property
    1.3.3 Photometric constraint
    1.3.4 Shape of the rain drop
    1.3.5 Size of the rain drop
  1.4 Identification of the rain drop noise and its elimination method
  1.5 Top level block diagram for Driver vision enhancement system
  1.6 Conclusion

CHAPTER 2
Design and development of audio compression algorithm
  2.1 Introduction
  2.2 Literature survey on audio compression algorithms and comparison with the FFT algorithm
  2.3 MATLAB code for the system
  2.4 Performance Evaluation of the table
  2.5 Documentation and discussion of results
  2.6 Conclusion
  NOTE: Due to the size constraint of this document, only selected graphs for different D and N values are documented.

CHAPTER 3
DSP Implementation
  3.1 Introduction
  3.2 DSP Implementation using C
  3.3 Identification of Processor
  3.4 Optimization using intrinsic functions
  3.5 Assembly level optimization for bulky loops
  3.6 Testing and comparison with the MATLAB results
  3.7 Conclusion

CHAPTER 4
  4.1 Summary

References
Appendix-1


List of Tables

Table 2.1 Obtained results of SNROV and AVSNRSG


List of Figures

Figure 1.1 Block Diagram of Driver vision enhancement system
Figure 2.1 Sampling of input sound signal
Figure 2.2 Output of sampling mechanism
Figure 2.3 Saving of original samples as file
Figure 2.4 Sliding window concept of FFT
Figure 2.5 Implementation of Window based FFT algorithm
Figure 2.6 Size 64 FFT of input signal
Figure 2.7 FFT of some random signal
Figure 2.8 Compression logic
Figure 2.9 Compression logic
Figure 2.10 IFFT and Decompression logic
Figure 2.11 Original wav file
Figure 2.12 Compressed signal for N=64 & D=8
Figure 2.13 Uncompressed signal for N=128 and D=8
Figure 2.14 SNROV vs (D/N) and AVSNR vs (D/N) for N=64
Figure 2.15 Compressed signal for N=128 & D=32
Figure 2.16 Uncompressed signal for N=128 & D=32
Figure 2.17 SNROV vs (D/N) and AVSNR vs (D/N) for N=128
Figure 2.18 Compressed signal for N=256 & D=16
Figure 2.19 Uncompressed signal for N=256 & D=16
Figure 2.20 SNROV vs (D/N) and AVSNR vs (D/N) for N=256
Figure 3.1 C code generation process from SIMULINK model
Figure 3.2 Logic to read the sound signal in C6xx
Figure 3.3 Logic for compression algorithm
Figure 3.4 Logic for IFFT and decompression
Figure 3.5 Selecting the TMS67XX board in CCS
Figure 3.6 C code compilation in CCS
Figure 3.7 Downloading successfully
Figure 3.8 Assembly Code
Figure 3.8 SIMULINK model
Figure 3.9 Original Signal
Figure 3.10 FFT of Original signal
Figure 3.11 IFFT of original signal


Nomenclature

FFT Fast Fourier Transform

IFFT Inverse Fast Fourier Transform

PCM Pulse Code Modulation

kHz Kilohertz

Hz Hertz

CD Compact Disc

kbps Kilobits Per Second

MDCT Modified Discrete Cosine Transform


CHAPTER 1

Driver Vision Enhancement System for rainy condition

1.1 Introduction

Vehicle crashes remain the leading cause of accidental death and injury in the US, claiming tens of thousands of lives and injuring millions of people each year. Many of these crashes occur during low-visibility conditions (e.g., due to rain), which add to the risk of driving at night. A variety of factors affect the risk of a crash, primarily through the reduction of object visibility. Furthermore, many of these factors also affect the night-time mobility of older drivers, who tend to avoid driving at night. Thus, a two-fold need exists for new technologies that enhance night visibility.

1.2 Literature survey

Driving depends essentially on visual information processing. Therefore, it is no surprise that under restricted visibility conditions, such as at night or under heavy rain, many severe and fatal crashes occur [2]. Because of technical and legal regulations, the problem cannot be solved simply by increasing the light output of low beams. Therefore, infrared-based night vision enhancement systems (NVESs) were developed, which try to overcome the limits of human sensory and cognitive abilities, as well as limits in reaction time. Infrared-sensitive camera-based night vision systems can enhance the visibility of objects emitting or reflecting heat, making visibility comparable to high-beam conditions. Rain produces sharp intensity fluctuations in images and videos, which degrade the performance of outdoor vision systems. These intensity fluctuations depend on the camera parameters, the properties of rain, and the brightness of the scene. The properties of rain, namely its small drop size, high velocity, and low density, make its visibility strongly dependent on camera parameters such as exposure time and depth of field. These parameters can be selected so as to reduce or even remove the effects of rain without altering the appearance of the scene. Conversely, the parameters can also be set to enhance the visibility of rain.

1.3 Rain drop properties and its effect on image

1.3.1 Spatio-temporal Property

Raindrops are randomly distributed in space and fall at high speed when they reach the ground. Because of this high speed, a given pixel is not always covered by rain in two successive frames. Pixels that are covered by rain have a similar intensity distribution [5].

Page 12: Assignment 7

MSRSAS - Postgraduate Engineering and Management Programme - PEMP

12 Embedded DSP

1.3.2 Chromatic Property

A stationary drop acts like a spherical lens: when light passes through the drop it undergoes internal reflections, and the drop becomes brighter than its background. The increase in chrominance values depends on the background. The difference in the three color planes between two consecutive frames will be almost the same; these variations are bounded by a small threshold [3].

1.3.3 Photometric constraint

Photometry deals with the physical properties of the rain. The intensity of a rain streak depends on the brightness of the drop, the background scene radiance, and the integration time of the camera. The photometric model assumes that raindrops have almost the same size and velocity. It is also assumed that pixels lying on the same rain streak have the same irradiance, because the brightness of the drop is only weakly affected by the background.

1.3.4 Shape of the rain drop

The shape of a drop can be expressed as a function of its size. Smaller raindrops are generally spherical in shape, while larger drops resemble oblate spheroids. In a typical rainfall, most of the drops are less than 1 mm in size; hence, most raindrops are spherical.

1.3.5 Size of the rain drop

The minimum size of raindrops falling on the ground depends on vertical wind speeds in clouds. In clouds with updraughts of less than 50 cm/s, drops of 0.2 mm and more will fall out. In air of 90% humidity such a drop can fall 150 m before total evaporation and thus reach the ground; a drop of 1 mm can fall 40 km. Rain that mainly consists of drops of 0.1 mm diameter is called drizzle and is produced by low-layer clouds. The maximum diameter of raindrops is about 7 mm, because larger drops break apart during the fall. Only drops of diameter less than 0.3 mm are nearly perfect spheres at terminal (falling) velocity. Therefore, for larger drops one cannot unambiguously describe the shape by one length. This problem is solved by defining an equivalent diameter: the diameter of a sphere with the same volume as the deformed drop [2].

Based on these optical properties of a drop, we make the following observations:

• Raindrops refract light from a large solid angle of the environment (including the sky) towards the camera. Specular and internal reflections further add to the brightness of the drop. Thus, a drop tends to be much brighter than its background (the portion of the scene it occludes).

• The solid angle of the background occluded by a drop is far less than the total field of view of the drop itself. Thus, in spite of being transparent, the average brightness within a stationary drop (without motion blur) does not depend strongly on its background.


1.4 Identification of the rain drop noise and its elimination method

The visual effects of rain are complex. Rain consists of spatially distributed drops falling at high velocities. Each drop refracts and reflects the environment, producing sharp intensity changes in an image. A group of such falling drops creates a complex time-varying signal in images and videos. Due to the finite exposure time of the camera, intensities due to rain are motion-blurred and hence depend on the background intensities. Based on the dynamics and photometric models of rain, a robust algorithm to detect (segment) regions of rain in videos was developed. Although this model does not explicitly take scene motion into account, the rain models provide strong constraints that are sufficient to disambiguate rain from other forms of scene motion.


Once the video is segmented into rain and non-rain regions, the next step is to remove rain from each frame of the video. For each rain-affected pixel in a frame, its intensity is replaced with an estimate of the background obtained from the neighboring frames; this removes most of the rain from the frame. However, since drop velocities are high compared to the exposure time of the camera, the same pixel may see different drops in consecutive frames. Such cases are not accounted for by the detection algorithm. Fortunately, the probability of raindrops affecting a pixel in more than three consecutive frames is negligible. In the case of a pixel being affected by raindrops in two or three consecutive frames, rain is removed by assigning the average of the intensities in the two neighboring frames (on either side) that are not affected by raindrops. This algorithm can only remove streaks that are detected. Severely defocused streaks and streaks on bright backgrounds produce very small changes in intensity that are difficult to detect in the presence of noise. Raindrops far from the camera are much smaller than a pixel; hence, the intensity at a pixel is due to a large number of drops in the pixel's field of view. These aggregate effects are similar to the effects of fog, so defogging algorithms can be used to remove the steady effects of rain.
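The per-pixel removal step just described can be sketched as follows. This is an illustrative Python sketch, not the report's implementation; the flat-list frame representation and the detection threshold `tau` are assumptions introduced here.

```python
def remove_rain_pixelwise(prev, curr, nxt, tau=30):
    """Replace rain-affected pixels in `curr` with the average of the
    temporally neighboring frames.

    Each frame is a flat list of grayscale intensities. A pixel is
    flagged as rain if it is brighter than BOTH temporal neighbors by
    more than `tau`, since rain streaks cause sharp positive intensity
    spikes that rarely persist across consecutive frames."""
    out = list(curr)
    for i, (p, c, n) in enumerate(zip(prev, curr, nxt)):
        if c - p > tau and c - n > tau:   # candidate rain pixel
            out[i] = (p + n) / 2          # background estimate from neighbors
    return out
```

For example, a five-pixel scanline with a rain spike at index 2 in the middle frame is repaired to the average of the two neighboring frames at that pixel, while unaffected pixels pass through unchanged.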

1.5 Top level block diagram for Driver vision enhancement system

The top-level block diagram of the driver assistance system consists of a camera, an analog-to-digital converter, digital signal processing, and a display. When a raindrop hits the windscreen, the visual effect produced by the apparition of this unfocused raindrop is that a local property of the image changes slightly yet suddenly. Temporal analysis of this local region shows that intensity increases locally while, at the same time, the scene seen through the drop appears blurred, which means that gradients, and hence edges, inside the drop are smoother. This problem can be overcome by using a high-speed camera together with an algorithm that predicts the path of individual raindrops; when a raindrop's path intersects the precise beam of one of the individual mini-lights, the system briefly flicks that beam off. The camera captures images during rain, the images are digitized by the analog-to-digital converter, and the processed result is displayed to the driver. A conceptual diagram of the system is shown in Figure 1.1.

Figure 1. 1 Block Diagram of Driver vision enhancement system

1.6 Conclusion

Vehicle crashes remain the leading cause of accidental death and injury in the world, claiming tens of thousands of lives and injuring millions of people each year. Many of these crashes occur due to poor vision during rainy conditions. Driver vision enhancement systems play a major part in avoiding these crashes by enhancing the driver's sight. If such a system is accurate enough to regenerate the rain-free view in real time, it is of great benefit to the transport system.


CHAPTER 2

Design and development of audio compression algorithm

2.1 Introduction

The underlying idea behind compression is that a data file can be re-written in a different format that takes up less space. Data is said to be compressed if it stores either more information in the same space, or the same information in less space, than a standard uncompressed format. A compression algorithm for an audio signal analyzes the signal and stores it in a different way. An analogy can be made between compression and shorthand: in shorthand, words are represented by symbols, effectively shortening the amount of space occupied. Data compression uses the same concept.

Audio CDs use the popular WAV (waveform) format. WAV falls under the more general RIFF file format used by Windows. The WAV format is uncompressed and comes in several variants; one of them is pulse code modulation (PCM), which is the accepted input for MP3 encoding. The size of a WAV file depends on its sampling rate and sample format. An 8-bit mono WAV sampled at 22,050 Hz (hertz) takes 22,050 bytes per second. A 16-bit stereo WAV with a sampling rate of 44.1 kHz (kilohertz) takes 176,400 bytes per second (44,100 samples/second * 2 bytes * 2 channels). One minute of CD-quality audio therefore takes roughly 10 MB. A tremendous amount of storage space is required to store high-quality digital audio data. The use of compression allows a significant reduction in the amount of data needed to represent an audio signal, usually with only a minimal loss in signal quality. This logical reduction of the audio samples is performed by the compression algorithm.
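The data-rate arithmetic above can be checked with a few lines of Python (an illustrative check only; the helper name is not from the report):

```python
def wav_bytes_per_second(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed PCM data rate in bytes per second."""
    return sample_rate_hz * (bits_per_sample // 8) * channels

mono_8bit = wav_bytes_per_second(22_050, 8, 1)    # 22,050 bytes/s
cd_stereo = wav_bytes_per_second(44_100, 16, 2)   # 176,400 bytes/s
one_minute_mb = cd_stereo * 60 / 1_048_576        # roughly 10 MB per minute
```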

2.2 Literature survey on audio compression algorithms and comparison with the

FFT algorithm

The idea of audio compression is to encode audio data so that it takes up less storage space and less bandwidth for transmission. To meet this goal, different compression methods have been designed. As with other forms of digital data compression, they can be classified into two categories: lossless compression and lossy compression.

Lossless compression

Lossless compression in audio is usually performed by waveform coding techniques. These coders attempt to reproduce the actual shape of the analog signal, quantizing each sample using different types of quantization. These techniques attempt to approximate the waveform and, if a large enough bit rate is available, they get arbitrarily close to it. A popular waveform coding technique that is considered an uncompressed audio format is pulse code modulation (PCM), which is used by Compact Disc Digital Audio (or simply CD). The quality of CD audio signals is referred to as a standard for hi-fidelity. CD audio signals are sampled at 44.1 kHz and quantized using 16-bit/sample PCM, resulting in a very high bit rate of 705 kbps per channel.

Other lossless techniques have been used to compress audio signals, mainly by finding redundancy and removing it, or by optimizing the quantization process. Among these techniques are adaptive PCM and differential quantization. Other lossless techniques, such as Huffman coding and LZW, have been applied directly to audio compression without achieving significant compression ratios.

Lossy compression

As opposed to lossless compression, lossy compression reduces perceptual redundancy; i.e., sounds which are considered perceptually irrelevant are coded with decreased accuracy or not coded at all. For this, frequency-domain coders are preferable, because the perceptual effects of masking can be implemented more easily in the frequency domain by using subband coding.

Using the properties of the auditory system, we can eliminate frequencies that cannot be perceived by the human ear; i.e., frequencies that are too low or too high are eliminated, as well as soft sounds that are drowned out by loud sounds. To determine what information in an audio signal is perceptually irrelevant, most lossy compression algorithms use transforms such as the Modified Discrete Cosine Transform (MDCT) to convert time-domain sampled waveforms into the frequency domain. Once transformed into the frequency domain, frequency components can be allocated bits according to how audible they are (i.e., the number of bits can be determined by the SNR). The audibility of spectral components is determined by first calculating a masking threshold, below which sounds are estimated to be beyond the limits of human perception.

Briefly, the modified discrete cosine transform (MDCT) is a Fourier-related transform with the additional property of being lapped. It is designed to be performed on consecutive blocks of a larger data set, where subsequent blocks are overlapped so that the last half of one block coincides with the first half of the next. This overlapping, in addition to the energy-compaction qualities of the DCT, makes the MDCT especially attractive for signal compression, since it helps avoid artifacts stemming from the block boundaries.

2.3 MATLAB code for the system

There are three main subsystems in the designed compression algorithm.

1. FFT implementation


Signals are functions of time. A frequency response representation is a way of representing the same signal as a function of frequency. The Fourier transform defines a relationship between a signal in the time domain and its representation in the frequency domain. Being a transform, no information is created or lost in the process, so the original signal can be recovered from its Fourier transform, and vice versa. A sound signal is fundamentally an analog signal; to process it in a digital system, it must first be digitized using an ADC. The output of the ADC is the digitized signal.

Figure 2.1 shows the sampling of the input signal. In MATLAB, "wavrecord" is a predefined function that takes the number of samples to record as its first argument, followed by the sampling frequency, the number of channels, and the data type in which samples are stored. Here the sampling frequency is 20161 Hz, so the delay between two consecutive samples is 1/20161 s, which is simply 1/Fs. For simplicity of the algorithm, the voice signal is recorded on a single channel. The range of sample values is -1 to 1 using a 16-bit representation, which is selected by the last argument of wavrecord.

Figure 2. 1 Sampling of input sound signal

Figure 2.2 shows the output of the code section shown in Figure 2.1. Here the input array holds all the sample values of the input analog sound signal, so the total number of samples is 100500 at a sampling frequency of 20161 Hz.

Figure 2. 2 Output of sampling mechanism

To observe its size, the input signal is saved on the hard drive. Figure 2.3 shows this mechanism. The MATLAB function "auwrite" is used to store the file in .au format. Its first argument is the input sample array, which in our case is "input". The second


argument is the sampling frequency, "Fs". The last argument is the path where the file is stored; here it is saved in D:\ under the name "original".

Figure 2. 3 Saving of original samples as file

As per the requirement, 64-, 128-, and 256-point FFTs are needed. To achieve this, a sliding window is taken over the input samples. This mechanism is represented in Figure 2.4. In the first operation in the image, samples 1 to 64 are taken and converted into the frequency domain. At the second stage, the window is advanced to the next 64 samples. As per the requirements, this window is taken with sizes of 64, 128, and 256 samples.

If the sample array size is not a multiple of the window size, some samples will remain at the end of the array. There are two approaches to handle this condition.

Padded approach

The input array is padded with zeros at the end to make its length a multiple of the window size. This approach is useful when the sampling frequency is low.

Unpadded approach

The remaining samples at the end are simply discarded. At a higher sampling frequency, eliminating these last samples makes no audible difference.

Figure 2. 4 Sliding window concept of FFT

Figure 2.5 shows the algorithm implemented to perform the FFT with the windowing concept. In the initial declaration, window_start is assigned 1 and window_end is assigned 64. By changing the window_end value, various FFT point sizes can be achieved. The main for loop iterates over the


number of samples; when the variable end_loop reaches the end of the array, the main for loop terminates. At every iteration, 64 samples from the input array are transferred to the variable 'g' and the FFT is computed. The result of this FFT is stored in the variable fourier_transfarm.

Figure 2. 5 Implementation of Window based FFT algorithm

Figure 2.6 shows the MATLAB variable window. The size of the 64-point FFT transform of the input samples is almost the same as that of the raw input array. The small difference in size is because the last 5 samples cannot form a full 64-point frame. To handle this, the unpadded approach is chosen, given the high sampling frequency.

Figure 2. 6 Size of 64 FFT of input signal

2. Compression algorithm implementation

Data compression is classified into two major categories: lossless and lossy. A lossless compression produces an exact copy of the original after decompression, while its lossy counterpart does not. A typical example of lossless compression is the ZIP format. This form of data compression is effective on a range of files, but compressing images and audio this way is not as effective, since the information in these types of data is less redundant. This is where lossy, or perceptually lossless, compression comes in.

A time-domain graph shows how a signal changes over time, whereas a frequency-domain graph shows how much of the signal lies within each given frequency band over a range of frequencies. Figure 2.7 shows the frequency-domain transformation of a random signal.


Figure 2. 7 FFT of some random signal

At the lower frequencies, a larger number of significant components is stored; the voice signal content lies mainly in the lower frequency range. The other observation is that the second half of the graph is a mirror image of the first half. As per the requirement, from every FFT frame we consider D samples to represent the whole frame; in the case of D=8, only 8 samples from every frame are retained. By reducing the number of FFT output points, file compression is achieved. This phenomenon is shown in Figure 2.9.

Figure 2. 8 Compression logic

Figure 2.9 shows the compression logic for the given input signal. The central idea is that, in the for loop, the variable "result" holds the 64-point FFT of every window, and the compressed signal is constructed by concatenating the first D samples of each window. Here the value of D varies over 2, 4, 8, 16, 32, and 64.

Figure 2. 9 Compression logic


Here compression is done by reducing the number of samples; as a result, the storage required for the samples is reduced.

Length of compressed signal = D × (total number of samples) / (FFT frame size)
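This relation can be captured in a small helper (a sketch; the function name is illustrative), checked against the N and D combinations used later in this chapter:

```c
/* Length of the compressed signal, as given by the equation above:
   length = D * (total number of samples) / (FFT frame size). */
long compressed_length(long D, long total_samples, long frame_size) {
    return D * total_samples / frame_size;
}
```

For the 100800-sample input used below, this gives 12600 points for N=64, D=8; 25200 for N=128, D=32; and 6300 for N=256, D=16.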

3. IFFT implementation

Figure 2.10 shows the decompression and IFFT logic for the compressed audio signal. During compression, D points from every window were retained as the compressed audio. To regenerate the signal, the for loop iterates (length of compressed signal)/D times. A 64-point IFFT is performed on each group of D points of the compressed sequence, which yields 64 points that are approximately similar to the input sequence.

Figure 2. 10 IFFT and Decompression logic

2.4 Performance Evaluation

Signal-to-noise ratio, or SNR, is a measurement that describes how much noise is present in the output of a device relative to the signal level. A small amount of noise may not be objectionable if the output signal is very strong; in many cases the noise may not be audible at all. But if the signal level is very small, even a very low noise level can have an adverse effect. To make the determination objectively, we need to compare the relative strengths of the output signal and the noise.

Human perception of sound is affected by SNR, because noise added to a signal is less noticeable if the signal energy is large enough. When digitizing an audio signal, the SNR should ideally be constant across all quantization levels, which requires a step size proportional to the signal value. This kind of quantization can be achieved using a logarithmic compander (compressor-expander). Using this technique it is possible to reduce the dynamic range of the signal, thus increasing coding efficiency by using fewer bits. The two most common standards are the μ-law and the A-law, widely used in telephony.


Table 2. 1 Obtained results of SNROV and AVSNRSG

Frame size (N)    D     SNROV      AVSNRSG
64                2      9.8168     2.2979
64                4     21.4872     9.4578
64                8     29.9553     4.6043
64               16     39.2292     9.0322
128               4     13.1462     3.9324
128               8     23.6417     4.3549
128              16     31.8198     1.5627
128              32     40.8769     2.0435
256               8     14.7173     4.2964
256              16     24.7699    24.7699
256              32     32.8703     1.0505
256              64     41.8077     1.1836

D is the number of frequency components of the audio signal that are retained. Increasing the D value increases the resolution of the signal, bringing it closer to the original, so a larger D recovers a better signal and gives better quality audio. This can also be seen from the SNR values, which increase as D increases, as shown in Table 2.1. The drawback of choosing a large D is that the file size grows as D increases.

N represents the FFT size; the length of the FFT window is N. Three values of N are considered: 64, 128, and 256. Increasing N increases the resolution of the signal by increasing the number of samples per frame, which improves the quality of the audio signal. However, the quality depends on both N and D: if N is increased but D is chosen to be very low, the overall signal will not be of high quality. This is shown in Table 2.1.

2.5 Documentation and discussion of results

In this part of the assignment, the waveforms of the original, compressed, and uncompressed signals obtained in MATLAB are documented. There are two waveforms associated with each combination of N and D values; analysis is documented for only three combinations of N and D. Figure 2.11 shows the time-domain representation of the original wave. This input file remains the same for all combinations of D and N values. In the original wave, nearly


100000 samples are recorded, as can be seen from the x axis of Figure 2.11.

Figure 2. 11 Original wav file

N=64 and D=8

Figure 2.12 shows the compressed version of the original signal using N=64 and D=8. As per our equation, the length of the compressed signal is:

Length of compressed signal = D × (total number of samples) / (FFT frame size)
Length of compressed signal = (8 × 100800) / 64
Length of compressed signal = 12600

This can be seen from the x axis of the signal in Figure 2.12. Compression is done by logically eliminating samples; here the compressed file size is about 12% of the original file size.

Figure 2.13 shows the signal reconstructed from the compressed signal shown in Figure 2.12. The decompression algorithm regenerates the original signal from the compressed one. Here the uncompressed signal has nearly 100000 samples, like the original in Figure 2.11. The amplitude of the uncompressed signal is increased relative to the original signal, which indicates that the signal is stronger than the noise in the signal-to-noise ratio (SNR). Figure 2.14 shows SNROV vs (D/N) and AVSNR vs (D/N) for the same D and N values.

Figure 2. 12 Compressed signal for N=64 & D=8


Figure 2. 13 Uncompressed signal for N=128 and D=8

Figure 2. 14 SNROV vs (D/N) and AVSNR vs (D/N) for N=64

N=128 and D=32

Figure 2.15 shows the compressed version of the original signal using N=128 and D=32. As per our equation, the length of the compressed signal is:

Length of compressed signal = D × (total number of samples) / (FFT frame size)
Length of compressed signal = (32 × 100800) / 128
Length of compressed signal = 25200

This can be seen from the x axis of the signal in Figure 2.15. Compression is done by logically eliminating samples; here the size of the compressed file is 25% of the original file.

Figure 2.16 shows the signal reconstructed from the compressed signal shown in Figure 2.15. The decompression algorithm regenerates the original signal from the compressed one. Here the uncompressed signal has nearly 100000 samples, like the original in Figure 2.11. The amplitude of the uncompressed signal is increased relative to the original signal, which indicates that the signal is stronger than the noise in the signal-to-noise ratio (SNR).


Figure 2. 15 Compressed signal for N=128 & D=32

The quality of this signal is degraded compared to the previous one. The main reason is that the average amplitude of the decompressed signal is 0.05, which means that some amount of noise (disturbance) will always be present in the compressed signal; this noise is comparatively lower in the previous case. Figure 2.17 shows SNROV vs (D/N) and AVSNR vs (D/N) for N=128.

Figure 2. 16 Uncompressed signal for N=128 & D=32

Figure 2. 17 SNROV vs (D/N) and AVSNR vs (D/N) for N=128


N=256 and D=16

Figure 2.18 shows the compressed version of the original signal using N=256 and D=16. As per our equation, the length of the compressed signal is:

Length of compressed signal = D × (total number of samples) / (FFT frame size)
Length of compressed signal = (16 × 100800) / 256
Length of compressed signal = 6300

This can be seen from the x axis of the signal in Figure 2.18. Compression is done by logically eliminating samples; here the size of the compressed file is about 6% of the original file.

Figure 2. 18 Compressed signal for N=256 & D=16

Figure 2.19 shows the signal reconstructed from the compressed signal shown in Figure 2.18. The decompression algorithm regenerates the original signal from the compressed one. Here the uncompressed signal has nearly 100000 samples, like the original in Figure 2.11. The amplitude of the uncompressed signal is almost the same as that of the original signal, which indicates that the signal is stronger than the noise in the signal-to-noise ratio (SNR).

Figure 2. 19 Uncompressed signal for N=256 & D=16

The quality of this signal is degraded compared to the original, but compared to the other combinations of D and N values it is good. Figure 2.20 shows SNROV vs (D/N)


and AVSNR vs (D/N) for N=256.

Figure 2. 20 SNROV vs (D/N) and AVSNR vs (D/N) for N=256.

2.6 Conclusion

In this audio compression technique, a sound signal is recorded using MATLAB. An N-point FFT is performed on the audio signal to obtain its frequency-domain components. In the frequency domain, the first D components of every FFT window are taken as the compressed signal. To recover the original signal, an N-point IFFT of each D-point frame of the compressed signal is performed. Through this compression and decompression process, the quality of the sound file decreases according to the D value.

NOTE: Due to the size constraints of this document, selected graphs for different D and N values are documented.


CHAPTER 3

DSP Implementation

3.1 Introduction

Digital signal processing is one of the core technologies in rapidly growing application areas such as wireless communications, audio and video processing, and industrial control. The number and variety of products that include some form of digital signal processing has grown dramatically over the last few years, and with this increasing popularity the variety of DSP-capable processors has expanded greatly. DSPs are processors or microcomputers whose hardware, software, and instruction sets are optimized for high-speed numeric processing applications, which are essential for processing digital data representing analog signals in real time.

Digital signal processors such as the TMS320C6x (C6x) family are like fast special-purpose microprocessors with a specialized architecture and an instruction set appropriate for signal processing. The C6x notation designates a member of Texas Instruments' (TI) TMS320C6000 family of digital signal processors. The architecture of the C6x digital signal processor is very well suited for numerically intensive calculations. Based on a very-long-instruction-word (VLIW) architecture, the C6x is considered TI's most powerful processor. Digital signal processors are used for a wide range of applications, from communications and control to speech and image processing. The general-purpose digital signal processor market is dominated by applications in communications (cellular), while embedded digital signal processor applications are dominated by consumer products. They can handle different tasks, since they can readily be reprogrammed for a different application. DSP techniques have been very successful because of the development of low-cost software and hardware support. Here, the audio compression algorithm developed in the previous chapter is ported to the C6x.

3.2 DSP Implementation using C

In this section, C code is developed for audio compression using the FFT. The idea of the audio compression remains the same as in the model developed in SIMULINK. Due to size constraints, the SIMULINK model is not explained in this document, but its basic flow is the same as that of the generated C code, which is explained briefly. The C code performs the FFT and IFFT with the appropriate number of points, which constitutes one half of the audio compression process; after converting the signal into the frequency domain, the number of samples is reduced according to the D value, which constitutes the other half of the algorithm. The generated C code can process a 5-second audio signal, as shown in Figure 3.1. If the audio signal is longer, it has to be scaled down to 5 s in order to


satisfy the code requirement; otherwise the code has to be modified suitably. The logic involved in the C code for performing the FFT, compression, and IFFT is explained in detail using flowcharts in the next sections. The C code is generated for audio compression; the same concept can be extended to .mp3 as well. The generated C code is documented in the appendix.

Figure 3. 1 C code generation process from SIMULINK model

An overview of the audio compression C-code

First, the audio signal from the real world is digitized in MATLAB with the appropriate sampling frequency, number of channels, duration, and storage data range, and written to text files.

The text file containing the sample values of the sound signal is read into an array using C in CCS for further processing.

Then, depending on the required N value, a 64-, 128-, or 256-point FFT is performed on the audio samples to convert the signal into the frequency domain.

Depending on the required D value, the first D frequency components of every N-sized FFT frame are transferred to another array, which forms the compressed file.

After compression, as part of the decompression, an N-point IFFT of the D-sized frames of the compressed array is performed. The size of the output array is equal to that of the original input array.

These uncompressed sample values are then written to a text file, which can be read in MATLAB and played back as a sound signal. The amount of information in the sound file remains the same as


in it, but as a result of lossy compression, the quality of the information is degraded according to the N and D values.

FFT implementation

This part discusses the logic for performing an N-point FFT on the digitized audio signal using the C language. Initially, all the samples are sent serially from the computer to the TMS320C6713 DSP. The sampling frequency is fixed, so the number of samples is fixed for 5 s of audio. After receiving the whole array of samples, the DSP performs the N-point FFT. The logic of the N-point FFT operation is explained in detail with the help of the flowcharts below.

Figure 3.2 shows the flowchart that explains how the digitized sound signal is obtained in the DSP using C. The left-side flow graph shows the MATLAB algorithm running on the computer, which sends the sound data over the UART at a baud rate of 9600. Our sampling frequency is 20161 Hz, so for five seconds the number of samples is roughly 100800. The total time to transfer the voice signal from the computer to the DSP is:

Figure 3. 2 Logic to read the sound signal in C6xx

Total time = number of samples / baud rate


Total time = 100800 / 9600

Total time = 10.5 s
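The same estimate can be expressed as a one-line helper (illustrative; it keeps the text's simplification of one symbol per sample, 100800/9600 = 10.5 s, and ignores UART start/stop framing bits):

```c
/* Transfer-time estimate as used in the text:
   one symbol per sample at the given baud rate. */
double transfer_time(double samples, double baud) {
    return samples / baud;
}
```

In a real UART link each 8-bit byte carries framing overhead (typically 10 bits per byte), so the actual transfer of 16-bit samples would take noticeably longer than this estimate.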

This data is received by the DSP. The program flow graph of the algorithm used by the DSP is shown on the right side. It is essential to configure the UART with the proper baud rate to establish communication. After establishing communication, the algorithm waits for data to arrive on the Rx pin of the UART. As data are received, an array named 'a' is used to store the voice data. After the whole array has been received, an N-point FFT is performed on a window of N samples; this window slides to the end of the array, performing an N-point FFT each time, so the total number of iterations of this for loop is (number of samples / N). The output of this sliding-window FFT is stored in an array named "fft".

Compression algorithm implementation

This part discusses the logic for performing compression with different D values in C. After storing the result of the N-point FFT, selected frequency components are chosen to compress the signal. The basic idea is to logically eliminate sample values to reduce the size of the signal: only the first D points from every N-point FFT window are used to construct the array named "compressed". This compression is done by the TMS320C67xx. The logic of the compression operation is explained in detail with the help of the flowchart below.

Figure 3. 3 Logic for compression algorithm

Figure 3.3 shows the flowchart that explains how the sound signal is compressed in the DSP using C. The program flow graph starts from point 'B', which is the ending point in Figure


3.2. At the starting point (point 'B'), the N-point FFT sequence is available for compression, so the frequency-domain representation of the sound signal has the same size as the original signal. To achieve compression we need to deal with two offsets: one increments the address in the "compressed" array to append and store elements, with an offset of D; the other slides the window over the "fft" array holding the audio signal in the frequency domain, with an offset of N.
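The two-offset scheme can be sketched in C as follows (names are illustrative; real FFT bins would be complex, while a single real array is used here for brevity):

```c
/* Keep the first D bins of every N-point frame: the read offset
   advances by N through "fft", while the write offset advances
   by D through "compressed", as described above. Returns the
   length of the compressed array. */
long compress(const double *fft, long total, int N, int D,
              double *compressed) {
    long w = 0;
    for (long r = 0; r + N <= total; r += N)   /* offset N on input  */
        for (int k = 0; k < D; k++)
            compressed[w++] = fft[r + k];      /* offset D on output */
    return w;
}
```

After the loop, w equals D × (total / N), matching the compressed-length equation from the previous chapter.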

IFFT implementation

This part discusses the logic for performing decompression and the IFFT with different D values in C. After compression, the signal has to be decompressed to retrieve the original sound signal. This is done by taking the IFFT of the "compressed" array, which holds the compressed signal: an N-point IFFT is computed from each group of D points, which again gives the time-domain representation of the original voice signal. After finishing the calculations, the result is sent over the UART at the same baud rate of 9600. On the other side of the system, a MATLAB server receives the data and plays it in MATLAB using the built-in player. This decompression is done by the TMS320C67xx. The logic of the decompression operation is explained in detail with the help of the flowchart below.

Figure 3. 4 Logic for IFFT and decompression

Figure 3.4 shows the flowchart that explains how the sound signal is decompressed in the DSP using C. The left-side program flow graph shows the algorithm running on the


TMS320C67xx, which receives the compressed signal from the above subsystem. Here, D points at a time from the compressed signal are fed into the N-point IFFT. To obtain the original signal, the whole compressed signal is passed through the N-point IFFT while advancing the source and destination addresses; each N-point IFFT of D points gives N points as its result. Initially the result is stored in an array named "uncompress". After the original signal has been reconstructed, the whole array is transferred to MATLAB. The right-side graph represents the MATLAB algorithm: at the initial stage it waits for data to arrive; on arrival it stores the data in an array; and after reception is complete, the MATLAB algorithm plays the received file using the "wavplay" API.

Compiling C code in CCS

In this section, the C code developed for audio compression in the previous section is compiled and simulated by selecting the TMS320C67xx processor as the target in the CCS simulator. Configuring the TMS320C67xx and compiling the C code for it is documented here.

Figure 3. 5 Selecting the TMS320C67xx board in CCS

Figure 3.5 shows that the TMS320C67xx board is selected as the target in CCS for the simulation of the audio compression C code. The code is compiled and checked for errors and warnings, and then simulated to examine the results.

Figure 3.6 shows that the C code developed for audio compression compiled successfully in CCS without any errors or warnings. This indicates that the C code developed for audio compression can run successfully on the TMS320C67xx processor. This process produces the machine-level code for the C67x.


Figure 3. 6 C code compilation in CCS

Loading program into DSP Processor

Finally, to run the program, load it into the DSP: go to File -> Load Program in CCS and load the executable file (.out) that the compiler generated on successful compilation. Figure 3.7 shows this downloading process.

Figure 3. 7 Downloading successfully

3.2 Identification of Processor

Over the past few years it has been seen that general-purpose computers perform two major tasks:

(1) Data manipulation, and

(2) Mathematical calculations

All microprocessors are capable of both tasks, but it is difficult to make a device that performs both optimally, because of technical tradeoffs such as the size of the instruction set and how interrupts are handled. As a broad generalization, these factors have made traditional microprocessors, such as the Pentium series, primarily directed at data manipulation, while DSPs are designed to perform the mathematical calculations needed in


Digital Signal Processing.

The TMS320C6x processors were the first to use the VelociTI architecture, implementing VLIW. The TMS320C62x is a fixed-point processor, and the '67x is a floating-point processor with 32-bit integer support. The discussion in this chapter is focused on the TMS320C67x processor; its architecture and associated peripherals are also discussed.

The C6713 DSK is a low-cost standalone development platform that enables users to

evaluate and develop applications for the TI C67xx DSP family. The DSK also serves as a

hardware reference design for the TMS320C6713 DSP. Schematics, logic equations and application

notes are available to ease hardware development and reduce time to market.

Functional Overview of C6713

The DSP on the 6713 DSK interfaces to on-board peripherals through a 32-bit-wide EMIF (External Memory Interface). The SDRAM, Flash, and CPLD are all connected to this bus. EMIF signals are also connected to the daughter-card expansion connectors, which are used for third-party add-in boards.

The DSP interfaces to analog audio signals through an on-board AIC23 codec and four 3.5

mm audio jacks (microphone input, line input, line output, and headphone output). The codec can

select the microphone or the line input as the active input. The analog output is driven to both the

line out (fixed gain) and headphone (adjustable gain) connectors. McBSP0 is used to send

commands to the codec control interface while McBSP1 is used for digital audio data. McBSP0

and McBSP1 can be re-routed to the expansion connectors in software.

A programmable logic device called a CPLD is used to implement glue logic that ties the

board components together. The CPLD has a register based user interface that lets the user

configure the board by reading and writing to its registers.

The DSK includes 4 LEDs and a 4 position DIP switch as a simple way to provide the user

with interactive feedback. Both are accessed by reading and writing to the CPLD registers. An

included 5V external power supply is used to power the board. On-board switching voltage

regulators provide the +1.26V DSP core voltage and +3.3V I/O supplies. The board is held in reset

until these supplies are within operating specifications. Code Composer communicates with the

DSK through an embedded JTAG emulator with a USB host interface. The DSK can also be used

with an external emulator through the external JTAG connector.

Given the required functionality, the C6713 is chosen for implementing the audio compression algorithm.


3.3 Optimization using intrinsic functions

An intrinsic function is a function available for use in a given language whose

implementation is handled specially by the compiler. Typically, it substitutes a sequence of

automatically generated instructions for the original function call, similar to an inline function.

Unlike an inline function though, the compiler has an intimate knowledge of the intrinsic function

and can therefore better integrate it and optimize it for the situation. Compilers that implement

intrinsic functions generally enable them only when the user has requested optimization, falling

back to a default implementation provided by the language runtime environment otherwise.

The C6000 compiler recognizes a number of intrinsic operators. Intrinsics allow the programmer to

express the meaning of certain assembly statements that would otherwise be cumbersome or inexpressible

in C/C++, and they are used like ordinary functions. Here the code is optimised to reduce the number of

CPU clock cycles.
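As one concrete example, the C6000 compiler provides the `_sadd` intrinsic, which it maps to a single SADD (saturating add) instruction. The function below is a host-side emulation of that behaviour so it can be checked off-target; on the C6713, calling `_sadd(a, b)` directly would collapse the whole body into one instruction.

```c
#include <stdint.h>

/* Portable emulation of the C6000 _sadd intrinsic: 32-bit signed
 * addition that saturates at the int32_t limits instead of wrapping.
 * On the C6713 the compiler emits a single SADD instruction for
 * _sadd(a, b), with no function-call overhead. */
static int32_t sadd_emulated(int32_t a, int32_t b)
{
    int64_t sum = (int64_t)a + (int64_t)b;  /* widen so overflow is visible */
    if (sum > INT32_MAX)
        return INT32_MAX;                   /* clip positive overflow */
    if (sum < INT32_MIN)
        return INT32_MIN;                   /* clip negative overflow */
    return (int32_t)sum;
}
```

Saturation like this is exactly the behaviour audio code wants: an overflowing sample clips instead of wrapping to the opposite sign.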

Figure 3.6 Un-optimized code needs 201518 CPU cycles for execution

Figure 3.6 shows that when the C code for audio compression is run without any

optimization, it requires 201518 CPU cycles to execute.

Figure 3.7 Optimization options in CCS and running the optimized code (the figure annotates the number of cycles to complete the operation without optimization, the number of cycles after optimization using intrinsic functions, and the compiler option that directs intrinsic function-level optimization)


Figure 3.7 shows the different optimization options available in CCS, and also

shows that the audio compression code needs 183436 CPU cycles for execution after

'intrinsic function level' optimization is applied in CCS, a saving of roughly 9% over the un-optimized 201518 cycles.

3.4 Assembly level optimization for bulky loops

DSPs are programmed in the same languages as other scientific and engineering

applications, usually assembly or C. Programs written in assembly can execute faster, while

programs written in C are easier to develop and maintain. DSP programs

differ from traditional software tasks in two important respects. First, the programs are

usually much shorter, say one hundred lines versus ten thousand. Second, execution

speed is often a critical part of the application; after all, that is why a DSP is used in the first

place. These two factors motivate many software engineers to switch from C

to assembly when programming digital signal processors. The main limitations of assembly are

bulky code and the need for the developer to understand the complete hardware architecture.

Here it is observed that the 'for' loop involved in compressing the audio signal

takes a long time to execute. As the central idea of this part, that loop is rewritten in assembly

language for the C6xx; the resulting assembly code is shown in Figure 3.8.

Figure 3.8 Assembly code

3.5 Testing and comparison with the MATLAB results

Here a Simulink model is generated for the MATLAB code. To port the algorithm

to the C6x, processor-specific C code is essential, and it is easy to generate such code

from a Simulink model. Figure 3.9 shows the Simulink model for the C code explained above.

For functionality testing, one input signal is applied to the Simulink model, and all signals

can be observed using the scope. Figure 3.10 shows the original signal; no computation has been

performed on it yet.
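The generated code in Appendix 1 converts the codec's 16-bit samples to doubles on input and back to 16-bit words on output (the ADCScaleFactor and DACScaleFactor constants). A host-side sketch of that scaling:

```c
#include <stdint.h>

/* Sample scaling used by the generated ADC and DAC blocks (Appendix 1):
 * 16-bit codec samples map to doubles in [-1.0, 1.0) with factor
 * 1/32768, and are mapped back with factor 32768 before the DAC. */
static double adc_to_real(int16_t sample)
{
    return (double)sample * (1.0 / 32768.0);
}

static int16_t real_to_dac(double value)
{
    return (int16_t)(value * 32768.0);  /* truncates toward zero, as in
                                           the generated cast */
}
```

Working in [-1, 1) lets the FFT and compression stages run in floating point regardless of the codec's word length.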


Figure 3.9 Simulink model

Figure 3.10 Original signal

An N-point FFT is performed on the input signal. The first component of the FFT result is

the DC component, and its amplitude is high. This can be seen in

Figure 3.11, which shows the N-point FFT representation of the input signal; the tall yellow spike

recurring at regular intervals is this DC component.
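The strong first component has a simple explanation: bin 0 of an N-point DFT is the plain sum of the input samples, since the bin-0 twiddle factor e^(-j*2*pi*0*n/N) equals 1 for every n. Any non-zero mean in the audio therefore appears as a large DC term. A minimal plain-C check of this property (illustrative only, not the optimized MWDSP routines used on the DSP):

```c
/* Bin 0 of an N-point DFT: X[0] = sum of x[n], because the twiddle
 * factor for bin 0 is 1 for every sample index.  A non-zero mean in
 * the signal therefore produces a large DC component. */
static double dft_dc_bin(const double *x, int n)
{
    double acc = 0.0;
    int i;
    for (i = 0; i < n; i++)
        acc += x[i];
    return acc;
}
```

A zero-mean signal would show no such spike, which is why the DC bin dominates the plot for typical biased inputs.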

Figure 3.11 FFT of the original signal


After FFT and compression, the DSP decompresses the signal to retrieve the original.

As per the algorithm developed above, the decompressed signal should be equivalent to the original

signal. Figure 3.12 shows the decompressed signal, which is equivalent to the original signal

shown in Figure 3.10.

Figure 3.12 IFFT of the original signal

3.6 Conclusion

The goal of this work was to build a Simulink model for the audio compression and

decompression algorithm. Processor-specific C code is generated from the Simulink model and

optimized using intrinsic functions. As an effect of the optimization, the number of processor cycles

needed by the audio compression algorithm is reduced. The optimized algorithm is successfully ported to the

TMS320C6xx DSP.


CHAPTER 4

4.1 Summary

This module helped in understanding the concepts of signal processing and the general block

diagram associated with any real-time signal processing application. The aliasing effect,

oversampling, and undersampling were discussed in class and implemented in the lab as well.

Filters, one of the most important blocks in signal processing, were discussed in the

sessions, and it was concluded that convolution performs basic filtering. The filter design was

implemented using FDATOOL.

The CCS introduction in the lab helped in understanding and implementing many real-time

problems. The echo operation was implemented in C, and the generated

header file was imported into CCS. The architecture, the assembly instructions, and the constraints on

issuing assembly instructions in parallel for the TMS320C67xx series provide a base for using any DSP. The peripherals

of the processor and optimization were discussed in class.

Solving this assignment gave practice with different audio compression techniques and

enhanced C programming skills.

Audio processing concepts were covered in the theory class. The interfacing examples shown in

the lab for the TMS320C5416 and TMS320C6713 helped in understanding the real-time importance of this

subject. It was a truly beneficial module in terms of both theoretical concepts and practical

exposure.




Appendix 1

/*

* comp1.c

*

* Real-Time Workshop code generation for Simulink model "comp1.mdl".

*

* Model Version : 1.7

* Real-Time Workshop version : 7.0 (R2007b) 02-Aug-2007

* C source code generated on : Tue Aug 14 13:15:08 2012

*/

#include "comp1.h"

#include "comp1_private.h"

/* Block signals (auto storage) */

#pragma DATA_ALIGN(comp1_B, 8)

BlockIO_comp1 comp1_B;

/* Real-time model */

RT_MODEL_comp1 comp1_M_;

RT_MODEL_comp1 *comp1_M = &comp1_M_;

/* Model output function */

static void comp1_output(int_T tid)

{

/* S-Function Block: <Root>/ADC (c6416dsk_adc) */

{

const real_T ADCScaleFactor = 1.0 / 32768.0;

int_T i;

int16_T *blkAdcBuffPtr;

// Retrieve ADC buffer pointer and invalidate CACHE

blkAdcBuffPtr = (int16_T *) getAdcBuff();

CACHE_wbInvL2( (void *) blkAdcBuffPtr, 256, CACHE_WAIT );

for (i = 0; i < 64; i++) {

comp1_B.ADC[i] = (real_T)*blkAdcBuffPtr * ADCScaleFactor;

/* Skip Right side for mono mode */

blkAdcBuffPtr += 2;

}

}

/* Signal Processing Blockset FFT (sdspfft2) - '<Root>/FFT' */

/* Real input, 1 channels, 64 rows, linear output order */

/* Interleave data to prepare for real-data algorithms: */

MWDSP_FFTInterleave_BR_D(comp1_B.FFT, comp1_B.ADC, 1, 64);

/* Apply half-length algorithm to single real signal: */

{


creal_T *lastCol = comp1_B.FFT; /* Point to last column of input */

MWDSP_R2DIT_TBLS_Z(lastCol, 1, 64, 32, &comp1_P.FFT_TwiddleTable[0],

2, 0);/* Radix-2 DIT FFT using TableSpeed twiddle computation */

MWDSP_DblLen_TBL_Z(lastCol, 64, &comp1_P.FFT_TwiddleTable[0], 1);

}

/* Embedded MATLAB: '<Root>/Embedded MATLAB Function' */

{

int32_T eml_i0;

/* This block supports the Embedded MATLAB subset. */

/* See the help menu for details. */

for (eml_i0 = 0; eml_i0 < 64; eml_i0++) {

comp1_B.y[eml_i0].re = comp1_B.FFT[eml_i0].re;

comp1_B.y[eml_i0].im = comp1_B.FFT[eml_i0].im;

}

}

/* Signal Processing Blockset FFT (sdspfft2) - '<Root>/IFFT' */

/* Complex input, complex output, 1 channel, 64 rows, linear output order */

MWDSP_R2BR_Z_OOP(comp1_B.IFFT, (const creal_T *)comp1_B.y, 1, 64, 64); /* Out-of-place bit-reverse reordering */

/* Radix-2 DIT IFFT using TableSpeed twiddle computation */

MWDSP_R2DIT_TBLS_Z(comp1_B.IFFT, 1, 64, 64, &comp1_P.IFFT_TwiddleTable[0], 1, 1);

MWDSP_ScaleData_DZ(comp1_B.IFFT, 64, 1.0/64);

/* Embedded MATLAB: '<Root>/Embedded MATLAB Function1' */

{

int32_T eml_i0;

/* This block supports the Embedded MATLAB subset. */

/* See the help menu for details. */

for (eml_i0 = 0; eml_i0 < 64; eml_i0++) {

comp1_B.y_d[eml_i0] = comp1_B.IFFT[eml_i0].re;

}

}

/* S-Function Block: <Root>/DAC (c6416dsk_dac) */

{

const real_T DACScaleFactor = 32768.0;

int_T i;

void *blkDacBuffPtr;

int16_T *outPtr;

blkDacBuffPtr = getDacBuff();

outPtr = (int16_T *) blkDacBuffPtr;

for (i = 0; i < 64; i++) {

/* Left */

*outPtr = (int16_T)(comp1_B.y_d[i] * DACScaleFactor);

/* Right */


*(outPtr+1) = *outPtr; /* Copy same word to RHS for mono mode. */

outPtr += 2;

}

CACHE_wbL2( (void *) blkDacBuffPtr, 256, CACHE_WAIT );

}

UNUSED_PARAMETER(tid);

}

/* Model update function */

static void comp1_update(int_T tid)

{

/* Update absolute time for base rate */

if (!(++comp1_M->Timing.clockTick0))

++comp1_M->Timing.clockTickH0;

comp1_M->Timing.t[0] = comp1_M->Timing.clockTick0 * comp1_M->Timing.stepSize0

+ comp1_M->Timing.clockTickH0 * comp1_M->Timing.stepSize0 * 4294967296.0;

UNUSED_PARAMETER(tid);

}

/* Model initialize function */

void comp1_initialize(boolean_T firstTime)

{

(void)firstTime;

/* Registration code */

/* initialize non-finites */

rt_InitInfAndNaN(sizeof(real_T)); /* initialize real-time model */

(void) memset((char_T *)comp1_M, 0, sizeof(RT_MODEL_comp1));

/* Initialize timing info */

{

int_T *mdlTsMap = comp1_M->Timing.sampleTimeTaskIDArray;

mdlTsMap[0] = 0;

comp1_M->Timing.sampleTimeTaskIDPtr = (&mdlTsMap[0]);

comp1_M->Timing.sampleTimes = (&comp1_M->Timing.sampleTimesArray[0]);

comp1_M->Timing.offsetTimes = (&comp1_M->Timing.offsetTimesArray[0]);

/* task periods */

comp1_M->Timing.sampleTimes[0] = (0.008);

/* task offsets */

comp1_M->Timing.offsetTimes[0] = (0.0);

}

rtmSetTPtr(comp1_M, &comp1_M->Timing.tArray[0]);


{

int_T *mdlSampleHits = comp1_M->Timing.sampleHitArray;

mdlSampleHits[0] = 1;

comp1_M->Timing.sampleHits = (&mdlSampleHits[0]);

}

rtmSetTFinal(comp1_M, 10.0);

comp1_M->Timing.stepSize0 = 0.008;

comp1_M->solverInfoPtr = (&comp1_M->solverInfo);

comp1_M->Timing.stepSize = (0.008);

rtsiSetFixedStepSize(&comp1_M->solverInfo, 0.008);

rtsiSetSolverMode(&comp1_M->solverInfo, SOLVER_MODE_SINGLETASKING);

/* block I/O */

comp1_M->ModelData.blockIO = ((void *) &comp1_B);

{

int_T i;

void *pVoidBlockIORegion;

pVoidBlockIORegion = (void *)(&comp1_B.FFT[0].re);

for (i = 0; i < 256; i++) {

((real_T*)pVoidBlockIORegion)[i] = 0.0;

}

pVoidBlockIORegion = (void *)(&comp1_B.y[0].re);

for (i = 0; i < 128; i++) {

((real_T*)pVoidBlockIORegion)[i] = 0.0;

}

pVoidBlockIORegion = (void *)(&comp1_B.ADC[0]);

for (i = 0; i < 64; i++) {

((real_T*)pVoidBlockIORegion)[i] = 0.0;

}

pVoidBlockIORegion = (void *)(&comp1_B.y_d[0]);

for (i = 0; i < 64; i++) {

((real_T*)pVoidBlockIORegion)[i] = 0.0;

}

}

/* parameters */

comp1_M->ModelData.defaultParam = ((real_T *) &comp1_P);

}

/* Model terminate function */

void comp1_terminate(void)

{

/* (no terminate code required) */

}

/*========================================================================*
 * Start of GRT compatible call interface
 *========================================================================*/

void MdlOutputs(int_T tid)

{

comp1_output(tid);

}

void MdlUpdate(int_T tid)

{

comp1_update(tid);

}

void MdlInitializeSizes(void)

{

comp1_M->Sizes.numContStates = (0); /* Number of continuous states */

comp1_M->Sizes.numY = (0); /* Number of model outputs */

comp1_M->Sizes.numU = (0); /* Number of model inputs */

comp1_M->Sizes.sysDirFeedThru = (0); /* The model is not direct feedthrough */

comp1_M->Sizes.numSampTimes = (1); /* Number of sample times */

comp1_M->Sizes.numBlocks = (9); /* Number of blocks */

comp1_M->Sizes.numBlockIO = (5); /* Number of block outputs */

comp1_M->Sizes.numBlockPrms = (97); /* Sum of parameter "widths" */

}

void MdlInitializeSampleTimes(void)

{

}

void MdlInitialize(void)

{

}

void MdlStart(void)

{

/* S-Function Block: <Root>/ADC (c6416dsk_adc) */

codec_init();

/* S-Function Block: <Root>/DAC (c6416dsk_dac) */

MdlInitialize();

}

RT_MODEL_comp1 *comp1(void)

{

comp1_initialize(1);

return comp1_M;

}

void MdlTerminate(void)

{

comp1_terminate();

}


/*========================================================================*
 * End of GRT compatible call interface
 *========================================================================*/