
AN ABSTRACT OF THE THESIS OF

Xin Xiao for the degree of Master of Science in Electrical & Computer Engineering

presented on May 17, 2001.

Title: A VHDL Description of Speech Recognition Front-End.

Abstract approved:

This thesis investigates an implementation of a speech recognition front-end.

It is an application specific integrated circuit (ASIC) solution. A Mel Cepstrum

algorithm is implemented for the feature extraction. We present a new mixed split-

radix and radix-2 Fast Fourier Transform (FFT) algorithm, which can effectively

minimize the number of complex multiplications in the speech recognition front-

end. A prime length discrete cosine transform (DCT) is done effectively through

the use of two shorter length correlations. The algorithm results in a circular

correlation structure that is suitable for a constant coefficient multiplication and

shift-register realization. The multiplicative normalization algorithm is used for

the square root function. A radix-2 algorithm is used in the first 5 stages and a radix-4

algorithm is used in the other stages to speed up the convergence. A similar

normalization algorithm is presented for the natural logarithm.

Redacted for Privacy

©Copyright by Xin Xiao

May 17, 2001

All Rights Reserved

A VHDL Description of Speech Recognition Front-End

by

Xin Xiao

A THESIS

submitted to

Oregon State University

In partial fulfillment of the requirements for the

degree of

Master of Science

Completed May 17, 2001

Commencement June 2002

Master of Science thesis of Xin Xiao presented on May 17, 2001

APPROVED:

Major Professor, representing Electrical and Computer Engineering

Chair of Department of Electrical and Computer Engineering

Dean of Graduate School

I understand that my thesis will become part of the permanent collection of Oregon

State University libraries. My signature below authorizes release of my thesis to

any reader upon request.

Xin Xiao, Author





ACKNOWLEDGEMENTS

I would like to express my most sincere gratitude to Prof. Shih-Lien Lu. It is a

very wonderful experience to have worked under his supervision and guidance. I

have benefited greatly from his creative ideas and perspectives. I also thank him

for his warm encouragement and generous financial support for my master studies

in the department.

Special thanks to Prof. Un-Ku Moon and Prof. Gabor Temes for their excellent

teaching on analog circuit design, which helps me a lot in my current and future

career life. I am very glad that Prof. Un-Ku Moon agreed to be my minor

professor.

I would also like to thank Prof. Jon Herlocker of Computer Science

Department for taking time out of his busy schedule to serve as the Graduate

Council Representative of my committee.

I would like to acknowledge Roger Traylor for providing VHDL language

support. Special thanks to Ferne Simendinger for her help on my ECE study.

Thanks to my Chinese friends Huibo Lui, Jipeng Li, Binglei Zhang, Yin Hua,

Chengwei Zhang, Haibin Zhang, Hai Nan etc. from whom I learned a lot.

Special thanks to Mrs. Ruth Bateman for her help on my English and my life in

the USA.

I am grateful to my parents, my parents in law, my grandparents, my sister, my

brother in law, my aunts and every member in my family for their love and support

throughout my life.

Finally, I would like to thank my wife Boling and my daughter Weirui for their

love, patience, support and encouragements. To them I dedicate this work.

TABLE OF CONTENTS

Chapter 1. Introduction
1.1 Speech Recognition Systems
1.2 Thesis Objectives
1.3 Arrangement of Material

Chapter 2. SR Front-End Algorithm and Architecture
2.1 Speech Signal Parameterization
2.1.1 Front-End Algorithm
2.1.2 Cepstral Analysis
2.2 Front-End Block Diagram
2.2.1 ETSI Front-End Extract Algorithm
2.2.2 One Frame Speech Signal Processing Flow Diagram
2.2.3 Flow Diagram of Whole Design
2.2.4 Each Block's Function
2.3 Summary

Chapter 3. The FFT Algorithm and Implementation
3.1 FFT Algorithm Review
3.1.1 Radix-2 Decimation-in-Frequency FFT Algorithm
3.1.2 Split-Radix Algorithm
3.2 A Mixed Split-Radix and Radix-2 Algorithm
3.3 FFT Algorithm Implementation

Chapter 4. Logarithm and Square-Root Functions
4.1 Logarithm Function
4.1.1 Logarithm Function Algorithm
4.1.2 Logarithm Function Implementation
4.2 Square-Root Function
4.2.1 Square-Root Function Algorithm
4.2.2 Square-Root Function Implementation

Chapter 5. Prime-Length DCT Transformation
5.1 Algorithm
5.2 Implementation

Chapter 6. Other Implementations
6.1 Offcom Block
6.2 Framing Block
6.3 EM Block
6.4 PE Block
6.5 W Block
6.6 MF Block

Chapter 7. Simulation and Synthesis
7.1 Simulation
7.2 Synthesis

Chapter 8. Summary and Future Work
8.1 Summary
8.2 Future Work

References

Appendices
Appendix A Software Description
Appendix B Synthesis Report
Appendix C Design Files

LIST OF FIGURES

1.1 An example of speech recognition system
2.1 Cepstral analysis
2.2 Block diagram of Mel-Cepstrum front-end algorithm
2.3 One-frame speech signal processing data flow diagram
2.4 Block diagram of whole design
2.5 Bank of filter scaled according to Mel scale
3.1 A butterfly unit
3.2 Flow graph of 8-point FFT based on decimation-in-frequency
3.3 Flow graph of 16-point split-radix FFT
3.4 L-butterfly
3.5 Diagram of the sequence of butterfly in a 16-point split-radix FFT
3.6 Diagram of mixed split-radix and radix-2 algorithm
3.7 Radix-2 pipeline unit
3.8 Radix-2 pipeline FFT structure
3.9 Pipeline unit of split-radix algorithm
3.10 Schematic representation of the progression of a 32-point mixed split-radix and radix-2 algorithm
3.11 Mixed split-radix and radix-2 FFT structure
4.1 The error analysis of the logarithm algorithm
4.2 Top level of the logarithm function
4.3 Error analysis of the square-root algorithm
4.4 Top level of the square-root function
5.1 Matrix multiplication where SR denotes shift register
6.1 Implementation of the offset compensation block
6.2 Framing block state machine diagram
6.3 The simulation result of Framing block
6.4 The structure of EM block
6.5 The implementation of pre-emphasis block
6.6 The structure of W block
6.7 Bands of filter scaled according to Mel scale
6.8 Even bands of mel filter
6.9 Odd bands of mel filter
6.10 The structure of MF block

LIST OF TABLES

3.1 The number of real multiplication and addition for radix-2, split-radix and the mixed split-radix and radix-2 algorithms for N = 512
4.1 The u(i) bounds in radix-4 multiplicative normalization
7.1 Synthesis result

LIST OF ABBREVIATIONS

ADC Analog-to-digital Conversion

ASIC Application Specific Integrated Circuit

CPU Central Processing Unit

DC Direct Current

DCT Discrete Cosine Transform

DEMUX Demultiplexing

DIF Decimation-in-Frequency

DFT Discrete Fourier Transform

DSR Distributed Speech Recognition

EM Energy Measure

ETSI European Telecommunications Standards Institute

FFT Fast Fourier Transform

IIR Infinite Impulse Response

LOG Natural Logarithm Function

LogE Energy Measure Computation

LPC Linear Predictive Coding

MF Mel-filtering

MFB Mel Filter Bank

MUX Multiplexer

Offcom Offset Compensation

PE Pre-emphasis

REG Register

SR Speech Recognition

VHDL VHSIC Hardware Description Language

VHSIC Very High Speed Integrated Circuit

VLSI Very Large Scale Integration

W Windowing

A VHDL Description of Speech Recognition Front-End

Chapter 1. Introduction

For decades, people have had a strong desire to make computers behave like

human beings. Speech is one of the most attractive interfaces between humans and

machines. Speech is natural and efficient. We know how to speak before we know

how to read and write. We need fewer keyboard skills. Dictating directly into text

is often much faster than keyboarding. Because speech recognition engines turn

speech into text, speech can serve as a front-end for any application with a text

interface. The primary objective of speech recognition is to enable all of us to have

easy access to the full range of computer services and communication systems,

without the need for all of us to be able to type, or to be near a keyboard [1].

High-performance speech recognition systems require high-quality audio

input. They also require substantial processing speed and memory to perform the

digital signal processing. Most speech recognition systems have a feature extraction

front-end which performs a significant level of data reduction, to a bit rate capable

of being transmitted over most fixed and mobile networks [2]. The speech

recognition front-end processes the speech input produced by the A/D converter and converts it to

features, which can be used by the recognizer engine. Offloading the front-end

processing from the central processing unit (CPU) makes the entire system more

scalable and efficient. It also allows many concurrent users to be connected to the

speech recognizer without any serious compromise in voice quality or transaction

processing speed. In this client-server environment, you can run a powerful speech-

recognition engine on the advanced state-of-the-art computer system as the server,

while using a front-end engine client to extract speech features to send to the server

for recognition. In this way, the main cost of speech-recognition processing falls on

the server, which is less expensive per million instructions per second (MIPS).

Moreover, many front-end units can share a single server and take advantage of the less

expensive processing capability [3]. This client-server setup reduces system

latency, increases recognition accuracy, and improves overall system response

time.

This thesis investigates an implementation of a speech recognition front-end.

It is an application specific integrated circuit (ASIC) solution. A Mel Cepstrum

algorithm is implemented for the feature extraction. We present a new mixed split-

radix and radix-2 Fast Fourier Transform (FFT) algorithm, which can effectively

minimize the number of complex multiplications in the speech recognition front-

end. A prime length discrete cosine transform (DCT) is done effectively through

the use of two shorter length correlations. The algorithm results in a circular

correlation structure that is suitable for a constant coefficient multiplication and

shift-register realization. The multiplicative normalization algorithm is used for

the square root function. A radix-2 algorithm is used in the first 5 stages and a radix-4

algorithm is used in the other stages to speed up the convergence. A similar

normalization algorithm is presented for the natural logarithm.


1.1 Speech Recognition Systems

SR systems are potentially very useful and have been well researched and

developed for many years. Some sample applications are telephone applications,

hands-free operation, applications for the physically handicapped, dictation, and

translation. Telephone applications currently hold the largest market share among

SR systems. For example, AT&T's Call Routing system saves AT&T $300 million

a year in operator costs and was recently claimed to handle in excess of 1 billion

calls per day [6].

An SR system may be described as consisting of four major modules: input acquisition, signal processing front-end, word and language models, and pattern classification and language processing [7][8][9]. The input acquisition module extracts significant data from the speech samples, which adapts the SR system to environmental variability and reduces its influence. The signal processing front-end transforms the speech signal into a sequence of feature vectors that are robust to acoustic variation but sensitive to linguistic content. This representation is significantly more compact than the original speech waveform. The models are built in the training phase, in which the system learns the feature vectors representing the different speech samples. Generally, the feature vectors characterize the statistical properties of the spoken examples. Finally, the pattern classification and language-processing module matches the feature vectors of an observation against the models, which are built from predefined examples. An example of an SR system is shown in Figure 1.1 [7].

Figure 1.1: An example of speech recognition system

1.2 Thesis Objectives

The purpose of this thesis is to design a signal processing front-end to

transform the speech signal into a sequence of feature vectors. The front-end

algorithm is based on the mel-cepstral feature extraction technique. The feature vectors

consist of 13 static cepstral coefficients and a log-energy coefficient. The front-end

algorithm primarily consists of the following blocks:

Offset compensation

Framing

Pre-emphasis

Windowing

FFT

Deframing

Square root function

Mel filtering

Logarithm function

DCT

1.3 Arrangement of Material

This thesis comprises eight chapters.

Chapter One briefly introduces the SR system and the objective of the thesis.

Chapter Two introduces the popular SR front-end processing algorithm

based on auditory properties. A description of the block diagrams of the Mel

Cepstrum is also given.

Chapter Three investigates the FFT algorithm based on the number of

multiplications. A mixed split-radix and radix-2 algorithm, which minimizes the

number of complex multiplications in the application, is described. The structure of

the FFT is presented.

Chapter Four examines the multiplicative normalization algorithm for

square root and natural logarithm function. The architecture of the two functions is

presented.

The algorithm for DCT is presented in Chapter Five. A prime length DCT is

done effectively through the use of two shorter length correlations. The circular

correlation architecture from the algorithm is suitable for a constant coefficient

multiplication and shift-register realization.

Chapter Six presents the implementation architecture of the other blocks in

the SR front-end algorithm.

Chapter Seven presents the simulation and test results. Synthesis results are

also reported.

Chapter Eight summarizes the project and points out some future work.

Appendix A describes the files.

Appendix B presents a project synthesis report.

Appendix C presents the design files.


Chapter 2. SR Front-End Algorithm and Architecture

In this chapter, we review the physiological voice production process to be

able to model it approximately. Several methods of source-filter separation and

subsequent parameterization are discussed. The architecture and block diagram of

source-filter separation through cepstral analysis are given.

2.1 Speech Signal Parameterization

The articulators of the phonatory mechanism produce voice. They hold a

position for a short time in order to produce a phoneme, and then change to a different

position through an articulatory transition movement [9]. Fant [10] gave a

statement to define the model of speech: The speech wave is the response of the

vocal tract to one or more excitation signals. This gives us a way to separate the

excitation signal and the vocal tract. The excitation signal (source) and the vocal

tract (filter) can be separated through deconvolution. The mathematical model of

the physiological voice production process is the source-filter model.

2.1.1 Front-End Algorithm

Methods of source-filter separation have been developed over many years. The vast majority of them are based on standard signal processing techniques, such as filter banks, linear predictive coding (LPC), and Mel filter bank (MFB) cepstral analysis. There are also several methods based on known properties of the human auditory system, such as the Seneff auditory model and the EIH auditory model. The evaluation in [11] shows that the MFB cepstral front-end significantly outperforms the LPC front-end, especially in noise. An auditory model may perform better than an LPC-based front-end, as other studies have shown, but it performs very similarly to, and only slightly better than, an MFB cepstral front-end. An auditory front-end is also much more complex and slower than the MFB cepstral front-end: the Seneff model is about 120 times slower and the EIH model about 360 times slower. For these reasons, the MFB front-end is chosen in this thesis.

2.1.2 Cepstral Analysis

Bogert et al. first used cepstral processing in seismic analysis [12]. Alan Oppenheim applied the cepstral processing concept to voice processing [13]. If we let $P_1$ denote the spectrum of the observed speech signal, $P_2$ the spectrum of the excitation signal, and $P_3$ the spectrum of the vocal tract filter, we have:

$$|P_1(\omega)| = |P_2(\omega)|\,|P_3(\omega)| \tag{2.1}$$

Taking the logarithm of Eq. (2.1) yields

$$\log|P_1(\omega)| = \log|P_2(\omega)| + \log|P_3(\omega)| \tag{2.2}$$

The $P_2$ term is characterized by a relatively rapidly varying function of $\omega$, while the $P_3$ term varies slowly with $\omega$. The logarithm operation performs the deconvolution: Equation 2.2 transforms the multiplicative relation between the envelope and the fine structure into an additive relation. By performing a Fourier transform on $\log|P_1(\omega)|$, one separates the slowly varying log spectral envelope from the more rapidly varying spectral fine structure. The Fourier transform of $\log|P_1(\omega)|$ is called the cepstrum. The cepstrum coefficients are computed by taking the inverse Discrete Fourier Transform (DFT) of Eq. (2.2). Since the log power spectrum is real and symmetric, the inverse DFT reduces to a DCT, yielding

$$MC(i) = \sum_{j=1} \log|P_1(\omega)| \cos[\,\ldots\,], \quad i = 0, \ldots, L \tag{2.3}$$

[Figure panels: (a) windowed speech wave, time axis; (b) spectrum, frequency axis; (c) frequency axis; (d) cepstrum, time axis, showing the cepstrum of the spectral envelope and the cepstrum of the excitation]

Figure 2.1: Cepstral analysis.

The DCT has the property of producing highly uncorrelated features. This

significantly reduces the computational cost and the number of parameters to be

estimated [9].

Figure 2.1 [8] depicts the sequence of operations from the speech wave to the

cepstrum.

2.2 Front-End Block Diagram

Having compared the algorithms and explained how to obtain the cepstrum from the

speech signal, we present the front-end architecture and implementation in this

section. The European Telecommunications Standards Institute (ETSI) standard

for the distributed speech recognition front-end feature extraction algorithm is

introduced. Subsequently, the data flow diagrams for one-frame speech signal

processing and for the whole design are given. Finally, the blocks in the design are

discussed.

2.2.1 ETSI Front-End Extract Algorithm

ETSI released a standard [14] for the front-end feature extraction and compression

algorithm used in distributed speech recognition (DSR) systems in April 2000. A

Mel Cepstrum algorithm is proposed for the feature extraction. Figure 2.2 shows

the block diagram of Mel-Cepstrum front-end algorithm.

[Block diagram: ADC, Offcom, Framing, PE, W, FFT, MF, LOG, DCT, with a logE branch; the outputs go to bit stream formatting and then to the transmission channel]

Abbreviations: ADC analog-to-digital conversion; Offcom offset compensation; PE pre-emphasis; logE energy measure computation; W windowing; FFT fast Fourier transform (only magnitude components); MF mel-filtering; LOG nonlinear transformation; DCT discrete cosine transform; MFCC mel-frequency cepstral coefficient

Figure 2.2 Block diagram of Mel-Cepstrum front-end algorithm

2.2.2 One Frame Speech Signal Processing Flow Diagram

Our data flow diagram for processing one frame of the speech signal is based on

the ETSI front-end extraction algorithm. The ETSI algorithm is implemented in

software. To implement the design in hardware, we add functions such as the

square-root (SR) function, which computes the magnitude of the FFT output. Figure 2.3

shows the one-frame speech signal processing data flow diagram.

Figure 2.3 One-frame speech signal processing data flow diagram

2.2.3 Flow Diagram of Whole Design

After adding the Framing and Deframing blocks and parallelizing the PE, W, SR,

MF and FFT blocks, we get the data flow diagram of the whole design. Figure 2.4

shows the data flow diagram of the whole design.

Figure 2.4 Block diagram of whole design

Now, we have the block diagram of the design. We need to implement the

function of each block.

2.2.4 Each Block's Function

In this section, we discuss the function of each block in turn.

ADC: the analog-to-digital conversion part converts the input analog

signal to the digital speech signal sin. In this project, we assume the digital signal

word length is 16 bits. It is not a part of the ASIC design.

Offcom: an IIR filter is applied to the digital speech signal to remove its DC offset and produce the offset-free input signal sof:

$$s_{of}(n) = s_{in}(n) - s_{in}(n-1) + 0.999\, s_{of}(n-1) \tag{2.4}$$
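As an illustration only, the recursion in equation 2.4 can be sketched in floating-point Python. This is not the VHDL design described in this thesis; the function name `offset_compensation`, the parameter `alpha`, and the zero initial conditions are assumptions made for the example.

```python
def offset_compensation(s_in, alpha=0.999):
    """Apply s_of(n) = s_in(n) - s_in(n-1) + alpha * s_of(n-1) to a sample list."""
    s_of = []
    prev_in = 0.0  # assumed initial condition: s_in(-1) = 0
    prev_of = 0.0  # assumed initial condition: s_of(-1) = 0
    for x in s_in:
        y = x - prev_in + alpha * prev_of
        s_of.append(y)
        prev_in, prev_of = x, y
    return s_of


if __name__ == "__main__":
    # A constant (pure DC) input decays toward zero at the output.
    out = offset_compensation([100.0] * 5000)
    print(out[0], out[-1])
```

With a constant input, the first output equals the input and every later output is 0.999 times the previous one, so the DC component dies away over a few thousand samples.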

Framing: The offset-free signal sof is divided into overlapping frames of N samples with a frame shift interval of M samples. In this project, we select N = 400 and M = 160, which corresponds to a sampling rate of 16 kHz. Each frame of N samples is zero-padded to 512 samples to form an extended frame. The outputs are 4 channels, each of which has a start-of-frame signal sof and a data signal sfr.
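The framing step can be sketched in Python as follows. This is a software illustration under stated assumptions, not the Framing block of the design: the function name `frame_signal` is hypothetical, trailing samples that do not fill a whole frame are simply dropped, and the four-channel output and start-of-frame signalling are not modelled.

```python
def frame_signal(s_of, frame_len=400, frame_shift=160, fft_len=512):
    """Split a signal into overlapping frames and zero-pad each to fft_len samples."""
    frames = []
    start = 0
    while start + frame_len <= len(s_of):
        frame = list(s_of[start:start + frame_len])
        frame.extend([0.0] * (fft_len - frame_len))  # zero padding: 400 -> 512
        frames.append(frame)
        start += frame_shift  # consecutive frames overlap by frame_len - frame_shift
    return frames


if __name__ == "__main__":
    signal = [float(n) for n in range(1000)]
    frames = frame_signal(signal)
    print(len(frames), len(frames[0]), frames[1][0])
```

With N = 400 and M = 160, consecutive frames share 240 samples, so a 1000-sample signal yields frames starting at samples 0, 160, 320, and 480.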

EM: the signal energy is computed after the framing, producing the signal sem:

$$s_{em} = \sum_{i=1}^{N} s_{fr}(i)^2 \tag{2.5}$$

LOG: the signal energy sem is subjected to a natural logarithm function. CO is the output of the log function:

$$CO = \ln(s_{em}) \tag{2.6}$$
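Equations 2.5 and 2.6 together give the log-energy coefficient of a frame. A minimal Python sketch follows; the function name `log_energy` and the `floor` parameter, which guards against taking the logarithm of zero for an all-zero frame, are assumptions made for the example and are not taken from the design.

```python
import math


def log_energy(s_fr, floor=1e-30):
    """Return ln(sum of squared samples) for one frame."""
    s_em = sum(x * x for x in s_fr)      # equation 2.5
    return math.log(max(s_em, floor))    # equation 2.6, with an assumed floor


if __name__ == "__main__":
    print(log_energy([1.0, 2.0, 2.0]))   # energy is 9, so this prints ln(9)
```

Because the zero padding added by the Framing block contributes nothing to the sum, the result is the same whether the 400-sample frame or the 512-sample extended frame is passed in.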

PE: A pre-emphasis of high frequencies is required to obtain similar amplitudes for all formants, since high-frequency formants have smaller amplitudes than low-frequency formants. It produces the signal spe:

$$s_{pe}(n) = s_{fr}(n) - 0.97\, s_{fr}(n-1) \tag{2.7}$$
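The pre-emphasis filter of equation 2.7 is a first-order difference and can be sketched in Python as below. The function name `pre_emphasis` and the assumption that the sample before the first one is zero are illustrative choices, not details taken from the design.

```python
def pre_emphasis(s_fr, coeff=0.97):
    """Apply s_pe(n) = s_fr(n) - coeff * s_fr(n-1) to one frame."""
    s_pe = []
    prev = 0.0  # assumed value of the sample preceding the frame
    for x in s_fr:
        s_pe.append(x - coeff * prev)
        prev = x
    return s_pe


if __name__ == "__main__":
    # A constant (low-frequency) input is strongly attenuated after the first sample.
    print(pre_emphasis([1.0, 1.0, 1.0]))
```

A constant input is scaled by 1 - 0.97 = 0.03 after the first sample, while an input that alternates in sign is scaled by 1.97, which is the high-frequency boost the text describes.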

W: a rectangular window is implicitly applied when a sequence of N samples is taken from a signal. A product between the speech signal and the window in the time domain corresponds to a convolution in the frequency domain, which distorts the estimated speech spectrum. A Hamming window of length N is applied to the spe signal to reduce the distortion. The output of the windowing block is sw:

$$s_w(n) = \left\{0.54 - 0.46\cos\left(\frac{2\pi(n-1)}{N-1}\right)\right\} s_{pe}(n), \quad 1 \le n \le N \tag{2.8}$$
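Equation 2.8 can be checked numerically with a short Python sketch. The function name `hamming_window` is hypothetical, and the code uses floating-point arithmetic, whereas a hardware block would more likely read precomputed window coefficients (an assumption, not a statement about the design).

```python
import math


def hamming_window(s_pe):
    """Multiply a frame of N >= 2 samples by the Hamming window of equation 2.8."""
    n_len = len(s_pe)
    s_w = []
    for n in range(1, n_len + 1):  # n runs from 1 to N, as in equation 2.8
        w = 0.54 - 0.46 * math.cos(2.0 * math.pi * (n - 1) / (n_len - 1))
        s_w.append(w * s_pe[n - 1])
    return s_w


if __name__ == "__main__":
    window = hamming_window([1.0] * 400)
    print(window[0], max(window), window[-1])
```

Applied to a frame of ones, the function returns the window itself: it starts and ends at 0.08 and is symmetric about the middle of the frame.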

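FFT: An FFT of length 512 is applied to compute the frequency spectrum of the signal:

$$s_{fft}(k) = s_{fft\_real} + j\, s_{fft\_imag} = \sum_{n=0}^{511} s_w(n)\, e^{-jnk\frac{2\pi}{512}}, \quad k = 0, \ldots, 511 \tag{2.9}$$

SR: a square-root (SR) function is applied to compute the magnitude spectrum of the signal:

$$bin(k) = \sqrt{s_{fft\_real}^2 + s_{fft\_imag}^2} \tag{2.10}$$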
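Equations 2.9 and 2.10 can be illustrated with a direct evaluation of the transform followed by the magnitude computation. The sketch below is deliberately the slow O(N^2) form so that it mirrors the equations term by term; it is not the FFT structure developed in Chapter 3, and the function name `magnitude_spectrum` is an assumption made for the example.

```python
import cmath
import math


def magnitude_spectrum(s_w, fft_len=512):
    """Evaluate equation 2.9 directly, then take the magnitude as in equation 2.10."""
    padded = list(s_w) + [0.0] * (fft_len - len(s_w))  # pad if the frame is shorter
    bins = []
    for k in range(fft_len):
        acc = 0j
        for n, x in enumerate(padded):
            acc += x * cmath.exp(-2j * cmath.pi * n * k / fft_len)
        bins.append(math.sqrt(acc.real ** 2 + acc.imag ** 2))
    return bins


if __name__ == "__main__":
    # One cycle of cos(pi*n/2) repeated twice: energy appears in bins 2 and 6 of 8.
    tone = [1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0, 0.0]
    print([round(b, 6) for b in magnitude_spectrum(tone, fft_len=8)])
```

For a real-valued input the magnitude spectrum is symmetric about the middle bin, which is why the test tone shows up in two bins.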

MF: a common method of representing the parameters of a speech signal frame is to build a vector of the energies in various frequency bands. This information can be extracted by passing the signal through a bank of band-pass filters and computing the energy at the output of each one [15]. The frequency band covered by each filter must, to some extent, overlap adjacent bands in order to avoid information loss. It has been found that some frequency bands contain more significant information for the purpose of speech recognition than others. Dividing the frequency scale according to the sensitivity of the human ear results in a nonlinear frequency scale similar to the one illustrated in Figure 2.5, which is suitable for speech recognition [16].

Figure 2.5 Bank of filter scaled according to Mel scale

Fbank(k) is the output of the Mel filter.

Deframing: The outputs of the Mel filter are combined into one channel before

being sent to the LOG function.

LOG: The output of Deframing is subjected to a logarithm function to

separate the slowly varying log spectral envelope from the more rapidly varying

spectral fine structure. The output is f(j).

DCT: 13 cepstral coefficients are calculated through a DCT from the output of the LOG block:

$$C(i) = \sum_{j=1}^{23} f(j)\cos\left(\frac{\pi i}{23}(j - 0.5)\right), \quad 0 \le i \le 12 \tag{2.11}$$
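Equation 2.11 can be evaluated directly, as in the Python sketch below. This is the straightforward sum, not the prime-length correlation structure that Chapter 5 develops for the hardware; the function name `mel_cepstrum` is hypothetical, and the number of bands is taken from the length of the input (23 in equation 2.11).

```python
import math


def mel_cepstrum(f, num_ceps=13):
    """Compute C(i) = sum_j f(j) * cos(pi * i / J * (j - 0.5)) for i = 0..num_ceps-1."""
    num_bands = len(f)  # J = 23 in equation 2.11
    ceps = []
    for i in range(num_ceps):
        acc = 0.0
        for j in range(1, num_bands + 1):  # j runs from 1 to J
            acc += f[j - 1] * math.cos(math.pi * i / num_bands * (j - 0.5))
        ceps.append(acc)
    return ceps


if __name__ == "__main__":
    flat = [1.0] * 23  # a flat log spectrum has no spectral shape
    print([round(c, 6) for c in mel_cepstrum(flat)])
```

A flat log spectrum puts all of its content into C(0) and leaves the other twelve coefficients at zero, which is a quick sanity check on the cosine argument.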

2.3 Summary

In this chapter we briefly discuss the generally accepted flow of front-end

processing. This front-end processing changes the speech signal from time domain

through frequency domain then back to time domain.


Chapter 3. The FFT Algorithm and Implementation

This chapter investigates the FFT algorithms related to the algorithm used in

front-end processing. A mixed split-radix and radix-2 algorithm, which minimizes

the number of complex multiplications in the application, is described. The

architecture of the FFT and the architecture-level low-power design are presented.

3.1 FFT Algorithm Review

For a given sequence {x[n], n = 0, 1, ..., N-1}, an N-point DFT is defined as

$$X[k] = \sum_{n=0}^{N-1} x[n]\, W_N^{nk}, \quad k = 0, 1, \ldots, N-1 \tag{3.1}$$

where

$$W_N = e^{-j\frac{2\pi}{N}} \tag{3.2}$$

A direct implementation based on the above equation requires $4N^2$ real multiplications and $2N(2N-1)$ real additions, so direct computation of the DFT is inefficient. Computationally efficient algorithms are obtained by exploiting the symmetry and periodicity properties of the phase factor $W_N$. These two properties are:

Symmetry property: $$W_N^{k+N/2} = -W_N^{k} \tag{3.3}$$

Periodicity property: $$W_N^{k+N} = W_N^{k} \tag{3.4}$$
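The definition in equations 3.1 and 3.2, the two twiddle-factor properties, and the operation counts quoted above can be checked with a small Python sketch. The function names `twiddle`, `direct_dft`, and `direct_dft_cost` are assumptions made for the example.

```python
import cmath


def twiddle(n_points, k):
    """Return W_N^k = exp(-j * 2 * pi * k / N), as in equation 3.2."""
    return cmath.exp(-2j * cmath.pi * k / n_points)


def direct_dft(x):
    """Evaluate equation 3.1 directly: N complex multiplications per output."""
    n_points = len(x)
    return [sum(x[n] * twiddle(n_points, n * k) for n in range(n_points))
            for k in range(n_points)]


def direct_dft_cost(n_points):
    """Real multiplications and additions of the direct form: 4N^2 and 2N(2N-1)."""
    return 4 * n_points ** 2, 2 * n_points * (2 * n_points - 1)


if __name__ == "__main__":
    print(abs(twiddle(8, 1 + 4) + twiddle(8, 1)))  # symmetry (3.3): close to 0
    print(abs(twiddle(8, 1 + 8) - twiddle(8, 1)))  # periodicity (3.4): close to 0
    print(direct_dft_cost(512))
```

For N = 512 the direct form needs more than a million real multiplications per frame, which is the motivation for the fast algorithms reviewed in the rest of this chapter.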

3.1.1 Radix-2 Decimation-in-Frequency FFT Algorithm

The radix-2 decimation-in-frequency algorithm divides the output sequence X[k] into successively smaller subsequences. Splitting X[k] into its even- and odd-numbered samples, we obtain

$$X[2k] = \sum_{n=0}^{(N/2)-1}\left[x(n) + x\left(n+\frac{N}{2}\right)\right] W_{N/2}^{nk}, \quad k = 0, 1, \ldots, N/2-1 \tag{3.5}$$

$$X[2k+1] = \sum_{n=0}^{(N/2)-1}\left[x(n) - x\left(n+\frac{N}{2}\right)\right] W_N^{n}\, W_{N/2}^{nk}, \quad k = 0, 1, \ldots, N/2-1 \tag{3.6}$$

The computational procedure above can be repeated by decimating the N/2-point DFTs X[2k] and X[2k+1]. The entire FFT needs $\log_2 N$ stages of decimation. Consequently, the computation of the N-point FFT requires $2N\log_2 N$ real multiplications. A butterfly, the basic element of the radix-2 FFT, is depicted in Figure 3.1. The eight-point decimation-in-frequency algorithm is depicted in Figure 3.2.
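The decomposition in equations 3.5 and 3.6 can be written as a short recursive Python sketch and checked against the direct DFT. This is a software illustration, not the pipelined hardware structure of Section 3.3: the names `fft_dif_radix2`, `dft`, and `radix2_real_multiplications` are hypothetical, the input length is assumed to be a power of two, and the recursion returns the outputs in natural order by interleaving the two halves instead of leaving them in the order a hardware pipeline produces.

```python
import cmath
import math


def dft(x):
    """Direct DFT, used only as a reference for checking."""
    n_points = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / n_points)
                for n in range(n_points)) for k in range(n_points)]


def fft_dif_radix2(x):
    """Radix-2 decimation-in-frequency FFT; len(x) must be a power of two."""
    n_points = len(x)
    if n_points == 1:
        return [complex(x[0])]
    half = n_points // 2
    # Butterflies: a + b feeds the even outputs, (a - b) * W feeds the odd outputs.
    even_in = [x[n] + x[n + half] for n in range(half)]
    odd_in = [(x[n] - x[n + half]) * cmath.exp(-2j * cmath.pi * n / n_points)
              for n in range(half)]
    result = [0j] * n_points
    result[0::2] = fft_dif_radix2(even_in)   # X[2k], equation 3.5
    result[1::2] = fft_dif_radix2(odd_in)    # X[2k+1], equation 3.6
    return result


def radix2_real_multiplications(n_points):
    """The 2 * N * log2(N) count quoted in the text, for N a power of two."""
    return 2 * n_points * int(math.log2(n_points))


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0, 0.0, -1.0, -2.0, 5.0]
    err = max(abs(a - b) for a, b in zip(fft_dif_radix2(data), dft(data)))
    print(err, radix2_real_multiplications(512))
```

The count function treats every butterfly as one full complex multiplication, so it is an upper bound that includes the trivial twiddle factors.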

[Butterfly unit with inputs a and b and outputs A = a + b and B = (a - b)W]

Figure 3.1 A butterfly unit

Figure 3.2 Flow graph of 8-point FFT based on decimation-in-frequency

We can get the radix-2 decimation-in-time algorithm if we split the

sequence x[n] into even and odd parts. We can get a radix-R algorithm if $N = R^v$. The

higher order radix algorithms are computationally more efficient. However, they

also require more complex butterfly structures.

3.1.2 Split-Radix Algorithm

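The split-radix algorithm is based on the following decomposition:

$$X(2k) = \sum_{n=0}^{(N/2)-1}\left[x(n) + x\left(n+\frac{N}{2}\right)\right] W_{N/2}^{nk}, \quad k = 0, 1, \ldots, N/2-1 \tag{3.7}$$

$$X(4k+1) = \sum_{n=0}^{(N/4)-1}\left\{\left[x(n) - x\left(n+\frac{N}{2}\right)\right] - j\left[x\left(n+\frac{N}{4}\right) - x\left(n+\frac{3N}{4}\right)\right]\right\} W_N^{n}\, W_{N/4}^{nk}, \quad k = 0, 1, \ldots, N/4-1 \tag{3.8}$$

$$X(4k+3) = \sum_{n=0}^{(N/4)-1}\left\{\left[x(n) - x\left(n+\frac{N}{2}\right)\right] + j\left[x\left(n+\frac{N}{4}\right) - x\left(n+\frac{3N}{4}\right)\right]\right\} W_N^{3n}\, W_{N/4}^{nk}, \quad k = 0, 1, \ldots, N/4-1 \tag{3.9}$$

Figure 3.3 shows the flow graph of a 16-point split-radix FFT [17]. Figure 3.4 shows an L-butterfly, the basic unit of the split-radix FFT [17]. Figure 3.5 shows the diagram of the sequence of butterflies in a 16-point split-radix FFT.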
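The split-radix decomposition of equations 3.7 to 3.9 can likewise be sketched recursively in Python and compared with the direct DFT. The names `fft_split_radix` and `dft` are assumptions made for the example, the input length is assumed to be a power of two, and the recursion writes each sub-transform into its natural-order positions, which hides the irregular butterfly sequence that the hardware has to control.

```python
import cmath


def dft(x):
    """Direct DFT, used only as a reference for checking."""
    n_points = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / n_points)
                for n in range(n_points)) for k in range(n_points)]


def fft_split_radix(x):
    """Split-radix decimation-in-frequency FFT; len(x) must be a power of two."""
    n_points = len(x)
    if n_points == 1:
        return [complex(x[0])]
    if n_points == 2:
        return [complex(x[0] + x[1]), complex(x[0] - x[1])]
    half, quarter = n_points // 2, n_points // 4
    even_in = [x[n] + x[n + half] for n in range(half)]          # equation 3.7
    in1, in3 = [], []
    for n in range(quarter):                                     # one L-butterfly per n
        a = x[n] - x[n + half]
        b = x[n + quarter] - x[n + 3 * quarter]
        w1 = cmath.exp(-2j * cmath.pi * n / n_points)            # W_N^n
        w3 = cmath.exp(-2j * cmath.pi * 3 * n / n_points)        # W_N^(3n)
        in1.append((a - 1j * b) * w1)                            # equation 3.8
        in3.append((a + 1j * b) * w3)                            # equation 3.9
    result = [0j] * n_points
    result[0::2] = fft_split_radix(even_in)   # X(2k)
    result[1::4] = fft_split_radix(in1)       # X(4k+1)
    result[3::4] = fft_split_radix(in3)       # X(4k+3)
    return result


if __name__ == "__main__":
    data = [float((3 * n) % 7) - 2.0 for n in range(16)]
    err = max(abs(a - b) for a, b in zip(fft_split_radix(data), dft(data)))
    print(err)
```

Only the two odd branches carry twiddle multiplications, and each is a quarter-length transform, which is where the saving in multiplications over radix-2 comes from.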

Figure 3.3 Flow graph of 16-point split-radix FFT

Figure 3.4 L-butterfly

[Legend: R2 denotes a radix-2 butterfly; the L-shaped blocks denote split-radix butterflies used for F(4k+1) and F(4k+3)]

Figure 3.5 Diagram of the sequence of butterfly in a 16-point split-radix FFT

From Figure 3.5, we can see that the split-radix algorithm lacks regularity, which makes it poorly suited to a pipelined VLSI implementation.

3.2 A Mixed Split-Radix and Radix-2 Algorithm

From the above analysis we can see that the radix-2 algorithm has several advantages: a simple basic unit, a regular and flexible structure, and in-place implementation, which reduces the arithmetic complexity. The split-radix algorithm requires fewer multiplications than the radix-2 algorithm, but its L-shaped butterfly makes its control more difficult. In this section, we present a mixed split-radix and radix-2 algorithm that is suitable for this application and for any FFT with real-valued inputs.

Although X[k] and X*[k] have complex elements of the form (a + jb), a conjugate symmetry property applies to the DFT of real-valued inputs, and as a consequence the DFT representation involves a total of only N/2 independent elements. The conjugate symmetry is described by:

$$X[k] = X^{*}[N-k], \quad k = 1, 2, \ldots, (N/2)-1 \tag{3.10}$$

So, if we know X[k], we know X[N-k], and vice versa.

According to equation 3.7, the even-numbered samples X[2k] are the outputs of an N/2-point DFT whose input sequence x1[n] is real-valued when the input x[n] is real-valued:

$$x_1[n] = x[n] + x[n + N/2] \tag{3.11}$$

22

According to equation 3.10 and equation 3.11 and the output bit-inverse

property in radix-2 algorithm, we present a mixed split-radix and radix-2 algorithm.

In this algorithm, the inputs of all L-shape butterflies are real-values. Figure 3.6

shows a diagram of 32-point mixed algorithm.

Figure 3.6 Diagram of mixed split-radix and radix-2 algorithm (stages 1 to 5; N-R2 denotes an N-point radix-2 DIF FFT; a block labelled N denotes a block of N-input split-radix butterflies)

The mixed algorithm has two kinds of basic unit. One is the block of N-point split-radix butterflies with real-valued inputs; the other is the N-point radix-2 decimation-in-frequency FFT. The maximum size of the radix-2 FFT is reduced from N to N/4, which is the fundamental reason for the reduction in the required number of multiplications. Table 3.1 compares the number of real multiplications and additions for the radix-2, split-radix and new mixed algorithms for N = 512.

Size N    Split-Radix        Radix-2            Mixed Split-Radix and Radix-2
          mult     add       mult     add       mult     add
512       3076     12292     5644     12292     2388     5442

Table 3.1 The number of real multiplications and additions for the radix-2, split-radix and mixed split-radix and radix-2 algorithms for N = 512

The numbers of real multiplications and additions for the radix-2 and split-radix algorithms are taken from [19]. The mixed algorithm has the following advantages:

The lowest number of multiplications and additions;

The same regularity as the radix-2 algorithm;

The same flexibility as the radix-2 algorithm;

No reordering of the data inside the algorithm;

The outputs are in bit-reversed order, as in the radix-2 DIF algorithm.

The price paid for these advantages is that there are two kinds of basic blocks, a split-radix block and a radix-2 block, rather than the single kind of block in the radix-2 algorithm. This increases the area when the algorithm is implemented on a VLSI chip.

3.3. FFT Algorithm Implementation

Only a few basic functions are needed to implement a radix-2 algorithm.

These are

Butterfly modules (BF), which perform the butterfly operation;

Shift register memories (SR) for intermediate storage of data;

ROM for storage of the twiddle factors W.

The radix-2 pipeline unit is shown in Figure 3.7.

Figure 3.7 Radix-2 pipeline unit

It consists only of the three modules listed above. The basic unit operates as follows. First, switch C is connected to position 1 and switch pow is off, and the first N/2 inputs are shifted into the shift registers. Then switch C is connected to position 2 and switch pow is on, and the second N/2 inputs perform the butterfly operation with the first N/2 inputs. One butterfly output is sent to the next stage directly; the other is written to the shift register and sent to the next stage N/2 clock cycles later. The next module operates similarly, except that the length of its shift register is N/4. The N-point radix-2 pipeline structure is shown in Figure 3.8.
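Arithmetically, each pipeline stage performs one stage of a radix-2 decimation-in-frequency FFT: the first half of a block is paired with the second half, the sum goes straight on and the difference is multiplied by a twiddle factor. The following Python sketch models only this arithmetic, not the shift-register timing; the function names are illustrative.

```python
import cmath

def fft_dif(x):
    """Radix-2 DIF FFT; the result is left in bit-reversed order."""
    a = [complex(v) for v in x]
    n = len(a)
    span = n // 2                       # plays the role of the SR length per stage
    while span >= 1:
        for start in range(0, n, 2 * span):
            for j in range(span):
                w = cmath.exp(-2j * cmath.pi * j / (2 * span))  # twiddle from "ROM"
                p, q = a[start + j], a[start + j + span]
                a[start + j] = p + q                # sum output
                a[start + j + span] = (p - q) * w   # difference output times twiddle
        span //= 2
    return a

def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r
```

With this ordering, `fft_dif(x)[bit_reverse(k, bits)]` holds X[k], so no data reordering is needed between stages.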

Figure 3.8 Radix-2 pipeline FFT structure

This radix-2 pipeline structure has regularity and flexibility which makes it easy to

incorporate into the mixed split-radix and radix-2 algorithm.

Only a few similar basic functions are needed to implement the split-radix block. These are:

Adder and multiplier, which perform the arithmetic operations;

Shift register memories (SR) for intermediate storage of data;

ROM for storage of the twiddle factors W.

Figure 3.9 shows the pipeline unit of the split-radix algorithm.

Figure 3.9 Pipeline unit of split-radix algorithm

The unit is divided into two parts. The first shift register group and the two adders compute X[2k]. The second shift register group and the complex multiplier compute X[4k+1]. The unit operates as follows. First, switch 1 is on and the others are off, and the first N/2 inputs are moved into the first shift register group. When the first shift register group is full, switch 1 is turned off and switch 2 is turned on. The second N/2 inputs are then added to the first N/2 inputs. One of the outputs, X[2k], is sent to the next-stage split-radix block; the other is sent back to the first shift register group. When all N inputs have been processed, switch 4 is turned on and the second shift register group receives data from the first one. After N/4 clock cycles, switches 3 and 5 are turned on, and the data from the first and second shift register groups are multiplied by the twiddle factor W^r. The outputs are sent to the corresponding FFT stage according to Figure 3.6.

Figure 3.10 shows the schematic representation of the progression of a 32-point mixed split-radix and radix-2 algorithm. Figure 3.11 shows the structure of the mixed split-radix and radix-2 algorithm. It consists of three parts: an N/4-point radix-2 FFT, pipelined split-radix blocks, and multiplexers. The multiplexer output is connected to the split-radix block when the split-radix block outputs the data X[4k+1]. Otherwise, it is connected to the radix-2 block so that the radix-2 block performs the general radix-2 FFT.

Figure 3.10 Schematic representation of the progression of a 32-point mixed split-radix and radix-2 algorithm (N-FFT denotes an N-point radix-2 FFT)

Figure 3.11 Mixed split-radix and radix-2 FFT structure

Chapter 4. Logarithm and Square-Root Functions

This chapter describes the design of the logarithm and square-root functions. These functions can be implemented with a look-up table, but a look-up table is costly: a 16-bit input, for example, requires 64K entries. The algorithm described here is called the multiplicative normalization algorithm. It uses recursive equations to approximate the result progressively. The general convergence method is characterized by two recurrences of the form [5]:

u(i+1) = f(u(i), v(i)) (4.1)

v(i+1) = g(u(i), v(i)) (4.2)

Beginning with the initial values u(0) and v(0), we iterate such that one value, say u, converges to a constant; v then converges to the desired function. When a single multiplication is needed per iteration to normalize u, we have a multiplicative normalization method.

4.1 Logarithm Function

In this section, we first examine the multiplicative normalization algorithm for the logarithm function. Then we give an implementation.

4.1.1 Logarithm Function Algorithm

The following equations define a convergence method based on multiplicative normalization in which multiplications are done by shift/add [4] [5]:

\[ x(i+1) = x(i)\,b(i) = x(i)\left(1 + d(i)\,r^{-i}\right), \quad d(i) \in [-a, a] \tag{4.3} \]

\[ y(i+1) = y(i) - \ln b(i) = y(i) - \ln\left(1 + d(i)\,r^{-i}\right) \tag{4.4} \]

where ln(1 + d(i) r^{-i}) is read out from a table. Beginning with x(0) = x and y(0) = 0 and choosing the digits d(i) such that x(m) converges to 1, we have, after m steps:

\[ y(m) \approx \ln(x) \tag{4.5} \]

A radix-4 algorithm is used instead of radix-2. It improves the speed of convergence and reduces the hardware cost of the proposed design. The relative difficulty of forming multiples of three makes it inconvenient to use the maximally redundant radix-4 signed-digit set {-3, -2, -1, 0, 1, 2, 3}. Therefore, the minimally redundant set {-2, -1, 0, 1, 2} is used here.

An auxiliary function u(i) = 4^i (x(i) - 1) is used: the magnitude of u(i) is compared with a few constants to determine the next choice for d(i). The u(i) must be made to converge to 0. By substituting the function into equation 4.3 with r = 4, we obtain:

\[ u(i+1) = 4\left(u(i) + d(i) + d(i)\,u(i)\,r^{-i}\right), \quad d(i) \in [-2, 2] \tag{4.6} \]

\[ d(i) = \begin{cases} 2 & \text{if } u(i) \le a_1 \\ 1 & \text{if } a_1 < u(i) \le a_2 \\ 0 & \text{if } a_2 < u(i) < a_3 \\ -1 & \text{if } a_3 \le u(i) < a_4 \\ -2 & \text{if } u(i) \ge a_4 \end{cases} \tag{4.7} \]

The constants a1, a2, a3 and a4 are found by determining the conditions required for convergence in the five intervals.

We require |u(i+1)| ≤ |u(i)| for all i in order to converge. Let a be the smallest value of u(i) that can cause divergence. By substituting -2 for d(i) and a for u(i) in the right-hand side of equation 4.6, we obtain 4(a - 2 - 2a·4^{-i}) ≤ a, and in the limit, as i → ∞, we get 4(a - 2) ≤ a and a ≤ 8/3.

Let b be the biggest value of u(i) that can cause divergence. By substituting 2 for d(i) and b for u(i) in the right-hand side of equation 4.6, we obtain 4(b + 2 + 2b·4^{-i}) ≥ b, and for i = 2, we get 4(b + 2 + 2b·4^{-2}) ≥ b and b ≥ -16/7.

From the above analysis, we need -16/7 ≤ u(i) ≤ 8/3 in order to maintain convergence from i = 2. An interval-by-interval analysis to choose a1, a2, a3 and a4 so as to produce convergence from i = 2 then yields:

For d(i) = -2, u(i) > 0. To keep -16/7 ≤ u(i+1) ≤ 8/3, we must have -16/7 ≤ 4(u(i) - 2 - 2u(i)·4^{-i}) ≤ 8/3; as i → ∞, we get 4(u(i) - 2) ≤ 8/3 and u(i) ≤ 8/3; for i = 2, we get -16/7 ≤ 4(u(i) - 2 - 2u(i)·4^{-2}) and 80/49 ≤ u(i). So, we have 80/49 ≤ u(i) ≤ 8/3 for d(i) = -2.

For d(i) = -1, u(i) > 0, and we must have -16/7 ≤ 4(u(i) - 1 - u(i)·4^{-i}) ≤ 8/3; as i → ∞, we get 4(u(i) - 1) ≤ 8/3 and u(i) ≤ 5/3; for i = 2, we get -16/7 ≤ 4(u(i) - 1 - u(i)·4^{-2}) and 48/105 ≤ u(i). So, we have 48/105 ≤ u(i) < 5/3 for d(i) = -1.

For d(i) = 0, we must have -16/7 ≤ 4u(i) ≤ 8/3. So, we have -4/7 ≤ u(i) ≤ 2/3 for d(i) = 0.

For d(i) = 1, u(i) < 0, and we must have -16/7 ≤ 4(u(i) + 1 + u(i)·4^{-i}) ≤ 8/3; as i → ∞, we get 4(u(i) + 1) ≤ 8/3 and u(i) ≤ -1/3; for i = 2, we get -16/7 ≤ 4(u(i) + 1 + u(i)·4^{-2}) and -176/119 ≤ u(i). So, we have -176/119 ≤ u(i) ≤ -1/3 for d(i) = 1.

For d(i) = 2, u(i) < 0, and we must have -16/7 ≤ 4(u(i) + 2 + 2u(i)·4^{-i}) ≤ 8/3; as i → ∞, we get 4(u(i) + 2) ≤ 8/3 and u(i) ≤ -4/3; for i = 2, we get -16/7 ≤ 4(u(i) + 2 + 2u(i)·4^{-2}) and -144/63 ≤ u(i). So, we have -144/63 ≤ u(i) ≤ -4/3 for d(i) = 2.

u(i) Bounds
Lower                   Upper                d(i)
80/49     (1.633)       8/3   (2.667)        -2
48/105    (0.457)       5/3   (1.667)        -1
-4/7      (-0.571)      2/3   (0.667)         0
-176/119  (-1.479)      -1/3  (-0.333)        1
-144/63   (-2.286)      -4/3  (-1.333)        2

Table 4.1 The u(i) bounds in radix-4 multiplicative normalization

Table 4.1 summarizes the selection bounds. The effect of the redundancy in the digit set is evident in the overlapping of the intervals. Furthermore, because of the redundancy, a1, a2, a3 and a4 need not be known with great accuracy, so a full-length representation of u(i) is not necessary for the selection of d(i). The comparisons can therefore be implemented as fast low-precision operations. Hence, we can choose a1 = -1.4, a2 = -0.45, a3 = 0.55 and a4 = 1.65.

An initialization procedure is required for i = 0 and i = 1. Through testing, a good initialization was found to be:

\[ u(0) = 4(q \cdot x - 1) \tag{4.8} \]

\[ y(0) = -\log(q) \tag{4.9} \]

\[ q = \begin{cases} 2 & \text{if } 1/2 \le x < 11/16 \\ 1 & \text{if } 11/16 \le x < 1 \end{cases} \tag{4.10} \]

For the radix-4 algorithm, neglecting the effect of generated error, K iterations yield a result accurate to about 2K - 1 bits [4]. That is, the magnitude e of the absolute error is bounded by:

\[ e < 2^{-(2K-1)} \tag{4.11} \]

We wrote a MATLAB file to verify the radix-4 algorithm, using 8 iterations. Figure 4.1 shows the result of the error analysis. From Figure 4.1, we can see that the accuracy is about 1×10^-5, which is better than the calculated error bound e = 3×10^-5.
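The radix-4 algorithm can be sketched in floating point as follows. This Python sketch is not the MATLAB file mentioned above. It rests on several assumptions: the first iteration after initialization uses the factor 4^{-1}, the thresholds a1 and a2 are negative, y(0) = -ln(q), and `math.log` stands in for the table of ln(1 + d(i)·4^{-i}). The function names are illustrative.

```python
import math

# Selection thresholds; the negative signs of A1 and A2 are an assumption.
A1, A2, A3, A4 = -1.4, -0.45, 0.55, 1.65

def select_digit(u):
    """Digit selection by comparison of u with the four constants."""
    if u <= A1:
        return 2
    if u <= A2:
        return 1
    if u < A3:
        return 0
    if u < A4:
        return -1
    return -2

def ln_normalized(a, iterations=8):
    """Approximate ln(a) for a in [1/2, 1) by multiplicative normalization."""
    q = 2.0 if a < 11.0 / 16.0 else 1.0   # initialization digit
    x = q * a
    y = -math.log(q)
    for i in range(1, iterations + 1):
        u = 4.0 ** i * (x - 1.0)          # scaled remainder
        d = select_digit(u)
        b = 1.0 + d * 4.0 ** (-i)         # shift/add multiplier
        x *= b
        y -= math.log(b)                  # table look-up in hardware
    return y

def ln_approx(x, iterations=8):
    """Pre-scale x = a * 2**b, iterate on a, then post-scale."""
    a, b = math.frexp(x)                  # a in [1/2, 1)
    return ln_normalized(a, iterations) + b * math.log(2.0)
```

With 8 iterations the remainder |x - 1| is of the order of 4^{-9}, which is in line with the accuracy of about 1×10^-5 reported above.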

Because the algorithm presented in the previous part requires that the input be greater than or equal to 1/2 and less than 1, the first block, LOG_pre_scale, transforms the input data according to the following equation:

\[ x = a \cdot 2^{b}, \quad a \in [1/2, 1) \tag{4.12} \]

After transforming the input, the first block also completes the initialization procedure to obtain u(0) and y(0) in equations 4.8, 4.9 and 4.10. In the second block, LOG_iteration, ln(a) is obtained through the algorithm in the previous part, and b is transferred to the next block through register memories. In the third block, LOG_post_scale, ln(x) is obtained through the following equation:

\[ \ln(x) = \ln(a \cdot 2^{b}) = \ln(a) + b \ln(2) = \ln(a) + 0.6931\,b \tag{4.13} \]

Equations 4.4, 4.6 and 4.7 are implemented in the block LOG_iteration. Since we use a 16-bit input, we use 8 stages to reach an accuracy of 15 bits. In the file log_pack.vhd, we define a constant array to store the values of ln(1 + d(i) r^{-i}). Since we have 8 stages, we use 32 entries in the array.

4.2 Square-Root Function

In this section, we present the multiplicative normalization algorithm for the square-root function and give an implementation.

4.2.1 Square-Root Function Algorithm

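The algorithm for the square-root function is similar to the algorithm for the logarithm function. Both are based on multiplicative normalization. The recursion equations for the square-root algorithm are [4]:

\[ x(k+1) = x(k)\,d(k)^{2} \tag{4.14} \]

\[ y(k+1) = y(k)\,d(k) \tag{4.15} \]

\[ x(0) = x \tag{4.16} \]

\[ y(0) = x \tag{4.17} \]

y(k) → √x when x(k) → 1. To avoid the need to rescale the comparison constants at every iteration, the scaled variable u(k) = 2^k (x(k) - 1) is substituted in the above equations. With the multiplier written as 1 + d(k)·2^{-k}, the equations for the radix-2 algorithm are:

\[ u(k+1) = 2\left(u(k) + 2d(k)\right) + 2^{-k}\left(4d(k)u(k) + 2d(k)^{2}\right) + 2^{-2k}\,2d(k)^{2}u(k) \tag{4.18} \]

\[ y(k+1) = y(k)\left(1 + d(k)\,2^{-k}\right) \tag{4.19} \]

\[ d(k) = \begin{cases} -1 & \text{if } u(k) \ge a_1 \\ 0 & \text{if } -a_1 \le u(k) < a_1 \\ 1 & \text{if } u(k) < -a_1 \end{cases} \tag{4.20} \]

An initialization procedure is required. It is:

\[ u(0) = 2(q^{2} \cdot x - 1) \tag{4.21} \]

\[ y(0) = x \cdot q \tag{4.22} \]

\[ q = \begin{cases} 2 & \text{if } 1/4 \le x < 1/2 \\ 1 & \text{if } 1/2 \le x < 1 \end{cases} \tag{4.23} \]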
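The radix-2 recurrence for u(k) in equation 4.18 can be checked by a short expansion. The derivation below assumes that the multiplier applied at step k is 1 + d·2^{-k}, as in equation 4.19, and writes u for u(k) and d for d(k).

```latex
\begin{align*}
x(k+1) &= x(k)\,\bigl(1 + d\,2^{-k}\bigr)^{2}, \qquad x(k) = 1 + u\,2^{-k} \\
x(k+1) - 1 &= (u + 2d)\,2^{-k} + (2du + d^{2})\,2^{-2k} + d^{2}u\,2^{-3k} \\
u(k+1) &= 2^{k+1}\,\bigl(x(k+1) - 1\bigr) \\
       &= 2(u + 2d) + 2^{-k}\,(4du + 2d^{2}) + 2^{-2k}\,2d^{2}u
\end{align*}
```

The same expansion with 4 in place of 2 gives the radix-4 form of the recurrence.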

The equations for the radix-4 algorithm are:

\[ u(k+1) = 4\left(u(k) + 2d(k)\right) + 4^{-k}\left(8d(k)u(k) + 4d(k)^{2}\right) + 4^{-2k}\,4d(k)^{2}u(k) \tag{4.24} \]

\[ y(k+1) = y(k)\left(1 + d(k)\,4^{-k}\right) \tag{4.25} \]

\[ d(k) = \begin{cases} -1 & \text{if } u(k) \ge b \\ -1/2 & \text{if } a \le u(k) < b \\ 0 & \text{if } -a \le u(k) < a \\ 1/2 & \text{if } -b \le u(k) < -a \\ 1 & \text{if } u(k) < -b \end{cases} \tag{4.26} \]

In the above algorithms, a1 = -, a = -, and b = 3/2. We prefer the radix-4 algorithm since it improves the speed of convergence and reduces the hardware cost of the design. With these choices of a and b, the radix-4 algorithm converges only for k ≥ 3. The initialization problem is solved by using the radix-2 algorithm to perform five iterations and then continuing from that point with the radix-4 algorithm and k = 3. For the radix-4 algorithm, the error is bounded by:

\[ e < 2^{-2k} \tag{4.27} \]

Figure 4.3 shows the result of the error analysis after neglecting the effect of

generated error.

Figure 4.3 Error analysis of the square-root algorithm


4.2.2 Square-Root Function Implementation

Figure 4.4 is the top level of the square-root function implementation.

Figure 4.4 Top level of the square-root function (blocks: SQU_ROOT_pre_scale, SQU_ROOT_iteration, SQU_ROOT_post_scale)

The implementation is similar to that of the logarithm function. Because the algorithm presented in the previous part requires that the input be greater than or equal to 1/4 and less than 1, the first block, SQU_ROOT_pre_scale, transforms the input data according to the following equation:

\[ x = a \cdot 2^{2b}, \quad a \in [1/4, 1) \tag{4.28} \]

After transforming the input, the first block also completes the initialization procedure to obtain u(0) and y(0) in equations 4.21, 4.22 and 4.23. In the second block, SQU_ROOT_iteration, √a is obtained through the algorithm in the previous part, and b is transferred to the next block through register memories. In the third block, SQU_ROOT_post_scale, √x is obtained through the following equation:

J=.ja*22l' =/a*b*..JT=1.4142*b*fa (4.29)

To solve the initialization problem, there are two kinds of basic units in the SQU_ROOT_iteration block. The radix-2 units perform the first 5 iterations and the radix-4 units perform the last 6 iterations to achieve a total accuracy of 16 bits. Equations 4.18, 4.19 and 4.20 are implemented in the radix-2 unit. Equations 4.24, 4.25 and 4.26 are implemented in the radix-4 unit.

Chapter 5. Prime-Length DCT Transformation

The DCT has been widely used in data-compression applications for its near-optimal performance [21]. Different algorithms have been proposed in the literature to implement the DCT. As with the DFT, most algorithms were proposed for the computation of a DCT whose length is a power of two [21-27]. In this chapter, we introduce an algorithm to implement a prime-length DCT [29]. Then, we give an implementation of the algorithm.

5.1 Algorithm

A prime-length DCT is performed effectively through using two shorter

length correlations. The circular correlation architecture from the algorithm is

suitable for a constant coefficient multiplication and shift-register realization.

A derivation of the prime-length DCT algorithm is given in [29]. The algorithm is particularly attractive for hardware realizations using distributed arithmetic or other VLSI design techniques.

The DCT of a real data sequence {y(i) : i = 0, 1, ..., N-1} is defined as

\[ Y(k) = \sum_{i=0}^{N-1} y(i) \cos[2\pi(2i+1)k/4N], \quad k = 0, 1, \ldots, N-1 \tag{5.1} \]

The DCT defined in equation 5.1 can be formulated as

\[ Y(k) = \{2T(k) + x(0)\} \cos[k\pi/2N], \quad k = 0, 1, \ldots, N-1 \tag{5.2} \]

where

\[ T(k) = \sum_{i=1}^{N-1} x(i) \cos[\pi i k/N], \quad k = 0, 1, \ldots, N-1 \tag{5.3} \]

and x(i) is another sequence defined as

\[ x(N-1) = y(N-1), \qquad x(i) = y(i) - x(i+1), \quad i = 0, 1, \ldots, N-2 \tag{5.4} \]

Since N is prime, number theory guarantees that there exists a one-to-one mapping of the set {i : i = 1, 2, ..., N-1} onto itself:

\[ u = \langle g^{v} \rangle_N \tag{5.5} \]

where u, v ∈ {1, 2, ..., N-1}, g is a primitive root and ⟨x⟩_N denotes the result of the x modulo N operation. We can split T(k) into odd and even sequences T(2k) and T(2k+1) for k = 1, 2, ..., (N-1)/2.

We define a new sequence F(k), such that

\[ F(k) = T(2k+1) + T(2k-1), \quad k = 1, 2, \ldots, (N-1)/2 \tag{5.6} \]

\[ T(N) = \sum_{i=1}^{N-1} (-1)^{i} x(i) \tag{5.7} \]

Applying equation 5.5 to 5.3, and observing that cos(⟨g^{(N-1)/2+n}⟩_N (2π/N)) = cos(⟨g^{n}⟩_N (2π/N)), we can realise T(2k) and F(k) by two correlations of length (N-1)/2. Hence we have:

\[ T_1(k) = T(2\langle g^{k} \rangle_N) = \sum_{i=1}^{(N-1)/2} \{ x_1(i) + x_1((N-1)/2 + i) \}\, C(i+k), \quad k = 1, 2, \ldots, (N-1)/2 \tag{5.8} \]

\[ F_1((N-1)/2 + k) = F_1(k) = F(\langle g^{k} \rangle_N) = \sum_{i=1}^{(N-1)/2} \{ [x_1(i) - x_1((N-1)/2 + i)]\, C_0(i) \}\, C(i+k), \quad k = 1, 2, \ldots, (N-1)/2 \tag{5.9} \]

where

\[ x_1(i) = x(\langle g^{i} \rangle_N) \tag{5.10} \]

\[ C(i) = \cos[(2\pi/N) \langle g^{i} \rangle_N] \tag{5.11} \]

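In the design, we have N = 23. From equations 5.8 to 5.11, we have

\[
\begin{bmatrix} T_1(1) \\ T_1(2) \\ T_1(3) \\ T_1(4) \\ T_1(5) \\ T_1(6) \\ T_1(7) \\ T_1(8) \\ T_1(9) \\ T_1(10) \\ T_1(11) \end{bmatrix}
=
\begin{bmatrix} T(22) \\ T(12) \\ T(6) \\ T(20) \\ T(10) \\ T(18) \\ T(14) \\ T(16) \\ T(8) \\ T(4) \\ T(2) \end{bmatrix}
= C
\begin{bmatrix} x(11) + x(12) \\ x(6) + x(17) \\ x(20) + x(3) \\ x(13) + x(10) \\ x(5) + x(18) \\ x(9) + x(14) \\ x(7) + x(16) \\ x(8) + x(15) \\ x(19) + x(4) \\ x(2) + x(21) \\ x(22) + x(1) \end{bmatrix}
\tag{5.12}
\]

\[
\begin{bmatrix} F_1(1) \\ F_1(2) \\ F_1(3) \\ F_1(4) \\ F_1(5) \\ F_1(6) \\ F_1(7) \\ F_1(8) \\ F_1(9) \\ F_1(10) \\ F_1(11) \end{bmatrix}
=
\begin{bmatrix} F(11) \\ F(6) \\ F(3) \\ F(10) \\ F(5) \\ F(9) \\ F(7) \\ F(8) \\ F(4) \\ F(2) \\ F(1) \end{bmatrix}
=
\begin{bmatrix} T(23) + T(21) \\ T(13) + T(11) \\ T(7) + T(5) \\ T(21) + T(19) \\ T(11) + T(9) \\ T(19) + T(17) \\ T(15) + T(13) \\ T(17) + T(15) \\ T(9) + T(7) \\ T(5) + T(3) \\ T(3) + T(1) \end{bmatrix}
= C
\begin{bmatrix} 2[x(11) - x(12)] \cos 11\alpha \\ 2[x(6) - x(17)] \cos 6\alpha \\ 2[x(20) - x(3)] \cos 20\alpha \\ 2[x(13) - x(10)] \cos 13\alpha \\ 2[x(5) - x(18)] \cos 5\alpha \\ 2[x(9) - x(14)] \cos 9\alpha \\ 2[x(7) - x(16)] \cos 7\alpha \\ 2[x(8) - x(15)] \cos 8\alpha \\ 2[x(19) - x(4)] \cos 19\alpha \\ 2[x(2) - x(21)] \cos 2\alpha \\ 2[x(22) - x(1)] \cos 22\alpha \end{bmatrix}
\tag{5.13}
\]

where

\[
C = \begin{bmatrix}
\cos 12\alpha & \cos 40\alpha & \cos 20\alpha & \cos 36\alpha & \cos 28\alpha & \cos 32\alpha & \cos 16\alpha & \cos 8\alpha & \cos 4\alpha & \cos 44\alpha & \cos 24\alpha \\
\cos 40\alpha & \cos 20\alpha & \cos 36\alpha & \cos 28\alpha & \cos 32\alpha & \cos 16\alpha & \cos 8\alpha & \cos 4\alpha & \cos 44\alpha & \cos 24\alpha & \cos 12\alpha \\
\cos 20\alpha & \cos 36\alpha & \cos 28\alpha & \cos 32\alpha & \cos 16\alpha & \cos 8\alpha & \cos 4\alpha & \cos 44\alpha & \cos 24\alpha & \cos 12\alpha & \cos 40\alpha \\
\cos 36\alpha & \cos 28\alpha & \cos 32\alpha & \cos 16\alpha & \cos 8\alpha & \cos 4\alpha & \cos 44\alpha & \cos 24\alpha & \cos 12\alpha & \cos 40\alpha & \cos 20\alpha \\
\cos 28\alpha & \cos 32\alpha & \cos 16\alpha & \cos 8\alpha & \cos 4\alpha & \cos 44\alpha & \cos 24\alpha & \cos 12\alpha & \cos 40\alpha & \cos 20\alpha & \cos 36\alpha \\
\cos 32\alpha & \cos 16\alpha & \cos 8\alpha & \cos 4\alpha & \cos 44\alpha & \cos 24\alpha & \cos 12\alpha & \cos 40\alpha & \cos 20\alpha & \cos 36\alpha & \cos 28\alpha \\
\cos 16\alpha & \cos 8\alpha & \cos 4\alpha & \cos 44\alpha & \cos 24\alpha & \cos 12\alpha & \cos 40\alpha & \cos 20\alpha & \cos 36\alpha & \cos 28\alpha & \cos 32\alpha \\
\cos 8\alpha & \cos 4\alpha & \cos 44\alpha & \cos 24\alpha & \cos 12\alpha & \cos 40\alpha & \cos 20\alpha & \cos 36\alpha & \cos 28\alpha & \cos 32\alpha & \cos 16\alpha \\
\cos 4\alpha & \cos 44\alpha & \cos 24\alpha & \cos 12\alpha & \cos 40\alpha & \cos 20\alpha & \cos 36\alpha & \cos 28\alpha & \cos 32\alpha & \cos 16\alpha & \cos 8\alpha \\
\cos 44\alpha & \cos 24\alpha & \cos 12\alpha & \cos 40\alpha & \cos 20\alpha & \cos 36\alpha & \cos 28\alpha & \cos 32\alpha & \cos 16\alpha & \cos 8\alpha & \cos 4\alpha \\
\cos 24\alpha & \cos 12\alpha & \cos 40\alpha & \cos 20\alpha & \cos 36\alpha & \cos 28\alpha & \cos 32\alpha & \cos 16\alpha & \cos 8\alpha & \cos 4\alpha & \cos 44\alpha
\end{bmatrix}
\tag{5.14}
\]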

where α denotes π/23. It can be seen that the values of the cosine kernels along the same antidiagonal of the matrix C are the same, and that any two consecutive rows of C are identical except for a one-position shift. This shows that the vectors F1 and T1 are circular correlations of the cosine kernels with another vector.
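The reformulation can be checked numerically for N = 23. The following Python sketch compares the direct DCT with the form built from the difference sequence x(i) and T(k), and then evaluates T(2k) as a correlation of length (N-1)/2. The primitive root g = 11 is an assumption read off from the index pattern x(11), x(6), x(20), ... in the listing for N = 23; the summation limits of T(k) and all function names are likewise assumptions of this sketch.

```python
import math

N = 23
G = 11  # assumed primitive root modulo 23 (11, 6, 20, 13, ... matches the x indices)

def dct_direct(y):
    """Y(k) = sum_i y(i) cos[2*pi*(2i+1)k / 4N]."""
    n = len(y)
    return [sum(y[i] * math.cos(2 * math.pi * (2 * i + 1) * k / (4 * n))
                for i in range(n)) for k in range(n)]

def difference_sequence(y):
    """x(N-1) = y(N-1); x(i) = y(i) - x(i+1)."""
    n = len(y)
    x = [0.0] * n
    x[n - 1] = y[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = y[i] - x[i + 1]
    return x

def T(x, k):
    """T(k) = sum_{i=1}^{N-1} x(i) cos(pi*i*k/N)."""
    n = len(x)
    return sum(x[i] * math.cos(math.pi * i * k / n) for i in range(1, n))

def dct_via_T(y):
    """Y(k) = (2T(k) + x(0)) cos(k*pi/2N)."""
    n = len(y)
    x = difference_sequence(y)
    return [(2 * T(x, k) + x[0]) * math.cos(k * math.pi / (2 * n)) for k in range(n)]

def T1_by_correlation(x, g=G):
    """T1(k) as a correlation of length (N-1)/2 with the kernel C(i+k)."""
    n = len(x)
    h = (n - 1) // 2
    x1 = {i: x[pow(g, i, n)] for i in range(1, n)}       # permuted input
    def C(i):
        return math.cos(2 * math.pi * pow(g, i, n) / n)  # permuted cosine kernel
    return {k: sum((x1[i] + x1[h + i]) * C(i + k) for i in range(1, h + 1))
            for k in range(1, h + 1)}
```

Because the kernel depends only on i + k, one fixed set of 11 cosine constants serves every output, which is what makes the constant-coefficient, shift-register realization possible.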

5.2 Implementation

From the previous part we know that the matrix multiplication can be implemented by the structure shown in Figure 5.1. This schematic can be used to obtain the vectors F1 and T1. The required number of multiplications is reduced from N^2 in equation 5.1 to (N-1)^2/2 in equations 5.12 and 5.13.

Figure 5.1 Matrix multiplication, where SR denotes shift register

Chapter 6. Other Implementations

This chapter describes the blocks that have not been described in the

previous chapters.

6.1 Offcom Block

Figure 6.1 shows the implementation of the Offcom block.

Figure 6.1 Implementation of the offset compensation block

It implements equation 2.4. An IIR filter is applied to the digital speech signal s_in to remove its DC offset and produce the offset-free input signal s_of. The multiplication of s_of(n-1) by the constant 0.999 is implemented by a subtraction, that is, 0.999 * s_of(n-1) = s_of(n-1) - 0.001 * s_of(n-1). The last term, 0.001 * s_of(n-1), is implemented by a 10-bit arithmetic right shift, since 2^-10 is approximately 0.001. We assume the input is in signed 2's complement format.
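The shift-and-subtract multiplication can be illustrated with integers. This Python sketch is not taken from the VHDL source, and the function name is hypothetical. It relies on Python's `>>` being an arithmetic shift on negative integers, which mirrors a signed 2's complement shifter.

```python
def times_0_999(v):
    """Approximate 0.999 * v as v - v * 2**-10 using an arithmetic right shift."""
    return v - (v >> 10)
```

For example, `times_0_999(1024)` gives 1023. The shift truncates, so for non-negative inputs smaller than 1024 the subtracted term is zero and the value passes through unchanged; the word length therefore determines how accurately small values are scaled.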

6.2 Framing Block

The offset-free signal s_of is divided into overlapping frames of N samples with a frame shift interval of M samples. In the project, we select N = 400 and M = 160. The Framing block has four output channels. Each channel includes a data output signal s_fr and a start-of-frame signal sof. There are two counters and a state machine in the block. The first counter, the shift-interval counter, issues a sof signal every 160 clock cycles. The second counter, the frame-length counter, signals the state machine to force the output to 0 from clock cycle 401 to clock cycle 512 of a frame. Figure 6.2 shows the state machine diagram. Figure 6.3 shows the simulation result of the block.
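The framing behaviour can be sketched in software as follows. This Python sketch is a behavioural model only, not a model of the counters or the state machine; the function name and the default FFT length parameter are illustrative.

```python
def make_frames(samples, frame_len=400, shift=160, fft_len=512):
    """Split samples into overlapping frames, each zero-padded to fft_len."""
    frames = []
    start = 0
    while start + frame_len <= len(samples):
        frame = list(samples[start:start + frame_len])
        frame += [0] * (fft_len - frame_len)   # output forced to 0 for cycles 401..512
        frames.append(frame)
        start += shift                          # a new frame starts every `shift` samples
    return frames
```

Since a padded frame lasts 512 cycles and a new frame starts every 160 cycles, up to four frames are active at once, which is consistent with the four output channels.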

Figure 6.2 Framing block state machine diagram

Figure 6.3 The simulation result of Framing block

In the simulation, the input is fixed to a constant x"7fff".

6.3 EM Block

The EM block implements the equation 2.5. At the beginning of every

frame, the s_em signal is set to zero through a multiplexer controlled by sof. Figure

6.4 shows the structure of EM block.

Figure 6.4 The structure of EM block

6.4 PE Block

A pre-emphasis of high frequencies is required to obtain similar amplitude

for all formants since high frequency formants have smaller amplitude with respect

to low frequency formants. Figure 6.5 shows the implementation of pre-emphasis

filter. It implements the equation 2.7.

Figure 6.5 The implementation of pre-emphasis block

6.5 W Block

A rectangular window is implicitly applied when a sequence of N samples is taken from a signal. A product between the speech signal and the window in the time domain causes a convolution in the frequency domain, which distorts the estimated speech spectrum. A Hamming window of length N is applied to the s_pe signal to reduce the distortion. Figure 6.6 shows the structure of the W block. It implements equation 2.7.

Figure 6.6 The structure of W block

A ROM is used to store the coefficients 0.54 - 0.46 * cos(2π(n-1)/(N-1)). A counter controlled by the sof signal is used to produce the address of the ROM.
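The ROM contents could be generated as in the following Python sketch. The number of fractional bits and the function name are assumptions; the text does not state the coefficient word length.

```python
import math

def hamming_rom(n_len=400, frac_bits=15):
    """Quantized Hamming coefficients for n = 1..n_len (frac_bits is an assumption)."""
    scale = 1 << frac_bits
    return [round((0.54 - 0.46 * math.cos(2 * math.pi * (n - 1) / (n_len - 1))) * scale)
            for n in range(1, n_len + 1)]
```

The ROM address is simply n - 1, so a counter that is reset by sof and incremented once per sample is enough to read the table. Note that the largest coefficient is very close to 1.0, so the chosen word length must be able to represent it without overflow.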

6.6 MF Block

We see from Figure 2.5 that the frequency band covered by each filter in the mel filter bank overlaps the adjacent bands. We split the frequency bands into even and odd bands to avoid the overlap. Figures 6.7 to 6.9 illustrate the original filter bands and the two split sets of filter bands.

Figure 6.7 Bands of filter scaled according to mel scale

Figure 6.8 Even bands of mel filter

Figure 6.9 Odd bands of mel filter

Figure 6.10 shows the implementation of the MF block.


Figure 6.10 The structure of MF block

The implementation is split into two channels, one for even bands and one for odd bands. ROMs are used to store the filter factors. The counter signal from the previous stage is used to address the ROMs and to control the multiplexer and demultiplexer. The block generates an end-of-frame signal, eof, by checking the counter signal when a frame ends. The Deframing block uses the eof signal to choose the channel and send fbank[1:23] to the LOG function.

Chapter 7. Simulation and Synthesis

Based on the design given in Chapters 3 to 6, we implemented the design in VHDL. In this project, we completed the VHDL coding, functional simulation and synthesis.

7.1 Simulation

We simulate the design block by block. In most cases, we use text I/O to produce the block input. A testbench is written for each tested block, in which the tested block is declared as a component. In this project, a typical testbench is composed of the following parts:

Clock generator;

Initialization part;

Reading input data from a text file;

One or more components declaration.

It is not necessary to include every part in a testbench file. A macro file (do file) is written to perform compilation and display waveforms in the view window. A simulation includes the following steps:

1. Writing the VHDL source file for the block;

2. Writing the VHDL testbench file;

3. Producing the input text file for the simulation;

4. Writing a do file;

5. Performing the simulation;

6. Revising the design according to the simulation and going back to step 5.

In Appendix A, we describe the files used to build and simulate the project.

7.2 Synthesis

After the simulations passed, we completed the synthesis to make sure the design meets the synthesis requirements. Every block in the design passed all the synthesis tests. Table 7.1 summarizes the synthesis results.

Block Name      Number of   Number of   Number of   Latency          Number of   Clock Frequency
                Ports       Nets        Instances   (Clock Cycles)   Gates       (MHz)
Offcom          34          182         107         4                973         99.4
Framing         86          208         172         1                1167        229.9
PE              36          637         596         2                3586        49.3
W               36          1729        1476        2                9902        38.4
FFT             53          1747        1254        513              379817      9.2
SR              74          1019        634         13               50455       40.4
MF              573         3881        3316        3                26901       38.8
Deframing       2847        5722        2876        1                23535       276.5
EM              35          745         484         0                3625        46.6
LOG             588         825         133         11               12758       81.5
DCT             420         9025        8116        65               105865      34.8
Whole Project   35          3142        30          615              2041441

Table 7.1 Synthesis result

Chapter 8. Summary and Future Work

8.1 Summary

A VHDL description of a speech-recognition front-end is designed and captured in this thesis. Our purpose is to design an ASIC chip that offloads the digital signal processing part of the speech recognition process from the CPU. The front-end extracts the feature vectors from the input audio to perform a significant level of data reduction. The following topics are included:

1. Investigation of the popular speech-recognition front-end processing algorithms based on auditory properties.

2. Presentation of a new FFT algorithm that reduces the number of complex multiplications in the FFT. The VLSI implementation of the algorithm is given.

3. Analysis of the multiplicative normalization algorithms for the natural logarithm and square-root functions. The VLSI implementations of the algorithms are given.

4. A prime-length DCT algorithm and its VLSI implementation.

5. VHDL description of every block in the design.

The simulations verify the effectiveness of the proposed algorithms and VHDL implementation.

8.2 Future Work

The SR front-end algorithm and the mapping of the FFT, DCT, LOG, square-root and other function blocks to an implementation involve many more issues, concerning power, area and architecture, than the simple functional implementation that we have made to this point. The complete design is too big to tackle in this project, and several very important concerns have not been addressed. To help in the future exploration of the design space, we have identified key points that need to be considered for the design.

Flexibility: Should we support the other sampling rates in the ETSI standard? Do we adapt to the channel on the fly? These considerations mean that we need to detect the sampling rate and choose the FFT length and other functions accordingly.

Number Representation: We use 2's complement to simplify the implementation of multiplication, addition and subtraction. An evaluation needs to be done to determine whether the benefits of other number representations, such as sign-magnitude and unsigned representation, justify their added complexity over the simplicity of 2's complement.

Coefficient Quantization: How many bits are needed to represent the coefficients? How many bits are needed in the arithmetic, LOG and square-root units? A simulation of the output noise-to-signal ratio needs to be done to help determine the amount of resolution required.

Power: Do we need to use low-power design to limit the power in the system? Do we need to use low-power algorithms in the block design? The power and hardware complexity need to be optimized or traded off against each other.

Back-End Design: The back-end design is the next step in the project. We need to finish clock tree synthesis, place and route, timing closure, extraction, delay calculation and signal integrity checks for the project.

References

[1] "Distributed Speech Recognition (DSR)", see web slides on,http://www.etsi.org/technicalactiv/dsr.htm

[2] "Work Programme: Details of DES/STQ-00007' Work Item", see web slideson, http ://webapp.etsi .org/workprogramlreportworkitem.asp'?wkiid=6400.

[3] Nicholas Cravotta, "Speech recognition: It's not what you say; it's how yousay it", see web slides on,http://www.ednmag.com/reg/1999/062499/13df1.htm.

[4] Omondi, A.R., Computer Arithemetic Systems: Algorithms, Arhcitecture andImplementation, Prentice-Hall, 1994.

[5] Behrooz Parhami, Computer Arithmetic, Oxford University Press, 2000.

[6] Marco Sarton, "speech recognition", see web slides on,http://www.gare.co.uk/technologywatch/speech.htm

[7] "Speech Recognition", see web slides on,http ://csd.newcastle.edu.aulusers/staff/speech/homepages/tutoriaisr.htrnl

[8] B. Gold, N. Morgan, Speech and Audio Signal Processing: Processing andPerception of Speech and Music. John Wiley & Sons, Inc.. 2000.

[9] C. Becchetti, L.P. Ricotti, Speech Recognition: Theory and C++Implementation. John Wiley & Sons, LTD. 1999.

[10] Fant, G., Acoustic Theory of Speech Production, Morton, S-Gravenhage,1960.

[11] C. R. Jankowski, H.-D. H. Vo, and R. P. Lippmann. "A comparison of signalprocessing front-ends for automatic word recognition". IEEE Trans. Speechand Audio Proccessing, 3(4):286--293, 1995.

[12] Bogert, B., Healy, M., and Tukey, J., The quefrency analysis of time series forechos, in M. Rosenblatt, ed., Proc. Symp. On Time Series Analysis, Wiley,New York, pp. 209-243.

[13] Oppenheim, A. V., Digital Processing of Signals, McGraw-Hill, New York,pp. 233-264, 1969.

58

[14] ETSI standard: ETSI ES 201 108 v1.1.2 (2000-04). See http://webapp.etsi.org/workprogram/Report_WorkItem.asp?WKI_ID=9948

[15] Rabiner L.R., Schafer R.W., Digital Processing of Speech Signals, Prentice Hall Inc., 1978.

[16] Alterson, Robert, Performance analysis of object-oriented, speech recognition algorithms in a real-time, multiprocessor environment. M.A.Sc. thesis, Technical University of Nova Scotia.

[17] "Split-Radix Fast Fourier Transform Using Streaming SIMD Extensions," see http://www.intel.com/vtune/cbts/strmsimd/appnotes/ap808/srfft.pdf

[18] P. Duhamel, "Implementation of "Split-Radix" FFT Algorithms for Complex, Real, and Real-Symmetric Data," Proc. 1985 IEEE Int. Conf. Acoust., Speech, Signal Processing, 1985, pp. 784-787.

[19] M. A. Richards, "On Hardware Implementation of the Split-Radix FFT," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 36, no. 10, pp. 1575-1581.

[20] E. H. Wold and A. M. Despain, "Pipeline and parallel-pipeline FFT processors for VLSI implementations," IEEE Trans. Comput., vol. C-33, 1984, pp. 414-426.

[21] N. Ahmed, T. Natarajan, and K. R. Rao, "Discrete Cosine Transform," IEEE Trans. Comput., vol. C-23, pp. 90-93, Jan. 1974.

[22] Yu-Tai Chang and Chin-Liang Wang, "New Systolic Array Implementation of the 2-D Discrete Cosine Transform and its Inverse," IEEE Trans. Circuits and Systems for Video Technology, Vol. 5, No. 2, Apr. 1995.

[23] Darren Slawecki and Weiping Li, "DCT/IDCT Processor Design for High Data Rate Image Coding," IEEE Trans. Circuits and Systems for Video Technology, Vol. 2, No. 2, Jun. 1992.

[24] P. C. Jain et al., "VLSI Implementation of Two-Dimensional DCT Processor in Real Time for Video Codec," IEEE Trans. Consumer Electronics, Vol. 38, No. 3, Aug. 1992.

[25] Vetterli, M., and Nussbaumer, H., "A Simple FFT and DCT Algorithm with Reduced Number of Operations," Signal Process., 1984, 6, (4), pp. 267-278.


[26] N. I. Cho and S. U. Lee, "DCT Algorithms for VLSI Parallel Implementations," IEEE Trans. Acoust., Speech, Signal Processing, Vol. 38, No. 1, pp. 121-127, Jan. 1990.

[27] K. R. Rao, P. Yip, Discrete Cosine Transform: Algorithms, Advantages, Applications, Academic Press, Inc., 1990.

[28] Jiun-In Guo, "An Efficient Parallel Adder Based Design for One Dimensional Discrete Fourier Transform," Proc. Natl. Sci. Counc. ROC(A), Vol. 24, No. 3, 2000, pp. 195-204.

[29] Y. H. Chan and W. C. Siu, "Algorithm for Prime Length Discrete Cosine Transforms," Electronics Letters, 1st February 1990, Vol. 26, No. 3, pp. 206-208.

[30] Jiun-In Guo, Chi-Min Liu, and Chein-Wei Jen, "A New Array Architecture for Prime-Length Discrete Cosine Transform," IEEE Transactions on Signal Processing, Vol. 41, No. 1, January 1993, pp. 436-442.


Appendices


Appendix A Software Description

This appendix describes the software files used to build the speech recognition front-end.

VHDL Source Files

Sp_front_end.vhd: the top level of the project.

Offset_comp.vhd: source code for the Offcom block.

Framing.vhd: source code for the Framing block.

Pefilter.vhd: source code for the PE block.

Windowing.vhd: source code for the W block.

Fft_512_sr.vhd: source code for the FFT block.

Square_root.vhd: source code for the SR block.

Mel_filter.vhd: source code for the MF block.

Energy_measure.vhd: source code for the EM block.

Deframing.vhd: source code for the Deframing block.

Log.vhd: source code for the LOG block.

Dct.vhd: source code for the DCT block.

Figures A1.1 and A1.2 show the VHDL file hierarchy.


Appendix B Synthesis Report

Area Report File

Cell: sp_front_end    View: behavioral    Library: work

*******************************************************

Cell              Library   References             Total Area

CONZ0             scl05u    1 x        1          1 CONZ0
dct               work      1 x   105865     105865 gates
                                       2          2 CONZ1
                                       3          3 CONZ0
deframing         work      1 x    23535      23535 gates
energy_measure    work      4 x        1          4 CONZ0
                                    3625      14502 gates
fft_512_sr        work      4 x        2          8 CONZ1
                                  379817    1519267 gates
                                      17         68 CONZ0
framing           work      1 x     1167       1167 gates
log               work      1 x        9          9 CONZ1
                                   12758      12758 gates
                                      10         10 CONZ0
mel_filter        work      4 x        1          4 CONZ0
                                       1          4 CONZ1
                                   26901     107602 gates
offset_comp       work      1 x        1          1 CONZ0
                                       1          1 CONZ1
                                     973        973 gates
pe_filter         work      4 x        1          4 CONZ1
                                    3586      14345 gates
square_root       work      4 x    50455     201821 gates
                                      13         52 CONZ0
windowing         work      4 x        1          4 CONZ0
                                    9902      39606 gates

Number of ports :                       35
Number of nets :                      3142
Number of instances :                   30
Number of references to this view :      0

Total accumulated area:
Number of CONZ0 :                      147
Number of CONZ1 :                       28
Number of gates :                  2041441

Info, Command 'report area' finished successfully


Appendix C Design Files

Sp_front_end.vhd: the top level of the project.

-- file: sp_front_end.vhd
-- thesis project
-- Programmer: Xin Xiao
-- Date: 02/01
--
-- Functional description:
-- a VHDL description of a speech recognition front-end

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
use work.mel_filter_pack.all;

ENTITY sp_front_end IS PORT(
    clk     : in  std_logic;
    reset   : in  std_logic;
    s_in    : in  std_logic_vector(15 downto 0);
    sof_out : out std_logic;
    s_out   : out std_logic_vector(15 downto 0)
);
END sp_front_end;

ARCHITECTURE behavioral OF sp_front_end IS
-- type declaration ------------------------------------------------
    type sof_vector is array (0 to 3) of std_logic;
    type s_vector is array (0 to 3) of std_logic_vector(15 downto 0);
    type mel_bank_vector is array (0 to 3) of mel_bank;

-- component declaration -------------------------------------------
    component offset_comp
        PORT(
            clk   : in  std_logic;
            reset : in  std_logic;
            s_in  : in  std_logic_vector(15 DOWNTO 0);
            s_of  : out std_logic_vector(15 DOWNTO 0)
        );
    end component;

    component framing
        PORT(
            clk   : in  std_logic;
            reset : in  std_logic;
            s_of  : in  std_logic_vector(15 DOWNTO 0);
            sof1  : out std_logic;
            sof2  : out std_logic;
            sof3  : out std_logic;
            sof4  : out std_logic;
            s_fr1 : out std_logic_vector(15 DOWNTO 0);
            s_fr2 : out std_logic_vector(15 DOWNTO 0);
            s_fr3 : out std_logic_vector(15 DOWNTO 0);
            s_fr4 : out std_logic_vector(15 DOWNTO 0)
        );
    end component;

    component pe_filter
        PORT(
            clk    : in  std_logic;
            reset  : in  std_logic;
            sof_fr : in  std_logic;
            s_fr   : in  std_logic_vector(15 DOWNTO 0);
            sof_pe : out std_logic;
            s_pe   : out std_logic_vector(15 DOWNTO 0)
        );
    end component;

    component windowing
        PORT(
            clk    : in  std_logic;
            reset  : in  std_logic;
            sof_pe : in  std_logic;
            s_pe   : in  std_logic_vector(15 DOWNTO 0);
            sof_w  : out std_logic;
            s_w    : out std_logic_vector(15 DOWNTO 0)
        );
    end component;

    component fft_512_sr
        PORT(
            clk       : in  std_logic;
            sof_w     : in  std_logic;
            reset     : in  std_logic;
            data_in   : in  std_logic_vector(15 DOWNTO 0);
            sof_fft   : out std_logic;
            valid_fft : out std_logic;
            out_real  : out std_logic_vector(15 DOWNTO 0);
            out_imag  : out std_logic_vector(15 DOWNTO 0)
        );
    end component;

    component square_root
        PORT(
            clk       : in  std_logic;
            reset     : in  std_logic;
            sof_fft   : in  std_logic;
            valid_fft : in  std_logic;
            fft_real  : in  std_logic_vector(15 DOWNTO 0);
            fft_imag  : in  std_logic_vector(15 DOWNTO 0);
            sof_squ   : out std_logic;
            valid_squ : out std_logic;
            fft_mag   : out std_logic_vector(15 DOWNTO 0)
        );
    end component;

    component mel_filter
        PORT(
            clk       : in  std_logic;
            reset     : in  std_logic;
            sof_squ   : in  std_logic;
            valid_squ : in  std_logic;
            fft_mag   : in  std_logic_vector(15 DOWNTO 0);
            eof_mel   : out std_logic;
            fbank     : out mel_bank  -- 1 to 23
        );
    end component;

    component deframing
        PORT(
            clk       : in  std_logic;
            reset     : in  std_logic;
            eof_mel1  : in  std_logic;
            eof_mel2  : in  std_logic;
            eof_mel3  : in  std_logic;
            eof_mel4  : in  std_logic;
            fbank1    : in  mel_bank;
            s_em1     : in  std_logic_vector(15 DOWNTO 0);
            fbank2    : in  mel_bank;
            s_em2     : in  std_logic_vector(15 DOWNTO 0);
            fbank3    : in  mel_bank;
            s_em3     : in  std_logic_vector(15 DOWNTO 0);
            fbank4    : in  mel_bank;
            s_em4     : in  std_logic_vector(15 DOWNTO 0);
            sof_def   : out std_logic;
            fbank_def : out mel_bank;
            s_em_def  : out std_logic_vector(15 DOWNTO 0)
        );
    end component;

    component energy_measure
        PORT(
            clk    : in  std_logic;
            reset  : in  std_logic;
            sof_fr : in  std_logic;
            s_fr   : in  std_logic_vector(15 DOWNTO 0);
            s_em   : out std_logic_vector(15 DOWNTO 0)
        );
    end component;

    component log
        PORT(
            clk     : in  std_logic;
            reset   : in  std_logic;
            eof_mel : in  std_logic;
            fbank   : in  mel_bank;  -- 1 to 23
            s_em    : in  std_logic_vector(15 DOWNTO 0);
            sof_log : out std_logic;
            s_log   : out std_logic_vector(15 DOWNTO 0)
        );
    end component;

    component dct
        PORT(
            clk     : in  std_logic;
            reset   : in  std_logic;
            sof_log : in  std_logic;
            s_log   : in  std_logic_vector(15 downto 0);
            sof_dct : out std_logic;
            dct_out : out std_logic_vector(15 downto 0)
        );
    end component;

    attribute noopt : boolean;

-- configuration ---------------------------------------------------
    for all: offset_comp use entity work.offset_comp(behavioral);
    attribute noopt of offset_comp: component is TRUE;

    for all: framing use entity work.framing(behavioral);
    attribute noopt of framing: component is TRUE;

    for all: pe_filter use entity work.pe_filter(behavioral);
    attribute noopt of pe_filter: component is TRUE;

    for all: windowing use entity work.windowing(behavioral);
    attribute noopt of windowing: component is TRUE;

    for all: fft_512_sr use entity work.fft_512_sr(behavioral);
    attribute noopt of fft_512_sr: component is TRUE;

    for all: square_root use entity work.square_root(behavioral);
    attribute noopt of square_root: component is TRUE;

    for all: mel_filter use entity work.mel_filter(behavioral);
    attribute noopt of mel_filter: component is TRUE;

    for all: deframing use entity work.deframing(behavioral);
    attribute noopt of deframing: component is TRUE;

    for all: energy_measure use entity work.energy_measure(behavioral);
    attribute noopt of energy_measure: component is TRUE;

    for all: log use entity work.log(behavioral);
    attribute noopt of log: component is TRUE;

    for all: dct use entity work.dct(behavioral);
    attribute noopt of dct: component is TRUE;

-- signal declaration ----------------------------------------------
    signal s_of      : std_logic_vector(15 DOWNTO 0);
    signal sof_fr    : sof_vector;
    signal s_fr      : s_vector;
    signal sof_pe    : sof_vector;
    signal s_pe      : s_vector;
    signal sof_w     : sof_vector;
    signal s_w       : s_vector;
    signal sof_fft   : sof_vector;
    signal valid_fft : sof_vector;
    signal fft_real  : s_vector;
    signal fft_imag  : s_vector;
    signal sof_squ   : sof_vector;
    signal valid_squ : sof_vector;
    signal fft_mag   : s_vector;
    signal eof_mel   : sof_vector;
    signal fbank     : mel_bank_vector;
    signal sof_def   : std_logic;
    signal fbank_def : mel_bank;
    signal s_em_def  : std_logic_vector(15 DOWNTO 0);
    signal s_em      : s_vector;
    signal sof_log   : std_logic;
    signal s_log     : std_logic_vector(15 DOWNTO 0);

BEGIN
-- component instantiation -----------------------------------------
    offcom_inst: offset_comp
        port map (clk, reset, s_in, s_of);

    framing_inst: framing
        port map (clk, reset, s_of,
                  sof_fr(0), sof_fr(1), sof_fr(2), sof_fr(3),
                  s_fr(0), s_fr(1), s_fr(2), s_fr(3));

    -- the four frame streams are processed by four parallel paths
    pe_inst: for i in 0 to 3 generate
    begin
        pe_filter_inst: pe_filter
            port map (clk, reset, sof_fr(i), s_fr(i), sof_pe(i), s_pe(i));
    end generate;

    w_inst: for i in 0 to 3 generate
    begin
        windowing_inst: windowing
            port map (clk, reset, sof_pe(i), s_pe(i), sof_w(i), s_w(i));
    end generate;

    fft_inst: for i in 0 to 3 generate
    begin
        fft_512_sr_inst: fft_512_sr
            port map (clk, sof_w(i), reset, s_w(i), sof_fft(i),
                      valid_fft(i), fft_real(i), fft_imag(i));
    end generate;

    sr_inst: for i in 0 to 3 generate
    begin
        square_root_inst: square_root
            port map (clk, reset, sof_fft(i), valid_fft(i),
                      fft_real(i), fft_imag(i), sof_squ(i), valid_squ(i),
                      fft_mag(i));
    end generate;

    mf_inst: for i in 0 to 3 generate
    begin
        mel_filter_inst: mel_filter
            port map (clk, reset, sof_squ(i), valid_squ(i),
                      fft_mag(i), eof_mel(i), fbank(i));
    end generate;

    em_inst: for i in 0 to 3 generate
    begin
        energy_measure_inst: energy_measure
            port map (clk, reset, sof_fr(i), s_fr(i), s_em(i));
    end generate;

    -- the four paths are merged back into a single stream
    deframing_inst: deframing
        port map (clk, reset, eof_mel(0), eof_mel(1),
                  eof_mel(2), eof_mel(3), fbank(0), s_em(0),
                  fbank(1), s_em(1), fbank(2), s_em(2),
                  fbank(3), s_em(3), sof_def, fbank_def, s_em_def);

    log_inst: log
        port map (clk, reset, sof_def, fbank_def, s_em_def,
                  sof_log, s_log);

    dct_inst: dct
        port map (clk, reset, sof_log, s_log, sof_out, s_out);

END behavioral;


Offset_comp.vhd: source code for the Offcom block.

-- file: offset_comp.vhd
-- thesis project
-- Programmer: Xin Xiao
-- Date: 01/2001
--
-- Prior to the framing, a notch filtering operation is applied to the digital
-- samples of the input speech signal s_in to remove their DC offset, producing
-- the offset-free input signal s_of.
-- s_of(n) = s_in(n) - s_in(n - 1) + 0.999 * s_of(n - 1)

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_signed.ALL;

ENTITY offset_comp IS PORT(
    clk   : in  std_logic;
    reset : in  std_logic;
    s_in  : in  std_logic_vector(15 DOWNTO 0);
    s_of  : out std_logic_vector(15 DOWNTO 0)
);
END offset_comp;

ARCHITECTURE behavioral OF offset_comp IS
    signal s_in_reg      : std_logic_vector(15 DOWNTO 0);
    signal s_in_reg1     : std_logic_vector(15 DOWNTO 0);
    signal s_of_reg      : std_logic_vector(15 DOWNTO 0);
    signal s_of_reg1     : std_logic_vector(15 DOWNTO 0);
    signal s_in_diff     : std_logic_vector(15 DOWNTO 0);
    signal s_of_diff     : std_logic_vector(15 DOWNTO 0);
    signal s_of_reg1_ext : std_logic_vector(16 DOWNTO 0);
BEGIN

-- Registers -------------------------------------------------------

    -- Input register: holds s_in(n)
    inreg: process (clk, reset, s_in)
    begin
        -- async reset
        if reset = '1' then
            s_in_reg <= x"0000";
        elsif (clk'event and clk = '1') then
            s_in_reg <= s_in;
        end if;
    end process;

    -- Input register1: holds s_in(n - 1)
    inreg1: process (clk, reset, s_in_reg)
    begin
        -- async reset
        if reset = '1' then
            s_in_reg1 <= x"0000";
        elsif (clk'event and clk = '1') then
            s_in_reg1 <= s_in_reg;
        end if;
    end process;

    -- output register
    outreg: process (clk, reset, s_of_reg)
    begin
        -- async reset
        if reset = '1' then
            s_of <= x"0000";
        elsif (clk'event and clk = '1') then
            s_of <= s_of_reg;
        end if;
    end process;

    -- output register1: holds s_of(n - 1) for the feedback path
    outreg1: process (clk, reset, s_of_reg1)
    begin
        -- async reset
        if reset = '1' then
            s_of_reg <= x"0000";
        elsif (clk'event and clk = '1') then
            s_of_reg <= s_of_reg1;
        end if;
    end process;

-- arithmetic ------------------------------------------------------

    -- s_in(n) - s_in(n - 1)
    s_in_diff <= s_in_reg - s_in_reg1;
    -- 0.999 * s_of(n - 1), approximated as s_of - s_of / 2**10
    s_of_diff <= s_of_reg - shr(s_of_reg, b"01010");
    -- 17-bit sum of the sign-extended terms, so overflow can be detected
    s_of_reg1_ext <= (s_in_diff(15) & s_in_diff) + (s_of_diff(15) & s_of_diff);

    -- saturate the 17-bit sum to the 16-bit output range
    arith: process (reset, s_in_diff, s_of_diff, s_of_reg1_ext)
    begin
        -- async reset
        if reset = '1' then
            s_of_reg1 <= X"0000";
        elsif (s_of_reg1_ext > B"0_0111_1111_1111_1111") then
            s_of_reg1 <= B"0111_1111_1111_1111";
        elsif (s_of_reg1_ext < B"1_1000_0000_0000_0000") then
            s_of_reg1 <= B"1000_0000_0000_0000";
        else
            s_of_reg1 <= s_of_reg1_ext(15 downto 0);
        end if;
    end process;

END behavioral;
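
The recurrence stated in the header of Offset_comp.vhd can be checked against a small software model. The following is a minimal Python sketch, not part of the thesis sources. It assumes that 0.999 is realized as 1 - 2^-10 by an arithmetic right shift and that the sum saturates to 16 bits; the function names are hypothetical, and the pipeline registers of the VHDL block are not modeled.

```python
def saturate16(value):
    """Clamp an integer to the signed 16-bit range."""
    return max(-32768, min(32767, value))


def offset_compensate(samples):
    """Integer model of s_of(n) = s_in(n) - s_in(n-1) + 0.999 * s_of(n-1).

    Assumption: 0.999 is approximated as 1 - 2**-10, computed with an
    arithmetic right shift by 10 bits.
    """
    out = []
    prev_in = 0   # s_in(n-1), zero as after reset
    prev_out = 0  # s_of(n-1), zero as after reset
    for x in samples:
        feedback = prev_out - (prev_out >> 10)  # about 0.999 * s_of(n-1)
        y = saturate16((x - prev_in) + feedback)
        out.append(y)
        prev_in, prev_out = x, y
    return out


if __name__ == "__main__":
    # A constant (pure DC) input: the output decays after the initial step.
    response = offset_compensate([20000] * 10000)
    print(response[0], response[1], response[-1])
```

In this model a constant input decays towards zero but, because the shift truncates, it settles at a small residual below 1024 rather than at exactly zero.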


Framing.vhd: source code for the Framing block.

-- file: framing.vhd
-- thesis project
-- Programmer: Xin Xiao
-- Date: 12/00
--
-- The offset-free input signal s_of is divided into overlapping frames of N
-- samples. The frame shift interval (difference between the starting points of
-- consecutive frames) is M samples. The parameter M defines the number of
-- frames per unit time.

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_signed.ALL;

ENTITY framing IS PORT(
    clk   : in  std_logic;
    reset : in  std_logic;
    s_of  : in  std_logic_vector(15 DOWNTO 0);
    sof1  : out std_logic;
    sof2  : out std_logic;
    sof3  : out std_logic;
    sof4  : out std_logic;
    s_fr1 : out std_logic_vector(15 DOWNTO 0);
    s_fr2 : out std_logic_vector(15 DOWNTO 0);
    s_fr3 : out std_logic_vector(15 DOWNTO 0);
    s_fr4 : out std_logic_vector(15 DOWNTO 0)
);
END framing;

ARCHITECTURE behavioral OF framing IS

    constant N : integer := 400;
    constant M : integer := 160;

-- sof states ------------------------------------------------------
    type sof_states is (sof_st1, sof_st2, sof_st3, sof_st4);
    signal sof_ps     : sof_states;
    signal sof_ns     : sof_states;
    signal sof_in     : std_logic;
    signal shi_cnt    : std_logic_vector(7 DOWNTO 0);
    signal fra_cnt    : std_logic_vector(6 DOWNTO 0);
    signal frame      : std_logic;
    signal sof_in_reg : std_logic;
BEGIN

-- shift counter ---------------------------------------------------
    shi_cnt_sm: process(clk, reset, shi_cnt)
    begin
        -- async reset
        if reset = '1' then
            shi_cnt <= b"00000000";
            sof_in <= '0';
        elsif (clk'event and clk = '1') then
            if (shi_cnt = (M - 1)) then
                shi_cnt <= b"00000000";
                sof_in <= '1';
            else
                shi_cnt <= shi_cnt + 1;
                sof_in <= '0';
            end if;
        end if;
    end process;

-- frame length counter --------------------------------------------
    fra_cnt_sm: process(clk, reset, fra_cnt)
    begin
        -- async reset
        if reset = '1' then
            fra_cnt <= b"0000000";
            frame <= '1';
        elsif (clk'event and clk = '1') then
            if (shi_cnt = 0) then
                fra_cnt <= b"0000001";
                frame <= '1';
            elsif (fra_cnt = (N - 2*M)) then
                fra_cnt <= b"0000000";
                frame <= '0';
            elsif (fra_cnt /= 0) then
                fra_cnt <= fra_cnt + 1;
                frame <= '1';
            end if;
        end if;
    end process;

-- sof_in register -------------------------------------------------
    sof_in_sm: process(clk, reset, sof_in)
    begin
        -- async reset
        if reset = '1' then
            sof_in_reg <= '0';
        elsif (clk'event and clk = '1') then
            sof_in_reg <= sof_in;
        end if;
    end process;

-- multiplex -------------------------------------------------------
    sof_sm: process(clk, reset, sof_ps, sof_ns, s_of, frame, sof_in_reg,
                    shi_cnt)
    begin
        -- async reset
        if reset = '1' then
            sof_ps <= sof_st1;
        elsif (clk'event and clk = '1') then
            sof_ps <= sof_ns;
        end if;

        -- defaults: no start-of-frame pulse, every output follows s_of
        sof1 <= '0';
        sof2 <= '0';
        sof3 <= '0';
        sof4 <= '0';
        s_fr1 <= s_of;
        s_fr2 <= s_of;
        s_fr3 <= s_of;
        s_fr4 <= s_of;
        sof_ns <= sof_ns;
        case sof_ps is
            when sof_st1 =>
                if (frame = '0' and shi_cnt = 0) then
                    sof_ns <= sof_st2;
                end if;
                if (frame = '0') then
                    s_fr3 <= X"0000";
                end if;
                s_fr2 <= X"0000";
                sof1 <= sof_in_reg;
            when sof_st2 =>
                if (frame = '0' and shi_cnt = 0) then
                    sof_ns <= sof_st3;
                end if;
                if (frame = '0') then
                    s_fr4 <= X"0000";
                end if;
                s_fr3 <= X"0000";
                sof2 <= sof_in_reg;
            when sof_st3 =>
                if (frame = '0' and shi_cnt = 0) then
                    sof_ns <= sof_st4;
                end if;
                if (frame = '0') then
                    s_fr1 <= X"0000";
                end if;
                s_fr4 <= X"0000";
                sof3 <= sof_in_reg;
            when sof_st4 =>
                if (frame = '0' and shi_cnt = 0) then
                    sof_ns <= sof_st1;
                end if;
                if (frame = '0') then
                    s_fr2 <= X"0000";
                end if;
                s_fr1 <= X"0000";
                sof4 <= sof_in_reg;
        end case;
    end process;

END behavioral;
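
The framing rule described in the header of Framing.vhd (frames of N = 400 samples whose starting points are M = 160 samples apart) can be illustrated in software. The following is a minimal Python sketch, not part of the thesis sources. The round-robin assignment of frames to the four output streams is an assumption made for illustration, and the names `split_into_frames` and `CHANNELS` are hypothetical.

```python
N = 400        # frame length in samples
M = 160        # frame shift in samples
CHANNELS = 4   # number of parallel frame streams at the top level


def split_into_frames(samples):
    """Return a list of (channel, frame) pairs for all complete frames.

    Frame k starts at sample k * M and holds N samples. The channel is
    assigned round-robin (an assumption for this sketch).
    """
    frames = []
    index = 0
    start = 0
    while start + N <= len(samples):
        frames.append((index % CHANNELS, samples[start:start + N]))
        index += 1
        start += M
    return frames


if __name__ == "__main__":
    signal = list(range(1200))
    for channel, frame in split_into_frames(signal):
        print(channel, frame[0], frame[-1])
```

With N = 400 and M = 160, any input sample belongs to at most three consecutive frames, so four streams are enough to hold every frame that is open at a given time.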

Pefilter.vhd: source code for the PE block.

-- file: pefilter.vhd
-- thesis project
-- Programmer: Xin Xiao
-- Date: 01/2001
--
-- A pre-emphasis filter is applied to the framed offset-free input signal.
-- s_pe(n) = s_fr(n) - 0.97 * s_fr(n - 1)

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_signed.ALL;

ENTITY pe_filter IS PORT(
    clk    : in  std_logic;
    reset  : in  std_logic;
    sof_fr : in  std_logic;
    s_fr   : in  std_logic_vector(15 DOWNTO 0);
    sof_pe : out std_logic;
    s_pe   : out std_logic_vector(15 DOWNTO 0)
);
END pe_filter;

ARCHITECTURE behavioral OF pe_filter IS
    signal s_fr_reg  : std_logic_vector(15 DOWNTO 0);
    signal s_fr_reg1 : std_logic_vector(15 DOWNTO 0);
    signal s_mul2    : std_logic_vector(31 DOWNTO 0);
    signal s_mul     : std_logic_vector(15 DOWNTO 0);
    signal s_add     : std_logic_vector(15 DOWNTO 0);
    signal s_add1    : std_logic_vector(16 DOWNTO 0);
    signal sof_reg   : std_logic;
    -- 0.97 in Q15 format (31785 / 32768)
    constant s97 : std_logic_vector(15 DOWNTO 0) := "0111110000101001";

BEGIN
-- Registers -------------------------------------------------------

    -- Input register: holds s_fr(n)
    inreg: process (clk, reset, s_fr)
    begin
        -- async reset
        if reset = '1' then
            s_fr_reg <= x"0000";
        elsif (clk'event and clk = '1') then
            s_fr_reg <= s_fr;
        end if;
    end process;

    -- Input register1: holds s_fr(n - 1)
    inreg1: process (clk, reset, s_fr_reg)
    begin
        -- async reset
        if reset = '1' then
            s_fr_reg1 <= x"0000";
        elsif (clk'event and clk = '1') then
            s_fr_reg1 <= s_fr_reg;
        end if;
    end process;

    -- output register
    out_reg: process (clk, reset, s_add)
    begin
        -- async reset
        if reset = '1' then
            s_pe <= x"0000";
        elsif (clk'event and clk = '1') then
            s_pe <= s_add;
        end if;
    end process;

    -- sof register
    sof_reg_sm: process (clk, reset, sof_fr)
    begin
        -- async reset
        if reset = '1' then
            sof_reg <= '0';
        elsif (clk'event and clk = '1') then
            sof_reg <= sof_fr;
        end if;
    end process;

    -- sof register1
    sof_reg1: process (clk, reset, sof_reg)
    begin
        -- async reset
        if reset = '1' then
            sof_pe <= '0';
        elsif (clk'event and clk = '1') then
            sof_pe <= sof_reg;
        end if;
    end process;

-- arithmetic ------------------------------------------------------

    -- 0.97 * s_fr(n - 1): 16 x 16 bit product, rescaled from Q15
    s_mul2 <= s_fr_reg1 * s97;
    s_mul <= s_mul2(30 downto 15);
    -- 17-bit difference of the sign-extended terms
    s_add1 <= (s_fr_reg(15) & s_fr_reg) - (s_mul(15) & s_mul);

    -- saturate the 17-bit difference to the 16-bit output range
    arith: process (reset, s_add1)
    begin
        -- async reset
        if reset = '1' then
            s_add <= X"0000";
        elsif (s_add1 > B"0_0111_1111_1111_1111") then
            s_add <= B"0111_1111_1111_1111";
        elsif (s_add1 < B"1_1000_0000_0000_0000") then
            s_add <= B"1000_0000_0000_0000";
        else
            s_add <= s_add1(15 downto 0);
        end if;
    end process;

END behavioral;
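
The fixed-point arithmetic of the pre-emphasis filter can be reproduced with a few lines of integer code. The following is a minimal Python sketch, not part of the thesis sources. It assumes that the 16-bit coefficient constant in the listing is 0.97 in Q15 format, that the product is rescaled by keeping bits 30 down to 15, and that the result saturates to 16 bits. The function names are hypothetical, and register latency is not modeled.

```python
COEFF_097 = 0b0111110000101001  # 31785, i.e. 0.97 rounded to Q15


def saturate16(value):
    """Clamp an integer to the signed 16-bit range."""
    return max(-32768, min(32767, value))


def pre_emphasis(samples):
    """Integer model of s_pe(n) = s_fr(n) - 0.97 * s_fr(n-1)."""
    out = []
    prev = 0  # s_fr(n-1), zero as after reset
    for x in samples:
        # Arithmetic shift by 15 keeps product bits 30..15 (Q15 rescale).
        scaled = (prev * COEFF_097) >> 15
        out.append(saturate16(x - scaled))
        prev = x
    return out


if __name__ == "__main__":
    print(pre_emphasis([1000, 1000, 1000]))
    print(pre_emphasis([-32768, 32767]))
```

A model of this kind can serve as a reference when simulating the VHDL block, provided the one-sample delays of the hardware registers are accounted for when comparing outputs.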


Windowing.vhd: source code for the W block.

-- file: windowing.vhd
-- thesis project
-- Programmer: Xin Xiao
-- Date: 01/2001
--
-- A Hamming window of length N is applied to the output of the pre-emphasis
-- block.
-- s_w(n) = (0.54 - 0.46 * cos(2 * pi * (n - 1)/(N - 1))) * s_pe(n - 1)

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

package thesis_pack is
    -- converts a 9-bit counter value into a table index
    function intval(val: std_logic_vector(8 DOWNTO 0)) return integer;
    type coe_windowing is array (0 to 511) of std_logic_vector(15 downto 0);
    constant coe_window : coe_windowing :=

("0000101000111101","0000101000llllll","0000101001000101","000010 1001001110","0000101001011011","0000101001101100","0000101010000001","0000101010011001","0000101010110101","0000101011010101","0000101011111000","0000101100011111","0000101101001010","0000101101111000","0000101110101010","0000101111100000",',oOO011OOOo011ool',,

"0000110001010110","0000110010010111","0000110011011011","0000110100100011","0000110101101110",


',0000110110111101,'," 0000111000 001111""0000]ll001100101","0000111010111111","0000111100011011,,,"00001111_U 1111011",',OOoolllll1011l_ll","000 1000001000110","000l0000loll0000,,,

00010 00100 0 11110 ""0001000110001111","000100 1000000011",,'UOOlOOlO011l1011,',,'000100lO11_1_l0110,,,000 100 11011100 11"

',000loolllllloloo,',,,0001010001111001,,,"00010 10100000000","00010 10110001010","0001011000010111",,'OoOlollo10101OOo",

000 10 11100 1110 11""0001011111010001","0001100001101010","0001100100000 110","0001100110100101","0001101001000110",

000 110101110 1010"0001101110010001","0001110000111011","0001110011100111","0001110110010101","0001111001000111","0001111011111010",',oOolllll1011OoOo,',"0010000001101001"," 0010000100 100011"I' 0010000111100000"OOlOoO10101OOooU,',"0010001101100001","oUlOOloOooloo101","0010010011101010","0010010110110010","00 10011001111100","0010011101001000","0010100000010101","0010100011100100","0010100110 110110",',OOlolo101OoOlOOl",',Oo10101101011101","0010110000 110100","0010110100001011","0010110111100101",',OUloll1011OoOOOO",


"0010111110011100",,'0011000001111001",00110 00 10 1011000 "

"0011001000111001,',"0011001100011010,',"001100111111110 1",00110 10011100000 "

"0011010111000101","0011011010101010",',00llollllool000l',,"0011100001111000,',001110010 110000 1"

001110100 100 10 10 "

"0011101100110011","0011110000011110","0011110100001000","0011110111110100","0011111011100000","0011111111001100","0100000010111000","0100000110100101","0100001010010010","0100001101111111","0100010001101101","0100010101011010","0100011001000111",'V 0 1 0001 1 1 00 110 1 0 1 U

"0100100000100010","0100100100001111","0100100111111011","0100101011101000","0100101111010100"," 0100110010 111111"

"0100110110101011","0100111010010101","0100111101111111","0101000001101000","0101000101010001""0101001000111001""0101001100100000"" 01010 10000 000110 "

"0101010011101011",'V 0 1010 10111001111"

"0101011010110010","0101011110010100","0101100001110101","0101100101010101","0101101000110011","0101101100010000","0101101111101011","0101110011000110",,'O10111011O011llO',,"0101111001110101","0101111101001011",


"01100 00000 0 11110 ""01100 000 1111000 1",,0110000111000001',,"011000 1010001111","0110001101011100","0110010000100111,,,"0110010011101111,',"0110010110110110","0110011001111011","0110011100111101","0110011111111110","0110100010111100","0110100101111000","0110101000 110001","0110101011101000","0110101110011101",,,0110110001010000,',"0 110110100000000","0110110110101101","0110111001011000",,,0110111100000000,',"0110111110100101","0111000001001000",,'oll]-oOOolllolOoO',,"0 111000110000110",'I 011100100010000 0""0111001010111000",,'011lO01101o011OO","0111001111011110","0111010001101101","01110 10011111001","0111010110000010","0111011000000111"," 0111011010 001010 ",,0111011100001001",,'011l-011llooOollO","0111011111111111","0111100001110101",,'011llOO011lo011l"',011llOolo101olll,,,"0111100111000011","0111101000101011",,'011llololoolOOol","0111101011110010",,'ollllollololooOl',,"0111101110101100","0111110000000100",,,0111110001011000",,,0111110010101000',,"0111110011110101","0111110100111111","0111110110000101","0111110111000111","0111111000000110",

-- [the window coefficient entries printed on pages 85 to 88 are illegible in the source scan]

-- the remaining entries, up to index 511, are zero
others => "0000000000000000");

end package thesis_pack;

package body thesis_pack is
    -- unsigned binary to integer conversion
    function intval (val: std_logic_vector(8 DOWNTO 0)) return integer is
        variable sum: integer := 0;
    begin
        for N in val'low to val'high loop
            if val(N) = '1' then
                sum := sum + (2 ** N);
            end if;
        end loop;
        return sum;
    end intval;
end thesis_pack;

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_signed.ALL;
use ieee.std_logic_arith.all;
use work.thesis_pack.all;

ENTITY windowing IS PORT(
    clk    : in  std_logic;
    reset  : in  std_logic;
    sof_pe : in  std_logic;
    s_pe   : in  std_logic_vector(15 DOWNTO 0);
    sof_w  : out std_logic;
    s_w    : out std_logic_vector(15 DOWNTO 0)
);
END windowing;

ARCHITECTURE behavioral OF windowing IS
    signal s_pe_reg  : std_logic_vector(15 DOWNTO 0);
    signal s_mul_ext : std_logic_vector(31 DOWNTO 0);
    signal s_mul     : std_logic_vector(15 DOWNTO 0);
    signal sof_reg   : std_logic;
    signal cnt       : std_logic_vector(8 DOWNTO 0);
    signal coe       : std_logic_vector(15 DOWNTO 0);

BEGIN
-- Registers -------------------------------------------------------

    -- Input register
    inreg: process (clk, reset, s_pe)
    begin
        -- async reset
        if reset = '1' then
            s_pe_reg <= x"0000";
        elsif (clk'event and clk = '1') then
            s_pe_reg <= s_pe;
        end if;
    end process;

    -- output register
    out_reg: process (clk, reset, s_mul)
    begin
        -- async reset
        if reset = '1' then
            s_w <= x"0000";
        elsif (clk'event and clk = '1') then
            s_w <= s_mul;
        end if;
    end process;

    -- sof register
    sof_reg_sm: process (clk, reset, sof_pe)
    begin
        -- async reset
        if reset = '1' then
            sof_reg <= '0';
        elsif (clk'event and clk = '1') then
            sof_reg <= sof_pe;
        end if;
    end process;

    -- sof out register
    sof_out_sm: process (clk, reset, sof_reg)
    begin
        -- async reset
        if reset = '1' then
            sof_w <= '0';
        elsif (clk'event and clk = '1') then
            sof_w <= sof_reg;
        end if;
    end process;

-- coefficient counter ---------------------------------------------

    -- starts at a start-of-frame pulse, runs to 511, then idles at 0
    coe_cnt: process(clk, reset, sof_reg, cnt)
    begin
        -- async reset
        if reset = '1' then
            cnt <= b"000000000";
        elsif (clk'event and clk = '1') then
            if sof_reg = '1' then
                cnt <= b"000000001";
            elsif (cnt = 511) then
                cnt <= b"000000000";
            elsif (cnt /= 0) then
                cnt <= cnt + 1;
            else
                cnt <= cnt;
            end if;
        end if;
    end process;

-- arithmetic ------------------------------------------------------

    -- look up the window coefficient and rescale the Q15 product
    coe <= coe_window(intval(cnt));
    s_mul_ext <= s_pe_reg * coe;
    s_mul <= s_mul_ext(30 downto 15);

END behavioral;
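
The coefficient table in thesis_pack can be regenerated from the Hamming window formula in the file header. The following is a minimal Python sketch, not part of the thesis sources. It assumes N = 400 nonzero coefficients in Q15 format (scale 2^15, rounded to nearest), zero padding up to 512 entries, and clamping of the peak value to 32767; the clamping and the function names are assumptions made for illustration. Under these assumptions the first four generated words agree with the first four entries of the listed table.

```python
import math

N = 400            # window length in samples
TABLE_SIZE = 512   # table is zero-padded to the FFT length


def hamming_q15(n):
    """Hamming coefficient for index n (0-based), as a Q15 integer."""
    w = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (N - 1))
    code = int(math.floor(w * 32768.0 + 0.5))  # round to nearest
    return min(code, 32767)                    # assumed clamp at the peak


def window_table():
    """Return all 512 table entries as integers."""
    table = [hamming_q15(n) for n in range(N)]
    table += [0] * (TABLE_SIZE - N)
    return table


if __name__ == "__main__":
    for value in window_table()[:4]:
        print(format(value, "016b"))
```

Printing each entry with `format(value, "016b")` yields the 16-bit strings in the form used by the `coe_window` constant.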


fft_512_sr.vhd: source code for the FFT block

-- file: fft_512_sr.vhd
-- thesis project
-- Programmer: Xin Xiao
-- Date 01/2001
--
-- An FFT of length 512 is applied to compute the magnitude spectrum of the
-- signal. This file computes the split-radix level of the FFT.

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_signed.ALL;
USE work.wr_512_pack.all;

ENTITY fft_512_sr IS PORT(
    clk       : in  std_logic;
    sof_w     : in  std_logic;
    reset     : in  std_logic;
    data_in   : in  std_logic_vector(15 DOWNTO 0);
    sof_fft   : out std_logic;
    valid_fft : out std_logic;
    out_real  : out std_logic_vector(15 DOWNTO 0);
    out_imag  : out std_logic_vector(15 DOWNTO 0)
);
END fft_512_sr;

ARCHITECTURE behavioral OF fft_512_sr IS

-- components ------------------------------------------------------
    component pipeline_fft_sr_unit
        generic (M : natural := 1);
        PORT(
            clk      : in  std_logic;
            cont0    : in  std_logic;
            cont1    : in  std_logic;
            pow      : in  std_logic;
            data_in  : in  std_logic_vector(15 DOWNTO 0);
            wr_real  : in  std_logic_vector(15 DOWNTO 0);
            wr_imag  : in  std_logic_vector(15 DOWNTO 0);
            x2k      : out std_logic_vector(15 DOWNTO 0);
            x4k_real : out std_logic_vector(15 DOWNTO 0);
            x4k_imag : out std_logic_vector(15 DOWNTO 0)
        );
    end component;
    attribute noopt : boolean;
    attribute noopt of pipeline_fft_sr_unit: component is TRUE;

    component sr_n_single
        generic(N: natural := 1);
        PORT(
            clk : in  std_logic;
            a   : in  std_logic;
            c   : out std_logic
        );
    end component;
    --attribute noopt of sr_n_single: component is TRUE;

    component sr_n_real
        generic(N: natural := 1);
        PORT(
            clk : in  std_logic;
            a   : in  std_logic_vector(15 DOWNTO 0);
            c   : out std_logic_vector(15 DOWNTO 0)
        );
    end component;
    --attribute noopt of sr_n_real: component is TRUE;

    component fft_128
        PORT(
            clk       : in  std_logic;
            sof       : in  std_logic;
            reset     : in  std_logic;
            pow       : in  std_logic_vector(6 downto 0);
            c         : in  std_logic_vector(5 downto 0);
            in_real   : in  std_logic_vector(15 DOWNTO 0);
            in_imag   : in  std_logic_vector(15 DOWNTO 0);
            fft5_real : in  std_logic_vector(15 DOWNTO 0);
            fft5_imag : in  std_logic_vector(15 DOWNTO 0);
            fft4_real : in  std_logic_vector(15 DOWNTO 0);
            fft4_imag : in  std_logic_vector(15 DOWNTO 0);
            fft3_real : in  std_logic_vector(15 DOWNTO 0);
            fft3_imag : in  std_logic_vector(15 DOWNTO 0);
            fft2_real : in  std_logic_vector(15 DOWNTO 0);
            fft2_imag : in  std_logic_vector(15 DOWNTO 0);
            fft1_real : in  std_logic_vector(15 DOWNTO 0);
            fft1_imag : in  std_logic_vector(15 DOWNTO 0);
            fft0_real : in  std_logic_vector(15 DOWNTO 0);
            fft0_imag : in  std_logic_vector(15 DOWNTO 0);
            out_real  : out std_logic_vector(15 DOWNTO 0);
            out_imag  : out std_logic_vector(15 DOWNTO 0)
        );
    end component;
    attribute noopt of fft_128: component is TRUE;

-- configuration ---------------------------------------------------
    for pfu1, pfu2, pfu3, pfu4, pfu5, pfu6, pfu7, pfu8: pipeline_fft_sr_unit
        use entity work.pipeline_fft_sr_unit(behavioral);

    for sr1, sr2, sr3: sr_n_single
        use entity work.sr_n_single(behavioral);

    for fft128: fft_128
        use entity work.fft_128(behavioral);

    for sr_x2k, sr_x2k1: sr_n_real
        use entity work.sr_n_real(behavioral);

-- signals ---------------------------------------------------------
    signal x2k1 : std_logic_vector(15 DOWNTO 0);
    signal x2k2 : std_logic_vector(15 DOWNTO 0);
    signal x2k3 : std_logic_vector(15 DOWNTO 0);
    signal x2k4 : std_logic_vector(15 DOWNTO 0);
    signal x2k5 : std_logic_vector(15 DOWNTO 0);
    signal x2k6 : std_logic_vector(15 DOWNTO 0);
    signal x2k7 : std_logic_vector(15 DOWNTO 0);
    signal x2k8 : std_logic_vector(15 DOWNTO 0);
    signal x2k1_delay : std_logic_vector(15 DOWNTO 0);

    -- final stage
    signal out0       : std_logic_vector(15 DOWNTO 0);
    signal out1       : std_logic_vector(15 DOWNTO 0);
    signal out1_minus : std_logic_vector(16 DOWNTO 0);
    signal out1_buf   : std_logic_vector(15 DOWNTO 0);

    signal in_turn : std_logic_vector(15 DOWNTO 0);
    signal wr_real : coe;
    signal wr_imag : coe;

    -- fft_8
    signal fft6_real : std_logic_vector(15 DOWNTO 0);
    signal fft6_imag : std_logic_vector(15 DOWNTO 0);
    signal fft5_real : std_logic_vector(15 DOWNTO 0);
    signal fft5_imag : std_logic_vector(15 DOWNTO 0);
    signal fft4_real : std_logic_vector(15 DOWNTO 0);
    signal fft4_imag : std_logic_vector(15 DOWNTO 0);
    signal fft3_real : std_logic_vector(15 DOWNTO 0);
    signal fft3_imag : std_logic_vector(15 DOWNTO 0);
    signal fft2_real : std_logic_vector(15 DOWNTO 0);
    signal fft2_imag : std_logic_vector(15 DOWNTO 0);
    signal fft1_real : std_logic_vector(15 DOWNTO 0);
    signal fft1_imag : std_logic_vector(15 DOWNTO 0);
    signal fft0_real : std_logic_vector(15 DOWNTO 0);
    signal fft0_imag : std_logic_vector(15 DOWNTO 0);
    signal f_real    : std_logic_vector(15 DOWNTO 0);
    signal f_imag    : std_logic_vector(15 DOWNTO 0);
    signal out_fft_real : std_logic_vector(15 DOWNTO 0);
    signal out_fft_imag : std_logic_vector(15 DOWNTO 0);

    signal cont     : std_logic_vector(8 DOWNTO 0);
    signal temp1    : std_logic_vector(6 DOWNTO 0);
    signal temp2    : std_logic_vector(6 DOWNTO 0);
    signal temp3    : std_logic_vector(6 DOWNTO 0);
    signal temp4    : std_logic_vector(6 DOWNTO 0);
    signal temp5    : std_logic_vector(6 DOWNTO 0);
    signal temp6    : std_logic_vector(6 DOWNTO 0);
    signal temp7    : std_logic_vector(6 DOWNTO 0);
    signal c_fft128 : std_logic_vector(7 DOWNTO 0);
    signal pow_sr   : std_logic_vector(7 DOWNTO 0);
    signal pow_128  : std_logic_vector(6 DOWNTO 0);

    -- mux control
    signal mux0 : std_logic;
    signal mux1 : std_logic;
    signal mux2 : std_logic;
    signal mux3 : std_logic;
    signal mux4 : std_logic;
    signal mux5 : std_logic;
    signal mux6 : std_logic;
    signal mux7 : std_logic;

    type mux_state is (mux_idle, mux_ready, mux_done);
    signal mux0_ps : mux_state;
    signal mux0_ns : mux_state;
    signal mux1_ps : mux_state;
    signal mux1_ns : mux_state;
    signal mux2_ps : mux_state;
    signal mux2_ns : mux_state;
    signal mux3_ps : mux_state;
    signal mux3_ns : mux_state;
    signal mux4_ps : mux_state;
    signal mux4_ns : mux_state;
    signal mux5_ps : mux_state;
    signal mux5_ns : mux_state;
    signal mux6_ps : mux_state;
    signal mux6_ns : mux_state;
    signal mux7_ps : mux_state;
    signal mux7_ns : mux_state;

    -- power control
    type pow_states is (pow_idle, pow_st1, pow_st2, pow_st3, pow_st4);

    signal pow_c0_ps : pow_states;
    signal pow_c1_ps : pow_states;
    signal pow_c2_ps : pow_states;
    signal pow_c3_ps : pow_states;
    signal pow_c4_ps : pow_states;
    signal pow_c5_ps : pow_states;
    signal pow_c6_ps : pow_states;
    signal pow_c0_ns : pow_states;
    signal pow_c1_ns : pow_states;
    signal pow_c2_ns : pow_states;
    signal pow_c3_ns : pow_states;
    signal pow_c4_ns : pow_states;
    signal pow_c5_ns : pow_states;
    signal pow_c6_ns : pow_states;

    -- sof
    signal sof_w_buf  : std_logic;
    signal sof_w_buf1 : std_logic;
    signal sof_w_buf2 : std_logic;
    signal sof_w_buf3 : std_logic;

    type sof_states is (sof1, sof2, sof3, sof4);
    signal sof_ps : sof_states;
    signal sof_ns : sof_states;

    type sof_buf3_states is (sof_b_idle, sof_b1, sof_b2, sof_b3, sof_b4,
                             sof_b5, sof_b6, sof_b7);
    signal sof_buf3_ps : sof_buf3_states;
    signal sof_buf3_ns : sof_buf3_states;

BEGIN
    --pow_128 <= "1111111";

-- control ---------------------------------------------------------
    cont_cnt: process (clk, reset)
    begin
        ---async reset
        if reset = '1' then
            cont <= B"000000000";
        elsif (clk'event and clk = '1') then
            if (sof_w = '1') then
                cont <= B"000000000";
            elsif (cont = 511) then
                cont <= B"000000000";
            else
                cont <= cont + 1;
            end if;
        end if;
    end process cont_cnt;

    ---turn B"1000_0000_0000_0000" into B"1000_0000_0000_0001"
    turn: process (data_in)
    begin
        if (data_in = B"1000_0000_0000_0000") then
            in_turn <= B"1000_0000_0000_0001";
        else
            in_turn <= data_in;
        end if;
    end process;

-- Wr --------------------------------------------------------------

    temp7 <= cont(6 downto 0);
    wr_real(7) <= wr_512_real(intval7(temp7));
    wr_imag(7) <= wr_512_imag(intval7(temp7));
    temp6 <= cont(5 downto 0)&'0';
    wr_real(6) <= wr_512_real(intval7(temp6));
    wr_imag(6) <= wr_512_imag(intval7(temp6));
    temp5 <= cont(4 downto 0)&"00";
    wr_real(5) <= wr_512_real(intval7(temp5));
    wr_imag(5) <= wr_512_imag(intval7(temp5));
    temp4 <= cont(3 downto 0)&"000";
    wr_real(4) <= wr_512_real(intval7(temp4));
    wr_imag(4) <= wr_512_imag(intval7(temp4));
    temp3 <= cont(2 downto 0)&"0000";
    wr_real(3) <= wr_512_real(intval7(temp3));
    wr_imag(3) <= wr_512_imag(intval7(temp3));
    temp2 <= cont(1 downto 0)&"00000";
    wr_real(2) <= wr_512_real(intval7(temp2));
    wr_imag(2) <= wr_512_imag(intval7(temp2));
    temp1 <= cont(0)&"000000";
    wr_real(1) <= wr_512_real(intval7(temp1));
    wr_imag(1) <= wr_512_imag(intval7(temp1));
    wr_real(0) <= B"0111111111111111";
    wr_imag(0) <= B"0000000000000000";

-- multiplex -------------------------------------------------------
    mul1: process(clk, reset, sof_w_buf, cont, mux0_ps, mux0_ns)
    begin
        if reset = '1' then
            mux0_ps <= mux_idle;
        elsif (clk'event and clk = '1') then
            mux0_ps <= mux0_ns;
        end if;
        mux0 <= '0';
        case mux0_ps is
            when mux_idle =>
                if (sof_w_buf = '1') then
                    mux0_ns <= mux_ready;
                else
                    mux0_ns <= mux_idle;
                end if;
            when mux_ready =>
                mux0 <= cont(1);
                if (cont(1) = '1') then
                    mux0_ns <= mux_done;
                else
                    mux0_ns <= mux_ready;
                end if;
            when mux_done =>
                mux0 <= cont(1);
                if (cont(1) = '0') then
                    mux0_ns <= mux_idle;
                else
                    mux0_ns <= mux_done;
                end if;
        end case;
    end process;

    mul2: process(clk, reset, sof_w_buf, cont, mux1_ps, mux1_ns)
    begin
        if reset = '1' then
            mux1_ps <= mux_idle;
        elsif (clk'event and clk = '1') then
            mux1_ps <= mux1_ns;
        end if;
        mux1 <= '0';
        case mux1_ps is
            when mux_idle =>
                if (sof_w_buf = '1') then
                    mux1_ns <= mux_ready;
                else
                    mux1_ns <= mux_idle;
                end if;
            when mux_ready =>
                mux1 <= cont(2);
                if (cont(2) = '1') then
                    mux1_ns <= mux_done;
                else
                    mux1_ns <= mux_ready;
                end if;
            when mux_done =>
                mux1 <= cont(2);
                if (cont(2) = '0') then
                    mux1_ns <= mux_idle;
                else
                    mux1_ns <= mux_done;
                end if;
        end case;
    end process;

    mul3: process(clk, reset, sof_w_buf, cont, mux2_ps, mux2_ns)
    begin
        if reset = '1' then
            mux2_ps <= mux_idle;
        elsif (clk'event and clk = '1') then
            mux2_ps <= mux2_ns;
        end if;
        mux2 <= '0';
        case mux2_ps is
            when mux_idle =>
                if (sof_w_buf = '1') then
                    mux2_ns <= mux_ready;
                else
                    mux2_ns <= mux_idle;
                end if;
            when mux_ready =>
                mux2 <= cont(3);
                if (cont(3) = '1') then
                    mux2_ns <= mux_done;
                else
                    mux2_ns <= mux_ready;
                end if;
            when mux_done =>
                mux2 <= cont(3);
                if (cont(3) = '0') then
                    mux2_ns <= mux_idle;
                else
                    mux2_ns <= mux_done;
                end if;
        end case;
    end process;




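    mul4: process(clk, reset, sof_w_buf, cont, mux3_ps, mux3_ns)
    begin
        if reset = '1' then
            mux3_ps <= mux_idle;
        elsif (clk'event and clk = '1') then
            mux3_ps <= mux3_ns;
        end if;
        mux3 <= '0';
        case mux3_ps is
            when mux_idle =>
                if (sof_w_buf = '1') then
                    mux3_ns <= mux_ready;
                else
                    mux3_ns <= mux_idle;
                end if;
            when mux_ready =>
                mux3 <= cont(4);
                if (cont(4) = '1') then
                    mux3_ns <= mux_done;
                else
                    mux3_ns <= mux_ready;
                end if;
            when mux_done =>
                mux3 <= cont(4);
                if (cont(4) = '0') then
                    mux3_ns <= mux_idle;
                else
                    mux3_ns <= mux_done;
                end if;
        end case;
    end process;

    mul5: process(clk, reset, sof_w_buf, cont, mux4_ps, mux4_ns)
    begin
        if reset = '1' then
            mux4_ps <= mux_idle;
        elsif (clk'event and clk = '1') then
            mux4_ps <= mux4_ns;
        end if;
        mux4 <= '0';
        case mux4_ps is
            when mux_idle =>
                if (sof_w_buf = '1') then
                    mux4_ns <= mux_ready;
                else
                    mux4_ns <= mux_idle;
                end if;
            when mux_ready =>
                mux4 <= cont(5);
                if (cont(5) = '1') then
                    mux4_ns <= mux_done;
                else
                    mux4_ns <= mux_ready;
                end if;
            when mux_done =>
                mux4 <= cont(5);
                if (cont(5) = '0') then
                    mux4_ns <= mux_idle;
                else
                    mux4_ns <= mux_done;
                end if;
        end case;
    end process;

    mul6: process(clk, reset, sof_w_buf, cont, mux5_ps, mux5_ns)
    begin
        if reset = '1' then
            mux5_ps <= mux_idle;
        elsif (clk'event and clk = '1') then
            mux5_ps <= mux5_ns;
        end if;
        mux5 <= '0';
        case mux5_ps is
            when mux_idle =>
                if (sof_w_buf = '1') then
                    mux5_ns <= mux_ready;
                else
                    mux5_ns <= mux_idle;
                end if;
            when mux_ready =>
                mux5 <= cont(6);
                if (cont(6) = '1') then
                    mux5_ns <= mux_done;
                else
                    mux5_ns <= mux_ready;
                end if;
            when mux_done =>
                mux5 <= cont(6);
                if (cont(6) = '0') then
                    mux5_ns <= mux_idle;
                else
                    mux5_ns <= mux_done;
                end if;
        end case;
    end process;

    mul7: process(clk, reset, sof_w_buf, cont, mux6_ps, mux6_ns)
    begin
        if reset = '1' then
            mux6_ps <= mux_idle;
        elsif (clk'event and clk = '1') then
            mux6_ps <= mux6_ns;
        end if;
        mux6 <= '0';
        case mux6_ps is
            when mux_idle =>
                if (sof_w_buf = '1') then
                    mux6_ns <= mux_ready;
                else
                    mux6_ns <= mux_idle;
                end if;
            when mux_ready =>
                mux6 <= cont(7);
                if (cont(7) = '1') then
                    mux6_ns <= mux_done;
                else
                    mux6_ns <= mux_ready;
                end if;
            when mux_done =>
                mux6 <= cont(7);
                if (cont(7) = '0') then
                    mux6_ns <= mux_idle;
                else
                    mux6_ns <= mux_done;
                end if;
        end case;
    end process;

    mul8: process(clk, reset, sof_w_buf, cont, mux7_ps, mux7_ns)
    begin
        if reset = '1' then
            mux7_ps <= mux_idle;
        elsif (clk'event and clk = '1') then
            mux7_ps <= mux7_ns;
        end if;
        mux7 <= '0';
        case mux7_ps is
            when mux_idle =>
                if (sof_w_buf = '1') then
                    mux7_ns <= mux_ready;
                else
                    mux7_ns <= mux_idle;
                end if;
            when mux_ready =>
                mux7 <= cont(8);
                if (cont(8) = '1') then
                    mux7_ns <= mux_done;
                else
                    mux7_ns <= mux_ready;
                end if;
            when mux_done =>
                mux7 <= cont(8);
                if (cont(8) = '0') then
                    mux7_ns <= mux_idle;
                else
                    mux7_ns <= mux_done;
                end if;
        end case;
    end process;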


--final stage -----------------------------------------------------
    sr_x2k: sr_n_real
        generic map(1)
        port map(clk, x2k1, x2k1_delay);

    --out0_plus <= (x2k1(15)&x2k1) + (x2k1_delay(15)&x2k1_delay);
    --out0 <= out0_plus(16 downto 1);
    out0 <= x2k1 + x2k1_delay;
    out1_minus <= (x2k1_delay(15)&x2k1_delay) - (x2k1(15)&x2k1);
    out1_buf <= out1_minus(16 downto 1);

    sr_x2k1: sr_n_real
        generic map(1)
        port map(clk, out1_buf, out1);

-- components instantiation ----------------------------------------
    pfu1: pipeline_fft_sr_unit
        generic map (2)
        port map (clk, cont(1), cont(0), pow_sr(0), x2k2,
                  wr_real(0), wr_imag(0), x2k1, f_real, f_imag);

    pfu2: pipeline_fft_sr_unit
        generic map (4)
        port map (clk, cont(2), cont(1), pow_sr(1), x2k3,
                  wr_real(1), wr_imag(1), x2k2, fft0_real, fft0_imag);

    pfu3: pipeline_fft_sr_unit
        generic map (8)
        port map (clk, cont(3), cont(2), pow_sr(2), x2k4,
                  wr_real(2), wr_imag(2), x2k3, fft1_real, fft1_imag);

    pfu4: pipeline_fft_sr_unit
        generic map (16)
        port map (clk, cont(4), cont(3), pow_sr(3), x2k5,
                  wr_real(3), wr_imag(3), x2k4, fft2_real, fft2_imag);

    pfu5: pipeline_fft_sr_unit
        generic map (32)
        port map (clk, cont(5), cont(4), pow_sr(4), x2k6,
                  wr_real(4), wr_imag(4), x2k5, fft3_real, fft3_imag);

    pfu6: pipeline_fft_sr_unit
        generic map (64)
        port map (clk, cont(6), cont(5), pow_sr(5), x2k7,
                  wr_real(5), wr_imag(5), x2k6, fft4_real, fft4_imag);

    pfu7: pipeline_fft_sr_unit
        generic map (128)
        port map (clk, cont(7), cont(6), pow_sr(6), x2k8,
                  wr_real(6), wr_imag(6), x2k7, fft5_real, fft5_imag);

    pfu8: pipeline_fft_sr_unit
        generic map (256)
        port map (clk, cont(8), cont(7), pow_sr(7), in_turn,
                  wr_real(7), wr_imag(7), x2k8, fft6_real, fft6_imag);

    c_fft128 <= mux7&mux6&mux5&mux4&mux3&mux2&mux1&mux0;
    fft128: fft_128
        port map (clk, sof_w, reset, pow_128, c_fft128(5 downto 0),
                  fft6_real, fft6_imag, fft5_real, fft5_imag,
                  fft4_real, fft4_imag, fft3_real, fft3_imag,
                  fft2_real, fft2_imag, fft1_real, fft1_imag,
                  fft0_real, fft0_imag,
                  out_fft_real, out_fft_imag);

-- valid and sof delay ---------------------------------------------

    sof_fft <= sof_w_buf;

    sr1: sr_n_single
        generic map(512)
        port map(clk, sof_w, sof_w_buf);

    sr2: sr_n_single
        generic map(1)
        port map(clk, sof_w_buf, sof_w_buf1);

    sr3: sr_n_single
        generic map(1)
        port map(clk, sof_w_buf1, sof_w_buf2);

    sof_sm: process(reset, clk, sof_w_buf, sof_w_buf1, sof_w_buf2,
                    cont, out0, out1, f_real, f_imag, sof_w_buf3,
                    out_fft_real, out_fft_imag, sof_ps, sof_ns)
    begin
        if reset = '1' then
            sof_ps <= sof1;
        elsif (clk'event and clk = '1') then
            sof_ps <= sof_ns;
        end if;

        case sof_ps is
            when sof1 =>
                out_real <= out0;
                out_imag <= x"0000";
                valid_fft <= sof_w_buf;
                if sof_w_buf = '1' then
                    sof_ns <= sof2;
                end if;
            when sof2 =>
                out_real <= out1;
                out_imag <= x"0000";
                valid_fft <= '1';
                if sof_w_buf1 = '1' then
                    sof_ns <= sof3;
                end if;
            when sof3 =>
                out_real <= f_real;
                out_imag <= f_imag;
                valid_fft <= '1';
                if sof_w_buf2 = '1' then
                    sof_ns <= sof4;
                end if;
            when sof4 =>
                out_real <= out_fft_real;
                out_imag <= out_fft_imag;
                valid_fft <= sof_w_buf3;
                if cont = 511 then
                    sof_ns <= sof1;
                end if;
        end case;
    end process;


    sof_buf3_sm: process(reset, clk, cont, c_fft128,
                         sof_buf3_ps, sof_buf3_ns)
    begin
        if reset = '1' then
            sof_buf3_ps <= sof_b_idle;
        elsif (clk'event and clk = '1') then
            sof_buf3_ps <= sof_buf3_ns;
        end if;

        sof_w_buf3 <= '1';
        case sof_buf3_ps is
            when sof_b_idle =>
                sof_w_buf3 <= '0';
                if (c_fft128 = b"00000001" and cont(0) = '0') then
                    sof_buf3_ns <= sof_b1;
                elsif (c_fft128 = b"00000010" and cont(1 downto 0) = "10") then
                    sof_buf3_ns <= sof_b2;
                elsif (c_fft128 = b"00000100" and cont(2 downto 0) = "110") then
                    sof_buf3_ns <= sof_b3;
                elsif (c_fft128 = b"00001000" and cont(3 downto 0) = "1110") then
                    sof_buf3_ns <= sof_b4;
                elsif (c_fft128 = b"00010000" and cont(4 downto 0) = "11110") then
                    sof_buf3_ns <= sof_b5;
                elsif (c_fft128 = b"00100000" and cont(5 downto 0) = "111110") then
                    sof_buf3_ns <= sof_b6;
                elsif (c_fft128 = b"01000000" and cont(6 downto 0) = "1111110") then
                    sof_buf3_ns <= sof_b7;
                end if;
            when sof_b1 =>
                if (cont(1 downto 0) = "00") then
                    sof_buf3_ns <= sof_b_idle;
                end if;
            when sof_b2 =>
                if (cont(2 downto 0) = "010") then
                    sof_buf3_ns <= sof_b_idle;
                end if;
            when sof_b3 =>
                if (cont(3 downto 0) = "0110") then
                    sof_buf3_ns <= sof_b_idle;
                end if;
            when sof_b4 =>
                if (cont(4 downto 0) = "01110") then
                    sof_buf3_ns <= sof_b_idle;
                end if;
            when sof_b5 =>
                if (cont(5 downto 0) = "011110") then
                    sof_buf3_ns <= sof_b_idle;
                end if;
            when sof_b6 =>
                if (cont(6 downto 0) = "0111110") then
                    sof_buf3_ns <= sof_b_idle;
                end if;
            when sof_b7 =>
                if (cont(7 downto 0) = "01111110") then
                    sof_buf3_ns <= sof_b_idle;
                end if;
        end case;
    end process;

-- power control ---------------------------------------------------
    pow_sr <= c_fft128(6 downto 0)&sof_w_buf2;

    pow_c0_sm: process(clk, reset, pow_c0_ps, pow_c0_ns, cont,
                       pow_c1_ps, pow_128, c_fft128)
    begin
        --asyn reset
        if reset = '1' then
            pow_c0_ps <= pow_idle;
        elsif (clk'event and clk = '1') then
            pow_c0_ps <= pow_c0_ns;
        end if;

        pow_128(0) <= '0';
        case pow_c0_ps is
            when pow_idle =>
                if c_fft128(0) = '1' then
                    pow_c0_ns <= pow_st3;
                elsif pow_c1_ps /= pow_idle and cont(1) = '1' then
                    pow_c0_ns <= pow_st4;
                else
                    pow_c0_ns <= pow_idle;
                end if;
            when pow_st1 =>
                pow_128(0) <= cont(0);
                if pow_128(1) = '1' then
                    pow_c0_ns <= pow_st4;
                elsif cont(0) = '0' then
                    pow_c0_ns <= pow_idle;
                elsif pow_c1_ps /= pow_idle and cont(1) = '1' then
                    pow_c0_ns <= pow_st4;
                else
                    pow_c0_ns <= pow_st1;
                end if;
            when pow_st2 =>
                pow_128(0) <= cont(0);
                if cont(0) = '0' then
                    pow_c0_ns <= pow_st3;
                else
                    pow_c0_ns <= pow_st2;
                end if;
            when pow_st3 =>
                pow_128(0) <= cont(0);
                if cont(0) = '1' then
                    pow_c0_ns <= pow_st1;
                else
                    pow_c0_ns <= pow_st3;
                end if;
            when pow_st4 =>
                pow_128(0) <= cont(0);
                if cont(0) = '1' then
                    pow_c0_ns <= pow_st2;
                else
                    pow_c0_ns <= pow_st4;
                end if;
        end case;
    end process;

    pow_c1_sm: process(clk, reset, pow_c1_ps, pow_c1_ns, cont,
                       pow_c2_ps, c_fft128)
    begin
        --asyn reset
        if reset = '1' then
            pow_c1_ps <= pow_idle;
        elsif (clk'event and clk = '1') then
            pow_c1_ps <= pow_c1_ns;
        end if;

        pow_128(1) <= '0';
        case pow_c1_ps is
            when pow_idle =>
                if c_fft128(1) = '1' then
                    pow_c1_ns <= pow_st3;
                elsif pow_c2_ps /= pow_idle and cont(2) = '1' then
                    pow_c1_ns <= pow_st4;
                else
                    pow_c1_ns <= pow_idle;
                end if;
            when pow_st1 =>
                pow_128(1) <= cont(1);
                if cont(1) = '0' then
                    pow_c1_ns <= pow_idle;
                else
                    pow_c1_ns <= pow_st1;
                end if;
            when pow_st2 =>
                pow_128(1) <= cont(1);
                if cont(1) = '0' then
                    pow_c1_ns <= pow_st3;
                else
                    pow_c1_ns <= pow_st2;
                end if;
            when pow_st3 =>
                pow_128(1) <= cont(1);
                if cont(1) = '1' then
                    pow_c1_ns <= pow_st1;
                else
                    pow_c1_ns <= pow_st3;
                end if;
            when pow_st4 =>
                pow_128(1) <= cont(1);
                if cont(1) = '1' then
                    pow_c1_ns <= pow_st2;
                else
                    pow_c1_ns <= pow_st4;
                end if;
        end case;
    end process;


    pow_c2_sm: process(clk, reset, pow_c2_ps, pow_c2_ns, cont,
                       pow_c3_ps, c_fft128)
    begin
        --asyn reset
        if reset = '1' then
            pow_c2_ps <= pow_idle;
        elsif (clk'event and clk = '1') then
            pow_c2_ps <= pow_c2_ns;
        end if;

        pow_128(2) <= '0';
        case pow_c2_ps is
            when pow_idle =>
                if c_fft128(2) = '1' then
                    pow_c2_ns <= pow_st3;
                elsif pow_c3_ps /= pow_idle and cont(3) = '1' then
                    pow_c2_ns <= pow_st4;
                else
                    pow_c2_ns <= pow_idle;
                end if;
            when pow_st1 =>
                pow_128(2) <= cont(2);
                if cont(2) = '0' then
                    pow_c2_ns <= pow_idle;
                else
                    pow_c2_ns <= pow_st1;
                end if;
            when pow_st2 =>
                pow_128(2) <= cont(2);
                if cont(2) = '0' then
                    pow_c2_ns <= pow_st3;
                else
                    pow_c2_ns <= pow_st2;
                end if;
            when pow_st3 =>
                pow_128(2) <= cont(2);
                if cont(2) = '1' then
                    pow_c2_ns <= pow_st1;
                else
                    pow_c2_ns <= pow_st3;
                end if;
            when pow_st4 =>
                pow_128(2) <= cont(2);
                if cont(2) = '1' then
                    pow_c2_ns <= pow_st2;
                else
                    pow_c2_ns <= pow_st4;
                end if;
        end case;
    end process;


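    pow_c3_sm: process(clk, reset, pow_c3_ps, pow_c3_ns, cont,
                       pow_c4_ps, c_fft128)
    begin
        --asyn reset
        if reset = '1' then
            pow_c3_ps <= pow_idle;
        elsif (clk'event and clk = '1') then
            pow_c3_ps <= pow_c3_ns;
        end if;

        pow_128(3) <= '0';
        case pow_c3_ps is
            when pow_idle =>
                if c_fft128(3) = '1' then
                    pow_c3_ns <= pow_st3;
                elsif pow_c4_ps /= pow_idle and cont(4) = '1' then
                    pow_c3_ns <= pow_st4;
                else
                    pow_c3_ns <= pow_idle;
                end if;
            when pow_st1 =>
                pow_128(3) <= cont(3);
                if cont(3) = '0' then
                    pow_c3_ns <= pow_idle;
                else
                    pow_c3_ns <= pow_st1;
                end if;
            when pow_st2 =>
                pow_128(3) <= cont(3);
                if cont(3) = '0' then
                    pow_c3_ns <= pow_st3;
                else
                    pow_c3_ns <= pow_st2;
                end if;
            when pow_st3 =>
                pow_128(3) <= cont(3);
                if cont(3) = '1' then
                    pow_c3_ns <= pow_st1;
                else
                    pow_c3_ns <= pow_st3;
                end if;
            when pow_st4 =>
                pow_128(3) <= cont(3);
                if cont(3) = '1' then
                    pow_c3_ns <= pow_st2;
                else
                    pow_c3_ns <= pow_st4;
                end if;
        end case;
    end process;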


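    pow_c4_sm: process(clk, reset, pow_c4_ps, pow_c4_ns, cont,
                       pow_c5_ps, c_fft128)
    begin
        --asyn reset
        if reset = '1' then
            pow_c4_ps <= pow_idle;
        elsif (clk'event and clk = '1') then
            pow_c4_ps <= pow_c4_ns;
        end if;

        pow_128(4) <= '0';
        case pow_c4_ps is
            when pow_idle =>
                if c_fft128(4) = '1' then
                    pow_c4_ns <= pow_st3;
                elsif pow_c5_ps /= pow_idle and cont(5) = '1' then
                    pow_c4_ns <= pow_st4;
                else
                    pow_c4_ns <= pow_idle;
                end if;
            when pow_st1 =>
                pow_128(4) <= cont(4);
                if cont(4) = '0' then
                    pow_c4_ns <= pow_idle;
                else
                    pow_c4_ns <= pow_st1;
                end if;
            when pow_st2 =>
                pow_128(4) <= cont(4);
                if cont(4) = '0' then
                    pow_c4_ns <= pow_st3;
                else
                    pow_c4_ns <= pow_st2;
                end if;
            when pow_st3 =>
                pow_128(4) <= cont(4);
                if cont(4) = '1' then
                    pow_c4_ns <= pow_st1;
                else
                    pow_c4_ns <= pow_st3;
                end if;
            when pow_st4 =>
                pow_128(4) <= cont(4);
                if cont(4) = '1' then
                    pow_c4_ns <= pow_st2;
                else
                    pow_c4_ns <= pow_st4;
                end if;
        end case;
    end process;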


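    pow_c5_sm: process(clk, reset, pow_c5_ps, pow_c5_ns, cont,
                       pow_c6_ps, c_fft128)
    begin
        --asyn reset
        if reset = '1' then
            pow_c5_ps <= pow_idle;
        elsif (clk'event and clk = '1') then
            pow_c5_ps <= pow_c5_ns;
        end if;

        pow_128(5) <= '0';
        case pow_c5_ps is
            when pow_idle =>
                if c_fft128(5) = '1' then
                    pow_c5_ns <= pow_st3;
                elsif pow_c6_ps /= pow_idle and cont(6) = '1' then
                    pow_c5_ns <= pow_st4;
                else
                    pow_c5_ns <= pow_idle;
                end if;
            when pow_st1 =>
                pow_128(5) <= cont(5);
                if cont(5) = '0' then
                    pow_c5_ns <= pow_idle;
                else
                    pow_c5_ns <= pow_st1;
                end if;
            when pow_st2 =>
                pow_128(5) <= cont(5);
                if cont(5) = '0' then
                    pow_c5_ns <= pow_st3;
                else
                    pow_c5_ns <= pow_st2;
                end if;
            when pow_st3 =>
                pow_128(5) <= cont(5);
                if cont(5) = '1' then
                    pow_c5_ns <= pow_st1;
                else
                    pow_c5_ns <= pow_st3;
                end if;
            when pow_st4 =>
                pow_128(5) <= cont(5);
                if cont(5) = '1' then
                    pow_c5_ns <= pow_st2;
                else
                    pow_c5_ns <= pow_st4;
                end if;
        end case;
    end process;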

    pow_c6_sm: process(clk, reset, pow_c6_ps, pow_c6_ns, cont,
                       c_fft128)
    begin
        ---asyn reset
        if reset = '1' then
            pow_c6_ps <= pow_idle;
        elsif (clk'event and clk = '1') then
            pow_c6_ps <= pow_c6_ns;
        end if;


            when pow_st1 =>
                pow_128(6) <= cont(6);
                if cont(6) = '0' then

                    pow_c6_ns <= pow_st1;
                end if;
            when pow_st2 =>
                pow_128(6) <= cont(6);
                if cont(6) = '0' then


                    pow_c6_ns <= pow_st1;
                else

            when pow_st4 =>
                pow_c6_ns <= pow_st4;


END behavioral;

-- file: pipeline_fft_unit.vhd
-- thesis project
-- Programmer: Xin Xiao
-- Date 01/2001
--
-- a basic unit of split-radix fft algorithm

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_signed.ALL;
USE ieee.std_logic_arith.all;
use work.all;

ENTITY pipeline_fft_sr_unit IS
    generic (M : natural := 2);
    PORT(
        clk      : in  std_logic;
        cont0    : in  std_logic;
        cont1    : in  std_logic;
        pow      : in  std_logic;
        data_in  : in  std_logic_vector(15 DOWNTO 0);
        wr_real  : in  std_logic_vector(15 DOWNTO 0);
        wr_imag  : in  std_logic_vector(15 DOWNTO 0);
        x2k      : out std_logic_vector(15 DOWNTO 0);
        x4k_real : out std_logic_vector(15 DOWNTO 0);
        x4k_imag : out std_logic_vector(15 DOWNTO 0)
    );
END pipeline_fft_sr_unit;

ARCHITECTURE behavioral OF pipeline_fft_sr_unit IS
    component comp_multi
        PORT(
            x_real  : in  std_logic_vector(15 DOWNTO 0);
            x_imag  : in  std_logic_vector(15 DOWNTO 0);
            wr_real : in  std_logic_vector(15 DOWNTO 0);
            wr_imag : in  std_logic_vector(15 DOWNTO 0);
            mx_real : out std_logic_vector(15 DOWNTO 0);
            mx_imag : out std_logic_vector(15 DOWNTO 0)
        );
    end component;

    component sr_n_real
        generic (N: natural := M);
        PORT(
            clk : in  std_logic;
            a   : in  std_logic_vector(15 DOWNTO 0);
            c   : out std_logic_vector(15 DOWNTO 0)
        );
    end component;

-- configuration ---------------------------------------------------
    for comp0_multi: comp_multi
        use entity work.comp_multi(behavioral);

    for sr0, sr1: sr_n_real
        use entity work.sr_n_real(behavioral);

-- signals ---------------------------------------------------------
    signal sr0_in          : std_logic_vector(15 DOWNTO 0);
    signal sr0_out         : std_logic_vector(15 DOWNTO 0);
    signal sr1_in          : std_logic_vector(15 DOWNTO 0);
    signal sr1_out         : std_logic_vector(15 DOWNTO 0);
    signal add1_out        : std_logic_vector(15 DOWNTO 0);
    signal comp_multi_imag : std_logic_vector(15 DOWNTO 0);
    signal com_mul_real    : std_logic_vector(15 DOWNTO 0);
    signal com_mul_imag    : std_logic_vector(15 DOWNTO 0);
    signal wr_com_real     : std_logic_vector(15 DOWNTO 0);
    signal wr_com_imag     : std_logic_vector(15 DOWNTO 0);
    signal ad0_out         : std_logic_vector(16 DOWNTO 0);
    signal ad1_out         : std_logic_vector(16 DOWNTO 0);
    signal ad1_im          : std_logic_vector(16 DOWNTO 0);

BEGIN
    sw0: process(cont0, data_in, add1_out)
    begin
        if (cont0 = '0') then
            sr0_in <= data_in;
        else
            sr0_in <= add1_out;
        end if;
    end process sw0;

    sw1: process(cont0, sr0_out, cont1)
    begin
        if (cont1 = '0') then
            comp_multi_imag <= x"0000";
            sr1_in <= sr0_out;
        else
            comp_multi_imag <= -sr0_out;
            sr1_in <= x"0000";
        end if;
    end process sw1;

    ---power control ------------------------------------------
    pow_sm: process(pow, sr1_out, comp_multi_imag,
                    wr_real, wr_imag)
    begin
        if (pow = '1') then
            com_mul_real <= sr1_out;
            com_mul_imag <= comp_multi_imag;
            wr_com_real <= wr_real;
            wr_com_imag <= wr_imag;
        else
            com_mul_real <= x"0000";
            com_mul_imag <= x"0000";
            wr_com_real <= x"0000";
            wr_com_imag <= x"0000";
        end if;
    end process;

    ad0_out <= (sr0_out(15)&sr0_out) + (data_in(15)&data_in);
    x2k <= ad0_out(16 downto 1);
    ad1_im <= (data_in(15)&data_in);
    ad1_out <= sr0_out(15)&sr0_out + ad1_im;
    add1_out <= ad1_out(16 downto 1);

    sr0: sr_n_real
        generic map (M)
        port map (clk, sr0_in, sr0_out);

    sr1: sr_n_real
        generic map (M/2)
        port map (clk, sr1_in, sr1_out);

    comp0_multi: comp_multi
        port map (com_mul_real, com_mul_imag, wr_com_real,
                  wr_com_imag, x4k_real, x4k_imag);
END behavioral;
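The adders in this unit sign-extend both 16-bit operands to 17 bits, add them, and keep bits 16 downto 1, which halves the sum so that it cannot overflow the 16-bit range. The testbench below is a minimal sketch of that idiom, not part of the thesis sources. It uses ieee.numeric_std (resize on signed performs the sign extension) rather than std_logic_signed, and the names half_sum_tb and half_sum are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical testbench: checks the "add in 17 bits, keep 16 downto 1"
-- scaling used by the butterfly adders in the listing.
entity half_sum_tb is
end entity half_sum_tb;

architecture sim of half_sum_tb is
    -- Returns (a + b) / 2, rounded toward minus infinity.
    function half_sum (a, b : signed(15 downto 0)) return signed is
        variable s : signed(16 downto 0);
    begin
        s := resize(a, 17) + resize(b, 17);  -- sign-extended 17-bit sum
        return s(16 downto 1);               -- drop the LSB: divide by two
    end function half_sum;
begin
    check: process
    begin
        -- Largest positive inputs: the 16-bit sum would overflow, the halved one does not.
        assert half_sum(to_signed(32767, 16), to_signed(32767, 16)) = to_signed(32767, 16)
            report "max + max should halve to max" severity failure;
        -- 3 + (-4) = -1, and -1 / 2 rounds down to -1.
        assert half_sum(to_signed(3, 16), to_signed(-4, 16)) = to_signed(-1, 16)
            report "(3 - 4) / 2 should round to -1" severity failure;
        report "half_sum checks passed" severity note;
        wait;
    end process check;
end architecture sim;
```

Halving at every stage trades one bit of precision per stage for guaranteed headroom; the listing applies it to x2k and add1_out.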

-- file: comp_multi.vhd
-- thesis project
-- Programmer: Xin Xiao
-- Date 12/00
--
-- Implement complex multiplication

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_signed.ALL;
USE ieee.std_logic_arith.all;

--LIBRARY synopsys;
--use synopsys.arithmetic.all;

ENTITY comp_multi IS PORT(
    x_real  : in  std_logic_vector(15 DOWNTO 0);
    x_imag  : in  std_logic_vector(15 DOWNTO 0);
    wr_real : in  std_logic_vector(15 DOWNTO 0);
    wr_imag : in  std_logic_vector(15 DOWNTO 0);
    mx_real : out std_logic_vector(15 DOWNTO 0);
    mx_imag : out std_logic_vector(15 DOWNTO 0)
);
END comp_multi;

ARCHITECTURE behavioral OF comp_multi IS
    signal a1      : std_logic_vector(31 DOWNTO 0);
    signal a2      : std_logic_vector(31 DOWNTO 0);
    signal a3      : std_logic_vector(31 DOWNTO 0);
    signal a4      : std_logic_vector(31 DOWNTO 0);
    signal ad_real : std_logic_vector(31 DOWNTO 0);
    signal ad_imag : std_logic_vector(31 DOWNTO 0);
BEGIN
    a1 <= x_real * wr_imag;
    a2 <= x_imag * wr_imag;
    a3 <= x_real * wr_real;
    a4 <= x_imag * wr_real;

    ad_real <= a3 - a2;
    ad_imag <= a1 + a4;

    mx_real <= ad_real(30 downto 15);
    mx_imag <= ad_imag(30 downto 15);

END behavioral;
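The arithmetic of comp_multi can be checked in isolation. The testbench below is a minimal sketch, not part of the thesis sources: it repeats the four products and the Q15 slice with ieee.numeric_std types (an assumption; the listing uses std_logic_signed on std_logic_vector), and the names cmul_q15_tb and cmul_q15 are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical testbench: same arithmetic as comp_multi, i.e.
-- (x_re + j x_im) * (w_re + j w_im) with Q15 operands and Q15 results.
entity cmul_q15_tb is
end entity cmul_q15_tb;

architecture sim of cmul_q15_tb is
    procedure cmul_q15 (
        x_re, x_im, w_re, w_im : in  signed(15 downto 0);
        m_re, m_im             : out signed(15 downto 0)) is
        variable re, im : signed(31 downto 0);
    begin
        re := x_re * w_re - x_im * w_im;   -- real part, Q30
        im := x_re * w_im + x_im * w_re;   -- imaginary part, Q30
        m_re := re(30 downto 15);          -- back to Q15
        m_im := im(30 downto 15);
    end procedure cmul_q15;
begin
    check: process
        variable m_re, m_im : signed(15 downto 0);
    begin
        -- (0.5 + 0.5j) * (0.5 - 0.5j) = 0.5 + 0j
        cmul_q15(to_signed(16384, 16), to_signed(16384, 16),
                 to_signed(16384, 16), to_signed(-16384, 16),
                 m_re, m_im);
        assert m_re = to_signed(16384, 16)
            report "real part should be 0.5" severity failure;
        assert m_im = 0
            report "imaginary part should be 0" severity failure;
        report "cmul_q15 checks passed" severity note;
        wait;
    end process check;
end architecture sim;
```

The sum of two Q30 products can exceed the range covered by bits 30 downto 15 when both operands are near full scale, so the sketch only exercises inputs whose result stays below 1.0 in magnitude.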

-- file: sr_n_real.vhd
-- thesis project
-- Programmer: Xin Xiao
-- Date 12/00
--
-- 16-bit real number shift-register

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

ENTITY sr_n_real IS
    generic(N: natural := 1);
    PORT(
        clk : in  std_logic;
        a   : in  std_logic_vector(15 DOWNTO 0);
        c   : out std_logic_vector(15 DOWNTO 0)
    );
END sr_n_real;

ARCHITECTURE behavioral OF sr_n_real IS
    type shift_reg is array ((N-1) downto 0) of std_logic_vector(15 DOWNTO 0);
    signal sr_real : shift_reg;
BEGIN
    nr: process (clk, a, sr_real)
    begin
        if (clk'event and clk = '1') then
            sr_real(0) <= a;
            if (N /= 1) then
                for num in 1 to (N-1) loop
                    sr_real(num) <= sr_real(num-1);
                end loop;
            end if;
        end if;
        c <= sr_real(N-1);
    end process;

END behavioral;
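The shift register acts as an N-cycle delay line: a sample presented at the input appears at the output after N rising clock edges. The following is a minimal sketch of that behaviour with a small self-checking testbench; it is not part of the thesis sources. The entity names delay_line and delay_line_tb are hypothetical, and unlike the listing the output is driven by a concurrent assignment and the register is given an initial value so the simulation starts from known contents.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical N-stage, 16-bit delay line (same idea as sr_n_real).
entity delay_line is
    generic (N : positive := 4);
    port (clk : in  std_logic;
          a   : in  std_logic_vector(15 downto 0);
          c   : out std_logic_vector(15 downto 0));
end entity delay_line;

architecture rtl of delay_line is
    type shift_reg is array (N - 1 downto 0) of std_logic_vector(15 downto 0);
    signal sr : shift_reg := (others => (others => '0'));
begin
    shift: process (clk)
    begin
        if rising_edge(clk) then
            sr(0) <= a;
            for i in 1 to N - 1 loop   -- empty range when N = 1
                sr(i) <= sr(i - 1);
            end loop;
        end if;
    end process shift;

    c <= sr(N - 1);                    -- oldest sample
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

entity delay_line_tb is
end entity delay_line_tb;

architecture sim of delay_line_tb is
    signal clk  : std_logic := '0';
    signal a, c : std_logic_vector(15 downto 0) := (others => '0');
begin
    dut: entity work.delay_line
        generic map (N => 4)
        port map (clk => clk, a => a, c => c);

    stimulus: process
    begin
        a <= x"1234";
        for i in 1 to 3 loop           -- three rising edges
            clk <= '1'; wait for 5 ns;
            clk <= '0'; wait for 5 ns;
        end loop;
        assert c = x"0000"
            report "sample must not reach the output after 3 edges" severity failure;
        clk <= '1'; wait for 5 ns;     -- fourth rising edge
        clk <= '0'; wait for 5 ns;
        assert c = x"1234"
            report "sample should reach the output after 4 edges" severity failure;
        report "delay_line checks passed" severity note;
        wait;
    end process stimulus;
end architecture sim;
```

In the FFT pipeline the delay lengths follow the stage size (the generic M in pipeline_fft_sr_unit), so the same component serves every stage.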

-- file: sr_n_real.vhd
-- thesis project
-- Programmer: Xin Xiao
-- Date 12/00
--
-- one bit real number shift-register

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

ENTITY sr_n_single IS
    generic(N: natural := 1);
    PORT(
        clk : in  std_logic;
        a   : in  std_logic;
        c   : out std_logic
    );
END sr_n_single;

ARCHITECTURE behavioral OF sr_n_single IS
    type shift_reg is array ((N-1) downto 0) of std_logic;
    signal sr_real : shift_reg;
BEGIN
    nr: process (clk, a, sr_real)
    begin
        if (clk'event and clk = '1') then
            sr_real(0) <= a;
            if (N /= 1) then
                for num in 1 to (N-1) loop
                    sr_real(num) <= sr_real(num-1);
                end loop;
            end if;
        end if;
        c <= sr_real(N-1);
    end process;

END behavioral;

-- file: fft_128.vhd
-- thesis project
-- Programmer: Xin Xiao
-- Date 01/2001
--
-- An FFT of length 512 is applied to compute the magnitude spectrum of the
-- signal. This file computes the 128-point FFT.

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_signed.ALL;
USE work.wr_512_pack.all;

ENTITY fft_128 IS PORT(
    clk       : in  std_logic;
    sof       : in  std_logic;
    reset     : in  std_logic;
    pow       : in  std_logic_vector(6 downto 0);
    c         : in  std_logic_vector(5 downto 0);
    in_real   : in  std_logic_vector(15 DOWNTO 0);
    in_imag   : in  std_logic_vector(15 DOWNTO 0);
    fft5_real : in  std_logic_vector(15 DOWNTO 0);
    fft5_imag : in  std_logic_vector(15 DOWNTO 0);
    fft4_real : in  std_logic_vector(15 DOWNTO 0);
    fft4_imag : in  std_logic_vector(15 DOWNTO 0);
    fft3_real : in  std_logic_vector(15 DOWNTO 0);
    fft3_imag : in  std_logic_vector(15 DOWNTO 0);
    fft2_real : in  std_logic_vector(15 DOWNTO 0);
    fft2_imag : in  std_logic_vector(15 DOWNTO 0);
    fft1_real : in  std_logic_vector(15 DOWNTO 0);
    fft1_imag : in  std_logic_vector(15 DOWNTO 0);
    fft0_real : in  std_logic_vector(15 DOWNTO 0);
    fft0_imag : in  std_logic_vector(15 DOWNTO 0);
    out_real  : out std_logic_vector(15 DOWNTO 0);
    out_imag  : out std_logic_vector(15 DOWNTO 0)
);
END fft_128;

ARCHITECTURE behavioral OF fft_128 IS

-- components ------------------------------------------------------
    component pipeline_fft_unit
        generic (M : natural := 1);
        PORT(
            clk      : in  std_logic;
            reset    : in  std_logic;
            cont     : in  std_logic;
            pow      : in  std_logic;
            in_real  : in  std_logic_vector(15 DOWNTO 0);
            in_imag  : in  std_logic_vector(15 DOWNTO 0);
            wr_real  : in  std_logic_vector(15 DOWNTO 0);
            wr_imag  : in  std_logic_vector(15 DOWNTO 0);
            out_real : out std_logic_vector(15 DOWNTO 0);
            out_imag : out std_logic_vector(15 DOWNTO 0)
        );
    end component;

    component multiplex_vector
        PORT(
            cont         : in  std_logic;
            mux_in0_real : in  std_logic_vector(15 downto 0);
            mux_in0_imag : in  std_logic_vector(15 downto 0);
            mux_in1_real : in  std_logic_vector(15 downto 0);
            mux_in1_imag : in  std_logic_vector(15 downto 0);
            mux_out_real : out std_logic_vector(15 downto 0);
            mux_out_imag : out std_logic_vector(15 downto 0)
        );
    end component;

-- configuration ---------------------------------------------------
    for pfu1, pfu2, pfu3, pfu4, pfu5, pfu6, pfu7: pipeline_fft_unit
        use entity work.pipeline_fft_unit(behavioral);
    for mux0, mux1, mux2, mux3, mux4, mux5: multiplex_vector
        use entity work.multiplex_vector(behavioral);

-- signals ---------------------------------------------------------
    signal in0_real  : std_logic_vector(15 DOWNTO 0);
    signal in0_imag  : std_logic_vector(15 DOWNTO 0);
    signal in1_real  : std_logic_vector(15 DOWNTO 0);
    signal in1_imag  : std_logic_vector(15 DOWNTO 0);
    signal in2_real  : std_logic_vector(15 DOWNTO 0);
    signal in2_imag  : std_logic_vector(15 DOWNTO 0);
    signal in3_real  : std_logic_vector(15 DOWNTO 0);
    signal in3_imag  : std_logic_vector(15 DOWNTO 0);
    signal in4_real  : std_logic_vector(15 DOWNTO 0);
    signal in4_imag  : std_logic_vector(15 DOWNTO 0);
    signal in5_real  : std_logic_vector(15 DOWNTO 0);
    signal in5_imag  : std_logic_vector(15 DOWNTO 0);
    signal out0_real : std_logic_vector(15 DOWNTO 0);
    signal out0_imag : std_logic_vector(15 DOWNTO 0);
    signal out1_real : std_logic_vector(15 DOWNTO 0);
    signal out1_imag : std_logic_vector(15 DOWNTO 0);
    signal out2_real : std_logic_vector(15 DOWNTO 0);
    signal out2_imag : std_logic_vector(15 DOWNTO 0);
    signal out3_real : std_logic_vector(15 DOWNTO 0);
    signal out3_imag : std_logic_vector(15 DOWNTO 0);
    signal out4_real : std_logic_vector(15 DOWNTO 0);
    signal out4_imag : std_logic_vector(15 DOWNTO 0);
    signal out5_real : std_logic_vector(15 DOWNTO 0);
    signal out5_imag : std_logic_vector(15 DOWNTO 0);
    signal in_turn_real : std_logic_vector(15 DOWNTO 0);
    signal in_turn_imag : std_logic_vector(15 DOWNTO 0);
    signal wr0_real  : std_logic_vector(15 DOWNTO 0);
    signal wr0_imag  : std_logic_vector(15 DOWNTO 0);
    signal wr1_real  : std_logic_vector(15 DOWNTO 0);
    signal wr1_imag  : std_logic_vector(15 DOWNTO 0);
    signal wr2_real  : std_logic_vector(15 DOWNTO 0);
    signal wr2_imag  : std_logic_vector(15 DOWNTO 0);
    signal wr3_real  : std_logic_vector(15 DOWNTO 0);
    signal wr3_imag  : std_logic_vector(15 DOWNTO 0);
    signal wr4_real  : std_logic_vector(15 DOWNTO 0);
    signal wr4_imag  : std_logic_vector(15 DOWNTO 0);
    signal wr5_real  : std_logic_vector(15 DOWNTO 0);
    signal wr5_imag  : std_logic_vector(15 DOWNTO 0);
    signal wr6_real  : std_logic_vector(15 DOWNTO 0);
    signal wr6_imag  : std_logic_vector(15 DOWNTO 0);
    signal cont      : std_logic_vector(6 DOWNTO 0);
    signal temp0     : std_logic_vector(7 DOWNTO 0);
    signal temp1     : std_logic_vector(7 DOWNTO 0);
    signal temp2     : std_logic_vector(7 DOWNTO 0);
    signal temp3     : std_logic_vector(7 DOWNTO 0);
    signal temp4     : std_logic_vector(7 DOWNTO 0);
    signal temp5     : std_logic_vector(7 DOWNTO 0);



BEGIN

-- control ---------------------------------------------------------
    cont_cnt: process (clk, reset)
    begin
        ---async reset
        if reset = '1' then
            cont <= B"0000000";
        elsif (clk'event and clk = '1') then
            if (sof = '1') then
                cont <= B"0000000";
            elsif (cont = 127) then
                cont <= B"0000000";
            else
                cont <= cont + 1;
            end if;
        end if;
    end process cont_cnt;

    ---turn B"1000_0000_0000_0000" into B"1000_0000_0000_0001"
    turn: process(in_real, in_imag)
    begin
        if (in_real = B"1000_0000_0000_0000") then
            in_turn_real <= B"1000_0000_0000_0001";
        else
            in_turn_real <= in_real;
        end if;
        if (in_imag = B"1000_0000_0000_0000") then
            in_turn_imag <= B"1000_0000_0000_0001";
        else
            in_turn_imag <= in_imag;
        end if;
    end process;

-- Wr --------------------------------------------------------------

    temp5 <= cont(5 downto 0)&"00";
    wr6_real <= wr_512_real(intval8(temp5));
    wr6_imag <= wr_512_imag(intval8(temp5));
    temp4 <= cont(4 downto 0)&"000";
    wr5_real <= wr_512_real(intval8(temp4));
    wr5_imag <= wr_512_imag(intval8(temp4));
    temp3 <= cont(3 downto 0)&"0000";
    wr4_real <= wr_512_real(intval8(temp3));
    wr4_imag <= wr_512_imag(intval8(temp3));
    temp2 <= cont(2 downto 0)&"00000";
    wr3_real <= wr_512_real(intval8(temp2));
    wr3_imag <= wr_512_imag(intval8(temp2));
    temp1 <= cont(1 downto 0)&"000000";
    wr2_real <= wr_512_real(intval8(temp1));
    wr2_imag <= wr_512_imag(intval8(temp1));
    temp0 <= cont(0)&"0000000";
    wr1_real <= wr_512_real(intval8(temp0));
    wr1_imag <= wr_512_imag(intval8(temp0));
    wr0_real <= B"0111111111111111";
    wr0_imag <= B"0000000000000000";

-- components instantiation ----------------------------------------

    mux0: multiplex_vector
        port map(c(0), out0_real, out0_imag, fft0_real, fft0_imag,
                 in0_real, in0_imag);

    mux1: multiplex_vector
        port map(c(1), out1_real, out1_imag, fft1_real, fft1_imag,
                 in1_real, in1_imag);

    mux2: multiplex_vector
        port map(c(2), out2_real, out2_imag, fft2_real, fft2_imag,
                 in2_real, in2_imag);

    mux3: multiplex_vector
        port map(c(3), out3_real, out3_imag, fft3_real, fft3_imag,
                 in3_real, in3_imag);

    mux4: multiplex_vector
        port map(c(4), out4_real, out4_imag, fft4_real, fft4_imag,
                 in4_real, in4_imag);

    mux5: multiplex_vector
        port map(c(5), out5_real, out5_imag, fft5_real, fft5_imag,
                 in5_real, in5_imag);

    pfu1: pipeline_fft_unit
        generic map (1)
        port map (clk, reset, cont(0), pow(0), in0_real, in0_imag,
                  wr0_real, wr0_imag, out_real, out_imag);

    pfu2: pipeline_fft_unit
        generic map (2)
        port map (clk, reset, cont(1), pow(1), in1_real, in1_imag,
                  wr1_real, wr1_imag, out0_real, out0_imag);

    pfu3: pipeline_fft_unit
        generic map (4)
        port map (clk, reset, cont(2), pow(2), in2_real, in2_imag,
                  wr2_real, wr2_imag, out1_real, out1_imag);

    pfu4: pipeline_fft_unit
        generic map (8)
        port map (clk, reset, cont(3), pow(3), in3_real, in3_imag,
                  wr3_real, wr3_imag, out2_real, out2_imag);

    pfu5: pipeline_fft_unit
        generic map (16)
        port map (clk, reset, cont(4), pow(4), in4_real, in4_imag,
                  wr4_real, wr4_imag, out3_real, out3_imag);

    pfu6: pipeline_fft_unit
        generic map (32)
        port map (clk, reset, cont(5), pow(5), in5_real, in5_imag,
                  wr5_real, wr5_imag, out4_real, out4_imag);

    pfu7: pipeline_fft_unit
        generic map (64)
        port map (clk, reset, cont(6), pow(6), in_turn_real,
                  in_turn_imag, wr6_real, wr6_imag, out5_real, out5_imag);

END behavioral;

-- file: pipeline_fft_unit.vhd
-- thesis project
-- Programmer: Xin Xiao
-- Date 01/2001
--
-- a basic unit of radix-2 fft algorithm

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_signed.ALL;
USE ieee.std_logic_arith.all;
use work.all;

ENTITY pipeline_fft_unit IS
  generic (M : natural := 1);
  PORT (
    clk      : in  std_logic;
    reset    : in  std_logic;
    cont     : in  std_logic;
    pow      : in  std_logic;
    in_real  : in  std_logic_vector(15 DOWNTO 0);
    in_imag  : in  std_logic_vector(15 DOWNTO 0);
    wr_real  : in  std_logic_vector(15 DOWNTO 0);
    wr_imag  : in  std_logic_vector(15 DOWNTO 0);
    out_real : out std_logic_vector(15 DOWNTO 0);
    out_imag : out std_logic_vector(15 DOWNTO 0)
  );
END pipeline_fft_unit;

ARCHITECTURE behavioral OF pipeline_fft_unit IS
  component butterfly_impl
    PORT (
      a_real       : in  std_logic_vector(15 DOWNTO 0);
      a_imag       : in  std_logic_vector(15 DOWNTO 0);
      b_real       : in  std_logic_vector(15 DOWNTO 0);
      b_imag       : in  std_logic_vector(15 DOWNTO 0);
      wr_real      : in  std_logic_vector(15 DOWNTO 0);
      wr_imag      : in  std_logic_vector(15 DOWNTO 0);
      out_pos_real : out std_logic_vector(15 DOWNTO 0);
      out_pos_imag : out std_logic_vector(15 DOWNTO 0);
      out_neg_real : out std_logic_vector(15 DOWNTO 0);
      out_neg_imag : out std_logic_vector(15 DOWNTO 0)
    );
  end component;

  component sr_n
    generic (N : natural := 1);
    PORT (
      clk    : in  std_logic;
      a_real : in  std_logic_vector(15 DOWNTO 0);
      a_imag : in  std_logic_vector(15 DOWNTO 0);
      c_real : out std_logic_vector(15 DOWNTO 0);
      c_imag : out std_logic_vector(15 DOWNTO 0)
    );
  end component;

  -- configuration
  for bf: butterfly_impl use entity work.butterfly_impl(behavioral);
  for sr: sr_n use entity work.sr_n(behavioral);

  -- signals
  signal sr_in_real  : std_logic_vector(15 DOWNTO 0);
  signal sr_in_imag  : std_logic_vector(15 DOWNTO 0);
  signal sr_out_real : std_logic_vector(15 DOWNTO 0);
  signal sr_out_imag : std_logic_vector(15 DOWNTO 0);
  signal bf_pos_real : std_logic_vector(15 DOWNTO 0);
  signal bf_pos_imag : std_logic_vector(15 DOWNTO 0);
  signal bf_neg_real : std_logic_vector(15 DOWNTO 0);
  signal bf_neg_imag : std_logic_vector(15 DOWNTO 0);
  signal but1_real   : std_logic_vector(15 DOWNTO 0);
  signal but1_imag   : std_logic_vector(15 DOWNTO 0);
  signal but2_real   : std_logic_vector(15 DOWNTO 0);
  signal but2_imag   : std_logic_vector(15 DOWNTO 0);
  signal but_wr_real : std_logic_vector(15 DOWNTO 0);
  signal but_wr_imag : std_logic_vector(15 DOWNTO 0);

BEGIN

  sw: process(reset, cont, in_real, in_imag, sr_out_real, sr_out_imag,
              bf_neg_real, bf_neg_imag, bf_pos_real, bf_pos_imag)
  begin
    if (cont = '0' or reset = '1') then
      sr_in_real <= in_real;
      sr_in_imag <= in_imag;
      out_real   <= sr_out_real;
      out_imag   <= sr_out_imag;
    else
      sr_in_real <= bf_neg_real;
      sr_in_imag <= bf_neg_imag;
      out_real   <= bf_pos_real;
      out_imag   <= bf_pos_imag;
    end if;
  end process sw;

  power1: process(reset, pow, sr_out_real, sr_out_imag,
                  in_real, in_imag, wr_real, wr_imag)
  begin
    if (reset = '1') then
      but1_real   <= x"0000";
      but1_imag   <= x"0000";
      but2_real   <= x"0000";
      but2_imag   <= x"0000";
      but_wr_real <= x"0000";
      but_wr_imag <= x"0000";
    elsif (pow = '1') then
      but1_real   <= in_real;
      but1_imag   <= in_imag;
      but2_real   <= sr_out_real;
      but2_imag   <= sr_out_imag;
      but_wr_real <= wr_real;
      but_wr_imag <= wr_imag;
    else
      but1_real   <= x"0000";
      but1_imag   <= x"0000";
      but2_real   <= x"0000";
      but2_imag   <= x"0000";
      but_wr_real <= x"0000";
      but_wr_imag <= x"0000";
    end if;
  end process;

  bf: butterfly_impl
    port map (but2_real, but2_imag, but1_real, but1_imag,
              but_wr_real, but_wr_imag, bf_pos_real, bf_pos_imag,
              bf_neg_real, bf_neg_imag);

  sr: sr_n
    generic map (M)
    port map (clk, sr_in_real, sr_in_imag,
              sr_out_real, sr_out_imag);

END behavioral;
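The data flow of pipeline_fft_unit can be cross-checked with a small behavioural model. The Python sketch below is not part of the thesis code. The function name sdf_stage, the use of Python complex numbers in place of the 16-bit fixed-point words, and the assumption that cont is low for M clocks and then high for M clocks are all illustrative assumptions; the pow gating and the reset are left out.

```python
from collections import deque


def sdf_stage(samples, twiddles, m):
    """Behavioural model of one radix-2 single-delay-feedback stage.

    samples  : iterable of input values, one per clock
    twiddles : m twiddle factors applied to the difference path
    m        : depth of the feedback shift register (generic M in the VHDL)
    """
    delay = deque([0j] * m, maxlen=m)   # models sr_n with N = m
    out = []
    for n, x in enumerate(samples):
        cont = (n // m) % 2             # assumed control pattern: m clocks 0, m clocks 1
        sr_out = delay[0]               # oldest entry is the shift-register output
        if cont == 0:
            # Pass-through phase: emit the stored value, store the new input.
            out.append(sr_out)
            delay.append(x)
        else:
            # Butterfly phase: the sum goes out, the weighted difference is fed back.
            out.append(sr_out + x)
            delay.append((sr_out - x) * twiddles[n % m])
    return out
```

With m = 2 and unit twiddle factors, the input 1, 2, 3, 4 followed by two zeros produces the sums 4 and 6 after a latency of two clocks, followed by the stored differences -2 and -2.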

-- file: butterfly_impl.vhd
-- thesis project
-- Programmer: Xin Xiao
-- Date 01/2001
--
-- implement butterfly of radix-2 fft algorithm

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_signed.ALL;
USE ieee.std_logic_arith.all;

ENTITY butterfly_impl IS
  PORT (
    a_real       : in  std_logic_vector(15 DOWNTO 0);
    a_imag       : in  std_logic_vector(15 DOWNTO 0);
    b_real       : in  std_logic_vector(15 DOWNTO 0);
    b_imag       : in  std_logic_vector(15 DOWNTO 0);
    wr_real      : in  std_logic_vector(15 DOWNTO 0);
    wr_imag      : in  std_logic_vector(15 DOWNTO 0);
    out_pos_real : out std_logic_vector(15 DOWNTO 0);
    out_pos_imag : out std_logic_vector(15 DOWNTO 0);
    out_neg_real : out std_logic_vector(15 DOWNTO 0);
    out_neg_imag : out std_logic_vector(15 DOWNTO 0)
  );
END butterfly_impl;

ARCHITECTURE behavioral OF butterfly_impl IS
  component comp_multi
    PORT (
      x_real  : in  std_logic_vector(15 DOWNTO 0);
      x_imag  : in  std_logic_vector(15 DOWNTO 0);
      wr_real : in  std_logic_vector(15 DOWNTO 0);
      wr_imag : in  std_logic_vector(15 DOWNTO 0);
      mx_real : out std_logic_vector(15 DOWNTO 0);
      mx_imag : out std_logic_vector(15 DOWNTO 0)
    );
  end component;

  component comp_add
    PORT (
      a_real : in  std_logic_vector(15 DOWNTO 0);
      a_imag : in  std_logic_vector(15 DOWNTO 0);
      b_real : in  std_logic_vector(15 DOWNTO 0);
      b_imag : in  std_logic_vector(15 DOWNTO 0);
      c_real : out std_logic_vector(15 DOWNTO 0);
      c_imag : out std_logic_vector(15 DOWNTO 0)
    );
  end component;

  -- configuration
  for multi: comp_multi use entity work.comp_multi(behavioral);
  for add1, add2: comp_add use entity work.comp_add(behavioral);

  signal h_real : std_logic_vector(15 DOWNTO 0);
  signal h_imag : std_logic_vector(15 DOWNTO 0);
  signal g_real : std_logic_vector(15 DOWNTO 0);
  signal g_imag : std_logic_vector(15 DOWNTO 0);

BEGIN

  -- g = -b, so that add2 forms the difference a - b
  g_real <= -b_real;
  g_imag <= -b_imag;

  multi: comp_multi port map (h_real, h_imag, wr_real, wr_imag,
                              out_neg_real, out_neg_imag);

  add1: comp_add port map (a_real, a_imag, b_real, b_imag,
                           out_pos_real, out_pos_imag);

  add2: comp_add port map (a_real, a_imag, g_real, g_imag,
                           h_real, h_imag);

END behavioral;
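The arithmetic that butterfly_impl wires together can be written in one line of floating-point code, which is convenient as a reference when checking simulation results. The sketch below is an illustration only: the function name butterfly is hypothetical, Python complex numbers stand in for the 16-bit real and imaginary words, and the rounding and scaling done inside comp_multi and comp_add are not modelled.

```python
def butterfly(a, b, w):
    """Radix-2 butterfly: returns (a + b, (a - b) * w).

    a, b : complex inputs (a_real/a_imag and b_real/b_imag in the VHDL)
    w    : complex twiddle factor (wr_real/wr_imag)
    """
    out_pos = a + b           # add1
    out_neg = (a - b) * w     # add2 followed by multi
    return out_pos, out_neg
```

For example, butterfly(1 + 2j, 3 - 1j, 1j) gives the sum 4 + 1j on the positive output and (-2 + 3j) * 1j = -3 - 2j on the negative output.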

-- file: sr_n.vhd
-- thesis project
-- Programmer: Xin Xiao
-- Date 01/2001
--
-- 16-bit complex number shift register

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

ENTITY sr_n IS
  generic (N : natural := 1);
  PORT (
    clk    : in  std_logic;
    a_real : in  std_logic_vector(15 DOWNTO 0);
    a_imag : in  std_logic_vector(15 DOWNTO 0);
    c_real : out std_logic_vector(15 DOWNTO 0);
    c_imag : out std_logic_vector(15 DOWNTO 0)
  );
END sr_n;

ARCHITECTURE behavioral OF sr_n IS
  type shift_reg is array ((N - 1) downto 0) of std_logic_vector(15 DOWNTO 0);
  signal sr_real : shift_reg;
  signal sr_imag : shift_reg;
BEGIN

  nr: process (clk, a_real, a_imag, sr_real, sr_imag)
  begin
    if (clk'event and clk = '1') then
      sr_real(0) <= a_real;
      sr_imag(0) <= a_imag;
      if (N /= 1) then
        for num in 1 to (N - 1) loop
          sr_real(num) <= sr_real(num - 1);
          sr_imag(num) <= sr_imag(num - 1);
        end loop;
      end if;
    end if;
    c_real <= sr_real(N - 1);
    c_imag <= sr_imag(N - 1);
  end process;

END behavioral;

-- file: multiplex_vector.vhd
-- thesis project
-- Programmer: Xin Xiao
-- Date 01/2001
--
-- a complex number switch

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

ENTITY multiplex_vector IS
  PORT (
    cont         : in  std_logic;
    mux_in0_real : in  std_logic_vector(15 downto 0);
    mux_in0_imag : in  std_logic_vector(15 downto 0);
    mux_in1_real : in  std_logic_vector(15 downto 0);
    mux_in1_imag : in  std_logic_vector(15 downto 0);
    mux_out_real : out std_logic_vector(15 downto 0);
    mux_out_imag : out std_logic_vector(15 downto 0)
  );
END multiplex_vector;

ARCHITECTURE behavioral OF multiplex_vector IS
BEGIN

  mux: process (cont, mux_in0_real, mux_in0_imag, mux_in1_imag, mux_in1_real)
  begin
    if (cont = '0') then
      mux_out_real <= mux_in0_real;
      mux_out_imag <= mux_in0_imag;
    else
      mux_out_real <= mux_in1_real;
      mux_out_imag <= mux_in1_imag;
    end if;
  end process;

END behavioral;


Square_root.vhd: source code for the SR block.

-- file: square_root.vhd
-- thesis project
-- Programmer: Xin Xiao
-- Date 02/2001
--
-- The fft_512_sr output is real and imag part of the FFT.
-- The square root is computed.

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_signed.ALL;
USE ieee.std_logic_arith.conv_std_logic_vector;
use work.exemplar_1164.sr2;
use work.wr_512_pack.intval5;

ENTITY square_root IS
  PORT (
    clk       : in  std_logic;
    reset     : in  std_logic;
    sof_fft   : in  std_logic;
    valid_fft : in  std_logic;
    fft_real  : in  std_logic_vector(15 DOWNTO 0);
    fft_imag  : in  std_logic_vector(15 DOWNTO 0);
    sof_squ   : out std_logic;
    valid_squ : out std_logic;
    fft_mag   : out std_logic_vector(15 DOWNTO 0)
  );
END square_root;

ARCHITECTURE behavioral OF square_root IS
  -- components
  component squ_root_pre
    PORT (
      clk            : in  std_logic;
      reset          : in  std_logic;
      valid_fft      : in  std_logic;
      fft_real       : in  std_logic_vector(15 DOWNTO 0);
      fft_imag       : in  std_logic_vector(15 DOWNTO 0);
      shift_squ_root : out std_logic_vector(4 DOWNTO 0);
      y1             : out std_logic_vector(31 DOWNTO 0);
      u1             : out std_logic_vector(31 DOWNTO 0)
    );
  end component;

  component squ_root_base2
    generic (N : natural := 1);
    PORT (
      clk   : in  std_logic;
      reset : in  std_logic;
      y_in  : in  std_logic_vector(31 DOWNTO 0);
      u_in  : in  std_logic_vector(31 DOWNTO 0);
      y_out : out std_logic_vector(31 DOWNTO 0);
      u_out : out std_logic_vector(31 DOWNTO 0)
    );
  end component;

  component squ_root_base5
    generic (N : natural := 3);
    PORT (
      clk   : in  std_logic;
      reset : in  std_logic;
      y_in  : in  std_logic_vector(31 DOWNTO 0);
      u_in  : in  std_logic_vector(31 DOWNTO 0);
      y_out : out std_logic_vector(31 DOWNTO 0);
      u_out : out std_logic_vector(31 DOWNTO 0)
    );
  end component;

  -- configuration
  for all: squ_root_pre use entity work.squ_root_pre(behavioral);
  for all: squ_root_base2 use entity work.squ_root_base2(behavioral);
  for all: squ_root_base5 use entity work.squ_root_base5(behavioral);

  signal shift_squ_root : std_logic_vector(4 DOWNTO 0);
  signal y1 : std_logic_vector(31 DOWNTO 0);
  signal u1 : std_logic_vector(31 DOWNTO 0);

  type squ_unit is array (0 to 12) of std_logic_vector(31 downto 0);
  signal u : squ_unit;
  signal y : squ_unit;

  type shift_number is array (0 to 11) of std_logic_vector(4 downto 0);
  signal shift_num : shift_number;

  type sig_bit_array is array (0 to 13) of std_logic;
  signal sof   : sig_bit_array;
  signal valid : sig_bit_array;

  signal fft_mag_reg : std_logic_vector(16 DOWNTO 0);  -- actual value
  signal shift_n : std_logic_vector(4 downto 0);
  signal tmp : std_logic_vector(31 DOWNTO 0);
  constant sqrt2 : std_logic_vector(15 DOWNTO 0) := "0101101010000010";

BEGIN

  -- preprocessing
  squ_pre: squ_root_pre
    port map (clk, reset, valid_fft, fft_real, fft_imag,
              shift_squ_root, y1, u1);

  y(1) <= y1;
  u(1) <= u1;

  -- 5 stages base-2 square root
  gen_base2: for N in 1 to 5 generate  -- gen_base2_1
  begin
    squ_roota: squ_root_base2
      generic map (N)
      port map (clk, reset, y(N), u(N), y(N + 1), u(N + 1));
  end generate;

  -- 6 stages base-4 square root
  gen_base4: for N in 3 to 8 generate
  begin
    squ_roota: squ_root_base5
      generic map (N)
      port map (clk, reset, y(N + 3), u(N + 3), y(N + 4), u(N + 4));
  end generate;

  -- post_processing
  -- 0100 * 0100 = 00010000
  shift_n <= b"01111" - shift_num(11);
  tmp <= y(12)(31 downto 16) * sqrt2;

  post_proc_sm: process(clk, reset, shift_num(11), y(12), shift_n, tmp)
  begin
    if reset = '1' then
      fft_mag_reg <= '0' & x"0000";
    elsif (clk'event and clk = '1') then
      if (shift_num(11) = 0) then
        fft_mag_reg <= '0' & x"0000";
      elsif (shift_num(11) = 1) then
        fft_mag_reg <= '0' & x"0001";
      elsif (shift_num(11) = 16) then
        fft_mag_reg <= tmp(29 downto 13);
      else
        fft_mag_reg <= shr(tmp(28 downto 12), shift_n);
      end if;
    end if;
  end process;

  fft_mag <= fft_mag_reg(16 downto 1);

  -- register array
  -- 11 stages shift of square_root
  shift_num(0) <= shift_squ_root;
  gen_shift: for N in 0 to 10 generate
    gen_shift_sm: process (clk, reset, shift_num)
    begin
      if (reset = '1') then
        shift_num(N + 1) <= b"00000";
      elsif (clk'event and clk = '1') then
        shift_num(N + 1) <= shift_num(N);
      end if;
    end process;
  end generate;

  -- 13 stages shift registers for sof, valid
  sof(0)    <= sof_fft;
  valid(0)  <= valid_fft;
  sof_squ   <= sof(13);
  valid_squ <= valid(13);

  gen_reg: for N in 0 to 12 generate
    gen_reg_sm: process(clk, reset, sof_fft, valid_fft)
    begin
      if (reset = '1') then
        sof(N + 1)   <= '0';
        valid(N + 1) <= '0';
      elsif (clk'event and clk = '1') then
        sof(N + 1)   <= sof(N);
        valid(N + 1) <= valid(N);
      end if;
    end process;
  end generate;

END behavioral;

-- file: squ_root_pre.vhd
-- thesis project
-- Programmer: Xin Xiao
-- Date 02/2001
--
-- The fft_512_sr output is real and imag part of the FFT.
-- The square root is computed.

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_signed.ALL;
use work.exemplar_1164.sr2;
use work.wr_512_pack.intval5;

ENTITY squ_root_pre IS
  PORT (
    clk            : in  std_logic;
    reset          : in  std_logic;
    valid_fft      : in  std_logic;
    fft_real       : in  std_logic_vector(15 DOWNTO 0);
    fft_imag       : in  std_logic_vector(15 DOWNTO 0);
    shift_squ_root : out std_logic_vector(4 DOWNTO 0);   -- [0, 10000]
    y1             : out std_logic_vector(31 DOWNTO 0);  -- (-2, 2)
    u1             : out std_logic_vector(31 DOWNTO 0)   -- (-2, 2)
  );
END squ_root_pre;

ARCHITECTURE behavioral OF squ_root_pre IS
  signal sum_squ      : std_logic_vector(31 DOWNTO 0);
  signal sum_square   : std_logic_vector(31 DOWNTO 0);
  signal fft_real_mag : std_logic_vector(16 DOWNTO 0);
  signal fft_imag_mag : std_logic_vector(16 DOWNTO 0);
  signal real_squ     : std_logic_vector(31 DOWNTO 0);
  signal imag_squ     : std_logic_vector(31 DOWNTO 0);
  signal sum_squ_buf  : std_logic_vector(31 DOWNTO 0);
  signal u_min        : std_logic_vector(31 DOWNTO 0);
BEGIN

  -- magnitude square of the fft output
  mag: process(clk, reset, valid_fft, fft_real, fft_imag, fft_real_mag,
               fft_imag_mag, real_squ, imag_squ, sum_squ_buf)
  begin
    if (fft_real(15) = '1') then
      fft_real_mag <= b"1_0000_0000_0000_0000" - ('0' & fft_real);
    else
      fft_real_mag <= fft_real(15) & fft_real;
    end if;

    if (fft_imag(15) = '1') then
      fft_imag_mag <= b"1_0000_0000_0000_0000" - ('0' & fft_imag);
    else
      fft_imag_mag <= fft_imag(15) & fft_imag;
    end if;

    real_squ <= fft_real_mag(15 DOWNTO 0) * fft_real_mag(15 DOWNTO 0);
    imag_squ <= fft_imag_mag(15 DOWNTO 0) * fft_imag_mag(15 DOWNTO 0);
    sum_squ_buf <= real_squ + imag_squ;

    if reset = '1' then
      sum_squ <= x"00000000";
    elsif (clk'event and clk = '1') then
      if (valid_fft = '1') then
        sum_squ <= sum_squ_buf;
      else
        sum_squ <= x"00000000";
      end if;
    end if;
  end process;

  -- preprocessing
  -- 1/4 <= x < 1   b"0100 0000 0000 0000" = 1
  pre_sm: process (clk, reset, sum_squ)
  begin
    if reset = '1' then
      sum_square <= x"00000000";
      shift_squ_root <= b"00000";
    elsif (clk'event and clk = '1') then
      if (sum_squ = 0) then
        sum_square <= x"00000000";
        shift_squ_root <= b"00000";
      elsif (sum_squ(30 downto 29) /= 0) then
        sum_square <= "00" & sum_squ(30 downto 1);
        shift_squ_root <= b"10000";
      elsif (sum_squ(28 downto 27) /= 0) then
        sum_square <= "00" & sum_squ(28 downto 0) & '0';
        shift_squ_root <= b"01111";
      elsif (sum_squ(26 downto 25) /= 0) then
        sum_square <= "00" & sum_squ(26 downto 0) & "000";
        shift_squ_root <= b"01110";
      elsif (sum_squ(24 downto 23) /= 0) then
        sum_square <= "00" & sum_squ(24 downto 0) & "00000";
        shift_squ_root <= b"01101";
      elsif (sum_squ(22 downto 21) /= 0) then
        sum_square <= "00" & sum_squ(22 downto 0) & "0000000";
        shift_squ_root <= b"01100";
      elsif (sum_squ(20 downto 19) /= 0) then
        sum_square <= "00" & sum_squ(20 downto 0) & "000000000";
        shift_squ_root <= b"01011";
      elsif (sum_squ(18 downto 17) /= 0) then
        sum_square <= "00" & sum_squ(18 downto 0) & "00000000000";
        shift_squ_root <= b"01010";
      elsif (sum_squ(16 downto 15) /= 0) then
        sum_square <= "00" & sum_squ(16 downto 0) & "0000000000000";
        shift_squ_root <= b"01001";
      elsif (sum_squ(14 downto 13) /= 0) then
        sum_square <= "00" & sum_squ(14 downto 0) & "000000000000000";
        shift_squ_root <= b"01000";
      elsif (sum_squ(12 downto 11) /= 0) then
        sum_square <= "00" & sum_squ(12 downto 0) & "00000000000000000";
        shift_squ_root <= b"00111";
      elsif (sum_squ(10 downto 9) /= 0) then
        sum_square <= "00" & sum_squ(10 downto 0) & "0000000000000000000";
        shift_squ_root <= b"00110";
      elsif (sum_squ(8 downto 7) /= 0) then
        sum_square <= "00" & sum_squ(8 downto 0) & "000000000000000000000";
        shift_squ_root <= b"00101";
      elsif (sum_squ(6 downto 5) /= 0) then
        sum_square <= "00" & sum_squ(6 downto 0) & "00000000000000000000000";
        shift_squ_root <= b"00100";
      elsif (sum_squ(4 downto 3) /= 0) then
        sum_square <= "00" & sum_squ(4 downto 0) & "0000000000000000000000000";
        shift_squ_root <= b"00011";
      elsif (sum_squ(2 downto 1) /= 0) then
        sum_square <= "00" & sum_squ(2 downto 0) & "000000000000000000000000000";
        shift_squ_root <= b"00010";
      else
        sum_square <= "00" & sum_squ(0) & "00000000000000000000000000000";
        shift_squ_root <= b"00001";
      end if;
    end if;
  end process;

  -- initialize
  init_sm: process (clk, reset, sum_square, u_min)
  begin
    if sum_square(29) = '1' then  -- q = 1
      y1 <= sum_square;
      u_min <= sum_square - x"40000000";
      u1 <= u_min(30 downto 0) & '0';
    else  -- q = 2
      y1 <= sum_square(30 downto 0) & '0';
      u_min <= (sum_square(29 downto 0) & "00") - x"40000000";
      u1 <= u_min(30 downto 0) & '0';
    end if;
  end process;

end behavioral;
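The priority encoder in the pre_sm process scales the squared magnitude into the range 1/4 <= x < 1 using an even number of bit shifts, and records the number of bit pairs in shift_squ_root so that the square root can be rescaled afterwards. The Python sketch below illustrates that idea only. The function name normalize is hypothetical, and the result is returned as a float rather than in the 32-bit fixed-point format used by the VHDL.

```python
def normalize(v):
    """Scale a positive 31-bit integer v into m with 1/4 <= m < 1.

    Returns (m, e) such that v == m * 4**e, hence sqrt(v) == sqrt(m) * 2**e.
    The exponent e plays the role of shift_squ_root (1 to 16).
    """
    if not 0 < v < 2**31:
        raise ValueError("v must be a positive 31-bit integer")
    e = (v.bit_length() + 1) // 2   # number of bit pairs needed to cover v
    return v / 4.0**e, e
```

Because only pairs of bits are shifted, the square root of the scale factor is itself a power of two, so the final correction reduces to a plain shift.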

-- file: squ_root_base2.vhd
-- thesis project
-- Programmer: Xin Xiao
-- Date 02/2001
--
-- Copyright Xin Xiao, 2001. All rights reserved.
-- Copying or other reproduction of this code except for archival purpose is
-- prohibited without the prior written consent of Xin Xiao.
--
-- the unit of base 2 square root

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_signed.ALL;
USE ieee.std_logic_arith.ALL;
use work.exemplar_1164.sr2;
use work.exemplar_1164.sl2;

ENTITY squ_root_base2 IS
  generic (N : natural := 1);
  PORT (
    clk   : in  std_logic;
    reset : in  std_logic;
    y_in  : in  std_logic_vector(31 DOWNTO 0);
    u_in  : in  std_logic_vector(31 DOWNTO 0);
    y_out : out std_logic_vector(31 DOWNTO 0);
    u_out : out std_logic_vector(31 DOWNTO 0)
  );
END squ_root_base2;

ARCHITECTURE behavioral OF squ_root_base2 IS
  signal ymul   : std_logic_vector(33 DOWNTO 0);
  signal sk2    : std_logic_vector(31 DOWNTO 0);
  signal su4l   : std_logic_vector(31 DOWNTO 0);
  signal sk22   : std_logic_vector(31 DOWNTO 0);
  signal sk22k  : std_logic_vector(31 DOWNTO 0);
  signal su2k   : std_logic_vector(31 DOWNTO 0);
  signal sk22i  : std_logic_vector(31 DOWNTO 0);
  signal sk22ki : std_logic_vector(31 DOWNTO 0);
  signal d2i    : std_logic_vector(31 DOWNTO 0);
  signal d2i1   : std_logic_vector(31 DOWNTO 0);
  signal t1     : std_logic_vector(31 DOWNTO 0);
  signal t2     : std_logic_vector(31 DOWNTO 0);
  signal t12    : std_logic_vector(31 DOWNTO 0);
  signal suk2   : std_logic_vector(31 DOWNTO 0);
BEGIN

  sk22i  <= shr(suk2, conv_std_logic_vector(2*N, 5));
  sk22k  <= sk22 + su4l + su4l;
  sk22ki <= shr(sk22k, conv_std_logic_vector(N, 5));
  su2k   <= sk2 + u_in;
  t1     <= su2k(30 downto 0) & '0';
  t2     <= sk22i + sk22ki;
  t12    <= t1 + t2;
  d2i1   <= x"40000000" + d2i;
  ymul   <= y_in(31 downto 15) * d2i1(31 downto 15);

  -- 1/4 <= x < 1   b"0100 0000 0000 0000" = 1
  base2_sm: process(clk, reset, u_in, y_in)
    -- signal ymul : std_logic_vector(33 DOWNTO 0);
    constant A1 : std_logic_vector(31 DOWNTO 0) := b"0011" & x"0000000";  -- 3/4
    constant A2 : std_logic_vector(31 DOWNTO 0) := b"1101" & x"0000000";  -- -3/4
  begin
    if (u_in <= A2) then
      sk2  <= x"40000000";  -- 1
      su4l <= u_in;         -- u(k)
      sk22 <= x"20000000";  -- 1/2
      suk2 <= '1' & u_in(31 downto 1);
      d2i  <= shr(x"20000000", conv_std_logic_vector(N, 5));
    elsif (u_in < A1) then
      sk2  <= x"00000000";
      su4l <= x"00000000";
      sk22 <= x"00000000";
      suk2 <= x"00000000";
      d2i  <= x"00000000";
    else
      sk2  <= x"c0000000";  -- -1
      su4l <= -(u_in);      -- -u(k)
      sk22 <= x"20000000";  -- 1/2
      suk2 <= '0' & u_in(31 downto 1);
      d2i  <= shr(x"e0000000", conv_std_logic_vector(N, 5));
    end if;
  end process;

  outy_sm: process(clk, reset, ymul)
  begin
    if (reset = '1') then
      y_out <= x"00000000";
    elsif (clk'event and clk = '1') then
      y_out <= ymul(31 downto 0);
    end if;
  end process;

  outu_sm: process(clk, reset, t12)
  begin
    if (reset = '1') then
      u_out <= x"00000000";
    elsif (clk'event and clk = '1') then
      u_out <= t12;
    end if;
  end process;

END behavioral;
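The stage above is one step of a multiplicative normalization: the operand is repeatedly multiplied by factors of the form (1 + s * 2^-k)^2 until it reaches 1, while a second register collects the same factors unsquared and so converges to the square root. The floating-point sketch below illustrates the principle under simplifying assumptions. The function name sqrt_mult_norm is hypothetical, the digit s is picked greedily instead of by comparing the scaled residual u(k) against the constants A1 and A2, and all steps are radix-2, so it does not reproduce the fixed-point behaviour of the VHDL bit for bit.

```python
def sqrt_mult_norm(x, steps=24):
    """Approximate sqrt(x) for 1/4 <= x < 1 by multiplicative normalization."""
    if not 0.25 <= x < 1.0:
        raise ValueError("x must lie in [1/4, 1)")
    # Initial scaling (q = 1 or q = 2) so that the residual r starts in [1/2, 2).
    q = 1.0 if x >= 0.5 else 2.0
    y, r = q * x, q * q * x
    for k in range(1, steps + 1):
        step = 2.0 ** -k
        # Greedy digit choice: the s in {-1, 0, 1} that brings r closest to 1.
        s = min((-1, 0, 1), key=lambda d: abs(r * (1 + d * step) ** 2 - 1))
        factor = 1 + s * step
        r *= factor * factor    # r = q^2 * x * P^2 tends to 1
        y *= factor             # y = q * x * P tends to sqrt(x)
    return y
```

Each factor is one plus or minus a power of two, so in hardware the multiplication of y reduces to a shift and an add, which is why the method suits a pipelined datapath.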

-- file: squ_root_base5.vhd
-- thesis project
-- Programmer: Xin Xiao
-- Date 02/2001
--
-- the unit of base 4 square root

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_signed.ALL;
USE ieee.std_logic_arith.ALL;
--use work.exemplar_1164.sr2;
--use work.exemplar_1164.sl2;

ENTITY squ_root_base5 IS
  generic (N : natural := 3);
  PORT (
    clk   : in  std_logic;
    reset : in  std_logic;
    y_in  : in  std_logic_vector(31 DOWNTO 0);
    u_in  : in  std_logic_vector(31 DOWNTO 0);
    y_out : out std_logic_vector(31 DOWNTO 0);
    u_out : out std_logic_vector(31 DOWNTO 0)
  );
END squ_root_base5;

ARCHITECTURE behavioral OF squ_root_base5 IS
BEGIN

  -- 1/4 <= x < 1   b"0100 0000 0000 0000" = 1
  base5_sm: process(clk, reset, u_in, y_in)
    variable ymul   : std_logic_vector(33 DOWNTO 0);
    variable sk1    : std_logic_vector(31 DOWNTO 0);
    variable su8    : std_logic_vector(31 DOWNTO 0);
    variable sk24   : std_logic_vector(31 DOWNTO 0);
    variable sk24k  : std_logic_vector(31 DOWNTO 0);
    variable su1k   : std_logic_vector(31 DOWNTO 0);
    variable su2k   : std_logic_vector(31 DOWNTO 0);
    variable sk24i  : std_logic_vector(31 DOWNTO 0);
    variable sk24ki : std_logic_vector(31 DOWNTO 0);
    variable d4i    : std_logic_vector(31 DOWNTO 0);
    variable d4i1   : std_logic_vector(31 DOWNTO 0);
    variable t1     : std_logic_vector(31 DOWNTO 0);
    variable t2     : std_logic_vector(31 DOWNTO 0);
    variable suk4   : std_logic_vector(31 DOWNTO 0);
    constant A1 : std_logic_vector(31 DOWNTO 0) := b"0010" & x"0000000";  -- 1/2
    constant A2 : std_logic_vector(31 DOWNTO 0) := b"0110" & x"0000000";  -- 3/2
    constant B1 : std_logic_vector(31 DOWNTO 0) := b"1110" & x"0000000";  -- -1/2
    constant B2 : std_logic_vector(31 DOWNTO 0) := b"1010" & x"0000000";  -- -3/2
  begin
    -- 4(u(k) + 2s(k)) + 4^(-k) (8s(k)u(k) + 4s(k)^2) + 4^(-2k) 4s(k)^2 u(k)
    -- 4(u(k) + 2s(k)) + 4^(-(k-2)) (s(k)u(k)/2 + s(k)^2/4)
    --                 + 4^(-2(k-1)) s(k)^2 u(k)/4

    if (u_in <= B2) then
      sk1  := x"40000000";                                -- 1
      su8  := u_in(31) & u_in(31 downto 1);               -- u(k)/2
      sk24 := x"10000000";                                -- 1/4
      suk4 := u_in(31) & u_in(31) & u_in(31 downto 2);    -- u(k)/4
      d4i  := shr(x"20000000", conv_std_logic_vector(2*N - 1, 5));  -- 1*4^(-k)
    elsif (u_in <= B1) then
      sk1  := x"20000000";
      su8  := u_in(31) & u_in(31) & u_in(31 downto 2);    -- u(k)/4
      sk24 := x"04000000";                                -- 1/16
      suk4 := shr(u_in(31 downto 0), "0110");             -- u(k)/2^6
      d4i  := shr(x"10000000", conv_std_logic_vector(2*N - 1, 5));  -- 1/2*4^(-k)
    elsif (u_in < A1) then
      sk1  := x"00000000";
      su8  := x"00000000";
      sk24 := x"00000000";
      suk4 := x"00000000";
      d4i  := x"00000000";
    elsif (u_in < A2) then
      sk1  := x"e0000000";                                -- -1/2
      su8  := -(u_in(31) & u_in(31) & u_in(31 downto 2));
      sk24 := x"04000000";                                -- 1/16
      suk4 := shr(u_in(31 downto 0), "0110");             -- u(k)/2^6
      d4i  := shr(x"f0000000", conv_std_logic_vector(2*N - 1, 5));  -- -1/2*4^(-k)
    else
      sk1  := x"c0000000";                                -- -1
      su8  := -(u_in(31) & u_in(31 downto 1));            -- -u(k)/2
      sk24 := x"10000000";                                -- 1/4
      suk4 := u_in(31) & u_in(31) & u_in(31 downto 2);    -- u(k)/4
      d4i  := shr(x"e0000000", conv_std_logic_vector(2*N - 1, 5));  -- -1*4^(-k)
    end if;

    sk24i  := shr(suk4, conv_std_logic_vector(4 * (N - 1), 6));
    sk24k  := sk24 + su8;
    sk24ki := shr(sk24k, conv_std_logic_vector(2 * (N - 2), 5));
    su1k   := sk1 + u_in;
    su2k   := sk1 + su1k;
    t1     := su2k(29 downto 0) & "00";
    t2     := sk24i + sk24ki;
    d4i1   := x"40000000" + d4i;
    ymul   := y_in(31 downto 15) * d4i1(31 downto 15);

    if (reset = '1') then
      u_out <= x"00000000";
      y_out <= x"00000000";
    elsif (clk'event and clk = '1') then
      u_out <= t1 + t2;
      y_out <= ymul(31 downto 0);
    end if;
  end process;

END behavioral;


Mel_filter.vhd: source code for the MF block.

-- file: mel_filter.vhd
-- thesis project
-- Programmer: Xin Xiao
-- Date 02/2001
--
-- This bank is divided into 23 channels equidistant in mel frequency domain.
-- Each channel has triangular-shaped frequency window. Consecutive channels
-- are half-overlapping.

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_unsigned.ALL;
use work.mel_filter_pack.all;
use work.wr_512_pack.all;

ENTITY mel_filter IS
  PORT (
    clk       : in  std_logic;
    reset     : in  std_logic;
    sof_squ   : in  std_logic;
    valid_squ : in  std_logic;
    fft_mag   : in  std_logic_vector(15 DOWNTO 0);
    eof_mel   : out std_logic;
    fbank     : out mel_bank  -- 1 to 23
  );
END mel_filter;

ARCHITECTURE behavioral OF mel_filter IS
  -- components
  component bit_rev
    PORT (
      clk       : in  std_logic;
      reset     : in  std_logic;
      sof_squ   : in  std_logic;
      valid_squ : in  std_logic;
      fft_mag   : in  std_logic_vector(15 DOWNTO 0);
      sof_rev   : out std_logic;
      valid_rev : out std_logic;
      eof_rev   : out std_logic;
      sym_cont  : out std_logic_vector(8 DOWNTO 0);
      sum_rev   : out std_logic_vector(15 DOWNTO 0)
    );
  end component;

  -- configuration
  for all: bit_rev use entity work.bit_rev(behavioral);

  signal sof_rev   : std_logic;
  signal valid_rev : std_logic;
  signal eof_rev   : std_logic;
  signal sym_cont  : std_logic_vector(8 DOWNTO 0);
  signal sum_rev   : std_logic_vector(15 DOWNTO 0);

  signal valid_even     : std_logic;
  signal valid_odd      : std_logic;
  signal sum_mel_even   : std_logic_vector(31 DOWNTO 0);
  signal sum_mel_odd    : std_logic_vector(31 DOWNTO 0);
  signal eof_reg        : std_logic;
  signal mel_cont       : std_logic_vector(8 DOWNTO 0);
  signal add_even       : std_logic_vector(23 DOWNTO 0);
  signal add_odd        : std_logic_vector(23 DOWNTO 0);
  signal fbank_reg_even : std_logic_vector(23 DOWNTO 0);
  signal fbank_reg_odd  : std_logic_vector(23 DOWNTO 0);
  signal fbank_reg1     : mel_bank;

BEGIN

  bit_rev_com: bit_rev
    port map (clk, reset, sof_squ, valid_squ, fft_mag,
              sof_rev, valid_rev, eof_rev, sym_cont, sum_rev);

  -- Registers

  -- counter register
  cont_reg: process(clk, reset, sym_cont)
  begin
    if reset = '1' then
      mel_cont <= b"0_0000_0000";
    elsif (clk'event and clk = '1') then
      mel_cont <= sym_cont;
    end if;
  end process;

  -- end register
  end_reg: process (clk, reset, eof_rev)
  begin
    if reset = '1' then
      eof_reg <= '0';
    elsif (clk'event and clk = '1') then
      eof_reg <= eof_rev;
    end if;
  end process;

  -- end mel register
  end_mel_reg: process(clk, reset, eof_reg)
  begin
    -- async reset
    if reset = '1' then
      eof_mel <= '0';
    elsif (clk'event and clk = '1') then
      eof_mel <= eof_reg;
    end if;
  end process;

  -- multiply
  even: process(clk, reset, sum_rev, sof_rev, valid_rev, sym_cont)
    variable symcontint : integer;
  begin
    symcontint := intval9(sym_cont);
    if reset = '1' then
      sum_mel_even <= x"00000000";
      valid_even <= '0';
    elsif (clk'event and clk = '1') then
      if (valid_rev = '1' and sym_cont > 1 and sym_cont < 257) then
        sum_mel_even <= sum_rev * meleven(symcontint);
        valid_even <= '1';
      else
        sum_mel_even <= x"00000000";
        valid_even <= '0';
      end if;
    end if;
  end process;

  odd: process(clk, reset, sum_rev, sof_rev, valid_rev, sym_cont)
    variable symcontint : integer;
  begin
    symcontint := intval9(sym_cont);
    if reset = '1' then
      sum_mel_odd <= x"00000000";
      valid_odd <= '0';
    elsif (clk'event and clk = '1') then
      if (valid_rev = '1' and sym_cont > 4 and sym_cont < 230) then
        sum_mel_odd <= sum_rev * melodd(symcontint);
        valid_odd <= '1';
      else
        sum_mel_odd <= x"00000000";
        valid_odd <= '0';
      end if;
    end if;
  end process;

  -- accumulation
  fbank <= fbank_reg1;

  even_acc: process(clk, reset, sof_rev, valid_even, sum_mel_even,
                    fbank_reg_even)
  begin
    if reset = '1' then
      fbank_reg1(1)  <= x"000000"; fbank_reg1(3)  <= x"000000"; fbank_reg1(5)  <= x"000000";
      fbank_reg1(7)  <= x"000000"; fbank_reg1(9)  <= x"000000"; fbank_reg1(11) <= x"000000";
      fbank_reg1(13) <= x"000000"; fbank_reg1(15) <= x"000000"; fbank_reg1(17) <= x"000000";
      fbank_reg1(19) <= x"000000"; fbank_reg1(21) <= x"000000"; fbank_reg1(23) <= x"000000";
    elsif (clk'event and clk = '1') then
      if (sof_rev = '1') then
        fbank_reg1(1)  <= x"000000"; fbank_reg1(3)  <= x"000000"; fbank_reg1(5)  <= x"000000";
        fbank_reg1(7)  <= x"000000"; fbank_reg1(9)  <= x"000000"; fbank_reg1(11) <= x"000000";
        fbank_reg1(13) <= x"000000"; fbank_reg1(15) <= x"000000"; fbank_reg1(17) <= x"000000";
        fbank_reg1(19) <= x"000000"; fbank_reg1(21) <= x"000000"; fbank_reg1(23) <= x"000000";
      elsif (valid_even = '1') then
        if (mel_cont > 1 and mel_cont < 8) then
          fbank_reg1(1) <= fbank_reg_even;
        elsif (mel_cont > 7 and mel_cont < 14) then
          fbank_reg1(3) <= fbank_reg_even;
        elsif (mel_cont > 13 and mel_cont < 23) then
          fbank_reg1(5) <= fbank_reg_even;
        elsif (mel_cont > 22 and mel_cont < 33) then
          fbank_reg1(7) <= fbank_reg_even;
        elsif (mel_cont > 32 and mel_cont < 45) then
          fbank_reg1(9) <= fbank_reg_even;
        elsif (mel_cont > 44 and mel_cont < 60) then
          fbank_reg1(11) <= fbank_reg_even;
        elsif (mel_cont > 59 and mel_cont < 79) then
          fbank_reg1(13) <= fbank_reg_even;
        elsif (mel_cont > 78 and mel_cont < 101) then
          fbank_reg1(15) <= fbank_reg_even;
        elsif (mel_cont > 100 and mel_cont < 129) then
          fbank_reg1(17) <= fbank_reg_even;
        elsif (mel_cont > 128 and mel_cont < 163) then
          fbank_reg1(19) <= fbank_reg_even;
        elsif (mel_cont > 162 and mel_cont < 205) then
          fbank_reg1(21) <= fbank_reg_even;
        elsif (mel_cont > 204 and mel_cont < 257) then
          fbank_reg1(23) <= fbank_reg_even;
        end if;
      end if;
    end if;
  end process;

  fbank_reg_even <= add_even + sum_mel_even(31 downto 8);

  even_mux: process (fbank_reg1, mel_cont)
  begin
    if (mel_cont > 1 and mel_cont < 8) then
      add_even <= fbank_reg1(1);
    elsif (mel_cont > 7 and mel_cont < 14) then
      add_even <= fbank_reg1(3);
    elsif (mel_cont > 13 and mel_cont < 23) then
      add_even <= fbank_reg1(5);
    elsif (mel_cont > 22 and mel_cont < 33) then
      add_even <= fbank_reg1(7);
    elsif (mel_cont > 32 and mel_cont < 45) then
      add_even <= fbank_reg1(9);
    elsif (mel_cont > 44 and mel_cont < 60) then
      add_even <= fbank_reg1(11);
    elsif (mel_cont > 59 and mel_cont < 79) then
      add_even <= fbank_reg1(13);
    elsif (mel_cont > 78 and mel_cont < 101) then
      add_even <= fbank_reg1(15);
    elsif (mel_cont > 100 and mel_cont < 129) then
      add_even <= fbank_reg1(17);
    elsif (mel_cont > 128 and mel_cont < 163) then
      add_even <= fbank_reg1(19);
    elsif (mel_cont > 162 and mel_cont < 205) then
      add_even <= fbank_reg1(21);
    elsif (mel_cont > 204 and mel_cont < 257) then
      add_even <= fbank_reg1(23);
    else
      add_even <= x"000000";
    end if;
  end process;

  odd_acc: process(clk, reset, sof_rev, valid_odd, sum_mel_odd,
                   fbank_reg_odd)
  begin
    if reset = '1' then
      fbank_reg1(2)  <= x"000000"; fbank_reg1(4)  <= x"000000"; fbank_reg1(6)  <= x"000000";
      fbank_reg1(8)  <= x"000000"; fbank_reg1(10) <= x"000000"; fbank_reg1(12) <= x"000000";
      fbank_reg1(14) <= x"000000"; fbank_reg1(16) <= x"000000"; fbank_reg1(18) <= x"000000";
      fbank_reg1(20) <= x"000000"; fbank_reg1(22) <= x"000000";
    elsif (clk'event and clk = '1') then
      if (sof_rev = '1') then
        fbank_reg1(2)  <= x"000000"; fbank_reg1(4)  <= x"000000"; fbank_reg1(6)  <= x"000000";
        fbank_reg1(8)  <= x"000000"; fbank_reg1(10) <= x"000000"; fbank_reg1(12) <= x"000000";
        fbank_reg1(14) <= x"000000"; fbank_reg1(16) <= x"000000"; fbank_reg1(18) <= x"000000";
        fbank_reg1(20) <= x"000000"; fbank_reg1(22) <= x"000000";
      elsif (valid_odd = '1') then
        if (mel_cont > 4 and mel_cont < 11) then
          fbank_reg1(2) <= fbank_reg_odd;
        elsif (mel_cont > 10 and mel_cont < 18) then
          fbank_reg1(4) <= fbank_reg_odd;
        elsif (mel_cont > 17 and mel_cont < 27) then
          fbank_reg1(6) <= fbank_reg_odd;
        elsif (mel_cont > 26 and mel_cont < 38) then
          fbank_reg1(8) <= fbank_reg_odd;
        elsif (mel_cont > 37 and mel_cont < 52) then
          fbank_reg1(10) <= fbank_reg_odd;
        elsif (mel_cont > 51 and mel_cont < 69) then
          fbank_reg1(12) <= fbank_reg_odd;
        elsif (mel_cont > 68 and mel_cont < 89) then
          fbank_reg1(14) <= fbank_reg_odd;
        elsif (mel_cont > 88 and mel_cont < 115) then
          fbank_reg1(16) <= fbank_reg_odd;
        elsif (mel_cont > 114 and mel_cont < 145) then
          fbank_reg1(18) <= fbank_reg_odd;
        elsif (mel_cont > 144 and mel_cont < 183) then
          fbank_reg1(20) <= fbank_reg_odd;
        elsif (mel_cont > 182 and mel_cont < 230) then
          fbank_reg1(22) <= fbank_reg_odd;
        end if;
      end if;
    end if;
  end process;

  fbank_reg_odd <= add_odd + sum_mel_odd(31 downto 8);

  odd_mux: process (fbank_reg1, mel_cont)
  begin
    if (mel_cont > 4 and mel_cont < 11) then
      add_odd <= fbank_reg1(2);
    elsif (mel_cont > 10 and mel_cont < 18) then
      add_odd <= fbank_reg1(4);
    elsif (mel_cont > 17 and mel_cont < 27) then
      add_odd <= fbank_reg1(6);
    elsif (mel_cont > 26 and mel_cont < 38) then
      add_odd <= fbank_reg1(8);
    elsif (mel_cont > 37 and mel_cont < 52) then
      add_odd <= fbank_reg1(10);
    elsif (mel_cont > 51 and mel_cont < 69) then
      add_odd <= fbank_reg1(12);
    elsif (mel_cont > 68 and mel_cont < 89) then
      add_odd <= fbank_reg1(14);
    elsif (mel_cont > 88 and mel_cont < 115) then
      add_odd <= fbank_reg1(16);
    elsif (mel_cont > 114 and mel_cont < 145) then
      add_odd <= fbank_reg1(18);
    elsif (mel_cont > 144 and mel_cont < 183) then
      add_odd <= fbank_reg1(20);
    elsif (mel_cont > 182 and mel_cont < 230) then
      add_odd <= fbank_reg1(22);
    else
      add_odd <= x"000000";
    end if;
  end process;

END behavioral;
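Because consecutive channels are half-overlapping, every FFT bin contributes to at most one odd-numbered and one even-numbered channel, which is why two multiply-accumulate paths are enough. The Python sketch below tabulates the bin ranges that the comparisons on mel_cont encode. It is an illustration only: the names ODD_EDGES, EVEN_EDGES, channel and channels_for_bin are hypothetical, and the triangular weights themselves (meleven and melodd in mel_filter_pack) are not reproduced.

```python
from bisect import bisect_left

# Bin boundaries read from the comparisons on mel_cont in the listing.
# Odd-numbered channel 2*i + 1 covers ODD_EDGES[i] < bin <= ODD_EDGES[i + 1]
# (handled by the "even" processes, which use the meleven table).
ODD_EDGES = [1, 7, 13, 22, 32, 44, 59, 78, 100, 128, 162, 204, 256]
# Even-numbered channel 2*i + 2 covers EVEN_EDGES[i] < bin <= EVEN_EDGES[i + 1]
# (handled by the "odd" processes, which use the melodd table).
EVEN_EDGES = [4, 10, 17, 26, 37, 51, 68, 88, 114, 144, 182, 229]


def channel(bin_index, edges, first):
    """Return the channel whose range contains bin_index, or None."""
    i = bisect_left(edges, bin_index) - 1
    if 0 <= i < len(edges) - 1:
        return first + 2 * i
    return None


def channels_for_bin(bin_index):
    """Return (odd-numbered channel, even-numbered channel) for an FFT bin."""
    return channel(bin_index, ODD_EDGES, 1), channel(bin_index, EVEN_EDGES, 2)
```

For instance, bin 5 falls in channels 1 and 2, while bin 1 falls in no channel at all.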

-- file: bit_rev.vhd
-- thesis project
-- Programmer: Xin Xiao
-- Date 02/2001
--
-- The FFT output is in bit-reversed order. The magnitude spectrum of the
-- signal is computed in the square root. Due to symmetry, only bin 0..FFT/2
-- are used for further processing.

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_unsigned.ALL;

ENTITY bit_rev IS
  PORT (
    clk       : in  std_logic;
    reset     : in  std_logic;
    sof_squ   : in  std_logic;
    valid_squ : in  std_logic;
    fft_mag   : in  std_logic_vector(15 DOWNTO 0);
    sof_rev   : out std_logic;
    valid_rev : out std_logic;
    eof_rev   : out std_logic;
    sym_cont  : out std_logic_vector(8 DOWNTO 0);
    sum_rev   : out std_logic_vector(15 DOWNTO 0)
  );
END bit_rev;

ARCHITECTURE behavioral OF bit_rev IS
  signal cont     : std_logic_vector(8 DOWNTO 0);
  signal rev_cont : std_logic_vector(8 DOWNTO 0);
BEGIN

  cont_cnt: process (clk, reset)
  begin
    -- async reset
    if reset = '1' then
      cont <= B"000000000";
    elsif (clk'event and clk = '1') then
      if (sof_squ = '1') then
        cont <= B"000000000";
      elsif (cont = 511) then
        cont <= B"000000000";
      else
        cont <= cont + 1;
      end if;
    end if;
  end process cont_cnt;

  -- reverse the bits order
  bit_rev: for i in 8 downto 0 generate
  begin
    rev_cont(i) <= cont(8 - i);
  end generate;

  -- Change into the FFT/2 points
  sym: process(clk, reset, rev_cont)
  begin
    if reset = '1' then
      sym_cont <= b"000000000";
    elsif (clk'event and clk = '1') then
      if (rev_cont > 256) then
        sym_cont <= 512 - rev_cont;
      else
        sym_cont <= rev_cont;
      end if;
    end if;
  end process;

  -- end of frame
  end_sm: process(clk, reset, cont)
  begin
    if reset = '1' then
      eof_rev <= '0';
    elsif (clk'event and clk = '1') then
      if (cont = 384) then
        eof_rev <= '1';
      else
        eof_rev <= '0';
      end if;
    end if;
  end process;

  -- Registers

  -- sof register
  in_reg1: process (clk, reset, sof_squ)
  begin
    -- async reset
    if reset = '1' then
      sof_rev <= '0';
    elsif (clk'event and clk = '1') then
      sof_rev <= sof_squ;
    end if;
  end process;

  -- valid register
  valid_reg: process (clk, reset, valid_squ)
  begin
    -- async reset
    if reset = '1' then
      valid_rev <= '1';
    elsif (clk'event and clk = '1') then
      valid_rev <= valid_squ;
    end if;
  end process;

  -- data register
  out_reg: process (clk, reset, fft_mag)
  begin
    -- async reset
    if reset = '1' then
      sum_rev <= x"0000";
    elsif (clk'event and clk = '1') then
      sum_rev <= fft_mag;
    end if;
  end process;

END behavioral;
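The index arithmetic in bit_rev is easy to check in software. The Python sketch below mirrors the generate loop that reverses the 9-bit counter and the sym process that folds bins above FFT/2 back onto 0..256. The function names bit_reverse9 and folded_bin are hypothetical, and the one-clock register delay of sym_cont is not modelled.

```python
def bit_reverse9(n):
    """Reverse the 9 bits of n (0..511), as rev_cont(i) <= cont(8 - i) does."""
    if not 0 <= n < 512:
        raise ValueError("n must be a 9-bit value")
    return int(format(n, "09b")[::-1], 2)


def folded_bin(n):
    """Map the n-th FFT output (bit-reversed order) to a bin in 0..256."""
    r = bit_reverse9(n)
    return 512 - r if r > 256 else r
```

For example, the fourth output (n = 3) carries bin 384, which by symmetry of the magnitude spectrum is treated as bin 128.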


Energy measure.vhd: source code for the EM block.

file: energy measure.vhdthesis projectProgrammer: Xin XiaoDate 01/2001



ENTITY energy measure IS PORT(cik : in std logic;reset : in std logic;sof_fr : in std_logic;s_fr : in std logic vector(l5 DOWNTO 0);

s_em : out std logic vector(l5 DOWNTO 0)

END energy_measure;

ARCHITECTURE behavioral OF energy_measure IS
  signal s_fr_reg  : std_logic_vector(15 DOWNTO 0);
  signal s_mul2    : std_logic_vector(31 DOWNTO 0);
  signal s_mul     : std_logic_vector(15 DOWNTO 0);
  signal s_add     : std_logic_vector(15 DOWNTO 0);
  signal s_add1    : std_logic_vector(15 DOWNTO 0);
  signal s_add_reg : std_logic_vector(15 DOWNTO 0);
  signal s_em_reg  : std_logic_vector(15 DOWNTO 0);
  signal sof_reg   : std_logic;
  signal sof_reg1  : std_logic;
  signal sof_reg2  : std_logic;
  signal sof_reg3  : std_logic;
BEGIN
  -- Registers

  -- Input register
  in_reg_sm: process (clk, reset, s_fr)
  begin
    -- async reset
    if reset = '1' then
      s_fr_reg <= x"0000";
    elsif (clk'event and clk = '1') then
      s_fr_reg <= s_fr;
    end if;
  end process;

  -- add register
  add_reg_sm: process (clk, reset, s_add)
  begin
    -- async reset
    if reset = '1' then
      s_add_reg <= x"0000";
    elsif (clk'event and clk = '1') then
      s_add_reg <= s_add;
    end if;
  end process;

  -- output register1
  out_reg1_sm: process (clk, reset, s_add_reg, sof_reg2)
  begin
    -- async reset
    if reset = '1' then
      s_em_reg <= x"0000";
    elsif (clk'event and clk = '1') then
      if (sof_reg = '1') then
        s_em_reg <= s_add_reg;
      end if;
    end if;
  end process;

  -- output register
  out_reg_sm: process (clk, reset, s_em_reg, sof_reg3)
  begin
    -- async reset
    if reset = '1' then
      s_em <= x"0000";
    elsif (clk'event and clk = '1') then
      if (sof_reg3 = '1') then
        s_em <= s_em_reg;
      end if;
    end if;
  end process;

  -- sof register
  sof_reg_sm: process (clk, reset, sof_fr)
  begin
    -- async reset
    if reset = '1' then
      sof_reg <= '0';
    elsif (clk'event and clk = '1') then
      sof_reg <= sof_fr;
    end if;
  end process;

  -- sof register1
  sof_reg1_sm: process (clk, reset, sof_reg)
  begin
    -- async reset
    if reset = '1' then
      sof_reg1 <= '0';
    elsif (clk'event and clk = '1') then
      sof_reg1 <= sof_reg;
    end if;
  end process;

  -- sof register2
  sof_reg2_sm: process (clk, reset, sof_reg1)
  begin
    -- async reset
    if reset = '1' then
      sof_reg2 <= '0';
    elsif (clk'event and clk = '1') then
      sof_reg2 <= sof_reg1;
    end if;
  end process;

  -- sof register3
  sof_reg3_sm: process (clk, reset, sof_reg2)
  begin
    -- async reset
    if reset = '1' then
      sof_reg3 <= '0';
    elsif (clk'event and clk = '1') then
      sof_reg3 <= sof_reg2;
    end if;
  end process;

  -- arithmetics ----------------------------------------------------

  s_mul2 <= s_fr_reg * s_fr_reg;
  s_mul  <= s_mul2(30 downto 15);
  s_add  <= s_mul + s_add1;

  arith: process (reset, s_add_reg, sof_reg1)
  begin
    if reset = '1' then
      s_add1 <= x"0000";
    elsif (sof_reg1 = '1') then
      s_add1 <= x"0000";
    else
      s_add1 <= s_add_reg;
    end if;
  end process;

END behavioral;
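
The arithmetic of the EM block can be illustrated with a short worked equation. This is only a sketch: it assumes the 16-bit samples are read as signed Q1.15 fractions, an interpretation the listing does not state, and it writes $S$ for the integer value of the registered input sample. The symbols $S$, $m$ and $E$ are introduced here for illustration.

```latex
% Slicing bits 30..15 of the 32-bit product divides the square by 2^15
% (valid for |S| < 2^15):
m = \left\lfloor \frac{S^{2}}{2^{15}} \right\rfloor ,
\qquad
E = \sum_{n \,\in\, \text{frame}} m[n]

% Worked example with S = x"4000" (0.5 in Q1.15):
S^{2} = 2^{28}, \qquad
m = 2^{28} / 2^{15} = 2^{13} = \texttt{x"2000"} \;\; (0.25 \text{ in Q1.15})
```

Under that reading, the block accumulates the truncated squares of the samples of a frame, and the accumulator input is cleared when the delayed start-of-frame flag is asserted.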


Deframing.vhd: source code for the Deframing block.

-- file: deframing.vhd
-- thesis project
-- Programmer: Xin Xiao
-- Date 01/2001
--
-- The outputs from the mel filter and the energy measure are combined into
-- one channel.

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_signed.ALL;
use work.mel_filter_pack.all;

ENTITY deframing IS PORT(
  clk       : in  std_logic;
  reset     : in  std_logic;
  eof_mel1  : in  std_logic;
  eof_mel2  : in  std_logic;
  eof_mel3  : in  std_logic;
  eof_mel4  : in  std_logic;
  fbank1    : in  mel_bank;
  s_em1     : in  std_logic_vector(15 DOWNTO 0);
  fbank2    : in  mel_bank;
  s_em2     : in  std_logic_vector(15 DOWNTO 0);
  fbank3    : in  mel_bank;
  s_em3     : in  std_logic_vector(15 DOWNTO 0);
  fbank4    : in  mel_bank;
  s_em4     : in  std_logic_vector(15 DOWNTO 0);
  sof_def   : out std_logic;
  fbank_def : out mel_bank;
  s_em_def  : out std_logic_vector(15 DOWNTO 0)
);
END deframing;

ARCHITECTURE behavioral OF deframing IS
BEGIN
  -- Registers ------------------------------------------------------

  deframing_sm: process(clk, reset, eof_mel1, eof_mel2, eof_mel3, eof_mel4,
                        fbank1, fbank2, fbank3, fbank4, s_em1, s_em2, s_em3, s_em4)
  begin
    if (reset = '1') then
      fbank_def <= fbank1;
      s_em_def  <= s_em1;
    elsif (clk'event and clk = '1') then
      if (eof_mel1 = '1') then
        fbank_def <= fbank1;
        s_em_def  <= s_em1;
      elsif (eof_mel2 = '1') then
        fbank_def <= fbank2;
        s_em_def  <= s_em2;
      elsif (eof_mel3 = '1') then
        fbank_def <= fbank3;
        s_em_def  <= s_em3;
      elsif (eof_mel4 = '1') then
        fbank_def <= fbank4;
        s_em_def  <= s_em4;
      end if;
    end if;
  end process;

  sof_sm: process(clk, reset, eof_mel1, eof_mel2, eof_mel3, eof_mel4)
  begin
    if (reset = '1') then
      sof_def <= '0';
    elsif (clk'event and clk = '1') then
      if (eof_mel1 = '1' or eof_mel2 = '1' or eof_mel3 = '1' or eof_mel4 = '1') then
        sof_def <= '1';
      else
        sof_def <= '0';
      end if;
    end if;
  end process;
END behavioral;


Log.vhd: source code for the LOG block.

-- file: log.vhd
-- thesis project
-- Programmer: Xin Xiao
-- Date 02/2001
--
-- The output of mel filtering is subjected to a logarithm function
-- (natural logarithm)

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_signed.ALL;
use work.mel_filter_pack.mel_bank;

ENTITY log IS
  PORT(
    clk     : in  std_logic;
    reset   : in  std_logic;
    eof_mel : in  std_logic;
    fbank   : in  mel_bank;                         -- 1 to 23
    s_em    : in  std_logic_vector(15 DOWNTO 0);
    sof_log : out std_logic;
    s_log   : out std_logic_vector(15 DOWNTO 0)     -- 14 * ln(2)   15 * ln(2)
  );
END log;

-- u[-4, 2]        -4 = x"c000"
-- y[-0.6931, 0]   [-1, 0]   [x"c000", x"0000"]
-- d[-2, 2]
-- r = 4

ARCHITECTURE behavioral OF log IS
  -- log_pre
  component log_pre
    PORT(
      clk       : in  std_logic;
      reset     : in  std_logic;
      eof_mel   : in  std_logic;
      fbank     : in  mel_bank;                     -- 1 to 23
      s_em      : in  std_logic_vector(15 DOWNTO 0);
      sof_log   : out std_logic;
      log_scale : out std_logic_vector(4 DOWNTO 0);
      u_out     : out std_logic_vector(15 DOWNTO 0);
      y_out     : out std_logic_vector(15 DOWNTO 0)
    );
  end component;

  -- log_unit
  component log_unit
    generic (I : natural := 1);
    PORT(
      clk           : in  std_logic;
      reset         : in  std_logic;
      sof_in        : in  std_logic;
      log_scale_in  : in  std_logic_vector(4 DOWNTO 0);
      u_in          : in  std_logic_vector(15 DOWNTO 0);
      y_in          : in  std_logic_vector(15 DOWNTO 0);
      sof_out       : out std_logic;
      log_scale_out : out std_logic_vector(4 DOWNTO 0);
      u_out         : out std_logic_vector(15 DOWNTO 0);
      y_out         : out std_logic_vector(15 DOWNTO 0)
    );
  end component;

  -- configuration ---------------------------------------------------
  for log_pre_inst : log_pre use entity work.log_pre(behavioral);
  for all : log_unit use entity work.log_unit(behavioral);

  -- type declaration ------------------------------------------------
  type sof_array is array (0 to 8) of std_logic;
  signal sof_reg : sof_array;

  type log_scale_array is array (0 to 8) of std_logic_vector(4 DOWNTO 0);
  signal log_scale_reg : log_scale_array;

  type yu_array is array (0 to 8) of std_logic_vector(15 DOWNTO 0);
  signal u_reg : yu_array;
  signal y_reg : yu_array;

  signal sof_log_reg : std_logic;
BEGIN

  -- component instantiation
  log_pre_inst: log_pre
    port map (clk, reset, eof_mel, fbank, s_em, sof_reg(0),
              log_scale_reg(0), u_reg(0), y_reg(0));

  log_unit_inst: for i in 0 to 7 generate
  begin
    log_unit_reg: log_unit
      generic map ((i+1))
      port map (clk, reset, sof_reg(i), log_scale_reg(i), u_reg(i),
                y_reg(i), sof_reg(i+1), log_scale_reg(i+1), u_reg(i+1),
                y_reg(i+1));
  end generate;

  sof_log_reg <= sof_reg(8);

  -- post operation
  -- ln(x) = ln(a) + 0.6931 * b
  -- 14 * ln(2)   15 * ln(2)
  -- 15 = dec2bin(2^11*15, 16)

  post_sm: process(clk, reset, log_scale_reg(8), y_reg(8))
    variable log_scale : std_logic_vector(15 DOWNTO 0);
  begin
    if log_scale_reg(8) = b"00001" then        -- 1 * ln(2)
      log_scale := b"0000010110001011";
    elsif log_scale_reg(8) = b"00010" then     -- 2 * ln(2)
      log_scale := b"0000101100010111";
    elsif log_scale_reg(8) = b"00011" then
      log_scale := b"0001000010100010";
    elsif log_scale_reg(8) = b"00100" then
      log_scale := b"0001011000101110";
    elsif log_scale_reg(8) = b"00101" then
      log_scale := b"0001101110111001";
    elsif log_scale_reg(8) = b"00110" then
      log_scale := b"0010000101000101";
    elsif log_scale_reg(8) = b"00111" then     -- 7
      log_scale := b"0010011011010000";
    elsif log_scale_reg(8) = b"01000" then
      log_scale := b"0010110001011100";
    elsif log_scale_reg(8) = b"01001" then
      log_scale := b"0011000111101000";
    elsif log_scale_reg(8) = b"01010" then
      log_scale := b"0011011101110011";
    elsif log_scale_reg(8) = b"01011" then
      log_scale := b"0011110011111111";
    elsif log_scale_reg(8) = b"01100" then     -- c
      log_scale := b"0100001010001010";
    elsif log_scale_reg(8) = b"01101" then
      log_scale := b"0100100000010110";
    elsif log_scale_reg(8) = b"01110" then
      log_scale := b"0100110110100001";
    elsif log_scale_reg(8) = b"01111" then     -- f
      log_scale := b"0101001100101101";
    end if;

    if reset = '1' then
      s_log <= x"0000";
    elsif (clk'event and clk = '1') then
      if log_scale_reg(8) = b"00000" then
        s_log <= x"0000";
      else
        s_log <= log_scale + (b"1111" & y_reg(8)(15 downto 4));
      end if;
    end if;
  end process;

  -- register --------------------------------------------------------
  sof_sm: process(clk, reset, sof_log_reg)
  begin
    if reset = '1' then
      sof_log <= '0';
    elsif (clk'event and clk = '1') then
      sof_log <= sof_log_reg;
    end if;
  end process;

END behavioral;
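
The constants in the post-operation table can be cross-checked with a short worked equation. This is a sketch under one assumption taken from the listing's `dec2bin(2^11*15, 16)` comment: that the table entries carry 11 fractional bits. The symbols $a$ and $b$ follow the comment `ln(x) = ln(a) + 0.6931 * b`, with $b$ the value held in `log_scale_reg(8)`; the name $c_b$ is introduced here for illustration.

```latex
\ln x = \ln a + b \,\ln 2 ,
\qquad
c_b = \left\lfloor b \,\ln 2 \cdot 2^{11} \right\rfloor ,
\quad b = 1, \dots, 15

% Two entries checked against the table:
c_1    = \lfloor 1419.57 \rfloor  = 1419  = \texttt{x"058B"} ,
\qquad
c_{15} = \lfloor 21293.48 \rfloor = 21293 = \texttt{x"532D"}
```

The remaining entries of the table follow the same pattern, one multiple of $\ln 2$ per scale value.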

-- file: log_pre.vhd
-- thesis project
-- Programmer: Xin Xiao
-- Date 02/2001
--
-- The output of mel filtering is subjected to a logarithm function
-- (natural logarithm). Before the log, the input data need to be
-- scaled.

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_signed.ALL;
use work.log_pack.all;
use work.mel_filter_pack.mel_bank;
use work.wr_512_pack.intval5;

ENTITY log_pre IS
  PORT(
    clk       : in  std_logic;
    reset     : in  std_logic;
    eof_mel   : in  std_logic;
    fbank     : in  mel_bank;                       -- 1 to 23
    s_em      : in  std_logic_vector(15 DOWNTO 0);
    sof_log   : out std_logic;
    log_scale : out std_logic_vector(4 DOWNTO 0);
    u_out     : out std_logic_vector(15 DOWNTO 0);
    y_out     : out std_logic_vector(15 DOWNTO 0)
  );
END log_pre;

-- u(-3, 3)        (x"a000", x"6000")
-- y[-0.6931, 0]   (-1, 0]   (x"8000", x"0000"]
-- d[-2, 2]        [x"c000", x"4000"]
-- r = 4

ARCHITECTURE behavioral OF log_pre IS
  signal log_in        : std_logic_vector(15 DOWNTO 0);
  signal u_buf         : std_logic_vector(15 DOWNTO 0);
  signal y_buf         : std_logic_vector(15 DOWNTO 0);
  signal sof_log_reg   : std_logic;
  signal log_scale_buf : std_logic_vector(4 DOWNTO 0);
  signal cnt           : std_logic_vector(4 DOWNTO 0);
  signal q_ini         : std_logic_vector(15 DOWNTO 0);
  -- [1/2 to 1)  [x"4000", x"7fff"]
  constant q : std_logic_vector(15 downto 0) := "0101100000000000";
  -- 11/16 = "0101100000000000", 1/2 = x"4000"
  constant log2_minus : std_logic_vector(15 downto 0) := "1010011101000110";
  -- -1 = x"8000"

BEGIN
  -- counter ---------------------------------------------------------
  cnt_sm: process(clk, reset, eof_mel, cnt)
  begin
    if (reset = '1') then
      cnt <= b"00000";
    elsif (clk'event and clk = '1') then
      if (eof_mel = '1') then
        cnt <= b"00001";
      elsif (cnt = 23) then          -- cnt = 23 - 1 + 1
        cnt <= b"00000";
      elsif (cnt /= 0) then
        cnt <= cnt + 1;
      else
        cnt <= b"00000";
      end if;
    end if;
  end process;

  -- input multiplex
  multiplex_sm: process(clk, reset, eof_mel, cnt, log_in)
    variable fcoe : integer;
  begin
    fcoe := intval5(23 - cnt);       -- DCT input sequence
    if (reset = '1') then
      log_in <= x"0000";
    elsif (clk'event and clk = '1') then
      if (cnt = 23) then
        log_in <= s_em;
      elsif (eof_mel = '1' or cnt /= 0) then
        log_in <= fbank(fcoe)(23 downto 8);
      else
        log_in <= x"0000";
      end if;
    end if;
  end process;

  -- scale -----------------------------------------------------------
  scale_sm: process (log_in)
  begin
    if (log_in(14) = '1') then
      log_scale_buf <= b"01111";
      q_ini <= log_in;
    elsif (log_in(13) = '1') then
      log_scale_buf <= b"01110";
      q_ini <= log_in(14 downto 0) & '0';
    elsif (log_in(12) = '1') then
      log_scale_buf <= b"01101";
      q_ini <= log_in(13 downto 0) & "00";
    elsif (log_in(11) = '1') then
      log_scale_buf <= b"01100";
      q_ini <= log_in(12 downto 0) & "000";
    elsif (log_in(10) = '1') then
      log_scale_buf <= b"01011";
      q_ini <= log_in(11 downto 0) & "0000";
    elsif (log_in(9) = '1') then
      log_scale_buf <= b"01010";
      q_ini <= log_in(10 downto 0) & "00000";
    elsif (log_in(8) = '1') then
      log_scale_buf <= b"01001";
      q_ini <= log_in(9 downto 0) & "000000";
    elsif (log_in(7) = '1') then
      log_scale_buf <= b"01000";
      q_ini <= log_in(8 downto 0) & "0000000";
    elsif (log_in(6) = '1') then
      log_scale_buf <= b"00111";
      q_ini <= log_in(7 downto 0) & "00000000";
    elsif (log_in(5) = '1') then
      log_scale_buf <= b"00110";
      q_ini <= log_in(6 downto 0) & "000000000";
    elsif (log_in(4) = '1') then
      log_scale_buf <= b"00101";
      q_ini <= log_in(5 downto 0) & "0000000000";
    elsif (log_in(3) = '1') then
      log_scale_buf <= b"00100";
      q_ini <= log_in(4 downto 0) & "00000000000";
    elsif (log_in(2) = '1') then
      log_scale_buf <= b"00011";
      q_ini <= log_in(3 downto 0) & "000000000000";
    elsif (log_in(1) = '1') then
      log_scale_buf <= b"00010";
      q_ini <= log_in(2 downto 0) & "0000000000000";
    elsif (log_in(0) = '1') then
      log_scale_buf <= b"00001";
      q_ini <= log_in(1 downto 0) & "00000000000000";
    else
      log_scale_buf <= b"00000";
      q_ini <= log_in;
    end if;
  end process;

  -- initialization
  -- u(-3, 3)        (x"a000", x"6000")
  -- y[-0.6931, 0]   (-1, 0]   (x"8000", x"0000"]
  -- d[-2, 2]        [x"c000", x"4000"]
  -- r = 4
  -- u1 and y1
  -- for u:  2 = "4000"
  -- for y: -1 = "8000"
  -- q_ini [x"4000", x"7fff"]   [1/2, 1)

  init_sm: process (q_ini)
    variable x1 : std_logic_vector(15 DOWNTO 0);
  begin
    x1 := q_ini - x"4000";                          -- 4x - 2
    if (q_ini < q) then
      u_buf <= '0' & q_ini(13 downto 0) & '0';      -- 2(4x - 2)
      y_buf <= log2_minus;
    else
      u_buf <= x1 - x"4000";                        -- 4x - 2 - 2
      y_buf <= x"0000";
    end if;
  end process;

  -- out registers ---------------------------------------------------

  -- u registers
  u_reg: process(clk, reset, u_buf)
  begin
    if reset = '1' then
      u_out <= x"0000";
    elsif (clk'event and clk = '1') then
      u_out <= u_buf;
    end if;
  end process;

  -- y registers
  y_reg: process(clk, reset, y_buf)
  begin
    if reset = '1' then
      y_out <= x"0000";
    elsif (clk'event and clk = '1') then
      y_out <= y_buf;
    end if;
  end process;

  -- log_scale registers
  log_reg: process(clk, reset, log_scale_buf)
  begin
    if reset = '1' then
      log_scale <= b"00000";
    elsif (clk'event and clk = '1') then
      log_scale <= log_scale_buf;
    end if;
  end process;

  -- sof registers1
  sof_reg1: process(clk, reset, eof_mel)
  begin
    if reset = '1' then
      sof_log_reg <= '0';
    elsif (clk'event and clk = '1') then
      sof_log_reg <= eof_mel;
    end if;
  end process;

  -- sof registers
  sof_reg: process(clk, reset, sof_log_reg)
  begin
    if reset = '1' then
      sof_log <= '0';
    elsif (clk'event and clk = '1') then
      sof_log <= sof_log_reg;
    end if;
  end process;

END behavioral;
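
The scale stage of log_pre amounts to a leading-one normalization, which the following worked equation illustrates. It is a sketch under two assumptions: `log_in` is treated as a positive integer $x$ whose highest set bit is at position $p$ ($0 \le p \le 14$), and `q_ini` is read as a Q1.15 fraction $a$, in line with the listing's comment that `q_ini` lies in [x"4000", x"7fff"], i.e. [1/2, 1). The symbols $x$, $p$ and $a$ are introduced here for illustration.

```latex
x = a \cdot 2^{\,p+1}, \qquad
a = \frac{q_{\text{ini}}}{2^{15}} \in \left[\tfrac12, 1\right), \qquad
\text{log\_scale} = p + 1

% Worked example, x = 3 (bits ...0011, so p = 1):
q_{\text{ini}} = \texttt{x"6000"}, \quad
a = 0.75, \quad
\text{log\_scale} = 2, \quad
0.75 \cdot 2^{2} = 3
```

With this decomposition, $\ln x = \ln a + (p+1)\ln 2$, so only $\ln a$ on the interval $[1/2, 1)$ has to be evaluated by the iterative stages.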

-- file: log_unit.vhd
-- thesis project
-- Programmer: Xin Xiao
-- Date 01/2001
--
-- The output of mel filtering is subjected to a logarithm function
-- (natural logarithm)

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_signed.ALL;
USE ieee.std_logic_arith.CONV_STD_LOGIC_VECTOR;
--use work.exemplar_1164.sr2;
--use work.exemplar_1164.sl2;
use work.log_pack.all;

ENTITY log_unit IS
  generic (I : natural := 1);
  PORT(
    clk           : in  std_logic;
    reset         : in  std_logic;
    sof_in        : in  std_logic;
    log_scale_in  : in  std_logic_vector(4 DOWNTO 0);
    u_in          : in  std_logic_vector(15 DOWNTO 0);
    y_in          : in  std_logic_vector(15 DOWNTO 0);
    sof_out       : out std_logic;
    log_scale_out : out std_logic_vector(4 DOWNTO 0);
    u_out         : out std_logic_vector(15 DOWNTO 0);
    y_out         : out std_logic_vector(15 DOWNTO 0)
  );
END log_unit;

-- u[-4, 2]        -4 = x"c000" ? x"8000"
-- y[-0.6931, 0]   [-1, 0]   [x"c000", x"0000"] ? x"8000"
-- d[-2, 2]
-- r = 4

ARCHITECTURE behavioral OF log_unit IS
  signal u_buf : std_logic_vector(15 DOWNTO 0);
  signal y_buf : std_logic_vector(15 DOWNTO 0);
BEGIN
  dif: process(clk, reset, u_in, y_in)
    variable u_ddur  : std_logic_vector(15 DOWNTO 0);  -- d(i) (1 + u(i) * r^(-i))
    variable u_uddur : std_logic_vector(15 DOWNTO 0);  -- u(i) + d(i) (1 + u(i) * r^(-i))
    variable y_dr    : std_logic_vector(15 DOWNTO 0);
    variable o_ur    : std_logic_vector(15 DOWNTO 0);  -- 1 + u(i) * r^(-i)
  begin
    o_ur := x"2000" + shr(u_in, conv_std_logic_vector(2*i, 16));
    if u_in <= "1101001100110011" then      -- dec2bin(2^16 - 1.4*2^13, 16)
      -- 2 = x"2000"
      u_ddur := shl(o_ur, "1");
      y_dr   := log_coe(i, 0);
    elsif u_in <= "1111000110011001" then   -- -0.45
      u_ddur := o_ur;
      y_dr   := log_coe(i, 1);
    elsif u_in <= "0001000110011001" then   -- 0.55
      u_ddur := x"0000";
      y_dr   := x"0000";
    elsif u_in <= "0011010011001100" then   -- 1.65
      u_ddur := -o_ur;
      y_dr   := log_coe(i, 2);
    else
      u_ddur := shl(o_ur, "1");
      y_dr   := log_coe(i, 3);
    end if;
    u_uddur := u_in + u_ddur;
    y_buf <= y_in - y_dr;
    u_buf <= shl(u_uddur, "10");
  end process;

  -- Registers ------------------------------------------------------

  -- output register
  out_reg: process (clk, reset, y_buf, u_buf)
  begin
    -- async reset
    if reset = '1' then
      y_out <= x"0000";
      u_out <= x"0000";
    elsif (clk'event and clk = '1') then
      y_out <= y_buf;
      u_out <= u_buf;
    end if;
  end process;

  -- sof register
  sof_reg: process(clk, reset, sof_in)
  begin
    -- async reset
    if reset = '1' then
      sof_out <= '0';
    elsif (clk'event and clk = '1') then
      sof_out <= sof_in;
    end if;
  end process;

  -- log scale register
  log_scale_reg: process(clk, reset, log_scale_in)
  begin
    -- async reset
    if reset = '1' then
      log_scale_out <= b"00000";
    elsif (clk'event and clk = '1') then
      log_scale_out <= log_scale_in;
    end if;
  end process;

END behavioral;
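
The per-stage update of log_unit can be summarized as a recurrence. This is a hedged reading built from the listing's own comments (`1 + u(i) * r^(-i)`, `d(i) (1 + u(i) * r^(-i))`, `r=4`) and from decoding the comparison constants with 13 fractional bits. Two points are assumptions: that the $y$ update is a subtraction (the operator is not legible in the listing), and that `log_coe(i, k)` supplies the correction term written $c_i(d_i)$ below.

```latex
\begin{aligned}
o_i     &= 1 + u_i\, r^{-i}, \qquad r = 4, \\
u_{i+1} &= r \left( u_i + d_i\, o_i \right), \\
y_{i+1} &= y_i - c_i(d_i),
\end{aligned}
\qquad
d_i =
\begin{cases}
 \phantom{-}2 & u_i \le -1.4 \\
 \phantom{-}1 & -1.4 < u_i \le -0.45 \\
 \phantom{-}0 & -0.45 < u_i \le 0.55 \\
 -1           & 0.55 < u_i \le 1.65
\end{cases}

% Threshold decoding, 13 fractional bits:
-1.4 \cdot 2^{13} \approx -11469 = \texttt{x"D333"}, \qquad
1.65 \cdot 2^{13} \approx 13516 = \texttt{x"34CC"}
```

Values of $u_i$ above 1.65 fall into the final `else` branch of the listing, which selects the remaining entry `log_coe(i, 3)`.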

Dct.vhd: source code for the DCT block.

-- file: dct.vhd
-- thesis project
-- Programmer: Xin Xiao
-- Date 02/01
--
-- 13 cepstral coefficients are calculated from the output of the Non-linear
-- Transformation block.

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_signed.ALL;
use work.dct_pack.all;

ENTITY dct IS PORT(
  clk     : in  std_logic;
  reset   : in  std_logic;
  sof_log : in  std_logic;
  s_log   : in  std_logic_vector(15 downto 0);
  sof_dct : out std_logic;
  dct_out : out std_logic_vector(15 downto 0)
);
END dct;

ARCHITECTURE behavioral OF dct IS
  -- component declaration -------------------------------------------
  component dct_pre PORT(
    clk         : in  std_logic;
    reset       : in  std_logic;
    sof_log     : in  std_logic;
    s_log       : in  std_logic_vector(15 downto 0);
    sof_dct_pre : out std_logic;
    y0          : out std_logic_vector(20 downto 0);
    x0          : out std_logic_vector(20 downto 0);
    tn          : out std_logic_vector(20 downto 0);
    xx1         : out dct_vector;
    xx2         : out dct_vector
  );
  end component;

  component dct_post PORT(
    clk         : in  std_logic;
    reset       : in  std_logic;
    sof_dct_pre : in  std_logic;
    y0          : in  std_logic_vector(20 downto 0);
    x0          : in  std_logic_vector(20 downto 0);
    tn          : in  std_logic_vector(20 downto 0);
    xx1         : in  dct_vector;
    xx2         : in  dct_vector;
    sof_dct     : out std_logic;
    dct_out     : out std_logic_vector(20 downto 0)
  );
  end component;

  -- configuration
  for all : dct_pre use entity work.dct_pre(behavioral);
  for all : dct_post use entity work.dct_post(behavioral);

  signal sof_dct_pre : std_logic;
  signal y0          : std_logic_vector(20 downto 0);
  signal x0          : std_logic_vector(20 downto 0);
  signal tn          : std_logic_vector(20 downto 0);
  signal xx1         : dct_vector;
  signal xx2         : dct_vector;
  signal dct_out1    : std_logic_vector(20 downto 0);
  signal sof_log_cnt : std_logic_vector(4 downto 0);
  signal sof_dct_cnt : std_logic_vector(4 downto 0);
  signal s_em_reg    : std_logic_vector(15 downto 0);
  signal sof_dct_buf : std_logic;
BEGIN
  sof_dct <= sof_dct_buf;

  -- component instantiation -----------------------------------------
  dct_pre_inst: dct_pre port map (clk, reset, sof_log, s_log, sof_dct_pre,
                                  y0, x0, tn, xx1, xx2);

  dct_post_inst: dct_post port map (clk, reset, sof_dct_pre, y0, x0, tn,
                                    xx1, xx2, sof_dct_buf, dct_out1);

  log_cnt_sm: process(clk, reset, sof_log)
  begin
    if reset = '1' then
      sof_log_cnt <= b"00000";
    elsif clk'event and clk = '1' then
      if sof_log = '1' then
        sof_log_cnt <= b"00001";
      elsif sof_log_cnt = 23 then
        sof_log_cnt <= b"00000";
      elsif sof_log_cnt /= 1 then
        sof_log_cnt <= sof_log_cnt + 1;
      else
        sof_log_cnt <= b"00000";
      end if;
    end if;
  end process;

  dct_cnt_sm: process(clk, reset, sof_dct_buf, sof_dct_cnt)
  begin
    if sof_dct_buf = '1' then
      sof_dct_cnt <= b"00001";
    elsif sof_dct_cnt = 23 then
      sof_dct_cnt <= b"00000";
    elsif sof_dct_cnt /= 1 then
      sof_dct_cnt <= sof_dct_cnt + 1;
    else
      sof_dct_cnt <= b"00000";
    end if;
  end process;

  -- s_em register

  -- s_em input
  s_em_in: process(clk, reset, s_log, sof_log_cnt)
  begin
    if reset = '1' then
      s_em_reg <= x"0000";
    elsif (clk'event and clk = '1') then
      if sof_log_cnt = 23 then
        s_em_reg <= s_log;
      end if;
    end if;
  end process;

  -- s_em output
  s_em_out: process(clk, reset, s_em_reg, sof_dct_cnt)
  begin
    if reset = '1' then
      dct_out <= x"0000";
    elsif (clk'event and clk = '1') then
      if sof_dct_cnt = 23 then
        dct_out <= s_em_reg;
      elsif (sof_dct_buf = '1' or sof_dct_cnt /= 0) then
        dct_out <= dct_out1(20 downto 5);
      end if;
    end if;
  end process;

END behavioral;


-- file: dct_pre.vhd
-- thesis project
-- Programmer: Xin Xiao
-- Date 02/01
--
-- 13 cepstral coefficients are calculated from the output of the Non-linear
-- Transformation block.
-- Calculate the coefficient vectors for matrix multiplication.

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_signed.ALL;
use work.dct_pack.all;

ENTITY dct_pre IS PORT(
  clk         : in  std_logic;
  reset       : in  std_logic;
  sof_log     : in  std_logic;
  s_log       : in  std_logic_vector(15 downto 0);
  sof_dct_pre : out std_logic;
  y0          : out std_logic_vector(20 downto 0);
  x0          : out std_logic_vector(20 downto 0);
  tn          : out std_logic_vector(20 downto 0);
  xx1         : out dct_vector;
  xx2         : out dct_vector
);
END dct_pre;

ARCHITECTURE behavioral OF dct_pre IS
  signal sof_log_reg : std_logic;
  signal s_log_reg   : std_logic_vector(15 downto 0);
  signal sof_xi      : std_logic;
  signal y0_reg      : std_logic_vector(20 downto 0);
  signal xi1         : std_logic_vector(20 downto 0);
  signal y0_reg1     : std_logic_vector(20 downto 0);
  signal y0_reg2     : std_logic_vector(20 downto 0);
  signal x1          : dct_vector;
  signal x2          : dct_vector;
  signal xi_ps       : xi_states;
  signal xi_ns       : xi_states;
  signal tn_reg      : std_logic_vector(20 downto 0);
  signal tn_reg1     : std_logic_vector(20 downto 0);
  signal tn_add1     : std_logic_vector(20 downto 0);
  signal tn_add2     : std_logic_vector(20 downto 0);
  signal tn_add      : std_logic_vector(20 downto 0);
  signal tn_buf      : std_logic_vector(20 downto 0);
  signal x0_reg      : std_logic_vector(20 downto 0);
  signal x0_reg1     : std_logic_vector(20 downto 0);
  signal xx1_reg     : std_logic_vector(20 downto 0);
  signal xx2_reg     : std_logic_vector(20 downto 0);
  signal xx_ps       : xxi_states;
  signal xx_ns       : xxi_states;
  signal xx1_add1    : std_logic_vector(20 downto 0);
  signal xx1_add2    : std_logic_vector(20 downto 0);
  signal xx1_add     : std_logic_vector(20 downto 0);
  signal xx2_add1    : std_logic_vector(20 downto 0);
  signal xx2_add2    : std_logic_vector(20 downto 0);
  signal xx2_add     : std_logic_vector(20 downto 0);
  signal xx2_mul     : std_logic_vector(41 downto 0);
  signal xx2_mul1    : std_logic_vector(20 downto 0);
  signal dct_cnt     : std_logic_vector(4 downto 0);
  signal y0_buf      : std_logic_vector(20 downto 0);
  signal y0_min      : std_logic_vector(20 downto 0);
  signal xi1_buf     : std_logic_vector(20 downto 0);
  signal xi1_min     : std_logic_vector(20 downto 0);

BEGIN
  -- input register
  sof_sm: process(clk, reset, sof_log)
  begin
    if reset = '1' then
      sof_log_reg <= '0';
    elsif (clk'event and clk = '1') then
      sof_log_reg <= sof_log;
    end if;
  end process;

  s_log_sm: process(clk, reset, s_log)
  begin
    if reset = '1' then
      s_log_reg <= x"0000";
    elsif (clk'event and clk = '1') then
      s_log_reg <= s_log;
    end if;
  end process;

  -- counter ---------------------------------------------------------
  cnt_sm: process(clk, reset, sof_log_reg)
  begin
    if reset = '1' then
      dct_cnt <= b"00000";
    elsif (clk'event and clk = '1') then
      if sof_log_reg = '1' then
        dct_cnt <= b"00001";
      elsif (dct_cnt = 22) then
        dct_cnt <= b"00000";
      elsif (dct_cnt /= 0) then
        dct_cnt <= dct_cnt + 1;
      end if;
    end if;
  end process;

  -- calculate y0
  y0_buf <= b"00000" & s_log_reg + y0_min;
  y0_sm: process (clk, reset, sof_log_reg, s_log_reg, y0_reg, y0_min, y0_buf)
  begin
    if (sof_log_reg = '1') then
      y0_min <= '0' & x"00000";
    else
      y0_min <= y0_reg;
    end if;

    if (reset = '1') then
      y0_reg <= '0' & x"00000";
    elsif (clk'event and clk = '1') then
      if (sof_log_reg = '1' or dct_cnt /= 0) then
        y0_reg <= y0_buf;
      else
        y0_reg <= y0_reg;
      end if;
    end if;
  end process;

  -- calculate x(i) ..................................................
  xi1_buf <= b"00000" & s_log_reg - xi1_min;
  xi_sm: process (clk, reset, sof_log_reg, s_log_reg, xi1, xi1_buf, xi1_min)
  begin
    if (sof_log_reg = '1') then
      xi1_min <= '0' & x"00000";
    else
      xi1_min <= xi1;
    end if;

    if (reset = '1') then
      xi1 <= '0' & x"00000";
    elsif (clk'event and clk = '1') then
      xi1 <= xi1_buf;
    end if;
  end process;

  -- calculate array x1 and x2
  tn_add <= tn_add1 - tn_add2;

  xi_demux_sm: process(clk, reset, sof_log_reg, xi1, xi_ps, xi_ns, s_log_reg, tn_reg,
                       tn_add1, tn_add2, tn_add)
  begin
    if (reset = '1') then
      xi_ps  <= xi_st0;
      tn_reg <= '0' & x"00000";
    elsif (clk'event and clk = '1') then
      xi_ps <= xi_ns;
      if dct_cnt /= 0 then
        tn_reg <= tn_buf;
      end if;
    end if;

    tn_add1 <= tn_reg;
    tn_add2 <= xi1;
    tn_buf  <= tn_add;
    sof_xi  <= '0';
    case xi_ps is
      when xi_st0 =>
        if (sof_log_reg = '1') then
          xi_ns <= xi_st1;
        else
          xi_ns <= xi_st0;
        end if;
        tn_buf <= '0' & x"00000";
      when xi_st1 =>
        xi_ns <= xi_st2;
        x1(10) <= xi1;
        tn_add2 <= -xi1;
      when xi_st2 =>
        xi_ns <= xi_st3;
        x2(9) <= xi1;
      when xi_st3 =>
        xi_ns <= xi_st4;
        x1(2) <= xi1;
        tn_add2 <= -xi1;
      when xi_st4 =>
        xi_ns <= xi_st5;
        x1(8) <= xi1;
      when xi_st5 =>
        xi_ns <= xi_st6;
        x2(4) <= xi1;
        tn_add2 <= -xi1;
      when xi_st6 =>
        xi_ns <= xi_st7;
        x2(1) <= xi1;
      when xi_st7 =>
        xi_ns <= xi_st8;
        x2(6) <= xi1;
        tn_add2 <= -xi1;
      when xi_st8 =>
        xi_ns <= xi_st9;
        x2(7) <= xi1;
      when xi_st9 =>
        xi_ns <= xi_st10;
        x2(5) <= xi1;
        tn_add2 <= -xi1;
      when xi_st10 =>
        xi_ns <= xi_st11;
        x1(3) <= xi1;
      when xi_st11 =>
        xi_ns <= xi_st12;
        x2(0) <= xi1;
        tn_add2 <= -xi1;
      when xi_st12 =>
        xi_ns <= xi_st13;
        x1(0) <= xi1;
      when xi_st13 =>
        xi_ns <= xi_st14;
        x2(3) <= xi1;
        tn_add2 <= -xi1;
      when xi_st14 =>
        xi_ns <= xi_st15;
        x1(5) <= xi1;
      when xi_st15 =>
        xi_ns <= xi_st16;
        x1(7) <= xi1;
        tn_add2 <= -xi1;
      when xi_st16 =>
        xi_ns <= xi_st17;
        x1(6) <= xi1;
      when xi_st17 =>
        xi_ns <= xi_st18;
        x1(1) <= xi1;
        tn_add2 <= -xi1;
      when xi_st18 =>
        xi_ns <= xi_st19;
        x1(4) <= xi1;
      when xi_st19 =>
        xi_ns <= xi_st20;
        x2(8) <= xi1;
        tn_add2 <= -xi1;
      when xi_st20 =>
        xi_ns <= xi_st21;
        x2(2) <= xi1;
      when xi_st21 =>
        xi_ns <= xi_st22;
        x1(9) <= xi1;
        tn_add2 <= -xi1;
      when xi_st22 =>
        xi_ns <= xi_st23;
        x2(10) <= xi1;
      when xi_st23 =>
        xi_ns <= xi_st24;
        x0_reg <= xi1;
        tn_add2 <= -xi1;
      when xi_st24 =>
        xi_ns <= xi_st0;
        sof_xi <= '1';
    end case;
  end process;


  -- calculate array xx1 and xx2
  xx1_add <= xx1_add1 + xx1_add2;
  xx2_add <= xx2_add1 - xx2_add2;
  xx2_mul <= xx2_mul1 * xx2_reg;

  xx_sm: process(clk, reset, sof_xi, x1, x2, xx_ps, xx_ns, xx1_reg, xx2_mul)
  begin
    if (reset = '1') then
      xx_ps   <= xxi_st0;
      xx1_reg <= '0' & x"00000";
      xx2_reg <= '0' & x"00000";
      xx1 <= ('0' & x"00000", '0' & x"00000", '0' & x"00000",
              '0' & x"00000", '0' & x"00000", '0' & x"00000",
              '0' & x"00000", '0' & x"00000", '0' & x"00000",
              '0' & x"00000", '0' & x"00000");
      xx2 <= ('0' & x"00000", '0' & x"00000", '0' & x"00000",
              '0' & x"00000", '0' & x"00000", '0' & x"00000",
              '0' & x"00000", '0' & x"00000", '0' & x"00000",
              '0' & x"00000", '0' & x"00000");
    elsif (clk'event and clk = '1') then
      xx_ps   <= xx_ns;
      xx1_reg <= xx1_add;
      xx2_reg <= xx2_add;
    end if;

    xx1_add1 <= '0' & x"00000";
    xx1_add2 <= '0' & x"00000";
    xx2_add1 <= '0' & x"00000";
    xx2_add2 <= '0' & x"00000";
    sof_dct_pre <= '0';
    case xx_ps is
      when xxi_st0 =>
        if (sof_xi = '1') then
          xx_ns <= xxi_st1;
        else
          xx_ns <= xxi_st0;
        end if;
      when xxi_st1 =>
        xx_ns <= xxi_st2;
        xx1_add1 <= x1(0);
        xx1_add2 <= x2(0);
        xx2_add1 <= x1(0);
        xx2_add2 <= x2(0);
      when xxi_st2 =>
        xx_ns <= xxi_st3;
        xx1_add1 <= x1(1);
        xx1_add2 <= x2(1);
        xx2_add1 <= x1(1);
        xx2_add2 <= x2(1);
        xx1(0) <= xx1_reg;
        xx2(0) <= xx2_mul(39 downto 19);
        xx2_mul1 <= b"000010001011110000101";  -- cos(x1(0) * a) = cos11a; a = pi/23
      when xxi_st3 =>
        xx_ns <= xxi_st4;
        xx1_add1 <= x1(2);
        xx1_add2 <= x2(2);
        xx2_add1 <= x1(2);
        xx2_add2 <= x2(2);
        xx1(1) <= xx1_reg;
        xx2(1) <= xx2_mul(39 downto 19);
        xx2_mul1 <= b"010101110101110111100";  -- cos6a
      when xxi_st4 =>
        xx_ns <= xxi_st5;
        xx1_add1 <= x1(3);
        xx1_add2 <= x2(3);
        xx2_add1 <= x1(3);
        xx2_add2 <= x2(3);
        xx1(2) <= xx1_reg;
        xx2(2) <= xx2_mul(39 downto 19);
        xx2_mul1 <= b"100010101001100011010";  -- cos20a
      when xxi_st5 =>
        xx_ns <= xxi_st6;
        xx1_add1 <= x1(4);
        xx1_add2 <= x2(4);
        xx2_add1 <= x1(4);
        xx2_add2 <= x2(4);
        xx1(3) <= xx1_reg;
        xx2(3) <= xx2_mul(39 downto 19);
        xx2_mul1 <= b"111001011111010100100";  -- cos13a
      when xxi_st6 =>
        xx_ns <= xxi_st7;
        xx1_add1 <= x1(5);
        xx1_add2 <= x2(5);
        xx2_add1 <= x1(5);
        xx2_add2 <= x2(5);
        xx1(4) <= xx1_reg;
        xx2(4) <= xx2_mul(39 downto 19);
        xx2_mul1 <= b"011000110100101010000";  -- cos5a
      when xxi_st7 =>
        xx_ns <= xxi_st8;
        xx1_add1 <= x1(6);
        xx1_add2 <= x2(6);
        xx2_add1 <= x1(6);
        xx2_add2 <= x2(6);
        xx1(5) <= xx1_reg;
        xx2(5) <= xx2_mul(39 downto 19);
        xx2_mul1 <= b"001010101101110101010";  -- cos9a
      when xxi_st8 =>
        xx_ns <= xxi_st9;
        xx1_add1 <= x1(7);
        xx1_add2 <= x2(7);
        xx2_add1 <= x1(7);
        xx2_add2 <= x2(7);
        xx1(6) <= xx1_reg;
        xx2(6) <= xx2_mul(39 downto 19);
        xx2_mul1 <= b"010010011101000010101";  -- cos7a
      when xxi_st9 =>
        xx_ns <= xxi_st10;
        xx1_add1 <= x1(8);
        xx1_add2 <= x2(8);
        xx2_add1 <= x1(8);
        xx2_add2 <= x2(8);
        xx1(7) <= xx1_reg;
        xx2(7) <= xx2_mul(39 downto 19);
        xx2_mul1 <= b"001110101110001101101";  -- cos8a
      when xxi_st10 =>
        xx_ns <= xxi_st11;
        xx1_add1 <= x1(9);
        xx1_add2 <= x2(9);
        xx2_add1 <= x1(9);
        xx2_add2 <= x2(9);
        xx1(8) <= xx1_reg;
        xx2(8) <= xx2_mul(39 downto 19);
        xx2_mul1 <= b"100100101010001001100";  -- cos19a
      when xxi_st11 =>
        xx_ns <= xxi_st12;
        xx1_add1 <= x1(10);
        xx1_add2 <= x2(10);
        xx2_add1 <= x1(10);
        xx2_add2 <= x2(10);
        xx1(9) <= xx1_reg;
        xx2(9) <= xx2_mul(39 downto 19);
        xx2_mul1 <= b"011110110100000011011";  -- cos2a
      when xxi_st12 =>
        xx_ns <= xxi_st13;
        xx1(10) <= xx1_reg;
        xx2(10) <= xx2_mul(39 downto 19);
        xx2_mul1 <= b"100000010011000100110";  -- cos22a
      when xxi_st13 =>
        xx_ns <= xxi_st0;
        sof_dct_pre <= '1';
    end case;
  end process;

  -- Registers

  -- 3 registers for y0
  y0_reg1_sm: process(clk, reset, y0_reg)
  begin
    if (reset = '1') then
      y0_reg1 <= '0' & x"00000";
    elsif (clk'event and clk = '1') then
      y0_reg1 <= y0_reg;
    end if;
  end process;

  y0_reg2_sm: process(clk, reset, y0_reg1)
  begin
    if (reset = '1') then
      y0_reg2 <= '0' & x"00000";
    elsif (clk'event and clk = '1') then
      y0_reg2 <= y0_reg1;
    end if;
  end process;

  y0_reg3_sm: process(clk, reset, y0_reg2)
  begin
    if (reset = '1') then
      y0 <= '0' & x"00000";
    elsif (clk'event and clk = '1') then
      y0 <= y0_reg2;
    end if;
  end process;

  -- 2 registers for tn
  tn_reg1_sm: process(clk, reset, tn_reg)
  begin
    if (reset = '1') then
      tn_reg1 <= '0' & x"00000";
    elsif (clk'event and clk = '1') then
      tn_reg1 <= tn_reg;
    end if;
  end process;

  tn_reg2_sm: process(clk, reset, tn_reg1)
  begin
    if (reset = '1') then
      tn <= '0' & x"00000";
    elsif (clk'event and clk = '1') then
      tn <= tn_reg1;
    end if;
  end process;

  -- 2 registers for x0
  x0_reg1_sm: process(clk, reset, x0_reg)
  begin
    if (reset = '1') then
      x0_reg1 <= '0' & x"00000";
    elsif (clk'event and clk = '1') then
      x0_reg1 <= x0_reg;
    end if;
  end process;

  x0_reg2_sm: process(clk, reset, x0_reg1)
  begin
    if (reset = '1') then
      x0 <= '0' & x"00000";
    elsif (clk'event and clk = '1') then
      x0 <= x0_reg1;
    end if;
  end process;

END behavioral;
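
The 21-bit multiplier constants in dct_pre can be checked against the cosine labels in the listing's comments. The following is a sketch that assumes the constants are two's-complement values with 20 fractional bits, a scaling inferred here from the numbers themselves rather than stated in the listing; $a = \pi/23$ follows the listing's comment, and the name $C_k$ is introduced for illustration.

```latex
C_k \approx \cos(k a) \cdot 2^{20}, \qquad a = \frac{\pi}{23}

% Worked example for the first constant (labelled cos11a):
\cos\!\left(\tfrac{11\pi}{23}\right) \approx 0.068242, \qquad
0.068242 \cdot 2^{20} \approx 71557
  = \texttt{b"000010001011110000101"}

% Effect of slicing bits 39..19 of the 42-bit product C_k \cdot v:
\left\lfloor \frac{C_k\, v}{2^{19}} \right\rfloor \approx 2 \cos(k a)\, v
```

Under that reading, the slice `xx2_mul(39 downto 19)` drops 19 rather than 20 low-order bits, so each stored `xx2` entry would correspond to $2\cos(ka)$ times the difference term. This is an inference from the bit positions, not something the listing states.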


-- file: dct_post.vhd
-- thesis project
-- Programmer: Xin Xiao
-- Date 02/01
--
-- 13 cepstral coefficients are calculated from the output of the Non-linear
-- Transformation block.

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_signed.ALL;
use work.dct_pack.all;

ENTITY dct_post IS PORT(
  clk         : in  std_logic;
  reset       : in  std_logic;
  sof_dct_pre : in  std_logic;
  y0          : in  std_logic_vector(20 downto 0);
  x0          : in  std_logic_vector(20 downto 0);
  tn          : in  std_logic_vector(20 downto 0);
  xx1         : in  dct_vector;
  xx2         : in  dct_vector;
  sof_dct     : out std_logic;
  dct_out     : out std_logic_vector(20 downto 0)
);
END dct_post;

ARCHITECTURE behavioral OF dct_post IS
  signal xx1_reg      : dct_vector;
  signal xx2_reg      : dct_vector;
  signal xx_reg       : dct_vector;
  signal sr_mux       : std_logic;
  signal sr_d         : dct_vector;
  signal sr_q         : dct_vector;
  signal sof_dct_reg  : std_logic;
  signal tt_reg       : dct_vector32;
  signal tt_reg1      : dct_vector;
  signal tt_add_reg   : std_logic_vector(20 downto 0);
  signal ar_ps        : ar_states;
  signal ar_ns        : ar_states;
  signal t_odd_reg    : dct_vector;
  signal t_even_reg   : dct_vector_even;
  signal t_even       : dct_vector_even;
  signal t_odd        : dct_vector;
  signal t_min        : std_logic_vector(20 downto 0);
  signal t_min1       : std_logic_vector(20 downto 0);
  signal t_min2       : std_logic_vector(20 downto 0);
  signal t_min_reg    : std_logic_vector(20 downto 0);
  signal sof_t        : std_logic;
  signal t2           : std_logic_vector(20 downto 0);
  signal y_add        : std_logic_vector(20 downto 0);
  signal y_add1       : std_logic_vector(20 downto 0);
  signal y_add2       : std_logic_vector(20 downto 0);
  signal y_mul1       : std_logic_vector(20 downto 0);
  signal y_mul2       : std_logic_vector(20 downto 0);
  signal y_mul        : std_logic_vector(41 downto 0);
  signal dct_out_reg  : std_logic_vector(20 downto 0);
  signal dct_ps       : dct_states;
  signal dct_ns       : dct_states;
  signal sof_pre_reg  : std_logic;
  signal sof_pre_reg1 : std_logic;
  signal sof_pre_reg2 : std_logic;
BEGIN

  -- Circular Shift Register -----------------------------------------
  sr: for i in 0 to 10 generate
  begin
    sr0: if (i = 0) generate
    begin
      sr0_sm: process(sr_q(10), xx_reg(i), sr_mux)
      begin
        if sr_mux = '0' then
          sr_d(i) <= sr_q(10);
        else
          sr_d(i) <= xx_reg(i);
        end if;
      end process;
    end generate;

    sr1: if (i /= 0) generate
    begin
      sr1_sm: process (sr_q(i - 1), xx_reg(i), sr_mux)
      begin
        if sr_mux = '0' then
          sr_d(i) <= sr_q(i - 1);
        else
          sr_d(i) <= xx_reg(i);
        end if;
      end process;
    end generate;

    sr_sm: process(clk, reset, sr_d)
    begin
      if reset = '1' then
        sr_q(i) <= '0' & x"00000";
      elsif (clk'event and clk = '1') then
        sr_q(i) <= sr_d(i);
      end if;
    end process;
  end generate;

  -- array multiplication
  array_mul: for i in 0 to 10 generate
  begin
    tt_reg(i) <= sr_q(i) * sr_con(i);

    t_sm: process(clk, reset, tt_reg(i))
    begin
      if reset = '1' then
        tt_reg1(i) <= '0' & x"00000";
      elsif (clk'event and clk = '1') then
        tt_reg1(i) <= tt_reg(i)(39 downto 19);
      end if;
    end process;
  end generate;

  -- array addition --------------------------------------------------
  array_add: process(clk, reset, tt_reg1)
    variable tt_add1 : std_logic_vector(20 downto 0);
    variable tt_add2 : std_logic_vector(20 downto 0);
    variable tt_add3 : std_logic_vector(20 downto 0);
    variable tt_add4 : std_logic_vector(20 downto 0);
    variable tt_add5 : std_logic_vector(20 downto 0);
    variable tt_add6 : std_logic_vector(20 downto 0);
    variable tt_add7 : std_logic_vector(20 downto 0);
    variable tt_add8 : std_logic_vector(20 downto 0);
    variable tt_add9 : std_logic_vector(20 downto 0);
    variable tt_add  : std_logic_vector(20 downto 0);
  begin
    tt_add1 := tt_reg1(0) + tt_reg1(1);
    tt_add2 := tt_reg1(2) + tt_reg1(3);
    tt_add3 := tt_reg1(4) + tt_reg1(5);
    tt_add4 := tt_reg1(6) + tt_reg1(7);
    tt_add5 := tt_reg1(8) + tt_reg1(9);
    tt_add6 := tt_add1 + tt_add2;
    tt_add7 := tt_add3 + tt_add4;
    tt_add8 := tt_add5 + tt_reg1(10);
    tt_add9 := tt_add6 + tt_add7;
    tt_add  := tt_add9 + tt_add8;
    if reset = '1' then
      tt_add_reg <= '0' & x"00000";
    elsif (clk'event and clk = '1') then
      tt_add_reg <= tt_add;
    end if;
  end process;

-- Array input and output control ----------------------------------

t_min <= t_min1 - t_min2;

ar_sm: process(clk, reset, tt_add_reg, sof_dct_pre, xx1_reg, xx2_reg, tn, x0,
               ar_ps, ar_ns, t_min, t_min1, t_min2, t_odd, t_even,
               t_min_reg, sof_pre_reg)
begin
    if reset = '1' then
        ar_ps <= ar_st0;
        t_even <= ('0' & x"00000", '0' & x"00000", '0' & x"00000",
                   '0' & x"00000", '0' & x"00000", '0' & x"00000",
                   '0' & x"00000", '0' & x"00000", '0' & x"00000",
                   '0' & x"00000", '0' & x"00000");
        t_odd  <= ('0' & x"00000", '0' & x"00000", '0' & x"00000",
                   '0' & x"00000", '0' & x"00000", '0' & x"00000",
                   '0' & x"00000", '0' & x"00000", '0' & x"00000",
                   '0' & x"00000", '0' & x"00000");
        t_min_reg <= '0' & x"00000";
    elsif (clk'event and clk = '1') then
        ar_ps <= ar_ns;
        t_even <= t_even_reg;
        t_odd <= t_odd_reg;
        t_min_reg <= t_min;
    end if;

    sof_t <= '0';
    t_min2 <= t_min_reg;
    case ar_ps is
        when ar_st0 =>
            if sof_pre_reg = '1' then
                ar_ns <= ar_st1;
            else
                ar_ns <= ar_st0;
            end if;
        when ar_st1 =>
            ar_ns <= ar_st2;
            t_odd_reg(10) <= tt_add_reg;
        when ar_st2 =>
            ar_ns <= ar_st3;
            t_odd_reg(5) <= tt_add_reg;
        when ar_st3 =>
            ar_ns <= ar_st4;
            t_odd_reg(2) <= tt_add_reg;
        when ar_st4 =>
            ar_ns <= ar_st5;
            t_odd_reg(9) <= tt_add_reg;
        when ar_st5 =>
            ar_ns <= ar_st6;
            t_odd_reg(4) <= tt_add_reg;
        when ar_st6 =>
            ar_ns <= ar_st7;
            t_odd_reg(8) <= tt_add_reg;
        when ar_st7 =>
            ar_ns <= ar_st8;
            t_odd_reg(6) <= tt_add_reg;
        when ar_st8 =>
            ar_ns <= ar_st9;
            t_odd_reg(7) <= tt_add_reg;
        when ar_st9 =>
            ar_ns <= ar_st10;
            t_odd_reg(3) <= tt_add_reg;
        when ar_st10 =>
            ar_ns <= ar_st11;
            t_odd_reg(1) <= tt_add_reg;
        when ar_st11 =>
            ar_ns <= ar_st12;
            t_odd_reg(0) <= tt_add_reg;
        when ar_st12 =>
            ar_ns <= ar_st13;
            t_even_reg(10) <= tt_add_reg;
            t_min1 <= t_odd(10);
            t_min2 <= tn(19 downto 0) & '0';
            t_odd_reg(10) <= t_min;
        when ar_st13 =>
            ar_ns <= ar_st14;
            t_even_reg(5) <= tt_add_reg;
            t_min1 <= t_odd(9);
            t_odd_reg(9) <= t_min;
        when ar_st14 =>
            ar_ns <= ar_st15;
            t_even_reg(2) <= tt_add_reg;
            t_min1 <= t_odd(8);
            t_odd_reg(8) <= t_min;
        when ar_st15 =>
            ar_ns <= ar_st16;
            t_even_reg(9) <= tt_add_reg;
            t_min1 <= t_odd(7);
            t_odd_reg(7) <= t_min;
        when ar_st16 =>
            ar_ns <= ar_st17;
            t_even_reg(4) <= tt_add_reg;
            t_min1 <= t_odd(6);
            t_odd_reg(6) <= t_min;
        when ar_st17 =>
            ar_ns <= ar_st18;
            t_even_reg(8) <= tt_add_reg;
            t_min1 <= t_odd(5);
            t_odd_reg(5) <= t_min;
        when ar_st18 =>
            ar_ns <= ar_st19;
            t_even_reg(6) <= tt_add_reg;
            t_min1 <= t_odd(4);
            t_odd_reg(4) <= t_min;
        when ar_st19 =>
            ar_ns <= ar_st20;
            t_even_reg(7) <= tt_add_reg;
            t_min1 <= t_odd(3);
            t_odd_reg(3) <= t_min;
        when ar_st20 =>
            ar_ns <= ar_st21;
            t_even_reg(3) <= tt_add_reg;
            t_min1 <= t_odd(2);
            t_odd_reg(2) <= t_min;
        when ar_st21 =>
            ar_ns <= ar_st22;
            t_even_reg(1) <= tt_add_reg;
            t_min1 <= t_odd(1);
            t_odd_reg(1) <= t_min;
        when ar_st22 =>
            ar_ns <= ar_st0;
            t_even_reg(0) <= tt_add_reg;
            t_min1 <= t_odd(0);
            t_odd_reg(0) <= t_min;
            sof_t <= '1';
    end case;
end process;


dctmuxsm: process(reset, sof_dct_reg, ar_ps)
begin
    if (reset = '1') then
        sr_mux <= '0';
    elsif sof_dct_reg = '1' then
        sr_mux <= '1';
    elsif ar_ps = ar_st9 then
        sr_mux <= '1';
    else
        sr_mux <= '0';
    end if;
end process;

dctdatamuxsm: process(reset, sof_dct_reg, xx1_reg, xx2_reg, ar_ps)
begin
    if reset = '1' then
        xx_reg <= ('0' & x"00000", '0' & x"00000", '0' & x"00000",
                   '0' & x"00000", '0' & x"00000", '0' & x"00000",
                   '0' & x"00000", '0' & x"00000", '0' & x"00000",
                   '0' & x"00000", '0' & x"00000");
    elsif sof_dct_reg = '1' then
        xx_reg <= xx2_reg;
    elsif ar_ps = ar_st9 then
        xx_reg <= xx1_reg;
    else
        xx_reg <= ('0' & x"00000", '0' & x"00000", '0' & x"00000",
                   '0' & x"00000", '0' & x"00000", '0' & x"00000",
                   '0' & x"00000", '0' & x"00000", '0' & x"00000",
                   '0' & x"00000", '0' & x"00000");
    end if;
end process;

-- out

y_mul  <= y_mul1 * y_mul2;
y_add  <= y_add1 + y_add2;
y_add1 <= x0;
y_add2 <= t2;
y_mul1 <= y_add;

out_sm: process(clk, reset, sof_t, t_even, t_odd, dct_ps, dct_ns, y_mul, y0)
begin
    if reset = '1' then
        dct_ps <= dct_st0;
        dct_out <= '0' & x"00000";
    elsif (clk'event and clk = '1') then
        dct_ps <= dct_ns;
        dct_out <= dct_out_reg;
    end if;

    dct_out_reg <= y_mul(40 downto 20);
    sof_dct <= '0';
    case dct_ps is
        when dct_st0 =>
            if sof_t = '1' then
                dct_ns <= dct_st1;
            else
                dct_ns <= dct_st0;
            end if;
            dct_out_reg <= '0' & x"00000";
        when dct_st1 =>
            sof_dct <= '1';
            dct_ns <= dct_st2;
            dct_out_reg <= y0;
        when dct_st2 =>
            dct_ns <= dct_st3;
            t2 <= t_odd(0);
            y_mul2 <= "011111101100111011001"; -- cos1a
        when dct_st3 =>
            dct_ns <= dct_st4;
            t2 <= t_even(0);
            y_mul2 <= "011110110100000011011"; -- cos2a
        when dct_st4 =>
            dct_ns <= dct_st5;
            t2 <= t_odd(1);
            y_mul2 <= "011101010110011100101"; -- cos3a
        when dct_st5 =>
            dct_ns <= dct_st6;
            t2 <= t_even(1);
            y_mul2 <= "011011010101110110011"; -- cos4a
        when dct_st6 =>
            dct_ns <= dct_st7;
            t2 <= t_odd(2);
            y_mul2 <= "011000110100101010000"; -- cos5a
        when dct_st7 =>
            dct_ns <= dct_st8;
            t2 <= t_even(2);
            y_mul2 <= "010101110101110111100"; -- cos6a
        when dct_st8 =>
            dct_ns <= dct_st9;
            t2 <= t_odd(3);
            y_mul2 <= "010010011101000010101"; -- cos7a
        when dct_st9 =>
            dct_ns <= dct_st10;
            t2 <= t_even(3);
            y_mul2 <= "001110101110001101101"; -- cos8a
        when dct_st10 =>
            dct_ns <= dct_st11;
            t2 <= t_odd(4);
            y_mul2 <= "001010101101110101010"; -- cos9a
        when dct_st11 =>
            dct_ns <= dct_st12;
            t2 <= t_even(4);
            y_mul2 <= "000110100000101011011"; -- cos10a
        when dct_st12 =>
            dct_ns <= dct_st13;
            t2 <= t_odd(5);
            y_mul2 <= "000010001011110000101"; -- cos11a
        when dct_st13 =>
            dct_ns <= dct_st0;
            t2 <= t_even(5);
            y_mul2 <= "111101110100001111010"; -- cos12a
    end case;
end process;

-- Registers

-- input sof registers

sofpresm1: process(clk, reset, sof_dct_pre)
begin
    if reset = '1' then
        sof_pre_reg1 <= '0';
    elsif (clk'event and clk = '1') then
        sof_pre_reg1 <= sof_dct_pre;
    end if;
end process;

sofpresm2: process(clk, reset, sof_pre_reg1)
begin
    if reset = '1' then
        sof_pre_reg2 <= '0';
    elsif (clk'event and clk = '1') then
        sof_pre_reg2 <= sof_pre_reg1;
    end if;
end process;

sofpresm3: process(clk, reset, sof_pre_reg2)
begin
    if reset = '1' then
        sof_pre_reg <= '0';
    elsif (clk'event and clk = '1') then
        sof_pre_reg <= sof_pre_reg2;
    end if;
end process;

xx1regsm: process(clk, reset, sof_dct_pre, xx1)
begin
    if reset = '1' then
        xx1_reg <= ('0' & x"00000", '0' & x"00000", '0' & x"00000",
                    '0' & x"00000", '0' & x"00000", '0' & x"00000",
                    '0' & x"00000", '0' & x"00000", '0' & x"00000",
                    '0' & x"00000", '0' & x"00000");
    elsif (clk'event and clk = '1') then
        if (sof_dct_pre = '1') then
            xx1_reg <= xx1;
        end if;
    end if;
end process;

xx2regsm: process(clk, reset, sof_dct_pre, xx2)
begin
    if reset = '1' then
        xx2_reg <= ('0' & x"00000", '0' & x"00000", '0' & x"00000",
                    '0' & x"00000", '0' & x"00000", '0' & x"00000",
                    '0' & x"00000", '0' & x"00000", '0' & x"00000",
                    '0' & x"00000", '0' & x"00000");
    elsif (clk'event and clk = '1') then
        if (sof_dct_pre = '1') then
            xx2_reg <= xx2;
        end if;
    end if;
end process;

sofdctsm: process(clk, reset, sof_dct_pre)
begin
    if reset = '1' then
        sof_dct_reg <= '0';
    elsif (clk'event and clk = '1') then
        sof_dct_reg <= sof_dct_pre;
    end if;
end process;

END behavioral;
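
The comments cos1a to cos12a on the multiplier constants in the output state machine suggest samples of cos(k*a). As a cross-check, the simulation-only sketch below assumes a = pi/23 and a signed word with 20 fractional bits; neither is stated in the listing, although the stored values appear consistent with both. It asserts that each stored word lies within four LSBs of the ideal cosine. The entity name dct_cos_check, the table name COS_ROM, and the use of ieee.numeric_std and ieee.math_real are assumptions made for illustration and are not part of the original design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use ieee.math_real.all;

-- Hypothetical stand-alone check; not synthesizable, simulation only.
entity dct_cos_check is
end entity dct_cos_check;

architecture sim of dct_cos_check is
    type cos_table is array (1 to 12) of std_logic_vector(20 downto 0);

    -- The twelve y_mul2 constants from the output state machine, in order.
    constant COS_ROM : cos_table := (
        "011111101100111011001",  -- cos1a
        "011110110100000011011",  -- cos2a
        "011101010110011100101",  -- cos3a
        "011011010101110110011",  -- cos4a
        "011000110100101010000",  -- cos5a
        "010101110101110111100",  -- cos6a
        "010010011101000010101",  -- cos7a
        "001110101110001101101",  -- cos8a
        "001010101101110101010",  -- cos9a
        "000110100000101011011",  -- cos10a
        "000010001011110000101",  -- cos11a
        "111101110100001111010"   -- cos12a
    );
begin
    check: process
        variable ideal  : real;
        variable stored : real;
    begin
        for k in 1 to 12 loop
            -- Assumption: a = pi/23.
            ideal  := cos(real(k) * MATH_PI / 23.0);
            -- Assumption: two's complement with 20 fractional bits.
            stored := real(to_integer(signed(COS_ROM(k)))) / 2.0**20;
            assert abs(stored - ideal) < 2.0**(-18)
                report "cos" & integer'image(k) & "a differs from cos(k*pi/23)"
                severity error;
        end loop;
        wait;  -- stop after a single pass
    end process;
end architecture sim;
```

Running the entity in a VHDL-93 simulator for a single delta cycle is sufficient; if the two assumptions hold, no assertion should be reported.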