VLSI BASED RECONFIGURABLE ARCHITECTURE FOR MOBILE ADHOC NETWORK

Thesis submitted in partial fulfilment for the award of the degree of

DOCTOR OF PHILOSOPHY

By

S. CHANDRIKA

Under the Guidance of

Dr. R. RANI HEMAMALINI

Professor & Head,

ST.PETER’S COLLEGE OF ENGINEERING AND TECHNOLOGY, CHENNAI

VINAYAKA MISSIONS UNIVERSITY SALEM, TAMIL NADU, INDIA

APRIL 2016

CERTIFICATE BY THE SUPERVISOR

I certify that the thesis entitled “VLSI BASED RECONFIGURABLE ARCHITECTURE FOR MOBILE ADHOC NETWORK” submitted for the Degree of Doctor of Philosophy by Mrs. S. CHANDRIKA is the record of research work carried out by her during the period from April 2010 to April 2016 under my guidance and supervision, and that this work has not formed the basis for the award of any degree, diploma, associate-ship, fellowship or title in this University or any other University or Institution of higher learning.

Signature of the Supervisor with designation

Place:

Date:

DECLARATION

I declare that the thesis entitled “VLSI BASED RECONFIGURABLE ARCHITECTURE FOR MOBILE ADHOC NETWORK” submitted by me for the Degree of Doctor of Philosophy is the record of research work carried out by me during the period from April 2010 to April 2016 under the guidance of Dr. R. Rani Hemamalini, Professor & Head, Electronics and Communication Department, St.Peter’s College of Engineering and Technology, Chennai, and that this work has not formed the basis for the award of any degree, diploma, associate-ship, fellowship or title in this University or any other University or Institution of higher learning.

Signature of the Candidate

Place:

Date:

ACKNOWLEDGEMENT

I am thankful to the Chancellor Prof.Dr.A.Shanmugasundaram,

Vice Chancellor Prof.Dr.V.R.Rajendran and Prof.Dr.K.Rajendran,

Dean Research, Vinayaka Missions University, Salem who have

extended their co-operation for the several phases of my research

work.

I extend my sincere gratitude to my supervisor Dr.R.Rani

Hemamalini, Professor & Head, Electronics and Communication

Department, St.Peter’s College of Engineering and Technology,

Chennai for providing the vital, inspirational, formative and

constructive thought process which facilitated my research work. I

gratefully acknowledge her guiding insight towards the articulation

and execution of this scholarly enterprise.

I express my sincere thanks to Dr.G.Gunasekaran, Principal,

Meenakshi College of Engineering, Chennai for having extended

his fullest co-operation and constant encouragement throughout

the span of this work.

The timely support rendered by my family members is duly

acknowledged.

S. Chandrika

ABSTRACT

A Mobile Ad-hoc Network (MANET) is a self-configuring, infrastructure-less, independent network for wireless mobile telecommunication applications. Minimal configuration and quick deployment make a MANET suitable for emergency situations such as natural disasters and military conflicts. Dynamically reconfigurable hardware architecture enables ad-hoc networks to be formed quickly. Orthogonal Frequency Division Multiplexing (OFDM) is one of the key architectures used in MANET to establish data communication between source and destination. To achieve high-speed data communication with OFDM, reduced-complexity OFDM architectures are required. Hence, a Very Large Scale Integration (VLSI) system design environment is used to design high-speed data communication architectures.

An OFDM system consists of a frequency transformation (Fast Fourier Transformation (FFT)) and channel encoding/decoding for establishing the data communication process. To transfer the original information signal over a long distance, frequency transformation is an essential part of every data communication process. In this work, an efficient VLSI-based architecture for frequency transformation is proposed. The proposed frequency transformation technique is the Radix-2 Single-path Delay Feedback (R2SDF) FFT. The FFT technique is used to convert a time-domain signal into a frequency-domain signal and vice versa. The Processing Element and Complex Multiplication architectures of the R2SDF FFT have been modified to improve area and speed performance.

Additive White Gaussian Noise (AWGN) is one of the most important noise sources; it is added to the data as it passes through the channel and is seen at the receiver section of the data communication process. To reduce the effect of this noise, an efficient architecture for a channel decoder, the Hamming decoder, is proposed. In practice, radiation may affect more than one bit during data transmission in a channel. Hence, a Single Error Correction and Triple Adjacent Error Detection (SEC-TAED) Hamming code is designed in this research work. The proposed SEC-TAED Hamming code helps to improve the detection probability of the error detection and correction process. The simulation results of the proposed FFT and decoding blocks are validated using ModelSim 6.3c. Synthesis performance is evaluated with the Xilinx 10.1i design tool. Hence, the proposed reconfigurable architecture, comprising the FFT and decoding blocks, helps to increase the efficiency of MANET in terms of area, delay and power.

LIST OF CONTENTS

CHAPTER NO. TITLE

Abstract
List of Tables
List of Figures
List of Symbols and Abbreviations

1 MOBILE ADHOC NETWORK
1.1 Introduction to Mobile ADHOC Networks
1.2 Orthogonal Frequency Division Multiplexing (OFDM) in Mobile Ad-hoc Network (MANET)
1.3 Architectural Approach of OFDM
1.4 OFDM Systems based on Encoding/Decoding
1.5 Challenges in VLSI
1.6 Stages in VLSI Design
1.7 Performance Measurements of VLSI
1.8 Need for this study
1.9 Objectives
1.10 Methodology adopted

2 REVIEW OF LITERATURES
2.1 Overview of OFDM
2.2 Architectural Oriented OFDM Model
2.3 Architectural Oriented FFT Models
2.3.1 Complex Multiplier Design of FFT Models
2.4 Hamming Error Correction Codes
Summary

3 HIGH SPEED PIPELINED BASED 64-POINT RADIX-2 SINGLE PATH DELAY FEEDBACK (R2SDF) FFT
3.1 FFT in MANET
3.2 FFT Algorithm
3.2.1 Discrete and Fast Fourier Transformation Techniques
3.2.2 Properties of Twiddle Factor
3.2.3 FDM Systems based on FFT and Modulation
3.2.4 Digital Signal Processors for FFT Computation
3.3 Radix-2 Single path Delay Feedback (R2SDF) FFT
3.4 Pipelined Processing Element (PE) Structures for R2SDF FFT
3.5 Design of Reduced Complex Multiplier
3.5.1 Bit Parallel Multiplier for 1/
3.5.2 Design of Complex Multiplier
3.5.3 Reduced Complex Multiplier design
3.6 Design of Radix-2 Single path Delay Feedback (R2SDF) FFT for 64-point
Summary

4 HAMMING SINGLE ERROR CORRECTION – TRIPLE ADJACENT ERROR DETECTION CODE ALGORITHM FOR DATA COMMUNICATION
4.1 Introduction to Error Detection and Correction (EDC) Codes
4.2 Hamming Error Detection and Correction Mechanism
4.2.1 Single Error Correction (SEC) Hamming Codes
4.2.2 Single Error Correction – Double Adjacent Error Detection (SEC-DAED) Hamming Code
4.2.3 Single Error Correction – Triple Adjacent Error Detection (SEC-TAED)
4.3 Proposed Extended (12, 8) Hamming Code for SEC-TAED
Summary

5 RESULTS AND DISCUSSIONS
5.1 Synthesis Result of Pipeline Based Processing Element Structures
5.2 Synthesis Result of Reduced Complex Multiplier
5.3 Synthesis Result of Pipelined PEs and Reduced Complex Multiplier Based R2SDF FFT
5.4 Simulation Result of Proposed Hamming (12, 8) SEC-TAED Codes

6 CONCLUSION
6.1 SUMMARY OF THE THESIS
6.2 FUTURE WORK

References
List of Publication

LIST OF TABLES

TABLE NO. TITLE

3.1 Tabulation for number of complex multiplications and complex additions required for Radix-2 16 point FFT
3.2 Twiddle factor values for 64-point FFT
4.1 Calculation of parity bits for (7, 4) hamming code
4.2 Double Adjacent Error Detection for Hamming (12, 8)
4.3 Triple Adjacent Error Detection for Hamming (13, 8)
4.4 Triple Adjacent Error Detection for Proposed Hamming (12, 8)
5.1 Comparison of area and delay for proposed pipeline based Processing Elements
5.2 Comparison of area and delay between traditional complex multiplier and proposed reduced complex multiplier
5.3 Comparison of area and delay between traditional and proposed R2SDF FFT architectures

LIST OF FIGURES

FIGURE NO. TITLE

1.1 Single hop ad hoc network
1.2 Multi-hop ad hoc network
1.3 Physical layer model of MANET
1.4 Block diagram of OFDM System
3.1 Butterfly Structure for 2-point DIT FFT
3.2 Butterfly structure for 2-point DIF FFT
3.3 Radix-2 DIF-FFT structure for 8-point
3.4 The partial twiddle factor for N-point DFT
3.5 Block diagram for Multi-carrier Modulation Scheme
3.6 Spectrum of overlapping of sub-carriers in OFDM
3.7 OFDM System based on FFT
3.8 General Purpose Programmable DSP Processor
3.9 Programmable specific FFT processor
3.10 Butterfly structure for R2SDF FFT
3.11 Symbolic representation of R2SDF FFT
3.12 Structure of Radix-2 8-point Single-path Delay Feedback (R2SDF) FFT
3.13 Structure of Radix-2 Single-path Delay Feedback (R2SDF) FFT
3.14 Block diagram of PE3 structure for 64-point FFT
3.15 Block diagram of PE2 structure for 64-point FFT
3.16 Block diagram of PE1 structure for 64 point FFT
3.17 Block diagram of Pipelined PE3 structure for 64 point FFT
3.18 Block diagram of Pipelined PE2 structure for 64 point FFT
3.19 Block diagram of Pipelined PE1 structure for 64 point FFT
3.20 Circuit diagram of the bit-parallel multiplication by 1/
3.21 Circuit diagram of reduced bit parallel multiplier for 1/
3.22 Butterfly structure of FFT with the help of bit parallel multiplier
3.23 Structure of Complex Multiplier
3.24 Structure of Proposed Reduced Complex Multiplier
3.25 Architecture of 64-point R2SDF FFT
4.1 Classification of Error Detection and Correction (EDC) codes
4.2 Parity bit calculation
4.3 Flow chart of Bit Placement Strategy
5.1 Synthesis result of PE1 to determine the Slice and LUT utilization
5.4 Synthesis result of Pipelined PE1 to determine the Slice and LUT utilization
5.5 Synthesis result of Pipelined PE2 to determine the Slice and LUT utilization
5.7 Performances of proposed pipelined PE1, PE2 and PE3 processors
5.8 Synthesis result of proposed reduced complex multiplier to determine the Slice and LUT utilization
5.9 Synthesis result of proposed reduced complex multiplier to determine the delay consumption
5.10 Synthesis result of Proposed R2SDF FFT by using Pipelined PEs and Reduced Complex Multiplier to determine the Slice and LUT utilization
5.11 Synthesis result of Proposed R2SDF FFT by using Pipelined PEs and Reduced Complex Multiplier to determine the delay consumption
5.12 Performances of proposed and traditional R2SDF FFT
5.13 Simulation result of proposed hamming (12, 8) SEC-TAED code
5.14 Simulation result of hamming (12, 8) SEC-TAED error-less data transmission: Status displayed as “No error”
5.15 Simulation result of hamming (12, 8) SEC-TAED code with single bit flipping: Status displayed as “SEC”

LIST OF SYMBOLS AND ABBREVIATIONS

ADC - Analog to Digital Converter

ADSL - Asymmetric Digital Subscriber Line

ANN - Artificial Neural Network

ASIC - Application Specific Integrated Circuits

AVD - Adaptive Viterbi Decoder

BER - Bit Error Rate

BPSK - Binary Phase Shift Keying

CFFT - Complex Fast Fourier Transform

CLB - Configurable Logic Block

CMA - Cached-Memory Architecture

CP - Cyclic Prefix

CPI - Clocks Per Instruction

CPU - Central Processing Unit

CRC - Cyclic Redundancy Check

DAB - Digital Audio Broadcast

DAC - Digital to Analog Converter

DAED - Double Adjacent Error Detection

DBPSK - Differential Binary Phase Shift Keying

DFT - Discrete Fourier Transform

DIF-FFT - Decimation in Frequency Fast Fourier Transform

DIT-FFT - Decimation in Time Fast Fourier Transform

DQPSK - Differential Quadrature Phase Shift Keying

DSP - Digital signal processor

DVB - Digital Video Broadcasting

DWT - Discrete Wavelet Transform

ECC - Error Correcting Codes

EDC - Error Detection and Correction

FB - Feedback

FDMA - Frequency Division Multiple Access

FEC - Forward Error Correction

FF - Feedforward

FFT - Fast Fourier Transform

Fig. - Figure

FPGA - Field programmable gate array

GSM - Global System for Mobile Communication

HDL - Hardware Description Language

IC - Integrated Circuits

ICI - Inter-Carrier Interference

IDWT - Inverse Discrete Wavelet Transform

IEEE - Institute of Electrical and Electronics Engineers

IFFT - Inverse Fast Fourier Transform

IOB - Input/output blocks

IoT - Internet of Things

IPC - Instructions Per Cycle

ISI - Inter-Symbol Interference

LDPC - Low Density Parity Check

LED - Light Emitting Diode

LTE - Long Term Evolution

LUT - Look up table

MAC - Multiplication and Accumulation

MANET - Mobile Ad-hoc Network

MBU - Multiple Bit Upset

MCM - Multi-chip Module

MCM - Multiple Constant Multiplications

MFCC - Mel Frequency Cepstral Coefficient

MIMO - Multiple-Input Multiple-Output

MIPS - Million Instructions Per Second

MMSE - Minimum Mean Square Error

MOS - Metal Oxide Semiconductor

MVS - Multiple Virtual Storage

OFDM - Orthogonal Frequency Division Multiplexing

OOK - On-Off-Keying

PAPR - Peak-to-Average Power Ratio

PC - Program Counter

PCB - Printed Circuit Board

PE - Processing Element

PSK - Phase Shift Keying

PTS - Partial Transmit Sequence

QAM - Quadrature Amplitude Modulation

QPSK - Quadrature Phase Shift Keying

R22MDC - Radix-2² Feedforward Multipath Delay Commutator

R2MDC - Radix-2 Multipath Delay Commutator


R2SDF - Radix-2 Single-path Delay Feedback


R4SDF - Radix-4 Single-path Delay Feedback

RAM - Random Access Memory

RF - Radio Frequency

RFFT - Real-valued-Fast Fourier Transform

ROM - Read Only Memory

SCA - Software Communication Architecture

SCs - Sub-Carriers

SDC - Silent Data Corruption

SDC - Single Delay Commutator

SDR - Software Defined Radio

SEC - Single Error Correction

SMU - Survivor Memory Unit

SNR - Signal to Noise Ratio

SoC - System on Chip

SRAM - Static Random Access Memory

TAED - Triple Adjacent Error Detection

TDMA - Time Division Multiple Access

TTM - Time to Market

TTV - Time to Volume

ULSI - Ultra Large Scale Integration

USRP - Universal Software Radio Peripheral

VLSI - Very Large Scale Integration

VMS - Virtual Memory System

Wi-Fi - Wireless Fidelity

WLAN - Wireless Local Area Network

WPT - Wave Pipelining Technique

WSS - Wide Spread Spectrum

ZF - Zero Forcing

CHAPTER 1

MOBILE ADHOC NETWORK

1.1 Introduction to Mobile ADHOC Networks

A Mobile Ad-hoc Network (MANET) is a self-configuring, infrastructure-less, independent network for wireless mobile telecommunication applications. Minimal configuration and quick deployment make a MANET suitable for emergency situations such as natural disasters and military conflicts. Dynamically reconfigurable hardware architecture enables ad-hoc networks to be formed quickly. In this research work, MANET is reconfigured through Very Large Scale Integration (VLSI) system design implementation. Orthogonal Frequency Division Multiplexing (OFDM) is one of the architectures used in MANET. OFDM consists of the Fast Fourier Transformation (FFT) technique and a channel encoding/decoding block for performing the data communication process. The simulation results are validated using ModelSim 6.3c. Synthesis performance is evaluated with the Xilinx 10.1i design tool. Hence, the proposed reconfigurable architecture, comprising the FFT and decoding blocks, helps to increase the efficiency of MANET in terms of area, delay and power.

1.2 Orthogonal Frequency Division Multiplexing (OFDM) in

Mobile Ad-hoc Network (MANET)

OFDM is one of the emerging techniques in wireless local area networks and is targeted at ad hoc networks. OFDM can be exploited in MANET to improve energy and speed performance. Mobile nodes in a MANET communicate directly over wireless links when they are within radio range of each other. If the destination mobile node is out of range, other nodes between the source and the destination act as routers and relay the information. Such a network is referred to as a multi-hop ad hoc network. Single-hop and multi-hop ad hoc networks are illustrated in Figure 1.1 and Figure 1.2.

Figure 1.1 Single hop ad hoc network (nodes shown: Source, Destination 1, Destination 2, Destination 3)

Figure 1.2 Multi-hop ad hoc network (nodes shown: Source, Node (Router), Destination)

Mobile ad hoc networks are self-configurable, infrastructure-less networks consisting of mobile devices and routers that support mobility and organize themselves arbitrarily. They require an extremely flexible technology for establishing communication between source and destination nodes.

Mobile environments pose several challenges: the limitations of the wireless network, variable-capacity links, data loss due to transmission errors, limited communication bandwidth, frequent disconnections or partitions, and the broadcast nature of the communications. Mobility imposes further limitations through dynamically changing routers or topologies that lack mobility awareness, as discussed by Malik Nasereldin Ahmed [26]. The limitations of mobile devices, such as capacity and battery lifetime, create additional problems for transmission.

OFDM is both a multiplexing technique and a modulation technique. It is a multi-carrier transmission technique in which a single high-rate data stream is divided into a number of lower-rate streams that are transmitted simultaneously over narrow sub-channels. OFDM helps to avoid Inter-Symbol Interference (ISI), Inter-Carrier Interference (ICI) and faulty transmissions between source and destination nodes. In addition, OFDM is used to improve the performance of MANET in terms of energy, power consumption, time consumption and throughput for the transmission of information between source and destination. OFDM is combined with MANET to improve efficiency in the various aspects described by Malik Nasereldin Ahmed [26] and Abdeldime M.S. Abdelgader [1]. The physical layer model of MANET is illustrated in Figure 1.3, which shows how OFDM contributes to MANET. In OFDM, most approaches to combating ISI and ICI rely on interference cancellation and frequency synchronization. OFDM reduces equalization complexity by using the Inverse Fast Fourier Transform (IFFT) at the transmitter and the Fast Fourier Transform (FFT) at the receiver, which converts the wide-band signal into N narrow-band flat-fading signals, as explained in Moose, Paul S S [38], Yang, Hongwei [56] and Malik Nasereldin Ahmed [26].

Figure 1.3 Physical layer model of MANET

From the above description, it is clear that OFDM is a promising technique for supporting co-operative transmission in MANET. To achieve a large diversity gain for combating frequency-selective and fast time-varying fading, and to tolerate imperfect synchronization among different mobile nodes, an asynchronous cooperative transmission scheme is developed in Quan Yu [41] using distributed unitary space-frequency coded OFDM (USFC-OFDM). This scheme provides better reliability and robustness and has a lower decoding complexity, with no need for channel estimation.

1.3 Architectural Approach of OFDM

The structure of the OFDM transmitter and receiver is illustrated in Figure 1.4. The encoder of the OFDM transmitter and the decoder of the OFDM receiver act as the channel encoder and channel decoder, which convert the source signals into a set of binary information. The Fast Fourier Transform is a key technique in OFDM; it is implemented on both the sender side and the receiver side for efficient communication within a narrow bandwidth. The FFT is used to convert time-domain signals into frequency-domain signals and vice versa.

Figure 1.4 Block diagram of OFDM System (OFDM Transmitter blocks: Serial Data Source, Channel Encoder, Serial to Parallel, IFFT, Parallel to Serial, Cyclic Prefix, Channel; OFDM Receiver blocks: Channel, Remove Cyclic Prefix, Serial to Parallel, FFT, Parallel to Serial, Channel Decoder, Serial Data Source)

A convolutional encoder is commonly used as the channel encoder in wireless transmission techniques. The encoder performs a convolution of the input stream with its impulse response. The serial-to-parallel unit converts the serial output of the channel encoder into parallel form so that all inputs reach the IFFT unit at the same time. The IFFT converts the frequency-domain signals into time-domain signals. The parallel output of the IFFT processor is then converted back into serial form. A cyclic prefix is used to eliminate ISI and ICI by adding a guard interval, that is, by prefixing samples to each transmitted symbol. On the receiver side of the OFDM system, the reverse operations are performed to retrieve the original information signal.
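The transmit and receive chain described above can be sketched in a few lines of Python. This is only an illustrative model, not the hardware architecture of this thesis: the function names (dft, ofdm_transmit, channel, ofdm_receive), the 8 sub-carriers, the QPSK symbols and the two-tap noiseless channel are all assumptions made for the example, and a naive DFT stands in for the FFT/IFFT blocks. It shows that, when the cyclic prefix is at least as long as the channel memory, each sub-carrier can be equalized with a single division.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(N^2) DFT; the inverse includes the 1/N scaling."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(acc / n if inverse else acc)
    return out

def ofdm_transmit(symbols, cp_len):
    """IDFT of the sub-carrier symbols, then prepend the last cp_len samples (cp_len >= 1)."""
    time = dft(symbols, inverse=True)
    return time[-cp_len:] + time

def channel(samples, taps):
    """Linear convolution with a short multipath channel (hypothetical, no noise)."""
    out = []
    for i in range(len(samples)):
        out.append(sum(taps[j] * samples[i - j] for j in range(len(taps)) if i - j >= 0))
    return out

def ofdm_receive(samples, cp_len, taps, n):
    """Drop the prefix, take the DFT and equalize each sub-carrier with one division."""
    freq = dft(samples[cp_len:cp_len + n])
    h = dft(list(taps) + [0] * (n - len(taps)))  # channel response per sub-carrier
    return [freq[k] / h[k] for k in range(n)]

# Assumed test data: 8 QPSK symbols and a two-tap channel.
qpsk = [1+1j, 1-1j, -1+1j, -1-1j, 1+1j, -1-1j, 1-1j, -1+1j]
taps = [1.0, 0.5]
received = ofdm_receive(channel(ofdm_transmit(qpsk, 2), taps), 2, taps, len(qpsk))
recovered = [complex(round(v.real), round(v.imag)) for v in received]
```

In this sketch the receiver is given the channel taps directly; a practical receiver would have to estimate them.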

Robustness to multipath fading channels and the capability for parallel/pipelined signal processing make OFDM a promising technique for next-generation wide-band communication systems. The modulation and demodulation of an OFDM system can be implemented efficiently with the IFFT and FFT. OFDM-based communication systems need high performance in both power consumption and throughput, and this requirement can be met by an efficient IFFT/FFT implementation. This thesis addresses the problem of designing an efficient application-specific FFT processor for OFDM-based wide-band communication systems. The functionality of the OFDM scheme is represented in Figure 1.4, which indicates the digital implementation of OFDM modulators/demodulators based on the Discrete Fourier Transform (DFT). With regard to area occupancy and power consumption, an FPGA implementation of the FFT is considered here to be a better solution than an Application-Specific Integrated Circuit (ASIC) implementation.

1.4 OFDM Systems based on Encoding/Decoding

An OFDM system contains a channel encoder and a channel decoder for encoding and decoding. A source encoder and a source decoder convert analog signals into digital signals and digital signals into analog signals, respectively. The purpose of the channel encoder and decoder in an OFDM system is to transmit multiple discrete signals over a single channel. Two types of binary codes are available for the channel encoder and channel decoder of OFDM systems: block codes and convolutional codes.

Block codes include both linear and cyclic codes. Linear block codes are error correction codes that encode data in blocks; any linear combination of code words is itself a code word. Linear codes are used in forward error correction and in transmitting symbols (i.e. bits) on a communication channel. A cyclic code is a block code in which every cyclic shift of a code word is another code word belonging to that code. Convolutional codes, on the other hand, are among the best codes for the encoding part.
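The cyclic-shift property stated above can be checked on a small example. The sketch below assumes the length-7 cyclic code generated by g(x) = x^3 + x + 1 (a standard textbook choice, not a code taken from this thesis) and stores polynomials over GF(2) as integers, with bit i holding the coefficient of x^i. The helper names are hypothetical.

```python
def poly_mul_gf2(a, b):
    """Multiply two GF(2) polynomials held as integers (bit i = coefficient of x^i)."""
    result = 0
    while b:
        if b & 1:
            result ^= a      # addition over GF(2) is XOR
        a <<= 1
        b >>= 1
    return result

def cyclic_shift(word, n):
    """Rotate an n-bit word left by one position."""
    return ((word << 1) | (word >> (n - 1))) & ((1 << n) - 1)

N = 7
GENERATOR = 0b1011  # g(x) = x^3 + x + 1, which divides x^7 + 1

# Every 4-bit message polynomial times g(x) gives one 7-bit code word.
codewords = {poly_mul_gf2(m, GENERATOR) for m in range(16)}

# The code is cyclic if rotating any code word yields another code word.
closed = all(cyclic_shift(w, N) in codewords for w in codewords)
```

Here closed evaluates to True: the 16 code words are closed under rotation, which is what makes the code cyclic.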

Various decoders are available to decode the digital inputs, such as the Hamming decoder, the Viterbi decoder, the Adaptive Viterbi Decoder (AVD), the Low Density Parity Check (LDPC) decoder, the Cyclic Redundancy Check (CRC) decoder, the Bose, Ray-Chaudhuri, Hocquenghem (BCH) decoder and the Reed Solomon decoder. Apart from the Viterbi decoders, which decode convolutional codes, these operate on block codes, which may be linear or cyclic error correction codes. Among these encoders and decoders, Hamming encoders and decoders are the most suitable Error Correcting Coding (ECC) technique for Very Large Scale Integration (VLSI) implementation.
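As a small illustration of why Hamming codes map well to simple logic, the sketch below implements the standard (7, 4) Hamming code using XOR operations only. It is the textbook single-error-correcting code, not the extended (12, 8) SEC-TAED code proposed later in this thesis; the function names and the bit ordering (parity bits in positions 1, 2 and 4) are assumptions of the example.

```python
def hamming74_encode(data):
    """data: four bits [d1, d2, d3, d4]; returns seven bits for positions 1..7."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4      # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4      # parity over positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4      # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]

def hamming74_decode(code):
    """Returns (data bits, syndrome); a nonzero syndrome is the 1-based position flipped back."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4
    if syndrome:
        c[syndrome - 1] ^= 1   # correct the single bit the syndrome points at
    return [c[2], c[4], c[5], c[6]], syndrome

message = [1, 0, 1, 1]
sent = hamming74_encode(message)

# Single-bit error at position 5: the syndrome locates and corrects it.
one_error = list(sent)
one_error[4] ^= 1
decoded_one, syndrome_one = hamming74_decode(one_error)

# Two adjacent errors at positions 5 and 6: the decoder "corrects" a third bit.
two_errors = list(sent)
two_errors[4] ^= 1
two_errors[5] ^= 1
decoded_two, syndrome_two = hamming74_decode(two_errors)
```

With a single flipped bit the message is recovered. With two adjacent flipped bits the plain (7, 4) decoder silently returns wrong data, which is the kind of failure that adjacent-error-detecting codes are meant to flag.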

1.5 Challenges in VLSI

Process variation: When lithographic techniques are used in IC fabrication, it is difficult to maintain the accuracy of doping concentrations. The fabricated wires are prone to errors in their geometrical dimensions and electrical characteristics.

Strict design rules: IC scaling creates problems in the lithographic and etching processes. As a result, the design rules for IC layout become tedious. The situation is worse for custom integrated circuits. Hence designers move to automated layout tools. The big disadvantage of automated tools is that they do not produce an efficient layout; squeezing out the last bit of available area is possible only with manual layout.

Timing: The clock frequency keeps scaling up with the advent of new fabrication techniques. This leads to skew in the clock signal distributed across the chip. To overcome this problem, multi-core and multiprocessor chips are fabricated. The functionality of a single-core processor at a high clock frequency can be achieved with a multi-core processor at a lower frequency.

Success rate: The die size keeps shrinking as fabrication technology improves, and the wafer size keeps going up because of the lower manufacturing cost. The mask price becomes higher as technology scales down. Because the mask involves a high non-recurring cost, first-pass silicon success must be assured without several spin cycles to find errors in silicon. Several new design philosophies have to be developed to meet this requirement.

1.6 Stages in VLSI Design

Schematic Entry: The circuit is realized in the form of a netlist in this step. A netlist is built from gates, transistors and interconnects. The behaviour of the designed circuit is checked by simulation.

Physical Design: The netlist is converted into its geometrical representation with the help of predefined fixed rules called lambda rules. The result of this step is called a layout. This step is further divided into sub-steps: circuit partitioning, floor planning and placement, routing, layout compaction, and extraction and verification.

Logic Verification and Implementation: After physical design and routing, the logic of the design must be verified using suitable simulation tools, and the design is implemented on a suitable Field Programmable Gate Array (FPGA) board.

Packaging: The chips are put together on a Multi Chip Module (MCM) or a Printed Circuit Board (PCB) to obtain the final implemented design.

1.7 Performance Measurements of VLSI

VLSI design is the process of finding an optimal point in a multidimensional space. The obvious trade-offs in VLSI design are hardware utilization, timing constraints, frequency and power consumption. Robustness and trade-offs are the two important issues for VLSI designers. The trade-offs include complexity, Time to Market (TTM), Time to Volume (TTV), Instructions Per Cycle (IPC), chip size, frequency and power. Time to Volume (TTV) is the most important trade-off to consider.

There are numerous VLSI design metrics which impact the

successful design of a VLSI chip. The primary design metrics are as

follows:

Area: This includes the size of the die and the number of Look-Up Tables (LUTs), Flip-Flops (FFs) and slices utilized by the design. This design metric also relates to cost and profit. The performance of the circuit depends on the wire delays. There are standard methods for estimating area from the schematic or RTL design. For random/control logic, the estimate is obtained from a synthesis tool, from cell area or from a preliminary placement. For data-path structures, the estimate is obtained by combining regular-structure and random-logic techniques.

Speed/Delay: This covers the switching time of the devices used (transistors, FFs, etc.) and how fast the design can execute. The paths through the design determine the time required for signal propagation: longer computational paths take more time to execute, and simpler paths execute in a shorter time. The designer's task is to shorten the longest computational path so that system speed and other performance measures improve without any change in system functionality.

Power: This measures the energy consumed in operating the circuit over a certain number of clock cycles. There are two primary issues associated with power: power delivery and power extraction. Power delivery is the ability to supply the voltage and current needed to run the chip, and power extraction is the ability to remove the heat generated by the chip.

The VLSI design metrics can be measured with a suitable synthesis tool such as Xilinx or Altera Quartus II Web Edition. The area utilization and delay of a design are measured directly by the synthesis tool. Measuring power consumption, however, additionally requires a simulation tool such as ModelSim. These design metrics determine the resulting chip attributes, such as the number of instructions, the frequency and the Area-Delay Product (ADP).

1.8 Need for this study

Mobile ad hoc networks use OFDM for wireless data transmission. In OFDM, several carriers are packed into a band. Data is transmitted serially within a band at a single frequency, and in parallel across the several bands. Parallel data transmission requires N oscillators for N data transmission channels, which is a disadvantage. If the IFFT is taken of the input data stream, the input is implicitly treated as a spectrum, or band of frequencies. This avoids the need for N oscillators to generate N bands. In conventional OFDM, the spectrum produced may be subject to Inter-Symbol Interference. To avoid inter-symbol interference, the IFFT architecture has to be tuned periodically. This is the need for a VLSI-based reconfigurable architecture for mobile ad hoc networks, and it is also the notion behind this study.
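The statement that the IFFT replaces a bank of N oscillators can be made explicit with the inverse DFT. The notation below is assumed for this illustration only: X[k] is the data symbol placed on sub-carrier k, x[n] is the n-th time-domain output sample and N is the number of sub-carriers.

```latex
x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k]\, e^{j 2\pi k n / N},
\qquad n = 0, 1, \dots, N-1
```

Each term X[k] e^{j 2\pi k n / N} is the sampled output of one complex oscillator at normalized frequency k/N, scaled by its data symbol. Evaluating the sum digitally therefore produces the same samples as N separate modulated oscillators added together, and the forward transform X[k] = \sum_{n=0}^{N-1} x[n] e^{-j 2\pi k n / N} at the receiver separates the sub-carriers again.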

The Fast Fourier Transform (FFT) is a critical block that is widely used in digital signal processing (DSP) applications. The FFT provides an efficient transformation between the time domain and the frequency domain for a sampled signal. Various FFT processors can be used for hardware implementation; they can be classified into pipeline and memory-based architectures. The best existing Radix-2 Single-path Delay Feedback (R2SDF) FFT is recognized as the single processing element (PE) approach, so its power consumption and hardware cost are both lower than those of other architectures. An effective complex multiplier structure is used in the R2SDF FFT to perform complex multiplication. Hence, the existing R2SDF FFT offers more flexibility and high speed. In addition, Hamming codes are available to detect and correct a single error in data communication systems. To detect double and triple bit errors as well, the Hamming code can be extended with additional parity bits. The best existing SEC-TAED Hamming code can correct a single error and detect a triple adjacent error with the help of an extended Hamming code.

The disadvantages of the existing R2SDF FFT architecture are its
large area, delay and power consumption and its limited throughput.
Also, the existing R2SDF FFT architecture cannot be parallelized. The
processing elements of the existing R2SDF FFT need more delay to carry
out the FFT computation because their inputs are not synchronized. The
complex multiplier structure in the R2SDF FFT uses a large number of
adders and multipliers to perform the complex multiplication. Hence, the
existing R2SDF FFT requires more hardware complexity, area and delay for
the computation of the FFT. The existing SEC-TAED Hamming code technique
requires more parity bits to correct single errors and detect triple
adjacent errors, and it has a lower detection efficiency.

In order to overcome these disadvantages, a pipelined PE based
R2SDF FFT and new SEC-TAED Hamming codes are proposed in this research
work through VLSI implementation. The proposed R2SDF FFT consists of
pipeline based PE structures and a reduced complexity multiplier. With
the use of pipelined PE structures, synchronization is provided between
the input signals of the R2SDF FFT. Further, the reduced complex
multiplier consists of fewer adder and multiplier structures. In
addition, the bit replacement algorithm is realized to improve the
detection efficiency of the Hamming SEC-TAED codes.

1.9 Objectives

Wireless Mobile Ad hoc Networks (MANETs) have found a significant
place in the growth of technologies. In a MANET, OFDM provides
communication between devices. OFDM transmits data using the FFT
algorithm. This algorithm is implemented using PEs, the complex
multiplier and the Radix-2 Single-path Delay Feedback (R2SDF) FFT. The
Hamming SEC-TAED code is used for error detection. The objectives of
this research work are as follows:

1) To increase the speed of Processing Element (PE) of FFT

architectures.

2) To reduce the area and delay of Complex Multiplier in FFT.

3) To increase the efficiency of Radix-2 Single-path Delay Feedback

(R2SDF) FFT.

4) To introduce Reconfigurable architecture for complex multiplier.


5) To increase the detection probability of Single Error Correction

and Triple Adjacent Error Detection (SEC-TAED).

1.10 Methodology Adopted

A modification is made to the processing elements of the FFT
architecture with the help of the pipelining technique to increase the
processing speed of ad hoc networks. The asynchronous effect in the
existing PE structure leads to more delay in the FFT computation. In
order to overcome this problem, a register unit is added at the end of
the PE architecture. In the register unit, flip-flops are used to
synchronize all the incoming signals.

To reduce the area and delay of the network, a reconfiguration is
done on the complex multiplier architecture. The existing complex
multiplier consists of a large number of adder and multiplier units,
which leads to more area and delay in the multiplication computation for
the FFT. In order to overcome this problem, the expressions for the
existing complex multiplier are simplified using the Common
Sub-expression technique.

To increase the efficiency of the Radix-2 Single-path Delay
Feedback (R2SDF) FFT, the modified PE structures with the pipelining
technique and the reduced complex multiplier are used in the R2SDF FFT
[7]. The R2SDF FFT consists of a multiplier, PEs and delay elements.
Instead of the existing complex multiplier and PE structure, the
proposed reduced complex multiplier and pipeline based PE structures are
incorporated into the R2SDF FFT.

To maximize the probability of triple adjacent error detection with
fewer parity bits, the Bit Replacement algorithm is used in the SEC-TAED
Hamming code. In the Bit Replacement algorithm, the bits of the encoded
codeword are re-ordered under a certain condition so as to maximize the
probability of detecting triple adjacent errors while still correcting a
single error. The proposed re-ordered combination is designed with fewer
parity bits.

1.11 Summary

In this research, the reduced complex multiplier, the pipelined PE
architectures, the pipelined PE structure based R2SDF FFT with reduced
complex multiplier and the proposed SEC-TAED Hamming codes are designed
using Verilog Hardware Description Language (HDL), and the simulation
results are verified using the ModelSim 6.3c tool. The synthesis of all
the proposed methods is done on a Xilinx Spartan-3 XC3S200 (package:
PQ208, speed grade: -5) FPGA using the Xilinx ISE 10.1i design tool.


CHAPTER 2

REVIEW OF LITERATURES

The literature survey focuses on the design of Fast Fourier
Transformation (FFT) techniques, architecture oriented FFT models and
fixed benchmarks regarding decoding strategy. Architecture oriented
Orthogonal Frequency Division Multiplexing (OFDM) designs are also
presented.

2.1 Overview of OFDM

The transmitter side of OFDM consists of a Source Encoder, which
samples the input analog signal and converts it into a discrete one; a
Channel Encoder, which encodes the discrete input signal using values of
the same kind; an Inverse Fast Fourier Transformation (IFFT) block,
which converts frequency samples into time samples; and a cyclic prefix
block, which adds extra samples as a prefix or suffix so that the
original signal can be recovered after transmission.

The receiver side of OFDM consists of cyclic prefix removal, Fast
Fourier Transformation (FFT), a Channel Decoder and a Source Decoder.
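The IFFT, cyclic prefix, cyclic prefix removal and FFT stages described above can be sketched as a round trip over an ideal channel. This is an illustrative model only: the helper names `ofdm_transmit` and `ofdm_receive` are hypothetical, the transforms are written as direct sums for readability, and source and channel coding are omitted.

```python
import cmath

def idft(freq):
    """Frequency samples to time samples (transmitter side)."""
    n = len(freq)
    return [sum(freq[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def dft(time):
    """Time samples to frequency samples (receiver side)."""
    n = len(time)
    return [sum(time[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def ofdm_transmit(symbols, cp_len):
    """Apply the inverse transform, then prepend a copy of the last cp_len samples."""
    time = idft(symbols)
    return time[len(time) - cp_len:] + time

def ofdm_receive(samples, cp_len):
    """Drop the cyclic prefix, then apply the forward transform."""
    return dft(samples[cp_len:])

symbols = [1+1j, -1+1j, -1-1j, 1-1j, 1+1j, 1-1j, -1+1j, -1-1j]
tx = ofdm_transmit(symbols, cp_len=2)
rx = ofdm_receive(tx, cp_len=2)
print(len(tx), all(abs(a - b) < 1e-9 for a, b in zip(rx, symbols)))
```

With no channel distortion the receiver recovers the transmitted symbols exactly, up to floating point rounding; the prefix only becomes useful once the channel spreads each symbol in time.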

Quadrature Amplitude Modulation (QAM) is frequently used for
better Bit Error Rate (BER) and Signal to Noise Ratio (SNR) performance.
Hamming codes are widely used for improving the error detection
probability. A Single-path Delay Feedback (SDF) based IFFT/FFT structure
is used for converting the time domain signal into a frequency domain
signal and vice versa.

To increase the performance of OFDM Systems in terms of VLSI
concerns such as high speed, low power consumption and small area, two
supportive mechanisms are developed in our research work.

I. Pipeline technique based Radix-2 Single-path Delay

Feedback (R2SDF) FFT to increase the efficiency of

frequency transformation technique of OFDM System.

II. Single Error Correction (SEC) and Triple Adjacent Error
Detection (TAED) extended Hamming code to increase the
error detection probability.

The previous works for supporting these two proposals are briefly

discussed in this chapter. Further architectural oriented OFDM System

based works are also considered for our future work.

2.2 Architectural Oriented OFDM Models

The transmitter side of OFDM consists of a source and channel
encoder and an IFFT, and the receiver side consists of a source and
channel decoder and an FFT. Hence, the frequency transformation
technique (FFT/IFFT) and the encoding and decoding techniques are very
important tools in OFDM communication based system design. Besides the
FFT/IFFT and encoding/decoding techniques, the modulation technique also
plays an important role in an OFDM System.

In the design of Naga Tanuja, K [33], the OFDM architecture is
analyzed and the supportive tools of OFDM Systems are implemented. The
simulation results of the serial to parallel converter, Binary Phase
Shift Keying (BPSK) modulation and frequency transformation techniques
are validated in this study with the help of ModelSim, and the results
are synthesized in Xilinx Project Navigator, ISE 8.2i suite. VHDL is
used to implement and synthesize the scalable Radix-2 N-point FFT
processor. A clear architectural diagram for OFDM Systems is presented
in this design.

In the review of Noman, H. M. F [35], a Software Defined Radio
(SDR) System is designed with the help of a reconfigurable mechanism for
OFDM transceivers. A reconfigurable architecture for the transmitter and
receiver sides of OFDM is illustrated. A Universal Software Radio
Peripheral (USRP) board is used for the design of the SDR. The
simplified OFDM structure uses less hardware than the traditional one.
The suggested methods overcome the detrimental effects due to the
limited accuracy of the internal reference clock. Results are analyzed
in terms of Signal to Noise Ratio (SNR) values and Bit Error Rates
(BER).

Spread Spectrum (SS) and Multi-Carrier Modulation (MCM) techniques
are recognized as potential techniques for the design of Cognitive Radio
(CR) Systems and OFDM Systems. The work of Sundararajan, M [51]
implements the MCM and CR techniques in MATLAB and validates the
simulation results using the MATLAB tool. Various modified FFT
algorithms are discussed for cognitive radio applications. The pruning
algorithms achieve a large reduction in computational complexity, but
from the viewpoint of hardware implementation the transform
decomposition technique is more efficient and flexible.

In the review of Niladri Mandal [34], a novel efficient input zero
traced FFT pruning (IZIFFTP) algorithm based on the DIF radix-2
frequency transformation technique is implemented. The suggested FFT
algorithm is implemented in a high level computer program and is similar
to the Cooley-Tukey radix-2 FFT algorithm, retaining all key features
such as regularity and simplicity, with some programming modifications
and alterations. The procedure for incorporating the developed input
zero traced FFT into OFDM is briefly explained. Mapping the frequency
transformation technique is more difficult on the receiver side of the
OFDM system than on the transmitter side. Results of the developed FFT
structure are reported for various lengths such as 8-bit, 16-bit,
32-bit, 64-bit, 128-bit, 256-bit, 512-bit and 1024-bit.


2.3 Architectural Oriented FFT Models

The Fast Fourier Transformation (FFT) technique is used to convert
discrete time samples into discrete frequency samples. The normal
algorithm for the Radix-2 FFT has a long computational path for
converting the time domain signal into a frequency domain signal, and it
requires more hardware complexity to implement. For timing based system
design, it is essential to reduce the computational path of the FFT
algorithm. This section discusses different architecture oriented FFT
models that improve performance in terms of hardware complexity, delay
and power consumption.

In the study of Zhou, B [59], optimized implementations of two
different pipeline FFT architectures are presented on Xilinx Spartan-3
and Virtex-4 FPGA devices. With the help of feedback and feed-forward
techniques, the computational path can be reduced. A pipelining register
is used to provide synchronization between the multiple inputs and
outputs of the FFT model. The synthesis results of the normal and
pipeline based FFT techniques are compared for different point sizes
such as 16, 64, 256 and 1024, and for different devices such as Xilinx
Spartan-3, Virtex-4 and Virtex-E. This work achieves high speed in
performing the frequency transformation.

In the review of Paul, S. S [38], the performance of different
types of multipliers is compared for the real multiplications of FFT
architectures. The parallel mechanism is provided with the help of
flip-flop and buffer circuits. The flip-flop provides the proper delay
for proper information access. Hence, the hardware complexity and delay
of the FFT computation can be reduced significantly. Results for
different types of multipliers such as the Vedic multiplier, the Array
multiplier and the Baugh-Wooley multiplier are compared. This review
concludes that the Vedic multiplier is the best solution for FFT
calculation.

In the design of Sundari, R. M [52], different types of Vedic
multiplication such as Urdhva Tiryakbhyam, Nikhilam Navatascharamam
Dashatah and Anurupye are analyzed for the complex multiplication of the
FFT structure. These Vedic multipliers use the principles of the Vedic
Sutras. The computational delays of the three methods are analyzed and
compared in this design. 8x8 multiplication using the Urdhva Sutra
algorithm provides a 17.27% speed improvement over the Nikhilam Sutra
algorithm. On the other hand, 8x8 multiplication using the Anurupye
Sutra algorithm provides an 18.44% speed improvement over the Urdhva
Sutra algorithm and a 32.52% speed improvement over the Nikhilam Sutra
algorithm. This review concludes that the Vedic Anurupye Sutra algorithm
is the best solution for the complex multiplication of FFT models.

In the review of Salehi, S [45], a pipelined real-valued FFT and a
Hermitian-symmetric IFFT are designed for different input point sizes.
The real FFT structure is developed by transferring twiddle factors to
subsequent stages, such that each stage in the developed Signal Flow
Graph (SFG) contains one column of butterfly units and one column of
twiddle factor blocks, and each column of the flow graph contains only N
samples. This is the key requirement for designing the FFT model. Hence,
the hardware complexity and computational path of the FFT models are
reduced. The results for the Radix-2 FFT and the Radix-2^2 FFT are
analyzed with the help of synthesis tools.

In the brief of Ayinala, M [7], a novel scalable architecture is
designed for Real-valued In-place Fast Fourier Transform (RIFFT)
computation. This brief removes the redundant operations in the SFG of
the FFT computation. A new processing element (PE) structure is
introduced in the RIFFT design by using two radix-2 butterfly structures
that can process four inputs in parallel. A conflict-free memory
addressing scheme is extended to support multiple parallel PE
structures. The proposed work of this brief reduces the computation
cycles by a factor of 2 for a 256-point RIFFT compared to the normal
radix-2 FFT algorithm while maintaining a lower hardware complexity.

2.3.1 Complex Multiplier Design of FFT Models

In the study of Reddy, K. V. S [44], a Radix-8 64-point FFT/IFFT
algorithm is implemented with the help of a fixed width modified Booth
multiplier. In the normal Radix-2 and Radix-8 FFT algorithms, Read Only
Memory (ROM) is used to store the twiddle factors, and the number of
Look Up Tables (LUTs) increases because of this ROM. To eliminate the
usage of ROM, a reconfigurable complex multiplier is used, and the
resulting reconfigurable complex multiplier based FFT design is named
“ROM-less FFT/IFFT”. Single-path Delay Feedback (SDF) architectures are
used in this study to read the 4-point inputs at the same time. The
performance of the fixed width Booth multiplier is compared for the
Radix-2 Single-path Delay Feedback (R2SDF) FFT, the Radix-2^2
Single-path Delay Feedback (R2^2SDF) FFT and the Radix-2^3 Single-path
Delay Feedback (R2^3SDF) FFT.

In the study of Mehta, U. C [30], a Single-path Delay Feedback
(SDF) based 2048-point FFT/IFFT is designed for WiMAX applications. A
modified ROM module is used in this study to reduce the storage
complexity of the FFT. This design supports variable lengths from 128 to
2048 points for the FFT/IFFT structure. A shift register is used to
admit half of the input points at a time, and a generalized complex
multiplier is used to perform the complex multiplication of the FFT
computation.

In the studies of Berkeman, A [9] and Berkeman, A [10], a low logic
depth complex multiplier is developed for FFT processors. Generally, a
large number of Slices and LUTs is used to perform the multiplications
of FFT processors. A 4-point FFT requires only real valued
multiplications, but an FFT of 8 or more points requires complex valued
multiplication. Therefore, it is essential to design a complex
multiplier for the FFT processor. Distributed Arithmetic (DA) based
multiplication is used for the complex multiplication. DA based
multiplication is one of the best multiplication methods, in which ROMs
are used for performing particular kinds of multiplication. This
multiplier provides better performance in terms of logic depth and
speed. The limitation of this study is its higher hardware complexity,
since the number of LUTs increases as the storage capacity of the ROM
increases.

In the review of Archana Fande [6], a signed complex multiplier
design is developed on a Field Programmable Gate Array (FPGA). The real
addition, real subtraction, imaginary addition and imaginary subtraction
of variable length FFT processors are compared with the best existing
ones. The reported results are as follows: the developed complex
multiplier for 16, 64, 256 and 1024 bits offers reductions of 20%,
69.23%, 81.62% and 86.96% in real multipliers respectively, and
reductions of 78.37%, 86.88%, 90.67% and 92.77% in real adders
respectively. Hence, the design of a proper complex multiplier provides
better performance for FFT processors.

In the brief of Sreekanth Yadav, K [49], a 64-point FFT design is
provided using the Radix-4 algorithm. Designs for the Decimation in Time
(DIT) FFT and the Address Generation Unit (AGU) are developed for the
first time. The results for the butterfly structures, AGU and control
units are simulated and validated using simulation tools.


In the design of Kandhi Srikanth [19], a Radix-4 64-point pipeline
FFT/IFFT processor is developed for 3G and 4G wireless applications. The
periodicity properties of the twiddle factors and a reconfigurable
complex multiplier are used to reduce the ROM size for storing the
twiddle factor values. This modification of the FFT architecture reduces
the hardware complexity effectively. To further reduce the Slices and
LUTs of the FFT/IFFT computation, the hardware can be reused. The
structure of the complex multiplication for the FFT architecture is
demonstrated in this design.

In the study of Kumar, A [21], the butterfly structure is realized
so as to improve the performance of the FFT. The butterfly structure
consists of real addition, signed complex multiplication and real
subtraction processes. Radix-4 FFT processors have (3N/4) log_4 N
complex multiplications and 3N log_4 N complex additions. The number of
complex multipliers and complex adders, the memory size and the control
logic are compared for different types of FFT architectures such as the
R2SDF FFT, Radix-4 SDF (R4SDF) FFT, Radix-4 Single-path Delay Commutator
(R4SDC) FFT, Radix-2^2 SDF (R2^2SDF) FFT, Radix-2 Multi-path Delay
Commutator (R2MDC) FFT and Radix-4 MDC (R4MDC) FFT. This study concludes
that Radix-4 structures provide better performance than Radix-2 FFT
structures in terms of the number of complex additions and
multiplications used.
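As a worked instance of the Radix-4 operation counts quoted above, the 64-point case (the transform length used later in this work) can be evaluated directly. The choice of N = 64 is only an illustration; the formulas are those stated in the reviewed study.

```latex
\begin{aligned}
N &= 64, \qquad \log_4 N = 3 \\
\text{complex multiplications} &= \tfrac{3N}{4}\,\log_4 N = 48 \times 3 = 144 \\
\text{complex additions} &= 3N \log_4 N = 192 \times 3 = 576
\end{aligned}
```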


In the study of Manimaran, A [28], an effective complex multiplier
circuit is developed for 64-point FFT computation. The proposed
architecture of this study completely removes the use of ROM with the
help of a reconfigurable complex multiplier and a bit parallel
multiplier. The bit parallel multiplier uses shifters and adders to
multiply a fractional value with the complex output coming from the
butterfly structures. The complex multiplier architecture uses three
multipliers and five adders to compute the complex multiplication of the
FFT, and it offers more advantages than the bit parallel multiplier.
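A complex multiplication with three real multipliers and five real adders can be sketched with the well-known shared-term rearrangement shown below. This is an assumption about the general technique, not a reproduction of the circuit in the cited study; the function name `complex_mul_3` and the intermediate names `k1`, `k2`, `k3` are hypothetical.

```python
def complex_mul_3(a, b, c, d):
    """Compute (a + jb) * (c + jd) with 3 real multiplications and 5 additions.

    The direct form needs 4 multiplications and 2 additions:
        real = a*c - b*d,  imag = a*d + b*c
    Sharing the product k1 removes one multiplier at the cost of extra adders.
    """
    k1 = c * (a + b)      # adder 1, multiplier 1
    k2 = a * (d - c)      # adder 2, multiplier 2
    k3 = b * (c + d)      # adder 3, multiplier 3
    real = k1 - k3        # adder 4: a*c - b*d
    imag = k1 + k2        # adder 5: a*d + b*c
    return real, imag

# (3 + 4j) * (5 - 2j) = 23 + 14j
print(complex_mul_3(3, 4, 5, -2))  # (23, 14)
```

In an FFT the second operand is a twiddle factor, so the terms (d - c) and (c + d) could be precomputed, which is one reason this form suits hardware where multipliers cost far more area than adders.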

2.4 Hamming Error Correction Codes

Hamming codes are among the best error correction codes (ECCs) for
detecting and correcting a single bit error and for detecting up to a
double adjacent bit error. Hamming codes are widely used to protect
registers or memories from soft errors. As technology scales, the soft
errors created by radiation particles are more likely to affect more
than one bit when they impact an electronic circuit or memory circuit.
This effect is named Multiple Cell Upset (MCU). To prevent an MCU from
causing more than one bit error in a given word, interleaving is
commonly used in registers and memories. However, interleaving increases
the complexity of the memory device and is not suitable for small
memories or content-addressable memories. If interleaving is not used,
MCUs can cause multiple errors in a word that may not even be detected
by a Hamming code. The solution to this problem is therefore to enhance
the Hamming code to increase the detection probability.

In the brief of Sanchez-Macian [47], a bit placement algorithm is
used to increase the detection probability of double adjacent bit
errors. A lexicographic Hamming matrix for the (12, 8) code is used for
the decoder design. The block size of the (12, 8) Hamming matrix is 12
and the input length is 8; therefore four parity bits are used to detect
double adjacent errors and correct a single bit error. This brief
overcomes the MCU problem caused by radiation in space and terrestrial
communication. Since the block size of the designed Hamming code is 12,
there are eleven possible double adjacent bit errors. For Double
Adjacent Error Detection (DAED), the normal order detects only one of
the eleven double adjacent bit errors, whereas the bit re-ordered
codeword detects nine of the eleven. In this way the bit placement
algorithm helps to increase the efficiency of the Hamming decoder. This
brief also covers Triple Adjacent Error Detection (TAED). For both DAED
and TAED, the normal order provides 9% detection efficiency whereas the
bit re-ordering of this brief provides 82% detection efficiency.
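The effect of bit placement on double adjacent error detection can be illustrated with a small count. The sketch below assumes a 12-bit block with four parity bits whose parity check columns are the 4-bit values 1 to 12, and it treats a double adjacent error as detected when its syndrome (the XOR of the two adjacent columns) matches no column, so that a single error correcting decoder cannot mistake it for a single error. The function name and the re-ordered sequence are hypothetical; the sequence was picked by hand for illustration and is not claimed to be the ordering of the cited brief.

```python
def detected_adjacent_pairs(column_order):
    """Count double adjacent errors that a SEC decoder flags instead of miscorrecting."""
    used = set(column_order)
    detected = 0
    for left, right in zip(column_order, column_order[1:]):
        syndrome = left ^ right       # syndrome of an error in both adjacent bits
        if syndrome not in used:      # no single-bit error has this syndrome
            detected += 1
    return detected

lexicographic = list(range(1, 13))                      # columns 1, 2, ..., 12
reordered = [3, 1, 12, 2, 4, 9, 6, 11, 5, 8, 7, 10]     # illustrative placement

print(detected_adjacent_pairs(lexicographic))  # 1
print(detected_adjacent_pairs(reordered))      # 9
```

Under these assumptions the lexicographic order detects one of the eleven adjacent pairs and the re-ordered placement detects nine, which is consistent with the one-of-eleven and nine-of-eleven figures reported above.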

In the study of Sanchez-Macian [46], selective shortening and bit
placement techniques are used for Hamming SEC-DAED and extended Hamming
SEC-DED-TAED codes. Shortening is the best algorithm for Hamming codes
to increase the detection probability. In addition to SEC-DED and TAED,
SEC-DED-TAED is developed in this brief. The approach for handling the
codeword is the same; only the bit size is extended. New parity check
matrices for SEC-DAED Hamming codes and SEC-DED-TAED Hamming codes are
developed in this brief. The generation of the parity check matrix
differs from that of the lexicographic matrix. A lexicographic matrix
follows the normal order over the length of the block, whereas the
parity check matrix is generated by combining the identity matrix of the
parity bits’ length with the transpose of the parity matrix generated
from the identity matrix of the information bits’ length. Apart from
detecting all consecutive errors, the SEC-DAED Hamming codes can detect
30%-67% of double non-adjacent errors. Similarly, the SEC-DED-TAED
Hamming codes can detect 34%-42% of triple non-adjacent errors.

In the review of Dutta, A [15], a Multiple Bit Upset (MBU) tolerant
memory is designed using Selective Cycle Avoidance based SEC-DED-DAEC
codes. Previous works can only detect a double adjacent error, whereas
here the double adjacent error is also corrected by avoiding selective
cycles. The circuit for double adjacent bit error detection and
correction is designed for the first time. However, the developed
circuit does not correct all combinations of double adjacent bit errors,
and the error detection probability is low compared to the previous
works.

In the literature of Nutan Shep [37], the conventional (7, 4)
Hamming code is implemented in VLSI. The algorithm for Hamming codes is
implemented with the help of Verilog Hardware Description Language
(Verilog HDL) and Very High Speed Integrated Circuit Hardware
Description Language (VHDL). The minimum Hamming distance of the SEC
design is 3, and three parity bits are used to detect and correct a
single bit error using the Hamming code. Similarly, the minimum Hamming
distance of the SEC-DAED design is 4, and four parity bits are used to
detect the double adjacent error; however, these extended Hamming codes
can miscorrect an error into an invalid word. This miscorrection is
referred to as Silent Data Corruption (SDC). When Hamming error
correction codes are designed in Verilog HDL or VHDL, a large number of
LUTs is needed to store the Hamming matrix results. Therefore, it is not
sufficient to analyze the results of Hamming codes in terms of silicon
chip area, delay and power consumption; instead, the performance is
compared in terms of detection and correction probability.
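The single error correction of a (7, 4) Hamming code, and the miscorrection that leads to Silent Data Corruption, can be sketched in software as follows. This is a behavioural model under the common convention of parity bits at positions 1, 2 and 4; it is not the Verilog or VHDL design of the cited work, and the function names are hypothetical.

```python
def hamming74_encode(data):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword (positions 1..7)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4   # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # parity over positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]

def hamming74_decode(code):
    """Return (data bits, syndrome); a non-zero syndrome is the corrected position."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4
    if syndrome:
        c[syndrome - 1] ^= 1          # flip the bit the syndrome points at
    return [c[2], c[4], c[5], c[6]], syndrome

data = [1, 0, 1, 1]
code = hamming74_encode(data)

single = list(code)
single[4] ^= 1                         # one error at position 5
print(hamming74_decode(single))        # ([1, 0, 1, 1], 5): corrected

double = list(code)
double[0] ^= 1                         # adjacent errors at positions 1 and 2
double[1] ^= 1
print(hamming74_decode(double))        # ([0, 0, 1, 1], 3): miscorrected data
```

The second case shows why plain single error correction is not enough under adjacent upsets: the decoder sees a valid-looking syndrome, flips a third bit and returns wrong data without any warning.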

In the study of Cha, S [11], a check bit pre-computation method is
used for SEC and DED. The H-matrix of the developed SEC-DED code is the
same as that of the odd-weight-column code during the write operation,
and is designed by replacing 0’s with 1’s in the last row during the
read operation. This design achieves reductions in the number of gates,
latency and power consumption of the ECC processing circuits of up to
9.3%, 18.4% and 14.1% respectively for 64 information bits in a word.
This work provides an alternative solution for the design of SEC-DED.


In the review of Cui, Y [12], a Hamming (40, 32) SEC-DED code is
developed to increase the Error Detection and Correction (EDAC) ability.
The Hamming (40, 32) SEC-DED code has 8 parity bits for single error
correction and double error detection. Algorithm based mutual
expressions are developed to minimize the area and delay of the EDAC
circuit. The results of the (40, 32) Hamming SEC-DED code are compared
to the (39, 32) Hsiao code. The critical path of the encoder and decoder
computation causes more delay and power, but the Hamming code has a
smoother path for the encoding and decoding process than the Hsiao code.
The developed Hamming (40, 32) SEC-DED code offers a 2.97% reduction in
encoding delay compared to the Hsiao (39, 32) error correction code.

In the brief of Noorbasha, F [36], an optimized encoding and
decoding process for Hamming codes is presented using Complementary
Metal Oxide Semiconductor (CMOS) technology. The Hamming ECC codes are
verified using 50nm, 70nm and 90nm technology. Field Programmable Gate
Array (FPGA) implementation methods for the developed Hamming ECC codes
are provided in this brief. The system is simulated and tested, and an
excellent performance is obtained at 50 GHz. The reliability of the
developed Hamming SEC-DEC codes is measured at voltages of 1V, 0.7V and
0.5V. The decoding of the Hamming codes gives an accurate result even
when the transmission has a single bit error.


Usually, the reliability of data transmission is improved by
channel coding, which employs forward error correction (FEC) techniques.
An FEC technique can detect and correct a single bit error with the help
of check or redundant bits. These check bits are determined from the
data bits and appended to them to form the codeword of the original data
bits. If Additive White Gaussian Noise (AWGN) affects the data bits of
the codeword but not the check bits, the error can easily be detected
and corrected. If the check bits are also corrupted by AWGN, the FEC
cannot detect and correct the error. To overcome this problem, a
multidirectional parity code is used in Manchanda, G [27]. The report of
Manchanda, G [27] provides the MATLAB realization of an encoder and
decoder for the multidirectional parity code with Hamming code. The
multidirectional parity code with Hamming code improves the reliability
of data transmission in computer data networks, with an acceptable bit
overhead of 26.22% and a code rate of 79.22%. This scheme is very
expensive, and the check bits are also corrupted by a noisy environment.
The multidirectional parity code with Hamming code can correct four bit
errors: three error bits in the data part and one error bit in the check
bits. At the receiver, there is no need to request re-transmission of
the data.

Summary

The study of Manimaran A [28] gives the best overview of the
complex multiplier design of FFT processors among the studies reviewed.
Similarly, the studies of Zhou, B [59] and Salehi, S [45] provide the
best overview of pipelined structure based FFT designs. We adopt the
complex multiplier design from the study of Manimaran A [28] and the
pipelining techniques from the studies of Zhou, B [59] and Salehi, S
[45] to design the FFT architectures. We adopt the SDF architectures to
eliminate the ROM of the FFT completely. In our research work,
productive modifications are made to all of these FFT structures to
improve the performance of the frequency transformation techniques.

The study of Sanchez-Macian [47] gives the best overview of the bit
placement algorithm for extended Hamming codes, and the study of
Sanchez-Macian [46] gives the best overview of the bit shortening
algorithm for extended Hamming codes. The literature of Nutan Shep [37]
provides a VLSI based design for (7, 4) Hamming codes. We consider the
study of Sanchez-Macian [47] for the SEC and TAED design. We realize the
bit replacement algorithm for TAED, and alternative replacement
procedures are devised to improve the detection probability of TAED.
Further, the literature of Nutan Shep [37] was considered in order to
understand the problems of implementing Hamming codes in Verilog HDL or
VHDL. In our research work, productive modifications are made to the bit
replacement algorithm based extended Hamming code to improve the
detection probability of TAED.

From the review of the above literature, it is clear that the R2SDF
FFT is available for the frequency transformation technique and that
extended Hamming codes are available for correcting a single error and
detecting double and triple adjacent errors. It is possible to improve
the performance of the R2SDF FFT and the detection efficiency of
extended Hamming codes with the help of the pipelining mechanism and the
bit replacement algorithm respectively. These two improvements will help
to increase the performance of the OFDM System.


CHAPTER 3

HIGH SPEED PIPELINED BASED 64-POINT RADIX-2 SINGLE

PATH DELAY FEEDBACK (R2SDF) FFT

3.1 FFT in MANET

Wireless communication technology has increased the demand for
signal processing operations such as convolution, correlation, filtering
and frequency transformation. Among these operations, the FFT is a
frequency transformation technique that is recognized as having high
potential for wireless communication technologies in terms of hardware
complexity. The FFT is widely used to convert a time domain signal into
a frequency domain signal, and the IFFT is widely used to convert a
frequency domain signal into a time domain signal. These frequency
transformation techniques are used to transmit and reconstruct the
original input signals in OFDM based communication. A Mobile Ad hoc
Network (MANET) is a type of infrastructure-less wireless network in
which OFDM is used to transmit information signals to the desired users.
OFDM is a multi-carrier transmission scheme in which a single higher
rate data stream is transmitted over a number of lower rate
sub-carriers. FFTs and IFFTs are generally used to analyze and transmit
the frequency characteristics of the multiple lower rate data streams.
FFT/IFFT blocks consume considerable silicon area and power. Also, the
speed of the frequency transformation is poor because of the complicated
signal flow graph. In this research work, a pipelining mechanism is
introduced to increase the speed of the FFT/IFFT processors. The complex
multiplier is designed to reduce the hardware cost and power consumption
of the FFT/IFFT processors.

3.2 FFT Algorithm

The FFT is used to analyze the frequency characteristics of a
discrete time signal. Butterfly structures are used in the FFT to
determine the frequency response of time domain signals and in the IFFT
to determine the time response of frequency domain signals. FFT
processors can be classified into two categories: Decimation in Time
(DIT) FFT and Decimation in Frequency (DIF) FFT. Generalized butterfly
structures for the 2-point DIT FFT and the 2-point DIF FFT are
illustrated in Figure 3.1 and Figure 3.2 respectively, as in Takala, J
[54] and Sreekanth Yadav, K [49]. Each consists of a complex multiplier,
a complex adder and a complex subtractor.

Figure 3.1 Butterfly structure for 2-point DIT FFT

[Figure labels: Complex input 1, Complex input 2, Twiddle Factor Multiplication, Complex Adder (+/-), Complex output 1, Complex output 2]

Figure 3.2 Butterfly structure for 2-point DIF FFT

These butterfly structures are generally referred to as Radix-2
structures, because they process two inputs at a time. In the 2-point
DIT FFT, twiddle factor multiplication, complex addition and complex
subtraction are involved. The DIF FFT is used to construct the frequency
representation of discrete time domain signals. By combining these
butterflies, 8-point, 16-point, 32-point and 64-point FFT processors can
be constructed.

3.2.1 Discrete and Fast Fourier Transformation Techniques

The Discrete Fourier Transformation technique is used to convert
time-domain signals into frequency-domain signals. The N-point Discrete
Fourier Transformation (DFT) of an input sequence x(n) is defined as:

\[ X[k] = \sum_{n=0}^{N-1} x(n)\, W_N^{nk}, \qquad k = 0, 1, 2, \ldots, N-1 \tag{3.1} \]

where

\[ W_N^{nk} = e^{-j 2\pi nk/N}, \qquad k = 0, 1, 2, \ldots, N-1 \]

is referred to as the twiddle factor or DFT coefficient. X[k] is the kth
harmonic and x(n) is the nth input sample. This DFT calculation has a
computational complexity of O(N^2) to transform the time-domain signals
into frequency-domain signals.
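The definition in equation (3.1) can be illustrated with a minimal Python sketch. The function name `dft` and the test signal are illustrative assumptions and are not part of the thesis; the two nested loops make the O(N^2) cost visible.

```python
import cmath

def dft(x):
    """Direct N-point DFT following equation (3.1); costs O(N^2) operations."""
    n_points = len(x)
    result = []
    for k in range(n_points):
        acc = 0j
        for n in range(n_points):
            # Twiddle factor W_N^{nk} = exp(-j*2*pi*n*k/N)
            acc += x[n] * cmath.exp(-2j * cmath.pi * n * k / n_points)
        result.append(acc)
    return result

impulse_spectrum = dft([1, 0, 0, 0])    # a unit impulse has a flat spectrum
constant_spectrum = dft([1, 1, 1, 1])   # a constant has only a k = 0 component
```

A unit impulse yields the value 1 in every bin, while a constant sequence concentrates all of its energy in bin k = 0.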

To overcome this problem, the Fast Fourier Transformation (FFT)
technique was introduced by Cooley and Tukey (Lyon, Douglas A [28]).
With the Cooley-Tukey FFT, the computational complexity is reduced to
O(N log_r N). This algorithm is the most universal of all FFT
algorithms, because any factorization of N is possible. In most
Cooley-Tukey FFTs, the transform length is a power of a basis r, i.e.,
N = r^v for some positive integer v. Hence, Cooley-Tukey algorithms are
also referred to as radix-r algorithms. The most commonly used bases are
r=2 and r=4.

The Cooley-Tukey algorithm follows a divide-and-conquer approach to
compute the frequency transformation of the input signal. FFT
algorithms can be classified into two types: Decimation in Time (DIT)
FFT and Decimation in Frequency (DIF) FFT.

The first type applies the divide-and-conquer approach in the time
domain and is therefore referred to as the Decimation in Time (DIT) FFT.
Similarly, the second type applies it in the frequency domain and is
therefore referred to as the Decimation in Frequency (DIF) FFT. In the
DIF FFT computation, the sum over the input sequence is first split into
two summations, which are then simplified as follows:

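\[ X[k] = \sum_{n=0}^{N/2-1} x(n)\, W_N^{nk} + \sum_{n=N/2}^{N-1} x(n)\, W_N^{nk} \tag{3.2} \]

\[ X[k] = \sum_{n=0}^{N/2-1} x(n)\, W_N^{nk} + \sum_{n=0}^{N/2-1} x\!\left(n+\tfrac{N}{2}\right) W_N^{\left(n+\frac{N}{2}\right)k} \tag{3.3} \]

\[ X[k] = \sum_{n=0}^{N/2-1} x(n)\, W_N^{nk} + W_N^{\frac{N}{2}k} \sum_{n=0}^{N/2-1} x\!\left(n+\tfrac{N}{2}\right) W_N^{nk} \tag{3.4} \]

and, since \( W_N^{\frac{N}{2}k} = (-1)^k \),

\[ X[k] = \sum_{n=0}^{N/2-1} \left[ x(n) + (-1)^k\, x\!\left(n+\tfrac{N}{2}\right) \right] W_N^{nk} \tag{3.5} \]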

X[k] is the frequency transformation of the input sequence. It can be
decimated into even-indexed and odd-indexed frequency samples, for
k = 0, 1, ..., N/2-1:

\[ X[2k] = \sum_{n=0}^{N/2-1} \left[ x(n) + x\!\left(n+\tfrac{N}{2}\right) \right] W_N^{2nk} = \sum_{n=0}^{N/2-1} \left[ x(n) + x\!\left(n+\tfrac{N}{2}\right) \right] W_{N/2}^{nk} \tag{3.6} \]

\[ X[2k+1] = \sum_{n=0}^{N/2-1} \left[ x(n) - x\!\left(n+\tfrac{N}{2}\right) \right] W_N^{n(2k+1)} = \sum_{n=0}^{N/2-1} \left[ x(n) - x\!\left(n+\tfrac{N}{2}\right) \right] W_N^{n}\, W_{N/2}^{nk} \tag{3.7} \]

Equation (3.6) gives the even-indexed frequency samples and equation
(3.7) gives the odd-indexed frequency samples. This procedure can be
repeated by decimating the N/2-point DFTs, i.e., X[2k] and X[2k+1]. The
entire Cooley-Tukey algorithm involves log2N stages, where each stage
involves N/2 butterfly units. Therefore, computing the N-point DFT
through the radix-2 FFT requires (N/2)·log2N complex multiplications and
N·log2N complex additions. For instance, the Radix-2 8-point FFT
computation using the DIF FFT is shown in Figure 3.3.

Figure 3.3 Radix-2 DIF-FFT structure for 8-point
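The decimation described by equations (3.6) and (3.7) can be sketched as a short recursive routine. This is an illustrative software model only, not the hardware design of the thesis; the function name `dif_fft` is an assumption, and the differences are multiplied by the twiddle factor W_N^n before the odd-indexed half is transformed.

```python
import cmath

def dif_fft(x):
    """Recursive radix-2 DIF FFT; len(x) must be a power of two."""
    n_points = len(x)
    if n_points == 1:
        return list(x)
    half = n_points // 2
    # Butterflies: sums feed the even outputs, twiddled differences the odd ones
    sums = [x[n] + x[n + half] for n in range(half)]
    diffs = [(x[n] - x[n + half]) * cmath.exp(-2j * cmath.pi * n / n_points)
             for n in range(half)]
    even = dif_fft(sums)    # X[0], X[2], X[4], ...
    odd = dif_fft(diffs)    # X[1], X[3], X[5], ...
    out = [0j] * n_points
    out[0::2] = even
    out[1::2] = odd
    return out

spectrum = dif_fft([1, 2, 3, 4, 0, 0, 0, 0])
```

Each recursion level corresponds to one of the log2N stages, and each level performs N/2 butterflies in total.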


Similarly, a Radix-2 16-point FFT computation model can be designed
with the help of the DIT-FFT. The numbers of complex multiplications and
complex additions required for 16 points are analyzed for both the DFT
and the FFT and compared in Table 3.1 [Implementation of 16 point].

Table 3.1 Number of complex multiplications and complex additions
required for Radix-2 16 point FFT

OPERATION                  DFT        FFT               16 POINT DFT   16 POINT FFT
Complex Multiplications    N^2        (N/2)(log2N-1)    256            24
Complex Additions          N(N-1)     N(log2N)          240            64
Real Multiplications       4N^2       2N(log2N-1)       1024           96
Real Additions             N(4N-2)    2N(log2N)         992            128
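The entries of Table 3.1 can be reproduced for any power-of-two length with a few lines of Python. The function name `operation_counts` and the dictionary layout are illustrative assumptions; the formulas are the ones listed in the table.

```python
def operation_counts(n_points):
    """Return (DFT count, FFT count) per operation, using the Table 3.1 formulas."""
    stages = n_points.bit_length() - 1      # log2(N) for a power-of-two N
    return {
        "complex_multiplications": (n_points ** 2, n_points // 2 * (stages - 1)),
        "complex_additions": (n_points * (n_points - 1), n_points * stages),
        "real_multiplications": (4 * n_points ** 2, 2 * n_points * (stages - 1)),
        "real_additions": (n_points * (4 * n_points - 2), 2 * n_points * stages),
    }

counts_16 = operation_counts(16)
```

For N = 16 the function returns the pairs (256, 24), (240, 64), (1024, 96) and (992, 128), matching the last two columns of the table.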

3.2.2 Properties of Twiddle Factor

In FFT computation, W_N^{nk} is referred to as the twiddle factor or DFT
coefficient. The twiddle factor is generally described as a “rotating
vector” which rotates in increments according to the number of samples,
N. The partial twiddle factor for an N-point DFT is illustrated in
Figure 3.4.

The properties of the twiddle factor coefficients for the DFT are as follows:

1. \( W_N^{0} = -W_N^{N/2} = 1 \) and \( W_N^{N/4} = -W_N^{3N/4} = -j \)

2. \( W_N^{nk+N/2} = -W_N^{nk} \) (symmetry)

3. \( W_N^{nk} = W_N^{(n+N)k} = W_N^{n(k+N)} \) (periodicity in n and k)

Figure 3.4 The partial twiddle factor for N-point DFT

For certain values of the product n*k, the twiddle factor takes a
trivial value such as ±1 or ±j (property 1). Products of this type are
computed as follows:

\[ A\, W_N^{nk} + B\, W_N^{nk+N/2} = \left( A + B\, W_N^{N/2} \right) W_N^{nk} = (A - B)\, W_N^{nk} \tag{3.8} \]

\[ A\, W_N^{nk} + B\, W_N^{nk+N/4} = \left( A + B\, W_N^{N/4} \right) W_N^{nk} = (A - jB)\, W_N^{nk} \tag{3.9} \]

However, reductions of this type still leave an amount of computation
that is proportional to N^2. Fortunately, the periodicity property of
the complex sequence reduces the computation significantly:

\[ A\, W_N^{nk} + B\, W_N^{(n+N)k} = (A + B)\, W_N^{nk}, \qquad n = 0, 1, 2, \ldots, N-1 \tag{3.10} \]

According to the symmetry property, further reductions are possible:

\[ W_N^{nk+N/8} = -W_N^{nk+5N/8} = \tfrac{\sqrt{2}}{2}(1 - j)\, W_N^{nk} \quad \text{and} \quad W_N^{nk+3N/8} = -W_N^{nk+7N/8} = -\tfrac{\sqrt{2}}{2}(1 + j)\, W_N^{nk} \tag{3.11} \]

According to equation (3.11), multiplying a complex number by such a
twiddle factor involves two real multiplications and two real additions.
Thus, a phase difference of ±45° requires only two real multiplications.
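The twiddle factor properties and the two-multiplication rotation of equation (3.11) can be checked numerically. The sketch below is illustrative; the helper names `twiddle` and `rotate_minus_45` are assumptions, and the rotation uses the identity (a + jb)(1 - j) = (a + b) + j(b - a).

```python
import cmath
import math

def twiddle(n_points, exponent):
    """W_N^exponent = exp(-j*2*pi*exponent/N)."""
    return cmath.exp(-2j * cmath.pi * exponent / n_points)

def rotate_minus_45(a, b):
    """(a + jb) * (sqrt(2)/2)(1 - j): two real additions, two real multiplications."""
    c = math.sqrt(2) / 2
    return complex(c * (a + b), c * (b - a))

N = 64
quarter_turn = twiddle(N, N // 4)     # expected to equal -j (property 1)
eighth_turn = twiddle(N, N // 8)      # expected to equal (sqrt(2)/2)(1 - j)
rotated = rotate_minus_45(3.0, 4.0)   # same as (3 + 4j) * W_8^1
```

Only the constant sqrt(2)/2 has to be multiplied, which is what makes a dedicated constant multiplier attractive for this twiddle factor.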

3.2.3 FDM Systems based on FFT and Modulation

OFDM is a Discrete Multi-tone (DMT) frequency division multiplexing
(FDM) scheme in which a high-rate data stream at M·fsym bits/s is
divided into blocks of M bits at a rate of fsym. These blocks are called
symbols. Each symbol allocates m_k of its M bits to the modulation of
carrier k at frequency fc,k, so that the M bits in total modulate the N
carriers. This results in N sub-channels, each sending symbols at a rate
of fsym. The block diagram of the multi-carrier modulation scheme for
the OFDM system is illustrated in Figure 3.5.

In the traditional MCM technique, the sub-channels are non-overlapping.
Each sub-channel has its own modulator and demodulator for information
transmission. This leads to greater spectrum usage and excess hardware.
The OFDM system overcomes this drawback by introducing orthogonality,
which allows the sub-channels to overlap. The orthogonality of OFDM can
be exploited in the frequency domain. The spectrum of overlapping
sub-carriers in OFDM is illustrated in Figure 3.6. The Doppler effect is
the change in the frequency of a wave (or other periodic event) observed
when the source and the observer move relative to each other. The
carrier offset due to the Doppler effect, the peaks and nulls of the
orthogonal signals, the sub-carrier spacing and the orthogonal
sub-carriers of OFDM are indicated in Figure 3.6.

[Figure 3.5 block labels: Input (M·fsym b/s), Serial to Parallel, M bits (a symbol), Modulator 0 to Modulator n-1 at carriers fc,0 to fc,n-1, Channel Noise, Demodulator 0 to Demodulator n-1, Parallel to Serial, Output]

Figure 3.5 Block diagram for Multi-carrier Modulation Scheme

Figure 3.6 Spectrum of overlapping of sub-carriers in OFDM

The OFDM modulator can be implemented with an IFFT processor operating
on N sub-carriers instead of the N separate modulators of traditional
MCM. Similarly, the OFDM demodulator can be implemented more efficiently
than in traditional MCM with an FFT processor operating on the N
sub-carriers. The simplified FFT-based OFDM system is illustrated in
Figure 3.7.

Figure 3.7 OFDM System based on FFT
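The IFFT/FFT pairing of Figure 3.7 can be illustrated with a noise-free round trip in Python. This is a simplified sketch under stated assumptions: the QPSK mapping, the function names and the use of a direct (slow) transform in place of a fast one are illustrative choices, and D/A conversion, channel noise and the cyclic prefix are omitted.

```python
import cmath

def transform(values, inverse=False):
    """Direct DFT (or inverse DFT) used here in place of an FFT/IFFT processor."""
    n_points = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n_points):
        acc = sum(values[n] * cmath.exp(sign * 2j * cmath.pi * n * k / n_points)
                  for n in range(n_points))
        out.append(acc / n_points if inverse else acc)
    return out

def ofdm_modulate(bits):
    """Map bit pairs to QPSK symbols (one per sub-carrier) and apply the inverse transform."""
    symbols = [complex(1 - 2 * bits[i], 1 - 2 * bits[i + 1])
               for i in range(0, len(bits), 2)]
    return transform(symbols, inverse=True)

def ofdm_demodulate(samples):
    """Apply the forward transform and decide each bit from the sign of I and Q."""
    bits = []
    for symbol in transform(samples):
        bits += [int(symbol.real < 0), int(symbol.imag < 0)]
    return bits

tx_bits = [0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1]   # 8 sub-carriers
rx_bits = ofdm_demodulate(ofdm_modulate(tx_bits))
```

With an ideal channel the received bits equal the transmitted bits, which shows that a single inverse transform replaces the bank of N modulators and a single forward transform replaces the bank of N demodulators.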

3.2.4 Digital Signal Processors for FFT Computation

After the publication of the Cooley-Tukey FFT algorithm, various
modifications and implementations have been proposed to improve the
performance of the FFT. In general, an FFT can be implemented in
software, on general-purpose processors, on algorithm-specific
processors and on application-specific processors. The architecture of a
general-purpose programmable digital signal processor (DSP) is shown in
Figure 3.8.

The general-purpose programmable DSP processor consists of an address
generator, data memory, program memory, input/output interface, program
controller, Multiplication and Accumulation (MAC) unit and Arithmetic
and Logic Unit (ALU).

[Figure 3.7 block labels: Input, IFFT, D/A, Channel noise, A/D, FFT, Output]

Figure 3.8 General Purpose Programmable DSP Processor

[Figure 3.8 block labels: I/O Interface, Address Generator, Program Memory, Data Memory, Program Controller, MAC & ALU, Address Bus, Data Bus, Program, Data]

The program memory and data memory store the program and the data
respectively. The MAC and ALU of the DSP processor carry out the
arithmetic of the frequency transformation. The program controller
controls the data flow of the processor. Various commercial programmable
DSP processors include special instructions for FFT computation, but
performance varies from one processor to another. From an architectural
point of view, most commercial DSP processors follow the Harvard
architecture. Processors with a Harvard architecture have independent
buses for program and data.

For a typical FFT/IFFT implementation, a general-purpose DSP processor
takes approximately 1 ms, which is far slower than more specialized
implementations. Hence, the general-purpose DSP processor is not
suitable for high-speed and low-power applications, because it cannot
meet the throughput requirement. On the other hand, general-purpose
processors are designed to execute multiple applications and perform
multiple tasks, and may lack the high performance that certain tasks
require. Hence, application-specific processors have emerged as a good
solution for high-performance, low-power and cost-effective designs.
These processors can be classified into three major categories:

• Digital Signal Processor (DSP): Programmable
microprocessors/microcontrollers for extensive real-time and
mathematical computations.

• Application Specific Instruction Set Processor (ASIP): Programmable
microprocessors/microcontrollers in which the hardware and the
instruction set are designed together for a particular application.

• Application Specific Integrated Circuit (ASIC): A specific algorithm
implemented completely in hardware.

Various programmable FFT processors have been developed for FFT/IFFT
computations. These processors are 5 to 10 times faster than
general-purpose DSP processors. The architecture of a programmable
FFT-specific processor is illustrated in Figure 3.9.

Figure 3.9 Programmable specific FFT processor

The general programmable FFT-specific processor consists of butterfly
units and complex multipliers. An on-chip ROM in the programmable FFT
processor stores the sine and cosine coefficient values. This type of
processor is often provided with windowing functions in either the time
or the frequency domain.

[Figure 3.9 block labels: Input, 3 Term Window Operator, Workspace RAM (two blocks), Radix-4 Data path, Coefficient ROM, Output Buffer, Output]

Non-programmable specific processors can also be used for FFT
computation. This architecture supports only a fixed FFT length.
Algorithm-specific processors can generally be classified into three
categories: fully parallel FFT processors, column FFT processors and
pipelined FFT processors.

The mapping of the FFT signal-flow graph to hardware structures is
different for each of the three algorithm-specific processors. In a
fully parallel FFT processor, the hardware structure is isomorphic to
the FFT signal-flow graph. For instance, the signal-flow graph of the
8-point DFT requires 24 complex adders and 5 complex multipliers. The
hardware complexity of this implementation is high; hence, it is not
power-efficient.

To overcome the disadvantage of the fully parallel FFT processor, the
column-based FFT processor was introduced. A set of processing elements
in a column computes one stage in a single clock cycle. The results are
fed back to the same set of processing elements to compute the next
stage. The hardware complexity of column-based FFT processors is
therefore reduced. However, the routing of the processing elements is
complex and difficult for long transform lengths. Pipelined FFT
processors were introduced to overcome this disadvantage of column-based
FFT processors.

In a pipelined FFT processor, each stage has its own set of processing
elements. Every stage is computed as soon as its data are available.
Pipelined FFT processors offer simplicity, flexibility, modularity and
high throughput. The most common pipelined FFT processors are the
Radix-2 Multipath Delay Commutator (R2MDC) FFT, Radix-2 Single-path
Delay Feedback (R2SDF) FFT, Radix-4 Single-path Delay Feedback (R4SDF)
FFT, Radix-4 Multipath Delay Commutator (R4MDC) FFT and Radix-2^2
Single-path Delay Commutator (R2^2SDC) FFT. Several researchers, such as
T. S. Ghouse Basha [53] and Abhijit D. Palekar [2], have worked on
architecture-oriented FFT models and made modifications to improve the
performance of FFT computation in terms of hardware complexity and power
consumption.

In a generalized FFT architecture, a large number of computational paths
is involved in determining the spectrum characteristics of discrete-time
signals. Because of this, a large number of logic elements is needed to
design the FFT processor, and the delay of the FFT computation increases
significantly. To overcome these disadvantages of the traditional
Radix-2 FFT, the Radix-2 Single-path Delay Feedback (R2SDF) FFT is
preferred in our research work.

3.3 RADIX-2 SINGLE PATH DELAY FEEDBACK (R2SDF) FFT

The Radix-2 Single-path Delay Feedback (R2SDF) FFT is a parallel
technique for estimating the frequency response of discrete-time
signals. This structure is also referred to as “stream-like” processing
of a block-based algorithm. One of the key advantages of the R2SDF FFT
is that it processes the data in a parallel manner as soon as input
points are available. The butterfly structure for the R2SDF FFT is
illustrated in Figure 3.10. It consists of a single butterfly for signed
addition and signed subtraction and a single delay line that holds the
first point of data until the second point arrives one unit delay later.
Figure 3.10 shows two phases of the signal flow of the R2SDF FFT.

Figure 3.10 Butterfly structure for R2SDF FFT

[Figure labels: Data in, Butterfly, Delay Line, Data out]

(1) The first half of the input is shifted into the delay buffer and the
second half of the output is read from the delay buffer.

(2) The second half of the input is shifted into the butterfly together
with the first half of the input from the delay buffer, and the second
half of the output is written to the delay buffer.

The symbolic representation of the signal flow in the butterfly
structure of the R2SDF FFT is illustrated in Figure 3.11. A single-path
feedback unit is used to access the next input point in both Figure 3.10
and Figure 3.11. Hence, this structure is named the Radix-2 Single-path
Delay Feedback (R2SDF) FFT. A single stage is used in the Radix-2
2-point SDF FFT to estimate the frequency response of the signal.
Similarly, the Radix-2 8-point SDF FFT structure is illustrated in
Figure 3.12.

Figure 3.11 Symbolic representation of R2SDF FFT

Figure 3.12 Structure of Radix-2 8-point Single-path Delay Feedback (R2SDF) FFT

[Figure 3.11 labels: Delay Line, Butterfly, Single Path. Figure 3.12 labels: three cascaded Butterfly stages with delay lines 4D, 2D and 1D]

In the 8-point R2SDF FFT, the input sequence is broken into two parallel
data streams flowing forward, with the correct “distance” between the
data elements entering the butterfly scheduled by proper delays. The
R2SDF FFT architecture uses fewer complex multipliers and butterfly
structures.

One straightforward approach to the parallel implementation of the
R2SDF FFT algorithm is as follows:

1. The input data sequence is broken into two parallel data streams.

2. In each stage of the R2SDF FFT processor, the first half of the input
data is delayed by the feedback delay unit and processed together with
the second half of the input data.

3. The three stages of the 8-point R2SDF FFT processor use 4, 2 and 1
delay elements respectively. Hence, the total number of delay elements
is 4+2+1=7 for the 8-point R2SDF FFT.

4. In the 8-point R2SDF FFT architecture, three types of complex
multiplier are used to perform the complex multiplications. Similarly,
only 3 delay units and 3 butterfly units are used to perform the 8-point
R2SDF FFT. Compared to the Radix-2 FFT computation, the R2SDF FFT
reduces the hardware components by 70%. Because the complexity of the
computational path is reduced, the speed of the R2SDF FFT processor is
very high compared to the traditional Radix-2 FFT structure.
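The steps above can be sketched as a cycle-by-cycle behavioural model in Python. This is a simplified illustration and not the thesis hardware: the class and function names are assumptions, the model uses a decimation-in-frequency data flow with floating-point twiddle factors, no registers are placed between stages, and the output stream is reordered from bit-reversed order in software.

```python
import cmath
from collections import deque

class SdfStage:
    """One radix-2 SDF stage: a butterfly plus a feedback delay line."""

    def __init__(self, delay):
        self.delay = delay
        self.fifo = deque([0j] * delay)   # feedback delay line
        self.count = 0                    # position inside a block of 2*delay

    def step(self, sample):
        stored = self.fifo.popleft()
        if self.count < self.delay:
            # First half of the block: buffer the input and emit the
            # difference stored during the previous block, times its twiddle.
            twiddle = cmath.exp(-2j * cmath.pi * self.count / (2 * self.delay))
            out = stored * twiddle
            self.fifo.append(sample)
        else:
            # Second half: emit the sum and feed the difference back.
            out = stored + sample
            self.fifo.append(stored - sample)
        self.count = (self.count + 1) % (2 * self.delay)
        return out

def r2sdf_fft(x):
    """Stream x (power-of-two length) through the cascaded stages."""
    n_points = len(x)
    stages = []
    delay = n_points // 2
    while delay >= 1:
        stages.append(SdfStage(delay))
        delay //= 2
    latency = n_points - 1                # 4 + 2 + 1 = 7 for 8 points
    stream = []
    for sample in list(x) + [0j] * latency:
        for stage in stages:
            sample = stage.step(sample)
        stream.append(sample)
    bits = n_points.bit_length() - 1
    out = [0j] * n_points
    for pos, value in enumerate(stream[latency:]):
        out[int(format(pos, f"0{bits}b")[::-1], 2)] = value   # undo bit reversal
    return out

spectrum = r2sdf_fft([1, 2, 3, 4, 5, 6, 7, 8])
```

For 8 points the model instantiates three stages with delay lines of length 4, 2 and 1, and the first valid output appears after 7 cycles, in line with the delay-element count given in step 3.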

The butterfly structures of the R2SDF FFT are referred to as Processing
Elements (PEs). Three PE structures are therefore required to perform
the 8-point R2SDF FFT and six PE structures are required to perform the
64-point R2SDF FFT. In this research work, a 64-point R2SDF FFT
processor is designed. To improve the performance of the frequency
transformation processor, three steps are considered in this research
work.

Step 1: Pipelining Mechanism

A pipelining mechanism is introduced in every PE structure to improve
the speed of the R2SDF FFT processor.

Step 2: Reduced Complex Multiplier Design

The complex multiplier accounts for a significant share of the hardware
requirement of the R2SDF FFT. Hence, to reduce the silicon area of the
R2SDF FFT, a reduced complex multiplier is designed. It has fewer adder
units than the conventional complex multiplier and is therefore referred
to as the “Reduced Complex Multiplier”.

Step 3: 64-point Pipelined R2SDF FFT Design

In this step, the developed pipelined PE structures and the designed
reduced complex multipliers are incorporated into the 64-point R2SDF FFT.

3.4 Pipelined Processing Element (PE) Structures for R2SDF FFT

The block diagram of the 64-point R2SDF FFT is illustrated in Figure
3.13. It consists of six processing elements to perform the FFT
computation. The input data sequence is divided into two parallel data
streams. Six delay units of length 32, 16, 8, 4, 2 and 1 are used to
process the two halves of the input data sequence at every stage.

Figure 3.13 Structure of Radix-2 Single-path Delay Feedback (R2SDF) FFT

[Figure labels: Input, PE1 (delay 32), PE2 (16), PE3 (8), PE1 (4), PE2 (2), PE3 (1), Twiddle Factor Coefficient, Output]

The architecture is composed of three different types of PEs, a complex
constant multiplier and delay-line (DL) buffers. The three processing
elements PE1, PE2 and PE3 have different architectures to perform the
different types of butterfly operations. Among them, the PE3 structure
implements a simple radix-2 butterfly and serves as a sub-module of the
PE2 and PE1 structures. The block diagram of the PE3 structure is
illustrated in Figure 3.14.

The real parts of the input and output samples are denoted Iin and Iout
respectively. Similarly, the imaginary parts of the input and output
samples are denoted Qin and Qout respectively.

Figure 3.14 Block diagram of PE3 structure for 64-point FFT

DL_Iin and DL_Iout denote the real parts of the input and output of the
DL buffers respectively. Similarly, DL_Qin and DL_Qout denote the
imaginary parts of the input and output of the DL buffers respectively.
In addition to the PE3 operations, a multiplication by -1 is needed in
the PE2 stage. The working principle of PE3 is as follows:

When S0 = 0,

    DL_Iin = Iin,     Iout = DL_Iout
    DL_Qin = Qin,     Qout = DL_Qout

When S0 = 1,

    DL_Iin = DL_Iout + (-Iin),     Iout = Iin + (-DL_Iout)
    DL_Qin = DL_Qout + (-Qin),     Qout = Qin + (-DL_Qout)
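The selection rules above can be transcribed literally into a small Python function. This is only a behavioural sketch of the stated equations; the function name `pe3` and the argument order are assumptions, and the signs are copied exactly as given in the text.

```python
def pe3(s0, i_in, q_in, dl_i_out, dl_q_out):
    """Return (i_out, q_out, dl_i_in, dl_q_in) for one PE3 evaluation."""
    if s0 == 0:
        # Pass-through phase: inputs go to the delay line, outputs come from it
        return dl_i_out, dl_q_out, i_in, q_in
    # Arithmetic phase, with the signs exactly as stated in the text
    i_out = i_in + (-dl_i_out)
    q_out = q_in + (-dl_q_out)
    dl_i_in = dl_i_out + (-i_in)
    dl_q_in = dl_q_out + (-q_in)
    return i_out, q_out, dl_i_in, dl_q_in

fill_phase = pe3(0, 1, 2, 3, 4)
compute_phase = pe3(1, 1, 2, 3, 4)
```

With S0 = 0 the element only moves data between the input, the delay line and the output; with S0 = 1 the adder/subtractor paths are selected.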

[Figure 3.14 labels: Iin, Qin, Iout, Qout, DL_Iin, DL_Iout, DL_Qin, DL_Qout, multiplexers with inputs 0 and 1 selected by S0]

The block diagrams of the PE2 and PE1 stages are shown in Figure 3.15
and Figure 3.16 respectively.

Figure 3.15 Block diagram of PE2 structure for 64-point FFT

Figure 3.16 Block diagram of PE1 structure for 64 point FFT

These processing elements provide a good solution for computing the
frequency transformation of input time samples. However, the
asynchronous effect of the input/output access mechanism is one of their
main disadvantages. For instance, a two-input device can operate only
after both inputs have arrived. Differences between the arrival times of
the two inputs therefore create an asynchronous condition which reduces
the speed of the device. Every block in the PE structures suffers from
this asynchronous effect, and as a result additional delay accumulates
in the PE structures.

[Figure 3.15 and Figure 3.16 labels: Iin, Qin, Iout, Qout, DL_Iin, DL_Iout, DL_Qin, DL_Qout, PE3, multiplication by -1, multiplexers with inputs 0 and 1 selected by S1 and S2]

In our research work, the asynchronous effect of the FFT blocks is
addressed to reduce this delay. To remove the asynchronous effect,
pipelining registers are used in every processing element structure.
These pipelining registers align the arrival times of the inputs from
all sources. Each register unit is built from Flip-Flops (FFs), which
synchronize the signals to the clock. Hence, the processing speed of
every processing element is higher in the pipelined PE structures. Block
diagrams of the pipelined PE3, PE2 and PE1 structures are illustrated in
Figure 3.17, Figure 3.18 and Figure 3.19 respectively.

Figure 3.17 Block diagram of Pipelined PE3 structure for 64 point FFT

[Figure 3.17 labels: Iin, Qin, Iout, Qout, DL_Iin, DL_Iout, DL_Qin, DL_Qout, four Reg blocks (Pipelining Registers), multiplexers with inputs 0 and 1 selected by S0]



3.5 DESIGN OF REDUCED COMPLEX MULTIPLIER

In FFT computation, signed adders, signed subtractors and complex
multipliers are required to convert discrete-time signals into their
frequency-domain representation. Signed adders and signed subtractors
are fundamental logic functions which can easily be built from
half-adder and full-adder circuits. The complex multiplier, however, is
more difficult to implement with accurate results. The FFT computation
requires a complex multiplier for twiddle factor multiplication. For
instance, the 2-point FFT multiplies the output of the signed
subtraction by the twiddle factor W_2^0. Similarly, the 4-point FFT
computation requires twiddle factor multiplication by W_4^0 and W_4^1.
The value of the first twiddle factor (W_4^0) is 1, hence there is no
need to perform a multiplication. Twiddle factors whose real and
imaginary parts have magnitude 1/√2 = 0.707, such as W_8^1, however,
require a multiplier for the constant 0.707. Bit-parallel multipliers
have been suggested by several researchers, such as Manimaran, A [28]
and Kandhi Srikanth [19], for performing multiplication by fractional
values.

[Figures 3.18 and 3.19 (pipelined PE2 and PE1 structures) labels: Iin, Qin, Iout, Qout, DL_Iin, DL_Iout, DL_Qin, DL_Qout, PE3, multiplication by -1, twiddle factor input, Reg blocks (Pipelining Registers), multiplexers with inputs 0 and 1 selected by S1 and S2]

3.5.1 Bit Parallel Multiplier for 1/√2

The bit-parallel multiplier is based on shift and add operations, so the
rounded value of a multiplication is easy to estimate. The structure of
the bit-parallel multiplication is illustrated in Figure 3.20, as in
Manimaran, A [28].

Figure 3.20 Circuit diagram of the bit-parallel multiplication by 1/√2

[Figure labels: In, four shifters (>>), Output]

Four different bit shifters and four adders are used to generate the
bit-parallel multiplication by 1/√2. The hardware structure of the
bit-parallel multiplication by 1/√2 can be further reduced by one adder
and one shifter, as in Yu, C [57]. The circuit diagram of the reduced
bit-parallel multiplier for 1/√2 is illustrated in Figure 3.21. The
multiplication by 1/√2 using the bit-parallel multiplier is derived as
follows:

\[ \text{Output} = \text{in} \cdot \tfrac{\sqrt{2}}{2} \approx \text{in} \cdot \left( 2^{-1} + 2^{-3} + 2^{-4} + 2^{-6} + 2^{-8} + 2^{-14} \right) \tag{3.12} \]

\[ \text{Output} = \text{in} \cdot \tfrac{\sqrt{2}}{2} \approx \text{in} \cdot \left[ 1 + \left( 1 + 2^{-2} \right) \left( 2^{-6} - 2^{-2} \right) \right] \tag{3.13} \]
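The shift-and-add idea behind equations (3.12) and (3.13) can be sketched on integers in Python. The exponents used here, (1, 3, 4, 6, 8, 14) for the full form and the factorisation 1 + (1 + 2^-2)(2^-6 - 2^-2) for the reduced form, are a reading of the two equations and should be treated as an assumption, as are the function names; right shifts truncate, so the results are approximations of in/√2.

```python
import math

def mul_inv_sqrt2_full(value):
    """Sum of six shifted copies: 2^-1 + 2^-3 + 2^-4 + 2^-6 + 2^-8 + 2^-14."""
    return ((value >> 1) + (value >> 3) + (value >> 4)
            + (value >> 6) + (value >> 8) + (value >> 14))

def mul_inv_sqrt2_reduced(value):
    """Factored form: value * [1 + (1 + 2^-2)(2^-6 - 2^-2)] with three shifters."""
    partial = value + (value >> 2)               # value * (1 + 2^-2)
    return value + (partial >> 6) - (partial >> 2)

sample = 1 << 16
full_result = mul_inv_sqrt2_full(sample)         # 46340
reduced_result = mul_inv_sqrt2_reduced(sample)   # 46336
exact = sample / math.sqrt(2)                    # about 46340.95
```

The reduced form needs shifts by 2, 6 and 2 bits (the 6-bit shift can be realised as a 4-bit shift after a 2-bit shift) and three add/subtract operations, at the cost of a slightly larger approximation error.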

Figure 3.21 Circuit diagram of reduced bit parallel multiplier for 1/√2

[Figure labels: In, >>2, >>4, >>2, Output]

Figure 3.22 shows the butterfly structure of the FFT built with the
bit-parallel multiplier. In this structure, the bit-parallel multiplier
is used in place of the twiddle factor multiplier.

Figure 3.22 Butterfly structure of FFT with the help of bit parallel multiplier

Twiddle factor multiplication by 0.707 is sufficient for both 4-point
and 8-point FFT computation. In the case of the 16-point FFT, twiddle
factor multiplication by 0.3826 and 0.9238 is also needed. Hence, a
complex multiplier has been designed that provides all the twiddle
factor multiplications in a single multiplier circuit. The different
twiddle factor values are selected by the circuit switch of the complex
multiplier. Such a complex multiplier circuit has been designed in Yu, C
[57] and Manimaran, A [28].

3.5.2 Design of Complex Multiplier

The structure of the complex multiplier is illustrated in Figure 3.23.
It consists of three multipliers and five adders, which perform the
different twiddle factor multiplications for the 64-point FFT. The
circuit switch of the complex multiplier selects among the different
twiddle factor values.

[Figure 3.22 labels: Iin, Qin, two 1/√2 multipliers, -, Iout, Qout]

Figure 3.23 Structure of Complex Multiplier

[Figure 3.23 labels: Circuit Switch, i1 to i8, q0 to q8]

The complex multiplier of the FFT is designed using the generalized
expression

\[ I_{out} = I_{in}\, i_1 - Q_{in}\, q_1 \tag{3.14} \]

Equation (3.14) is expanded by adding and subtracting the term
I_in q_1, as given in equation (3.15), and rearranged to equation (3.16).

\[ I_{out} = I_{in}\, i_1 - Q_{in}\, q_1 + I_{in}\, q_1 - I_{in}\, q_1 \tag{3.15} \]

\[ I_{out} = I_{in}\, i_1 + I_{in}\, q_1 - I_{in}\, q_1 - Q_{in}\, q_1 \tag{3.16} \]

Taking I_in and q_1 as common factors,

\[ I_{out} = I_{in} (i_1 + q_1) - q_1 (I_{in} + Q_{in}) \tag{3.17} \]

Equation (3.17) is the final expression for the real part of the
complex multiplier output. Similarly, the imaginary part of the output
is

\[ Q_{out} = I_{in}\, q_1 + Q_{in}\, i_1 \tag{3.18} \]

Equation (3.18) is expanded by adding and subtracting the term
Q_in q_1, as given in equation (3.19), and rearranged in equation (3.20).

\[ Q_{out} = I_{in}\, q_1 + Q_{in}\, i_1 + Q_{in}\, q_1 - Q_{in}\, q_1 \tag{3.19} \]

\[ Q_{out} = Q_{in}\, i_1 - Q_{in}\, q_1 + I_{in}\, q_1 + Q_{in}\, q_1 \tag{3.20} \]

Taking Q_in and q_1 as common factors,

\[ Q_{out} = Q_{in} (i_1 - q_1) + q_1 (I_{in} + Q_{in}) \tag{3.21} \]

Equation (3.21) is the final expression for the imaginary part of the
complex multiplier output. From equations (3.17) and (3.21), it is clear
that the complex multiplier requires five adders and three multipliers.
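The final expressions (3.17) and (3.21) can be checked with a short Python sketch. The function name `complex_multiply_3mult` is an assumption; the comments count the multipliers and adders so that the total of three multipliers and five adders can be verified by inspection.

```python
def complex_multiply_3mult(i_in, q_in, i1, q1):
    """Multiply (i_in + j*q_in) by (i1 + j*q1) with three real multiplications."""
    shared = q1 * (i_in + q_in)              # multiplier 1, adder 1
    i_out = i_in * (i1 + q1) - shared        # multiplier 2, adders 2 and 3
    q_out = q_in * (i1 - q1) + shared        # multiplier 3, adders 4 and 5
    return i_out, q_out

i_out, q_out = complex_multiply_3mult(3.0, 4.0, 0.6, 0.8)
reference = (3.0 + 4.0j) * (0.6 + 0.8j)      # direct product: -1.4 + 4.8j
```

The product q1·(Iin + Qin) is shared between the real and imaginary outputs, which is why three multipliers suffice instead of the four needed by the direct form.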

3.5.3 Reduced Complex Multiplier design

In the traditional complex multiplier design, three multipliers and five
adders are used for the twiddle factor multiplication of the 64-point
FFT. In our proposed work, low-density adders are identified and removed
to reduce the density and hardware complexity of the complex
multiplication. Hence, this multiplier is named the “Reduced Complex
Multiplier”.

In the traditional complex multiplier, the adders computing i1 + q1 and
i1 - q1 have a lower density than the other adder structures. Hence, in
the reduced complex multiplier, i1 + q1 and i1 - q1 are stored in a
look-up table (LUT) and the other elements remain unchanged. The
structure of the proposed reduced complex multiplier is illustrated in
Figure 3.24.

Figure 3.24 Structure of Proposed Reduced Complex Multiplier

This multiplier architecture performs the twiddle factor multiplication
according to the configuration of its circuit switches. Hence, the
reduced complex multiplier is also termed a reconfigurable complex
multiplier. The proposed Reduced Complex Multiplier consists of only 3
adders and 3 multipliers for the 64-point FFT computation. Hence, it
consumes less hardware and delay than the traditional complex
multiplier. The proposed complex multiplier handles the different
twiddle factor values shown in Table 3.2. In every stage, bit-parallel
multipliers are used to multiply the input values by the twiddle factor.
Hence, this complex multiplier is also referred to as the “Reduced
Reconfigurable Complex Multiplier”. The circuit switch of the reduced
complex multiplier selects the suitable bit-parallel multiplier in every
stage. This reduced complex multiplier is then integrated into the
normal Radix-2 butterfly structures to improve the performance of the
frequency transformation.

[Figure 3.24 labels: Iin, Qin, Iout, Qout, Circuit Switch, (i+q), q(7:0), (i-q), -]

Table 3.2 Twiddle factor values for 64-point FFT

Coefficient   Value     Coefficient   Value
Real_1        0.7071    Imag_1        0.7071
Real_2        0.7730    Imag_2        0.6343
Real_3        0.8314    Imag_3        0.5555
Real_4        0.8819    Imag_4        0.4713
Real_5        0.9238    Imag_5        0.3826
Real_6        0.9569    Imag_6        0.2902
Real_7        0.9807    Imag_7        0.1950
Real_8        0.9951    Imag_8        0.0980
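The look-up-table idea of the reduced complex multiplier can be sketched in Python using the coefficient pairs of Table 3.2. This is a behavioural illustration under stated assumptions: the names `TWIDDLES`, `LUT` and `reduced_complex_multiply` are hypothetical, the `select` index stands in for the circuit switch, and floating-point multiplication replaces the bit-parallel multipliers.

```python
# Coefficient pairs (real, imaginary) copied from Table 3.2
TWIDDLES = [
    (0.7071, 0.7071), (0.7730, 0.6343), (0.8314, 0.5555), (0.8819, 0.4713),
    (0.9238, 0.3826), (0.9569, 0.2902), (0.9807, 0.1950), (0.9951, 0.0980),
]

# Precomputed (i + q, i - q) pairs; these replace two run-time adders
LUT = [(i + q, i - q) for i, q in TWIDDLES]

def reduced_complex_multiply(i_in, q_in, select):
    """Multiply (i_in + j*q_in) by the selected coefficient pair (i + j*q)."""
    _, q = TWIDDLES[select]
    sum_iq, diff_iq = LUT[select]
    shared = q * (i_in + q_in)           # multiplier 1, adder 1
    i_out = i_in * sum_iq - shared       # multiplier 2, adder 2
    q_out = q_in * diff_iq + shared      # multiplier 3, adder 3
    return i_out, q_out

i_out, q_out = reduced_complex_multiply(1.0, 2.0, 4)   # uses Real_5 and Imag_5
```

Because i + q and i - q are constants for each coefficient pair, reading them from the table leaves three multipliers and three adders in the data path, which matches the component count stated for the proposed multiplier.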

3.6 Design of Radix-2 Single path Delay Feedback (R2SDF) FFT for 64-point

The Radix-2 Single path Delay Feedback (R2SDF) FFT is a “stream-like”
processing of a block-based algorithm. It reduces the processing time of
the FFT computation. A single delay path is used in every stage to
process the first half of the input points together with the other half.
In every stage, the reconfigurable complex multiplier performs the
twiddle factor multiplication. The architecture of the 64-point R2SDF
FFT is illustrated in Figure 3.25.

Figure 3.25 Architecture of 64-point R2SDF FFT

In every stage of the 64-point R2SDF FFT, both the pipelined PE
structures and the reduced complex multiplier are integrated to improve
the performance of the processor. In the pipelining technique,
pipelining registers are used to eliminate the asynchronous effects
between inputs and outputs. Hence, the processing speed of the pipelined
PE structures is improved significantly. In addition, the reduced
complex multiplier produces the twiddle factor multiplication results
with less hardware. Hence, the processing speed of the complex
multiplier is also improved. Therefore, the performance of the R2SDF FFT
improves when the pipelined PE structures and the reduced complex
multiplier are incorporated into the R2SDF FFT architecture.

[Figure 3.25 labels: six cascaded Butterfly stages with delay lines 32D, 16D, 8D, 4D, 2D and 1D]

SUMMARY

Orthogonal Frequency Division Multiplexing (OFDM) is a wireless
communication technique in which modulation and demodulation, FFT and
IFFT, and encoding and decoding play important roles in data
communication services. In our research work, the frequency
transformation is considered in order to improve the performance of
OFDM for MANET applications.

FFT is widely used for converting time-domain signals into
frequency-domain signals. To improve the architecture of the FFT model,
the R2SDF structure is used in this research work. The R2SDF FFT is a
stream-like processor. However, it has two disadvantages: the
asynchronous effect of the intermediate inputs/outputs and the
complexity of the complex multiplier. To overcome these disadvantages,
the following steps are taken in this research work.

Pipelining is introduced into the PE structures of the R2SDF FFT to
increase the processing speed. Pipelining registers are used to remove
the asynchronous effect of the R2SDF FFT.

A reduced complex multiplier is designed to reduce the complexity of
the complex multiplier. Low density hardware components have been
identified and eliminated to reduce the hardware complexity of the
complex multiplier. The proposed reduced complex multiplier performs
the different twiddle factor multiplications with less hardware. Hence,
the processing speed of the multiplication also improves.

Finally, both the pipelined PE structures and the reduced complex
multiplier are integrated into the R2SDF FFT processor to improve the
performance of the frequency transformation.


CHAPTER 4

HAMMING SINGLE ERROR CORRECTION – TRIPLE

ADJACENT ERROR DETECTION CODE ALGORITHM FOR

DATA COMMUNICATION

4.1 Error Detection and Correction (EDC) Codes

Error detection and correction codes provide reliable delivery of
information signals. The small size of transistors and capacitors,
combined with radiation effects from cosmic rays, causes occasional
errors in large information stores. Such errors arise in RAM chips, and
they can be detected and corrected by employing error-detecting and
error-correcting codes in RAMs. The simplest error-detecting scheme is
the parity bit. The parity of the information word is checked after the
data are read from memory or registers. With even parity, the
information word is taken as correct when it contains an even number of
1's and as incorrect when it contains an odd number of 1's.
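The even-parity check described above can be sketched in a few lines of Python. This is only an illustration: the 7-bit word and the helper names (`even_parity_bit`, `passes_even_parity`, `flip`) are assumptions introduced here and do not come from the thesis.

```python
def even_parity_bit(word: str) -> int:
    """Return the bit that makes the total number of 1's even."""
    return word.count("1") % 2


def passes_even_parity(stored: str) -> bool:
    """A stored word (data plus parity bit) is accepted when it holds an even number of 1's."""
    return stored.count("1") % 2 == 0


def flip(word: str, index: int) -> str:
    """Invert the bit at the given 0-based index."""
    bit = "1" if word[index] == "0" else "0"
    return word[:index] + bit + word[index + 1:]


data = "1011001"                          # hypothetical information word
stored = data + str(even_parity_bit(data))
one_error = flip(stored, 2)               # a single bit flip changes the parity
two_errors = flip(one_error, 3)           # a second flip restores it

print(passes_even_parity(stored))         # word as written
print(passes_even_parity(one_error))      # single error is detected
print(passes_even_parity(two_errors))     # double error goes unnoticed
```

A single parity bit therefore detects any odd number of bit flips but cannot locate them, and it misses any even number of flips. This is the motivation for the Hamming codes discussed in the following sections.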

Different types of Error Detection and Correction (EDC) codes are
available to transmit information from source to destination without
error. All of these EDC codes fall into two classes of binary codes:
block codes and convolutional codes.

Block codes include both linear and cyclic codes, which encode the data
in blocks. In linear block codes, the input bits are partitioned into
blocks. Linear block codes are used in Forward Error Correction (FEC),
in which symbols are transmitted over a communication channel so that
errors occurring in transmission can be detected or corrected using the
parity of the block code. Cyclic codes are also block codes, in which a
circular shift of any code word produces another code word.

On the other hand, convolutional codes are encoders in which a
convolution operation is applied sequentially to the bit sequence. In a
convolutional encoder, a 2N-bit code word is generated from N bits of
data. Different types of block and convolutional encoders have been
suggested in the past to achieve error-free transmission. The basic
classification of block and convolutional EDC codes is illustrated in
Figure 4.1.

[Figure 4.1: Error Detection and Correction (EDC) divides into
Convolutional Codes (Viterbi, Turbo Codes) and Block Codes (LDPC, Reed
Solomon).]

Figure 4.1 Classification of Error Detection and Correction (EDC) codes

Viterbi and Turbo codes are convolutional codes, because a convolution
function is performed in the encoder of the data transmission system.
On the other hand, Low Density Parity Check (LDPC) codes and Reed
Solomon codes are linear error detecting and correcting codes. Besides
LDPC and Reed Solomon codes, the Hamming code is also one of the best
linear EDCs. In a linear code, the modulo-2 sum of any two code words
is itself a code word. Hence, we can improve the probability of error
detection and error correction.

In this research work, the improvement in the detection probability of
Triple Adjacent Error Detection (TAED) is demonstrated with the help of
the Hamming EDC and the bit replacement algorithm for wireless Mobile
Ad-hoc Networks (MANET).

4.2 Hamming Error Detection and Correction Mechanism

Hamming codes are a family of linear block error detecting and
correcting codes that generalize the code invented by Richard Hamming
[42]. Hamming codes are perfect codes in which a single-bit error can
be detected and corrected for the given block length and minimum
distance. In Hamming EDC codes, parity bits are used to detect and
correct a single error. The difference between the block length and the
number of information bits is the number of parity bits (m).

Numerically, Hamming codes are characterized for m ≥ 3 by the
following:

$n = 2^{m} - 1$ (4.1)

$k = n - m$ (4.2)

$d_{min} = 3$ (4.3)

where n is the block size, k is the number of information bits, m is
the number of parity bits and d_min is the minimum Hamming distance.
The minimum Hamming distance of a Hamming code is always three,
whatever the number of parity bits. A minimum distance of three is what
allows Single Error Correction (SEC); the smallest such code, with
m = 3, uses three parity bits.
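As a quick check of equations 4.1 to 4.3, the following Python sketch tabulates the code parameters for a few values of m. The function name `hamming_parameters` is an assumption made for this illustration.

```python
def hamming_parameters(m: int) -> tuple[int, int, int]:
    """Return (n, k, d_min) for the Hamming code with m parity bits (m >= 3)."""
    n = 2 ** m - 1        # block size, equation 4.1
    k = n - m             # information bits, equation 4.2
    d_min = 3             # minimum Hamming distance, equation 4.3
    return n, k, d_min


for m in range(3, 6):
    n, k, d_min = hamming_parameters(m)
    print(f"m = {m}: ({n}, {k}) Hamming code, d_min = {d_min}")
```

For m = 3 this gives the (7, 4) code used in the examples of this chapter, and for m = 5 the (31, 26) code from which the (21, 16) shortened code of Section 4.2.2 is derived.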

For instance, consider the (7, 4) Hamming code, in which k = 4 (number
of information bits), n = 7 (block length) and the number of parity
bits is m = 3 (m = n - k). The minimum Hamming distance is therefore
three. Using Figure 4.2 and equations 4.4 to 4.6, we can determine the
values of the parity bits.

[Figure 4.2: diagram relating the information bits d1, d2, d3, d4 to
the parity bits P1, P2, P3.]

Figure 4.2 Parity bit calculations

P1 = d1 + d2 + d3 (4.4)

P2 = d1 + d2 + d4 (4.5)

P3 = d2 + d3 + d4 (4.6)

Here + denotes modulo-2 addition (XOR).

Table 4.1 Calculation of parity bits for (7, 4) hamming code

Information bits | Parity bits
d1 d2 d3 d4      | P1 (d1 + d2 + d3)  P2 (d1 + d2 + d4)  P3 (d2 + d3 + d4)
1  0  0  0       | 1                  1                  0
0  1  0  0       | 1                  1                  1
0  0  1  0       | 1                  0                  1
0  0  0  1       | 0                  1                  1

Before the Hamming (parity-check) matrix is calculated, the generator
matrix must be found from the input word length and the parity bits.
For the (7, 4) Hamming code, the input word length is four. Hence, the
4-bit identity matrix and the corresponding parity bits are combined to
form the generator matrix [G]. The calculation of the parity bits for
the 4-bit identity matrix is illustrated in Table 4.1.

$[G] = [P : I_k]$ (4.7)

where P is the k × m matrix of parity bits and I_k is the k × k
identity matrix. The generator matrix for the (7, 4) Hamming code is as
follows:

$$[G] = \begin{bmatrix} 1 & 1 & 0 & 1 & 0 & 0 & 0 \\ 1 & 1 & 1 & 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 & 0 & 0 & 1 \end{bmatrix} \qquad (4.8)$$

From the generator matrix, the Hamming matrix is obtained. Because the
code word is written as [parity bits : information bits], the identity
block comes first, so that every code word gives a zero syndrome:

$[H] = [I_m : P^{T}]$ (4.9)

$$[H] = \begin{bmatrix} 1 & 0 & 0 & 1 & 1 & 1 & 0 \\ 0 & 1 & 0 & 1 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 & 1 & 1 & 1 \end{bmatrix} \qquad (4.10)$$

Hamming encoding can be carried out with the help of Figure 4.2 and
equations 4.4 to 4.6. Hamming decoding is carried out with the help of
the syndrome vector, which determines the error location in the encoded
signal. The syndrome vector is obtained by multiplying the received
code word by the transpose of the Hamming matrix:

$[Syndrome] = [\text{code word}]\,[H]^{T}$ (4.11)

For instance, if the 4-bit information signal is 0100, the parity bits
are 111. The code word for information bits 0100, written as [parity
bits : information bits], is therefore

Code word = [1110100] (4.12)

The Hamming decoding process is as follows:

Step 1: Obtain the syndrome vector by multiplying the code word by the
transpose of the Hamming matrix.

Step 2: Find the error location with the help of the syndrome vector.

Step 3: If all the bits of the syndrome vector are 0, there is no error
in the transmission. If any bit of the syndrome vector is 1, a
single-bit error has occurred in the transmission. The syndrome then
gives the location of the error so that it can be corrected.

For instance, suppose the code word [1110100] is sent directly to the
decoding block. The syndrome vector is then [000], so there is no error
in the transmission. If the second bit of the code word is flipped, the
code word becomes [1010100]. The syndrome vector is now [010], which
equals the second column of [H] and therefore indicates the second
location. Hence, there is an error in the transmission. The Hamming
code detects the location of the error and corrects it, giving the
corrected code word [1110100]. In this way, the Hamming code can detect
and correct a single-bit error.
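The encoding and decoding steps above can be traced with a short Python sketch. It is an illustration only, under stated assumptions: the parity rows are taken from Table 4.1, the code word is ordered [parity bits : information bits], and the check matrix is assumed to be ordered H = [I_3 : P^T], which is the ordering that satisfies G·H^T = 0 for G = [P : I_4]. With a different column ordering of [H], the numeric syndrome for the same error would differ. The function names are introduced here for illustration.

```python
# Parity rows from Table 4.1: row i holds (P1, P2, P3) for the i-th information bit.
P = [[1, 1, 0],
     [1, 1, 1],
     [1, 0, 1],
     [0, 1, 1]]
K, M = 4, 3                     # information bits, parity bits


def identity(size):
    return [[1 if r == c else 0 for c in range(size)] for r in range(size)]


# G = [P : I_4] (4 x 7): the code word is [parity bits : information bits].
G = [P[r] + identity(K)[r] for r in range(K)]
# H = [I_3 : P^T] (3 x 7): assumed ordering, chosen so that G * H^T = 0.
H = [identity(M)[r] + [P[c][r] for c in range(K)] for r in range(M)]


def encode(info):
    """Multiply the information vector by G (modulo 2)."""
    return [sum(info[r] * G[r][c] for r in range(K)) % 2 for c in range(K + M)]


def syndrome(word):
    """Multiply the received word by the transpose of H (modulo 2)."""
    return [sum(word[c] * H[r][c] for c in range(K + M)) % 2 for r in range(M)]


def correct(word):
    """Flip the bit whose column of H equals the syndrome (single-error assumption)."""
    s = syndrome(word)
    if not any(s):
        return list(word)
    columns = [[H[r][c] for r in range(M)] for c in range(K + M)]
    fixed = list(word)
    fixed[columns.index(s)] ^= 1
    return fixed


codeword = encode([0, 1, 0, 0])
received = list(codeword)
received[1] ^= 1                # flip the second bit of the code word

print(codeword, syndrome(codeword))
print(received, syndrome(received), correct(received))
```

Under these assumptions the error-free word gives the all-zero syndrome, and the word with its second bit flipped gives a syndrome equal to the second column of H, so the decoder restores the original code word.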

As with SEC, Double Adjacent Error Detection (DAED) can also be
performed with a Hamming code by extending the parity bits. The
extended Hamming codes support the SEC mechanism but cannot support
double adjacent error correction, as shown in Sanchez-Macian [46].

However, as technology scales, it becomes more likely that more than
one memory cell or register is affected, causing multiple errors (Ibe,
E [18]). This is known as a Multiple Cell Upset (MCU) (Lawrence, R. K
[22]). The cells or registers affected by an MCU are physically close
and in many cases adjacent. This is because errors are created along
the path that information bit traverses. MCUs can therefore cause
multiple adjacent errors in a given information word, causing a failure
even when a SEC code is used. A SEC Hamming code can miscorrect when
two adjacent bits are in error. To overcome this problem, interleaving
is used in Zhao, J [58] and Baeg, S [8]; it places the bits of a word
physically apart so that an MCU can affect only one bit per word. But
interleaving makes the design more complex and can increase area and
power consumption. Hence, Single Error Correction – Double Adjacent
Error Detection (SEC-DAED) is the preferred solution to this problem.
To detect a double adjacent bit error, one more parity bit (i.e.
3 + 1 = 4) is required.

From the above discussion, it is clear that SEC-DAED codes can
miscorrect when three adjacent bits are in error. Hence, we prefer
Single Error Correction – Triple Adjacent Error Detection (SEC-TAED)
codes to detect triple adjacent errors. In both SEC-DAED and SEC-TAED
Hamming codes, Silent Data Corruption (SDC) can still occur, in which
the system is unaware that an error has occurred and continues its
operation.

An algorithm to generate the Hamming code word for a set of information
bits is as follows:

Step 1 Number the bit positions from 1 to n, where n is the last bit
position.

Step 2 Write all the positions in binary form: 1, 10, 11, 100, 101,
110, 111, 1000, 1001, etc.

Step 3 All bit positions that are powers of two (i.e. have only one 1
bit in the binary form of their position) are parity bits.

The positions checked by each parity bit are determined as follows:

[1] Parity position 2^0: check 1 bit and skip 1 bit, giving positions
1, 3, 5, 7, 9, ...

[2] Parity position 2^1: check 2 bits and skip 2 bits, giving positions
2, 3, 6, 7, 10, 11, 14, 15, ...

[3] Parity position 2^2: check 4 bits and skip 4 bits, giving positions
4, 5, 6, 7, 12, 13, 14, 15, 20, 21, 22, 23, ...

[4] Parity position 2^3: check 8 bits and skip 8 bits, giving positions
8-15, 24-31, 40-47, ...

[5] Parity position 2^4: check 16 bits and skip 16 bits, giving
positions 16-31, 48-63, ...

Finally, set a parity bit to 0 if the total number of ones in the
positions it checks is even, and to 1 if that number is odd. In this
way we obtain the code word of the Hamming code. Syndrome vectors are
used in the Hamming code to detect and correct a single-bit error. The
product of the lexicographic matrix and the code word of the given
information is called the syndrome vector. If the syndrome is the null
vector, there is no error in the data transmission. If the syndrome
takes a value from 1 to 2^m - 1, that value indicates the position of
the erroneous bit, which can then be inverted. In this way, Hamming
codes detect and correct a single error effectively.
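The steps above can be sketched as a small Python encoder. This is a minimal illustration, assuming even parity and the standard layout in which parity bits occupy the power-of-two positions and data bits fill the remaining positions in order; the function names are introduced here and are not taken from the thesis.

```python
def is_power_of_two(position: int) -> bool:
    return position & (position - 1) == 0


def covered_positions(parity_position: int, n: int) -> list[int]:
    """Positions 1..n whose binary form contains the parity position's bit."""
    return [p for p in range(1, n + 1) if p & parity_position]


def hamming_encode(data_bits: str) -> str:
    """Place data in the non-power-of-two positions, then fill the even-parity bits."""
    k = len(data_bits)
    m = 0
    while 2 ** m < k + m + 1:          # smallest number of parity bits that fits
        m += 1
    n = k + m
    word = [0] * (n + 1)               # index 0 unused so indices match positions 1..n
    data = iter(data_bits)
    for position in range(1, n + 1):
        if not is_power_of_two(position):
            word[position] = int(next(data))
    for i in range(m):
        parity_position = 1 << i
        # The parity slot still holds 0 here, so the sum covers only the other bits.
        ones = sum(word[p] for p in covered_positions(parity_position, n))
        word[parity_position] = ones % 2
    return "".join(str(bit) for bit in word[1:])


print(covered_positions(4, 23))
print(hamming_encode("1011"))
print(hamming_encode("01010100"))
```

The two calls to `hamming_encode` use the 4-bit and 8-bit data words that appear in the examples of Sections 4.2.1 and 4.2.2.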

4.2.1 Single Error Correction (SEC) Hamming Codes

The SEC Hamming code uses the lexicographic Hamming matrix to determine
the erroneous bit. The lexicographic matrix is constructed by writing
the binary representations of the numbers from 1 to 2^m - 1 as its
columns. In general, a Hamming code is denoted (n, k), where n is the
block size and k is the number of information bits.

Consider the (7, 4) Hamming code, where the information length is four
and the block size is 7. The number of parity bits is therefore three.
Hence, the lexicographic matrix of the (7, 4) Hamming code can be
written as follows:

$$H = \begin{bmatrix} 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 \end{bmatrix} \qquad (4.13)$$

Let the four-bit information input = 1011.

Length of codeword = Length of information bits + Length of parity bits

Length of codeword = 4 + 3

Length of codeword = 7.

The code word can be formulated as follows, with the parity bits at the
power-of-two positions:

Position | 1       2       3  4       5  6  7
Bit      | Parity1 Parity2 1  Parity3 0  1  1

Parity 1 = Even parity of positions [1, 3, 5, 7] = 1 XOR 0 XOR 1 = 0.

Parity 2 = Even parity of positions [2, 3, 6, 7] = 1 XOR 1 XOR 1 = 1.

Parity 3 = Even parity of positions [4, 5, 6, 7] = 0 XOR 1 XOR 1 = 0.

Hence, the code word is formed as

Parity1 (2^0) | Parity2 (2^1) | Data1 | Parity3 (2^2) | Data2 | Data3 | Data4
0             | 1             | 1     | 0             | 0     | 1     | 1

The transpose of the lexicographic matrix is used to compute the
syndrome vector; its rows list the syndrome associated with each bit
position.

$$\text{Syndrome} = H^{T} = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \\ 1 & 0 & 0 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix} \qquad (4.14)$$

Multiplying the Hamming code word by the transpose of the lexicographic
matrix gives the syndrome vector. It is denoted as ‘S’.

$$S = \begin{bmatrix} 0 & 1 & 1 & 0 & 0 & 1 & 1 \end{bmatrix} H^{T} = \begin{bmatrix} 0 & 0 & 0 \end{bmatrix} \qquad (4.15)$$

Hence, the syndrome is the null vector. From the value of S, it is
clear that there is no error in the data transmission. If an error
occurs in the Hamming code word, the syndrome takes a value from 1 to
2^m - 1. For instance, if the code word for information bits 1011 is
received as 0111011, the syndrome is [100]. This value is found in the
fourth row of the matrix in equation 4.14. Hence, the fourth bit of the
code word is inverted to correct the single error. In this way, SEC
Hamming codes effectively detect and correct a single-bit error. When
errors occur in two adjacent positions, however, this fixed algorithm
can miscorrect the code word. For instance, if the code word for
information bits 1011 is received as 0111111, the syndrome is [001].
This value is found in the first row of the matrix in equation 4.14.
But the first bit was transmitted correctly; it is not in error, so the
decoder inverts a correct bit. In this way, SEC Hamming codes can
miscorrect the code word when double adjacent bit errors occur. This
causes an SDC effect, which leads to incorrect system behaviour and
further data corruption. To reduce this problem, SEC-DAED Hamming codes
have been introduced in the past.
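The single-error and double-adjacent-error cases above can be reproduced with a short Python sketch. It assumes the lexicographic check matrix, for which the syndrome equals the XOR of the positions that hold a 1, and it takes the transmitted word for information bits 1011 to be 0110011, which follows from the parity values 0, 1 and 0 computed above with the parity bits at positions 1, 2 and 4. The function names are illustrative.

```python
def syndrome(word: str) -> int:
    """XOR of the 1-based positions holding a 1 (word times H^T for the lexicographic H)."""
    value = 0
    for position, bit in enumerate(word, start=1):
        if bit == "1":
            value ^= position
    return value


def sec_decode(word: str) -> str:
    """Invert the bit that the syndrome points to, as a plain SEC decoder does."""
    s = syndrome(word)
    if s == 0 or s > len(word):
        return word
    bits = list(word)
    bits[s - 1] = "1" if bits[s - 1] == "0" else "0"
    return "".join(bits)


sent = "0110011"        # assumed transmitted code word for information bits 1011
single = "0111011"      # bit 4 in error
double = "0111111"      # bits 4 and 5 in error

print(syndrome(sent), syndrome(single), syndrome(double))
print(sec_decode(single) == sent)       # single error is repaired
print(sec_decode(double), sec_decode(double) == sent)
```

For the double adjacent error the decoder inverts the first bit and returns a word that is itself a valid code word (its syndrome is zero), so the corruption passes unnoticed. This is the SDC effect described above.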

4.2.2 Single Error Correction–Double Adjacent Error Detection

(SEC-DAED) Hamming Code

As discussed in the previous section, SEC Hamming codes can miscorrect
the code word. To avoid this problem, the Hamming code can be extended
by adding one more parity bit, i.e. four parity bits are used to detect
double adjacent errors. Hence, this Hamming code is named the “Extended
Hamming Code”.

To increase the detection probability of SEC-DAED, two algorithms have
been suggested in Sanchez-Macian, Alfonso [46]:

• Bit Shortening Algorithm

• Bit Replacement Algorithm

The bit shortening algorithm for improving the detection probability of
SEC-DAED can be explained as follows.

For a 16-bit data word (k = 16), the shortening technique is applied to
a (31, 26) Hamming code, producing a (21, 16) SEC code. Hence, 10
columns are removed from the original matrix. The shortening procedure
for normal Hamming codes is explained below:

• Fill the first k = 16 columns with odd-weight values to maximize
double error detection. A double error affecting any two of these
columns produces an even-weight syndrome, so it does not correspond to
any of these columns.

• Sort those columns so as to maximize the different even-weight
syndromes. Adjacent errors on these k = 16 columns can produce 15
syndromes. The goal is to maximize the coincidences between these
syndrome values.

• The remaining n - k = 5 columns need to be filled with even-weight
values. Even so, an adjacent error at the transition between the last
odd-weight column and the first even-weight column would produce a
miscorrection, because its syndrome corresponds to a different existing
odd-weight column. So a specific odd-weight column is selected and
removed from the matrix to account for the identified odd-weight
syndrome.

• In total, 6 columns (the 5 columns plus the removed one) are filled
with even-weight values, placed in the appropriate order and excluding
those that coincide with a previous double-adjacent error syndrome.
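The reasoning behind the first step of the shortening procedure can be checked numerically. The Python sketch below is an illustration only: it enumerates the odd-weight 5-bit column values and confirms that the syndrome of any double error on two such columns (their XOR) has even weight, so it can never be mistaken for one of those columns. It does not reproduce the column ordering or the later column substitution of the cited algorithm.

```python
from itertools import combinations

M = 5  # parity bits of the (21, 16) shortened code


def weight(value: int) -> int:
    """Number of 1 bits in the binary form of value."""
    return bin(value).count("1")


# Candidate data columns: every non-zero 5-bit value with an odd number of 1's.
odd_columns = [v for v in range(1, 2 ** M) if weight(v) % 2 == 1]

# A double error on two columns produces the XOR of those columns as syndrome.
pair_syndromes = {a ^ b for a, b in combinations(odd_columns, 2)}
all_even = all(weight(s) % 2 == 0 for s in pair_syndromes)

print(len(odd_columns))                        # number of odd-weight columns available
print(all_even)                                # every double-error syndrome has even weight
print(pair_syndromes.isdisjoint(odd_columns))  # none equals an odd-weight column
```

There are exactly 16 odd-weight 5-bit values, which matches the k = 16 data columns of the first step.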

On the other hand, the bit replacement algorithm is an efficient
algorithm with which a single error can be successfully detected and
corrected. In addition, the maximum number of double and triple
adjacent errors can be detected with the help of the bit replacement
algorithm.

In the Hamming code model, the number of parity bits must be extended
to detect double and triple adjacent errors successfully. For instance,
consider a (12, 8) Hamming code. It has four parity bits, so the
maximum syndrome value is 15, but there are only 12 positions in the
shortened code. If the syndrome takes the value 13, 14 or 15, a double
or triple adjacent error has been detected. However, the detected
double or triple adjacent error cannot be corrected, because the
four-bit syndrome 1101 (13), 1110 (14) or 1111 (15) does not correspond
to any bit position. Hence, the positions of the errors cannot be
determined, and any attempt to correct the double or triple adjacent
error would miscorrect the word. The syndrome vector is defined as the
product of the code word and the transpose of the Hamming matrix. In
the traditional method, the lexicographic matrix is used for this
purpose. The lexicographic matrix is as follows:

$$H = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 & 0 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 \end{bmatrix} \qquad (4.16)$$

If no error occurs in the code word, the syndrome vector that results
from the product of the code word and the transpose of the
lexicographic matrix is the four-bit all-zero vector. For instance,
consider a (12, 8) Hamming code in which the data bits (01010100) are
coded as (000010110100). The syndrome vector is then (0000). Hence,
there is no error in the data transmission. When an error occurs and,
for instance, the fifth bit is changed, the code word turns into
(000000110100). The product of this vector and the lexicographic check
matrix gives the syndrome vector (0101), the binary representation of
five. Hence, the fifth bit can be flipped back, and the single error is
detected and corrected successfully.

Alternatively, a Hamming code can be used to correct single errors as
well as to detect double and triple adjacent errors. Traditionally, if
the minimum distance between two code words is three, it is not
possible to distinguish between single and double errors. For example,
returning to the previous example, a double error in positions 3 and 4
of the original word gives the vector (001110110100). The syndrome in
this case is (0111), the binary representation of 7. The decoder would
therefore invert the seventh bit and correct the code word into
(001110010100) instead of the right word. This miscorrection is termed
Silent Data Corruption (SDC). To maximize the detection probability of
double as well as triple adjacent errors, Hamming codes are extended in
Sanchez-Macian, Alfonso [46].
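The (12, 8) cases above can be checked by building the lexicographic check matrix explicitly and multiplying it by the received word. The Python sketch below is illustrative: it assumes that column j of H is the binary form of j with the most significant bit in the first row, and the helper names are introduced here.

```python
N, M = 12, 4   # code length and number of parity bits

# Lexicographic check matrix: column j (1-based) is the binary form of j.
H = [[(j >> (M - 1 - r)) & 1 for j in range(1, N + 1)] for r in range(M)]


def syndrome(word: str) -> int:
    """Multiply the word by H^T (modulo 2) and read the result as a binary number."""
    bits = [int(b) for b in word]
    rows = [sum(h * b for h, b in zip(row, bits)) % 2 for row in H]
    return int("".join(str(r) for r in rows), 2)


def flip(word: str, *positions: int) -> str:
    """Invert the bits at the given 1-based positions."""
    bits = list(word)
    for p in positions:
        bits[p - 1] = "1" if bits[p - 1] == "0" else "0"
    return "".join(bits)


sent = "000010110100"                    # code word for data bits 01010100
one_error = flip(sent, 5)                # single error in position 5
two_errors = flip(sent, 3, 4)            # double adjacent error in positions 3 and 4
miscorrected = flip(two_errors, syndrome(two_errors))

print(syndrome(sent), syndrome(one_error), syndrome(two_errors))
print(two_errors, miscorrected, syndrome(miscorrected))
```

The double adjacent error in positions 3 and 4 produces the syndrome of position 7, so a plain SEC decoder inverts a third bit and lands on another valid code word. Only syndromes greater than the code length (13, 14 or 15) can be recognised as uncorrectable.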

This solution increases the minimum distance to four and allows both
single error correction and double error detection (SEC-DED)
simultaneously. To maximize the detection probability of double
adjacent errors, the Bit Replacement algorithm was preferred in the
previous research work. In the bit replacement algorithm, the bits of
the encoded code word are re-ordered so as to increase the detection
probability of double or triple adjacent errors. Figure 4.3 shows the
flow chart of the selective bit placement strategy. The algorithm of
Figure 4.3 selects the combinations of double or triple adjacent errors
using the MATLAB simulation tool. If the obtained syndrome value is
greater than the code length, a double or triple adjacent error has
occurred.

The bit placement algorithm for detecting double adjacent errors using
the Hamming (12, 8) code is shown in Table 4.2.

Table 4.2 Double Adjacent Error Detection for Hamming (12, 8)

Bit Placement                 | Detection
1 2 3 4 5 6 7 8 9 10 11 12    | 1/11 (9%)
1 12 2 3 6 8 7 9 4 10 5 11    | 9/11 (82%)

Fifteen combinations of double errors were identified using MATLAB:
1-12, 2-12, 3-12, 4-9, 4-10, 4-11, 5-8, 5-10, 5-11, 6-8, 6-9, 6-11,
7-8, 7-9 and 7-10. In the normal order, only the 7-8 combination is
adjacent, so only that double adjacent error is detected. Hence, only
9% detection efficiency is obtained with the normal-order Hamming code.
With the modified bit placement strategy, 9 adjacent combinations are
detected: 1-12, 2-12, 4-9, 4-10, 5-10, 5-11, 6-8, 7-8 and 7-9. Hence,
82% detection efficiency is achieved with the bit placement algorithm
based Hamming code.
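The selection step and the two rows of Table 4.2 can be reproduced with a short Python sketch (the thesis uses MATLAB for this step; Python is used here only for illustration). It assumes the lexicographic check matrix, so that the syndrome of a multi-bit error is the XOR of the original bit positions involved, and it applies the detection rule stated above: the error is detected when the syndrome exceeds the code length. The names are illustrative.

```python
from itertools import combinations

N = 12   # code length of the (12, 8) Hamming code

# Double-error combinations whose syndrome is larger than the code length.
detectable_pairs = [(a, b) for a, b in combinations(range(1, N + 1), 2) if a ^ b > N]


def adjacent_detection(order, burst=2):
    """Return (detected, total) over all runs of `burst` physically adjacent bits."""
    windows = [order[i:i + burst] for i in range(len(order) - burst + 1)]
    detected = 0
    for window in windows:
        s = 0
        for position in window:      # syndrome = XOR of the original positions
            s ^= position
        if s > N:
            detected += 1
    return detected, len(windows)


normal_order = list(range(1, N + 1))
reordered = [1, 12, 2, 3, 6, 8, 7, 9, 4, 10, 5, 11]   # second row of Table 4.2

print(len(detectable_pairs), detectable_pairs)
print(adjacent_detection(normal_order))
print(adjacent_detection(reordered))
```

The placement search itself (the manual rearrangement step of the flow chart) is not automated here; the sketch only scores a given placement.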

[Figure 4.3: flow chart with the following blocks, in order: Choose a
code word; Generate next double/triple bit error; Multiply by
Lexicographic Matrix; decision "Syndrome > codelength" (YES/NO); Select
Combination; decision "Any double/triple error remaining" (YES/NO);
Print Selected Combinations; Rearrange the Code Bit Positions
Manually.]

Figure 4.3 Flow chart of Bit Placement Strategy

4.2.3 Single Error Correction – Triple Adjacent Error Detection

(SEC-TAED)

The parity bits can be extended to correct a single error as well as to
detect triple adjacent errors. Hence, the SEC-TAED based Hamming code
is referred to as the “Extended Hamming SEC-TAED code”. As with the
double-error combinations, the triple-error combinations are determined
using MATLAB. Forty-nine triple-error combinations are found in the
normal bit-order Hamming code. The Hamming (12, 8) SEC-DAED code can
miscorrect when triple adjacent errors occur. It therefore requires one
more parity bit ‘p’ to perform the SEC and TAED operations. The
SEC-TAED process with the (13, 8) Hamming code is described in
Sanchez-Macian, Alfonso [47]. The Hamming (13, 8) code for detecting
triple adjacent errors is illustrated in Table 4.3.

Table 4.3 Triple Adjacent Error Detection for Hamming (13, 8)

Bit Placement                   | Detection
1 2 3 4 5 6 7 8 9 10 11 12 p    | 1/11 (9%)
6 8 1 7 11 3 5 9 2 4 p 10 12    | 9/11 (82%)

In the normal bit order, only the 10-11-12 combination of triple
adjacent errors is detected. Hence, only 9% detection efficiency is
achieved with Hamming TAED codes. With the bit-reordered code word, 9
combinations of triple adjacent errors (all except 2-4-p and p-10-12)
are detected. Hence, 82% detection efficiency is achieved with the
modified bit-reordered SEC-TAED process. However, the main disadvantage
of the Hamming (13, 8) SEC-TAED code is its reduced performance, in
terms of longer processing time and higher power consumption, due to
the additional parity bit (‘p’). To overcome this problem, an enhanced
extended Hamming (12, 8) code is developed in the current research work
for detecting and correcting a single error as well as detecting triple
adjacent errors. It supports the channel decoder of the MIMO-OFDM
system, which can be further extended to MANET based temporary network
operations.

4.3 Proposed Extended (12, 8) Hamming Code for SEC-TAED

In the proposed methodology, the bit replacement algorithm is used to
change the order of the code word and to maximize the detection
probability of triple adjacent errors. A further key consideration of
our research is the use of the Hamming (12, 8) code. As discussed
earlier, the traditional Hamming (13, 8) SEC-TAED code requires 5
parity bits to detect triple adjacent errors. To avoid this overhead,
the Hamming (12, 8) code is used in the current research work to detect
triple adjacent errors.

The bit re-ordered format of the (12, 8) Hamming code for detecting
triple adjacent errors is shown in Table 4.4. It is the proposed
bit-reordered format, which is used to maximize the detection
probability of triple adjacent errors.

Table 4.4 Triple Adjacent Error Detection for Proposed Hamming (12, 8)

Bit Placement                 | Detection
1 2 3 4 5 6 7 8 9 10 11 12    | 1/10 (10%)
7 11 2 6 10 1 4 8 3 5 9 12    | 9/10 (90%)

In the normal bit order, only 1 out of the 10 triple adjacent
combinations is detected. With the proposed bit re-ordered format, 9
out of 10 combinations are detected. Hence, 90% detection efficiency is
achieved with the proposed extended Hamming SEC-TAED code. All triple
adjacent combinations of the proposed bit re-ordered format except
5-9-12 are detectable triple-error combinations. Hence, it improves the
probability of detecting errors wherever bit flips occur during data
transmission.
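The two rows of Table 4.4 can be checked with a short Python sketch. As before, this is an illustration under the assumption of a lexicographic check matrix, where the syndrome of a triple error is the XOR of the three original bit positions and the error is detected when that value exceeds the code length of 12. The function name `triple_report` is introduced here.

```python
from functools import reduce
from operator import xor

N = 12   # code length of the proposed (12, 8) code


def triple_report(order):
    """Return (detected, total, missed) for all runs of three adjacent bits."""
    triples = [tuple(order[i:i + 3]) for i in range(len(order) - 2)]
    missed = [t for t in triples if reduce(xor, t) <= N]
    return len(triples) - len(missed), len(triples), missed


normal_order = list(range(1, N + 1))
proposed_order = [7, 11, 2, 6, 10, 1, 4, 8, 3, 5, 9, 12]   # second row of Table 4.4

for name, order in (("normal", normal_order), ("proposed", proposed_order)):
    detected, total, missed = triple_report(order)
    print(f"{name}: {detected}/{total} detected, missed {missed}")
```

In the proposed order the only adjacent triple that escapes detection is 5-9-12, whose positions XOR to zero, so the error leaves the syndrome unchanged.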

SUMMARY

Channel Encoder and Channel Decoder are essential blocks of the OFDM
transceiver architecture. In these blocks, Error Detection and
Correction (EDC) codes are used to encode and decode the original data
bits. In previous works, Hamming codes were used with the code word in
normal order. Such a code detects and corrects a single error
perfectly, but it can miscorrect double and triple adjacent errors.

To maximize the detection probability of double and triple adjacent
errors, the bit replacement and bit shortening algorithms have been
used in the past. In our research work, the Bit Replacement Algorithm
is used to improve the detection probability of TAED. Traditionally,
the (13, 8) Hamming code is used for the TAED process. Our proposed
enhanced extended Hamming SEC-TAED process uses the (12, 8) Hamming
code for detecting triple adjacent errors.

The Hamming (13, 8) SEC-TAED code has 82% triple adjacent error
detection efficiency with 100% SEC efficiency, whereas the proposed
Hamming (12, 8) SEC-TAED code has 90% triple adjacent error detection
efficiency with 100% SEC efficiency. Hence, the proposed R2SDF FFT
based on pipelined PE structures and the Reduced Complex Multiplier,
together with the enhanced extended Hamming (12, 8) SEC-TAED code, is
well suited to temporary MANET network applications.


CHAPTER 5

RESULTS AND DISCUSSIONS

The results obtained at every stage of the research work are discussed
in this chapter. The designs of the proposed processing elements (PEs),
the pipelined Radix-2 Single path Delay Feedback (R2SDF) FFT and the
proposed enhanced extended Hamming codes are simulated and validated
using the ModelSim 6.3C Mentor Graphics tool. Verilog HDL is used for
the design of the processing elements, the R2SDF FFT and the extended
Hamming codes. Further, the Xilinx 10.1i design tool (Family:
Spartan-3, Device: Xc3s50, Package: PQ208, Speed: -5) is used to
generate the synthesis report of the proposed design. Lower power
consumption, lower slice and LUT utilization, and high speed and
throughput are the main concerns of VLSI system design. The main target
of the current research work is to reduce the delay and area of the
frequency transformation technique involved in the OFDM process. To
convert the time-domain signal into a frequency-domain signal, a
Radix-2 Single path Delay Feedback (R2SDF) based pipelined architecture
is developed in the current research work.


5.1 Synthesis Result of Pipeline based Processing Element

Structures

To increase the processing speed of MANET, the existing PE structures
are modified by adding a register unit at the end of the PE structure.
The register unit synchronizes the input and output lines of the PE
structure and is therefore called the “Pipelining Register”. The
synchronization among all outputs enables high-speed operation in the
frequency transformation computation. The synthesis results showing the
slice and LUT utilization of the existing Processing Elements (PE1, PE2
and PE3) are shown in Figure 5.1, Figure 5.2 and Figure 5.3
respectively.

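Figure 5.1 Synthesis result of PE1 to determine the Slice and LUT
utilization

Figure 5.2 Synthesis result of PE2 to determine the Slice and LUT
utilization

Figure 5.3 Synthesis result of PE3 to determine the Slice and LUT
utilization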

Similarly, the synthesis results of Pipeline based Processing

Elements (PE1, PE2 and PE3) are shown in Figure 5.4, Figure 5.5 and

Figure 5.6 respectively.

Figure 5.4 Synthesis result of Pipelined PE1 to determine the Slice and
LUT utilization

Figure 5.5 Synthesis result of Pipelined PE2 to determine the Slice and
LUT utilization

Figure 5.6 Synthesis result of Pipelined PE3 to determine the Slice and
LUT utilization

From the above results, it is clear that the pipeline based PE1
processor offers a 2.29% reduction in delay compared with the PE1
processor without pipelining. The pipeline based PE2 processor offers a
45.45% reduction in hardware slices, a 45.23% reduction in LUTs and a
1.34% reduction in delay compared with the PE2 processor. The pipeline
based PE3 processor offers a 2.5% reduction in hardware slices and a
1.62% reduction in delay compared with the PE3 processor. The
comparison of area and delay for the proposed pipeline based Processing
Elements is shown in Table 5.1. The performance of the proposed
pipeline based PE1, PE2 and PE3 processors is graphically illustrated
in Figure 5.7.

Table 5.1 Comparison of area and delay for proposed pipeline based
Processing Elements

Methods              | Slices | LUT | Delay (ns)
PE1 without Pipeline | 67     | 131 | 17.633
PE1 with Pipeline    | 67     | 134 | 17.277
PE2 with Pipeline    | 24     | 46  | 17.446
PE3 with Pipeline    | 39     | 76  | 13.039

Figure 5.7 Performances of proposed pipelined PE1, PE2 and PE3

processors


5.2 Synthesis Result of Reduced Complex Multiplier

The twiddle factor multiplication of the frequency transformation (FFT)
process produces the frequency-domain results for the corresponding
time-domain inputs. A parallel shifter based multiplier has been used
to perform the twiddle factor multiplication. In the current research
work, bit parallel multiplication is used to perform the twiddle factor
multiplication. Further, the complexity of the bit parallel
multiplication has been identified using the equation solving method.
The synthesis results of the traditional and proposed reduced complex
multipliers are shown in Table 5.2.

Table 5.2 Comparison of area and delay between traditional complex
multiplier and proposed reduced complex multiplier

Parameters | Traditional Complex Multiplier | Proposed Reduced Complex Multiplier | Percentage Reduction
Slices     | 299                            | 217                                 | 27.42%
LUT        | 590                            | 426                                 | 27.79%
Delay (ns) | 26.716                         | 25.822                              | 3.34%

The synthesis result of proposed reduced complex multiplier

structures to determine the slice and LUT utilization is shown in Figure

5.8. Similarly, the synthesis result of proposed reduced complex

multiplier to generate a timing report is shown in Figure 5.9.

Figure 5.8 Synthesis result of proposed reduced complex multiplier to
determine the slice and LUT utilization

Figure 5.9 Synthesis result of proposed reduced complex multiplier to
determine the delay consumption

From Figure 5.8, Figure 5.9 and Table 5.2, the proposed reduced complex
multiplier offers a 27.42% reduction in hardware slices, a 27.79%
reduction in LUTs and a 3.34% reduction in delay compared with the
traditional complex multiplier. Further, the proposed pipeline based
processing elements and the proposed reduced complex multiplier are
incorporated into the pipelined architecture called “Radix-2 Single
path Delay Feedback (R2SDF)”.
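The percentage reductions quoted above follow from the slice, LUT and delay figures of Table 5.2. The Python sketch below recomputes them; it assumes that the reduction is taken relative to the traditional design, and the function name is introduced here for illustration.

```python
def reduction_percent(traditional: float, proposed: float) -> float:
    """Percentage reduction of the proposed design relative to the traditional one."""
    return (traditional - proposed) / traditional * 100


# (traditional, proposed) values from Table 5.2
table_5_2 = {
    "Slices": (299, 217),
    "LUT": (590, 426),
    "Delay (ns)": (26.716, 25.822),
}

for name, (traditional, proposed) in table_5_2.items():
    print(f"{name}: {reduction_percent(traditional, proposed):.2f}% reduction")
```

The computed values agree with the table to within rounding of the second decimal place.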

5.3 Synthesis Result of Pipelined PEs and Reduced Complex

Multiplier based R2SDF FFT

Radix-2 Single-path Delay Feedback (R2SDF) FFT is a feedback-based frequency transformation architecture in which time-domain signals are converted into frequency-domain signals with the help of processing elements and twiddle factor multiplications. In order to improve the FFT architecture, pipelined PEs and a Reduced Complex Multiplier (RCM) are proposed in the current research work. The synthesis results of the proposed R2SDF FFT using pipelined PEs and the RCM, showing slice and LUT utilization and delay, are given in Figure 5.10 and Figure 5.11 respectively. Further, the performance evaluation of the proposed pipelined R2SDF FFT architecture is given in Table 5.3. The performance of the proposed and traditional R2SDF FFTs is graphically illustrated in Figure 5.12.
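The two operations named above, the butterfly performed by a processing element and the twiddle factor multiplication, can be sketched in software as a radix-2 decimation-in-frequency FFT. The code below is only a behavioural model of the arithmetic, written under the assumption that a recursive formulation is acceptable for illustration. It does not model the delay-feedback buffers or pipelining of the R2SDF hardware, and the function name `fft_radix2_dif` is hypothetical.

```python
import cmath


def fft_radix2_dif(x):
    """Radix-2 decimation-in-frequency FFT of a sequence whose length is a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    if n % 2:
        raise ValueError("length must be a power of two")
    half = n // 2
    # Butterfly (the role of a PE): sum and difference of samples half a frame apart.
    upper = [x[k] + x[k + half] for k in range(half)]
    # Twiddle factor multiplication W_n^k = exp(-j*2*pi*k/n) on the difference branch.
    lower = [(x[k] - x[k + half]) * cmath.exp(-2j * cmath.pi * k / n)
             for k in range(half)]
    out = [0j] * n
    out[0::2] = fft_radix2_dif(upper)   # even-indexed frequency bins
    out[1::2] = fft_radix2_dif(lower)   # odd-indexed frequency bins
    return out


# Usage: 8-point transform of a short test vector.
spectrum = fft_radix2_dif([1, 2, 3, 4, 0, 0, 0, 0])
print([round(abs(v), 3) for v in spectrum])
```

Each recursion level corresponds to one butterfly stage. An N-point radix-2 transform therefore has log2(N) stages, with a twiddle multiplication on the difference branch of each stage.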

Figure 5.10 Synthesis result of Proposed R2SDF FFT by using

Pipelined PEs and Reduced Complex Multiplier to determine the Slice

and LUT utilization

Figure 5.11 Synthesis result of Proposed R2SDF FFT by using

Pipelined PEs and Reduced Complex Multiplier to determine the delay

consumption

Table 5.3 Comparison of area and delay between traditional and proposed R2SDF FFT architectures

Parameters    Traditional R2SDF FFT    Proposed R2SDF FFT    Percentage Reduction
Slices        616                      576                   6.49%
LUTs          1195                     1122                  6.10%
Delay (ns)    53.341                   53.307                Slightly reduced

Figure 5.12 Performances of proposed and traditional R2SDF FFT

5.4 Simulation Result of Proposed Hamming (12, 8) SEC-TAED

Codes

The channel encoder and decoder of the MIMO-OFDM architecture perform error correction. The OFDM channel is mostly affected by Additive White Gaussian Noise (AWGN), in which bit flips in the original information signal can lead to faulty transmission. In order to overcome this problem, Hamming error detection and correction codes are used in the current research work. A conventional Hamming code detects and corrects a single-bit error. The proposed work performs single error correction (SEC) as well as triple adjacent error detection (TAED). The simulation result of the extended Hamming code is shown in Figure 5.13.

The status signal indicates the current state of the encoder and decoder. While the encoding process is under way, the status is printed as “PROCESSING”. If the decoding process does not detect any error, the status is printed as “NO ERROR”. Similarly, if the decoding process detects a single error, it corrects the error using the bit position given by the syndrome vector, and the status is printed as “SEC”. If the syndrome vector is 1101, 1110 or 1111, the status is printed as “TED”. The constant 8-bit input is taken as 01010100, and the encoded output is obtained as 000010110100. If the encoded data is passed unchanged to the input of the decoder, the status signal reads “No Error”. Figure 5.14 shows Hamming (12, 8) error-free data transmission. As an example of error correction, the third bit of the encoded output is changed manually so that the decoder input is 001010110100. Figure 5.15 shows Hamming (12, 8) data transmission with single error correction; the status is displayed as “SEC”.
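The worked example above can be reproduced with a small Python sketch. It assumes the standard Hamming layout (parity bits at positions 1, 2, 4 and 8, even parity, position 1 written leftmost), which is consistent with the codeword 000010110100 quoted for the input 01010100. The function names and the status strings returned are illustrative. In particular, the mapping of the out-of-range syndromes 1101, 1110 and 1111 to “TED” follows the text, but the thesis's own bit arrangement for guaranteeing triple adjacent error detection is not reproduced here.

```python
PARITY_POSITIONS = (1, 2, 4, 8)  # assumed standard layout; position 1 is the leftmost bit


def encode(data_bits: str) -> str:
    """Encode 8 data bits into a 12-bit Hamming codeword with even parity."""
    if len(data_bits) != 8 or set(data_bits) - {"0", "1"}:
        raise ValueError("expected 8 binary digits")
    word = [0] * 13                       # index 0 unused; positions 1..12
    data = iter(int(b) for b in data_bits)
    for pos in range(1, 13):
        if pos not in PARITY_POSITIONS:
            word[pos] = next(data)
    for p in PARITY_POSITIONS:
        # Parity bit p covers every position whose index has bit p set.
        word[p] = sum(word[pos] for pos in range(1, 13) if pos & p) % 2
    return "".join(map(str, word[1:]))


def syndrome(codeword: str) -> int:
    """XOR of the positions holding a 1; zero for a valid codeword."""
    s = 0
    for pos, bit in enumerate(codeword, start=1):
        if bit == "1":
            s ^= pos
    return s


def decode(codeword: str):
    """Return (status, data_bits) following the status convention in the text."""
    s = syndrome(codeword)
    bits = list(codeword)
    if s == 0:
        status = "NO ERROR"
    elif s <= 12:
        bits[s - 1] = "1" if bits[s - 1] == "0" else "0"  # correct the flagged bit
        status = "SEC"
    else:                                 # syndromes 1101, 1110, 1111 point outside the word
        status = "TED"
    data = "".join(b for pos, b in enumerate(bits, start=1)
                   if pos not in PARITY_POSITIONS)
    return status, data


print(encode("01010100"))      # 000010110100
print(decode("000010110100"))  # ('NO ERROR', '01010100')
print(decode("001010110100"))  # ('SEC', '01010100')
print(decode("000010110011"))  # bits 10-12 flipped: ('TED', '01010100')
```

In this plain layout, only some triple adjacent errors produce a syndrome above 12. Flipping positions 1, 2 and 3, for example, gives syndrome 0 and goes unnoticed. This is why a SEC-TAED code needs a deliberate bit arrangement rather than the default ordering.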

Figure 5.13 Simulation result of proposed hamming (12, 8) SEC-TAED

code

Figure 5.14 Simulation result of hamming (12, 8) SEC-TAED error-less

data transmission: Status displayed as “No error”

Figure 5.15 Simulation result of hamming (12, 8) SEC-TAED code with

single bit flipping: Status displayed as “SEC”

Similarly, for the triple adjacent error detection process, the status signal is displayed as “TED”, as shown in Figure 5.13.

Thus, the proposed R2SDF FFT, built from pipelined processing elements (PEs) and the reduced complex multiplier, together with the proposed Hamming (12, 8) SEC-TAED code, is useful for implementing an efficient MIMO-OFDM system, which can in turn be extended to temporary network architectures based on MANETs.

CHAPTER 6

CONCLUSION

6.1 Summary of the thesis

The architecture of the FFT is analyzed, and the problems in the dataflow structures of FFT architectures are identified.

OFDM provides communication in Mobile Adhoc Networks. An efficient VLSI-based Radix-2 Single-path Delay Feedback (R2SDF) FFT technique is implemented. The Hamming SEC-TAED code is used to detect and correct bit errors caused by noise.

The speed of the Processing Element (PE) of the FFT is increased by using a pipeline-based PE structure. This offers a 45.45% reduction in hardware slices, a 45.23% reduction in LUTs and a 1.34% reduction in delay.

The complex multiplication architecture of the FFT has been analyzed and redesigned in the current research work. The reduced complex multiplier, which performs the twiddle factor multiplication, lowers both area and delay. It offers a 27.42% reduction in hardware slices, a 27.79% reduction in LUTs and a 3.34% reduction in delay compared with the traditional complex multiplier.

The efficiency of the Radix-2 Single-path Delay Feedback (R2SDF) FFT is increased by implementing the reduced complex multiplier and pipeline-based PE structures in the R2SDF FFT. This offers a 6.49% reduction in area (slices) compared with the existing R2SDF FFT architecture.

A reconfigurable architecture for the complex multiplier is introduced by using different twiddle multiplication values.

The probability of error detection in SEC-TAED is increased by using the Bit Replacement algorithm. This provides 8% higher detection efficiency than the existing SEC-TAED algorithm.

The presented research work can be applied to the design of FFT/IFFT architectures with lower area and delay for OFDM systems, which suits wireless communication in MANETs.

6.2 Future work

The same design can be used in Software Defined Radio (SDR) based wireless data transmission systems.

Software defined radio relies heavily on reconfigurable IFFT/FFT architectures. The wireless standards used in SDR are sets of Media Access Control (MAC) and physical layer specifications. The reconfigurable FFT/IFFT processor is the main tool for generating the required frequency bands and hence can be used in SDR applications in future.

The architecture can also be used for real-time reconfiguration of 4G networks, which are currently used for high-speed digital transmission.

LIST OF PUBLICATIONS

International Journals

1. Chandrika. S, and Rani Hemamalini. R, “A Novel Pipelined Radix-2 64 Point FFT with Modified Complex Multiplier in OFDM for Wireless Ad-hoc Networks” International Journal of Applied Engineering Research (IJAER), ISSN 1087-1090, pp: 19869-19879, 2014.

2. Chandrika. S, and Rani Hemamalini. R, “Efficient Implementation of Hamming SEC-TAED Code Algorithm for Data Communication” International Journal of Innovative Research & Studies (IJIRS), ISSN 2319-9725, pp: 275-289, 2014.
