6 Jayaraj U Kidav, N.M Sivmangai, Nidhi Antony, Dr.M.P Pillai

International Journal of Electronics, Electrical and Computational System

IJEECS

ISSN 2348-117X

Volume 6, Issue 6

June 2017

Fixed and Floating Point Array Signal Processor Architecture Implemented on FPGA and their performance Comparisons

Jayaraj U Kidav1, Research Scholar and Scientist 'D', Karunya University Coimbatore, NIELIT Calicut, Kerala, India

N.M Sivmangai2, Associate Professor, Karunya University Coimbatore, Tamilnadu, India.

Nidhi Antony3, M.Tech Project Student, NIELIT Calicut, Kerala, India.

Dr.M.P Pillai4, Exec. Director, NIELIT Calicut, Kerala, India.

Abstract: An array signal processor, or digital beamformer, is an essential processing block in antenna array signal processing applications such as RADAR/SONAR, MIMO and medical imaging. In high sampling rate applications like imaging SONAR, the digital beamformer must handle a high input data rate and sampling frequency. Owing to advances in FPGA technology, most digital beamformer implementations for high sampling rate applications are now based on FPGAs. The availability of high speed I/Os, parallel hardware structures, internal block RAMs and coarse grained processing blocks makes the FPGA the implementation choice for these applications. In a digital beamformer, blocks such as the fractional delay unit, apodization unit and summer unit involve mathematical operations, so data representation is key to obtaining accurate results in FPGA implementations. In this paper we discuss the design and FPGA implementation of fixed and floating point digital beamformer architectures for high sampling rate applications like imaging SONAR, together with their merits and demerits. In most VLSI signal processing architectures, fixed point arithmetic is preferred due to ease of implementation. We used the Virtex-6 ML605 evaluation board as the implementation platform and the LogiCORE IP floating-point operator v5.0 available for the Virtex-6 FPGA for floating point processing. To assess the accuracy of the implementation, we first modelled the beamformer in MATLAB. We also used the ultrasound simulation program Field II to model the imaging SONAR array and to generate echoes from software phantoms. We designed and implemented both architectures and investigated their performance in terms of hardware resources and data rate. The floating point architecture eases the data rate requirement for high resolution imaging, which requires a higher number of channels, at the cost of hardware resources. Compared to its fixed point counterpart, the floating point architecture showed an improvement of about 75 % in data rate. We implemented a phased array beamformer; the delay calculations for various angles show that the centre transducer elements require less delay than the left-most and right-most elements. We proposed and implemented a variable delay line structure for each element, which helped to reduce the number of flip flops in the digital beamformer architecture.

Index Terms: Data Rate, Digital Beamformer, Floating point, FPGA, Imaging SONAR.

I. INTRODUCTION

Pulse-echo processing is the working principle behind RADAR/SONAR, medical imaging and similar applications. When a voltage is applied to the transducer probe, pulses are produced by the piezoelectric effect. These pulses hit the target in the region of interest and, as a result, echoes are produced. The echo signals are then processed by the array signal processor, that is, the beamformer. The beamformer acts as a spatial filter that handles the directional transmission and reception of signals [1].

Early array systems involved simple implementations of beamformer functions without focusing [2]. Later, beamformer designs were customized by including focusing. There were several restrictions, such as a limited focal region and high side lobe levels. These challenges were addressed by using a high f-number and apodization, which dramatically enhanced performance. A high f-number brought drawbacks of its own, such as a reduced aperture size. Dynamic focusing was therefore introduced to reduce the f-number during receive beamforming and to keep it constant until the aperture runs out. Coarse and fine delays are needed to implement dynamic focusing in an analog beamformer. The cost variations associated with these systems led to the digital beamformer design.

Initially, digital beamformers did not have a significant impact, because they needed A/D converters with a sufficiently large number of bits and a high enough sampling rate. Another factor that has facilitated the change to digital beamforming is the dramatic increase in the gate counts of ASICs and the improvement of their design tools. With these advances in technology, digital beamformer design and architecture have changed tremendously.

Conventional antenna array beamforming implementations are mostly based on Digital Signal Processors (DSPs). To handle the dynamic range of ultrasound echoes from the body, imaging SONAR and medical ultrasound beamformers usually require a higher ADC resolution (14 bit); in addition, generating high quality clinical images requires a higher number of channels (64-128) and a higher sampling frequency (40-60 MHz). The beamformer therefore has to handle a higher data rate, and real time processing becomes a challenge. In this paper we discuss an FPGA based Digital Beamformer (DBF) architecture for fixed and floating point processing, since an FPGA can handle a high input data rate via high speed serial I/Os and the flexibility of the FPGA structure mitigates the real time implementation challenges.

Most existing DBF architecture implementations [16], [17], [29]-[36] do not discuss the importance of data representation. Floating point arithmetic on FPGAs is discussed in [4]. Because floating point arithmetic is difficult to implement and resources are constrained, fixed point implementation has been adopted for realizing VLSI signal processing architectures on FPGAs. However, in applications such as medical imaging and underwater acoustic cameras, the number of bits required by fixed point arithmetic to achieve high processing accuracy is very large. This leads to a high output data rate, which in turn requires high speed interfaces such as PCIe or Gigabit Ethernet at the DBF output. Contemporary FPGAs such as the Xilinx Virtex-6 provide the LogiCORE IP floating-point operator v5.0, which mitigates the implementation challenges of floating point arithmetic on FPGAs.

In this work, a 32 channel DBF architecture has been developed. The architecture is implemented on an FPGA with both fixed point and floating point processing. Like [17], the architecture offers flexibility in beamforming, such as receive dynamic focusing and apodization. We also implemented a high accuracy fractional delay unit by adopting a Minimum Mean Square Error (MMSE) interpolator [18].

We used the Field II software scanner [22], [26] to validate our architecture implementation on the FPGA.

II. FIXED POINT AND FLOATING POINT REPRESENTATION

With the advancement in technologies there has been an incredible change in digital beamformer design and architecture. A beamformer architecture can be implemented using fixed point or floating point arithmetic, the two formats used to represent numbers. A floating point architecture supports integer or real arithmetic, while a fixed point architecture supports only integer arithmetic, so it represents all numbers using integers. It uses binary scaling to make all numbers fit into one of the integer data types [5]:

8 bits (char, int8): [−128, 127]

16 bits (short, int16): [−32768, 32767]

32 bits (long, int32): [−2147483648, 2147483647]

In fixed-point representation, a real number y is denoted by an integer Y with L = i + f + 1 bits, where L is the wordlength, i is the number of integer bits (excluding the sign bit) and f is the number of fractional bits. In "Q-format" notation, Y is called a Qi.f or Qf number.
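As an illustration of the Q-format definition above, the following minimal Python sketch quantizes a real number to a signed Qi.f integer and back. The scaling Y = round(y · 2^f) and the saturation at the ends of the L-bit range are the usual conventions and are assumed here; the helper names `to_q` and `from_q` are hypothetical and do not come from the paper.

```python
def to_q(y: float, i: int, f: int) -> int:
    """Quantize real y to a signed Qi.f integer with L = i + f + 1 bits.

    Assumed convention: Y = round(y * 2**f), saturated to the L-bit
    two's-complement range.
    """
    word_length = i + f + 1
    lowest = -(1 << (word_length - 1))
    highest = (1 << (word_length - 1)) - 1
    scaled = round(y * (1 << f))
    return max(lowest, min(highest, scaled))


def from_q(value: int, f: int) -> float:
    """Recover the real number represented by a Qi.f integer."""
    return value / (1 << f)
```

For example, with i = 0 and f = 15 (a 16 bit word), y = 0.5 maps to the integer 16384, while y = 1.0 is not representable and saturates at 32767.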

A floating point number is represented approximately, to a fixed number of significant digits (the significand), and scaled using an exponent. A number that can be represented exactly has the form

$\text{significand} \times \text{base}^{\text{exponent}}$

The binary point is not fixed (it floats); its position depends on the value of the exponent. To obtain the value of a floating-point number, the significand is multiplied by the base raised to the power of the exponent, which is equivalent to shifting the radix point from its implied position by a number of places equal to the value of the exponent: to the right if the exponent is positive, to the left otherwise. The length of the significand determines the precision to which numbers can be represented.

The IEEE has standardized the computer representation of binary floating-point numbers in IEEE 754 (a.k.a. IEC 60559). The IEEE floating point formats are:

Single precision (32 bits)

Double precision (64 bits)

Double extended precision (80 bits)

Quadruple precision (128 bits)

Half precision (16 bits)

A. Half Precision (binary16)

The IEEE 754 standard specifies binary16 as having the following format [8]:

Sign bit: 1 bit

Exponent width: 5 bits

Significand precision: 11 bits (10 explicitly stored)

Fig. 1. Half Precision

The format has an implicit leading bit with value 1 unless the exponent field is stored with all zeros. Thus only 10 bits of the significand appear in the memory format, but the total precision is 11 bits. In IEEE 754 terms, there are 10 bits of significand but 11 bits of significand precision ($\log_{10}(2^{11}) \approx 3.311$ decimal digits).
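The field layout above can be checked with a short Python sketch that decodes a 16 bit pattern by hand. The exponent bias of 15 and the treatment of the all-zeros and all-ones exponent fields follow IEEE 754 binary16; the function name `decode_binary16` is a hypothetical helper, not part of the paper or of the LogiCORE IP.

```python
import struct


def decode_binary16(bits: int) -> float:
    """Decode a 16 bit IEEE 754 binary16 pattern into a Python float."""
    sign = (bits >> 15) & 0x1       # 1 sign bit
    exponent = (bits >> 10) & 0x1F  # 5 exponent bits, bias 15
    fraction = bits & 0x3FF         # 10 explicitly stored significand bits
    if exponent == 0x1F:
        # All-ones exponent encodes infinity or NaN.
        value = float("inf") if fraction == 0 else float("nan")
    elif exponent == 0:
        # All-zeros exponent: no implicit leading 1 (zero or subnormal).
        value = (fraction / 1024.0) * 2.0 ** -14
    else:
        # Normal number: implicit leading 1 gives 11 bits of precision.
        value = (1.0 + fraction / 1024.0) * 2.0 ** (exponent - 15)
    return -value if sign else value


def reference_binary16(bits: int) -> float:
    """Same decoding via the standard library's half-precision format 'e'."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]
```

The second function uses the `struct` module's half-precision code as an independent cross-check of the manual decoding.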

III. DIGITAL BEAMFORMER ARCHITECTURE

A 32 channel DBF for receive beamforming is designed for implementation in a Virtex-6 FPGA. A 14 bit ADC with a sampling frequency of 40 MHz supplies the digitized echo input to each channel. These samples are delayed by specific time delays and then summed to form a beam. The delay structure is a combination of coarse delay and fine delay. The architecture of the DBF is shown in Fig. 2. The digital beamformer can be implemented in either a fixed point or a floating point architecture.

Fig. 2. High level architecture of 1 channel

The geometry of the linear phased array used to derive the formula for the time delay of each element in the delay-and-sum technique follows [13]. We performed a phased array sweep from -30 degrees to +30 degrees at a lateral resolution of 1 degree. According to these delay calculations, there is a coarse delay difference of 30 to 50 sampling clocks between the centre elements and the left-most and right-most elements. Hence we adopted the delay structure shown in Fig. 3, which saves an average of 20 to 25 flip flops per delay line.
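The observation that the end elements need longer delay lines than the centre elements can be illustrated with a rough Python sketch. It is not the delay formula of [13]: it assumes plain far-field steering without focusing, a speed of sound of 1540 m/s (not stated in the paper), and the 0.160 mm pitch and 40 MHz sampling frequency quoted in the simulation section. All function names are hypothetical.

```python
import math

SPEED_OF_SOUND = 1540.0  # m/s, assumed value (not given in the paper)
SAMPLING_FREQ = 40e6     # Hz, sampling clock used in the paper
PITCH = 0.160e-3         # m, element pitch from the simulation section
NUM_ELEMENTS = 32


def coarse_delays(theta_deg: float) -> list:
    """Per-element steering delay in whole sampling clocks for one angle.

    Far-field steering only (assumption): tau_n = x_n * sin(theta) / c,
    shifted so that the smallest delay is zero.
    """
    sin_theta = math.sin(math.radians(theta_deg))
    centre = (NUM_ELEMENTS - 1) / 2
    tau = [(n - centre) * PITCH * sin_theta / SPEED_OF_SOUND
           for n in range(NUM_ELEMENTS)]
    earliest = min(tau)
    return [int((t - earliest) * SAMPLING_FREQ) for t in tau]


def max_coarse_delay_per_element(angles=range(-30, 31)) -> list:
    """Longest coarse delay each element needs over the whole sweep."""
    per_angle = [coarse_delays(angle) for angle in angles]
    return [max(column) for column in zip(*per_angle)]
```

Under these assumptions, the value returned for each element is the delay line length that element would need to cover the full sweep; the end elements come out longer than the centre ones, which is the property the variable delay line structure exploits.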

Fig. 3. Delay Line Structure


IV. REALIZATION OF FIXED POINT DIGITAL BEAMFORMER

Owing to ease of implementation, a digital beamformer is generally implemented in a fixed point architecture. The basic building blocks of this design are the FIFO, the coarse delay structure, the fine delay structure and the apodization unit. The transmit beamformer output is connected to a digital to analog converter, then to a passive low pass filter and a high voltage amplifier. The high voltage amplifier output is connected to the ultrasound transducer array through an analog switch/mux and energises the array to generate ultrasound waves. In the receive path, the receive switch is on and the signal reaches the amplifier and filter stage. The conditioned signal is converted to digital form and fed to the receive beamformer for further processing. The IF signal, generally in the range of 40 MHz to 60 MHz, is converted into one-word digital data using a 14/16 bit high speed ADC. The digital data is received at a sampling clock of 40 MHz and then processed as follows.

In the fixed point architecture, the 16 bit input is represented in 16.0 Q format. The coarse delay values are also represented in 16.0 Q format. The coarse delayed output is 16 bits wide. It is then multiplied by filter coefficients in 2.14 Q format and the products are added together, giving a 32 bit fine delayed output. The fine delayed output is multiplied by apodization coefficients in 1.31 Q format, giving a 64 bit output. Finally, the outputs of all channels are summed to give the 64 bit beamformed output. Thus, in the fixed point architecture, a 16 bit input yields a 64 bit output.
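The word-length growth described above can be reproduced with a small Python sketch. It tracks only the growth caused by the two multiplications, as the text does, and treats the 2.14 and 1.31 formats as 16 bit and 32 bit words respectively (in this notation the two numbers add up to the full word length). The helper names are hypothetical.

```python
def signed_bits_needed(value: int) -> int:
    """Smallest two's-complement width that can hold the given integer."""
    if value >= 0:
        return value.bit_length() + 1
    return (-value - 1).bit_length() + 1


def product_width(a_bits: int, b_bits: int) -> int:
    """Full-precision width of a signed a_bits x b_bits product."""
    return a_bits + b_bits


COARSE_BITS = 16                             # 16.0 format sample
FINE_BITS = product_width(COARSE_BITS, 16)   # times 2.14 coefficient
APOD_BITS = product_width(FINE_BITS, 32)     # times 1.31 coefficient

# Worst-case products: most negative operand times most negative operand.
WORST_16X16 = (-(1 << 15)) * (-(1 << 15))
WORST_32X32 = (-(1 << 31)) * (-(1 << 31))
```

The worst-case products confirm that the full 32 and 64 bits are needed if no bits are discarded between stages, which is what drives the 64 bit output word in the fixed point design.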

Dynamic range refers to the range of echoes processed and displayed by the ultrasound system. In dB, it is directly proportional to the number of bits in the fixed point architecture, so as the number of bits decreases the dynamic range also decreases. The dynamic range of a 16 bit fixed point architecture is 96 dB. As the dynamic range decreases, echoes at the weaker end of the spectrum are lost. Dynamic range can be regarded as a variable threshold for writing weaker signals. For general imaging the dynamic range should be kept at its maximum level to maximize contrast resolution potential. However, where low-level noise or artifacts degrade image quality, the dynamic range can be reduced to partially eliminate them [15]. A 32 channel digital beamformer structure is implemented, and the complete 32 channel DBF is realized in one Virtex-6 FPGA.
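The 96 dB figure follows from the usual relation DR = 20·log10(2^N) ≈ 6.02·N dB for an N bit word. The short Python sketch below evaluates it, and for comparison also evaluates the ratio between the largest and smallest normal binary16 values (65504 and 2^-14, from IEEE 754). The function names are hypothetical, and the binary16 comparison is an illustration added here, not a result reported in the paper.

```python
import math


def fixed_point_dynamic_range_db(bits: int) -> float:
    """Dynamic range of an N bit fixed point word: 20 * log10(2**N)."""
    return 20.0 * math.log10(2 ** bits)


def binary16_normal_dynamic_range_db() -> float:
    """Ratio of the largest to the smallest normal binary16 value, in dB."""
    largest = 65504.0        # (2 - 2**-10) * 2**15
    smallest = 2.0 ** -14    # smallest normal number
    return 20.0 * math.log10(largest / smallest)
```

For N = 16 the first function gives about 96.3 dB, matching the value quoted above.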

The hardware blocks that perform digital beamforming in one channel are depicted in Fig. 2. A 32×16 bit FIFO buffers the digital echo from the ADC on its way to the delay structure. Because a large memory is required to store the input data for beamforming, a FIFO is used instead of FPGA internal block RAM.

In the delay-and-sum technique, a combination of coarse and fine delays is used. An optimized delay structure is designed to reduce FPGA resource utilization. The design divides the DBF channels into groups according to the time delays they require. From a statistical study of the time delay values required by each channel for any type of transducer array, we concluded that a group of channels at the centre always requires smaller time delays than the channels at the left and right ends of the array aperture. The delay lines for the centre channels therefore require fewer delay flip flops. The delay values depend on the probe parameters and a number of I/O signals. The calculated delay values for each probe are stored in look up tables (LUTs), as shown in Fig. 4.

Fig. 4. Coarse delay structure of fixed point architecture

The samples after the coarse delay are the inputs to the fine delay structure. A Farrow structure fractional delay finite impulse response (FIR) filter with an MMSE interpolator is used to generate the fine delays [9]. The filter coefficients are stored in LUTs, as depicted in Fig. 5.


Fig. 5. Fine delay structure of fixed point architecture

The user can select the apodization technique used to view the image. The weights of the Hanning, Hamming and Kaiser window functions are stored in LUTs, as illustrated in Fig. 6. The complex multiplication must be computed for a number of weights, which decide where the beam is to be formed. For sixteen elements, forming one beam requires sixteen weights, and N beams require N different sets of sixteen weights. We consider the weights to be fixed and calculated offline.
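A minimal Python sketch of offline weight calculation and weighted summation is given below for a sixteen element aperture. It uses the common symmetric textbook definitions of the Hanning and Hamming windows, which is an assumption (the paper does not give its window formulas, and MATLAB's variants differ slightly at the endpoints); the Kaiser window is omitted for brevity. `WINDOW_LUT` and `apodize_and_sum` are hypothetical names.

```python
import math

NUM_WEIGHTS = 16  # sixteen elements per beam, as in the text


def hanning(n: int) -> list:
    """Symmetric Hann window, w[k] = 0.5 - 0.5*cos(2*pi*k/(n-1))."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]


def hamming(n: int) -> list:
    """Symmetric Hamming window, w[k] = 0.54 - 0.46*cos(2*pi*k/(n-1))."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]


# Weights are computed once, offline, and then only looked up.
WINDOW_LUT = {
    "rectangular": [1.0] * NUM_WEIGHTS,
    "hanning": hanning(NUM_WEIGHTS),
    "hamming": hamming(NUM_WEIGHTS),
}


def apodize_and_sum(delayed_samples: list, apo_select: str) -> float:
    """Weight each channel's delayed sample and sum across channels."""
    weights = WINDOW_LUT[apo_select]
    return sum(w * s for w, s in zip(weights, delayed_samples))
```

Keeping the weights in a table mirrors the LUT-based design choice described above: selecting a different window only changes which table is read, not the datapath.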

Fig. 6. Apodization structure of fixed point architecture

A ping pong memory is used to store the beamformed output, which is processed in ping pong fashion.

In the fixed point architecture, a 16 bit input yields a 64 bit output, so the output data rate is 2.56 Gbps. This is because, in fixed point arithmetic, the size of a product is the sum of the sizes of the operands being multiplied. We therefore move to a floating point architecture, in which the size of a product equals the size of the operands. For a 16 bit input the output is then 16 bits and the data rate is 0.64 Gbps. For the same input data rate, the output bit rate is reduced to one fourth of the fixed point output bit rate without affecting the accuracy of the output. Dynamic range depends directly on the number of bits, so it decreases as the number of output bits decreases. Even so, floating point has a higher dynamic range than fixed point.

V. REALIZATION OF FLOATING POINT DIGITAL BEAMFORMER

In this paper we designed and implemented a floating point digital beamformer architecture that offers better throughput, dynamic range and accuracy than the fixed point architecture.

In the floating point architecture, the 16 bit input data is converted from fixed point to floating point format using the floating point IP core version 5.0. The coarse delay values are also converted from fixed point to floating point format. The coarse delayed output is 16 bits wide. It is then multiplied by the filter coefficients and the products are added using floating point arithmetic, giving a 16 bit fine delayed output. The fine delayed output is multiplied by a 16 bit apodization coefficient using the floating point IP core version 5.0, giving a 16 bit output. Finally, the outputs of all channels are summed to give the 16 bit beamformed output.

In the floating point architecture the size of a product equals the size of the operands. Therefore, for a 16 bit input the output is also 16 bits and the data rate is 0.64 Gbps. The output bit rate is one fourth of the fixed point output bit rate, with no deterioration in the accuracy of the digital beamformer output. The floating point architecture is explained in detail below.

A 32×16 bit FIFO buffers the digital echo from the ADC on its way to the delay structure. The input data is converted from fixed point to floating point format using the floating point IP core version 5.0. The FIFO buffers the input data; when 128 samples are stored, Fifo_full goes high and the FIFO stops writing. In the delay-and-sum technique, a combination of coarse and fine delays is used. The samples read from the FIFO are fed to a chain of D flip flops, which provide the coarse delays. The flip flops are positive edge triggered and clocked at the sampling frequency of 40 MHz, so the coarse delays are integer multiples of the sampling period. The output Qn of each D flip flop is fed to multiplexer MUX2, which selects one Qn according to its select line data. The select line data is the output of MUX1, which selects the delay value required for the input data.

The pre-calculated delay values (expressed as numbers of clock periods) are stored in LUTs. MUX1 has the select lines probe_id, 'θ' and 'r'. Probe_id identifies the active probe inserted in the scanner, 'θ' gives the scan line angle and 'r' gives the receive focus. The corresponding delay value is thus selected from the LUT and applied as the select line of MUX2. This delay value decides which Qn is selected and passed to the fine delay structure, as shown in Fig. 7. Each LUT stores the delay values for one specific probe that the system supports.

Fig. 7. Coarse delay structure of floating point architecture
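The coarse delay path described above (LUT, MUX1, flip flop chain, MUX2) can be modelled behaviourally in a few lines of Python. This is a software sketch of the behaviour only, not the authors' HDL: the class name, the dictionary-based LUT keyed by (probe_id, θ, r) and the convention that tap 0 is the undelayed input are all assumptions made for illustration.

```python
from collections import deque


class CoarseDelayLine:
    """Behavioural model of a tapped D flip flop chain with tap selection."""

    def __init__(self, depth: int, delay_lut: dict):
        self.depth = depth
        # taps[0] is the current input, taps[n] is the output Qn of flop n.
        self.taps = deque([0] * (depth + 1), maxlen=depth + 1)
        # Hypothetical LUT layout: {(probe_id, theta, r): delay in clocks}.
        self.delay_lut = delay_lut

    def clock(self, sample, probe_id, theta, r):
        """One rising clock edge: shift the chain and return the chosen tap."""
        delay = self.delay_lut[(probe_id, theta, r)]  # role of MUX1
        if not 0 <= delay <= self.depth:
            raise ValueError("delay exceeds the length of this delay line")
        self.taps.appendleft(sample)                  # flip flops shift
        return self.taps[delay]                       # role of MUX2
```

A centre channel would be constructed with a smaller `depth` than an end channel, which corresponds to the flip flop saving of the variable delay line structure.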

Filter coefficients are converted from fixed point to floating point format using the floating point IP core version 5.0. The converted values are then stored in LUTs.

Fig. 8. Fine delay structure of floating point architecture

MUX3 and MUX4 select the filter coefficients h1 and h2; their select lines are probe_id, 'θ' and 'r'. The Farrow structure is built from D flip flops. The samples are multiplied by h1 and h2 and the products are added using the floating point IP core version 5.0, as depicted in Fig. 8. The sampling frequency is 40 MHz.
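The two-coefficient fine delay stage can be sketched in Python as a two-tap FIR filter. The paper's MMSE interpolator coefficients are not given, so linear interpolation coefficients are substituted here purely as a stand-in; `fine_delay` and `linear_coeffs` are hypothetical names, and the mapping of h1 to the current sample and h2 to the previous one is an assumption.

```python
def linear_coeffs(mu: float) -> tuple:
    """Stand-in coefficients for a fractional delay of mu samples (0 <= mu < 1).

    Linear interpolation, used here in place of the MMSE coefficients,
    which the paper does not list.
    """
    return 1.0 - mu, mu


def fine_delay(samples, h1: float, h2: float) -> list:
    """Two-tap FIR: y[n] = h1 * x[n] + h2 * x[n-1]."""
    previous = 0.0  # models the D flip flop holding x[n-1]
    output = []
    for x in samples:
        output.append(h1 * x + h2 * previous)
        previous = x
    return output
```

Applied to a ramp input, a delay of 0.25 samples shifts every output value down by 0.25 once the filter has filled, which is a quick sanity check that the stage delays by a fraction of the sampling period.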

Apodization coefficients are converted from fixed point to floating point format using the floating point IP core version 5.0, and the converted values are stored in LUTs, one set per window function. The Apo_select signal represents the window function selected by the operator. The samples after coarse and fine delay are multiplied by the selected window function using the floating point IP core version 5.0, as shown in Fig. 9.

Fig. 9. Apodization unit of floating point architecture

VI. RESULTS

We performed simulations in MATLAB 2013a and Xilinx 14.3 to verify the beamformed output. The simulation results illustrate the performance comparison between the fixed point and floating point beamformer architectures.

A. MATLAB Simulation

1) Transducer Array Simulation

We modelled the transducer array using the Field II toolbox in MATLAB. For the simulations we used a 32 element phased transducer array with a 40 MHz sampling frequency and a 3.5 MHz centre frequency. The focus is set at a depth of 70 mm and the sector angle at 64 degrees. The distance between element centres (pitch) is 0.160 mm and the width of the fill material between ceramic elements (kerf) is 0.025 mm. The element width is obtained by subtracting the kerf from the pitch; the element height is 13 mm, the element subdivision in the x direction is 1 and the element subdivision in the y direction is 5. Fig. 10 shows the simulated phased array transducer.

Fig. 10. Ultrasound probe model in MATLAB

2) Point Target Phantom Simulation

To evaluate the performance, we designed a phantom of amplitude 10^(25/20) with two point targets located at depths of 60 mm and 70 mm using the Field II toolbox in MATLAB, as in Fig. 11. Fig. 12 shows the images of the point targets obtained using different window functions in the beamforming process. As observed from Fig. 12, the Kaiser window provides a better result than the other windows. Echoes generated from the cyst phantom are shown in Fig. 13.

Fig. 11. Point Target Phantom Model

Fig. 12. Images of simulated point target phantom using different windows: (a) Hanning window, (b) Hamming window, (c) Blackman window, (d) Rectangular window, (e) Tukeywin window, (f) Kaiser window.

Fig. 13. Generated echoes from phantom

Fig. 14 shows the MATLAB output of the digital beamformer architecture.

Fig. 14. MATLAB output of DBF

B. FPGA Simulation

For the fixed point architecture we used the Virtex-6 ML605 evaluation board as the implementation platform. The input data and the coarse and fine delay values are loaded as .coe files in block memory generators. We used RAM based shift registers to implement the coarse delay.


Fig. 15. FPGA simulation blocks of fixed point DBF

Fig. 16. FPGA simulation of fixed point DBF

Fig. 17. FPGA output of fixed point DBF

For the floating point architecture we likewise used the Virtex-6 ML605 evaluation board as the implementation platform. The input data and the coarse and fine delay values are loaded as .coe files in block memory generators. We used RAM based shift registers to implement the coarse delay. The LogiCORE IP floating-point operator v5.0 available for the Virtex-6 FPGA is used for floating point processing. Three features of the floating point IP core are mainly used: fixed to float conversion, floating point addition and floating point multiplication.

Fig. 18. FPGA simulation blocks of floating point DBF

Fig. 19. FPGA simulation of floating point DBF


Fig. 20. FPGA output of floating point DBF

VII. DISCUSSION

The performance of the floating point architecture was compared with that of the fixed point architecture on the basis of the MATLAB and FPGA simulation results. The parameters analysed for the comparison are explained in the following subsections.

A. I/O Data Rate Estimation

For the fixed point architecture, the data rate is computed by multiplying the number of bits, the sampling frequency and the number of channels. For a 16 bit input, 32 channel digital beamformer with a sampling frequency of 40 MHz, the input data rate is 20480 Mbps (20.48 Gbps), whereas the output data rate is 2560 Mbps (2.56 Gbps).

For the floating point architecture, the data rate is computed in the same way. For the same 16 bit input, 32 channel digital beamformer with a sampling frequency of 40 MHz, the input data rate is again 20480 Mbps (20.48 Gbps), whereas the output data rate is 640 Mbps (0.64 Gbps).

For the same input data rate, the floating point architecture reduces the output data rate to one quarter of that of the fixed point architecture.

Performance improvement in output data rate $= \dfrac{2.56 - 0.64}{2.56} = 0.75 = 75\%$.
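The data rate figures above can be reproduced with a few lines of arithmetic. In the sketch below, the output word widths (64 bits for fixed point, 16 bits for floating point, one output stream) are assumptions inferred from the quoted output rates and from the bit widths mentioned in the accuracy comparison; the function name `data_rate_bps` is illustrative only.

```python
def data_rate_bps(bits: int, fs_hz: int, channels: int) -> int:
    """Data rate in bits per second: bits x sampling frequency x channels."""
    return bits * fs_hz * channels

FS_HZ = 40_000_000  # 40 MHz sampling frequency

# Input side: 16 bit samples on 32 channels (same for both architectures).
input_rate = data_rate_bps(16, FS_HZ, 32)

# Output side: a single beamformed stream; word widths are assumptions.
fixed_out = data_rate_bps(64, FS_HZ, 1)   # assumed 64 bit fixed point output
float_out = data_rate_bps(16, FS_HZ, 1)   # assumed 16 bit floating point output

improvement = (fixed_out - float_out) / fixed_out

print(f"input:  {input_rate / 1e9:.2f} Gbps")
print(f"fixed:  {fixed_out / 1e9:.2f} Gbps")
print(f"float:  {float_out / 1e9:.2f} Gbps")
print(f"improvement: {improvement:.0%}")
```

Under these assumptions the computed rates match the 20.48 Gbps, 2.56 Gbps and 0.64 Gbps values quoted in the text.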

Fig. 21. Output Data Rate

B. Hardware Utilization

Hardware utilization is higher for the floating point architecture than for the fixed point architecture because it requires more multipliers (DSP48E1s): 24% and 19% of the available DSP48E1s, respectively. Fig. 22 compares the hardware utilization of both architectures. LUT and FF usage, as well as the maximum frequency, decrease as latency is reduced, so minimizing latency minimizes resources. The floating point IP core offers flexibility in choosing latency values. For the fixed to float conversion, the latency ranges from 0 to 6. For floating point addition and multiplication, the latency ranges from 0 to 8. The minimum required latency is 1 [10]. Table I and Table II show the device utilization of both architectures.

TABLE I
FLOATING POINT ARCHITECTURE DEVICE UTILIZATION (VIRTEX 6 FPGA)

Device Utilization Summary (estimated values)

| Logic Utilization | Used | Available | Utilization |
|---|---|---|---|
| Number of Slice Registers | 7932 | 301440 | 2% |
| Number of Slice LUTs | 26672 | 150720 | 17% |
| Number of fully used LUT-FF pairs | 4816 | 29788 | 16% |
| Number of bonded IOBs | 533 | 600 | 88% |
| Number of Block RAM/FIFO | 64 | 416 | 15% |
| Number of BUFG/BUFGCTRLs | 1 | 32 | 3% |
| Number of DSP48E1s | 188 | 768 | 24% |

TABLE II
FIXED POINT ARCHITECTURE DEVICE UTILIZATION (VIRTEX 6 FPGA)

Device Utilization Summary (estimated values)

| Logic Utilization | Used | Available | Utilization |
|---|---|---|---|
| Number of Slice Registers | 4210 | 301440 | 1% |
| Number of Slice LUTs | 6279 | 150720 | 4% |
| Number of fully used LUT-FF pairs | 4696 | 5793 | 81% |
| Number of bonded IOBs | 239 | 600 | 39% |
| Number of Block RAM/FIFO | 64 | 416 | 15% |
| Number of BUFG/BUFGCTRLs | 2 | 32 | 3% |
| Number of DSP48E1s | 152 | 768 | 19% |

Fig. 22. Hardware Utilization

C. Accuracy

To analyze the accuracy of the DBF output obtained from the floating point and fixed point architectures, the deviation of the FPGA simulation output from the MATLAB output is calculated. To find the error between the signals, each signal is first normalized to zero mean and unit variance.

s1 = (dbf_MATLAB - mean(dbf_MATLAB)) / std(dbf_MATLAB)

s2 = (dbf_float - mean(dbf_float)) / std(dbf_float)

s3 = (dbf_fixed - mean(dbf_fixed)) / std(dbf_fixed)

Error_float = max(s1 - s2) = 6.6061

Error_fixed = max(s1 - s3) = 6.4251
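The normalization and error measure above can be sketched in Python as follows. The input signals here are small made-up lists used only to exercise the functions; they are not the DBF outputs from the paper, and the function names `normalize` and `max_deviation` are illustrative. The sample standard deviation is used, which matches the default behaviour of MATLAB's `std`.

```python
import statistics

def normalize(signal):
    """Scale a signal to zero mean and unit (sample) standard deviation."""
    mean = statistics.mean(signal)
    std = statistics.stdev(signal)  # sample standard deviation (n - 1)
    return [(x - mean) / std for x in signal]

def max_deviation(reference, test):
    """Largest signed difference between the two normalized signals."""
    s_ref = normalize(reference)
    s_test = normalize(test)
    return max(a - b for a, b in zip(s_ref, s_test))

# Hypothetical example data, not taken from the paper.
reference = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0]
scaled = [2.0 * x + 3.0 for x in reference]   # same shape, different gain/offset
distorted = [0.0, 0.9, 0.1, -1.0, 0.0, 1.2]   # shape differs slightly

print(max_deviation(reference, scaled))
print(max_deviation(reference, distorted))
```

Because both signals are normalized first, a pure gain or offset difference between the FPGA and MATLAB outputs does not contribute to the error; only differences in signal shape do.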

From the above calculation, the error is lower for the 64 bit fixed point architecture than for the 16 bit floating point architecture, as shown in Fig. 23. The error decreases as the number of bits increases.

Fig. 23. Accuracy

D. Dynamic Range

Dynamic range is the ratio of the largest to the smallest number that can be represented in a data format. The floating point architecture has a higher dynamic range than the fixed point architecture. Fig. 24 illustrates the drastic increase in dynamic range of floating point compared to fixed point. Table III lists the dynamic range values of the floating point and fixed point formats for several word lengths.

Dynamic range for fixed point $= 20\log_{10}\left(2^{\text{No. of bits}} - 1\right)$ dB

Dynamic range for floating point $= 20\log_{10}\left(2^{2^{\text{No. of exponent bits}}}\right)$ dB

TABLE III
DYNAMIC RANGE

| No. of bits | Fixed point dynamic range (dB) | Floating point dynamic range (dB) |
|---|---|---|
| 16 | 96 | 192 |
| 32 | 192 | 1541 |
| 64 | 385 | 12330 |

Fig. 24. Dynamic Range
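The entries of Table III can be checked against the two dynamic range expressions with a short script. The sketch below assumes the standard IEEE 754 exponent widths (5, 8 and 11 bits for the 16, 32 and 64 bit formats) and truncates the results to whole decibels; the function names are illustrative.

```python
import math

def fixed_dynamic_range_db(bits: int) -> float:
    """20*log10(2^bits - 1) for a fixed point word of the given length."""
    return 20.0 * math.log10(2**bits - 1)

def float_dynamic_range_db(exponent_bits: int) -> float:
    """20*log10(2^(2^exponent_bits)), written as 20 * 2^E * log10(2)."""
    return 20.0 * (2**exponent_bits) * math.log10(2.0)

# Exponent widths of the IEEE 754 half, single and double formats (assumed).
IEEE754_EXPONENT_BITS = {16: 5, 32: 8, 64: 11}

for bits, exp_bits in IEEE754_EXPONENT_BITS.items():
    fixed_db = int(fixed_dynamic_range_db(bits))
    float_db = int(float_dynamic_range_db(exp_bits))
    print(f"{bits:2d} bits: fixed {fixed_db} dB, floating {float_db} dB")
```

With these assumptions the truncated values agree with the table: 96, 192 and 385 dB for fixed point, and 192, 1541 and 12330 dB for floating point.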

VIII. CONCLUSION

The digital beamformer is an essential processing block in various antenna array signal processing applications. In this paper we developed an FPGA based floating point DBF architecture, since an FPGA can handle high input data rates via high speed serial I/Os and the flexibility of the FPGA structure mitigates the real time implementation challenges in high sampling rate applications. We implemented a phased array beamformer; the delay calculations for various angles show that the centre transducer elements require less delay than the leftmost and rightmost elements. We proposed and implemented a variable delay line structure for each element, which helped to reduce the number of flip flops in the digital beamformer architecture. A performance comparison of the proposed floating point architecture and the fixed point architecture was carried out. The DBF output is more accurate for the 64 bit fixed point architecture than for the 16 bit floating point architecture, since the error decreases as the number of bits increases. We used the floating point IP core v5.0 available for the Virtex-6 FPGA to perform floating point arithmetic. Our implementation shows a 75% improvement in output data rate. The dynamic range calculation shows a drastic increase for the floating point architecture over the fixed point architecture. By minimizing the latency we minimized the resources used in the floating point architecture; the floating point IP core offers flexibility in choosing latency values. Hardware utilization is higher for the floating point architecture because of its use of multipliers in floating point arithmetic.

ACKNOWLEDGMENT

This work was supported by the Microelectronics Division of the Ministry of Electronics and Information Technology, Government of India, as per order No. 9(1)/2014-MDD dated 15-12-2014.

REFERENCES

[1] B. D. Van Veen and K. M. Buckley, “Beamforming: A versatile approach to spatial filtering”, IEEE ASSP Magazine, vol. 5, no. 2, pp. 4-24, 1988.

[2] K. E. Thomenius, “Evolution of Ultrasound Beamformers”, IEEE Ultrasonics Symposium, pp. 1615-1622, 1996.

[3] C. A. Balanis, “Antenna Theory: Analysis and Design”, 3rd edition, Wiley, 2005.

[4] S. Sahin, A. Kavak, Y. Becerikli, and H. E. Demiray, “Implementation of floating point arithmetics using an FPGA”, Mathematical Methods in Engineering, pp. 445-453, ISBN 978-1-4020-5677-2, Springer Netherlands, January 2007.

[5] https://en.wikipedia.org/wiki/IEEE floating point

[6] S. W. Smith, Chapter 28, “Fixed versus Floating Point”, The Scientist and Engineer's Guide to Digital Signal Processing, California Technical Pub., p. 514, ISBN 0966017633, 1997. Retrieved December 31, 2012.

[7] Texas Instruments, Signal Processing Overview of Ultrasound Systems for Medical Imaging, 2008.

[8] https://en.wikipedia.org/wiki/Half-precision floating-point format

[9] J. U. Kidav, B. A. Sujathakumari, and C. A. Laseena, “Ultrasound Array Modelling and Beamforming using Field II”, International Journal of Emerging Research in Management Technology, ISSN 2278-9359, vol. 4, no. 6, 2015.

[10] Xilinx, LogiCORE IP Floating-Point Operator v5.0, 2011.

[11] M. S. Chaitra, B. G. Sudarshan, B. S. Sathyanarayana, and P. Kumar, “Ultrasound Imaging System: A Review”, International Journal of Pharmacology and Pharmaceutical Technology (IJPPT), ISSN 2277-3436, vol. 1, no. 2, 2012.

[12] T. Haynes, “A Primer on Digital Beamforming”, Spectrum Signal Processing, March 26, 1998, http://www.spectrumsignal.com.

[13] L. Azar, Y. Shi, and S. C. Wooh, “Beam focusing behavior of linear phased arrays”, NDT&E International, Elsevier, vol. 33, pp. 189-198, July 2000.

[14] J. A. Jensen, “Field: A program for simulating ultrasound systems”, Med. Biol. Eng. Comput., vol. 4, pp. 351-353, 1996.

[15] http://www.wikiradiography.net/page/Ultrasound+Physics

[16] J. Park, S. M. Wi, and J. S. Lee, “Computationally efficient adaptive beamformer for ultrasound imaging based on QR decomposition”, IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, vol. 63, no. 2, February 2016.

[17] C. H. Hu, X. C. Xu, J. M. Cannata, J. T. Yen, and K. K. Shung, “Development of a Real-Time, High-Frequency Ultrasound Digital Beamformer for High Frequency Linear Array Transducers”, IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, vol. 53, no. 2, February 2006.

[18] S. Sami Deeb and Robert A. LaTourette, “Derivation of Beam Interpolation Coefficients with Application to the K- Beamformer”, NUWC-NPT Technical Report 11,287, 15 June 2001, IEEE Journal On Very Large Scale Integration Systems.

[19] T. I. Laakso, V. Valimaki, M. Karjalainen, and U. K. Laine, “Splitting the unit delay: tools for fractional delay filter design”, IEEE Signal Processing Magazine, pp. 30-60, January 1996.

[20] P. N. T. Wells, “Ultrasonic imaging of the human body”, Rep. Prog. Phys., vol. 62, pp. 671-722, 1999.

[21] P. Hoskins, K. Martin, and A. Thrush, “Diagnostic Ultrasound Physics and Equipment”.

[22] J. A. Jensen, “Users guide for the Field II program”, Release 3.20, November 19, 2010.

[23] W. Hua and L. Mei, “The Design of Delay Pulse Circuit for Ultrasonic Phased Array System”, Proceedings of 20th International Congress on Acoustics, ICA 2010, 23-27 August 2010, Sydney, Australia.

[24] J. Y. Lu, “Transmit-Receive Dynamic Focusing with Limited Diffraction Beams”, IEEE Ultrasonics Symposium, p. 1543, 1997.

[25] http://www.signal-processing.com/us field.html

[26] J. A. Jensen, “Ultrasound imaging and its modeling”, chapter in Fink et al. (Eds.), Imaging of Complex Media with Acoustic and Seismic Waves, Topics in Applied Physics, vol. 84, pp. 135-165, Springer Verlag, 2002.

[27] http://www.tp-link.in/resources/document/beamforming.pdf

[28] J. A. Jensen and P. Munk, “Computer Phantoms For Simulating Ultrasound B-Mode And CFM Images”, Acoustical Imaging, vol. 23, pp. 75-80, eds. S. Lees and L. A. Ferrari, Plenum Press, 1997.

[29] I. Lie and M. E. Tanase, “A Compact FPGA Beamformer Architecture”, WSEAS Int. Conf. on Dynamical Systems and Control, Venice, Italy, November 2-4, pp. 463-466, 2005.

[30] B. G. Tomov and J. A. Jensen, “A new architecture for a single-chip multi-channel beamformer based on a standard FPGA”, IEEE Ultrasonics Symposium, 2001.

[31] J. Y. Um, E. W. Song, Y. J. Kim, S. E. Cho, M. K. Chae, J. Song, B. Kim, S. Lee, J. Bang, Y. Kim, K. Cho, B. Kim, J. Y. Sim, and H. J. Park, “An Analog-Digital-Hybrid Single-Chip RX Beamformer with Non-Uniform Sampling for 2D-CMUT Ultrasound Imaging to Achieve Wide Dynamic Range of Delay and Small Chip Area”, IEEE International Solid-State Circuits Conference, 2014.

[32] M. Almekkawy, J. Xu, and M. Chirala, “An Optimized Ultrasound Digital Beamformer with Dynamic Focusing Implemented on FPGA”, IEEE Conference proceedings IEEE Eng Med Biol Soc, 2014.

[33] D. B. Casas, “Digital Beamforming Implementation on an FPGA Platform”, SPCOM Group, July 2007.

[34] G. Meng, “Method and Apparatus for Multi-Beam Beamformer Based On Real-Time Calculation of Time Delay and Pipeline Design”, Patent Application Publication, US 2011/0237950 A1, Sep. 29, 2011.

[35] G. I. Athanasopoulos, S. J. Carey, and J. V. Hatfield, “Circuit Design and Simulation of a Transmit Beamforming ASIC for High-Frequency Ultrasonic Imaging Systems”, IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, vol. 58, no. 7, July 2011.

[36] V. N. Okorogu, G. C. Nwalozie, K. C. Okoli, and E. D. Okoye, “Design and Simulation of a Low Cost Digital Beamforming (DBF) Receiver for Wireless Communication”, International Journal of Innovative Technology and Exploring Engineering (IJITEE), ISSN 2278-3075, vol. 2, no. 2, January 2013.

[37] S. G. Dighe and M. T. Kanawade, “Field Programmable Gate Array Technique’s”, International Journal of Computing and Technology, ISSN 2348-6090, vol. 2, no. 12, December 2015.

[38] U. M. Baese, “Digital Signal Processing and Field Programmable Gate Arrays”, 3rd edition, Springer.

[39] M. M. Nguyen and J. T. Yen, “Performance Improvement of Fresnel Beamforming Using Dual Apodization with Cross-Correlation”, IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, vol. 60, no. 3, March 2013.

[40] S. A. Mohamed, E. D. Mohamed, M. F. Elshikh, and M. A. Hassan, “Design of Digital Apodization Technique for Medical Ultrasound Imaging”, International Conference on Computing, Electrical and Electronic Engineering (ICCEEE), 2013.

[41] J. Bhattacharyya, P. Mandal, R. Banerjee, and S. Banerjee, “Real Time Dynamic Receive Apodization For An Ultrasound Imaging System”, Proceedings of the 19th International Conference on VLSI Design (VLSID06).

[42] B. G. Tomov and J. A. Jensen, “Compact implementation of dynamic receive apodization in ultrasound scanners”, Proceedings of SPIE - The International Society for Optical Engineering, April 2004.

[43] M. A. Hassan, “Comparison between windowing apodization functions techniques for medical ultrasound imaging”, American Journal of Biomedical Science and Engineering, 2015; 1(1): 1-8, published online January 30, 2015 (http://www.aascit.org/journal/ajbse).

[44] I. Mahi P. and S. S. Kerur, “Design and Simulation of Floating Point Pipelined ALU Using HDL and IP Core Generator”, International Journal of Current Engineering and Technology, ISSN 2277-4106, 2013, INPRESSCO, available at http://inpressco.com/category/ijcet.

[45] L. Gangwar and R. Chaudhary, “Floating Point Arithmetic Unit Using Verilog”, Advance in Electronic and Electric Engineering, ISSN 2231-1297, vol. 3, no. 8 (2013), pp. 1013-1018, Research India Publications, http://www.ripublication.com/aeee.htm

[46] IEEE Standard for Floating-Point Arithmetic, Microprocessor Standards Committee of the IEEE Computer Society, approved 12 June 2008, IEEE-SA Standards Board.