Upload
saikumarchintu
View
53
Download
6
Embed Size (px)
Citation preview
!
!
"#e!%&'()%!*+,-./0#12!
*3!*.e45677/8/e30!"e8#3/9:e!7-.!;<=*5>4?e@!*.0/7/8/4+!Ae:.4+!Ae0B-.C?!
!
!
!
!
!
*!"#e?/?!
D.e?e30e@!0-!0#e!;48:+0E!-7!
%4+/7-.3/4!<-+E0e8#3/8!F040e!G3/He.?/0E!
F43!I:/?!&J/?D-!
!
!
!
!
!
!
)3!<4.0/4+!;:+7/++1e30!
-7!0#e!'e9:/.e1e30?!7-.!0#e!(e,.ee!
K4?0e.!-7!F8/e38e!/3!6+e80./84+!63,/3ee./3,!
!
!
!
JE!
"/1!K8Ie3e,43!
A-He1Je.!LMMN!
ii
Authorization for Reproduction of Master4s Thesis
# grant permission for the reproduction of this thesis in its entirety or any of its parts4
without further authori6ation from me.
88888888888888888888888888
9ignature :;imothy <c=enegan>
88888888888888888888888888
?ate
"""
!""ro%&'()&*e(
#"$%&' #(& )*+,-) .%/01"$(2' .3 .1&45677"8"&3$ #&8(3"9:& 701 ;<=.5>4?&@
.1$"7"8"4% A&:14% A&$B01C?
.:$(01' #"20$(D E8F&3&/43
,4$& G:H2"$$&@' A0I&2H&1 JKL JMMN
,1O FD33& G%"I0I?CD PPPPPPPPPPPPPPPPPPPPPPPPPPPPPP
.@I"?01 43@ )022"$$&& )(4"1 G"/34$:1&
,1O .%H&1$ F"@@"804$ PPPPPPPPPPPPPPPPPPPPPPPPPPPPPP
)022"$$&& E&2H&1 G"/34$:1&
,1O Q43& R(43/ PPPPPPPPPPPPPPPPPPPPPPPPPPPPPP
)022"$$&& E&2H&1 G"/34$:1&
iv
A"stract
An artificial neural network is a computing technique that allows problems to be solved
that would otherwise require an overly complex procedural algorithm. <y designing a
large network of computing nodes based on the artificial neuron model, new solutions
can be developed to problems relating fields such as image and speech recognition.
This thesis presents the ?@ABI? algorithm as a computing technique that is capable of
supporting an artificial neural network in programmable hardware such as DEFAs. The
algorithm is presented in depth, including a derivation and precision analysis. The
designs of a parallel and bitGserial implementation are analyHed with respect to their
ability to support a large neural network. Simulations demonstrating the operation of the
unit follow a breakdown of the subcomponents in each design. It is shown that the small
resource requirements of the ?@ABI? algorithm allow for many instances in an DEFA,
allowing the parallelism inherent to an artificial neural network to be maintained.
! ! "
!able of Contents
!"#$%&'$ ((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((( i!
*&"+,-o/-0o1$,1$# ((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((( 2!
3i#$-o/-*&"+,# ((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((2ii!
3i#$-o/-4i56%,#((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((2iii!
07&8$,%-9! :1$%o;6'$io1 ((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((( 9!
07&8$,%-<! !%$i/i'i&+-=,6%&+-=,$wo%?# (((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((( @!
#$%! &'()*+,-./$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ 0!#$#! 1+2343(3'5!67-+,.!8,/75$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ 9!#$#$#! :-;;'23,.!<-.(23,. $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ =!#$#$>! ?+'.@47+!<-.(23,. $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ =!#$#$0! :('53.*!'./!A3;323.* $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ B!#$#$9! C++,+!<-.(23,.$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ B!#$#$=! D-2E-2!<-.(23,.$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ B!#$#$B! A7'+.3.*!<-.(23,. $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ F!
#$>! 1+2343(3'5!67-+'5!672G,+)!:2+-(2-+7$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ F!#$>$%! H.E-2!A'I7+$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ J!#$>$#! K3//7.!A'I7+@$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ J!#$>$>! D-2E-2!A'I7+ $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ %L!
#$0! A7'+.3.*!8,/7@ $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ %L!#$0$%! :-E7+"3@7/!A7'+.3.*$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ %%!#$0$#! M.@-E7+"3@7/!A7'+.3.* $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ %%!#$0$>! A7'+.3.*!N'27@ $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ %#!#$0$0! O,;;,.!A7'+.3.*!A'G@$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ %#!
#$9! <PQ1!H;E57;7.2'23,.@$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ %0!#$=! ODNRHO$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ %=!
07&8$,%-A! 4BC!# (((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((( 9D!
>$%! R7@3*.!R7(3@3,.@ $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ %F!>$#! <PQ1!:2+-(2-+7 $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ #L!>$>! S353.T!:E'+2'.U> $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ ##!
07&8$,%-@! *7,-0EFG:0-!+5o%i$7H ((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((( <I!
0$%! &'()*+,-./$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ #=!0$#! ?V7!15*,+32V; $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ #B!0$>! 1((-;-5'2,+!N7*3@27+@ $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ >L!0$0! O,;E-2'23,.!8,/7@ $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ >%!0$0$%! N,2'23,.!8,/7 $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ >#!0$0$#! W7(2,+3.*!8,/7$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ >0!
0$9! CTE'./3.*!2V7!O,;E-2'23,.!R,;'3. $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ >=!0$=! O,."7+*7.(7 $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ 0L!
"i
$%&%1 (he Con"er/ence Condition %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% $3 $%&%4 5roo6 o6 Con"er/ence Criteria%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% $1 $%&%8 Con"er/ence in 9otation :ode%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% $4
$%; <ccurac> and ?rror%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% $8 $%;%1 @uAericaB 9epreDentation%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% $8 $%;%4 9oundin/ ?rror %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% $$ $%;%8 <n/Be <pproEiAation ?rror%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% $$
!"#$%&' ) *"& +#'#,,&, -.$,&.&/%#%01/22222222222222222222222222222222222222222222222222222222222 34
F%1 GeDi/n%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% $; F%1%1 (he ControB Hnit%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% $I F%1%4 (he Jhi6t 9e/iDterD %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% F4 F%1%8 (he KooLup (aMBe %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% F8 F%1%$ GeDi/n JuAAar> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% F$
F%4 JiAuBation 9eDuBtD%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% FF F%4%1 JiAuBation 1N CoAputin/ e Oith CP9GIC RroSth ?rror %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% FF F%4%4 JiAuBation 4N CoAputin/ e CoApenDatin/ 6or CP9GIC RroSth %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% F& F%4%8 JiAuBation 8N CoAputin/ e
T1%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% F;
F%4%$ JiAuBation $N CoAputin/ PutDide the GoAain o6 Con"er/ence%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% FU F%8 JuAAar> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% FI
!"#$%&' 4 *"& 5&'0#, -.$,&.&/%#%01/22222222222222222222222222222222222222222222222222222222222222 46
&%1 GeDi/n%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% &3 &%1%1 ControB Hnit %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% &4 &%1%4 (he Jhi6t 9e/iDterD %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% &$ &%1%8 (he KooLup (aMBe %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% && &%1%$ (he JeriaB <dderD %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% &; &%1%F GeDi/n JuAAar> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% &U
&%4 JiAuBation 9eDuBtD%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% &I &%4%1 JiAuBation 1N CoAputin/ e Oith CP9GIC RroSth ?rror %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% ;3 &%4%4 JiAuBation 4N CoAputin/ e CoApenDatin/ 6or CP9GIC RroSth %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% ;1 &%4%8 JiAuBation 8N CoAputin/ e
T1%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% ;4
&%4%$ JiAuBation $N CoAputin/ PutDide the GoAain o6 Con"er/ence%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% ;4 &%8 JuAAar> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% ;8
!"#$%&' 7 !1/8,9:01/ #/; <9%9'& =1'>2222222222222222222222222222222222222222222222222222222222 7)
;%1 CoAparin/ the GeDi/nD %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% ;& ;%4 Inte/ration Oith <rti6iciaB @euraB @etSorLD%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% ;I
;%4%1 CP9GIC <rti6iciaB @euron%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% ;I ;%4%4 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% U4
;%8 Vuture Jtud>%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% U$
?&@&'&/8&:2222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222 A)
vii
List of Tables
Table 4.1—CORDIC growth factors for each computation domain................................................................ 38 Table 4.2—Functions that can be computed using the CORDIC algorithm. .................................................. 39 Table 4.3—Domains of convergence for the CORDIC algorithm................................................................... 41 Table 5.1—Overall device utilization for the parallel design........................................................................... 48 Table 5.2—Unconditional control unit outputs for each state in the parallel design. ..................................... 50 Table 5.3—Device utilization for the parallel control unit. .............................................................................. 52 Table 5.4—Device utilization for the parallel shift register. ............................................................................ 53 Table 5.5—Device utilization for the parallel lookup table.............................................................................. 54 Table 6.1—Overall device utilization for the serial design. ............................................................................. 61 Table 6.2—State outputs for the serial control unit........................................................................................... 64 Table 6.3—Device utilization for the serial control unit. ................................................................................. 64 Table 6.4—Device utilization for a single serial shift register (including 32:1 multiplexer)......................... 66 Table 6.5—Device utilization for the serial lookup table. ................................................................................ 66 Table 6.6—Device utilization for the serial and parallel adders. ..................................................................... 68 Table 7.1—Device resource requirements for the artificial neuron designs using both the parallel and serial
CORDIC units............................................................................................................................................ 81
viii
List of Figures
Figure 2.1—Components that comprise an artificial neuron, the basic building blocks of artificial neural
networks. ...................................................................................................................................................... 5 Figure 2.2—A basic artificial neural network. .................................................................................................... 9 Figure 3.1—Internal structure of a Xilinx Spartan-3 FPGA [8]....................................................................... 20 Figure 3.2—Block diagram of a slice in a Spartan-3. ....................................................................................... 23 Figure 3.3—Block diagram of a Complex Logic Block (CLB) in a Spartan-3. .............................................. 24 Figure 4.1—Step i in the CORDIC algorithm. .................................................................................................. 28 Figure 4.2—The CORDIC Rotation mode. ....................................................................................................... 32 Figure 4.3—CORDIC Vectoring Mode ............................................................................................................. 34 Figure 5.1—Block-level diagram of the complete parallel CORDIC unit. ..................................................... 47 Figure 5.2—State diagram for the parallel CORDIC unit. ............................................................................... 49 Figure 5.3—State transition logic for the parallel CORDIC unit. .................................................................... 50 Figure 5.4—Logic behind the state-independent outputs of the parallel control unit..................................... 51 Figure 5.5—Design of the variable shift register. ............................................................................................. 53 Figure 5.6—Slices used by the various components of the parallel CORDIC unit. ....................................... 54 Figure 5.7—Results of a ModelSim simulation attempting to compute the value of e. As a result of
CORDIC growth, the final computed value is inaccurate....................................................................... 56 Figure 5.8—Results of a ModelSim simulation computing the value of e. By accounting for CORDIC
growth, the final computed value is accurate........................................................................................... 57 Figure 5.9—Results of a ModelSim simulation computing the value of e
–1. .................................................. 57
Figure 5.10—Results of a ModelSim simulation computing the value of e5. Since the initial value 5 is
outside the domain of convergence, the computed value is incorrect. ................................................... 58 Figure 6.1—Block diagram for the serial implementation of the CORDIC algorithm................................... 60 Figure 6.2—State diagram for the serial implementation of the CORDIC algorithm. ................................... 63 Figure 6.3—Design of the serial shift registers. ................................................................................................ 65 Figure 6.4—Design of the serial adder. ............................................................................................................. 67 Figure 6.5—Slices used by the various components of the serial CORDIC unit............................................ 68 Figure 6.6—Results of a ModelSim simulation of the serial CORDIC unit attempting to compute the value
of e. As a result of CORDIC growth, the final computed value is inaccurate. ...................................... 70 Figure 6.7—Results of a ModelSim simulation of the serial CORDIC unit computing the value of e. By
accounting for CORDIC growth, the final computed value is accurate................................................. 71 Figure 6.8—Results of a ModelSim simulation of the serial CORDIC unit computing the value of e
–1. ..... 72
Figure 6.9—Results of a ModelSim simulation of the serial CORDIC unit computing the value of e5. Since
the initial value 5 is outside the domain of convergence, the computed value is incorrect. ................. 73 Figure 7.1—Slices required by the various components of the two CORDIC designs. ................................. 76 Figure 7.2—The effect of word size on the FPGA chip area requirements of the CORDIC unit.................. 78 Figure 7.3—The effect of word size on the execution time of the CORDIC unit........................................... 78 Figure 7.4—Block diagram of a basic artificial neuron utilizing the CORDIC unit to compute the transfer
function ex................................................................................................................................................... 80
Figure 7.5—Simulation of an artificial neuron using the 4 inputs (0.3, 0.02, 2.1, 0.65) and the weights (1, 2,
–0.5, 0.25)................................................................................................................................................... 82
1
Chapter 1 Introduction
Traditional design techniques typically require a systems designer to formulate a
mathematical model of the system, and then design an algorithm to operate on the system
inputs in accordance with the model. In general, this approach works well, but there are
some applications where either the mathematical model is difficult to derive, or the
algorithm is too complex to be implemented in a cost-effective manner. Image and
speech processing are two such problems that people intuitively understand but require
complicated mathematical models that in turn require a good deal of computing power to
operate [10].
In these situations, an artificial neural network can be employed. Rather than
deriving an algorithm for analyzing the image or speech pattern, the parameters of the
network are tweaked in a process called training until the desired outputs are obtained
from the network. As will be shown in Chapter 2, networks are massively parallel
networks of small computing nodes called “neurons.” This parallelism makes neural
networks naturally suited to hardware implementations. A software implementation
running on a single microprocessor can only simulate parallelism; only in a hardware
implementation can the real-time benefits of this parallelism be achieved.
Programmable hardware devices such as FPGAs are particularly suited as platforms
for implementation, since the various parameters of the network occur during training.
Programmable hardware devices also afford several other benefits to hardware designers
by eliminating a lot of the difficulties that come with custom chip design. With a platform
for these artificial neural networks chosen, a means of performing the computations in the
interconnected nodes is needed. The number of these nodes can be large; the arithmetic
2
unit must be small enough so that it can be duplicated enough to maintain the parallelism
inherent in the structure.
The CORDIC algorithm will be presented as a solution to that problem. The
algorithm lends itself to a simple hardware implementation consisting only of basic
addition and shifting operations. In addition to its small chip area footprint, the algorithm
can compute a wide variety of functions ensuring that it can be used in a variety of ways
within a system. In particular, the algorithm can compute the exponential function, ex by
summing together the values of sinh(x) and cosh(x), both of which are computed in
parallel by the unit. The exponential function has an application as the transfer function
in a specific type of neural network and is chosen as the function of study for this thesis.
Chapter 2 will describe neural networks in detail and describe how they operate. The
structure of the network will be analyzed, as will the design of the neurons that comprise
the network. Chapter 3 will study the design of programmable hardware devices—
FPGAs in particular. The benefits they offer hardware designers will be presented, as will
the basic structures common to most implementations. Chapter 4 introduces the CORDIC
algorithm. A general-purpose definition of the algorithm is presented, and proofs of
convergence and precision are presented. Expansion into multiple computation domains
allows for the computation of sine, cosine, square root, multiplication, hyperbolic sine,
and hyperbolic cosine.
Chapter 5 and Chapter 6 study two implementations of the CORDIC algorithm and
study how the components of the CORDIC processing unit fit into a Xilinx Spartan-3
FPGA. Simulations performed using the ModelSim software are presented. Chapter 7
3
then compares these two designs and shows how they can be used in a neural network to
facilitate their implementation in a programmable device.
"
Chapter ) *rti,icia. /eura. /et1or34
As electronic devices increasingly become more a part of society, the desire for
devices to solve more complex problems is growing. ;ome problems have no easy
algorithmic solution, or the problems themselves are not completely understood. It is in
these cases that it is occasionally easier to develop a learning machine that can be taught
what to do in a situation without needing to understand the underlying processes.
Artificial neural networks are one such technique. They are designed to emulate the
human brainAs ability to learn. The same structure can be used to solve many different
problems, depending on the training methods employed.
)51 7ac38round
The structure of an artificial neural network is similar to that of the human brain. The
human brain is made up of billions of cells called neurons. The computational power of
the brain is a result of the large volume of neurons available, coupled with learning
ability.
The neurons themselves have several different classifications and subcomponents.
The computation process comes as a result of the interconnection of these different types
of neurons. Bowever, unlike the digital realm in which artificial neural networks reside,
the neurons in the brain “form a process which is not binary, not stable, and not
synchronous” [10].
Though artificial neural networks try to mimic the way in which the human brain
operates, the methodology is not designed to replace the brain, especially since so much
"
of it is (ot u(*erstoo*- ./t0er1 /rtifi2i/3 (eur/3 (et4or5s re6rese(t / (e4 te20(i7ue for
e(gi(eers to so39e 6ro:3e;s-
!"! #$%i'i(i)*+,e.$/0+1/2e*+
!"!"
!"!2
i"!"
i"!2
i"!$
!"!$
! f
"
!"#$% o"
!
i"!!!%&'
()ti!+ial /e1)o"
!"#$r& ()*+,-./-0&0ts t3at c-./r"s& a0 art"6"c"a7 0&$r-08 t3& bas"c b$"7:"0# b7-c;s -6 art"6"c"a7
0&$ra7 0&t<-r;s)
<rtifi2i/3 (euro(s i;it/te t0e 9/rious fe/tures of :io3ogi2/3 (euro(s- Figure >-?
s0o4s / :/si2 ;o*e3 of t0e 9/rious 2o;6o(e(ts t0/t ;/5e u6 /( /rtifi2i/3 (euro(- @/20
(euro( 0/s / set of i(6uts1 3/:e3e* i i( t0e figure1 /(* / set of out6uts *e(ote* /s o i( t0e
figure- A0e /;ou(t of t0ese i(6uts /(* out6uts 2/( 9/rB *e6e(*i(g o( t0e stru2ture of t0e
e(tire (et4or5- Cut6uts 2/( :e 2o((e2te* to ;u3ti63e (euro(s-
!"!"3"3 4ei56%i05+7)(%/$8+
D( or*er to ;i;i2 t0e sB(/6ti2 stre(gt0s of :io3ogi2/3 (euro(s1 4eig0ti(g f/2tors
E*e(ote* w i( Figure >-?F /re (ee*e* for t0e i(6uts to t0e /rtifi2i/3 (euro(- @/20 4eig0t
sig(ifies t0e i;6ort/(2e of t0eir res6e2ti9e i(6ut i( t0e 6ro2essi(g fu(2tio( of t0e (euro(-
D(6uts 4it0 3/rger 4eig0ts 4i33 2o(tri:ute ;ore to t0e (eur/3 res6o(se t0/( t0ose 4it0
6
l$%%$r '$ig*t%. -*$ pot$ntial to l$arn i% incorporat$d into t*$ arti4icial n$uron 6and t*u%
t*$ n$t'or789 :; allo'ing t*$ input '$ig*t% to :$ adapti<$ co$44ici$nt%. -*$ adaptation
proc$%% i% p$r4orm$d in r$%pon%$ to training %$t% o4 data9 and d$p$nd% on :ot* t*$
n$t'or7>% %p$ci4ic topolog; a% '$ll a% t*$ l$arning rul$ :$ing appli$d.
!"!"! #$%%&'()* ,$*-'()*
-*$ 4ir%t %t$p in t*$ op$ration o4 an arti4icial n$uron i% t*$ %ummation 4unction. ?%
t*$ nam$ impli$%9 t*i% i% u%uall; a %ummation o4 t*$ '$ig*t$d input% to t*$ n$uron. -*$
%ummation 4unction can :$ mor$ compl$@ t*an a %impl$ %ummation. Ot*$r 4unction% u%$d
ar$ minimum9 ma@imum9 maBorit;9 product9 and ot*$r normaliCing algorit*m%. ?n;
4unction t*at op$rat$% on multipl$ input% to produc$ a %ingl$ output Duali4i$%.
!"!". /0&*1230 ,$*-'()*
?4t$r t*$ input% *a<$ pa%%$d t*roug* t*$ %ummation 4unction9 t*$; ar$ t*$n 4$d
t*roug* a tran%4$r 4unction. -*i% tran%4$r 4unction i% u%uall; a nonlin$ar 4unction. On$ o4
t*$ goal% o4 arti4icial n$ural n$t'or7% i% to :$ a:l$ to pro<id$ nonlin$ar proc$%%ing.
Eo'$<$r9 t*$ a:ilit; o4 a n$ural n$t'or7 to p$r4orm in a nonlin$ar 4a%*ion i% d$p$ndant
upon t*$ tran%4$r 4unction o4 t*$ indi<idual n$uron%. F; c*oo%ing a lin$ar tran%4$r
4unction9 t*$ o<$rall n$t'or7 'ould :$ limit$d to %impl$ lin$ar com:ination% o4 t*$
input%. Gariou% tran%4$r 4unction% ar$ t;picall; u%$d. ? <$r; common tran%4$r 4unction i%
t*$ *;p$r:olic tang$nt. -*$ *;p$r:olic tang$nt i% a continuou% 4unction9 a% ar$ it%
d$ri<ati<$%.
Hn a 4$' ca%$%9 a uni4orm noi%$ g$n$rator i% add$d :$4or$ t*$ tran%4$r 4unction i%
appli$d. -*$ output o4 t*i% g$n$rator i% r$4$rr$d to a% t*$ n$uron>% It$mp$ratur$.J -*$
%urrounding t$mp$ratur$ can ad<$r%$l; a44$ct t*$ *uman :rain9 and :; adding t*i%
7
ca%a&ilit* to artificial neural network34 t5e &e5a6ior of t5e network 7ore clo3el*
e7ulate3 a 5u7an &rain.
!"!"# $%&'()* &), -(.(/()*
97%le7entation of t5i3 %ortion of t5e artificial neuron 7o:el i3 o%tional. ;5e out%ut
of t5e tran3fer function i3 7ani%ulate: in or:er to lie wit5in certain &oun:3. Scaling i3
%erfor7e: fir3t4 followe: &* 3o7e 3ort of t5re35ol: function.
Accor:ing to An:er3on an: ?cNeil ABCD4 t5e3e co7%onent3 are u3uall* u3e: w5en
eE%licitl* 3i7ulating &iological neuron 7o:el3.
!"!"0 12232 45)%/(3)
;5e raw error of t5e network i3 t5e :ifference &etween t5e :e3ire: out%ut an: t5e
actual out%ut. ;5e error function tran3for73 t5i3 raw error to 7atc5 t5e %articular network
arc5itecture in u3e. 9f it i3 :e3ire: &* t5e 3*3te7 arc5itect to con3i:er error in t5e network4
t5en t5i3 co7%onent i3 inclu:e: in t5e neuron 7o:el. 9n t5i3 ca3e4 %ro%agation :irection
of t5i3 error i3 u3uall* &ackwar:3 t5roug5 t5e network. ;5e &ackF%ro%agate: 6alue 3er6e3
a3 t5e in%ut to ot5er neuron3G learning function3.
!"!"6 75/85/ 45)%/(3)
Hut3i:e t5e neuron 7o:el4 &ut relate: to it i3 t5e out%ut function. Nor7all* t5e
out%ut of t5e neuron i3 eIual to t5e out%ut of t5e tran3fer function. J5en i7%le7ente:4
t5e out%ut function allow3 for co7%etition &etween t5e out%ut3 of 6ariou3 neuron3.
Jit5in a 37all Kneig5&or5oo:L of neuron34 a large out%ut &* one neuron will cau3e t5e
out%ut of a :ifferent neuron to :i7ini35. 9n ot5er wor:34 t5e lou:e3t neuron cau3e3 t5e
ot5er neuron3 to &e Iuieter.
"
2.2.# $earning Function
#he learnin+ ,un.tion 1odi,ie3 the input 5ei+ht3 o, the neuron6 7ther na1e3 +i8en
to thi3 ,un.tion are the adaptation ,un.tion9 or learnin+ 1ode6 #here are t5o 1ain type3
o, learnin+ 5hen dealin+ 5ith neuron3 and neural net5or;36 #he ,ir3t type9 3uper8i3ed
learnin+9 i3 a ,or1 o, rein,or.e1ent learnin+ and re<uire3 a tea.her9 u3ually in the ,or1 o,
trainin+ 3et3 or an o=3er8er6 >n3uper8i3ed learnin+ i3 the other type9 and i3 =a3ed upon
internal .riteria =uilt into the net5or;6 #he 1a?ority o, neural net5or;3 utili@e the
3uper8i3ed learnin+ 1ethod9 a3 un3uper8i3ed learnin+ i3 .urrently 1ore in the re3ear.h
real16 #he pro.e33e3 u3ed to train a net5or; are di3.u33ed in A6B6
2.1 Artificial 5eural 5etwork 8tructure
Crti,i.ial neural net5or;3 ,un.tion a3 parallel di3tri=uted .o1putin+ net5or;36 Da.h
node in the net5or; i3 an arti,i.ial neuron6 #he3e neuron3 are .onne.ted to+ether in
8ariou3 ar.hite.ture3 ,or 3pe.i,i. type3 o, pro=le136 Et i3 i1portant to note that the 1o3t
=a3i. ,un.tion o, any arti,i.ial neural net5or; i3 it3 ar.hite.ture6 #he ar.hite.ture9 alon+
5ith the al+orith1 ,or updatin+ the input 5ei+ht3 o, the indi8idual neuron39 deter1ine3
the =eha8ior o, the arti,i.ial neural net5or;6 Feuron3 are typi.ally or+ani@ed into layer3
5ith .onne.tion3 =et5een neuron3 eGi3tin+ a.ro33 layer39 =ut not 5ithin6 Da.h neuron
5ithin ea.h layer i3 o,ten ,ully .onne.ted to all neuron3 in the a33o.iated layer6 #hi3 .an
lead to a 8a3t a1ount o, .onne.tion3 eGi3tin+ 5ithin the net5or;9 e8en 5ith relati8ely
,e5 neuron3 per layer6 Hi+ure A6A 3ho53 a 3i1ple neural net5or; .ontainin+ I layer36 En
thi3 .a3e9 the layer3 are not ,ully .onne.ted6
"
input layer outputlayer
hiddenlayer
!igure 2.2*+ ,asi/ arti1i/ia2 neura2 net4or6.
!"#"$ %&'() +,-./
#$%&'&%()* $+(,-$. ),+ (.+% /-, +)01 &$2(3 -/ )$ ),3&/&0&)* $+(,)* $+34-,56 71+.+
&$2(3. 0-(*% 8+ 0-**+03+% %)3)9 -, ,+)* 4-,*% &$2(3. /,-: 21;.&0)* .+$.-,.6 <,+=2,-0+..&$>
-/ 31+ &$2(3. 0)$ 8+ %-$+ 3- .2++% (2 31+ *+),$&$> 2,-0+.. -/ 31+ $+34-,56 #/ 31+ &$2(3. ),+
.&:2*; ,)4 %)3)9 31+$ 31+ $+34-,5 4&** $++% 3- *+),$ 3- 2,-0+.. 31+ %)3) &3.+*/9 ). 4+** ).
)$)*;?+ &36 71&. 4-(*% ,+@(&,+ :-,+ 3&:+9 )$% 2-..&8*; )$ +'+$ *),>+, $+34-,5 31)$ 4&31
2,-0+..+% &$2(3.6
!"#"! 0122.& +,-./3
71+ &$2(3 *);+, &. 3;2&0)**; 0-$$+03+% 3- ) 1&%%+$ *);+,6 A(*3&2*+ 1&%%+$ *);+,. :);
+B&.39 4&31 31+ &$2(3. -/ +)01 1&%%+$ *);+,C. $+(,-$. -/3+$ 8+&$> /(**; 0-$$+03+% 3- 31+
1#
$u&pu&s $) &*e previ$us/0 /a0er2s neur$ns. 5idden /a0ers 7ere given &*a& na9e due &$ &*e
)a:& &*a& &*e0 d$ n$& see an0 rea/ 7$r/d inpu&s n$r d$ &*e0 give an0 rea/ 7$r/d $u&pu&s.
;*e0 are )ed <0 &*e inpu& /a0er2s $u&pu&s= and )eed &*e $u&pu& /a0er2s inpu&s. ;*e nu9<er
$) neur$ns 7i&*in ea:* $) &*e *idden /a0ers= as 7e// as &*e nu9<er $) *idden /a0ers
&*e9se/ves= de&er9ines &*e :$9p/e>i&0 $) &*e s0s&e9. ?*$$sing &*e rig*& a9$un& )$r ea:*
is a 9a@$r par& $) designing a 7$rAing neura/ ne&7$rA. Bigure 2.2 s*$7s $ne *idden
/a0er= <u& $&*er ne&7$rA ar:*i&e:&ures :an *ave 9u/&ip/e *idden /a0ers.
!"#"# $%&'%& )*+,-
Da:* neur$n 7i&*in &*e $u&pu& /a0er re:eives &*e $u&pu& $) ea:* neur$n 7i&*in &*e
/as& *idden /a0er. ;*e $u&pu& /a0er pr$vides rea/ 7$r/d $u&pu&s. ;*ese $u&pu&s :$u/d g$ &$
an$&*er :$9pu&er pr$:ess= a 9e:*ani:a/ :$n&r$/ s0s&e9= $r 9a0<e saved in&$ a )i/e )$r
ana/0Eing. LiAe &*e $u&pu& )un:&i$n $) an individua/ neur$n= &*e $u&pu& /a0er 9a0
par&i:ipa&e in s$9e s$r& $) :$9pe&i&i$n <e&7een $u&pu&s. ;*is /a&era/ in*i<i&i$n :an <e
seen in Bigure 2.2 as &*e d$&&ed /ines :$nne:&ing &*e $u&pu& neur$ns. Gn addi&i$n= &*e
$u&pu&s 9a0 a/s$ <e )ed <a:A in&$ previ$us neur$ns &$ assis& &*e /earning pr$:ess. Gn
Bigure 2.2= &*e resu/& )r$9 an $u&pu& neur$n is )ed <a:A in&$ a neur$n in &*e *idden /a0er.
!". ),*-/0/1 234,5
H varie&0 $) di))eren& /earning 9$des e>is& )$r de&er9ining *$7 and 7*en &*e inpu&
7eig*&s $) &*e individua/ neur$ns are upda&ed 7i&*in a ne&7$rA. ;*e &0pes $) /earning are
ei&*er supervised $r unsupervised. Hs s&a&ed ear/ier= &*e unsupervised /earning is :urren&/0
&*e 9$s& unAn$7n &0pe $) /earning. ;*e /earning ra&e $) a ne&7$rA dras&i:a//0 e))e:&s &*e
i&s per)$r9an:e.
11
!"#"$ %&'()*+,(- /(0)1+12
Learning in a su,er-ise. mo.e starts wit3 a 4om,arison o5 t3e networ6’s generate.
out,uts an. t3e .esire. out,uts. 9n,ut weig3ts o5 ea43 neuron are a.:uste. to minimi;e
an< .i55eren4es 5oun.. =3is ,ro4ess is re,eate. until t3e networ6 is .eeme. to be a44urate
enoug3. @5ter t3e training ,3aseA t3e neurons’ weig3ts are t<,i4all< 5ro;enA w3i43 allows
t3e networ6 to be use. reliabl<. =o better a.a,t to slig3t -ariationsA t3e learning rate 4an
be lowere.. Bne o5 t3e most im,ortant t3ings to .o w3en training a networ6 is to
4are5ull< 43oose t3e .ata use. 5or training. =<,i4all< .ata is se,arate. into a training set
an. a mu43 smaller test set. =3e training set is use. to train t3e networ6 to ,er5orm a tas6.
=3e test set is use. to -eri5< t3at t3e networ6 is able to generali;e w3at it 3as learne. to
slig3t -ariations. Cit3out t3is se,aration o5 .ata setsA one woul. not be able to 6now i5
t3e networ6 sim,l< memori;e. t3e .ata set or not.
!"#"! 31,&'()*+,(- /(0)1+12
Dnsu,er-ise. learning is ,er5orme. wit3out an< 5orm o5 eEternal rein5or4ement.
=3is learning mo.e re,resents a sort o5 en. goal 5or s<stems .esigners. Dsing t3is
met3o.A t3e s<stem tea43es itsel5 b< itsel5. =3e networ6 4ontains wit3in itsel5 a met3o. o5
.etermining w3en its out,uts are not w3at t3e< s3oul. be. =3is met3o. o5 learning is not
nearl< as well un.erstoo. as t3e su,er-ise. met3o.. 9t reFuires t3at t3e networ6 learn
online. Current wor6 3as been limite. to sel5Horgani;ing ma,sA w3i43 learn to 4lassi5<
in4oming .ata. Iurt3er .e-elo,ments wit3 t3is t<,e o5 learning woul. 3a-e uses in man<
situations w3ere a.a,tation to new in,uts is reFuire. regularl<.
"2
!"#"$ %&'()*)+ -'.&/
The learning rate o/ a networ1 is deter4ined 56 4an6 /actors. 9etwor1 architecture;
si<e and co4=le>it6 =la6 a 5ig role in the s=eed at which the networ1 learns. ?nother
/actor that a//ects the learning rate is the learning rule or rules e4=lo6ed. @low and /ast
learning rates each haAe their =ros and cons. ? lower rate will o5Aiousl6 ta1e longer to
arriAe at a 4ini4u4 error at the out=ut. ? /aster rate will arriAe 4ore Buic1l6; 5ut has a
tendenc6 to oAershoot the 4ini4u4. @o4e learning rules use the 5est o/ 5oth worlds; and
start o// with a high learning rate; and lower it graduall6 until a 4ini4u4 is reached.
!"#"# 01221) %&'()*)+ %'3/
Learning laws goAern how the in=ut weights o/ neurons within the networ1 are
4odi/ied. T6=icall6 the error at the out=ut is =ro=agated 5ac1 through the Aarious la6ers
o/ the networ1; adDusting the weights as it goes. Eow the error is =ro=agated 5ac1 is the
4aDor di//erence. The /ollowing; all /ro4 F"GH; are so4e laws co44onl6 used 56 networ1
architects.
!"#"#"4 5&667/ -89&
Ee55’s rule was the /irst general rule /or u=dating weights. Jt states; KJ/ a neuron
receiAes an in=ut /ro4 another neuron; and i/ 5oth are highl6 actiAe L4athe4aticall6 haAe
the sa4e signM; the weight 5etween the neurons should 5e strengthened.N Ee55 o5serAed
that 5iological neural =athwa6s are strengthened each ti4e the6 are used; and this rule is
designed to si4ulate that e//ect. Oost o/ the other rules 5uild u=on this /unda4ental law.
?s an e>a4=le o/ how this rule wor1s; su==ose a neural networ1 is 5eing trained to
control the acceleration o/ a car. @u==ose /urther that the networ1’s in=uts are the 5ra1e
and gas =edal =ositions 5eing as o=erated 56 a hu4an driAer. The acceleration and
"#
$e&ele(ation o. t/e &a( &an 0e &ompa(e$ 3it/ t/e $esi(e$ o5tp5t o. t/e $(i6e(. 8o3
s5ppose t/e $(i6e( o. t/e &a( 3ante$ to slo3 $o3n9 an$ p5s/e$ t/e 0(a:e. I. t/e o5tp5t o.
t/e ne5(al net3o(: 3as to $e&ele(ate9 t/an an< inp5t 3ei=/ts9 3/i&/ a >$e&ele(ate?
&omman$ 3as passe$ t/(o5=/9 s/o5l$ 0e in&(ease$. @/is 3o5l$ positi6el< (ein.o(&e t/e
a&&epta0le 0e/a6io( o. t/e net3o(:.
!"#"#"! $%& (&)*+ ,-)&
@/e $elta (5le is one o. t/e most &ommon lea(nin= (5les9 an$ is a 6a(iation o. Ae00Bs
C5le. It is also :no3n 0< se6e(al ot/e( names9 in&l5$in= t/e Di$(o3EAo.. Fea(nin= C5le
an$ t/e Feast Gean HI5a(e Fea(nin= C5le. It 3o(:s 0< t(ans.o(min= t/e e((o( at t/e
o5tp5t 0< t/e $e(i6ati6e o. t/e t(ans.e( .5n&tion. @/e (es5lt o. t/is t(ans.o(mation is 5se$
to a$J5st t/e inp5t 3ei=/ts asso&iate$ 3it/ t/e p(e6io5s la<e(s o5tp5ts. @/e t(ans.o(me$
e((o( (es5lt is p(opa=ate$ 0a&: t/(o5=/ all o. t/e la<e(s. Fee$E.o(3a($9 0a&:Ep(opa=ation
net3o(:s 5se t/is met/o$ o. lea(nin=.
!"#"#". $%& /0+12&3* (&45&3* ,-)&
@/e G(a$ient Mes&ent (5le is simila( to t/e Melta C5le in t/at t/e $e(i6ati6e o.
t(ans.e( .5n&tion mo$i.ies t/e o5tp5t e((o(. An a$$itional p(opo(tional &onstant (elate$ to
t/e lea(nin= (ate is a$$e$ to t/e mo$i.<in= .a&to( 0e.o(e t/e 3ei=/ts a(e a$J5ste$. @/is
met/o$ in its 0asi& .o( is :no3n to /a6e a slo3 (ate o. &on6e(=en&e. Osin= a 6a(<in=
lea(nin= (ate as 3as mentione$ 0e.o(e &an miti=ate t/is.
!"#"#"# 67%73&384 9&+0323: 9+;
@/is lea(nin= la3 is 5se$ .o( 5ns5pe(6ise$ net3o(:s. @e56o Po/onen 3as inspi(e$
0< lea(nin= in 0iolo=i&al s<stems9 an$ t/5s &ame 5p 3it/ t/e la3. Dit/ t/is la39 ne5(ons
&ompete .o( t/e oppo(t5nit< to lea(n. @/e ne5(on 3it/ t/e la(=est o5tp5t is t/e 3inne(9
14
and gets to update its weights and possibly some of its neighbors. 8sually the
neighborhoods start large, and shrink as training progresses.
!.# $%&A Implementations
Currently, most neural network research and testing is done using software
simulators. =oftware implementations allow for easy analysis of network behaviors.
Additionally, prediction-style problems in many cases do not require an embedded
hardware application. However, hardware neural networks can offer many advantages.
An optimiCed hardware implementation can yield better performance than a software
configuration running on a standard microprocessor. Additionally, a hardware
implementation can allow for applications that would be difficult to achieve in a software
set-up, such as with remote sensing applications. Designing for an FPGA also offers
many advantages, as discussed in Chapter H.
The ability of FPGAs to be re-programmed offers several specific advantages.
Jarious techniques in the category of “density enhancement” aim to increase the amount
of “effective circuit functionality per unit circuit area” [1N]. One way to accomplish this
is to separate out the various stages of the networkQs learning algorithm onto different
regions of the FPGA. Another way is to use optimiCed constant-coefficient multipliers to
handle the weighting calculations of the neuronQs inputs. These calculations will be fast.
In a training situation, these constant weights can be changed in under 6T !s [1N].
FGPAs also allow for the hardware implementation of artificial neural networks that
can dynamically adUust their topology. This allows for more complex learning algorithms
to adUust the topology, resulting in a more sophisticated final network.
! ! "#
$%&'(%&)!*+p-)+)./%/*0.1!&%*1)!.)(!*112)1!%.'!30.1*')&%/*0.1!/4%/!%&)!.0/!p&)1)./!*.!
105/(%&)!')1*6.17!8!9):!30.1*')&%/*0.!*.;0-;)1!/4)!.2+)&*3%-!&)p&)1)./%/*0.7!
<2.'%+)./%--:=!/4)&)!%&)!/(0!(%:1!/4%/!%!.2+>)&!3%.!>)!&)p&)1)./)'!*.!%.:!4%&'(%&)!
');*3)?!5*@)'Ap0*./=!%.'!5-0%/*.6Ap0*./7!81!'*13211)'!*.!B7C7"=!DEFGHD!21)1!5*@)'Ap0*./!
.0/%/*0.=!(4*34!*1!1*+p-:!%!13%-)'!*./)6)&7!<-0%/*.6Ap0*./!*1!/4)!+01/!30++0.!
&)p&)1)./%/*0.!21)'!*.!30+p2/*.6!4%&'(%&)!'2)!/0!/4)!(*')!&%.6)!05!;%-2)1!/4%/!3%.!>)!
&)p&)1)./)'7!H.!%!4%&'(%&)!1)//*.6=!)1p)3*%--:!%!1)//*.6!(4)&)!/4)!%&*/4+)/*3!2.*/1!12pp0&/!
%!+234!+0&)!30+p-)@!4%&'(%&)!%&34*/)3/2&)!I)767!%.!%&/*5*3*%-!.)2&%-!.)/(0&9J=!/4)!34*p!
%&)%!&)K2*&)+)./1!05!%!5-0%/*.6Ap0*./!2.*/!%&)!p&04*>*/*;)7!L*340-1!140()'!*.!M"#N!/4%/!*/!*1!
*+p&%3/*3%-!/0!*+p-)+)./!%!.)2&%-!.)/(0&9!*.!%.!<OP8!21*.6!5-0%/*.6Ap0*./!()*64/17!
<*@)'Ap0*./!2.*/1!%&)!+234!+0&)!%//&%3/*;)!*.!/4%/!/4):!%--0(!50&!+0&)!34*p!%&)%!/0!>)!
');0/)'!/0!/4)!%3/2%-!.)2&%-!.)/(0&97!Q4)&)!*1!%!-%&6)!>0':!05!0.60*.6!&)1)%&34!');0/)'!
/0!/4)!21)!5*@)'Ap0*./!()*64/1!*.!.)2&%-!.)/(0&91=!(4*34!)-*+*.%/)!/4)!.))'!50&!5-0%/*.6A
p0*./!4%&'(%&)7!
8-0.6!(*/4!34001*.6!%!.2+)&*3%-!&)p&)1)./%/*0.!30+)1!/4)!')3*1*0.!05!40(!+234!
p&)3*1*0.!/4)!()*64/1!1402-'!4%;)7!81!(*/4!%.:!1:1/)+=!6&)%/)&!p&)3*1*0.!&)12-/1!*.!
*.3&)%1)'!30+p2/%/*0.!/*+)=!6&)%/)&!34*p!%&)%!&)K2*&)+)./1!%.'!p0()&!30.12+p/*0.7!<0&!
%.:!p&0>-)+=!*.!40p)1!05!)-*+*.%/*.6!10+)!05!/4)!p&0>-)+1!%1103*%/)'!(*/4!%!6&)%/)&!
p&)3*1*0.=!/4)!10A3%--)'!R+*.*+2+!p&)3*1*0.S!+21/!>)!')/)&+*.)'7!<0&!10+)!
%pp-*3%/*0.1!M"TN=!*/!+%:!>)!%1!5)(!%1!"T!>*/17!!
G&%64*3*!M"CN!p)&50&+)'!%!+0&)!)@/).1*;)!1/2':!05!p&)3*1*0.=!(4)&)*.!/4)!&%.6)!%.'!
p&)3*1*0.!05!/4)!.)2&0.!()*64/1!()&)!/&)%/)'!%1!;%&*%>-)17!Q4)!%*+!(%1!/0!')&*;)!%!6).)&%-!
50&+2-%!/4%/!')1*6.)&1!302-'!21)!/0!5*.'!/4)!0p/*+%-!()*64/!50&+%/!50&!/4)*&!1:1/)+17!H/!
"#
$%& '()(*+,-(' ).( +,-,+/+ -/+0(* 12 0,)& )1 *(3*(&(-) 1-( 2,4('531,-) $(,6.) ,&
log $p"1# $!"
#$7 $.(*( ).( 8%9/(& 21* ).( $(,6.)& ,& ,- ).( *%-6( !!! !" # :
!"# $%&'($)
;-< -(/*%9 -()$1*= ,+39(+(-)%),1- -(('& %- %*,).+(),> /-,) )1 3(*21*+ ).( 8%*,1/&
>%9>/9%),1-& -(>(&&%*< ,- % -(/*%9 -()$1*=: ?(/*%9 -()$1*=& %*( 9%*6( %-' *(9< 1- ).(
3%*%99(9 >1+3/),-6 31$(* 12 ).(,* -(/*1-& 21* (22,>,(-><: @.( 9%*6( A/%-),)< 12 -(/*1-&
+(%-& ).%) -(/*%9 -()$1*=& .%8( &,6-,2,>%-) .%*'$%*( *(&1/*>( *(A/,*(+(-)&:
B1-&(A/(-)9<7 1-9< 8(*< &+%99 %*,).+(),> /-,)& >%- 0( /&(' ,- % .%*'$%*(
,+39(+(-)%),1-:
@.( BCDEFB %961*,).+ &%),&2,(& ).%) *(A/,*(+(-): BCDEFB ,& >%3%09( 12 >1+3/),-6
+%-< 12 2/->),1-& ).%) %*( /&(' ,- -(/*%9 -()$1*=&: @.( 2%>) ).%) ,) >%- &$,)>. 0()$((-
2/->),1- &()& &1 (%&,9< +(%-& ).%) ,) >%- 0( /&(' ,- % -()$1*= ).%) /),9,G(& ',22(*(-)
)*%-&2(* 2/->),1-& ,- ',22(*(-) 9%<(*& 12 ).( -()$1*=:
@.( +1&) >1++1- )*%-&2(* 2/->),1-& /&(' ,- -(/*%9 -()$1*=& %*( &,6+1,' 2/->),1-&
&/>. %& ).( .<3(*019,> )%-6(-): H.(- ,- ).( 3*13(* >1+3/)%),1- +1'(7 BCDEFB >%- 0(
/&(' )1 >1+3/)( ).,& 2/->),1- $,). ).( .(93 12 % 0,-%*< ',8,'(*7 /&,-6 ).( ,'(-),)<
tanh !( ) $
sinh !( )
cosh !( ): BCDEFB ,& %9&1 >%3%09( 12 3(*21*+,-6 ).,& ',8,&,1- $.(- ,- 9,-(%*
8(>)1*,-6 +1'(7 0/) ).( ',8,&,1- ,& &91$ %-' $1/9' '1/09( >1+3/)%),1- ),+(: BCDEFB
>%- %9&1 3(*21*+ +/9),39,>%),1- $.(- ,- ).( 9,-(%* *1)%),1- +1'(7 $.,>. >%- >1+3/)( ).(
$(,6.)& 21* ).( -(/*1- ,-3/)&7 ).1/6. +/9),39,>%),1- /-,)& 13),+,G(' 21* +/9),39,>%),1- 0<
% >1-&)%-) %*( 0())(* &/,)(' $.(- ).( $(,6.)& .%8( %9*(%'< 0((- '()(*+,-(':
1#
This thesis studies the implementation of the exponential function in an FPGA using
C;RDIC. Though not as common as hyperbolic tangent, there are applications where the
exponential function is used as a transfer function. Eu and Fatalama used the
exponential function as a transfer function in their implementation of a neural network
that acted as an associative memory. I1JK Similarly, Malgamuge used the exponential
function in the implementation of a defuzzication approximator. I19K.
1#
!"#$ter 3 FPGA.
Any designer faces a number of different onto which the system can be
implemented. Each platform comes with a set of benefits and tradeoffs. The nature of the
system in addition to its environment need to be taken into account when choosing a
platform.
3.0 1e.234 1ec2.264.
One approach is to reali?e the design as a software program and implement it using a
microprocessor of some kind. There are numerous architectures that can be chosen, each
with its own benefits. It can be easy to find one that is suited for the design. This
approach does not have any special manufacturing costs associated with itB the only cost
comes for the purchase of the processors. Cowever, this techniDue is best suited for
applications that are mostly seDuential in nature. For applications that are parallel in
nature, a custom hardware-based design can offer much in the way of performance
benefits.
Once the designer has elected to use a custom hardware implementation, the type of
implementation must then be chosen. At the lowest level, a fully custom integrated circuit
can be designed at the transistor level. This offers the designer the most amount of
fleGibility, however design costs can be higher due to the compleGity of the
implementation and fabrication of the design. An Application-Hpecific Integrated Circuit
(AHIC) can alternatively be employed to reali?e the design. Lather than designing at the
transistor level, the designer is presented with a library of logic gates with different
performance and power characteristics that can all be used in the design. Hince the design
! ! "#
$%&'())!*)!)*+$,(%-!./(!'&).!&0!*+$,(+(1.*12!./(!3()*21!'41!5(!,())!./41!4!06,,7!'6).&+!
)&,6.*&18!
9)!41!4,.(%14.*:(!.&!./(!).4.*'!/4%3;4%(!)&,6.*&1)!4%(!4!16+5(%!&0!$%&2%4++45,(!
,&2*'!3(:*'()!./4.!'41!5(!6)(38!<%&2%4++45,(!,&2*'!&00(%)!+6'/!&0!./(!5(1(0*.)!&0!6)*12!4!
+*'%&$%&'())&%8!=(:*'()!'&+(!4,%(437!045%*'4.(3-!(,*+*14.*12!./(!'&).!413!.*+(!
4))&'*4.(3!;*./!045%*'4.*12!41!9>?C!&%!&./(%!'6).&+!3()*218!9,)&-!)*1'(!./(!3(:*'()!4%(!
$%&2%4++45,(-!4!1(;!3()*21!'41!5(!*+$,(+(1.(3!&1!./(!3(:*'(!;*./&6.!/4:*12!.&!%(A
045%*'4.(!1(;!'/*$)8!B/*)!+4C()!./()(!3(:*'()!(D'(,,(1.!'/&*'()!0&%!3()*21!$%&.&.7$*12-!
413!0&%!4$$,*'4.*&1)!;/(%(!*.!*)!3()*%45,(!.&!6$34.(!.&!./(!3()*21!40.(%!*.!/4)!5((1!
3($,&7(38!
B/(%(!4%(!:4%*&6)!4%'/*.('.6%()!0&%!$%&2%4++45,(!3(:*'()-!./4.!4,,&;!0&%!3()*21)!&0!
:4%7*12!)*E()!.&!5(!*+$,(+(1.(38!9.!./(!)+4,,().!,(:(,!4%(!)*+$,(!$%&2%4++45,(!,&2*'!
3(:*'()!F<L=)H!./4.!4,,&;!./(!*+$,(+(1.4.*&1!&0!.;&A,(:(,!,&2*'!061'.*&1)8!C&+$,(D!
<L=)!FC<L=)H!;(%(!3(:(,&$(3!.&!4,,&;!0&%!./(!*+$,(+(1.4.*&1!&0!,4%2(%!,&2*'!061'.*&1)8!
B/()(!'&1)*).!&0!4!'&,,('.*&1!&0!<L=!5,&'C)!6)*12!4!$%&2%4++45,(!*1.(%'&11('.!.&!4,,&;!
3(($!061'.*&1)!.&!5(!*+$,(+(1.(38!
<L=)!4%(!).*,,!:(%7!,*+*.(3!*1!./(!16+5(%!&0!,&2*'!24.(!(I6*:4,(1.)!./4.!'41!5(!+43(8!
J4.(!4%%47)!&00(%!)*21*0*'41.,7!+&%(!/4%3;4%(!%()&6%'()!./41!<L=)-!413!&0.(1!'&+(!;*./!
&./(%!/4%3;4%(!5,&'C)-!)6'/!4)!K9L!&%!4!$%&'())&%!.&!'&+$,(+(1.!4!/4%3;4%(!3()*218!
J4.(!4%%47)!'41!5(!*+$,(+(1.(3!).4.*'4,,7!;*./!./(!3()*21!0%&E(1!&1!)*,*'&1!&%!*1!4!
$%&2%4++45,(!$4'C42(8!M*(,3!$%&2%4++45,(!24.(!4%%47)!FM<J9)H!4,,&;!0&%!
$%&2%4++*12!40.(%!./(!'/*$!/4)!5((1!045%*'4.(38!
"#
3.# FPGA Structure
!"#
!"#
!"# !"#
BlockRAM
)elay-Locked Loop
)elay-Locked Loop
)elay-Locked Loop
)elay-Locked Loop
234 Logic
234 Pins
BlockRAM
BlockRAM
BlockRAM
Mul
tiplie
rM
ultip
lier
Mul
tiplie
rM
ultip
lier
!i#$r& 3.*+,nt&rnal 1tr$2t$r& 34 a Xilinx Spartan-3 !:;< =>?.
$%e standard -P/A %as t%ree 1ain co1ponents6 con7igura:le logic :locks= I/O
:locks= and a progra11a:le interconnectA Bac% co1ponent plays a role in alloDing 7or
t%e Dide Eariety o7 %ardDare designs t%at can :e i1ple1ented in -P/AsA -igure FAG
s%oDs a :lock-leEel diagra1 o7 t%e Earious co1ponents t%at 1ake up an -P/AA
$%e con7igura:le logic :locks IJLBsM are D%ere t%e actual logic 7unctions speci7ied
:y t%e designer are i1ple1entedA In 1ost -P/A designs= t%e con7igura:le logic :locks
"#
$%& '$(& )* +, $ -)'.&% +, /++0)* 1$./&2 3456278 4562 9$- .& :'*/&'&-1&( $2 $
')/1:*/&;&% $-( 9$- :'*/&'&-1 $-< n=>$%:$./& ?++/&$- ,)-91:+-8 6@& n :-*)12 1+ 1@&
,)-91:+- 9$- .& )2&( $2 1@& 9+-1%+/ /:-&2 +- 1@& ')/1:*/&;&%A $-( 1@& "n +)1*)12 ,+% 1@&
,)-91:+- $%& ($1$ :-*)12 1+ 1@& ')/1:*/&;&%8 6@&2& :-*)12 $%& B&-&%$//< :'*/&'&-1&( $2
+-&=.:1 CDEF 9&//2 $-( $%& G@$1 '$0& 1@& /+B:9 ./+902 *%+B%$''$./&8 H$9@ 456
(&29%:.&2 1@& 9+'*/&1& 1%)1@ 1$./& ,+% 1@& ,)-91:+-A %$1@&% 1@$- 1@& %&/$1:+-2@:* .&1G&&-
1@& :-*)12 $-( +)1*)12 :- -+%'$/ ?++/&$- 9:%9):128
E 9+-,:B)%$./& /+B:9 ./+90 G:// 9+-1$:- ')/1:*/& 4562 G:1@ ')/1:*/&;&%2 2&/&91:-B
.&1G&&- 1@& +)1*)12 +, 1@& :--&% 45628 6@& +)1*)1 +, 1@& ,:-$/ ')/1:*/&;&% .&9+'&2 1@&
+)1*)1 +, 1@& /+B:9 ./+908 6@:2 $//+G2 &$9@ 9+-,:B)%$./& /+B:9 ./+90 1+ :'*/&'&-1 $
,)-91:+- +, '+%& >$%:$./&2 1@$- :2 $//+G&( .< 1@& :-(:>:()$/ 4562 1@&'2&/>&28
IJKE2 9+-1$:- $ >$%:&1< +, :-1&%9+--&91:+-28 52)$//<A 1@&%& :2 $ ,$21 :-1&%9+--&91 1@$1
9+--&912 $(L$9&-1 M4?28 6@:2 $//+G2 ,)-91:+-2 1@$1 %&N):%& ')/1:*/& M4?2 1+ *&%,+%'
'+%& &,,:9:&-1/< .< */$9:-B 1@& 2).,)-91:+-2 :- $(L$9&-1 M4?28 I+% 9@:*=G:(& %+)1:-BA
IJKE2 :-9/)(& $ *%+B%$''$./& :-1&%9+--&91 1@$1 :2 )2)$//< :'*/&'&-1&( $2 $ 2G:19@:-B
'$1%:; )2:-B 2:; *$22 1%$-2:21+%2 1@$1 $// *+22:./& 9+--&91:+-2 >&%1:9$//<A @+%:O+-1$//<A $-(
(:$B+-$//<8
6@& PQR 3:-*)1Q+)1*)17 ./+902 3PR?27 *%+>:(& 1@& :-1&%,$9& .&1G&&- 1@& @$%(G$%&
(&2:B- $-( 1@& +)12:(& G+%/(8 H$9@ PR? 9+-1$:-2 /+B:9 2*&9:,:9 1@& :-*)1 $-( +)1*)1 +,
2:B-$/28 52)$//<A $ 1@%&&=21$1& .),,&% :2 )2&( 1+ 9+-,:B)%& 1@& *:- 1@$1 1@& ./+90 :2
9+--&91&( 1+ $2 $- :-*)1 *:-A $-( +)1*)1 *:-A +% $ .:(:%&91:+-$/ *:-8 E((:1:+-$//<A 1@&
./+90 9+-1$:-2 /+B:9 1@$1 2*&9:,:&2 1@& 9)%%&-1 21$1& +, 1@& ./+90 3:8&8 G@&1@&% :1 :2 $- :-*)1
+% $- +)1*)1 *:-78 6@& :-*)1 /+B:9 :2 %&2*+-2:./& ,+% .),,&%:-B 1@& :-*)1 2:B-$/ $-(
22
allo&ing t+e -ignal to be properly capt3red -o it i- pre-entable to t+e re-t o5 t+e FG89:
T+e o3tp3t logic allo&- 5or optional b355ering in a 5lip-5lop= or t+e -ignal can be directly
-ent to t+e pin:
Be5ore any -ignal lea?e- or enter- t+e c+ip= it pa--e- t+ro3g+ an electronic inter5ace
co@ponent t+at deter@ine- t+e electrical c+aracteri-tic- o5 t+e -ignal and control t+e
-a@pling o5 any inp3t ?al3e:
Be-ide- t+e-e t+ree ba-ic co@ponent-= @any ne&er F8G9- pro?ide additional
5acilitie- to @ake t+e de-ign proce-- -i@pler: 9n early addition &a- large array- o5 block
SR9D t+at can be 3-ed 5or data -torage: C+aining toget+er CLB- 5or -i@ple data -torage
can be c3@ber-o@e and area ine55icient= -o 3-ing t+e block SR9D can lea?e @ore c+ip
area 5or t+e +ard&are de-ign: Fa-t carry logic i- o5ten a?ailable &+ic+ allo&- 5a-t adder
de-ign- -3c+ a- carry looka+ead adder- to be i@ple@ented: So@e de-igner- are taking
t+at idea a -tep 5or&ard and i@ple@enting -tatic @3ltiplier- or ot+er 3nit- dedicated to
per5or@ing -i@ple arit+@etic operation-: Ne&er de-ign- can incl3de proce--or- -3c+ a-
t+e 8o&er8C= allo&ing -o5t&are and +ard&are de-ign- to eHi-t on t+e -a@e c+ip:
3.3! #ilin' Spartan-3
T+e XilinH Spartan-3 i- t+e plat5or@ 3-ed 5or t+e de-ign- di-c3--ed in t+e 5ollo&ing
c+apter-: T+e Spartan-3 3-e- SR9D cell- to -tore t+e progra@@ing in5or@ation: Since
t+e SR9D cell- lo-e t+eir -tate &+en t+e po&er -3pply i- t3rned o55= t+ey @3-t be
reprogra@@ed eac+ ti@e t+e c+ip i- po&ered on: T+i- reK3ire- t+e progra@ to be -tored
in a non?olatile 5or@at t+at can be tran-lated onto t+e F8G9:
23
C"##$%"&'%C(&)#(*%L(,-.
/ 0
CL1
CE
S
4
C"##$%"&'%C(&)#(*%L(,-.
L((56U8%T":*;<LUT=
>
I4
IA
IB
IC
/ 0
CL1
CE
S
4
DED
C>UT
F4
FA
FB
FC
L((56U8%T":*;<LUT=
>
I4
IA
IB
IC
G4
GA
GB
GC
CL1CE
EHCII
GII
ED
S4
D0
HEH
H0
Figure 3.*—Block diagram of a 5lice in a Spartan-3.
The Spartan-3 CLBs are broken up into two slices. Figure 3.2 shows what a slice
looks like. Each slice contains two two-variable L?Ts and a multiplexer that selects
between the two. The output of each slice can be any function of five variables, or for
some restricted functions, as many as nine variables can be used. An additional
24
$%ltiple+er i- %-ed to -elect bet2een t4e o%tp%t- o5 t4e t2o -lice-6 T4i- enable- eac4 CLB
to operate on any 5%nction o5 -i+ =ariable-> or a re-tricted cla-- o5 5%nction- o5 %p to 1@
=ariable-6 Eac4 -lice al-o contain- carry and control logic t4at 5acilitate t4e
i$ple$entation o5 arit4$etic 5%nction-6 Fig%re 363 -4o2- 4o2 t4e $%ltiple+er- are %-ed
to -elect bet2een t4e t2o inner -lice-6 El-o> t4e CLB contain- t2o -torage ele$ent- t4at
can be con5ig%red eit4er a- latc4e- or 5lip-5lop-6
!"T
!"T
$
$
!"T
!"T
$
$
%!IC(%
!igure (.(*Block diagram o3 a Comple6 7ogic Block 8C7B9 in a ;partan=(.
T4e Spartan-3 contain- a n%$ber o5 block REJ- de-igned to e55iciently -tore
relati=ely large a$o%nt- o5 data6 Eac4 block contain- 1KK o5 data -torage> and anot4er 2K
5or optional parity c4ecking6 Eac4 block %-e- SREJ cell- 5or -torage6 Eac4 block can
"#
$%&' % &%()'*+ ,- ,(.%/)0%*),/1 -,( %/+ 2,34)/%*),/ ,- 5%*% 6,(5 1)0' %/5 7%()*+ ,7*),/8
9%()*+ )1 5,/' %* *$' 4+*' :'&':; 1, 5%*% 7%*$1 <1)/. = 4+*'1 6):: $%&' = 7%()*+ 4)*18
>5?%2'/* *, '%2$ 4:,2@ A>B )1 %/ CD!CD 3<:*)7:)'(8 E$' 3<:*)7:)'(1 2%/ 4' 2%12%5'5
*, 1<77,(* ,7'(%/51 :%(.'( *$%/ CD 4)*1 6)5' ,( *, ,7'(%*' ,/ 3,(' *$%/ F ,7'(%/518 G$'/
1+/*$'1)0)/. % 3<:*)7:)2%*),/ ,7'(%*),/; *$' 1+/*$'1)0'( 2%/ <1' ')*$'( %/ %1+/2$(,/,<1
3<:*)7:)'( ,( % 1+/2$(,/,<1 3<:*)7:)'( *$%* )/2:<5'1 % ('.)1*'(8 H%2$ 3<:*)7:)'( ,<*7<*1 %
FIJ4)* 6)5' 7(,5<2*8
E$' K7%(*%/JF $%1 -,<( 5)--'('/* :'&':1 ,- )/*'(2,//'2*),/8 E$' -)(1* *+7' ,-
)/*'(2,//'2* )1 *$' L)('2* M)/' *$%* 2,//'2*1 %5?%2'/* NMO18 E$)1 )/2:<5'1 2,//'2*),/1
*$%* ., )/1)5' *$' NMO1 *, 2,//'2* *$' MPE1 %1 6':: %1 -%1* 2,//'2*),/1 4'*6''/ *$'
NMO1 *, *$' )33'5)%*' :'-* %/5 ().$*8 E$'1' :)/'1 %(' 3,1* -('Q<'/*:+ <1'5 *, % :,/.'(
)/*'(2,//'2*; 6$)2$ )/ *<(/ 2,//'2*1 *, % 5)--'('/* 5)('2* :)/' *, *$' 5'1*)/%*),/ NMO8
M,/. :)/'1 2,//'2* *, '&'(+ 1)R*$ NMO %/5 $%&' :,6 2%7%2)*%/2' *, -%2):)*%*' *$' *(%/1-'(
,- $).$J-('Q<'/2+ 1)./%:18 O'*6''/ *$' L)('2* M)/'1 %/5 *$' M,/. M)/'1 )/ *'(31 ,-
2%7%4):)*+ %(' *$' S'R M)/'1 %/5 L,<4:' M)/'18 S'R M)/'1 2,//'2* *, '&'(+ *$)(5 NMO
%/5 L,<4:' M)/'1 2,//'2* *, %:*'(/%*)/. NMO18 O,*$ *+7'1 ,- )/*'(2,//'2* ,--'( %
2,37(,3)1' 4'*6''/ 2,//'2*)&)*+ %/5 $).$J-('Q<'/2+ 2$%(%2*'()1*)218
2#
!h#$ter 4 The !+,-.! Algorithm
The COrdinate Rotation DIgital Computer (CORDIC) algorithm is a technique that
can be used to compute the value of trigonometric functions. CORDIC differs from other
methods of computation because it can be easily implemented using only addition,
subtraction and bit shift operations. It is not as fast as table-based methods, but it can use
significantly less chip area, making it desirable for application where area is more
important than performance.
4.6 7#89gro:;<
The CORDIC algorithm was first proposed by Volder in his paper GThe CORDIC
Trigonometric Computing Technique,H published in 1JKJ L2M. It was originally intended
to be used for real-time airborne computation, but has since found other applications. In
1J71, Walther demonstrated that the algorithmPs domain could also be expanded beyond
computing trigonometric equations to also include hyperbolic and linear (multiply-add-
fused, etc.) functions.
Walther presented a possible hardware implementation using three !-bit registers,
three !-bit adderRsubtractors, three shifters, and a small look-up table. Volder presented a
serial design containing three !-bit"registers, three 1-bit adderRsubtractors, and a number
of shift gates that specified which bits from the registers get fed into the
adderRsubtractors. This thesis will explore both implementations and analySe the
capability of this algorithm to be used in an artificial neural network.
"#
4"# $%&'()*+,-.%/'
The algorithm operate0 in one o2 t3o mode05 Rotation or 7ectoring. The t3o mode0
determine 3hich 0et o2 2unction0 can ;e computed u0ing the algorithm. In Rotation mode=
the !" and #" component0 o2 the 0tarting >ector are input= a0 3ell a0 an angle o2 rotation.
The hard3are then iterati>el? compute0 the !@ and #@component0 o2 the >ector a2ter it ha0
;een rotated ;? the 0peci2ied angle o2 rotation.
In 7ectoring mode= the t3o component0 are input= and the magnitude and angle o2
the original >ector are computed. Thi0 i0 accompli0hed ;? rotating the input >ector until it
i0 aligned 3ith the !"aAi0. B? recording the angle o2 rotation to achie>e thi0 alignment=
3e kno3 the angle o2 the original >ector. Once the algorithm i0 complete= the !@
component o2 the >ector i0 eEual to the magnitude o2 the 0tarting >ector.
The CORDIC algorithm accompli0he0 it0 computation0 ;? u0ing onl? add0=
0u;tract0= and 0hi2t0. Thi0 i0 done ;? iterati>el? rotating the input >ector and 0lo3l?
con>erging on the 2inal rotated >ector. Thi0 i0 done in a 0erie0 o2 3ell@de2ined 0tep0. The
2ir0t rotation 0tep 0ee0 the >ector rotated !
! radian05
H4.1K
! !
!!
= ±x0
= #0#$% !
0± "
&( )
x!
=!!0
= #0'(# !
0± "
&( )
Lhere !0
and y0 repre0ent the input >ector aligned at the origin 3ith magnitude !
!
and angle !!5
H4."K
!
!i = #i cos!i
$i = #i sin!i
In each 0u;0eEuent 0tep= a ne3 angle o2 rotation !i i0 determined 0uch that5
! ! "#
$%&'(! !!!" !"#!$ %!!# $&!'()*)+!% , !
)*i,!-.,/-i0/i12!i,!0-30i45!i2!45516i27!/*.!-1/4/i12!0450354/i12,!8.-91-:.;!i2!.40*!,/.8!
/1!<.!4001:85i,*.;!3,i27!125=!42!4;;!$1-!,3</-40/(!42;!4!,*i9/&!
!!
!!
"!+1"!
#!
#!+1
$!
!!i#$%&'()*+,-&.'i'i/'-0&'123451'67#8%i-09)!
>i73-.!%&?!,*16,!6*4/!.40*!,/.8!i2!/*.!@ABCD@!4571-i/*:!511E,!5iE.&!F/!.40*!,/.8G!
4!;.0i,i12!i,!:4;.!6*./*.-!/1!-1/4/.!/*.!H.0/1-!<=!!!!!1-!!"
i&!)*.!13/01:.!19!<1/*!19!
/*.,.!;.0i,i12,!i,!,*162!i2!/*.!9i73-.&!)*.!.I8-.,,i12!91-!/*.!-1/4/.;!H.0/1-!i2!/*.!$iJ?(/*!
,/.8!i,K!
$%&%(!
!
xi"! # !" "!"i #$% !i $"i% &
yi"! # !" "!"i %&' !i $"i% &!
L*.2!4885=i27!/*.!-.,/-i0/i12!i2!$%&'(G!/*.!,*i9/i27!42;!4;;i27!<.01:.!.Hi;.2/K!
! ! 2#
$%&'(!
!i+! =!
Ki
!i !$i"!i%i( )
%i+! =!
Ki
%i + $i"!i!i( )
#$%&%'Ki = !+ "!"i (!$i = ±!
!
!"!is!+,-!./0ni+23-!-44o4!+-4.6!/n3!!
i!co44-s8on3s!+o!+,-!4o+/+ion!3i4-c+ion!$91!
co44-s8on3s!+o!/!4o+/+ion!/;/<!=4o.!+,-!!>/?is(&!@,-! !! !!+-4.s!in!+,-!+o8!/n3!Ao++o.!
-B2/+ions!co44-s8on3!+o!/!C-=+!s,i=+!o=! !" !/n3! !i 6!4-s8-c+iD-C<!$;,-n!o8-4/+in0!in!A/s-!2(&!
@,is!s,i=+-3!D/C2-!is!+,-n!/33-3!$o4!s2A+4/c+-3(!+o!+,-!c244-n+!D/C2-!o=!+,-!co.8on-n+&!
EoC3-4!4-=-44-3!+,is!o8-4/+ion!"#$%%&'((i*i$+&!F+!is!+,is!c4oss>/33i+ion!+,/+!-n/AC-s!+,-!
/C0o4i+,.!+o!A-!2s-3!-==-c+iD-C<!in!3i0i+/C!,/43;/4-&!
Gs!iCC2s+4/+-3!in!Hi024-!%&16!/CC!4o+/+ion!s+-8s!-==-c+!/n!inc4-/s-!in!+,-!./0ni+23-!o=!
+,-!in82+!D-c+o4!A<!/!=/c+o4!o=! !!"!"! !;i+,!-/c,!4o+/+ion&!@,is!-44o4!is!in+4o32c-3!/s!/!
cons-B2-nc-!o=!+,-!/C0o4i+,.Is!3-4iD/+ion!=4o.!+,-!JiD-ns!+4/ns=o4.6!;,ic,!4o+/+-s!/!
D-c+o4!A<!/!s8-ci=i-3!/n0C-K!
$%&L(!
!
!! = ! cos!" "sin!
!" = "cos!+ ! sin!!
@,-s-!+-4.s!c/n!A-!4-/44/n0-3!2sin0!+,-!A/sic!+4i0ono.-+4ic!i3-n+i+<!
tan!=sin!
cos!K!
$%&M(!
!!
!x " !"#! x" y $%&!# $
!y " !"#! y% x $%&!# $!
Nsin0!+,-!s/.-!4-s+4ic+ion!in!$%&O(6!;-!0-+!+,-!s/.-!4-C/+ions,i8s!/s!in!$%&'(&!@,-!
-44o4!+-4.!is!=4o.!+,-!84-s-nc-!o=!+,-!cosin-!+-4.!in!$%&M(6!;,ic,!is!in3-8-n3-n+!o=!+,-!
4o+/+ion!3i4-c+ion6!sinc-!cosin-!is!s<..-+4ic!/Ao2+!+,-!4o+/+ion!3i4-c+ion6!sinc-!cosin-!is!
! ! "#
$y&&'()*+!,-./(!(0'!!1,x*$3!4.)'.5')6!(0*$!')).)!,++/&/7,('$!8*(0!',+0!$('96!$.!*:!(0'!
;/&-')!.:!*('),(*.;$!*$!$'(6!(0'!(.(,7!')).)!*;!(0'!,7<.)*(0&!*$!*;d'9';d';(!.:!(0'!*;9/(!
,;<7'3!
>:!(0'!:*)$(!).(,(*.;!*$!<*5';!-y!
?@3AB!
!!
" !# "!" !( )"#$%& !
## #
!"!
( )
$!
" !# "!" !( )"#&'( !
## #
!"!
( )!
C0';!(0'!$'+.;d!).(,(*.;!*$!<*5';!-y!
?@3DB!
!
!!
= "+ !!! "( )"+ !!! !( )
"#$%& !+ #
#"#
+ #"""
( )
$!
= "+ !!! "( )"+ !!! !( )
"#&'( !+ #
#"#
+ #"""
( )!
C0'!n1(0!).(,(*.;!+,;!-'!'x(';d'd!,;d!,!<';'),7!d':*;*(*.;!d')*5'dE!
?@3F#B
!
!"+! = !+ "!" !( )!+ "!" "( )
! !+ "!"""
#$
%
&' ## $%& !+ $
#"#
+ $!"!+!+ $"""( )
%"+! = !+ "!" !( )!+ "!" "( )
! !+ "!"""
#$
%
&' ## &'( !+ $
#"#
+ $!"!+!+ $"""( )
!
C0'!(.(,7!*;+)',$'!*;!&,<;*(/d'!+,;!-'!$9'+*:*'d!,$!
!"
" 1# "!"#
"
" 3!C0*$!*;+)',$'!
&/$(!-'!,++./;('d!:.)!80';!9'):.)&*;<!+,7+/7,(*.;$!/$*;<!(0*$!,7<.)*(0&3!
!"# $%%&'&()*+, ./012*/,2
C0'!GHRJ>G!d'$*<;!+,77$!:.)!(0)''!,++/&/7,(.)!)'<*$(')$E!(0'!K!)'<*$(')6!(0'!L!
)'<*$(')6!,;d!(0'!,;<7'!,++/&/7,(.)!?MB3!C0'!K!,;d!L!)'<*$(')$!0.7d!(0'!9)'$';(!#1!,;d!!1
+.&9.;';($!.:!(0'!5'+(.)!,$!*(!*$!-'*;<!).(,('d3!C0'!,;<7'!,++/&/7,(.)!0.7d$!(0'!(.(,7!
).(,(*.;!,&./;(!+.&97'('d!,(!(0'!+/))';(!*('),(*.;3!
! ! "#
$%&!'n)*&!'++,-,*'./0!1./0&1!.%&!'0),-&n.1!./!.%&!12n&!'nd!+/12n&!.&0-1!2n!
45,'.2/n!678#9:;!! !"
" #!!!
# #"!"#!# #
"!"8!</=&>&0?!12n+&!.%21!.&0-!21!&5,2>'*&n.!./!
.%&!d&120&d!0/.'.2/n!'n)*&@+/n1.0'2n&d!./!! !!= !"n
!$%!!( )@.%&!Z!0&)21.&0!-,1.!'*='B1!
+/n.'2n!.%&!&CD0&112/n!!
678##:!
!!!" "
#"#$
!%&!!# $
#"%
!
" !
$%&!'0+.'n)&n.!.&0-1!+'n!E&!1./0&d!2n!'!1-'**!*//F,D!.'E*&8!G%&n!.%21!21!d/n&?!/n*B!
'n!'dd2.2/n!/0!'!1,E.0'+.2/n!21!0&5,20&d!./!+/-D,.&!.%&!n&C.!>'*,&!2n!.%&!Z!0&)21.&0?!12n+&!
! ! "1 8!$%21!.'E*&!21!>&0B!1-'**?!0&5,202n)!/n*B!/n&!0/=!H/0!&'+%!2.&0'.2/n!/H!.%&!
'*)/02.%-8!I2n+&!.%&!2.&0'.2/n!+/,n.!+'n!E&!H2C&d!2n!%'0d='0&?!.%&!12J&!/H!.%&!.'E*&!21!
+/n1.'n.8!
$%&!'++,-,*'./0!0&)21.&01!'0&!'*1/!,1&d!./!d&.&0-2n&!.%&!d20&+.2/n!/H!0/.'.2/n8!Kn!
L/.'.2/n!-/d&?!.%&!12)n!/H!.%&!Z!0&)21.&0!d&.&0-2n&1!.%&!d20&+.2/n8!Kn!M&+./02n)!-/d&?!.%&!
N!0&)21.&0!d&.&0-2n&1!.%&!d20&+.2/n8!$%&1&!-/d&1!'0&!H,0.%&0!&*'E/0'.&d!2n!.%&!n&C.!
1&+.2/n8!
!"! #$%&'()(*$+ -$./0
$%&!OPLQKO!'*)/02.%-!/D&0'.&1!2n!/n&!/H!.=/!-/d&1;!L/.'.2/n!'nd!M&+./02n)8!$%&!
-/d&!/H!/D&0'.2/n!d&.&0-2n&1!=%2+%!1&.!/H!H,n+.2/n1!+'n!E&!+/-D,.&d?!'nd!%/=!.%&!
>'*,&1!2n!.%&!R?!N?!'nd!Z!0&)21.&01!+%'n)&!&'+%!2.&0'.2/n8!
! ! "#
!"!"#!$%&'&i%)*+%,-*
!"!#!$%
!&'&0
!&'&1
!&'&2
!&'&3
!&'&4
!
"! " # $
! "#$$$$ $#$$$$ %$#$$$$&! "#!!!! "#!!!! $"%#!!!!&" "#%!!! !#%!!! ""#%'%"&( "#)*%! !#+*%! $(#,*"(&) "#,+,, !#*!)" ,#'%)+&, "#,,!, !#*-%- "#!**%&% "#,"%' !#+,!- $!#*"(,&' "#,(+* !#+"++ !#"+(+&* "#,(() !#+)!! $!#(',-&+ "#,(%% !#+(,, $!#!,""&- "#,(*( !#+("' !#!*!-&"$ "#'()' $#*(%$ $#$"'+&
!!i#$%& ()*+,-& ./012. 03454i36 738&)
$%!&'()(*'%!+',-.!(/-!*%01(!2-3('4!*5!4'()(-,!'2-4!)!50-3*6*-,!)%78-9!:/-!*%01(!2-3('4!
*5!50-3*6*-,!)5!(/-!*%*(*)8!2)81-!(/-!;!)%,!<!4-7*5(-459!:/-!4'()(*'%!)+'1%(!*5!*%01(!*%('!(/-!
=!4-7*5(-49!$%!'4,-4!('!4'()(-!(/-!*%01(!2-3('4!'2-4!(/-!*%01(!)%78-.!(/-!7')8!'6!-)3/!
*(-4)(*'%!5/'18,!>-!('!4-,13-!(/-!2)81-!*%!(/-!)%78-!)331+18)('4!('!?9!@*%3-!(/-!)43()%7-%(!
2)81-5!6'4!-)3/!*(-4)(*'%!)4-!6*x-,.!(/-!'%8y!C)y!(/)(!(/-!2)81-!*%!(/-!=!4-7*5(-4!3)%!>-!
3'%(4'88-,!*5!(/4'17/!(/-!!!2)81-59!$%!&'()(*'%!D',-.!(/-!,-3*5*'%!2)81-5!)4-!,-6*%-,!('!
>-E!
FG9H#I!
!i=
+!"#i%#&i! '
"!" #i%#&i$ '
#$%%
&%%!
J*(/!(/*5!,-6*%*(*'%!'6!(/-!,-3*5*'%!61%3(*'%.!)6(-4!"!*(-4)(*'%5!'6!(/-!)87'4*(/+5.!*(!*5!
K%'C%!C/)(!(/-!2)81-5!*%!-)3/!'6!(/-!4-7*5(-45!C*88!>-E!
FG9H"I!
!n
" "n!"#$% &
"( )!'
"%(n &
"( )"
#$%
'n
" "n#"#$% &
"( )% $
"%(n &
"( )"
#$%
&n
" "
"n
" *% +!+n
n
&
!
""
#$%&'( )*+ ,-./, /-01 -022(3, 1. 1-( $32&1 4(51.' 061(' (05- $1('01$.3 $3 1-(
07%.'$1-8* 93 1-$, (:0827(; 1-( 4(51.' $, $3$1$077< 07$%3(= /$1- 1-( !>0:$,* ?-( 80%3$1&=(
.6 1-( 4(51.' $35'(0,(, /$1- (05- ,1(2; 0, 1-( 03%7( .6 1-( 4(51.' 5.34('%(, .3 "@A* ?-(
407&(, .6 1-( B; C; 03= D '(%$,1(', 0'( 07,. ,-./3*
93 1-( 6.77./$3% ,(51$.3,; =(,5'$21$.3, .6 -./ 40'$.&, 6&351$.3, 503 E( 5.82&1(=
&,$3% F.101$.3 8.=( /$77 E( =$,5&,,(=*
!"!"#"# $%n'()*+,%n')-n.)/+0-12*-13',%-n)41-n,5+16-3%+n)
?-( 5.82&101$.3 .6 1-( ,$3( 03= 5.,$3( 6&351$.3, $, $31'$3,$5 1. 1-( F.101$.3 8.=(;
03= 503 (0,$7< E( =('$4(= 6'.8 ()*H")* J77 1-01 $, 3(5(,,0'< $, 1. $3$1$07$K( 1-( C '(%$,1('
/$1- @; 03= 1-( B '(%$,1(' /$1- 1-( =(,$'(= ,507$3% 6051.'* 96 1-('( /('( 3. %0$3 $3 1-(
80%3$1&=( .6 1-( $32&1 4(51.'; 1-(3 1-( B '(%$,1(' 5.&7= E( $3$1$07$K(= 1. H; 03= 407&(, .6
,$3( 03= 5.,$3( 5.&7= E( '(0= =$'(517< .&1 .6 1-( C 03= B '(%$,1(',; '(,2(51$4(7<*
L./(4('; 1-( %0$3 8(03, 1-01 ,.8( ,507$3% 8&,1 E( 2('6.'8(= E(6.'( 5.82&101$.3* J61('
n $1('01$.3, .6 1-( 07%.'$1-8; 1-( 5.31(31, .6 1-( '(%$,1(', /$77 E(M
()*H))
!
!n" "
nX"#$% $
"# $
&n" "
nY"%'n $
"# $
)n! "
?-('(6.'(; $3 .'=(' 1. 5.82&1( 1-( ,$3( .' 5.,$3( .6 03 03%7( !; 1-(3 1-( D '(%$,1('
/$77 E( $3$1$07$K(= /$1- !; C /$1- @; 03= 1-( B '(%$,1(' /$1- !"; 1. 055.&31 6.' 1-( %0$3*
#$%&'( )*+ ,-./, 1-( NOFP9N 2'.5(,, 1. 5.82&1( ,$3( 03= 5.,$3(* ?-( $3$1$07 4(51.' $,
07$%3(=; 0, '(Q&$'(=* ?-( B '(%$,1(' 5.''(,2.3=, 1. 1-( ,507(= 5.,$3( 407&(; 03= 1-( C
'(%$,1(' 5.''(,2.3=, 1. 1-( ,507(= ,$3( 407&(* R( 503 =(1('8$3( 1-( &3,507(= 407&( E<
! ! 34
$i&i$in(!)*+!,in-.!&-./+s!12!!!"! !#$%$& .!456!+7-89.+:!-,)+6!11!i)+6-)i5ns:!)*+!-.(56i)*8!
2i+.$s!
! !"# $%!" # $&'%
!'%
$%()*$%
'(+,+)$ %(,--) !
<*+!8+)*5$!)5!=589/)+!sin+!-n$!=5sin+!is!)*+!s-8+!,56!>5.-6?)5?@-6)+si-n!=556$in-)+!
)6-ns,568-)i5n.!Ain=+!)*+!)6-ns,568-)i5n!is!$+,in+$!)5!1+!
B4.1CD!! ! " !"#!
# ! " #$n!!
E..!)*-)!is!n+=+ss-62!)5!9+6,568!)*+!)6-ns,568-)i5n!is!)5!5n=+!-(-in!.5-$!!!in)5!!!-n$!
.5-$!X!Gi)*! !"#:!-n$!H!Gi)*!0.!
4.4.2!$%&'()*+,-.(/%-
!
i " # $
! 1.0000 1.73'1 0.0000(! "#$%"& !#$%"& '(#!!!!)& %#!*+& ,!#-%'! $&#(-(&)" %#"(-- !#&'!( ($#("++)% %#"$'& ,!#"--( -'#-(%+)' %#"*!+ ,!#!-&* -&#!$$()( %#"*"$ !#!'!* (*#"+$-)- %#"*%' ,!#!&!( -!#&+"+)$ %#"*%( !#!&(" (*#$%(&)+ %#"*%( !#!!"' (*#*(+*)* %#"*%( ,!#!!'& -!#!$!*)10 3.')35 +0.000) 60.014)(
i"i#i$%
i&'&(
i&'&)
i&'&*
i&'&+i&'&,
"
!!"#$%&'()*+,-./0,'1&234%"5#'647&'
Jn!K+=)56in(!85$+:!)*+!+L/-)i5ns!/s+$!)5!/9$-)+!)*+!6+(is)+6s!6+8-in!)*+!s-8+:!1/)!
)*+!,/n=)i5n!)*-)!$+)+68in+s!)*+!65)-)i5n!$i6+=)i5n!=*-n(+s.!!Jn!&+=)56in(!85$+:!)*+!
"#
$%&'(i*+, *(i-. *' $%i&/ *+- i/01* 2-3*'( 4i*+ *+- ! $5i., 4+i3+ ,-$/. *+$* *+- 2$%1- i/ *+-
Y (-&i.*-( .+'1%8 3'/2-(&- '/ 9-(': ;+- 8-3i.i'/ f1/3*i'/ i/ ('*$*i/& ,'8- i.=
>?:1AB !"!
"!"#i%#&"# '
!!"#i%#&"" '
#$%
Ci&1(- ?:" .+'4. +'4 *+- i/01* 2-3*'( 3'/2-(&-. '/ *+- !D$5i. 4i*+ -$3+ i*-($*i'/ 'f
*+- $%&'(i*+,: E. i/ *+- ('*$*i'/ ,'8-, *+- ,$&/i*18- 'f *+- 2-3*'( i/3(-$.-. 4i*+ -2-(F
i*-($*i'/ 'f *+- $%&'(i*+,: ;+- .i&/ 'f Y i. 1.-8 *' 8-*-(,i/- 4+i3+ 8i(-3*i'/ *' ('*$*-,
4i*+ *+- &'$% 'f G(i/&i/& *+- 2$%1- *' 0: I/ *+i. -5$,0%-, *+- $/&%- 'f *+- i/01* 2-3*'(
4i*+ (-.0-3* *' *+- !D$5i. i. 3',01*-8 $/8 f'1/8 i/ *+- J (-&i.*-(: Ki/3- *+- $/&%- 8'-.
/'* .3$%-, *+- (-.1%* i. *+- 3'((-3* $/&%-: L.i/& *+- /-4 8-3i.i'/ f1/3*i'/, $f*-( n
i*-($*i'/. 'f *+- $%&'(i*+,, *+- (-&i.*-(. 4i%% 3'/*$i/=
>?:1MB
!!
" "!!"
##$
"
#
$!! "
%!
" #"
# &'(") $"
!"
#
$%%%%
&
'
((((
"!
" )# #"#$
!
)
;+- f'%%'4i/& .-3*i'/. 8i.31.. 4+i3+ f1/3*i'/. 3$/ G- 3',01*-8 i/ 2-3*'(i/& ,'8-:
!"!"#"$ %&'()*+,*(-
E. .--/ i/ -N1$*i'/ >?:1MB, *+- $(3*$/&-/* f1/3*i'/ i. i/*(i/.i3$%%F 3',01*-8 i/ *+- J
(-&i.*-( 4+-/ i/ O-3*'(i/& ,'8-: I/ '(8-( *' 3',01*- *+- $(3*$/&-/* 'f $/ $/&%- ! , *+-/
*+- J (-&i.*-( ,1.* G- i/i*i$%i9-8 *' 0, .' *+$* *+- !"
*-(, i. -%i,i/$*-8 i/ >?:1MB, $/8 *+-
$/&%- ! ,1.* G- -50(-..-8 $. $ ($*i' 'f *+- *4' 2$%1-. i/ *+- P $/8 Y (-&i.*-(.: I* i.
! ! 3#
$%&&'()*!t%!'n't'a)'.*!/!0't1!2345!an6!7!0't1! ! 5!a&!'&!6%n*!'n!8'9:r*!<333!=1*!r*&:)t!%>!
!!"#$!% &" # !'&!c%rr*ct)@!c%A$:t*6!t%!(*!#4B3!
C<32DE!
!"
" !!
# "#$!% #
!
$!
"
#$$$$
%
&
''''
!"
" !# "#$!%
&$ %!"
" '!&
!
!"!"#"#! $%&'(r*+,-./'01%*,.1*2,r'%3/,.45(6,r*
7r,.38(r9,'/(.*
F&!A*nt'%n*6!*ar)'*r5!t1*!>'na)!Ga):*!'n!t1*!/!r*9'&t*r!c%nta'n&!t1*!&ca)*6!Aa9n't:6*!
%>!t1*!'n$:t!G*ct%r3!=1'&!$r%$*rt@5!c%A('n*6!0't1!t1*!'ntr'n&'c!c%A$:tat'%n!%>!arctan9*nt!
at!t1*!&aA*!t'A*!A*an&!t1at!t1*!HIJKLH!G*ct%r'n9!A%6*!a:t%Aat'ca))@!6%*&!a!Hart*&'an!
t%!M%)ar!c%%r6'nat*!tran&>%rAat'%n5!N:&t!a&!t1*!J%tat'%n!A%6*!6%*&!a!M%)ar!t%!Hart*&'an!
c%nG*r&'%n3!J*ca))!t1*!*O:at'%n!>%r!t1*!Hart*&'anPM%)ar!tran&>%rAat'%nQ!
C<32RE!
r " "!
# #!
!" "#$!% #
"
!
=1*!Ga):*!>%r!!!'&!c%A$:t*6!'n!t1*!/!r*9'&t*r5!an6!!! !'n!t1*!S!r*9'&t*r3!
!":! ;x=,.1/.-*'>%*2(9=0','/(.*?(9,/.*
=1*!HIJKLH!a)9%r't1A!a&!$r*&*nt*6!t1:&!>ar!can!%n)@!c%A$:t*!>:nct'%n&!(a&*6!%n!
t1*!&'n*!an6!c%&'n*!>:nct'%n&3!=1'&!'&!a!c%n&*O:*nc*!%>!t1*!c'rc:)ar!r%tat'%n&!$*r>%rA*6!'n!
*ac1!&t*$3!=1*!a)9%r't1A!'&!ca$a()*!%>!$*r>%rA'n9!)'n*ar!an6!1@$*r(%)'c!r%tat'%n&!a&!0*))5!
01'c1!*T$an6&!t1*!&*t!%>!>:nct'%n&!t1at!t1*!a)9%r't1A!can!c%A$:t*3!=%!a))%0!>%r!t1*&*!
n*0!6%Aa'n&5!a!n*0!>act%r!'&!'ntr%6:c*6!t%!t1*!&*t!%>!HIJKLH!*O:at'%n&3!Lt&!Ga):*!
! ! "#
$%&%'()*%+!)*!,-).-!.//'$)*0&%!+1+&%(!&-%!023/')&-(!,)22!/4%'0&%5!6-)+!70.&/'!)+!$%7)*%$!
0+!!
895:0<!!!! "!"#"!" # !
=*!!!>02?%!/7!@1!./''%+4/*$+!&/!&-%!-14%'B/2).!$/(0)*,!0!&/!&-%!2)*%0'!$/(0)*,!0*$!
D1!&/!&-%!.)'.?20'!$/(0)*5!E-%*!&-)+!70.&/'!)+!0442)%$!&/!FG?0&)/*!895H<,!&-%!7/22/,)*3!
3%*%'02I4?'4/+%!$%7)*)&)/*!/7!&-%!JKLMNJ!023/')&-(!)+!/B&0)*%$!80++?()*3!n!)&%'0&)/*+<O!
895:1<!
!
!"" #
"!!"#$ !
"$# $%%! $ $%& !
"$# $!
"#$
%&
%"" #
"%!"#$ !
"$# $' !! $ $%& !
"$# $!
"#$
%&
&"" &
!%!
"
!
=+!)*!95:,!!! !)+!&-%!%2%(%*&0'1!'/&0&)/*!4%'7/'(%$!%0.-!+&%4,!,)&-! !! !B%)*3!&-%!+?(!
/7!&-%!'/&0&)/*+!4%'7/'(%$5!6-)+!'/&0&)/*!)+!$%7)*%$!+/!&-0&!&-%!/4%'0&)/*!4%'7/'(%$!7/'!
%0.-!+&%4!)*!&-%!023/')&-(!'%$?.%+!&/!0!+-)7&!0*$!0$$+O!
895::<!
!
!!=
!"#!$%!!( )& " = +$
%!!& " = '
!"#(!$%!!( )& " =!$
"
#
$$$$$
%
$$$$$
!
E)&-!&-%+%!./*$)&)/*+,!&-%!023/')&-(!'%$?.%+!&/O!
895:"<!
!!
xi"1#
1
Ki
xi !mdi2!iyi$ %
yi"1#
1
Ki
yi " di2!ixi$ %
zi"1# zi !di!i
where Ki # 1"m2!2i
,!di #&1,!m" !1,),"1' (
!
! ! "#
!"m$%&' $r&'()!*ac(&r-!!"! ("&)*$&*'+,*-.'
!
Circular
!"1# $!
!"!!""!
!
# =!#!$$%0%'"( ! 1/!i(1ra(i&23!
!
Linear
!" 0# $!
1" " !#"!
!
# =1$""""""""" ! 4!
!"#$%&'()*
! =!1( )!
!+ !!( )""!! = #$%"%!&'()!
!
# ! 1/!i(1ra(i&23!
/$01-'2345(67!8('9.":*;',$<*".)',".'-$<;'<"m=>*$*%"&'?"m$%&3'
5)1!6alu13!&9!()1!C;<=>C!?r&'()!9ac(&r3!ar1!3)&'2!i2!5a@l1!AB1B!5)1!6alu13!ar1!
r&u2C1C!(&!2i21!C1ciDal!Elac13B!5)1!C&23(a2(!F9(1r!c&luD2!3)&'3!)&'!Da2G!i(1ra(i&23!i(!
(aH13!u2(il!()1!Ci3ElaG1C!6alu1!r1Dai23!c&23(a2(B!*&r!all!@u(!()1!li21ar!C&Dai2-!')ic)!
al'aG3!)a3!a!9ac(&r!&9!1-!()1!Ci3ElaG1C!6alu1!ca2!@1!u31C!9&r!a2G!iDEl1D12(a(i&2!'i()!1/!
&r!D&r1!i(1ra(i&23B!
F3!@19&r1-!()1!#!a2C!$!c&DE&212(3!ar1!3(&r1C!i2!()1!I!a2C!J!r1?i3(1r3B!5)1!C19i2i(i&2!
&9!K!C1E12C3!&2!()1!c&DEu(a(i&2!D&C1-!'i()!)GE1r@&lic!(a2?12(!@1i2?!u31C!i23(1aC!&9!()1!
3(a2CarC!(a2!9u2c(i&2!')12!i2!()1!)GE1r@&lic!C&Dai2-!9&r!1LaDEl1M!
NABOAP!
!
zi"! #
zi!d
itan
!!2!i$ %&!m#!
zi!d
i2!i&!m# '
zi!d
itanh
!!2!!$ %&!m#!!
"
#
$$$$$
%
$$$$$
!
5)1!9&ll&'i2?!(a@l1!3uDDariQ13!')a(!9u2c(i&23!ca2!@1!c&DEu(1C!a2C!i2!')ic)!D&C1!
a2C!C&Dai2!()1G!ca2!@1!c&DEu(1CM!
! ! "#
! !"#$%&'&(") +",-
."#'() /"&'&(") 0-1&"2()3
!
!i#c%&a#
!"(# $
!
! " "n!!"#$ $
!# $
% " "n!!$%& $
!# $
'(&$!"%
!
!
! = "n!!
"#$
!
"
% = %!
# #$%!& $
!
!!
"
#$$$$
%
&
''''
!
!
!"#$%&
!" 0# $ !
! "!!# "
!#!!
!
! " !0#"0
#0
!
!"#e%&'()*
! =!+( )
X = KnX0"#$h Z
0( )
Y = KnX0$inh Z
0( )
tanhZ0
=Y
X
&Z0 = X +Y
!
!
! " "n!!
" !$!
"
% " %!# #$%&!'
$!
!!
"
#$$$$
%
&
''''
(% !$ %" "%) *+i#&!!! "!#')!$! "!!'
! "'
"n
!) *+i#&*!!"!# '
-)!$
!"!! '
-
!
4'56- 789:;%)1&(")< &='& 1') 5- 1"#$%&-, %<()3 &=- !>/.?! '63"2(&=#8
$%!&''()(*%!)*!)+,!&''()(*%!*-!)+,!!!'*m&(%!/,0,1)*23!&!1*450,!*)+,2!m*'(-(1&)(*%/!
m4/)!6,!m&',!)*!)+,!&07*2()+m!8+,%!-4%1)(*%(%7!(%!*)+,2!m*',/9!
$%!)+,!1(2140&2!'*m&(%3!)+,!&07*2()+m!/)&2)/!8()+!)+,!2,7(/),2/!(%!)+,(2!(%()(&0(:,'!/)&),/!
&%'!)+,!&07*2()+m!6,7(%/!8()+!&!/,;4,%1,!*-!"!<&04,/!*-!)+,!-*2m!=3!>3!?3!"3!@3!A!&%'!/*!
*%!4%)(0!)+,!',/(2,'!%4m6,2!*-!(),2&)(*%/!+&/!6,,%!1*m50,),'9!$%!)+,!-(2/)!/),53!%*!/+(-)!(/!
5,2-*2m,'!&/!&!2,/40)!*-!"!6,(%7!,;4&0!)*!=9!$%!)+,!0(%,&2!'*m&(%3!)+,!/,;4,%1,!7*,/!>3!?3!"3!
@3!B3!A!&%'!/*!*%3!8()+!&!/+(-)!+&55,%(%7!(%!)+,!-(2/)!/),59!C(%&00D3!&/!,E50&(%,'!(%!@9F3!)+,!
+D5,26*0(1!'*m&(%G/!"!/,;4,%1,!(/!-42)+,2!1*m50(1&),'!6D!)+,!%,1,//()D!)*!2,5,&)!
(),2&)(*%/9!H+,!"!/,;4,%1,!(%!)+(/!'*m&(%!(/!>3!?3!"3!@3!@3!B3!A!8()+!,<,2D!"#!I!>!(),2&)(*%!
6,(%7!2,5,&),'9!
40
!"#! $%&'()*(&+(,
The algorithm is now fully defined, and it will now be demonstrated that the
algorithm converges on the previously noted functions. The specific region of
convergence will also be shown.
!"#"-!./(,$%&'()*(&+(,$%&0i2i%&,
Assume that the CORDIC algorithm is in vectoring mode. Let !! be the angle of the
input vector after the !th iteration of the algorithm. The algorithm tries to reduce the angle
of the input vector. Therefore, after step i+! , the angle of the vector will change such
that
(4.25) !!!!"!# !
!!"
!
where !!! is the rotation performed in the !th iteration of the algorithm. In order for
the algorithm to converge in " iterations, then all subsequent iterations must bring ! to
within !!!!!!
of zero. If this is the case, then when the "th iteration completes, the input
angle will be zero. Since the rotations accumulate, the following condition is derived:
(4.26)
!!
!!! !
"
""!#1
#!1
" $!#!1
In order for the algorithm to converge, the condition in (4.26) must hold when the
algorithm first begins:
(4.27)
!!
!!! "
!
!"!
"!"
" #""!"
To find the domain of initial values for which the algorithm converges, the above
inequality is solved:
! ! "1
$"%&'(!
!!!"# !
$" #$ "!!%% ""
"$$
!!%
" !
)*+,-!./-!0-1*+*.*2+!21! !!!0-3-+04!2+56!2+!!7!./-!0289*+!21!,2+:-;<-+,-!,9+!=-!
,283>.-0%!?/-!0289*+4!21!,2+:-;<-+,-!12;!./-!./;--!,283>.9.*2+!0289*+4!9;-!4/2@+!*+!
?9=5-!"%A!=6!>4*+<!$"%&'(%!B6!-:95>9.*+<!./-!5*8*.!21!./-! ! !.-;84!94!!!933;29,/-4!!! 7!9+0!
2=4-;:*+<!./9.! !!!!!" " !94! i!" 7!./-! !!!!1 !.-;8!,9+!=-!0;233-0!1;28!$"%&'(%!
!"#$%&'&(") +"#'() ,$$-".(#'&/ +"#'() "0 !")1/-2/)3/
!(-3%4'-! ! =!# $ !
!!!!" !"#
!$%!! !
!"#!$%!!( )
!=$
"
# $$&'()%* !
5()/'-! m = !# $ !
!!!!" !
!! !
!!!
!="
"
# ="!
67$/-8"4(3! ! "!!# $ !
! !!= tanh
!12!! !
!
!"#$!1&!!" #
!
" #1'11(1)
! $ 1* &* 3* ,* ,* -*"% &
!
9'84/ :;<=+"#'()> "0 3")1/-2/)3/ 0"- &?/ !@A+B! '42"-(&?#;
C2.-!./9.!./-!,2+0*.*2+!$"%&D(!*4!+2.!8-.!@/-+!*+!./-!/63-;=25*,!0289*+!*1!./-!
4.9+09;0!+2+E;-3-9.*+<!4-F>-+,-!$-%<%!17!&7!A7G(!*4!>4-0%!H+!2;0-;!12;!./-!95<2;*./8!.2!
,2+:-;<-7!,-;.9*+!*.-;9.*2+4!8>4.!=-!;-3-9.-0%!)3-,*1*,95567!*.!*4!+-,-449;6!12;!4.-34!I"7!1A7!
"07!G!7! 3k+1K!.2!=-!;-3-9.-0%!?/*4!,28-4!94!9!,2+4-F>-+,-!21!./-!12552@*+<L!M1N!
$"%&O(!
!
!!! !
"
""!#1
#!1
"#
$%%%
&
'((((!!
3!#1$!
#!1 !
!"#"$ %&''()'()*'+,-&.-+/-)*&01-&02)
H+!2;0-;!.2!0-82+4.;9.-!./9.! !!!@*55!,2+:-;<-!.2!9.!824.!
!!!!!7!*.!@*55!1*;4.!=-!3;2:-+!
=6!*+0>,.*2+!./9.!./-!12552@*+<!*4!.;>-L!
$"%A0(!
!!
!i""
"!1# "
#
#$1
"!1
" !
! ! "#
$i&'()!*"+,-.!i'!(&/0!12&!!!3!-)!4'!4!&0'/5(!21!*"+#6.+!72!8&290!(:4(!*"+,-.!i'!(&/0!12&!
!!;!<)! !i!i'!'/=(&4>(0?!1&2@!*"+,-.!4A?!*"+#B.!i'!4885i0?!(2!(:0!501(!'i?0+!7:i'!Ci05?'D!
*"+,<.!
!!
! !!!!" !
"
"##"!
!!!
"#
$
%%
&
'
(($!!
#$ "
#
!!#$ !
!!!" !
"
"##"!
!!!
"#
$
%%
&
'
((
!
E:0A!*"+#F.!i'!4885i0?)!(:0!12552GiAH!&0'/5('D!
*"+,#.!
!!
!!"!#"
"!!" "
#
#$!"!
"!!
" !
7:0&012&0)!=C!iA?/>(i2A)!i(!i'!8&290A!(:4(!*"+,-.!i'!(&/0!12&!455!!!!!-+!I1!!!" n )!(:0A!
*"+,,.! !!!""
!!1!
G:i>:!8&290'!(:4(!(:0!JKLMIJ!45H2&i(:@!>2A90&H0'!i1!(:0!iA8/(!4AH50!i'!Gi(:iA!(:0!
?2@4iA!21!>2A90&H0A>0!?01iA0?!iA!*"+#N.!i'!'4(i'1i0?+!
!"#"$ %&'()*+)',) .' /&010.&' 2&3)
7:0!?2@4iA!21!>2A90&H0A>0!?01iAi(i2A!4A?!8&221!iA!"+B+<!4A?!"+B+#!4''/@0?!(:4(!
JKLMIJ!i'!280&4(iAH!iA!90>(2&iAH!@2?0+!I1!"!i'!'/='(i(/(0?!12&! ! !iA!(:0!0O/4(i2A'!iA!
(:2'0!'0>(i2A')!(:0A!(:0!?2@4iA!21!>2A90&H0A>0!>4A!=0!12/A?!4A?!8&290A!12&!(:0!&2(4(i2A!
@2?0+!IA!84&(i>/54&)!*"+#N.!':2G'!(:4(!"!:4'!(:0!'4@0!?2@4iA!21!>2A90&H0A>0!4'!!! D!
*"+,".! ma# !
$" # $ma# !$" # !
4#
!"# $%%&'(%)*(n,*-''.'*
With the algorithm now fully defined, and its convergence proven, the accuracy and
precision of the algorithm will now be discussed. Generally speaking, for each iteration
of the algorithm, an additional bit of accuracy is obtained.
>rror creeps into the results from many sources. ?aturally, since there is not an
infinite number of bits available to any real hardware device, rounding errors are inherent
to any implementation. Additionally, since there are a finite number of iterations to any
real-world implementation of the algorithm, the desired rotation angle can never be fully
realiBed. This results in an angle approximation error.
!"#"/ 0&12'3%(4*526'272n8(83.n*
The CORDIC algorithm operates using bit-shifts, the natural representation for any
number used in the algorithm is going to be a fixed-point representation rather than
floating-point. A fixed point number is a scaled integer, where the binary point is implied
to be ! positions to the left of the JKL. As a result, the integer is 2! times larger than the
number it is representing. This is similar to storing 2.N4 as 2N4. This scale, combined
with the number of bits available completely determines the range and precision available
to all numbers in the algorithm. The scale determines how many bits are available to the
left and right of the binary point.
As the binary point moves to the right, greater ranges of numbers are available, but
the scale becomes coarser, with larger gaps between adOacent numbers. Jikewise, as the
binary point moves to the left, the scale is finer, but the range of possible numbers is
smaller.
! ! ""
#$%&'!'()(*+!,%-!%.+/!01!123./41'!%*!0/*5!1-'+!/6!*51!0(-%&4!7/&'!*/!1-5%-,1!
%,,$&%,48!951!)$%&'!'()(*+!%&1!-/*!$+1'!(-!%-4!(-3$*!:%.$1+!%-'!%&1!&1+1&:1'!6/&!0(*+!
%331%&(-)!(-!6(-%.!:%.$1+8!;/&!1<%23.1=!%*!.1%+*!>!0(*+!%&1!-1,1++%&4!(-!,(&,$.%&!2/'1=!
01,%$+1!*51!!"!6%,*/&!(+!)&%*1&!*5%-!?=!75(,5!21%-+!*5%*!*51!2%)-(*$'1!/6!*51!:%.$1+!7(..!
)&/7!%+!*51!%.)/&(*52!7/&@+8!#$%&'!'()(*+!/-!*51!.1%+*A+()-(6(,%-*!+('1!%&1!123./41'!*/!
&1'$,1!&/$-'(-)!1&&/&8!
!"#"$ %&'()*(+,-..&.,
B51-!*51!-$201&!/6!0(*+!&1C$(&1'!*/!&13&1+1-*!%!:%.$1!1<,11'+!*51!-$201&!/6!0(*+!
%:%(.%0.1!(-!*51!+4+*12=!*51!'1+()-1&!5%+!*7/!/3*(/-+D!&/$-'!/&!*&$-,%*18!E-!%!*&$-,%*(/-=!
*51!1<,1++!0(*+!%&1!'(+,%&'1'8!E-!%!&/$-'=!*51!&13&1+1-*%*(/-!(+!%.*1&1'!'131-'(-)!/-!*51!
1<,1++!0(*+8!F/$-'(-)!7/&@+!01**1&!(-!*51!GHFIEG!%.)/&(*52=!01,%$+1!*51!2%<(2$2!
&/$-'(-)!1&&/&!(+!/-.4!5%.6!%+!.%&)1!%+!*51!1&&/&!&1+$.*(-)!6&/2!*&$-,%*(/-8!
F/$-'(-)!1&&/&+!%&1!(-*&/'$,1'!(-!*51!,/$&+1!/6!,/23.1*(-)!1%,5!(*1&%*(/-!/6!*51!
%.)/&(*528!951+1!1&&/&+!%,,$2$.%*1=!75(,5!5%+!*51!1661,*!/6!6$&*51&!/66+1**(-)!*51!6(-%.!
&1+$.*8!J/71:1&=!B%.*51&!3/(-*1'!/$*!(-!K?L!*5%*!*51!&/$-'(-)!/6!*51!&1+$.*+!(-!1%,5!/6!N!
(*1&%*(/-+!-1:1&!&1+$.*+!(-!2/&1!*5%-! log2 ! !0(*+!/6!1&&/&8!E-!/&'1&!*/!-1)%*1!*5(+!1661,*=!
*51!+4+*12!,%-!$+1! ! " !"#$ ! !0(*+!/6!+*/&%)1!6/&!%..!(-*1&21'(%*1!&1+$.*+8!
!"#"/ 0(+12,033.&4*567*&(,-..&.,
E*!5%+!011-!+5/7-!*5%*!*51!%.)/&(*52!7(..!%.7%4+!,/-:1&)1!/-!%-!%,,$&%*1!&1+$.*!(6!
)(:1-!(-6(-(*1!3&1,(+(/-8!J/71:1&=!6/&!+/21!:%.$1+=!(*!2%4!*%@1!%-!(-6(-(*1!-$201&!/6!
(*1&%*(/-+!*/!%&&(:1!%*!*51!%,,$&%*1!&1+$.*8!M+!%!&1+$.*=!*51!%,*$%.!&/*%*(/-!/6!*51!(-3$*!:1,*/&!
(+!/-.4!%-!%33&/<(2%*(/-!/6!*51!'1+(&1'!%-).18!95(+!%-).1!%33&/<(2%*(/-!1&&/&!(+!*51!
"#
$%&'(%$ ()*+*'), +,-.% /',0& *1% +2*0+. ()*+*'), +,-.% 341'21 '& *1% &0/ )5 +.. *1%
',*%(/%$'+*% +,-.%&67
3"89#6
!!
!!" !" !"""
""!
#""
# : 41%(% !! '& *1% $%&'(%$ ()*+*'), +,-.%
;',2% *1'& %(()( $%2(%+&%& 4'*1 *1% ,0/<%( )5 '*%(+*'),&: +, )<=')0& 4+> *) '/?()=%
*1% +220(+2> )5 *1% +.-)('*1/ '& *) ',2(%+&% *1% ,0/<%( )5 '*%(+*'),& ?%(5)(/%$8 @)4%=%(:
*1%(% +(% &)/% .'/'*& *) 1)4 /+,> '*%(+*'),& 2+, <% +$$%$ +,$ &*'.. '/?()=% +220(+2>8
A'(&*: *1% ()0,$',- %(()( $'&20&&%$ ', "8B8C ',2(%+&%& 4'*1 *1% ,0/<%( )5 '*%(+*'),&: &) *1'&
%55%2* /0&* <% 2),&'$%(%$ 41%, +$$',- *) *1% '*%(+*'), 2)0,*8 A0(*1%(: &',2% *1% +,-.%& +(%
D0+,*'E%$: *1% +220(+*% +,-.% /+> ,%=%( 5'* ',*) *1% <'* 4'$*1 ?()='$%$8 F5 B <'*& +(% 0&%$
5)( &*)(+-%: +,$ *1%(% +(% P <'*& *) *1% ('-1* )5 *1% <',+(> ?)',*: *1%, *1% =+.0% )5 *1% .%+&*G
&'-,'5'2+,* <'* ', &*)(+-% '& -'=%, <> !!B!P
8 H1'& '& *1% &/+..%&* ,0/<%( *1+* 2+, <%
+220(+*%.> &?%2'5'%$ ', *1% &>&*%/8 H1%(%5)(% *1% .+&* ()*+*'), +,-.% 3*1% &/+..%&*6 /0&* <%
+<.% *) 5'* ', *1% +=+'.+<.% &?+2%8 H1'& >'%.$& *1% 5)..)4',- 2),$'*'),7
3"89I6 !!!n!1" 2
!"2P
F5 *1'& $)%& ,)* 1).$: *1%, *1% +,-.% +2*0+..> &*)(%$ 4'.. <% J: 41'21 4'.. (%&0.* ', ,)
50(*1%( ()*+*'),& )5 *1% ',?0* =%2*)(8
"#
!"#$%&' ) *"& +#'#,,&, -.$,&.&/%#%01/
$%is (%esis s(*+ies (,- +i..e/en( i123e1en(4(i-ns -. (%e 567D95 43g-/i(%1. <-(%
+esigns 4/e s=n(%esi>e+ .-/ (%e ?i3in@ A24/(4nBC ?5CA15FF *sing (%e ?i3in@ 9AG He/si-n
I.1i 4n+ si1*34(e+ *sing J-+e3Ai1 He/si-n #.F4. $%is K%42(e/ s(*+ies 4 s(/4ig%(.-/,4/+L
24/433e3 +esign K-nsis(ing -. (,- s%i.( /egis(e/sL 4 s(4n+4/+ /egis(e/L .-*/
4++e/Ms*N(/4K(-/sL 4 3--O*2 (4N3eL 4n+ 4 K-n(/-3 *ni(. $%e 4/e4 4n+ (i1ing /eP*i/e1en(s
4/e +isK*sse+L 4n+ si1*34(i-ns -. (%e +esign 4/e 2/esen(e+.
$%e +esign is K-n.ig*/e+ (- Ne in %=2e/N-3iK /-(4(i-n 1-+e in -/+e/ (- K-12*(e (%e
e@2-nen(i43 .*nK(i-nL N*( 433 K-12-nen(s 4/e 2/esen( .-/ 4 K-123e(e3= gene/43B2*/2-se
567D95 *ni(. 6n3= 1in-/ K%4nges (- (%e K-n(/-3 *ni( ,-*3+ Ne neKess4/= (- i123e1en(
(%e -(%e/ 1-+es. $%is +esign 43s- inK3*+es (%e %4/+,4/e neKess4/= (- 2e/.-/1 (%e /e2e4(
i(e/4(i-ns /eP*i/e+ in (%e %=2e/N-3iK +-14in. $%e K-n(/-3 *ni( K4n Ne 1-+i.ie+ (- 433-,
.-/ 1-/e 4 s-2%is(iK4(e+ i(e/4(i-n /e2e(i(i-n sK%e1e.
$%is i123e1en(4(i-n *ses CQ Ni(s -. 2/eKisi-nL ,i(% (%e Nin4/= 2-in( .i@e+ s*K% (%4(
(%e/e is -ne sign Ni( 4n+ (%/ee in(ege/ Ni(s (- (%e 3e.( -. (%e Nin4/= 2-in(L 4n+ QR Ni(s
/e2/esen(ing (%e ./4K(i-n43 24/( -. (%e H43*es. $%e +esign *ses QSs K-123e1en( 4/i(%1e(iKL
giHing (%e /egis(e/s in (%e +esign 4 /4nge -. H43*es ./-1 TR.F (- I.F. $%e 2-si(i-n -. (%e
Nin4/= 2-in( is i123iKi(L 4n+ n-( s2eKi.ie+ 4n=,%e/e in (%e +esign. $%e H43*es in (%e
3--O*2 (4N3e 4/e gene/4(e+ *sing (%e s2eKi.ie+ Nin4/= 2-in( 2-si(i-nL 4n+ in -/+e/ .-/
K-//eK( /es*3(s (- Ne 4K%ieHe+L 433 in2*(s 1*s( Ne sK43e+ 4KK-/+ing3=. $%e -n3= K%4nge
neKess4/= (- 433-, .-/ 4 +i..e/en( Nin4/= 2-in( 2-si(i-n ,-*3+ Ne in (%e H43*es s(-/e+ in
(%e 3--O*2 (4N3e. U- K%4nges (- (%e +esign ,-*3+ Ne /eP*i/e+.
! ! "#
!"#! De&'()*
!"
1
X
Y
!
"
1
!"
#"
$%$&
$%$&
#
!
atanh()T
'(
)*+,&'(
)*+,&
'(
+((-
)./!#
)./!#)./Z
!"#$%"&'#($
!#Z
)&+-&
'()./!#)./Z$%$&)*+,&(1%2
X+Y
$hifted value
un$caled value
input value
NOT./ 3he 4 and 5 regi$ter$ are $pecial regi$ter$ that al$o have $hifting capa9ilit:. 3he output$ are a$ $ho<n a9ove.
=X
Y
e456Z"7
"
1Z"
$%$&
+((-
!!"#$%e'()*+,-./01-e2e-'3"4#%45'.6'78e'/.59-e7e'94%4--e-':;<=>:'$?"7)'
$%%!&a()*!+)&,)n.nt0!)1!th.!3O5673!,*)+.008n9!:n8t!a*.!0h);n!8n!<89:*.!=>?>!Th.!
XB!YB!anD!Z!*.980t.*0!+)nta8n!th.!+:**.nt!Fa%:.0!)1!th.!XB!YB!anD!Z!+)&,)n.nt0!)1!th.!
3O5673!a%9)*8th&>!Th.G!.a+h!haF.!,a*a%%.%!%)aD!+a,aH8%8t8.0!+)nt*)%%.D!HG!th.!L6!8n,:t!
089na%>!J:?!&:%t8,%.L.*0!+)nt*)%!th.!8n,:t!Fa%:.0!t)!.a+h!)1!th.!*.980t.*0>!$n!7N7T!):t,:t!
HG!th.!+)nt*)%!:n8t!a%%);0!8n8t8a%!Fa%:.0!t)!H.!%)aD.D!H.1)*.!+)&,:tat8)n!H.98n0>!Nh.n!
7N7T!80!a00.*t.D!Oa+t8F.!h89hPB!th.n!th.!&:%t8,%.L.*0!,a00!th.!8n8t8a%!Fa%:.0!8nt)!th.!
*.980t.*>!6:*8n9!th.!+)&,:tat8)n!,ha0.B!th.!7N7T!089na%!80!n)t!a00.*t.DB!a%%);8n9!th.!
):t,:t!)1!th.!aDD.*0!t)!H.!1.D!Ha+Q!8nt)!th.!*.980t.*0>!
Th.!+)nt*)%!:n8t!:0.0!th.!089n!)1!th.!Z!*.980t.*!t)!9.n.*at.!th.!0:Ht*a+t!089na%0!that!
a*.!1.D!8nt)!th.!aDD.*/0:Ht*a+t)*0>!N8th!th.!a,,*),*8at.!&8n)*!&)D818+at8)n0!t)!th.!+)nt*)%!
:n8tB!th.!9.n.*at8)n!)1!th.!SUUXY!anD!SUUZ!089na%0!+an!H.!&)D818.D!0)!that!th.!:n8t!
! ! "#
$pe'()e*!+,!-e.)$'+,/!0$1e2!T4e!1e*+/,!5$'!)4e!.$,)'$6!7,+)!+*!1+*.7**e1!57')4e'!+,!829292!
T4e!-(67e*!$5!)4e!X!(,1!Y!'e/+*)e'*!('e!+,p7)!+,)$!(!5+,(6!(11e'!)$!/e,e'()e!)4e!
exp$,e,)+(6!57,.)+$,2!A5)e'!)4e!(6/$'+)40!+*!.$0p6e)e>!)4e!'e*76)!?+66!@e!!"!
2!T4e!7,+)!.(,!
(6*$!.$0p7)e! e!Z
!
!@A!.4(,/+,/!)4e!(11e'!+,)$!(!*7@)'(.)$'2!
R()4e'!)4(,!.$0p7)e!)4e!('.C4Ape'@$6+.!)(,/e,)>!(!*0(66!6$$D7p!)(@6e!+*!7*e1!)$!*)$'e!
)4e!p'eC.$0p7)e1!-(67e*!$5!)4+*!57,.)+$,2!T4e!)(@6e!4(*!$,e!'$?!pe'!+)e'()+$,!$5!)4e!
(6/$'+)402!I5!(!/e,e'(6Cp7'p$*e!FORDIF!(6/$'+)40!?e'e!)$!@e!1e*+/,e1>!(,!(11+)+$,(6!
6$$D7p!)(@6e!?$761!@e!,e.e**('A!)$!*)$'e!)4e!,e.e**('A!('.)(,/e,)!-(67e*2!I$'!)4e!6+,e('!
1$0(+,>!(,!(11+)+$,(6!*4+5)!'e/+*)e'!?$761!(6*$!@e!,e.e**('A2!A!076)+p6exe'!?$761!)4e,!@e!
7*e1!)$!*e6e.)!)4e!-(67e!5'$0!)4e!(pp'$p'+()e!)(@6e2!T4e!.$,)'$6!7,+)!p'$-+1e*!)4e!(11'e**!
7*e1!)$!(..e**!)4e!1()(!5'$0!)4e!)(@6e2!
U#$% A'()*(+*$ U,)*)-(,)./
S*)1$# JKL! 9J>J9K! K28M!
S*)1$ 2*)342*.3# 9KJ! KN>NK"! 028M!
LUT# N0P! KN>NK"! K2JM!
IO9# 9JK! KK9! 8L2PM!
9RAM# 9! JK! J29M!
<CL># 9! #! 9K28M!
T(+*$ ?@ABO'$C(** %$')1$ D,)*)-(,)./ E.C ,F$ 3(C(**$* %$#)G/@
T4e!1e-+.e!7)+6+Q()+$,!5$'!e,)+'e!1e*+/,!+*!*700('+Qe1!+,!T(@6e!8292!T4e!1e*+/,!7*e*!
K2"M!$5!)4e!XFJS9800S*!(-(+6(@6e!*6+.e*>!6e(-+,/!(0p6e!*p(.e!5$'!7*(/e!@A!(!,e7'(6!
,e)?$'D2!T4e!X+6+,x!*A,)4e*+Qe'!(6*$!'ep$')e1!)4()!)4e!1e*+/,!.(,!4(,16e!(!)4e$'e)+.(6!
0(x+070!.6$.D!5'eT7e,.A!$5!P820!MHQ2!N()7'(66A>!)e*)+,/!+*!,e.e**('A!)$!1e)e'0+,e!)4e!
)'7e!)+0+,/!'eT7+'e0e,)*!5$'!)4e!1e*+/,2!T4e!5$66$?+,/!*e.)+$,*!?+66!1+*.7**!)4e!1e*+/,!
(,1!('e(!'eT7+'e0e,)*!5$'5!)4e!0(X$'!.$0p$,e,)*!$5!)4e!7,+)2!
"#
5.1.1 The Control Unit
$he 'ontrol unit i/ re/pon/i1le for regulating the flo5 of data through the CORDIC
unit. $he unit i/ a finite /tate ma'hine ha>ing three /tate/? a/ /ho5n in Figure 5.2. $he
default /tate for the unit i/ the IDCD /tate. In thi/ /tate? the regi/ter/ are not loading? and
the unit i/ /taE/ in thi/ /tate until the S$GR$ /ignal i/ a//erted. Hhen the S$GR$ /ignal
i/ a//erted? the unit mo>e/ into the IRDCOGD /tate.
In the IRDCOGD /tate? the regi/ter load /ignal i/ a//erted? and the IJI$ /ignal i/
al/o a//erted? 'au/ing the initial >alue/ to 1e loaded into the regi/ter/ Kthe/e >alue/ are
repre/ented a/ LM? NM? and OM in Figure 5.PQ. $he unit then ad>an'e/ to the CORIS$D
/tate. $he regi/ter load /ignal/ remain a//erted? 1ut the IJI$ /ignal i/ not a//erted. G/ a
re/ult? the regi/ter/ are loaded 5ith the >alue/ from the 'orre/ponding adder/? rather than
the input >alue/. $he unit maintain/ an internal iteration 'ount? and 5hen the/e >alue/ are
eTual? the 'ontrol unit mo>e/ 1a'U into the IDCD /tate? 5ith the DOJD /ignal a//erted?
/ignifEing that the >alue/ output 1E the regi/ter/ are >alid. $a1le 5.2 /ho5/ the
un'onditional output/ for ea'h /tate. Other /ignal/? /u'h a/ the looUVup ta1le addre//?
SWGR$? and SSX /ignal/ >arE and are 'ontinuou/lE updated 5hile in the CORIS$D
/tate. $he logi' 1ehind the /tate tran/ition/ i/ /ho5n in Figure 5.Y.
!DL$
%R$L'AD C'M%+T$
SHAMT 0 ST'%1ALSTART
!"#$%&'5)*+,-.-&'/".#%.0'12%'-3&'4.%.55&5'6789:6'$;"-)'
5#
! "#$%! &'%$()#! *(+&,-%!
"."-! # 1 #
#(.%! 1 # #
$#! # 1 1
-/012!3456,7897:;<;97/1!897<=91!>7;<!9><?><@!A9=!2/8B!@</<2!;7!<B2!?/=/1121!:2@;C74!
! "
###$
! "
###$
$%&
'()*+
'+),+
!
D;C>=2!3436F</<2!<=/7@;<;97!19C;8!A9=!<B2!?/=/1121!*('#"*!>7;<4!
The generation of the SHAMT signal is complicated by the need to perform repeat
iterations, and the need to wait one cloc> cycle to read hyperbolic tangent data from the
L@T. Bigure 5.4 shows the logic behind the SHAMT signal, as well as the other signals
that are state independent (S@BXH, S@BI, ADDR, and DONE). The subtract signals
S@BXH and S@BI are determined by the sign of the I register, since the PORDIP unit
is operating in rotation mode. Therefore, the sign bit of the I register is connected
directly to the subtract inputs of the corresponding adders.
The SHAMT register contains the shift amount, which also corresponds to the
current iteration count, which is then connected to the X and H registers. Since data from
the loo>-up table ta>es one cloc> cycle to be retrieved, SHAMT is reTuired to lag one
cycle behind the ADDR signal, which is fed to the loo>up-table. This is shown as a direct
connection between the output of the ADDR register and the input of the SHAMT
! ! "#
$%&'()%$*!+)!'(!),%!-./)%/)(!.0!),%!1223!$%&'()%$!),4)!5%)%$6'/%!7,%/!4!$%8%4)!')%$4)'./!'(!
/%-%((4$9*!
:,%!-;$$%/)!1223!'(!-.684$%5!4&4'/()!),%!<4=;%!.0!),%!>?:3@:!$%&'()%$*!:,%!
>?:3@:!$%&'()%$!-./)4'/(!),%!/%A)!')%$4)'./!),4)!6;()!B%!$%8%4)%5*!+)!'(!'/')'4='C%5!).!D*!+/!
,98%$B.='-!6.5%E!%<%$9! !! !" !')%$4)'./!6;()!B%!$%8%4)%5!F7,%$%!!!'(!),%!8$%<'.;(=9!
$%8%4)%5!')%$4)'./G*!:,'(!-.68;)4)'./!-4/!B%!8%$0.$6%5!;('/&!4!('/&=%!455%$*!H!!'(!
-.68;)%5!B9!'/8;))'/&!>?:3@:!'/).!),%!0'$()!'/8;)!.0!),%!455%$!4/5!! !"#$%&$ !
F4--.68='(,%5!7'),!4!B')!(,'0)G*!I9!4((%$)'/&!),%!-4$$9J'/!'/8;)E! ! !NXTRPT!'!'(!
-.68;)%5*!K,%/!),%!-;$$%/)!<4=;%!.0!1223!4/5!>?:3@:!64)-,E!),%/!>?:3@:!'(!
;854)%5E!4/5!1223!&%)(!=.45%5!7'),!')(!-;$$%/)!<4=;%E!$4),%$!),4/!),%!.;)8;)!.0!),%!455%$!
),4)!'/-$%6%/)(!')(!<4=;%*!
K,%/!LM1N:!64)-,%(!),%!-./()4/)!L:O@P1QE!),%/!),%!2O>R!('&/4=!'(!4((%$)%5E!
4/5!),%!;/')!$%);$/(!).!),%!+2QR!()4)%*!
!"31% S'(XY
S'(!ADDR
$%TR'T
1
"
#
+DD
Cin
#
+DD
Cin
$$ 1
"
1
&S)AMT
&STO',AL
DO12
!!"#$%& ()*+,-#". /&0"12 30& 4353&6"12&7&12&13 -$37$34 -8 30& 75%599&9 .-13%-9 $1"3)
! ! 52
"#$% &'()*(+*$ ",)*)-(,)./
0*)1$# 3%! &3'3&2! ()3*!
0*)1$ 2*)342*.3# 2%! 2+'+2,! ()&*!
5"6# -&! 2+'+2,! ()3*!
789# &&3! 22&! 5&)&*!
9:&;# (! 32! ()(*!
<=5># &! .! &2)5*!
6(+*$ ?@ABC$')1$ D,)*)-(,)./ E.F ,G$ 3(F(**$* 1./,F.* D/),@
/01!2312!31456317189:!;<3!906:!=<7><8189!<;!901!?@ABC?!5869!231!D1926E1D!68!/2FE1!
5)3)!/01!=<893<E!5869!2==<589:!;<3!&2*!<;!901!:E6=1:!5:1D!FG!901!189631!D1:6H8)!
5.1.2!The Shift -egisters
/01!I!28D!J!31H6:913:!3145631!:06;9!=2>2F6E69G)!K68=1!901!:06;9!27<589!68=312:1:!L690!
12=0!6913296<8!<;!901!2EH<36907'!2!:928D23D!:68HE1MF69!:06;9!31H6:913!=288<9!F1!5:1D)!N!
31H6:913!=2>2FE1!<;!:06;968H!2!O2362FE1!27<589!6:!81=1::23G)!P58D271892EEG'!90131!231!9L<!
2>>3<2=01:!9<!D1:6H868H!:5=0!2!31H6:913)!/01!12:61:9!28D!7<:9!2312M=<8:=6<5:!6:!9<!
>13;<37!2!:68HE1!<81MF69!:06;9!>13!=E<=Q!=G=E1!28D!31>129!2:!81=1::23G!5896E!901!O2E51!02:!
F118!:06;91D!FG!901!D1:631D!27<589)!/01!:68HE1!F69!:06;9!=28!F1!023DML631D'!768676R68H!
2312!31456317189:)!/06:!2>>3<2=0!3145631:!901!=<7>59296<8!9<!>25:1!5896E!901!:06;968H!6:!
=<7>E191)!/06:!6:!5:1D!2:!>239!<;!901!2312!<>9676R296<8!H2681D!;3<7!901!F69M:1362E!
67>E17189296<8!D6:=5::1D!68!?02>913!+)!
C9!6:!>31;1331D!9029!901!:06;9!F1!>13;<371D!68!<81!=E<=Q!=G=E1)!/06:!=28!F1!
2==<7>E6:01D!5:68H!75E96>E1S13:'!28D!6:!6EE5:93291D!;<3!2!,MF69!31H6:913!68!P6H531!5)5)!P<3!
906:!D1:6H8'!L06=0!5:1:!32MF69!31H6:913:'!32! 32 ! #75E96>E1S13:!231!5:1D!;<3!12=0!:06;968H!
5869)!/01!31H6:913:!02O1!9L<!<59>59:'!901!58:06;91D!<59>59!D631=9!;3<7!901!31H6:913'!28D!901!
:06;91D!<59>59'!L06=0!6:!901!<59>59!;3<7!901!75E96>E1S13:)!/01!KTNU/!:6H82E!=<893<E:!901!
27<589!FG!L06=0!901!<59>59!6:!:06;91D)!/01!:06;9!31H6:913!>13;<37:!28!236907196=!:06;9!
53
ope'a)*o+ )o e+,u'e p'ope' e.e/u)*o+ 0he+ +e2a)*3e 3alue, a'e u,e5. 7o+,e8ue+)l9: )he
;<= >)he ,*2+ b*)@ *, /o++e/)e5 )o )he ,h*A)e5 ou)pu), oA )he Bul)*ple.e', )o allo0 Ao' a
,*2+ e.)e+,*o+ )o o//u' 0he+ ,h*A)*+2.
!"#
$"# 0123
%H'FT*3+01230
123
0123
%H'FT*2+
%H'FT*1+
%H'FT*0+
D*3+
D*2+
D*1+
D*0+
CL/ %HAMT Figure 5.5—Design of the variable shift register.
Used Available Utilization
Slices CD 13:31F D.GH
Slice Flip-Flops I5 FI:IFJ D.FH
LUTs 1IF FI:IFJ D.IH
IOBs 1DJ FF1 JG.1H
BRAMs D 3F D.DH
GCLKs 1 K 1F.5H
Table 5.H—Device utilization for the parallel shift register.
The ,h*A) 'e2*,)e', u,e a ,*2+*A*/a+) po')*o+ oA )he 5e3*/e a'ea: 'ela)*3e )o )he 'e,) oA
)he 5e,*2+: a, ,ho0+ *+ Table 5.J. Mo)e )ha) )he )able o+l9 /o3e', a ,*+2le 'e2*,)e': a+5
)he'e a'e )0o ,h*A) 'e2*,)e', *+ )he )o)al 5e,*2+. The N 'e2*,)e' *, a ,)a+5a'5 'e2*,)e'
0*)hou) ,h*A)*+2 Au+/)*o+al*)9.
5.1.$ %&e )**+u- %a/0e
The lookup )able /o+)a*+, )he 3alue, oA )he a'/ h9pe'bol*/ )a+2e+) +e/e,,a'9 Ao' ea/h
*)e'a)*o+ oA )he al2o'*)hB. P 7 p'o2'aB 0'*))e+ Ao' )ha) pu'po,e 2e+e'a)e, )he QRSL Ao'
)he )able. The p'o2'aB u,e, )he 7 Ba)h l*b'a'9 Au+/)*o+, *+ /o+Uu+/)*o+ 0*)h a A*.e5V
54
point conversion function to output a 1HDL file containing a description of the lookup
table in fixed-point notation.
"#e% &'a)lable ",)l)-a,)o/
0l)1e# 0 13,312 0.0%
0l)1e 2l)p42lop# 0 26,624 0.0%
L"T# 0 26,624 0.0%
789# 38 221 17.2%
9:&;# 1 32 3.1%
<=L># 1 8 12.5%
Table 5.5ABe')1e u,)l)-a,)o/ DoE ,he paEallel lookup ,able.
The presence of the lookup table has essentially zero impact on the area requirements
for the unit, since the entire table can be synthesized into a single LRAM, as shown in
Table 5.5.
5.1.$ Design Summar1
2)HuEe 5.IA0l)1e# u#e% bJ ,he 'aE)ou# 1oKpo/e/,# oD ,he paEallel =8:B7= u/),.
The entire design uses very little of the QPSA’s resources, and Qigure 5.6 shows that
much of the QPSA’s slices are used up by the X and Y shift registers. The large
multiplexers required by the design result in massive resource requirements. Chapter 6
! ! ""
#$e&e'(&!)'!)*(e$')(e!+e&,-'!(.)(!),/&!(0!12$(.e$!$e+23e!(.e!)$e)!$e42,$e/e'(&5!62(!)(!(.e!
30&(!01!&#ee+7!
8,(.!0'*9!:7;<!01!(.e!=#)$()'>?@&!&*,3e&!2&e+!69!(.,&!+e&,-'5!(.e$e!,&!&(,**!#*e'(9!01!
$00/!*e1(!10$!)!&9&(e/!&23.!)&!)!'e2$)*!'e(A0$B!(.)(!2&e&!(.e!CDEFGC!2',(7!G1!'e3e&&)$95!
(.e!$e&02$3e!$e42,$e/e'(&!302*+!6e!12$(.e$!$e+23e+!69!$e+23,'-!(.e!#$e3,&,0'!01!(.e!2',(!
H(0!IJ!6,(&5!10$!eK)/#*eL7!
M(!?I!,(e$)(,0'&!HA,(.!:!$e#e)(&L5!#*2&!)!NEOPDMF!393*e5!(.e!)*-0$,(./!A,**!()Be!?;!
3*03B!393*e&!(0!30/#*e(e7!M(!(.e!(.e0$e(,3)*!/)K,/2/!3*03B!1$e42e'39!01!Q"7R!STU5!(.e!
2',(!A,**!()Be!R7;"!!&!(0!30/#2(e!(.e!1,')*!V)*2e7!
5.2 $imulation Results
W.,&!&e3(,0'!#$e&e'(&!(.e!$e&2*(&!01!&,/2*)(,0'&!#e$10$/e+!2&,'-!(.e!+e&,-'!&#e3,1,e+!
,'!"7I7!W.e!+e&,-'!2&e&!?:!6,(&!(0!$e#$e&e'(!e)3.!A0$+5!A,(.!;!6,(&!10$!(.e!,'(e-e$!#)$(!01!
(.e!'2/6e$5!)'+!:X!6,(&!10$!(.e!1$)3(,0')*!#)$(!01!(.e!'2/6e$7!W.2&!(.e!V)*2e&!02(#2(!69!
(.e!Y5!Z5![5!)'+!(.e!)++e$!(.)(!30/#2(e+!(.e!eK#0'e'(,)*!12'3(,0'!)$e!e11e3(,Ve*9!&3)*e+!
2#!69!)!1)3(0$!01!::X7!W.e!&,/2*)(e+!3*03B!$2'&!)(!IRR!STU7!
5.2.1 $imulation 1: Computing e With CORDIC ;rowth
Error
W.e!1,$&(!&,/2*)(,0'!#e$10$/e+!+e/0'&($)(e&!(.e!e11e3(!01!(.e!CDEFGC!-),'!K"7!W0!
30/#2(e!(.e!eK#0'e'(,)*!12'3(,0'!e$5!(.e!Y!$e-,&(e$!/2&(!6e!,',(,)*,Ue+!A,(.!I5!(.e!Z!
$e-,&(e$!A,(.!R5!)'+!(.e![!$e-,&(e$!A,(.!(.e!+e&,$e+!)$-2/e'(!(0!(.e!eK#0'e'(,)*!12'3(,0'7!
W.,&!&,/2*)(,0'!)((e/#(&!(0!30/#2(e!(.e!V)*2e!01!e5!&0!)'!,',(,)*!V)*2e!01!I!,&!#*)3e+!,'(0!
(.e![!$e-,&(e$7!
5#
!"#$%& ()*+,&-$./- 01 2 304&.5"6 -"6$.2/"07 2//&68/"7# /0 9068$/& /:& ;2.$& 01 e) <- 2 %&-$./ 01
=>,?@= #%0A/:B /:& 1"72. 9068$/&4 ;2.$& "- "7299$%2/&)
$he results of this simulation are shown in 3igure 5.6. $he algorithm iterate7 for a
total of 89 times :in;lu7ing the <R>?OAD state an7 the two repeat iterations reDuire7 in
the hEperFoli; 7omainG. $he final heHa7e;imal Ialues in the J an7 K registers are
L9625NC5L# an7 P3Q262ARL# respe;tiIelE. Shen ;onIerte7 to 7e;imal an7 s;ale7
a;;or7inglET theE Fe;ome L.266QLP for the J register an7 P.Q688LP for the K register. $he
Ialue of the eHponential fun;tion is the sum of these two numFers: 2.25L2LPT whi;h
mat;hes the output of the > signal.
$he a;tual Ialue of e to four 7e;imal pla;es is 2.6LN8. $his simulation 7emonstrates
an error in ;omputation FE P.9#6L. $his is a result of the CORDIC magnitu7e error fa;tor
"i 7is;usse7 in 9.2 an7 illustrate7 in 3igure 9.2. In or7er to pro7u;e a;;urate resultsT the
J register nee7s to Fe preWs;ale7 FE the gain fa;tor ".
5.#.#!Simulation #. Computing e Compensating for
CORDIC :rowth
In or7er to ;ompute an a;;urate Ialue of eT the growth oFserIe7 in 9.2 must Fe
a;;ounte7 for. $o pro7u;e an uns;ale7 final IalueT the initial Ialue must Fe 7iIi7e7 FE
the growth fa;tor. $he formula use7 to ;ompute the amount of growth is shown in :9.28G.
A pre;ompute7 Ialue is shown in $aFle 9.L. 3or this simulation rather than initialiXing J
with LT J will Fe initialiXe7 with the heHa7e;imal Ialue L85L>N62L#T whi;h eDuals
L.2P65LP. K is again initialiXe7 with P an7 Y with L.
5#
!"#$%& ()*+,&-$./- 01 2 304&.5"6 -"6$.2/"07 8069$/"7# /:& ;2.$& 01 e) <= 2880$7/"7# 10% >?,@A>
#%0B/:C /:& 1"72. 8069$/&4 ;2.$& "- 288$%2/&)
$%& '&(ult( o- t%i( (imulation a'& (%o2n in 3igu'& 5.6. $%& -inal 7omput&9 :alu& o- e
i( output on t%& ; <u(. =t( %&>a9&7imal :alu& 2@#;152B1C 7on:&'t( to 2.#1631EF 2%i7%
mat7%&( t%& a7tual :alu& o- e. G%&n taH&n to 3E 9&7imal pla7&( t%& :alu& output in ; i(
2.#1626166EE211EEEEEEEEEEEEEEEEE1EF 2%i7% 9i--&'( -'om t%& '&al :alu& o- e <I onlI
E.EEEEEEE515C2E5E165E633EEE1E. $%& onlI 2aI -o' t%i( &''o' to <& '&9u7&9 2oul9 <& to
u(& mo'& p'&7i(ion in t%& '&gi(t&'(. Gi9&ning t%& '&gi(t&'(F o' (%i-ting t%& <ina'I point to
t%& l&-t 7an a7%i&:& t%i(. =n a99ition to allo2ing mo'& p'&7i(ion in t%& '&(ult(F it al(o
allo2( g'&at&' p'&7i(ion -o' t%& :alu&( (to'&9 in t%& a'7 %Ip&'<oli7 tang&nt looHup ta<l&.
!"#"$ %&'()*+&,-.$/.0,'1(+&-2.!34.
=t i( al(o impo'tant -o' t%& unit to 7omput& t%& :alu& o- t%& &>pon&ntial -un7tion -o'
n&gati:& a'gum&nt(. Sin7& t%& 9omain o- 7on:&'g&n7& -o' t%& algo'it%m i( (Imm&t'i7
a<out EF %al- o- t%& po((i<l& a'gum&nt( a'& n&gati:&. $%i( (imulation 7omput&( t%& :alu&
o- e–1
. L i( initialiM&9 2it% 1.2E#51E to &n(u'& an un(7al&9 '&(ultF an9 N i( initialiM&9 2it%
E. $%& O '&gi(t&' 7ontain( t%& a'gum&nt -o' t%& &>pon&ntial -un7tion an9 i( t%u( initialiM&9
to –11E P3EEEEEEE1C in 2Q( 7ompl&m&nt notationR.
!"#$%& ()D+,&-$./- 01 2 304&.5"6 -"6$.2/"07 8069$/"7# /:& ;2.$& 01 e
EF)
5#
$he simulation results presented in Figure 5.6 sho7 that the 89R;<8 unit is a=le to
handle the negative argument. $he final value in the @ register A05@C;5#81EF
corresponds to 0.HEI#I6JH5JC10, 7hich is of the same accuracL as the computation for
e1.
5.2.$ Simulation $/ Computing Outside the Domain of
Convergence
Ms discussed in J.E, the 89R;<8 algorithm does not converge on an accurate
ans7er for all input values. $a=le J.H sho7s the domains of convergence for all the
computation domains supported =L the 89R;<8 algorithm. For the rotation mode, 7hich
this design uses, the domain of convergence is a limiter on the magnitude of the initial
value of N. <f N 7ere initialiOed 7ith a value 7hose magnitude is greater than 1.11#1I10,
then the final computed value 7ould =e incorrect. For this simulation, the P register is
again initialiOed 7ith 1.C0I510, and Q is again initialiOed to 0, =ut N is initialiOed to 5, so
that the unit 7ill attempt to compute e5.
!"#$%& ()*+,-&.$/0. 12 3 415&/6"7 ."7$/30"18 917:$0"8# 0;& <3/$& 12 e
() 6"89& 0;& "8"0"3/ <3/$& ( ".
1$0."5& 0;& 5173"8 12 918<&%#&89&= 0;& 917:$0&5 <3/$& ". "891%%&90)
$he simulation results sho7n in Figure 5.10 demonstrate that the 89R;<8 unit is
una=le to compute e5. $he final value output =L @ is H.056H10, 7hich is no7here near the
correct value A1J#.J1HC10F. H.056H10 acts is an asLmptote for the computed values of @.
9nce the initial values of N move outside the domain of convergence, the computed @
value rapidlL approaches H.056H10, and remains at that value no matter ho7 large a value
"#
$ is initiali+e- .ith. 1 similar as4mptote e7ists 8or .hen $ is initiali+e- .ith 9al:es that
are too negati9e. In this =ase> the ? 9al:es approa=h @.AB6#1@.
!"#! $%&&'()
The parallel -esign is a straight8or.ar- implementation o8 the CGHDIC eJ:ations.
?9en .ith the large register si+es> the -esign has a small 8ootprint insi-e the SpartanLA.
Mith iterations lasting one =lo=N =4=le> the per8orman=e is =onsistent. Oer8orman=e =an Pe
impro9e- P4 :sing 8e.er iterations> .hi=h :s:all4 entails :sing a smaller .or- si+e. The
sim:lations -emonstrate a==eptaPle a==:ra=4> espe=iall4 gi9en the area reJ:irements 8or
the -esign. Qor s4stems that reJ:ire an e9en smaller 8ootprint> Chapter 6 presents a
smaller> slo.er PitLserial implementation that 8:rther aims to re-:=e area reJ:irements P4
:sing 1LPit a--erRs:Ptra=tors an- simple shi8t registers.
! ! "#
Chapter 6 *he +erial Implementation
$!%t'a)*ht,-'.a'/!)012303ntat)-n!-,!th3!567895!a2*-')th0!)%!/)%:;%%3/!)n!5ha1t3'!
<=!>h)23!that!/3%)*n!'3?;)'3/!@3'A!2)tt23!-,!th3!BCD$E%!'3%-;':3%F!th3'3!0)*ht!G3!%-03!
a112):at)-n%!.h3'3!,;'th3'!a'3a!'3/;:t)-n!)%!n3:3%%a'A=!9t!.a%!-G%3'@3/!)n!<=H=I!that!th3!
t.-!@a')aG23J.)/th!%h),t!'3*)%t3'%!a::-;nt3/!,-'!ha2,!th3!'3%-;':3!'3?;)'303nt%!,-'!th3!
/3%)*n=!Th)%!:ha1t3'!1'3%3nt%!an!a2t3'nat3!/3%)*n!that!;%3%!G)tJ%3')a2!a')th03t):!a2-n*!.)th!
-n3JG)t!a//3'%!t-!3L3:;t3!th3!567895!a2*-')th0=!9t!.)22!G3!%h-.n!that!th)%!/3%)*n!
'3?;)'3%!,a'!23%%!-,!th3!BCD$E%!'3%-;':3%F!G;t!.)22!'3?;)'3!0-'3!t)03!t-!3L3:;t3=!
6.1 Design
!
!
X
!
BITC&T
'()*
+()*
SUB'+
#
!
SUB..(
)*
S/'T
+Sign
'Sign
atanh()T
BITC&T
!!i#$r& 6.1+,lo/0 1i2#r23 4or t6& 7&ri2l i38l&3&nt2tion o4 t6& C;<=>C 2l#orit63.
$%!%h-.n!)n!B)*;'3!"=HF!th3!-@3'a22!th3!/3%)*n!'30a)n%!'32at)@32A!;n:han*3/!.h3n!
:-01a'3/!t-!th3!1a'a2232!/3%)*n!%h-.n!)n!B)*;'3!<=H=!$22!th3!:han*3%!.3'3!0a/3!t-!th3!
)n/)@)/;a2!:-01-n3nt%!-,!th3!/3%)*n=!Th3!XF!YF!an/!Z!'3*)%t3'%!a'3!n-.!%tan/a'/!-n3JG)t!
%h),t!'3*)%t3'%!.)th!1a'a2232!2-a/!an/!1a'a2232!-;t1;t!:a1aG)2)tA=!Th)%!P331%!th3)'!)nt3'na2!
2-*):!%)0123!an/!'3/;:3%!a'3a!'3?;)'303nt%=!Th3!a//3'Q%;Gt'a:t-'%!a'3!n-.!'3/;:3/!t-!
-13'at)n*!-n!%)n*23JG)t!-13'an/%F!.h):h!a2%-!'3/;:3%!th3)'!:-0123L)tA=!Th3!a//3'%!a2%-!
"#
$%&'()& ( *+),-*+%, '.(' /(01/ '.1 $(223 %4',4' /% '.(' )' $(& 51 (,,+)16 '% '.1 &17' ,()2 %*
%,12(&6/8 9.)/ (++%:/ '.1 ;<-5)' (66)')%& '% 51 ,12*%2=16 %&1 5)' (' ( ')=18
>($. )'12(')%& 21?4)21/ '.1 @ (&6 A $2%//-0(+41/ '% 51 /.)*'16 53 '.1 $4221&' )'12(')%&
$%4&'8 9.1 ':% =4+'),+1712/ (''($.16 '% '.1 @ (&6 A 21B)/'12/ (++%: '.1 $%221$' 5)' '% 51
/1+1$'16 *%2 '.1 $%221/,%&6)&B 21B)/'128 C)&$1 '.1 21/4+'/ %* '.1 (66)')%& %2 /45'2($')%& (21
*16 5($D (/ '.1 /12)(+ )&,4'/ '% '.1 21B)/'12/E (& (66)')%&(+ C>@9 $%&'2%+ /)B&(+ (&6 <F#
=4+'),+1712 )/ &11616 '% $.%%/1 1)'.12 '.1 5)' /1+1$'16 *2%= '.1 21B)/'12 %2 '.1
$%221/,%&6)&B /)B& 5)'8 G)'.%4' '.)/ /)B&(+E 5)'/ $%=,4'16 ,210)%4/+3 )& '.1 )'12(')%&
:%4+6 51 *16 )&'% '.1 (6612/8 9.1 /)B& %* 1($. 21B)/'12 )/ /(016 ,2)%2 '% '.1 171$4')%& %*
1($. )'12(')%&E (++%:)&B *%2 (& (2)'.=1')$ /.)*' %* '.1 0(+41/ 51)&B *16 )&'% '.1 (6612/8
H =4+'),+1712 :(/ (+/% 4/16 '% (++%: '.1 $%221$' 5)' '% 51 *16 *2%= '.1 (2$ .3,125%+)$
'(&B1&' +%%D4, '(5+1 '% '.1 I-(66128 9.1 =4+'),+1712 (++%:/ '.1 +%%D4, '(5+1 '% 51 ,+($16
)& JKHL (&6 ,2101&'/ (&3 (66)')%&(+ $+%$D $3$+1 61+(3/ *2%= 51)&B )&'2%64$16 )&'% '.1
61/)B&E (/ :%4+6 51 '.1 $(/1 )* '.1 0(+41 :121 +%(616 )&'% ( /.)*' 21B)/'128 H/ 51*%21E '.1
$%&'2%+ 4&)' ,2%0)61/ '.1 (6621// )&'% '.1 +%%D4, '(5+18
9.)/ 61/)B& 4/1/ ( ;<-5)' :%26 /)M1E :)'. N 5)'/ *%2 '.1 :.%+1 ,(2' %* '.1 &4=512E (&6
<O 5)'/ *%2 '.1 *2($')%&(+ ,(2' %* '.1 &4=5128
Used Available Utilization
Slices #;P #;E;#< #8QR
Slice 2lip42lops #N; <"E"<N Q8SR
LUTs <S" <"E"<N #8QR
IOBs #;< <<# SP8TR
BRAMs # ;< ;8#R
<CL>s # O #<8SR
Table 6.ABOverall device utilization for the serial design.
! ! "#
T%&!'&()*&!+,)-)./,)01!203!,%&!4&3)/-!'&4)51!)4!4%061!)1!T/7-&!"898!T%)4!'&4)51!+4&4!
01-:!9;<!02!,%&!=C;S9@AAB4!/(/)-/7-&!4-)*&4C!/4!0DD04&'!,0!,%&!;#<!3&E+)3&'!7:!,%&!
D/3/--&-!'&4)518!T%)4!)4!/!3&'+*,)01!02!@F8GH8!I1!/''),)01C!,%&!=)-)1J!4:1,%&4).&3!3&D03,&'!
,%/,!/!K/J)K+K!*-0*L!23&E+&1*:!02!9;M8G!MO.!*/1!7&!+4&'8!T%&!D/3/--&-!'&4)51!,0D4!0+,!
/,!F@8A!MO.8!T%&!4)KD-)*),:!02!,%)4!'&4)51!!3&4+-,4!)1!-&44!*0K7)1/,)01/-!-05)*!'&-/:!
6%)*%!)1!,+31!/--064!203!2/4,&3!*-0*L!4D&&'48!
!"1"1 C%&'(%) +&,'
T%&!D/3/--&-!'&4)51!/--06&'!&/*%!),&3/,)01!,0!7&!*0KD-&,&'!)1!01&!*-0*L!*:*-&8!T%)4!
3&'+*&'!,%&!*0KD-&J),:!02!,%&!*01,30-!+1),8!T%&!46),*%!,0!/!4&3)/-!'&4)51!K&/14!,%/,!,%&!
1+K7&3!02!*-0*L!*:*-&4!3&E+)3&'!,0!*0KD-&,&!&/*%!),&3/,)01!'&D&1'4!01!,%&!603'!4).&C!
4)1*&!/--!(/-+&4!/3&!*0KD+,&'!01&!7),!/,!/!,)K&8!P3&()0+4-:C!,%&!4+7,3/*,!*01,30-!4)51/-4!
,%/,!'&,&3K)1&!6%&,%&3!&/*%!/''&3!)4!/'')15!03!4+7,3/*,)15!*0+-'!7&!&(/-+/,&'!&/*%!*-0*L!
*:*-&8!I1!,%&!4&3)/-!'&4)51C!,%&4&!(/-+&4!1&&'!,0!7&!'&,&3K)1&'!01*&!/,!,%&!7&5)11)15!02!
,%&!),&3/,)01!/1'!4/(&'!+1,)-!,%&!*+33&1,!(/-+&!)4!2+--:!*0KD+,&'8!T0!/**0KD-)4%!,%)4C!/!
1&6!4,/,&!)4!/''&'!,0!,%&!4,/,&!K/*%)1&!02!,%&!*01,30-!+1),C!/4!4%061!)1!Q)5+3&!"8#8!
63
IDLE PRELOAD
SETUPCOMPUTE
ST#RT
%ST#RT
&'T()T + ,1#). S/#0T 1 ,1&'T()T 1 ,1
&'T()T + ,1#). S/#0T + ,1
!"#$%& ()*+,-.-& /".#%.0 12% -3& 4&%".5 "065&0&7-.-"27 21 -3& 89:;<8 .5#2%"-30)
The IDL* and ./*LO1D states remain unchanged. The IDL* state is active when
the algorithm is not being executed. The ./*LO1D state is used to load the initial values
into the X, Y, and Z registers. The S*TF. state is the new state and is only active the
very first clock cycle of each iteration. In the S*TF. state, the values in the X, Y, and Z
registers are evaluated to determine the subtract and sign signals. These signals are saved
into registers for use during the remaining clock cycles of the current iteration. The
JOM.FT* state is active until all 32 bits have been computed. Once this happens, IDL*
becomes the active state, otherwise if more iterations remain, the machine returns to the
S*TF. state to begin execution of the next iteration.
1 new internal control value is also added, named BITJOT. Whereas SH1MT holds
the current iteration count, BITJOT holds the number of bits that have been processed
within the current iteration. When this counter rolls over to zero, then the active iteration
is completed.
6#
! IDL%! P'%LO)D! S%TUP! CO.PUT%!
INIT! $ % $ $
DON%! % $ $ $
PLD! $ % $ $
S0I1T! $ $ % %
T2345!6789S:2:5!;<:=<:>!?;@!:A5!>5@B24!C;D:@;4!<DB:7!
To manage the new shift registers, a new SHIFT signal is also added. When asserted,
the SHIFT signal causes the shift registers to load the input bit into the MSB position,
and shift out the LSB. The old LD signal has been changed to cause a parallel load
operation in the shift registers. This allows execution of the algorithm to begin sooner.
Otherwise, without the parallel load, the initial values would need to be shifted into the
register. The values of these signals in each of the four states are shown in Table 6.H.
! U>5E! )F2B42345! U:B4BG2:B;D!
S4BC5>! 3J %3,3%H $.3K
S4BC5!14B=H14;=>! 33 H6,6H# $.%K
LUT>! 6L H6,6H# $.3K
IOI>! %H6 HH% 5L.$K
I').>! $ 3H $.$K
JCLK>! % J %H.5K
T2345!6739D5FBC5!<:B4BG2:B;D!?;@!:A5!>5@B24!C;D:@;4!<DB:7!
The device utiliNation for the control unit is shown in Table 6.3. The parallel control
unit uses 3O slices, and this control unit uses 3J slices, making the area reQuirements for
the two units essentially the same. This unit reQuired more slice flip-flops T33U than the
parallel unitVs HO flip-flops.
6.#.$ %he )hift -egisters
The W, X, and Y registers in the serial design are all identical shift registers with
parallel load and parallel output capability. In addition to the clock signal, the registers
have two control inputs, two data inputs, and two data outputs. The data inputs are
65
labeled as S+I- and /. S+I- is the bit to be shifted in to the MSB position of the register
and / is the parallel data input. The t=o control signals are /LD and SAIFT. Chen
SAIFT is assertedD then the contents of the register shift to the right everF clock cFcleD
=ith S+I- becoming the ne= MSB. Chen /+LD is assertedD then the parallel data in / is
loaded into the register at the next rising edge of the clock.
!L#
!31
&'()
&'*+T
!L#
!3-
!L#
!-
!'*+T31 !'*+T3- !'*+T-
.L/ .L/ .L/
!"#$%& ()*+,&-"#. /0 12& -&%"34 -2"01 %&#"-1&%-)
The basic design is sho=n in Figure 6.3. Chen /LD is not assertedD then the data
inputs for each flip-flop are the data outputs of the adLacent flip-flip to the left. Chen
/LD is assertedD then the bits in / become the inputs for each of the flip-flops.
-ot sho=n in the figure are the clock enable inputs for the flip-flops. After the
algorithm is completeD the values should be held constantN that isD neither shifting nor
loading. The SAIFT and /LD inputs are connected through an OR gate to the clock
enable inputs of each of the flip-flops. As a resultD the register =ill onlF shift if SAIFT is
asserted and =ill onlF do a parallel load if /LD is asserted. In anF caseD the parallel
outputs are available as the /+OUT signalD and serial output is available as the S+OUT
signalD =hich al=aFs corresponds to the LSB of the register.
66
"sed A'aila+le ",ili-a,i./
Sli1es 2$ 1&,&12 (.2*
Sli1e 2li342l.3s &2 26,624 (.1*
5"Ts 4$ 26,624 (.2*
789s 44 221 1$.$*
9:A;s ( &2 (.(*
<=5>s 1 8 12.-*
Ta+le 6.4BCe'i1e D,ili-a,i./ E.r a si/Gle serial shiE, reGis,er Ii/1lDdi/G JKL1 NDl,i3leOerP.
Since this design does not have the large amount of multiplexers required by the
variable shift register described in -.1.2, the resource requirements of this register are
significantly reduced. The device utiliFation for one shift register is shown in Table 6.4.
The table includes the bit-select multiplexer shown in Figure 6.1 that is used to perform
the bit-shift operation. Only 2$ slices are required for this design, compared to $( for the
parallel shift register. The parallel shift registers are each 21(* larger than the serial
registers. Since the parallel shift registers were used twice in the parallel design, this area
reduction is the key to the small area requirements of the entire serial design.
!"#"$!%&' )**+,- %./0'
The lookup table remains essentially unchanged. In the serial design, a multiplexer is
added to choose the individual bits to be fed into the serial adder.
"sed A'aila+le ",ili-a,i./
Sli1es 8 1&,&12 (.1*
Sli1e 2li342l.3s ( 26,624 (.(*
5"Ts 16 26,624 (.1*
789s 12 221 -.4*
9:A;s 1 &2 &.1*
<=5>s 1 8 12.-*
Ta+le 6.QBCe'i1e D,ili-a,i./ E.r ,he serial l..RD3 ,a+le.
The parallel lookup table did not require any slices. The entire design was
implemented in one BROM. Since the serial design requires a &2:1 multiplexer, this
! ! "#
i%&'(%(n*+*i,n!,-!*.(!',,/0&!*+1'(!2(30i2(4!5!4'i6(47!8.i4!4%+''!in62(+4(!in!+2(+!i4!%,2(!
*.+n!,--4(*!19!*.(!4+:in;4!in!*.(!2(;i4*(24!+n<!+<<(247!
!"#"$ %&' S'*i,- .//'*0
!
!
"
!in
C! $%&'Cin
A
B
SUB
S
!$u&
S)*
S
CLK!
!"#$%& ()*+,&-"#. /0 12& -&%"34 355&%)
8.(!+<<(24!*.+*!6,%&0*(!*.(!=>!?>!+n<!@!:+'0(4!in!*.(!4(2i+'!CBCDEC!0ni*!+2(!
4i%&'(!,n(F1i*!+<<(2G401*2+6*,24!Hi*.!+<<i*i,n+'!.+2<H+2(!*,!40&&,2*!4(2i+'!+2i*.%(*i67!
8.(!<(4i;n!,-!*.(4(!+<<(24!i4!4.,Hn!in!Ii;02(!"7J7!8.(!+<<i*i,n+'!.+2<H+2(!i4!n(6(44+29!
*,!4+:(!*.(!6+229!:+'0(!-2,%!,n(!4*+;(!+n<!*.(n!*,!-((<!i*!*,!*.(!n(K*7!Wi*.,0*!*.i4!',;i6>!
*.(!-in+'!40%!H,0'<!1(!in6,22(6*!4in6(!*.(2(!H,0'<!1(!n,!6+229!&2,&+;+*i,n7!
8.(!C,0*!,0*&0*!,-!*.(!+<<(2!i4!6,nn(6*(<!*,!*.(!in&0*!,-!+!D!-'i&F-',&7!8.(!<+*+!
,0*&0*!,-!*.(!-'i&F-',&!i4!in!*02n!-(<!in*,!+!MN1!%0'*i&'(K(27!W.(n!+!401*2+6*i,n!i4!1(in;!
&(2-,2%(<>!i*!i4!n(6(44+29!*,!.+:(!+n!ini*i+'!6+229!,-!1!*,!+'',H!-,2!*.(!M4!6,%&'(%(n*!,-!
*.(!B!in&0*7!8.(!NEW_Cin!4i;n+'!i4!;(n(2+*(<!19!*.(!6,n*2,'!0ni*!+n<!i4!+44(2*(<!,n'9!in!
*.(!SE8UV!4*+*(!+n<!H.(n!+!401*2+6*i,n!i4!6+''(<!-,27!En!*.(!CBWVU8E!4*+*(>!NEW_Cin!
i4!n,*!+44(2*(<>!+'',Hin;!*.(!6+229!,0*!,-!*.(!&2(:i,04!4*+;(!*,!1(!&+44(<!*.2,0;.!*,!*.(!
6+229!in!,-!*.(!6022(n*!4*+;(7!
68
The adder that computes the exponential value is a 32-bit parallel adder that uses the
parallel outputs of the ; and < registers.
"#r%&' (&r&''#' )*&%'&+'# "#r%&' ,-%'%.&-%/n
"'%1#2 4 16 13,312 0.0%
"'%1# 3'%453'/42 1 0 26,624 0.0%
6,72 7 32 26,624 0.0%
89:2 7 97 221 3.2%
:;)<2 0 0 32 0.0%
=>6?2 1 1 8 12.5%
7&+'# @A@BC#*%1# D-%'%.&-%/n E/r -F# 2#r%&' &nG 4&r&''#' &GG#r2A
As the design would imply, the resource requirements for this component are quite
small. Table 6.6’s serial column shows that only 4 slices are required to implement this
design. The parallel column shows the device utilization for the adder that computes the
exponential function.
!"#"$!%&'()* ,-../01
3%HDr# @AIB"'%1#2 D2#G +J -F# *&r%/D2 1/K4/n#n-2 /E -F# 2#r%&' >9;C8> Dn%-A
This design is even more efficient at using the FPOA’s resources, requiring much
less of the chip’s area than the parallel design. As Figure 6.5 shows, the distribution of
! ! "#
$%&!'%()*!+&*,-+'&*!$,!$%&!',.),/&/$*!(*!0&+1!2343/'&56!7($%!3!/&3+!&0&/!2343/'&!2&$7&&/!
$%&!8!+&9(*$&+*!3/5!$%&!',/$+,4!-/($:!;/'4-5&5!(/!$%&!<=$%&+>!'3$&9,+1!3+&!$%&!$%+&&!*&+(34!
355&+*6!$%&!8?@2($!355&+6!3/5!$%&!3+'!%1)&+2,4('!$3/9&/$!4,,A-)!$324&:!B($%!,/41!1D!,E!$%&!
F)3+$3/@8G*!*4('&*!-*&5!(/!$%(*!5&*(9/6!&0&/!43+9&+!3))4('3$(,/*!'3/!2&!*-)),+$&5:!H%&!
*&+(34!/3$-+&!,E!$%&!5&*(9/!.&3/*!$%3$!&I$+3!)+&'(*(,/!'3/!2&!355&5!$,!$%&!*1*$&.!7($%,-$!
%30(/9!3!43+9&!(/'+&3*&!(/!3+&3:!
B%(4&!$%&!3+&3!+&J-(+&.&/$*!E,+!$%&!5&*(9/!3+&!*.3446!$%&!$(.(/9!+&J-(+&.&/$*!3+&!
/,$:!K,+!3!*1*$&.!-*(/9!n!($&+3$(,/*!L3/5!r +&)&3$*M!7($%!3!7,+5!*(N&!,E!$!2($*6!$%&/!$%&!
$,$34!&I&'-$(,/!$(.&!,E!$%&!-/($!7(44!2&! ! "" ## $"1!'4,'A!'1'4&*:!O3'%!($&+3$(,/!7(44!
+&J-(+&!$!'1'4&*!$,!',.)4&$&!3*!$%&!034-&*!3+&!)3**&5!*&+(3441!$%+,-9%!$%&!355&+*6!3/5!n P!
r!$,$34!($&+3$(,/*!3+&!',.)4&$&5!3E$&+!$%&!349,+($%.!E(/(*%&*:!H%&!QROS=TU!*$3$&!,E!$%&!
',/$+,4!-/($!355*!3/,$%&+!'4,'A!'1'4&!,E!&I&'-$(,/!$(.&:!H%&!355&5!'4,'A!'1'4&*!3+&!
)3+$(3441!,EE*&$!21!3/!(/'+&3*&!(/!'4,'A!E+&J-&/'1:!H%&!4(/&3+!',/0&+9&/'&!,E!$%&!V=RU;V!
349,+($%.!.&3/*!$%3$!$%&!/-.2&+!,E!($&+3$(,/*!*%,-45!2&!'4,*&!$,!$%&!/-.2&+!,E!2($*!(/!$%&!
+&9(*$&+*:!T!%340(/9!,E!$%&!7,+5!*(N&!L3/5!3!',++&*),/5(/9!%340(/9!,E!$%&!($&+3$(,/!',-/$M!
7(44!+&5-'&!$%&!&I&'-$(,/!$(.&!21!3!E3'$,+!,E!W:!
H%(*!5&*(9/6!7%('%!)&+E,+.*!81!($&+3$(,/*6!?!+&)&3$&56!7($%!3!7,+5!*(N&!,E!8?!2($*!
7(44!$3A&!16XYZ!'4,'A!'1'4&*!$,!',.)4&$&:!T$!$%&!$%&,+&$('34!.3I(.-.!'4,'A!+3$&!,E!18W:[!
\]N6!$%&!-/($!7(44!$3A&!Z:[W!!*!$,!',.)-$&!$%&!E(/34!034-&:!
!"#! $%&'()*%+, ./0'(*0
H%(*!*&'$(,/!)+&*&/$*!$%&!+&*-4$*!,E!*(.-43$(,/*!)&+E,+.&5!-*(/9!$%&!5&*(9/!*)&'(E(&5!
(/!":1:!H%&!5&*(9/!-*&*!8?!2($*!$,!+&)+&*&/$!&3'%!7,+56!7($%!W!2($*!E,+!$%&!(/$&9&+!),+$(,/6!
3/5!?[!2($*!E,+!$%&!E+3'$(,/34!),+$(,/!,E!$%&!034-&:!H%&!*3.&!*(.-43$(,/*!3+&!)&+E,+.&5!
! ! "#
$e&e!'(!'&e!)*ne!,n!-./0!,n!*&)e&!1*!)e2*n(1&'1e!1$'1!1$,(!3n,1!4&*)35e(!,)en1,5'6!&e(361(0!
731!'1!1$e!5*(1!*8!')),1,*n'6!56*59!5:56e(.!
6.#.$!Simulation $/ Computing e 3ith CORDIC 9rowth
<rror
;$e!8,&(1!(,236'1,*n!4e&8*&2e)!)e2*n(1&'1e(!1$e!e88e51!*8!1$e!<=>?@<!A',n!!n.!;*!
5*2431e!1$e!eB4*nen1,'6!83n51,*n!#$0!1$e!C!&eA,(1e&!23(1!7e!,n,1,'6,ze)!E,1$!F0!1$e!G!
&eA,(1e&!E,1$!#0!'n)!1$e!H!&eA,(1e&!E,1$!1$e!)e(,&e)!'&A32en1!1*!1$e!eB4*nen1,'6!83n51,*n.!
;$,(!(,236'1,*n!'11e241(!1*!5*2431e!1$e!I'63e!*8!#0!(*!'n!,n,1,'6!I'63e!*8!F!,(!46'5e)!,n1*!
1$e!H!&eA,(1e&.!
!!igure 6.6*Results of a 2o3elSim simulation of the serial CORDIC unit attempting to =ompute the
>alue of !. ?s a result of CORDIC growthA the final =ompute3 >alue is ina==urate.
;$e!&e(361(!*8!1$,(!(,236'1,*n!'&e!($*En!,n!J,A3&e!K.K.!;$e!'6A*&,1$2!,1e&'1e)!8*&!'!
1*1'6!*8!LM!1,2e(!N,n563),nA!1$e!1E*!&e4e'1!,1e&'1,*n(!&eO3,&e)!,n!1$e!$:4e&7*6,5!)*2',nP.!
;$e!8,n'6!$eB')e5,2'6!I'63e(!,n!1$e!C!'n)!G!&eA,(1e&(!'&e!FM"/-Q<-FK!'n)!#JR/"/STFK!
&e(4e51,Ie6:.!U$en!5*nIe&1e)!1*!)e5,2'6!'n)!(5'6e)!'55*&),nA6:0!1$e:!7e5*2e!F./""R!8*&!
1$e!C!&eA,(1e&!'n)!#.R"LL!8*&!1$e!G!&eA,(1e&.!;$e!I'63e!*8!1$e!eB4*nen1,'6!83n51,*n!,(!1$e!
(32!*8!1$e(e!1E*!n327e&(V!/./-F/0!5*&&e(4*n),nA!1*!1$e!I'63e!*8!1$e!W!(,An'6.!
! ! "#
$%e'e!(e')*+'!,-((e'.-/0!.(e,1'e*2!+-!+%e!34*)e'!.(-0),e0!1/!5676#6!8%e(e4'!+%e!
.4(4**e*!)/1+!,-9.*e+e'!1/!)/0e(!:!!';!+%e!'e(14*!)/1+!+4<e'!=)'+!-3e(!7>>!!'!+-!.(-0),e!+%e!
?1/4*!34*)e6!
6.#.#!Simulation #. Computing e Compensating for
CORDIC :rowth
@/!-(0e(!+-!,-9.)+e!4/!4,,)(4+e!34*)e!-?!e;!+%e!A(-B+%!-C'e(3e0!1/!D67!9)'+!Ce!
4,,-)/+e0!?-(6!$-!.(-0),e!4/!)/',4*e0!?1/4*!34*)e;!+%e!1/1+14*!34*)e!9)'+!Ce!01310e0!C2!
+%e!A(-B+%!?4,+-(6!$%e!?-(9)*4!)'e0!+-!,-9.)+e!+%e!49-)/+!-?!A(-B+%!1'!'%-B/!1/!ED67FG6!
H!.(e,-9.)+e0!34*)e!1'!'%-B/!1/!$4C*e!D6#6!I-(!+%1'!'19)*4+1-/;!(4+%e(!+%4/!1/1+14*1J1/A!K!
B1+%!#;!K!B1**!Ce!1/1+14*1Je0!B1+%!+%e!%eL40e,194*!34*)e!#F5#M:"7#N;!B%1,%!eO)4*'!
#67>"5#>6!P!1'!4A41/!1/1+14*1Je0!B1+%!>!4/0!Q!B1+%!##>6!
!!"#$%& ()*+,&-$./- 01 2 304&.5"6 -"6$.2/"07 01 /8& -&%"2. 9:,;<9 $7"/ =06>$/"7# /8& ?2.$& 01 e)
@A 2==0$7/"7# 10% 9:,;<9 #%0B/8C /8& 1"72. =06>$/&4 ?2.$& "- 2==$%2/&)
$%e!(e')*+'!-?!+%1'!'19)*4+1-/!4(e!'%-B/!1/!I1A)(e!N6"6!$%e!?1/4*!,-9.)+e0!34*)e!-?!e!
1'!-)+.)+!-/!+%e!M!C)'6!@+'!%eL40e,194*!34*)e!7R"M#57D#N!,-/3e(+'!+-!76"#:F#>;!B%1,%!
94+,%e'!+%e!4,+)4*!34*)e!-?!e6!8%e/!+4<e/!+-!F>!0e,194*!.*4,e'!+%e!34*)e!-)+.)+!1/!M!1'!
76"#:7:#::>>7##>>>>>>>>>>>>>>>>>;!B%1,%!01??e('!?(-9!+%e!(e4*!34*)e!-?!e!C2!-/*2!
>6>>>>>>>5#5N7>5>#:5>:FF>>>6!$%e'e!34*)e'!4(e!+%e!'49e!4'!+%e!34*)e'!.(-0),e0!C2!
+%e!.4(4**e*!)/1+6!H'!Ce?-(e;!+%e!-/*2!B42!+-!1/,(e4'e!+%e!.(e,1'1-/!1'!+-!1/,(e4'e!+%e!B-(0!
! ! "#
$%&e!()*!%+e,(+%-)!.-/)+0!12%.2!%$!e($%e,!+-!(.2%e3e!%)!+2e!$e,%(4!*e$%5)0!6/+!.-7e$!(+!+2e!
.-$+!-8!4-)5e,!e9e./+%-)!+%7e:!
6.#.$!Simulation $/ Computing e–1
;+!%$!(4$-!%7<-,+()+!8-,!+2e!/)%+!+-!.-7</+e!+2e!3(4/e!-8!+2e!e9<-)e)+%(4!8/).+%-)!8-,!
)e5(+%3e!(,5/7e)+$:!=%).e!+2e!*-7(%)!-8!.-)3e,5e).e!8-,!+2e!(45-,%+27!%$!$>77e+,%.!
(6-/+!?0!2(48!-8!+2e!<-$$%64e!(,5/7e)+$!(,e!)e5(+%3e:!@2%$!$%7/4(+%-)!.-7</+e$!+2e!3(4/e!
-8!eAB:!C!%$!%)%+%(4%&e*!1%+2!B:#?"DB?!+-!e)$/,e!()!/)$.(4e*!,e$/4+0!()*!E!%$!%)%+%(4%&e*!1%+2!
?:!@2e!F!,e5%$+e,!.-)+(%)$!+2e!(,5/7e)+!8-,!+2e!e9<-)e)+%(4!8/).+%-)!()*!%$!+2/$!%)%+%(4%&e*!
+-!AB!GH???????BI!%)!#J$!.-7<4e7e)+!)-+(+%-)K:!
!
Figure 6.*—Results of a ModelSim simulation of the serial CORDIC unit computing the value of e@A
.
@2e!$%7/4(+%-)!,e$/4+$!<,e$e)+e*!%)!H%5/,e!I:L!$2-1!+2(+!+2e!MNOP;M!/)%+!%$!(64e!+-!
2()*4e!+2e!)e5(+%3e!(,5/7e)+:!@2e!8%)(4!3(4/e!-8!+2e!Q!$%5)(4!G?DQ#PDLMBIK!.-,,e$<-)*$!
+-!?:RI"L"STRDT#B?0!12%.2!%$!-8!+2e!$(7e!(../,(.>!($!+2e!.-7</+(+%-)!8-,!eB:!
6.#.5!Simulation 5/ Computing Outside the Domain of
Convergence
U$!*%$./$$e*!%)!T:I0!+2e!MNOP;M!(45-,%+27!*-e$!)-+!.-)3e,5e!-)!()!(../,(+e!
()$1e,!8-,!(44!%)</+!3(4/e$:!@(64e!T:R!$2-1$!+2e!*-7(%)$!-8!.-)3e,5e).e!8-,!(44!+2e!
"3
$%m'u)a)+%, -%ma+,. .u''%r)0- 1y )30 456784 a9:%r+)3m. <%r )30 r%)a)+%, m%-0= >3+$3
)3+. -0.+:, u.0.= )30 -%ma+, %? $%,@0r:0,$0 +. a 9+m+)0r %, )30 ma:,+)u-0 %? )30 +,+)+a9
@a9u0 %? A. 8? A >0r0 +,+)+a9+B0- >+)3 a @a9u0 >3%.0 ma:,+)u-0 +. :r0a)0r )3a, C.CCDC"=
)30, )30 ?+,a9 $%m'u)0- @a9u0 >%u9- 10 +,$%rr0$). <%r )3+. .+mu9a)+%,= )30 E r0:+.)0r +.
a:a+, +,+)+a9+B0- >+)3 C.FG"HCG= a,- I +. a:a+, +,+)+a9+B0- )% G= 1u) A +. +,+)+a9+B0- )% HCG=
.% )3a) )30 u,+) >+99 a))0m') )% $%m'u)0 !H.
Figure (.*+,esults of a 3odel5im simulation of the serial C:,;<C unit computing the value of e
5.
5ince the initial value 5 is outside the domain of convergence, the computed value is incorrect.
J30 .+mu9a)+%, r0.u9). .3%>, +, <+:ur0 6.L -0m%,.)ra)0 )3a) )30 456784 u,+) +.
u,a190 )% $%m'u)0 !H. J30 ?+,a9 @a9u0 %u)'u) 1y M +. 3.GHL3CG= >3+$3 +. ,%>30r0 ,0ar )30
$%rr0$) @a9u0 NCOD.OC3FP. 3.GHL3 a$). +. a, a.ym')%)0 ?%r )30 $%m'u)0- @a9u0. %? M. 5,$0
)30 +,+)+a9 @a9u0. %? A m%@0 %u).+-0 )30 -%ma+, %? $%,@0r:0,$0= )30 $%m'u)0- M @a9u0
ra'+-9y a''r%a$30. 3.GHL3= a,- r0ma+,. a) )3a) @a9u0 ,% ma))0r 3%> 9ar:0 a @a9u0 A +.
+,+)+a9+B0- >+)3. Q .+m+9ar a.ym')%)0 0R+.). ?%r >30, A +. +,+)+a9+B0- >+)3 @a9u0. )3a) ar0
)%% ,0:a)+@0. 8, )3+. $a.0= )30 M @a9u0. a''r%a$3 G.3F6L.
!"# $u&&ar)
<%r 9ar:0 a''9+$a)+%,. >30r0 <STQ r0.%ur$0. ar0 .$ar$0= )30 .0r+a9 456784 u,+)
-0.$r+10- +, )3+. $3a')0r %??0r. ma,y a-@a,)a:0. %@0r )30 'ara9909 u,+) -0.$r+10- +,
74
Chapter 5. -y operating only on single bits in a serial fashion, the complexity of the shift
registers is reduced significantly. The size of the adders used in the design is also reduced
since they only need to handle one bit at a time. Precision and accuracy are not sacrificed
to achieve the reduction in resource utilization. However, the added execution time may
prevent this design from being used in more time-sensitive applications. For those, the
parallel design or a faster table-based approach would be better suited.
7#
!"#$%&' ) !*+,-./0*+ #+1 2.%.'& 3*'4
$rti(i)ia+ ne.ra+ net/or1s ha4e the potentia+ to 6e)ome in4a+.a6+e too+s (or s8stems
9esigners +oo1ing to imp+ement arti(i)ia+ inte++igen)e into their /or1. <he a6i+it8 o( these
net/or1s to 6e traine9 (or a spe)i(i) pro6+em an9 to operate a.tonomo.s+8 opens .p a
9oor to ne/ /a8s o( 9ata ana+8sis an9 other app+i)ations /here it is 9i((i).+t to 9esign
straight(or/ar9 a+gorithms to pro9.)e the 9esire9 res.+ts.
=>?$s are +i1e/ise 6e)oming in)reasing+8 .se(.+ (or 9esigners. <he a6i+it8 o( the
)hips to 6e reprogramme9 spee9s protot8ping an9 simp+i(ies the 9esign pro)ess (or +arger
s8stems. <he training pro)e9.re (or arti(i)ia+ ne.ra+ net/or1s [email protected] )hanges to interna+
/eights o( the net/or1. <he reprogramma6i+it8 o( the =>?$ means that these )hanges
)an 6e ma9e easi+8. <he a6i+it8 to p+a)e an arti(i)ia+ ne.ra+ net/or1 in an em6e99e9
en4ironment /ith rea+A/or+9 inp.ts (.rther eBpan9s the n.m6er o( app+i)ations a4ai+a6+e.
>re4io.s+8C it /as 4er8 9i((i).+t to 6.i+9 a ne.ra+ net/or1 in an =>?$ 6e)a.se o( the
9i((i).+t8 o( (itting the arithmeti) har9/are on the same )hip as the net/or1.
De.ra+ net/or1s [email protected] )omp+eB mathemati)a+ s.pport in or9er to (a)i+itate the
)omp.tation o( the s.mmationC /eightingC an9 trans(er (.n)tions. Eost arithmeti)
a+gorithms are optimiFe9 (or spee9 an9 [email protected] +arge amo.nts o( )hip area /hen
imp+emente9. De.ra+ net/or1s are inherent+8 para++e+ )omp.ter s8stems an9 in or9er to
.se a +arge arithmeti) .nitC a++ the )omp.tations /o.+9 ha4e to 6e pipe+ine9 thro.gh one
or se4era+ o( these .nits. <his /o.+9 +essen the para++e+ism o( the net/or1C /hi)h a+so
res.+ts in +onger )omp.tation times.
7#
The '()*+' implementations presented are compact enough that they potentially
could be duplicated in the F=>A, allowing each neuron to have its own '()*+' unit.
This maintains the parallelism of the network. Two alternate designs of '()*+' units
have been presented, a parallel design and a much smaller serial design that takes longer
to execute. Either of these could be used in a neural network at the neuron level.
7.# $omparing the 1esigns
The parallel design fits in 32J of the KpartanL3Ms slices and can run at a theoretical
maximum clock frequency of 7O PQz. The serial design is much smaller, needing only
13J slices, and can also run at almost twice the clock frequencyT 13U.V PQz.
Figure 7.1+,lices required by the various components of the two CORDIC designs.
The parallel design requires a large amount of 32T1 multiplexers to accomplish the
shifting. This is necessary because the amount by which the operands are shifted
77
increases each iteration of the algorith0. When switching to the serial a44roach5 these
0ulti4le7ers are eli0inate8 resulting in significant resource sa9ings. As seen in ;igure
7.15 it is this sa9ings that accounts for 0ost of the efficienc= of the 8esign.
>he effect of wor8 si?e on the area re@uire0ent of the 8esigns can Ae seen in ;igure
7.2. >he 4arallel 8esign alwa=s re@uires 0ore area than the serial 8esign5 an8 the nu0Aer
of slices re@uire8 for the 4arallel 8esign increases 8ra0aticall= with each increase of the
wor8 si?e. >he area re@uire0ent of the serial 8esign increases roughl= linearl=5 while the
4arallel 8esign follows a shar4 e74onential cur9e.
>he affect wor8 si?e has on e7ecution ti0e is shown in ;igure 7.C. >he unit is
assu0e8 to Ae o4erating at the 0a7i0u0 clock fre@uenc=5 as re4orte8 A= the Eilin7
s=nthesi?er. >his 9alue is 8ifferent for Aoth the 4arallel an8 serial 8esigns. >he 0a7i0u0
clock fre@uenc= also 8ecreases as wor8 si?e increases5 which further lengthens e7ecution
ti0e with larger wor8 si?es. >he serial 8esign is se9erel= i04lacte8 A= the increase in
wor8 si?e. F7ecution ti0e increases e74onentiall= with wor8 si?e5 while e7ecution ti0e
for the 4arallel 8esign follows a roughl= linear track.
"#
Figure 7.2+The effect of word size on the F78A chip area re<uirements of the CORDIC unit.
Figure 7.C+The effect of word size on the execution time of the CORDIC unit.
$% &'( )* %**( )+ &,-.'/0(1 23* 24, 5016/*%7 23* '/*' '(8 20-0(1 /*960/*-*(2% 5,/
23*%* 8*%01(% '/* 0(:*/%*;+ ./,.,/20,(';< =30% 0% 2/6* 5,/ -,%2 8*%01(%< >( 1*(*/';7 43*(
8*%01( &3'(1*% '/* -'8* 23'2 /*86&* 23* 20-0(1 /*960/*-*(2% 5,/ 23* %+%2*-7 23*%* %'-*
&3'(1*% /*%6;2 0( '( 0(&/*'%*8 '/*' /*960/*-*(2< =3* -,:* 2, )02?%*/0'; '/023-*20&7 430&3
7#
optimized the algorithm for area, requires each iteration of the algorithm to take longer.
Each bit in the operands will use one clock cycle to compute the new value. For these
designs, which use the word sizes of 32 bits and 31 iterations Cwith 2 repeatsD, the 3E
clock cycles that the paralle implementation requires are increased to 1,FG7 cycles. This
is an increase of 3FFFI. This is offset somewhat by the increase in clock frequency, but
the serial implementation still takes significantly longer to complete.
For applications that can tolerate less precision, further savings in both area and
computation time can be achieved. Jeural networks in particular are probabilistic in
nature and may not need 32 bits of precision. Kecreasing the word size to 1L bits will
drastically reduce the area requirement for the parallel design, since its area requirement
decreases exponentially with word size. The area requirement of the serial design
depends only linearly on the word size, so the area savings would not be as significant.
The execution time, however, would decrease exponentially.
Noth implementations also have complex control units that are designed with
flexibility in mind. Ot is possible that the clock frequency for both designs could be
increased if the control unit could be simplified for a specific application.
7.# Integration .ith Artificial 4eural 4etwor7s
7.#.9 :;<=I: Artificial 4euron
This section discusses the design of a basic artificial neuron that uses the PQRKOP
unit to compute its transfer function. Figure 7.E shows the design of this neuron. This
sample design uses E inputs, but the design can be scaled to accommodate any number of
inputs.
"0
!"#$
!"#$
!"#%
!"#&
!"#%
!"#&
!"#'
'
(
(
(
!"#)
'
'
'
!"#$%!
()#*
()#+
()#,
-"./"
/0-0"
123
0
45)0
*+,
-./.0
/01-0
"21+
34".
!"##$%&'()*"(+%&'(
%,$(!*-,)*"(+%&'(
&(."%)/-&01&(0
Figure 7.*—Bloc0 diagram of a basic artificial neuron utili9ing the CORDIC unit to compute the
transfer function ex.
The 4 inputs are weighted by constants that are stored internally in the neuron. 9
separate multiplier computes the weighted value for each input in parallel. 9s per the
artificial neuron model shown in Figure 2.1, these weighted values are then passed
through a summation function. Three adders are cascaded to add the four values together.
The CORDIC unit then computes the transfer function. The output of the final adder is
used as the argument to the function. In this case, since the CORDIC unit has been
designed to compute ex, the sum is connected to the INGH input. INGI is initialized to 0,
and INGK is initialized to 1.2075 to ensure an unscaled result. The NT9RT and RENET
signals may be provided externally or may be hardQwired to high or low. The E output of
the CORDIC unit then becomes the output of the neuron (NS9T) and may be connected
to any number of neurons in the next layer of the network.
The multipliers can be of any design, but since all inputs use a fixed point, the output
must be shifted accordingly. The design used here has 2" bits reserved for the fractional
! ! "#
$%&'%(e(*!%+!*,e!-./0e12!1%!*,e!+3(./!'4%50$*!&01*!6e!1,3+*e5!*%!*,e!437,*!8"!63*1!*%!e(104e!
*,.*!3*!31!3(!*,e!+%4&.*!e9'e$*e5!6:!*,e!;<=>?;!0(3*@!
! "#$#%%e%! 'e$(#%!
'%()e*! AA#!BC@DEF! GAH!BH@CEF!
'%()e!+%(,-+%.,*! #8H!BD@CEF! #G#!BD@CEF!
/01*! #28IH!BG@"EF! "JH!BH@GEF!
234*! #AH!BIH@"EF! #AH!BIH@"EF!
4567*! #!BH@#EF! #!BH@#EF!
89!89!7:%;(,%(e$*! #A!BCD@DEF! #A!BCD@DEF!
<=/>*! #!B#8@CEF! #!B#8@CEF!
1#?%e!@A8BCeD()e!$e*.:$)e!$eE:($eFeG;*!H.$!;Ie!#$;(H()(#%!Ge:$.G!Je*(KG*!:*(GK!?.;I!;Ie!,#$#%%e%!#GJ!
*e$(#%!=35C2=!:G(;*A!
K,e!4e1%04$e!4eL034e&e(*1!%+!*,31!5e137(!.4e!1,%M(!3(!K.6/e!I@#@!K,e!5e137(!M.1!
1:(*,e13Ne5!013(7!6%*,!*,e!'.4.//e/!.(5!1e43./!;<=>?;!0(3*1@!K,e!(e04%(!M3*,!*,e!'.4.//e/!
;<=>?;!0(3*!4eL034e1!AA#!%+!*,e!O'.4*.(P1!1/3$e12!M,3/e!*,e!1e43./!5e137(!%(/:!(ee51!GAH@!
K,e!.55e5!.4e.!4eL034e&e(*!+%4!6%*,!5e137(1!$%&e1!+4%&!*,e!H!.55e41!+%4!*,e!10&&.*3%(!
+0($*3%(!.(5!*,e!.553*3%(./!4%0*3(7!/%73$!4eL034e5!*%!3(*e4$%((e$*!*,e!&0/*3'/3e41!.(5!
.55e41@!K,e!(e04%(!31!.6/e!*%!6e(e+3*!+4%&!*,e!O'.4*.(P1!603/*Q3(!&0/*3'/3e41@!R%4!/.47e4!
(e*M%4S12!+04*,e4!$,3'!.4e.!M3//!6e!4eL034e5!*%!3&'/e&e(*!*,e!&0/*3'/3$.*3%(!+0($*3%(@!K,e!
1e43./!5e137(!31!.6/e!*%!%'e4.*e!.*!.!&.93&0&!*,e%4e*3$./!$/%$S!+4eL0e($:!%+!#GD@D!TUN2!
.(5!*,e!'.4.//e/!5e137(!$.(!%'e4.*e!.*!.!*,e%4e*3$./!&.93&0&!+4eL0e($:!%+!II@J!TUN@!K,e!
.4e.V*3&e!*4.5e%++!,%/51!*40e!+%4!*,e!(e04%(!5e137(!W01*!.1!3*!535!+%4!*,e!;<=>?;!0(3*!
./%(e@!K,e!H8Q63*!.55e41!'4e1e(*!3(!*,e!1e43./!5e137(!.4e!+.1*!e(%07,!*%!.//%M!*,e!(e04%(!
*%!1*3//!%'e4.*e!.*!*,e!+.1*e4!$/%$S!4.*e@!
?*!1,%0/5!6e!(%*e5!*,.*!*,e!1e43./!(e04%(!1*3//!01e1!'.4.//e/!&0/*3'/3e41!*%!$%&'0*e!*,e!
Me37,*e5!3('0*12!.(5!1e43./!.55e41!*%!$%&'0*e!*,e!10&&.*3%(!+0($*3%(@!?*!1,%0/5!6e!
'%1136/e!*%!01e!6%*,!1e43./!.55e41!.(5!&0/*3'/3e41!*%!7.3(!.553*3%(./!.4e.!1.-3(71@!
"2
$%&%'()'* +' ,-% (%.)*'. /.%(0 1+()2)34,)+'. ,+ ,-% 567$85 (%.)*' 149 4:.+ ;%
<%=/)<%( .-+/:( 4 (%.)*'%< /.% ,-). 4&&<+43-.
7%./:,. +2 4 ?+(%:@)1 .)1/:4,)+' +2 ,-% (%.)*' 4<% .-+A' )' B)*/<% 7.D. E-% 2+/<
)1&/,. /.%( )' ,-% .)1/:4,)+' 4<% F.G HFI555555JKL0 F.F2 HFFDJMN"DJKL0 2.J
H2JOOOOOOJKL0 4'( F.KD HFPKKKKKKJKL. E-%)< <%.&%3,)Q% A%)*-,. 4<% J0 20 RF.D0 4'( F.2D.
E-% .)*'4:. )'JSA ,-<+/*- )'ISA .-+A ,-% <%./:,. +2 1/:,)&:9)'* ,-% )'&/,. ;9 ,-%)<
A%)*-,)'* 3+'.,4',..
E-% .)*'4: ./1S4:: .-+A. ,-% <%./:, +2 ,-% ./114,)+' 2/'3,)+'T ,-% -%U Q4:/%
B7G$7FPGJK +< RF.DI7DFF. E-% .)*'4: 'Q4: ). ,-% 2)'4: +/,&/, +2 ,-% '%/<+' 4'( ),. -%U
Q4:/% +2 FOIJJPF7JK HF.D7"GOIH 14,3-%. ,-% <%./:, +2 !RF.DI7D
.
@)'3% ,-% )1&:%1%',4,)+' /.%. ,-% @&4<,4'V. )',%<'4: 1/:,)&:)%<.0 ,-% )',%<1%()4,%
<%./:,. +2 3+1&/,)'* ,-% A%)*-,%( Q4:/%. 4<% '+, 4Q4):4;:%. E-% +':9 (%:49 )' ,-%
.)1/:4,)+' ). 2<+1 ,-% 567$85 /'),V. %U%3/,)+'. P (%.)*' /,):)W)'* 3/.,+1 1/:,)&:)%<.
A+/:( -4Q% )'3<%4.%( %U%3/,)+' ,)1%.
! Figure (.*+Simulation of an artificial neuron using the 4 inputs 9:.;, :.:=, =.>, :.?*@ and the weights
9>, =, C:.*, :.=*@.
7.#.#
B+< 4'9 )1&:%1%',4,)+' +2 4' 4<,)2)3)4: '%/<4: '%,A+<X )',+ 4 &<+*<4114;:% (%Q)3%
./3- 4. 4' BYZP0 4, :%4., ,A+ &+..);):),)%. %U)., 2+< )1&:%1%',)'* ,-% 4<),-1%,)3
83
computations: a high-speed lookup table 7either on-chip or off: or either of the CORDIC
implementations discussed here.
Table-based implementations work similar to the logarithm tables found in the back
of textbooks, requiring interpolations between rows. These designs can be very fast and
accurate, but require large amounts of chip area. For large networks, it would require a
much larger 7and more expensive: FPGA to allow the lookup table to coexist with the
network. Additionally, all arithmetic calculations would have to be queued, slowing
down the operation of the entire network. If the table implementation is fast enough and
the network small enough, then performance may be comparable to a CORDIC
implementation.
The CORDIC algorithm is powerful enough and small enough to be able to support a
neural network, ideally with a CORDIC unit in each neuron. This would maintain the
parallelism that is key to the operation of the neural network. The longer computation
time is not a maLor shortcoming, since the power of a neural network lies in its
parallelism. The fact that all neurons can be computing in parallel and all neurons can
finish processing their inputs at the same time allows for a natural progression of data
through the network.
The flexibility of the algorithm with its two computation modes that can each operate
in one of three domains means that the same unit can compute a wide variety of
functions. This gives the network designer the flexibility to vary the transfer functions
used with only minimal changes and no redesigning necessary. It would also be possible
for the CORDIC unit to perform the multiplication of the weighting factors with the
"#
$n&'(s* (+,'-+ f,r ne'r,ns 1$(+ 23r-e 34,'n(s ,f $n&'(s* (+$s 1,'25 c,4e 1$(+ 3 se7ere
&erf,r43nce &en32(89
:+e s8s(e4s 5es$-ner 1,'25 c+,,se (+e &r,&er $4&2e4en(3($,n (+3( ;es( f$(s (+e s$<e
,f (+e ne(1,r= (+3( 1,'25 ;e s'&&,r(e59 :+e f3s( (3;2e>;3se5 3&&r,3c+es 1,'25 ;e
&referre5 ;ec3'se ,f (+e$r +$-+ &erf,r43nce* ;'( ,n28 (+e s4322er ne(1,r=s 1,'25 ;e 3;2e
(, effec($7e28 '($2$<e (+e49 ?,r 4e5$'4>s$<e ne(1,r=s* (+e &3r322e2 @ARCD@ 'n$( $s (+e
;es( c,4&r,4$se ;e(1een s$<e 3n5 s&ee59 En5 f,r 23r-e ne(1,r=s* (+e ser$32
$4&2e4en(3($,n 438 ;e (+e ,n28 c+,$ce9
!"# $%&%'( *&%+,
?'r(+er rese3rc+ c3n ;e 5,ne $n(, (+e ;enef$(s (+3( c3n ;e 5er$7e5 fr,4 's$n- (+e
@ARCD@ 'n$( $n 3 ne'r32 ne(1,r=9 F('58 ,f 3 2$7e* &r3c($c32 ne'r32 ne(1,r= 1$22 +e2& $n
'n5ers(3n5$n- +,1 23r-e 3 ne(1,r= c,'25 ;e s'&&,r(e5 3n5 1+$c+ $4&2e4en(3($,n s ;es(
s'$(e5 f,r ne(1,r=s ,f 73r8$n- s$<e9
F('5en(s 438 32s, 1$s+ (, s('58 138s (+3( (+e @ARCD@ 'n$( c3n ;e f'r(+er
,&($4$<e5 s&ec$f$c3228 f,r 'se $n ne'r32 ne(1,r=s9 :+e c,n(r,2 'n$( $s ,ne c,4&,nen( (+3(
$s 3n e3s8 (3r-e( f,r ,&($4$<3($,n* ;'( f'r(+er 3re3 re5'c($,n 438 ;e &,ss$;2e 1+en (+e
'n$( $s $n(e-r3(e5 $n(, (+e s(r'c('re ,f 3 ne'r,n9 :+$s 32s, $nc2'5es 5e(er4$n$n- (+e $5e32
1,r5 s$<e f,r (+e 3&&2$c3($,n9 :+e 3cc'r3c8 ,f (+e ne(1,r= nee5s (, ;e (3=en $n(, 3cc,'n(
1+en 5e(er4$n$n- (+e &rec$s$,n ,f (+e 'n5er28$n- +3r513re9
:+e @ARCD@ 'n$( &resen(e5 +ere $s +3r5>1$re5 (, (+e +8&er;,2$c r,(3($,n32 4,5e9 E
(r'28 c,4&2e(e -ener32 'n$( (+3( $s c3&3;2e ,f ,&er3($n- $n 322 (+ree 5,43$ns $n ;,(+ 4,5es
s+,'25 ;e &,ss$;2e 1$(+ ,n28 3 2$((2e 4,re +3r513re9 E s('58 ,f (+e res,'rce reG'$re4en(s
,f (+$s -ener32>&'r&,se 'n$( c,'25 32s, ;e 5,ne9
85
!"#"$"%&"'
[1] J. S. *alther, “A unified algorithm for elementary functions,” in Spring (oint
Computer Conference, vol. 38, pp.379–385, 1971.
[2] J.E. Holder, “The COLDIC Trigonometric Computing TechniOue,” 12E Trans.
Electronic Computers, vol. EC-8, no. 3, pp. 330–334, Sept. 1959.
[3] H. Dawid and H. Meyr, “COLDIC Algorithms and Architectures,” in Digital
Signal :rocessing for Multimedia Systems, ed. by K.K. Parhi and T. Nishitani,
Marcel Dekker, 1999, pp. 623–655.
[4] L. Andraka, “A survey of COLDIC algorithms for FPGA-based computers,” in
1nternational Symposium on Field :rogrammable @ate Arrays, 1998.
[5] ^. H. Hu, “The _uanti`ation Effects of the COLDIC Algorithm,” 1EEE Trans.
Signal :rocessing, vol. 40, pp. 834–844, July 1992.
[6] IEEE Std. 754-1985, “Standard for Binary Floating Point Arithmetic,” 1985.
[7] A. A. Liddicoat and L. A. Slivovsky. :rogrammable logic. 3rd
Edition of EE
Handbook. CLC Press.
[8] cilind. Spartan-3 FPGA family: Complete data sheet. Datasheet DS099, cilind.
[9] M.M. Mano and C.L. Kime, BLS1 :rogrammable Logic Devices, Pearson
Prentice Hall, Upper Saddle Liver, NJ, 3rd edition, 2004. Supplement to Logic
and Computer Design Fundamentals.
[10] D. Anderson and G. McNeil, “Artificial Neural Networks Technology,”
DoD Data g Analysis Center for Software, August 1992.
"#
$%%& '( )*+,-*. /0-1 0-234567 8+4569-492+-.: ;3493+<= >5-?6?( %%@ 7(A>B.
C<42=9D 3E F=-49+64<= FGH6G--+6GH. I+<H2-. JKKK(
$%J& L( M52 <GN I( )2993G. /CIO8 PQ7=-Q-G9<963G? 3E 0-2+<= 0-913+*? R <
)2+S-D 3E < ;-4<N- 3E I+3H+-??.: Proceedings of -.t0 International Conference
on Field Programmable Logic and Applications ;FPL <==.>. T6?,3G. )-7 JKKU(
$%U& '( C6H2-6+-N3 <GN A( O=3?9-+. /PQ7=-Q-G9<963G 3E < I+3,<,6=6?964 0-2+<=
0-913+* E3+ '2=96V?7-49+<= PQ<H- A=<??6E64<963G 3G <G CIO8 W<?-N A2?93Q
A3Q7296GH '<456G-.: Proceedings of ?t0 @raAilian Symposium on Eeural
EetForGs. ;-4( %XX"(
$%Y& )(T( W<N- <GN W(T( Z29456GH?. /CIO8VW<?-N )9345<?964 0-2+<=
0-913+*?RPQ7=-Q-G9<963G.: Proc. of IEEE JorGs0op on FPGAs. %XXY(
$%@& [( 06453=? <GN '( '32??<. /C-<?6,6=69D 3E C=3<96GHVI36G9 8+695Q-964 6G
CIO8 ,<?-N 8+96E646<= 0-2+<= 0-913+*?.: 6G Proceedings of t0e -?t0
International Conference on Computer Applications in Industry and
Engineering. JKKJ(
$%#& L(T( Z3=9 <GN >(F( W<*-+. /W<4* 7+37<H<963G ?6Q2=<963G? 2?6GH =6Q69-N
7+-46?63G 4<=42=<963G?.: 6G Proceedings of International Joint Conference on
Eeural EetForGs. %XX%. 77 %J%\%J# S3=( J(
$%]& )( ;+<H5646. /^G 95- 4<7<,6=696-? 3E G-2+<= G-913+*? 2?6GH =6Q69-N
7+-46?63G 1-6H59?.: Eeural EetForGs. JKKJ. %@_ 7( UX@VY%Y
87
[18] '. )u and S.N. Batalama, “An Efficient Learning Algorithm for
Associative Memories,” IEEE $rans. Neural Networks, vol. 11, no. 5, pp. 1058–
1066, Sept. 2000.
[19] S. Halgamuge, “A Trainable Transparent Universal ApproPimator for
DefuRRification in Mamdani-Type Neuro-FuRRy Controllers,” IEEE $rans. Fuzzy
Systems, vol. 6, no. 2, pp. 304–314, May 1998.