Digital System Design
ELEC3342
Advanced Arithmetic Circuits
Dr. Hayden So
Department of Electrical and
Electronic Engineering
http://www.eee.hku.hk/~elec3342
Speed of Ripple Carry Adder
■ The performance of a circuit is determined by its worst-case delay from any input to any output:
    $\text{worst-case delay} = \max_{i,j}\ \mathrm{delay}(in_i, out_j)$
■ The worst-case delay of an n-bit ripple-carry adder (RCA) is from cin<0> to cout<n-1>
  • Or from cin<0> to s<n-1>
  • Technology dependent
■ In general, the delay of an RCA is O(n), where n is the number of bits. Why?
ELEC3342 - H. So 2
Determining combinational delay
■ Estimate delay in an arbitrary unit (u) by tracing the flow of signals through the circuit
■ Begin with the assumption that all inputs arrive at exactly the same time, @0
■ Trace the delay of each component under certain assumptions about the underlying technology, e.g.
  • 2-input gate = 1u
  • 3-input gate = 2u
  • Inverter = 1u
  • Wire = no delay
Note: Exact values are highly
technology dependent, but the
analysis technique is the same.
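The tracing procedure can be sketched in Python. This is a minimal illustration, not a definitive model: it assumes the unit delays listed above (2-input gate = 1u, 3-input gate = 2u, wire = 0) and a full adder built as one 3-input XOR for s plus AND/OR gates for co = a·b + ci·(a + b). The function name `fa_arrival` and this gate-level structure are assumptions made for illustration.

```python
# Unit delays taken from the slide (arbitrary unit u).
GATE2 = 1  # any 2-input gate
GATE3 = 2  # any 3-input gate


def fa_arrival(t_a, t_b, t_ci):
    """Return (t_s, t_co): arrival times of the sum and carry-out of one full adder.

    Assumed structure (hypothetical, for illustration):
      s  = XOR3(a, b, ci)
      co = OR2(AND2(a, b), AND2(ci, OR2(a, b)))
    """
    t_s = max(t_a, t_b, t_ci) + GATE3          # 3-input XOR
    t_ab = max(t_a, t_b) + GATE2               # a AND b
    t_a_or_b = max(t_a, t_b) + GATE2           # a OR b
    t_ci_term = max(t_ci, t_a_or_b) + GATE2    # ci AND (a OR b)
    t_co = max(t_ab, t_ci_term) + GATE2        # final OR
    return t_s, t_co


# All inputs arrive at time 0.
print(fa_arrival(0, 0, 0))   # (2, 3)
# Carry-in arrives late, at T = 4u: both outputs follow ci by 2u.
print(fa_arrival(0, 0, 4))   # (6, 6)
```

Each output time is the latest input arrival at a gate plus that gate's delay, which is all the tracing technique requires.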
Tracing Delay
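[Figure: 4-bit ripple-carry adder built from four full adders (FA). Stage i takes a<i>, b<i> and the carry out of stage i-1, and produces s<i>. CarryIn feeds stage 0 and Carryout leaves stage 3.]
Full adder equations:
    $s = a \oplus b \oplus ci$
    $co = a \cdot b + ci \cdot (a + b)$
Arrival times inside one FA when a, b and ci all arrive at 0:
  • a·b and a+b: 1u
  • ci·(a+b): 2u
  • s: 2u
  • co: 3u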
Tracing Delay
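[Figure: 4-bit ripple-carry adder built from four full adders (FA), with one FA shown in detail for a late-arriving carry-in.]
Full adder equations:
    $s = a \oplus b \oplus ci$
    $co = a \cdot b + ci \cdot (a + b)$
Arrival times inside one FA when a and b arrive at 0 and ci arrives at T (T > 3u):
  • a·b and a+b: 1u
  • ci·(a+b): T+1
  • s: T+2
  • co: T+2
Delay added after ci arrives:
  • ci → s: +2u
  • ci → co: +2u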
Tracing Delay
[Figure: 4-bit ripple-carry adder with all a<i>, b<i> inputs and Carry In arriving at 0.]
Per-stage delay once the carry-in has arrived:
  • ci → s: +2u
  • ci → co: +2u
Arrival times along the chain:
  Stage   s<i>   carry out
  0       2u     3u
  1       5u     5u
  2       7u     7u
  3       9u     9u (Carry out)
Overall delay: $t_{ripple} = n \cdot t_{FA}$
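The chain trace can be reproduced with a short Python sketch. It assumes the same unit delays as before (2-input gate = 1u, 3-input gate = 2u) and a full adder with s as one 3-input XOR and co = a·b + ci·(a + b). The helper names `fa_arrival` and `ripple_arrival` are hypothetical.

```python
def fa_arrival(t_a, t_b, t_ci):
    """Arrival times (t_s, t_co) of one full adder, in units of u.

    Assumes 2-input gate = 1u, 3-input gate = 2u, s = XOR3(a, b, ci),
    co = a.b + ci.(a + b); this gate structure is an assumption.
    """
    t_ab = max(t_a, t_b) + 1                  # a AND b, a OR b
    t_s = max(t_a, t_b, t_ci) + 2             # 3-input XOR
    t_co = max(t_ab, max(t_ci, t_ab) + 1) + 1 # ci AND (a OR b), then OR
    return t_s, t_co


def ripple_arrival(n_bits):
    """Trace an n-bit ripple-carry adder; a, b and CarryIn all arrive at 0."""
    t_carry = 0
    sums = []
    for _ in range(n_bits):
        t_s, t_carry = fa_arrival(0, 0, t_carry)
        sums.append(t_s)
    return sums, t_carry


sums, t_cout = ripple_arrival(4)
print(sums, t_cout)              # [2, 5, 7, 9] 9
print(ripple_arrival(32)[1])     # 65
```

Under these assumptions every stage after the first adds 2u to the carry, so the delay grows linearly with the number of bits. The expression n·t_FA charges a full adder delay per stage and is therefore a simpler, more conservative estimate in this gate model.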
Improving Adder Performance
■ The delay of an RCA increases linearly with n
■ Works well for a small adder (small n), but doesn't scale well to wider adders
  • E.g. modern processors have 64-bit adders
  • E.g. many encryption/decryption schemes need 128-bit or even 1024-bit adders
Carry-Lookahead Adder
■ Observation: each bit to be added (a and b) is known, but the carry in is unknown except for the first one
■ Divide a wide n-bit adder into multiple narrower k-bit adders (e.g. k = 4)
■ "Predict" the carry out (Cout) of each k-bit block using generate and propagate signals
■ Some definitions:
  • Column i produces a carry out by either generating a carry out or propagating a carry in to the carry out
Propagate and Generate
■ Observe what happens to cout:
  • Generate: cout = 1 regardless of cin
      $G = a \cdot b$
  • Propagate: cout = cin
      $P = a \oplus b$
  • Kill: cout = 0 regardless of cin
      $K = \bar{a} \cdot \bar{b}$
■ The P, G, K signals are determined solely by the two addends
■ The carry out of column $i$ ($c_i$) is related to that of column $i-1$ ($c_{i-1}$) by:
    $c_i = a_i b_i + (a_i + b_i)\, c_{i-1}$
Case  a ($a_i$)  b ($b_i$)  cin ($c_{i-1}$)  cout ($c_i$)  s ($s_i$)
K     0          0          0                0             0
K     0          0          1                0             1
P     0          1          0                0             1
P     0          1          1                1             0
P     1          0          0                0             1
P     1          0          1                1             0
G     1          1          0                1             0
G     1          1          1                1             1

$c_i = G_i + P_i \cdot c_{i-1}$
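The generate, propagate and kill cases can be checked with a small Python sketch that reproduces the truth table row by row. It takes G = a AND b, P = a XOR b and K as the remaining case where both addend bits are 0. The function names `pgk` and `carry_out` are hypothetical, chosen for illustration.

```python
from itertools import product


def pgk(a, b):
    """Classify one column from its addend bits: 'G', 'P' or 'K'."""
    if a & b:
        return "G"      # generate: G = a AND b
    if a ^ b:
        return "P"      # propagate: P = a XOR b
    return "K"          # kill: both addend bits are 0


def carry_out(a, b, cin):
    """c_i = G_i + P_i * c_(i-1), written with bitwise operators."""
    g = a & b
    p = a ^ b
    return g | (p & cin)


# Reproduce the truth table columns: case, a, b, cin, cout, s
for a, b, cin in product((0, 1), repeat=3):
    s = a ^ b ^ cin
    print(pgk(a, b), a, b, cin, carry_out(a, b, cin), s)
```

The classification depends only on a and b, so it can be computed for every column before any carry is known.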
Lookahead in a k-bit Block
■ When considering k consecutive bits, observe that:
    $c_i = G_i + P_i \cdot c_{i-1}$
    $c_i = G_i + P_i \cdot (G_{i-1} + P_{i-1} \cdot c_{i-2})$
    $c_i = G_i + P_i (G_{i-1} + P_{i-1} (G_{i-2} + P_{i-2} (G_{i-3} + P_{i-3} \cdot c_{i-4})))$
■ Define group generate and group propagate for a k-bit block:
  • $G_{i:j} = G_i + P_i \cdot (G_{i-1} + P_{i-1} \cdot (\cdots (G_{j+1} + P_{j+1} \cdot G_j) \cdots ))$
  • $P_{i:j} = P_i \cdot P_{i-1} \cdot P_{i-2} \cdots P_j$
  where $k = i - j + 1$
■ Then the overall group carry becomes:
    $c_i = G_{i:j} + P_{i:j} \cdot c_{j-1}$
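A short Python sketch can make the group signals concrete. It folds the per-column G and P values from column j up to column i and then applies the group carry equation. It assumes G = a AND b and P = a XOR b per column; the names `bit`, `group_gp` and `block_carry` are hypothetical.

```python
def bit(x, i):
    """Bit i of the non-negative integer x."""
    return (x >> i) & 1


def group_gp(a, b, i, j):
    """Group generate G_{i:j} and group propagate P_{i:j} for columns j..i."""
    g_grp, p_grp = 0, 1                   # empty group: no generate, full propagate
    for col in range(j, i + 1):           # fold in columns from j up to i
        g = bit(a, col) & bit(b, col)
        p = bit(a, col) ^ bit(b, col)
        g_grp = g | (p & g_grp)           # G_{col:j} = G_col + P_col * G_{col-1:j}
        p_grp = p & p_grp                 # P_{col:j} = P_col * P_{col-1:j}
    return g_grp, p_grp


def block_carry(a, b, i, j, c_in):
    """c_i = G_{i:j} + P_{i:j} * c_{j-1}."""
    g_grp, p_grp = group_gp(a, b, i, j)
    return g_grp | (p_grp & c_in)


a, b = 0b1011, 0b0101
print(group_gp(a, b, 3, 0))         # (1, 0)
print(block_carry(a, b, 3, 0, 0))   # 1
```

The loop is only a software convenience. In hardware the nested expression is expanded so that the block carry depends directly on the addend bits and the block's carry in.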
32-bit CLA with 4-bit Blocks
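[Figure, top: one 4-bit CLA block. Four full adders take A0..A3 and B0..B3, produce S0..S3 and pass internal carries C0..C2. Lookahead logic forms G3:0 from G3, P3, G2, P2, G1, P1 and G0, forms P3:0 from P3, P2, P1 and P0, and produces Cout from G3:0, P3:0 and Cin.]
[Figure, bottom: 32-bit CLA built from eight 4-bit CLA blocks. The blocks handle A3:0/B3:0 (with Cin) up to A31:28/B31:28 and produce S3:0 up to S31:28. Block carries C3, C7, ..., C23, C27 link neighbouring blocks, and the last block drives Cout.]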
CLA Delay
■ For an n-bit CLA with k-bit blocks:
  • $t_{pg}$: delay to generate $P_i$ and $G_i$
  • $t_{pg\_block}$: delay to generate $P_{i:j}$ and $G_{i:j}$
  • $t_{AND\_OR}$: delay of an AND_OR gate
    $t_{CLA} = t_{pg} + t_{pg\_block} + \left(\frac{n}{k} - 1\right) t_{AND\_OR} + k\, t_{FA}$
Adder Delay Comparisons
■ Compare the delay of 32-bit ripple-carry and carry-lookahead (CLA) adders
■ Example:
  • CLA has 4-bit blocks
  • 2-input gate delay = 100 ps; full adder delay = 300 ps
    $t_{ripple} = n \cdot t_{FA} = 32 \times 300\ \text{ps} = 9.6\ \text{ns}$
    $t_{CLA} = t_{pg} + t_{pg\_block} + \left(\frac{n}{k} - 1\right) t_{AND\_OR} + k\, t_{FA} = 3.3\ \text{ns}$
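The two results can be reproduced with a small Python sketch of the delay formulas. The example gives only the 2-input gate delay and the full adder delay. The component values used here for t_pg (one gate), t_pg_block (six gate levels) and t_AND_OR (two gate levels) are assumptions chosen for illustration; they happen to give the stated 3.3 ns. The function names are hypothetical.

```python
def ripple_delay_ps(n, t_fa):
    """t_ripple = n * t_FA."""
    return n * t_fa


def cla_delay_ps(n, k, t_pg, t_pg_block, t_and_or, t_fa):
    """t_CLA = t_pg + t_pg_block + (n/k - 1) * t_AND_OR + k * t_FA."""
    return t_pg + t_pg_block + (n // k - 1) * t_and_or + k * t_fa


T_GATE = 100   # ps, 2-input gate delay (from the example)
T_FA = 300     # ps, full adder delay (from the example)

# Assumed component delays, not stated in the example:
T_PG = T_GATE            # one gate to form P_i and G_i
T_PG_BLOCK = 6 * T_GATE  # six gate levels for the 4-bit block G and P
T_AND_OR = 2 * T_GATE    # one AND followed by one OR

print(ripple_delay_ps(32, T_FA))                               # 9600
print(cla_delay_ps(32, 4, T_PG, T_PG_BLOCK, T_AND_OR, T_FA))   # 3300
```

With these numbers the (n/k - 1) term contributes 7 × 200 ps = 1.4 ns, which is the carry chain through the blocks that a hierarchical design aims to shorten.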
Wider CLA
■ In the previous version of the CLA, there is still a carry chain that can be quite long for a wide CLA
  • A 128-bit adder would have 32 4-bit blocks
■ It is possible to build hierarchical carry-lookahead adders
■ Example:
  • Treat each 4-bit block as a single unit
  • Build carry-lookahead logic for four 4-bit blocks to form a 16-bit adder
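The hierarchical idea can be sketched as a behavioural Python model. This is an illustration only, not a gate-accurate design: it computes a (G, P) pair for each 4-bit block, derives the carry into each block from those pairs, and then forms the sum bits inside each block. The per-column definitions G = a AND b and P = a XOR b, and the names `gp_combine`, `block_gp` and `cla16`, are assumptions made for this sketch.

```python
def gp_combine(hi, lo):
    """Combine (G, P) of a more significant group with a less significant one."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return g_hi | (p_hi & g_lo), p_hi & p_lo


def block_gp(a, b, base, width=4):
    """Group (G, P) of the `width` columns starting at bit `base`."""
    gp = (0, 1)  # identity: generates nothing, propagates everything
    for col in range(base, base + width):
        ai, bi = (a >> col) & 1, (b >> col) & 1
        gp = gp_combine((ai & bi, ai ^ bi), gp)
    return gp


def cla16(a, b, c_in=0):
    """16-bit add using two levels of 4-bit lookahead; returns (sum, carry_out)."""
    bases = (0, 4, 8, 12)
    blocks = [block_gp(a, b, base) for base in bases]
    # Second level: carry into each 4-bit block from the block (G, P) pairs.
    carries = [c_in]
    for g, p in blocks:
        carries.append(g | (p & carries[-1]))
    # First level: carries inside each block, then the sum bits.
    total = 0
    for blk, base in enumerate(bases):
        c = carries[blk]
        for col in range(base, base + 4):
            ai, bi = (a >> col) & 1, (b >> col) & 1
            total |= (ai ^ bi ^ c) << col
            c = (ai & bi) | ((ai ^ bi) & c)
    return total, carries[4]


print(cla16(0xFFFF, 0x0001))   # (0, 1)
```

The loops evaluate the carries one after another, whereas hardware would compute them in parallel from the expanded lookahead expressions. The model shows only that the block-level (G, P) pairs carry enough information to produce every block's carry in.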
16-bit CLA with 2 levels of 4-bit blocks
[Figure: 16-bit CLA. Sixteen 1-bit adder cells each take a, b and a carry c, and produce s, g and p. The cells are grouped four at a time (a15 b15 .. a12 b12, a11 b11 .. a8 b8, a7 b7 .. a4 b4, a3 b3 .. a0 b0) under four first-level 4-bit Carry Lookahead Generators with ports g3 p3 c3, g2 p2 c2, g1 p1 c1, g0 p0 and c0. A second-level 4-bit Carry Lookahead Generator connects to the four first-level generators through the same g, p and c ports.]
Adder in VHDL
library ieee;
use ieee.numeric_std.all;

entity adder is
  port (
    a : in  unsigned(31 downto 0);
    b : in  unsigned(31 downto 0);
    y : out unsigned(31 downto 0)
  );
end adder;

architecture rtl of adder is
begin
  y <= a + b;  -- "+" on unsigned comes from numeric_std
end rtl;

■ Synthesis tools synthesize the actual adder.
■ numeric_std is needed for signals of unsigned type that take part in arithmetic operations.
Subtraction in VHDL
library ieee;
use ieee.numeric_std.all;

entity subtractor is
  port (
    a : in  unsigned(31 downto 0);
    b : in  unsigned(31 downto 0);
    y : out unsigned(31 downto 0)
  );
end subtractor;

architecture rtl of subtractor is
begin
  y <= a - b;  -- "-" on unsigned comes from numeric_std
end rtl;
Math Operations in VHDL
■ Many basic math operations are defined in numeric_std and are synthesizable:
  • Operations: +, −, ×, ÷ (written +, -, *, / in VHDL)
  • Comparisons: >, <, =
  • etc.
■ Relies on synthesis tools to generate the actual hardware implementation
  • Quality can vary (area, power, performance, etc.)
■ Good for basic integer operations involving signed and unsigned numbers
■ More manual design is needed if a specific architecture is required
  • floating point, fixed point, etc.
More research opportunities