ELECTRICAL AND COMPUTER ENGINEERING DEPARTMENT, OAKLAND UNIVERSITY ECE-4710: Computer Hardware Design Winter 2021
1 Instructor: Daniel Llamocca
Unit 1 - Digital System Design
DIGITAL SYSTEM MODEL
FSM (CONTROL) + DATAPATH CIRCUIT
USE OF GENERIC COMPONENTS
▪ Among the most common components used in the development of digital systems we have registers, shift registers, and counters. To optimize design time, it is recommended to use parameterized components (VHDL for FPGA Tutorial – Unit 5):
✓ n-bit register with enable and synchronous clear: my_rege
✓ Counter modulo-N with enable and synchronous clear: my_genpulse_sclr
✓ n-bit parallel access (right/left) register with enable and synchronous clear: my_pashiftreg_sclr
EXAMPLES
CAR LOT COUNTER
If A = 1 → No light received (car obstructing LED A)
If B = 1 → No light received (car obstructing LED B)
If a car enters the lot, the following sequence (A|B) must be followed: 00 → 10 → 11 → 01 → 00
If a car leaves the lot, the following sequence (A|B) must be followed: 00 → 01 → 11 → 10 → 00
A car might stay in a state for many clock cycles, since the car moves very slowly compared to the clock period.
DIGITAL SYSTEM (FSM + Datapath circuit)
▪ Usually, when 𝑟𝑒𝑠𝑒𝑡𝑛 (asynchronous clear) and 𝑐𝑙𝑜𝑐𝑘 are not drawn, they are implied.
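[Figure: car lot counter. Photoreceptors A and B feed the FSM (control circuit), which drives the enable (E) and up/down (ud) inputs of a 10-bit counter (datapath circuit). A second diagram shows the generic model: an FSM (control circuit) with inputs, outputs, resetn and clock, connected to a datapath circuit.]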
https://www.secs.oakland.edu/~llamocca/Tutorials/VHDLFPGA/Vivado/Unit_5/my_rege.vhd
https://www.secs.oakland.edu/~llamocca/Tutorials/VHDLFPGA/Vivado/Unit_5/my_genpulse_sclr.vhd
https://www.secs.oakland.edu/~llamocca/Tutorials/VHDLFPGA/Vivado/Unit_5/my_pashiftreg_sclr.vhd
▪ Finite State Machine (FSM):
▪ Algorithmic State Machine (ASM) chart:
[Figures: state diagram (transition labels A|B / E|ud) and ASM chart of the car lot counter FSM, states S1 to S8. All transitions output E|ud = 00 except the final return to AB = 00: the entry sequence ends with E|ud = 11 (count up) and the exit sequence ends with E|ud = 10 (count down).]
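The entry and exit sequences above can be tried out with a small behavioral model. This is only a sketch in Python, not the 8-state FSM of the figure: the function name `count_cars` and the list-based bookkeeping are assumptions made for illustration, and the 10-bit counter range is not modeled.

```python
# Behavioral sketch of the car lot counter (not the 8-state FSM itself).
ENTER = [(1, 0), (1, 1), (0, 1), (0, 0)]  # A|B values after leaving 00 (car enters)
LEAVE = [(0, 1), (1, 1), (1, 0), (0, 0)]  # A|B values after leaving 00 (car leaves)

def count_cars(samples, count=0):
    """samples: iterable of (A, B) sensor values, one per clock tick."""
    path = []  # distinct sensor values seen since the last idle 00
    for ab in samples:
        if not path and ab == (0, 0):
            continue                 # idle: nothing in front of the sensors
        if path and ab == path[-1]:
            continue                 # car lingers: same state for many ticks
        path.append(ab)
        if ab == (0, 0):             # back to idle: decide what happened
            if path == ENTER:
                count += 1           # corresponds to E=1, ud=1
            elif path == LEAVE:
                count -= 1           # corresponds to E=1, ud=0
            path = []
    return count
```

A car that stays in front of a sensor for several ticks produces repeated samples, which the model ignores, matching the remark that a car may stay in a state for many cycles.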
DEBOUNCING CIRCUIT (VHDL CODE)
▪ Mechanical bouncing lasts approximately 20 ms. Thus, we have to make sure that the input signal 𝑤 is stable (‘1’) for at least 20 ms before we assert 𝑤_𝑑𝑏. Then, to deassert 𝑤_𝑑𝑏, we have to make sure that 𝑤 is stable (‘0’) for at least 20 ms.
DIGITAL SYSTEM (FSM + Datapath circuit)
▪ Generic component (my_genpulse_sclr): counter modulo-N (0 to N-1). Behavior on the clock tick (if E=0, the count stays):
    if E = 1 then
      if sclr = 1 then
        Q ← 0
      elsif Q = N-1 then
        Q ← 0
      else
        Q ← Q+1
      end if;
    end if;
    * z = 1 if Q = N-1 (max. count)
▪ Getting the value of N for a given time (e.g. 20 ms): if T is the period of the clock signal, then $N = \frac{20\,\text{ms}}{T}$. For a 100 MHz input clock, T = 10 ns. Then $N = \frac{20\,\text{ms}}{10\,\text{ns}} = 2 \times 10^6$.
▪ Algorithmic State Machine:
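[Figure: ASM chart of the debouncing FSM and block diagram. The FSM receives w and z and drives E and sclr of the modulo-N counter (n bits, 0 to N-1), whose comparator asserts z when Q = N-1. A timing sketch shows w_db following w after 20 ms of stable input.]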
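To make the counter sizing and the debouncing behavior concrete, here is a minimal Python sketch. The names `cycles_for` and `debounce` are hypothetical, and the model is behavioral: it updates w_db in the same tick in which the count reaches N-1 and does not reproduce the exact register timing of the VHDL design.

```python
def cycles_for(time_ms, clock_hz):
    """Number of clock cycles N that span time_ms (integer arithmetic)."""
    return time_ms * clock_hz // 1000

def debounce(samples, n):
    """Return w_db for each tick; w_db flips only after w has differed
    from it for n consecutive ticks."""
    w_db, q, out = 0, 0, []
    for w in samples:
        if w != w_db:
            if q == n - 1:       # z = 1: the input was stable long enough
                w_db, q = w, 0
            else:
                q += 1           # E = 1: keep counting
        else:
            q = 0                # sclr = 1: the input bounced back
        out.append(w_db)
    return out
```

For example, `cycles_for(20, 100_000_000)` gives the value N = 2 × 10^6 derived above, while a small `n` such as 3 is convenient for checking the behavior by hand.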
BIT-COUNTING CIRCUIT (COUNTING 1’S) (VHDL CODE)
▪ This circuit counts the number of bits in register A whose value is ‘1’. Example: A = 00110111 → C = 0101.
▪ The sequential algorithm (pseudo-code) is available. In this case, we can follow this procedure to design a digital system:
✓ Sketch the block diagram: We need start and done signals to indicate when the process starts and finishes. We also include input (Data for the Register A) and output data (Count).
✓ Sketch the high-level control mechanism (state machine) for the Bit-counting circuit. Here, you can include combinational blocks as well as common synchronous blocks (registers, shift registers, counters).
✓ With the block diagram and high-level control mechanism, we can start including the components and their signals in the datapath. Based on the high-level state machine, we can design the actual state machine that includes specific signals controlling the components.
DIGITAL SYSTEM (FSM + Datapath circuit)
▪ FSM: as an ASM chart. Example: For 𝑛 = 8: if 𝐴 = 00110110, then 𝐶 = 0100.
▪ Generic components: Behavior on the clock tick:
✓ Counter modulo-(n+1) (0 to n): If E=0, the count stays.
    if E = 1 then
      if sclr = 1 then
        Q ← 0
      elsif Q = n then
        Q ← 0
      else
        Q ← Q+1
      end if;
    end if;
✓ n-bit parallel access shift register: If E=0, the output is kept.
    if E = 1 then
      if s_l = ‘1’ then
        Q ← D
      else
        Q ← shift in ‘din’ (to the right)
      end if;
    end if;
Sequential algorithm:
    C ← 0
    while A ≠ 0
      if a0 = 1 then
        C ← C + 1
      end if
      right shift A
    end while
[Figure: high-level FSM (states S1 to S3: wait for s while loading A and clearing C; shift A right and increment C when a0 = 1 until A = 0; assert done), block diagram of the bit-counting circuit (inputs s, DA; outputs C, done), and the detailed FSM + datapath: a parallel access shift register for A (s_l = 1 → Load, s_l = 0 → Shift; din = 0; controls LA, EA) with z = 1 when A = 0, and a counter (controls EC, sclrC) of $m = \lceil \log_2 (n+1) \rceil$ bits for C.]
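The sequential algorithm can be checked against the two examples in the text with a short Python sketch. The function name `count_ones` and the default width `n = 8` are assumptions for illustration; the loop mirrors the pseudo-code, with one iteration standing in for one shift of register A.

```python
def count_ones(a, n=8):
    """Count the bits of the n-bit value a that are 1 (shift-and-test loop)."""
    a &= (1 << n) - 1        # keep only the n bits held in register A
    c = 0                    # C <- 0
    while a != 0:            # z = 1 (A = 0) ends the loop
        if a & 1:            # a0 = 1
            c += 1           # C <- C + 1
        a >>= 1              # right shift A, shifting in 0
    return c
```

Because the loop stops as soon as A becomes zero, the number of iterations depends on the position of the most significant 1, which matches the variable number of S2 cycles in a timing diagram.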
https://www.secs.oakland.edu/~llamocca/Tutorials/VHDLFPGA/Vivado/Unit_7/bit_counting.zip
Timing diagram ($n = 8$, $m = 4$):
[Figure: timing diagram of the bit-counting circuit with signals clock, resetn, s, DA, state, A, z, C, done, EA, LA, sclrC, EC, for DA = 00110110 and DA = 00001110.]
INTEGER BASE-2 LOGARITHM (VHDL CODE)
▪ This circuit computes $\lceil \log_2 A \rceil$. Data (unsigned integer) is contained in register 𝐴 (n bits).
▪ The sequential algorithm (pseudo-code) is available. We follow the procedure specified in the previous example.
▪ Note that instead of performing a right shift on 𝐴 to get $\lfloor A/2 \rfloor$, we get $\lfloor A/2 \rfloor$ by creating a shifted version of 𝐴. Example (n=4): $A = a_3 a_2 a_1 a_0 \rightarrow \lfloor A/2 \rfloor = 0 a_3 a_2 a_1$.
▪ Also, asking whether 𝐴 = 1 is the same as asking whether $\lfloor A/2 \rfloor = 0$.
Sequential algorithm:
    L ← 0
    while A ≠ 1
      A ← A − ⌊A/2⌋
      L ← L + 1
    end while
[Figure: high-level FSM (states S1 to S3: wait for s while loading A and clearing L; update A and increment L until A = 1; assert done) and block diagram of the integer base-2 logarithm circuit (inputs s, DA; outputs L, done).]
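The loop A ← A − ⌊A/2⌋ replaces A by ⌈A/2⌉ on every iteration, which is why the iteration count equals ⌈log2 A⌉. A minimal Python sketch of the sequential algorithm follows; the name `ceil_log2` is hypothetical, and returning 0 for A = 0 is an assumption chosen to mirror a circuit that stops when ⌊A/2⌋ = 0.

```python
def ceil_log2(a):
    """Iterative ceil(log2(a)) for an unsigned integer a."""
    l = 0                      # L <- 0
    while (a >> 1) != 0:       # floor(A/2) = 0 ends the loop (A = 1 or A = 0)
        a = a - (a >> 1)       # A <- A - floor(A/2), i.e. ceil(A/2)
        l += 1                 # L <- L + 1
    return l
```

For A = 74 (0x4A) the values of A are 74, 37, 19, 10, 5, 3, 2, 1, so the loop runs seven times.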
https://www.secs.oakland.edu/~llamocca/Tutorials/VHDLFPGA/Vivado/Unit_7/mylog2_it.zip
DIGITAL SYSTEM (FSM + Datapath circuit)
▪ The digital system is depicted below. For 𝑛 bits, $\lceil \log_2 A \rceil \in [0, n]$. Thus, the result needs $m = \lceil \log_2 (n+1) \rceil$ bits.
✓ m-bit counter: If 𝐸 = 𝑠𝑐𝑙𝑟 = 1, the count is initialized to zero. If 𝐸 = 1, 𝑠𝑐𝑙𝑟 = 0, the count is increased by 1.
✓ Register: If 𝐸 = 1: 𝑠𝑐𝑙𝑟 = 1 → Clear, 𝑠𝑐𝑙𝑟 = 0 → Load.
✓ Note that 𝑧 = 1 when ⌊𝐴/2⌋ = 0 (this includes the case 𝐴 = 0).
[Figure: timing diagram of the integer base-2 logarithm circuit with signals clock, resetn, s, DA, state, A, z, L, done, EA, sA, sclrL, EL, for DA = 01001010 and DA = 00001101; and FSM + datapath: register A (controls EA, sA) fed through a multiplexer by DA or by the output of an unsigned subtractor computing $A - \lfloor A/2 \rfloor$, with z = 1 when $\lfloor A/2 \rfloor = 0$, and a counter 0 to n (controls EL, sclrL) of $m = \lceil \log_2 (n+1) \rceil$ bits for L.]
GREATEST COMMON DIVISOR (GCD)
▪ This circuit processes two 𝑛-bit unsigned numbers (A, B) and generates the GCD of A and B. For example:
✓ If A = 216, B = 192 → GCD = 24.
✓ If A = 132, B = 72 → GCD = 12.
✓ If A = 169, B = 63 → GCD = 1.
▪ Sequential algorithm: we will use Euclid’s GCD algorithm:
    a, b: unsigned integers
    while a ≠ b
      if a > b
        a ← a − b
      else
        b ← b − a
      end
    end while
    return a
✓ Based on this algorithm, we design an iterative circuit.
DIGITAL SYSTEM (FSM + Datapath circuit)
▪ The figure depicts the FSM (in ASM form) and a datapath circuit for 𝑛 = 8. Input data: DA, DB. Output data: GCD.
[Figure: FSM (ASM chart, states S1 to S3, outputs Ea, Eb, L, done) and datapath for n = 8: registers a and b (enables Ea, Eb; load select L) loaded from DA and DB, two 9-bit subtractors producing SA and SB, with next_a = SA[7..0], next_b = SB[7..0], AlB = SA[8], and a comparator that sets z = 1 when a = b. The timing diagram shows clock, resetn, s, DA, DB, a, b, AlB, z, state and done for DA = 216, DB = 192 (result 24).]
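The subtraction-based algorithm can be checked against the three examples with a few lines of Python. The function name `gcd_subtract` is hypothetical, and the guard against zero inputs is an assumption added here because the loop as written would never terminate if exactly one operand were 0.

```python
def gcd_subtract(a, b):
    """Euclid's GCD by repeated subtraction, as in the sequential algorithm."""
    if a <= 0 or b <= 0:
        raise ValueError("both operands must be positive")  # assumption, see lead-in
    while a != b:        # z = 1 (a = b) ends the loop
        if a > b:
            a = a - b    # only register a is updated
        else:
            b = b - a    # only register b is updated (a < b)
    return a
```

Each loop iteration corresponds to one clock cycle in which the FSM enables only one of the two registers.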
LEADING ZERO DETECTOR (ITERATIVE)
▪ A sequence of N bits is fed to the circuit serially. The circuit generates the number of 0’s that exist before the first 1.
▪ Example (N=15):
✓ If the number is: 000000000000101 → Output: 12
✓ If the number is: 000000000000000 → Output: 15
✓ If the number is: 000010001000001 → Output: 4
DIGITAL SYSTEM (FSM + Datapath circuit)
▪ Inputs: X (serial data, MSB first), start.
▪ Outputs: z (leading 1 detected), done (N bits processed), and R (number of 0’s before the first 1).
▪ The process begins with the assertion of start. When the first 1 is detected, z=1 (for 1 clock cycle) and R is frozen. When all the bits of the sequence have been processed, done=1. We can restart the process after this. Note that if X is all 0’s, z is never ‘1’.
▪ Counters R and Q: modulo-(N+1): count from 0 to N.
[Figure: block diagram of the leading-0 detector (inputs X, start; outputs R, z, done), timing diagram of x, start, Q, R, z and done for N = 15, ASM chart (states S1 to S3), and datapath with two counters 0 to N (Q with controls EQ, sclrQ; R with controls ER, sclrR), each $n = \lceil \log_2 (N+1) \rceil$ bits wide, plus a flip flop b (controls eb, db).]
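The intended input/output behavior can be sketched in Python as follows. The function name `leading_zeros` and the string representation of the serial stream are assumptions for illustration; the `frozen` flag plays the role of the event that stops counter R, and cycle-level signals such as z and done are not modeled.

```python
def leading_zeros(bits):
    """bits: string of '0'/'1', MSB first, one character per clock tick.
    Returns R, the number of 0's before the first 1."""
    r = 0
    frozen = False           # becomes True once the first 1 has been seen
    for x in bits:
        if not frozen:
            if x == "1":
                frozen = True    # z would pulse here; R holds its value
            else:
                r += 1           # ER = 1: count another leading zero
    return r
```

The loop still consumes all N bits after the first 1, in the same way that counter Q keeps running until the whole sequence has been processed.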
SIMPLE PROCESSOR
DIGITAL SYSTEM (FSM + Datapath circuit)
▪ This system is a basic Central Processing Unit (CPU). For completeness, a memory would need to be included.
▪ Here, the Control Circuit could be implemented as a State Machine. However, in order to simplify the State Machine design, the Control Circuit is partitioned into a datapath circuit and an FSM.
OPERATION
▪ Every time w = '1', we grab the instruction from 𝑓𝑢𝑛 and execute it.
▪ 𝐼𝑛𝑠𝑡𝑟𝑢𝑐𝑡𝑖𝑜𝑛 = |𝑓2|𝑓1|𝑓0|𝑅𝑦1|𝑅𝑦0|𝑅𝑥1|𝑅𝑥0|. This is called a ‘machine language instruction’ or Assembly instruction:
✓ 𝑓2𝑓1𝑓0: Opcode (operation code). This is the portion that specifies the operation to be performed.
✓ 𝑅𝑥: Register where the result of the operation is stored (we also read data from 𝑅𝑥). 𝑅𝑥 can be R0, R1, R2, R3.
✓ 𝑅𝑦: Register where we only read data from. 𝑅𝑦 can be R0, R1, R2, R3.
f = f2f1f0   Operation       Function
000          Load Rx, Data   Rx ← Data
001          Move Rx, Ry     Rx ← Ry
010          Add Rx, Ry      Rx ← Rx + Ry
011          Sub Rx, Ry      Rx ← Rx - Ry
100          Not Rx          Rx ← NOT (Rx)
101          And Rx, Ry      Rx ← Rx AND Ry
110          Or Rx, Ry       Rx ← Rx OR Ry
111          Xor Rx, Ry      Rx ← Rx XOR Ry
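[Figure: simple processor datapath. Registers R0 to R3 (enables E_R0 to E_R3, bus output enables O_R0 to O_R3), register A (enable E_A), ALU (4-bit op), register G (enable E_G, bus output enable O_G), external Data input (enable E_ext), all connected through an n-bit BUS. The control circuit receives w and the 7-bit fun input and generates the control signals and done.]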
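The instruction format and the function table can be exercised with a small instruction-level model in Python. This sketch ignores the bus, the A and G registers and the multi-cycle FSM; the function name `execute`, the list-based register file, and the 8-bit register width (`MASK`) are assumptions for illustration only.

```python
MASK = 0xFF  # assumption: 8-bit registers; the notes leave the width n open

def execute(regs, instr, data=0):
    """Apply one 7-bit instruction |f2 f1 f0|Ry1 Ry0|Rx1 Rx0| to regs (list of 4)."""
    f = (instr >> 4) & 0b111     # opcode f2f1f0
    ry = (instr >> 2) & 0b11     # source register index
    rx = instr & 0b11            # destination (and first operand) index
    x, y = regs[rx], regs[ry]
    results = {
        0b000: data,     # Load Rx, Data
        0b001: y,        # Move Rx, Ry
        0b010: x + y,    # Add  Rx, Ry
        0b011: x - y,    # Sub  Rx, Ry
        0b100: ~x,       # Not  Rx
        0b101: x & y,    # And  Rx, Ry
        0b110: x | y,    # Or   Rx, Ry
        0b111: x ^ y,    # Xor  Rx, Ry
    }
    regs[rx] = results[f] & MASK  # results wrap to the register width
    return regs
```

For instance, `execute(regs, 0b010_10_01)` performs Add R1, R2, since the opcode is 010, Ry = 10 and Rx = 01.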
▪ Control Circuit: This is made out of some combinational units, a register, and an FSM:
✓ Ex: Every time we want to enable register 𝑅𝑥, the FSM only asserts Ex (instead of controlling E_R0, E_R1, E_R2, E_R3 directly). The decoder takes care of generating the enable signal for the corresponding register 𝑅𝑥.
✓ Eo, so: Every time we want to read from register 𝑅𝑦 (or 𝑅𝑥), the FSM only asserts Eo (instead of controlling O_R0, O_R1, O_R2, O_R3 directly) and so (which signals whether to read from 𝑅𝑥 or 𝑅𝑦). The decoder takes care of generating the enable signal for the corresponding register 𝑅𝑥 or 𝑅𝑦.
▪ Arithmetic-Logic Unit (ALU):
[Table: ALU operations for each 4-bit op code (columns: op, Operation, Function, Unit), with output y. Only the op codes 0000 to 0111 survive in this copy.]
▪ Algorithmic State Machine (ASM): Every branch of the FSM implements an Assembly instruction.
[Figure: ASM chart of the control FSM. In S1 the FSM waits for w and asserts E_fun; in S2 it branches on f. Outputs per branch, as labeled in the chart:
  f = 000: E_ext, Ex ← 1; done ← 1
  f = 001: Eo, Ex ← 1; done ← 1
  f = 010 to 111: three steps each. First Eo, so ← 1 and E_A ← 1 (in S2); then Eo, E_G ← 1 with op set (for f = 100, only E_G ← 1); finally O_G, Ex ← 1 and done ← 1. The states are S3a/S3b to S8a/S8b, and the op values are 0110, 0111, 1000, 1010, 1011, 1110, respectively.]
BINARY TO BCD CONVERSION (VHDL CODE)
DOUBLE DABBLE ALGORITHM
▪ Given an 𝑁-bit unsigned binary number, we left shift every one of the 𝑁 bits into a new register (that holds the BCD digits). We thus have 𝑁 iterations. If after shifting, any of the BCD digits (one or more) is greater than 4, we increment it by 3. The exception is on the last iteration, where after shifting, we do not increment any nibble.
▪ For 𝑁 bits, we need $\lceil N/3 \rceil$ BCD digits. So the BCD register is $4 \times \lceil N/3 \rceil$ bits wide.
Number of bits   Binary range      Number of BCD digits
4                [0-15]            2
7                [0-127]           3
10               [0-1023]          4
14               [0-16383]         5
N                $[0, 2^N - 1]$    $\lceil N/3 \rceil$
▪ Why does it work? Think of BCD shifting: we want to shift the bits of a BCD number while preserving the BCD representation. If a number less than or equal to 4 is shifted, we get at most 8, still representable in BCD. If a number greater than 4 is shifted, the minimum number we get is $10 = 1010_2$, which is not in BCD. If we add 3 to the number and then shift, we will get the proper BCD result with 2 BCD digits.
✓ Example: $5 = 0101_2$. If we shift, we get $1010_2$. Decimal 10 would be 0001 0000 in BCD. To get this, we can add 3 to $0101_2$, resulting in $1000_2$. After shifting, we get 0001 0000. The same happens for numbers 6, 7, 8, and 9.
▪ Example: $255 = 11111111_2$ (N=8). We need $\lceil N/3 \rceil = 3$ BCD digits. Note that there are 𝑁 = 8 iterations.
The table shows the state after an operation has been applied. In this example, only one nibble gets incremented at a time.
BCD number      Binary number  Procedure just applied  Procedure to apply
0000 0000 0000  1111 1111      Initialization          Left shift
0000 0000 0001  1111 1110      Left shift              Left shift
0000 0000 0011  1111 1100      Left shift              Left shift
0000 0000 0111  1111 1000      Left shift              0111>4. Then we add 3 to 0111
0000 0000 1010  1111 1000      Add 3 to 0111           Left shift
0000 0001 0101  1111 0000      Left shift              0101>4. Then we add 3 to 0101
0000 0001 1000  1111 0000      Add 3 to 0101           Left shift
0000 0011 0001  1110 0000      Left shift              Left shift
0000 0110 0011  1100 0000      Left shift              0110>4. Then we add 3 to 0110
0000 1001 0011  1100 0000      Add 3 to 0110           Left shift
0001 0010 0111  1000 0000      Left shift              0111>4. Then we add 3 to 0111
0001 0010 1010  1000 0000      Add 3 to 0111           Left shift
0010 0101 0101  0000 0000      Left shift              None. We do not add 3 in the last iteration
▪ Example: $172 = 10101100_2$ (N=8). We need $\lceil N/3 \rceil = 3$ BCD digits. Note that there are 𝑁 = 8 iterations.
The table shows the state after an operation has been applied. In general, more than one nibble can be incremented at a time.
BCD number      Binary number  Procedure just applied     Procedure to apply
0000 0000 0000  1010 1100      Initialization             Left shift
0000 0000 0001  0101 1000      Left shift                 Left shift
0000 0000 0010  1011 0000      Left shift                 Left shift
0000 0000 0101  0110 0000      Left shift                 0101>4. Then we add 3 to 0101
0000 0000 1000  0110 0000      Add 3 to 0101              Left shift
0000 0001 0000  1100 0000      Left shift                 Left shift
0000 0010 0001  1000 0000      Left shift                 Left shift
0000 0100 0011  0000 0000      Left shift                 Left shift
0000 1000 0110  0000 0000      Left shift                 1000>4, 0110>4. Then we add 3 to 1000 and 0110
0000 1011 1001  0000 0000      Add 3 to 1000 and 0110     Left shift
0001 0111 0010  0000 0000      Left shift                 None. We do not add 3 in the last iteration
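The shift-then-correct procedure traced in the two tables can be sketched in Python as follows. The function name `bin_to_bcd` and the list-of-digits representation are assumptions for illustration; the loop follows the variant described here (shift first, then add 3 to every digit greater than 4, except on the last iteration).

```python
def bin_to_bcd(value, n_bits):
    """Double dabble: return the BCD digits of an n_bits-wide unsigned value,
    most significant digit first."""
    k = -(-n_bits // 3)              # ceil(N/3) BCD digits
    digits = [0] * k                 # digits[0] is the least significant nibble
    for i in range(n_bits):
        # Left shift: the current MSB of the binary number enters nibble 0,
        # and the MSB of each nibble carries into the next nibble.
        carry = (value >> (n_bits - 1 - i)) & 1
        for d in range(k):
            shifted = (digits[d] << 1) | carry
            carry = (shifted >> 4) & 1
            digits[d] = shifted & 0xF
        if i != n_bits - 1:          # no correction on the last iteration
            digits = [x + 3 if x > 4 else x for x in digits]
    return digits[::-1]
```

Calling `bin_to_bcd(172, 8)` goes through the same intermediate nibble values as the second table, including the step where two nibbles are corrected at once.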
https://www.secs.oakland.edu/~llamocca/Tutorials/VHDLFPGA/Vivado/Unit_7/bin2bcd.zip
DIGITAL SYSTEM (FSM + Datapath circuit)
▪ Algorithmic State Machine (ASM) Chart:
[Figure: ASM chart (states S1 to S4) and datapath of the binary to BCD converter. Datapath: an N-bit left shift register loaded with i_bin (controls LA, EA); $K = \lceil N/3 \rceil$ 4-bit left shift registers holding the BCD digits o_bcd(0) to o_bcd(K-1) (controls EB, E_LB, sclrB); a multiplexer that selects digit F using CB, an adder that adds 0011 to it, and a decoder that generates the load signals LB(0) to LB(K-1); a counter 0 to N-1 (controls E_C, sclr_C, output z_C); and a counter 0 to K-1 (controls E_CB, sclr_CB, output z_CB, count CB). The ASM chart tests the condition F > 4 & z_C = 0 before loading a corrected digit.]
EXTRA TOPICS
LINEAR FEEDBACK SHIFT REGISTERS (LFSRS)
▪ LFSRs have a simple and fairly regular structure. Typical components: D-type flip flops and logic gates.
▪ Advantages: very little hardware and high speed operation.
▪ Inputs to flip flops are linear functions of the previous state.
▪ Despite the simple appearance, LFSRs are based on rather complex mathematical theory and have interesting applications in the area of digital system testing, cryptography, and fault-tolerant computing.
▪ Typical applications: Error correction/detection, pseudo random number generators, fast counters.
▪ A generic 4-bit LFSR is depicted below:
APPLICATION: CYCLIC REDUNDANCY CHECK (CRC)
▪ This error-detection code is used in digital communications (e.g.: Ethernet, CAN), and storage devices (e.g.: RAMs).
Communication system
▪ 𝑘-bit message: $M = m_{k-1} m_{k-2} \ldots m_1 m_0$.
▪ CRC code: $R = r_{n-1} r_{n-2} \ldots r_1 r_0$.
▪ Stream sent: $m_{k-1} m_{k-2} \ldots m_1 m_0 \, r_{n-1} r_{n-2} \ldots r_1 r_0$
▪ Associated polynomials:
✓ $M(x) = m_{k-1}x^{k-1} + m_{k-2}x^{k-2} + \cdots + m_1 x^1 + m_0 x^0$. Order: $k-1$
✓ $R(x) = r_{n-1}x^{n-1} + r_{n-2}x^{n-2} + \cdots + r_1 x^1 + r_0 x^0$. Order: $n-1$
▪ The CRC code is calculated as a remainder: $R(x) = \text{remainder}\left(\frac{x^n M(x)}{G(x)}\right)$
▪ $G(x) = g_n x^n + g_{n-1}x^{n-1} + \cdots + g_1 x^1 + g_0 x^0$. This is the generating polynomial of order 𝑛. $G = g_n g_{n-1} \ldots g_1 g_0$
▪ Here, when dealing with polynomial operations, we use modulo-2 polynomial arithmetic (Galois field of two elements: GF(2)). The coefficients of the polynomials can only be 0 or 1. The multiplication operation can be implemented as a simple AND gate, and the addition (or subtraction) operation can be implemented by an XOR gate: $a \pm b = a \oplus b$, with $a, b \in \{0,1\}$.
✓ Example: $x^n + x^n = x^n - x^n = 0$.
▪ CRC generator: It computes $R(x)$. Its input is $x^n M(x) \equiv m_{k-1} m_{k-2} \ldots m_1 m_0 000 \ldots 0$.
▪ CRC checker: On the receiver side, both the message and the CRC code might be corrupted by the channel: $M'(x)$ and $R'(x)$.
✓ So, we must check if $\text{remainder}\left(\frac{x^n M'(x)}{G(x)}\right) = R'(x)$. If so, we say that the system passed the CRC check.
✓ Note that this implies: $x^n M'(x) = Q'(x)G(x) + R'(x) \rightarrow x^n M'(x) - R'(x) = x^n M'(x) + R'(x) = Q'(x)G(x)$. In modulo-2 arithmetic, $-R'(x) = R'(x)$. It follows that $x^n M'(x) + R'(x)$ should be divisible by $G(x)$.
✓ The CRC checker tests if $\text{remainder}\left(\frac{x^n M'(x) + R'(x)}{G(x)}\right) = 0$. If the remainder is zero, we say that it passed the CRC check. It is still possible for a corrupted $R'(x)$ and $M'(x)$ to make the remainder equal to zero, but this is far less likely.
✓ The input to the CRC checker is $x^n M'(x) + R'(x) \equiv m'_{k-1} m'_{k-2} \ldots m'_0 \, r'_{n-1} r'_{n-2} \ldots r'_0$.
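[Figures: communication system (CRC generator → transmitted bits, possibly altered by channel errors → CRC checker); generic 4-bit LFSR with flip flops Q0 to Q3 (enable E, resetn, clk), input x, and a linear function block feeding the flip flop inputs.]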
▪ Example: CRC-4 ($n = 4$): $M = 11100110$, $G = 11001$
    $M(x) = x^7 + x^6 + x^5 + x^2 + x \rightarrow x^n M(x) = x^4(x^7 + x^6 + x^5 + x^2 + x) = x^{11} + x^{10} + x^9 + x^6 + x^5$
    $G(x) = x^4 + x^3 + 1$
    $R(x) = x^2 + x$
CRC circuit
▪ The following LFSR implements 4-bit CRC. Components: flip flops, AND, XOR gates.
▪ Note that $g_4 = 1$ is not used. This is because it is common in CRC to have $g_n = 1$. This is already considered in the design of the circuit.
▪ Example with $M = 11100110$, $G = 11001$:
✓ When integrated into the CRC generator, the input is given by: $x^n M(x) \equiv 111001100000$. The output result is $R = 0110$.
✓ When integrated into the CRC checker, the input is given by: $x^n M'(x) + R'(x) \equiv 111001100110$. The result of the circuit is remainder = 0000, indicating a valid transmission.
▪ Applications: The circuit above can be implemented for any number of bits 𝑛. For example: CRC-32 (Ethernet), CRC-15 (CAN), and CRC-1 (trivial even parity generator with $G(x) = x + 1$).
▪ This is a serial implementation, i.e., one bit is fed to the circuit at a time. Other implementations process some bits in parallel (for CRC-1, the parity generator in ECE2700: Notes – Unit 1 processes all the bits at once).
▪ The design of the generating polynomial $G(x)$ is outside the scope of this class.
▪ These circuits are better implemented in hardware for high speed and low resource utilization.
[Figure: 4-bit CRC LFSR with flip flops R0 to R3 (enable E, resetn, clk), serial input x, AND gates for the coefficients G0 to G3, and XOR gates. Timing diagrams show clk, resetn, E, x and the register R for the generator input 111001100000 (final R = 0110) and for the checker input 111001100110 (final R = 0000). A last diagram shows the single flip flop CRC-1 circuit (R0, G0).]
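The serial remainder computation can be modeled in Python with one register update per input bit. The function name `crc_remainder` and the use of bit strings are assumptions for illustration; the register `r` plays the role of R3..R0, and this sketch is a plain bit-serial modulo-2 division rather than a gate-level copy of the figure.

```python
def crc_remainder(stream, g):
    """Bit-serial modulo-2 division.
    stream: input bits, MSB first (e.g. x^n M(x) for the generator).
    g: generating polynomial bits g_n..g_0 with g_n = 1.
    Returns the n-bit remainder as a bit string."""
    n = len(g) - 1
    mask = (1 << n) - 1
    poly = int(g, 2) & mask          # g_{n-1}..g_0; g_n = 1 is implicit
    r = 0                            # register contents after resetn
    for bit in stream:
        msb = (r >> (n - 1)) & 1     # bit shifted out of the register
        r = ((r << 1) & mask) | int(bit)
        if msb:
            r ^= poly                # subtract (XOR) G(x) when the MSB was 1
    return format(r, "0{}b".format(n))
```

With the example above, feeding the generator input 111001100000 and G = 11001 leaves 0110 in the register, and feeding the checker input 111001100110 leaves 0000.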
MODULO-N COUNTER DESIGN
▪ A modulo-N counter could be designed by the State Machine method, but this can be very cumbersome if N is a large number. For example, if N=1000, we need 1000 states.
▪ The figure shows a counter modulo-200 that uses a register, an adder, a comparator, and logic gates. 𝑄: count. 𝑧: output signal asserted only when the maximum count (199) is reached. This can be easily described in VHDL using the structural description. You can also use the behavioral description, where the count is increased by 1 every clock cycle and ‘z’ is asserted when the count reaches ‘N-1’.
MEMORY DECODING
▪ Physical implementation of memory is not homogeneous:
✓ Different portions of memory are used for different purposes: RAM, ROM, I/O devices.
✓ A processor can usually address a memory space that is much larger than the memory space covered by an individual memory chip. Even if all the memory was of one type, we still have to implement it using multiple ICs.
✓ For a given valid address, one and only one memory-mapped component must be accessed.
▪ Address Decoding: Process of generating chip select (CS, CE) signals from the address bus for each device in the system.
▪ The address bus (N+M bits) is split into two sections:
✓ The N most significant bits are used to generate the CS signals for the different devices.
✓ The M least significant bits are passed to the devices as addresses.
EXAMPLE
▪ A 20-bit address bus in a processor handles up to $2^{20}$ addresses (1 MB), each address containing one byte of information. We want to connect four 256KB memory chips to the processor.
▪ The 2 MSBs of the address are used to generate the CS signals. The 18 LSBs are passed to the devices as addresses.
▪ The pink-shaded circuit: i) addresses the memory chips, and ii) enables only one memory chip (via CE: chip enable) when the address falls in the corresponding range. Example: if the address lies between 0x40000 and 0x7FFFF, only memory chip 2 is enabled (CE=1). If 𝑎𝑑𝑑𝑟𝑒𝑠𝑠 = 0𝑥𝐷0123, only memory chip 4 is enabled.
[Figure: counter modulo-200. An 8-bit register (controls E, sclr, resetn, clock; if E=sclr=1, then Q = 0) is fed by an adder that adds 00000001 to Q; a bitwise comparator (A=B) compares Q with 11000111 (199) and generates z.]
[Figure: memory decoding. A decoder with inputs w1w0 = address(19), address(18) and outputs y0 to y3 drives the CE inputs of four 256 KB memory devices; address(17..0) goes to every device. Memory space:
    00000 to 3FFFF   256KB   device 1
    40000 to 7FFFF   256KB   device 2
    80000 to BFFFF   256KB   device 3
    C0000 to FFFFF   256KB   device 4]
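The address split in this example can be expressed in a few lines of Python. The function name `decode_address` and the returned tuple layout are assumptions for illustration; the list `ce` models the decoder outputs y0 to y3, with `ce[0]` enabling memory chip 1.

```python
def decode_address(address):
    """Split a 20-bit address into chip enables and an 18-bit chip address."""
    if not 0 <= address < (1 << 20):
        raise ValueError("address must fit in 20 bits")
    select = address >> 18                 # the 2 MSBs: address(19..18)
    offset = address & 0x3FFFF             # the 18 LSBs: address(17..0)
    ce = [int(i == select) for i in range(4)]   # one-hot decoder outputs
    return ce, offset
```

For 0xD0123 the two MSBs are 11, so only the last chip enable is asserted and the chip receives the 18-bit address 0x10123.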
FLIP FLOPS: PRACTICAL ASPECTS
Reducing rate of change of a synchronous circuit:
▪ Here, we use an FSM as an example of a synchronous circuit. But we can apply this technique to any synchronous circuit.
▪ We usually would like to reduce the rate at which FSM transitions occur. A straightforward option is to reduce the frequency of the input clock. But this is a very complicated problem when a high precision clock is required.
▪ Alternatively, we can reduce the rate at which FSM transitions occur by including an enable signal in our FSM: this means including an enable in every flip flop in the FSM. For any FSM transition to occur, the enable signal has to be ‘1’. Then we assert the enable signal only when we need it. The effect is the same as reducing the frequency of the input clock.
✓ The figure below depicts a counter modulo-N (from 0 to N-1) connected to a comparator that generates a pulse (output signal 𝑧) of one clock period every time we hit the count ‘N-1’. The number of bits of the counter is given by $n = \lceil \log_2 N \rceil$. The effect is the same as reducing the frequency of the FSM to $f/N$, where 𝑓 is the frequency of the clock.
✓ Example: Timing diagram of a modulo-10 counter. Notice that ‘z’ is only asserted when the count reaches “1001”. This ‘z’ signal controls the enable of an FSM, so that the FSM transitions only occur every 10 clock cycles, thereby having the same effect as reducing the frequency by a factor of 10.
▪ We can apply the same technique not only to FSMs, but also to any sequential circuit. This way, we can reduce the rate of any sequential circuit (e.g. another counter) by including an enable signal in every flip flop in the circuit.
Flip flop timing parameters:
▪ Propagation Delay.
▪ Setup Time: Interval of time before the clock edge where the data must be held stable.
▪ Hold Time: Interval of time after the clock edge where the data must be held stable.
▪ If the setup time or the hold time is violated, the output may become unpredictable, or even worse, the flip flop might enter metastability.
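[Figure: timing diagram of a modulo-10 counter (clk, resetn, Q counting 0000 to 1001 and wrapping to 0000, z); block diagram of a counter 0 to N-1 (n bits) whose comparator output z (Q = N-1?) drives the enable E of an FSM; and a D flip flop with a timing sketch of clk, D and Q showing the restricted region around the clock edge.]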
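The effect of gating a circuit with the pulse z can be seen in a short cycle-by-cycle Python model. The function name `run` is hypothetical, and the "FSM" is reduced to a plain state counter as an assumption for illustration; each loop iteration stands for one clock tick.

```python
def run(ticks, n):
    """Simulate a modulo-n pulse generator enabling a slower circuit.
    Returns a list of (Q, z, slow_state) tuples, one per clock tick."""
    q, slow_state, trace = 0, 0, []
    for _ in range(ticks):
        z = int(q == n - 1)              # comparator: Q = N-1?
        trace.append((q, z, slow_state))
        if z:
            slow_state += 1              # enabled circuit advances only when z = 1
        q = 0 if z else q + 1            # modulo-n counter
    return trace
```

With `n = 10`, the enabled circuit changes state once every 10 ticks, which corresponds to the modulo-10 example described above.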