Slide 1
Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.
The single cycle CPU
[Figure: single-cycle datapath with control. Components: PC, instruction memory, register file, sign extend (16 to 32 bits), shift left 2, ALU with ALU control, two adders (PC + 4 and branch target), data memory and multiplexers. Control signals: RegDst, Jump, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, RegWrite. The jump address [31-0] is PC+4 [31-28] joined with Instruction [25-0] shifted left by 2.]
Slide 2
Performance of Single-Cycle Machines
• Memory Unit 2 ns
• ALU and Adders 2 ns
• Register file (Read or Write) 1 ns
Class     Fetch  Decode  ALU  Memory  Write Back  Total (ns)
R-format  2      1       2    0       1           6
LW        2      1       2    2       1           8
SW        2      1       2    2       -           7
Branch    2      1       2    -       -           5
Jump      2      -       -    -       -           2
Slide 3
What if we had a variable CK cycle?
Let’s check the following scenario:
• R-type: 44%, LW: 24%, SW: 12%
• Branch: 18%, Jump: 2%
• I - number of instructions in the program
• T - time of the CK cycle
• CPI - number of CK cycles per instruction (=1)
Execution time = I*T*CPI. With a variable clock, the average T = 8*24% + 7*12% + 6*44% + 5*18% + 2*2% = 6.34 ≈ 6.3 ns
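The weighted average can be checked with a short Python sketch. The latencies and the instruction mix are the ones given on slides 2 and 3; the dictionary and function names are illustrative only.

```python
# Total latency per instruction class in ns (from the table on slide 2).
TOTAL_NS = {"R-format": 6, "LW": 8, "SW": 7, "Branch": 5, "Jump": 2}
# Instruction mix from the scenario on slide 3 (fractions sum to 1).
MIX = {"R-format": 0.44, "LW": 0.24, "SW": 0.12, "Branch": 0.18, "Jump": 0.02}


def single_cycle_period(total_ns):
    """A fixed clock must be long enough for the slowest instruction."""
    return max(total_ns.values())


def variable_cycle_average(total_ns, mix):
    """Average clock period if each instruction took only its own time."""
    return sum(total_ns[name] * share for name, share in mix.items())


if __name__ == "__main__":
    print(single_cycle_period(TOTAL_NS))
    print(round(variable_cycle_average(TOTAL_NS, MIX), 2))
```

The fixed clock is set by LW, the slowest class, while the average is pulled down by the large share of R-format instructions.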
Slide 4
The result:
\[
\frac{EXE_{\text{single cycle}}}{EXE_{\text{variable}}}
= \frac{T_{\text{single clock}} \cdot I}{T_{\text{variable clock}} \cdot I}
= \frac{T_{\text{single clock}}}{T_{\text{variable clock}}}
= \frac{8}{6.3} \approx 1.27
\]
We get a ratio of 1.27. The ratio is higher when more complicated instructions, e.g., floating point instructions are also implemented.
Since building a variable-CK circuit is too complicated, we instead let each instruction take as many shorter CK cycles as it requires.
Slide 5
Multicycle Approach
The idea of the multi-cycle approach:
• We save time, since each instruction takes only the necessary number of CK cycles (which are about 5 times shorter than the original CK cycle)
• We also save on components, since the same component can be used in different phases of the same instruction
Slide 6
Building a Multi-Cycle CPU:
Split the instruction into steps (phases).
Make sure that the steps are balanced (each requires about the same time).
Reduce the work done in each step: only one task is done per step.
At the end of each CK cycle:
Store the result of the current step so the next step can use it. This means adding internal registers to hold the intermediate results.
Slide 7
A single cycle CPU capable of R-type & lw/sw instructions (data & control)
[Figure: PC with the +4 adder, instruction memory, register file addressed by Rs = [25:21], Rt = [20:16] and Rd, sign extension of [15:0] from 16 to 32 bits, ALU with ALU control driven by the opcode [31:26] and funct [5:0], and data memory (Address, D.In, D.Out). Control signals shown: RegWrite, MemWrite.]
Slide 8
A single cycle CPU capable of R-type & lw/sw instructions - Data Path only
[Figure: the same datapath without the control lines; the lw and sw paths through the data memory are labeled.]
Slide 9
Timing of a single cycle CPU
[Figure: timing diagram over two consecutive CK cycles (PC = 0x400000, then 0x400004). Within each cycle the signals settle in order: I.Mem data (fetch), Rs and Rt as ALU inputs (decode), ALU output = address (execute), D.Mem address and D.Mem data = memory output (memory), write back.]
Slide 10
Timing of a lw instruction in a single cycle CPU
[Figure: one long CK cycle at PC = 0x400000 covering fetch, decode, execute, memory and write back.]
We want to replace a long single CK cycle with 5 short ones (2 ns, 1 ns, 2 ns, 2 ns, 1 ns):
Timing of a lw instruction in a multi-cycle CPU
[Figure: five CK cycles numbered 0 to 4 (cycle 5 is cycle 0 of the next instruction): fetch (PC = 0x400000, instruction in IR), decode (A, B), execute (ALUout), memory (MDR), write back.]
Slide 11
Therefore we should add registers to the single cycle CPU shown below:
[Figure: the single-cycle datapath for R-type and lw/sw: PC, +4 adder, instruction memory, register file (Rs = [25:21], Rt = [20:16], Rd), sign extension of [15:0] from 16 to 32 bits, ALU, data memory (Address, D.In, D.Out).]
Slide 12
Adding registers to “split” the instruction into 5 stages:
[Figure: the same datapath with the added clocked registers IR, A, B, ALUout and MDR and the PCWrite signal on the PC; the stage boundaries are numbered 0 to 5.]
Slide 13
Here is the book’s version of the multi-cycle CPU:
[Figure: PC, a single memory (Address, MemData, Write data), instruction register, memory data register, register file, sign extend (16 to 32 bits), shift left 2, registers A and B, ALU (result, Zero), ALUOut and the multiplexers that select the memory address, the register write data and the two ALU inputs.]
Only PC and IR have write enable signals. All other registers hold data for a single cycle.
Slide 14
Here is our version of a multi-cycle CPU capable of R-type & lw/sw & branch instructions
[Figure: PC, instruction & data memory, IR, MDR, register file (Rs = IR[25:21], Rt = IR[20:16], Rd), registers A and B, sign extension of IR[15:0] from 16 to 32 bits followed by a shift left by 2, the constant 4, ALU and ALUout; every register is clocked by ck.]
Slide 15
Let us explain the multi-cycle CPU
• First we’ll look at a CPU capable of performing only R-type instructions
• Then, we’ll add the lw instruction
• And the sw instruction
• Then, the beq instruction
• And finally, the j instruction
Slide 16
Let us recall how a single cycle CPU capable of performing R-type instructions works. Here you see the data path and the timing of an R-type instruction.
[Figure: PC with the +4 adder, instruction memory, register file addressed by Rs = [25:21], Rt = [20:16] and Rd = [15:11], and the ALU controlled by the opcode [31:26] and funct [5:0]. Timing over two CK cycles (PC = 0x400000, then 0x400004): memory output (fetch), Rs and Rt as ALU inputs (decode), ALU output (execute), write back.]
Slide 17
A single cycle CPU demo: R-type instruction
[Figure: the same R-type datapath, animated.]
Slide 18
A multi cycle CPU capable of performing R-type instructions
[Figure: PC, instruction & data memory, IR, register file (Rs = IR[25:21], Rt = IR[20:16], Rd), registers A and B, ALU and ALUout; every register is clocked by ck.]
Slides 19 to 22
A multi cycle CPU capable of R-type instructions: the same datapath, highlighting each step in turn.
• Slide 19: fetch (cycle 0 to 1)
• Slide 20: decode (cycle 1 to 2)
• Slide 21: execute (cycle 2 to 3)
• Slide 22: write back (cycle 3 to 4), the register file writes Rd
Slide 23
Timing of an R-type instruction in a single cycle CPU
[Figure: one CK cycle at PC = 0x400000: instruction memory output = the instruction (fetch), Rs and Rt as ALU inputs (decode), ALU output = result of the calculation (execute), GPR input (write back).]
Timing of an R-type instruction in a multi-cycle CPU
[Figure: four CK cycles numbered 0 to 3 (cycle 4 is cycle 0 of the next instruction): fetch (memory data into IR), decode (A, B), execute (ALUout), write back. The memory output changes from the previous instruction to the current instruction.]
Slide 24
An R-type instruction in the multi-cycle CPU, step by step (an R-type instruction takes 4 CKs):
• fetch: IR = M(PC) (IRWrite)
• decode: A = Rs, B = Rt (GPR outputs)
• execute: ALUout = A op B (ALU output)
• write back: at the rising edge of CK, Rd = ALUout
The state diagram: IR = M(PC) → A = Rs, B = Rt → ALUout = A op B → Rd = ALUout
Slide 25
A multi-cycle CPU capable of R-type instructions (PC calc.)
[Figure: the R-type multi-cycle datapath with the constant 4 fed to the ALU, so that the ALU can compute PC + 4.]
Slide 26
[Figure: timing of an R-type instruction in the multi-cycle CPU including the PC update. During fetch, PC = PC+4 (PCWrite), so the PC changes from the current PC to the next PC = current PC + 4. Then decode (GPR outputs into A, B), execute (ALUout = A op B) and write back (at the rising edge of CK, Rd = ALUout).]
Slide 27
A multi cycle CPU capable of R-type instructions: fetch
[Figure: the fetch step highlighted: the instruction memory is read at PC into IR while the ALU adds 4 to the PC.]
Slide 28
The state diagram of a CPU capable of R-type instructions only
• State 0, Fetch: IR = M(PC), PC = PC+4
• State 1, Decode: A = Rs, B = Rt
• State 6, ALU (R-type): ALUout = A op B
• State 7, WBR: Rd = ALUout, then back to Fetch
Slide 29
The state diagram of a CPU capable of R-type and lw instructions
• State 0, Fetch → State 1, Decode
• R-type: State 6, ALU → State 7, WBR → Fetch
• lw: State 2, AdrCmp: ALUout = A+sext(imm) → State 3, Load: MDR = M(ALUout) → State 4, WB: Rt = MDR → Fetch
Slide 30
We added registers to “split” the instruction into 5 stages. Let’s discuss the lw instruction.
[Figure: the datapath with the added registers IR, A, B, ALUout and MDR and the PCWrite signal; the stage boundaries are numbered 0 to 5. All parts related to lw only are blue.]
In the single-cycle CPU we kept the “data flow” from left to right. Here we change that a little since, as we’ll see, we use some parts of the CPU more than once during the same instruction. So we prefer to move the data memory.
Slide 31
First we draw a multi-cycle CPU capable of R-type & lw instructions:
[Figure: PC, instruction memory, IR, register file (Rs = IR[25:21], Rt = IR[20:16], Rd), registers A and B, sign extension of IR[15:0] from 16 to 32 bits, the constant 4, ALU, ALUout, data memory and MDR. We just moved the data memory. All parts related to lw only are blue.]
Slides 32 to 36
A multi-cycle CPU capable of R-type & lw instructions: the same datapath, highlighting each step of a lw instruction in turn.
• Slide 32: fetch
• Slide 33: decode
• Slide 34: AdrCmp
• Slide 35: memory
• Slide 36: WB (the register file writes Rt)
Slide 37
Can we unite the Instruction & Data memories? (They are not used simultaneously, as they are in the single cycle CPU.)
[Figure: the multi-cycle datapath for R-type & lw with separate instruction and data memories.]
Slide 38
So here is a multi-cycle CPU capable of R-type & lw instructions using a single memory for instructions & data
[Figure: the same datapath with one “Instruction & data Memory”.]
Slide 39
Timing of a lw instruction in a single cycle CPU
[Figure: one long CK cycle at PC = 0x400000: I.Mem data (fetch), Rs and Rt as ALU inputs (decode), ALU output = address (execute), D.Mem address and D.Mem data (memory), write back.]
Timing of a lw instruction in a multi-cycle CPU
[Figure: five CK cycles: fetch (memory data into IR, PC becomes PC+4), decode (A, B), execute (ALUout = data address), memory (MDR), write back (data to Rt).]
Slide 40
A lw instruction in the multi-cycle CPU, step by step:
• fetch: IR = M(PC), PC = PC+4 (PCWrite, IRWrite)
• decode: A = Rs, B = Rt (GPR outputs)
• execute: ALUout = A+sext(imm) (the data address)
• memory: MDR = M(ALUout)
• write back: at the rising edge of CK, Rt = MDR
Slide 41
The state diagram of a CPU capable of R-type and lw instructions
• State 0, Fetch: IR = M(PC), PC = PC+4
• State 1, Decode: A = Rs, B = Rt
• R-type: State 6, ALU: ALUout = A op B; State 7, WBR: Rd = ALUout
• lw: State 2, AdrCmp: ALUout = A+sext(imm); State 3, Load: MDR = M(ALUout); State 4, WB: Rt = MDR
Slide 42
A multi-cycle CPU capable of R-type & lw & sw instructions
[Figure: the multi-cycle datapath with a single instruction & data memory; the lw and sw paths are labeled.]
Slide 43
The state diagram of a CPU capable of R-type, lw and sw instructions
• State 0, Fetch: IR = M(PC), PC = PC+4
• State 1, Decode: A = Rs, B = Rt
• R-type: State 6, ALU: ALUout = A op B; State 7, WBR: Rd = ALUout
• lw + sw: State 2, AdrCmp: ALUout = A+sext(imm)
• lw: State 3, Load: MDR = M(ALUout); State 4, WB: Rt = MDR
• sw: State 5, Store: M(ALUout) = B
Slide 44
A multi-cycle CPU capable of R-type & lw/sw & branch instructions
[Figure: the multi-cycle datapath with the sign-extended immediate shifted left by 2 and fed to the ALU to form the branch address.]
Slide 45
Adding the instruction beq to the state diagram:
• States 0 to 7 as before (Fetch 0, Decode 1, AdrCmp 2, Load 3, WB 4, Store 5, ALU 6, WBR 7)
• beq: State 8, Branch: calc Rs - Rt (just to produce the zero signal)
• zero: calc PC = PC + (sext(imm)<<2); not zero: back to Fetch
Slide 46
Adding the instruction beq to the state diagram, a more efficient way: let’s use the decode state, in which the ALU is doing nothing, to compute the branch address. We’ll have to store it for 1 more CK cycle, until we know whether to branch or not! (We store it in the ALUout reg.)
[Figure: state diagram with Fetch 0, Decode 1, AdrCmp 2, Load 3, WB 4, Store 5, ALU 6, WBR 7 and Branch 8.]
• State 1, Decode: calc ALUout = PC + (sext(imm)<<2)
• State 8, Branch (beq): calc Rs - Rt. If zero, load the PC with ALUout data, else do not load the PC
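The two beq steps can be sketched in Python as follows. This is only an illustration: the function names are made up here, and the PC passed in is assumed to be the already-incremented PC (PC+4) that the fetch state produced.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc_plus_4, ir):
    """Decode state: ALUout = PC + (sext(imm) << 2), computed speculatively."""
    return (pc_plus_4 + (sign_extend16(ir) << 2)) & MASK32


def next_pc_after_beq(pc_plus_4, alu_out, rs_value, rt_value):
    """Branch state: load the PC with ALUout only if Rs - Rt is zero."""
    zero = ((rs_value - rt_value) & MASK32) == 0
    return alu_out if zero else pc_plus_4
```

Computing the target during decode costs nothing, because the ALU would otherwise be idle in that state; the result is simply ignored for non-branch instructions.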
Slide 47
A multi-cycle CPU capable of R-type & lw/sw & branch instructions
[Figure: the multi-cycle datapath in which ALUout holds the branch address and the PC input selects between PC+4 and the branch address.]
Slide 48
Adding the instruction j to the state diagram:
[Figure: state diagram with Fetch 0, Decode 1, AdrCmp 2, Load 3, WB 4, Store 5, ALU 6, WBR 7, Branch 8 and Jump 9.]
• j: State 9, Jump: PC = PC[31:28] || (IR[25:0]<<2)
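A minimal Python sketch of the jump-address formation: the upper 4 bits come from the PC and the lower 28 bits from the instruction's 26-bit field shifted left by 2. The function name is illustrative, and the PC is assumed to be the value already incremented during fetch.

```python
def jump_target(pc, ir):
    """PC = PC[31:28] || (IR[25:0] << 2)."""
    upper = pc & 0xF0000000           # PC[31:28], kept in place
    lower = (ir & 0x03FFFFFF) << 2    # IR[25:0] shifted left by 2 (28 bits)
    return upper | lower
```

For example, the instruction word 0x08100008 (opcode 000010, target field 0x100008) fetched at 0x00400000 jumps to 0x00400020.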
Slide 49
A multi-cycle CPU capable of R-type & lw/sw & branch & jump instructions
[Figure: the multi-cycle datapath in which the PC input selects among PC+4 (the next address), the branch address held in ALUout, and the jump address formed from IR[25:0] shifted left by 2 joined with PC[31:28].]
Slide 50
The phases (steps) of all instructions
Step 1, instruction fetch (state 0), all instructions:
  IR = Memory[PC]
  PC = PC + 4
Step 2, instruction decode/register fetch (state 1), all instructions:
  A = Reg[IR[25-21]]
  B = Reg[IR[20-16]]
  ALUOut = PC + (sign-extend(IR[15-0]) << 2)
Step 3, execution, address computation, branch/jump completion:
  R-type (state 6): ALUOut = A op B
  Memory reference (state 2): ALUOut = A + sign-extend(IR[15-0])
  Branches (state 8): if (A == B) then PC = ALUOut
  Jumps (state 9): PC = PC[31-28] || (IR[25-0] << 2)
Step 4, memory access or R-type completion:
  R-type (state 7): Reg[IR[15-11]] = ALUOut
  Load (state 3): MDR = Memory[ALUOut]
  Store (state 5): Memory[ALUOut] = B
Step 5, memory read completion:
  Load (state 4): Reg[IR[20-16]] = MDR
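The step table can be written down as data to count CK cycles per instruction class. This is only a sketch: the RTL strings are copied from the table, while the names STEPS, cycles and average_cpi are invented here and the instruction mix is the one from slide 3.

```python
FETCH = "IR = Memory[PC]; PC = PC + 4"
DECODE = ("A = Reg[IR[25-21]]; B = Reg[IR[20-16]]; "
          "ALUOut = PC + (sign-extend(IR[15-0]) << 2)")
ADDRESS = "ALUOut = A + sign-extend(IR[15-0])"

# One RTL string per CK cycle, in execution order.
STEPS = {
    "R-type": [FETCH, DECODE, "ALUOut = A op B", "Reg[IR[15-11]] = ALUOut"],
    "lw": [FETCH, DECODE, ADDRESS, "MDR = Memory[ALUOut]",
           "Reg[IR[20-16]] = MDR"],
    "sw": [FETCH, DECODE, ADDRESS, "Memory[ALUOut] = B"],
    "beq": [FETCH, DECODE, "if (A == B) then PC = ALUOut"],
    "j": [FETCH, DECODE, "PC = PC[31-28] || (IR[25-0] << 2)"],
}

# Instruction mix from slide 3, keyed by the class names used above.
MIX = {"R-type": 0.44, "lw": 0.24, "sw": 0.12, "beq": 0.18, "j": 0.02}


def cycles(kind):
    """Number of CK cycles an instruction class needs."""
    return len(STEPS[kind])


def average_cpi(mix):
    """Average cycles per instruction for a given instruction mix."""
    return sum(cycles(kind) * share for kind, share in mix.items())
```

With this mix the average comes to about 4 cycles per instruction, so the multi-cycle design pays off only if its CK cycle is several times shorter than the single-cycle one.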
Slide 51
MultiCycle implementation with Control
[Figure: the book’s multi-cycle datapath with the control unit. Control input: Op[5-0] = Instruction[31-26]. Control outputs: PCWriteCond, PCWrite, IorD, MemRead, MemWrite, MemtoReg, IRWrite, PCSource, ALUOp, ALUSrcB, ALUSrcA, RegWrite, RegDst. The jump address [31-0] is PC[31-28] joined with Instruction[25-0] shifted left by 2.]
Finite State Machine
• State 0, Instruction fetch (Start): MemRead, ALUSrcA = 0, IorD = 0, IRWrite, ALUSrcB = 01, ALUOp = 00, PCWrite, PCSource = 00
• State 1, Instruction decode/register fetch: ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00
• State 2, Memory address computation, entered when (Op = 'LW') or (Op = 'SW'): ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
• State 3, Memory access, entered when (Op = 'LW'): MemRead, IorD = 1
• State 4, Write-back step: RegDst = 0, RegWrite, MemtoReg = 1
• State 5, Memory access, entered when (Op = 'SW'): MemWrite, IorD = 1
• State 6, Execution, entered when (Op = R-type): ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10
• State 7, R-type completion: RegDst = 1, RegWrite, MemtoReg = 0
• State 8, Branch completion, entered when (Op = 'BEQ'): ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCWriteCond, PCSource = 01
• State 9, Jump completion, entered when (Op = 'J'): PCWrite, PCSource = 10
Slide 53
The final state diagram:
• Fetch (0) → Decode (1)
• Decode → AdrCmp (2) for lw + sw, ALU (6) for R-type, Branch (8) for beq, Jump (9) for j
• AdrCmp → Load (3) for lw, Store (5) for sw
• Load → WB (4); ALU → WBR (7)
• WB, Store, WBR, Branch and Jump return to Fetch
Slide 54
State | State name | RTL description | Relevant control signals
0 | Fetch | IR = M(PC); PC = PC+4 | IorD=0, MemRead=1, IRWrite=1; ALUSrcA=0, ALUSrcB=01, ALUop=00 (add), PCSrc=00, PCWrite=1
1 | Decode | A = Rs, B = Rt; ALUOut = PC+(sext(imm)<<2) | (no signals are needed for A and B); ALUSrcA=0, ALUSrcB=11, ALUop=00 (add) (for branch)
2 | AdrCmp | ALUOut = A+sext(imm); (B = Rt) (for sw) | ALUSrcA=1, ALUSrcB=10, ALUop=00 (add) (for lw & sw)
6 | ALU | ALUOut = A op B | ALUSrcA=1, ALUSrcB=00, ALUop=10 (funct bits determine op)
8 | Branch | if (A == B) PC = ALUOut, else do nothing | ALUSrcA=1, ALUSrcB=00, ALUop=01 (sub), PCSrc=01, PCWriteCond=1
9 | Jump | PC = PC[31:28] || (IR[25:0]<<2) | PCSrc=10, PCWrite=1
3 | Load | MDR = M(ALUOut) | IorD=1, MemRead=1
5 | Store | M(ALUOut) = B | IorD=1, MemWrite=1
7 | WBR | Rd = ALUOut | RegDst=1, MemtoReg=0, RegWrite=1
4 | WB | Rt = MDR | RegDst=0, MemtoReg=1, RegWrite=1
Slide 55
• Implementation:
Finite State Machine for Control (The book’s version)
[Figure: a control logic block. Inputs: Op5 to Op0 from the instruction register opcode field and S3 to S0 from the state register. Outputs: PCWrite, PCWriteCond, IorD, MemRead, MemWrite, IRWrite, MemtoReg, PCSource, ALUOp, ALUSrcB, ALUSrcA, RegWrite, RegDst, and the next state NS3 to NS0 fed back into the state register.]
Slide 56
The Control Finite State Machine:
[Figure: the “next state calculation” block takes the opcode = IR[31:26], the flags (zero, neg, etc.) and the current state and produces the next state; the state register is clocked by ck; the outputs decoder turns the current state into the control signals.]
For 10 states coded 0-9, we need 4 bits, i.e., [S3,S2,S1,S0]
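The bit count follows from rounding log2 of the number of states up to an integer. A small Python sketch, with function names chosen here for illustration:

```python
def state_bits_needed(num_states):
    """Smallest number of bits that can encode states 0 .. num_states-1."""
    return max(1, (num_states - 1).bit_length())


def encode_state(state, width):
    """Binary string for a state number, most significant bit (S3) first."""
    return format(state, "0{}b".format(width))
```

With 4 bits, 6 of the 16 codes (10 to 15) are unused, which is where the don't-care entries in the decoder table come from.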
Slide 57
The control signals decoder
We just implement the table of slide 54. Let’s look at ALUSrcA: it is “0” in states 0 and 1 and it is “1” in states 2, 6 and 8. In all other states we don’t care. Let’s look at PCWrite: it is “1” in states 0 and 9. In all other states it must be “0”. And so we’ll fill the table below and build the decoder.
State    S3 S2 S1 S0  ALUSrcA  PCWrite  PCWriteCond
fetch    0  0  0  0   0        1        0
decode   0  0  0  1   0        0        0
AdrCmp   0  0  1  0   1        0        0
load     0  0  1  1   X        0        0
WB       0  1  0  0   X        0        0
store    0  1  0  1   X        0        0
ALU      0  1  1  0   1        0        0
WBR      0  1  1  1   X        0        0
branch   1  0  0  0   1        0        1
jump     1  0  0  1   X        1        0
All other combinations: X
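The decoder for these three signals can be sketched in Python. The function name is hypothetical, and None is used here to stand for a don't-care (X) output.

```python
def decode_outputs(s3, s2, s1, s0):
    """Map the state bits to ALUSrcA, PCWrite and PCWriteCond.

    None stands for a don't-care (X) entry.
    """
    state = (s3 << 3) | (s2 << 2) | (s1 << 1) | s0
    if state > 9:
        # Unused state codes: every output is a don't-care.
        return {"ALUSrcA": None, "PCWrite": None, "PCWriteCond": None}
    if state in (0, 1):
        alu_src_a = 0
    elif state in (2, 6, 8):
        alu_src_a = 1
    else:
        alu_src_a = None
    return {
        "ALUSrcA": alu_src_a,
        "PCWrite": 1 if state in (0, 9) else 0,
        "PCWriteCond": 1 if state == 8 else 0,
    }
```

The write-type signals are never don't-care in the used states, because asserting one by accident would change the machine state; only data-routing signals such as ALUSrcA may be X.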
Slide 58
The state machine “next state calc.” logic
R-type=000000, lw=100011, sw=101011, beq=000100, bne=000101, lui=001111, j=000010, jal=000011, addi=001000
[Figure: the state diagram with Fetch 0, Decode 1, AdrCmp 2, Load 3, WB 4, Store 5, ALU 6, WBR 7, Branch 8 and Jump 9.]
opcode (IR31 to IR26) | current state (S3 S2 S1 S0) | next state (S3 S2 S1 S0) | instruction
X X X X X X | 0 0 0 0 | 0 0 0 1 | any (Fetch → Decode)
0 0 0 0 0 0 | 0 0 0 1 | 0 1 1 0 | R-type (Decode → ALU)
1 0 X 0 1 1 | 0 0 0 1 | 0 0 1 0 | lw+sw (Decode → AdrCmp)
1 0 0 0 1 1 | 0 0 1 0 | 0 0 1 1 | lw (AdrCmp → Load)
1 0 1 0 1 1 | 0 0 1 0 | 0 1 0 1 | sw (AdrCmp → Store)
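A Python sketch of the next-state calculation for the five instructions in the state diagram. The opcodes are the ones listed on this slide and the state numbers are those of the diagram; the function names are illustrative only.

```python
OPCODES = {"R-type": "000000", "lw": "100011", "sw": "101011",
           "beq": "000100", "j": "000010"}

# State numbers as used in the state diagram.
FETCH, DECODE, ADRCMP, LOAD, WB, STORE, ALU, WBR, BRANCH, JUMP = range(10)

AFTER_DECODE = {
    OPCODES["R-type"]: ALU,
    OPCODES["lw"]: ADRCMP,
    OPCODES["sw"]: ADRCMP,
    OPCODES["beq"]: BRANCH,
    OPCODES["j"]: JUMP,
}


def next_state(state, opcode):
    """Next state from the current state and the 6-bit opcode string."""
    if state == FETCH:
        return DECODE                      # the opcode is a don't-care here
    if state == DECODE:
        if opcode not in AFTER_DECODE:
            raise ValueError("opcode not handled by this state diagram")
        return AFTER_DECODE[opcode]
    if state == ADRCMP:
        if opcode == OPCODES["lw"]:
            return LOAD
        if opcode == OPCODES["sw"]:
            return STORE
        raise ValueError("AdrCmp is reached only by lw and sw")
    if state == LOAD:
        return WB
    if state == ALU:
        return WBR
    return FETCH                           # WB, Store, WBR, Branch, Jump


def trace(opcode):
    """States visited by one instruction, starting at Fetch."""
    states = [FETCH]
    while True:
        following = next_state(states[-1], opcode)
        if following == FETCH:
            return states
        states.append(following)
```

Only the Decode and AdrCmp states look at the opcode; every other state has a single successor.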
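Slide 59
The Control Finite State Machine:
[Figure: the same block diagram: opcode = IR[31:26] and the current state feed the next state calculation; the state register is clocked by ck; the outputs decoder produces the control signals. The control signals decoded from the state alone are labeled as a Moore machine. The path that combines PCWrite, PCWriteCond and the zero flag into the write signal to the PC is labeled as a Mealy machine.]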
Slide 60
Microprogramming
[Figure: the control unit built from a microcode memory addressed by a microprogram counter. The microcode memory outputs go to the datapath: PCWrite, PCWriteCond, IorD, MemRead, MemWrite, IRWrite, BWrite, MemtoReg, PCSource, ALUOp, ALUSrcB, ALUSrcA, RegWrite, RegDst. The AddrCtl output and Op[5-0] from the instruction register opcode field drive the address select logic, which chooses the next microprogram counter value; an adder supplies the current value plus 1.]
Slide 61
Microinstruction
(- means the field is empty)
Label | ALU control | SRC1 | SRC2 | Register control | Memory | PCWrite control | Sequencing
Fetch | Add | PC | 4 | - | Read PC | ALU | Seq
- | Add | PC | Extshft | Read | - | - | Dispatch 1
Mem1 | Add | A | Extend | - | - | - | Dispatch 2
LW2 | - | - | - | - | Read ALU | - | Seq
- | - | - | - | Write MDR | - | - | Fetch
SW2 | - | - | - | - | Write ALU | - | Fetch
Rformat1 | Func code | A | B | - | - | - | Seq
- | - | - | - | Write ALU | - | - | Fetch
BEQ1 | Subt | A | B | - | - | ALUOut-cond | Fetch
JUMP1 | - | - | - | - | - | Jump address | Fetch
Microinstruction format
Field name | Value | Signals active | Comment
ALU control | Add | ALUOp = 00 | Cause the ALU to add.
ALU control | Subt | ALUOp = 01 | Cause the ALU to subtract; this implements the compare for branches.
ALU control | Func code | ALUOp = 10 | Use the instruction's function code to determine ALU control.
SRC1 | PC | ALUSrcA = 0 | Use the PC as the first ALU input.
SRC1 | A | ALUSrcA = 1 | Register A is the first ALU input.
SRC2 | B | ALUSrcB = 00 | Register B is the second ALU input.
SRC2 | 4 | ALUSrcB = 01 | Use 4 as the second ALU input.
SRC2 | Extend | ALUSrcB = 10 | Use output of the sign extension unit as the second ALU input.
SRC2 | Extshft | ALUSrcB = 11 | Use the output of the shift-by-two unit as the second ALU input.
Register control | Read | - | Read two registers using the rs and rt fields of the IR as the register numbers and putting the data into registers A and B.
Register control | Write ALU | RegWrite, RegDst = 1, MemtoReg = 0 | Write a register using the rd field of the IR as the register number and the contents of the ALUOut as the data.
Register control | Write MDR | RegWrite, RegDst = 0, MemtoReg = 1 | Write a register using the rt field of the IR as the register number and the contents of the MDR as the data.
Memory | Read PC | MemRead, IorD = 0 | Read memory using the PC as address; write result into IR (and the MDR).
Memory | Read ALU | MemRead, IorD = 1 | Read memory using the ALUOut as address; write result into MDR.
Memory | Write ALU | MemWrite, IorD = 1 | Write memory using the ALUOut as address, contents of B as the data.
PC write control | ALU | PCSource = 00, PCWrite | Write the output of the ALU into the PC.
PC write control | ALUOut-cond | PCSource = 01, PCWriteCond | If the Zero output of the ALU is active, write the PC with the contents of the register ALUOut.
PC write control | jump address | PCSource = 10, PCWrite | Write the PC with the jump address from the instruction.
Sequencing | Seq | AddrCtl = 11 | Choose the next microinstruction sequentially.
Sequencing | Fetch | AddrCtl = 00 | Go to the first microinstruction to begin a new instruction.
Sequencing | Dispatch 1 | AddrCtl = 01 | Dispatch using the ROM 1.
Sequencing | Dispatch 2 | AddrCtl = 10 | Dispatch using the ROM 2.
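The Sequencing field can be sketched as a tiny microprogram sequencer in Python. Two things are assumptions made for this sketch, not taken from the slides: microinstruction addresses 0 to 9 are assigned in listing order, and the contents of the two dispatch ROMs are filled in from the labels (ROM 1 selects by opcode after decode, ROM 2 separates lw from sw).

```python
SEQ, TO_FETCH, DISPATCH1, DISPATCH2 = "Seq", "Fetch", "Dispatch 1", "Dispatch 2"

# Sequencing field of each microinstruction, in listing order (address 0..9).
SEQUENCING = [SEQ, DISPATCH1, DISPATCH2, SEQ, TO_FETCH,
              TO_FETCH, SEQ, TO_FETCH, TO_FETCH, TO_FETCH]

# Addresses of the labeled microinstructions (assumed: listing order).
LABELS = {"Fetch": 0, "Mem1": 2, "LW2": 3, "SW2": 5,
          "Rformat1": 6, "BEQ1": 8, "JUMP1": 9}

# Assumed dispatch ROM contents, keyed by instruction class.
ROM1 = {"R-type": LABELS["Rformat1"], "lw": LABELS["Mem1"],
        "sw": LABELS["Mem1"], "beq": LABELS["BEQ1"], "j": LABELS["JUMP1"]}
ROM2 = {"lw": LABELS["LW2"], "sw": LABELS["SW2"]}


def next_address(address, op):
    """Address select logic: pick the next microprogram counter value."""
    field = SEQUENCING[address]
    if field == SEQ:
        return address + 1                 # the adder: current address + 1
    if field == DISPATCH1:
        return ROM1[op]
    if field == DISPATCH2:
        return ROM2[op]
    return LABELS["Fetch"]                 # begin a new instruction


def run(op):
    """Microinstruction addresses visited while executing one instruction."""
    path = [LABELS["Fetch"]]
    while True:
        following = next_address(path[-1], op)
        if following == LABELS["Fetch"]:
            return path
        path.append(following)
```

Under these assumptions each instruction visits the same sequence of numbers as the states of the finite state machine, which shows that the microprogram is another encoding of the same control.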
Slide 63
Interrupt and exception
Type of event | From where? | MIPS terminology
I/O device request | External | Interrupt
Invoke operating system from user program | Internal | Exception
Arithmetic overflow | Internal | Exception
Using an undefined instruction | Internal | Exception
Hardware malfunctions | Either | Exception or interrupt
Slide 64
Exceptions handling
Exception type | Exception vector address (in hex)
Undefined instruction | c0 00 00 00
Arithmetic overflow | c0 00 00 20
We have 2 ways to handle exceptions: a Cause register or vectored interrupts.
MIPS uses a Cause register.
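The difference between the two schemes can be sketched in Python. The vector addresses are the ones in the table and the single entry point C0000000 appears on the later slides; the numeric cause codes, the function names and the choice of which address is saved in EPC are assumptions made only for this illustration.

```python
# Vectored scheme: each exception type has its own handler address.
VECTOR_ADDRESS = {
    "undefined instruction": 0xC0000000,
    "arithmetic overflow": 0xC0000020,
}

# Cause-register scheme: one entry point, the reason is stored in Cause.
HANDLER_ENTRY = 0xC0000000
CAUSE_CODE = {"undefined instruction": 0, "arithmetic overflow": 1}  # hypothetical codes


def take_exception_vectored(faulting_pc, kind):
    """Return (EPC, new PC): the handler address itself encodes the cause."""
    return faulting_pc, VECTOR_ADDRESS[kind]


def take_exception_cause_register(faulting_pc, kind):
    """Return (EPC, Cause, new PC): the handler reads Cause to find the reason."""
    return faulting_pc, CAUSE_CODE[kind], HANDLER_ENTRY
```

In both schemes the address of the offending instruction is saved so the handler can report it or return to it; the schemes differ only in how the handler learns the reason.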
Slide 65
Handling exceptions
[Figure: only the state number 10 survives from this diagram.]
Slide 66
Handling exceptions
[Figure: the multi-cycle datapath with control, extended for exceptions: added EPC and Cause registers, added control signals EPCWrite, IntCause and CauseWrite, and the constant C0 00 00 00 as input 3 of the PCSource multiplexer.]
Slide 67
Handling interrupts:
[Figure: the final state diagram (Fetch 0, Decode 1, AdrCmp 2, Load 3, WB 4, Store 5, ALU 6, WBR 7, Branch 8, Jump 9) extended with SavePC (state 10), JumpInt (state 11) and IRET. The added transitions are labeled int and iret.]
Slide 68
[Figure: a D flip-flop (D, Q) with the constant “1” as input. Signals shown: irq (the interrupt source), eint, clr_irq~, and int (to the state machine).]
Handling an interrupt: remembering it in a FF until it is serviced
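A behavioural Python sketch of such a flip-flop. The exact wiring is an assumption inferred from the signal names: irq clocks the constant 1 into the flip-flop, clr_irq~ clears it once the interrupt is serviced, and int is the stored bit gated by the enable signal eint.

```python
class InterruptLatch:
    """Remembers an interrupt request until it is serviced (assumed wiring)."""

    def __init__(self):
        self.q = 0

    def irq_edge(self):
        """Rising edge of irq clocks the constant 1 into the flip-flop."""
        self.q = 1

    def clear(self):
        """clr_irq~ asserted: the request has been serviced."""
        self.q = 0

    def int_signal(self, eint):
        """int (to the state machine): pending request gated by eint."""
        return self.q & eint
```

Because the request is stored, a short irq pulse that arrives in the middle of an instruction is not lost; the state machine sees int the next time it checks.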
Slide 69
[Figure: program flow on an interrupt: jumping to the interrupt routine at C0000000, then Iret, returning from the interrupt.]
Slide 70
[Figure: the same program flow with the irq and eint waveforms added, each switching between 0 and 1.]
Slide 71
The state machine in action during interrupt
Fetch > decode > ex > wb
Fetch > decode > ex > wb
Fetch > decode > ex > wb
Fetch > decode > ex > wb
Fetch > Save_PC > JumpInt (to the interrupt routine at C0000000)
Iret: Fetch > decode > Iret
Slide 72
End of multi-cycle implementation