8/12/2019 4 29 03 ImplementingMIPS 0429
The Midterm is Coming
Midterm on May 8th. Midterm review on May 6th. Come to class with questions.
The midterm will cover everything before it.
It will mostly resemble the homeworks and the reading quizzes.
We will send out a reading quiz compendium on Thursday.
It will be challenging. It will be curved.
The Final is Also Coming (but more slowly)
Despite what the online schedule says, we have only one final time, and it is:
6/10/2014, 8:00am-11:00am.
Implementing a MIPS Processor
Readings: 4.1-4.9
Goals for this Class
Understand how CPUs run programs
    How do we express the computation to the CPU?
    How does the CPU execute it?
    How does the CPU support other system components (e.g., the OS)?
    What techniques and technologies are involved and how do they work?
Understand why CPU performance (and other metrics) varies
    How does CPU design impact performance?
    What trade-offs are involved in designing a CPU?
    How can we meaningfully measure and compare computer systems?
Understand why program performance varies
    How do program characteristics affect performance?
    How can we improve a program's performance by considering the CPU running it?
    How do other system components impact program performance?
Foreshadowing
Act I: A Single-cycle Processor
    Simplest design
    Not how many real machines work (maybe some deeply embedded processors)
    Figure out the basic parts; what it takes to execute instructions
Act II: A Pipelined Processor
    This is how many real machines work
    Exploit parallelism by executing multiple instructions at once.
Target ISA
We will focus on part of MIPS
    Enough to run into the interesting issues
    Memory operations
    A few arithmetic/logical operations (generalizing is straightforward)
    BEQ and J
This corresponds pretty directly to what you'll be implementing in 141L.
Basic Steps for Execution
Fetch an instruction from the instruction store
Decode it
    What does this instruction do?
Gather inputs
    From the register file
    From memory
Perform the operation
Write back the outputs
    To register file or memory
Determine the next instruction to execute
The Processor Design Algorithm
Once you have an ISA
Design/draw the datapath
    Identify and instantiate the hardware for your architectural state
    For each instruction
        Simulate the instruction
        Add and connect the datapath elements it requires
        Is it workable? If not, fix it.
Design the control
    For each instruction
        Simulate the instruction
        What control lines do you need?
        How will you compute their value?
        Modify control accordingly
        Is it workable? If not, fix it.
You've already done much of this in 141L.
Arithmetic; R-Type
    Inst = Mem[PC]
    REG[rd] = REG[rs] op REG[rt]
    PC = PC + 4

bits    31:26  25:21  20:16  15:11  10:6   5:0
name    op     rs     rt     rd     shamt  funct
# bits  6      5      5      5      5      6
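The field layout above can be sketched as a small decoder. This is a minimal illustration, not part of the original slides: the function name `decode_r_type` and the dictionary output are assumptions made for the example; only the bit positions come from the table.

```python
def decode_r_type(inst: int) -> dict:
    """Split a 32-bit R-type instruction word into its six fields.

    Bit positions follow the table above; the function name and the
    dict-based result are illustrative choices, not from the slides.
    """
    return {
        "op":    (inst >> 26) & 0x3F,  # bits 31:26, 6 bits
        "rs":    (inst >> 21) & 0x1F,  # bits 25:21, 5 bits
        "rt":    (inst >> 16) & 0x1F,  # bits 20:16, 5 bits
        "rd":    (inst >> 11) & 0x1F,  # bits 15:11, 5 bits
        "shamt": (inst >> 6) & 0x1F,   # bits 10:6,  5 bits
        "funct": inst & 0x3F,          # bits 5:0,   6 bits
    }


# 0x00430820 is the encoding of "add $1, $2, $3":
# op = 0, rs = 2, rt = 3, rd = 1, shamt = 0, funct = 0x20.
fields = decode_r_type(0x00430820)
print(fields)
```

In hardware these "shifts and masks" are just wires: each field is a fixed slice of the instruction word, which is why decode can be cheap.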
ADDI; I-Type
    PC = PC + 4
    REG[rt] = REG[rs] op SignExtImm

bits    31:26  25:21  20:16  15:0
name    op     rs     rt     imm
# bits  6      5      5      16
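A short sketch of what SignExtImm means for ADDI may help here. The helper names `sign_extend16` and `addi` and the list-of-registers model are assumptions for illustration; the sketch also ignores details the slide does not cover, such as overflow handling and the hardwired zero register.

```python
def sign_extend16(imm: int) -> int:
    """Interpret the low 16 bits of imm as a two's-complement value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def addi(regs: list, rs: int, rt: int, imm16: int) -> None:
    """REG[rt] = REG[rs] + SignExtImm, wrapped to 32 bits.

    Simplification: no overflow trap and no special case for register 0.
    """
    regs[rt] = (regs[rs] + sign_extend16(imm16)) & 0xFFFFFFFF


regs = [0] * 32
regs[2] = 10
addi(regs, rs=2, rt=1, imm16=0xFFFF)  # immediate 0xFFFF sign-extends to -1
print(regs[1])
```

The point of sign extension is that a 16-bit field can still express small negative constants, so the same instruction covers both increments and decrements.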
Load Word
    PC = PC + 4
    REG[rt] = MEM[SignExtImm + REG[rs]]

bits    31:26  25:21  20:16  15:0
name    op     rs     rt     immediate
# bits  6      5      5      16
Store Word
    PC = PC + 4
    MEM[SignExtImm + REG[rs]] = REG[rt]

bits    31:26  25:21  20:16  15:0
name    op     rs     rt     immediate
# bits  6      5      5      16
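The load and store semantics share one address calculation, which a small sketch makes visible. Everything here beyond the two register-transfer statements is an assumption for illustration: memory is modeled as a Python dict keyed by byte address, unwritten locations read as 0, and the helper names are hypothetical.

```python
def sign_extend16(imm: int) -> int:
    """Interpret the low 16 bits of imm as a two's-complement value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def effective_address(regs: list, rs: int, imm16: int) -> int:
    """SignExtImm + REG[rs], wrapped to 32 bits (shared by lw and sw)."""
    return (regs[rs] + sign_extend16(imm16)) & 0xFFFFFFFF


def lw(regs: list, mem: dict, rs: int, rt: int, imm16: int) -> None:
    """REG[rt] = MEM[SignExtImm + REG[rs]]; unwritten memory reads as 0."""
    regs[rt] = mem.get(effective_address(regs, rs, imm16), 0)


def sw(regs: list, mem: dict, rs: int, rt: int, imm16: int) -> None:
    """MEM[SignExtImm + REG[rs]] = REG[rt]."""
    mem[effective_address(regs, rs, imm16)] = regs[rt]


regs = [0] * 32
mem = {}
regs[5] = 0x100                            # base address
regs[1] = 42                               # value to store
sw(regs, mem, rs=5, rt=1, imm16=0xFFFC)    # offset -4: writes address 0xFC
lw(regs, mem, rs=5, rt=4, imm16=0xFFFC)    # reads the same address back
print(hex(effective_address(regs, 5, 0xFFFC)), regs[4])
```

Because both instructions compute the address with the same add, a datapath can reuse the ALU for it, with rt acting as the destination for lw and as the data source for sw.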
A Single-cycle Processor
Performance refresher
    ET = IC * CPI * CT
    Single-cycle CPI == 1; that sounds great
    Unfortunately, single-cycle CT is large
Even RISC instructions take quite a bit of effort to execute
    This is a lot to do in one cycle
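As a quick worked instance of the performance equation, the sketch below evaluates ET = IC * CPI * CT for a single-cycle design. The instruction count of 1000 and the 18 ns cycle time are illustrative inputs chosen for the example, and the function name is hypothetical.

```python
def execution_time_ns(ic: int, cpi: float, ct_ns: float) -> float:
    """ET = IC * CPI * CT, with the cycle time given in nanoseconds."""
    return ic * cpi * ct_ns


# Single-cycle design: CPI is exactly 1, but every instruction pays the
# full (long) cycle time. The inputs below are illustrative.
et = execution_time_ns(ic=1000, cpi=1, ct_ns=18)
print(et)
```

With CPI pinned at 1, the only remaining lever in a single-cycle design is CT, which is why the long cycle time dominates.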
Our Hardware is Mostly Idle
Cycle time = 18 ns
Slowest module (ALU) is ~6 ns
Processor Design in Two Acts
Act II: A pipelined CPU
Pipelining
Letter Answer
A Allows the execution of multiple instructions to overlap
B Prevents branch articulation
C Significantly decreases the amount of time it takes to execute a particular instruction
D Significantly increases the amount of time it takes to implement a particular instruction
E A and D
Pipelining
Letter Answer
A Increases instruction count
B Reduces CPI
C Reduces cycle time
D Has no effect on performance
E B and C
Data hazards
Letter Answer
A Occur because a value is not ready when it's needed
B Occur because the next PC is not yet known.
C Cannot be removed.
D A and B
E All of the above
Stalling a processor
Letter Answer
A Reduces CPI and increases instruction count.
B Means that instructions early in the pipeline stop making progress
C Can resolve some hazards.
D B and C
E A and C
Forwarding
Letter Answer
A Is just for email.
B Allows the processor to resolve control hazards.
C Improves CPI
D Reduces cycle time
E Interacts poorly with stalling.
Pipelining Review
Pipelining
Break up the logic with pipeline registers into pipeline stages
Each stage can act on a different instruction/data
States/control signals of instructions are held in pipeline registers (latches)
[Figure: a 10 ns block of logic between two latches, and the same logic split into five 2 ns stages separated by latches]
Pipelining
[Figure: the five 2 ns stages separated by latches, drawn once for each of cycle #1 through cycle #5]
Performance of a pipelined processor
If we have 500 instructions, what's the speedup of a 5-stage pipelined processor with a 2 ns cycle time vs. a single-cycle processor with a 10 ns cycle time?
A. 5
B. 4.96
C. 2.78
D. 1
E. None of the above
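One way to work this question is to count cycles directly. The sketch below assumes an idealized pipeline with no stalls, where the first instruction needs one cycle per stage and each later instruction finishes one cycle after the previous one; the function names are hypothetical.

```python
def single_cycle_time_ns(n_insts: int, ct_ns: float) -> float:
    """Each instruction takes exactly one (long) cycle."""
    return n_insts * ct_ns


def pipelined_time_ns(n_insts: int, stages: int, ct_ns: float) -> float:
    """Ideal pipeline, no stalls: 'stages' cycles to finish the first
    instruction, then one more cycle per remaining instruction."""
    return (stages + n_insts - 1) * ct_ns


t_single = single_cycle_time_ns(500, 10)   # 500 * 10 ns
t_pipe = pipelined_time_ns(500, 5, 2)      # (5 + 499) * 2 ns
speedup = t_single / t_pipe
print(t_single, t_pipe, round(speedup, 2))
```

The speedup falls slightly short of the stage count because of the cycles spent filling the pipeline; the gap shrinks as the instruction count grows.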
Recap: Clock
A hardware signal that defines when data is valid and stable
    Think about the clock in real life!
We use edge-triggered clocking
    Values stored in the sequential logic are updated only on a clock edge
[Figure: sequential logic and combinational logic]
The 5-Stage MIPS Pipeline
Instruction Fetch (IF)
    Read the instruction
Instruction Decode (ID)
    Figure out the incoming instruction
    Fetch the operands from the register file
Execution (EXE): ALU
    Perform ALU functions
Memory Access (MEM)
    Read/write data memory
Write Back (WB): write back results to registers
    Write to register file
Pipelined Datapath
[Figure: datapath with PC, an Add for PC + 4, Instruction Memory, Register File, Sign Extend (16 to 32 bits), Shift left 2, a second Add, ALU, and Data Memory]
Pipelined datapath
[Figure: pipelined datapath. Pipeline registers IF/ID, ID/EX, EX/MEM, and MEM/WB separate the Instruction Fetch, Instruction Decode, Execution, Memory Access, and Write Back stages. Register file inputs come from inst[25:21], inst[20:16], and inst[15:11]. Control signals shown: ALUSrc, ALUop, RegDst, RegWrite, MemRead, MemWrite, MemtoReg, PCSrc; the ALU produces Zero.]
PCSrc = Branch & Zero
Will this work?
Pipelined datapath
[Figure: the pipelined datapath, with pipeline registers IF/ID, ID/EX, EX/MEM, and MEM/WB, annotated with this instruction sequence:]
add $1, $2, $3
lw $4, 0($5)
sub $6, $7, $8
sub $9,$10,$11
sw $1, 0($12)
[Figure: the same pipelined datapath and the same five-instruction sequence (add, lw, sub, sub, sw), shown again on three further slides]
Pipelined datapath
[Figure: the pipelined datapath, with pipeline registers IF/ID, ID/EX, EX/MEM, and MEM/WB, annotated with the instruction sequence below]
add $1, $2, $3
lw $4, 0($5)
sub $6, $7, $8
sub $9,$10,$11
sw $1, 0($12)
Is this right?
Pipelined datapath
[Figure: revised pipelined datapath with pipeline registers IF/ID, ID/EX, EX/MEM, and MEM/WB; inst[15:11] is now labeled separately from the inst[25:21] and inst[20:16] register file inputs. Control signals shown: ALUSrc, ALUop, RegDst, RegWrite, MemRead, MemWrite, MemtoReg, PCSrc; the ALU produces Zero.]
Pipelined datapath + control
[Figure: the revised pipelined datapath with a Control block added. The control outputs are grouped as EX, ME, and WB and are labeled along the pipeline registers as WB/ME/EX, then WB/ME, then WB; RegWrite is labeled at the end of the WB group.]
Simplified pipeline diagram
1. Use symbols to represent the physical resources, with the abbreviations for the pipeline stages: IF, ID, EXE, MEM, WB
2. The horizontal axis represents the timeline; the vertical axis represents the instruction stream
3. Example:

add $1, $2, $3    IF   ID   EXE  MEM  WB
lw $4, 0($5)           IF   ID   EXE  MEM  WB
sub $6, $7, $8              IF   ID   EXE  MEM  WB
sub $9,$10,$11                   IF   ID   EXE  MEM  WB
sw $1, 0($12)                         IF   ID   EXE  MEM  WB
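The diagram convention above is mechanical enough to generate. The following sketch prints one row per instruction, shifting each row one cycle to the right; it assumes an ideal pipeline with no stalls, and the function name and column widths are arbitrary choices for the example.

```python
STAGES = ["IF", "ID", "EXE", "MEM", "WB"]


def pipeline_diagram(instructions: list) -> list:
    """Return one text row per instruction.

    Instruction i enters IF in cycle i, so its row is indented by i
    empty cycle columns (ideal pipeline, no stalls assumed).
    """
    rows = []
    for i, inst in enumerate(instructions):
        cells = [""] * i + STAGES
        timeline = "".join(f"{cell:<5}" for cell in cells).rstrip()
        rows.append(f"{inst:<16}" + timeline)
    return rows


program = [
    "add $1, $2, $3",
    "lw $4, 0($5)",
    "sub $6, $7, $8",
    "sub $9,$10,$11",
    "sw $1, 0($12)",
]
rows = pipeline_diagram(program)
for row in rows:
    print(row)
```

Reading down any one column shows which instruction occupies each stage in that cycle, which is the view needed later when looking for hazards.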
How much speedup should pipelining provide, and why?
Letter Answer
A 5x, by Amdahl's law (x = 0.8, S = 6.25)
B 25x, by the PE, since CPI goes up by 5x and cycle time goes down by 5x
C 2.24x, by the PE and the quotient rule
D 5x, by the PE, since cycle time goes down by 90%
E 5x, by the PE, since clock rate goes up by 5x
Pipelining Inaction
Ctrl 0.797 ns
Imem 2.77 ns
ArgBMux 1.124 ns
ALU 6.527 ns
Dmem 1.744 ns
WriteRegMux 3.067 ns
RegFile 2.27 ns
Single-cycle Implementation to scale
Ideal 5-stage Pipeline (3.733 ns -> 267 MHz)
18.667 ns -> 3.733 ns == 80% reduction in CT
L_old = IC * CPI * CT_old
L_new = IC * CPI * CT_new
CT_new = 0.2 * CT_old
L_new = 0.2 * L_old
Speedup = L_old / L_new = 5x
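The ideal-case arithmetic can be checked with a few lines. This sketch assumes the single-cycle logic divides perfectly evenly across five stages and that IC and CPI are unchanged, which is exactly the idealization on the slide; the variable names are illustrative.

```python
ct_old_ns = 18.667   # single-cycle cycle time from the slide
stages = 5

# Ideal split: each stage gets exactly 1/5 of the original cycle time.
ct_new_ns = ct_old_ns / stages

# Clock rate in MHz is 1000 / (cycle time in ns).
clock_mhz = 1000.0 / ct_new_ns

# With IC and CPI held constant, latency scales with cycle time alone.
speedup = ct_old_ns / ct_new_ns

print(round(ct_new_ns, 3), round(clock_mhz, 1), round(speedup, 2))
```

Note that the 5x figure depends entirely on the even split; the module latencies are far from equal, so a real stage boundary cannot achieve it.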
Single-cycle Implementation to scale
Ideal 5-stage Pipeline (3.733 ns -> 267 MHz)
Realistic 5-stage Pipeline
What's the actual speedup? Clock rate?
Letter Answer
A 3x; 150 MHz
B 1.02x; 76.6 MHz
C 2.85x; 153 MHz
D 5.49x; 294 MHz
E None of the above