Upload
others
View
1
Download
0
Embed Size (px)
Citation preview
Laboratorio deTecnologías de Información
Arquitectura de Computadoras Pipelining- 1
Laboratorio deTecnologías de Información
Introduction to Pipelining:Introduction to Pipelining: DatapathDatapath
Arquitectura de ComputadorasArquitectura de ComputadorasArturo DArturo Dííaz Paz Péérezrez
Centro de InvestigaciCentro de Investigacióón y de Estudios Avanzados del IPNn y de Estudios Avanzados del IPNLaboratorio de TecnologLaboratorio de Tecnologíías de Informacias de Informacióónn
[email protected]@cinvestav.mx
Laboratorio deTecnologías de Información
Arquitectura de Computadoras Pipelining- 2
Laboratorio deTecnologías de InformaciónPipeliningPipelining
1 2 3 4
A way
of
exploiting
instruction
level
parallelism
1 2 3 4
1 2 3 4
1 2 3 4
timein
strn
s Throughput
Latency
1 2 3 4
1 2 3 4
1 2 3 4
time
inst
rns Throughput
Latency
Laboratorio deTecnologías de Información
Arquitectura de Computadoras Pipelining- 3
Laboratorio deTecnologías de InformaciónObservationsObservations
♦
Pipelining doesn’t help latency of a single task, it helps throughput of the entire workload
♦
Pipeline rate limited by slowest pipeline stage♦
Multiple tasks operating simultaneously
♦
Potential speedup = Number pipe stages♦
Unbalanced lengths of pipe stages reduces speedup
♦
Time to “fill”
pipeline and time to “drain”
it reduces speedup
Laboratorio deTecnologías de Información
Arquitectura de Computadoras Pipelining- 4
Laboratorio deTecnologías de Información5 Steps of DLX Datapath5 Steps of DLX Datapath
PCInst.
Memory
Add
Registers
SignExtend
Add
Mux
Mux
Zero ?
Mux
DataMemory
Mux
NPC
IR
16
A
B
SM D
ALUOutput
LM D
Instruction Fetch Instruction Decode/Register Fetch
ExecuteAddr. Calc.
MemoryAccess
WriteBack
32
4
Laboratorio deTecnologías de Información
Arquitectura de Computadoras Pipelining- 5
Laboratorio deTecnologías de InformaciónPipelinedPipelined DLX DLX DatapathDatapath
Data stationary
control-
local decode
for
each
phase
/ pipeline
stage
Laboratorio deTecnologías de Información
Arquitectura de Computadoras Pipelining- 6
Laboratorio deTecnologías de Información
Visualizing PipeliningVisualizing Pipelining
Single Cycle, Multiple Cycle, vs. PipelineSingle Cycle, Multiple Cycle, vs. Pipeline
Clk
Cycle 1
Multiple Cycle Implementation:
Ifetch Reg Exec Mem Wr
Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 6 Cycle 7 Cycle 8 Cycle 9 Cycle 10
Load Ifetch Reg Exec Mem Wr
Ifetch Reg Exec MemLoad Store
Pipeline Implementation:
Ifetch Reg Exec Mem WrStore
Clk
Single Cycle Implementation:
Load Store Waste
IfetchR-type
Ifetch Reg Exec Mem WrR-type
Cycle 1 Cycle 2
R-type
Laboratorio deTecnologías de Información
Arquitectura de Computadoras Pipelining- 8
Laboratorio deTecnologías de InformaciónWhy Pipeline?Why Pipeline?
♦
Suppose we execute 100 instructions♦
Single Cycle Machine■
45 ns/cycle x 1 CPI x 100 inst = 4500 ns
♦
Multicycle Machine■
10 ns/cycle x 4.6 CPI (due to inst mix) x 100 inst = 4600 ns
♦
Ideal pipelined machine■
10 ns/cycle x (1 CPI x 100 inst + 4 cycle drain) = 1040 ns
Laboratorio deTecnologías de Información
Arquitectura de Computadoras Pipelining- 9
Laboratorio deTecnologías de InformaciónLimits to PipeliningLimits to Pipelining
♦
Its not that easy for computers
♦
Limits to pipelining: Hazards
prevent next instruction from executing during its designated clock cycle■
Structural hazards: HW cannot support this combination of instructions
■
Data hazards: instruction depends on result of prior instruction still in the pipeline
■
Control hazards: pipelining of branches & other instructions that change the PC
♦
Common solution is to stall the pipeline until the hazard is resolved, inserting one or more “bubbles”
in the pipeline
Laboratorio deTecnologías de Información
Arquitectura de Computadoras Pipelining- 10
Laboratorio deTecnologías de Información
1 1 MemoryMemory isis StructuralStructural HazardHazard
Detection is easy in this case!
(right half highlight means read, left half write)
Laboratorio deTecnologías de Información
Arquitectura de Computadoras Pipelining- 11
Laboratorio deTecnologías de Información
1 Memory is Structural Hazard1 Memory is Structural Hazard
Laboratorio deTecnologías de Información
Arquitectura de Computadoras Pipelining- 12
Laboratorio deTecnologías de InformaciónData Hazard on r1Data Hazard on r1
• Dependencies backwards in time are hazards
Instr.
Order
Time (clock cycles)
add r1,r2,r3
sub r4,r1,r3
and r6,r1,r7
or r8,r1,r9
xor r10,r1,r11
IF ID/RF EX MEM WBAL
UIm Reg Dm Reg
AL
UIm Reg Dm RegA
LUIm Reg Dm Reg
Im
AL
UReg Dm Reg
AL
UIm Reg Dm Reg
Laboratorio deTecnologías de Información
Arquitectura de Computadoras Pipelining- 13
Laboratorio deTecnologías de Información
Data Hazard SolutionData Hazard Solution
• “Forward” result from one stage to another
Instr.
Order
Time (clock cycles)
add r1,r2,r3
sub r4,r1,r3
and r6,r1,r7
or r8,r1,r9
xor r10,r1,r11
IF ID/RF EX MEM WBAL
UIm Reg Dm Reg
AL
UIm Reg Dm RegA
LUIm Reg Dm Reg
Im
AL
UReg Dm Reg
AL
UIm Reg Dm Reg
•“or” OK if define read/write properly
Laboratorio deTecnologías de Información
Arquitectura de Computadoras Pipelining- 14
Laboratorio deTecnologías de Información3 Generic Data Hazards3 Generic Data Hazards
♦
Instri
followed by Instrj
♦
Read After Write (RAW)■
Instrj
tries to read operand before Instri
writes it♦
Write After Read (WAR)■
Instrj tries to write operand before Instri
reads it■
Can’t happen in DLX 5 stage pipeline because
» all instructions take 5 stages» reads are always in stage 2, and» writes are always in stage 5
♦
Write After Write (WAW)■
Instrj tries to write operand before Instri writes it
» Leaves wrong result (Instri not Instrj)■
Can’t happen in DLX 5 stage pipeline because
» all instructions take 5 stages» writes are always in stage 5
Laboratorio deTecnologías de Información
Arquitectura de Computadoras Pipelining- 15
Laboratorio deTecnologías de Información
Control Hazard: WaitControl Hazard: Wait
♦
Stall: wait until decision is clear♦
Impact: 2 lost cycles (i.e. 3 clock cycles per branch instruction) => slow
♦
Move decision to end of decode■
save 1 cycle per branch
Instr.
Order
Time (clock cycles)
Add
Beq
Load
AL
UMem Reg Mem Reg
AL
UMem Reg Mem RegA
LUReg Mem RegMemLostpotential
Laboratorio deTecnologías de Información
Arquitectura de Computadoras Pipelining- 16
Laboratorio deTecnologías de Información
Control Hazard: PredictControl Hazard: Predict
Instr.
Order
Time (clock cycles)
Add
Beq
Load
AL
UMem Reg Mem Reg
AL
UMem Reg Mem Reg
Mem
AL
UReg Mem Reg
♦
Predict:
guess one direction then back up if wrong♦
Impact: 0 lost cycles per branch instruction if right, 1 if wrong (right -
50% of time)■
Need to “Squash”
and restart following instruction if wrong■
Produce CPI on branch of (1 *.5 + 2 * .5) = 1.5■
Total CPI might then be: 1.5 * .2 + 1 * .8 = 1.1 (20% branch)♦
More dynamic scheme: history of 1 branch (-
90%)
Laboratorio deTecnologías de Información
Arquitectura de Computadoras Pipelining- 17
Laboratorio deTecnologías de Información
Control Hazard: Delayed BranchControl Hazard: Delayed Branch
Instr.
Order
Time (clock cycles)
Add
Beq
Misc
AL
UMem Reg Mem Reg
AL
UMem Reg Mem Reg
Mem
AL
UReg Mem Reg
Load Mem
AL
UReg Mem Reg
♦
Delayed Branch: Redefine branch behavior (takes place after next instruction)
♦
Impact: 0 clock cycles per branch instruction if can find instruction to put in “slot”
(-
50% of time)
♦
As launch more instruction per clock cycle, less useful
Laboratorio deTecnologías de Información
Arquitectura de Computadoras Pipelining- 18
Laboratorio deTecnologías de Información
Forwarding to Avoid Data HazardForwarding to Avoid Data Hazard
Laboratorio deTecnologías de Información
Arquitectura de Computadoras Pipelining- 19
Laboratorio deTecnologías de Información
Data Hazard even with ForwardingData Hazard even with Forwarding
Laboratorio deTecnologías de Información
Arquitectura de Computadoras Pipelining- 20
Laboratorio deTecnologías de Información
Forwarding and LoadsForwarding and Loads
Reg
• Dependencies backwards in time are hazards
Time (clock cycles)
lw r1,0(r2)
sub r4,r1,r3
IF ID/RF EX MEM WBAL
UIm Reg Dm
AL
UIm Reg Dm Reg
Can’t solve with forwarding: Must delay/stall instruction dependent on loads
Laboratorio deTecnologías de Información
Arquitectura de Computadoras Pipelining- 21
Laboratorio deTecnologías de Información
Forwarding and LoadsForwarding and Loads
Reg
Time (clock cycles)
lw r1,0(r2)
sub r4,r1,r3
IF ID/RF EX MEM WBAL
UIm Reg DmA
LUIm Reg Dm RegStall
Dependencies backwards in time are hazards
Can’t solve with forwarding: Must delay/stall instruction dependent on loads
Laboratorio deTecnologías de Información
Arquitectura de Computadoras Pipelining- 22
Laboratorio deTecnologías de Información
Designing a Pipelined ProcessorDesigning a Pipelined Processor
♦
Go back and examine your datapath
and control diagram♦
associated resources with states
♦
ensure that flows do not conflict, or figure out how to resolve
♦
assert control in appropriate stage
Control and Control and DatapathDatapath: Split state : Split state diagdiag into 5 piecesinto 5 pieces
IR <- Mem[PC]; PC <– PC+4;
A <- R[rs]; B<– R[rt]
S <– A + B;
R[rd] <– S;
S <– A + SX;
M <– Mem[S]
R[rd] <– M;
S <– A or ZX;
R[rt] <– S;
S <– A + SX;
Mem[S] <- B
If CondPC < PC+SX;
Exe
c
Reg
. Fi
le
Mem
Acc
ess
Dat
aM
em
A
B
SReg
File
Equ
al
PC
Nex
t PC
IR
Inst
. Mem
D
M
Laboratorio deTecnologías de Información
Arquitectura de Computadoras Pipelining- 24
Laboratorio deTecnologías de Información
Pipelined ProcessorPipelined Processor
♦
What happens if we start a new instruction every cycle?
Exe
c
Reg
. Fi
le
Mem
Acc
ess
Dat
aM
em
A
B
S
M
Reg
File
Equ
al
PC
Nex
t PC
IR
Inst
. Mem
Valid
IRex
Dcd
Ctrl
IRm
em
Ex
Ctrl
IRw
b
Mem
Ctrl
WB
Ctrl
D
Laboratorio deTecnologías de Información
Arquitectura de Computadoras Pipelining- 25
Laboratorio deTecnologías de InformaciónPipelinedPipelined DatapathDatapath
Data stationary
control-
local decode
for
each
phase
/ pipeline
stage
Laboratorio deTecnologías de Información
Arquitectura de Computadoras Pipelining- 26
Laboratorio deTecnologías de Información
Pipelining the Load InstructionPipelining the Load Instruction
♦
The five independent functional units in the pipeline datapath
are:■
Instruction Memory for the Ifetch
stage
■
Register File’s Read ports (bus A and busB) for the Reg/Dec
stage■
ALU for the
Exec
stage
■
Data Memory for the Mem
stage■
Register File’s Write
port (bus W) for the Wr
stage
Clock
Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 6 Cycle 7
Ifetch Reg/Dec Exec Mem Wr1st lw
Ifetch Reg/Dec Exec Mem Wr2nd lw
Ifetch Reg/Dec Exec Mem Wr3rd lw
Laboratorio deTecnologías de Información
Arquitectura de Computadoras Pipelining- 27
Laboratorio deTecnologías de Información
The Four Stages of RThe Four Stages of R--typetype
♦
Ifetch: Instruction Fetch■
Fetch the instruction from the Instruction Memory
♦
Reg/Dec: Registers Fetch and Instruction Decode♦
Exec: ■
ALU operates on the two register operands
■
Update PC
♦
Wr: Write the ALU output back to the register file
Cycle 1 Cycle 2 Cycle 3 Cycle 4
Ifetch Reg/Dec Exec WrR-type
Laboratorio deTecnologías de Información
Arquitectura de Computadoras Pipelining- 28
Laboratorio deTecnologías de Información
Pipelining the RPipelining the R--type and Load type and Load InstructionInstruction
♦
We have pipeline conflict or structural hazard:■
Two instructions try to write to the register file at the same time!
■
Only one write port
ClockCycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 6 Cycle 7 Cycle 8 Cycle 9
Ifetch Reg/Dec Exec WrR-type
Ifetch Reg/Dec Exec WrR-type
Ifetch Reg/Dec Exec Mem WrLoad
Ifetch Reg/Dec Exec WrR-type
Ifetch Reg/Dec Exec WrR-type
Laboratorio deTecnologías de Información
Arquitectura de Computadoras Pipelining- 29
Laboratorio deTecnologías de Información
Important ObservationImportant Observation
♦
Each functional unit can only be used once
per instruction
♦
Each functional unit must be used at the same
stage for all instructions:■
Load uses Register File’s Write Port during its 5th
stage
■
R-type uses Register File’s Write Port during its 4th
stage
Ifetch Reg/Dec Exec Mem WrLoad1 2 3 4 5
Ifetch Reg/Dec Exec WrR-type1 2 3 4
°
2 ways to solve this pipeline hazard.
Laboratorio deTecnologías de Información
Arquitectura de Computadoras Pipelining- 30
Laboratorio deTecnologías de Información
Solution 1: Insert Solution 1: Insert ““BubbleBubble”” into into the Pipelinethe Pipeline
♦
Insert a “bubble”
into the pipeline to prevent 2 writes at the same cycle■
The control logic can be complex.
■
Lose instruction fetch and issue opportunity.
♦
No instruction is started in Cycle 6!
Clock
Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 6 Cycle 7 Cycle 8 Cycle 9
Ifetch Reg/Dec Exec WrR-type
Ifetch Reg/Dec Exec
Ifetch Reg/Dec Exec Mem WrLoad
Ifetch Reg/Dec Exec WrR-typeIfetch Reg/Dec Exec WrR-type Pipeline
Bubble
Ifetch Reg/Dec Exec Wr
Laboratorio deTecnologías de Información
Arquitectura de Computadoras Pipelining- 31
Laboratorio deTecnologías de Información
Solution 2: Delay RSolution 2: Delay R--typetype’’s Write s Write by One Cycleby One Cycle
♦
Delay R-type’s register write by one cycle:■
Now R-type instructions also use Reg
File’s write port at Stage 5
■
Mem
stage is a NOOP
stage: nothing is being done.
Clock
Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 6 Cycle 7 Cycle 8 Cycle 9
Ifetch Reg/Dec Mem WrR-type
Ifetch Reg/Dec Mem WrR-type
Ifetch Reg/Dec Exec Mem WrLoad
Ifetch Reg/Dec Mem WrR-type
Ifetch Reg/Dec Mem WrR-type
Ifetch Reg/Dec Exec WrR-type Mem
Exec
Exec
Exec
Exec
1 2 3 4 5
Modified Control & Modified Control & DatapathDatapathIR <- Mem[PC]; PC <– PC+4;
A <- R[rs]; B<– R[rt]
S <– A + B;
R[rd] <– M;
S <– A + SX;
M <– Mem[S]
R[rd] <– M;
S <– A or ZX;
R[rt] <– M;
S <– A + SX;
Mem[S] <- B
if Cond PC < PC+SX;
M <– S
Exe
c
Reg
. Fi
le
Mem
Acc
ess
Dat
aM
em
A
B
SReg
File
Equ
al
PC
Nex
t PC
IR
Inst
. Mem
D
M
M <– S
Laboratorio deTecnologías de Información
Arquitectura de Computadoras Pipelining- 33
Laboratorio deTecnologías de Información
The Four Stages of StoreThe Four Stages of Store
♦
Ifetch: Instruction Fetch■
Fetch the instruction from the Instruction Memory
♦
Reg/Dec: Registers Fetch and Instruction Decode♦
Exec: Calculate the memory address
♦
Mem: Write the data into the Data Memory
Cycle 1 Cycle 2 Cycle 3 Cycle 4
Ifetch Reg/Dec Exec MemStore Wr
Laboratorio deTecnologías de Información
Arquitectura de Computadoras Pipelining- 34
Laboratorio deTecnologías de InformaciónThe Three Stages of The Three Stages of BeqBeq
♦
Ifetch: Instruction Fetch■
Fetch the instruction from the Instruction Memory
♦
Reg/Dec: ■
Registers Fetch and Instruction Decode
♦
Exec: ■
compares the two register operand,
■
select correct branch target address■
latch into PC
Cycle 1 Cycle 2 Cycle 3 Cycle 4
Ifetch Reg/Dec Exec MemBeq Wr
Control DiagramControl DiagramIR <- Mem[PC]; PC < PC+4;
A <- R[rs]; B<– R[rt]
S <– A + B;
R[rd] <– S;
S <– A + SX;
M <– Mem[S]
R[rd] <– M;
S <– A or ZX;
R[rt] <– S;
S <– A + SX;
Mem[S] <- B
If Cond PC < PC+SX;
Exe
c
Reg
. Fi
le
Mem
Acc
ess
Dat
aM
em
Reg
File
Equ
al
PC
Nex
t PC
IR
Inst
. Mem
M <– S M <– S
A
B
S
D
M
Laboratorio deTecnologías de Información
Arquitectura de Computadoras Pipelining- 36
Laboratorio deTecnologías de Información
DatapathDatapath + Data Stationary + Data Stationary ControlControl
Exe
c
Reg
. Fi
le
Mem
Acc
ess
Dat
aM
em
A
B
SReg
File
PC
Nex
t PC
IR
Inst
. Mem
D
Dec
ode
MemCtrl
WB Ctrl
M
rs rt
oprsrt
fun
im
exmewbrwv
mewbrwv
wbrwv
Laboratorio deTecnologías de Información
Arquitectura de Computadoras Pipelining- 37
Laboratorio deTecnologías de Información
LetLet’’s Try it Outs Try it Out
10 lw r1, r2(35)14 addI r2, r2, 320 sub r3, r4, r524 beq r6, r7, 10030 ori r8, r9, 1734 add r10, r11, r12
100 and r13, r14, 15
these addresses are octal
Start: Fetch 10Start: Fetch 10
Exe
c
Reg
. Fi
le
Mem
Acc
ess
Dat
aM
em
A
B
SReg
File
PC
Nex
t PC
IR
Inst
. Mem
D
Dec
ode
MemCtrl
WB Ctrl
M
rs rt im
10 lw r1, r2(35)14 addI r2, r2, 320 sub r3, r4, r524 beq r6, r7, 10030 ori r8, r9, 1734 add r10, r11, r12
100 and r13, r14, 15
n n n n
10
Fetch 14, Decode 10Fetch 14, Decode 10
Exe
c
Reg
. Fi
le
Mem
Acc
ess
Dat
aM
em
A
B
SReg
File
PC
Nex
t PC
IR
Inst
. Mem
D
Dec
ode
MemCtrl
WB Ctrl
M
2 rt im
10
lw
r1, r2(35)
14 addI r2, r2, 320 sub r3, r4, r524 beq r6, r7, 10030 ori r8, r9, 1734 add r10, r11, r12
100 and r13, r14, 15
n n n
14
lwr1
, r2(
35)
Fetch 20, Decode 14, Exec 10Fetch 20, Decode 14, Exec 10
Exe
c
Reg
. Fi
le
Mem
Acc
ess
Dat
aM
em
r2
B
SReg
File
PC
Nex
t PC
IR
Inst
. Mem
D
Dec
ode
MemCtrl
WB Ctrl
M
2 rt 35
10
lw
r1, r2(35)
14
addI
r2, r2, 3
20 sub r3, r4, r524 beq r6, r7, 10030 ori r8, r9, 1734 add r10, r11, r12
100 and r13, r14, 15
n n
20
lwr1
addI
r2, r
2, 3
Fetch 24, Decode 20, Exec 14, Fetch 24, Decode 20, Exec 14, MemMem 1010
Exe
c
Reg
. Fi
le
Mem
Acc
ess
Dat
aM
em
r2
B
r2+3
5
Reg
File
PC
Nex
t PC
IR
Inst
. Mem
D
Dec
ode
MemCtrl
WB Ctrl
M
4 5 3
10
lw
r1, r2(35)
14
addI
r2, r2, 3
20
sub
r3, r4, r5
24 beq r6, r7, 10030 ori r8, r9, 1734 add r10, r11, r12
100 and r13, r14, 15
n
24lw
r1
sub
r3, r
4, r5
addI
r2, r
2, 3
Fetch 30, Fetch 30, DcdDcd 24, Ex 20, 24, Ex 20, MemMem 14, WB 1014, WB 10
Exe
c
Reg
. Fi
le
Mem
Acc
ess
Dat
aM
em
r4
r5
r2+3
Reg
File
PC
Nex
t PC
IR
Inst
. Mem
D
Dec
ode
MemCtrl
WB Ctrl
M[r2
+35]
6 7
10
lw
r1, r2(35)
14
addI
r2, r2, 3
20
sub
r3, r4, r5
24
beq
r6, r7, 100
30 ori r8, r9, 1734 add r10, r11, r12
100 and r13, r14, 15
30
lwr1
beq
r6, r
7 10
0
addI
r2
sub
r3
Fetch 34, Fetch 34, DcdDcd 30, Ex 24, 30, Ex 24, MemMem 20, WB 1420, WB 14
Exe
c
Reg
. Fi
le
Mem
Acc
ess
Dat
aM
em
r6
r7
r2+3
Reg
File
PC
Nex
t PC
IR
Inst
. Mem
D
Dec
ode
MemCtrl
WB Ctrl
r1=M
[r2+3
5]
9 xx
10
lw
r1, r2(35)
14
addI
r2, r2, 3
20
sub
r3, r4, r5
24
beq
r6, r7, 100
30
ori
r8, r9, 17
34 add r10, r11, r12
100 and r13, r14, 15
34
beq
addI
r2
sub
r3r4
-r5
100
orir
8, r9
17
Fetch 100, Fetch 100, DcdDcd 3434, Ex 30, , Ex 30, MemMem 24, WB 2024, WB 20
Exe
c
Reg
. Fi
le
Mem
Acc
ess
Dat
aM
em
r9
x
Reg
File
PC
Nex
t PC
IR
Inst
. Mem
D
Dec
ode
MemCtrl
WB Ctrl
r1=M[r2+35]
11 12
10
lw
r1, r2(35)
14
addI
r2, r2, 3
20
sub
r3, r4, r5
24
beq
r6, r7, 100
30
ori
r8, r9, 17
34
add
r10, r11, r12
100 and r13, r14, 15
100
beq
r2 = r2+3
sub
r3r4
-r5
17or
ir8
xxx
add
r10,
r11,
r12
ooops, we should have only one delayed instruction
Fetch 104, Fetch 104, DcdDcd 100, 100, Ex 34Ex 34, , MemMem 30, WB 2430, WB 24
Exe
c
Reg
. Fi
le
Mem
Acc
ess
Dat
aM
em
r11
r12
Reg
File
PC
Nex
t PC
IR
Inst
. Mem
D
Dec
ode
MemCtrl
WB Ctrl
r1=M[r2+35]
14 15
10
lw
r1, r2(35)
14
addI
r2, r2, 3
20
sub
r3, r4, r5
24
beq
r6, r7, 100
30
ori
r8, r9, 17
34
add
r10, r11, r12
100 and r13, r14, 15
104
beq
r2 = r2+3r3 = r4-r5
xx
orir
8
xxx
add
r10
and
r13,
r14,
r15
n
Squash the extra instruction in the branch shadow!
r9 |
17
Fetch 108, Fetch 108, DcdDcd 104, Ex 100, 104, Ex 100, MemMem 3434, WB 30, WB 30
Exe
c
Reg
. Fi
le
Mem
Acc
ess
Dat
aM
em
r14
r15
Reg
File
PC
Nex
t PC
IR
Inst
. Mem
D
Dec
ode
MemCtrl
WB Ctrl
r1=M[r2+35]
10
lw
r1, r2(35)
14
addI
r2, r2, 3
20
sub
r3, r4, r5
24
beq
r6, r7, 100
30
ori
r8, r9, 17
34
add
r10, r11, r12
100 and r13, r14, 15
110
r2 = r2+3r3 = r4-r5
xx
orir
8
add
r10
and
r13
n
Squash the extra instruction in the branch shadow!r9
| 17
r11+
r12
Fetch 114, Fetch 114, DcdDcd 110, 110, Ex 104Ex 104, , MemMem 100, 100, WB 34WB 34
Exe
c
Reg
. Fi
le
Mem
Acc
ess
Dat
aM
em
Reg
File
PC
Nex
t PC
IR
Inst
. Mem
D
Dec
ode
MemCtrl
WB Ctrl
r1=M[r2+35]
10
lw
r1, r2(35)
14
addI
r2, r2, 3
20
sub
r3, r4, r5
24
beq
r6, r7, 100
30
ori
r8, r9, 17
34
add
r10, r11, r12
100 and r13, r14, 15
114
r2 = r2+3r3 = r4-r5r8 = r9 | 17
add
r10
and
r13
n
Squash the extra instruction in the branch shadow!r1
1+r1
2
NO WBNO Ovflow
r14
& R
15
Laboratorio deTecnologías de Información
Arquitectura de Computadoras Pipelining- 48
Laboratorio deTecnologías de Información
Summary: PipeliningSummary: Pipelining
♦
What makes it easy■
all instructions are the same length
■
just a few instruction formats■
memory operands appear only in loads and stores
♦
What makes it hard?■
structural hazards: suppose we had only one memory
■
control hazards: need to worry about branch instructions■
data hazards: an instruction depends on a previous instruction
♦
We’ll build a simple pipeline and look at these issues
♦
We’ll talk about modern processors and what really makes it hard:■
exception handling
■
trying to improve performance with out-of-order execution, etc.
Laboratorio deTecnologías de Información
Arquitectura de Computadoras Pipelining- 49
Laboratorio deTecnologías de Información
SummarySummary
♦
Pipelining is a fundamental concept■
multiple steps using distinct resources
♦
Utilize capabilities of the Datapath
by pipelined instruction processing■
start next instruction while working on the current one
■
limited by length of longest stage (plus fill/flush)■
detect and resolve hazards