42
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell CS352H: Computer Systems Architecture Topic 9: MIPS Pipeline - Hazards October 1, 2009

CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

  • Upload
    others

  • View
    5

  • Download
    1

Embed Size (px)

Citation preview

Page 1: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell

CS352H: Computer Systems Architecture

Topic 9: MIPS Pipeline - Hazards

October 1, 2009

Page 2: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 2

Data Hazards in ALU Instructions

Consider this sequence:sub $2, $1,$3and $12,$2,$5or $13,$6,$2add $14,$2,$2sw $15,100($2)

We can resolve hazards with forwardingHow do we detect when to forward?

Page 3: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 3

Dependencies & Forwarding

Page 4: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 4

Detecting the Need to Forward

Pass register numbers along pipelinee.g., ID/EX.RegisterRs = register number for Rssitting in ID/EX pipeline register

ALU operand register numbers in EX stage aregiven by

ID/EX.RegisterRs, ID/EX.RegisterRtData hazards when1a. EX/MEM.RegisterRd = ID/EX.RegisterRs1b. EX/MEM.RegisterRd = ID/EX.RegisterRt2a. MEM/WB.RegisterRd = ID/EX.RegisterRs2b. MEM/WB.RegisterRd = ID/EX.RegisterRt

Fwd fromEX/MEMpipeline reg

Fwd fromMEM/WBpipeline reg

Page 5: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 5

Detecting the Need to Forward

But only if forwarding instruction will write to aregister!

EX/MEM.RegWrite, MEM/WB.RegWriteAnd only if Rd for that instruction is not $zero

EX/MEM.RegisterRd ≠ 0,MEM/WB.RegisterRd ≠ 0

Page 6: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 6

Forwarding Paths

Page 7: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 7

Forwarding Conditions

EX hazardif (EX/MEM.RegWrite and (EX/MEM.RegisterRd ≠ 0) and (EX/MEM.RegisterRd = ID/EX.RegisterRs)) ForwardA = 10if (EX/MEM.RegWrite and (EX/MEM.RegisterRd ≠ 0) and (EX/MEM.RegisterRd = ID/EX.RegisterRt)) ForwardB = 10

MEM hazardif (MEM/WB.RegWrite and (MEM/WB.RegisterRd ≠ 0) and (MEM/WB.RegisterRd = ID/EX.RegisterRs)) ForwardA = 01if (MEM/WB.RegWrite and (MEM/WB.RegisterRd ≠ 0) and (MEM/WB.RegisterRd = ID/EX.RegisterRt)) ForwardB = 01

Page 8: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 8

Double Data Hazard

Consider the sequence:add $1,$1,$2add $1,$1,$3add $1,$1,$4

Both hazards occurWant to use the most recent

Revise MEM hazard conditionOnly fwd if EX hazard condition isn’t true

Page 9: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 9

Revised Forwarding Condition

MEM hazardif (MEM/WB.RegWrite and (MEM/WB.RegisterRd ≠ 0) and not (EX/MEM.RegWrite and (EX/MEM.RegisterRd ≠ 0) and (EX/MEM.RegisterRd = ID/EX.RegisterRs)) and (MEM/WB.RegisterRd = ID/EX.RegisterRs)) ForwardA = 01

if (MEM/WB.RegWrite and (MEM/WB.RegisterRd ≠ 0) and not (EX/MEM.RegWrite and (EX/MEM.RegisterRd ≠ 0) and (EX/MEM.RegisterRd = ID/EX.RegisterRt)) and (MEM/WB.RegisterRd = ID/EX.RegisterRt)) ForwardB = 01

Page 10: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 10

Datapath with Forwarding

Page 11: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 11

Load-Use Data Hazard

Need to stallfor one cycle

Page 12: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 12

Load-Use Hazard Detection

Check when using instruction is decoded in ID stageALU operand register numbers in ID stage are given by

IF/ID.RegisterRs, IF/ID.RegisterRtLoad-use hazard when

ID/EX.MemRead and ((ID/EX.RegisterRt = IF/ID.RegisterRs) or (ID/EX.RegisterRt = IF/ID.RegisterRt))

If detected, stall and insert bubble

Page 13: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 13

How to Stall the Pipeline

Force control values in ID/EX registerto 0

EX, MEM and WB do nop (no-operation)

Prevent update of PC and IF/ID registerUsing instruction is decoded againFollowing instruction is fetched again1-cycle stall allows MEM to read data for lw

Can subsequently forward to EX stage

Page 14: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 14

Stall/Bubble in the Pipeline

Stall insertedhere

Page 15: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 15

Stall/Bubble in the Pipeline

Page 16: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 16

Datapath with Hazard Detection

Page 17: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 17

Stalls and Performance

Stalls reduce performanceBut are required to get correct results

Compiler can arrange code to avoid hazards and stallsRequires knowledge of the pipeline structure

Page 18: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 18

Branch Hazards

If branch outcome determined in MEM

PC

Flush theseinstructions(Set controlvalues to 0)

Page 19: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 19

Reducing Branch Delay

Move hardware to determine outcome to ID stageTarget address adderRegister comparator

Example: branch taken36: sub $10, $4, $840: beq $1, $3, 744: and $12, $2, $548: or $13, $2, $652: add $14, $4, $256: slt $15, $6, $7 ...72: lw $4, 50($7)

Page 20: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 20

Example: Branch Taken

Page 21: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 21

Example: Branch Taken

Page 22: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 22

Data Hazards for Branches

If a comparison register is a destination of 2nd or 3rd

preceding ALU instruction

IF ID EX MEM WB

IF ID EX MEM WB

IF ID EX MEM WB

IF ID EX MEM WB

add $4, $5, $6

add $1, $2, $3

beq $1, $4, target

Can resolve using forwarding

Page 23: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 23

Data Hazards for Branches

If a comparison register is a destination of preceding ALUinstruction or 2nd preceding load instruction

Need 1 stall cycle

beq stalled

IF ID EX MEM WB

IF ID EX MEM WB

IF ID

ID EX MEM WB

add $4, $5, $6

lw $1, addr

beq $1, $4, target

Page 24: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 24

Data Hazards for Branches

If a comparison register is a destination of immediatelypreceding load instruction

Need 2 stall cycles

beq stalled

IF ID EX MEM WB

IF ID

ID

ID EX MEM WB

beq stalled

lw $1, addr

beq $1, $0, target

Page 25: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 25

Dynamic Branch Prediction

In deeper and superscalar pipelines, branchpenalty is more significantUse dynamic prediction

Branch prediction buffer (aka branch history table)Indexed by recent branch instruction addressesStores outcome (taken/not taken)To execute a branch

Check table, expect the same outcomeStart fetching from fall-through or targetIf wrong, flush pipeline and flip prediction

Page 26: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 26

1-Bit Predictor: Shortcoming

Inner loop branches mispredicted twice!outer: … …inner: … … beq …, …, inner … beq …, …, outer

Mispredict as taken on last iteration of inner loopThen mispredict as not taken on first iteration ofinner loop next time around

Page 27: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 27

2-Bit Predictor

Only change prediction on two successive mispredictions

Page 28: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 28

Calculating the Branch Target

Even with predictor, still need to calculate the targetaddress

1-cycle penalty for a taken branch

Branch target bufferCache of target addressesIndexed by PC when instruction fetched

If hit and instruction is branch predicted taken, can fetch targetimmediately

Page 29: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 29

Exceptions and Interrupts

“Unexpected” events requiring changein flow of control

Different ISAs use the terms differently

ExceptionArises within the CPU

e.g., undefined opcode, overflow, syscall, …

InterruptFrom an external I/O controller

Dealing with them without sacrificing performance is hard

Page 30: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 30

Handling Exceptions

In MIPS, exceptions managed by a SystemControl Coprocessor (CP0)Save PC of offending (or interrupted) instruction

In MIPS: Exception Program Counter (EPC)Save indication of the problem

In MIPS: Cause registerWe’ll assume 1-bit

0 for undefined opcode, 1 for overflow

Jump to handler at 8000 00180

Page 31: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 31

An Alternate Mechanism

Vectored InterruptsHandler address determined by the cause

Example:Undefined opcode: C000 0000Overflow: C000 0020…: C000 0040

Instructions eitherDeal with the interrupt, orJump to real handler

Page 32: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 32

Handler Actions

Read cause, and transfer to relevant handlerDetermine action requiredIf restartable

Take corrective actionuse EPC to return to program

OtherwiseTerminate programReport error using EPC, cause, …

Page 33: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 33

Exceptions in a Pipeline

Another form of control hazardConsider overflow on add in EX stageadd $1, $2, $1

Prevent $1 from being clobberedComplete previous instructionsFlush add and subsequent instructionsSet Cause and EPC register valuesTransfer control to handler

Similar to mispredicted branchUse much of the same hardware

Page 34: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 34

Pipeline with Exceptions

Page 35: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 35

Exception Properties

Restartable exceptionsPipeline can flush the instructionHandler executes, then returns to the instruction

Refetched and executed from scratch

PC saved in EPC registerIdentifies causing instructionActually PC + 4 is saved

Handler must adjust

Page 36: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 36

Exception Example

Exception on add in40 sub $11, $2, $444 and $12, $2, $548 or $13, $2, $64C add $1, $2, $150 slt $15, $6, $754 lw $16, 50($7)…

Handler80000180 sw $25, 1000($0)80000184 sw $26, 1004($0)…

Page 37: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 37

Exception Example

Page 38: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 38

Exception Example

Page 39: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 39

Multiple Exceptions

Pipelining overlaps multiple instructionsCould have multiple exceptions at once

Simple approach: deal with exception from earliestinstruction

Flush subsequent instructions“Precise” exceptions

In complex pipelinesMultiple instructions issued per cycleOut-of-order completionMaintaining precise exceptions is difficult!

Page 40: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 40

Imprecise Exceptions

Just stop pipeline and save stateIncluding exception cause(s)

Let the handler work outWhich instruction(s) had exceptionsWhich to complete or flush

May require “manual” completion

Simplifies hardware, but more complex handler softwareNot feasible for complex multiple-issueout-of-order pipelines

Page 41: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41

Fallacies

Pipelining is easy (!)The basic idea is easyThe devil is in the details

e.g., detecting data hazards

Pipelining is independent of technologySo why haven’t we always done pipelining?More transistors make more advanced techniquesfeasiblePipeline-related ISA design needs to take account oftechnology trends

e.g., predicated instructions

Page 42: CS352H: Computer Systems Architecture › ~fussell › courses › cs352h › ...University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 41 Fallacies

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 42

Pitfalls

Poor ISA design can make pipelining hardere.g., complex instruction sets (VAX, IA-32)

Significant overhead to make pipelining workIA-32 micro-op approach

e.g., complex addressing modesRegister update side effects, memory indirection

e.g., delayed branchesAdvanced pipelines have long delay slots