COMP 206: Computer Architecture and Implementation
Montek Singh
Wed., Sep 24, 2003
Topic: Pipelining -- Intermediate Concepts (Multicycle Operations; Exceptions)


Outline
• Multi-cycle operations
   • Floating-point operations
   • Structural and data hazards
• Interrupts, Faults and Exceptions
   • Precise exceptions
   • Complications in pipelines
• READING: Appendix A


Pipelining Multicycle Operations
• Assume five-stage pipeline
   • Third stage (execution) has two functional units E1 and E2
      • Instruction goes through either E1 or E2, but not both
      • E1 and E2 are not pipelined
      • Stage delay of E1 = 2 cycles
      • Stage delay of E2 = 4 cycles
      • No buffering on inputs of E1 and E2
   • Stage delay of other stages = 1 cycle
• Consider an instruction sequence of five instructions
   • Instructions 1, 3, 5 need E1
   • Instructions 2, 4 need E2


Space-Time Diagram: Multicycle Operations

Delay Unit  1  2  3  4  5  6  7  8  9 10 11 12 13
1     IF    1  2  3  4  5  5  5
1     ID       1  2  3  4  4  4  5
2     E1          1  1  3  3        5  5
4     E2             2  2  2  2  4  4  4  4
1     MEM               1     3  2        5  4
1     WB                   1     3  2        5  4

• Out-of-order completion
   • 3 finishes before 2, and 5 finishes before 4
• Instructions may be delayed after entering the pipeline because of structural hazards
   • Instructions 2 and 4 both want to use E2 unit at same time
   • Instruction 4 stalls in ID unit
   • This causes instruction 5 to stall in IF unit
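The space-time diagram above can be reproduced with a few lines of Python. This is a minimal sketch, assuming in-order issue, no buffering between stages, and no contention for MEM or WB (none arises in this five-instruction sequence). The function name `schedule` and the dictionary layout are illustrative and not taken from the slides.

```python
def schedule(units, delay):
    """Return, per instruction, the cycle in which it enters each stage."""
    free = {u: 1 for u in delay}    # first cycle in which each EX unit is free
    prev_id = prev_ex = 1           # cycles in which the predecessor entered ID and EX
    rows = []
    for unit in units:
        fetch = prev_id                         # IF frees when the predecessor moves to ID
        decode = max(fetch + 1, prev_ex)        # ID frees when the predecessor moves to EX
        execute = max(decode + 1, free[unit])   # wait in ID while the unit is busy
        free[unit] = execute + delay[unit]      # units are not pipelined
        mem = execute + delay[unit]
        rows.append({"IF": fetch, "ID": decode, "EX": execute,
                     "MEM": mem, "WB": mem + 1})
        prev_id, prev_ex = decode, execute
    return rows


rows = schedule(["E1", "E2", "E1", "E2", "E1"], {"E1": 2, "E2": 4})
wb = [r["WB"] for r in rows]                                     # WB cycle per instruction
order = sorted(range(1, len(wb) + 1), key=lambda n: wb[n - 1])   # completion order
print(wb, order)
```

Under these assumptions instruction 4 sits in ID from cycle 5 until E2 frees up in cycle 8, which in turn holds instruction 5 in IF, matching the stalls listed above.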


Floating-Point Operations in MIPS

[Figure: MIPS pipeline with multicycle FP units. IF and ID feed four parallel execution paths (EX; A1 A2 A3 A4; M1 M2 M3 M4 M5 M6 M7; DIV (25)), which rejoin at MEM and WB.]

• Structural hazard: not fully pipelined
• Structural hazard: instructions have varying running times
• WAW hazards possible; WAR hazards not possible
• Longer operation latency implies more frequent stalls for RAW hazards
• Out-of-order completion; has ramifications for exceptions


Structural Hazard on WB Unit

Cycle                     1   2   3   4   5   6   7   8   9   10  11
DIV.D (issued at t = -16) D   D   D   D   D   D   D   D   D   MEM WB
MUL.D F0, F4, F6          IF  ID  M1  M2  M3  M4  M5  M6  M7  MEM WB
integer instruction           IF  ID  EX  MEM WB
integer instruction               IF  ID  EX  MEM WB
ADD.D F2, F4, F6                      IF  ID  A1  A2  A3  A4  MEM WB
integer instruction                       IF  ID  EX  MEM WB
integer instruction                           IF  ID  EX  MEM WB
L.D F2, 0(R2)                                     IF  ID  EX  MEM WB

• This is the worst-case scenario: max steady-state number of write ports needed is 1
   • Don’t replicate resources; detect and serialize access as needed
• Early resolution
   • Track use of WB in ID stage (using shift register), stall instructions there (reservation register)
   • Simplifies pipeline control; all stalls occur in ID
   • Adds shift register and write-conflict logic
• Late resolution
   • Stall instructions at entry to MEM or WB stage
   • Complicates pipeline control (two stall locations)
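The early-resolution scheme can be sketched as a shift register that records which future cycles already have the WB port claimed. This is an illustrative sketch only: the class `WBReservation`, the helper `issue_all`, the register depth of 16, and the ID-to-WB distances (9 for MUL.D, 6 for ADD.D, 3 for integer and load instructions, counted from the pipeline depths above) are assumptions, not part of the slides.

```python
class WBReservation:
    """Shift register kept in ID: slots[k] is True if WB is claimed k cycles from now."""

    def __init__(self, depth=16):
        self.slots = [False] * depth

    def tick(self):
        # One clock: every reservation moves one cycle closer to WB.
        self.slots = self.slots[1:] + [False]

    def try_issue(self, cycles_to_wb):
        # Refuse (stall in ID) if the WB slot this instruction would need is taken.
        if self.slots[cycles_to_wb]:
            return False
        self.slots[cycles_to_wb] = True
        return True


def issue_all(latencies):
    """Issue instructions in order, one ID attempt per cycle; return each WB cycle."""
    reg, cycle, wb_cycles = WBReservation(), 0, []
    for lat in latencies:
        cycle += 1
        reg.tick()
        while not reg.try_issue(lat):   # stall in ID until the WB slot is free
            cycle += 1
            reg.tick()
        wb_cycles.append(cycle + lat)
    return wb_cycles


# Assumed mix: MUL.D, two integer ops, ADD.D, two integer ops, one load
wb_cycles = issue_all([9, 3, 3, 6, 3, 3, 3])
print(wb_cycles)
```

Every instruction ends up with its own WB cycle, and all of the waiting happens in ID, which is the property that keeps the pipeline control simple.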


WAW Hazards

Stage sequence of each instruction over cycles 1-13 (s = stall cycle):
DIV.D (issued at t = -16):  D D D D D D D D D MEM WB
MULT.D F0, F4, F6:          IF ID s M1 M2 M3 M4 M5 M6 M7 MEM WB
integer instruction:        IF s ID EX MEM WB
integer instruction:        IF ID EX MEM WB
ADD.D F2, F4, F6:           IF ID s A1 A2 A3 A4 MEM WB
L.D F2, 0(R2):              IF ID EX MEM WB

• WAW hazard arises only when no instruction between ADD.D and L.D uses result computed by ADD.D
   • Adding an instruction like “ADD.D F8,F2,F4” before L.D would stall pipeline enough for RAW hazard to avoid WAW hazard
   • Can happen through a branch/trap (example in HP3, Section A.9)
   • Rare situation, but must still handle correctly
• Hazard resolution
   • Delay the issue of L.D until ADD.D enters MEM
   • Cancel write of ADD.D


RAW Hazards

L: L.D F4, 0(R2)
M: MUL.D F0, F4, F6
A: ADD.D F2, F0, F8
S: S.D 0(R2), F2
D: DIV.D F12, F4, F8

Cycle   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19
IF      L  M  A  A  S  S  S  S  S  S  S  D
ID         L  M  M  A  A  A  A  A  A  A  S  D
EX            L                             S  S  S  S
Mult                M  M  M  M  M  M  M
Add                                      A  A  A  A
Div                                            D  D  D  D  D  D
MEM              L                       M           A  S
WB                  L                       M           A  S

• Longer delays of FP operations increase the number of stalls in response to RAW hazards
• Two methods for reducing stalls
   • Compiler could have moved instruction D between instructions M and A, which would allow D to complete earlier; or hardware could detect this possibility and issue instruction D out of order
   • ID stage is a bottleneck because instructions wait there for their operands to be available; could add buffers (reservation stations) to functional units and let instructions await their operands there


Responsibilities of ID (all stalls in ID)
• Three sets of checks
   • Structural hazards
      • Check for availability of FP unit
      • Ensure WB unit will be available when needed
   • RAW hazards
      • Stall current instruction until its source registers are not listed as pending registers in a pipeline register that will not be available when current instruction needs the result
   • WAW hazards
      • If any instruction in adder, divider, or multiplier has same register destination as current instruction, stall current instruction
• Hazards between FP and integer instructions
   • Integer and FP instructions use disjoint sets of registers, except for FP-integer register moves
   • FP load-stores can conflict with integer load-stores in MEM stage
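The three sets of checks can be sketched as one function that ID evaluates before letting an instruction issue. This is a simplified illustration under stated assumptions: `Instr`, `stall_reason`, and their parameters are hypothetical names, `in_flight` holds issued instructions whose results are not yet available, and the RAW test ignores forwarding timing by treating every pending destination as unavailable.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    unit: str      # "int", "add", "mul", or "div"
    dest: str      # destination register
    srcs: tuple    # source registers


def stall_reason(cur, in_flight, busy_units, wb_slot_taken):
    """Return why `cur` must wait in ID, or None if it may issue."""
    # Structural: the unit is not free, or WB is already claimed for the needed cycle.
    if cur.unit in busy_units:
        return "structural: unit busy"
    if wb_slot_taken:
        return "structural: WB conflict"
    # RAW: a source is still pending as the destination of an earlier instruction.
    if any(src == old.dest for old in in_flight for src in cur.srcs):
        return "RAW"
    # WAW: an instruction in the adder, multiplier, or divider writes the same register.
    if any(old.dest == cur.dest and old.unit != "int" for old in in_flight):
        return "WAW"
    return None


add = Instr("add", "F2", ("F4", "F6"))
load = Instr("int", "F2", ("R2",))
use = Instr("add", "F8", ("F2", "F4"))
print(stall_reason(load, [add], set(), False))
print(stall_reason(use, [add], set(), False))
```

With ADD.D F2 still in the adder, the load to F2 is held for the WAW hazard, while an instruction that reads F2 is held for the RAW hazard instead.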


MIPS R4000 Floating-Point Pipeline

Stage  Functional unit  Description
A      FP adder         Mantissa ADD stage
D      FP divider       Divide pipeline stage
E      FP multiplier    Exception test stage
M      FP multiplier    First stage of multiplier
N      FP multiplier    Second stage of multiplier
R      FP adder         Rounding stage
S      FP adder         Operand shift stage
U                       Unpack FP numbers

Add/Subtract
Cycle   1  2  3  4
A          x  x
D
E
M
N
R             x  x
S          x     x
U       x

Multiply
Cycle   1  2  3  4  5  6  7  8
A                         x
D
E          x
M          x  x  x  x
N                      x  x
R                            x
S
U       x

Divide
Cycle   1  2  3  4  … 30 31 32 33 34 35 36
A          x              x     x     x
D                x  …  x  x  x  x  x
E
M
N
R             x              x     x     x
S
U       x


Instruction Mixes in FP Pipeline: Adds Only

Add/Subtract
Cycle   1  2  3  4
A          x  x
D
E
M
N
R             x  x
S          x     x
U       x

• Can’t initiate another add on cycle 2: conflict in stages A and R
• Can’t initiate another add on cycle 3: conflict in stage S

Cycle   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19
A          x  x     y  y     x  x     y  y     x  x     y  y
D
E
M
N
R             x  x     y  y     x  x     y  y     x  x     y  y
S          x     x  y     y  x     x  y     y  x     x  y     y
U       x        y        x        y        x        y

• Forbidden latencies: 1 and 2
• Steady-state utilization (cycles 4 through 18) = (5*7)/(8*15) = 35/120 = 29.17%
• Total utilization (cycles 1 through 19) = (5+5*7+2)/(8*19) = 42/152 = 27.63%
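The forbidden latencies and both utilization figures can be checked mechanically from the Add/Subtract reservation table. The sketch below assumes the stage usage U in cycle 1, S in cycles 2 and 4, A in cycles 2 and 3, and R in cycles 3 and 4, with one add initiated every 3 cycles; the names `ADD`, `forbidden_latencies`, and `utilization` are illustrative.

```python
from fractions import Fraction

# Reservation table for one FP add: stage -> cycles (relative to issue) in which it is used
ADD = {"U": [1], "S": [2, 4], "A": [2, 3], "R": [3, 4]}


def forbidden_latencies(table):
    """Latencies at which two initiations would need the same stage in the same cycle."""
    return sorted({b - a for cycles in table.values()
                   for a in cycles for b in cycles if b > a})


def utilization(table, starts, first, last, stages=8):
    """Fraction of stage-cycles in [first, last] that are busy, given initiation cycles."""
    busy = sum(1 for s in starts for cycles in table.values() for c in cycles
               if first <= s + c - 1 <= last)
    return Fraction(busy, stages * (last - first + 1))


starts = [1, 4, 7, 10, 13, 16]            # one add every 3 cycles
forbidden = forbidden_latencies(ADD)
steady = utilization(ADD, starts, 4, 18)
total = utilization(ADD, starts, 1, 19)
print(forbidden, steady, total)
```

A latency is forbidden exactly when some stage row has two marks that far apart, so the same function applies unchanged to any other reservation table written in this form.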


FP Pipeline: Multiplies Only

Multiply
Cycle   1  2  3  4  5  6  7  8
A                         x
D
E          x
M          x  x  x  x
N                      x  x
R                            x
S
U       x

Collision vector: 1 1 1 1 0 0 0 0

Cycle   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28
A                         x           y           z           x           y           z
D
E          x           y           z           x           y           z
M          x  x  x  x  y  y  y  y  z  z  z  z  x  x  x  x  y  y  y  y  z  z  z  z
N                      x  x        y  y        z  z        x  x        y  y        z  z
R                            x           y           z           x           y           z
S
U       x           y           z           x           y           z

• Collision vector: 1 indicates forbidden latency, 0 indicates allowed latency
• Steady-state utilization (cycles 5-24) = (5*10)/(8*20) = 50/160 = 31.25%
• Total utilization (cycles 1-28) = (5+5*10+5)/(8*28) = 60/224 = 26.79%


FP Pipeline: Adds and Multiplies

Add/Subtract
Cycle   1  2  3  4
A          x  x
D
E
M
N
R             x  x
S          x     x
U       x

Multiply
Cycle   1  2  3  4  5  6  7  8
A                         x
D
E          x
M          x  x  x  x
N                      x  x
R                            x
S
U       x

Adds are a and b; multiplies are m and n.

Cycle   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28
A                a  a     m  b  b     n  a  a     m  b  b     n  a  a     m  b  b     n
D
E          m           n           m           n           m           n
M          m  m  m  m  n  n  n  n  m  m  m  m  n  n  n  n  m  m  m  m  n  n  n  n
N                      m  m        n  n        m  m        n  n        m  m        n  n
R                   a  a     m  b  b     n  a  a     m  b  b     n  a  a     m  b  b     n
S                a     a     b     b     a     a     b     b     a     a     b     b
U       m     a     n     b     m     a     n     b     m     a     n     b

• Note out-of-order completion
• Steady-state utilization (cycles 6-21) = (4*17)/(8*16) = 68/128 = 53.13%
• Total utilization = (12+4*17+22)/(8*28) = 102/224 = 45.54%


Interrupts, Faults, or Exceptions

Event        Sync/Async  User request/Coerced  Within/Between instr.  Resume/Terminate
I/O request  Async       Coerced               Between instr.         Resume
OS call      Sync        User request          Between instr.         Resume
Breakpoint   Sync        User request          Between instr.         Resume
Power fail   Async       Coerced               Within instr.          Terminate

• Synchronous, coerced interrupts that occur within instructions and after which execution must resume are the hardest to implement
• See Figure A.27 in HP3


Precise Interrupts (Sequential Processor)
• When interrupt occurs, state of interrupted process is saved, including PC (= u), registers, and memory
• Interrupt is precise if the following three conditions hold
   • All instructions preceding u have been executed, and have modified the state correctly
   • All instructions following u are unexecuted, and have not modified the state
   • If the interrupt was caused by an instruction, it was caused by instruction u, which is either completely executed (overflow) or completely unexecuted (VM page fault)
• Precise interrupts are desirable if software is to fix up error that caused interrupt and execution has to be resumed
   • Easy for external interrupts, could be complex and costly for internal
   • Imperative for some interrupts (VM page faults, IEEE FP standard)


Problems on Sequential Processors
• Instruction modifies state early, then causes an interrupt
   • State change must be undone
   • Example: First operand of VAX instruction uses autodecrement addressing mode, which writes a register. Trying to access second operand causes a page fault. Since instruction execution cannot be completed, we must restore the register written by autodecrement to its original value
• Long-running instructions
   • Not enough to be able to restore state, must make progress from interrupt to interrupt
   • Example: MVC on IBM 360 copies 256 bytes
      • No virtual memory, so interrupts not allowed to stop MVC
   • Example: MVC on IBM 370 copies 256 bytes
      • Has virtual memory, so first access all pages involved; after that, no interrupts allowed
   • Example: MVCL on IBM 370 copies up to 2^24 bytes
      • Has VM; two addresses and length are in registers
      • Registers saved and restored on interrupts (making progress)


Interrupts in MIPS Pipeline

Pipeline stage  Problem exceptions
IF              Page fault on instruction fetch; misaligned memory access; memory-protection violation
ID              Undefined or illegal opcode
EX              Arithmetic exception
MEM             Page fault on data fetch; misaligned memory access; memory-protection violation
WB              None

• How do we stop and restart execution on an interrupt to keep it precise?
• What problems do delayed branches cause?
• What happens if multiple exceptions occur in the pipeline?
• Can exceptions occur out-of-order?
• What problems do multi-cycle instructions cause?


MIPS Integer Pipeline, Single Interrupt

Cycle   1  2  3  4  5  6  7  8  9 10
u-2     F  D  X  M  W
u-1        F  D  X  M  W
u             F  D  X  M  W
u+1              F  D  X  M  W
u+2                 F  D  X  M  W
TRAP                   F  D  X  M  W

• Force TRAP instruction in pipeline on next IF
• Turn off all writes for faulting instruction and subsequent instructions
• After exception-handling routine in OS receives control, save PC of faulting instruction
• When exception has been handled, the RFE instruction reloads PC and restarts sequential instruction execution


Complications with Delayed Branches

Cycle           1  2  3  4  5  6  7  8  9
1 branch        F  D  X  M  W
2 delay slot       F  D  X  M  W
u (BTA)               F  D  X  M  W
u+1                      F  D  X  M  W
u+2                         F  D  X  M  W

• Suppose instruction 2 causes an exception (e.g., a page fault) after the taken branch completes (determining that the branch outcome is true)
   • Instruction 2 cannot complete
   • Neither can instruction u
• On restart, we do not have sequential execution
   • We must remember two PC values: 2 and u


Complications with Multiple Exceptions

Cycle   1  2  3  4  5  6
LW      F  D  X  M  W
ADD        F  D  X  M  W

• At same cycle, LW takes a data page fault and ADD takes an arithmetic exception
• On an unpipelined machine, LW’s exception would occur first
   • Handle the page fault
   • Restart execution
   • ADD will cause arithmetic exception to reoccur; handle it then


Complications with Out-of-order Exceptions

Cycle   1  2  3  4  5  6
LW      F  D  X  M  W
ADD        F  D  X  M  W

• LW takes data page fault, ADD takes instruction page fault
• Relative timing differs between unpipelined and pipelined machines
   • To maintain precise interrupts, we need to consider both when they occur and the instructions that caused them
• Post exceptions in exception status vector, turn off state modifications, and check vector in WB unit
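The status-vector idea can be sketched with a small cycle-by-cycle model: a fault is only posted when it occurs, and it is acted on when its instruction reaches WB. This is an illustrative sketch, assuming one instruction enters IF per cycle with no stalls; the function `run` and its arguments are hypothetical names, and turning off state modifications is not modeled.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def run(program, faults):
    """program: names in program order; faults: {name: stage in which it faults}.

    Returns (instruction whose exception is taken, order in which faults were raised).
    """
    status = {name: None for name in program}   # one status-vector entry per instruction
    raise_order = []
    for cycle in range(1, len(program) + len(STAGES)):
        for i, name in enumerate(program):
            k = cycle - 1 - i                   # stage index of instruction i this cycle
            if not 0 <= k < len(STAGES):
                continue
            if faults.get(name) == STAGES[k]:
                status[name] = STAGES[k]        # post the exception, do not take it yet
                raise_order.append(name)
            if STAGES[k] == "WB" and status[name] is not None:
                return name, raise_order        # take it at WB, in program order
    return None, raise_order


taken, raised = run(["LW", "ADD"], {"LW": "MEM", "ADD": "IF"})
print(taken, raised)
```

ADD's instruction page fault is raised first (cycle 2), but LW reaches WB first, so LW's data page fault is the one handled, as it would be on an unpipelined machine.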


Complications with Multicycle Operations

Cycle                1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28
DIVF F0, F2, F4      F  D  X  X  X  X  X  X  X  X  X  X  X  X  X  X  X  X  X  X  X  X  X  X  X  X  M  W
ADDF F10, F10, F8       F  D  X  X  X  X  M  W
SUBF F12, F12, F14         F  D  X  X  X  X  M  W

• Instructions are independent (no hazards) and therefore issue immediately
• Differences in running times cause out-of-order termination
• DIVF throws arithmetic exception late in its execution
   • At that point, ADDF and SUBF have both completed execution and destroyed one of their operands
• Can we maintain precise interrupts under these conditions?


FP Pipeline Exceptions: Solns. 1 and 2
• Settle for imprecise interrupts (CRAY, with checkpointing)
   • Done on Alpha 21064 and 21164, IBM Power-1 and Power-2, MIPS R8000 by supporting a fast imprecise mode and a slow precise mode
   • Not an option if you have to support virtual memory or IEEE floating point standard
• Software finishes certain instructions (SPARC)
   • Keep enough state around for trap handler to create a precise sequence for exception and finish work for some instruction stages
   • Only FP instructions cause this problem

Cycle   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19
        F  D  X  X  X  X  X  X  X  X  X  X  X  X  X  X  X  M  W
           F  D  X  X  X  X  X  X  X  X  M  W
              F  D  X  X  X  X  X  X  X  X  M  W
                 F  D  X  X  X  X  M  W


FP Pipeline Exceptions: Solns. 3 and 4
• Stalling (MIPS R2000/3000, MIPS R4000, Pentium)
   • An instruction is allowed to issue only if it is certain that all the instructions before the issuing instruction will complete without causing an exception
   • To prevent excessive stalling, FP units must decide on possibility of exceptions early in pipeline
• General methods (PowerPC 620, MIPS R10000)
   • Reorder buffer, history file, future file
   • An instruction is allowed to finalize its writes only when all previously issued instructions are complete
   • More naturally used in connection with ILP (Chapter 4)
   • Significant complexity (to be discussed later)
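The in-order finalization rule behind a reorder buffer can be sketched as a queue that commits only from its head. This is a toy model under stated assumptions, not a description of any of the processors named above: `commit_in_order` is a hypothetical name, each entry carries an assumed finish cycle, and several instructions may commit in the same cycle.

```python
from collections import deque


def commit_in_order(instrs):
    """instrs: program-ordered list of (name, finish_cycle, raises_exception).

    Results stay buffered until every earlier instruction has committed.
    Returns (names committed to architectural state, instruction that trapped or None).
    """
    rob = deque(instrs)        # head = oldest instruction not yet committed
    committed = []
    cycle = 0
    while rob:
        cycle += 1
        # Commit from the head only, and only once the head has finished executing.
        while rob and rob[0][1] <= cycle:
            name, _, raises = rob.popleft()
            if raises:
                rob.clear()    # discard buffered results of all younger instructions
                return committed, name
            committed.append(name)
    return committed, None


# Finish cycles assumed for a long divide followed by two short FP operations
program = [("DIVF", 28, True), ("ADDF", 9, False), ("SUBF", 10, False)]
result = commit_in_order(program)
print(result)
```

Although ADDF and SUBF finish long before DIVF, neither has written its destination when DIVF traps, so the state seen by the handler is precise.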