ICE3003: Computer Architecture | Spring 2011 | Jin-Soo Kim (jinsookim@skku.edu)
csl.skku.edu/uploads/ICE3003S11/10-hazards.pdf

Exam #2 Results


Pipeline Hazards

Jin-Soo Kim (jinsookim@skku.edu)
Computer Systems Laboratory
Sungkyunkwan University
http://csl.skku.edu


Hazards

What are hazards?
  • Situations that prevent starting the next instruction in the next cycle

Structure hazards
  • A required resource is busy

Data hazards
  • Need to wait for a previous instruction to complete its data read/write

Control hazards
  • Deciding on control action depends on a previous instruction


Structure Hazards

Conflict for use of a resource

In a MIPS pipeline with a single memory
  • Load/store requires data access
  • Instruction fetch would have to stall for that cycle
    – Would cause a pipeline “bubble”

Hence, pipelined datapaths require separate instruction/data memories
  • Or separate instruction/data caches


Data Hazards (1)

An instruction depends on completion of data access by a previous instruction

    add  $s0, $t0, $t1
    sub  $t2, $s0, $t3


Data Hazards (2)

Read After Write (RAW)
  • Inst J tries to read operand before Inst I writes it:

        I: add  $t1, $t2, $t3
        J: sub  $t4, $t1, $t3

  • Caused by a “(true) dependence”. This hazard results from an actual need for communication.


Data Hazards (3)

Write After Read (WAR)
  • Inst J writes operand before Inst I reads it:

        I: sub  $t4, $t1, $t3
        J: add  $t1, $t2, $t3
        K: add  $t6, $t1, $t7

  • Called an “anti-dependence” by compiler writers. This results from the reuse of the name “$t1”.


Data Hazards (4)

Write After Write (WAW)
  • Inst J writes operand before Inst I writes it:

        I: sub  $t1, $t4, $t3
        J: add  $t1, $t2, $t3
        K: add  $t6, $t1, $t7

  • Called an “output dependence” by compiler writers. This also results from the reuse of the name “$t1”.
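The three dependence types can be told apart mechanically by comparing the registers each instruction reads and writes. The following is a minimal Python sketch, not part of the original slides: the `Inst` type and the `hazards` function are hypothetical names, and each instruction is reduced to one destination register plus its source registers.

```python
from typing import NamedTuple, Set, Tuple

class Inst(NamedTuple):
    dest: str              # register written by the instruction
    srcs: Tuple[str, ...]  # registers read by the instruction

def hazards(i: Inst, j: Inst) -> Set[str]:
    """Classify how a later instruction j depends on an earlier instruction i."""
    found = set()
    if i.dest in j.srcs:
        found.add("RAW")   # j reads what i writes (true dependence)
    if j.dest in i.srcs:
        found.add("WAR")   # j overwrites what i still has to read (anti-dependence)
    if j.dest == i.dest:
        found.add("WAW")   # both write the same register (output dependence)
    return found

# The I/J pairs from the RAW, WAR, and WAW slides.
raw = hazards(Inst("$t1", ("$t2", "$t3")), Inst("$t4", ("$t1", "$t3")))
war = hazards(Inst("$t4", ("$t1", "$t3")), Inst("$t1", ("$t2", "$t3")))
waw = hazards(Inst("$t1", ("$t4", "$t3")), Inst("$t1", ("$t2", "$t3")))
print(raw, war, waw)
```

The sketch only identifies dependences. Whether one becomes an actual hazard depends on the pipeline: in an in-order five-stage pipeline that reads registers early and writes them in a single late stage, only the RAW case needs forwarding or a stall.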


Forwarding

Forwarding (aka Bypassing)
  • Use result when it is computed
  • Don’t wait for it to be stored in a register
  • Requires extra connections in the datapath


Forwarding Example

Consider this sequence:

    sub  $2,  $1, $3
    and  $12, $2, $5
    or   $13, $6, $2
    add  $14, $2, $2
    sw   $15, 100($2)

We can resolve hazards with forwarding
  • How do we detect when to forward?


Dependencies & Forwarding

  [Figure]


Forwarding Conditions (1)

Detecting the need to forward
  • Pass register numbers along pipeline
    – e.g., ID/EX.RegisterRs = register number for Rs sitting in ID/EX pipeline register
  • ALU operand register numbers in EX stage are given by ID/EX.RegisterRs & ID/EX.RegisterRt
  • Data hazards when
      1a. EX/MEM.RegisterRd = ID/EX.RegisterRs    (fwd from EX/MEM pipeline reg)
      1b. EX/MEM.RegisterRd = ID/EX.RegisterRt    (fwd from EX/MEM pipeline reg)
      2a. MEM/WB.RegisterRd = ID/EX.RegisterRs    (fwd from MEM/WB pipeline reg)
      2b. MEM/WB.RegisterRd = ID/EX.RegisterRt    (fwd from MEM/WB pipeline reg)


Forwarding Conditions (2)

Detecting the need to forward (cont’d)
  • But only if forwarding instruction will write to a register!
    – EX/MEM.RegWrite, MEM/WB.RegWrite
  • And only if Rd for that instruction is not $zero
    – EX/MEM.RegisterRd ≠ 0, MEM/WB.RegisterRd ≠ 0


Forwarding Conditions (3)

Forwarding paths
  [Figure]


Forwarding Conditions (4)

EX hazard
  • if (EX/MEM.RegWrite and (EX/MEM.RegisterRd ≠ 0)
        and (EX/MEM.RegisterRd = ID/EX.RegisterRs))
      ForwardA = 10
  • if (EX/MEM.RegWrite and (EX/MEM.RegisterRd ≠ 0)
        and (EX/MEM.RegisterRd = ID/EX.RegisterRt))
      ForwardB = 10

MEM hazard
  • if (MEM/WB.RegWrite and (MEM/WB.RegisterRd ≠ 0)
        and (MEM/WB.RegisterRd = ID/EX.RegisterRs))
      ForwardA = 01
  • if (MEM/WB.RegWrite and (MEM/WB.RegisterRd ≠ 0)
        and (MEM/WB.RegisterRd = ID/EX.RegisterRt))
      ForwardB = 01


Double Data Hazard

Consider this sequence:

    add  $1, $1, $2
    add  $1, $1, $3
    add  $1, $1, $4

Both hazards occur
  • Want to use the most recent

Revise MEM hazard condition
  • Only forward if EX hazard condition isn’t true


Rev’d Forwarding Conditions

MEM hazard
  • if (MEM/WB.RegWrite and (MEM/WB.RegisterRd ≠ 0)
        and not (EX/MEM.RegWrite and (EX/MEM.RegisterRd ≠ 0)
                 and (EX/MEM.RegisterRd = ID/EX.RegisterRs))
        and (MEM/WB.RegisterRd = ID/EX.RegisterRs))
      ForwardA = 01
  • if (MEM/WB.RegWrite and (MEM/WB.RegisterRd ≠ 0)
        and not (EX/MEM.RegWrite and (EX/MEM.RegisterRd ≠ 0)
                 and (EX/MEM.RegisterRd = ID/EX.RegisterRt))
        and (MEM/WB.RegisterRd = ID/EX.RegisterRt))
      ForwardB = 01
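The EX and revised MEM conditions can be combined into one priority check per ALU operand: test the EX hazard first and fall back to the MEM hazard only if it does not hold. The Python sketch below is an illustration under stated assumptions, not the course's implementation. `PipeRegs`, `forward_select`, and `forwarding_unit` are hypothetical names, and the value "00" for "no forwarding, use the register file" is an assumed convention that the slide text does not spell out.

```python
from dataclasses import dataclass

@dataclass
class PipeRegs:
    ex_mem_reg_write: bool  # EX/MEM.RegWrite
    ex_mem_rd: int          # EX/MEM.RegisterRd
    mem_wb_reg_write: bool  # MEM/WB.RegWrite
    mem_wb_rd: int          # MEM/WB.RegisterRd
    id_ex_rs: int           # ID/EX.RegisterRs
    id_ex_rt: int           # ID/EX.RegisterRt

def forward_select(p: PipeRegs, src: int) -> str:
    """Mux select for one ALU operand whose source register number is src."""
    ex_hazard = p.ex_mem_reg_write and p.ex_mem_rd != 0 and p.ex_mem_rd == src
    mem_hazard = p.mem_wb_reg_write and p.mem_wb_rd != 0 and p.mem_wb_rd == src
    if ex_hazard:
        return "10"  # most recent result, taken from EX/MEM
    if mem_hazard:
        return "01"  # older result from MEM/WB, used only when there is no EX hazard
    return "00"      # assumed encoding for "no forwarding"

def forwarding_unit(p: PipeRegs):
    """Return (ForwardA, ForwardB)."""
    return forward_select(p, p.id_ex_rs), forward_select(p, p.id_ex_rt)

# Double data hazard: the third add $1, $1, $4 is in EX while both older adds wrote $1.
double = forwarding_unit(PipeRegs(True, 1, True, 1, id_ex_rs=1, id_ex_rt=4))

# or $13, $6, $2 in EX: and ($12) is in EX/MEM, sub ($2) is in MEM/WB.
or_case = forwarding_unit(PipeRegs(True, 12, True, 2, id_ex_rs=6, id_ex_rt=2))
print(double, or_case)
```

Checking the EX hazard first is equivalent to the "and not (EX hazard)" clause in the revised MEM condition, so the most recent value of the register always wins.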


Datapath with Forwarding

  [Figure]


Load-Use Data Hazard (1)

Load-use data hazards
  • Can’t always avoid stalls by forwarding
  • If value not computed when needed
  • Can’t forward backward in time!


Load-Use Data Hazard (2)

  [Figure]

Need to stall for one cycle


Load-Use Data Hazard (3)

Load-use hazard detection
  • Check when using instruction is decoded in ID stage
  • ALU operand register numbers in ID stage are given by IF/ID.RegisterRs & IF/ID.RegisterRt
  • Load-use hazard when
    – ID/EX.MemRead and
        ((ID/EX.RegisterRt = IF/ID.RegisterRs) or
         (ID/EX.RegisterRt = IF/ID.RegisterRt))
  • If detected, stall and insert bubble
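The detection condition above translates directly into a boolean function. This is a minimal Python sketch with a hypothetical function name. The example instructions in the comments (`lw $2, 20($1)` followed by `and $4, $2, $5`) are illustrative and do not come from the slides.

```python
def load_use_hazard(id_ex_mem_read: bool, id_ex_rt: int,
                    if_id_rs: int, if_id_rt: int) -> bool:
    """True when the instruction in ID needs the register a load in EX is about to produce."""
    return id_ex_mem_read and (id_ex_rt == if_id_rs or id_ex_rt == if_id_rt)

# lw $2, 20($1) is in EX (Rt = 2) and and $4, $2, $5 is in ID (Rs = 2, Rt = 5).
must_stall = load_use_hazard(True, 2, 2, 5)

# Same register numbers, but the instruction in EX is not a load, so forwarding suffices.
no_stall = load_use_hazard(False, 2, 2, 5)
print(must_stall, no_stall)
```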


Load-Use Data Hazard (4)

How to stall the pipeline?
  • Force control values in ID/EX register to 0
    – EX, MEM, and WB do nop (no-operation)
  • Prevent update of PC and IF/ID register
    – Using instruction is decoded again
    – Following instruction is fetched again
    – 1-cycle stall allows MEM to read data for lw
      » Can subsequently forward to EX stage
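The two stall actions (zero the control values going into ID/EX, and hold PC and IF/ID) can be sketched as a single clock-edge update. This Python sketch is a simplified model: the function name `clock_edge`, the use of plain strings for instructions, and the three control signal names are all assumptions made for illustration.

```python
def clock_edge(pc, if_id, fetched, decoded_ctrl, stall):
    """Return (next PC, next IF/ID contents, control bits written into ID/EX)."""
    if stall:
        # Bubble: every control value becomes 0, so EX, MEM, and WB do a nop.
        bubble = {name: 0 for name in decoded_ctrl}
        # PC and IF/ID keep their values: the same instruction is decoded
        # again and the following instruction is fetched again.
        return pc, if_id, bubble
    return pc + 4, fetched, dict(decoded_ctrl)

ctrl = {"RegWrite": 1, "MemRead": 0, "MemWrite": 0}
stalled = clock_edge(48, "and $4, $2, $5", "or $8, $2, $6", ctrl, stall=True)
normal = clock_edge(48, "and $4, $2, $5", "or $8, $2, $6", ctrl, stall=False)
print(stalled)
print(normal)
```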


Load-Use Data Hazard (5)

Stall/Bubble in the pipeline
  [Figure; annotation: “Stall inserted here”]


Load-Use Data Hazard (6)

Stall/Bubble in the pipeline (cont’d)
  [Figure; annotation: “Or, more accurately...”]


Datapath + Hazard Detection

  [Figure]


Compiler Approach (1)

Stalls reduce performance
  • But are required to get correct results

Compiler can arrange code to avoid hazards and stalls
  • Requires knowledge of the pipeline structure


Compiler Approach (2)

Code scheduling to avoid stalls
  • Reorder code to avoid use of load result in the next instruction

    (C code)   A = B + E;   C = B + F;

    Original order (13 cycles):

        lw   $t1, 0($t0)
        lw   $t2, 4($t0)
                                <- stall
        add  $t3, $t1, $t2
        sw   $t3, 12($t0)
        lw   $t4, 8($t0)
                                <- stall
        add  $t5, $t1, $t4
        sw   $t5, 16($t0)

    Reordered (11 cycles):

        lw   $t1, 0($t0)
        lw   $t2, 4($t0)
        lw   $t4, 8($t0)
        add  $t3, $t1, $t2
        sw   $t3, 12($t0)
        add  $t5, $t1, $t4
        sw   $t5, 16($t0)
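The 13-cycle and 11-cycle totals can be checked with a short count. The Python sketch below uses a hypothetical `count_cycles` function and assumes a five-stage pipeline with forwarding in which the only stall is one cycle when an instruction uses the result of the load immediately before it. Under that assumption the total is 5 cycles for the first instruction, plus one per additional instruction, plus one per stall.

```python
import re

def count_cycles(program, stages=5):
    """Cycles to run `program` on an idealized in-order pipeline with forwarding.

    Assumption: the only stall is one cycle when an instruction reads the
    register loaded by the lw immediately before it.
    """
    stalls = 0
    prev_load_dest = None
    for line in program:
        op, operands = line.split(None, 1)
        regs = re.findall(r"\$\w+", operands)
        # sw reads all of its registers; every other op here writes the first one.
        srcs = regs if op == "sw" else regs[1:]
        if prev_load_dest in srcs:
            stalls += 1
        prev_load_dest = regs[0] if op == "lw" else None
    return stages + (len(program) - 1) + stalls

original = [
    "lw $t1, 0($t0)", "lw $t2, 4($t0)", "add $t3, $t1, $t2", "sw $t3, 12($t0)",
    "lw $t4, 8($t0)", "add $t5, $t1, $t4", "sw $t5, 16($t0)",
]
reordered = [
    "lw $t1, 0($t0)", "lw $t2, 4($t0)", "lw $t4, 8($t0)", "add $t3, $t1, $t2",
    "sw $t3, 12($t0)", "add $t5, $t1, $t4", "sw $t5, 16($t0)",
]
print(count_cycles(original), count_cycles(reordered))  # 13 11
```

Moving the third load up places an independent instruction between each load and its first use, which removes both stalls without changing the result.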


Control Hazards (1)

Branch determines flow of control
  • Fetching next instruction depends on branch outcome
  • Pipeline can’t always fetch correct instruction
    – Still working on ID stage of branch

In MIPS pipeline
  • Need to compare registers and compute target early in the pipeline
  • Add hardware to do it in ID stage


Control Hazards (2)

Stall on branch
  • If branch outcome determined in MEM:
    [Figure; annotation: “Flush these instructions (set control values to 0)”]


Control Hazards (3)

Reducing branch delay
  • Move hardware to determine outcome to ID stage
    – Target address adder
    – Register comparator

Example: branch taken

    36:  sub  $10, $4, $8
    40:  beq  $1,  $3, 7
    44:  and  $12, $2, $5
    48:  or   $13, $2, $6
    52:  add  $14, $4, $2
    56:  slt  $15, $6, $7
         ...
    72:  lw   $4, 50($7)
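The addresses in this example follow from the MIPS branch encoding: the offset field counts instructions (words) relative to the instruction after the branch. A small Python check, using a hypothetical helper name:

```python
def branch_target(branch_pc: int, offset_words: int) -> int:
    """Target of a MIPS conditional branch: (PC + 4) + (offset << 2)."""
    return branch_pc + 4 + (offset_words << 2)

target = branch_target(40, 7)
print(target)  # 72
```

For the `beq` at address 40 with offset 7, this gives 40 + 4 + 28 = 72, the address of the `lw` that the taken branch jumps to.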


Control Hazards (4)

  [Figure]


Control Hazards (5)

  [Figure]


Data Hazards for Branches (1)

Data hazards for branches
  • If a comparison register is a destination of 2nd or 3rd preceding ALU instruction
    – Can resolve using forwarding

    add $1, $2, $3        IF  ID  EX  MEM WB
    add $4, $5, $6            IF  ID  EX  MEM WB
    …                             IF  ID  EX  MEM WB
    beq $1, $4, target                IF  ID  EX  MEM WB


Data Hazards for Branches (2)

Data hazards for branches (cont’d)
  • If a comparison register is a destination of preceding ALU instruction or 2nd preceding load instruction
    – Need 1 stall cycle

    lw  $1, addr          IF  ID  EX  MEM WB
    add $4, $5, $6            IF  ID  EX  MEM WB
    beq stalled                   IF  ID
    beq $1, $4, target                ID  EX  MEM WB


Data Hazards for Branches (3)

Data hazards for branches (cont’d)
  • If a comparison register is a destination of immediately preceding load instruction
    – Need 2 stall cycles

    lw  $1, addr          IF  ID  EX  MEM WB
    beq stalled               IF  ID
    beq stalled                   ID
    beq $1, $0, target                ID  EX  MEM WB
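The three branch cases can be summarized by one rule. When the branch is resolved in ID, an ALU result can be forwarded once the producer has finished EX, and a loaded value once the producer has finished MEM. The Python sketch below is a restatement of the slides' three cases, not a formula taken from them. `branch_stalls` is a hypothetical name, and `distance` counts how many instructions before the branch the producer sits (1 = immediately preceding).

```python
def branch_stalls(producer: str, distance: int) -> int:
    """Stall cycles for a branch compared in ID, given the producer of its operand.

    producer: "alu" or "load"; distance: 1 for the immediately preceding
    instruction, 2 for the one before that, and so on.
    """
    # Number of later instruction slots that must pass before the value can
    # be forwarded into the ID stage (assumed model: ALU after EX, load after MEM).
    ready_after = {"alu": 2, "load": 3}[producer]
    return max(0, ready_after - distance)

cases = {
    ("alu", 2): branch_stalls("alu", 2),    # forwarding alone is enough
    ("alu", 1): branch_stalls("alu", 1),    # 1 stall cycle
    ("load", 2): branch_stalls("load", 2),  # 1 stall cycle
    ("load", 1): branch_stalls("load", 1),  # 2 stall cycles
}
print(cases)
```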


Branch Prediction (1)

Branch prediction
  • Longer pipelines can’t readily determine branch outcome early
    – Stall penalty becomes unacceptable
  • Predict outcome of branch
    – Only stall if prediction is wrong
  • In MIPS pipeline
    – Can predict branches not taken
    – Fetch instruction after branch, with no delay


Branch Prediction (2)

Static branch prediction
  • Based on typical branch behavior
  • Example: loop and if-statement branches
    – Predict backward branches taken
    – Predict forward branches not taken

Dynamic branch prediction
  • Hardware measures actual branch behavior
    – e.g., record recent history of each branch
  • Assume future behavior will continue the trend
    – When wrong, stall while re-fetching, and update history
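The static heuristic on this slide (backward branches taken, forward branches not taken) needs only the branch address and its target. A minimal Python sketch with a hypothetical function name and illustrative addresses:

```python
def predict_taken_static(branch_pc: int, target_pc: int) -> bool:
    """Backward branches (typical loop back-edges) are predicted taken,
    forward branches (typical if-statement skips) are predicted not taken."""
    return target_pc < branch_pc

loop_back = predict_taken_static(branch_pc=0x40, target_pc=0x20)
skip_ahead = predict_taken_static(branch_pc=0x40, target_pc=0x60)
print(loop_back, skip_ahead)
```

A dynamic predictor replaces this fixed rule with per-branch history collected at run time.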


Static Branch Prediction

MIPS with “Predict Not Taken”
  [Figure; panels: “Prediction correct”, “Prediction incorrect”]


Dynamic Branch Prediction (1)

Dynamic branch prediction
  • In deeper and superscalar pipelines, branch penalty is more significant
  • Use dynamic prediction
    – Branch prediction buffer (aka branch history table)
    – Indexed by recent branch instruction addresses
    – Stores outcome (taken/not taken)
    – To execute a branch
      » Check table, expect the same outcome
      » Start fetching from fall-through or target
      » If wrong, flush pipeline and flip prediction


Dynamic Branch Prediction (2)

1-bit predictor: shortcoming
  • Inner loop branches mispredicted twice!

        outer: …
               …
        inner: …
               …
               beq …, …, inner
               …
               beq …, …, outer

    – Mispredict as taken on last iteration of inner loop
    – Then mispredict as not taken on first iteration of inner loop next time around
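The double misprediction is easy to reproduce by replaying the inner-loop branch outcomes through a 1-bit predictor. In this Python sketch the function name, the initial prediction of taken, and the loop shape (four inner iterations, three outer iterations) are assumptions chosen for illustration.

```python
def mispredictions_1bit(outcomes, initial=True):
    """Count mispredictions of a 1-bit predictor that repeats the last outcome."""
    prediction = initial
    misses = 0
    for taken in outcomes:
        if prediction != taken:
            misses += 1
        prediction = taken  # remember only the most recent outcome
    return misses

# Inner-loop branch: taken three times, then not taken on loop exit.
inner_pass = [True, True, True, False]
misses = mispredictions_1bit(inner_pass * 3)
print(misses)  # 5
```

The first pass mispredicts only the loop exit. Each later pass mispredicts twice, once on its first iteration and once on its exit, which gives 1 + 2 + 2 = 5.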


Dynamic Branch Prediction (3)

2-bit predictor
  • Only change prediction on two successive mispredictions
  [Figure]
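One common way to realize a 2-bit predictor is a saturating counter. The slide's state diagram did not survive extraction, so the Python sketch below is an assumed variant rather than the exact machine shown there. Counter values 2 and 3 predict taken, 0 and 1 predict not taken, and starting from a saturated state it takes two successive mispredictions to flip the prediction.

```python
def mispredictions_2bit(outcomes, counter=3):
    """Count mispredictions of a 2-bit saturating-counter predictor.

    counter ranges over 0..3; values >= 2 predict taken.
    """
    misses = 0
    for taken in outcomes:
        if (counter >= 2) != taken:
            misses += 1
        # Move one step toward the observed outcome, saturating at 0 and 3.
        counter = min(3, counter + 1) if taken else max(0, counter - 1)
    return misses

# Inner-loop branch: taken three times, then not taken on loop exit, for three passes.
inner_pass = [True, True, True, False]
misses = mispredictions_2bit(inner_pass * 3)
print(misses)  # 3
```

On this outcome pattern only the loop exit of each pass is mispredicted: the single not-taken outcome moves the counter from 3 to 2, which still predicts taken when the loop is entered again.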


Dynamic Branch Prediction (4)

Calculating the branch target
  • Even with predictor, still need to calculate the target address
    – 1-cycle penalty for a taken branch
  • Branch target buffer
    – Cache of target addresses
    – Indexed by PC when instruction fetched
      » If hit and instruction is branch predicted taken, can fetch target immediately
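A branch target buffer can be modeled as a small table from branch address to predicted target. The Python sketch below is idealized: the class and method names are hypothetical, a dictionary stands in for the finite, tagged hardware table, and a single taken/not-taken bit stands in for the predictor.

```python
class BranchTargetBuffer:
    """Idealized BTB: maps a branch PC to (target address, predicted taken)."""

    def __init__(self):
        self.entries = {}

    def next_fetch(self, pc: int) -> int:
        """Address to fetch after pc, decided during instruction fetch."""
        entry = self.entries.get(pc)
        if entry is not None and entry[1]:
            return entry[0]  # hit and predicted taken: fetch the target at once
        return pc + 4        # miss or predicted not taken: fall through

    def update(self, pc: int, target: int, taken: bool) -> None:
        """Record the resolved outcome of the branch at pc."""
        self.entries[pc] = (target, taken)

btb = BranchTargetBuffer()
before = btb.next_fetch(40)    # no entry yet, so fall through to 44
btb.update(40, 72, taken=True)
after = btb.next_fetch(40)     # hit, predicted taken, so fetch 72
print(before, after)
```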


Compiler Approach

Delayed branch
  • Branch delay slot filled by a useful instruction
  • Done by compilers and assemblers