Advanced Microarchitecture Lecture 13: Commit, Exceptions, Interrupts


The End of the Road (um… Pipe)
• Commit is typically the last stage of the pipeline
• Anything that an instruction does at this point is irrevocable
– therefore, only actions that correspond to a sequential execution can be allowed to pass this point
– E.g., wrong-path instructions may not commit because they do not exist in the sequential execution


Everything In-Order
• The ISA defines program execution with respect to the sequential order of the instructions
• To the outside world, the CPU must appear to execute in-order
• What does it mean to “appear”?
– … when someone looks
– ok, so what does it mean to “look”?


“Looking” at CPU State
• Examples
– when the OS task swaps a program out, it copies the current program state (requires “looking”) so the program can be resumed later
– when a program has a fault (e.g., page fault), again the OS usually steps in and it needs to “look” at the “current” CPU state


When Can Someone Look?
• Divide-by-Zero
– only on a divide
• Page fault
– only on a load or store?
– can occur anytime if instruction fetch is on an unmapped page
• OS Task/Process Scheduling
– anytime


“Proper” State Always Required
• So what does it mean to have a proper state?
– one that corresponds to sequential execution
• For a superscalar processor with degree N, the commit stage must be able to “retire” at least N instructions per cycle
– retirement includes updating processor state in the same fashion that an instruction would for sequential execution


Superscalar Commit is like Sampling
[Figure: scalar commit processor states (A, AB, ABC, ABCD, ABCDE, ABCDEF, ABCDEFG, ABCDEFGH) beside superscalar commit processor states (ABC, ABCDE, ABCDEFGH)]
Each “state” in the superscalar machine always corresponds to one state of the scalar machine (but not necessarily the other way around), and the ordering of states is preserved
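The sampling idea can be checked with a small sketch. It assumes a program is just a string of instruction names and that the architected state is the prefix committed so far; the function names and the per-cycle commit counts (3, 2, 3) are illustrative choices, not something the slides specify.

```python
def scalar_states(program):
    """Architected states seen when one instruction commits per cycle."""
    return [program[:i] for i in range(1, len(program) + 1)]


def superscalar_states(program, commits_per_cycle):
    """Architected states seen when several instructions commit per cycle.

    commits_per_cycle is a hypothetical list giving how many instructions
    retire in each cycle.
    """
    states, done = [], 0
    for n in commits_per_cycle:
        done += n
        states.append(program[:done])
    return states


def is_ordered_sample(sampled, full):
    """True if `sampled` is a subsequence of `full`, with order preserved."""
    remaining = iter(full)
    # `state in remaining` consumes the iterator, so order is enforced.
    return all(state in remaining for state in sampled)


program = "ABCDEFGH"
scalar = scalar_states(program)
superscalar = superscalar_states(program, [3, 2, 3])
print(superscalar)
print(is_ordered_sample(superscalar, scalar))
```

Every superscalar state appears among the scalar states in the same order, while the reverse check fails because the scalar machine passes through states the superscalar machine never exposes.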


Implementation in the CPU
• The architected register file contains the state of the machine corresponding only to the committed instructions
– commit happens in-order, so the ARF states always correspond to some RF state of a sequential execution
– … this means anyone who wants to “look” will have to look in the ARF
– What about the other instructions that have executed out of order?


Nothing Else Matters
[Figure: state of the superscalar out-of-order processor (fPC, RS, PRF, LSQ, ROB, PC, RF, Memory) beside the sequential view of the processor (PC, RF, Memory)]
Just ignore everything else!


Committing Instructions
• “Retire” vs. “Commit”
– sometimes people use these to mean the same thing
– sometimes they mean different things
  • check the context!
• An instruction commits by making its “effects” known/visible to the architected state
– architected state: (A)RF, Memory/$, PC, other regs
– speculative state: everything else (ROB, RS, LSQ, etc.)


Committing Instructions (2)
• When an instruction executes, it modifies processor state
– update a register
– update memory
– update the PC (almost all instructions do this)
• To “make appear”, the processor just copies values
– copy value from Physical Reg to Architected Reg
– copy value from LSQ to memory/cache
– copy value from ROB to Architected PC
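The three copies listed above can be sketched as a toy in-order commit loop. This is a minimal model under stated assumptions: `RobEntry`, its fields, and the dictionary-based PRF/ARF/memory are hypothetical stand-ins, and a real design would carry far more state per entry.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class RobEntry:
    next_pc: int                             # PC value after this instruction
    dest_reg: Optional[str] = None           # architected destination, if any
    preg: Optional[int] = None               # physical register holding the result
    store: Optional[Tuple[int, int]] = None  # (address, value) held in the LSQ
    done: bool = False                       # has the instruction finished executing?


def commit(rob, prf, arf, memory, arch_pc, width):
    """Retire up to `width` finished entries from the ROB head, in order."""
    retired = 0
    while rob and retired < width and rob[0].done:
        entry = rob.pop(0)
        if entry.dest_reg is not None:
            arf[entry.dest_reg] = prf[entry.preg]  # Physical Reg -> Architected Reg
        if entry.store is not None:
            addr, value = entry.store
            memory[addr] = value                   # LSQ -> memory/cache
        arch_pc = entry.next_pc                    # ROB -> Architected PC
        retired += 1
    return arch_pc, retired


rob = [
    RobEntry(next_pc=4, dest_reg="r1", preg=7, done=True),
    RobEntry(next_pc=8, store=(0x40, 99), done=True),
    RobEntry(next_pc=12, dest_reg="r2", preg=3, done=False),  # still executing
    RobEntry(next_pc=16, dest_reg="r3", preg=5, done=True),   # finished early
]
prf = {7: 11, 3: 22, 5: 33}
arf, memory = {}, {}
arch_pc, retired = commit(rob, prf, arf, memory, arch_pc=0, width=4)
print(arch_pc, retired, arf, memory)
```

The fourth entry has finished out of order, but it stays in the ROB because the unfinished third entry blocks the head.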


Blocked Commit
• To commit N instructions per cycle, the ROB needs to support N read ports
– (in addition to other ports such as for reading from the PRF, write ports for allocation, and write ports for updates/writebacks)
[Figure: a ROB with four read ports for four commits (inst 1 to inst 4), beside a ROB with one wide read port covering inst 1 to inst 4]
• Can’t reuse (reallocate) any of these ROB entries until all have committed
• Helps to keep ROB port requirements under control, at cost of slight IPC loss due to occasional alloc stalls


Commit Restrictions
• If any N instructions can commit per cycle, may require heavy multi-porting of other structures
– Stores: need N extra DL1 write ports, N extra DTLB read ports
– Branches: need N branch predictor update ports; also need to deallocate N RAT checkpoints
– Others: if processor has memory dependence predictor, then may need N update ports for loads
• Solution: limit maximum number of commits per cycle of special types of instructions (e.g., max one branch per cycle)
Don’t we need to check DTLB during store-address computation anyway? Do we need to do it again here?
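A per-type commit limit might look like the sketch below. The function name, the `(kind, done)` tuples, and the `caps` dictionary are assumptions made for illustration; the slides only state the policy (for example, at most one branch per cycle).

```python
def select_commits(rob_head, width, caps):
    """Count how many instructions at the ROB head may commit this cycle.

    rob_head: list of (kind, done) tuples in program order, oldest first.
    caps: per-kind limits, e.g. {"branch": 1, "store": 1}; kinds that are
          not listed are limited only by the commit width.
    """
    used = {}
    count = 0
    for kind, done in rob_head[:width]:
        if not done:
            break  # in-order commit stops at the first unfinished instruction
        if kind in caps and used.get(kind, 0) >= caps[kind]:
            break  # over the cap: this one and everything younger waits
        used[kind] = used.get(kind, 0) + 1
        count += 1
    return count


head = [("alu", True), ("branch", True), ("store", True), ("branch", True)]
print(select_commits(head, width=4, caps={"branch": 1, "store": 1}))
```

Because commit is in order, hitting a cap blocks all younger instructions in that cycle, which is the IPC cost paid for the reduced port count.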


x86 Commit (macros vs. uops)
• ROB contains uops, but outside world only knows about macro-ops
[Figure: ROB holding the uops of ADD EAX, EBX (uop 1: ADD), SUB EBX, ECX (uop 1: SUB), ADD EDX, [EAX] (uop 1: LD, uop 2: ADD), and POPA (uop 1 to uop 8); commit has consumed only part of the POPA flow]
If we take an interrupt right now, we’ll see a half-executed instruction!


x86 Commit (con’t)
• Works fine when uop-flow length ≤ commit width
• What to do with long flows?
– In all cases: can’t commit until all uops in a flow have completed
– Just commit N uops per cycle
– ... but make commit uninterruptible
[Figure: ROB holding the eight uops of POPA (uop 1 to uop 8), committed over two cycles]
Timer interrupt arrives mid-flow! Defer: can’t act on this yet... Once the whole flow has committed: now do something about the interrupt.
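The deferral rule can be sketched as a commit loop that only takes a pending interrupt at a macro-op boundary. Everything here is an illustrative assumption: the `(name, is_last_uop)` encoding, the cycle-by-cycle model, and the premise that every uop has already completed.

```python
def run_commit(uops, width, interrupt_cycle):
    """Commit `width` uops per cycle; take an interrupt only between macro-ops.

    uops: list of (macro_op_name, is_last_uop_of_macro_op), oldest first,
          all assumed to have completed execution.
    Returns (cycle at which the interrupt is taken, uops committed by then).
    """
    committed = 0
    cycle = 0
    pending = False
    while committed < len(uops):
        if cycle >= interrupt_cycle:
            pending = True
        # A boundary exists if nothing has committed yet or the last
        # committed uop closed its macro-op.
        at_boundary = committed == 0 or uops[committed - 1][1]
        if pending and at_boundary:
            return cycle, committed
        committed = min(committed + width, len(uops))
        cycle += 1
    return cycle, committed


flow = (
    [("ADD", True), ("SUB", True), ("LD+ADD", False), ("LD+ADD", True)]
    + [("POPA", False)] * 7 + [("POPA", True)]
    + [("NEXT", True)] * 4
)
print(run_commit(flow, width=4, interrupt_cycle=2))
```

With a commit width of 4, an interrupt arriving at cycle 2 lands in the middle of the POPA flow and is deferred until cycle 3, when all eight POPA uops have committed.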


Handling REP-prefixed Instructions
• Ex. REP STOSB (memset with the value in AL, the low byte of EAX)
– The entire sequence is effectively one x86 instruction
– What if this REP’s for 1,000,000,000 iterations?
  • Does the CPU “lock up” for a billion or so cycles while we wait for the entire instruction to commit?
  • We also can’t wait until the entire instruction has completed to commit, because we can’t even fetch the entire instruction!
• At the ISA level, REP iterations are interruptible...
– so treat each iteration as a separate “macro-op”


REP (con’t)
• MOV EDI, <pointer> ; the array we want to memset
• SUB EAX, EAX ; zero
• CLD ; clear direction flag (REP fwd)
• MOV ECX, 4 ; do for 4 iterations
• REP STOSB ; memset!
• ADD EBX, EDX ; unrelated instruction

uops in the ROB, in program order:
MOV EDI, xxx
SUB EAX, EAX
CLD
MOV ECX, 4
uCMP ECX, 0 ; check for zero iterations (could happen with MOV ECX, 0)
uJCZ
STA tmp, EDI ; STOS flow
STD EAX, tmp
SUB ECX, 1 ; REP overhead for 1st iteration
ADD EDI, 1
uCMP ECX, 0
uJCZ
A:
STA tmp, EDI ; STOS flow
STD EAX, tmp
SUB ECX, 1 ; REP overhead for 2nd iteration
ADD EDI, 1
uCMP ECX, 0
uJCZ
B:
STA tmp, EDI ; STOS flow
STD EAX, tmp
SUB ECX, 1 ; REP overhead for 3rd iteration
ADD EDI, 1
uCMP ECX, 0
uJCZ
C:
STA tmp, EDI ; STOS flow
STD EAX, tmp
SUB ECX, 1 ; REP overhead for 4th iteration
ADD EDI, 1
uCMP ECX, 0
uJCZ
D:
ADD EBX, EDX

All of these are interruptible points (commit can stop and effects be seen by outside world), since they all have well-defined ISA-level states:
A: ECX=3, EDI = ptr+1
B: ECX=2, EDI = ptr+2
C: ECX=1, EDI = ptr+3
D: ECX=0, EDI = ptr+4
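The interruptible points A to D can be reproduced with a small functional model of REP STOSB, assuming the direction flag is clear. The `interrupt_after` parameter is a hypothetical knob that stops commit after a given number of iterations; the dictionary stands in for memory.

```python
def rep_stosb(memory, edi, ecx, al, interrupt_after=None):
    """Run REP STOSB one iteration at a time (direction flag clear).

    interrupt_after: hypothetical knob; stop after this many iterations to
                     model an interrupt taken at an iteration boundary.
    Returns the architected (ECX, EDI) at the point where commit stopped.
    """
    iterations = 0
    while ecx != 0:                  # uCMP ECX, 0 / uJCZ
        if interrupt_after is not None and iterations == interrupt_after:
            break                    # well-defined ISA state: safe to stop here
        memory[edi] = al             # STA / STD
        ecx -= 1                     # SUB ECX, 1
        edi += 1                     # ADD EDI, 1
        iterations += 1
    return ecx, edi


ptr = 0x1000
memory = {}
state_a = rep_stosb(memory, ptr, 4, 0, interrupt_after=1)  # point A
state_d = rep_stosb(memory, state_a[1], state_a[0], 0)     # resume to point D
print(state_a, state_d)
```

Resuming from the saved ECX and EDI finishes the remaining iterations, which is why each iteration boundary is a legal place to take an interrupt.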


Store Retirement
• Stores need to forward values to later loads from the same address
– Normally, LSQ provides this facility
[Figure: LSQ holding loads and stores (st 17, st 33) next to the D$; a later load receives 17 from the matching store in the LSQ]
• At commit, store updates cache
• After store has left the LSQ, the D$ can provide the correct value


Store Retirement (2)
• Store shouldn’t vacate LSQ entry until actual writeback has completed
– this way it can continue to forward values until it is guaranteed that later loads can get the value from the cache
• This can stall commit
– what if there’s a cache miss?
– what if there’s a TLB miss?
[Figure: a store stuck at the head of the ROB]
All instructions may have successfully executed, but none can commit!


Writeback Buffer
• Want to get stores out of the way
[Figure: a committed store moves into the WB Buffer, which sits next to the D$; later loads probe both]
• Even if the store misses in cache, entering the WB buffer counts as committing.
• This allows other insts to commit as well.
• The WB buffer is part of the cache hierarchy now. It may need to provide values to later loads
• Eventually, the cache update occurs, the WB buffer entry is emptied
• The cache can now provide the correct value


Writeback Buffer (2)
• The WBB is part of the “Architected” state
• Loads must check both structures for a hit
– more routing to get request to both places
– yet another possible bypass source
– also want to normalize latencies for easier scheduling
[Figure: a load probes both the D$ and the WBB; a store in the WBB is exposed through the cache coherence interface to other CPUs]
• As an “official” write, the store must also be visible to other processors
• Processor stalls if there are no available writeback buffer entries


Writeback Buffer (3)
• Stores enter WB Buffer in program order
• If there are multiple stores to the same address, only the last one is “visible”
[Figure: WB buffer with Addr/Value columns, ordered from oldest (next to write to cache) to youngest. The oldest entry holds address 42 with value 1234, so Load 42 returns 1234; a later Store 42 adds a youngest entry with value 5678, so the next Load 42 returns 5678]
No one can “see” the older store to address 42 anymore!
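The visibility rule can be sketched with a FIFO writeback buffer that loads search youngest-first. The class and function names, the list-based FIFO, and the dictionary cache are assumptions for illustration; the addresses and values follow the figure.

```python
class WritebackBuffer:
    """Toy FIFO of committed stores waiting to update the cache."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = []  # (addr, value) pairs, oldest first

    def push(self, addr, value):
        """A committing store enters the buffer; False means commit must stall."""
        if len(self.entries) >= self.capacity:
            return False
        self.entries.append((addr, value))
        return True

    def drain_one(self, cache):
        """Write the oldest entry to the cache and free its slot."""
        if self.entries:
            addr, value = self.entries.pop(0)
            cache[addr] = value


def load(addr, wbb, cache):
    """Loads check the WBB youngest-first, then fall back to the cache."""
    for entry_addr, value in reversed(wbb.entries):
        if entry_addr == addr:
            return value
    return cache.get(addr)


wbb, cache = WritebackBuffer(capacity=4), {}
wbb.push(42, 1234)
first = load(42, wbb, cache)   # sees the only store to 42
wbb.push(42, 5678)
second = load(42, wbb, cache)  # the older store is no longer visible
print(first, second)
```

Draining in FIFO order keeps the cache consistent with program order: the older value is written first and then overwritten by the younger one.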


Write Combining Buffer
• Augment WBB to combine writes together
[Figure: WBB entry for address 42 holding value 1234; Load 42 returns 1234. Store 42 overwrites the same entry with 5678, and the next Load 42 returns 5678]
• If writing to same address, just combine the writes (similar to writeback cache)
• Only one writeback now instead of two


Write Combining Buffer (2)
• Can combine stores to same cache line
[Figure: buffer entry with $-Line Addr and Cache Line Data columns; the entry for line address 80 holds 1234, and Store 84 adds 5678 to the same line]
• One cache write can serve multiple original store instructions
• Benefit: reduces cache traffic, reduces pressure on store buffers
• Aggressiveness of write-combining may be limited by memory ordering model
• Writeback/combining buffer can be implemented in/integrated with the MSHRs
• Only certain memory regions may be “write-combinable” (e.g., USWC in x86)
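Combining at cache-line granularity can be sketched as follows. The 64-byte line size, the class name, and the word-per-address simplification are assumptions; memory ordering limits and the MSHR integration mentioned above are deliberately ignored.

```python
LINE_BYTES = 64  # assumed cache line size


class WriteCombiningBuffer:
    """Toy buffer that merges committed stores falling in the same cache line."""

    def __init__(self):
        self.lines = {}  # line base address -> {offset within line: value}

    def store(self, addr, value):
        base = addr - addr % LINE_BYTES
        # A store to an offset already present simply overwrites it.
        self.lines.setdefault(base, {})[addr % LINE_BYTES] = value

    def drain(self, cache):
        """Write each buffered line once; returns the number of cache writes."""
        writes = 0
        for base, data in self.lines.items():
            for offset, value in data.items():
                cache[base + offset] = value
            writes += 1
        self.lines.clear()
        return writes


wcb, cache = WriteCombiningBuffer(), {}
wcb.store(80, 1234)
wcb.store(84, 5678)   # same line as address 80
wcb.store(80, 1)      # same address: combined in place
writes = wcb.drain(cache)
print(writes, cache)
```

Three store instructions result in a single cache write, which is the traffic reduction the slide describes.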


Senior Store Queue
• Use STQ as WBB (not necessarily write combining)
[Figure: STQ between its head and tail pointers, draining to the DL1 and L2; committed (“senior”) stores stay in their STQ entries while they complete, and the STQ head advances as they finish]
• While stores are completing, other accesses (loads, snoops) can continue getting the values from the “senior” STQ
• New stores cannot allocate into Senior STQ entries until stores complete
• Can continue allocating stores
• Benefit: don’t need separate WBB, but we don’t need to stall commit on stores either


Silent Stores
• Many stores write the same value to memory that’s already there
[Figure: a store of 42 to a D$ location that already contains 42. This is a “silent store”]
• Ideally, this store shouldn’t have to happen
• Silent Stores are another way to exploit value locality; they are like the “dual” of value-predictable loads that always (frequently) load the same value


Silent Store Verification
• Pro: stores can commit faster (reduces commit-time cache port pressure)
• Con: more total accesses
– every store converted to load
– non-silent still has to store
[Figure: pipeline stages Fetch, Decode, Rename, Dispatch, Schedule, Execute, WB, Commit; the store executes as a load, reading the D$ (data, hit) and comparing the result against the store data to produce a “Silent?” signal]
• Check load value against store data for equality
• If equal, then store is silent, suppress store at commit
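The check can be sketched in two steps, one at execute and one at commit. The function names and the dictionary cache are hypothetical, and the sketch assumes nothing else writes the location between the check and commit, which a real design would have to guarantee or re-verify.

```python
def execute_store_as_load(addr, data, cache):
    """At execute: read the cache and compare. True means the store is silent."""
    return addr in cache and cache[addr] == data


def commit_store(addr, data, silent, cache):
    """At commit: write only if the store is not silent. Returns cache writes."""
    if silent:
        return 0  # suppressed: the cache already holds this value
    cache[addr] = data
    return 1


cache = {0x10: 42}
silent = execute_store_as_load(0x10, 42, cache)
writes_silent = commit_store(0x10, 42, silent, cache)

noisy = execute_store_as_load(0x10, 43, cache)
writes_noisy = commit_store(0x10, 43, noisy, cache)
print(silent, writes_silent, noisy, writes_noisy)
```

The non-silent case shows the cost named above: it performs both the read at execute and the write at commit.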


Other Duties
• Besides updating architected state, commit needs to deallocate microarchitecture resources
– ROB/LSQ entries
– Physical registers
– “colors” of various sorts if used
– RAT checkpoints
• Most are FIFOs or queues, so alloc/dealloc is usually just inc/dec of head/tail pointers
• Unified PRF requires a little more work


In-Order Commit is Sufficient
• … but not necessary
• Necessary conditions for retirement:
1. Finished
– result produced as defined by ISA semantics
2. No WAR hazard
– in the architected register file
3. No unresolved branches
4. No unresolved exceptions
5. No memory ordering violations
6. No memory consistency problems


Commitability
[Figure: breakdown of ROB entries (% of ROB) by what blocks commit: None, Not Finished, Earlier Branch, WAR, Combination, Earlier agen]
Many instructions could commit (all requirements satisfied) but don’t due to in-order constraint
Figure from Bell and Lipasti, “Deconstructing Commit,” in ISPASS’04


Backup Slides on Interrupts/Faults


External Appearance Maintained
[Figure: out-of-order processor state (fPC, RS, PRF, LSQ, ROB, PC, ARF, Memory)]
If you want to peek in, all you get to see are the (arch) register file, the PC and the cache state (incl. WB buffers)
But what if there’s no ARF?


View of the Unified Register File
[Figure: PRF accessed through the sRAT and the aRAT; the aRAT mappings define the ARF view]
If you need to “see” a register, you go through the aRAT first.


View of Branch Mispredictions
[Figure: out-of-order processor state (fPC, RS, PRF, LSQ, ROB, PC, ARF, Memory) with a mispredicted branch in the ROB]
• Wrong-path instructions are flushed…
• architected state has never been touched
• Fetch correct path instructions
• Which can update the architected state when they commit


Faults
• Div-by-Zero, Overflow, Page-Fault
• All occur at a specific point in execution (precise)
• CPU must maintain the appearance of sequential execution
[Figure: in sequential execution, DBZ! leads to a Trap and then (resume execution); in the out-of-order machine, DBZ! leads to Trap? (when?)]
Divide may have executed before other instructions due to OOO scheduling!


Timing of DBZ Fault
• Need to hold on to your faults
[Figure: a divide in the RS executes and raises DBZ (Exec: DBZ); its ROB entry sits behind older instructions, next to the architected state]
• Just make note of the fault, but don’t do anything (yet)
• Let earlier instructions commit
• Now, the arch. state is the same as just before the divide executed in the sequential order
• Now, raise the DBZ fault and when you switch to the kernel, everything appears as it should
• On a fault, flush the machine and switch to the kernel


Speculative Faults
• Faults might not be faults…
[Figure: ROB with a divide marked DBZ! that is younger than a mispredicted branch; the wrong path is flushed]
• The fault goes away, too
• Which is what we want since in a sequential execution, the wrong-path divide would not have executed (and faulted)
• Buffering faults until commit filters out speculative faults
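Buffering a fault in the ROB and acting on it only at commit can be sketched as below. The `Entry` fields and both function names are hypothetical; the model only shows the ordering and the filtering of wrong-path faults, not the actual switch to the kernel.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    name: str
    done: bool = False
    fault: Optional[str] = None  # noted at execute, acted on only at commit


def commit_until_fault(rob):
    """Commit finished entries in order.

    Returns (names committed, fault taken or None). On a fault, older
    instructions have already committed and the rest of the machine is flushed.
    """
    committed = []
    while rob and rob[0].done:
        head = rob[0]
        if head.fault is not None:
            fault = (head.name, head.fault)
            rob.clear()  # flush the machine before switching to the kernel
            return committed, fault
        committed.append(rob.pop(0).name)
    return committed, None


def flush_after(rob, branch_name):
    """Squash every entry younger than a mispredicted branch."""
    index = next(i for i, e in enumerate(rob) if e.name == branch_name)
    del rob[index + 1:]


# Real fault: the divide traps only after the older add has committed.
rob = [Entry("add", True), Entry("div", True, "DBZ"), Entry("sub", True)]
real = commit_until_fault(rob)

# Speculative fault: the divide sits on the wrong path and is flushed.
spec_rob = [Entry("br", True), Entry("div", True, "DBZ")]
flush_after(spec_rob, "br")
speculative = commit_until_fault(spec_rob)
print(real, speculative)
```

In the second case the fault disappears along with the wrong-path divide, so the kernel never sees it.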


Timing of TLB Miss
• Store must re-execute (or re-commit, really), but this means it cannot leave the ROB
• Store TLB miss can stall the processor
[Figure: TLB miss, Trap, walk page-table (may find a page fault), (resume execution), re-execute store]


Load Faults are Similar
• Load issues, misses in TLB
• Must wait! When load is oldest, allow switch to kernel to do page-table walk
• …could be painful; there are lots of loads
• Some processors have hardware page-table walkers
– OS loads a few registers with PT information (pointers) and then simple logic fetches mapping info from memory
– OS must support particular PT format to make use of HW page-table walker


Asynchronous Interrupts
• Some interrupts are not associated with any particular instruction
– timer interrupt
– disk, network interrupts (I/O)
– low battery, UPS shutdown
• When the CPU “notices” doesn’t matter (too much)
[Figure: a “Key Pressed” interrupt shown arriving at several different points in the instruction stream]


Handling Async Interrupts
• Handle right away
– Just use the current architected state, and flush the pipeline
• Deferred
– Stop fetching, let the processor drain, then switch to the interrupt handler
  • What if CPU takes a fault in the meantime?
  • Which came “first”, the external interrupt or the fault?
