Anshul Kumar, CSE IITD
CSL718: Superscalar Processors
Issue and Despatch
23rd Jan, 2006
Early proposals/prototypes
Timeline: 1982 to 1989
• IBM: Cheetah, America project (4)
• DEC: Multititan project (2)
• Stanford U: Match (2), Torch (4)
• Kyushu U: SIMP (4), DSNS (4)
• Term "superscalar"
Commercial superscalars
RISCs
• Intel 960KA/KB 960CA (3), 1989
• IBM Power 1 RS/6000 (4), 1990
• HP PA7000 PA7100 (2), 1992
• SUN SPARC SuperSparc (3), 1992
• DEC Alpha 21064 (2), 1992
• Motorola MC88100 MC88110 (2), 1993
• Motorola PowerPC 601/603 (3), 1993
• MIPS R4000 R8000 (4), 1994
Commercial superscalars
CISCs
• Intel 80486 Pentium (2), 1993
• Motorola MC68040 MC68060 (2), 1993
• Gmicro Gmicro/100p Gmicro 500 (2), 1993
• AMD K5 (2): 4 RISC instr, 1995
• CYRIX M1 (2), 1995
Tasks of superscalar processing
• Parallel decoding and issue
• Parallel instruction execution
• Preserving the sequential consistency of instruction execution and exception processing
Superscalar decode and issue
• Scalar Issue: I-cache -> Instruction buffer -> Decode & Issue (pipeline stages: IF, D/I)
• Superscalar Issue: I-cache -> Instruction buffer -> Decode & Issue (pipeline stages: IF, D, I)
Parallel Decoding
• Fetch multiple instructions in instruction buffer
• Decode multiple instructions in parallel – instruction window
• Possibly check dependencies among these as well as with the instructions already under execution
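The dependency check in the last bullet can be sketched as follows. This is a minimal illustration, not the deck's own method: the `Instr` record, the register names, and the `in_flight_dests` parameter are assumptions made for the example, and only read-after-write dependencies are checked.

```python
from typing import NamedTuple, Optional, Tuple

class Instr(NamedTuple):
    name: str
    dest: Optional[str]        # destination register, None if none is written
    srcs: Tuple[str, ...]      # source registers

def raw_dependencies(window, in_flight_dests=frozenset()):
    """Map each instruction name to the set of producers it must wait for."""
    deps = {}
    for i, ins in enumerate(window):
        producers = set()
        for reg in ins.srcs:
            # Nearest earlier writer of this register inside the window, if any.
            writer = next(
                (e.name for e in reversed(window[:i]) if e.dest == reg), None
            )
            if writer is not None:
                producers.add(writer)
            elif reg in in_flight_dests:
                # Otherwise the value may still be pending in an executing instruction.
                producers.add("in-flight " + reg)
        deps[ins.name] = producers
    return deps

# Hypothetical four-instruction window; r7 is being written by an executing instruction.
window = [
    Instr("a", "r1", ("r2", "r3")),
    Instr("b", "r4", ("r1", "r5")),
    Instr("c", "r6", ("r7", "r8")),
    Instr("d", "r9", ("r4", "r6")),
]
deps = raw_dependencies(window, in_flight_dests={"r7"})
```

Here `a` has no producers, `b` waits for `a`, `c` waits for the in-flight write of `r7`, and `d` waits for both `b` and `c`.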
Pre-decoding
• Do partial decoding while instructions are being loaded in I-cache
• Decoded information is appended to the instruction
• This includes instruction class, resources required etc.
[Figure: Second level cache or main memory -> (N bits/cycle) -> Pre-decode unit -> (N + n bits/cycle) -> I-cache]
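A minimal sketch of the idea, assuming a hypothetical 32-bit instruction word with the opcode in its top 6 bits and n = 4 pre-decode bits that encode only the instruction class. The opcode values and class encodings are invented for illustration and do not describe any processor in this deck.

```python
N_PREDECODE_BITS = 4  # n in the figure; value assumed for this sketch

# Hypothetical one-hot class encoding and opcode-to-class map.
CLASS_BITS = {"alu": 0b0001, "load_store": 0b0010, "branch": 0b0100, "fp": 0b1000}
OPCODE_CLASS = {0x00: "alu", 0x23: "load_store", 0x2B: "load_store",
                0x04: "branch", 0x11: "fp"}

def predecode(word: int) -> int:
    """Append the class bits to a 32-bit word on its way into the I-cache."""
    opcode = (word >> 26) & 0x3F
    cls = OPCODE_CLASS.get(opcode, "alu")   # unknown opcodes default to "alu"
    return (word << N_PREDECODE_BITS) | CLASS_BITS[cls]

def instruction_class(tagged: int) -> str:
    """Read the class back from the appended bits, without decoding the opcode."""
    bits = tagged & ((1 << N_PREDECODE_BITS) - 1)
    return next(name for name, b in CLASS_BITS.items() if b == bits)

word = (0x23 << 26) | 0x1234       # hypothetical load instruction
tagged = predecode(word)           # 32 + 4 bits stored in the I-cache
```

The decoder can then route `tagged` by calling `instruction_class(tagged)` instead of examining the full opcode, which is the saving pre-decoding aims for.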
Number of Pre-decode bits
Processor             No. of predecode bits
PA 7200 (1995)        5
PA 8000 (1996)        5
PowerPC 620 (1996)    7
UltraSparc (1995)     4
HAL PM1 (1995)        4
AMD K5 (1995)         5 (per byte)
R 10000 (1996)        4
Issue vs Dispatch
Blocking Issue
• Decode and issue to EU
• Instructions may be blocked due to data dependency
Non-blocking Issue
• Decode and issue to buffer
• From buffer dispatch to EU
• Instructions are not blocked due to data dependency
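The contrast can be sketched as a toy model. The function names, the `operands_ready` predicate, and the buffer capacity are assumptions for illustration; a real processor also limits issue and dispatch rates, which this sketch ignores.

```python
from collections import deque

def blocking_issue(window, operands_ready):
    """Issue straight to the EUs; stop at the first instruction that is not ready."""
    to_eu = []
    while window and operands_ready(window[0]):
        to_eu.append(window.popleft())
    return to_eu

def shelved_issue(window, shelf, capacity):
    """Issue into the shelving buffer with no dependency check.
    Only a full buffer can stall issue."""
    while window and len(shelf) < capacity:
        shelf.append(window.popleft())

def dispatch(shelf, operands_ready):
    """Send every shelved instruction whose operands are ready to an EU."""
    ready = [ins for ins in shelf if operands_ready(ins)]
    for ins in ready:
        shelf.remove(ins)
    return ready

# Hypothetical case: instruction "b" is waiting for an operand.
ready = lambda ins: ins != "b"

w1 = deque("abcd")
sent_blocking = blocking_issue(w1, ready)   # "b" blocks itself and c, d

w2, shelf = deque("abcd"), []
shelved_issue(w2, shelf, capacity=4)        # the issue window is emptied
sent_shelved = dispatch(shelf, ready)       # only "b" stays behind in the buffer
```

With blocking issue only `a` reaches an EU and `b`, `c`, `d` stay in the window. With shelving the window drains completely and `a`, `c`, `d` are dispatched while `b` waits in the buffer.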
Blocking Issue
[Figure: Instruction buffer (issue window) -> Decode, Check & Issue -> EU, EU, EU]
Non-blocking (shelved) Issue
[Figure: Instruction buffer -> Decode & Issue -> three parallel paths, each Reservation station -> Dep. Checking/dispatch -> EU]
Handling of Issue Blockages
• Preserving issue order: in-order or out of order
• Alignment of instruction issue: aligned or unaligned
Issue Order
[Figure: an issue window holding instructions a to e. Issue in strict program order: only a is issued. Out of order issue: a and c are issued. Legend: independent instruction, dependent instruction, issued instruction.]
Example: MC 88110, PowerPC 601
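The two issue orders can be sketched as follows. The window contents and the set of dependent instructions are assumptions chosen to match the figure's outcome, and issue-rate limits are ignored.

```python
def issue_in_order(window, is_independent):
    """Strict program order: the first dependent instruction blocks all later ones."""
    issued = []
    for ins in window:
        if not is_independent(ins):
            break
        issued.append(ins)
    return issued

def issue_out_of_order(window, is_independent):
    """Out of order: any independent instruction in the window may be issued."""
    return [ins for ins in window if is_independent(ins)]

window = ["a", "b", "c", "d", "e"]
dependent = {"b", "d", "e"}                      # assumed for this example
independent = lambda ins: ins not in dependent

in_order = issue_in_order(window, independent)
out_of_order = issue_out_of_order(window, independent)
```

In-order issue stops at `b` and issues only `a`, while out-of-order issue also picks up `c` from behind the blocked instruction.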
Alignment
[Figure: Aligned Issue vs Unaligned Issue on instructions a to h. Aligned issue: a fixed window is checked and issued over cycles 1 to 3 before the next window (f, g, h) is entered. Unaligned issue: a gliding window takes in following instructions as earlier ones are issued.]
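A small simulation can make the difference concrete. It is a sketch under stated assumptions: in-order issue, a window of 4 entries, no issue-rate limit other than the window size, and a hypothetical `ready_cycle` table giving the first cycle in which a blocked instruction may issue.

```python
def run(stream, window_size, ready_cycle, aligned):
    """Return {instruction: issue cycle} for in-order issue from a window."""
    pending = list(stream)
    window, issued_at, cycle = [], {}, 0
    while pending or window:
        cycle += 1
        if aligned:
            # Fixed window: refill only after the whole window has been issued.
            if not window:
                window, pending = pending[:window_size], pending[window_size:]
        else:
            # Gliding window: top up the free slots every cycle.
            free = window_size - len(window)
            window, pending = window + pending[:free], pending[free:]
        # In-order issue: stop at the first instruction that is not yet ready.
        while window and ready_cycle.get(window[0], 0) <= cycle:
            issued_at[window.pop(0)] = cycle
    return issued_at

ready_cycle = {"b": 2}     # assumed: b cannot issue before cycle 2
aligned = run("abcdefgh", 4, ready_cycle, aligned=True)
unaligned = run("abcdefgh", 4, ready_cycle, aligned=False)
```

Under these assumptions `e` issues in cycle 3 with the fixed window, because it has to wait for `b`, `c`, `d` to drain, but in cycle 2 with the gliding window, which has already moved forward by one slot.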
Design choices in instruction issue
• Coping with false data dependencies: no, or register renaming
• Coping with unresolved control dependencies: wait, or speculative
• Use of shelving: blocking, or shelved
• Handling of issue blockages
• Issue rate (2-6)
Frequently used issue policies in scalar processors
• Traditional scalar issue: i386, MC68030, R3000, Sparc
• Traditional scalar issue with shelving: CDC 6600
• Traditional scalar issue with shelving and renaming: IBM 360/91
• Traditional scalar issue with spec. execution: i486, MC68040, R4000, MicroSparc
Frequently used issue policies in superscalar processors
(speculative execution in all)
• Straightforward superscalar issue (aligned or unaligned)
• Straightforward superscalar issue with shelving
• Straightforward superscalar issue with renaming
• Advanced superscalar issue (renaming + shelving)
Examples:
• Pentium, PowerPC601, PA7100, SuperSparc, Alpha21164
• MC68060, PA7200, UltraSparc
• MC88110, R8000
• PowerPC602, R10000, PentiumPro, PowerPC602, PA8000, Sparc64, Am29000, K5
Frequently used issue policies
• Traditional scalar issue
• Traditional scalar issue with spec. execution
• Straightforward superscalar issue (aligned or unaligned)
• Advanced superscalar issue
Design Space of Shelving
• Scope of shelving: partial or full
• Layout of shelving buffers
• Operand fetch policy
• Instruction dispatch scheme
Layout of Shelving Buffers
• Type of the shelving buffers: stand alone (RS), or combined with renaming and reordering
• Number of shelving buffer entries: individual 2-4, group 6-16, central 20, total 15-40
• Number of read and write ports: depends on no. of EUs connected
Reservation Stations (RS)
[Figure: three layouts of RSs feeding EUs: Individual RSs, Group RSs, Central RS]
Combined Buffer (for Shelving, Renaming, Reordering)
[Figure: instructions from decode/issue enter the DRIS, which feeds the EUs]
DRIS: Deferred scheduling, Register renaming and Instruction Shelving
Operand Fetch Policies
• Issue bound fetch
• Dispatch bound fetch
Issue bound operand fetch (with single register file)
[Figure: Decode/issue, RF, RSs and EUs, with instruction and data paths marked]
Dispatch bound operand fetch (with single register file)
[Figure: Decode/issue, RF, RSs and EUs, with instruction and data paths marked]
Issue bound operand fetch (with multiple register files)
[Figure: Decode/issue, two RFs, RSs and EUs, with instruction and data paths marked]
Dispatch bound operand fetch (with multiple register files)
[Figure: Decode/issue, two RFs, RSs and EUs, with instruction and data paths marked]
Updating RFs and RSs
[Figure: Decode/issue, two RFs, RSs and EUs, with instruction and data paths marked]
Instruction dispatch scheme
• Dispatch policy
• Dispatch rate: single instr/cycle (individual RS) or multiple instr/cycle (group or central RS)
• Checking operand availability
• Treatment of empty RS
Dispatch policy
• Selection rule: rule for identifying instructions which are ready for execution (data dependency check)
• Arbitration rule: rule for choosing one out of several ready instructions (earlier instruction has priority)
• Dispatch order
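The selection and arbitration rules can be sketched as two small functions. The dictionary layout of a reservation station entry (`seq` for program order, `src_valid` for the operand status bits) is an assumption made for this example.

```python
def select_ready(rs_entries):
    """Selection rule: an entry is ready when all its source operands are valid."""
    return [e for e in rs_entries if all(e["src_valid"])]

def arbitrate(ready_entries):
    """Arbitration rule: the earliest instruction in program order has priority."""
    return min(ready_entries, key=lambda e: e["seq"], default=None)

# Hypothetical reservation station contents.
rs = [
    {"name": "i1", "seq": 1, "src_valid": (True, False)},   # waits for an operand
    {"name": "i2", "seq": 2, "src_valid": (True, True)},
    {"name": "i3", "seq": 3, "src_valid": (True, True)},
]
ready = select_ready(rs)
chosen = arbitrate(ready)
```

Here `i2` and `i3` pass the selection rule and `i2` wins arbitration. Dispatching `i2` ahead of the older `i1` is an out-of-order choice; a strictly in-order scheme would wait for `i1` instead.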
Dispatch order
• in-order
• partially out of order
• out of order
[Figure: RS entries checked under each dispatch order]
Checking availability of operands
• Direct check of score-board bits (usual for dispatch bound operand fetch): control flow approach
• Check of explicit status bits in RS (usual for issue bound operand fetch): data flow approach
Control flow / data flow: Flynn’s terminology
Score-board
[Figure: Register File in which each register holds a data field and a status bit]
Introduced with CDC6600
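A minimal sketch of the status-bit idea, assuming one valid bit per register that is reset when an instruction writing the register is issued and set again when the result is written back. It is not a model of the CDC6600 scoreboard itself; the class and method names are illustrative.

```python
class Scoreboard:
    """One status bit per register: True means the register holds valid data."""

    def __init__(self, n_regs):
        self.valid = [True] * n_regs

    def sources_available(self, srcs):
        """Direct check of the score-board bits of the source registers."""
        return all(self.valid[r] for r in srcs)

    def issue(self, dest):
        """A pending write makes the destination register unavailable."""
        self.valid[dest] = False

    def write_back(self, dest):
        """The result has arrived, so the register is valid again."""
        self.valid[dest] = True

sb = Scoreboard(8)
sb.issue(dest=3)                              # r3 = r1 + r2 is issued
blocked = not sb.sources_available((3, 1))    # r4 = r3 * r1 must wait
sb.write_back(dest=3)
released = sb.sources_available((3, 1))       # now it may proceed
```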
Checking in dispatch bound fetch
• Decoded instruction: stored in the Reservation station as OC, Rs1, Rs2, Rd (OC: opcode)
• Register File at issue: reset V bit of Rd
• Dispatch: check V bits of sources in the Register File, then read Os1, Os2 (operand values)
• EU: receives OC, Os1, Os2 and returns result, Rd
• Register File at completion: update Rd, set V bit
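The flow above can be sketched with plain lists. All names are hypothetical, the reservation station holds register identifiers only, and the sketch assumes no instruction names its own Rd as a source (it would otherwise wait on the V bit it reset itself).

```python
def issue(rs, rf_valid, opcode, rs1, rs2, rd):
    """Shelve the decoded instruction (identifiers only) and reset the V bit of Rd."""
    rs.append((opcode, rs1, rs2, rd))
    rf_valid[rd] = False

def try_dispatch(rs, rf_value, rf_valid):
    """Check the V bits of the sources; fetch operand values only at dispatch."""
    for entry in rs:
        opcode, rs1, rs2, rd = entry
        if rf_valid[rs1] and rf_valid[rs2]:
            rs.remove(entry)
            return opcode, rf_value[rs1], rf_value[rs2], rd
    return None

def write_back(rf_value, rf_valid, rd, result):
    """Update Rd and set its V bit."""
    rf_value[rd] = result
    rf_valid[rd] = True

rf_value = [0, 5, 7, 0, 0]
rf_valid = [True] * 5
rs = []
issue(rs, rf_valid, "add", 1, 2, 3)                # r3 = r1 + r2
issue(rs, rf_valid, "mul", 3, 1, 4)                # r4 = r3 * r1
first = try_dispatch(rs, rf_value, rf_valid)       # add goes to an EU
second = try_dispatch(rs, rf_value, rf_valid)      # mul waits: V bit of r3 is reset
write_back(rf_value, rf_valid, 3, 12)
third = try_dispatch(rs, rf_value, rf_valid)       # mul now reads r3 = 12
```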
Checking in issue bound fetch
• Decoded instruction: Rs1, Rs2, Rd go to the Register File; Os1, Os2 (operand values) are read and the V bit of Rd is reset
• Reservation station entry: OC, Os1/Is1, Vs1, Os2/Is2, Vs2, Rd
• Dispatch: Reservation station checks Vs1, Vs2; EU receives OC, Os1, Os2, Rd
• EU: returns result, Rd
• Register File: update Rd, set V bit
• Reservation station: associative update of Is1, Is2 with Rd, set Vs bits
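A sketch of the same flow in code, with hypothetical class names. Each reservation station entry stores either an operand value (Os) or, if the register is not yet valid, its identifier (Is) together with a cleared Vs bit; the write-back step performs the associative update. Register renaming is left out, so the identifier is simply the register number.

```python
class RegisterFile:
    def __init__(self, n_regs):
        self.value = [0] * n_regs
        self.valid = [True] * n_regs      # V bit per register

class RSEntry:
    def __init__(self, opcode, rd, rf, rs1, rs2):
        self.opcode, self.rd = opcode, rd
        self.ops, self.vs = [], []
        for r in (rs1, rs2):
            if rf.valid[r]:
                self.ops.append(rf.value[r])   # operand value Os fetched at issue
                self.vs.append(True)
            else:
                self.ops.append(r)             # keep the identifier Is instead
                self.vs.append(False)
        rf.valid[rd] = False                   # reset V bit of Rd

    def ready(self):
        """Check of the explicit status bits Vs1, Vs2."""
        return all(self.vs)

def write_back(rf, stations, rd, result):
    """Update Rd in the RF, then associatively update every waiting RS entry."""
    rf.value[rd] = result
    rf.valid[rd] = True
    for e in stations:
        for i in range(2):
            if not e.vs[i] and e.ops[i] == rd:
                e.ops[i], e.vs[i] = result, True

rf = RegisterFile(8)
rf.value[1], rf.value[2] = 5, 7
e1 = RSEntry("add", 3, rf, 1, 2)     # both operands valid at issue
e2 = RSEntry("mul", 4, rf, 3, 1)     # r3 pending: entry holds identifier 3
before = e2.ready()
write_back(rf, [e2], 3, 12)          # result of the add is broadcast
after = e2.ready()
```

After the write-back, `e2.ops` holds the values 12 and 5, so the entry can be dispatched without another register file access.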
Treatment of an empty RS
• Straightforward approach: at least one cycle stay in RS
• Bypassing RS if empty
Examples: Nx586, Sparc64, PowerPC 604
Approaches in dispatching
• Straightforward: in order, single instr/cycle, individual RSs. Examples: Power1, PPC603, Nx586, Am29000
• Enhanced: partially out of order, single instr/cycle, individual RSs. Examples: Power2, PPC604, 620
• Advanced: out of order, multiple instr/cycle, group/central RSs. Examples: PM1, PentiumPro, PA8000, R10000