xBGAS: A Bridge Proposal for RV128 and HPC
John Leidel (1), David Donofrio (2), Farzad Fatollahi-Fard (2), Kurt Keville (3)
(1) Tactical Computing Labs, (2) Lawrence Berkeley National Lab, (3) MIT
Data Center Scale Addressing
• Extended Base Global Address Space (xBGAS)
• Goals:
  • Provide extended addressing capabilities without ruining the base ABI
    • E.g., RV64 apps will still execute without an issue
  • Extended addressing must be flexible enough to support multiple target application spaces/system architectures
    • Traditional data centers, clouds, HPC, etc.
  • Extended addressing must not specifically rely upon any one virtual memory mechanism
    • E.g., provide for object-based memory resolution
• What is xBGAS NOT?
  • …a direct replacement for RV128
Application Domains
• HPA-FLAT
  • High performance analytics flat addressing
  • For extremely large data sets that are too difficult/time consuming to shard
• MMAP-IO
  • Map storage tiers into address space
  • Potential for object-based addressing
  • See DDN WOS
• Cloud-BSP
  • Potential for global object visibility for in-memory cloud infrastructures (Spark)
  • Reduce the time/cost to port Java to a full 128-bit addressing model
• HPC-PGAS
  • High Performance Computing: Partitioned Global Address Space
HPC-PGAS
• Traditional message passing paradigm has a tremendous amount of overhead
  • User library overhead, driver overhead
  • Optimized for large data transfers
  • Management of communication for Exascale-class systems
• We have excellent examples of low-latency PGAS runtimes, but little hardware/uArch support
  • LBNL: GASNet
  • PNNL: Global Arrays/ARMCI
  • Cray: Chapel
  • OpenSHMEM
[Figure: a global address space split into partitions Part 0 through Part 4, with get and put operations between partitions]
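The get/put picture above can be illustrated with a toy model. This is only a sketch of the general PGAS idea, not of any runtime named on the slide: the class name PartitionedGlobalArray, the equal-sized block partitioning, and the locate helper are all assumptions made for illustration.

```python
class PartitionedGlobalArray:
    """Toy PGAS model: one flat index space split across equal-sized partitions.

    Hypothetical illustration only; real PGAS runtimes move data over a network.
    """

    def __init__(self, num_parts, part_size):
        self.part_size = part_size
        # Each inner list stands in for the local memory of one node.
        self.parts = [[0] * part_size for _ in range(num_parts)]

    def locate(self, global_index):
        """Translate a global index into (partition, local offset)."""
        if not 0 <= global_index < len(self.parts) * self.part_size:
            raise IndexError("global index out of range")
        return divmod(global_index, self.part_size)

    def put(self, global_index, value):
        """One-sided write into whichever partition owns the index."""
        part, offset = self.locate(global_index)
        self.parts[part][offset] = value

    def get(self, global_index):
        """One-sided read from whichever partition owns the index."""
        part, offset = self.locate(global_index)
        return self.parts[part][offset]


gas = PartitionedGlobalArray(num_parts=5, part_size=4)
gas.put(13, 99)        # index 13 falls in partition 3 at offset 1
value = gas.get(13)
```

The caller never names the owning partition; the translation from a global index to a location is the step the slides argue deserves hardware support.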
Addressing Architecture
• uArch maps extended addressing into RV64
  • We hope to generalize this for RV32 as well
• CSR bits encoded to appear as standard RV64 uArch
  • XLEN maps to RV64
  • TBD whether we need additional interrupts and exceptions
• Addition of extended {eN} registers that map to base general registers
  • Extended registers are manually utilized via extended load/store/move instructions
[Figure: the RV64I ALU and the RV64I register file (x0 to x31) shown alongside the RV128I extended register file (e0 to e31)]
Example: eld x31, 0(x21)
Effective address = imm + 128-bit base address, where base[127:64] = e21 and base[63:0] = x21
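The effective-address rule for eld x31, 0(x21) can be worked through as arithmetic. The sketch below assumes a 12-bit sign-extended immediate, as in standard RISC-V loads, and assumes the immediate is added to the full 128-bit base, so a carry could reach the upper half; the slides do not settle either point. The function names are hypothetical.

```python
MASK64 = (1 << 64) - 1
MASK128 = (1 << 128) - 1


def sign_extend_12(imm):
    """Interpret the low 12 bits of imm as a signed value (assumed immediate width)."""
    imm &= 0xFFF
    return imm - 0x1000 if imm & 0x800 else imm


def effective_address(ext, base, imm=0):
    """Combine an extended register (bits 127:64) and a base register (bits 63:0).

    Assumption: the immediate is added to the whole 128-bit base address.
    """
    base128 = ((ext & MASK64) << 64) | (base & MASK64)
    return (base128 + sign_extend_12(imm)) & MASK128


# eld x31, 0(x21) with e21 = 0x2 and x21 = 0x1000 (example values)
addr = effective_address(ext=0x2, base=0x1000, imm=0)
print(hex(addr))
```

Here the upper 64 bits of the result come from e21 and the lower 64 bits from x21.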
ISA Extension
• Instructions are split into three blocks:
  • Base integer load/store
  • Raw integer load/store
  • Address management
• Base integer load/store (I-type)
  • Permits loading/storing all base RV64I data types using standard mnemonics
  • Ex: eld rd, imm(rs1)
  • The extended register mapped to the same index as 'rs1' is implied
• Raw integer load/store (R-type)
  • Permits loading/storing using explicit extended registers combined with explicit base registers (no imm)
  • erld rd, rs1, ext2
  • LOAD(ext2[127:64], rs1[63:0])
• Address management
  • Permits explicit manipulation of the extended register contents
  • eaddie extd, rs1, imm
  • extd = rs1 + imm
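The three instruction blocks above can be modelled in a few lines to show how the implied and explicit extended registers differ. This is a minimal sketch under stated assumptions: the class XbgasModel is hypothetical, extended registers are taken to be 64 bits wide, memory is a dictionary holding one whole value per 128-bit address, and x0 is hardwired to zero as in base RISC-V.

```python
MASK64 = (1 << 64) - 1


class XbgasModel:
    """Hypothetical register/memory model for eld, erld and eaddie."""

    def __init__(self):
        self.x = [0] * 32   # base registers x0..x31
        self.e = [0] * 32   # extended registers e0..e31 (assumed 64-bit)
        self.mem = {}       # sparse memory keyed by 128-bit address

    def _addr(self, ext_idx, base_idx, imm=0):
        # Upper 64 bits from the extended register, lower 64 from the base.
        return ((self.e[ext_idx] << 64) | self.x[base_idx]) + imm

    def _write_x(self, rd, value):
        if rd != 0:         # x0 stays zero in RISC-V
            self.x[rd] = value & MASK64

    def eld(self, rd, imm, rs1):
        """Base load: the extended register with the same index as rs1 is implied."""
        self._write_x(rd, self.mem.get(self._addr(rs1, rs1, imm), 0))

    def erld(self, rd, rs1, ext2):
        """Raw load: the extended register is named explicitly; no immediate."""
        self._write_x(rd, self.mem.get(self._addr(ext2, rs1), 0))

    def eaddie(self, extd, rs1, imm):
        """Address management: extd = rs1 + imm."""
        self.e[extd] = (self.x[rs1] + imm) & MASK64


m = XbgasModel()
m.x[5] = 7
m.eaddie(21, 5, 1)                      # e21 = x5 + 1 = 8
m.x[21] = 0x1000
m.mem[(8 << 64) | 0x1000] = 0xCAFE      # value at the 128-bit address
m.eld(31, 0, 21)                        # uses e21 implicitly
m.erld(30, 21, 21)                      # names e21 explicitly
```

Both loads read the same location; the raw form simply allows the extended and base register indices to differ.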
ISA Extension Encodings

xBGAS Architecture Extension Specification (xBGAS 0.0.4)
4 xBGAS Instruction Set Listings
4.1 xBGAS Load/Store Instructions

Table 2: Extended RV64 Load Operations
Mnemonic              base      funct3  dest  opcode
eld rd, imm(rs1)      rs1+ext1  011     rd    0111111
elw rd, imm(rs1)      rs1+ext1  010     rd    0111111
elh rd, imm(rs1)      rs1+ext1  001     rd    0111111
elhu rd, imm(rs1)     rs1+ext1  101     rd    0111111
elb rd, imm(rs1)      rs1+ext1  000     rd    0111111
elbu rd, imm(rs1)     rs1+ext1  100     rd    0111111

Table 3: Extended RV64 Store Operations
Mnemonic              src   base      funct3  opcode
esd rs1, imm(rs2)     rs1   rs2+ext2  011     0111110
esw rs1, imm(rs2)     rs1   rs2+ext2  010     0111110
esh rs1, imm(rs2)     rs1   rs2+ext2  001     0111110
esb rs1, imm(rs2)     rs1   rs2+ext2  000     0111110

Table 4: Extended Quad and E-Loads
Mnemonic              base      funct3  dest  opcode
elq rd, imm(rs1)      rs1+ext1  000     rd    1111111
ele extd, imm(rs1)    rs1+ext1  001     rd    1111111

Table 5: Extended Quad and E-Stores
Mnemonic              src   base      funct3  opcode
esq rs1, imm(rs2)     rs1   rs2+ext2  100     1111111
ese ext1, imm(rs2)    ext1  rs2+ext2  101     1111111
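As an illustration of how the Table 2 fields combine into a 32-bit instruction word, the sketch below encodes the extended loads. It assumes the standard RISC-V I-type layout (imm[11:0] in bits 31:20, rs1 in 19:15, funct3 in 14:12, rd in 11:7, opcode in 6:0); the slides call these instructions I-type but do not spell out the bit positions. The function name encode_extended_load is hypothetical.

```python
LOAD_OPCODE = 0b0111111   # opcode column of Table 2

# funct3 values copied from Table 2
LOAD_FUNCT3 = {
    "eld": 0b011,
    "elw": 0b010,
    "elh": 0b001,
    "elhu": 0b101,
    "elb": 0b000,
    "elbu": 0b100,
}


def encode_extended_load(mnemonic, rd, rs1, imm):
    """Pack an extended load into a 32-bit word (assumed standard I-type layout)."""
    if mnemonic not in LOAD_FUNCT3:
        raise ValueError(f"unknown extended load: {mnemonic}")
    if not (0 <= rd < 32 and 0 <= rs1 < 32):
        raise ValueError("register index must be in 0..31")
    if not -2048 <= imm <= 2047:
        raise ValueError("immediate must fit in 12 signed bits")
    return (
        ((imm & 0xFFF) << 20)
        | (rs1 << 15)
        | (LOAD_FUNCT3[mnemonic] << 12)
        | (rd << 7)
        | LOAD_OPCODE
    )


# eld x31, 0(x21); e21 is implied by rs1 = 21 and needs no field of its own
word = encode_extended_load("eld", rd=31, rs1=21, imm=0)
print(f"{word:#010x}")
```

Because the extended register is implied by the rs1 index, the encoding needs no extra register field beyond those of a base RV64I load.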
4.2 Raw Integer Load/Store Instructions

Table 6: Raw Integer Load/Store Instructions
Mnemonic                 funct7   rs2   rs1   funct3  rd    opcode
erld rd, rs1, ext2       0000010  ext2  rs1   011     rd    0111111
erlw rd, rs1, ext2       0000010  ext2  rs1   010     rd    0111111
erlh rd, rs1, ext2       0000010  ext2  rs1   001     rd    0111111
erlhu rd, rs1, ext2      0000010  ext2  rs1   101     rd    0111111
erlb rd, rs1, ext2       0000010  ext2  rs1   000     rd    0111111
erlbu rd, rs1, ext2      0000010  ext2  rs1   100     rd    0111111
erle extd, rs1, ext2     0000011  ext2  rs1   100     extd  0111111
ersd rs1, rs2, ext3      0000100  rs2   rs1   011     rs1   0111111
ersw rs1, rs2, ext3      0000100  rs2   rs1   010     rs1   0111111
ersh rs1, rs2, ext3      0000100  rs2   rs1   001     rs1   0111111
ersb rs1, rs2, ext3      0000100  rs2   rs1   000     rs1   0111111
erse ext1, rs2, ext3     0001000  rs2   ext1  011     rs1   0111111
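The raw loads in Table 6 list their fields in the same order as the standard RISC-V R-type format, so a round-trip encoder and decoder can be sketched. The bit positions (funct7 in 31:25, rs2 in 24:20, rs1 in 19:15, funct3 in 14:12, rd in 11:7, opcode in 6:0) are an assumption taken from base RISC-V, and only the raw loads with funct7 = 0000010 are covered. The function names are hypothetical.

```python
RAW_OPCODE = 0b0111111
RAW_LOAD_FUNCT7 = 0b0000010

# funct3 values copied from the raw load rows of Table 6
RAW_LOAD_FUNCT3 = {
    "erld": 0b011,
    "erlw": 0b010,
    "erlh": 0b001,
    "erlhu": 0b101,
    "erlb": 0b000,
    "erlbu": 0b100,
}


def encode_raw_load(mnemonic, rd, rs1, ext2):
    """Pack a raw load; the ext2 index occupies the rs2 field (assumed R-type layout)."""
    for index in (rd, rs1, ext2):
        if not 0 <= index < 32:
            raise ValueError("register index must be in 0..31")
    return (
        (RAW_LOAD_FUNCT7 << 25)
        | (ext2 << 20)
        | (rs1 << 15)
        | (RAW_LOAD_FUNCT3[mnemonic] << 12)
        | (rd << 7)
        | RAW_OPCODE
    )


def decode_fields(word):
    """Split a 32-bit word back into R-type fields."""
    return {
        "funct7": (word >> 25) & 0x7F,
        "rs2": (word >> 20) & 0x1F,
        "rs1": (word >> 15) & 0x1F,
        "funct3": (word >> 12) & 0x7,
        "rd": (word >> 7) & 0x1F,
        "opcode": word & 0x7F,
    }


# erld x10, x11, e12
word = encode_raw_load("erld", rd=10, rs1=11, ext2=12)
fields = decode_fields(word)
```

Unlike the base loads, the extended register index here travels in its own field, which is why these instructions carry no immediate.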
4.3 xBGAS Address Management Instructions

Table 7: Address Management Instructions
Mnemonic                  base  funct3  dest  opcode
eaddi rd, ext1, imm       ext1  001     rd    1111111
eaddie extd, rs1, imm     rs1   100     extd  1111111
eaddix extd, ext1, imm    extd  101     ext1  1111111

4.4 Assembly Mnemonics
In addition to the aforementioned encodings and core xBGAS instruction extensions, we also define a set of complementary instruction mnemonics that may be supported by the target binary assembler in order to facilitate condensed definition of common operations. The following table describes these instructions and their associated mnemonics.

Table 8: Assembly Mnemonics
Mnemonic              Base Instruction
movebe rd, ext1       eaddi rd, ext1, 0
moveeb extd, rs1      eaddie extd, rs1, 0
moveee extd, ext1     eaddix extd, ext1, 0
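The Table 8 mnemonics are plain textual aliases, so an assembler front end could expand them before encoding. The sketch below shows one way to do that; the function expand_pseudo and its string-based interface are hypothetical and not taken from the xBGAS toolchain.

```python
# Mapping copied from Table 8: each move is its base instruction with imm = 0.
PSEUDO_TO_BASE = {
    "movebe": "eaddi",    # extended -> base register
    "moveeb": "eaddie",   # base -> extended register
    "moveee": "eaddix",   # extended -> extended register
}


def expand_pseudo(line):
    """Rewrite a Table 8 mnemonic as its base instruction; pass other lines through."""
    text = line.strip()
    mnemonic, _, operands = text.partition(" ")
    base = PSEUDO_TO_BASE.get(mnemonic)
    if base is None:
        return text
    args = [arg.strip() for arg in operands.split(",")]
    if len(args) != 2 or not all(args):
        raise ValueError(f"{mnemonic} expects two operands: {line!r}")
    return f"{base} {args[0]}, {args[1]}, 0"


expanded = expand_pseudo("moveeb e21, x5")
print(expanded)   # eaddie e21, x5, 0
```

Instructions that are not in the table, such as eld x31, 0(x21), are returned unchanged.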
ABI (Calling Convention)
• This is where things get tricky…
• The base RV{32,64} ABI defines:
  • Context save/restore space
  • Call/return register utilization
  • Caller/callee saved state
  • Core data types
• We want to preserve as much as possible while providing extended addressing
• Many outstanding questions
  • How do we link base RV objects with objects containing extended addressing?
  • How do we address the caller/callee saved state with extended registers?
  • Debugging and debugging metadata?
Research & Progress
• Software
  • Data Intensive Scalable Computing Lab at Texas Tech is leading the software research
  • Current xBGAS spec implemented in LLVM compiler
  • Study the various application domains
• Hardware
  • TCL/LBNL/MIT leading hardware effort
  • Exploring pipelined and accelerator-based implementations
• Other Topics
  • Operating system (context save info)
  • Debugging
  • Programming Model
Community Support & Interest
• xBGAS spec available on GitHub
  • https://github.com/tactcomplabs/xbgas-archspec
• RISC-V Tools branch from Priv-1.9 in progress
  • https://github.com/tactcomplabs/riscv-tools
• We welcome comments/collaborators!
Acknowledgements
• Farzad Fatollahi-Fard, David Donofrio, John Shalf: Lawrence Berkeley Lab
• Kurt Keville: MIT
• Xi Wang, Frank Conlon, Yong Chen: Texas Tech University
• Bruce Jacob: University of Maryland
• Steve Wallach: Micron