xBGAS: A Bridge Proposal for RV128 and HPC

John Leidel (1), David Donofrio (2), Farzad Fatollahi-Fard (2), Kurt Keville (3)
(1) Tactical Computing Labs, (2) Lawrence Berkeley National Lab, (3) MIT


Data Center Scale Addressing

• Extended Base Global Address Space (xBGAS)
• Goals:
  • Provide extended addressing capabilities without ruining the base ABI
    • E.g., RV64 apps will still execute without an issue
  • Extended addressing must be flexible enough to support multiple target application spaces/system architectures
    • Traditional data centers, clouds, HPC, etc.
  • Extended addressing must not specifically rely upon any one virtual memory mechanism
    • E.g., provide for object-based memory resolution
• What is xBGAS NOT?
  • …a direct replacement for RV128


Application Domains

• HPA-FLAT
  • High performance analytics flat addressing
  • For extremely large data sets that are too difficult/time consuming to shard
• MMAP-IO
  • Map storage tiers into address space
  • Potential for object-based addressing
  • See DDN WOS
• Cloud-BSP
  • Potential for global object visibility for in-memory cloud infrastructures (Spark)
  • Reduce the time/cost to port Java to a full 128-bit addressing model
• HPC-PGAS
  • High Performance Computing: Partitioned Global Address Space


HPC-PGAS

• Traditional message passing paradigm has a tremendous amount of overhead
  • User library overhead, driver overhead
  • Optimized for large data transfers
  • Management of communication for Exascale-class systems
• We have excellent examples of low-latency PGAS runtimes, but little hardware/uArch support
  • LBNL: GASNet
  • PNNL: Global Arrays/ARMCI
  • Cray: Chapel
  • OpenSHMEM

[Figure: partitions Part0 through Part4 with get and put operations]
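
The put/get model in the figure can be illustrated with a toy sketch. This is not any of the runtimes listed above: the class `ToyPGAS`, its methods, and the block distribution of indices across partitions are all assumptions made for illustration.

```python
# Toy model of a partitioned global address space (PGAS).
# Hypothetical names throughout; block distribution is an assumption.

class ToyPGAS:
    def __init__(self, num_parts, part_size):
        self.part_size = part_size
        # One private list of words per partition (Part0, Part1, ...).
        self.parts = [[0] * part_size for _ in range(num_parts)]

    def locate(self, global_index):
        """Split a global index into (partition, local offset)."""
        return divmod(global_index, self.part_size)

    def put(self, global_index, value):
        """One-sided write into whichever partition owns the index."""
        part, offset = self.locate(global_index)
        self.parts[part][offset] = value

    def get(self, global_index):
        """One-sided read from whichever partition owns the index."""
        part, offset = self.locate(global_index)
        return self.parts[part][offset]


space = ToyPGAS(num_parts=5, part_size=4)
space.put(9, 42)             # global index 9 is owned by Part2, offset 1
print(space.locate(9))       # (2, 1)
print(space.get(9))          # 42
```

The caller names only a global index; ownership is resolved inside `put` and `get`, which is the property the slide contrasts with explicit message passing.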


Addressing Architecture

• uArch maps extended addressing into RV64
  • We hope to generalize this for RV32 as well
• CSR bits encoded to appear as standard RV64 uArch
  • XLEN maps to RV64
  • TBD whether we need additional interrupts and exceptions
• Addition of extended {eN} registers that map to base general registers
  • Extended registers are manually utilized via extended load/store/move instructions

[Figure: RV64I ALU, RV64I register file (x0 to x31), and RV128I extended register file (e0 to e31). Example: eld x31, 0(x21). Effective address = imm + 128-bit base address, where base[127:64] = e21 and base[63:0] = x21.]
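
A minimal sketch of the address composition in the figure, assuming a 12-bit sign-extended immediate as in standard RISC-V I-type loads. The function names are hypothetical, and whether a carry out of the low 64 bits propagates into the upper half is an assumption the slide does not settle.

```python
MASK64 = (1 << 64) - 1
MASK128 = (1 << 128) - 1

def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign

def effective_address(ext, base, imm12):
    """Compose a 128-bit address: ext supplies [127:64], base supplies [63:0]."""
    base128 = ((ext & MASK64) << 64) | (base & MASK64)
    # Assumption: the immediate is added to the full 128-bit base, so a
    # carry out of the low 64 bits propagates into the upper 64 bits.
    return (base128 + sign_extend(imm12, 12)) & MASK128

# eld x31, 0(x21) with illustrative values e21 = 0x2 and x21 = 0x1000
addr = effective_address(ext=0x2, base=0x1000, imm12=0)
print(hex(addr))
```

With `e21 = 0`, the result matches an ordinary RV64 load address, which is consistent with the goal of leaving base RV64 behaviour intact.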


ISA Extension

• Instructions are split into three blocks:
  • Base integer load/store
  • Raw integer load/store
  • Address management
• Base integer load/store (I-type)
  • Permits loading/storing all base RV64I data types using standard mnemonic
  • Ex: eld rd, imm(rs1)
  • The extended register mapped to the same index as 'rs1' is implied
• Raw integer load/store (R-type)
  • Permits loading/storing using explicit extended registers combined with explicit base registers (no imm)
  • erld rd, rs1, ext2
  • LOAD(ext2[127:64], rs1[63:0])
• Address Management
  • Permits explicit manipulation of the extended register contents
  • eaddie extd, rs1, imm
  • extd = rs1 + imm
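
The three instruction blocks can be modelled with a small sketch. `ToyCore` and its sparse dictionary memory are hypothetical; the model ignores data widths and the hardwired-zero behaviour of x0, and assumes the immediate is added to the combined 128-bit value.

```python
MASK64 = (1 << 64) - 1

class ToyCore:
    """Hypothetical model: 32 base registers x[], 32 extended registers e[]."""

    def __init__(self):
        self.x = [0] * 32
        self.e = [0] * 32
        self.mem = {}  # sparse memory keyed by 128-bit address

    def _addr(self, ext_idx, base_idx, imm=0):
        # Extended register supplies bits [127:64], base register bits [63:0].
        # Assumption: imm is added to the combined 128-bit value.
        return ((self.e[ext_idx] << 64) | self.x[base_idx]) + imm

    def eaddie(self, extd, rs1, imm):
        # Address management: extd = rs1 + imm
        self.e[extd] = (self.x[rs1] + imm) & MASK64

    def eld(self, rd, imm, rs1):
        # Base integer load: the extended register with rs1's index is implied.
        self.x[rd] = self.mem.get(self._addr(rs1, rs1, imm), 0)

    def erld(self, rd, rs1, ext2):
        # Raw integer load: explicit extended register, no immediate.
        self.x[rd] = self.mem.get(self._addr(ext2, rs1), 0)


core = ToyCore()
core.x[5] = 7
core.eaddie(21, 5, 0)                  # e21 = x5 + 0
core.x[21] = 0x1000
core.mem[(7 << 64) | 0x1008] = 99      # one word in the extended space
core.eld(31, 8, 21)                    # address = {e21, x21} + 8
core.x[6] = 0x1008
core.erld(30, 6, 21)                   # address = {e21, x6}
print(core.x[31], core.x[30])          # both loads read the same word
```

The sketch shows the difference between the two load forms: `eld` pairs x21 with e21 implicitly, while `erld` lets any base register be combined with any extended register.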


ISA Extension Encodings

xBGAS Architecture Extension Specification

4 xBGAS Instruction Set Listings

4.1 xBGAS Load/Store Instructions

Table 2: Extended RV64 Load Operations

Mnemonic              base      funct3  dest  opcode
eld  rd, imm(rs1)     rs1+ext1  011     rd    0111111
elw  rd, imm(rs1)     rs1+ext1  010     rd    0111111
elh  rd, imm(rs1)     rs1+ext1  001     rd    0111111
elhu rd, imm(rs1)     rs1+ext1  101     rd    0111111
elb  rd, imm(rs1)     rs1+ext1  000     rd    0111111
elbu rd, imm(rs1)     rs1+ext1  100     rd    0111111

Table 3: Extended RV64 Store Operations

Mnemonic              src   base      funct3  opcode
esd rs1, imm(rs2)     rs1   rs2+ext2  011     0111110
esw rs1, imm(rs2)     rs1   rs2+ext2  010     0111110
esh rs1, imm(rs2)     rs1   rs2+ext2  001     0111110
esb rs1, imm(rs2)     rs1   rs2+ext2  000     0111110

Table 4: Extended Quad and E-Loads

Mnemonic              base      funct3  dest  opcode
elq rd, imm(rs1)      rs1+ext1  000     rd    1111111
ele extd, imm(rs1)    rs1+ext1  001     rd    1111111

Table 5: Extended Quad and E-Stores

Mnemonic              src   base      funct3  opcode
esq rs1, imm(rs2)     rs1   rs2+ext2  100     1111111
ese ext1, imm(rs2)    ext1  rs2+ext2  101     1111111

xBGAS 0.0.4 17
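
As an illustration of how the load rows of Table 2 could be packed into an instruction word, the sketch below assumes the standard 32-bit RISC-V I-type field layout (imm[11:0], rs1, funct3, rd, opcode). The slides call these instructions I-type but do not show bit positions, so the layout and the function name `encode_eload` are assumptions; only the funct3 and opcode values come from the table.

```python
# funct3 values and opcode as listed in Table 2.
ELOAD_OPCODE = 0b0111111
ELOAD_FUNCT3 = {
    "eld": 0b011, "elw": 0b010, "elh": 0b001,
    "elhu": 0b101, "elb": 0b000, "elbu": 0b100,
}

def encode_eload(mnemonic, rd, rs1, imm):
    """Pack an extended load, assuming the standard I-type field positions."""
    if not -2048 <= imm <= 2047:
        raise ValueError("immediate does not fit in 12 bits")
    if not (0 <= rd < 32 and 0 <= rs1 < 32):
        raise ValueError("register index out of range")
    return (((imm & 0xFFF) << 20)              # imm[11:0] -> bits 31:20
            | (rs1 << 15)                      # rs1       -> bits 19:15
            | (ELOAD_FUNCT3[mnemonic] << 12)   # funct3    -> bits 14:12
            | (rd << 7)                        # rd        -> bits 11:7
            | ELOAD_OPCODE)                    # opcode    -> bits 6:0

word = encode_eload("eld", rd=31, rs1=21, imm=0)   # eld x31, 0(x21)
print(f"{word:032b}")
```

Note that no field names the extended register: for this instruction class it is implied by the index of rs1.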



4.2 Raw Integer Load/Store Instructions

Table 6: Raw Integer Load/Store Instructions

Mnemonic                funct7   rs2   rs1   funct3  rd    opcode
erld  rd, rs1, ext2     0000010  ext2  rs1   011     rd    0111111
erlw  rd, rs1, ext2     0000010  ext2  rs1   010     rd    0111111
erlh  rd, rs1, ext2     0000010  ext2  rs1   001     rd    0111111
erlhu rd, rs1, ext2     0000010  ext2  rs1   101     rd    0111111
erlb  rd, rs1, ext2     0000010  ext2  rs1   000     rd    0111111
erlbu rd, rs1, ext2     0000010  ext2  rs1   100     rd    0111111
erle  extd, rs1, ext2   0000011  ext2  rs1   100     extd  0111111
ersd  rs1, rs2, ext3    0000100  rs2   rs1   011     rs1   0111111
ersw  rs1, rs2, ext3    0000100  rs2   rs1   010     rs1   0111111
ersh  rs1, rs2, ext3    0000100  rs2   rs1   001     rs1   0111111
ersb  rs1, rs2, ext3    0000100  rs2   rs1   000     rs1   0111111
erse  ext1, rs2, ext3   0001000  rs2   ext1  011     rs1   0111111
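
A sketch of packing and unpacking the raw load rows of Table 6, assuming the standard 32-bit RISC-V R-type field positions (funct7, rs2, rs1, funct3, rd, opcode), which match the table's column order. The function names are hypothetical, and only the six erl* loads with funct7 0000010 are covered.

```python
ERLOAD_OPCODE = 0b0111111
ERLOAD_FUNCT7 = 0b0000010
ERLOAD_FUNCT3 = {
    "erld": 0b011, "erlw": 0b010, "erlh": 0b001,
    "erlhu": 0b101, "erlb": 0b000, "erlbu": 0b100,
}

def encode_erload(mnemonic, rd, rs1, ext2):
    """Pack a raw load, assuming the standard R-type field positions."""
    for reg in (rd, rs1, ext2):
        if not 0 <= reg < 32:
            raise ValueError("register index out of range")
    return ((ERLOAD_FUNCT7 << 25)               # funct7 -> bits 31:25
            | (ext2 << 20)                      # ext2 sits in the rs2 slot
            | (rs1 << 15)
            | (ERLOAD_FUNCT3[mnemonic] << 12)
            | (rd << 7)
            | ERLOAD_OPCODE)

def decode_fields(word):
    """Split a 32-bit word back into its R-type fields."""
    return {
        "funct7": (word >> 25) & 0x7F,
        "rs2": (word >> 20) & 0x1F,
        "rs1": (word >> 15) & 0x1F,
        "funct3": (word >> 12) & 0x7,
        "rd": (word >> 7) & 0x1F,
        "opcode": word & 0x7F,
    }

word = encode_erload("erld", rd=30, rs1=6, ext2=21)   # erld x30, x6, e21
print(decode_fields(word))
```

Unlike the I-type form, the extended register index is carried explicitly in the rs2 field, which is why these instructions have no room for an immediate.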



4.3 xBGAS Address Management Instructions

Table 7: Address Management Instructions

Mnemonic                  base  funct3  dest  opcode
eaddi  rd, ext1, imm      ext1  001     rd    1111111
eaddie extd, rs1, imm     rs1   100     extd  1111111
eaddix extd, ext1, imm    extd  101     ext1  1111111

4.4 Assembly Mnemonics

In addition to the aforementioned encodings and core xBGAS instruction extensions, we also define a set of complementary instruction mnemonics that may be supported by the target binary assembler in order to facilitate condensed definition of common operations. The following table describes these instructions and their associated mnemonics.

Table 8: Assembly Mnemonics

Mnemonic               Base Instruction
movebe rd, ext1        eaddi rd, ext1, 0
moveeb extd, rs1       eaddie extd, rs1, 0
moveee extd, ext1      eaddix extd, ext1, 0
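
The expansions in Table 8 amount to a lookup plus an appended zero immediate. The sketch below shows one way an assembler front end might apply them; the function `expand` and its text-based handling of operands are hypothetical and not taken from the xBGAS toolchain.

```python
# Table 8 as a lookup: pseudo-mnemonic -> base instruction it expands to.
PSEUDO_TO_BASE = {
    "movebe": "eaddi",    # extended register -> base register
    "moveeb": "eaddie",   # base register -> extended register
    "moveee": "eaddix",   # extended register -> extended register
}

def expand(line):
    """Rewrite a move mnemonic as its base instruction with immediate 0."""
    mnemonic, _, operands = line.strip().partition(" ")
    base = PSEUDO_TO_BASE.get(mnemonic)
    if base is None:
        return line.strip()          # not a pseudo-instruction; pass through
    ops = [op.strip() for op in operands.split(",")]
    if len(ops) != 2:
        raise ValueError(f"{mnemonic} expects two operands")
    return f"{base} {ops[0]}, {ops[1]}, 0"

print(expand("moveeb e21, x5"))      # eaddie e21, x5, 0
```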


[Figure: specification excerpts labeled Base Integer Load/Store, Raw Integer Load/Store, Address Management, and Assembly Mnemonics]


ABI (Calling Convention)

• This is where things get tricky…
• The base RV{32,64} ABI defines:
  • Context save/restore space
  • Call/return register utilization
  • Caller/callee saved state
  • Core data types
• We want to preserve as much as possible while providing extended addressing
• Many outstanding questions
  • How do we link base RV objects with objects containing extended addressing?
  • How do we address the caller/callee saved state with extended registers?
  • Debugging and debugging metadata?


Research & Progress

• Software
  • Data Intensive Scalable Computing Lab at Texas Tech is leading the software research
  • Current xBGAS spec implemented in LLVM compiler
  • Study the various application domains
• Hardware
  • TCL/LBNL/MIT leading hardware effort
  • Exploring pipelined and accelerator-based implementations
• Other Topics
  • Operating system (context save info)
  • Debugging
  • Programming Model


Community Support & Interest

• xBGAS spec available on GitHub
  • https://github.com/tactcomplabs/xbgas-archspec
• RISC-V Tools branch from Priv-1.9 in progress
  • https://github.com/tactcomplabs/riscv-tools
• We welcome comments/collaborators!


Acknowledgements

• Farzad Fatollahi-Fard, David Donofrio, John Shalf: Lawrence Berkeley Lab
• Kurt Keville: MIT
• Xi Wang, Frank Conlon, Yong Chen: Texas Tech University
• Bruce Jacob: University of Maryland
• Steve Wallach: Micron