CS110 Computer Architecture
Lecture 6: More RISC-V, RISC-V Functions
Instructor: Sören Schwertfeger
http://shtech.org/courses/ca/
School of Information Science and Technology (SIST)
ShanghaiTech University
Slides based on UC Berkeley's CS61C
Levels of Representation/Interpretation
High Level Language Program (e.g., C)
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
  ↓ Compiler
Assembly Language Program (e.g., RISC-V)
    lw t0, 0(x2)
    lw t1, 4(x2)
    sw t1, 0(x2)
    sw t0, 4(x2)
  ↓ Assembler
Machine Language Program (RISC-V)
    0000 1001 1100 0110 1010 1111 0101 1000
    1010 1111 0101 1000 0000 1001 1100 0110
    1100 0110 1010 1111 0101 1000 0000 1001
    0101 1000 0000 1001 1100 0110 1010 1111
  ↓ Machine Interpretation
Hardware Architecture Description (e.g., block diagrams)
  ↓ Architecture Implementation
Logic Circuit Description (Circuit Schematic Diagrams)
Anything can be represented as a number, i.e., data or instructions
Review: Components of a Computer
[Figure: block diagram. Processor (Control; Datapath with Program Counter, Registers, Arithmetic & Logic Unit (ALU)) connected to Memory (Bytes; Program, Data) through the Processor-Memory Interface (Enable?, Read/Write, Address, Write Data, Read Data); Input and Output attached through I/O-Memory Interfaces.]
Last lecture
• In RISC-V Assembly Language:
  – Registers replace C variables
  – One instruction (simple operation) per line
  – Simpler is Better, Smaller is Faster
• In RV32, words are 32 bits
• Instructions: add, addi, sub, lw, sw, lb
• Registers:
  – 32 registers, referred to as x0 – x31
  – Zero: x0
RISC-V: Little Endian
ADDR3    ADDR2    ADDR1    ADDR0
BYTE3    BYTE2    BYTE1    BYTE0
00000000 00000000 00000100 00000001
(E.g., 1025 = 0x401 = 0b0100 0000 0001)
• Hexadecimal number: 0xFD34AB88 (4,248,087,432ten) =>
  – Byte 0: 0x88 (136ten)
  – Byte 1: 0xAB (171ten)
  – Byte 2: 0x34 (52ten)
  – Byte 3: 0xFD (253ten)
• Little Endian: the "endianness" is little:
  – It starts with the smallest (least significant) byte
  – Swapped from how we write the number
[Figure: byte addresses within words, drawn as rows 0 4 8 12 …, 1 5 9 13 …, 2 6 10 14 …, 3 7 11 15 …; label: Little Endian, most significant byte in a word (numbers are addresses)]
Example: word (e.g., int) stored at address 64
Address: 64    65    66    67
Data:    0x88  0xAB  0x34  0xFD
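The byte layout above can be checked with a small C sketch. The helper names byte_of_word and store_word_le are made up for illustration and are not part of the lecture; the memory array stands in for byte-addressed RAM.

```c
#include <stdint.h>
#include <stddef.h>

/* Return byte number i (0 = least significant) of a 32-bit word. */
static uint8_t byte_of_word(uint32_t word, unsigned i) {
    return (uint8_t)((word >> (8u * i)) & 0xFFu);
}

/* Store a word the little-endian way: byte 0 goes to the lowest address.
   mem is a hypothetical byte-addressed memory, addr the word's address. */
static void store_word_le(uint8_t *mem, size_t addr, uint32_t word) {
    for (unsigned i = 0; i < 4; i++) {
        mem[addr + i] = byte_of_word(word, i);
    }
}
```

Storing 0xFD34AB88 at address 64 with store_word_le places 0x88 at address 64 and 0xFD at address 67, matching the table above.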
Peer Instruction
We want to translate *x = *y into RISC-V
x, y ptrs stored in: x3 x5
1: add x3, x5, zero
2: add x5, x3, zero
3: lw x3, 0(x5)
4: lw x5, 0(x3)
5: lw x8, 0(x5)
6: sw x8, 0(x3)
7: lw x5, 0(x8)
8: sw x3, 0(x8)
A: 1
B: 2
C: 3
D: 4
E: 5→6
F: 6→5
G: 7→8
RISC-V Logical Instructions
• Useful to operate on fields of bits within a word
  − e.g., characters within a word (8 bits)
• Operations to pack/unpack bits into words
• Called logical operations

Logical operations   C operators   Java operators   RISC-V instructions
Bit-by-bit AND       &             &                and
Bit-by-bit OR        |             |                or
Bit-by-bit XOR       ^             ^                xor
Bit-by-bit NOT       ~             ~                xori
Shift left           <<            <<               sll
Shift right          >>            >>               srl
RISC-V Logical Instructions
• Always two variants
  – Register: and x5, x6, x7 # x5 = x6 & x7
  – Immediate: andi x5, x6, 3 # x5 = x6 & 3
• Used for 'masks'
  – andi with 0000 00FFhex isolates the least significant byte
  – and with FF00 0000hex isolates the most significant byte (mask held in a register, since it does not fit andi's 12-bit immediate)
  – andi with 0000 0008hex isolates the 4th bit (0000 1000two)
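The same masks can be tried out in C, where & plays the role of and/andi. This is a minimal sketch; the function names are invented for illustration.

```c
#include <stdint.h>

/* Isolate the least significant byte (mask 0000 00FF hex). */
static uint32_t low_byte(uint32_t x) { return x & 0x000000FFu; }

/* Isolate the most significant byte, left in its original position. */
static uint32_t high_byte_in_place(uint32_t x) { return x & 0xFF000000u; }

/* Isolate the 4th bit, counting from 1 at the least significant end. */
static uint32_t fourth_bit(uint32_t x) { return x & 0x00000008u; }

/* Unpack the character held in byte i of a word: shift, then mask. */
static char unpack_char(uint32_t word, unsigned i) {
    return (char)((word >> (8u * i)) & 0xFFu);
}
```

For example, unpack_char(0x44434241, 1) shifts byte 1 down to the bottom and masks it, giving the character 'B' (0x42).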
Your Turn. What is in x11?
xor x11, x10, x10
ori x11, x11, 0xFF
andi x11, x11, 0xF0
A: 0x0
B: 0xF
C: 0xF0
D: 0xFF00
E: 0xFFFFFFFF
Logic Shifting
• Shift Left: sll x11, x12, 2 # x11 = x12 << 2
  – Store in x11 the value from x12 shifted 2 bits to the left (they fall off end), inserting 0's on right; << in C.
    Before: 0000 0002hex  0000 0000 0000 0000 0000 0000 0000 0010two
    After:  0000 0008hex  0000 0000 0000 0000 0000 0000 0000 1000two
  – What arithmetic effect does shift left have? Multiply by 2^n
• Shift Right: srl is opposite shift; >>
Arithmetic Shifting
• Shift right arithmetic moves n bits to the right (insert high order sign bit into empty bits)
• For example, if register x10 contained
  1111 1111 1111 1111 1111 1111 1110 0111two = -25ten
• If executed sra x10, x10, 4, result is:
  1111 1111 1111 1111 1111 1111 1111 1110two = -2ten
• Unfortunately, this is NOT same as dividing by 2^n
  − Fails for odd negative numbers
  − C arithmetic semantics is that division should round towards 0
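A hedged C sketch of the three shifts follows. Right-shifting a negative signed value is implementation-defined in C, so sra is modelled here with 64-bit division rounded down rather than with >>; that modelling choice and the function names are assumptions of this sketch, not something from the slides.

```c
#include <stdint.h>

/* sll: shift left, zeros enter on the right (n must be 0..31). */
static uint32_t sll(uint32_t x, unsigned n) { return x << n; }

/* srl: shift right logical, zeros enter on the left (n must be 0..31). */
static uint32_t srl(uint32_t x, unsigned n) { return x >> n; }

/* sra: copying the sign bit into the empty bits gives x / 2^n rounded
   toward minus infinity. Computed with division so the result does not
   depend on how a compiler shifts negative numbers (n must be 0..31). */
static int32_t sra(int32_t x, unsigned n) {
    int64_t d = (int64_t)1 << n;
    int64_t q = x / d;                 /* C division rounds toward 0 */
    if (x < 0 && x % d != 0) q -= 1;   /* step down to the floor */
    return (int32_t)q;
}
```

With these definitions sra(-25, 4) is -2, while the C expression -25 / 16 is -1, which is the mismatch the slide points out.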
Your Turn. What is in x12?
x10 holds 0x34FF
slli x12, x10, 0x10
srli x12, x12, 0x08
and x12, x12, x10
A: 0x0
B: 0x3400
C: 0x4F0
D: 0xFF00
E: 0x34FF
Helpful RISC-V Assembler Features
• Symbolic register names
  – E.g., a0-a7 for argument registers (x10-x17)
  – E.g., zero for x0
  – E.g., t0-t6 (temporary), s0-s11 (saved)
• Pseudo-instructions
  – Shorthand syntax for common assembly idioms
  – E.g., mv rd, rs = addi rd, rs, 0
  – E.g., li rd, 13 = addi rd, x0, 13
Computer Decision Making
• Based on computation, do something different
• In programming languages: if-statement
• RISC-V: if-statement instruction is
    beq register1, register2, L1
  means: go to statement labeled L1 if (value in register1) == (value in register2)
  … otherwise, go to next statement
• beq stands for branch if equal
• Other instruction: bne for branch if not equal
Types of Branches
• Branch – change of control flow
• Conditional Branch – change control flow depending on outcome of comparison
  – branch if equal (beq) or branch if not equal (bne)
  – Also branch if less than (blt) and branch if greater than or equal (bge)
• Unconditional Branch – always branch
  – a RISC-V instruction for this: jump (j), as in j label
Label
• Holds the address of data or instructions
  – Think: "constant pointer"
  – Will be replaced by the actual address (number) during assembly (or linking)
• Also available in C for "goto"
• NEVER use goto!!!! Very bad programming style!
Example if Statement
• Assuming translations below, compile if block
  f → x10  g → x11  h → x12
  i → x13  j → x14

  if (i == j)            bne x13, x14, Exit
    f = g + h;           add x10, x11, x12
                   Exit:
• May need to negate branch condition
Example if-else Statement
• Assuming translations below, compile
  f → x10  g → x11  h → x12
  i → x13  j → x14

  if (i == j)            bne x13, x14, Else
    f = g + h;           add x10, x11, x12
  else                   j Exit
    f = g - h;     Else: sub x10, x11, x12
                   Exit:
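To see why the branch condition is negated, the if-else can be rewritten in C in the same branch-and-label shape as the assembly. goto appears here only to mirror the instructions one for one; the function names are invented for this sketch.

```c
/* Structured version, as written on the slide. */
static int if_else_structured(int g, int h, int i, int j) {
    int f;
    if (i == j) f = g + h;
    else        f = g - h;
    return f;
}

/* Same logic in branch-and-label form, one statement per instruction. */
static int if_else_branches(int g, int h, int i, int j) {
    int f;
    if (i != j) goto Else;   /* bne x13, x14, Else */
    f = g + h;               /* add x10, x11, x12  */
    goto Exit;               /* j Exit             */
Else:
    f = g - h;               /* sub x10, x11, x12  */
Exit:
    return f;
}
```

Both functions return the same value for any inputs; the test i != j skips the "then" part, which is why the assembly uses bne for a C condition written with ==.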
Magnitude Compares in RISC-V
• Until now, we've only tested equalities (== and != in C); general programs need to test < and > as well.
• RISC-V magnitude-compare branches:
  "Branch on Less Than"
  Syntax: blt reg1, reg2, label
  Meaning: if (reg1 < reg2) // treat registers as signed integers
             goto label;
• "Branch on Less Than Unsigned"
  Syntax: bltu reg1, reg2, label
  Meaning: if (reg1 < reg2) // treat registers as unsigned integers
             goto label;
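The difference between blt and bltu only shows up when the top bit of a register is set. A small C sketch, with helper names that are assumptions of this example, makes that concrete:

```c
#include <stdint.h>
#include <stdbool.h>

/* Interpret 32 register bits as a two's-complement signed value. */
static int64_t as_signed(uint32_t bits) {
    return (bits & 0x80000000u) ? (int64_t)bits - 4294967296LL
                                : (int64_t)bits;
}

/* Would blt r1, r2, label take the branch? (signed compare) */
static bool blt_taken(uint32_t r1, uint32_t r2) {
    return as_signed(r1) < as_signed(r2);
}

/* Would bltu r1, r2, label take the branch? (unsigned compare) */
static bool bltu_taken(uint32_t r1, uint32_t r2) {
    return r1 < r2;
}
```

With r1 = 0xFFFFFFFF and r2 = 1, blt takes the branch (the bits mean -1) while bltu does not (the bits mean 4,294,967,295).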
C Loop Mapped to RISC-V Assembly
int A[20];
int sum = 0;
for (int i=0; i < 20; i++)
    sum += A[i];

      add  x9, x8, x0      # x9=&A[0]
      add  x10, x0, x0     # sum=0
      add  x11, x0, x0     # i=0
      addi x13, x0, 20     # x13=20
Loop: bge  x11, x13, Done
      lw   x12, 0(x9)      # x12=A[i]
      add  x10, x10, x12   # sum+=
      addi x9, x9, 4       # &A[i+1]
      addi x11, x11, 1     # i++
      j    Loop
Done:
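The assembly walks a pointer through the array instead of indexing it. A C sketch in the same shape, with one statement per instruction, might look as follows; sum20 is a name chosen for this example, and goto is used only to mirror the branches.

```c
/* Sum 20 ints by walking a pointer, as the assembly does with x9. */
static int sum20(const int *A) {
    const int *p = A;      /* x9  = &A[0] */
    int sum = 0;           /* x10 = 0     */
    int i = 0;             /* x11 = 0     */
    const int n = 20;      /* x13 = 20    */
Loop:
    if (i >= n) goto Done; /* bge x11, x13, Done */
    sum += *p;             /* lw x12, 0(x9); add x10, x10, x12 */
    p += 1;                /* addi x9, x9, 4 (one int is 4 bytes on RV32) */
    i += 1;                /* addi x11, x11, 1 */
    goto Loop;             /* j Loop */
Done:
    return sum;
}
```

Note that p += 1 in C advances the address by sizeof(int) bytes, which is the 4 that the assembly adds explicitly.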
Administrivia
• HW2 Autolab is online – due in one week! March 14
  – START latest today!
  – Go to OH if you have problems – don't ask your fellow students
  – Use piazza frequently.
• HW3 and Project 1.1 will be published this weekend!
• Midterm I is in one month during lecture hours… (April 3rd)
Schedule
• 2 weeks for HW and Projects; start early!
Procedures in RISC-V
How Program is Stored
[Figure: Memory (Bytes) holding Program and Data]
One RISC-V Instruction = 32 bits
Assembler to Machine Code (more later in course)
Assembler source files (text):     foo.S          bar.S
                                   ↓ Assembler    ↓ Assembler
Machine code object files:         foo.o          bar.o
Pre-built object file libraries:   lib.o
                                   ↓ Linker
Machine code executable file:      a.out
Assembler converts human-readable assembly code to instruction bit patterns
Executing a Program
[Figure: Processor (Control; Datapath with PC, Registers, Arithmetic & Logic Unit (ALU)) sends the Instruction Address to Memory (Bytes; Program, Data) and reads the Instruction Bits back.]
• The PC (program counter) is an internal register inside the processor holding the byte address of the next instruction to be executed.
• The instruction is fetched from memory, then the control unit executes the instruction using the datapath and memory system, and updates the program counter (default is add +4 bytes to PC, to move to the next sequential instruction)
C Functions
main() {
  int i,j,k,m;
  ...
  i = mult(j,k); ...
  m = mult(i,i); ...
}

/* really dumb mult function */
int mult (int mcand, int mlier){
  int product = 0;
  while (mlier > 0) {
    product = product + mcand;
    mlier = mlier -1; }
  return product;
}

What information must compiler/programmer keep track of?
What instructions can accomplish this?
Six Fundamental Steps in Calling a Function
1. Put parameters in a place where function can access them
2. Transfer control to function
3. Acquire (local) storage resources needed for function
4. Perform desired task of the function
5. Put result value in a place where calling code can access it and restore any registers you used
6. Return control to point of origin, since a function can be called from several points in a program
RISC-V Function Call Conventions
• Registers faster than memory, so use them
• a0–a7 (x10-x17): eight argument registers to pass parameters and return values (a0-a1)
• ra: one return address register to return to the point of origin (x1)
• Also s0-s1 (x8-x9) and s2-s11 (x18-x27): saved registers (more about those later)
Instruction Support for Functions (1/4)
C:
  ... sum(a,b);... /* a, b: s0, s1 */
  }
  int sum(int x, int y) {
    return x+y;
  }
RISC-V:
  address (shown in decimal)
  1000
  1004
  1008
  1012
  1016
  …
  2000
  2004
In RISC-V, all instructions are 4 bytes, and stored in memory just like data. So here we show the addresses of where the programs are stored.
Instruction Support for Functions (2/4)
C:
  ... sum(a,b);... /* a, b: s0, s1 */
  }
  int sum(int x, int y) {
    return x+y;
  }
RISC-V:
  address (shown in decimal)
  1000 add a0, s0, x0        # x = a
  1004 mv a1, s1             # y = b
  1008 addi ra, zero, 1016   # ra=1016
  1012 j sum                 # jump to sum
  1016 …                     # next instruction
  …
  2000 sum: add a0, a0, a1
  2004 jr ra                 # new instr. "jump register"
Instruction Support for Functions (3/4)
C:
  ... sum(a,b);... /* a, b: s0, s1 */
  }
  int sum(int x, int y) {
    return x+y;
  }
RISC-V:
  2000 sum: add a0, a0, a1
  2004 jr ra                 # new instr. "jump register"
• Question: Why use jr here? Why not use j?
• Answer: sum might be called from many places, so we can't return to a fixed place. The calling procedure must be able to say "return here" somehow.
Instruction Support for Functions (4/4)
• Single instruction to jump and save return address: jump and link (jal)
• Before:
  1008 addi ra, zero, 1016   # ra=1016
  1012 j sum                 # goto sum
• After:
  1008 jal sum               # ra=1012, goto sum
• Why have a jal?
  – Make the common case fast: function calls very common.
  – Reduce program size
  – Don't have to know where code is in memory with jal!
RISC-V Function Call Instructions
• Invoke function: jump and link instruction (jal)
  (really should be laj "link and jump")
  – "link" means form an address or link that points to calling site to allow function to return to proper address
  – Jumps to address and simultaneously saves the address of the following instruction in register ra (x1)
      jal FunctionLabel
• Return from function: jump register instruction (jr)
  – Unconditional jump to address specified in register
      jr ra
  – Assembler shorthand: ret = jr ra
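The effect of jal and jr ra on the PC and ra can be modelled in a few lines of C. The Cpu struct and function names below are a hypothetical toy model for illustration, not how any real simulator is organised.

```c
#include <stdint.h>

/* Toy model of the two registers involved in a call. */
typedef struct {
    uint32_t pc;  /* byte address of the instruction being executed */
    uint32_t ra;  /* return address register (x1) */
} Cpu;

/* jal target: save the address of the following instruction, then jump. */
static void jal(Cpu *cpu, uint32_t target) {
    cpu->ra = cpu->pc + 4;   /* every RV32 instruction here is 4 bytes */
    cpu->pc = target;
}

/* jr ra: jump to the address held in ra. */
static void jr_ra(Cpu *cpu) {
    cpu->pc = cpu->ra;
}
```

Starting with pc = 1008, jal to address 2000 leaves ra = 1012 and pc = 2000; a later jr ra sets pc back to 1012, the instruction after the call.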
Notes on Functions
• Calling program (caller) puts parameters into registers a0-a7 and uses jal X to invoke (callee) at address labeled X
• Must have register in computer with address of currently executing instruction
  – Instead of Instruction Address Register (better name), historically called Program Counter (PC)
  – It's a program's counter; it doesn't count programs!
• What value does jal X place into ra?
• jr ra puts address inside ra back into PC
Where Are Old Register Values Saved to Restore Them After Function Call?
• Need a place to save old values before calling a function, restore them on return, and then delete them
• Ideal is stack: last-in-first-out queue (e.g., stack of plates)
  – Push: placing data onto stack
  – Pop: removing data from stack
• Stack in memory, so need register to point to it
• sp is the stack pointer in RISC-V (x2)
• Convention is grow from high to low addresses
  – Push decrements sp, Pop increments sp
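A downward-growing stack can be sketched in C with an array and a byte offset standing in for sp. The Stack type, its 64-byte size, and the function names are assumptions made for this example; there is no overflow or underflow checking.

```c
#include <stdint.h>

#define STACK_BYTES 64u   /* hypothetical, tiny stack for illustration */

typedef struct {
    uint32_t words[STACK_BYTES / 4];
    uint32_t sp;          /* byte offset into words[]; STACK_BYTES = empty */
} Stack;

static void stack_init(Stack *s) { s->sp = STACK_BYTES; }

/* Push: decrement sp, then store at the new sp. */
static void push(Stack *s, uint32_t value) {
    s->sp -= 4;
    s->words[s->sp / 4] = value;
}

/* Pop: load from sp, then increment sp. */
static uint32_t pop(Stack *s) {
    uint32_t value = s->words[s->sp / 4];
    s->sp += 4;
    return value;
}
```

Pushing 11 and then 22 moves sp from 64 down to 56; popping twice returns 22 and then 11 and leaves sp back at 64.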
Stack
• Stack frame includes:
  • Return "instruction" address
  • Parameters
  • Space for other local variables
• Stack frames are contiguous blocks of memory; stack pointer tells where bottom of stack frame is
• When procedure ends, stack frame is tossed off the stack; frees memory for future stack frames
[Figure: stack of four frames starting at 0xBFFFFFF0, with sp marking the bottom frame]
Example
int Leaf(int g, int h, int i, int j)
{
  int f;
  f = (g + h) - (i + j);
  return f;
}
• Parameter variables g, h, i, and j in argument registers a0, a1, a2, and a3, and f in s0
• Assume need one temporary register s1
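Stack Before, During, After Function
• Need to save old values of s0 and s1
[Figure: three stack snapshots. Before call: sp at its starting position. During call: Saved s1 and Saved s0 stored below the old sp, with sp pointing at Saved s0. After call: sp back at its starting position, with the old Saved s1 and Saved s0 values still in memory below it.]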
RISC-V Code for Leaf()
Leaf: addi sp, sp, -8    # adjust stack for 2 items
      sw s1, 4(sp)       # save s1 for use afterwards
      sw s0, 0(sp)       # save s0 for use afterwards

      add s0, a0, a1     # f = g + h
      add s1, a2, a3     # s1 = i + j
      sub a0, s0, s1     # return value (g + h) - (i + j)

      lw s0, 0(sp)       # restore register s0 for caller
      lw s1, 4(sp)       # restore register s1 for caller
      addi sp, sp, 8     # adjust stack to delete 2 items
      jr ra              # jump back to calling routine
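To check that Leaf leaves s0, s1, and sp as it found them, the code can be traced in C with a toy machine state. The Machine struct and the 16-word stack are assumptions of this sketch; each statement corresponds to one instruction above.

```c
#include <stdint.h>

/* Hypothetical machine state: only what Leaf touches. */
typedef struct {
    int32_t a0, a1, a2, a3;  /* argument registers */
    int32_t s0, s1;          /* saved registers */
    uint32_t sp;             /* stack pointer as a byte offset into stack[] */
    int32_t stack[16];       /* small stack; grows toward index 0 */
} Machine;

static void leaf(Machine *m) {
    m->sp -= 8;                          /* addi sp, sp, -8 */
    m->stack[(m->sp + 4) / 4] = m->s1;   /* sw s1, 4(sp)    */
    m->stack[m->sp / 4] = m->s0;         /* sw s0, 0(sp)    */

    m->s0 = m->a0 + m->a1;               /* add s0, a0, a1  */
    m->s1 = m->a2 + m->a3;               /* add s1, a2, a3  */
    m->a0 = m->s0 - m->s1;               /* sub a0, s0, s1  */

    m->s0 = m->stack[m->sp / 4];         /* lw s0, 0(sp)    */
    m->s1 = m->stack[(m->sp + 4) / 4];   /* lw s1, 4(sp)    */
    m->sp += 8;                          /* addi sp, sp, 8  */
}
```

Calling leaf with a0..a3 = 10, 20, 3, 4 and arbitrary values in s0 and s1 leaves 23 in a0, while s0, s1, and sp hold the same values as before the call.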
Nested Procedures (1/2)
int sumSquare(int x, int y) {
  return mult(x,x)+ y;
}
• Something called sumSquare, now sumSquare is calling mult
• So there's a value in ra that sumSquare wants to jump back to, but this will be overwritten by the call to mult
Need to save sumSquare return address before call to mult
Nested Procedures (2/2)
• In general, may need to save some other info in addition to ra.
• When a C program is run, there are 3 important memory areas allocated:
  – Static: Variables declared once per program, cease to exist only after execution completes, e.g., C globals
  – Heap: Variables declared dynamically via malloc
  – Stack: Space to be used by procedure during execution; this is where we can save register values
Register Conventions (1/2)
• CalleR: the calling function
• CalleE: the function being called
• When callee returns from executing, the caller needs to know which registers may have changed and which are guaranteed to be unchanged.
• Register Conventions: A set of generally accepted rules as to which registers will be unchanged after a procedure call (jal) and which may be changed.
Register Conventions (2/2)
To reduce expensive loads and stores from spilling and restoring registers, RISC-V function-calling convention divides registers into two categories:
1. Preserved across function call
  – Caller can rely on values being unchanged
  – sp, gp, tp, "saved registers" s0-s11 (s0 is also fp)
2. Not preserved across function call
  – Caller cannot rely on values being unchanged
  – Argument/return registers a0-a7, ra, "temporary registers" t0-t6
RISC-V Symbolic Register Names
[Table: register numbers (what the hardware understands) alongside the human-friendly symbolic names used in assembly code]
Question
• Which statement is FALSE?
A: RISC-V uses jal to invoke a function and jr to return from a function
B: jal saves PC+1 in ra
C: The callee can use temporary registers (ti) without saving and restoring them
D: The caller can rely on save registers (si) without fear of callee changing them
Allocating Space on Stack
• C has two storage classes: automatic and static
  – Automatic variables are local to function and discarded when function exits
  – Static variables exist across exits from and entries to procedures
• Use stack for automatic (local) variables that don't fit in registers
• Procedure frame or activation record: segment of stack with saved registers and local variables
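Stack Before, During, After Function
[Figure: three stack snapshots (Before call, During call, After call). During the call, the frame below the old sp holds, from top to bottom: saved argument registers (if any), saved return address (if needed), saved saved registers (if any), local variables (if any); sp points to the bottom of the frame. After the call, sp is back at its original position.]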
Using the Stack (1/2)
• We have a register sp which always points to the last used space in the stack.
• To use stack, we decrement this pointer by the amount of space we need and then fill it with info.
• So, how do we compile this?
  int sumSquare(int x, int y) {
    return mult(x,x)+ y;
  }
Using the Stack (2/2)
int sumSquare(int x, int y) {
  return mult(x,x)+ y; }

sumSquare:
      # "push"
      addi sp, sp, -8    # space on stack
      sw ra, 4(sp)       # save ret addr
      sw a1, 0(sp)       # save y
      mv a1, a0          # mult(x,x)
      jal mult           # call mult
      # "pop"
      lw a1, 0(sp)       # restore y
      add a0, a0, a1     # mult()+y
      lw ra, 4(sp)       # get ret addr
      addi sp, sp, 8     # restore stack
      jr ra
mult: ...
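For reference, here is the C side of this example in a form that can be compiled and run, with comments tying each step to the assembly. The local variable saved_y is introduced for illustration; mult is the "really dumb" version from the earlier slide and only handles a non-negative multiplier.

```c
/* really dumb mult function, as on the earlier slide (assumes mlier >= 0) */
static int mult(int mcand, int mlier) {
    int product = 0;
    while (mlier > 0) {
        product = product + mcand;
        mlier = mlier - 1;
    }
    return product;
}

static int sumSquare(int x, int y) {
    int saved_y = y;          /* sw a1, 0(sp): y must survive the call   */
    int square = mult(x, x);  /* mv a1, a0; jal mult (a1 is overwritten) */
    return square + saved_y;  /* lw a1, 0(sp); add a0, a0, a1            */
}
```

A C compiler does the saving automatically; in assembly, y has to be stored on the stack by hand because a1 is reused to pass the second argument to mult, and ra has to be stored because jal overwrites it.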
Basic Structure of a Function
Prologue
entry_label:
      addi sp, sp, -framesize
      sw ra, framesize-4(sp)   # save ra
      save other regs if need be

Body (call other functions…)
      ...

Epilogue
      restore other regs if need be
      lw ra, framesize-4(sp)   # restore ra
      addi sp, sp, framesize
      jr ra
[Figure: ra saved in the frame in memory]
Where is the Stack in Memory?
• RV32 convention (RV64 and RV128 have different memory layouts)
• Stack starts in high memory and grows down
  – Hexadecimal: bfff_fff0hex
  – Stack must be aligned on 16-byte boundary (not true in examples above)
• RV32 programs (text segment) in low end
  – 0001_0000hex
• Static data segment (constants and other static variables) above text
  – RISC-V convention: global pointer (gp) points to static
  – RV32 gp = 1000_0000hex
• Heap above static for data structures that grow and shrink; grows up to high addresses
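The 16-byte alignment rule means a frame size should be rounded up to a multiple of 16 before it is subtracted from sp. One way to compute that, sketched in C with a function name chosen for this example:

```c
#include <stdint.h>

/* Round a frame size in bytes up to the next multiple of 16. */
static uint32_t aligned_framesize(uint32_t bytes_needed) {
    return (bytes_needed + 15u) & ~15u;
}
```

Under this rule the 8-byte frames used in the earlier examples would become 16-byte frames, and the starting address bfff_fff0hex is itself a multiple of 16.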
RV32 Memory Allocation
"And in Conclusion…"
• Registers we know so far (All of them!)
  – a0-a7 for function arguments, a0-a1 for return values
  – sp, stack pointer, ra return address
  – s0-s11 saved registers
  – t0-t6 temporaries
  – zero
• Instructions we know:
  – Arithmetic: add, addi, sub
  – Logical: sll, srl, sra, slli, srli, srai, and, or, xor, andi, ori, xori
  – Decision: beq, bne, blt, bge
  – Unconditional branches (jumps): j, jr
  – Functions called with jal, return with jr ra.
• The stack is your friend: Use it to save anything you need. Just leave it the way you found it!