View
3
Download
0
Category
Preview:
Citation preview
AMemoryConsistencyModelForRISC-VFormallyEvaluatedwithTriCheck
CarolineTrippelPrincetonUniversityNovember29,2016
CarolineTrippel,Yatin Manerkar,DanielLustig,MichaelPellauer,andMargaretMartonosi.“TriCheck:MemoryModelVerificationattheTrisectionSoftware,Hardware,andISA”.In ProceedingsoftheTwenty-SecondInternationalConferenceonArchitecturalSupportforProgrammingLanguagesandOperatingSystems (ASPLOS'17).
RoleoftheInstructionSetArchitecture(ISA)
Software/HLL
HardwareISA
Weak PPO(e.g.,ARM,POWER)
More orderingprimitives(e.g.,fences/barriers)insertedbycompiler
• Introducedin1964byIBM• 1setofsoftware• >1hardwareimplementations
• Definitivespec.ofhardwareasseenbysoftware:
• Specificationofwhathardwaremustimplement
• Targetforcompilertranslation
Software/HLL
Hardware
ISA
Strong PPO(e.g.,SC,TSO)
Fewer orderingprimitives(e.g.,fences/barriers)insertedbycompiler
OurWork:MemoryConsistencyModelVerification
Software/HLLMemoryModel
ISAMemoryModel
HardwareMemoryModel
Compilation
MicroarchitecturalImplementation PipeCheck [Lustig etal.MICRO-47]CCICheck [Manerkar etal.MICRO-48]
COATCheck [Lustig etal.ASPLOS‘16]
ArMOR [Lustig etal.ISCA‘15]
OperatingSystem
TriCheck [Trippeletal.ASPLOS‘17]
MemoryModelsBugsObservedinPractice
ARMRead-after-ReadHazard[Alglave etal.TOPLAS‘14]• AmbiguousISAspec.regardingsame-addressLdàLd ordering
• ARMcompilersdidnotinsertsynchronizationprimitives(e.g.,fences/barriers)• SomeARMimplementationsrelaxedsame-addressLdàLd ordering(e.g.,Cortex-A9,Snapdragon805)
• C/C++atomicsrequiresame-addressLdàLd ordering• ARMissuederrata1:Rewritecompilerstoinsertfences(withperformancepenalties)
We’veidentifiedandcharacterizedflawsinthecurrentRISC-Vmemorymodel(i.e.,thememorymodeldefinedinthecurrentmanual)[Trippeletal.ASPLOS‘17]
1ARM.Cortex-A9MPCore,programmeradvicenotice,read-after-readhazards.ARMReference761319.,2011.http://infocenter.arm.com/help/topic/com.arm.doc.uan0004a/UAN0004A_a9_read_read.pdf.
Notethatthemodificationstofixtheseissueswillbemostlycompatiblewithcurrentimplementations.
Outline
• RoleofMemoryModelsinISAs• WhatShouldWeRequireFromtheHardware?• WhatFences/BarriersDoWeNeedtoSupportC/C++?• TriCheck FrameworkforFull-StackMemoryModelVerification• On-GoingWork&Conclusions
SequentialConsistency
• Memory models specify the allowed behavior of a multithreaded program executing with shared memory
• First defined by [Lamport 1979], execution is the same as if:(R1) Memory ops of each processor appear in program order(R2) Memory ops of all processors were executed in some global sequential order
Thread 0x=1y=1
Thread 1r1=yr2=x
x=1y=1r1=yr2=x
x=1r1=yy=1r2=x
x=1r1=yr2=xy=1
r1=yr2=xx=1y=1
r1=yx=1r2=xy=1
r1=yx=1y=1r2=x
Program Legal Executions
TwoCategoriesofMemoryModelRelaxation
PreservedProgramOrder:DefinesprogramorderingsthathardwaremustpreservebydefaultStoreAtomicity:Definesorderinwhichstoresbecomevisibletocores• Multiple-copyatomic:
• Allcoresseestoresimultaneously• Read-Own-Write-Early-multiple-copyatomic:
• Storingcorecanreaditsownstorebeforeothercores• Storesmadevisibletoallremotecoressimultaneously
• Non-multiple-copyatomic:• Storingcorecanreaditsownstorebeforeothercores• Storeismadevisibletosomeremotecoresbeforeothers
E.g.,monolithicmemory
E.g.,privatestorebuffer
E.g.,sharedstorebuffer
RISC-VProposedPreservedProgramOrderandStoreAtomicityPreservedProgramOrder:
StoreAtomicity:Non-multiple-copyatomic:
• Storingcorecanreaditsownstorebeforeothercores• Storeismadevisibletosomeremotecoresbeforeothers
EffectsofNon-Multiple-CopyAtomicStores
Initial conditions: x=0, y=0 T0 T1 T2 T3
st [x] ç 1 st [y] ç 1 ld x à [r0] ld y à [r2]F R, R F R, R
ld y à [r1] ld x à [r3]Non-SC Outcome: r0=1, r1=0, r2=1, r3=0
ThisoutcomecorrespondstothecaseinwhichthestoresonthreadsT0andT1arrivetothreadsT2andT3indifferentorders
L1$
L1$
WhyAllowNon-Multiple-CopyAtomicStores?
• CommercialISAsallownon-multiple-copyatomicstores(e.g.ARM,POWER)
• RISC-VisintendedtobeintegratedwithothervendorISAs• Potentialdeploymentinnon-multiple-copyatomicmemorysystems• Ifsharingmemorysystem,awarenessthatstoresmaybeobservedinordersthatdifferfromothercores
Outline
• RoleofMemoryModelsinISAs• WhatShouldWeRequireFromtheHardware?• WhatFences/BarriersDoWeNeedtoSupportC/C++?• TriCheck FrameworkforFull-StackMemoryModelVerification• On-GoingWork&Conclusions
FencestoRestoreMultiple-CopyAtomicity
Initial conditions: x=0, y=0 T0 T1 T2 T3
st [x] ç 1 st [y] ç 1 ld x à [r0] ld y à [r2]pscF RW, RW pscF RW, RW
ld y à [r1] ld x à [r3]Non-SC Outcome: r0=1, r1=0, r2=1, r3=0
Predecessor-/Successor- CumulativeFence:NecessarytoRestoreSCforNon-Multiple-CopyAtomicMemorySystems
OtherFences/Barriers/OrderingPrimitives
• BaselineMemoryModel• PPOrequiressame-addressR-Rordertobemaintained• PPOrequiresordertobemaintainedbetweenmostdependentinstructions• Predecessor-/Successor-CumulativeFRW,RW;FIO,IO;FIORW,IORW
• Baseline+AtomicsExtension• Predecessor-CumulativeFRW,W
Outline
• RoleofMemoryModelsinISAs• WhatShouldWeRequireFromtheHardware?• WhatFences/BarriersDoWeNeedtoSupportC/C++?• TriCheck FrameworkforFull-StackMemoryModelVerification• On-GoingWork&Conclusions
TriCheck Full-StackVerificationFramework
SuiteofC/C++LitmusTests
SuiteofSmallC/C++Programs
CompilerMappingsfromC/C++toRISC-V
C/C++HerdModel RISC-VCheckModel
ISALevelOutcomeForbiddenC/C++OutcomeForbidden implies
TriCheck comparesHLLoutcomestoISA-leveloutcomes
foraspectrumoflegalISAmicroarchitectures.
TriCheck Full-StackVerificationFramework
SuiteofC/C++LitmusTests
SuiteofSmallC/C++Programs
CompilerMappingsfromC/C++toRISC-V
C/C++HerdModel RISC-VCheckModel
ISALevelOutcomeForbiddenC/C++OutcomeForbidden implies
ISADOESNOTALLOWoutcomesprohibitedbytheISA
TriCheck Full-StackVerificationFramework
SuiteofC/C++LitmusTests
SuiteofSmallC/C++Programs
CompilerMappingsfromC/C++toRISC-V
C/C++HerdModel RISC-VCheckModel
ISALevelOutcomeForbiddenC/C++OutcomeForbidden implies
ISAALLOWSoutcomesprohibitedbytheISA
RISC-VBase:LackofCumulativeFences
050100150200250
WR
rWR
rWM
rMM
nWR
nMM
A9like
WR
rWR
rWM
rMM
nWR
nMM
A9like
riscv-curr riscv-ours
wrc
RISC-VBaseline(Base)
TestVa
riatio
nsBugs OverlyStrict Equivalent
μSpec Model:
Variation:
Litmustest:
ISA:
• C/C++acquire/releasesynchronizationistransitive:• Accessesbeforeareleasewriteinprogramorder,andobservedbythe
releasingcorepriortothereleasewrite mustbeorderedbeforethereleasefromtheviewpointofanacquirereadthatreadsfromthereleasewrite
• BaseRISC-VISAlackscumulativefences• Minimally,theISArequiresaPredecessor-/SuccessorCumulativeFRW,RW• Cannotfixbugsby modifyingcompilercurrently
OurcurrentRISC-VproposalrequiresonlyaP-/S-CumulativeFRW,RWintheRISC-VBaseISA,andincludesaweakerP-CumulativeFRW,WFenceintheBase+Atomics extension.
Outline
• RoleofMemoryModelsinISAs• WhatShouldWeRequireFromtheHardware?• WhatFences/BarriersDoWeNeedtoSupportC/C++?• TriCheck FrameworkforFull-StackMemoryModelVerification• On-GoingWork&Conclusions
On-GoingWork&Conclusions
• WehaveformulatedanEnglishlanguagediff.ofthecurrentspec.withourproposedchanges
• CurrentlyweareconstructingaformalmodelinHerd[Alglave etal.,TOPLAS‘14]ofourproposedmemorymodelmodifications
• Memorymodeldesignchoicesarecomplicatedandinvolvereasoningaboutthesubtleinterplaybetweenmanydiversefeatures
• DefininganISAspecificationinlightoftheevaluationofasinglemicroarchitectureisnotsufficient
• TriCheck isgeneralizabletoanyISA anduncovered/quantifiedflawsintheRISC-Vmemorymode.
ctrippel@princeton.eduhttp://check.cs.princeton.edu/
21
RISC-VBase+A:LackofTransitiveReleases
• C/C++acquire/releasesynchronizationistransitive:• Accessesbeforeareleasewriteinprogramorder,andobservedbythe
releasingcorepriortothereleasewrite mustbeorderedbeforethereleasefromtheviewpointofanacquirereadthatreadsfromthereleasewrite
0
50
100
150
200
250
WR
rWR
rWM
rMM
nWR
nMM
A9like
WR
rWR
rWM
rMM
nWR
nMM
A9like
riscv-curr riscv-ours
wrc
RISC-VBaseline+Atomics(Base+A)
TestVa
riatio
nsBugs OverlyStrict Equivalent
μSpec Model:
Variation:
Litmustest:
ISA:
• Base+A RISC-VISAlackstransitivereleases• i.e.,RISC-VacquiresdonotsynchronizewithRISC-VreleasesasrequiredbyC/C++• AMO.rl andstrongerAMO.aq.rl arebothinsufficeint• Cannotfixbugsby modifyingcompiler
• Oursolution: redefinereleaseoperationsintheBase+A RISC-VISAtobetransitive
0153045607590
WR
rWR
rWM
rMM
nWR
nMM
A9like
WR
rWR
rWM
rMM
nWR
nMM
A9like
WR
rWR
rWM
rMM
nWR
nMM
A9like
WR
rWR
rWM
rMM
nWR
nMM
A9like
riscv-curr riscv-ours riscv-curr riscv-ours
mp sb
RISC-VBaseline+Atomics(Base+A)
TestVariations
Bugs OverlyStrict Equivalent
μSpec Model:
Variation:
Litmustest:
ISA:
RISC-VBase+A:NoRoach-MotelMovementforSCAtomics• Roach-motelmovement=expansionofacquire-release
criticalsection• C++SCloadhaveC++Acquiresemantics• C++SCstoreshaveC++Releasesemantics
• RISC-VSCloadsandstoresrequirebothaq andrl bitssetonAMOs• Operationhasacquireandreleasesemantics• Prohibitsroach-motelmovement
• Oursolution: addansc bitforimplementingAMO.aq.sc andAMO.rl.sc instructionswhicharecapableofimplementingC/C++SCloadsandstores
RISC-VBase:SameAddressLdàLd Re-Ordering
• C/C++forbidssame-addressLdàLd reordering• BugsalwayswhenC/C++loadsaremappedtoregularRISC-Vloads
0153045607590
WR rWRrWMrMMnWRnMM A9 WR rWRrWMrMMnWRnMM A9
riscv-curr riscv-ours
corr
RISC-VBaseline(Base)
TestVa
riatio
nsBugs OverlyStrict Equivalent
μSpec Model:
Variation:
Litmustest:
ISA:
Initial conditions: x=0, y=0
T0 T1
a: sw x1, (x5) c: lw x3, (x5)
b: sw x2, (x5) d: lw x4, (x5)
Forbidden HLL Outcome: x1=1, x2=2, x3=2, x4=1• BaseRISC-VISAincludesFR,R• Possibletofixbugsbymodifyingcompilerwithpotentialperformancepenalty• 20.3%preliminaryestimateoffenceinsertionperformancepenaltyforARM
• Oursolution: modifyBaseRISC-Vmemorymodeltorequiresame-addressLdàLd ordering
OurcurrentRISC-Vproposalelimites FR,RfromtheRISC-VBaseISA,andrequireshardwaretoenforcesame-addressLdàLdorderbydefault.
Re-orderingDependentOperations• RISC-Vdoesnotrequireorderingfordependentinstructions• ManycommercialISAs– x86,ARM,Power– respectdependencies
• Canalsobeusedaslightweightsynchronization
• Explicitsynchronization/fencesneededwhendependencyorderingisrequiredbutnotenforcedbydefault,e.g.,Linux
• Macroread_barrier_depends()optionallyinsertsabarrier• InsertsafenceforAlpha,whichdoesnotrespectdependencies1
• InsertsnothingforRISC-V,whichdoesnotrespectdependencies2
• Oursolution:modifyBaseRISC-Vmemorymodeltorequirethepreservationofdependencyorderings.
1LinusTorvaldsetal.Linuxkernel,2016.https://github.com/torvalds/linux/blob/master/arch/alpha/include/asm/barrier.h2RISC-VFoundation.RISC-VportofLinuxkernel,2016.https://github.com/riscv/riscv-linux/blob/master/rch/riscv/include/asm/barrier.h
ctrippel@princeton.eduhttp://check.cs.princeton.edu/
26
Recommended