CS152, Spring 2016
CS152 Computer Architecture and Engineering
Lecture 1 - Introduction
Dr. George Michelogiannakis
EECS, University of California at Berkeley
CRD, Lawrence Berkeley National Laboratory
http://inst.eecs.berkeley.edu/~cs152

Pronunciation
Miheloyannakis (optional)

What is Computer Architecture?
Application to Physics: gap too large to bridge in one step (but there are exceptions, e.g. magnetic compass)
In its broadest definition, computer architecture is the design of the abstraction layers that allow us to implement information processing applications efficiently using available manufacturing technologies.

What is Computer Architecture?
• A set of rules and methods that describe the functionality, organization and implementation of computer systems.
• Computer architecture is the science and art of selecting and interconnecting hardware components to create computers that meet functional, performance and cost goals.
• Computer architecture acts as the intermediary between programmers and devices (e.g., VLSI).
• What are you here to learn?

Abstraction Layers in Modern Systems
Application
Algorithm
Programming Language
Operating System/Virtual Machines
Instruction Set Architecture (ISA)
Microarchitecture
Gates/Register-Transfer Level (RTL)
Circuits
Devices
Physics
UCB EECS Courses: EE141, CS150, CS162, CS170, CS164, EE143, CS152

Architecture Continually Changing
• Applications suggest how to improve technology, provide revenue to fund development
• Improved technologies make new applications possible
• Cost of software development makes compatibility a major force in market

Example: x86 Backwards Compatibility
• Intel's 8086 was released in 1978 with ~50 instructions
• Today, x86 has ~650 with all extensions
  – Most are rarely emitted by compilers

Computing Devices Then…
EDSAC, University of Cambridge, UK, 1949

Computing Devices Now
Robots, Supercomputers, Automobiles, Laptops, Set-top boxes, Smart phones, Servers, Media Players, Sensor Nets, Routers, Cameras, Games

Moore's Law
• The observation that, over the history of computing hardware, the number of transistors in a dense integrated circuit (chip) has doubled approximately every two years.

Design Complexity

Design Capacity
• In 1978, Intel could design a chip (8086) with 29,000 transistors
• In 2012, 2,104 million (Ivy Bridge)
• Rocket (RISC-V), which you'll be using, has 75+ million transistors
• Does humanity get smarter with time?
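
As a rough check of the Moore's Law observation against the design-capacity numbers quoted on these slides, the sketch below projects the 8086's 29,000 transistors forward to 2012 with a doubling every two years, and also derives the doubling period that the two data points imply. The function names are illustrative, and the two-year period is the approximate figure from the slide, not an exact law.

```python
import math

def projected_transistors(start_count, start_year, end_year, doubling_years=2.0):
    """Project a transistor count forward assuming a fixed doubling period."""
    doublings = (end_year - start_year) / doubling_years
    return start_count * 2 ** doublings

def implied_doubling_period(start_count, start_year, end_count, end_year):
    """Doubling period (in years) implied by two (year, count) data points."""
    doublings = math.log2(end_count / start_count)
    return (end_year - start_year) / doublings

# Data points quoted on the slides: 8086 (1978) and Ivy Bridge (2012).
projection_2012 = projected_transistors(29_000, 1978, 2012)
period = implied_doubling_period(29_000, 1978, 2_104_000_000, 2012)

print(f"Projected 2012 count at 2-year doubling: {projection_2012:,.0f}")
print(f"Doubling period implied by the slide's numbers: {period:.2f} years")
```

Under these assumptions the projection lands within a factor of two of the quoted Ivy Bridge count, and the implied doubling period comes out slightly above two years.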

Computer Architects Then

Computer Architects Now

Technology Trends

Power Dissipation

Power Wall in Modern Processors
At the same time, chips keep getting larger.
Therefore, not all of the chip can be powered on at the same time.

The End of the Uniprocessor Era
Single biggest change in the history of computing systems

We Went From This
• Cray-1
• Single processor

To This
• Titan, an XK7 supercomputer at Oak Ridge National Laboratory (Cray XT3) (299,008 AMD Opteron cores)

Result: Simple Cores
J. Huh, D.C. Burger, and S.W. Keckler. Exploring the Design Space of Future CMPs. In International Conference on Parallel Architectures and Compilation Techniques (PACT), September 2001.

Result: Simple Cores (continued)

Result: Specialization

Before That: Dennard Scaling
• Power = A × C × F × V^2
  – A: Activity factor
  – C: Capacitance
  – F: Frequency
  – V: Voltage
• Capacitance is related to area
  – So, as the size of the transistors shrank, and the voltage was reduced, circuits could operate at higher frequencies at the same power
• But leakage current and threshold voltage of transistors set a lower bound for voltage
• Transistors get smaller, their power is the same -> Power density increases
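
The power equation and the scaling argument on this slide can be sketched numerically. The example below assumes the idealized textbook scaling rules (shrinking linear dimensions by a factor k gives C/k, F×k, area/k^2, and V/k only while voltage can still be reduced); these rules and all names are assumptions for illustration, and the starting values are normalized rather than real device data.

```python
def dynamic_power(activity, capacitance, frequency, voltage):
    """Dynamic power from the slide: P = A * C * F * V^2."""
    return activity * capacitance * frequency * voltage ** 2

def scale_transistor(capacitance, frequency, voltage, area, k, scale_voltage):
    """Shrink linear dimensions by a factor k (k > 1).

    Assumed idealized rules: C -> C/k, F -> F*k, area -> area/k^2,
    and V -> V/k only when voltage can still be lowered.
    """
    new_voltage = voltage / k if scale_voltage else voltage
    return capacitance / k, frequency * k, new_voltage, area / k ** 2

def power_density(activity, capacitance, frequency, voltage, area):
    """Power per unit area for one transistor."""
    return dynamic_power(activity, capacitance, frequency, voltage) / area

# Normalized starting point (illustrative units only).
A, C, F, V, AREA = 0.5, 1.0, 1.0, 1.0, 1.0
base = power_density(A, C, F, V, AREA)

# Dennard scaling: voltage drops along with feature size.
dennard = power_density(A, *scale_transistor(C, F, V, AREA, k=2, scale_voltage=True))
# Voltage stuck at its lower bound: per-transistor power stays the same.
post = power_density(A, *scale_transistor(C, F, V, AREA, k=2, scale_voltage=False))

print(f"Power density with voltage scaling:    {dennard / base:.1f}x")
print(f"Power density without voltage scaling: {post / base:.1f}x")
```

With voltage scaling the power density is unchanged; once voltage stops scaling, the same shrink quadruples it in this model, which is the effect the last bullet describes.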

A LITTLE HISTORY
Learn from the mistakes of others

Antikythera Mechanism
• Found in a Greek ship believed to have sunk around 80 B.C.
• It accurately predicted lunar and solar eclipses, as well as solar, lunar and planetary positions
  – Size: 8 inches across

Difference Engine
1855. Can compute any 6th-degree polynomial by calculating the difference between 2D matrix elements
Speed: 33 to 44 32-digit numbers per minute!
Now the machine is at the Smithsonian

n        0    1    2    3    4
d2(n)              2    2    2
d1(n)         2    4    6    8
f(n)    41   43   47   53   61
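
The table can be reproduced with the method of finite differences, which needs only repeated addition. The sketch below is a software illustration of that idea, not a model of the actual machine; the function name and the register layout are assumptions. The starting values 41, 2 and 2 are f(0), the first difference and the constant second difference read from the table (the values happen to match n^2 + n + 41).

```python
def tabulate_by_differences(initial_differences, count):
    """Tabulate a polynomial using additions only.

    initial_differences = [f(0), first difference, second difference, ...];
    the last entry is the constant highest-order difference.
    """
    registers = list(initial_differences)
    values = []
    for _ in range(count):
        values.append(registers[0])
        # Add each higher-order difference into the register below it.
        for i in range(len(registers) - 1):
            registers[i] += registers[i + 1]
    return values

print(tabulate_by_differences([41, 2, 2], 5))  # [41, 43, 47, 53, 61]
```

A polynomial of degree d needs d + 1 registers, so a 6th-degree polynomial would use seven entries in `initial_differences`.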

Harvard Mark I
• Built in 1944 in IBM Endicott laboratories
  – Howard Aiken – Professor of Physics at Harvard
  – Essentially mechanical but had some electro-magnetically controlled relays and gears
  – Weighed 5 tons and had 750,000 components
  – A synchronizing clock that beat every 0.015 seconds (66 Hz)
  – Inspired by Charles Babbage's Analytical Engine
Performance:
  – 0.3 seconds for addition
  – 6 seconds for multiplication
  – 1 minute for a sine calculation
Decimal arithmetic
No Conditional Branch!
Broke down once a week!

Electronic Numerical Integrator and Computer (ENIAC)
• Inspired by Atanasoff and Berry, Eckert and Mauchly designed and built ENIAC (1943-45) at the University of Pennsylvania
• The first, completely electronic, operational, general-purpose analytical calculator!
  – 30 tons, 72 square meters, 200 KW
• Performance
  – Read in 120 cards per minute
  – Addition took 200 µs, Division 6 ms
  – 1000 times faster than Mark I
• Not very reliable!
Application: Ballistic calculations
angle = f (location, tail wind, cross wind, air density, temperature, weight of shell, propellant charge, ... )
WW-2 Effort
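
The "1000 times faster" figure can be sanity-checked from the addition times quoted on the Mark I and ENIAC slides. The short calculation below uses only those quoted numbers; the constant names are illustrative.

```python
# Figures quoted on the slides, converted to microseconds.
MARK_I_ADD_US = 300_000          # Mark I: 0.3 seconds per addition
ENIAC_ADD_US = 200               # ENIAC: 200 microseconds per addition
MARK_I_CLOCK_PERIOD_US = 15_000  # Mark I clock beat every 0.015 seconds

addition_speedup = MARK_I_ADD_US / ENIAC_ADD_US
mark_i_clock_hz = 1_000_000 / MARK_I_CLOCK_PERIOD_US

print(f"ENIAC addition speedup over Mark I: {addition_speedup:.0f}x")
print(f"Mark I clock rate: {mark_i_clock_hz:.1f} Hz")
```

For addition alone the ratio is about 1500x, the same order of magnitude as the slide's round figure.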

Computers in mid 50's
• Hardware was expensive
• Instruction stores were small (1000 words)
  ⇒ No resident system software!
• Memory access time was 10 to 50 times slower than the processor cycle
  ⇒ Instruction execution time was totally dominated by the memory reference time.
• The ability to design complex control circuits to execute an instruction was the central design concern as opposed to the speed of decoding or an ALU operation
• Programmer's view of the machine was inseparable from the actual hardware implementation
• MTBF of 20 minutes was state of the art
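
To see why memory reference time dominated, consider a simple model of one instruction. The sketch below assumes, purely for illustration, one processor cycle of non-memory work and two memory references per instruction (an instruction fetch and one operand access); only the 10x to 50x latency range comes from the slide.

```python
def memory_time_fraction(memory_refs, memory_latency_cycles, compute_cycles=1):
    """Fraction of instruction time spent waiting on memory references."""
    memory_cycles = memory_refs * memory_latency_cycles
    return memory_cycles / (memory_cycles + compute_cycles)

# Latency range quoted on the slide: 10 to 50 processor cycles per access.
for latency in (10, 50):
    fraction = memory_time_fraction(memory_refs=2, memory_latency_cycles=latency)
    print(f"latency {latency:2d} cycles: {fraction:.1%} of time in memory references")
```

Under these assumptions memory accounts for roughly 95% to 99% of execution time, so speeding up decoding or the ALU would barely change the total.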

Compatibility Problem at IBM
By early 60's, IBM had 4 incompatible lines of computers!
  701 → 7094
  650 → 7074
  702 → 7080
  1401 → 7010
Each system had its own
• Instruction set
• I/O system and secondary storage: magnetic tapes, drums and disks
• Assemblers, compilers, libraries, ...
• Market niche: business, scientific, real time, ...
⇒ IBM 360

IBM 360: Design Premises
Amdahl, Blaauw and Brooks, 1964
• The design must lend itself to growth and successor machines
• General method for connecting I/O devices
• Total performance - answers per month rather than bits per microsecond ⇒ programming aids
• Machine must be capable of supervising itself without manual intervention
• Built-in hardware fault checking and locating aids to reduce down time
• Simple to assemble systems with redundant I/O devices, memories etc. for fault tolerance
• Some problems required floating-point larger than 36 bits

IBM 360: A General-Purpose Register (GPR) Machine
• Processor State
  – 16 General-Purpose 32-bit Registers
    » may be used as index and base register
    » Register 0 has some special properties
  – 4 Floating Point 64-bit Registers
  – A Program Status Word (PSW)
    » PC, Condition codes, Control flags
• A 32-bit machine with 24-bit addresses
  – But no instruction contains a 24-bit address!
• Data Formats
  – 8-bit bytes, 16-bit half-words, 32-bit words, 64-bit double-words
The IBM 360 is why bytes are 8 bits long today!
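
The slide notes that no instruction carries a full 24-bit address. In System/360, storage operands are instead addressed as a base register plus a 12-bit displacement (plus an optional index register), and naming register 0 as base or index contributes zero rather than that register's contents. A minimal sketch of that calculation follows; the function and variable names are illustrative and do not come from the slides.

```python
ADDRESS_MASK = (1 << 24) - 1   # addresses are 24 bits wide
MAX_DISPLACEMENT = (1 << 12) - 1  # displacement field is 12 bits wide

def effective_address(registers, base, index, displacement):
    """Base + index + displacement, truncated to 24 bits.

    `registers` is a list of 16 general-purpose register values.
    Register number 0 as base or index means "no register" (contributes 0).
    """
    if not 0 <= displacement <= MAX_DISPLACEMENT:
        raise ValueError("displacement must fit in 12 bits")
    base_value = registers[base] if base != 0 else 0
    index_value = registers[index] if index != 0 else 0
    return (base_value + index_value + displacement) & ADDRESS_MASK

registers = [0] * 16
registers[0] = 0xFFFF      # ignored when used as base or index
registers[12] = 0x00A000   # base register
registers[3] = 0x10        # index register

print(hex(effective_address(registers, base=12, index=3, displacement=0x123)))
print(hex(effective_address(registers, base=0, index=0, displacement=0xFFF)))
```

This is how a 32-bit instruction can reach any location in a 24-bit address space while encoding only a 4-bit register number and 12 bits of offset per operand.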

IBM 360: Initial Implementations

                 Model 30            ...   Model 70
Storage          8K - 64 KB                256K - 512 KB
Datapath         8-bit                     64-bit
Circuit Delay    30 nsec/level             5 nsec/level
Local Store      Main Store                Transistor Registers
Control Store    Read only 1 µsec          Conventional circuits

IBM 360 instruction set architecture (ISA) completely hid the underlying technological differences between various models.
Milestone: The first true ISA designed as portable hardware-software interface!
With minor modifications it still survives today!

IBM 360: 47 years later…
The zSeries z11 Microprocessor
• 5.2 GHz in IBM 45nm PD-SOI CMOS technology
• 1.4 billion transistors in 512 mm^2
• 64-bit virtual addressing
  – original S/360 was 24-bit, and S/370 was 31-bit extension
• Quad-core design
• Three-issue out-of-order superscalar pipeline
• Out-of-order memory accesses
• Redundant datapaths
  – every instruction performed in two parallel datapaths and results compared
• 64KB L1 I-cache, 128KB L1 D-cache on-chip
• 1.5MB private L2 unified cache per core, on-chip
• On-chip 24MB eDRAM L3 cache
• Scales to 96-core multiprocessor with 768MB of shared L4 eDRAM
[IBM, HotChips, 2010]

Storage Devices Also Progressed

Magnetic Storage Devices
7.25 MB

LOGISTICS

Related Courses
CS61C: Basic computer organization, first look at pipelines + caches (strong prerequisite for CS152)
CS152: Computer Architecture, first look at parallel architectures
CS150: Digital Logic Design, FPGAs
CS250: VLSI Systems Design
CS252: Graduate Computer Architecture, Advanced Topics

CS61C vs CS152 vs CS252
• CS152 focuses on interaction of software and hardware
  – more architecture and less digital engineering
  – more useful for OS developers, compiler writers, performance programmers
• Much of the material you'll learn this term was previously in CS252
  – Some of the current CS61C was in CS252 over 20 years ago!
  – Maybe every 10 years, shift CS252->CS152->CS61C?
• CS152 begins where CS61C left off (with overlap)
• CS252 delves into more detail and has a research project

CS152 Executive Summary
The processor you built in CS61C
What you'll understand and experiment with in CS152
Plus, the technology behind chip-scale multiprocessors (CMPs) and graphics processing units (GPUs)

CS152 Structure and Syllabus
Five modules
1. Simple machine design (ISAs, microprogramming, unpipelined machines, Iron Law, simple pipelines)
2. Memory hierarchy (DRAM, caches, optimizations) plus virtual memory systems, exceptions, interrupts
3. Complex pipelining (score-boarding, out-of-order issue)
4. Explicitly parallel processors (vector machines, VLIW machines, multithreaded machines)
5. Multiprocessor architectures (memory models, cache coherence, synchronization)

CS152 Administrivia
Instructor: George Michelogiannakis, mihelog@eecs
  Office Hours: After lectures, Wednesdays 11-12:30pm, 341A Soda
T.A.: Colin Schmidt, colins@eecs
  Office Hours: Tuesday 2-4pm, 651 Soda
Lectures: M/W, 9-10:30AM, 306 Soda
Section: Th 2PM-4PM, 9105 Latimer
Text: Computer Architecture: A Quantitative Approach, Hennessy and Patterson, 5th Edition (2012)
  Readings assigned from this edition, some readings available in older editions – see web page.
Web page: http://inst.eecs.berkeley.edu/~cs152
  Lectures available online
Piazza: http://piazza.com/berkeley/spring2016/cs152

CS152 Course Components
• 15% Problem sets (one per module)
  – Intended to help you learn the material. Feel free to discuss with other students and instructors, but you must turn in your own solutions. Grading based mostly on effort, but quizzes assume that you have worked through all problems. Solutions released after PSs handed in
• 45% Quizzes (one per module)
  – In-class, closed-book, no calculators, no smartphones, no laptops, ...
  – Based on lectures, readings, problem sets, and labs
• 40% Labs (one per module)
  – Labs use advanced processor and system simulators
  – Directed plus open-ended sections to each lab
• Sections will review each of the above
• Check the website for deadlines!
• Sign up for Piazza!

CS152 Labs
• Each lab has directed plus open-ended assignments
• Directed portion (2/7) is intended to ensure students learn main concepts behind lab
  – Each student must perform own lab and hand in their own lab report
• Open-ended assignment (5/7) is to allow you to show your creativity
  – Roughly a "mini-project"
    » E.g., try an architectural idea and measure potential, or try to improve a design. Negative results OK (if explainable!)
  – Students can work individually or in groups of two
  – Group open-ended lab reports must be handed in separately (but state who you worked with)
  – Students can work in different groups for different assignments
• Lab reports must be readable English summaries
• Two free two-day extensions per student
• You may have to learn scripting languages

RISC-V ISA
• RISC-V is a new simple, clean, extensible ISA we developed at Berkeley for education and research
  – RISC-I/II, first Berkeley RISC implementations
  – Berkeley research machines SOAR/SPUR considered RISC-III/IV
• Both of the dominant ISAs (x86 and ARM) are too complex to use for teaching
• RISC-V ISA manual available on web page
  – See "resources" on class website
• Full GCC-based tool chain available

Chisel simulators
• Chisel is a new hardware description language we developed at Berkeley based on Scala
  – Constructing Hardware in a Scala Embedded Language
• Labs will use RISC-V processor simulators derived from Chisel processor designs
  – Gives you much more detailed information than other simulators
  – Can map to FPGA or real chip layout
• You need to learn some minimal Chisel in CS152, but we'll make Chisel RTL source available so you can see all the details of our processors
• Can do lab projects based on modifying the Chisel RTL code if desired

Chisel Design Flow
Chisel Design Description -> Chisel Compiler -> C++ code, FPGA Verilog, ASIC Verilog
C++ code -> C++ Compiler -> C++ Simulator
FPGA Verilog -> FPGA Tools -> FPGA Emulation
ASIC Verilog -> ASIC Tools -> GDS Layout

FAMILIARITY QUIZ

Pipelined Processor

Virtual Addresses

Caches

Birds Cache (hoard) too!
• Same idea. Bring valuable objects close
• Acorn Woodpeckers store their food in holes drilled in trees

In Conclusion
• Computer Architecture >> ISAs and RTL
• CS152 is about interaction of hardware and software, and design of appropriate abstraction layers
• Computer architecture is shaped by technology and applications
  – History provides lessons for the future
• Computer Science at the crossroads from sequential to parallel computing
  – Salvation requires innovation in many fields, including computer architecture
• Read Chapter 1 & Appendix A for next time!

Acknowledgements
• These slides contain material developed and copyrighted by:
  – Arvind (MIT)
  – Krste Asanovic (MIT/UCB)
  – Joel Emer (Intel/MIT)
  – James Hoe (CMU)
  – John Kubiatowicz (UCB)
  – David Patterson (UCB)
  – Various websites and papers
• MIT material derived from course 6.823
• UCB material derived from course CS252