CS 61C: Great Ideas in Computer Architecture (Machine Structures)
More I/O: DMA
Vladimir Stojanovic & Nicholas Weaver
http://inst.eecs.berkeley.edu/~cs61c/

Review: I/O
• “Memory mapped I/O”: Device control/data registers mapped to CPU address space
• CPU synchronizes with I/O device:
  – Polling
  – Interrupts
• “Programmed I/O”:
  – CPU execs lw/sw instructions for all data movement to/from devices
  – CPU spends time doing 2 things:
    1. Getting data from device to main memory
    2. Using data to compute

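The programmed I/O loop described above can be sketched as a small simulation. This is only an illustration under stated assumptions: `ToyDevice`, its status/data registers, and `programmed_io_read` are hypothetical names, and Python list accesses stand in for the lw/sw instructions a real CPU would issue against memory-mapped registers.

```python
class ToyDevice:
    """Hypothetical device with one status register and one data register."""

    def __init__(self, payload):
        self._pending = list(payload)

    def read_status(self):
        # 1 means a word is waiting in the data register, 0 means nothing left.
        return 1 if self._pending else 0

    def read_data(self):
        # Reading the data register consumes one word.
        return self._pending.pop(0)


def programmed_io_read(device, memory, base):
    """Copy all available words into memory; the CPU touches every word."""
    cpu_transfers = 0
    addr = base
    while device.read_status() == 1:   # polling the status register
        word = device.read_data()      # stands in for "lw" from the device
        memory[addr] = word            # stands in for "sw" to main memory
        addr += 1
        cpu_transfers += 1
    return cpu_transfers


memory = [0] * 8
count = programmed_io_read(ToyDevice([10, 20, 30]), memory, base=2)
print(memory, count)   # [0, 0, 10, 20, 30, 0, 0, 0] 3
```

The returned count grows linearly with the amount of data, which is the time the CPU cannot spend computing.
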
Working with real devices
• “Memory mapped I/O”: Device control/data registers mapped to CPU address space
• CPU synchronizes with I/O device:
  – Polling
  – Interrupts
• “Programmed I/O”: DMA
  – CPU execs lw/sw instructions for all data movement to/from devices
  – CPU spends time doing 2 things:
    1. Getting data from device to main memory
    2. Using data to compute

Agenda
• Direct Memory Access (DMA)
• Disks

What’s wrong with Programmed I/O?
• Not ideal because…
  1. CPU has to execute all transfers, could be doing other work
  2. Device speeds don’t align well with CPU speeds
  3. Energy cost of using beefy general-purpose CPU where simpler hardware would suffice
• Until now CPU has sole control of main memory

PIO vs. DMA
[Figure: PIO vs. DMA]

Direct Memory Access (DMA)
• Allows I/O devices to directly read/write main memory
• New Hardware: the DMA Engine
• DMA engine contains registers written by CPU:
  – Memory address to place data
  – # of bytes
  – I/O device #, direction of transfer
  – unit of transfer, amount to transfer per burst

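As a rough illustration of the register set listed above, the fields can be modeled as a plain record. The field names, the `Direction` enum, and the `bursts_needed` helper are assumptions made for this sketch; a real DMA engine defines its own register layout.

```python
from dataclasses import dataclass
from enum import Enum


class Direction(Enum):
    DEVICE_TO_MEMORY = 0
    MEMORY_TO_DEVICE = 1


@dataclass
class DMARegisters:
    """Hypothetical view of the registers the CPU writes before a transfer."""
    mem_addr: int          # memory address to place (or fetch) data
    byte_count: int        # number of bytes to move
    device_id: int         # which I/O device
    direction: Direction   # direction of transfer
    unit_bytes: int        # unit of transfer, in bytes
    units_per_burst: int   # amount to transfer per burst, in units

    def bursts_needed(self):
        # Integer ceiling of byte_count / bytes moved per burst.
        burst_bytes = self.unit_bytes * self.units_per_burst
        return -(-self.byte_count // burst_bytes)


regs = DMARegisters(
    mem_addr=0x1000,
    byte_count=4096,
    device_id=3,
    direction=Direction.DEVICE_TO_MEMORY,
    unit_bytes=4,
    units_per_burst=16,
)
print(regs.bursts_needed())   # 64
```
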
Operation of a DMA Transfer
[Figure from Section 5.1.4, “Direct Memory Access,” in Modern Operating Systems by Andrew S. Tanenbaum and Herbert Bos, 2014]

DMA: Incoming Data
1. Receive interrupt from device
2. CPU takes interrupt, begins transfer
  – Instructs DMA engine/device to place data @ certain address
3. Device/DMA engine handle the transfer
  – CPU is free to execute other things
4. Upon completion, Device/DMA engine interrupt the CPU again

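The four steps for incoming data can be traced with a toy model. Everything here is hypothetical (`ToyDMAEngine`, the callback standing in for the completion interrupt, the event log), and the copy runs synchronously, whereas real hardware performs it concurrently with the CPU.

```python
class ToyDMAEngine:
    """Hypothetical DMA engine; a callback stands in for its interrupt line."""

    def __init__(self, memory):
        self.memory = memory
        self.on_complete = None

    def start(self, source, dest_addr):
        # Step 3: the engine moves the data; the CPU issues no per-word lw/sw.
        for offset, word in enumerate(source):
            self.memory[dest_addr + offset] = word
        # Step 4: signal completion back to the CPU.
        if self.on_complete is not None:
            self.on_complete(dest_addr, len(source))


log = []
memory = [0] * 8
engine = ToyDMAEngine(memory)
engine.on_complete = lambda addr, n: log.append(("dma_done", addr, n))


def device_interrupt_handler(device_buffer):
    # Steps 1-2: the CPU takes the device interrupt and programs the engine.
    log.append(("device_irq",))
    engine.start(device_buffer, dest_addr=4)


device_interrupt_handler([7, 8, 9])
print(log)      # [('device_irq',), ('dma_done', 4, 3)]
print(memory)   # [0, 0, 0, 0, 7, 8, 9, 0]
```

The CPU appears only in the two handlers; the word-by-word copy belongs to the engine.
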
DMA: Outgoing Data
1. CPU decides to initiate transfer, confirms that external device is ready
2. CPU begins transfer
  – Instructs DMA engine/device that data is available @ certain address
3. Device/DMA engine handle the transfer
  – CPU is free to execute other things
4. Device/DMA engine interrupt the CPU again to signal completion

DMA: Some new problems
• Where in the memory hierarchy do we plug in the DMA engine? Two extremes:
  – Between L1 and CPU:
    • Pro: Free coherency
    • Con: Trash the CPU’s working set with transferred data
  – Between Last-level cache and main memory:
    • Pro: Don’t mess with caches
    • Con: Need to explicitly manage coherency

DMA: Some new problems
• How do we arbitrate between CPU and DMA Engine/Device access to memory? Three options:
  – Burst Mode
    • Start transfer of data block, CPU cannot access memory in the meantime
  – Cycle Stealing Mode
    • DMA engine transfers a byte, releases control, then repeats; interleaves processor/DMA engine accesses
  – Transparent Mode
    • DMA transfer only occurs when CPU is not using the system bus

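The difference between the three arbitration modes shows up clearly in a cycle-by-cycle bus timeline. The model below is a deliberately simplified sketch: one bus transfer per cycle, delayed CPU requests simply queue up, and the function name `bus_schedule` and its parameters are assumptions for illustration only.

```python
def bus_schedule(mode, cpu_wants, dma_words):
    """Return the bus owner per cycle: 'C' = CPU, 'D' = DMA engine, '-' = idle.

    cpu_wants[i] is True if the CPU issues a new memory request in cycle i.
    """
    timeline = []
    pending_cpu = 0        # CPU requests waiting for the bus
    remaining = dma_words  # units the DMA engine still has to move
    last_was_dma = False
    cycle = 0
    while cycle < len(cpu_wants) or pending_cpu > 0 or remaining > 0:
        if cycle < len(cpu_wants) and cpu_wants[cycle]:
            pending_cpu += 1
        if mode == "burst":
            # The whole block moves before the CPU gets the bus back.
            dma_goes = remaining > 0
        elif mode == "cycle_stealing":
            # Move one unit, then release the bus if the CPU is waiting.
            dma_goes = remaining > 0 and not (last_was_dma and pending_cpu > 0)
        elif mode == "transparent":
            # Only use cycles in which the CPU does not need the bus.
            dma_goes = remaining > 0 and pending_cpu == 0
        else:
            raise ValueError(f"unknown mode: {mode}")
        if dma_goes:
            timeline.append("D")
            remaining -= 1
        elif pending_cpu > 0:
            timeline.append("C")
            pending_cpu -= 1
        else:
            timeline.append("-")
        last_was_dma = dma_goes
        cycle += 1
    return "".join(timeline)


requests = [True, True, False, True]
print(bus_schedule("burst", requests, 3))           # DDDCCC
print(bus_schedule("cycle_stealing", requests, 3))  # DCDCDC
print(bus_schedule("transparent", requests, 3))     # CCDCDD
```

In this toy run the total length is the same for all three modes; what changes is who waits: the CPU in burst mode, the DMA engine in transparent mode, and both alternately with cycle stealing.
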
Agenda
• Direct Memory Access (DMA)
• Disks

Computer Memory Hierarchy
[Figure: computer memory hierarchy, with the level covered in this lecture labeled “Today”]

Magnetic Disk – common I/O device
• A kind of computer memory
  – Information stored by magnetizing ferrite material on surface of rotating disk
    • Similar to tape recorder except digital rather than analog data
• A type of non-volatile storage
  – Retains its value without applying power to disk.
• Two Types of Magnetic Disk
  1. Hard Disk Drives (HDD) – faster, more dense, non-removable.
  2. Floppy disks – slower, less dense, removable (now replaced by USB “flash drive”).
• Purpose in computer systems (Hard Drive):
  1. Working file system + long-term backup for files
  2. Secondary “backing store” for main-memory. Large, inexpensive, slow level in the memory hierarchy (virtual memory)

Photo of Disk Head, Arm, Actuator
[Photo labels: Arm, Head, Spindle]

Disk Device Terminology
• Several platters, with information recorded magnetically on both surfaces (usually)
• Bits recorded in tracks, which in turn are divided into sectors (e.g., 512 Bytes)
• Actuator moves head (end of arm) over track (“seek”), waits for sector to rotate under head, then reads or writes
[Figure labels: Outer Track, Inner Track, Sector, Actuator, Head, Arm, Platter]
[Video of hard disk in action]

Hard Drives are Sealed. Why?
• The closer the head to the disk, the smaller the “spot size” and thus the denser the recording.
  – Measured in Gbit/in^2
  – ~900 Gbit/in^2 is state of the art
  – Started out at 2 Kbit/in^2
  – ~450,000,000x improvement in ~60 years
• Disks are sealed to keep the dust out.
  – Heads are designed to “fly” at around 3-20 nm above the surface of the disk.
  – 99.999% of the head/arm weight is supported by the air bearing force (air cushion) developed between the disk and the head.

Disk Device Performance (1/2)
• Disk Access Time = Seek Time + Rotation Time + Transfer Time + Controller Overhead
  – Seek Time = time to position the head assembly at the proper cylinder
  – Rotation Time = time for the disk to rotate to the point where the first sectors of the block to access reach the head
  – Transfer Time = time taken by the sectors of the block and any gaps between them to rotate past the head
[Figure labels: Platter, Arm, Actuator, Head, Sector, Inner Track, Outer Track, Controller, Spindle]

Disk Device Performance (2/2)
• Average values to plug into the formula:
• Rotation Time: Average distance of sector from head?
  – 1/2 time of a rotation
    • 7200 Revolutions Per Minute ⇒ 120 Rev/sec
    • 1 revolution = 1/120 sec ⇒ 8.33 milliseconds
    • 1/2 rotation (revolution) ⇒ 4.17 ms
• Seek time: Average no. tracks to move arm?
  – Number of tracks/3 (see CS186 for the math)
  – Then, seek time = number of tracks moved × time to move across one track

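The “number of tracks / 3” figure can be checked numerically. The sketch below assumes the start and destination tracks are independent and uniformly distributed, which is an assumption of this example rather than something stated on the slide; the function name `mean_seek_distance` is likewise hypothetical.

```python
def mean_seek_distance(num_tracks):
    """Exact mean of |i - j| over all ordered track pairs (i, j)."""
    n = num_tracks
    # For each distance d >= 1 there are 2 * (n - d) ordered pairs (i, j).
    total = sum(d * 2 * (n - d) for d in range(1, n))
    return total / (n * n)


n = 15000
print(mean_seek_distance(n), n / 3)
```

Under that assumption the exact mean is (n² − 1) / (3n), which approaches n / 3 as the number of tracks grows.
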
But wait!
• Performance estimates are different in practice:
• Many disks have on-disk caches, which are completely hidden from the outside world
  – On a cache hit, the previous formula is completely replaced by the on-disk cache access time

Where does Flash memory come in?
• ~10 years ago: Microdrives and Flash memory (e.g., CompactFlash) went head-to-head
  – Both non-volatile (retain contents without power supply)
  – Flash benefits: lower power, no crashes (no moving parts; µdrives need to spin up/down)
  – Disk cost = fixed cost of motor + arm mechanics, but actual magnetic media cost very low
  – Flash cost = mostly the cost/bit of the flash chips
  – Over time, cost/bit of flash came down, became cost competitive

Flash Memory / SSD Technology
• NMOS transistor with an additional conductor between gate and source/drain which “traps” electrons. The presence/absence is a 1 or 0
• Memory cells can withstand a limited number of program-erase cycles. Controllers use a technique called wear leveling to distribute writes as evenly as possible across all the flash blocks in the SSD.

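A toy model makes the idea of wear leveling concrete. This is not how any particular SSD controller works; `ToyWearLeveler`, its least-worn-block policy, and the one-erase-per-write accounting are assumptions chosen to keep the sketch short, and it assumes fewer logical blocks than physical blocks.

```python
class ToyWearLeveler:
    """Hypothetical controller that remaps each write to the least-worn block."""

    def __init__(self, num_blocks):
        self.erase_counts = [0] * num_blocks
        self.logical_to_physical = {}

    def write(self, logical_block):
        # Blocks holding other logical blocks' data are not available.
        in_use = {
            phys
            for logical, phys in self.logical_to_physical.items()
            if logical != logical_block
        }
        candidates = [p for p in range(len(self.erase_counts)) if p not in in_use]
        # Choose the candidate with the fewest program-erase cycles so far.
        target = min(candidates, key=lambda p: self.erase_counts[p])
        self.erase_counts[target] += 1
        self.logical_to_physical[logical_block] = target
        return target


ssd = ToyWearLeveler(num_blocks=4)
for _ in range(8):
    ssd.write(logical_block=0)   # the host rewrites the same logical block
print(ssd.erase_counts)          # [2, 2, 2, 2]
```

Without remapping, all eight program-erase cycles would land on a single physical block; here they are spread evenly across four.
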
What did Apple put in its iPods?
• shuffle: Toshiba flash, 2 GB
• nano: Samsung flash, 16 GB
• classic: Toshiba 1.8-inch HDD, 80, 120, 160 GB
• touch: Toshiba flash, 32, 64 GB

Flash Memory in Smart Phones
• iPhone 6: up to 128 GB

Flash Memory in Laptops – Solid State Drive (SSD)
• Capacities up to 512 GB

iClicker Question
• We have the following disk:
  – 15000 Cylinders, 1 ms to cross 1000 Cylinders
  – 15000 RPM = 4 ms per rotation
  – Want to copy 1 MB, transfer rate of 1000 MB/s
  – 1 ms controller processing time
• What is the access time using our model?
  Disk Access Time = Seek Time + Rotation Time + Transfer Time + Controller Processing Time
  A: 10.5 ms   B: 9 ms   C: 8.5 ms   D: 11.4 ms   E: 12 ms

iClicker Question
• We have the following disk:
  – 15000 Cylinders, 1 ms to cross 1000 Cylinders
  – 15000 RPM = 4 ms per rotation
  – Want to copy 1 MB, transfer rate of 1000 MB/s
  – 1 ms controller processing time
• What is the access time?
  Seek = # cylinders / 3 × time = 15000 / 3 × 1 ms / 1000 cylinders = 5 ms
  Rotation = time for ½ rotation = 4 ms / 2 = 2 ms
  Transfer = Size / transfer rate = 1 MB / (1000 MB/s) = 1 ms
  Controller = 1 ms
  Total = 5 + 2 + 1 + 1 = 9 ms (choice B)

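The same calculation can be written as a small function so other disk parameters are easy to try. The function and parameter names are made up for this sketch; the model itself is the one from the slides (average seek over one third of the cylinders, half a rotation of latency).

```python
def disk_access_time_ms(num_cylinders, ms_per_cylinder, rpm,
                        transfer_mb, mb_per_s, controller_ms):
    """Disk access time in ms: seek + rotation + transfer + controller."""
    seek = (num_cylinders / 3) * ms_per_cylinder   # average seek: 1/3 of the cylinders
    rotation = (60_000 / rpm) / 2                  # half of one revolution, in ms
    transfer = transfer_mb / mb_per_s * 1000       # seconds converted to ms
    return seek + rotation + transfer + controller_ms


total = disk_access_time_ms(
    num_cylinders=15000,
    ms_per_cylinder=1 / 1000,   # 1 ms to cross 1000 cylinders
    rpm=15000,
    transfer_mb=1,
    mb_per_s=1000,
    controller_ms=1,
)
print(f"{total:.1f} ms")   # 9.0 ms
```

Seek time dominates here (5 of the 9 ms); the data transfer itself accounts for only 1 ms.
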
“And in conclusion…”
• I/O gives computers their 5 senses
• I/O speed range is 100-million to one
• Polling vs. Interrupts
• DMA to avoid wasting CPU time on data transfers
• Disks for persistent storage, replaced by flash