Introduction to Operating Systems
Mass Storage I
John Franco
Electrical Engineering and Computing Systems
University of Cincinnati
Taken from Chapter 12 of Silberschatz
File System Has Three Parts
• The programmer interface to the file system
• Internal data structures and algorithms to implement the interface
• The actual hardware or secondary storage structures
Outline
• Physical structure of mass storage devices
• Disk scheduling algorithms for performance improvement
• Disk formatting
• Management of Boot blocks
• Management of damaged blocks
• Management of swap space
• Disk reliability (RAID)
• Maintaining stable storage
• Tertiary storage
Magnetic Tape
• Relatively permanent and holds large quantities of data
• But access time is slow: about 1000 times slower than disk
• Mainly used for backup, storage of infrequently used data
• Density is 29.5 billion bits per square inch - 40x better than disk
• Cartridges capable of 35 TB capacity to be developed
  http://www.technologyreview.com/news/417218/new-life-for-magnetic-tape/
• Cost: < 1 penny per GB
• Kept in spool and wound or rewound past read-write head
• Once data is under the head, transfer rates are comparable to disk
• 10s of seconds to get the head in position for a read or write
• Forget random access - instead schedule a run, pick up and cache blocks
• Big plus: cartridges can be stored and loaded onto reader as needed
  http://www.oracle.com/technetwork/server-storage/sun-tape-storage/overview/index.html?ssSourceSiteId=ocomen
• EB size archives can be managed inexpensively with mag tape
Sizes
Kilobyte (KB)    2^10   1,024 Bytes
MegaByte (MB)    2^20   1,024 KB
GigaByte (GB)    2^30   1,024 MB
TeraByte (TB)    2^40   1,024 GB
PetaByte (PB)    2^50   1,024 TB
ExaByte (EB)     2^60   1,024 PB
ZettaByte (ZB)   2^70   1,024 EB
YottaByte (YB)   2^80   1,024 ZB
Geopbyte (Ge?B)
Magnetic Disk
• Bulk of R/W secondary-storage medium in today’s general purpose devices
• Rotate at 7200 rpm typically, 5400 rpm for laptops
• Data organized in tracks
• Seek time (head moves to track): 3ms (high end), 15ms (mobile)
• Rotation latency (data under head) - 4ms @ 7200 rpm
• Data transfer rate - 2000 Mbits/s
• Cost: 8 cents per GB (about 10x more expensive than tape)
• Big minus: platters are fixed hence reader cost is added to storage cost
• Another minus: vulnerable to head crash (head scrapes platter)
• Another minus: due to high seek/latency, fragmentation is something to worry
  about regarding performance degradation
• May be affected by magnetic fields - although they are typically shielded
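The rotation latency figure above can be checked with a little arithmetic. The sketch below assumes that "latency" means the average wait, that is, half of one full revolution; the function name is made up for illustration.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average wait for a sector to arrive under the head, in milliseconds.

    Assumption: on average the platter must turn half a revolution.
    """
    ms_per_revolution = 60_000.0 / rpm  # 60,000 ms in a minute
    return ms_per_revolution / 2.0


for rpm in (5400, 7200):
    print(f"{rpm} rpm: {avg_rotational_latency_ms(rpm):.2f} ms")
# 5400 rpm: 5.56 ms
# 7200 rpm: 4.17 ms
```

The 7200 rpm result agrees with the roughly 4 ms quoted on the slide.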
Solid State Drive
• Big plus: no moving parts, hence quieter, less vulnerable to shock, and better
  access time/latency - but failures do happen and are usually completely catastrophic
• Data is constantly being rearranged (unlike disk) so as to even out the rate
  of erasures/rewrites over the entire medium
• Unfortunately, most file systems have frequent rewrites in small sections
  such as directory entries
• Seek time (time to access) - .1ms
• Data transfer rate - 5000 Mbits/s
• Cost: about 60 cents per GB
• Developed with disk-like interfaces as a drop in replacement
• Not affected by magnetic fields
• Much less power consumption than disk
Organization of Magnetic Disk
• Several platters - two heads per platter
• Track - circle of data on a platter surface - 2 tracks per platter (one per side)
  at each location, which is specified by a radius
• Cylinder - collection of tracks at same radius - may be 1000s of these
• Sector - portion of a track, 512 bytes in size - 100s of these/track
• Addressing the sectors: sector/track/cylinder
sector   track   cylinder
0        0       0
1        0       0
...
N        0       0
0        1       0
...
N        1       0
...
N        M       0
0        0       1
...
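The ordering in the table (sector varies fastest, then track, then cylinder) corresponds to a simple linear numbering of sectors. The sketch below assumes N+1 sectors per track and M+1 tracks per cylinder; the function names are hypothetical and chosen only for illustration.

```python
def to_linear(sector, track, cylinder, sectors_per_track, tracks_per_cylinder):
    """Map a (sector, track, cylinder) address to a linear sector number."""
    return (cylinder * tracks_per_cylinder + track) * sectors_per_track + sector


def from_linear(index, sectors_per_track, tracks_per_cylinder):
    """Inverse mapping: linear sector number back to (sector, track, cylinder)."""
    sector = index % sectors_per_track
    track = (index // sectors_per_track) % tracks_per_cylinder
    cylinder = index // (sectors_per_track * tracks_per_cylinder)
    return sector, track, cylinder


# Toy geometry: 4 sectors per track (N = 3), 2 tracks per cylinder (M = 1)
print(to_linear(0, 1, 0, 4, 2))  # 4: first sector of the second track
print(to_linear(0, 0, 1, 4, 2))  # 8: first sector of the next cylinder
print(from_linear(8, 4, 2))      # (0, 0, 1)
```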
Access to Disk
Through IO channel using some protocol
• Integrated Drive Electronics (IDE): 40 pin parallel bus
  Devices are daisy chained: one is master, the other a slave
  Max speed: 133 MB/sec
  Several upgrades to Ultra-ATA/133 (AT Attachment - 1986)
  16 bit data width
  Uses PIO Modes 0, 1, 2, 3, 4
  Multiword DMA modes 0, 1, 2
  Ultra DMA modes 0, 1, 2, 3, 4, 5, 6
  where PIO = Programmed input/output
Access to Disk
• PIO Mode: CPU uses instructions that access the I/O address space
  directly to perform data transfers to or from an I/O device
  The PIO modes require a great deal of CPU overhead to
  configure a data transaction and transfer the data.

PIO Mode   Max transfer   Time between transactions
0          3.3 MB/s       600 ns
1          5.2 MB/s       383 ns
2          8.3 MB/s       240 ns
3          11.1 MB/s      180 ns
4          16.7 MB/s      120 ns
5          20 MB/s        100 ns
6          25 MB/s        80 ns
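The two columns of the PIO table appear to be related: with a 16 bit data width, one 2-byte word moved per transaction reproduces the listed rates. The sketch below works under that assumption (and takes 1 MB = 10^6 bytes); the function name is made up for illustration.

```python
def max_transfer_mb_per_s(cycle_ns: float, bytes_per_transaction: int = 2) -> float:
    """Peak rate when one word is moved every `cycle_ns` nanoseconds.

    Assumption: 16-bit (2-byte) transfers and 1 MB = 10**6 bytes.
    """
    # bytes per nanosecond * 1000 = MB per second
    return bytes_per_transaction * 1000.0 / cycle_ns


for mode, cycle_ns in enumerate((600, 383, 240, 180, 120, 100, 80)):
    print(f"PIO mode {mode}: {max_transfer_mb_per_s(cycle_ns):.1f} MB/s")
```

The same relation also matches the Multiword DMA timings, for example 480 ns gives about 4.2 MB/s.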
Access to Disk
• Multiword DMA Mode: Direct Memory Access, all words are
  transferred before an interrupt is raised for the CPU to process

MDMA Mode   Max transfer   Time between transactions
0           4.2 MB/s       480 ns
1           13.3 MB/s      150 ns
2           16.7 MB/s      120 ns
3           20 MB/s        100 ns
4           25 MB/s        80 ns
Access to Disk
• Ultra DMA Mode: Before Ultra DMA, one transfer of data occurred on
  each clock cycle, triggered by the rising edge of the interface
  clock. With Ultra DMA, data is transferred on both the rising
  and falling edges of the clock.

UDMA Mode   Max transfer   Time between transactions
0           16.7 MB/s      -
1           25 MB/s        -
2           33.3 MB/s      -
3           44.4 MB/s      -
4           66.7 MB/s      -
5           100 MB/s       -
6           133 MB/s       -
7           167 MB/s       -
Access to Disk
• Serial Advanced Technology Attachment (SATA):
  Serial bus - 4 pins
  Transfer rates up to 600 MB/s
• Small Computer System Interface (SCSI):
  Parallel, intelligent, buffered, peer-to-peer interface
  Up to 16 devices can be attached
  NCR developed the first SCSI chip

Protocol            Width (bits)   Max Transfer
SCSI-1 (1986)       8              5 MB/s
Fast SCSI (1994)    8              10 MB/s
Fast-Wide SCSI      16             20 MB/s
Ultra SCSI          8              20 MB/s
Ultra Wide SCSI     16             40 MB/s
Ultra2 SCSI         8              40 MB/s
Ultra2 Wide SCSI    16             80 MB/s
Ultra3 SCSI         16             160 MB/s
Ultra-320 SCSI      16             320 MB/s
Ultra-640 SCSI      16             640 MB/s
Access to Disk
• Fiber Channel (FC):
  Serial technology
  Fiber Channel Protocol (FCP) - serial SCSI (SCSI commands)
  For use in Storage Area Networks (SAN)

Protocol        Max Transfer
1GFC (1997)     200 MB/s
2GFC (2001)     400 MB/s
4GFC (2004)     800 MB/s
8GFC (2005)     1600 MB/s
10GFC (2008)    2550 MB/s
16GFC (2011)    3200 MB/s
32GFC (2014)    6400 MB/s
Access to Disk
Through Network Attached Storage (NAS)
• Client nodes on a LAN share a pool of storage as though it is local
• Efficiency drops and performance suffers vs. local storage
• Clients access net-attached storage using RPCs over NFS or CIFS
• Usually implemented as a RAID array
• iSCSI is the latest net-attached storage protocol
Access to Disk
Through Storage Area Network (SAN)
• Private network connecting servers and storage units
• Multiple hosts and multiple storage arrays can attach to the same SAN
• Uses storage protocols rather than networking protocols
• Separates communication between server and clients from communication
  between servers and storage devices, no contention between client & storage
• Storage can be dynamically allocated to hosts - a host running low on
  space might get a higher allocation
• Storage access from hosts can be prohibited
• Clusters of servers can share the same storage arrays
• Fiber Channel is commonly used for the connection
Access to Disk
Through InfiniBand (Serial)
• Point-to-point bidirectional serial links intended for the connection
  of processors with high-speed peripherals such as disks
• Most commonly used communications link in supercomputers
• Switched fabric communications link, not broadcast (e.g. TCP/IP)
• Used in high-performance computing and enterprise data centers
• Defines a connection between processor nodes and high performance
  storage devices (and other I/O devices)

Type   SDR       DDR       QDR       FDR-10        FDR           EDR
1X     2 Gb/s    4 Gb/s    8 Gb/s    10.3 Gb/s     13.64 Gb/s    25 Gb/s
4X     8 Gb/s    16 Gb/s   32 Gb/s   41.25 Gb/s    54.54 Gb/s    100 Gb/s
12X    24 Gb/s   48 Gb/s   96 Gb/s   123.75 Gb/s   163.64 Gb/s   300 Gb/s
Disk Scheduling
• Goal of OS and disk controller: minimize service time (quickest
  response time) by using hardware most efficiently
• Time factors in reading (or writing) a disk sector are:
1. Seek time: time for heads to move to the cylinder containing
   the desired sector
2. Rotational latency: time waiting for disk to rotate the desired sector
3. Transfer time: time to read data and move it to the system
4. Bandwidth: total bytes transferred/(time completed-time started)
• Cannot do much about transfer and latency time since these are hardware
  related
• Can try to schedule reads/writes to minimize seek time and maximize
  bandwidth
1. Do this by minimizing seek distance
2. If system/disk is reasonably busy, can have significant impact
3. Scheduling can be done by operating system, disk controller, or both
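To see why scheduling targets seek time, it helps to add up the time factors for one sector. The sketch below plugs in the high-end disk figures quoted earlier in these slides (3 ms seek, 4 ms rotation latency, 2000 Mbits/s transfer rate, 512-byte sector); the function names are made up for illustration.

```python
def service_time_ms(seek_ms, rotation_ms, nbytes, rate_mbits_per_s):
    """Seek time + rotational latency + transfer time, in milliseconds."""
    transfer_ms = (nbytes * 8) / (rate_mbits_per_s * 1e6) * 1000.0
    return seek_ms + rotation_ms + transfer_ms


def bandwidth_bytes_per_s(total_bytes, time_started_s, time_completed_s):
    """Total bytes transferred / (time completed - time started)."""
    return total_bytes / (time_completed_s - time_started_s)


t = service_time_ms(seek_ms=3.0, rotation_ms=4.0, nbytes=512, rate_mbits_per_s=2000)
print(f"{t:.3f} ms")  # 7.002 ms
```

Under these assumptions the transfer itself takes about 0.002 ms, so almost all of the service time is head movement and rotation.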
Disk Scheduling
Disk Scheduling Algorithms need to consider this:
• Assume requests for service are enqueued
• Request includes the following:
1. Whether this operation is input or output
2. What the disk address for the transfer is
3. What the memory address for the transfer is
4. What the number of sectors to be transferred is
• There may be many enqueued requests in a multi-programmed
  system
• The disk controller needs to select a request when it is ready to
  perform I/O
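The request contents listed above can be pictured as a small record held in a queue. This is only a sketch: the class and field names are hypothetical, and the disk address is simplified to a cylinder number.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class DiskRequest:
    is_write: bool       # whether the operation is output (True) or input (False)
    disk_address: int    # disk address for the transfer (simplified: cylinder)
    memory_address: int  # memory address for the transfer
    sector_count: int    # number of sectors to be transferred


pending = deque()  # requests waiting for the disk
pending.append(DiskRequest(False, 98, 0x1000, 8))
pending.append(DiskRequest(True, 183, 0x2000, 1))

# A scheduling algorithm decides which pending request to service next;
# taking the oldest one is simply the easiest choice to show here.
next_request = pending.popleft()
print(next_request.disk_address)  # 98
```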
Disk Scheduling
Disk Scheduling Algorithm evaluation criteria:
• Fairness: a totally fair system ensures that the mean response time
  of the disk requests is the same for all processes
• Complexity: data structures and algorithms for support.
  What happens when a new request enters the queue?
• Response Time: time it takes before transfer begins
Disk Scheduling
Optimal:
• The schedule that yields the lowest total number of cylinder traversals
• Fairness:
• Complexity:
• Response Time:
Example:
Assume cylinder requests are in this order and head is on cylinder 53:
98, 183, 37, 122, 14, 124, 65, 67
Visits: 37, 14, 65, 67, 98, 122, 124, 183
Traversals: 16+23+51+2+31+24+2+59 = 208 cylinders
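For a queue this small, the optimal total can be confirmed by brute force. The sketch below tries every service order, which is an illustrative choice only practical for a handful of requests; the function names are made up.

```python
from itertools import permutations


def total_movement(head, order):
    """Sum of cylinder distances when visiting `order` starting at `head`."""
    total = 0
    for cylinder in order:
        total += abs(cylinder - head)
        head = cylinder
    return total


def optimal_schedule(head, requests):
    """Exhaustive search over all service orders (8 requests = 40320 orders)."""
    return min(permutations(requests), key=lambda order: total_movement(head, order))


requests = [98, 183, 37, 122, 14, 124, 65, 67]
best = optimal_schedule(53, requests)
print(total_movement(53, best))  # 208
```

More than one order can reach the minimum, so the order returned may differ from the one on the slide while the total is the same.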
Disk Scheduling
First Come First Serve (FIFO):
• Requests are serviced in the order in which they arrive in the queue
• Fairness:
• Complexity:
• Response Time:
Example:
Assume cylinder requests are in this order and head is on cylinder 53:
98, 183, 37, 122, 14, 124, 65, 67
98-53 = 45, 183-98 = 85, 183-37 = 146, 122-37 = 85
122-14 = 108, 124-14 = 110, 124-65 = 59, 67-65 = 2
Traversals: 45+85+146+85+108+110+59+2 = 640 cylinders
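The first come first serve total can be reproduced by walking the queue in arrival order. A minimal sketch, with a hypothetical function name:

```python
def fcfs(head, requests):
    """Serve requests in arrival order; return (visit order, total cylinders moved)."""
    total = 0
    for cylinder in requests:
        total += abs(cylinder - head)
        head = cylinder
    return list(requests), total


visits, total = fcfs(53, [98, 183, 37, 122, 14, 124, 65, 67])
print(total)  # 640
```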
Disk Scheduling
Shortest Seek Time First (SSTF):
• Service all the requests close to the current head position before moving the
  head far away to service other requests
• Fairness:
• Complexity:
• Response Time:
Example:
Assume cylinder requests are in this order and head is on cylinder 53:
98, 183, 37, 122, 14, 124, 65, 67
Visits: 65, 67, 37, 14, 98, 122, 124, 183
Traversals: 12+2+30+23+84+24+2+59 = 236 cylinders
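SSTF can be sketched as a greedy loop that always picks the pending request nearest to the current head position. The function name is made up, and ties are broken by queue order, which is an assumption the slides do not specify.

```python
def sstf(head, requests):
    """Greedy nearest-request-first; return (visit order, total cylinders moved)."""
    pending = list(requests)
    visits, total = [], 0
    while pending:
        nearest = min(pending, key=lambda c: abs(c - head))  # ties: first in queue
        pending.remove(nearest)
        total += abs(nearest - head)
        head = nearest
        visits.append(nearest)
    return visits, total


visits, total = sstf(53, [98, 183, 37, 122, 14, 124, 65, 67])
print(visits)  # [65, 67, 37, 14, 98, 122, 124, 183]
print(total)   # 236
```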
Disk Scheduling
SCAN (The Elevator Algorithm):
• Disk arm starts at one end of the platter, moves toward the other end taking
  care of requests as they match current cylinder position - reverses when it
  gets to the end
• Fairness:
• Complexity:
• Response Time:
Example:
Assume cylinder requests are in this order and head is on cylinder 53,
heading to 0:
98, 183, 37, 122, 14, 124, 65, 67
Visits: 37, 14 | 65, 67, 98, 122, 124, 183
Traversals: 53+183 = 236 cylinders
Disk Scheduling
SCAN (The Elevator Algorithm):
• Disk arm starts at one end of the platter, moves toward the other end taking
  care of requests as they match current cylinder position - reverses when it
  gets to the end
• Fairness:
• Complexity:
• Response Time:
Example:
Assume cylinder requests are in this order and head is on cylinder 53,
heading to 200:
98, 183, 37, 122, 14, 124, 65, 67
Visits: 65, 67, 98, 122, 124, 183 | 37, 14
Traversals: (200-53)+(200-14) = 333 cylinders
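Both SCAN examples can be reproduced with one routine. The sketch below assumes cylinders run from 0 to 200, as the arithmetic on these slides does, and that the arm always travels to the end before reversing; the function names are made up for illustration.

```python
def total_movement(head, path):
    """Sum of cylinder distances along `path`, starting at `head`."""
    total = 0
    for cylinder in path:
        total += abs(cylinder - head)
        head = cylinder
    return total


def scan(head, requests, toward_zero, low=0, high=200):
    """SCAN: sweep to one end serving requests, then reverse."""
    below = sorted((c for c in requests if c < head), reverse=True)
    above = sorted(c for c in requests if c >= head)
    if toward_zero:
        visits, path = below + above, below + [low] + above
    else:
        visits, path = above + below, above + [high] + below
    return visits, total_movement(head, path)


requests = [98, 183, 37, 122, 14, 124, 65, 67]
print(scan(53, requests, toward_zero=True)[1])   # 236
print(scan(53, requests, toward_zero=False)[1])  # 333
```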
Disk Scheduling
Circular SCAN (C-SCAN):
• Scans from one end to the other, but when it reaches the end it pops back
  to the beginning and starts over - works because there are few requests left
  that are close to the end of the scan
• Fairness:
• Complexity:
• Response Time:
Example:
Assume cylinder requests are in this order and head is on cylinder 53,
heading to 0:
98, 183, 37, 122, 14, 124, 65, 67
Visits: 37, 14 | 183, 124, 122, 98, 67, 65
Traversals: 53+200+(200-65) = 388 cylinders
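The C-SCAN total can be reproduced in the same way. This sketch follows the convention used in the example: cylinders run from 0 to 200, and the jump from one end back to the other is counted as 200 cylinders of movement. The function names are made up for illustration.

```python
def total_movement(head, path):
    """Sum of cylinder distances along `path`, starting at `head`."""
    total = 0
    for cylinder in path:
        total += abs(cylinder - head)
        head = cylinder
    return total


def c_scan(head, requests, toward_zero, low=0, high=200):
    """C-SCAN: sweep to one end, jump to the far end, keep sweeping the same way."""
    below = [c for c in requests if c < head]
    above = [c for c in requests if c >= head]
    if toward_zero:
        first, second = sorted(below, reverse=True), sorted(above, reverse=True)
        path = first + [low, high] + second
    else:
        first, second = sorted(above), sorted(below)
        path = first + [high, low] + second
    return first + second, total_movement(head, path)


requests = [98, 183, 37, 122, 14, 124, 65, 67]
visits, total = c_scan(53, requests, toward_zero=True)
print(visits)  # [37, 14, 183, 124, 122, 98, 67, 65]
print(total)   # 388
```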
Disk Scheduling
LOOK:
• Like SCAN, but the arm reverses at the last request in the current direction
• Fairness:
• Complexity:
• Response Time:
Example:
Assume cylinder requests are in this order and head is on cylinder 53,
heading to 0:
98, 183, 37, 122, 14, 124, 65, 67
Visits: 37, 14 | 65, 67, 98, 122, 124, 183
Traversals: (53-14)+(183-14) = 208 cylinders
Disk Scheduling
LOOK:
• Like SCAN, but the arm reverses at the last request in the current direction
• Fairness:
• Complexity:
• Response Time:
Example:
Assume cylinder requests are in this order and head is on cylinder 53,
heading to 200:
98, 183, 37, 122, 14, 124, 65, 67
Visits: 65, 67, 98, 122, 124, 183 | 37, 14
Traversals: (183-53)+(183-14) = 299 cylinders
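Both LOOK examples can be reproduced by a routine that never travels past the last request in each direction. A minimal sketch, with hypothetical function names:

```python
def total_movement(head, path):
    """Sum of cylinder distances along `path`, starting at `head`."""
    total = 0
    for cylinder in path:
        total += abs(cylinder - head)
        head = cylinder
    return total


def look(head, requests, toward_zero):
    """LOOK: like SCAN, but reverse at the last request instead of the disk edge."""
    below = sorted((c for c in requests if c < head), reverse=True)
    above = sorted(c for c in requests if c >= head)
    visits = below + above if toward_zero else above + below
    return visits, total_movement(head, visits)


requests = [98, 183, 37, 122, 14, 124, 65, 67]
print(look(53, requests, toward_zero=True)[1])   # 208
print(look(53, requests, toward_zero=False)[1])  # 299
```

With the head heading to 0, LOOK matches the optimal total of 208 for this queue.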
Disk Scheduling
Circular LOOK (C-LOOK):
• Circular version of LOOK
• Fairness:
• Complexity:
• Response Time:
Example:
Assume cylinder requests are in this order and head is on cylinder 53,
heading to 0:
98, 183, 37, 122, 14, 124, 65, 67
Visits: 37, 14 | 183, 124, 122, 98, 67, 65
Traversals: (53-14)+(183-14)+(183-65) = 326 cylinders
Disk Scheduling
Circular LOOK (C-LOOK):
• Circular version of LOOK
• Fairness:
• Complexity:
• Response Time:
Example:
Assume cylinder requests are in this order and head is on cylinder 53,
heading to 200:
98, 183, 37, 122, 14, 124, 65, 67
Visits: 65, 67, 98, 122, 124, 183 | 14, 37
Traversals: (183-53)+(183-14)+(37-14) = 322 cylinders
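Both C-LOOK examples can be reproduced by a routine that serves requests in one direction only and jumps from the last request back to the farthest one. The sketch counts the jump as movement, as the examples do; the function names are made up for illustration.

```python
def total_movement(head, path):
    """Sum of cylinder distances along `path`, starting at `head`."""
    total = 0
    for cylinder in path:
        total += abs(cylinder - head)
        head = cylinder
    return total


def c_look(head, requests, toward_zero):
    """C-LOOK: serve in one direction only, jumping back to the farthest request."""
    below = [c for c in requests if c < head]
    above = [c for c in requests if c >= head]
    if toward_zero:
        visits = sorted(below, reverse=True) + sorted(above, reverse=True)
    else:
        visits = sorted(above) + sorted(below)
    return visits, total_movement(head, visits)


requests = [98, 183, 37, 122, 14, 124, 65, 67]
print(c_look(53, requests, toward_zero=True)[1])   # 326
print(c_look(53, requests, toward_zero=False)[1])  # 322
```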