Motivation
Megabyte found that tying lots of barrels together to make his raft worked just as well as having a huge one. And if one of them became punctured, it didn't sink.
What does RAID stand for?
In 1987, Patterson, Gibson and Katz at the University of California, Berkeley, published a paper entitled "A Case for Redundant Arrays of Inexpensive Disks (RAID)". This paper described various types of disk arrays, referred to by the acronym RAID. The basic idea of RAID was to combine multiple small, inexpensive disk drives into an array of disk drives which yields performance exceeding that of a Single Large Expensive Drive (SLED). Additionally, this array of drives appears to the computer as a single logical storage unit or drive.
The Problem
• For an array with no redundancy, the Mean Time Between Failures (MTBF) of the array is approximately the MTBF of an individual drive divided by the number of drives in the array, assuming the drives fail independently. Because of this, the MTBF of an array of drives would be too low for many application requirements.
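The rule of thumb above can be checked with a few lines of arithmetic. This is a minimal sketch: the function name `array_mtbf_hours` and the 200,000-hour drive figure are illustrative assumptions, not values from the slides, and the model assumes independent drive failures and no redundancy.

```python
def array_mtbf_hours(drive_mtbf_hours: float, num_drives: int) -> float:
    """Approximate MTBF of a non-redundant array of identical drives.

    Assumes independent failures, so the array fails as soon as
    any single drive fails (hypothetical helper for illustration).
    """
    if num_drives < 1:
        raise ValueError("an array needs at least one drive")
    return drive_mtbf_hours / num_drives


# Illustrative numbers only: a 200,000-hour drive in a 10-drive array.
print(array_mtbf_hours(200_000, 10))
```

With these assumed figures the array MTBF drops to one tenth of the single-drive value, which is the motivation for adding redundancy.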
The Solution
• Disk arrays can be made fault-tolerant by redundantly storing information in various ways.
• Five types of array architectures, RAID-1 through RAID-5, were defined by the Berkeley paper, each providing disk fault-tolerance and each offering different trade-offs in features and performance.
• In addition to these five redundant array architectures, it has become popular to refer to a non-redundant array of disk drives as a RAID-0 array.
Today’s Motivation
• We use RAID today for
– Increasing disk throughput by allowing parallel access
– Eliminating the need to make disk backups
• Disks are too big to be backed up in an efficient fashion
Data Striping
• Fundamental to RAID is "striping", a method of concatenating multiple drives into one logical storage unit.
• Striping involves partitioning each drive's storage space into strips which may be as small as one sector (512 bytes) or as large as several megabytes.
Logical to physical data mapping for striping

Logical disk: strip 0, strip 1, strip 2, ..., strip 15 (in order)

Physical Disk 0   Physical Disk 1   Physical Disk 2   Physical Disk 3
strip 0           strip 1           strip 2           strip 3
strip 4           strip 5           strip 6           strip 7
strip 8           strip 9           strip 10          strip 11
strip 12          strip 13          strip 14          strip 15

Each row across the four physical disks forms one stripe.
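The round-robin mapping from logical strips to physical disks can be sketched in a few lines. The function names `locate_strip` and `locate_byte` are hypothetical, chosen for this example, and the sketch assumes equal-sized strips dealt out to the disks in order.

```python
from typing import Tuple


def locate_strip(logical_strip: int, num_disks: int) -> Tuple[int, int]:
    """Map a logical strip number to (physical disk, stripe index)."""
    if logical_strip < 0 or num_disks < 1:
        raise ValueError("strip must be >= 0 and num_disks >= 1")
    disk = logical_strip % num_disks      # strips are dealt round-robin
    stripe = logical_strip // num_disks   # row of the array holding the strip
    return disk, stripe


def locate_byte(offset: int, strip_size: int, num_disks: int) -> Tuple[int, int]:
    """Map a logical byte offset to (physical disk, byte offset on that disk)."""
    strip, within = divmod(offset, strip_size)
    disk, stripe = locate_strip(strip, num_disks)
    return disk, stripe * strip_size + within


# Logical strip 9 on a four-disk array.
print(locate_strip(9, 4))
```

For four disks, logical strip 9 lands on physical disk 1 in the third row (stripe index 2).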
RAID Idea
• Several improvements in disk-use techniques involve the use of multiple disks working cooperatively.
• Disk striping uses a group of disks as one storage unit.
• RAID schemes improve performance and improve the reliability of the storage system by storing redundant data.
– Mirroring or shadowing keeps a duplicate of each disk.
– Block interleaved parity uses much less redundancy.
RAID Common Characteristics
1. A set of physical disk drives viewed by the OS as a single logical drive.
2. Data are distributed across the array of disk drives.
3. Redundant disk capacity is used to store parity information, which guarantees data recoverability in case of a disk failure.
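To make the recoverability claim in item 3 concrete, here is a minimal sketch of XOR parity, the scheme commonly used for single-disk fault tolerance. The helper names `xor_blocks` and `rebuild_missing` are assumptions made for this example, and the byte strings stand in for disk blocks.

```python
from functools import reduce
from operator import xor


def xor_blocks(blocks):
    """Bytewise XOR of a list of equal-length blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(xor, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Recover the one missing data block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


data = [b"RAID", b"disk", b"test"]   # blocks on three data disks
parity = xor_blocks(data)            # block stored on the parity disk
# Pretend the second disk failed and rebuild its block.
print(rebuild_missing([data[0], data[2]], parity))
```

Because XOR is its own inverse, XORing the parity with every surviving block yields exactly the lost block. This only covers the loss of a single disk.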
RAID Structure
• RAID provides reliability via redundancy.
• RAID is arranged into six different levels, RAID 0 through RAID 5:
RAID 0
RAID Level 0 is not redundant. In level 0, data is split across drives, resulting in higher data throughput. Performance is very good, but the failure of any disk in the array results in data loss. This level is commonly referred to as striping.
RAID 0 (non-redundant)

strip 0    strip 1    strip 2    strip 3
strip 4    strip 5    strip 6    strip 7
strip 8    strip 9    strip 10   strip 11
strip 12   strip 13   strip 14   strip 15

(Each column is one disk of the array.)
RAID 1
RAID Level 1 provides redundancy by writing all data to two or more drives.
The performance of a level 1 array tends to be faster on reads and slower on writes compared to a single drive, but if either drive fails, no data is lost.
The cost per megabyte is high. This level is commonly referred to as mirroring.
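The mirroring behaviour described above can be modelled with a toy class. This is only a sketch: `MirroredArray` and its methods are hypothetical names, dictionaries stand in for drives, and real implementations also handle resynchronising a replaced drive, which is omitted here.

```python
class MirroredArray:
    """Toy RAID-1 model: every block is written to all working drives."""

    def __init__(self, num_drives: int = 2):
        if num_drives < 2:
            raise ValueError("mirroring needs at least two drives")
        self.drives = [dict() for _ in range(num_drives)]
        self.failed = set()

    def write(self, block_no: int, data: bytes) -> None:
        # Each write goes to every working drive, which is why writes
        # tend to be slower than on a single drive.
        for index, drive in enumerate(self.drives):
            if index not in self.failed:
                drive[block_no] = data

    def read(self, block_no: int) -> bytes:
        # Any working copy can serve a read.
        for index, drive in enumerate(self.drives):
            if index not in self.failed:
                return drive[block_no]
        raise IOError("all drives have failed")

    def fail(self, drive_index: int) -> None:
        self.failed.add(drive_index)


array = MirroredArray(2)
array.write(0, b"payroll record")
array.fail(0)              # lose one drive
print(array.read(0))       # the mirror still serves the data
```

The sketch also shows why the cost per megabyte is high: two drives hold one drive's worth of data.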
RAID 1 (mirrored)
• Recommended Applications
– Video Production and Editing
– Image Editing
– Pre-Press Applications
– Any application requiring high bandwidth
RAID 2
• RAID level 2 uses an error-correcting algorithm (a Hamming code) together with a disk-striping strategy that breaks a file into very small units (bits or bytes) and spreads them across multiple disks.
– It is intended for use with drives which do not have built-in error detection.
– All SCSI drives support built-in error detection, so this level is of little use when using SCSI drives.
• The error-correction method requires several disks.
• RAID level 2 is more advanced than level 0, because it provides fault tolerance, but is not as efficient as other RAID levels and is not generally used.
RAID 3
• RAID level 3 is similar to RAID level 2, because it uses the same striping method as level 2, but it requires only one disk for parity data.
• RAID 3 suffers from a write bottleneck, because all parity data is written to a single drive, but provides some read and write performance improvement.
RAID 4
• RAID level 4 is similar to RAID level 3, because it uses a similar striping method and requires only one disk for parity data, but it stripes data in much larger blocks or segments.
• RAID level 4 is not as efficient as RAID level 5, because (as in RAID level 3) all parity data is written to a single drive, so RAID level 4 suffers from a write bottleneck and is not generally used.
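The write bottleneck is easier to see with the usual read-modify-write parity update, sketched below under the assumption of XOR parity (the slides do not spell out the parity function). The names `xor_bytes` and `updated_parity` are hypothetical.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def updated_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """New parity after overwriting one data block in a stripe.

    Only the old data block and the old parity block need to be read,
    but the parity block must be rewritten on every such write.
    """
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


d0, d1, d2 = b"\x01\x02", b"\x10\x20", b"\xaa\xbb"
parity = xor_bytes(xor_bytes(d0, d1), d2)
new_d1 = b"\xff\x00"
# The shortcut matches recomputing parity over the whole stripe.
print(updated_parity(parity, d1, new_d1) == xor_bytes(xor_bytes(d0, new_d1), d2))
```

Every small write to any data disk therefore also writes the parity block. With a dedicated parity disk, all of those parity writes queue up on the same drive.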
RAID 5
• RAID level 5 is known as striping with parity.
• This is the most popular RAID level.
• It is similar to level 4 in that it stripes the data in large blocks across all the disks in the array.
• It differs in that it writes the parity across all the disks.
• The data redundancy is provided by the parity information.
• Within each stripe, the data and the parity information that protects it are always placed on different disks.
• RAID level 5 can offer better read throughput than RAID level 1, because data is striped across more disks, and it provides fault tolerance.
Summary (1)
RAID-1 is the array of choice for performance-critical, fault-tolerant environments. In addition, RAID-1 is the only choice for fault-tolerance if no more than two drives are desired.
Summary (3)
RAID-3 can be used in data intensive or single-user environments which access long sequential records to speed up data transfer. However, RAID-3 does not allow multiple I/O operations to be overlapped and requires synchronized-spindle drives in order to avoid performance degradation with short records.
Summary (4)
RAID-4 offers no advantages over RAID-5 and does not support multiple simultaneous write operations.
Summary (5)
RAID-5 is the best choice in multi-user environments which are not write performance sensitive. However, at least three, and more typically five, drives are required for RAID-5 arrays.
Hardware vs. Software RAID
• Software-based arrays occupy host system memory, consume CPU cycles and are operating system dependent.
• Software-based arrays can degrade overall server performance.
• Unlike hardware-based arrays, the performance of a software-based array is directly dependent on server CPU performance and load.
• Software-based implementations commonly require a separate boot drive, which is NOT included in the array.