RAID Redundant Array of Independent Disks


Motivation :-)

Megabyte found that tying lots of barrels together to make his raft worked just as well as having a huge one. And if one of them became punctured, it didn't sink.

What does RAID stand for?

In 1987, Patterson, Gibson and Katz at the University of California, Berkeley, published a paper entitled "A Case for Redundant Arrays of Inexpensive Disks (RAID)". This paper described various types of disk arrays, referred to by the acronym RAID. The basic idea of RAID was to combine multiple small, inexpensive disk drives into an array of disk drives which yields performance exceeding that of a Single Large Expensive Drive (SLED). Additionally, this array of drives appears to the computer as a single logical storage unit or drive.


The Problem

• Without redundancy, the Mean Time Between Failures (MTBF) of the array is roughly equal to the MTBF of an individual drive divided by the number of drives in the array, because the failure of any one drive takes the whole array down. Because of this, the MTBF of an array of drives would be too low for many application requirements.
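The division rule above can be tried out with a few lines of Python. This is only a rough sketch: the function name and the 200,000-hour drive MTBF are illustrative assumptions, and the rule itself assumes that drives fail independently.

```python
def array_mtbf_hours(drive_mtbf_hours: float, num_drives: int) -> float:
    """Approximate MTBF of a non-redundant array of identical drives."""
    if num_drives < 1:
        raise ValueError("an array needs at least one drive")
    # Any single drive failure fails the array, so the MTBF shrinks
    # in proportion to the number of drives.
    return drive_mtbf_hours / num_drives


# Assumed figure for one drive; not taken from the slides.
DRIVE_MTBF_HOURS = 200_000.0

for n in (1, 4, 10, 100):
    hours = array_mtbf_hours(DRIVE_MTBF_HOURS, n)
    print(f"{n:3d} drives: about {hours:,.0f} hours ({hours / 8760:.1f} years)")
```

Under these assumptions a 100-drive array would be expected to lose a drive every few months, which is why redundancy is needed.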


The Solution

• Disk arrays can be made fault-tolerant by redundantly storing information in various ways.

• Five types of array architectures, RAID-1 through RAID-5, were defined by the Berkeley paper, each providing disk fault-tolerance and each offering different trade-offs in features and performance.

• In addition to these five redundant array architectures, it has become popular to refer to a non-redundant array of disk drives as a RAID-0 array.


Today’s Motivation

• We use RAID today for

– Increasing disk throughput by allowing parallel access

– Eliminating the need to make disk backups

• Disks are too big to be backed up in an efficient fashion


Data Striping

• Fundamental to RAID is "striping", a method of concatenating multiple drives into one logical storage unit.

• Striping involves partitioning each drive's storage space into strips which may be as small as one sector (512 bytes) or as large as several megabytes.

Logical to physical data mapping for striping

Logical disk: strip 0 through strip 15, in order

Physical Disk 0: strip 0, strip 4, strip 8, strip 12

Physical Disk 1: strip 1, strip 5, strip 9, strip 13

Physical Disk 2: strip 2, strip 6, strip 10, strip 14

Physical Disk 3: strip 3, strip 7, strip 11, strip 15

stripe: one row of strips across all four disks (for example, strips 0 to 3)
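The round-robin mapping in the figure can be written as two integer operations. The sketch below is a minimal illustration; the function names `locate_strip` and `layout` are made up for this example and do not come from any RAID implementation.

```python
def locate_strip(logical_strip: int, num_disks: int) -> tuple[int, int]:
    """Return (physical disk, stripe index on that disk) for a logical strip."""
    disk = logical_strip % num_disks      # strips rotate across the disks
    stripe = logical_strip // num_disks   # row within each disk
    return disk, stripe


def layout(num_strips: int, num_disks: int) -> list[list[int]]:
    """List the logical strips held by each physical disk."""
    disks = [[] for _ in range(num_disks)]
    for strip in range(num_strips):
        disk, _ = locate_strip(strip, num_disks)
        disks[disk].append(strip)
    return disks


for disk, strips in enumerate(layout(16, 4)):
    print(f"Physical Disk {disk}: {strips}")
```

With 16 strips and 4 disks this reproduces the columns of the figure: consecutive logical strips land on different disks, so a long sequential request can be served by all disks in parallel.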

RAID Idea

• Several improvements in disk-use techniques involve the use of multiple disks working cooperatively.

• Disk striping uses a group of disks as one storage unit.

• RAID schemes improve performance and improve the reliability of the storage system by storing redundant data.

– Mirroring or shadowing keeps a duplicate of each disk.

– Block interleaved parity uses much less redundancy.


RAID Common Characteristics

1. A set of physical disk drives viewed by the OS as a single logical drive.

2. Data are distributed across the array of disk drives.

3. Redundant disk capacity is used to store parity information, which allows the data to be recovered if a disk fails.
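How parity makes recovery possible can be shown with bytewise XOR, the usual parity function in parity-based RAID levels. The sketch below is illustrative only: the helper name `xor_blocks` and the three sample blocks are assumptions made for this example.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, imagined to sit on three different disks.
data = [b"RAID", b"disk", b"data"]
parity = xor_blocks(data)  # stored on the redundant disk

# Suppose the disk holding data[1] fails. XOR-ing the surviving
# blocks with the parity block reproduces the lost block.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

The same calculation works whichever single block is lost, because XOR-ing a value twice cancels it out. Losing two blocks at once cannot be repaired from one parity block.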

RAID Structure

• RAID provides reliability via redundancy.

• RAID is arranged into six different levels:


RAID Levels

RAID 0

RAID Level 0 is not redundant. In level 0, data is split across drives, resulting in higher data throughput. Performance is very good, but the failure of any disk in the array results in data loss. This level is commonly referred to as striping.

RAID 0 (non-redundant)

Disk 0: strip 0, strip 4, strip 8, strip 12

Disk 1: strip 1, strip 5, strip 9, strip 13

Disk 2: strip 2, strip 6, strip 10, strip 14

Disk 3: strip 3, strip 7, strip 11, strip 15

RAID 1

RAID Level 1 provides redundancy by writing all data to two or more drives.

The performance of a level 1 array tends to be faster on reads and slower on writes compared to a single drive, but if either drive fails, no data is lost.

The cost per megabyte is high. This level is commonly referred to as mirroring.
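Mirroring can be modelled in a few lines. The class below is a toy in-memory sketch, not a real driver: the name `MirroredArray`, its methods, and the block-list representation of a drive are all assumptions made for illustration.

```python
class MirroredArray:
    """Toy model of RAID 1: every write goes to all working drives."""

    def __init__(self, num_drives: int = 2, num_blocks: int = 8):
        self.drives = [[None] * num_blocks for _ in range(num_drives)]
        self.failed = set()

    def write(self, block: int, data: bytes) -> None:
        # A write costs one operation per mirror, which is why
        # writes tend to be slower than on a single drive.
        for i, drive in enumerate(self.drives):
            if i not in self.failed:
                drive[block] = data

    def read(self, block: int) -> bytes:
        # Any working mirror can serve a read.
        for i, drive in enumerate(self.drives):
            if i not in self.failed:
                return drive[block]
        raise IOError("all mirrors have failed")

    def fail(self, drive_index: int) -> None:
        self.failed.add(drive_index)


array = MirroredArray()
array.write(0, b"payroll")
array.fail(0)            # one drive dies
print(array.read(0))     # the copy on the other drive is still readable
```

A real implementation would also spread reads across the mirrors, which is where the read speed-up comes from; this sketch always reads from the first working drive.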


RAID 1 (mirrored)

• Recommended Applications

– Video Production and Editing

– Image Editing

– Pre-Press Applications

– Any application requiring high bandwidth

RAID 2

• RAID level 2 uses an error-correcting algorithm together with a disk-striping strategy that breaks a file into bytes and spreads it across multiple disks.

– It is intended for use with drives which do not have built-in error detection.

– All SCSI drives support built-in error detection, so this level is of little use when using SCSI drives.

• The error-correction method requires several disks.

• RAID level 2 is more advanced than level 0, because it provides fault tolerance, but it is not as efficient as other RAID levels and is not generally used.


RAID 2 (Redundancy through Hamming code)
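To give a feel for Hamming-code redundancy, the sketch below uses the small Hamming(7,4) code: four data bits plus three check bits, with each of the seven bits imagined to sit on its own disk. The choice of Hamming(7,4) and the function names are assumptions made for illustration; the slides do not say which code size a RAID 2 array would use.

```python
def hamming74_encode(data_bits: list[int]) -> list[int]:
    """Encode 4 data bits as a 7-bit codeword (positions 1..7)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # checks positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # checks positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # checks positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_correct(codeword: list[int]) -> tuple[list[int], int]:
    """Fix a single flipped bit; return (codeword, error position or 0)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3   # syndrome names the faulty position
    if position:
        c[position - 1] ^= 1
    return c, position


def hamming74_data(codeword: list[int]) -> list[int]:
    """Extract the 4 data bits from a codeword."""
    return [codeword[2], codeword[4], codeword[5], codeword[6]]


word = hamming74_encode([1, 0, 1, 1])
damaged = list(word)
damaged[4] ^= 1                      # the "disk" at position 5 returns a wrong bit
fixed, position = hamming74_correct(damaged)
print(position, hamming74_data(fixed))
```

Three check disks for four data disks is a high overhead, which matches the note above that the error-correction method requires several disks.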

RAID 3

• RAID level 3 is similar to RAID level 2, because it uses the same striping method as level 2, but it requires only one disk for parity data.

• RAID 3 suffers from a write bottleneck, because all parity data is written to a single drive, but provides some read and write performance improvement.


RAID 3 (bit-interleaved parity)

RAID 4

• RAID level 4 is similar to RAID level 3, because it uses a similar striping method and requires only one disk for parity data, but it stripes data in much larger blocks or segments.

• RAID level 4 is not as efficient as RAID level 5, because (as in RAID level 3) all parity data is written to a single drive, so RAID level 4 suffers from a write bottleneck and is not generally used.
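The write bottleneck can be seen in a sketch of a single-block update with a dedicated parity disk. It assumes XOR parity and uses the common read-modify-write shortcut (new parity = old parity XOR old data XOR new data); the function names and the list-of-blocks model of a disk are hypothetical.

```python
def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def full_parity(data_disks: list[list[bytes]], stripe: int) -> bytes:
    """Parity of one stripe computed from scratch."""
    result = data_disks[0][stripe]
    for disk in data_disks[1:]:
        result = xor(result, disk[stripe])
    return result


def small_write(data_disks, parity_disk, disk, stripe, new_block):
    """Update one data block and patch the parity block to match."""
    old_block = data_disks[disk][stripe]
    # Only the target disk and the parity disk are touched, but the
    # parity disk is touched on EVERY write, whatever the target is.
    parity_disk[stripe] = xor(xor(parity_disk[stripe], old_block), new_block)
    data_disks[disk][stripe] = new_block


data_disks = [[b"\x01\x02"], [b"\x10\x20"], [b"\x0f\xf0"]]
parity_disk = [full_parity(data_disks, 0)]

small_write(data_disks, parity_disk, disk=1, stripe=0, new_block=b"\xaa\x55")
print(parity_disk[0] == full_parity(data_disks, 0))
```

Two writes aimed at different data disks could otherwise run in parallel, but both must queue for the one parity disk.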


RAID 4 (block-level parity)

RAID 5

• RAID level 5 is known as striping with parity.

• This is the most popular RAID level.

• It is similar to level 4 in that it stripes the data in large blocks across all the disks in the array.

• It differs in that it writes the parity across all the disks.

• The data redundancy is provided by the parity information.

• The data and parity information are arranged on the disk array so that the two are always on different disks.

• RAID level 5 has better performance than RAID level 1 and provides fault tolerance.
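One way to spread parity across all the disks is to move the parity block one disk to the left on each successive stripe. The sketch below prints such a layout. The rotation rule and the function name `raid5_layout` are assumptions for illustration; real implementations offer several different rotation schemes.

```python
def raid5_layout(num_disks: int, num_stripes: int) -> list[list[str]]:
    """Return one row per stripe; 'D<n>' is a data block, 'P<s>' is parity."""
    rows = []
    block = 0
    for stripe in range(num_stripes):
        # Assumed rule: parity starts on the last disk and moves left.
        parity_disk = (num_disks - 1 - stripe) % num_disks
        row = []
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append(f"P{stripe}")
            else:
                row.append(f"D{block}")
                block += 1
        rows.append(row)
    return rows


for row in raid5_layout(num_disks=4, num_stripes=4):
    print("  ".join(f"{cell:>3}" for cell in row))
```

Because each stripe keeps its parity on a different disk, writes to different stripes do not all compete for the same drive, and every parity block sits on a different disk from the data it protects.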


RAID 5 (block-level distributed parity)


Summary (0)

RAID-0 is the fastest and most efficient array type but offers no fault-tolerance.

Summary (1)

RAID-1 is the array of choice for performance-critical, fault-tolerant environments. In addition, RAID-1 is the only choice for fault-tolerance if no more than two drives are desired.

Summary (2)

RAID-2 is seldom used today since ECC is embedded in almost all modern disk drives.

Summary (3)

RAID-3 can be used in data intensive or single-user environments which access long sequential records to speed up data transfer. However, RAID-3 does not allow multiple I/O operations to be overlapped and requires synchronized-spindle drives in order to avoid performance degradation with short records.

Summary (4)

RAID-4 offers no advantages over RAID-5 and does not support multiple simultaneous write operations.

Summary (5)

RAID-5 is the best choice in multi-user environments which are not write performance sensitive. However, at least three, and more typically five drives are required for RAID-5 arrays.


Hardware vs. Software RAID

• Software-based arrays occupy host system memory, consume CPU cycles and are operating system dependent.

• Software-based arrays can degrade overall server performance.

• Unlike hardware-based arrays, the performance of a software-based array is directly dependent on server CPU performance and load.

• Software-based implementations commonly require a separate boot drive, which is NOT included in the array.