Enhanced Availability With RAID

CC5493/7493

RAID

• Redundant Array of Independent Disks

• RAID is implemented to improve:
  – IO throughput (speed)
  – Availability of a file system

RAID Implementation

• Software – often criticized as not being a true RAID implementation.

• Hardware – A special RAID controller is required.

RAID: Stripe

• The stripe takes on two meanings within the context of a RAID system:
  – Stripe width (number of independent drives)
  – Stripe size (storage block size)

Both stripe width and stripe size are adjusted to enhance IO throughput.

RAID Stripe Width

• Stripe width refers to the number of disks used in parallel for IO transfers to and from the array.

RAID Stripe Size

• Stripe size refers to the size of the contiguous block of data written to one disk before the array moves on to the next disk.

• The stripe size is adjusted to optimize the speed of the IO transfers.
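How stripe width and stripe size interact can be sketched as an address calculation. The function below is a minimal illustration of simple striping, not any particular controller's layout; the name `locate_block` and the 64 KiB stripe size are assumptions made for the example.

```python
from typing import Tuple

def locate_block(logical_offset: int, stripe_width: int, stripe_size: int) -> Tuple[int, int]:
    """Map a logical byte offset to (disk index, byte offset on that disk)."""
    unit = logical_offset // stripe_size   # index of the stripe unit holding the byte
    disk = unit % stripe_width             # stripe units rotate across the disks
    row = unit // stripe_width             # complete rows of units before this one
    return disk, row * stripe_size + logical_offset % stripe_size

# Hypothetical array: 4 disks, 64 KiB stripe size.
print(locate_block(200_000, 4, 65536))  # (3, 3392)
```

Consecutive stripe units land on different disks, which is why a large sequential transfer can keep all disks in the stripe busy at once.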

Common RAID Types

• RAID-0

• RAID-1

• RAID-1+0, RAID-0+1

• RAID-5

• RAID-6

RAID-0

• AKA disk striping

• Does not provide redundancy

• Degrades data availability and reduces MTBF (mean time between failures), because the failure of any one disk loses the whole array

• Improves IO throughput (average IO transfer rate improves)
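The availability cost of striping can be estimated with a simple model. The sketch below assumes that the disks fail independently with a constant failure rate, so the failure rates add and the array's mean time between failures (MTBF) is the single-disk MTBF divided by the number of disks. The function name and the 1,000,000-hour figure are hypothetical.

```python
def raid0_mtbf(disk_mtbf_hours: float, n_disks: int) -> float:
    """Approximate MTBF of an n-disk RAID-0 array.

    Assumes independent disks with constant failure rates, so the
    array failure rate is n times the single-disk failure rate.
    """
    if n_disks < 1:
        raise ValueError("n_disks must be at least 1")
    return disk_mtbf_hours / n_disks

# Hypothetical disk MTBF of 1,000,000 hours in a 4-disk stripe.
print(raid0_mtbf(1_000_000, 4))  # 250000.0
```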

RAID-0

• Ideal for temporary storage requiring fast data access.
  – Engineering/scientific calculations on large data volumes. The data on the array should only be a temporary copy of data that is also kept elsewhere.

RAID-1

• AKA mirroring

• Requires two independent disk devices
  – The first disk stores the data
  – The second disk is an image of the first
  – Can double the overall read throughput

RAID-1

• width = 1

RAID-1 Advantages

• Improves data availability.

• Dual-channel controller allows for two simultaneous read operations.

• Allows for error detection on read.

• Administrative advantages for service on one drive while the other remains available.

• Tolerates the failure of one drive.

RAID-1 Disadvantages

• Writes have a slight performance penalty compared to no RAID.

• Doubles the cost of storage.

• Storage efficiency = 50%

RAID-1

• Ideal for data that is read more often than written:
  – Some database information that is not updated often.
  – Web server information (lots of reads, few writes)

RAID-1+0

• Enhances IO throughput and data availability.

• Requires 2(n+1) separate disk devices, where n = 1, 2, 3, 4, …
  – Minimum of 4 disks required (n=1)

RAID-1+0

• Width = 2

RAID-1+0

• Width = 4

RAID-1+0

• RAID-1+0 has a higher fault tolerance than RAID-0, RAID-1, and RAID-5.

• Storage efficiency is 50%

RAID-0+1

• Requires the same hardware as RAID-1+0, but less fault tolerant.

• However, there is better read throughput from RAID-0+1 compared to RAID-1+0.

RAID-0+1

• Duplicate RAID-0 arrays. Allows simultaneous reads
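The difference in fault tolerance between RAID-1+0 and RAID-0+1 can be checked by enumerating disk failures. The sketch below uses a hypothetical 4-disk layout with two groups of two disks, and a simplified model in which a RAID-0+1 stripe set is unusable as soon as one of its disks fails. All names are illustrative.

```python
from itertools import combinations

GROUPS = [(0, 1), (2, 3)]  # hypothetical layout: two groups of two disks

def raid10_survives(failed):
    # RAID-1+0: each group is a mirror, and every mirror needs one working disk.
    return all(any(d not in failed for d in group) for group in GROUPS)

def raid01_survives(failed):
    # RAID-0+1: each group is a stripe set, and one whole stripe set must be intact.
    return any(all(d not in failed for d in group) for group in GROUPS)

def count_survivable(survives, n_failed=2, n_disks=4):
    """Count the failure combinations of n_failed disks that the array survives."""
    return sum(survives(set(c)) for c in combinations(range(n_disks), n_failed))

print(count_survivable(raid10_survives))  # 4 of the 6 two-disk failures
print(count_survivable(raid01_survives))  # 2 of the 6 two-disk failures
```

Under this model both layouts survive any single disk failure, but RAID-1+0 survives twice as many of the possible two-disk failures.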

RAID-5

• RAID-5 enhances:
  – IO data throughput
  – Data availability

• Parity information enhances availability

• Requires a minimum of 3 independent disk devices.

Parity Information

• Based on the logical exclusive-or operation.
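The exclusive-or property that makes parity useful is that XOR-ing the parity block with the surviving data blocks yields the missing block. The sketch below illustrates this on three small, made-up data blocks; the helper name `xor_blocks` is hypothetical and the example ignores how a real array rotates parity across disks.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

data = [b"RAID", b"disk", b"5493"]  # three hypothetical data blocks
parity = xor_blocks(data)           # parity block stored on a fourth disk

# Simulate losing the second block and rebuilding it from the rest plus parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])  # True
```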

RAID-5 Configuration

• Stripe Width = 4

RAID-5

• The most common implementation of RAID.

• Ideal for a disk-server providing general storage.

• A good balance between reliability and speed.

• Often implemented using high quality disk drives (SCSI, 15k-rpm, high MTBF)

RAID-5 Limitations

• Overhead occurs during writes due to the parity calculation and parity write.

• Storage efficiency is not 100% due to the parity storage requirements.
  – Storage efficiency = (n-1)/n, where n = number of drives.

RAID-5 (S)ATA Limitations

• Large capacity (S)ATA drives are more likely to contain bad blocks.

• After a disk failure, a bad block on one of the remaining drives can make it impossible to rebuild the array.
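Why large drives make this worse can be seen from a rough estimate of the chance that a rebuild hits at least one unreadable block. A rebuild must read every surviving disk in full, so the exposure grows with capacity. The sketch below assumes independent errors and an unrecoverable read error rate of 1 per 10^14 bits, a figure often quoted for consumer drives; both the rate and the function name are assumptions, not values from the slides.

```python
import math

def p_rebuild_hits_bad_block(surviving_disks: int, disk_bytes: float,
                             ure_per_bit: float = 1e-14) -> float:
    """Probability of at least one unrecoverable read error during a rebuild.

    Assumes each bit read fails independently with probability ure_per_bit.
    """
    bits_read = surviving_disks * disk_bytes * 8
    # 1 - (1 - p)^bits, computed in a numerically stable way.
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))

# Hypothetical 4-disk RAID-5 of 2 TB drives: 3 surviving disks are read in full.
print(round(p_rebuild_hits_bad_block(3, 2e12), 2))  # roughly 0.38
```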

RAID-6

• Contains two sets of parity.

• Tolerates two simultaneous disk failures.

• A better solution for (S)ATA arrays where each disk has a large capacity (multiple TB).

• Stripe Width = 6

RAID-6

• Higher availability at the cost of greater IO overhead due to complex parity calculations and storage.

• Storage efficiency = (n-2)/n

• Becoming more popular for large storage capacity (S)ATA arrays
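The storage efficiency figures quoted for the different RAID types can be compared side by side. The sketch below is an illustrative calculator; the function name, the level labels, and the minimum disk counts for RAID-0 and RAID-6 are assumptions of the example rather than values stated in the slides.

```python
from fractions import Fraction

# Minimum disks per level. The RAID-0 and RAID-6 minimums are assumed.
MIN_DISKS = {"RAID-0": 2, "RAID-1": 2, "RAID-1+0": 4, "RAID-5": 3, "RAID-6": 4}

def storage_efficiency(level: str, n: int) -> Fraction:
    """Usable fraction of raw capacity for an n-disk array of the given level."""
    if level not in MIN_DISKS:
        raise ValueError(f"unknown RAID level: {level}")
    if n < MIN_DISKS[level]:
        raise ValueError(f"{level} needs at least {MIN_DISKS[level]} disks")
    if level == "RAID-0":
        return Fraction(1)              # no redundancy, all capacity is usable
    if level in ("RAID-1", "RAID-1+0"):
        return Fraction(1, 2)           # every block is stored twice
    parity_disks = 1 if level == "RAID-5" else 2
    return Fraction(n - parity_disks, n)

print(storage_efficiency("RAID-5", 4))  # 3/4
print(storage_efficiency("RAID-6", 6))  # 2/3
```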

RAID-6 Disadvantages

• More expensive to implement due to extra parity information

• Slower write operations compared to RAID-5

RAID Disk Swapping

• Hot Swap

• Warm Swap

• Cold Swap

Hot Swap

• The ability to swap out a failed disk from a RAID array without an interruption of service from the array.

• Performance will be slower due to the operations required to rebuild the new replacement disk.

Warm Swap

• The array is not accessible while a drive is being serviced, but the system does not need to be shut down.

Cold Swap

• The system must be shut down to service the array.

Spare Disk: Hot Spare

• Some RAID controllers can be configured to immediately recover from a disk failure if a hot-spare disk is connected to the controller at all times.

RAID Disk Failure and Performance

• When a failed disk is replaced in an array, there is a performance hit as the new disk must be re-populated with the required data for the complete array.

RAID Summary

• RAID-0 : for temporary storage only

• RAID-1 : ideal for disk services that mostly perform read operations, such as database services and web services.

• RAID-5 : general purpose disk-server

• RAID-6 : for environments with very large storage requirements (multiple terabytes).

RAID Summary

• RAID 1+0 : general purpose disk server where RAID-5 & 6 are not adequate.
  – Better fault tolerance
  – More IO throughput

Other?

• RAID 1+1, mirror a mirrored RAID-1
  – Triples the cost of storage.
  – Excellent fault tolerance.
  – Excellent read throughput.
  – Writes will suffer.
