Chapter 3
Presented by:
Anupam Mittal
Data protection: Concept of RAID and its Components
Data Protection: RAID - 2
After completing this chapter, you will be able to:
◦ Describe what RAID is and the needs it addresses
◦ Describe the concepts upon which RAID is built
◦ Define and compare RAID levels
◦ Recommend the use of the common RAID levels based on performance and availability considerations
◦ Explain factors impacting disk drive performance
Performance limitations of a single disk drive
◦ Limited capacity
◦ Limited access speed
An individual drive has a certain life expectancy
◦ Measured in MTBF (mean time between failures)
◦ Example: if the MTBF of a drive is 750,000 hours and there are 100 drives in the array, then the MTBF of the array becomes 750,000 / 100, or 7,500 hours
RAID was introduced to mitigate this problem
RAID provides:
◦ Increased capacity
◦ Higher availability
◦ Increased performance
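The MTBF example above can be checked with a few lines of Python. This is a minimal sketch of the slide's simplification (array MTBF = drive MTBF divided by the number of drives); the function name `array_mtbf_hours` is chosen here for illustration and is not part of the original material.

```python
def array_mtbf_hours(drive_mtbf_hours, drive_count):
    """Approximate MTBF of an array of identical drives.

    Follows the slide's simplification: the more drives there are,
    the sooner one of them is expected to fail.
    """
    if drive_count < 1:
        raise ValueError("drive_count must be at least 1")
    return drive_mtbf_hours / drive_count


mtbf = array_mtbf_hours(750_000, 100)
print(mtbf)        # 7500.0 hours
print(mtbf / 24)   # 312.5 days
```

In other words, an array of 100 such drives can expect a drive failure roughly every 312 days, which is the problem RAID redundancy is meant to mitigate.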
[Figure: a host connected through a RAID controller to a RAID array.]
[Figure: components of a RAID array: the host, the RAID controller, and hard disks grouped into physical arrays and logical arrays.]
Hardware (usually a specialized disk controller card)
◦ Controls all drives attached to it
◦ Array(s) appear to the host operating system as a regular disk drive
◦ Provided with administrative software
Software
◦ Runs as part of the operating system
◦ Performance is dependent on CPU workload
◦ Does not support all RAID levels
RAID levels
◦ 0: Striped array with no fault tolerance
◦ 1: Disk mirroring
◦ 3: Parallel access array with dedicated parity disk
◦ 4: Striped array with independent disks and a dedicated parity disk
◦ 5: Striped array with independent disks and distributed parity
◦ 6: Striped array with independent disks and dual distributed parity
◦ Nested RAID (i.e., 1 + 0, 0 + 1, etc.)
© 2008 EMC Corporation. All rights reserved.
RAID Redundancy: Parity
[Figure: a host attached through the RAID controller to four data disks holding strips 0, 4, 8 / 1, 5, 9 / 2, 6, 10 / 3, 7, 11, plus a parity disk holding the parity of strips 0-3, 4-7, and 8-11.]
Parity Calculation
[Figure: a RAID array with four data disks holding the values 5, 3, 4, and 2, and a parity disk holding 14.]
5 + 3 + 4 + 2 = 14
The middle drive fails:
5 + 3 + ? + 2 = 14
? = 14 – 5 – 3 – 2
? = 4
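The arithmetic above uses ordinary addition to keep the idea readable; as noted later in this chapter, parity in practice is computed with XOR operations. The sketch below shows both variants recovering the missing value from the slide's example. The function names are illustrative, not taken from any RAID product.

```python
from functools import reduce
from operator import xor


def sum_parity(values):
    """Slide-style parity: the plain sum of the data values."""
    return sum(values)


def recover_with_sum(parity, surviving):
    """Missing value = parity minus the surviving values."""
    return parity - sum(surviving)


def xor_parity(values):
    """XOR parity over the same values."""
    return reduce(xor, values, 0)


def recover_with_xor(parity, surviving):
    """XOR-ing the parity with the surviving values yields the missing one."""
    return reduce(xor, surviving, parity)


data = [5, 3, 4, 2]          # values from the slide
surviving = [5, 3, 2]        # the drive holding 4 has failed

print(sum_parity(data))                             # 14
print(recover_with_sum(14, surviving))              # 4
print(recover_with_xor(xor_parity(data), surviving))  # 4
```

XOR has the convenient property that "adding" and "subtracting" are the same operation, so the same routine both generates parity and rebuilds a lost value.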
Lecture 8, 9, 10
Different RAID levels and their suitability for different application environments: RAID 0, RAID 1
[Figure: data striping. Each disk is divided into strips (Strip 1, Strip 2, Strip 3); the set of aligned strips across the disks forms a stripe (Stripe 1, Stripe 2).]
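Striping can be sketched as a mapping from a logical block number to a disk and a stripe. The example below is a simplified illustration that assumes one block per strip and a simple round-robin placement; `locate_block` and `layout` are hypothetical helper names, and real controllers use a configurable strip size.

```python
def locate_block(logical_block, disk_count):
    """Return (disk index, stripe index) for a logical block.

    Assumes one block per strip and round-robin placement across disks.
    """
    stripe, disk = divmod(logical_block, disk_count)
    return disk, stripe


def layout(block_count, disk_count):
    """List the logical blocks that land on each disk."""
    disks = [[] for _ in range(disk_count)]
    for block in range(block_count):
        disk, _stripe = locate_block(block, disk_count)
        disks[disk].append(block)
    return disks


for disk, blocks in enumerate(layout(12, 4)):
    print(f"disk {disk}: {blocks}")
# disk 0: [0, 4, 8]
# disk 1: [1, 5, 9]
# disk 2: [2, 6, 10]
# disk 3: [3, 7, 11]
```

Because consecutive blocks sit on different disks, a large sequential transfer can keep all disks busy at once, which is where the performance benefit of striping comes from.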
[Figure: striped array. Strips 0 to 11 from the host are distributed across the disks by the RAID controller.]
[Figure: disk mirroring. The RAID controller writes Block 0 and Block 1 from the host to both disks of a mirrored pair.]
[Figures: nested RAID layouts combining RAID 0 striping and RAID 1 mirroring of Blocks 0 to 3 across the disks behind the RAID controller.]
Benefits of RAID 1+0 and RAID 0+1 are identical under normal operations
Rebuild operations are very different
◦ RAID 1+0 uses mirrored pairs: if a disk fails, only that one disk is rebuilt
◦ RAID 0+1: if a single drive fails, the entire stripe is faulted
RAID 0+1 is a poorer solution and is less common
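The rebuild difference can be made concrete with a small model. This is a sketch under simple assumptions: disks are numbered integers, a RAID 1+0 set is described as a list of mirrored pairs, and a RAID 0+1 set as a list of stripe sets that mirror each other. The function names are hypothetical.

```python
def disks_to_rebuild_raid10(mirrored_pairs, failed_disk):
    """RAID 1+0: only the failed disk is rebuilt, from its mirror partner."""
    for pair in mirrored_pairs:
        if failed_disk in pair:
            return [failed_disk]
    raise ValueError("failed_disk is not part of the array")


def disks_to_rebuild_raid01(stripe_sets, failed_disk):
    """RAID 0+1: the whole stripe set containing the failed disk is faulted,
    so every disk in that set is rebuilt from the surviving stripe set."""
    for stripe_set in stripe_sets:
        if failed_disk in stripe_set:
            return list(stripe_set)
    raise ValueError("failed_disk is not part of the array")


# Four disks, numbered 0 to 3, with disk 0 failing in both layouts.
print(disks_to_rebuild_raid10([(0, 1), (2, 3)], 0))  # [0]
print(disks_to_rebuild_raid01([(0, 1), (2, 3)], 0))  # [0, 1]
```

The gap widens with array size: with larger stripe sets, a single failure in RAID 0+1 forces a rebuild of every disk in the faulted set, while RAID 1+0 still rebuilds one disk.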
[Figure: a host attached through the RAID controller to four data disks holding strips 0, 4, 8 / 1, 5, 9 / 2, 6, 10 / 3, 7, 11, plus a parity disk holding the parity of strips 0-3, 4-7, and 8-11.]
[Figure: parity-based rebuild example. The data values are 4, 6, 1, and 7, the parity value is 18, and one data disk is marked as failed.]
Parity calculation: 4 + 6 + 1 + 7 = 18
The middle drive fails:
4 + 6 + ? + 7 = 18
? = 18 – 4 – 6 – 7
? = 1
[Figure: the host writes Blocks 0 to 3; the RAID controller generates the parity P 0 1 2 3 and writes the blocks and the parity across the disks.]
RAID 4 – Striping with Dedicated Parity Disk
[Figure: Blocks 0 to 7 are striped across the data disks; the parity P 0 1 2 3 and P 4 5 6 7, generated by the RAID controller, is stored on the dedicated parity disk.]
[Figure: Blocks 0 to 7 striped across the disks, with the parity P 0 1 2 3 and P 4 5 6 7 generated by the RAID controller as each block is written.]
Two disk failures in a RAID set lead to data unavailability and data loss in single-parity schemes, such as RAID 3, 4, and 5
An increasing number of drives in an array and increasing drive capacity lead to a higher probability of two disks failing in a RAID set
RAID 6 protects against two disk failures by maintaining two parities
◦ Horizontal parity, which is the same as RAID 5 parity
◦ Diagonal parity, which is calculated by taking diagonal sets of data blocks from the RAID set members
Even-Odd and Reed-Solomon are two commonly used algorithms for calculating parity in RAID 6
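Why two parities allow recovery from two failures can be illustrated with a toy example. The sketch below is NOT Even-Odd, Reed-Solomon, or diagonal parity: it is an assumption-laden analogy that uses ordinary integer arithmetic, with a plain sum as the first parity and a position-weighted sum as the second. Two independent equations are enough to solve for two unknown values, which is the essential idea. All names are hypothetical.

```python
def two_parities(data):
    """Return (p, q): a plain sum and a position-weighted sum (toy scheme)."""
    p = sum(data)
    q = sum((i + 1) * d for i, d in enumerate(data))
    return p, q


def recover_two(data_with_gaps, p, q):
    """Rebuild exactly two missing values (marked None) from p and q."""
    missing = [i for i, d in enumerate(data_with_gaps) if d is None]
    if len(missing) != 2:
        raise ValueError("this sketch handles exactly two missing values")
    a, b = missing
    known = [(i, d) for i, d in enumerate(data_with_gaps) if d is not None]
    p_rest = p - sum(d for _, d in known)            # = d_a + d_b
    q_rest = q - sum((i + 1) * d for i, d in known)  # = (a+1)*d_a + (b+1)*d_b
    d_b = (q_rest - (a + 1) * p_rest) // (b - a)
    d_a = p_rest - d_b
    rebuilt = list(data_with_gaps)
    rebuilt[a], rebuilt[b] = d_a, d_b
    return rebuilt


p, q = two_parities([4, 6, 1, 7])
print(p, q)                                   # 18 47
print(recover_two([4, None, None, 7], p, q))  # [4, 6, 1, 7]
```

With only the single parity value 18, the two missing values could be any pair summing to 7; the second parity removes that ambiguity.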
Hardware (usually a specialized disk controller card)
◦ Controls all drives attached to it
◦ Performs all RAID-related functions, including volume management
◦ Array(s) appear to the host operating system as a regular disk drive
◦ Dedicated cache to improve performance
◦ Generally provides some type of administrative software
Software
◦ Generally runs as part of the operating system
◦ Volume management performed by the server
◦ Provides more flexibility for hardware, which can reduce the cost
◦ Performance is dependent on CPU load
◦ Has limited functionality
Comparison of RAID Levels
RAID | Min Disks | Storage Efficiency % | Cost | Read Performance | Write Performance
0 | 2 | 100 | Low | Very good for both random and sequential reads | Very good
1 | 2 | 50 | High | Good; better than a single disk | Good; slower than a single disk, as every write must be committed to two disks
3 | 3 | (n-1)*100/n | Moderate | Good for random reads and very good for sequential reads | Poor to fair for small random writes; good for large, sequential writes
5 | 3 | (n-1)*100/n | Moderate | Very good for random reads; good for sequential reads | Fair for random writes, slower due to parity overhead; fair to good for sequential writes
6 | 4 | (n-2)*100/n | Moderate but more than RAID 5 | Very good for random reads; good for sequential reads | Good for small, random writes (has write penalty)
1+0 and 0+1 | 4 | 50 | High | Very good | Good
where n = number of disks
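The storage efficiency column can be evaluated for a concrete number of disks. This is a small sketch that encodes the formulas and minimum disk counts from the comparison table; the function name and the string labels used for the RAID levels are assumptions made for this example.

```python
# Minimum disk counts as listed in the comparison table.
MIN_DISKS = {"0": 2, "1": 2, "3": 3, "5": 3, "6": 4, "1+0": 4, "0+1": 4}


def storage_efficiency_percent(level, n):
    """Usable share of raw capacity, in percent, for n disks."""
    if level not in MIN_DISKS:
        raise ValueError(f"unknown RAID level: {level}")
    if n < MIN_DISKS[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DISKS[level]} disks")
    if level == "0":
        return 100.0
    if level in ("1", "1+0", "0+1"):
        return 50.0
    if level in ("3", "5"):
        return (n - 1) * 100 / n   # one disk's worth of parity
    return (n - 2) * 100 / n       # RAID 6: two disks' worth of parity


print(storage_efficiency_percent("5", 5))    # 80.0
print(storage_efficiency_percent("6", 8))    # 75.0
print(storage_efficiency_percent("1+0", 4))  # 50.0
```

For parity RAID the efficiency improves as disks are added, whereas mirroring stays at 50 percent regardless of array size.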
RAID Comparison
Small (less than element size) write on RAID 3 and 5
◦ Ep = E1 + E2 + E3 + E4 (XOR operations)
◦ If parity is valid, then: Ep new = Ep old – E4 old + E4 new (XOR operations)
◦ 2 disk reads and 2 disk writes
Parity vs. mirroring
◦ Reading, calculating, and writing the parity segment introduces a penalty on every write operation
◦ The parity RAID penalty manifests as slower cache flushes
◦ Increased write load can cause contention and slower read response times
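The small-write parity update can be verified in a few lines. Because the operations are XOR, the "–" and "+" in the formula are both XOR. The sketch below uses arbitrary 4-bit example values (not taken from the slides) and checks that the shortcut gives the same parity as recomputing it from all four elements.

```python
from functools import reduce
from operator import xor


def full_parity(elements):
    """Ep = E1 xor E2 xor E3 xor E4: requires reading every element."""
    return reduce(xor, elements, 0)


def updated_parity(ep_old, e4_old, e4_new):
    """Ep new = Ep old xor E4 old xor E4 new: needs only the old data
    element and the old parity (2 reads), then writes the new data
    element and the new parity (2 writes)."""
    return ep_old ^ e4_old ^ e4_new


elements = [0b1010, 0b0110, 0b1111, 0b0001]  # E1..E4, example values
ep_old = full_parity(elements)

e4_new = 0b1000
ep_new = updated_parity(ep_old, elements[3], e4_new)

# The shortcut matches a full recomputation over the updated elements.
assert ep_new == full_parity(elements[:3] + [e4_new])
print(bin(ep_old), bin(ep_new))
```

This is why a small write costs four disk operations on parity RAID instead of the two needed by a mirrored pair.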
[Figure: the RAID controller computes Ep new = Ep old – E4 old + E4 new using 2 XOR operations, across disks P0, D1, D2, D3, and D4.]
◦ What is a RAID array?
◦ What benefits do RAID arrays provide?
◦ What methods can be used to provide higher data availability in a RAID array?
◦ What is the primary difference between RAID 3 and RAID 5?
◦ What is the advantage of using RAID 6?
◦ What is a hot spare?