The latest scoop on the popular disk storage technology, how it works, and what it can do for you. Walter J. Alexander, IV Technical Services Supervisor Shelby County Schools AETA - October, 2007 (revised post-conference to correct errors)




No, not that RAID!

???

What Does RAID Mean?


Redundant Array(s) of Inexpensive Disks

The technology we now call RAID was developed (and patented) in 1978.

The term RAID was first used in 1987.

Redundant? That means there’s another one to take over if the first one can no longer perform the job.


Okay, so what’s an Array?

Here we have 3 separate disks (hard drives): 36GB, 36GB and 36GB.

If we take these 3 disks and treat them as a single unit, now we have an array!

Now we have a 108GB disk!


Why RAID?

Fault Tolerance
  If a hard drive fails, your system could continue to run, and even allow you to make the repair without ever taking the system down.
  Users continue to work.
  No data is lost – thus no restores.

Better Performance
  Disk reads or writes can occur more quickly, giving users what they need faster.


Disk Types used in RAID Arrays

ATA, PATA, IDE & EIDE
  These terms are often used interchangeably, though not always correctly.
  They usually refer to the common 3.5” hard drive found in most computers over the last 10 years.

SCSI, Ultra SCSI, Wide SCSI, SCSI2, SCSI3
  The standard for servers and high-end PCs from around 1993 until now.

SATA
  Quickly becoming the new standard for PCs.
  Also becoming a strong contender in the server market.

RAID itself will work with any of these disk types. RAID relies on the controller more than the type of disk.


RAID Hardware vs Software Controllers

RAID is a method of disk management and data input/output.

RAID can be achieved via software or hardware methods.
  Software-Based RAID
  Hardware-Based RAID

Software-Based RAID is built into some operating systems.
  Windows Server 2003
  Linux (various versions)


Software-Based RAID

Pros
  Built into operating system.
  No additional cost for RAID controller.
  Wizard-like GUI configuration makes setup easy.

Cons
  Performance impact on operating system.
  Memory and processor usage.
  May not be able to include every partition on the disks in the RAID array.
  Problem recovery can be more difficult.

When to Use
  When hardware-based RAID is not an option, but RAID is desired.
  Non-critical servers with large disk storage requirements.


Hardware-Based RAID

Pros
  Built into many newer servers.
  Easy to add PCI-type controller to servers.
  All the overhead of controlling the RAID array is handled by the controller itself.
  More advanced recovery options for serious failures.

Cons
  Possible additional cost for controller.
  BIOS-based setup utility may not be as easy to use as GUI utilities.

When to Use
  Every time it’s available.


RAID Basics

RAID is more about the configuration of the drives, and how data is written to, and read from, those drives than it is about physical connections.

Something must manage the RAID system – this must be either Hardware or Software. If Hardware, this is often called the RAID Controller. If Software, this is often called the RAID Manager.

The RAID number (e.g. RAID-1, RAID-5) indicates the configuration of the disks, how data is written, and how data is read. This also indicates what happens if part of the disk storage system fails.

The simplest forms of RAID require 2 hard drives. In most cases, all drives in the RAID array must be:
  The same size (capacity)
  The same speed
  The same interface
For all practical purposes, drives are likely the same exact model.


RAID Terminology

RAID – Redundant Array of Inexpensive Disks.
Array – A grouping of multiple hard drives into a single entity. This entity is a RAID-x Array.
Disk – The physical hard drive. There will be at least 2 of these when talking about RAID.
Controller – The physical card (or built-in hardware) that connects to the Disks. A RAID Controller has the smarts to arrange those disks into an array.
Manager – The software component that has the smarts to arrange disks into an array.
Parity – Refers to Parity Blocks used in some RAID configurations. These Parity Blocks provide the ability to reconstruct data after a drive failure.
Cache – Memory set aside (usually part of a controller) to pre-store anticipated Read Data, or to queue Write Data waiting to be written.

(Remember, you normally are dealing with either HARDWARE-BASED RAID, or SOFTWARE-BASED RAID… not both on the same server).
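
To make the Parity entry above a little more concrete, here is a minimal Python sketch. It assumes the parity block is the bitwise XOR of the data blocks, which is the usual scheme for single-parity RAID; the block contents and the helper name xor_blocks are made up for illustration and do not come from any particular controller.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, one per data disk (illustrative contents only).
data = [b"RAID", b"is  ", b"neat"]

# The parity block is stored alongside the data blocks.
parity = xor_blocks(data)

# Pretend the disk holding data[1] has failed: XOR the survivors with parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt)  # prints b'is  '
```

XOR-ing the surviving blocks with the parity block cancels out everything except the missing block, which is why one parity block can cover the loss of any one disk in the stripe.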


RAID-0

Commonly known as a Striped Set.
Requires at least 2 disks.
There is NO Redundancy (AID-0?)
Data is striped across disks.

Pros
  Faster performance because there is no Parity generation.
  Faster reads and writes because data can be transferred in a parallel fashion.
  You get the full capacity of all disks for data storage.

Cons
  If any disk fails, all the data is irrecoverable.

When to Use
  Performance is the number-one goal.
  Data is backed up elsewhere.
  Downtime is not a problem (lengthy restores).
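
As a rough picture of what "striped across disks" means, the Python sketch below deals fixed-size chunks out to the disks in round-robin order and then reads them back. The function names stripe and unstripe, the 2-byte chunk size and the sample data are all assumptions for illustration; real controllers use much larger stripe sizes.

```python
def stripe(data, num_disks, chunk_size):
    """Deal fixed-size chunks of data across the disks in round-robin order."""
    disks = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), chunk_size):
        chunk_number = offset // chunk_size
        disks[chunk_number % num_disks].append(data[offset:offset + chunk_size])
    return disks


def unstripe(disks):
    """Read the chunks back in the same round-robin order."""
    out = []
    for row in range(max(len(disk) for disk in disks)):
        for disk in disks:
            if row < len(disk):
                out.append(disk[row])
    return b"".join(out)


disks = stripe(b"ABCDEFGHIJKL", num_disks=3, chunk_size=2)
print(disks)  # prints [[b'AB', b'GH'], [b'CD', b'IJ'], [b'EF', b'KL']]
print(unstripe(disks))  # prints b'ABCDEFGHIJKL'
```

Every disk holds a slice of every large file, so all three can work at once (the speed benefit), but losing any one of them leaves holes throughout the data (the risk).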


RAID-1

Commonly known as a Mirrored Set.
Requires at least 2 disks.
All data is written to both disks.

Pros
  Fault-Tolerance
    If either disk fails, the system can keep working on the remaining disk.
  Faster disk reads because data can be read in a parallel fashion.

Cons
  You lose the capacity of the second disk.
    If each disk is 40GB, you only have 40GB of storage space (not 80GB).

When to Use
  Servers that are only configured with two disks.
  Fault-tolerance is important.
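
The mirrored-set behavior can be mimicked with a toy Python class. This is only a sketch under simple assumptions: each "disk" is a dictionary of block number to data, and the class name Mirror and its methods are invented for this example rather than taken from any RAID product.

```python
class Mirror:
    """Toy RAID-1 set: every write goes to both disks."""

    def __init__(self):
        self.disks = [{}, {}]          # block number -> data, one dict per disk
        self.failed = [False, False]

    def write(self, block, data):
        # Write the same data to every disk that is still working.
        for index, disk in enumerate(self.disks):
            if not self.failed[index]:
                disk[block] = data

    def read(self, block):
        # Serve the read from the first disk that is still working.
        for index, disk in enumerate(self.disks):
            if not self.failed[index]:
                return disk[block]
        raise IOError("both disks have failed; data is lost")

    def fail(self, index):
        # Simulate a dead drive: mark it failed and discard its contents.
        self.failed[index] = True
        self.disks[index] = {}


mirror = Mirror()
mirror.write(0, b"payroll")
mirror.fail(0)                 # the first disk dies
print(mirror.read(0))          # prints b'payroll' (served by the second disk)
```

The capacity cost is visible here too: two dictionaries hold the same blocks, so two disks give you one disk's worth of storage.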


RAID-3

Commonly known as a Striped Set with Dedicated Parity.
Requires at least 3 disks.
Data is striped to data disks, then parity is written to the parity disk.
RAID-3 accesses all the disks at once, and therefore can only service one I/O request at a time.

Pros
  Fault-Tolerance
    If a data disk fails, the system can keep working on the remaining disks.
    If the parity disk fails, the system can keep working but without parity.
  High disk read performance because data can be read in a parallel fashion.
  High disk write performance for large, sequential data.

Cons
  You lose the capacity of the parity disk.
  Not efficient for small, random data.

When to Use
  Very effective for large sequential data, such as video.


RAID-4

Also known as a Striped Set with Dedicated Parity, but with multiple I/O calls.
Requires at least 3 disks.
Data is striped to data disks, then parity is written to the parity disk.

Pros
  Fault-Tolerance
    If a data disk fails, the system can keep working with the remaining disks.
    If the parity disk fails, the system can keep working without parity, with no noticeable effect on performance.
  High disk read performance because data can be read in a parallel fashion.
  Can execute multiple I/O requests at once, assuming that data is on different physical disks.

Cons
  You lose the capacity of the parity disk.
  Medium write performance.

When to Use
  Effective for small, random data.


RAID-5

Commonly known as a Striped Set with Distributed Parity.
Requires at least 3 disks.
Data and parity are striped to all disks in a round-robin fashion.

Pros
  Fault-Tolerance
    If any single disk fails, the system can keep working on the remaining disks.
  Best balance of cost, performance and fault-tolerance for most applications.
  High disk read performance because data can be read in a parallel fashion.
  Can execute multiple I/O requests at once, assuming that data is on different physical disks.
  Highly efficient.

Cons
  Inefficient with large file transfers.
  Medium write performance.
  Disk failure will have a direct impact on performance.

When to Use
  Most applications, including databases, file and print services, web servers and e-mail.
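
To picture "distributed parity in a round-robin fashion", the Python sketch below prints which disk holds the parity block for each stripe. It assumes one simple rotation (parity starts on the last disk and moves one disk to the left per stripe); actual controllers and software managers use several different rotation patterns, and the function name raid5_layout is just for this example.

```python
def raid5_layout(num_disks, num_stripes):
    """Return a grid of labels: one row per stripe, one column per disk."""
    rows = []
    block = 0
    for stripe in range(num_stripes):
        # Assumed rotation: parity starts on the last disk and moves left.
        parity_disk = (num_disks - 1 - stripe) % num_disks
        row = []
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append(f"P{stripe}")     # parity block for this stripe
            else:
                row.append(f"D{block}")      # next data block in sequence
                block += 1
        rows.append(row)
    return rows


layout = raid5_layout(num_disks=4, num_stripes=4)
for row in layout:
    print("  ".join(f"{cell:>3}" for cell in row))
```

With 4 disks and 4 stripes, each disk ends up holding exactly one parity block, so no single disk becomes the parity bottleneck the way the dedicated parity disk can in RAID-3 and RAID-4.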


RAID-6

Commonly known as a Striped Set with DUAL-Distributed Parity.
Requires at least 4 disks.
Data and parity are striped to all disks in a round-robin fashion.

Pros
  Fault-Tolerance
    If any TWO disks fail, the system can keep working on the remaining disks.

Cons
  You lose the storage capacity of two disks.

When to Use
  Consider RAID-6 when you have RAID arrays with more than 10 large capacity physical disks.
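
The capacity trade-offs listed for the standard levels can be summed up in a few lines of Python. This is a back-of-the-envelope sketch that assumes all disks are the same size and ignores formatting overhead; the function name usable_capacity_gb and the LEVELS table are invented for this example.

```python
# level -> (minimum disks, disks' worth of capacity given up to parity)
LEVELS = {
    0: (2, 0),   # striped set, no redundancy
    3: (3, 1),   # dedicated parity disk
    4: (3, 1),   # dedicated parity disk
    5: (3, 1),   # distributed parity
    6: (4, 2),   # dual distributed parity
}


def usable_capacity_gb(level, disk_count, disk_size_gb):
    """Estimate usable space for an array of equal-size disks."""
    if level == 1:
        # Mirrored set: every disk holds the same data.
        if disk_count < 2:
            raise ValueError("RAID-1 needs at least 2 disks")
        return disk_size_gb
    minimum, lost = LEVELS[level]
    if disk_count < minimum:
        raise ValueError(f"RAID-{level} needs at least {minimum} disks")
    return (disk_count - lost) * disk_size_gb


print(usable_capacity_gb(0, 3, 36))   # prints 108
print(usable_capacity_gb(1, 2, 40))   # prints 40
print(usable_capacity_gb(5, 4, 72))   # prints 216
print(usable_capacity_gb(6, 4, 72))   # prints 144
```

The first two results match the 108GB array and the 40GB mirror used as examples earlier in this deck.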


Nested RAID Levels

Supported by many RAID controllers.
Allows the use of multiple RAID strategies on the same set of physical disks.

RAID 0+1
  Minimum of 4 disks – must be even number (4, 6, 8, etc.)
  Two striped sets are built, then one is mirrored to the other.

RAID 1+0 (RAID 10)
  Minimum of 4 disks – must be even number (4, 6, 8, etc.)
  Mirrored sets are built, then data is striped across them.

RAID 5+0 (RAID 50)
  Minimum of 6 disks (two RAID-5 sets of 3 disks each).
  Striped set across distributed parity RAID systems.

RAID 5+1 (RAID 51)
  Minimum of 6 disks – must be even number (6, 8, 10, etc.)
  Mirrored striped set with distributed parity.
  Sometimes confused with RAID 53, which is a striped set of RAID-3 arrays.
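
RAID 0+1 and RAID 1+0 look alike on paper, so here is a small Python sketch that counts which two-disk failures a 4-disk array of each kind could ride out. It rests on a simplifying assumption: a striped set with any failed member is treated as completely offline. The disk numbering and function names are made up for illustration.

```python
from itertools import combinations


def raid10_survives(failed):
    """RAID 1+0 on 4 disks: mirrors (0,1) and (2,3), striped together."""
    mirrors = [(0, 1), (2, 3)]
    # Each mirror must keep at least one working disk.
    return all(any(d not in failed for d in mirror) for mirror in mirrors)


def raid01_survives(failed):
    """RAID 0+1 on 4 disks: stripes (0,1) and (2,3), mirrored to each other."""
    stripes = [(0, 1), (2, 3)]
    # At least one striped set must be completely intact.
    return any(all(d not in failed for d in stripe) for stripe in stripes)


pairs = list(combinations(range(4), 2))          # all 6 two-disk failures
raid10_ok = [p for p in pairs if raid10_survives(p)]
raid01_ok = [p for p in pairs if raid01_survives(p)]
print(len(raid10_ok), "of", len(pairs))   # prints 4 of 6
print(len(raid01_ok), "of", len(pairs))   # prints 2 of 6
```

Both layouts survive any single disk failure; under the assumption above, the difference only shows up when a second disk fails before the first one is replaced.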


Non-Standard RAID Levels

RAID-7
  Created by Storage Computer Corporation.
  Added caching to RAID-3 and RAID-4 to improve performance.

RAID-S
  Created by EMC Corporation.
  Also known as Parity RAID.
  Was offered as an alternative to RAID-5 on EMC Symmetrix Systems.
  No longer supported on the latest releases of Enginuity (the Symmetrix operating system).

RAID-Z
  Created by Sun Microsystems.
  Adds an extra level of protection by writing data to a new location instead of overwriting existing data.
  Only writes full stripes of data, utilizing a mirroring technology for small data writes.

RAID-Z2
  Variation of RAID-Z with the ability to withstand two disk failures.


Non-Standard RAID Levels

Double-Parity
  Not exactly RAID-6.
  Two sets of Parity Information are generated.
  The two sets are not the same… they are based on different groups of data blocks.

RAID-DP
  Created by Network Appliance Corp.
  Offers better performance over traditional RAID-6 via storage controller software.
  Offers the protection of writing data to battery-protected NVRAM to ensure that no data is lost in the event of a power outage.

RAID 1.5
  Created by HighPoint Systems.
  Sometimes incorrectly identified as RAID-15.
  Very close (if not exact) implementation of RAID-1.

RAID-5E, 5EE & RAID 6E
  Introduced by IBM ServeRAID.
  The E stands for Enhanced.
  Variations of RAID-5 and RAID-6, with hot-spare drives that are an active part of the block rotation.


Non-Standard RAID Levels

ServeRAID 1E
  Utilized by some IBM and Sun storage systems.
  Uses 2-way mirroring.

RAID-K
  Created by Kaleidescape for their KSERVER media storage units.
  Uses Double-Parity with proprietary modifications.
  Allows adding additional drives to existing array.

Linux MD RAID-10
  Part of the Linux kernel since 2.6.9 (software-based).
  Slight modification to the RAID-10 standard to allow fewer drives to constitute a RAID-10 array.

Intel Matrix RAID
  Not a new RAID level.
  Allows physical disks to be broken into logical partitions.
  Those logical partitions can be part of separate RAID arrays.

UNRAID
  Developed by Lime Technology.
  Unusual in that it does not require drives in RAID Array to be of matching size or speed.


The RAID Experience

Okay, all this RAID-1 and RAID-5 and stuff is great, but what does it all mean?

The bottom line to me is fault-tolerance.
  Hard drives have moving parts, which means they will likely fail at some point.
  Keep those systems running, even if a hard drive fails.
  Better performance is nice, but fault-tolerance is the real selling point.


The Cost of Non-RAID

Typical Non-RAID Setup…

36GB disk: Here’s my operating system hard drive.

72GB disk: Here’s my important data hard drive.

Just lost my OS… can’t boot, and even though my 72GB data drive is good, no one can get to it!

Just lost my Data… server is still running, but users don’t care because they cannot get to their data!


RAID-1 Example

Typical RAID-1 Setup…

Here’s my operating system RAID-1 Array called “Array 0” (two 36GB disks).

Here’s my important data RAID-1 Array called “Array 1” (two 72GB disks).

Just lost the first hard drive in Array 0.

RAID-1 keeps Array 0 running on the second hard drive. Users may see slightly slower reads, but things keep working.

Just lost the second hard drive in Array 1.

RAID-1 keeps Array 1 running on the first hard drive… many users don’t know that anything happened!


RAID-5 Example

Typical RAID-5 Setup…

Here are all the disks in my server (four 72GB disks) configured as a single RAID-5 array.

Third hard drive has failed. The array gets a little slower, but users keep working and no data is lost.

New hard drive installed… RAID controller REBUILDS the data on that drive while the system is operating… users connected, programs running, etc.


The Case for RAID

RAID makes sense for all your servers.
  Smaller servers should have a minimum of two hard drives in a RAID-1 mirror configuration.
  Larger servers should have 3 or more hard drives in a RAID-5 striped set configuration.
  SAN, NAS and really big servers are more likely to use RAID-6, dual-parity configurations and such.

No matter what you have, keep a spare hard drive or two in stock if possible!


RAID

Bibliography

Lascon.co.uk – RAID Animated graphics

Any questions? Drop me an e-mail!

[email protected]

Thank You!