Generic RLM White Paper

* DRAFT *

Implementing Seamless Disk-based Data Protection and Recovery

By Michael S. Mendola, Sr. Storage Architect

A Mendola White Paper

Contents

Introduction

Is Disk-based Data Protection New?

To Do-it-Yourself or Not

The Key to Choosing and Planning for Disk-based Data Protection and Recovery

Implementation

Summary

Introduction

Have you already considered migrating to a

disk-based backup and recovery method?

Are you already somewhere along this path,

or have you not yet decided whether this is

even right for you? In any case, you are about to learn

the most efficient and cost-effective way of

planning for and bringing disk-based backup

and recovery into your IT environment.

Data protection and recovery have been a

major IT challenge for many years. It has

been said by experts that backup may

account for more than 50% of IT systems

administration time. Further, backing up

data to tape takes so much time and attention

that the ability to actually restore from

backup data usually goes untested until

that data is needed.

Many large IT shops are considering or

already enhancing their backup strategy with

new disk-based options. By moving to the

capabilities of disk, enterprises can

complement their tape backup strategies to

achieve faster and more reliable backup and

much faster – and more relevant – data

recovery while continuing to use tape for

offsite storage and longer-term archiving.

Is Disk-based Data Protection New?

By now many, if not most, administrators of

IT systems in smaller businesses have at

least had the passing thought of

implementing a disk-based data protection

solution. Although the concept is certainly

not new, many quickly find that the myriad

of choices in solutions, methodologies and

vendors (and vendor claims) make the

“goodness” of disk-based backup and

recovery quickly regress into a solution that

is more expensive and at least as unwieldy

to manage as their old tape-based system

was. This is mainly due to realizing too late

that there are many considerations to be

taken into account when moving

toward using disk technologies to

protect and recover business-critical

data and systems, especially if there

are one or more remote business

locations which must be considered.

To Do-it-Yourself or Not

The concept of using disk instead of

tape as a “backup target” is fairly

straightforward, and this can be done

while still keeping the legacy tape-

based system intact. Many smaller

shops even start out by just daily

copying their files to one or more

USB drives and then taking the

drives offsite. Others have started to

use online WAN-based service

providers which allow automatic

periodic file backup over the Internet

to a remote vaulting location where

data is protected and restorable at the

file or even disk-image level when

needed. Again, for very, very small

businesses these options may be just

the thing. However, should the

business start to grow, or should

external demands come into play,

such as compliance or varying

mandated retention policies, these

methods will quickly become a major

liability which itself will start to

beg for a solution. Changing what

has already been put into place will

be much more costly in time, money,

materials and disruption than first

employing, through proper planning

and implementation, the best

solution which fits current needs

and can then be made to granularly

and seamlessly scale to accommodate

growing needs or even just sheer

changes in requirements.

The Key to Choosing and Planning for Disk-based Data Protection and Recovery

For the small-to-medium-sized organization,

the debate on whether disk-based data

protection should be used is pretty much

over, because its benefits far outweigh those

of tape as the method of production data

protection and recovery. As we have seen,

the major “downside” is not truly

understanding how to construct the vision

and then find the products, services and

vendors to bring disk-based techniques to

life in the most cost-effective and efficient

way, with emphasis on long-term efficacy.

An added benefit of proper planning and

implementation using the right products and

services is that cost reduction and

technology obsolescence avoidance (“future

proofing”) can be had wherever any storage

is used throughout your IT environment –

even if there are one or more remote sites to

consider.

The best way to view all data storage is in a

five-tier hierarchy which provides a more

holistic point-of-view. Below is a diagram

which does just that:

The diagram uses five layers to

describe storage and its use. Let’s

briefly examine each layer.

Application Tier

All business computers live here.

What each does is immaterial. From

powerful database servers down to

web servers, to back room file

servers, they all have one major

thing in common: They all write,

read and store data on disk.

Production Tier

This is where all of the primary

storage “lives”. Whether internal,

external direct-attached, SAN-based

or NAS, this is where all of the data

is stored, from lifeblood business

data and email down to the golf

outing JPEGs and music MP3s. This is

what you need to back up and,

hopefully, are able to restore if any

of it becomes unavailable (okay,

maybe the JPEGs and MP3s are not

that important).

Recovery Tier

This is the layer where you have

systems which hold copies of the

data in the Production Tier.

Unfortunately, many still use tape

here; this is what you are planning

to replace with a disk-based model.

If you had to restore your most

critical business application data,

there are two main questions this

layer must allow you to efficiently

address:

- Recovery Time Objective (RTO):

How long will it take to get the

data restored AND get the

application server up and servicing

requests again? (Using snapshot

technology will recover the data in a

very short time, but unless the disk

snapshot system uses integrated file

system and application preparation

agents, database systems will

perform a data consistency check

which will ADD possibly hours to

finally getting the application server

up and running the business again.)

- Recovery Point Objective (RPO):

How “stale” is the data that will be

used to restore from? (Note that with

tape the answer is easy: it’s as

“fresh” as the last backup, typically

at least 24 hours old.)
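
The two questions above reduce to simple arithmetic. The following is a minimal sketch rather than a sizing tool: the function names and every sample figure are hypothetical, chosen only to illustrate how a database consistency check adds to the recovery time and how the age of the last good copy determines the recovery point.

```python
from datetime import datetime, timedelta


def estimated_rto(data_restore: timedelta,
                  consistency_check: timedelta,
                  app_restart: timedelta) -> timedelta:
    """Time until the application is servicing requests again."""
    return data_restore + consistency_check + app_restart


def data_staleness(failure_time: datetime, last_good_copy: datetime) -> timedelta:
    """Age of the data that would be restored (the achieved recovery point)."""
    if last_good_copy > failure_time:
        raise ValueError("last good copy cannot be later than the failure")
    return failure_time - last_good_copy


# Hypothetical figures, for illustration only.
snapshot_with_agents = estimated_rto(
    timedelta(minutes=5), timedelta(0), timedelta(minutes=10))
snapshot_without_agents = estimated_rto(
    timedelta(minutes=5), timedelta(hours=3), timedelta(minutes=10))

failure = datetime(2024, 1, 2, 9, 0)
nightly_tape = data_staleness(failure, datetime(2024, 1, 1, 9, 0))
hourly_snapshot = data_staleness(failure, datetime(2024, 1, 2, 8, 0))

print(snapshot_with_agents)     # 0:15:00
print(snapshot_without_agents)  # 3:15:00
print(nightly_tape)             # 1 day, 0:00:00
print(hourly_snapshot)          # 1:00:00
```

With these assumed numbers, the snapshot itself restores in minutes either way; it is the consistency check that dominates the time to get back in service.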

POINT: The disk systems in this layer are

key as they must protect, reliably hold and

recover your data to whatever “flavor” of

primary storage failed: internal (data and OS

system boot recovery), DAS, SAN and

NAS. Also, if protecting your data means

moving it offsite to another facility (or

protecting data at remote sites means

sending it back to the main site), replication

would fit here as well. The systems in this

layer should enable open choice for what

storage, storage protocols and network

infrastructure to use, and should ideally be

vendor neutral. They should allow the use of

the most inexpensive disk possible, even if

that means recycling decommissioned disk

from the Production Tier. Further, they should

be able to make the most efficient use of

network bandwidth if replication will be in

the mix.

Protection Tier

This is the place for tape image (both

physical and virtual) data storage, including

disk-based clones if desired.

POINT: ALL first line production

data protection and recovery (with

optimum RTO and RPO “dialed-in”

for each business system) is done

from the Recovery Tier, leaving the

Protection Tier to hold only the

“backups-of-the-backups”.

Archive Tier

This is where the oldest data which

is still worth retaining lives.

Disk storage systems used here

should be of the most cost-effective

and permanent nature (e.g. slowest

access performance, optical, Content

Addressable Storage (CAS)). Data

deduplication systems would fall into

this category as well.
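
One way to make the five-tier view concrete is to write it down as data. The sketch below is illustrative only: the `Tier` enum mirrors the tiers described above, while the `RecoveryTarget` class, the `meets_target` helper, the application names and the objective values are all hypothetical.

```python
from dataclasses import dataclass
from datetime import timedelta
from enum import IntEnum


class Tier(IntEnum):
    """The five tiers, ordered from the servers down to long-term retention."""
    APPLICATION = 1
    PRODUCTION = 2
    RECOVERY = 3
    PROTECTION = 4
    ARCHIVE = 5


@dataclass(frozen=True)
class RecoveryTarget:
    """Objectives set for one business system (hypothetical structure)."""
    application: str
    rto: timedelta  # maximum acceptable time to be back in service
    rpo: timedelta  # maximum acceptable age of the restored data


def meets_target(target: RecoveryTarget,
                 measured_rto: timedelta,
                 measured_rpo: timedelta) -> bool:
    """True when a tested restore satisfied both objectives."""
    return measured_rto <= target.rto and measured_rpo <= target.rpo


# Tiers that hold copies of production data rather than the data itself.
copy_tiers = [t.name for t in Tier if t > Tier.PRODUCTION]

# Hypothetical applications and objectives, for illustration only.
targets = [
    RecoveryTarget("order-database", rto=timedelta(minutes=30), rpo=timedelta(hours=1)),
    RecoveryTarget("file-server", rto=timedelta(hours=4), rpo=timedelta(hours=24)),
]

# A nightly copy restored in two hours: too slow and too stale for the
# database, acceptable for the file server.
results = {
    t.application: meets_target(t, timedelta(hours=2), timedelta(hours=24))
    for t in targets
}

print(copy_tiers)  # ['RECOVERY', 'PROTECTION', 'ARCHIVE']
print(results)     # {'order-database': False, 'file-server': True}
```

Recording objectives per application in this way makes it easy to see which systems a given recovery method can and cannot serve.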

Implementation

As can be seen, stratifying storage

and storage protection and recovery

concepts makes plain the inherent

need for a solid vision, proper

planning, and careful selection of

hardware, software and services

vendors. There are many vendors,

including your incumbent server and

storage vendor(s), which have their

own disk-based protection and

recovery strategies. These by

necessity are engineered to work

best with (and sometimes only with)

their main server and/or storage

product suite and, by nature, this

will force the technology/methodology

combination onto the user, resulting

in being more-or-less “locked-in” to

that paradigm.

Just as important to realize is that it

may take multiple vendors to

implement customized solutions at

each layer in the model for your

local and remote organization(s).

Much more rewarding is to select a single

vendor whose sole business is satisfying

each layer in the model with hardware and

software which are specifically designed and

totally integrated as a unit to supply the

features seamlessly. The best scenario is to

have all of the software that drives each

storage system in each layer below the

Application Tier based on a single code

set.

Summary

Disk-based methodologies can make the

protection and recovery of your business

data simpler, more efficient and, most

importantly, much more reliable. They enable

you to turn from maintaining your data

backup and restore as separate and

labor-intensive operations with often

questionable value toward merging these

into a reliable, rapid and seamless

operational environment which becomes part

of your daily production storage

operations. They can also enable

customizable recovery time and recovery

point objectives for each business

application system, giving the ability to

“dial-in” the right mix of solution

capabilities, costs and scaling over time.

The best way to achieve all the “goodness”

of disk-based methodologies is two-fold.

First, seek a tightly integrated modular

solution suite from a single vendor. Further,

the product set should provide components

to cover each layer in the tiered model

which are built on a common code platform.

This approach maximizes product

interoperability and reliability and

minimizes operational complexity.

Second, choose a product and services

provider that has experience and expertise in

this area, especially within the context of the

chosen solution set. Demand a proven

record of expertise in design, planning and

implementation of disk-based

methodologies.