
Disk-Based Backup & Recovery:

Making Sense of Your Options

White Paper

Datalink

September 2007

Abstract: Data and storage requirements are growing at unbelievable rates for businesses of every type. To help counter this, it’s time for organizations to examine the benefits that disk storage subsystems can provide in data protection environments. This white paper details four enhanced data recovery (disk-based backup) architectures and provides guidance on how to determine whether or not those architectures would be an overall fit within the IT infrastructure. The assessment of a solution’s effectiveness examines whether its overall benefits balance well against its human, corporate, and financial costs. This white paper offers in-depth insight on what those benefits and costs are so that IT professionals are armed with the knowledge required to make sound decisions.

Table of Contents

Introduction
  Tape: the traditional approach to data recovery
  Why disk makes sense for data recovery now
  Paper topics

Data recovery bottlenecks and their impact on recovery operations
  What is a bottleneck?
  Potential bottleneck areas
  Improper assessment of bottlenecks

Eliminating bottlenecks with enhanced data recovery solutions
  What is enhanced data recovery?
  First analyze the pain points
  Can disk play a role?
  Will disk completely replace tape?
  The cost of archiving
  The cost of off-site media storage
  Disk is just another tool

Enhanced Data Recovery Architectures
  Categories
  The enhanced data recovery continuum

Disk to disk (D2D)
  How does D2D work?
  Synthetic full backups
  Challenges
  Operational benefits
  Best fit
  Business value impact of D2D
  Example of a D2D implementation

Virtual tape
  How does it work?
  Vendor approaches to virtual tape
  Challenges
  Operational benefits
  Best fit
  Business value impact of virtual tape
  Example of a virtual tape implementation

Point-in-time copies
  Types of point-in-time copies
  Full image mirroring technology
  Mirror management
  Snapshot edits, additions, and disk space requirements
  Challenges
  Operational benefits
  Best fit
  Business value impact of point-in-time copies
  Example of a point-in-time copy solution implementation

Continuous data protection
  What is continuous data protection?
  How it works
  Challenges
  Operational benefits
  Best fit
  Business value impact of CDP
  Example of CDP implementation

Adding depth to enhanced data recovery
  Data deduplication
  Replication

Summary
  Compromise no longer necessary
  Understand the bottlenecks and pain points
  Role of disk
  Technologies
  Partnership with Datalink

Introduction

Tape: the traditional approach to data recovery

For decades, organizations have protected critical, electronically stored data by making a second copy of the data on tape. Over that time, incremental progress has been made in improving this process. Tape drives have grown in capacity and performance. Software has become more sophisticated, allowing creative ways to back up data faster while using fewer resources. Tape libraries using robotic automation have become common, enabling lights-out backup operations in large environments. Storage area networks (SANs) enable tremendous scalability with robust performance. Ultimately, however, these advances have not kept pace with the business expectations placed on IT organizations, and the market has therefore demanded enhancements.

The sources of pain come from many areas, but they can be grouped into the following categories:

• Data growth

• Backup windows

• Recovery time objectives (RTOs)

• Recovery point objectives (RPOs)

• Decentralization of data

• Rising costs and flat budgets

• Compliance requirements and legal discovery needs

Why disk makes sense for data recovery now

While tape-based architectures have long been the primary technology of choice for backup and recovery, they have not kept pace with the business expectations placed on IT organizations.

When comparing the costs of raw ATA or Serial ATA drives to those of LTO tape media, the difference has been more than 10X in favor of tape in the past. In today’s market, however, the gap has narrowed and low-end disk is now less than 2X the cost per gigabyte of tape storage. As a result, organizations are increasingly augmenting tape-based backup and recovery architectures with disk-based solutions, resulting in measurable performance, reliability, and manageability enhancements. It’s important to note, though, that tape technology has not stood still, so further reduction in the relative cost between raw disk capacity and raw tape capacity cannot be counted on to drive down the overall cost of disk-based systems.

Still, with rapidly declining disk prices and the introduction of low-cost, relatively high-performance RAID-protected ATA disk subsystems, the storage industry is now buzzing about enhanced data recovery (disk-based) solutions, which raises these questions:


2007 Datalink. All Rights Reserved. www.datalink.com 1


• How much performance improvement can be realized in data recovery utilizing disk?

• Now that the price of disk has made it a practical consideration for data recovery, is tape needed?

• Will recovery operations be simpler with disk?

The answer to those questions is one that nobody wants to hear: It depends. Storage environments contain many variables, so each organization must assess its environment and carefully prioritize its business objectives based on that assessment.

Implementing a data recovery solution that meets business objectives and SLAs, especially around restoring data, is a tremendous challenge for IT managers. An organization that is unable to meet its internal and external SLAs could pay a hefty price for this shortcoming. These costs could include lost revenue, lost productivity, penalties and litigation costs, and degraded customer and business partner experiences and confidence.

Paper topics

The topics in this paper include:

• Environmental variables (how bottlenecks impact data recovery operations)

• Enhanced data recovery approaches (how can new technologies be used to augment tape for data recovery?)

• Enhanced data recovery architectures (the categories, benefits, and best fit)

Data recovery bottlenecks and their impact on recovery operations

What is a bottleneck?

In the broadest sense, a bottleneck is the step or process within a system that limits the ultimate throughput of that system. A bottleneck can exist in any area of a system, including people, process, or technology. All systems, including data recovery systems, contain bottlenecks. When a bottleneck is removed, another one often appears, or worse, another one is actually created by improperly addressing the original bottleneck.

When determining whether disk technology can be implemented to address data recovery bottlenecks by improving performance or reliability, it is vital to fully assess the entire data recovery operation and accurately pinpoint where the bottlenecks occur. The goal is to systematically remove bottlenecks until service level agreements (SLAs) can be met and quality of service (QoS) goals are achieved.
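To make the idea concrete, the following is a minimal sketch that models a backup path as a chain of stages, where end-to-end throughput is limited by the slowest stage. The stage names and the MB/s figures are hypothetical placeholders chosen for illustration, not measurements from any real environment.

```python
def find_bottleneck(stage_rates_mb_s):
    """Return (stage_name, rate) for the slowest stage in a serial pipeline."""
    return min(stage_rates_mb_s.items(), key=lambda item: item[1])


# Hypothetical throughput figures (MB/s) for each stage of a backup path.
pipeline = {
    "file system read": 20,
    "server HBA": 200,
    "SAN": 400,
    "tape drive": 80,
}

name, rate = find_bottleneck(pipeline)
print(f"Bottleneck: {name} at {rate} MB/s")

# Upgrading a stage that is not the bottleneck leaves throughput unchanged.
pipeline["tape drive"] = 160
assert find_bottleneck(pipeline) == (name, rate)

# Fixing the real bottleneck exposes the next-slowest stage.
pipeline["file system read"] = 300
next_name, next_rate = find_bottleneck(pipeline)
print(f"New bottleneck: {next_name} at {next_rate} MB/s")
```

In this toy model, doubling the tape drive speed changes nothing because the file system was the limiting stage; once the file system is fixed, the tape drive becomes the new limit.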


Potential bottleneck areas

Bottlenecks can be found in data recovery operations in the following areas:

• Procedure

• Software capabilities and configuration

• Hardware capabilities and configuration

• LAN performance and saturation levels

• SAN configurations

Improper assessment of bottlenecks

Often bottlenecks are inaccurately diagnosed in one component or area of the data recovery operation. This leads to an investment to improve the capabilities of the suspected component or area. However, because of the improper diagnosis, the upgrade does not yield a net performance improvement: the upgraded component was not the cause of the bottleneck.

For example, an organization finds that its tape I/O performance is not adequate during a backup operation, so it upgrades the tape drives. When the tape drive upgrade does not improve backup performance, the organization implements a SAN to increase the bandwidth to the drives. Still frustrated, the organization next replaces the host bus adapters (HBAs) in the servers. The performance is still slow, so the organization deduces that the issue must be the backup application. The organization upgrades the application and the problem still exists.

Then the organization realizes that there are too many files in one directory. This directory bogs down the file system, preventing the server from sending data downstream any faster than before. When the real bottleneck is fixed, though, a new one appears. This new bottleneck might be resolved by introducing disk into the environment. But even if this were the case, disk as a backup target clearly would not have addressed the root cause of the problem (too many files in one directory), even if it provided some symptomatic relief. For that matter, any one of the introduced fixes could have exacerbated or further masked the true problem.

This example illustrates the need for a systematic, objective-based approach to assessing problems that occur in data recovery operations.

Eliminating bottlenecks with enhanced data recovery solutions

What is enhanced data recovery?

Enhanced data recovery is a data backup and recovery architecture that adds a disk-based storage array (combined with a sophisticated software technology) to a traditional tape-only solution. It enables the concept of backing up to disk and archiving to tape.


First analyze the pain points

Before deciding to introduce disk into the data recovery equation, it is important to analyze and prioritize objectives based on pain points, and then determine the desired outcome. Typical objectives for a disk-based enhanced data recovery project should come from the organization’s pain points:

Data growth. No organization is exempt from the pressures of managing, storing, and protecting ever-increasing amounts of data.

Backup window. The backup window is the time available for IT administrators to slow down or stop production to perform data recovery operations. In today’s 24x7xforever business, organizations are increasingly finding that their backup windows have become either extremely compressed or extinct. They no longer have the luxury of shutting down applications, or sending out emails to users asking them to wait until further notice to log in.

Recovery time objective (RTO). RTO is the goal that the organization sets for the time it takes to fully restore an application with its data when a failure or data loss occurs. Business executives measure system downtime in terms of hours and dollars; IT managers measure it in terms of high blood pressure and gray hair. Either way, it is no secret that the costs are both extremely high and on the rise. When a system goes down and perhaps loses its data, there is an expectation that the system will be recovered within a reasonable amount of time. Business units and IT staff often debate what might be deemed reasonable, given the available technology and staff. Usually, when an RTO is established for a particular application, it represents an ambitious goal, placing strain on the IT staff. (See Figure 1.)

Recovery point objective (RPO). RPO is defined as the point in time to which data is recovered. For example, if a backup is performed at midnight, and the data is used to recover a system at noon the next day, the recovery point would be midnight. The RPO therefore measures the data lost between the last backup and the time of data loss or corruption. The practical approach to determining RPO for a given application is to ask these questions: “How much data can we afford to lose, and what is the investment cost to protect the data to that level?” To fully establish this, it is important to estimate how much data is generated during an hour of production, and what it would cost to recreate this data (if possible) when lost or corrupted. This information can help determine the risk-adjusted ROI for technologies that improve the ability to recover data to a more recent point in time. (See Figure 1.)
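The RPO arithmetic described above can be sketched in a few lines. This is only an illustration: it assumes 12 hours elapse between the midnight backup and the noon failure, and the hourly cost to recreate data and the hourly-copy alternative are hypothetical figures, not values from this paper.

```python
from datetime import datetime


def hours_of_data_lost(last_backup, failure_time):
    """Hours of production data written after the last recoverable copy."""
    return (failure_time - last_backup).total_seconds() / 3600


def cost_of_lost_data(hours_lost, recreate_cost_per_hour):
    """Cost to recreate the lost data, assuming a flat hourly cost."""
    return hours_lost * recreate_cost_per_hour


# Midnight backup, failure at noon (the calendar date is arbitrary).
last_backup = datetime(2007, 9, 3, 0, 0)
failure_time = datetime(2007, 9, 3, 12, 0)

RECREATE_COST_PER_HOUR = 5_000  # hypothetical dollars per hour of lost work

lost = hours_of_data_lost(last_backup, failure_time)
nightly_exposure = cost_of_lost_data(lost, RECREATE_COST_PER_HOUR)

# Hypothetical alternative: hourly copies cap the loss at one hour.
hourly_exposure = cost_of_lost_data(1, RECREATE_COST_PER_HOUR)

print(f"Data lost with nightly backup: {lost:.0f} hours")
print(f"Exposure with nightly backup: ${nightly_exposure:,.0f}")
print(f"Exposure with hourly copies:  ${hourly_exposure:,.0f}")
```

The difference between the two exposure figures is the amount that an investment in a tighter recovery point would have to be weighed against.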


Figure 1: The cost of a data protection solution is relative to how quickly an organization needs to restore the data and how much data it can afford to lose.


Decentralization of data. In recent years, storage hardware costs have declined more rapidly than WAN bandwidth costs, and over long distances, latency issues may preclude centralization of resources. For these reasons, companies often decentralize data to provide adequate performance in remote offices, without requiring the costly bandwidth necessary to run data-intensive applications on centralized storage. This makes it a significant challenge to provide adequate protection for decentralized data. Many companies place tape resources at remote sites and designate the most technical person in that office to take responsibility for maintaining the data recovery operation. This approach is prone to error, which could result in significant data losses.

Rising costs, flat budgets, and static headcount. Virtually all IT organizations feel the pressure of too much data and not enough resources to manage and protect it adequately. IT organizations are expected to keep delivering against data recovery SLAs that remain constant, without adding headcount, while the volume of data grows rapidly. While tape provides a good archive solution for many organizations, it can become an operational bottleneck in environments with frequent simultaneous backup and recovery operations. This pressure fuels the cry for data recovery technologies that allow greater scalability in the data center within constrained IT budgets.

Compliance requirements and legal discovery needs. Data must be secure, unalterable, and in some cases immediately accessible in order to comply with a myriad of regulations. In addition, litigation support and e-discovery have emerged as major IT and business challenges.

Can disk play a role?

Clearly defined and prioritized project objectives based on the organization’s pain points, together with an understanding of the performance bottlenecks within a data recovery environment, serve as the foundation for determining whether disk can play a role in the data recovery operations. Depending on the identified bottlenecks and the established objectives, disk technologies can be added to the data recovery equation in many different ways to achieve efficiencies in data recovery operations.

Will disk completely replace tape?

The answer is probably not yet. In most environments, disk will complement, rather than replace, tape in near-term implementations. Tape is a low-cost alternative to disk in two areas: 1) economical archiving, due to its low incremental cost and portability; and 2) foundational offsite disaster recovery storage.

The cost of archiving

When organizations evaluate the cost of disk and tape, their comparisons often focus on the simple cost per gigabyte of storage. They often overlook the ongoing operational cost of each medium for a given application. For archival purposes, there can be a great premium in the costs of managing disk compared to those for managing offline tape. When measuring cost factors for data stored on disk, organizations should consider:

• Power consumption costs linked with keeping the spindles turning and keeping systems cool

• Wear-and-tear on all components of the array

• Data center floor space

• General storage management costs linked to disks

In contrast, archived tape offers:

• Greater portability

• Little power consumption

• Cheaper storage space

• Less management effort over the course of an extended archival life cycle

The cost of off-site media storage

Most organizations have procedures in place to rotate a third set of tapes to an offsite vault location, mainly for use after a disaster, but also in the event of a double-media failure in the data center. While this procedure could be deemed expensive compared to not having a disaster recovery plan in place, it is often economical compared to having live data replicated to a disk array at a remote hot site. With replication, the cost of the incremental disk to manage the process (in lieu of tape) could be nominal. However, the total cost of this approach would include disk, software, servers, adequate bandwidth, a remote data center, and management costs.

It is clear that the total cost of this approach can be high. This is true particularly if the capabilities of tape can adequately satisfy the organization’s requirement for recovery time and recovery point performance in the event of a site disaster.

Disk is just another tool

It is evident that disk is not the silver bullet, but merely one more tool that can be leveraged to improve an organization’s data recovery operations. If it is determined that disk should play a role, the next step would be to architect a solution that applies the right technologies to resolve the bottlenecks and to satisfy the data recovery pain points that the organization is experiencing.

Enhanced Data Recovery Architectures

Categories

Datalink segments enhanced data recovery architectures into four categories:

1. Backup to disk, which is sometimes referred to as D2D (disk-to-disk) or D2D2T (disk-to-disk-to-tape).

2. Virtual tape, where software is used to present a virtual interface to disk, making it appear to backup applications as a tape device.

3. Point-in-time copy, where an image of the data, such as a full mirror or pointer-based snapshot, is stored on a RAID subsystem.

4. Continuous data protection, which features the ongoing sequential capture of each data write transaction committed to a production volume. These transactions are held on a secondary disk volume with a logging feature that allows rolling back to a desired point in time.
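The logging idea behind the fourth category can be illustrated with a toy write journal. This is a simplified sketch, not a description of any vendor's implementation: the `WriteJournal` class, its method names, and the integer timestamps are all hypothetical, and writes are assumed to arrive in time order.

```python
class WriteJournal:
    """Toy journal that records every write so a volume can be rolled back."""

    def __init__(self):
        # Each entry is (timestamp, block_number, data), appended in time order.
        self._entries = []

    def record(self, timestamp, block, data):
        """Capture one write committed to the production volume."""
        self._entries.append((timestamp, block, data))

    def volume_as_of(self, point_in_time):
        """Rebuild block contents by replaying writes up to point_in_time."""
        volume = {}
        for timestamp, block, data in self._entries:
            if timestamp > point_in_time:
                break  # later writes are ignored, which is the rollback
            volume[block] = data
        return volume


journal = WriteJournal()
journal.record(1, 0, "invoice v1")
journal.record(2, 1, "ledger v1")
journal.record(3, 0, "corrupted")  # a bad write lands at time 3

# Roll back to just before the corruption.
print(journal.volume_as_of(2))
```

Because every write is retained, the recovery point can be any moment covered by the journal rather than only the time of the last scheduled copy.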

The enhanced data recovery continuum

At a very basic level, data is protected by making copies of the production data. The method used to make these copies determines the level at which the data is protected as well as the costs required to reach that level. The enhanced data recovery continuum illustrates the range of methods available to deliver increasing levels of data protection. How quickly a copy can be made determines how often copies can be made, which dictates the maximum potential data loss. More specifically, all transaction records that are captured after the last copy was made are potentially lost if a disruption occurs.

Because of budget limitations, most enterprises are not able to deploy the most sophisticated data protection solution available. Firms need to define an RPO and RTO for each of their applications and then select the most appropriate technology from a continuum of data protection technologies, both disk and tape.

[Figure 2 diagram: Enhanced Data Recovery Continuum. Technologies: Conventional Disk, Virtual Tape Library, Point-in-Time Copy, Continuous Data Protection, Data Deduplication, Replication. Axes: Recovery Point and Recovery Time (days, hours, seconds) on either side of the point where failure occurs. Costs shown: Cost of Protection, Cost of Lost Data, Cost of Recovery, Cost of Time.]

Figure 2: Organizations implement data protection technologies from somewhere on the continuum shown in the arrows above.

Disk to disk (D2D)

How does D2D work?

When backing up to disk, enterprise backup software directs data from server clients to a disk-based storage array. The data is stored on the disk before being staged or cloned to a tape library for secondary protection. Backing up to disk benefits from the random access nature of disks, eliminating the slow sequential seek process of tape, which leads to faster file-level restores and improved reliability. (See Figure 3.)

Synthetic full backups

Backup to disk can also provide an improved foundation for performing synthetic full backups. Synthetic full backups allow organizations to perform one full backup, followed by ongoing incremental backups. The full backup is then merged with the incremental backups as a background activity, resulting in an updated full backup. This eliminates the need to perform regular full backups, which can consume tremendous bandwidth and bring production to a grinding halt.
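The merge step described above can be sketched as follows. This is a simplified model rather than how any particular backup product stores its images: each backup is represented as a dictionary from file path to content, incrementals are applied oldest first, and the use of `None` to mark a deleted file is an assumption made for this example.

```python
def synthesize_full(full_backup, incrementals):
    """Merge a full backup with later incrementals into an updated full backup.

    full_backup:  dict mapping file path -> content
    incrementals: list of dicts, oldest first; a value of None marks a
                  file deleted since the previous backup (assumed convention)
    """
    synthetic = dict(full_backup)  # leave the original full backup untouched
    for incremental in incrementals:
        for path, content in incremental.items():
            if content is None:
                synthetic.pop(path, None)  # file was deleted
            else:
                synthetic[path] = content  # file was added or changed
    return synthetic


sunday_full = {"a.txt": "v1", "b.txt": "v1"}
monday_incr = {"a.txt": "v2"}
tuesday_incr = {"c.txt": "v1", "b.txt": None}

new_full = synthesize_full(sunday_full, [monday_incr, tuesday_incr])
print(new_full)
```

The merge reads only existing backup images, so production servers do not need to be read again to produce the updated full backup.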

Synthetic full backups can be done without using disk in a traditional tape

environment, but the impact on resources during image consolidation back-

ups can be significant. The wear and tear on tape and library resources from

loading and unloading the tapes when backup images are regularly merged

into a single set is significant.

Also, tape technology does not allow multiple simultaneous reads or writes

from a single tape or drive so any tape resource being used for synthetic full

backup is unavailable during that period. Disk, however, can provide the

server with online access to data for making merged backup images, mini-

mizing the impact on tape resources. By storing the backup images onto disk,


[Figure 3 labels: Primary Storage, Backup System, Disk Staging, Tape Library; Fast (Primary) Recovery; Slower (Secondary) Recovery]

Figure 3: Disk-to-disk recovery is fast and reliable.


these images are available for simultaneous reads for recovery operations

during a synthetic full backup.

Challenges

Disk can yield improvements in a data recovery operation, but there are some

challenges.

First, to fully exploit the capabilities of D2D, it is critical that the backup

application supports the desired functionality. Most commercial backup soft-

ware products were originally designed to use tape as the exclusive backup

medium. For most products, only recent upgrades make it feasible to use disk

as a viable destination for backup data. Also, most file systems used to

address disk were designed for random I/O; this suits general file

access and database use, but it is not optimal for the sequential, large-block

data transfers that are characteristic of backup applications. This

introduces a bottleneck and throttles back the performance of the disk sub-

system.

Another limiting factor to performance in a D2D configuration is the way

most file systems reclaim and reuse space on a volume after data has been

deleted. The file system will generally write new blocks of data to whichever

locations have been freed by deletions, without regard for where those blocks

are physically located on the disk surface.

This method of writing to disk causes a condition known as fragmentation

and leads to performance loss, which worsens over time as the data on a

volume becomes increasingly fragmented.
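A small simulation can make the fragmentation effect concrete. The sketch below assumes a deliberately simple first-free-block allocation policy on a 12-block volume; real file systems use more sophisticated allocators, and the helper names (`allocate`, `delete`, `extents`) are hypothetical.

```python
from typing import List, Optional

Volume = List[Optional[str]]   # each slot holds the owning file's name, or None if free


def allocate(volume: Volume, name: str, blocks: int) -> None:
    """Write a file into the first free blocks found, wherever they are."""
    free = [i for i, owner in enumerate(volume) if owner is None]
    if len(free) < blocks:
        raise ValueError("not enough free space")
    for i in free[:blocks]:
        volume[i] = name


def delete(volume: Volume, name: str) -> None:
    """Free every block owned by the named file."""
    for i, owner in enumerate(volume):
        if owner == name:
            volume[i] = None


def extents(volume: Volume, name: str) -> int:
    """Count the contiguous runs of blocks that make up a file."""
    runs, previous = 0, None
    for owner in volume:
        if owner == name and previous != name:
            runs += 1
        previous = owner
    return runs


volume: Volume = [None] * 12
for backup in ("A", "B", "C", "D"):
    allocate(volume, backup, 3)    # layout: AAABBBCCCDDD, each image in one run
delete(volume, "A")
delete(volume, "C")                # expired images leave two separate holes
allocate(volume, "E", 5)           # the new image is split across both holes

print("".join(owner or "." for owner in volume))
print("extents for E:", extents(volume, "E"))
```

Each additional extent means extra head movement when the image is read back sequentially, which is why restore and cloning performance degrades as backup images expire and are replaced.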

A third challenge is the cost of conventional disk, which has limited its use to

only the most important data and the most recent backups. Acquisition costs

for conventional disk have been high due to the lack of advanced storage-

efficient technologies. However, with the introduction of data deduplication

technologies, the cost factor is significantly less and shows a strong ROI.

Operational benefits

In D2D backup, traditional disk storage—often ATA or SATA based—can be

configured as the target for recovery data. By utilizing disk technology as

part of a data recovery infrastructure, an IT operation can improve its ability

to meet its backup windows, improve RTO, increase reliability, and provide

greater functionality. Although this approach can yield backup performance

benefits in some environments, its primary benefits are backup reliability and

recovery performance. Datalink recommends that a copy of the data be writ-

ten to tape or replicated offsite for disaster recovery and archive purposes.

Best fit

D2D recovery can give an organization a greater ability to meet its RTOs.

This is most noticeable when individual files or small file sets must be recovered on a

regular basis. Depending on what bottlenecks exist in the environment, D2D

can also improve the ability to meet a defined backup window. D2D also has

a reliability advantage over tape, which can lead to some RPO benefit in an


environment with a high failure rate on backup jobs. When a tape backup job

fails, the previous evening’s backup becomes the recovery point, exposing up

to 48 hours of potential data loss.
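The 48-hour figure follows from simple arithmetic if backups run once every 24 hours. That schedule is an assumption here; the text states only that the previous evening's backup becomes the recovery point. The function name below is hypothetical.

```python
def worst_case_exposure_hours(backup_interval_hours: float, consecutive_failures: int) -> float:
    """Hours of data at risk just before the next backup completes successfully.

    With no failures, at most one interval of changes is unprotected. Each
    consecutive failed job pushes the usable recovery point back one more interval.
    """
    if consecutive_failures < 0:
        raise ValueError("consecutive_failures must be zero or greater")
    return backup_interval_hours * (consecutive_failures + 1)


nightly_ok = worst_case_exposure_hours(24, 0)       # every job succeeds
nightly_one_miss = worst_case_exposure_hours(24, 1)  # last night's job failed
print(nightly_ok, nightly_one_miss)
```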

In addition, D2D offers benefits in environments where backup is performed

over low-performing or saturated networks and tape streaming is difficult to

achieve, or where interleaving must be used to keep the tape drives streaming.

In these environments, the data will be accepted by the recovery disk at

whatever speed the data is delivered. This avoids the shoe-shining effect

(common to many low-end and mid-range tape drives), thereby greatly improving

system throughput.

Advantages can also be achieved in performance at the high-end of the spec-

trum, where disk offers greater data configuration flexibility on primary stor-

age, such as with database systems. Once the system is optimized, disk can

deliver performance comparable to or better than tape with less ongoing

tweaking and tuning.

Business value impact of D2D

In addition to the operational benefits within IT, organizations experience the

following business benefits associated with the successful implementation of

D2D backup:

- Reduced risk to business viability due to unprotected data as a result of

failed backups

- Tape media cost savings due to reduced tape media consumption

- Higher employee productivity due to shortened downtime in the event

that data needs to be recovered from a backup source

- Improved customer satisfaction as customers experience fewer outages

from the business and their inquiries are answered more quickly because

employees will have improved access to critical data

Example of a D2D implementation

Problem: A medium-sized IT services organization faced an ongoing chal-

lenge of lackluster backup operations performance. Due to complex applica-

tion configurations and overburdened servers, the organization was unable to

deliver data to its tape drives at a fast enough rate to stream the devices,

which caused excessive wear and tear on the tape drives and media, and dra-

matically impacted the overall throughput of the system. Also, the tape dupli-

cation process for disaster recovery was unacceptably slow given the envi-

ronmental limitations. As a workaround, multiplexing was implemented as a

means of aggregating multiple data streams in an effort to stream the tape

drives. While this offered some relief on backup operations, it had negative

ramifications on restore performance due to the greater amount of interleaved

data that needed to be read during a system restoration.

Solution: Careful analysis of this environment identified multiple bottle-

necks that could be addressed to improve the situation. Given a variety of


cost and implementation process variables, it was determined that the most

cost effective option, and the measure that would also deliver the most bene-

fit, was a D2D architecture. Secondary disk has been implemented as the ini-

tial destination for backup jobs. From that location, backup data is subse-

quently cloned to tape in a process that is controlled by the backup applica-

tion. A copy of backup data is maintained on disk for a period of time where

it is accessible for restore operations.

Results: The implementation of D2D required some minor modification of

the backup environment, but the overall effort was not excessive. The organi-

zation experienced roughly a 60% reduction in the amount of time necessary

to complete a backup and a noticeable decrease in the wear and tear on tape

resources, as the cloning process consistently results in full streaming of the

tape drives. The system was architected so that approximately 90-95% of

recoveries are sourced from disk. Tape recoveries are extremely rare. Disk-

based recoveries occur much more quickly than tape recoveries, which

allows the IT organization to meet its SLAs to the business units. The backup

and recovery operation can be delivered much more consistently and requires

60-70% fewer administrative hours, leading to greater productivity within IT.

Virtual tape

How does it work?

Another approach to introducing disk into the recovery infrastructure is to

use emulation technology as a front-end to the disk system. This presents

disk to the backup application in a way that makes it appear as tape. In many

cases this approach (as shown in Figure 4) is the least disruptive of the four

discussed in this paper.


[Figure 4 labels: Primary Storage, Backup System, Tape Emulation Disk, Tape Library; Fast (Primary) Recovery; (Secondary) Recovery]

Figure 4: Virtual tape is typically easy to integrate.


Vendor approaches to virtual tape

Vendors have taken a couple of different approaches to bundling this technology

and to the role that it can play in the recovery operation. The following items

characterize some of the differentiation in the marketplace:

Integrated Virtual Library Solution: Several traditional storage vendors as

well as new technology companies have integrated software, server, and disk

into a packaged solution. This approach offers a pre-configured solution that

has been integrated and tested with specific backup software and server plat-

forms. It also provides a single point of support for the complete solution

(i.e., one throat to choke).

Software Only: Other virtual tape products come as a software-only solu-

tion. With this approach, the software is integrated on a customer or software

vendor supplied server, with Fibre Channel cards and network connections.

This allows the customer to choose the optimal server architecture for

its data volumes, performance needs, infrastructure, etc. In addition, it pro-

vides the ability to select which open systems disk product best fits the envi-

ronment.

This approach also offers the ability to utilize the newest software functional-

ity available from the software vendor (features that are not yet included in

the integrated VTL solutions). It can also support backup server platforms

that are not yet supported in an integrated solution.

Data Mover or Not: Virtual tape vendors take differing positions on whether

the virtual tape technology should be a passive or active storage device. In

the passive approach, the virtual tape technology allows the backup applica-

tion and server to take responsibility for migrating data from disk to tape for

archival purposes. In the active approach, it can manage this task with or

without direction from the backup server. The benefit of having the backup

server manage this I/O is that seamless command and control is maintained

in the environment and the risk of having metadata and subsequent data

integrity challenges in the backup application is minimized. The advantage of

having the virtual tape product manage this data migration is that the I/O can

occur without passing through the backup application server, which allows

the server to perform its other tasks without being impacted. Current best

practices favor leaving the backup application in charge of all data movement

from disk to tape to simplify management and assure metadata integrity.

Challenges

Virtual tape is a mainstream solution with many vendors offering products

that support this capability. Caution must be taken to ensure that all backup

applications in the environment support this technology. Also, the virtual tape

technology should provide the ability to emulate a tape resource that is

appropriate for the targeted environment.


Another factor to consider is that tape may be used less efficiently

when moving data from virtual to physical tape, depending on the backup

application and how it leverages compression.

Operational benefits

Virtual tape offers all the same benefits as D2D and addresses some of its

shortcomings.

• It enables seamless integration of disk into the data recovery operation

versus D2D, which generally requires some re-engineering of the data

recovery workflow.

• In some highly complex customized environments, virtual tape makes it

possible to consider using disk as a backup destination, as the effort to

modify all the scripts needed to use standard D2D would be far too dis-

ruptive.

• Virtual tape can result in measurable performance gains versus D2D. This

is because it does not have the same file system overhead and disk frag-

mentation typically associated with disk systems. Another reason is it

enables data to be transferred to disk in large blocks, similar to how data

is typically written to tape, rather than small blocks more typically associ-

ated with transfer to standard disk. And since disk systems are random

access devices, the importance of maintaining streaming performance,

which is critical to tape system performance, is eliminated.

Best fit

In assessing the best fit for virtual tape technology, it is important to note if

the technology is being compared to tape or D2D. For this discussion, the

attributes of virtual tape are compared primarily with those of a D2D archi-

tecture. The optimal environment for virtual tape technology is one that has a

considerable level of customization to the data recovery workflow, where the

introduction of standard D2D would be too disruptive.

Other characteristics of an environment that would lead to the introduction of

virtual tape include:

• Intense backup window pressure and a strong requirement for improved

performance, where disk fragmentation of standard D2D would be a con-

sideration

• Preservation of investment in existing backup software and tape systems

• Use of disk and tape in a tiered backup strategy

• Reduction of reliance on physical tape (physical tape archived off-site for

disaster recovery)

• Need for quick restore of recent backups

• Need for additional tape resources to eliminate tape device bottlenecks


Business value impact of virtual tape

As a starting point, organizations can expect to receive the same business

benefits as with D2D. In addition to these, companies may also experience:

• Lower IT expense for the initial implementation due to the ease of imple-

mentation relative to D2D

• Improved user access to production data resulting in greater productivity

as a result of backups completing more quickly

• Lower ongoing IT expenses due to simplified file system administration

and centralized volume management

Example of a virtual tape implementation

Problem: A large international food distributor was facing extreme pressure

in its data recovery operations. With over 35 terabytes of production data, the

organization struggled to meet its backup windows and recovery SLAs.

Scheduled full backups were cancelled frequently for production purposes.

Nightly backups were failing at rates of 10 to 15 percent, which required a

full-time system administrator to troubleshoot the failures. Often cloning

operations or recovery operations conflicted with backup jobs. Recoveries

took from hours to days, depending on the data source and the number of lost

or corrupt files.

Backup windows were stretched and the environment required significant

tuning through use of techniques like multiplexing to maintain backup sys-

tem performance.

Solution: The assessment of this environment led to the determination that

traditional tape technology imposed many limitations that caused perform-

ance and reliability problems. Disk-to-disk recovery seemed like the natural

solution to some problems, but due to the customized backup application, the

implementation would be difficult and too disruptive. Also, without compres-

sion capabilities, the amount of disk required to support this application

would be cost prohibitive. Further analysis showed that a virtual tape solu-

tion was the best fit. This approach eliminated the need to modify the cus-

tomized workflow that was continually optimized over the years and provid-

ed file compression capability to minimize the amount of disk space required

to enable the solution.

Results: Backups are consistently completed within the backup window,

with nearly 100 percent reliability. Data recoveries occur within established

SLAs. The implementation was completed with minimal disruption to the

customized workflow; also, the full-time administrator previously dedicated

to troubleshooting failed backups was re-assigned to other proactive storage

projects.


Point-in-time copies

Types of point-in-time copies

Point-in-time copy software makes a mirrored copy of the production data,

which is split off and assigned to a backup server. This is referred to as “off

host” or “zero impact” backup since the application server is not responsible

for performing the backup. Two types of copies exist: full image and pointer-

based copies. Full image mirrors are complete block-level copies of the origi-

nal data. Pointer-based mirrors are copies of the index information identify-

ing where data exists on the storage array.

Full image mirroring technology

Many storage infrastructures include component redundancy such as disk

mirroring, using RAID 1 to improve the reliability and availability of pri-

mary disk storage subsystems. Often organizations take advantage of mirror-

ing technology to augment their recovery capabilities. By deploying an addi-

tional mirror, this tertiary copy of the data can be separated from the primary

and the first mirror for data recovery operations. This provides a point-in-

time copy of data, which is separate from application and user data. A separate

server can back up this copy without adversely affecting production on

the application server. (See Figure 5.)

Mirror management

Mirror management can be performed by either hardware or software-based

utilities with similar benefits to data recovery operations. One advantage of

using software-based volume administration to manage the point-in-time

copy of the data is that inexpensive storage media can be used for second

mirrors. This allows organizations to invest in innovative RAID technology

for production storage, while using less expensive storage subsystems for the mirrors.

Organizations can also use repurposed legacy storage for their second mir-

rors, where performance and availability are not as critical as with production

storage.

Pointer-based snapshot technology

Pointer-based snapshots are copies of the index information identifying

where the data resides on the storage array. Snapshot technology provides a

means of creating parallel, read-only file systems that point to a set of data

intermingled with live-production data. Creating pointer-based snapshots takes

only seconds with minimal impact on the system. There are two general cate-

gories of pointer-based snapshots: file system and storage array.

• File system snapshots are stored as small files on the live file system. The

data that exists at the time of the snapshot is protected from being over-

written on the physical disk, so that it can be referenced from the snap-

shots. This enables consistent static access to files at an identified point in

time, which offers tremendous benefit to data recovery operations.


• Storage array snapshots create a designated area on the LUN where new

data is written in a copy-on-write operation, when data is updated on the

LUN. This allows quick rollback to a point in time, but adds overhead to

the process of updating data on the array, given that one read and two

writes need to occur for each updated block, compared to only one write

in a file system snapshot environment.
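The "one read and two writes" cost of a storage array snapshot can be illustrated with a toy model. The sketch below assumes the classic copy-on-write scheme, in which the original block is read and copied into a reserved area before the new data overwrites it in place; that reading matches the I/O count given above, but actual array implementations vary. The class `CowVolume` and its methods are hypothetical.

```python
class CowVolume:
    """Toy LUN with one copy-on-write snapshot (illustrative only)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)    # live production data
        self.reserve = {}             # designated area: block index -> original data
        self.snapshot_active = False
        self.reads = 0                # back-end read operations
        self.writes = 0               # back-end write operations

    def take_snapshot(self):
        # Nothing is copied at snapshot time, which is why creation takes seconds.
        self.reserve = {}
        self.snapshot_active = True

    def write(self, index, data):
        if self.snapshot_active and index not in self.reserve:
            original = self.blocks[index]     # read the original block
            self.reads += 1
            self.reserve[index] = original    # write it to the reserved area
            self.writes += 1
        self.blocks[index] = data             # write the new data in place
        self.writes += 1

    def read_snapshot(self, index):
        # Preserved blocks come from the reserve; unchanged blocks from live data.
        return self.reserve.get(index, self.blocks[index])

    def rollback(self):
        for index, original in self.reserve.items():
            self.blocks[index] = original
        self.reserve = {}


vol = CowVolume(["a", "b", "c"])
vol.take_snapshot()
vol.write(1, "B")                     # first update of block 1 after the snapshot
first_update_cost = (vol.reads, vol.writes)
print(first_update_cost, vol.read_snapshot(1), vol.blocks[1])
```

Only the first update to a block after a snapshot pays the extra read and write; later updates to the same block cost a single write, because the original is already preserved.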

Snapshot edits, additions, and disk space requirements

Data edits and additions are written to a new area on the disk, which means

that snapshots do not require nearly the incremental disk space required for

point-in-time data copies (split mirrors), but generally some incremental disk

space is required. The disk space requirement is dependent on two factors:

• The length of time the snapshots are kept

• The data refresh rate

It is important to manage and cycle the snapshots so that the unneeded disk

space can be released and made available to the live file system. (See Figure 6.)
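As a rough planning aid, the two factors above can be combined into a back-of-the-envelope estimate. The formula below is an assumption made for illustration: it treats every day's changes as unique blocks held for the full retention period, so it gives an upper bound. The volume size, change rate, and retention period are hypothetical values, not figures from this paper.

```python
def snapshot_space_gb(volume_gb: float, daily_change_rate: float, retention_days: int) -> float:
    """Upper-bound estimate of the incremental disk space consumed by snapshots.

    daily_change_rate is the fraction of the volume rewritten per day (the data
    refresh rate); retention_days is how long snapshots are kept.
    """
    if not 0 <= daily_change_rate <= 1:
        raise ValueError("daily_change_rate must be between 0 and 1")
    return volume_gb * daily_change_rate * retention_days


# Hypothetical example: 1 TB volume, 2% daily change, snapshots kept 14 days.
estimate = snapshot_space_gb(volume_gb=1000, daily_change_rate=0.02, retention_days=14)
print(f"about {estimate:.0f} GB of snapshot space")
```

Halving either the retention period or the change rate halves the estimate, which is why cycling out unneeded snapshots matters.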

Snapshots and the data recovery process

Snapshots of data can be taken either to create a consistent point of quick

rollback for inadvertent changes, deletions, or corruption of the data or to

establish a solid point-in-time reference to a live data source to assist data


[Figure 5 labels: Production Server, Backup Server, Data, Mirror, Mirror]

Production splits off a mirror copy of the data. The mirror is mounted to a second server, where the data is backed up without impacting the production server. After backup, data can be resynchronized with the primary data volume.

Figure 5: Data mirroring


recovery operations. When snapshots are used as part of the backup, a snap-

shot of the data is taken before the backup process begins. Then the data host

mounts the read-only snapshot file system for backup purposes, while contin-

uing its production use of the live file system. The potential drawback of this

approach is that it creates a lot of I/O activity on the primary volume, which

can impact the performance of a production application. Full-image

point-in-time copies avoid this drawback, which is one of their advantages.

For recovery purposes, snapshot file systems can be referenced to recover

files that have been corrupted or inadvertently deleted. In many environ-

ments, snapshot technology is used for up to 90 percent of file recovery,

rather than retrieving the file from tape or other secondary media. This recov-

ery method greatly improves performance, eases administration, and serves

to complement traditional recovery technologies. Snapshots also provide IT

organizations the ability to deliver on the most stringent RPO requirements

related to recovering corrupted or deleted files.

Challenges

One challenge that exists with snapshot technology is the management of file

system and storage array snapshots. These technologies can generate count-

less snapshots. This is good because snapshots offer such a dramatic RPO

improvement over traditional tape backup, but they do not offer a good cen-

tralized method of systematically generating, managing, and releasing the

snapshots.

This seems to be a new battleground for backup software products, as these

vendors recognize the value of incorporating this management intelligence

into the traditional backup application interface. As this capability is

incorporated into the environment, it allows snapshots to be managed centrally

and consistently across all storage platforms.

A second challenge is that snapshot implementations can cause a heavy hit

on network performance. Backing up at the byte or block level is substantial-


[Figure 6 labels: Base Data, Snapshot, Snapshot Directory. Annotations: Creates snapshot of base data and a snapshot directory with pointer to data at point in time; Writes new blocks, updating base data; points to base blocks in snapshot directory; New blocks will not be overwritten; Directory points to data changed in base data.]

Figure 6: File system data snapshots


ly more efficient versus copying an entire file. An optimal solution makes

light demands on performance and works not only for remote offices but for

the data center as well.

Operational benefits

This solution offers very fast full data recoveries, a decrease in network traf-

fic, and backup times that are reduced to minutes. It reduces the organiza-

tion’s recovery time capability to hours versus days and helps consolidate

backup software licenses.

Best fit

Off-host backup provides a method for delivering quality, predictable, point-

in-time data protection, using standard data recovery technologies. Given the

declining cost per megabyte of disk storage, this approach is gaining popular-

ity as a way of performing backups without hindering production operations

in environments with shrinking or non-existent backup windows.

Business value impact of point-in-time copies

Point-in-time copies deliver the following business benefits:

• Reduced risk of lost intellectual property due to data loss, given the capa-

bility to generate frequent snapshots

• Higher employee productivity due to shortened downtime in the event

that data needs to be recovered from a backup source due to fast accessi-

bility of point-in-time snapshot data

• Reduced IT labor costs as less time is spent recovering data given the fast

nature of snapshot recovery

• Improved customer satisfaction as customers experience fewer outages

from the business, and their inquiries are answered more quickly on aver-

age given that employees will have improved access to critical data

Example of a point-in-time copy solution implementation

Problem: A large organization ran its production systems on servers with

direct-attached enterprise RAID storage subsystems. The data recovery

method was to simply attach tape devices to each file server. The total file

storage was roughly 20TB. A full backup cycle could take up to 40 hours and

nightly incremental backups would consistently extend into the next business

day. The environment lacked high availability; often, recovery from outages

and data losses would take days due to high capacity and slow, legacy tape

technology.

Solution: A high availability, high performance, well-protected solution was

installed to address these problems. The solution comprises file servers (as

before), but in a clustered configuration, redundantly attached to two enter-

prise RAID subsystems with Fibre Channel and Serial ATA drives. The Fibre

Channel drives house the production file shares while ATA storage provides


mirrored copies that are systematically split from the production data for

off-host backup. For tape backup, a group of servers running data protection

software was installed. Every night, these servers get a copy of the produc-

tion data from FC LUNs. Clone copies are written to ATA RAID LUNs. The

servers mount the ATA volumes containing a replica of the production data.

Backup servers write that data via a FC SAN to an enterprise tape library

subsystem. Two tape copies of the data are written simultaneously

for immediate offsite storage. Mirror copies remain mounted on the backup

servers until the next backup cycle, which provides for direct network-share

recovery and a test platform for patches, software, etc.

Results: Full backups have gone from 40 hours to just seconds, in terms of

the time that production is impacted. The high availability capabilities have

significantly reduced service outages and the risk of data loss. If a data loss

were to occur, an exact replica of production data would be ready for full

volume-level recovery. Primary recovery has gone from slow, outdated tape

technology to a simple network share via the backup servers, housing the

previous day’s data. Users simply drag and drop files or folders as a means

of recovery. Volume recovery has gone from roughly 10 to 12 hours to min-

utes with mirror reverse synchronization and four to six hours if tape needs

to be used.

Continuous data protection

What is continuous data protection?

Continuous data protection (CDP) is a relatively new, emerging technology

designed to continuously capture or track data modifications and store

changes independently of the primary data, enabling recovery points from

any point in the past. In effect, CDP creates an electronic journal of complete

storage snapshots, one storage snapshot for every instant in time that data

modifications occur. A major advantage of CDP is that it preserves a record

of every transaction that takes place in the enterprise. CDP can support RPOs

of zero as well as protect against data corruption errors because data can be

rolled back to the exact instant before the error occurred.

The working definition for CDP from the Storage Networking Industry

Association (SNIA) CDP Special Interest Group is “a methodology that con-

tinuously captures or tracks data modifications and stores changes independ-

ently of the primary data, enabling recovery points from any point in the

past. CDP systems may be block, file, or application-based and can provide

fine granularities of restorable objects to infinitely variable recovery points.”

How it works

CDP is a front-end protection system that is always on, operating unobtru-

sively to enterprise applications. A typical CDP system is based on a disk

storage infrastructure to log the continuous data changes as well as provide a

time-indexed view into historic points in time. As such, CDP systems may

require additional processing resources depending on the approach. In return,

CDP can provide enterprise IT organizations with seamless, near-instanta-

neous recoveries from logical and physical data corruption events stemming

from many sources, including operator errors. Recovery can be in seconds or

minutes rather than the hours that a traditional application and data restora-

tion operation may entail.
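
The journaling idea described above can be sketched in a few lines of Python. This is a minimal illustration, not a description of any vendor's product: the ChangeJournal class, its method names, and the block-level granularity are all assumptions made for the example. It shows how a log of timestamped writes, kept apart from the primary data, yields a time-indexed view that can be rolled back to the instant before an error.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class ChangeJournal:
    """Hypothetical block-level CDP journal: every write is logged with a timestamp."""

    # Each entry is (timestamp, block_number, data), kept in timestamp order.
    entries: list = field(default_factory=list)

    def record_write(self, timestamp, block, data):
        # The journal is stored independently of the primary volume.
        if self.entries and timestamp < self.entries[-1][0]:
            raise ValueError("writes must be journaled in time order")
        self.entries.append((timestamp, block, data))

    def view_at(self, timestamp):
        """Rebuild the volume image as it was at the given point in time."""
        times = [entry[0] for entry in self.entries]
        cutoff = bisect_right(times, timestamp)
        image = {}
        for _, block, data in self.entries[:cutoff]:
            image[block] = data  # later writes overwrite earlier ones
        return image


journal = ChangeJournal()
journal.record_write(1.0, 0, b"ledger-v1")
journal.record_write(2.0, 1, b"index-v1")
journal.record_write(3.0, 0, b"corrupted")  # e.g. an operator error

# Roll back to the instant just before the bad write at t=3.0.
recovered = journal.view_at(2.999)
```

A real CDP system would also bound journal growth and index it for fast lookup; the sketch replays the whole log for clarity.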

Challenges

The challenges with implementing a CDP solution include:

• Managing complexity. Because of the number of heterogeneous products

often involved, it is important to select a CDP solution that minimizes the

number of interfaces data administrators must master. Otherwise, the vol-

ume of point product interfaces involved will rapidly transform simplified

environments into ones that are vastly more complex.

• Designing reliability and availability. With the aggressive RPO require-

ments, many factors must be considered and architected into the solution.

These include dual paths, ensuring read/write acknowledgements, asyn-

chronous versus synchronous replication, etc.

• Changing business practices. Often a CDP solution requires IT and busi-

ness process changes to support the new technology or empower end-

users to restore files.

• Maintaining financial responsibility for a CDP solution. IT must be dili-

gent in selecting which data to protect. While it would be ideal to protect

all data through a CDP solution, it is often not fiscally practical, so the

EDR continuum must be leveraged to select the most appropriate technology

for different applications based on the RPO and RTO requirements.

Figure 7: Continuous data protection. (Diagram: application servers attached through a SAN to storage arrays at Site 1 and Site 2, with source volume, target volume, and journal. Labels: "Copy of every change"; "Instant restore to last written change"; "Site 2 is for site 1 disaster and provides instant failover.")

Operational benefits

Given the real-time nature of transmitting updates, this approach yields the

best possible recovery point, minimizing the amount of data loss in the event

of a disaster. With any CDP implementation, the benefits are:

• Near zero data loss

• Backup window reduced to virtually zero

• Improved network efficiency

Best fit

CDP technologies are largely implemented as components of a business con-

tinuance, disaster recovery, or high availability strategy. When these tech-

nologies are present, they can be leveraged for data recovery purposes by

providing faster recovery times from storage subsystem failures or they can

provide off-host backup capabilities.

CDP will best address the local backup and recovery of data associated with

a particular set of critical applications with the following general characteris-

tics:

• A high number of writes, thus data changes frequently

• Need to run continuously (24×7×365) and have a major business impact

when down

• Amount of data (database) is large, thus making activities like traditional

backup difficult and time consuming

• Large transactional systems

• Application with an RTO of seconds/minutes (near zero) and RPO of zero

or near zero (last completed transaction)

Business value impact of CDP

CDP is an emerging technology that enables enterprises to increase their ability

to provide sustained application availability and roll back to specific

points in time. Business benefits are:

• Less productivity lost in event of data outage due to immediate accessibil-

ity of online replica data

• Continued customer satisfaction in the event of a disaster, when access to

customer data can be nearly seamless

• IT cost savings by allowing end users to restore their own files

Example of CDP implementation

The following represents a possible scenario in which CDP would be used.

Problem: A large Wall Street financial company was encountering problems

with data availability for its customers, brokers, and traders. Specific pain

points were:

• Operations need to provide 24x7x365 access to customer records

• Large volume of data with 85% annual growth

• RTO of less than a minute and RPO near zero

Solution: The financial company engaged a business partner for an assess-

ment of their customer database. The recommended solution was to continu-

ously capture data changes and store them independently of the primary data.

Hardware replication was chosen to minimize the impact of replication I/O

on production processes. At the disaster recovery site, archive copies are

written simultaneously to tape, creating an immediate offsite copy.

Results: The implemented solution has met its objectives; the backup win-

dow has been reduced to virtually zero, data loss has dropped to near zero,

and overall network efficiency has improved. Also, the solution has been

integrated with business continuity and disaster recovery.

Adding depth to enhanced data recovery

Data deduplication

An emerging technology gaining a significant amount of attention is the

elimination of redundant data or data deduplication. It is having a dramatic

impact on enhanced data recovery solutions and spans all four architectures:

D2D, virtual tape, point-in-time copies, and continuous data protection.

Recent product offerings provide more advanced data reduction methods and

promise significantly improved benefits. When considering a consolidated

storage environment, the cost of storing online data and multiple backup or

archive copies consumes a significant amount of IT and data center

resources. If an organization can reduce the amount of stored data (data at rest),

then it can reduce the capacity required as well as environmental needs such as

floor space, power, and cooling to support that capacity. Likewise, if the

unique data can be isolated, the bandwidth required to transmit that data to a

remote site will be significantly less.

The potential to achieve an extremely high ROI with this technology has led

to solutions that are leveraging more efficient deduplication. The most com-

mon solutions being adopted in the market today include:

• Storage application solutions. A business or storage application removes

redundant data prior to storing or transmitting the data. It typically

achieves this by comparing data patterns to data already sent or stored in

a common data repository. Examples of this are backup applications with

granular single instance storage repositories or data replication solutions

that compare source and target data sets and send or store only unique

data patterns.

• Storage device solutions. A storage device or appliance emulating a stor-

age device eliminates redundant data by comparing all new data objects

to a common data repository of unique patterns. This is done either as

data is being sent to the storage device (inline), or as a background

process to clean up redundant data after it is stored (post-processing). The

most common implementation of data storage deduplication is in backup

solutions that leverage virtual tape appliances. Recently, some advanced

file systems have announced data deduplication solutions intended for use

with online data.

• Storage networking solutions. Storage networking solutions designed to

move large amounts of data over long distances are also adding data

deduplication as a component of their solutions. These solutions also

leverage common data repositories to send references to data that has

been recently sent and compression to aid in efficiently transmitting new

unique data. This is a key feature of most WAN Optimization Controller

solutions.

It can be very complex to properly size and implement data deduplication

solutions, but the benefits can provide dramatic savings in IT resources.

These solutions are gaining traction in many components of enhanced data

recovery solutions.

Again, it is important to understand the application and lifecycle require-

ments to properly integrate the benefits of data deduplication into an

enhanced data recovery solution.
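
The common-repository approach shared by the three solution types above can be illustrated with a short Python sketch. It is a simplified example under stated assumptions: the ChunkStore class is hypothetical, data is split into fixed-size chunks, and SHA-256 digests stand in for whatever pattern-matching method a given product uses. Each unique chunk is stored once, and every backup is kept as a list of references to those chunks.

```python
import hashlib


class ChunkStore:
    """Hypothetical common repository of unique data patterns."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}  # digest -> chunk bytes, each stored once

    def write(self, data):
        """Store data and return its list of chunk references (a 'recipe')."""
        recipe = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            # Inline deduplication: only previously unseen chunks are stored.
            self.chunks.setdefault(digest, chunk)
            recipe.append(digest)
        return recipe

    def read(self, recipe):
        """Reassemble the original data from its chunk references."""
        return b"".join(self.chunks[digest] for digest in recipe)

    def stored_bytes(self):
        return sum(len(chunk) for chunk in self.chunks.values())


store = ChunkStore(chunk_size=4)
monday = store.write(b"AAAABBBBCCCC")
tuesday = store.write(b"AAAABBBBDDDD")  # only the last chunk is new
```

In this example the two 12-byte backups occupy 16 bytes in the repository rather than 24, because the two chunks they share are stored only once. A post-processing design would instead write all data first and remove duplicates in a background pass.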

Replication

Another technology that provides greater depth to enhanced data recovery is

data replication. In essence, data replication can serve as the disaster recovery

component of any of the four architectures highlighted in this paper. Data

replication technology provides the ability to create a secondary copy of data

primarily for disaster recovery. It gives organizations the ability to quickly

recover data at an offsite location when a disaster occurs, delivering tremendous

recovery time capabilities. Given the real-time nature of transmitting

updates, this approach yields the best possible recovery point, minimizing the

amount of data loss in the event of a disaster. With a wide area high avail-

ability strategy and adequate server resources at a disaster recovery site,

replication technology can help provide the highest level of disaster

resilience in a data center.

While data replication provides benefits generally associated with disaster

recovery, the replica data can be leveraged for recovery purposes as well.

Some organizations perform their regular backups on the replica data, yield-

ing these added benefits:

• Replica copies that can be mounted directly, eliminating recovery steps

• Off-host backup, reducing the impact on production servers during backup

• Reduced network traffic

• Reduced media requirements

• Solid recovery point and recovery time capabilities

This approach can also be used to centralize the data recovery operations of

several facilities, yielding greater efficiency in data recovery operations.

Replication technologies are largely implemented as components of a busi-

ness continuance, disaster recovery, or high availability strategy. When these

technologies are present, they can be leveraged for data recovery purposes by

providing faster recovery times from storage subsystem failures or they can

provide off-host backup capabilities.

The implementation of replication technology offers the following business

benefits:

• Less productivity lost in event of data outage due to immediate accessibil-

ity of online replica data

• Reduced IT labor costs due to centralization of backup tasks

• Lower tape media costs due to consolidation of tape resources to a central

location

• Continued customer satisfaction in the event of a disaster, when access to

customer data can be nearly seamless

• Improved productivity of business people in branch offices who may have

previously been tasked with managing backup processes

For additional information on data replication, refer to the Datalink white paper,

“A Detailed Look at Data Replication Options for Disaster Recovery

Planning.”

Summary

Compromise no longer necessary

No organization can cost-effectively protect all of its data assets with just one

enhanced data recovery architecture. Less critical data may require a simple

tape backup for operational recovery while the most critical data may require

a more robust solution like continuous data protection. Companies that use a

single technology to meet the protection needs of multiple data types will

likely see excessive exposure to data loss or excessive costs. The most effec-

tive approach combines the various technologies available into a tiered pro-

tection infrastructure that delivers the most appropriate levels of protection to

data based on its value to the organization. It is no longer necessary for

organizations to compromise – either accepting greater risks than they need

to, or implementing a solution that overprotects data and costs more to

implement than is justified by the value of the data.
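
A tiered selection of this kind can be sketched as a simple lookup. The example below is illustrative only: the RPO and RTO figures assigned to each technology are placeholder assumptions, not measured values or figures from this paper, and a real assessment would draw them from the organization's own SLAs and test results. The function returns the least expensive tier whose assumed capabilities still satisfy the stated requirement.

```python
from dataclasses import dataclass


@dataclass
class Requirement:
    name: str
    rpo_minutes: float  # acceptable data loss
    rto_minutes: float  # acceptable time to recover


# (technology, achievable RPO in minutes, achievable RTO in minutes),
# ordered from least to most costly. All numbers are hypothetical.
TIERS = [
    ("tape backup", 1440, 720),
    ("disk-to-disk or virtual tape", 1440, 240),
    ("point-in-time copy", 240, 15),
    ("continuous data protection", 0, 1),
]


def select_technology(req):
    """Return the cheapest tier that meets both the RPO and the RTO."""
    for technology, rpo, rto in TIERS:
        if rpo <= req.rpo_minutes and rto <= req.rto_minutes:
            return technology
    return None  # no tier in this table satisfies the requirement


archive = Requirement("file archive", rpo_minutes=10080, rto_minutes=2880)
trading = Requirement("trading database", rpo_minutes=0, rto_minutes=1)
```

Choosing the first satisfying tier avoids both failure modes named above: under-protecting critical data and paying for more protection than the data's value justifies.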

Understand the bottlenecks and pain points

Data and storage requirements are growing at unbelievable rates for business-

es of every type. What does this mean for IT managers? It means that the

benefits from disk storage subsystems used in data protection environments

must be assessed against the organization’s data recovery bottlenecks and

pain points. The assessment of a solution’s effectiveness examines whether

its overall benefits balance well against its human, corporate, and financial

costs. IT managers must have clearly defined and prioritized objectives based

on the organization’s pain points and SLAs and an understanding of the per-

formance bottlenecks within a data recovery environment to determine if disk

can play a role.

Role of disk

It is clear that disk can play a significant role as an enhancement to tape in

data recovery operations, but it has not yet proven to be a panacea. Disk is

merely one more tool that can be used to improve an organization’s data

recovery capabilities. If, after a detailed assessment, an organization determines

that disk can play a role, the next step is to architect a

solution that combines the right technologies.

Technologies

The most attractive EDR technologies are summarized below.

Technology                    Summary

Disk-to-Disk Recovery         A data recovery approach using disk technology as an intermediary step before archiving the data to tape

Virtual Tape                  A disk-based data recovery solution that emulates a tape device

Point-in-Time Copy            A physical or virtual copy of data on disk that is primarily used as a read-only copy for data recovery purposes

Continuous Data Protection    A continuous record of data modifications which allow recovery to any prior point-in-time

Partnership with Datalink

Interested in learning more about enhanced data recovery solutions and

whether or not they make sense in your environment? Consider a partnership

with Datalink.

Datalink is a leading information storage architect. We analyze, design,

implement, and support information storage infrastructures that store, protect,

and provide continuous access to information. Datalink’s specialized capabilities

and solutions span storage area networks, network-attached storage,

direct-attached storage, and IP-based storage, using industry-leading hard-

ware, software, and technical services.

AssessLink consulting services help organizations develop information stor-

age strategies and tactics that align with business needs. Datalink storage

consultants conduct comprehensive data research, develop current versus

desired state gap analyses, and make storage infrastructure and practice rec-

ommendations.

For more information, contact Datalink at (800) 448-6314 or datalink.com.