8/14/2019 STF-4 Business Continuity.pdf
Copyright 2006 EMC Corporation. Do not Copy - All Rights Reserved.
Section Objectives
Upon completion of this section, you will be able to:
Describe what business continuity is
Describe the basic technologies that are enablers of data availability
Describe basic disaster recovery techniques
The objectives for this section are shown here. Please take a moment to read them.
In This Section
This section contains the following modules:
Business Continuity Overview
Backup and Recovery
Business Continuity Local Replication
Business Continuity Remote Replication
Business Continuity Overview
After completing this module, you will be able to:
Define and differentiate between Business Continuity and Disaster Recovery
Differentiate between Disaster Recovery and Disaster Restart
Define terminology such as Recovery Point Objective and Recovery Time Objective
Give a high level description of Business Continuity Planning
Identify Single Points of Failure and describe solutions to eliminate them
These are the objectives for this module. Please take a moment to review them.
What is Business Continuity?
Business Continuity is the preparation for, response to, and recovery from an application outage that adversely affects business operations
Business Continuity Solutions address systems unavailability, degraded application performance, or unacceptable recovery strategies
Since information is a primary asset for most businesses, business continuity is a major concern.
This is not just a concern for the Information Technology department; it impacts the entire
business. At one time, data storage was viewed as a simple issue. The requirements have
become more sophisticated. Businesses must now contend with information availability, storage,
and business continuation in adverse events large or small, man-made or natural. Before we
can talk about business continuity and solutions for business continuity, we must first define the
terms. Business Continuity is the preparation for, response to, and recovery from an application
outage that adversely affects business operations. Business Continuity Solutions address
systems unavailability, degraded application performance, or unacceptable recovery strategies.
Why Business Continuity
Know the downtime costs (per hour, day, two days...)
Lost Productivity
Number of employees impacted (x hours out * hourly rate)
Lost Revenue
Direct loss, compensatory payments, lost future revenue, billing losses, investment losses
Damaged Reputation
Customers, suppliers, financial markets, banks, business partners
Financial Performance
Revenue recognition, cash flow, lost discounts (A/P), payment guarantees, credit rating, stock price
Other Expenses
Temporary employees, equipment rental, overtime costs, extra shipping costs, travel expenses...
There are many factors that need to be considered when calculating the cost of downtime. A
formula to calculate the costs of the outage should capture both the cost of lost productivity of
employees and the cost of lost income from missed sales.
The estimated average cost of 1 hour of downtime = (Employee costs per hour) * (Number
of employees affected by outage) + (Average income per hour).
Employee costs per hour is simply the total salaries and benefits of all employees per week,
divided by the average number of working hours per week.
Average income per hour is just the total income of an institution per week, divided by
average number of hours per week that an institution is open for business.
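The downtime formula above can be sketched in a few lines of Python. This is only an illustration: the function name and the sample figures are hypothetical, and the sketch assumes the employee cost per hour is an average per-employee figure, so that multiplying by the number of affected employees does not double-count.

```python
def hourly_downtime_cost(employee_cost_per_hour, employees_affected, average_income_per_hour):
    """Estimated average cost of one hour of downtime.

    Assumption: employee_cost_per_hour is an average per affected employee.
    """
    lost_productivity = employee_cost_per_hour * employees_affected
    lost_income = average_income_per_hour
    return lost_productivity + lost_income


# Hypothetical figures, for illustration only.
cost = hourly_downtime_cost(
    employee_cost_per_hour=40.0,
    employees_affected=250,
    average_income_per_hour=12000.0,
)
print(cost)  # 22000.0
```

Multiplying the result by the expected length of an outage gives a rough total that can be compared with the cost of a proposed solution.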
Information Availability
% Uptime | % Downtime | Downtime per Year | Downtime per Week
98% | 2% | 7.3 days | 3 hrs 22 min
99% | 1% | 3.65 days | 1 hr 41 min
99.8% | 0.2% | 17 hrs 31 min | 20 min 10 sec
99.9% | 0.1% | 8 hrs 45 min | 10 min 5 sec
99.99% | 0.01% | 52.5 min | 1 min
99.999% | 0.001% | 5.25 min | 6 sec
99.9999% | 0.0001% | 31.5 sec | 0.6 sec
Information Availability ensures that applications and business units have access to information
whenever it is needed. The primary components of information availability are:
Protection from data loss
Ensuring data access
Appropriate data security
The online window for some critical applications has moved to 99.999% of time.
Information availability depends upon robust, functional IT systems.
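The downtime figures in the availability table follow directly from the uptime percentage. The sketch below shows the arithmetic; it assumes a 365-day year and a 7-day week, and the function name is illustrative.

```python
HOURS_PER_YEAR = 365 * 24
HOURS_PER_WEEK = 7 * 24


def downtime_hours(uptime_percent, period_hours):
    """Hours of downtime allowed in a period at a given uptime percentage."""
    return (100.0 - uptime_percent) / 100.0 * period_hours


for uptime in (98.0, 99.0, 99.9, 99.999):
    per_year = downtime_hours(uptime, HOURS_PER_YEAR)
    per_week = downtime_hours(uptime, HOURS_PER_WEEK)
    print(f"{uptime}% uptime: {per_year:.2f} h/year, {per_week * 60:.2f} min/week")
```

For example, 99% uptime allows 87.6 hours (3.65 days) of downtime per year, and 99.999% allows about 5.25 minutes per year.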
Recovery Point Objective (RPO)
[Figure: recovery point and recovery time scales, each running from seconds to weeks. Technologies placed on the recovery point scale: tape backup, periodic replication, asynchronous replication, synchronous replication.]
Recovery Point Objective (RPO) is the point in time to which systems and data must be
recovered after an outage. This defines the amount of data loss a business can endure. Different
business units within an organization may have varying RPOs.
Recovery Time Objective (RTO)
Recovery Time includes:
Fault detection
Recovering data
Bringing apps back online
[Figure: recovery point and recovery time scales, each running from seconds to weeks. Approaches placed on the recovery time scale: global cluster, manual migration, tape restore.]
Recovery Time Objective (RTO) is the period of time within which systems, applications, or
functions must be recovered after an outage. This defines the amount of downtime that a
business can endure and still survive.
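As a rough illustration of how the two objectives are checked, the sketch below compares a protection scheme against an RPO and an RTO. It assumes periodic point-in-time copies, where the worst-case data loss is one full copy interval, and it adds up the three recovery time components listed on the slide. All names and figures are hypothetical.

```python
from datetime import timedelta


def worst_case_data_loss(copy_interval):
    """With periodic copies, an outage just before the next copy loses one full interval."""
    return copy_interval


def estimated_recovery_time(fault_detection, data_recovery, app_restart):
    """Recovery time = fault detection + recovering data + bringing apps back online."""
    return fault_detection + data_recovery + app_restart


# Hypothetical objectives and measurements.
rpo = timedelta(hours=4)
rto = timedelta(hours=8)

nightly_backup = timedelta(hours=24)
recovery = estimated_recovery_time(
    fault_detection=timedelta(minutes=30),
    data_recovery=timedelta(hours=6),
    app_restart=timedelta(hours=1),
)

print(worst_case_data_loss(nightly_backup) <= rpo)  # False: 24 h of possible loss exceeds a 4 h RPO
print(recovery <= rto)                              # True: 7 h 30 min fits within an 8 h RTO
```

In this example the nightly copy would have to be replaced by something more frequent to satisfy the RPO, even though the RTO is met.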
Disaster Recovery versus Disaster Restart
Most business critical applications have some level of data interdependencies
Disaster recovery
Restoring previous copy of data and applying logs to that copy to bring it to a known point of consistency
Generally implies the use of backup technology
Data copied to tape and then shipped off-site
Requires manual intervention during the restore and recovery processes
Disaster restart
Process of restarting mirrored consistent copies of data and applications
Allows restart of all participating DBMS to a common point of consistency utilizing automated application of recovery logs during DBMS initialization
The restart time is comparable to the length of time required for the application to restart after a power failure
Disaster recovery is the process of restoring a previous copy of the data and applying logs or
other necessary processes to that copy to bring it to a known point of consistency.
Disaster restart is the restarting of dependent write consistent copies of data and applications,
utilizing the automated application of DBMS recovery logs during DBMS initialization to bring
the data and application to a transactional point of consistency.
There is a fundamental difference between Disaster Recovery and Disaster Restart. Disaster
recovery is the process of restoring a previous copy of the data and applying logs to that copy to
bring it to a known point of consistency. Disaster restart is the restarting of mirrored consistent
copies of data and applications.
Disaster recovery generally implies the use of backup technology in which data is copied to tape
and then it is shipped off-site. When a disaster is declared, the remote site copies are restored
and logs are applied to bring the data to a point of consistency. Once all recoveries are
completed, the data is validated to ensure it is correct.
Disruptors of Data Availability
Disaster (
Causes of Downtime
Human Error
System Failure
Infrastructure Failure
Disaster
Today, the most critical component of an organization is information. Any disaster occurrence
will affect information availability critical to run normal business operations.
In our definition of disaster, the organization's primary systems, data, and applications are damaged
or destroyed. Not all unplanned disruptions constitute a disaster.
Business Continuity vs. Disaster Recovery
Business Continuity has a broad focus on prevention:
Predictive techniques to identify risks
Procedures to maintain business functions
Disaster Recovery focuses on the activities that occur after an adverse event to return the entity to normal functioning.
Business Continuity is a holistic approach to planning, preparing, and recovering from an
adverse event. The focus is on prevention, identifying risks, and developing procedures to
ensure the continuity of business function. Disaster recovery planning should be included as
part of business continuity.
BC objectives include:
Facilitate uninterrupted business support despite the occurrence of problems.
Create plans that identify risks and mitigate them wherever possible.
Provide a road map to recover from any event.
Disaster Recovery is more about specific cures, to restore service and damaged assets after an
adverse event. In our context, Disaster Recovery is the coordinated process of restoring systems,
data, and infrastructure required to support key ongoing business operations.
Business Continuity Planning (BCP)
Includes the following activities:
Identifying the mission or critical business functions
Collecting data on current business processes
Assessing, prioritizing, mitigating, and managing risk
Risk Analysis
Business Impact Analysis (BIA)
Designing and developing contingency plans and disaster
recovery plan (DR Plan)
Training, testing, and maintenance
Business Continuity Planning (BCP) is a risk management discipline. It involves the entire
business--not just IT. BCP proactively identifies vulnerabilities and risks, planning in advance
how to prepare for and respond to a business disruption. A business with strong BC practices in
place is better able to continue running the business through the disruption and to return to
business as usual.
BCP actually reduces the risk and costs of an adverse event because the process often uncovers
and mitigates potential problems.
Business Continuity Planning Lifecycle
[Figure: lifecycle stages: Objectives; Analysis; Design and Develop; Train, Test, and Document; Implement, Maintain, and Assess]
The Business Continuity Planning process includes the following stages:
1. Objectives
Determine business continuity requirements and objectives including scope and budget
Team selection (include all areas of the business and subject matter expertise, internal/external)
Create the project plan
2. Perform analysis
Collect information on data, business processes, infrastructure supports, dependencies, frequency of use
Identify critical needs and assign recovery priorities.
Create a risk analysis (areas of exposure) and mitigation strategies wherever possible.
Create a Business Impact Analysis (BIA)
Create a cost/benefit analysis: identify the cost (per hour/day, etc.) to the business when data is unavailable.
Evaluate Options
3. Design and Develop the BCP/Strategies
Evaluate options
Define roles/responsibilities
Develop contingency scenarios
Develop emergency response procedures
Detail recovery, resumption, and restore procedures
Design data protection strategies and develop infrastructure
Implement risk management/mitigation procedures
4. Train, test, and document
5. Implement, maintain, and assess
Business Impact Analysis (BIA)
# | Business Area Affected | Impact (1-5) | Probability (1-5) | High Risk SPOF Item | Est cost of mitigation | # Events p/y | Single Loss Expectancy | Loss p/y
1 | Entire Company | 5 | 1 | No redundant UPS for networking/phone equip | $5,800 | .25 | $279,056 | $69,517
2 | Entire Company | 5 | 1 | Cisco net backbone switch not redundant | $66,456 | 0.2 | $279,066 | $55,768
3 | Entire Company | 5 | 1 | Relocate net equip to a separate physical rack | $10,000 | 0.2 | $279,098 | $55,619
4 | IT-All | 4 | 3 | Primary dev platforms don't have failover | $80,000 | 1.0 | $16,000 | $18,000
5 | Entire Company | 4 | 3 | Computer room does not have sufficient UPS capacity to run on single unit | $122,000 | 0.5 | $16,000 | $8,000
6 | IT-Intranet/B2B | 2 | 1 | No failover for development webserver | $5,000 | 1.0 | $400 | $1,800
This is an example of Business Impact Analysis (BIA). The dollar values are arbitrary and are
used just for illustration. BIA quantifies the impact that an outage will have to the business and
potential costs associated with the interruption. It helps businesses channel their resources based
on probability of failure and associated costs.
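A common way to relate the loss columns in an analysis like this is to multiply the single loss expectancy by the expected number of events per year. The sketch below assumes that relationship; the slide's dollar values are arbitrary and do not all follow it exactly, although one row appears to pair a single loss expectancy of $16,000 with 0.5 events per year and a yearly loss of $8,000. The function names are illustrative.

```python
def annual_loss(single_loss_expectancy, events_per_year):
    """Expected loss per year = loss from one event * expected events per year."""
    return single_loss_expectancy * events_per_year


def years_to_recover_mitigation(mitigation_cost, single_loss_expectancy, events_per_year):
    """How many years of avoided loss it takes to pay back the mitigation cost."""
    return mitigation_cost / annual_loss(single_loss_expectancy, events_per_year)


print(annual_loss(16_000, 0.5))                           # 8000.0
print(years_to_recover_mitigation(122_000, 16_000, 0.5))  # 15.25
```

Comparing the payback period across items is one simple way to channel resources toward the risks where mitigation is most worthwhile.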
Identifying Single Points of Failure
[Figure: user and application clients connected over IP to a primary node]
Consider the components in the picture and identify the Single Points of Failure.
HBA Failures
Configure multiple HBAs, and use multi-pathing software
Protects against HBA failure
Can provide improved performance (vendor dependent)
[Figure: host with multiple HBAs connected through a switch to storage array ports]
Configuring multiple HBAs and using multi-pathing software provides path redundancy. Upon
detection of a failed HBA, the software can re-drive the I/O through another available path.
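The re-drive behavior described above can be sketched as a loop over the available paths. This is a simplified model, not how any particular multi-pathing product is implemented; the names `PathFailed`, `make_path`, and `issue_io` are hypothetical.

```python
class PathFailed(Exception):
    """Raised by a path when its HBA (or anything behind it) is unavailable."""


def make_path(name, healthy):
    """Build a stand-in for one host-to-storage path."""
    def send(request):
        if not healthy:
            raise PathFailed(name)
        return f"{request} completed via {name}"
    return send


def issue_io(paths, request):
    """Try each path in turn, re-driving the I/O when a path fails."""
    for send in paths:
        try:
            return send(request)
        except PathFailed:
            continue  # fall through to the next available path
    raise PathFailed("no available path")


paths = [make_path("hba0", healthy=False), make_path("hba1", healthy=True)]
print(issue_io(paths, "read block 42"))  # read block 42 completed via hba1
```

With only one path in the list, the same failure would surface to the application, which is what makes a single HBA a single point of failure.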
Switch/Storage Array Port Failures
Configure multiple switches
Make the devices available via multiple storage array ports
[Figure: host HBAs connected through multiple switches to multiple storage array ports]
This configuration provides switch redundancy as well as protects against storage array port
failures.
Disk Failures
Use some level of RAID
[Figure: host HBAs, switches, storage array ports, and storage]
As seen earlier, using some level of RAID, such as RAID-1 or RAID-5, will ensure continuous
operation in the event of disk failures.
Host Failures
Clustering protects against production host failures
[Figure: two hosts, each with HBAs, connected through switches to the same storage array ports]
Planning and configuring clusters is a complex task. At a high level:
A cluster is two or more hosts with access to the same set of storage (array) devices
Simplest configuration is a two node (host) cluster
One of the nodes would be the production server while the other would be configured as a
standby. This configuration is described as Active/Passive.
Participating nodes exchange heart-beats or keep-alives to inform each other about their
health.
In the event of the primary node failure, cluster management software will shift the
production workload to the standby server.
Implementation of the cluster failover process is vendor specific.
A more complex configuration would be to have both the nodes run production workload on
the same set of devices. Either cluster software or application/database should then provide a
locking mechanism so that the nodes do not try to update the same areas on disk
simultaneously. This would be an Active/Active configuration.
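The Active/Passive behavior described above can be sketched with a standby node that tracks the primary's heartbeats and takes over after a period of silence. Actual cluster software is vendor specific and far more involved; the class name, the timeout value, and the use of plain numbers for time are assumptions made for this illustration.

```python
class StandbyNode:
    """Passive node that takes over when the primary stops sending heartbeats."""

    def __init__(self, timeout_seconds, started_at=0.0):
        self.timeout_seconds = timeout_seconds
        self.last_heartbeat = started_at
        self.active = False

    def receive_heartbeat(self, now):
        """Record the time of the latest keep-alive from the primary."""
        self.last_heartbeat = now

    def check(self, now):
        """Become active if the primary has been silent longer than the timeout."""
        if not self.active and now - self.last_heartbeat > self.timeout_seconds:
            self.active = True  # failover: the production workload moves here
        return self.active


standby = StandbyNode(timeout_seconds=5.0)
standby.receive_heartbeat(now=2.0)
print(standby.check(now=4.0))   # False: last heartbeat was 2 seconds ago
print(standby.check(now=10.0))  # True: 8 seconds of silence exceeds the timeout
```

Choosing the timeout is a trade-off: a short value speeds up failover, while a long value avoids taking over when the primary is merely slow.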
Site/Storage Array Failures
Remote replication helps protect against either entire site
or storage array failures
[Figure: host HBAs and switches connected to a storage array, with a second storage array at a remote site]
Remote replication will be explored in-depth in a later module in this section. What is not shown
in the picture is host connectivity to the storage array in the remote site.
Resolving Single Points of Failure
[Figure: user and application clients reach a primary node and a failover node over a redundant IP network; the nodes exchange keep-alives under clustering software; redundant paths lead to redundant disks (RAID 1/RAID 5), with a redundant site]
This example combines the methods that we have discussed to resolve single points of failure. It
uses clustering, redundant paths and redundant disks, a redundant site, and a redundant network.
Local Replication
Data from the production devices is copied over to a set of target (replica) devices
After some time, the replica devices will contain identical data as those on the production devices
Subsequently copying of data can be halted. At this point-in-time, the replica devices can be used independently of the production devices
The replicas can then be used for restore operations in the event of data corruption or other events
Alternatively the data from the replica devices can be copied to tape. This off-loads the burden of backup from the production devices
Local replication technologies offer fast and convenient methods for ensuring data availability.
The different technologies and the uses of replicas for BC/DR operations will be discussed in a
later module in this section. Typically local replication uses replica disk devices. This greatly
speeds up the restore process, thus minimizing the RTO. Frequent point-in-time replicas also
help in minimizing RPO.
Backup/Restore
Backup to tape has been the predominant method for ensuring data availability and business continuity
Low cost, high capacity disk drives are now being used for backup to disk. This considerably speeds up the backup and the restore process
Frequency of backup will be dictated by defined RPO/RTO requirements as well as the rate of change of data
Far from being antiquated, periodic backup is still a widely used method for preserving copies of
data. In the event of data loss due to corruption or other events, data can be restored up to the
last backup. Evolving technologies now permit faster backups to disks. Magnetic tape drive
speeds and capacities are also continually being enhanced. The various backup paradigms and
the role of backup in BC/DR planning will be discussed in detail later in this section.
Module Summary
Key points covered in this module:
Importance of Business Continuity
Types of outages and their impact to businesses
Business Continuity Planning and Disaster Recovery
Definitions of RPO and RTO
Difference between Disaster Recovery and Disaster Restart
Identifying and eliminating Single Points of Failure
These are the key points covered in this module. Please take a moment to review them.
Backup and Recovery
Upon completion of this module, you will be able to:
Describe best practices for planning Backup and Recovery.
Describe the common media and types of data that are part of a Backup and Recovery strategy.
Describe the common Backup and Recovery topologies.
Describe the Backup and Recovery Process.
Describe Management considerations for Backup and Recovery.
This lesson looks at Backup and Recovery. Backup and Recovery are a major part of the
planning for Business Continuity.
Lesson: Planning for Backup and Recovery
Upon completion of this lesson, you will be able to:
Define Backup and Recovery.
Describe common reasons for a Backup and Recovery plan.
Describe the business considerations for Backup and Recovery.
Define RPO and RTO.
Describe the data considerations for Backup and Recovery.
Describe the planning for Backup and Recovery.
This lesson provides an overview of the business drivers for backup and recovery and introduces
some of the common terms used when developing a backup and recovery plan.
What is a Backup?
Backup is an additional copy of data that can be used for
restore and recovery purposes.
The Backup copy is used when the primary copy is lost or corrupted.
This Backup copy can be created as a:
Simple copy (there can be one or more copies)
Mirrored copy (the copy is always updated with whatever is written to the primary copy.)
A Backup is a copy of the online data that resides on primary storage. The backup copy is
created and retained for the sole purpose of recovering deleted, broken, or corrupted data on the
primary disk.
The backup copy is usually retained over a period of time, depending on the type of the data,
and on the type of backup. There are three derivatives for backup: disaster recovery, Archival,
and operational backup. We will review them in more detail, on the next slide.
The data that is backed up may be on such media as disk or tape, depending on the backup
derivative the customer is targeting. For example, backing up to disk may be more efficient than
tape in operational backup environments.
Backup and Recovery Strategies
Several choices are available to get the data to the backup
media such as:
Copy the data.
Mirror (or snapshot) then copy.
Remote backup.
Copy then duplicate or remote copy.
Several choices are available to get the data written to the backup media.
You can simply copy the data from the primary storage to the secondary storage (disk or
tape), onsite. This is a simple strategy, easily implemented, but impacts the production
server where the data is located, since it will use the server's resources. This may be
tolerated on some applications, but not high demand ones.
To avoid an impact on the production application, and to perform serverless backups, you
can mirror (or snap) a production volume. For example, you can mount it on a separate
server and then copy it to the backup media (disk or tape). This option will completely free
up the production server, with the added infrastructure cost associated with additional
resources.
Remote Backup can be used to comply with offsite requirements. A copy from the primary
storage is done directly to the backup media that is sitting on another site. The backup media
can be a real library, a virtual library or even a remote filesystem.
You can do a copy to a first set of backup media, which will be kept onsite for operational
restore requirements, and then duplicate it to another set of media for offsite purposes. To
simplify the procedure, you can replicate it to an offsite location to remove any manual
procedures associated with moving the backup media to another site.
It's All About Recovery!
Businesses back up their data to enable its recovery in
case of potential loss.
Businesses also back up their data to comply with regulatory requirements.
Types of backup derivatives:
Disaster Recovery
Archival
Operational
There are three different Backup derivatives:
Disaster Recovery addresses the requirement to be able to restore all, or a large part of, an IT
infrastructure in the event of a major disaster.
Archival is a common requirement used to preserve transaction records, email, and other
business work products for regulatory compliance. The regulations could be internal,
governmental, or perhaps derived from specific industry requirements.
Operational is typically the collection of data for the eventual purpose of restoring, at some
point in the future, data that has become lost or corrupted.
Reasons for a Backup Plan
Hardware Failures
Human Factors
Application Failures
Security Breaches
Disasters
Regulatory and Business Requirements
Reasons for a backup plan include:
Physical damage to a storage element (such as a disk) that can result in data loss.
People make mistakes and unhappy employees or external hackers may breach security and
maliciously destroy data.
Software failures can destroy or lose data and viruses can destroy data, impact data integrity,
and halt key operations.
Physical security breaches can destroy equipment that contains data and applications.
Natural disasters and other events such as earthquakes, lightning strikes, floods, tornados,
hurricanes, accidents, chemical spills, and power grid failures can cause not only the loss of
data but also the loss of an entire computer facility. Offsite data storage is often justified to
protect a business from these types of events.
Government regulations may require certain data to be kept for extended timeframes.
Corporations may establish their own extended retention policies for intellectual property to
protect them against litigation. The regulations and business requirements that drive data as
an archive generally require data to be retained at an offsite location.
How does Backup Work?
Client/Server Relationship
Server
Directs Operation
Maintains the Backup Catalog
Client
Gathers Data for Backup (a backup client sends backup data to a backup server or storage node).
Storage Node
Backup products vary, but they do have some common characteristics. The basic architecture of
a backup system is client-server, with a backup server and some number of backup clients or
agents. The backup server directs the operations and owns the backup catalog (the information
about the backup). The catalog contains the table-of-contents for the data set. It also contains
information about the backup session itself.
The backup server depends on the backup client to gather the data to be backed up. The backup
client can be local or it can reside on another system, presumably to back up the data visible to
that system. A backup server receives backup metadata from backup clients to perform its
activities.
There is another component called a storage node. The storage node is the entity responsible for
writing the data set to the backup device. Typically there is a storage node packaged with the
backup server and the backup device is attached directly to the backup server's host platform.
Storage nodes play an important role in backup planning as they can be used to consolidate backup
servers.
How does Backup Work?
[Figure: clients, servers, backup server and storage node, disk storage holding the metadata catalog, and tape backup holding the data set]
The following represents a typical Backup process:
The Backup Server initiates the backup process (starts the backup application).
The Backup Server sends a request to a server to "send me your data".
The server sends the data to the Backup Server and/or Storage Node.
The Storage Node sends the data to the tape storage device and the Backup Server begins
building the catalog (metadata) of the backup session.
When all of the data has been transferred from the server to the Backup Server, the Backup
Server writes the catalog to a disk file and closes the connection to the tape device.
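The sequence above can be modeled in a few lines. This is a toy sketch of the roles only, not the design of any backup product: the class names, the in-memory list standing in for the tape device, and the catalog fields are all assumptions.

```python
class BackupClient:
    """Runs on the server being protected and gathers its data."""

    def __init__(self, name, files):
        self.name = name
        self.files = files  # maps path -> file contents

    def gather(self):
        return dict(self.files)


class StorageNode:
    """Writes the data set to the backup device (a list stands in for tape)."""

    def __init__(self):
        self.device = []

    def write(self, client_name, path, data):
        self.device.append((client_name, path, data))


class BackupServer:
    """Directs the operation and maintains the backup catalog."""

    def __init__(self, storage_node):
        self.storage_node = storage_node
        self.catalog = []

    def run_backup(self, clients):
        for client in clients:                       # ask each client for its data
            for path, data in client.gather().items():
                self.storage_node.write(client.name, path, data)
                self.catalog.append(                 # build the session metadata
                    {"client": client.name, "path": path, "size": len(data)}
                )
        return self.catalog


node = StorageNode()
server = BackupServer(node)
catalog = server.run_backup([BackupClient("db01", {"/data/a.db": b"abc", "/data/b.db": b"de"})])
print(len(node.device), len(catalog))  # 2 2
```

Keeping the catalog separate from the data set mirrors the description above: the data goes to the backup device, while the metadata stays with the backup server so that restores can locate it.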
Business Considerations
Customer business needs determine:
What are the restore requirements (RPO and RTO)?
Where and when will the restores occur?
What are the most frequent restore requests?
Which data needs to be backed up?
How frequently should data be backed up?
hourly, daily, weekly, monthly
How long will it take to backup?
How many copies to create?
How long to retain backup copies?
Some important decisions that need consideration before implementing a Backup/Restore
solution are shown above. Some examples include:
The Recovery Point Objective (RPO)
The Recovery Time Objective (RTO)
The media type to be used (disk or tape)
Where and when the restore operations will occur, especially if an alternative host will be
used to receive the restore data.
When to perform backups.
The granularity of backups: Full, Incremental, or Cumulative.
How long to keep the backup. For example, some backups need to be retained for 4 years,
others just for 1 month.
Whether it is necessary to take copies of the backup.
Data Considerations: File Characteristics
Location
Size
Number
Location:
Many organizations have dozens of heterogeneous platforms that support a complex
application. Consider a data warehouse where data from many sources is fed into the
warehouse. When this scenario is viewed as "The Data Warehouse Application," it easily
fits this model. Some of the issues are:
How the backups for subsets of the data are synchronized
How these applications are restored
Size:
Backing up a large amount of data that consists of a few big files may have less system
overhead than backing up a large number of small files. If a file system contains millions of
small files, the very nature of searching the file system structures for changed files can take
hours, since the entire file structure is searched.
Number:
A file system containing one million files with a ten-percent daily change rate will
potentially have to create 100,000 entries in the backup catalog. This brings up other issues
such as:
How a massive file system search impacts the system
Search time/Media impact
Whether there is an impact on tape start/stop processing
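The arithmetic behind the "Number" example can be checked with a short sketch. The function names are hypothetical, and the scan rate of 100 files per second is purely an assumption chosen to show how a full file system search can run into hours.

```python
def daily_catalog_entries(total_files, change_percent):
    """Catalog entries created by one day's changed files."""
    return total_files * change_percent // 100


def full_scan_hours(total_files, files_per_second):
    """Time to walk every file in the file system at an ASSUMED scan rate."""
    return total_files / files_per_second / 3600


entries = daily_catalog_entries(1_000_000, 10)
hours = full_scan_hours(1_000_000, files_per_second=100)  # assumed rate
print(entries, round(hours, 2))
```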
Data Considerations: Data Compression
Compressibility depends on the data type, for example:
Application binaries do not compress well.
Text compresses well.
JPEG/ZIP files are already compressed and expand if compressed again.
Many backup devices, such as tape drives, have built-in hardware compression technologies. To
use these technologies effectively, it is important to understand the characteristics of the data.
Some data, such as application binaries, do not compress well. Text data can compress very
well, while other data, such as JPEG and ZIP files, are already compressed.
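A small experiment with Python's standard zlib module illustrates the point. This uses software compression as a stand-in for a tape drive's hardware compression, and random bytes as a stand-in for already-compressed data such as JPEG or ZIP files; both substitutions are assumptions made for the sake of a self-contained sketch.

```python
import os
import zlib


def compression_ratio(data):
    """Compressed size divided by original size (lower means better)."""
    return len(zlib.compress(data)) / len(data)


# Repetitive text compresses very well.
text = b"The quick brown fox jumps over the lazy dog. " * 2000

# Random bytes stand in for already-compressed data (an assumption).
high_entropy = os.urandom(100_000)

text_ratio = compression_ratio(text)
high_entropy_ratio = compression_ratio(high_entropy)
print(f"text: {text_ratio:.3f}, high-entropy: {high_entropy_ratio:.3f}")
```

A ratio above 1.0 for the high-entropy input means the "compressed" output is slightly larger than the input, which is the expansion effect mentioned on the slide.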
Data Considerations: Retention Periods
Operational
Data sets on primary media (disk) up to the point where most restore requests are satisfied, then moved to secondary storage (tape).
Disaster Recovery
Driven by the organization's disaster recovery policy
Portable media (tapes) sent to an offsite location / vault.
Replicated over to an offsite location (disk).
Backed up directly to the offsite location (disk, tape or emulated tape).
Archiving
Driven by the organization's policy.
Dictated by regulatory requirements.
As mentioned before, there are three types of backup models (Operational, Disaster Recovery,
and Archive). Each can be defined by its retention period. Retention Periods are the length of
time that a particular version of a dataset is available to be restored.
Retention periods are driven by the type of recovery the business is trying to achieve:
For operational restore, data sets could be maintained on a disk primary backup storage
target for the period during which most restore requests are likely to occur, and then
moved to a secondary backup storage target, such as tape, for long-term offsite storage.
For disaster recovery, backups must be done and moved to an offsite location.
For archiving, requirements usually will be driven by the organization's policy and
regulatory conformance requirements. Tapes can be used for some applications, but for
others a more robust and reliable solution, such as disks, may be more appropriate.
Lesson: Summary
Topics in this lesson included:
Backup and Recovery definitions and examples.
Common reasons for Backup and Recovery.
The business considerations for Backup and Recovery.
Recovery Point Objectives and Recovery Time Objectives.
The data considerations for Backup and Recovery
The planning for Backup and Recovery.
In this lesson we reviewed the business and data considerations when planning for Backup and
Recovery including:
What is Backup and Recovery?
What is the Backup and Recovery process?
Business recovery needs
RPO: Recovery Point Objectives
RTO: Recovery Time Objectives
Data characteristics
Files, compression, retention
Lesson: Backup and Recovery Methods
Upon completion of this lesson, you will be able to:
Describe Hot and Cold Backups.
Describe the levels of Backup Granularity.
We've discussed the importance of, and considerations for, a Backup Plan. This lesson provides
an overview of the different methods for creating a backup set.
Database Backup Methods
Hot Backup: production is not interrupted.
Cold Backup: production is interrupted.
Backup Agents manage the backup of different data types such as:
Structured (such as databases)
Semi-structured (such as email)
Unstructured (file systems)
Backing up databases can occur using two different methods:
A Hot backup, which means that the application is still up and running, with users accessing
it, while the backup is taking place.
A Cold backup, which means that the application is shut down for the backup to take
place.
Most backup applications offer various Backup Agents to do these kinds of operations. There
will be different agents for different types of data and applications.
Backup Granularity and Levels
Full Backup
Cumulative (Differential)
Incremental
The granularity and levels for backups depend on business needs, and, to some extent,
technological limitations. Some backup strategies define as many as ten levels of backup. IT
organizations use a combination of these to fulfill their requirements. Most use some
combination of Full, Cumulative, and Incremental backups.
A Full backup is a backup of all data on the target volumes, regardless of any changes made to
the data itself.
An Incremental backup contains the changes made since the most recent backup of any type
(full, cumulative, or incremental).
A Cumulative backup, also known as a Differential backup, is a type of incremental that
contains all changes made since the last full backup.
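The difference between the three levels comes down to which cutoff time is used when selecting files. The sketch below is a hypothetical illustration; the function name, the level strings, and the hour-based timeline are assumptions, not part of any backup product.

```python
def files_to_back_up(level, modified_at, last_full, last_backup):
    """Return the files a backup of the given level would copy.

    modified_at maps file name -> time of last change (any comparable unit).
    last_full is the time of the last full backup; last_backup is the time
    of the most recent backup of any type.
    """
    if level == "full":
        return sorted(modified_at)            # everything, changed or not
    if level == "cumulative":
        cutoff = last_full                    # changes since the last full
    elif level == "incremental":
        cutoff = last_backup                  # changes since the last backup
    else:
        raise ValueError(f"unknown backup level: {level}")
    return sorted(name for name, t in modified_at.items() if t > cutoff)


# Hypothetical timeline in hours: full backup at 10, incremental at 34.
modified_at = {"File 1": 1, "File 2": 2, "File 3": 40, "File 4": 20}
full = files_to_back_up("full", modified_at, last_full=10, last_backup=34)
cumulative = files_to_back_up("cumulative", modified_at, last_full=10, last_backup=34)
incremental = files_to_back_up("incremental", modified_at, last_full=10, last_backup=34)
print(full, cumulative, incremental)
```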
Restoring an Incremental Backup
Key Features
Files that have changed since the last full or incremental backup are backed up.
Fewest files to be backed up, therefore faster backup and less storage space.
Longer restore, because the last full and all subsequent incremental backups must be applied.
[Diagram: Monday Full Backup (Files 1, 2, 3); Tuesday Incremental (File 4); Wednesday Incremental (File 3); Thursday Incremental (File 5); Production after restore: Files 1, 2, 3, 4, 5]
The following is an example of an incremental backup and restore:
A full backup of the business data is taken on Monday evening. Each day after that, an
incremental backup is taken. These incremental backups back up only files that are new or that
have changed since the last full or incremental backup.
On Tuesday, a new file is added, File 4. No other files have changed. Since File 4 is a new file
added after the previous backup on Monday evening, it will be backed up Tuesday evening.
On Wednesday, there are no new files added since Tuesday, but File 3 has changed. Since File
3 was changed after the previous evening backup (Tuesday), it will be backed up Wednesday
evening.
On Thursday, no files have changed but a new file has been added, File 5. Since File 5 was
added after the previous evening backup, it will be backed up Thursday evening.
On Friday morning, there is a data corruption, so the data must be restored from tape.
The first step is to restore the full backup from Monday evening. Then, every incremental
backup that was done since the last full backup must be applied, which, in this example,
means the:
Tuesday,
Wednesday, and
Thursday incremental backups.
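The restore sequence in this example can be replayed with a short sketch. File contents are represented by version strings, and the restore_incremental function is a hypothetical helper written only for this illustration.

```python
def restore_incremental(full, incrementals):
    """Restore the full backup, then apply every incremental in order."""
    state = dict(full)
    for backup in incrementals:
        state.update(backup)       # later backups overwrite older versions
    return state


full_monday = {"File 1": "v1", "File 2": "v1", "File 3": "v1"}
incrementals = [
    {"File 4": "v1"},   # Tuesday: File 4 added
    {"File 3": "v2"},   # Wednesday: File 3 changed
    {"File 5": "v1"},   # Thursday: File 5 added
]
restored = restore_incremental(full_monday, incrementals)
backup_sets_read = 1 + len(incrementals)
print(restored, backup_sets_read)
```

Four backup sets have to be read here, and skipping the Wednesday set would silently restore the old version of File 3.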
Restoring a Cumulative Backup
Key Features
More files to be backed up, therefore it takes more time to back up and uses more storage space.
Much faster restore, because only the last full and the last cumulative backup must be applied.
[Diagram: Monday Full Backup (Files 1, 2, 3); Tuesday Cumulative (File 4); Wednesday Cumulative (Files 4, 5); Thursday Cumulative (Files 4, 5, 6); Production after restore: Files 1, 2, 3, 4, 5, 6]
The following is an example of cumulative backup and restore:
A full backup of the data is taken on Monday evening. Each day after that, a cumulative backup
is taken. These cumulative backups back up ALL FILES that have changed since the LAST
FULL BACKUP.
On Tuesday, File 4 is added. Since File 4 is a new file that has been added since the last full
backup, it will be backed up Tuesday evening.
On Wednesday, File 5 is added. Now, since both File 4 and File 5 are files that have been added
or changed since the last full backup, both files will be backed up Wednesday evening.
On Thursday, File 6 is added. Again, File 4, File 5, and File 6 are files that have been added or
changed since the last full backup; all three files will be backed up Thursday evening.
On Friday morning, there is a corruption of the data, so the data must be restored from tape.
The first step is to restore the full backup from Monday evening.
Then, only the backup from Thursday evening is restored because it contains all the
new/changed files from Tuesday, Wednesday, and Thursday.
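The same kind of sketch shows why a cumulative restore needs only two backup sets. As before, the restore_cumulative helper and the version strings are hypothetical and exist only for illustration.

```python
def restore_cumulative(full, cumulatives):
    """Restore the full backup, then apply only the latest cumulative."""
    state = dict(full)
    if cumulatives:
        state.update(cumulatives[-1])   # the last one contains all changes
    return state


full_monday = {"File 1": "v1", "File 2": "v1", "File 3": "v1"}
cumulatives = [
    {"File 4": "v1"},                                   # Tuesday
    {"File 4": "v1", "File 5": "v1"},                   # Wednesday
    {"File 4": "v1", "File 5": "v1", "File 6": "v1"},   # Thursday
]
restored = restore_cumulative(full_monday, cumulatives)
backup_sets_read = 2                                    # Monday full + Thursday
files_stored = sum(len(c) for c in cumulatives)
print(sorted(restored), backup_sets_read, files_stored)
```

The trade-off is visible in files_stored: the three cumulative backups hold six file copies, where three incremental backups of the same changes would hold three.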
Lesson: Backup Architecture Topologies
Upon completion of this lesson, you will be able to:
Describe the DAS, LAN, SAN, and Mixed topologies.
Describe backup media considerations.
We have discussed the importance of the Backup plan and the different methods used when
creating a backup set. This lesson provides an overview of the different topologies and media
types that are used to support creating a backup set.
Backup Architecture Topologies
There are 3 basic backup topologies:
Direct Attached Based Backup
LAN Based Backup
SAN Based Backup
These topologies can be integrated, forming a mixed topology
There are three basic topologies that are used in a backup environment: Direct Attached Based
Backup, LAN Based Backup, and SAN Based Backup.
There is also a fourth topology, called Mixed, which is formed when mixing two or more of
these topologies in a given situation.
Direct Attached Based Backups
[Diagram: Backup Server with its Catalog, the LAN, the Storage Node, and the backup Media; the flows shown are Metadata and Data.]
Here, the backup data flows directly from the host to be backed up to the tape, without utilizing
the LAN. In this model, there is no centralized management and it is difficult to grow the
environment.
Direct Attached Based Backups are performed directly from the backup client's disk to the
backup client's tape devices. The key advantage of direct-attached backups is speed: the tape
devices can operate at the speed of the channels, and because they are close to the data source
and dedicated to the host, backup and restore speed is optimized. The disadvantages are that
direct-attached backups impact host and application performance, since backups consume host
I/O bandwidth, memory, and CPU resources, and that they potentially have distance
restrictions if short-distance connections such as SCSI are used.
LAN Based Backups
[Diagram: Mail Server, File Server, and Database Server connected over the LAN to the Backup Server and the Storage Node; the flows shown are Metadata and Data.]
In this model, the backup data flows from the host to be backed up to the tape through the LAN.
There is centralized management, but there may be an issue with LAN utilization since all
data goes through it.
As we have defined previously, Backup Metadata contains information about what has been
backed up, such as file names, time of backup, size, permissions, ownership, and most
importantly, tracking information for rapid location and restore. It also indicates where it has
been stored, for example, which tape. Data, the contents of files, databases, etc., is the primary
information source to be backed up. In a LAN Based Backup, the Backup Server is the central
control point for all backups. The metadata and backup policies reside in the Backup Server.
Storage Nodes control backup devices and are controlled by the Backup Server.
The advantages of LAN Based Backup include the following:
LAN backups enable an organization to centralize backups and pool tape resources.
The centralization and pooling can enable standardization of processes, tools, and backup
media. Centralization of tapes can also improve operational efficiency.
Disadvantages are:
The backup process has an impact on production systems, the client network, and the
applications.
It consumes CPU, I/O bandwidth, LAN bandwidth, and memory.
In order to maintain finite backup points, applications might have to be halted and databases
shut down.
SAN Based Backups (LAN Free)
[Diagram: Mail Server, Storage Node, and Backup Server; Metadata travels over the LAN and Data travels over the SAN.]
A SAN based backup, also known as LAN Free backup, is achieved when there is no backup
data movement over the LAN. In this case, all backup data travels through a SAN to the
destination backup device.
This type of backup still requires network connectivity from the Storage Node to the Backup
Server, since metadata always has to travel through the LAN.
LAN-free backups use Storage Area Networks (SANs) to move backup data rapidly and reliably.
The SAN is usually used in conjunction with backup software that supports tape device sharing.
A SAN-enabled backup infrastructure introduces these advantages to the backup process. It
provides Fibre Channel performance, reliability, and distance. It requires fewer processes and
reduced overhead. It does not use the LAN to move backup data and eliminates or reduces
dedicated backup servers. Finally, it improves backup and restore performance.
SAN/LAN Mixed Based Backups
[Diagram: Mail Server, Database Server, Storage Node, and Backup Server connected by both a LAN and a SAN; the flows shown are Metadata and Data.]
A SAN/LAN Mixed Based Backup environment is achieved by using two or more of the
topologies described in the previous slides. In this example, some servers are SAN based while
others are LAN based.
Multiple Streams on Tape Media
Multiple streams interleaved to achieve higher throughput on tape
Keeps the tape streaming, for maximum write performance
Helps prevent tape mechanical failure
Greatly increases time to restore
[Diagram: a tape holding interleaved data from Stream 1, Stream 2, and Stream 3]
Tape drive streaming is recommended by all vendors, in order to keep the drive busy. If you
do not keep the drive busy during the backup process (writing), performance will suffer.
Multiple streaming helps to improve performance drastically, but it generates one issue as well:
the backup data becomes interleaved, and thus the recovery times are increased.
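The restore penalty of interleaving can be seen in a toy model. The sketch below treats the tape as a list of blocks written round-robin from several streams; the function names and the block labels are hypothetical, and real multiplexing schemes differ in detail.

```python
from itertools import zip_longest


def interleave(streams):
    """Write blocks from all streams to one tape in round-robin order."""
    tape = []
    for blocks in zip_longest(*streams.values()):
        for name, block in zip(streams, blocks):
            if block is not None:
                tape.append((name, block))
    return tape


def restore_stream(tape, name):
    """Return one stream's blocks and how many tape blocks had to be read."""
    wanted = [block for owner, block in tape if owner == name]
    last = max(i for i, (owner, _) in enumerate(tape) if owner == name)
    return wanted, last + 1


streams = {
    "Stream 1": ["a1", "a2", "a3"],
    "Stream 2": ["b1", "b2", "b3"],
    "Stream 3": ["c1", "c2", "c3"],
}
tape = interleave(streams)
blocks, blocks_read = restore_stream(tape, "Stream 1")
print(blocks, blocks_read)
```

Restoring the three blocks of Stream 1 means passing over seven tape blocks in this model, instead of three if the stream had been written contiguously.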
Backup to Disk
Backup to disk minimizes tape in backup environments by using disk as the primary destination device
Cost benefits
No process changes needed
Better service levels
Backup to disk aligns backup strategy to RTO and RPO
Backup to disk replaces tape and its associated devices, as the primary target for backup, with
disk. Backup to disk systems offer major advantages over equivalent scale tape systems, in
terms of capital costs, operating costs, support costs, and quality of service. It can be
implemented fully on day one or in phases.
While no process changes are needed, any number of enhancements to the process, and the
services provided, are now possible. Backup to disk can be a great enabler. Instead of having tape
technology drive the business processes, the business goals drive the backup strategy.
Tape versus Disk Restore Comparison
Typical Scenario: 800 users, 75 MB mailbox, 60 GB database
Source: EMC Engineering and EMC IT
[Chart: Recovery Time in Minutes*. Tape Backup / Restore: 108 minutes; Disk Backup / Restore: 24 minutes]
*Total time from point of failure to return of service to e-mail users
This example shows a typical recovery scenario using tape and disk. As you can see, recovery
with disk provides much faster recovery than does recovery with tape.
Keep in mind that this example involves data recovery only. The time it takes to bring the
application online is a separate matter. Even so, you can see in this example that the benefit was
a restore about 4.5 times faster (24 minutes versus 108) than it would have been with tape. What
you don't see is the mitigated risk of media failure, and the time saved in not having to locate
and load the correct tapes before being able to begin the recovery process.
Three Backup / Restore Solutions based on RTO
Typical Scenario: 800 users, 75 MB mailbox, 60 GB DB (restore time), 500 MB logs (log playback)
[Chart: Recovery Time in Minutes*]
Solution | Restore time | Log playback | Total
Backup on tape | 108 min. | 17 min. | 125 minutes
Backup on ATA | 24 min. | 17 min. | 41 minutes
BCV / Clone | 2 min. | 17 min. | 19 minutes
Time of last image dictates the log playback time.
Larger data sets extend the recovery time (ATA and tape).
*Total time from point of failure to return of service to e-mail users
The diagram shows typical recovery scenarios using different technical solutions. As you can
see, recovery with Business Continuance Volumes (BCVs) or clones provides the quickest
recovery method.
It is important to note that using BCVs or clones on disk enables you to make copies of your
data more often. This will improve RPO (the point from which you can recover).
It will also improve RTO because the log files will be smaller and that will reduce the log
playback time.
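The totals in the chart follow from adding the restore time and the log playback time for each solution. The sketch below reproduces that arithmetic from the slide's figures; the variable names are hypothetical.

```python
LOG_PLAYBACK_MIN = 17   # same 500 MB of logs replayed in every case

restore_minutes = {
    "Backup on tape": 108,
    "Backup on ATA": 24,
    "BCV / Clone": 2,
}

# Total recovery time = restore time + log playback time.
total_minutes = {
    solution: restore + LOG_PLAYBACK_MIN
    for solution, restore in restore_minutes.items()
}
fastest = min(total_minutes, key=total_minutes.get)
print(total_minutes, fastest)
```

Because log playback is a fixed 17 minutes in this scenario, it dominates the BCV / Clone total, which is why taking images more often (and so replaying smaller logs) improves RTO further.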
Traditional Backup, Recovery and Archive Approach
Production environment grows
Requires constant tuning and data placement to maintain performance
Need to add more tier-1 storage
Backup environment grows
Backup windows get longer and jobs do not complete
Restores take longer
Requires more tape drives and silos to keep up with service levels
Archive environment grows
Impacts flexibility to retrieve content when requested
Requires more media, adding management cost
No investment protection for long-term retention requirements
[Diagram: Production, Backup Process, Archive Process]
In a traditional approach for backup and archive, businesses take a backup of production.
Typically backup jobs use weekly full backups and nightly incremental backups. Based on
business requirements, they will then copy the backup jobs and eject the tapes to have them sent
offsite, where they will be stored for a specified amount of time.
The problem with this approach is simple - as the production environment grows, so does the
backup environment.
Differences Between Backup / Recovery & Archive
Archive | Backup / Recovery
Primary copy of information | A secondary copy of information
Available for information retrieval | Used for recovery operations
Adds operational efficiencies by moving fixed / unstructured content out of operational environment | Improves availability by enabling application to be restored to a specific point in time
Typically long-term (months, years, or decades) | Typically short-term (weeks or months)
Useful for compliance and should take into account information-retention policy | Not for regulatory compliance, though some are forced to use
Data typically maintained for analysis, value generation, or compliance | Data typically overwritten on periodic basis (e.g., monthly)
Backup/Recovery and Archiving support different business goals. This slide compares and
contrasts some of the significant differences.
New Architecture for Backup, Recovery & Archive
1 Understand the environment
2 Actively archive valuable information to tiered storage
3 Back up active production information to disk
4 Retrieve from archive or recover from backup
[Diagram: Production, Backup Process, Archive Process]
The recovery process is much more important than the backup process. It is based on the
appropriate recovery-point objectives (RPOs) and recovery-time objectives (RTOs). The process
usually drives a decision to have a combination of technologies in place, from online Business
Continuance Volumes (BCVs), to backup to disk, to backup to tape for long-term, passive
RPOs.
Archive processes are determined not only by the required retention times, but also by retrieval-
time service levels and the availability requirements of the information in the archive.
For both processes, a combination of hardware and software is needed to deliver the appropriate
service level. The best way to discover the appropriate service level is to classify the data and
align the business applications with it.
Lesson: Summary
Topics in this lesson included:
The DAS, LAN, SAN, and Mixed topologies.
Backup media considerations.
This lesson provided an overview of the different topologies and media types that support
creating a backup set.
Lesson: Managing the Backup Process
Upon completion of this lesson, you will be able to:
Describe features and functions of common Backup/Recovery applications.
Describe the Backup/Recovery process management considerations.
Describe the importance of the information found in Backup Reports and in the Backup Catalog.
We have discussed the planning and operations of creating a Backup. This lesson provides an
overview of Management activities and applications that help manage the Backup and Recovery
process.
How a Typical Backup Application Works
Backup clients are grouped and associated with a Backup schedule that determines when and which backup type will occur.
Groups are associated with Pools, which determine which backup media will be used.
Each backup media has a unique label.
Information about the backup is written to the Backup Catalog during and after it completes. The Catalog shows:
when the Backup was performed, and
which media was used (label).
Errors and other information are also written to a log.
The process for using a Backup application includes the following:
Backup clients are grouped and associated with a Backup schedule that determines when and
which backup type will occur.
Groups are associated with Pools, which determine which backup media will be used. Each
backup media has a unique label.
Information about the backup is written to the Backup Catalog during and after it completes.
The Catalog shows when the Backup was performed, and which media was used (label).
Errors and other information are also written to a log.
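The relationship between clients, groups, schedules, pools, and media labels can be modeled with a few small data structures. Everything in this sketch is hypothetical (class names, field names, and the rule of always using the pool's first medium); it is not the configuration model of any particular backup application.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    name: str
    media_labels: list          # each backup medium has a unique label


@dataclass
class Group:
    name: str
    clients: list
    schedule: dict              # weekday -> backup type
    pool: Pool


def plan_backups(group, weekday):
    """Return one catalog-style entry per client for the given weekday."""
    backup_type = group.schedule.get(weekday)
    if backup_type is None:
        return []                               # nothing scheduled that day
    label = group.pool.media_labels[0]          # simplification: first medium
    return [
        {"client": c, "type": backup_type, "day": weekday, "label": label}
        for c in group.clients
    ]


pool = Pool("weekly-tapes", ["TAPE001", "TAPE002"])
group = Group("mail", ["mail01", "mail02"],
              {"Sunday": "full", "Monday": "incremental"}, pool)
entries = plan_backups(group, "Sunday")
print(entries)
```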
Backup Application User Interfaces
There are typically two types of user interfaces:
Command Line Interface (CLI)
Graphical User Interface (GUI)
There are typically two types of user interfaces. With the Command Line Interface (CLI), backup
administrators usually write scripts to automate common tasks, such as sending reports via email.
The Graphical User Interface (GUI) controls the backup and restore process across multiple backup
servers, multiple storage nodes, and multiple platforms/operating systems. It is a single,
easy-to-use interface that provides the most common (if not all) administrative tasks.
Managing the Backup and Restore Process
Running the B/R Application: Backup
The backup administrator configures it to be started, most (if not all) of the time, automatically
Most backup products offer the ability for the backup client to initiate its own backup (usually disabled)
Running the B/R Application: Restore
There is usually a separate GUI to manage the restore process
Information is pulled from the backup catalog when the user is selecting the files to be restored
Once the selection is finished, the backup server starts reading from the required backup media, and the files are sent to the backup client
There are common tasks associated with managing a Backup or Restore activity using the B/R
Application. For backup, the backup administrator configures the backup to start automatically
most (if not all) of the time. Most backup products also let the backup client initiate its own
backup (Note: usually this feature is disabled).
In restore, there is usually a separate GUI to manage the restore process. Information is pulled
from the backup catalog when the user is selecting the files to be restored. Once the selection is
finished, the backup server starts reading from the required backup media, and the files are sent
to the backup client.
Backup Reports
Backup products also offer reporting features.
These features rely on the backup catalog and log files.
Reports are meant to be easy to read and provide important information such as:
Amount of data backed up
Number of completed backups
Number of incomplete backups (failed)
Types of errors that may have occurred
Additional reports may be available, depending on the backup software product used.
Backup products also offer reporting features. These features rely on the backup catalog and log
files. Reports are meant to be easy to read and provide important information such as amount of
data backed up, number of completed backups, number of incomplete backups (failed), and
types of errors that may have occurred. Additional reports may be available, depending on the
backup software product used.
Importance of the Backup Catalog
As you can see, backup operations strongly rely on the backup catalog
If the catalog is lost, the backup software alone has no means to determine where to find a specific file backed up two months ago, for example
It can be reconstructed, but this usually means that all of the backup media (i.e. tapes) have to be read
It's a good practice to protect the catalog
By replicating the file system where it resides to a remote location
By backing it up
Some backup products have built-in mechanisms to protect their catalog (such as automatic backup)
As you can see, backup operations strongly rely on the backup catalog. If the catalog is lost, the
backup software alone has no means to determine where to find a specific file backed up in the
past. It can be reconstructed, but this usually means that all of the backup media (i.e. tapes) have
to be read. It's a good practice to protect the catalog by replicating the file system where it
resides to a remote location, and by backing it up. Some backup products have built-in
mechanisms to protect their catalog (such as automatic backup).
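The cost of losing the catalog can be sketched with a toy model. Here the catalog is a simple index from file path to the media holding each backed-up version; the tape labels, paths, and dates are invented for illustration. Rebuilding the index requires reading every piece of media, whereas a lookup in an intact catalog touches none.

```python
# Hypothetical model: each tape label maps to the (file path, backup date) pairs it holds.
tapes = {
    "TAPE01": [("/home/a.txt", "2006-01-05"), ("/home/b.txt", "2006-01-05")],
    "TAPE02": [("/home/a.txt", "2006-02-05")],
}


def rebuild_catalog(tapes):
    """Reconstruct the catalog the slow way: by reading every tape."""
    catalog = {}
    media_read = 0
    for label, entries in tapes.items():
        media_read += 1
        for path, date in entries:
            catalog.setdefault(path, []).append((date, label))
    return catalog, media_read


def locate(catalog, path, date):
    """Return the tape holding `path` as backed up on `date`, or None if unknown."""
    for backup_date, label in catalog.get(path, []):
        if backup_date == date:
            return label
    return None


catalog, media_read = rebuild_catalog(tapes)
tape_for_file = locate(catalog, "/home/a.txt", "2006-02-05")
```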
Lesson: Summary
Topics in this lesson included:
The features and functions of common Backup/Recovery applications.
The Backup/Recovery process management considerations.
The importance of the information found in Backup Reports and in the Backup Catalog.
This lesson provided an overview of Backup and Recovery management activities and tools.
Module Summary
Key points covered in this module:
The best practices for planning Backup and Recovery.
The common media and types of data that are part of a Backup and Recovery strategy.
The common Backup and Recovery topologies.
The Backup and Recovery Process.
Management considerations for Backup and Recovery.
These are the key points covered in this module. Please take a moment to review them.
Local Replication
After completing this module you will be able to:
Discuss replicas and the possible uses of replicas
Explain consistency considerations when replicating file systems and databases
Discuss host and array based replication technologies
Functionality
Differences
Considerations
Selecting the appropriate technology
In this section, we will look at what replication is, technologies used for creating local replicas,
and things that need to be considered when creating replicas.
What is Replication?
Replica - An exact copy (in all details)
Replication - The process of reproducing data
[Figure: Original -> REPLICATION -> Replica]
Local replication is a technique for ensuring Business Continuity by making exact copies of
data. With replication, data on the replica will be identical to the data on the original at the
point-in-time that the replica was created.
Examples:
Copy a specific file
Copy all the data used by a database application
Copy all the data in a UNIX Volume Group (including underlying logical volumes, file
systems, etc.)
Copy data on a storage array to a remote storage array
Possible Uses of Replicas
Alternate source for backup
Source for fast recovery
Decision support
Testing platform
Migration
Replicas can be used to address a number of Business Continuity functions:
Provide an alternate source for backup to alleviate the impact on production.
Provide a source for fast recovery to help achieve a lower RPO and RTO.
Decision Support activities such as reporting.
For example, a company may have a requirement to generate periodic reports. Running
the reports off of the replicas greatly reduces the burden placed on the production
volumes. Typically reports would need to be generated once a day or once a week, etc.
Developing and testing proposed changes to an application or an operating environment.
For example, the application can be run on an alternate server using the replica volumes
and any proposed design changes can be tested.
Data migration.
Migration can be as simple as moving applications from one server to the next, or as
complicated as migrating entire data centers from one location to another.
Considerations
What makes a replica good?
Recoverability
Considerations for resuming operations with primary
Consistency/re-startability
How is this achieved by various technologies
Kinds of Replicas
Point-in-Time (PIT) = finite RPO
Continuous = zero RPO
How does the choice of replication technology tie back into RPO/RTO?
Key factors to consider with replicas:
What makes a replica good:
Recoverability from a failure on the production volumes. The replication technology must allow for the restoration of data from the replicas to the production and then allow production to resume with a minimal RPO and RTO.
Consistency/re-startability is very important if data on the replicas will be accessed directly or if the replicas will be used for restore operations.
Replicas can either be Point-in-Time (PIT) or continuous:
Point-in-Time (PIT) - the data on the replica is an identical image of the production at some specific timestamp.
For example, a replica of a file system is created at 4:00 PM on Monday. This replica would then be referred to as the Monday 4:00 PM Point-in-Time copy.
Note: The RPO will be a finite value with any PIT. The RPO is the interval between the time the PIT was created and the time a failure occurred on the production. If there is a failure on the production at 8:00 PM and there is a 4:00 PM PIT available, the RPO would be 4 hours (8 - 4 = 4). To minimize RPO with PITs, take periodic PITs.
Continuous replica - the data on the replica is synchronized with the production data at all times.
The objective with any continuous replication is to reduce the RPO to zero.
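The RPO arithmetic in the note above can be worked through in a few lines. This is only a sketch: the function name `pit_rpo` and the calendar date are assumptions chosen for illustration, and the times reuse the 4:00 PM PIT and 8:00 PM failure from the example.

```python
from datetime import datetime


def pit_rpo(pit_times, failure_time):
    """RPO for a failure: time elapsed since the most recent PIT taken at or before it."""
    usable = [t for t in pit_times if t <= failure_time]
    if not usable:
        return None  # no PIT exists to recover from
    return failure_time - max(usable)


monday = datetime(2006, 1, 2)              # an arbitrary Monday
failure = monday.replace(hour=20)          # failure on production at 8:00 PM

# A single PIT taken at 4:00 PM, as in the text.
single_pit_rpo = pit_rpo([monday.replace(hour=16)], failure)

# Periodic PITs taken hourly from 4:00 PM to 7:00 PM shrink the RPO.
hourly_pits = [monday.replace(hour=h) for h in range(16, 20)]
periodic_pit_rpo = pit_rpo(hourly_pits, failure)
```

With the single 4:00 PM PIT the result is 4 hours; with hourly PITs the most recent usable copy is from 7:00 PM, so the RPO drops to 1 hour.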
Replication of File Systems
[Figure: host software stack showing Apps, DBMS, Mgmt Utilities, File System, Volume Management, Multi-pathing Software, Device Drivers, HBAs, Operating System, Buffer, and Physical Volume]
Most OS file systems buffer data in the host before the data is written to the disk on which the
file system resides.
For data consistency on the replica, the host buffers must be flushed prior to the creation of
the PIT. If the host buffers are not flushed, the data on the replica will not contain the
information that was buffered on the host, and some level of recovery will be necessary.
Note: If the file system is unmounted prior to the creation of the PIT no recovery would be
needed when accessing data on the replica.
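The effect of unflushed buffers can be imitated on a small scale. In the sketch below, a buffered Python file handle stands in for the host buffer and a plain file copy stands in for PIT creation. Both are analogies chosen for illustration, not how an array or volume manager actually creates a replica.

```python
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "source.dat")


def take_pit(name):
    """Stand-in for PIT creation: copy whatever has reached the file so far."""
    replica = os.path.join(workdir, name)
    shutil.copyfile(source, replica)
    with open(replica, "rb") as fh:
        return fh.read()


f = open(source, "wb")               # buffered handle plays the role of the host buffer
f.write(b"record 1\n")               # the write is still sitting in the buffer

unflushed = take_pit("pit_unflushed.dat")   # replica misses the buffered write

f.flush()                            # push the handle's buffer to the operating system
os.fsync(f.fileno())                 # ask the OS to push its own buffers to the device
flushed = take_pit("pit_flushed.dat")       # replica now contains the write

f.close()
shutil.rmtree(workdir)
```

The first copy comes back empty because the record never left the buffer; the second copy, taken after the flush, contains it.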
Replication of Database Applications
A database application may be spread out over numerous files, file systems, and devices, all of which must be replicated
Database replication can be offline or online
[Figure: database Data and Logs devices]
Database replication can be offline or online:
Offline replication takes place when the database and the application are shut down.
Online replication takes place when the database and the application are running.
Database: Understanding Consistency
Databases/Applications maintain integrity by following the Dependent Write I/O Principle
Dependent Write: A write I/O that will not be issued by an application until a prior related write I/O has completed
A logical dependency, not a time dependency
Inherent in all Database Management Systems (DBMS)
e.g. Page (data) write is a dependent write I/O based on a successful log write
Applications can also use this technology
Necessary for protection against local outages
Power failures create a dependent write consistent image
A Restart transforms the dependent write consistent image to a transactionally consistent one
i.e. Committed transactions will be recovered, in-flight transactions will be discarded
All logging database management systems use the concept of dependent write I/Os to maintain
integrity. An image in which no write is present without the prior writes it depends on is said to
be dependent write consistent. Dependent write consistency is required for protection against
local power outages and the loss of local channel connectivity or storage devices. The logical
dependency between I/Os is built into database management systems, certain applications, and
operating systems.
Database Replication: Transactions
[Figure: a Database Application writes steps 1-4 through a Buffer to the Data and Log devices]
Database applications require that, for a transaction to be deemed complete, a series of writes
must occur in a particular order (Dependent Write I/O). These writes are recorded on the
various devices/file systems.
In this example, steps 1-4 must complete for the transaction to be deemed complete.
Step 4 is dependent on Step 3 and will occur only if Step 3 is complete
Step 3 is dependent on Step 2 and will occur only if Step 2 is complete
Step 2 is dependent on Step 1 and will occur only if Step 1 is complete
Steps 1-4 are written to the database's buffer and then to the physical disks.
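The ordering rule for steps 1-4 can be sketched as a loop that issues a write only after the previous one has been acknowledged. The `write` callbacks are hypothetical stand-ins for real device I/O and simply report whether a step completed.

```python
def run_transaction(write, steps=(1, 2, 3, 4)):
    """Issue each step only after the previous step has completed."""
    completed = []
    for step in steps:
        if not write(step):      # wait for the acknowledgement of this write
            break                # never issue a later step on top of a failed one
        completed.append(step)
    return completed


def healthy_write(step):
    """Hypothetical device that acknowledges every write."""
    return True


def power_fails_at_step_3(step):
    """Hypothetical outage: steps 1 and 2 complete, step 3 never does."""
    return step < 3


full = run_transaction(healthy_write)
partial = run_transaction(power_fails_at_step_3)
```

Because a later step is never issued before an earlier one completes, an interruption can only leave an unbroken leading run of steps on disk (here steps 1 and 2), never step 4 without step 3.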
Database Replication: Consistency
[Figure: Source and Replica Data and Log devices; steps 1-4 are present on both, so the replica is Consistent]
Note: In this example, the database is online.
At the point in time when the replica is created, all the writes to the source devices must be
captured on the replica devices to ensure data consistency on the replica.
In this example, steps 1-4 on the source devices must be captured on the replica devices for
the data on the replicas to be consistent.
Database Replication: Consistency
[Figure: Source Data and Log devices hold steps 1-4; the Replica holds only steps 3 and 4, so it is Inconsistent]
Note: In this example, the database is online.
Creating a PIT for multiple devices happens quickly, but not instantaneously.
Steps 1-4, which are dependent write I/Os, have occurred and have been recorded successfully
on the source devices.
It is possible that steps 3 and 4 were copied to the replica devices, while steps 1 and 2 were
not copied.
In this case, the data on the replica is inconsistent with the data on the source. If a restart
were to be performed on the replica devices, Step 4, which is available on the replica, might
indicate that a particular transaction is complete, but not all of the data associated with the
transaction would be available on the replica, making the replica inconsistent.
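One way to picture the check is as a test on which steps of the dependency chain reached the replica. The helper below is a hypothetical sketch that assumes a single chain numbered from 1. It treats a replica as dependent write consistent only if no step is present without every earlier step.

```python
def is_dependent_write_consistent(captured_steps):
    """True if no step is present without every earlier step in the chain."""
    captured = set(captured_steps)
    return captured == set(range(1, len(captured) + 1))


source = [1, 2, 3, 4]
good_replica = [1, 2, 3, 4]   # all four steps captured
bad_replica = [3, 4]          # steps 1 and 2 were missed, as in the example above

source_ok = is_dependent_write_consistent(source)
good_ok = is_dependent_write_consistent(good_replica)
bad_ok = is_dependent_write_consistent(bad_replica)
```

Note that a replica holding only steps 1 and 2 would also pass this check: the transaction is incomplete, but a restart can discard it cleanly, which is the same state a power failure would leave behind.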
Database Replication: Ensuring Consistency
Off-line Replication
If the database is offline or shut down and then a replica is created, the replica will be consistent
In many cases, creating an offline replica may not be viable due to the 24x7 nature of business
[Figure: Database Application (Offline); Source and Replica Data and Log devices; the replica is Consistent]
Database replication can be performed with the application offline (i.e., application is shut down,
no I/O activity) or online (i.e., while the application is up and running). If the application is
offline, the replica will be consistent because there is no activity. However, consistency is an
issue if the database application is replicated while it is up and running.
Database Replication: Ensuring Consistency
Online Replication
Some database applications allow replication while the application is up and running
The production database would have to be put in a state which would allow it to be replicated while it is active
Some level of recovery must be performed on the replica to make the replica consistent
[Figure: Source Data and Log devices hold steps 1-4; the Replica holds only steps 3 and 4, so it is Inconsistent]
In the situation shown, Steps 1-4 are dependent write I/Os. The replica is inconsistent because
Steps 1 & 2 never made it to the replica. To make the database consistent, some level of
recovery would have to be performed. In this example, this could be done by simply discarding
the transaction that was represented by Steps 1-4. Many databases are capable of performing
such recovery tasks.
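The "discard the incomplete transaction" recovery can be sketched as a filter over the writes found on the replica. The transaction identifiers and the `(transaction, step)` record layout are assumptions made for illustration; real databases perform this through their own log-based restart procedures.

```python
def recover(replica_writes, steps_per_txn=4):
    """Keep only writes that belong to transactions whose steps are all present."""
    steps_seen = {}
    for txn, step in replica_writes:
        steps_seen.setdefault(txn, set()).add(step)
    required = set(range(1, steps_per_txn + 1))
    complete = {txn for txn, steps in steps_seen.items() if steps == required}
    return [(txn, step) for txn, step in replica_writes if txn in complete]


# Hypothetical replica contents: T1 arrived in full, T2 is missing steps 1 and 2.
replica_writes = [
    ("T1", 1), ("T1", 2), ("T1", 3), ("T1", 4),
    ("T2", 3), ("T2", 4),
]
recovered = recover(replica_writes)
```

After recovery only the four writes of T1 remain, so the replica no longer claims that T2 completed.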
Database Replication: Ensuring Consistency
[Figure: Source and Replica devices both hold steps 1-4, so the replica is Consistent; a fifth write (5) is also shown]
An alternative way to ensure that an online replica is consistent is to:
Hold I/O to all the devices at the same instant.
Create the replica.
Release the I/O.
Holding I/O is similar to a power failure and most databases have the ability to restart from a
power failure.
Note: Holding I/O to all the devices simultaneously ensures that the data on the replica is identical
to that on the source devices, but the database application will time out if I/O is held for too long.
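The hold, create, release sequence above can be sketched with a lock that every write must pass through. The `IOGate` class and the in-memory "devices" are hypothetical. In practice the hold is implemented by the replication technology, not by application code.

```python
import threading


class IOGate:
    """Hypothetical gate through which every write to the device set passes."""

    def __init__(self, device_names):
        self.devices = {name: [] for name in device_names}
        self._hold = threading.Lock()

    def write(self, device, value):
        with self._hold:                 # blocks while I/O is being held
            self.devices[device].append(value)

    def create_replica(self):
        with self._hold:                 # 1. hold I/O to all devices at the same instant
            replica = {n: list(b) for n, b in self.devices.items()}  # 2. create the replica
        return replica                   # 3. I/O was released when the block exited


def transactions(gate, count):
    for i in range(count):
        gate.write("log", i)             # the log write must complete first
        gate.write("data", i)            # dependent write, issued only afterwards


gate = IOGate(["log", "data"])
writer = threading.Thread(target=transactions, args=(gate, 2000))
writer.start()
replicas = [gate.create_replica() for _ in range(50)]   # taken while writes are in flight
writer.join()
```

Every replica taken this way contains a data write only if its log write is also present. The hold lasts only as long as the copy inside the lock, which mirrors the concern in the note that I/O must not be held for too long.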
Tracking Changes After PIT Creation
At P