1© Copyright 2013 EMC Corporation. All rights reserved.
BUSINESS CONTINUITY AND
DISASTER RECOVERY
STRATEGIES
Giulio Brenna
Systems Engineers Manager
Specialists Team
Workshop Objectives
Explain Why Customers need a BC/DR Strategy
Explain Capabilities, Complexity, and Choice
Understand BC and DR from a technological standpoint
Describe the main EMC Solutions for BC/DR
BC & DR Drivers
Why You Should Care
A study from research firm Frost & Sullivan estimates that North American Business Continuity and Disaster Recovery spending will reach $23.3 billion by 2012. That is up more than 50 percent from $15.1 billion in 2006.
“We are seeing increased concern from small and mid-sized enterprises about how they protect their data,”
October 2009
Business Continuity and Disaster Recovery Decision Drivers
PRIMARY DECISION DRIVERS
Business Considerations
• Cost
• Functionality, Availability
• Recovery-Time Objectives
• Recovery-Point Objectives
Technical Considerations
• Consistency and Recovery
• Capacity
• Bandwidth
• Performance
The Cost of Downtime Per Hour By Industry
[Figure: bar chart of downtime cost per hour for Investments, Retail, Insurance, Telecom, Banking, Transportation, and Manufacturing; cost axis from $0 to $400,000.]
Source: AMR Research
The Impact of Business Continuity
Productivity Impact
• Employees affected
• Email
• Systems
Brand Impact
• Customers
• Suppliers
• Financial markets
• Banks
• Business partners
• The Media
Financial Impact
• Revenue recognition
• Cash flow
Revenue Impact
• Direct + Indirect losses
• Compensatory payments
• Lost future revenue
Business Continuity Considerations
• What are your company’s most critical processes and data needs?
• How much data can you afford to lose?
• How quickly do you need to restore your critical processes?
• How vulnerable are your operations to disasters?
Events that Impact Information Availability
Scheduled events/competing workloads: 85%
Examples?
Unscheduled events/failures: 15%
Examples?
Events that require a data center move: <1% of occurrences
Examples?
Events that Impact Information Availability
Scheduled events/competing workloads: 85%
• Maintenance and migrations
• Backup and restore
• Batch processing
• Reporting
• Data warehouse extract, build, and load
Unscheduled events/failures: 15%
• Server failure
• Application failure
• Network / storage failure
• Processing or operator error
Events that require a data center move: <1% of occurrences
• Disaster events
– Fire, flood, storms, etc.
• Data center move or relocation
• Workload relocation
Terminology
A Key Differentiation
Understand the difference between Disaster Recovery (DR) and Business Continuity (BC)
• Disaster Recovery: Restoring IT operations following a site failure
• Business Continuity: Reducing or eliminating application downtime
• Disaster Avoidance: The ability to predict a disaster and take action to avoid its impact
The Evolution of Availability
• Disaster Recovery: Tape Backup and Offsite Rotation
• Advanced Recovery: Replication to Second Site
• High Availability: In-data-center Application Restart
• Continuous Availability: Application continues with no disruption (Zero Downtime)
[Diagram labels: Traditional, Convergence]
Protecting Information is a Business Decision
• Recovery-point objective (RPO): How much data can you afford to lose? Can you determine a sync point?
• Recovery-time objective (RTO): How long can you afford to idle your business and survive?
• Fast recovery times enable continuous business operations
• Slow recovery times, or data loss, translate into business recovery
[Diagram: timeline showing the Recovery Point Objective and the Recovery Time Objective, with the labels Data Lost, Notification, Plan Activation, Failover, System Restart, and Service Availability, plus Technology and Plan and procedures.]
Balancing Business Requirements and Cost
• Recovery Point Objective (RPO): The point in time to which critical data must be restored following an interruption before its loss severely impacts the organization
• Recovery Time Objective (RTO): The maximum acceptable length of time that can elapse following an interruption to the operations of a business function before its absence severely impacts the organization
[Figure: cost ($) curves plotted against time around the interruption at Time = 0, with RPO and RTO marked. Curves: Cost of Data Availability, Cost of Data Loss, Cost of System Availability, Cost of System Downtime. Technology labels: Backup, Replication, Automation. Application classes: Critical Application, Business Application.]
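As a worked illustration of the two definitions on this slide, the following Python sketch measures the data-loss window and the downtime of a single incident and compares them with an RPO and an RTO. It is a minimal sketch: the `Incident` class, the function names, and all timestamps are hypothetical and are not taken from the workshop material.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    """Hypothetical record of one interruption (all fields are illustrative)."""
    last_recoverable_copy: datetime  # newest copy of the data that can be restored
    interruption: datetime           # moment the business function stopped (Time = 0)
    service_restored: datetime       # moment the business function was available again


def data_loss_window(incident: Incident) -> timedelta:
    """Data written in this window is lost; compare it with the RPO."""
    return incident.interruption - incident.last_recoverable_copy


def downtime(incident: Incident) -> timedelta:
    """Time the business function was unavailable; compare it with the RTO."""
    return incident.service_restored - incident.interruption


def meets_objectives(incident: Incident, rpo: timedelta, rto: timedelta) -> bool:
    """True when both the data loss and the downtime stay within the objectives."""
    return data_loss_window(incident) <= rpo and downtime(incident) <= rto


# Hypothetical example: nightly backup at 02:00, failure at 14:30, service back at 18:30.
incident = Incident(
    last_recoverable_copy=datetime(2013, 5, 6, 2, 0),
    interruption=datetime(2013, 5, 6, 14, 30),
    service_restored=datetime(2013, 5, 6, 18, 30),
)
print(data_loss_window(incident), downtime(incident))
print(meets_objectives(incident, rpo=timedelta(hours=24), rto=timedelta(hours=8)))
print(meets_objectives(incident, rpo=timedelta(hours=4), rto=timedelta(hours=8)))
```

In this made-up case a nightly backup satisfies a 24-hour RPO but not a 4-hour one, which is the kind of gap that moves an application from backup toward replication.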
Basic Replica Typologies
[Figure: replication technologies positioned by Costs against Recovery Point Objective (axis ticks: 48 h, 24 h, 12 h, 8 h, 4 h, 2 h, minutes, zero data loss). Technologies shown: Traditional Backup, Electronic Vaulting, Remote Journaling, Stand-By Database, Mirroring, Semi-Synchronous Mirroring, Synchronous Mirroring, and Access Anywhere (NEW, Active-Active access).]
Captions:
• Classic backup with physical tape transportation
• Remote replication of data based on time intervals or events
• Remote replication of transactions, with different levels of possible data loss
• Synchronous replication with zero data loss
Replica protocols
Latency
• Latency in dark fiber is ~5 ns/m, or 5 µs/km (one 10 km link can have 50 µs of latency)
• Worse, the round-trip time (RTT) on that link can be 100 µs
• Latency over SONET/SDH is higher
• Latency over IP networks is generally much higher
• Latency directly impacts application performance:
– Increased idle time while the application is waiting for read data
– Increased idle time while the application is waiting for write acknowledgement
– Fewer I/Os per second (IOPS)
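The arithmetic on this slide can be sketched in a few lines of Python. The 5 µs/km figure comes from the slide; everything else is an assumption made for illustration: the function names, the 400 µs local service time, and the simplification that a single stream of writes waits one full round trip per write before issuing the next one (real replication protocols may need more than one round trip and usually run many I/Os in parallel).

```python
FIBER_DELAY_US_PER_KM = 5.0  # from the slide: ~5 ns/m, i.e. 5 us/km in dark fiber


def one_way_latency_us(distance_km: float) -> float:
    """Propagation delay in one direction over dark fiber."""
    return distance_km * FIBER_DELAY_US_PER_KM


def round_trip_us(distance_km: float) -> float:
    """Round-trip time (RTT): the write goes out and the acknowledgement comes back."""
    return 2 * one_way_latency_us(distance_km)


def max_serial_write_iops(distance_km: float,
                          local_service_time_us: float,
                          round_trips_per_write: int = 1) -> float:
    """Upper bound on IOPS for ONE stream of strictly serial synchronous writes.

    Assumption: each write costs the local service time plus the given number
    of round trips to the remote site before the host sees the acknowledgement.
    """
    per_write_us = local_service_time_us + round_trips_per_write * round_trip_us(distance_km)
    return 1_000_000 / per_write_us


# Hypothetical local service time of 400 us per write.
for km in (0, 10, 100):
    print(km, "km:", round_trip_us(km), "us RTT,",
          round(max_serial_write_iops(km, 400.0)), "serial write IOPS at most")
```

Under these assumptions the same write stream drops from 2500 IOPS locally to roughly 714 IOPS at 100 km, which is why synchronous replication is normally limited to short distances.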
Backup Evolution Over Time
From Tape to Disk to Deduplication
• Traditional (Tape Centric): Backup Applications: backup software; Onsite Backup Storage: tape; DR Storage: tape
• Disk Centric: Backup Applications: backup software; Onsite Backup Storage: VTL; DR Storage: VTL/tape
• Transformational: deduplication backup software and system
Decrease: backup failures, recovery time, storage cost, complexity
Deduplication is Accelerating the Transition
• More Efficient
• Reduced Storage
• Less Bandwidth
EMC Backup and Recovery Solutions
• Avamar
• Disk Library for Mainframe
• NetWorker
• Data Protection Advisor
• Data Domain
Symmetrix Remote Data Facility (SRDF) Family
Industry-leading remote replication
• Protects against local and regional disruptions
• Increases application availability by reducing downtime
• Minimizes/eliminates performance impact on applications and hosts
• Independent of hosts and operating systems, applications, and databases
• Improves recovery point objectives (RPOs) and recovery time objectives (RTOs) with automated restart solutions
• Mission-critical proven with numerous testimonials and references
• Tens of thousands of licenses shipped
EMC offers choice and flexibility to meet any service level requirement
SRDF Family
• SRDF/S: Synchronous for zero data exposure
• SRDF/A: Asynchronous for extended distances
• SRDF/DM: Efficient Symmetrix-to-Symmetrix data mobility
• SRDF/Star: Multi-site replication option
• SRDF/AR: Automated Replication option
• SRDF/CE: Cluster Enabler option
• SRDF/CG: Consistency Groups
• Concurrent SRDF
• Cascaded SRDF and SRDF/EDP: Extended Distance Protection
SRDF Synch
Provides disaster restart of remotely replicated devices and can be used for offsite backup operations using EMC TimeFinder
[Diagram: Primary and Secondary sites connected by SRDF links. Labels: Production virtual machines, Test and development virtual machines, TimeFinder copies at both sites, and Boot and Data devices marked R1 or R2.]
SRDF/A Delta Set Push Operation
• SRDF/A write I/O cycle number assigned as part of capture cycle (N)
• SRDF/A write I/O acknowledged back to host as local write operation
• SRDF/A write I/O cycle number is part of transmit/receive cycle (N–1)
• SRDF/A write I/O acknowledged from target and removed from transmit cycle (N–1) on source
• Capture to transmit cycle switch initiated based on cycle switch time interval setting with N–1 and N–2 cycles completed
[Diagram: the Source holds the capture cycle (N) and the transmit cycle (N–1); the WAN carries the data to the Target, which holds the receive cycle (N–1) and the apply cycle (N–2) feeding R2. Numbered callouts 1 to 4 mark the steps.]
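The cycle mechanics described on this slide can be illustrated with a toy model. This is only a sketch of the general delta-set idea and not EMC's implementation: the class and method names are hypothetical, the cycle switch is triggered by an explicit call instead of a timer, and the received set is applied to the target (R2) at the moment of the switch, where the slide shows a separate apply cycle (N–2).

```python
class DeltaSetReplicator:
    """Toy model of cycle-based asynchronous replication (illustrative only)."""

    def __init__(self):
        self.cycle = 1        # number of the current capture cycle (N)
        self.capture = {}     # source: writes collected in cycle N
        self.transmit = {}    # source: cycle N-1, being sent over the WAN
        self.receive = {}     # target: cycle N-1, being received
        self.r2 = {}          # target device: only complete cycles are applied

    def write(self, block, data):
        """Host write: stored in the capture set and acknowledged like a local write."""
        self.capture[block] = data  # a later write to the same block replaces the earlier one
        return "ack"

    def transfer(self):
        """Send the transmit set; each acknowledged block is removed from the source."""
        for block in list(self.transmit):
            self.receive[block] = self.transmit.pop(block)

    def cycle_switch(self):
        """Switch cycles only when the previous cycle has been fully transmitted."""
        if self.transmit:
            return False                # N-1 is still in flight, keep capturing in N
        self.r2.update(self.receive)    # the fully received cycle becomes the target image
        self.receive = {}
        self.transmit, self.capture = self.capture, {}
        self.cycle += 1
        return True


rep = DeltaSetReplicator()
rep.write(1, "a")
rep.write(2, "b")
rep.write(1, "c")          # overwrites block 1 within the same cycle
rep.cycle_switch()         # capture set becomes the transmit set
rep.transfer()             # cycle crosses the WAN
rep.cycle_switch()         # received cycle is applied to R2 as one unit
print(rep.r2)
```

Because the target only ever applies whole cycles, its image always corresponds to a cycle boundary on the source, at the price of lagging the source by the cycles still in flight.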
SRDF Advanced Three-Site SRDF/Star Solution
• Extended intersite link outage occurs between Sites B and C
• Reconfigure dynamic SRDF devices at Site A to Concurrent SRDF mode and start SRDF/A session from Site A to C
[Diagram: Site A (Workload Site, R11), Site B (Local or Regional Site, R2), and Site C (Out-of-Region Site, R2), with links labeled SRDF/S and SRDF/A.]
EMC RECOVERPOINT FAMILY
One way to protect everything better
What Is EMC RecoverPoint?
• Protects any physical or virtual host, application, or storage
• Provides affordable data protection
• Uses a DVR-like point-in-time recovery
• Supports policy-based synchronous and asynchronous replication
• Supports Block and NAS
One way to protect everything better
RecoverPoint family
• CRR: remote protection
• CLR: concurrent local and remote protection
• CDP: local protection
Any Point-in-Time Recovery
• Recover to any point in time
• Annotate selected recovery points with bookmarks
• Continue replication during recovery
• Use recovered image for a variety of purposes
RecoverPoint for continuous protection
• Daily Backup: Recovery point is once every 24 hours
• Snapshots: Recovery point is once every 8 hours
• Disk Mirroring: Recovery point is latest image replicated
• Continuous Protection: Recovery to any point in time
UNLIMITED RECOVERY POINTS, APPLICATION BOOKMARKS
[Timeline: bookmarks labeled Checkpoint, Pre-Patch, Patch, Post-Patch, Cache Flush, and Hot Backup along a Time axis.]
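The idea of recovering to any point in time, with named bookmarks, can be sketched as a journal of timestamped writes that is replayed up to the chosen moment. This is a minimal illustration of the concept and makes no claim about how RecoverPoint is built: the `Journal` class, its methods, and the sample data are all hypothetical.

```python
import bisect


class Journal:
    """Toy write journal that can rebuild the volume image at any timestamp."""

    def __init__(self):
        self._times = []      # write timestamps, in ascending order
        self._writes = []     # (block, data) pairs, parallel to _times
        self._bookmarks = {}  # bookmark name -> timestamp

    def record(self, timestamp, block, data):
        """Append one write; the journal must stay ordered by time."""
        if self._times and timestamp < self._times[-1]:
            raise ValueError("writes must be recorded in time order")
        self._times.append(timestamp)
        self._writes.append((block, data))

    def bookmark(self, name, timestamp):
        """Attach a readable name (for example 'pre-patch') to a point in time."""
        self._bookmarks[name] = timestamp

    def image_at(self, timestamp):
        """Replay every write made at or before the timestamp."""
        count = bisect.bisect_right(self._times, timestamp)
        image = {}
        for block, data in self._writes[:count]:
            image[block] = data
        return image

    def image_at_bookmark(self, name):
        return self.image_at(self._bookmarks[name])


journal = Journal()
journal.record(10, "A", "v1")
journal.record(20, "B", "v1")
journal.bookmark("pre-patch", 25)
journal.record(30, "A", "v2-bad")             # a faulty patch overwrites block A
print(journal.image_at_bookmark("pre-patch"))  # image from just before the patch
```

A daily backup or an 8-hour snapshot offers only a handful of such images; a journal keeps every write, so any timestamp or bookmark is a valid recovery point as long as the journal reaches back that far.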
Mobility. Availability. Collaboration.
The Unique Value of VPLEX
Access Anywhere
Access the SAME information…
in SEPARATE locations…
all at the SAME time…
Active-Passive Data Access
Before VPLEX
[Diagram: Site A and Site B linked by SYNCHRONOUS/ASYNCHRONOUS REPLICATION; labeled Active-Passive Site.]
• Data on disaster recovery site is used on failure
• Outage to move applications
Federated Data Access
With VPLEX
VPLEX Metro or VPLEX Geo
[Diagram: Site A and Site B, each labeled TRANSFER PROTOCOL and DISTRIBUTED VIRTUAL VOLUME; labeled Active-Active Site.]
• VPLEX enables active use of resources at two sites
The VPLEX Family of Products
• Local: Within a data center
• Metro: AccessAnywhere at synchronous distances
• Geo: AccessAnywhere at asynchronous distances (NEW)
Application Consistency
Which type of consistency will be required for the applications that you’re going to protect?
Crash Consistency
This is the equivalent of pulling the power from a server while the applications are running, and then powering the server up again. Replication solutions that have limited knowledge of the applications are easier to put together, but during recovery you rely on the application’s ability to start up on its own merits, possibly with some intervention. Following a failover, the data will not be transactionally consistent if transactions were in flight at the time of the failure. In most cases, once the application or database is restarted, the incomplete transactions are identified and their updates are “backed out”; in other cases, extra procedures or tools may be required.
Application Consistency
There are ways of ensuring that, when a copy is taken or a system is shut down, all necessary transactions within a database are complete and caches are flushed in order to maintain consistency. Scripts can be written, following best practice for each application, to ensure that processes take place in a certain order, or dedicated software can automate these procedures for each application. Some technologies use application-specific agents. The choice again comes down to the importance of the data, the RPOs and RTOs, and the available budget within the organisation.
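The difference between the two kinds of copy can be sketched with a toy database that keeps recent writes in a cache before flushing them to disk. Everything here is hypothetical and simplified for illustration (the `ToyDatabase` class, its methods, and the order keys); a real application would be quiesced through its own scripts or agents, as described above.

```python
import copy
from contextlib import contextmanager


class ToyDatabase:
    """Toy store with an on-disk image and a volatile write cache."""

    def __init__(self):
        self.disk = {}                # what a storage-level copy can see
        self.cache = {}               # accepted writes not yet on disk
        self.accepting_writes = True

    def write(self, key, value):
        if not self.accepting_writes:
            raise RuntimeError("database is quiesced")
        self.cache[key] = value

    def flush(self):
        """Move every cached write to disk."""
        self.disk.update(self.cache)
        self.cache.clear()

    @contextmanager
    def quiesced(self):
        """Stop new writes, flush the cache, then resume when the copy is done."""
        self.accepting_writes = False
        try:
            self.flush()
            yield
        finally:
            self.accepting_writes = True


def crash_consistent_copy(db):
    """Copy the disk as it is right now, like pulling the power: cached writes are missing."""
    return copy.deepcopy(db.disk)


def application_consistent_copy(db):
    """Quiesce and flush first, so the copy contains every accepted write."""
    with db.quiesced():
        return copy.deepcopy(db.disk)


db = ToyDatabase()
db.write("order-1", "paid")
db.flush()
db.write("order-2", "paid")             # still only in the cache
print(crash_consistent_copy(db))        # order-2 is not in this copy
print(application_consistent_copy(db))  # both orders are in this copy
```

The crash-consistent copy is cheaper to take but leaves the recovery work to the application at restart; the application-consistent copy needs cooperation from the application and a brief pause in writes.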