
Welcome To

Disaster Recovery 2.0

Presented by Bob [email protected]

Food for Thought

“To go forward, you must backup.”

-Cardinal rule of computing

“If it wasn’t backed-up, then it wasn’t important.”

-The Sysadmin’s motto

Agenda

• Realities of Disaster Recovery Today

• Industry Trends / Statistics

• Disaster Recovery Components

• External Waves

- Financial Tsunami

- Going Green

Questions Posed

• Disk based backup versus tape based. Do I still need tape?

• Replication options: host, network, application specific and array based solutions. Which is the right approach for my organization?

• How can snapshots be used to improve recovery times?

• Why is virtualization having such a big impact on DR strategies?

• What infrastructure systems need to be “self healing” to ensure application recovery?

• How do WAN accelerators fit into the DR strategy? Do they work in all scenarios (live replication versus replicating backups offsite)?

DR Realities

• Production environments have grown more complex with increasing availability requirements

- 24 X 7 “always on” Internet apps

- Work from home has increased availability requirements for all apps

- The days of kicking off the backup job at 6:00 PM and sending the tapes offsite the next morning are going or gone!

- Manual business recovery workarounds are just a fond memory

DR Realities

• Has your DR capability kept current?

- Are you trapped in a tape based recovery strategy?

- Are critical applications missing from DR strategy?

• Does your test program validate recovery objectives?

- Can you hit the defined RTOs?

- Time for declaration, travel etc. included? Are they realistic?

- Good mix of exercises across all critical applications?
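One way to sanity-check the questions above is to add up every phase of a recovery, not only the technical restore, and compare the total with the RTO. The sketch below is illustrative only: the class, the phase names, and every duration are hypothetical and do not come from the presentation.

```python
from dataclasses import dataclass


@dataclass
class RecoveryExercise:
    """Timings (in hours) observed or estimated for one application's recovery."""
    app: str
    rto_hours: float          # recovery time objective agreed with the business
    declaration_hours: float  # time to assess the event and declare a disaster
    travel_hours: float       # time for staff (or tapes) to reach the recovery site
    restore_hours: float      # technical restore and validation

    def elapsed_hours(self) -> float:
        # Total outage as the business experiences it, not just the restore.
        return self.declaration_hours + self.travel_hours + self.restore_hours

    def meets_rto(self) -> bool:
        return self.elapsed_hours() <= self.rto_hours


# Hypothetical example: the restore alone fits in 24 hours, the full timeline does not.
exercise = RecoveryExercise("order-entry", rto_hours=24, declaration_hours=4,
                            travel_hours=6, restore_hours=18)
print(exercise.elapsed_hours(), exercise.meets_rto())
```

Tracking the phases separately makes it easier to see whether a missed RTO is a technology problem or a declaration and logistics problem.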

DR Proclamations

1. Tape is dead – Long live Disk!

• Not quite dead yet but only disk for critical DR

o Tape may be only option for single site and SMBs

o Still needed for long term retention (archive) requirements

o Also good for less critical systems with RTO in days/weeks

2. Infrastructure needs to be resilient

• “Recover AD” better not be Step One of DR plan

• Has the tape catalog been recovered at the DR site?

3. Work from Home has/will be the dominant work area recovery platform

The Need for Resiliency

[Figure: the shift from Disaster Recovery (THEN) to Operational Resiliency (NOW)]

                 THEN                            NOW
Outage Duration  Long-Term                       Immediate
Environment      SPOF                            Redundant
Data Currency    Time of Last Backup             No Data Loss
Service Level    Degraded Performance            Near Equivalent
Alternate Site   Shared                          Dedicated
Budget           Cost Containment                Return on Investment
Data Base        Full Recovery & Roll Forward    Restartable

…A New Paradigm Has Emerged

Disaster Recovery Components

DR 2.0 Planning

• It is the maintenance that kills you

- How manual is your update process?

• Do you have a tool or database in place?

- Is there a current application / server map?

o All applications or specific to certain data centers? Just “in scope” or just DR?

DR 2.0 Testing

• Is there a documented Test schedule? Enforced?

- Different examples of two tests per year

• [1] mainframe & [1] distributed application

• [1] big (almost all) & [1] little (NetBackup, one DB)

• [1] 50% of apps & [1] the other 50%

- Another approach

- All apps with defined DR solution are tested every 18 months

• Plans updated/enhanced as a result of testing?

• Are walkthroughs done effectively?

DR 2.0 Data Center Availability

                                   Tier One   Tier Two   Tier Three                Tier Four
Building Type                      Tenant     Tenant     Stand-Alone               Stand-Alone
Delivery path (Power)              One        One        One Active, One Passive   Two Active
Delivery path (Cooling)            One        One        One Active, One Passive   Two Active
Redundant Components               N          N+1        N+1                       2(N+1) or S + S
Concurrently Maintainable          No         No         Yes                       Yes
Site Availability                  99.67%     99.75%     99.98%                    100.00%
Hours of IT downtime due to site   28.8 hrs   22.0 hrs   1.6 hrs                   .8 hrs

The Uptime Institute, Inc. has developed a four Tier classification approach to facility infrastructure functionality that addresses the need for a common benchmarking standard. Availability considerations on site infrastructure should use this standard. www.uptimeinstitute.org/
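The relationship between a site availability percentage and hours of downtime can be sketched as simple arithmetic. The example below assumes the downtime figures are annual and uses a 8,760-hour year; the function name is hypothetical. Because the percentages on the slide are rounded, the results land close to, but not exactly on, the downtime hours quoted in the table.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def annual_downtime_hours(availability_percent: float) -> float:
    """Expected hours of downtime per year for a given availability percentage."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


# Availability figures as printed on the slide (rounded to two decimals).
for tier, availability in [("Tier One", 99.67), ("Tier Two", 99.75), ("Tier Three", 99.98)]:
    print(f"{tier}: {annual_downtime_hours(availability):.1f} hrs/year")
```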

DR 2.0

• Migration from Hot Site vendors to Internal

- Coincides with migration to DR replication solutions

- Co-Lo facilities as part of Internal strategy

• Outsourcing getting multi-faceted

- Infrastructure support

- Software as a Service (SaaS)

- Cloud Computing

DR 2.0 Cloud Computing

• It started with SaaS

- Salesforce.com / LDRPS

• Cloud vendors now looking to provide scalability & performance flexibility

• Vendors – Web 2.0 (Google, Amazon) – Traditional IT (IBM, MS)

DR 2.0 Cloud Computing

Cloud – DR Considerations

• Risk Profile has changed

- Risk is diversified by moving app/data off corporate network

• Negotiate/define DR SLAs in contract

• Consider compliance requirements

- Where is my data stored?

DR 2.0 WAN Accelerators

• Who are the players?

- Riverbed has mindshare – Also Cisco, Bluecoat, Juniper

• Types of environments

- Branch to Data Center vs. Data Center to Data Center

- Application Specific WAN Optimizers

• Common apps are CIFS, HTTP

• Also Oracle, SQL

• Drivers to implementation

- Data Center consolidation

- Software as a Service (SaaS)

- Branch environments

DR 2.0 Case Study

International Law Firm

Critical Applications

- MS Exchange

- Files / Doc Mgmt

- Voice

DR 2.0 Case Study

WAN Accelerators

- Eliminated resynch bottleneck

- Kept bandwidth requirements down

- Reduced data loss (RPO)

- Improved Recovery Time (RTO)
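A rough way to see why data reduction eases a resynch bottleneck is to estimate transfer time with and without it. This is a back-of-the-envelope sketch, not a model of any vendor's product: the function name, the 500 GB data set, the 45 Mbps link, and the 5x reduction ratio are all hypothetical, and protocol overhead and latency are ignored.

```python
def transfer_hours(data_gb: float, link_mbps: float, reduction_ratio: float = 1.0) -> float:
    """Hours to move data_gb over a link, given a data-reduction ratio (1.0 = none)."""
    megabits = data_gb * 8 * 1000            # decimal units: 1 GB = 8,000 megabits
    effective_mbps = link_mbps * reduction_ratio
    return megabits / effective_mbps / 3600  # seconds to hours


# Hypothetical resynch of 500 GB over a 45 Mbps link.
baseline = transfer_hours(500, 45)
optimized = transfer_hours(500, 45, reduction_ratio=5.0)
print(f"without optimization: {baseline:.1f} h, with 5x reduction: {optimized:.1f} h")
```

The same arithmetic can be run in reverse to estimate how much bandwidth a target resynch window would require.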

DR 2.0

• Work from Home Deployment

- Supplement / replace fixed workarea recovery

• Major Deployments

- VPN

- Citrix

- Remote Desktop Protocol

DR 2.0 Recovery Tier Chart

Tier  Criticality       RTO               RPO                   Investment        Application Tier Examples
0     Self-Healing      Immediate         PoF (1)               Very High         DNS, LDAP, Active Directory, Authentication
1     Mission Critical  < 24 hrs          PoF or Intra-Day (2)  High              To be determined by BIA activity
2     Critical          24 to 48 hrs      Intra-Day or LC (3)   Moderate to High  To be determined by BIA activity
3     Required          48 to 96 hrs      LC                    Moderate          To be determined by BIA activity
4     Non-Critical      96 hrs - 1 Week   LC                    Low               To be determined by BIA activity
5     Deferrable        As time allows    LC                    $'s ATOD*         To be determined by BIA activity

(1) Point-of-Failure – data protected to the time of disaster

(2) Intra-Day – data protected periodically during the business day

(3) Last Capture – data protected via tape, which is typically last night's backup cycle

* At Time of Disaster
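The RTO bands in the chart (Immediate, under 24 hrs, 24 to 48 hrs, 48 to 96 hrs, 96 hrs to 1 week, as time allows) can be sketched as a simple lookup from a required RTO to a tier number. The function name is hypothetical, and the handling of values that fall exactly on a boundary (48 and 96 hours) is an assumption, since the chart does not say which tier they belong to.

```python
def recovery_tier(rto_hours: float) -> int:
    """Map a required RTO in hours to a tier number from the chart."""
    if rto_hours <= 0:
        return 0          # Immediate: self-healing infrastructure
    if rto_hours < 24:
        return 1          # under 24 hrs
    if rto_hours <= 48:
        return 2          # 24 to 48 hrs (48 assumed to stay in tier 2)
    if rto_hours <= 96:
        return 3          # 48 to 96 hrs (96 assumed to stay in tier 3)
    if rto_hours <= 168:
        return 4          # 96 hrs to 1 week
    return 5              # deferrable: as time allows


# Hypothetical applications and the RTOs a BIA might assign them.
for app, rto in [("directory services", 0), ("email", 12), ("payroll", 72), ("archive", 400)]:
    print(f"{app}: tier {recovery_tier(rto)}")
```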

DR 2.0

• High Availability Candidates

- Directories/Authentication

– Active Directory (AD)

– Lightweight Directory Access Protocol (LDAP)

- Network Routing

– Domain Name System (DNS)

• Disaster Recovery Candidates

- Firewall and other security devices

- Data backup system

DR 2.0 Virtualization Defined

Virtualization is the creation of a virtual layer between the actual physical element and the virtual interface. The virtualization layer shields the user from hardware differences and masks changes to the actual element.

There are four areas of IT where Virtualization is making inroads:

- Network Virtualization: Established technology including the use of VLANs, VPNs

- Server Virtualization: Great fit for DR/Test/Dev environments. Production also.

- Storage Virtualization: Gaining traction in the market. It is the pooling of physical storage from multiple devices into what appears to be a single storage device.

- Desktop Virtualization: The latest hot topic in this space. Tremendous potential to simplify workarea recovery, but requires network bandwidth.

Bottom Line: Server Virtualization is a game changer. Storage Virtualization is just starting. Desktop is TBD

DR 2.0 Server Virtualization

• Survey of 1,038 VMware customers from North America, Europe and Asia Pacific

- 45% of respondents cited business continuity as main driver for virtualization deployment.

- Server consolidation - 2nd most popular reason

DR 2.0 Virtualization

Server Virtualization

- Vendors: VMware, MS Hyper-V, XenSource

- DR Variations: V2V, P2V, V2P

- Primarily Windows – moving slowly into Linux

- Watch out for management issues

DR 2.0 Virtualization Case Study

International Manufacturing Company

- Outsourced iSeries platform with RTO 24 hours

- Windows environment with no DR capability

Tier  Criticality   RTO         Servers
0     Self-Healing  Immediate   4
1     Critical      < 4 hrs     19
2     Very High     < 24 hrs    6
3     High          < 3 days    61
4     Medium        < 7 Days    5
5     Low           < 30 Days   14

DR 2.0 Virtualization Case Study

Cost                               PHYSICAL      VIRTUAL
Capital Year 1
Servers/Midrange Processors        $607,700      $111,600
Storage Requirements               $520,033      $520,033
Software                           $25,575       $48,825
Tape Recovery                      $27,300       $27,300
Network Equipment                  $68,735       $68,735
Implementation Manpower Cost       $0            $0
Data Center Buildout               $128,400      $23,100
BC/DR Support Manpower             $200,000      $200,000
Workarea                           $0            $0
SubTotal Capital                   $1,577,743    $999,593
Other (miscellaneous) 5%           $78,887       $49,980
Total Capital                      $1,656,630    $1,049,573

Operating Expense                  PHYSICAL      VIRTUAL
Servers/Midrange Processors        $4,290        $1,500
Storage Requirements               $0            $0
Software Maintenance               $4,830        $4,830
Hardware Maintenance               $183,565      $10,638
Tape Recovery                      $0            $0
Network                            $117,270      $117,270
Dedicated Space (Colo racks)       $0            $0
Dedicated Space - Leased           $0            $0
Power Cost                         $120,870      $30,038
Facility Maintenance & Support     $18,108       $3,177
BC/DR Support Manpower             $100,000      $100,000
DR Plan Development Manpower       $0            $0
Test/Exercise Manpower             $0            $0
Equipment Support Manpower         $100,000      $100,000
Workarea (qship or mobile)         $0            $0
Email Recovery (If Outsourced)     $0            $0
Subtotal Expense                   $648,933      $367,453
Miscellaneous 2%                   $12,979       $7,349
Total Operating Expense            $661,912      $374,802

International Manufacturing Company
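The case study totals can be cross-checked by summing the non-zero line items and applying the miscellaneous allowance the slide quotes (5% on capital, 2% on operating expense). The sketch below uses the figures from the slide; the dictionary and function names are hypothetical, and rounding the allowance to whole dollars is an assumption. Summed this way, the virtual operating column comes to its listed subtotal of $367,453 plus $7,349, or $374,802.

```python
# Non-zero line items from the case study slide, as (physical, virtual) dollars.
CAPITAL = {
    "Servers/Midrange Processors": (607_700, 111_600),
    "Storage Requirements": (520_033, 520_033),
    "Software": (25_575, 48_825),
    "Tape Recovery": (27_300, 27_300),
    "Network Equipment": (68_735, 68_735),
    "Data Center Buildout": (128_400, 23_100),
    "BC/DR Support Manpower": (200_000, 200_000),
}
OPERATING = {
    "Servers/Midrange Processors": (4_290, 1_500),
    "Software Maintenance": (4_830, 4_830),
    "Hardware Maintenance": (183_565, 10_638),
    "Network": (117_270, 117_270),
    "Power Cost": (120_870, 30_038),
    "Facility Maintenance & Support": (18_108, 3_177),
    "BC/DR Support Manpower": (100_000, 100_000),
    "Equipment Support Manpower": (100_000, 100_000),
}


def totals(items, misc_rate):
    """Return (physical, virtual) totals: subtotal plus a rounded miscellaneous allowance."""
    result = []
    for column in (0, 1):
        subtotal = sum(values[column] for values in items.values())
        result.append(subtotal + round(subtotal * misc_rate))
    return tuple(result)


capital = totals(CAPITAL, 0.05)      # 5% miscellaneous on capital
operating = totals(OPERATING, 0.02)  # 2% miscellaneous on operating expense
print("Capital (physical, virtual):", capital)
print("Operating (physical, virtual):", operating)
print("Year-one capital saved by virtualizing:", capital[0] - capital[1])
```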

DR 2.0 Disk vs. Tape

DR of critical apps must be on disk

- Tape adds too much risk

- Tape means 1 to 2 days of data loss. Is that acceptable?

- Cost differential between Tape / Disk is closing

DR 2.0 Data Replication

Replication options:

- Array based

• SAN (also NAS)

• iSCSI is gaining ground on Fibre Channel

- Host (Server to Server)

- Application specific

• Oracle, Exchange, SharePoint etc.

DR 2.0 Data Replication

Replication vendors:

- Array based

• SAN – EMC, Hitachi, IBM

• NAS – NetApp, BlueArc

- Host

• DoubleTake, NeverFail

- Application specific

• Oracle, MS and others

DR 2.0 Data Backup

Advances in Data Backup Systems

- Data De-duplication

20-50x or more compression ratio often achieved

- Continuous Data Protection

- Snapshots
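To illustrate how de-duplication reaches high ratios on repetitive backup data, the sketch below stores each distinct chunk only once and compares logical bytes to stored bytes. It is a toy model under stated assumptions: fixed-size 4 KiB chunks and SHA-256 fingerprints are choices made here for simplicity, not a description of any vendor's product, and the workload is hypothetical.

```python
import hashlib


def dedup_ratio(data: bytes, chunk_size: int = 4096) -> float:
    """Ratio of logical bytes to the bytes of unique chunks actually stored."""
    unique = {}
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        # Keep one copy per distinct fingerprint; remember its length.
        unique.setdefault(hashlib.sha256(chunk).digest(), len(chunk))
    stored = sum(unique.values())
    return len(data) / stored if stored else 1.0


# Hypothetical data set: ten distinct 4 KiB chunks (40,960 bytes).
nightly = b"".join(i.to_bytes(4, "big") * 1024 for i in range(10))

# Ten nightly full backups of unchanged data de-duplicate down to one copy.
print(dedup_ratio(nightly * 10))
```

Real backup streams change a little each night, so the ratio achieved depends heavily on change rate and retention period.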

Surviving The Meltdown

Current financial environment will drastically impact us all.

What can the BCP / DR professional do about it?

Option 1: Hide in a corner and wait for 2010

Option 2: Get proactive – identify cost effective solutions

Surviving The Meltdown

• Tighten up plans

• Identify Top Two Risks & address them

- Reallocate funds

- Gap Analysis

• Install $$$ saving technologies

- Virtualization

- WAN Accelerators

- IP based replication

Going Green

• By 2008, Gartner estimates that 48% of all IT budgets will be spent on energy alone.*

• The energy used to power the nation’s data centers doubled between 2000 and 2006, and could double again in another five years.**

*Network World – 10 Ways to make your data center more efficient

** Federal Times.com - Huge savings seen in power-hungry data centers

Going Green

• Server Virtualization

• Data De-duplication

• Hot / Cold aisle - blanking panels / vinyl curtains

• Storage Consolidation

• Server Shut Down

Final Thoughts

Thank You!