Food for Thought
“To go forward, you must back up.”
-Cardinal rule of computing
“If it wasn’t backed-up, then it wasn’t important.”
-The Sysadmin’s motto
Agenda
• Realities of Disaster Recovery Today
• Industry Trends / Statistics
• Disaster Recovery Components
• External Waves
- Financial Tsunami
- Going Green
Questions Posed
• Disk based backup versus tape based. Do I still need tape?
• Replication options: host, network, application specific and array based solutions. Which is the right approach for my organization?
• How can snapshots be used to improve recovery times?
• Why is virtualization having such a big impact on DR strategies?
• What infrastructure systems need to be “self healing” to ensure application recovery?
• How do WAN accelerators fit into the DR strategy? Do they work in all scenarios (live replication versus replicating backups offsite)?
DR Realities
• Production environments have grown more complex with increasing availability requirements
- 24 X 7 “always on” Internet apps
- Work from home has increased availability requirements for all apps
- The days of kicking off the backup job at 6:00 PM and sending the tapes offsite the next morning are going or gone!
- Manual business recovery workarounds are just a fond memory
DR Realities
• Has your DR capability kept current?
- Are you trapped in a tape based recovery strategy?
- Are critical applications missing from DR strategy?
• Does your test program validate recovery objectives?
- Can you hit the defined RTOs?
- Time for declaration, travel etc. included? Are they realistic?
- Good mix of exercises across all critical applications?
DR Proclamations
1. Tape is dead – Long live Disk!
• Not quite dead yet but only disk for critical DR
o Tape may be only option for single site and SMBs
o Still needed for long term retention (archive) requirements
o Also good for less critical systems with RTO in days/weeks
2. Infrastructure needs to be resilient
• “Recover AD” better not be Step One of DR plan
• Has the tape catalog been recovered at the DR site?
3. Work from Home is, or will be, the dominant work area recovery platform
The Need for Resiliency
THEN: Disaster Recovery / NOW: Operational Resiliency

                   THEN                            NOW
Outage Duration    Long-Term                       Immediate
Environment        SPOF                            Redundant
Service Level      Degraded Performance            Near Equivalent
Data Currency      Time of Last Backup             No Data Loss
Alternate Site     Shared                          Dedicated
Budget             Cost Containment                Return on Investment
Data Base          Full Recovery & Roll Forward    Restartable

A New Paradigm Has Emerged
Disaster Recovery Components
DR 2.0 Planning
• It is the maintenance that kills you
- How manual is your update process?
• Do you have a tool or database in place?
- Is there a current application / server map?
o All applications or specific to certain data centers? Just “in scope” or just DR?
DR 2.0 Testing
• Is there a documented Test schedule? Enforced?
- Different examples of two tests per year
• [1] mainframe & [1] distributed application
• [1] big (almost all) & [1] little (NetBackup, one DB)
• [1] 50% of apps & [1] the other 50%
- Another approach
• All apps with defined DR solution are tested every 18 months
• Plans updated/enhanced as a result of testing?
• Are walkthroughs done effectively?
DR 2.0 Data Center Availability
                                    Tier One   Tier Two   Tier Three               Tier Four
Building Type                       Tenant     Tenant     Stand-Alone              Stand-Alone
Delivery path (Power)               One        One        One Active, One Passive  Two Active
Delivery path (Cooling)             One        One        One Active, One Passive  Two Active
Redundant Components                N          N+1        N+1                      2(N+1) or S + S
Concurrently Maintainable           No         No         Yes                      Yes
Site Availability                   99.67%     99.75%     99.98%                   100.00%
Hours of IT downtime due to site    28.8 hrs   22.0 hrs   1.6 hrs                  0.8 hrs

The Uptime Institute, Inc. has developed a four-Tier classification approach to facility infrastructure functionality that addresses the need for a common benchmarking standard. Availability considerations on site infrastructure should use this standard.
www.uptimeinstitute.org/
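The relationship between the site availability percentage and the annual downtime row can be sketched as simple arithmetic. The snippet below is an illustration only: the function name and the 8,760-hour year are assumptions, not part of the Uptime Institute material. Because the percentages on the slide are rounded, the results (roughly 28.9, 21.9 and 1.8 hours) only approximate the published downtime figures.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years are ignored (assumption)


def downtime_hours(availability_pct: float) -> float:
    """Expected hours of site downtime per year at a given availability."""
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    return (100.0 - availability_pct) / 100.0 * HOURS_PER_YEAR


# Availability figures as printed on the slide (rounded to two decimals).
for tier, pct in [("Tier One", 99.67), ("Tier Two", 99.75), ("Tier Three", 99.98)]:
    print(f"{tier}: {pct}% available -> {downtime_hours(pct):.1f} hrs/year")
```

The same function can be run in reverse as a sanity check on any vendor availability claim: an SLA quoted as a percentage always implies a concrete number of hours per year.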
DR 2.0
• Migration from Hot Site vendors to Internal
- Coincides with migration to DR replication solutions
- Co-Lo facilities as part of Internal strategy
• Outsourcing getting multi-faceted
- Infrastructure support
- Software as a Service (SaaS)
- Cloud Computing
DR 2.0 Cloud Computing
• It started with SaaS
- Salesforce.com / LDRPS
• Cloud vendors now looking to provide scalability & performance flexibility
• Vendors
- Web 2.0 (Google, Amazon)
- Traditional IT (IBM, MS)
DR 2.0 Cloud Computing
Cloud – DR Considerations
• Risk Profile has changed
- Risk is diversified by moving app/data off corporate network
• Negotiate/define DR SLAs in contract
• Consider compliance requirements
- Where is my data stored?
DR 2.0 WAN Accelerators
• Who are the players?
- Riverbed has mindshare – Also Cisco, Bluecoat, Juniper
• Types of environments
- Branch to Data Center vs. Data Center to Data Center
- Application Specific WAN Optimizers
o Common apps are CIFS, HTTP
o Also Oracle, SQL
• Drivers to implementation
- Data Center consolidation
- Software as a Service (SaaS)
- Branch environments
DR 2.0 Case Study
International Law Firm
Critical Applications
- MS Exchange
- Files / Doc Mgmt
- Voice
DR 2.0 Case Study
WAN Accelerators
- Eliminated resynch bottleneck
- Kept bandwidth requirements down
- Reduced data loss (RPO)
- Improved recovery time (RTO)
DR 2.0
• Work from Home Deployment
- Supplement / replace fixed workarea recovery
• Major Deployments
- VPN
- Citrix
- Remote Desktop Protocol
DR 2.0 Recovery Tier Chart
Tier  Criticality       RTO               RPO                   Investment
0     Self-Healing      Immediate         PoF (1)               Very High
1     Mission Critical  <24 hrs           PoF or Intra-Day (2)  High
2     Critical          24 to 48 hrs      Intra-Day or LC (3)   Moderate to High
3     Required          48 to 96 hrs      LC                    Moderate
4     Non-Critical      96 hrs - 1 Week   LC                    Low
5     Deferrable        As time allows    LC                    $'s ATOD*

* At Time of Disaster

Application Tier Examples
- Tier 0: DNS, LDAP, Active Directory, Authentication
- Tiers 1-5: To be determined by BIA activity

(1) Point-of-Failure – data protected to the time of disaster
(2) Intra-Day – data protected periodically during the business day
(3) Last Capture – data protected via tape, which is typically last night's backup cycle
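As an illustration of how the chart's RTO bands (Immediate, <24 hrs, 24 to 48 hrs, 48 to 96 hrs, 96 hrs - 1 Week, As time allows) might be applied once a BIA has produced an RTO per application, here is a minimal sketch. The function name and the handling of boundary values are assumptions made for this example; the chart itself does not say which tier an RTO of exactly 24, 48 or 96 hours belongs to.

```python
HOURS_PER_WEEK = 7 * 24


def recovery_tier(rto_hours: float) -> int:
    """Map an application's RTO (in hours) to a recovery tier 0-5.

    Boundary handling is an assumption of this sketch: exactly 24 hrs is
    treated as Tier 2, and 48 and 96 hrs stay in the more critical tier.
    """
    if rto_hours < 0:
        raise ValueError("RTO cannot be negative")
    if rto_hours == 0:
        return 0  # Self-Healing: immediate recovery
    if rto_hours < 24:
        return 1  # <24 hrs
    if rto_hours <= 48:
        return 2  # 24 to 48 hrs
    if rto_hours <= 96:
        return 3  # 48 to 96 hrs
    if rto_hours <= HOURS_PER_WEEK:
        return 4  # 96 hrs - 1 week
    return 5      # Deferrable: as time allows


# Hypothetical application names, used only to show the call.
for app, rto in [("directory services", 0), ("email", 4), ("file shares", 72)]:
    print(f"{app}: RTO {rto} hrs -> Tier {recovery_tier(rto)}")
```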
DR 2.0
• High Availability Candidates
- Directories/Authentication
o Active Directory (AD)
o Lightweight Directory Access Protocol (LDAP)
- Network Routing
o Domain Name System (DNS)
• Disaster Recovery Candidates
- Firewall and other security devices
- Data backup system
DR 2.0 Virtualization Defined
Virtualization is the creation of a virtual layer between the actual physical element and the virtual interface. The virtualization layer shields the user from hardware differences and masks changes to the actual element.
There are four areas of IT where Virtualization is making inroads:
- Network Virtualization: Established technology including the use of VLANs, VPNs
- Server Virtualization: Great fit for DR/Test/Dev environments. Production also.
- Storage Virtualization: Gaining traction in the market. It is the pooling of physical storage from multiple devices into what appears to be a single storage device.
- Desktop Virtualization: The latest hot topic in this space. Tremendous potential to simplify workarea recovery, but requires network bandwidth.
Bottom Line: Server Virtualization is a game changer. Storage Virtualization is just starting. Desktop is TBD
DR 2.0 Server Virtualization
• Survey of 1,038 VMware customers from North America, Europe and Asia Pacific
- 45% of respondents cited business continuity as the main driver for virtualization deployment
- Server consolidation was the 2nd most popular reason
DR 2.0 Virtualization
Server Virtualization
- Vendors: VMware, MS Hyper-V, XenSource
- DR Variations: V2V, P2V, V2P
- Primarily Windows – moving slowly into Linux
- Watch out for management issues
DR 2.0 Virtualization Case Study
International Manufacturing Company
- Outsourced iSeries platform with RTO 24 hours
- Windows environment with no DR capability

Tier  Criticality   RTO         Servers
0     Self-Healing  Immediate   4
1     Critical      < 4 hrs     19
2     Very High     < 24 hrs    6
3     High          < 3 days    61
4     Medium        < 7 Days    5
5     Low           < 30 Days   14
DR 2.0 Virtualization Case Study
International Manufacturing Company

Capital Year 1                    PHYSICAL      VIRTUAL
Servers/Midrange Processors       $607,700      $111,600
Storage Requirements              $520,033      $520,033
Software                          $25,575       $48,825
Tape Recovery                     $27,300       $27,300
Network Equipment                 $68,735       $68,735
Implementation Manpower Cost      $0            $0
Data Center Buildout              $128,400      $23,100
BC/DR Support Manpower            $200,000      $200,000
Workarea                          $0            $0
SubTotal Capital                  $1,577,743    $999,593
Other (miscellaneous) 5%          $78,887       $49,980
Total Capital                     $1,656,630    $1,049,573

Operating Expense                 PHYSICAL      VIRTUAL
Servers/Midrange Processors       $4,290        $1,500
Storage Requirements              $0            $0
Software Maintenance              $4,830        $4,830
Hardware Maintenance              $183,565      $10,638
Tape Recovery                     $0            $0
Network                           $117,270      $117,270
Dedicated Space (Colo racks)      $0            $0
Dedicated Space - Leased          $0            $0
Power Cost                        $120,870      $30,038
Facility Maintenance & Support    $18,108       $3,177
BC/DR Support Manpower            $100,000      $100,000
DR Plan Development Manpower      $0            $0
Test/Exercise Manpower            $0            $0
Equipment Support Manpower        $100,000      $100,000
Workarea (qship or mobile)        $0            $0
Email Recovery (If Outsourced)    $0            $0
Subtotal Expense                  $648,933      $367,453
Miscellaneous 2%                  $12,979       $7,349
Total Operating Expense           $661,912      $374,802
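The subtotals and totals in this cost comparison can be re-derived from the line items. The sketch below is one way to check them; the function and variable names are illustrative, and rounding the miscellaneous allowance to the nearest whole dollar is an assumption. Recomputed this way, the virtual operating expense comes to $374,802 (a $367,453 subtotal plus $7,349 miscellaneous).

```python
def total_with_misc(line_items, misc_pct):
    """Return (subtotal, miscellaneous, total) in whole dollars.

    misc_pct is a whole-number percentage; the allowance is rounded
    half-up to the nearest dollar (an assumption of this sketch).
    """
    subtotal = sum(line_items)
    misc = (subtotal * misc_pct + 50) // 100
    return subtotal, misc, subtotal + misc


# Line items in the order they appear in the case study tables.
CAPITAL = {
    "physical": [607_700, 520_033, 25_575, 27_300, 68_735, 0, 128_400, 200_000, 0],
    "virtual": [111_600, 520_033, 48_825, 27_300, 68_735, 0, 23_100, 200_000, 0],
}
OPERATING = {
    "physical": [4_290, 0, 4_830, 183_565, 0, 117_270, 0, 0,
                 120_870, 18_108, 100_000, 0, 0, 100_000, 0, 0],
    "virtual": [1_500, 0, 4_830, 10_638, 0, 117_270, 0, 0,
                30_038, 3_177, 100_000, 0, 0, 100_000, 0, 0],
}

for label, table, pct in [("Capital", CAPITAL, 5), ("Operating", OPERATING, 2)]:
    physical = total_with_misc(table["physical"], pct)[2]
    virtual = total_with_misc(table["virtual"], pct)[2]
    print(f"{label}: physical ${physical:,} vs virtual ${virtual:,} "
          f"(difference ${physical - virtual:,})")
```

Most of the gap comes from servers, data center buildout, hardware maintenance and power; storage, network and manpower are the same in both columns.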
DR 2.0 Disk vs. Tape
DR of critical apps must be on disk
- Tape adds too much risk
- Tape means 1 to 2 days of data loss. Is that acceptable?
- Cost differential between Tape / Disk is closing
DR 2.0 Data Replication
Replication options:
- Array based
o SAN (also NAS)
o iSCSI is gaining traction on Fibre Channel
- Host (Server to Server)
- Application specific
o Oracle, Exchange, SharePoint etc.
DR 2.0 Data Replication
Replication vendors:
- Array based
• SAN – EMC, Hitachi, IBM
• NAS – NetApp, BlueArc
- Host
• DoubleTake, NeverFail
- Application specific
• Oracle, MS and others
DR 2.0 Data Backup
Advances in Data Backup Systems
- Data De-duplication
o 20-50x or more compression ratio often achieved
- Continuous Data Protection
- Snapshots
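To make the de-duplication ratio concrete, here is a minimal sketch of the general idea: split the backup stream into chunks, store each distinct chunk once, and compare logical bytes to stored bytes. Fixed-size 4 KB chunks and SHA-256 fingerprints are assumptions chosen for simplicity; this is not a description of any particular vendor's product, and commercial systems often use more sophisticated chunking.

```python
import hashlib


def dedup_ratio(data: bytes, chunk_size: int = 4096) -> float:
    """Logical bytes divided by bytes stored after dropping duplicate chunks."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    unique_chunks = {}  # fingerprint -> length of the chunk stored once
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        unique_chunks.setdefault(hashlib.sha256(chunk).digest(), len(chunk))
    stored = sum(unique_chunks.values())
    return len(data) / stored if stored else 1.0


# Ten "nightly full backups" of an unchanged 4 KB block collapse to one copy.
ten_identical_fulls = (b"A" * 4096) * 10
print(f"{dedup_ratio(ten_identical_fulls):.0f}x")
```

High ratios depend on repeated full backups of data that changes little between runs; a single backup of unique data will show a ratio close to 1x.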
Surviving The Meltdown
Current financial environment will drastically impact us all.
What can the BCP / DR professional do about it?
Option 1: Hide in a corner and wait for 2010
Option 2: Get proactive – identify cost effective solutions
Surviving The Meltdown
• Tighten up plans
• Identify Top Two Risks & address them
- Reallocate funds
- Gap Analysis
• Install $$$ saving technologies
- Virtualization
- WAN Accelerators
- IP based replication
Going Green
• By 2008, Gartner estimates that 48% of all IT budgets will be spent on energy alone.*
• The energy used to power the nation’s data centers doubled between 2000 and 2006, and could double again in another five years.**
*Network World – 10 Ways to make your data center more efficient
** Federal Times.com - Huge savings seen in power-hungry data centers
Going Green
• Server Virtualization
• Data De-duplication
• Hot / Cold aisle - blanking panels / vinyl curtains
• Storage Consolidation
• Server Shut Down
Final Thoughts
Thank You !