Www Definethecloud Net

Disaster Recovery and the Cloud

DECEMBER 1, 2010 BY MICHAEL LYLE

It goes without saying that modern business relies on information technology. As a result, it is essential that operations personnel consider the business impact of outages and plan accordingly. As an illustration, Virgin Blue recently experienced a twenty-hour outage in its reservation system that resulted in losses of up to $20 million. The cloud provides both considerable opportunities and significant challenges relating to disaster recovery.

In general, organizations must currently build multiple levels of redundancy into their systems to reach high-availability targets and to protect themselves from catastrophic outages during a natural or man-made disaster. A disaster recovery strategy requires that data and critical application infrastructure be duplicated at a separate location, away from the primary datacenter. Cutting over to a disaster recovery site is usually not instantaneous, and redundancy is often lost while operating under the contingency plan. For this reason, site-local redundancy mechanisms (such as high availability network systems, failover for portions of the application stack, and SAN-level redundancy) are also required to achieve availability goals. Public clouds often further complicate disaster recovery planning, as the organization’s critical systems may now be spread across its own infrastructure and a multitude of outside vendors, each with its own data model and recovery practices.

Business requirements and application criticality should guide the approach chosen for business continuity. Consider the concepts of RPO (Recovery Point Objective) and RTO (Recovery Time Objective). The RPO of a system is the specified amount of data that may be lost in the event of a failure, while the RTO of a system is the amount of time that it will take to bring the system back online after a failure. In general, site-local mechanisms will provide near-instantaneous RPO and RTO, while disaster recovery systems often will have an RPO of several hours or days of information, and an RTO measured in tens of minutes. Through increasingly sophisticated (and costly) infrastructures, these times can be reduced but not entirely eliminated.

Illustration of RTO and RPO in a backup system
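
To make the two objectives concrete, here is a minimal Python sketch that measures the data-loss window and the downtime for a single incident and compares them with an RPO and RTO. The function names, the timeline, and the objective values are all hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta


def measure_incident(last_recoverable_copy, failure, service_restored):
    """Return (data_loss_window, downtime) for one incident.

    data_loss_window: work done between the last recoverable copy and the
    failure, which cannot be restored. Compare it with the RPO.
    downtime: time from the failure until service is back. Compare it
    with the RTO.
    """
    data_loss_window = failure - last_recoverable_copy
    downtime = service_restored - failure
    return data_loss_window, downtime


def meets_objectives(data_loss_window, downtime, rpo, rto):
    """True only if the incident stayed within both objectives."""
    return data_loss_window <= rpo and downtime <= rto


# Hypothetical timeline, not taken from any real incident.
last_backup = datetime(2010, 12, 1, 2, 0)
failure = datetime(2010, 12, 1, 14, 30)
restored = datetime(2010, 12, 1, 15, 15)

loss, downtime = measure_incident(last_backup, failure, restored)
print("data lost:", loss)      # 12:30:00
print("downtime:", downtime)   # 0:45:00
print(meets_objectives(loss, downtime,
                       rpo=timedelta(hours=24), rto=timedelta(hours=1)))
```

With these made-up numbers the incident loses twelve and a half hours of data and 45 minutes of service, which would satisfy a 24-hour RPO and a one-hour RTO but not stricter targets.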

Dedicated redundancy infrastructure, both site-local and for disaster recovery purposes, must be regularly tested. Additionally, it is essential to ensure that the disaster recovery environment is compatible with the existing infrastructure and capable of running the critical application. This is an area where change management procedures are important, to ensure that critical changes to the production infrastructure are made in the standby environment as well. Otherwise, the standby environment may not be able to correctly run the application when the disaster recovery plan is activated.
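
One simple way to catch the kind of drift described above is to compare an inventory of the production environment against the standby environment on a schedule. The sketch below assumes each environment can be summarized as a dictionary of component names and versions; the inventories and the `find_drift` helper are hypothetical, and a real check would pull this data from whatever configuration management system is in use.

```python
def find_drift(production, standby):
    """Return {component: (production_value, standby_value)} for every mismatch.

    A component missing from one side is reported with None on that side.
    """
    drift = {}
    for component in sorted(set(production) | set(standby)):
        prod_value = production.get(component)
        standby_value = standby.get(component)
        if prod_value != standby_value:
            drift[component] = (prod_value, standby_value)
    return drift


# Hypothetical inventories, for illustration only.
production = {"os": "rhel-5.5", "database": "9.0.1", "app": "2.4.0", "cache": "1.4.5"}
standby = {"os": "rhel-5.5", "database": "9.0.0", "app": "2.4.0"}

for component, (prod, stby) in find_drift(production, standby).items():
    print(f"{component}: production={prod} standby={stby}")
```

An empty result does not prove the standby site works; it only shows that the tracked components match, so it complements regular failover tests rather than replacing them.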

The primary factor that determines RTO and RPO is the approach used to move data to the contingency site. The easiest and lowest cost approach is tape backup. In this case, the RPO is the time between successive backups moved off-site (perhaps a week or more) and the RTO is the amount of time necessary to retrieve the backups, restore the backups, and activate the contingency site. This may be a significant amount of time, especially if personnel are not readily available during the disaster scenario. Alternatively, a hot contingency site may be maintained, and database log-shipping or volume snapshotting/replication can be used to send business data to the secondary site. These systems are costly, but readily attain an RTO of under an hour, and an RPO of perhaps one day. With substantial investment and complexity, RPO can even be reduced to the range of minutes. However, organizations have often been surprised to find that the infrastructure doesn’t work when it is called upon, often because of the complexity of the infrastructure and the difficulties involved in testing a standby site.
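
The arithmetic behind these figures can be sketched in a few lines of Python. The worst-case RPO is taken as the interval between successive transfers to the contingency site, and the RTO as the sum of the recovery steps. All durations below are hypothetical placeholders in the spirit of the ranges mentioned above, not measurements.

```python
from datetime import timedelta


def worst_case_rpo(transfer_interval, transfer_lag=timedelta(0)):
    """Worst-case data loss: a failure just before the next transfer completes."""
    return transfer_interval + transfer_lag


def estimated_rto(*recovery_steps):
    """Recovery time as the sum of sequential recovery steps."""
    return sum(recovery_steps, timedelta(0))


# Hypothetical tape backup plan: tapes go off-site weekly.
tape_rpo = worst_case_rpo(timedelta(days=7))
tape_rto = estimated_rto(
    timedelta(hours=8),   # retrieve tapes from off-site storage
    timedelta(hours=12),  # restore the backups
    timedelta(hours=4),   # activate the contingency site
)

# Hypothetical hot site fed by a daily volume snapshot.
hot_rpo = worst_case_rpo(timedelta(days=1))
hot_rto = estimated_rto(
    timedelta(minutes=20),  # declare the disaster and decide to fail over
    timedelta(minutes=30),  # bring the standby application online
)

print("tape:     RPO", tape_rpo, "| RTO", tape_rto)
print("hot site: RPO", hot_rpo, "| RTO", hot_rto)
```

Writing the plan down this way makes it easy to see which step dominates the RTO, and the estimate is only as good as the step durations, which is one more reason to time them during real tests.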

When procuring IaaS (Infrastructure as a Service) or SaaS (Software as a Service), it is essential for the organization to perform due diligence regarding what disaster recovery mechanisms the service vendor uses. The stakes are too high to trust service level agreements alone (in the case of a catastrophic failure during a disaster, will the vendor be solvent and will the compensation received be sufficient to compensate for business losses?).

Disaster Recovery as a Service, or DRaaS, is an emerging category for organizations that wish to control their own infrastructure but not maintain the disaster recovery systems themselves. With a DRaaS offering, an IT organization does not directly build a contingency site, but instead relies on a vendor to do so on a dedicated or utility computing infrastructure. The cloud’s advantages in elasticity and cost-reduction are significant benefits in a disaster recovery scenario, and service offerings allow organizations to outsource portions of contingency planning to vendors with expertise in the area. However, many of the complexities remain and it is essential to perform the due diligence to ensure that the contingency plan will work and provide a sufficient level of service if called upon.

Finally, there are emerging technologies that combine site-local redundancy and disaster recovery into a unified system. For example, distributed synchronous multi-master databases allow an application to be spread across multiple locations, including cloud availability zones, with the application active and processing transactions in all of them. A specified portion of the system can be lost without any downtime or recovery effort. These emerging systems offer the prospect of dramatically reducing costs and minimizing the risk of contingency sites not functioning properly.
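
How large that "specified portion" can be depends on the commit rule the database uses, which is not stated here. As an illustration only, the sketch below assumes a simple majority-quorum rule, where a transaction commits once more than half of the sites acknowledge it; real systems may use different or configurable rules, and the function names are hypothetical.

```python
def tolerable_failures(site_count):
    """Sites that can be lost while a strict majority remains (assumed rule)."""
    if site_count < 1:
        raise ValueError("site_count must be at least 1")
    return (site_count - 1) // 2


def can_commit(site_count, reachable_sites):
    """True if the reachable sites still form a strict majority."""
    return reachable_sites >= site_count // 2 + 1


for sites in (3, 5):
    print(sites, "sites tolerate the loss of", tolerable_failures(sites))

# With 3 sites, losing one still leaves a majority; losing two does not.
print(can_commit(3, 2))  # True
print(can_commit(3, 1))  # False
```

Under this assumption, an even number of sites tolerates no more failures than the next lower odd number, which is why such deployments often use three or five locations.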

About the Author

Michael Lyle (@MPLyle) is CTO and co-founder of Translattice, and is responsible for the company’s strategic technical direction. He is a recognized leader in developing new technologies and has extensive experience in datacenter operations and distributed systems.


Related posts:

1. Redundancy in Data Storage: Part 2: Geographical Replication
2. OTV and Vplex: Plumbing for Disaster Avoidance
3. How to Boost Cloud Reliability
4. Business Drivers for Cloud Infrastructures
5. Building a Hybrid Cloud

FILED UNDER: CONCEPTS TAGGED WITH: DATA CENTER, DISASTER RECOVERY, DR


Disclaimer

All brand and company names are used for identification purposes only. These pages are not sponsored or sanctioned by any of the companies mentioned; they are the sole work and property of the authors. While the author(s) may have professional connections to some of the companies mentioned, all opinions are those of the individuals and may differ from official positions of those companies. This is a personal blog of the author, and does not necessarily represent the opinions and positions of his employer or their partners.

This work by Joe Onisick is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
