Data Center Business Continuance and Disaster Recovery

Maciej Bocian, [email protected]
Architecture Sales Manager, Data Center and Virtualization, Central Europe
CCIE#7785

© 2009 Cisco Systems, Inc. All rights reserved. Cisco Confidential


Business Continuance Drivers

• Cost of application downtime, lost data and productivity
• Regulatory mandates (Homeland Defense, Basel II, HIPAA, GLB, SEC)
  • Firms must recover business operations the same business day a disruption occurs
  • “Out-of-region” data center, 200+ km away
  • Mandates backup data centers on separate grids

[Images: hurricanes, the Northeast Blackout, the NYC Blizzard of 2003]

Business Continuance Is More Critical than Ever

• 75% of IT decision-makers have altered Disaster Recovery/Business Continuance programs as a result of September 11
• Following a disaster, 43% of directly affected businesses do not reopen and 29% fail within 24 months as a result
• Only 15% of Global 2000 enterprises have a full-fledged business continuity plan
• Disasters: fire, storm, floods, earthquakes, chemical accidents, nuclear accidents, wars

Sources: Disaster Recovery Journal, Gartner Group

Agenda

• Introduction to Data Center - The Evolution
• Data Center Disaster Recovery
  • Objectives
  • Failure Scenarios
  • Design Options
• Components of Disaster Recovery
  • Site Selection - Front End GSLB
  • Server High Availability - Clustering
  • Data Replication and Synchronization - SAN Extension
• Sample Design

The Evolution of Data Centers

Data Center Evolution

[Figure: timeline from 1960 to 2010 (1960, 1980, 2000, 2010) plotted against business agility. Compute evolution: Mainframes, Client/Server, Internet Computing, Optimization. Network evolution: Terminal, TCP/IP, Thin Client: HTTP, Content Networking, Data Center Networking. Networked data center phase: 1. Consolidation, 2. Integration, 3. Distributed, 4. High Availability, shown as Data Center Consolidation, Data Center Distributed and Data Center Continuous Availability.]

What is involved in a Data Center

• Application solution: Linux/HP, Solaris/SunFire, WebLogic, J2EE custom app, etc.
• Database solution: Linux/HP, Solaris/SunFire, Oracle 10G RAC, etc.
• Storage solution: MDS9000
• Network infrastructure solution: Cisco GSRs, Cisco Catalyst 6500, Cisco Catalyst Cat4000
• Layer 4–7 services solution: CSM, SSLM, CSS, CE, GSS
• Network security solution: PIX®, FWSM, IDSM, VPNSM, CSA
• Management and instrumentation solution: terminal servers, NAM, Cisco Works LMS/VMS, HSE

What is Distributed Data Center

[Diagram: a primary data center (APP A, APP B) and a secondary data center (APP A, APP C), each with FC-attached storage, linked by data replication.]

Why Distributed Data Centers

• Provide disaster recovery and business continuance
• Avoid single, concentrated data depositary
• High availability of applications and data access
• Load balancing together with performance scalability
• Better response and optimal content routing: proximity to clients

Front-end IP Access Layer

• “Content Routing”: site selection

[Diagram: primary data center (APP A, APP B) and secondary data center (APP A, APP C), each with FC-attached storage; the front-end IP access layer is highlighted.]

Application and Database Layer

• “Content Switching”: load balancing
• “Server Clustering”: high availability

[Diagram: primary data center (APP A, APP B) and secondary data center (APP A, APP C), each with FC-attached storage; the application and database layer is highlighted.]

Backend SAN Extension

• “Storage” & “Optical”: data mirroring and replication

[Diagram: primary data center (APP A, APP B) and secondary data center (APP A, APP C), each with FC-attached storage; the back-end SAN extension is highlighted.]

Data Center Disaster Recovery

Disaster Recovery

• Recovery of data and resumption of service: ensuring business can recover and continue after failure or disaster
• Ability of a business to adapt, change and continue when confronted with various outside impacts
• Mitigating the impact of a disaster

What It Means for Business

• Business Resilience: continued operation of business during a failure
• Business Continuance: restoration of business after a failure
• Disaster Recovery: protecting data through offsite data replication and backup

Zero downtime is the ultimate goal

Disaster Recovery Planning

• Business Impact Analysis (BIA): determines the impacts of various disasters on specific business functions and company assets
• Risk Analysis: identifies important functions and assets that are critical to the company’s operations
• Disaster Recovery Plan (DRP): restores operability of the target systems, applications, or computing facility at the secondary data center after the disaster

Disaster Recovery Objectives

• Recovery Point Objective (RPO)
  • The point in time (prior to the outage) to which systems and data must be restored
  • Tolerable loss of data in the event of disaster or failure
  • The impact of data loss and the cost associated with the loss
• Recovery Time Objective (RTO)
  • The period of time after an outage within which the systems and data must be restored to the predetermined RPO
  • The maximum tolerable outage time
• Recovery Access Objective (RAO)
  • Time required to reconnect users to the recovered application, regardless of where it is recovered
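The RPO and RTO definitions above can be checked against a single outage with simple timestamp arithmetic. The sketch below is only an illustration: the function names, the timestamps and the 30-minute and 4-hour targets are assumptions chosen for the example, not values from the presentation.

```python
from datetime import datetime, timedelta


def achieved_recovery(last_good_copy, outage_start, service_restored):
    """Return (data_loss_window, downtime) for one outage."""
    data_loss_window = outage_start - last_good_copy   # compared against the RPO
    downtime = service_restored - outage_start         # compared against the RTO
    return data_loss_window, downtime


def meets_objectives(last_good_copy, outage_start, service_restored, rpo, rto):
    """True when both the data-loss window and the downtime are within target."""
    loss, downtime = achieved_recovery(last_good_copy, outage_start, service_restored)
    return loss <= rpo and downtime <= rto


# Hypothetical outage: last replica at 09:45, failure at 10:00, service back at 13:30.
last_copy = datetime(2009, 1, 5, 9, 45)
failure = datetime(2009, 1, 5, 10, 0)
restored = datetime(2009, 1, 5, 13, 30)

loss, downtime = achieved_recovery(last_copy, failure, restored)
ok = meets_objectives(last_copy, failure, restored,
                      rpo=timedelta(minutes=30), rto=timedelta(hours=4))
print(loss, downtime, ok)
```

With these example numbers the data-loss window is 15 minutes and the downtime is 3.5 hours, so both assumed targets are met. Tightening the RPO to 10 minutes would make the same outage a miss.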

Recovery Point/Time vs. Cost

[Figure: timeline marked time t0, time t1 and time t2, with the events “Critical data is recovered”, “Disaster strikes” and “Systems recovered and operational”. The recovery point is measured back from the disaster and the recovery time forward from it. Cost ($$$) increases as either interval shrinks.]

• Recovery point scale (secs, mins, hours, days), from smallest to largest: Synchronous Replication, Asynchronous Replication, Periodic Replication, Tape backup
• Recovery time scale (secs, mins, hours, days, weeks), from smallest to largest: Extended Cluster, Manual Migration, Tape Restore
• Smaller RPO/RTO: higher $$$, replication, hot standby
• Larger RPO/RTO: lower $$$, tape backup/restore, cold standby


Failure Scenarios

Disaster could mean many types of failure:

• Network Failure
• Device Failure
• Storage Failure
• Site Failure

Network Failures

[Diagram: data center connected to the Internet through Service Provider A and Service Provider B.]

• ISP failure
  • Dual ISP connections
  • Multiple ISP
• Connection failure within the network
  • EtherChannel
  • Multiple route paths

Device Failures

[Diagram: data center connected to the Internet through Service Provider A and Service Provider B.]

• Routers, Switches, FWs
  • HSRP
  • VRRP
• Hosts
  • HA cluster

Storage Failures

[Diagram: data center connected to the Internet through Service Provider A and Service Provider B.]

• Disk arrays
  • RAID
• Disk Controllers

Site Failures

[Diagram: data center connected to the Internet through Service Provider A and Service Provider B.]

• Partial Site Failure
  • Application maintenance
  • Application migration
  • Application scheduled DR exercise
• Complete Site Failure
  • Disaster


Cold Standby

• One or more data centers with appropriately configured space, equipped with pre-qualified environmental, electrical, and communication conditioning
• Hardware and software installation, network access, and data restoration all need manual intervention
• Least expensive to implement and maintain
• Substantial delay from standby to full operation

Disaster Recovery – Active/Standby

[Diagram: primary data center (APP A, APP B) and secondary data center (APP A, APP B) in cold standby, each with FC-attached storage.]

Warm Standby

• A data center that is partially equipped with hardware and communications interfaces capable of providing backup operating support
• Latest backups from the production data center must be delivered
• Network access needs to be activated
• Provides better RTO and RPO than Cold Standby Backup

Disaster Recovery – Active/Standby

[Diagram: primary data center (APP A, APP B) and secondary data center (APP A, APP B) in warm standby, each with FC-attached storage, connected by an IP/optical network.]

Hot Standby

• A data center that is environmentally ready and has sufficient hardware and software to provide data processing service with little or no downtime
• Hot Backup offers Disaster Recovery with little or no human intervention
• Application data is replicated from the primary site
• A hot backup site provides very good RTO and RPO

Disaster Recovery – Active/Standby

[Diagram: primary data center (APP A, APP B) and secondary data center (APP A, APP C), each with FC-attached storage, connected by an IP/optical network.]

Disaster Recovery – Active/Active

What Does Active/Active Mean?

Multiple Tiers of Application

[Diagram: two sites behind the Internet, Service Provider A and Service Provider B, each site with the tiers below.]

• Presentation Tier
• Application Tier
• Storage Tier

Active/Active Data Centers

[Diagram: two data centers reached from the internal network and from the Internet through Service Provider A and Service Provider B, with the tiers below.]

• Active/Active Web Hosting
• Active/Active Application Processing
• Active/Standby or Active/Active Database Processing

Disaster Recovery Components

Site Selection Mechanisms

Site selection mechanisms depend on the technology or mix of technologies adopted for request routing:

1. HTTP Redirect
2. DNS Based
3. L3 Routing with Route Health Injection (RHI)

• Health of servers and/or applications needs to be taken into account
• Optionally, other metrics (like load) can be measured and utilized for a better selection

HTTP Redirection – The Idea

• Leveraging the HTTP redirect function: HTTP return code 302
• Proper site selection made after the initial DNS request has been resolved, via redirection
• Mainly used as a method of providing site persistence while providing local server farm failure recovery
• Can be used with the “Location Cookie” feature of the CSS to provide redirection after wrong site selection
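The redirect idea can be sketched with Python's standard `http.server` module. This is a minimal illustration, not how the CSS implements it: the `SITES` table, the `pick_site` helper and the first-healthy-site policy are assumptions made for the example, and the health flags would normally come from keepalive probes. The host names are the ones shown in the deck's traffic-flow slide.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical per-site base URLs and health flags (assumed for illustration).
SITES = {
    "http://www1.cisco.com": True,
    "http://www2.cisco.com": True,
}


def pick_site(sites):
    """Return the first site marked healthy, or None if every site is down."""
    for base_url, healthy in sites.items():
        if healthy:
            return base_url
    return None


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        target = pick_site(SITES)
        if target is None:
            self.send_error(503, "No site available")
            return
        self.send_response(302)                       # HTTP return code 302
        self.send_header("Location", target + self.path)
        self.end_headers()


def serve(port=8080):
    """Start the redirector; it blocks until interrupted, so call it explicitly."""
    HTTPServer(("", port), RedirectHandler).serve_forever()
```

Because the client follows the `Location` header to a site-specific name, later requests go straight to that site, which gives the persistence described above.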

HTTP Redirection – Traffic Flow

[Diagram: traffic flow between the client, http://www.cisco.com/, http://www1.cisco.com/ and http://www2.cisco.com/.]

Advantages of the HTTP Redirection Approach

• Can be implemented without any other GSLB devices or mechanisms
• Inherent persistence to the selected location
• Can be used in conjunction with other methods to provide more sophisticated site selection

Limitations of the HTTP Redirection Approach

• It is protocol specific: relies on HTTP
• Requires redirection to fully qualified additional names: additional DNS records
• Users may bookmark a specific location, losing automatic failover
• HTTPS redirect requires the full SSL handshake to be completed first

DNS-Based Site Selection – The Idea

• The client D-proxy (local name server) performs iterative queries
• The device which acts as “site selector” is the authoritative name server for the domain(s) distributed in multiple locations
• The “site selector” sends keepalives to servers or server load balancers in the local and remote locations
• The “site selector” selects a site for the name resolution, according to the pre-defined answers and site load balance method
• The user traffic is sent to the selected location
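The selection step can be sketched as follows. This is a simplified model under stated assumptions, not a DNS server: the `Site` and `SiteSelector` names, the round-robin balance method and the documentation-range addresses are all hypothetical, and the `alive` flag stands in for the result of the keepalives.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Site:
    name: str
    vip: str        # address that would be returned in the A record
    alive: bool     # result of the latest keepalive (assumed to be set elsewhere)


class SiteSelector:
    """Answers a name with the VIP of a live site, rotating round-robin."""

    def __init__(self, answers):
        self.answers = answers      # hostname -> list of Site (pre-defined answers)
        self._turn = count()

    def resolve(self, hostname):
        live = [s for s in self.answers.get(hostname, []) if s.alive]
        if not live:
            return None             # no live site to answer with
        return live[next(self._turn) % len(live)].vip


dc1 = Site("Data Center 1", "192.0.2.10", alive=True)
dc2 = Site("Data Center 2", "198.51.100.10", alive=True)
selector = SiteSelector({"www.cisco.com": [dc1, dc2]})

first = selector.resolve("www.cisco.com")
second = selector.resolve("www.cisco.com")
dc1.alive = False                   # keepalive to Data Center 1 fails
after_failure = selector.resolve("www.cisco.com")
print(first, second, after_failure)
```

While both sites are alive the answers alternate. Once Data Center 1 fails its keepalive, every answer points at Data Center 2.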

DNS-Based Site Selection – Traffic Flow

[Diagram: ten numbered steps. The client requesting http://www.cisco.com/ queries its DNS proxy; the proxy queries the root name server / authoritative name server for .com, the authoritative name server for cisco.com, and the authoritative name server for www.cisco.com (UDP:53); the client then connects over TCP:80 to Data Center 1 or Data Center 2.]

Advantages of the DNS Approach

• Protocol independent: works with any application that uses name resolution
• Minimal configuration changes in the current IP and DNS infrastructure (DNS authoritative server)
• Implementation can be different for specific host names
• A-records can be changed on the fly
• Can take load or data center size into account
• Can provide proximity

Limitations of the DNS-Based Approach

• Visibility limited to the D-proxy (not the client)
• Cannot guarantee 100% session persistence
• DNS caching in the D-proxy
• DNS caching in the client application
• Order of multiple A-record answers can be altered by D-proxies

Route Health Injection – The Idea

• Server and application health monitoring provided by local Server Load Balancers
• SLB can advertise or withdraw the VIP address to upstream routing devices depending on the availability of the local server farm
• Same VIP addresses can be advertised from multiple data centers: IP Anycast
• Relying on L3 routing protocols for route propagation and content request routing
• Disaster Recovery provided by network convergence
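The advertise/withdraw behavior can be modeled in a few lines. This is a toy model rather than a routing protocol: the `Router` class, the `health_update` helper, the numeric costs and the VIP value are assumptions for illustration. Only the preferred/backup roles of Location B and Location A follow the deck's implementation slide.

```python
class Router:
    """Keeps the advertised routes for each VIP and forwards to the lowest cost."""

    def __init__(self):
        self.routes = {}    # vip -> {location: cost}

    def advertise(self, vip, location, cost):
        self.routes.setdefault(vip, {})[location] = cost

    def withdraw(self, vip, location):
        self.routes.get(vip, {}).pop(location, None)

    def next_hop(self, vip):
        candidates = self.routes.get(vip)
        if not candidates:
            return None     # no data center is advertising the VIP
        return min(candidates, key=candidates.get)


def health_update(router, vip, location, cost, farm_healthy):
    """What an SLB would do after each health check of its local server farm."""
    if farm_healthy:
        router.advertise(vip, location, cost)
    else:
        router.withdraw(vip, location)


VIP = "192.0.2.80/32"       # hypothetical host route
router = Router()
health_update(router, VIP, "Location A", cost=1000, farm_healthy=True)  # backup
health_update(router, VIP, "Location B", cost=10, farm_healthy=True)    # preferred
before = router.next_hop(VIP)
health_update(router, VIP, "Location B", cost=10, farm_healthy=False)   # farm fails
after = router.next_hop(VIP)
print(before, after)
```

Traffic for the VIP goes to Location B while its server farm is healthy. When the route is withdrawn the only remaining advertisement is Location A, so the router falls back to it without any change on the client side.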

Route Health Injection – Implementation

[Diagram: Client A and Client B reach VIP x.y.w.z through Routers 10, 11, 12 and 13. Location B is the preferred location for the VIP and is advertised with low cost; Location A is the backup location and is advertised with very high cost.]

Advantages of the RHI Approach

• Supports legacy applications and does not rely on a DNS infrastructure
• Very good re-convergence time, especially in intranets where L3 protocols can be fine-tuned appropriately
• Protocol-independent: works with any application
• Robust protocols and proven features

Limitations of the RHI Approach

• Relies on host routes (32 bits), which cannot be propagated all over the Internet (more on this later)
• Requires tight integration between the application-aware devices and the L3 routers
• Inability to intelligently load balance among the data centers


Page 52: Data Center Business Continuance and Disaster Recovery · Business Continuance Is More Critical than Ever 75% of IT decision-makers have altered Disaster Recovery/Business Continuance

Cluster Overview

A cluster is two or more servers configured to appear as one

Two types of clustering: Load Balancing (LB) and High Availability (HA)

Clustering provides benefits for availability, reliability, scalability, and manageability

LB clustering: multiple copies of the same application against the same data set, usually read only

HA clustering: multiple copies of a long-running application that requires access to a common data repository, usually read and write

[Diagram: server tiers shown as Web Servers, Application Servers, and Database Servers]


HA Cluster Connections

Public Network (typically Ethernet) for client/application requests

Servers with the same hardware, OS, and application software

Private Network (typically Ethernet) for interconnection between nodes. Could be a direct connection, or optionally go through the public network

Storage Disk (typically Fibre Channel): shared storage array, NAS or SAN


Typical HA Cluster Components

Application software that is clustered to provide High Availability. Example: Microsoft Exchange, SQL, Oracle database, File and Print Services

Operating System that runs on the server hardware. Example: Microsoft Windows 2000 or 2003, Linux (and the other flavors of UNIX), IBM VMS or z/OS (for mainframe)

Cluster Software that provides the HA clustering service for the application. Example: Microsoft MSCS, EMC AutoStart (Legato), Veritas Cluster Server, HP TruCluster and OpenVMS

Optionally, Cluster Enabler, software that synchronizes the cluster software with the storage disk array software


Basic HA Cluster Design

Active/Standby:
– Active node takes client requests and writes to the data
– Standby takes over when it detects a failure on the active node
– Two-node or multi-node

Active/Active:
– Database requests load balanced to both nodes
– Lock mechanism ensures data integrity
– Most scalable design


File System Approaches for HA Clusters

Shared Everything
– Equal access to all storage
– Each node mounts all storage resources
– Provides a single layout reference system for all nodes
– Changes updated in the layout reference

Shared Nothing
– Traditional file system with peer-to-peer communication
– Each node mounts only its “semi-private” storage
– Data stored on the peer system’s storage is accessed via the peer-to-peer communication
– Failed node’s storage needs to be mounted by the peer


Geo-clusters

Geo-cluster: a cluster that spans multiple data centers

[Diagram: node1 in the Local Datacenter and node2 in the Remote Datacenter, connected over the WAN. Disk replication between the sites is synchronous or asynchronous, and the replication path is annotated 2 x RTT]
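
The 2 x RTT annotation can be turned into a rough per-write latency estimate. The sketch below assumes a one-way propagation delay of about 5 microseconds per kilometer of fiber and ignores switch and array processing time; the function names are illustrative only.

```python
# Rough estimate of the latency that synchronous replication adds per write.
# Assumptions (not from the slides): ~5 microseconds per km one-way
# propagation delay in fiber, no switch or array processing delay.
ONE_WAY_US_PER_KM = 5.0
ROUND_TRIPS_PER_WRITE = 2  # the "2 x RTT" shown on the slide


def rtt_us(distance_km: float) -> float:
    """Round-trip propagation time in microseconds."""
    return 2 * distance_km * ONE_WAY_US_PER_KM


def sync_write_penalty_us(distance_km: float) -> float:
    """Added latency per synchronous write, in microseconds."""
    return ROUND_TRIPS_PER_WRITE * rtt_us(distance_km)


if __name__ == "__main__":
    for km in (10, 50, 200, 400):
        print(f"{km:>4} km: +{sync_write_penalty_us(km) / 1000:.1f} ms per write")
```

Under these assumptions the penalty grows linearly with distance, which is why synchronous replication is usually confined to metro distances.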


Considerations for HA Clusters

Split Brain: cluster partitioning when nodes cannot communicate with each other but are equally capable of forming a cluster and mounting disks.

Extended L2 required in most implementations for:
– Public Network, since the client only knows about the Virtual IP address
– Private Network, used for heartbeats

Storage:
– Direct Attached Storage (DAS) cannot be used
– Shared Disk needs to be visible to both nodes
– Needs to interface with cluster software for disk failover, zoning, LUN masking when there is a node failure


Split-Brain

Split-brain happens when all of the network communication links between two or more cluster nodes fail.

Both nodes could potentially go active and concurrently access the disk, thus corrupting data

[Diagram: node1 and node2 both accessing the shared disk, labeled Data Corruption]


Resolution for Split Brain: Quorum

A quorum device serves as a tie breaker to arbitrate which system has access to resources.

The quorum ensures that even if there is no communication between the nodes, only one node can continue to access the disk.

Only the node that owns the quorum (or the majority of quorum votes) can bring resources online.

Any resource can be used as the arbitrator to break the tie.

[Diagram: node1 and node2 attached to a quorum disk and an application data disk]
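
The majority-vote rule can be sketched in a few lines. This is a minimal illustration that assumes each node and the quorum device carry one vote each; the function names are hypothetical and do not come from any cluster product.

```python
# Minimal majority-quorum sketch. Each node and the quorum device hold one
# vote; a partition may bring resources online only with a strict majority.
# The names below are hypothetical and not taken from any cluster product.
from typing import Dict, Optional


def has_quorum(votes_held: int, total_votes: int) -> bool:
    """True when a partition holds a strict majority of all votes."""
    return votes_held > total_votes // 2


def surviving_partition(partitions: Dict[str, int],
                        total_votes: int) -> Optional[str]:
    """Return the name of the only partition allowed to stay active, if any."""
    winners = [name for name, votes in partitions.items()
               if has_quorum(votes, total_votes)]
    # At most one partition can hold a strict majority.
    return winners[0] if winners else None


if __name__ == "__main__":
    # Two nodes plus one quorum disk = 3 votes. The interconnect fails and
    # node1 wins the race to reserve the quorum disk.
    print(surviving_partition({"node1": 2, "node2": 1}, total_votes=3))
```

Because a strict majority is required, two partitions can never both pass the check, which is what prevents concurrent disk access.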


Extended Layer 2 Network

In most implementations, a common L2 network is needed for the heartbeat between the nodes, as well as public client access

Extending a VLAN on a geographical basis is not considered best practice because of the impact of broadcasts, multicast, flooding and Spanning-Tree integration issues

[Diagram: node1 in the Local Datacenter and node2 in the Remote Datacenter, joined over the WAN by a public Layer 2 network and a private Layer 2 network. Disk replication is synchronous or asynchronous]


Resolution: L3 Routed Solution

In certain cases an L3 routed solution is possible

Microsoft MSCS
– Requires that the 2 nodes be on the same subnet
– The communication between the 2 nodes is UDP unicast
– Local Area Mobility (LAM) allows the placement of the nodes on 2 different subnets

Veritas VCS
– Allows having nodes with IP addresses in different subnets
– The Virtual Address needs to change when moving from node1 to node2
– DNS can be used to provide name-to-multiple-IP mapping

[Diagram: node1 on subnet 11.20.5.x and node2 on subnet 172.28.210.x, connected by an extended SAN. Disk replication is synchronous or asynchronous]


Storage Disk Zoning

Which storage disk array should node2 be zoned to before and after a failure on node1?

To complete the failover you need to change the zoning configuration

Software is needed to synchronize the Cluster Software with the Disk Array’s software, i.e. Cluster Enabler

[Diagram: node1 (active) and node2 (standby) on an extended SAN, with disk array sym1320 labeled RW and sym1291 labeled RD]


Resolution: Cluster Enabler

The Cluster Enabler (CE) provides the interface between the Clustering Software and the Disk Array’s software

When the Clustering Software detects a failure and wants to fail the node, the Cluster Enabler instructs the Disk Array to perform a failover

Cluster Enabler also allows node1 to be zoned to sym1320 and node2 to be zoned to sym1291

The Cluster Enabler running on each node typically communicates with the Cluster Enabler software running on the remote node with Local Multicast messages

[Diagram: node1 (active) and node2 (standby) on an extended SAN, with disk array sym1320 labeled RW and sym1291 labeled WD]


Agenda

Introduction to Data Center - The Evolution

Data Center Disaster Recovery
– Objectives
– Failure Scenarios
– Design Options

Components of Disaster Recovery
– Site Selection - Front End GSLB
– Server High Availability - Clustering
– Data Replication and Synchronization - SAN Extension

Sample Design


Terminology

Storage subsystem
– Just a bunch of disks (JBOD)
– Redundant array of independent disks (RAID)

Storage I/O devices
– Host Bus Adapter (HBA)
– Small Computer System Interface (SCSI)

Storage protocols
– SCSI
– iSCSI
– FC (FCIP)


Terminology (Cont’d)

Direct Attached Storage (DAS)
– Storage is “local” behind the server
– No storage sharing possible
– Costly to scale; complex to manage

Network Attached Storage (NAS)
– Storage is accessed at a file level over an IP network
– Storage can be shared between servers

Storage Area Networks (SAN)
– Storage is accessed at a block level
– Separation of storage from the server
– High performance interconnect providing high I/O throughput


Storage for Applications

Presentation Tier
– Unrelated small data files commonly stored on internal disks
– Manual distribution

Application Processing Tier
– Transitional, unrelated data
– Small files residing on file systems
– May use RAID to spread data over multiple disks

Storage Tier
– Large, permanent data files or raw data
– Large batch updates, most likely real time
– Log and data on separate volumes


Backup and Replication

Offsite tape vaulting
– Backup tapes stored at offsite location

Electronic vaulting
– Transmission of backup data to offsite location

Remote disk replication
– Continuous copying of data to offsite location
– Transparent to host

Other methods of replication
– Host-based mirroring
– Network-based replication


Replication: Modes of Operation

Synchronous
– All data written to cache of local and remote arrays before I/O is complete and acknowledged to host

Asynchronous
– Write acknowledged after write to local array cache; changes (writes) are replicated to remote array asynchronously

Semi-synchronous
– Write acknowledged with a single subsequent WRITE command pending from remote array
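
The difference between the three modes comes down to how many acknowledged writes may still be missing from the remote array. The toy model below illustrates this; it is a sketch with made-up class and method names, and it does not talk to real storage.

```python
# Toy model of when the host sees a write acknowledged in each mode.
# "local" and "remote" stand in for array caches; nothing here talks to
# real storage, and the class and method names are made up for illustration.
from collections import deque


class ReplicatedVolume:
    def __init__(self, mode: str):
        if mode not in ("sync", "async", "semi-sync"):
            raise ValueError(f"unknown mode: {mode}")
        self.mode = mode
        self.local = []          # writes held in the local array cache
        self.remote = []         # writes held in the remote array cache
        self.pending = deque()   # acknowledged locally, not yet at remote

    def drain(self, keep: int = 0) -> None:
        """Ship pending writes to the remote array until `keep` remain."""
        while len(self.pending) > keep:
            self.remote.append(self.pending.popleft())

    def write(self, block: str) -> None:
        """Return once the host would receive its acknowledgement."""
        self.local.append(block)
        self.pending.append(block)
        if self.mode == "sync":
            self.drain(keep=0)   # remote must hold it before the ack
        elif self.mode == "semi-sync":
            self.drain(keep=1)   # at most one write may still be in flight
        # async: acknowledge immediately; drain() runs later

    def exposure(self) -> int:
        """Acknowledged writes lost if the local site failed right now."""
        return len(self.pending)


if __name__ == "__main__":
    for mode in ("sync", "semi-sync", "async"):
        vol = ReplicatedVolume(mode)
        for i in range(5):
            vol.write(f"block-{i}")
        print(mode, "exposure:", vol.exposure())
```

In this model the exposure after five writes is zero for synchronous, one for semi-synchronous, and five for asynchronous until the background drain runs.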


Synchronous vs. Asynchronous Trade-Off

Synchronous
– Impact to application performance
– Distance limited (are both sites within the same threat radius?)
– No data loss

Asynchronous
– No application performance impact
– Unlimited distance (second site outside threat radius)
– Exposure to possible data loss

Enterprises must evaluate the trade-offs
– Maximum tolerable distance ascertained by assessing each application
– Cost of data loss


Data Replication with DB Example

Control Files identify other files making up the database and record the content and state of the db.
– DB name
– Creation date
– Backup performed
– Redo log time period
– Datafile state

Datafile is only updated periodically
– Tablespaces
– Indexes
– Data Dictionary

Redo logs record db changes resulting from transactions
– Used to play back changes that may not have been written to the datafile when a failure occurred
– Typically archived as they fill to local and DR site destinations

[Diagram: Control Files identify the Datafiles and the Redo Log Files; the Redo Log Files record changes to the Datafiles]


Data Replication with DB Example (Cont’d)

Failure or disaster occurs at time t1
• Media Failure (e.g. disk)
• Human Error (datafile deletion)
• Database Corruption

Database restored to state at time of failure (time t1) by:

1. Restoring Control Files & Datafiles from last Hot Backup (time t0)
2. Sequentially replaying changes from subsequent Redo Logs (archived and online) – changes made between time t0 and t1

[Diagram: timeline from t0 to t1. A Hot Backup of Datafiles and Control Files is taken at time t0, followed by Archived Redo Logs and then Online Redo Logs up to t1]
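
The two recovery steps can be sketched as restore-then-replay. The record layout below is invented for illustration and is not a real database format; it only shows how changes committed between t0 and t1 are reapplied in order.

```python
# Sketch of point-in-time recovery: restore the last hot backup (t0), then
# replay redo records in order up to the failure time (t1). The record
# layout is invented for illustration and is not a real database format.
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class RedoRecord:
    timestamp: int   # when the change was committed
    key: str         # row or block identifier
    value: str       # new contents


def recover(backup: Dict[str, str], backup_time: int,
            redo_log: List[RedoRecord], failure_time: int) -> Dict[str, str]:
    """Rebuild the datafile state as of failure_time."""
    datafile = dict(backup)                       # step 1: restore from t0
    for rec in sorted(redo_log, key=lambda r: r.timestamp):
        if backup_time < rec.timestamp <= failure_time:
            datafile[rec.key] = rec.value         # step 2: replay t0..t1
    return datafile


if __name__ == "__main__":
    log = [RedoRecord(12, "a", "2"), RedoRecord(15, "b", "9")]
    print(recover({"a": "1"}, backup_time=10, redo_log=log, failure_time=20))
```

Records older than the backup are skipped because the backup already contains them, and records after t1 are ignored.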


Data Replication with DB Example (Cont’d)

Mixture of sync and async replication technologies commonly used
– Usually only redo logs sync replicated to remote site
– Archive logs created from redo log and copied when redo log switches
– Point in time (PiT) copies of datafiles and control files copied periodically (e.g. nightly)

[Diagram: Primary Site and Secondary Site linked by a SAN Extension Transport. Redo Logs (cyclic) hold a copy of every committed transaction and are synchronously replicated for zero loss. Archive Logs are replicated/copied. A point-in-time copy of the Database, taken when the DB is quiescent (database copy at time t0), is replicated/copied. Earlier DB backups are also shown]


Data Center Interconnection Options

[Diagram: two data centers, each reached from the Internet and built from stateful firewalls, intrusion detection, server load balancing, content caching, a high-density multilayer LAN switch, front-end and back-end application servers, a high-density multilayer SAN director, and enterprise-class storage arrays. The sites are interconnected over SONET/SDH, DWDM/CWDM, and IP/Metro Ethernet]


Data Center Transport Options

Distance increases from Data Center to Campus, Metro, Regional, and National.

Optical
– Dark Fiber: Sync; limited by optics (power budget)
– CWDM: Sync (2Gbps); limited by optics (power budget)
– DWDM: Sync (2Gbps lambda); limited by BB_Credits
– SONET/SDH: Sync (1Gbps+ subrate), Async

IP
– MDS9000 FCIP: Sync (Metro Eth), Async (1Gbps+)
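
The "Limited by BB_Credits" note can be illustrated with a rough calculation: to keep a Fibre Channel link full, the sender needs enough buffer-to-buffer credits to cover one round trip of full-size frames. The sketch below is an approximation under stated assumptions (about 5 microseconds per km in fiber, a 2148-byte full-size frame, 8b/10b encoding); real fabrics add margin, and the names are illustrative.

```python
# Rough estimate of the buffer-to-buffer credits needed to keep a Fibre
# Channel link full over distance. Assumptions (not from the slides):
# ~5 us/km one-way in fiber, full-size 2148-byte frames, 8b/10b encoding.
import math

ONE_WAY_US_PER_KM = 5.0
FRAME_BYTES = 2148
BITS_PER_BYTE_ON_WIRE = 10                      # 8b/10b: 10 line bits per byte
LINE_RATE_GBAUD = {"1G": 1.0625, "2G": 2.125}   # 1GFC and 2GFC line rates


def frame_time_us(speed: str) -> float:
    """Time to serialize one full-size frame, in microseconds."""
    bits = FRAME_BYTES * BITS_PER_BYTE_ON_WIRE
    return bits / (LINE_RATE_GBAUD[speed] * 1e3)  # Gbaud -> bits per us


def credits_needed(distance_km: float, speed: str) -> int:
    """Frames that must be in flight to cover one round trip."""
    rtt = 2 * distance_km * ONE_WAY_US_PER_KM
    return math.ceil(rtt / frame_time_us(speed))


if __name__ == "__main__":
    for speed in ("1G", "2G"):
        print(speed, "over 100 km:", credits_needed(100, speed), "credits")
```

This matches the common rule of thumb of roughly one credit per 2 km at 1 Gbps and one per km at 2 Gbps for full-size frames.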


Data Center Replication with SAN Extension

Extend the normal reach of a Fibre Channel fabric
– Replication
– Remote host to target array
– Shared data clusters

[Diagram: two FC fabrics joined by a SAN Extension Network, carrying replication and shared data cluster or remote host access to storage]


SAN Design for Data Replication

Servers with two Fibre Channel connections to storage arrays for high availability

Use of multipath software is required in dual fabric host design

SAN extension fabrics typically separate from host access fabrics

Replication fabric requirements generally specified by array vendor

[Diagram: Site A and Site B, each with FC server access fabrics and FC replication fabrics, joined by a DC Interconnect Network]


Data Center Disaster Recovery Sample Design


Disaster Impact Radius

Disasters are characterized by their impact
– Local, metro, regional, global
– Fire, flood, earthquake, attack

Is the backup site within the threat radius?

[Diagram: concentric impact radii around the Primary Data Center: Local (1–2 km), Metro (< 50 km), Regional (< 400 km), and Global, with a Secondary Data Center and a DR Site placed at increasing distances]
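
The threat-radius question can be expressed as a simple check against the distance bands on the slide. This is a sketch only: the handling of boundary values and the function names are assumptions.

```python
# Classify a backup site by the smallest impact radius that contains it,
# using the distance bands from the slide. Treating each bound as inclusive
# is an assumption; the function names are illustrative.
BANDS = [            # (upper bound in km, label)
    (2, "local"),
    (50, "metro"),
    (400, "regional"),
]
ORDER = ["local", "metro", "regional", "global"]


def impact_band(distance_km: float) -> str:
    """Smallest disaster scope that would reach a site at this distance."""
    for limit_km, label in BANDS:
        if distance_km <= limit_km:
            return label
    return "global"


def survives(distance_km: float, disaster: str) -> bool:
    """True if a site at distance_km lies outside the given disaster scope."""
    return ORDER.index(impact_band(distance_km)) > ORDER.index(disaster)


if __name__ == "__main__":
    for km in (1, 30, 300, 1000):
        print(km, "km:", impact_band(km),
              "| survives metro disaster:", survives(km, "metro"))
```

A site 300 km away falls in the regional band, so it survives a metro-scale event but not a regional one.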


Active/Standby Architecture - Today

[Diagram: High Availability Site 1 (CA), High Availability Site 2 (CA), and Disaster Recovery Site (NC), with Hosts 1–3, Storage 1–3, MDS 9509 switches, and MDS 9509 gateways. Other labels: HA Cluster(s), Electronic Journaling, Bunker, Synch CWDM Replication, Synch FCIP Replication, Asynchronous FCIP Replication, Dual OC12]


Frame Based Replication

[Diagram: Production Cluster in Data Center 1 and D/R in Data Center 2, each with MDS switches and EMC/DMX arrays, linked by dual OC12. Labels: SRDF, SRDF/A, TimeFinder, R1, R2, BCV, PROD, D/R, Redo, Arch, PiT, Triple Threat]


Active/Active Architecture - Tomorrow

[Diagram: a user request passes through the Service Locator, Presentation Layer, and Clustered Backend across data centers DC1 and DC2]
– GSS performs site (DC) selection according to a pre-configured condition, using the FQDN
– ACE decrypts the request
– ACE routes the request
– ACNS (Content Engine) caches pages
– ACE probes track application health
– Requests are directed to the primary application or to the backup application
– Clustered backend: X active and Y standby at one site, Y active and X standby at the other, with active and standby copies of Data X and Data Y
– Labels between the sites: Mirror, Asynchronous Replication


SANTap and Continuous Data Protection

• SANTap
– Appliance-based storage replication
– Reliable copy of WRITE operations
– SCSI-FCIP communication

• Continuous Data Protection
– Automatic and continuous backups
– Time Addressable Storage (TAS)
– Any point-in-time recovery
– Application based or network based

[Diagram: production servers attached to an MDS SAN with primary and secondary storage; SANTap feeds a CDP appliance]


Fabric Based Replication with CDP

[Diagram: Production Cluster in Data Center 1 and D/R in Data Center 2, each with an MDS switch, an EMC/DMX array, and a Replication/CDP appliance with TAS/SATA storage, linked by dual OC12. Labels: SANTap, SRDF/A, BCV, PROD, D/R, Redo, Arch, APiT]


End-to-End Data Center Resilience

[Diagram: GSS-1 and GSS-2 alongside corporate DNS; ACE-1, ACE-2, and ACE-3 in front of web/app servers, DB servers, and server farms in DC-1, DC-2, and DC-3; a primary location and a secondary location connected by an IP/optical network, with FC storage extended over CWDM/DWDM]


Summary - Design Details

Data centers 1 and 2 are in the primary location, close enough to each other to provide DC HA for active/active access

Data Center 3 (DR) is farther than the tolerable disaster radius away from primary DCs 1 and 2

Web/App server farms are load balanced geographically

DB servers are within a geo-HA cluster and running in an L3 design

Synchronous data replication between data centers within the primary location

Asynchronous data replication is done between the primary and secondary storage systems
