Architecting a Highly Available Infrastructure: An Overview
SLN187
Mark Milow
Mike DiPetrillo
Agenda
  Why?
  Solutions
  Q&A
Quick Stats
  2004 Gartner study found an average cost of $42,000 per hour of downtime
  The average network experiences 175 hours of downtime a year (98% availability)
  That’s $7.35 million in lost revenue each year
  An increase to 99% availability means only $3.68 million
  That’s $3.67 million you can spend on DR
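The arithmetic behind these figures can be reproduced with a short sketch. It assumes an 8,760-hour year and uses the $42,000 hourly figure quoted above; the function names are illustrative, not from the presentation.

```python
HOURS_PER_YEAR = 365 * 24   # 8,760 hours, ignoring leap years (assumption)
COST_PER_HOUR = 42_000      # average hourly downtime cost quoted on the slide


def downtime_hours(availability: float) -> float:
    """Hours of downtime per year for an availability between 0.0 and 1.0."""
    return (1.0 - availability) * HOURS_PER_YEAR


def annual_downtime_cost(availability: float,
                         cost_per_hour: float = COST_PER_HOUR) -> float:
    """Yearly cost of downtime in dollars at the given availability."""
    return downtime_hours(availability) * cost_per_hour


if __name__ == "__main__":
    for availability in (0.98, 0.99):
        hours = downtime_hours(availability)
        cost = annual_downtime_cost(availability)
        print(f"{availability:.0%}: {hours:.1f} hours, ${cost:,.0f} per year")
```

At 98% this works out to 175.2 hours and roughly $7.36 million, which the slide rounds to 175 hours and $7.35 million.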
Quick Math: Amazon.com
  Revenue in 2001: $3.1B/year with 7,774 employees
  Revenue per hour: ~$350,000
  If an outage affects 90% of revenue: ~$320,000
  Assume the average annual salary is $85,000
  $656M/year or $12.5M/week for all staff
  @ 50 hours/week: ~$250,000 per hour
  If an outage affects 80% of employees: ~$200,000
  Total is $520,000 per hour of downtime
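The same estimate can be sketched as two small functions, one for lost revenue and one for lost staff productivity. The function and parameter names are illustrative, and the 52-week year is an assumption. Exact products differ slightly from the rounded slide figures (for example, 7,774 employees at $85,000 is about $661M rather than $656M), so the total lands near $522,000 rather than $520,000.

```python
HOURS_PER_YEAR = 365 * 24  # revenue is assumed to accrue around the clock


def revenue_loss_per_hour(annual_revenue: float, fraction_affected: float) -> float:
    """Revenue lost per hour of outage, given the share of revenue affected."""
    return annual_revenue / HOURS_PER_YEAR * fraction_affected


def productivity_loss_per_hour(employees: int, avg_salary: float,
                               hours_per_week: float, fraction_affected: float,
                               weeks_per_year: int = 52) -> float:
    """Payroll spent per hour on staff who cannot work during the outage."""
    payroll_per_hour = employees * avg_salary / (weeks_per_year * hours_per_week)
    return payroll_per_hour * fraction_affected


if __name__ == "__main__":
    revenue = revenue_loss_per_hour(3.1e9, 0.90)
    staff = productivity_loss_per_hour(7774, 85_000, 50, 0.80)
    print(f"Revenue loss:      ${revenue:,.0f} per hour")
    print(f"Productivity loss: ${staff:,.0f} per hour")
    print(f"Total:             ${revenue + staff:,.0f} per hour")
```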
Why and for What?
  Different levels
    Planned disasters (hurricanes, etc.)
    Unplanned disasters (power outage, tornadoes, etc.)
  High availability
    Plan for local (inside the datacenter) failures
  Disaster recovery
    Plan for regional (the datacenter is gone) failures
Definitions
  MTR: Mean Time to Recover
  CTR: Cost to Recover
  Tiers: Levels of recovery options
Helpful Hints
  Tier data
    Break data into tiers with different MTR commitments
  Networking
    Plan how to get your network to fail over with the data
  People
    Don’t plan on people flying to remote sites
  Automate
    Automate as much of the failover as possible
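As a rough illustration of tiering by MTR commitment, the sketch below picks the least expensive tier that still meets an application's recovery requirement. The tier names, MTR values, and cost rankings are hypothetical placeholders, not figures from the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    mtr_hours: float    # committed mean time to recover for this tier
    relative_cost: int  # hypothetical cost-to-recover ranking (higher = pricier)


# Hypothetical tiers, for illustration only.
TIERS = (
    Tier("tier-1", mtr_hours=0.25, relative_cost=3),
    Tier("tier-2", mtr_hours=4, relative_cost=2),
    Tier("tier-3", mtr_hours=48, relative_cost=1),
)


def assign_tier(required_mtr_hours: float, tiers=TIERS) -> Tier:
    """Return the cheapest tier whose MTR commitment meets the requirement."""
    candidates = [t for t in tiers if t.mtr_hours <= required_mtr_hours]
    if not candidates:
        raise ValueError(f"no tier recovers within {required_mtr_hours} hours")
    return min(candidates, key=lambda t: t.relative_cost)


if __name__ == "__main__":
    for app, mtr in (("email", 1), ("file/print", 8), ("archive", 72)):
        print(f"{app}: {assign_tier(mtr).name}")
```

Choosing the cheapest qualifying tier keeps the cost to recover proportional to how quickly each class of data actually needs to come back.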
Local High Availability
  Features
    Standard clustering agents
    Solutions for non-cluster-aware apps
    Stateful and non-stateful failover
    Failover and fail-back
  Product vendors
    Legato, Microsoft, Veritas, SteelEye, Linux
  Deployment scenarios
    Physical to virtual
    Virtual to virtual
    Poor man’s
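Failover and fail-back can be pictured as a small heartbeat-driven state machine. The sketch below is a generic illustration, not how any of the listed clustering products work: the class name, node names, and missed-heartbeat threshold are assumptions, and failing back as soon as the primary answers again is a deliberate simplification.

```python
class FailoverPair:
    """Tracks which node of a primary/standby pair is serving (illustrative)."""

    def __init__(self, primary: str, standby: str, max_missed: int = 3):
        self.primary = primary
        self.standby = standby
        self.max_missed = max_missed  # heartbeats missed before failing over
        self.missed = 0
        self.active = primary

    def observe(self, primary_alive: bool) -> str:
        """Record one heartbeat interval and return the active node's name."""
        if primary_alive:
            self.missed = 0
            if self.active != self.primary:
                self.active = self.primary   # fail-back to the primary
        else:
            self.missed += 1
            if self.missed >= self.max_missed and self.active == self.primary:
                self.active = self.standby   # failover to the standby
        return self.active


if __name__ == "__main__":
    pair = FailoverPair("physical-server", "standby-vm")
    for alive in (True, False, False, False, False, True):
        print(alive, "->", pair.observe(alive))
```

A real deployment would usually add a delay or manual approval before fail-back so that a flapping primary does not cause repeated switches.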
[Diagram: Physical to Virtual Clustering. Primary servers: three 1U, 2-way rack servers running MS Exchange on Windows 2000, File/Print on Windows NT, and an Intranet App Server on Windows 2000. Failover server: a 4U, 8-way rackmount with ESX Server hosting the same three workloads as virtual machines. Data on shared disks, arrays, or SAN storage.]
[Diagram: Virtual to Virtual Clustering. ESX Server 1 and ESX Server 2 (HP 4-way and Dell 4-way rackmounts) with virtual machines VM1 through VM6 and shared disks, arrays, or SAN storage.]
[Diagram: Poor Man’s Clustering. Virtual machines VM1 through VM6, each marked ON or OFF.]
Local High Availability
  Considerations
    Cost
    MTR
    Number of virtual machines
    Disk space
  Benefits
    Out-of-the-box solution
    Stateful failover
    Inexpensive
    For any application
OS Based Solutions
  Features
    Regular agents
    Very similar to physical environment
    Efficiencies from virtual machine architecture
  Product vendors
    Legato, Symantec, Veritas
  Deployment scenarios
    Physical to virtual
    Virtual to virtual
    Trunk of car
[Diagram: Physical to Virtual. Backup server and tape array.]
[Diagram: Virtual to Virtual. Backup server and tape array.]
[Diagram: Trunk of Car.]
OS Based Solutions
  Considerations
    Cost
    Number of virtual machines
    Bandwidth
    Disk space
  Benefits
    Standard solution
    No learning curve
    Great reduction in agent cost
Host Based Solutions
  Features
    “Agent” based, runs as a service
    File/byte level replication
    Synchronous and asynchronous
    Failover and fail-back
  Product vendors
    Legato, NSI Double-Take, NeverFail, Mimix
  Deployment scenarios
    Physical to virtual
    Virtual to virtual
    Multi-node
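The difference between synchronous and asynchronous replication can be shown with a toy in-memory model. This is a conceptual sketch only: the class and method names are hypothetical and do not correspond to any of the products listed above.

```python
from collections import deque


class Replicator:
    """Toy model of a primary copy and a replica at the failover site."""

    def __init__(self, synchronous: bool):
        self.synchronous = synchronous
        self.primary = {}
        self.replica = {}
        self.pending = deque()  # writes not yet applied to the replica

    def write(self, key: str, value: bytes) -> None:
        """Apply a write; synchronous mode updates the replica before returning."""
        self.primary[key] = value
        if self.synchronous:
            self.replica[key] = value
        else:
            self.pending.append((key, value))

    def drain(self) -> int:
        """Ship queued writes to the replica and return how many were applied."""
        applied = 0
        while self.pending:
            key, value = self.pending.popleft()
            self.replica[key] = value
            applied += 1
        return applied

    def unreplicated_writes(self) -> int:
        """Writes that would be lost if the primary site failed right now."""
        return len(self.pending)


if __name__ == "__main__":
    for mode in (True, False):
        r = Replicator(synchronous=mode)
        r.write("mailbox.db", b"block-1")
        r.write("mailbox.db", b"block-2")
        print("synchronous" if mode else "asynchronous",
              "writes at risk:", r.unreplicated_writes())
```

Synchronous mode never leaves writes at risk but makes every write wait on the remote site, which is why distance and bandwidth matter when choosing between the two.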
[Diagram: Virtual to Virtual. Primary site and failover site.]
[Diagram: Virtual to Virtual Multi-Node. Primary site and failover site.]
Host Based Solutions
  Considerations
    Cost
    Distance
    Number of virtual machines
    Bandwidth
  Benefits
    Out-of-the-box solution
    Maximum uptime
    Ease of use
SAN Based Solutions
  Features
    SAN layered applications
    LUN snapshot
    Replication/mirroring
    Block level replication
    Synchronous and asynchronous
  Product vendors
    EMC, HP, IBM, Network Appliance
  Deployment scenarios
    Physical to virtual
    Virtual to virtual
    Hybrid
[Diagram: SAN: Virtual to Virtual. Primary site and failover site, with a host agent for replication and failover.]
[Diagram: Hybrid Mode Backup. Primary site and failover site, with SAN replication and a host agent for virtual machine and application failover.]
SAN Based Solutions
  Considerations
    Cost
    Distance
    Bandwidth
    Downtime
    Complex configuration
  Benefits
    High performance
    Centralization
    Multiple working copies
Questions