High Availability Design
Ram Dantu
Slides are adapted from various sources from Cisco and Interwork Inc.
Agenda
• Definitions
• Concepts / Calculations
• Examples
• Challenges
Availability as a percentage
1 year = 525,960 minutes
Availability    Unscheduled downtime per year
99%             3 days 15 hours 36 minutes
99.9%           8 hours 45 min.
99.99%          52 min. 36 sec.
99.999%         5 min. 15 sec.
99.9999%        32 sec.
Getting downtime from availability
1 year = 525,960 minutes
Uptime = Availability × Time
Annual Uptime = Availability × 525,960
Annual Downtime = 525,960 – Annual Uptime
---------------------------------------------------------
Availability = 0.9999
Annual Uptime = 0.9999 × 525,960
Annual Uptime = 525,907.404
Annual Downtime = 525,960 – 525,907.404 = 52.596 minutes
---------------------------------------------------------
Downtime = (1 – Availability) × Time
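A minimal Python sketch of the downtime formula above, assuming the 525,960-minute year used on the slide. The function names are illustrative and not part of the original slides. Note that the 99% row of the downtime table matches a 365-day year (525,600 minutes), so this sketch gives a slightly larger figure for that row.

```python
MINUTES_PER_YEAR = 525_960  # 365.25 days, the figure used on the slide


def annual_downtime_minutes(availability: float,
                            period_minutes: float = MINUTES_PER_YEAR) -> float:
    """Downtime = (1 - Availability) * Time, with availability as a fraction."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be a fraction between 0 and 1")
    return (1.0 - availability) * period_minutes


def format_downtime(minutes: float) -> str:
    """Render a downtime in minutes as days/hours/minutes/seconds."""
    total_seconds = round(minutes * 60)
    days, rem = divmod(total_seconds, 86_400)
    hours, rem = divmod(rem, 3_600)
    mins, secs = divmod(rem, 60)
    return f"{days}d {hours}h {mins}m {secs}s"


if __name__ == "__main__":
    for a in (0.99, 0.999, 0.9999, 0.99999, 0.999999):
        print(f"{a:.4%}", format_downtime(annual_downtime_minutes(a)))
```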
Availability vs. Reliability
Availability = users’ perception
Reliability = individual component failures
Reliability impacts maintenance costs but doesn’t necessarily have to impact availability
Defects Per Million
Availability    DPM
99%             10000
99.9%           1000
99.99%          100
99.999%         10
99.9999%        1
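The table is consistent with DPM = (1 – Availability) × 1,000,000. A short sketch under that assumption (the function names are hypothetical):

```python
def availability_to_dpm(availability: float) -> float:
    """Defects per million for an availability given as a fraction (0..1)."""
    return (1.0 - availability) * 1_000_000


def dpm_to_availability(dpm: float) -> float:
    """Inverse conversion: availability fraction for a given DPM."""
    return 1.0 - dpm / 1_000_000


if __name__ == "__main__":
    for a in (0.99, 0.999, 0.9999, 0.99999, 0.999999):
        print(f"{a:.4%} -> {availability_to_dpm(a):.0f} DPM")
```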
Calculating Availability
Availability = MTBF / (MTBF + MTTR)
Example: 6500 Chassis
MTBF = 369897 hours (about 42 years)
MTTR = 4 hours
Availability = 369897 / ( 369897 + 4 )
= 369897 / 369901
= 0.9999892 = 99.99892%
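The chassis example can be checked with a few lines of Python. This is a sketch only; the function name is illustrative and both inputs are assumed to be in hours.

```python
def availability_from_mtbf(mtbf_hours: float, mttr_hours: float) -> float:
    """Availability = MTBF / (MTBF + MTTR), both in the same time unit."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


if __name__ == "__main__":
    # 6500 chassis figures from the slide: MTBF 369897 h, MTTR 4 h
    a = availability_from_mtbf(369_897, 4)
    print(f"{a:.7f}")
```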
6500 Availability
Module                              Availability
Chassis                             99.99892%
Power Supplies                      99.99873%
Supervisor - including software     99.99516%
GBIC Line Card                      99.99577%
GBIC                                99.99907%
10/100 Line Card                    99.99577%
Availability Formulas
Serial availability
Availability = AvailA × AvailB × AvailC
= 99.999% × 99.999% × 99.999%
= 99.997%
[Diagram: A (99.999%), B (99.999%), and C (99.999%) connected in series]
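A sketch of the serial formula for any number of components. The function name is illustrative, and `math.prod` requires Python 3.8 or later.

```python
from math import prod


def serial_availability(*components: float) -> float:
    """All components must be up: multiply their availabilities (fractions)."""
    if not components:
        raise ValueError("at least one component is required")
    return prod(components)


if __name__ == "__main__":
    # Three 99.999% components in series, as on the slide
    print(f"{serial_availability(0.99999, 0.99999, 0.99999):.5%}")
```

Each component added in series can only lower the result, since every factor is at most 1.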
Availability Formulas
Parallel availability
Availability = 1 – ((1 – AvailA) × (1 – AvailB))
= 1 – ((1 – 99.9%) × (1 – 99.9%))
= 99.9999%
[Diagram: A (99.9%) and B (99.9%) connected in parallel]
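The parallel formula extends to any number of redundant components. The sketch below assumes the components fail independently, which the formula itself relies on; the function name is illustrative.

```python
from math import prod


def parallel_availability(*components: float) -> float:
    """The group is down only if every component is down at the same time.

    Assumes independent failures; availabilities are fractions (0..1).
    """
    if not components:
        raise ValueError("at least one component is required")
    return 1.0 - prod(1.0 - a for a in components)


if __name__ == "__main__":
    # Two 99.9% components in parallel, as on the slide
    print(f"{parallel_availability(0.999, 0.999):.4%}")
```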
Availability Formulas
Parallel-series availability
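[Diagram: three parallel pairs (A and B, C and D, E and F) connected in series; each component 99.9%]
Each parallel pair = 1 – ((1 – 99.9%) × (1 – 99.9%)) = 99.9999%
Availability = 99.9999% × 99.9999% × 99.9999%
= 99.9997%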
Availability Formulas
Series-parallel availability
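[Diagram: two series chains of three components each (A to F, each 99.9%), with the two chains in parallel]
Each series chain = 99.9% × 99.9% × 99.9% = 99.7%
Availability = 1 – ((1 – 99.7%) × (1 – 99.7%))
= 99.9991%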
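The two layouts use the same six 99.9% components but compose them differently. A small sketch comparing them, assuming independent failures (the helper names `serial` and `parallel` are illustrative):

```python
from math import prod


def serial(*a: float) -> float:
    """Availability of components in series."""
    return prod(a)


def parallel(*a: float) -> float:
    """Availability of redundant components in parallel."""
    return 1.0 - prod(1.0 - x for x in a)


A = 0.999  # every component is 99.9% available

# Parallel-series: three redundant pairs, chained in series
parallel_series = serial(parallel(A, A), parallel(A, A), parallel(A, A))

# Series-parallel: two chains of three, with the chains in parallel
series_parallel = parallel(serial(A, A, A), serial(A, A, A))

if __name__ == "__main__":
    print(f"parallel-series: {parallel_series:.4%}")
    print(f"series-parallel: {series_parallel:.4%}")
```

Adding redundancy at each stage gives the higher figure here, because a single failure in one chain does not take out the rest of that path.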
[Diagram: hierarchical network with Core, Distribution, and Access layers]
One Core / Distribution 6500
Chassis                       99.99892%
Dual power                    99.99999%
Supervisor + software         99.99516%
Dual GBIC Line Cards          99.99999%
Dual GBICs                    99.99999%
Single switch availability    99.99405%
Pair of Core / Distribution 6500’s
Series-Parallel Availability
99.99405% series availability each
Two switches in parallel
1 – ((1 – 99.99405%) × (1 – 99.99405%))
= 99.999999%
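The single-switch and switch-pair figures can be reproduced from the module values listed for the core / distribution switch. This is a sketch that takes those percentages as given and assumes the two switches fail independently; the dictionary keys are labels only.

```python
from math import prod

# Module availabilities for one core / distribution 6500 (from the slide)
single_switch_modules = {
    "chassis": 0.9999892,
    "dual power": 0.9999999,
    "supervisor + software": 0.9999516,
    "dual GBIC line cards": 0.9999999,
    "dual GBICs": 0.9999999,
}

# All modules are needed, so they combine in series
single_switch = prod(single_switch_modules.values())

# Two identical switches in parallel
switch_pair = 1.0 - (1.0 - single_switch) ** 2

if __name__ == "__main__":
    print(f"single switch: {single_switch:.5%}")
    print(f"switch pair:   {switch_pair:.7%}")
```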
Access Layer 6500
Chassis                       99.99892%
Dual power                    99.99999%
Dual Supervisors              99.99999%
Dual GBIC Line Cards          99.99999%
Dual GBICs                    99.99999%
10/100 Line Card              99.99577%
Switch availability           99.99888%
Access port availability      99.99465%
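A sketch of how the access-layer figures follow from the module values: the redundant modules and the chassis give the switch availability, and the single 10/100 line card is then added in series to get the per-port figure. The annual downtime line is an added illustration that assumes the 525,960-minute year used earlier in the deck.

```python
from math import prod

MINUTES_PER_YEAR = 525_960

# Shared modules of the access layer 6500 (from the slide)
shared_modules = [
    0.9999892,  # chassis
    0.9999999,  # dual power
    0.9999999,  # dual supervisors
    0.9999999,  # dual GBIC line cards
    0.9999999,  # dual GBICs
]
line_card_10_100 = 0.9999577  # single, non-redundant line card

switch = prod(shared_modules)
access_port = switch * line_card_10_100
port_downtime_minutes = (1.0 - access_port) * MINUTES_PER_YEAR

if __name__ == "__main__":
    print(f"switch:      {switch:.5%}")
    print(f"access port: {access_port:.5%}")
    print(f"port downtime per year: {port_downtime_minutes:.1f} minutes")
```

The non-redundant line card accounts for most of the access port's unavailability, which is why the slide singles out the access layer as a challenge.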
Challenges
[Figure: switch front panels showing WS-X6K-SUP1-2GE supervisor and WS-X6408-GBIC 8-port Gigabit Ethernet modules]
• Improving availability at the access layer
  – “NIC Teaming” in servers
  – Reduce MTTR
MTBF = 369897 hours
Availability with MTTR of 4 hours = 99.99892%
Availability with MTTR of 2 hours = 99.99945%
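The effect of MTTR can also be read in the other direction: rearranging Availability = MTBF / (MTBF + MTTR) gives the largest MTTR that still meets a target, MTTR = MTBF × (1 – A) / A. A sketch of both directions, with illustrative function names and the 99.999% target chosen only as an example:

```python
MTBF_HOURS = 369_897  # chassis MTBF from the slide


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Availability = MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def max_mttr_for_target(mtbf_hours: float, target: float) -> float:
    """Largest MTTR (hours) that still achieves the target availability."""
    if not 0.0 < target <= 1.0:
        raise ValueError("target must be a fraction in (0, 1]")
    return mtbf_hours * (1.0 - target) / target


if __name__ == "__main__":
    for mttr in (4, 2, 1):
        print(f"MTTR {mttr} h -> {availability(MTBF_HOURS, mttr):.7f}")
    print(f"MTTR budget for 99.999%: "
          f"{max_mttr_for_target(MTBF_HOURS, 0.99999):.2f} h")
```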
Challenges
• Long convergence times
  – Spanning tree
      Eliminate layer-2 links where possible
      Avoid layer-2 loops
      Use STP enhancements where appropriate
  – Routing protocols
      Use a link-state (or EIGRP) routing protocol
      Use routing convergence enhancements
      Minimize routing table sizes
      Limit convergence scope