Eric Severson, CCNP, CCDP, MCSE
Network Specialties, Inc.
eric@network-specialties.com
(817) 491-0267

Agenda

Availability

Uptime         Uptime (%)   Maximum Downtime per Year*
Six nines      99.9999%     31.5 seconds
Five nines     99.999%      5 minutes 15 seconds
Four nines     99.99%       52 minutes 33 seconds
Three nines    99.9%        8 hours 46 minutes
Two nines      99.0%        87 hours 36 minutes
One nine       90.0%        36 days 12 hours

* Unscheduled downtime

Design for Availability

Availability of a Single Component

Availability = MTBF/(MTBF+MTTR)

Example:
MTBF = 120,000 hr
MTTR = 4 hr
Availability = 120,000 / 120,004 = 0.999967 = 99.9967%
Annual downtime = 17.5 minutes
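The example above can be checked with a few lines of Python. This is a minimal sketch: the function names are illustrative and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a non-leap year


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Availability = MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def annual_downtime_minutes(avail: float) -> float:
    """Expected unscheduled downtime per year for a given availability."""
    return (1.0 - avail) * MINUTES_PER_YEAR


a = availability(mtbf_hours=120_000, mttr_hours=4)
print(f"Availability:    {a:.4%}")
print(f"Annual downtime: {annual_downtime_minutes(a):.1f} minutes")
```

With the slide's figures this gives roughly 99.9967% and 17.5 minutes per year.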

Availability – Multiple Components

Multiple Components

Availability = Avail(component 1) x Avail(component 2) x … x Avail(component n)

ISP -> router -> firewall -> switch -> server
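As a rough illustration of the serial rule, the sketch below multiplies the availabilities of a chain of components. The per-component figures are hypothetical placeholders, not values from this presentation.

```python
from math import prod


def serial_availability(availabilities) -> float:
    """Availability of components in series: the product of all of them."""
    return prod(availabilities)


# Hypothetical per-component availabilities, for illustration only.
chain = {
    "ISP": 0.999,
    "router": 0.9999,
    "firewall": 0.9999,
    "switch": 0.9999,
    "server": 0.999,
}

total = serial_availability(chain.values())
print(f"End-to-end availability: {total:.4%}")
```

Because every factor is below 1, the end-to-end figure is always lower than the weakest single component.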

Availability – Server System

Availability – Multiple Components

Availability – Other Components

What about A/C power availability?
What about software errors - IOS bugs, application code errors, bad patches or antivirus updates that cause outages?
How about the human fat-finger?

Availability – Power/Software added

Availability – How Can you Improve?

Add redundancy
Reduce repair time
Manage your network…

Availability – With Redundancy

[Diagram: redundant design with two of each component - ISP, router, firewall, switch, server]

Parallel availability: take the same product of availabilities, but for each component that has been made redundant use 1 - ((1 - availability) x (1 - availability)) in place of its single availability.
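The parallel rule can be sketched as follows. The generalization to more than two copies and the assumption that failures are independent are mine, and the component figures are hypothetical.

```python
from math import prod


def parallel_availability(avail: float, copies: int = 2) -> float:
    """Availability of `copies` redundant units, assuming independent failures.

    For copies=2 this is 1 - ((1 - avail) * (1 - avail)).
    """
    return 1.0 - (1.0 - avail) ** copies


# (name, single-unit availability, number of units) - hypothetical values.
design = [
    ("ISP", 0.999, 2),
    ("router", 0.9999, 2),
    ("firewall", 0.9999, 2),
    ("switch", 0.9999, 2),
    ("server", 0.999, 2),
]

without_redundancy = prod(a for _, a, _ in design)
with_redundancy = prod(parallel_availability(a, n) for _, a, n in design)
print(f"Single path:    {without_redundancy:.4%}")
print(f"Redundant path: {with_redundancy:.6%}")
```

Real failures are rarely fully independent (shared power, shared software bugs), so treat the redundant figure as an upper bound.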

Managed Network Characteristics

Systems must be managed towards a common goal
Network must be secure
Infrastructure is thoroughly documented
Equipment must be manageable
Enterprise synchronized to a common time source

Managed Network Characteristics

Logging
SNMP trapping
SNMP polling
Vendor specific alerting - e.g. Dell iDRAC
Application monitoring
Personnel trained on equipment and management systems
Network Management System

Why do we want a managed network?

To achieve the availability that was designed into the system

Downtime is costly!

Equipment is Manageable

Enterprise grade hardware
Configurable
Supports industry standards
Evolves to support new standards/features
Redundancy available if design demands it
Remotely accessible (SSH, HTTP, Telnet, SNMP)

Comprehensive Documentation

Organized repository (online/offline)
“First Responder” documents
Network diagrams - logical and physical
Network device lists
Circuits lists
Applications/firewall rules
Contact lists - IT/vendors/support/site
Policies/procedures/service level agreements
Business continuity/disaster recovery plan

Enterprise synchronized to a standard time

Hierarchical design
NTP (Network Time Protocol) is used
Real-time clock or approved Internet source
All network hardware must synchronize
All active systems (Windows, UNIX and proprietary platforms) must synchronize
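One way to spot-check that a host agrees with the common time source is a simple SNTP query. The sketch below is an assumption-laden illustration rather than a full NTP client: it ignores network delay, and the helper names are hypothetical.

```python
import socket
import struct
import time

NTP_EPOCH_OFFSET = 2_208_988_800  # seconds from 1900-01-01 to 1970-01-01


def build_request() -> bytes:
    """48-byte SNTP client request: LI=0, version=3, mode=3 (0x1B), rest zero."""
    return b"\x1b" + 47 * b"\x00"


def parse_transmit_time(packet: bytes) -> float:
    """Return the server's transmit timestamp as Unix time (seconds)."""
    if len(packet) < 48:
        raise ValueError("NTP packet too short")
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_EPOCH_OFFSET + fraction / 2**32


def clock_offset(server: str, timeout: float = 3.0) -> float:
    """Rough offset (server minus local clock) in seconds; ignores delay."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_request(), (server, 123))
        packet, _ = sock.recvfrom(512)
    return parse_transmit_time(packet) - time.time()
```

Calling `clock_offset()` with the address of your internal time source on each class of system gives a quick sanity check; production hosts should still run a proper NTP client.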

Equipment must be maintained

Vendor hardware maintenance
Vendor software maintenance
Hot/cold spares
Periodic patches to fix software/hardware issues
Upgrades to add new features
Configuration management
Change control
Life cycle planning

Logging

Syslog server for accepting logged events
Windows/UNIX event logging
Logging properly configured on all systems
Systems in place to interpret log events
Predetermined/prescribed actions for log events
Out-of-band alerting for actionable events
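For scripts and in-house applications, events can be sent to the syslog server with Python's standard `logging.handlers.SysLogHandler`. This is a minimal sketch: the logger name, the message, and the use of 127.0.0.1 as the syslog host are placeholders.

```python
import logging
import logging.handlers


def make_syslog_logger(name: str, host: str, port: int = 514) -> logging.Logger:
    """Return a logger that forwards INFO and above to a syslog server over UDP."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.handlers.SysLogHandler(address=(host, port))
    handler.setFormatter(
        logging.Formatter("%(name)s: %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger


# Placeholder target: replace 127.0.0.1 with the address of your syslog host.
log = make_syslog_logger("backup-script", "127.0.0.1")
log.warning("nightly backup finished with 2 skipped files")
```

UDP syslog is fire-and-forget, so the sender gets no confirmation that the event arrived; that is one reason to pair logging with polling.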

SNMP Trapping

SNMP (Simple Network Management Protocol)
NMS to accept SNMP messages
Devices configured to send SNMP messages when events occur
Systems in place to interpret SNMP events
Predetermined/prescribed actions for SNMP events
Out-of-band alerting for actionable events
Operational guidelines for responding to events

SNMP Polling

SNMP server configured to proactively retrieve operational/performance data
NMS in place to interpret SNMP events
Prescribed actions for SNMP events
Provide detailed metrics on hardware/software systems
Out-of-band alerting for actionable events
Operational guidelines for responding to events

Application Monitoring

Specific TCP/UDP ports are checked for proper response - e.g. HTTP, SSL, SMTP, DNS, etc.
Synthetic transactions are issued - e.g. a query against a web site/database system
Out-of-band alerting for actionable events
Operational guidelines for responding to events
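The two kinds of check described above, a port check and a synthetic transaction, might look like this in Python. It is a sketch using only the standard library; the function names and the idea of matching on expected text are illustrative assumptions.

```python
import http.client
import socket
import urllib.request


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Port check: True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_check(url: str, expected_text: str, timeout: float = 5.0) -> bool:
    """Synthetic transaction: fetch a page and confirm it contains expected text."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return resp.status == 200 and expected_text in body
    except (OSError, http.client.HTTPException):
        return False
```

A port check only proves that something is listening; the synthetic transaction proves the application actually returns a sensible answer, which is why both are worth running.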

Trained Personnel

Network design
LAN configuration, operation & troubleshooting
WAN configuration, operation & troubleshooting
Windows Active Directory/networking operations
Vendor specific tools
Generic tools

Systems must be managed towards a common goal

Availability should be specified
Expectations should be explained to customers
Customer expectations should be met
Network metrics should be developed and publicized

Network must be secure

Only authorized access is allowed
Network equipment must be in secure areas
Network equipment must be hardened
AAA (Authentication, Authorization and Accounting) should be in place
Network design should support the security paradigms

Logging

Syslog is native to Unix/Linux
Kiwi Syslog is a free Windows program
Syslog can be a part of a network management software package
Windows event logs can be retrieved by NMS or other application
Define how syslog will be used

SNMP Polling/Trapping

Define what you want to track and thresholds for actionable items

SNMP community strings defined on each device/host

SNMP polling and trapping are configured on NMS

Define actions (NMS and human) should an actionable state occur
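Writing the thresholds and actions down as data makes them easy to review and to feed into whatever NMS is chosen. The sketch below is illustrative only: the metric names, limits, and actions are hypothetical, and a real NMS would supply the polled values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    warning: float
    critical: float


# Hypothetical metrics and limits; define your own per device class.
THRESHOLDS = {
    "cpu_percent": Threshold(warning=75.0, critical=90.0),
    "if_utilization_percent": Threshold(warning=80.0, critical=95.0),
    "temperature_c": Threshold(warning=55.0, critical=70.0),
}

# Hypothetical actions, one NMS-driven and one human-driven.
ACTIONS = {
    "warning": "log the event and open a ticket",
    "critical": "page the on-call engineer out-of-band",
}


def classify(metric: str, value: float) -> str:
    """Map a polled value to 'ok', 'warning' or 'critical'."""
    limits = THRESHOLDS[metric]
    if value >= limits.critical:
        return "critical"
    if value >= limits.warning:
        return "warning"
    return "ok"


def actionable_events(samples: dict) -> list:
    """Return (metric, state, action) for every sample that is not 'ok'."""
    events = []
    for metric, value in samples.items():
        state = classify(metric, value)
        if state != "ok":
            events.append((metric, state, ACTIONS[state]))
    return events


print(actionable_events({"cpu_percent": 92.0, "temperature_c": 40.0}))
```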

How to Build a Managed Network

Document existing infrastructure
Set up logging host
Configure all devices/hosts for logging & SNMP
Set up Network Management Station
Configure logging, polling and traps
Document specific actions for events

No-Cost Systems

Use the tools that vendors provide free
Syslog - Linux or Kiwi Syslog
NMS - Nagios, OpenNMS, Zenoss, Pandora, Groundwork, Hyperic, NetXMS
Configuration management
  Kiwi CatTools - routers, switches and firewalls
  Scripting - Perl/TCL/Expect/WMI

Low-Cost Systems

WhatsUp Gold
PRTG
GFI Network Monitor

Enterprise Systems

HP OpenView
SolarWinds Orion
CA eHealth
IBM Tivoli
EMC
CiscoWorks
Cisco MARS

Next Steps

Develop strategy
Develop short-term tactical plan to rapidly move towards a more manageable network

Further Information

Comparison of network monitoring systems - http://en.wikipedia.org/wiki/Comparison_of_network_monitoring_systems

Popular Network Management Software in Comparison - http://ipinfo.info/html/network_management_software.php

Eric Severson
Network Specialties, Inc.
eric@network-specialties.com
817-491-0267
