1
BMC PATROL Express
Presentation to the Ottawa Area PATROL User Group (OAPUG)
Pierre Vanier, KOAN-IT Corp.
May 5th, 2004
2
About KOAN-IT
• KOAN-IT’s mission:
– To deliver Service Management solutions for IT operations
– Visit us at: www.koan-it.com
• About the presenter: Pierre Vanier, Senior IT Consultant, KOAN-IT Corp.
– 15 years experience in the IT industry
– 10 years experience in Enterprise Management solutions
3
KOAN-IT and PATROL Express
• PATROL Express is used at KOAN-IT to monitor:
– IT Infrastructure
– Corporate web site (i.e. www.koan-it.com)
• Powerful monitoring of both PATROL and non-PATROL environments
• KOAN-IT is now reselling PATROL Express to commercial clients
• Available on a subscription basis (monthly fee)
• For more information, please contact:
KOAN-IT Corp.
Email: [email protected]
Tel: (613) 591-9131
4
Agenda
• PATROL Express Overview & Architecture
• PATROL Express Features & Demo
• Q&A
5
What is PATROL Express?
PATROL Express is an infrastructure monitoring solution. It provides monitoring, notification of outages, and reporting for servers, network devices and applications.
PATROL Express also monitors the performance and availability of web transactions. It measures both transactions and infrastructure against user-defined service level objectives.
PATROL Express uses an agentless technology that enables it to be deployed rapidly with minimal impact. Its interfaces are web-based; all management tasks can be performed using a standard web browser.
PATROL Express Overview
6
Sliding Scale of Management
[Figure: Sliding Scale of Management. Scale labels: Infrastructure Monitoring, Infrastructure Management, Service Management. Products shown: PATROL Express, PATROL. Other labels: Points of Entry, Capabilities, Extensibility, Up/Down Detection, Remote Service Level Monitoring, Recovery Actions, Root Cause Analysis, Service Level Management, Enterprise Management.]
7
Architecture Overview
8
Key Concepts
Service Integration Portal (SIP)
– Web-based application
– Resides at customer’s or service provider’s data center
– Used by end-users to:
• Configure elements
• Organize elements into services
• View reports
• Set up notifications
9
Key Concepts (cont’d)
Remote Service Monitor (RSM)
– Collects performance data and relays it to the SIP
– Remotely monitors elements that are configured at the SIP
– Agentless technology – uses industry-accepted protocols
– Downloaded from the SIP
– Installed and deployed on the customer’s network
– RSM clustering: provides failover protection
Monitoring Overview
– Supports a number of remote monitoring protocols
– Each RSM can monitor hundreds of elements
– Typically retrieves parameters from elements once a minute (monitoring interval)
10
RSM Location
– The RSM resides in the customer’s environment
– Must have IP addressability to the elements that it monitors
– Must also be able to resolve the PATROL Express SIP IP address
– The RSM runs as a service
– RSM manager application resides in the Microsoft Windows system tray
11
RSM Manager (system tray)
12
RSM – SIP Communications
• The RSM communicates with the SIP using HTTPS
• The RSM initiates all communication
• The data is compressed and encrypted prior to being sent
• RSM-SIP communications fall into one of the three following categories:
– Verifying RSM-to-SIP communication
– Forwarding alarm or warning exceptions to the SIP for processing
– Forwarding parameter data to the SIP for processing service reports
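The "compressed prior to being sent" step can be illustrated with a minimal sketch. The slides do not describe the RSM's wire format, so the JSON encoding, the use of zlib, and the function and field names below are all assumptions made for illustration; encryption is assumed to come from the HTTPS transport and is not shown.

```python
import json
import zlib


def pack_parameter_data(samples):
    """Serialize samples to JSON and compress them (hypothetical format)."""
    raw = json.dumps(samples, sort_keys=True).encode("utf-8")
    return zlib.compress(raw)


def unpack_parameter_data(blob):
    """Reverse of pack_parameter_data: decompress, then parse the JSON."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


# Hypothetical sample; the field names are illustrative only.
samples = [
    {"element": "localhost", "parameter": "TotalCPUUtilization", "value": 42.0}
]
blob = pack_parameter_data(samples)
restored = unpack_parameter_data(blob)
```

Because the RSM initiates all communication, a payload like `blob` would be sent in the body of an outbound HTTPS request from the customer's network to the SIP.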
13
PATROL Express System Requirements
Minimum system requirements for installing all PATROL Express components (web, application and database servers) on ONE system are as follows:

SIP (Service Integration Portal)
Resource | Minimum Requirement | Comments
Platform | Intel Pentium III or equivalent | Minimum of 1 GB memory dedicated to the SIP components; minimum 733-MHz processing speed; minimum 30 GB hard drive capacity
Operating System | Windows 2000 Server (SP2 or later) | The OS MUST be Windows 2000 Server with all latest Critical Updates and Patches applied

RSM (Remote Service Monitor)
Resource | Minimum Requirement | Comments
Platform | Intel Pentium III | Minimum of 128 MB memory dedicated to the RSM; minimum 600-MHz processing speed
Operating System | Windows NT 4.0 (SP6a or later), Windows 2000 (SP2 or later), or Windows XP |
Browser | Internet Explorer 5.5 and later (with latest patches) | The RSM uses the Internet Explorer WinInet.dll
14
PATROL Express
PATROL Express Features
15
Features Overview
– Remotely monitors infrastructure and Web transactions
> Infrastructure
- Operating systems
- Databases
- Applications
- Network/Storage Devices
> Web-based transactions
- HTTP/HTTPS
– Measures against user-defined service level objectives
– Provides ‘business service’ performance measurements
– Sends real-time problem notifications
– Provides centralized access to reports
– Service enabled for end user access
16
PATROL Express Monitoring
• What it monitors
– Windows (NT, 2000, XP, 2003)
– Unix (AIX, HP-UX, Linux, Solaris)
– Databases (Oracle, MS-SQL, Sybase)
– Web servers (IIS)
– Web transactions
– Email (Exchange, POP, IMAP, SMTP)
– Network devices
– Storage devices
– Port monitoring
– Process monitoring
– Windows event log monitoring
– Text log monitoring
– Custom parameter sets
• How it monitors
– PerfMon
– WMI
– rstatd
– Secure Shell (SSH)
– SNMP
– HTTP/HTTPS
– SQL*NET
– Ping
– DNS
– PATROL protocol
For more information, please refer to the PATROL Express Parameter Set Guide.
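As an illustration of what an agentless check looks like, here is a minimal sketch of port monitoring using a plain TCP connect. This is not PATROL Express code; the function name and the timeout default are assumptions.

```python
import socket


def check_port(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

A monitor would call `check_port` once per monitoring interval and raise an alarm when the result changes from True to False. Nothing is installed on the monitored element, which is the point of the agentless approach.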
17
Monitoring Capabilities
Operating Systems
PATROL Express monitors basic operating system parameters, including memory, CPU and disk utilization. PATROL Express can be configured to monitor specific processes as well as how much memory and CPU a process is using. It can also monitor both Windows Event logs and text log files for user-defined messages.
Databases
PATROL Express takes a snapshot of database performance, which includes checking that the database is up and running and monitoring parameters such as number of transactions, lock usage and active SQL statements.
Network Devices
PATROL Express monitors each interface (port) on the device to ensure it is running and reports how much data has been transmitted, and at what speed. In addition, it monitors the status of network devices, checking on availability and reporting the system description of each device.
18
Monitoring Capabilities (cont’d)
Storage Devices
When monitoring storage devices, PATROL Express focuses primarily on availability, such as up/down status and environmental systems, including fans, power supplies and temperature; asset information, such as vendor, model, serial number and firmware; and configuration, such as device capacity and number of ports.
Web Transactions
PATROL Express monitors the performance and availability of Web transactions using HTTP and HTTPS. PATROL Express supports all major dynamic HTML techniques, such as JSP, ASP and CGI, and popular content types, such as Microsoft Word, PDF and plain text.
19
Service / Element Monitoring
- Cumulative “current status” at various levels:
• Company
• Service
• Element
• Parameter
- Parameter performance graphs
- PATROL Express API can be used to retrieve current and historical performance data, e.g.:
https://patrolexpress.bmc.com/gethistdata.do?user=t1&password=secret&format=csv&element=localhost&paramset=Windows_2000&parametername=TotalCPUUtilization&startdate=06-18-2003-06-00PM&version=3.0
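A request URL like the one above can be assembled programmatically. The sketch below only builds the URL and does not call the service. The host, path and query parameter names are taken from the slide's example (`paramset` and `parametername` are read from a partly garbled URL, so treat them as assumptions); the function name and its defaults are hypothetical.

```python
from urllib.parse import urlencode

# Endpoint as shown in the slide's example URL.
BASE_URL = "https://patrolexpress.bmc.com/gethistdata.do"


def build_history_url(user, password, element, paramset, parametername,
                      startdate, fmt="csv", version="3.0"):
    """Build a historical-data request URL (hypothetical helper)."""
    query = urlencode({
        "user": user,
        "password": password,
        "format": fmt,
        "element": element,
        "paramset": paramset,
        "parametername": parametername,
        "startdate": startdate,  # e.g. 06-18-2003-06-00PM, as in the slide
        "version": version,
    })
    return f"{BASE_URL}?{query}"


url = build_history_url("t1", "secret", "localhost", "Windows_2000",
                        "TotalCPUUtilization", "06-18-2003-06-00PM")
```

Using `urlencode` rather than string concatenation ensures that element names or passwords containing reserved characters are escaped correctly.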
22
Parameter Performance Graph
23
Service Measure Reports
• Charts at the account and service levels
– Availability reports
• Availability – percent of time with no critical alarms
• Availability vs. Goals – percent of time availability goals were met
– MTTR reports
• Mean Time To Repair Critical Alarms
• Mean Time To Repair vs. Goals – percent of time MTTR goals were met
MTTR is the average length of time it takes to fix a problem that caused an alarm state on a monitored element.
• Charts at the element level
– Availability
– Mean Time To Repair (MTTR)
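The two measures defined on this slide can be worked through with a short sketch. It assumes critical-alarm intervals are given as non-overlapping (start, end) pairs in seconds within the reporting period; the function names and sample numbers are illustrative, not taken from PATROL Express.

```python
def availability_percent(period_seconds, alarm_intervals):
    """Percent of the period with no critical alarm open."""
    downtime = sum(end - start for start, end in alarm_intervals)
    return 100.0 * (period_seconds - downtime) / period_seconds


def mttr_seconds(alarm_intervals):
    """Mean time to repair: average duration of the alarm intervals."""
    if not alarm_intervals:
        return 0.0
    total = sum(end - start for start, end in alarm_intervals)
    return total / len(alarm_intervals)


# One day with two critical alarms lasting 900 s and 540 s.
day = 86400
alarms = [(3600, 4500), (50000, 50540)]
avail = availability_percent(day, alarms)
mttr = mttr_seconds(alarms)
```

With 1440 s of total alarm time in an 86400 s day, availability comes to about 98.33 percent and MTTR to 720 s. Either figure can then be compared with its goal to produce the "vs. Goals" reports.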
25
Web Transaction Reports
• Reports specific to Web transactions
– Service Measures reports
• Path Time vs. Goals (for Web pages)
• Page Time vs. Goals (for Web pages)
– Diagnostic charts
• Total path time
• Slowest five steps, fastest five steps
• Page time
• Page component (DNS, first byte, resources)
• Shown as averages or for specific locations (RSMs)
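The diagnostic figures listed above (total path time, comparison with a goal, slowest and fastest steps) can be derived from per-step timings. The sketch below is one way to do it, with a hypothetical function name, step names and a goal expressed in milliseconds.

```python
def path_summary(step_times_ms, goal_ms, n=5):
    """Summarize a Web transaction path from its per-step times."""
    total = sum(step_times_ms.values())
    # Rank steps from slowest to fastest.
    ranked = sorted(step_times_ms.items(), key=lambda kv: kv[1], reverse=True)
    return {
        "total_ms": total,
        "meets_goal": total <= goal_ms,
        "slowest": [name for name, _ in ranked[:n]],
        "fastest": [name for name, _ in ranked[::-1][:n]],
    }


# Hypothetical six-step transaction.
steps = {"home": 400, "login": 900, "search": 1500,
         "cart": 300, "checkout": 1200, "confirm": 200}
summary = path_summary(steps, goal_ms=5000)
```

Running the same summary per RSM location, rather than on averaged timings, gives the location-specific view mentioned in the last bullet.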
26
Path Time Charts
[Charts: Total Path Time with Goal; Path Time by Step]
27
Page Time Charts
[Charts: Total Page Time with Goal; Page Time Breakdown]
28
Worldwide Web Site Monitoring
29
Alerts
– Historical log of all alerts sent out
– Alerts may be sent via email or SNMP
– Alert log entry includes:
• Element service
• Element name and network name or IP address
• When it was detected
• State: critical alarm, alarm, warning or normal
• Who was notified
– Built-in blackout periods
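A blackout period suppresses alerts during a planned window such as nightly maintenance. The slides do not say how PATROL Express represents these windows, so the sketch below assumes a simple daily start and end time and a hypothetical function name.

```python
from datetime import datetime, time


def in_blackout(moment, start, end):
    """Return True if moment falls inside the daily window [start, end)."""
    t = moment.time()
    if start <= end:
        return start <= t < end
    # The window wraps past midnight, e.g. 22:00 to 02:00.
    return t >= start or t < end


late_night = in_blackout(datetime(2004, 5, 5, 23, 30), time(22, 0), time(2, 0))
midday = in_blackout(datetime(2004, 5, 5, 12, 0), time(22, 0), time(2, 0))
```

An alert raised while `in_blackout` is true would still be detected but would not be sent to recipients.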
30
Notification Examples
• When to notify (escalation policies)
– Notify Joe immediately
– Notify Jane if the alarm is not fixed in 10 minutes
• What types of alerts to send (critical alarm, alarm, warning or all alerts)
– Notify Joe on critical alarms
– Notify Jane on critical alarms and alarms
– Notify Bill on critical alarms, alarms and warnings
• Notify different recipients for different services or elements
– Notify only Joe for Service X alerts
– Notify only Jane for Service Y alerts
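The three kinds of rule above (delay, alert type, service scope) can be combined in one small data model. This is a sketch under assumptions: the `Rule` fields, the severity ordering and the function name are hypothetical and do not describe how PATROL Express stores its notification settings.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed ordering: a rule for a given severity also matches anything worse.
SEVERITY = {"warning": 1, "alarm": 2, "critical alarm": 3}


@dataclass
class Rule:
    recipient: str
    min_severity: str              # lowest state that triggers this rule
    delay_minutes: int = 0         # escalation delay; 0 means immediately
    service: Optional[str] = None  # None means all services


def recipients(rules, service, state, minutes_open):
    """Return who should be notified for an alert in the given state."""
    return [
        r.recipient
        for r in rules
        if (r.service is None or r.service == service)
        and SEVERITY[state] >= SEVERITY[r.min_severity]
        and minutes_open >= r.delay_minutes
    ]


rules = [
    Rule("Joe", "critical alarm"),
    Rule("Jane", "alarm", delay_minutes=10),
    Rule("Bill", "warning", service="Service X"),
]
first = recipients(rules, "Service Y", "critical alarm", 0)
later = recipients(rules, "Service X", "alarm", 15)
```

Here `first` contains only Joe, because Jane's 10-minute delay has not elapsed and Bill's rule is limited to Service X. `later` contains Jane and Bill, since an alarm is below Joe's critical-only threshold.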
31
E-Mail Notification Example
32
Customizing PATROL Express
– The Parameter Set Editor (PSE) is an interface for adding custom parameter sets
– Primary way to customize PATROL Express
– Custom parameter sets are added via PSE using the following protocols:
• PerfMon
• SNMP
• PATROL
– Accessible by administrators only
– Administrators may hide existing parameter sets (e.g., do not show Solaris)
33
PATROL Express Security
• PATROL Express was designed with security in mind.
• All traffic between the SIP and the RSM, including log on credentials, is compressed and encrypted using HTTPS.
• PATROL Express supports SNMP v3 – encrypted credentials and data transmission.
In addition, the following precautions have been taken:
– The SIP uses Secure Sockets Layer (SSL) IDs with strong encryption
– Users are required to authenticate using their user IDs and passwords to access the SIP
– Portal and element credentials are stored encrypted in the database of the SIP
34
Summary
PATROL Express improves quality of service (QoS)
– Minimizes deployment and configuration
– Notifies customers of outages
– Enables a system-wide monitoring tool
– Measures against user-defined service level objectives
– Provides Web-based Service Measures reports
35
Summary (cont’d)
PATROL Express provides great functionality "out-of-the-box". Other products often require several loosely integrated components to accomplish the same level of functionality:
– secure web-interface
– secure communications between RSM and SIP
– secure storage of credentials
– easy clustering of RSMs
– email notifications
– pager-ready messages & notifications
– blackout period support built-in
– automatic report generation/emailing
– easy administration & management
36
Why PATROL Express?
– Deploy quickly, without an agent on a box?
– Reactive or NO monitoring in less strategic areas?
– Improve service levels (QoS) to customers?
– Remotely monitor performance and availability globally?
– Other tools too expensive to expand business segments?
– Are you being asked to do more with less?
37
PATROL Express in the PATROL Management Architecture
38
Questions?
For more information, please contact KOAN-IT:
Email: [email protected]
Tel: (613) 591-9131