The Role of OpManager in Event and Fault Management

Team OpManager

www.opmanager.com

Role of OpManager in event and fault management

DESCRIPTION

Managing events and faults is nothing new to IT managers. If it is not implemented properly, however, it can become the most daunting of network monitoring and network management tasks. Check out this presentation to understand:

• The basics of event and fault management
• How ManageEngine OpManager helps in effective fault management

Page 2: Role of OpManager in event and fault management

Agenda

• Brushing up on fault management
– Reactive vs. proactive

• The four processes and OpManager’s role
– Detect
– Isolate
– Inform
– Resolve

Page 3: Role of OpManager in event and fault management

Reactive Fault Management

• Firefighting in nature
• Troubleshooting starts after business is impacted
• Higher resolution time
• Least preferred by both IT admins and end users

[Figure: a user tells the IT admin, "It is not working!"]

Page 4: Role of OpManager in event and fault management

Proactive Fault Management

• Alerts on an impending fault
• Resolution time reduced drastically
• Reduced operation cost

[Figure: the IT admin tells the user, "NMS has reported a problem & I’m working on it"]

Page 5: Role of OpManager in event and fault management

What is Fault and Event Management?

• Detecting events
• Making sense of them
• Presenting only actionable events

*An event can be informational, a cleared event, a warning, trouble, or even a critical problem

Page 6: Role of OpManager in event and fault management

The four processes

Detect → Isolate → Inform → Resolve

Page 7: Role of OpManager in event and fault management

The four processes explained

Detect
• Active monitoring
• Passive monitoring

Isolate
• De-duplication
• Correlation
• Automation

Inform
• Visual representation
• Ticketing
• Alerting

Resolve
• Automatic correction
• Troubleshooting tools

Page 8: Role of OpManager in event and fault management

Detect – Capture events

• Active polling, probing, or query-based monitoring

Active monitoring example: SNMP polling

Other examples of active polling include monitoring through SNMP, WMI, Telnet, SSH, custom scripts, remote queries, and more.
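As a rough illustration of active monitoring, the sketch below probes a list of devices and emits an event only when a device's state changes. It is not how OpManager polls: the probe is a plain TCP connect from Python's standard library rather than SNMP, and the names `tcp_probe`, `poll_once`, and the device addresses are hypothetical.

```python
import socket
from typing import Callable, Dict, List, Tuple

Device = Tuple[str, int]  # (host, TCP port); hypothetical targets


def tcp_probe(device: Device, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the device succeeds."""
    try:
        with socket.create_connection(device, timeout=timeout):
            return True
    except OSError:
        return False


def poll_once(
    devices: List[Device],
    last_state: Dict[Device, bool],
    probe: Callable[[Device], bool] = tcp_probe,
) -> List[str]:
    """Probe every device once and return events for state changes only."""
    events = []
    for device in devices:
        up = probe(device)
        # A device never seen before is assumed to have been up.
        previous = last_state.get(device, True)
        if previous != up:
            events.append(f"{device[0]}:{device[1]} is {'up' if up else 'down'}")
        last_state[device] = up
    return events
```

A scheduler would call `poll_once` at every polling interval with the same `last_state` dictionary, so a device that stays down produces one event rather than one per poll.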

Page 9: Role of OpManager in event and fault management

Detect – Capture events

• Passive or event-based monitoring

Passive monitoring example: SNMP trap

Examples of passive monitoring include SNMP traps, syslog, NetFlow, packet forwarding, and more.
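In passive monitoring the device sends the message and the NMS only has to interpret it. As a small example, a syslog message starts with a `<PRI>` field whose value is facility × 8 + severity (per the syslog RFCs). The sketch below decodes that field; the function name `parse_syslog_pri` is hypothetical, and a real receiver would also parse the timestamp, host, and message body.

```python
import re
from typing import Tuple

# Severity names in numeric order, 0 (most severe) to 7.
SEVERITIES = [
    "emergency", "alert", "critical", "error",
    "warning", "notice", "informational", "debug",
]

_PRI = re.compile(r"^<(\d{1,3})>")


def parse_syslog_pri(message: str) -> Tuple[int, str]:
    """Return (facility number, severity name) from a leading <PRI> field."""
    match = _PRI.match(message)
    if match is None:
        raise ValueError("message does not start with a <PRI> field")
    pri = int(match.group(1))
    if pri > 191:  # facility 23 * 8 + severity 7 is the largest valid value
        raise ValueError(f"PRI out of range: {pri}")
    facility, severity = divmod(pri, 8)
    return facility, SEVERITIES[severity]
```

The decoded severity is what lets the fault management engine decide whether a received message is informational or something an admin must act on.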

Page 10: Role of OpManager in event and fault management

Isolate – Present actionable faults

• Helps identify the root cause of the problem quickly; reduces mean time to resolve (MTTR)

• Includes tasks to
– Understand the event source
– Filter out redundant or known events
– Project only actionable faults

*The network management system’s fault management engine plays a vital role

Page 11: Role of OpManager in event and fault management

Isolate – Present actionable faults

De-duplication

• Drops recurrent events from the display
• Builds them into an event history
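The de-duplication idea can be sketched as follows: keep one alarm per device and alarm type, and file every repeated event under that alarm's history instead of showing it again. This is a minimal illustration, not OpManager's implementation; the `Alarm` and `Deduplicator` names and the (device, kind) key are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Alarm:
    device: str
    kind: str
    history: List[str] = field(default_factory=list)  # every raw event message


class Deduplicator:
    """Keeps one alarm per (device, kind) and files repeats under its history."""

    def __init__(self) -> None:
        self.alarms: Dict[Tuple[str, str], Alarm] = {}

    def ingest(self, device: str, kind: str, message: str) -> bool:
        """Record an event; return True only if it opens a new alarm."""
        key = (device, kind)
        is_new = key not in self.alarms
        alarm = self.alarms.setdefault(key, Alarm(device, kind))
        alarm.history.append(message)
        return is_new
```

Only events for which `ingest` returns True would be displayed; the rest remain available as the alarm's history.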

Page 12: Role of OpManager in event and fault management

Isolate – Present actionable faults

De-duplication

• OpManager Alarms view: shows unique alerts for every device and type of alarm
• Detailed alarm history page with a list of alarm actions

Page 13: Role of OpManager in event and fault management

Isolate – Present actionable faults

Correlation

• Relates previous events and interdependencies
• Projects only the root cause of the problem
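One simple form of correlation uses device dependencies: if a device is down and something upstream of it is also down, only the upstream device is reported. The sketch below assumes each device has at most one parent; the `root_causes` function and the topology in the usage note are hypothetical and do not describe OpManager's internal logic.

```python
from typing import Dict, List, Optional, Set


def root_causes(down: Set[str], parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the down devices that have no down device upstream of them."""
    roots = []
    for device in sorted(down):
        seen = {device}  # guards against accidental dependency loops
        upstream = parent.get(device)
        masked = False
        while upstream is not None and upstream not in seen:
            if upstream in down:
                masked = True
                break
            seen.add(upstream)
            upstream = parent.get(upstream)
        if not masked:
            roots.append(device)
    return roots
```

For example, with `server1` and `server2` behind `switch1`, and `switch1` behind `core`, a failure of `switch1` that also makes both servers unreachable yields a single actionable fault for `switch1`.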

Page 14: Role of OpManager in event and fault management

Isolate – Present actionable faults

Correlation

• OpManager has automated and custom network maps that let you identify the root cause much more quickly
• Lets you configure device dependencies to project only the root of the problem

Page 15: Role of OpManager in event and fault management

Isolate – Present actionable faults

Automation

• Ignore incidental events
• Remove cleared faults
• Suppress known alarms (automated/manual suppression)

Page 16: Role of OpManager in event and fault management

Isolate – Present actionable faults

Automation

• Threshold configuration: Consecutive Times and Rearm Value
• Suppress known alarms: Downtime Scheduler
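The threshold settings named above can be illustrated with a small state machine. It assumes the usual meaning of the two options: an alarm is raised only after the threshold is violated on a set number of consecutive polls, and it is cleared once the value falls to the rearm value or below. The class name `ThresholdMonitor` and these exact semantics are assumptions for illustration, not taken from OpManager's documentation.

```python
from typing import Optional


class ThresholdMonitor:
    def __init__(self, threshold: float, rearm: float, consecutive: int) -> None:
        self.threshold = threshold
        self.rearm = rearm
        self.consecutive = consecutive
        self.violations = 0  # current run of polls above the threshold
        self.alarmed = False

    def update(self, value: float) -> Optional[str]:
        """Feed one polled value; return 'raise', 'clear', or None."""
        if self.alarmed:
            # Stay in alarm until the value drops to the rearm value.
            if value <= self.rearm:
                self.alarmed = False
                self.violations = 0
                return "clear"
            return None
        if value > self.threshold:
            self.violations += 1
            if self.violations >= self.consecutive:
                self.alarmed = True
                return "raise"
        else:
            self.violations = 0  # a single good poll breaks the run
        return None
```

With a threshold of 90, a rearm value of 75, and three consecutive times, a single CPU spike to 95% raises nothing, and an alarm that has been raised does not flap while the value hovers between 75 and 90.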

Page 17: Role of OpManager in event and fault management

Isolate – Present actionable faults

Automation

• Suppress known alarms: manual suppression for devices and interfaces

Page 18: Role of OpManager in event and fault management

Inform – Notify admins

• Visual representation of faults to assist NOC admins
• Ticketing and alerting remote admins

Page 19: Role of OpManager in event and fault management

Inform – Notify admins

• Alarm color coding
• Web alarms and dashboards
• Dynamic network or custom maps showing the network and device status

Page 20: Role of OpManager in event and fault management

Inform – Notify admins

Trouble ticketing

• Through email for other helpdesk products
• Automatic ticket creation with ManageEngine ServiceDesk Plus, through integration

Page 21: Role of OpManager in event and fault management

Inform – Notify admins

• Alert remote admins: email, SMS, RSS feeds, Twitter alerts, iPhone/smartphone GUI

[Figure: sample alerts delivered by email, RSS, Twitter DM, smartphone UI, and SMS]
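As an illustration of the email channel, the sketch below builds an alert message with Python's standard `email` package. The function name `build_alert_email`, the subject format, and the addresses in the usage note are assumptions; OpManager's own notification templates are configured in the product and are not shown here.

```python
from email.message import EmailMessage


def build_alert_email(device: str, severity: str, text: str,
                      sender: str, recipient: str) -> EmailMessage:
    """Build (but do not send) an alert email for one alarm."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    # Put severity and device in the subject so admins can triage from the inbox.
    msg["Subject"] = f"[{severity.upper()}] {device}: {text}"
    msg.set_content(f"Device: {device}\nSeverity: {severity}\nMessage: {text}\n")
    return msg
```

The resulting message could be handed to `smtplib.SMTP(...).send_message(msg)` against whatever mail server the site uses.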

Page 22: Role of OpManager in event and fault management

Resolve – Aid faster resolution

• Needs proprietary knowledge of your IT infrastructure, policies, and agreed SLAs

• The NMS should help
– Execute such automation logic (and communicate execution faults, if any)
– Back manual troubleshooting with a set of IT tools

Page 23: Role of OpManager in event and fault management

Resolve – Aid faster resolution

Automated fault resolution

• Run a command or a program on a remote machine, with options to append error messages
• Restart a Windows service or the server if the service is found to be down
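A corrective action of this kind boils down to running a command and reporting back what happened, including any error output. The sketch below does this for a local command with Python's `subprocess` module; the function name `run_action` and the result strings are hypothetical, and running the command on a remote machine (for example over SSH) is left out.

```python
import subprocess
from typing import List


def run_action(command: List[str], timeout: float = 30.0) -> str:
    """Run a corrective command and return a one-line result for the alarm notes."""
    try:
        done = subprocess.run(command, capture_output=True, text=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired) as exc:
        # The command could not be started, or it ran past the timeout.
        return f"action failed to run: {exc}"
    if done.returncode != 0:
        # Append the command's own error output, as the slide describes.
        return f"action exited with {done.returncode}: {done.stderr.strip()}"
    return "action succeeded"
```

Returning the failure text instead of raising means an execution fault can be attached to the alarm and communicated to the admin.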

Page 24: Role of OpManager in event and fault management

Resolve – Aid faster resolution

Server troubleshooting tools

• Remote process diagnostics
• Device tools: ping, traceroute, and tools to connect to the server remotely (web console, Telnet/SSH, MS Terminal Server)

Page 25: Role of OpManager in event and fault management

Resolve – Aid faster resolution

Network troubleshooting tools

• Switch Port Mapper
• Network traffic analysis
• Switch port disabling option

Page 26: Role of OpManager in event and fault management

Resolve – Aid faster resolution

Network troubleshooting tools

• WAN link hop-wise latency count graph
• Network Change and Configuration Management (NCCM)

Page 27: Role of OpManager in event and fault management

Resolve – Aid faster resolution

Other troubleshooting tools

• Real-time performance graphs
• MIB Browser and Syslog viewer

Page 28: Role of OpManager in event and fault management

Tons of features that we’ve not talked about

• Automatic network discovery
• Device and interface monitoring templates
• Network maps/custom maps
• WAN RTT and VoIP monitoring
• Network traffic analysis
• Network change and configuration management
• Server monitoring (Windows/Linux/UNIX-flavor OSes)
• ESX VMware monitoring
• MS Exchange, SQL, and Active Directory monitoring
• Service monitoring, website monitoring, process and file/folder monitoring
• Processing SNMP traps, syslogs, and event logs
• Monitors any pingable and SNMP-enabled device

ManageEngine OpManager is comprehensive, easy-to-use network monitoring & management software.

For a free trial, visit www.opmanager.com

For product demos, mail us at [email protected]

Call +1 888 720 9500

Page 29: Role of OpManager in event and fault management

About ManageEngine

ManageEngine is the only IT management vendor focused on bringing a complete IT management portfolio to the mid-sized enterprise.

Trusted by over 45,000 customers, including 3 out of every 5 Fortune 500 companies. More at www.manageengine.com

Page 30: Role of OpManager in event and fault management

Summary

• Fault and event management: proactive and reactive approaches
• Four processes of fault management
– Detect: active and passive monitoring
– Isolate: de-duplication, correlation, automation
– Inform: visual fault representation, ticketing, and alerting
– Resolve: automated scripts and tools to aid manual troubleshooting
• OpManager’s role in each process of fault and event management
• About ManageEngine and its various IT management products

Page 31: Role of OpManager in event and fault management

Thank you

[email protected]

Questions?