DESCRIPTION
Dealing with the good and bad “alerts” from an environment is a challenge when it comprises 500 servers, 800 SQL databases and 2500 workstations. How does IT ensure that all of its alerting systems change IT behavior to become more proactive? The default mechanism is to produce e-mail, and lots of it, to various resources and/or e-mail distribution lists. This remains an unstructured approach and can only work in very small IT environments. Learn how Super Group manages alerts and steal ideas you can use in your environment.
Johan Swart, Super Group, South Africa
I am a Light Current Electrical Engineer. My main subjects were Computer Science, Electronics, Industrial Electronics, Digital Systems, Electrical Engineering, Radio and Telecommunications. My real interest in the electronics world started when television was introduced in South Africa in the late 70’s. Through a small group I ordered my “BBC” computer, consisting of a CPU, RAM and an on-board keyboard. In those days you connected it to a TV screen and imported your data from a tape recorder. I’ve been working for Super Group for the past 10 years. My specific focus for the last 2.5 years has been IT Service Management, including the implementation of good practices. Cherwell and CBAT made all this possible. We’re only now starting to explore the really exciting stuff! Cherwell is limited only by what’s in your mind.
“Exception Management to support Services IT provided to Business”
Speaker – Johan Swart
Session Title: Exception Management to support Services IT provides to Business
Date & Time: 25 September 2014 2:50pm
Agenda
• The Businesses we support
• IT Strategy & Infrastructure in brief
• Event problems & solution concepts
• The implemented Events solution
• Imported items to provide solutions
• Other solutions we implemented
• Why we succeeded
• Q&A
The Business we support
• Supply Chain
  • The division operates across the entire supply chain and focuses on providing true end-to-end solutions.
• Fleet Solutions
  • Fleet management solutions meet vehicle management needs and are represented in SA, UK, New Zealand and Australia.
• Dealerships
  • 37 franchised dealerships based in Gauteng and North West Provinces in South Africa
Our IT Strategy in brief
• Support Business operations and processes.
• Proactive in the strategic decision making, to ensure that SGIT provides a competitive service to all our customers.
• Provide required technology (Fit for purpose and not best of breed only) enabling Divisions to conduct their daily operations in a fast and efficient manner.
• Provide continued “cost effective” technology whilst offering latest technology.
• Most importantly striving to provide the best Customer Service experience.
Our Infrastructure in brief
• Summary of Components
  • 500+ Servers (Microsoft, SUN, Linux etc)
  • 1000+ Databases (SQL, Oracle, UniVerse etc)
  • 170+ Managed Switches (Nortel, Cisco & other)
  • 100+ Firewall devices (SonicWall, Mikrotik, CheckPoint)
  • 2500+ Workstations
• 2 Primary Hosting Facilities
• Services provided
  • 30 Business Application Services
  • 8 Core Services
  • 11 Technical Services
Events: starting out
• The questions were:
  • How do we make sense of the monitored alerts as a valuable asset so that we could become more proactive?
  • Can monitored alerts really help us in a new way to ensure that our Services remain in an operational state more of the time?
  • How do we make use of what we already have in place without re-inventing the wheel?
  • How do we keep the solution as simple as possible for the people it is targeted at?
The Problem
• Let’s Email these alerts to a distribution list, a group of people or, worse, a single individual.
• Facts.
  • Electronic dis-organization is worse than the manual kind.
  • Communicating everything to everyone produces a zero result.
  • 99% of this stuff should never have crossed your desk in the 1st place, as you’ll simply miss the 1% that is really important.
  • Knowing about something serious as an individual, while it is not visible to anyone else, helps no one.
The Solution in Concept
• A structured approach was needed:
  • Funnel events into a single controlled system
  • Record everything required, but alert/focus ONLY on the critical events
  • Direct the critical events to the correct people
  • Maintain 100% visibility of everything going on
  • Focus on the basic priority items first and then expand the system gradually as you go.
• This will help resources focus on the right things instead of just working in a busy frenzy.
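The concept above (record everything, alert only on critical events, send those to the right people) can be sketched in a few lines of Python. This is only an illustration and not the Cherwell implementation: the `Event` fields, the "Critical" level name, the `OWNERS` mapping and the team names are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    hostname: str
    service: str
    level: str      # e.g. "Warning" or "Critical" (level names are assumptions)
    message: str


# Hypothetical mapping from a service to the team that should be alerted.
OWNERS = {"Storage Service": "storage-team", "Backup Service": "backup-team"}


@dataclass
class EventStore:
    recorded: list = field(default_factory=list)   # every event, for visibility
    alerts: list = field(default_factory=list)     # (owner, event) pairs

    def handle(self, event: Event) -> None:
        # Always record the event so nothing is lost.
        self.recorded.append(event)
        # Alert only on critical events, and only the responsible team.
        if event.level == "Critical":
            owner = OWNERS.get(event.service, "service-desk")
            self.alerts.append((owner, event))


store = EventStore()
warning = Event("HV-NEO", "Storage Service", "Warning", "Drive C: 10% free")
critical = Event("HV-NEO", "Storage Service", "Critical", "Drive C: full")
store.handle(warning)
store.handle(critical)
```

After the two calls, both events are recorded but only the critical one produces an alert, addressed to the assumed owner of the storage service.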
The Solution in Concept …
• To bring about a structured and organized approach in dealing with 1000’s of daily events
The Implemented Solution
• Incoming Events
• Processing Events
• Mapping Events
• Visibility of Events
• Alerting on Events
Incoming Events
• Our sources of events/alerts are:
  • Solarwinds, Telco & Internet Service Providers, Backup systems, AV systems, Firewall systems, Microsoft OpsMgr and Foglight.
  • Homegrown queries in the Microsoft environment using tools like PowerShell.
  • Homegrown queries in the Linux environment using tools like Bash.
Incoming Email Events
• A dedicated Email listener for Events/Alerts
• Internally controlled Emails come in in a standard format
  • Subject: HV-NEO Drive C: Status
  • Body 1st line “HV-NEO C: 10.484GB of 99.998GB, leaving 10% free.”
  • Status:Recorded%
  • Level:Warning%
  • Event Type:Disk%
  • Hostname:HV-NEO%
  • Service:Storage Service%
  • UpDown Only:Down.%
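A structured email like the one above is easy to turn into event fields. The following Python sketch shows one way this could be done. It assumes that each field sits on its own line as `Name:Value%` with `%` acting as a terminator, and that the first body line is free text; the function name `parse_event_email` is hypothetical, and the real parsing is done by the Cherwell email listener rather than by code like this.

```python
def parse_event_email(subject: str, body: str) -> dict:
    """Split a standard-format event email into a dictionary of fields."""
    lines = [ln.strip() for ln in body.splitlines() if ln.strip()]
    event = {
        "subject": subject,
        # The first body line is the human-readable summary.
        "summary": lines[0] if lines else "",
    }
    for line in lines[1:]:
        # Assumed field syntax: "Name:Value%"; anything else is ignored.
        if not line.endswith("%") or ":" not in line:
            continue
        key, _, value = line[:-1].partition(":")
        event[key.strip()] = value.strip()
    return event


body = "\n".join([
    "HV-NEO C: 10.484GB of 99.998GB, leaving 10% free.",
    "Status:Recorded%",
    "Level:Warning%",
    "Event Type:Disk%",
    "Hostname:HV-NEO%",
    "Service:Storage Service%",
    "UpDown Only:Down.%",
])
event = parse_event_email("HV-NEO Drive C: Status", body)
```

Keeping the incoming message this small and regular is what makes the parsing step trivial; values are stored exactly as received, so the `UpDown Only` field keeps its trailing period.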
Processing of Events
• Business Process 1
  • Set the relevant Event Fields on creation
  • Create the SMS output file
• Business Process 2
  • Update the CI if required (Example on backups)
• Business Process 3
  • Auto close all “Down” events if an Up event was received for the exact item
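The auto-close rule of Business Process 3 can be illustrated with a short Python sketch. It is not the Cherwell business process itself. It assumes that "the exact item" means the same hostname and event type, and that a closed event gets the status "Closed"; the `EventRecord` class and `apply_up_event` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class EventRecord:
    hostname: str
    event_type: str
    direction: str            # "Down" or "Up"
    status: str = "Recorded"  # becomes "Closed" (assumed status name)


def apply_up_event(events: list, up: EventRecord) -> int:
    """Close every open Down event matching the Up event; return how many."""
    closed = 0
    for ev in events:
        same_item = (ev.hostname == up.hostname
                     and ev.event_type == up.event_type)
        if same_item and ev.direction == "Down" and ev.status != "Closed":
            ev.status = "Closed"
            closed += 1
    return closed


events = [
    EventRecord("HV-NEO", "Disk", "Down"),
    EventRecord("HV-NEO", "Disk", "Down"),
    EventRecord("HV-OTHER", "Disk", "Down"),
]
closed_count = apply_up_event(events, EventRecord("HV-NEO", "Disk", "Up"))
```

Only the two Down events for the matching host and event type are closed; the event for the other host stays open until its own Up event arrives.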
Business Process 1 Result
Business Process 1 creates SMS
• Creation of the appropriate SMS output file
Business Process 2 updates CI’s
Business Process 3 closes Events
• Automatically closes all Down Events when a corresponding Up Event is received.
Mapping Events - Independency
• Maps to CI’s
• Maps to Services
Visibility of Events
• Results consolidating upwards examples
Alerting of Events
• Email versus SMS?
• Ability to switch off parts of the SMS system
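One way to picture the ability to switch off parts of the SMS system is a per-service mute list that is consulted before an SMS is sent. The Python sketch below is purely illustrative: the rule that every event may go out by email while only critical events go out by SMS is an assumption, as are the "Critical" level name, the `SMS_DISABLED_SERVICES` set and the `choose_channels` function.

```python
# Hypothetical switch: services whose SMS alerts are currently turned off.
SMS_DISABLED_SERVICES = {"Backup Service"}


def choose_channels(service: str, level: str,
                    sms_disabled=SMS_DISABLED_SERVICES) -> list:
    """Return the alert channels to use for an event."""
    channels = ["email"]
    # Assumed rule: SMS is reserved for critical events on unmuted services.
    if level == "Critical" and service not in sms_disabled:
        channels.append("sms")
    return channels


storage = choose_channels("Storage Service", "Critical")
backup = choose_channels("Backup Service", "Critical")
```

With the mute list above, a critical storage event would use both channels, while a critical backup event would fall back to email only until the service is removed from the list.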
Key guidelines on Events
• To keep Email & BPE processing fast and effective:
  • Keep the incoming Event component small
  • Keep the incoming Event structured as indicated
  • Processed Email Events move to Deleted Items
  • Keep the Event as independent as possible
  • Schedule high-volume events during non-peak hours
  • Use Cherwell as the wake-up mechanism and use the source systems to go and find the in-depth detail
Some results are Imported Daily
• In some cases it makes sense to import daily results, such as those from:
  • Anti-Virus Systems
  • Workstation Backup Systems
  • Access to critical Business applications such as those from:
    • Warehouse Management Systems
    • ERP Systems
    • Fleet Management Systems
Application Access Control
• Access can be internally audited and verified against the different controls already in place in Cherwell
Dedicated Exceptions Homepage
• This brings all your exceptions together in one single place that is simply a click away.
Business Process Support
• Example of an outgoing warning to business
Other Implemented Solutions
• JML (Joiner, Mover, Leaver) Portal Forms
• Staff Availability
• Asset Management
• Contract Management
• Project Management
• Expense Tracking System
• Supplier Approval System
Why we succeeded?
• Having a “Business Orientated” CIO
• Dedicated position was created for ITSM responsibilities
• Product selection criteria set (Cost/Dev/Upgrade/Sup)
• Select the right partner
• Start with the basics but keep on evolving even if the implemented solution is not perfect.
• Ability to “Make things happen” without always relying on development teams
• Making it easy for PEOPLE to do their work and even bringing in some “Wow” factors for your IT staff members as the system evolves.
• PEOPLE, PROCESS, TECHNOLOGY done.
Questions and Answers?
Final Slide
• Thank you for attending this session.
• Please fill out an evaluation form.