Service Management @ Colruyt


Page 1: Service Management @ Colruyt

slide 1

Service Management @ Colruyt

Page 2: Service Management @ Colruyt

slide 2

Team Manager Service Management

[email protected]

Frank Waegeman

Page 3: Service Management @ Colruyt

slide 3

Assignment Service management

Service Management BP&S has the overall responsibility, together with all stakeholders, to ensure that the operations and support of the operational BP&S products and services meet and continue to meet the agreed service levels

We keep SH.. out of ..IT

Page 4: Service Management @ Colruyt

slide 4

Role of Service Mgmt in the Service Life Cycle

Service Management

Solution Delivery

Solution

Managed service

Solutions deliver the new functional and non-functional requirements and fix the service levels

Service Management ensures that we keep the agreed SLEs

Page 5: Service Management @ Colruyt

slide 5

Guidelines for Service management

Used standard: ITIL (“Information Technology Infrastructure Library”)

= Goal

= Series of best practices (guidance) to set up the necessary operational processes for an (ICT) organisation

Service management ensures that these processes can be incorporated within BP&S

Page 6: Service Management @ Colruyt

slide 6

The processes… Reference model

Business IT Alignment: IT Strategy development, Business Assessment, Customer Management, Service Planning

Service Design & Management: Continuity, Capacity, Cost, SLA Management, Availability

Service development & deployment: Build & test, Release to Production

Operation bridge: Incident, Event, Problem

Request Fulfilment, Configuration, Change

Page 7: Service Management @ Colruyt

slide 7

The processes… (the process reference model of slide 6, shown again)

Page 8: Service Management @ Colruyt

slide 8

Why Service Management?

Page 9: Service Management @ Colruyt

slide 9

For the business, PRODUCTION IS CRUCIAL

Page 10: Service Management @ Colruyt

slide 10

Operational ITIL

We make every effort to keep the production environment stable, today and tomorrow.

To achieve this we need to set up different processes.

PRODUCTION

Page 11: Service Management @ Colruyt

slide 11

Operational ITIL

You can only have a stable production environment if you have control over the operational changes

PRODUCTION

CHANGE

Page 12: Service Management @ Colruyt

slide 12

Operational ITIL

Having control over the operational changes means:

- knowing the correct impact of a change

- knowing ALL the changes

- planning and communicating each change

Diagram labels: PRODUCTION, CHANGE, ChangeCalendar, ITChange, ITCONFIG

Page 13: Service Management @ Colruyt

slide 13

Operational ITIL

Asset management is mandatory for asset validation:

CHANGE_ASSET = INCIDENT_ASSET = EVENT_ASSET

Diagram labels: PRODUCTION, CHANGE, ChangeCalendar, ITChange, ITCONFIG, ITASSET
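The equality CHANGE_ASSET = INCIDENT_ASSET = EVENT_ASSET can be read as "every change, incident and event must name an asset from one shared inventory". A minimal Python sketch of that validation follows. The `Record` type, the `validate` function and the contents of `KNOWN_ASSETS` are assumptions made for illustration (the asset names are borrowed from the impact slide); the slides do not describe how ITASSET is implemented.

```python
from dataclasses import dataclass

# Hypothetical shared inventory standing in for ITASSET.
KNOWN_ASSETS = frozenset({"SVLIPC71", "ORACPC50", "XFBS011102"})


@dataclass(frozen=True)
class Record:
    kind: str   # "change", "incident" or "event"
    asset: str  # name of the asset the record refers to


def validate(record: Record, inventory: frozenset = KNOWN_ASSETS) -> bool:
    """Return True only if the record names an asset from the shared inventory."""
    return record.asset in inventory


records = [
    Record("change", "XFBS011102"),
    Record("incident", "XFBS011102"),
    Record("event", "UNKNOWN_BOX"),  # not in the inventory, so rejected
]
valid = [r for r in records if validate(r)]
```

Because all three record types are checked against the same inventory, a change and an incident on the same asset can be correlated by name.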

Page 14: Service Management @ Colruyt

slide 14

Operational ITIL

Having control over the impact means:

- knowing the change window of an impacted asset

- knowing what an end user needs (inventory of assets)

- communicating the changes for each ITService

Diagram labels: PRODUCTION, CHANGE, ChangeCalendar, ITChange, ITCONFIG, ITASSET, ITSERVICES, Change Window, Unavailability, SLA & SLE
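"Knowing the change window of an impacted asset" amounts to checking whether a planned change fits inside that window. The Python sketch below shows one way to express the check. The `ChangeWindow` class, its weekly single-day shape and the Sunday 02:00-06:00 example are assumptions, not something the slides specify.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class ChangeWindow:
    weekday: int  # 0 = Monday ... 6 = Sunday
    start: time
    end: time

    def contains(self, begin: datetime, finish: datetime) -> bool:
        """True if the interval [begin, finish] lies inside the window on a single day."""
        return (
            begin <= finish
            and begin.date() == finish.date()
            and begin.weekday() == self.weekday
            and self.start <= begin.time()
            and finish.time() <= self.end
        )


# Hypothetical window: Sundays between 02:00 and 06:00.
window = ChangeWindow(weekday=6, start=time(2, 0), end=time(6, 0))

# 24 January 2010 was a Sunday.
fits = window.contains(datetime(2010, 1, 24, 2, 30), datetime(2010, 1, 24, 4, 0))
too_long = window.contains(datetime(2010, 1, 24, 5, 0), datetime(2010, 1, 24, 7, 0))
```

A change that overruns the window (`too_long`) would show up as unavailability against the SLE, which is why the window has to be known before the change is planned.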

Page 15: Service Management @ Colruyt

slide 15

Change: Goal

Ensure that changes can happen within the agreed SLEs and without affecting the stability of production

Page 16: Service Management @ Colruyt

slide 16

Change: How

• Having control over the changes:

– Each Change is communicated (ITChange)

– Each Change is planned (ChangeCalendar)

– Each Change impact is known

– Each Change is authorised

The CAB (Change Advisory Board) manages all changes.
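The four controls on this slide (communicated, planned, impact known, authorised) can be sketched as a simple pre-flight check on a change record. This is an illustration only: the `Change` class, its field names and `missing_controls` are hypothetical and do not describe the actual ITChange or ChangeCalendar tools.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Change:
    title: str
    communicated: bool = False                     # announced through ITChange
    planned_on: Optional[date] = None              # slot in the ChangeCalendar
    impacted_services: Optional[List[str]] = None  # None means impact not analysed yet
    authorised: bool = False                       # approved by the CAB


def missing_controls(change: Change) -> List[str]:
    """List the controls from the slide that this change does not satisfy yet."""
    missing = []
    if not change.communicated:
        missing.append("communicated")
    if change.planned_on is None:
        missing.append("planned")
    if change.impacted_services is None:
        missing.append("impact known")
    if not change.authorised:
        missing.append("authorised")
    return missing


draft = Change("Upgrade RAC database", communicated=True, planned_on=date(2010, 2, 7))
ready = Change(
    "Replace fibre cable",
    communicated=True,
    planned_on=date(2010, 2, 14),
    impacted_services=["VERKOOP_FVS2000"],
    authorised=True,
)
```

Here `missing_controls(draft)` reports the impact analysis and the authorisation as outstanding, while `ready` passes all four checks.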

Page 17: Service Management @ Colruyt

slide 17

Configuration & Asset: Impact Analysis & dependencies

The environment becomes more and more complex

The impact becomes bigger

Extra availability becomes ‘normal’

The change windows become smaller

How can we keep an overview of all these assets & relations?

Page 18: Service Management @ Colruyt

slide 18

When can I switch this cable?

What is the impact?

When can I maintain the UPS System?

When can I deploy this middleware service?

When can I upgrade the RAC Database?

How can I move a datacenter?

When can I install a new application server?

Page 19: Service Management @ Colruyt

slide 19

IMPACT

80% of all unavailabilities are due to changes (Gartner)

Today 99% of all changes at Colruyt run fine, but changes still cause more than 40% of all unavailabilities…

Page 20: Service Management @ Colruyt

slide 20

Impact?

Which services are impacted when I pull the fibre cable connected to the director XFBS011102 on port 26 module 2?

Page 21: Service Management @ Colruyt

slide 21

IMPACT?

Director 1: XFBS011101-FC2/26, XFBS011101-FC6/4, XFBS011101-FC9/4

Director 2: XFBS011102-FC2/26, XFBS011102-FC6/4, XFBS011102-FC9/4

SAN: DS8300W-50050763060005D4, DS8300W-50050763060B05D4, DS8300W-50050763061405D4, DS8300W-50050763061905D4

Bootdisk (SAN)

FIBERCARD1: SVLIPC71-500110A00016C17E

FIBERCARD2: SVLIPC71-50050763060005D4

SVLIPC71 Wilgenveld 1214B RACK AD41

NETWORK: XWBS013P21 – GI0/12

MAC: SVLIPC71-001A64D32554

ORACPC50

BRSTD001@ORACPC50

ORACPC50_PROCESS

DS-JDBC_BRANCHCOUNT

BRANCHCOUNT001

TELLINGEN_ALIAS

ITSERVICE: VERKOOP_FVS2000

The ITService VERKOOP_FVS2000 has 1199 dependencies (result on 20/01/2010)

The impact list of component XFBS011102-FC2/26 contains 1954 entries (result on 20/01/2010)
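An impact list such as the one for XFBS011102-FC2/26 can be produced by walking the dependency relations upward from the component to everything that relies on it. The Python sketch below uses a breadth-first traversal. The edges in `USED_BY` are a hypothetical chain assembled from names on this slide; the real relations in the configuration database are not recoverable from the flattened diagram.

```python
from collections import deque

# Hypothetical "is used by" edges: key -> components that depend on it.
USED_BY = {
    "XFBS011102-FC2/26": ["SVLIPC71"],
    "SVLIPC71": ["ORACPC50"],
    "ORACPC50": ["BRSTD001@ORACPC50"],
    "BRSTD001@ORACPC50": ["DS-JDBC_BRANCHCOUNT"],
    "DS-JDBC_BRANCHCOUNT": ["BRANCHCOUNT001"],
    "BRANCHCOUNT001": ["VERKOOP_FVS2000"],
}


def impact_list(component, used_by=USED_BY):
    """Return every item that directly or indirectly depends on `component`."""
    seen = {component}  # guards against cycles in the relation data
    queue = deque([component])
    impacted = []
    while queue:
        current = queue.popleft()
        for dependant in used_by.get(current, []):
            if dependant not in seen:
                seen.add(dependant)
                impacted.append(dependant)
                queue.append(dependant)
    return impacted


impacted = impact_list("XFBS011102-FC2/26")
```

With these assumed edges, pulling the cable on port FC2/26 reaches the ITService VERKOOP_FVS2000 after six hops; the real database reports 1954 entries for the same component.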

Page 22: Service Management @ Colruyt

slide 22

RELATIONS

Country, Site, Building, Room, Rack

Physical server

Logical server

MF, Others, Windows, Network component, Unix, Linux

LPAR

STC’s

Logical Database

Physical Database

JDBC connection

Middleware services

Application

Universe

Reports

IRAP

ESX

Storage

IMSL

Fibercard

Blade Chassis

Load balancer

Queue

Windows Services

Bootdisk

ITSERVICES

CICS

WAS

ITELEMENTS ITFUNCTIONS

Windows Shares

Page 23: Service Management @ Colruyt

slide 23

ITService e.g. Finance

AGENDA MUST
ARCHIVES MUST
ATST SHOULD
DIENSTINFO_SHARE MUST
EXCEL SHOULD
INTERNET_CONNECTIVITY MUST
IRAP MUST
FILT SHOULD
MICF SHOULD
ONKO MUST
PAFW SHOULD
PEOPLESOFT_HUMAN_RESOURCES MUST
PERSONEELSDIENST_SHARE MUST
PNPEPAFW_REPORTGROUP MUST
TELEFONIE SHOULD
....

ITSERVICE: 35 top levels defined by the FA

ASSET: these 35 top levels give 685 dependencies for this ITService

Page 24: Service Management @ Colruyt

slide 24

Extra availability

Extra availability is a period outside the normal availability hours when you want to make use of the ITService

e.g. Extra work needs to be done on Saturday

e.g. No changes on related ITServices because the financial year closure takes place the first 2 weeks of April

e.g. Next week project H59A asks for full exclusivity for changes because of the size of the project

e.g. A demo will take place at the fair this weekend
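The examples above suggest that each extra-availability period blocks changes on the related ITService for a date range. A minimal Python sketch of that lookup follows. The `ExtraAvailability` record, the service name "FINANCE" and the 2010 dates are assumptions chosen to mirror the financial year closure example.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class ExtraAvailability:
    itservice: str
    first_day: date
    last_day: date
    reason: str


def blocking_periods(
    itservice: str, change_day: date, periods: List[ExtraAvailability]
) -> List[ExtraAvailability]:
    """Return the extra-availability periods that cover `change_day` for this ITService."""
    return [
        p
        for p in periods
        if p.itservice == itservice and p.first_day <= change_day <= p.last_day
    ]


# Hypothetical entry: financial year closure in the first two weeks of April.
periods = [
    ExtraAvailability("FINANCE", date(2010, 4, 1), date(2010, 4, 14), "financial year closure"),
]

blocked = blocking_periods("FINANCE", date(2010, 4, 6), periods)
free = blocking_periods("FINANCE", date(2010, 4, 20), periods)
```

A non-empty result means the change has to be moved or explicitly agreed with whoever requested the extra availability.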

Page 25: Service Management @ Colruyt

slide 25

Frozen period

During the whole month of December we reduce the number of changes to an absolute minimum for the complete Colruyt Group

because:

- This period is too crucial for the Colruyt Group to take risks (each change is a risk…)

- We notice that a yearly ‘rest period’ for our IT is good for stability
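The frozen period can be expressed as a date check applied before a change is scheduled. The sketch below is a hedged illustration: only "December" comes from the slide, while the `review_change` policy and its three outcomes are assumptions about how "an absolute minimum" might be enforced.

```python
from datetime import date

FROZEN_MONTH = 12  # December, as stated on the slide


def in_frozen_period(day: date) -> bool:
    """True if the day falls inside the yearly frozen period."""
    return day.month == FROZEN_MONTH


def review_change(day: date, is_urgent: bool) -> str:
    """Hypothetical policy: urgent changes need an exception, all others wait."""
    if not in_frozen_period(day):
        return "normal change process"
    return "exception required" if is_urgent else "postpone to January"


november = review_change(date(2010, 11, 30), is_urgent=False)
december = review_change(date(2010, 12, 15), is_urgent=False)
urgent = review_change(date(2010, 12, 15), is_urgent=True)
```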

Page 26: Service Management @ Colruyt

slide 26

The processes… (the process reference model of slide 6, shown again)

Page 27: Service Management @ Colruyt

slide 27

Incident: What

• An incident is an event caused by a disruption or a reduction in quality of a service

Page 28: Service Management @ Colruyt

slide 28

Incident: Goal

- Return as soon as possible to the ‘normal situation’ so the end user can continue doing his job

- Minimise the negative impact on the business operation

It is not the goal of incident management to fix the problem permanently (cost vs benefit)

An incident is fixed when the end user (EU) can continue with his work and agrees with the proposed solution

Page 29: Service Management @ Colruyt

slide 29

Page 30: Service Management @ Colruyt

slide 30

Information Request

Information Requests are handled by the key user of the application on the business side

Page 31: Service Management @ Colruyt

slide 31

Disaster: Escalation of an incident

• Prio1 and 2 incidents can be escalated to disaster by helpdesk

• Escalated incidents are evaluated by a disaster coordinator

• Not every escalated incident results in a disaster!

• The disaster coordinator coordinates the disaster until the incident is under control

• Tools: Adobe Connect, disastertel, disaster room

Page 32: Service Management @ Colruyt

slide 32

Request Fulfilment: What

• Handles standard IT requests (computer, keyboard, software, hardware, mobile devices,...) of an end user

• <> INCIDENT! (a request is not an incident)

Page 33: Service Management @ Colruyt

slide 33

Request Fulfilment: Help

• Link @ Portal to Servicedesk

Page 34: Service Management @ Colruyt

slide 34

Event Monitoring: What

Monitors all events that occur throughout the IT infrastructure, to monitor normal operation and to detect and escalate exception conditions

We have:

– Passive monitoring: detects operational events generated by a configuration item (asset)

– Active monitoring: actively tests the health status of a configuration item (asset)

Page 35: Service Management @ Colruyt

slide 35

Event Management

How does ITO work?

Collecting: SNMP traps, Application & System Log Monitoring, System Messages, Mail2ITO

Processing: Filtering, Priority, Grouping, Threshold

Acting: Automatic Actions, Operator Initiated Actions, Incident Management, Notification (SMS)
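The collecting, processing and acting stages can be sketched as a small pipeline. The Python code below is an illustration of the idea, not a description of ITO itself: the `Event` type, the severity scale (1 = most severe), the default cut-offs and the action names are all assumptions.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    source: str    # e.g. "snmptrap", "syslog", "mail"
    asset: str
    message: str
    severity: int  # assumed scale: 1 = most severe


def process(events, max_severity=3, threshold=2):
    """Filter, group and threshold events; return alerts sorted by priority."""
    # Filtering: drop events that are less severe than max_severity.
    kept = [e for e in events if e.severity <= max_severity]
    # Grouping: count identical (asset, message) pairs.
    counts = Counter((e.asset, e.message) for e in kept)
    worst = {}
    for e in kept:
        key = (e.asset, e.message)
        worst[key] = min(worst.get(key, e.severity), e.severity)
    # Threshold: only groups seen at least `threshold` times become alerts.
    alerts = [
        (asset, message, count, worst[(asset, message)])
        for (asset, message), count in counts.items()
        if count >= threshold
    ]
    # Priority: most severe alerts first.
    return sorted(alerts, key=lambda alert: alert[3])


def act(alert):
    """Hypothetical acting rule: SMS for the most severe alerts, incident otherwise."""
    _asset, _message, _count, severity = alert
    return "notify_sms" if severity == 1 else "open_incident"


events = [
    Event("snmptrap", "SVLIPC71", "link down", 1),
    Event("snmptrap", "SVLIPC71", "link down", 1),
    Event("syslog", "ORACPC50", "disk 80% full", 3),
    Event("syslog", "ORACPC50", "debug noise", 5),
]
alerts = process(events)
actions = [act(a) for a in alerts]
```

In this run only the repeated "link down" event survives filtering and thresholding, and it triggers an SMS notification.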

Page 36: Service Management @ Colruyt

slide 36

Automatic Actions

Operator Initiated Actions

Fixes

Workarounds

Filter

Event Overview

OPERATIONS

SUPPORT TEAM

Monitoring Strategy

MACHINES

ITO

HELPDESK

CONFIG

INCIDENT PROBLEM

APPLICATIONS

END USERS

INCIDENT

CHANGE

Page 37: Service Management @ Colruyt

slide 37

Problem: What

Problem management is focused on:

• Solving the underlying cause of an incident: “How can we avoid this?”

• Ideas from the end user

• Managing problems that you deliberately decided not to fix

• status REJECTED!

active & proactive

Page 38: Service Management @ Colruyt

slide 38

Questions

Thanks