Yuriy Bondarenko, expert in monitoring and automation
Organize a process for reviewing user complaints and compensate for them with monitoring
Promotion by leadership
Part of the company's strategy
Presence of service catalog
Projects include monitoring phase
IT System #1
IT System #2
IT System #3
Business service (5 IT systems)
Business service (2 IT systems)
Business service (3 IT systems)
Application cluster
Database cluster
Balancer
Ping
Basic services
CPU utilization
Memory utilization
Disks availability
Disks % free space
Disks latency
Status code
Response time
Connection count
DNS lookup
SSL certificate check
telnet 80
Failed request
Daemon started
Exceptions log
telnet app port
JMX query
OLEDB query
Instance status
Database state
Free table space
Active sessions
And so on
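Two of the basic checks listed above ("telnet 80" and "Status code" / "Response time") can be sketched in a few lines. This is only a minimal illustration using the Python standard library; the function names `check_tcp_port` and `check_http` and the default timeouts are assumptions, not part of the original slides or of any specific monitoring product.

```python
import socket
import time
import urllib.error
import urllib.request


def check_tcp_port(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection can be opened (the 'telnet 80' style check)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat all as "port not available".
        return False


def check_http(url: str, timeout: float = 5.0) -> dict:
    """Return the HTTP status code and the response time in seconds for a GET."""
    started = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as err:
        status = err.code  # 4xx/5xx answers still carry a status code
    except (urllib.error.URLError, OSError):
        status = None  # no HTTP answer at all (DNS failure, refused, timeout)
    return {"status": status, "seconds": time.monotonic() - started}
```

A scheduler would call these periodically and store the results as metrics, for example `check_tcp_port("app-host", 80)`, where `app-host` is a placeholder host name.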
JSON | XML
HTML
HTTP POST
HTTP GET, POST
Try logon
Try get history
Try get data
Try logon
Try find
Try send sms
For example:
PowerShell + dll
SOAP query
HTTP GET, POST
WCF methods invoke
Using different API
JSON converter
XML converter
Invoke web scenarios
Check availability
Measure invocation speed
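The end-to-end idea above (run "Try logon", "Try get history" and so on, check availability and measure how fast each call is) can be sketched as a small scenario runner. The runner `run_scenario` and the two placeholder steps are hypothetical; real steps would wrap an HTTP, SOAP or WCF call and raise an exception when the call is rejected.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple

Step = Tuple[str, Callable[[], None]]


def run_scenario(steps: List[Step]) -> dict:
    """Run steps in order, time each one, and stop at the first failing step."""
    seconds: Dict[str, float] = {}
    failed_step: Optional[str] = None
    for name, action in steps:
        started = time.monotonic()
        try:
            action()  # a step signals failure by raising an exception
        except Exception:
            failed_step = name
        seconds[name] = time.monotonic() - started
        if failed_step is not None:
            break  # later steps depend on earlier ones, so do not run them
    return {
        "available": failed_step is None,
        "failed_step": failed_step,
        "seconds": seconds,
    }


def try_logon() -> None:
    """Placeholder step: replace with a real logon request."""


def try_get_history() -> None:
    """Placeholder step: replace with a real history request."""


result = run_scenario([("Try logon", try_logon), ("Try get history", try_get_history)])
```

The `available` flag feeds the availability metric, and the per-step `seconds` values feed the speed metric.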
Get service
Provide service
User experience. Send transaction history
Technical view. Send transaction history
[Diagram: request flow]
Request: HTTP POST (UserId, ServiceId, actionId) sent as a WebRequest
Systems: Self Care System, Order management, Billing system, Mail system
Components: Web server, Application server, Data Access Framework, JOB, Integration API, Sender log
Operations: Insert record (new order), Get orders, Update order, Send email
Technical view. Send transaction history
[Diagram: request flow with monitoring points]
Request: HTTP POST (UserId, ServiceId, actionId) sent as a WebRequest
Systems: Self Care System, Order management, Billing system, Mail system
Components: Web server, Application server, Data Access Framework, Integration API
Operations: Insert record (new order), Get orders, Update order, Send email
Monitoring points: data presence after order update; monitoring of inserting exception; check status after deleting; monitoring of deleting & sending exception; check status after sending
Get the criteria of unavailability
Map monitoring metrics to the criteria
Build a “tree of health” in monitoring
Define a methodology for calculating service availability
Define a methodology for calculating IT availability
Define requirements for the reporting system or real-time dashboard
Customer says: “The service is not available:
• when the money is not received for services rendered;
• when money is received but the service is not delivered.”
Customer’s requirements must map to monitoring metrics. For example:
• Service is unavailable = Base Page Status Code 500
• Service is unavailable = E2E logon is rejected
• Service is unavailable = email sending error & order updating error
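The mapping of unavailability criteria to monitoring metrics can be written down as a small rule table. The sketch below encodes the three example criteria; the metric keys (`base_page_status`, `e2e_logon_ok`, `email_send_error`, `order_update_error`) are hypothetical names chosen for illustration, not names from any real monitoring system.

```python
from typing import Callable, Dict, List

Metrics = Dict[str, object]

# Criterion text -> predicate over the latest metric values.
# Metric key names are assumptions made for this sketch.
UNAVAILABILITY_RULES: Dict[str, Callable[[Metrics], bool]] = {
    "Base page status code 500":
        lambda m: m.get("base_page_status") == 500,
    "E2E logon is rejected":
        lambda m: m.get("e2e_logon_ok") is False,
    "Email sending error & order updating error":
        lambda m: bool(m.get("email_send_error")) and bool(m.get("order_update_error")),
}


def unavailability_reasons(metrics: Metrics) -> List[str]:
    """Return every criterion that currently marks the service as unavailable."""
    return [name for name, rule in UNAVAILABILITY_RULES.items() if rule(metrics)]


def is_available(metrics: Metrics) -> bool:
    """The service is available only when no criterion fires."""
    return not unavailability_reasons(metrics)
```

Keeping the criteria in one table makes it easy to review them with the customer and to see which metric caused a service to be reported as down.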
Worst state
Best state
Monitoring metrics
Good
Bad
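One possible reading of the “tree of health” labels is that each node takes either the worst or the best state of its children, with monitoring metrics (Good / Bad) as the leaves. The sketch below assumes exactly that; the `Node` class, the `policy` field and the example tree are illustrative assumptions, not the author's implementation.

```python
from dataclasses import dataclass, field
from typing import List

GOOD, BAD = 0, 1  # a larger number means a worse state


@dataclass
class Node:
    name: str
    policy: str = "worst"  # "worst" or "best": assumed roll-up rules
    state: int = GOOD      # only used for leaf nodes (monitoring metrics)
    children: List["Node"] = field(default_factory=list)

    def health(self) -> int:
        """Leaf: its own state. Parent: worst or best state of its children."""
        if not self.children:
            return self.state
        states = [child.health() for child in self.children]
        return max(states) if self.policy == "worst" else min(states)


# Example: a redundant application cluster tolerates one bad node ("best"),
# while the service as a whole fails if any required part fails ("worst").
app_cluster = Node("Application cluster", policy="best", children=[
    Node("App node 1", state=BAD),
    Node("App node 2", state=GOOD),
])
balancer = Node("Balancer", state=GOOD)
service = Node("Business service", policy="worst", children=[app_cluster, balancer])
```

With this tree, `service.health()` stays `GOOD` while one application node is down, and turns `BAD` as soon as the balancer does.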
How can you calculate?
• Use the support schedule from the SLA (8x5, 12x7, 24x7)
• Use a repeat count to avoid false alarms (accuracy is lower, reliability is higher)
• Do not use the data layer in the calculation of availability
• Try to get the cost of service downtime. You will have a lot of customers
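The first two points (support schedule and repeat count) can be combined in a simple calculation over check results. This is a sketch under stated assumptions: the 8x5 window is taken as Monday to Friday, 09:00 to 17:00, checks arrive as `(timestamp, ok)` pairs, and a failure counts as downtime only from the `repeat_count`-th consecutive failed check onward. The function names are hypothetical.

```python
from datetime import datetime
from typing import Iterable, Tuple

Sample = Tuple[datetime, bool]  # (check time, check succeeded)


def in_support_window(ts: datetime, start_hour: int = 9, end_hour: int = 17,
                      days: Tuple[int, ...] = (0, 1, 2, 3, 4)) -> bool:
    """8x5 by default: Monday-Friday, 09:00-17:00 (the hours are an assumption)."""
    return ts.weekday() in days and start_hour <= ts.hour < end_hour


def availability_percent(samples: Iterable[Sample], repeat_count: int = 3) -> float:
    """Percentage of in-window checks that are not counted as downtime."""
    total = down = streak = 0
    for ts, ok in samples:
        streak = 0 if ok else streak + 1  # length of the current failure run
        if not in_support_window(ts):
            continue  # outside the SLA schedule: ignore for availability
        total += 1
        if streak >= repeat_count:
            down += 1  # only repeated failures count as downtime
    return 100.0 if total == 0 else 100.0 * (total - down) / total
```

The repeat count trades accuracy for reliability, as noted above: the first `repeat_count - 1` failed checks of a real outage are not counted, but a single failed check no longer registers as downtime.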
How can you calculate availability of IT?
You can use coefficients:
Service1 (Domain1) – 0,1
Service2 (Domain2) – 0,9
24h IT Uptime = 100% - (24h Serv1 DownTime% * k1 + 24h Serv2 DownTime% * k2)
Or other, more complex formulas
Coefficients can be built from:
• BCM
• Count of users
• Service profitability
Or you can use the plain average and not worry
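The coefficient-based calculation and the plain-average alternative can be compared in a short worked example. The weights 0.1 and 0.9 come from the slide; the downtime figures used in the usage note are purely illustrative, and the requirement that the weights sum to 1 is an assumption of this sketch.

```python
from typing import Dict


def it_uptime_percent(downtime_percent: Dict[str, float],
                      weights: Dict[str, float]) -> float:
    """IT uptime = 100% minus the weighted sum of per-service downtime (%)."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        # Assumption of this sketch: coefficients are normalised to sum to 1.
        raise ValueError("weights are expected to sum to 1")
    weighted = sum(downtime_percent[name] * weights[name] for name in weights)
    return 100.0 - weighted


def average_uptime_percent(downtime_percent: Dict[str, float]) -> float:
    """The plain-average alternative: every service has the same weight."""
    return 100.0 - sum(downtime_percent.values()) / len(downtime_percent)
```

For example, with 2% downtime for Service1 and 1% for Service2 over 24 hours, the weighted uptime is 100 - (2 * 0.1 + 1 * 0.9) = 98.9%, while the plain average gives 100 - 1.5 = 98.5%.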
The reporting system can be a periodic email with counters, a report, or a real-time dashboard with analytics
You have to know the architecture of the monitoring database
You have to have a mechanism of operational availability management!
Availability
Performance
Financial losses
DL, BL, PL influence
Monitoring detection
Critical incidents
Other Attributes of SLA