Hands-On Lab: Learn How to Harness CA Application Performance Management Differential Analysis to Reduce False Positives
David B. Martin, Srikant Noorani
DevOps: Agile Ops
CA Technologies
DO5X235L
#CAWorld
Application Performance Management
2 © 2015 CA. ALL RIGHTS RESERVED.@CAWORLD #CAWORLD
Abstract
Operations teams have long sought a solution that automatically
identifies performance problems in their applications without having
too many false alerts. In CA Application Performance Management
(CA APM) 10, the differential analysis capability uses a technique
new to the application performance management market that
mirrors the actions a human operator would perform to identify
when and where to act to solve performance issues. In this session,
you'll learn how this new approach identifies both slow-growing,
chronic problems and fast-acting acute ones, with no
configuration. You'll also see how differential analysis alerts you to
these conditions and automatically captures diagnostic transaction
traces for review.
David B. Martin
CA Technologies
Product Manager
Srikant Noorani
CA Technologies
Sr Engineering Services Architect
© 2015 CA. All rights reserved. All trademarks referenced herein belong to their respective companies.
The content provided in this CA World 2015 presentation is intended for informational purposes only and does not form any type of
warranty. The information provided by a CA partner and/or CA customer has not been reviewed for accuracy by CA.
For Informational Purposes Only
Terms of this Presentation
Agenda
WHY MODELS ARE FAILING
A BRIEF HISTORY OF APM ALERTING
CA TECHNOLOGIES’ DIFFERENTIAL ANALYSIS
MODELS ARE MADE TO BE BROKEN
DATA-DRIVEN DIVE INTO AUTOMATIC ALERTING MODELS
SHEWHART SAVES THE DAY
Keeping My Promise!
I will begin this session by making a detailed, data-centric case for why CA Technologies’ new Differential Analysis feature is a superior, market-leading approach to automatic alerting
No, I will not then pull a rabbit out of a hat. ‘Cuz this ain’t magic people… even if it looks like magic.
“Any sufficiently advanced technology is indistinguishable from magic.” -- A.C. Clarke
What Was CA Technologies’ Last Answer?
In the early 90s, Wily implemented Holt’s Linear Exponential Smoothing (HLES) to calculate baselines for metrics
Baselines were fooled by regular production events – many were more about regular patterns in load than about maintenance events. Seasonality debuts to address it.
This leads to rules – and rules engines – to address edge cases that seasonality does not address (e.g. “+3 std dev from baseline” to deaden the sensitivity of triggers)
And what are our competitors doing?
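For readers who have not met Holt's linear exponential smoothing, the baseline idea mentioned above can be sketched in a few lines. This is a minimal illustration of the textbook method, not Wily's implementation: the function name `holt_baseline` and the smoothing factors `alpha` and `beta` are assumptions chosen for the example.

```python
def holt_baseline(values, alpha=0.5, beta=0.3):
    """Return one-step-ahead forecasts from Holt's linear exponential smoothing.

    forecasts[i] is the baseline predicted for values[i] before observing it.
    alpha (level) and beta (trend) are illustrative, not product defaults.
    """
    if len(values) < 2:
        raise ValueError("need at least two observations")
    level = values[0]
    trend = values[1] - values[0]
    forecasts = [values[0]]  # no history yet, so echo the first observation
    for y in values[1:]:
        forecasts.append(level + trend)  # baseline for this interval
        prev_level = level
        # Blend the new observation with the previous projection.
        level = alpha * y + (1 - alpha) * (level + trend)
        # Blend the newly observed slope with the previous slope.
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return forecasts


if __name__ == "__main__":
    # A steadily rising metric is tracked closely by the baseline.
    print(holt_baseline([10, 12, 14, 16]))
```

Because the baseline follows level and trend only, any regular pattern in load shows up as a deviation from it, which is the weakness the slide says seasonality was introduced to address.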
What’s the Problem With the State-of-the-art?
As the following slides will explain, seasonal baselines miss problems that you don’t want to miss
Inevitably, they also report too often
When they do, you have to write rules to resolve the issue with your issues
Now you’ve failed to find the automatic alerting grail
It may actually be more efficient to go back to writing static thresholds for your key components
Or, a good reason for teaching you some interesting math…
[Chart: Average Response Time, with +1, +2, and +3 Std Dev bands]
This is a stable application response time, with bands of standard deviation. Most baselines are fancy forms of standard deviation that take into account things like seasonality.
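The bands on a chart like this can be computed directly from the samples. The sketch below is a minimal illustration using Python's standard library. The function name `deviation_bands` and the sample response times are made up for the example.

```python
from statistics import mean, pstdev


def deviation_bands(samples):
    """Return the mean and the +1, +2, +3 standard deviation bands."""
    mu = mean(samples)
    sigma = pstdev(samples)  # population standard deviation of the window
    return {
        "mean": mu,
        "+1": mu + sigma,
        "+2": mu + 2 * sigma,
        "+3": mu + 3 * sigma,
    }


if __name__ == "__main__":
    # Hypothetical response times in milliseconds for a stable application.
    response_times = [500, 510, 490, 520, 480]
    for name, value in deviation_bands(response_times).items():
        print(f"{name}: {value:.1f} ms")
```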
[Chart: a single outlier spike]
An outlier… What to do? If it’s in a seasonal window, it has to be a bigger outlier, but the problem of “To Alert, or Not to Alert” remains the same.
You must either send an alert for this single spike, or write a rule to say that the spike has to be “so big” before you care (which is usually done with a manually written rule like “> 3 std dev”).
“Mr. Ops won’t even put down his sandwich for a single failed transaction.”
[Chart: a sustained spike]
What about the situation of a sustained spike?
Supposedly, seasonality cancels out the normal operations. But how many of you have apps in which a single user logs in and starts running expensive (e.g. reporting) transactions?
The traditional approach again has to decide when to alert. If app users log in at irregular intervals and perform this type of transaction, then alerts trigger on their normal (non-seasonal) activity, and the result is “cat alerts > /dev/null”.
But how long do you wait then? Once again, a decision YOU have to make and configure for each of your apps.
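The "how long do you wait" decision can be made concrete with a small sketch of a hand-written rule. Everything here is hypothetical: the function name `sustained_breach_alerts`, the `threshold` and `hold` parameters, and the sample data. It only illustrates the trade-off described above.

```python
def sustained_breach_alerts(values, threshold, hold):
    """Return indexes where a value has exceeded `threshold` for `hold`
    consecutive samples. Each sustained run alerts once, when it reaches `hold`.
    """
    alerts = []
    run = 0  # length of the current run of samples above the threshold
    for i, value in enumerate(values):
        run = run + 1 if value > threshold else 0
        if run == hold:
            alerts.append(i)
    return alerts


if __name__ == "__main__":
    # One isolated spike followed by a sustained spike (made-up numbers, ms).
    series = [500, 2000, 500, 2000, 2000, 2000, 2000, 500]
    for hold in (1, 3, 5):
        print(hold, sustained_breach_alerts(series, threshold=1500, hold=hold))
```

With `hold=1` the rule fires on the isolated spike, with `hold=3` it fires only partway into the sustained event, and with `hold=5` it never fires at all. Whoever writes the rule has to pick that number per application.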
[Chart: a sustained change in response time]
Better hope that sustained, normal changes in response time are seasonal when they happen…
If not, you must write rules!
And if you write rules, you might accidentally deaden the threshold to actual problems.
Dang, gum!
Our Hero: Walter Shewhart
In the 1920s, Walter Shewhart et al worked on quality control for buried telephone lines
Shewhart observed that while every line displays variation, some lines occasionally display uncontrolled variation. Like a seismometer, there are normal fluctuations and then there are earthquakes.
Shewhart invented Control Charts, the foundation on which the Western Electric Rules (codified at Western Electric in the 1950s) identify uncontrolled variance, earning himself the title: “Father of Statistical Quality Control”
Translation Please!
Shewhart taught us to favor real-time observation over mathematical models of a signal’s behavior
We still baseline the signal, but the Western Electric Rules define the situations in which the signal should be considered to be in a bad state, rather than relying on a simple delta from the baseline model
Shewhart’s method of characterizing the quality of a signal mirrors the behavior of a human observer
Trust us, you will understand this math…
Shewhart’s Western Electric Rules
Straight off Wikipedia…
The canonical Western Electric Rules use plain, old standard deviation as their real-time measure. Each rule identifies a pattern in the signal:
Rule #1 – A statistically interesting outlier
Rule #2 – Two somewhat interesting outliers out of three measurements.
Rule #3 – Four smaller outliers out of five measurements.
Rule #4 – Many small outliers over many measurements.
This much we flat out stole from math history!
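The four patterns above can be sketched as code. The thresholds used here (beyond 3 sigma, two of three beyond 2 sigma, four of five beyond 1 sigma, and a run of consecutive points on one side of the mean) follow the commonly published statement of the rules, not anything shown on the slide. The run length for Rule #4 is usually given as eight or nine depending on the source, so it is a parameter. The function name and return shape are assumptions for illustration.

```python
def western_electric_breaches(values, mean, sigma, run_length=8):
    """Return {rule_number: [indexes]}; a rule fires at the last point
    of the window that satisfies it.
    """
    z = [(v - mean) / sigma for v in values]  # distance from mean in sigmas
    breaches = {1: [], 2: [], 3: [], 4: []}
    for i in range(len(z)):
        # Rule 1: a single point more than 3 sigma from the mean.
        if abs(z[i]) > 3:
            breaches[1].append(i)
        for side in (1, -1):  # check above the mean, then below it
            # Rule 2: two of the last three points beyond 2 sigma, same side.
            if i >= 2 and sum(side * x > 2 for x in z[i - 2:i + 1]) >= 2:
                breaches[2].append(i)
            # Rule 3: four of the last five points beyond 1 sigma, same side.
            if i >= 4 and sum(side * x > 1 for x in z[i - 4:i + 1]) >= 4:
                breaches[3].append(i)
            # Rule 4: run_length consecutive points on the same side.
            window = z[i - run_length + 1:i + 1]
            if i >= run_length - 1 and all(side * x > 0 for x in window):
                breaches[4].append(i)
    return breaches


if __name__ == "__main__":
    # Made-up response times around a baseline of 500 ms with sigma 10 ms.
    print(western_electric_breaches([500, 535, 500], mean=500, sigma=10))
```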
CA Technologies Innovation
Western Electric Rules are brilliant for real-time analysis of both telephone signals and application signals
A single rule breach, however, is too dull a blade for slicing through this tough problem
By assigning weights to each rule breach, keeping a running sum, and aging out old breaches, we can produce a single, normalized value for variance intensity
APM 10 has several patents pending…
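The idea of weighting rule breaches, keeping a running sum, and aging out old breaches can be illustrated with a small sketch. This is not CA's algorithm, whose details the slides do not give. The weights, the exponential `decay` used for aging, the `cap` used for normalization, and all names below are assumptions made purely for illustration.

```python
# Hypothetical weights: a breach of a stricter rule counts for more.
RULE_WEIGHTS = {1: 4.0, 2: 3.0, 3: 2.0, 4: 1.0}


def variance_intensity(breaches_per_interval, decay=0.5, cap=10.0):
    """Return one normalized intensity (0.0 to 1.0) per interval.

    breaches_per_interval is a list of lists of rule numbers breached in
    each interval. Old breaches fade by `decay` every interval (assumed
    aging scheme) and the running sum is clipped at `cap` before scaling.
    """
    score = 0.0
    intensities = []
    for breached in breaches_per_interval:
        score = score * decay + sum(RULE_WEIGHTS[rule] for rule in breached)
        intensities.append(min(score, cap) / cap)
    return intensities


if __name__ == "__main__":
    # A lone outlier fades quickly; repeated breaches build up and saturate.
    print(variance_intensity([[1], [], []]))
    print(variance_intensity([[1, 2], [1, 2], [2, 3]]))
```

Under these assumptions a single Rule #1 breach never reaches a high intensity on its own, while breaches that keep arriving push the value toward its maximum. That is the behavior the slide describes as a sharper blade than a single rule breach.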
In a busy system, there are always varying levels of stability.
In this picture, can you tell which signals are least stable?
This signal experienced an outlier, but it didn’t turn blue.
A single rule breach isn’t enough for “Pete to put down his sandwich”.
In this case, the change in stability was sustained over about forty minutes.
What happened? Click to find out…
This application experienced a remarkable degradation in performance over a forty-minute period of time.
Both the old approach and our new approach would alert here, but CA’s alert would happen early in the event, and trigger trace collection automatically.
The old approach might not have let an operator know for thirty minutes or more, based on the rules they configured.
Triage is a battlefield medicine term: where are the wounded soldiers?
CA’s approach means identifying chronic problems as well as acute ones. Which of these lines are more stable but still have chronic stability events at regular intervals?
Differential Analysis Default Configuration
CA Technologies Team Pegasus!
Clockwise from left: Prashant Pathak, Mark LoSacco, Weini Yu, Prasanna Ram Venkatachalam, Naresh Chippada, Carey Feldstein, Paul Callahan, and Sai Krishna Rayanapati [not pictured: me]
Recommended Sessions
SESSION # | TITLE | DATE/TIME
DO5X189S | How to Achieve a Customer-Centric View in an Omni-Channel World | 11/18/2015 at 1:00 pm
DO5X194S | Monitor Microservices, Containers, Cloud Foundry and Node with CA Application Performance Management | 11/18/2015 at 4:30 pm
DO5X193S | Customize CA Application Performance Management with Tips for Using the CA Application Performance Management Open APIs | 11/19/2015 at 4:30 pm
Must See Demos
Application Performance Management and DevOps, featuring APM use in preproduction scenarios
Application Performance Management
Theater 5
Application Performance Management, Modern Monitoring, featuring the new APM Team Center
Application Performance Management
Theater 5
Ensuring a “5 star” mobile app experience with CA Mobile App Analytics
Mobile App Analytics
Theater 5
Unified Monitoring: APM Integrations including UIM
Application Performance Management
Theater 5
Follow on Conversations At…
Smart Bar
Application Performance Management
Theater 5
Tech Talks
Application Performance Management
Theater 5
Q & A
For More Information
To learn more, please visit:
http://cainc.to/Nv2VOe
CA World ’15