Quantifying Abnormal Behavior
Baron Schwartz • VividCortex
Me
High Performance MySQL, 3rd Edition (Covers Version 5.5)
Optimization, Backups, Replication, and more
Baron Schwartz, Peter Zaitsev & Vadim Tkachenko
Baron Schwartz
baron - at - vividcortex.com
@xaprb
The Goal
Is the system in trouble?
Find problems early & small
Prevent problems from growing
Primitive Health Checks
System is dead/down
Metric exceeds threshold
Threshold Pain
False alarms
Missed alarms
Decisions, decisions
Then the glory of the Lord departed from off the threshold
- Ezekiel 10:18
How Do Systems Fail?
Down/dead/unavailable is “rare”
Partial failures are common
Failures escalate over time
Abnormality Detection
Hey, it’s a starting point.
Look For Improbable Events?
Statistics Refresher
Variance: the mean of the square minus the square of the mean
Standard Deviation: √variance
Z-Score: how many standard deviations a measurement is from the mean
The distribution of sample means is asymptotically Gaussian, regardless of the underlying distribution (given independent samples with finite variance).
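The three definitions above can be sketched in a few lines of Python. This is only an illustration: the function name `z_score` and the choice to return 0.0 when the standard deviation is zero are assumptions made for the example, not something from the talk.

```python
import math

def z_score(samples, x):
    """Z-score of x relative to samples (hypothetical helper).

    Uses variance = mean of the squares minus the square of the mean.
    """
    n = len(samples)
    mean = sum(samples) / n
    mean_of_squares = sum(s * s for s in samples) / n
    # Rounding can push the difference slightly below zero, so clamp it.
    variance = max(mean_of_squares - mean * mean, 0.0)
    stddev = math.sqrt(variance)
    if stddev == 0:
        return 0.0  # assumption: a constant series has no meaningful z-score
    return (x - mean) / stddev

history = [2, 4, 4, 4, 5, 5, 7, 9]  # mean 5, standard deviation 2
print(z_score(history, 9))  # → 2.0
```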
Is It Really Unlikely?
Systems Are Continually Abnormal
Statistical Process Control
Shewhart Control Chart
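A minimal sketch of the simplest form of a Shewhart chart, assuming control limits are set at the baseline mean plus or minus three standard deviations and that each point is judged individually. The function names, the `k` parameter, and the sample data are hypothetical; production control charts often work on subgroup means and apply additional run rules.

```python
import statistics

def control_limits(baseline, k=3.0):
    """Return (lower, center, upper) limits from a baseline period."""
    center = statistics.fmean(baseline)
    sigma = statistics.pstdev(baseline)
    return center - k * sigma, center, center + k * sigma

def out_of_control(points, baseline, k=3.0):
    """Indexes of points that fall outside the control limits."""
    lower, _, upper = control_limits(baseline, k)
    return [i for i, p in enumerate(points) if p < lower or p > upper]

baseline = [10, 12, 11, 9, 10, 11, 10, 9]   # made-up "normal" period
print(out_of_control([10, 11, 20, 9], baseline))  # → [2]
```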
Sliding Windows
Tail Wags Dog?
Holt-Winters Forecasting
We’re Doing It Wrong.
Measuring What Matters
What matters is whether the system is getting its work done.
Measure work, not just status or activity.
Know the meaning of the metrics.
Did you attend Brendan Gregg’s talk?
Little’s Law: N=XR
Utilization Law: U=SX
Universal Scalability Law
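The slide only names the law. As a reminder, its usual form models throughput at concurrency N as X(N) = λN / (1 + σ(N−1) + κN(N−1)). The sketch below assumes that form; the parameter names `lam` (throughput at N = 1), `sigma` (contention), and `kappa` (coherency delay) follow common notation and are not taken from the talk.

```python
def usl_throughput(n, lam, sigma, kappa):
    """Predicted throughput at concurrency n under the Universal Scalability Law.

    lam:   throughput of a single worker (N = 1)
    sigma: contention (serialization) coefficient
    kappa: coherency (crosstalk) coefficient
    """
    return lam * n / (1.0 + sigma * (n - 1) + kappa * n * (n - 1))

# With no contention and no coherency delay, scaling is linear.
print(usl_throughput(8, lam=100, sigma=0.0, kappa=0.0))  # → 800.0
```

With nonzero `kappa`, predicted throughput eventually peaks and then declines as concurrency grows, which is what makes the model useful for spotting capacity limits.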
Work-Related Metrics
N: Concurrency
X: Throughput
R: Response Time
U: Utilization
S: Service Time
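These five metrics are tied together by Little's Law (N = XR) and the Utilization Law (U = SX). A small sketch, with hypothetical function names and made-up numbers, shows how two measured values yield a third:

```python
def concurrency(throughput, response_time):
    """Little's Law: N = X * R."""
    return throughput * response_time

def utilization(service_time, throughput):
    """Utilization Law: U = S * X."""
    return service_time * throughput

# Made-up example: 500 queries/sec, 20 ms response time, 1 ms service time.
x = 500.0   # X: throughput, queries per second
r = 0.020   # R: response time, seconds
s = 0.001   # S: service time, seconds

print(concurrency(x, r))   # average number of queries in flight
print(utilization(s, x))   # fraction of time the server is busy
```

Here roughly 10 queries are in flight on average and the server is busy about half the time.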
Realtime
At scale, in-memory operation is helpful
Rolling windows are less practical
CPU-intensive operations are impractical
The distant past has little relevance
Exponentially Weighted Moving Average
Define a decay factor α between 0 and 1, then:
avg = avg•(1-α) + sample•α
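A minimal sketch of the update rule. The previous average is weighted by (1 − α) and the new sample by α, so the two weights sum to 1. Seeding the average with the first sample is an assumption made here; the function names are hypothetical.

```python
def ewma_update(avg, sample, alpha):
    """One EWMA step: blend the old average with the new sample."""
    return avg * (1.0 - alpha) + sample * alpha

def ewma(samples, alpha):
    """EWMA of a sequence, seeded with its first sample (an assumption)."""
    it = iter(samples)
    avg = next(it)
    for sample in it:
        avg = ewma_update(avg, sample, alpha)
    return avg

print(ewma([10, 20], alpha=0.5))  # → 15.0
```

Only the current average is kept in memory, which is why this suits real-time use better than a rolling window.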
How To Choose α
Moving Window: age = N/2
EWMA: α = 2/(N+1)
Ex: 2/61 ≈ .032786885 for a “60-second window” with an average age of 30 seconds
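The rule α = 2/(N+1) can be checked numerically. In the sketch below, `alpha_for_window` applies the formula from the slide, and `average_age` computes the mean age of the samples under EWMA weighting, (1 − α)/α, which is a standard property of geometric weights. Both names are hypothetical.

```python
def alpha_for_window(n):
    """Decay factor whose average sample age matches an n-sample window."""
    return 2.0 / (n + 1)

def average_age(alpha):
    """Mean age, in samples, of the data behind an EWMA with this alpha."""
    return (1.0 - alpha) / alpha

alpha = alpha_for_window(60)
print(alpha)               # 2/61
print(average_age(alpha))  # close to 30, half of a 60-sample window
```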
Exponentially Weighted Moving Statistics
Variance = EWMA of squares minus squared EWMA
Standard deviation = √EWMVar
Z-Score = Number of EWMStddev from the EWMA
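One way these three statistics might be tracked together, sketched under a few assumptions: both moving averages are seeded from the first sample, a slightly negative variance caused by rounding is clamped to zero, and the z-score is reported as 0.0 when there is no spread yet. The class name `EwmStats` is hypothetical.

```python
import math

class EwmStats:
    """Tracks an EWMA of samples and an EWMA of squared samples."""

    def __init__(self, alpha):
        self.alpha = alpha
        self.mean = None      # EWMA of samples
        self.mean_sq = None   # EWMA of squared samples

    def update(self, sample):
        if self.mean is None:
            # Assumption: seed both averages from the first sample.
            self.mean = sample
            self.mean_sq = sample * sample
        else:
            a = self.alpha
            self.mean = self.mean * (1 - a) + sample * a
            self.mean_sq = self.mean_sq * (1 - a) + sample * sample * a

    def variance(self):
        if self.mean is None:
            return 0.0
        # EWMA of squares minus squared EWMA, clamped against rounding error.
        return max(self.mean_sq - self.mean ** 2, 0.0)

    def stddev(self):
        return math.sqrt(self.variance())

    def z_score(self, sample):
        sd = self.stddev()
        if sd == 0:
            return 0.0  # no history or no spread yet
        return (sample - self.mean) / sd

stats = EwmStats(alpha=0.1)
for x in [10, 11, 9, 10, 12, 10, 9, 11]:
    stats.update(x)
print(stats.z_score(30))  # a large positive score for an unusual sample
```

Scoring a sample before folding it into the averages keeps an outlier from masking itself.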
One Feasible Normality Metric
Track EWMA and EWMASoS; compute Z-Score!
Or, use your imagination. Ideas:
Variance-to-mean ratio (index of dispersion)
http://en.wikipedia.org/wiki/Index_of_dispersion
Follow links on that page ;-)
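As a sketch of the variance-to-mean idea: the index of dispersion is the variance divided by the mean. It equals 1 for a Poisson process, exceeds 1 for bursty (over-dispersed) data, and falls below 1 for unusually regular data. The function name is hypothetical, and raising an error for a zero mean is an assumption made for this example.

```python
import statistics

def index_of_dispersion(samples):
    """Variance-to-mean ratio of a batch of samples."""
    mean = statistics.fmean(samples)
    if mean == 0:
        raise ValueError("index of dispersion is undefined for a zero mean")
    return statistics.pvariance(samples) / mean

# Mean 5 and variance 4 give a ratio of about 0.8.
print(index_of_dispersion([2, 4, 4, 4, 5, 5, 7, 9]))
```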
Questions?
@xaprb • linkedin.com/in/xaprb
baron - at - vividcortex.com
Photo Credits
http://www.flickr.com/photos/exquisitur/3502317741/
http://www.flickr.com/photos/conorkeller/3424910997/
http://www.flickr.com/photos/zooboing/5394322517/
http://www.flickr.com/photos/robbn1/4114136177/
http://www.flickr.com/photos/nathaninsandiego/5054092761/
http://www.flickr.com/photos/ericmay/4817484054/
http://www.flickr.com/photos/hktang/4243300265/
http://www.flickr.com/photos/marceau_r/5445398067/
http://www.flickr.com/photos/domesticat/2963393184/
http://www.flickr.com/photos/amattox/3206367817/
http://www.flickr.com/photos/rawhead/4617769266/
http://www.flickr.com/photos/23737778@N00/7115229223/
http://www.flickr.com/photos/sprengben/4419536377/
http://www.flickr.com/photos/nickpix2008/2588993907/
http://www.flickr.com/photos/kevineddy/1796490978/
http://www.flickr.com/photos/asphericlens/5661878892/
http://www.flickr.com/photos/dexxus/3031015377/
http://www.flickr.com/photos/dexxus/5791228117/