Quantifying Abnormal Behavior


Baron Schwartz • VividCortex

Me

Co-author of High Performance MySQL, 3rd Edition (with Peter Zaitsev & Vadim Tkachenko)

Baron Schwartz

baron - at - vividcortex.com

@xaprb

The Goal

Is the system in trouble?

Find problems early & small

Prevent problems from growing

Primitive Health Checks

System is dead/down

Metric exceeds threshold

Threshold Pain

False alarms

Missed alarms

Decisions, decisions

How Do Systems Fail?

Down/dead/unavailable is “rare”

Partial failures are common

Failures escalate over time

Abnormality Detection

Hey, it’s a starting point.

Look For Improbable Events?

Statistics Refresher

Variance: the mean of the square minus the square of the mean

Standard Deviation: √variance

Z-Score: how many standard deviations a measurement is from the mean

The distribution of sample means is asymptotically Gaussian, regardless of the underlying distribution (the central limit theorem).
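The definitions in this refresher can be sketched in a few lines. This is a minimal illustration, not code from the talk; the function name `z_score` and the sample data are made up for the example, and the variance is the population variance computed exactly as stated above (mean of the squares minus square of the mean).

```python
import math


def z_score(samples, x):
    """Return how many standard deviations x lies from the mean of samples."""
    n = len(samples)
    mean = sum(samples) / n
    mean_of_squares = sum(s * s for s in samples) / n
    # Variance: the mean of the square minus the square of the mean.
    # Clamp at zero to absorb tiny negative values from floating-point error.
    variance = max(0.0, mean_of_squares - mean * mean)
    stddev = math.sqrt(variance)
    if stddev == 0.0:
        return 0.0  # all samples identical; no meaningful score (assumption)
    return (x - mean) / stddev


# Hypothetical history: mean 5, variance 4, standard deviation 2.
history = [2, 4, 4, 4, 5, 5, 7, 9]
print(z_score(history, 11))  # 3.0
```

A measurement of 11 against this history sits three standard deviations above the mean, which is the kind of "improbable event" the previous slide asks about.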


Is It Really Unlikely?


Systems Are Continually Abnormal


Statistical Process Control


Shewhart Control Chart
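As a rough sketch of the idea behind a Shewhart control chart: take a baseline period, compute a center line and control limits a few standard deviations either side of it, and flag observations that fall outside. The code below is a simplification under stated assumptions: it estimates sigma with the plain population standard deviation of the baseline (classic charts often estimate it from subgroups or moving ranges), and the names `control_limits`, `out_of_control`, and the data are hypothetical.

```python
import statistics


def control_limits(baseline, k=3.0):
    """Return (lower, center, upper) limits at k standard deviations."""
    center = statistics.mean(baseline)
    sigma = statistics.pstdev(baseline)  # simplification: population stddev
    return center - k * sigma, center, center + k * sigma


def out_of_control(baseline, observations, k=3.0):
    """Return the observations that fall outside the control limits."""
    lower, _, upper = control_limits(baseline, k)
    return [x for x in observations if x < lower or x > upper]


# Hypothetical baseline and new observations.
baseline = [10, 12, 11, 9, 10, 11, 10, 9]
print(out_of_control(baseline, [10, 11, 25, 9, 2]))  # [25, 2]
```

Note that the limits are fixed once the baseline is chosen, which is part of why static control charts struggle with systems whose "normal" keeps shifting.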


Sliding Windows

Tail Wags Dog?

Holt-Winters Forecasting


We’re Doing It Wrong.


Measuring What Matters

What matters is whether the system is getting its work done.

Measure work, not just status or activity.

Know the meaning of the metrics.

Did you attend Brendan Gregg’s talk?


Little’s Law: N=XR


Utilization Law: U=SX


Universal Scalability Law
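The slide only names the law, so the sketch below uses the commonly published form of Gunther's Universal Scalability Law, X(N) = λN / (1 + σ(N−1) + κN(N−1)). The symbols λ (throughput at N = 1), σ (contention), and κ (coherency cost), as well as the function names and parameter values, are assumptions taken from that standard formulation rather than from the talk.

```python
import math


def usl_throughput(n, lam, sigma, kappa):
    """Predicted throughput X at concurrency n under the USL."""
    return lam * n / (1.0 + sigma * (n - 1) + kappa * n * (n - 1))


def usl_peak_concurrency(sigma, kappa):
    """Concurrency at which predicted throughput peaks (requires kappa > 0)."""
    return math.sqrt((1.0 - sigma) / kappa)


# Hypothetical parameters: 100 ops/s at N=1, 5% contention, 0.1% coherency.
lam, sigma, kappa = 100.0, 0.05, 0.001
for n in (1, 10, 31, 60):
    print(n, round(usl_throughput(n, lam, sigma, kappa), 1))
print(round(usl_peak_concurrency(sigma, kappa), 1))
```

With these made-up parameters, predicted throughput rises, peaks at a concurrency of roughly 31, and then declines, which is the retrograde behavior that distinguishes the USL from a simple saturation curve.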

Work-Related Metrics

N: Concurrency

X: Throughput

R: Response Time

U: Utilization

S: Service Time
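Little's Law (N = XR) and the Utilization Law (U = SX) tie these five metrics together, so measuring some of them lets you derive the others. A small worked sketch follows; the function names and the numbers are hypothetical, chosen only to illustrate the arithmetic.

```python
def concurrency(throughput, response_time):
    """Little's Law: N = X * R."""
    return throughput * response_time


def utilization(service_time, throughput):
    """Utilization Law: U = S * X."""
    return service_time * throughput


# Hypothetical server: 200 queries/s, 50 ms response time, 4 ms service time.
X = 200.0   # throughput, queries per second
R = 0.050   # response time, seconds
S = 0.004   # service time, seconds

N = concurrency(X, R)
U = utilization(S, X)
print(f"N = {N:.1f} queries in flight, U = {U:.0%} busy")
```

In this made-up case about 10 queries are in the system on average and the resource is about 80% busy. The gap between the 50 ms response time and the 4 ms service time is time spent waiting rather than being served.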

Realtime

At scale, in-memory operation is helpful

Rolling windows are less practical

CPU-intensive operations are impractical

The distant past has little relevance

Exponentially Weighted Moving Average

Define a decay factor α between 0 and 1, then:

avg = avg•(1-α) + sample•α
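A minimal sketch of an exponentially weighted moving average, weighting the old average by (1 − α) and the new sample by α so the two weights sum to 1. The class name `EWMA` and the choice to seed the average with the first sample are assumptions made for this example; other warm-up strategies are possible.

```python
class EWMA:
    """Exponentially weighted moving average with decay factor alpha."""

    def __init__(self, alpha):
        if not 0.0 < alpha < 1.0:
            raise ValueError("alpha must be between 0 and 1")
        self.alpha = alpha
        self.value = None  # no samples seen yet

    def add(self, sample):
        if self.value is None:
            # Assumption: seed the average with the first sample.
            self.value = float(sample)
        else:
            self.value = self.value * (1.0 - self.alpha) + sample * self.alpha
        return self.value


avg = EWMA(alpha=0.5)
for sample in [10, 20, 20]:
    avg.add(sample)
print(avg.value)  # 17.5
```

The state is a single number per metric and each update is constant time, which is what makes this attractive compared with keeping a rolling window of samples in memory.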

How To Choose α

Moving Window: age = N/2

EWMA: α = 2/(N+1)

Ex: α = 2/31 ≈ .064516129 for a “60-second window” with an average age of 30 seconds (the formula applied to the age, 30; with N = 60 it gives 2/61 ≈ .0328)
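The arithmetic behind the example can be checked directly. The helper name `alpha_for_window` is hypothetical; it simply evaluates α = 2/(N+1). The value quoted on the slide, .064516129, is what the formula yields for 30, the average age, while 60 yields a smaller α.

```python
def alpha_for_window(n):
    """Decay factor from the slide's rule of thumb: alpha = 2 / (N + 1)."""
    return 2.0 / (n + 1)


print(round(alpha_for_window(30), 9))  # 0.064516129
print(round(alpha_for_window(60), 9))  # 0.032786885
```

A larger α reacts faster to new samples and forgets the past sooner; a smaller α smooths more but lags more.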


Exponentially Weighted Moving Statistics

Variance = EWMA of squares minus squared EWMA

Standard deviation = √EWMVar

Z-Score = Number of EWMStddev from the EWMA
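These three definitions need only two running numbers per metric: an EWMA of the samples and an EWMA of their squares. The sketch below is one possible arrangement, not the speaker's implementation; the class name `EWMStats`, the first-sample seeding, and the zero-variance handling are all assumptions.

```python
import math


class EWMStats:
    """Exponentially weighted moving mean, variance, stddev, and Z-score."""

    def __init__(self, alpha):
        if not 0.0 < alpha < 1.0:
            raise ValueError("alpha must be between 0 and 1")
        self.alpha = alpha
        self.mean = None     # EWMA of the samples
        self.mean_sq = None  # EWMA of the squared samples

    def add(self, sample):
        x = float(sample)
        if self.mean is None:
            # Assumption: seed both averages with the first sample.
            self.mean, self.mean_sq = x, x * x
        else:
            a = self.alpha
            self.mean = self.mean * (1.0 - a) + x * a
            self.mean_sq = self.mean_sq * (1.0 - a) + x * x * a

    def variance(self):
        if self.mean is None:
            return 0.0
        # EWMA of squares minus squared EWMA, clamped against rounding error.
        return max(0.0, self.mean_sq - self.mean * self.mean)

    def stddev(self):
        return math.sqrt(self.variance())

    def z_score(self, sample):
        """Number of EWM standard deviations between sample and the EWMA."""
        sd = self.stddev()
        if self.mean is None or sd == 0.0:
            return 0.0  # not enough variation to score (assumption)
        return (sample - self.mean) / sd


stats = EWMStats(alpha=0.5)
stats.add(0)
stats.add(2)
print(stats.mean, stats.variance(), stats.z_score(3))  # 1.0 1.0 2.0
```

Scoring a new sample before adding it to the statistics keeps an outlier from diluting its own score.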

One Feasible Normality Metric

Track EWMA and EWMASoS (the EWMA of the squared samples); compute Z-Score!

Or, use your imagination. Ideas:

Variance-to-mean ratio (index of dispersion)

http://en.wikipedia.org/wiki/Index_of_dispersion

Follow links on that page ;-)
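As one concrete instance of the variance-to-mean idea, the index of dispersion can be computed as below. This is a minimal sketch over a fixed batch of samples; the function name and data are hypothetical, and the same ratio could instead be formed from exponentially weighted statistics. For Poisson-distributed counts the ratio is about 1, so values well above 1 suggest burstier-than-random behavior.

```python
import statistics


def index_of_dispersion(samples):
    """Variance-to-mean ratio of the samples (population variance)."""
    mean = statistics.mean(samples)
    if mean == 0:
        raise ValueError("index of dispersion is undefined for a zero mean")
    return statistics.pvariance(samples) / mean


# Hypothetical per-second event counts: mean 5, variance 4.
counts = [2, 4, 4, 4, 5, 5, 7, 9]
print(round(index_of_dispersion(counts), 3))  # 0.8
```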

Questions?

@xaprb • linkedin.com/in/xaprb

baron - at - vividcortex.com

Photo Credits

http://www.flickr.com/photos/exquisitur/3502317741/

http://www.flickr.com/photos/conorkeller/3424910997/

http://www.flickr.com/photos/zooboing/5394322517/

http://www.flickr.com/photos/robbn1/4114136177/

http://www.flickr.com/photos/nathaninsandiego/5054092761/

http://www.flickr.com/photos/ericmay/4817484054/

http://www.flickr.com/photos/hktang/4243300265/

http://www.flickr.com/photos/marceau_r/5445398067/

http://www.flickr.com/photos/domesticat/2963393184/

http://www.flickr.com/photos/amattox/3206367817/

http://www.flickr.com/photos/rawhead/4617769266/

http://www.flickr.com/photos/23737778@N00/7115229223/

http://www.flickr.com/photos/sprengben/4419536377/

http://www.flickr.com/photos/nickpix2008/2588993907/

http://www.flickr.com/photos/kevineddy/1796490978/

http://www.flickr.com/photos/asphericlens/5661878892/

http://www.flickr.com/photos/dexxus/3031015377/

http://www.flickr.com/photos/dexxus/5791228117/
