Effective monitoring with statsd - Alexis lê-quôc

Effective Monitoring with statsD

@alq, CTO at Datadog

An application through the naked eye

An application through a monitoring tool

OODA Loop (simplified)

Observe, Orient, Decide, Act

Observe: Monitoring Tool
Orient, Decide, Act: You

Observations need to be...

1. Timely
2. Correct
3. Comprehensive

Else: Garbage In, Garbage Out

Timely

Initial assumptions, initial set of metrics
Contact with reality
Revised assumptions, revised set of metrics

Minutes, not weeks

Comprehensive

Resources, Work, Value

Easy to collect, generic, but not actionable
Harder to collect, custom, but most actionable

statsD

Easy
Timely
Comprehensive

How statsD works

Client libraries talk to a simple UDP server...

...using a simple text protocol

    pageviews:100|c|@0.25
    latency:320|ms
    backlog:333|g
    uniques:765|s
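A minimal sketch of what a client library does under the hood, assuming a statsD server listening on UDP port 8125 on localhost (the usual default). The helper names `format_metric` and `send_metric` are hypothetical, not part of any real client library. The sample rate is written as its own `|@0.25` field, which is the form the etsy statsD server parses.

```python
import socket


def format_metric(name, value, metric_type, sample_rate=None):
    """Build one statsD line, e.g. 'latency:320|ms' (hypothetical helper)."""
    line = f"{name}:{value}|{metric_type}"
    if sample_rate is not None:
        # The sample rate tells the server what fraction of events was sent.
        line += f"|@{sample_rate}"
    return line


def send_metric(line, host="127.0.0.1", port=8125):
    """Fire-and-forget: one UDP datagram, no reply expected (assumed host/port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(line.encode("ascii"), (host, port))
    finally:
        sock.close()


# The four sample lines from the slide, one per metric type.
lines = [
    format_metric("pageviews", 100, "c", sample_rate=0.25),
    format_metric("latency", 320, "ms"),
    format_metric("backlog", 333, "g"),
    format_metric("uniques", 765, "s"),
]
```

Calling `send_metric(line)` for each entry would emit the datagrams. Because UDP is connectionless, the application does not block or fail if no server is listening, which is a large part of why instrumenting with statsD is cheap.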

statsD types

Type        Definition                  Example
Gauges      Absolute values             Queue size
Counters    Per-second rates            Page views
Histograms  Gauge summary               Page latency
Timers      Gauge distribution          Page latency
Sets        Counters of unique things   Unique visitors
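To connect the types to the wire format, here is a rough sketch of a parser for a single protocol line. The type codes `g`, `c`, `ms` and `s` match the sample lines shown earlier; `h` for histograms is assumed here to follow Datadog's DogStatsD extension. `Metric`, `TYPE_NAMES` and `parse_line` are hypothetical names for illustration.

```python
from typing import NamedTuple

# Wire code -> metric type. "h" is assumed (DogStatsD extension).
TYPE_NAMES = {
    "g": "gauge",
    "c": "counter",
    "h": "histogram",
    "ms": "timer",
    "s": "set",
}


class Metric(NamedTuple):
    name: str
    value: str          # kept as text: set members need not be numbers
    kind: str
    sample_rate: float


def parse_line(line):
    """Parse 'name:value|type[|@rate]' into a Metric."""
    name, rest = line.split(":", 1)
    fields = rest.split("|")
    value, code = fields[0], fields[1]
    sample_rate = 1.0
    for extra in fields[2:]:
        if extra.startswith("@"):
            sample_rate = float(extra[1:])
    return Metric(name, value, TYPE_NAMES[code], sample_rate)


parsed = parse_line("pageviews:100|c|@0.25")
```

Here `parsed.kind` is `"counter"` and `parsed.sample_rate` is `0.25`; a line without an `@` field defaults to a sample rate of 1.0.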

statsD problems

Type        Definition                  Problem
Gauges      Absolute values             Latest value wins. Gauge deltas???
Counters    Per-second rates            Rates, not counts (!= rrdtool)
Histograms  Gauge summary               Assumes normal distribution
Timers      Gauge distribution          Can measure much more than time
Sets        Counters of unique things   :-)
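Two of these problems are easier to see in code. The toy aggregator below is a sketch of how a statsD-style server might summarize one flush interval, not the actual server implementation; the function name `flush` and the 10-second interval are assumptions. It shows counters coming out as per-second rates rather than counts, and gauges keeping only the latest value.

```python
def flush(samples, interval_seconds=10.0):
    """Summarize one flush interval.

    samples: list of (name, value, type_code, sample_rate) tuples.
    """
    counters, gauges, sets = {}, {}, {}
    for name, value, kind, sample_rate in samples:
        if kind == "c":
            # Scale up by the sample rate to estimate the true count.
            counters[name] = counters.get(name, 0.0) + value / sample_rate
        elif kind == "g":
            # Latest value wins: earlier gauge values are overwritten.
            gauges[name] = value
        elif kind == "s":
            sets.setdefault(name, set()).add(value)
    return {
        # Counters are reported as rates, not raw counts.
        "rates": {n: total / interval_seconds for n, total in counters.items()},
        "gauges": gauges,
        "sets": {n: len(members) for n, members in sets.items()},
    }


samples = [
    ("pageviews", 100, "c", 0.25),  # stands for an estimated 400 views
    ("pageviews", 50, "c", 1.0),
    ("backlog", 333, "g", 1.0),
    ("backlog", 120, "g", 1.0),     # overwrites 333
    ("uniques", "alice", "s", 1.0),
    ("uniques", "bob", "s", 1.0),
    ("uniques", "alice", "s", 1.0),  # duplicate, counted once
]
result = flush(samples)
```

With these inputs the estimated 450 page views become a rate of 45.0 per second, the backlog gauge reports 120 (the 333 is lost), and the set reports 2 unique visitors.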

#1 pitfall: “Counters”

http://dtdg.co/tokyo-counters

How we use statsD

http://dtdg.co/tokyo-dog

Essential: Tagging

http://dtdg.co/tokyo-tags
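As an illustration of tagging, the sketch below builds a metric line with a trailing `|#tag,tag` field, assuming the tag syntax used by Datadog's DogStatsD extension of the protocol. The helper `format_tagged` and the tag names are hypothetical.

```python
def format_tagged(name, value, metric_type, tags=None):
    """Build a metric line with an optional '|#tag1,tag2' suffix (assumed syntax)."""
    line = f"{name}:{value}|{metric_type}"
    if tags:
        line += "|#" + ",".join(tags)
    return line


# One metric name, sliced by tags instead of encoding dimensions in the name.
tagged = format_tagged("page.views", 1, "c", tags=["env:prod", "page:checkout"])
untagged = format_tagged("page.views", 1, "c")
```

The design point is that `page.views` stays a single metric that can be broken down by `env` or `page`, rather than multiplying into names like `prod.checkout.page.views`.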

How to get started

• statsD https://github.com/etsy/statsd
• client libraries https://github.com/etsy/statsd/wiki
• (my company) 1-stop shop http://www.datadoghq.com

Thank you very much! Questions? @alq