OSMC 2014: Introduction into collectd | Florian Forster

collectd: An introduction

About me

● Florian "octo" Forster

● Open-source work since 2001

● Started collectd in 2005

Agenda

● collectd

● Aggregation of metrics

● Alerting with Icinga

collectd

● Daemon

● collect metrics

● mangle / transport metrics

● store metrics (no retrieve)

collectd

● Open-source project

○ MIT and GPL licensed

● Platform independent

○ Linux, BSD, Solaris, AIX, HP-UX, …

○ Windows via SSC Serv (non-free)

collectd

● Agent-based design

○ Runs on each host

● Extensible via plugins

○ Language bindings (Perl, Python, Java)

○ "exec" plugin, e.g. shell scripts
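As a rough illustration of the "exec" route, the sketch below shows a script that prints values in collectd's plain-text PUTVAL format, which the exec plugin reads from the script's standard output. The plugin instance "demo", the type instance "users" and the function read_logged_in_users are hypothetical names chosen for this example; only the PUTVAL line format and the COLLECTD_HOSTNAME / COLLECTD_INTERVAL environment variables come from collectd itself.

```python
import os
import sys
import time


def format_putval(host, plugin, type_, value, interval, type_instance=""):
    """Build one PUTVAL line in collectd's plain-text protocol."""
    type_part = f"{type_}-{type_instance}" if type_instance else type_
    identifier = f"{host}/{plugin}/{type_part}"
    # "N" asks collectd to use the current time as the timestamp.
    return f'PUTVAL "{identifier}" interval={interval} N:{value}'


def read_logged_in_users():
    """Hypothetical metric source; replace with a real measurement."""
    return 3


def run(iterations=None):
    """Emit one value per interval; iterations=None loops forever."""
    # The exec plugin exports these two variables to the script.
    host = os.environ.get("COLLECTD_HOSTNAME", "localhost")
    interval = float(os.environ.get("COLLECTD_INTERVAL", "10"))
    done = 0
    while iterations is None or done < iterations:
        line = format_putval(host, "exec-demo", "gauge",
                             read_logged_in_users(), interval, "users")
        print(line)
        sys.stdout.flush()  # collectd reads the pipe line by line
        done += 1
        if iterations is None or done < iterations:
            time.sleep(interval)
```

To use it as an exec script, call run() at the end of the file and point the exec plugin at the script.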

collectd

● 95+ "read" (input) plugins

○ System metrics (e.g. CPU, memory)

○ Application metrics (e.g. MySQL)

○ Other (Xeon Phi, SNMP, OneWire)

collectd

● 15+ "write" (output) plugins

○ Graphite

○ RRDtool

○ RRDCacheD

○ Riemann

○ MongoDB

○ HTTP (generic)

collectd

# Input
LoadPlugin cpu
LoadPlugin memory
LoadPlugin df
<Plugin df>
  MountPoint "/"
  ValuesPercentage true
</Plugin>

# Output
LoadPlugin write_graphite
<Plugin write_graphite>
  # The node name "example" is a placeholder
  <Node "example">
    Host "graphite.example.com"
  </Node>
</Plugin>

Example configuration

collectd

● collectd's write_graphite plugin

○ Sends metrics to Graphite

○ TCP or UDP transport

○ Metric names somewhat adjustable

→ Monitoring mit Graphite (15:30 in this room, German)
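To make "sends metrics to Graphite" concrete, here is a minimal sketch of the Graphite plain-text line format ("path value timestamp") and of one plausible mapping from a collectd identifier to a Graphite path. The mapping is an approximation: it assumes dots in the host name are replaced by an escape character and that no data-source name is appended. The helper names graphite_line and send_tcp are made up for this example, and port 2003 is Graphite's usual plain-text port, assumed here.

```python
import socket


def graphite_line(identifier, value, timestamp, prefix="", escape_char="_"):
    """Turn 'host/plugin/type' plus a value into one Graphite text line."""
    host, plugin, type_ = identifier.split("/")
    # Dots in the host name would create extra path levels, so escape them.
    host = host.replace(".", escape_char)
    path = f"{prefix}{host}.{plugin}.{type_}"
    return f"{path} {value} {int(timestamp)}"


def send_tcp(lines, host, port=2003):
    """Send already formatted lines over TCP (assumed plain-text port)."""
    payload = "".join(line + "\n" for line in lines).encode()
    with socket.create_connection((host, port)) as conn:
        conn.sendall(payload)
```

The prefix argument mirrors the idea that metric names are "somewhat adjustable"; the real plugin offers its own options for this.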

Aggregation

● Aggregates often more useful for alerting

○ e.g. sum over CPUs, minimum RTT

● Metric storage often I/O bound

● Dashboards require "sane" amount of information

Aggregation

[Diagram: collectd, Aggregation, Graphite; metric labels: Memory, …]

Aggregation

● Load the Aggregation plugin

● Select (filter) applicable metrics

● Group by metric type and other fields

● Aggregate functions (e.g. sum)

Aggregation

LoadPlugin aggregation
<Plugin aggregation>
  <Aggregation>
  </Aggregation>
</Plugin>

example.com/battery/percent-charged
example.com/cpu-0/cpu-idle
example.com/cpu-0/cpu-user
example.com/cpu-0/cpu-wait
…
example.com/df-root/df_complex-free
example.com/df-root/df_complex-used
example.com/df-root/df_complex-rsvd

Load the aggregation plugin

Aggregation: Selection

● Five fields usable for selection

○ Host

○ Plugin

○ PluginInstance

○ Type (mandatory)

○ TypeInstance
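The five fields correspond to the parts of a metric name such as example.com/df-root/df_complex-free. A small sketch of how such a name splits into fields follows; the Identifier class and parse_identifier function are illustrative names for this example, not part of collectd.

```python
from typing import NamedTuple


class Identifier(NamedTuple):
    host: str
    plugin: str
    plugin_instance: str
    type: str
    type_instance: str


def parse_identifier(text):
    """Split 'host/plugin[-instance]/type[-instance]' into its five fields."""
    host, plugin_part, type_part = text.split("/")
    # Only the first "-" separates the name from its instance.
    plugin, _, plugin_instance = plugin_part.partition("-")
    type_, _, type_instance = type_part.partition("-")
    return Identifier(host, plugin, plugin_instance, type_, type_instance)
```

For example, example.com/cpu-0/cpu-idle yields plugin "cpu", plugin instance "0", type "cpu" and type instance "idle".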

Aggregation: Selection

<Plugin aggregation>
  <Aggregation>
    Plugin "cpu"
    Type "cpu"
  </Aggregation>
</Plugin>

Select metrics

Aggregation: Grouping

● Four fields usable for grouping

○ Host

○ Plugin

○ PluginInstance

○ TypeInstance

● At least one field is left unspecified

Aggregation: Grouping

<Plugin aggregation>
  <Aggregation>
    Plugin "cpu"
    Type "cpu"
    GroupBy Host
    GroupBy TypeInstance
  </Aggregation>
</Plugin>

example.com/cpu-???/cpu-idle
example.com/cpu-???/cpu-user
example.com/cpu-???/cpu-wait

Configure grouping

Aggregation: Functions

● Up to six aggregate functions

○ Count

○ Sum

○ Minimum

○ Maximum

○ Average

○ Standard deviation
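For reference, the six functions can be sketched over a plain list of values as below. This is only an illustration of the arithmetic: the dictionary keys are chosen for this example, and the standard deviation is computed as the population form, which is an assumption rather than a statement about collectd's implementation.

```python
import math


def aggregate(values):
    """Compute the six aggregates over a non-empty list of numbers."""
    count = len(values)
    total = sum(values)
    average = total / count
    # Population variance (divide by count); assumed for this sketch.
    variance = sum((v - average) ** 2 for v in values) / count
    return {
        "count": count,
        "sum": total,
        "minimum": min(values),
        "maximum": max(values),
        "average": average,
        "stddev": math.sqrt(variance),
    }
```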

Aggregation

<Plugin aggregation>
  <Aggregation>
    Plugin "cpu"
    Type "cpu"
    GroupBy Host
    GroupBy TypeInstance
    CalculateSum true
  </Aggregation>
</Plugin>

example.com/cpu-sum/cpu-idle
example.com/cpu-sum/cpu-user
example.com/cpu-sum/cpu-wait

Select aggregate function(s)
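The select, group and sum steps can be mimicked in a few lines of Python. This is a sketch of the idea only, not of the plugin's code: the function name is invented for this example, and the output names follow the cpu-sum naming shown on this slide.

```python
from collections import defaultdict


def sum_cpu_by_host_and_state(samples):
    """Select cpu/cpu metrics, group by host and type instance, then sum."""
    groups = defaultdict(float)
    for identifier, value in samples.items():
        host, plugin_part, type_part = identifier.split("/")
        plugin = plugin_part.partition("-")[0]
        type_, _, type_instance = type_part.partition("-")
        # Selection: Plugin "cpu" and Type "cpu".
        if plugin != "cpu" or type_ != "cpu":
            continue
        # Grouping: the plugin instance (CPU number) is dropped from the key.
        groups[(host, type_instance)] += value
    return {
        f"{host}/cpu-sum/cpu-{state}": total
        for (host, state), total in groups.items()
    }
```

Because the plugin instance is not part of the grouping key, values from all CPUs of one host collapse into a single metric per CPU state.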

Aggregation

● Creates additional metrics

● Use chains to filter out unwanted "raw" metrics.

● Usable on client and/or server.

Alerting

● Load the Unixsock plugin

● Query and check values with collectd-nagios

● Both come with collectd

Alerting

LoadPlugin unixsock
<Plugin unixsock>
  SocketFile "/var/run/collectd-unixsock"
  SocketGroup "collectd-nagios"
  SocketPerms "0660"
  DeleteSocket true
</Plugin>

Load the Unixsock plugin

Alerting

-> GETVAL example.com/cpu-average/cpu-wait
<- 1 Value found
<- value=8.540017e+00

Query values with the Unixsock plugin
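The same exchange can be scripted. The sketch below sends a GETVAL command over the UNIX socket and parses the reply, assuming the response shape shown above: a status line beginning with the number of values, followed by that many name=value lines, with a negative number signalling an error. The function names are chosen for this example, and the default socket path is the one from the configuration slide.

```python
import socket


def parse_getval_response(text):
    """Parse '<n> Value(s) found' plus n 'name=value' lines into a dict."""
    lines = text.strip().splitlines()
    status, _, message = lines[0].partition(" ")
    count = int(status)
    if count < 0:
        # A negative status is assumed to signal an error.
        raise RuntimeError(message)
    values = {}
    for line in lines[1:1 + count]:
        name, _, raw = line.partition("=")
        values[name] = float(raw)
    return values


def getval(identifier, socket_path="/var/run/collectd-unixsock"):
    """Query one metric from a running collectd via the unixsock plugin."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        sock.sendall(f'GETVAL "{identifier}"\n'.encode())
        reader = sock.makefile("r")
        first = reader.readline()
        count = int(first.split(" ", 1)[0])
        rest = [reader.readline() for _ in range(max(count, 0))]
    return parse_getval_response(first + "".join(rest))
```

Calling getval("example.com/cpu-average/cpu-wait") needs a running collectd with the unixsock plugin loaded and read access to the socket.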

Alerting

● collectd-nagios queries and checks metrics

● Ranged -w (warn) and -c (critical) options

● Conforms to Icinga's best practices

Alerting

$ collectd-nagios -s /var/run/collectd-unixsock \
> -n cpu-average/cpu-wait -H example.com \
> -w '0:10' -c '0:25'
OKAY: 0 critical, 0 warning, 1 okay | value=8.540017;;;;

Example: collectd-nagios
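The range logic behind -w and -c can be sketched as follows: a value inside the "min:max" range is fine, and the critical range is checked before the warning range. This is a simplified model written for illustration; it only handles ranges that contain a colon, and the function names are not part of collectd-nagios.

```python
import math


def parse_range(text):
    """Parse 'min:max'; an omitted bound means unbounded on that side."""
    if ":" not in text:
        raise ValueError("this sketch only handles 'min:max' ranges")
    low, high = text.split(":", 1)
    return (float(low) if low else -math.inf,
            float(high) if high else math.inf)


def classify(value, warn, crit):
    """Return the check state for one value given warn and crit ranges."""
    crit_low, crit_high = parse_range(crit)
    warn_low, warn_high = parse_range(warn)
    if not crit_low <= value <= crit_high:
        return "CRITICAL"
    if not warn_low <= value <= warn_high:
        return "WARNING"
    return "OKAY"
```

With the ranges from the example above, classify(8.540017, "0:10", "0:25") falls inside both ranges and is reported as okay.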

Alerting

commands.cfg

define command{
    command_name  check_cpuio_collectd
    command_line  collectd-nagios \
                  -s /var/run/collectd-unixsock \
                  -H $HOSTNAME$ \
                  -n cpu-average/cpu-wait \
                  -w $ARG1$ -c $ARG2$
}

services.cfg

define service{
    use                  generic-service
    host_name            example.com
    service_description  I/O wait
    check_command        check_cpuio_collectd!10:!5:
}

Alerting

● What's next?

○ Use "passive checks"

○ Let collectd push metrics to Icinga 2?

○ Bring on the patches!

Thank you!

Questions?
