So you want to switch off? Time to say goodbye to your Nagios based setup! © 2014 - Olivier Jan - Check my Website - @olivjan


Time to say goodbye to your Nagios based setup. Discover the cool new tools out there for more efficient monitoring. A talk given at OSMC 2014. https://www.youtube.com/watch?v=_BAWi9Zhmic



About me

❖ System admin and architect

❖ Co-founder of « Communauté Francophone de la Supervision Libre »

❖ Writer of the book « Nagios 3 au cœur de la supervision Open Source »

❖ Co-founder of Check my Website, a SaaS service for remote monitoring of websites and applications (current)


Content

❖ Why switch off? The good and maybe not-so-good reasons to do so!

❖ Which way to take?

❖ Building a monitoring solution without Nagios:

❖ Tools available

❖ A personal work in progress

❖ Migrating from Nagios to this kind of solution


Some reasons to switch off…

❖ The godfather of OSS monitoring is dead as an Open Source project?

❖ Can’t do better with it

❖ Cool new kids out there

❖ Better "cloud" support

❖ Clear distinction between monitoring states, metrics and messages

❖ Better charting solution

❖ Near realtime monitoring

❖ Routing, aggregation, correlation…

❖ YOUR reasons ;)


Which way to take?

❖ The "4 musketeers"

❖ Naemon

❖ Icinga 2

❖ Shinken

❖ Centreon

❖ Reboot from building blocks

❖ Collect

❖ Store

❖ Visualize

❖ Alert


Tools: Collecting metrics and messages

❖ Packetbeat (metrics & messages)

❖ Rsyslog, NXLog, Syslog-ng (messages)

❖ sFlow Toolkit, Host sFlow

❖ Logstash-forwarder (messages)

❖ Collectd (metrics)

❖ Diamond (metrics)

❖ OSquery, WMI (metrics)

❖ Network level (sFlow)

❖ System Level

❖ Application Level


Tools: External collecting

❖ End user perspective

❖ Checks run as close as possible to the end user

❖ Application behavior

❖ Real User Monitoring

❖ Webpagetest

❖ Selenium

❖ PhantomasJS

❖ Boomerang

❖ Bucky


Tools: Routing metrics and messages

❖ Messages: Logstash, Flume, Fluentd

❖ Metrics: StatsD

❖ Metrics: Carbon Relay NG

One or more messages can fire an event


Tools: Databases

❖ Graphite: the most widely used

❖ OpenTSDB: built on HBase

❖ KairosDB: built on Cassandra

❖ InfluxDB: the most promising?

❖ Elasticsearch: index database


Tools: Visualizing metrics and messages

❖ Kibana

❖ Grafana

❖ Dashboards collection


Tools: Alerting

❖ Seyren: alerting dashboard for Graphite

❖ Cabot: get alerted when services go down or metrics go crazy

❖ Bosun: an advanced, open-source monitoring and alerting system

❖ Skyline: real-time anomaly detection system

❖ Oculus: anomaly correlation component of Etsy's Kale system

❖ Esper: Complex Event Processing


The French Monitoring Community Xperience

❖ Reboot from building blocks

❖ Collect

❖ Store

❖ Visualize

❖ Alert


The French Monitoring Community Xperience

Is it working? What is not working?


Collecting metrics: Collectd

❖ InfluxDB Collectd proxy

❖ In Golang like InfluxDB

❖ Temporary solution

❖ Native Collectd plugin

LoadPlugin network
<Plugin network>
    # proxy address
    Server "127.0.0.1" "8096"
</Plugin>

❖ PHP5-FPM metrics

❖ Nginx metrics

❖ MariaDB metrics

❖ System metrics

❖ <metricname>:<value>|<type>


Collecting messages: Rsyslog

❖ Nearly ready log consumption

❖ Native distribution package

❖ Nginx log, MySQL slow query log

template(name="ls_json" type="list" option.json="on") {
    constant(value="{")
    constant(value="\"@timestamp\":\"")       property(name="timereported" dateFormat="rfc3339")
    constant(value="\",\"@version\":\"1")
    constant(value="\",\"message\":\"")       property(name="msg")
    constant(value="\",\"host\":\"")          property(name="hostname")
    constant(value="\",\"severity\":\"")      property(name="syslogseverity-text")
    constant(value="\",\"facility\":\"")      property(name="syslogfacility-text")
    constant(value="\",\"programname\":\"")   property(name="programname")
    constant(value="\",\"procid\":\"")        property(name="procid")
    constant(value="\"}\n")
}
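To make the shape of the output easier to picture, here is a minimal Python sketch that builds one line with the same field names as the ls_json template. It is only an illustration, not part of the rsyslog setup: the function name `ls_json_line`, the host `web01` and all sample values are made up, and the timestamp formatting is an approximation of what rsyslog emits.

```python
import json
from datetime import datetime, timezone


def ls_json_line(msg, hostname, severity, facility, programname, procid, when=None):
    """Build one JSON line carrying the same fields as the ls_json template."""
    when = when or datetime.now(timezone.utc)
    event = {
        "@timestamp": when.isoformat(),  # RFC 3339 style timestamp
        "@version": "1",
        "message": msg,
        "host": hostname,
        "severity": severity,
        "facility": facility,
        "programname": programname,
        "procid": procid,
    }
    # json.dumps escapes quotes and control characters in the values,
    # which is the role option.json="on" plays in the template.
    return json.dumps(event) + "\n"


# Hypothetical sample event; none of these values come from the talk.
line = ls_json_line(
    msg='GET /index.html "200"',
    hostname="web01",
    severity="info",
    facility="local7",
    programname="nginx",
    procid="1234",
    when=datetime(2014, 11, 20, 10, 0, 0, tzinfo=timezone.utc),
)
print(line, end="")
```

Each line is a self-contained JSON document, which is why a consumer with a JSON codec can ingest it without further parsing.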


Collecting @ network level: Packetbeat

❖ Specific agent

❖ Collects traffic for:

❖ HTTP

❖ MySQL

❖ PostgreSQL

❖ Redis


Routing messages: Logstash

❖ Inputs

❖ Codecs/filters

❖ Outputs

input {
  udp {
    port => 10514
    codec => "json"
    type => "syslog"
  }
}

filter {
  # This replaces the host field with the host that generated the message (sysloghost)
  if [sysloghost] {
    mutate {
      replace => [ "host", "%{sysloghost}" ]
      remove_field => "sysloghost"
    }
  }
}

output {
  elasticsearch { host => localhost }
}
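The effect of the filter stage on a single event can be sketched in a few lines of Python. This is only an approximation of the behavior described in the comment above, not Logstash code: the function name `rewrite_host` and the sample event are hypothetical, and the truthiness check is a simplification of how Logstash conditionals treat a field.

```python
def rewrite_host(event):
    """Mimic the filter: if sysloghost is set, copy it into host and drop it."""
    event = dict(event)  # work on a copy so the caller's event is untouched
    value = event.get("sysloghost")
    if value is not None and value is not False:
        event["host"] = str(value)   # like replace => [ "host", "%{sysloghost}" ]
        del event["sysloghost"]      # like remove_field => "sysloghost"
    return event


# Hypothetical event as received over UDP: host is the sender seen by the
# input, sysloghost is the machine that originally produced the message.
incoming = {"host": "10.0.0.1", "sysloghost": "web01", "message": "disk full"}
print(rewrite_host(incoming))
```

Events without a sysloghost field pass through unchanged, which matches the conditional wrapping the mutate block.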


Routing metrics: StatsD

❖ Now a de facto protocol, with client implementations in most languages

❖ InfluxDB plugin

❖ Collectd can behave as a StatsD daemon (plugin)

❖ Very easy to push metrics

echo "foo:1|c" | nc -u -w0 127.0.0.1 8125
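The same push can be done from application code. The sketch below is a minimal Python equivalent of the one-liner above, assuming a StatsD-compatible daemon listening on UDP 127.0.0.1:8125; the helper names `statsd_line` and `send_metric` are made up for this example and do not come from any StatsD client library.

```python
import socket


def statsd_line(name, value, metric_type):
    """Format one metric as <metricname>:<value>|<type>."""
    return f"{name}:{value}|{metric_type}"


def send_metric(name, value, metric_type="c", host="127.0.0.1", port=8125):
    """Send one metric over UDP; fire-and-forget, no reply is expected."""
    payload = statsd_line(name, value, metric_type).encode("ascii")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
    return payload


if __name__ == "__main__":
    # Same counter increment as the nc one-liner, without the trailing newline.
    send_metric("foo", 1)
```

Because the transport is UDP, the call returns whether or not a daemon is listening, so instrumented code is not slowed down by the monitoring backend.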


Storing metrics: InfluxDB

❖ Make it behave like Graphite

❖ graphite-api

❖ carbon-relay-ng

❖ graphite-influxdb

❖ Cluster, cluster, cluster

❖ Designed for events and metrics


Storing messages: Elasticsearch

❖ Index database

❖ Cluster, cluster, cluster

❖ Full text search


Visualizing @ network level: Packetbeat

❖ Modified version of Kibana 3

❖ Dashboards ready out of the box


Visualizing metrics: Grafana

❖ Compatible

❖ Graphite

❖ InfluxDB

❖ OpenTSDB

❖ Built on Kibana 3


Visualizing messages: Kibana 4

❖ Easy install

❖ Interactive dashboards

❖ Multiple indices


What's missing? Wishes

❖ Alerting

❖ External monitoring

❖ Repository for dashboards…

❖ Giving meaning to metrics and messages


Alerting reboot

❖ Alert only on end user problems, from an end user perspective

❖ IRC, Chat channel…

❖ Alert thresholds based on history vs. static thresholds

❖ Statistics functions

❖ Boolean conditions

❖ Dynamic thresholds

❖ Anomaly detection

❖ Standard deviation
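As an illustration of a history-based threshold, the sketch below flags a value that sits more than k standard deviations away from the mean of recent samples. It is one simple way to combine the ideas listed above, not the method used by any of the tools mentioned in the talk; the function name `is_anomalous`, the default k = 3 and the sample response times are all assumptions made for the example.

```python
from statistics import mean, stdev


def is_anomalous(history, current, k=3.0):
    """Return True when current deviates from the history mean by more than k sigma."""
    if len(history) < 2:
        return False  # not enough history to estimate a deviation
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return current != mu  # flat history: any change is a deviation
    return abs(current - mu) > k * sigma


# Made-up response times in milliseconds, standing in for a metric history.
response_times_ms = [120, 118, 125, 122, 119, 121, 123, 120]
print(is_anomalous(response_times_ms, 124))
print(is_anomalous(response_times_ms, 400))
```

Unlike a static threshold, the alert boundary here moves with the metric's own recent behavior, so a noisy service and a very stable one do not need separately tuned limits.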


Coming from Nagios

❖ Graphios injects perfdata into Graphite or InfluxDB

❖ Check_graphite can query the Graphite API from Nagios to alert based on history

❖ Logstash can send events to Nagios through NSCA

❖ Nagios logs in Kibana with the Grok pattern %{NAGIOSLOGLINE}

❖ Keep Nagios for states?
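To show what moving perfdata into Graphite involves, here is a small Python sketch that turns a Nagios perfdata string into lines of Graphite's plaintext protocol (`<path> <value> <timestamp>`). It is not how Graphios is implemented; the function name `perfdata_to_graphite`, the metric prefix and the sample perfdata are hypothetical, and only the label and numeric value of each perfdata item are used.

```python
import re

# Perfdata items look like: label=value[UOM];warn;crit;min;max
# Labels may be wrapped in single quotes when they contain spaces.
PERFDATA_RE = re.compile(r"(?:'([^']+)'|([^\s=]+))=(-?\d+(?:\.\d+)?)")


def perfdata_to_graphite(prefix, perfdata, timestamp):
    """Return one Graphite plaintext line per perfdata item."""
    lines = []
    for quoted, bare, value in PERFDATA_RE.findall(perfdata):
        label = quoted or bare
        # Dots separate path components in Graphite, so anything unusual
        # in the label is replaced with an underscore.
        safe_label = re.sub(r"[^A-Za-z0-9_-]", "_", label)
        lines.append(f"{prefix}.{safe_label} {value} {timestamp}")
    return lines


# Hypothetical check_http style output and metric prefix.
sample = "time=0.012s;1.000;2.000;0.000 size=1532B;;;0"
for line in perfdata_to_graphite("nagios.web01.check_http", sample, 1416477600):
    print(line)
```

Each resulting line can be written as-is, followed by a newline, to a Carbon listener, which lets Nagios stay in place while charts and history-based alerts move to the new stack.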


Questions?

@olivjan