About me Scope of this presentation Logstash: What? When? How? Logstash: Deploying fast and simple Logstash: Deploying fast and simple Big data integration Conclusion Questions The end
Processing millions of logs with Logstash and integrating with Elasticsearch, Hadoop and Cassandra
Valentin Fischer-Mitoiu, OSMC 2014
November 21, 2014
About me
My name is Valentin Fischer-Mitoiu and I work for the University of Vienna.
More specifically, I work in a group called Domainis (Internet domain administration).
We operate the nameservers, monitoring and much more for nic.at, the .at registry.
I'm a monitoring guy and sysadmin, working daily with systems like Logstash, Icinga, Elasticsearch, Hadoop and Cassandra.
Scope of this presentation
1. This presentation is meant as a walk-through for getting started with log monitoring using Logstash.
2. It also describes possible integrations of Logstash in "big data" environments.
Logstash: What is it?
Logstash is software that allows you to send, receive and manage logs from various sources.
It appeared in 2011 and is getting a lot of interest because of its flexibility and powerful configuration.
It is based on Ruby, works as an agent/daemon and provides a high level of abstraction in its configuration, similar to Puppet.
Logstash: How we use it
We started by testing it to see what types of logs we could manage with it. The initial setup processed logs related to DNS, login, PostgreSQL, SOAP, EPP and the system.
We then started using it in production, feeding it logs from various sources.
Logstash: Why we use it
We use Logstash because we want its power in managing logs and in generating events based on specific triggers.
We cover states with Icinga, and adding Logstash lets us cover the whole monitoring spectrum.
Logstash: Deploying fast and simple
Deploying Logstash is quite simple; the challenge is in creating the configuration.
In most cases it's recommended to start with a centralized setup.
This means having a shipper, a queue, an indexer and some storage system.
Logstash: Overview of a centralized setup
Logstash: Hardware requirements
To find out how much hardware you need for your Logstash setup, first think about how much data you process daily and whether you need to search through it.
Another factor is the complexity of your configuration, especially the filters.
Logstash: Working with inputs
This is what a file input looks like:

    input {
      file {
        codec => "plain"
        path => "/dns/logs/processing_queue/*.tasks"
        start_position => "beginning"
        type => "processing_tasks"
      }
    }
Logstash: Working with filters
This is what a filter looks like:

    filter {
      grok {
        add_tag => [ "processed_tasks" ]
        match => [ "message", "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" ]
        tag_on_failure => [ "failed_tasks" ]
      }
    }
Logstash: Working with outputs
This is what an output looks like:

    output {
      elasticsearch {
        cluster => "storage-cluster"
        codec => "plain"
      }
    }
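Outputs can also be chosen per event with conditionals. The following is only a minimal sketch: it assumes the grok filter marks unparsed events with a tag named failed_tasks, and both that tag name and the file path are hypothetical.

```
output {
  # Events the grok filter could not parse go to a local file for inspection.
  if "failed_tasks" in [tags] {
    file {
      path => "/var/log/logstash/failed_tasks.log"   # hypothetical path
    }
  } else {
    # Everything else is indexed in the Elasticsearch cluster.
    elasticsearch {
      cluster => "storage-cluster"
    }
  }
}
```

Keeping failed events out of the main index makes it easier to spot log formats that the patterns do not cover yet.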
Logstash: Working with Elasticsearch
Elasticsearch is a distributed RESTful search and analytics engine.
This gives Logstash great search power using the Lucene query syntax.
It is also a powerful storage system that can make use of indexes, facets, templates and much more.
Logstash: Elasticsearch overview
Logstash: Working with Kibana
Kibana is a web interface used to search (query) the Elasticsearch cluster.
The interface can be organized into dashboards, each one containing different layouts.
Each layout can have pies, maps, charts, histograms, tables, trends, etc.
Logstash: Dashboard overview
Logstash: Deployment example
You now know the basics. An actual deployment would look like this:
1. Download a package from www.elasticsearch.org
2. Set up your shipper
3. Set up your queue
4. Set up your indexer
5. Set up your storage
Logstash: Setting up your shipper/forwarder
A shipper/forwarder is an agent that usually has some input (port, file, etc.) that receives data and sends it to an output (a queue, if you are using a centralized setup).
To set up your shipper you generally need an input and an output section.
You can also do preprocessing in most shipping agents.
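As a rough sketch of such a shipper, assuming Logstash itself is used as the shipping agent and Redis as the queue. The host name, log path, type and Redis key below are hypothetical, not taken from our setup.

```
# shipper.conf: read local log files and push the raw events to the queue
input {
  file {
    path => "/var/log/app/*.log"     # hypothetical log location
    type => "app"                    # hypothetical event type
  }
}

output {
  redis {
    host => "queue.example.org"      # hypothetical Redis host
    data_type => "list"              # events are appended to a Redis list
    key => "logstash"                # list name; the indexer must read the same key
  }
}
```

There is no filter section here on purpose: keeping the shipper free of parsing logic keeps the load on the monitored hosts low.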
Logstash: Shipping logs
Log Courier (lightweight, fast, secure, stable)
Logstash (heavy, fast, stable, well developed)
Logstash-forwarder (fast, secure, well developed)
Nxlog (no Java, lightweight, mainly used for Windows logs)
Beaver (lightweight, fast, Python)
Logstash: Setting up your indexer
An indexer is a Logstash agent that contains some logic (filters) and does some processing on messages.
To set up your indexer you generally need an input, a filter and an output section.
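The three sections could be sketched as follows, assuming events arrive through a Redis list and are stored in Elasticsearch. The Redis host and key are hypothetical; the grok pattern is the one from the filter example earlier in this talk.

```
# indexer.conf: take events from the queue, parse them, store them
input {
  redis {
    host => "queue.example.org"      # hypothetical Redis host
    data_type => "list"
    key => "logstash"                # must match the key the shippers write to
  }
}

filter {
  grok {
    # Split the raw line into named fields.
    match => [ "message", "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" ]
  }
}

output {
  elasticsearch {
    cluster => "storage-cluster"
  }
}
```

Because all indexers read from the same list, more indexer processes can be started against the same queue when filtering becomes the bottleneck.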
Logstash: An example of a working indexer
An example of a running indexer.
Logstash: Setting up your queue
In order to have a more robust and centralized setup we decided to use a queue.
This will hold all your messages while they wait for the indexers to process them.
The queues most commonly used with Logstash are Redis and RabbitMQ.
Logstash: Setting up your storage cluster
In order to store all our logs, we will set up an Elasticsearch cluster. We also use it because of its advanced search capabilities.
Configuration is fairly simple. Edit elasticsearch.yml and make the necessary changes.
After that, set the maximum RAM limit for your ES node (ES_MAX_MEM).
Logstash: Setting up the Kibana instance
Kibana is a tool used to visualize the data stored in Elasticsearch.
You can run complex queries using the Lucene query language and graph the data in various charts, maps, etc.
e.g. all messages coming from two shippers: LOG_SOURCE:("shipper1" OR "shipper2")
Logstash: Debugging and errors
Logstash is pretty clear and descriptive when it comes to errors, but you can always troubleshoot further using debug mode (--debug).
Another useful tip is to send your events to the stdout output, which shows more information if you have issues with your message format.
To see all the flags just use -h or --help.
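A small test pipeline makes the stdout tip concrete. This is only a sketch: the file name debug.conf and the shortened grok pattern are assumptions chosen for illustration.

```
# debug.conf: type or paste a log line on stdin and inspect the parsed event
input {
  stdin { }
}

filter {
  grok {
    match => [ "message", "%{IP:client} %{WORD:method} %{URIPATHPARAM:request}" ]
  }
}

output {
  stdout {
    codec => rubydebug    # pretty-prints every field of the event
  }
}
```

Started with something like bin/logstash -f debug.conf, a line such as 55.3.244.1 GET /index.html should come back with client, method and request fields. A line that does not match is tagged _grokparsefailure by default, which is usually the quickest hint that a pattern needs work.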
Logstash: Getting help
If you need help, you can join the #logstash IRC channel on the Freenode network. You will find a large community and people willing to help you.
I'm also present there under the nickname vali.
You can also visit the official documentation at http://logstash.net
Logstash: Developing your own add-ons
To develop your own add-on (input, filter, output) you need Ruby knowledge.
You can find more information at http://logstash.net/docs/1.4.2/extending/
Basically you follow a standard structure and add your methods and required arguments. You then do some processing on the incoming event, and either modify it or return a value.
Logstash: Integrating with big data systems
In our case, integration with big data systems was done via KairosDB. It acts as middleware and allows us to use Hadoop or Cassandra as the storage system.
The sending was done via the opentsdb output, and by custom scripts sending directly to KairosDB via a TCP port.
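A sketch of what such an opentsdb output might look like. The host name, port, metric name and tag are hypothetical, and it assumes that earlier filters have extracted duration and client fields and that KairosDB is listening for OpenTSDB-style "put" lines.

```
output {
  opentsdb {
    host => "kairosdb.example.org"   # hypothetical KairosDB host
    port => 4242                     # assumed port of the telnet-style listener
    # Order: metric name, metric value, then tag name/value pairs.
    metrics => [ "dns.request.duration", "%{duration}", "client", "%{client}" ]
  }
}
```

Each event then becomes one data point, with the event timestamp as the point's time and the tags available for grouping at query time.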
Logstash: Using Hadoop
The goal is to develop a big data system that allows us to store events/data from various systems/sources forever.
It must be possible to query the system through a simple interface, with the data returned in a useful format (JSON) that can be graphed easily.
Logstash: Using Cassandra
The same idea as using HBase + HDFS, but exploring a different storage system.
We can still use Hive, Pig and MapReduce queries.
Simpler to set up.
Logstash: Using Elasticsearch
Elasticsearch has the same characteristics as other NoSQL systems.
Some advantages over Cassandra and Hadoop:
- direct access from Logstash, no need for any other middleware.
- great web interface that allows easy access to the data and excellent graphing options.
- powerful API to retrieve data.
Logstash: Conclusion
In our case Logstash seems to be a great candidate for covering the log management spectrum.
Its ease of use, large collection of plugins and easy configuration make it a great tool for managing logs.
It is definitely a tool worth exploring for this kind of monitoring.
Questions
This would be a great time to ask questions. :)
The end
Thank you!