Presentation given on 2014-09-22 at Science For Life Laboratory in Stockholm, Sweden, about ELK stack.
ELK Stack
Because logs are not meant to go to /dev/null
Guillermo Carrasco @guillemch
Logging & ELK stack
What are logs?
What are logs for? Theory vs reality
Logstash & Elasticsearch
Kibana
What are logs?
http://adam.herokuapp.com/past/2011/4/1/logs_are_streams_not_files/
2014-09-19 13:08:37,972 INFO [MiSeqIntegrator] (SequencingIntegrationUtil)
Extracted raw content - [cycleNumber = 308 , runFolder = "D:\Illumina\MiSeqTemp\140904_M01320_0130_000000000-A9NE9" , netFolder = "Z:\140904_M01320_0130_000000000-A9NE9" ,
Record
MiSeqIntegrator.log
HiSeqIntegrator.log
apache.log
GenStat.log
supervisord.log
…
So logs are files?
- Logs are time-oriented streams of records
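Treating a log as a stream of records rather than a file can be sketched in Python as a generator that turns each line into a structured record. This is only an illustration: the regular expression assumes the `timestamp LEVEL [component] (source) message` layout of the MiSeqIntegrator record shown earlier, and the names `RECORD_RE` and `parse_records` are made up for this example.

```python
import re
from datetime import datetime

# Assumed layout, modelled on the sample MiSeqIntegrator record:
#   2014-09-19 13:08:37,972 INFO [MiSeqIntegrator] (SequencingIntegrationUtil) message...
RECORD_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) "
    r"\[(?P<component>[^\]]+)\] "
    r"\((?P<source>[^)]+)\)\s*"
    r"(?P<message>.*)$"
)


def parse_records(lines):
    """Yield one dict per line that matches the assumed layout; skip the rest."""
    for line in lines:
        match = RECORD_RE.match(line.rstrip("\n"))
        if match is None:
            continue
        record = match.groupdict()
        record["timestamp"] = datetime.strptime(
            record["timestamp"], "%Y-%m-%d %H:%M:%S,%f"
        )
        yield record


# Any iterable of lines works: an open file, a socket, a queue...
# The message of the sample record is shortened here.
stream = [
    "2014-09-19 13:08:37,972 INFO [MiSeqIntegrator] "
    "(SequencingIntegrationUtil) Extracted raw content",
    "not a log record",
]
records = list(parse_records(stream))
```

Because `parse_records` only needs an iterable of lines, the same code works whether the records come from a file or from somewhere else, which is the point of thinking in streams.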
What are logs for? Theory
Provide real-time, valuable information about the execution of a program
Use this information to your benefit: prevent problems, do analytics, plot status…
Example: Our pipeline
started job X for sample Y
Aligning sample X
Generating report for Project A.Sample_14_09
Cleaning /proj/a2010002/nobackup area
Submitted jobs in the last X mins…
Pipeline crashes in the last X days…
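A question like "submitted jobs in the last X mins" becomes a small filter once the records carry a timestamp. The sketch below is a guess at how that could look in Python; the `(timestamp, message)` pairs are hand-written for the example and `count_recent` is a hypothetical helper, not part of the pipeline.

```python
from datetime import datetime, timedelta


def count_recent(records, phrase, now, minutes):
    """Count records containing `phrase` that fall within the last `minutes`."""
    cutoff = now - timedelta(minutes=minutes)
    return sum(
        1
        for timestamp, message in records
        if cutoff <= timestamp <= now and phrase in message
    )


# Hypothetical records: the messages come from the slide, the times are invented.
records = [
    (datetime(2014, 9, 19, 13, 0, 0), "started job X for sample Y"),
    (datetime(2014, 9, 19, 13, 20, 0), "Aligning sample X"),
    (datetime(2014, 9, 19, 13, 25, 0), "started job X for sample Y"),
]
now = datetime(2014, 9, 19, 13, 30, 0)
submitted = count_recent(records, "started job", now, minutes=15)
```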
Example: Illumina logs
- Status of a particular run
- Failures/Anomalies
- Cycles sequenced today/this week/etc
- …
What are logs for? Reality
Something we look at ONLY when something has already gone wrong… if we can!
In the previous examples…
- The pipeline logs are dumped to nextgen_analysis_server.log, in milou-b, under the functional account… and rotated!
- The Illumina logs are just never looked at…
Problems
- Logs spread around servers and accounts
- Rotating logs may disappear
- If you don’t rotate, logs will fill up disks
- Difficult to do any (real-time) analytics
- Different applications == different log formats
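The rotation trade-off can be reproduced with the standard library's `logging.handlers.RotatingFileHandler`. The sketch below uses deliberately tiny, made-up limits (`max_bytes=200`, `backup_count=2`) and a hypothetical `app.log` file so that the effect shows after a couple of hundred records: disk usage stays bounded, but the oldest records are gone.

```python
import logging
import logging.handlers
import os
import tempfile


def write_rotated(directory, n_records, max_bytes=200, backup_count=2):
    """Log n_records through a size-based rotating handler; return the lines that survive."""
    path = os.path.join(directory, "app.log")  # hypothetical file name
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backup_count
    )
    logger = logging.getLogger("rotation_demo")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.addHandler(handler)
    try:
        for i in range(n_records):
            logger.info("record %04d", i)
    finally:
        logger.removeHandler(handler)
        handler.close()

    survivors = []
    for name in sorted(os.listdir(directory)):
        with open(os.path.join(directory, name)) as fh:
            survivors.extend(fh.read().splitlines())
    return survivors


with tempfile.TemporaryDirectory() as tmp:
    kept = write_rotated(tmp, n_records=200)
    files_left = sorted(os.listdir(tmp))
```

Raising the limits only postpones the loss, which is why shipping records somewhere searchable before they rotate away is attractive.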
Genologics support:
"I took a look at the system. Unfortunately the logs are filling up in too quick of a time. I have increased the number of logs and the size of them. We should have more that one day of logs now."
rm -rf <all_the_logs>
ELK Stack
- Elasticsearch
- Logstash
- Kibana
Logstash
Index log records from different sources
Re-format log data to be structured and "queryable"
Apply filters
Store your structured data in Elasticsearch (and other outputs)
input {
  # Read messages from redis
  redis {
    host => "localhost"
    data_type => "list"
    password => "password"
    key => "python"
    codec => json
  }
}

# We want to filter multiline events, and we'll suppose that multiline
# events are composed of one event and the following ones starting with
# a space (like an exception traceback)
filter {
  multiline {
    type => "exception"
    pattern => "^\s"
    what => "previous"
    add_tag => [ "exception" ]
  }
}

output {
  elasticsearch {
    host => "tools.scilifelab.se"
  }
}
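To make the multiline rule concrete, here is a rough Python approximation of what `pattern => "^\s"` with `what => "previous"` is meant to do: any line that starts with whitespace is glued onto the event before it. This is not Logstash's implementation; `merge_multiline` and the sample traceback (including the file name `pipeline.py`) are invented for illustration.

```python
import re


def merge_multiline(lines, pattern=r"^\s"):
    """Attach every line matching `pattern` to the event that precedes it."""
    continuation = re.compile(pattern)
    events = []
    for line in lines:
        if events and continuation.search(line):
            # Continuation line: append to the previous event and tag it.
            events[-1]["message"] += "\n" + line
            events[-1]["tags"] = ["exception"]
        else:
            # Any other line starts a new event.
            events.append({"message": line, "tags": []})
    return events


lines = [
    "Traceback (most recent call last):",
    '  File "pipeline.py", line 10, in <module>',
    "    run()",
    "ValueError: bad sample",
    "started job X for sample Y",
]
events = merge_multiline(lines)
```

Note that with this pattern the final `ValueError: ...` line is not indented, so it ends up as its own event; a real setup would likely need a pattern tuned to the application's traceback format.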
Elasticsearch
Built on top of Lucene
Store complex data as structured JSON documents. All fields are indexed by default, and all the indices can be used in a single query.
Schema free (good for logs)
RESTful API
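"JSON documents over a RESTful API" means that storing a log record is just an HTTP request with a JSON body. The sketch below only builds such a request with the standard library and does not send it. The address `http://localhost:9200`, the index name `logs` and the document fields are assumptions, and the `/<index>/_doc` path is the one used by recent Elasticsearch releases; the version in use at the time of the talk may have expected a different path.

```python
import json
import urllib.request


def build_index_request(base_url, index, document):
    """Build (but do not send) an HTTP request that would index one JSON document."""
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_doc",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical document: no schema has to be declared beforehand.
doc = {
    "@timestamp": "2014-09-19T13:08:37.972",
    "level": "INFO",
    "component": "MiSeqIntegrator",
    "cycleNumber": 308,
}
request = build_index_request("http://localhost:9200", "logs", doc)
# urllib.request.urlopen(request) would send it to a running server.
```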
Kibana
No code required
Real-time analysis for streaming data
Create and customise dashboards
For freeeee!!!
Shippers
Broker*
Indexer
Storage & search
Visualization
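The chain of stages above can be mimicked in a few lines of Python to show how the pieces hand data to each other. Everything here is a stand-in: a `queue.Queue` plays the broker, a plain list plays storage and search, and the functions `ship`, `index` and `search` as well as the "crash" rule are invented for this sketch. In the stack described in the talk those roles would presumably be filled by Logstash, Redis, Elasticsearch and Kibana.

```python
import json
import queue


def ship(lines, broker):
    """Shipper: push raw lines to the broker as JSON events."""
    for line in lines:
        broker.put(json.dumps({"message": line}))


def index(broker, storage):
    """Indexer: drain the broker, add structure, store the documents."""
    while True:
        try:
            raw = broker.get_nowait()
        except queue.Empty:
            break
        event = json.loads(raw)
        # Toy enrichment rule, made up for the example.
        event["level"] = "ERROR" if "crash" in event["message"].lower() else "INFO"
        storage.append(event)


def search(storage, **criteria):
    """Storage & search: return documents whose fields equal the criteria."""
    return [
        doc
        for doc in storage
        if all(doc.get(key) == value for key, value in criteria.items())
    ]


broker = queue.Queue()  # stands in for the broker
storage = []            # stands in for storage & search
ship(["started job X for sample Y", "Pipeline crash while aligning"], broker)
index(broker, storage)
errors = search(storage, level="ERROR")  # what a dashboard would plot
```

The broker decouples the shippers from the indexer: shippers can keep pushing events even when indexing is slow or briefly unavailable.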
Thank you!