
Large Scale Log collection using LogStash & mongoDB

Page 1: Large Scale Log collection using LogStash & mongoDB

Large scale log collection

Guided by Professor Simon Shim

Team #14 Gaurav Bhardwaj <009297431> Vaibhav Bhor <009313434> Sumant Murke <009303879> Amod Rege <009259692>

CMPE 283: VIRTUALIZATION TECHNOLOGIES

Page 2: Large Scale Log collection using LogStash & mongoDB

AGENDA

1. Project Overview
2. Objective
3. Project Part-2
4. Project Part-1 (DRS-DPM)
5. Screenshots
6. Lessons learnt
7. Conclusion

Page 3: Large Scale Log collection using LogStash & mongoDB

Objective

- Manage and test virtual machines
- Simulate DRS-DPM functionality
- Develop a large-scale analysis tool that collects VM as well as host performance data
- Understand the need to gather and analyze log data
- Come up with a framework that provides a complete solution for virtual machine log file collection and analysis

Page 4: Large Scale Log collection using LogStash & mongoDB

Design

Page 5: Large Scale Log collection using LogStash & mongoDB
Page 6: Large Scale Log collection using LogStash & mongoDB

Components

- Agent
- Collector
- Aggregator
- Local storage (mongoDB)
- Central storage (MySQL)
- Visualization

Page 7: Large Scale Log collection using LogStash & mongoDB

Agent

- Uses the Java VI API to collect system metrics
- Collects host as well as virtual machine stats
- Writes to a text file every 5 seconds
- Takes the following parameters: VM Name, vHost Name, y/n
  - VM Name: name of the virtual machine to monitor
  - vHost Name: name of the vHost to monitor
  - y/n: y collects stats for both the vHost and the VM; n collects only VM stats

java -jar Agent.jar "vHost Name" "VM Name" "y/n"
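The agent itself is a Java program built on the VI API; the following is only a minimal Python sketch of its command-line contract and per-round output, under stated assumptions. The function names, the JSON field names (`timestamp`, `name`, `type`), and the `read_stats` callback that stands in for the VI API queries are all hypothetical and do not come from the slides.

```python
import json
import time


def parse_args(argv):
    """Parse the three positional arguments in the order shown in the command."""
    if len(argv) != 3:
        raise SystemExit('usage: agent "vHost Name" "VM Name" "y/n"')
    host_name, vm_name, flag = argv
    if flag.lower() not in ("y", "n"):
        raise SystemExit("third argument must be y or n")
    return host_name, vm_name, flag.lower() == "y"


def build_record(name, kind, stats, now=None):
    """Serialize one sample as a single JSON line (field names are hypothetical)."""
    record = {
        "timestamp": int(now if now is not None else time.time()),
        "name": name,
        "type": kind,
    }
    record.update(stats)
    return json.dumps(record, sort_keys=True)


def sample_lines(host_name, vm_name, include_host, read_stats):
    """Return the log lines for one sampling round.

    read_stats(name, kind) is a placeholder for the VI API calls and
    must return a dict of metric name -> value.
    """
    lines = [build_record(vm_name, "vm", read_stats(vm_name, "vm"))]
    if include_host:
        lines.append(build_record(host_name, "host", read_stats(host_name, "host")))
    return lines
```

A caller would invoke `sample_lines` once every 5 seconds and write the returned lines to the text file that LogStash watches.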

Page 8: Large Scale Log collection using LogStash & mongoDB

Agent flow

Page 9: Large Scale Log collection using LogStash & mongoDB

Parsing file using LogStash

- LogStash reads the log file written by the agent.
- For every append to the log file, it detects the change and generates an event, parses each line of the log file, and stores it in mongoDB.

Conf file (logshipper.conf) supplied to LogStash, in outline:

input { file => "*.log" }
filter { filter => json }
output { output => mongoDB }

bin/logstash -f logshipper.conf
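To make the detect, parse, and store flow concrete, here is a small Python stand-in for what the pipeline does with each appended line. It is not LogStash and not the team's configuration: `parse_event` and `read_new_events` are hypothetical names, and the byte-offset bookkeeping is an assumed simplification of how a file watcher remembers its position. The `_jsonparsefailure` tag mirrors the tag LogStash's json filter adds when a line cannot be parsed.

```python
import json


def parse_event(line):
    """Turn one log line into a document, roughly as a JSON filter would."""
    event = {"message": line.rstrip("\n")}
    try:
        fields = json.loads(event["message"])
    except ValueError:
        event["tags"] = ["_jsonparsefailure"]
        return event
    if isinstance(fields, dict):
        event.update(fields)  # promote parsed keys to top-level fields
    else:
        event["tags"] = ["_jsonparsefailure"]
    return event


def read_new_events(path, offset):
    """Parse lines appended to path since byte offset.

    Returns (events, new_offset). A trailing partial line is left for
    the next call so that half-written records are never parsed.
    """
    with open(path, "rb") as handle:
        handle.seek(offset)
        data = handle.read()
    end = data.rfind(b"\n") + 1  # 0 when no complete line is available
    events = [
        parse_event(raw.decode("utf-8"))
        for raw in data[:end].splitlines()
        if raw.strip()
    ]
    return events, offset + end
```

Each returned event is a plain dict, which is the shape a document store such as mongoDB accepts directly.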

Page 10: Large Scale Log collection using LogStash & mongoDB

Collector

- Takes the IPs of all agents
- Connects to the local storage of each VM
- Pulls data in a round-robin manner
- Clears data from mongoDB after reading
- Stores data in MySQL
- Uses a configuration file for connection information
- Runs automatically every 5 minutes using crontab

python collector.py "conf file"
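The round-robin pull can be sketched as follows. This is an illustration under assumptions, not the team's collector.py: `LocalStore` is an in-memory stand-in for each VM's mongoDB collection, `central` is a plain list standing in for the MySQL table, and the batch size is an invented parameter.

```python
from collections import deque


class LocalStore:
    """In-memory stand-in for one VM's local mongoDB collection."""

    def __init__(self, docs):
        self.docs = list(docs)

    def read_batch(self, limit):
        return self.docs[:limit]

    def delete_batch(self, count):
        del self.docs[:count]


def collect_round_robin(stores, central, batch_size=100):
    """Visit agents in turn, moving one batch per visit until all are empty.

    stores maps agent IP -> LocalStore; central is a list-like sink.
    Returns the number of documents now held in central.
    """
    queue = deque(stores.items())
    while queue:
        ip, store = queue.popleft()
        batch = store.read_batch(batch_size)
        if not batch:
            continue  # agent drained; drop it from the rotation
        central.extend({"agent": ip, **doc} for doc in batch)
        # Clear local data only after it has been stored centrally.
        store.delete_batch(len(batch))
        queue.append((ip, store))
    return len(central)
```

Taking one batch per visit keeps a busy agent from starving the others. A crontab schedule of `*/5 * * * *` corresponds to the five-minute interval mentioned above.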

Page 11: Large Scale Log collection using LogStash & mongoDB

Aggregator & Central DB design

- 24-hour, 1-hour, and 5-minute data
- VM and vHost stats
- Schema
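One way to produce the three granularities is to bucket raw samples into fixed time windows. The sketch below assumes that each bucket stores the mean of its samples; the slides do not say which aggregate is used, and the tuple layout and names here are hypothetical.

```python
from collections import defaultdict

# Window lengths in seconds for the three granularities on the slide.
WINDOWS = {"5min": 300, "1hour": 3600, "24hour": 86400}


def rollup(samples, window_seconds):
    """Average samples per (name, window start).

    samples is an iterable of (timestamp, name, value) tuples, where
    name identifies a VM or vHost. Returns {(name, window_start): mean}.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for timestamp, name, value in samples:
        start = timestamp - timestamp % window_seconds
        bucket = sums[(name, start)]
        bucket[0] += value
        bucket[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}
```

Running `rollup` once per entry in `WINDOWS` yields the rows for each granularity.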

Page 12: Large Scale Log collection using LogStash & mongoDB

DRS-DPM (Part-1)

Initialize the environment and get the number of VMs and hosts.

Initialize standard variables vmCount and hostCount.
If number of virtual machines is greater than vmCount
    If new machine is powered on
        Move newly added virtual machine to host with minimum load.
    End if
End if
If number of host machines is greater than hostCount
    If CPU load of new host is less than 30%
        Migrate the virtual machine to host with minimum load.
        Power off the host.
    End if
    Find the VM with minimum load.
    Migrate the virtual machine under new host.
End if

Avoided ping-pong migration
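The two decisions in the pseudocode, placing a new VM on the least loaded host and powering off a host whose CPU load is under 30%, can be sketched in Python as below. This is a loose reading rather than the team's implementation: the dictionary layout (`cpu`, `powered_on`, `vms`) and the function names are hypothetical, host load is not recomputed as VMs move, and the ping-pong avoidance is not modeled.

```python
LOW_LOAD_THRESHOLD = 30.0  # percent CPU, the threshold from the slide


def least_loaded(hosts, exclude=None):
    """Return the name of the powered-on host with the lowest CPU load."""
    candidates = {
        name: host
        for name, host in hosts.items()
        if host["powered_on"] and name != exclude
    }
    if not candidates:
        return None
    return min(candidates, key=lambda name: candidates[name]["cpu"])


def place_new_vm(hosts, vm_name):
    """DRS step: put a newly powered-on VM on the least loaded host."""
    target = least_loaded(hosts)
    if target is not None:
        hosts[target]["vms"].append(vm_name)
    return target


def consolidate(hosts, host_name):
    """DPM step: evacuate and power off a host that is under the threshold."""
    host = hosts[host_name]
    if not host["powered_on"] or host["cpu"] >= LOW_LOAD_THRESHOLD:
        return False
    if host["vms"] and least_loaded(hosts, exclude=host_name) is None:
        return False  # nowhere to move the VMs, so keep the host running
    while host["vms"]:
        target = least_loaded(hosts, exclude=host_name)
        hosts[target]["vms"].append(host["vms"].pop())
    host["powered_on"] = False
    return True
```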

Page 13: Large Scale Log collection using LogStash & mongoDB

Is our design good?

- Agents: will not append; will rewrite the file
- Database (mongoDB) and Collector: the Collector collects data, stores it in MySQL, and removes it from local storage; it can connect to as many clients as are specified in the conf file
- Database (MySQL) and Aggregator: the Aggregator purges the main table
- Visualization module is totally decoupled from server and storage

Page 14: Large Scale Log collection using LogStash & mongoDB

Visualization approach

Library: CanvasJS
- We used canvas.js, a JavaScript library, to plot the graphs.
- We chose canvas.js because it is easy to use and provides different types of visualization.

Data source: MySQL database
- MySQL supplies the data in a structured format, which is then plotted on the graphs.
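The hand-off from structured rows to a chart can be sketched as below. CanvasJS charts take an options object whose `data` series carry a `dataPoints` array of `label` and `y` pairs; the Python helpers that build that object are hypothetical and only illustrate the shape of the data, since the slides name JSP as the server-side layer.

```python
import json


def to_data_points(rows):
    """Convert (label, value) rows into CanvasJS-style dataPoints."""
    return [{"label": str(label), "y": float(value)} for label, value in rows]


def chart_options(title, rows, chart_type="line"):
    """Build the options object a page would hand to the chart constructor."""
    return {
        "title": {"text": title},
        "data": [{"type": chart_type, "dataPoints": to_data_points(rows)}],
    }


def to_json(options):
    """Serialize the options so they can be embedded in or fetched by a page."""
    return json.dumps(options)
```

Because the page only consumes this JSON, the visualization stays decoupled from how the rows are stored.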

Page 15-19: Large Scale Log collection using LogStash & mongoDB

Output Graphs (five slides of graph screenshots; images not reproduced)

Page 20: Large Scale Log collection using LogStash & mongoDB

Tools & Technology

- Agents: Java VI API
- Collectors: Python script automated with crontab
- Log file parsing: LogStash with mongoDB plugin
- Stress utility: manually increase CPU, IO, and RAM consumption
  stress --cpu 2 --io 1 --vm 1 --vm-bytes 128M --timeout 10s --verbose
- Visualization tools: CanvasJS JavaScript library, JSP & HTML5
- Programming languages: Java, Python, JavaScript
- Utilities: PuTTY, WinSCP
- Database: MySQL, mongoDB

Page 21: Large Scale Log collection using LogStash & mongoDB

Lessons learnt

- Using the VI Java API
- Concept behind DRS-DPM
- Never clone a vHost
- Not every virtual machine is Linux
- Automation using crontab
- ESX log files awareness
- Designing systems
- Working with SQL and NoSQL databases and understanding their usage context

Page 22: Large Scale Log collection using LogStash & mongoDB

THANK YOU...