21
The InfluxDB-Grafana plugin for Fuel Documentation Release 0.8.0 Mirantis Inc. December 14, 2015

The InfluxDB-Grafana plugin for Fuel Documentation · 2019-04-02 · The InfluxDB-Grafana plugin for Fuel Documentation, Release 0.8.0 1.4.3Exploring your time-series with Grafana

  • Upload
    others

  • View
    47

  • Download
    0

Embed Size (px)

Citation preview

Page 1: The InfluxDB-Grafana plugin for Fuel Documentation · 2019-04-02 · The InfluxDB-Grafana plugin for Fuel Documentation, Release 0.8.0 1.4.3Exploring your time-series with Grafana

The InfluxDB-Grafana plugin for FuelDocumentation

Release 080

Mirantis Inc

December 14 2015

Contents

1 User documentation 111 Overview 112 Release Notes 313 Installation Guide 314 User Guide 515 Licenses 1716 Appendix 18

2 Indices and Tables 19

i

CHAPTER 1

User documentation

11 Overview

The InfluxDB-Grafana Fuel Plugin is used to install and configure InfluxDB and Grafana which collectively provideaccess to the OpenStack metrics analytics InfluxDB is a powerful distributed time-series database to store and searchmetrics time-series The metrics analytics are used to visualize the time-series and the annotations produced by theLMA Collector The annotations contain insightful information about the detected fault or anomaly that triggered achange of state for a node cluster or service cluster as well as textual hints about what might be the root cause of thefault or anomaly

The InfluxDB-Grafana Plugin is an indispensable tool to answering the questions ldquowhat has changed in my Open-Stack environmentwhen and whyrdquo Grafana is installed with a collection of predefined dashboards for each of theOpenStack services that are monitored Among those dashboards the Main Dashboard provides a single pane of glassoverview of your OpenStack environment status

InfluxDB and Grafana are key components of the LMA Toolchain project as shown in the figure below

1

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

111 Requirements

Require-ment

VersionComment

Disk space At least 55GBFuel Mirantis OpenStack 70Hardwareconfiguration

The hardware configuration (RAM CPU disk(s)) required by this plugin depends on the size ofyour cloud environment and other factors like the retention policy but a typical setup would atleast require a quad-core server with 8GB of RAM and access to a fast disk or disks array(ideally SSDs)It is also highly recommended to use a dedicated disk for your data storage Otherwise TheInfluxDB-Grafana Plugin will use the root filesystem by default

112 Limitations

A current limitation of this plugin is that it not possible to display in the Fuel web UI the URL where the Grafanainterface can be reached when the deployment has completed Instructions are provided in the User Guide about howyou can obtain this URL using the fuel command line

11 Overview 2

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

113 Key terms acronyms and abbreviations

Terms ampacronyms

Definition

LMACollector

Logging Monitoring and Alerting (LMA) Collector A service running on each node whichcollects all the logs and the OpenStak notifications

InfluxDB InfluxDB is a time-series metrics and analytics open-source database (MIT license) Itrsquos writtenin Go and has no external dependenciesInfluxDB is targeted at use cases for DevOps metrics sensor data and real-time analytics

Grafana Grafana is an (Apache 20 Licensed) general purpose dashboard and graph composer Itrsquos focusedon providing rich ways to visualize metrics time-series mainly though graphs but supports otherways to visualize data through a pluggable panel architectureIt currently has rich support for Graphite InfluxDB and OpenTSDB But supports other datasources via plugins Grafana is most commonly used for infrastructure monitoring applicationmonitoring and metric analytics

12 Release Notes

121 Version 080

bull Add support for the ldquoinfluxdb_grafanardquo Fuel Plugin role instead of the ldquobase-osrdquo role which had several limita-tions

bull Add support for retention policy configuration

bull Upgrade to InfluxDB 094 which brings metrics time-series with tagging

bull Upgrade to Grafana 250

bull Several dashboard visualisation improvements

bull A new self-monitoring dashboard

122 Version 070

bull Initial release of the plugin This is a beta version

13 Installation Guide

131 InfluxDB-Grafana Fuel Plugin install using the RPM file of the Fuel PluginsCatalog

To install the InfluxDB-Grafana Fuel Plugin using the RPM file of the Fuel Plugins Catalog you need to follow thesesteps

1 Download the RPM file from the Fuel Plugins Catalog

2 Copy the RPM file to the Fuel Master node

[roothome ~] scp influxdb_grafana-08-080-1noarchrpm rootltFuel Master node IP addressgt

3 Install the plugin using the Fuel CLI

12 Release Notes 3

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

[rootfuel ~] fuel plugins --install influxdb_grafana-08-080-1noarchrpm

4 Verify that the plugin is installed correctly

[rootfuel ~] fuel plugins --listid | name | version | package_version---|----------------------|---------|----------------1 | influxdb_grafana | 080 | 300

132 InfluxDB-Grafana Fuel Plugin install from source

Alternatively you may want to build the RPM file of the plugin from source if for example you want to test thelatest features modify some built-in configuration or implement your own customization But note that running a Fuelplugin that you have built yourself is at your own risk

To install the InfluxDB-Grafana Plugin from source you first need to prepare an environement to build the RPM fileThe recommended approach is to build the RPM file directly onto the Fuel Master node so that you wonrsquot have to copythat file later on

Prepare an environment for building the plugin on the Fuel Master Node

1 Install the standard Linux development tools

[roothome ~] yum install createrepo rpm rpm-build dpkg-devel

2 Install the Fuel Plugin Builder To do that you should first get pip

[roothome ~] easy_install pip

3 Then install the Fuel Plugin Builder (the fpb command line) with pip

[roothome ~] pip install fuel-plugin-builder

Note You may also need to build the Fuel Plugin Builder if the package version of the plugin is higher than packageversion supported by the Fuel Plugin Builder you get from pypi In this case please refer to the section ldquoPreparing anenvironment for plugin developmentrdquo of the Fuel Plugins wiki if you need further instructions about how to build theFuel Plugin Builder

4 Clone the plugin git repository

[roothome ~] git clone gitgithubcomstackforgefuel-plugin-influxdb-grafanagit

5 Check that the plugin is valid

[roothome ~] fpb --check fuel-plugin-influxdb-grafana

6 And finally build the plugin

[roothome ~] fpb --build fuel-plugin-influxdb-grafana

7 Now that you have created the RPM file you can install the plugin using the fuel plugins ndashinstall command

[rootfuel ~] fuel plugins --install fuel-plugin-influxdb-grafananoarchrpm

133 InfluxDB-Grafana Fuel Plugin software components

List of software components installed by the plugin

13 Installation Guide 4

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

Components VersionInfluxDB v094 for Ubuntu (64-bit)Grafana v250 for Ubuntu (64-bit)

14 User Guide

141 Plugin configuration

To configure your plugin you need to follow these steps

1 Create a new environment with the Fuel web user interface

2 Click on the Settings tab of the Fuel web UI

3 Scroll down the page and select the InfluxDB-Grafana Plugin in the left column The InfluxDB-Grafana Pluginsettings screen should appear as shown below

14 User Guide 5

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

1 Select the InfluxDB-Grafana Plugin checkbox and fill-in the required fields

(a) Specify the number of days retention period for data

(b) Specify the InfluxDB admin password (called root password in the InfluxDB documentation)

(c) Specify the database name (default is lma)

(d) Specify the InfluxDB user name and password

(e) Specify the Grafana user name and password

2 Assign the InfluxDB Grafana role to the node where you would like to install the InfluxDB and Grafana serversas shown below

Note Because of a bug with Fuel 70 (see bug 1496328) the UI wonrsquot let you assign the InfluxDBGrafana role if at least one node is already assigned with one of the built-in roles

To workaround this problem you should either remove the already assigned built-in roles or use the FuelCLI

14 User Guide 6

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

$ fuel --env ltenvironment idgt node set --node-id ltnode_idgt --role=influxdb_grafana

1 Adjust the disk configuration if necessary (see the Fuel User Guide for details) By default the InfluxDB-Grafana Plugin allocates

bull 20 of the first available disk for the operating system by honoring a range of 15GB minimum to 50GBmaximum

bull 10GB for varlog

bull At least 30 GB for the InfluxDB database in optinfluxdb

2 Configure your environment as needed

3 Verify the networks on the Networks tab of the Fuel web UI

4 Deploy your changes

142 Plugin verification

Be aware that depending on the number of nodes and deployment setup deploying a Mirantis OpenStack environmentcan typically take anything from 30 minutes to several hours But once your deployment is complete you should seea notification that looks like the following

Verifying InfluxDB

Once your deployment has completed you should verify that InfluxDB is running properly On the Fuel Master nodeyou can retrieve the IP address of the node where InfluxDB is installed via the fuel command line

14 User Guide 7

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

[rootfuel ~] fuel nodesid | status | name | cluster | ip | | roles | ---|----------|------|---------|-----------|-----|-------------------|----37 | ready | lma | 38 | 102004 | | influxdb_grafana |

[Skip ]

On that node (node-37 in this example) the influx command should be available via the CLI Executing influx willstart an interactive CLI and automatically connect to the local InfluxDB server

[rootnode-37 ~] optinfluxdbinflux -database lma -password lmapass --username lmaConnected to httplocalhost8086 version 0942InfluxDB shell 0942gt

Then if you type

gt show series

You should see a dump of all the time-series collected so far

[ Skip]

name swap_used---------------_key deployment_id hostnameswap_useddeployment_id=38hostname=node-40 38 node-40swap_useddeployment_id=38hostname=node-42 38 node-42swap_useddeployment_id=38hostname=node-41 38 node-41swap_useddeployment_id=38hostname=node-43 38 node-43swap_useddeployment_id=38hostname=node-38 38 node-38swap_useddeployment_id=38hostname=node-37 38 node-37swap_useddeployment_id=38hostname=node-36 38 node-36

name total_threads_created---------------------------_key deployment_id hostnametotal_threads_createddeployment_id=38hostname=node-38 38 node-38total_threads_createddeployment_id=38hostname=node-37 38 node-37total_threads_createddeployment_id=38hostname=node-36 38 node-36

Verifying Grafana

The Grafana user interface runs on port 8000 Pointing your browser to the URL httpltHOSTgt8000 you should seethe Grafana login page

14 User Guide 8

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

You should be redirected to the Grafana Home Page The first time you access Grafana you are requested to authen-ticate using the credentials you have defined in the settings Once you have authenticated successfully you should beautomatically redirected to the Home Page from where you can select a dashboard as shown below

14 User Guide 9

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

143 Exploring your time-series with Grafana

The InfluxDB-Grafana Plugin comes with a collection of predefined dashboards you can use to visualize the time-series that are stored in InfluxDB There is one primary dashboard called the Main Dashboard and several otherdashboards that are organized per service name

The Main Dashboard

We suggest you start with the Main Dashboard as shown below The Main Dashboard provides a single pane of glassto visualize the health status of all the OpenStack services being monitored such as Nova or Cinder but also HAProxyMySQL and RabbitMQ to name a few

14 User Guide 10

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

As you can see the Main Dashboard (as most dashboards) provides a drop down menu list in the upper left corner ofthe window from where you can select a metric tag (aka dimension) such as a controller name or device name youwant to visualize In the example above we say we want to visualize the system time-series for node-48

Within the OpenStack Services row each of the services represented can be assigned five different states

Note The precise determination of a service state depends on the Global Status Evaluation (GSE) policies definedfor the GSE Plugins

The meaning associated with a service health state is the following

bull Down One or several primary functions of a service cluster are failed For example all API endpoints of aservice cluster like Nova or Cinder are failed

bull Critical One or several primary functions of a service cluster are severely degraded The quality of servicedelivered to the end-user should be severely impacted

bull Warning One or several primary functions of a service cluster are slightly degraded The quality of servicedelivered to the end-user should be slightly impacted

bull Unknown There is not enough data to infer the actual health state of a service cluster

bull Okay None of the above was found to be true

The Virtual Compute Resources row provides an overview of the amount of virtual resources being used by the com-pute nodes including the number of virtual CPUs the amount of memory and disk space being used as well as theamount of virtual resources remaining available to create new instances

14 User Guide 11

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

The ldquoSystemrdquo row provides an overview of the amount of physical resources being used on the control plane (thecontroller cluster) You can select a specific controller using the controllerrsquos drop down list in the left corner of thetoolbar

The ldquoCephrdquo row provides an overview of the resources usage and current health state of the Ceph cluster when it isdeployed in the OpenStack environment

The Main Dashboard is also an entry point to access detailed dashboards for each of the OpenStack services beingmonitored For example if you click through the Nova box you should see a screen like this

14 User Guide 12

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

14 User Guide 13

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

The Nova Dashboard

The Nova Dashboard provides a detailed view of the Nova servicersquos related metrics

The Service Status row provides information about the Nova service cluster health state as a whole including the stateof the API frontend (the HAProxy plubic VIP) a counter of HTTP 5xx errors the HTTP requests response time andstatus code

The Nova API row provides information about the health state of the API backends (nova-api ec2-api ) the state ofthe workers and compute nodes

The Instance row provides information about the number of active instances instances in error and instances creationtime statistics

The ldquoResourcesrdquo row provides various virtual resources usage indicators

The LMA Self-Monitoring Dashboard

The LMA Self-Monitoring Dashboard is a new dashboard in LMA 08 This dashboard provides an overview of howthe LMA Toolchain performs overall

The LMA Collector row provides information about the Heka process In particular it is possible to visualize theprocessing time allocated to the Lua plugins and the amount of messages that have been processed as well as theamount of system resources consumed by the Heka process

Again it is possible to select a particular node using the dropdown menu list

The Collectd row provides system resource usage information allocated to the collectd process

The InfluxDB row provides system resource usage information allocated to the InfluxDB application

The Grafana row provides system resource usage information allocated to the Grafana application

The Elasticsearch row provides system resource usage information allocated to the JVM process running the Elastic-search application

Other Dashboards

In total there are 16 different dashboards you can use to explore different time-series facettes of your OpenStackenvironment

Viewing Faults and Anomalies

The LMA-Toolchain is capable of detecting a number of service-affecting conditions such as the faults and anomaliesthat occured in your OpenStack environment Those conditions are reported in annotations that are displayed inGrafana The Grafana annotations contain a textual representation of the alarm (or set of alarms) that were triggeredby the Collectors for a service In other words the annotations contain valuable insights that you could use to diagnoseand troubleshoot problems Futhermore with the Grafana annotations the system makes a distiction between whatis estimated as a direct root cause versus what is estimated as an indirect root cause This is internally represented ina dependency graph There are first degree dependencies that are used to describe situations whereby the healthstate of an entity strictly depends on the health state of another entity For example Nova as a service has firstdegree dependencies with the nova-api endpoints and the nova-scheduler workers But there are also second degreedependencies whereby the health state of an entity doesnrsquot strictly depends on the heath state of another entity althoughit might be depending on the operation being performed For example by default we declared that Nova has a seconddegree dependency with Neutron As a result the health state of Nova will not be directly impacted by the healthstate of Neutron but the annotation will provide a root cause analysis hint For example letrsquos assume a situation whereNova has changed a state from okay to critical (because of 5xx HTTP errors) and that Neutron has been in down state

14 User Guide 14

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

for a while In this case the Nova dashboard will display an annotation that says Nova has changed a state to warningbecause the system has detected 5xx errors and that it may be due to the fact that Neutron is down An example ofwhat an annotation looks like is shown below

14 User Guide 15

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

14 User Guide 16

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

144 Troubleshooting

If you get no data in Grafana follow these troubleshooting tips

1 First check that the LMA Collector is running properly by following the LMA Collector troubleshooting in-structions in the LMA Collector Fuel Plugin User Guide

2 Check that the nodes are able to connect to the InfluxDB server on port 8086

3 Check that InfluxDB is running

[rootnode-37 ~] etcinitdinfluxdb statusinfluxdb Process is running [ OK ]

4 If InfluxDB is down restart it

[rootnode-37 ~] etcinitdinfluxdb startStarting the process influxdb [ OK ]influxdb process was started [ OK ]

5 Check that Grafana is running

[rootnode-37 ~] etcinitdgrafana-server status

grafana is running

6 If Grafana is down restart it

[rootnode-37 ~] etcinitdgrafana-server start

Starting Grafana Server

7 If none of the above solves the problem check the logs in varloginfluxdbinfluxdblog andvarloggrafanagrafanalog to find out what might have gone wrong

15 Licenses

151 Third Party Components

Name Project Web Site LicenseInfluxDB httpsinfluxdbcom MITGrafana httpgrafanaorg Apache v2

152 Puppet modules

Name Project Web Site LicenseApt httpsgithubcompuppetlabspuppetlabs-apt Apache V2Concat httpsgithubcompuppetlabspuppetlabs-concat Apache V2Stdlib httpsgithubcompuppetlabspuppetlabs-stdlib Apache V2IniFile httpsgithubcompuppetlabspuppetlabs-inifile Apache v2Grafana httpsgithubcombfraserpuppet-grafana Apache v2

15 Licenses 17

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

16 Appendix

bull The InfluxDB-Grafana plugin project at GitHub

bull The official InfluxDB documentation

bull The official Grafana documentation

16 Appendix 18

CHAPTER 2

Indices and Tables

bull search

19

  • User documentation
    • Overview
    • Release Notes
    • Installation Guide
    • User Guide
    • Licenses
    • Appendix
      • Indices and Tables
Page 2: The InfluxDB-Grafana plugin for Fuel Documentation · 2019-04-02 · The InfluxDB-Grafana plugin for Fuel Documentation, Release 0.8.0 1.4.3Exploring your time-series with Grafana

Contents

1 User documentation 111 Overview 112 Release Notes 313 Installation Guide 314 User Guide 515 Licenses 1716 Appendix 18

2 Indices and Tables 19

i

CHAPTER 1

User documentation

11 Overview

The InfluxDB-Grafana Fuel Plugin is used to install and configure InfluxDB and Grafana which collectively provideaccess to the OpenStack metrics analytics InfluxDB is a powerful distributed time-series database to store and searchmetrics time-series The metrics analytics are used to visualize the time-series and the annotations produced by theLMA Collector The annotations contain insightful information about the detected fault or anomaly that triggered achange of state for a node cluster or service cluster as well as textual hints about what might be the root cause of thefault or anomaly

The InfluxDB-Grafana Plugin is an indispensable tool to answering the questions ldquowhat has changed in my Open-Stack environmentwhen and whyrdquo Grafana is installed with a collection of predefined dashboards for each of theOpenStack services that are monitored Among those dashboards the Main Dashboard provides a single pane of glassoverview of your OpenStack environment status

InfluxDB and Grafana are key components of the LMA Toolchain project as shown in the figure below

1

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

111 Requirements

Require-ment

VersionComment

Disk space At least 55GBFuel Mirantis OpenStack 70Hardwareconfiguration

The hardware configuration (RAM CPU disk(s)) required by this plugin depends on the size ofyour cloud environment and other factors like the retention policy but a typical setup would atleast require a quad-core server with 8GB of RAM and access to a fast disk or disks array(ideally SSDs)It is also highly recommended to use a dedicated disk for your data storage Otherwise TheInfluxDB-Grafana Plugin will use the root filesystem by default

112 Limitations

A current limitation of this plugin is that it not possible to display in the Fuel web UI the URL where the Grafanainterface can be reached when the deployment has completed Instructions are provided in the User Guide about howyou can obtain this URL using the fuel command line

11 Overview 2

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

113 Key terms acronyms and abbreviations

Terms ampacronyms

Definition

LMACollector

Logging Monitoring and Alerting (LMA) Collector A service running on each node whichcollects all the logs and the OpenStak notifications

InfluxDB InfluxDB is a time-series metrics and analytics open-source database (MIT license) Itrsquos writtenin Go and has no external dependenciesInfluxDB is targeted at use cases for DevOps metrics sensor data and real-time analytics

Grafana Grafana is an (Apache 20 Licensed) general purpose dashboard and graph composer Itrsquos focusedon providing rich ways to visualize metrics time-series mainly though graphs but supports otherways to visualize data through a pluggable panel architectureIt currently has rich support for Graphite InfluxDB and OpenTSDB But supports other datasources via plugins Grafana is most commonly used for infrastructure monitoring applicationmonitoring and metric analytics

12 Release Notes

121 Version 080

bull Add support for the ldquoinfluxdb_grafanardquo Fuel Plugin role instead of the ldquobase-osrdquo role which had several limita-tions

bull Add support for retention policy configuration

bull Upgrade to InfluxDB 094 which brings metrics time-series with tagging

bull Upgrade to Grafana 250

bull Several dashboard visualisation improvements

bull A new self-monitoring dashboard

122 Version 070

bull Initial release of the plugin This is a beta version

13 Installation Guide

131 InfluxDB-Grafana Fuel Plugin install using the RPM file of the Fuel PluginsCatalog

To install the InfluxDB-Grafana Fuel Plugin using the RPM file of the Fuel Plugins Catalog you need to follow thesesteps

1 Download the RPM file from the Fuel Plugins Catalog

2 Copy the RPM file to the Fuel Master node

[roothome ~] scp influxdb_grafana-08-080-1noarchrpm rootltFuel Master node IP addressgt

3 Install the plugin using the Fuel CLI

12 Release Notes 3

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

[rootfuel ~] fuel plugins --install influxdb_grafana-08-080-1noarchrpm

4 Verify that the plugin is installed correctly

[rootfuel ~] fuel plugins --listid | name | version | package_version---|----------------------|---------|----------------1 | influxdb_grafana | 080 | 300

132 InfluxDB-Grafana Fuel Plugin install from source

Alternatively you may want to build the RPM file of the plugin from source if for example you want to test thelatest features modify some built-in configuration or implement your own customization But note that running a Fuelplugin that you have built yourself is at your own risk

To install the InfluxDB-Grafana Plugin from source you first need to prepare an environement to build the RPM fileThe recommended approach is to build the RPM file directly onto the Fuel Master node so that you wonrsquot have to copythat file later on

Prepare an environment for building the plugin on the Fuel Master Node

1 Install the standard Linux development tools

[roothome ~] yum install createrepo rpm rpm-build dpkg-devel

2 Install the Fuel Plugin Builder To do that you should first get pip

[roothome ~] easy_install pip

3 Then install the Fuel Plugin Builder (the fpb command line) with pip

[roothome ~] pip install fuel-plugin-builder

Note You may also need to build the Fuel Plugin Builder if the package version of the plugin is higher than packageversion supported by the Fuel Plugin Builder you get from pypi In this case please refer to the section ldquoPreparing anenvironment for plugin developmentrdquo of the Fuel Plugins wiki if you need further instructions about how to build theFuel Plugin Builder

4 Clone the plugin git repository

[roothome ~] git clone gitgithubcomstackforgefuel-plugin-influxdb-grafanagit

5 Check that the plugin is valid

[roothome ~] fpb --check fuel-plugin-influxdb-grafana

6 And finally build the plugin

[roothome ~] fpb --build fuel-plugin-influxdb-grafana

7 Now that you have created the RPM file you can install the plugin using the fuel plugins ndashinstall command

[rootfuel ~] fuel plugins --install fuel-plugin-influxdb-grafananoarchrpm

133 InfluxDB-Grafana Fuel Plugin software components

List of software components installed by the plugin

13 Installation Guide 4

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

Components VersionInfluxDB v094 for Ubuntu (64-bit)Grafana v250 for Ubuntu (64-bit)

14 User Guide

141 Plugin configuration

To configure your plugin you need to follow these steps

1 Create a new environment with the Fuel web user interface

2 Click on the Settings tab of the Fuel web UI

3 Scroll down the page and select the InfluxDB-Grafana Plugin in the left column The InfluxDB-Grafana Pluginsettings screen should appear as shown below

14 User Guide 5

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

1 Select the InfluxDB-Grafana Plugin checkbox and fill-in the required fields

(a) Specify the number of days retention period for data

(b) Specify the InfluxDB admin password (called root password in the InfluxDB documentation)

(c) Specify the database name (default is lma)

(d) Specify the InfluxDB user name and password

(e) Specify the Grafana user name and password

2 Assign the InfluxDB Grafana role to the node where you would like to install the InfluxDB and Grafana serversas shown below

Note Because of a bug with Fuel 70 (see bug 1496328) the UI wonrsquot let you assign the InfluxDBGrafana role if at least one node is already assigned with one of the built-in roles

To workaround this problem you should either remove the already assigned built-in roles or use the FuelCLI

14 User Guide 6

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

$ fuel --env ltenvironment idgt node set --node-id ltnode_idgt --role=influxdb_grafana

1 Adjust the disk configuration if necessary (see the Fuel User Guide for details) By default the InfluxDB-Grafana Plugin allocates

bull 20 of the first available disk for the operating system by honoring a range of 15GB minimum to 50GBmaximum

bull 10GB for varlog

bull At least 30 GB for the InfluxDB database in optinfluxdb

2 Configure your environment as needed

3 Verify the networks on the Networks tab of the Fuel web UI

4 Deploy your changes

142 Plugin verification

Be aware that depending on the number of nodes and deployment setup deploying a Mirantis OpenStack environmentcan typically take anything from 30 minutes to several hours But once your deployment is complete you should seea notification that looks like the following

Verifying InfluxDB

Once your deployment has completed you should verify that InfluxDB is running properly On the Fuel Master nodeyou can retrieve the IP address of the node where InfluxDB is installed via the fuel command line

14 User Guide 7

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

[rootfuel ~] fuel nodesid | status | name | cluster | ip | | roles | ---|----------|------|---------|-----------|-----|-------------------|----37 | ready | lma | 38 | 102004 | | influxdb_grafana |

[Skip ]

On that node (node-37 in this example) the influx command should be available via the CLI Executing influx willstart an interactive CLI and automatically connect to the local InfluxDB server

[rootnode-37 ~] optinfluxdbinflux -database lma -password lmapass --username lmaConnected to httplocalhost8086 version 0942InfluxDB shell 0942gt

Then if you type

gt show series

You should see a dump of all the time-series collected so far

[ Skip]

name swap_used---------------_key deployment_id hostnameswap_useddeployment_id=38hostname=node-40 38 node-40swap_useddeployment_id=38hostname=node-42 38 node-42swap_useddeployment_id=38hostname=node-41 38 node-41swap_useddeployment_id=38hostname=node-43 38 node-43swap_useddeployment_id=38hostname=node-38 38 node-38swap_useddeployment_id=38hostname=node-37 38 node-37swap_useddeployment_id=38hostname=node-36 38 node-36

name total_threads_created---------------------------_key deployment_id hostnametotal_threads_createddeployment_id=38hostname=node-38 38 node-38total_threads_createddeployment_id=38hostname=node-37 38 node-37total_threads_createddeployment_id=38hostname=node-36 38 node-36

Verifying Grafana

The Grafana user interface runs on port 8000 Pointing your browser to the URL httpltHOSTgt8000 you should seethe Grafana login page

14 User Guide 8

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

You should be redirected to the Grafana Home Page The first time you access Grafana you are requested to authen-ticate using the credentials you have defined in the settings Once you have authenticated successfully you should beautomatically redirected to the Home Page from where you can select a dashboard as shown below

14 User Guide 9

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

143 Exploring your time-series with Grafana

The InfluxDB-Grafana Plugin comes with a collection of predefined dashboards you can use to visualize the time-series that are stored in InfluxDB There is one primary dashboard called the Main Dashboard and several otherdashboards that are organized per service name

The Main Dashboard

We suggest you start with the Main Dashboard as shown below The Main Dashboard provides a single pane of glassto visualize the health status of all the OpenStack services being monitored such as Nova or Cinder but also HAProxyMySQL and RabbitMQ to name a few

14 User Guide 10

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

As you can see the Main Dashboard (as most dashboards) provides a drop down menu list in the upper left corner ofthe window from where you can select a metric tag (aka dimension) such as a controller name or device name youwant to visualize In the example above we say we want to visualize the system time-series for node-48

Within the OpenStack Services row each of the services represented can be assigned five different states

Note The precise determination of a service state depends on the Global Status Evaluation (GSE) policies definedfor the GSE Plugins

The meaning associated with a service health state is the following

bull Down One or several primary functions of a service cluster are failed For example all API endpoints of aservice cluster like Nova or Cinder are failed

bull Critical One or several primary functions of a service cluster are severely degraded The quality of servicedelivered to the end-user should be severely impacted

bull Warning One or several primary functions of a service cluster are slightly degraded The quality of servicedelivered to the end-user should be slightly impacted

bull Unknown There is not enough data to infer the actual health state of a service cluster

bull Okay None of the above was found to be true

The Virtual Compute Resources row provides an overview of the amount of virtual resources being used by the com-pute nodes including the number of virtual CPUs the amount of memory and disk space being used as well as theamount of virtual resources remaining available to create new instances

14 User Guide 11

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

The ldquoSystemrdquo row provides an overview of the amount of physical resources being used on the control plane (thecontroller cluster) You can select a specific controller using the controllerrsquos drop down list in the left corner of thetoolbar

The ldquoCephrdquo row provides an overview of the resources usage and current health state of the Ceph cluster when it isdeployed in the OpenStack environment

The Main Dashboard is also an entry point to access detailed dashboards for each of the OpenStack services beingmonitored For example if you click through the Nova box you should see a screen like this

14 User Guide 12

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

14 User Guide 13

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

The Nova Dashboard

The Nova Dashboard provides a detailed view of the Nova servicersquos related metrics

The Service Status row provides information about the Nova service cluster health state as a whole including the stateof the API frontend (the HAProxy plubic VIP) a counter of HTTP 5xx errors the HTTP requests response time andstatus code

The Nova API row provides information about the health state of the API backends (nova-api ec2-api ) the state ofthe workers and compute nodes

The Instance row provides information about the number of active instances instances in error and instances creationtime statistics

The ldquoResourcesrdquo row provides various virtual resources usage indicators

The LMA Self-Monitoring Dashboard

The LMA Self-Monitoring Dashboard is a new dashboard in LMA 08 This dashboard provides an overview of howthe LMA Toolchain performs overall

The LMA Collector row provides information about the Heka process In particular it is possible to visualize theprocessing time allocated to the Lua plugins and the amount of messages that have been processed as well as theamount of system resources consumed by the Heka process

Again it is possible to select a particular node using the dropdown menu list

The Collectd row provides system resource usage information allocated to the collectd process

The InfluxDB row provides system resource usage information allocated to the InfluxDB application

The Grafana row provides system resource usage information allocated to the Grafana application

The Elasticsearch row provides system resource usage information allocated to the JVM process running the Elastic-search application

Other Dashboards

In total there are 16 different dashboards you can use to explore different time-series facettes of your OpenStackenvironment

Viewing Faults and Anomalies

The LMA-Toolchain is capable of detecting a number of service-affecting conditions such as the faults and anomaliesthat occured in your OpenStack environment Those conditions are reported in annotations that are displayed inGrafana The Grafana annotations contain a textual representation of the alarm (or set of alarms) that were triggeredby the Collectors for a service In other words the annotations contain valuable insights that you could use to diagnoseand troubleshoot problems Futhermore with the Grafana annotations the system makes a distiction between whatis estimated as a direct root cause versus what is estimated as an indirect root cause This is internally represented ina dependency graph There are first degree dependencies that are used to describe situations whereby the healthstate of an entity strictly depends on the health state of another entity For example Nova as a service has firstdegree dependencies with the nova-api endpoints and the nova-scheduler workers But there are also second degreedependencies whereby the health state of an entity doesnrsquot strictly depends on the heath state of another entity althoughit might be depending on the operation being performed For example by default we declared that Nova has a seconddegree dependency with Neutron As a result the health state of Nova will not be directly impacted by the healthstate of Neutron but the annotation will provide a root cause analysis hint For example letrsquos assume a situation whereNova has changed a state from okay to critical (because of 5xx HTTP errors) and that Neutron has been in down state

14 User Guide 14

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

for a while In this case the Nova dashboard will display an annotation that says Nova has changed a state to warningbecause the system has detected 5xx errors and that it may be due to the fact that Neutron is down An example ofwhat an annotation looks like is shown below

14 User Guide 15

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

14 User Guide 16

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

144 Troubleshooting

If you get no data in Grafana follow these troubleshooting tips

1 First check that the LMA Collector is running properly by following the LMA Collector troubleshooting in-structions in the LMA Collector Fuel Plugin User Guide

2 Check that the nodes are able to connect to the InfluxDB server on port 8086

3 Check that InfluxDB is running

[rootnode-37 ~] etcinitdinfluxdb statusinfluxdb Process is running [ OK ]

4 If InfluxDB is down restart it

[rootnode-37 ~] etcinitdinfluxdb startStarting the process influxdb [ OK ]influxdb process was started [ OK ]

5 Check that Grafana is running

[rootnode-37 ~] etcinitdgrafana-server status

grafana is running

6 If Grafana is down restart it

[rootnode-37 ~] etcinitdgrafana-server start

Starting Grafana Server

7 If none of the above solves the problem check the logs in varloginfluxdbinfluxdblog andvarloggrafanagrafanalog to find out what might have gone wrong

15 Licenses

151 Third Party Components

Name Project Web Site LicenseInfluxDB httpsinfluxdbcom MITGrafana httpgrafanaorg Apache v2

152 Puppet modules

Name Project Web Site LicenseApt httpsgithubcompuppetlabspuppetlabs-apt Apache V2Concat httpsgithubcompuppetlabspuppetlabs-concat Apache V2Stdlib httpsgithubcompuppetlabspuppetlabs-stdlib Apache V2IniFile httpsgithubcompuppetlabspuppetlabs-inifile Apache v2Grafana httpsgithubcombfraserpuppet-grafana Apache v2

15 Licenses 17

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

16 Appendix

bull The InfluxDB-Grafana plugin project at GitHub

bull The official InfluxDB documentation

bull The official Grafana documentation

16 Appendix 18

CHAPTER 2

Indices and Tables

bull search

19

  • User documentation
    • Overview
    • Release Notes
    • Installation Guide
    • User Guide
    • Licenses
    • Appendix
      • Indices and Tables
Page 3: The InfluxDB-Grafana plugin for Fuel Documentation · 2019-04-02 · The InfluxDB-Grafana plugin for Fuel Documentation, Release 0.8.0 1.4.3Exploring your time-series with Grafana

CHAPTER 1

User documentation

11 Overview

The InfluxDB-Grafana Fuel Plugin is used to install and configure InfluxDB and Grafana which collectively provideaccess to the OpenStack metrics analytics InfluxDB is a powerful distributed time-series database to store and searchmetrics time-series The metrics analytics are used to visualize the time-series and the annotations produced by theLMA Collector The annotations contain insightful information about the detected fault or anomaly that triggered achange of state for a node cluster or service cluster as well as textual hints about what might be the root cause of thefault or anomaly

The InfluxDB-Grafana Plugin is an indispensable tool to answering the questions ldquowhat has changed in my Open-Stack environmentwhen and whyrdquo Grafana is installed with a collection of predefined dashboards for each of theOpenStack services that are monitored Among those dashboards the Main Dashboard provides a single pane of glassoverview of your OpenStack environment status

InfluxDB and Grafana are key components of the LMA Toolchain project as shown in the figure below

1

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

111 Requirements

Require-ment

VersionComment

Disk space At least 55GBFuel Mirantis OpenStack 70Hardwareconfiguration

The hardware configuration (RAM CPU disk(s)) required by this plugin depends on the size ofyour cloud environment and other factors like the retention policy but a typical setup would atleast require a quad-core server with 8GB of RAM and access to a fast disk or disks array(ideally SSDs)It is also highly recommended to use a dedicated disk for your data storage Otherwise TheInfluxDB-Grafana Plugin will use the root filesystem by default

112 Limitations

A current limitation of this plugin is that it not possible to display in the Fuel web UI the URL where the Grafanainterface can be reached when the deployment has completed Instructions are provided in the User Guide about howyou can obtain this URL using the fuel command line

11 Overview 2

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

113 Key terms acronyms and abbreviations

Terms ampacronyms

Definition

LMACollector

Logging Monitoring and Alerting (LMA) Collector A service running on each node whichcollects all the logs and the OpenStak notifications

InfluxDB InfluxDB is a time-series metrics and analytics open-source database (MIT license) Itrsquos writtenin Go and has no external dependenciesInfluxDB is targeted at use cases for DevOps metrics sensor data and real-time analytics

Grafana Grafana is an (Apache 20 Licensed) general purpose dashboard and graph composer Itrsquos focusedon providing rich ways to visualize metrics time-series mainly though graphs but supports otherways to visualize data through a pluggable panel architectureIt currently has rich support for Graphite InfluxDB and OpenTSDB But supports other datasources via plugins Grafana is most commonly used for infrastructure monitoring applicationmonitoring and metric analytics

12 Release Notes

121 Version 080

bull Add support for the ldquoinfluxdb_grafanardquo Fuel Plugin role instead of the ldquobase-osrdquo role which had several limita-tions

bull Add support for retention policy configuration

bull Upgrade to InfluxDB 094 which brings metrics time-series with tagging

bull Upgrade to Grafana 250

bull Several dashboard visualisation improvements

bull A new self-monitoring dashboard

122 Version 070

bull Initial release of the plugin This is a beta version

13 Installation Guide

131 InfluxDB-Grafana Fuel Plugin install using the RPM file of the Fuel PluginsCatalog

To install the InfluxDB-Grafana Fuel Plugin using the RPM file of the Fuel Plugins Catalog you need to follow thesesteps

1 Download the RPM file from the Fuel Plugins Catalog

2 Copy the RPM file to the Fuel Master node

[roothome ~] scp influxdb_grafana-08-080-1noarchrpm rootltFuel Master node IP addressgt

3 Install the plugin using the Fuel CLI

12 Release Notes 3

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

[rootfuel ~] fuel plugins --install influxdb_grafana-08-080-1noarchrpm

4 Verify that the plugin is installed correctly

[rootfuel ~] fuel plugins --listid | name | version | package_version---|----------------------|---------|----------------1 | influxdb_grafana | 080 | 300

132 InfluxDB-Grafana Fuel Plugin install from source

Alternatively you may want to build the RPM file of the plugin from source if for example you want to test thelatest features modify some built-in configuration or implement your own customization But note that running a Fuelplugin that you have built yourself is at your own risk

To install the InfluxDB-Grafana Plugin from source you first need to prepare an environement to build the RPM fileThe recommended approach is to build the RPM file directly onto the Fuel Master node so that you wonrsquot have to copythat file later on

Prepare an environment for building the plugin on the Fuel Master Node

1 Install the standard Linux development tools

[roothome ~] yum install createrepo rpm rpm-build dpkg-devel

2 Install the Fuel Plugin Builder To do that you should first get pip

[roothome ~] easy_install pip

3 Then install the Fuel Plugin Builder (the fpb command line) with pip

[roothome ~] pip install fuel-plugin-builder

Note You may also need to build the Fuel Plugin Builder if the package version of the plugin is higher than packageversion supported by the Fuel Plugin Builder you get from pypi In this case please refer to the section ldquoPreparing anenvironment for plugin developmentrdquo of the Fuel Plugins wiki if you need further instructions about how to build theFuel Plugin Builder

4 Clone the plugin git repository

[roothome ~] git clone gitgithubcomstackforgefuel-plugin-influxdb-grafanagit

5 Check that the plugin is valid

[roothome ~] fpb --check fuel-plugin-influxdb-grafana

6 And finally build the plugin

[roothome ~] fpb --build fuel-plugin-influxdb-grafana

7 Now that you have created the RPM file you can install the plugin using the fuel plugins ndashinstall command

[rootfuel ~] fuel plugins --install fuel-plugin-influxdb-grafananoarchrpm

133 InfluxDB-Grafana Fuel Plugin software components

List of software components installed by the plugin

13 Installation Guide 4

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

Components VersionInfluxDB v094 for Ubuntu (64-bit)Grafana v250 for Ubuntu (64-bit)

14 User Guide

141 Plugin configuration

To configure your plugin you need to follow these steps

1 Create a new environment with the Fuel web user interface

2 Click on the Settings tab of the Fuel web UI

3 Scroll down the page and select the InfluxDB-Grafana Plugin in the left column The InfluxDB-Grafana Pluginsettings screen should appear as shown below

14 User Guide 5

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

1 Select the InfluxDB-Grafana Plugin checkbox and fill-in the required fields

(a) Specify the number of days retention period for data

(b) Specify the InfluxDB admin password (called root password in the InfluxDB documentation)

(c) Specify the database name (default is lma)

(d) Specify the InfluxDB user name and password

(e) Specify the Grafana user name and password

2 Assign the InfluxDB Grafana role to the node where you would like to install the InfluxDB and Grafana serversas shown below

Note Because of a bug with Fuel 70 (see bug 1496328) the UI wonrsquot let you assign the InfluxDBGrafana role if at least one node is already assigned with one of the built-in roles

To workaround this problem you should either remove the already assigned built-in roles or use the FuelCLI

14 User Guide 6

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

$ fuel --env ltenvironment idgt node set --node-id ltnode_idgt --role=influxdb_grafana

1 Adjust the disk configuration if necessary (see the Fuel User Guide for details) By default the InfluxDB-Grafana Plugin allocates

bull 20 of the first available disk for the operating system by honoring a range of 15GB minimum to 50GBmaximum

bull 10GB for varlog

bull At least 30 GB for the InfluxDB database in optinfluxdb

2 Configure your environment as needed

3 Verify the networks on the Networks tab of the Fuel web UI

4 Deploy your changes

142 Plugin verification

Be aware that depending on the number of nodes and deployment setup deploying a Mirantis OpenStack environmentcan typically take anything from 30 minutes to several hours But once your deployment is complete you should seea notification that looks like the following

Verifying InfluxDB

Once your deployment has completed you should verify that InfluxDB is running properly On the Fuel Master nodeyou can retrieve the IP address of the node where InfluxDB is installed via the fuel command line

14 User Guide 7

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

[rootfuel ~] fuel nodesid | status | name | cluster | ip | | roles | ---|----------|------|---------|-----------|-----|-------------------|----37 | ready | lma | 38 | 102004 | | influxdb_grafana |

[Skip ]

On that node (node-37 in this example) the influx command should be available via the CLI Executing influx willstart an interactive CLI and automatically connect to the local InfluxDB server

[rootnode-37 ~] optinfluxdbinflux -database lma -password lmapass --username lmaConnected to httplocalhost8086 version 0942InfluxDB shell 0942gt

Then if you type

gt show series

You should see a dump of all the time-series collected so far

[ Skip]

name swap_used---------------_key deployment_id hostnameswap_useddeployment_id=38hostname=node-40 38 node-40swap_useddeployment_id=38hostname=node-42 38 node-42swap_useddeployment_id=38hostname=node-41 38 node-41swap_useddeployment_id=38hostname=node-43 38 node-43swap_useddeployment_id=38hostname=node-38 38 node-38swap_useddeployment_id=38hostname=node-37 38 node-37swap_useddeployment_id=38hostname=node-36 38 node-36

name total_threads_created---------------------------_key deployment_id hostnametotal_threads_createddeployment_id=38hostname=node-38 38 node-38total_threads_createddeployment_id=38hostname=node-37 38 node-37total_threads_createddeployment_id=38hostname=node-36 38 node-36

Verifying Grafana

The Grafana user interface runs on port 8000 Pointing your browser to the URL httpltHOSTgt8000 you should seethe Grafana login page

14 User Guide 8

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

You should be redirected to the Grafana Home Page The first time you access Grafana you are requested to authen-ticate using the credentials you have defined in the settings Once you have authenticated successfully you should beautomatically redirected to the Home Page from where you can select a dashboard as shown below

14 User Guide 9

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

143 Exploring your time-series with Grafana

The InfluxDB-Grafana Plugin comes with a collection of predefined dashboards you can use to visualize the time-series that are stored in InfluxDB There is one primary dashboard called the Main Dashboard and several otherdashboards that are organized per service name

The Main Dashboard

We suggest you start with the Main Dashboard as shown below The Main Dashboard provides a single pane of glassto visualize the health status of all the OpenStack services being monitored such as Nova or Cinder but also HAProxyMySQL and RabbitMQ to name a few

14 User Guide 10

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

As you can see the Main Dashboard (as most dashboards) provides a drop down menu list in the upper left corner ofthe window from where you can select a metric tag (aka dimension) such as a controller name or device name youwant to visualize In the example above we say we want to visualize the system time-series for node-48

Within the OpenStack Services row each of the services represented can be assigned five different states

Note The precise determination of a service state depends on the Global Status Evaluation (GSE) policies definedfor the GSE Plugins

The meaning associated with a service health state is the following

bull Down One or several primary functions of a service cluster are failed For example all API endpoints of aservice cluster like Nova or Cinder are failed

bull Critical One or several primary functions of a service cluster are severely degraded The quality of servicedelivered to the end-user should be severely impacted

bull Warning One or several primary functions of a service cluster are slightly degraded The quality of servicedelivered to the end-user should be slightly impacted

bull Unknown There is not enough data to infer the actual health state of a service cluster

bull Okay None of the above was found to be true

The Virtual Compute Resources row provides an overview of the amount of virtual resources being used by the com-pute nodes including the number of virtual CPUs the amount of memory and disk space being used as well as theamount of virtual resources remaining available to create new instances

14 User Guide 11

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

The ldquoSystemrdquo row provides an overview of the amount of physical resources being used on the control plane (thecontroller cluster) You can select a specific controller using the controllerrsquos drop down list in the left corner of thetoolbar

The ldquoCephrdquo row provides an overview of the resources usage and current health state of the Ceph cluster when it isdeployed in the OpenStack environment

The Main Dashboard is also an entry point to access detailed dashboards for each of the OpenStack services beingmonitored For example if you click through the Nova box you should see a screen like this

14 User Guide 12

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

14 User Guide 13

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

The Nova Dashboard

The Nova Dashboard provides a detailed view of the Nova servicersquos related metrics

The Service Status row provides information about the Nova service cluster health state as a whole including the stateof the API frontend (the HAProxy plubic VIP) a counter of HTTP 5xx errors the HTTP requests response time andstatus code

The Nova API row provides information about the health state of the API backends (nova-api ec2-api ) the state ofthe workers and compute nodes

The Instance row provides information about the number of active instances instances in error and instances creationtime statistics

The ldquoResourcesrdquo row provides various virtual resources usage indicators

The LMA Self-Monitoring Dashboard

The LMA Self-Monitoring Dashboard is a new dashboard in LMA 08 This dashboard provides an overview of howthe LMA Toolchain performs overall

The LMA Collector row provides information about the Heka process In particular it is possible to visualize theprocessing time allocated to the Lua plugins and the amount of messages that have been processed as well as theamount of system resources consumed by the Heka process

Again it is possible to select a particular node using the dropdown menu list

The Collectd row provides system resource usage information allocated to the collectd process

The InfluxDB row provides system resource usage information allocated to the InfluxDB application

The Grafana row provides system resource usage information allocated to the Grafana application

The Elasticsearch row provides system resource usage information allocated to the JVM process running the Elastic-search application

Other Dashboards

In total there are 16 different dashboards you can use to explore different time-series facettes of your OpenStackenvironment

Viewing Faults and Anomalies

The LMA-Toolchain is capable of detecting a number of service-affecting conditions such as the faults and anomaliesthat occured in your OpenStack environment Those conditions are reported in annotations that are displayed inGrafana The Grafana annotations contain a textual representation of the alarm (or set of alarms) that were triggeredby the Collectors for a service In other words the annotations contain valuable insights that you could use to diagnoseand troubleshoot problems Futhermore with the Grafana annotations the system makes a distiction between whatis estimated as a direct root cause versus what is estimated as an indirect root cause This is internally represented ina dependency graph There are first degree dependencies that are used to describe situations whereby the healthstate of an entity strictly depends on the health state of another entity For example Nova as a service has firstdegree dependencies with the nova-api endpoints and the nova-scheduler workers But there are also second degreedependencies whereby the health state of an entity doesnrsquot strictly depends on the heath state of another entity althoughit might be depending on the operation being performed For example by default we declared that Nova has a seconddegree dependency with Neutron As a result the health state of Nova will not be directly impacted by the healthstate of Neutron but the annotation will provide a root cause analysis hint For example letrsquos assume a situation whereNova has changed a state from okay to critical (because of 5xx HTTP errors) and that Neutron has been in down state

14 User Guide 14

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

for a while In this case the Nova dashboard will display an annotation that says Nova has changed a state to warningbecause the system has detected 5xx errors and that it may be due to the fact that Neutron is down An example ofwhat an annotation looks like is shown below

14 User Guide 15

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

14 User Guide 16

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

144 Troubleshooting

If you get no data in Grafana follow these troubleshooting tips

1 First check that the LMA Collector is running properly by following the LMA Collector troubleshooting in-structions in the LMA Collector Fuel Plugin User Guide

2 Check that the nodes are able to connect to the InfluxDB server on port 8086

3 Check that InfluxDB is running

[rootnode-37 ~] etcinitdinfluxdb statusinfluxdb Process is running [ OK ]

4 If InfluxDB is down restart it

[rootnode-37 ~] etcinitdinfluxdb startStarting the process influxdb [ OK ]influxdb process was started [ OK ]

5 Check that Grafana is running

[rootnode-37 ~] etcinitdgrafana-server status

grafana is running

6 If Grafana is down restart it

[rootnode-37 ~] etcinitdgrafana-server start

Starting Grafana Server

7 If none of the above solves the problem check the logs in varloginfluxdbinfluxdblog andvarloggrafanagrafanalog to find out what might have gone wrong

15 Licenses

151 Third Party Components

Name Project Web Site LicenseInfluxDB httpsinfluxdbcom MITGrafana httpgrafanaorg Apache v2

152 Puppet modules

Name Project Web Site LicenseApt httpsgithubcompuppetlabspuppetlabs-apt Apache V2Concat httpsgithubcompuppetlabspuppetlabs-concat Apache V2Stdlib httpsgithubcompuppetlabspuppetlabs-stdlib Apache V2IniFile httpsgithubcompuppetlabspuppetlabs-inifile Apache v2Grafana httpsgithubcombfraserpuppet-grafana Apache v2

15 Licenses 17

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

16 Appendix

bull The InfluxDB-Grafana plugin project at GitHub

bull The official InfluxDB documentation

bull The official Grafana documentation

16 Appendix 18

CHAPTER 2

Indices and Tables

bull search

19

  • User documentation
    • Overview
    • Release Notes
    • Installation Guide
    • User Guide
    • Licenses
    • Appendix
      • Indices and Tables
Page 4: The InfluxDB-Grafana plugin for Fuel Documentation · 2019-04-02 · The InfluxDB-Grafana plugin for Fuel Documentation, Release 0.8.0 1.4.3Exploring your time-series with Grafana

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

111 Requirements

Require-ment

VersionComment

Disk space At least 55GBFuel Mirantis OpenStack 70Hardwareconfiguration

The hardware configuration (RAM CPU disk(s)) required by this plugin depends on the size ofyour cloud environment and other factors like the retention policy but a typical setup would atleast require a quad-core server with 8GB of RAM and access to a fast disk or disks array(ideally SSDs)It is also highly recommended to use a dedicated disk for your data storage Otherwise TheInfluxDB-Grafana Plugin will use the root filesystem by default

112 Limitations

A current limitation of this plugin is that it not possible to display in the Fuel web UI the URL where the Grafanainterface can be reached when the deployment has completed Instructions are provided in the User Guide about howyou can obtain this URL using the fuel command line

11 Overview 2

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

113 Key terms acronyms and abbreviations

Terms ampacronyms

Definition

LMACollector

Logging Monitoring and Alerting (LMA) Collector A service running on each node whichcollects all the logs and the OpenStak notifications

InfluxDB InfluxDB is a time-series metrics and analytics open-source database (MIT license) Itrsquos writtenin Go and has no external dependenciesInfluxDB is targeted at use cases for DevOps metrics sensor data and real-time analytics

Grafana Grafana is an (Apache 20 Licensed) general purpose dashboard and graph composer Itrsquos focusedon providing rich ways to visualize metrics time-series mainly though graphs but supports otherways to visualize data through a pluggable panel architectureIt currently has rich support for Graphite InfluxDB and OpenTSDB But supports other datasources via plugins Grafana is most commonly used for infrastructure monitoring applicationmonitoring and metric analytics

12 Release Notes

121 Version 080

bull Add support for the ldquoinfluxdb_grafanardquo Fuel Plugin role instead of the ldquobase-osrdquo role which had several limita-tions

bull Add support for retention policy configuration

bull Upgrade to InfluxDB 094 which brings metrics time-series with tagging

bull Upgrade to Grafana 250

bull Several dashboard visualisation improvements

bull A new self-monitoring dashboard

122 Version 070

bull Initial release of the plugin This is a beta version

13 Installation Guide

131 InfluxDB-Grafana Fuel Plugin install using the RPM file of the Fuel PluginsCatalog

To install the InfluxDB-Grafana Fuel Plugin using the RPM file of the Fuel Plugins Catalog you need to follow thesesteps

1 Download the RPM file from the Fuel Plugins Catalog

2 Copy the RPM file to the Fuel Master node

[roothome ~] scp influxdb_grafana-08-080-1noarchrpm rootltFuel Master node IP addressgt

3 Install the plugin using the Fuel CLI

12 Release Notes 3

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

[rootfuel ~] fuel plugins --install influxdb_grafana-08-080-1noarchrpm

4 Verify that the plugin is installed correctly

[rootfuel ~] fuel plugins --listid | name | version | package_version---|----------------------|---------|----------------1 | influxdb_grafana | 080 | 300

132 InfluxDB-Grafana Fuel Plugin install from source

Alternatively you may want to build the RPM file of the plugin from source if for example you want to test thelatest features modify some built-in configuration or implement your own customization But note that running a Fuelplugin that you have built yourself is at your own risk

To install the InfluxDB-Grafana Plugin from source you first need to prepare an environement to build the RPM fileThe recommended approach is to build the RPM file directly onto the Fuel Master node so that you wonrsquot have to copythat file later on

Prepare an environment for building the plugin on the Fuel Master Node

1 Install the standard Linux development tools

[roothome ~] yum install createrepo rpm rpm-build dpkg-devel

2 Install the Fuel Plugin Builder To do that you should first get pip

[roothome ~] easy_install pip

3 Then install the Fuel Plugin Builder (the fpb command line) with pip

[roothome ~] pip install fuel-plugin-builder

Note You may also need to build the Fuel Plugin Builder if the package version of the plugin is higher than packageversion supported by the Fuel Plugin Builder you get from pypi In this case please refer to the section ldquoPreparing anenvironment for plugin developmentrdquo of the Fuel Plugins wiki if you need further instructions about how to build theFuel Plugin Builder

4 Clone the plugin git repository

[roothome ~] git clone gitgithubcomstackforgefuel-plugin-influxdb-grafanagit

5 Check that the plugin is valid

[roothome ~] fpb --check fuel-plugin-influxdb-grafana

6 And finally build the plugin

[roothome ~] fpb --build fuel-plugin-influxdb-grafana

7 Now that you have created the RPM file you can install the plugin using the fuel plugins ndashinstall command

[rootfuel ~] fuel plugins --install fuel-plugin-influxdb-grafananoarchrpm

133 InfluxDB-Grafana Fuel Plugin software components

List of software components installed by the plugin

13 Installation Guide 4

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

Components VersionInfluxDB v094 for Ubuntu (64-bit)Grafana v250 for Ubuntu (64-bit)

14 User Guide

141 Plugin configuration

To configure your plugin you need to follow these steps

1 Create a new environment with the Fuel web user interface

2 Click on the Settings tab of the Fuel web UI

3 Scroll down the page and select the InfluxDB-Grafana Plugin in the left column The InfluxDB-Grafana Pluginsettings screen should appear as shown below

14 User Guide 5

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

1 Select the InfluxDB-Grafana Plugin checkbox and fill-in the required fields

(a) Specify the number of days retention period for data

(b) Specify the InfluxDB admin password (called root password in the InfluxDB documentation)

(c) Specify the database name (default is lma)

(d) Specify the InfluxDB user name and password

(e) Specify the Grafana user name and password

2 Assign the InfluxDB Grafana role to the node where you would like to install the InfluxDB and Grafana serversas shown below

Note Because of a bug with Fuel 70 (see bug 1496328) the UI wonrsquot let you assign the InfluxDBGrafana role if at least one node is already assigned with one of the built-in roles

To workaround this problem you should either remove the already assigned built-in roles or use the FuelCLI

14 User Guide 6

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

$ fuel --env ltenvironment idgt node set --node-id ltnode_idgt --role=influxdb_grafana

1 Adjust the disk configuration if necessary (see the Fuel User Guide for details) By default the InfluxDB-Grafana Plugin allocates

bull 20 of the first available disk for the operating system by honoring a range of 15GB minimum to 50GBmaximum

bull 10GB for varlog

bull At least 30 GB for the InfluxDB database in optinfluxdb

2 Configure your environment as needed

3 Verify the networks on the Networks tab of the Fuel web UI

4 Deploy your changes

142 Plugin verification

Be aware that depending on the number of nodes and deployment setup deploying a Mirantis OpenStack environmentcan typically take anything from 30 minutes to several hours But once your deployment is complete you should seea notification that looks like the following

Verifying InfluxDB

Once your deployment has completed you should verify that InfluxDB is running properly On the Fuel Master nodeyou can retrieve the IP address of the node where InfluxDB is installed via the fuel command line

14 User Guide 7

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

[rootfuel ~] fuel nodesid | status | name | cluster | ip | | roles | ---|----------|------|---------|-----------|-----|-------------------|----37 | ready | lma | 38 | 102004 | | influxdb_grafana |

[Skip ]

On that node (node-37 in this example) the influx command should be available via the CLI Executing influx willstart an interactive CLI and automatically connect to the local InfluxDB server

[rootnode-37 ~] optinfluxdbinflux -database lma -password lmapass --username lmaConnected to httplocalhost8086 version 0942InfluxDB shell 0942gt

Then if you type

gt show series

You should see a dump of all the time-series collected so far

[ Skip]

name swap_used---------------_key deployment_id hostnameswap_useddeployment_id=38hostname=node-40 38 node-40swap_useddeployment_id=38hostname=node-42 38 node-42swap_useddeployment_id=38hostname=node-41 38 node-41swap_useddeployment_id=38hostname=node-43 38 node-43swap_useddeployment_id=38hostname=node-38 38 node-38swap_useddeployment_id=38hostname=node-37 38 node-37swap_useddeployment_id=38hostname=node-36 38 node-36

name total_threads_created---------------------------_key deployment_id hostnametotal_threads_createddeployment_id=38hostname=node-38 38 node-38total_threads_createddeployment_id=38hostname=node-37 38 node-37total_threads_createddeployment_id=38hostname=node-36 38 node-36

Verifying Grafana

The Grafana user interface runs on port 8000 Pointing your browser to the URL httpltHOSTgt8000 you should seethe Grafana login page

14 User Guide 8

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

You should be redirected to the Grafana Home Page The first time you access Grafana you are requested to authen-ticate using the credentials you have defined in the settings Once you have authenticated successfully you should beautomatically redirected to the Home Page from where you can select a dashboard as shown below

14 User Guide 9

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

143 Exploring your time-series with Grafana

The InfluxDB-Grafana Plugin comes with a collection of predefined dashboards you can use to visualize the time-series that are stored in InfluxDB There is one primary dashboard called the Main Dashboard and several otherdashboards that are organized per service name

The Main Dashboard

We suggest you start with the Main Dashboard as shown below The Main Dashboard provides a single pane of glassto visualize the health status of all the OpenStack services being monitored such as Nova or Cinder but also HAProxyMySQL and RabbitMQ to name a few

14 User Guide 10

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

As you can see the Main Dashboard (as most dashboards) provides a drop down menu list in the upper left corner ofthe window from where you can select a metric tag (aka dimension) such as a controller name or device name youwant to visualize In the example above we say we want to visualize the system time-series for node-48

Within the OpenStack Services row each of the services represented can be assigned five different states

Note The precise determination of a service state depends on the Global Status Evaluation (GSE) policies definedfor the GSE Plugins

The meaning associated with a service health state is the following

bull Down One or several primary functions of a service cluster are failed For example all API endpoints of aservice cluster like Nova or Cinder are failed

bull Critical One or several primary functions of a service cluster are severely degraded The quality of servicedelivered to the end-user should be severely impacted

bull Warning One or several primary functions of a service cluster are slightly degraded The quality of servicedelivered to the end-user should be slightly impacted

bull Unknown There is not enough data to infer the actual health state of a service cluster

bull Okay None of the above was found to be true

The Virtual Compute Resources row provides an overview of the amount of virtual resources being used by the com-pute nodes including the number of virtual CPUs the amount of memory and disk space being used as well as theamount of virtual resources remaining available to create new instances

14 User Guide 11

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

The ldquoSystemrdquo row provides an overview of the amount of physical resources being used on the control plane (thecontroller cluster) You can select a specific controller using the controllerrsquos drop down list in the left corner of thetoolbar

The ldquoCephrdquo row provides an overview of the resources usage and current health state of the Ceph cluster when it isdeployed in the OpenStack environment

The Main Dashboard is also an entry point to access detailed dashboards for each of the OpenStack services beingmonitored For example if you click through the Nova box you should see a screen like this

14 User Guide 12

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

14 User Guide 13

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

The Nova Dashboard

The Nova Dashboard provides a detailed view of the Nova servicersquos related metrics

The Service Status row provides information about the Nova service cluster health state as a whole including the stateof the API frontend (the HAProxy plubic VIP) a counter of HTTP 5xx errors the HTTP requests response time andstatus code

The Nova API row provides information about the health state of the API backends (nova-api ec2-api ) the state ofthe workers and compute nodes

The Instance row provides information about the number of active instances instances in error and instances creationtime statistics

The ldquoResourcesrdquo row provides various virtual resources usage indicators

The LMA Self-Monitoring Dashboard

The LMA Self-Monitoring Dashboard is a new dashboard in LMA 08 This dashboard provides an overview of howthe LMA Toolchain performs overall

The LMA Collector row provides information about the Heka process In particular it is possible to visualize theprocessing time allocated to the Lua plugins and the amount of messages that have been processed as well as theamount of system resources consumed by the Heka process

Again it is possible to select a particular node using the dropdown menu list

The Collectd row provides system resource usage information allocated to the collectd process

The InfluxDB row provides system resource usage information allocated to the InfluxDB application

The Grafana row provides system resource usage information allocated to the Grafana application

The Elasticsearch row provides system resource usage information allocated to the JVM process running the Elastic-search application

Other Dashboards

In total there are 16 different dashboards you can use to explore different time-series facettes of your OpenStackenvironment

Viewing Faults and Anomalies

The LMA-Toolchain is capable of detecting a number of service-affecting conditions such as the faults and anomaliesthat occured in your OpenStack environment Those conditions are reported in annotations that are displayed inGrafana The Grafana annotations contain a textual representation of the alarm (or set of alarms) that were triggeredby the Collectors for a service In other words the annotations contain valuable insights that you could use to diagnoseand troubleshoot problems Futhermore with the Grafana annotations the system makes a distiction between whatis estimated as a direct root cause versus what is estimated as an indirect root cause This is internally represented ina dependency graph There are first degree dependencies that are used to describe situations whereby the healthstate of an entity strictly depends on the health state of another entity For example Nova as a service has firstdegree dependencies with the nova-api endpoints and the nova-scheduler workers But there are also second degreedependencies whereby the health state of an entity doesnrsquot strictly depends on the heath state of another entity althoughit might be depending on the operation being performed For example by default we declared that Nova has a seconddegree dependency with Neutron As a result the health state of Nova will not be directly impacted by the healthstate of Neutron but the annotation will provide a root cause analysis hint For example letrsquos assume a situation whereNova has changed a state from okay to critical (because of 5xx HTTP errors) and that Neutron has been in down state

14 User Guide 14

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

for a while In this case the Nova dashboard will display an annotation that says Nova has changed a state to warningbecause the system has detected 5xx errors and that it may be due to the fact that Neutron is down An example ofwhat an annotation looks like is shown below

14 User Guide 15

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

14 User Guide 16

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

144 Troubleshooting

If you get no data in Grafana follow these troubleshooting tips

1 First check that the LMA Collector is running properly by following the LMA Collector troubleshooting in-structions in the LMA Collector Fuel Plugin User Guide

2 Check that the nodes are able to connect to the InfluxDB server on port 8086

3 Check that InfluxDB is running

[rootnode-37 ~] etcinitdinfluxdb statusinfluxdb Process is running [ OK ]

4 If InfluxDB is down restart it

[rootnode-37 ~] etcinitdinfluxdb startStarting the process influxdb [ OK ]influxdb process was started [ OK ]

5 Check that Grafana is running

[rootnode-37 ~] etcinitdgrafana-server status

grafana is running

6 If Grafana is down restart it

[rootnode-37 ~] etcinitdgrafana-server start

Starting Grafana Server

7 If none of the above solves the problem check the logs in varloginfluxdbinfluxdblog andvarloggrafanagrafanalog to find out what might have gone wrong

15 Licenses

151 Third Party Components

Name Project Web Site LicenseInfluxDB httpsinfluxdbcom MITGrafana httpgrafanaorg Apache v2

152 Puppet modules

Name Project Web Site LicenseApt httpsgithubcompuppetlabspuppetlabs-apt Apache V2Concat httpsgithubcompuppetlabspuppetlabs-concat Apache V2Stdlib httpsgithubcompuppetlabspuppetlabs-stdlib Apache V2IniFile httpsgithubcompuppetlabspuppetlabs-inifile Apache v2Grafana httpsgithubcombfraserpuppet-grafana Apache v2

15 Licenses 17

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

16 Appendix

bull The InfluxDB-Grafana plugin project at GitHub

bull The official InfluxDB documentation

bull The official Grafana documentation

16 Appendix 18

CHAPTER 2

Indices and Tables

bull search

19

  • User documentation
    • Overview
    • Release Notes
    • Installation Guide
    • User Guide
    • Licenses
    • Appendix
      • Indices and Tables
Page 5: The InfluxDB-Grafana plugin for Fuel Documentation · 2019-04-02 · The InfluxDB-Grafana plugin for Fuel Documentation, Release 0.8.0 1.4.3Exploring your time-series with Grafana

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

113 Key terms acronyms and abbreviations

Terms ampacronyms

Definition

LMACollector

Logging Monitoring and Alerting (LMA) Collector A service running on each node whichcollects all the logs and the OpenStak notifications

InfluxDB InfluxDB is a time-series metrics and analytics open-source database (MIT license) Itrsquos writtenin Go and has no external dependenciesInfluxDB is targeted at use cases for DevOps metrics sensor data and real-time analytics

Grafana Grafana is an (Apache 20 Licensed) general purpose dashboard and graph composer Itrsquos focusedon providing rich ways to visualize metrics time-series mainly though graphs but supports otherways to visualize data through a pluggable panel architectureIt currently has rich support for Graphite InfluxDB and OpenTSDB But supports other datasources via plugins Grafana is most commonly used for infrastructure monitoring applicationmonitoring and metric analytics

12 Release Notes

121 Version 080

bull Add support for the ldquoinfluxdb_grafanardquo Fuel Plugin role instead of the ldquobase-osrdquo role which had several limita-tions

bull Add support for retention policy configuration

bull Upgrade to InfluxDB 094 which brings metrics time-series with tagging

bull Upgrade to Grafana 250

bull Several dashboard visualisation improvements

bull A new self-monitoring dashboard

122 Version 070

bull Initial release of the plugin This is a beta version

13 Installation Guide

131 InfluxDB-Grafana Fuel Plugin install using the RPM file of the Fuel PluginsCatalog

To install the InfluxDB-Grafana Fuel Plugin using the RPM file of the Fuel Plugins Catalog you need to follow thesesteps

1 Download the RPM file from the Fuel Plugins Catalog

2 Copy the RPM file to the Fuel Master node

[roothome ~] scp influxdb_grafana-08-080-1noarchrpm rootltFuel Master node IP addressgt

3 Install the plugin using the Fuel CLI

12 Release Notes 3

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

[rootfuel ~] fuel plugins --install influxdb_grafana-08-080-1noarchrpm

4 Verify that the plugin is installed correctly

[rootfuel ~] fuel plugins --listid | name | version | package_version---|----------------------|---------|----------------1 | influxdb_grafana | 080 | 300

132 InfluxDB-Grafana Fuel Plugin install from source

Alternatively you may want to build the RPM file of the plugin from source if for example you want to test thelatest features modify some built-in configuration or implement your own customization But note that running a Fuelplugin that you have built yourself is at your own risk

To install the InfluxDB-Grafana Plugin from source you first need to prepare an environement to build the RPM fileThe recommended approach is to build the RPM file directly onto the Fuel Master node so that you wonrsquot have to copythat file later on

Prepare an environment for building the plugin on the Fuel Master Node

1 Install the standard Linux development tools

[roothome ~] yum install createrepo rpm rpm-build dpkg-devel

2 Install the Fuel Plugin Builder To do that you should first get pip

[roothome ~] easy_install pip

3 Then install the Fuel Plugin Builder (the fpb command line) with pip

[roothome ~] pip install fuel-plugin-builder

Note You may also need to build the Fuel Plugin Builder if the package version of the plugin is higher than packageversion supported by the Fuel Plugin Builder you get from pypi In this case please refer to the section ldquoPreparing anenvironment for plugin developmentrdquo of the Fuel Plugins wiki if you need further instructions about how to build theFuel Plugin Builder

4 Clone the plugin git repository

[roothome ~] git clone gitgithubcomstackforgefuel-plugin-influxdb-grafanagit

5 Check that the plugin is valid

[roothome ~] fpb --check fuel-plugin-influxdb-grafana

6 And finally build the plugin

[roothome ~] fpb --build fuel-plugin-influxdb-grafana

7 Now that you have created the RPM file you can install the plugin using the fuel plugins ndashinstall command

[rootfuel ~] fuel plugins --install fuel-plugin-influxdb-grafananoarchrpm

133 InfluxDB-Grafana Fuel Plugin software components

List of software components installed by the plugin

13 Installation Guide 4

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

Components VersionInfluxDB v094 for Ubuntu (64-bit)Grafana v250 for Ubuntu (64-bit)

14 User Guide

141 Plugin configuration

To configure your plugin you need to follow these steps

1 Create a new environment with the Fuel web user interface

2 Click on the Settings tab of the Fuel web UI

3 Scroll down the page and select the InfluxDB-Grafana Plugin in the left column The InfluxDB-Grafana Pluginsettings screen should appear as shown below

14 User Guide 5

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

1 Select the InfluxDB-Grafana Plugin checkbox and fill-in the required fields

(a) Specify the number of days retention period for data

(b) Specify the InfluxDB admin password (called root password in the InfluxDB documentation)

(c) Specify the database name (default is lma)

(d) Specify the InfluxDB user name and password

(e) Specify the Grafana user name and password

2 Assign the InfluxDB Grafana role to the node where you would like to install the InfluxDB and Grafana serversas shown below

Note Because of a bug with Fuel 70 (see bug 1496328) the UI wonrsquot let you assign the InfluxDBGrafana role if at least one node is already assigned with one of the built-in roles

To workaround this problem you should either remove the already assigned built-in roles or use the FuelCLI

14 User Guide 6

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

$ fuel --env ltenvironment idgt node set --node-id ltnode_idgt --role=influxdb_grafana

1 Adjust the disk configuration if necessary (see the Fuel User Guide for details) By default the InfluxDB-Grafana Plugin allocates

bull 20 of the first available disk for the operating system by honoring a range of 15GB minimum to 50GBmaximum

bull 10GB for varlog

bull At least 30 GB for the InfluxDB database in optinfluxdb

2 Configure your environment as needed

3 Verify the networks on the Networks tab of the Fuel web UI

4 Deploy your changes

142 Plugin verification

Be aware that depending on the number of nodes and deployment setup deploying a Mirantis OpenStack environmentcan typically take anything from 30 minutes to several hours But once your deployment is complete you should seea notification that looks like the following

Verifying InfluxDB

Once your deployment has completed you should verify that InfluxDB is running properly On the Fuel Master nodeyou can retrieve the IP address of the node where InfluxDB is installed via the fuel command line

14 User Guide 7

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

[rootfuel ~] fuel nodesid | status | name | cluster | ip | | roles | ---|----------|------|---------|-----------|-----|-------------------|----37 | ready | lma | 38 | 102004 | | influxdb_grafana |

[Skip ]

On that node (node-37 in this example) the influx command should be available via the CLI Executing influx willstart an interactive CLI and automatically connect to the local InfluxDB server

[rootnode-37 ~] optinfluxdbinflux -database lma -password lmapass --username lmaConnected to httplocalhost8086 version 0942InfluxDB shell 0942gt

Then if you type

gt show series

You should see a dump of all the time-series collected so far

[ Skip]

name swap_used---------------_key deployment_id hostnameswap_useddeployment_id=38hostname=node-40 38 node-40swap_useddeployment_id=38hostname=node-42 38 node-42swap_useddeployment_id=38hostname=node-41 38 node-41swap_useddeployment_id=38hostname=node-43 38 node-43swap_useddeployment_id=38hostname=node-38 38 node-38swap_useddeployment_id=38hostname=node-37 38 node-37swap_useddeployment_id=38hostname=node-36 38 node-36

name total_threads_created---------------------------_key deployment_id hostnametotal_threads_createddeployment_id=38hostname=node-38 38 node-38total_threads_createddeployment_id=38hostname=node-37 38 node-37total_threads_createddeployment_id=38hostname=node-36 38 node-36

Verifying Grafana

The Grafana user interface runs on port 8000 Pointing your browser to the URL httpltHOSTgt8000 you should seethe Grafana login page

14 User Guide 8

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

You should be redirected to the Grafana Home Page The first time you access Grafana you are requested to authen-ticate using the credentials you have defined in the settings Once you have authenticated successfully you should beautomatically redirected to the Home Page from where you can select a dashboard as shown below

14 User Guide 9

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

143 Exploring your time-series with Grafana

The InfluxDB-Grafana Plugin comes with a collection of predefined dashboards you can use to visualize the time-series that are stored in InfluxDB There is one primary dashboard called the Main Dashboard and several otherdashboards that are organized per service name

The Main Dashboard

We suggest you start with the Main Dashboard as shown below The Main Dashboard provides a single pane of glassto visualize the health status of all the OpenStack services being monitored such as Nova or Cinder but also HAProxyMySQL and RabbitMQ to name a few

14 User Guide 10

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

As you can see the Main Dashboard (as most dashboards) provides a drop down menu list in the upper left corner ofthe window from where you can select a metric tag (aka dimension) such as a controller name or device name youwant to visualize In the example above we say we want to visualize the system time-series for node-48

Within the OpenStack Services row each of the services represented can be assigned five different states

Note The precise determination of a service state depends on the Global Status Evaluation (GSE) policies definedfor the GSE Plugins

The meaning associated with a service health state is the following

bull Down One or several primary functions of a service cluster are failed For example all API endpoints of aservice cluster like Nova or Cinder are failed

bull Critical One or several primary functions of a service cluster are severely degraded The quality of servicedelivered to the end-user should be severely impacted

bull Warning One or several primary functions of a service cluster are slightly degraded The quality of servicedelivered to the end-user should be slightly impacted

bull Unknown There is not enough data to infer the actual health state of a service cluster

bull Okay None of the above was found to be true

The Virtual Compute Resources row provides an overview of the amount of virtual resources being used by the com-pute nodes including the number of virtual CPUs the amount of memory and disk space being used as well as theamount of virtual resources remaining available to create new instances

14 User Guide 11

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

The ldquoSystemrdquo row provides an overview of the amount of physical resources being used on the control plane (thecontroller cluster) You can select a specific controller using the controllerrsquos drop down list in the left corner of thetoolbar

The ldquoCephrdquo row provides an overview of the resources usage and current health state of the Ceph cluster when it isdeployed in the OpenStack environment

The Main Dashboard is also an entry point to access detailed dashboards for each of the OpenStack services beingmonitored For example if you click through the Nova box you should see a screen like this

14 User Guide 12

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

14 User Guide 13

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

The Nova Dashboard

The Nova Dashboard provides a detailed view of the Nova servicersquos related metrics

The Service Status row provides information about the Nova service cluster health state as a whole including the stateof the API frontend (the HAProxy plubic VIP) a counter of HTTP 5xx errors the HTTP requests response time andstatus code

The Nova API row provides information about the health state of the API backends (nova-api ec2-api ) the state ofthe workers and compute nodes

The Instance row provides information about the number of active instances instances in error and instances creationtime statistics

The ldquoResourcesrdquo row provides various virtual resources usage indicators

The LMA Self-Monitoring Dashboard

The LMA Self-Monitoring Dashboard is a new dashboard in LMA 08 This dashboard provides an overview of howthe LMA Toolchain performs overall

The LMA Collector row provides information about the Heka process In particular it is possible to visualize theprocessing time allocated to the Lua plugins and the amount of messages that have been processed as well as theamount of system resources consumed by the Heka process

Again it is possible to select a particular node using the dropdown menu list

The Collectd row provides system resource usage information allocated to the collectd process

The InfluxDB row provides system resource usage information allocated to the InfluxDB application

The Grafana row provides system resource usage information allocated to the Grafana application

The Elasticsearch row provides system resource usage information allocated to the JVM process running the Elastic-search application

Other Dashboards

In total there are 16 different dashboards you can use to explore different time-series facettes of your OpenStackenvironment

Viewing Faults and Anomalies

The LMA-Toolchain is capable of detecting a number of service-affecting conditions such as the faults and anomaliesthat occured in your OpenStack environment Those conditions are reported in annotations that are displayed inGrafana The Grafana annotations contain a textual representation of the alarm (or set of alarms) that were triggeredby the Collectors for a service In other words the annotations contain valuable insights that you could use to diagnoseand troubleshoot problems Futhermore with the Grafana annotations the system makes a distiction between whatis estimated as a direct root cause versus what is estimated as an indirect root cause This is internally represented ina dependency graph There are first degree dependencies that are used to describe situations whereby the healthstate of an entity strictly depends on the health state of another entity For example Nova as a service has firstdegree dependencies with the nova-api endpoints and the nova-scheduler workers But there are also second degreedependencies whereby the health state of an entity doesnrsquot strictly depends on the heath state of another entity althoughit might be depending on the operation being performed For example by default we declared that Nova has a seconddegree dependency with Neutron As a result the health state of Nova will not be directly impacted by the healthstate of Neutron but the annotation will provide a root cause analysis hint For example letrsquos assume a situation whereNova has changed a state from okay to critical (because of 5xx HTTP errors) and that Neutron has been in down state

14 User Guide 14

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

for a while In this case the Nova dashboard will display an annotation that says Nova has changed a state to warningbecause the system has detected 5xx errors and that it may be due to the fact that Neutron is down An example ofwhat an annotation looks like is shown below

14 User Guide 15

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

14 User Guide 16

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

144 Troubleshooting

If you get no data in Grafana follow these troubleshooting tips

1 First check that the LMA Collector is running properly by following the LMA Collector troubleshooting in-structions in the LMA Collector Fuel Plugin User Guide

2 Check that the nodes are able to connect to the InfluxDB server on port 8086

3 Check that InfluxDB is running

[rootnode-37 ~] etcinitdinfluxdb statusinfluxdb Process is running [ OK ]

4 If InfluxDB is down restart it

[rootnode-37 ~] etcinitdinfluxdb startStarting the process influxdb [ OK ]influxdb process was started [ OK ]

5 Check that Grafana is running

[rootnode-37 ~] etcinitdgrafana-server status

grafana is running

6 If Grafana is down restart it

[rootnode-37 ~] etcinitdgrafana-server start

Starting Grafana Server

7 If none of the above solves the problem check the logs in varloginfluxdbinfluxdblog andvarloggrafanagrafanalog to find out what might have gone wrong

15 Licenses

151 Third Party Components

Name Project Web Site LicenseInfluxDB httpsinfluxdbcom MITGrafana httpgrafanaorg Apache v2

152 Puppet modules

Name Project Web Site LicenseApt httpsgithubcompuppetlabspuppetlabs-apt Apache V2Concat httpsgithubcompuppetlabspuppetlabs-concat Apache V2Stdlib httpsgithubcompuppetlabspuppetlabs-stdlib Apache V2IniFile httpsgithubcompuppetlabspuppetlabs-inifile Apache v2Grafana httpsgithubcombfraserpuppet-grafana Apache v2

15 Licenses 17

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

16 Appendix

bull The InfluxDB-Grafana plugin project at GitHub

bull The official InfluxDB documentation

bull The official Grafana documentation

16 Appendix 18

CHAPTER 2

Indices and Tables

bull search

19

  • User documentation
    • Overview
    • Release Notes
    • Installation Guide
    • User Guide
    • Licenses
    • Appendix
      • Indices and Tables
Page 6: The InfluxDB-Grafana plugin for Fuel Documentation · 2019-04-02 · The InfluxDB-Grafana plugin for Fuel Documentation, Release 0.8.0 1.4.3Exploring your time-series with Grafana

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

[rootfuel ~] fuel plugins --install influxdb_grafana-08-080-1noarchrpm

4 Verify that the plugin is installed correctly

[rootfuel ~] fuel plugins --listid | name | version | package_version---|----------------------|---------|----------------1 | influxdb_grafana | 080 | 300

132 InfluxDB-Grafana Fuel Plugin install from source

Alternatively you may want to build the RPM file of the plugin from source if for example you want to test thelatest features modify some built-in configuration or implement your own customization But note that running a Fuelplugin that you have built yourself is at your own risk

To install the InfluxDB-Grafana Plugin from source you first need to prepare an environement to build the RPM fileThe recommended approach is to build the RPM file directly onto the Fuel Master node so that you wonrsquot have to copythat file later on

Prepare an environment for building the plugin on the Fuel Master Node

1 Install the standard Linux development tools

[roothome ~] yum install createrepo rpm rpm-build dpkg-devel

2 Install the Fuel Plugin Builder To do that you should first get pip

[roothome ~] easy_install pip

3 Then install the Fuel Plugin Builder (the fpb command line) with pip

[roothome ~] pip install fuel-plugin-builder

Note You may also need to build the Fuel Plugin Builder if the package version of the plugin is higher than packageversion supported by the Fuel Plugin Builder you get from pypi In this case please refer to the section ldquoPreparing anenvironment for plugin developmentrdquo of the Fuel Plugins wiki if you need further instructions about how to build theFuel Plugin Builder

4 Clone the plugin git repository

[roothome ~] git clone gitgithubcomstackforgefuel-plugin-influxdb-grafanagit

5 Check that the plugin is valid

[roothome ~] fpb --check fuel-plugin-influxdb-grafana

6 And finally build the plugin

[roothome ~] fpb --build fuel-plugin-influxdb-grafana

7 Now that you have created the RPM file you can install the plugin using the fuel plugins ndashinstall command

[rootfuel ~] fuel plugins --install fuel-plugin-influxdb-grafananoarchrpm

133 InfluxDB-Grafana Fuel Plugin software components

List of software components installed by the plugin

13 Installation Guide 4

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

Components VersionInfluxDB v094 for Ubuntu (64-bit)Grafana v250 for Ubuntu (64-bit)

14 User Guide

141 Plugin configuration

To configure your plugin you need to follow these steps

1 Create a new environment with the Fuel web user interface

2 Click on the Settings tab of the Fuel web UI

3 Scroll down the page and select the InfluxDB-Grafana Plugin in the left column The InfluxDB-Grafana Pluginsettings screen should appear as shown below

14 User Guide 5

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

1 Select the InfluxDB-Grafana Plugin checkbox and fill-in the required fields

(a) Specify the number of days retention period for data

(b) Specify the InfluxDB admin password (called root password in the InfluxDB documentation)

(c) Specify the database name (default is lma)

(d) Specify the InfluxDB user name and password

(e) Specify the Grafana user name and password

2 Assign the InfluxDB Grafana role to the node where you would like to install the InfluxDB and Grafana serversas shown below

Note Because of a bug with Fuel 70 (see bug 1496328) the UI wonrsquot let you assign the InfluxDBGrafana role if at least one node is already assigned with one of the built-in roles

To workaround this problem you should either remove the already assigned built-in roles or use the FuelCLI

14 User Guide 6

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

$ fuel --env ltenvironment idgt node set --node-id ltnode_idgt --role=influxdb_grafana

1 Adjust the disk configuration if necessary (see the Fuel User Guide for details) By default the InfluxDB-Grafana Plugin allocates

bull 20 of the first available disk for the operating system by honoring a range of 15GB minimum to 50GBmaximum

bull 10GB for varlog

bull At least 30 GB for the InfluxDB database in optinfluxdb

2 Configure your environment as needed

3 Verify the networks on the Networks tab of the Fuel web UI

4 Deploy your changes

142 Plugin verification

Be aware that depending on the number of nodes and deployment setup deploying a Mirantis OpenStack environmentcan typically take anything from 30 minutes to several hours But once your deployment is complete you should seea notification that looks like the following

Verifying InfluxDB

Once your deployment has completed you should verify that InfluxDB is running properly On the Fuel Master nodeyou can retrieve the IP address of the node where InfluxDB is installed via the fuel command line

14 User Guide 7

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

[rootfuel ~] fuel nodesid | status | name | cluster | ip | | roles | ---|----------|------|---------|-----------|-----|-------------------|----37 | ready | lma | 38 | 102004 | | influxdb_grafana |

[Skip ]

On that node (node-37 in this example) the influx command should be available via the CLI Executing influx willstart an interactive CLI and automatically connect to the local InfluxDB server

[rootnode-37 ~] optinfluxdbinflux -database lma -password lmapass --username lmaConnected to httplocalhost8086 version 0942InfluxDB shell 0942gt

Then if you type

gt show series

You should see a dump of all the time-series collected so far

[ Skip]

name swap_used---------------_key deployment_id hostnameswap_useddeployment_id=38hostname=node-40 38 node-40swap_useddeployment_id=38hostname=node-42 38 node-42swap_useddeployment_id=38hostname=node-41 38 node-41swap_useddeployment_id=38hostname=node-43 38 node-43swap_useddeployment_id=38hostname=node-38 38 node-38swap_useddeployment_id=38hostname=node-37 38 node-37swap_useddeployment_id=38hostname=node-36 38 node-36

name total_threads_created---------------------------_key deployment_id hostnametotal_threads_createddeployment_id=38hostname=node-38 38 node-38total_threads_createddeployment_id=38hostname=node-37 38 node-37total_threads_createddeployment_id=38hostname=node-36 38 node-36

Verifying Grafana

The Grafana user interface runs on port 8000 Pointing your browser to the URL httpltHOSTgt8000 you should seethe Grafana login page

14 User Guide 8

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

You should be redirected to the Grafana Home Page The first time you access Grafana you are requested to authen-ticate using the credentials you have defined in the settings Once you have authenticated successfully you should beautomatically redirected to the Home Page from where you can select a dashboard as shown below

14 User Guide 9

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

143 Exploring your time-series with Grafana

The InfluxDB-Grafana Plugin comes with a collection of predefined dashboards you can use to visualize the time-series that are stored in InfluxDB There is one primary dashboard called the Main Dashboard and several otherdashboards that are organized per service name

The Main Dashboard

We suggest you start with the Main Dashboard as shown below The Main Dashboard provides a single pane of glassto visualize the health status of all the OpenStack services being monitored such as Nova or Cinder but also HAProxyMySQL and RabbitMQ to name a few

14 User Guide 10

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

As you can see the Main Dashboard (as most dashboards) provides a drop down menu list in the upper left corner ofthe window from where you can select a metric tag (aka dimension) such as a controller name or device name youwant to visualize In the example above we say we want to visualize the system time-series for node-48

Within the OpenStack Services row each of the services represented can be assigned five different states

Note The precise determination of a service state depends on the Global Status Evaluation (GSE) policies definedfor the GSE Plugins

The meaning associated with a service health state is the following

bull Down One or several primary functions of a service cluster are failed For example all API endpoints of aservice cluster like Nova or Cinder are failed

bull Critical One or several primary functions of a service cluster are severely degraded The quality of servicedelivered to the end-user should be severely impacted

bull Warning One or several primary functions of a service cluster are slightly degraded The quality of servicedelivered to the end-user should be slightly impacted

bull Unknown There is not enough data to infer the actual health state of a service cluster

bull Okay None of the above was found to be true

The Virtual Compute Resources row provides an overview of the amount of virtual resources being used by the com-pute nodes including the number of virtual CPUs the amount of memory and disk space being used as well as theamount of virtual resources remaining available to create new instances

14 User Guide 11

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

The ldquoSystemrdquo row provides an overview of the amount of physical resources being used on the control plane (thecontroller cluster) You can select a specific controller using the controllerrsquos drop down list in the left corner of thetoolbar

The ldquoCephrdquo row provides an overview of the resources usage and current health state of the Ceph cluster when it isdeployed in the OpenStack environment

The Main Dashboard is also an entry point to access detailed dashboards for each of the OpenStack services beingmonitored For example if you click through the Nova box you should see a screen like this

14 User Guide 12

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

14 User Guide 13

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

The Nova Dashboard

The Nova Dashboard provides a detailed view of the Nova servicersquos related metrics

The Service Status row provides information about the Nova service cluster health state as a whole including the stateof the API frontend (the HAProxy plubic VIP) a counter of HTTP 5xx errors the HTTP requests response time andstatus code

The Nova API row provides information about the health state of the API backends (nova-api ec2-api ) the state ofthe workers and compute nodes

The Instance row provides information about the number of active instances instances in error and instances creationtime statistics

The ldquoResourcesrdquo row provides various virtual resources usage indicators

The LMA Self-Monitoring Dashboard

The LMA Self-Monitoring Dashboard is a new dashboard in LMA 08 This dashboard provides an overview of howthe LMA Toolchain performs overall

The LMA Collector row provides information about the Heka process In particular it is possible to visualize theprocessing time allocated to the Lua plugins and the amount of messages that have been processed as well as theamount of system resources consumed by the Heka process

Again it is possible to select a particular node using the dropdown menu list

The Collectd row provides system resource usage information allocated to the collectd process

The InfluxDB row provides system resource usage information allocated to the InfluxDB application

The Grafana row provides system resource usage information allocated to the Grafana application

The Elasticsearch row provides system resource usage information allocated to the JVM process running the Elastic-search application

Other Dashboards

In total there are 16 different dashboards you can use to explore different time-series facettes of your OpenStackenvironment

Viewing Faults and Anomalies

The LMA-Toolchain is capable of detecting a number of service-affecting conditions such as the faults and anomaliesthat occured in your OpenStack environment Those conditions are reported in annotations that are displayed inGrafana The Grafana annotations contain a textual representation of the alarm (or set of alarms) that were triggeredby the Collectors for a service In other words the annotations contain valuable insights that you could use to diagnoseand troubleshoot problems Futhermore with the Grafana annotations the system makes a distiction between whatis estimated as a direct root cause versus what is estimated as an indirect root cause This is internally represented ina dependency graph There are first degree dependencies that are used to describe situations whereby the healthstate of an entity strictly depends on the health state of another entity For example Nova as a service has firstdegree dependencies with the nova-api endpoints and the nova-scheduler workers But there are also second degreedependencies whereby the health state of an entity doesnrsquot strictly depends on the heath state of another entity althoughit might be depending on the operation being performed For example by default we declared that Nova has a seconddegree dependency with Neutron As a result the health state of Nova will not be directly impacted by the healthstate of Neutron but the annotation will provide a root cause analysis hint For example letrsquos assume a situation whereNova has changed a state from okay to critical (because of 5xx HTTP errors) and that Neutron has been in down state

14 User Guide 14

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

for a while In this case the Nova dashboard will display an annotation that says Nova has changed a state to warningbecause the system has detected 5xx errors and that it may be due to the fact that Neutron is down An example ofwhat an annotation looks like is shown below

14 User Guide 15

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

14 User Guide 16

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

144 Troubleshooting

If you get no data in Grafana follow these troubleshooting tips

1 First check that the LMA Collector is running properly by following the LMA Collector troubleshooting in-structions in the LMA Collector Fuel Plugin User Guide

2 Check that the nodes are able to connect to the InfluxDB server on port 8086

3 Check that InfluxDB is running

[rootnode-37 ~] etcinitdinfluxdb statusinfluxdb Process is running [ OK ]

4 If InfluxDB is down restart it

[rootnode-37 ~] etcinitdinfluxdb startStarting the process influxdb [ OK ]influxdb process was started [ OK ]

5 Check that Grafana is running

[rootnode-37 ~] etcinitdgrafana-server status

grafana is running

6 If Grafana is down restart it

[rootnode-37 ~] etcinitdgrafana-server start

Starting Grafana Server

7 If none of the above solves the problem check the logs in varloginfluxdbinfluxdblog andvarloggrafanagrafanalog to find out what might have gone wrong

15 Licenses

151 Third Party Components

Name Project Web Site LicenseInfluxDB httpsinfluxdbcom MITGrafana httpgrafanaorg Apache v2

152 Puppet modules

Name Project Web Site LicenseApt httpsgithubcompuppetlabspuppetlabs-apt Apache V2Concat httpsgithubcompuppetlabspuppetlabs-concat Apache V2Stdlib httpsgithubcompuppetlabspuppetlabs-stdlib Apache V2IniFile httpsgithubcompuppetlabspuppetlabs-inifile Apache v2Grafana httpsgithubcombfraserpuppet-grafana Apache v2

15 Licenses 17

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

16 Appendix

bull The InfluxDB-Grafana plugin project at GitHub

bull The official InfluxDB documentation

bull The official Grafana documentation

16 Appendix 18

CHAPTER 2

Indices and Tables

bull search

19

  • User documentation
    • Overview
    • Release Notes
    • Installation Guide
    • User Guide
    • Licenses
    • Appendix
      • Indices and Tables
Page 7: The InfluxDB-Grafana plugin for Fuel Documentation · 2019-04-02 · The InfluxDB-Grafana plugin for Fuel Documentation, Release 0.8.0 1.4.3Exploring your time-series with Grafana

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

Components VersionInfluxDB v094 for Ubuntu (64-bit)Grafana v250 for Ubuntu (64-bit)

14 User Guide

141 Plugin configuration

To configure your plugin you need to follow these steps

1 Create a new environment with the Fuel web user interface

2 Click on the Settings tab of the Fuel web UI

3 Scroll down the page and select the InfluxDB-Grafana Plugin in the left column The InfluxDB-Grafana Pluginsettings screen should appear as shown below

14 User Guide 5

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

1 Select the InfluxDB-Grafana Plugin checkbox and fill-in the required fields

(a) Specify the number of days retention period for data

(b) Specify the InfluxDB admin password (called root password in the InfluxDB documentation)

(c) Specify the database name (default is lma)

(d) Specify the InfluxDB user name and password

(e) Specify the Grafana user name and password

2 Assign the InfluxDB Grafana role to the node where you would like to install the InfluxDB and Grafana serversas shown below

Note Because of a bug with Fuel 70 (see bug 1496328) the UI wonrsquot let you assign the InfluxDBGrafana role if at least one node is already assigned with one of the built-in roles

To workaround this problem you should either remove the already assigned built-in roles or use the FuelCLI

14 User Guide 6

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

$ fuel --env ltenvironment idgt node set --node-id ltnode_idgt --role=influxdb_grafana

1 Adjust the disk configuration if necessary (see the Fuel User Guide for details) By default the InfluxDB-Grafana Plugin allocates

bull 20 of the first available disk for the operating system by honoring a range of 15GB minimum to 50GBmaximum

bull 10GB for varlog

bull At least 30 GB for the InfluxDB database in optinfluxdb

2 Configure your environment as needed

3 Verify the networks on the Networks tab of the Fuel web UI

4 Deploy your changes

142 Plugin verification

Be aware that depending on the number of nodes and deployment setup deploying a Mirantis OpenStack environmentcan typically take anything from 30 minutes to several hours But once your deployment is complete you should seea notification that looks like the following

Verifying InfluxDB

Once your deployment has completed you should verify that InfluxDB is running properly On the Fuel Master nodeyou can retrieve the IP address of the node where InfluxDB is installed via the fuel command line

14 User Guide 7

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

[rootfuel ~] fuel nodesid | status | name | cluster | ip | | roles | ---|----------|------|---------|-----------|-----|-------------------|----37 | ready | lma | 38 | 102004 | | influxdb_grafana |

[Skip ]

On that node (node-37 in this example) the influx command should be available via the CLI Executing influx willstart an interactive CLI and automatically connect to the local InfluxDB server

[rootnode-37 ~] optinfluxdbinflux -database lma -password lmapass --username lmaConnected to httplocalhost8086 version 0942InfluxDB shell 0942gt

Then if you type

gt show series

You should see a dump of all the time-series collected so far

[ Skip]

name swap_used---------------_key deployment_id hostnameswap_useddeployment_id=38hostname=node-40 38 node-40swap_useddeployment_id=38hostname=node-42 38 node-42swap_useddeployment_id=38hostname=node-41 38 node-41swap_useddeployment_id=38hostname=node-43 38 node-43swap_useddeployment_id=38hostname=node-38 38 node-38swap_useddeployment_id=38hostname=node-37 38 node-37swap_useddeployment_id=38hostname=node-36 38 node-36

name total_threads_created---------------------------_key deployment_id hostnametotal_threads_createddeployment_id=38hostname=node-38 38 node-38total_threads_createddeployment_id=38hostname=node-37 38 node-37total_threads_createddeployment_id=38hostname=node-36 38 node-36

Verifying Grafana

The Grafana user interface runs on port 8000 Pointing your browser to the URL httpltHOSTgt8000 you should seethe Grafana login page

14 User Guide 8

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

You should be redirected to the Grafana Home Page The first time you access Grafana you are requested to authen-ticate using the credentials you have defined in the settings Once you have authenticated successfully you should beautomatically redirected to the Home Page from where you can select a dashboard as shown below

14 User Guide 9

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

143 Exploring your time-series with Grafana

The InfluxDB-Grafana Plugin comes with a collection of predefined dashboards you can use to visualize the time-series that are stored in InfluxDB There is one primary dashboard called the Main Dashboard and several otherdashboards that are organized per service name

The Main Dashboard

We suggest you start with the Main Dashboard as shown below The Main Dashboard provides a single pane of glassto visualize the health status of all the OpenStack services being monitored such as Nova or Cinder but also HAProxyMySQL and RabbitMQ to name a few

14 User Guide 10

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

As you can see the Main Dashboard (as most dashboards) provides a drop down menu list in the upper left corner ofthe window from where you can select a metric tag (aka dimension) such as a controller name or device name youwant to visualize In the example above we say we want to visualize the system time-series for node-48

Within the OpenStack Services row each of the services represented can be assigned five different states

Note The precise determination of a service state depends on the Global Status Evaluation (GSE) policies definedfor the GSE Plugins

The meaning associated with a service health state is the following

bull Down One or several primary functions of a service cluster are failed For example all API endpoints of aservice cluster like Nova or Cinder are failed

bull Critical One or several primary functions of a service cluster are severely degraded The quality of servicedelivered to the end-user should be severely impacted

bull Warning One or several primary functions of a service cluster are slightly degraded The quality of servicedelivered to the end-user should be slightly impacted

bull Unknown There is not enough data to infer the actual health state of a service cluster

bull Okay None of the above was found to be true

The Virtual Compute Resources row provides an overview of the amount of virtual resources being used by the com-pute nodes including the number of virtual CPUs the amount of memory and disk space being used as well as theamount of virtual resources remaining available to create new instances

14 User Guide 11

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

The ldquoSystemrdquo row provides an overview of the amount of physical resources being used on the control plane (thecontroller cluster) You can select a specific controller using the controllerrsquos drop down list in the left corner of thetoolbar

The ldquoCephrdquo row provides an overview of the resources usage and current health state of the Ceph cluster when it isdeployed in the OpenStack environment

The Main Dashboard is also an entry point to access detailed dashboards for each of the OpenStack services beingmonitored For example if you click through the Nova box you should see a screen like this

14 User Guide 12

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

14 User Guide 13

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

The Nova Dashboard

The Nova Dashboard provides a detailed view of the Nova servicersquos related metrics

The Service Status row provides information about the Nova service cluster health state as a whole including the stateof the API frontend (the HAProxy plubic VIP) a counter of HTTP 5xx errors the HTTP requests response time andstatus code

The Nova API row provides information about the health state of the API backends (nova-api ec2-api ) the state ofthe workers and compute nodes

The Instance row provides information about the number of active instances instances in error and instances creationtime statistics

The ldquoResourcesrdquo row provides various virtual resources usage indicators

The LMA Self-Monitoring Dashboard

The LMA Self-Monitoring Dashboard is a new dashboard in LMA 08 This dashboard provides an overview of howthe LMA Toolchain performs overall

The LMA Collector row provides information about the Heka process In particular it is possible to visualize theprocessing time allocated to the Lua plugins and the amount of messages that have been processed as well as theamount of system resources consumed by the Heka process

Again it is possible to select a particular node using the dropdown menu list

The Collectd row provides system resource usage information allocated to the collectd process

The InfluxDB row provides system resource usage information allocated to the InfluxDB application

The Grafana row provides system resource usage information allocated to the Grafana application

The Elasticsearch row provides system resource usage information allocated to the JVM process running the Elastic-search application

Other Dashboards

In total there are 16 different dashboards you can use to explore different time-series facettes of your OpenStackenvironment

Viewing Faults and Anomalies

The LMA-Toolchain is capable of detecting a number of service-affecting conditions such as the faults and anomaliesthat occured in your OpenStack environment Those conditions are reported in annotations that are displayed inGrafana The Grafana annotations contain a textual representation of the alarm (or set of alarms) that were triggeredby the Collectors for a service In other words the annotations contain valuable insights that you could use to diagnoseand troubleshoot problems Futhermore with the Grafana annotations the system makes a distiction between whatis estimated as a direct root cause versus what is estimated as an indirect root cause This is internally represented ina dependency graph There are first degree dependencies that are used to describe situations whereby the healthstate of an entity strictly depends on the health state of another entity For example Nova as a service has firstdegree dependencies with the nova-api endpoints and the nova-scheduler workers But there are also second degreedependencies whereby the health state of an entity doesnrsquot strictly depends on the heath state of another entity althoughit might be depending on the operation being performed For example by default we declared that Nova has a seconddegree dependency with Neutron As a result the health state of Nova will not be directly impacted by the healthstate of Neutron but the annotation will provide a root cause analysis hint For example letrsquos assume a situation whereNova has changed a state from okay to critical (because of 5xx HTTP errors) and that Neutron has been in down state

14 User Guide 14

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

for a while In this case the Nova dashboard will display an annotation that says Nova has changed a state to warningbecause the system has detected 5xx errors and that it may be due to the fact that Neutron is down An example ofwhat an annotation looks like is shown below

14 User Guide 15

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

14 User Guide 16

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

144 Troubleshooting

If you get no data in Grafana follow these troubleshooting tips

1 First check that the LMA Collector is running properly by following the LMA Collector troubleshooting in-structions in the LMA Collector Fuel Plugin User Guide

2 Check that the nodes are able to connect to the InfluxDB server on port 8086

3 Check that InfluxDB is running

[rootnode-37 ~] etcinitdinfluxdb statusinfluxdb Process is running [ OK ]

4 If InfluxDB is down restart it

[rootnode-37 ~] etcinitdinfluxdb startStarting the process influxdb [ OK ]influxdb process was started [ OK ]

5 Check that Grafana is running

[rootnode-37 ~] etcinitdgrafana-server status

grafana is running

6 If Grafana is down restart it

[rootnode-37 ~] etcinitdgrafana-server start

Starting Grafana Server

7 If none of the above solves the problem check the logs in varloginfluxdbinfluxdblog andvarloggrafanagrafanalog to find out what might have gone wrong

15 Licenses

151 Third Party Components

Name Project Web Site LicenseInfluxDB httpsinfluxdbcom MITGrafana httpgrafanaorg Apache v2

152 Puppet modules

Name Project Web Site LicenseApt httpsgithubcompuppetlabspuppetlabs-apt Apache V2Concat httpsgithubcompuppetlabspuppetlabs-concat Apache V2Stdlib httpsgithubcompuppetlabspuppetlabs-stdlib Apache V2IniFile httpsgithubcompuppetlabspuppetlabs-inifile Apache v2Grafana httpsgithubcombfraserpuppet-grafana Apache v2

15 Licenses 17

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

16 Appendix

bull The InfluxDB-Grafana plugin project at GitHub

bull The official InfluxDB documentation

bull The official Grafana documentation

16 Appendix 18

CHAPTER 2

Indices and Tables

bull search

19

  • User documentation
    • Overview
    • Release Notes
    • Installation Guide
    • User Guide
    • Licenses
    • Appendix
      • Indices and Tables
Page 8: The InfluxDB-Grafana plugin for Fuel Documentation · 2019-04-02 · The InfluxDB-Grafana plugin for Fuel Documentation, Release 0.8.0 1.4.3Exploring your time-series with Grafana

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

1 Select the InfluxDB-Grafana Plugin checkbox and fill-in the required fields

(a) Specify the number of days retention period for data

(b) Specify the InfluxDB admin password (called root password in the InfluxDB documentation)

(c) Specify the database name (default is lma)

(d) Specify the InfluxDB user name and password

(e) Specify the Grafana user name and password

2 Assign the InfluxDB Grafana role to the node where you would like to install the InfluxDB and Grafana serversas shown below

Note Because of a bug with Fuel 70 (see bug 1496328) the UI wonrsquot let you assign the InfluxDBGrafana role if at least one node is already assigned with one of the built-in roles

To workaround this problem you should either remove the already assigned built-in roles or use the FuelCLI

14 User Guide 6

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

$ fuel --env ltenvironment idgt node set --node-id ltnode_idgt --role=influxdb_grafana

1 Adjust the disk configuration if necessary (see the Fuel User Guide for details) By default the InfluxDB-Grafana Plugin allocates

bull 20 of the first available disk for the operating system by honoring a range of 15GB minimum to 50GBmaximum

bull 10GB for varlog

bull At least 30 GB for the InfluxDB database in optinfluxdb

2 Configure your environment as needed

3 Verify the networks on the Networks tab of the Fuel web UI

4 Deploy your changes

142 Plugin verification

Be aware that depending on the number of nodes and deployment setup deploying a Mirantis OpenStack environmentcan typically take anything from 30 minutes to several hours But once your deployment is complete you should seea notification that looks like the following

Verifying InfluxDB

Once your deployment has completed you should verify that InfluxDB is running properly On the Fuel Master nodeyou can retrieve the IP address of the node where InfluxDB is installed via the fuel command line

14 User Guide 7

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

[rootfuel ~] fuel nodesid | status | name | cluster | ip | | roles | ---|----------|------|---------|-----------|-----|-------------------|----37 | ready | lma | 38 | 102004 | | influxdb_grafana |

[Skip ]

On that node (node-37 in this example) the influx command should be available via the CLI Executing influx willstart an interactive CLI and automatically connect to the local InfluxDB server

[rootnode-37 ~] optinfluxdbinflux -database lma -password lmapass --username lmaConnected to httplocalhost8086 version 0942InfluxDB shell 0942gt

Then if you type

gt show series

You should see a dump of all the time-series collected so far

[ Skip]

name swap_used---------------_key deployment_id hostnameswap_useddeployment_id=38hostname=node-40 38 node-40swap_useddeployment_id=38hostname=node-42 38 node-42swap_useddeployment_id=38hostname=node-41 38 node-41swap_useddeployment_id=38hostname=node-43 38 node-43swap_useddeployment_id=38hostname=node-38 38 node-38swap_useddeployment_id=38hostname=node-37 38 node-37swap_useddeployment_id=38hostname=node-36 38 node-36

name total_threads_created---------------------------_key deployment_id hostnametotal_threads_createddeployment_id=38hostname=node-38 38 node-38total_threads_createddeployment_id=38hostname=node-37 38 node-37total_threads_createddeployment_id=38hostname=node-36 38 node-36

Verifying Grafana

The Grafana user interface runs on port 8000 Pointing your browser to the URL httpltHOSTgt8000 you should seethe Grafana login page

14 User Guide 8

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

You should be redirected to the Grafana Home Page The first time you access Grafana you are requested to authen-ticate using the credentials you have defined in the settings Once you have authenticated successfully you should beautomatically redirected to the Home Page from where you can select a dashboard as shown below

14 User Guide 9

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

143 Exploring your time-series with Grafana

The InfluxDB-Grafana Plugin comes with a collection of predefined dashboards you can use to visualize the time-series that are stored in InfluxDB There is one primary dashboard called the Main Dashboard and several otherdashboards that are organized per service name

The Main Dashboard

We suggest you start with the Main Dashboard as shown below The Main Dashboard provides a single pane of glassto visualize the health status of all the OpenStack services being monitored such as Nova or Cinder but also HAProxyMySQL and RabbitMQ to name a few

14 User Guide 10

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

As you can see the Main Dashboard (as most dashboards) provides a drop down menu list in the upper left corner ofthe window from where you can select a metric tag (aka dimension) such as a controller name or device name youwant to visualize In the example above we say we want to visualize the system time-series for node-48

Within the OpenStack Services row each of the services represented can be assigned five different states

Note The precise determination of a service state depends on the Global Status Evaluation (GSE) policies definedfor the GSE Plugins

The meaning associated with a service health state is the following

bull Down One or several primary functions of a service cluster are failed For example all API endpoints of aservice cluster like Nova or Cinder are failed

bull Critical One or several primary functions of a service cluster are severely degraded The quality of servicedelivered to the end-user should be severely impacted

bull Warning One or several primary functions of a service cluster are slightly degraded The quality of servicedelivered to the end-user should be slightly impacted

bull Unknown There is not enough data to infer the actual health state of a service cluster

bull Okay None of the above was found to be true

The Virtual Compute Resources row provides an overview of the amount of virtual resources being used by the com-pute nodes including the number of virtual CPUs the amount of memory and disk space being used as well as theamount of virtual resources remaining available to create new instances

14 User Guide 11

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

The ldquoSystemrdquo row provides an overview of the amount of physical resources being used on the control plane (thecontroller cluster) You can select a specific controller using the controllerrsquos drop down list in the left corner of thetoolbar

The ldquoCephrdquo row provides an overview of the resources usage and current health state of the Ceph cluster when it isdeployed in the OpenStack environment

The Main Dashboard is also an entry point to access detailed dashboards for each of the OpenStack services beingmonitored For example if you click through the Nova box you should see a screen like this

14 User Guide 12

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

14 User Guide 13

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

The Nova Dashboard

The Nova Dashboard provides a detailed view of the Nova servicersquos related metrics

The Service Status row provides information about the Nova service cluster health state as a whole including the stateof the API frontend (the HAProxy plubic VIP) a counter of HTTP 5xx errors the HTTP requests response time andstatus code

The Nova API row provides information about the health state of the API backends (nova-api ec2-api ) the state ofthe workers and compute nodes

The Instance row provides information about the number of active instances instances in error and instances creationtime statistics

The ldquoResourcesrdquo row provides various virtual resources usage indicators

The LMA Self-Monitoring Dashboard

The LMA Self-Monitoring Dashboard is a new dashboard in LMA 08 This dashboard provides an overview of howthe LMA Toolchain performs overall

The LMA Collector row provides information about the Heka process In particular it is possible to visualize theprocessing time allocated to the Lua plugins and the amount of messages that have been processed as well as theamount of system resources consumed by the Heka process

Again it is possible to select a particular node using the dropdown menu list

The Collectd row provides system resource usage information allocated to the collectd process

The InfluxDB row provides system resource usage information allocated to the InfluxDB application

The Grafana row provides system resource usage information allocated to the Grafana application

The Elasticsearch row provides system resource usage information allocated to the JVM process running the Elastic-search application

Other Dashboards

In total there are 16 different dashboards you can use to explore different time-series facettes of your OpenStackenvironment

Viewing Faults and Anomalies

The LMA-Toolchain is capable of detecting a number of service-affecting conditions such as the faults and anomaliesthat occured in your OpenStack environment Those conditions are reported in annotations that are displayed inGrafana The Grafana annotations contain a textual representation of the alarm (or set of alarms) that were triggeredby the Collectors for a service In other words the annotations contain valuable insights that you could use to diagnoseand troubleshoot problems Futhermore with the Grafana annotations the system makes a distiction between whatis estimated as a direct root cause versus what is estimated as an indirect root cause This is internally represented ina dependency graph There are first degree dependencies that are used to describe situations whereby the healthstate of an entity strictly depends on the health state of another entity For example Nova as a service has firstdegree dependencies with the nova-api endpoints and the nova-scheduler workers But there are also second degreedependencies whereby the health state of an entity doesnrsquot strictly depends on the heath state of another entity althoughit might be depending on the operation being performed For example by default we declared that Nova has a seconddegree dependency with Neutron As a result the health state of Nova will not be directly impacted by the healthstate of Neutron but the annotation will provide a root cause analysis hint For example letrsquos assume a situation whereNova has changed a state from okay to critical (because of 5xx HTTP errors) and that Neutron has been in down state

14 User Guide 14

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

for a while In this case the Nova dashboard will display an annotation that says Nova has changed a state to warningbecause the system has detected 5xx errors and that it may be due to the fact that Neutron is down An example ofwhat an annotation looks like is shown below

14 User Guide 15

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

14 User Guide 16

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

144 Troubleshooting

If you get no data in Grafana follow these troubleshooting tips

1 First check that the LMA Collector is running properly by following the LMA Collector troubleshooting in-structions in the LMA Collector Fuel Plugin User Guide

2 Check that the nodes are able to connect to the InfluxDB server on port 8086

3 Check that InfluxDB is running

[rootnode-37 ~] etcinitdinfluxdb statusinfluxdb Process is running [ OK ]

4 If InfluxDB is down restart it

[rootnode-37 ~] etcinitdinfluxdb startStarting the process influxdb [ OK ]influxdb process was started [ OK ]

5 Check that Grafana is running

[rootnode-37 ~] etcinitdgrafana-server status

grafana is running

6 If Grafana is down restart it

[rootnode-37 ~] etcinitdgrafana-server start

Starting Grafana Server

7 If none of the above solves the problem check the logs in varloginfluxdbinfluxdblog andvarloggrafanagrafanalog to find out what might have gone wrong

15 Licenses

151 Third Party Components

Name Project Web Site LicenseInfluxDB httpsinfluxdbcom MITGrafana httpgrafanaorg Apache v2

152 Puppet modules

Name Project Web Site LicenseApt httpsgithubcompuppetlabspuppetlabs-apt Apache V2Concat httpsgithubcompuppetlabspuppetlabs-concat Apache V2Stdlib httpsgithubcompuppetlabspuppetlabs-stdlib Apache V2IniFile httpsgithubcompuppetlabspuppetlabs-inifile Apache v2Grafana httpsgithubcombfraserpuppet-grafana Apache v2

15 Licenses 17

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

16 Appendix

bull The InfluxDB-Grafana plugin project at GitHub

bull The official InfluxDB documentation

bull The official Grafana documentation

16 Appendix 18

CHAPTER 2

Indices and Tables

bull search

19

  • User documentation
    • Overview
    • Release Notes
    • Installation Guide
    • User Guide
    • Licenses
    • Appendix
      • Indices and Tables
Page 9: The InfluxDB-Grafana plugin for Fuel Documentation · 2019-04-02 · The InfluxDB-Grafana plugin for Fuel Documentation, Release 0.8.0 1.4.3Exploring your time-series with Grafana

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

$ fuel --env ltenvironment idgt node set --node-id ltnode_idgt --role=influxdb_grafana

1 Adjust the disk configuration if necessary (see the Fuel User Guide for details) By default the InfluxDB-Grafana Plugin allocates

bull 20 of the first available disk for the operating system by honoring a range of 15GB minimum to 50GBmaximum

bull 10GB for varlog

bull At least 30 GB for the InfluxDB database in optinfluxdb

2 Configure your environment as needed

3 Verify the networks on the Networks tab of the Fuel web UI

4 Deploy your changes

142 Plugin verification

Be aware that depending on the number of nodes and deployment setup deploying a Mirantis OpenStack environmentcan typically take anything from 30 minutes to several hours But once your deployment is complete you should seea notification that looks like the following

Verifying InfluxDB

Once your deployment has completed you should verify that InfluxDB is running properly On the Fuel Master nodeyou can retrieve the IP address of the node where InfluxDB is installed via the fuel command line

14 User Guide 7

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

[rootfuel ~] fuel nodesid | status | name | cluster | ip | | roles | ---|----------|------|---------|-----------|-----|-------------------|----37 | ready | lma | 38 | 102004 | | influxdb_grafana |

[Skip ]

On that node (node-37 in this example) the influx command should be available via the CLI Executing influx willstart an interactive CLI and automatically connect to the local InfluxDB server

[rootnode-37 ~] optinfluxdbinflux -database lma -password lmapass --username lmaConnected to httplocalhost8086 version 0942InfluxDB shell 0942gt

Then if you type

gt show series

You should see a dump of all the time-series collected so far

[ Skip]

name swap_used---------------_key deployment_id hostnameswap_useddeployment_id=38hostname=node-40 38 node-40swap_useddeployment_id=38hostname=node-42 38 node-42swap_useddeployment_id=38hostname=node-41 38 node-41swap_useddeployment_id=38hostname=node-43 38 node-43swap_useddeployment_id=38hostname=node-38 38 node-38swap_useddeployment_id=38hostname=node-37 38 node-37swap_useddeployment_id=38hostname=node-36 38 node-36

name total_threads_created---------------------------_key deployment_id hostnametotal_threads_createddeployment_id=38hostname=node-38 38 node-38total_threads_createddeployment_id=38hostname=node-37 38 node-37total_threads_createddeployment_id=38hostname=node-36 38 node-36

Verifying Grafana

The Grafana user interface runs on port 8000 Pointing your browser to the URL httpltHOSTgt8000 you should seethe Grafana login page

14 User Guide 8

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

You should be redirected to the Grafana Home Page The first time you access Grafana you are requested to authen-ticate using the credentials you have defined in the settings Once you have authenticated successfully you should beautomatically redirected to the Home Page from where you can select a dashboard as shown below

14 User Guide 9

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

143 Exploring your time-series with Grafana

The InfluxDB-Grafana Plugin comes with a collection of predefined dashboards you can use to visualize the time-series that are stored in InfluxDB There is one primary dashboard called the Main Dashboard and several otherdashboards that are organized per service name

The Main Dashboard

We suggest you start with the Main Dashboard as shown below The Main Dashboard provides a single pane of glassto visualize the health status of all the OpenStack services being monitored such as Nova or Cinder but also HAProxyMySQL and RabbitMQ to name a few

14 User Guide 10

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

As you can see the Main Dashboard (as most dashboards) provides a drop down menu list in the upper left corner ofthe window from where you can select a metric tag (aka dimension) such as a controller name or device name youwant to visualize In the example above we say we want to visualize the system time-series for node-48

Within the OpenStack Services row each of the services represented can be assigned five different states

Note The precise determination of a service state depends on the Global Status Evaluation (GSE) policies definedfor the GSE Plugins

The meaning associated with a service health state is the following

bull Down One or several primary functions of a service cluster are failed For example all API endpoints of aservice cluster like Nova or Cinder are failed

bull Critical One or several primary functions of a service cluster are severely degraded The quality of servicedelivered to the end-user should be severely impacted

bull Warning One or several primary functions of a service cluster are slightly degraded The quality of servicedelivered to the end-user should be slightly impacted

bull Unknown There is not enough data to infer the actual health state of a service cluster

bull Okay None of the above was found to be true

The Virtual Compute Resources row provides an overview of the amount of virtual resources being used by the com-pute nodes including the number of virtual CPUs the amount of memory and disk space being used as well as theamount of virtual resources remaining available to create new instances

14 User Guide 11

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

The ldquoSystemrdquo row provides an overview of the amount of physical resources being used on the control plane (thecontroller cluster) You can select a specific controller using the controllerrsquos drop down list in the left corner of thetoolbar

The ldquoCephrdquo row provides an overview of the resources usage and current health state of the Ceph cluster when it isdeployed in the OpenStack environment

The Main Dashboard is also an entry point to access detailed dashboards for each of the OpenStack services beingmonitored For example if you click through the Nova box you should see a screen like this

14 User Guide 12

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

14 User Guide 13

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

The Nova Dashboard

The Nova Dashboard provides a detailed view of the Nova servicersquos related metrics

The Service Status row provides information about the Nova service cluster health state as a whole including the stateof the API frontend (the HAProxy plubic VIP) a counter of HTTP 5xx errors the HTTP requests response time andstatus code

The Nova API row provides information about the health state of the API backends (nova-api ec2-api ) the state ofthe workers and compute nodes

The Instance row provides information about the number of active instances instances in error and instances creationtime statistics

The ldquoResourcesrdquo row provides various virtual resources usage indicators

The LMA Self-Monitoring Dashboard

The LMA Self-Monitoring Dashboard is a new dashboard in LMA 08 This dashboard provides an overview of howthe LMA Toolchain performs overall

The LMA Collector row provides information about the Heka process In particular it is possible to visualize theprocessing time allocated to the Lua plugins and the amount of messages that have been processed as well as theamount of system resources consumed by the Heka process

Again it is possible to select a particular node using the dropdown menu list

The Collectd row provides system resource usage information allocated to the collectd process

The InfluxDB row provides system resource usage information allocated to the InfluxDB application

The Grafana row provides system resource usage information allocated to the Grafana application

The Elasticsearch row provides system resource usage information allocated to the JVM process running the Elastic-search application

Other Dashboards

In total there are 16 different dashboards you can use to explore different time-series facettes of your OpenStackenvironment

Viewing Faults and Anomalies

The LMA-Toolchain is capable of detecting a number of service-affecting conditions such as the faults and anomaliesthat occured in your OpenStack environment Those conditions are reported in annotations that are displayed inGrafana The Grafana annotations contain a textual representation of the alarm (or set of alarms) that were triggeredby the Collectors for a service In other words the annotations contain valuable insights that you could use to diagnoseand troubleshoot problems Futhermore with the Grafana annotations the system makes a distiction between whatis estimated as a direct root cause versus what is estimated as an indirect root cause This is internally represented ina dependency graph There are first degree dependencies that are used to describe situations whereby the healthstate of an entity strictly depends on the health state of another entity For example Nova as a service has firstdegree dependencies with the nova-api endpoints and the nova-scheduler workers But there are also second degreedependencies whereby the health state of an entity doesnrsquot strictly depends on the heath state of another entity althoughit might be depending on the operation being performed For example by default we declared that Nova has a seconddegree dependency with Neutron As a result the health state of Nova will not be directly impacted by the healthstate of Neutron but the annotation will provide a root cause analysis hint For example letrsquos assume a situation whereNova has changed a state from okay to critical (because of 5xx HTTP errors) and that Neutron has been in down state

14 User Guide 14

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

for a while In this case the Nova dashboard will display an annotation that says Nova has changed a state to warningbecause the system has detected 5xx errors and that it may be due to the fact that Neutron is down An example ofwhat an annotation looks like is shown below

14 User Guide 15

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

14 User Guide 16

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

144 Troubleshooting

If you get no data in Grafana follow these troubleshooting tips

1 First check that the LMA Collector is running properly by following the LMA Collector troubleshooting in-structions in the LMA Collector Fuel Plugin User Guide

2 Check that the nodes are able to connect to the InfluxDB server on port 8086

3 Check that InfluxDB is running

[rootnode-37 ~] etcinitdinfluxdb statusinfluxdb Process is running [ OK ]

4 If InfluxDB is down restart it

[rootnode-37 ~] etcinitdinfluxdb startStarting the process influxdb [ OK ]influxdb process was started [ OK ]

5 Check that Grafana is running

[rootnode-37 ~] etcinitdgrafana-server status

grafana is running

6 If Grafana is down restart it

[rootnode-37 ~] etcinitdgrafana-server start

Starting Grafana Server

7 If none of the above solves the problem check the logs in varloginfluxdbinfluxdblog andvarloggrafanagrafanalog to find out what might have gone wrong

15 Licenses

151 Third Party Components

Name Project Web Site LicenseInfluxDB httpsinfluxdbcom MITGrafana httpgrafanaorg Apache v2

152 Puppet modules

Name Project Web Site LicenseApt httpsgithubcompuppetlabspuppetlabs-apt Apache V2Concat httpsgithubcompuppetlabspuppetlabs-concat Apache V2Stdlib httpsgithubcompuppetlabspuppetlabs-stdlib Apache V2IniFile httpsgithubcompuppetlabspuppetlabs-inifile Apache v2Grafana httpsgithubcombfraserpuppet-grafana Apache v2

15 Licenses 17

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

16 Appendix

bull The InfluxDB-Grafana plugin project at GitHub

bull The official InfluxDB documentation

bull The official Grafana documentation

16 Appendix 18

CHAPTER 2

Indices and Tables

bull search

19

  • User documentation
    • Overview
    • Release Notes
    • Installation Guide
    • User Guide
    • Licenses
    • Appendix
      • Indices and Tables
Page 10: The InfluxDB-Grafana plugin for Fuel Documentation · 2019-04-02 · The InfluxDB-Grafana plugin for Fuel Documentation, Release 0.8.0 1.4.3Exploring your time-series with Grafana

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

[rootfuel ~] fuel nodesid | status | name | cluster | ip | | roles | ---|----------|------|---------|-----------|-----|-------------------|----37 | ready | lma | 38 | 102004 | | influxdb_grafana |

[Skip ]

On that node (node-37 in this example) the influx command should be available via the CLI Executing influx willstart an interactive CLI and automatically connect to the local InfluxDB server

[rootnode-37 ~] optinfluxdbinflux -database lma -password lmapass --username lmaConnected to httplocalhost8086 version 0942InfluxDB shell 0942gt

Then if you type

gt show series

You should see a dump of all the time-series collected so far

[ Skip]

name swap_used---------------_key deployment_id hostnameswap_useddeployment_id=38hostname=node-40 38 node-40swap_useddeployment_id=38hostname=node-42 38 node-42swap_useddeployment_id=38hostname=node-41 38 node-41swap_useddeployment_id=38hostname=node-43 38 node-43swap_useddeployment_id=38hostname=node-38 38 node-38swap_useddeployment_id=38hostname=node-37 38 node-37swap_useddeployment_id=38hostname=node-36 38 node-36

name total_threads_created---------------------------_key deployment_id hostnametotal_threads_createddeployment_id=38hostname=node-38 38 node-38total_threads_createddeployment_id=38hostname=node-37 38 node-37total_threads_createddeployment_id=38hostname=node-36 38 node-36

Verifying Grafana

The Grafana user interface runs on port 8000 Pointing your browser to the URL httpltHOSTgt8000 you should seethe Grafana login page

14 User Guide 8

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

You should be redirected to the Grafana Home Page The first time you access Grafana you are requested to authen-ticate using the credentials you have defined in the settings Once you have authenticated successfully you should beautomatically redirected to the Home Page from where you can select a dashboard as shown below

14 User Guide 9

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

143 Exploring your time-series with Grafana

The InfluxDB-Grafana Plugin comes with a collection of predefined dashboards you can use to visualize the time-series that are stored in InfluxDB There is one primary dashboard called the Main Dashboard and several otherdashboards that are organized per service name

The Main Dashboard

We suggest you start with the Main Dashboard as shown below The Main Dashboard provides a single pane of glassto visualize the health status of all the OpenStack services being monitored such as Nova or Cinder but also HAProxyMySQL and RabbitMQ to name a few

14 User Guide 10

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

As you can see the Main Dashboard (as most dashboards) provides a drop down menu list in the upper left corner ofthe window from where you can select a metric tag (aka dimension) such as a controller name or device name youwant to visualize In the example above we say we want to visualize the system time-series for node-48

Within the OpenStack Services row each of the services represented can be assigned five different states

Note The precise determination of a service state depends on the Global Status Evaluation (GSE) policies definedfor the GSE Plugins

The meaning associated with a service health state is the following

bull Down One or several primary functions of a service cluster are failed For example all API endpoints of aservice cluster like Nova or Cinder are failed

bull Critical One or several primary functions of a service cluster are severely degraded The quality of servicedelivered to the end-user should be severely impacted

bull Warning One or several primary functions of a service cluster are slightly degraded The quality of servicedelivered to the end-user should be slightly impacted

bull Unknown There is not enough data to infer the actual health state of a service cluster

bull Okay None of the above was found to be true

The Virtual Compute Resources row provides an overview of the amount of virtual resources being used by the com-pute nodes including the number of virtual CPUs the amount of memory and disk space being used as well as theamount of virtual resources remaining available to create new instances

14 User Guide 11

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

The ldquoSystemrdquo row provides an overview of the amount of physical resources being used on the control plane (thecontroller cluster) You can select a specific controller using the controllerrsquos drop down list in the left corner of thetoolbar

The ldquoCephrdquo row provides an overview of the resources usage and current health state of the Ceph cluster when it isdeployed in the OpenStack environment

The Main Dashboard is also an entry point to access detailed dashboards for each of the OpenStack services beingmonitored For example if you click through the Nova box you should see a screen like this

14 User Guide 12

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

14 User Guide 13

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

The Nova Dashboard

The Nova Dashboard provides a detailed view of the Nova servicersquos related metrics

The Service Status row provides information about the Nova service cluster health state as a whole including the stateof the API frontend (the HAProxy plubic VIP) a counter of HTTP 5xx errors the HTTP requests response time andstatus code

The Nova API row provides information about the health state of the API backends (nova-api ec2-api ) the state ofthe workers and compute nodes

The Instance row provides information about the number of active instances instances in error and instances creationtime statistics

The ldquoResourcesrdquo row provides various virtual resources usage indicators

The LMA Self-Monitoring Dashboard

The LMA Self-Monitoring Dashboard is a new dashboard in LMA 08 This dashboard provides an overview of howthe LMA Toolchain performs overall

The LMA Collector row provides information about the Heka process In particular it is possible to visualize theprocessing time allocated to the Lua plugins and the amount of messages that have been processed as well as theamount of system resources consumed by the Heka process

Again it is possible to select a particular node using the dropdown menu list

The Collectd row provides system resource usage information allocated to the collectd process

The InfluxDB row provides system resource usage information allocated to the InfluxDB application

The Grafana row provides system resource usage information allocated to the Grafana application

The Elasticsearch row provides system resource usage information allocated to the JVM process running the Elastic-search application

Other Dashboards

In total there are 16 different dashboards you can use to explore different time-series facettes of your OpenStackenvironment

Viewing Faults and Anomalies

The LMA-Toolchain is capable of detecting a number of service-affecting conditions such as the faults and anomaliesthat occured in your OpenStack environment Those conditions are reported in annotations that are displayed inGrafana The Grafana annotations contain a textual representation of the alarm (or set of alarms) that were triggeredby the Collectors for a service In other words the annotations contain valuable insights that you could use to diagnoseand troubleshoot problems Futhermore with the Grafana annotations the system makes a distiction between whatis estimated as a direct root cause versus what is estimated as an indirect root cause This is internally represented ina dependency graph There are first degree dependencies that are used to describe situations whereby the healthstate of an entity strictly depends on the health state of another entity For example Nova as a service has firstdegree dependencies with the nova-api endpoints and the nova-scheduler workers But there are also second degreedependencies whereby the health state of an entity doesnrsquot strictly depends on the heath state of another entity althoughit might be depending on the operation being performed For example by default we declared that Nova has a seconddegree dependency with Neutron As a result the health state of Nova will not be directly impacted by the healthstate of Neutron but the annotation will provide a root cause analysis hint For example letrsquos assume a situation whereNova has changed a state from okay to critical (because of 5xx HTTP errors) and that Neutron has been in down state

14 User Guide 14

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

for a while In this case the Nova dashboard will display an annotation that says Nova has changed a state to warningbecause the system has detected 5xx errors and that it may be due to the fact that Neutron is down An example ofwhat an annotation looks like is shown below

14 User Guide 15

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

14 User Guide 16

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

144 Troubleshooting

If you get no data in Grafana follow these troubleshooting tips

1 First check that the LMA Collector is running properly by following the LMA Collector troubleshooting in-structions in the LMA Collector Fuel Plugin User Guide

2 Check that the nodes are able to connect to the InfluxDB server on port 8086

3 Check that InfluxDB is running

[rootnode-37 ~] etcinitdinfluxdb statusinfluxdb Process is running [ OK ]

4 If InfluxDB is down restart it

[rootnode-37 ~] etcinitdinfluxdb startStarting the process influxdb [ OK ]influxdb process was started [ OK ]

5 Check that Grafana is running

[rootnode-37 ~] etcinitdgrafana-server status

grafana is running

6 If Grafana is down restart it

[rootnode-37 ~] etcinitdgrafana-server start

Starting Grafana Server

7 If none of the above solves the problem check the logs in varloginfluxdbinfluxdblog andvarloggrafanagrafanalog to find out what might have gone wrong

15 Licenses

151 Third Party Components

Name Project Web Site LicenseInfluxDB httpsinfluxdbcom MITGrafana httpgrafanaorg Apache v2

152 Puppet modules

Name Project Web Site LicenseApt httpsgithubcompuppetlabspuppetlabs-apt Apache V2Concat httpsgithubcompuppetlabspuppetlabs-concat Apache V2Stdlib httpsgithubcompuppetlabspuppetlabs-stdlib Apache V2IniFile httpsgithubcompuppetlabspuppetlabs-inifile Apache v2Grafana httpsgithubcombfraserpuppet-grafana Apache v2

15 Licenses 17

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

16 Appendix

bull The InfluxDB-Grafana plugin project at GitHub

bull The official InfluxDB documentation

bull The official Grafana documentation

16 Appendix 18

CHAPTER 2

Indices and Tables

bull search

19

  • User documentation
    • Overview
    • Release Notes
    • Installation Guide
    • User Guide
    • Licenses
    • Appendix
      • Indices and Tables
Page 11: The InfluxDB-Grafana plugin for Fuel Documentation · 2019-04-02 · The InfluxDB-Grafana plugin for Fuel Documentation, Release 0.8.0 1.4.3Exploring your time-series with Grafana

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

You should be redirected to the Grafana Home Page The first time you access Grafana you are requested to authen-ticate using the credentials you have defined in the settings Once you have authenticated successfully you should beautomatically redirected to the Home Page from where you can select a dashboard as shown below

14 User Guide 9

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

143 Exploring your time-series with Grafana

The InfluxDB-Grafana Plugin comes with a collection of predefined dashboards you can use to visualize the time-series that are stored in InfluxDB There is one primary dashboard called the Main Dashboard and several otherdashboards that are organized per service name

The Main Dashboard

We suggest you start with the Main Dashboard as shown below The Main Dashboard provides a single pane of glassto visualize the health status of all the OpenStack services being monitored such as Nova or Cinder but also HAProxyMySQL and RabbitMQ to name a few

14 User Guide 10

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

As you can see the Main Dashboard (as most dashboards) provides a drop down menu list in the upper left corner ofthe window from where you can select a metric tag (aka dimension) such as a controller name or device name youwant to visualize In the example above we say we want to visualize the system time-series for node-48

Within the OpenStack Services row each of the services represented can be assigned five different states

Note The precise determination of a service state depends on the Global Status Evaluation (GSE) policies definedfor the GSE Plugins

The meaning associated with a service health state is the following

bull Down One or several primary functions of a service cluster are failed For example all API endpoints of aservice cluster like Nova or Cinder are failed

bull Critical One or several primary functions of a service cluster are severely degraded The quality of servicedelivered to the end-user should be severely impacted

bull Warning One or several primary functions of a service cluster are slightly degraded The quality of servicedelivered to the end-user should be slightly impacted

bull Unknown There is not enough data to infer the actual health state of a service cluster

bull Okay None of the above was found to be true

The Virtual Compute Resources row provides an overview of the amount of virtual resources being used by the com-pute nodes including the number of virtual CPUs the amount of memory and disk space being used as well as theamount of virtual resources remaining available to create new instances

14 User Guide 11

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

The ldquoSystemrdquo row provides an overview of the amount of physical resources being used on the control plane (thecontroller cluster) You can select a specific controller using the controllerrsquos drop down list in the left corner of thetoolbar

The ldquoCephrdquo row provides an overview of the resources usage and current health state of the Ceph cluster when it isdeployed in the OpenStack environment

The Main Dashboard is also an entry point to access detailed dashboards for each of the OpenStack services beingmonitored For example if you click through the Nova box you should see a screen like this

14 User Guide 12

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

14 User Guide 13

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

The Nova Dashboard

The Nova Dashboard provides a detailed view of the Nova servicersquos related metrics

The Service Status row provides information about the Nova service cluster health state as a whole including the stateof the API frontend (the HAProxy plubic VIP) a counter of HTTP 5xx errors the HTTP requests response time andstatus code

The Nova API row provides information about the health state of the API backends (nova-api ec2-api ) the state ofthe workers and compute nodes

The Instance row provides information about the number of active instances instances in error and instances creationtime statistics

The ldquoResourcesrdquo row provides various virtual resources usage indicators

The LMA Self-Monitoring Dashboard

The LMA Self-Monitoring Dashboard is a new dashboard in LMA 08 This dashboard provides an overview of howthe LMA Toolchain performs overall

The LMA Collector row provides information about the Heka process In particular it is possible to visualize theprocessing time allocated to the Lua plugins and the amount of messages that have been processed as well as theamount of system resources consumed by the Heka process

Again it is possible to select a particular node using the dropdown menu list

The Collectd row provides system resource usage information allocated to the collectd process

The InfluxDB row provides system resource usage information allocated to the InfluxDB application

The Grafana row provides system resource usage information allocated to the Grafana application

The Elasticsearch row provides system resource usage information allocated to the JVM process running the Elastic-search application

Other Dashboards

In total there are 16 different dashboards you can use to explore different time-series facettes of your OpenStackenvironment

Viewing Faults and Anomalies

The LMA-Toolchain is capable of detecting a number of service-affecting conditions such as the faults and anomaliesthat occured in your OpenStack environment Those conditions are reported in annotations that are displayed inGrafana The Grafana annotations contain a textual representation of the alarm (or set of alarms) that were triggeredby the Collectors for a service In other words the annotations contain valuable insights that you could use to diagnoseand troubleshoot problems Futhermore with the Grafana annotations the system makes a distiction between whatis estimated as a direct root cause versus what is estimated as an indirect root cause This is internally represented ina dependency graph There are first degree dependencies that are used to describe situations whereby the healthstate of an entity strictly depends on the health state of another entity For example Nova as a service has firstdegree dependencies with the nova-api endpoints and the nova-scheduler workers But there are also second degreedependencies whereby the health state of an entity doesnrsquot strictly depends on the heath state of another entity althoughit might be depending on the operation being performed For example by default we declared that Nova has a seconddegree dependency with Neutron As a result the health state of Nova will not be directly impacted by the healthstate of Neutron but the annotation will provide a root cause analysis hint For example letrsquos assume a situation whereNova has changed a state from okay to critical (because of 5xx HTTP errors) and that Neutron has been in down state

14 User Guide 14

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

for a while In this case the Nova dashboard will display an annotation that says Nova has changed a state to warningbecause the system has detected 5xx errors and that it may be due to the fact that Neutron is down An example ofwhat an annotation looks like is shown below

14 User Guide 15

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

14 User Guide 16

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

144 Troubleshooting

If you get no data in Grafana follow these troubleshooting tips

1 First check that the LMA Collector is running properly by following the LMA Collector troubleshooting in-structions in the LMA Collector Fuel Plugin User Guide

2 Check that the nodes are able to connect to the InfluxDB server on port 8086

3 Check that InfluxDB is running

[rootnode-37 ~] etcinitdinfluxdb statusinfluxdb Process is running [ OK ]

4 If InfluxDB is down restart it

[rootnode-37 ~] etcinitdinfluxdb startStarting the process influxdb [ OK ]influxdb process was started [ OK ]

5 Check that Grafana is running

[rootnode-37 ~] etcinitdgrafana-server status

grafana is running

6 If Grafana is down restart it

[rootnode-37 ~] etcinitdgrafana-server start

Starting Grafana Server

7 If none of the above solves the problem check the logs in varloginfluxdbinfluxdblog andvarloggrafanagrafanalog to find out what might have gone wrong

15 Licenses

151 Third Party Components

Name Project Web Site LicenseInfluxDB httpsinfluxdbcom MITGrafana httpgrafanaorg Apache v2

152 Puppet modules

Name Project Web Site LicenseApt httpsgithubcompuppetlabspuppetlabs-apt Apache V2Concat httpsgithubcompuppetlabspuppetlabs-concat Apache V2Stdlib httpsgithubcompuppetlabspuppetlabs-stdlib Apache V2IniFile httpsgithubcompuppetlabspuppetlabs-inifile Apache v2Grafana httpsgithubcombfraserpuppet-grafana Apache v2

15 Licenses 17

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

16 Appendix

bull The InfluxDB-Grafana plugin project at GitHub

bull The official InfluxDB documentation

bull The official Grafana documentation

16 Appendix 18

CHAPTER 2

Indices and Tables

bull search

19

  • User documentation
    • Overview
    • Release Notes
    • Installation Guide
    • User Guide
    • Licenses
    • Appendix
      • Indices and Tables
Page 12: The InfluxDB-Grafana plugin for Fuel Documentation · 2019-04-02 · The InfluxDB-Grafana plugin for Fuel Documentation, Release 0.8.0 1.4.3Exploring your time-series with Grafana

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

143 Exploring your time-series with Grafana

The InfluxDB-Grafana Plugin comes with a collection of predefined dashboards you can use to visualize the time-series that are stored in InfluxDB There is one primary dashboard called the Main Dashboard and several otherdashboards that are organized per service name

The Main Dashboard

We suggest you start with the Main Dashboard as shown below The Main Dashboard provides a single pane of glassto visualize the health status of all the OpenStack services being monitored such as Nova or Cinder but also HAProxyMySQL and RabbitMQ to name a few

14 User Guide 10

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

As you can see the Main Dashboard (as most dashboards) provides a drop down menu list in the upper left corner ofthe window from where you can select a metric tag (aka dimension) such as a controller name or device name youwant to visualize In the example above we say we want to visualize the system time-series for node-48

Within the OpenStack Services row each of the services represented can be assigned five different states

Note The precise determination of a service state depends on the Global Status Evaluation (GSE) policies definedfor the GSE Plugins

The meaning associated with a service health state is the following

bull Down One or several primary functions of a service cluster are failed For example all API endpoints of aservice cluster like Nova or Cinder are failed

bull Critical One or several primary functions of a service cluster are severely degraded The quality of servicedelivered to the end-user should be severely impacted

bull Warning One or several primary functions of a service cluster are slightly degraded The quality of servicedelivered to the end-user should be slightly impacted

bull Unknown There is not enough data to infer the actual health state of a service cluster

bull Okay None of the above was found to be true

The Virtual Compute Resources row provides an overview of the amount of virtual resources being used by the com-pute nodes including the number of virtual CPUs the amount of memory and disk space being used as well as theamount of virtual resources remaining available to create new instances

14 User Guide 11

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

The ldquoSystemrdquo row provides an overview of the amount of physical resources being used on the control plane (thecontroller cluster) You can select a specific controller using the controllerrsquos drop down list in the left corner of thetoolbar

The ldquoCephrdquo row provides an overview of the resources usage and current health state of the Ceph cluster when it isdeployed in the OpenStack environment

The Main Dashboard is also an entry point to access detailed dashboards for each of the OpenStack services beingmonitored For example if you click through the Nova box you should see a screen like this

14 User Guide 12

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

14 User Guide 13

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

The Nova Dashboard

The Nova Dashboard provides a detailed view of the Nova servicersquos related metrics

The Service Status row provides information about the Nova service cluster health state as a whole including the stateof the API frontend (the HAProxy plubic VIP) a counter of HTTP 5xx errors the HTTP requests response time andstatus code

The Nova API row provides information about the health state of the API backends (nova-api ec2-api ) the state ofthe workers and compute nodes

The Instance row provides information about the number of active instances instances in error and instances creationtime statistics

The ldquoResourcesrdquo row provides various virtual resources usage indicators

The LMA Self-Monitoring Dashboard

The LMA Self-Monitoring Dashboard is a new dashboard in LMA 08 This dashboard provides an overview of howthe LMA Toolchain performs overall

The LMA Collector row provides information about the Heka process In particular it is possible to visualize theprocessing time allocated to the Lua plugins and the amount of messages that have been processed as well as theamount of system resources consumed by the Heka process

Again it is possible to select a particular node using the dropdown menu list

The Collectd row provides system resource usage information allocated to the collectd process

The InfluxDB row provides system resource usage information allocated to the InfluxDB application

The Grafana row provides system resource usage information allocated to the Grafana application

The Elasticsearch row provides system resource usage information allocated to the JVM process running the Elastic-search application

Other Dashboards

In total there are 16 different dashboards you can use to explore different time-series facettes of your OpenStackenvironment

Viewing Faults and Anomalies

The LMA-Toolchain is capable of detecting a number of service-affecting conditions such as the faults and anomaliesthat occured in your OpenStack environment Those conditions are reported in annotations that are displayed inGrafana The Grafana annotations contain a textual representation of the alarm (or set of alarms) that were triggeredby the Collectors for a service In other words the annotations contain valuable insights that you could use to diagnoseand troubleshoot problems Futhermore with the Grafana annotations the system makes a distiction between whatis estimated as a direct root cause versus what is estimated as an indirect root cause This is internally represented ina dependency graph There are first degree dependencies that are used to describe situations whereby the healthstate of an entity strictly depends on the health state of another entity For example Nova as a service has firstdegree dependencies with the nova-api endpoints and the nova-scheduler workers But there are also second degreedependencies whereby the health state of an entity doesnrsquot strictly depends on the heath state of another entity althoughit might be depending on the operation being performed For example by default we declared that Nova has a seconddegree dependency with Neutron As a result the health state of Nova will not be directly impacted by the healthstate of Neutron but the annotation will provide a root cause analysis hint For example letrsquos assume a situation whereNova has changed a state from okay to critical (because of 5xx HTTP errors) and that Neutron has been in down state

14 User Guide 14

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

for a while In this case the Nova dashboard will display an annotation that says Nova has changed a state to warningbecause the system has detected 5xx errors and that it may be due to the fact that Neutron is down An example ofwhat an annotation looks like is shown below

14 User Guide 15

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

14 User Guide 16

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

144 Troubleshooting

If you get no data in Grafana follow these troubleshooting tips

1 First check that the LMA Collector is running properly by following the LMA Collector troubleshooting in-structions in the LMA Collector Fuel Plugin User Guide

2 Check that the nodes are able to connect to the InfluxDB server on port 8086

3 Check that InfluxDB is running

[rootnode-37 ~] etcinitdinfluxdb statusinfluxdb Process is running [ OK ]

4 If InfluxDB is down restart it

[rootnode-37 ~] etcinitdinfluxdb startStarting the process influxdb [ OK ]influxdb process was started [ OK ]

5 Check that Grafana is running

[rootnode-37 ~] etcinitdgrafana-server status

grafana is running

6 If Grafana is down restart it

[rootnode-37 ~] etcinitdgrafana-server start

Starting Grafana Server

7 If none of the above solves the problem check the logs in varloginfluxdbinfluxdblog andvarloggrafanagrafanalog to find out what might have gone wrong

15 Licenses

151 Third Party Components

Name Project Web Site LicenseInfluxDB httpsinfluxdbcom MITGrafana httpgrafanaorg Apache v2

152 Puppet modules

Name Project Web Site LicenseApt httpsgithubcompuppetlabspuppetlabs-apt Apache V2Concat httpsgithubcompuppetlabspuppetlabs-concat Apache V2Stdlib httpsgithubcompuppetlabspuppetlabs-stdlib Apache V2IniFile httpsgithubcompuppetlabspuppetlabs-inifile Apache v2Grafana httpsgithubcombfraserpuppet-grafana Apache v2

15 Licenses 17

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

16 Appendix

bull The InfluxDB-Grafana plugin project at GitHub

bull The official InfluxDB documentation

bull The official Grafana documentation

16 Appendix 18

CHAPTER 2

Indices and Tables

bull search

19

  • User documentation
    • Overview
    • Release Notes
    • Installation Guide
    • User Guide
    • Licenses
    • Appendix
      • Indices and Tables
Page 13: The InfluxDB-Grafana plugin for Fuel Documentation · 2019-04-02 · The InfluxDB-Grafana plugin for Fuel Documentation, Release 0.8.0 1.4.3Exploring your time-series with Grafana

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

As you can see the Main Dashboard (as most dashboards) provides a drop down menu list in the upper left corner ofthe window from where you can select a metric tag (aka dimension) such as a controller name or device name youwant to visualize In the example above we say we want to visualize the system time-series for node-48

Within the OpenStack Services row each of the services represented can be assigned five different states

Note The precise determination of a service state depends on the Global Status Evaluation (GSE) policies definedfor the GSE Plugins

The meaning associated with a service health state is the following

bull Down One or several primary functions of a service cluster are failed For example all API endpoints of aservice cluster like Nova or Cinder are failed

bull Critical One or several primary functions of a service cluster are severely degraded The quality of servicedelivered to the end-user should be severely impacted

bull Warning One or several primary functions of a service cluster are slightly degraded The quality of servicedelivered to the end-user should be slightly impacted

bull Unknown There is not enough data to infer the actual health state of a service cluster

bull Okay None of the above was found to be true

The Virtual Compute Resources row provides an overview of the amount of virtual resources being used by the com-pute nodes including the number of virtual CPUs the amount of memory and disk space being used as well as theamount of virtual resources remaining available to create new instances

14 User Guide 11

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

The ldquoSystemrdquo row provides an overview of the amount of physical resources being used on the control plane (thecontroller cluster) You can select a specific controller using the controllerrsquos drop down list in the left corner of thetoolbar

The ldquoCephrdquo row provides an overview of the resources usage and current health state of the Ceph cluster when it isdeployed in the OpenStack environment

The Main Dashboard is also an entry point to access detailed dashboards for each of the OpenStack services beingmonitored For example if you click through the Nova box you should see a screen like this

14 User Guide 12

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

14 User Guide 13

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

The Nova Dashboard

The Nova Dashboard provides a detailed view of the Nova servicersquos related metrics

The Service Status row provides information about the Nova service cluster health state as a whole including the stateof the API frontend (the HAProxy plubic VIP) a counter of HTTP 5xx errors the HTTP requests response time andstatus code

The Nova API row provides information about the health state of the API backends (nova-api ec2-api ) the state ofthe workers and compute nodes

The Instance row provides information about the number of active instances instances in error and instances creationtime statistics

The ldquoResourcesrdquo row provides various virtual resources usage indicators

The LMA Self-Monitoring Dashboard

The LMA Self-Monitoring Dashboard is a new dashboard in LMA 08 This dashboard provides an overview of howthe LMA Toolchain performs overall

The LMA Collector row provides information about the Heka process In particular it is possible to visualize theprocessing time allocated to the Lua plugins and the amount of messages that have been processed as well as theamount of system resources consumed by the Heka process

Again it is possible to select a particular node using the dropdown menu list

The Collectd row provides system resource usage information allocated to the collectd process

The InfluxDB row provides system resource usage information allocated to the InfluxDB application

The Grafana row provides system resource usage information allocated to the Grafana application

The Elasticsearch row provides system resource usage information allocated to the JVM process running the Elastic-search application

Other Dashboards

In total there are 16 different dashboards you can use to explore different time-series facettes of your OpenStackenvironment

Viewing Faults and Anomalies

The LMA-Toolchain is capable of detecting a number of service-affecting conditions such as the faults and anomaliesthat occured in your OpenStack environment Those conditions are reported in annotations that are displayed inGrafana The Grafana annotations contain a textual representation of the alarm (or set of alarms) that were triggeredby the Collectors for a service In other words the annotations contain valuable insights that you could use to diagnoseand troubleshoot problems Futhermore with the Grafana annotations the system makes a distiction between whatis estimated as a direct root cause versus what is estimated as an indirect root cause This is internally represented ina dependency graph There are first degree dependencies that are used to describe situations whereby the healthstate of an entity strictly depends on the health state of another entity For example Nova as a service has firstdegree dependencies with the nova-api endpoints and the nova-scheduler workers But there are also second degreedependencies whereby the health state of an entity doesnrsquot strictly depends on the heath state of another entity althoughit might be depending on the operation being performed For example by default we declared that Nova has a seconddegree dependency with Neutron As a result the health state of Nova will not be directly impacted by the healthstate of Neutron but the annotation will provide a root cause analysis hint For example letrsquos assume a situation whereNova has changed a state from okay to critical (because of 5xx HTTP errors) and that Neutron has been in down state

14 User Guide 14

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

for a while In this case the Nova dashboard will display an annotation that says Nova has changed a state to warningbecause the system has detected 5xx errors and that it may be due to the fact that Neutron is down An example ofwhat an annotation looks like is shown below

14 User Guide 15

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

14 User Guide 16

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

144 Troubleshooting

If you get no data in Grafana follow these troubleshooting tips

1 First check that the LMA Collector is running properly by following the LMA Collector troubleshooting in-structions in the LMA Collector Fuel Plugin User Guide

2 Check that the nodes are able to connect to the InfluxDB server on port 8086

3 Check that InfluxDB is running

[rootnode-37 ~] etcinitdinfluxdb statusinfluxdb Process is running [ OK ]

4 If InfluxDB is down restart it

[rootnode-37 ~] etcinitdinfluxdb startStarting the process influxdb [ OK ]influxdb process was started [ OK ]

5 Check that Grafana is running

[rootnode-37 ~] etcinitdgrafana-server status

grafana is running

6 If Grafana is down restart it

[rootnode-37 ~] etcinitdgrafana-server start

Starting Grafana Server

7 If none of the above solves the problem check the logs in varloginfluxdbinfluxdblog andvarloggrafanagrafanalog to find out what might have gone wrong

15 Licenses

151 Third Party Components

Name Project Web Site LicenseInfluxDB httpsinfluxdbcom MITGrafana httpgrafanaorg Apache v2

152 Puppet modules

Name Project Web Site LicenseApt httpsgithubcompuppetlabspuppetlabs-apt Apache V2Concat httpsgithubcompuppetlabspuppetlabs-concat Apache V2Stdlib httpsgithubcompuppetlabspuppetlabs-stdlib Apache V2IniFile httpsgithubcompuppetlabspuppetlabs-inifile Apache v2Grafana httpsgithubcombfraserpuppet-grafana Apache v2

15 Licenses 17

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

16 Appendix

bull The InfluxDB-Grafana plugin project at GitHub

bull The official InfluxDB documentation

bull The official Grafana documentation

16 Appendix 18

CHAPTER 2

Indices and Tables

bull search

19

  • User documentation
    • Overview
    • Release Notes
    • Installation Guide
    • User Guide
    • Licenses
    • Appendix
      • Indices and Tables
Page 14: The InfluxDB-Grafana plugin for Fuel Documentation · 2019-04-02 · The InfluxDB-Grafana plugin for Fuel Documentation, Release 0.8.0 1.4.3Exploring your time-series with Grafana

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

The ldquoSystemrdquo row provides an overview of the amount of physical resources being used on the control plane (thecontroller cluster) You can select a specific controller using the controllerrsquos drop down list in the left corner of thetoolbar

The ldquoCephrdquo row provides an overview of the resources usage and current health state of the Ceph cluster when it isdeployed in the OpenStack environment

The Main Dashboard is also an entry point to access detailed dashboards for each of the OpenStack services beingmonitored For example if you click through the Nova box you should see a screen like this

14 User Guide 12

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

14 User Guide 13

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

The Nova Dashboard

The Nova Dashboard provides a detailed view of the Nova servicersquos related metrics

The Service Status row provides information about the Nova service cluster health state as a whole including the stateof the API frontend (the HAProxy plubic VIP) a counter of HTTP 5xx errors the HTTP requests response time andstatus code

The Nova API row provides information about the health state of the API backends (nova-api ec2-api ) the state ofthe workers and compute nodes

The Instance row provides information about the number of active instances instances in error and instances creationtime statistics

The ldquoResourcesrdquo row provides various virtual resources usage indicators

The LMA Self-Monitoring Dashboard

The LMA Self-Monitoring Dashboard is a new dashboard in LMA 08 This dashboard provides an overview of howthe LMA Toolchain performs overall

The LMA Collector row provides information about the Heka process In particular it is possible to visualize theprocessing time allocated to the Lua plugins and the amount of messages that have been processed as well as theamount of system resources consumed by the Heka process

Again it is possible to select a particular node using the dropdown menu list

The Collectd row provides system resource usage information allocated to the collectd process

The InfluxDB row provides system resource usage information allocated to the InfluxDB application

The Grafana row provides system resource usage information allocated to the Grafana application

The Elasticsearch row provides system resource usage information allocated to the JVM process running the Elastic-search application

Other Dashboards

In total there are 16 different dashboards you can use to explore different time-series facettes of your OpenStackenvironment

Viewing Faults and Anomalies

The LMA-Toolchain is capable of detecting a number of service-affecting conditions such as the faults and anomaliesthat occured in your OpenStack environment Those conditions are reported in annotations that are displayed inGrafana The Grafana annotations contain a textual representation of the alarm (or set of alarms) that were triggeredby the Collectors for a service In other words the annotations contain valuable insights that you could use to diagnoseand troubleshoot problems Futhermore with the Grafana annotations the system makes a distiction between whatis estimated as a direct root cause versus what is estimated as an indirect root cause This is internally represented ina dependency graph There are first degree dependencies that are used to describe situations whereby the healthstate of an entity strictly depends on the health state of another entity For example Nova as a service has firstdegree dependencies with the nova-api endpoints and the nova-scheduler workers But there are also second degreedependencies whereby the health state of an entity doesnrsquot strictly depends on the heath state of another entity althoughit might be depending on the operation being performed For example by default we declared that Nova has a seconddegree dependency with Neutron As a result the health state of Nova will not be directly impacted by the healthstate of Neutron but the annotation will provide a root cause analysis hint For example letrsquos assume a situation whereNova has changed a state from okay to critical (because of 5xx HTTP errors) and that Neutron has been in down state

14 User Guide 14

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

for a while In this case the Nova dashboard will display an annotation that says Nova has changed a state to warningbecause the system has detected 5xx errors and that it may be due to the fact that Neutron is down An example ofwhat an annotation looks like is shown below

14 User Guide 15

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

14 User Guide 16

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

144 Troubleshooting

If you get no data in Grafana follow these troubleshooting tips

1 First check that the LMA Collector is running properly by following the LMA Collector troubleshooting in-structions in the LMA Collector Fuel Plugin User Guide

2 Check that the nodes are able to connect to the InfluxDB server on port 8086

3 Check that InfluxDB is running

[rootnode-37 ~] etcinitdinfluxdb statusinfluxdb Process is running [ OK ]

4 If InfluxDB is down restart it

[rootnode-37 ~] etcinitdinfluxdb startStarting the process influxdb [ OK ]influxdb process was started [ OK ]

5 Check that Grafana is running

[rootnode-37 ~] etcinitdgrafana-server status

grafana is running

6 If Grafana is down restart it

[rootnode-37 ~] etcinitdgrafana-server start

Starting Grafana Server

7 If none of the above solves the problem check the logs in varloginfluxdbinfluxdblog andvarloggrafanagrafanalog to find out what might have gone wrong

15 Licenses

151 Third Party Components

Name Project Web Site LicenseInfluxDB httpsinfluxdbcom MITGrafana httpgrafanaorg Apache v2

152 Puppet modules

Name Project Web Site LicenseApt httpsgithubcompuppetlabspuppetlabs-apt Apache V2Concat httpsgithubcompuppetlabspuppetlabs-concat Apache V2Stdlib httpsgithubcompuppetlabspuppetlabs-stdlib Apache V2IniFile httpsgithubcompuppetlabspuppetlabs-inifile Apache v2Grafana httpsgithubcombfraserpuppet-grafana Apache v2

15 Licenses 17

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

16 Appendix

bull The InfluxDB-Grafana plugin project at GitHub

bull The official InfluxDB documentation

bull The official Grafana documentation

16 Appendix 18

CHAPTER 2

Indices and Tables

bull search

19

  • User documentation
    • Overview
    • Release Notes
    • Installation Guide
    • User Guide
    • Licenses
    • Appendix
      • Indices and Tables
Page 15: The InfluxDB-Grafana plugin for Fuel Documentation · 2019-04-02 · The InfluxDB-Grafana plugin for Fuel Documentation, Release 0.8.0 1.4.3Exploring your time-series with Grafana

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

14 User Guide 13

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

The Nova Dashboard

The Nova Dashboard provides a detailed view of the Nova servicersquos related metrics

The Service Status row provides information about the Nova service cluster health state as a whole including the stateof the API frontend (the HAProxy plubic VIP) a counter of HTTP 5xx errors the HTTP requests response time andstatus code

The Nova API row provides information about the health state of the API backends (nova-api ec2-api ) the state ofthe workers and compute nodes

The Instance row provides information about the number of active instances instances in error and instances creationtime statistics

The ldquoResourcesrdquo row provides various virtual resources usage indicators

The LMA Self-Monitoring Dashboard

The LMA Self-Monitoring Dashboard is a new dashboard in LMA 08 This dashboard provides an overview of howthe LMA Toolchain performs overall

The LMA Collector row provides information about the Heka process In particular it is possible to visualize theprocessing time allocated to the Lua plugins and the amount of messages that have been processed as well as theamount of system resources consumed by the Heka process

Again it is possible to select a particular node using the dropdown menu list

The Collectd row provides system resource usage information allocated to the collectd process

The InfluxDB row provides system resource usage information allocated to the InfluxDB application

The Grafana row provides system resource usage information allocated to the Grafana application

The Elasticsearch row provides system resource usage information allocated to the JVM process running the Elastic-search application

Other Dashboards

In total there are 16 different dashboards you can use to explore different time-series facettes of your OpenStackenvironment

Viewing Faults and Anomalies

The LMA-Toolchain is capable of detecting a number of service-affecting conditions such as the faults and anomaliesthat occured in your OpenStack environment Those conditions are reported in annotations that are displayed inGrafana The Grafana annotations contain a textual representation of the alarm (or set of alarms) that were triggeredby the Collectors for a service In other words the annotations contain valuable insights that you could use to diagnoseand troubleshoot problems Futhermore with the Grafana annotations the system makes a distiction between whatis estimated as a direct root cause versus what is estimated as an indirect root cause This is internally represented ina dependency graph There are first degree dependencies that are used to describe situations whereby the healthstate of an entity strictly depends on the health state of another entity For example Nova as a service has firstdegree dependencies with the nova-api endpoints and the nova-scheduler workers But there are also second degreedependencies whereby the health state of an entity doesnrsquot strictly depends on the heath state of another entity althoughit might be depending on the operation being performed For example by default we declared that Nova has a seconddegree dependency with Neutron As a result the health state of Nova will not be directly impacted by the healthstate of Neutron but the annotation will provide a root cause analysis hint For example letrsquos assume a situation whereNova has changed a state from okay to critical (because of 5xx HTTP errors) and that Neutron has been in down state

14 User Guide 14

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

for a while In this case the Nova dashboard will display an annotation that says Nova has changed a state to warningbecause the system has detected 5xx errors and that it may be due to the fact that Neutron is down An example ofwhat an annotation looks like is shown below

14 User Guide 15

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

14 User Guide 16

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

144 Troubleshooting

If you get no data in Grafana follow these troubleshooting tips

1 First check that the LMA Collector is running properly by following the LMA Collector troubleshooting in-structions in the LMA Collector Fuel Plugin User Guide

2 Check that the nodes are able to connect to the InfluxDB server on port 8086

3 Check that InfluxDB is running

[rootnode-37 ~] etcinitdinfluxdb statusinfluxdb Process is running [ OK ]

4 If InfluxDB is down restart it

[rootnode-37 ~] etcinitdinfluxdb startStarting the process influxdb [ OK ]influxdb process was started [ OK ]

5 Check that Grafana is running

[rootnode-37 ~] etcinitdgrafana-server status

grafana is running

6 If Grafana is down restart it

[rootnode-37 ~] etcinitdgrafana-server start

Starting Grafana Server

7 If none of the above solves the problem check the logs in varloginfluxdbinfluxdblog andvarloggrafanagrafanalog to find out what might have gone wrong

15 Licenses

151 Third Party Components

Name Project Web Site LicenseInfluxDB httpsinfluxdbcom MITGrafana httpgrafanaorg Apache v2

152 Puppet modules

Name Project Web Site LicenseApt httpsgithubcompuppetlabspuppetlabs-apt Apache V2Concat httpsgithubcompuppetlabspuppetlabs-concat Apache V2Stdlib httpsgithubcompuppetlabspuppetlabs-stdlib Apache V2IniFile httpsgithubcompuppetlabspuppetlabs-inifile Apache v2Grafana httpsgithubcombfraserpuppet-grafana Apache v2

15 Licenses 17

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

16 Appendix

bull The InfluxDB-Grafana plugin project at GitHub

bull The official InfluxDB documentation

bull The official Grafana documentation

16 Appendix 18

CHAPTER 2

Indices and Tables

bull search

19

  • User documentation
    • Overview
    • Release Notes
    • Installation Guide
    • User Guide
    • Licenses
    • Appendix
      • Indices and Tables
Page 16: The InfluxDB-Grafana plugin for Fuel Documentation · 2019-04-02 · The InfluxDB-Grafana plugin for Fuel Documentation, Release 0.8.0 1.4.3Exploring your time-series with Grafana

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

The Nova Dashboard

The Nova Dashboard provides a detailed view of the Nova servicersquos related metrics

The Service Status row provides information about the Nova service cluster health state as a whole including the stateof the API frontend (the HAProxy plubic VIP) a counter of HTTP 5xx errors the HTTP requests response time andstatus code

The Nova API row provides information about the health state of the API backends (nova-api ec2-api ) the state ofthe workers and compute nodes

The Instance row provides information about the number of active instances instances in error and instances creationtime statistics

The ldquoResourcesrdquo row provides various virtual resources usage indicators

The LMA Self-Monitoring Dashboard

The LMA Self-Monitoring Dashboard is a new dashboard in LMA 08 This dashboard provides an overview of howthe LMA Toolchain performs overall

The LMA Collector row provides information about the Heka process In particular it is possible to visualize theprocessing time allocated to the Lua plugins and the amount of messages that have been processed as well as theamount of system resources consumed by the Heka process

Again it is possible to select a particular node using the dropdown menu list

The Collectd row provides system resource usage information allocated to the collectd process

The InfluxDB row provides system resource usage information allocated to the InfluxDB application

The Grafana row provides system resource usage information allocated to the Grafana application

The Elasticsearch row provides system resource usage information allocated to the JVM process running the Elastic-search application

Other Dashboards

In total there are 16 different dashboards you can use to explore different time-series facettes of your OpenStackenvironment

Viewing Faults and Anomalies

The LMA-Toolchain is capable of detecting a number of service-affecting conditions such as the faults and anomaliesthat occured in your OpenStack environment Those conditions are reported in annotations that are displayed inGrafana The Grafana annotations contain a textual representation of the alarm (or set of alarms) that were triggeredby the Collectors for a service In other words the annotations contain valuable insights that you could use to diagnoseand troubleshoot problems Futhermore with the Grafana annotations the system makes a distiction between whatis estimated as a direct root cause versus what is estimated as an indirect root cause This is internally represented ina dependency graph There are first degree dependencies that are used to describe situations whereby the healthstate of an entity strictly depends on the health state of another entity For example Nova as a service has firstdegree dependencies with the nova-api endpoints and the nova-scheduler workers But there are also second degreedependencies whereby the health state of an entity doesnrsquot strictly depends on the heath state of another entity althoughit might be depending on the operation being performed For example by default we declared that Nova has a seconddegree dependency with Neutron As a result the health state of Nova will not be directly impacted by the healthstate of Neutron but the annotation will provide a root cause analysis hint For example letrsquos assume a situation whereNova has changed a state from okay to critical (because of 5xx HTTP errors) and that Neutron has been in down state

14 User Guide 14

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

for a while In this case the Nova dashboard will display an annotation that says Nova has changed a state to warningbecause the system has detected 5xx errors and that it may be due to the fact that Neutron is down An example ofwhat an annotation looks like is shown below

14 User Guide 15

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

14 User Guide 16

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

144 Troubleshooting

If you get no data in Grafana follow these troubleshooting tips

1 First check that the LMA Collector is running properly by following the LMA Collector troubleshooting in-structions in the LMA Collector Fuel Plugin User Guide

2 Check that the nodes are able to connect to the InfluxDB server on port 8086

3 Check that InfluxDB is running

[rootnode-37 ~] etcinitdinfluxdb statusinfluxdb Process is running [ OK ]

4 If InfluxDB is down restart it

[rootnode-37 ~] etcinitdinfluxdb startStarting the process influxdb [ OK ]influxdb process was started [ OK ]

5 Check that Grafana is running

[rootnode-37 ~] etcinitdgrafana-server status

grafana is running

6 If Grafana is down restart it

[rootnode-37 ~] etcinitdgrafana-server start

Starting Grafana Server

7 If none of the above solves the problem check the logs in varloginfluxdbinfluxdblog andvarloggrafanagrafanalog to find out what might have gone wrong

15 Licenses

151 Third Party Components

Name Project Web Site LicenseInfluxDB httpsinfluxdbcom MITGrafana httpgrafanaorg Apache v2

152 Puppet modules

Name Project Web Site LicenseApt httpsgithubcompuppetlabspuppetlabs-apt Apache V2Concat httpsgithubcompuppetlabspuppetlabs-concat Apache V2Stdlib httpsgithubcompuppetlabspuppetlabs-stdlib Apache V2IniFile httpsgithubcompuppetlabspuppetlabs-inifile Apache v2Grafana httpsgithubcombfraserpuppet-grafana Apache v2

15 Licenses 17

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

16 Appendix

bull The InfluxDB-Grafana plugin project at GitHub

bull The official InfluxDB documentation

bull The official Grafana documentation

16 Appendix 18

CHAPTER 2

Indices and Tables

bull search

19

  • User documentation
    • Overview
    • Release Notes
    • Installation Guide
    • User Guide
    • Licenses
    • Appendix
      • Indices and Tables
Page 17: The InfluxDB-Grafana plugin for Fuel Documentation · 2019-04-02 · The InfluxDB-Grafana plugin for Fuel Documentation, Release 0.8.0 1.4.3Exploring your time-series with Grafana

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

for a while In this case the Nova dashboard will display an annotation that says Nova has changed a state to warningbecause the system has detected 5xx errors and that it may be due to the fact that Neutron is down An example ofwhat an annotation looks like is shown below

14 User Guide 15

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

14 User Guide 16

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

144 Troubleshooting

If you get no data in Grafana follow these troubleshooting tips

1 First check that the LMA Collector is running properly by following the LMA Collector troubleshooting in-structions in the LMA Collector Fuel Plugin User Guide

2 Check that the nodes are able to connect to the InfluxDB server on port 8086

3 Check that InfluxDB is running

[rootnode-37 ~] etcinitdinfluxdb statusinfluxdb Process is running [ OK ]

4 If InfluxDB is down restart it

[rootnode-37 ~] etcinitdinfluxdb startStarting the process influxdb [ OK ]influxdb process was started [ OK ]

5 Check that Grafana is running

[rootnode-37 ~] etcinitdgrafana-server status

grafana is running

6 If Grafana is down restart it

[rootnode-37 ~] etcinitdgrafana-server start

Starting Grafana Server

7 If none of the above solves the problem check the logs in varloginfluxdbinfluxdblog andvarloggrafanagrafanalog to find out what might have gone wrong

15 Licenses

151 Third Party Components

Name Project Web Site LicenseInfluxDB httpsinfluxdbcom MITGrafana httpgrafanaorg Apache v2

152 Puppet modules

Name Project Web Site LicenseApt httpsgithubcompuppetlabspuppetlabs-apt Apache V2Concat httpsgithubcompuppetlabspuppetlabs-concat Apache V2Stdlib httpsgithubcompuppetlabspuppetlabs-stdlib Apache V2IniFile httpsgithubcompuppetlabspuppetlabs-inifile Apache v2Grafana httpsgithubcombfraserpuppet-grafana Apache v2

15 Licenses 17

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

16 Appendix

bull The InfluxDB-Grafana plugin project at GitHub

bull The official InfluxDB documentation

bull The official Grafana documentation

16 Appendix 18

CHAPTER 2

Indices and Tables

bull search

19

  • User documentation
    • Overview
    • Release Notes
    • Installation Guide
    • User Guide
    • Licenses
    • Appendix
      • Indices and Tables
Page 18: The InfluxDB-Grafana plugin for Fuel Documentation · 2019-04-02 · The InfluxDB-Grafana plugin for Fuel Documentation, Release 0.8.0 1.4.3Exploring your time-series with Grafana

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

14 User Guide 16

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

144 Troubleshooting

If you get no data in Grafana follow these troubleshooting tips

1 First check that the LMA Collector is running properly by following the LMA Collector troubleshooting in-structions in the LMA Collector Fuel Plugin User Guide

2 Check that the nodes are able to connect to the InfluxDB server on port 8086

3 Check that InfluxDB is running

[rootnode-37 ~] etcinitdinfluxdb statusinfluxdb Process is running [ OK ]

4 If InfluxDB is down restart it

[rootnode-37 ~] etcinitdinfluxdb startStarting the process influxdb [ OK ]influxdb process was started [ OK ]

5 Check that Grafana is running

[rootnode-37 ~] etcinitdgrafana-server status

grafana is running

6 If Grafana is down restart it

[rootnode-37 ~] etcinitdgrafana-server start

Starting Grafana Server

7 If none of the above solves the problem check the logs in varloginfluxdbinfluxdblog andvarloggrafanagrafanalog to find out what might have gone wrong

15 Licenses

151 Third Party Components

Name Project Web Site LicenseInfluxDB httpsinfluxdbcom MITGrafana httpgrafanaorg Apache v2

152 Puppet modules

Name Project Web Site LicenseApt httpsgithubcompuppetlabspuppetlabs-apt Apache V2Concat httpsgithubcompuppetlabspuppetlabs-concat Apache V2Stdlib httpsgithubcompuppetlabspuppetlabs-stdlib Apache V2IniFile httpsgithubcompuppetlabspuppetlabs-inifile Apache v2Grafana httpsgithubcombfraserpuppet-grafana Apache v2

15 Licenses 17

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

16 Appendix

bull The InfluxDB-Grafana plugin project at GitHub

bull The official InfluxDB documentation

bull The official Grafana documentation

16 Appendix 18

CHAPTER 2

Indices and Tables

bull search

19

  • User documentation
    • Overview
    • Release Notes
    • Installation Guide
    • User Guide
    • Licenses
    • Appendix
      • Indices and Tables
Page 19: The InfluxDB-Grafana plugin for Fuel Documentation · 2019-04-02 · The InfluxDB-Grafana plugin for Fuel Documentation, Release 0.8.0 1.4.3Exploring your time-series with Grafana

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

144 Troubleshooting

If you get no data in Grafana follow these troubleshooting tips

1 First check that the LMA Collector is running properly by following the LMA Collector troubleshooting in-structions in the LMA Collector Fuel Plugin User Guide

2 Check that the nodes are able to connect to the InfluxDB server on port 8086

3 Check that InfluxDB is running

[rootnode-37 ~] etcinitdinfluxdb statusinfluxdb Process is running [ OK ]

4 If InfluxDB is down restart it

[rootnode-37 ~] etcinitdinfluxdb startStarting the process influxdb [ OK ]influxdb process was started [ OK ]

5 Check that Grafana is running

[rootnode-37 ~] etcinitdgrafana-server status

grafana is running

6 If Grafana is down restart it

[rootnode-37 ~] etcinitdgrafana-server start

Starting Grafana Server

7 If none of the above solves the problem check the logs in varloginfluxdbinfluxdblog andvarloggrafanagrafanalog to find out what might have gone wrong

15 Licenses

151 Third Party Components

Name Project Web Site LicenseInfluxDB httpsinfluxdbcom MITGrafana httpgrafanaorg Apache v2

152 Puppet modules

Name Project Web Site LicenseApt httpsgithubcompuppetlabspuppetlabs-apt Apache V2Concat httpsgithubcompuppetlabspuppetlabs-concat Apache V2Stdlib httpsgithubcompuppetlabspuppetlabs-stdlib Apache V2IniFile httpsgithubcompuppetlabspuppetlabs-inifile Apache v2Grafana httpsgithubcombfraserpuppet-grafana Apache v2

15 Licenses 17

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

16 Appendix

bull The InfluxDB-Grafana plugin project at GitHub

bull The official InfluxDB documentation

bull The official Grafana documentation

16 Appendix 18

CHAPTER 2

Indices and Tables

bull search

19

  • User documentation
    • Overview
    • Release Notes
    • Installation Guide
    • User Guide
    • Licenses
    • Appendix
      • Indices and Tables
Page 20: The InfluxDB-Grafana plugin for Fuel Documentation · 2019-04-02 · The InfluxDB-Grafana plugin for Fuel Documentation, Release 0.8.0 1.4.3Exploring your time-series with Grafana

The InfluxDB-Grafana plugin for Fuel Documentation Release 080

16 Appendix

bull The InfluxDB-Grafana plugin project at GitHub

bull The official InfluxDB documentation

bull The official Grafana documentation

16 Appendix 18

CHAPTER 2

Indices and Tables

bull search

19

  • User documentation
    • Overview
    • Release Notes
    • Installation Guide
    • User Guide
    • Licenses
    • Appendix
      • Indices and Tables
Page 21: The InfluxDB-Grafana plugin for Fuel Documentation · 2019-04-02 · The InfluxDB-Grafana plugin for Fuel Documentation, Release 0.8.0 1.4.3Exploring your time-series with Grafana

CHAPTER 2

Indices and Tables

bull search

19

  • User documentation
    • Overview
    • Release Notes
    • Installation Guide
    • User Guide
    • Licenses
    • Appendix
      • Indices and Tables