Spring HEPiX 2017 - Budapest - April 27th The musings of a data junkie Data, data and more data Cary Whitney




Data


Realization

Data collection vs. Monitoring


HEPiX Twiki


Monitoring (Myths/Issues/New)

• Students can do anything.
• Without an understanding of the data, any dashboard or monitoring is pretty basic at best.
• He who collects the data, knows the data.
• Many stakeholders actually think that the ones who collect the data can build meaningful graphs and monitoring. The collectors may know the tools better, but not the data.
• Building a comprehensive dashboard and monitoring is easy. Just throw it together.
• Hey, now that you have all this data, can I copy it?
• There is a strong desire to copy some or all of the data into other places.
• RabbitMQ monitoring plugin
• You can have speed or monitoring, but not both.
• Cray SEDC data plugins. Direct streaming from the Cray system.
• Elastic on Docker running on the Mac
• Elastic upgrade to v5
• Kibana complained if there were still v2 components in the mix.
• Kopf was broken; Cerebro is the new replacement for Kopf for Elasticsearch management.
• Basic Logstash, Kibana and Elastic stats are now in Kibana.
• Logstash config reload is coming along.
• rsyslog rework of the center’s syslog infrastructure, based on RELP
• ElastAlert, netdata, GPFS and Lustre work this summer.


netdata


netdata Monitoring


OpenDCIM


OpenDCIM visual


Cori, a Cray XC40 based system

• Haswell (12): 16,128 cores, 203 TB memory, 2004 nodes
• KNL (52): 632,672 cores, 1 PB memory, 9304 nodes
• Cray Dragonfly topology, 45 TB/s bisection bandwidth
• Burst Buffers: 1.8 PB SSD dynamic storage
• Lustre scratch disk space: 30 PB, 700 GB/s
• For only 7 MW of peak power


Building Power (HPL run)


Grafana Max CPU Overview


Overview of Node


Grafana Monitoring


Elastic Docker

• Main doc page: https://twiki.cern.ch/twiki/bin/view/HEPIX/Monitoring
• Instructions: https://twiki.cern.ch/twiki/bin/view/HEPIX/Instructions

1. Install Docker on your system.
2. Create a Docker yml file to load in Elastic, Kibana and Grafana.
3. Set up some local directories to store the data.
4. Get your Elastic index(es).
5. Start the Elastic Docker.
6. Load the index(es).
7. Configure Kibana and Grafana.
8. Away you go with your local index.
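The talk does not say how the indexes are loaded in step 6; the HEPiX Twiki instructions are the authority there. As one possible illustration only, the sketch below assumes the index was exported as a list of JSON documents and pushes it into the local container through the Elasticsearch `_bulk` endpoint. The index name, the document fields, the `doc_type` value and the `http://localhost:9200` address are all hypothetical.

```python
import json
import urllib.request


def build_bulk_body(index, docs, doc_type=None):
    """Build the newline-delimited JSON body expected by the _bulk endpoint.

    Each document becomes two lines: an action line, then the document itself.
    doc_type is optional; Elasticsearch 5.x expects a type, later versions do not.
    """
    lines = []
    for doc in docs:
        meta = {"_index": index}
        if doc_type is not None:
            meta["_type"] = doc_type
        lines.append(json.dumps({"index": meta}))
        lines.append(json.dumps(doc))
    if not lines:
        return ""
    # The bulk body must end with a trailing newline.
    return "\n".join(lines) + "\n"


def post_bulk(body, base_url="http://localhost:9200"):
    """POST a bulk body to a local Elasticsearch (assumed address) and return the reply."""
    request = urllib.request.Request(
        base_url + "/_bulk",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Hypothetical sample documents; real ones would come from the exported index.
    sample_docs = [
        {"host": "node0001", "metric": "cpu_user", "value": 12.5},
        {"host": "node0002", "metric": "cpu_user", "value": 48.0},
    ]
    body = build_bulk_body("collectd-sample", sample_docs, doc_type="metric")
    print(body)
    # Uncomment once the Elastic container from step 5 is running:
    # print(post_bulk(body))
```

Building the body separately from sending it makes it easy to inspect what would be loaded before the container is touched.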


Elastic Size


Data Volumes (Single Day)

Source     Size (GB)   Doc count (M)   Description
modbus     15.4        99.7            Serial based industrial devices: 2500 PDU strips, 849 PDU panels and substation
collectd   108.75      807.8           Linux system stats
SEDC       27.6        261.4           Cray power, environmental and job
Syslog     4.25        21.95           Logs from all systems/devices of the center
weather    0.017       0.044           Davis weather station outside
onewire    0.940       5               Computer room temperature network, over 1800 sensors
upmu       0.46        0.164           High resolution power monitoring
ION        0.206       1.9             Building substation power monitoring
Total      160         1.2B
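
As a quick cross-check of the table above, the sketch below sums the per-source figures and derives an average document size. The per-source numbers are copied from the slide; the conversion assumes 1 GB = 10^9 bytes, which the slide does not specify. The sums come to roughly 157.6 GB and roughly 1,198 million documents, consistent with the rounded totals of 160 GB and 1.2B.

```python
# (size in GB, document count in millions) for a single day, copied from the table.
DAILY_VOLUMES = {
    "modbus": (15.4, 99.7),
    "collectd": (108.75, 807.8),
    "SEDC": (27.6, 261.4),
    "Syslog": (4.25, 21.95),
    "weather": (0.017, 0.044),
    "onewire": (0.940, 5.0),
    "upmu": (0.46, 0.164),
    "ION": (0.206, 1.9),
}


def totals(volumes):
    """Return (total size in GB, total document count in millions)."""
    total_gb = sum(size for size, _ in volumes.values())
    total_docs_m = sum(docs for _, docs in volumes.values())
    return total_gb, total_docs_m


def avg_bytes_per_doc(size_gb, docs_m):
    """Average document size in bytes, assuming 1 GB = 10**9 bytes."""
    return size_gb * 1e9 / (docs_m * 1e6)


if __name__ == "__main__":
    total_gb, total_docs_m = totals(DAILY_VOLUMES)
    print(f"total: {total_gb:.1f} GB, {total_docs_m:.1f} M documents")
    for name, (size_gb, docs_m) in DAILY_VOLUMES.items():
        print(f"{name:10s} {avg_bytes_per_doc(size_gb, docs_m):10.0f} bytes/doc")
```

The per-source average makes it easier to see which feeds are dominated by document count and which by document size.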


Thank You
