
Performance Monitoring on NERSC’s POWER 5 System
Richard Gerber, NERSC User Services
[email protected]

ScicomP 14 Poughkeepsie, N.Y.

May 22, 2008


ScicomP 14, Poughkeepsie, N.Y. May 22, 2008 2

Prequel

•  Although this talk centers on identifying system problems, it also highlights the stability and excellent design of the p575 POWER 5 system.

•  Virtually all the problems we’ve had with Bassi can be attributed to software complexity.

•  Take-away point: Application testing and monitoring are necessary to ensure proper system function.


ScicomP 14, Poughkeepsie, N.Y. May 22, 2008 3

Outline

•  Why monitor performance?
•  What are we monitoring?
•  Procedure
•  Data and results
•  Discussion
•  Summary


ScicomP 14, Poughkeepsie, N.Y. May 22, 2008 4

Why monitor performance?

•  To provide stable, consistent high performance scientific computing resources.

•  To ensure that system performance and reliability never decrease over the machine’s lifetime.

•  To recognize when software, hardware, and configuration changes impact performance.


ScicomP 14, Poughkeepsie, N.Y. May 22, 2008 5

What are we monitoring?

•  Parallel application performance – looks at the system from a user perspective.

•  High-level “component” tests: IO, memory bandwidth, MPI latency and bandwidth.

•  Serial application performance on a single packed node.

•  The goal is to monitor, maintain, and improve the user experience.


ScicomP 14, Poughkeepsie, N.Y. May 22, 2008 6

Three parallel applications

•  GTC (Gyrokinetic Toroidal Code)
   ▪  Studies energy transfer via plasma microturbulence in fusion device plasmas. PIC code.

•  PARATEC (Parallel Total Energy Code)
   ▪  Ab-initio quantum total energy calculations via pseudo-potentials and a plane wave basis set. Self-consistent field conjugate gradient.

•  CAM (Community Atmospheric Model)
   ▪  Complicated multi-physics and chemistry.


ScicomP 14, Poughkeepsie, N.Y. May 22, 2008 7

Three standard parallel benchmarks

•  NPB 2.4 Class D: SP
•  NPB 2.4 Class D: MG
•  NPB 2.4 Class D: FT

•  These were chosen as models of memory-intensive kernels that benefit from good memory bandwidth.


ScicomP 14, Poughkeepsie, N.Y. May 22, 2008 8

Three “Component” Tests

•  MEMRATE
   ▪  Single-node memory bandwidth
•  MPITEST
   ▪  MPI latency
   ▪  MPI bandwidth (see the ping-pong sketch below)
•  PIORAW
   ▪  Parallel IO performance
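As a rough illustration of the kind of measurement the MPI component test performs, here is a minimal ping-pong sketch in Python with mpi4py. It is purely illustrative and is not the actual MPITEST source; it assumes exactly two MPI ranks.

```python
# Illustrative ping-pong, not the actual MPITEST code. Run with two ranks:
#   mpiexec -n 2 python pingpong.py
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

def one_way_time(nbytes, iters=200):
    """Average one-way message time in seconds for messages of nbytes."""
    buf = np.zeros(nbytes, dtype=np.uint8)
    comm.Barrier()
    t0 = MPI.Wtime()
    for _ in range(iters):
        if rank == 0:
            comm.Send(buf, dest=1, tag=0)
            comm.Recv(buf, source=1, tag=1)
        else:
            comm.Recv(buf, source=0, tag=0)
            comm.Send(buf, dest=0, tag=1)
    return (MPI.Wtime() - t0) / (2 * iters)  # round trip -> one-way

lat = one_way_time(8)            # tiny message: dominated by latency
t_big = one_way_time(1_000_000)  # 1 MB message: dominated by bandwidth
if rank == 0:
    print(f"latency   ~ {lat * 1e6:.1f} microseconds")
    print(f"bandwidth ~ {1.0 / t_big:.0f} MB/s")  # 1 MB / one-way time
```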


ScicomP 14, Poughkeepsie, N.Y. May 22, 2008 9

Procedure

•  Goals
   ▪  Monitor the system in production mode
   ▪  Small impact on users
   ▪  Fast and flexible way to test SW and HW changes
   ▪  Automatically run the suite, gather data, and post results

•  Implementation
   ▪  Run the suite of codes approximately weekly
   ▪  Each code typically runs a few minutes on 8 (of 111) nodes
   ▪  End-to-end workflow is handled by scripts that run the codes, parse the output, import the results into a database, and display them on the web (a sketch of this flow follows below)
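The talk does not show the scripts themselves, so the following is only a sketch of that end-to-end flow. The names (run_suite.sh, benchmarks.sqlite, the "Mflops/sec/task" output format) are hypothetical stand-ins for NERSC's actual tooling.

```python
# Illustrative workflow sketch: run each benchmark, parse its output,
# and load the result into a database that a web page can query.
import re
import sqlite3
import subprocess
from datetime import date

BENCHMARKS = ["gtc", "paratec", "cam", "npb_mg", "memrate", "mpitest", "pioraw"]

def parse_mflops(output: str) -> float:
    """Pull a 'Mflops/sec/task: <value>' style figure out of job output."""
    m = re.search(r"Mflops/sec/task:\s*([\d.]+)", output)
    if not m:
        raise ValueError("no performance figure found in output")
    return float(m.group(1))

def main() -> None:
    db = sqlite3.connect("benchmarks.sqlite")
    db.execute("CREATE TABLE IF NOT EXISTS results"
               "(run_date TEXT, code TEXT, mflops_per_task REAL)")
    for code in BENCHMARKS:
        # Submit the benchmark (e.g., via the batch system) and capture output.
        out = subprocess.run(["./run_suite.sh", code],
                             capture_output=True, text=True, check=True).stdout
        db.execute("INSERT INTO results VALUES (?, ?, ?)",
                   (date.today().isoformat(), code, parse_mflops(out)))
    db.commit()

if __name__ == "__main__":
    main()
```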


ScicomP 14, Poughkeepsie, N.Y. May 22, 2008 10

Results Web Interface

•  The monitoring results are publicly available on the web
   ▪  http://www.nersc.gov/nusers/systems/bassi/monitor.php
•  This is useful for visually scanning for anomalies.
•  Data is quickly available to IBM management and technical staff.


ScicomP 14, Poughkeepsie, N.Y. May 22, 2008 11

Examples of Problems Revealed

•  PE upgrade problem
•  HPS firmware upgrade problem
•  AIX 5.3 upgrade problem
•  Password file indexing problem
•  System daemon problem
•  Compiler upgrade issues
•  Random hardware problems (e.g., L3 cache)


ScicomP 14, Poughkeepsie, N.Y. May 22, 2008 12

CAM 3.0 Results

[Plot: CAM 16x1 (no OpenMP), Mflops/sec/task vs. date, Sep-05 through Jun-08. Legend: Healthy, PE upgrade problem, Firmware upgrade problem, AIX 5.3 testing, Indexing, System Daemon, XLF 11.]


ScicomP 14, Poughkeepsie, N.Y. May 22, 2008 13

PARATEC Results

[Plot: PARATEC, 64 tasks, Mflops/sec/task vs. date, Sep-05 through Jun-08. Legend: Healthy, PE upgrade problem, Firmware upgrade problem, AIX 5.3 testing, Indexing, System Daemon, Bad Node.]


ScicomP 14, Poughkeepsie, N.Y. May 22, 2008 14

NPB MG Results

[Plot: NPB MG 2.4, 64 tasks, Mflops/sec/task vs. date, Sep-05 through Sep-08. Legend: Healthy, PE upgrade problem, Firmware upgrade problem, AIX 5.3 test, Bad Node, System Daemon, Indexing, XLF 11.]


ScicomP 14, Poughkeepsie, N.Y. May 22, 2008 15

MEMRATE Packed Node

[Plot: MEMRATE Packed Node TRIAD, MB/sec vs. date, Sep-05 through Jun-08. Legend: Healthy, PE upgrade problem, Firmware upgrade problems, Bad Node, Mem Segments, Indexing, System Daemon, LL Upgrade/Config.]


ScicomP 14, Poughkeepsie, N.Y. May 22, 2008 16

GTC Results

[Plot: GTC, 64 tasks, Mflops/sec/task vs. date, Sep-05 through Jun-08. Legend: Healthy, PE upgrade, Firmware upgrade, AIX 5.3 testing, Bad Node, System daemon, Indexing, LL upgrade/config, XLF 11.]


ScicomP 14, Poughkeepsie, N.Y. May 22, 2008 17

Discussion

•  Most of the problems we uncovered involved software upgrades or configuration changes

•  None of these issues were known before monitoring revealed them

•  None except the bad L3 caches and compiler changes were identified by “system” tests or had simple resolutions


ScicomP 14, Poughkeepsie, N.Y. May 22, 2008 18

Healthy Results Summary

Benchmark         Avg. MFlop/s   Std. Dev.   COV
CAM 16            502            1.4         0.28%
PARATEC 64        4719           80          1.69%
GTC               747            8.9         1.20%*
NPB MG D 64       1331           26          1.95%
MEMRATE (MB/s)    7070           42          0.60%
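For reference, COV here is the standard deviation divided by the mean. A minimal sketch of how a new result can be checked against the healthy baseline (the numbers below are placeholders, not NERSC measurements):

```python
# Placeholder numbers for illustration only (not NERSC data).
from statistics import mean, stdev

healthy = [503.1, 501.4, 502.8, 500.9, 502.0]  # recent healthy runs, MFlop/s per task
new_run = 455.0                                # hypothetical suspect result

avg, sd = mean(healthy), stdev(healthy)
print(f"mean = {avg:.1f}, std = {sd:.2f}, COV = {100 * sd / avg:.2f}%")

# Flag results far outside the healthy spread (3 sigma is one common choice).
if abs(new_run - avg) > 3 * sd:
    print(f"ALERT: {new_run} is {abs(new_run - avg) / sd:.1f} sigma from the healthy mean")
```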


ScicomP 14, Poughkeepsie, N.Y. May 22, 2008 19

Discussion

•  When you run and examine the results over time, you get a “feel” for when something is wrong

•  The measured variation on a healthy system allows you to quickly evaluate the significance of an outlying result

•  Even minor variations in run time can indicate that a system is sick

•  The historical results provide quantitative evidence that a problem exists


ScicomP 14, Poughkeepsie, N.Y. May 22, 2008 20

Discussion

•  A healthy Bassi has remarkably consistent run time performance characteristics.

•  This is especially notable because the system is heavily loaded with a diverse workload.

•  We don’t have comparable long-term data for other systems at NERSC (yet!)

•  Aggregate measures (SSP) are also useful, but are not as diagnostic for as many problems


ScicomP 14, Poughkeepsie, N.Y. May 22, 2008 21

Sustained System Performance (SSP)

The SSP is an aggregate metric derived from a (geometric) average of application benchmarks and standard parallel benchmarks.
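As a sketch of the geometric-average step only (per-task rates taken from the healthy-results table above; the full SSP metric typically also scales the per-task rate by system size, a detail not covered in this talk):

```python
# Geometric mean of the healthy per-task rates from the table (MFlop/s/task).
# This shows only the averaging step of an SSP-style aggregate metric.
from math import prod

rates = {"CAM 16": 502, "PARATEC 64": 4719, "GTC": 747, "NPB MG D 64": 1331}

geo_mean = prod(rates.values()) ** (1.0 / len(rates))
print(f"geometric mean = {geo_mean:.0f} MFlop/s per task")
```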



ScicomP 14, Poughkeepsie, N.Y. May 22, 2008 22

Summary

•  Application testing and monitoring are necessary to ensure proper system function.

•  When configured properly, application performance on Bassi is remarkably consistent.

•  Most problems we’ve had with Bassi can be attributed to software complexity.