
Challenges in Performance Evaluation and Improvement of Scientific Codes

Boyana Norris, Argonne National Laboratory, http://www.mcs.anl.gov/~norris

Ivana Veljkovic, Pennsylvania State University

SIAM CSE, February 13, 2005

Outline

- Performance evaluation challenges
- Component-based approach
- Motivating example: adaptive linear system solution
- A component infrastructure for performance monitoring and adaptation of applications
- Summary and future work


Acknowledgments

- Ivana Veljkovic, Padma Raghavan (Penn State)
- Sanjukta Bhowmick (ANL/Columbia)
- Lois Curfman McInnes (ANL)
- TAU developers (U. Oregon)
- PERC members
- Sponsors: DOE and NSF


Challenges in performance evaluation

+ Many tools for performance data gathering and analysis: PAPI, TAU, SvPablo, KOJAK, …
  - Various interfaces, levels of automation, and approaches to information presentation

User's point of view:
- What do the different tools do? Which is most appropriate for a given application?
- (How) can multiple tools be used in concert?
- I have tons of performance data, now what?
- What automatic tuning tools are available, and what exactly do they do?
- How hard is it to install/learn/use tool X?
- Is instrumented code portable? What is the overhead of instrumentation?
- How does code evolution affect the performance analysis process?


Incomplete list of tools
- Source instrumentation: TAU/PDT, KOJAK (MPI/OpenMP), SvPablo, Performance Assertions, …
- Binary instrumentation: HPCToolkit, Paradyn, DyninstAPI, …
- Performance monitoring: MetaSim Tracer (memory), PAPI, HPCToolkit, Sigma++ (memory), DPOMP (OpenMP), mpiP, gprof, psrun, … (a minimal PAPI counter sketch appears after this list)
- Modeling/analysis/prediction: MetaSim Convolver (memory), DIMEMAS (network), SvPablo (scalability), Paradyn, Sigma++, …
- Source/binary optimization: Automated Empirical Optimization of Software (ATLAS), OSKI, ROSE
- Runtime adaptation: ActiveHarmony, SALSA
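As an illustration of the kind of hardware-counter monitoring a library such as PAPI provides, here is a minimal sketch (not part of the original talk); the choice of events and the placeholder kernel are assumptions, and a real application would pick events suited to its platform and question.

```c
/* Minimal PAPI sketch: count total cycles and floating-point operations
 * around a placeholder kernel, using the classic high-level API
 * (removed in newer PAPI releases). Illustrative only. */
#include <stdio.h>
#include <papi.h>

static void compute_kernel(double *x, int n) {
    /* stand-in for the scientific kernel being measured */
    for (int i = 0; i < n; i++)
        x[i] = x[i] * 1.0001 + 0.5;
}

int main(void) {
    int events[2] = { PAPI_TOT_CYC, PAPI_FP_OPS };
    long long counts[2];
    double x[1000] = { 0.0 };

    if (PAPI_start_counters(events, 2) != PAPI_OK) {
        fprintf(stderr, "PAPI_start_counters failed\n");
        return 1;
    }
    compute_kernel(x, 1000);
    if (PAPI_stop_counters(counts, 2) != PAPI_OK) {
        fprintf(stderr, "PAPI_stop_counters failed\n");
        return 1;
    }
    printf("cycles: %lld  fp ops: %lld\n", counts[0], counts[1]);
    return 0;
}
```

Source-instrumentation tools such as TAU/PDT can insert equivalent measurement calls automatically, rather than by hand as sketched here.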



Challenges (where is the complexity?)
More effective use and integration of tools
- Tool developer's perspective:
  - Overhead of initially implementing one-to-one interoperability
  - Managing dependencies on other tools
  - Maintaining interoperability as different tools evolve
- Individual scientist's perspective:
  - The learning curve for performance tools means less time to focus on one's own research (modeling, physics, mathematics)
  - A potentially significant time investment is needed to find out whether/how using someone else's tool would improve performance, so scientists tend to do their own hand-coded optimizations (time-consuming, non-reusable)
  - Lack of tools that automate (at least partially) algorithm discovery, assembly, and configuration, and that enable runtime adaptivity


What can be done
How to manage complexity? Provide:
- Performance tools that are truly interoperable
- Uniform, easy access to tools
- Component implementations of software, especially supporting numerical codes, such as linear algebra algorithms
- New algorithms (e.g., interactive/dynamic techniques, algorithm composition)

Implementation approach: components, both for the tools and for the application software


What is being done
- No "integrated" environment for performance monitoring, analysis, and optimization
- Most past efforts: one-to-one tool interoperability
- More recently:
  - OSPAT (initial meeting at SC'04), with a focus on common data representation and interfaces
  - Tool-independent performance databases: PerfDMF
  - Eclipse parallel tools project (LANL)
  - …


OSPAT
The following areas were recommended for OSPAT to investigate:
- A common instrumentation API for source-level, compiler-level, library-level, and binary instrumentation
- A common probe interface for routine entry and exit events (see the sketch after this list)
- A common profile database schema
- An API to walk the call stack and examine heap memory
- A common API for thread creation and the fork interface
- Visualization components for drawing histograms and hierarchical displays typically used by performance tools
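To make the probe-interface item concrete, here is a minimal sketch of what a common routine entry/exit interface might look like; the ospat_probe_* names and the trivial implementation are invented for illustration and are not part of any published OSPAT specification.

```c
/* Hypothetical common probe interface for routine entry/exit events,
 * with a trivial implementation that just prints a trace. */
#include <stdio.h>

/* --- the interface a compliant instrumenter would target --- */
void ospat_probe_enter(const char *routine, const char *file, int line);
void ospat_probe_exit(const char *routine);

/* --- one possible tool-side implementation (here: print events) --- */
void ospat_probe_enter(const char *routine, const char *file, int line) {
    printf("enter %s (%s:%d)\n", routine, file, line);
}
void ospat_probe_exit(const char *routine) {
    printf("exit  %s\n", routine);
}

/* --- an instrumented routine, as a source instrumenter might emit it --- */
static double solve(double x) {
    ospat_probe_enter("solve", __FILE__, __LINE__);
    double y = 2.0 * x + 1.0;   /* stand-in for real work */
    ospat_probe_exit("solve");
    return y;
}

int main(void) {
    printf("result: %f\n", solve(20.0));
    return 0;
}
```

Any measurement tool implementing the two probe functions could then collect the same events, regardless of which instrumenter inserted the calls.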


Components
- Working definition: a component is a piece of software that can be composed with other components within a framework; composition can be either static (at link time) or dynamic (at run time); a sketch of the dynamic case appears below
- A "plug-and-play" model for building applications
- For more information: C. Szyperski, Component Software: Beyond Object-Oriented Programming, ACM Press, New York, 1998
- Components enable:
  - Tool interoperability
  - Automation of performance instrumentation/monitoring
  - Application adaptivity (automated or user-guided)
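Purely as an illustration of dynamic (run-time) composition, the following sketch loads a solver component from a shared library and looks up its entry point; the library name, the solver_apply symbol, and the interface are hypothetical and not tied to any particular component framework.

```c
/* Sketch of dynamic composition: plug a solver component in at run time.
 * The library path and the solver_apply symbol are hypothetical. */
#include <stdio.h>
#include <dlfcn.h>

/* Hypothetical component interface: apply a linear solver to A x = b. */
typedef int (*solver_apply_fn)(const void *A, const double *b, double *x, int n);

int main(void) {
    void *handle = dlopen("./libsolver_component.so", RTLD_NOW);
    if (!handle) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return 1;
    }

    /* Look up the component's entry point; a different component could be
       substituted here without relinking the application. */
    solver_apply_fn apply = (solver_apply_fn) dlsym(handle, "solver_apply");
    if (!apply) {
        fprintf(stderr, "dlsym failed: %s\n", dlerror());
        dlclose(handle);
        return 1;
    }

    /* ... hand `apply` to the framework and run the application ... */

    dlclose(handle);
    return 0;
}
```

Static composition would instead link the chosen component's library at build time; the "plug-and-play" interface stays the same either way.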


Example: component infrastructure for multimethod linear solvers
Goal: provide a framework for
- Performance monitoring of numerical components
- Dynamic adaptivity, based on:
  - Off-line analyses of past performance information
  - Online analysis of current execution performance information

Motivating application examples:
- Driven cavity flow [Coffey et al., 2003], nonlinear PDE solution
- FUN3D: incompressible and compressible Euler equations

Prior work in multimethod linear solvers: McInnes et al. '03; Bhowmick et al. '03 and '05; Norris et al. '05


Example: driven cavity flow
- Linear solver: GMRES(30); vary only the fill level of the ILU preconditioner
- Adaptive heuristic based on: previous linear solution convergence rate, nonlinear solution convergence rate, and rate of increase of linear solution iterations (a PETSc-flavored sketch follows below)
- 96x96 mesh, Grashof number = 10^5, lid velocity = 100
- Intel P4 Xeon, dual 2.2 GHz, 4 GB RAM
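To make the adaptive heuristic concrete, here is a hedged, PETSc-flavored sketch of adjusting the ILU fill level between nonlinear iterations; the thresholds and the choose_fill_level policy are illustrative assumptions, not the heuristic used in the work cited above.

```c
/* Sketch: reconfigure GMRES(30) + ILU(k) between outer (nonlinear)
 * iterations, varying only the fill level k. Thresholds are illustrative. */
#include <petscksp.h>

/* Hypothetical policy: raise the fill level if the previous linear solve
   converged slowly, lower it if convergence was fast. */
static PetscInt choose_fill_level(PetscInt current_fill, PetscInt last_its)
{
    if (last_its > 25 && current_fill < 3) return current_fill + 1;
    if (last_its < 5  && current_fill > 0) return current_fill - 1;
    return current_fill;
}

PetscErrorCode configure_linear_solver(KSP ksp, PetscInt fill_level)
{
    PC             pc;
    PetscErrorCode ierr;

    ierr = KSPSetType(ksp, KSPGMRES);CHKERRQ(ierr);
    ierr = KSPGMRESSetRestart(ksp, 30);CHKERRQ(ierr);       /* GMRES(30) */
    ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
    ierr = PCSetType(pc, PCILU);CHKERRQ(ierr);
    ierr = PCFactorSetLevels(pc, fill_level);CHKERRQ(ierr); /* ILU(k) */
    return 0;
}
```

Between Newton steps the application would query the previous solve with KSPGetIterationNumber, call choose_fill_level, and reconfigure the solver only when the level changes.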


Example: Compressible PETSc-FUN3D
- Finite volume discretization, variable-order Roe scheme on a tetrahedral, vertex-centered mesh
- Initial discretization: first-order scheme; switch to second order after the shock position has settled down
- Large sparse linear system solution takes approximately 72% of the overall solution time
- Original FUN3D developer: W. K. Anderson et al., NASA Langley
- Image: Dinesh Kaushik


PETSc-FUN3D, cont.
- A3: non-sequence-based adaptive strategy based on polynomial interpolation [Bhowmick et al., '05]
- A3 vs. base method time: from ~1% slowdown to 32% improvement
- Hand-tuned adaptive vs. base method time: 7% to 42% improvement


Component architecture
[Architecture diagram. Components: Experiment, Monitor, Checkpoint, Metadata extractor, Runtime DB, PerfDMF, TAU, and Off-line analysis. Labeled interactions: insert/extract, start/stop/trigger, checkpoint and adapt requests, adapt (algorithm, parameters), extract, and extract/query. A hedged sketch of this monitor-driven interaction pattern follows.]
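Purely to illustrate the interaction pattern suggested by the diagram (the monitor extracts recent performance data, then issues checkpoint and adapt requests), here is a hedged sketch; every name and the adaptation policy are hypothetical and are not part of the actual infrastructure.

```c
/* Hypothetical monitor-driven adaptation loop illustrating the diagram's
 * start/stop/trigger, extract, checkpoint, and adapt interactions.
 * All components are stubbed; names and policy are invented for illustration. */
#include <stdbool.h>
#include <stdio.h>

typedef struct { double solve_time; int iterations; } PerfRecord;

/* Stub component operations (a real system would call actual components). */
static int  step_count = 0;
static bool experiment_step(void) { return ++step_count <= 5; }
static PerfRecord runtime_db_extract(void) {
    PerfRecord r = { 1.0 * step_count, 20 * step_count };
    return r;
}
static void checkpoint_request(void) { printf("checkpoint requested\n"); }
static void adapt_request(const char *alg, int param) {
    printf("adapt to %s, parameter %d\n", alg, param);
}

int main(void)
{
    while (experiment_step()) {                  /* start/stop/trigger */
        PerfRecord r = runtime_db_extract();     /* extract from runtime DB */
        /* Illustrative policy: adapt when the linear solver slows down. */
        if (r.iterations > 50) {
            checkpoint_request();
            adapt_request("gmres+ilu", 2);       /* adapt: algorithm, parameters */
        }
    }
    return 0;
}
```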


Future work
Integration of ongoing efforts in:
- Performance tools: common interfaces and data representation (leverage OSPAT, PerfDMF, TAU performance interfaces, and similar efforts)
- Numerical components: emerging common interfaces (e.g., TOPS solver interfaces), an increased choice of solution methods, and automated composition and adaptation strategies

Long term: is a more organized (but not too restrictive) environment for the scientific software development lifecycle possible/desirable?


Typical application development "cycle"
[Diagram of the development cycle, with stages Design, Implementation, Compilation/Linking, Testing, Deployment, Production Execution, Performance Evaluation, and Debugging, annotated with supporting concerns: external dependencies and version control; configure/make; performance tools; job management and results.]


Future work
Beyond components:
- Workflow
- Reproducible results: associate all the information necessary for reproducing a particular application instance
- An ontology of tools, and tools to guide their selection and use


Summary
- No shortage of performance evaluation, analysis, and optimization technology (and new capabilities are continuously added)
- Little shared infrastructure, limiting the utility of performance technology in scientific computing
- Components, both in performance tools and in numerical software, can be used to manage complexity and enable better performance through dynamic adaptation or multimethod solvers
- A life-cycle environment may be the best long-term solution
- Some relevant sites:
  - http://www.mcs.anl.gov/~norris
  - http://perc.nersc.gov (performance tools)
  - http://cca-forum.org (component specification)