The Cactus Framework & Numerical Relativity: Petascale Requirements and Science Drivers
Gabrielle Allen, Ed Seidel, Peter Diener, Erik Schnetter, Christian Ott and others
[email protected]
Center for Computation & Technology
Departments of Computer Science & Physics, Louisiana State University



4/8/07


Gravitational Wave Physics

(diagram: Observations ↔ Models → Analysis & Insight)

LSU/AEI Collaboration (Seidel/Rezzolla)


Solving Einstein's Equations

• Einstein Equations: Gµν(γij) = 8πTµν
  – Constraint Equations:
    • 4 coupled elliptic equations for initial data and beyond
    • Familiar from Newton: ∇²φ = 4πρ
  – 12 fully 2nd-order evolution equations for γij, Kij (∂γij/∂t)
    • Like a "wave equation": ∂²φ/∂t² − ∇²φ = Source(φ, φ², φ′)
    • Thousands of terms in RHS (automatic code generation)
  – 4 gauge conditions
    • Elliptic, hyperbolic, whatever …
  – GR hydrodynamics for Tµν
• Analytically one can only study trivial solutions or approximations … full numerical 3D models needed
• Black hole physics, neutron star physics, vacuum spacetimes, BH-BH/NS-NS/NS-BH binaries, supernovae, gamma-ray bursts, …


BSSN System

Plus equations for gauge variables

Current BH Runs

• Previous unigrid runs scaled to thousands of processors
• Current runs use FMR (around 6 levels, ~80×80×80 nested grids)
• Scale to 64 procs
• Take 2 weeks to run
• (Boundary conditions, gauge conditions, initial data, elliptic solves)
• Easy AMR viz needed!!!!

(Movie from P. Diener, CCT)


Current Supernova Work
Christian Ott (U. Arizona) et al: Fully consistent 3+1 GR calculations of rotating core collapse (using ORNL machines)


Current Supernova Work (2)

Christian Ott (U. Arizona) et al


Cactus Code

• Freely available, modular, portable and manageable environment for collaboratively developing parallel, high-performance multi-dimensional simulations (component-based)
• Developed for Numerical Relativity, but now a general framework for parallel computing (CFD, astro, climate, chem eng, quantum gravity, …)
• Finite difference, AMR (Carpet, SAMRAI, Grace), new FE/FV, multipatch
• Active user and developer communities, main development now at LSU and AEI
• Open source, documentation, etc.


Cactus Structure

Core "flesh" with plug-in "thorns":
• Core "Flesh" (ANSI C): parameters, grid variables, error handling, scheduling, extensible APIs, make system
• Plug-In "Thorns" (modules, Fortran/C/C++): driver, input/output, interpolation, SOR solver, coordinates, boundary conditions, black holes, equations of state, remote steering, wave evolvers, multigrid
• Your Physics !! Computational Tools !!


Cactus Flesh

• Written in ANSI C
• Independent of all thorns
• Contains flexible build system, parameter parsing, rule-based scheduler, …
• After initialization acts as a utility/service library which thorns call for information or to request some action (e.g. parameter steering)
• Contains abstracted APIs for:
  – Parallel operations, IO and checkpointing, reduction operations, interpolation operations, timers (APIs designed for science needs)
• All actual functionality provided by (swappable) thorns


Cactus Thorns (Components)

• Can be written in C, C++, Fortran 77, Fortran 90, (Java, Perl, Python)
• Separate libraries encapsulating some functionality
• To keep the distinction between functionality and its implementation, each thorn declares that it provides a certain "implementation"
• Different thorns can provide the same "implementation"
• Thorn dependencies expressed in terms of "implementations", so that thorns providing the same "implementation" are interchangeable


Thorn Specification

• Each thorn contains configuration files which specify its interface with the Flesh and other thorns
• Configuration files converted at compile time (Fortran) into a set of routines the Flesh can call for thorn information:
  – Scheduling directives
  – Variable definitions
  – Function definitions
  – Parameter definitions
  – Configuration details
• Configuration files have a well-defined language; can be used as a basis to build interoperability with other component-based frameworks
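As a schematic illustration of these configuration files (a simplified sketch with an invented thorn name; not guaranteed to match current CCL syntax exactly), a thorn might declare:

```
# interface.ccl -- the "implementation" provided and the grid variables
implements: wavesolver
inherits: grid

real evolved_vars type=GF
{
  phi, pi
} "scalar field and its time derivative"

# schedule.ccl -- when the Flesh should call the thorn's routines
schedule WaveSolver_Evolve at EVOL
{
  LANG: C
  SYNC: evolved_vars
} "Advance the field one time step"

# param.ccl -- steerable parameters with validity ranges
REAL amplitude "Initial pulse amplitude"
{
  0.0:* :: "Any non-negative value"
} 1.0
```

Any other thorn declaring `implements: wavesolver` could be activated in its place, which is how the interchangeability described above is enforced.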


Timebins in Main Loop


Users and Toolkits

• Many numerical relativity groups around the world
  – Over 100 publications
  – Maya, Whisky, Lazarus, …
• Others:
  – CFD
  – Quantum Gravity
  – Chemical Engineering
  – Crack Propagation
  – Environmental modeling
  – Plasma physics
  – Computer science
  – Astrophysics
  – Cosmology
  – [Biology/Materials]
• Toolkits
  – Cactus Computational Toolkit
  – Einstein Toolkit
  – CFD Toolkit
  – (Biology Toolkit)
• Teaching
  – Over 30 student theses/diplomas


Cactus Einstein

• Cactus modules (thorns) for numerical relativity
• Many (>100) additional thorns available from other groups (AEI, CCT, …)
• Agree on a few basics (e.g. names of variables) and then can share evolution, analysis etc.
• Over 100 relativity papers & 30 student theses

Thorns shown on the slide:
  – Evolve: ADM, EvolSimple
  – Analysis: ADMAnalysis, ADMConstraints, AHFinder, Extract, PsiKadelia, TimeGeodesic
  – InitialData: IDAnalyticBH, IDAxiBrillBH, IDBrillData, IDLinearWaves, IDSimple
  – Gauge Conditions: CoordGauge, Maximal
  – Infrastructure: SpaceMask, ADMCoupling, ADMMacros, StaticConformal, ADMBase


Numerical Methods

• Most application codes using Cactus use finite differences on structured meshes
• Parallel driver thorns: unigrid (PUGH), FMR (Carpet), AMR (PARAMESH, Grace, SAMRAI), finite volume/element on structured meshes
• Method of lines thorn
• Elliptic solver interface (PETSc, SOR, Multigrid, Trilinos)
• Multipatch with Carpet driver
• Unstructured mesh support being added


Parallelism

• Scheduler calls routines and provides an n-D block of data (typical set-up for FD codes)
• Also information about size, boundaries, etc.
• Fortran memory layout used (appears to C as a 1D array)
• Driver thorns are responsible for memory management and communication
  – Abstracted from science modules
• Supported parallel operations
  – Ghostzone synchronization, generalized reduction, generalized interpolation


PUGH UniGrid Driver

• Was the standard driver for science runs until last year
• MPI domain decomposition
• Flexible methods for load balancing, processor topology
• Well optimized, scales very well for numerical relativity kernels (e.g. to 33K processors on BG/L)


Carpet AMR Driver

• Carpet is a mesh refinement library for Cactus, written in C++, mainly developed by Erik Schnetter (LSU)
• Implements the (minimal) Berger-Oliger algorithm, constant refinement ratio, vertex-centered refinement
• Uses MPI to decompose grids across processors and handle communications
• Currently scaled and used with up to 64 processors; work ongoing to better optimize


IO

• Support for IO in different formats (generic interface)
  – 2-d slices as jpegs
  – N-d ASCII data
  – N-d data in IEEEIO format
  – N-d data in HDF5 format (to disk or streamed)
  – Panda parallel IO
  – Isosurfaces, MPEGs
• Checkpoint/restart (move to any new machine)


External Libraries

• IO API:
  – HDF5, FlexIO, jpeg, mpeg, NetCDF
• Elliptic/Solver subsystem:
  – LAPACK, BLAS, PETSc, Trilinos, FFTW
• Timer API:
  – PAPI
• Driver API:
  – MPI, Grace, SAMRAI, PARAMESH
• Grid:
  – GAT, MPICH-G2, (Globus)


Performance and Optimization

• Designed for portability: iPAQ, PS2, Xbox, Itanium '99, EarthSim, BG/L, …
• 33K procs on BG/L, but now new complex data structures (AMR)

BG/L (2006)


Benchmark/Scaling Study

http://www.cactuscode.org/Articles/Cactus_Madiraju06.pdf/


Relativistic Astrophysics in 2010

• Frontier Astrophysics Problems
  – Full 3D GR simulations of binary systems for dozens of orbits and merger to a final black hole
    • All combinations of black holes, neutron stars, and exotic objects like boson stars, quark stars, strange stars
  – Full 3D GR simulations of core collapse, supernova explosions, accretion onto NS, BH
  – Gamma-ray bursts
• All likely to be observed by LIGO in the timeframe of this facility


Physics Needed

• Full 3D GR
  – Coupled hyperbolic, elliptic equations, 10^4 double precision floats/grid point. 30 years of work; just in the last year have algorithms been developed for many orbits of binaries
• GR Hydro
  – Complete coupled GR-Hydro codes just now available
• MHD
  – Just starting
• Nuclear physics, EOS
  – Complex, time consuming
• Radiation transport
  – Not really started in 3D GR

Need all of these for a complete solution to the above problems


Computational Needs for GRB

• Resolve from 10,000 km down to 100 m on a domain of 1,000,000 km cubed for 100 secs of physical time
• Assume 16,000 flops per gridpoint
• 512 grid functions
• Computationally:
  – High order (>4th) adaptive finite difference schemes
  – 16 levels of refinement
  – Several weeks with 1 PFLOP/s sustained performance
  – (at least 4 PFLOP/s peak, >100K procs)
  – 100 TB memory (size of checkpoint file needed)
  – PBytes of storage for full analysis of output


Code Details

• Cactus is a computational framework that is used widely in numerical relativity and astrophysics groups around the world. Using the Einstein Toolkit in Cactus, similar codes have been developed that solve Einstein's equations for the gravitational field and have been applied successfully to the binary black hole problem. An EU group has further developed a GR-Hydro code in Cactus, called Whisky, that extends the capabilities to include matter, and has successfully applied the coupled Einstein-Hydro solver to problems of relativistic neutron star binaries, mixed binary systems, collapse of a neutron star to a black hole, and full 3D relativistic supernova calculations.
• Cactus couples the physics solvers to parallel driver layers that now provide AMR, parallel I/O, steering interfaces, and so on. We estimate that in several years the computations described above will be the state of the art, and will be urgently needed to interpret gravitational wave events being seen by LIGO.
• Cactus has been used in benchmarking studies on virtually every machine, from Sony Playstations to the Earth Simulator. It was recently shown to scale well to 32K processors with a relativity code using a uniform mesh. Work to achieve such scaling results for AMR drivers is now underway.


Code Details (2)

• Main methods
  – 64 bit
  – Explicit finite differencing
  – Domain decomposition
  – Structured grids with adaptive mesh refinement
  – Parallel I/O
  – Minimal global reductions (sums, min/max)


Code Details (3)

• What is the programming paradigm?
  – Domain decomposition with message passing (applications work with abstract operations)
• What languages will be used?
  – Cactus Flesh and core thorns in C, some physics thorns F90, Carpet driver C++
• What libraries will be required?
  – PETSc, LAPACK, HDF5, PAPI, MPI


Code Details (4)

• What does the source code look like?
  – Around 500,000 lines of code, but depends on the number of analysis thorns used (this is for 85 thorns in total)
• On what is the source code based?
  – Open source, used by different apps
  – Cactus toolkit, many community thorns for numerical relativity
• What special features of the system will it use?


Code Details (5)

• I/O
  – Typically using HDF5 (need platform independence)
  – I/O layer in Cactus allows new methods to be easily added/used
  – Many parameter choices (one file per processor, one file per n procs, one file per timestep, chunked vs unchunked data)
  – Checkpoint and restart handled by Cactus


Code Details (6)

• Code termination
  – Steering interface allows termination from a given condition, from an external machine, etc.
  – Usually codes crash from numerical problems (black hole singularities, grid stretching, etc.)
• Fault tolerance
  – None implemented at MPI level, but have plans for this if necessary


Code Details (7)

• Debugging
  – printf, gdb
  – Some Cactus thorns (and plans for more), e.g. NaNChecker
  – Fair amount of checks in Cactus infrastructure (e.g. parameter checking, different levels of warnings)
  – Plans for more realtime debugging capabilities (e.g. via web interface, logging)
  – TotalView, Purify, …


Code Details (8)

• Profiling
• Cactus has its own timing interface (thorns, timebins, communication, user defined, …)
• Use PAPI through the Cactus timing interface


Summary

• Cactus development challenges
  – Load balancing and scaling for AMR
  – Performance across all thorns
  – Hybrid parallel model if needed
• Machine needs
  – Memory 10 TB to 500 TB, local storage >10 PB, mass storage 100 PB
  – Large shared memory nodes
  – Cache
  – Parallel IO