30
Review/Preface to Second Day of DD-15 Tutorial William D. Gropp Lab for Advanced Numerical Software Argonne National Laboratory David E. Keyes Department of Applied Physics & Applied Mathematics Columbia University

Review/Preface to Second Day of DD-15 Tutorial · DD15 Tutorial, Berlin, 17-18 July 2003 PETSc objects l Vector and matrix objects, Vec and Mat n Accessed and used as mathematical

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Review/Preface to Second Day of DD-15 Tutorial · DD15 Tutorial, Berlin, 17-18 July 2003 PETSc objects l Vector and matrix objects, Vec and Mat n Accessed and used as mathematical

Review/Preface to Second Day of DD-15 Tutorial

William D. Gropp

Lab for Advanced Numerical SoftwareArgonne National Laboratory

David E. Keyes

Department of Applied Physics & Applied MathematicsColumbia University

Page 2: Review/Preface to Second Day of DD-15 Tutorial · DD15 Tutorial, Berlin, 17-18 July 2003 PETSc objects l Vector and matrix objects, Vec and Mat n Accessed and used as mathematical

DD15 Tutorial, Berlin, 17-18 July 2003

PETSc codeUser code

ApplicationInitialization

FunctionEvaluation

JacobianEvaluation

Post-Processing

PC KSPPETSc

Main Routine

Linear Solvers (SLES)

Nonlinear Solvers (SNES)

Timestepping Solvers (TS)

PETSc’s SLES & SNES: objects and algorithms

Page 3: Review/Preface to Second Day of DD-15 Tutorial · DD15 Tutorial, Berlin, 17-18 July 2003 PETSc objects l Vector and matrix objects, Vec and Mat n Accessed and used as mathematical

DD15 Tutorial, Berlin, 17-18 July 2003

PETSc objectsl Vector and matrix objects, Vec and Mat

n Accessed and used as mathematical objects n Code usage independent of runtime choice of data structuresn Implemented to allow for high performance

u Split-phase operationsu Overlap of communication and computationu Aggregation, locality, low memory footprint

l Cartesian grid object DAl Solver objects SLES and SNES

n Large collection of progressive algorithmic options assembled under abstract interfaces

n Recursively callable, with separate contexts for sub-objectsn Command-line tunable

l Auxilliary objects (e.g., Viewer)

Page 4: Review/Preface to Second Day of DD-15 Tutorial · DD15 Tutorial, Berlin, 17-18 July 2003 PETSc objects l Vector and matrix objects, Vec and Mat n Accessed and used as mathematical

DD15 Tutorial, Berlin, 17-18 July 2003

Basic and advanced algorithmsl Linear solvers

n Schwarz: single-level, two-leveln Schur: reduced, full (interface-interior, primal-dual)n Schwarz-Schur hybrids

l Nonlinear solversn Newton-Krylov-Schwarz (NKS)n Jacobian-free Newton-Krylov (JFNK)

l Implicit time integratorsn Time-accurate time stepping (PETSc’s TS)n Pseudo-transient continuation (? tc)

Page 5: Review/Preface to Second Day of DD-15 Tutorial · DD15 Tutorial, Berlin, 17-18 July 2003 PETSc objects l Vector and matrix objects, Vec and Mat n Accessed and used as mathematical

DD15 Tutorial, Berlin, 17-18 July 2003

Scalability properties of DDl Parallel scaling (time per iteration)

n Good “surface-to-volume” ratio: communication subdominant to computation, Amdahl’s law defeated

n Reasonable communication requirements: mostly nearest neighbor, of small size when global

n Can increase processors without bound with increased system size, at power-law rate that depends upon richness of communication network

l Convergence scaling (iterations needed for solution)n Schwarz and Schur can have bounded condition number

n Newton can have quadratic convergencen Nonlinear robustification techniques

Page 6: Review/Preface to Second Day of DD-15 Tutorial · DD15 Tutorial, Berlin, 17-18 July 2003 PETSc objects l Vector and matrix objects, Vec and Mat n Accessed and used as mathematical

DD15 Tutorial, Berlin, 17-18 July 2003

So far…l Good stories mentioned …

n 1999 Gordon Bell Prize for aerodynamic computation using NKS (PETSc, Argonne)

n 2002 Gordon Bell Prize for structural dynamics computation using FETI-DP (Salinas, Sandia)

n Combustion model for ? tc

l But only simple examples presented in detail …n Poisson problem

n Bratu problem

n Driven cavity

Page 7: Review/Preface to Second Day of DD-15 Tutorial · DD15 Tutorial, Berlin, 17-18 July 2003 PETSc objects l Vector and matrix objects, Vec and Mat n Accessed and used as mathematical

DD15 Tutorial, Berlin, 17-18 July 2003

This morning …l Brief review

l More advanced algorithms

l Programmatic context of current PETSc development environmentn Mathematicians and computer scientists, come join

the fun☺

l Broader sense of PDE-type scientific applications being addressed through PETScn Applications scientists, come join the fun☺

Page 8: Review/Preface to Second Day of DD-15 Tutorial · DD15 Tutorial, Berlin, 17-18 July 2003 PETSc objects l Vector and matrix objects, Vec and Mat n Accessed and used as mathematical

DD15 Tutorial, Berlin, 17-18 July 2003

Two distinct definitions of scalabilityl “Strong scaling”

n execution time decreases in inverse proportion to the number of processors

n fixed size problem overall

l “Weak scaling”n execution time remains constant,

as problem size and processor number are increased in proportion

n fixed size problem per processor

n also known as “Gustafson scaling”

T

p

good

poor

poor

N ∝ p

log T

log pgood

N constant

Slope= -1

Slope= 0

Page 9: Review/Preface to Second Day of DD-15 Tutorial · DD15 Tutorial, Berlin, 17-18 July 2003 PETSc objects l Vector and matrix objects, Vec and Mat n Accessed and used as mathematical

DD15 Tutorial, Berlin, 17-18 July 2003

Three important usages of “efficiency”l For any iterative method,

l “Convergence efficiency”

l “Implementation efficiency”

l Overall efficiency

l “Ideal” is unity for all three measures

)()(# iterpertimeitersoftime ×=

)(#)(#

large

smallconv Pforitersof

Pforitersof=η

large

small

large

smallimpl )(

)(PP

PforiterpertimePforiterpertime

×=η

large

small

large

smallimplconv )(

)(PP

PfortimePfortime

×=×= ηηη

Page 10: Review/Preface to Second Day of DD-15 Tutorial · DD15 Tutorial, Berlin, 17-18 July 2003 PETSc objects l Vector and matrix objects, Vec and Mat n Accessed and used as mathematical

DD15 Tutorial, Berlin, 17-18 July 2003

PDE varietiesl Evolution (time hyperbolic, time parabolic)

l Equilibrium (elliptic, spatially hyperbolic or parabolic)

l Mixed, varying by region

l Mixed, of multiple type (e.g., parabolic with elliptic constraint)

KK =∇•⋅∇−•∂∂

=•⋅∇+•∂∂

)()(,))(()( αt

ft

K=∇•−•⋅∇ )( αU

Page 11: Review/Preface to Second Day of DD-15 Tutorial · DD15 Tutorial, Berlin, 17-18 July 2003 PETSc objects l Vector and matrix objects, Vec and Mat n Accessed and used as mathematical

DD15 Tutorial, Berlin, 17-18 July 2003

Common featuresl Despite enormous variety of mathematical analysis

required, most computational algorithms for PDEs can be described with surprisingly few common bulk-synchronous (SPMD) kernelsn Domain-decomposed loops over mesh points manipulating purely

local data

n Domain-decomposed loops over mesh points (or edges) manipulating own and near neighbors’ data

n Global reductions and recurrences

l The large variety of operations, yet simplicity of operation class, invites abstraction and performance optimization: in short, an object oriented library

Page 12: Review/Preface to Second Day of DD-15 Tutorial · DD15 Tutorial, Berlin, 17-18 July 2003 PETSc objects l Vector and matrix objects, Vec and Mat n Accessed and used as mathematical

DD15 Tutorial, Berlin, 17-18 July 2003

Questions?

l Now is a great time for them!

l Next, we go on to nonlinear Schwarz and advanced concepts in nonlinear solver software, to finish Lecture #2 from Thursday afternoon

l Then we start Lecture #3 on applications this morning

l This afternoon’s Lecture #4 is on two research topics: physics-based preconditioning and PDE-constrained optimization

l Whatever we don’t finish will be posted on line by this evening

Page 13: Review/Preface to Second Day of DD-15 Tutorial · DD15 Tutorial, Berlin, 17-18 July 2003 PETSc objects l Vector and matrix objects, Vec and Mat n Accessed and used as mathematical

DD15 Tutorial, Berlin, 17-18 July 2003

Nonlinear Schwarz preconditioningl Nonlinear Schwarz has Newton both inside and

outside and is fundamentally Jacobian-freel It replaces with a new nonlinear system

possessing the same root, l Define a correction to the partition (e.g.,

subdomain) of the solution vector by solving the following local nonlinear system:

where is nonzero only in the components of the partition

l Then sum the corrections: to get an implicit function of u

0)( =uF0)( =Φ u

thi

thi

)(uiδ

0))(( =+ uuFR ii δn

i u ℜ∈)(δ

)()( uu ii δ∑=Φ

Page 14: Review/Preface to Second Day of DD-15 Tutorial · DD15 Tutorial, Berlin, 17-18 July 2003 PETSc objects l Vector and matrix objects, Vec and Mat n Accessed and used as mathematical

DD15 Tutorial, Berlin, 17-18 July 2003

Nonlinear Schwarz – picture

11

11

0 0

u

F(u)

Ri

RiuRiF

Page 15: Review/Preface to Second Day of DD-15 Tutorial · DD15 Tutorial, Berlin, 17-18 July 2003 PETSc objects l Vector and matrix objects, Vec and Mat n Accessed and used as mathematical

DD15 Tutorial, Berlin, 17-18 July 2003

Nonlinear Schwarz – picture

11

11

0 0

11

11

0 0

u

F(u)

Ri

Rj

Riu

RjF

RiF

Rju

Page 16: Review/Preface to Second Day of DD-15 Tutorial · DD15 Tutorial, Berlin, 17-18 July 2003 PETSc objects l Vector and matrix objects, Vec and Mat n Accessed and used as mathematical

DD15 Tutorial, Berlin, 17-18 July 2003

Nonlinear Schwarz – picture

11

11

0 0

11

11

0 0

u

F(u)

Fi’(ui)

Ri

Rj

Riu

RjF

RiF

Rju

diu+dju

Page 17: Review/Preface to Second Day of DD-15 Tutorial · DD15 Tutorial, Berlin, 17-18 July 2003 PETSc objects l Vector and matrix objects, Vec and Mat n Accessed and used as mathematical

DD15 Tutorial, Berlin, 17-18 July 2003

Nonlinear Schwarz, cont.l It is simple to prove that if the Jacobian of F(u) is

nonsingular in a neighborhood of the desired root then and have the same unique root

l To lead to a Jacobian-free Newton-Krylov algorithm we need to be able to evaluate for any :n The residual n The Jacobian-vector product

l Remarkably, (Cai-Keyes, 2000) it can be shown that

where and l All required actions are available in terms of !

0)( =Φ u

nvu ℜ∈,)()( uu ii δ∑=Φ

0)( =uF

vu ')(Φ

JvRJRvu iiTii )()( 1' −∑≈Φ

)(' uFJ = Tiii JRRJ =

)(uF

Page 18: Review/Preface to Second Day of DD-15 Tutorial · DD15 Tutorial, Berlin, 17-18 July 2003 PETSc objects l Vector and matrix objects, Vec and Mat n Accessed and used as mathematical

DD15 Tutorial, Berlin, 17-18 July 2003

Nonlinear Schwarz, cont.

Discussion:l After the linear version of additive Schwarz was

invented, it was realized that it is in the same class of methods as additive multigrid, in terms of operator algebra: projection off the error in a set of subspaces (though linear MG is usually done multiplicatively)

l After nonlinear Schwarz was invented, it was realized that it is in the same class of methods as nonlinear multigrid (full approximation scheme MG), in terms of operator algebra (though nonlinear multigrid is usually done multiplicatively)

Page 19: Review/Preface to Second Day of DD-15 Tutorial · DD15 Tutorial, Berlin, 17-18 July 2003 PETSc objects l Vector and matrix objects, Vec and Mat n Accessed and used as mathematical

DD15 Tutorial, Berlin, 17-18 July 2003

Driven cavity in velocity-vorticity coords

02 =∂∂

−∇−y

02 =∂∂

+∇−x

0Gr2 =∂∂

−∂∂

+∂∂

+∇−xT

yv

xu

ωωω

0)(Pr2 =∂∂

+∂∂

+∇−yT

vxT

uT

x-velocity

y-velocity

vorticity

internal energy

hotcold

Page 20: Review/Preface to Second Day of DD-15 Tutorial · DD15 Tutorial, Berlin, 17-18 July 2003 PETSc objects l Vector and matrix objects, Vec and Mat n Accessed and used as mathematical

DD15 Tutorial, Berlin, 17-18 July 2003

Experimental example of nonlinear Schwarz

Newton’s method Additive Schwarz Preconditioned Inexact Newton(ASPIN)

Difficulty at critical Re

Stagnation beyond

critical Re

Convergence for all Re

Page 21: Review/Preface to Second Day of DD-15 Tutorial · DD15 Tutorial, Berlin, 17-18 July 2003 PETSc objects l Vector and matrix objects, Vec and Mat n Accessed and used as mathematical

DD15 Tutorial, Berlin, 17-18 July 2003

Common software infrastructure for nonlinear PDE solvers – coming in PETSc

l The user codes to the problem they are solving (by providing F(u)), not the algorithm used to solve the problem

l Implementation of various algorithms reuse common concepts and code when possible, withoutlosing efficiency

Optimizer

Linear solver

Eigensolver

Time integrator

Nonlinear solver

Indicates dependence

Sens. AnalyzerVision:

Page 22: Review/Preface to Second Day of DD-15 Tutorial · DD15 Tutorial, Berlin, 17-18 July 2003 PETSc objects l Vector and matrix objects, Vec and Mat n Accessed and used as mathematical

DD15 Tutorial, Berlin, 17-18 July 2003

Encompassing …l Newton’s method

n Jacobian-free methods n w/Jacobian-based preconditioned linear solvers

(Newton-PCG)n w/multigrid linear solvers (Newton-MG)

l Nonlinear multigridn a.k.a. Full approximation scheme (FAS) n a.k.a. MG-Newton

Page 23: Review/Preface to Second Day of DD-15 Tutorial · DD15 Tutorial, Berlin, 17-18 July 2003 PETSc objects l Vector and matrix objects, Vec and Mat n Accessed and used as mathematical

DD15 Tutorial, Berlin, 17-18 July 2003

Software engineering ingredients

l Standard solver interfaces (SLES, SNES)

l Solver libraries

l Automatic differentiation (AD) tools

l Code generation tools

Page 24: Review/Preface to Second Day of DD-15 Tutorial · DD15 Tutorial, Berlin, 17-18 July 2003 PETSc objects l Vector and matrix objects, Vec and Mat n Accessed and used as mathematical

DD15 Tutorial, Berlin, 17-18 July 2003

Algorithm review

F(u) = 0, Jacobian A(u)

Newton

Newton–SOR (1 inner linear sweep over i)

SOR–Newton (1 inner nonlinear sweep over i)

1( ) ( )u u A u F u−← −

1( ){ ( ) ( )[ ]}i ji ii i ij jj i

u u A u F u A u u u−

<

← − − −∑

1( ) ( )ii ii iu u A u F u−← −

Page 25: Review/Preface to Second Day of DD-15 Tutorial · DD15 Tutorial, Berlin, 17-18 July 2003 PETSc objects l Vector and matrix objects, Vec and Mat n Accessed and used as mathematical

DD15 Tutorial, Berlin, 17-18 July 2003

Cute observation

( ) ( )

( ) ( ) ( )[ ]

ii ii

ji i ij j

A u A u

F u F u A u u u

≈ + −∑1( ){ ( ) ( )[ ]}i ji ii i ij j

j i

u u A u F u A u u u−

<

← − − −∑

1( ) ( )ii ii iu u A u F u−← −SOR-Newton

… with approximations

… gives Newton-SOR

⇒ (Gauss-Seidel) matrix-free linear relaxation

is very similar to nonlinear relaxation

Page 26: Review/Preface to Second Day of DD-15 Tutorial · DD15 Tutorial, Berlin, 17-18 July 2003 PETSc objects l Vector and matrix objects, Vec and Mat n Accessed and used as mathematical

DD15 Tutorial, Berlin, 17-18 July 2003

Function and Jacobian evaluation

l FAS requires them pointwise

l Newton requires them globally

l Newton-MG requires bothn Newton (outside) globally

n MG (inside) locally

Page 27: Review/Preface to Second Day of DD-15 Tutorial · DD15 Tutorial, Berlin, 17-18 July 2003 PETSc objects l Vector and matrix objects, Vec and Mat n Accessed and used as mathematical

DD15 Tutorial, Berlin, 17-18 July 2003

Enter Automatic Differentiation

l Given code for AD can compute

n A(u) and

n A(u)*w efficientlyl Given code for AD can compute

n and

n efficiently

( )F u

( )iF u( )iiA u

( )ij jj

A u w∑

Page 28: Review/Preface to Second Day of DD-15 Tutorial · DD15 Tutorial, Berlin, 17-18 July 2003 PETSc objects l Vector and matrix objects, Vec and Mat n Accessed and used as mathematical

DD15 Tutorial, Berlin, 17-18 July 2003

Automated Code generation (“in-lining”)

• Newton method requires a user-provided function and an AD Jacobian

• Big performance hit if handled directly with components§ A outer loop over an inner subroutine that calls a function at

each point is inefficient globally

§ An outer subroutine that loops over all inner points is wasteful locally (information thrown away)

• Need to have it both ways, depending upon algorithmic context; user should not see this level of performance-induced coding complexity

Page 29: Review/Preface to Second Day of DD-15 Tutorial · DD15 Tutorial, Berlin, 17-18 July 2003 PETSc objects l Vector and matrix objects, Vec and Mat n Accessed and used as mathematical

DD15 Tutorial, Berlin, 17-18 July 2003

Coarse grid correction can also be handled

Coarse grid objects generated together with fine

(e.g., PETSc’s DMMG)

l Newton-MG

l MG-Newton

( ) ( )H H

TH

A Ru c RF u

u u R c

=

← −

)

( ) ( ) ( ) 0H H HF Ru c F Ru RF u+ − + =) )

Page 30: Review/Preface to Second Day of DD-15 Tutorial · DD15 Tutorial, Berlin, 17-18 July 2003 PETSc objects l Vector and matrix objects, Vec and Mat n Accessed and used as mathematical

DD15 Tutorial, Berlin, 17-18 July 2003

Conclusion

The algorithmic/mathematical building blocks for Newton-MG and MG-Newton are essentially the same.

Thus the software building blocks should be also (and they will be in the next release of PETSc, version 3.0).