Upload
others
View
2
Download
0
Embed Size (px)
Citation preview
Big Iron Chef Episode:
The Laplace Transform versus Parareal
Craig C. Douglas (University of Wyoming and the King Abdullah University of Science & Technology (KAUST))
with
Dongwoo Sheen and Imbunm Kim (Seoul National University) Hyoseop Lee (Alcatel-Lucent Bell Labs – Seoul)
Samir Karaa (Sultan Qaboos University)
The research is based on work partially supported by AFOSR, NRF, NSF, KAUST, and the Seoul R & D Program.
Scaling Up to Massive Parallelism Airplane alternate energy example (A-380): • One Rolls Royce engine is equivalent to
~75,000 standard blow dryers. • 4 dirty or 300,000 green power sources?
o How much would batteries for 300,000 blow dryers weigh to fly an A-380 from Sydney to LAX? ! Answer: Too much to get the plane off the
ground. (Open research area in batteries). o Now think about Peta/Exa-scale computing…
Time Evolution Problems Differential Equations:
– Ordinary: u'= f (u) – Parabolic: du
dt =L(u)+ f
Classical methods based on time stepping:
…
… t0 t1
tNt−1tNt
Computational Complexity Example Suppose we have Ns spatial points, Nt time steps, and we solve a parabolic equation solved using backward Euler combined with multigrid. Then the cost of solving the problem has two cases:
Serial O(Nt ⋅Ns) Parallel O(Nt ⋅log2Ns)
The Big Question Can we robustly solve parts of the problem later in time before fully approximating the solution earlier in time using something similar to a traditional numerical algorithm for solving partial differential equations? For a long time the answer was thought to be no… (cf. [A. Deshpande, S. Malhotra, C. C. Douglas, and M. H. Schultz, A rigorous analysis of time domain parallelism. Parallel Algorithms and Applications, 6 (1995), pp. 53-62].)
Parareal Papers
See also [M.J. Gander and S. Vandewalle, Analysis of the parareal time-parallel time-integration method, SIAM J. Sci. Comput., 29 (2007), pp. 556–578].
Parareal Algorithm Consider the ODE u'= f (u) starting from an initial condition of u1=u(t1). Use two time propagation operators of the form, • G(t2,t1,u1) is a rough approximation of u(t2). • F(t2,t1,u1) is a more accurate approximation of u(t2).
For example,
Parareal starts with a coarse initial guess Un0 at time points
t1, t2, , tN and computes Unk for
k=1, 2, … by a series of correction iterations.
Parareal Implementation Loop over for k=0,1,..., Iteration #s for n=0,1,...,N−1, Time steps Un+1k+1=G(tn+1,tn,Unk+1)+F(tn+1,tn,Unk)−G(tn+1,tn,Unk); Comments: Dominant part of the computation (F) is embarrassingly parallel in time. About five lines of Matlab to experiment with Parareal.
Parareal Update Pattern
About Convergence • When converged, we have F-propagator
accuracy at each tn . • Convergence guaranteed after N iterations on
t0, t1, , tN .
• Is this it??? (Remember the Big Question?)
Typical Theorem for Parareal Theorem (LMT 2001): For u'=−au and u(0)=u0, let F(tn+1,tn,Unk) denote the exact solution at tn+1 and G(tn+1,tn,Unk) be the backward Euler approximation with time step ΔT . Then
max1≤n≤N
u(tn)−Unk ≤CkΔTk+1.
Some Interpretations of Parareal • Just a solver for the F-equations if Parareal
iterates until convergence. • A new time integrator if the number of
iterations is fixed in advance. • Is it related to an already known time
integration method from the dark ages (BG era) of the printed document library?
BG = before Google
Multiple Shooting Methods For N intervals for solving u'= f (u), u(0)=u0, t∈[0,1],
define Un+1k+1=un(tn+1,Unk)+∂un∂Un
(tn+1,Unk)Unk+1−Unk⎛
⎝
⎜⎜⎜
⎞
⎠
⎟⎟⎟.
Theorem: If in the multiple shooting method, un(tn+1,Unk)≈F(tn+1,tn,Unk) and
∂un∂Un
(tn+1,Unk)Unk+1−Unk⎛
⎝
⎜⎜⎜
⎞
⎠
⎟⎟⎟≈G(tn+1,tn,Unk+1)−G(tn+1,tn,Unk),
then the multiple shooting and Parareal methods coincide.
Interpretation and Commentary Interpretation: Parareal = multiple shooting with a coarse approximate Jacobean. Comment: Different approximations for the
∂un∂Un
(tn+1,Unk)Unk+1−Unk⎛
⎝
⎜⎜⎜
⎞
⎠
⎟⎟⎟
term lead to different time-parallel algorithms. See [H.B. Keller, Numerical Solution of Two-Point Boundary Value Problems (CBMS-NSF Regional Conference Series in Applied Mathematics), SIAM, 1976].
Speedup Let S=TSTP
= NtFNtG+K NtG+(N /P)tF
⎛
⎝
⎜⎜
⎞
⎠
⎟⎟
≈PK , where
P = Number of processor cores K = Number of iterations N = Number of time steps tG , tF = 1 step cost of the G and F propagators. Limited speedup if K is large. Useless if P=1.
Where Is Parareal Useful? Fluid, structure, molecular dynamics, … problems. Extensions developed in recent years:
1. Combined with multilevel in time, space 2. Domain decomposition in space 3. Subspace filtering 4. Combined with waveform relaxation 5. Combined with Kalman/stochastic filtering
17
Simple Computational Examples
1. For u(t0)=1, t∈0,30⎡
⎣
⎢⎢⎢
⎤
⎦
⎥⎥⎥, ΔT =1 and Δt=0.01,
u'(t)=−u(t)+sin(t).
Use the trapezoidal rule. The initial error ~1.
With K=13, error ~10−14 .
18
2. Brusselator problem: For x(0)=0, y(0)=1, t∈0,12⎡
⎣
⎢⎢⎢
⎤
⎦
⎥⎥⎥, T =12, ΔT =T /32, Δt=T /320,
x=1+x2y−4x and y=3x−x2y.
Use a 4th order Runge-Kutta scheme. With K=7, error ~10−12.
See [M.J. Gander and S. Vandewalle, Analysis of the parareal time-parallel time-integration method, SIAM J. Sci. Comput., 29 (2007), pp. 556–578].
Laplace Transform (LT)
We solve
∂u∂t+Au= f , t∈(0,T ], starting from u(0)=u0.
Given some z∈ and a function u(⋅,t), the Laplace transform in time is given by
u(⋅,z)≡L[u](z)= u(⋅,t)e−ztdt0∞∫ .
We are left solving by any reasonable elliptic solver the transformed problem
u(⋅,z)= zI+A⎛
⎝
⎜⎜
⎞
⎠
⎟⎟
−1u0(⋅)+ f (⋅,z)⎛
⎝
⎜⎜⎜
⎞
⎠
⎟⎟⎟.
We assume for some CS∈
+ the real parts of singular points of u0(⋅)+ f (⋅,z)≤CS . Let the integral contour be a straight line parallel to the imaginary axis,
Γ≡ z∈ | z(ω)=α+iω, α≥Cs, ω∈[−∞→∞]=⎧⎨⎪
⎩⎪
⎫⎬⎪
⎭⎪.
The Laplace inversion formula is u(⋅,t)= 1
2πi u(⋅,z)eztdzΓ∫ .
When z0 and z∈Γ has negative real parts, the
discretization error in evaluating the integrals for u(⋅,t) is significantly reduced for all t>0. We deform Γ to the left half plane with all singularities to its left with a hyperbola contour
Γ= z∈ | z(ω)+isω, ω∈[−∞→∞], ζ (ω)=γ − ω2+υ2⎧
⎨⎪
⎩⎪
⎫
⎬⎪
⎭⎪
In essence, the hyperbola contour must be kept away from the spectrum of −A and the singular points of f (z).
Define ψ (ω)= tanh(τω2 ): (−∞,∞)→[−1,1]. Then
Use the trapezoidal rule for discretization.
u(t)= 12π i ezt
Γ∫ u(z)dz
= 12π i e{σ (ω)+isω}t
−∞∞∫ uσ (ω)+iω
⎛
⎝
⎜⎜⎜⎜⎜
⎞
⎠
⎟⎟⎟⎟⎟
{σ′(ω)+is}dω
= 12π i e{σ (ψ−1(y))+isψ−1(y)}t
−11∫ uσ (ψ−1(y))+isψ−1(y)
⎛
⎝
⎜⎜⎜⎜⎜
⎞
⎠
⎟⎟⎟⎟⎟
σ′(ψ−1(y))+is⎧
⎨
⎪⎪
⎩
⎪⎪
⎫
⎬
⎪⎪
⎭
⎪⎪
dψ−1dy (y)dy.
Higher Order Compact Finite Differences Restrict to A=−aΔ in 2D with f ≡0 and unit square domain. We can easily construct 4th and 6th order 9 point discrete stencils. Define
σ s=u j,k+1+u j+1,k+u j,k−1+u j−1,k ,
σc=u j+1,k+1+u j+1,k−1+u j−1,k−1+u j−1,k+1,
ψ s=(u0) j,k+1+(u0) j+1,k+(u0) j,k−1+(u0) j−1,k , and
ψ c=(u0) j+1,k+1+(u0) j+1,k−1+(u0) j−1,k−1+(u0) j−1,k+1.
Then the stencils have the form A0u j,k+Asσ s+Acσc=B0(u0) j,k+Bsψ s+Bcψ c,
for j,k=1, , Nx and u j,k=0 if jk( j−Nx)(k−Nx)=0.
4th order Dirichlet problem: A0=
10a3 +h2z 1+h2z12a
⎛
⎝
⎜⎜⎜⎜
⎞
⎠
⎟⎟⎟⎟
, As=−2a3 , Ac=−a6 ,
B0=h223+
h2z12a
⎛
⎝
⎜⎜⎜⎜
⎞
⎠
⎟⎟⎟⎟
, Bs=h212 , and Bc=0
Neumann problem: modify stencil and Bc.
6th order Dirchlet problem:
A0=10a3 +h2 46z45 +
h2z12a+
h4z3360a2
⎛
⎝
⎜⎜⎜⎜
⎞
⎠
⎟⎟⎟⎟
, As=− 2a3 +h2z90
⎛
⎝
⎜⎜⎜⎜
⎞
⎠
⎟⎟⎟⎟
,
Ac=− a6−h2z180
⎛
⎝
⎜⎜⎜⎜
⎞
⎠
⎟⎟⎟⎟
, and B0=h2 1+h2z12a+
h4z2360a2
⎛
⎝
⎜⎜⎜⎜
⎞
⎠
⎟⎟⎟⎟
.
The right hand side is given by
B0(u0) j,k+h412+
h6z360a
⎛
⎝
⎜⎜⎜
⎞
⎠
⎟⎟⎟(u0xx+u0yy)+
h6360(4u0xxyy+u0xxxx+u0yyyy).
Parabolic Example Consider
ut−uxx+uyy
5π2 =0, u(x,y,0)=esin(2πx)sin(πy)
with exact solution u(x,y,t)=e1−tsin(2πx)sin(πy). The Laplace transformed problem is given by zu− uxx+uyy
5π2 =esin(2πx)sin(πy) in [0,1]2, u=0 on ∂[0,1],
where
Γ=z∈ | z(ω)+isω, ω∈[−∞→∞],ζ (ω)=γ − ω2+υ2, ω(y)=2
τ tanh−1y=1τ log1+y
1−y
⎧
⎨
⎪⎪⎪
⎩
⎪⎪⎪
⎫
⎬
⎪⎪⎪
⎭
⎪⎪⎪.
Using [J.A.C. Weideman, and L.N. Trefethen, Parabolic and
hyperbolic contours for computing the Bromwich integral. Math.
Comp., 76 (2007), pp. 1341–1356], we get
α =1.1721, a(α)=cosh−1 2α(4α−π )sinα
⎛
⎝
⎜⎜⎜⎜
⎞
⎠
⎟⎟⎟⎟
, γ =4πα−π2a(α) ⋅Nzt ,
υ=γ sin(α), s=γ cot(α), and τ = log(2Nz−1)
γ sin(α)sinh a(α)Nz−1Nz
⎛
⎝
⎜⎜⎜⎜⎜
⎞
⎠
⎟⎟⎟⎟⎟
.
Since we know α , γ =134.8, υ=124.2, s=0.4213, and τ =0.02633. Holy cow! ☺
Numerical Experiments for Example We use ,
, and
is
MADPACK used as solver in LT code [C.C. Douglas, Madpack: a family of abstract multigrid or multilevel solvers, Comput. Appl. Math., 14 (1995), pp. 3–20].
Nz=30Nx= 10,20,40,80
⎧⎨⎪
⎩⎪
⎫⎬⎪
⎭⎪
z∈Γ
4th order compact scheme
4th order high order ADI method (Karaa code) 100, 500, and 1,000 time steps (one solve per step)
Laplace transform takes Nz=30 solves only. LT is the clear winner. ☺
6th order compact scheme
Laplace transform versus Parareal (6th order) 44 Intel Xeon quad core processor cluster with 1 Gb/sec Ethernet, Nz=32, and Nx=160.
Small Parallel Computer Clusters Choose Nz points on the hyperbola contour. For P=CNz
Nz processing cores, where CN is small, then
the Laplace transform is the obvious choice (if applicable), particularly if CNz
=1.
Reasoning: Parareal may use too many iterations, whereas we know we have Nz-fold parallelism trivially with the Laplace transform (and more with parallel solvers).
Conclusions
• There are at least two families of robust algorithms to create parallel in both time and space (parabolic) PDE solvers for parallel computers. o Other methods? Sinc methods
• For one processing core, neither is appropriate. • For a small number of computing cores, the Laplace
transform is clearly the better choice if it is applicable. o Solve Laplace transformed problems in parallel to get
highly parallel solver. • For a large number of processing cores, Parareal is clearly
the method of choice to try first. • Big Question answer: No clear answer #
References
• A. Deshpande, S. Malhotra, C.C. Douglas, and M.H. Schultz, A rigorous analysis of time domain parallelism, Parallel Algorithms and Applications, 6 (1995), pp. 53–62.
• C.C. Douglas, Madpack: a family of abstract multigrid or multilevel solvers, Comput. Appl. Math., 14 (1995), pp. 3–20.
• C.C. Douglas, I. Kim, H. Lee, and D. Sheen, Higher-order schemes for the Laplace transformation method for parabolic problems, Comput. Visual Sci., 14 (2011), pp. 39–47.
• M.J. Gander and S. Vandewalle, Analysis of the parareal time-parallel time-integration method, SIAM J. Sci. Comput., 29 (2007), pp. 556–578.
• S. Karaa and J. Zhang, High order ADI method for solving unsteady convection-diffusion problems, J. Comput. Phys., 198 (2004), pp. 1–9.
• H.B. Keller, Numerical Solution of Two-Point Boundary Value Problems (CBMS-NSF Regional Conference Series in Applied Mathematics), SIAM, 1976.
• J.L. Lions, Y. Maday, and G. Turinici, A parareal in time discretization of PDE’s, C.R. Acad. Sci. Paris Ser. I Math., 332 (2001), pp. 661–668.
• J.A.C. Weideman and L.N. Trefethen, Parabolic and hyperbolic contours for computing the Bromwich integral, Math. Comp., 76(2007), pp. 1341–1356.