

A SIMPLE PROPOSAL FOR PARALLEL COMPUTATION OVER TIME OF AN EVOLUTIONARY PROCESS WITH IMPLICIT TIME STEPPING

ELEANOR G. MCDONALD∗ AND ANDREW J. WATHEN∗

Abstract. Evolutionary processes arise in many areas of applied mathematics; however, since the solution at any time depends on the solution at previous time steps, these types of problems are inherently difficult to parallelize. Whilst an explicit time stepping scheme can be easily parallelized, stability considerations for such methods impose significant restriction on the size of the time step, particularly for parabolic problems. In this paper, we make a simple proposal of a parallel approach for the solution of evolutionary problems with implicit time step schemes. We derive and demonstrate our approach for both the linear diffusion equation and the convection-diffusion equation. Using an all-at-once approach, we solve for all time steps simultaneously using a preconditioned iterative method. The application of the preconditioner is parallelizable over time and significant speed up compared to standard sequential solving methods is achieved. In this short paper, we have not compared to any of the more sophisticated parallel in time methods, but rather simply describe our own elementary approach.

Key words. Parallel computation over time, preconditioning, iterative methods, evolutionary process, heat equation, implicit time stepping, convection-diffusion

AMS subject classifications. 35K05, 65F08, 65M55, 65Y05

1. Introduction. Evolutionary processes have been extensively studied for many years and methods for solving these types of problems have evolved as computing capabilities have changed. As we approach the limits of computational speeds on a single processor, the use of massively parallel computer architectures is seen as the way forward. The inherent problem with solving evolutionary problems on such systems is the causality principle; the solution at each time step depends on the solution at previous steps. This presents significant difficulty to parallel computations. While there are several successful methods for parallelizing computation over the spatial domain, time domain parallelization remains the key to providing significant speed up for these problems.

Significant research has been conducted on time domain parallelization and a recent comprehensive review can be found in [5]. As described in [5], methods can generally be classified into either multiple shooting methods (such as the ‘parareal’ method [9]), domain decomposition and waveform relaxation methods such as in [4], space-time multigrid methods ([11], [7]), or direct methods [6]. Our proposal falls into none of these categories.

The method we propose is a parallelizable block diagonal preconditioner applied within an iterative method such as GMRES [15] or BiCGSTAB [16]. The solution is computed using the all-at-once method, whereby the solution at all time steps is computed simultaneously. Parallelization of an all-at-once formulation is also presented in [11] where a space-time multigrid approach is used; that approach involves applying a damped block Jacobi smoother before applying a standard multigrid restriction in space-time. In our more elementary approach, the block diagonal preconditioner is solved by the application of a single multigrid cycle to each block, but there is no coarsening in time. In our computations for the heat equation we employ a standard ‘black box’ algebraic multigrid V-cycle for an elliptic operator [1], though there is no reason to believe that any standard elliptic multigrid method could not be used; parallelism is achieved through simultaneous applications of multigrid cycles to independent problems in the preconditioning step, rather than requiring a parallel multigrid methodology. For computation on the convection-diffusion operator, a modified geometric multigrid operator [12] is used which has been shown to be effective for these types of problems.

†Numerical Analysis Group, Mathematical Institute, Andrew Wiles Building, Radcliffe Observatory Quarter, Woodstock Road, Oxford, OX2 6GG, United Kingdom ([email protected], [email protected])

Another alternative for solving parabolic problems is to use an explicit time stepping scheme which is readily parallelizable; however this places significant restriction on the time step size, $\tau$, in order to ensure stability. The advantage of the method proposed here is that it is valid for essentially any implicit time stepping scheme, including unconditionally stable schemes such as Backward Euler, so that large time steps can be used. For the numerical experiments conducted with constant time steps, we have chosen a time step on the order of the grid size $h$. Furthermore, our method is shown to also be effective for multi-step methods.

This short paper will be structured as follows. Firstly, we focus on the linear diffusion equation in order to describe the proposed method in a concrete setting. We will introduce the problem and discretisation and outline how the problem would usually be solved without any parallel capability. We then introduce our approach and provide analysis to support the convergence and parallel efficiency of the scheme. We will describe and show how this same methodology can be applied to the convection-diffusion equation, thus demonstrating that the exponential smoothing and decay of the solution for the heat equation is not required and solutions which develop layers - and thus energies at all frequencies - can just as readily be computed with our approach. Lastly, results are presented to illustrate the performance of the method for both of the model problems considered.

2. Proposal Outline.

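2.1. Discretization and Standard Approach. In order to describe our method, we will begin by considering the solution of the linear diffusion (or heat) equation initial-boundary value problem,

$$
\begin{aligned}
u_t &= \Delta u + f && \text{in } \Omega \times (0, T], \quad \Omega \subset \mathbb{R}^2 \text{ or } \mathbb{R}^3, \\
u &= g && \text{on } \partial\Omega, \\
u(x, 0) &= u_0(x) && \text{at } t = 0.
\end{aligned}
\tag{2.1}
$$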

We will use a finite element discretization in space on a uniform square grid with mesh size $h$, though there is no reason to believe that this is necessary for our proposal. A θ-method will be used to discretize in time with $N$ time steps of size $\tau_k$ at the $k$-th step, where $\sum_{k=1}^{N} \tau_k = T$. We note that for $\tfrac{1}{2} \le \theta \le 1$ we have an unconditionally stable implicit scheme. While this method is valid for all θ-schemes, we will focus on results for the Backward Euler ($\theta = 1$) and Crank-Nicolson ($\theta = \tfrac{1}{2}$) schemes. For general θ-schemes the discretization of (2.1) gives,

$$
M \, \frac{\mathbf{u}_k - \mathbf{u}_{k-1}}{\tau_k} + K \left( \theta \mathbf{u}_k + (1 - \theta) \mathbf{u}_{k-1} \right) = \mathbf{f}_k,
$$

for $k = 1, \dots, N$, where $M \in \mathbb{R}^{n \times n}$ is the standard finite element mass matrix and $K \in \mathbb{R}^{n \times n}$ is the stiffness matrix (the discrete Laplacian), where $n$ is the number of spatial degrees of freedom. The initial vector $\mathbf{u}_0$ should be obtained from the initial data by a convenient projection. Rearranging we have,

$$
(M + \tau_k \theta K)\, \mathbf{u}_k = (M - \tau_k (1 - \theta) K)\, \mathbf{u}_{k-1} + \tau_k \mathbf{f}_k.
\tag{2.2}
$$

In order to solve this system, the classical approach is to solve the $N$ separate $n \times n$ systems sequentially for $k = 1, 2, \dots, N$. For large $n$, an iterative method such as Algebraic Multigrid (AMG) may be used to complete each of these solves. This approach is inherently sequential, hence the method is difficult to parallelize over time.
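The sequential procedure can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: a 1D linear finite element discretisation on (0, 1) stands in for the 2D grid, a dense direct solve stands in for AMG, and the names `fe_matrices_1d` and `theta_steps_sequential` are hypothetical.

```python
import numpy as np

def fe_matrices_1d(n):
    """Assumed stand-in: 1D linear FE mass (M) and stiffness (K) matrices
    on (0, 1) with n interior nodes and homogeneous Dirichlet conditions."""
    h = 1.0 / (n + 1)
    off = np.ones(n - 1)
    shift = np.diag(off, 1) + np.diag(off, -1)
    M = h / 6.0 * (4.0 * np.eye(n) + shift)
    K = (2.0 * np.eye(n) - shift) / h
    return M, K

def theta_steps_sequential(M, K, u0, taus, theta, f=None):
    """March (2.2) forward one step at a time; step k needs u_{k-1} first."""
    u_prev = np.asarray(u0, dtype=float)
    history = []
    for k, tau in enumerate(taus):
        fk = np.zeros_like(u_prev) if f is None else f[k]
        rhs = (M - tau * (1.0 - theta) * K) @ u_prev + tau * fk
        # A dense direct solve stands in for the AMG solve used in the paper.
        u_prev = np.linalg.solve(M + tau * theta * K, rhs)
        history.append(u_prev)
    return np.array(history)

n, N = 15, 8
M, K = fe_matrices_1d(n)
x = np.linspace(0.0, 1.0, n + 2)[1:-1]      # interior nodes
u0 = x * (1.0 - x)                           # smooth initial data
U = theta_steps_sequential(M, K, u0, taus=[2.0 ** -4] * N, theta=1.0)
norms = [np.linalg.norm(u) for u in U]       # decays step by step when f = 0
```

The loop body cannot start step $k$ before step $k-1$ has finished, which is exactly the dependency that prevents parallelization over time in this formulation.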

In order to approximate the total amount of work required to solve a system using this method, we regard 1 AMG V-cycle as the main unit of work for each solve. Thus if we require $r$ V-cycles to solve the $n \times n$ linear system at each time step to the desired accuracy, we require $rN$ V-cycles in total to achieve an accurate solution at all time steps. In practice, for the system described above, with constant time step size $\tau$ equal to the grid size $h = 2^{-5}$, 5 V-cycles were observed to be required at each step to obtain a solution with $\ell_2$ residual less than $10^{-6}$ using $\mathbf{u}_{k-1}$ as the starting guess in each case. We note that the number of V-cycles required may decrease if a smaller time step size is used. For instance, 3 V-cycles were observed to be required at each step for $\tau = 2^{-8}$ with $h = 2^{-5}$.

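2.2. Proposed Approach. Our simple proposal solves all time steps simultaneously using an ‘all-at-once’ approach. Conceptually, we construct the following linear system which defines the solution at all time steps in one large equation system,

$$
\mathcal{A}
\begin{bmatrix} \mathbf{u}_1 \\ \mathbf{u}_2 \\ \vdots \\ \mathbf{u}_N \end{bmatrix}
=
\begin{bmatrix}
(M - \tau_1 (1 - \theta) K)\, \mathbf{u}_0 + \tau_1 \mathbf{f}_1 \\
\tau_2 \mathbf{f}_2 \\
\vdots \\
\tau_N \mathbf{f}_N
\end{bmatrix}
\tag{2.3}
$$

where,

$$
\mathcal{A} :=
\begin{bmatrix}
M + \tau_1 \theta K & & & \\
-M + \tau_2 (1 - \theta) K & M + \tau_2 \theta K & & \\
 & \ddots & \ddots & \\
 & & -M + \tau_N (1 - \theta) K & M + \tau_N \theta K
\end{bmatrix}.
\tag{2.4}
$$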

We note that the resulting linear system is huge; it is an $nN \times nN$ system. The standard method would correspond to solution of this system by block forward substitution. In order to solve the system we propose to use a preconditioned iterative method such as GMRES or BiCGSTAB; the key consideration is preconditioning. Any such Krylov subspace method requires a matrix vector product and vector operations as well as the major part of the work in preconditioning. We note that the matrix vector products with $\mathcal{A}$ require only vector products with $M$ and $K$, thus can be computed simply without actual construction of $\mathcal{A}$.
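A matrix-free product with the all-at-once matrix might look as follows. This is a sketch under assumptions: the 1D matrices are an illustrative stand-in for the paper's 2D finite element matrices, and the function names are hypothetical. The dense assembly is included only to check the matrix-free result.

```python
import numpy as np

def fe_matrices_1d(n):
    """Assumed stand-in: 1D linear FE mass and stiffness matrices on (0, 1)."""
    h = 1.0 / (n + 1)
    shift = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
    return h / 6.0 * (4.0 * np.eye(n) + shift), (2.0 * np.eye(n) - shift) / h

def all_at_once_matvec(M, K, taus, theta, x):
    """y = A x for the block bidiagonal A of (2.4), using only M and K."""
    N, n = len(taus), M.shape[0]
    X = x.reshape(N, n)                      # row k holds time level k + 1
    Y = np.empty_like(X)
    for k, tau in enumerate(taus):
        Y[k] = M @ X[k] + tau * theta * (K @ X[k])            # diagonal block
        if k > 0:                                              # subdiagonal block
            Y[k] += -(M @ X[k - 1]) + tau * (1.0 - theta) * (K @ X[k - 1])
    return Y.reshape(-1)

def assemble_dense(M, K, taus, theta):
    """Dense A, built here only to verify the matrix-free product."""
    N, n = len(taus), M.shape[0]
    A = np.zeros((N * n, N * n))
    for k, tau in enumerate(taus):
        rows = slice(k * n, (k + 1) * n)
        A[rows, rows] = M + tau * theta * K
        if k > 0:
            A[rows, slice((k - 1) * n, k * n)] = -M + tau * (1.0 - theta) * K
    return A

n, taus, theta = 7, [0.1, 0.2, 0.1, 0.05], 0.5
M, K = fe_matrices_1d(n)
x = np.random.default_rng(0).standard_normal(n * len(taus))
y = all_at_once_matvec(M, K, taus, theta, x)
A_dense = assemble_dense(M, K, taus, theta)
```

Each row of the product touches at most two time levels, so the cost is about two products with $M$ and two with $K$ per time step.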

The preconditioner we propose is an approximation to the block diagonal of $\mathcal{A}$ in (2.4) and is given by,

$$
P_{MG}^{-1} :=
\begin{bmatrix}
(M + \tau_1 \theta K)_{MG} & & & \\
 & (M + \tau_2 \theta K)_{MG} & & \\
 & & \ddots & \\
 & & & (M + \tau_N \theta K)_{MG}
\end{bmatrix},
\tag{2.5}
$$

where $(M + \tau_k \theta K)_{MG}$ denotes the application of a single multigrid V-cycle to the matrix $M + \tau_k \theta K \in \mathbb{R}^{n \times n}$. We see no reason that any reasonable multigrid code could not be used; we have employed the Ruge-Stüben AMG method [14] as implemented in the HSL MI20 software [1]. Due to the block diagonal structure, this preconditioner could be applied using $N$ independent parallel processes. We note that this method does not require $\tau_k$ to be constant at all time steps and thus could be applied to an adaptive scheme; however, the size of the time step at each step would need to be known a priori. Also note that in the case of constant time steps, application of $P_{MG}^{-1}$ amounts to the solution of $N$ problems with the same matrix and different right hand sides so that ideas such as those described in [10] could be employed to further speed up computation; we have not done so in our implementation here.
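The independence of the $N$ block solves can be made explicit in code. In the sketch below, which is an illustration rather than the authors' implementation, `block_solve` is a hypothetical placeholder for one multigrid V-cycle; its default is an exact dense solve, which corresponds to the exact block diagonal preconditioner. The thread pool is used only to show that the blocks share no data; a practical implementation would place one time step on each processor.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def apply_block_diagonal(blocks, r, block_solve=np.linalg.solve, workers=4):
    """Return z with blocks[k] z_k = r_k for every time level k.

    block_solve is an assumed placeholder for one multigrid V-cycle; the
    default is an exact dense solve, i.e. the exact preconditioner.
    """
    N = len(blocks)
    R = r.reshape(N, -1)
    # The N block problems share no data, so they can run concurrently.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        Z = list(pool.map(lambda k: block_solve(blocks[k], R[k]), range(N)))
    return np.concatenate(Z)

# Small illustrative blocks M + tau_k * theta * K (1D stand-in, theta = 1).
n, taus = 4, [0.1, 0.2, 0.4]
h = 1.0 / (n + 1)
shift = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
M = h / 6.0 * (4.0 * np.eye(n) + shift)
K = (2.0 * np.eye(n) - shift) / h
blocks = [M + tau * K for tau in taus]

r = np.arange(1.0, n * len(taus) + 1.0)
z = apply_block_diagonal(blocks, r)
```

Because the step sizes differ here, each block is a different matrix; with constant steps all blocks coincide and one factorisation or multigrid setup could be reused.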

Since each iteration of GMRES or BiCGSTAB is dominated by the work required to complete one solve of a linear system with $P_{MG}$, the total work at each iteration will be approximately equal to $N$ V-cycles. However, as the preconditioner is inherently parallelizable, each time step can be computed on a separate processor. Thus, if distributed over $N$ processors, the total work at each iteration is equivalent to only 1 V-cycle. We will now show that convergence with the corresponding exact preconditioner must occur in at most $N$ iterations, thus the total amount of work for the solve is $N^2$ V-cycles sequentially but only $N$ V-cycles in a parallel environment with $N$ processors. Thus the solution is computed in time equivalent to $N$ V-cycles instead of $rN$ V-cycles in the standard approach.
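The work estimates in this section reduce to simple arithmetic, sketched below. The function name `vcycle_work` is hypothetical; the value $r = 5$ is the V-cycle count reported above for $h = \tau = 2^{-5}$, while $N = 40$ is chosen only for illustration. Communication costs and the Krylov vector operations are ignored, as in the text.

```python
def vcycle_work(N, r, iterations=None):
    """Count V-cycles (the unit of work used in the text) for N time steps.

    r          : V-cycles per time step in sequential time stepping
    iterations : Krylov iterations of the all-at-once solve; defaults to the
                 bound N that holds for the exact preconditioner
    """
    its = N if iterations is None else iterations
    return {
        "sequential_time_stepping": r * N,
        "all_at_once_one_processor": its * N,
        "all_at_once_N_processors": its,
    }

work = vcycle_work(N=40, r=5)
# Ratio of wall-clock V-cycles: rN for time stepping against N in parallel.
speedup = work["sequential_time_stepping"] / work["all_at_once_N_processors"]
```

With these numbers the all-at-once solve costs more total work ($N^2 = 1600$ V-cycles against $rN = 200$), but its elapsed time on $N$ processors is $N = 40$ V-cycles, a factor $r$ less than sequential stepping.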

2.3. Analysis. In order to estimate the convergence of the GMRES iteration we will examine the eigenvalues of the preconditioned system with the exact preconditioner, $P_{exact}$, defined as

$$
P_{exact} :=
\begin{bmatrix}
M + \tau_1 \theta K & & & \\
 & M + \tau_2 \theta K & & \\
 & & \ddots & \\
 & & & M + \tau_N \theta K
\end{bmatrix}.
\tag{2.6}
$$

We will then show that the AMG approximation, $P_{MG}$, is spectrally very close to the exact preconditioner and thus our analysis closely represents the behaviour seen for the multigrid preconditioner.

Proposition 2.1. Let $T = P_{exact}^{-1} \mathcal{A}$ be the preconditioned system where $\mathcal{A}$ and $P_{exact}$ are defined as in (2.4) and (2.6). Then, the eigenvalues of $T$ are all equal to 1 and furthermore, the minimal polynomial is given by

$$
p(T) = (T - I)^m
\tag{2.7}
$$

for some $m \le N$.

Proof. We note that,

$$
P_{exact}^{-1} \mathcal{A} =
\begin{bmatrix}
I & & & \\
J_2 & I & & \\
 & \ddots & \ddots & \\
 & & J_N & I
\end{bmatrix}
$$

where $J_k = (M + \tau_k \theta K)^{-1} (-M + \tau_k (1 - \theta) K)$. We see that the matrix $T$ is a lower triangular matrix with ones on the diagonal, which implies that all the eigenvalues of $T$ must be equal to 1. Furthermore, the matrix $T - I$ will be strictly block lower triangular. For any $m < N$, we note that $(T - I)^m$ will be 0 except for non-zero entries on the $m$-th block subdiagonal, which are products of the blocks $J_2, J_3, \dots, J_N$. Therefore, we must have $(T - I)^N = 0$ as there are only $N - 1$ block subdiagonals.

Remark: we would typically expect the minimal polynomial to be precisely $(T - I)^N$.
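The nilpotency argument can be checked numerically on a small example. The sketch below is an illustration under assumptions: it uses 1D stand-in matrices, Backward Euler with variable steps, and dense linear algebra. It also runs a preconditioned Richardson iteration, swapped in here for GMRES because its error satisfies $e_{j+1} = (I - T) e_j$ exactly, so the error after $N$ iterations is $(I - T)^N e_0 = 0$ up to rounding.

```python
import numpy as np

n, N, theta = 6, 5, 1.0                      # Backward Euler, illustrative sizes
taus = [0.1 * (k + 2) for k in range(N)]     # variable steps 0.2, 0.3, ...
h = 1.0 / (n + 1)
shift = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
M = h / 6.0 * (4.0 * np.eye(n) + shift)      # 1D stand-in mass matrix
K = (2.0 * np.eye(n) - shift) / h            # 1D stand-in stiffness matrix

# Assemble the all-at-once matrix (2.4) and its block diagonal (2.6).
A = np.zeros((N * n, N * n))
P = np.zeros_like(A)
for k, tau in enumerate(taus):
    rows = slice(k * n, (k + 1) * n)
    A[rows, rows] = M + tau * theta * K
    P[rows, rows] = A[rows, rows]
    if k > 0:
        A[rows, slice((k - 1) * n, k * n)] = -M + tau * (1.0 - theta) * K

T = np.linalg.solve(P, A)                    # preconditioned matrix
E = T - np.eye(N * n)                        # strictly block lower triangular
norm_pow_N_minus_1 = np.linalg.norm(np.linalg.matrix_power(E, N - 1))
norm_pow_N = np.linalg.norm(np.linalg.matrix_power(E, N))

# Preconditioned Richardson iteration (a stand-in for GMRES, see lead-in).
rng = np.random.default_rng(0)
b = rng.standard_normal(N * n)
x = np.zeros(N * n)
for _ in range(N):
    x = x + np.linalg.solve(P, b - A @ x)
relative_residual = np.linalg.norm(b - A @ x) / np.linalg.norm(b)
```

Here `norm_pow_N` is at rounding level while `norm_pow_N_minus_1` is not, which is the case $m = N$ of the proposition for this example.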

If we were to calculate the preconditioner exactly, GMRES would converge to the exact solution in at most $N$ steps in exact arithmetic; typically it will be exactly $N$ steps. Numerically we can see that this is still very close to being the case for the multigrid preconditioner defined in (2.5). Figure 1 below shows the convergence for a small system using a Backward Euler discretisation. We can see that there is a sharp drop in the residual when the number of iterations is equal to $N$ for both GMRES and BiCGSTAB iterations.

[Figure 1: residual (logarithmic axis, $10^{-15}$ to $10^{0}$) against iteration (0 to 80) for GMRES and BiCGSTAB.]

Fig. 1: Convergence of preconditioned iterative methods for the smooth test problem described in Section 4.1.1 ($h = \tau = 2^{-4}$, $N = 80$).

Furthermore, we note that this same argument could equally be applied to multistep methods; in a $k$-step method the resulting all-at-once system would contain $k$ block subdiagonals. In order to demonstrate this we will describe the system for the 2-step Backward Differentiation Formula method (BDF2). Using this method the discretisation in (2.2) instead becomes,

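$$
\left( M + \tfrac{2}{3} \tau_k K \right) \mathbf{u}_k = \tfrac{4}{3} M \mathbf{u}_{k-1} - \tfrac{1}{3} M \mathbf{u}_{k-2} + \tfrac{2}{3} \tau_k \mathbf{f}_k,
\tag{2.8}
$$

and the overall system becomes,

$$
\mathcal{A}_{BDF2}
\begin{bmatrix} \mathbf{u}_1 \\ \mathbf{u}_2 \\ \mathbf{u}_3 \\ \vdots \\ \mathbf{u}_N \end{bmatrix}
=
\begin{bmatrix}
\tfrac{4}{3} M \mathbf{u}_0 - \tfrac{1}{3} M \mathbf{u}_{-1} + \tfrac{2}{3} \tau_1 \mathbf{f}_1 \\
-\tfrac{1}{3} M \mathbf{u}_0 + \tfrac{2}{3} \tau_2 \mathbf{f}_2 \\
\tfrac{2}{3} \tau_3 \mathbf{f}_3 \\
\vdots \\
\tfrac{2}{3} \tau_N \mathbf{f}_N
\end{bmatrix}
\tag{2.9}
$$

where,

$$
\mathcal{A}_{BDF2} :=
\begin{bmatrix}
M + \tfrac{2}{3} \tau_1 K & & & & \\
-\tfrac{4}{3} M & M + \tfrac{2}{3} \tau_2 K & & & \\
\tfrac{1}{3} M & -\tfrac{4}{3} M & M + \tfrac{2}{3} \tau_3 K & & \\
 & \ddots & \ddots & \ddots & \\
 & & \tfrac{1}{3} M & -\tfrac{4}{3} M & M + \tfrac{2}{3} \tau_N K
\end{bmatrix}.
\tag{2.10}
$$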

If we define our exact preconditioner for the BDF system, $P_{(BDF2)exact}$, to be the block diagonal of the matrix $\mathcal{A}_{BDF2}$, we can see that the preconditioned system will now have identities on the diagonal with two sub-diagonal blocks of non-zero matrices. Thus the minimal polynomial for the preconditioned system,

$$
T = P_{(BDF2)exact}^{-1} \mathcal{A}_{BDF2}
$$

would be,

$$
p(T) = (T - I)^m,
\tag{2.11}
$$

for some $m \le N$ and we typically expect $m = N$ as before. We will provide numerical results for the BDF2 method to support this. For any multi-step method where the exact preconditioner is chosen as the block diagonal for the linearised problem, the minimal polynomial must similarly be (2.11) for some $m \le N$.
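The same check can be repeated for the two-subdiagonal BDF2 system. The following sketch, again with assumed 1D stand-in matrices, constant step size and a hypothetical helper `nilpotency_index`, finds the smallest power of $T - I$ that vanishes to rounding level.

```python
import numpy as np

def nilpotency_index(E, tol=1e-10):
    """Smallest m with ||E^m|| <= tol, found by repeated multiplication."""
    power = np.eye(E.shape[0])
    for m in range(1, E.shape[0] + 1):
        power = power @ E
        if np.linalg.norm(power) <= tol:
            return m
    return None

n, N, tau = 5, 6, 0.1                        # illustrative sizes
h = 1.0 / (n + 1)
shift = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
M = h / 6.0 * (4.0 * np.eye(n) + shift)      # 1D stand-in mass matrix
K = (2.0 * np.eye(n) - shift) / h            # 1D stand-in stiffness matrix

D = M + (2.0 / 3.0) * tau * K                # diagonal block of (2.10)
A = np.zeros((N * n, N * n))
for k in range(N):
    rows = slice(k * n, (k + 1) * n)
    A[rows, rows] = D
    if k >= 1:
        A[rows, slice((k - 1) * n, k * n)] = -(4.0 / 3.0) * M
    if k >= 2:
        A[rows, slice((k - 2) * n, (k - 1) * n)] = (1.0 / 3.0) * M

P = np.kron(np.eye(N), D)                    # exact block diagonal preconditioner
E = np.linalg.solve(P, A) - np.eye(N * n)
m = nilpotency_index(E)
```

For this example the index equals $N$, in line with the expectation $m = N$ stated above; the second subdiagonal does not shorten the minimal polynomial.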

Until now, we have considered analysis of the exact preconditioner. In order to demonstrate that the multigrid preconditioner, $P_{MG}$, is spectrally very close to $P_{exact}$ we have plotted the eigenvalues of the preconditioned system for the Backward Euler ($\theta = 1$) and Crank-Nicolson ($\theta = \tfrac{1}{2}$) schemes as well as the 2-step BDF method in Figure 2. In order to calculate these eigenvalues the matrix which acts as the multigrid operator was calculated explicitly as follows.

A single multigrid V-cycle was applied to the trivial equation $A x = e_i$ where $A = M + \tau \theta K$ or $A = M + \tfrac{2}{3} \tau K$ for the θ-method or BDF2 respectively, and $e_i$ denotes the $i$-th column of the identity matrix. The resulting solution for $x$ forms the $i$-th column of $(M + \tau \theta K)_{MG}$ or $(M + \tfrac{2}{3} \tau K)_{MG}$ and by repeating for $i = 1, 2, \dots, n$ we are able to explicitly construct the multigrid operator as a matrix. This is certainly not efficient, but is only used for the eigenvalue computations here; it is not required for the proposed method. The computed multigrid matrix was then used to calculate the eigenvalues of the preconditioned system. Figure 2 shows these eigenvalues for three particular cases and demonstrates that they are indeed extremely close to 1. Similarly clustered eigenvalues about 1 are expected for different parameter values.
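The column-by-column construction applies to any linear 'black box' approximate solver. The sketch below illustrates it with a few damped Jacobi sweeps started from zero as the black box; this is an assumed stand-in and not a multigrid V-cycle, so its eigenvalues are not expected to cluster as tightly as those in Figure 2. The names `damped_jacobi` and `operator_as_matrix` are hypothetical.

```python
import numpy as np

def damped_jacobi(A, b, sweeps=3, omega=2.0 / 3.0):
    """Stand-in black box (NOT multigrid): damped Jacobi sweeps from x = 0.
    Starting from zero makes the map b -> x linear, as a V-cycle would be."""
    x = np.zeros_like(b)
    d = np.diag(A)
    for _ in range(sweeps):
        x = x + omega * (b - A @ x) / d
    return x

def operator_as_matrix(apply, n):
    """Build the matrix of a linear operator by applying it to e_1, ..., e_n."""
    B = np.empty((n, n))
    identity = np.eye(n)
    for i in range(n):
        B[:, i] = apply(identity[:, i])      # i-th column = response to e_i
    return B

n, tau, theta = 8, 2.0 ** -3, 1.0
h = 1.0 / (n + 1)
shift = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
M = h / 6.0 * (4.0 * np.eye(n) + shift)      # 1D stand-in mass matrix
K = (2.0 * np.eye(n) - shift) / h            # 1D stand-in stiffness matrix
block = M + tau * theta * K

apply = lambda b: damped_jacobi(block, b)
B = operator_as_matrix(apply, n)             # explicit approximate inverse
eigs = np.linalg.eigvals(B @ block)          # spectrum of the preconditioned block
```

The construction costs $n$ applications of the black box, which is why it is reserved for small diagnostic problems.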

3. Convection-Diffusion Equation. Up to this point we have considered the heat equation in order to demonstrate our proposed method. The solutions of this parabolic PDE become exponentially smoother through time. In order to demonstrate that the effectiveness of our proposed method does not rely on this smoothness, we additionally consider the convection-diffusion equation and use examples which have significant energy in all frequencies for all times.

The initial-boundary value problem considered is,

$$
\begin{aligned}
u_t - \varepsilon \Delta u + w \cdot \nabla u &= f && \text{in } \Omega \times (0, T], \quad \Omega \subset \mathbb{R}^2 \text{ or } \mathbb{R}^3, \\
u &= g && \text{on } \partial\Omega, \\
u(x, 0) &= u_0(x) && \text{at } t = 0.
\end{aligned}
\tag{3.1}
$$

[Figure 2: two panels plotting $\mathrm{Im}(\lambda)$ (scale $\times 10^{-5}$, from $-2.5$ to $2.5$) against $\mathrm{Re}(\lambda)$; the left panel covers $0.95 \le \mathrm{Re}(\lambda) \le 1.01$ and the right panel $0 \le \mathrm{Re}(\lambda) \le 1$. Legend: $\lambda(\hat{P}^{-1} A_{BE})$, $\lambda(\hat{P}^{-1} A_{CN})$, $\lambda(\hat{P}^{-1} A_{BDF})$.]

Fig. 2: Eigenvalues of the preconditioned system ($h = \tau = 2^{-5}$, $N = 5$). The image on the left shows a more detailed view of the eigenvalues around 1 on the real axis while the image on the right demonstrates how clustered the eigenvalues are.

The parameter $\varepsilon$ is small and positive. It is noted that the convection-diffusion equation is a well studied non-symmetric system that arises in many areas of engineering and models of flow [13]. The equation represents, for example, the concentration of a pollutant, $u$, which is being convected with a velocity described by $w$ and which is also subject to diffusive effects.

As discussed in Chapter 6 of [3], it is often the case that convection effects are more dominant than diffusion, implying that $\varepsilon \ll \|w\|$. However in this regime, the problem can be difficult to solve as any method must be robust to the formation of internal or boundary layers, which often results in a very fine mesh being required. As is well known, if a mesh is not sufficiently fine, for a Galerkin method local oscillations can arise in the numerical solution. In such circumstances, a stabilisation method is required.

One popular and widely employed method of stabilising the convection-diffusion problem is to use the Streamline Upwind Petrov-Galerkin (SUPG) stabilization method which was introduced in [8] and described for example in [3, Section 6.3.2].

3.1. Discretisation and Stabilisation. As with the heat equation problem, we will use a finite element discretisation in space on a uniform square grid with mesh size $h$ and constant time step $\tau$. Using a general θ-scheme the discretisation and SUPG stabilisation of (3.1) gives,

$$
(M + \tau \theta \widehat{K})\, \mathbf{u}_k = (M - \tau (1 - \theta) \widehat{K})\, \mathbf{u}_{k-1} + \tau \mathbf{f}_k
\tag{3.2}
$$

where $\widehat{K} = \varepsilon K + N + \delta S$. Here $K$ represents the stiffness matrix, $N$ represents the discretisation of the convection term $w \cdot \nabla u$, and $S$ is the stabilisation term.

The stabilisation parameter $\delta$ required as part of the SUPG method is taken to be the optimal value of,

$$
\delta =
\begin{cases}
0 & \text{if } Pe \le 1, \\[4pt]
\dfrac{h}{2 \|w\|_2} \left( 1 - \dfrac{1}{Pe} \right) & \text{if } Pe > 1,
\end{cases}
$$

where the mesh Péclet number $Pe$ is defined on each element as

$$
Pe = \frac{h \|w\|_2}{2 \varepsilon}.
$$
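The choice of $\delta$ above translates directly into a small function. In the sketch below the function names are hypothetical, and the sample values of $h$, $\|w\|_2$ and $\varepsilon$ are chosen for illustration only; they are not taken from the numerical experiments.

```python
def mesh_peclet(h, w_norm, eps):
    """Mesh Peclet number Pe = h * ||w||_2 / (2 * eps) on one element."""
    return h * w_norm / (2.0 * eps)

def supg_delta(h, w_norm, eps):
    """SUPG stabilisation parameter: zero when Pe <= 1, otherwise
    h / (2 * ||w||_2) * (1 - 1 / Pe)."""
    pe = mesh_peclet(h, w_norm, eps)
    if pe <= 1.0:
        return 0.0
    return h / (2.0 * w_norm) * (1.0 - 1.0 / pe)

# Convection-dominated element: Pe = 6.25, so stabilisation is switched on.
delta_convective = supg_delta(h=2.0 ** -4, w_norm=1.0, eps=1.0 / 200.0)
# Diffusion-dominated element: Pe = 0.03125, so no stabilisation is added.
delta_diffusive = supg_delta(h=2.0 ** -4, w_norm=1.0, eps=1.0)
```

As the mesh is refined with $\varepsilon$ and $w$ fixed, $Pe$ falls below 1 and the stabilisation term vanishes, so the scheme reverts to the plain Galerkin discretisation.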

3.2. Approximation. For the heat equation problem, in order to approximate the application of the preconditioner we used an algebraic multigrid method. Due to the formation of layers in the convection-diffusion problem, more care is required in order to use a multigrid approximation. The method we will use is a modified geometric multigrid method described by Ramage in [12] and is specifically designed for convection-diffusion problems.

The two main differences of this method from a typical multigrid routine are:

• Construction of the coarse grid operator - For the Ramage multigrid method, the coarse grid operator is explicitly constructed on each grid with stabilization appropriate to that grid level. This is instead of the standard use of the so called Galerkin coarse grid operator which is defined in terms of the projection and restriction operators.

• Pre- and Post-smoothing - The smoothing strategy used in this method is block Gauss-Seidel smoothing applied in all four directions in order to account for wind in each possible direction. Two pre- and two post-smoothing steps are employed.

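3.3. Analysis. Similarly to the heat equation we now have the following system,

$$
\mathcal{A}
\begin{bmatrix} \mathbf{u}_1 \\ \mathbf{u}_2 \\ \vdots \\ \mathbf{u}_N \end{bmatrix}
=
\begin{bmatrix}
(M - \tau (1 - \theta) \widehat{K})\, \mathbf{u}_0 + \tau \mathbf{f}_1 \\
\tau \mathbf{f}_2 \\
\vdots \\
\tau \mathbf{f}_N
\end{bmatrix}
\tag{3.3}
$$

where,

$$
\mathcal{A} :=
\begin{bmatrix}
M + \tau \theta \widehat{K} & & & \\
-M + \tau (1 - \theta) \widehat{K} & M + \tau \theta \widehat{K} & & \\
 & \ddots & \ddots & \\
 & & -M + \tau (1 - \theta) \widehat{K} & M + \tau \theta \widehat{K}
\end{bmatrix}
\tag{3.4}
$$

and the $\mathbf{f}$ terms contain the extra stabilisation term. Here $\widehat{K} = \varepsilon K + N + \delta S$ as in (3.2). Likewise, we can define the preconditioner to be,

$$
P_{MG}^{-1} :=
\begin{bmatrix}
(M + \tau \theta \widehat{K})_{MG} & & & \\
 & (M + \tau \theta \widehat{K})_{MG} & & \\
 & & \ddots & \\
 & & & (M + \tau \theta \widehat{K})_{MG}
\end{bmatrix},
\tag{3.5}
$$

where $(M + \tau \theta \widehat{K})_{MG}$ denotes the application of a single Ramage multigrid V-cycle to the matrix $M + \tau \theta \widehat{K} \in \mathbb{R}^{n \times n}$. In order to conduct analysis of the preconditioner, we also define the exact preconditioner

$$
P_{exact} :=
\begin{bmatrix}
M + \tau \theta \widehat{K} & & & \\
 & M + \tau \theta \widehat{K} & & \\
 & & \ddots & \\
 & & & M + \tau \theta \widehat{K}
\end{bmatrix}.
\tag{3.6}
$$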

For the convection-diffusion equation we can conclude an analogous result to Proposition 2.1.

Proposition 3.1. Let $T = P_{exact}^{-1} \mathcal{A}$ be the preconditioned system where $\mathcal{A}$ and $P_{exact}$ are defined as in (3.4) and (3.6). Then, the eigenvalues of $T$ are all equal to 1 and furthermore, the minimal polynomial is given by

$$
p(T) = (T - I)^m
\tag{3.7}
$$

for some $m \le N$.

Proof. The proof follows exactly as in Proposition 2.1, noting that symmetry was never required in the matrix $J$.

Thus we expect similar convergence results for the convection-diffusion equation as were seen in the heat equation. This is demonstrated in Figure 3 below where a sharp drop off in the norm of the residual is seen at the $N$-th iteration.

[Figure 3: residual (logarithmic axis, $10^{-8}$ to $10^{2}$) against iteration (0 to 90) for GMRES and BiCGSTAB.]

Fig. 3: Convergence of preconditioned iterative methods for the double glazing problem described in Section 4.2.2 ($h = \tau = 2^{-4}$, $N = 80$).

We can also analyse the eigenvalues of the preconditioned system in order to show that the multigrid preconditioner is spectrally very close to the exact preconditioner. For this we again build the appropriate multigrid operator matrix using the method described in Section 2.3. Figure 4 shows the eigenvalues of the (Ramage) multigrid preconditioned system being extremely close to 1. It is also noted that the eigenvalues plotted in Figure 4 were calculated using only one multigrid V-cycle with a single pre- and post-smoothing step; however, in the numerical results 2 V-cycles with 2 pre- and post-smoothing steps were used, resulting in eigenvalues even more closely clustered around 1.

[Figure 4: two panels plotting $\mathrm{Im}(\lambda)$ (scale $\times 10^{-5}$, from $-2$ to $2$) against $\mathrm{Re}(\lambda)$; the left panel covers $0.9999 \le \mathrm{Re}(\lambda) \le 1.0001$ and the right panel $0 \le \mathrm{Re}(\lambda) \le 1$. Legend: $\lambda(P^{-1} A)$.]

Fig. 4: Eigenvalues of the preconditioned system for the variable vertical wind problem described in Section 4.2.1 with 1 V-cycle with 1 pre- and post-smoothing step ($h = \tau = 2^{-4}$, $N = 5$). The image on the left shows a more detailed view of the eigenvalues around 1 on the real axis while the image on the right demonstrates how clustered the eigenvalues are.

4. Numerical Results. The results presented in this section are based on an implementation of the method described in the previous sections within the IFISS [2] framework. Our implementation of GMRES was from the IFISS package and did not allow restarting. As GMRES can require prohibitive amounts of storage (and growing work) if many iterations are required, we also completed the calculations with the inbuilt Matlab implementation of BiCGSTAB. Both methods were stopped with an absolute residual tolerance of $10^{-6}$. The finite element discretisation uses Q1 finite elements over the domain $\Omega = [0, 1] \times [0, 1]$ for the heat equation and $\Omega = [-1, 1] \times [-1, 1]$ for the convection-diffusion problem.

4.1. Heat equation. In this section we present the results for solution of the heat equation using the proposed method. For the multigrid preconditioner, we used the Harwell Subroutine Library AMG preconditioner implementation, HSL MI20 [1], employed as a “black box”. This is a standard implementation of the Ruge-Stüben algebraic multigrid method [14].

We consider two test problems; one with smooth initial data and a second with random initial conditions. For the first test problem, we consider the Backward Euler method in time as well as variable time-stepping. For the second problem, we consider Crank-Nicolson and BDF2 methods. These methods have not been chosen specifically for each problem, but rather just to demonstrate that a variety of time-stepping methods can be used.

4.1.1. Test problem with smooth initial vector. Our first example is defined by the initial conditions,

$$
u_0 = x (x - 1)\, y (y - 1)
$$

with no external forcing (i.e. $f = 0$). In order to generate an initial starting vector which contains energy in all modes, we constructed the vector,

$$
\mathbf{x}_0 = (\mathbf{u}_0, \mathbf{u}_0, \dots, \mathbf{u}_0)^T + \mathrm{rand}(0, 0.5 * \max(\mathbf{u})),
\tag{4.1}
$$

where $\mathrm{rand}(0, 0.5 * \max(\mathbf{u}))$ represents a vector of length $n \times N$ of random values between 0 and 0.5 times the maximum value in $\mathbf{u}_0$. This was done to ensure that the initial vector contained energy in both high and low frequencies and that the initial residual $\|\mathcal{A} \mathbf{x}_0 - \mathbf{b}\|$ is of order 1. This vector was used for both GMRES and BiCGSTAB for all test problems. The iteration results are summarized in Table 1 below. We note that for the first part of the table, the solution on the time domain $(0, 5]$ is computed and that the final solution has values of order $10^{-10}$ at the final time. As the time step is reduced along with the grid size, the total number of time steps $N$ increases. Thus we expect the total number of iterations to increase also, as is seen in the results. In the second part of the table we give results where we maintain a constant number of time steps; therefore as $\tau$ is reduced the final time, $T = N \tau$, is also reduced.
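The starting vector (4.1) can be sketched as follows. This is one reading of the formula, stated as an assumption: the initial data are repeated once per time step and an independent uniform random value in $[0, 0.5 \max(\mathbf{u}_0)]$ is added to every entry. The function name `initial_guess` and the seed argument are hypothetical.

```python
import random

def initial_guess(u0, N, seed=0):
    """Stack N copies of u0 and add uniform noise in [0, 0.5 * max(u0)]
    to each of the n * N entries (assumed reading of (4.1))."""
    rng = random.Random(seed)
    upper = 0.5 * max(u0)
    return [u + rng.uniform(0.0, upper) for _ in range(N) for u in u0]

# Smooth data u0 = x(x - 1)y(y - 1) sampled on a small 5 x 5 grid (illustrative).
pts = [i / 4.0 for i in range(5)]
u0 = [x * (x - 1.0) * y * (y - 1.0) for y in pts for x in pts]
x0 = initial_guess(u0, N=3)
```

With initial data drawn from $[0, 10]$ the same construction gives entries in $[0, 15]$, since the added noise is at most half the largest entry.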

Table 1: Heat Equation, smooth test problem: Backward Euler. Number of iterations for given grid and step sizes. The first part shows doubling of N for halving of h while the second part shows fixed N for varying h.

h        τ        N     DoF       GMRES   BiCGSTAB
--------------------------------------------------
2^{-3}   2^{-3}   40    3240      40      37
2^{-4}   2^{-4}   80    23120     80      77
2^{-5}   2^{-5}   160   174240    160     161
--------------------------------------------------
2^{-3}   2^{-3}   40    3240      40      37
2^{-4}   2^{-4}   40    11560     40      39
2^{-5}   2^{-5}   40    43560     43      43
2^{-6}   2^{-6}   40    169000    45      45
2^{-7}   2^{-7}   40    665640    47      45
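As a quick consistency check on the DoF column, the entries of Table 1 agree with $(1/h + 1)^2$ spatial unknowns per time step multiplied by N. The formula is inferred from the tabulated values rather than stated in this section, and the function name is illustrative.

```python
def degrees_of_freedom(h_exponent, N):
    """Total unknowns for h = 2^(-h_exponent): (1/h + 1)^2 grid values at each of N time steps."""
    nodes_per_side = 2 ** h_exponent + 1
    return nodes_per_side ** 2 * N

# (h exponent, N, DoF reported in Table 1)
table1 = [(3, 40, 3240), (4, 80, 23120), (5, 160, 174240),
          (4, 40, 11560), (5, 40, 43560), (6, 40, 169000), (7, 40, 665640)]
mismatches = [(e, N) for e, N, dof in table1 if degrees_of_freedom(e, N) != dof]
print(mismatches)  # prints []
```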

In Table 2, we present results with variable time steps. The time step sizes were chosen to double at each time step. The range of time step sizes, $[\tau_{\min}, \tau_{\max}]$, is listed in the table.

Table 2: Heat Equation, smooth test problem: Variable step size. Number of iterations for given grid and step sizes. Computations were all run until a final time t = 5.

h        [τ_min, τ_max]       N     DoF       GMRES   BiCGSTAB
--------------------------------------------------------------
2^{-3}   [2^{-6},  2^{2}]     9     729       9       5
2^{-4}   [2^{-7},  2^{2}]     10    2890      10      6
2^{-5}   [2^{-8},  2^{2}]     11    11979     12      7
2^{-6}   [2^{-9},  2^{2}]     12    50700     13      8
2^{-7}   [2^{-10}, 2^{2}]     13    216333    14      9
2^{-8}   [2^{-11}, 2^{2}]     14    924686    15      11
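The number of time steps N in Table 2 follows directly from the doubling rule: a sequence that starts at $\tau_{\min}$ and doubles until it reaches $\tau_{\max} = 2^2$ has one entry per exponent. The sketch below only counts the steps; it makes no assumption about how the last step is matched to the final time, and the helper name is illustrative.

```python
def doubling_steps(min_exponent, max_exponent=2):
    """Time step sizes 2^min_exponent, ..., 2^max_exponent, each double the previous one."""
    return [2.0 ** k for k in range(min_exponent, max_exponent + 1)]

# (h exponent, tau_min exponent, N reported in Table 2)
table2 = [(3, -6, 9), (4, -7, 10), (5, -8, 11), (6, -9, 12), (7, -10, 13), (8, -11, 14)]
counts = [len(doubling_steps(tau_min)) for _, tau_min, _ in table2]
print(counts)  # prints [9, 10, 11, 12, 13, 14]
```

Because N grows only logarithmically as $\tau_{\min}$ is reduced, the iteration counts in Table 2 stay small even on the finest grids.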

4.1.2. Test problem with non-smooth initial vector. The second example was defined with random initial data taking values from a uniform distribution on [0, 10]. The initial starting vector $\mathbf{x}_0$ was defined as in (4.1), which for this problem corresponds to a random vector with values from [0, 15]. In order to demonstrate that other implicit time-stepping schemes are also effective, Crank-Nicolson was used to discretize in time and the results are summarized in Table 3 below. In Table 4, the 2-step Backward Differentiation Formula (BDF2) method is used for the time-stepping. It is again evident that the iteration numbers are approximately equal to the number of time steps and independent of h.
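For reference, the two schemes take the following standard form for a semi-discrete system $M\mathbf{u}'(t) + K\mathbf{u}(t) = \mathbf{f}(t)$. The symbols $M$ (mass matrix), $K$ (stiffness matrix) and the time-level subscript $k$ are notation assumed here for illustration and may differ from the notation used elsewhere in the paper.

```latex
\begin{align*}
\text{Crank--Nicolson:}\quad
  & M\,\frac{\mathbf{u}_k - \mathbf{u}_{k-1}}{\tau}
    + \tfrac{1}{2} K \left(\mathbf{u}_k + \mathbf{u}_{k-1}\right)
    = \tfrac{1}{2}\left(\mathbf{f}_k + \mathbf{f}_{k-1}\right), \\
\text{BDF2:}\quad
  & M\,\frac{3\mathbf{u}_k - 4\mathbf{u}_{k-1} + \mathbf{u}_{k-2}}{2\tau}
    + K\,\mathbf{u}_k = \mathbf{f}_k .
\end{align*}
```

Crank-Nicolson couples two consecutive time levels and BDF2 couples three, so in both cases the all-at-once matrix remains block lower triangular with a narrow block band. BDF2 also needs a one-step method to start; which one is used is not specified in this section.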

Table 3: Heat Equation, non-smooth test problem: Crank-Nicolson. Number of iterations for given grid and step sizes. The first part shows doubling of N for halving of h while the second part shows fixed N for varying h.

h        τ        N     DoF       GMRES   BiCGSTAB
--------------------------------------------------
2^{-3}   2^{-5}   40    3240      49      48
2^{-4}   2^{-6}   80    23120     90      95
2^{-5}   2^{-7}   160   174240    175     199
--------------------------------------------------
2^{-3}   2^{-5}   40    3240      49      48
2^{-4}   2^{-6}   40    11560     52      50
2^{-5}   2^{-7}   40    43560     53      53
2^{-6}   2^{-8}   40    169000    56      53
2^{-7}   2^{-9}   40    665640    57      54

Table 4: Heat Equation, non-smooth test problem: BDF2. Number of iterations for given grid and step sizes. The first part shows doubling of N for halving of h while the second part shows fixed N for varying h.

h        τ        N     DoF       GMRES   BiCGSTAB
--------------------------------------------------
2^{-3}   2^{-3}   40    3240      45      48
2^{-4}   2^{-4}   80    23120     87      90
2^{-5}   2^{-5}   160   174240    169     190
--------------------------------------------------
2^{-3}   2^{-3}   40    3240      45      48
2^{-4}   2^{-4}   40    11560     49      50
2^{-5}   2^{-5}   40    43560     51      52
2^{-6}   2^{-6}   40    169000    53      55
2^{-7}   2^{-7}   40    665640    55      58

4.2. Convection-Diffusion Equation. For the results of this section, the modified geometric multigrid described in Section 3.2 and implemented in IFISS was used. Two pre- and two post-smoothing steps were used with four-directional line Gauss-Seidel smoothing. The initial vector for GMRES and BiCGSTAB was defined by (4.1), where the vector $\mathbf{u}_0$ represented the interpolant of the initial data $u_0$.

In each of the problems, ε was set equal to 1/200, so the maximum mesh Péclet number ranged between approximately 46 for $h = 2^{-3}$ and 3 for $h = 2^{-7}$.

4.2.1. Variable vertical wind. The first test problem considered is taken as Example 6.1.2 from [3]. The wind is vertical and is described by $w = (0,\, 1 + (x+1)^2/4)$ and thus increases from left to right. Dirichlet boundary conditions apply to the inflow and side boundaries, where u is set equal to 1 on the inflow boundary and decreases quadratically to 0 on the right wall and cubically on the left wall. The vector $\mathbf{u}_0$ is specified as zero except for the boundary conditions. Again, the first half of Table 5 shows results calculated for a time domain of (0,5] with τ = h, while the lower half shows results with a constant number of time steps. Figure 5 shows a solution to the problem at time t = 5. We again see that the number of iterations remains approximately equal to the number of time steps and independent of the mesh size. There is some increase in iteration numbers evident when the BiCGSTAB method was used.

Fig. 5: Solution to variable vertical wind problem at t = 5 ($h = \tau = 2^{-5}$, N = 160)

Table 5: Convection-diffusion, variable vertical wind. Number of iterations for given grid and step sizes. The first part shows doubling of N for halving of h while the second part shows fixed N for varying h.

h        τ        N     DoF        GMRES   BiCGSTAB
---------------------------------------------------
2^{-3}   2^{-3}   40    3240       40      38
2^{-4}   2^{-4}   80    23120      80      79
2^{-5}   2^{-5}   160   174240     160     164
---------------------------------------------------
2^{-4}   2^{-4}   160   46240      160     142
2^{-5}   2^{-5}   160   174240     160     166
2^{-6}   2^{-6}   160   676000     161     172
2^{-7}   2^{-7}   160   2662560    161     175

4.2.2. Double glazing problem. The second convection-diffusion test problem is given by Example 6.1.4 in [3] and is known as the double glazing problem, since it is a simple model for temperature in a cavity with recirculating wind and an external wall which is ‘hot’. The wind is described by $w = (2y(1 - x^2),\, -2x(1 - y^2))$. Dirichlet boundary conditions are imposed everywhere on the boundary, with u = 1 on the boundary where x = 1 and zero on all other boundaries. The vector $\mathbf{u}_0$ was zero everywhere except the boundaries, where it satisfies the boundary conditions. Figure 6 shows the solution to the problem at time t = 5. The iteration numbers remain independent of the mesh size.
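Two properties of this wind explain the word "recirculating": it is divergence-free, and its normal component vanishes on the walls $x = \pm 1$ and $y = \pm 1$, so nothing flows in or out of the cavity. The short check below assumes the square domain $[-1, 1]^2$, which is not stated explicitly in this section, and uses illustrative function names.

```python
def wind(x, y):
    """Double glazing wind w = (2y(1 - x^2), -2x(1 - y^2))."""
    return 2 * y * (1 - x**2), -2 * x * (1 - y**2)

def divergence(x, y, d=1e-3):
    """Central-difference estimate of div w (exact up to rounding for this polynomial wind)."""
    dwx_dx = (wind(x + d, y)[0] - wind(x - d, y)[0]) / (2 * d)
    dwy_dy = (wind(x, y + d)[1] - wind(x, y - d)[1]) / (2 * d)
    return dwx_dx + dwy_dy

pts = [-1 + 0.2 * i for i in range(11)]           # sample points in [-1, 1]
max_div = max(abs(divergence(x, y)) for x in pts for y in pts)
normal_on_right_wall = wind(1.0, 0.3)[0]          # horizontal component on x = 1
normal_on_bottom_wall = wind(0.3, -1.0)[1]        # vertical component on y = -1
```

Analytically, $\partial_x\,[2y(1 - x^2)] = -4xy$ and $\partial_y\,[-2x(1 - y^2)] = 4xy$, so the two terms cancel.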

Fig. 6: Solution to the double glazing problem at t = 5 ($h = \tau = 2^{-5}$, N = 160)

Table 6: Convection-diffusion, double glazing problem. Number of iterations for given grid and step sizes. The first part shows doubling of N for halving of h while the second part shows fixed N for varying h.

h        τ        N     DoF        GMRES   BiCGSTAB
---------------------------------------------------
2^{-3}   2^{-3}   40    3240       40      41
2^{-4}   2^{-4}   80    23120      80      78
2^{-5}   2^{-5}   160   174240     160     167
---------------------------------------------------
2^{-4}   2^{-4}   160   46240      160     182
2^{-5}   2^{-5}   160   174240     160     167
2^{-6}   2^{-6}   160   676000     175     176
2^{-7}   2^{-7}   160   2662560    165     174

5. Conclusions. In this paper we have presented a simple approach for solving the heat equation and the convection-diffusion problem which is parallelizable over time. The method constructs an all-at-once system which is solved using a preconditioned iterative method. The preconditioner is block diagonal and therefore its application can be computed in parallel over the time steps.
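The parallel structure can be illustrated with a small sketch in which each diagonal block is solved independently. This is not the authors' implementation: an exact dense solve stands in for the multigrid cycle, a thread pool stands in for separate processors, and the backward Euler block `B = I + tau*K` for a 1D Laplacian is an assumption chosen to keep the example small.

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def apply_block_diagonal(B, r, N, max_workers=4):
    """Return P^{-1} r for P = blockdiag(B, ..., B) with N blocks, one solve per time step."""
    n = B.shape[0]
    blocks = r.reshape(N, n)                  # row k holds the residual at time step k
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # The solves do not depend on one another, so they may run concurrently.
        solved = list(pool.map(lambda rk: np.linalg.solve(B, rk), blocks))
    return np.concatenate(solved)

# Tiny illustration: 1D Laplacian stiffness matrix and backward Euler block.
n, N, tau = 6, 4, 0.1
K = (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) * (n + 1) ** 2
B = np.eye(n) + tau * K
r = np.arange(1.0, n * N + 1)
z = apply_block_diagonal(B, r, N)
```

No block needs the result of another, which is the property that breaks the sequential dependence between time steps within the preconditioner.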

We have shown that with the exact preconditioner, GMRES must converge in at most N iterations in exact arithmetic. In practice, the multigrid approximation performs very close to this for both the heat equation and the convection-diffusion problem. Thus, N iterations of a single AMG V-cycle at each time step can be spread over N processors. This results in the overall work being approximately N V-cycles rather than rN V-cycles in a classical sequential approach.
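One way to see the bound of N iterations, sketched here for backward Euler under assumed notation: if $P = I_N \otimes B$ with $B = M + \tau K$, then $P^{-1}A - I$ is strictly block lower triangular and so $(P^{-1}A - I)^N = 0$. The polynomial $(1 - z)^N$ therefore annihilates $P^{-1}A$ and equals 1 at $z = 0$, which forces the GMRES residual to vanish by step N in exact arithmetic. The check below uses a 1D finite difference Laplacian with identity mass matrix, an assumption made only to keep the example small.

```python
import numpy as np

def all_at_once_backward_euler(n, N, tau):
    """All-at-once matrix A and block diagonal preconditioner P for a 1D heat equation."""
    h = 1.0 / (n + 1)
    K = (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
    M = np.eye(n)                          # finite differences: identity in place of a mass matrix
    B = M + tau * K                        # diagonal block for one backward Euler step
    shift = np.eye(N, k=-1)                # couples each time step to the previous one
    A = np.kron(np.eye(N), B) - np.kron(shift, M)
    P = np.kron(np.eye(N), B)
    return A, P

n, N, tau = 7, 5, 0.1
A, P = all_at_once_backward_euler(n, N, tau)
E = np.linalg.solve(P, A) - np.eye(n * N)  # P^{-1} A - I, strictly block lower triangular
norm_power_N = np.linalg.norm(np.linalg.matrix_power(E, N))
norm_power_N_minus_1 = np.linalg.norm(np.linalg.matrix_power(E, N - 1))
```

Here `norm_power_N` is zero up to rounding while `norm_power_N_minus_1` is not, so for this example the degree N cannot be lowered.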

REFERENCES

[1] J. Boyle, M. D. Mihajlovic, and J. A. Scott, HSL MI20: an efficient AMG preconditioner, Technical rep. RAL-TR-2007-021, STFC Rutherford Appleton Laboratory, Didcot, UK, (2007).

[2] H. Elman, A. Ramage, and A. Silvester, Algorithm 866: IFISS, a Matlab toolbox for modelling incompressible flow, ACM Trans. Math. Softw., 33 (2007), pp. 2–14.

[3] H. Elman, D. J. Silvester, and A. J. Wathen, Finite elements and fast iterative solvers: with applications in incompressible fluid dynamics, Numerical Mathematics and Scientific Computation, Oxford Univ. Press, Oxford, UK, 2nd ed., 2014.

[4] M. Gander, Overlapping Schwarz for linear and nonlinear parabolic problems, in 9th International Conference on Domain Decomposition, 1996.

[5] M. J. Gander, 50 years of time parallel time integration, tech. rep., University of Geneva, 2014.

[6] M. J. Gander and S. Güttel, PARAEXP: a parallel integrator for linear initial-value problems, SIAM J. Sci. Comput., 35 (2013), pp. 123–142.

[7] G. Horton and S. Vandewalle, A space-time multigrid method for parabolic partial differential equations, SIAM J. Sci. Comput., 16 (1995), pp. 848–864.

[8] T. J. R. Hughes and A. Brooks, A multidimensional upwind scheme with no crosswind diffusion, Finite element methods for convection dominated flows, 34 (1979), pp. 19–35.

[9] J. Lions, Y. Maday, and G. Turinici, A “parareal” in time discretization of PDE’s, Comptes Rendus de l’Académie des Sciences Series I Mathematics, 332 (2001), pp. 661–668.

[10] X. Liu, E. Chow, K. Vaidyanathan, and M. Smelyanskiy, Improving the performance of dynamical simulations via multiple right-hand sides, in Parallel Distributed Processing Symposium (IPDPS), 2012 IEEE 26th International, May 2012, pp. 36–47.

[11] M. Neumüller, Space-Time Methods: Fast Solvers and Applications, PhD thesis, Graz University of Technology, June 2013.

[12] A. Ramage, A multigrid preconditioner for stabilised discretisation of advection-diffusion problems, Journal of Computational and Applied Mathematics, 110 (1999), pp. 187–203.

[13] H.-G. Roos, M. Stynes, and L. Tobiska, Robust numerical methods for singularly perturbed differential equations: convection-diffusion-reaction and flow problems, Springer-Verlag, Berlin, 1996.

[14] J. Ruge and K. Stüben, Algebraic multigrid (AMG), in Multigrid Methods, S. McCormick, ed., vol. 3 of Frontiers in Applied Mathematics, SIAM, 1987, pp. 73–130.

[15] Y. Saad and M. H. Schultz, GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems, SIAM J. Sci. Stat. Comput., 7 (1986), pp. 856–869.

[16] H. Van der Vorst, Bi-CGSTAB: A fast and smoothly convergent variant of Bi-CG for the solution of nonsymmetric linear systems, SIAM J. Sci. Stat. Comput., 13(2) (1992), pp. 631–644.