A Parallel Implementation of the BDDC Method for the Stokes Flow

Jakub Šístek

joint work with P. Burda, M. Čertíková, J. Mandel, J. Novotný, B. Sousedík

Institute of Mathematics of the AS CR, Prague
Czech Technical University, Prague
University of Colorado Denver

July 16th, 2010, ICCFD6, St. Petersburg




Table of contents

Stokes problem and mixed FEM

BDDC method

Parallel implementation

Numerical results

Conclusion


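Steady Stokes problem

Find flow velocity $u(x) \in [C^2(\Omega)]^d$ and pressure $p(x) \in C^1(\Omega)/\mathbb{R}$ satisfying

$$
\begin{aligned}
-\nu \Delta u + \nabla p &= f && \text{in } \Omega, \\
-\nabla \cdot u &= 0 && \text{in } \Omega, \\
u &= g && \text{on } \partial\Omega_g, \\
-\nu (\nabla u)\, n + p\, n &= 0 && \text{on } \partial\Omega_h,
\end{aligned}
$$

• $d = 2, 3$ ... spatial dimension
• $\Omega \subset \mathbb{R}^d$ ... domain with Lipschitz boundary $\partial\Omega$ filled with incompressible viscous fluid
• $\nu$ ... constant positive kinematic viscosity of the fluid
• $f(x)$ ... vector of intensity of volume forces per mass unit
• $\partial\Omega_g$ and $\partial\Omega_h$ ... subsets of $\partial\Omega$ satisfying $\partial\Omega = \partial\Omega_g \cup \partial\Omega_h$
• $n$ ... unit outer normal vector to the boundary $\partial\Omega$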


Weak formulation

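Function spaces

$$V_g = \left\{ v = (v_1, v_2) \mid v \in [H^1(\Omega)]^d;\ \operatorname{Tr} v_i = g_i,\ i = 1, \dots \text{ on } \partial\Omega_g \right\},$$

$$V = \left\{ v = (v_1, v_2) \mid v \in [H^1(\Omega)]^d;\ \operatorname{Tr} v_i = 0,\ i = 1, \dots \text{ on } \partial\Omega_g \right\}.$$

Find $u(x) \in V_g$, $u - u_g \in V$, and $p(x) \in L^2(\Omega)/\mathbb{R}$ satisfying

$$
\begin{aligned}
\nu \int_\Omega \nabla u : \nabla v \, d\Omega - \int_\Omega p \, \nabla \cdot v \, d\Omega &= \int_\Omega f \cdot v \, d\Omega && \forall v \in V, \\
-\int_\Omega \psi \, \nabla \cdot u \, d\Omega &= 0 && \forall \psi \in L^2(\Omega).
\end{aligned}
$$

• $u_g \in V_g$ satisfies the Dirichlet boundary condition $g$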


Approximation of the problem by finite element method

Triangulation τh of domain Ω by Taylor–Hood finite elements.

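[Figure: triangulation $\tau_h$ of the domain $\Omega$ with one finite element $K$ marked. Legend: one marker denotes a node with values of the velocity components and pressure, the other a node with values of the velocity components only.]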


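Approximation of the problem by finite element method

Taylor–Hood finite elements satisfy the Babuška–Brezzi condition

$$\exists\, C_B > 0,\ \text{const.}, \quad \forall \psi_h \in Q_h: \quad \sup_{v_h \in V_h} \frac{(\psi_h, \nabla \cdot v_h)_0}{\|v_h\|_1} \ge C_B \|\psi_h\|_0 .$$

Function spaces for approximation:

velocities

$$V_{gh} = \left\{ v_h \in [C(\Omega)]^d;\ v_{hi}|_K \in R_2(K),\ i = 1, \dots, d;\ v_h = g \text{ on } \partial\Omega_g \right\}$$

pressure and test functions for the continuity equation

$$Q_h = \left\{ \psi_h \in C(\Omega);\ \psi_h|_K \in R_1(K) \right\}$$

test functions for momentum equations

$$V_h = \left\{ v_h \in [C(\Omega)]^d;\ v_{hi}|_K \in R_2(K),\ i = 1, \dots, d;\ v_h = 0 \text{ on } \partial\Omega_g \right\}$$

where

$$
R_m(K) =
\begin{cases}
P_m(K), & \text{if } K \text{ is a triangle/tetrahedron}, \\
Q_m(K), & \text{if } K \text{ is a quadrilateral/hexahedron}.
\end{cases}
$$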


Matrix problem

Discretization leads to the saddle point problem

$$
\begin{bmatrix} A & B^T \\ B & 0 \end{bmatrix}
\begin{bmatrix} u \\ p \end{bmatrix}
=
\begin{bmatrix} f \\ 0 \end{bmatrix}
$$

• $u$ ... velocity unknowns
• $p$ ... pressure unknowns
• $A$ ... vector-Laplacian matrix
• $B$ ... divergence matrix
• $f$ ... discrete vector of intensity of volume forces per mass unit
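As a small illustration of the block structure above, the following sketch assembles a saddle point matrix from stand-in blocks and checks that it is symmetric but indefinite. The blocks are assumptions chosen only for illustration: `A` is a 1D Laplacian rather than an assembled vector-Laplacian, and `B` is an arbitrary full-row-rank matrix rather than a discrete divergence.

```python
import numpy as np

# Hypothetical stand-ins: A is a small SPD matrix (a 1D Laplacian, not an
# assembled vector-Laplacian), B is a full-row-rank "divergence-like" matrix.
n, m = 4, 2
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
B = np.array([[1.0, -1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, -1.0]])

# Saddle point matrix [[A, B^T], [B, 0]] and right-hand side [f; 0].
K = np.block([[A, B.T], [B, np.zeros((m, m))]])
f = np.ones(n)
rhs = np.concatenate([f, np.zeros(m)])

# Direct solve; the second block row enforces B u = 0.
sol = np.linalg.solve(K, rhs)
u, p = sol[:n], sol[n:]

# Inertia: with A SPD and B of full row rank, K has n positive and
# m negative eigenvalues, i.e. it is symmetric indefinite.
eigs = np.linalg.eigvalsh(K)
n_pos = int(np.sum(eigs > 0))
n_neg = int(np.sum(eigs < 0))
print("positive eigenvalues:", n_pos, "negative eigenvalues:", n_neg)
```

The indefiniteness seen here is the reason the SPD theory of BDDC does not carry over directly to the Stokes system.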


Brief overview of BDDC method

• Balancing Domain Decomposition by Constraints
• introduced in 2003 by C. Dohrmann (Sandia), theory with J. Mandel (UCD)
• nonoverlapping primal domain decomposition method
• equivalent to FETI-DP [Mandel, Dohrmann, Tezaur 2005]
• for SPD problems, the condition number $\kappa$ satisfies

$$\kappa \le C \log^2\left(1 + \frac{H}{h}\right)$$
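To give a feel for how slowly this bound grows, the sketch below evaluates the factor $\log^2(1 + H/h)$ for a few ratios of subdomain size $H$ to element size $h$. The constant $C$ is not given on the slide and is left out; the function name and the sample ratios are illustrative assumptions.

```python
import math


def bddc_bound_factor(H_over_h: float) -> float:
    """Return log^2(1 + H/h), the growth factor in kappa <= C log^2(1 + H/h)."""
    return math.log(1.0 + H_over_h) ** 2


# The factor depends only on the ratio H/h, i.e. on the number of elements
# across a subdomain, not on the number of subdomains.
for ratio in (4, 16, 64, 256):
    print(f"H/h = {ratio:4d}   log^2(1 + H/h) = {bddc_bound_factor(ratio):6.2f}")
```

Increasing $H/h$ by a factor of 16, from 4 to 64, raises the factor by less than a factor of 7, which is the sense in which the bound is called polylogarithmic.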


The abstract problem in BDDC

Variational setting

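$$u \in U: \quad a(u, v) = \langle f, v \rangle \quad \forall v \in U$$

• $a(\cdot, \cdot)$ is a symmetric positive definite form on $U$
• $\langle \cdot, \cdot \rangle$ is an inner product on $U$
• $U$ is a finite dimensional space

Matrix form

$$u \in U: \quad Au = f$$

• $A$ is a symmetric positive definite matrix on $U$

Linked together

$$\langle Au, v \rangle = a(u, v) \quad \forall u, v \in U$$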


BDDC set-up

• division into subdomains
• selection of coarse problem nodes (also called corners)

[Figure: domain divided into subdomains $\Omega_i$ of size $H$, each made of finite elements of size $h$; the interface and the coarse problem nodes are marked.]


Function spaces in BDDC

$$U \subset W^c \subset W$$

• $U$ ... continuous
• $W^c$ ... continuous at coarse problem nodes
• $W$ ... no continuity

• enough coarse nodes to fix floating subdomains, so that rigid body modes are captured
• $a(\cdot, \cdot)$ is a symmetric positive definite form on $W^c$
• corresponding matrix $A^c$ is symmetric positive definite, has an almost block diagonal structure, and has a larger dimension than $A$
• operator of projection $E: W^c \to U$, $\operatorname{Range}(E) = U$, e.g. averaging across interfaces (arithmetic, weighted)


The BDDC preconditioner with corners

Define $M_{BDDC}: r \in U \longrightarrow u \in U$

variational form

$$M_{BDDC}: r \longmapsto u = Ew, \quad w \in W^c: \quad a(w, z) = \langle r, Ez \rangle \quad \forall z \in W^c$$

matrix form

$$A^c w = E^T r$$

$$M_{BDDC}\, r = E w$$
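The matrix form of the preconditioner can be tried out on a toy problem. The sketch below is not the implementation from the talk: it uses a 1D Laplacian split into two subdomains that share one interface node, and it assumes that no corners are needed because both subdomains touch the Dirichlet boundary, so the space $W^c$ is simply the fully decoupled space. The names `R`, `Ac`, `E` and `apply_bddc` are illustrative.

```python
import numpy as np


def local_stiffness(n_dof, free_index):
    """1D Laplacian (h = 1) on one subdomain: Dirichlet at one end (eliminated),
    natural condition at the interface dof `free_index`."""
    K = 2.0 * np.eye(n_dof) - np.eye(n_dof, k=1) - np.eye(n_dof, k=-1)
    K[free_index, free_index] = 1.0
    return K


n_loc = 4                                   # unknowns per subdomain
A1 = local_stiffness(n_loc, n_loc - 1)      # interface is the last local dof
A2 = local_stiffness(n_loc, 0)              # interface is the first local dof
Z = np.zeros((n_loc, n_loc))
Ac = np.block([[A1, Z], [Z, A2]])           # block diagonal matrix on W^c

# R maps a continuous vector (U) to its subdomain copies (W^c).
n_glob = 2 * n_loc - 1
R = np.zeros((2 * n_loc, n_glob))
for i in range(n_loc):
    R[i, i] = 1.0
    R[n_loc + i, n_loc - 1 + i] = 1.0

A = R.T @ Ac @ R                            # assembled global matrix on U

# E averages the copies of each global dof (arithmetic averaging), E R = I.
mult = R.sum(axis=0)                        # number of copies of each global dof
E = R.T * (R @ (1.0 / mult))


def apply_bddc(r):
    """Apply M_BDDC r = E w, where Ac w = E^T r."""
    w = np.linalg.solve(Ac, E.T @ r)
    return E @ w


# Build M column by column and inspect the spectrum of the preconditioned operator.
M = np.column_stack([apply_bddc(e) for e in np.eye(n_glob)])
eigs = np.linalg.eigvals(M @ A).real
print("eigenvalues of M A lie in [%.3f, %.3f]" % (eigs.min(), eigs.max()))
```

Because $E R = I$, the eigenvalues of the preconditioned operator are bounded below by one, which the printed interval reflects. In a real implementation the solve with $A^c$ is done subdomain by subdomain plus a coarse problem, or with a parallel sparse direct solver.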


Fictitious mesh

• $A^c$ can be constructed using an auxiliary mesh


BDDC for the Stokes problem

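In the abstract form $Au = f$, simply substitute

$$
A \leftarrow \begin{bmatrix} A & B^T \\ B & 0 \end{bmatrix}, \qquad
u \leftarrow \begin{bmatrix} u \\ p \end{bmatrix}, \qquad
f \leftarrow \begin{bmatrix} f \\ 0 \end{bmatrix},
$$

where the symbols on the left are those of the abstract problem and the blocks on the right are those of the Stokes saddle point problem.

• system matrix and preconditioner are symmetric indefinite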


Parallel implementation

• built on multifrontal solver
• MUltifrontal Massively Parallel sparse direct Solver (MUMPS), http://mumps.enseeiht.fr
• based on $W^c$
• Fortran 90 programming language, MPI library
• experiments on
    • SGI Altix 4700, CTU, Prague, CR: 72 processors Intel Itanium 2, OS Linux


Stokes flow in 2D lid driven cavity

• $128^2 = 16\,384$ Taylor–Hood elements, 115 971 dof
• 8 subdomains, 14 corners

[Figure: plot on the unit square; both axes range from 0 to 1.]


Stokes flow in 2D lid driven cavity

• 8 processors of SGI Altix 4700
• 59 PCG iterations, 17.2 sec (serial frontal algorithm: 231 sec)

[Figure: two panels on the unit square, streamlines and pressure.]


Stokes flow in 2D lid driven cavity

• other iterative methods and preconditioners
• $\|r\|_2 / \|g\|_2 < 10^{-8}$
• Matlab results

method      no prec.   BDDC W^c   BDDC W^c+F   ILUT τ=10^-3   ILUT τ=10^-4   ILUT τ=10^-5
BICGSTAB    n/a        45         22           n/a            331            10
GMRES       759        49         38           472            87             18

• n/a ... no convergence
• F ... continuity of arithmetic averages on faces


Stokes flow in 3D channel

• a quarter of the channel
• 3 393 Taylor–Hood finite elements, 54 248 unknowns
• division into 4 subdomains


Stokes flow in 3D channel

• $\|r\|_2 / \|f\|_2 < 10^{-6}$ reached by 33 PCG iterations

[Figure: pressure with streamlines.]


Stokes flow in 3D lid driven cavity

• 3D extension of 2D lid driven cavity flow
• unit cube
• tangential velocity rotated by $\pi/8$
• kinematic viscosity 0.01

[Figure: unit cube with axes $x$, $y$, $z$ and the lid velocity $u$ rotated by $\pi/8$.]


Stokes flow in 3D lid driven cavity

• $32^3 = 32\,768$ Taylor–Hood finite elements, 457 380 unknowns
• division into 32 subdomains by METIS graph partitioner
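The unknown counts quoted for the two lid driven cavity meshes can be cross-checked with a short count. The sketch below rests on an inference that is not stated on the slides: velocity nodes are assumed to sit at element vertices and edge midpoints (serendipity-type quadrilaterals and hexahedra), pressure nodes at vertices, and boundary unknowns are assumed to be included in the totals. The function name is hypothetical.

```python
def taylor_hood_serendipity_dofs(n: int, dim: int) -> int:
    """Count unknowns on a structured n^dim mesh of the unit square/cube.

    Assumption (not from the slides): velocity nodes at vertices and edge
    midpoints, pressure nodes at vertices, boundary unknowns included.
    """
    vertices = (n + 1) ** dim
    edges = dim * n * (n + 1) ** (dim - 1)
    velocity_nodes = vertices + edges
    # dim velocity components per velocity node, one pressure value per vertex.
    return dim * velocity_nodes + vertices


print(taylor_hood_serendipity_dofs(128, 2))  # 115971, the 2D cavity count
print(taylor_hood_serendipity_dofs(32, 3))   # 457380, the 3D cavity count
```

Under these assumptions both quoted totals are reproduced exactly, which suggests, but does not prove, that this is the element layout used in the experiments.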


Stokes flow in 3D lid driven cavity

• stopping criterion $\|r\|_2 / \|g\|_2 < 10^{-6}$
• 46 PCG iterations
• 731 sec on 32 processors
• streamline through the point with coordinates [0.5, 0.55, 0.5]


Conclusion and future direction

• implementation based on MUMPS is simpler than the subdomain-by-subdomain approach
• theoretical background for BDDC on indefinite problems is lacking for some elements
• study of applying the BDDC implementation to the Stokes problem 'as is'
• BDDC is applicable to problems other than SPD and to methods other than PCG
• more sophisticated (adaptive) selection of constraints is ongoing research
• application as a block preconditioner


Introduce matrix $G$ with constraints

• each row of $G$ corresponds to a continuity constraint between two subdomains
• introduces new coupling between subdomains

Example: for arithmetic averages on an edge between subdomains $i$ and $j$, a row of $G$ is

$$g_k = [\, 0 \dots 0 \ \underbrace{1 \ 1 \ 1 \ 1}_{\text{edge dof on } \Omega_i} \ 0 \dots 0 \ \underbrace{-1 \ {-1} \ {-1} \ {-1}}_{\text{edge dof on } \Omega_j} \ 0 \dots 0 \,]$$

Define the intermediate space as

$$\widetilde{W} = \left\{ w \in W^c : Gw = 0 \right\}$$
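A row of $G$ of this kind is easy to build and test. The sketch below uses hypothetical index sets for the edge unknowns of the two subdomains within a vector of 12 unknowns; the helper name `average_constraint_row` and all indices are assumptions made for illustration.

```python
import numpy as np


def average_constraint_row(n_dof, edge_dofs_i, edge_dofs_j):
    """Row of G: +1 on the edge dofs of subdomain i, -1 on those of subdomain j."""
    g = np.zeros(n_dof)
    g[edge_dofs_i] = 1.0
    g[edge_dofs_j] = -1.0
    return g


n_dof = 12
edge_i = [2, 3, 4, 5]        # hypothetical positions of the edge dofs on subdomain i
edge_j = [8, 9, 10, 11]      # hypothetical positions of the same edge on subdomain j
G = average_constraint_row(n_dof, edge_i, edge_j)[np.newaxis, :]

# Edge values differ pointwise but have equal averages: constraint satisfied.
w_ok = np.zeros(n_dof)
w_ok[edge_i] = [1.0, 2.0, 3.0, 4.0]
w_ok[edge_j] = [4.0, 3.0, 2.0, 1.0]
print(G @ w_ok)    # [0.]

# Shifting one side by 1 changes its average: constraint violated.
w_bad = w_ok.copy()
w_bad[edge_j] += 1.0
print(G @ w_bad)   # [-4.]
```

Since both sides of the edge carry the same number of unknowns, equality of the sums is equivalent to equality of the arithmetic averages.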


Enforcing additional constraints

Change of variables on each subdomain, such that averages appear as single node constraints.

$$\bar{w} = T w, \qquad w = B \bar{w}, \qquad B = T^{-1}$$

Matrix $T$ is invertible and contains the weights of the averages.

Compute $M_{BDDC}\, r = E B \bar{w}$, where $\bar{w}$ is the solution to

$$
\begin{aligned}
B^T A^c B \bar{w} + B^T G^T \lambda &= B^T E^T r, \\
G B \bar{w} &= 0.
\end{aligned}
$$

Transformed averages may be handled as corners and further assembled [Li, Widlund 2006].

Drawback: the distinction between $W^c$ and $\widetilde{W}$ is lost.
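One simple choice of the transformation for a single edge with four unknowns is sketched below; the transformed vector is written `w_bar`. This particular $T$, which replaces the first unknown by the arithmetic average and leaves the others unchanged, is only an assumption made for illustration and is not necessarily the change of basis used in the talk or in [Li, Widlund 2006].

```python
import numpy as np

n_edge = 4

# Hypothetical T for one edge: the first row holds the averaging weights,
# the remaining rows leave the other unknowns unchanged.
T = np.eye(n_edge)
T[0, :] = 1.0 / n_edge
B = np.linalg.inv(T)

w = np.array([1.0, 2.0, 3.0, 6.0])
w_bar = T @ w
print(w_bar)        # first entry is the edge average, 3.0
print(B @ w_bar)    # recovers the original w

# The edge part of an averaging row of G collapses to a single nonzero entry.
g_local = np.ones(n_edge)
print(g_local @ B)  # [4. 0. 0. 0.]
```

After the transformation the average lives in one unknown, so the constraint couples just one transformed unknown per subdomain and can be treated like a corner. With the rows of $G$ built from ones, the single nonzero entry is 4 rather than 1; scaling the row by $1/4$ gives the normalized form.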


Projected change of variables

Combination of projected BDDC and change of variables:

• introduce matrix $G$ with constraints,
• define matrix $\bar{G} = GB$, which reduces to one $1$ and one $-1$ in each row.

Projection onto $\operatorname{null}(\bar{G})$

$$P = I - \bar{G}^T \left( \bar{G} \bar{G}^T \right)^{-1} \bar{G}$$

Construct matrix

$$\widetilde{A} = P B^T A^c B P + t (I - P)$$

BDDC preconditioner as

$$\widetilde{A} \bar{w} = P B^T E^T r$$

$$M_{BDDC}\, r = E B \bar{w}$$
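The projected construction can be checked numerically on a tiny example. The sketch below makes several assumptions for illustration: the matrix `Ac` is a random SPD stand-in rather than an assembled subdomain matrix, only the eight edge unknowns of two subdomains are kept, the change of variables replaces the first edge unknown of each subdomain by the edge average, the vector `b` stands in for $E^T r$, and $t = 1$ is an arbitrary positive scalar.

```python
import numpy as np

rng = np.random.default_rng(0)
n_edge = 4
n = 2 * n_edge                         # edge dofs of subdomain i, then of subdomain j

# SPD stand-in for the matrix on W^c (hypothetical, not an assembled FEM matrix).
X = rng.standard_normal((n, n))
Ac = X @ X.T + n * np.eye(n)

# One averaging constraint: sum over the edge on i minus sum over the edge on j.
G = np.concatenate([np.ones(n_edge), -np.ones(n_edge)])[np.newaxis, :]

# Change of variables per subdomain: first transformed dof is the edge average.
T_loc = np.eye(n_edge)
T_loc[0, :] = 1.0 / n_edge
B_loc = np.linalg.inv(T_loc)
Z = np.zeros((n_edge, n_edge))
B = np.block([[B_loc, Z], [Z, B_loc]])

# Transformed constraints and the orthogonal projection onto their null space.
G_bar = G @ B
P = np.eye(n) - G_bar.T @ np.linalg.solve(G_bar @ G_bar.T, G_bar)

# Regularized projected matrix; t > 0 makes it nonsingular off range(P).
t = 1.0
A_tilde = P @ B.T @ Ac @ B @ P + t * (np.eye(n) - P)

b = rng.standard_normal(n)             # stands in for E^T r
w_bar = np.linalg.solve(A_tilde, P @ B.T @ b)
w = B @ w_bar

print(G_bar)                           # one nonzero per subdomain, +4 and -4
print(G @ w)                           # approximately 0: constraint satisfied
```

The term $t(I - P)$ only acts on the complement of the constrained space, so the computed `w_bar` lies in the null space of `G_bar` and the recovered `w` satisfies the original averaging constraint without any Lagrange multipliers.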