Parallel 3D Mortar Element Method for Adaptive Nonconforming Meshes

Journal of Scientific Computing, Vol. 27, Nos. 1–3, June 2006 (© 2005)
DOI: 10.1007/s10915-005-9026-z


Huiyu Feng,1 Catherine Mavriplis,2 Rob Van der Wijngaart,1 and Rupak Biswas3

Received October 15, 2004; accepted (in revised form) September 1, 2005; Published online December 29, 2005

We present a new two-step “intermediate mortar” approach that efficiently extends the Mortar Element Method (MEM) to 3D meshes with nonconforming discretizations. The technique utilizes projection matrices derived in 2D, thereby avoiding the need to explicitly form large matrices for the 3D cases. The method also allows more flexibility for both h- and p-type adaptation. Parallel implementations with OpenMP and multithreading compiler directives are used to solve a moving heat source problem on SGI Origin and Cray MTA machines. Numerical results demonstrate the advantages of adaptive nonconforming meshes with MEM over uniformly fine meshes.

KEY WORDS: Mortar element method; spectral element method; adaptive; nonconforming.

1. INTRODUCTION

High order methods are frequently used in computational simulations of complex physical phenomena because of their superior accuracy. An efficient way to avoid unnecessary computations in smooth regions of the solution is to use adaptive meshes that employ fine resolutions only in regions where they are needed. Nonconforming spectral elements allow the mesh to be flexibly adjusted to satisfy the accuracy requirements of the simulation. The method is particularly suitable for unsteady problems

1 Computer Sciences Corporation, NASA Ames Research Center, Moffett Field, CA 94035,USA.

2 Mechanical and Aerospace Engineering Department, George Washington University, Washington, DC 20052, USA.

3 NAS Division, NASA Ames Research Center, Moffett Field, CA 94035, USA.


0885-7474/06/0600-0231/0 © 2005 Springer Science+Business Media, Inc.


with very disparate length scales or moving features, as occur in heat transfer, fluid dynamics, or flame combustion. The computational savings over nonadaptive methods become more significant for higher order methods because of the latter’s large number of floating-point operations.

The flexibility of mesh generation directly affects the efficiency of the adaptation process. Unstructured and nonconforming meshes certainly simplify adaptation. The method discussed here is for nonconforming meshes in the context of the Spectral Element Method (SEM). However, the occurrence of spurious discontinuities across nonconforming interfaces complicates the numerical algorithm. Different methods have been used to deal with this problem. In [10], interpolation (instead of projection) is used to handle the nonconforming interface at element boundaries. This method was motivated by the result of [14], which reported spurious eigenvalues in the convection operator for certain nonconforming formulations.

An alternative is to use the Mortar Element Method (MEM) [5, 12], a nonoverlapping domain decomposition technique that couples different variational approximations in different subdomains. It has the capability to couple spectral methods of varying polynomial degrees, spectral methods with finite elements, or different finite element methods with different meshes [1]. The MEM does this in an optimal manner so that the error is bounded by the sum of the subregion-by-subregion approximation errors, independent of the discretization scheme.

By allowing both geometrically and functionally nonconforming meshes, MEM greatly increases the portability of SEM and the flexibility of mesh adaptation. It allows jumps across interfaces between adjacent subdomains, but minimizes them in an integral sense for interior points of mortar elements. The $C^0$ continuity condition is replaced by weak matching and mortaring conditions. Different techniques can be used to enforce continuity. For instance, Lagrange multipliers are used in the Finite Element Tearing and Interconnecting method [6]. Alternatively, a nonconforming space can be explicitly constructed by introducing an auxiliary mortar trace space over the skeleton of an appropriate variational operator [13]. This strategy was studied in [5], and discussed exclusively for SEM in [4, 13]. It was extended to 3D in [3], and analyzed for spectral elements in [2].

In this work, the nonconforming discretization is based on Mavriplis’s approach [13] that allows a clean decoupling between local residual evaluations and the transmission of continuity and boundary conditions. Moreover, it does not degrade the accuracy of SEM; in fact, its elemental strategy decouples global and local computations, allowing efficient parallel implementations. A new two-step approach to efficiently extend


MEM to 3D is described here. By introducing “intermediate mortars”, the projection between elements and mortars is decoupled. At each step, projection matrices derived in 2D are used, thus avoiding the need to explicitly form large matrices for 3D meshes. Correctness and accuracy are validated by numerical tests. Parallel performance results are also presented and analyzed.

2. MORTAR ELEMENTS ON NONCONFORMING MESHES

A mesh is called “geometrically conforming” if the intersection between the closures of any two elements is either empty, a vertex, an edge, or a face; it is called “nonconforming” otherwise. In the context of high-order methods such as SEM, the concept of nonconformity can be extended from geometrical to functional. If the elements on the two sides of an element interface are approximated by basis functions of different order, they are “functionally nonconforming”. The method described in this paper applies to both geometrically and functionally nonconforming 3D meshes.

Mortar elements are located on element boundaries. The dependent variables on mortars are approximated using high-order polynomials. In 3D meshes, a mortar element is a 2D plane as shown in Fig. 1. The right (shaded) face of element 3 matches four elements, so it has four corresponding mortar elements. If an element face matches more than one element, it is a nonconforming face; otherwise, it is conforming.

Fig. 1. Mortar elements in 3D meshes. [Figure labels: mortar elements; element 3; element 4; mortar element shared by elements 3 and 4.]


Each mortar element is located on the interface of two adjacent elements. For conforming interfaces, the locus of mortar collocation points is identical to that on either element’s boundary. However, the collocation points on the two elements’ boundaries are different if their interface is nonconforming. If it is geometrically nonconforming, the mortar collocation point locus is the same as that of the more refined element. If the interface is functionally nonconforming, the mortar collocation point locus is the same as that on the side with the lower order approximation. If the interface is both geometrically and functionally nonconforming, the collocation point locus is the same as that of the more refined element.
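The selection rules in the preceding paragraph can be condensed into a small decision function. The following Python sketch is illustrative only: the `FaceSide` record and its `level` and `order` fields are hypothetical names introduced here, not part of the implementation described in this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FaceSide:
    """One side of an element interface (hypothetical helper type)."""
    name: str
    level: int  # h-refinement level; a larger value means a smaller element
    order: int  # polynomial approximation order N on this side


def mortar_locus_side(a: FaceSide, b: FaceSide) -> FaceSide:
    """Return the side whose collocation points the mortar adopts."""
    if a.level != b.level:
        # Geometrically nonconforming (with or without a difference in
        # order): follow the more refined element.
        return a if a.level > b.level else b
    if a.order != b.order:
        # Functionally nonconforming only: follow the lower order side.
        return a if a.order < b.order else b
    # Conforming interface: both sides carry the same points.
    return a
```

For example, an interface between a coarse element of order 8 and a refined element of order 4 takes its mortar points from the refined element, whatever the orders are.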

In 3D meshes, a mortar element edge may be shared by more than two elements. If it is shared by a nonconforming face, it is called a nonconforming edge. The collocation point locus on the nonconforming edge is identical to that on the most refined element. For example, in Fig. 2(a), the highlighted edge in elements 1–3 is shared by elements 4 and 5. So the collocation point locus on the edges of the corresponding mortars is the same as that on the edges of elements 4 and 5. The mortar collocation point locus on the top and right faces of element 3 is shown in Fig. 2(b), and on the right and bottom faces of element 1 in Fig. 2(c). If a mortar element edge is shared by geometrically conforming but functionally nonconforming elements, the collocation point locus is the same as that on the element with the lowest approximation order.

3. PROJECTION BETWEEN MORTARS AND ELEMENTS

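The projection between mortars and elements is accomplished by projection matrices derived in 2D. With the introduction of an auxiliary mortar space $W_h$, the nonconforming discretization space $X_h$ is described as

$$
X_h = \Big\{ v \in L^2(\Omega),\ \forall k \in \{1,\dots,K\},\ v|_{\Omega^k} \in P_N(\Omega^k) \ \text{such that}\ \exists\, \phi \in W_h \ \text{for which:}
$$

$$
\forall\ \text{vertices } v_q \text{ of } \Omega^k, \quad v|_{\Omega^k}(v_q) = \phi(v_q), \tag{1}
$$

$$
\forall l \in \{1,\dots,4\},\ \forall k \in \{1,\dots,K\},\ \forall \psi \in P_{N-2}(\Gamma^{k,l}), \quad \int_{\Gamma^{k,l}} \big( v|_{\Omega^k} - \phi \big)\, \psi \, ds = 0 \Big\}, \tag{2}
$$

where

$$
W_h = \big\{ \phi \in C^0(S),\ \forall p \in \{1,\dots,M\},\ \phi|_{\gamma^p} \in P_N(\gamma^p),\ \phi|_{\partial\Omega} = 0 \big\}
$$

and $K$ is the total number of elements; $q \in \{1,\dots,V\}$, where $V$ is the total number of vertices in the mesh; $M$ is the total number of mortar elements; $\Gamma^{k,l}$ are the edges of the elements, such that $\partial\Omega^k = \bigcup_{l=1}^{4} \Gamma^{k,l}$; the mortars are $\gamma^p = (\Omega^k \cap \Omega^m)$ for some elements $k$ and $m$; the skeleton is $S = \bigcup_{p=1}^{M} \gamma^p = \bigcup_{k=1}^{K} \partial\Omega^k$; and $P_N(\Omega^k)$ denotes the space of all polynomials on $\Omega^k$ of degree $N$ in each spatial direction. The $L^2$ condition, Eq. (2), ensures that the jump in functions between the element edge and its corresponding mortar is small, whereas the vertex condition, Eq. (1), ensures exact continuity at cross points.

Fig. 2. (a) Exploded view of a 3D nonconforming mesh; (b) collocation points on mortar elements corresponding to two faces of element 3; and (c) two faces of element 1.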

The projection matrix $Q$ handles the solution projection from mortars to element boundaries, while $Q^T$ deals with the residual projection from element boundaries to mortars. The derivation of $Q$ is based on Eqs. (1) and (2). The basis function for test function $\psi$ is chosen as

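$$
\psi|_{\Gamma^{k,l}} = \sum_{q=1}^{N-1} \beta_q\, \eta_q^{N-2}(s), \tag{3}
$$

where

$$
\eta_q^{N-2}(z) = (-1)^{N-q}\, \frac{L_N'(z)}{\xi_q - z}, \qquad z \in [-1, 1], \quad q \in \{1, \dots, N-1\}. \tag{4}
$$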
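To make Eq. (4) concrete, the sketch below evaluates the test functions numerically. It assumes the usual spectral element convention, which is not spelled out in the text above: $L_N$ is the Legendre polynomial of degree $N$, and $\xi_1, \dots, \xi_{N-1}$ are the interior Gauss-Lobatto-Legendre points, i.e. the roots of $L_N'$ (taken here in ascending order). Under that assumption the factor $(\xi_q - z)$ divides $L_N'(z)$, so each $\eta_q^{N-2}$ is a polynomial of degree $N-2$ that vanishes at every interior point other than $\xi_q$. All function names are hypothetical.

```python
import math


def legendre(n: int, z: float) -> tuple[float, float]:
    """Return (L_{n-1}(z), L_n(z)) for n >= 1 via the three-term recurrence."""
    prev, cur = 1.0, z  # L_0 and L_1
    for k in range(1, n):
        prev, cur = cur, ((2 * k + 1) * z * cur - k * prev) / (k + 1)
    return prev, cur


def dlegendre(n: int, z: float) -> float:
    """L_n'(z) for |z| < 1, from (z^2 - 1) L_n' = n (z L_n - L_{n-1})."""
    lm1, ln = legendre(n, z)
    return n * (z * ln - lm1) / (z * z - 1.0)


def d2legendre(n: int, z: float) -> float:
    """L_n''(z) for |z| < 1, from the Legendre differential equation."""
    _, ln = legendre(n, z)
    return (2.0 * z * dlegendre(n, z) - n * (n + 1) * ln) / (1.0 - z * z)


def interior_gll(n: int) -> list[float]:
    """Roots of L_n' in ascending order, found by Newton iteration."""
    roots = []
    for j in range(1, n):
        z = -math.cos(math.pi * j / n)  # Chebyshev-Lobatto starting guess
        for _ in range(50):
            step = dlegendre(n, z) / d2legendre(n, z)
            z -= step
            if abs(step) < 1e-15:
                break
        roots.append(z)
    return roots


def eta(n: int, q: int, z: float, xi: list[float]) -> float:
    """Eq. (4) for z != xi_q; xi[q - 1] holds xi_q."""
    return (-1) ** (n - q) * dlegendre(n, z) / (xi[q - 1] - z)
```

For $N = 4$ the interior points are $0$ and $\pm\sqrt{3/7}$, and `eta(4, q, xi[j - 1], xi)` is zero to rounding error whenever `j != q`.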

The residual on mortars is obtained by adding residuals projected from all related elements. $Q$ is the identity matrix for conforming interfaces. A detailed derivation of $Q$ for nonconforming interfaces can be found in [13]. The entries in $Q$ are not fixed but depend on the size and location of the mortar relative to the element edge. Each $Q$ and $Q^T$ pair is associated with only one element edge and its corresponding mortar. The global projection matrices are referred to as $Q$ and $Q^T$.

The discrete version of the Poisson equation, $\nabla^2 T = -q$, is derived from the weak form $(\nabla T_h, \nabla v_h) = (q_h, v_h)$, and is equivalent to

$$
A T = B q, \tag{5}
$$

where $A$ is the discrete matrix for the global Laplacian operator, $B$ the global mass matrix, and $T$ represents the unknowns (degrees of freedom) at collocation points both on element interiors and on mortar elements. For nonconforming meshes, Eq. (5) becomes

$$
Q^T A Q\, T = Q^T B q. \tag{6}
$$

This is a symmetric positive definite system that can be solved by iterative methods such as Conjugate Gradient (CG).
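As an illustration of Eq. (6), the following sketch solves a toy system with an unpreconditioned CG iteration in which the operator $Q^T A Q$ is applied factor by factor and never assembled. The 3×3 matrix `A`, the mass matrix `B`, and the 3×2 matrix `Q` are small placeholders chosen for readability; they are assumptions of this example and are not taken from the discretization in this paper.

```python
def matvec(m, v):
    """Matrix-vector product for nested lists."""
    return [sum(a * x for a, x in zip(row, v)) for row in m]


def transpose(m):
    return [list(col) for col in zip(*m)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(apply_op, rhs, tol=1e-12, max_iter=100):
    """Unpreconditioned CG for a symmetric positive definite operator."""
    x = [0.0] * len(rhs)
    r = list(rhs)
    p = list(r)
    rr = dot(r, r)
    for _ in range(max_iter):
        if rr ** 0.5 < tol:
            break
        ap = apply_op(p)
        alpha = rr / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x


# Toy data (placeholders): a 1D Laplacian-like A, an identity mass matrix B,
# and a Q that interpolates the middle unknown from two mortar unknowns.
A = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
B = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
Q = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
QT = transpose(Q)
q_vec = [1.0, 1.0, 1.0]


def apply_qt_a_q(v):
    """Apply Q^T A Q to v without forming the product matrix."""
    return matvec(QT, matvec(A, matvec(Q, v)))


rhs = matvec(QT, matvec(B, q_vec))
T = conjugate_gradient(apply_qt_a_q, rhs)
```

With these placeholders the reduced operator is [[1.5, -0.5], [-0.5, 1.5]] and the right-hand side is [1.5, 1.5], so the solution is T = [1.5, 1.5].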

4. 3D MEM IMPLEMENTATION

As noted earlier, mortar elements in 3D meshes are 2D planes. Solution projection from mortars to a nonconforming face is performed in two steps: mortar elements to intermediate mortar elements, and then to the element face, as shown in Fig. 3. At each step, the projection is performed only in one coordinate direction. For example, in Fig. 3, the projection is first performed in the row direction, and then in the column direction. The two-step method avoids the need to form a large projection matrix for the entire element face. Also, the projection matrix for each line segment on the face is the same; therefore it can be used repeatedly.

In the first step, the projection results are added and stored at the intermediate mortar collocation points. The value at the $i$th collocation point, $\phi_i$, in a row of collocation points on an intermediate mortar, is

$$
\phi_i = \sum_p Q^p_{ij}\, \psi^p_j, \qquad j \in \{0, 1, \dots, N\}, \tag{7}
$$

where $p$ refers to all mortar elements corresponding to this intermediate mortar and $\psi^p_j$ is the value at the $j$th collocation point in a row of collocation points on mortar element $p$. In the second step, $q$ different $Q$ matrices are applied to the values on a column of collocation points on the intermediate mortars. The projection results are summed to be the values on the corresponding column of collocation points on the element face. Note that $p$ and $q$ may be unequal depending on the number of mortars for this nonconforming face in each direction. The projection result is independent of the direction order; i.e., steps 1 and 2 may be switched.

Fig. 3. Mortar-to-element projection, one direction at each step. [Figure labels: mortar elements; intermediate mortars (step 1); element face (step 2); mapping direction for each step.]
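The two-step procedure, and the claim that the two steps may be switched, can be illustrated with plain matrix products. In the sketch below each mortar carries a small grid of values, the row-direction step applies Eq. (7) (with summation over the repeated index $j$ assumed) to every row of collocation points, and the column-direction step does the same along columns. The integer matrices are arbitrary placeholders chosen so that the comparison is exact; they are not the projection matrices derived in [13], and all names are hypothetical.

```python
def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def total(mats):
    """Entrywise sum of a non-empty list of equally sized matrices."""
    out = mats[0]
    for m in mats[1:]:
        out = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(out, m)]
    return out


def rows_then_columns(mortars, q_row, q_col):
    """mortars[a][b]: value grid of the mortar in mortar row a, column b."""
    # Step 1: project along rows and sum the mortars of each mortar row
    # into one intermediate mortar.
    inter = [total([matmul(m, transpose(q_row[b])) for b, m in enumerate(row)])
             for row in mortars]
    # Step 2: project along columns and sum onto the element face.
    return total([matmul(q_col[a], im) for a, im in enumerate(inter)])


def columns_then_rows(mortars, q_row, q_col):
    """Same projection with the two directions taken in the other order."""
    n_rows, n_cols = len(mortars), len(mortars[0])
    inter = [total([matmul(q_col[a], mortars[a][b]) for a in range(n_rows)])
             for b in range(n_cols)]
    return total([matmul(im, transpose(q_row[b])) for b, im in enumerate(inter)])


# Placeholder data: a face with 3x3 points covered by 2x2 mortars of 2x2 points.
Q_ROW = [[[2, 0], [1, 1], [0, 0]], [[0, 0], [1, 1], [0, 2]]]  # left, right
Q_COL = [[[1, 0], [0, 1], [0, 0]], [[0, 0], [1, 0], [0, 3]]]  # bottom, top
MORTARS = [[[[1, 2], [3, 4]], [[5, 6], [7, 8]]],
           [[[9, 10], [11, 12]], [[13, 14], [15, 16]]]]

FACE_A = rows_then_columns(MORTARS, Q_ROW, Q_COL)
FACE_B = columns_then_rows(MORTARS, Q_ROW, Q_COL)
```

Both orders produce the same 3×3 face array, because each amounts to the double sum of `q_col[a] · mortars[a][b] · q_row[b]ᵀ` over all mortars. Only the small one-directional matrices are ever stored, which is the point of the two-step method.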

The residual projection from elements to mortars is similar, except that transpose matrices $Q^T$ are used. Again, the two-step approach is applied: first from element faces to intermediate mortars, followed by projection from intermediate mortars to mortar elements. As before, the order of projection direction does not affect the final result. If the collocation point locations on the mortar and element face are the same except on the edge(s), then $Q$ and $Q^T$ operations need to be performed only on the edge; no intermediate mortar element is involved.

Although the above discussion is in Cartesian coordinates, this method can be applied to curved elements with the introduction of geometric factors. The geometric factors perform the mapping between the curved and Cartesian coordinates [e.g., see Ref. 11]. This method can be applied after the mapping, i.e., on the Cartesian mesh. Although curved elements are desirable in complex geometries, they are not used for now, in the interest of simplicity and robustness of mesh adaptation. A curved element may be replaced by finer Cartesian elements. Although this may slightly increase the number of elements, the extra expense of mapping between curved and Cartesian coordinates is avoided. More importantly, Cartesian elements are easier to generate and to handle automatically in the mesh adaptation procedure. In the next section, we will present a heat transfer problem with a spherically shaped heat source using pure Cartesian elements.

5. NUMERICAL RESULTS

Validation tests of the 3D MEM implementation are performed on a nonconforming mesh with domain size $[0,1]^3$. The mesh, consisting of 22 elements, is generated by three levels of h refinement on the lower-front-left element. A 2D front view of the cube is shown in Fig. 4. The mesh, though it looks simple, includes nonconforming faces and nonconforming edges. Solving the 3D Poisson problem

$$
\nabla^2 T = -3\pi^2 \sin(\pi x) \cos(\pi y) \sin(\pi z) \quad \text{in } \Omega, \tag{8}
$$

$$
T|_{\partial\Omega} = \sin(\pi x) \cos(\pi y) \sin(\pi z) \tag{9}
$$

with exact solution $T = \sin(\pi x) \cos(\pi y) \sin(\pi z)$, the projections from mortars to elements and vice versa yielded accuracy to within machine error. Similarly, the solution itself was captured to machine error. Exponential accuracy was achieved, as shown in Table I.

Fig. 4. Front view of the 3D test mesh generated by three levels of h refinement on the lower-front-left element. [Corner labels: (0,0,0), (1,0,0), (0,0,1), (1,0,1).]
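The exponential convergence claim can be checked directly against the entries of Table I. The short calculation below assumes that "Log" in the table means the base-10 logarithm. If the error decays exponentially in $N$, the logarithm of the error falls roughly linearly with $N$, so the successive slopes should be nearly constant.

```python
# Values copied from Table I.
orders = [4, 6, 8, 10, 12]
log_l2_error = [-4.1885, -6.5618, -9.1774, -11.9860, -14.4473]

# Change in log10(error) per unit increase of the approximation order N.
slopes = [
    (log_l2_error[i + 1] - log_l2_error[i]) / (orders[i + 1] - orders[i])
    for i in range(len(orders) - 1)
]
```

The four slopes lie between about -1.19 and -1.40, i.e. the error drops by more than one order of magnitude each time $N$ increases by one.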

The 3D MEM implementation is next applied to a heat transfer problem with convection, diffusion, and a moving source, originally developed in [7]. We use h-type mesh adaptation to track the effects of the spherically-shaped heat source, as shown in Fig. 5. The problem is modeled by

$$
T_t + v \cdot \nabla T = \varepsilon \nabla^2 T + S(x, t), \tag{10}
$$

Table I. Solution Error Versus Approximation Order N for Problem in Eqs. (8) and (9)

N               4          6          8          10          12
Log L2 error    −4.1885    −6.5618    −9.1774    −11.9860    −14.4473


Fig. 5. A spherically-shaped heat source travels across a cubic domain. The adaptive nonconforming mesh has finer resolution close to the source.

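where the source term is defined as

$$
S(x, t) =
\begin{cases}
\cos\!\left( \dfrac{\pi}{\alpha}\, \| x - x_0 - v t \| \right) + 1 & \text{if } \| x - x_0 - v t \| \le \alpha, \\[1ex]
0 & \text{if } \| x - x_0 - v t \| > \alpha.
\end{cases} \tag{11}
$$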

Here, $\| \cdot \|$ signifies the Euclidean norm, $T$ refers to temperature, $x$ is the location vector, $t$ is time, and $\varepsilon$ is the heat diffusion coefficient. The heat source, of radius $\alpha$, is initially located at $x_0$. The prescribed velocity field $v$ is uniform and constant, and equals the velocity of the source. The initial temperature distribution on the domain, consisting of the unit cube $[0,1]^3$, is zero, and the Dirichlet boundary conditions are fixed at zero as well.
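A direct transcription of the source term may help fix the notation. The sketch below assumes that the argument of the cosine in Eq. (11) is $\pi \|x - x_0 - vt\| / \alpha$, so that the source equals 2 at its center and falls continuously to zero at the radius $\alpha$. The function name and argument layout are hypothetical.

```python
import math


def heat_source(x, t, x0, v, alpha):
    """Source term S(x, t) of Eq. (11) for 3D points given as sequences."""
    center = [c + vi * t for c, vi in zip(x0, v)]  # source center at time t
    r = math.dist(x, center)                       # Euclidean distance
    if r > alpha:
        return 0.0
    # Assumed form: cos(pi * r / alpha) + 1, which vanishes at r = alpha.
    return math.cos(math.pi * r / alpha) + 1.0
```

With the parameters used later in this section, `heat_source((0.1, 0.08, 0.08), 0.0, (0.1, 0.08, 0.08), (1, 1, 1), 0.04)` evaluates the source at its initial center.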

The SEM discretization and mesh adaptation are used to solve this problem. The approximation order is fixed at four in all coordinate directions for all elements. A time splitting technique decouples convection and diffusion [9], with a CG method solving the diffusion part. The mesh is adaptively refined and coarsened every 20 time steps. The time step, $\Delta t$, is set to satisfy the CFL condition. Elements that overlap with the heat source are refined isotropically to the minimal allowed size; all other elements are coarsened, if possible. A restriction is that neighboring elements (connected by faces or edges) cannot differ by more than one refinement level. The initial grid has a single element the size of the entire domain $[0,1]^3$. The heat source is initially located at $x_0 = (0.1, 0.08, 0.08)$, and moves with velocity $v = (1, 1, 1)$.
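The refinement criterion requires a test for whether a cubic element overlaps the spherical source. One standard way to do this, shown below as a sketch and not necessarily the test used in the actual code, is to clamp the sphere center to the cube and compare the resulting distance with the radius. The function names and the returned action strings are hypothetical, and the one-level restriction between neighboring elements is not modeled.

```python
def overlaps_sphere(lo, size, center, radius):
    """True if the axis-aligned cube [lo, lo + size]^3 meets the ball."""
    dist2 = 0.0
    for l, c in zip(lo, center):
        nearest = min(max(c, l), l + size)  # closest cube coordinate to c
        dist2 += (c - nearest) ** 2
    return dist2 <= radius * radius


def adaptation_action(lo, size, center, radius, min_size):
    """Decide what the adaptation step would request for one element."""
    if overlaps_sphere(lo, size, center, radius):
        # Elements touching the source are refined down to the minimal size.
        return "refine" if size > min_size else "keep"
    # All other elements are candidates for coarsening (subject to the
    # neighbor restriction, which this sketch ignores).
    return "coarsen"
```

For instance, with the source at (0.1, 0.08, 0.08) and α = 0.04, the element with lower corner (0.125, 0, 0) and edge length 1/8 overlaps the source even though the center lies outside it.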

We denote the minimal edge length by $\Delta x$. Since isotropic h-refinement is performed, all edges of an element are of the same length. We solve


the problem on three different meshes. The first is a uniform conforming mesh with $\Delta x = 1/8$, i.e., there are 512 equal-sized elements. The other two meshes are adaptive nonconforming with $\Delta x = 1/8$ and $1/16$. We use these mesh sizes to satisfy the heuristic that nonconforming meshes require one additional level of refinement over conforming meshes to attain the same accuracy. The problem is also solved for three different heat source sizes: $\alpha = 0.04$, $0.05$, and $0.06$. The value of $\varepsilon$ is fixed at 0.01 for all cases. We compare CPU times to show the efficiency of mesh adaptation.

In Table II, the solution (temperature) integral over the whole domain at the last time step is compared for the three different meshes. Observe that the results are quite close, especially for the conforming ($\Delta x = 1/8$) and nonconforming ($\Delta x = 1/16$) cases, confirming the heuristic. The conforming mesh uses 512 elements whereas the adaptive mesh contains an average of only 152, 164, and 186 elements for $\alpha = 0.04$, $0.05$, and $0.06$, respectively.

The benefit of adaptive meshes comes from the savings in computational work. To demonstrate this fact, we use the same approximation order and a fixed number of CG iterations for comparing the three different meshes. Results, presented in Table III, show that although adaptation is performed fairly frequently (every 20 time steps), the savings over the conforming case are significant. Even for the nonconforming mesh with $\Delta x = 1/16$, which requires twice as many time steps because of the CFL condition, the CPU time savings are almost 50% for $\alpha = 0.04$. The refinement region increases when the heat source is larger, thereby reducing the computational benefit. Even so, we notice a 30% savings when $\alpha = 0.06$.
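The quoted savings follow from the CPU times in Table III. The short calculation below reproduces them; the dictionary layout and the helper name are conveniences of this example.

```python
# CPU seconds from Table III, keyed by heat source radius alpha:
# (uniform conforming dx=1/8, adaptive dx=1/8, adaptive dx=1/16).
CPU_SECONDS = {
    0.04: (366.45, 38.28, 191.97),
    0.05: (352.62, 37.50, 208.35),
    0.06: (347.43, 44.94, 239.61),
}


def savings_percent(uniform, adaptive):
    """CPU time saved by the adaptive mesh relative to the uniform mesh."""
    return 100.0 * (1.0 - adaptive / uniform)


# Savings of the dx = 1/16 adaptive mesh for each source radius.
FINE_SAVINGS = {
    alpha: savings_percent(times[0], times[2])
    for alpha, times in CPU_SECONDS.items()
}
```

This gives about 47.6% for α = 0.04 and about 31.0% for α = 0.06, consistent with the "almost 50%" and "30%" figures in the text. For the adaptive mesh with Δx = 1/8 the same formula gives savings near 90% for α = 0.04.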

Adaptation induces various overheads, such as mesh adjustments, solution interpolation, and preconditioner recalculation. But the CPU time spent in these parts is very small, as shown in Table IV.

Table II. Solution Quality Versus Mesh Type for Problem in Eqs. (10) and (11)

Temperature integral at time = 0.75 s

Heat source    Uniform conforming    Adaptive nonconforming    Adaptive nonconforming
radius α       ∆x = 1/8              ∆x = 1/8                  ∆x = 1/16
0.04           0.7065 × 10^−4        0.7550 × 10^−4            0.6960 × 10^−4
0.05           0.1344 × 10^−3        0.1449 × 10^−3            0.1356 × 10^−3
0.06           0.2312 × 10^−3        0.2460 × 10^−3            0.2330 × 10^−3


Table III. CPU Time Versus Mesh Type for Problem in Eqs. (10) and (11)

CPU time (in seconds)

Heat source    Uniform conforming    Adaptive nonconforming    Adaptive nonconforming
radius α       ∆x = 1/8              ∆x = 1/8                  ∆x = 1/16
0.04           366.45                38.28                     191.97
0.05           352.62                37.50                     208.35
0.06           347.43                44.94                     239.61

Table IV. Percentage of CPU Time Spent on Mesh Adaptation

CPU time use percentage (%)

                                    ∆x = 1/8                  ∆x = 1/16
                                    α = 0.04   0.05   0.06    α = 0.04   0.05   0.06
Mesh adjustments + interpolation    2.82       2.94   3.54    1.36       1.24   1.15
Preconditioner calculation          0.24       0.40   0.60    0.63       0.62   0.58

6. PARALLEL IMPLEMENTATION

We developed two parallel shared-memory versions of the adaptive nonconforming MEM and applied them to the moving heat source problem described in Eqs. (10) and (11). We ran the OpenMP version on an SGI Origin2000 and the multithreaded version on a Cray MTA-2.

Elemental computations with loops over elements can be parallelized trivially and efficiently using OpenMP, with equal distribution of work among threads. The projection from mortars to elements can also be done in parallel. However, when looping over elements, the corresponding mortars are not ordered consecutively. Therefore, as in the sequential case, memory access of the mortars is irregular.

The projection from elements to mortars computes the residual on a mortar by summing contributions from all neighboring elements. Parallel implementation is trickier since simultaneous updates of the same mortar must be avoided to prevent race conditions. We do this by using the OpenMP “atomic” directive that prevents any memory location from being written to by more than one thread at a time. The “atomic” directive is


activated only when two threads try to update the same mortar simultaneously. Unfortunately, this condition cannot be totally avoided; however, the probability of it occurring can be minimized by assigning adjacent elements to the same thread. We use a space filling curve to order elements to increase this likelihood.
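The text does not state which space filling curve is used. As one common choice, and purely as an assumption of this example, the sketch below orders elements by a Morton (Z-order) key obtained by interleaving the bits of their integer grid coordinates, and then hands contiguous chunks of the ordered list to threads. The function names are hypothetical.

```python
def morton_key(ix, iy, iz, bits=10):
    """Interleave the low `bits` bits of three non-negative integers."""
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (3 * b)
        key |= ((iy >> b) & 1) << (3 * b + 1)
        key |= ((iz >> b) & 1) << (3 * b + 2)
    return key


def assign_to_threads(cells, n_threads):
    """Split cells, sorted along the Z-order curve, into contiguous chunks."""
    ordered = sorted(cells, key=lambda c: morton_key(*c))
    chunk = max(1, -(-len(ordered) // n_threads))  # ceiling division
    return [ordered[i:i + chunk] for i in range(0, len(ordered), chunk)]
```

On a uniform 4×4×4 grid split among eight threads, each chunk is a compact 2×2×2 block, so most shared mortars lie between elements owned by the same thread. On an adaptive mesh the blocks are less regular, which is consistent with the observation that simultaneous updates cannot be avoided entirely.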

Extensive details of parallel tests and performance results can be found in [8]. Here we briefly present some conclusions that are related to the mortar implementation. The fact that a mortar can be accessed from multiple elements and that it is impossible to order all those elements in strict consecutive order, even with the help of space filling curves, causes irregular memory access for the element-to-mortar projection. Furthermore, the moving heat source makes the memory access pattern dynamic. The irregularity and dynamic nature differentiate this type of problem from those with static features and nonadaptive structured grids.

The SGI Origin2000 is a scalable hardware-supported cache-coherent nonuniform memory access (CC-NUMA) system. If a thread needs to access data located in another node, the time for a remote memory access (RMA) depends on how far away the data is. Since memory allocation is a result of first touch, the RMA percentage for adaptive nonconforming meshes with MEM is higher than for problems with structured grids. Another consequence of high RMA and dynamic irregular memory access is the difficulty of predicting data needed in the next step, which hurts prefetch performance. The special hardware architecture of the Cray MTA-2 hides the memory access latency by employing multiple threads. This architecture is especially suitable for irregular and dynamic applications.

We compare the parallel performance of the Origin2000 (400 MHz) and MTA-2 (220 MHz) by solving the moving heat source problem ($\Delta t = 3.125 \times 10^{-4}$, $\Delta x = 1/128$, $\alpha = 0.076$, and adaptation performed every five steps for a total of 200 time steps) using 1, 2, 4, and 8 processors. Speedup results, shown in Table V, are all normalized to that of a single Origin2000 processor. Observe that the absolute performance of the MTA-2 exceeds that of the Origin2000, despite its significantly slower clock.

Table V. Speedup on Origin2000 and MTA-2 for Problem in Eqs. (10) and (11)

              Number of CPUs
System        1       2       4       8
Origin2000    1.00    1.52    3.32    7.79
MTA-2         1.38    2.75    5.33    9.44
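Because every entry of Table V is normalized to a single Origin2000 processor, the MTA-2 row mixes per-processor speed with scaling behavior. The calculation below separates the two by rescaling each row to its own single-processor value and dividing by the processor count. The resulting parallel efficiencies are derived here for illustration and are not figures reported by the measurements themselves.

```python
CPUS = [1, 2, 4, 8]

# Speedups from Table V, all relative to one Origin2000 processor.
SPEEDUP = {
    "Origin2000": [1.00, 1.52, 3.32, 7.79],
    "MTA-2": [1.38, 2.75, 5.33, 9.44],
}


def parallel_efficiency(values):
    """Speedup relative to the same machine on one CPU, per processor."""
    return [v / values[0] / p for v, p in zip(values, CPUS)]


EFFICIENCY = {name: parallel_efficiency(v) for name, v in SPEEDUP.items()}
```

On eight processors this gives an efficiency of about 0.97 for the Origin2000 and about 0.86 for the MTA-2. The MTA-2 is therefore faster in absolute terms at every processor count in the table, while the Origin2000 scales somewhat better from its own baseline at eight processors.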


7. CONCLUSIONS

We presented a 3D extension to the MEM for numerical problems that benefit from local, temporally varying mesh adaptation, combined with high-order spatial discretization methods. We demonstrated the overall efficiency of MEM compared to uniform mesh refinement, using a strongly dynamic heat transfer problem. We also showed the parallel efficiency of the method on two different shared-memory platforms, an SGI Origin2000 and a Cray MTA-2.

REFERENCES

1. Achdou, Y., Maday, Y., and Widlund, O. B. (1998). Iterative substructuring preconditioners for mortar element methods in two dimensions. SIAM J. Num. Anal. 36(2), 551–580.

2. Ben Belgacem, F., and Maday, Y. (1993). Non-conforming spectral element method for second order elliptic problems in 3D. East-West J. Num. Math. 1(4), 235–252.

3. Ben Belgacem, F., and Maday, Y. (1997). The mortar element method for three dimensional finite elements. Modelisation Mathematique et Analyse Numerique 31, 289–302.

4. Bernardi, C., Debit, N., and Maday, Y. (1990). Coupling finite element and spectral methods: first results. Math. Comp. 54(189), 21–39.

5. Bernardi, C., Maday, Y., and Patera, A. T. (1994). A new nonconforming approach to domain decomposition: the mortar element method. In Brezis, H., and Lions, J. L. (eds.), Non-linear Partial Differential Equations and their Applications, Vol. 11, Pitman/Wiley, London/New York, pp. 13–51.

6. Farhat, C., and Roux, F. (1991). A method of finite element tearing and interconnecting and its parallel solution algorithm. Int. J. Num. Meth. Eng. 32, 1205–1227.

7. Feng, H. (2003). A Spectral Element Method with hp Mesh Adaptation, D.Sc. Dissertation, George Washington University, Washington, DC.

8. Feng, H., Van Der Wijngaart, R. F., and Biswas, R. (2005). Unstructured adaptive meshes: bad for your memory? J. Appl. Num. Math. 52(2–3), 153–173.

9. Feng, H., Van Der Wijngaart, R. F., Biswas, R., and Mavriplis, C. A. (2004). Unstructured Adaptive (UA) NAS Parallel Benchmark, Version 1.0. Technical Report, NAS-04-006, NASA Ames Research Center, Moffett Field, CA.

10. Fischer, P. F., Kruse, G. W., and Loth, F. (2001). Spectral element methods for transitional flows in complex geometries. J. Sci. Comp. 17(1), 81–98.

11. Korczak, K. Z., and Patera, A. T. (1986). An isoparametric spectral element method for solution of the Navier-Stokes equations in complex geometry. J. Comp. Phys. 62(2), 361–382.

12. Maday, Y., Mavriplis, C. A., and Patera, A. T. (1989). Nonconforming mortar element method: application to spectral discretizations. In Chan, T., Periaux, J., and Widlund, O. B. (eds.), Domain Decomposition Methods, SIAM: Philadelphia, PA, pp. 392–418.

13. Mavriplis, C. A. (1989). Non conforming discretizations and A Posteriori error estimations for adaptive spectral element techniques, Ph.D. Thesis, Massachusetts Institute of Technology, Cambridge, MA.

14. Rønquist, E. M. (1996). Convection treatment using spectral elements of different order. Int. J. Num. Meth. Fluids. 22(4), 241–264.
