29
A Comparative Study of Conforming and Nonconforming High-Resolution Finite Element Schemes Matthias Möller Institute of Applied Mathematics (LS3) TU Dortmund, Germany European Seminar on Computing Pilsen, Czech Republic June 25, 2012 Montag, 25. Juni 12

A Comparative Study of Conforming and Nonconforming High ... · Rotated bilinear ~Q 1 FE nodal values at cell midpoints integral mean-values at edges Finite Elements Montag, 25. Juni

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: A Comparative Study of Conforming and Nonconforming High ... · Rotated bilinear ~Q 1 FE nodal values at cell midpoints integral mean-values at edges Finite Elements Montag, 25. Juni

A Comparative Study of Conformingand Nonconforming High-Resolution

Finite Element SchemesMatthias Möller

Institute of Applied Mathematics (LS3)TU Dortmund, Germany

European Seminar on ComputingPilsen, Czech Republic

June 25, 2012

Montag, 25. Juni 12

Page 2: A Comparative Study of Conforming and Nonconforming High ... · Rotated bilinear ~Q 1 FE nodal values at cell midpoints integral mean-values at edges Finite Elements Montag, 25. Juni

Family of AFC schemes Kuzmin et al.

MLu = [K+D]u+ F(u,u)

artificial diffusion operator

antidiffusive correctionlumped mass matrix

Montag, 25. Juni 12

Page 3: A Comparative Study of Conforming and Nonconforming High ... · Rotated bilinear ~Q 1 FE nodal values at cell midpoints integral mean-values at edges Finite Elements Montag, 25. Juni

linearized flux-correction

nonlinear flux-correction

Family of AFC schemes Kuzmin et al.

NL-GP NL-FCT NL-TVD Low-order

MLu = MLuL + F (uL, uL)Lin-FCT

F (u) = �DuF (u, u) = [ML �MC ]u�Du F ⌘ 0

MLu = [K+D]u+ F(u,u)

Montag, 25. Juni 12

Page 4: A Comparative Study of Conforming and Nonconforming High ... · Rotated bilinear ~Q 1 FE nodal values at cell midpoints integral mean-values at edges Finite Elements Montag, 25. Juni

linearized flux-correction

nonlinear flux-correction

Family of AFC schemes Kuzmin et al.

AFC-LPT. . .

NL-GP NL-FCT NL-TVD Low-order

MLu = MLuL + F (uL, uL)Lin-FCT

F (u) = �DuF (u, u) = [ML �MC ]u�Du F ⌘ 0

MLu = [K+D]u+ F(u,u)

Montag, 25. Juni 12

Page 5: A Comparative Study of Conforming and Nonconforming High ... · Rotated bilinear ~Q 1 FE nodal values at cell midpoints integral mean-values at edges Finite Elements Montag, 25. Juni

Review of design principles

■ Jameson‘s Local Extremum Diminishing criterion

IF

THEN local solution maxima/minima do not increase/decrease

positive not negative

miui =X

j 6=i

�ij(uj � ui)

Montag, 25. Juni 12

Page 6: A Comparative Study of Conforming and Nonconforming High ... · Rotated bilinear ~Q 1 FE nodal values at cell midpoints integral mean-values at edges Finite Elements Montag, 25. Juni

Review of design principles

■ Jameson‘s Local Extremum Diminishing criterion

IF

THEN local solution maxima/minima do not increase/decrease

■ Semi-discrete high-resolution scheme

positive not negative

miui =X

j 6=i

�ij(uj � ui)

miui =X

j 6=i

(kij + dij)(uj � ui) + �iui +X

j 6=i

↵ijfij

mi =X

jmij

not negative by construction of

diffusion coefficient

controlled byflux limiter

Montag, 25. Juni 12

Page 7: A Comparative Study of Conforming and Nonconforming High ... · Rotated bilinear ~Q 1 FE nodal values at cell midpoints integral mean-values at edges Finite Elements Montag, 25. Juni

Review of design principles

■ Jameson‘s Local Extremum Diminishing criterion

IF

THEN local solution maxima/minima do not increase/decrease

■ Semi-discrete high-resolution scheme

■ Edge-based assembly of operators/vectors is feasible and efficient!

positive not negative

miui =X

j 6=i

�ij(uj � ui)

miui =X

j 6=i

(kij + dij)(uj � ui) + �iui +X

j 6=i

↵ijfij

mi =X

jmij

not negative by construction of

diffusion coefficient

controlled byflux limiter

Montag, 25. Juni 12

Page 8: A Comparative Study of Conforming and Nonconforming High ... · Rotated bilinear ~Q 1 FE nodal values at cell midpoints integral mean-values at edges Finite Elements Montag, 25. Juni

Bilinear Q1 FE

■ nodal values at cell vertices

Rotated bilinear ~Q1 FE

■ nodal values at cell midpoints

■ integral mean-values at edges

Finite Elements

Montag, 25. Juni 12

Page 9: A Comparative Study of Conforming and Nonconforming High ... · Rotated bilinear ~Q 1 FE nodal values at cell midpoints integral mean-values at edges Finite Elements Montag, 25. Juni

■ Uniform 128x128 grid of unit square with 5% stochastic disturbance

Finite Element properties

△(A) MC ML NEQ NA

Q1 8 ✓ ✓ 16,641 82,680

~Q1par 6 ✓ ✓ 33,024 140,483

~Q1np 6 ✓ ✓ 33,024 140,478

~Q1par 6 -7.2E-07 (✓) 33,024 138,204

~Q1np 6 -7.2E-07 (✓) 33,024 138,576

meanvalue

pointvalue

Montag, 25. Juni 12

Page 10: A Comparative Study of Conforming and Nonconforming High ... · Rotated bilinear ~Q 1 FE nodal values at cell midpoints integral mean-values at edges Finite Elements Montag, 25. Juni

■ Uniform 128x128 grid of unit square with 5% stochastic disturbance

Finite Element properties

△(A) MC ML NEQ NA

Q1 8 ✓ ✓ 16,641 82,680

~Q1par 6 ✓ ✓ 33,024 140,483

~Q1np 6 ✓ ✓ 33,024 140,478

~Q1par 6 -7.2E-07 (✓) 33,024 138,204

~Q1np 6 -7.2E-07 (✓) 33,024 138,576

meanvalue

pointvalue

Montag, 25. Juni 12

Page 11: A Comparative Study of Conforming and Nonconforming High ... · Rotated bilinear ~Q 1 FE nodal values at cell midpoints integral mean-values at edges Finite Elements Montag, 25. Juni

Finite Element properties

Montag, 25. Juni 12

Page 12: A Comparative Study of Conforming and Nonconforming High ... · Rotated bilinear ~Q 1 FE nodal values at cell midpoints integral mean-values at edges Finite Elements Montag, 25. Juni

Finite Element properties

0 10 20 30 40 50

0

5

10

15

20

25

30

35

40

45

50

nz = 369

Q1 FE

Montag, 25. Juni 12

Page 13: A Comparative Study of Conforming and Nonconforming High ... · Rotated bilinear ~Q 1 FE nodal values at cell midpoints integral mean-values at edges Finite Elements Montag, 25. Juni

Finite Element properties

0 10 20 30 40 50

0

5

10

15

20

25

30

35

40

45

50

nz = 369

Q1 FE

0 10 20 30 40 50 60 70 80

0

10

20

30

40

50

60

70

80

nz = 530

~Q1 FE

Montag, 25. Juni 12

Page 14: A Comparative Study of Conforming and Nonconforming High ... · Rotated bilinear ~Q 1 FE nodal values at cell midpoints integral mean-values at edges Finite Elements Montag, 25. Juni

Finite Element properties

0 10 20 30 40 50

0

5

10

15

20

25

30

35

40

45

50

nz = 369

Q1 FE

0 10 20 30 40 50 60 70 80

0

10

20

30

40

50

60

70

80

nz = 530

~Q1 FE

3456789

101112

0 10 20 30 40 50 60 70 80

Number of non-zero entries per matrix row

Number of matrix row

Q1~Q1

Storage format:■ CRS, CCS

Storage format:■ ELLPACK

Montag, 25. Juni 12

Page 15: A Comparative Study of Conforming and Nonconforming High ... · Rotated bilinear ~Q 1 FE nodal values at cell midpoints integral mean-values at edges Finite Elements Montag, 25. Juni

Solid body rotation

Pure convection problem

u+r · (vu) = 0 in ⌦ = (0, 1)2

u = 0 on �

inflow

■ Velocity field

■ Grid size

■ Stochastic grid disturbance

■ Time step in Crank-Nicolson

■ Initial = exact solution at

v(x, y) = (0.5� y, x� 0.5)

�t = 1.28 · h

t = 2⇡k, k 2 N

~Q1 FE NL-FCTsim. time t=2π

� 2 {0%, 1%, 5%}

h = 1/2l, l = 5, 6, . . .

Montag, 25. Juni 12

Page 16: A Comparative Study of Conforming and Nonconforming High ... · Rotated bilinear ~Q 1 FE nodal values at cell midpoints integral mean-values at edges Finite Elements Montag, 25. Juni

SBR: L2-error (0% mesh disturbance)

0,01

0,1

1

5 6 7 8

Q1 FE

Refinement level

Low-order Lin-FCT NL-FCT NL-GP NL-TVD

0,01

0,1

1

5 6 7 8

~Q1 FE

Refinement level

Montag, 25. Juni 12

Page 17: A Comparative Study of Conforming and Nonconforming High ... · Rotated bilinear ~Q 1 FE nodal values at cell midpoints integral mean-values at edges Finite Elements Montag, 25. Juni

SBR: L2-error (5% mesh disturbance)

0,01

0,1

1

5 6 7 8

Q1 FE

Refinement level

Low-order Lin-FCT NL-FCT NL-GP NL-TVD

0,01

0,1

1

5 6 7 8

~Q1 FE

Refinement level

Montag, 25. Juni 12

Page 18: A Comparative Study of Conforming and Nonconforming High ... · Rotated bilinear ~Q 1 FE nodal values at cell midpoints integral mean-values at edges Finite Elements Montag, 25. Juni

Rotation of a Gaussian hill

■ Convection-diffusion equation

u+r · (vu� ✏ru) = 0

in ⌦ = (�1, 1)2

■ Velocity field, diffusion coefficient

■ Analytical solution

■ Position of the peak

■ Other parameters as before

v(x, y) = (�y, x), ✏ = 0.001

x(t) = �0.5 sin t

y(t) = 0.5 cos(t)

~Q1 FE NL-FCTtime t=5/2π

u(x, y, t) =1

4⇡✏te

� r2

4✏t

r

2 = (x� x)2 + (y � y)2

Montag, 25. Juni 12

Page 19: A Comparative Study of Conforming and Nonconforming High ... · Rotated bilinear ~Q 1 FE nodal values at cell midpoints integral mean-values at edges Finite Elements Montag, 25. Juni

RGH: L2-error (0% mesh disturbance)

0,001

0,01

0,1

1

10

5 6 7 8 9

Q1 FE

Refinement level

Low-order Lin-FCT NL-FCT NL-GP NL-TVD

0,001

0,01

0,1

1

10

5 6 7 8 9

~Q1 FE

Refinement level

Montag, 25. Juni 12

Page 20: A Comparative Study of Conforming and Nonconforming High ... · Rotated bilinear ~Q 1 FE nodal values at cell midpoints integral mean-values at edges Finite Elements Montag, 25. Juni

RGH: dispersion-error (0% mesh disturbance)

-0,5

0,5

1,5

2,5

3,5

4,5

5,5

5 6 7 8 9

Q1 FE

Refinement level

Low-order Lin-FCT NL-FCT NL-GP NL-TVD

-0,5

0,5

1,5

2,5

3,5

4,5

5,5

5 6 7 8 9

~Q1 FE

Refinement level

Montag, 25. Juni 12

Page 21: A Comparative Study of Conforming and Nonconforming High ... · Rotated bilinear ~Q 1 FE nodal values at cell midpoints integral mean-values at edges Finite Elements Montag, 25. Juni

RGH: L2-error (5% mesh disturbance)

0,001

0,01

0,1

1

10

5 6 7 8 9

Q1 FE

Refinement level

Low-order Lin-FCT NL-FCT NL-GP NL-TVD

0,001

0,01

0,1

1

10

5 6 7 8 9

~Q1 FE

Refinement level

Montag, 25. Juni 12

Page 22: A Comparative Study of Conforming and Nonconforming High ... · Rotated bilinear ~Q 1 FE nodal values at cell midpoints integral mean-values at edges Finite Elements Montag, 25. Juni

RGH: dispersion-error (5% mesh disturbance)

-0,5

0,5

1,5

2,5

3,5

4,5

5,5

5 6 7 8 9

Q1 FE

Refinement level

Low-order Lin-FCT NL-FCT NL-GP NL-TVD

-0,5

0,5

1,5

2,5

3,5

4,5

5,5

5 6 7 8 9

~Q1 FE

Refinement level

Montag, 25. Juni 12

Page 23: A Comparative Study of Conforming and Nonconforming High ... · Rotated bilinear ~Q 1 FE nodal values at cell midpoints integral mean-values at edges Finite Elements Montag, 25. Juni

3:3conforming

non

conforming

good accuracy good

small numerical diffusion small

smaller #DOFs, #edges larger

irregular sparsity pattern regular

Taxonomy of finite elements

Montag, 25. Juni 12

Page 24: A Comparative Study of Conforming and Nonconforming High ... · Rotated bilinear ~Q 1 FE nodal values at cell midpoints integral mean-values at edges Finite Elements Montag, 25. Juni

Can nonconforming finite elements help to improve performance on many-core hardware ?

NVIDIA Kepler GK110 Die Shot (taken from: www.gpgpu.org)Montag, 25. Juni 12

Page 25: A Comparative Study of Conforming and Nonconforming High ... · Rotated bilinear ~Q 1 FE nodal values at cell midpoints integral mean-values at edges Finite Elements Montag, 25. Juni

Example: edge-based flux-assembly with Q1 FE

0

10

20

30

40

50

0K 1K 10K 100K 1.000K

Band

wid

th (

GB/

s)

#edges per color group

1 OMP 2 OMP 4 OMP8 OMP 12 OMP Xeon X5680(2x6) CUDA C2070

best GPU performancefor large problem sizes

best CPU performanceif data fits into cache

Montag, 25. Juni 12

Page 26: A Comparative Study of Conforming and Nonconforming High ... · Rotated bilinear ~Q 1 FE nodal values at cell midpoints integral mean-values at edges Finite Elements Montag, 25. Juni

Example: edge-based flux assembly with Q1 FE

0

25

50

75

100

125

150

CUDA OMP 6

Acc

umul

ated

tim

e (m

s)

w/ transf.„55 ms“

132 ms

23 ms

5,7x

0

10

20

30

40

50

0K

125K

250K

375K

500K

Band

wid

th (

GB/

s)

Color groups 1-14

OMP 6 on Core i7 EPCUDA on C2070#edges per color group

6-7 GB/s

Fi =N

colorsX

k=1

X

ij2CGk

�ij(Uj � Ui)

Montag, 25. Juni 12

Page 27: A Comparative Study of Conforming and Nonconforming High ... · Rotated bilinear ~Q 1 FE nodal values at cell midpoints integral mean-values at edges Finite Elements Montag, 25. Juni

0

25

50

75

100

125

150

175

200

CUDA OMP 6

Acc

umul

ated

tim

e (m

s)

w/ transf.„95 ms“

191 ms

32 ms

5,9x

Example: edge-based flux assembly with ~Q1 FE

0

10

20

30

40

50

60

0K

375K

750K

1.125K

1.500K

Band

wid

th (

GB/

s)

Color groups 1-9

OMP 6 on Core i7 EPCUDA on C2070#edges per color group

6-7 GB/s

Montag, 25. Juni 12

Page 28: A Comparative Study of Conforming and Nonconforming High ... · Rotated bilinear ~Q 1 FE nodal values at cell midpoints integral mean-values at edges Finite Elements Montag, 25. Juni

Summary

Nonconforming ~Q1 finite elements

■ can be used within the algebraic flux correction framework

■ are comparable to conforming FEs (accuracy/numerical diffusion)

■ increase number of DOFs as compared to conforming Q1 FEs

■ lead to system matrices with regular structure favorable for HPC

Future plans

■ Apply nonconforming AFC schemes to systems of equations

■ Exploit benefits of nonconforming FEs for many-core architectures: speed up parallel (edge-based) assembly loops, implement more efficient matrix structures (ELLPACK), reduce communication costs in parallelized code

Montag, 25. Juni 12

Page 29: A Comparative Study of Conforming and Nonconforming High ... · Rotated bilinear ~Q 1 FE nodal values at cell midpoints integral mean-values at edges Finite Elements Montag, 25. Juni

Acknowledgements

AFC schemes

■ D. Kuzmin (University of Erlangen-Nuremberg)

CUDA/GPU programming

■ D. Göddeke, D. Ribbrock, M. Geveler (TU Dortmund)

Featflow2

■ M. Köster, P. Zajac (TU Dortmund)

Source code freely available at:

http://www.featflow.de/en/software/featflow2.html

Montag, 25. Juni 12