Upload
others
View
8
Download
0
Embed Size (px)
Citation preview
Introduction to Simulation - Lecture 21
Thanks to Deepak Ramaswamy, Michal Rewienski, and Karen Veroy
Boundary Value Problems - Solving 3-D Finite-Difference problems
Jacob White
Outline
• Reminder about FEM and F-D– 1-D Example
• Finite Difference Matrices in 1, 2 and 3D– Gaussian Elimination Costs
• Krylov Method– Communication Lower bound
– Preconditioners based on improving communication
1-D ExampleNormalized 1-D Equation
Normalized Poisson Equation
2
2
( ) ( ) ( )sT x u xh f x
x x xκ∂ ∂ ∂
= − ⇒ − =∂ ∂ ∂
Heat Flow
( ) ( )xxu x f x− =
SMA-HPC ©2002 MIT
SMA-HPC ©2002 MIT
SMA-HPC ©2002 MIT
FD Matrix properties Finite Differences
1-D Poisson
0x 1x 2x nx 1nx +
12
ˆ ˆ ˆ2( )j j jj
u u uf x
x+ − +
− =∆
xxu f− =
1 1
2
ˆ ( )2 1 0 01 2 1
1 0 01
0 0 1 2ˆ ( )n n
u f x
x
u f x
− − − = ∆ − −
M ML
M MO M
M MO O O
M MM O O O
M ML
A
SMA-HPC ©2002 MIT
Residual EquationUsing Basis Functions
2
2
u fx∂
− =∂
Partial Differential Equation form
(0) 0 (1) 0u u= =
Basis Function Representation
Plug Basis Function Representation into the Equation
( ) ( ) ( )2
21
ni
ii
d xR x f x
dxϕ
ω=
= +∑
( ) ( ) ( ){1 Basis Functions
n
h i ii
u x u x xω ϕ=
≈ =∑
SMA-HPC ©2002 MIT
Basis WeightsUsing Basis functions
Galerkin Scheme
Generates n equations in n unknowns
( ) ( ) ( )21
210
0n
il i
i
d xx f x dx
dxϕ
ϕ ω=
+ =
∑∫
Force the residual to be “orthogonal” to the basis functions
{ }1,...,l n∈
( ) ( )1
0
0l x R x dtϕ =∫
SMA-HPC ©2002 MIT
Basis WeightsUsing Basis Functions Galerkin with integration by
parts
Only first derivatives of basis functions
( ) ( )( ) ( )1
1 1
0 0
0
n
i iil
i
xdd xdx x f x dx
dx dx
ωϕϕϕ= − =
∑∫ ∫
{ }1,...,l n∈
Structural Analysis of Automobiles
• Equations– Force-displacement relationships for
mechanical elements (plates, beams, shells) and sum of forces = 0.
– Partial Differential Equations of Continuum Mechanics
Drag Force Analysis of Aircraft
• Equations– Navier-Stokes Partial Differential Equations.
Engine Thermal Analysis
• Equations– The Poisson Partial Differential Equation.
SMA-HPC ©2002 MIT
FD Matrix properties Discretized Poisson
2-D Discretized Problem
1x 2x mx
1mx + 2mx
m
m
1 12 2
2 2( )j j j j m j j m
j
u ux
u u u u
x
u uf x
x y
yy
+ − + −− + − ++ =
∆ ∆1442443 1442443
SMA-HPC ©2002 MIT
FD Matrix properties Matrix Nonzeros, 5x5 example
2-D Discretized Problem
SMA-HPC ©2002 MIT
FD Matrix properties
3-D Discretization
mm
jx
j mx −2j m
x−
1jx +
1jx −
j mx +
2j mx
+
2 21 12 2 2
ˆ ˆ ˆ2ˆ ˆ ˆ ˆ ˆ ˆ2 2( )
( ) ( ) ( )jj j j j m j j m j m j m
j
xx yy zz
u u uu u u u u uf x
x y zu u u
+ − + − + −− +− + − +
+ + =∆ ∆ ∆
1442443 1442443 144424443
m
Discretized Poisson
SMA-HPC ©2002 MIT
FD Matrix properties Matrix nonzeros, m = 4 example
3-D Discretization
SMA-HPC ©2002 MIT
FD Matrix properties
Summary
2 2 2
1 1 11 2, 2 4, 3 6,D D D− → − → − →∆ ∆ ∆
Matrix is symmetric positive definite
Assuming uniform discretization, diagonal is
| | | |ii ijj i
A A≠
≥
∑
Numerical Properties
Matrix is Irreducibly Diagonally Dominant
Each row is strictly diagonally dominant, or path connected to a strictly diagonally dominant row
SMA-HPC ©2002 MIT
FD Matrix properties
Summary
2 2 3 31 , 2 , 3D m m D m m D m m− → × − → × − → ×
Nonzeros per row 1 3, 2 5, 3 7D D D− − −
2
1 0 | | 1
2 0 | |
3 0 | |
ij
ij
ij
D A i j
D A i j m
D A i j m
− = − >
− = − >
− = − >
Structural Properties
Matrices in 3-D are LARGE
100x100x100 grid in 3-D = 1 million x 1 million matrixMatrices are very sparse
Matrices are banded
SMA-HPC ©2002 MIT
11 12 13 14
21 22 23 24
31 32 33 34
41 42 43 44
A A A AA A A AA A A AA A A A
44A43A43A 44A44A
34A33A32A
Basics of GEPicture
Triangularizing
22A 23A 24A
42A33A 34A0
0
000 0
SMA-HPC ©2002 MIT
Pivot
GE BasicsAlgorithm
Triangularizing
Multiplier
Form n-1 reciprocals (pivots)
Form 21
1
( )2
n
i
nn i−
=
− =∑ multipliers
For i = 1 to n-1 { “For each Row”For j = i+1 to n { “For each Row below pivot”
For k = i+1 to n { “For each element beyond Pivot”
}}
}
jijk jk ik
ii
AA A A
A⇐ −
Perform1
2 3
1
2( )3
n
in i n
−
=
− ≈∑Multiply-adds
SMA-HPC ©2002 MIT
Complexity of GE
3 3
3 6
3
6
12
19 8
1 D ( ) ( ) 100 pt grid 2 D ( ) ( ) 100 10
(10 ) ops(0 grid
3 D ( ) ( ) 100 100 110 ) ops
(10 )00 g sr opid
O n O mO n O mO n
OO
O m O
− = ←
− = ← ×
− = ← × ×
For 2- D and 3-D problems Need a Faster Solver !
SMA-HPC ©2002 MIT
For i = 1 to n-1 {For j = i+1 to n {
For k = i+1 to n {
}}
}
Banded GEAlgorithm
Triangularizing
NONZEROS
0
0b
jijk jk ik
ii
AA A A
A⇐ −
b i+b-1 {i+b-1 {
Perform( )
12 2
1(min( 1, ))
n
ib n i O b n
−
=
− − ≈∑Multiply-adds
SMA-HPC ©2002 MIT
Complexity of Banded GE
2
2 4
2 7
8
14
1 D ( ) ( ) 100 pt grid 2 D ( ) ( ) 100 100 grid 3 D
(100) ops(10 ) ops
(10 ) ops( ) ( ) 100 100 100 grid
O b n OO
O mO b n
OO m
O b n O m
− = ←
− = ← ×
− = ← × ×
For 3 - D problems Need a Faster Solver !
SMA-HPC ©2002 MIT
The World According to Krylov
Determine the Krylov Subspace 0r Pb PAx= −
Select Solution from the Krylov Subspace( ){ }1 0 0 0 0, , ,..., kk k kx x y y span r PAr PA r+ = + ∈kGCR picks a residual minimiz ng yi .
Start with , Form Ax b PAx Pb= =Preconditioning
( ){ }0 0 0Krylov Subspace , ,..., kspan r PAr PA r≡
SMA-HPC ©2002 MIT
Krylov MethodsDiagonal Preconditioners
Preconditioning
Let A ndD A= +
( ) ( )1 1 1Apply GCR to ndD A x I D A x D b− − −= + =
• The Inverse of a diagonal is cheap to compute• Usually improves convergence
SMA-HPC ©2002 MIT
Krylov MethodsConvergence Analysis
Optimality of GCR poly
GCR Optimality Property
( )k+1 polynomial such that 0 =1℘%
ThereforeAny polynomial which satisfies the
constraints can be used to get an upper bound on
1
0
kr
r
+
1 0k+1 k+1( ) r where is any orderk thr kPA+ ≤ ℘ ℘% %
SMA-HPC ©2002 MIT
( )Keep as small as possible:k iλ℘Easier if eigenvalues are clustered!
Residual Poly Picture for Heat Conducting Bar MatrixNo loss to air (n=10)
SMA-HPC ©2002 MIT
10
0
M
The World According to Krylov
Krylov Vectors for diagonal Preconditioned A
1
1 0.5 0 00.5 1 00 0.50 0 0.5 1
D A−
− − − −
O
O O
1444442444443
1-D DiscretizedPDE0b r= 1 0 0 0 0 0 0
1 -.5 0 0 0 0 0
1.25 -1 -.5 0 0 0 0
1D A−
1D A−
10.5
0
−
M=
(1 digit)exactx 0.9 -0.7 0.6 -0.5 0.4 -0.3 0.1
10.5
0
−
M
1.251
0.50
− −
SMA-HPC ©2002 MIT
The World According to Krylov
0b r= 1 0 0 0 0 0 0
1 -.5 0 0 0 0 0
1.25 -1 -.5 0 0 0 0
1D A−
1D A−
(1 digit)exactx 0.9 -0.7 0.6 -0.5 0.4 -0.3 0.1
Krylov Vectors for Diagonal Preconditioned A
Communication Lower Bound for m gridpoints( )1 0 is nonzero in entry after itersthk
m mD A r k− =
Communication Lower Bound
( )1 0
1Need at least iterations for + 0
kk j
jmj m
x x rm α+
=
= ≠
∑
SMA-HPC ©2002 MIT
The World According to Krylov
Krylov Vectors for Diagonal Preconditioned A
Two Dimensional Case
For an mxm Grid
m
m
0
10
If
0
r
=
M
( )Takes iters2m O m≈ =
( )12or 0f k
mx + ≠
m2
1
SMA-HPC ©2002 MIT
The World According to Krylov
Convergence for GCR
Eigenanalysis
1Recall Eigenvalues of 1 cos1
kD Amπ− = − +
10
0
M
1
1 0.5 0 00.5 1 00 0.50 0 0.5 1
D A−
− − − −
O
O O
1444442444443
10.5
0
−
M=
10.5
0
−
M
1.251
0.50
− −
SMA-HPC ©2002 MIT
The World According to Krylov
Convergence for GCR
Eigenanalysis
# Iters for GCR to achieve convergence
log2
1log1
( )m
k O mγ
κκ
→ ∞= ≅
−
+
1 max
min
1For , 1 cos 1 cos
1 1mD Am m
λ π πλ
κ−−
≡ = − − + +
0
121
kkr convergencetolerancer
κκ
γ −≤ ≤ +
GCR achieves Communication lower bound O(m)!
SMA-HPC ©2002 MIT
The World According to Krylov
Work for Banded Gaussian Elimination, Relaxation and GCR
( ) ( ) ( )( ) ( ) ( )( ) ( ) ( )
2
4 3 3
7 6 4
Dimension Banded GE Sparse GE GCR
1
2
3
O m O m O m
O m O m O m
O m O m O m
GCR faster than banded GE in 2 and 3 dimensions Could be faster, 3-D matrix only m3 nonzeros.Must defeat the communication lower bound!
SMA-HPC ©2002 MIT
The World According to Krylov
How to get Faster Converging GCR
Preconditioner must accelerate communication
Preconditioning is the Only Hope
Multiplying by PA must move values by more than one grid point.
GCR already achieves Communication Lowerbound for a diagonally preconditioned A
SMA-HPC ©2002 MIT
1-D Discretized PDE0u (old)
2u(new)1u
(new)1ˆnu −
(new)ˆnu1ˆnu +
Gauss Seidel(new)1u
(new)2u (old)
3u
Each Iteration of Gauss-Seidel Relaxationmoves data across the grid
Preconditioning Approaches
Gauss-Seidel Preconditioning
Physical View
SMA-HPC ©2002 MIT
1-D DiscretizedPDE0b r= 1 0 0 0 0 0 0
x x x x x x x
x x x x x x x
( ) 1D L A−+
( ) 1D L A−+
(1 digit)exactx 0.9 -0.7 0.6 -0.5 0.4 -0.3 0.1
0b r= 0 0 0 0 0 0 1
0 0 0 0 0 x x
0 0 0 0 x x x
( ) 1D L A−+
( ) 1D L A−+
X=nonzero
Gauss-Seidel communicates quickly in only one direction
Preconditioning Approaches
Gauss-Seidel Preconditioning
Krylov Vectors
SMA-HPC ©2002 MIT
0u (old)2u
(new)1u
(new)1ˆnu −
(new)ˆnu1ˆnu +
(new)1u
(new)2u (old)
3u
(new)2ˆnu −
(newer)1ˆnu − ( )ˆ new
nu
0u (newer)2u
(newer)1u
Preconditioning Approaches
Gauss-Seidel Preconditioning
Symmetric Gauss-Seidel
This symmetric Gauss-Seidel PreconditionerCommunicates both directions
SMA-HPC ©2002 MIT
Preconditioning Approaches
Gauss-Seidel Preconditioning
Symmetric Gauss-Seidel
( ) ( ) 1/ 2Forward Sweep half step : k kD L x Ux b++ + =
Derivation of the SGS Iteration Equations
( ) ( ) 1/ 2Backward Sweep half step : k kD U Lx x b+++ =
( ) ( ) 11 1 k kD L UUx xL D− −+ +⇒ +=( ) ( ) ( )1 1 1D U D U Lb D L b−− −+ − ++ +
( ) ( )1 11 k k kx x D U D D L Ax− −+ = − + +⇒
( ) ( )1 1D U D D L b− −+ + +
SMA-HPC ©2002 MIT
Preconditioning Approaches
Block Diagonal Preconditioners
Line SchemesGrid
Matrix
m2
1
Tridiagonal Matrices factor quickly
SMA-HPC ©2002 MIT
Preconditioning Approaches
Block Diagonal Preconditioners
Line Schemes
m2
1
The Preconditioner is now two Tridiagonal solves, with variable reordering in between.
Grid
Do lines first in x, then in y.
Solution
ProblemLines preconditioners
communicate rapidly in only one direction
SMA-HPC ©2002 MIT
Preconditioning Approaches
Block Diagonal Preconditioners
Domain Decomposition
Fewer blocks means faster convergence, but
more costly iterates
The trade-off
ApproachBreak the domain into
small blocks each with the same # of grid points
m2
1
SMA-HPC ©2002 MIT
Preconditioning Approaches
Block Diagonal Preconditioners
Domain Decompositionm points
mpointsm
l
1 2
2l
l
( )2
2B factoring grids, , sparse GElock cost: m l l O m ll
×
Communication bound gives iterations.GCR iters: mOl
( )3Suggests insensitivity t Algorithm is : .o l O m
Do you have to refactor every GCR iteration?
Block index
SMA-HPC ©2002 MIT
Preconditioning Approaches
Siedelerized Block Diagonal Preconditioners
Line SchemesGrid
Matrix
m2
1
SMA-HPC ©2002 MIT
GridMatrix
m2
1
Bigger systems to solve, but can have faster convergence on tougher problems (not just Poisson).
Preconditioning Approaches
Overlapping Domain Preconditioners
Line based Schemes
i
Reminder about Gaussian Elimination Computational Steps
Fill-in for Sparse MatricesGreatly increases factorization cost
Fill-in in a 2-D gridIncomplete Factorization Idea
Preconditioning Approaches
Incomplete Factorization Schemes
Outline
SMA-HPC ©2002 MIT
Sparse MatricesFill-In
Example
X X XX X 0X 0 X
X X XX X 0X 0 X
XX
X X
X= Non zero
Matrix Non zero structure Matrix after one GE step
X X
Fill-ins
SMA-HPC ©2002 MIT
Sparse MatricesFill-In
Second Example
X X X XX X 0 00 X X 0
X 0 00
Fill-ins Propagate
XX
X
X
X
X X
X
X X
Fill-ins from Step 1 result in Fill-ins in step 2
SMA-HPC ©2002 MIT
Sparse Matrices Fill-In
Pattern of a Filled-in Matrix
Very Sparse
Very Sparse
Dense
SMA-HPC ©2002 MIT
Sparse Matrices Fill-InUnfactored Random Matrix
SMA-HPC ©2002 MIT
Sparse Matrices Fill-InFactored Random Matrix
SMA-HPC ©2002 MIT
Factoring 2-D Finite-Difference matrices
Generated Fill-in Makes Factorization Expensive
SMA-HPC ©2002 MIT
FD Matrix properties Matrix nonzeros, m = 4 example
3-D Discretization
iPreconditioning Approaches
Incomplete Factorization Schemes
Key idea
THROW AWAY FILL-INS!Throw away all fill-ins
Throw away fill-ins produced by other fill-insThrow away only fill-ins with small values
Throw away fill-ins produced by fill-ins of other fill-ins, etc.
Summary
• 3-D BVP Examples– Aerodynamics, Continuum Mechanics, Heat-Flow
• Finite Difference Matrices in 1, 2 and 3D– Gaussian Elimination Costs
• Krylov Method– Communication Lower bound
– Preconditioners based on improving communication