Solving Large-scale Eigenvalue Problems in SciDAC Applications

Computational Research Division


Chao Yang, Lawrence Berkeley National Laboratory

June 27, 2005


People Involved

LBNL: W. Gao, P. Husbands, X. S. Li, E. Ng, C. Yang (TOPS); J. Meza, L. W. Wang, C. Yang (Nano-science)

SLAC: L. Lee, K. Ko

Stanford: G. Golub

UC-Davis: Z. Bai


SciDAC Applications

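Accelerator Modeling: vector wave equation for the electric field $E$ (with boundary conditions involving the surface normal $n$), discretized as $Kx = \lambda Mx$

Nano-science: $H\psi_i = E_i\psi_i$, discretized as $H(X)X = X\Lambda$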


Algorithms

Krylov Subspace Method

Alternatives

• Optimization based approach

• Non-linear solver based approach

Multi-level Sub-structuring

Non-linear Eigenvalue Problems

• Structure preserving methods

• Optimization based method


Krylov Subspace Method

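$Ax = \lambda x$

$\mathcal{K}(A, v_0; k) = \mathrm{span}\{v_0, Av_0, \ldots, A^{k-1}v_0\}$

$AV_k = V_kH_k + fe_k^T, \quad V_k^TV_k = I, \quad H_k = V_k^TAV_k$

• Widely used, relatively well understood (polynomial approximation theory): $z = p(A)v_0 = \gamma_1 p(\lambda_1)x_1 + \gamma_2 p(\lambda_2)x_2 + \cdots + \gamma_n p(\lambda_n)x_n$

• Convergence of KSM: well separated, large eigenvalues converge rapidly; convergence also depends on the starting vector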
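The factorization $AV_k = V_kH_k + fe_k^T$ can be checked numerically with a minimal sketch. The function name `arnoldi`, the toy spectrum, and the choice of applying Gram-Schmidt twice are assumptions made for illustration; this is not the ARPACK implementation.

```python
import numpy as np

def arnoldi(A, v0, k):
    """Build a k-step Arnoldi factorization A V = V H + f e_k^T."""
    n = v0.shape[0]
    V = np.zeros((n, k))
    H = np.zeros((k, k))
    V[:, 0] = v0 / np.linalg.norm(v0)
    for j in range(k):
        w = A @ V[:, j]
        # Gram-Schmidt against v_0..v_j, applied twice for numerical stability
        for _ in range(2):
            h = V[:, : j + 1].T @ w
            w = w - V[:, : j + 1] @ h
            H[: j + 1, j] += h
        if j + 1 < k:
            H[j + 1, j] = np.linalg.norm(w)
            V[:, j + 1] = w / H[j + 1, j]
    return V, H, w  # after the last step, w is the residual vector f

# Toy spectrum (an assumption): 99 eigenvalues in [0, 1], one isolated at 2.
rng = np.random.default_rng(0)
A = np.diag(np.concatenate([np.linspace(0.0, 1.0, 99), [2.0]]))
k = 20
V, H, f = arnoldi(A, rng.standard_normal(100), k)

e_k = np.zeros(k)
e_k[-1] = 1.0
factorization_error = np.linalg.norm(A @ V - V @ H - np.outer(f, e_k))
orthogonality_error = np.linalg.norm(V.T @ V - np.eye(k))
# A is symmetric, so H is symmetric up to rounding; symmetrize before eigvalsh.
ritz = np.linalg.eigvalsh((H + H.T) / 2)
print("largest Ritz value:", ritz[-1])
print("factorization error:", factorization_error)
```

With only 20 steps the isolated eigenvalue at 2 is captured by the largest Ritz value, while the clustered eigenvalues in [0, 1] are approximated far less accurately, which is the behaviour the convergence bullet describes.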


Acceleration Techniques

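Implicit Restart

• filter out unwanted spectral components from $v_0$: $\mathcal{K}(A, v_0; k) \rightarrow \mathcal{K}(A, v_0^+; k)$, where $v_0^+$ is the filtered starting vector

• $H_k - \mu I = QR$, then $A(V_kQ) = (V_kQ)(Q^TH_kQ) + fe_k^TQ$

• Software: ARPACK

Spectral transformation

• $(A - \sigma I)^{-1}x = \frac{1}{\lambda - \sigma}x$

• $(K - \sigma M)^{-1}Mx = \frac{1}{\lambda - \sigma}x$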
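A small dense sketch of the spectral transformation: the eigenvalues $\theta$ of $(K - \sigma M)^{-1}M$ with largest magnitude map back, via $\lambda = \sigma + 1/\theta$, to the eigenvalues of $Kx = \lambda Mx$ closest to the shift. The toy matrices `K` and `M` and the shift value are assumptions; a production code would factor $K - \sigma M$ with a sparse direct solver and run a Krylov method on the transformed operator instead of forming it.

```python
import numpy as np

n = 50
off = np.eye(n, k=1) + np.eye(n, k=-1)
K = 2.0 * np.eye(n) - off            # toy stiffness matrix (assumption)
M = (4.0 * np.eye(n) + off) / 6.0    # toy mass matrix (assumption)
sigma = 1.0                          # shift; assumed not to be an eigenvalue

# Reference: all eigenvalues of the pencil (K, M), computed densely.
exact = np.sort(np.linalg.eigvals(np.linalg.solve(M, K)).real)

# Transformed operator T = (K - sigma M)^{-1} M, eigenvalues theta = 1/(lambda - sigma).
T = np.linalg.solve(K - sigma * M, M)
theta = np.linalg.eigvals(T).real

# The four theta of largest magnitude correspond to the lambda nearest sigma.
largest = theta[np.argsort(-np.abs(theta))[:4]]
recovered = np.sort(sigma + 1.0 / largest)
nearest = np.sort(exact[np.argsort(np.abs(exact - sigma))[:4]])
print(recovered)
print(nearest)
```

Interior eigenvalues near $\sigma$ become well separated, extreme eigenvalues of the transformed operator, which is the regime in which a Krylov subspace method converges quickly.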


Using KSM in accelerator modeling

the spectrum of the problem

Example: H60VG3 structure, linear element, N = 30M, nnz = 484M

• 1024 CPUs, 738 GB

• Ordering time: 4143 s

• Numerical factorization: 133 s

• Total: 5068 s for 12 eigenvalues

Software: PARPACK (implicit restart) + SuperLU, WSMP (spectral transformation)



Limitations of the KSM

• High degree polynomial needed for computing small clustered eigenvalues: many matrix vector multiplications

• Spectral transformation can be expensive: memory limitation, scalability

• Not easy to introduce a preconditioner: eigenvectors of $P^{-1}A$ are different from eigenvectors of $A$


Alternative algorithms

Optimization based approach

• Minimizing Rayleigh Quotient: $\min_{x^Tx = 1} x^TAx$

• Minimizing Residual (Wood & Zunger 85, Jia 97): $\min_{x \in V,\ x^Tx = 1} \|Ax - \theta x\|$

Nonlinear equation solver based approach (Jacobi-Davidson)

• Newton correction: $A(u + z) = \lambda(u + z), \quad u^Tz = 0$

• Preconditioner: $(I - uu^T)P(I - uu^T)$

• Stopping criteria for the inner iteration (Notay 2002, Stathopoulos 2005)

Allows us to solve problems with more than 90M DOF
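As an illustration of the optimization based approach, the following sketch minimizes the Rayleigh quotient by preconditioned steepest descent, choosing the step with a Rayleigh-Ritz projection onto span{x, Pr}. The function name, the diagonal preconditioner, and the test matrix are assumptions; this is a generic scheme, not the specific algorithm used in the applications above.

```python
import numpy as np

def minimize_rayleigh_quotient(A, apply_precond, x0, tol=1e-8, max_iter=500):
    """Preconditioned steepest descent for min x^T A x subject to x^T x = 1."""
    x = x0 / np.linalg.norm(x0)
    for it in range(max_iter):
        theta = x @ A @ x                 # Rayleigh quotient
        r = A @ x - theta * x             # residual (gradient direction)
        if np.linalg.norm(r) < tol:
            break
        w = apply_precond(r)              # preconditioned search direction
        # Rayleigh-Ritz on span{x, w}: an exact line search for the quotient
        S, _ = np.linalg.qr(np.column_stack([x, w]))
        _, vecs = np.linalg.eigh(S.T @ A @ S)
        x = S @ vecs[:, 0]
    return x @ A @ x, x, it

# Toy problem (an assumption): diagonally dominant symmetric matrix.
rng = np.random.default_rng(0)
n = 100
E = rng.standard_normal((n, n))
A = np.diag(np.arange(1.0, n + 1.0)) + 1e-3 * (E + E.T)
d = np.diag(A).copy()                     # diagonal (Jacobi) preconditioner

theta, x, iters = minimize_rayleigh_quotient(A, lambda r: r / d, rng.standard_normal(n))
exact = np.linalg.eigvalsh(A)[0]
print(theta, exact, iters)
```

Unlike a Krylov subspace method, the preconditioner enters here directly through the search direction, so the eigenvectors being computed are still those of $A$.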


Multi-level Sub-structuring (for computing many eigenpairs)

• Domain Decomposition concept

• Multi-level extension of the Component Mode Synthesis (CMS) method (Bennighof 92)

• Decomposition can be done algebraically (Lehoucq & Bennighof 2002)

• Success story in structure engineering

• Error analysis

• Extend to accelerator modeling


Single-level Sub-structuring

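Matrix Partition: $(K, M)$ is partitioned into two sub-structures, with diagonal blocks $K_{11}, K_{22}$ and $M_{11}, M_{22}$, coupled through an interface block (index 3)

Block elimination: $\hat{K} = L^{-1}KL^{-T}, \quad \hat{M} = L^{-1}ML^{-T}$

Sub-structure calculation (mode selection):

$K_{11}v^{(1)} = \mu^{(1)}M_{11}v^{(1)}$

$K_{22}v^{(2)} = \mu^{(2)}M_{22}v^{(2)}$

$\hat{K}_{33}v^{(3)} = \mu^{(3)}\hat{M}_{33}v^{(3)}$

Subspace assembling:

$S_i = (v_1^{(i)}\ v_2^{(i)}\ \cdots\ v_{k_i}^{(i)}), \quad i = 1, 2, 3$

$S = \mathrm{diag}(S_1, S_2, S_3)$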
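The four steps above can be sketched on a small dense problem. Everything named here (`gen_eigh`, `substructure_ritz`, the 1D toy matrices, the index sets) is an assumption for illustration; the block elimination uses $L$ with $L_{31} = K_{31}K_{11}^{-1}$ and $L_{32} = K_{32}K_{22}^{-1}$, which makes $\hat{K}$ block diagonal.

```python
import numpy as np

def gen_eigh(K, M):
    """Solve K v = mu M v (K symmetric, M SPD) via a Cholesky reduction."""
    Ci = np.linalg.inv(np.linalg.cholesky(M))   # M = C C^T
    mu, Y = np.linalg.eigh(Ci @ K @ Ci.T)
    return mu, Ci.T @ Y

def substructure_ritz(K, M, idx1, idx2, idx3, k1, k2, k3):
    """Single-level sub-structuring: returns Ritz values of (K, M)."""
    perm = np.concatenate([idx1, idx2, idx3])
    K, M = K[np.ix_(perm, perm)], M[np.ix_(perm, perm)]   # matrix partition
    n, n1, n2 = K.shape[0], len(idx1), len(idx2)
    s1, s2, s3 = slice(0, n1), slice(n1, n1 + n2), slice(n1 + n2, n)
    # Block elimination: K = L diag(K11, K22, Khat33) L^T
    L = np.eye(n)
    L[s3, s1] = np.linalg.solve(K[s1, s1], K[s1, s3]).T
    L[s3, s2] = np.linalg.solve(K[s2, s2], K[s2, s3]).T
    Li = np.linalg.inv(L)
    Kh, Mh = Li @ K @ Li.T, Li @ M @ Li.T
    # Sub-structure calculation (mode selection) and subspace assembling
    S = np.zeros((n, k1 + k2 + k3))
    col = 0
    for s, k in ((s1, k1), (s2, k2), (s3, k3)):
        _, V = gen_eigh(Kh[s, s], Mh[s, s])
        S[s, col:col + k] = V[:, :k]            # keep the k lowest modes
        col += k
    theta, _ = gen_eigh(S.T @ Kh @ S, S.T @ Mh @ S)   # projected pencil
    return theta

# Toy 1D problem (an assumption): 41 unknowns, node 20 is the interface.
n = 41
off = np.eye(n, k=1) + np.eye(n, k=-1)
K = 2.0 * np.eye(n) - off
M = (4.0 * np.eye(n) + off) / 6.0
idx1, idx2, idx3 = np.arange(0, 20), np.arange(21, 41), np.array([20])

exact = gen_eigh(K, M)[0]
approx = substructure_ritz(K, M, idx1, idx2, idx3, 5, 5, 1)   # truncated modes
full = substructure_ritz(K, M, idx1, idx2, idx3, 20, 20, 1)   # all modes kept
print(approx[:3])
print(exact[:3])
```

Keeping all modes reproduces the spectrum of $(K, M)$; truncating gives Ritz values that bound the exact eigenvalues from above, and no triangular solves with the original matrices or orthogonalization steps are needed once $\hat{K}$ and $\hat{M}$ are formed.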


Mode Selection


Implementation & cost

Cost:

• Flops: more than a single sparse Cholesky factorization

• Storage: Block Cholesky factor + Projected matrix + some other data

• NO triangular solves (involving the original K and M), NO orthogonalization

Attractive when:

1) the problem is large enough

2) a large number of eigenvalues are needed


AMLS vs. Shift-invert Lanczos (SIL)

DOF = 65K, 3 levels of partition


Cavity with External Coupling

Vector wave equation with waveguide boundary conditions can be modeled by a non-linear eigenvalue problem

Open cavity with three waveguide ports; waveguide BC on port $j$ ($j = 1, 2, 3$):

$n \times (\nabla \times E) + i\sqrt{k^2 - k_{cj}^2}\; n \times (n \times E) = 0$


Quadratic Eigenvalue Problem

Consider only one mode propagating in the waveguides

Algorithms

• Linearize then solve by KSM (does not preserve the structure of the problem)

• Second Order Arnoldi Iteration (Bai & Su 2005): project the QEP into 2nd order Krylov Subspace
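To make the linearization option concrete, here is a minimal dense sketch that rewrites a generic QEP $(\lambda^2 M + \lambda C + K)x = 0$ as a standard eigenvalue problem of twice the size using a companion matrix. The names `M`, `C`, `K`, `W` and the random test matrices are assumptions; the damping-like term is taken as $C = iW$ only to mimic a complex-valued coupling term.

```python
import numpy as np

def solve_qep_by_linearization(M, C, K):
    """Solve (lam^2 M + lam C + K) x = 0 through the companion linearization.

    With z = [x; lam x], the QEP is equivalent to comp @ z = lam * z.
    """
    n = M.shape[0]
    comp = np.block([
        [np.zeros((n, n)), np.eye(n)],
        [-np.linalg.solve(M, K), -np.linalg.solve(M, C)],
    ])
    lam, Z = np.linalg.eig(comp)
    return lam, Z[:n, :]          # the top half of z is the QEP eigenvector x

# Toy matrices (assumptions): M SPD, K and W symmetric, C = i W.
rng = np.random.default_rng(0)
n = 6
B = rng.standard_normal((n, n))
M = np.eye(n) + 0.1 * (B @ B.T)
K = rng.standard_normal((n, n)); K = K + K.T
W = rng.standard_normal((n, n)); W = W + W.T
C = 1j * W

lam, X = solve_qep_by_linearization(M, C, K)

# Scaled residual of every computed eigenpair of the original QEP.
norms = [np.linalg.norm(A_, 2) for A_ in (M, C, K)]
scaled_residuals = np.array([
    np.linalg.norm((l**2 * M + l * C + K) @ x)
    / ((abs(l)**2 * norms[0] + abs(l) * norms[1] + norms[2]) * np.linalg.norm(x))
    for l, x in zip(lam, X.T)
])
print(scaled_residuals.max())
```

The companion matrix is 2n by 2n and has no symmetry even when $M$, $W$ and $K$ are symmetric, which is what "does not preserve the structure of the problem" refers to.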


Second-Order Krylov Space (Bai)

$A = iM^{-1}W, \quad B = M^{-1}K - k_c^2 I$
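For reference, the second-order Krylov subspace of Bai & Su can be written as follows. This is a sketch of the standard definition, with the QEP assumed to be scaled to the monic form $(\lambda^2 I - \lambda A - B)x = 0$; the symbols $u$, $r_j$ and $\mathcal{G}_n$ are notation introduced here.

```latex
\begin{aligned}
r_0 &= u, \qquad r_1 = A r_0, \\
r_j &= A r_{j-1} + B r_{j-2}, \qquad j \ge 2, \\
\mathcal{G}_n(A, B; u) &= \mathrm{span}\{ r_0, r_1, \ldots, r_{n-1} \}.
\end{aligned}
```

Projecting the original QEP matrices onto an orthonormal basis of $\mathcal{G}_n(A, B; u)$ yields a small QEP of the same form, so the quadratic structure is kept, in contrast to linearization.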


SOAR is faster and more accurate (than linearization)

Accelerating cavity model for international linear collider (ILC)

9-cell superconducting cavity coupled to one input coupler and two Higher-Order-Mode couplers.

NDOFs = 3.2 million, NCPUs = 768, Memory = 300 GB

18 eigenpairs in 2634 seconds (linearization took more than 1 hour)


Electronic Structure Calculation

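wave function: $X = (x_1, x_2, \ldots, x_k), \quad x_i \in \mathbb{R}^n$

n: real space grid size, e.g. $32^3 \approx 32000$

k: number of occupied states, 1~10% of n

Charge density: $\rho(X) = \mathrm{diag}(XX^T)$

• $E_{kinetic} = \frac{1}{2}\mathrm{trace}(X^TLX)$

• $E_{ionic} = \mathrm{trace}(X^TD_{ion}X) + \sum_i (x_i^Tw)^2$

• $E_{Hartree} = \frac{1}{2}\rho(X)^TS\rho(X)$

• $E_{xc} = e^Tf_{xc}(\rho(X))$

$E_{total}(X) = E_{kinetic} + E_{ionic} + E_{Hartree} + E_{xc}$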


Non-linear Eigenvalue Problem

Total energy minimization

$\min_X E_{total}(X)$ s.t. $X^TX = I$

KKT condition

$H(X)X = X\Lambda, \quad X^TX = I$

$H(X) = L + D_{ion} + ww^T + \mathrm{Diag}(S\rho(X)) + \mathrm{Diag}(g_{xc}(\rho(X)))$


The Self Consistent Field Iteration

Input: $L$, $D_{ion}$, $S$, $w$ and initial guess $X^{(0)}$

Output: $X = \arg\min E_{total}(X)$ s.t. $X^TX = I_k$

Major steps

o For i = 0, 1, 2, … until converged

1) Form $H^{(i)} = H(X^{(i)})$

2) Compute k smallest eigenpairs of $H^{(i)}$: $H^{(i)}X^{(i+1)} = X^{(i+1)}\Lambda^{(i+1)}$, i.e. $X^{(i+1)} = \arg\min \mathrm{trace}(X^TH^{(i)}X)$ s.t. $X^TX = I_k$
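A minimal sketch of the SCF loop on a toy model. The Hamiltonian $H(X) = L + \alpha\,\mathrm{Diag}(\rho(X))$, the coupling constant `alpha`, and the density-based stopping test are assumptions chosen so the example stays small; the ionic, Hartree and exchange-correlation terms of the real problem are not modelled.

```python
import numpy as np

def hamiltonian(L, alpha, X):
    """Toy Hamiltonian H(X) = L + alpha * Diag(rho(X)), rho(X) = diag(X X^T)."""
    rho = np.sum(X * X, axis=1)
    return L + alpha * np.diag(rho)

def scf(L, alpha, X0, tol=1e-10, max_iter=100):
    """Plain SCF iteration: repeatedly solve the linear eigenproblem for H(X)."""
    k = X0.shape[1]
    X = X0
    rho_old = np.sum(X * X, axis=1)
    for it in range(1, max_iter + 1):
        H = hamiltonian(L, alpha, X)          # 1) form H^(i) = H(X^(i))
        lam, V = np.linalg.eigh(H)            # 2) k smallest eigenpairs of H^(i)
        X, lam = V[:, :k], lam[:k]
        rho = np.sum(X * X, axis=1)
        if np.linalg.norm(rho - rho_old) < tol:   # density is self-consistent
            return X, lam, it, True
        rho_old = rho
    return X, lam, max_iter, False

# Toy setup (an assumption): finite-difference Laplacian on the unit interval.
n, k, alpha = 30, 3, 5.0
L = (n + 1) ** 2 * (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1))
rng = np.random.default_rng(0)
X0, _ = np.linalg.qr(rng.standard_normal((n, k)))

X, lam, iters, converged = scf(L, alpha, X0)
kkt_residual = np.linalg.norm(hamiltonian(L, alpha, X) @ X - X * lam)
print(iters, converged, kkt_residual)
```

The weak nonlinearity and large spectral gap of this toy model make the fixed-point map a contraction; for realistic Hamiltonians plain SCF can fail to converge without mixing or other safeguards.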


Direct Constrained Minimization (DCM)

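For i = 1, 2, … until convergence

1. Form $H^{(i)} = H(X^{(i)})$

2. Compute $R^{(i)} = K^{-1}(H^{(i)}X^{(i)} - X^{(i)}\Theta^{(i)})$, where $\Theta^{(i)} = \mathrm{Diag}(X^{(i)T}H^{(i)}X^{(i)})$ and $K$ is a preconditioner

3. If (i>1) then

• set $Y^{(i)} = (X^{(i)}, R^{(i)}, P^{(i-1)})$

4. else

• set $Y^{(i)} = (X^{(i)}, R^{(i)})$

5. Solve $\min E_{total}(Y^{(i)}G)$ s.t. $G^TY^{(i)T}Y^{(i)}G = I_k$, and set $X^{(i+1)} = Y^{(i)}G$

6. If (i>1) then

• set $P^{(i)} = Y^{(i)}(:, k+1:3k)\,G(k+1:3k, :)$

7. else

• set $P^{(i)} = Y^{(i)}(:, k+1:2k)\,G(k+1:2k, :)$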
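The steps above can be sketched on a toy energy whose gradient is $2H(X)X$. Several simplifications are assumptions and depart from the algorithm as stated: the toy energy and Hamiltonian, the orthonormalization of $Y$ by QR (so the constraint becomes $G^TG = I_k$), a few SCF sweeps for the projected problem, and a search direction $P$ that spans the same space as the one in steps 6 and 7 without being computed by the same formula.

```python
import numpy as np

def hamiltonian(L, alpha, X):
    rho = np.sum(X * X, axis=1)                  # rho(X) = diag(X X^T)
    return L + alpha * np.diag(rho)

def energy(L, alpha, X):
    """Toy total energy; its gradient with respect to X is 2 H(X) X."""
    rho = np.sum(X * X, axis=1)
    return np.trace(X.T @ L @ X) + 0.5 * alpha * rho @ rho

def kkt_residual(L, alpha, X):
    HX = hamiltonian(L, alpha, X) @ X
    theta = np.sum(X * HX, axis=0)               # diagonal of X^T H X
    return HX - X * theta

def dcm(L, alpha, X, apply_Kinv, tol=1e-6, max_outer=100, inner=3):
    k = X.shape[1]
    P = None
    for it in range(max_outer):
        res = kkt_residual(L, alpha, X)
        if np.linalg.norm(res) < tol:
            return X, it
        R = apply_Kinv(res)                      # preconditioned residual
        Y = np.hstack([X, R] if P is None else [X, R, P])
        Q, _ = np.linalg.qr(Y)                   # orthonormal basis of span(Y)
        G = Q.T @ X                              # current iterate in that basis
        for _ in range(inner):                   # inner SCF on projected problem
            Hp = Q.T @ hamiltonian(L, alpha, Q @ G) @ Q
            _, W = np.linalg.eigh(Hp)
            G = W[:, :k]
        X_new = Q @ G
        P = X_new - X @ (X.T @ X_new)            # part of the update outside span(X)
        X = X_new
    return X, max_outer

# Toy setup (an assumption): scaled 1D Laplacian, preconditioner K = L.
n, k, alpha = 30, 3, 5.0
L = (n + 1) ** 2 * (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1))
rng = np.random.default_rng(0)
X0, _ = np.linalg.qr(rng.standard_normal((n, k)))

X, outer_iters = dcm(L, alpha, X0, lambda Rm: np.linalg.solve(L, Rm))
final_residual = np.linalg.norm(kkt_residual(L, alpha, X))
print(outer_iters, final_residual, energy(L, alpha, X0), energy(L, alpha, X))
```

Each outer iteration works only with a subspace of dimension at most 3k, so the nonlinearity is handled on a small projected problem instead of through a sequence of full-size linear eigenproblems.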


DCM vs. SCF

Atomic system: SiH4

Discretization: spectral method with plane wave basis: $n = 32^3$ in real space, N = 2103 (# of basis functions) in frequency space

Number of occupied states: k = 4

PETOT version of SCF uses 10 PCG steps (inner iterations) per outer iteration

DCM: 3 inner iterations

$\Delta E(X^{(i)}) = E_{total}(X^{(i)}) - E_{min}$


Concluding Remarks

Krylov Subspace Method (with appropriate acceleration strategies) continues to play an important role in solving SciDAC eigenvalue problems

Steady progress has been made in alternative approaches that can make better use of preconditioners

Multi-level sub-structuring is promising for computing many eigenpairs

Significant progress made in solving QEP

Non-linear eigenvalue problems remain challenging
