Algorithms and Computational Aspects of DFT Calculations

Part I

Juan Meza and Chao Yang
High Performance Computing Research
Lawrence Berkeley National Laboratory

IMA Tutorial: Mathematical and Computational Approaches to Quantum Chemistry

Institute for Mathematics and its Applications, University of Minnesota

September 26-27, 2008

Juan Meza (LBNL) Algorithms and Computational Aspects of DFT Calculations September 26, 2008 1 / 32

Outline

1 Preliminaries

2 Density Functional Theory

3 Pseudopotentials

4 Bloch’s Theorem

5 Diagonalization / Minimization

6 Improving Convergence

7 Summary


Goals

1 Brief introduction to Schrödinger’s equation and Density Functional Theory

2 Overview of most commonly used approximations

3 Description of the Self-Consistent Field Iteration

4 Overview of major algorithmic components


References

M. C. Payne, M. P. Teter, D. C. Allan, T. A. Arias, J. D. Joannopoulos, Iterative minimization techniques for ab initio total-energy calculations: molecular dynamics and conjugate gradients, Reviews of Modern Physics, Vol. 64, No. 4, pp. 1045–1097 (1992).

Christopher J. Cramer, Essentials of Computational Chemistry, John Wiley and Sons (2003).

Richard M. Martin, Electronic Structure: Basic Theory and Practical Methods, Cambridge University Press (2005).

F. Nogueira, A. Castro, M. A. L. Marques, A Tutorial on Density Functional Theory, Chapter 6, pp. 218–256, in A Primer in Density Functional Theory, Springer-Verlag (2002).

J. M. Thijssen, Computational Physics, Cambridge University Press (2003).


Many-body electronic Schrödinger equation

$$ H \Psi_k(r_1, r_2, \ldots, r_N) = E_k \Psi_k(r_1, r_2, \ldots, r_N) \tag{1} $$

$$ H = -\frac{\hbar^2}{2m} \sum_{i=1}^{N} \nabla_i^2 + \sum_{i=1}^{N} V_{ext}(r_i) + \frac{1}{2} \sum_{i \neq j} \frac{e^2}{|r_i - r_j|} \tag{2} $$

$\Psi_k$ contains all the information needed to study a system

$|\Psi_k|^2$ gives the probability density for finding the electrons at positions $r_1, \ldots, r_N$

$V_{ext}$ represents an external potential, e.g. Coulomb attraction by nuclei

$E_k$ quantized energy

$\Psi_k$ is a function of $3N$ variables: the electron positions $r_1, \ldots, r_N$.

Computational work grows like $O(10^{3N})$
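As a rough illustration of this growth, suppose each of the 3N coordinates is sampled at only 10 grid points; tabulating the wavefunction then takes 10^(3N) values. The function name `grid_values` and the choice of 10 points per axis are assumptions made for this sketch, not something taken from the slides.

```python
def grid_values(n_electrons, points_per_axis=10):
    """Number of values needed to tabulate a function of 3N variables on a grid."""
    return points_per_axis ** (3 * n_electrons)


# Storage explodes long before N reaches the size of a real molecule.
for n in (1, 2, 5, 10):
    print(f"N = {n:2d}: {grid_values(n):.3e} grid values")
```

Even at this very coarse resolution, ten electrons would already require 10^30 values, which is why the approximations in the rest of the talk are needed.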


Approximations commonly used

Born–Oppenheimer

  Also called adiabatic approximation

  Due to large difference in mass between electrons and nuclei

  Take nuclear positions as fixed

Density Functional Theory for modeling electron-electron interactions

Local Density Approximation (LDA)

Pseudopotentials for handling electron-ion interactions

Supercells to model systems with aperiodic geometries

Methods for minimizing total energy functional


Density Functional Theory

The unknown is very simple, i.e. the electron density, $\rho(r)$

Hohenberg-Kohn Theory

  There is a unique mapping between the ground state energy, $E_0$, and the ground state density, $\rho_0$

  Exact form of the functional unknown and probably unknowable

Independent particle model

  Electrons move independently in an average effective potential field

  Must add correction for exchange and correlation terms

Good compromise between accuracy and feasibility


Kohn-Sham Total Energy

Kohn and Sham proposed using $n_e$ noninteracting electrons moving in an effective potential due to the other electrons

Replace many-particle wavefunctions with single-particle wavefunctions

Kohn-Sham total energy:

$$ E_{total}[\psi_i] = \frac{1}{2} \sum_{i=1}^{n_e} \int_\Omega |\nabla \psi_i|^2 + \int_\Omega V_{ext}\,\rho + \frac{1}{2} \int_\Omega \frac{\rho(r)\rho(r')}{|r - r'|}\, dr\, dr' + E_{xc}[\rho(r)], $$

where $\rho(r) = \sum_{i=1}^{n_e} |\psi_i(r)|^2$, $\int_\Omega \psi_i \psi_j = \delta_{i,j}$, $n_e$ is the number of electrons, and $E_{xc}[\rho(r)]$ denotes the exchange–correlation functional


Kohn-Sham Equations

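Goal is to find the ground state energy by minimizing the Kohn-Sham total energy, $E_{total}$

Leads to the Kohn-Sham equations:

$$ H \psi_i = \varepsilon_i \psi_i, \quad i = 1, 2, \ldots, n_e $$

$$ H = \left[ -\frac{1}{2} \nabla^2 + V(\rho(r)) \right], $$

$$ V(\rho(r)) = V_{ext}(r) + \int \frac{\rho(r')}{|r - r'|}\, dr' + V_{xc}(\rho) $$

Nonlinear eigenvalue problem since the Hamiltonian, $H$, depends on $\psi$ through the charge density, $\rho$

$V_{xc}(\rho) = \partial E_{xc}(\rho(r)) / \partial \rho$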


Pseudopotentials

Interaction between electrons and the nucleus creates a problem: one needs to deal with a singularity near the atomic core, specifically the 1/r term in the computation of $V_{ext}(r)$

Pseudopotentials are based on the idea that most chemistry depends on valence electrons rather than core electrons

Therefore we replace the core electrons (and the ionic potential) with a weaker pseudopotential

Using pseudopotentials reduces the number of electrons that we need to consider, as well as the number of plane waves needed to accurately represent the wavefunctions, thereby reducing the computational cost

Both empirical and ab initio forms available.


Exchange–Correlation Functional

Most of the complexity of DFT is hidden in the exchange–correlation functional

Exchange arises from antisymmetry due to the Pauli exclusion principle

Correlation accounts for other many-body effects missing from the single-particle approximation, e.g. kinetic energy not covered by the first term of the Hamiltonian

No systematic way to improve the exchange–correlation functional

Local Density Approximation (LDA)

  Simplest approximation to the exchange–correlation term

  Assumes the exchange–correlation energy at each point equals that of a homogeneous electron gas with the same density

  Purely local, yet remarkably successful

  Known limitations

Literally hundreds of functionals proposed. For an interesting historical perspective see In Pursuit of the “Divine” Functional, A.E. Mattsson, Science, Vol. 298, No. 5594, pp. 759–760 (2002).


Discretization Options

Finite difference: $\psi'(r_j) \approx [\psi(r_j + h) - \psi(r_j - h)] / (2h)$

Finite elements

  $\psi(r) \approx \sum_{j=1}^{n} \alpha_j \phi_j(r)$, with $\phi_j(r)$ nice functions with local support

Local orbital method (good for molecules)

  Choose $\phi_j(r)$ as Gaussian or other “nice” functions

Planewave expansion

  Choose $\phi_j(r)$ as $e^{i g_j \cdot r}$

Useful for modeling solids with a periodic structure
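The finite-difference option can be illustrated with a small sketch. The centered formula divides the difference of the two samples by their full spacing, 2h, and its error shrinks roughly like h². The test function (sine) and the evaluation point are arbitrary choices made for this illustration.

```python
import math


def central_difference(f, r, h):
    """Approximate f'(r) from samples at r - h and r + h, which are 2h apart."""
    return (f(r + h) - f(r - h)) / (2.0 * h)


r = 0.7
for h in (1e-1, 1e-2, 1e-3):
    error = abs(central_difference(math.sin, r, h) - math.cos(r))
    print(f"h = {h:g}  error = {error:.2e}")
```

Each tenfold reduction of h should reduce the error by about a factor of one hundred, until rounding error takes over for very small h.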


Bloch’s Theorem and Periodic Supercells

Bloch’s Theorem: In a periodic solid each electronic wave function can be expressed as the product of a periodic function $\phi$ and $\exp(ik \cdot r)$, where $k$ is a wavevector, i.e.

$$ \psi(r) = e^{ik \cdot r}\, \phi(r) $$

Can expand $\phi(r)$ in a set of plane waves so that $\psi(r)$ is a sum of plane waves (more in a minute)

Bloch’s Theorem allows us to express the electronic wavefunctions in terms of a discrete set of plane waves

Can model large periodic systems by focusing on a smaller primary cell

Can also be used to model nonperiodic systems, like molecules


Plane-wave Basis Set

Write wavefunction as:

$$ \psi_i(r) = e^{ik \cdot r} \sum_{g_j} \alpha_j e^{i g_j \cdot r} \tag{3} $$

In principle, you need an infinite plane-wave basis set

In practice, you introduce an energy cutoff to truncate the basis set

All terms for which the kinetic energy is bigger than the cutoff are ignored

Pseudopotentials also allow us to use a much smaller plane-wave basis, thereby reducing the computational cost

As a bonus, the kinetic energy term of the Hamiltonian is diagonal (in Fourier space) when using a plane-wave basis set
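The energy cutoff can be sketched for the simplest possible case. The example below assumes a cubic supercell of side `cell_length`, atomic units (so the kinetic energy of a plane wave is ½|k + g|²), and reciprocal-lattice vectors g = (2π/a)·n with integer n. These choices and the function name `plane_waves` are assumptions of the sketch; real codes handle general lattices.

```python
import itertools
import math


def plane_waves(cell_length, e_cut, k=(0.0, 0.0, 0.0)):
    """List (g, kinetic energy) for every g with 0.5 * |k + g|^2 <= e_cut."""
    b = 2.0 * math.pi / cell_length                 # reciprocal-lattice spacing
    k_norm = math.sqrt(sum(c * c for c in k))
    # |g| <= |k + g| + |k|, so no integer index beyond this bound can qualify.
    n_max = int((math.sqrt(2.0 * e_cut) + k_norm) / b) + 1
    basis = []
    for n in itertools.product(range(-n_max, n_max + 1), repeat=3):
        g = tuple(b * ni for ni in n)
        kinetic = 0.5 * sum((kc + gc) ** 2 for kc, gc in zip(k, g))
        if kinetic <= e_cut:
            basis.append((g, kinetic))
    return basis


for e_cut in (0.5, 1.0, 2.0, 8.0):
    print(e_cut, len(plane_waves(2.0 * math.pi, e_cut)))
```

Because each basis function carries a single kinetic energy value, the kinetic part of the Hamiltonian in this basis is just the diagonal matrix of the `kinetic` entries, and the basis size grows steadily as the cutoff is raised.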


Finite Dimensional Problem

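Recall we want to

$$ \min E[\psi_i] = \frac{1}{2} \sum_{i=1}^{n_e} \int_\Omega |\nabla \psi_i|^2 + \int_\Omega V_{ext}\,\rho + \frac{1}{2} \int_\Omega \frac{\rho(r)\rho(r')}{|r - r'|}\, dr\, dr' + E_{xc}(\rho) $$

Substituting (3) and after some algebra we have

$$ \min_{X^* X = I_{n_e}} E_{KS}(X) \equiv E_{kinetic}(X) + E_{ext}(X) + E_{Hartree}(X) + E_{xc}(X), $$

where

$$ E_{kinetic} = \frac{1}{2}\, \mathrm{trace}(X^* L X) $$

$$ E_{ext} = \mathrm{trace}(X^* V_{ext} X) $$

$$ E_{Hartree} = \frac{1}{2}\, \rho(X)^T L^{\dagger} \rho(X) $$

$$ E_{xc} = \rho(X)^T \mu_{xc}[\rho(X)] $$

with $\rho(X) = \mathrm{diag}(X X^*)$ and $X$ an $N \times n_e$ matrix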
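To make the discrete density concrete, here is a minimal sketch of ρ(X) = diag(XX*) for a small matrix stored as a list of rows. The 4 × 2 matrix below is a made-up example with orthonormal columns, chosen only so the result can be checked by hand.

```python
def density(X):
    """rho(X) = diag(X X^*): entry j is the sum over orbitals i of |X[j][i]|^2."""
    return [sum(abs(entry) ** 2 for entry in row) for row in X]


s = 2 ** -0.5
# Hypothetical N = 4, n_e = 2 example; the two columns are orthonormal.
X = [
    [s, 0.5],
    [s, -0.5],
    [0.0, 0.5j],
    [0.0, 0.5],
]

rho = density(X)
print(rho, sum(rho))
```

When the columns of X are orthonormal, the entries of ρ sum to the number of electrons n_e, which is a useful sanity check on any implementation.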


Minimizing the Total Energy

KKT conditions

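$$ \nabla_X \mathcal{L}(X, \Lambda) = 0, \quad X^* X = I_{n_e}. $$

Discretized Kohn-Sham equations can now be written as:

$$ H(X) X = X \Lambda, \quad X^* X = I_{n_e}. $$

Kohn-Sham Hamiltonian given by:

$$ H(X) = \frac{1}{2} L + V(X), $$

$$ V(X) = V_{ext} + \mathrm{Diag}(L^{\dagger} \rho(X)) + \mathrm{Diag}(g_{xc}(\rho(X))) $$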


Approaches for Solving the Kohn-Sham Equations

Work with the KS equations indirectly: Self-Consistent Field Iteration

  View as solving a sequence of linear eigenvalue problems

  Need to precondition

  Need other acceleration techniques to improve convergence

Minimize the total energy directly: Direct Constrained Minimization

  Constrained optimization problem

  Also requires globalization techniques

  In general more robust


The SCF Iteration

SCF cycle:

  $V(\rho(r))$ → solve $\left[ -\frac{1}{2} \nabla^2 + V(\rho(r)) \right] \psi_i = \varepsilon_i \psi_i$ → $\{\psi_i\}_{i=1,\ldots,n_e}$ → $\rho(r) = \sum_{i=1}^{n_e} |\psi_i(r)|^2$ → $V(\rho(r))$

Most of the work is in solving the linear eigenvalue problem

Orthogonality constraint for the wavefunctions must be enforced explicitly

If using reciprocal (Fourier) space, then you also have many 3D FFTs

For large systems, the calculation of nonlocal potentials can also be expensive

SCF does NOT decrease the energy monotonically


Checking for Self-consistency

Convergence is usually checked by computing the change in total energy or density between iterations

Recall that neither quantity is guaranteed to decrease monotonically

Sometimes difficult to decide when self-consistency is reached


SCF Convergence Properties

Surprisingly few results

A good starting point:

E. Cancès and C. Le Bris, Can we outperform the DIIS approach for electronic structure calculations? Intl. J. Quantum Chem. 79 (2000), 82–90

$E(x)$ may not monotonically decrease between SCF iterations

SCF does not always converge;

$$ \lim_{i \to \infty} \|E(x^{(i+1)}) - E(x^{(i)})\| \neq 0, \quad \text{or} \quad \lim_{i \to \infty} \|\rho(x^{(i+1)}) - \rho(x^{(i)})\| \neq 0 $$

For some problems, one can show subsequence convergence;

$$ \lim_{i \to \infty} \|\rho(x^{(i+1)}) - \rho(x^{(i-1)})\| = 0 $$


Example

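$$ E(x) = \frac{1}{2} x^T L x + \frac{\alpha}{4}\, \rho(x)^T L^{-1} \rho(x) $$

$$ L = \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}, \quad x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}, \quad \rho(x) = \begin{pmatrix} x_1^2 \\ x_2^2 \end{pmatrix} $$

$$ \min E(x) \quad \text{s.t.} \quad x_1^2 + x_2^2 = 1 $$

$$ \left[ L + \alpha\, \mathrm{Diag}(L^{-1} \rho(x)) \right] x = \lambda_1 x $$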
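The two-variable example is small enough to run plain SCF by hand. The sketch below repeatedly replaces x by the eigenvector of H(x) = L + α Diag(L⁻¹ρ(x)) belonging to the smallest eigenvalue, using the closed-form eigenpair of a symmetric 2 × 2 matrix. The starting point x = (1, 0), the iteration count, and all function names are assumptions of this sketch rather than details from the slides.

```python
import math


def lowest_eigvec(a, b, d):
    """Unit eigenvector of [[a, b], [b, d]] for its smallest eigenvalue (needs b != 0)."""
    lam = 0.5 * (a + d) - math.hypot(0.5 * (a - d), b)
    v1, v2 = -b, a - lam
    norm = math.hypot(v1, v2)
    return (v1 / norm, v2 / norm)


def hamiltonian(x, alpha):
    """Entries (a, b, d) of H(x) = L + alpha * Diag(L^{-1} rho(x)), L = [[2, -1], [-1, 2]]."""
    r1, r2 = x[0] ** 2, x[1] ** 2
    # L^{-1} = (1/3) * [[2, 1], [1, 2]]
    p1 = (2.0 * r1 + r2) / 3.0
    p2 = (r1 + 2.0 * r2) / 3.0
    return 2.0 + alpha * p1, -1.0, 2.0 + alpha * p2


def scf(alpha, x0=(1.0, 0.0), iters=60):
    """Plain SCF; returns the final x and the list of density changes per iteration."""
    x = x0
    changes = []
    for _ in range(iters):
        x_new = lowest_eigvec(*hamiltonian(x, alpha))
        changes.append(math.hypot(x_new[0] ** 2 - x[0] ** 2, x_new[1] ** 2 - x[1] ** 2))
        x = x_new
    return x, changes


for alpha in (1.0, 12.0):
    x, changes = scf(alpha)
    print(f"alpha = {alpha}: rho = ({x[0] ** 2:.4f}, {x[1] ** 2:.4f}), last change = {changes[-1]:.3e}")
```

For α = 1 the density change shrinks to rounding level and ρ approaches (0.5, 0.5). For α = 12 the change settles near 1.22 because the density alternates between two states, so consecutive iterates never agree even though every second iterate does.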


SCF Converges When α = 1.0

[Figure: plot of $\Delta\rho^{(i)} = \|\rho^{(i)} - \rho^{(i-1)}\|$]


SCF Fails When α = 12.0


Subsequence Convergence

[Figure: two panels, odd subsequence and even subsequence]


Why Does SCF Fail?

SCF is attempting to minimize a sequence of surrogate models

Objective:

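$$ E(x) = \frac{1}{2} x^T L x + \frac{\alpha}{4}\, \rho(x)^T L^{-1} \rho(x) $$

$$ E_{sur}(x) = \frac{1}{2}\, x^T H(x^{(i)})\, x $$

Gradient:

$$ \nabla E(x) = H(x)\, x, \qquad \nabla E_{sur}(x) = H(x^{(i)})\, x $$

Gradients match at $x^{(i)}$:

$$ \nabla E(x^{(i)}) = \nabla E_{sur}(x^{(i)}) $$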
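The gradient identity can be checked numerically for the 2 × 2 example with L = [[2, -1], [-1, 2]]. The sketch below compares a centered finite-difference gradient of E with H(x)x, then evaluates the surrogate gradient H(x⁽ⁱ⁾)x at a different point to show that the two only agree at x⁽ⁱ⁾. The value α = 12, the test points, and the helper names are assumptions of this sketch.

```python
ALPHA = 12.0
L = [[2.0, -1.0], [-1.0, 2.0]]
L_INV = [[2.0 / 3.0, 1.0 / 3.0], [1.0 / 3.0, 2.0 / 3.0]]


def matvec(A, v):
    return [A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1]]


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def energy(x):
    """E(x) = 1/2 x^T L x + (alpha / 4) rho(x)^T L^{-1} rho(x)."""
    rho = [x[0] ** 2, x[1] ** 2]
    return 0.5 * dot(x, matvec(L, x)) + 0.25 * ALPHA * dot(rho, matvec(L_INV, rho))


def hamiltonian_times(x_ref, x):
    """H(x_ref) x, with H(x_ref) = L + alpha * Diag(L^{-1} rho(x_ref))."""
    p = matvec(L_INV, [x_ref[0] ** 2, x_ref[1] ** 2])
    lx = matvec(L, x)
    return [lx[0] + ALPHA * p[0] * x[0], lx[1] + ALPHA * p[1] * x[1]]


def numerical_gradient(f, x, h=1e-6):
    """Centered finite-difference gradient of f at x."""
    grad = []
    for j in range(2):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        grad.append((f(xp) - f(xm)) / (2.0 * h))
    return grad


x_i = [0.6, 0.8]        # current SCF iterate (arbitrary unit vector)
x_far = [0.8, 0.6]      # another point on the unit circle
print("at x_i  :", numerical_gradient(energy, x_i), hamiltonian_times(x_i, x_i))
print("at x_far:", numerical_gradient(energy, x_far), hamiltonian_times(x_i, x_far))
```

At `x_i` the two vectors agree to finite-difference accuracy, while at `x_far` the surrogate gradient differs from the true one, which is the reason a full SCF step can overshoot.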


SCF Step is Too Long!


Improving SCF

Construct better surrogate

Cannot afford to use local quadratic approximation (Hessian too expensive)

Charge mixing to improve convergence (heuristic)

Trust Region to restrict the update of x to a small neighborhood of the gradient matching point, e.g. TRSCF – Thøgersen, Olsen, Yeager & Jørgensen (2004)

Direct Constrained Minimization – Yang, Meza & Wang (2006) [1]

See talk by Chao Yang, Friday, Oct. 4, 2008

[1] C. Yang, J. Meza, L. Wang, A Constrained Optimization Algorithm for Total Energy Minimization in Electronic Structure Calculation, J. Comp. Phys., 217, 709–721 (2006)


Mixing

Linear mixing

  New ρ is a linear combination of the previous value and the quantity computed from the solution of the linear eigenvalue problem, i.e. $\rho_{i+1} = \beta \tilde{\rho} + (1 - \beta) \rho_i$, where $\tilde{\rho}$ is the density obtained from the eigenvalue problem

Anderson extrapolation

Broyden and Modified Broyden Mixing

DIIS (Direct Inversion in the Iterative Subspace)

All methods are some form of an acceleration technique for a nonlinear iteration
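Linear mixing is the simplest of these schemes to try out. The sketch below applies it to the toy problem from the earlier example, with L = [[2, -1], [-1, 2]], H = L + α Diag(L⁻¹ρ), and α = 12, where plain SCF oscillates. The mixing weight β = 0.3, the starting density, the tolerance, and the function names are hand-picked assumptions for this illustration, not recommendations.

```python
import math


def output_density(rho, alpha):
    """Density of the lowest eigenvector of H = L + alpha * Diag(L^{-1} rho)."""
    a = 2.0 + alpha * (2.0 * rho[0] + rho[1]) / 3.0    # H[0][0]
    d = 2.0 + alpha * (rho[0] + 2.0 * rho[1]) / 3.0    # H[1][1]
    b = -1.0                                            # H[0][1] = H[1][0]
    lam = 0.5 * (a + d) - math.hypot(0.5 * (a - d), b)  # smallest eigenvalue
    v1, v2 = -b, a - lam                                # unnormalized eigenvector
    norm2 = v1 * v1 + v2 * v2
    return (v1 * v1 / norm2, v2 * v2 / norm2)


def scf_linear_mixing(alpha, beta, rho0=(1.0, 0.0), tol=1e-10, max_iter=200):
    """Iterate rho <- beta * rho_out + (1 - beta) * rho; beta = 1 is plain SCF."""
    rho = rho0
    for it in range(1, max_iter + 1):
        rho_out = output_density(rho, alpha)
        rho_new = tuple(beta * o + (1.0 - beta) * r for o, r in zip(rho_out, rho))
        change = math.hypot(rho_new[0] - rho[0], rho_new[1] - rho[1])
        rho = rho_new
        if change < tol:
            return rho, it, True
    return rho, max_iter, False


for beta in (1.0, 0.3):
    rho, iterations, converged = scf_linear_mixing(12.0, beta)
    print(f"beta = {beta}: converged = {converged}, iterations = {iterations}, rho = {rho}")
```

With β = 1 the iteration keeps jumping between two densities, while the damped update with β = 0.3 settles at ρ = (0.5, 0.5). Other values of β may behave differently, which is why mixing is described as a heuristic.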


Trust Region Subproblem

Solve

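$$ \min E_{sur}(x) \quad \text{s.t.} \quad x^T x = 1, $$

$$ \|x x^T - x^{(i)} (x^{(i)})^T\|_F^2 \leq \Delta \quad \text{(trust region constraint)} $$

Equivalent to solving

$$ \left[ H(x^{(i)}) - \sigma\, x^{(i)} (x^{(i)})^T \right] x = \lambda_1 x, \quad x^T x = 1 $$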

σ is a penalty parameter (Lagrange multiplier for the trust region constraint)

Need heuristic for choosing σ
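The penalized eigenproblem can be tried on the same toy problem, with L = [[2, -1], [-1, 2]], H(x) = L + α Diag(L⁻¹ρ(x)), and α = 12. The sketch below solves [H(x⁽ⁱ⁾) − σ x⁽ⁱ⁾(x⁽ⁱ⁾)ᵀ] x = λ₁x at every step with a fixed σ. Keeping σ constant, the value σ = 3, the starting point, and the function names are all assumptions of this illustration; a practical TRSCF code adapts σ during the iteration.

```python
import math


def lowest_eigvec(a, b, d):
    """Unit eigenvector of [[a, b], [b, d]] for its smallest eigenvalue (needs b != 0)."""
    lam = 0.5 * (a + d) - math.hypot(0.5 * (a - d), b)
    v1, v2 = -b, a - lam
    norm = math.hypot(v1, v2)
    return (v1 / norm, v2 / norm)


def penalized_scf(alpha, sigma, x0=(1.0, 0.0), tol=1e-10, max_iter=200):
    """SCF where each step uses H(x_i) - sigma * x_i x_i^T; sigma = 0 is plain SCF."""
    x = x0
    for it in range(1, max_iter + 1):
        r1, r2 = x[0] ** 2, x[1] ** 2
        # Entries of H(x_i) minus the rank-one penalty sigma * x_i x_i^T.
        a = 2.0 + alpha * (2.0 * r1 + r2) / 3.0 - sigma * r1
        d = 2.0 + alpha * (r1 + 2.0 * r2) / 3.0 - sigma * r2
        b = -1.0 - sigma * x[0] * x[1]
        x_new = lowest_eigvec(a, b, d)
        change = math.hypot(x_new[0] ** 2 - r1, x_new[1] ** 2 - r2)
        x = x_new
        if change < tol:
            return x, it, True
    return x, max_iter, False


for sigma in (0.0, 3.0):
    x, iterations, converged = penalized_scf(12.0, sigma)
    print(f"sigma = {sigma}: converged = {converged}, iterations = {iterations}, "
          f"rho = ({x[0] ** 2:.6f}, {x[1] ** 2:.6f})")
```

The penalty favors eigenvectors close to the previous iterate, which shortens the step. For this problem σ = 3 is enough to turn the oscillation into convergence to ρ = (0.5, 0.5), but the right size of σ is problem dependent.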


SCF + Charge Mixing Improves Convergence

[Figure: plot of $\Delta E(x^{(i)}) = \|E(x^{(i)}) - E_{min}\|$ for $\alpha = 12$, $n = 10$, $n_e = 2$]


TRSCF Further Improves Convergence

How should we choose σ?


Summary

Reviewed basic approximations used in DFT

Introduced the major algorithmic components

Discussed methods for improving SCF convergence

Introduced trust region ideas

Part II of this talk will discuss many of the computational issues
