Algorithms and Computational Aspects of DFT Calculations
Part II

Juan Meza and Chao Yang
High Performance Computing Research
Lawrence Berkeley National Laboratory

IMA Tutorial: Mathematical and Computational Approaches to Quantum Chemistry
Institute for Mathematics and its Applications, University of Minnesota
September 26-27, 2008

Juan Meza (LBNL), September 27, 2008


1 Goals and Motivation

2 Review of Equations

3 Plane Wave DFT Computational Components

4 Parallelization Strategies

5 Future Computational Challenges
   Linear Scaling Methods
   Parallelism Issues

6 Software
   Available Codes
   KSSOLV

7 Summary

1 Goals and Motivation

Goals

1 The Role of Computation

2 Review Equations and Solution Techniques

3 Discuss Major Computational Aspects of Plane Wave DFT codes

4 Present Some Parallelization Issues

5 Highlight Computational Challenges


Materials by design

"Advances in density functional theory coupled with multinode computational clusters now enable accurate simulation of the behavior of multi-thousand atom complexes that mediate the electronic and ionic transfers of solar energy conversion. These new and emerging nanoscience capabilities bring a fundamental understanding of the atomic and molecular processes of solar energy utilization within reach."

Basic Research Needs for Solar Energy Utilization, Report of the BES Workshop on Solar Energy Utilization, April 18-21, 2005


DFT codes are widely used for science applications

9470 nodes; 19,480 cores

13 Tflop/s SSP (100 Tflop/s peak)

Upgrade to quad-core (355 Tflop/s peak)

DFT methods account for 75% of the materials sciences simulations at NERSC, totaling over 5 million hours of computer time in 2006


We can now simulate some realistic structures

The charge density of a 15,000 atom quantum dot, Si13607H2236. Using 2048 processors at NERSC the calculation took about 5 hours.

The calculated dipole moment of a 2633 atom CdSe quantum rod, Cd961Se724H948. Using 2560 processors at NERSC the calculation took about 30 hours.

2 Review of Equations

Kohn-Sham Equations

Recall our goal is to find the ground state energy by minimizing the Kohn-Sham total energy, $E_{total}$

Leads to:

Kohn-Sham equations

$$ H \psi_i = \varepsilon_i \psi_i, \quad i = 1, 2, \ldots, n_e $$

$$ H = \left[ -\tfrac{1}{2} \nabla^2 + V(\rho(r)) \right], $$

$$ V(\rho(r)) = V_{ext}(r) + \int \frac{\rho(r')}{|r - r'|} \, dr' + V_{xc}(\rho) $$

Nonlinear eigenvalue problem since the Hamiltonian, $H$, depends on $\psi$ through the charge density, $\rho$


Discretized Kohn-Sham Equations

KKT conditions

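$$ \nabla_X \mathcal{L}(X, \Lambda) = 0, \qquad X^* X = I_{n_e}. $$

Discretized Kohn-Sham equations can now be written as:

$$ H(X) X = X \Lambda, \qquad X^* X = I_{n_e}. $$

Kohn-Sham Hamiltonian given by:

$$ H(X) = \tfrac{1}{2} L + V(X), $$

$$ V(X) = V_{ext} + \mathrm{Diag}\,(L^{\dagger} \rho(X)) + \mathrm{Diag}\, g_{xc}(\rho(X)) $$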


The SCF Iteration

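The SCF cycle:

$$ V(\rho(r)) \;\rightarrow\; \left[ -\tfrac{1}{2} \nabla^2 + V(\rho(r)) \right] \psi_i = E_i \psi_i \;\rightarrow\; \{\psi_i\}_{i=1,\ldots,n_e} \;\rightarrow\; \rho(r) = \sum_{i}^{n_e} |\psi_i(r)|^2 \;\rightarrow\; V(\rho(r)) $$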

1 Given an initial charge density $\rho$, compute a potential $V_k(\rho(r))$

2 Solve the linear eigenvalue problem for the $\psi_i$, $i = 1, \ldots, n_e$

3 Compute the new charge density $\rho$

4 Update $\rho$ using your favorite mixing scheme

5 Compute $V_{k+1}$ and repeat until converged
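The five steps above can be sketched on a toy problem. The block below is only an illustration under stated assumptions: a hypothetical two-site, one-electron model whose on-site potential depends linearly on the density (the parameters `U`, `t`, `v` and the function names are invented here), with simple linear mixing standing in for "your favorite mixing scheme". It is not the KSSOLV `scf` routine.

```python
import math

def lowest_state(a, b, t):
    """Lowest eigenpair of the 2x2 matrix [[a, -t], [-t, b]] (t > 0), in closed form."""
    d = 0.5 * (a - b)
    r = math.hypot(d, t)
    eps = 0.5 * (a + b) - r
    x, y = t, a - eps            # unnormalized eigenvector from the first row
    norm = math.hypot(x, y)
    return eps, (x / norm, y / norm)

def scf(U=1.0, t=1.0, v=(0.0, 0.5), beta=0.5, tol=1e-12, max_iter=200):
    """Toy SCF loop: hypothetical model with V_i(rho) = v_i + U * rho_i."""
    rho = [1.0, 0.0]                                   # initial charge density
    for k in range(max_iter):
        a = v[0] + U * rho[0]                          # steps 1/5: potential from rho
        b = v[1] + U * rho[1]
        eps, psi = lowest_state(a, b, t)               # step 2: linear eigenproblem
        rho_out = [psi[0] ** 2, psi[1] ** 2]           # step 3: new charge density
        if max(abs(rho_out[i] - rho[i]) for i in range(2)) < tol:
            return rho, eps, k
        # step 4: simple linear mixing of old and new densities
        rho = [(1.0 - beta) * rho[i] + beta * rho_out[i] for i in range(2)]
    raise RuntimeError("SCF did not converge")

if __name__ == "__main__":
    rho, eps, iters = scf()
    print("density:", rho, "eigenvalue:", eps, "iterations:", iters)
```

The convergence test compares the input density with the density produced from it, which is the self-consistency condition; in a real code the choice of mixing scheme strongly affects how many iterations this takes.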

Overall computational complexity is $O(N n_e^2)$ due to linear algebra

Major computational components:

CG method
Orthogonalization
Computation of potentials
3D FFT


What Are the Computational Issues?

DFT methods account for 75% of the material science simulations at NERSC

Parallel efficiencies can be quite high:

Parallelizing over the plane wave basis can scale to ≈ 1,000 processors
Parallelizing over the plane wave basis and wavefunction index can scale to ≈ 10,000 processors

Most codes still based on $O(N^3)$ algorithms

Not systematically improvable

Inadequate for strong and/or non-local correlations

Parallel efficiencies can be difficult to achieve; 10-20% parallel efficiency is not uncommon

3 Plane Wave DFT Computational Components

Major Computational Components of Plane Wave DFT Codes

Eigenvalue solver

Orthogonalization

3D FFTs

Computation of potentials


Eigenvalue Solver

Need to solve one linear eigenvalue problem at each SCF iteration: the Hamiltonian is $N \times N$ and the wanted block of eigenvectors is $N \times n_e$

The size of $N$ can easily be 10,000 – 100,000

Only need the $n_e$ (≈ number of atoms) lowest eigenvalues and corresponding eigenvectors

Called diagonalization in chemistry/materials science circles

Various approaches including CG, Grassmann CG, residual minimization

Distinction is usually made between all-band vs. band-by-band, which corresponds to solving for all eigenvectors simultaneously vs. solving for one eigenvector at a time. We would call this blocked vs. unblocked

Use of optimized BLAS3 routines can significantly improve performance


Orthogonalization

Due to physical constraints, the electronic wavefunctions must be orthonormal

This adds a constraint to the KS equations in the form of $X^* X = I_{n_e}$

Can be time consuming for large systems

Complexity is $O(N n_e^2)$, where $N$ is the size of the discretization and $n_e$ is the number of electrons
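To make the $O(N n_e^2)$ count concrete, here is a minimal sketch of one standard way to enforce $X^* X = I_{n_e}$: modified Gram-Schmidt on real vectors. This is an illustration, not the method of any particular DFT code (production codes may use other schemes); the function names are hypothetical. Each of the roughly $n_e^2/2$ column pairs costs $O(N)$ work, which gives the stated complexity.

```python
import math

def modified_gram_schmidt(columns):
    """Orthonormalize n_e real vectors of length N (the columns of X)."""
    q = [list(c) for c in columns]
    for i in range(len(q)):
        norm = math.sqrt(sum(x * x for x in q[i]))
        if norm == 0.0:
            raise ValueError("columns are linearly dependent")
        q[i] = [x / norm for x in q[i]]
        # Remove the component along q[i] from every later column: O(N) per pair.
        for j in range(i + 1, len(q)):
            proj = sum(a * b for a, b in zip(q[i], q[j]))
            q[j] = [b - proj * a for a, b in zip(q[i], q[j])]
    return q

def gram_matrix(q):
    """Return X^T X, which should be the identity after orthonormalization."""
    return [[sum(a * b for a, b in zip(u, w)) for w in q] for u in q]

if __name__ == "__main__":
    X = [[1.0, 1.0, 0.0, 0.0, 0.0],
         [1.0, 0.0, 1.0, 0.0, 0.0],
         [0.0, 1.0, 1.0, 1.0, 2.0]]
    for row in gram_matrix(modified_gram_schmidt(X)):
        print([round(v, 12) for v in row])
```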


FFTs

Recall that the kinetic energy operator takes on a particularly simple form in Fourier space (also called G-space)

Most DFT codes take advantage of this fact by converting from real space to G-space for computation of the Hamiltonian

Since systems are usually 3D, codes need to compute the 3D FFTs through a series of 1D FFTs

This has a consequence both in the total amount of work and when trying to parallelize the codes
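As a short worked illustration of why G-space is convenient (the expansion coefficients $c(G)$ are notation introduced here, not taken from the slides): if a wavefunction is expanded in plane waves, the Laplacian acts on each term by multiplication, so the kinetic energy operator is diagonal.

```latex
\psi(r) = \sum_{G} c(G)\, e^{i G \cdot r}
\quad\Longrightarrow\quad
-\tfrac{1}{2} \nabla^2 \psi(r) = \sum_{G} \tfrac{1}{2} |G|^2 \, c(G)\, e^{i G \cdot r}
```

A local potential, in contrast, acts by pointwise multiplication in real space, which is why applying the Hamiltonian typically involves FFTs between the two representations.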


Computation of potentials

The Hartree potential, $V_{Hartree} = \int \frac{\rho(r')}{|r - r'|} \, dr'$, can be computed in several ways

The calculation can be posed as the solution of a Poisson problem.

Fast Poisson solvers or multigrid can also be used

Because the potential can be viewed as a convolution, it can also be computed using FFTs
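The Fourier-space route can be sketched in one dimension. The block below is a toy under stated assumptions: a periodic 1D cell, atomic units so that the Poisson problem reads $-V'' = 4\pi\rho$, the $G = 0$ term set to zero (a neutralizing background), and a naive $O(n^2)$ DFT in place of a real FFT library to keep it self-contained. All function names are hypothetical.

```python
import cmath
import math

def dft(values, sign=-1):
    """Naive discrete Fourier transform; sign=+1 gives the unnormalized inverse."""
    n = len(values)
    return [sum(values[p] * cmath.exp(sign * 2j * math.pi * p * k / n)
                for p in range(n))
            for k in range(n)]

def hartree_1d(rho, length):
    """Solve -V'' = 4*pi*rho on a periodic cell by dividing by G^2 in Fourier space."""
    n = len(rho)
    rho_hat = dft(rho)
    v_hat = [0j] * n                        # G = 0 component left at zero
    for m in range(1, n):
        freq = m if m <= n // 2 else m - n  # map index to signed frequency
        g = 2.0 * math.pi * freq / length
        v_hat[m] = 4.0 * math.pi * rho_hat[m] / (g * g)
    return [z.real / n for z in dft(v_hat, sign=+1)]

if __name__ == "__main__":
    n, length = 16, 10.0
    k = 2.0 * math.pi / length
    xs = [length * i / n for i in range(n)]
    rho = [math.cos(k * x) for x in xs]
    v = hartree_1d(rho, length)
    exact = [4.0 * math.pi * math.cos(k * x) / k ** 2 for x in xs]
    print("max error:", max(abs(a - b) for a, b in zip(v, exact)))
```

For a single cosine mode the analytic solution is $4\pi \cos(kx)/k^2$, so the result can be checked directly; in 3D the same division by $|G|^2$ is applied after a 3D FFT of the density.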

4 Parallelization Strategies

Parallel Calculations Milestones

1991 Silicon surface reconstruction (7x7), Meiko I860, 64 processors (Stich, Payne, King-Smith, Lin, Clarke)

1998 FeMn alloys (exchange bias), Cray T3E, 1500 processors; first > 1 Tflop/s simulation, Gordon Bell prize (Ujfalussy, Stocks, Canning, Y. Wang, Shelton et al.)

2005 1000 atom Molybdenum simulation with Qbox, BlueGene/L at LLNL with 32,000 processors (F. Gygi et al.)

2008 Band-gap calculation of a 13,824 atom ZnTeO alloy proposed as a new solar cell material. Used 131,072 processors on Blue Gene/P at ANL and achieved 107.5 Tflop/s


Parallelization Strategies

Parallel across k-points – Not useful for large systems as k is usually small

Parallel over electrons – number of processors limited by number of electrons

Parallel over the number of plane-wave basis functions, ng – most commonly used in plane-wave codes

Parallelization of DFT codes is nontrivial and most codes cannot scale to large numbers of processors with even moderate efficiencies.

30% parallel efficiency is usually considered very good

Parallelization issues for Hartree-Fock codes are similar, especially for SCF


Parallelization of 3D FFT

3D FFTs are computed via 3 sets of 1D FFTs and 2 transposes

Most of the communication is in the global transpose, (b) to (c)

Ratio of flops/comm ≈ log N

Many FFTs are computed at the same time to avoid latency issues

Only non-zero elements computed/communicated

For details see (Canning et al.): http://www.nersc.gov/projects/paratec/
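The decomposition into three sets of 1D transforms can be checked on a tiny grid. This serial sketch uses a naive 1D DFT instead of an FFT library and hypothetical function names; it is not the PARATEC implementation. Gathering the strided columns for the second and third passes is the step that becomes a transpose (and hence communication) when the grid is distributed across processors.

```python
import cmath
import math

def dft1d(v):
    """Naive 1D discrete Fourier transform (stand-in for a 1D FFT)."""
    n = len(v)
    return [sum(v[p] * cmath.exp(-2j * math.pi * p * k / n) for p in range(n))
            for k in range(n)]

def dft3d_by_axes(a):
    """3D transform as three passes of 1D transforms, one pass per axis."""
    n1, n2, n3 = len(a), len(a[0]), len(a[0][0])
    out = [[[complex(a[i][j][k]) for k in range(n3)] for j in range(n2)]
           for i in range(n1)]
    for i in range(n1):                      # pass 1: contiguous axis
        for j in range(n2):
            out[i][j] = dft1d(out[i][j])
    for i in range(n1):                      # pass 2: needs columns along axis 2
        for k in range(n3):
            col = dft1d([out[i][j][k] for j in range(n2)])
            for j in range(n2):
                out[i][j][k] = col[j]
    for j in range(n2):                      # pass 3: needs columns along axis 1
        for k in range(n3):
            col = dft1d([out[i][j][k] for i in range(n1)])
            for i in range(n1):
                out[i][j][k] = col[i]
    return out

def dft3d_direct(a):
    """Reference: the 3D DFT evaluated directly from its triple-sum definition."""
    n1, n2, n3 = len(a), len(a[0]), len(a[0][0])
    return [[[sum(a[i][j][k] * cmath.exp(-2j * math.pi *
                  (i * k1 / n1 + j * k2 / n2 + k * k3 / n3))
                  for i in range(n1) for j in range(n2) for k in range(n3))
              for k3 in range(n3)] for k2 in range(n2)] for k1 in range(n1)]

if __name__ == "__main__":
    grid = [[[i + 2.0 * j - k * k + 0.5 * i * k for k in range(4)]
             for j in range(3)] for i in range(2)]
    fast, ref = dft3d_by_axes(grid), dft3d_direct(grid)
    err = max(abs(fast[i][j][k] - ref[i][j][k])
              for i in range(2) for j in range(3) for k in range(4))
    print("max difference:", err)
```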

5 Future Computational Challenges

Linear Scaling Electronic Structure Methods

Goal is to reduce the computational work from $O(N^3)$ to $O(N)$

Quantum mechanical effects are near-sighted, e.g. treat the computation of the exchange-correlation potential locally

Need to introduce concept of a localization region, inside which the quantity of interest is computed and is assumed to vanish outside the region

Six strategies for taking advantage of this fact (see Goedecker (1999)):

1 Fermi operator expansion
2 Fermi operator projection
3 Divide-and-conquer
4 Density-matrix minimization
5 Orbital minimization approach
6 Optimal basis density-matrix minimization


LS3DF

Based on Divide-and-Conquer approach

Divide a large system into smaller sub-domains that can be solved independently, then stitch the sub-domains back together again

Classical electrostatic interactions are long-ranged, i.e. solve one global Poisson equation

Requires minimal communication between the sub-domains

Artificial boundary effects due to sub-dividing domains can be cancelled out

Based on ideas from fragment molecular method

We call our method Linear Scaling 3D Fragment, or LS3DF [1]

[1] L.W. Wang, Z. Zhao, J. Meza, LBNL-61691 (2006)
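The idea that overlapping fragments can be combined so that artificial boundaries cancel can be illustrated with a counting exercise. The slides do not give the combination formula, so the sign pattern below is an assumption made for illustration: on a periodic 2D grid of cells, every cell is the corner of four fragments (2x2, 2x1, 1x2, 1x1), with 2x2 and 1x1 fragments added and 2x1 and 1x2 fragments subtracted. The sketch only verifies that each cell is then counted exactly once in total; it does no electronic structure calculation.

```python
from itertools import product

def signed_coverage(m1, m2):
    """Net number of times each cell of a periodic m1 x m2 grid is counted."""
    cover = [[0] * m2 for _ in range(m1)]
    for i, j in product(range(m1), range(m2)):           # fragment corner
        for s1, s2 in product((1, 2), repeat=2):         # fragment size s1 x s2
            # Assumed sign pattern: + for 2x2 and 1x1, - for 2x1 and 1x2.
            sign = 1 if (s1 == 1) == (s2 == 1) else -1
            for di, dj in product(range(s1), range(s2)):
                cover[(i + di) % m1][(j + dj) % m2] += sign
    return cover

if __name__ == "__main__":
    for row in signed_coverage(4, 3):
        print(row)
```

Each cell lies in four 2x2 fragments, two 2x1, two 1x2 and one 1x1 fragment, so the net count is 4 - 2 - 2 + 1 = 1, while the overlap regions introduced by the larger fragments are subtracted back out.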

Parallelism Issues

IBM Cell Blade. Same processor as found in a Sony Playstation 3

Multi-core and many-core is the wave of the future

Current algorithms are difficult to parallelize with high efficiency

Many quantum chemistry codes do not parallelize well for even medium-scale parallelism

6 Software

Electronic Structure Codes

ABINIT – www.abinit.org

PARATEC – www.nersc.gov/projects/paratec

PEtot – hpcrd.lbl.gov/linwang/PEtot/PEtot.html

PWscf – www.pwscf.org

NWChem – www.emsl.pnl.gov/docs/nwchem/nwchem.html

Q-Chem – www.q-chem.com/

Quantum Espresso – www.quantum-espresso.org

Socorro – dft.sandia.gov/Socorro

VASP – cms.mpi.univie.ac.at/vasp

Many, many more – apologies if your favorite code was not listed


KSSOLV Matlab package

KSSOLV Matlab code for solving the Kohn-Sham equations

Open source package

Handles SCF, DCM, Trust Region

Example problems to get started with

Object-oriented design - easy to extend

Good starting point for students

Beta version of KSSOLV available, ask one of us for more information!


Example: SiH4

a1 = Atom('Si');
a2 = Atom('H');
alist = [a1 a2 a2 a2 a2];
xyzlist = [
0.0 0.0 0.0
1.61 1.61 1.61
... ];
mol = Molecule();
mol = set(mol,'supercell',C);
mol = set(mol,'atomlist',alist);
mol = set(mol,'xyzlist',xyzlist);
mol = set(mol,'ecut',25);
mol = set(mol,'name','SiH4');
...
isosurface(rho);


Convergence

[Etot, X, vtot, rho] = scf(mol);
[Etot, X, vtot, rho] = dcm(mol);


Charge Density

isosurface(rho);


Example: Pt6Ni2O

cell:
19.59 0.0 0.0
...

sampling size: n1 = 96, n2 = 48, n3 = 48

atoms and coordinates:
1 Pt 1.3 -0.180 -0.015
...
7 Ni 8.4 0.003 3.069
8 Ni 8.5 7.998 7.762
9 O 14.9 2.644 1.511

number of electrons : 86
spin type : 1
kinetic energy cutoff: 60.0


Comparison of DCM vs. SCF


Summary

Described most common PW DFT computational components

Overview of standard numerical methods used

Brief introduction into some parallelization issues

Listed some computational challenges

Introduced KSSOLV, Matlab package for solving KS equations


References

Aron J. Cohen, Paula Mori-Sánchez, Weitao Yang, Insights into Current Limitations of Density Functional Theory, Science, Vol. 321, no. 5890, pp. 792-794 (2008).

F. Gygi, R. K. Yates, J. Lorenz, E. W. Draeger, F. Franchetti, C. W. Ueberhuber, B. R. de Supinski, S. Kral, J. A. Gunnels, J. C. Sexton, Proceedings of the 2005 ACM/IEEE Conference on Supercomputing (2005).

S. Goedecker, Linear Scaling Electronic Structure Methods, Rev. Mod. Phys. 71, 1085 (1999).

Curtis L. Janssen and Ida M.B. Nielsen, Parallel Computing in Quantum Chemistry, CRC Press (2008).
