Bayesian Error Estimation in Density Functional Theorypages.physics.cornell.edu/es05/talks/jakobsen-talk.pdfCan be generalized to other xc funcs., higher order derivatives, exact exchange

Bayesian Error Estimation in Density Functional Theory

Karsten W. JacobsenJens Jørgen Mortensen

Kristen KaasbjergSøren L. Frederiksen

Jens K. NørskovCAMP, Dept. of Physics, DTU

James P. SethnaLASSP, Cornell Univ.

cond–mat/0506225

Density Functional Theory● Widely used in many different scientific areas● “Predictive power” (LDA, GGA, exact

exchange...)● The reliability is usually estimated based on

experience with practical calculations – try a few different functionals – no systematic theory

● The reliability varies a lot with type of system/type of property

Cohesive Energy versusStructural Energy Differences

Typical cohesive energy error ~0.20.4 eVSmall structural energy differences(10100 meV) calculated with high reliability

(Skriver, 1985)

Our results: Ensemble error prediction

A Question (or two)● Is it possible to replace the experience and

“general wisdom” of the practitioners with a systematic approach?

● Can the reliability of a physical model (DFTGGA) be estimated more systematically?

Standard Bayesian Approach to Model Determination/Selection

Model: M with parameters , Data: D Prior probability

P ∣DM =P D∣M

P D∣M P ∣M

Bayes' theorem:

Maximal probability: Best fit > Least squares

P ∣DM ∝exp −∑iyi−yi

2 /22

Model parameters Data points

Data calculated with model parameters

Noise in data

y=01x

x

y

Obtain ensemble ofmodel parameters

Standard Bayesian:Fitting a Noisy SineFunction

Fitting a noisy sine function with 6th order polynomial

100 data points 1000 data points (not shown)

Low noise, but bad model (3rd order polynomial)

Ensemble fluctuations too small to describe deviations.

Different Approach for “Bad” Models

Previously used to develop and evaluate interatomic potentials Frederiksen, Jacobsen, Brown, Sethna PRL 93, 165501 (2004)

P ∣MD∝exp −C /T , C =∑i yi−y i2

Effective temperature given by the bestfit cost function: Low noise and bad model

(3rd order polynomial)T=

2C0N p

Number of parameters

“Truth”Best fit

Ensemble

DFT Generalized Gradient ApproximationModel

E x [n ]=∫ d r n rxunif n F x s

s= 1242

1/3∣∇ n∣n4/3

Perdew, Burke, and Ernzerhof, PRL 77, 3865 (1996).B. Hammer, L. B. Hansen, J. K. Nørskov, PRB 59, 7413 (1999)

Exchange energy involvesenhancement factor:

Fx s=∑n=1

Np

n Fn s =∑n=1

Np

n ss1

2n−2

The model:Expansion of enhancement factor:

Use PBE densities.Fast (energies are linear in parameters)

The DatabaseAtomization energies for 20 molecules and cohesive energies for 12 solids (per atom)

Molecules:

H2, LiH, CH4, NH3, OH, H2O, HF,Li2, LiF, Be2, C2H2, C2H4, HCN, CO,

N2, NO, O2, F2, P2, Cl2.

Solids:

Li, Na, Al, C (Diamond), Si, SiC,AlP, NaCl, LiF, MgO, Cu, Pt.

Cost function: C =∑iE iexp−E icalc

2

Calculations performed with new realspacemultigrid PAW code “gridPAW”(http://www.camp.dtu.dk/campos)(Jens Jørgen Mortensen)

Selfconsistent PBE densities used for all calculations.

Because xenergy and therefore total energy is linear in parameters onlya few coefficients need to be calculated.

http://www.camp.dtu.dk/campos

Overfitting versus Model FlexibilityMany parameters give flexibility to model (many fluctuations possible).Too many parameters lead to overfitting.

C =∑iEiexp−E icalc

2

Train:Minimization of cost functionon 201=19 molecules

Test:Evaluate cost function for themolecule which was left out.Average over which moleculeis left out.

We use three parameters in the following.

BestFit Enhancement FactorC =∑i

Eiexp−E icalc2

Minimization of cost function

error LDA PBE RPBE best-fitmolecules:mean abs. 1.46 0.35 0.21 0.24mean 1.38 0.28 -0.01 0.12solids:mean abs. 1.35 0.16 0.40 0.27mean 1.35 -0.09 -0.40 -0.24all:mean abs. 1.42 0.28 0.28 0.25mean 1.37 0.14 -0.16 -0.02

s=0 limit consistent with LDA!

Fx s=∑n=1

Np

n Fn s =∑n=1

Np

n ss1

2n−2

best fit=1.00,0.19,1.90

Units of eV

Ensemble of Enhancement Factors

Sampling of probability distribution using Metropolis

P ∣MD∝exp −C /T

For a harmonic cost function a direct evaluation is simpler: P ∣MD∝e

−12∑i

ii2/T

BEE O= 1N∑=1N O −Obest− fit 2

Essentially no additional computational cost (no selfconsistency)

Bayesian error estimation:

So, does it work?

For 46 out of 64 observables the error is within the Bayesian Error Estimate

Comparing BEE with Actual Errors

Example CO/Cu(100)Experiment: Top site preferredDFTPBE: Bridge site preferred

Best fit plus BEE:

Etop = - 0.64 ± 0.30 eV

Ebridge – Etop = - 26 ± 70 meV

Also “CO/Pt(111) Puzzle”

Calc. says fcc:E fcc−E top=−0.13eV

Expt says top site(Feibelman, Hammer, Nørskov, Wagner, Scheffler, Stumpf, Watwe, and Dumesic, J. Phys. Chem. B (2001))

Cu bccfcc Structural Energy Difference

Best fit plus BEE:

Ec = 3.3 ± 0.5 eV

Ebcc – Efcc = 35 ± 4 meV fcc bcc

Limitations● Error estimates depend on model (GGA) and

database (binding energies for molecules and solids)

● Problems if– Property not represented in database– Model unable to describe property at all

● Example: van der Waals interactions

Conclusions and Outlook● Systematic and computationally fast approach to

estimating error bars on DFTGGA calculations● Predicted error bars on energies may vary by

several orders of magnitudes – in agreement with expt.

● Possibility for tailoring functionals to specific purposes (RPBE for chemisorption)

● Can be generalized to other xc funcs., higher order derivatives, exact exchange (orbital func.), “stepping up the xcladder”... cond–mat/0506225

Documents

Bayesian Error Estimation in Density Functional Theorypages.physics.cornell.edu/es05/talks/jakobsen-talk.pdfCan be generalized to other xc funcs., higher order derivatives, exact exchange