Recent advances in quantum Monte Carlo for quantum chemistry: optimization of wavefunctions and calculation of observables

Julien Toulouse (1), Cyrus J. Umrigar (2), Roland Assaraf (1)

(1) Laboratoire de Chimie Théorique, Université Pierre et Marie Curie - CNRS, Paris, France.

(2) Laboratory of Atomic and Solid State Physics, Cornell University, Ithaca, New York, USA.

Email : julien.toulouse@upmc.fr

Web page: www.lct.jussieu.fr/pagesperso/toulouse/

March 2009

1 Optimization of wave functions

2 Calculation of observables

1 Optimization of wave functions

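Trial wave function

Jastrow-Slater wave function:

$$|\Psi(\mathbf{p})\rangle = J(\boldsymbol{\alpha}) \sum_{i=1}^{N_{\mathrm{CSF}}} c_i \, |C_i\rangle$$

• $J(\boldsymbol{\alpha})$ = Jastrow factor (with e-e, e-n, e-e-n terms)

• $|C_i\rangle$ = configuration state function (CSF) = linear combination of Slater determinants of given symmetry.

The Slater determinants are made of orbitals expanded on a Slater basis:

$$\phi_k(\mathbf{r}) = \sum_{\mu=1}^{N_{\mathrm{basis}}} \lambda_{k\mu} \, \chi_\mu(\mathbf{r})$$

$$\chi(\mathbf{r}) = N(\zeta) \, r^{n-1} e^{-\zeta r} S_{l,m}(\theta, \phi)$$

Parameters to optimize: $\mathbf{p} = \{\boldsymbol{\alpha}, \mathbf{c}, \boldsymbol{\lambda}, \boldsymbol{\zeta}\}$, i.e. Jastrow parameters α, CSF coefficients c, orbital coefficients λ and basis exponents ζ.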
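
The orbital expansion above can be illustrated with a minimal sketch of the radial part of a Slater basis function. It assumes that the angular factor S_{l,m} is normalized on the sphere, so that N(ζ) = (2ζ)^(n+1/2) / sqrt((2n)!) normalizes the radial part alone. The function names `sto_radial` and `orbital_radial` and all numerical values are hypothetical, chosen only for illustration.

```python
import math
import numpy as np

def sto_radial(r, n, zeta):
    """Radial part N(zeta) * r**(n-1) * exp(-zeta*r) of a Slater basis function."""
    # Radial normalization, assuming the angular part is normalized separately.
    norm = (2.0 * zeta) ** (n + 0.5) / math.sqrt(math.factorial(2 * n))
    return norm * r ** (n - 1) * np.exp(-zeta * r)

def orbital_radial(r, coeffs, ns, zetas):
    """Radial part of phi_k(r) = sum_mu lambda_{k,mu} chi_mu(r) for one angular channel."""
    return sum(c * sto_radial(r, n, z) for c, n, z in zip(coeffs, ns, zetas))

# Numerical check of the normalization: integral of R(r)^2 r^2 dr (trapezoid rule).
r = np.linspace(0.0, 40.0, 200001)
integrand = (sto_radial(r, n=2, zeta=1.3) * r) ** 2
norm_integral = np.sum(integrand[1:] + integrand[:-1]) * (r[1] - r[0]) / 2.0

# A hypothetical orbital built from two basis functions (1s-like and 2s-like).
phi = orbital_radial(r, coeffs=[0.8, 0.3], ns=[1, 2], zetas=[1.0, 1.3])
```

In this sketch the exponents `zetas` and the coefficients `coeffs` play the role of the parameters ζ and λ that the optimization varies.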

Wave function optimization: why and how?

Important for both VMC and DMC in order to

• reduce the systematic error

• reduce the statistical uncertainty

How to optimize?

• Until recently: minimization of the variance of the energy

  - OK for the few Jastrow parameters

  - but does not work well for the many CSF and orbital parameters

• Since recently: minimization of the energy (+ possibly a small fraction of variance)

  - works well for all the parameters

  - the energy is a better criterion

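Optimization method: principle

Expansion of the wave function around $\mathbf{p}_0$ to linear order in $\Delta\mathbf{p} = \mathbf{p} - \mathbf{p}_0$:

$$|\Psi^{[1]}(\mathbf{p})\rangle = |\Psi_0\rangle + \sum_j \Delta p_j \, |\Psi_j\rangle$$

where $|\Psi_0\rangle = |\Psi(\mathbf{p}_0)\rangle$ and $|\Psi_j\rangle = \partial |\Psi(\mathbf{p}_0)\rangle / \partial p_j$.

Normalization of the wave function chosen so that the derivatives $|\Psi_j\rangle$ are orthogonal to $|\Psi_0\rangle$.

Minimization of the energy =⇒ generalized eigenvalue equation:

$$\begin{pmatrix} E_0 & \mathbf{g}^T/2 \\ \mathbf{g}/2 & \mathbf{H} \end{pmatrix} \begin{pmatrix} 1 \\ \Delta\mathbf{p} \end{pmatrix} = E_{\mathrm{lin}} \begin{pmatrix} 1 & \mathbf{0}^T \\ \mathbf{0} & \mathbf{S} \end{pmatrix} \begin{pmatrix} 1 \\ \Delta\mathbf{p} \end{pmatrix}$$

where $E_0 = \langle\Psi_0|H|\Psi_0\rangle$, $g_i = \partial E(\mathbf{p}_0)/\partial p_i$, $H_{ij} = \langle\Psi_i|H|\Psi_j\rangle$, $S_{ij} = \langle\Psi_i|\Psi_j\rangle$.

Update of the parameters: $\mathbf{p}_0 \to \mathbf{p}_0 + \Delta\mathbf{p}$.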
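
One iteration of this scheme can be sketched as follows, assuming that E0, g, H and S are already available as arrays. The function name `linear_method_step`, the choice of the lowest eigenvalue, and all numbers are assumptions made for illustration, not details taken from the talk.

```python
import numpy as np

def linear_method_step(e0, g, h_mat, s_mat):
    """Solve the generalized eigenvalue equation of the linear method.

    Returns (delta_p, e_lin) for the lowest eigenvalue, with the eigenvector
    rescaled so that its first component equals 1.
    """
    n = len(g)
    a = np.empty((n + 1, n + 1))
    a[0, 0] = e0
    a[0, 1:] = g / 2.0
    a[1:, 0] = g / 2.0
    a[1:, 1:] = h_mat
    b = np.eye(n + 1)
    b[1:, 1:] = s_mat
    # Turn A v = E B v into an ordinary eigenproblem (B assumed invertible).
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(b, a))
    k = np.argmin(eigvals.real)  # assumption: target the lowest state
    v = eigvecs[:, k].real
    return v[1:] / v[0], eigvals[k].real

# Hypothetical 2-parameter example (numbers chosen only for illustration).
e0 = -1.0
g = np.array([0.2, -0.1])
h_mat = np.array([[0.5, 0.1], [0.1, 0.8]])
s_mat = np.array([[1.0, 0.2], [0.2, 1.0]])
delta_p, e_lin = linear_method_step(e0, g, h_mat, s_mat)

p0 = np.array([0.3, 0.7])
p_new = p0 + delta_p  # update of the parameters: p0 -> p0 + delta_p
```

For a symmetric matrix and a positive-definite overlap, as in this toy example, the lowest eigenvalue E_lin cannot exceed E0.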

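Optimization method: robustness

The linear method is equivalent to a stabilized Newton method:

$$\begin{pmatrix} E_0 & \mathbf{g}^T/2 \\ \mathbf{g}/2 & \mathbf{H} \end{pmatrix} \begin{pmatrix} 1 \\ \Delta\mathbf{p} \end{pmatrix} = E_{\mathrm{lin}} \begin{pmatrix} 1 & \mathbf{0}^T \\ \mathbf{0} & \mathbf{S} \end{pmatrix} \begin{pmatrix} 1 \\ \Delta\mathbf{p} \end{pmatrix}$$

$$\iff \begin{cases} (\mathbf{h} + 2\Delta E \, \mathbf{S}) \cdot \Delta\mathbf{p} = -\mathbf{g} \\ 2\Delta E = -\mathbf{g}^T \cdot \Delta\mathbf{p} \end{cases}$$

where $\mathbf{h} = 2(\mathbf{H} - E_0 \mathbf{S})$ is an approximate Hessian, and $\Delta E = E_0 - E_{\mathrm{lin}} > 0$ is the energy stabilization.

=⇒ more robust than Newton method

In quantum chemistry, it is known as super-CI method or augmented Hessian method.

Additional stabilization: $H_{ij} \to H_{ij} + a \, \delta_{ij}$ where $a \ge 0$.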
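
The equivalence with a stabilized Newton step, and the effect of the diagonal shift a, can be checked numerically with a small sketch. The helper `solve_linear_method`, the selection of the lowest eigenvalue, and the toy numbers are hypothetical.

```python
import numpy as np

def solve_linear_method(e0, g, h_mat, s_mat, a_diag=0.0):
    """Lowest solution of the linear-method eigenproblem with H -> H + a_diag * I."""
    n = len(g)
    a = np.zeros((n + 1, n + 1))
    a[0, 0] = e0
    a[0, 1:] = a[1:, 0] = g / 2.0
    a[1:, 1:] = h_mat + a_diag * np.eye(n)
    b = np.eye(n + 1)
    b[1:, 1:] = s_mat
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(b, a))
    k = np.argmin(eigvals.real)  # assumption: target the lowest state
    v = eigvecs[:, k].real
    return v[1:] / v[0], eigvals[k].real

# Hypothetical toy matrices.
e0 = -1.0
g = np.array([0.2, -0.1])
h_mat = np.array([[0.5, 0.1], [0.1, 0.8]])
s_mat = np.array([[1.0, 0.2], [0.2, 1.0]])

# Without extra stabilization: compare with the stabilized Newton equations.
dp, e_lin = solve_linear_method(e0, g, h_mat, s_mat)
hess = 2.0 * (h_mat - e0 * s_mat)          # approximate Hessian h
delta_e = e0 - e_lin                       # energy stabilization
newton_residual = (hess + 2.0 * delta_e * s_mat) @ dp + g
shift_residual = 2.0 * delta_e + g @ dp

# A large diagonal shift a makes the step much shorter.
dp_damped, _ = solve_linear_method(e0, g, h_mat, s_mat, a_diag=1.0e6)
```

Both residuals vanish to rounding error, and the damped step is far shorter than the undamped one, which is how the shift a keeps an optimization step under control.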

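Optimization method: on a finite VMC sample

The generalized eigenvalue equation is estimated as

$$\begin{pmatrix} E_0 & \mathbf{g}_R^T/2 \\ \mathbf{g}_L/2 & \mathbf{H} \end{pmatrix} \begin{pmatrix} 1 \\ \Delta\mathbf{p} \end{pmatrix} = E_{\mathrm{lin}} \begin{pmatrix} 1 & \mathbf{0}^T \\ \mathbf{0} & \mathbf{S} \end{pmatrix} \begin{pmatrix} 1 \\ \Delta\mathbf{p} \end{pmatrix}$$

with

$$\frac{g_{L,i}}{2} = \left\langle \frac{\Psi_i(\mathbf{R})}{\Psi_0(\mathbf{R})} \, \frac{H(\mathbf{R})\Psi_0(\mathbf{R})}{\Psi_0(\mathbf{R})} \right\rangle_{\Psi_0^2} \quad \text{and} \quad \frac{g_{R,j}}{2} = \left\langle \frac{\Psi_0(\mathbf{R})}{\Psi_0(\mathbf{R})} \, \frac{H(\mathbf{R})\Psi_j(\mathbf{R})}{\Psi_0(\mathbf{R})} \right\rangle_{\Psi_0^2}$$

$$H_{ij} = \left\langle \frac{\Psi_i(\mathbf{R})}{\Psi_0(\mathbf{R})} \, \frac{H(\mathbf{R})\Psi_j(\mathbf{R})}{\Psi_0(\mathbf{R})} \right\rangle_{\Psi_0^2} \quad \text{and} \quad S_{ij} = \left\langle \frac{\Psi_i(\mathbf{R})}{\Psi_0(\mathbf{R})} \, \frac{\Psi_j(\mathbf{R})}{\Psi_0(\mathbf{R})} \right\rangle_{\Psi_0^2}$$

where $\langle \cdots \rangle_{\Psi_0^2}$ is the average over the configurations R sampled from $\Psi_0^2$.

The estimated matrix on the left-hand side is non-symmetric!

=⇒ Zero-variance principle of Nightingale et al. (PRL 2001):

If there is some $\Delta\mathbf{p}$ such that $\Psi_0(\mathbf{R}) + \sum_j \Delta p_j \, \Psi_j(\mathbf{R}) = \Psi_{\mathrm{exact}}(\mathbf{R})$, then $\Delta\mathbf{p}$ is found with zero variance.

In practice, these non-symmetric estimators reduce the fluctuations on $\Delta\mathbf{p}$ by 1 or 2 orders of magnitude.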
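
The zero-variance principle can be illustrated on a toy problem that is not from the talk: a 1D harmonic oscillator, H = -(1/2) d²/dx² + x²/2, with the hypothetical trial function Ψ(x; c) = (1 + c x²) exp(-x²/2). The exact ground state (c = 0, energy 0.5) lies in the space spanned by Ψ0 and its derivative, so even a very small sample should give the exact answer. The function `sample_matrices`, the short Metropolis loop, and the value c0 = 0.3 are all assumptions of this sketch.

```python
import numpy as np

def sample_matrices(ratios, h_ratios, e_loc):
    """Finite-sample estimates of the linear-method matrices.

    ratios[k, i]   = Psi_i(R_k) / Psi_0(R_k)
    h_ratios[k, i] = (H Psi_i)(R_k) / Psi_0(R_k)
    e_loc[k]       = (H Psi_0)(R_k) / Psi_0(R_k)
    """
    m, n = ratios.shape
    mean_ratio = ratios.mean(axis=0)
    # Make the derivatives orthogonal to Psi_0 on this sample.
    d = ratios - mean_ratio
    hd = h_ratios - np.outer(e_loc, mean_ratio)
    a = np.empty((n + 1, n + 1))
    a[0, 0] = e_loc.mean()                         # E_0
    a[0, 1:] = hd.mean(axis=0)                     # g_R / 2
    a[1:, 0] = (d * e_loc[:, None]).mean(axis=0)   # g_L / 2
    a[1:, 1:] = d.T @ hd / m                       # H (non-symmetric)
    b = np.eye(n + 1)
    b[1:, 1:] = d.T @ d / m                        # S
    return a, b, mean_ratio

c0 = 0.3
rng = np.random.default_rng(0)

def log_psi0_sq(x):
    return 2.0 * np.log1p(c0 * x * x) - x * x

# Short Metropolis walk sampling Psi_0^2 (only 200 points on purpose).
x, walk = 0.0, []
for _ in range(200):
    y = x + rng.uniform(-1.5, 1.5)
    if np.log(rng.uniform()) < log_psi0_sq(y) - log_psi0_sq(x):
        x = y
    walk.append(x)
xs = np.array(walk)

f0 = 1.0 + c0 * xs**2                                  # Psi_0 / exp(-x^2/2)
ratios = (xs**2 / f0)[:, None]                         # Psi_1 / Psi_0
h_ratios = ((2.5 * xs**2 - 1.0) / f0)[:, None]         # H Psi_1 / Psi_0
e_loc = (0.5 + c0 * (2.5 * xs**2 - 1.0)) / f0          # local energy

a, b, mean_ratio = sample_matrices(ratios, h_ratios, e_loc)
eigvals, eigvecs = np.linalg.eig(np.linalg.solve(b, a))
k = np.argmin(eigvals.real)
e_lin = eigvals[k].real
dp = (eigvecs[1:, k] / eigvecs[0, k]).real

# Coefficient of x^2 in the updated (renormalized) wave function.
weight0 = 1.0 - dp[0] * mean_ratio[0]
c_new = (weight0 * c0 + dp[0]) / weight0
```

Although the sample mean of the local energy fluctuates, E_lin should come out as 0.5 and the new coefficient as 0 up to rounding error, independently of the sample, because the eigenvalue equation is satisfied configuration by configuration.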

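Optimization method: mixing a fraction of variance

How to minimize the energy variance with the linear method?

$$V = \min_{\Delta\mathbf{p}} \left\{ V_0 + \mathbf{g}_V^T \cdot \Delta\mathbf{p} + \frac{1}{2} \Delta\mathbf{p}^T \cdot \mathbf{h}_V \cdot \Delta\mathbf{p} \right\}$$

$$\iff V = \min_{\Delta\mathbf{p}} \frac{ \begin{pmatrix} 1 & \Delta\mathbf{p}^T \end{pmatrix} \begin{pmatrix} V_0 & \mathbf{g}_V^T/2 \\ \mathbf{g}_V/2 & \mathbf{h}_V/2 + V_0 \mathbf{S} \end{pmatrix} \begin{pmatrix} 1 \\ \Delta\mathbf{p} \end{pmatrix} }{ \begin{pmatrix} 1 & \Delta\mathbf{p}^T \end{pmatrix} \begin{pmatrix} 1 & \mathbf{0}^T \\ \mathbf{0} & \mathbf{S} \end{pmatrix} \begin{pmatrix} 1 \\ \Delta\mathbf{p} \end{pmatrix} }$$

$$\iff \begin{pmatrix} V_0 & \mathbf{g}_V^T/2 \\ \mathbf{g}_V/2 & \mathbf{h}_V/2 + V_0 \mathbf{S} \end{pmatrix} \begin{pmatrix} 1 \\ \Delta\mathbf{p} \end{pmatrix} = V \begin{pmatrix} 1 & \mathbf{0}^T \\ \mathbf{0} & \mathbf{S} \end{pmatrix} \begin{pmatrix} 1 \\ \Delta\mathbf{p} \end{pmatrix}$$

The matrix on the left-hand side of this last equation is the matrix to add to the energy matrix.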
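
A minimal sketch of how the variance matrix could be assembled and mixed with the energy matrix is given below. The slides only state that this matrix is added to the energy matrix; the convex combination with a weight `weight`, the function names, and all numbers are assumptions made for illustration.

```python
import numpy as np

def variance_matrix(v0, g_v, h_v, s_mat):
    """Build the (n+1) x (n+1) matrix associated with the variance."""
    n = len(g_v)
    m = np.empty((n + 1, n + 1))
    m[0, 0] = v0
    m[0, 1:] = g_v / 2.0
    m[1:, 0] = g_v / 2.0
    m[1:, 1:] = h_v / 2.0 + v0 * s_mat
    return m

def mixed_matrix(energy_mat, var_mat, weight):
    """Assumed mixing rule: (1 - weight) * energy matrix + weight * variance matrix."""
    return (1.0 - weight) * energy_mat + weight * var_mat

# Hypothetical toy values.
v0 = 0.4
g_v = np.array([0.3, -0.2])
h_v = np.array([[2.0, 0.3], [0.3, 1.5]])
s_mat = np.array([[1.0, 0.2], [0.2, 1.0]])
var_mat = variance_matrix(v0, g_v, h_v, s_mat)

# For a small step, the quotient reproduces the quadratic model of the variance.
dp = np.array([1.0e-3, -2.0e-3])
vec = np.concatenate(([1.0], dp))
overlap = np.eye(3)
overlap[1:, 1:] = s_mat
quotient = vec @ var_mat @ vec / (vec @ overlap @ vec)
quadratic = v0 + g_v @ dp + 0.5 * dp @ h_v @ dp

# Mixing a small fraction of variance into a hypothetical energy matrix.
energy_mat = np.array([[-1.0, 0.1, -0.05], [0.1, 0.5, 0.1], [-0.05, 0.1, 0.8]])
mixed = mixed_matrix(energy_mat, var_mat, weight=0.05)
```

The quotient and the quadratic model agree up to terms of third order in the step, which is why the term V0 S appears in the lower-right block.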

Simultaneous optimization of all parameters

Optimization of 149 parameters = 24 (Jastrow) + 49 (CSF) + 64 (orbitals) + 12 (exponents) for C2 molecule:

[Figure: energy (Hartree), from -75.9 to -75.4, versus iterations (0 to 6); inset: energy from -75.88 to -75.855 Hartree for iterations 2 to 6.]

=⇒ Energy converges to within 1 mHartree in a few iterations

Systematic improvement in QMC

For C2 molecule: total energies for a series of fully optimized Jastrow-Slater wave functions:

[Figure: energy (Hartree), from -75.94 to -75.8, versus wave function (J*SD, J*CAS(8,5), J*CAS(8,7), J*CAS(8,8), J*RAS(8,26)); legend: Exact, CCSD(T)/cc-pVQZ, VMC, DMC.]

=⇒ Systematic improvement in VMC and DMC!

Potential energy curve of C2 molecule ($^1\Sigma_g^+$)

Jastrow × single determinant wave function:

[Figure: energy (Hartree), from -75.9 to -75.4, versus interatomic distance (Bohr), from 1 to 10; legend: VMC J × SD, DMC J × SD, Morse potential; annotation: size-consistency error.]

=⇒ Single-determinant DMC is size-consistent, but with broken spin symmetry at dissociation, $\langle\Psi_{\mathrm{DMC}}|S^2|\Psi_{\mathrm{DMC}}\rangle = 2$

Potential energy curve of C2 molecule ($^1\Sigma_g^+$)

Jastrow × multideterminant wave function:

[Figure: energy (Hartree), from -75.9 to -75.4, versus interatomic distance (Bohr), from 1 to 10; legend: VMC J × CAS(8,8), DMC J × CAS(8,8), Morse potential.]

=⇒ DMC gives dissociation energy with chemical accuracy (1 kcal/mol ≈ 0.04 eV):

$D_{\mathrm{DMC}} = 6.482(3)$ eV vs $D_{\mathrm{exact}} = 6.44(2)$ eV

Dissociation energies of diatomic molecules

Jastrow × multideterminant (full valence CAS) wave functions:

[Figure: errors on dissociation energy (eV), from -1.5 to 0, for the molecules Li2, Be2, B2, C2, N2, O2, F2, Ne2; legend: MCSCF CAS, VMC J × CAS, DMC J × CAS.]

=⇒ Near chemical accuracy in DMC

Example of application

Binding energy of 2 NO2 to a fragment of carbon nanotube:

Estimates for the full nanotube (9,0):

B3LYP calculations: no binding

QMC calculations: weak binding (≲ 10 kcal/mol)

Lawson, Bauschlicher, Toulouse, Filippi, Umrigar, Chem. Phys. Lett. 466, 170 (2008)

2 Calculation of observables

Calculation of an observable in VMC

Energy

Estimator: $E_L(\mathbf{R}) = \dfrac{H(\mathbf{R})\Psi(\mathbf{R})}{\Psi(\mathbf{R})}$

Systematic error: $\delta E = O(\delta\Psi^2)$

Variance: $\sigma^2(E_L) = O(\delta\Psi^2)$

=⇒ Quadratic Zero-Variance Zero-Bias property

Arbitrary observable O (which does not commute with H)

Estimator: $O_L(\mathbf{R}) = \dfrac{O(\mathbf{R})\Psi(\mathbf{R})}{\Psi(\mathbf{R})}$

Systematic error: $\delta O = O(\delta\Psi)$

Variance: $\sigma^2(O_L) = O(1)$

=⇒ No quadratic Zero-Variance Zero-Bias property
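
The contrast between the energy and an arbitrary observable can be seen on a toy case that is not from the talk: the hydrogen atom with the hypothetical trial function Ψ(r) = exp(-a r), exact for a = 1. In atomic units its local energy is E_L = -a²/2 + (a - 1)/r, and the radial density of Ψ² is a Gamma distribution, which this sketch samples directly instead of running a Metropolis walk. The function name `vmc_hydrogen` and the sample sizes are arbitrary choices.

```python
import numpy as np

def vmc_hydrogen(a, n_samples=100_000, seed=0):
    """Sample r from |Psi|^2 for Psi = exp(-a r) and return (r, local energy)."""
    rng = np.random.default_rng(seed)
    # Radial density r^2 exp(-2 a r): Gamma distribution, shape 3, scale 1/(2a).
    r = rng.gamma(shape=3.0, scale=1.0 / (2.0 * a), size=n_samples)
    e_loc = -0.5 * a**2 + (a - 1.0) / r
    return r, e_loc

# Exact wave function: the local energy is constant, the observable r is not.
r_exact, e_exact = vmc_hydrogen(a=1.0)

# Approximate wave function: energy error and variance are both second order in (a - 1).
r_approx, e_approx = vmc_hydrogen(a=0.9)
energy_error = e_approx.mean() - (-0.5)
radius_error = r_approx.mean() - 1.5
```

With a = 0.9 the energy error should be close to (a - 1)²/2 = 0.005, while the error on ⟨r⟩ is close to 3/(2a) - 3/2 ≈ 0.17, first order in the wave function error. Even for the exact wave function, the fluctuations of r do not vanish.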

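Zero-Variance Zero-Bias estimators (Assaraf & Caffarel)

Based on the Hellmann-Feynman theorem

$$\langle O \rangle = \left( \frac{dE_\lambda}{d\lambda} \right)_{\lambda=0} \quad \text{where} \quad E_\lambda = \langle\Psi_\lambda|H + \lambda O|\Psi_\lambda\rangle$$

one can define an improved estimator:

$$O_{\mathrm{improved}}(\mathbf{R}) = \frac{O(\mathbf{R})\Psi(\mathbf{R})}{\Psi(\mathbf{R})} + \Delta O_{\mathrm{ZV}}(\mathbf{R}) + \Delta O_{\mathrm{ZB}}(\mathbf{R})$$

with the ZV term:

$$\Delta O_{\mathrm{ZV}}(\mathbf{R}) = \left[ \frac{H(\mathbf{R})\Psi'(\mathbf{R})}{\Psi'(\mathbf{R})} - E_L(\mathbf{R}) \right] \frac{\Psi'(\mathbf{R})}{\Psi(\mathbf{R})}$$

and the ZB term:

$$\Delta O_{\mathrm{ZB}}(\mathbf{R}) = 2 \left[ E_L(\mathbf{R}) - E \right] \frac{\Psi'(\mathbf{R})}{\Psi(\mathbf{R})}$$

where $\Psi'$ is the derivative of $\Psi_\lambda$ with respect to λ at λ = 0.

Quadratic Zero-Variance Zero-Bias property

Systematic error: $\delta O_{\mathrm{improved}} = O(\delta\Psi^2 + \delta\Psi \, \delta\Psi')$

Variance: $\sigma^2(O_{\mathrm{improved}}) = O(\delta\Psi^2 + \delta\Psi'^2 + \delta\Psi \, \delta\Psi')$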
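
The improved estimator can be tried on a toy case that is not from the talk: the hydrogen atom with O = r, the hypothetical trial function Ψ = exp(-a r), and the approximate derivative Ψ′ = -(1/2) r² exp(-a r). For a = 1 this Ψ′ solves the first-order perturbation equation exactly, so the improved estimator should then have zero variance. The function name `zvzb_mean_radius` and the sample sizes are arbitrary, and E is replaced by the sample mean of the local energy.

```python
import numpy as np

def zvzb_mean_radius(a, n_samples=200_000, seed=1):
    """Return samples of the usual estimator r and of the ZVZB-improved estimator."""
    rng = np.random.default_rng(seed)
    # Sample r from |Psi|^2 with Psi = exp(-a r): Gamma(shape 3, scale 1/(2a)).
    r = rng.gamma(shape=3.0, scale=1.0 / (2.0 * a), size=n_samples)
    e_loc = -0.5 * a**2 + (a - 1.0) / r                  # H Psi / Psi
    ratio = -0.5 * r**2                                  # Psi' / Psi
    # H Psi' / Psi for Psi' = -(1/2) r^2 exp(-a r).
    h_ratio = -0.5 * (-3.0 + (3.0 * a - 1.0) * r - 0.5 * a**2 * r**2)
    zv = h_ratio - e_loc * ratio                         # [H Psi'/Psi' - E_L] Psi'/Psi
    zb = 2.0 * (e_loc - e_loc.mean()) * ratio            # 2 [E_L - E] Psi'/Psi
    return r, r + zv + zb

# Approximate wave function (a = 0.9); the exact value of <r> is 1.5 bohr.
plain, improved = zvzb_mean_radius(a=0.9)

# Exact wave function (a = 1): the improved estimator is constant.
plain_exact, improved_exact = zvzb_mean_radius(a=1.0)
```

In this particular toy case the improved average stays close to 1.5 for a = 0.9 while the usual average is close to 3/(2a) ≈ 1.67, and the variance of the improved estimator is more than an order of magnitude smaller.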

Example of improved QMC estimators

Dipole moment of CH molecule ($^2\Pi$) in VMC:

[Figure: dipole moment (Debye), from 0.9 to 1.9, versus interatomic distance (Bohr), from 1.7 to 2.5; legend: usual estimator, improved estimator.]

=⇒ Reduction of statistical uncertainty!

Example of improved QMC estimators

Correlation hole of C2 molecule in VMC:

[Figure: correlation hole (Bohr^-1), from -0.3 to 0.5, versus interelectronic distance (Bohr), from 0 to 5; legend: usual histogram estimator, improved estimator.]

=⇒ Reduction of statistical uncertainty!

Summary and perspectives

Summary

efficient wave function optimization method by minimization of VMC energy

near chemical accuracy with compact wave functions

improved estimators for observables in QMC

Toulouse, Umrigar, JCP 126, 084102 (2007)

Umrigar, Toulouse, Filippi, Sorella, Hennig, PRL 98, 110201 (2007)

Toulouse, Assaraf, Umrigar, JCP 126, 244112 (2007)

Toulouse, Umrigar, JCP 128, 174101 (2008)

www.lct.jussieu.fr/pagesperso/toulouse/

Perspectives

optimization by minimization of DMC energy

optimization of molecular geometry

excited states
