
THE JOURNAL OF CHEMICAL PHYSICS 138, 234103 (2013)

On the analytical representation of free energy profiles with a Morse/long-range model: Application to the water dimer

Yalina Tritzant-Martinez,1,2 Tao Zeng,2 Aron Broom,2 Elizabeth Meiering,2 Robert J. Le Roy,2 and Pierre-Nicholas Roy2

1Department of Chemistry, University of Alberta, Edmonton, Alberta T6G 2G2, Canada
2Department of Chemistry, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada

(Received 1 November 2012; accepted 28 May 2013; published online 18 June 2013)

We investigate the analytical representation of potentials of mean force (pmf) using the Morse/long-range (MLR) potential approach. The MLR method had previously been used to represent potential energy surfaces, and we assess its validity for representing free energies. The advantage of the approach is that the potential of mean force data only needs to be calculated in the short to medium range region of the reaction coordinate while the long range can be handled analytically. This can result in significant savings in terms of computational effort since one does not need to cover the whole range of the reaction coordinate during simulations. The water dimer with rigid monomers whose interactions are described by the commonly used TIP4P model [W. Jorgensen and J. Madura, Mol. Phys. 56, 1381 (1985)] is used as a test case. We first calculate an “exact” pmf using direct Monte Carlo (MC) integration and term such a calculation our gold standard (GS). Second, we compare this GS with several MLR fits to the GS to test the validity of the fitting procedure. We then obtain the water dimer pmf using metadynamics simulations in a limited range of the reaction coordinate and show how the MLR treatment allows the accurate generation of the full pmf. We finally calculate the transition state theory rate constant for the water dimer dissociation process using the GS, the GS MLR fits, and the metadynamics MLR fits. Our approach can yield a compact, smooth, and accurate analytical representation of pmf data with reduced computational cost. © 2013 AIP Publishing LLC. [http://dx.doi.org/10.1063/1.4810006]

I. INTRODUCTION

Calculation of free energy variations along a reaction pathway is one way of gaining insight into physical and chemical properties such as the stability1,2 of a complex, its chemical reactivity3 or other physicochemical properties such as anesthetic potency,4 or binding free energy.5 In a reaction, the transformation from reactants to products implies a free energy difference that depends on the stability and reactivity of the substances involved, and it is related to the Potential of Mean Force (pmf).6,7 This pmf can be used to determine the reaction rate constant in the context of Variational Transition State Theory (VTST).8–12 The rate constant $k^{\mathrm{TST}}$ obtained via VTST will of course depend on the quality of the pmf. Thus, inaccuracies in the value of the pmf can greatly affect the values of $k^{\mathrm{TST}}$.9,13–15

Several levels of theory can be used to obtain the free energy profiles. These levels vary from ab initio potentials to semiempirical theories, together with classical Molecular Dynamics (MD) or even Path Integral Molecular Dynamics (PIMD). However, in all cases, a proper representation of the pmf is of critical importance in calculating $k^{\mathrm{TST}}$. Various authors have suggested approaches for obtaining smooth pmf curves using a representation of the pmf from weighted residuals.16 Recently, Schofield and van Zon17 have presented an extension of the Berg-Harris method18 to construct a smooth pmf, radial distribution, and probability density. As well, Calvo19 has applied Wang and Landau’s uniform sampling method20 to determine the potential of mean force associated with rotation about the dihedral angle of the butane molecule.

To obtain a well-behaved global representation of the pmf for weakly bound complexes, such as the water dimer considered in the present work, there are three major requirements. First, a proper account of the long-range interactions is needed. Second, the representation of the pmf should be compact. Finally, continuous lower-order derivatives of the pmf are required to facilitate the calculation of properties such as the reaction rate. In addition to these analytical requirements, a method with a low computational cost is also highly desirable. To fulfill these conditions, we have opted for the Morse/long-range (MLR) functional form21,22 because it is known to meet all the aforementioned requirements.21

MLR functions were developed to represent potential energy surfaces (PES) in a highly accurate manner. The accuracy of this approach has been confirmed by their use in fits to high resolution spectroscopy experiments. The determination of highly accurate MLR PESs from fits to high resolution spectroscopic data has been essential for the success of several studies.23–25 For example, the first molecular superfluid was confirmed based on Feynman path integral simulations using pairwise additive MLR pair potentials and infrared spectroscopy of doped hydrogen nano-clusters.26

The use of MLR potential functions was also crucial in determining the proper shift behaviour of the asymmetric stretch vibration of CO$_2$ doped in helium nano-clusters,27,28 where results based on an MLR potential have shown marked consistency with experimental results. Other applications of MLR functions have included the determination of the ground state potential and the dissociation energy of MgH,23,24 the potentials for the ground $X\,^1\Sigma_g^+$ and the excited $A\,^1\Sigma_u^+$ states of Li$_2$,24,25 as well as the potential energy functions for several other diatomic molecules.29

0021-9606/2013/138(23)/234103/12/$30.00 © 2013 AIP Publishing LLC 138, 234103-1

Downloaded 31 Jul 2013 to 129.97.80.64. This article is copyrighted as indicated in the abstract. Reuse of AIP content is subject to the terms at: http://jcp.aip.org/about/rights_and_permissions

In this study, we present the first application of the MLR form to the construction of a free-energy surface, as opposed to a potential-energy surface. The benchmark system under study is a water dimer made up of rigid monomers. Obtaining the pmf of the water dimer’s dissociation reaction requires enhanced sampling of the phase space. Metadynamics30,31 will be used to obtain the pmf in the short-range region and to showcase the usefulness of our method. We emphasize here that our proposed approach only requires simulations to be performed in a limited range of the reaction coordinate as opposed to the full range as in traditional techniques. This means that a smaller computational effort is required in order to obtain a global representation of the pmf. An analysis of the form of the dipole-dipole and dispersion forces will be used to provide a description for the long-range behaviour of the free energy.

The structure of the paper is as follows. Section II summarizes the methodology used in this work. It first describes how the pmf is obtained using metadynamics. Second, the parameters in the Morse/long-range potential used to fit the pmf and the theory behind the calculation of the long-range coefficient are described. Following this, a brief discussion of the calculation of $k^{\mathrm{TST}}$ using VTST is presented. Finally, results are presented and discussed in Sec. III, and concluding remarks are provided in Sec. IV.

II. THEORY AND METHODS

As stated above, our goal is to test the utility of the MLR form for representing pmf data. The pmf is related to the free energy, $A(\xi)$, associated with a reaction coordinate $\xi$, and it depends on the reduced density distribution function $\rho(\xi)$. The pmf, $w(\xi)$, is then defined as

$$w(\xi) = w(\xi^*) - k_B T \ln\left[\frac{\rho(\xi)}{\rho(\xi^*)}\right]. \tag{1}$$

In this equation, $w(\xi^*)$ and $\rho(\xi^*)$ are arbitrary reference values. The reduced density distribution function corresponds to the following canonical average:

$$\rho(\xi) = \frac{\int dx\,\delta(\xi^\ddagger[x] - \xi)\,e^{-U(x)/k_B T}}{\int dx\,e^{-U(x)/k_B T}}, \tag{2}$$

where $U(x)$ is the total potential energy and $x$ represents the configuration of the system. The quantity $\xi^\ddagger[x]$ is the reaction coordinate function and depends on $x$. The above expressions are generic, and different choices of reaction coordinate will have different measures of integration (with associated Jacobian factors).
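As an illustration of Eq. (1), the following minimal Python sketch converts density values into a pmf relative to a reference point. The function name, the units (kJ/mol), and the density values are assumptions made for this example only; they are not taken from the calculations reported here.

```python
import math

# Molar Boltzmann constant (k_B times Avogadro's number) in kJ/(mol K).
KB_KJ_PER_MOL_K = 0.0083144626


def pmf_from_density(rho, rho_ref, temperature, w_ref=0.0):
    """Eq. (1): w(xi) = w(xi*) - k_B T ln[rho(xi) / rho(xi*)], in kJ/mol."""
    if rho <= 0.0 or rho_ref <= 0.0:
        raise ValueError("densities must be positive")
    return w_ref - KB_KJ_PER_MOL_K * temperature * math.log(rho / rho_ref)


# Hypothetical densities on a grid of the reaction coordinate (illustrative only).
# The last grid point is used as the reference, so the pmf vanishes there.
densities = [0.5, 2.0, 8.0, 1.5, 1.0]
profile = [pmf_from_density(rho, densities[-1], 230.0) for rho in densities]
```

Regions where the density exceeds the reference value come out with a negative pmf, and depleted regions with a positive one.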

The reaction coordinate chosen for the present water dimer study is the distance, $r$, between the centres of mass of the two water monomers,

$$\xi = r = |\vec{r}_2 - \vec{r}_1|, \tag{3}$$

where $\vec{r}_i$ is a vector representing the position of the centre of mass of water monomer $i$. This reaction coordinate is related to the one used by Doren and co-workers in an earlier work.32 However, these authors used the distance between the centre of mass of one of the water monomers and the centre of mass of the dimer. A Jacobian factor of $r^2$ is associated with our choice of reaction coordinate.

An accurate sampling of the whole phase space is required in order to obtain the pmf over a wide range of values of the reaction coordinate. This requirement presents a possible complication, because dissociation of the water dimer is a rare event. The system remains trapped in a small region of the phase space, because the potential barriers it has to overcome to move to a different region are too high. Without enough sampling of the phase space, it is impossible to calculate an accurate pmf. Enhanced sampling is needed in order to overcome this barrier and explore the whole phase space. We used metadynamics30 as our enhanced sampling method.

A. Generating potential of mean force data

The goal of metadynamics is to reconstruct the free energy surface of a system, taking as a starting point a coarse-grained description. This starting point is provided by a few collective variables that the user needs to choose depending on the process under study. From the starting point of the simulation, a positive Gaussian potential centred on this point of the trajectory is added to the real energy landscape of the system. The system is therefore “discouraged” from coming back to that region of the landscape, and is thus more likely to explore another region. Positive Gaussian potentials continue to be added in a history dependent manner, favouring the unexplored regions of the phase space. The sum of these Gaussian potentials added at intervals $t_G$ creates a biasing potential, whose expression at time $t$ is

$$V_G(\xi^\ddagger(x), t) = \sum_{t' = t_G,\, 2t_G,\, 3t_G,\, \ldots} w \exp\left(-\frac{(\xi^\ddagger(x) - \xi_{t'})^2}{2\,\delta s^2}\right), \tag{4}$$

where $w$ is the height of the Gaussians, $\delta s$ is the width of the Gaussians, and $\xi_t = \xi^\ddagger(x(t))$ is the value of the collective variable at time $t$.

This biasing potential will grow until the system explores the whole phase space. At that point, the sum of the biasing potential and the free energy becomes approximately constant as a function of the collective variable. From this sum, the free energy landscape can be recovered as the negative of the sum of all the Gaussians,

$$\lim_{t \to \infty} V_G(\xi, t) \sim -A(\xi). \tag{5}$$
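The bias of Eq. (4) and the free-energy estimate of Eq. (5) can be sketched in a few lines of Python. This is only an illustration of the two formulas, not of a full metadynamics simulation: the function names, the deposited centres, and the Gaussian height and width used below are hypothetical.

```python
import math


def bias_potential(xi, centres, height, width):
    """Eq. (4): sum of Gaussians of given height and width centred at the deposited values."""
    return sum(
        height * math.exp(-((xi - centre) ** 2) / (2.0 * width ** 2))
        for centre in centres
    )


def free_energy_estimate(xi, centres, height, width):
    """Eq. (5): at long times the free energy is estimated as minus the accumulated bias."""
    return -bias_potential(xi, centres, height, width)


# Hypothetical collective-variable values visited at times t_G, 2 t_G, 3 t_G, ...
deposited = [3.0, 3.1, 2.9, 3.0]
well_estimate = free_energy_estimate(3.0, deposited, height=0.5, width=0.2)
```

Because the Gaussians pile up where the system spends most of its time, the estimate is most negative in the frequently visited region, as expected for a free-energy well.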

It is also possible to calculate a numerically “exact” reference pmf for the water dimer via direct integration. We will term such a calculation our Gold Standard (GS). This pmf will be useful for a thorough study of the behaviour of the MLR fitting procedure. It will also serve as a reference to assess the quality of free-energy calculation methods such as metadynamics. The radial density function or probability distribution function, $\rho(r)$, can be calculated as an integral over all of the Euler angles that describe the orientations of the two water molecules,

$$\rho(r) = \frac{1}{64\pi^4} \int d\Omega_1 \int d\Omega_2\, e^{-\beta U(\Omega_1, \Omega_2, r)}, \tag{6}$$

where $\Omega_i$ corresponds to the set of three Euler angles of molecule $i$, $U$ is the interaction potential, and $\beta = 1/k_B T$. The factor $1/64\pi^4$ normalizes by the total volume of the angular integrals, $(8\pi^2)^2 = 64\pi^4$. As defined, the above $\rho(r)$ tends to unity as $r$ tends to infinity since the interaction potential $U$ vanishes. This choice of normalization of the radial density, $\rho(r)$, will lead to a vanishing pmf in the limit of large $r$. This GS pmf will serve as a benchmark in the remainder of this paper.

B. Analytical representation of the potential of mean force: The Morse/long-range form

The representation discussed below is for dimer systems with a distance reaction coordinate. Generalization to multidimensional cases is possible, and a two-dimensional MLR fit to a PES with both radial and angular coordinates was performed in Ref. 28.

Providing a good expression to represent a potential over the whole range of the reaction coordinate $r$ ($0 \le r < \infty$) is quite challenging. It is not trivial to find an expression that can reproduce both the long-range behaviour of a pmf and its behaviour at short distance. Our solution to this problem was to apply the Morse/long-range potential. In previous work, MLR has been demonstrated to be extremely successful for representing PESs.21–29 Here, we propose to use the MLR for representing the pmf.

An MLR radial potential energy function has the form

$$V_{\mathrm{MLR}}(r) = D_e \left(1 - \frac{u_{\mathrm{LR}}(r)}{u_{\mathrm{LR}}(r_e)}\, e^{-\beta(r) \cdot y_p(r)}\right)^2, \tag{7}$$

in which $D_e$ is the well depth or binding energy, and $u_{\mathrm{LR}}$ defines the attractive limiting long-range behaviour. $\beta(r)$ is a slowly varying function that is written as a constrained polynomial, and $y_p(r)$ is a dimensionless radial variable that depends on the distance from the potential minimum $r_e$:

$$y_p(r) = \frac{r^p - r_e^p}{r^p + r_e^p}. \tag{8}$$

The power $p$ in the expression for this radial variable is a small positive integer, and is often set to be equal to the difference between the inverse power of the leading term ($n$) of the long-range function $u_{\mathrm{LR}}$ and the inverse power of the first long-range term predicted theoretically, but not included explicitly in the definition of $u_{\mathrm{LR}}$.25

The reason for using this radial expansion variable in place of $r$ itself is that it will always remain finite. Representing the MLR potential as a polynomial function of the internuclear distance $r$ is not recommended, because any such function of $r$ will approach $\pm\infty$ when $r \to \infty$. However, the representation of the potential as an exponential function of this dimensionless radial variable behaves in a very different manner. The radial variable approaches finite limits of $-1$ and $+1$, respectively, as $r \to 0$ and $r \to \infty$ (see Fig. 1), meaning that polynomial functions of these variables will approach finite values at these limits.

[Figure 1: plot of $(r^q - r_e^q)/(r^q + r_e^q)$, ranging from $-1.0$ to $1.0$, against $r/r_e$ from 0 to 5, with one curve for each of $q = 1$ to $12$ and the data region marked.]

FIG. 1. Behaviour of the radial variable $y_q(r)$ for various integer values of $q$.
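The limiting behaviour of the radial variable of Eq. (8) is easy to check numerically. The short Python sketch below is illustrative only; the function name and the sample value of $r_e$ are assumptions made for the example.

```python
def radial_variable(r, r_e, power):
    """y(r) = (r^power - r_e^power) / (r^power + r_e^power), the form used in Eq. (8)."""
    r_pow, r_e_pow = r ** power, r_e ** power
    return (r_pow - r_e_pow) / (r_pow + r_e_pow)


# Hypothetical equilibrium distance (in Å) used only to exercise the function.
r_e = 2.8
near_zero = [radial_variable(1.0e-6, r_e, q) for q in range(1, 13)]
far_away = [radial_variable(1.0e6, r_e, q) for q in range(1, 13)]
```

The variable vanishes at $r = r_e$ for every power, stays within $[-1, +1]$, and approaches its limits more quickly as the power increases.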

Dimensionless radial expansion variables also define the exponent coefficient function $\beta(r)$,

$$\beta(r) = \beta_\infty\, y_p(r) + [1 - y_p(r)] \sum_{i=0}^{N} \beta_i\, y_q(r)^i, \tag{9}$$

in which $\beta_\infty = \ln\left[2D_e / u_{\mathrm{LR}}(r_e)\right]$. The polynomial $\beta(r)$ is also a function of a second radial variable,

$$y_q(r) = \frac{r^q - r_e^q}{r^q + r_e^q}, \tag{10}$$

and the structure of Eq. (9) constrains $\beta(r) \to \beta_\infty$ as $r \to \infty$ and hence constrains $V_{\mathrm{MLR}}(r) \to u_{\mathrm{LR}}(r)$ at large $r$. This choice of dimensionless radial variables facilitates representing the exponent coefficient $\beta(r)$, which is a slowly varying function of $r$. The parameter $p$ has been defined before, and in this work, $q$ was allowed to vary from 2 to 12 to explore the suitability of different models.

From Eq. (7), we can see that the behaviour of the MLR potential is determined not only by the exponential term, but also by the pre-exponential factor, $u_{\mathrm{LR}}(r)/u_{\mathrm{LR}}(r_e)$. The long-range potential $u_{\mathrm{LR}}(r)$ in the latter is normally described by a sum of inverse-power terms,

$$u_{\mathrm{LR}}(r) = \frac{C_n}{r^n} + \frac{C_m}{r^m} + \ldots, \tag{11}$$

in which the leading term $C_n/r^n$ represents the principal interaction. The choice of the above term depends on the nature of the system under study.

In the present water dimer case, the long-range interactions are best described by an $r^{-6}$ term,

$$u_{\mathrm{LR}}(r) = \frac{C_6}{r^6}. \tag{12}$$

This choice is justified in Sec. II B 1 below. Note that as the leading long-range interaction has an inverse power in $r$ equal to 6, and since the first long-range interaction not included in this definition of $u_{\mathrm{LR}}$ would have the inverse power in the coordinate equal to 8, the value of $p$, as discussed, will be $p = 8 - 6 = 2$. The ratio

$$\frac{u_{\mathrm{LR}}(r)}{u_{\mathrm{LR}}(r_e)} = \left(\frac{C_6 / r^6}{C_6 / r_e^6}\right) = \frac{r_e^6}{r^6} \tag{13}$$

equals unity when $r = r_e$ and approaches $\infty$ when $r$ approaches zero.
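Equations (7) to (13) can be assembled into a short Python sketch of the MLR function with a single $C_6/r^6$ long-range term. The function names and all numerical parameter values below are hypothetical choices made for illustration; they are not fitted values from this work.

```python
import math


def y(r, r_e, power):
    """Dimensionless radial variable of Eqs. (8) and (10)."""
    r_pow, r_e_pow = r ** power, r_e ** power
    return (r_pow - r_e_pow) / (r_pow + r_e_pow)


def u_lr(r, c6):
    """Long-range function of Eq. (12)."""
    return c6 / r ** 6


def beta(r, r_e, d_e, c6, betas, p, q):
    """Exponent coefficient of Eq. (9), with beta_inf = ln[2 D_e / u_LR(r_e)]."""
    beta_inf = math.log(2.0 * d_e / u_lr(r_e, c6))
    y_p, y_q = y(r, r_e, p), y(r, r_e, q)
    polynomial = sum(b * y_q ** i for i, b in enumerate(betas))
    return beta_inf * y_p + (1.0 - y_p) * polynomial


def v_mlr(r, r_e, d_e, c6, betas, p=2, q=4):
    """MLR function of Eq. (7); it is zero at r_e and tends to D_e at large r."""
    ratio = u_lr(r, c6) / u_lr(r_e, c6)
    exponent = -beta(r, r_e, d_e, c6, betas, p, q) * y(r, r_e, p)
    return d_e * (1.0 - ratio * math.exp(exponent)) ** 2


# Hypothetical parameters (distances in Å, energies in kJ/mol), for illustration only.
R_E, D_E, C6 = 2.8, 15.4, 16800.0
BETAS = [-0.64, 0.41, -0.50, -0.60]
far_value = v_mlr(200.0, R_E, D_E, C6, BETAS)
```

With the energy zero placed at the minimum, as in Eq. (7), the constraint on $\beta_\infty$ makes the function approach $D_e - u_{\mathrm{LR}}(r)$ at large $r$, which is how the prescribed long-range tail is recovered.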

In our fitting of the pmf to the MLR potential, we used values of $r$ greater than a small cutoff value of 2.1 Å to avoid misbehaviour of the potential and its first derivatives at small distances.33

1. The coefficient of the leading long-range term in the $u_{\mathrm{LR}}$ function: $C_6$

$C_6$ is the long-range coefficient of the pre-exponential function $u_{\mathrm{LR}}$ which defines the attractive limiting long-range interaction in Eq. (7). In the current study, we describe the interaction between the monomers in the water dimer using the rigid TIP4P model,34 although our approach is general and could be applied to any interaction potential. The TIP4P model is a combination of a Lennard-Jones function and Coulombic potentials,

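$$V_{\mathrm{inter}} = 4\varepsilon_{OO}\left[\left(\frac{\sigma_{OO}}{r_{OO}}\right)^{12} - \left(\frac{\sigma_{OO}}{r_{OO}}\right)^{6}\right] + \sum_{i,j}^{\text{all pairs}} \left\{\frac{q_i q_j}{r_{ij}}\right\}, \tag{14}$$

where $\varepsilon_{OO}$ and $\sigma_{OO}$ are the Lennard-Jones parameters for the oxygen atom pair, $r_{OO}$ is the distance between the oxygens, $q_i$ and $q_j$ are the partial charges of any pair of atoms $i$ and $j$, and $r_{ij}$ is the distance between them.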

One can safely assume that the interaction $u_{\mathrm{LR}}$, which defines the long-range behaviour, will be orientation independent. Consequently, the Coulombic interaction of the two water monomers at long distance can be represented as a dipole-dipole interaction that has been Boltzmann averaged over all orientations. The angular average of the free energy dipole-dipole interaction is then35

$$u^{\mathrm{coul}}_{\mathrm{LR}}(r) = \frac{C^{\mathrm{coul}}_6}{r^6} = \frac{\mu_1^2 \mu_2^2}{3(4\pi\varepsilon_0)^2 k_B T\, r^6}, \tag{15}$$

in which $r$ is the distance between the dipoles, $\mu_i$ is the dipole moment of the water monomer $i$ in the TIP4P model assuming monomer equilibrium structure, $\varepsilon_0$ is the permittivity of vacuum, $k_B$ is the Boltzmann constant, and $T$ is the temperature. Note that at long range, the dipole-dipole free energy varies as $1/r^6$, and the quantity $C^{\mathrm{coul}}_6 = \mu_1^2 \mu_2^2 / [3(4\pi\varepsilon_0)^2 k_B T]$ acts as an effective temperature-dependent $C_6$ coefficient.

This Boltzmann average is valid when $k_B T > \mu_1 \mu_2 / (4\pi\varepsilon_0 r^3)$. Thus, the values of the coordinate where this long-range behaviour dominates are determined by the inequality35

$$r_{\text{long-range}} > \left(\frac{\mu_1 \mu_2}{4\pi\varepsilon_0 k_B T}\right)^{1/3}. \tag{16}$$
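The weak-coupling limit behind Eq. (15) can be checked with a small Monte Carlo average over orientations, in the spirit of Eq. (6). The sketch below is not the GS calculation of this work: it replaces the TIP4P interaction by two ideal point dipoles (an assumption made for simplicity, for which only the dipole direction matters), works in reduced units, and uses hypothetical function names.

```python
import math
import random


def random_unit_vector(rng):
    """Dipole direction for a uniformly distributed random orientation."""
    cos_theta = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    sin_theta = math.sqrt(1.0 - cos_theta ** 2)
    return (sin_theta * math.cos(phi), sin_theta * math.sin(phi), cos_theta)


def reduced_pmf_dipoles(coupling, n_samples, seed=0):
    """Estimate w/(k_B T) = -ln <exp(-U/k_B T)> for two point dipoles.

    coupling = mu1 mu2 / (4 pi eps0 r^3 k_B T); the separation vector lies along z.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        a = random_unit_vector(rng)
        b = random_unit_vector(rng)
        # Angular part of the dipole-dipole energy: a.b - 3 (a.z)(b.z).
        angular = a[0] * b[0] + a[1] * b[1] + a[2] * b[2] - 3.0 * a[2] * b[2]
        total += math.exp(-coupling * angular)
    return -math.log(total / n_samples)


coupling = 0.3                      # small value, i.e. inside the regime of Eq. (16)
estimate = reduced_pmf_dipoles(coupling, 200_000)
keesom = -coupling ** 2 / 3.0       # attractive term with the magnitude of Eq. (15), over k_B T
```

For small coupling the sampled free energy should agree with the $r^{-6}$ term to within the statistical noise, while for couplings of order one or larger the two are expected to deviate, consistent with the validity condition above.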

The contribution that the Lennard-Jones part of the potential makes to the long-range behaviour must also be considered. This contribution is small compared to the angle-averaged dipole-dipole interaction, but it is still large enough that it needs to be taken into consideration for the fine tuning of the pmf fitting. This need requires the total long-range potential to be the sum of the effective temperature-dependent term ($C^{\mathrm{coul}}_6 / r^6$) of Eq. (15) and the free energy contribution of the attractive part of the Lennard-Jones potential ($C^{\mathrm{LJ}}_6 / r^6$):

$$u_{\mathrm{LR}}(r) = \frac{C^{\mathrm{coul}}_6}{r^6} + \frac{C^{\mathrm{LJ}}_6}{r^6}. \tag{17}$$

The potential energy $U_{\mathrm{LJ}}(r)$ and the free energy $W_{\mathrm{LJ}}(r)$ have the same long-range behaviour, so we define

$$u_{\mathrm{LR}}(r) = \frac{C_6}{r^6} = \frac{\mu_1^2 \mu_2^2}{3(4\pi\varepsilon_0)^2 k_B T\, r^6} + 4\varepsilon_{OO}\left(\frac{\sigma_{OO}}{r}\right)^6. \tag{18}$$

The above theoretical analysis can be used for a wide variety of dimer systems for which the long-range contribution to the pmf can be deduced from monomer properties.
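To give a sense of the magnitudes involved in Eqs. (16) and (18), the following Python sketch evaluates both $C_6$ contributions and the long-range onset distance for identical monomers. The monomer parameters used here (a dipole moment of 2.18 D, $\varepsilon_{OO} = 0.155$ kcal/mol, and $\sigma_{OO} = 3.154$ Å) are assumed TIP4P-like values inserted for illustration and should be checked against the model reference before being relied on; the function names are likewise hypothetical.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m
N_A = 6.02214076e23       # Avogadro constant, 1/mol
DEBYE = 3.33564e-30       # 1 debye in C m
KCAL_TO_KJ = 4.184


def c6_coulomb(mu_debye, temperature):
    """Dipole-dipole C6 of Eq. (15) for identical monomers, in kJ mol^-1 Å^6."""
    mu = mu_debye * DEBYE
    c6_si = mu ** 4 / (3.0 * (4.0 * math.pi * EPS0) ** 2 * K_B * temperature)  # J m^6
    return c6_si * N_A / 1000.0 * 1.0e60


def c6_lennard_jones(eps_kcal_per_mol, sigma_angstrom):
    """Attractive Lennard-Jones coefficient 4 eps sigma^6 of Eq. (18), in kJ mol^-1 Å^6."""
    return 4.0 * eps_kcal_per_mol * KCAL_TO_KJ * sigma_angstrom ** 6


def long_range_onset(mu_debye, temperature):
    """Right-hand side of Eq. (16) for identical monomers, in Å."""
    mu = mu_debye * DEBYE
    return (mu ** 2 / (4.0 * math.pi * EPS0 * K_B * temperature)) ** (1.0 / 3.0) * 1.0e10


# Assumed TIP4P-like monomer parameters (illustrative, see the note above).
MU, EPS_OO, SIGMA_OO, T = 2.18, 0.155, 3.154, 230.0
c6_total = c6_coulomb(MU, T) + c6_lennard_jones(EPS_OO, SIGMA_OO)
onset = long_range_onset(MU, T)
```

Under these assumptions the Lennard-Jones term is a minor but non-negligible fraction of the total coefficient, in line with the discussion above, and the dipole-dipole term decreases with increasing temperature.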

C. Variational transition state theory and the reaction rate

At this point, we have finally arrived at a representation for a smooth pmf, which can be continuously differentiated, and is ready to be used to calculate properties such as the reaction rate using VTST. The reaction rate constant according to TST is obtained from the expression36

$$k^{\mathrm{TST}}_{r^\dagger} = \sqrt{\frac{1}{2\pi m k_B T}} \left(\frac{dA(r)}{dr}\right)_{r = r^\dagger}. \tag{19}$$

In this expression, $k_B$ is Boltzmann’s constant, $T$ is the temperature, and $A(r)$ is the Helmholtz free energy at a given value $r$ of the reaction coordinate. The reaction rate constant calculated using this expression is always an upper bound to the true classical rate constant.

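To obtain a variational rate constant $k^{\mathrm{VTST}}$, $r = r^\dagger$ is determined as the distance at which the reactive flux is minimized,

$$\min\left\{-\left(\frac{dA(r)}{dr}\right)\right\} = k_B T \left(\frac{d \ln\left[\int_{r_0}^{r} \exp(-w(r')/k_B T)\, r'^2\, dr'\right]}{dr}\right)_{r = r^\dagger} = k_B T\, \frac{\exp\left(-\frac{w(r^\dagger)}{k_B T}\right) r^{\dagger 2}}{\int_{r_0}^{r^\dagger} \exp\left(-\frac{w(r)}{k_B T}\right) r^2\, dr}. \tag{20}$$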
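The minimization in Eq. (20) can be sketched numerically as follows. The example works in reduced units ($k_B T = 1$, $m = 1$), uses a hypothetical Morse-shaped pmf rather than the water dimer pmf, evaluates the integral with the trapezoid rule, and searches for the minimum only inside a finite window of the reaction coordinate; all of these are assumptions of the sketch, as are the function names.

```python
import math


def flux_factor_profile(pmf, r_grid, kT):
    """exp(-w(r)/kT) r^2 / int_{r0}^{r} exp(-w(r')/kT) r'^2 dr' at each grid point after r0."""
    weights = [math.exp(-pmf(r) / kT) * r * r for r in r_grid]
    integral = 0.0
    profile = []
    for i in range(1, len(r_grid)):
        # Trapezoid rule for the running integral from r0 to r_grid[i].
        integral += 0.5 * (weights[i] + weights[i - 1]) * (r_grid[i] - r_grid[i - 1])
        profile.append((r_grid[i], weights[i] / integral))
    return profile


def variational_rate(pmf, r_grid, kT, mass):
    """Dividing surface that minimizes the flux factor inside the window, and the rate there."""
    r_dagger, flux = min(flux_factor_profile(pmf, r_grid, kT), key=lambda item: item[1])
    # Prefactor of Eq. (19) times k_B T gives sqrt(k_B T / (2 pi m)).
    return r_dagger, math.sqrt(kT / (2.0 * math.pi * mass)) * flux


def toy_pmf(r):
    """Hypothetical Morse-shaped pmf: well depth 8 k_B T, minimum at r = 1, zero at large r."""
    return 8.0 * ((1.0 - math.exp(-1.5 * (r - 1.0))) ** 2 - 1.0)


grid = [0.5 + 0.01 * i for i in range(551)]   # search window from r0 = 0.5 to r = 6.0
r_dagger, rate = variational_rate(toy_pmf, grid, kT=1.0, mass=1.0)
```

For this toy profile the minimum falls well outside the well region, where the decay of the Boltzmann factor is balanced by the growth of the $r^2$ Jacobian factor.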

III. RESULTS AND DISCUSSION

We now present results that show how the MLR approach can be used to obtain smooth and accurate potential of mean force curves over long distances without having to generate the pmf data over a wide range of distances. The water dimer, described with a TIP4P34 model, is used to illustrate our general approach. The reaction coordinate corresponds to the distance between the centre of mass positions of the two water monomers. Results based on the fit of pmf data generated by direct integration are first presented as a way to benchmark the approach. The fitting procedure is then applied to the results of metadynamics simulations.


[Figure 2: two panels, (a) N = 3 and (b) N = 4, each plotting the pmf (kJ/mol) against r (Å) at T = 230 K for the listed values of q together with the GS, with the fitting data range (2.1 to 5.3 Å) marked and an inset showing the pmf difference (kJ/mol) against r.]

FIG. 2. GS pmf values, represented as solid circles, and MLR fitted pmf for the water dimer at 230 K. (a) MLR fits with $N = 3$, $p = 2$, and $q = 2$ to $12$. The inset shows the difference between the fitted pmf and the GS: $\Delta\mathrm{pmf} = \mathrm{pmf}_{\mathrm{fit}}(r) - \mathrm{pmf}_{\mathrm{GS}}(r)$. (b) Same as (a) with $N = 4$, $p = 2$, and $q = 2$ to $9$.

A. Fitting the direct integration pmf to the MLR form

We have used Eq. (6) to compute the converged density $\rho(r)$ at a series of temperatures, and a direct Monte Carlo scheme to evaluate this integral over the Euler angles of each molecule. Such an approach is possible because the volume of the angular integral is known and finite: $64\pi^4$. The solid circles in Fig. 2 are results obtained from the above Gold Standard calculation for $T = 230$ K, for the full range of $r$ values; they serve as a reference when we assess the quality of our fitting procedure.

A particular model is defined by a set of values of the integers $p$ and $q$ and the order of the exponent polynomial $N$. For each such model, a nonlinear least-squares fit is performed using the program betaFIT37 to determine values of the $N + 3$ parameters: $D_e$, $r_e$, and $\beta_0$ through $\beta_N$. As discussed above in Eq. (13), in all these fits we set $p = 2$. The first series of fits of MLR functions to these points are based on data obtained from the GS approach for $r$ values between 2.1 and 5.3 Å. The number of expansion terms, $N$, in the exponent coefficient (see Eq. (9)) is $N = 3$ for Fig. 2(a) and $N = 4$ for Fig. 2(b). The range of $q$ values is $q = 2$ to $12$ for Fig. 2(a) and $q = 2$ to $9$ for Fig. 2(b). Values of $q$ and $N$ outside these ranges give potentials that have one of the following traits: (i) they have extra turning points, (ii) they have unnecessary $\beta$ parameters (rounded values equal to zero), or (iii) they yield a poorer quality of fit.

The results in Fig. 2 clearly show that the agreement with the GS is excellent for a wide range of models. This agreement is the first indication of the ability of the MLR functions to represent pmf data.

The dimensionless root mean square deviation ($\overline{dd}$) is the quantity that defines the quality of the fit, that is, the ability of the calculated or fitted parameters to reproduce the data. It is defined as

$$\mathrm{DRMSD} \equiv \overline{dd} = \left\{\frac{1}{N} \sum_{i=1}^{N} \left[\frac{y_{\mathrm{calc}}(i) - y_{\mathrm{obs}}(i)}{u(i)}\right]^2\right\}^{1/2}. \tag{21}$$

In this equation, $N$ is the number of experimental data points $y_{\mathrm{obs}}(i)$, $u(i)$ is the uncertainty in point $i$, and $y_{\mathrm{calc}}(i)$ is the predicted value of that point. When multiplied by the average uncertainty, $\overline{dd}$ defines the average difference between the predicted values and the original data. The uncertainties of the GS data are shown in Tables S.I–S.IV of the supplementary material.38
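Equation (21) translates directly into code. The Python sketch below is a minimal illustration with made-up numbers; the function name and the sample data are assumptions, not values from the tables of this work.

```python
import math


def dimensionless_rmsd(y_calc, y_obs, uncertainties):
    """Eq. (21): root mean square of the residuals, each scaled by its uncertainty."""
    if not (len(y_calc) == len(y_obs) == len(uncertainties)) or not y_calc:
        raise ValueError("inputs must be non-empty and of equal length")
    scaled = [((c - o) / u) ** 2 for c, o, u in zip(y_calc, y_obs, uncertainties)]
    return math.sqrt(sum(scaled) / len(scaled))


# Hypothetical data: each prediction is off by exactly one uncertainty.
dd = dimensionless_rmsd([1.1, 2.0], [1.0, 2.2], [0.1, 0.2])
```

A value near one therefore means that, on average, the fit reproduces the data to within their stated uncertainties.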

The dd of each fit is given in Tables I and II. All thedd values obtained using the MLR fits of the pmf data ofthe water dimer at T = 230 K are very close to one. Thus,

TABLE I. MLR parameters of a fit to the GS at T = 230 K. Numbers in parentheses represent uncertainties in the last digits shown.

N = 3

q    dd     β0       β1      β2       β3       Re           De          Vmin
2    1.80   −0.684   0.560   −0.170   −3.00    2.7996(36)   15.43(14)   −15.252(81)
3    1.70   −0.662   0.460   −0.800   −1.10    2.7999(34)   15.39(11)   −15.266(77)
4    1.57   −0.642   0.410   −0.500   −0.600   2.8001(33)   15.36(09)   −15.277(72)
5    1.41   −0.621   0.380   −0.380   −0.400   2.8004(30)   15.34(07)   −15.289(66)
6    1.29   −0.588   0.420   −0.340   −0.400   2.8015(29)   15.36(07)   −15.318(61)
7    1.19   −0.564   0.380   −0.320   −0.300   2.8009(28)   15.35(06)   −15.325(59)
8    1.19   −0.525   0.400   −0.330   −0.300   2.8011(29)   15.37(06)   −15.350(60)
9    1.11   −0.482   0.420   −0.350   −0.300   2.8015(28)   15.38(06)   −15.374(57)
10   1.17   −0.445   0.430   −0.370   −0.300   2.8014(32)   15.39(06)   −15.385(62)
11   1.22   −0.424   0.340   −0.380   −0.200   2.7994(36)   15.38(06)   −15.384(67)
12   1.23   −0.385   0.340   −0.410   −0.200   2.7990(39)   15.38(07)   −15.389(70)

TABLE II. MLR parameters of a fit to the GS at 230 K. Numbers in parentheses represent uncertainties in the last digits shown.

N = 4

q    dd     β0       β1      β2       β3       β4      Re           De         Vmin
2    1.13   −0.521   1.19    −3.90    −6.00    22.0    2.7998(22)   15.20(8)   −15.337(59)
3    1.09   −0.528   0.760   −2.00    −2.00    5.00    2.7997(22)   15.27(7)   −15.335(58)
4    1.13   −0.516   0.570   −1.35    −1.00    1.90    2.7991(24)   15.34(6)   −15.348(61)
5    1.05   −0.511   0.470   −0.970   −0.600   0.900   2.7990(23)   15.36(5)   −15.351(58)
6    1.04   −0.498   0.410   −0.760   −0.400   0.500   2.7990(25)   15.37(5)   −15.360(59)
7    1.09   −0.491   0.370   −0.620   −0.300   0.300   2.7990(28)   15.36(6)   −15.356(63)
8    1.03   −0.463   0.390   −0.560   −0.300   0.200   2.7997(29)   15.38(6)   −15.371(61)
9    1.13   −0.451   0.410   −0.470   −0.300   0.100   2.8005(34)   15.38(6)   −15.378(68)

the general quality of the MLR fits is very good, with very small changes between one fit and another, and sometimes no changes at all. In this paper all the parameters have been rounded by the sequential rounding and refitting procedure of Ref. 39.
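To illustrate how a row of Table I defines a curve, the sketch below evaluates an MLR function from the N = 3, q = 6 parameters. It relies on several assumptions that are not stated in this section: the standard MLR form with β(r) = yp(r) β∞ + [1 − yp(r)] Σ βi yq(r)^i and β∞ = ln[2De/uLR(re)], a single-term uLR(r) = C6/r^6, the reference distance of the radial variables set to re, and a C6 equal to the value labelled C6th in Fig. 4(a). The fitted offset Vmin is omitted, so the minimum sits at zero and the asymptote at De.

```python
import math


def mlr_potential(r, de, re, betas, c6, p=2, q=6):
    """Sketch of an MLR evaluation; zero at r = re, tends to de at large r."""

    def y(n):
        # Radial variable with the reference distance assumed equal to re
        return (r**n - re**n) / (r**n + re**n)

    u_lr = c6 / r**6          # single-term long-range function (assumption)
    u_lr_e = c6 / re**6
    beta_inf = math.log(2.0 * de / u_lr_e)
    poly = sum(b * y(q) ** i for i, b in enumerate(betas))
    beta = y(p) * beta_inf + (1.0 - y(p)) * poly
    return de * (1.0 - (u_lr / u_lr_e) * math.exp(-beta * y(p))) ** 2


# Table I, N = 3, q = 6; C6 taken from the C6th label of Fig. 4(a) (assumption)
DE, RE, C6 = 15.36, 2.8015, 16758.0
BETAS = [-0.588, 0.420, -0.340, -0.400]

for r in (2.6, RE, 3.5, 5.3, 200.0):
    print(r, mlr_potential(r, DE, RE, BETAS, C6) - DE)
```

Subtracting De in the last line expresses the curve relative to the dissociation asymptote, which is the convention used for the pmf plots.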

As stated before, the GS pmf data to which the MLR function was fitted range from 2.1 to 5.3 Å along the reaction coordinate r. This range was selected so that it would include the behaviour of the attractive well while omitting both the long-range interaction defined by the inequality (16) and the steepest portion of the repulsive wall. For this selection, we defined the long-range interaction asymptote as the zero of the pmf scale. Then, we selected only one point from the repulsive part of the curve with values of the pmf higher than zero to be included as input data in the MLR fit, leaving aside the rest of the repulsive wall.
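The data selection described above can be sketched as a simple filter. The text does not say which positive point on the repulsive wall was retained; the sketch assumes it is the one closest to the well (largest r), and the function name `select_fit_points` and the sample data are hypothetical.

```python
def select_fit_points(points, r_min=2.1, r_max=5.3):
    """Filter (r, pmf) pairs, with the asymptote at pmf = 0, for an MLR fit."""
    window = sorted(p for p in points if r_min <= p[0] <= r_max)
    kept = [p for p in window if p[1] <= 0.0]
    positive = [p for p in window if p[1] > 0.0]
    if positive:
        # Assumption: retain only the positive point nearest the attractive well
        kept.append(max(positive, key=lambda p: p[0]))
    return sorted(kept)


# Hypothetical pmf samples (r in Å, pmf in kJ/mol), not GS data
sample = [(2.0, 30.0), (2.1, 12.0), (2.2, 3.0), (2.3, -4.0),
          (2.8, -15.0), (5.3, -0.7), (6.0, -0.3)]
print(select_fit_points(sample))
```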

We now turn our attention to the change in Helmholtz free energy with respect to a change in the reaction coordinate, −dA(r)/dr. This quantity can be used to predict rate constants using variational transition state theory (see Eq. (19)). Values of −dA(r)/dr are obtained from the pmf using Eq. (20), and results are presented in Fig. 3, based on pmf functions defined by the fits of Fig. 2.

We first observe that the results are all very similar irrespective of the MLR model used, although the fits with N = 4 give slightly smaller values of dd. The insets show that the spread of values for different fits is of the order of 2% of the value of −dA(r)/dr at the minimum, r†. Since the VTST rate constant is proportional to −dA(r)/dr|r=r† (see Eq. (19)), we expect the MLR fitting procedure to introduce uncertainties of about 2% into values of the rate constants.
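The step from the spread in −dA(r)/dr to the uncertainty in the rate constant can be written out schematically. The prefactor c(T) below is a placeholder for whatever multiplies −dA(r)/dr in Eq. (19); the only assumption is that it does not depend on the choice of MLR model.

```latex
k_{\mathrm{VTST}} = c(T)\left[-\frac{dA(r)}{dr}\right]_{r=r^{\dagger}}
\quad\Longrightarrow\quad
\frac{\delta k_{\mathrm{VTST}}}{k_{\mathrm{VTST}}}
= \frac{\delta\left[-dA(r)/dr\right]_{r=r^{\dagger}}}{\left[-dA(r)/dr\right]_{r=r^{\dagger}}}
\approx 0.02
```

Under that assumption, a relative spread of 2% in the minimum of −dA(r)/dr carries over unchanged to the rate constant.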

Figure 3 shows that q values such as 6, 7, or 8 can lead to −dA(r)/dr|r=r† values very close to the minimum value obtained from the GS reference. To illustrate this point, let us examine the behaviour of the radial variable yq(r) with different values of q. As discussed in Subsection II B, the behaviour of the radial variable will strongly influence the behaviour of any polynomial that depends on it. Thus, selecting the values of q that give a well-behaved yq(r) function is extremely important.

The plot of yq(r) at different values of the integer q is shown in Fig. 1. From the plot we can see that, for small values of q, the radial variable is not able to explore its whole range [−1, 1] in the data region. Thus, the fitted polynomial may not lead to accurate extrapolation at long range, since it will still be changing appreciably outside the data region within which the MLR parameters are determined. However, larger values of q will lead to a better coverage of the range of yq(r) within the data region, while outside the data region the radial variable will become almost "flat," and the polynomial fit will extrapolate in a more reliable way. It is worth noting, however, that a very large q value will lead to a stiff yq(r) coordinate, which is "flat" over a significant fraction of the data range. Such a stiff coordinate requires a higher-order polynomial to obtain a good fit, and will yield a more complicated model.
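The trade-off described above can be checked numerically. The sketch assumes the usual form of the radial variable, yq(r) = (r^q − rref^q)/(r^q + rref^q), with a reference distance rref = 2.8 Å chosen here close to the fitted Re values of Table I; the definition actually used is the one given in Subsection II B.

```python
def radial_variable(r, q, r_ref=2.8):
    """Radial variable y_q(r); r_ref = 2.8 Å is an assumption for illustration."""
    return (r**q - r_ref**q) / (r**q + r_ref**q)


# Coverage of the interval [-1, 1] over the data region 2.1 to 5.3 Å,
# and the remaining change between the data edge and twice that distance
for q in (2, 6, 12):
    low = radial_variable(2.1, q)
    high = radial_variable(5.3, q)
    beyond = radial_variable(10.6, q) - high
    print(f"q={q:2d}  y(2.1)={low:+.3f}  y(5.3)={high:+.3f}  change beyond data={beyond:.4f}")
```

Small q leaves much of the interval unexplored inside the data region and keeps changing outside it, whereas large q saturates early, which is the stiffness mentioned in the text.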

[Figure 3: −dA(r)/dr (kJ/mol Å) versus r (Å) at T = 230 K for (a) N = 3 with q = 2 to 12 and (b) N = 4 with q = 2 to 9, together with the GS and IG curves.]

FIG. 3. Water dimer −dA(r)/dr at 230 K. The labelling is the same as in Fig. 2. The insets represent a magnification near the minima of the curves. The ideal gas result, labelled IG, is also presented.


We opt for a compromise and have empirically tested various choices of q. Our criterion for choosing the parameters is to have a simple model with few fitting parameters. This model is also required to yield physical results over a wide range of r values. To test this, we calculate the effective leading coefficient of the long-range polynomial, $C_6^{\mathrm{effec}}(r)$, and plot it as a function of 1/r^2.

The quantity $C_6^{\mathrm{effec}}(r)$ is defined by the expression

\[ C_6^{\mathrm{effec}}(r) = r^{6}\,[D_e - V(r)]. \qquad (22) \]

This expression is suggested because at long range any intermolecular potential behaves as (Ref. 22)

\[ V(r) = D_e - \frac{C_n}{r^{n}} - \frac{C_m}{r^{m}} - \dots \qquad (23) \]

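In this study, we describe the pre-exponential long-range function uLR of the TIP4P water dimer MLR potential as the single term shown in Eq. (12). However, at long range, the exponential factor of Eq. (7) approaches (Ref. 37)

\[ e^{-\beta(r)\,y_p(r)} \simeq e^{-\beta_\infty}\left(1 + \frac{A_p}{r^{p}} + \frac{A_{p,q}}{r^{p+q}} + \dots\right) \simeq \frac{u_{\mathrm{LR}}(r_e)}{2D_e}\left(1 + \frac{A_p}{r^{p}} + \frac{A_{p,q}}{r^{p+q}} + \dots\right). \qquad (24) \]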

Then, taking into consideration that p = 2 in all our models, the whole TIP4P water dimer MLR potential at long distance approaches

\[ V(r) = D_e - \frac{C_6}{r^{6}} - \frac{C_{6+p}}{r^{6+p}} - \frac{C_{6+p+q}}{r^{6+p+q}}, \qquad (25) \]

and as r → ∞ our $C_6^{\mathrm{effec}}(r)$ will have the limiting behaviour

\[ C_6^{\mathrm{effec}}(r) = C_6 + \frac{C_8}{r^{2}} + \frac{C_6 A_2}{r^{2}}. \qquad (26) \]

With the help of graphs of $C_6^{\mathrm{effec}}(r)$ versus 1/r^2, we chose a model that yields a well-behaved and smooth function. Figure 4(a) illustrates a number of possible choices. Models with large positive or negative extrema as r → ∞ (that is, as the curves approach $C_6^{\mathrm{th}}$ on the ordinate) should be discarded because they display unphysical behaviour. Notice that for the two different orders of the polynomial, the models converge from above and below to a common smooth shape as q becomes large. We take this common large-q limit to define our best recommended models. The label $C_6^{\mathrm{th}}$ in the graph represents the approximate position of the value of our theoretical C6 calculated using Eq. (18).
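The diagnostic of Eq. (22) is easy to reproduce for any callable potential. In the sketch below the two-term tail `model_potential` and its C8 value are hypothetical and chosen only to show that, for a potential of the form of Eq. (23) with n = 6 and m = 8, the plot against 1/r^2 is a straight line with intercept C6 and slope C8.

```python
def c6_effective(potential, de, r):
    """Effective leading long-range coefficient of Eq. (22)."""
    return r**6 * (de - potential(r))


def model_potential(r, de=15.36, c6=16758.0, c8=50000.0):
    # Hypothetical long-range tail; c8 is an arbitrary illustrative value
    return de - c6 / r**6 - c8 / r**8


def intercept_and_slope(potential, de, r_a=20.0, r_b=40.0):
    """Straight line through two points of the C6_effec versus 1/r^2 plot."""
    x_a, x_b = 1.0 / r_a**2, 1.0 / r_b**2
    y_a = c6_effective(potential, de, r_a)
    y_b = c6_effective(potential, de, r_b)
    slope = (y_a - y_b) / (x_a - x_b)
    return y_a - slope * x_a, slope


intercept, slope = intercept_and_slope(model_potential, 15.36)
print(intercept, slope)
```

For a fitted MLR model the same two functions can be applied over a grid of r values to look for the large extrema that mark a model as unphysical.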

Results of GS calculations and MLR fits to the pmf function for T = 243, 273, and 303 K are shown in Figs. 5–7, respectively. As with the fits at T = 230 K, this set of MLR fits was obtained from the GS approach for r values between 2.1 and 5.3 Å. In this case, after doing the whole fitting analysis as before, we again found that the optimum models were those with q = 6 to 8. Fits with N = 3 and N = 4, with q = 6 to 8, are presented in Figs. 5(a), 6(a), and 7(a).

These figures clearly show that the agreement with the GS is again excellent, and that this agreement extends well past the domain of the points used in the fit. This time, a magnification of the fits and the GS in the region between 5.0 and 5.6 Å is shown in the insets of Figs. 5(a), 6(a), and 7(a). As shown

[Figure 4: $C_6^{\mathrm{effec}}(r)$ (kJ Å^6/mol) versus 1/r^2 (Å^−2) for MLR fits with N = 3 and N = 4. The limiting slope is labelled "slope ~ C8", and the theoretical intercepts marked on the panels are C6th = 16758 (T = 230 K), 15998 (T = 243 K), 14520 (T = 273 K), and 13335 (T = 303 K).]

FIG. 4. $C_6^{\mathrm{effec}}(r)$ versus 1/r^2 calculated from MLR fits with N = 3 to 4, p = 2, and q = 2 to 12. The labelling is the same as in Fig. 2. (a) T = 230 K, (b) T = 243 K, (c) T = 273 K, and (d) T = 303 K.

in Tables S.V–S.X of the supplementary material (Ref. 38), all the dd values obtained using the MLR fits of the GS pmf data of the water dimer, at T = 243, 273, and 303 K, are again very close to unity. As was seen in the fits for T = 230 K, it is sometimes the case that only limited change occurs between one fit and

[Figure 5: (a) pmf (kJ/mol) versus r (Å) with the fitting data range 2.1 to 5.3 Å marked; (b) −dA(r)/dr (kJ/mol Å) versus r (Å). T = 243 K; curves for q = 6, 7, 8 with N = 3 and N = 4, plus GS and IG.]

FIG. 5. GS, represented as solid circles, MLR fitted pmf for the water dimer and −dA(r)/dr at 243 K. (a) MLR fits with N = 3 to 4, p = 2, and q = 6 to 8. The inset shows a magnification of the fitted pmf and the GS in the region from 5.4 to 5.8 Å. (b) The labelling is the same as in (a). The insets represent a magnification near the minima of the curves from 5.2 to 5.8 Å. The ideal gas result, labelled IG, is also presented.

another, or no changes at all, given the generally good quality of the MLR fits.

We can now examine the variation in slope of the Helmholtz free energy dA(r)/dr along the reaction coordinate. In accordance with our previous findings, the results are similar for the different MLR models. As shown in the graphs, the spread of the values of −dA(r)/dr at the minimum, r†, is again of the order of 2%. Given that the rate constant is proportional to −dA(r)/dr|r=r† (see Eq. (19)), it is reasonable to predict that the model dependence of the MLR fitting procedure will introduce uncertainties of approximately 2% for rate constants. With regard to agreement with the GS reference, in Figs. 5–7 one can see that the fits with N = 4 can lead to slightly better −dA(r)/dr|r=r†.

Thus, the MLR approach can produce a family of fits that yield consistent results over a range of temperatures. A summary of the MLR parameters for these cases is presented in Tables I and II, and Tables S.V–S.X of the supplementary material (Ref. 38). The curves $C_6^{\mathrm{effec}}(r)$ in the lower panels of Fig. 4 display the same behaviour observed at 230 K. We discuss below the application of MLR fitting to realistic pmf data obtained from MD simulations.

[Figure 6: (a) pmf (kJ/mol) versus r (Å) with the fitting data range 2.1 to 5.3 Å marked; (b) −dA(r)/dr (kJ/mol Å) versus r (Å). T = 273 K; curves for q = 6, 7, 8 with N = 3 and N = 4, plus GS and IG.]

FIG. 6. GS, represented as solid circles, MLR fitted pmf for the water dimer and −dA(r)/dr at 273 K. (a) MLR fits with N = 3 to 4, p = 2, and q = 6 to 8. The inset shows a magnification of the fitted pmf and the GS in the region from 5.4 to 5.8 Å. (b) The labelling is the same as in (a). The insets represent a magnification near the minima of the curves from 5.1 to 5.5 Å. The ideal gas result, labelled IG, is also presented.

B. Fitting the pmf from metadynamics to the MLR form

We now present free energy results obtained using a metadynamics approach. The metadynamics simulations are performed with a Langevin thermostat at T = 243, 273, and 303 K. The SETTLE method (Ref. 40) is used to impose rigid bond constraints. The time step is 2 fs, the length of the simulations is 2 μs, and the files that track the variation of the pmf along the reaction coordinate are written every 20 ns. The software used to carry out these simulations was the scalable molecular dynamics package "Not (just) Another Molecular Dynamics program" (NAMD) (Ref. 41), version 2.8.
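The simulation settings quoted above imply the following step counts. This is plain arithmetic on the stated time step, run length, and output interval; the variable names are illustrative and are not NAMD keywords.

```python
FS_PER_NS = 1_000_000

time_step_fs = 2            # 2 fs time step
run_length_ns = 2_000       # 2 microseconds expressed in ns
output_interval_ns = 20     # pmf files written every 20 ns

total_steps = run_length_ns * FS_PER_NS // time_step_fs
steps_between_outputs = output_interval_ns * FS_PER_NS // time_step_fs
n_outputs = run_length_ns // output_interval_ns

print(total_steps, steps_between_outputs, n_outputs)
```

Each simulation therefore comprises 10^9 integration steps and yields 100 successive snapshots of the pmf, which is useful for judging convergence.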

Results for T = 243 K are shown in Fig. 8. The graph shows the pmf from "raw metadynamics" and the pmf obtained after fitting the metadynamics data to an N = 2 MLR form. It also shows the results of the GS calculation at T = 243 K (solid round points), and of the MLR fits to these data obtained using two different numbers of expansion terms in the exponent coefficient (see Eq. (9)): N = 3 and N = 4. In contrast with the GS case, the best fits of an MLR form to the metadynamics data are obtained with N = 2. Fits

[Figure 7: (a) pmf (kJ/mol) versus r (Å) with the fitting data range 2.1 to 5.3 Å marked; (b) −dA(r)/dr (kJ/mol Å) versus r (Å). T = 303 K; curves for q = 6, 7, 8 with N = 3 and N = 4, plus GS and IG.]

FIG. 7. GS, represented as solid circles, MLR fitted pmf for the water dimer and −dA(r)/dr at 303 K. (a) MLR fits with N = 3 to 4, p = 2, and q = 6 to 8. The inset shows a magnification of the fitted pmf and the GS in the region from 5.4 to 5.6 Å. (b) The labelling is the same as in (a). The insets represent a magnification near the minima of the curves from 5.0 to 5.5 Å. The ideal gas result, labelled IG, is also presented.

with values of N ≥ 3 produce MLR β parameters with uncertainties ≥100%. Nevertheless, the agreement with the GS results is excellent. In all of these plots, a value of q = 6 was used. Table III presents the values of the MLR parameters for the metadynamics fits with different q values.

[Figure 8: (a) pmf (kJ/mol) versus r (Å) with the fitting data range 2.3 to 5.2 Å marked; (b) −dA(r)/dr (kJ/mol Å) versus r (Å). T = 243 K; curves for metaD, metaD N = 2, GS, GS N = 3, and GS N = 4.]

FIG. 8. GS, represented as solid circles, MLR fitted pmf for the GS, pmf from metadynamics (metaD), MLR fit of the metaD data and −dA(r)/dr at 243 K. (a) metaD MLR fit with N = 2, MLR fits to the GS with N = 3 and N = 4. p = 2 and q = 6 for all the fits. (b) The labelling is the same as in (a) but only the −dA(r)/dr of the MLR fits are represented. The insets represent a magnification near the minima of the curves from 5.2 to 5.8 Å.

Any proper least-squares fit should weight the data with uncertainties which allow the data to be treated in a balanced fashion. However, metadynamics data are not accompanied by any natural intrinsic measure of their uncertainty, and we must impose a weighting scheme which will appropriately balance the very large positive values obtained for small r against the small negative (relative to the asymptote) values obtained for large r. To address this question, we have examined the pattern of the uncertainties associated with our GS results. The uppermost panel of Fig. 9 plots the absolute values of the GS uncertainties as a function of r, while the middle panel plots the absolute values of the relative uncertainties. While the latter yields a more equal distribution of values than the former, it "accidentally" gives anomalously large values at values of r where the pmf passes through zero as the function changes sign. The bottom panel damps the most extreme of these values by defining a modified uncertainty for each point as the average of the magnitudes of the relative uncertainties for three neighbouring points. After careful numerical testing of the uncertainties in the GS data, we established that using a constant relative uncertainty for our metadynamics data yields better fits. A plot of the GS absolute u(i), relative (u(i)/pmf(i)), and "average" ⟨(u(i)/pmf(i))⟩ uncertainties at T = 230, 243, 273, and 303 K is presented in Fig. 9. From the graphs, it seems clear that a constant fractional uncertainty will be a sensible way of weighting the points relative to one another as a function of distance.
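The three-point averaging of relative uncertainties described above can be sketched as follows. The treatment of the two end points, which have only one neighbour, is not specified in the text; the sketch assumes they are averaged over the points that exist. The function name and the sample numbers are hypothetical.

```python
def averaged_relative_uncertainty(pmf, u):
    """Average |u(i)/pmf(i)| over point i and its immediate neighbours."""
    relative = [abs(ui / fi) for ui, fi in zip(u, pmf)]  # fails if a pmf value is exactly zero
    averaged = []
    for i in range(len(relative)):
        # Assumption: end points use the neighbours that are available
        window = relative[max(i - 1, 0): i + 2]
        averaged.append(sum(window) / len(window))
    return relative, averaged


# Hypothetical values: the third point lies close to the zero crossing of the pmf
pmf_values = [-10.0, -5.0, -0.1, -2.0]
uncertainties = [0.1, 0.05, 0.05, 0.02]
relative, averaged = averaged_relative_uncertainty(pmf_values, uncertainties)
print(relative)
print(averaged)
```

The averaging spreads the anomalously large relative uncertainty near the zero crossing over its neighbours and reduces its peak value, which is the damping effect shown in the bottom panel of Fig. 9.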

TABLE III. MLR parameters of a fit to the metadynamics data at 243 K. Numbers in parentheses represent uncertainties in the last digits shown.

N = 2

q    dd      β0      β1     β2      Re         De        Vmin
6    0.566   −0.46   0.20   −0.70   2.801(7)   15.0(2)   −14.99(28)
7    0.651   −0.43   0.20   −0.60   2.801(9)   15.0(3)   −15.04(32)
8    0.623   −0.45   0.10   −0.50   2.799(9)   14.9(3)   −14.93(31)

[Figure 9: three panels versus r (Å) for T = 230, 243, 273, and 303 K: (a) absolute uncertainties u(i), (b) relative uncertainties |u(i)/pmf(i)|, (c) "average" relative uncertainties ⟨|u(i)/pmf(i)|⟩.]

FIG. 9. Uncertainties in the GS data points at 230 K, 243 K, 273 K, and 303 K. (a) Absolute uncertainties in the GS points u(i). (b) Relative uncertainties in the GS points u(i)/pmf(i). (c) "Average" relative uncertainties in the GS points. The average was calculated using the values of the relative uncertainties in point i, the predecessor point i−1, and the successor point i+1.

The dimensionless root mean square deviations (dd) of the metadynamics fits using different MLR models with q = 6, 7, and 8 are given in Table III. In general, the dd values obtained from the MLR fits to the metadynamics results are smaller than the dd values obtained from the MLR fits to the GS. As stated, the dimensionless root mean square deviation (dd) is a measure of the average discrepancy between the original data and the predicted values, taking into account the average of the uncertainties of the data. In the metadynamics case, these smaller dd values may mean that we overestimated the uncertainties. The values of −dA(r)/dr, the slope of the Helmholtz free energy along the reaction coordinate, obtained from the MLR fit of the GS are a lower bound of the calculations. Nevertheless, the agreement between the −dA(r)/dr for the GS fits and the metadynamics fits is remarkable despite the difference in calculation methods or choice of MLR models. Thus, the predicted rate constants will be very similar.
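The link between a dd value below one and overestimated uncertainties follows from Eq. (21): rescaling every uncertainty by a common factor s rescales dd by 1/s. The worked numbers below use the dd values of Table III and assume that the residuals themselves are unchanged by the rescaling.

```latex
u(i) \to s\,u(i)
\quad\Longrightarrow\quad
dd \to \frac{dd}{s},
\qquad
dd = 1 \ \text{requires}\ s = dd_{\mathrm{fit}} \approx 0.57 \text{ to } 0.65 .
```

In other words, the assumed constant relative uncertainty would have to be reduced to roughly 57% to 65% of its value for these fits to give dd = 1.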

The same trend observed at T = 243 K continues with the increase of temperature to T = 273 K and to T = 303 K. Figures 10 and 11 illustrate the similarities between the "raw metadynamics" calculations, the MLR-fitted metadynamics pmf, the GS and the MLR-fitted GS. The five curves behave similarly despite the different fitting parameters selected.

[Figure 10: (a) pmf (kJ/mol) versus r (Å) with the fitting data range marked; (b) −dA(r)/dr (kJ/mol Å) versus r (Å). T = 273 K; curves for metaD, metaD N = 2, GS, GS N = 3, and GS N = 4.]

FIG. 10. GS, represented as solid circles, MLR fitted pmf for the GS, pmf from metaD, MLR fit of the metaD data and −dA(r)/dr at 273 K. (a) metaD MLR fit with N = 2, MLR fits to the GS with N = 3 and N = 4. p = 2 and q = 6 for all the fits. (b) The labelling is the same as in (a) but only the −dA(r)/dr of the MLR fits are represented. The insets represent a magnification near the minima of the curves from 5.1 to 5.5 Å.

The −dA(r)/dr curves are also very similar, with the MLR-fitted GS always being a lower bound of the calculations. The parameters of the MLR fits for the metadynamics data at T = 273 K and T = 303 K are presented in Tables S.XI and S.XII of the supplementary material (Ref. 38). At T = 303 K, we present the parameters of the fits with N = 2 and N = 3 as well.

C. Obtaining the variational rate constant kVTST

As the VTST method specifies, we chose the minimum value of the reactive flux −dA(r)/dr at each of the

TABLE IV. Values of the variational rate constant kVTST for different models at different temperatures.

T (K)   GS            GS fit        Metadynamics fit   Literature
230     1.22 × 10^9   1.22 × 10^9                      7.0 × 10^10 [Ref. 32]; 8.568 × 10^10 [Ref. 42]
243     2.14 × 10^9   2.14 × 10^9   2.21 × 10^9        4.75 × 10^9 [Ref. 43]; 2.84 × 10^11 [Ref. 43]
273     6.22 × 10^9   6.23 × 10^9   6.08 × 10^9        8.42 × 10^9 [Ref. 43]; 7.66 × 10^11 [Ref. 43]
303     1.39 × 10^10  1.39 × 10^10  1.45 × 10^10       1.31 × 10^10 [Ref. 43]; 1.72 × 10^12 [Ref. 43]

[Figure 11: (a) pmf (kJ/mol) versus r (Å) with the fitting data range 2.3 to 5.2 Å marked; (b) −dA(r)/dr (kJ/mol Å) versus r (Å). T = 303 K; curves for metaD, metaD N = 2, GS, GS N = 3, and GS N = 4.]

FIG. 11. GS, represented as solid circles, MLR fitted pmf for the GS, pmf from metaD, MLR fit of the metaD data and −dA(r)/dr at 303 K. (a) metaD MLR fit with N = 2, MLR fits to the GS with N = 3 and N = 4. p = 2 and q = 6 for all the fits. (b) The labelling is the same as in (a) but only the −dA(r)/dr of the MLR fits are represented. The insets represent a magnification near the minima of the curves from 5.0 to 5.5 Å.

temperatures used in this study. Following this, we calculate the variational rate constant using Eq. (19).

The graph of the temperature dependence of the rate constant is shown in Fig. 12. For comparison purposes, we selected three sets of data: the GS, one GS MLR fit with N = 4 and q = 6, and one metadynamics MLR fit with N = 2 and q = 6. The figure illustrates the agreement between the three calculations, demonstrating the suitability of the MLR approach for modelling free energy surfaces.
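The temperature dependence can be summarised by a straight-line fit of ln(kVTST) against 1/T. The sketch below applies an ordinary least-squares fit to the GS column of Table IV. Interpreting the slope as an effective activation energy is an assumption of this illustration rather than an analysis performed in the paper, and the units of the prefactor are those of the tabulated rate constants.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # molar gas constant in kJ/(mol K)


def arrhenius_fit(temperatures, rate_constants):
    """Least-squares line through (1/T, ln k); returns (E_act in kJ/mol, prefactor)."""
    xs = [1.0 / t for t in temperatures]
    ys = [math.log(k) for k in rate_constants]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_mean - slope * x_mean
    return -slope * R_KJ_PER_MOL_K, math.exp(intercept)


# GS column of Table IV
temperatures = [230.0, 243.0, 273.0, 303.0]
k_gs = [1.22e9, 2.14e9, 6.22e9, 1.39e10]
e_act, prefactor = arrhenius_fit(temperatures, k_gs)
print(f"effective activation energy ~ {e_act:.1f} kJ/mol")
```

The same function can be applied to the GS-fit and metadynamics-fit columns to compare the three data sets on an equal footing.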

Table IV lists the values of kVTST calculated with the three methods. Two separate TIP4P results at T = 230 K from Refs. 32 and 42 are presented for comparison. Our results differ significantly from those. In light of the excellent agreement between the present GS and metadynamics results, one could attribute this difference to insufficient convergence of the umbrella sampling approach used in Refs. 32 and 42. Such an error could in turn lead to an incorrect description of the long-range free energy. Results from Ref. 43 are also given for reference but cannot be used for a direct comparison since they are based on a different interaction model and rate theory.

[Figure 12: ln(kVTST) versus 1/T for the GS, the GS fit with N = 4 and q = 6, and the metaD fit with N = 2 and q = 6.]

FIG. 12. Values of ln(kVTST) versus 1/T calculated for the GS, an MLR fitted model of the GS with N = 4 and q = 6, and an MLR fitted model of the metadynamics calculation with N = 2 and q = 6.

IV. CONCLUDING REMARKS

In this work, we have successfully applied, for the first time, the MLR function to model free energy surfaces. We have demonstrated the suitability of the MLR potential forms to accurately reproduce a free energy landscape, exemplified by the TIP4P water dimer dissociation reaction. The MLR models have yielded results that are in excellent agreement with the "exact" Monte Carlo or GS calculations. This agreement can be seen in the small values of the dimensionless root mean square deviations (dd) and in the close agreement of the rate constants calculated by both methods.

We have also used an enhanced sampling simulation method called metadynamics to test our calculations. Metadynamics leads to very accurate kinetics, with values of the rate constant strikingly similar to the ones obtained by the GS calculations and the MLR-fitted GS calculations. The differences in the values of the free energy potentials obtained from the MLR-fitted GS and the MLR-fitted metadynamics methods are below 1 kJ/mol (that is, below a quarter of a kcal/mol). The accuracy of our approach has been tested at several temperatures. The variation of the rate constants with the temperature shows an "Arrhenius"-like behaviour.

Our approach is general and can be applied to a variety of systems, since the β(r) function that appears in the MLR form (Eq. (7)) can be chosen with flexibility. Recent results show that the use of a spline function in the definition of β(r) allows one to represent potentials with multiple minima (Ref. 44). One then needs to generate pmf data in the short-range region of a dissociation reaction coordinate and then determine the long-range nature of the interaction free energy based on monomer properties. Because pmf data only need to be generated in the short-range region, a significant reduction in computational cost can be achieved compared with approaches that require pmf data in the full range of interest. This information is then used to perform the MLR fit. The MLR-fitting software betaFIT used in this study can generate a series of MLR-fitted potentials that give apparently well-behaved functions. One has to select a model with the proper physical behaviour.


Note that MLR forms are not limited to one-dimensional reaction coordinates. The MLR approach has successfully been used to obtain fits to potential energy surfaces with three (Ref. 27) and four (Ref. 45) dimensions. Monomer flexibility can even be included in the MLR fit (Ref. 46). We therefore expect that it will be possible to fit multidimensional potentials of mean force with the MLR approach. We anticipate, however, that problems could be encountered when a barrier lies above the asymptote in a dissociating reaction coordinate. Further work will be required in that particular area. We point out that the direct Monte Carlo approach developed to generate the GS results can be used to obtain pmf data for a wide variety of rigid dimer systems. Work is also underway in this direction. Future work also includes applications to larger water clusters and protein-ligand complexes.

ACKNOWLEDGMENTS

We thank Dr. Alexander Brown and Fred McCourt for useful discussions. Y.T.M. thanks Lisa Verge, Mary McPherson, Ph.D. (candidate) Andrea Prier, and Erica McKellar for all their helpful suggestions while preparing this manuscript. This research has been supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Canada Foundation for Innovation (CFI). T.Z. thanks NSERC (PDF-403739-2011) and the Ministry of Research and Innovation of Ontario for postdoctoral fellowships.

1C. Chipot, R. Jaffe, B. Maigret, D. A. Pearlman, and P. A. Kollman, J. Am.Chem. Soc. 118, 11217 (1996).

2J. Henin, A. Pohorille, and C. Chipot, J. Am. Chem. Soc. 127, 8478 (2005).3R. P. Feynman, Statistical Mechanics: A Set of Lectures (Addison-Wesley,1992).

4C. Chipot, M. A. Wilson, and A. Pohorille, J. Phys. Chem. B 101, 782(1997).

5Y. Deng and B. Roux, J. Phys. Chem. B 113, 2234 (2009).6B. Roux, Comput. Phys. Commun. 91, 275 (1995).7J. Kirkwood, J. Chem. Phys. 3, 300 (1935).8G. K. Schenter, S. M. Kathmann, and B. C. Garrett, J. Chem. Phys. 116,4275 (2002).

9R. P. McRae, G. K. Schenter, B. C. Garrett, Z. Svetlicic, and D. G. Truhlar,J. Chem. Phys. 115, 8460 (2001).

10C. Alhambra, J. Corchado, M. L. Sanchez, M. Garcia-Viloca, J. Gao, andD. G. Truhlar, J. Phys. Chem. B 105, 11326 (2001).

11M. Garcia-Viloca, C. Alhambra, D. G. Truhlar, and J. L. Gao, J. Comput.Chem. 24, 177 (2003).

12T. D. Poulsen, M. Garcia-Viloca, J. L. Gao, and D. G. Truhlar, J. Phys.Chem. B 107, 9567 (2003).

13 S. Kerisit and K. M. Rosso, J. Chem. Phys. 131, 114512 (2009).

14 G. R. Haynes and G. A. Voth, J. Chem. Phys. 99, 8005 (1993).

15 J. R. Mathis and J. T. Hynes, J. Phys. Chem. 98, 5445 (1994).

16 E. C. Cyr and S. D. Bond, J. Comput. Phys. 225, 714 (2007).

17 R. van Zon and J. Schofield, J. Chem. Phys. 132, 154110 (2010).

18 B. A. Berg and R. C. Harris, Comput. Phys. Commun. 179, 443 (2008).

19 F. Calvo, Mol. Phys. 100, 3421 (2002).

20 F. Wang and D. P. Landau, Phys. Rev. Lett. 86, 2050 (2001).

21 R. J. Le Roy, Y. Huang, and C. Jary, J. Chem. Phys. 125, 164310 (2006).

22 R. J. Le Roy and R. D. E. Henderson, Mol. Phys. 105, 663 (2007).

23 A. Shayesteh, R. D. E. Henderson, R. J. Le Roy, and P. F. Bernath, J. Phys. Chem. A 111, 12495 (2007).

24 R. J. Le Roy, C. C. Haugen, J. Tao, and H. Li, Mol. Phys. 109, 435 (2011).

25 R. J. Le Roy, N. S. Dattani, J. A. Coxon, A. J. Ross, P. Crozet, and C. Linton, J. Chem. Phys. 131, 204309 (2009).

26 H. Li, R. J. Le Roy, P.-N. Roy, and A. R. W. McKellar, Phys. Rev. Lett. 105, 133401 (2010).

27 H. Li, N. Blinov, P.-N. Roy, and R. J. Le Roy, J. Chem. Phys. 130, 144305 (2009).

28 H. Li and R. J. Le Roy, Phys. Chem. Chem. Phys. 10, 4128 (2008).

29 R. J. Le Roy, “Determining equilibrium structures and potential energy functions for diatomic molecules,” Equilibrium Structures of Molecules, edited by J. Demaison and A. G. Csaszar (Taylor & Francis, London, 2010), Chap. 6, pp. 168–211.

30 A. Laio and F. L. Gervasio, Rep. Prog. Phys. 71, 126601 (2008).

31 G. Bussi, A. Laio, and M. Parrinello, Phys. Rev. Lett. 96, 090601 (2006).

32 Y. Ming, G. L. Lai, C. H. Tong, R. H. Wood, and D. J. Doren, J. Chem. Phys. 121, 773 (2004).

33 Reference 24 introduced “damping functions” into Eq. (11) to prevent such misbehavior but we prefer to omit that complexity here.

34 W. Jorgensen and J. Madura, Mol. Phys. 56, 1381 (1985).

35 J. N. Israelachvili, Intermolecular and Surface Forces (Academic Press Inc., California, 1991).

36 S. M. Kathmann, G. K. Schenter, and B. C. Garrett, J. Chem. Phys. 111, 4688 (1999).

37 R. J. Le Roy, betaFIT 2.0: a computer program to fit potential function points to selected analytic functions, University of Waterloo Chemical Physics Research Report No. CP-665, 2009, see http://leroy.uwaterloo.ca/programs/.

38 See supplementary material at http://dx.doi.org/10.1063/1.4810006 for additional tables containing details of the MLR fits.

39 R. J. Le Roy, J. Mol. Spectrosc. 191, 223 (1998).

40 S. Miyamoto and P. A. Kollman, J. Comput. Chem. 13, 952 (1992).

41 J. Phillips, R. Braun, W. Wang, J. Gumbart, E. Tajkhorshid, E. Villa, C. Chipot, R. D. Skeel, L. Kale, and K. Schulten, J. Comp. Chem. 26, 1781 (2005).

42 J. Z. Liu, L. Yang, and D. J. Doren, Chem. Phys. 323, 579 (2006).

43 L. Yao, R. X. He, A. M. Mebel, and S. H. Lin, Chem. Phys. Lett. 470, 210 (2009).

44 J. Tao, R. J. Le Roy, and A. Pashov, in Proceedings of 65th International Symposium on Molecular Spectroscopy, Columbus, Ohio, 2010, paper MG07.

45 H. Li, P.-N. Roy, and R. J. Le Roy, J. Chem. Phys. 132, 214309 (2010).

46 L. Wang, D. Xie, R. J. Le Roy, and P.-N. Roy, J. Chem. Phys. 137, 104311 (2012).
