A less expensive Ewald lattice sum

Dean R. Wheeler *, John Newman

Department of Chemical Engineering, University of California, Berkeley, CA 94720, USA

Received 13 August 2002; in final form 3 October 2002

Abstract

We present a treatment of the Ewald lattice sum that reduces program execution cost by 25% or more at the same level of accuracy. This is accomplished by optimizing over additional degrees of freedom introduced into the function that partitions the Coulombic potential between real- and reciprocal-space parts. The technique was tested in simulations of 1 M KCl in water. It is relatively simple to implement in existing codes, including those based on fast-Fourier-transform solutions to the lattice sum.

Published by Elsevier Science B.V.

1. Introduction

The Ewald sum and related lattice-sum methods are some of the better ways to handle Coulombic interactions in molecular dynamics and Monte Carlo simulations employing periodic boundary conditions [1]. Because the Coulombic interaction is long-ranged, a summation of all charge–charge interactions converges slowly and is therefore a large computational burden. The Ewald sum reduces the cost of the Coulomb-potential sum by splitting the potential into two parts, one of which is calculated directly, in real space, and the other of which is calculated in Fourier, or reciprocal, space.

Naturally, there is interest in optimizing the Ewald method in order to access greater time and length scales via simulation. This interest was further heightened with the realization that neglect of long-range Coulombic interactions leads to serious artifacts in dipolar liquid and bio-molecular simulations [2–4]. It has long been recognized that certain combinations of input parameters allow one to optimize computational cost for a desired calculation accuracy [5–9]. Perram et al. [10] first recognized that by a judicious choice of parameters the cost of the Ewald method could be made to scale with $N^{1.5}$, where $N$ is the number of charged particles. Development of solutions to the reciprocal-space sum based on the fast Fourier transform (FFT) has improved the cost scaling to $N \log N$ [11–14]. The particle–particle particle-mesh (PPPM) and particle-mesh Ewald (PME) methods seem to have received the most attention.

Despite a long history of Ewald lattice-sum algorithm development (starting in 1921 [15]), there has been little work in attempting to reduce the cost by varying the form of the splitting function. While it has been recognized that a variety of functional forms could be used [3,5,16,17], nearly all implementations of the Ewald method to the present day use a Gaussian-based function, $S(r) = \mathrm{erfc}(\alpha r)$, to perform the split between the real- and reciprocal-space Coulombic sums. This is because the Gaussian form performs quite well: it produces nearly equally rapid convergence in both the real and reciprocal sums. Recently, a few groups have experimented with new functional forms to improve convergence of the Ewald method [18–20]. We share their goal of reducing the cost of the Ewald method while maintaining satisfactory accuracy in the Coulombic potential. The present effort has been quite successful in reducing the cost of the Ewald method while at the same time being widely applicable and easy to implement in existing computer codes.

Chemical Physics Letters 366 (2002) 537–543

* Corresponding author. Present address: Department of Chemical Engineering, Brigham Young University, Provo, UT 84602, USA.

PII: S0009-2614(02)01644-5

2. A general Ewald sum

The physical premise undergirding the Ewald method is that the distribution of Coulombic point charges in the system is augmented by a distribution of diffuse countercharges [1]. That is, at each location of a point-charge particle, a diffuse charge of opposite sign and equal magnitude is placed. These countercharges have a radially symmetric distribution of charge governed by $\gamma(r)$, which is known as the charge-shape or core function and is traditionally taken to be Gaussian. The countercharges serve to screen or damp the Coulombic interactions, making them short-ranged and allowing the sum to be truncated at interparticle cutoff distance $r_c$. The damped (real-space) potential of the system is given by

$$U_{\mathrm{real}} = \frac{1}{2} \sum_i \sum_j q_i q_j \, \phi_{\mathrm{real}}(r_{ij}), \tag{1}$$

where $i$ and $j$ are particle indexes, $q_i$ is particle charge and includes a $(4\pi\varepsilon_0)^{-1/2}$ Coulombic factor, $r_{ij}$ is nearest-image interparticle distance, and $\phi_{\mathrm{real}}(r) = S(r)/r$. $S(r)$ is the splitting (or switching or damping) function and depends on $\gamma(r)$ as given later.

Next the augmented system must be corrected back to the original point-charge-only system. This is done by calculating the potential generated by the diffuse countercharges alone. This second sum of interactions is best accomplished by solving Poisson's equation in Fourier or reciprocal space. The resulting solution is

$$U_{\mathrm{recip}} = \frac{2\pi}{V} \sum_{\mathbf{h} \neq 0} \frac{\Gamma(h)}{h^2} \left| \sum_j q_j e^{i \mathbf{h} \cdot \mathbf{r}_j} \right|^2, \tag{2}$$

where $V = L^3$ is the unit cell volume, $\mathbf{h}$ is a reciprocal lattice vector having magnitude $h$, and $\Gamma(h)/h^2$ is the Fourier coefficient. Reciprocal lattice vectors are essentially the eigenvalues of the series solution and are defined for a cubic unit cell as $\mathbf{h} = 2\pi\mathbf{n}/L$, where $\mathbf{n}$ is a vector of three independent integers. The summation over $\mathbf{h}$ is performed in a spherical fashion, analogous to the real-space sum. Depending on the smoothness of the diffuse charges, $\Gamma(h)$ rapidly converges to zero with increasing $h$, and the sum can be truncated at finite $h_c$. It is common to report the lattice-sum cutoff for a cubic unit cell as an integer, $n_c^2$, through the relationship $n_c^2 = (h_c L / 2\pi)^2$; this convention will be followed here. Also, recall that the way in which FFT-based Ewald methods improve cost scaling performance is by rapid numerical approximation of the Fourier sum of Eq. (2). Besides this difference, the FFT-based methods fit squarely in the present framework.

The correction needs a correction. Eq. (2), for the sake of computational convenience, erroneously includes the potential of each diffuse countercharge interacting with itself. This is resolved with

$$U_{\mathrm{self}} = \frac{1}{2} \sum_i q_i^2 \, \phi_{\mathrm{recip}}(0), \tag{3}$$

where $\phi_{\mathrm{recip}}(r) = [1 - S(r)]/r$. An expanded self-interaction correction is required for multi-site molecules. That expansion, as well as a generalization of the Ewald sum to arbitrary parallelepiped simulation cells, is described in [21].

The total Coulombic cell potential is then

$$U_{\mathrm{coul}} = U_{\mathrm{real}} + U_{\mathrm{recip}} - U_{\mathrm{self}}. \tag{4}$$

As expected, the force on each particle is the negative gradient of the cell potential with respect to the particle's position, which is a straightforward operation [21].
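To make Eqs. (1)–(4) concrete, the following is a minimal Python sketch of the conventional analytic Ewald sum for a cubic cell, that is, with the Gaussian-based splitting function erfc(alpha r), for which the Fourier coefficient is exp(-h^2/(4 alpha^2)) and the self term uses 2 alpha/sqrt(pi). The function names `ewald_energy` and `rock_salt_cell`, the nearest-image pair loop, and the rock-salt test lattice are illustrative assumptions and not the authors' research code. Charges are in the reduced units of Eq. (1), and alpha is assumed large enough that nearest images suffice in the real-space sum.

```python
import cmath
import itertools
import math


def ewald_energy(pos, q, L, alpha, rc, nc2):
    """Coulombic cell potential U_real + U_recip - U_self for a cubic cell.

    Assumes the conventional splitting S(r) = erfc(alpha * r).
    """
    n_part = len(q)

    # Real-space sum, Eq. (1): each nearest-image pair is counted once,
    # which is equivalent to one half of the double sum.
    u_real = 0.0
    for i in range(n_part):
        for j in range(i + 1, n_part):
            d = [pos[i][k] - pos[j][k] for k in range(3)]
            d = [x - L * round(x / L) for x in d]  # nearest image
            r = math.sqrt(sum(x * x for x in d))
            if r < rc:
                u_real += q[i] * q[j] * math.erfc(alpha * r) / r

    # Reciprocal-space sum, Eq. (2): spherical truncation at |n|^2 <= nc2.
    n_max = math.isqrt(nc2)
    u_recip = 0.0
    for n in itertools.product(range(-n_max, n_max + 1), repeat=3):
        n2 = sum(c * c for c in n)
        if n2 == 0 or n2 > nc2:
            continue
        h = [2.0 * math.pi * c / L for c in n]
        h2 = sum(c * c for c in h)
        structure = sum(
            qj * cmath.exp(1j * sum(hc * x for hc, x in zip(h, rj)))
            for qj, rj in zip(q, pos)
        )
        u_recip += math.exp(-h2 / (4.0 * alpha**2)) / h2 * abs(structure) ** 2
    u_recip *= 2.0 * math.pi / L**3

    # Self-interaction correction, Eq. (3), with phi_recip(0) = 2 alpha / sqrt(pi).
    u_self = 0.5 * sum(qi * qi for qi in q) * 2.0 * alpha / math.sqrt(math.pi)

    return u_real + u_recip - u_self


def rock_salt_cell():
    """Eight alternating unit charges on a cubic cell with edge L = 2."""
    pos, q = [], []
    for site in itertools.product((0.0, 1.0), repeat=3):
        pos.append(site)
        q.append(1.0 if int(sum(site)) % 2 == 0 else -1.0)
    return pos, q, 2.0
```

As a usage check, `pos, q, L = rock_salt_cell()` followed by `u = ewald_energy(pos, q, L, alpha=4.0, rc=1.0, nc2=121)` gives `-u / 4` close to the rock-salt Madelung constant of about 1.7476, since the cell holds four ion pairs at unit nearest-neighbor spacing.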

3. A new splitting function

The starting point in developing expressions for the functions $\Gamma(h)$, $S(r)$, $\phi_{\mathrm{real}}(r)$, and $\phi_{\mathrm{recip}}(r)$ introduced above is to assume a charge-shape function $\gamma(r)$ [3]. By definition, $\gamma(r)$ must be normalized so that an integration over all space results in a value of unity. Our guiding philosophy is to build on the desirable convergence properties of the Ewald method by adding a few additional degrees of freedom on which to optimize. We therefore choose to start with the conventional Gaussian form for $\gamma(r)$, which we then perturb with a series of polynomial functions:

$$\gamma(r) = \pi^{-3/2} \alpha^3 e^{-(\alpha r)^2} \left[ 1 + \sum_{k=1}^{n} \beta_k \, p_k(\alpha r) \right]. \tag{5}$$

Here $\alpha$ is the adjustable convergence or scaling parameter. The $\beta_k$ are also adjustable and allow one to tune the spherically symmetric shape of $\gamma(r)$; setting all $\beta_k = 0$ recovers the conventional Ewald expressions found in the literature for $\gamma(r)$ and for all the expressions which follow.

The polynomials $p_k(s)$ are defined by

$$p_k(s) = \frac{(-1)^k H_{2k+1}(s)}{2^{2k+1} s}, \tag{6}$$

where $H_l(s)$ is the $l$th Hermite polynomial. Because of this choice for the $p_k$, $\gamma(r)$ is properly normalized for arbitrary choices of $\beta_k$, and the ensuing forms of $\Gamma(h)$ and $S(r)$ are made particularly simple. Table 1 lists the polynomials up to $k = 3$ used in the present work. Note that $p_0$ is not used in Eq. (5) but is used in Eq. (8).
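The definition in Eq. (6) can be checked against Table 1 with a short Python sketch. It assumes the physicists' Hermite polynomials, generated by the standard recurrence H_{m+1}(s) = 2 s H_m(s) - 2 m H_{m-1}(s); the helper names `hermite_coeffs`, `p_coeffs`, and `p_k` are illustrative and do not come from the paper.

```python
from fractions import Fraction


def hermite_coeffs(n):
    """Ascending power-series coefficients of the physicists' Hermite H_n."""
    prev, cur = [Fraction(1)], [Fraction(0), Fraction(2)]  # H_0 and H_1
    if n == 0:
        return prev
    for m in range(1, n):
        # H_{m+1}(s) = 2 s H_m(s) - 2 m H_{m-1}(s)
        nxt = [Fraction(0)] + [2 * c for c in cur]
        for i, c in enumerate(prev):
            nxt[i] -= 2 * m * c
        prev, cur = cur, nxt
    return cur


def p_coeffs(k):
    """Ascending coefficients of p_k(s) = (-1)^k H_{2k+1}(s) / (2^(2k+1) s)."""
    h = hermite_coeffs(2 * k + 1)  # odd polynomial, so h[0] is zero
    sign = -1 if k % 2 else 1
    # Dropping h[0] divides the odd polynomial by s exactly.
    return [sign * c / 2 ** (2 * k + 1) for c in h[1:]]


def p_k(k, s):
    """Evaluate p_k at s from its power-series coefficients."""
    return sum(float(c) * s**i for i, c in enumerate(p_coeffs(k)))
```

Working with exact fractions makes the comparison with Table 1 unambiguous; dividing the odd polynomial by s through a coefficient shift also avoids the apparent singularity of Eq. (6) at s = 0.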

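The Fourier-coefficient function $\Gamma(h)$ is simply the three-dimensional Fourier transform of $\gamma(r)$:

$$\Gamma(h) = \mathrm{FT}_3\{\gamma(r)\} = \int_0^\infty \frac{\sin(hr)}{hr} \, \gamma(r) \, 4\pi r^2 \, dr = e^{-b^2} \left[ 1 + \sum_{k=1}^{n} \beta_k b^{2k} \right], \tag{7}$$

where $b = h/(2\alpha)$ is a convenient dimensionless combination.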
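As a plausibility check of Eq. (7), the k = 1 term can be worked by hand. The sketch below writes $\alpha$ for the convergence parameter and $b = h/(2\alpha)$, takes the first-order polynomial of Table 1, $p_1(x) = \tfrac{3}{2} - x^2$, and uses the standard property that the three-dimensional Fourier transform of $r^2 f(r)$ is $-\nabla_h^2$ applied to the transform of $f$. It is an illustrative derivation under those assumptions and is not reproduced from the paper.

```latex
\begin{aligned}
\mathrm{FT}_3\!\left\{\pi^{-3/2}\alpha^{3}e^{-(\alpha r)^{2}}\right\}
  &= e^{-h^{2}/(4\alpha^{2})} = e^{-b^{2}},\\
\mathrm{FT}_3\!\left\{(\alpha r)^{2}\,\pi^{-3/2}\alpha^{3}e^{-(\alpha r)^{2}}\right\}
  &= -\alpha^{2}\nabla_h^{2}\,e^{-h^{2}/(4\alpha^{2})}
   = \left(\tfrac{3}{2}-b^{2}\right)e^{-b^{2}},\\
\mathrm{FT}_3\!\left\{p_1(\alpha r)\,\pi^{-3/2}\alpha^{3}e^{-(\alpha r)^{2}}\right\}
  &= \tfrac{3}{2}\,e^{-b^{2}}-\left(\tfrac{3}{2}-b^{2}\right)e^{-b^{2}}
   = b^{2}e^{-b^{2}}.
\end{aligned}
```

The last line is the k = 1 contribution in Eq. (7), and it vanishes at h = 0, which is why the perturbation leaves the normalization of the charge-shape function unchanged.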

Next we generate the splitting function $S(r)$ from a solution to Poisson's equation about a single spherically symmetric diffuse charge. After some manipulation the analytic result is

$$S(r) = \int_r^\infty \gamma(s) \, 4\pi s (s - r) \, ds = \mathrm{erfc}(\alpha r) - \frac{\alpha r}{\sqrt{\pi}} e^{-(\alpha r)^2} \sum_{k=1}^{n} \beta_k \, p_{k-1}(\alpha r). \tag{8}$$

The exponential term in Eq. (8) does not entail extra computational expense because it is already required for calculating the real-space force on the particles. Moreover, for those computer codes in which the potential is not tabulated, it is common to approximate $\mathrm{erfc}(s)$ with an expression which requires $\exp(-s^2)$ anyway.

With $S(r)$ in hand, it is a simple procedure to get the potential functions $\phi_{\mathrm{real}}(r) = S(r)/r$ and $\phi_{\mathrm{recip}}(r) = [1 - S(r)]/r$. In particular, the term $\phi_{\mathrm{recip}}(0)$, which goes into Eq. (3), is

$$\phi_{\mathrm{recip}}(0) = \frac{\alpha}{\sqrt{\pi}} \left[ 2 + \sum_{k=1}^{n} \beta_k \, p_{k-1}(0) \right]. \tag{9}$$
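The closed forms of Eqs. (5) and (7)–(9) translate almost line for line into code. The Python sketch below is one possible transcription, with the Table 1 polynomials hard-coded; the function names, the Simpson-rule quadrature, and its upper integration limit are illustrative assumptions rather than anything prescribed by the paper. The quadrature routine evaluates the integral definition in Eq. (8) so that it can be compared with the closed form.

```python
import math

# Table 1: p_0..p_3 as ascending coefficients in powers of x**2.
P_COEFFS = [
    [1.0],
    [1.5, -1.0],
    [3.75, -5.0, 1.0],
    [13.125, -26.25, 10.5, -1.0],
]


def p(k, x):
    """Perturbation polynomial p_k(x) from Table 1."""
    x2 = x * x
    return sum(c * x2**i for i, c in enumerate(P_COEFFS[k]))


def charge_shape(r, alpha, beta):
    """Charge-shape function of Eq. (5)."""
    s = alpha * r
    series = sum(bk * p(k, s) for k, bk in enumerate(beta, start=1))
    return math.pi**-1.5 * alpha**3 * math.exp(-s * s) * (1.0 + series)


def splitting(r, alpha, beta):
    """Closed-form splitting function S(r) of Eq. (8)."""
    s = alpha * r
    series = sum(bk * p(k - 1, s) for k, bk in enumerate(beta, start=1))
    return math.erfc(s) - s / math.sqrt(math.pi) * math.exp(-s * s) * series


def fourier_coefficient(h, alpha, beta):
    """Fourier coefficient of Eq. (7), with b = h / (2 alpha)."""
    b = h / (2.0 * alpha)
    series = sum(bk * b ** (2 * k) for k, bk in enumerate(beta, start=1))
    return math.exp(-b * b) * (1.0 + series)


def phi_recip_zero(alpha, beta):
    """Limit of [1 - S(r)] / r as r -> 0, Eq. (9)."""
    series = sum(bk * p(k - 1, 0.0) for k, bk in enumerate(beta, start=1))
    return alpha / math.sqrt(math.pi) * (2.0 + series)


def splitting_by_quadrature(r, alpha, beta, n=4000):
    """Simpson-rule estimate of the integral definition in Eq. (8)."""
    upper = r + 8.0 / alpha  # the Gaussian factor is negligible beyond this
    step = (upper - r) / n
    total = 0.0
    for i in range(n + 1):
        s = r + i * step
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * charge_shape(s, alpha, beta) * 4.0 * math.pi * s * (s - r)
    return total * step / 3.0
```

With the set 1 coefficients of Table 2, `beta = (-0.4715, 0.0688, -0.00311)`, the closed form and the quadrature should agree closely at any r, and evaluating the quadrature at r = 0 doubles as a check that the charge-shape function integrates to unity.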

4. The Coulombic pressure tensor

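Obtaining the Coulombic part of the atomic pressure tensor, $\mathbf{P}_{\mathrm{coul}}$, requires taking the derivative of the cell Coulombic potential with respect to a cell-basis matrix, while keeping constant the scaled particle positions in the unit cell, as described in [17]. The end result here is

$$\mathbf{P}_{\mathrm{coul}} V = \frac{1}{2} \sum_i \sum_j q_i q_j \left[ -\frac{d\phi_{\mathrm{real}}(r_{ij})}{dr_{ij}} \right] \frac{\mathbf{r}_{ij} \mathbf{r}_{ij}^T}{r_{ij}} + \frac{2\pi}{V} \sum_{\mathbf{h} \neq 0} \frac{\Gamma(h)}{h^2} \left| \sum_j q_j e^{i \mathbf{h} \cdot \mathbf{r}_j} \right|^2 \left[ \mathbf{I} - 2 G(b) \frac{\mathbf{h} \mathbf{h}^T}{h^2} \right], \tag{10}$$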

Table 1
Charge-shape perturbation polynomials

k    $p_k(x)$
0    $1$
1    $-x^2 + \tfrac{3}{2}$
2    $x^4 - 5x^2 + \tfrac{15}{4}$
3    $-x^6 + \tfrac{21}{2}x^4 - \tfrac{105}{4}x^2 + \tfrac{105}{8}$

where $\mathbf{I}$ is the identity matrix and adjacent vectors imply an outer product. The auxiliary function $G(b)$ is

$$G(b) = 1 + b^2 - \frac{\sum_{k=1}^{n} \beta_k k \, b^{2k}}{1 + \sum_{k=1}^{n} \beta_k b^{2k}}. \tag{11}$$

Recall that $b = h/(2\alpha)$. As is the case for the cell potential and forces, the pressure tensor requires a modified form when used in simulations of multi-site molecules rather than atoms [21,22].
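The auxiliary function of Eq. (11) is cheap to evaluate alongside the Fourier coefficient. The Python sketch below computes it and compares it with the expression 1 - (b/2) d ln(Fourier coefficient)/db, evaluated by central differences. That relation is obtained here by differentiating Eq. (7) and is offered only as a consistency check, not as the authors' stated derivation; the function names and the finite-difference step are likewise illustrative.

```python
import math


def fourier_factor(b, beta):
    """Fourier coefficient of Eq. (7) as a function of b = h / (2 alpha)."""
    series = sum(bk * b ** (2 * k) for k, bk in enumerate(beta, start=1))
    return math.exp(-b * b) * (1.0 + series)


def g_aux(b, beta):
    """Auxiliary function G(b) of Eq. (11)."""
    numerator = sum(bk * k * b ** (2 * k) for k, bk in enumerate(beta, start=1))
    denominator = 1.0 + sum(bk * b ** (2 * k) for k, bk in enumerate(beta, start=1))
    return 1.0 + b * b - numerator / denominator


def g_aux_numeric(b, beta, eps=1e-5):
    """Central-difference estimate of 1 - (b/2) d ln(fourier_factor)/db.

    Only valid where fourier_factor is positive at b - eps and b + eps.
    """
    dlog = (
        math.log(fourier_factor(b + eps, beta))
        - math.log(fourier_factor(b - eps, beta))
    ) / (2.0 * eps)
    return 1.0 - 0.5 * b * dlog
```

With all coefficients set to zero, `g_aux` reduces to 1 + b^2, the factor that appears in the conventional Ewald pressure tensor.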

5. Optimizing the parameters

In the conventional Ewald analytic method there are three arbitrary parameters, $\{r_c, \alpha, n_c^2\}$, on which one can optimize. Our starting point is to fix the desired errors in the real-space sum and the reciprocal-space sum (the nature of the errors will be discussed below). This determines $\alpha$ and $n_c^2$ for a given choice of $r_c$. The process is then repeated for various values of $r_c$ until the minimum in program execution cost is obtained. However, with the perturbation-series length set to $n = 3$, there are an additional three parameters, $\{\beta_1, \beta_2, \beta_3\}$, on which to optimize for cost, with the constraint that previously fixed error limits are maintained. Optimal sets of $\beta_k$ appear to be system independent and were therefore held constant in the optimization loop after their initial selection.

Combinations of parameters were tested by performing Ewald calculations on 30 configuration snapshots, obtained as follows. Ten configurations were extracted at 10 ps intervals from an equilibrated trajectory of 3680 molecules corresponding to 1 M KCl in water. Each water molecule was modeled with three partial-charge sites. The trajectory was generated by our molecular dynamics research code under isothermal and isobaric controls set to 298 K and 0.1 MPa, respectively. The code uses an analytic (non-FFT) form of the Ewald sum in conjunction with a multiple-time-step scheme, allowing the Ewald sum to be calculated only every 7.5 fs. Finally, an additional 20 configurations were taken from similar trajectories of 920 and 1725 molecules, to elucidate the size dependence of the Ewald sum and as an additional check.

Our optimization procedure is admittedly ad hoc, and we do not claim to have found a global optimum for the configurations tested. Indeed, our objective was to find parameter sets which seem to be widely applicable, rather than specifically tuned to one system and unit-cell size $L$. We present four representative parameter sets in Table 2. Set 0 corresponds to the conventional Ewald sum. For sets 1–3, the $\beta_k$ values were chosen to enhance simultaneous convergence of $S(r)$ and $\Gamma(h)$, as shown in Fig. 1a,b. Each of the sets requires a unique value of $\alpha$; parameter $c_1$ in Table 2 is a scaling factor to relate the optimal $\alpha$ from one set to another. Likewise, parameter $c_2$ is an empirical scaling factor to relate optimal $n_c^2$ values from one set to another.

As an aid to ourselves and others, we developed the following 'optimizing' heuristic which generates consistent errors and contains only one degree of freedom, $r_c$. We hope the heuristic will be as successful in minimizing program cost for other systems as it has been for 1 M KCl in water.

1. Choose a set of $\beta_k$ from Table 2.
2. Choose a value of $r_c$.
3. Let $\alpha = 2.4\, c_1 / (r_c - 0.2\ \mathrm{nm})$.
4. Let $n_c^2 = \mathrm{floor}[(0.67\ \mathrm{nm}/r_c)\, c_2 \alpha^2 L^2]$,

where function $\mathrm{floor}(x)$ returns the largest integer not exceeding its argument. Using the above procedure, one then iterates on the value of $r_c$ in order to find the fastest program execution rate. In the case of an analytic Ewald sum, the optimal choice will lead to a program cost that scales as $N^{1.4}$, a modest improvement over the $N^{1.5}$-scaling performance reported in [10]. For smaller system sizes, the optimal value of $r_c$ may be less than the minimum distance needed to compute intermolecular van der Waals interactions. In this case it is typically advantageous to use the van der Waals cutoff distance for $r_c$.
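Steps 1–4 of the heuristic, together with the scan over the real-space cutoff, can be sketched in a few lines of Python. The dictionary below transcribes Table 2, with lengths in nm. The function names and the `cost` callable are hypothetical: in practice `cost` would time one Ewald evaluation in the user's own code, and nothing here reproduces the authors' timing procedure.

```python
import math

# Table 2: set number -> ((beta_1, beta_2, beta_3), c_1, c_2)
PARAMETER_SETS = {
    0: ((0.0, 0.0, 0.0), 1.00, 1.00),
    1: ((-0.4715, 0.0688, -0.00311), 1.17, 0.50),
    2: ((-0.6103, 0.1104, -0.00599), 1.20, 0.39),
    3: ((-0.6956, 0.1395, -0.00822), 1.21, 0.35),
}


def ewald_parameters(set_id, rc, box_length):
    """Steps 1-4 of the heuristic; rc and box_length are in nm.

    Returns (alpha in 1/nm, nc2, beta).
    """
    beta, c1, c2 = PARAMETER_SETS[set_id]  # step 1
    if rc <= 0.2:
        raise ValueError("step 3 requires rc > 0.2 nm")
    alpha = 2.4 * c1 / (rc - 0.2)  # step 3
    nc2 = math.floor((0.67 / rc) * c2 * alpha**2 * box_length**2)  # step 4
    return alpha, nc2, beta


def fastest_cutoff(set_id, box_length, rc_candidates, cost):
    """Scan candidate cutoffs (step 2) and keep the cheapest one.

    cost(rc, alpha, nc2) is a user-supplied callable (hypothetical) that
    returns the measured or modeled execution time for those settings.
    """
    best = None
    for rc in rc_candidates:
        alpha, nc2, _ = ewald_parameters(set_id, rc, box_length)
        elapsed = cost(rc, alpha, nc2)
        if best is None or elapsed < best[0]:
            best = (elapsed, rc, alpha, nc2)
    return best[1:]
```

For example, `ewald_parameters(0, 1.0, 4.8)` evaluates the conventional set at a 1.0 nm cutoff in the 4.8 nm cell used for the 3680-molecule system. If the optimal cutoff falls below the van der Waals cutoff, the candidate list can simply be restricted to values at or above it.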

Table 2
System-size-independent parameter sets

Set   $\beta_1$   $\beta_2$   $\beta_3$    $c_1$   $c_2$
0     0           0           0            1       1
1     −0.4715     0.0688      −0.00311     1.17    0.50
2     −0.6103     0.1104      −0.00599     1.20    0.39
3     −0.6956     0.1395      −0.00822     1.21    0.35


Fig. 2 gives the overall program-execution cost obtained using steps 1–4 above, for simulations of 3680 molecules, where $L = 4.8$ nm. The optimum cost for set 3 is 27% below the optimum for set 0. Significant additional savings are possible if the starting point is a conventional Ewald sum that is not fully optimized.

Fig. 1. Shape of (a) $S(r)$ and (b) $\Gamma(h)$ for the parameter sets of Table 2, in the region of anticipated cutoffs.

Fig. 2. Overall program execution cost (arbitrary units of time) for the four parameter sets and 'optimizing' heuristic. The open circles indicate values of $r_c$ subsequently used to generate optima in Figs. 3 and 4. The curves are parabolic fits of the data.

Fig. 3. Relative errors in the (a) real and (b) reciprocal potential sums as a function of respective cutoffs. The open circles indicate respective optima for each set.

Figs. 3 and 4 illustrate the accuracy in Coulombic potential and forces generated by the above heuristic. The curve corresponding to each of the four parameter sets is an average taken over the ten 3680-molecule configurations. (The twenty configurations containing fewer molecules gave similar results.) The real-space potential errors given in Fig. 3a are defined as

$$\varepsilon_{U_{\mathrm{real}}}(r_c, n_c^2) = \frac{U_{\mathrm{real}}(r_c, n_c^2) - U_{\mathrm{real},\infty}}{U_{\mathrm{coul},\infty}}, \tag{12}$$

where subscript $\infty$ indicates evaluation at sufficiently large $r_c$ and/or $n_c^2$ that the result can be considered exact. $\varepsilon_{U_{\mathrm{recip}}}$ is obtained by replacing 'real' with 'recip' in Eq. (12). Fig. 4 gives the corresponding relative accuracy in the root-mean-square (RMS) force on each molecule, partitioned into real and reciprocal parts. The real-space force errors are defined as

$$\varepsilon_{f_{\mathrm{real}}}(r_c, n_c^2) = \left\{ \frac{\sum_i \left[ \mathbf{f}_{i,\mathrm{real}}(r_c, n_c^2) - \mathbf{f}_{i,\mathrm{real},\infty} \right]^2}{\sum_i \mathbf{f}_{i,\mathrm{coul},\infty}^2} \right\}^{1/2}, \tag{13}$$

where $\mathbf{f}_i$ is the force on molecule $i$ (either a single ion or a water) and the other subscripts are the same as in Eq. (12).
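The two error measures of Eqs. (12) and (13) can be computed as in the following Python sketch. It assumes that the truncated and reference ("exact") partial sums, and the reference total Coulombic values, are already available from two Ewald runs; the function names and argument layout are illustrative only.

```python
import math


def potential_error(u_part, u_part_exact, u_coul_exact):
    """Relative potential error of Eq. (12).

    u_part is the truncated real (or reciprocal) sum, u_part_exact the same
    sum at effectively infinite cutoff, u_coul_exact the exact total.
    """
    return (u_part - u_part_exact) / u_coul_exact


def rms_force_error(f_part, f_part_exact, f_coul_exact):
    """Relative RMS force error of Eq. (13).

    Each argument is a sequence of per-molecule force vectors (3-tuples).
    """
    numerator = sum(
        sum((a - b) ** 2 for a, b in zip(fa, fb))
        for fa, fb in zip(f_part, f_part_exact)
    )
    denominator = sum(sum(c * c for c in f) for f in f_coul_exact)
    return math.sqrt(numerator / denominator)
```

Swapping in the reciprocal-space partial sums gives the corresponding reciprocal errors, as described for Eq. (12).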

The open circles in Figs. 3 and 4 allow a comparison of the errors generated using the four parameter sets and the 'optimizing' heuristic. Interestingly, the real-space truncation contributes the greatest error to the Coulombic potential, whereas the reciprocal-space truncation contributes the greatest error to the Coulombic force. The four parameter sets, by design, generate nearly equal errors, namely about $10^{-4}$ in the potential and $10^{-3}$ in the RMS force.

These prescribed errors may be greater than some workers are willing to accept. However, for many simulations of realistic systems, insufficient sampling of phase space is the greatest source of error in calculated properties. A speedup of program execution, at the cost of a modest decrease in Coulomb-potential accuracy, can be advantageous since it allows more sampling of phase space for the same computational cost. In any case, one can modify the 'optimizing' heuristic to generate more accurate potentials by a modest increase in the respective constants 2.4 and 0.67 nm found in steps 3 and 4. It is here that use of the lower-numbered sets may be advantageous, since these sets exhibit smaller error magnitudes than the higher-numbered sets for large $n_c^2$, as shown in Fig. 3b and Fig. 4b. Although the reciprocal-space truncation errors for sets 1–3 exhibit oscillatory and non-monotonic behavior for increasing $n_c^2$, they exhibit the essential behavior of a predictable decay to zero for large $n_c^2$.

Our series perturbation method permits a significant reduction in program cost relative to the conventional Ewald method, while maintaining comparable accuracy. The configurations tested here are for 1 M KCl in water. While we believe that this system constitutes a reasonable test of our choice of Ewald parameters, other solvents may require an adjustment of the explicit constants in steps 3 and 4 of the heuristic to achieve the same accuracy, due to differences in electrostatic screening. Nevertheless, our experience indicates that the parameter sets in Table 2 are likely to be widely applicable. We therefore recommend the parameter sets and heuristic for general use, with the caveat that the accuracy of calculated properties be initially validated against a more accurate and expensive Ewald parameter set.

Fig. 4. Relative errors in the (a) real and (b) reciprocal RMS forces as a function of respective cutoffs. The open circles indicate respective optima for each set.


References

[1] M. Allen, D. Tildesley, Computer Simulation of Liquids, Oxford Science, Oxford, 1989.
[2] D. Adams, J. Chem. Phys. 78 (1983) 2585.
[3] B.A. Luty, I.G. Tironi, W.F. van Gunsteren, J. Chem. Phys. 103 (8) (1995) 3014.
[4] D.M. York, W. Yang, H. Lee, T. Darden, L.G. Pedersen, J. Am. Chem. Soc. 117 (17) (1995) 5001.
[5] B. Nijboer, F. de Wette, Physica 23 (1957) 309.
[6] M. Sangster, M. Dixon, Adv. Phys. 25 (1976) 247.
[7] D. Fincham, Molec. Simulation 13 (1994) 1.
[8] G. Hummer, Chem. Phys. Lett. 235 (1995) 297.
[9] Z. Wang, C. Holm, J. Chem. Phys. 115 (14) (2001) 6351.
[10] J. Perram, H. Petersen, S. DeLeeuw, Mol. Phys. 65 (1988) 875.
[11] U. Essmann, L. Perera, M.L. Berkowitz, T. Darden, H. Lee, L.G. Pedersen, J. Chem. Phys. 103 (19) (1995) 8577.
[12] D. York, W. Yang, J. Chem. Phys. 101 (1994) 3298.
[13] R. Hockney, J. Eastwood, Computer simulation using particles, McGraw-Hill, New York, 1981.
[14] E. Pollock, J. Glosli, Comp. Phys. Comm. 95 (1996) 93.
[15] P. Ewald, Ann. Physik (Leipzig) 64 (1921) 253.
[16] D.M. Heyes, J. Chem. Phys. 74 (3) (1981) 1924.
[17] N. Karasawa, W.A. Goddard III, J. Phys. Chem. 93 (1989) 7320.
[18] P.F. Batcho, T. Schlick, J. Chem. Phys. 115 (18) (2001) 8312.
[19] P.H. Hunenberger, J. Chem. Phys. 113 (23) (2000) 10464.
[20] H. Berendsen, in: Computer Simulation of Biomolecular Systems, 2, ES-COM, The Netherlands, 1993, p. 161.
[21] D.R. Wheeler, N.G. Fuller, R.L. Rowley, Molec. Phys. 92 (1) (1997) 55.
[22] D.R. Wheeler, R.L. Rowley, Molec. Phys. 94 (3) (1988) 555.

