CHEMISTRY

Modeling of Atomic RESP Charges with the Use of Topological Calculation Schemes

D. A. Shul’ga, A. A. Oliferenko, S. A. Pisarev, V. A. Palyulin, and Academician N. S. Zefirov

Received December 29, 2005

DOI: 10.1134/S0012500806050065

Moscow State University, Vorob’evy gory, Moscow, 119992 Russia

ISSN 0012-5008, Doklady Chemistry, 2006, Vol. 408, Part 1, pp. 76–79. © Pleiades Publishing, Inc., 2006. Original Russian Text © D.A. Shul’ga, A.A. Oliferenko, S.A. Pisarev, V.A. Palyulin, N.S. Zefirov, 2006, published in Doklady Akademii Nauk, 2006, Vol. 408, No. 3, pp. 340–343.

An important part of force fields and scoring functions used in molecular modeling is their electrostatic component. It is most often evaluated as the energy of the Coulomb interaction of a system of atom-centered point charges $q_i$. According to more rigorous approaches to describing noncovalent interactions [1], such a system should closely fit the molecular electrostatic potential (MEP, or molecular ESP) created by the electron density distribution.

Not all calculation schemes meet this requirement. The charges that optimally reproduce the MEP (hereinafter, ESP charges) are obtained by minimizing the difference between the quantum-chemical and classical Coulomb electrostatic potentials at the points of a three-dimensional grid around the molecule. ESP charges are widely used. However, they have some weaknesses; for example, they are often conformationally dependent, ill-conditioned on buried atoms, and not easily transferable between common functional groups in related molecules [2]. One major obstacle to the wide use of ESP charges in molecular modeling is the need for a tedious ab initio calculation of the molecule and, sometimes, of several of its conformers [3]. Among the schemes generating ESP charges, the most widely used is the restrained electrostatic potential (RESP) method [2, 4], which partially rectifies some of the weaknesses of the classical methods of generating ESP charges.

Previously [5, 6], we suggested a method for generating atomic charges based on a new principle of electronegativity equalization. The method likens the molecular structure to a closed electric circuit and electronegativity to the electric potential. The resulting matrix formalism, based on applying Kirchhoff’s and Ohm’s laws to this circuit, gives quantities that can be treated as atomic charges [5]. In this method, molecules are considered at the topological level; i.e., the calculations require only the structural formula, without resort to data on the molecular geometry. The parameters of the method are atomic or orbital electronegativities and hardnesses. Two versions of the method have been developed that differ in the level of structure representation: one based on a molecular graph (MG) and one based on an orbital graph (OG) [5]. The computational cost of the method is determined by the complexity of solving a system of $N$ linear equations in $N$ unknowns, where $N$ is proportional to the number of atoms in the structure.

The ability of topological schemes to reproduce the MEP is attractive for developing a tool for rapid generation of charges for the simulation of biomolecules. In this work, we study whether the suggested methods can be parametrized to reproduce atomic RESP charges [2, 4]. To do this, the model parameters are adjusted to minimize the deviation of the calculated charges from the reference RESP charges.

A set of 174 different uncharged organic structures was prepared for the study. The set contains representatives of the basic classes of organic compounds: alcohols, amines, thiols, carboxylic acids, halo derivatives, amides, nitro and nitroso compounds, aminoalcohols, thioamides, hydroxycarboxylic acids, enols, diazines, phosphorus and sulfur compounds in different oxidation states, and others. Atomic RESP charges were obtained by a single-stage scheme from the MEP values on spherical atom-centered grids [7] using the closed-shell Hartree–Fock approximation (the 6-31G* basis set) after geometry optimization.

The deviation for the entire set was estimated by the target function

$$
F_M(\chi_M, \eta_M, \ldots) = \sum_{i}^{N_{\mathrm{mol}}} \sum_{j}^{N_{\mathrm{at},i}} \left( q_{M,ij}^{\mathrm{calcd}} - q_{ij}^{\mathrm{RESP}} \right)^2, \qquad
q_{M,ij}^{\mathrm{calcd}} = q_{M,ij}^{\mathrm{calcd}}(\chi_M, \eta_M, \ldots), \quad M = \{\mathrm{MG}, \mathrm{OG}\},
\tag{1}
$$

where $q$ stands for the partial charges on atoms, $N_{\mathrm{mol}}$ is the number of molecules in the set, $N_{\mathrm{at},i}$ is the number of atoms in the $i$th molecule, and the variable parameters are the vector of the atomic or orbital electronegativities $\chi$ and, for the OG model, the vector of the orbital hardnesses $\eta$.
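As a minimal sketch of how the target function of Eq. (1) can be evaluated, the Python snippet below sums the squared deviations over all atoms of all molecules. The function name `target_function` and the toy charge values are illustrative assumptions, not code or data from this work; in practice the calculated charges would come from the MG or OG scheme for the current parameter vectors.

```python
from typing import Sequence


def target_function(
    calc: Sequence[Sequence[float]],
    resp: Sequence[Sequence[float]],
) -> float:
    """Eq. (1): sum over molecules i and atoms j of (q_calcd - q_RESP)^2.

    calc[i][j] and resp[i][j] are the calculated and reference charges
    of atom j in molecule i (hypothetical container layout).
    """
    if len(calc) != len(resp):
        raise ValueError("different number of molecules")
    total = 0.0
    for q_calc, q_resp in zip(calc, resp):
        if len(q_calc) != len(q_resp):
            raise ValueError("different number of atoms in a molecule")
        total += sum((c - r) ** 2 for c, r in zip(q_calc, q_resp))
    return total


# Toy data (assumed for illustration only): two molecules with 2 and 3 atoms.
calc_charges = [[0.1, -0.1], [0.2, -0.2, 0.0]]
resp_charges = [[0.0, 0.0], [0.2, -0.1, -0.1]]
print(target_function(calc_charges, resp_charges))  # about 0.04
```

An optimizer would call such a function once per iteration, after recomputing the charges with the updated electronegativity and hardness parameters.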

For optimization, a nongradient algorithm [8] was used, which is a hybrid of the simplex method and simulated annealing. The parameters described in [5] were taken as the starting parameters of the topological schemes. At each iteration, the parameters were updated, new charges for each structure in the set were calculated from the new parameter values, and the target function was evaluated anew. Stable minima of the target functions were attained for both models. The mean deviation of the atomic charges from the RESP charges was estimated by Eq. (2),

$$
F(\chi, \eta, \ldots) = \left( \frac{1}{N_{\mathrm{mol}}} \sum_{i}^{N_{\mathrm{mol}}} D_i^2 \right)^{1/2}, \qquad
D_i = \left( \frac{1}{N_i} \sum_{j}^{N_i} \left( q_{ij}^{\mathrm{calcd}} - q_{ij}^{\mathrm{RESP}} \right)^2 \right)^{1/2},
\tag{2}
$$

where $F(\chi, \eta, \ldots)$ is the standard deviation for the entire set, $D_i$ characterizes the standard deviation of the calculated charges from the RESP charges for the $i$th structure of the set, and $N_i$ is the number of atoms in that structure. The optimization results are presented in Table 1.

To verify the transferability of the parameters and the predictive power of the models obtained, the initial set of structures was partitioned into two nonintersecting parts: a training set and a test set. After the parameters of the MG and OG models were optimized on the training set, the charges for the structures of the test set were evaluated (Table 1). The models have rather good predictive power, taking into account that the range of RESP charges in the training set is about 2.5e.

To study the possibilities for further improving the description of the MEP, the training set was subjected to a more in-depth analysis according to two criteria. First, the MEP of some structures is inadequately described by a system of atom-centered point charges. The relative root-mean-square deviation (RRMS), Eq. (3), obtained upon generation of the RESP charges is used as a measure of the quality of the description. The description is deemed adequate if RRMS ≤ 0.2 (20%).

$$
\mathrm{RRMS} = \left( \frac{\chi^2}{\sum_{i}^{N_t} V_i^2} \right)^{1/2},
\tag{3}
$$

where $V_i$ is the quantum-chemical electrostatic potential at the $i$th point of a grid of $N_t$ points and $\chi^2$ is the residual dispersion of the RESP method [2].
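The RRMS criterion of Eq. (3) can be illustrated with a short Python sketch. Everything specific here is an assumption made for the example: the two-site system, the grid points, the use of a known charge pair as a stand-in for the quantum-chemical potential, and the helper names `coulomb_potential`, `rrms`, and `rrms_for`. In the RESP procedure, $\chi^2$ is the residual at the fitted charges; here it is simply evaluated for a given trial charge set.

```python
import math


def coulomb_potential(charges, sites, point):
    """Potential of atom-centered point charges at one grid point (atomic units)."""
    return sum(q / math.dist(site, point) for q, site in zip(charges, sites))


def rrms(v_ref, v_model):
    """Eq. (3): sqrt(chi^2 / sum_i V_i^2), with chi^2 = sum_i (V_i - V_i_model)^2."""
    chi2 = sum((v - w) ** 2 for v, w in zip(v_ref, v_model))
    return math.sqrt(chi2 / sum(v * v for v in v_ref))


# Hypothetical two-site system and grid (coordinates in bohr).
SITES = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
GRID = [(-3.0, 0.0, 0.0), (5.0, 0.0, 0.0), (1.0, 3.0, 0.0),
        (0.0, 0.0, 4.0), (2.0, -3.0, 1.0)]

# Stand-in for the quantum-chemical potential V_i, generated from known charges.
V_REF = [coulomb_potential([0.4, -0.4], SITES, p) for p in GRID]


def rrms_for(charges):
    """RRMS of a trial charge set against the reference potential."""
    v_model = [coulomb_potential(charges, SITES, p) for p in GRID]
    return rrms(V_REF, v_model)


print(rrms_for([0.36, -0.36]))  # close to 0.10
print(rrms_for([0.30, -0.30]))  # close to 0.25
```

Because the potential is linear in the charges, scaling the reference charges by 0.9 or 0.75 leaves a residual of 10% or 25% of the reference potential at every grid point, so the two trial sets fall on opposite sides of a 0.2 threshold.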

Second, the statistical validity of the atomic charges was assessed by least-squares analysis using Eq. (4). Here, $\sigma$ is the mean-square deviation and a measure of the convexity of the function $\chi^2$ in the vicinity of the atomic charge $q_k$; it determines the significance of this charge for reproduction of the MEP [9].

$$
\sigma(q_k) = \left( \frac{\chi^2}{N_p - N_a - N_{\mathrm{restr}}} \right)^{1/2} (A^{-1})_{kk},
\tag{4}
$$

where $q_k$ is the charge on the $k$th atom, $N_p$ is the number of points in the grid, $N_a$ is the number of atoms, $N_{\mathrm{restr}}$ is the number of restraints (in this case, only one restraint is used: the sum of atomic charges is zero), and $(A^{-1})_{kk}$ is the diagonal element of the inverse matrix of the normal equations of the least-squares method for finding the charge values.

At ill-conditioned centers, the charges can vary considerably while only slightly affecting the quality of the potential reproduction. We abandoned the attempt to reproduce the RESP charges for structures containing centers with $\sigma$ ≥ 0.1e. The lists of structures rejected according to these two criteria were virtually the same; therefore, the lists were combined. After these structures were removed from the sets, we repeated the optimization and the assessment of the predictive power. The results are presented in Fig. 1 and Table 2.

Table 1. Residual discrepancies (in fractions of an electron per atom) for the total set. The number of structures is given in parentheses.

Parameters    Training set (166)    Test set (8)
              MG       OG           MG       OG
Original      0.314    0.244        0.320    0.232
Optimized     0.224    0.127        0.207    0.148

Table 2. Residual discrepancies (in fractions of an electron per atom) for the set minus the rejected structures.

Parameters    Training set (84)     Test set (8)
              MG       OG           MG       OG
Original      0.366    0.251        0.320    0.232
Optimized     0.176    0.108        0.151    0.136

Figure 2 shows one of the structures of the training set, and Table 3 presents the atomic charges, mean-square deviations, and mean absolute errors of the calculated charges relative to the RESP charges. As can be seen from Table 3, the charges obtained using the optimized parameters of the topological schemes correspond much better to the RESP charges. Owing to its more detailed representation of a structure, the OG model reproduces the RESP charges noticeably better than the MG model both before and after optimization. Some residual deviation is associated with the fact that topological schemes generate equal charges on topologically equivalent atoms.
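The summary rows of Table 3 can be checked directly from the listed charges. The sketch below, with helper names of our own choosing, computes the per-structure deviation $D_i$ of Eq. (2) and the mean absolute error for each parameter set; the charge values are those of Table 3.

```python
import math

# Charges from Table 3 (atom order: C1, C2, O1, O2, H1, H2, H3, H4).
RESP = [0.739, 0.008, -0.563, -0.397, 0.102, 0.062, 0.062, -0.014]
CALC = {
    "original MG": [0.097, -0.018, -0.168, -0.206, 0.060, 0.060, 0.060, 0.117],
    "original OG": [0.613, 0.132, -0.467, -0.617, 0.072, 0.072, 0.072, 0.124],
    "optimized MG": [0.727, -0.063, -0.587, -0.626, 0.132, 0.132, 0.132, 0.152],
    "optimized OG": [0.735, 0.026, -0.665, -0.527, 0.112, 0.112, 0.112, 0.095],
}


def rms_deviation(calc, ref):
    """D_i of Eq. (2): root of the mean squared charge deviation."""
    return math.sqrt(sum((c - r) ** 2 for c, r in zip(calc, ref)) / len(ref))


def mean_abs_error(calc, ref):
    """Mean absolute deviation of the calculated charges from the reference."""
    return sum(abs(c - r) for c, r in zip(calc, ref)) / len(ref)


for label, charges in CALC.items():
    d = rms_deviation(charges, RESP)
    mae = mean_abs_error(charges, RESP)
    print(f"{label}: D = {d:.3f}, MAE = {mae:.3f}")
```

To three decimals the results agree with the last two rows of Table 3, which also shows that the row labeled "mean-square deviation" is the root-mean-square quantity $D_i$.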

The proposed strategy makes it possible to avoid one disadvantage of RESP charges, namely, considerable differences between the charges of similar ill-conditioned centers. Upon optimization on a set of many structures, the electronegativity and hardness parameters take values that reproduce the weighted average charge for the atoms of a given type over the entire set. The parameters thus obtained show a significantly smaller scatter. Some change in the resulting charges with respect to the RESP charges at ill-conditioned centers will not significantly change the reproduction of the MEP as a whole.
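The averaging effect can be made explicit under a deliberately simplified assumption, which is ours and not part of the schemes of [5]: suppose that all $n_t$ atoms of a given type $t$ in the set received one adjustable charge $c_t$. Minimizing the corresponding part of a squared-deviation target function then gives the mean of the reference charges:

```latex
\frac{\partial}{\partial c_t} \sum_{k=1}^{n_t} \left( c_t - q_k^{\mathrm{RESP}} \right)^2
  = 2 \sum_{k=1}^{n_t} \left( c_t - q_k^{\mathrm{RESP}} \right) = 0
\quad \Longrightarrow \quad
c_t = \frac{1}{n_t} \sum_{k=1}^{n_t} q_k^{\mathrm{RESP}} .
```

Each atom enters this sum with equal weight, so structures containing more atoms of type $t$ contribute more to the average. In the actual topological schemes the charges depend on the parameters through the solution of a linear system, so this is only an idealized illustration of why the scatter of individual RESP charges is smoothed out.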

Fig. 1. Reproduction of RESP charges by the OG model after optimization (q(OG) plotted against q(RESP)): (a) the training set, q(OG) = 0.9409q(RESP), R² = 0.9416; (b) the test set, q(OG) = 1.0188q(RESP), R² = 0.9253.

Fig. 2. Structure with calculated charges (Table 3). Atom labels: C1, C2, O1, O2, H1, H2, H3, H4.

Table 3. Charges for the structure shown in Fig. 2 calculated with the use of the original and optimized parameters

Atom                      RESP      Original            Optimized
                                    MG        OG        MG        OG
C1                        0.739     0.097     0.613     0.727     0.735
C2                        0.008    –0.018     0.132    –0.063     0.026
O1                       –0.563    –0.168    –0.467    –0.587    –0.665
O2                       –0.397    –0.206    –0.617    –0.626    –0.527
H1                        0.102     0.060     0.072     0.132     0.112
H2                        0.062     0.060     0.072     0.132     0.112
H3                        0.062     0.060     0.072     0.132     0.112
H4                       –0.014     0.117     0.124     0.152     0.095
Mean-square deviation               0.279     0.117     0.110     0.075
Average absolute error              0.179     0.094     0.084     0.059

The rejected structures can be divided into three classes: (I) low-polarity molecules, (II) structures with pronounced electron-density anisotropy, and (III) structures with heavy elements (sulfur, bromine) and with polyhalomethyl groups. The MEP of low-polarity structures is inadequately described by atomic charges [2]. The structures of class II can hardly be adequately described by a system of atom-centered point charges because of the spherical symmetry of the Coulomb potential. In compounds containing a double bond, the electron density above and below the plane of the double bond is higher than in this plane at a comparable distance from the bond line. This affects the ab initio electrostatic potential. In amines and phosphines, heterogeneities are also due to the presence of a lone electron pair, although its effect for an oxygen atom turned out to be insignificant. One way to improve the description of the MEP of classes II and III is to introduce extra-atomic charge sites corresponding to the electron pairs and bonding π orbitals. This was demonstrated in [10], in which a multipoint model of atomic sulfur with two positions for the charges on electron pairs was introduced and parameters of the GROMACS force field were modified to fit the ab initio potential energy surface of the dimethyl sulfide–methanol interaction, and in [11], in which the electrostatic potential of monomers was analyzed in the context of predicting the structure and energy of intermolecular complexes.
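The limitation of atom-centered charges and the effect of an extra-atomic site can be seen in a small numerical sketch. The charge values, the 0.7 bohr offset of the extra site, and the probe points below are hypothetical numbers chosen only for illustration; they are not parameters from [10] or [11].

```python
import math


def potential(sites, point):
    """Coulomb potential (atomic units) of (charge, position) pairs at a point."""
    return sum(q / math.dist(pos, point) for q, pos in sites)


# Model A: a single atom-centered charge (hypothetical value).
atom_only = [(-0.4, (0.0, 0.0, 0.0))]

# Model B: same total charge, partly moved to an off-atom "lone pair" site
# displaced along +z (hypothetical charges and offset).
with_extra_site = [(0.2, (0.0, 0.0, 0.0)), (-0.6, (0.0, 0.0, 0.7))]

# Two probe points at the same distance (3 bohr) from the nucleus.
along_pair = (0.0, 0.0, 3.0)
sideways = (3.0, 0.0, 0.0)

for name, model in (("atom only", atom_only), ("extra site", with_extra_site)):
    v_pair = potential(model, along_pair)
    v_side = potential(model, sideways)
    print(f"{name}: V(along pair) = {v_pair:.4f}, V(sideways) = {v_side:.4f}")
```

With the single centered charge, the two probe points see the same potential, as expected for a spherically symmetric source. With the displaced site, the potential is more negative in the direction of the site, which is the kind of anisotropy an atom-centered model cannot represent.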

Thus, in this work, we showed that it is possible to reproduce RESP charges, and the MEP that generates them, by means of topological schemes based on the electronegativity equalization principle that were specially optimized for describing the MEP. Quantum-chemical calculations are required only for optimization of the method parameters. Therefore, within the approach developed, the MEP can be fitted by a system of point charges calculated with the use of topological schemes.

REFERENCES

1. Hobza, P. and Zahradnik, R., Chem. Rev., 1988, vol. 88, pp. 871–897.

2. Bayly, C.I., Cieplak, P., Cornell, W.D., and Kollman, P.A., J. Phys. Chem., 1993, vol. 97, pp. 10269–10280.

3. Reynolds, C.A., Essex, J.W., and Richards, W.G., J. Am. Chem. Soc., 1992, vol. 114, pp. 9075–9079.

4. Cornell, W.D., Cieplak, P., Bayly, C.I., and Kollman, P.A., J. Am. Chem. Soc., 1993, vol. 115, pp. 9620–9631.

5. Oliferenko, A.A., Palyulin, V.A., Pisarev, S.A., Neiman, A.V., and Zefirov, N.S., J. Phys. Org. Chem., 2001, vol. 14, pp. 355–369.

6. Oliferenko, A.A., Palyulin, V.A., and Zefirov, N.S., Dokl. Akad. Nauk, 1999, vol. 368, no. 1, pp. 63–67 [Dokl. Chem. (Engl. Transl.), vol. 368, nos. 1–3, pp. 209–212].

7. Singh, U.C. and Kollman, P.A., J. Comput. Chem., 1984, vol. 5, pp. 129–145.

8. Press, W.H., Flannery, B.P., Teukolsky, S.A., and Vetterling, W.T., Numerical Recipes in C: The Art of Scientific Computing, 2nd ed., Cambridge: Cambridge Univ. Press, 1992.

9. Spackman, M.A., J. Comput. Chem., 1996, vol. 17, pp. 1–18.

10. Wennmohs, F. and Schindler, M., J. Comput. Chem., 2005, vol. 26, pp. 283–293.

11. Kollman, P.A., Acc. Chem. Res., 1977, vol. 10, pp. 365–371.