
Estimating Correlation Energy of Diatomic Molecules and Atoms with Neural Networks

GERALDO MAGELA E SILVA, PAULO HORA ACIOLI, ANTONIO CARLOS PEDROZA
Departamento de Física, Universidade de Brasília, 70910-900 Brasília DF, Brazil

Received 7 November 1996; accepted 25 February 1997

ABSTRACT: The electronic correlation energy of diatomic molecules and heavy atoms is estimated using a backpropagation neural network approach. The supervised learning is accomplished using known exact results of the electronic correlation energy. The recall rate, that is, the performance of the net in recognizing the training set, is about 96%. The correctness of the values given to the test set, the prediction rate, is at the 90% level. We generate tables for the electronic correlation energy of several diatomic molecules and all the neutral atoms up to radon (Rn). © 1997 by John Wiley & Sons, Inc. J Comput Chem 18: 1407-1414, 1997

Keywords: neural network; network; correlation energy

Introduction

Neural networks have become powerful tools for researchers in recent years.1 These computer models mimic the functioning of the brain for several tasks. Problems of classification, modeling, mapping, association, and dynamical processes can be handled very well, frequently giving better results than other conventional techniques. The applicability of neural networks in computational chemistry has been proved in problems of reactivity of chemical bonds, electrophilic aromatic substitution reactions, infrared spectrum-structure correlation, and in various other applications.2

Correspondence to: G. Magela e Silva; e-mail: magela@helium.fis.unb.br

Neural networks are computer programs in which the role of neurons is played by processing elements (PE) that are connected in a net and can receive and transmit information among themselves. All information received by one PE is processed with the addition of a transfer function and then transferred to other PEs. Communication between PEs is mediated by weights that work very much like the synapses between neurons. It is the modification of these weights that enables the network to learn and execute the tasks it is devised to do.

The most common neural networks have a multilayer architecture. The first layer is the input layer, used to feed the network with the information that will model the weights in a first stage (the so-called learning process). In the second stage, when the learning process has finished, the input layer is used to trigger the network to produce answers, referred to as recalls and predictions.

The bulk of the processing is done in the layers between the input layer and the last one. Neural networks are therefore trained with examples in which an output is associated with a given input. After successful training the net can recall (give the correct known answers) and predict answers that have not been presented to it previously.

Accurate determination of the correlation energy of atomic and molecular systems from ab initio calculations requires heavy computational work, limiting the size of the systems that can be treated. Although the correlation energy is a small fraction of the total energy of the system, typically of the order of 1%, it can be of the order of chemically important energies such as binding energies and electron affinities.

In this work we use the backpropagation method of supervised learning to train a neural network to determine the correlation energy of diatomic molecules and neutral heavy atoms. In the training phase, the input set for the molecules includes the atomic numbers of the composing atoms (Z1 and Z2) and their bond length. For the atoms we used the Hartree-Fock energy, the atomic number, and the column in the periodic table of the atoms whose electronic correlation energies are already known. Of course, other representations are possible, but this choice yields a good learning rate and a prediction rate that seems reasonable. Using the trained network we generate the correlation energies for all elements up to radon (Z = 86). Our results are compared with the existing predictions given by other methods in the literature, as well as with experimental values, when available. Our calculated correlation energy is in fairly good agreement with the experimental data, and the predicted values seem very reasonable. The advantage of using the neural network is that it can calculate correlation energies of many-electron systems without prohibitive computational cost.
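The input representations described above can be sketched as follows; the normalization constants (`Z_MAX`, `R_MAX`, `EHF_MAX`, `COL_MAX`) are our own illustrative choices, since the paper states only that the data were normalized:

```python
import numpy as np

# Normalization bounds are assumptions for this sketch, not from the paper.
Z_MAX, R_MAX, EHF_MAX, COL_MAX = 86.0, 5.0, 22000.0, 18.0

def molecule_input(z1, z2, bond_length):
    """Three-component input pattern for a diatomic molecule:
    the two atomic numbers and the bond length (a.u.), scaled to [0, 1]."""
    return np.array([z1 / Z_MAX, z2 / Z_MAX, bond_length / R_MAX])

def atom_input(e_hf, z, column):
    """Three-component input pattern for a neutral atom: the magnitude of
    the (negative) Hartree-Fock energy in a.u., the atomic number, and the
    periodic-table column, each scaled to [0, 1]."""
    return np.array([abs(e_hf) / EHF_MAX, z / Z_MAX, column / COL_MAX])
```

For example, `molecule_input(1, 9, 1.733)` would encode the HF molecule, and `atom_input(-2.8617, 2, 18)` the helium atom.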

In the next section we give a brief review of the backpropagation method and describe the network architecture used in this work. We then discuss the important role of correlation energies in the electronic structure and the difficulty of obtaining accurate correlation energies for many-electron systems. The final section is devoted to the application of the backpropagation method to model the correlation energy and to tabulate values of these energies for diatomic molecules and heavy atoms. There we also compare our estimates with other theoretical approaches and provide concluding remarks.

Backpropagation Method and Network Architecture

Here we present a brief discussion of the method. The backpropagation of errors is a learning method; that is, a method for the correction of weights.3 The great appeal of this method comes from an explicit and well-defined set of equations for the correction of the weights. The equations are applied to the correction of the weights in the last layer and then successively to the previous layers, back to the first. This supervised learning process utilizes pairs of input-output data. These pairs are sets of real variables, where we initially associate a known output with each given input. To begin, the weight values are set at random. The correction in this method is based on the difference between the actual response of the net and the desired output. The correction, following the gradient descent method, is computed from4:

\Delta W_{ji}^{l} = \eta \, \delta_{j}^{l} \, \mathrm{out}_{i}^{l-1} + \mu \, \Delta W_{ji}^{l(\mathrm{previous})} \qquad (1)

where \Delta W_{ji}^{l} represents the correction to the weight between the jth PE in the lth layer and the ith PE in the previous layer; \mathrm{out}_{i}^{l-1} is the output of the ith PE in the (l-1)th layer; \eta and \mu are constants called the learning rate and the momentum constant, respectively. These two constants determine the rate of convergence of the learning process. In the present work we vary their values from 0.1 to 0.9 independently during the learning process to obtain the best convergence rate. The \delta_{j}^{l} values are the errors introduced by the jth PE in the lth layer. They are calculated from4:

\delta_{j}^{\mathrm{last}} = \left( y_{j} - \mathrm{out}_{j}^{\mathrm{last}} \right) \mathrm{out}_{j}^{\mathrm{last}} \left( 1 - \mathrm{out}_{j}^{\mathrm{last}} \right) \qquad (2)

\delta_{j}^{l} = \left( \sum_{k=1}^{r} \delta_{k}^{l+1} W_{kj}^{l+1} \right) \mathrm{out}_{j}^{l} \left( 1 - \mathrm{out}_{j}^{l} \right), \qquad l = 1, \ldots, \mathrm{last} - 1 \qquad (3)
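The update rules (1)-(3) can be sketched in NumPy. The sigmoid transfer function and the array layout (`W[l]` feeding layer l+1) are assumptions for illustration; the paper does not specify its implementation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(W, x):
    """Return the outputs of every layer; outs[0] is the input pattern."""
    outs = [np.asarray(x, dtype=float)]
    for w in W:
        outs.append(sigmoid(w @ outs[-1]))
    return outs

def backprop_step(W, outs, y, eta=0.5, mu=0.5, prev_dW=None):
    """One weight correction following Eqs. (1)-(3).
    W[l] holds the weights feeding layer l+1, so delta[l] is the error
    vector of layer l+1."""
    L = len(W)
    delta = [None] * L
    # Eq. (2): error of the last layer (sigmoid derivative is out*(1-out))
    out_last = outs[-1]
    delta[-1] = (np.asarray(y) - out_last) * out_last * (1.0 - out_last)
    # Eq. (3): propagate the errors backward through the hidden layers
    for l in range(L - 2, -1, -1):
        out_l = outs[l + 1]
        delta[l] = (W[l + 1].T @ delta[l + 1]) * out_l * (1.0 - out_l)
    # Eq. (1): gradient-descent correction with a momentum term
    if prev_dW is None:
        prev_dW = [np.zeros_like(w) for w in W]
    dW = [eta * np.outer(delta[l], outs[l]) + mu * prev_dW[l] for l in range(L)]
    return [W[l] + dW[l] for l in range(L)], dW
```

Repeated calls of `backprop_step` on the training pairs drive the response toward the desired output, as described below.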


where y_j is the desired output of the jth PE of the last layer. Hence, for the correction of each weight it is necessary to use the values of three layers. We used normalized data to prevent numerical problems. Upon repeatedly presenting the input and desired output sets to the network, the weights are gradually corrected until the response and the desired output agree to some extent. This is accomplished by calculating the root-mean-square error between them until it becomes smaller than 10^{-3} for the training set.
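The stopping criterion just described amounts to a simple function; the loop shape around it is our own illustrative sketch:

```python
import numpy as np

def rms_error(responses, targets):
    """Root-mean-square difference between the net's responses and the
    desired outputs, computed over the (normalized) training set."""
    diff = np.asarray(responses, dtype=float) - np.asarray(targets, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

# A hypothetical training loop keeps presenting the training set until
# rms_error(responses, targets) < 1e-3, the threshold quoted above.
```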

In our network architecture each layer of neurons is fully connected to the layer below it (Fig. 1). We considered nets with several configurations and found that the best results were given by a three-layer network. The input layer has four PEs: three PEs identify the system and the fourth is a bias. The hidden layer also consists of four PEs: three active PEs and a bias. The biases are introduced to formally account for the parameters of the transfer functions inherent in artificial neurons. Finally, the output layer has only one PE, giving the electronic correlation energy of the input molecule or atom.
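A forward pass through this architecture can be sketched as follows; the weight shapes and the sigmoid transfer function are assumptions for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward_fig1(x3, W_ih, W_ho):
    """Forward pass through the Fig. 1 net: three active input PEs plus a
    bias, three active hidden PEs plus a bias, and one output PE.
    Assumed weight shapes (illustrative, not from the paper):
        W_ih : (3, 4)  input layer (incl. bias) -> active hidden PEs
        W_ho : (1, 4)  hidden layer (incl. bias) -> output PE
    """
    inp = np.append(np.asarray(x3, dtype=float), 1.0)  # bias PE fixed at 1
    hid = np.append(sigmoid(W_ih @ inp), 1.0)          # hidden bias PE
    return float(sigmoid(W_ho @ hid)[0])               # normalized output
```

The single sigmoid output lies in (0, 1), which is why the correlation energies must be normalized before training and rescaled afterward.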

Correlation Energy

FIGURE 1. Schematic representation of the neural network architecture used in this work.

The correlation energy is defined as the difference between the exact nonrelativistic ground-state energy and the Hartree-Fock energy. This small amount of energy, however, is enough to cause errors in predicting the nonexistence of most negative ions or the instability of some molecules.5 Also, the correlation energy is necessary for studying the electronic structures of clusters or solids.
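As a concrete instance of this definition, the well-known helium values (exact nonrelativistic energy of about -2.9037 a.u. and Hartree-Fock energy of about -2.8617 a.u.) give:

```python
# E_corr = E_exact(nonrelativistic) - E_HF, helium values in a.u.
e_exact = -2.9037
e_hf = -2.8617

e_corr = e_exact - e_hf
frac = abs(e_corr) / abs(e_exact)   # correlation is ~1% of the total energy
print(round(e_corr, 3), round(100 * frac, 1))  # prints: -0.042 1.4
```

The result matches the exact value for helium (Z = 2) quoted in Table I, and the ~1% fraction illustrates the "small fraction of the total energy" remark above.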

The correlation energies obtained from very accurate experimental total energies are available only for light atoms6 and small molecules.7-9 No reliable experimental data are available for many-electron systems.

To obtain theoretically accurate correlation energies, one needs to perform heavy computational calculations such as those done in configuration interaction or many-body perturbation methods5 with relativistic considerations. These methods, however, are limited to systems with no more than two dozen or so electrons. Almost equal accuracy is obtained using density functional theory within its nonlocal approximations.10 This approach needs less computational effort and, thus, results for a few heavy atoms have been published.10 Using a semiempirical model, Chakravorty and Clementi11 made very accurate estimates of the correlation energy for many atoms. In their study, however, they did not go beyond the Xe atom.

In the treatment of diatomic molecules, Carroll et al. used local (including self-correlation corrections) and nonlocal correlation energy functionals and provided an extensive table of diatomic molecule correlation energies.7 Savin et al. also considered the nonlocal approximation to generate the correlation energies of several molecules.8 Many-body perturbation and coupled cluster methods have also been used in the determination of the correlation energy of diatomic molecules.9

Discussion and Summary

In this work we used a very flexible method to estimate the correlation energies of diatomic molecules and heavy atoms. During the learning process we taught the neural network the exact known correlation energies and bond lengths.6-9

The variables in the input for the diatomic molecules are the atomic numbers of the atoms (Z1 and Z2) and their bond length. The network was trained using a set of 31 molecules, and the prediction rate was determined using a set of 6 test molecules. We then used the network to predict the correlation energies of 28 molecules. For the atomic case we used the atomic number, the column of the atom in the periodic table, and its Hartree-Fock energy.12

One remark should be made, though: there can be a deficiency in this representation for the estimation of correlation energies of the lanthanides, as we chose to represent all of them as column 3 of the periodic table. They are distinguished only by their atomic numbers and HF energies. However, Figure 3 shows that the estimation seems reasonable.

In Table I we present the results of the learning process. The difference between the energies predicted by the network and the experimental values6 (taken as exact values) is about 4%. Overall, the results are better than the estimates using density functional theory (DFT) in the local spin density (LSDA)13 and generalized gradient (GGA)13 approximations. We can better visualize the difference in Figure 2, where we plot the LSDA (dashed lines), GGA (long dashed lines), exact energies (full circles), and the energies learned by the neural network (full line). Of course, the DFT has a more physical justification, but the approach in this work can be equally justified as a mapping of known results that models the neural network and its a posteriori use to estimate the unknown correlation energies. In Table II we tabulate all correlation energies estimated in this work.

TABLE I. Correlation Energies (in a.u.) for Atoms Helium through Argon (Comparison of LSDA, GGA, Neural Network [NN], and Exact Values).

Z    LSDA (Ref. 12)   GGA (Ref. 12)   NN       Exact (Ref. 5)
2    -0.045           -0.044          -0.045   -0.042
3    -0.063           -0.053          -0.055   -0.045
4    -0.094           -0.094          -0.086   -0.094
5    -0.125           -0.128          -0.110   -0.124
6    -0.159           -0.163          -0.155   -0.155
7    -0.198           -0.198          -0.204   -0.186
8    -0.251           -0.264          -0.256   -0.254
9    -0.305           -0.328          -0.312   -0.316
10   -0.362           -0.389          -0.373   -0.381
11   -0.399           -0.414          -0.388   -0.386
12   -0.443           -0.464          -0.426   -0.428
13   -0.484           -0.507          -0.441   -0.459
14   -0.525           -0.552          -0.493   -0.494
15   -0.569           -0.597          -0.547   -0.521
16   -0.623           -0.668          -0.603   -0.595
17   -0.677           -0.736          -0.662   -0.667
18   -0.732           -0.802          -0.723   -0.732

FIGURE 2. Correlation energies (in a.u.) as a function of atomic number (Z), learning stage. (a) Exact values (full circles); (b) predictions of the neural network (full line); (c) GGA results (dashed lines); and (d) LSDA results (long dashed lines).

TABLE II. Estimates of Correlation Energies (in a.u.) for Neutral Atoms from Helium through Radon Using a Backpropagation Neural Network.

Z   e_c      Z   e_c      Z   e_c      Z   e_c      Z   e_c
2   -0.045   19  -0.768   36  -1.581   53  -2.390   70  -2.918
3   -0.055   20  -0.813   37  -1.675   54  -2.416   71  -2.941
4   -0.086   21  -0.858   38  -1.738   55  -2.425   72  -2.990
5   -0.110   22  -0.903   39  -1.799   56  -2.490   73  -3.034
6   -0.155   23  -0.948   40  -1.857   57  -2.550   74  -3.075
7   -0.204   24  -0.994   41  -1.913   58  -2.583   75  -3.112
8   -0.256   25  -1.041   42  -1.966   59  -2.615   76  -3.146
9   -0.312   26  -1.088   43  -2.017   60  -2.647   77  -3.176
10  -0.373   27  -1.135   44  -2.065   61  -2.677   78  -3.203
11  -0.388   28  -1.183   45  -2.110   62  -2.707   79  -3.228
12  -0.426   29  -1.232   46  -2.153   63  -2.736   80  -3.249
13  -0.441   30  -1.281   47  -2.194   64  -2.764   81  -3.268
14  -0.493   31  -1.331   48  -2.232   65  -2.791   82  -3.284
15  -0.547   32  -1.381   49  -2.268   66  -2.818   83  -3.298
16  -0.603   33  -1.431   50  -2.301   67  -2.844   84  -3.310
17  -0.662   34  -1.481   51  -2.333   68  -2.869   85  -3.319
18  -0.723   35  -1.531   52  -2.362   69  -2.894   86  -3.327

In Figure 3 we plot the correlation energies predicted by our network for all atoms up to radon (Z = 86). One can see that the correlation energies (absolute values) increase with the atomic number, as expected. One can also note a certain periodicity in the predictions, as may be expected, due to the periodic nature of the elements. The correlation energies predicted by the neural network are somewhat smaller than those predicted by LSDA and GGA.13 Nevertheless, it is known that LSDA overestimates the atomic correlation energies and that the GGA should correct this trend, so the neural network prediction is also coherent in this respect. The very elaborate result of Kelly and Ron14 for the iron atom is also displayed in Figure 3 and is in good agreement with our results. Of course, we would need additional exact data to have a better idea of how good our estimations are or how to better train our network. Therefore, a more comprehensive list of exact correlation energies could be very useful in improving our current neural network. In particular, we are interested in exact correlation energies for the transition metals.

It should be noted that, for high-Z atoms, relativistic effects will have a profound effect on the electronic correlation. Also, some neural networks have demonstrated poor reliability when the range of a given input pattern of the test set exceeds the corresponding range of the training set.15

Figure 4 presents the recognition and prediction of our learning and test sets (Table III). The overwhelming concordance between the neural network and the experimental results has encouraged us to make predictions of several unknown correlation energies of diatomic molecules. These predictions are shown in Figure 5 and Table IV. New experimental values for the molecules shown in Figure 5 are expected to be in good accordance with the present results.

It should be noted that the use of network predictions to calculate derived quantities, such as the binding energy of the diatomics, could lead to rather large deviations due to error propagation.
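This caution can be made quantitative with the paper's own numbers: taking the network values for N2 (Table III) and atomic N (Table II), and attaching the ~4% accuracy quoted for the learning stage to each (the independent-error assumption is ours):

```python
import math

# Correlation contribution to the N2 binding energy from network values:
# E_c(N2) from Table III, E_c(N) from Table II; 4% uncertainties assumed
e_mol = -0.546                    # E_c(N2), neural network
e_atoms = 2 * (-0.204)            # 2 * E_c(N), neural network
contribution = e_mol - e_atoms    # -0.138 a.u.

sigma = math.sqrt((0.04 * abs(e_mol)) ** 2 + (0.04 * abs(e_atoms)) ** 2)
relative = sigma / abs(contribution)
print(round(relative, 2))  # prints: 0.2
```

A few-percent error on each input thus becomes roughly a 20% error on the differenced quantity, which is the deviation amplification referred to above.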

FIGURE 3. Correlation energies (in a.u.) as a function of atomic number (Z), prediction stage. (a) Predictions of the neural network (full line); (b) GGA results (dashed lines); (c) LSDA results (long dashed lines); and (d) Kelly and Ron estimate for the iron atom (star).

TABLE III. Estimates of Correlation Energies of Diatomic Molecules.a

Molecule   e_c (NN)   e_c (exact)   Molecule   e_c (NN)   e_c (exact)
H2         -0.044     -0.041        C2         -0.497     -0.510
LiH        -0.076     -0.083        CN         -0.508     -0.503
BeH        -0.103     -0.091        CF         -0.560     -0.559
CH         -0.214     -0.196        N2         -0.546     -0.545
NH         -0.247     -0.239        NO         -0.567     -0.605
OH         -0.293     -0.310        PN         -0.872     -0.870
HF         -0.348     -0.377        O2         -0.636     -0.647
NaH        -0.388     -0.420        SiO        -0.851     -0.858
MgH        -0.437     -0.434        OF         -0.679     -0.672
SiH        -0.539     -0.524        F2         -0.749     -0.735
PH         -0.591     -0.557        AlF        -0.843     -0.826
HCl        -0.681     -0.702        NaCl       -1.063     -1.082
Li2        -0.119     -0.120        BH         -0.172     -0.152
LiF        -0.430     -0.429        AlH        -0.483     -0.484
LiCl       -0.733     -0.744        BF         -0.511     -0.498
BeO        -0.440     -0.449        CO         -0.512     -0.529
BeF        -0.471     -0.438        MgO        -0.793     -0.844
B2         -0.329     -0.326        NaF        -0.858     -0.782
BO         -0.476     -0.466

a The first 31 molecules (H2 to NaCl) were used to train the neural network. The last six (BH to NaF) were used to determine the prediction rate of the network.

FIGURE 4. Correlation energies of diatomic molecules. Experimental values: stars; network recognition: full circles; predictions: full triangles.

FIGURE 5. Network prediction of the correlation energy of diatomic molecules.

TABLE IV. Estimates of Correlation Energies (in a.u.) for Diatomic Molecules.

Molecule   e_c       Molecule   e_c
KH         -0.720    MgF        -0.829
CaH        -0.768    SiF        -0.902
MnH        -0.945    PF         -0.934
BN         -0.440    SF         -0.964
BS         -0.806    ClF        -0.991
CP         -0.814    KF         -1.037
CCl        -0.882    Si2        -1.060
CS         -0.851    SiS        -1.108
SiN        -0.821    P2         -1.112
PN         -0.858    S2         -1.153
OP         -0.898    CaS        -1.213
SO         -0.930    Cl2        -1.189
OCl        -0.958    KCl        -1.216
CaO        -1.036    K2         -1.241

In conclusion, we have succeeded in generating estimates of the correlation energies of several diatomic molecules and of the neutral atoms up to radon using a backpropagation neural network. Because there are no exact correlation energies for many-electron systems, we compare our estimates with correlation energies obtained in the framework of the DFT in the LSDA and GGA approximations.

Our energies are smaller (in absolute value) than those obtained in those approximations, a trend followed by the known exact results.

Extensions of this work to provide estimates of dipole moments of diatomic molecules, corrections to HF and MP2 homolytic bond dissociation energies, and some of the parameters involved in parameterizing the G2 method are currently being undertaken, and the results will be presented in a forthcoming article.

References

1. T. L. H. Watkin, A. Rau, and M. Biehl, Rev. Mod. Phys., 65, 499 (1993).

2. J. Gasteiger and J. Zupan, Angew. Chem. Int. Ed. Engl., 32, 503 (1993).

3. D. E. Rumelhart, G. E. Hinton, and R. J. Williams, in Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. I, D. E. Rumelhart and J. L. McClelland, Eds., MIT Press, Cambridge, MA, p. 318.

4. J. Zupan and J. Gasteiger, Neural Networks for Chemists, VCH, Weinheim, 1993, pp. 122, 123, and 147.

5. S. Wilson, Ed., Methods in Computational Physics, Vol. I, Plenum, New York, 1987.

6. A. Veillard and E. Clementi, J. Chem. Phys., 49, 2415 (1968).

7. M. T. Carroll, R. F. W. Bader, and S. H. Vosko, J. Phys. B: Atom. Mol. Phys., 20, 3599 (1987).

8. A. Savin, U. Wedig, H. Preuss, and H. Stoll, Phys. Rev. Lett., 53, 2087 (1984).

9. L. Adamowicz, R. J. Bartlett, and E. A. McCullough, Jr., Phys. Rev. Lett., 54, 426 (1985).

10. Y.-M. Juan and E. Kaxiras, Phys. Rev. B, 48, 14944 (1993).

11. S. J. Chakravorty and E. Clementi, Phys. Rev. A, 39, 2290 (1989).

12. C. F. Bunge, J. A. Barrientos, and A. V. Bunge, Atom. Data Nucl. Data Tables, 53, 113 (1993).

13. O. V. Gritsenko, N. A. Cordero, A. Rubio, L. C. Balbás, and J. A. Alonso, Phys. Rev. A, 48, 4197 (1993).

14. H. Kelly and A. Ron, Phys. Rev. A, 4, 11 (1971).

15. G. Rauhut, J. W. Boughton, and P. Pulay, J. Chem. Phys., 103, 5662 (1995).