A Computational Procedure for Determining Energetically Favorable Binding Sites on Biologically Important Macromolecules


J. Med. Chem. 1985, 28, 849-857

P. J. Goodford

The Laboratory of Molecular Biophysics, The Rex Richards Building, University of Oxford, Oxford OX1 3QU, England. Received August 3, 1984

The interaction of a probe group with a protein of known structure is computed at sample positions throughout and around the macromolecule, giving an array of energy values. The probes include water, the methyl group, amine nitrogen, carboxy oxygen, and hydroxyl. Contour surfaces at appropriate energy levels are calculated for each probe and displayed by computer graphics together with the protein structure. Contours at negative energy levels delineate regions of attraction between probe and protein and are found at known ligand binding clefts in particular. The contours also enable other regions of attraction to be identified and facilitate the interpretation of protein-ligand energetics. They may, therefore, be of value for drug design.

Several attempts have been made to use the atomic structures of biological macromolecules in order to discover novel therapeutic agents. The novel molecules are designed so that they should fit onto an appropriate protein whose structure has been previously determined by X-ray crystallography. However, this fitting process is not just a matter of simple geometry because the chemical properties of the individual atoms are critically important, and the goodness of fit may be assessed by predicting the energetics of the drug-protein system instead of its shape.² A method is now described in which energy contour surfaces are displayed in three dimensions on a computer graphics system together with the macromolecular structure, so that energy and shape can be considered simultaneously when designing drugs.

The idea of small hard atoms goes back to Democritus and Lucretius, but it was van der Waals who used the Joule-Thomson effect to show that there are attractive forces between atoms. The combination of attractive with repulsive forces has led to a number of simple functions³,⁴ that can display energy minima corresponding to preferred nonbonded arrangements of atoms and molecules, and the (12,6) Lennard-Jones potential (Figure 1) is such a function:

$$E_{lj} = \frac{A}{d^{12}} - \frac{B}{d^{6}} \qquad (1)$$

In this equation d is the distance between a pair of nonbonded atoms whose Lennard-Jones energy E_lj is described by the parameters A and B. When d is small, the A/d^12 term generates a dominating repulsion corresponding to a large positive value of E_lj. This effectively defines a minimum separation that can be apportioned between the atoms, giving each of them a nominal radius, and the threshold step function established in this way can be extended to larger systems, thus determining a molecular surface that may be measured⁵ or displayed by computer graphics.⁶ Alternatively, a somewhat bigger value of d may be used, corresponding to the minimum value of E_lj at the

(1) Goodford, P. J. J. Med. Chem. 1984, 27, 557.
(2) Richards, W. G. Quantum Pharmacology, 2nd ed.; Butterworths: London, 1983.
(3) Buckingham, R. A. Proc. R. Soc. London, A 1938, 168, 264.
(4) Jones, J. E. Proc. R. Soc. London, A 1924, 106, 463.
(5) Lee, B.; Richards, F. M. J. Mol. Biol. 1971, 55, 379.
(6) Blaney, J. M.; Jorgensen, E. C.; Connolly, M. L.; Ferrin, T. C.; Langridge, R.; Oatley, S. J.; Burridge, J. M.; Blake, C. C. F. J. Med. Chem. 1982, 25, 785.
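The behaviour of eq 1 can be explored numerically. The following is a minimal sketch, not part of the GRID program: the function names and the example values of A and B are illustrative assumptions. It evaluates E_lj, locates the bottom of the trough by setting dE/dd = 0 (which gives d_min = (2A/B)^(1/6) and E_min = -B²/(4A)), and finds the two distances at which the curve crosses a chosen negative energy level.

```python
import math

def lennard_jones(d, a, b):
    """Eq 1: E_lj = A/d**12 - B/d**6 for a separation d > 0."""
    return a / d**12 - b / d**6

def trough(a, b):
    """Separation and energy at the bottom of the trough (dE/dd = 0)."""
    d_min = (2.0 * a / b) ** (1.0 / 6.0)
    e_min = -b * b / (4.0 * a)
    return d_min, e_min

def crossings(a, b, level):
    """Two separations where E_lj equals a negative level above the minimum.

    Substituting x = d**-6 turns eq 1 into A*x**2 - B*x - level = 0.
    """
    _, e_min = trough(a, b)
    if not (e_min < level < 0.0):
        raise ValueError("level must lie between the minimum energy and zero")
    root = math.sqrt(b * b + 4.0 * a * level)
    x_large = (b + root) / (2.0 * a)   # larger x means the shorter distance
    x_small = (b - root) / (2.0 * a)   # smaller x means the longer distance
    return x_large ** (-1.0 / 6.0), x_small ** (-1.0 / 6.0)

# Hypothetical parameters chosen so that the trough is 0.4 kcal/mol deep at 3.5 Å.
D_MIN, DEPTH = 3.5, 0.4
A = DEPTH * D_MIN**12
B = 2.0 * DEPTH * D_MIN**6
inner, outer = crossings(A, B, -0.1)
```

With these assumed values, `inner` and `outer` bracket the trough position and mark the limits of the region in which the pair energy is below -0.1 kcal/mol.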

0022-2623/85/1828-0849$01.50/0 © 1985 American Chemical Society


Journal of Medicinal Chemistry, 1985, Vol. 28, No. 7

Table I. Parameters Used To Evaluate the Nonbonded Interactions of Probe Groups (See Text)ᵃ,ᵇ

probe                        Lennard-Jones                       hydrogen bond
description       symbol     α      N_eff   rad     charge      C      D      H-   H+
amino             NH3+       2.13   9       1.75     0.66       4295   629    3    0
carbonyl oxygen   O          0.84   6       1.60    -0.37       3855   738    0    2
carboxy oxygen    O-         2.14   6       1.60    -0.57       3855   738    0    2
hydroxyl          OH         1.20   7       1.65    -0.12       3855   738    1    1
methyl            CH3        2.17   8       1.95     0.00                     0    0
water             H2O        1.20   7       1.70     0.00       3855   738    2    2

ᵃ H- and H+ are the maximum number of hydrogen bonds that each probe can donate and accept. ᵇ Units: polarizability (α), Å³; van der Waals radius (rad), Å; charge, the electron charge; C, kcal mol⁻¹ Å⁶; D, kcal mol⁻¹ Å⁴.

Figure 1. A Lennard-Jones curve for the interaction between two nonbonded atoms. Abscissa: interaction energy E. Ordinate: interatomic distance d. Slightly negative values of E_lj yield two alternative distances d' and d'', which define a zone of interatomic attraction, and some characteristics of such zones have now been examined by computer graphics. See text.

bottom of the trough in the Lennard-Jones curve, and this generates a rather more extended molecular surface for measurement or display.

An attempt has now been made to study the surfaces characterized by a slightly negative energy value that defines two intersections with the Lennard-Jones curve (Figure 1) at distances d' and d''. As atoms approach nearer together than d', the repulsion term A/d^12 soon dominates, while the interaction tends to fade away completely as the atoms move further apart than d''. There is a zone of mutual attraction between these limits, and some characteristics of the three-dimensional contour surfaces at d' and d'' that define the extent of the zone in macromolecular systems are now explored. In order to do this the interaction of a probe group with the chosen protein is computed at sample positions throughout and around the macromolecule, giving an array of energy values through which the contours may be drawn. Probes include water, the methyl group, amine nitrogen, carboxy oxygen, and hydroxyl.

Electrostatic and hydrogen bond effects are also considered, and the present studies show that some known ligands do bind as expected in the calculated zone of attraction. It is suggested that the contour surfaces at d' and d'' may be of value in designing novel ligands or drugs.

Methods

The overall aim is to compute an empirical energy function and thereby produce an array of energy values. Three-dimensional contour surfaces at selected energy levels may then be generated and studied together with the protein structure. The examples given are displayed on an Evans and Sutherland Picture System 2 using program FRODO,⁷,⁸ but these display procedures are well established and it is only necessary to describe the earlier stages of the method. Moreover, the results can be displayed effectively on other graphics systems.

Preparation of Data. Atomic coordinates for the chosen macromolecule are taken from a source such as the Protein Data Bank⁹ and should be carefully checked against the notes and literature references. An appropriate choice is made where alternative atomic positions have been provided, and insertions may be supplied for missing groups. Ligands such as water molecules, chloride ions, or cofactors are not normally treated as part of the macromolecule, unless they are structurally significant like the heme group of hemoglobin.

A table of parameters (table GRUB) is needed in order to evaluate the Lennard-Jones and other empirical energy functions, and program GRIN appends these parameters to the atomic coordinates of the protein. Each heavy atom in each of the 20 common amino acids is individually considered, and additional molecules such as hydroxyproline or heme may be added to the table and processed in the same way. It may then be appropriate to modify some of the parameter values in particular cases. Thus the electric charges on amino and oxy terminal atoms and disulfide bridge atoms may be changed as suggested by McCammon et al.,¹⁰ and account may be taken of the pK_a of histidine residues if this is known. Hydrogen-bonding hydrogen atoms in positions of standard geometry are appended to the protein by program GRIN if they are not already present, and a check is made on Cys sulfur atoms to be sure they have received proper treatment. For proteins such as hemoglobin, the complete physiological oligomer must then be generated by symmetry operations, and a check is finally made for atoms that may be wrongly listed twice in the same position on symmetry axes.

The parameters in GRUB are based on the extended-atom representation used for program CHARMM. Thus, the amino group is treated as a single entity with a van der Waals radius of 1.75 Å, which is somewhat larger than the normal value for a nitrogen atom. This single extended amino group replaces four real atoms for computational purposes, so that the size and duration of all the computations are significantly reduced. Other advantages and disadvantages of the extended atom representation have been considered by Brooks et al.,¹¹ and the parameter

(7) Jones, T. A. J. Appl. Crystallogr. 1978, 11, 268.
(8) Jones, T. A. Computational Crystallography; Sayre, D., Ed.; Clarendon Press: Oxford, 1982; p 303.
(9) Bernstein, F. C.; Koetzle, T. F.; Williams, G. J. B.; Meyer, E. F.; Brice, M. D.; Rodgers, J. R.; Kennard, O.; Shimanouchi, T.; Tasumi, M. J. Mol. Biol. 1977, 112, 535.
(10) McCammon, J. A.; Wolynes, P. G.; Karplus, M. Biochemistry 1979, 18, 927.


Favored Binding Sites on Biological Molecules

values for extended probe groups are presented in Table I.

The Grid. The calculations are performed by Fortran program GRID, which can be made available to interested parties together with program GRIN and table GRUB. A regular array of GRID points is established throughout and around the protein, and the potential energy E_xyz of the probe is calculated when it is located at the first point on the first XY plane of the GRID. Successive probe positions are sampled in the same way until an energy value E_xyz has been assigned to every GRID point.

The dimensions of the array are determined so that all points on the first XY plane are outside the protein, and the computed energy values are therefore small when the probe is in this plane. However, subsequent planes start to intersect the macromolecule, and large positive energies due to the A/d^12 Lennard-Jones repulsion term may then be calculated for any GRID point that happens to be near a protein atom. Other points lie in the interatomic spaces, and modest negative energies would then correspond to favorable interactions between the probe and the protein. These would be partly due to the -B/d^6 term of the Lennard-Jones function, partly to electrostatic effects, and partly to hydrogen-bond interactions. Further arrays may also be calculated for other probes of interest, and each may be separately contoured for display so that the interactions of different ligand types with the same protein can be compared.

The Empirical Energy Functions. The nonbonded interaction energy E_xyz of the probe at each xyz position on the GRID is calculated as the sum of many different components:

$$E_{xyz} = \sum E_{lj} + \sum E_{el} + \sum E_{hb} \qquad (2)$$

Each individual term in the summations relates to one pairwise interaction between the probe at position xyz and a single extended atom of the protein. The summations extend over all extended protein atoms, although some small terms may be set to zero as indicated below in order to save computing time.

The Lennard-Jones Function. The form of the Lennard-Jones terms E_lj has already been defined (eq 1), but E_lj is set to zero if the probe and the pairwise protein atom are more than a certain distance apart. This is currently 8 Å, at which E_lj is typically -0.01 kcal/mol and is, moreover, diminishing according to the sixth power of the interatomic distance. The cut-off at 8 Å yields a worthwhile saving of computer time, although the neglect of small terms may introduce an error of the order of 0.1 kcal/mol in ΣE_lj.

The values of A and B in eq 1 are calculated from the effective number of electrons N_eff, the polarizability α, and the van der Waals radius r of the interacting atoms as described by Hopfinger.¹²

The Electrostatic Function. The electrostatic interaction E_el does not diminish rapidly with distance, and the individual terms are therefore evaluated pairwise between the probe group and every extended protein atom, irrespective of their separation. However, the magnitude of E_el is critically sensitive to the spatial dielectric behavior of the environment, which has been considered by Hopfinger,¹² who proposed a distance-dependent dielectric. In the present work it is assumed that a planar interface

(11) Brooks, B. R.; Bruccoleri, R. E.; Olafson, B. D.; States, D. J.; Swaminathan, S.; Karplus, M. J. Comp. Chem. 1983, 4, 187.
(12) Hopfinger, A. J. Conformational Properties of Macromolecules; Academic Press: New York, 1973; Chapter 2.
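The sampling scheme and the Lennard-Jones part of eq 2 can be sketched as follows. This is an illustrative outline rather than the Fortran program GRID: the function names, the atom tuple layout, and the per-pair A and B values are assumptions (the derivation of A and B from N_eff, α, and r is not reproduced here), and the electrostatic and hydrogen-bond sums are omitted.

```python
import math
from itertools import product

CUTOFF = 8.0  # Å; pairwise Lennard-Jones terms beyond this distance are set to zero

def grid_points(lower, upper, spacing):
    """Yield (x, y, z) positions on a regular grid spanning a box."""
    counts = [int(math.floor((hi - lo) / spacing + 1e-9)) + 1
              for lo, hi in zip(lower, upper)]
    for i, j, k in product(*(range(n) for n in counts)):
        yield (lower[0] + i * spacing,
               lower[1] + j * spacing,
               lower[2] + k * spacing)

def lj_sum(probe_xyz, atoms, cutoff=CUTOFF):
    """Sum A/d**12 - B/d**6 over protein atoms within the cutoff.

    atoms is a list of (xyz, A, B) tuples; A and B are assumed to be
    already combined for the probe-atom pair.
    """
    total = 0.0
    for xyz, a, b in atoms:
        d = math.dist(probe_xyz, xyz)
        if d == 0.0:
            return math.inf   # probe placed exactly on an atom: unbounded repulsion
        if d > cutoff:
            continue          # neglected small term
        total += a / d**12 - b / d**6
    return total

def energy_map(atoms, lower, upper, spacing):
    """Map every grid point to its summed Lennard-Jones energy."""
    return {p: lj_sum(p, atoms) for p in grid_points(lower, upper, spacing)}
```

A dictionary keyed by position keeps the sketch short; a real implementation would more likely store the energies in a three-dimensional array indexed by plane, row, and column.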

Table II. Assessment of the Nominal Depth (s) of a Probe Group or Atom inside the Protein, Which Is Regarded as a Homogeneous Phase (See Text)

nᵃ      s, Å      nᵃ      s, Å
≤6      0.0       10      1.9
7       0.4       11      2.6
8       0.9       ≥12     4.0
9       1.4

ᵃ n is the number of protein atoms within 4 Å of the probe position.

separates a homogeneous protein phase of dielectric ζ from a homogeneous solution of dielectric ε. The nominal depth (s_q) of each protein atom in the protein phase is assessed by counting the number (n) of neighboring protein atoms whose nuclei lie within a distance of 4 Å, and the depth (s_p) of the probe at each xyz position is similarly assessed. In order to calibrate this procedure, a test sphere of radius 4 Å was placed at various positions in or around a protein during a series of preliminary studies, and the number of protein atoms in the sphere was counted. When it contained 12 or more atoms, the sphere was apparently immersed in the protein phase, but if it held fewer than seven atoms, the center of the sphere appeared to be in the solution. Table II shows the depth (s) of the sphere's center into the protein phase for other numbers of atoms on the basis of these studies, and E_el may then be evaluated directly by applying the method of images¹³⁻¹⁵ (eq 3).

In eq 3, p and q are the electrostatic charges on the probe group and the pairwise protein atom that are separated by a distance d, and K is a combination of geometrical factors and natural constants. The term 4 s_p s_q is set to zero if the probe has fewer than seven protein neighbors within a distance of 4 Å, since this gives the appropriate functional form on the present assumptions according to the method of images when the probe is outside the protein phase. Values of 80 and 4 were used for ε and ζ.

Equation 3 represents a compromise between procedures based upon the algorithm of Warwicker and Watson,¹⁵ which can require substantial computing times, and the classical functions whose inadequacies have been discussed by Hopfinger.¹² Equation 3 leads to an effective dielectric value of ζ if the pairwise groups are so deep in the protein and so close together that the effect of the solution may be neglected. At increasing separations there is a distance dependency analogous to that suggested by Hopfinger,¹² but if one or both groups approach the surface the electrostatic interaction becomes progressively diminished, and it is almost switched off as the probe enters the solution when the effective dielectric rises to (ζ + ε)/2. Some of the constants in eq 3 can be collected in order to speed up the computation, and it is only necessary to determine s_p and s_q once in order to evaluate eq 3 for every interaction. The square-root term is rather time consuming, but it does not have to be evaluated when the probe is outside the protein.

The Hydrogen Bond Function. A direction-dependent 6-4 function is used for hydrogen bonds (eq 4), giving

(13) Landau, L. D.; Lifshitz, E. M. Course of Theoretical Physics, English ed.; Pergamon Press: Oxford, 1960; Vol. 8.
(14) Rogers, N.; Sternberg, M. J. Mol. Biol. 1984, 174, 527.
(15) Warwicker, J.; Watson, H. C. J. Mol. Biol. 1982, 157, 671.
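The displayed form of eq 3 is not reproduced in the text above, so the sketch below rests on an assumed form, E_el = (pq/Kζ)[1/d + ((ζ - ε)/(ζ + ε))/sqrt(d² + 4 s_p s_q)], chosen because it matches the limiting behaviour described: an effective dielectric of ζ for deeply buried pairs and of (ζ + ε)/2 when 4 s_p s_q is zero. The constant 332.06 kcal Å mol⁻¹ e⁻² standing in for 1/K, and all function names, are likewise assumptions. The depth lookup follows Table II.

```python
import math

COULOMB = 332.06   # kcal Å / (mol e^2); assumed stand-in for 1/K
EPSILON = 80.0     # dielectric of the solution
ZETA = 4.0         # dielectric of the protein phase

# Table II: number of protein atoms within 4 Å -> nominal depth s (Å)
DEPTH_TABLE = {7: 0.4, 8: 0.9, 9: 1.4, 10: 1.9, 11: 2.6}

def nominal_depth(n_neighbors):
    """Nominal depth s for a given neighbor count, following Table II."""
    if n_neighbors <= 6:
        return 0.0
    if n_neighbors >= 12:
        return 4.0
    return DEPTH_TABLE[n_neighbors]

def count_neighbors(xyz, atom_positions, radius=4.0):
    """Count positions within radius of xyz.

    When xyz is itself a protein atom, the caller is expected to leave
    that atom out of atom_positions.
    """
    return sum(1 for other in atom_positions if math.dist(xyz, other) <= radius)

def electrostatic(p, q, d, s_p, s_q, zeta=ZETA, epsilon=EPSILON):
    """Assumed method-of-images form of eq 3 (see the hedged lead-in)."""
    image = (zeta - epsilon) / (zeta + epsilon)
    return COULOMB * p * q / zeta * (
        1.0 / d + image / math.sqrt(d * d + 4.0 * s_p * s_q))
```

Under these assumptions a probe with fewer than seven neighbors has s_p = 0, so the square root reduces to d and need not be evaluated, which is consistent with the remark about computing time.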


a broad energy minimum as suggested by Brooks et al.:¹¹

$$E_{hb} = \left[ \frac{C}{d^{6}} - \frac{D}{d^{4}} \right] \cos^{m}\theta \qquad (4)$$

When two identical atoms are interacting, the tabulated values for C and D determine their interatomic separation d_min at the bottom of the curve. If the atoms are of different types, the geometric mean of their individual D values and the arithmetic mean of their d_min separations are used. The appropriate C value is then calculated by setting the differentiated form of eq 4 equal to zero, which leads to

$$C = 0.666\, D\, d_{min}^{2} \qquad (5)$$

If the protein donates a hydrogen bond, then the bond direction is determined by the hydrogen position as computed from the heavy atom structure of the protein. θ is the angle DHP where D is the protein donor atom, H is the hydrogen, and P is the probe accepting the hydrogen bond. The term m is normally 4, but the whole E_hb term is set to zero when θ ≤ 90°. However, problems can arise when tyrosine donates a hydrogen bond, because it is not always easy to decide on an unequivocal position for the hydrogen atom. In the present work a dummy hydrogen position is assigned on the tyrosine side-chain axis at 1 Å from the phenolic oxygen, and a less well-defined orientation is specified for this hydrogen bond by setting m = 2. The procedure tends to overestimate the strength of hydrogen bonds donated by tyrosine, but the bias is too small to show on any of the present figures.

If the probe group donates the bond, it is assumed that the probe can orient itself in order to form the most effective hydrogen-bond interaction with the acceptor atom of the protein, and the cos θ term is set to unity.

If E_lj indicates a Lennard-Jones repulsion but E_hb shows a favorable hydrogen-bond interaction between two atoms, then E_lj is set to zero. The maximum number of hydrogen bonds that each protein atom can donate and the maximum number that it can accept are specified and are also specified for the probe. Out of the many possible hydrogen bonds that a probe might be able to make at any particular xyz position, the energetically most favorable ones are then selected subject to these constraints.

There is no general agreement at present on the empirical functions that best describe E_hb or, indeed, E_lj and E_el. Some authors¹¹ favor the use of trigonometrical functions for hydrogen bonds, but others¹⁶⁻¹⁸ suggest that directional preferences can be adequately simulated without explicit angular factors. Further work is needed, and no particular claim is made for the present functions and parameters in preference to those of Snir et al.¹⁹ or Weiner et al.²⁰

Results

Water Contours. The full parameterization of water presents particular problems, and a very simple model has therefore been used. An "extended" water molecule (H2O; Table I) is treated as an electrically neutral group that has no dipole but can donate up to two hydrogen bonds and can also accept up to two. It also seemed desirable to use a sufficiently close spacing between GRID points to ensure that no appreciable cavity in the matrix of the protein would be passed over, and with this in mind the first computations were performed with water as a probe group at a GRID spacing of 0.5 Å. The three-dimensional contours in Figure 2 are therefore displayed as a network of interlocking vectors whose lengths correspond to this distance.

Figure 2. The cluster of short vectors was calculated from the observed structure of the phospholipase-A2 macromolecule.²¹ These vectors define a surface that surrounds a globular region in a cavity in the protein. A water molecule in this region should have an interaction energy of -9 kcal/mol with the enzyme, according to the present calculations. This corresponds to a significant attraction, and a water molecule (H53) was observed by X-ray crystallography as shown. It is almost exactly where predicted by the contours.

(16) Hagler, A. T.; Dauber, P.; Lifson, S. J. Am. Chem. Soc. 1979, 101, 5131.
(17) Hagler, A. T.; Lifson, S.; Dauber, P. J. Am. Chem. Soc. 1979, 101, 5122.
(18) Lifson, S.; Hagler, A. T.; Dauber, P. J. Am. Chem. Soc. 1979, 101, 5111.
(19) Snir, J.; Nemenoff, R. A.; Scheraga, H. A. J. Phys. Chem. 1978, 82, 2497.
(20) Weiner, S. J.; Kollman, P. A.; Case, D. A.; Singh, U. C.; Ghio, C.; Alagona, G.; Profeta, S.; Weiner, P. J. Am. Chem. Soc. 1984, 106, 765.
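Because the water probe owes its selectivity to hydrogen bonding, the 6-4 function of eq 4 and 5 is worth a short numerical sketch. The function names are illustrative assumptions, the factor 0.666 of eq 5 is written as 2/3, and the C and D values are those tabulated in Table I. The sketch covers a single donor-acceptor pair only and ignores the bookkeeping of maximum donor and acceptor counts.

```python
import math

def d_min_from(c, d_param):
    """Invert eq 5, C = (2/3) * D * d_min**2, to recover d_min in Å."""
    return math.sqrt(1.5 * c / d_param)

def combine(d1, dmin1, d2, dmin2):
    """Unlike atoms: geometric-mean D, arithmetic-mean d_min, C from eq 5."""
    d_param = math.sqrt(d1 * d2)
    d_min = 0.5 * (dmin1 + dmin2)
    c = (2.0 / 3.0) * d_param * d_min**2
    return c, d_param

def hydrogen_bond(d, c, d_param, theta_deg=180.0, m=4):
    """Eq 4: [C/d**6 - D/d**4] * cos(theta)**m, zero when theta <= 90 degrees.

    theta_deg is the donor-hydrogen-probe angle; m = 4 normally and
    m = 2 for a tyrosine donor.
    """
    if theta_deg <= 90.0:
        return 0.0
    radial = c / d**6 - d_param / d**4
    return radial * math.cos(math.radians(theta_deg)) ** m

C_OXYGEN, D_OXYGEN = 3855.0, 738.0   # Table I values shared by the oxygen probes
C_AMINO, D_AMINO = 4295.0, 629.0     # Table I values for the amino probe

d_oxygen = d_min_from(C_OXYGEN, D_OXYGEN)
e_oxygen = hydrogen_bond(d_oxygen, C_OXYGEN, D_OXYGEN)
```

With the Table I oxygen values this gives a minimum near 2.8 Å with a depth of about -4.0 kcal/mol for a linear bond; bending the bond to 135° with m = 4 scales the energy by cos⁴(135°) = 0.25.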

Part of the bovine phospholipase-A2 macromolecule²¹ is depicted in Figure 2 with contours plotted at a height of -9 kcal/mol. This contour level corresponds to a significant attraction, and the binding of a water molecule should be well favored in the small globular region defined by the contours. The size of this zone is largely determined by Lennard-Jones interactions while its suitability for water is due to the presence of nearby hydrogen-bonding groups on the macromolecule, and a water molecule (H53) was in fact detected by X-ray crystallography near the center of the favored zone (Figure 2). Furthermore, the map was recontoured at more negative energy levels in order to assess the accuracy of the predictions, and this showed that the experimental position of the water molecule lies within 0.3 Å of the peak in the computed contour map.

The good agreement between observed and predicted water positions in Figure 2 might have been expected, because this particular site lies within the structure of a single phospholipase-A2 molecule and because the binding pocket is well defined by steric constraints. However, other water molecules are found close to the outer surface of the protein, where the pocket is not so obvious and where the experimental positions could in principle be influenced by interactions with neighboring macromolecules in the crystal. Whole-crystal effects of this type were not taken into account during the present calculations because they

(21) Dijkstra, B. W.; Kalk, K. H.; Hol, W. G. J.; Drenth, J. J. Mol. Biol. 1981, 147, 97.


may have little biological significance, but a typical situation near the protein surface is depicted in Figure 3. This shows a shallow pocket containing two water molecules (H59 and H96), whose observed positions are both within the calculated water contour at -9 kcal/mol as expected. However, the shape of the contours around each molecule is qualitatively different. The favored location for water H59 is clearly defined by two hydrogen bonds from N75 and O75, and its position is accordingly delineated by a globular cluster of contours peaking at the observed position. On the other hand, water H96 is surrounded by a long tube of contours showing that several different positions might be equally acceptable for this water molecule, depending on the particular hydrogen bonds formed with the protein. The actual location of water H96 appears to be influenced by the well-defined position of water H59, since H96 is observed in the contour tube at the precise place where an additional water-to-water hydrogen bond can be formed as shown in the figure.

Figure 3. Two water molecules near the outside surface of phospholipase-A2,²¹ hydrogen bonded to the protein and to each other as shown. Both lie within the -9.0 kcal/mol contours for water as expected, but another water molecule (H88) is further from the protein and is not surrounded by contours. See text.

Figures 2 and 3 demonstrate that energy contours can be of value in predicting where water molecules will bind to a protein, both inside the protein matrix and close to its surface. However, the contours become less helpful as the distance from the protein increases, and water H88 is not contained by the -9 kcal/mol water contour in Figure 3 although it is present in the experimental crystal structure. Since its distance would correspond to a second hydration layer around the protein and since its location could be influenced by intermolecular contacts within the crystal, this may not be surprising.

Amino Contours. The structure of Escherichia coli dihydrofolate reductase with the inhibitor methotrexate (I) bound at the active site has been observed by high-resolution X-ray crystallography.²² The crystals contained two conformationally distinct molecules of the enzyme, and the original authors concluded that one of these was a more representative structure than the other. Figure 4 shows the active site of the representative macromolecule, but the observed inhibitor methotrexate has been replaced in the figure by trimethoprim (III), which is another inhibitor whose structure has been independently observed.²³

[Structures I and II. I: R4 = NH2, R10 = CH3, aromatic. II: R4 = OH, R10 = H, 7,8-dihydro.]

Figure 4. The substrate binding cleft of E. coli dihydrofolate reductase,²² on which has been superimposed the enzyme inhibitor trimethoprim (molecule H1). The contours are plotted at -12.0 kcal/mol for an extended amino group. They delineate two sites of strong attraction, and the position of trimethoprim relative to the protein was adjusted so that amino nitrogens N2 and N4 of the ligand were located on these peaks. Hydrogen bonds from the ligand to the protein and to water molecule H39 are shown. See text.
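A contour level such as the -9 kcal/mol used for water or the -12.0 kcal/mol used for the amino probe amounts to selecting the GRID points whose energy lies at or below a threshold, and a "peak" is the most negative point of the map. The sketch below illustrates this, together with the comparison between an observed position and the peak, on a dictionary mapping (x, y, z) to energy. The data layout, the function names, and the toy energies are hypothetical; real contouring draws surfaces rather than listing points.

```python
import math

def favored_region(energies, level):
    """Return the grid points whose energy is at or below the contour level."""
    return [xyz for xyz, e in energies.items() if e <= level]

def peak(energies):
    """Return (xyz, energy) for the most negative point on the map."""
    xyz = min(energies, key=energies.get)
    return xyz, energies[xyz]

def distance_to_peak(energies, observed_xyz):
    """Distance (Å) from an observed position to the most negative grid point."""
    xyz, _ = peak(energies)
    return math.dist(xyz, observed_xyz)

# Toy map on a 0.5 Å grid along x; the energies are invented for illustration.
toy = {
    (0.0, 0.0, 0.0): -2.0,
    (0.5, 0.0, 0.0): -9.5,
    (1.0, 0.0, 0.0): -11.0,
    (1.5, 0.0, 0.0): -9.2,
    (2.0, 0.0, 0.0): 35.0,
}
inside = favored_region(toy, -9.0)
```

Lowering the level shrinks the selected region toward the peak, which is the idea behind recontouring a map at more negative energies to judge how well a site is defined.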

[Structures III and IV. III: R = CH3. IV: R = (CH2)nCOOH.]

The three-dimensional contours in Figure 4 were calculated for an "extended" amino group (NH3+, Table I) interacting with the protein. A GRID spacing of 1 Å was used in the calculations, since this reduces the computations by a factor of 8 relative to a 0.5 Å spacing without seriously degrading the quality of the contour surfaces. In fact, the contours in Figure 4 show two distinct peaks that could be of value when designing a novel drug to fit dihydrofolate reductase, and the precise location for the heterocyclic ring of trimethoprim in the figure was adjusted by placing N2 and N4 on the "extended" amino density peaks, while keeping the plane of the ring in the carboxy plane of side-chain Asp-27.

Bolin et al.²² comment that the corresponding heterocyclic ring of methotrexate interacts with Oδ1 and Oδ2 of Asp-27 and with the carbonyl oxygens of Ile-5 and Ile-94 (E. coli numbering). The equivalent trimethoprim interactions are shown in Figure 4 together with a water interaction at N2, whose importance is emphasized by Bolin

(22) Bolin, J. T.; Filman, D. J.; Matthews, D. A.; Hamlin, R. C.; Kraut, J. J. Biol. Chem. 1982, 257, 13650.
(23) Koetzle, T. F.; Williams, G. J. B. J. Am. Chem. Soc. 1976, 98, 2074.


et al.²² (H39 in Figure 4 is Bolin's Wat-639, ref 22). The hydroxyl of Tyr-100 is also seen between residues Ile-5 and Ile-94 and this could have some effect on the generation of "extended" amino contours around N4. However, Tyr-100 is not a conserved residue in other dihydrofolate reductase sequences, and its hydroxyl group cannot be of general importance for the binding of diamino heterocycles to these enzymes.

Figure 5. Another view of the binding cleft in dihydrofolate reductase but with contours at -4.5 kcal/mol for a carboxy oxygen atom and with a trimethoprim analogue that has a carboxy side chain. The contours indicate where the binding of this carboxy group might be expected to occur. See text. An alternative carboxy-binding region is delineated by the contours at the bottom edge of the figure.

Carboxy Contours. An analogue (IV) of trimethoprim having an O(CH2)nCOOH side chain instead of OCH3 was studied by Kuyper et al.²⁴ Those authors showed by X-ray crystallography that the carboxy group approaches residue Arg-57 of E. coli DHFR, and a GRID computation has now been performed in order to find out whether this binding mode could have been predicted by using energy contours. The GRID calculation was performed with carboxy oxygen (O-; Table I) as the probe atom, and the contours of Figure 5 were plotted at an energy level of -4.5 kcal/mol.

These contours were generated for a probe that simulates a single carboxy oxygen atom, but of course carboxy atoms occur as pairs in chemical groups. A small globular peak in the contour map (such as that shown in Figure 2) would not be able to accommodate a complete carboxy function. On the contrary, an elongated volume would be appropriate, and the largest contour feature on the carboxy map (Figure 5) has an elongated outline. Moreover it lies adjacent to Arg-57 in the expected orientation and is in good agreement with the observations of Kuyper et al.²⁴

Another significant set of contours may be observed at the bottom edge of Figure 5, and these are redrawn in the center of Figure 6 from a viewpoint that shows that there are twin peaks surrounding a group of three water molecules. Bolin et al.²² have drawn attention to this cluster that hydrogen bonds to the conserved residues Trp-22 and Leu-24 and to the functionally important residue Asp-27. They suggest that water in this position may be responsible for binding the oxygen substituent at position 4 of the natural substrate dihydrofolate (II), and the presence of oxygen contours in this position is not incompatible with their interpretation.

Figure 6. Another view of the alternative carboxy-binding region shown at the bottom of Figure 5. The contours are at -4.5 kcal/mol and surround structurally conserved water molecules.²² Nearby residues are also well conserved, and there may be a functionally important shuttle between Trp-22 and Asp-27. See text.

Figure 7. Another carboxy-binding region is indicated by these -4.5 kcal/mol contours near the amino termini of helices αC and αF of dihydrofolate reductase.²² Such regions have been discussed by Hol et al.²⁶ See text. A chloride ion (ClH2) has been observed²² at this site, as shown.

Bolin et al.²² have also pointed out that the α-carboxylate group in the glutamate moiety of methotrexate is bound in a well-defined pocket adjacent to residue Arg-67, but the γ-carboxylate binding is weak and nonspecific. A search was therefore made through the contour map for a site at which γ-carboxylate binding might be predicted,

(24) Kuyper, L. F.; Roth, B.; Baccanari, D. P.; Ferone, R.; Beddell, C. R.; Champness, J. N.; Stammers, D. K.; Dann, J. G.; Norrington, F. E. A.; Baker, D. J.; Goodford, P. J. J. Med. Chem. 1985, 28, 303.


    and a peak was actually found in the region between Oγ1 of Thr-73 and Nδ2 of Asn-59. However, these are not strictly conserved residues and neither is the nearby Lys-58, which contributes toward this peak in the carboxy contours.

    A significant feature (Figure 7) was discovered on the contour map near oxygen Oγ1 of residue Thr-46. This is a conserved amino acid that hydrogen bonds directly to a phosphate oxygen of the cofactor in crystals of the ternary enzyme-methotrexate-cofactor complex from Lactobacillus casei.25 The contours in Figure 7 are located near the N-termini of α-helices C and F and apparently indicate the sort of anion binding site that has been proposed by Hol et al.26 A chloride ion was observed here by Bolin et al.22 in the E. coli DHFR crystals and is shown in Figure 7 in the zone of attraction delineated by the contours.

    Figure 8. The dihydrofolate reductase binding cleft with amino contours at -15.0 and (heavy lines round N2) -18.0 kcal/mol. Water molecule H39 was treated as a structural part of the protein for these calculations. The hydrogen bond from H39 to trimethoprim therefore contributes to the contours surrounding N2 of the ligand, giving a higher contour peak and a still better defined location for this atom. See text.

    Water as Part of the Protein Structure.
    It may not be unreasonable to treat a water molecule that is clearly defined by a high occupancy and a low temperature factor in the X-ray refinement, or that is indicated by a high peak in the computed energy map for a water probe, as part of the protein macromolecule itself.27 In Figure 3, for example, water H59 might have been treated in this way, which would have altered the contours around water H96 so that its position could be more accurately predicted. Bolin et al.22 have suggested that water H39 of the dihydrofolate reductase structure (Figure 4) comes into the same category, and the "extended" amino computations for dihydrofolate reductase were therefore repeated with H39 parameterized as a protein atom. This had the expected effect of collapsing the lobe of contour that extends

    (25) Filman, D. J.; Bolin, J. T.; Matthews, D. A.; Kraut, J. J. Biol. Chem. 1982, 257, 13663.
    (26) Hol, W. G. J.; van Duijnen, P. T.; Berendsen, H. J. C. Nature (London) 1978, 273, 443.
    (27) Edsall, J. T.; McKenzie, H. A. Adv. Biophys. 1983, 16, 53.

    in Figure 4 from N2 of trimethoprim toward H39 while sharpening the peak around N2 itself. The new contours are plotted in Figure 8 at -15.0 kcal/mol, and a second contour level is drawn with thicker lines at -18.0 kcal/mol. This higher level only surrounds the GRID sampling point at (14.0, 66.0, 47.0), which is closest to the trimethoprim N2 peak (Figure 8) and has a maximum value of -23.5 kcal/mol.

    The major interactions of the "extended" amino probe at (14.0, 66.0, 47.0) will now be considered in some detail, because they illustrate the relative magnitudes of the pairwise components that contribute to the total energy value of a typical GRID point. The probe is cationic (NH3+; Table I) and the largest individual component is the electrostatic attraction Eel of negatively charged oxygen Asp-27 Oδ1. This is 3.08 Å distant from the probe and provides -5.7 kcal/mol, while other appreciable electrostatic attractions come from Asp-27 Oδ2 (3.99 Å; -3.3 kcal/mol), Asp-27 O (3.85 Å; -3.3 kcal/mol), Ala-6 O (4.66 Å; -2.3 kcal/mol), and Trp-30 O (5.21 Å; -1.9 kcal/mol), but against these must be set electrostatic repulsions from Asp-27 Cγ (3.90 Å; +2.7 kcal/mol), Trp-30 C (4.75 Å; +2.2 kcal/mol), and Asp-27 C (4.88 Å; +2.1 kcal/mol). These are significant energies, and atoms as far away as 6 Å can still make an individual electrostatic contribution exceeding 1 kcal/mol, while the total electrostatic term for all protein atoms more distant than this is a net attraction of -4 kcal/mol. Clearly one cannot neglect the electrostatic interactions of distant atoms when a charged probe is used, and the treatment of the dielectric effect is therefore important.14
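As a rough check on the figures quoted above, the itemized electrostatic terms can be tallied directly. The Python sketch below is only a bookkeeping aid and is not part of programs GRIN or GRID: the list name and the helper function are hypothetical, the energies are those given in the text, and the Asp-27 O term is taken as attractive because the text lists it among the attractions. Atoms within 6 Å that the text does not itemize are absent, so the tally does not reconstruct the full electrostatic term.

```python
# Itemized electrostatic contributions to the "extended" amino probe at
# GRID point (14.0, 66.0, 47.0), as quoted in the text.
# Each entry: (protein atom, distance in angstroms, energy in kcal/mol).
ITEMIZED_TERMS = [
    ("Asp-27 OD1", 3.08, -5.7),
    ("Asp-27 OD2", 3.99, -3.3),
    ("Asp-27 O", 3.85, -3.3),   # sign assumed: listed among the attractions
    ("Ala-6 O", 4.66, -2.3),
    ("Trp-30 O", 5.21, -1.9),
    ("Asp-27 CG", 3.90, +2.7),
    ("Trp-30 C", 4.75, +2.2),
    ("Asp-27 C", 4.88, +2.1),
]

# Net term quoted for all protein atoms more distant than 6 angstroms.
FAR_FIELD = -4.0


def tally(terms):
    """Split pairwise energies into attractive and repulsive sums."""
    attraction = sum(energy for _, _, energy in terms if energy < 0)
    repulsion = sum(energy for _, _, energy in terms if energy > 0)
    return attraction, repulsion, attraction + repulsion


attraction, repulsion, net = tally(ITEMIZED_TERMS)
print(f"itemized attractions: {attraction:+.1f} kcal/mol")
print(f"itemized repulsions:  {repulsion:+.1f} kcal/mol")
print(f"itemized net:         {net:+.1f} kcal/mol")
print(f"with far-field term:  {net + FAR_FIELD:+.1f} kcal/mol")
```

On these numbers the eight itemized terms net to about -9.5 kcal/mol, and the far-field term of -4 kcal/mol is of comparable size, which is the point the text makes about distant atoms.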

    The hydrogen bonds from water H39 and Asp-27 Oδ1 each have a computed strength of -3.0 kcal/mol, and there are some 15 protein atoms within 4 Å of probe position (14.0, 66.0, 47.0), each contributing over -0.2 kcal/mol to the Lennard-Jones attraction. Only Trp-30 Cδ1 generates a net Lennard-Jones repulsion, and this is smaller (+0.1 kcal/mol) than the electrostatic repulsion (+0.2 kcal/mol) due to the positive charge on the same protein atom. Hence the hydrogen bonds and Lennard-Jones forces make up roughly half the interaction, while electrostatic effects account for the remainder. However, these proportions depend on the choice of probe group and the parameters that describe it.

    Another GRID computation was performed in which water molecules H03 and H04 (Figure 6) were parameterized as a part of the E. coli dihydrofolate reductase structure. Contours were calculated with carbonyl oxygen (O; Table I) as the probe group, in order to follow the suggestion of Bolin et al.22 that this cluster of waters might facilitate the binding of the natural substrate dihydrofolate. However, no evidence was obtained to support their interesting hypothesis because no contour peak was observed in the expected location. The highest carbonyl peak in the E. coli dihydrofolate reductase binding cleft was actually near N4 of trimethoprim (Figure 4), but the hydrogen bond from Tyr-100 made a significant energy contribution to this value and Tyr-100 is not a conserved residue.

    The presence of a significant contour peak near N4, and its absence near water molecules H03 and H04, do not necessarily run counter to the proposals of Bolin et al.22 The oxygen atom at position 4 of dihydrofolate might replace a bound water molecule instead of binding to it, or the wrong water molecules might have been selected for inclusion as part of the protein structure in the present calculations, or the water positions might alter when dihydrofolate reductase interacts with substrate, or the


    contours can indicate significant features of protein-protein interactions.

    Discussion

    Before assessing the findings, it is necessary to consider the shortcomings of the method. The positions of the protein atoms were observed by X-ray crystallography, and the strengths and weaknesses of that technique have recently been reviewed in the context of drug design.1 The role of entropy is a very significant factor in ligand binding since the release of solvent molecules exerts an important influence in the hydrophobic effect,29 but entropy was neglected when computing the present energy contours. Moreover, the decision to divide up the ligand into separate probe groups and the assumption of pairwise additivity between atoms are at best unjustifiable approximations.

    Furthermore, an "extended" atom may have one set of characteristics in proteins but behave differently in other chemical environments. Thus the amino group may be NH3+ in the side chain of lysine and has been treated as such by Brooks et al. and in the present work, but the amino substituents at positions 2 and 4 of trimethoprim may be better represented by NH2. The fit of trimethoprim to the "extended" NH3+ contours is therefore surprising at first sight and may indicate that more of the positive charge on the cationic form of the bound heterocycle is delocalized to the amino substituents than is currently thought.

    This raises another problem because the distribution of electric charge may be altered in the heterocyclic and aromatic rings of trimethoprim when it interacts with the protein, and such alterations are in no way modelled by the "extended" amino group of Figure 4.
    The reality of these effects is clearly illustrated by changes in the pKa of ionizable groups in the ligand when it binds to the macromolecule,30 but they can also occur in systems where they are not detected so easily, such as the nearby amino acid residue Phe-31 of dihydrofolate reductase.

    The local conformation of the macromolecule in the observed X-ray structure may be molded around any bound ligand in the crystal. Indeed, this is probably a partial explanation for the good fit of ligands that has been observed by X-ray crystallography. It has been shown that 2,3-diphosphoglycerate influences the conformation of hemoglobin31 and that the structure of cytochrome c changes on oxidation.32,33 Such movements of atoms and charges in ligand and macromolecule from their initially preferred positions could be energetically expensive, and an appropriate part of this deformation energy should perhaps be set against Exyz. It was not offset in the present calculations, and the resulting GRID values may well be biased.

    The above shortcomings are not directed against energy contours in particular. Similar problems arise with conventional space-filling models, and they do not yield so much useful information because they only define forbidden conformations. The present approach allows the detailed attributes of an energy profile to be studied in relation to the structure of a protein, and this can be particularly helpful when novel ligands are being designed to fit a macromolecule. The most favorable sites for each

    Figure 9. Residues 13, 14, 16, 17, 101, and 118 are all part of one insulin monomer whose structure28 was used to calculate contours for an extended methyl group. These contours delineate a cleft that is filled by the side chain of residue Phe-701, coming from a neighboring insulin monomer in the crystal, thus demonstrating how the contours can be used to predict protein-protein interactions. See text.

    parameterization of the probe as a carbonyl group might have been an inappropriate representation for the oxygen substituent at position 4 of the substrate dihydrofolate. However, the use of hydroxy (OH; Table I) as an alternative probe group did not provide any more support for the hypothesis of Bolin et al.22

    Protein-Protein Interactions. The present model simulates Lennard-Jones, electrostatic, and hydrogen-bond interactions, and these same effects should influence other systems besides the binding of small ligands to proteins. An attempt was therefore made to use energy contours in order to investigate protein-protein interactions.

    Insulin consists of two different protein chains joined together by disulfide bridges to form a single monomeric subunit. GRID computations were performed with an "extended" methyl probe (CH3; Table I) on the porcine insulin monomer,28 and the strongest peak on the map is shown in Figure 9. The parameters for "extended" methyl were assigned on the assumption that it does not interact electrostatically with the protein and that it does not form hydrogen bonds. This only leaves the Lennard-Jones interaction, which is relatively weak, and the contours in Figure 9 are accordingly plotted at -1.5 kcal/mol, which corresponds to a rather modest attraction.

    Residues 13, 14, 16, and 17 are in the first protein chain, and residues 101 and 118 are in the second chain of the same insulin subunit.
    All these residues therefore contribute toward the contours in Figure 9, but insulin monomers can associate into dimers and hexamers, and residue Phe-701 does not come from the same monomer but from a neighboring monomeric subunit in the crystal. The zone of attraction from the first monomer envelops the hydrophobic groups of this Phe-701 side chain from the second as shown in the figure. In this case the contours seem to delineate a hydrophobic pocket which is exploited to facilitate a monomer-monomer association in the hexameric crystalline structure, thus demonstrating how such

    (28) Isaacs, N. W.; Agarwal, R. C. Acta Crystallogr., Sect. A 1978, A34, 782.

    (29) Tanford, C. "The Hydrophobic Effect: Formation of Micelles and Biological Membranes", 2nd ed.; Wiley: New York, 1980.
    (30) Roberts, G. C. K.; Feeney, J.; Burgen, A. S. V.; Daluge, S. FEBS Lett. 1981, 31, 85.
    (31) Arnone, A. Nature (London) 1972, 237, 46.
    (32) Warshel, A.; Churg, A. K. J. Mol. Biol. 1983, 68, 693.
    (33) Churg, A. K.; Weiss, R. M.; Warshel, A.; Takano, T. J. Phys. Chem. 1983, 87, 1683.


    J. Med. Chem. 1985, 28, 857-864

    The present examples mostly come from well-known binding sites on well-known proteins, but contour peaks often indicate favored sites where no ligand other than water has been observed so far. For example, a water molecule has been reported in the pig insulin structure at hydrogen bonding distance from a backbone nitrogen and a backbone carbonyl and Glu-4 Oε1. The highest peak on the contour map for an extended amino probe occurs in precisely the same location with a computed energy of -18 kcal/mol, making this the most favored site-probe interaction anywhere on the porcine insulin molecule.

    Sites like this, or the hydrophobic pocket illustrated in Figure 9, may be places where one macromolecule makes biologically significant contacts with another in vivo. On the other hand, nature may not use all the favored sites that occur on proteins, but they could still be appropriate targets for novel therapeutic agents. The α-terminal site of hemoglobin34 may be such an example, and although the choice of energy functions at any one time is always open to improvement, the present contour maps suggest that there really are favored sites on macromolecules and show that energy contours may provide an effective approach to the design of high-affinity ligands as drugs.

    Acknowledgment. I thank Professor Sir David Phillips for his support and encouragement. I am greatly indebted to him and to Margaret Adams, Peter Artymiuk, Jane Burridge, Will Gibson, Sheila Gover, Richard Pickersgill, Graham Richards, Neil Rogers, Michael Sternberg, and Garry Taylor for all their suggestions, advice, and help.

    Registry No. I, 59-05-2; III, 738-70-5; IV (n = 5), 82830-33-9; H2O, 7732-18-5; phospholipase-A2, 9001-84-7; dihydrofolate reductase, 9002-03-3; porcine insulin, 12584-58-6.

    kind of probe are displayed, together with the energetic penalty that must be paid if the probe group is not in exactly the best place. Moreover, the closeness of the contours indicates what forces the probe would be experiencing and their direction.

    The usefulness of energy contours might be diminished if there were many spurious peaks which could mislead the observer. However, the present results show that the major features of a predicted interaction can be studied while smaller peaks are automatically filtered out by the contouring procedure. The significance of these major effects may be judged from the figures, but the amount of information on view is easily varied by recontouring at different energy levels. This recontouring is a quick process that does not require another run of programs GRIN and GRID. On the contrary, it can be done while the energy maps are being studied, so that the user directly controls the amount of information that he is considering at any time.

    The ability of different probes to interact favorably with proteins appears to follow a general ranking order. It is not uncommon to find sites where the extended amino group has a computed energy of -15 kcal/mol, and values as high as -23 kcal/mol have been observed. At the other end of the scale comes the extended methyne group (CH), which has a low polarizability and does not hydrogen bond at all. Extended methyl (CH3) is a little better but rarely achieves more than -4 kcal/mol, while carbonyl oxygen may reach -7 and carboxy oxygen -10 kcal/mol. One cannot be precise about the ranking order, and it must be borne in mind that a complete carboxy group contains two oxygen atoms while the computation relates to one alone. Although no quantitative claims should be made for the actual contour levels, these observations suggest that amino and carboxy functions might be used with advantage when high-affinity ligands are being designed.

    (34) Beddell, C. R.; Goodford, P. J.; Kneen, G.; White, R. D.; Wilkinson, S.; Wooton, R. Br. J. Pharmacol. 1984, 82, 397.
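The recontouring step described in the Discussion amounts to re-thresholding a stored energy map, which is why it needs no further run of GRIN or GRID. A minimal Python sketch of that idea follows, assuming the map is held as a dictionary from grid coordinates to energies. The function name, the data layout, and every energy except the -23.5 kcal/mol value at (14.0, 66.0, 47.0) are hypothetical illustrations, and a real contouring routine would interpolate surfaces rather than merely select points.

```python
def points_inside_contour(energy_map, level):
    """Return the grid points whose energy is at or below the contour level."""
    return sorted(point for point, energy in energy_map.items() if energy <= level)


# Stored energy map (kcal/mol) on a small patch of grid points.
energy_map = {
    (14.0, 66.0, 47.0): -23.5,  # value quoted in the text
    (14.0, 66.0, 48.0): -16.2,  # hypothetical
    (15.0, 66.0, 47.0): -15.4,  # hypothetical
    (15.0, 67.0, 47.0): -9.8,   # hypothetical
    (16.0, 67.0, 48.0): -3.1,   # hypothetical
}

# Recontour the same stored map at two levels; no energies are recomputed.
for level in (-15.0, -18.0):
    enclosed = points_inside_contour(energy_map, level)
    print(f"{level:6.1f} kcal/mol: {len(enclosed)} point(s) enclosed")
```

Lowering the level from -15.0 to -18.0 kcal/mol shrinks the enclosed set to the single strongest point in this toy map, mirroring how the heavier contour in Figure 8 surrounds only one sampling point.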

    Comparative Computer Graphics and Solution Studies of the DNA Interaction of Substituted Anthraquinones Based on Doxorubicin and Mitoxantrone

    Suhail A. Islam,† Stephen Neidle, Bijukumar M. Gandecha,‡ Malcolm Partridge,‡ Laurence H. Patterson,‡ and Jeffrey R. Brown*§

    Cancer Research Campaign Biomolecular Structure Research Group, Department of Biophysics, Kings College, London WC2B 5RL, School of Pharmacy, Leicester Polytechnic, Leicester LE1 9BH, and Department of Pharmaceutical Chemistry, Sunderland Polytechnic, Sunderland SR2 7EE, England. Received July 19, 1984

    1-[[(Diethylamino)ethyl]amino]- and 1,4-, 1,5-, and 1,8-bis[[(diethylamino)ethyl]amino]anthraquinones are shown to intercalate into DNA. Computer graphics modelling of their intercalation into the self-complementary deoxydinucleoside d(CpG) showed differences in binding properties. While the 1-substituted compound can bind from either groove, the 1,8-disubstituted compound binds with both substituents in the major groove. In the low-energy state of the complex with the 1,5-disubstituted compound, this ligand straddles the site with a substituent in each groove; to do this, the compound must bind to a non-base-paired region, so inducing base pairing. The 1,4-compound binds from the major groove; straddling is also possible if full minimization of deoxydinucleoside geometry is performed. The differences in binding mode and interaction energies are reflected in the affinities of interaction (1,5- > 1,4- >> 1,8- > 1-); also the antiproliferative effects in vitro are in general agreement with this ranking.

    Doxorubicin (1) has a wide spectrum of antitumour activity, and although it shows other cellular effects, its cytotoxic effect is probably a result of its intercalative interaction with nuclear DNA.4 This DNA interaction could lead to the cytotoxicity either directly (by inhibition

    † University of London.
    ‡ Leicester Polytechnic.
    § Sunderland Polytechnic.

    of DNA and RNA synthesis)5 or indirectly (by causing single or double strand breaks).6 In an attempt to mimic

    (1) Young, R. C.; Ozols, R. F.; Myers, C. E. N. Engl. J. Med. 1976, 305, 139.
    (2) Moore, H. W.; Czerniak, R. Med. Chem. Rev. 1981, 1, 249.
    (3) Tritton, T. R.; Yee, G.; Wingard, L. B. Fed. Proc., Fed. Am. Soc. Exp. Biol. 1983, 42, 284.
    (4) Brown, J. R. Prog. Med. Chem. 1978, 15, 125.
    (5) DiMarco, A. Antibiot. Chemother. 1978, 23, 216.

    0022-2623/85/1828-0857$01.50/0 © 1985 American Chemical Society