

BioSystems 43 (1997) 199–204

Non-random ionic-charge distribution responsible for the structural stability and molecular recognition of proteins

Kunitsugu Soda *, Keiichi Kakuyama, Yoichiro Miki

Department of Bioengineering, Nagaoka University of Technology, Nagaoka, Niigata 940-21, Japan

Received 25 March 1997; received in revised form 6 June 1997; accepted 13 June 1997

Abstract

The ‘ionic-charge shuffling method’ is presented to generate a complete set of electrostatic mutants for a natural protein, in which the ionic charges on the molecular surface of the template protein are exhaustively interchanged with each other. Total Coulomb interaction energies are evaluated for all of the mutants by numerically solving the finite-difference Poisson-Boltzmann equation, and their distribution in the ensemble is obtained. This method has been applied to five natural proteins to reveal that they have a significantly lower Coulomb energy than the average over the ensemble of their mutants. It is also shown that these natural proteins have significantly more pairs of attractive ionic groups, and significantly fewer pairs of repulsive ionic groups, than expected for their randomly shuffled ensemble: they have been ‘designed’ through molecular evolution so that a pair of ionic charges with opposite signs tends to be located close to each other, while a pair with the same sign tends to be located away from each other. © 1997 Elsevier Science Ireland Ltd.

Keywords: Charges; Coulomb energy; Molecular evolution; Mutants; Proteins

1. Introduction

Most biological functions in life are performed by proteins. Bioinformation, for example, is processed through tactile interactions between enzymes and their substrates (Conrad, 1992). The native structure of proteins is responsible for their specific interaction with other proteins or ligands. Explaining how the stability of these native three-dimensional structures is maintained is one of the most urgent problems in life science. In this work, we focus on the electrostatic contribution to protein stability from ionic charges on the surface of proteins (Nakamura, 1996).

* Corresponding author. Tel.: +81 258479424; fax: +81 258479424; E-mail: [email protected]

0303-2647/97/$17.00 © 1997 Elsevier Science Ireland Ltd. All rights reserved.

PII S0303-2647(97)00038-5

Table 1
Proteins taken for analysis

Protein             PDB code   Resolution/nm   Naa^a   Nbase^b    Nacid^c   qnet^d
Ubiquitin           1ubq       0.18             76     12(1)      11         0
Trypsin inhibitor   4pti       0.15             58     10(0)       4        +6
RNase T1 isozyme    1rga       0.17            104      6(3)      12        −7^e
Lysozyme            1lz1       0.17            129     20(1)      11        +8
Cytochrome c        3cyt       0.15            103     19(1)^f     9        +9^g

a The number of amino acid residues. b The number of basic residues (histidines in parentheses). c The number of acidic residues. d The net charge at pH 7 including charges of the N- and C-terminal groups, and hetero atoms/groups, if any. e The charge of the Ca2+ ion is included. f His18 is excluded as it is assumed neutral due to its coordination to heme. g The charge of ferric heme is included. The N-terminus is acetylated.

It was pointed out earlier that, in a protein molecule, atoms carrying charges of opposite sign tend to be located close to each other, while atoms carrying charges of the same sign tend to be located apart from each other (Wada and Nakamura, 1981; Barlow and Thornton, 1983). To estimate to what degree the charge distribution of natural proteins deviates from the random distribution, we have developed the ‘ionic-charge shuffling method’, in which a complete set of charge constellations for any given protein can be generated by exhaustively permuting the ionic charges under the restriction that either a positive or a negative charge can be loaded only on the terminal ionizable group of amino-acid side chains. Total Coulomb energies are calculated for all the mutants by solving the Poisson-Boltzmann equation for an aqueous solution (Nakamura and Nishida, 1987; Takahashi et al., 1992), and the distribution of their magnitudes in the ensemble is analyzed. The charge–charge distance distribution for the ensemble is also given to discuss the non-randomness of the ionic charge distribution of natural proteins.

Spassov and Atanasov (1994) made a similar study based on a model in which all the side-chain atoms with non-zero accessibility to solvent, except the β carbon, are assumed to be chargeable sites. They concluded from their analysis that ionic charges with the same sign tend to be located away from each other, but that the mutual distribution of opposite charges does not differ significantly from that of the mutant ensemble. We compare their results with ours to show that this conclusion was brought about by the specific nature of the model taken in their analysis. Further, from the viewpoint of protein engineering, we note that our method offers an efficient way of remodelling natural proteins to improve their structural stability.

2. Methods

The three-dimensional structures of the proteins analyzed in this study (Table 1) were selected from the Brookhaven Protein Data Bank (PDB); all have a spatial resolution better than 0.2 nm. They were chosen so as to include proteins differing in the number of acidic and basic residues.

To generate an ensemble of mutant proteins with altered charge distributions, for a given protein we first define a system of sites on which the electric charges of the ionizable groups and ligand ions are loaded. We assume that only these sites in the natural protein are available as chargeable sites. In shuffled proteins, charges on side-chain ionizable groups are interchanged with each other, while those of the N- and C-terminal groups and ligand ions are kept unchanged. The positions and magnitudes of the ionic charges of amino-acid residues in natural proteins are assigned in the following way: the guanidyl groups of arginines, the amino groups of lysines and the N-terminus (except for N-modified proteins) are assumed to be protonated, having a unit positive charge. All histidines are assumed to be deprotonated. The positive charge of the $\mathrm{NH_3^+}$ group is located at the Nα atom for the N-terminus and at the Nζ atom for lysine. For arginine, the charge is assigned to the Cζ atom. All the carboxyl groups of aspartates, glutamates and the C-terminus are assumed to be deprotonated, having a unit negative charge located at their carbon atom. The Coulomb energy is estimated among these ionic charges without including atomic partial charges. Thus, the terminal groups and the ligand ions were not included in the shuffling but were included in the estimation of the Coulomb energy.
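The charge-assignment rules above can be sketched as a small lookup table. This is only an illustration: the PDB-style atom names (`CZ`, `NZ`, `CG`, `CD`, `N`, `C`), the function names, and the toy peptide are assumptions introduced here, not part of the original program.

```python
# Hypothetical lookup: residue name -> (charge-site atom name, unit charge).
# Atom names follow the usual PDB convention; they are an assumption of this sketch.
SIDE_CHAIN_SITES = {
    "ARG": ("CZ", +1),  # guanidyl group, charge placed on C-zeta
    "LYS": ("NZ", +1),  # amino group, charge placed on N-zeta
    "ASP": ("CG", -1),  # side-chain carboxyl carbon
    "GLU": ("CD", -1),  # side-chain carboxyl carbon
}  # histidine is absent: it is treated as deprotonated (neutral)


def chargeable_sites(residues, n_term_modified=False):
    """Split charges into shuffled side-chain sites and fixed terminal sites.

    residues: list of (residue_name, residue_number) in sequence order.
    Each site is returned as (residue_number, label, atom_name, charge).
    """
    shuffled = []
    for name, number in residues:
        if name in SIDE_CHAIN_SITES:
            atom, charge = SIDE_CHAIN_SITES[name]
            shuffled.append((number, name, atom, charge))
    fixed = []
    if not n_term_modified:  # e.g. an acetylated N-terminus carries no charge
        fixed.append((residues[0][1], "N-term", "N", +1))
    fixed.append((residues[-1][1], "C-term", "C", -1))
    return shuffled, fixed


def net_charge(shuffled, fixed, ligand_charge=0):
    """Net charge from side chains, termini and any bound ligand ions."""
    return sum(site[3] for site in shuffled + fixed) + ligand_charge


# Made-up six-residue peptide, used only to exercise the functions.
peptide = [("MET", 1), ("LYS", 2), ("ASP", 3), ("HIS", 4), ("ARG", 5), ("GLU", 6)]
shuffled, fixed = chargeable_sites(peptide)
print(len(shuffled), net_charge(shuffled, fixed))  # 4 0
```

Keeping the terminal sites in a separate list mirrors the text: they contribute to the Coulomb energy but never take part in the shuffling.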

Since the size of the ensemble of charge-shuffled mutants can be enormous for proteins with 20 or more charges, it is essential to minimize the computation time for the total Coulomb energy of each mutant. This requirement is fulfilled in the above model because the chargeable sites are fixed in space and, as a result, the absolute value of the Coulomb energy $|u_{ij}|$ for a pair of charges located at sites $i$ and $j$ is a constant independent of their signs. The total Coulomb energy $U_a$ of a mutant $a$ is given by a combination of the $|u_{ij}|$ as

$$U_a = \sum_{i,j} \mathrm{sign}(i,j)\,|u_{ij}|,$$

where $\mathrm{sign}(i,j)$ is $+1$ or $-1$ if the charges at sites $i$ and $j$ have the same or opposite signs, respectively, and the summation runs over all pairs $(i,j)$.
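The sum above can be written in a few lines once the pairwise magnitudes are tabulated. The following is a minimal sketch, assuming the magnitudes are stored in a symmetric matrix `abs_u` and the charge signs in a list `signs`; both names and the toy values are hypothetical.

```python
from itertools import combinations


def total_coulomb_energy(signs, abs_u):
    """Sum sign(i, j) * |u_ij| over all unordered pairs of sites.

    signs: list of +1 / -1, one entry per chargeable site.
    abs_u: symmetric matrix of pairwise energy magnitudes (kcal/mol).
    """
    total = 0.0
    for i, j in combinations(range(len(signs)), 2):
        pair_sign = 1 if signs[i] == signs[j] else -1  # same sign repels
        total += pair_sign * abs_u[i][j]
    return total


# Made-up magnitudes for three sites, in kcal/mol.
abs_u = [
    [0.0, 2.0, 0.5],
    [2.0, 0.0, 1.0],
    [0.5, 1.0, 0.0],
]
print(total_coulomb_energy([+1, -1, +1], abs_u))  # -2.5
print(total_coulomb_energy([+1, +1, -1], abs_u))  # 0.5
```

Because `abs_u` does not depend on the constellation, each mutant costs only one pass over the pairs; the expensive electrostatics calculation is needed once per pair of sites rather than once per mutant.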

Hence, we need only calculate a set of inter-site Coulomb energies, $\{u_{ij}\}$, for all pairs of sites. The absolute value of the interaction energy $u_{ij}$ is calculated by numerically solving the finite-difference Poisson-Boltzmann equation on the basis of the continuum model (Takahashi et al., 1992), where the effect of the ionic strength of the solution is incorporated through the Debye-Hückel screening parameter $\kappa$. We have tested two values of $\kappa$, 0 and 1 nm$^{-1}$, the latter of which corresponds to a Debye radius of 1 nm or an ionic strength of about 0.1 M NaCl.
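As a rough check of the quoted correspondence between $\kappa$ and ionic strength, the standard Debye-Hückel expression for a 1:1 electrolyte can be evaluated. The numerical prefactor below assumes water at 25 °C with a relative permittivity of about 78.5; these conditions are assumptions of this check and are not stated in the text. In the general expression the ionic strength $I$ is in mol m$^{-3}$.

$$\kappa^{-1} = \sqrt{\frac{\varepsilon_r \varepsilon_0 k_B T}{2 N_A e^2 I}} \;\approx\; \frac{0.304\ \text{nm}}{\sqrt{I / (\text{mol L}^{-1})}}$$

$$I = 0.1\ \text{mol L}^{-1}: \quad \kappa^{-1} \approx \frac{0.304\ \text{nm}}{0.316} \approx 0.96\ \text{nm}, \qquad \kappa \approx 1.04\ \text{nm}^{-1}$$

Under these assumptions, $\kappa = 1$ nm$^{-1}$ indeed corresponds to roughly 0.1 M of a 1:1 salt, while $\kappa = 0$ is the salt-free limit.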

3. Results and discussion

Fig. 1. Distribution of the total Coulomb energy for charge-shuffled proteins in an aqueous solution with the Debye-Hückel screening parameter $\kappa = 0$ nm$^{-1}$ (solid lines) and $\kappa = 1$ nm$^{-1}$ (dotted lines). Inset figures indicate the total Coulomb energy for the natural protein. Proteins (PDB code, the number of shuffled positive and negative charges): (a) ubiquitin (1ubq, 11, 11); (b) bovine pancreatic trypsin inhibitor (4pti, 10, 4); (c) RNase T1 isozyme (1rga, 3, 11); the charge of Asp15 is fixed because it is very near to a Ca2+ ion; (d) human lysozyme (1lz1, 18, 11); the charge of Arg41 is fixed because it is located close to the N-terminal amino group; (e) tuna ferricytochrome c (3cyt, 17, 9); the positive charge of Arg38 is fixed because it is located close to the heme.

Table 2
Characteristic parameters of the Coulomb energy distribution for the ensemble of mutants

Protein code    $\kappa = 0$ nm$^{-1}$: $U_0$^a  $\bar{U}$^b  $\sigma$^c  $L$^d    $\kappa = 1.0$ nm$^{-1}$: $U_0$^a  $\bar{U}$^b  $\sigma$^c  $L$^d

−10.93 −1.44 5.151ubq −13.74 −3.36 1.415.88 1.909.591.791.164pti −0.02 −1.2652.88 2.23 9.59

−8.44 1.95 7.161rga 9.34−6.85 4.12 7.27 10.21.388.406.861lz1 −10.51 −10.909.44 9.29 1.07

9.09 5.553cyt −6.26 10.20 9.05 2.61 −8.891 6.17

a The Coulomb energy of the natural protein, kcal/mol. b The average Coulomb energy of the ensemble of charge-shuffled mutants, kcal/mol. c S.D. of the distribution of Coulomb energy for the ensemble, kcal/mol. d Percentage of the total population of mutants with a Coulomb energy lower than $U_0$.

Fig. 1 shows the distribution of the total Coulomb energy for the ionic-charge shuffled mutants of five proteins. For example, human lysozyme (1lz1) has 18 positive and 11 negative charges to be shuffled, as 5 lysines, 13 arginines, 8 aspartates, and 3 glutamates are ionized at neutral pH. Consequently, the total number of charge-shuffled mutants is given by the number of combinations ${}_{29}C_{18}$ ($= {}_{29}C_{11}$), which amounts to about $3.46 \times 10^7$. The positive charge of Arg41 was kept unchanged because it is located close to the positive charge of the N-terminal group. Such an unfavorable interaction would not occur in a free protein in solution unless it had some functional role. We suppose that it is an artifact of protein crystallization. There must be some ionic charge(s) with the opposite sign in the vicinity of Arg41, such as salt ion(s) and/or other ionizable group(s) of neighboring molecules in the crystal, to compensate for the unfavorable interaction. In fact, we found Asp102 and Asp120 of the neighboring molecule at a distance of 0.8 nm from Arg41. There may also be some salt anion(s) loosely bound to Arg41 or the N-terminal group. Nevertheless, we have confirmed that the results obtained are not essentially changed if Arg41 is included as a chargeable site in the analysis. Table 2 lists parameters characterizing the distribution of Coulomb energy for the ensemble of mutants obtained for the five proteins.
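The ensemble size quoted for lysozyme, and the exhaustive enumeration itself, can be illustrated with the standard library. This is a sketch under the assumption that a mutant is fully described by which sites carry the positive charges; the function names are hypothetical.

```python
from itertools import combinations
from math import comb


def count_mutants(n_positive, n_negative):
    """Number of distinct ways to place the positive charges on the sites."""
    return comb(n_positive + n_negative, n_positive)


def charge_constellations(n_positive, n_negative):
    """Yield every +1/-1 assignment over the shuffled sites (native included)."""
    n_sites = n_positive + n_negative
    for positive_sites in combinations(range(n_sites), n_positive):
        chosen = set(positive_sites)
        yield tuple(+1 if k in chosen else -1 for k in range(n_sites))


print(count_mutants(18, 11))                    # 34597290, about 3.46e7
print(len(list(charge_constellations(2, 2))))   # 6
```

A generator is used so that constellations can be consumed one at a time; holding tens of millions of tuples in memory at once would be unnecessary for accumulating a histogram.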

A prominent characteristic common to the natural proteins is that their Coulomb energy $U_0$ is significantly lower than the average over the ensemble of shuffled mutants, $\bar{U}$. For example, the value of $U_0$ for lysozyme in pure water ($\kappa = 0$ nm$^{-1}$) is −10.5 kcal/mol, while that of $\bar{U}$ is +9.44 kcal/mol. The difference between $U_0$ and $\bar{U}$ is more than twice the S.D. of the distribution, 9.29 kcal/mol. The number of species with a Coulomb energy lower than $U_0$ is very small: they account for only 1.07% of the total population of mutants. Fig. 1 also shows the results for the four other proteins, ubiquitin (1ubq), pancreatic trypsin inhibitor (4pti), ribonuclease T1 isozyme (1rga), and cytochrome c (3cyt). We can see from this figure that qualitatively similar but quantitatively different results are obtained for these proteins.
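The three ensemble parameters discussed here (mean, S.D., and the percentage of mutants below the native energy) can be computed as in the sketch below. The function name and the toy energies are hypothetical, and the use of the population S.D. is an assumption, since the text does not say which estimator was used.

```python
from statistics import fmean, pstdev


def ensemble_stats(native_energy, mutant_energies):
    """Return (mean, S.D., percentage of mutants with energy below native)."""
    mean = fmean(mutant_energies)
    sd = pstdev(mutant_energies)  # population S.D.; an assumption of this sketch
    n_lower = sum(1 for u in mutant_energies if u < native_energy)
    percent_lower = 100.0 * n_lower / len(mutant_energies)
    return mean, sd, percent_lower


# Made-up ensemble energies in kcal/mol.
mean, sd, percent_lower = ensemble_stats(-2.0, [-4.0, -1.0, 0.0, 2.0, 3.0, 6.0])
print(mean, percent_lower)

# Lysozyme values quoted in the text for kappa = 0: U0, mean, S.D.
z = (-10.5 - 9.44) / 9.29
print(round(z, 2))  # -2.15, i.e. more than two S.D. below the mean
```

For the real ensembles, the mean and S.D. would more sensibly be accumulated in a streaming fashion rather than from a stored list.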

Aspartate 15 in ribonuclease T1 isozyme was also excluded from shuffling because it is situated very close to the bound Ca2+ ion, which would have to be released if the site of Asp15 were positively charged. The value of $\bar{U}$ for ubiquitin, which has no net charge at pH 7, is negative, whereas those for the other proteins, which are acidic or basic, are positive. This is because a neutral protein suffers fewer unfavorable Coulomb interactions than basic or acidic proteins, whose charge balance is biased. The curve for ribonuclease T1 isozyme is somewhat ragged because the number of ionic charges to be shuffled is small.

From comparison of the results for $\kappa = 0$ and 1.0 nm$^{-1}$ in Table 2, we can see that the increase in $\kappa$ leads to a decrease in the absolute value of $\bar{U}$ for all proteins. This is due to electrostatic shielding: adding salt to the solution increases $\kappa$, which decreases the magnitude of $|u_{ij}|$, especially for a pair of charges exposed to solvent and not forming an ion pair. Nevertheless, the finding that $U_0$ is much lower than $\bar{U}$ is qualitatively unchanged by the increase in $\kappa$. Hence, we conclude that the distribution of ionic charges on the surface of natural proteins is not random but has been ‘designed’ through molecular evolution so as to lower their Coulomb energy.


Table 3
Number of attractive and repulsive charge pairs^a

Protein PDB code   Case   Attractive pair       Repulsive pair
                          n0,a     na           n0,r     nr
1ubq                      2        0.77         0        1.23
1rga                      2        0.73         0        1.27
1lz1^b             a      4        1.25         0        2.75
1lz1^c             b      4        1.87         1        3.13
3cyt^d                    3        0.86         0        2.14

n0,a (n0,r): number of attractive (repulsive) pairs in natural proteins, except the pairs fixed in shuffling. na (nr): average number of attractive (repulsive) pairs for the ensemble of mutants.
a The calculation was not made for 4pti because it has no charge pair by our definition. b The charge of Arg41 is fixed and not included in the shuffling. c Arg41 is included in the shuffling for comparison with case a. d Arg38 is buried in the protein interior and assumed to be deprotonated.

To identify the mechanism responsible for the above observation, we examined the number of attractive and repulsive pairs of charged groups in the natural protein and its charge-shuffled mutants. Here, the ‘attractive’ and the ‘repulsive’ pairs are defined as follows: an attractive pair is a pair of charges with opposite signs interacting favorably, with $u_{ij}$ lower than −2 kcal/mol, and a repulsive pair is a pair of charges with the same sign interacting unfavorably, with $u_{ij}$ higher than 2 kcal/mol.
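The pair definition above translates directly into a counting routine. The sketch below assumes the same hypothetical representation as before (a list of charge signs and a symmetric matrix of magnitudes); the names and the toy values are not from the original program.

```python
from itertools import combinations

THRESHOLD = 2.0  # kcal/mol, cutoff on |u_ij| taken from the definition above


def count_pairs(signs, abs_u, threshold=THRESHOLD):
    """Return (n_attractive, n_repulsive) for one charge constellation."""
    n_attractive = n_repulsive = 0
    for i, j in combinations(range(len(signs)), 2):
        if abs_u[i][j] <= threshold:
            continue               # interaction too weak to count as a pair
        if signs[i] == signs[j]:
            n_repulsive += 1       # u_ij = +|u_ij| > 2 kcal/mol
        else:
            n_attractive += 1      # u_ij = -|u_ij| < -2 kcal/mol
    return n_attractive, n_repulsive


# Made-up magnitudes for four sites, in kcal/mol.
abs_u = [
    [0.0, 3.0, 0.5, 2.5],
    [3.0, 0.0, 1.0, 0.2],
    [0.5, 1.0, 0.0, 4.0],
    [2.5, 0.2, 4.0, 0.0],
]
print(count_pairs([+1, -1, +1, -1], abs_u))  # (3, 0)
print(count_pairs([+1, +1, -1, -1], abs_u))  # (1, 2)
```

In both toy constellations the two counts add up to three, the number of site pairs whose magnitude exceeds the cutoff; only the split between attractive and repulsive depends on how the charges are arranged.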

The numbers of attractive and repulsive pairs observed in natural proteins and those expected for the corresponding ensemble of mutants are compared in Table 3. Owing to the model employed, the sum of the numbers of attractive and repulsive pairs is exactly the same for the real protein and the mutant ensemble: the chargeable sites are spatially fixed, so the total number of charge pairs is independent of the distribution of charges. We can see that every natural protein has fewer repulsive pairs and more attractive pairs than the ensemble of its mutants (Wada and Nakamura, 1981). Only one repulsive pair is found, that between the N-terminal amino group and Arg41 in lysozyme (case b in Table 3); all other charge pairs are attractive. Hence, it is concluded that the distribution of ionic charges of natural proteins is designed so that charges with opposite signs are located closer to each other and those with the same sign farther from each other.

Spassov and Atanasov carried out a similar charge-shuffling analysis using a model in which any side-chain atom with non-zero accessibility to solvent, except the β carbon, is available as a chargeable site. Counting the number of pairs of charged groups within a distance of 0.5 nm, they found that, for most proteins, the number of attractive pairs is nearly the same in the natural protein and its mutant ensemble, but the number of repulsive pairs is significantly smaller in natural proteins. They concluded from this result that evolution effectively removed only the ‘bad’ electrostatic contacts, but has not gone far in the creation of more attractive interactions (Spassov and Atanasov, 1994). An obvious difference between our model and theirs is that the average total number of ion pairs in the mutant ensemble differs from that of the real protein in their model, while both numbers are exactly the same in our model.

To see this point in more detail, the distance distribution of chargeable sites in their model (solid line) and that in our model (bars) are shown in Fig. 2 for lysozyme with 30 chargeable sites. The two profiles clearly differ from each other: their distribution is shifted slightly toward shorter distances relative to ours, and the number of ion pairs within a distance of 0.5 nm is larger in their model than in ours. The reason why the numbers of attractive ion pairs for the real protein and the ensemble are nearly the same in their analysis is that the total number of ion pairs is larger in their ensemble than in the real protein. From our analysis, based on the model that only the terminal group of ionic residues can be charged, the number of attractive pairs in natural proteins is found to be significantly larger than that expected from the random distribution, as seen in Table 3. Thus, the difference between our conclusion and their results arises from the difference in the number of chargeable sites assumed for shuffled mutants.

Fig. 2. Distribution of the distance between charged sites in shuffled mutants of lysozyme with 30 chargeable sites. Our model (bars) and the model by Spassov and Atanasov (solid line) are compared. See the text for details.

We propose that such an analysis should preserve the condition that mutant proteins, like natural ones, are composed of existing amino acids. From this viewpoint, our analysis is also approximate and does not fully satisfy this condition: charge-shuffled mutants are generated through interchanges not of side chains but of their charges. Nevertheless, our analysis can be justified for two reasons: (1) an analysis based on the interchange of charged amino acids would be practically impossible because it requires an enormously long computation time for the evaluation of the electrostatic energy of the shuffled molecules; and (2) the conclusions obtained from our present model are expected to be qualitatively unchanged even when the exact model is taken.

In regard to protein engineering, our method will provide an efficient way of remodelling proteins, in which their ionizable amino-acid residues are permuted to yield a protein with the same net charge but higher electrostatic stability. As described above, our model is realistic in that chargeable sites are located at the terminal titratable group of side chains, which are generally exposed to solvent and moderately distant from each other. We found that even a single permutation can lower the Coulomb energy by 3 kcal/mol or more for all proteins tested. For example, the permutation of Asp2 and Lys86 of cytochrome c in solution with a $\kappa$ of 1.0 nm$^{-1}$ yields a total Coulomb energy of −12.1 kcal/mol, which is lower than $U_0$ by 3.2 kcal/mol. The present method can therefore be usefully applied to predict which sites should be interchanged to remodel a natural protein.

Acknowledgements

We thank Drs H. Nakamura and T. Takahashi for their kind permission to use their computer program. We are also grateful to R. Hatanaka for his help in computer programming.

This work was partly supported by Grants-in-Aid for Scientific Research Nos. 08272208 and 06680646 from the Ministry of Education, Science, Culture and Sport of Japan.

References

Barlow, D.J., Thornton, J.M., 1983. Ion-pairs in proteins. J. Mol. Biol. 168, 867–885.

Conrad, M., 1992. Quantum molecular computing: the self-assembly model. Int. J. Quantum Chem., Quantum Biol. Symp. 19, 125–143.

Nakamura, H., 1996. Roles of electrostatic interaction in proteins. Q. Rev. Biophys. 29, 1–90.

Nakamura, H., Nishida, S., 1987. Numerical calculations of electrostatic potentials of protein-solvent systems by the self consistent boundary method. J. Phys. Soc. Jpn. 56, 1609–1622.

Spassov, V.Z., Atanasov, B.P., 1994. Spatial optimization of electrostatic interactions between the ionized groups in globular proteins. Proteins Struct. Funct. Genet. 19, 222–229.

Takahashi, T., Nakamura, H., Wada, A., 1992. Electrostatic forces in two lysozymes: calculations and measurements of histidine pKa values. Biopolymers 32, 897–909.

Wada, A., Nakamura, H., 1981. Nature of the charge distribution in proteins. Nature 293, 757–758.