Predicting antigenic sites on proteins

TIBTECH-MAY 1991 [Vol. 9] 163

Peter S. Stern

The ability to predict antigenic sites on proteins is of major importance for the production of synthetic peptide vaccines and synthetic peptide probes of antibody structure. Many predictive methods, based on various assumptions about the nature of the antigenic response, have been proposed and tested. This review will discuss the principles underlying the various approaches to predicting antigenic sites and will attempt to answer the question of how well they work.

The ability to predict antigenic sites (also known as antigenic determinants or epitopes) on proteins could be applied widely. Knowing the amino acid sequence of the protein sites most likely to be recognized by antibodies would enable corresponding oligopeptides to be made, and antibodies to be raised against them that would bind the native protein antigen. This approach could be applied to developing synthetic peptide vaccines for a variety of infections, such as anti-bacterial vaccines against diphtheria and cholera, anti-viral vaccines against hepatitis, influenza and AIDS, and anti-parasite vaccines against malaria¹. It could also be extended to developing antibodies to specific reproductive antigens to disrupt fertility and serve as a means of immunological birth control¹. Eventually, it may be possible to synthesize peptide vaccines capable of raising antibodies to specific classes of tumor cells. Predicting antigenicity also permits the selection of suitable synthetic peptides from a DNA sequence whose gene product is unknown. These peptides can be used to generate antibodies, which can then be used as probes to isolate the unknown gene product².

These applications, however, presume an understanding of the immunogenicity (the ability to generate antibodies) and antigenicity (the ability to be recognized by antibodies) of proteins, that is, of the characteristics necessary and sufficient for a selected peptide to be effective as an antigen capable of raising antibodies that also recognize the native protein (i.e. that are cross-reactive). Different types of epitopes exist. Antigenic sites were originally classed as either sequential (depending only upon the amino acid sequence and not upon a specific conformation) or conformational³. More recently, however, all antigenic sites have come to be considered conformational, and are classified as either continuous (i.e. a continuous stretch of the amino acid chain) or discontinuous, in which the residues of the epitope may not be contiguous in sequence but are brought close together by folding into the three-dimensional structure of the protein⁴. Much research has been devoted to understanding antigenicity (for reviews see Refs 2,4-7), and different models have been proposed. One suggests that only certain regions of the protein surface are antigenic⁸, though it is now generally accepted that most regions, if not the entire surface, of the protein are potentially antigenic⁴.

P. S. Stern (currently on sabbatical at the Center for Advanced Research in Biotechnology, University of Maryland, 9600 Gudelsky Drive, Rockville, MD 20850, USA) is from the Chemical Physics Department, The Weizmann Institute of Science, 76100 Rehovot, Israel.

What characterizes an antigenic site?

The first predictive approach was to guess which residues, based on their hydrophilicity, would be on the protein surface. Subsequently, the statistical probability of a residue being exposed on the surface was introduced as a parameter. Later, significant correlations between antigenicity and the segmental mobility of the antigenic region of the protein were found⁹⁻¹¹. Antibodies to regions of myohaemerythrin (MHr)¹⁰,¹¹ determined from X-ray crystallographic temperature factors (B values) to be highly mobile reacted strongly with the native protein, while anti-peptide antibodies to regions which were well ordered did not¹⁰. There is a correlation between mobility and surface exposure, since regions of the protein which are 'buried' tend to be less flexible than exposed regions on the surface. However, not all surface residues are necessarily highly mobile; three peptides representing less mobile surface regions of MHr generated antibodies which did not react strongly with the intact protein¹⁰. Westhof et al.⁹ found a similar correlation in studies of the ability of peptides to inhibit the reactions of tobacco mosaic virus protein (TMVP), myoglobin and lysozyme with antibodies to the native protein. In each case, segmental mobility was found to correlate with the location of antigenic determinants. In TMVP, three hexapeptides corresponding to accessible, but not particularly mobile, regions on the surface of the protein were found to have no significant antigenic activity. Tainer et al.¹¹ compared the antigenic properties of a dozen proteins (including MHr, insulin, cytochrome c, three globins, lysozyme, lactate dehydrogenase, ribonuclease and three non-mammalian proteins) with X-ray crystallographic temperature factors, hydropathicity and surface accessibility. They concluded that much of the available data is consistent with antigenicity being most highly correlated with structural mobility.

© 1991, Elsevier Science Publishers Ltd (UK) 0167-9430/91/$2.00

Others have proposed that it is not mobility, but shape accessibility, which determines antigenicity. Novotny et al.¹² suggest that the area of protein accessible to a 10 Å probe correlates better than mobility with antigenicity. Thornton et al.¹³ developed a 'protrusion index' to locate continuous antigenic determinants which protrude from the surface of proteins of known structure. For the three proteins they analysed, antigenicity was best predicted using mobility and protrusion parameters, followed by accessibility and then by hydropathicity. Geysen et al.¹⁴ came to a similar conclusion.

Table 1. Evaluation of methods available to predict antigenicity

| Method | Advantages | Shortcomings | Refs |
|---|---|---|---|
| Hydropathicity | Uses sequence | Not a determinant of antigenicity | |
| - Hydropathicity plots | Accessible | Poorly correlated | 22, 23 |
| - Neighbor interaction | Improvement of above methods | Not evaluated, not accessible | 25 |
| Surface topography | Well correlated | Need three-dimensional protein structure | |
| - Shape accessibility | | Correlated with mobility | 12 |
| - Protrusion index | | Correlated with mobility | 13 |
| - Packing density | | Correlated with mobility | 14, 15 |
| Secondary structure | Uses sequence | Not reliable | 28, 29 |
| Mobility | Well correlated | Need three-dimensional protein structure | |
| - X-ray | | | 9, 10 |
| - Normal modes | | | 30 |
| - Molecular dynamics | | Not practical | |
| - Predicted flexibility | Uses sequence | Poorly correlated | 31 |
| Antigenicity | Uses sequence | Poorly correlated | 35, 36 |
| Combined protocols | Improvement over individual methods | Limited use | |
| - Surface profile | | | 32 |
| - Antigenic index | | | 33 |
| - Difference profile | | | 34 |
| Amphipathicity | Useful for T-cell antigens | Over-predicts (a) | 19, 20 |

(a) i.e. only a small fraction of predicted antigenic sites is correct.

Geysen et al. mapped the immunological reactivity of MHr by synthesizing all 113 possible hexapeptides of the protein and compared the reactivity with different chemical and physical properties of these hexapeptides. Antigenic frequency correlated best with mobility, followed by packing density, shape accessibility and surface exposure, and correlated only poorly with hydropathicity. Negative electrostatic potential was also correlated with antigenicity. Nine antigenic sites on MHr were analysed by synthesizing all possible analogs with single amino acid substitutions¹⁵ (i.e. 120 analogs for a hexapeptide). In this way the residues critical for interaction with the antibodies to MHr were identified; buried residues may be exposed, and their binding to the antibody facilitated, by local side-chain displacements induced by the initial binding to surface amino acid residues.
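The peptide counts quoted above can be checked with a short sketch. It assumes a chain of 118 residues (the length implied by 113 overlapping hexapeptides) and reads the figure of 120 as the 20 standard residues placed at each of the six positions, parent residue included; both readings are assumptions, and the function names and the toy sequence are purely illustrative.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def overlapping_peptides(sequence, length=6):
    """Return every contiguous peptide of the given length, in sequence order."""
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


def single_substitution_set(peptide):
    """Return the peptide with each of the 20 residues placed at each position.

    The parent residue is included at every position, so a hexapeptide
    gives 6 x 20 = 120 peptides (114 distinct analogs plus 6 parent copies).
    """
    return [
        peptide[:pos] + residue + peptide[pos + 1:]
        for pos in range(len(peptide))
        for residue in AMINO_ACIDS
    ]


if __name__ == "__main__":
    # Placeholder 118-residue sequence (NOT the real myohaemerythrin sequence).
    toy_sequence = AMINO_ACIDS * 5 + AMINO_ACIDS[:18]
    print(len(overlapping_peptides(toy_sequence)))   # number of hexapeptides
    print(len(single_substitution_set("GWEIPE")))    # size of one substitution set
```

A chain of N residues always gives N - 5 overlapping hexapeptides, so the scale of such a scan grows only linearly with protein length.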

Different investigators with the same information may draw different conclusions about what determines antigenicity. This arises from the frequently qualitative, rather than quantitative, evaluation (e.g. in terms of relative binding constants) of antigenicity or cross-reactivity data. In addition, the available data are often insufficient to determine the dominant factors conclusively, though both shape accessibility and segmental mobility are important¹⁴; prediction methods should probably incorporate both these parameters.

T-cell response

Before discussing predictive methods, it is important to distinguish between the humoral response (i.e. the formation of antibodies) and the helper T-cell response. The latter involves recognition of peptide fragments of proteins presented on a B-cell surface in a class II major histocompatibility complex (MHC), and stimulation of specific B cells to produce more antibody to that antigen. Extensive studies of this type of antigenicity¹⁶ have led to the proposal that amphipathic (i.e. alternating hydrophobic and hydrophilic) helical regions of the protein are particularly antigenic to the helper T cells¹⁷. Algorithms for predicting this type of antigenicity have been developed¹⁸⁻²⁰. T-cell antigens, as smaller peptide fragments, are more amenable to predictive analysis, and could form the starting point for synthetic peptide vaccines which incorporate both T-cell and B-cell epitopes.

How are antigenic sites predicted from primary sequence?

Antigenic sites on a protein may be predicted either from the three-dimensional structure or from the primary amino acid sequence. Using the three-dimensional structure is more reliable, since it includes the predominant factors in antigenicity: mobility and surface topography. Unfortunately, complete three-dimensional structures are rarely known, and prediction algorithms based on the far more extensive database of primary sequences are less successful.

An early algorithm²¹,²² for predicting antigenic sites from protein primary structure was based on the observation that antigenic sites lie on surface regions of the protein that are highly exposed to solvent, and that charged, hydrophilic amino acids are common features of antigenic sites. Other scales of hydropathicity, based on different parameters, have been developed for epitope prediction²³,²⁴. Fraga²⁵ improved upon this approach by including the combined effects of hydropathicity and 'neighborliness' rather than hydropathicity alone. The major drawbacks of using hydropathicity to predict antigenicity are that not all antigenic sites are hydrophilic, and that usually only the highest peak can be reliably used to predict epitopes²¹.
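A hydrophilicity profile of this kind amounts to a moving average over the sequence. The sketch below is a minimal illustration, not the published program: it uses the Hopp and Woods hydrophilicity values as they are commonly tabulated (they should be checked against the original paper before serious use) and a window of six residues, which is an assumed default. Only the highest peak is returned, in line with the caution above.

```python
# Hopp-Woods hydrophilicity values (one-letter codes); higher = more hydrophilic.
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5, "W": -3.4,
}


def hydrophilicity_profile(sequence, window=6, scale=HOPP_WOODS):
    """Average the scale over each window; return (start_index, mean) pairs."""
    values = [scale[res] for res in sequence.upper()]
    return [
        (start, sum(values[start:start + window]) / window)
        for start in range(len(values) - window + 1)
    ]


def highest_peak(sequence, window=6, scale=HOPP_WOODS):
    """Return (start_index, peptide, mean) for the most hydrophilic window."""
    start, score = max(hydrophilicity_profile(sequence, window, scale),
                       key=lambda item: item[1])
    return start, sequence[start:start + window], score


if __name__ == "__main__":
    # Toy sequence: a charged stretch flanked by leucines.
    print(highest_peak("LLLLLLKDEKRDLLLLLL"))
```

Passing a different dictionary as `scale` gives the corresponding profile for any other hydropathicity scale.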

A second approach for predicting antigenicity considers surface accessibility as the determining factor. Determining which regions of the protein are most exposed, using available algorithms, is straightforward if the three-dimensional structure is known¹²⁻¹⁴. Attempts have also been made to predict surface accessibility directly from primary sequence data²⁶,²⁷, or indirectly, from algorithms used to predict the secondary structure of proteins from their primary sequence²⁸,²⁹, assuming that loops and turns tend to be found more frequently on the protein surface.

These regions of secondary structure also tend to be more mobile, as determined by X-ray crystallography and by the calculated normal modes of vibration³⁰. The normal modes are coupled vibrations of the protein molecule in which all atoms move in phase at a particular frequency. The periods of vibration range from 10⁻¹¹ s (for collective motions of the protein backbone) to 10⁻¹⁴ s (for bond vibrations). The calculated B values are obtained by summing atomic fluctuations over all the normal modes.
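The summation over modes can be written out explicitly. The expressions below are the standard textbook form for a classical harmonic system and are offered as an illustration under that assumption, not as the exact formulation of Ref. 30. Here m_i is the mass of atom i, a_ik is its displacement in the normalized, mass-weighted eigenvector of mode k, ω_k is the angular frequency of that mode, and the sum runs over the non-zero-frequency modes.

```latex
\langle \Delta r_i^2 \rangle \;=\; \frac{k_B T}{m_i} \sum_{k} \frac{\lvert \mathbf{a}_{ik} \rvert^2}{\omega_k^2},
\qquad
B_i \;=\; \frac{8\pi^2}{3}\, \langle \Delta r_i^2 \rangle
```

Because each term is weighted by 1/ω_k², the low-frequency collective modes contribute most to the calculated B values.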

The possibility of predicting residue mobility from sequence information alone was proposed by Karplus and Schulz³¹, who took B values for 31 selected proteins and determined normalized B values for each of the 20 amino acid residues. They found that their flexibility prediction gave information significantly different from that obtained using secondary structure or hydropathicity methods, and recommended using their scheme in conjunction with other methods for predicting antigenicity.
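The idea of deriving a per-residue flexibility scale from crystallographic data can be sketched as follows. This is a deliberately simplified illustration: each protein's B values are divided by that protein's mean and the results are averaged by residue type. The normalization actually used in Ref. 31 may differ, and the function names and toy data are hypothetical.

```python
from collections import defaultdict


def normalize_b_values(b_values):
    """Scale one protein's B values so that their mean is 1.0."""
    mean_b = sum(b_values) / len(b_values)
    return [b / mean_b for b in b_values]


def residue_flexibility_scale(proteins):
    """Average normalized B values per residue type.

    `proteins` is an iterable of (sequence, b_values) pairs with one
    B value per residue. Returns {residue: mean normalized B value}.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for sequence, b_values in proteins:
        if len(sequence) != len(b_values):
            raise ValueError("need exactly one B value per residue")
        for residue, b_norm in zip(sequence, normalize_b_values(b_values)):
            totals[residue] += b_norm
            counts[residue] += 1
    return {residue: totals[residue] / counts[residue] for residue in totals}


if __name__ == "__main__":
    # Two toy dipeptides with made-up B values.
    toy = [("AG", [10.0, 30.0]), ("GA", [20.0, 20.0])]
    print(residue_flexibility_scale(toy))
```

Normalizing within each protein first keeps structures refined to different overall B-value levels from dominating the average.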

Combining predictive parameters, Parker et al.³² explored various combinations of parameters used to predict antigenicity in order to improve the prediction: the three parameter sets giving the highest individual scores for predicting antigenic sites (hydrophilicity based on HPLC retention times, accessibility²⁷ and flexibility³¹) were chosen to generate composite surface profiles.

Subsequently, Jameson and Wolf³³ also developed a computer algorithm which combines predicted values for hydropathicity²¹, surface accessibility²⁷, regional backbone flexibility³¹ and secondary structure²⁸,²⁹, to give what they call the 'antigenic index'.

Krystek et al.³⁴ developed a 'difference profile' algorithm to predict antigenic sites in homologous proteins that might not have been predicted using hydropathicity alone. This determines hydropathicity²¹,²³ values for two aligned sequences and then subtracts one from the other.
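The subtraction step can be illustrated with a minimal sketch. It assumes the two sequences are already aligned to equal length, uses the Kyte and Doolittle hydropathy values as commonly tabulated, scores gap characters as zero and smooths over a seven-residue window. The gap handling, the window size and the function names are assumptions made for illustration, not details taken from Ref. 34.

```python
# Kyte-Doolittle hydropathy values; higher = more hydrophobic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def smoothed_profile(aligned_seq, scale, window, gap="-"):
    """Window-averaged scale values; gap characters contribute 0.0 (an assumption)."""
    values = [0.0 if res == gap else scale[res] for res in aligned_seq]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def difference_profile(aligned_a, aligned_b, scale=KYTE_DOOLITTLE, window=7):
    """Profile of sequence A minus profile of sequence B, window by window."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    profile_a = smoothed_profile(aligned_a, scale, window)
    profile_b = smoothed_profile(aligned_b, scale, window)
    return [a - b for a, b in zip(profile_a, profile_b)]


if __name__ == "__main__":
    # A single Ala -> Lys difference shows up as a negative (more hydrophilic) value.
    print(difference_profile("AAAKAAA", "AAAAAAA"))
```

Windows in which the two homologs differ most stand out as the largest positive or negative values, whatever the absolute hydropathicity of the region.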

Table 2. Programs available for predicting antigenicity

| Author(s) | Name | Language | Availability (a) (Ref.) |
|---|---|---|---|
| Hopp and Woods | HYDRO | HP-Basic (b) | 22 |
| Hopp | HYDRO3/HYDRO4 (c) | HP-Basic (b) | 40 |
| Jameson and Wolf | PROTCALC/PROTPLOT | VAX/Fortran | 33 (d) |
| Karplus and Schulz | FLEXPLOT | Fortran | 31 |
| Krystek and Andersen | HYDROP | IBM-PC/Basic | 42 |
| Kyte and Doolittle | SOAP | C | 23 |
| Margalit et al. | AMPHI | VAX/Fortran | 19 |
| Parker et al. | SURFACEPLOT | Not given (e) | 32 (d) |
| Stille et al. | STRIP-OF-HELIX | Pascal | 20 (f) |
| Van Regenmortel and Daney de Marcillac | HYDRPHIL/HYDGRAF | IBM-PC/Basic (g) | 37 (f) |

(a) Unless indicated otherwise, program is published in given reference. (b) Written for HP-85; minor changes required for other computers; Fortran or APPLE Basic version available from author. (c) Optional additional subroutines for the above program. (d) Available commercially; see reference for details. (e) Available for IBM-PC or APPLE Macintosh. (f) Available from authors on request. (g) MS-Basic or GW-Basic for IBM-PC or compatible.

Welling et al.³⁵ calculated 'antigenicity values' for each amino acid based on the frequency of occurrence of the amino acid in a set of known antigenic determinants relative to its occurrence in the protein in question, and these were used to predict antigenic sites. Geysen et al.³⁶ also calculated such antigenicity values, which they termed 'propensity factors', for critical and non-critical residues, and found no correlation between these and hydropathicity, surface accessibility or even the Welling et al. antigenicity values for the corresponding residues. The advantages and shortcomings of the different methods used to predict antigenicity are summarized in Table 1.
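A frequency-ratio scale of the kind described for antigenicity values can be sketched in a few lines. The version below simply divides each residue's frequency in a set of epitope peptides by its frequency in a reference set of sequences. Whether the published values use a plain ratio or some transformation of it is not stated here, so the plain ratio, the function names and the toy data should all be treated as assumptions.

```python
from collections import Counter


def residue_frequencies(sequences):
    """Fraction of each residue type over a collection of sequences."""
    counts = Counter("".join(sequences))
    total = sum(counts.values())
    return {res: n / total for res, n in counts.items()}


def antigenicity_values(epitope_peptides, reference_sequences):
    """Ratio of a residue's frequency in epitopes to its reference frequency.

    Values above 1.0 mean the residue is over-represented in the epitope
    set. Residues absent from the reference set are skipped.
    """
    epitope_freq = residue_frequencies(epitope_peptides)
    reference_freq = residue_frequencies(reference_sequences)
    return {
        res: epitope_freq.get(res, 0.0) / ref
        for res, ref in reference_freq.items()
    }


if __name__ == "__main__":
    # Toy data: lysine is enriched in the "epitopes" relative to the reference.
    print(antigenicity_values(["KK", "KA"], ["KAAA"]))
```

A scale built this way can only be as good as the epitope set behind it, which is one reason such values may correlate poorly with other measures.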

Prediction of T-cell response

Based on the physico-chemical and biological requirements for T-cell activation by antigen, DeLisi and Berzofsky¹⁷ proposed that antigenic sites on such proteins should be amphipathic¹⁷. Subsequently, Spouge et al.¹⁸ applied a novel statistical technique to characterize peptide properties that correlate with antigenicity. They generated (by computer) random segments in a protein, and compared the probability of a particular property occurring at random to that of its occurring in an antigenic segment. They proposed a hypothesis in which peptides immunodominant to helper T cells readily adopt specific conformations which are stabilized by their hydrophobic interaction with some structure on the antigen-presenting cell. Margalit et al.¹⁹ have extended this model as the basis of an algorithm for predicting T-cell antigenic sites. Using a similar hypothesis, Stille et al.²⁰ developed a hydrophobic 'strip-of-helix' algorithm for predicting such sites. In this algorithm, potential antigenic sites are chosen as those putative α-helices with the highest hydrophobic score along one face (i.e. residues n, n+4, n+7, n+11, n+14, n+18).
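The scoring of one helical face can be sketched as follows. Only the offsets (n, n+4, n+7, n+11, n+14, n+18) come from the description above. The choice of the Kyte and Doolittle scale, the use of a mean rather than a sum, and the omission of any test for helix-forming propensity are assumptions made to keep the illustration short.

```python
# Kyte-Doolittle hydropathy values; higher = more hydrophobic (assumed scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

STRIP_OFFSETS = (0, 4, 7, 11, 14, 18)  # residues n, n+4, n+7, n+11, n+14, n+18


def strip_scores(sequence, scale=KYTE_DOOLITTLE, offsets=STRIP_OFFSETS):
    """Mean hydrophobicity along one helical face for each start position n."""
    span = max(offsets)
    scores = []
    for n in range(len(sequence) - span):
        strip = [scale[sequence[n + k]] for k in offsets]
        scores.append((n, sum(strip) / len(strip)))
    return scores


def best_strips(sequence, top=3, scale=KYTE_DOOLITTLE):
    """Start positions with the highest strip scores, best first."""
    ranked = sorted(strip_scores(sequence, scale), key=lambda item: item[1],
                    reverse=True)
    return ranked[:top]


if __name__ == "__main__":
    # Toy 19-mer: leucine on the strip positions, aspartate elsewhere.
    toy = ["D"] * 19
    for k in STRIP_OFFSETS:
        toy[k] = "L"
    print(strip_scores("".join(toy)))
```

The alternating spacings of three and four residues keep the scored positions on roughly the same side of an α-helix, which has about 3.6 residues per turn.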

How well do prediction schemes work?

The objective evaluation of the different methods for predicting antigenicity is difficult for a number of reasons. Ideally, parameters should be determined objectively from one set of data, and their success measured against another set. This ensures, at least, that one is predicting and not just 'fitting' the data; unfortunately, there is insufficient data to do this easily. Predictions can be tested by deleting data on the individual protein being investigated from the database, and determining the parameters from the remaining proteins. However, this approach prevents the predictive accuracy of the final parameters being tested. Margalit et al.¹⁹ tackled this problem by testing their predictions for statistical significance (i.e. to show that a particular property was related to antigenicity more than it would be on the basis of chance alone). They optimized their algorithm for predicting T-cell antigenicity¹⁶ to predict the largest number of antigenic sites correctly, minimizing the probability of arriving at those predictions by chance. They successfully predicted 18 of 23 known antigenic sites with a high degree of significance, from a total of 117 amphipathic segments found.

Fig. 1. Comparison of (a) observed and calculated (darker line) B values, (b) predicted B values and (c) hydropathicity values of lysozyme with its known continuous antigenic sites (horizontal lines), plotted against residue number. The B values were normalized (as in Ref. 31) for consistency. The C-terminal residue was omitted to keep the plot on scale. Note the high correlation of predicted flexibility with hydropathicity.

Van Regenmortel and Daney de Marcillac³⁷ have compared and assessed eight different methods of predicting antigenicity in four proteins, using two tests for scoring success. In probability analysis, χ² is calculated by subtracting the product of incorrect positive and negative predictions from the product of correct positive and negative predictions. (A correct negative prediction is one in which a residue not predicted to be antigenic has not been found experimentally in any epitope.) The larger the value of χ², the better a method distinguishes between antigenic and non-antigenic sites. In the second test, the number of residues correctly predicted to be antigenic is divided by the total number of residues known to be antigenic. One problem in their assessment is that some of the proteins they included were used in determining some of the parameters compared. Another general problem in evaluating and comparing different methods is the uncertainty as to whether all the antigenic sites on the protein in question have been found. One way of avoiding this problem is to score success by dividing the number of residues correctly predicted to be antigenic by the total number of residues predicted to be antigenic. This third scoring method, although discussed by Van Regenmortel and Daney de Marcillac³⁷, was not used in their assessment of the different methods. As Kabsch and Sander³⁸ point out in their assessment of protein secondary structure prediction methods, this last measure is perhaps the most useful, especially in the general case in which no reference can be made to observed states (i.e. cases in which all the antigenic residues of a protein are known, so that one can discern not only how many true positives were predicted but also how many false positives). If the purpose of predicting antigenic sites is to facilitate the synthesis of peptides which will generate antibodies recognizing the native protein, then the method which yields the largest proportion of correctly predicted sites has the highest probability of correct prediction. Using this criterion for success, the method of Parker et al.³² does significantly better than the other seven methods evaluated. It was also assessed to be among the best prediction methods using the other two approaches to scoring³⁷.
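The three scoring measures can be compared side by side on per-residue predictions. In the sketch below, the χ² statistic is the standard Pearson form for a 2×2 table without continuity correction; the text describes only the difference of products that appears in its numerator, so the full formula is an assumption about what was used in Ref. 37. The function names and the toy data are illustrative.

```python
def confusion_counts(predicted, observed):
    """Count per-residue outcomes from two equal-length sequences of flags."""
    pairs = list(zip(predicted, observed))
    tp = sum(1 for p, o in pairs if p and o)          # correct positives
    fp = sum(1 for p, o in pairs if p and not o)      # incorrect positives
    fn = sum(1 for p, o in pairs if not p and o)      # incorrect negatives
    tn = sum(1 for p, o in pairs if not p and not o)  # correct negatives
    return tp, fp, fn, tn


def chi_square(tp, fp, fn, tn):
    """Pearson chi-square for a 2x2 table (no continuity correction)."""
    n = tp + fp + fn + tn
    denom = (tp + fp) * (fn + tn) * (tp + fn) * (fp + tn)
    if denom == 0:
        return 0.0
    return n * (tp * tn - fp * fn) ** 2 / denom


def fraction_of_known_found(tp, fn):
    """Second test: correct positives / all residues known to be antigenic."""
    return tp / (tp + fn)


def fraction_of_predictions_correct(tp, fp):
    """Third measure: correct positives / all residues predicted antigenic."""
    return tp / (tp + fp)


if __name__ == "__main__":
    predicted = [1, 1, 0, 0, 1, 0]   # toy per-residue predictions
    observed = [1, 0, 0, 0, 1, 1]    # toy experimental epitope membership
    tp, fp, fn, tn = confusion_counts(predicted, observed)
    print(chi_square(tp, fp, fn, tn))
    print(fraction_of_known_found(tp, fn))
    print(fraction_of_predictions_correct(tp, fp))
```

Only the third measure is unaffected by epitopes that have not yet been found experimentally inflating the count of missed sites, although such epitopes can still make correct predictions look like false positives.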

Getzoff et al.⁷ have applied a statistical analysis to evaluate three methods²²,³¹,³⁵ for predicting antigenic sites from amino acid sequence, comparing them to the probability of finding an epitope in a randomly chosen peptide of given length. Their conclusions, based on experimental results for three proteins, are that while the predictions are better using 15-residue peptides than seven-residue peptides, no method is significantly better than choosing the peptide sequences randomly. Consequently, they suggest that the sequential epitopes of a protein be determined by mapping the entire protein sequence using short overlapping peptides, and then synthesizing longer peptides which include epitopes found with high frequency. However, they acknowledge the need to continue to refine predictive tools on the basis of emerging experimental results.

Perhaps the best way to evaluate different prediction schemes is to accumulate data on how well they predict previously undetermined antigenic sites. Tanaka et al.⁴⁰ synthesized 35 peptides based on a Hopp and Woods prediction of hydrophilicity, and only 18 of the 32 antipeptide antibodies elicited (56%) reacted with their respective proteins. Strynadka et al.⁴¹, using the Surfaceplot algorithm³², selected 12 peptides, eight of which elicited antibodies which bound to herpes simplex virus type 1 (HSV-1) virions. Only five of the 12 peptides synthesized were predicted to be antigenic using the Hopp and Woods²¹ algorithm, and only two of the five produced antipeptide antibodies which bound to HSV-1.

The present methods, then, are of limited usefulness until our understanding of the antibody-antigen interaction increases and the predictive methods can be further refined.

How can you get the programs?

Many of the prediction algorithms are available as programs in a variety of forms. Most of the algorithms are short, so that the most convenient way to obtain them is to copy them from the papers in which they were published. Others are available either upon request from the authors or commercially (see Table 2).

Where can we go from here?

Hopefully, the extensive research devoted to obtaining a better understanding of the nature and specificity of antigenicity and the immunological response will lead to an improvement in the success of prediction methods. Prediction methods which rely only upon the amino acid sequence of the protein can only succeed in so far as antigenicity is sequence determined. This is probably true for peptides which elicit antibodies which can also recognize the native protein, but as has been shown, sequence alone is not necessarily a determinant of immunogenicity.

One method which can give detailed information on interactions between antibody and antigen is X-ray crystallography. The recently determined three-dimensional structure of an antigen-antibody complex of lysozyme with the Fab fragment of a monoclonal antibody has shed new light on this problem⁴³. The tertiary structure of bound lysozyme appeared identical to that of the unbound native protein. The antigenic site was made up of two continuous stretches of the protein, residues 18-27 and 116-129. The latter stretch (at the C-terminus) has both above-average mobility (as shown by X-ray temperature factors⁴⁴ or normal-mode calculations³⁰) and above-average contact area¹², while the 18-27 stretch is neither particularly mobile nor exposed. The peak of highest hydrophilicity²¹ centers on residues 112-119, and the third highest on residues 13-18. Three additional three-dimensional structures of antigen-antibody complexes have since been solved, two of them again including lysozyme. In one⁴⁵, the antigenic site includes residues 41-53 and 67-70, which are both highly mobile³⁰,⁴⁴ and accessible¹²,¹³ regions of lysozyme. In the second⁴⁶, it includes the highly mobile region of the lysozyme molecule from residues 89 to 102. The antigenic site of an influenza virus neuraminidase-antibody complex⁴⁷ is also highly mobile. Two-dimensional nuclear magnetic resonance (NMR) is also being used increasingly for studying dynamic molecular interactions (see Russu, I. M. TIBTECH 9, 96-104, 1991).

Fig. 2. Comparison of (a) observed and calculated (darker line) B values, (b) predicted B values and (c) hydropathicity values of myoglobin with its known continuous antigenic sites (horizontal lines), plotted against residue number. The B values were normalized (as in Ref. 31) for consistency. The N-terminal residue was omitted to keep the plot on scale. Note the high correlation of predicted flexibility with hydropathicity.

Recently, the structure of peptide antigen-antibody complexes was solved using two-dimensional NMR difference techniques⁴⁸.

Given the high level of correlation reported between antigenic sites and segmental mobility⁹⁻¹¹, it was proposed to use normal-mode dynamics⁴⁹ for predicting segmental flexibility and potential antigenic sites. Ideally, it would be more accurate to use molecular dynamics to predict protein flexibility, but this method is far more demanding of computer resources, and B values calculated from normal-mode dynamics have been shown to correlate well with experimental B values³⁰. One advantage of this approach over using experimentally determined B values is that the normal modes can be calculated from a lower-resolution, less well-refined structure than that necessary for determining crystallographic B values³⁰, or even from a three-dimensional structure determined by model-building techniques. B values calculated from normal modes also eliminate the effects of crystal packing on the mobility. Figures 1 and 2 compare normalized observed, calculated and predicted B values and the hydropathic profiles of lysozyme and myoglobin with their continuous epitopes.

Another approach to modelling has been taken by Mesyanzhinov et al. 50 They predicted the secondary structure of hepatitis A virus (HAV) proteins from the amino acid sequence and found a similarity in the location of secondary structure elements and the homology of the corresponding sequences between HAV and other picornavirus structural proteins. These were aligned with secondary structures of two of these proteins (determined by X-ray crystallography), and a model was then built of the three-dimensional structure of the HAV proteins. From the model, the exposed regions of the polypeptide chains were used to predict potential antigenic sites.

The disadvantage of the above methods is that they are not readily accessible to the general scientific public and are of more interest as a means of obtaining a better understanding of the nature of the antigen-antibody interaction. Nevertheless, Sasaki et al. 51 combined hydropathicity with B values and shape accessibility obtained from X-ray crystallographic data to predict six antigenic sites on insulin, four of which correspond to the four known antigenic sites. Methods in which all one has to do is enter a protein sequence to get a list of the peptides most likely to elicit antibodies which will cross-react with the native protein are far more appealing, even if their rate of success is only about 50%. What is needed now is a more quantitative assessment (in terms of reproducible numbers such as binding affinities) of the relative abilities of different methods to predict those peptides which will most strongly bind antibodies to the native protein, or alternatively elicit antibodies which will react most strongly with the native protein antigen. A simultaneous improvement in methods to predict mobility and accessibility from the primary sequence of amino acid residues should lead to more reliable prediction methods.
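The idea of combining several per-residue properties into a single ranking of candidate peptides can be sketched as follows. This is not the procedure of Sasaki et al.; it is a hypothetical Python illustration that assumes each profile is z-scored, that the profiles are averaged with equal weights unless stated otherwise, and that peptides are scored by the mean over a fixed-length window. All names and the toy data are invented for the example.

```python
from statistics import mean, pstdev


def zscores(values):
    """Rescale a profile to zero mean and unit standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0] * len(values)
    return [(v - mu) / sigma for v in values]


def combined_profile(profiles, weights=None):
    """Weighted mean of z-scored per-residue profiles of equal length."""
    names = list(profiles)
    lengths = {len(profiles[n]) for n in names}
    if len(lengths) != 1:
        raise ValueError("all profiles must cover the same residues")
    weights = weights or {n: 1.0 for n in names}
    total = sum(weights[n] for n in names)
    scaled = {n: zscores(profiles[n]) for n in names}
    length = lengths.pop()
    return [sum(weights[n] * scaled[n][i] for n in names) / total
            for i in range(length)]


def rank_peptides(profile, peptide_length=6, top=3, first_residue=1):
    """Score each window by its mean value; return (start, end, score) tuples."""
    scored = []
    for start in range(len(profile) - peptide_length + 1):
        score = mean(profile[start:start + peptide_length])
        first = start + first_residue
        scored.append((score, first, first + peptide_length - 1))
    scored.sort(reverse=True)
    return [(s, e, round(score, 3)) for score, s, e in scored[:top]]


# Toy profiles for a 10-residue chain; the numbers are invented for illustration.
toy = {
    "hydrophilicity": [0, 0, 0, 3, 3, 3, 0, 0, 0, 0],
    "b_value": [10, 10, 10, 40, 40, 40, 10, 10, 10, 10],
    "accessibility": [5, 5, 5, 50, 50, 50, 5, 5, 5, 5],
}
best = rank_peptides(combined_profile(toy), peptide_length=3, top=1)
print(best)
```

A ranking of this kind yields reproducible numbers for each candidate peptide, which is the sort of output that could then be compared against measured binding affinities.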

References

1 Arnon, R. (1986) Trends Biochem. Sci. 11, 521-524

2 Berzofsky, J. A. (1985) Science 229, 932-940

3 Sela, M. (1969) Science 166, 1365-1374

4 Benjamin, D. C., Berzofsky, J. A., East, I. J., Gurd, F. R. D., Hannum, C., Leach, S. J., Margoliash, E., Michael, J. G., Miller, A., Prager, E. M., Reichlin, M., Sercarz, E. E., Smith-Gill, S. J., Todd, P. E. and Wilson, A. C. (1984) Annu. Rev. Immunol. 2, 67-101

5 Van Regenmortel, M. H. V. (1986) Trends Biochem. Sci. 11, 36-39

6 Van Regenmortel, M. H. V. (1987) Trends Biochem. Sci. 12, 237-240

7 Getzoff, E. D., Tainer, J. A., Lerner, R. A. and Geysen, H. M. (1988) Adv. Immunol. 43, 1-98

8 Atassi, M. Z. (1984) Eur. J. Biochem. 145, 1-20

9 Westhof, E., Altschuh, D., Moras, D., Bloomer, A. C., Mondragon, A., Klug, A. and Van Regenmortel, M. H. V. (1984) Nature 311, 123-126

10 Tainer, J. A., Getzoff, E. D., Alexander, H., Houghten, R. A., Olson, A. J., Lerner, R. A. and Hendrickson, W. A. (1984) Nature 312, 127-134

11 Tainer, J. A., Getzoff, E. D., Paterson, Y., Olson, A. J. and Lerner, R. A. (1985) Annu. Rev. Immunol. 3, 501-535

12 Novotny, J., Handschumacher, M., Haber, E., Bruccoleri, R. E., Carlson, W. B., Fanning, D. W., Smith, J. A. and Rose, G. D. (1986) Proc. Natl Acad. Sci. USA 83, 226-230

13 Thornton, J. M., Edwards, M. S., Taylor, W. R. and Barlow, D. J. (1986) EMBO J. 5, 409-413

14 Geysen, H. M., Tainer, J. A., Rodda, S. J., Mason, T. J., Alexander, H., Getzoff, E. D. and Lerner, R. A. (1987) Science 235, 1184-1190

15 Getzoff, E. D., Geysen, H. M., Rodda, S. J., Alexander, H., Tainer, J. A. and Lerner, R. A. (1987) Science 235, 1191-1196

16 Pierce, S. K. and Margoliash, E. (1988) Trends Biochem. Sci. 13, 27-29

17 DeLisi, C. and Berzofsky, J. A. (1985) Proc. Natl Acad. Sci. USA 82, 7048-7052

18 Spouge, J. L., Guy, H. R., Cornette, J. L., Margalit, H., Cease, K., Berzofsky, J. A. and DeLisi, C. (1987) J. Immunol. 138, 204-212

19 Margalit, H., Spouge, J. L., Cornette, J. L., Cease, K. B., DeLisi, C. and Berzofsky, J. A. (1987) J. Immunol. 138, 2213-2229

20 Stille, C. J., Thomas, L. J., Reyes, V. E. and Humphreys, R. E. (1987) Mol. Immunol. 24, 1021-1027

21 Hopp, T. P. and Woods, K. R. (1981) Proc. Natl Acad. Sci. USA 78, 3824-3828

22 Hopp, T. P. and Woods, K. R. (1983) Mol. Immunol. 20, 483-489

23 Kyte, J. and Doolittle, R. F. (1982) J. Mol. Biol. 157, 105-132

24 Cornette, J. L., Cease, K. B., Margalit, H., Spouge, J. L., Berzofsky, J. A. and DeLisi, C. (1987) J. Mol. Biol. 195, 659-686

25 Fraga, S. (1982) Can. J. Chem. 60, 2606-2610

26 Chothia, C. (1976) J. Mol. Biol. 105, 1-14

27 Janin, J. (1979) Nature 277, 491-492

28 Chou, P. Y. and Fasman, G. D. (1978) Adv. Enzymol. 47, 45-148

29 Garnier, J., Osguthorpe, D. J. and Robson, B. (1978) J. Mol. Biol. 120, 97-120

30 Levitt, M., Sander, C. and Stern, P. S. (1985) J. Mol. Biol. 181, 423-447

31 Karplus, P. A. and Schulz, G. E. (1985) Naturwissenschaften 72, 212-213

32 Parker, J. M. R., Guo, D. and Hodges, R. S. (1986) Biochemistry 25, 5425-5432

33 Jameson, B. A. and Wolf, H. (1988) Comput. Appl. Biosci. (CABIOS) 4, 181-186

34 Krystek, S. R. Jr, Dias, J. A., Reichert, L. E. Jr and Andersen, T. T. (1985) Endocrinology 117, 1125-1131

35 Welling, G. W., Weijer, W. J., van der Zee, R. and Welling-Wester, S. (1985) FEBS Lett. 188, 215-218

36 Geysen, H. M., Mason, T. J. and Rodda, S. J. (1988) J. Mol. Recog. 1, 32-41


37 Van Regenmortel, M. H. V. and Daney de Marcillac, G. (1988) Immunol. Lett. 17, 95-108

38 Kabsch, W. and Sander, C. (1983) FEBS Lett. 155, 179-182

39 Tanaka, T., Slamon, D. J. and Cline, M. J. (1985) Proc. Natl Acad. Sci. USA 82, 3400-3404

40 Hopp, T. P. (1986) J. Immunol. Methods 88, 1-18

41 Strynadka, N. C. J., Redmond, M. J., Parker, J. M. R., Scraba, D. G. and Hodges, R. S. (1988) J. Virol. 62, 3474-3483

42 Krystek, S. R. Jr, Reichert, L. E. Jr and Andersen, T. T. (1985) Endocrinology 117, 1110-1124

43 Amit, A. G., Mariuzza, R. A., Phillips, S. E. V. and Poljak, R. J. (1986) Science 233, 747-753

44 Berthou, J., Lifchitz, A., Artymiuk, P. and Jolles, P. (1983) Proc. R. Soc. B217, 471-489

45 Sheriff, S., Silverton, E. W., Padlan, E. A., Cohen, G. H., Smith-Gill, S. J., Finzel, B. C. and Davies, D. R. (1987) Proc. Natl Acad. Sci. USA 84, 8075-8079

46 Davies, D. R., Sheriff, S. and Padlan, E. A. (1988) J. Biol. Chem. 263, 10541-10544

47 Colman, P. M., Laver, W. G., Varghese, J. N., Baker, A. T., Tulloch, P. A., Air, G. M. and Webster, R. G. (1987) Nature 326, 358-363

48 Levy, R., Assulin, O., Scherf, T., Levitt, M. and Anglister, J. (1989) Biochemistry 28, 7168-7175

49 Stern, P. S. (1989) in Computer-Assisted Modeling of Receptor-Ligand Interactions: Theoretical Aspects and Applications to Drug Design (Progress in Clinical and Biological Research, Vol. 289) (Rein, R. and Golombek, A., eds), pp. 87-94, Alan R. Liss

50 Mesyanzhinov, V. V., Peletskaya, E. N., Zhdanov, V. M., Efimov, A. V., Finkelstein, A. V. and Ivanovsky, D. I. (1987) J. Biomol. Struc. Dyn. 5, 447-458

51 Sasaki, A., Mikawa, Y., Sakamoto, Y., Yamada, H., Ikeda, Y. and Ohno, T. (1988) Mol. Immunol. 25, 157-163


Human and mouse monoclonal antibodies by repertoire cloning

Dennis R. Burton

Antibody repertoires, the wide range of antibody molecules produced by animals, can now be established in bacteria by cloning and expression of antibody genes. Beginning with immunized animals, antigen can be used to select, from the repertoire, clones which secrete specific monoclonal antibody. In the future, immunization may become unnecessary. The method may provide a general route, which has so far eluded biotechnologists, to human monoclonal antibodies.

Under constant bombardment from viruses, bacteria and other harmful agents, the body is equipped with a defence system based, in part, on the antibody molecule. With no advance knowledge of the molecular shapes of the foreign molecules (antigens) to be encountered in a lifetime, a 'catch-all' molecule or set of molecules has evolved. This 'antibody repertoire', in humans 1, has been estimated to comprise between a million and a hundred million members. This primary, or naive, repertoire is generated by recombination events from a relatively small number of genes and is believed to contain antibodies recognizing, with moderate affinity, virtually any antigen. Many cycles of somatic mutation and selection of the antibody-producing cells in the presence of antigen then act to increase the affinity and abundance of specific antibody.

D. R. Burton (currently on sabbatical at the Department of Molecular Biology, Research Institute of Scripps Clinic, 10666 North Torrey Pines Road, La Jolla, CA 92037, USA) is from the Krebs Institute, Department of Molecular Biology and Biotechnology, The University, Sheffield S10 2TN, UK.

The existence of molecules that are capable of high-affinity binding with high discriminatory ability has long been of great interest to biotechnologists. For many years the difficulty was how to isolate any specific antibody from the mixture present in serum. Specific polyclonal antibody can be generated by hyperimmunization but this results in a heterogeneous antibody population which will vary between preparations. The introduction of hybridoma technology (see Glossary) by Köhler and Milstein 2 in 1975 permitted the production of monoclonal antibodies (mAbs) from mice, and caused a revolution in the application of antibodies in many areas of the biological sciences. One of the few disappointments of hybridoma technology has been the inability to extend it as a general method for the generation of human mAbs. Although Epstein-Barr virus (EBV) transformation of human lymphocytes has had some success in this regard 3,4, this approach can produce cell lines which are unstable or produce low levels of antibody. Nevertheless, it is desirable to use human antibodies for therapy because rodent antibodies can induce an anti-globulin response that may be harmful to the patient 5.

Ingenious protein engineering strategies have been described for 'humanizing' mouse mAbs. These include joining the entire immunoglobulin variable domains from the mouse mAb to human constant domains 6-8, or transplanting the complementarity determining regions (CDRs), responsible for antigen recognition, of the mouse mAb into a human mAb (myeloma protein) 9 (Fig. 1). The former approach is relatively straightforward but the whole of the variable domains, perceived as foreign by a human, are retained. In this respect, the latter approach is more satisfactory, but is more time consuming and may involve some fine adjustment of framework residues.