6
DOI: 10.1126/science.1089427 , 1364 (2003); 302 Science , et al. Brian Kuhlman Design of a Novel Globular Protein Fold with Atomic-Level Accuracy This copy is for your personal, non-commercial use only. clicking here. colleagues, clients, or customers by , you can order high-quality copies for your If you wish to distribute this article to others here. following the guidelines can be obtained by Permission to republish or repurpose articles or portions of articles ): January 24, 2011 www.sciencemag.org (this infomation is current as of The following resources related to this article are available online at http://www.sciencemag.org/content/302/5649/1364.full.html version of this article at: including high-resolution figures, can be found in the online Updated information and services, http://www.sciencemag.org/content/suppl/2003/11/20/302.5649.1364.DC1.html can be found at: Supporting Online Material http://www.sciencemag.org/content/302/5649/1364.full.html#related found at: can be related to this article A list of selected additional articles on the Science Web sites 387 article(s) on the ISI Web of Science cited by This article has been http://www.sciencemag.org/content/302/5649/1364.full.html#related-urls 94 articles hosted by HighWire Press; see: cited by This article has been http://www.sciencemag.org/cgi/collection/biochem Biochemistry subject collections: This article appears in the following registered trademark of AAAS. is a Science 2003 by the American Association for the Advancement of Science; all rights reserved. The title Copyright American Association for the Advancement of Science, 1200 New York Avenue NW, Washington, DC 20005. (print ISSN 0036-8075; online ISSN 1095-9203) is published weekly, except the last week in December, by the Science on January 24, 2011 www.sciencemag.org Downloaded from

Design of a Novel Globular Protein Fold with Atomic-Level … · 2016. 6. 27. · Design of a Novel Globular Protein Fold with Atomic-Level Accuracy This copy is for your personal,

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Design of a Novel Globular Protein Fold with Atomic-Level … · 2016. 6. 27. · Design of a Novel Globular Protein Fold with Atomic-Level Accuracy This copy is for your personal,

DOI: 10.1126/science.1089427, 1364 (2003);302 Science

, et al.Brian KuhlmanDesign of a Novel Globular Protein Fold with Atomic-Level Accuracy

This copy is for your personal, non-commercial use only.

clicking here.colleagues, clients, or customers by , you can order high-quality copies for yourIf you wish to distribute this article to others

  here.following the guidelines

can be obtained byPermission to republish or repurpose articles or portions of articles

  ): January 24, 2011 www.sciencemag.org (this infomation is current as of

The following resources related to this article are available online at

http://www.sciencemag.org/content/302/5649/1364.full.htmlversion of this article at:

including high-resolution figures, can be found in the onlineUpdated information and services,

http://www.sciencemag.org/content/suppl/2003/11/20/302.5649.1364.DC1.htmlcan be found at: Supporting Online Material

http://www.sciencemag.org/content/302/5649/1364.full.html#relatedfound at:

can berelated to this article A list of selected additional articles on the Science Web sites

387 article(s) on the ISI Web of Sciencecited by This article has been

http://www.sciencemag.org/content/302/5649/1364.full.html#related-urls94 articles hosted by HighWire Press; see:cited by This article has been

http://www.sciencemag.org/cgi/collection/biochemBiochemistry

subject collections:This article appears in the following

registered trademark of AAAS. is aScience2003 by the American Association for the Advancement of Science; all rights reserved. The title

CopyrightAmerican Association for the Advancement of Science, 1200 New York Avenue NW, Washington, DC 20005. (print ISSN 0036-8075; online ISSN 1095-9203) is published weekly, except the last week in December, by theScience

on

Janu

ary

24, 2

011

ww

w.s

cien

cem

ag.o

rgD

ownl

oade

d fr

om

Page 2: Design of a Novel Globular Protein Fold with Atomic-Level … · 2016. 6. 27. · Design of a Novel Globular Protein Fold with Atomic-Level Accuracy This copy is for your personal,

Design of a Novel GlobularProtein Fold with

Atomic-Level AccuracyBrian Kuhlman,1*† Gautam Dantas,1* Gregory C. Ireton,4

Gabriele Varani,1,2 Barry L. Stoddard,4 David Baker1,3‡

A major challenge of computational protein design is the creation of novelproteins with arbitrarily chosen three-dimensional structures. Here, we used ageneral computational strategy that iterates between sequence design andstructure prediction to design a 93-residue �/� protein called Top7 with a novelsequence and topology. Top7 was found experimentally to be folded andextremely stable, and the x-ray crystal structure of Top7 is similar (root meansquare deviation equals 1.2 angstroms) to the design model. The ability todesign a new protein fold makes possible the exploration of the large regionsof the protein universe not yet observed in nature.

There are a large but finite number of proteinfolds observed thus far in nature, and it is notclear whether the structures not yet observedare physically unrealizable or have simplynot yet been sampled by the evolutionaryprocess or characterized by a structural biol-ogist. Methods for de novo design of novelprotein structures provide a route to resolvingthis question and, perhaps more importantly,a possible route to novel protein machinesand therapeutics.

There has been considerable progress inthe development of computational methodsfor identifying amino acid sequences compat-ible with a target structure (1–3), notably thepioneering complete redesign of a zinc fingerprotein by Mayo and co-workers (1). In gen-eral, these methods have not been used tocreate new protein structures but rather toredesign naturally occurring proteins so thatthey have enhanced stability or new function-ality (4–6). Because of the strong steric re-strictions in the cores of globular proteins, thepacking of side chains in redesigned proteinsis often quite similar to that in the originalnative protein (1, 7), and hence high-resolutionprotein backbone coordinates contain somememory of the original native sequence (8–12).When creating a new protein from scratch,

there is no such sequence memory to aidthe process, and it is not even knownwhether the target backbone is designable.Thus, the computational design of novelprotein structures is a more rigorous test ofcurrent force fields and optimization meth-odology than the redesign of naturally oc-curring proteins.

Because it is unlikely that any arbitrarilychosen protein backbone will be designable,it is essential that the design procedure in-clude a search of nearby conformationalspace in addition to sequence space. With theexception of the method used by Desjarlaisand Handel (2) to redesign the hydrophobiccore of a small naturally occurring protein,most previous approaches have either opti-

mized the amino acid sequence for a largenumber of fixed backbone conformations (4,12–14) or, as in the landmark design byHarbury and colleagues of coiled coil oli-gomers with a right-handed superhelicaltwist (15), refined the backbone conforma-tion for a large number of fixed amino acidsequences (15, 16 ). The range of sequence-structure pairs that can be searched with theuse of these approaches is restricted by theneed to specify, in advance, a limited num-ber of backbone conformations or aminoacid sequences.

We have developed a general proce-dure for identifying very low free energysequence-structure pairs that iterates be-tween sequence optimization and structureprediction and can be applied to the designof any desired target structure. The sameenergy function is used to guide the searchat all stages, and at each stage only thelowest energy sequence or structure identi-fied in the previous iteration is optimized,thereby avoiding the large-scale and com-putationally expensive enumeration of al-ternative backbones or alternative sequenc-es. Unlike the genetic algorithm of Desjar-lais and Handel (2) in which randomlyselected torsion angles and residue identi-ties were simultaneously perturbed, ourprocedure iterates between full-scale opti-mization of sequence for a fixed backboneconformation and gradient-based optimiza-tion of the backbone coordinates for a fixedsequence. We used this approach to createa 93-residue �/� protein with a topologynot present in the Protein Structure Data-base (PDB).

1Department of Biochemistry, 2Department of Chem-istry, 3Howard Hughes Medical Institute, University ofWashington, Seattle, WA 98195, USA. 4Division ofBasic Sciences, Fred Hutchinson Cancer Research Cen-ter, 1100 Fairview Avenue North, Seattle, WA 98109,USA.

*These authors contributed equally to this work.†Present address: Department of Biochemistry andBiophysics, University of North Carolina, Chapel Hill,NC 27599, USA.‡To whom correspondence should be addressed. E-mail: [email protected]

Fig. 1. A two-dimensional schematic of the target fold (hexagon, strand; square, helix; circle, other).Hydrogen bond partners are shown as purple arrows. The amino acids shown are those in the finaldesigned (Top7) sequence.

RESEARCH ARTICLES

21 NOVEMBER 2003 VOL 302 SCIENCE www.sciencemag.org1364

on

Janu

ary

24, 2

011

ww

w.s

cien

cem

ag.o

rgD

ownl

oade

d fr

om

Page 3: Design of a Novel Globular Protein Fold with Atomic-Level … · 2016. 6. 27. · Design of a Novel Globular Protein Fold with Atomic-Level Accuracy This copy is for your personal,

Generation of starting models. Thetarget structure for the de novo design pro-cess can range from a detailed backbonemodel to a back-of-the-envelope sketch. Be-cause we aimed to create a novel protein fold,we selected a topology not present in thePDB according to the Topology of ProteinStructure (TOPS) server (17). A rough two-dimensional diagram was created of the tar-get structure (Fig. 1), and constraints wereidentified that define the topology (Fig. 1,arrows). Three-dimensional models satisfy-ing the constraints were then generated byassembling three- and nine-residue fragmentsfrom the PDB with secondary structures con-sistent with the diagram using the Rosetta denovo structure prediction methodology (18),leading to 172 backbone-only models thathad the desired topology and secondary struc-ture content and had root mean square devi-ations (RMSDs) from each other of 2 to 3 Å.

Generation of starting sequences. Asequence was designed for each model withthe use of the RosettaDesign (9) MonteCarlo search protocol and energy function,which is dominated by a 12-6 Lennard-Jones potential, an orientation-dependenthydrogen bonding term (19), and an implic-it solvation model (20, 21). All amino acidsexcept for cysteine were allowed at 71 ofthe 93 positions [�110 rotamers from Dun-brack’s library (22) per position]; the re-maining 22 surface �-sheet positions wererestricted to polar amino acids (�75 rota-mers per position). The search through the11071 � 7522 (�10186) rotamer combina-tions took �10 min for each model on aPentium III (Intel) processor.

Because the starting backbone conforma-tions were generated without regard to side-chain packing, it was anticipated that se-quences with very low free energies might

not exist (i.e., the structures would not bedesignable). Indeed, the lowest energy se-quences selected for the starting structureshad energies considerably higher than thoseof native proteins of roughly the same size. Inparticular, the Lennard-Jones interaction en-ergies for core residues were on average 0.8kcal mol�1 less favorable than the interactionenergies for the same residues in native pro-tein cores. The finding that low-energy se-quences do not exist for protein backbonesgenerated without regard to side-chain pack-ing emphasizes the need to couple sequencedesign with backbone flexibility for generalprotein design.

Simultaneous optimization of se-quence and structure. The critical feature ofthe design protocol is the cycling between se-quence design, as described above, and back-bone optimization. The goal of the backboneoptimization step, to identify the lowest freeenergy backbone conformation for a fixed aminoacid sequence, is formally analogous to the high-resolution structure prediction problem, and weused the Rosetta program (23), which we devel-oped for structure prediction. The backbone tor-sion angles were optimized with the use of aMonte Carlo minimization protocol (24) inwhich each move has the following parts. (i) Aninitial perturbation, consisting of either a smallrandom change in the torsion angles of one tofive randomly selected residues or a substitu-tion of the backbone torsion angles of one tothree consecutive residues with torsion anglesfrom a structure in the PDB. In the latter case,the torsion angles of neighboring residueswere varied to minimize the displacement ofthe downstream portion of the chain. (ii) Arapid optimization of side-chain conformationfor all residue positions that had a higherenergy after step 1 by cycling through eachrotamer at each position in turn and replacingthe current side-chain conformation with thelowest energy rotamer conformation. (iii) Op-timization of the backbone torsion angles in a10-residue window surrounding the site ofinsertion by energy minimization using aquasi-Newton method (25). Moves were ac-cepted or rejected on the basis of the energydifference between the final minimized struc-ture and the starting structure according to theMetropolis criterion. The same energy func-tion was used for backbone optimization andsequence design. Each round of backbone re-laxation consisted of several thousand suchMonte Carlo minimization moves; a full com-binatorial optimization of side-chain rotamerconformations was carried out with the use ofa Monte Carlo procedure every 20 moves.

For each starting structure, five independentsimulations, each with 15 cycles of sequencedesign and backbone optimization, were used toobtain low-energy structure sequence pairs. Fi-nal energies were comparable to those observedfor naturally occurring proteins. Proteins de-

Fig. 2. Biophysical characterization of Top7.(A) The far-ultraviolet (UV) CD spectrumof 20 �M Top7 in 25 mM tris-HCl, 30 mMNaCl, and pH � 8.0 at varying tempera-tures and concentrations of GuHCl. (B) CDsignal at 220 nm as a function of temper-ature and GuHCl for 8 �M TOP7 in 25 mMtris-HCl, 30 mM NaCl, pH � 8.0, in a 2-mmcuvette. (C) CD signal at 220 nm as afunction of GuHCl concentration for 5 �Mprotein in 25 mM tris-HCl, 30 mM NaCl,pH � 8.0, at 25°C in a 1-cm cuvette. (D)The NOESY spectrum of �1 mM Top7 atpH � 6.0 recorded at 298 K, 500 Mhz, and200-ms mixing time with the use of Wa-tergate suppression. , frequency. (E) The

HSQC spectrum of �1 mM 15N-Top7 at pH � 6.0 recorded at 298 K and 500 Mhz with the useof the fast HSQC scheme of Mori et al. (43).

R E S E A R C H A R T I C L E S

www.sciencemag.org SCIENCE VOL 302 21 NOVEMBER 2003 1365

on

Janu

ary

24, 2

011

ww

w.s

cien

cem

ag.o

rgD

ownl

oade

d fr

om

Page 4: Design of a Novel Globular Protein Fold with Atomic-Level … · 2016. 6. 27. · Design of a Novel Globular Protein Fold with Atomic-Level Accuracy This copy is for your personal,

signed with the use of an initial version of theprotocol with a damped Lennard-Jones repulsiveterm and a Monte Carlo optimization without theminimization step were observed experimentallyto be quite stable but appeared to have somewhatmolten cores (26). To optimize steric packing,the atomic radii were reparameterized on thebasis of the distances of closest approach of atompairs in high-resolution protein structures, ex-plicit protons were included on all atoms, thepenalty for atom-atom overlaps was greatlysteepened, and the full Monte Carlo minimiza-tion protocol was used for varying the backbone,resulting in the generation of much lower energysequence-structure pairs (20% of the final 860models had more favorable Lennard-Jones ener-gies than an average protein in the PDB). Withthese improvements, the protocol was used todesign a protein sequence called Top7 (27).

The average Lennard-Jones energies for theburied residues in Top7 become favorable dur-ing the relaxation process (table S1), and, al-though the structural changes during the iterativerefinement process are modest (the final proteinbackbone model has an RMSD of 1.1 Å from thestarting model), they bring about dramaticchanges in the designed sequence: Only 31% ofthe Top7 residues are identical to those in thestarting sequence. Neither the Top7 sequencenor the sequence before the iterative sequence-structure refinement process have significantsimilarity to any naturally occurring protein se-quence; the closest match to the Top7 sequencefound with the use of PSI-BLAST (28) in theNon-Redundant protein sequence database isweaker than would be expected by randomchance (E value � 1.6).

Biophysical and structural character-ization of Top7. The folding, stability, andstructure of the Top7 protein (29) were analyzedwith the use of a variety of biophysical methods.The Top7 protein is highly soluble (at 25 to 60mg ml�1) and is monomeric as determined bygel filtration chromatography. The circular di-chroism (CD) spectrum of Top7 is characteristicof �/� proteins (Fig. 2A), and the protein isremarkably thermally stable: The CD spectrumat 98°C is nearly indistinguishable from that at25°C. At intermediate concentrations (�5 M)of guanidine hydrochloride (GuHCl), Top7 un-folds cooperatively with an increase in tempera-ture and exhibits cold denaturation (Fig. 2B).Fitting these data to the Gibbs-Helmholtz equa-tion gave a change in heat capacity at constantpressure (°Cp) per residue associated with un-folding of about 10 cal deg�1 mol�1, a typicalvalue for well-folded proteins of this size (30).The GuHCl-induced chemical denaturation ofTop7 is cooperative, and the steep transition ischaracteristic of the two-state unfolding expect-ed for a small, monomeric, single-domain pro-tein (Fig. 2C). Fitting the chemical denaturationdata to a two-state unfolding model yields a freeenergy of unfolding of 13.2 kcal mol�1 at 25°C,indicating that Top7 is more stable than most

proteins of similar size (31). The nuclear Over-hauser effect spectroscopy (NOESY) and hetero-nuclear single-quantum coherence (HSQC)spectra of Top7 (Fig. 2, D and E) exhibit featurescharacteristic of a folded protein with substantial�-sheet content. The HSQC spectrum containsthe expected number of cross peaks, and thedispersion is comparable to that of �/� proteinsof similar size. Strong backbone NH-H� crosspeaks and the observation of H� resonancesdownfield of the water signal (to 6 parts permillion) indicate the presence of a � sheet,whereas NH-NH peaks are consistent with apartial helical character for the protein.

Crystallization trials with Top7 yieldedcrystals that diffracted to 2.5 Å. Remarkably, astrong molecular replacement (MR) solution tothe phase problem was found with the use ofthe design model. This suggested immediatelythat the design model was quite close to the truestructure: Even the small deviations of nuclearmagnetic resonance (NMR) solution structuresfrom x-ray crystal structures can make molec-ular replacement searches fail. To obtain unbi-ased phase information, we produced and crys-tallized a selenomethionyl (SeMet)-substitutedvariant of Top7 with a surface lysine at position37 mutated to methionine, and we solved the

x-ray crystal structure to 2.5 Å by direct re-building into an unbiased single-wavelengthanomalous difference (SAD) electron densitymap (Fig. 3B) and residual difference Fouriermaps (32). The final Rwork and Rfree were 0.268and 0.293, respectively (table S2).

The high-resolution crystal structure revealsthat the Top7 protein adopts the designed topol-ogy (Fig. 4A). Indeed, the structure is strikinglysimilar to the design model at atomic resolution(1.17 Å RMSD over all backbone atoms). Theoverall protein structure is very well ordered,with the exception of two turns (comprisingresidues 11 to 15 and 24 to 31), each of whichexhibit elevated B-factors and poor quality elec-tron density. The first of these two turns and theimmediately adjoining residues from its neigh-boring strand deviate the most from the compu-tational model. However, even in this region, theall-atom RMSD between the two models doesnot exceed 2.8 Å. In contrast, the C-terminal halfof the x-ray structure is well ordered and verysimilar to the computational model; for example,the region from Asp78 to Gly85 has an all-atomRMSD of 0.79 Å (Fig. 4B). Many side chains inthe core of the solved structure are effectivelysuperposable with those of the designed Top7(Fig. 4C).

Fig. 3. Schematic representation of Top7 in unbiased SAD density. (A and B) Stick representationsof residues 46 to 76 from the computationally designed Top7 (left, green) and from the 2.5 Å x-raystructure (right, red) are shown in unbiased density (blue). The map was generated from SADphasing from a single SeMet-substituted variant of Top7, followed by density modification. (C andD) Ribbon diagrams of Top7 with residues 46 to 76 highlighted in red. The two diagrams are relatedby a 90° rotation around the vertical axis.

R E S E A R C H A R T I C L E S

21 NOVEMBER 2003 VOL 302 SCIENCE www.sciencemag.org1366

on

Janu

ary

24, 2

011

ww

w.s

cien

cem

ag.o

rgD

ownl

oade

d fr

om

Page 5: Design of a Novel Globular Protein Fold with Atomic-Level … · 2016. 6. 27. · Design of a Novel Globular Protein Fold with Atomic-Level Accuracy This copy is for your personal,

Like the design model, the Top7 crystalstructure is judged to be a novel topology bythe TOPS server. The strongest structuralsimilarity found in a Dali search of the PDB(33) is to a discontinuous portion of the 668-residue protein S-adenosylmethionine decar-boxylase, which has a large 68-residue inser-tion between strands 1 and 2, and the thirdand fourth strands are connected by an unre-fined loop instead of a helix. According to A.Murzin, the curator of the Structural Classi-fication of Proteins (SCOP) database, theTop7 fold is not present in SCOP (34, 35).

Implications. The 1.17-Å backbone atomRMSD between the Top7 design model and thecrystal structure implies that deep minima in thefree energy function used in design correspondto deep minima in the actual free-energy land-scape and hence are an important validation ofthe accuracy of current potential functions. Thisatomic-level accuracy contrasts sharply withthe low accuracy of ab initio structure predic-tions for naturally occurring sequences: Themost accurate structure predictions in the Crit-ical Assessment of Structure Prediction exper-iments for 90- to 100-residue proteins haveRMSDs greater than 4 Å from the experimen-tally determined structure. Why does the simul-taneous optimization of sequence and structureidentify the global free energy minimum,whereas the optimization of structure for fixedsequence does not? The answer may involveboth of the challenges facing ab initio structureprediction, the vast size and ruggedness of theconformational space to be searched and thelimited accuracy of current potential functions.The capability to alter the sequence and hencereconfigure the landscape may greatly facilitate

the search for low-free-energy protein struc-tures as compared to standard ab initio predic-tion, where the sequence is fixed. In addition,Top7 lacks functional constraints, which canlead to locally suboptimal regions in nativestructures that are particularly challenging forstructure prediction, and the more extensiveoptimization of the folding free energy maypartially compensate for inaccuracies in the po-tential functions. Finally, it should be noted thatthe design process focused entirely on minimiz-ing the free energy of the folded monomericstructure; attaining a highly stable new structuredid not require extensive negative designagainst possible alternative conformations (36,37) nor consideration of the kinetic process ofprotein folding (38).

The design of Top7 shows that globularprotein folds not yet observed in nature notonly are physically possible but can be ex-tremely stable. This extends the earlier obser-vation that helical coiled coil geometries notfound in nature can be generated by compu-tational protein design (15). The protein ther-apeutics and molecular machines of the fu-ture should thus not be limited to the struc-tures sampled by the biological evolutionaryprocess. The methods used to design Top7are, in principle, applicable to any globularprotein structure and open the door to theexploration and use of a vast new world ofprotein structures and architectures.

References and Notes1. B. I. Dahiyat, S. L. Mayo, Science 278, 82 (1997).2. J. R. Desjarlais, T. M. Handel, Protein Sci. 4, 2006

(1995).3. J. W. Ponder, F. M. Richards, J. Mol. Biol. 193, 775 (1987).4. J. Reina et al., Nature Struct. Biol. 9, 621 (2002).

5. L. L. Looger, M. A. Dwyer, J. J. Smith, H. W. Hellinga,Nature 423, 185 (2003).

6. S. M. Malakauskas, S. L. Mayo, Nature Struct. Biol. 5,470 (1998).

7. E. C. Johnson, G. A. Lazar, J. R. Desjarlais, T. M. Handel,Structure Fold. Des. 7, 967 (1999).

8. P. Koehl, M. Levitt, J. Mol. Biol. 293, 1161 (1999).9. B. Kuhlman, D. Baker, Proc. Natl. Acad. Sci. U.S.A. 97,

10383 (2000).10. A. Jaramillo, L. Wernisch, S. Hery, S. J. Wodak, Proc.

Natl. Acad. Sci. U.S.A. 99, 13554 (2002).11. K. Raha, A. M. Wollacott, M. J. Italia, J. R. Desjarlais,

Protein Sci. 9, 1106 (2000).12. A. Su, S. L. Mayo, Protein Sci. 6, 1701 (1997).13. S. M. Larson, J. L. England, J. R. Desjarlais, V. S. Pande,

Protein Sci. 11, 2804 (2002).14. B. Kuhlman, J. W. O’Neill, D. E. Kim, K. Y. Zhang, D.

Baker, J. Mol. Biol. 315, 471 (2002).15. P. B. Harbury, J. J. Plecs, B. Tidor, T. Alber, P. S. Kim,

Science 282, 1462 (1998).16. A. E. Keating, V. N. Malashkevich, B. Tidor, P. S. Kim,

Proc. Natl. Acad. Sci. U.S.A. 98, 14825 (2001).17. The TOPS database is available at www.

tops.leeds.ac.uk/.18. P. M. Bowers, C. E. Strauss, D. Baker, J. Biomol. NMR

18, 311 (2000).19. T. Kortemme, A. V. Morozov, D. Baker, J. Mol. Biol.

326, 1239 (2003).20. T. Lazaridis, M. Karplus, Proteins Struct. Func. Genet.

35, 132 (1999).21. Materials and methods are available as supporting

material on Science Online.22. R. L. Dunbrack, F. E. Cohen, Protein Sci. 6, 1661 (1997).23. R. Bonneau et al., Proteins (suppl. 5), 119 (2001).24. C. Rohl, C. E. M. Straus, K. M. S. Misura, D. Baker,

Methods Enzymol., in press.25. We used the Davidson-Fletcher-Powell algorithm as

described by W. H. Press et al., in Numerical Recipesin C (Cambridge Univ. Press, Cambridge, ed. 2, 1992),pp. 428–429.

26. B. Kuhlman et al., data not shown.27. The sequence of Top7 is mgDIQVQVNIDDNGKNF-

DYTYTVTTESELQKVLNELKDYIKKQGAKRVRISITARTK-KEAEKFAAILIKVFAELGYNDINVTFDGDTVTVEGQLEgg-slehhhhhh; the computationally designed sequence isin uppercase and residues added to allow expressionand purification are in lowercase. Single-letter abbre-viations for the amino acid residues are as follows:A, Ala; C, Cys; D, Asp; E, Glu; F, Phe; G, Gly; H,

Fig. 4. Comparison of the computationally de-signed model (blue) to the solved x-ray structure(red) of Top7. (A) C-� overlay of the model andstructure in stereo (backbone RMSD � 1.17 Å). (B)The C-terminal halves of the x-ray structure andmodel are extraordinarily similar. The representa-tive region shown (Asp78 to Gly85) has an all-atomRMSD of 0.79 Å and a backbone RMSD of 0.54 Å.(C) Stereorepresentation of the effectively super-posable side chains in the cores of the designedmodel and the solved structure.

R E S E A R C H A R T I C L E S

www.sciencemag.org SCIENCE VOL 302 21 NOVEMBER 2003 1367

on

Janu

ary

24, 2

011

ww

w.s

cien

cem

ag.o

rgD

ownl

oade

d fr

om

Page 6: Design of a Novel Globular Protein Fold with Atomic-Level … · 2016. 6. 27. · Design of a Novel Globular Protein Fold with Atomic-Level Accuracy This copy is for your personal,

His; I, Ile; K, Lys; L, Leu; M, Met; N, Asn; P, Pro; Q, Gln;R, Arg; S, Ser; T, Thr; V, Val; W, Trp; and Y, Tyr.

28. S. F. Altschul et al., Nucleic Acids Res. 25, 3389 (1997).29. A synthetic gene with the Top7 sequence under the

control of the T7 promoter, a C-terminal 6X His tag,and codon usage optimized for Escherichia coli wasobtained from BlueHeron Biotechnologies (Bothell,WA). After expression in E. coli, the protein wasreadily purified to �95% homogeneity with the useof nickel affinity chromatography followed by anionexchange chromatography.

30. J. K. Myers, C. N. Pace, J. M. Scholtz, Protein Sci. 4,2138 (1995).

31. K. W. Plaxco, K. T. Simons, D. Baker, J. Mol. Biol. 277,985 (1998).

32. The structure of Top7_K35M was solved by molecularreplacement with the program EPMR (39) and bydirect rebuilding into an unbiased SAD electron den-sity map and residual difference Fourier maps. Formolecular replacement, 19 large surface residuessuch as Lys, Arg, Gln, and Glu were truncated to Alain the search model. The correlation coefficient forthe initial MR search, using data to 4.0 Å resolution,was 0.52, compared with background of 0.36. ForSAD phasing, the position of SeMet 37 was deter-mined from an anomalous difference Patterson map.The initial phasing power and figure of merit for SADphasing were 1.99 and 0.24, respectively, before den-

sity modification. An interpretable electron densitymap was obtained after density modification withsolvent flipping with a solvent content of 43% withthe use of CNS (40). An initial model was built withthe use of XtalView (41) and O (42). The model wasrefined with CNS with the use of the mlhl target(maximum likelihood, Hendrickson-Lattman coeffi-cients) with 5% of the data excluded for the calcu-lation of the cross-validating free R. Of all the builtresidues, 88% are in the most favorable regions ofRamachandran space, and 12% are in the allowedregions. Statistics from phasing and refinement areshown in table S2.

33. L. Holm, C. Sander, Trends Biochem. Sci. 20, 478 (1995).34. T. J. Hubbard, A. G. Murzin, S. E. Brenner, C. Chothia,

Nucleic Acids Res. 25, 236 (1997).35. A. Murzin, personal communication.36. J. J. Havranek, P. B. Harbury, Nature Struct. Biol. 10,

45 (2003).37. W. Jin, O. Kambara, H. Sasakawa, A. Tamura, S.

Takada, Structure (Cambridge) 11, 581 (2003).38. L. Mirny, E. Shakhnovich, J. Mol. Biol. 308, 123 (2001).39. EPMR: A program for crystallographic molecular re-

placement by evolutionary search (C. R. Kissinger, D.K. Gehlhaar, Agouron Pharmaceuticals, La Jolla, CA).

40. A. T. Brunger et al., Acta Crystallogr. D. Biol. Crystal-logr. 54, 905 (1998).

41. D. E. McRee, J. Struct. Biol. 125, 156 (1999).

42. T. A. Jones, J. Y. Zou, S. W. Cowan, M. Kjeldgaard, ActaCrystallagr. A 47, 110 (1991).

43. S. Mori, C. Abeygunawardana, M. O. Johnson, P. C.van Zijl, J. Magn. Reson. B. 108, 94 (1995).

44. We acknowledge the expert assistance of B. Shen incrystallographic phasing, modeling, and refinementof the TOP7 structure, C. Rohl for aiding in theincorporation of RosettaDesign into Rosetta, C.Strauss for helping to generate the initial models ofTop7, T. Leeper for help with two-dimensional NMRstudies, and R. Klevit and the Klevit laboratoryfor help with preliminary NMR characterizationof Top7. Academic users can obtain licensing in-formation for RosettaDesign at www.unc.edu/kuhlmanpg/rosettadesign.htm. The coordinates andstructure factors for the Top7 x-ray crystal structurehave been deposited in the PDB with accession code1QYS. B.K. was supported by a fellowship from theCancer Research Fund of the Damon Runyon–WalterWinchell Foundation. This work was also supportedby NIH grants to G.V., B.L.S., and D.B.

Supporting Online Materialwww.sciencemag.org/cgi/content/full/302/5649/1364/DC1Materials and MethodsTables S1 to S6

21 July 2003; accepted 25 September 2003

Targeted Protein Degradationand Synapse Remodeling by an

Inducible Protein KinaseDaniel T. S. Pak*† and Morgan Sheng†

Synaptic plasticity involves the reorganization of synapses at the protein and themorphological levels. Here, we report activity-dependent remodeling of synapsesby serum-inducible kinase (SNK). SNK was induced in hippocampal neurons bysynaptic activity and was targeted to dendritic spines. SNK bound to and phos-phorylated spine-associated Rap guanosine triphosphatase activating protein(SPAR), a postsynaptic actin regulatory protein, leading to degradation of SPAR.Induction of SNK in hippocampal neurons eliminated SPAR protein, depletedpostsynaptic density–95 and Bassoon clusters, and caused loss of mature dendriticspines. These results implicate SNK as a mediator of activity-dependent change inthe molecular composition and morphology of synapses.

Synaptic activity can induce a variety ofchanges within postsynaptic neurons, rangingfrom transient posttranslational modificationsto altered programs of gene expression.Long-lasting forms of synaptic plasticity re-quire new gene expression and protein syn-thesis (1–3). Some activity-inducible genesmay mediate the conversion of short-termresponses to long-term changes by alteringsynaptic structure (4, 5).

Numerous activity-inducible genes havebeen identified (6, 7). Notably, few protein

kinases are known to be inducible by synapticactivity at the mRNA level; the best-character-ized induction profiles are those of SNK andFGF-inducible kinase (FNK) (8). However, theroles of these polo family kinases in plasticityare unknown.

One reasonable expectation is that synap-tic remodeling will involve the dismantlingand/or reorganization of key cytoskeletal andscaffolding protein complexes. In thepostsynaptic density (PSD) of mammalianexcitatory synapses, actin is the major cy-toskeletal element, and scaffold proteins ofthe PSD-95 family are important for as-sembling glutamate receptors with theirsignaling-cytoskeletal complexes (9, 10).One PSD-95–interacting partner, SPAR[spine-associated Rap guanosine triphos-phatase (GTPase) activating protein (GAP)],is a multidomain postsynaptic protein thatcontrols dendritic spine shape by regulating

actin arrangement as well as signaling by thesmall GTPase Rap (11). Actin dynamics andRap activity are both regulated by synapticactivity and involved in synaptic plasticity(12–16). With its postsynaptic location in theN-methyl-D-aspartate (NMDA) receptor–PSD-95 complex, SPAR is an attractive can-didate for mediating activity-dependent re-modeling of synapses.

Interaction of SPAR and SNK. Wescreened for SPAR-binding proteins with the useof individual domains of SPAR (Fig. 1A) as baitin the yeast two-hybrid system (17). SPAR con-tains two actin regulatory domains, termed Act1and Act2, a GAP domain specific for Rap, aPDZ domain of unknown function, and a C-terminal region (termed GKBD) that binds spe-cifically to the guanylate kinase domain of PSD-95 (11). When the Act2 domain was used toscreen a brain cDNA library, one of the positiveclones (clone 19) isolated was SNK, initiallyidentified in fibroblasts as an mRNA transcriptinduced by mitogenic stimulation (18). Clone 19encoded roughly the C-terminal half of SNKprotein (amino acids 395 to 682; hereon termedSNKc) (Fig. 1A), including a motif characteristicof the polo family of kinases (19, 20). Full-length SNK also bound to Act2 in the yeasttwo-hybrid assay. Neither full-length SNK norclone 19 interacted with the GKBD region ofSPAR (Fig. 1A).

The interaction between SNK and SPAR wasconfirmed with the use of an in vitro precipita-tion assay in which GST fused to SNKc precip-itated full-length SPAR expressed in COS-7cells but not a SPAR construct with an internaldeletion of the Act2 domain (SPARAct2)(Fig. 1B). The isolated Act2 domain alsobound GST-SNKc with greater efficiencythan full-length SPAR. GST alone failed tobind any of these SPAR constructs.

Picower Center for Learning and Memory, RIKEN Mas-sachusetts Institute of Technology (MIT) Neuro-science Research Center, Howard Hughes MedicalInstitute, MIT, Cambridge, MA 02139, USA.

*Present address: Department of Pharmacology,Georgetown University, Washington, DC 20057, USA.†To whom correspondence should be addressed. E-mail: [email protected] (D.T.S.P.); [email protected] (M.S.)

R E S E A R C H A R T I C L E S

21 NOVEMBER 2003 VOL 302 SCIENCE www.sciencemag.org1368

on

Janu

ary

24, 2

011

ww

w.s

cien

cem

ag.o

rgD

ownl

oade

d fr

om