6
Designed protein reveals structural determinants of extreme kinetic stability Aron Broom a , S. Martha Ma a , Ke Xia b , Hitesh Rafalia c,d , Kyle Trainor a , Wilfredo Colón b , Shachi Gosavi c , and Elizabeth M. Meiering a,1 a Department of Chemistry, Guelph-Waterloo Centre for Graduate Studies in Chemistry and Biochemistry, University of Waterloo, Waterloo, ON, Canada N2L 3G1; b Department of Chemistry and Chemical Biology, and Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, Troy, NY 12180; c National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bangalore-560065, India; and d Manipal University, Madhav Nagar, Manipal 576104, India Edited by Alan R. Fersht, Medical Research Council Laboratory of Molecular Biology, Cambridge, United Kingdom, and approved October 6, 2015 (received for review June 9, 2015) The design of stable, functional proteins is difficult. Improved design requires a deeper knowledge of the molecular basis for design outcomes and properties. We previously used a bioinformatics and energy function method to design a symmetric superfold protein composed of repeating structural elements with multivalent carbo- hydrate-binding function, called ThreeFoil. This and similar methods have produced a notably high yield of stable proteins. Using a battery of experimental and computational analyses we show that despite its small size and lack of disulfide bonds, ThreeFoil has remarkably high kinetic stability and its folding is specifically chaperoned by carbohy- drate binding. It is also extremely stable against thermal and chemical denaturation and proteolytic degradation. We demonstrate that the kinetic stability can be predicted and modeled using absolute contact order (ACO) and long-range order (LRO), as well as coarse-grained simulations; the stability arises from a topology that includes many long-range contacts which create a large and highly cooperative energy barrier for unfolding and folding. Extensive data from proteomic screens and other experiments reveal that a high ACO/ LRO is a general feature of proteins with strong resistances to denaturation and degradation. These results provide tractable approaches for predicting resistance and designing proteins with sufficient topological complexity and long-range interactions to ac- commodate destabilizing functional features as well as withstand chemical and proteolytic challenge. SDS/protease resistance | protein folding | coarse-grained simulations | protein topology | contact order T he design of proteins with a desired stable fold and function is a much sought after goal. Although impressive recent successes have been reported in designing both natural and novel protein functions and/or structures (16), design remains diffi- cult, often requiring multiple rounds of iterative improvements (710). In depth biophysical characterization of protein design outcomes and an understanding of their molecular basis have been limited, and these are critical for improving future designs. Combining designed function with structure is particularly dif- ficult, in part because functional sites tend to be sources of thermodynamic instability (11, 12) and folding frustration (1315). We investigate how an approach that considers both struc- ture and function from the outset may be used to overcome such obstacles. Furthermore, we demonstrate how kinetic and related stabilities against denaturation can be rationally designed. A promising emerging paradigm for protein design is the repetition of modular structural elements (1, 2, 57, 14, 1620). This approach can simplify the design process and build on as- pects of the evolution of natural repetition in proteins, as well as incorporate the inherent multivalent binding functionality of such structures (1, 21). Internal structural symmetry, resulting from the repetition of smaller elements of structure, is very common in natural proteins, with 20% of all protein folds (22) and the ma- jority of the most populated globular protein folds (superfolds) (21) containing internal structural symmetry. Recent design successes, for helical proteins (5, 6), repeat proteins (18, 20, 23) and symmetric superfolds (1, 2, 7, 16, 17, 19, 2426) recommend the simplification of the design process by using repetitive/symmetric folds as a par- ticularly effective strategy. The β-trefoil superfold is an interesting test case for design by repetition as bioinformatics analysis has revealed multiple and recent instances of the evolution of distinct proteins with this symmetric fold (1). The fold consists of three repeats, each con- taining four β-strands, and is adopted by numerous superfamilies with highly diverse binding functions (27). Our design of a com- pletely symmetric β-trefoil, ThreeFoil (Fig. 1), used a hypothetical multivalent carbohydrate binding template and mutated 40 of the 141 residues (1). The mutations were based on a combination of consensus design using a limited set of close homologs (to preserve function), and energy scoring using Rosetta (28). The design was successful on the first attempt, producing a soluble, well folded, and functional monomer with very high resistance to structural fluctuations as indicated by high resistance to thermal denaturation and limited amide H/D exchange (1). Here, we use a battery of biophysical and computational methods to perform an in depth analysis of Threefoil, which shows that it has remarkably slow unfolding and folding kinetics compared with natural and designed proteins due to an unusually high transition state energy barrier. Such kinetic stability against unfolding has been studied little to date. Furthermore, Threefoil is extremely resistant to chemical denaturation and proteolytic degradation. Analyses using Absolute Contact Order (ACO) (29) and Long-Range Order Significance Much research has focused on the molecular basis for protein thermodynamic stability; by comparison, kinetic stability is much less understood. Thermodynamics define the equilibrium fraction of unfolded protein while kinetics define the rate of unfolding; the latter can be of great practical importance for ensuring a protein remains folded under biological conditions. Using extensive experimental and modeling analyses we show that ThreeFoil, a small glycan binding protein without disul- fides, exhibits outstanding kinetic stability against chemical denaturation and proteolytic degradation. We demonstrate that high kinetic stability is successfully modeled in terms of extensive long-range intramolecular interactions. These results show that topological complexity is a key determinant of ki- netic stability which should help in designing proteins to withstand harsh conditions. Author contributions: A.B., S.G., and E.M.M. designed research; A.B., S.M.M., K.X., H.R., K.T., and S.G. performed research; A.B., S.M.M., K.X., K.T., W.C., S.G., and E.M.M. analyzed data; and A.B., S.G., and E.M.M. wrote the paper. The authors declare no conflict of interest. This article is a PNAS Direct Submission. 1 To whom correspondence should be addressed. Email: [email protected]. This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10. 1073/pnas.1510748112/-/DCSupplemental. www.pnas.org/cgi/doi/10.1073/pnas.1510748112 PNAS | November 24, 2015 | vol. 112 | no. 47 | 1460514610 BIOPHYSICS AND COMPUTATIONAL BIOLOGY Downloaded by guest on March 24, 2021

Designed protein reveals structural determinants of ... · Troy, NY 12180; cNational Centre for Biological Sciences, Tata Institute of Fundamental Research, Bangalore-560065, India;

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Designed protein reveals structural determinants of ... · Troy, NY 12180; cNational Centre for Biological Sciences, Tata Institute of Fundamental Research, Bangalore-560065, India;

Designed protein reveals structural determinants ofextreme kinetic stabilityAron Brooma, S. Martha Maa, Ke Xiab, Hitesh Rafaliac,d, Kyle Trainora, Wilfredo Colónb, Shachi Gosavic,and Elizabeth M. Meieringa,1

aDepartment of Chemistry, Guelph-Waterloo Centre for Graduate Studies in Chemistry and Biochemistry, University of Waterloo, Waterloo, ON, CanadaN2L 3G1; bDepartment of Chemistry and Chemical Biology, and Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute,Troy, NY 12180; cNational Centre for Biological Sciences, Tata Institute of Fundamental Research, Bangalore-560065, India; and dManipal University, MadhavNagar, Manipal 576104, India

Edited by Alan R. Fersht, Medical Research Council Laboratory of Molecular Biology, Cambridge, United Kingdom, and approved October 6, 2015 (received forreview June 9, 2015)

The design of stable, functional proteins is difficult. Improved designrequires a deeper knowledge of the molecular basis for designoutcomes and properties. We previously used a bioinformatics andenergy function method to design a symmetric superfold proteincomposed of repeating structural elements with multivalent carbo-hydrate-binding function, called ThreeFoil. This and similar methodshave produced a notably high yield of stable proteins. Using a batteryof experimental and computational analyses we show that despite itssmall size and lack of disulfide bonds, ThreeFoil has remarkably highkinetic stability and its folding is specifically chaperoned by carbohy-drate binding. It is also extremely stable against thermal and chemicaldenaturation and proteolytic degradation. We demonstrate that thekinetic stability can be predicted and modeled using absolute contactorder (ACO) and long-range order (LRO), as well as coarse-grainedsimulations; the stability arises from a topology that includes manylong-range contacts which create a large and highly cooperativeenergy barrier for unfolding and folding. Extensive data fromproteomic screens and other experiments reveal that a high ACO/LRO is a general feature of proteins with strong resistances todenaturation and degradation. These results provide tractableapproaches for predicting resistance and designing proteins withsufficient topological complexity and long-range interactions to ac-commodate destabilizing functional features as well as withstandchemical and proteolytic challenge.

SDS/protease resistance | protein folding | coarse-grained simulations |protein topology | contact order

The design of proteins with a desired stable fold and functionis a much sought after goal. Although impressive recent

successes have been reported in designing both natural and novelprotein functions and/or structures (1–6), design remains diffi-cult, often requiring multiple rounds of iterative improvements(7–10). In depth biophysical characterization of protein designoutcomes and an understanding of their molecular basis havebeen limited, and these are critical for improving future designs.Combining designed function with structure is particularly dif-ficult, in part because functional sites tend to be sources ofthermodynamic instability (11, 12) and folding frustration (13–15). We investigate how an approach that considers both struc-ture and function from the outset may be used to overcome suchobstacles. Furthermore, we demonstrate how kinetic and relatedstabilities against denaturation can be rationally designed.A promising emerging paradigm for protein design is the

repetition of modular structural elements (1, 2, 5–7, 14, 16–20).This approach can simplify the design process and build on as-pects of the evolution of natural repetition in proteins, as well asincorporate the inherent multivalent binding functionality ofsuch structures (1, 21). Internal structural symmetry, resulting fromthe repetition of smaller elements of structure, is very common innatural proteins, with ∼20% of all protein folds (22) and the ma-jority of the most populated globular protein folds (superfolds) (21)containing internal structural symmetry. Recent design successes,

for helical proteins (5, 6), repeat proteins (18, 20, 23) and symmetricsuperfolds (1, 2, 7, 16, 17, 19, 24–26) recommend the simplificationof the design process by using repetitive/symmetric folds as a par-ticularly effective strategy.The β-trefoil superfold is an interesting test case for design by

repetition as bioinformatics analysis has revealed multiple andrecent instances of the evolution of distinct proteins with thissymmetric fold (1). The fold consists of three repeats, each con-taining four β-strands, and is adopted by numerous superfamilieswith highly diverse binding functions (27). Our design of a com-pletely symmetric β-trefoil, ThreeFoil (Fig. 1), used a hypotheticalmultivalent carbohydrate binding template and mutated 40 of the141 residues (1). The mutations were based on a combination ofconsensus design using a limited set of close homologs (to preservefunction), and energy scoring using Rosetta (28). The design wassuccessful on the first attempt, producing a soluble, well folded,and functional monomer with very high resistance to structuralfluctuations as indicated by high resistance to thermal denaturationand limited amide H/D exchange (1).Here, we use a battery of biophysical and computational methods

to perform an in depth analysis of Threefoil, which shows that it hasremarkably slow unfolding and folding kinetics compared withnatural and designed proteins due to an unusually high transitionstate energy barrier. Such kinetic stability against unfolding has beenstudied little to date. Furthermore, Threefoil is extremely resistantto chemical denaturation and proteolytic degradation. Analysesusing Absolute Contact Order (ACO) (29) and Long-Range Order

Significance

Much research has focused on the molecular basis for proteinthermodynamic stability; by comparison, kinetic stability ismuch less understood. Thermodynamics define the equilibriumfraction of unfolded protein while kinetics define the rate ofunfolding; the latter can be of great practical importance forensuring a protein remains folded under biological conditions.Using extensive experimental and modeling analyses we showthat ThreeFoil, a small glycan binding protein without disul-fides, exhibits outstanding kinetic stability against chemicaldenaturation and proteolytic degradation. We demonstratethat high kinetic stability is successfully modeled in terms ofextensive long-range intramolecular interactions. These resultsshow that topological complexity is a key determinant of ki-netic stability which should help in designing proteins towithstand harsh conditions.

Author contributions: A.B., S.G., and E.M.M. designed research; A.B., S.M.M., K.X., H.R.,K.T., and S.G. performed research; A.B., S.M.M., K.X., K.T., W.C., S.G., and E.M.M. analyzeddata; and A.B., S.G., and E.M.M. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.1To whom correspondence should be addressed. Email: [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1510748112/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.1510748112 PNAS | November 24, 2015 | vol. 112 | no. 47 | 14605–14610

BIOPH

YSICSAND

COMPU

TATIONALBIOLO

GY

Dow

nloa

ded

by g

uest

on

Mar

ch 2

4, 2

021

Page 2: Designed protein reveals structural determinants of ... · Troy, NY 12180; cNational Centre for Biological Sciences, Tata Institute of Fundamental Research, Bangalore-560065, India;

(LRO) (30) as well as Go model folding simulations (31–33) showthat ThreeFoil’s resistance can be explained by the high coopera-tivity of its folded structure, which includes many long-range in-teractions. Simulations also show that nonnative interactions orfolding frustration arising from protein symmetry (34) do not createlong-lived traps during folding or account for the high barrier. Theyalso explain how ligand binding can chaperone folding, which canbe an added advantage of designing the fold and function together.Notably, additional analyses using whole proteome screening andother experiments show that proteins with similar resistances asThreeFoil generally have high ACO/LRO values. Thus, the designmethod used for ThreeFoil and the strategy of designing folds withmany long-range contacts may be useful for designing functionalproteins with high resistance to denaturation and degradation, asmay be needed for challenging biotechnology applications.

ResultsThreeFoil Has Extremely Slow Kinetics and Substantial ThermodynamicStability. To better understand the basis for ThreeFoil’s very highapparent melting temperature (>90 °C) and slow amide exchange(1), we measured its folding kinetics and thermodynamic stabilityusing chemical denaturation. ThreeFoil is extremely resistant tochemical denaturation, remaining folded in high concentrations ofurea, with unfolding only observable after very long incubationtimes in high concentrations of the stronger denaturants guanidi-nium chloride and guanidinium thiocyanate (GuSCN) (Fig. 2 and SIAppendix, Fig. S1). ThreeFoil’s half-life for unfolding in the absenceof denaturant is ∼8 y, although its folding half-life is on the order of1 h (SI Appendix, Table S1). A comparison against natural anddesigned proteins of varying structural classes and lengths illustrateshow unusually slow these kinetics are (Fig. 3). Despite ThreeFoil’sslow kinetics, unfolding is highly reversible and the rate constantsmeasured by multiple optical probes vary linearly with denaturantconcentration, indicating a two-state transition between folded andunfolded states (Fig. 2A and SI Appendix, Fig. S1). The very slowkinetics are indicative of a high free energy (un)folding transitionstate (Fig. 2B). Similarly, a high transition state barrier underlies theextremely slow unfolding of α-lytic protease (35); however, unlikethis prototypical kinetically stable protein, which is thermodynami-cally unstable, ThreeFoil possesses substantial thermodynamic sta-bility of ∼6 kcal/mol (SI Appendix, Table S1).Various studies have found evidence that repetition in proteins

can slow kinetics by creating folding frustration (34, 36). To furtherexamine the role of sequence repetition on kinetics, we compareThreeFoil with another fully symmetric β-trefoil, Symfoil, which wasobtained using iterative rounds of rational design and sequenceselection (7). Symfoil has <15% sequence identity to ThreeFoil and,although it has a higher thermodynamic stability of ∼11 kcal/mol, it

both folds and unfolds much faster (one million- and 400-fold, re-spectively, SI Appendix, Table S1 and Fig. 3). Thus, symmetry doesnot a priori result in kinetic stability. Despite having a nearlyidentical core secondary structure to Symfoil, ThreeFoil has addi-tional length and interactions for a set of loops involved in its car-bohydrate binding function (Fig. 1 B and C). By contrast, heparinbinding function, including binding residues in a loop of Symfoil’stemplate protein, acidic fibroblast growth factor, were eliminatedduring the many iterations of the design process (7). As ThreeFoil’slonger loops surround and create its ligand binding sites, we in-vestigated the structure of these sites during folding.

Formation of Ligand Binding Sites During Folding. Measurements ofthe kinetics of (un)folding in the presence of ligands can be used tomonitor the formation of ligand binding sites (37). ThreeFoil has asingle binding site for sodium, which is coordinated by three sets ofresidues distant in sequence, and three carbohydrate binding sites,which have binding residues close in sequence (Fig. 1 B and C) (1).The carbohydrate sites bind lactose and are highly specific forglycans with terminal galactose in a β-1,4 linkage (SI Appendix,Figs. S2A and S3). Sodium decreases the unfolding rate but has noeffect on the folding rate, thus it binds only to the folded state andnot to the transition state (Fig. 2B). In contrast, lactose not onlydecreases the unfolding rate but also increases the folding rate,indicating partial formation of the lactose binding sites in thetransition state ensemble. The kinetic effects of lactose are specificand not general solvent properties, as no kinetic changes are ob-served for sucrose (Fig. 2 and SI Appendix, Table S1).Interestingly, the addition of lactose also increases the de-

naturant-dependence of stability, m, which reports on the extentof solvent accessible surface burial for a structural transition.The m increases from a value that is 68% of that expected for aprotein of this size to 85% (SI Appendix, Table S1). An increasein m may arise from increased burial of hydrophobic residues inthe folded protein and/or decreased residual structure in theunfolded protein. Multiple lines of evidence support the latterexplanation. Circular dichroism (CD) and NMR (SI Appendix,Figs. S2 and S4, respectively) experiments for folded ThreeFoilshow no significant change in native structure upon lactosebinding. Also, anilinonapthalenesulfonic acid (ANS), a dye thatbinds clusters of exposed hydrophobic groups, shows no bindingto folded or denatured Threefoil (SI Appendix, Fig. S5). There isno apparent change in CD upon adding lactose to denaturedThreeFoil (SI Appendix, Fig. S2B); however, the CD spectra ofdenatured ThreeFoil show evidence for nonrandom structure.Similarly, quantitative CD analysis for OneFoil, a peptide con-sisting of just one of the constituent repeats of ThreeFoil, shows

Fig. 1. Design of ThreeFoil. (A) ThreeFoil (PDB: 3PG0) illustrating its threeidentical peptide subdomains (red, green, blue). (B) ThreeFoil’s secondarystructure: turn (purple), β-strand/bridge (yellow), and 3/10-helix (magenta)and ligand binding residues indicated by colored circles and insertions shownin red. (C) Comparison of ThreeFoil with the independently designed Symfoil(PDB: 3O4D, 15% sequence identity), shown along (Left) and across (Right) theaxis of symmetry. Backbones are colored by RMSD between the two structures(blue to white, 0–5 Å), with insertions in the loops of ThreeFoil relative toSymfoil colored red. ThreeFoil’s bound sodium shown in gray, and bis-Tris,which binds in the conserved carbohydrate binding sites, shown in cyan.

Fig. 2. Folding and unfolding kinetics of ThreeFoil are modulated by ligandbinding. (A) Chevron plots of observed folding and unfolding rate constants(in s−1) in GuSCN were determined by fluorescence. Measurements werewithout sodium (gray open circles), with sodium (300 mM, black filled circles)or sodium and 50 mM of either lactose (cyan filled circles) or sucrose (cyanopen circles). (B) Energy diagram corresponding to the kinetic measurements(coloring as in A). The energy axis is given by -RTln(kobs) and the reactioncoordinate follows the change in solvent accessible surface area as measuredby mf and mu. The folded (F), transition (‡), and unfolded (U) states are in-dicated. Unfolded state energies and folded state reaction coordinates areset equal to facilitate comparisons.

14606 | www.pnas.org/cgi/doi/10.1073/pnas.1510748112 Broom et al.

Dow

nloa

ded

by g

uest

on

Mar

ch 2

4, 2

021

Page 3: Designed protein reveals structural determinants of ... · Troy, NY 12180; cNational Centre for Biological Sciences, Tata Institute of Fundamental Research, Bangalore-560065, India;

it has approximately half of the β-structure observed in foldedThreeFoil (SI Appendix, Fig. S2C). However, OneFoil shows noevidence for stable structure formation by NMR (1). These ex-periments strongly suggest the presence of fluctuating residualstructure in the ensemble of denatured conformers for Three-Foil. Together with folding simulations (described below), theresults indicate that lactose binding enhances folding not only bybinding to the transition state, but also by binding weakly to someconformations in the denatured ensemble and so decreasing non-native residual structure.Other studies have also shown that different types of ligands

(e.g., metals, heme, nucleotides) can bind to partially foldedproteins (denatured, intermediate, and transition states) and sopromote folding (38–40). Thus, ligands may not only stabilize thenative state, but also promote and chaperone protein folding bybinding to other states and thereby smooth the folding energylandscape. In this way, ligand binding can increase the foldabilityof the protein when structure and function are designed con-currently rather than separately.

Modeling Reveals the Molecular Mechanisms for ThreeFoil’s LigandBinding and Slow Kinetics. The ligand binding loops make exten-sive contacts with distant residues in the primary sequence and soincrease the ACO and LRO for ThreeFoil, which are notably high(Fig. 3 A and B). ACO/LRO are measures of topological com-plexity based on the sequence separation of contacting residues inthe folded protein. We have shown recently that the rates ofprotein folding and unfolding both decrease with increasing ACO/LRO (41). LRO provides a more linear and stronger correlationand is normalized for increasing protein size, which also slows(un)folding (41, 42). As ACO/LRO provide just a simple measureof protein structural complexity, we used Go models to furtherdefine the molecular origins of ThreeFoil’s high barrier.Go models encode the structure of the folded protein in their

energy functions (31–33) and can be used to understand athigher resolution the effects of protein topological complexity onfolding. Molecular dynamics (MD) simulations of such modelsfor diverse proteins can capture trends in barrier heights as wellas mechanistic details of folding (32). Here, a coarse-grained Go

model shows that ThreeFoil has a particularly high free energybarrier (Fig. 4A). Also, in the structure of the transition stateensemble (Fig. 4 B and C) residues around the carbohydratebinding site of the second repeat are almost completely folded.Therefore, lactose may bind to and lower the energy of the tran-sition state ensemble and so increase the folding kinetics, whereasunfolding kinetics are slowed owing to even stronger lactosebinding to the folded state (Fig. 2). In contrast, the residues in thesodium-binding site are quite unstructured early in the transitionstate (Fig. 4 B and C) showing that sodium only binds the foldedstate and therefore slows unfolding with no effect on folding.Thus, the Go model simulations rationalize ThreeFoil’s slow ex-perimental kinetics and also provide a molecular explanation forthe kinetic effects of ligands. In addition, although a simple cal-culation of ACO/LRO indicates that ThreeFoil should (un)foldslower than Symfoil and Hisactophilin (see below) (Fig. 3 A andB), the more detailed Go model simulations better capture thevariations in these rates (Fig. 4G and SI Appendix, Fig. S6).The largest differences between Threefoil and Symfoil are in

the β2–β3 loops, at the edge of ThreeFoil’s carbohydrate bindingsites (Fig. 1 B and C). Consequently, the differences in the contactmaps of the two also occur mostly in the contacts of the β2–β3loops, with Symfoil having shorter loops and fewer contacts in thisregion. To understand whether the high barrier for ThreeFoil iscaused by differences in the length, conformation and packing ofthe β2–β3 loops or by differences in packing for the rest of theprotein a hybrid construct (HYB) was created which has theSymfoil backbone with the ThreeFoil contact map; this constructhas almost the same barrier as Symfoil (SymF) (Fig. 4G). Thus,the differences responsible for the higher barrier are the β2–β3loops. To further define how the conformation and packing of theβ2–β3 loops increases the barrier, a mutant of ThreeFoil (MUT1)was created with all long-range interactions of the β2–β3 loopsdeleted (Fig. 4 D and E). The mutation lowered the barrier heightof MUT1 leaving it similar to that of HYB and SymF (Fig. 4G).These results show that the packing of the β2–β3 loops of a givenrepeat with parts of the other repeats create long-range contactsthat markedly increase the barrier height. To test the effect ofother long-range contacts (which reduce the overall ACO by anequivalent amount), a control mutant (MUT2) was created wherethe same number of other contacts with similar sequence sepa-rations were deleted (Fig. 4 D and F). The free energy barrier ofMUT2 is similar to that of both MUT1 and HYB. Thus, the ki-netic stability of a protein can be reduced by either a large loss inpacking density localized in the structure (as in MUT1) or by anadditive effect from many losses across the structure (as inMUT2). To further confirm the correlation between ACO/LRO,barrier heights, and kinetic stability, we also simulated Hisacto-philin, a natural β-trefoil with a low barrier (SI Appendix, TableS1). As expected, the low ACO/LRO Hisactophilin has a muchlower folding free energy barrier (Fig. 4G; green profile) than thatof either ThreeFoil or SymFoil and has the lowest kinetic stability(Fig. 3). Distinct functional features for Hisactophilin contributeto its low ACO/LRO and barrier (43).In principle, the internal symmetry of ThreeFoil might also

contribute to slow folding by creating misfolded intermediatesarising from internal subdomain swapping, analogous to domainswapping observed or inferred for proteins containing longerstretches of repeated sequence (34, 36). Such trapping was testedas a cause of ThreeFoil’s slow kinetics using simulations per-formed with the addition of symmetric nonnative contacts be-tween the repeats. The results indicate that, close to the transitionmidpoint, nonnative interactions arising from symmetry do notcreate significant trapping (SI Appendix, Fig. S7). Thus, comparedwith the longer proteins the shorter repeat length of ThreeFoilaided by local structure formation, likely limits slowing of foldingdue to nonnative interrepeat interactions.MD simulations were also used to investigate the effect of non-

native residual structure (in the unfolded ensemble) on ThreeFoilfolding. We performed simulations where the local structure ofall three repeats was biased to both the native ThreeFoil structure

Fig. 3. ThreeFoil folding/unfolding kinetics are extremely slow comparedwith other proteins. Rate constants (gray diamonds) at the transition midpoint(ln(kf) = ln(ku)) for a large dataset of proteins (SI Appendix, Table S2) (41), arecorrelated with ACO (A) and LRO (B). β-trefoil proteins: ThreeFoil (orange),Symfoil (green), and Hisactophilin (blue) are highlighted. (C) The half-lives forfolding (orange) and unfolding (blue) are shown for β-trefoils and the aver-ages for the large dataset in each major structural class (α, β, αβ). The pro-totypical kinetically stable protein α-lytic protease is shown for comparison(35). Ankyrin proteins with 1–3 consensus designed internal repeats (NI1C toNI3C) illustrate the effect of increasing interface area and cooperativity (23).

Broom et al. PNAS | November 24, 2015 | vol. 112 | no. 47 | 14607

BIOPH

YSICSAND

COMPU

TATIONALBIOLO

GY

Dow

nloa

ded

by g

uest

on

Mar

ch 2

4, 2

021

Page 4: Designed protein reveals structural determinants of ... · Troy, NY 12180; cNational Centre for Biological Sciences, Tata Institute of Fundamental Research, Bangalore-560065, India;

(as above) and to the most common OneFoil structure obtained inRosetta ab initio simulations. The tertiary contacts between the re-peats were calculated only from the native ThreeFoil structure. TheThreeFoil tertiary contacts appear to suppress the intra-OneFoilnonnative interactions and these nonnative interactions do not cre-ate significant trapping (SI Appendix, Fig. S8). Lastly, simulations ofjust OneFoil (including both native and nonnative structural biases)confirm that the presence of ligand, modeled as a strengthening ofintrabinding-residue contacts, greatly suppresses the formation ofnonnative structure (SI Appendix, Fig. S8 A and B). Overall, thesimulations explain the effects of ligands during folding while alsorevealing that the extreme kinetic stability of ThreeFoil arises fromits native topology and is unlikely to be caused by nonnative traps onthe folding free energy landscape.

High Chemical and Protease Resistances of ThreeFoil and Other HighACO/LRO Proteins. Extremely slow unfolding has been associatedwith the capacity to maintain native form and function under harshconditions (44), such as high concentrations of protease (35, 45, 46)and detergent (46, 47). Protease resistance of a classic extremelykinetically stable protein, α-lytic protease, has been proposed tooriginate from its large and highly cooperative unfolding energybarrier resulting in a rigid native conformation with limited localopenings and consequently limited proteolytically susceptible re-gions (35). Also, challenge by a high concentration of SDS hasbeen used extensively for direct evaluation of protein kinetic sta-bility based on the ability of SDS to induce denaturation by trap-ping hydrophobic residues exposed during even transient unfoldingevents (47, 48). Given its high barrier to unfolding, we testedThreeFoil for resistance to protease and SDS.In the manner of Manning and Colón (46) in their profiling of

protein kinetic stability, we incubated ThreeFoil with the aggressiveand nonspecific protease, proteinase K. ThreeFoil demonstratedstrong resistance, remaining intact for the full 4-d challenge by a highconcentration of protease (Fig. 5A). A highly protease-resistantcontrol protein, the dimeric human Cu,Zn superoxide dismutase(SOD) also remained intact, whereas histactophilin, which hasgreater thermodynamic stability but much faster unfolding kineticsthan ThreeFoil (SI Appendix, Table S1), was completely degradedwithin an hour as were other commonly studied proteins (Fig. 5A).The results for the SDS challenge follow the same pattern withonly ThreeFoil and SOD being resistant (Fig. 5B), although

SOD depends on an intact disulfide bond for SDS resistancewhereas ThreeFoil does not (SI Appendix, Fig. S9F).Given the correlations between high topological complexity and

slow unfolding (Fig. 3 A and B) (41) and between slow unfoldingand SDS/protease resistance (44, 46), we asked whether theseresistances could be predicted from topological complexity. Weconducted experiments and surveyed the literature to identifyproteins with experimentally demonstrated resistance, or lackthereof, to SDS or protease. The identified proteins include new(SI Appendix, Fig. S10) and previously reported (45, 47) resultsfrom whole proteome screening to identify kinetically stable pro-teins, as well as new (Fig. 5 A and B and SI Appendix, Fig. S9) andpreviously reported analyses of individual proteins (SI Appendix,Table S3). The results (Fig. 5 C and D) clearly show that resistantproteins have notably high ACO/LRO values, similar to Three-Foil, whereas the nonresistant proteins tend to have much lowervalues. The few nonresistant proteins with a high ACO/LRO in-dicate that high topological complexity is necessary but not alwayssufficient for resistance. The preceding observations suggest therough measure provided by ACO/LRO does not capture otherrequirements such as highly cooperative unfolding, needed toeliminate weak points in the structure, which provide oppor-tunities for attack by chemical denaturants and proteases (35,44, 46). Thus, a high ACO/LRO indicates potentially high re-sistance to degradation/denaturation but a more detailed sim-ulation, as performed for ThreeFoil (Fig. 4), is needed for amore accurate prediction and understanding of molecular de-terminants for resistance.Finally, we note that the distribution of ACO/LRO values for a

large dataset of proteins with previously characterized kinetics,similar to the nonresistant cases, is markedly lower than for theresistant proteins (Fig. 5 C and D). Thus, although kinetically stableresistant proteins exist, they have received relatively little attentionand using their folds or incorporating analogous long-range contactsprovides attractive new avenues for designing resistance.

Discussion.An in depth analysis of the folding characteristics of designedproteins, as we have performed for the threefold symmetricThreeFoil, is rarely reported, yet is critical for ultimately un-derstanding design outcomes and improving their reliability. Wedemonstrate a high level of design success for ThreeFoil as

0

1

5

20

ΔG /

k BT f P

opulation

TSE3

2 1

MUT1

MUT2

90o

NC

0 0.2 0.4 0.6 0.8 1 1400

140

Residue Number

Res

idue

Num

ber

Q (fraction of native contacts)1400

140

0

1

1

2

3

Residue NumberR

esid

ue N

umbe

r0 0.2 0.4 0.6 0.8 1

Q (fraction of native contacts)

5

20

ΔG /

k BT f

ThreeFoil

MUT1

MUT2

SymFHYB

His

A B C D E

F

G

Fig. 4. Structure-based simulations reveal the molecular origins of ThreeFoil’s large kinetic barrier. (A) The folding free energy of ThreeFoil in units of kBTf(left y axis) is plotted at the transition midpoint (Tf) as a function of the fraction of native contacts (Q) in black. The population distribution is plotted in gray(right y axis). The protein populates only the unfolded state (Q ∼ 0.1) and the folded state (Q ∼ 0.9). The two curves were calculated from simulations of aThreeFoil model using all contacts shown in D. (B) Contact map of the transition state ensemble (TSE, Q ∼ 0.35 in A), colored based on degree of structure,with 1 indicating native levels of structure and 0 indicating no structure. Contacts between lactose binding site residues (cyan) and sodium binding residues(gray) are shown. The numbered squares contain intratrefoil contacts (see C). (C) Average level of structure derived from B (same coloring) illustrated forThreeFoil partitioned into its three repeats by gray lines. The residues shown as spheres are part of the 3 symmetric lactose-binding sites, whereas thoseshown as sticks are part of the single sodium-binding site (Fig. 1). The rotation indicated gives the view in E and F. (D) Contact map of ThreeFoil, with contactsdeleted in MUT1 and MUT2 shown in red and blue, respectively. All deleted contacts are long-range (far from diagonal). (E) Long-range contacts (red sticks) ofthe β2–β3 loop residues (red spheres at Cα positions) deleted in MUT1. (F) For MUT2 the same number of contacts were deleted such that MUT1 and MUT2have a very similar ACO. However, these contacts (blue sticks) are spread over the entire protein. (G) Folding free energies of ThreeFoil (black, same as in A),Symfoil (SymF; gray), a hybrid protein with the ThreeFoil contact map projected on the Symfoil backbone (HYB; gold), the two ThreeFoil mutants (MUT1; red,MUT2; blue), and Hisactophilin (His; green) are plotted in units of their respective folding temperatures (kBTf, y axis) at their respective transition midpoints asa function of the fraction of their respective native contacts (x axis). The SymF, HYB, MUT1 and MUT2 free energy profiles have very similar barrier heights, inbetween those of the highly kinetically stable ThreeFoil and the much less stable His.

14608 | www.pnas.org/cgi/doi/10.1073/pnas.1510748112 Broom et al.

Dow

nloa

ded

by g

uest

on

Mar

ch 2

4, 2

021

Page 5: Designed protein reveals structural determinants of ... · Troy, NY 12180; cNational Centre for Biological Sciences, Tata Institute of Fundamental Research, Bangalore-560065, India;

evidenced by its: (i) reversible, cooperative, two-state (un)folding;and (ii) well folded and functional native structure which has highsolubility and monodispersity, well diffracting crystals, and greatresistance against H/D exchange (1), denaturation by chaotropesand detergent, and degradation by protease.Although the rational design of proteins with desired structure

and function remains a great challenge and often require mul-tiple cycles of design and/or selection to improve them, successesin designing both structures and/or functions, including ones notobserved in nature, have been increasing (3, 4, 6, 8, 9, 18, 49, 50).These results demonstrate the increasing understanding of funda-mental principles and utility of computational protein design. Re-cently, there have been multiple reports of success for commonfolds based on repeated structural elements, including relativelyhigh success rates and stabilities for various helix-containing elon-gated repeat proteins (18, 51) and toroidal or globular superfolds(1–3, 7, 16, 17, 19). The great diversity of sequences observed forthese symmetric protein structures may reflect an inherent capacityfor stability, foldability and functionality that is especially amenableto both evolution and design (22).

Design strategies similar to that used to make ThreeFoil,which use repetition of structural elements designed usingexisting sequence information and structural modeling with theRosetta energy function (28), have proven particularly fruitful,with several studies yielding well-folded proteins with highmelting points on the first attempt (1, 16–18). Furthermore, wehave shown that ThreeFoil possesses stability, cooperativity andmultivalent binding function. These features may be “inherited”through the use of existing sequence information, generating amore naturally funneled energy landscape. Other proteinsdesigned in a similar way, and not yet characterized in detail,may also capture favorable natural features (16–18). Also, ourresults show how ligand binding can further smooth the land-scape by decreasing the formation of nonnative structure and sopromote folding and design success.Although evolution has provided a great range of sequences

and structures that may be leveraged, it has also set limitations,which need not constrain rational protein design. As an example,natural proteins for which kinetics have been measured typicallyunfold on a timescale of seconds-hours (41); ready unfoldingmay be needed to facilitate protein transport, regulation orturnover. However, other natural proteins that must withstandharsh extracellular or thermophilic conditions tend to have highkinetic stability (35, 44); hence, fast unfolding is not an inherentconstraint on proteins. Artificial proteins can be freed fromvarious biological constraints allowing for uncommon propertiessuch as extreme kinetic stability using suitable natural structuresor novel ones with the requisite features.It is important to note that although the energy landscape of a

protein defines both its thermodynamic and kinetic stabilities,the two properties are distinct. Thermodynamic stability dependson the energy difference between folded and unfolded states,whereas kinetic stability depends on the energy barrier betweenthe folded and transition states (Fig. 2). High kinetic stability is aparticularly attractive feature for rational design, as it is linked toother benefits such as resistances against protein denaturationand degradation by detergents (by decreasing exposure of thehydrophobic core), proteases (by limiting accessibility of cutsites), and temperature (by producing a high energy transitionstate barrier that is unlikely to be crossed by thermal fluctua-tions) (44, 46). Such characteristics are highly desirable for in-dustrial or biomedical applications that require a protein toremain folded and functional for a long time, even in challengingenvironments. Although it is known that kinetic stability and itsassociated resistances are the result of slow global unfolding andlimited local opening (35, 44–46), little has been reported onhow to rationally incorporate this into designed proteins. Our indepth experimental and modeling analyses of ThreeFoil providevaluable insight into the molecular basis for these characteristics.Specifically, the origin of ThreeFoil’s very slow global and lim-ited local unfolding is a high and steep energy barrier which is aconsequence of a folded topology that includes a large numberand proportion of long-range and extensively distributed con-tacts. Thus, there are no weak points in the structure and itundergoes very cooperative folding to a native state that is highlyresistant to local openings.In summary, a simple calculation of ACO/LRO indicates whether

a design has the capacity to be kinetically stable, whereas Go modelsimulations give a more accurate prediction and can be used todetermine the impact of specific contacts. This paves the way forrational design of resistance to harsh conditions. The mechanisticunderstanding of the structural determinants of resistance and theability to design it, as well as the simplified and efficient designprocess of using structural repetition within the context of a sym-metric and functional superfold, provide valuable avenues for im-proving future protein designs.

Materials and MethodsProtein Expression and Purification. ThreeFoil was expressed in E. coli andinclusion bodies solubilized in urea before being purified on a Ni-NTA

Fig. 5. ThreeFoil is highly resistant to protease and detergent. (A) Incu-bation with Proteinase K of: ThreeFoil (3F), Hisactophilin (His), human Cu,Zn superoxide dismutase (SOD), BSA, ovalbumin (Ova), β-lactoglobulin(βlac), myoglobin (Myo), and lysozyme (Lys). Protein before (−) and afterincubation with protease (+) are shown. Retention of the protein bandafter incubation shows resistance to digestion. ThreeFoil and SOD areshown after 4 d (still nondegraded), whereas others are shown after 1 h(fully degraded). The molecular weight decrease for ThreeFoil after in-cubation is due to the loss of its unstructured his-tag (untagged ThreeFoilhas a MW of 15.3 kDa and runs higher than intact Hisactophilin with a MWof 13.3 kDa, see also SI Appendix, Fig. S9A). Individual gels shown in SIAppendix, Fig. S9 B–E. (B) The same proteins tested for resistance to SDS.Where the unboiled (U) and boiled (B) samples are indistinguishable, no SDSresistance is observed, whereas a higher running unboiled sample indicatesSDS is unable to penetrate and bind without thermal unfolding of the protein.Comparison of topological complexity as measured by ACO (C) and LRO (D) forproteins that have been kinetically characterized experimentally (SI Appendix,Table S2) and those with experimentally demonstrated resistance or non-resistance to protease and SDS (SI Appendix, Table S3). Resistant proteinsgenerally have higher topological complexity. β-trefoil proteins are colored asin Fig. 3. Data shown as box-and-whisker plots, with a horizontal line at themedian, box enclosing middle 50% of the data, whiskers drawn to 1.5*IQR(interquartile range).

Broom et al. PNAS | November 24, 2015 | vol. 112 | no. 47 | 14609

BIOPH

YSICSAND

COMPU

TATIONALBIOLO

GY

Dow

nloa

ded

by g

uest

on

Mar

ch 2

4, 2

021

Page 6: Designed protein reveals structural determinants of ... · Troy, NY 12180; cNational Centre for Biological Sciences, Tata Institute of Fundamental Research, Bangalore-560065, India;

column and refolded by dialysis, as described (1). Details for removal ofsodium are given in the SI Appendix, SI Materials and Methods.

Kinetic Measurements.Measurements were performed at 27 °C andmonitoredby fluorescence (excitation 274 nm, emission 317 nm) using a SpectraMax M5plate reader (Molecular Devices). Protein was diluted into varying concentra-tions of GuSCN by manual mixing and measured for up to 4 d. Additionaldetails are given in the SI Appendix, SI Materials and Methods.

SDS Resistance. Protein in H2O was diluted into SDS and Tris so that finalsamples contained 0.5 mg/mL protein and 1% SDS in 125 mM Tris (pH 6.8).Samples were then either boiled or incubated at room temperature for10 min before analysis by SDS/PAGE using 15% (wt/vol) Acrylamide gels with0.1% SDS in Tris/glycine running buffer (pH 8.3), and either without (Fig.5B) or with (SI Appendix, Fig. S9 F and G) 7% (vol/vol) β-mercaptoethanolto reduce disulfides.

Protease Resistance. Samples contained 0.5 mg/mL of protein in 25 mM Trisand 1 μM EDTA (pH 8.3). A time 0, control was taken before adding pro-teinase K (final concentration 0.02 mg/mL), and further samples taken after1 h, 1 d, and 4 d of incubation at 25 °C. The reaction was stopped by mixing

samples 1:1 with buffer [2.5 μM phenylmethylsulfonyl fluoride, 125 mM Tris,4% SDS (wt/vol), 20% (vol/vol) glycerol, 15% (vol/vol) β-mercaptoethanol, atpH 6.8] and boiling for 10 min. SDS/PAGE was performed using the same gelconditions as for SDS Resistance.

Coarse-Grained Go Models. A common form of the Go model (31, 32) was usedto perform MD simulations. The inputs to this model are the coordinates ofthe Cα atoms of the protein and the contact map. Details of the model, con-tact map calculations, simulation conditions and analyses are given in SI Ap-pendix, SI Materials and Methods. The simulations were performed using apreviously developed enhanced sampling technique (52) based on themulticanonical method.

ACKNOWLEDGMENTS. We thank Prof. Jayant B. Udgaonkar for monellinand SH3; Dr. Ranabir Das for ubiquitin; and Core H of the Consortium forFunctional Glycomics, funded by the National Institute of General MedicalSciences (GM62116), for glycan array analysis. This work was supported byNational Sciences and Engineering Research Council of Canada (to E.M.M.),Government of India-DAE and DST-Ramanujan Fellowship (SR/S2/RJN-63/2009, 5 years, wef 15/04/2010; to S.G.), and National Science FoundationGrant 1158375 (to W.C.).

1. Broom A, et al. (2012) Modular evolution and the origins of symmetry: Reconstructionof a three-fold symmetric globular protein. Structure 20(1):161–171.

2. Longo LM, Kumru OS, Middaugh CR, Blaber M (2014) Evolution and design of proteinstructure by folding nucleus symmetric expansion. Structure 22(10):1377–1384.

3. Koga N, et al. (2012) Principles for designing ideal protein structures. Nature 491(7423):222–227.

4. Rajagopalan S, et al. (2014) Design of activated serine-containing catalytic triads withatomic-level accuracy. Nat Chem Biol 10(5):386–391.

5. Thomson AR, et al. (2014) Computational design of water-soluble α-helical barrels.Science 346(6208):485–488.

6. Huang P-S, et al. (2014) High thermodynamic stability of parametrically designedhelical bundles. Science 346(6208):481–485.

7. Lee J, Blaber SI, Dubey VK, Blaber M (2011) A polypeptide “building block” for theβ-trefoil fold identified by “top-down symmetric deconstruction”. J Mol Biol 407(5):744–763.

8. Privett HK, et al. (2012) Iterative approach to computational enzyme design. Proc NatlAcad Sci USA 109(10):3790–3795.

9. Li Z, Yang Y, Zhan J, Dai L, Zhou Y (2013) Energy functions in de novo protein design:Current challenges and future prospects. Annu Rev Biophys 42:315–335.

10. Korendovych IV, DeGrado WF (2014) Catalytic efficiency of designed catalytic pro-teins. Curr Opin Struct Biol 27:113–121.

11. Meiering EM, Serrano L, Fersht AR (1992) Effect of active site residues in barnase onactivity and stability. J Mol Biol 225(3):585–589.

12. Shoichet BK, Baase WA, Kuroki R, Matthews BW (1995) A relationship betweenprotein stability and protein function. Proc Natl Acad Sci USA 92(2):452–456.

13. Capraro DT, Roy M, Onuchic JN, Gosavi S, Jennings PA (2012) β-Bulge triggers route-switching on the functional landscape of interleukin-1β. Proc Natl Acad Sci USA109(5):1490–1493.

14. Ferreiro DU, Komives EA, Wolynes PG (2014) Frustration in biomolecules. Q RevBiophys 47(4):285–363.

15. Gershenson A, Gierasch LM, Pastore A, Radford SE (2014) Energy landscapes offunctional proteins are inherently risky. Nat Chem Biol 10(11):884–891.

16. Fortenberry C, et al. (2011) Exploring symmetry as an avenue to the computationaldesign of large protein domains. J Am Chem Soc 133(45):18026–18029.

17. Voet ARD, et al. (2014) Computational design of a self-assembling symmetricalβ-propeller protein. Proc Natl Acad Sci USA 111(42):15102–15107.

18. Parmeggiani F, et al. (2015) A general computational approach for repeat proteindesign. J Mol Biol 427(2):563–575.

19. Höcker B (2014) Design of proteins from smaller fragments-learning from evolution.Curr Opin Struct Biol 27:56–62.

20. Javadi Y, Itzhaki LS (2013) Tandem-repeat proteins: Regularity plus modularity equalsdesign-ability. Curr Opin Struct Biol 23(4):622–631.

21. Orengo CA, Jones DT, Thornton JM (1994) Protein superfamilies and domain super-folds. Nature 372(6507):631–634.

22. Balaji S (2015) Internal symmetry in protein structures: Prevalence, functional rele-vance and evolution. Curr Opin Struct Biol 32:156–166.

23. Wetzel SK, Settanni G, Kenig M, Binz HK, Plückthun A (2008) Folding and unfoldingmechanism of highly stable full-consensus ankyrin repeat proteins. J Mol Biol 376(1):241–257.

24. Yadid I, Tawfik DS (2011) Functional β-propeller lectins by tandem duplications ofrepetitive units. Protein Eng Des Sel 24(1-2):185–195.

25. Carstensen L, et al. (2012) Conservation of the folding mechanism between designedprimordial (βα)8-barrel proteins and their modern descendant. J Am Chem Soc134(30):12786–12791.

26. Höcker B, Claren J, Sterner R (2004) Mimicking enzyme evolution by generating new(betaalpha)8-barrels from (betaalpha)4-half-barrels. Proc Natl Acad Sci USA 101(47):16448–16453.

27. Ponting CP, Russell RB (2000) Identification of distant homologues of fibroblastgrowth factors suggests a common ancestor for all beta-trefoil proteins. J Mol Biol302(5):1041–1047.

28. Leaver-Fay A, et al. (2011) ROSETTA3: An object-oriented software suite for thesimulation and design of macromolecules. Methods Enzymol 487:545–574.

29. Ivankov DN, et al. (2003) Contact order revisited: Influence of protein size on thefolding rate. Protein Sci 12:2057–2062.

30. Gromiha MM, Selvaraj S (2001) Comparison between long-range interactions andcontact order in determining the folding rate of two-state proteins: Application oflong-range order to folding rate prediction. J Mol Biol 310(1):27–32.

31. Nymeyer H, García AE, Onuchic JN (1998) Folding funnels and frustration in off-latticeminimalist protein landscapes. Proc Natl Acad Sci USA 95(11):5921–5928.

32. Chavez LL, Onuchic JN, Clementi C (2004) Quantifying the roughness on the freeenergy landscape: Entropic bottlenecks and protein folding rates. J Am Chem Soc126(27):8426–8432.

33. Hyeon C, Thirumalai D (2011) Capturing the essence of folding and functions ofbiomolecules using coarse-grained models. Nat Commun 2:487.

34. Borgia MB, et al. (2011) Single-molecule fluorescence reveals sequence-specific mis-folding in multidomain proteins. Nature 474(7353):662–665.

35. Jaswal SS, Sohl JL, Davis JH, Agard DA (2002) Energetic landscape of alpha-lyticprotease optimizes longevity through kinetic stability. Nature 415(6869):343–346.

36. Javadi Y, Main ERG (2009) Exploring the folding energy landscape of a series of de-signed consensus tetratricopeptide repeat proteins. Proc Natl Acad Sci USA 106(41):17383–17388.

37. Sancho J, Meiering EM, Fersht AR (1991) Mapping transition states of protein un-folding by protein engineering of ligand-binding sites. J Mol Biol 221(3):1007–1014.

38. Wittung-Stafshede P (2002) Role of cofactors in protein folding. Acc Chem Res 35(4):201–208.

39. Stigler J, Rief M (2012) Calcium-dependent folding of single calmodulin molecules.Proc Natl Acad Sci USA 109(44):17814–17819.

40. Liu P-F, Park C (2012) Selective stabilization of a partially unfolded protein by ametabolite. J Mol Biol 422(3):403–413.

41. Broom A, Gosavi S, Meiering EM (2015) Protein unfolding rates correlate as stronglyas folding rates with native structure. Protein Sci 24:580–587.

42. Thirumalai D (1995) From minimal models to real proteins - time scales for proteinfolding kinetics. J Phys I 5:1457–1467.

43. Gosavi S (2013) Understanding the folding-function tradeoff in proteins. PLoS One8(4):e61222.

44. Sanchez-Ruiz JM (2010) Protein kinetic stability. Biophys Chem 148(1-3):1–15.45. Park C, Zhou S, Gilmore J, Marqusee S (2007) Energetics-based protein profiling on a

proteomic scale: Identification of proteins resistant to proteolysis. J Mol Biol 368(5):1426–1437.

46. Manning M, Colón W (2004) Structural basis of protein kinetic stability: Resistance tosodium dodecyl sulfate suggests a central role for rigidity and a bias toward beta-sheet structure. Biochemistry 43(35):11248–11254.

47. Xia K, et al. (2007) Identifying the subproteome of kinetically stable proteins via di-agonal 2D SDS/PAGE. Proc Natl Acad Sci USA 104(44):17329–17334.

48. Xia K, et al. (2012) Quantifying the kinetic stability of hyperstable proteins via time-dependent SDS trapping. Biochemistry 51(1):100–107.

49. Watters AL, et al. (2007) The highly cooperative folding of small naturally occurringproteins is likely the result of natural selection. Cell 128(3):613–624.

50. Best RB, Hummer G, Eaton WA (2013) Native contacts determine protein foldingmechanisms in atomistic simulations. Proc Natl Acad Sci USA 110(44):17874–17879.

51. Boersma YL, Plückthun A (2011) DARPins and other repeat protein scaffolds: Ad-vances in engineering and applications. Curr Opin Biotechnol 22(6):849–857.

52. Gosavi S, Chavez LL, Jennings PA, Onuchic JN (2006) Topological frustration and thefolding of interleukin-1 beta. J Mol Biol 357(3):986–996.

14610 | www.pnas.org/cgi/doi/10.1073/pnas.1510748112 Broom et al.

Dow

nloa

ded

by g

uest

on

Mar

ch 2

4, 2

021