Historical Perspective
Protein Folding: From the Levinthal Paradox to Structure Prediction, Barry Honig, 1999
A personal perspective on advances and developments in protein folding over the last 40 years
Levinthal Paradox
Cyrus Levinthal, Columbia University, 1968
Observed that there is insufficient time to randomly search the entire conformational space of a protein
Resolution: Proteins have to fold through some directed process
Goal is to understand the dynamics of this process
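The size of the search problem can be illustrated with a back-of-the-envelope calculation. The inputs below (100 residues, 3 conformations per residue, 10^13 conformations sampled per second) are illustrative assumptions often used for this argument; they are not taken from the slides.

```python
# Back-of-the-envelope Levinthal estimate.
# All default inputs are illustrative assumptions, not measured values.
SECONDS_PER_YEAR = 3600 * 24 * 365


def levinthal_search_time_years(n_residues=100,
                                states_per_residue=3,
                                samples_per_second=1e13):
    """Years needed to visit every conformation once in an unguided search."""
    n_conformations = states_per_residue ** n_residues
    return n_conformations / samples_per_second / SECONDS_PER_YEAR


years = levinthal_search_time_years()
print(f"{years:.1e} years")
```

Under these assumptions the search would take on the order of 10^27 years, which is why folding must follow some directed process.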
Old vs. New Views
Old: Hierarchical view of protein folding
Secondary structures form, then interact to form tertiary structures
General order of events
New: Statistical ensembles of states
Potential energy landscape
Folding “funnel”
The two views are not all that different; the most important ideas were theorized many years ago
Secondary Structures
Consensus view is that secondary structure formation is the earliest part of the folding process
Numerous studies indicate that local sequence codes for local structures
Helical sequences in a folded protein tend to be helical in isolation
Current prediction algorithms for secondary structure elements (SSEs) are about 70% correct (1993)
The failures indicate that tertiary interactions play some role in stabilizing SSEs
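A minimal sketch of how such an accuracy figure can be computed, assuming it refers to per-residue three-state accuracy (often called Q3; H = helix, E = strand, C = coil). The two strings are invented toy data, not output from any real predictor.

```python
def q3_accuracy(predicted, observed):
    """Fraction of residues whose predicted state matches the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("input must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Toy example (hypothetical data): 12 of 16 residues agree.
observed = "CCHHHHHHCCEEEECC"
predicted = "CCHHHHCCCCEEEHHC"
print(q3_accuracy(predicted, observed))  # 0.75
```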
However…
Not clear what sequence elements code for overall topology
One factor is the existence of hydrophobic faces on the surface of SSEs
Still challenges in predicting topology of SSEs, even when protein class is known
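One common way to quantify a hydrophobic face on a helix is a hydrophobic moment: each residue's hydrophobicity is treated as a vector pointing outward from the helix axis, and the vectors are summed. The sketch below is an illustration only. It assumes an ideal alpha-helix with 100 degrees of rotation per residue and uses the Kyte-Doolittle hydropathy scale as a stand-in for whichever scale a real analysis would choose; the test sequence is generated artificially.

```python
import math

# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(sequence, degrees_per_residue=100.0):
    """Mean hydrophobic moment per residue for an idealized helix."""
    delta = math.radians(degrees_per_residue)
    sum_cos = sum(KYTE_DOOLITTLE[aa] * math.cos(i * delta)
                  for i, aa in enumerate(sequence))
    sum_sin = sum(KYTE_DOOLITTLE[aa] * math.sin(i * delta)
                  for i, aa in enumerate(sequence))
    return math.hypot(sum_cos, sum_sin) / len(sequence)


# Artificial amphipathic helix: Leu on one side of the helical wheel, Lys on the other.
amphipathic = "".join(
    "L" if math.cos(math.radians(100 * i)) > 0 else "K" for i in range(18)
)
uniform = "L" * 18  # hydrophobic all the way around, so no distinct face

print(amphipathic, round(hydrophobic_moment(amphipathic), 2))
print(uniform, round(hydrophobic_moment(uniform), 2))
```

A large moment indicates that hydrophobic residues cluster on one side of the helix; the uniform sequence scores near zero because its contributions cancel around the wheel.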
Atomic level calculations
Molecular calculations have had a great impact on our understanding of protein folding
Harold Scheraga, 1968
Shneior Lifson, 1969
Martin Karplus’s laboratory, ~1979
Early calculations had trouble dealing with solvent effects
Secondary Structure
Many of the essential elements of protein energetics can be derived from looking at SSE formation
Early experimental work:
Ingwall et al., 1968
Baldwin et al., 1989: worked on stabilizing shorter helices
Dyson and Wright, 1991: demonstrated that even short peptides in solution can be partially structured
Results
Yang and Honig, 1995
Alpha-helices are stabilized by hydrophobic interactions and close packing; hydrogen bonding has little effect
Beta-sheets are stabilized by non-polar interactions between residues on adjacent strands
This work supports the idea that SSEs are coded for locally in the sequence
Folding Pathways
SSEs can change conformation in the presence of a relatively small number of tertiary interactions
Free-energy difference between alpha-helix, beta-sheet, and coil is not great
Individual helices can be changed into beta-sheets by changing just a few amino acids
This suggests that proteins have a “structural plasticity” which allows for changes in conformation
Folding Pathways
Early in folding processes, many different combinations of SSEs have very similar stabilities
In the end, it is the tertiary interactions which drive towards the native topology
Early in folding, SSEs “flicker”; they are eventually stabilized by tertiary interactions and converge on the native state
Suggests that multiple folding pathways exist, which can all lead to the same end result once stabilized
Structure Prediction
Recently, the field has split into two problems
Protein prediction problem
Trying to predict the end result of folding, using extensive comparison between known and unknown structures
Protein folding problem
Trying to understand the folding pathway that leads to the end result of folding, typically by MD simulations or energy calculations
Author's contention is that both areas will need to be used together to fully understand protein folding
PrISM
Yang and Honig, 1999
Software suite which integrates prediction based on simulations and known information about structures
Sequence analysis
Structure-based sequence alignment
Fast structure-structure superposition using a structural domain database
Multiple structure alignment
Fold recognition and homology model building
Used to make predictions for all 43 targets of CASP3 conference (more on CASP later)
Conclusions
Much of the current understanding of protein folding was theorized long ago
Vague and speculative ideas have been replaced by carefully defined theoretical concepts and rigorous experimental observations
Conclusions
Polypeptide backbone is the most important determinant of structure
SSEs are “meta-stable”; the statement that sequence determines structure is not wholly accurate
A more accurate statement is that sequence chooses from a limited set of available SSEs and determines how they are ordered in space
Conclusions
Free-energy differences between alternate conformations are not large; this may provide a basis for rapid evolutionary change
CASP
A decade of CASP: progress, bottlenecks and prognosis in protein structure prediction, John Moult
CASP = Critical Assessment of Structure Prediction
First held in 1994, every 2 years afterwards
Teams make structure predictions from sequences alone
CASP
Two categories of predictors
Automated
Automatic servers, must complete analysis within 48 hours
Shows what is possible through computer analysis alone
Non-automated
Groups spend considerable time and effort on each target
Utilize computer techniques and human analysis techniques
CASP
CASP6, 2004
200 prediction teams from 24 countries
Over 30,000 predictions for 64 protein targets collected and evaluated
Conference held afterwards to discuss results, with many teams presenting individual results and methodologies
Helps to steer future work
Modeling classes
Comparative modeling based on a clear sequence relationship
Modeling based on more distant evolutionary relationships
Modeling based on non-homologous fold relationships
Template-free modeling
Comparative modeling based on a clear sequence relationship
Easily detectable sequence relationship between the target protein and one or more known protein structures, typically through BLAST
Copy from template, however:
Must align target and template sequences
In general, reliably building regions not present in the template is still a challenge
Sidechain accuracy is poor
Refinement remains a challenge
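The target-template alignment step can be sketched with a classic global alignment (Needleman-Wunsch). This is a minimal illustration, not the procedure used by any particular CASP group: the match, mismatch and gap scores are arbitrary assumptions, and real pipelines typically use a substitution matrix and affine gap penalties. The sequences are invented.

```python
def global_align(target, template, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty.

    Returns (aligned_target, aligned_template, score).
    """
    n, m = len(target), len(template)
    # score[i][j] = best score for aligning target[:i] with template[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if target[i - 1] == template[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,  # align two residues
                              score[i - 1][j] + gap,      # gap in template
                              score[i][j - 1] + gap)      # gap in target

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_target, out_template = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if target[i - 1] == template[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_target.append(target[i - 1])
                out_template.append(template[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_target.append(target[i - 1])
            out_template.append("-")
            i -= 1
        else:
            out_target.append("-")
            out_template.append(template[j - 1])
            j -= 1
    return "".join(reversed(out_target)), "".join(reversed(out_template)), score[n][m]


# Hypothetical sequences: the template lacks one residue present in the target.
aligned_target, aligned_template, best_score = global_align("ACDEFG", "ACEFG")
print(aligned_target)
print(aligned_template)
print(best_score)
```

Target positions that align to a gap in the template are exactly the regions that have no coordinates to copy and must be built some other way.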
Comparative modeling based on a clear sequence relationship
Progress in MD needed for refinement
Models useful for identifying which members of a protein family have similar functionalities, and which are different
Modeling based on more distant evolutionary relationships
Makes use of PSI-BLAST and hidden Markov models
Compile a profile for the sequence, compare this profile to other known profiles
Allows for prediction of structures, even when sequence similarity is low
Between CASP4 and CASP5, use of metaservers to find consensus structures led to improved accuracy
Modeling based on more distant evolutionary relationships
Limitations:
Correct template may not be identified
Alignment of target sequence to template is not trivial
Significant fraction of residues will have no structural equivalent in the template; modeling of these regions is hit or miss
Although regions are similar, they are not identical, and the greater the difference, the higher the error
Details are thus not accurate, but overall structure can be useful
For improvements, must work together with template-free methodologies
Modeling based on non-homologous fold relationships
Protein “threading”
In recent CASP experiments, these methods have not been competitive with template-free models
Template-free Modeling
For sequences where no template is available
Historically, physics-based approaches were used
Newer methods focus on substructures
While we have not seen all folds, we have probably seen nearly all substructures
Make use of substructure relationships
From a few residues through SSEs to super-secondary structures
Template-free Modeling
A range of possible conformations is considered
Most successful package has been ROSETTA
For proteins of fewer than ~100 residues, produces one or several approximately correct structures (4-6 Å rmsd for C-alpha atoms)
Selecting the most accurate structures from all candidates is still an unsolved problem; clustering is typically used at present
Development of atomic models is crucial to further progress
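The rmsd measure and the clustering idea mentioned above can be sketched as follows. This is a generic neighbour-counting heuristic and not ROSETTA's actual clustering procedure. It assumes the candidate models are already superposed on one another (no fitting is performed), and the coordinates and cutoff are invented toy values.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) C-alpha coordinates.

    Assumes the structures are already superposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


def pick_cluster_center(models, cutoff):
    """Index of the model with the most other models within `cutoff` angstroms."""
    counts = []
    for i, model in enumerate(models):
        neighbours = sum(1 for j, other in enumerate(models)
                         if j != i and ca_rmsd(model, other) <= cutoff)
        counts.append(neighbours)
    # On ties, the first model with the highest count is returned.
    return max(range(len(models)), key=counts.__getitem__)


# Toy three-residue "models" (hypothetical coordinates, in angstroms).
models = [
    [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)],
    [(0.0, 0.5, 0.0), (3.8, 0.5, 0.0), (7.6, 0.5, 0.0)],
    [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)],
    [(0.0, 9.0, 0.0), (3.8, 9.0, 0.0), (7.6, 9.0, 0.0)],  # outlier
]
best = pick_cluster_center(models, cutoff=0.6)
print(best, ca_rmsd(models[0], models[1]))
```

The underlying assumption is that conformations sampled many times are more likely to be near the native structure than isolated ones, so the most densely surrounded model is chosen.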