Upload
denver
View
41
Download
0
Tags:
Embed Size (px)
DESCRIPTION
Modeling the Structures of Proteins and Macromolecular Assemblies. Depts. Of Biopharmaceutical Sciences and Pharmaceutical Chemistry California Institute for Quantitative Biomedical Research University of California at San Francisco. Andrej Š ali http://salilab.org/. 3/25/03. - PowerPoint PPT Presentation
Citation preview
Modeling the Structures of Proteins andMacromolecular Assemblies
Depts. Of Biopharmaceutical Sciences and Pharmaceutical ChemistryCalifornia Institute for Quantitative Biomedical Research
University of California at San Francisco
Andrej Šali
http://salilab.org/
3/25/03
1. Yeast and E. coli ribosomes: Electron microscopy, comparative modeling, and structural genomics.
2. Yeast Nuclear Pore Complex: Low-resolution modeling of large assemblies; bridging the gaps between structural biology, proteomics, and system biology.
3. Comments.
4/6/03
S. cerevisiae ribosome
C. Spahn, R. Beckmann, N. Eswar, P. Penczek, A. Sali, G. Blobel, J. Frank. Cell 107, 361-372, 2001.
Fitting of comparative models into 15Å cryo- electron density map.
43 proteins could be modeled on 20-56% seq.id. to a known structure.
The modeled fraction of the proteins ranges from 34-99%.
3/25/03
E. coli ribosomeH.Gao, J.Sengupta, M.Valle, A. Korostelev, N.Eswar, S.Stagg, P.Van Roey, R.Agrawal, S.Harvey, A.Sali, M.Chapman, and J.Frank. Cell, in press.
Upon EF-G binding, the
ribosome becomes less
compact. In contrast to
mRNA, many protein
contacts undergo large
conformational changes,
suggesting ribosomal
proteins facilitate the
dynamics of translation.
4/6/03
STARTSTART
Get profile for sequence (NR)Get profile for sequence (NR)
Scan sequence profile against representative PDB chains
Scan sequence profile against representative PDB chains
Scan PDB chain profiles against sequence
Scan PDB chain profiles against sequence
PS
I-B
LA
ST
MODPIPE: Large-Scale Comparative Protein Structure Modeling
Select templates using permissive E-value
cutoff
Select templates using permissive E-value
cutoff
1
Expand match to cover complete
domains
Expand match to cover complete
domains
1
Build model for target segment by satisfaction of spatial
restraints
Build model for target segment by satisfaction of spatial
restraints
Evaluate modelEvaluate model
Align matched parts of sequence and structure
Align matched parts of sequence and structure
MO
DE
LL
ER
R. Sánchez & A. Šali, Proc. Natl. Acad. Sci. USA 95, 13597, 1998.N. Eswar, M. Marti-Renom, M.S. Madhusudhan, B. John, A. Fiser, R. Sánchez, F. Melo, N. Mirkovic, A. Šali.
Fo
r ea
ch t
arg
et s
equ
ence
Fo
r ea
ch t
emp
late
str
uct
ure
3/25/03
ENDEND
http://salilab.org/modbasePieper et al., Nucl. Acids Res. 2002.
3/25/03
Comparative modeling of the TrEMBL database
Unique sequences processed: 733,239
Sequences with fold assignments or models: 415,937 (57%)
4/03/02 ~4 weeks on 500 Pentium III CPUs
70% of models based on <30% sequence identity to template.
On average, only a domain per protein is modeled(an “average” protein has 2.5 domains of 175 aa).
3/25/03
Model AccuracyMarti-Renom et al. Annu.Rev.Biophys.Biomol.Struct. 29, 291-325, 2000.
MEDIUM ACCURACY LOW ACCURACYHIGH ACCURACY
% Sequence Identity
0 10 20 30 40 50 60 70 80 90 100
% S
tru
ctu
ral O
verl
ap
10
20
30
40
50
60
70
80
90
100
Target-ModelTarget-Template
NM23Seq id 77%
CRABPSeq id 41%
EDNSeq id 33%
X-RAY / MODEL
SidechainsCore backboneLoops
C equiv 147/148RMSD 0.41Å
SidechainsCore backboneLoopsAlignment
C equiv 122/137RMSD 1.34Å
SidechainsCore backboneLoopsAlignmentFold assignment
C equiv 90/134RMSD 1.17Å
4/6/03
Future directions
Make sure we have building blocks (structural genomics).
Develop methods for simultaneous fitting of proteins into
EM density and conformational modeling (induced fit,
comparative modeling, ab initio).
Need large computing (eg, cluster of hundreds of nodes
with >1GB memories).
4/6/03
Rockefeller University, New York
*UCSF
F. Alber*, T. Suprapto, J. Kipper, W. Zhang, L. Veenhoff, S. Dokudovskaya, M. Rout, B. Chait, A. Šali*
Modeling of the yeast nuclear pore complex by satisfaction of spatial restraints (MODELLER)
4/6/03
Modeling macromolecular assemblies by satisfaction of spatial restraints
1) Representation of a system.
2) Scoring function (spatial restraints).3) Optimization.
Sali, Ernst, Glaeser, Baumeister. From words to literature in structural proteomics. Nature 422, 216-225, 2003.3/25/03
There is nothing but points and restraints on them.
Modeling of NPC
Stochiometry.
Parse proteins into domains.
Protein and “sub-complex’ shapes from Stokes radii.
Excluded volume of proteins.
Symmetry of NPC (EM).
Radial and axial localization of proteins (IEM).
Protein-protein proximity (immuno-purification).
Binary protein-protein contacts from “overlay” experiments.
Modeling in the context of the nuclear envelope.
Comparative models for some domains.
Structural genomics of nucleoporins.4/6/03
spokehalf-spoke
cytosolic side
nuclear side
Schematic structure of yeast NPC
nuclear membrane
TOP VIEW SIDE VIEW
half-spoke contains~30 nucleoporin proteins (NUPs).
~480 NUPs in NPC.
C.W. Akey, M. Rout
3/25/03
[sum of residues]
com
ple
x d
imen
sio
n f
(res
) [A
]
pulloutN
i
resii
resP
resP
ABP nSNwithNftoleranceD
1
;
Protein-protein “contacts”
For each protein pair within a half-spoke, the upper bound on the center-center distance is the estimated maximal complex diameter:
Complications:- multiple protein copies in half-spoke - inter-spoke interactions.
3/25/03
Successful optimization11/18/023/25/03
Scan of figureNUP120NUP84
NUP85
NUP145C
SEH1
Cdc31
Experiment
Rappsilber J, Siniossoglou S, Hurt EC, Mann M. Anal Chem 2000 Jan 15;72(2):267-75.
Model
NUP84-complex
4/6/03
Structural proteomics aims to characterize structures of most macromolecular complexes, in space and time.
On average, a domain may interact with a few other domains. The function of a complex is determined by its structure and dynamics.
There are too many complexes to be determined directly by high-resolution experimental structure determination.
Thus, just like in structural genomics, an efficient combination of experiment and computation is required.
4/6/03
Structural Proteomics versus Structural Genomics
Potential targets not clear clear
Target selection not clear clear
Scope not clear clear
Structure determination hybrid X-ray or NMR
Functional Annotation essential not a major
focus
There are additional sciences and technologies in structural proteomics, relative to structural genomics.
4/6/03
Structural Genomics
Characterize most protein sequences based on related known structures.
There are ~16,000 30% seq id families (90%)(Vitkup et al. Nat. Struct. Biol. 8, 559, 2001)
Sali. Nat. Struct. Biol. 5, 1029, 1998.Sali et al. Nat. Struct. Biol., 7, 986, 2000.Sali. Nat. Struct. Biol. 7, 484, 2001.Baker & Sali. Science 294, 93, 2001.
The number of “families” is much smaller than the number of proteins.
Any one of the members of a family is fine.
Characterize most protein sequences based on related known structures:
3/25/03
Target selection for structural proteomics(scope)
Comprehensive coverage.
What targets:
Stable assemblies? Transient complexes? Pairs of domains?
Can they be organized into groups?
How many such groups are there?
4/6/03
Target selection for structural proteomics?
• There are ~3,000 folds containing ~90% of all sequences.
• A target: Binary domain-domain interface.
• There may be a finite number of domain-domain interfaces, largely defined by the domain fold types.
• Given pairwise interactions, a large assembly may be reconstructed.
4/6/03
Protein and assembly structure by experiment and computation
3/25/3Sali, Ernst, Glaeser, Baumeister. From words to literature in structural proteomics. Nature 422, 216-225, 2003.
3/25/03
Modeling macromolecular assemblies by satisfaction of spatial restraints
1) Representation of a system.
2) Scoring function (spatial restraints).3) Optimization.
Sali, Ernst, Glaeser, Baumeister. From words to literature in structural proteomics. Nature 422, 216-225, 2003.3/25/03
There is nothing but points and restraints on them.
Future directions
Development of general/flexible/hierarchical representation of assemblies.
Quantifying spatial information from experiments.
Optimization of structure.
Toy models.
Space, time.
4/6/03
Role of NIH
Structural proteomics is timely and feasible.
Humongous integration of concepts, sciences, methods, tools, people.
Thus, need research centers, but also “R01” research.
Significant computing.
Bridging the gaps between structural biology, proteomics, and system biology.
4/6/03