Classification of Protein Complexes based on Biophysics of Association
Sandor Vajda
Boston University
“Tell me with whom you go, and I'll tell you what you are.” Italian Proverb
“FYI” filtered yeast interactome (Vidal 2004):
• involves ~1500 proteins,
• making ~2500 physical interactions
H. Jeong et al., Nature 2001
Structure: Nature of Interactions
PDB: ~25,000 solved crystal structures; ~10% complexes
Computational prediction of structure and specificity of protein-protein complexes
“Tell me how you contact your partners, and I'll tell you who you are.”
List of Interactions
Protein-protein docking: How do proteins interact with each other?
Docking problem: predict the docking configuration from the structures of the component proteins
Bound vs. unbound docking: conformational change
Bound vs. unbound: at least the side chain conformations change
[Figure labels: Coarse details; Fine details; Trypsin/APPI; Receptor; Ligand]
Talk outline
1. What is the current state of docking?
2. What do docking calculations tell us about the nature of protein-protein complexes?
3. How to deal with side chain flexibility?
Proteins: Basics
Sequence: ADEFFGKLSTKK…….
Sequence to structure: structure prediction (CASP)
Building blocks: backbone & side chains
[Figure: chemical drawing of a polypeptide chain with N and O atom labels]
Monomers to complex: de novo docking (CAPRI)
Rigid body degrees of freedom: 3 translations, 3 rotations
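The six rigid-body degrees of freedom can be made concrete with a small sketch. It assumes the ligand is a list of (x, y, z) atom coordinates and describes the rotation by three Euler angles in Z-Y-X order. The function names, the angle convention, and the toy coordinates are illustrative choices, not taken from the slides.

```python
import math

def rotation_matrix(alpha, beta, gamma):
    """Rotation R = Rz(alpha) * Ry(beta) * Rx(gamma), angles in radians."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    return [
        [ca * cb, ca * sb * sg - sa * cg, ca * sb * cg + sa * sg],
        [sa * cb, sa * sb * sg + ca * cg, sa * sb * cg - ca * sg],
        [-sb, cb * sg, cb * cg],
    ]

def place_ligand(coords, rotation, translation):
    """Apply a rigid-body pose (3 rotations + 3 translations) to ligand atoms."""
    r = rotation_matrix(*rotation)
    placed = []
    for x, y, z in coords:
        placed.append(tuple(
            r[i][0] * x + r[i][1] * y + r[i][2] * z + translation[i]
            for i in range(3)
        ))
    return placed

def distance(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

# Toy three-atom ligand (made-up coordinates), rotated 90 degrees about z
# and shifted 10 units along x.
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.0)]
pose = place_ligand(ligand, rotation=(math.pi / 2, 0.0, 0.0),
                    translation=(10.0, 0.0, 0.0))
```

Because the transformation is rigid, all interatomic distances inside the ligand are unchanged; only its position and orientation relative to the receptor vary during the search.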
Benchmark set of protein complexes: Chen, R. et al. (2003) A protein-protein docking benchmark. Proteins, 52, 88-91.
• 22 enzyme-inhibitor
• 19 antigen-antibody
• 11 “other” types
• 7 “difficult” cases
Comeau, S. et al. (2003) ClusPro: An automated docking and discrimination method for the prediction of protein complexes. Bioinformatics, 20, 45-50.
Chen, R. et al. (2003) ZDOCK: An initial-stage protein-docking algorithm. Proteins, 52, 80-87.
Li, L. et al. (2003) RDOCK: Refinement of rigid-body protein docking predictions. Proteins, 53, 693-707.
Gray, J.J. et al. (2003) Protein-protein docking with simultaneous optimization of rigid-body displacement and side-chain conformations. J. Molec. Biol., 331, 281-299.
How do current protein docking programs work?
1. Rigid body search. Filter 1: 20,000
2. Select docked structures with low energy. Filter 2: 2,000
3. Cluster retained conformations. Filter 3: 30
4. Refine structures with flexible side chains. Filter 4: 1?
5. Submit 10 models to CAPRI
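The energy-based filtering stage of such a funnel can be sketched as follows. This is only an illustration under stated assumptions: poses are represented as hypothetical (energy, pose_id) tuples, the energies are random placeholders, and the function name `keep_lowest_energy` is invented for this example. Only the 20,000 and 2,000 counts come from the slide.

```python
import heapq
import random

def keep_lowest_energy(poses, n_keep):
    """Return the n_keep poses with the lowest energy; each pose is (energy, pose_id)."""
    return heapq.nsmallest(n_keep, poses, key=lambda pose: pose[0])

random.seed(0)
# Toy stand-in for rigid-body search output: 20,000 poses with made-up energies.
searched = [(random.uniform(-50.0, 50.0), pose_id) for pose_id in range(20000)]

# Keep the 2,000 lowest-energy poses for the clustering stage.
retained = keep_lowest_energy(searched, 2000)
```

The retained list comes back sorted from lowest to highest energy, so later stages can treat the first entries as the most favorable candidates.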
Algorithms of the 3 docking methods
Method (Investigator) | Step 1: Rigid body search | Step 2: Rescoring, ranking, filtering, and refinement
ClusPro (Camacho and Vajda) | Fast Fourier Transform (FFT) correlation approach using ZDOCK or DOT | Re-scoring with empirical potentials and clustering
Gray and Baker | Monte-Carlo search using simplified protein geometry and scoring function | Iterative repacking of side chains and rigid-body docking repeated until convergence. Final selection by clustering.
ZDOCK (Weng) | FFT correlation with shape complementarity, electrostatics, and desolvation | Clustering of conformations to avoid redundancies
RDOCK (Weng) | FFT correlation with shape complementarity | Re-scoring with empirical potentials
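FFT correlation methods score every translation of a ligand grid against a receptor grid. The sketch below computes that correlation by direct summation on a tiny 2D grid with periodic wrap-around; an FFT evaluates the same sums for all shifts far more efficiently. The grids, the 0/1 scoring values, and the function name are illustrative assumptions. Real programs work on 3D grids and use richer scoring terms.

```python
def correlation_scores(receptor, ligand):
    """Score every cyclic translation (dx, dy) of ligand over receptor by direct summation."""
    n = len(receptor)
    scores = {}
    for dx in range(n):
        for dy in range(n):
            total = 0
            for i in range(n):
                for j in range(n):
                    total += receptor[(i + dx) % n][(j + dy) % n] * ligand[i][j]
            scores[(dx, dy)] = total
    return scores

# Toy 4x4 grids: 1 marks a favorable receptor surface cell / an occupied ligand cell.
receptor = [
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
ligand = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
scores = correlation_scores(receptor, ligand)
best_shift = max(scores, key=scores.get)  # translation with the highest overlap score
```

In this toy case the L-shaped ligand fits the L-shaped receptor patch at exactly one translation, which is the one that `best_shift` picks out.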
Effect of the interface area
[Figure; labels: easy, uncertain, difficult, very difficult, GOOD]
Effect of hydrophobicity
[Figure; labels: easy, uncertain; axis tick at -4]
Size vs. Hydrophobicity
[Figure; regions: Type I (easy), Type II (easy), Type III (uncertain), Type IV (difficult), Type V (difficult)]
Benchmark by type
[Figure; region labels include Type II (easy) and Type V (difficult)]
[Figure: desolvation free energy (tick at -4) vs. interface area (ticks at 1400, 2000, 3400), with regions:]
• Type I, easy: enzymes
• Type II, easy: large multienzyme complexes
• Type III, uncertain: antibody/antigen
• Type IV, difficult: small signalling complexes
• Type V, hopeless: transitional complexes with substantial conformational change
• Type II or Type V?
Table I. Major differences between enzyme-inhibitor and antibody-antigen complexes
Property | Enzyme-inhibitor complexes | Antibody-antigen complexes
Interface area (ASA) | 1400 Å² < ASA < 2000 Å² | Possibly < 1400 Å²
Interface connectedness | Single patch | Frequently multiple patches
Interface shape | Convex-concave | Mostly planar
Binding free energy ΔG, kcal/mol | -17.5 < ΔG < -13.0 | -13.0 < ΔG < -6.5
% Nonpolar residues in interface | 61% nonpolar (can reach 71%) | 51% nonpolar (can be as low as 44%)
Desolvation free energy | Negative (favorable) | Positive (unfavorable)
Conformational change | Generally moderate | Can be substantial; loop and/or hinge motion
Crystallographic water positions | Around perimeter of interface | Within the interface
Classification of complexes

Type I
Conformational change: Small (rigid interface)
Interface ASA (a): Standard (b)
Hydrophobicity: Strong; the convex-concave interface provides good shape complementarity
Docking outcome: Successful, unless key side chains are in wrong conformations
Example: Chymotrypsinogen and trypsin inhibitor (1cgi): KD = 0.2 pM, ASA = 1950 Å², and ΔGdes = -18.3 kcal/mol. Most complexes of enzymes with their protein inhibitors are in this category.

Type II
Conformational change: Small
Interface ASA: ASA > 2000 Å²
Hydrophobicity: Unimportant
Docking outcome: Successful
Example: Ribonuclease A and ribonuclease inhibitor (1dfj): ASA = 2580 Å², ΔGdes = 18.6 kcal/mol, Eelec = -63.9 kcal/mol, KD = 0.15 nM.

Type III
Conformational change: Moderate, but larger than for Type I
Interface ASA: Standard
Hydrophobicity: Variable, but generally weak. Charge-charge interactions can be strong
Docking outcome: Unpredictable; can be very difficult, even with known hypervariable regions of the antibody
Example: Hyhel-5 Fab with lysozyme (1mlc): KD = 126 µM, ASA = 1390 Å², and ΔGdes = -3.84 kcal/mol. Most antibody-antigen complexes are in this category.

Type IV
Conformational change: Restricted to side chains
Interface ASA: ASA < 1400 Å²
Hydrophobicity: Weak; mostly polar and charge-charge interactions
Docking outcome: Hits are found, but are generally lost in scoring and ranking
Example: Ras and Ras-interacting domain (1lfd): KD = 2 µM, ASA = 1130 Å², and ΔGdes = 3.6 kcal/mol. A number of weak complexes are in this category.

Type V
Conformational change: Substantial backbone change, Cα RMSD (c) > 2 Å
Interface ASA: ASA > 2000 Å²
Hydrophobicity: Generally moderate
Docking outcome: Rigid body methods seem to always fail for these complexes
Example: Cyclin A and cyclin-dependent kinase 2 (1fin): KD = 47.6 nM, ASA = 3390 Å², and ΔGdes = 4.7 kcal/mol.

(a) ASA: Accessible Surface Area. (b) Standard interface: 1400 Å² < ASA < 2000 Å². (c) Cα RMSD: α-carbon Root Mean Square Deviation.
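The dissociation constants quoted for the example complexes can be related to the binding free energy ranges of Table I through the standard relation ΔG = RT ln(KD), with KD expressed in mol/L. The sketch below is a worked calculation under an assumed temperature of 298.15 K, which the slides do not state. The function name is illustrative.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kcal/mol) from a dissociation constant in mol/L."""
    return R_KCAL * temperature_k * math.log(kd_molar)

dg_1cgi = binding_free_energy(0.2e-12)   # 1cgi, KD = 0.2 pM
dg_1dfj = binding_free_energy(0.15e-9)   # 1dfj, KD = 0.15 nM
```

Under this assumption the two values come out near -17.3 and -13.4 kcal/mol, both inside the -17.5 to -13.0 kcal/mol range that Table I lists for enzyme-inhibitor complexes.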
Type I: Enzyme-Inhibitor Complexes
Interface in the complex of alpha-chymotrypsinogen with trypsin inhibitor
[Figure labels: alpha-chymotrypsinogen; trypsin inhibitor variant 3]
Classification of complexes, Type III example: Hyhel-5 Fab with lysozyme (1mlc): KD = 126 µM, ASA = 1390 Å², ΔGdes = -3.84 kcal/mol, Eelec = -21.4 kcal/mol. Most antibody-antigen complexes are in this category.
Type III: Antigen-Antibody Complexes
Interface in the complex of chicken lysozyme with antibody Fab D44.1
[Figure labels: chicken lysozyme; monoclonal antibody Fab D44.1]
Interface in the complex of ribonuclease A with ribonuclease inhibitor
[Figure labels: ribonuclease A; ribonuclease inhibitor]
Classification of complexes, Type IV example: Ras and Ras-interacting domain (1lfd): KD = 2 µM, ASA = 1250 Å², ΔGdes = 3.6 kcal/mol, and Eelec = -39.5 kcal/mol. A number of weak complexes are in this category.
Interface in the complex of ras-interacting domain with ras
[Figure labels: ras protein; ras-interacting domain of RalGDS; GNP (5'-guanosyl-imido-triphosphate)]
Classification of complexes, Type V example: Cyclin A and cyclin-dependent kinase 2 (1fin): KD = 47.6 nM, ASA = 3550 Å², ΔGdes = 3.9 kcal/mol, Eelec = -66.5 kcal/mol.
Type V: Large interface and large conformational change
[Figure labels: Cyclin A; Cyclin-dependent kinase]
Li, L. et al. (2003) RDOCK
Gray, J.J. et al. (2003)
2. How is the community doing?
Overall success rates of participants in CAPRI 1-5
Classification of CAPRI 1-2 Targets
[Figure: desolvation free energy (kcal/mol, axis from -10 to 20) vs. interface area ASA (axis from 1000 to 3500); targets T1 to T7 plotted over the regions Type I (easy), Type II (easy), Type III (uncertain), Type IV (difficult), Type V (very difficult)]
Classification of CAPRI 3-5 Targets
[Figure: desolvation free energy (kcal/mol, axis from -15 to 25) vs. interface area ASA (axis from 1000 to 6500); targets T8, T9, T10, T12, T13, T14, T18, T19 plotted over the regions Type I (easy), Type II (easy), Type III (uncertain), Type V (very difficult)]
Expected Improvements
Much improved
3. How to deal with side chain flexibility?
[Figure labels: Coarse details; Fine details; Trypsin/APPI; Receptor; Ligand]
Recognition mechanisms: lock-and-key vs. induced fit
Key-and-latch mechanism
Rajamani, D., Thiel, S., Vajda, S. and Camacho, C.J. Anchor residues in protein-protein interactions. Proc. Natl. Acad. Sci. USA, 101: 11287-11292, 2004.
Key-Latch model
• KEYS stay close to the bound conformation in solution.
• LATCHES show no preference for staying near the bound conformation.
[Figure labels: key; latch; unbound; bound; simulated; solvated protein; individually crystallized protein; predisposition]
[Figure: RMSD (axis from 0 to 7) vs. time (ns, axis from 1 to 4); curves: bound, unbound]
RMSD of Arg39 of ribonuclease A with respect to the structure found in the complex (bound; PDB code 1DFJ) and in the individually crystallized ribonuclease A (unbound; PDB code 7RSA). The RMSD was computed for 2000 snapshots of a 4 ns MD simulation of 7RSA.
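The RMSD trace described in this caption, and the share of snapshots that stay near a reference conformation, can be computed along the following lines. This is a minimal sketch that assumes the snapshots are already superimposed on the reference frame and that atoms are listed in the same order. The coordinates are made up, and the function names are illustrative.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally ordered lists of (x, y, z) atoms."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for a, b in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))

def fraction_within(snapshots, reference, cutoff):
    """Fraction of snapshots whose RMSD from the reference is below the cutoff."""
    hits = sum(1 for snap in snapshots if rmsd(snap, reference) < cutoff)
    return hits / len(snapshots)

# Toy data: a two-atom "side chain" and three snapshots (made-up coordinates, in Å).
bound = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
snapshots = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],  # identical to the bound conformation
    [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0)],  # every atom displaced by 1 Å
    [(0.0, 3.0, 0.0), (1.0, 3.0, 0.0)],  # every atom displaced by 3 Å
]
native_like = fraction_within(snapshots, bound, cutoff=2.0)
```

Here two of the three toy snapshots lie within 2 Å of the bound reference, so `native_like` is two thirds.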
[Figure: bar chart of cluster size (axis from 0 to 300) by cluster; each bar is labeled with an RMSD value]
Clustering of the conformations of Arg39 in ribonuclease A. The 16 largest clusters were derived from a pairwise RMSD analysis of the MD snapshots, and clustering using a radius of 2 Å. The RMSD of the cluster center from the bound conformation is shown on the top/bottom of each bar. The bound conformation is shown in blue, unbound in red, and the dominant conformation from the MD simulations is shown in green.
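Clustering from pairwise distances with a fixed radius can be sketched as below. The slides do not say which clustering algorithm was used, so the greedy scheme here (repeatedly take the item with the most neighbors within the radius as a cluster center) is an assumption chosen for illustration. The 1D values stand in for conformations and the absolute difference stands in for pairwise RMSD.

```python
def greedy_cluster(items, dist, radius):
    """Greedy clustering: repeatedly take the item with the most neighbors within radius."""
    remaining = set(range(len(items)))
    clusters = []
    while remaining:
        neighbors = {
            i: [j for j in remaining if dist(items[i], items[j]) <= radius]
            for i in remaining
        }
        # Ties are broken by lowest index so the result is deterministic.
        center = max(sorted(remaining), key=lambda i: len(neighbors[i]))
        members = neighbors[center]
        clusters.append((center, sorted(members)))
        remaining -= set(members)
    return clusters

# Toy 1D "conformations"; the distance function stands in for pairwise RMSD.
values = [0.0, 0.5, 1.0, 6.0, 6.4, 12.0]
clusters = greedy_cluster(values, dist=lambda a, b: abs(a - b), radius=2.0)
sizes = [len(members) for _, members in clusters]
```

Clusters come out in order of decreasing size, which matches the way a bar chart of the largest clusters would be laid out.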
Complex of trypsin with amyloid β-protein precursor inhibitor (APPI). Key residue Arg-15 is a major contributor to the total binding free energy.
HIV-1 NEF/FYN tyrosine kinase SH3 domain complex. Trp-119 stays within 1 Å and 2 Å of the bound conformation for 36% and 96% of the MD simulation, respectively. In the free state it is stabilized in this native-like conformation by Tyr-93 (which is therefore also native-like). Thr-97 buries the second largest SASA (70 Å²).
[Figure labels: Tyr93; Thr97; Asp100; Trp119]
Hyhel-5 Fab/lysozyme complex. The main key residue, Arg-45, has a SASA value of 147 Å²; a second key residue, Lys-68, is found buried with a SASA of 93 Å². Both side chains show native-like properties: during 50% and 97% of the simulation time, respectively, they sample conformations less than 2 Å RMSD from the corresponding bound rotamer.
[Figure labels: Lys68; Arg45]
The complex of acetylcholinesterase with fasciculin. The main key, Met-33, is in a native-like conformation during most of the simulation. The SASA buried by Met-33 is comparable with the next largest SASA of 78 Å², which results from the burial of Arg-27; this anchor is in a native-like conformer during 95% of the MD simulation.
[Figure labels: Met33; Thr8; Arg27]
PDB ID | Complex (Receptor/Ligand) | Anchor (a) ResID | ΔSASA (b), Å² | ΔGbind, kcal/mol | Rank | Residence time (c), %: MD | Residence time (c), %: Rotamer library
Enzyme/Inhibitor
1BRC | Trypsin/APPI (1AAP) | Arg 15 | 251.24 | -11.9 | 1 | 32† | 7.4†
2SIC | Subtilisin BPN/Inhibitor | Met 70 | 196.33 | -7.1 | 1 | 51† |
2SNI | Subtilisin novo/CI2 (2CI2) | Ile 56 | 189.79 | -7.6 | 1 | 37‡ | 96.6‡
1CHO | α-Chymotrypsin/OMTKY3 | Leu 18 | 180.33 | -7.9 | 1 | 73‡* |
1CSE | Subtilisin C/eglin C (1ACB) | Leu 45 | 165.07 | -5.1 | 1 | 50‡ | 97.4‡
1BRS | Barnase/barstar (1A19) | Asp 35 | 125.06 | -2.5 | 3 | 97‡ |
1UGH** | UDG/UGI | Leu 272 | 180.38 | -5.2 | 1 | 66‡ |
1DFJ | Ribonuclease inhibitor/ribonuclease A (7RSA) | Asn 67 | 101.18 | -1.2 | 8 | 41‡ | 28.5‡
1FSS | AchE/FasII (1FSC) | Thr 8 | 96.29 | -3.4 | 4 | 99‡ |
Antigen/Antibody
1BQL | Hyhel5 Fab/QBL (1DKJ) | Arg 45 | 165.3 | -10.1 | 1 | 49† | 38.3†
1FBI | IgG1 Fab/lysozyme | Arg 73 | 132.72 | -1.9 | 4 | 46† |
1DQJ | Hyhel63 Fab/HEL (3LZT) | Arg 21 | 131.4 | 5.4 | | 92† | 29.1†
PDB ID (a) | Receptor/Ligand (b) | Native-like predictions: Bound (c) | UB (d) | Resurf (e) | Anchor replacement: Residue (f) | RMSD from bound: UB (g) | MD (h) | Predictions: B (i) | MD (j)
Enzyme/Inhibitor complexes
2SNI | Subtilisin Novo/Chymotrypsin inhibitor 2 (2CI2) | 151 | 62 | 124 | Met59 | 2.9 | 1.5 | 99 | 82
2SNI | | 151 | 62 | | Ile56 | 0.6 | 0.9 | 73 | 74
1DFJ | Ribonuclease inhibitor/Ribonuclease A (7RSA) | 63 | 22 | 43 | Arg39 | 3.6 | 2.7 | 22 | 43
1BRC | Trypsin/APPI (1AAP) | 98 | 24 | 45 | Arg15 | 3.8 | 2.4 | 70 | 57
1CHO | α-Chymotrypsin/Ovomucoid 3rd domain | 182 | 61 | 91 | Leu/Met18 | | | 32 | 68
1BRS | Barnase/Barstar (1A19) | 54 | 23 | 27 | Asp35 | 0.8 | 0.6 | 16 | 17
1CSE | Subtilisin Carlsberg/Eglin C (1ACB) | 176 | 105 | 116 | Leu45 | 0.6 | 1 | 133 | 119
1FSS | Snake venom acetylcholinesterase/Fasciculin II (1FSC) | 21 | 3 | 4 | Thr9 | 0.5 | 0.6 | 4 | 5
2BTF | β-actin/Profilin | 43 | 18 | 15 | Arg74* | 2.4 | 1.9 | 25 | 28
1WQ1 | RAS activating domain/RAS | 29 | 26 | 9 | Tyr32 | 0.2 | 1.5 | 26 | 21
1WQ1 | | | | | Gln61* | 2.4 | 1.5 | 28 | 40
Credits
Crystallographers: Please submit to CAPRI
Dr. Carlos Camacho (University of Pittsburgh)
Graduate students at Boston University: Stephen Comeau, Deepa Rajamani, Dima Kozakov, Yang Shen, Ryan Brenke
National Institutes of Health