Hyung-Rae Kim, Amit Roy, Daisuke Kihara
http://kiharalab.org
KIHARA LAB
333
Overall Prediction Procedure
Server Models
PRESCO Residue Environment Score
Ranking by AA matrices: BLOSUM30, CC80, CCPC, QUIB, QU_C2
Structure Refinement by CHARMM runs
Side-Chain Modeling
5 models
Final 5 models
20-30 models
CABS 5 Clusters
Fragment-interaction potential
Refinement Procedure
Starting Structure
Add H
Minimize in Screened Coulomb Potential (SCP)
MD 20 x 10 ns with SCP, CH27 force field
10 x 10 ns → 5000 snapshots, constraints on secondary structure
10 x 10 ns → 5000 snapshots, constraints on all Cα atoms
DFIRE score, RMSD with initial structure (iRMSD)
Corr(DFIRE, iRMSD) > 0.4
Discarded (very rare)
Average structure from low-DFIRE, low-iRMSD snapshots from sets 1 and 2
Relax at low T MD
Model 1 and 2
Select structures with lowest DFIRE score
Model 3, 4, 5
Hassan, S. A., Guarnieri, F., & Mehler, E. L. (2000). The Journal of Physical Chemistry B, 104(27), 6478-6489.
Mirjalili, V., Noyes, K., & Feig, M. (2014). Proteins: Structure, Function, and Bioinformatics, 82(S2), 196-207.
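The snapshot analysis in the refinement procedure (correlating DFIRE score with RMSD to the initial structure, then averaging low-scoring snapshots) can be sketched roughly as below. This is a minimal illustration, not the lab's pipeline: the function names, the quantile cutoff, and the assumption that snapshot coordinates are already superimposed are all hypothetical, since the slides do not give these details.

```python
import numpy as np

def dfire_irmsd_correlation(dfire, irmsd):
    """Pearson correlation between per-snapshot DFIRE scores and iRMSD values."""
    return float(np.corrcoef(dfire, irmsd)[0, 1])

def average_low_score_structure(coords, dfire, irmsd, quantile=0.25):
    """Average the coordinates of snapshots that are low in both DFIRE and iRMSD.

    coords:   array of shape (n_snapshots, n_atoms, 3), assumed pre-superimposed.
    quantile: hypothetical cutoff; the slides do not state how "low" is defined.
    Returns the averaged coordinates and the boolean selection mask.
    """
    coords = np.asarray(coords, dtype=float)
    dfire = np.asarray(dfire, dtype=float)
    irmsd = np.asarray(irmsd, dtype=float)
    keep = (dfire <= np.quantile(dfire, quantile)) & (irmsd <= np.quantile(irmsd, quantile))
    if not keep.any():
        raise ValueError("no snapshot passes both cutoffs")
    return coords[keep].mean(axis=0), keep
```

The correlation value would then be compared with the 0.4 threshold shown on the slide before the averaged structure is relaxed.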
Side-Chain Depth Environment (SDE)
within a sphere of 6 or 8 Å
along the main-chain
Center
(Kim & Kihara, Proteins 2014)
Finding Similar SDE from Database
Structure Database
2536 proteins
500 lowest RMSD fragments of 9 side-chain centroids, superimposed with the query fragment
Select SDE with the same number of side-chain centroids in the sphere of 8.0/6.0 Å
Query SDE
Compute residue-depth RMSD for corresponding side-chain centroids
Sort by depth RMSD to the query
surface
(Kim & Kihara, Proteins, 2014)
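The database search above (count centroids in the sphere, keep candidates with the same count, rank by residue-depth RMSD) might look like the following sketch. It assumes that fragment superposition has already been done and that corresponding centroids are paired by index; the function names and data layout are illustrative, not taken from PRESCO itself.

```python
import numpy as np

def count_in_sphere(center, centroids, radius=8.0):
    """Number of side-chain centroids within `radius` Å of `center`."""
    diff = np.asarray(centroids, dtype=float) - np.asarray(center, dtype=float)
    return int((np.linalg.norm(diff, axis=1) <= radius).sum())

def depth_rmsd(query_depths, candidate_depths):
    """RMSD between residue depths of corresponding side-chain centroids."""
    q = np.asarray(query_depths, dtype=float)
    c = np.asarray(candidate_depths, dtype=float)
    return float(np.sqrt(np.mean((q - c) ** 2)))

def rank_similar_sde(query_depths, candidates):
    """Keep candidates with the same number of centroids as the query,
    then sort them by depth RMSD to the query (most similar first).

    candidates: list of (label, depths) pairs; labels are hypothetical IDs.
    """
    n = len(query_depths)
    scored = [(depth_rmsd(query_depths, depths), label)
              for label, depths in candidates if len(depths) == n]
    return sorted(scored)
```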
Decoy Evaluation with Protein Residue Environmental Score (PRESCO)
Residue Contact Potential-Based Matrix
CCPC, CC80 Matrices:
Contact definition of two residues: any pair of side-chain heavy atoms or Cα atoms less than 4.5 Å apart
Compute a knowledge-based residue contact potential (Gaussian chain reference state, composition correction averaging)
Correlation coefficients of residue pairs are used as values of the amino acid similarity matrix
(Tan, Huang, & Kihara, Proteins, 2006)
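As a rough illustration of the two steps named on this slide, the sketch below tests the 4.5 Å contact criterion and turns a contact potential into a similarity matrix by correlating its rows. It assumes the similarity of two amino acids is the Pearson correlation of their contact-potential profiles; the reference state and composition correction are not reproduced, and the function names are hypothetical.

```python
import numpy as np

def in_contact(atoms_a, atoms_b, cutoff=4.5):
    """True if any atom pair between two residues is closer than `cutoff` Å.

    atoms_a, atoms_b: coordinates of side-chain heavy atoms or the Cα atom.
    """
    a = np.asarray(atoms_a, dtype=float)
    b = np.asarray(atoms_b, dtype=float)
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return bool((dist < cutoff).any())

def similarity_from_contact_potential(potential):
    """Correlate rows of a residue contact potential (20 x 20 in practice).

    Entry (i, j) of the result is the Pearson correlation between the
    contact-potential profiles of amino acids i and j.
    """
    return np.corrcoef(np.asarray(potential, dtype=float))
```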
Structure-derived Amino Acid Similarity Matrices in AAIndex
BLAJ010101 - Structural superposition data for identifying potential remote homologues (Blake-Cohen, 2001)
HENS920101 - BLOSUM45 substitution matrix (Henikoff-Henikoff, 1992)
JOHM930101 - Structure-based amino acid scoring table (Johnson-Overington, 1993)
KOLA920101 - Conformational similarity weight matrix (Kolaskar-Kulkarni-Kale, 1992)
KOSJ950115 - Context-dependent optimal substitution matrices for all residues (Koshi-Goldstein, 1995)
MIYS930101 - Base-substitution-protein-stability matrix (Miyazawa-Jernigan, 1993)
OVEJ920101 - STR matrix from structure-based alignments (Overington et al., 1992)
PRLA000101 - Structure derived matrix (SDM) for alignment of distantly related sequences (Prlic et al., 2000)
PRLA000102 - Homologous structure derived matrix for alignment of distantly related sequences (Prlic et al., 2000)
QU_C930101 - Cross-correlation coefficients of preference factors main chain (Qu et al., 1993)
QU_C930102 - Cross-correlation coefficients of preference factors side chain (Qu et al., 1993)
QUIB020101 - STROMA score matrix for the alignment of known distant homologs (Qian-Goldstein, 2002)
Alignment Accuracy by AA Matrices
2761 fold-level protein sequence pairs, Lindahl & Elofsson database
(Tan, Huang, Kihara, Proteins 2006)
Correct alignments: >50% of residues are correctly aligned
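The ">50% of residues correctly aligned" criterion could be checked along these lines. The sketch assumes an alignment is represented as a collection of (query index, template index) residue pairs and that the fraction is taken over the reference alignment; the slide does not specify the denominator, so that choice is an assumption.

```python
def fraction_correct(predicted, reference):
    """Fraction of reference aligned residue pairs reproduced by the prediction."""
    ref = set(reference)
    if not ref:
        raise ValueError("reference alignment is empty")
    return len(ref & set(predicted)) / len(ref)

def is_correct_alignment(predicted, reference, threshold=0.5):
    """An alignment counts as correct when more than `threshold` of pairs match."""
    return fraction_correct(predicted, reference) > threshold
```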
Native Structure Recognition
Decoy Sets | DFIRE | dDFIRE | DOPE | RW | RWplus | OPUS-PSP | GOAP | MRE (CC80) | SDE (QUIB) | MRE & SDE: BLSM30+QU_C2 | MRE & SDE: BLSM30+QU_C2 | MRE & SDE: CC80+QU_C1 | # Targets
4state_reduced | 6 | 7 | 7 | 6 | 6 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7
Fisa | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 2 | 2 | 2 | 2 | 3 | 4
Fisa_casp3 | 4 | 4 | 3 | 4 | 4 | 5 | 5 | 2 | 1 | 3 | 3 | 4 | 5
Lmds | 7 | 6 | 7 | 7 | 7 | 8 | 7 | 10 | 6 | 10 | 10 | 10 | 10
Lattice_ssfit | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8
hg_structal | 12 | 16 | ---- | ---- | 12 | 18 | 22 | 28 | 11 | 27 | 27 | 27 | 29
ig_structal | 0 | 26 | ---- | ---- | 0 | 20 | 47 | 61 | 6 | 61 | 61 | 61 | 61
ig_structal_hires | 0 | 16 | ---- | ---- | 0 | 14 | 18 | 20 | 6 | 20 | 20 | 20 | 20
Moulder | 19 | 18 | 19 | 19 | 19 | 19 | 19 | 20 | 16 | 20 | 20 | 20 | 20
ROSETTA | 20 | 12 | 21 | 20 | 20 | 39 | 45 | 25 | 31 | 41 | 41 | 39 | 58
I-TASSER | 49 | 48 | 30 | 53 | 56 | 55 | 45 | 56 | 47 | 56 | 56 | 56 | 56
#Total (Z-score) | 128 (-1.94) | 164 (-2.52) | 98/168 (-2.47) | 120/168 (-3.23) | 135 (-2.13) | 196 (-2.86) | 226 (-3.57) | 239 (6.78) | 141 (2.14) | 255 (5.70) | 255 (5.76) | 255 (5.65) | 278
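A native-recognition tally of this kind can be sketched as follows. The Z-score definition used here, (native score minus decoy mean) divided by decoy standard deviation, is a common convention and an assumption; the slide does not define it. The `higher_is_better` flag is also hypothetical, added because the energy functions report negative Z-scores while the MRE and SDE columns report positive ones.

```python
import numpy as np

def native_rank_and_zscore(native_score, decoy_scores, higher_is_better=False):
    """Rank of the native among its decoys (1 = best) and its Z-score."""
    decoys = np.asarray(decoy_scores, dtype=float)
    if higher_is_better:
        better = decoys > native_score
    else:
        better = decoys < native_score
    rank = 1 + int(better.sum())
    z = (native_score - decoys.mean()) / decoys.std()
    return rank, float(z)

def count_recognized(targets, higher_is_better=False):
    """Number of targets whose native structure is ranked first.

    targets: iterable of (native_score, decoy_scores) pairs.
    """
    return sum(native_rank_and_zscore(n, d, higher_is_better)[0] == 1
               for n, d in targets)
```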
Benchmark on Rykunov & Fiser CASP Set
Comparison against 36 scoring functions. Only showing results of 13 functions.
Scoring Function | Models only: Average Rank | Models only: Ranked 1 | Native included: Average Rank | Native included: Ranked 1
MRE (CC80) | 6.77 | 29 | 1.32 | 131
SDE (QUIB) | 2.89 | 56 | 1.98 | 97
MRE & SDE: BLSM30+QU_C1 | 6.79 | 31 | 1.18 | 139
MRE & SDE: CC80(SDE)+BLSM30(SDE) | 2.82 | 66 | 1.99 | 89
QMEAN6 | 2.87 | 85 | 1.71 | 113
RWplus | 2.97 | 57 | 1.78 | 106
RW | 3.08 | 51 | 1.71 | 110
QMEANall_atom | 3.59 | 74 | 1.71 | 119
QMEANSSE_agree | 3.74 | 62 | 3.72 | 39
RF_HA_SRS | 4.65 | 49 | 1.38 | 137
OPUS_CA | 4.72 | 79 | 5.13 | 55
RF_HA | 5.44 | 62 | 2.78 | 112
DOPE | 5.77 | 54 | 3.27 | 95
DFIRE | 6.03 | 50 | 5.69 | 33
Floudas-CM | 7.75 | 38 | 7.05 | 42
Melo-ANOLEA | 9.62 | 19 | 5.19 | 86
Random | 9.72 | 13.9 | 10.1 | 8.3
Side-Chain Building
(Peterson, Kang, Kihara, Proteins 2014)
T0804 Top 1 models
Kiharalab: TS333_1
Boniecki_pred: TS301_1
Skwark: TS358_1
T0804 Kiharalab Top 1 Model
Native (Coordinates not available)
Kiharalab_1: GDT-TS 31.44, GOAP -18178.22
QUARK_5: GDT-TS 30.93, GOAP -14959.68
Best in Top 1 Models
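For readers unfamiliar with the GDT-TS values quoted on these slides, the score averages the percentage of residues within 1, 2, 4, and 8 Å of their native positions. The sketch below is a simplification: it takes one fixed set of per-residue Cα distances, whereas the official calculation searches over many superpositions to maximize each percentage, so it will generally underestimate the reported value.

```python
import numpy as np

def gdt_ts(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT-TS from per-residue model-to-native Cα distances (Å).

    Uses a single superposition (assumed done beforehand), unlike the
    full GDT procedure, which optimizes the superposition per cutoff.
    """
    d = np.asarray(distances, dtype=float)
    fractions = [(d <= c).mean() for c in cutoffs]
    return 100.0 * float(np.mean(fractions))
```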
T0804 Server Models Selected by PRESCO
Final Selection
Rank Model GDT-TS
1 QUARK_TS5 30.93
2 myprotein-me_TS4 12.63
3 Zhang-server_TS5 29.77
4 Seok-server_TS2 11.86
5 BAKER_ROSETTASERVER_TS3 12.37
QU_C2 + BLOSUM30
Rank Model
1 QUARK_TS5
2 TASSER-VMT_TS5
3 myprotein-me_TS4
4 BAKER_ROSETTAS_TS3
5 Zhang-Server-TS1
QU_C1 + QUIB
Rank Model
1 QUARK_TS5
2 SAM-T08-server_TS3
3 myprotein-me_TS1
4 myprotein-me_TS4
5 TASSER-VMT-TS1
CC80+ BLOSUM30
Rank Model
1 BAKER_ROSETTAS_TS3
2 myprotein-me_TS4
3 QUARK_TS5
4 myprotein-me_TS1
5 RBO_Aleph_TS3
CCPC+ BLOSUM30
Rank Model
1 QUARK_TS5
2 BAKER_ROSETTAS_TS3
3 myprotein-me_TS4
4 myprotein-me_TS1
5 TASSER-VMT_TS5
QUIB
Rank Model
1 SAM-T08-server_TS3
2 myprotein-me_TS4
3 QUARK_TS5
4 BAKER_ROSETTAS_TS3
5 BAKER_ROSETTAS_TS2
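One simple way to compare the five matrix-specific rankings above is a Borda-style tally, sketched below with the model names exactly as they appear on the slide. This is only an illustration of rank aggregation; the slides do not say how the final selection was derived from these lists, and the tally does not reproduce it.

```python
from collections import defaultdict

# Top-5 lists per matrix combination, copied from the slide.
rankings = {
    "QU_C2 + BLOSUM30": ["QUARK_TS5", "TASSER-VMT_TS5", "myprotein-me_TS4",
                         "BAKER_ROSETTAS_TS3", "Zhang-Server-TS1"],
    "QU_C1 + QUIB": ["QUARK_TS5", "SAM-T08-server_TS3", "myprotein-me_TS1",
                     "myprotein-me_TS4", "TASSER-VMT-TS1"],
    "CC80 + BLOSUM30": ["BAKER_ROSETTAS_TS3", "myprotein-me_TS4", "QUARK_TS5",
                        "myprotein-me_TS1", "RBO_Aleph_TS3"],
    "CCPC + BLOSUM30": ["QUARK_TS5", "BAKER_ROSETTAS_TS3", "myprotein-me_TS4",
                        "myprotein-me_TS1", "TASSER-VMT_TS5"],
    "QUIB": ["SAM-T08-server_TS3", "myprotein-me_TS4", "QUARK_TS5",
             "BAKER_ROSETTAS_TS3", "BAKER_ROSETTAS_TS2"],
}

def borda_tally(rankings):
    """Give n points for rank 1 down to 1 point for rank n, summed over lists."""
    scores = defaultdict(int)
    for models in rankings.values():
        n = len(models)
        for position, model in enumerate(models):
            scores[model] += n - position
    # Highest total first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

tally = borda_tally(rankings)
```

Under this tally QUARK_TS5 comes first, which agrees with rank 1 of the final selection, while the lower ranks differ.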
T0799-D1 Kiharalab Top 1 Model
Native (Coordinates not available)
Kiharalab_1: GDT-TS 19.86, GOAP -33178.17
BAKER-ROSETTASERVER_3: GDT-TS 19.86, GOAP -31360.77
3rd Best in Top 1 Models
T0834-D1 Kiharalab Top 1 Model
3rd Best in Top 1 Models
Kiharalab_1: GDT-TS 37.12, GOAP -26474.14
RBO_ALeph_5: GDT-TS 37.88, GOAP -26865.67
Superimposition with the native (130-192) (D1 also includes 2-37)
Acknowledgements
http://kiharalab.org
@kiharalab
Hyung-Rae Kim
Amit Roy
Lenna Peterson
Daisuke Kihara