
Kihara Lab protein structure prediction performance in CASP11

Page 1: Kihara Lab protein structure prediction performance in CASP11

Hyung-Rae Kim, Amit Roy, Daisuke Kihara

http://kiharalab.org

KIHARA LAB (group 333)

Page 2: Kihara Lab protein structure prediction performance in CASP11

Overall Prediction Procedure

[Flowchart labels: Server Models; PRESCO Residue Environment Score; Ranking by AA matrices (BLOSUM30, CC80, CCPC, QUIB, QUC2); 20-30 models; CABS, 5 clusters; Fragment-interaction potential; Structure Refinement by CHARMM runs; 5 models; Side-Chain Modeling; Final 5 models]

Page 3: Kihara Lab protein structure prediction performance in CASP11

Refinement Procedure

Starting Structure

Add H

Minimize in Screened Coulomb Potential (SCP)

MD 20 x 10 ns with SCP, CH27 force field

10 x 10 ns → 5000 snapshots, constraints on secondary structure

10 x 10 ns → 5000 snapshots, constraints on all Cα atoms

Dfire score, RMSD with initial structure (Irmsd)

Corr(dfire, Irmsd) > 0.4; Discarded (very rare)

Average structure from low dfire, Irmsd snapshots from set 1 and 2

Relax at low T MD

Model 1 and 2

Select structures with lowest dfire score

Model 3, 4, 5

Hassan, S. A., Guarnieri, F., & Mehler, E. L. (2000). The Journal of Physical Chemistry B, 104(27), 6478-6489.

Mirjalili, V., Noyes, K., & Feig, M. (2014). Proteins: Structure, Function, and Bioinformatics, 82(S2), 196-207.
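The snapshot-selection steps of the refinement procedure can be illustrated with a small sketch. It is not the lab's pipeline: the function names, the cutoff parameters, and the rule "keep snapshots below both cutoffs" are assumptions made for illustration, and snapshots are assumed to be already superimposed. The slide gives the 0.4 correlation threshold but not how "low dfire, Irmsd" is defined, so the sketch only computes the correlation and leaves the decision to the caller.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def select_low(snapshots, dfire, irmsd, dfire_cutoff, irmsd_cutoff):
    """Keep snapshots whose dfire score and Irmsd are both at or below the
    cutoffs (hypothetical selection rule, not stated on the slide)."""
    return [s for s, d, r in zip(snapshots, dfire, irmsd)
            if d <= dfire_cutoff and r <= irmsd_cutoff]

def average_structure(snapshots):
    """Average coordinates atom by atom; snapshots must share atom order
    and are assumed to be superimposed already."""
    n = len(snapshots)
    return [tuple(sum(s[i][k] for s in snapshots) / n for k in range(3))
            for i in range(len(snapshots[0]))]

# Toy data: three snapshots of a two-atom "structure".
snapshots = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.5, 0.0, 0.0), (1.5, 0.0, 0.0)],
    [(5.0, 5.0, 5.0), (6.0, 5.0, 5.0)],
]
dfire = [-100.0, -102.0, -80.0]
irmsd = [0.5, 0.6, 3.0]

corr = pearson(dfire, irmsd)
low = select_low(snapshots, dfire, irmsd, dfire_cutoff=-90.0, irmsd_cutoff=1.0)
avg = average_structure(low)
```

In a real run the averaged coordinates would then be relaxed with low-temperature MD, as the slide indicates; that step is outside the scope of this sketch.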

Page 4: Kihara Lab protein structure prediction performance in CASP11

Side-Chain Depth Environment (SDE)

[Figure labels: within a sphere of 6 or 8 Å; along the main-chain; Center]

(Kim & Kihara, Proteins 2014)

Page 5: Kihara Lab protein structure prediction performance in CASP11

Finding Similar SDE from Database

Structure Database

2536 proteins

500 lowest-RMSD fragments of 9 side-chain centroids, superimposed with the query fragment

Select SDE with the same number of side-chain centroids in the sphere of 8.0/6.0 Å

Compute residue-depth RMSD for corresponding side-chain centroids

Sort by depth RMSD to the query

[Figure labels: Query SDE; surface]
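The last three steps of the database search (filter by centroid count, compute depth RMSD, sort) can be sketched as follows. This is a minimal illustration under assumptions: each SDE is reduced to a list of residue depths, corresponding centroids are taken to be those at the same list position, and the names `depth_rmsd` and `rank_similar_sde` are hypothetical.

```python
import math

def depth_rmsd(query_depths, candidate_depths):
    """Root-mean-square difference of residue depths for centroids that
    correspond by list position."""
    if len(query_depths) != len(candidate_depths):
        raise ValueError("SDEs must contain the same number of centroids")
    squared = sum((q - c) ** 2 for q, c in zip(query_depths, candidate_depths))
    return math.sqrt(squared / len(query_depths))

def rank_similar_sde(query_depths, candidates):
    """candidates: list of (label, depths) pairs.
    Keep candidates with the same number of centroids as the query,
    then sort them by depth RMSD to the query (most similar first)."""
    scored = [(label, depth_rmsd(query_depths, depths))
              for label, depths in candidates
              if len(depths) == len(query_depths)]
    return sorted(scored, key=lambda item: item[1])

# Toy example: depths in arbitrary units.
query = [1.0, 2.0, 3.0]
database = [
    ("b", [2.0, 3.0, 4.0]),
    ("c", [1.0, 2.0]),        # different centroid count, filtered out
    ("a", [1.0, 2.0, 3.0]),
]
ranked = rank_similar_sde(query, database)
```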

Page 6: Kihara Lab protein structure prediction performance in CASP11

(Kim & Kihara, Proteins, 2014)

Decoy Evaluation with Protein Residue Environmental Score (PRESCO)

Page 7: Kihara Lab protein structure prediction performance in CASP11

Residue Contact Potential-Based Matrix

CCPC, CC80 matrices:

Contact definition of two residues: any pair of side-chain heavy atoms or Cα atoms closer than 4.5 Å

Compute a knowledge-based residue contact potential (Gaussian chain reference state, composition correction averaging)

Correlation coefficients of residue pairs are used as values of the amino acid similarity matrix

(Tan, Huang, & Kihara, Proteins, 2006)
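One way to read the final step, turning a contact potential into a similarity matrix, is to correlate the contact-energy profiles of two residue types. The sketch below assumes exactly that interpretation (Pearson correlation between rows of the potential); the three-letter toy alphabet and its energy values are invented for illustration and are not taken from the CCPC or CC80 matrices.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def similarity_matrix(potential):
    """potential[a][b] is the contact energy between residue types a and b.
    Returns sim[a][b], the correlation between the contact profiles of a and b."""
    residues = sorted(potential)
    profile = {a: [potential[a][b] for b in residues] for a in residues}
    return {a: {b: pearson(profile[a], profile[b]) for b in residues}
            for a in residues}

# Invented toy potential over three residue types (symmetric).
toy_potential = {
    "A": {"A": -0.1, "K": 0.2, "L": -0.5},
    "K": {"A": 0.2, "K": -0.3, "L": 0.6},
    "L": {"A": -0.5, "K": 0.6, "L": -1.0},
}
sim = similarity_matrix(toy_potential)
```

In this toy case the two hydrophobic-like residues (A, L) have similar contact profiles and receive a positive similarity, while A and K receive a negative one.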

Page 8: Kihara Lab protein structure prediction performance in CASP11

Structure-derived Amino Acid Similarity Matrices in AAIndex

BLAJ010101 - Structural superposition data for identifying potential remote homologues (Blake-Cohen, 2001)
HENS920101 - BLOSUM45 substitution matrix (Henikoff-Henikoff, 1992)
JOHM930101 - Structure-based amino acid scoring table (Johnson-Overington, 1993)
KOLA920101 - Conformational similarity weight matrix (Kolaskar-Kulkarni-Kale, 1992)
KOSJ950115 - Context-dependent optimal substitution matrices for all residues (Koshi-Goldstein, 1995)
MIYS930101 - Base-substitution-protein-stability matrix (Miyazawa-Jernigan, 1993)
OVEJ920101 - STR matrix from structure-based alignments (Overington et al., 1992)
PRLA000101 - Structure derived matrix (SDM) for alignment of distantly related sequences (Prlic et al., 2000)
PRLA000102 - Homologous structure derived matrix for alignment of distantly related sequences (Prlic et al., 2000)
QU_C930101 - Cross-correlation coefficients of preference factors main chain (Qu et al., 1993)
QU_C930102 - Cross-correlation coefficients of preference factors side chain (Qu et al., 1993)
QUIB020101 - STROMA score matrix for the alignment of known distant homologs (Qian-Goldstein, 2002)

Page 9: Kihara Lab protein structure prediction performance in CASP11

Alignment Accuracy by AA Matrices

2761 fold-level protein sequence pairs, Lindahl & Elofsson database

(Tan, Huang, Kihara, Proteins 2006)

Correct alignments: >50% of residues are correctly aligned
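The ">50% of residues correctly aligned" criterion can be made concrete with a short sketch. It assumes an alignment is represented as a list of (query position, template position) pairs and that the fraction is taken over the pairs of the reference (structure-based) alignment; the slide does not state the denominator, so that choice and the function names are assumptions.

```python
def fraction_correct(predicted_pairs, reference_pairs):
    """Fraction of reference aligned pairs that the predicted alignment reproduces."""
    reference = set(reference_pairs)
    if not reference:
        raise ValueError("reference alignment is empty")
    return len(reference & set(predicted_pairs)) / len(reference)

def is_correct_alignment(predicted_pairs, reference_pairs, threshold=0.5):
    """An alignment counts as correct when more than `threshold` of the
    reference pairs are reproduced (strictly greater, as on the slide)."""
    return fraction_correct(predicted_pairs, reference_pairs) > threshold

# Toy alignments: (query residue index, template residue index).
reference = [(1, 1), (2, 2), (3, 3), (4, 5)]
good = [(1, 1), (2, 2), (3, 3), (4, 4)]       # 3 of 4 pairs match
borderline = [(1, 2), (2, 3), (3, 3), (4, 5)]  # exactly 2 of 4 pairs match
```

With the strict inequality, the borderline alignment at exactly 50% is not counted as correct.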

Page 10: Kihara Lab protein structure prediction performance in CASP11

Native Structure Recognition

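| Decoy set | DFIRE | dDFIRE | DOPE | RW | RWplus | OPUS-PSP | GOAP | MRE (CC80) | SDE (QUIB) | MRE & SDE: BLSM30+QU_C2 | MRE & SDE: BLSM30+QU_C2 | MRE & SDE: CC80+QU_C1 | # Targets |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4state_reduced | 6 | 7 | 7 | 6 | 6 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 |
| Fisa | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 2 | 2 | 2 | 2 | 3 | 4 |
| Fisa_casp3 | 4 | 4 | 3 | 4 | 4 | 5 | 5 | 2 | 1 | 3 | 3 | 4 | 5 |
| Lmds | 7 | 6 | 7 | 7 | 7 | 8 | 7 | 10 | 6 | 10 | 10 | 10 | 10 |
| Lattice_ssfit | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 |
| hg_structal | 12 | 16 | ---- | ---- | 12 | 18 | 22 | 28 | 11 | 27 | 27 | 27 | 29 |
| ig_structal | 0 | 26 | ---- | ---- | 0 | 20 | 47 | 61 | 6 | 61 | 61 | 61 | 61 |
| ig_structal_hires | 0 | 16 | ---- | ---- | 0 | 14 | 18 | 20 | 6 | 20 | 20 | 20 | 20 |
| Moulder | 19 | 18 | 19 | 19 | 19 | 19 | 19 | 20 | 16 | 20 | 20 | 20 | 20 |
| ROSETTA | 20 | 12 | 21 | 20 | 20 | 39 | 45 | 25 | 31 | 41 | 41 | 39 | 58 |
| I-TASSER | 49 | 48 | 30 | 53 | 56 | 55 | 45 | 56 | 47 | 56 | 56 | 56 | 56 |
| Total (Z-score) | 128 (-1.94) | 164 (-2.52) | 98/168 (-2.47) | 120/168 (-3.23) | 135 (-2.13) | 196 (-2.86) | 226 (-3.57) | 239 (6.78) | 141 (2.14) | 255 (5.70) | 255 (5.76) | 255 (5.65) | 278 |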

Page 11: Kihara Lab protein structure prediction performance in CASP11

Benchmark on Rykunov & Fiser CASP Set

Comparison against 36 scoring functions. Only showing results of 13 functions.

| Scoring function | Models only: average rank | Models only: ranked 1 | Native included: average rank | Native included: ranked 1 |
|---|---|---|---|---|
| MRE (CC80) | 6.77 | 29 | 1.32 | 131 |
| SDE (QUIB) | 2.89 | 56 | 1.98 | 97 |
| Combinations of MRE & SDE: | | | | |
| BLSM30+QU_C1 | 6.79 | 31 | 1.18 | 139 |
| CC80(SDE)+BLSM30(SDE) | 2.82 | 66 | 1.99 | 89 |
| QMEAN6 | 2.87 | 85 | 1.71 | 113 |
| RWplus | 2.97 | 57 | 1.78 | 106 |
| RW | 3.08 | 51 | 1.71 | 110 |
| QMEANall_atom | 3.59 | 74 | 1.71 | 119 |
| QMEANSSE_agree | 3.74 | 62 | 3.72 | 39 |
| RF_HA_SRS | 4.65 | 49 | 1.38 | 137 |
| OPUS_CA | 4.72 | 79 | 5.13 | 55 |
| RF_HA | 5.44 | 62 | 2.78 | 112 |
| DOPE | 5.77 | 54 | 3.27 | 95 |
| DFIRE | 6.03 | 50 | 5.69 | 33 |
| Floudas-CM | 7.75 | 38 | 7.05 | 42 |
| Melo-ANOLEA | 9.62 | 19 | 5.19 | 86 |
| Random | 9.72 | 13.9 | 10.1 | 8.3 |

[Legend: Best; Second best; Third best]
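The two summary statistics in this benchmark, average rank and number of targets ranked 1, can be computed as in the sketch below. It is a generic illustration rather than the benchmark's own script: the data layout (one score dictionary per target plus the name of the structure whose rank is tracked), the function names, and the default that lower scores are better are all assumptions. Ties are broken by insertion order.

```python
def rank_of(scores, key, lower_is_better=True):
    """1-based rank of `key` among all structures in `scores`
    (a dict mapping structure name to score)."""
    ordered = sorted(scores, key=scores.get, reverse=not lower_is_better)
    return ordered.index(key) + 1

def summarize(targets, lower_is_better=True):
    """targets: list of (scores, key) pairs, one per target.
    Returns (average rank of the tracked structure, number of targets
    where it is ranked first)."""
    ranks = [rank_of(scores, key, lower_is_better) for scores, key in targets]
    return sum(ranks) / len(ranks), sum(1 for r in ranks if r == 1)

# Toy benchmark with two targets; the tracked structure is the native.
targets = [
    ({"native": -10.0, "decoy1": -5.0, "decoy2": -7.0}, "native"),  # rank 1
    ({"native": -3.0, "decoy1": -9.0, "decoy2": -1.0}, "native"),   # rank 2
]
average_rank, ranked_first = summarize(targets)
```

For a score where higher is better, pass `lower_is_better=False`.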

Page 12: Kihara Lab protein structure prediction performance in CASP11

Side-Chain Building

(Peterson, Kang, Kihara, Proteins 2014)

Page 13: Kihara Lab protein structure prediction performance in CASP11

T0804 Top 1 models

Kiharalab: TS333_1

Boniecki_pred: TS301_1

Skwark: TS358_1

Page 14: Kihara Lab protein structure prediction performance in CASP11

T0804 Kiharalab Top 1 Model

Native (Coordinates not available)

Kiharalab_1: GDT-TS 31.44, GOAP -18178.22

QUARK_5: GDT-TS 30.93, GOAP -14959.68

Best in Top 1 Models

Page 15: Kihara Lab protein structure prediction performance in CASP11

T0804 Server Models Selected by PRESCO

Final Selection

Rank Model GDT-TS

1 QUARK_TS5 30.93

2 myprotein-me_TS4 12.63

3 Zhang-server_TS5 29.77

4 Seok-server_TS2 11.86

5 BAKER_ROSETTASERVER_TS3 12.37

QU_C2 + BLOSUM30

Rank Model

1 QUARK_TS5

2 TASSER-VMT_TS5

3 myprotein-me_TS4

4 BAKER_ROSETTAS_TS3

5 Zhang-Server_TS1

QU_C1 + QUIB

Rank Model

1 QUARK_TS5

2 SAM-T08-server_TS3

3 myprotein-me_TS1

4 myprotein-me_TS4

5 TASSER-VMT_TS1

CC80 + BLOSUM30

Rank Model

1 BAKER_ROSETTAS_TS3

2 myprotein-me_TS4

3 QUARK_TS5

4 myprotein-me_TS1

5 RBO_Aleph_TS3

CCPC + BLOSUM30

Rank Model

1 QUARK_TS5

2 BAKER_ROSETTAS_TS3

3 myprotein-me_TS4

4 myprotein-me_TS1

5 TASSER-VMT_TS5

QUIB

Rank Model

1 SAM-T08-server_TS3

2 myprotein-me_TS4

3 QUARK_TS5

4 BAKER_ROSETTAS_TS3

5 BAKER_ROSETTAS_TS2

Page 16: Kihara Lab protein structure prediction performance in CASP11

T0799-D1 Kiharalab Top 1 Model

Native (Coordinates not available)

Kiharalab_1: GDT-TS 19.86, GOAP -33178.17

BAKER-ROSETTASERVER_3: GDT-TS 19.86, GOAP -31360.77

3rd Best in Top 1 Models

Page 17: Kihara Lab protein structure prediction performance in CASP11

T0834-D1 Kiharalab Top 1 Model

3rd Best in Top 1 Models

Kiharalab_1: GDT-TS 37.12, GOAP -26474.14

RBO_Aleph_5: GDT-TS 37.88, GOAP -26865.67

Superimposition with the native (130-192) (D1 also includes 2-37)

Page 18: Kihara Lab protein structure prediction performance in CASP11

Acknowledgements

http://kiharalab.org

@kiharalab

Hyung-Rae Kim

Amit Roy

Lenna Peterson

Daisuke Kihara