Template-based Prediction of Protein 8-state Secondary Structures
June 12th 2013
Ashraf Yaseen and Yaohang Li
DEPARTMENT OF COMPUTER SCIENCE, OLD DOMINION UNIVERSITY, NORFOLK, VA
3rd IEEE International Conference on Computational Advances in Bio and Medical Sciences (ICCABS)
Contents
Introduction
• Secondary Structure Definition & Representation
• Secondary Structure Prediction
• C8-Scorpion
Materials & Methods
• Data Sets, Template Construction, and Encoding
• Neural Network Model
Results & Discussions
Summary
Protein Secondary Structure Prediction in Protein Modeling
Proteins: from the Greek proteios, “primary” or “of prime importance”; the primary components of living things
In nature, proteins fold into specific 3D structures critical to their functions
Protein Modeling
Correctly predicting protein secondary structure is a critical stepping stone toward obtaining correct 3D models
Sequence → intermediate prediction steps → 3D
Secondary Structures - Definition
Figure: Protein 1BOO chain A, with labeled π-helix, α-helix, 3-10 helix, turn, bend, other, and β-strand
• General 3D form of local segments of residues
• Identified from determined protein 3D structures
• DSSP
Secondary Structures - Representation
• 3-10 helix (G)
• α-helix (H)
• π-helix (I)
• β-strand (E)
• β-bridge (B)
• turn (T)
• bend (S)
• others (C)
Secondary Structure Prediction - Effectiveness
Correctly predicting secondary structure:
• Reduces the degrees of freedom in protein structure modeling
• Reduces the difficulty of obtaining high-resolution 3D models
• Derives a much smaller range of possible torsion angles
http://www.imb-jena.de/~rake/Bioinformatics_WEB/basics_peptide_bond.html
Secondary Structure Prediction - Background
Secondary Structure Prediction
• 3-state (helix, sheet, coil)
• 8-state (α-helix, π-helix, 3-10 helix, β-strand, β-bridge, turn, bend, and others)
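The relation between the 8-state and 3-state alphabets can be sketched as a lookup table. This is a minimal sketch, assuming the commonly used reduction (G, H, I to helix; E, B to sheet; T, S, C to coil); the slides do not say which reduction is used, and the names `DSSP_STATES`, `EIGHT_TO_THREE`, and `reduce_to_three_state` are illustrative only.

```python
# One-letter codes for the eight states named on the slides.
DSSP_STATES = {
    "G": "3-10 helix",
    "H": "alpha-helix",
    "I": "pi-helix",
    "E": "beta-strand",
    "B": "beta-bridge",
    "T": "turn",
    "S": "bend",
    "C": "others",
}

# ASSUMPTION: a common 8-to-3 reduction, not taken from the slides.
EIGHT_TO_THREE = {
    "G": "H", "H": "H", "I": "H",   # helix
    "E": "E", "B": "E",             # sheet
    "T": "C", "S": "C", "C": "C",   # coil
}


def reduce_to_three_state(ss8: str) -> str:
    """Map an 8-state string to a 3-state string over H, E, C."""
    return "".join(EIGHT_TO_THREE[state] for state in ss8)
```

For example, `reduce_to_three_state("GHIEBTSC")` gives `"HHHEECCC"` under this assumed mapping.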
Predictor
Structural state of Ri
Secondary structure prediction is a classification problem: each residue is predicted to be in one of a few states
Machine Learning (ANN, SVM, HMM, ...)
3-state examples: GOR4, PSI-Pred, PHD, SAM, Porter, JPred, SPINE, SSPRO, NETSURF, and many others; ~80% Q3
8-state examples: SSpro8 (62-63% Q8), RaptorXss8 (67.9% Q8)
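The Q8 figures quoted here are per-residue accuracies. A minimal sketch of how Q8 and the per-state accuracies (QH, QG, and so on) are usually computed, assuming two aligned state strings of equal length; the function name `per_state_accuracy` is hypothetical and not part of any tool named in the slides.

```python
from collections import Counter


def per_state_accuracy(true_ss: str, pred_ss: str) -> dict:
    """Return Q8 and per-state accuracies in percent.

    Q8 is the share of all residues predicted correctly.
    Q<state> is the share of residues whose true state is <state>
    that were predicted correctly.
    """
    if len(true_ss) != len(pred_ss):
        raise ValueError("state strings must have equal length")
    if not true_ss:
        raise ValueError("state strings must not be empty")

    total = Counter(true_ss)
    correct = Counter(t for t, p in zip(true_ss, pred_ss) if t == p)

    result = {f"Q{state}": 100.0 * correct[state] / count
              for state, count in total.items()}
    result["Q8"] = 100.0 * sum(correct.values()) / len(true_ss)
    return result
```

A state that never occurs in `true_ss` has no entry in the result, since its accuracy is undefined.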
Secondary Structure Prediction - 8-state
      CB513   CASP9   Manesh215   Carugo338
QG    17.54   20.58   18.43       19.20
QH    89.96   92.90   90.22       89.91
QI    0.00    0.00    0.00        0.00
QE    77.68   81.64   79.60       79.45
QB    0.09    0.00    0.32        0.44
QS    15.87   18.11   17.80       17.14
QT    48.02   51.45   51.28       50.11
QC    63.29   59.37   63.73       63.36
Q8    65.59   69.31   67.69       66.64
Prediction accuracy of RaptorXss8 on the CB513, CASP9, Manesh215, and Carugo338 benchmarks. Prediction accuracies for 3-10 helices (G), π-helices (I), β-bridges (B), and bends (S) are particularly low due to their low frequencies of occurrence
Distribution of 3-10 helices (G), α-helices (H), π-helices (I), β-sheets (E), β-bridges (B), turns (T), bends (S), and coils (C) in Cull5547
Secondary Structure Prediction - Template-based
Most current methods for secondary structure predictions are ab initio
However, many protein sequences have some degree of similarity among themselves
Latest version of Porter (in 3-state):
• Improvement in prediction accuracy with >30% sequence similarity
• Decline in efficiency with low sequence similarity (<20%)
Template-based C8-SCORPION
Predictor
Structural feature (state) of Ri
Input encoding: sequence & evolutionary info (PSSM) + structure info from (templates or context-based scores)
An extension of our previous method, C3-SCORPION
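One way such an input encoding might be assembled is a sliding window that concatenates each residue's PSSM row with its structural features. This is only a sketch under stated assumptions: the window size, the zero padding at chain ends, and the feature layout are not given in the slides, and `encode_windows` is a hypothetical helper, not the C8-SCORPION implementation.

```python
def encode_windows(pssm, structure_feats, half_window=7):
    """Build one flat input vector per residue from a sliding window.

    pssm: per-residue rows of evolutionary scores.
    structure_feats: per-residue rows of structural features
        (from templates or context-based scores).
    half_window: residues taken on each side (ASSUMED value).
    Window positions that fall outside the chain are zero-padded.
    """
    if len(pssm) != len(structure_feats):
        raise ValueError("feature tables must cover the same residues")
    n = len(pssm)
    if n == 0:
        return []

    # Join both feature sources into one row per residue.
    rows = [list(p) + list(s) for p, s in zip(pssm, structure_feats)]
    pad = [0.0] * len(rows[0])

    vectors = []
    for i in range(n):
        vec = []
        for j in range(i - half_window, i + half_window + 1):
            vec.extend(rows[j] if 0 <= j < n else pad)
        vectors.append(vec)
    return vectors
```

Each vector has `(2 * half_window + 1)` times the per-residue feature width, so every residue presents the same input size to the predictor regardless of its position in the chain.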
Materials & Methods
Data Sets
• Cull5547: PISCES server, 25% (at most) sequence identity, 2.0 Å resolution
• Benchmarks: CB513, CASP9, Manesh215, Carugo338
Template Construction
Encoding
Context-based scores: potential scores, based on statistics, derived from the protein datasets, estimate the favorability of residues in adopting specific structural states, within their amino acid environment.
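The idea of a statistics-derived favorability score can be illustrated with a much simpler stand-in: a single-residue log-odds propensity that ignores the neighboring residues. This is not the context-based scoring used by the authors; the function name, the pseudocount, and the exact formula are assumptions made for illustration.

```python
import math
from collections import Counter


def log_odds_scores(sequences, structures, pseudocount=1.0):
    """Score each (amino acid, state) pair by log(observed / expected).

    Positive scores mean the residue type occurs in that state more
    often than chance would suggest; negative scores mean less often.
    SIMPLIFICATION: no amino acid environment is taken into account.
    """
    pair, aa, st = Counter(), Counter(), Counter()
    for seq, ss in zip(sequences, structures):
        if len(seq) != len(ss):
            raise ValueError("sequence and structure lengths differ")
        for a, s in zip(seq, ss):
            pair[(a, s)] += 1
            aa[a] += 1
            st[s] += 1

    total = sum(aa.values())
    scores = {}
    for a in aa:
        for s in st:
            observed = pair[(a, s)] + pseudocount
            expected = aa[a] * st[s] / total + pseudocount
            scores[(a, s)] = math.log(observed / expected)
    return scores
```

The pseudocount keeps the logarithm defined for pairs that never occur in the data set, which matters for rare states.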
Materials & Methods -cont.
Two phases of template-based 8-state secondary structure prediction (architecture and encoding)
Neural Network Model
Results & Discussions
State     Q8      SOV8
G         43.99   47.96
H         92.48   95.19
I         0.00    0.00
E         88.30   92.77
B         27.86   27.57
S         43.46   45.32
T         64.18   66.64
C         75.51   71.45
Overall   78.85   80.10
7-fold cross-validation accuracy in template-based 8-state prediction
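For readers unfamiliar with the protocol, 7-fold cross validation splits the chains into seven folds and tests on each fold in turn while training on the other six. A minimal sketch, assuming a simple round-robin assignment of chains to folds; how the authors actually partitioned their data is not stated, and `k_fold_splits` is a hypothetical helper.

```python
def k_fold_splits(chain_ids, k=7):
    """Yield (train, test) lists for k-fold cross validation.

    ASSUMPTION: chains are dealt to folds round-robin, in input order.
    """
    folds = [chain_ids[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [c for j, fold in enumerate(folds) if j != i for c in fold]
        yield train, test
```

Every chain appears in exactly one test fold, so the reported accuracy covers the whole data set without any chain being scored by a model that was trained on it.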
            Q8                           SOV8
            No Template  With Template   No Template  With Template
CB513       67.22        79.39           67.66        80.64
CASP9       71.54        76.36           73.47        78.15
Manesh215   69.71        81.10           70.79        82.99
Carugo338   68.44        80.39           69.50        81.95
Comparison between 8-state predictions with and without template on Benchmarks
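The gain from templates on each benchmark follows directly from the table. A short sketch that recomputes the Q8 differences; the values are copied from the slide, while the names `BENCHMARK_Q8` and `template_gains` are illustrative.

```python
# (Q8 without template, Q8 with template), as reported on the slide.
BENCHMARK_Q8 = {
    "CB513": (67.22, 79.39),
    "CASP9": (71.54, 76.36),
    "Manesh215": (69.71, 81.10),
    "Carugo338": (68.44, 80.39),
}


def template_gains(table):
    """Return the Q8 gain in percentage points for each benchmark."""
    return {name: round(with_t - without, 2)
            for name, (without, with_t) in table.items()}
```

On these numbers the Q8 gain ranges from 4.82 points on CASP9 to 12.17 points on CB513, with Manesh215 at 11.39 and Carugo338 at 11.95.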
Distribution of 8-state secondary structure prediction accuracy (Q8) as a function of sequence similarity; the first group of bars corresponds to template-less predictions
Results & Discussions -cont.
Sequence similarity (%)   (0, 10]   (10, 20]   (20, 40]   (40, 70]   (70, 95]
# of chains               4,426     4,215      3,204      1,437      1,133
QH                        92.05     92.70      93.60      94.97      95.94
QG                        22.07     23.93      35.09      55.03      69.44
QI                        0.00      0.00       0.00       0.00       0.00
QE                        83.37     84.53      86.59      90.16      93.61
QB                        1.53      3.59       7.24       22.30      44.26
QT                        53.35     55.34      60.89      69.66      77.06
QS                        22.83     26.41      35.19      54.09      73.40
QC                        66.55     67.84      71.81      79.56      86.80
Q8                        71.33     73.01      76.29      82.11      88.01
Comparison of 7-fold cross validation prediction accuracies in eight states when templates with different sequence similarities are used
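The bins are half-open intervals that exclude the lower bound and include the upper bound. A small sketch of looking up the reported cross-validation Q8 for a given template similarity; the Q8 values are copied from the table, while `reported_q8` and the constant names are hypothetical.

```python
from bisect import bisect_left

# Upper edges of the bins (0, 10], (10, 20], (20, 40], (40, 70], (70, 95].
BIN_UPPER_EDGES = [10, 20, 40, 70, 95]
# Overall Q8 reported for each bin, in the same order.
Q8_BY_BIN = [71.33, 73.01, 76.29, 82.11, 88.01]


def reported_q8(similarity):
    """Return the tabulated Q8 for a sequence similarity in (0, 95]."""
    if not 0 < similarity <= 95:
        raise ValueError("similarity outside the tabulated range (0, 95]")
    # bisect_left puts a value equal to an edge into the bin it closes.
    return Q8_BY_BIN[bisect_left(BIN_UPPER_EDGES, similarity)]
```

A similarity of exactly 10 therefore falls in the first bin, while 10.5 falls in the second.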
Results & Discussions -cont.
Comparison between template-less and template-based predictions on 1BTN chain A
Working with C8-Scorpion
• Input title
• Input your sequence
• Input your e-mail
• Submit, then wait for the results...
“C8-Scorpion” available at: http://hpcr.cs.odu.edu/c8scorpion
Working with C8-Scorpion
• Check your e-mail
• Click the link provided
• The results are displayed
Summary
The effectiveness of using structural information in templates has been demonstrated in our computational results in 7-fold cross validation as well as on benchmarks, where enhancements of prediction accuracies are observed.
Overall, 78.85% Q8 accuracy and 80.10% SOV8 accuracy are achieved in 7-fold cross validation.
More importantly, when good templates are available, the prediction accuracies of less frequent secondary structure states, such as 3-10 helices, turns, and bends, improve substantially, which makes these predictions suitable for practical use.
A webserver (C8-Scorpion) implementing template-less 8-state secondary structure prediction is currently available at http://hpcr.cs.odu.edu/c8scorpion. The integration of template-based prediction into the C8-Scorpion webserver is currently under development.
Acknowledgement
This work is partially supported by NSF grant 1066471 and ODU 2013 Multidisciplinary Seed grant
Questions?
Thank You