Thomas Huber huber@maths.uq.edu.au
Computational Biology and Bioinformatics Environment
ComBinE
Department of Mathematics The University of Queensland
Protein Scoring Functions: Essential Tools or Fancy Fad?
Why do we (still) care about Protein Structures/Prediction?
• Academic curiosity?
  – Understanding how nature works
• Urgency of prediction
  – 10^4 structures are determined
    • insignificant compared to all proteins
  – sequencing = fast & cheap
  – structure determination = hard & expensive
[Figure legend: transistors in Intel processors; TrEMBL sequences (computer annotated); SwissProt sequences (annotated); structures in PDB]
What would we like to be able to predict?
• What is a protein’s structure?
  – Does a sequence adopt a known fold?
    • Fold recognition
  – Does a sequence adopt a new fold?
    • New fold prediction (dream of structural genomics)
• How stable is a protein?
  – Thermodynamic stability
• What is a protein’s function?
  – Functional annotation
Three basic choices in molecular modelling
• Representation
  – Which degrees of freedom are treated explicitly
• Scoring
  – Which scoring function (force field)
• Searching
  – Which method to search or sample conformational space
Two Lineages of Protein Structure Prediction
• The physicist’s approach
  – Thermodynamics: structures with low energy are more likely
• The biologist’s approach
  – Similar sequences → similar structures
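The physicist's statement that low-energy structures are more likely can be illustrated with Boltzmann weights. This is a minimal sketch, not part of the talk: the function name `boltzmann_probabilities`, the energy values and the choice `kT = 1.0` are all assumptions made for illustration.

```python
import math

def boltzmann_probabilities(energies, kT=1.0):
    """Relative probability of each candidate structure from its energy.

    Hypothetical helper: energies and kT must share the same
    (arbitrary) units.
    """
    e_min = min(energies)  # shift by the minimum for numerical stability
    weights = [math.exp(-(e - e_min) / kT) for e in energies]
    z = sum(weights)       # partition function over the candidates
    return [w / z for w in weights]

# Three made-up candidate structures; the lowest energy comes first.
probs = boltzmann_probabilities([-3.0, -1.0, 0.0])
```

The lowest-energy candidate receives the largest probability, and the probabilities sum to one over the candidates considered.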
Fragment Scoring
• Proteins are decomposed into overlapping fragments of 7 residues
• Each fragment is described by
  • Amino acid specific local structure
  • Non-specific environment
• Fragments are clustered and a statistical model for each cluster is built
• Total score = sum of fragment scores
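The decomposition and summation steps above can be sketched as follows. This is an illustration under stated assumptions only: the per-fragment score in sausage comes from the statistical cluster models, which are not described here, so `toy_fragment_score` is a hypothetical stand-in (it simply counts hydrophobic residues), and the example sequence is made up.

```python
def overlapping_fragments(sequence, length=7):
    """Split a sequence into all overlapping windows of `length` residues."""
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]

def total_score(sequence, fragment_score, length=7):
    """Total score = sum of the scores of all overlapping fragments."""
    return sum(fragment_score(f) for f in overlapping_fragments(sequence, length))

def toy_fragment_score(fragment):
    # Hypothetical placeholder, NOT the statistical model from the talk:
    # counts residues from an assumed hydrophobic set.
    return sum(1 for aa in fragment if aa in "AVILMFWC")

frags = overlapping_fragments("MKVLAAGIVG")          # 10 residues -> 4 fragments
score = total_score("MKVLAAGIVG", toy_fragment_score)
```

A sequence of N residues yields N - 6 fragments of length 7, so every interior residue contributes to several fragment scores.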
Finding Remote Homologues with sausage
• 572 sequence-structure pairs
  • Structures are similar (FSSP)
  • > 70% structurally aligned
  • < 20% sequence identity
[Figure: alignment quality (arb. units) versus sequence similarity weight (arb. units)]
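The selection criteria for the test pairs (< 20% sequence identity, > 70% structurally aligned) can be expressed as a small filter. This is a sketch under assumptions: the talk does not say how identity was computed, so the definition used here (identical residues divided by aligned, non-gap positions) and both function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over positions where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)

def is_remote_homologue_pair(identity_pct, structurally_aligned_pct):
    """Apply the two thresholds listed on the slide."""
    return identity_pct < 20.0 and structurally_aligned_pct > 70.0

identity = percent_identity("ACDE-G", "ACEEFG")  # 4 matches over 5 ungapped positions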
Testing/Breaking the Scoring
• Designed β-sheet (Serrano)
  – 12 residues
  – Forms stable β-sheet at room temperature
Another Uniquely Folded Mini-Protein
• Villin head-piece (36 residues)
  – High thermodynamic stability (Tm > 70 °C)
  – Folds autonomously
A Uniquely Folded Mini-Protein
• Zinc finger analogue (Mayo)
  – 28 residues
  – thermodynamically stable (Tm ≈ 25 °C)
Trimer Stability
• Nitrogen regulation proteins
  – 2 proteins (PII (GlnB) and GlnK)
  – 112 residues
  – sequence: 67% identities, 82% positives
  – structure: 0.7Å RMSD
  – trimeric
  – Dr S. Vasudevan: hetero-trimers
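For reference, the RMSD figure quoted above is the root-mean-square deviation between equivalent atoms of two structures. The sketch below assumes the two coordinate sets are already optimally superimposed and listed in matching order; the function name `rmsd` and the toy coordinates are illustrative and not taken from the talk.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z).

    Assumes the structures are already superimposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

# Two toy atoms, each displaced by 1 Å along z.
value = rmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
             [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)])
```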
Hetero-trimer Stability
• What is the most/least stable trimer?
• Why use a low resolution force field?
  – Structures differ (0.7Å RMSD)
  – Side chains are hard to optimise
• Calculation:
  – GlnB3 > GlnB2-GlnK > GlnB-GlnK2 > GlnK3
• Experiment:
  – GlnB3 > GlnB2-GlnK > GlnB-GlnK2 > GlnK3
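The four trimers being ranked are simply all compositions of three subunits drawn from GlnB and GlnK. A minimal sketch of enumerating them and checking that two rankings agree is given below; the helper `rankings_agree` is hypothetical, and no stability scores are computed here, since the force field itself is not described on the slide.

```python
from itertools import combinations_with_replacement

# All unordered trimer compositions from the two subunit types.
trimers = list(combinations_with_replacement(["GlnB", "GlnK"], 3))

def rankings_agree(calculated, experimental):
    """True if two stability orderings list the same trimers in the same order."""
    return list(calculated) == list(experimental)

# Orderings as stated on the slide, most stable first.
calculation = ["GlnB3", "GlnB2-GlnK", "GlnB-GlnK2", "GlnK3"]
experiment = ["GlnB3", "GlnB2-GlnK", "GlnB-GlnK2", "GlnK3"]
agree = rankings_agree(calculation, experiment)
```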
[Figure labels: GlnK, GlnB]
People
• sausage
  – Andrew Torda (RSC)
  – Oliver Martin (RSC)
• GlnB/GlnK, RdR polymerases
  – Subhash Vasudevan (JCU)
Sausage and Cassandra freely available http://rsc.anu.edu.au/~torda
Summary
• Increasing urgency for in-silico proteomics
• Good force fields = essential for success
  – Different tasks (may) require different scoring schemes