Protein Structure Prediction: What’s the big deal? Why is it important? Who is working on it? Different methods. Methods: comparative modeling, fold recognition or threading, ab initio folding, genetic algorithms.


Page 1:

Protein Structure Prediction

• What’s the big deal?
• Why is it important?
• Who is working on it?
• Different methods

Methods:
• Comparative modeling
• Fold recognition or threading
• ab initio folding
• Genetic algorithms

Page 2:

Why do we need structure prediction?

3D structure gives clues to function:

• active sites, binding sites, conformational changes...
• structure and function are conserved more than sequence

3D structure determination is difficult, slow and expensive

Intellectual challenge, Nobel prizes, etc.

Engineering new proteins

Page 3:

IEEE Computer, July 2002, page 27
Computational Biology’s Holy Grail

“When asked what the Holy Grail of computational biology is, most researchers would answer that it is either sequence-structure-function prediction or computing the genotype-phenotype map.”

Page 4:

Comparison of Different methods

Structure prediction
Summary of the four main approaches to structure prediction. Note that there are overlaps between nearly all categories.

Method            Knowledge          Approach                       Difficulty    Usefulness

Comparative       Proteins of        Identify related structure     Relatively    Very, if sequence
Modelling         known structure    with sequence methods, copy    easy          identity > 40%;
(Homology                            3D coords and modify where                   drug design
modelling)                           necessary

Fold              Proteins of        Same as above, but use more    Medium        Limited due to
Recognition       known structure    sophisticated methods to                     poor models
                                     find related structure
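The 40% sequence-identity figure quoted for comparative modelling can be made concrete with a small calculation. The following is a minimal sketch, assuming two sequences that have already been aligned to equal length with "-" marking gaps. The function name, the toy sequences, and the decision to count identity only over columns without gaps are illustrative assumptions; several definitions of percent identity are in use.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap columns are ignored in this definition
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Hypothetical toy alignment, not taken from the slides.
target = "MKT-AYIAKQR"
template = "MKSLAYLAKQ-"
identity = percent_identity(target, template)
print(f"{identity:.1f}% identity over aligned columns")
print("above 40% threshold:", identity > 40.0)
```

A template passing this check is only a candidate; the alignment itself still has to be trustworthy for the copied coordinates to be useful.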

Page 5:

Comparison of Different methods

Structure prediction
Summary of the four main approaches to structure prediction. Note that there are overlaps between nearly all categories.

Method            Knowledge          Approach                       Difficulty    Usefulness

Secondary         Sequence-          Forget 3D arrangement and      Medium        Can improve
Structure         structure          predict where the                            alignments, fold
Prediction        statistics         helices/strands are                          recognition, ab initio

ab initio         Energy             Simulate folding, or           Very hard     Not really
Tertiary          functions,         generate lots of structures
Structure         statistics         and try to pick the
Prediction                           correct one

Page 6:

Protein Structure Prediction

Instrumental methods for determining a protein's structure

• X-ray crystallography

• NMR spectroscopy

Page 7:

A Guide to Structure Prediction (version 2.1)
EMBL
Meyerhofstrasse 1, D-69117 Heidelberg
Germany
speedy.embl-heidelberg.de/gtsp/

Page 8:

Experimental Data

Much experimental data can aid the structure prediction process. Some of these are:

• Disulphide bonds, which provide tight restraints on the location of cysteines in space.
• Spectroscopic data, which can give you an idea as to the secondary structure content of your protein.
• Site-directed mutagenesis studies, which can give insights as to residues involved in active or binding sites.
• Knowledge of proteolytic cleavage sites and post-translational modifications, such as phosphorylation or glycosylation, which can suggest residues that must be accessible.

Remember to keep all of the available data in mind when doing predictive work. Always ask yourself whether a prediction agrees with the results of experiments. If not, then it may be necessary to modify what you've done.
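One way to act on this advice is to test a predicted model against known disulphide bonds. The sketch below assumes the model is available as a mapping from cysteine residue number to the coordinates of its sulphur (SG) atom in ångströms. The function name, the residue numbers, the coordinates, and the 2.5 Å cutoff are all hypothetical choices for illustration; a real S-S bond is close to 2.05 Å, so the cutoff leaves some slack.

```python
import math


def violated_disulphides(sg_coords, bonded_pairs, max_distance=2.5):
    """Return known Cys-Cys pairs whose SG atoms are too far apart in the model."""
    violations = []
    for res_i, res_j in bonded_pairs:
        distance = math.dist(sg_coords[res_i], sg_coords[res_j])
        if distance > max_distance:
            violations.append((res_i, res_j, round(distance, 2)))
    return violations


# Hypothetical model: residue number -> SG coordinates (x, y, z) in angstroms.
model_sg = {
    6: (0.0, 0.0, 0.0),
    48: (2.0, 0.0, 0.0),
    19: (10.0, 0.0, 0.0),
    77: (10.0, 7.5, 0.0),
}
# Hypothetical experimentally determined disulphide pairs.
known_bonds = [(6, 48), (19, 77)]

for res_i, res_j, distance in violated_disulphides(model_sg, known_bonds):
    print(f"Cys{res_i}-Cys{res_j} is {distance} A apart; the model disagrees with experiment")
```

Any pair reported here is a sign that the prediction needs to be revisited, in line with the advice above.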

Page 10:

The PSA Protein Structure Prediction Server
bmerc-www.bu.edu/psa/

The Protein Sequence Analysis (PSA) server predicts probable secondary structures and folding classes for a given amino acid sequence.

Used for proteins of unknown structure and for which no homologous sequences are known

Developed at: The BioMolecular Engineering Research Center (BMERC) of Boston University in Boston, Massachusetts, and TASC, Inc. in Reading, Massachusetts.

Email or webpage submissions
Return data in PDF or PS format

Page 12:

NNPREDICT
Protein Secondary Structure Prediction

www.cmpharm.ucsf.edu/~nomi/nnpredict.html

nnpredict is a program that predicts the secondary structure type for each residue in an amino acid sequence. The basis of the prediction is a two-layer, feed-forward neural network. The network weights were determined by a separate program, a modification of the Parallel Distributed Processing suite of McClelland and Rumelhart (1).
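To illustrate what a per-residue, feed-forward prediction looks like, here is a minimal sketch in plain Python. It is not nnpredict's code or its trained network: the window size, the number of hidden units, the one-hot input encoding, the state labels, and the random weights are all assumptions made for illustration, and "two-layer" is read here as two layers of weights (hidden and output). With untrained weights the output is meaningless; only the data flow is of interest.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
STATES = "HE-"  # helix, strand, other (labels assumed for this sketch)


def encode_window(sequence, center, half_width):
    """One-hot encode residues around `center`; positions off the ends stay all-zero."""
    features = []
    for pos in range(center - half_width, center + half_width + 1):
        one_hot = [0.0] * len(AMINO_ACIDS)
        if 0 <= pos < len(sequence):
            one_hot[AMINO_ACIDS.index(sequence[pos])] = 1.0
        features.extend(one_hot)
    return features


def layer(inputs, weights, biases):
    """Fully connected layer with a logistic (sigmoid) activation."""
    outputs = []
    for row, bias in zip(weights, biases):
        total = bias + sum(w * x for w, x in zip(row, inputs))
        outputs.append(1.0 / (1.0 + math.exp(-total)))
    return outputs


def random_matrix(rows, cols, rng):
    """Small random weights standing in for trained ones."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def predict(sequence, half_width=6, hidden_units=4, seed=0):
    """Assign one state per residue using untrained, randomly initialised weights."""
    rng = random.Random(seed)
    n_inputs = (2 * half_width + 1) * len(AMINO_ACIDS)
    w_hidden = random_matrix(hidden_units, n_inputs, rng)
    w_output = random_matrix(len(STATES), hidden_units, rng)
    b_hidden = [0.0] * hidden_units
    b_output = [0.0] * len(STATES)
    prediction = []
    for center in range(len(sequence)):
        hidden = layer(encode_window(sequence, center, half_width), w_hidden, b_hidden)
        output = layer(hidden, w_output, b_output)
        prediction.append(STATES[output.index(max(output))])
    return "".join(prediction)


# Hypothetical input sequence in one-letter codes.
print(predict("MKTAYIAKQRQISFVKSHFSRQ"))
```

In a real predictor the two weight matrices would come from training on proteins of known structure, which is the role the text assigns to the separate training program.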

Input a sequence consisting of one-letter amino acid codes (A C D E F G H I K L M N P Q R S T V W Y) or three-letter amino acid codes separated by spaces (ALA CYS ASP GLU PHE GLY HIS ILE LYS LEU MET ASN PRO GLN ARG SER THR VAL TRP TYR).

NOTE: B and Z are not recognized as valid amino acid codes.
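The input rules above can be checked before submitting a sequence. The sketch below is a hypothetical helper, not part of nnpredict: it accepts either notation, converts three-letter codes to one-letter codes, and rejects anything outside the twenty listed residues (including B and Z). Input made up entirely of valid three-letter tokens is treated as three-letter notation, which is an assumption about how to resolve the ambiguity.

```python
THREE_TO_ONE = {
    "ALA": "A", "CYS": "C", "ASP": "D", "GLU": "E", "PHE": "F",
    "GLY": "G", "HIS": "H", "ILE": "I", "LYS": "K", "LEU": "L",
    "MET": "M", "ASN": "N", "PRO": "P", "GLN": "Q", "ARG": "R",
    "SER": "S", "THR": "T", "VAL": "V", "TRP": "W", "TYR": "Y",
}
ONE_LETTER = set(THREE_TO_ONE.values())


def normalize_sequence(text):
    """Return the sequence in one-letter codes, or raise ValueError on invalid codes."""
    tokens = text.upper().split()
    # If every whitespace-separated token is a three-letter code, convert them.
    if tokens and all(token in THREE_TO_ONE for token in tokens):
        return "".join(THREE_TO_ONE[token] for token in tokens)
    # Otherwise treat the input as one-letter codes, ignoring whitespace.
    sequence = "".join(tokens)
    invalid = sorted(set(sequence) - ONE_LETTER)
    if invalid:
        raise ValueError(f"unrecognized amino acid codes: {', '.join(invalid)}")
    return sequence


print(normalize_sequence("ALA CYS ASP GLU"))
print(normalize_sequence("mktayiakqr"))
try:
    normalize_sequence("MKTBZ")
except ValueError as error:
    print("rejected:", error)
```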

Page 13:

Other Sources of Protein Structure Prediction

123d.ncifcrf.gov/sarf2.html
Common SARFs in protein structures. SARF stands for Spatial ARrangement of backbone Fragments.
Small alpha helix 1aca

Submit: 123d.ncifcrf.gov/run123D+.html

http://www.sbg.bio.ic.ac.uk/~3dpssm/
A Fast, Web-based Method for Protein Fold Recognition using 1D and 3D Sequence Profiles coupled with Secondary Structure and Solvation Potential Information.

Page 21:

UCSC HMM Applications

2GLIA. Chain A, Five-Finger [gi:2392684]
TITLE Crystal structure of a five-finger GLI-DNA complex: new perspectives on zinc fingers
SOURCE Homo sapiens (human)
See graphics (Secondary structure of 2gli) and 2GLIx500

Amino acid sequence:

>vyetdcrwdgcsqefdsqeqlvhhinsehihgerkefvchwggcsrelrpfkaqymlvvhmrrhtgekphkctfegcrksysrlenlkthlrshtgekpymcehegcskafsnasdrakhqnrthsnekpyvcklpgctkrytdpsslrkhvktvhgpda

Page 22:

Testing the UCSC HMM Application
Using a known protein

five-finger GLI on DNA

Page 23:

LIBRA I
Structure Prediction by Threading: Forward Folding Protocol
www.ddbj.nig.ac.jp/E-mail/libra/LIBRA_I.html

Compatible structures of a target sequence are sought from the structural library chosen from Protein Data Bank (PDB).

The target sequence and 3D profile are aligned by simple dynamic programming. According to the alignment, the sequence is re-mounted on the structure and its fitness is evaluated by a pseudo-energy potential.

Scores are sorted from the best match and shown together with their alignments.
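The forward-folding steps described above (align the sequence to each 3D profile by dynamic programming, then sort by score) can be sketched as follows. This is a generic global-alignment recursion with a linear gap penalty, not LIBRA's actual scoring scheme. The profile format (one dict of residue scores per structural position), the "buried" and "exposed" environments, the score values, and the fold names are all hypothetical, and the pseudo-energy re-mounting step is not reproduced.

```python
def align_to_profile(sequence, profile, gap_penalty=-2.0, mismatch_score=-1.0):
    """Global dynamic-programming alignment score of a sequence against a 3D profile.

    `profile` holds one dict per structural position, mapping a residue letter
    to a compatibility score; residues missing from a dict score `mismatch_score`.
    """
    n, m = len(sequence), len(profile)
    # score[i][j]: best score aligning the first i residues to the first j positions
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap_penalty
    for j in range(1, m + 1):
        score[0][j] = j * gap_penalty
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            residue_score = profile[j - 1].get(sequence[i - 1], mismatch_score)
            match = score[i - 1][j - 1] + residue_score
            skip_residue = score[i - 1][j] + gap_penalty
            skip_position = score[i][j - 1] + gap_penalty
            score[i][j] = max(match, skip_residue, skip_position)
    return score[n][m]


def rank_library(sequence, library):
    """Score the sequence against every profile and sort from the best match."""
    results = [(align_to_profile(sequence, profile), name)
               for name, profile in library.items()]
    return sorted(results, reverse=True)


# Hypothetical structural environments and library, for illustration only.
buried = {"L": 2.0, "I": 2.0, "V": 1.5, "F": 1.5}
exposed = {"K": 2.0, "E": 2.0, "D": 1.5, "S": 1.0}
library = {
    "fold_A": [buried, exposed, buried, exposed],
    "fold_B": [exposed, exposed, buried, buried],
}

for total, name in rank_library("LKVE", library):
    print(name, total)
```

Recovering the alignment itself would need a traceback through the score table, which is omitted here to keep the sketch short.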

Page 24:

LIBRA I
Sequence Homology Search by Threading: Inverse Folding

Compatible sequences of a target structure are sought from the sequence database (Swiss-Prot). Scores are sorted from the best match and shown together with their alignments.

A recent study revealed that it is suitable in this search to use the 3D-1D alignment score per se as the compatibility score rather than the sequence re-mounting score.

Page 25:

The problem of predicting protein structure from sequence remains fundamentally unsolved despite more than three decades of intensive research effort.

The search has been driven by the belief that the 3D structure of a protein is determined by its amino acid sequence (Anfinsen, 1973). While it is now known that chaperones often play a role in the folding pathway, and in correcting misfolds (Corrales and Fersht, 1996; Hartl et al., 1994), it is believed that the final structure is at the free-energy minimum. Thus, all information needed to predict the native structure of a protein is contained in the amino acid sequence, plus a knowledge of its native solution environment.

Page 26:

Ab initio prediction of protein structure from sequence: not yet

Given only the amino acid sequence, it should be possible in principle to directly predict protein structure from physico-chemical principles using, for example, molecular dynamics methods (Levitt and Warshel, 1975). In practice, however, such approaches are frustrated by the enormous complexity of the calculation (requiring many orders of magnitude more computing time than is currently feasible) and by inaccuracies in the experimental determination of basic parameters (van Gunsteren, 1993; Shortle et al., 1996). Thus, the most successful structure prediction tools are knowledge-based, using a combination of statistical theory and empirical rules.

Page 27:

Odyssey of evolution teaches us structure prediction

It appears that for most proteins, almost all residues can be changed without affecting the structure (Rost et al., 1996b); however, a single, randomly chosen mutation is more likely to destabilize than to maintain a particular structure. Thus, the precise pattern of amino acid exchanges observed in a multiple sequence alignment of a protein family is highly indicative of the particular structure. These patterns constitute a fossil record of mutations preserving protein structure and function. The importance of such evolutionary information for structure prediction was realized very early and has long been exploited in exceptional cases by experts, as well as in automatic and systematic ways. More recently, the use of evolutionary information has grown in importance. This was made particularly clear when it was shown that the accuracy of secondary structure prediction improved to over 70% through the use of evolutionary information.
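The "pattern of amino acid exchanges" in a multiple sequence alignment can be summarised per column. The sketch below computes the Shannon entropy of each column as one common measure of variability; it is an illustration under stated assumptions, not the method used in the work cited above. The function name and the toy family are hypothetical, and a gap character, if present, would simply be counted as another symbol.

```python
import math
from collections import Counter


def column_entropies(alignment):
    """Shannon entropy (bits) of each alignment column; 0 means fully conserved."""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    entropies = []
    for column in zip(*alignment):
        counts = Counter(column)
        total = len(column)
        entropy = sum(-(c / total) * math.log2(c / total) for c in counts.values())
        entropies.append(entropy)
    return entropies


# Hypothetical toy family of aligned sequences, for illustration only.
family = [
    "GLKVA",
    "GIKVS",
    "GLRVT",
    "GVKVA",
]

for position, entropy in enumerate(column_entropies(family), start=1):
    label = "conserved" if entropy == 0.0 else "variable"
    print(position, round(entropy, 2), label)
```

Columns with zero entropy are candidates for structurally or functionally essential positions, while the set of residues tolerated at variable columns (here hydrophobic L, I, V at position 2) is the kind of exchange pattern the text describes.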

Page 28:

Genetic programming for protein structure prediction

S. Sun. Reduced representation model of protein structure prediction: statistical potential and genetic algorithms. Protein Science, vol. 2, no. 5, pp. 762-785, 1993.

Gary B. Lamont, Charles Kaiser, George Gates, Laurence Merkle, and Ruth Pachter. Real-Valued Genetic Algorithm Case Studies in Protein Structure Prediction. Proceedings of the SIAM Conference on Parallel Applications, March 1997.

Natalio Krasnogor, Daniel H. Marcos, David Pelta, and Walter A. Risi. Protein Structure Prediction as a Complex Adaptive System. Frontiers in Evolutionary Algorithms (FEA98), 1998.

Natalio Krasnogor, Bill Hart, Jim Smith, and David Pelta. Protein Structure Prediction With Evolutionary Algorithms. Proceedings of the 1999 International Genetic and Evolutionary Computation Conference (GECCO99).

Page 29:

Some source pages
http://www.sbc.su.se/~maccallr/
http://scpd.stanford.edu/SOL/courses/proEd/RACMB/vidList.htm