25
Homework grades 0-14 (D) 15-19 (C) 20-24 (B) 25- 30(A) HW1 5 1 9 5 HW2 0 6 4 9

Homework grades

  • Upload
    geona

  • View
    21

  • Download
    0

Embed Size (px)

DESCRIPTION

Homework grades. Protein folding. Anfinsen’s experiments. Assumption: amino acid sequence completely and uniquely determines the protein tertiary structure. Protein folding problem: find native conformation among the large number of alternative conformations. - PowerPoint PPT Presentation

Citation preview

Page 1: Homework grades

Homework grades

0-14 (D) 15-19 (C) 20-24 (B) 25-30(A)

HW1 5 1 9 5

HW2 0 6 4 9

Page 2: Homework grades

Protein folding.Anfinsen’s experiments.

Assumption: amino acid sequence completely and uniquely determines the protein tertiary structure.

Protein folding problem: find native conformation among the large number of alternative conformations.

Ex: polypeptide chain of 100 residues can have ~ 9^100 different conformations.

Page 3: Homework grades

Protein folding: step by step.

Extended chain Disordered globule: hydrophobic – inside, hydrophilic - outside Native highly ordered

conformation

Factors which influence the thermodynamics and kinetics of protein folding:

- size, amino acid content, hydrophobic/hydrophilic content

- strength of intramolecular interactions, number of S-S bonds

- domain architecture

Page 4: Homework grades

Fold recognition.

Unsolved problem: direct prediction of protein structure from the physico-chemical principles.

Solved problem: to recognize, which of known folds are similar to the fold of unknown protein.

Fold recognition is based on observations/assumptions:- The overall number of different protein folds is limited

(1000-3000 folds)

- The native protein structure is in its ground state (minimum energy)

Page 5: Homework grades

Definition of protein folds.Protein fold – arrangement of secondary structures into a unique

topology/tertiary structure.Example of alpha+beta proteins:

•TIM beta/alpha-barrel contains parallel beta-sheet barrel, closed; n=8, S=8; strand order 12345678, surrounded by alpha-helices

•NAD(P)-binding Rossmann-fold domains core: 3 layers, a/b/a; parallel beta-sheet of 6 strands, •order 321456

Page 6: Homework grades

Protein structure prediction.

Prediction of three-dimensional structure from its protein sequence. Different approaches:

- Homology modeling (predicted structure has a very close homolog in the structure database).

- Fold recognition (predicted structure has an existing fold).

- Ab initio prediction (predicted structure has a new fold).

Page 7: Homework grades

Homology modeling.

Aims to produce protein models with accuracy close to experimental and is used for:

- Protein structure prediction- Drug design- Prediction of functionally important sites (active

or binding sites)

Page 8: Homework grades

Steps of homology modeling.

1. Template recognition & initial alignment.2. Backbone generation.3. Loop modeling.4. Side-chain modeling.5. Model optimization.6. Model validation.

Page 9: Homework grades

1. Template recognition.Recognition of similarity between the target and template.

Target – protein with unknown structure.

Template – protein with known structure.

Main difficulty – deciding which template to pick, multiple choices/template structures.

Template structure can be found by searching for structures in PDB using sequence-sequence alignment methods.

Page 10: Homework grades

Two zones of sequence alignment.Two sequences are guaranteed to fold into the same

structure if their length and sequence identity fall into “safe” zone.

50 100 150 200

50

100Safe homology modeling zone

Twilight zone

Alignment length

Sequence identity

Page 11: Homework grades

2. Backbone generation.

If alignment between target and template is ready, copy the backbone coordinates of those template residues that are aligned.

If two aligned residues are the same, copy their side chain coordinates as well.

Page 12: Homework grades

3. Insertions and deletions. insertion AHYATPTTT AH---TPSS deletion Occur mostly between secondary structures, in the loop

regions. Loop conformations – difficult to predict.

Approaches to loop modeling:- Knowledge-based: searches the PDB for loops with

known structure- Energy-based: an energy function is used to evaluate

the quality of a loop. Energy minimization or Monte Carlo.

Page 13: Homework grades

4. Side chain modeling.Side chain conformations – rotamers. In similar proteins -

side chains have similar conformations. If % identity is high - side chain conformations can be copied

from template to target.

If % identity is not very high - modeling of side chains using libraries of rotamers and different rotamers are scored with energy functions.

E1

E2

E3

E = min(E1, E2, E3)

Page 14: Homework grades

5. Model optimization.

Energy optimization of entire structure.

Since conformation of backbone depends on conformations of side chains and vice versa - iteration approach:

Predict rotamers Shift in backbone

Page 15: Homework grades

6. Model validation.

• Correct bond length and bond angles

• Correct placement of functionally important sites

>> 3.8 Angstroms

Page 16: Homework grades

Classwork I: Homology modeling.

- Go to NCBI Entrez, search for gi461699- Do Blast search against PDB- Do CD-search.

Page 17: Homework grades

Fold recognition.Goal: to find in PDB a fold which best matches a

given sequence.

Since similarity between target and the closest to it template is not high, sequence-sequence alignment methods fail to find a closest match.

Solution: threading – sequence-structure alignment method.

Page 18: Homework grades

Threading – method for structure prediction.

Sequence-structure alignment, target sequence is compared to all structural templates from the database.

Requires:- Alignment method (dynamic programing, Monte

Carlo,…)- Scoring function, which yields relative score for

each alternative alignment

Page 19: Homework grades

Scoring function for threading.

Contact-based scoring function depends on the amino acid types of two residues and distance between them.

Sequence-sequence alignment scoring function does not depend on the distance between two residues.

If distance between two non-adjacent residues in the template is less than 8 Å, these residues make a contact.

Page 20: Homework grades

Scoring function for threading.

),(),(;),(1,

TrpIlewTyrAlawSaawSN

jiji

Ala

Ile Tyr

Trp

w is calculated from the frequency of amino acid contacts in protein structures; ai – amino acid type of target sequence aligned with the position “i” of the template; N- number of contacts

Page 21: Homework grades

Classwork II: calculate the score for target sequence “ATPIIGGLPY” aligned to template

structure which is defined by the contact matrix.

1 2 3 4 5 6 7 8 9 10

1 * * *

2

3 *

4 *

5 * *

6 *

7 *

8 *

9

10 * *

A T P Y I G L

A -0.2 -0.1 0 -0.1 0.5 -0.2 0.2

T 0.3 -0.1 -0.2 -0.3 0.1 0

P -0.2 -0.4 -0.1 0.1 -0.2

Y -0.4 -0.2 -0.1 -0.2

I 0.3 0.2 0.4

G 0.4 0.2

L 0.3

Page 22: Homework grades

GenThreader http://bioinf.cs.ucl.ac.uk/psipred.

1. Predicts secondary structures for target sequence.

2. Makes sequence profiles (PSSMs) for each template sequence.

3. Uses threading scoring function to find the best matching profile.

4. Evaluates sequence-structure quality using neural networs.

Page 23: Homework grades

Evaluation of model accuracy

2)()1(

2''

Bji

N

ji

Aij dd

NNRMSD

Root Mean Square Deviation

dij (A) - distance between residues i and j in template structure “A”;

di’j’ (B) – distance between residues i’and j’ in predicted structure of target sequence “B”.

Residues i and j in template structure “A” are aligned to residues i’and j’ in predicted structure “B”

Page 24: Homework grades

CASP prediction competitions.

Page 25: Homework grades

Classwork III.

- Go to http://bioinf.cs.ucl.ac.uk/psipred- Go over the options of protein structure

prediction programhttp://bioinf2.cs.ucl.ac.uk/psiout/271020e9dc74fead.mgen.html