Ab-initio protein structure prediction


Chen Keasar, BGU

Any educational usage of these slides is welcomed. Please acknowledge. keasar@cs.bgu.ac.il

……TVFAIYDYDFK…..

…… TEDDAGSFHEK ……

…… TLUNSGDGDWW ……

…… TGYVGSSYV ……

The problem: Predict the three-dimensional structure of a protein based on its sequence.


How can we predict protein structures?

Are we lucky?
• Yes: homology modeling.
• A bit: fold recognition.
• No: ab initio.

Why is ab-initio prediction hard?


Ab-initio is hard, why do it?
Wait until enough proteins are solved and use homology modeling/fold-recognition.

Because it’s there


• Because homology modeling tells us nothing about the physical nature of protein folding and stability.

• Because ab-initio methods can augment fold-recognition and homology (refinement, large loops, side chains).

• Because of ORFans (orphan ORFs).

• Because it can ease experimental structure determination.

• Because prediction is the basis of design.


ab-initio protein structure prediction

Simulation of the actual folding process
1. Build an accurate initial model (including energy and forces).
2. Accurately simulate the dynamics of the system.
3. The native structure will emerge.

Optimization problem
1. Define some initial model.
2. Define a function mapping structures to numerical values (the lower the better).
3. Solve the computational problem of finding the global minimum.
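Step 2 of the simulation route can be illustrated with a minimal sketch of velocity Verlet, one common integration scheme in molecular dynamics. The slides do not say which integrator is meant, so the scheme, the one-dimensional "spring" system and all parameter values below are assumptions chosen only for illustration.

```python
import math

def velocity_verlet(position, velocity, force, mass, dt, n_steps):
    """Advance one particle in 1-D by n_steps of size dt (velocity Verlet)."""
    f = force(position)
    for _ in range(n_steps):
        velocity += 0.5 * dt * f / mass   # half-step velocity update
        position += dt * velocity         # full-step position update
        f = force(position)               # forces at the new position
        velocity += 0.5 * dt * f / mass   # second half-step velocity update
    return position, velocity

# Hypothetical toy system: unit mass on a unit-stiffness spring, E = x^2 / 2.
def spring_force(x):
    return -x

x_end, v_end = velocity_verlet(1.0, 0.0, spring_force,
                               mass=1.0, dt=0.01, n_steps=1000)

# Kinetic plus potential energy; it should stay close to the initial 0.5.
total_energy = 0.5 * v_end**2 + 0.5 * x_end**2
```

For a real protein the force call is the expensive part: it has to be evaluated for every atom at every step.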

Simulating the actual folding process

Figure: a CHOOH dimer.

Model I – quantum description of the system


Model II

Semi-empirical energy functions – forcefields

Classical world: no quantum effects (that is, no chemistry).
Parameterized to reproduce experimental results for small molecules. Their use for proteins is an extrapolation.
The basic element is an atom:
• Unbreakable.
• Represented by the X, Y, Z coordinates of its center.
• Its attributes (volume, charge, mass etc.) are the basic parameters of the energy function.
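As a rough sketch of how per-atom attributes feed an energy function, the code below sums a Lennard-Jones term and a Coulomb term over all atom pairs. This covers only the non-bonded part of a typical force field (real ones also include bond, angle and torsion terms). The `Atom` class, the combination rule and the parameter names are assumptions made for illustration, not taken from any specific force field.

```python
import math
from dataclasses import dataclass

@dataclass
class Atom:
    x: float
    y: float
    z: float
    charge: float   # partial charge, in units of the electron charge
    sigma: float    # Lennard-Jones size parameter, in angstroms
    epsilon: float  # Lennard-Jones well depth, in kcal/mol

# Approximate Coulomb constant in kcal*angstrom/(mol*e^2).
COULOMB_CONSTANT = 332.06

def distance(a, b):
    return math.sqrt((a.x - b.x) ** 2 + (a.y - b.y) ** 2 + (a.z - b.z) ** 2)

def pair_energy(a, b):
    """Lennard-Jones plus Coulomb energy of one atom pair."""
    r = distance(a, b)
    sigma = 0.5 * (a.sigma + b.sigma)            # arithmetic mean (assumed rule)
    epsilon = math.sqrt(a.epsilon * b.epsilon)   # geometric mean (assumed rule)
    sr6 = (sigma / r) ** 6
    lennard_jones = 4.0 * epsilon * (sr6 ** 2 - sr6)
    coulomb = COULOMB_CONSTANT * a.charge * b.charge / r
    return lennard_jones + coulomb

def nonbonded_energy(atoms):
    """Sum the pair energy over all distinct atom pairs."""
    total = 0.0
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            total += pair_energy(atoms[i], atoms[j])
    return total
```

The double loop is quadratic in the number of atoms, which is one reason each time step is costly.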


The good news
The model is rather accurate and correctly describes many natural phenomena.

The bad news
• Each time step is hard to compute.
• On the order of 10^12 steps are needed to simulate protein folding.
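A back-of-the-envelope check of what 10^12 steps means, as a sketch. The 1 fs time step is a typical molecular dynamics value but is an assumption here, and the cost of 1 ms of computer time per step is purely hypothetical.

```python
STEPS_NEEDED = 10 ** 12           # from the slide
TIMESTEP_SECONDS = 1e-15          # assumption: 1 femtosecond per step
WALL_SECONDS_PER_STEP = 1e-3      # hypothetical: 1 ms of computing per step
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Physical time covered by the simulation (about one millisecond).
simulated_seconds = STEPS_NEEDED * TIMESTEP_SECONDS

# Computer time needed at the hypothetical per-step cost (decades).
wall_clock_years = STEPS_NEEDED * WALL_SECONDS_PER_STEP / SECONDS_PER_YEAR

print(f"simulated time: {simulated_seconds:.1e} s")
print(f"wall-clock time: {wall_clock_years:.1f} years")
```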


Ab-initio protein structure prediction as an optimization problem

1. Define a function that maps protein structures to some quality measure.
2. Solve the computational problem of finding an optimal structure.

Figure: energy as a function of conformation.
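The two steps can be sketched on a toy problem. Below, a single number `x` stands in for a whole conformation, `toy_energy` is a hypothetical rugged landscape, and simulated annealing is used as one possible search method; none of these choices come from the slides.

```python
import math
import random

def toy_energy(x):
    """Step 1: a hypothetical function mapping a 'conformation' to a score."""
    return 0.1 * x * x + math.sin(5.0 * x)

def simulated_annealing(energy, start, steps=20000, step_size=0.5,
                        t_start=2.0, t_end=0.01, seed=0):
    """Step 2: search for a low-energy point with Metropolis moves."""
    rng = random.Random(seed)
    current, current_e = start, energy(start)
    best, best_e = current, current_e
    for i in range(steps):
        # Geometric cooling from t_start down to t_end.
        temperature = t_start * (t_end / t_start) ** (i / (steps - 1))
        candidate = current + rng.uniform(-step_size, step_size)
        candidate_e = energy(candidate)
        delta = candidate_e - current_e
        # Always accept downhill moves; accept uphill moves with
        # probability exp(-delta / temperature).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_e = candidate, candidate_e
            if current_e < best_e:
                best, best_e = current, current_e
    return best, best_e

best_x, best_e = simulated_annealing(toy_energy, start=4.0)
```

Nothing guarantees that the returned point is the global minimum; the search only reports the best point it happened to visit.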

A dream function
• Has a clear minimum in the native structure.
• Has a clear path towards the minimum.
• Global optimization algorithm should find the native structure.

An approximate function
• Easier to design and compute.
• Native structure not always the global minimum.
• Global optimization methods do not converge.
• Many alternative models (decoys) should be generated.
• No clear way of choosing among them.

Figure: decoy set.
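One simple way to picture a decoy set, as a sketch: run a local minimizer from many random starting points and keep every result. The one-dimensional `approximate_energy` and the greedy descent are hypothetical stand-ins for a real energy function and a real search method.

```python
import math
import random

def approximate_energy(x):
    """Hypothetical rugged landscape with several local minima."""
    return 0.1 * x * x + math.sin(5.0 * x)

def local_minimize(energy, x, step=0.01, max_iters=10000):
    """Greedy descent: step left or right for as long as the energy drops."""
    e = energy(x)
    for _ in range(max_iters):
        moved = False
        for candidate in (x - step, x + step):
            candidate_e = energy(candidate)
            if candidate_e < e:
                x, e, moved = candidate, candidate_e, True
                break
        if not moved:
            break
    return x, e

def generate_decoys(energy, n_decoys=20, low=-5.0, high=5.0, seed=0):
    """Minimize from random starts; return (x, energy) pairs, lowest first."""
    rng = random.Random(seed)
    decoys = [local_minimize(energy, rng.uniform(low, high))
              for _ in range(n_decoys)]
    return sorted(decoys, key=lambda decoy: decoy[1])

decoys = generate_decoys(approximate_energy)
```

Sorting by energy gives a ranking, but with an approximate function the lowest-energy decoy need not be the one closest to the native structure.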

Energy functions:
• Typically include terms for hydrophobicity, hydrogen bonds etc.
• Typically based on the distribution of structural features (say, contacts between alanine residues and arginine residues) in the PDB.
• The more frequent the feature, the lower the energy associated with it.

Assumptions:
• These features are independent.
• The proteins in the PDB are a representative sample of conformation space.

A small problem: these assumptions are wrong.

A brilliant solution: ignore it.
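A common way to turn feature frequencies into energies is the inverse Boltzmann relation, E = -kT ln(observed / expected). The slides do not state which formula is used, so treat the sketch below as one possible reading; all counts are made-up numbers, not PDB statistics.

```python
import math

KT = 0.59  # roughly kT in kcal/mol near room temperature

# Hypothetical contact counts "observed" in a structure database.
observed_contacts = {
    ("ALA", "ARG"): 120,
    ("ALA", "LEU"): 480,
    ("ARG", "ARG"): 30,
}

# Hypothetical counts expected if residues paired at random.
expected_contacts = {
    ("ALA", "ARG"): 200.0,
    ("ALA", "LEU"): 300.0,
    ("ARG", "ARG"): 60.0,
}

def statistical_potential(observed, expected, kt=KT):
    """More frequent than expected gives negative (favorable) energy."""
    return {pair: -kt * math.log(observed[pair] / expected[pair])
            for pair in observed}

def score_structure(contacts, potential):
    """Sum per-contact energies; this treats the contacts as independent."""
    return sum(potential[pair] for pair in contacts)

potential = statistical_potential(observed_contacts, expected_contacts)
model_score = score_structure(
    [("ALA", "LEU"), ("ALA", "LEU"), ("ALA", "ARG")], potential)
```

The plain sum in `score_structure` is where the independence assumption enters.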


Figure: ab-initio methods charted by basic element and by representation of conformation space.

Basic element: electrons & protons, atom, heavy (extended) atom, half a residue, residue, some residues.

Conformation space: diamond lattice, fine square lattice, torsion angle lattice, fragments, continuous.

Methods shown on the chart: Hinds & Levitt; Park & Levitt; Skolnick 1998; Skolnick 2000; Scheraga 1998; Baker (Rosetta); Levitt & Keasar; Levitt 1976; Osguthorpe; Jones; AMBER, ECEPP, CHARMM, OPLS, ENCAD, GROMOS. Part of the chart is labelled "Not really ab-initio".

Apparently the best current method
