
Molecular Modeling Methods & Ab Initio Protein Structure Prediction, by Haiyan Jiang, Oct. 16, 2006


About me

2003, Ph.D. in Computational Chemistry, University of Science and Technology of China

Research: New algorithms in molecular structure optimization

2004~2006, Postdoc, Computational Biology, Dalhousie University

Research: Protein loop structure and the evolution of protein domains

Publications

Haiyan Jiang, Christian Blouin, Ab Initio Construction of All-atom Loop Conformations, Journal of Molecular Modeling, 2006, 12, 221-228.

Ferhan Siddiqi, Jennifer R. Bourque, Haiyan Jiang, Marieke Gardner, Martin St. Maurice, Christian Blouin, and Stephen L. Bearne, Perturbing the Hydrophobic Pocket of Mandelate Racemase to Probe Phenyl Motion During Catalysis, Biochemistry, 2005, 44, 9013-9021. (Responsible for building the simulation model and performing the molecular dynamics study)

Yuhong Xiang, Haiyan Jiang, Wensheng Cai, and Xueguang Shao, An Efficient Method Based on Lattice Construction and the Genetic Algorithm for Optimization of Large Lennard-Jones Clusters, Journal of Physical Chemistry A, 2004, 108, 3586-3592.

Xueguang Shao, Haiyan Jiang, Wensheng Cai, Parallel Random Tunneling Algorithm for Structural Optimization of Lennard-Jones Clusters up to N=330, Journal of Chemical Information and Computer Sciences, 2004, 44, 193-199.

Publications

Haiyan Jiang, Wensheng Cai, Xueguang Shao, New Lowest Energy Sequence of Marks’ Decahedral Lennard-Jones Clusters Containing up to 10000 atoms, Journal of Physical Chemistry A, 2003, 107, 4238-4243.

Wensheng Cai, Haiyan Jiang, Xueguang Shao, Global Optimization of Lennard-Jones Clusters by a Parallel Fast Annealing Evolutionary Algorithm, Journal of Chemical Information and Computer Sciences, 2002, 42, 1099-1103.

Haiyan Jiang, Wensheng Cai, Xueguang Shao, A Random Tunneling Algorithm for Structural Optimization Problem, Physical Chemistry Chemical Physics, 2002, 4, 4782-4788.

Xueguang Shao, Haiyan Jiang, Wensheng Cai, Advances in Biomolecular Computing, Progress in Chemistry (Chinese), 2002, 14, 37-46.

Haiyan Jiang, Longjiu Cheng, Wensheng Cai, Xueguang Shao, The Geometry Optimization of Argon Atom Clusters Using a Parallel Genetic Algorithm, Computers and Applied Chemistry (Chinese), 2002, 19, 9-12.

Unpublished work

Haiyan Jiang, Christian Blouin, The Emergence of Protein Novel Fold and Insertions: A Large Scale Structure-based Phylogenetic Study of Insertions in SCOP Families, Protein Science, 2006. (under review)

Contents

Molecular modeling methods and applications in ab initio protein structure prediction

Potential energy function

Energy Minimization

Monte Carlo

Molecular Dynamics

Ab initio protein loop modeling

Challenge

Recent progress

CLOOP

Molecular Modeling Methods

Molecular modeling methods are the theoretical methods and computational techniques used to simulate the behavior of molecules and molecular systems.

Molecular Forcefields

Conformational Search methods

Energy Minimization

Molecular Dynamics

Monte Carlo simulation

Genetic Algorithm

Ab Initio Protein Structure Prediction

Ab initio protein structure prediction methods build protein 3D structures from sequence based on physical principles.

Importance

Ab initio methods are important even though they are computationally demanding.

Because they predict protein structure from physical models, they are an indispensable complement to knowledge-based approaches.

For example, a knowledge-based approach fails under the following conditions:

Structural homologues are not available

The protein may adopt a new fold that has not yet been discovered

Applications of MM in Ab Initio PSP

Basic idea

Anfinsen’s theory: The native structure of a protein corresponds to the state with the lowest free energy of the protein-solvent system.

General procedures

Potential function

Evaluate the energy of a protein conformation

Select the native structure

Conformational search algorithm

To produce new conformations

Search the potential energy surface and locate the global minimum (native conformation)

Protein Folding Funnel

[Figure: folding funnel energy landscape, labeled with local minima, the global minimum, and the native structure]

Potential Functions for PSP

Potential function

Physics-based energy function

Empirical all-atom forcefields: CHARMM, AMBER, ECEPP-3, GROMOS, OPLS

Parameterization: Quantum mechanical calculations, experimental data

Simplified potential: UNRES (united residue)

Solvation energy

Implicit solvation model: Generalized Born (GB) model, surface area based model

Explicit solvation model: TIP3P (computationally expensive)

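General Form of All-atom Forcefields

$$
V_{total} = \sum_{bonds} K_b (r - r_0)^2
+ \sum_{angles} K_\theta (\theta - \theta_0)^2
+ \sum_{dihedrals} K_\phi \left[ 1 + \cos(n\phi - \delta) \right]
+ \sum_{Hbonds} \left( \frac{C_{ij}}{r_{ij}^{12}} - \frac{D_{ij}}{r_{ij}^{10}} \right)
+ \sum_{\text{van der Waals pairs } i,j} \left( \frac{A_{ij}}{r_{ij}^{12}} - \frac{B_{ij}}{r_{ij}^{6}} \right)
+ \sum_{\text{electrostatic pairs } i,j} \frac{q_i q_j}{\varepsilon r_{ij}}
$$

Bond stretching term (bond length r)

Angle bending term (bond angle θ)

Dihedral term (dihedral angle φ)

H-bonding term (O···H distance r)

Van der Waals term (pair distance r)

Electrostatic term (distance r between charges + and −)

The non-bonded pair terms are the most time-demanding part.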
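As a rough illustration of why the pair terms are costly, the sketch below evaluates only the van der Waals (A/r^12 − B/r^6) and electrostatic (q_i q_j / (ε r)) sums for a few point particles. The function name `nonbonded_energy`, the single shared A and B coefficients, and the reduced units with ε = 1 are assumptions made for the example; a real force field assigns parameters per atom type.

```python
import math
from itertools import combinations

def nonbonded_energy(coords, charges, A=1.0, B=2.0, eps=1.0):
    """Sum A/r^12 - B/r^6 and q_i*q_j/(eps*r) over all unique pairs i < j."""
    e_vdw = 0.0
    e_elec = 0.0
    # Every pair is visited once, so the work grows as N*(N-1)/2
    for i, j in combinations(range(len(coords)), 2):
        r = math.dist(coords[i], coords[j])
        e_vdw += A / r**12 - B / r**6
        e_elec += charges[i] * charges[j] / (eps * r)
    return e_vdw, e_elec

# Three hypothetical particles in reduced units
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
charges = [0.5, -0.5, 0.0]
e_vdw, e_elec = nonbonded_energy(coords, charges)
```

With N atoms the loop runs over N(N−1)/2 pairs, whereas the bonded terms grow only linearly with N, which is why production codes rely on cutoffs and neighbour lists.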

Search Potential Energy Surface

We are interested in the minimum points on the Potential Energy Surface (PES)

Conformational search techniques

Energy Minimization

Monte Carlo

Molecular Dynamics

Others: Genetic Algorithm, Simulated Annealing

Energy Minimization

Energy minimization

Methods

First-order minimization: steepest descent, conjugate gradient minimization

Second derivative methods: Newton-Raphson method

Quasi-Newton methods: L-BFGS

Local minimum
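Energy minimization only walks downhill, so it stops at the local minimum nearest the starting point. A minimal steepest descent sketch shows this on a toy one-dimensional double-well surface; the surface, the fixed step size, and the function names are assumptions chosen for illustration, not a molecular potential.

```python
def energy(x):
    # Toy double-well surface (assumed for illustration only)
    return x**4 - 2.0 * x**2 + 0.3 * x

def gradient(x):
    # Analytical derivative of the toy surface
    return 4.0 * x**3 - 4.0 * x + 0.3

def steepest_descent(x, step=0.05, tol=1e-8, max_iter=10_000):
    """Move against the gradient with a fixed step until the gradient vanishes."""
    for _ in range(max_iter):
        g = gradient(x)
        if abs(g) < tol:
            break
        x -= step * g
    return x

right = steepest_descent(1.5)    # ends in the shallower well (local minimum)
left = steepest_descent(-1.5)    # ends in the deeper well (global minimum)
```

Both runs converge, but only the start at −1.5 reaches the deeper well, which is why minimization is combined with a global search method.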

Monte Carlo

Monte Carlo

In molecular simulations, ‘Monte Carlo’ is an importance sampling technique.

1. Make a random move and produce a new conformation

2. Calculate the energy change ΔE for the new conformation

3. Accept or reject the move based on the Metropolis criterion

$$P = \exp\left(-\frac{\Delta E}{kT}\right)$$

Boltzmann factor

If ΔE < 0, then P > 1: accept the new conformation;

Otherwise: accept if P > rand(0,1), else reject.

Monte Carlo

Monte Carlo (MC) algorithm

Generate initial structure R and calculate E(R);

Modify structure R to R’ and calculate E(R’);

Calculate ΔE = E(R’) − E(R);

IF ΔE < 0, then R ← R’;

ELSE

Generate random number RAND = rand(0,1);

IF exp(−ΔE/kT) > RAND, then R ← R’;

ENDIF

ENDIF

Repeat for N steps;

Monte Carlo Minimization (MCM) algorithm

Parallel Replica Exchange Monte Carlo algorithm
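The basic Metropolis Monte Carlo loop can be sketched in a few lines. The toy double-well energy, the uniform move size, the value of kT, and the bookkeeping of the best structure seen are all assumptions added for the example; a protein application would replace `energy` with a force field and the move with a conformational change.

```python
import math
import random

def energy(x):
    # Toy one-dimensional double-well surface (an assumption for illustration)
    return x**4 - 2.0 * x**2 + 0.3 * x

def metropolis_mc(x0, kT=0.5, n_steps=5000, max_move=0.5, seed=0):
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for _ in range(n_steps):
        x_new = x + rng.uniform(-max_move, max_move)  # modify R to R'
        e_new = energy(x_new)
        delta = e_new - e                             # dE = E(R') - E(R)
        # Downhill moves are always accepted; uphill moves with probability exp(-dE/kT)
        if delta < 0 or math.exp(-delta / kT) > rng.random():
            x, e = x_new, e_new
            if e < best_e:
                best_x, best_e = x, e
    return x, e, best_x, best_e

x, e, best_x, best_e = metropolis_mc(1.5)
```

Because uphill moves are sometimes accepted, the walk can cross barriers that trap a pure minimizer; raising kT makes crossings more frequent at the cost of a broader spread around the minima.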

Molecular Dynamics

Molecular Dynamics (MD)

MD simulates the movements of all the particles in a molecular system by iteratively solving Newton’s equations of motion.

MC is like viewing many frozen butterflies in a museum; MD is like watching the butterfly fly.

Molecular Dynamics

Algorithm

For atom i, Newton’s equation of motion is given by

$$\mathbf{F}_i = m_i \mathbf{a}_i \qquad (1)$$

$$\mathbf{F}_i(t) = m_i \frac{d^2 \mathbf{r}_i(t)}{dt^2} \qquad (2)$$

Here, $\mathbf{r}_i$ and $m_i$ represent the position and mass of atom i and $\mathbf{F}_i(t)$ is the force on atom i at time t. $\mathbf{F}_i(t)$ can also be expressed as the negative gradient of the potential energy

$$\mathbf{F}_i = -\nabla_i V \qquad (3)$$

V is the potential energy. Newton’s equation of motion can then relate the derivative of the potential energy to the changes in position as a function of time.

$$-\nabla_i V = m_i \frac{d^2 \mathbf{r}_i(t)}{dt^2} \qquad (4)$$

Molecular Dynamics

Algorithm (continued)

To obtain the trajectory of each atom, numerous numerical algorithms have been developed for integrating the equations of motion (e.g. the Verlet algorithm and the leap-frog algorithm).

Verlet algorithm

The algorithm uses the positions and accelerations at time t, and the positions from the previous step, to calculate the new positions

$$\mathbf{r}(t + \delta t) = 2\mathbf{r}(t) - \mathbf{r}(t - \delta t) + \mathbf{a}(t)\,\delta t^2$$

Selection of time step

The time step is approximately one order of magnitude smaller than the period of the fastest motion

Hydrogen vibration ~ 10 fs (1 fs = $10^{-15}$ s), time step = 1 fs
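A minimal sketch of the Verlet update, r(t+δt) = 2r(t) − r(t−δt) + a(t)δt², applied to a one-dimensional harmonic oscillator. The oscillator (unit mass and force constant, so a = −x), the Taylor-expansion start-up step, and the function name `verlet` are assumptions made for the example; an MD code applies the same update to every atomic coordinate, with a(t) obtained from the force field.

```python
import math

def verlet(x0, v0, accel, dt, n_steps):
    """Integrate x'' = accel(x) with the position Verlet scheme."""
    # Verlet needs two earlier positions; estimate x(-dt) from a Taylor expansion
    x_prev = x0 - v0 * dt + 0.5 * accel(x0) * dt**2
    x = x0
    trajectory = [x0]
    for _ in range(n_steps):
        x_next = 2.0 * x - x_prev + accel(x) * dt**2
        x_prev, x = x, x_next
        trajectory.append(x)
    return trajectory

# Harmonic oscillator with period 2*pi; dt is far below one tenth of the period
trajectory = verlet(x0=1.0, v0=0.0, accel=lambda x: -x, dt=0.01, n_steps=628)
```

After 628 steps (t = 6.28, just under one period) the position is back close to its starting value, tracking the analytical solution cos(t). Making dt comparable to the period breaks the integration, which is the reason for the time step rule above.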

Molecular Dynamics

MD Software

CHARMM (Chemistry at HARvard Macromolecular Mechanics) is a program for macromolecular simulations, including energy minimization, molecular dynamics and Monte Carlo simulations.

NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems.

http://www.ks.uiuc.edu/Research/namd/

Application in PSP

Advantage: Deterministic; provides details of the folding process

Limitation: Protein folding takes place on the millisecond (ms) time scale, which is at the limit of accessible simulation times

It is still difficult to simulate the whole folding process of a protein using conventional MD.

Time Scales of Protein Motions and MD

MD Time Scale

Time axis: $10^{-15}$ s (fs), $10^{-12}$ s (ps), $10^{-9}$ s (ns), $10^{-6}$ s (μs), $10^{-3}$ s (ms), $10^{0}$ s (s)

Bond stretching

Elastic vibrations of proteins

α-Helix folding

β-Hairpin folding

Protein folding

MD is fun!

A small protein folding movie: simulated with NAMD/VMD

Other Conformational Search Algorithms

Global optimization algorithms

“Optimization” refers to trying to find the global energy minimum of a potential surface.

Genetic Algorithm (GA)

Simulated Annealing (SA)

Tabu Search (TS)

Ant Colony Optimization (ACO)

A model system: Lennard-Jones clusters

Applications of MM methods in PSP

Application in PSP

Combination of several conformational search techniques

Recent developments

Simplified force field: united residue force field

Segment assembly

Secondary structure predictions are quite reliable, so conformations can be produced by assembling the segments

Ab initio PSP software

Rosetta is a five-stage fragment insertion Metropolis Monte Carlo method

ASTRO-FOLD is a combination of the deterministic αBB global optimization algorithm and a Molecular Dynamics approach in torsion angle space

LINUS uses a Metropolis Monte Carlo algorithm and a simplified physics-based force field


ASTRO-FOLD

References

Hardin C, et al. Ab initio protein structure prediction. Curr Opin Struct Biol. 2002, 12(2): 176-81.

Floudas CA, et al. Advances in protein structure prediction and de novo protein design: A review. Chemical Engineering Science, 2006, 61: 966-988.

Klepeis JL, Floudas CA. ASTRO-FOLD: a combinatorial and global optimization framework for ab initio prediction of three dimensional structures of proteins from the amino acid sequence. Biophysical Journal, 2003, 85: 2119-2146.

Ab Initio Protein Loop Prediction

Protein loop

Protein loops are polypeptides connecting more rigid structural elements of proteins like helices and strands.

Challenge in Loop Structure Prediction

Loops are important to protein folding and protein function even though they are small, usually <20 residues

Loops exhibit greater structural variability than helices and strands

Loop prediction is often a limiting factor for fold recognition methods

Ab Initio Protein Loop Prediction

Ab initio methods have recently received increased attention in the prediction of protein loops

Potential energy function

Molecular mechanics force fields usually perform better than statistical potentials in protein loop modeling.

Recent progress

Dihedral angle sampling

Clustering

Select representative structures from ensembles

Ab Initio Loop Prediction Methods

Loopy

Random tweak

Colony energy

Fiser’s method

MM methods:

Physical energy function

Energy Minimization + MD + SA

Forrest & Woolf

Predicts membrane protein loops

MM methods: MC + MD

Review:

Floudas C.A. et al., Advances in protein structure prediction and de novo protein design: A review, Chemical Engineering Science, 2006, vol. 61, 966-988.

CLOOP: Ab Initio Loop Modeling Method

CLOOP builds all-atom ensembles of protein loop conformations (it is not a true protein loop prediction method)

Paper

Haiyan Jiang, Christian Blouin, Ab Initio Construction of All-atom Loop Conformations, Journal of Molecular Modeling, 2006, 12, 221-228.

CLOOP methods

Energy function: CHARMM

Dihedral sampling

Potential smoothing technique

The designed minimization (DM) strategy

Divided loop conformation construction

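The Energy Function of the CHARMM Forcefield

CHARMM

$$E_{CHARMM} = E_{bonds} + E_{UB} + E_{angle} + E_{dihe} + E_{imp} + E_{vdw} + E_{elec}$$

$$E_{bonds} = \sum_{bonds} k_b (b - b_0)^2$$

$$E_{UB} = \sum_{UB} k_{UB} (S - S_0)^2$$

$$E_{angle} = \sum_{angle} k_\theta (\theta - \theta_0)^2$$

$$E_{dihe} = \sum_{dihe} k_\phi \left( 1 + \cos(n\phi - \delta) \right)$$

$$E_{imp} = \sum_{imp} k_{imp} (\omega - \omega_0)^2$$

$$E_{vdw} = \sum_{nonbond} \varepsilon_{ij} \left[ \left( \frac{R_{min,ij}}{r_{ij}} \right)^{12} - 2 \left( \frac{R_{min,ij}}{r_{ij}} \right)^{6} \right]$$

$$E_{elec} = \sum_{nonbond} \frac{q_i q_j}{4 \pi \varepsilon_0 r_{ij}}$$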

CLOOP

Dihedral sampling

Loop main-chain dihedrals φ and ψ are generated by sampling main-chain dihedral angles from a restrained φ/ψ set

The restrained dihedral range has 11 pairs of φ/ψ dihedral sub-ranges. It was obtained by adding a 100 degree variation to each state of the 11-state φ/ψ set developed by Moult and James for loop modeling.

Side chain conformations are built randomly.

CLOOP

Potential smoothing technique

A soft core potential provided in the CHARMM software package was applied to smooth the non-bonded interactions

$$E_{nonbond} = E^{CHARMM}_{nonbond}, \qquad r \ge r_{soft}$$

$$E_{nonbond} = k (r - r_{soft}) + E^{CHARMM}_{nonbond}, \qquad r < r_{soft}$$

$r_{soft}$ is the switching distance for the soft core potential

$r$ is the distance between the two interacting atoms
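A minimal sketch of a linear soft core applied to a 12-6 pair energy. Several details are assumptions made for the example and are not taken from the slide or from the CHARMM source: the CHARMM term inside the switching distance is evaluated at r_soft, k is set to the slope of the pair energy at r_soft so that the two branches join smoothly, and the parameters are in reduced units (ε = 1, R_min = 1, r_soft = 0.8).

```python
def lj(r, eps=1.0, r_min=1.0):
    """12-6 pair energy in the R_min form: eps * [(R_min/r)^12 - 2 (R_min/r)^6]."""
    x = (r_min / r) ** 6
    return eps * (x * x - 2.0 * x)

def lj_slope(r, eps=1.0, r_min=1.0):
    """Derivative dE/dr of the 12-6 pair energy."""
    x = (r_min / r) ** 6
    return eps * (-12.0 * x * x + 12.0 * x) / r

def soft_core(r, r_soft=0.8, eps=1.0, r_min=1.0):
    """Unmodified energy outside r_soft, linear extrapolation inside it."""
    if r >= r_soft:
        return lj(r, eps, r_min)
    k = lj_slope(r_soft, eps, r_min)   # assumption: k is the slope at r_soft
    return k * (r - r_soft) + lj(r_soft, eps, r_min)

overlap_plain = lj(0.4)        # steep r^-12 wall for a badly overlapping pair
overlap_soft = soft_core(0.4)  # finite, much smaller penalty
```

Replacing the steep wall with a finite ramp lets randomly built conformations with atomic overlaps relax during minimization instead of producing enormous forces.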

CLOOP

The designed minimization (DM) strategy

Minimization methods:

steepest descent, conjugate gradient, and the adopted basis Newton-Raphson minimization method

Two stages:

1. Minimize the internal energy terms of the loop conformations, including bond, angle, dihedral, and improper terms

2. Further minimize the candidates with the full CHARMM energy function, including the van der Waals and electrostatic energy terms.

CLOOP

Divided loop conformation construction

Generate position of middle residue

Build initial conformation of main chain with dihedral sampling

Build side chain conformation

Run DM and produce closed loop conformation

CLOOP

Performance of CLOOP

CLOOP was applied to construct the conformations of 4-, 8-, and 12-residue-long loops in Fiser’s loop test set. The average main-chain root mean square deviations (RMSD) obtained in 1000 trials for the 10 different loops of each size are 0.33, 1.27 and 2.77 Å, respectively.

The performance of CLOOP was investigated in two ways: one calculates the loop energy with a buffer region, and the other uses the loop only. The buffer region extended up to 10 Å around the loop atoms. In energy minimization, only the loop atoms were allowed to move, and all non-loop atoms, including those in the buffer region, were fixed.
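For reference, a root mean square deviation between two sets of matched atoms can be computed as below. This is a generic sketch: the function name `rmsd` and the toy coordinates are hypothetical, and no superposition is performed, on the assumption that both loops are already expressed in the frame of the same protein.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))

# Two hypothetical atoms, each displaced by 1 Å along z in the model
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
deviation = rmsd(native, model)
```

For a main-chain RMSD the lists would hold the backbone atoms of every loop residue, in the same order in both structures.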

Loop Conformations built by CLOOP

a. 1gpr_123-126 b. 135l_84-91 c. 1pmy_77-88

Performance of CLOOP

Conclusion

CLOOP can be applied to build a good all-atom conformation ensemble of loops with size up to 12 residues.

Good efficiency: CLOOP is faster than RAPPER

The contribution of the protein to which a loop is attached (i.e. the ‘buffer region’) facilitates the discrimination of near-optimal loop structures.

The soft core potential and the DM strategy are effective techniques in building loop conformations.

Thanks!