Bioinformatics: Practical Application of Simulation and Data Mining. Markov Modeling III. Prof. Corey O’Hern, Department of Mechanical Engineering & Materials Science, Department of Physics, Yale University


Page 1: Bioinformatics: Practical Application of Simulation and Data Mining Markov Modeling III

Bioinformatics: Practical Application of Simulation and Data Mining

Markov Modeling III

Prof. Corey O’Hern

Department of Mechanical Engineering & Materials Science

Department of Physics

Yale University

Page 2

“Using massively parallel simulation and Markovian models to study protein folding: Examining the dynamics of the villin headpiece,” J. Chem. Phys. 124 (2006) 164902.

Page 3

Villin headpiece HP-36

MLSDEDFKAVFGMTRSAFANLPLWKQQNLKKEKGLF (PDB 1VII)

Page 4

Simulation Details

• 50,000 trajectories × 10 ns/trajectory = 500 μs

• Gromacs with explicit solvent (5000 water molecules) and eight counterions; Amber + bond constraints; T = 300 K

• 50,000 trajectories × 10 ns/trajectory × 1 conformation/100 ps = 4,509,355 conformations
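As a quick check of these totals, the arithmetic can be sketched in Python. The variable names are illustrative only. The nominal snapshot count assumes every trajectory runs the full 10 ns, which is an assumption; the 4,509,355 conformations reported on the slide is lower than that nominal figure.

```python
# Back-of-envelope check of the simulation totals quoted on the slide.
n_trajectories = 50_000
ns_per_trajectory = 10        # ns of simulation per trajectory
ps_per_snapshot = 100         # one conformation saved every 100 ps

total_ns = n_trajectories * ns_per_trajectory
total_us = total_ns / 1_000   # 1 microsecond = 1000 ns

# Nominal count if every trajectory reached the full 10 ns (assumption).
nominal_snapshots = total_ns * 1_000 // ps_per_snapshot
reported_snapshots = 4_509_355   # value stated on the slide

print(f"aggregate time: {total_us:.0f} us")
print(f"nominal snapshots: {nominal_snapshots:,}")
print(f"reported / nominal: {reported_snapshots / nominal_snapshots:.2f}")
```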

Page 5

I. Native State Ensemble

Page 6

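II. Unfolded State Ensemble

• 10,000 trajectories equilibrated at T = 1000 K for 1 ns

• Remove all structure

• Random walk statistics

$$P(R) = 4\pi R^2 \left( \frac{3}{2\pi \langle R^2 \rangle} \right)^{3/2} \exp\left[ -\frac{3R^2}{2\langle R^2 \rangle} \right]$$

R = end-to-end distance

$$\langle R^2 \rangle^{1/2} \sim \begin{cases} N^{1/2} & \text{ideal} \\ N^{3/5} & \text{excluded volume} \end{cases}$$

N = # of amino acids

chain crossing

• Each trajectory quenched from 1000 K to 300 K; run for 25 ns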
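The random walk statistics listed on this slide can be illustrated with a small numerical sketch. It assumes a freely jointed chain with unit bond length and N = 36 steps (both are illustrative choices, not taken from the slides), and checks that the sampled mean-square end-to-end distance scales as N and that the ideal-chain distribution P(R) = 4πR² (3/(2π⟨R²⟩))^{3/2} exp[−3R²/(2⟨R²⟩)] is normalized.

```python
import numpy as np

def end_to_end_distances(n_steps, n_chains, bond=1.0, seed=0):
    """Sample end-to-end distances of freely jointed chains (assumed model)."""
    rng = np.random.default_rng(seed)
    # Random unit vectors: normalize isotropic Gaussian vectors.
    steps = rng.normal(size=(n_chains, n_steps, 3))
    steps *= bond / np.linalg.norm(steps, axis=2, keepdims=True)
    return np.linalg.norm(steps.sum(axis=1), axis=1)

def ideal_chain_pdf(r, mean_r2):
    """Ideal-chain distribution of the end-to-end distance R."""
    prefactor = (3.0 / (2.0 * np.pi * mean_r2)) ** 1.5
    return 4.0 * np.pi * r**2 * prefactor * np.exp(-3.0 * r**2 / (2.0 * mean_r2))

n_steps = 36                      # hypothetical chain length
R = end_to_end_distances(n_steps, n_chains=20_000)
mean_r2 = np.mean(R**2)           # expected to be close to n_steps * bond**2

# Riemann-sum check that P(R) integrates to about 1 on [0, n_steps].
r = np.linspace(0.0, float(n_steps), 3601)
integral = np.sum(ideal_chain_pdf(r, mean_r2)) * (r[1] - r[0])

print(f"<R^2> / N = {mean_r2 / n_steps:.3f}")
print(f"integral of P(R) = {integral:.4f}")
```

Chains with excluded volume would need a self-avoiding walk instead, which this sketch does not attempt.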

Page 7

Estimation of Folding Time: Including Unfolded Events

$$\tau^{-1} = N_f \left[ \sum_{i \in F} t_f^{\,i} + \sum_{i \in U} t_u^{\,i} \right]^{-1}$$

N_trajectories initially unfolded states

F: folded (determined by dRMSD); first passage time: t_f

U: unfolded; t_u
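A minimal sketch of this kind of estimate, assuming the rate is the number of folding events N_f divided by the total observed time (first passage times t_f of trajectories that folded plus simulated times t_u of trajectories that stayed unfolded). The function name and the example times are hypothetical.

```python
def folding_time_estimate(t_folded, t_unfolded):
    """Folding time tau = (sum of t_f + sum of t_u) / N_f.

    t_folded:   first passage times of trajectories that folded
    t_unfolded: total simulated times of trajectories that never folded
    """
    n_f = len(t_folded)
    if n_f == 0:
        raise ValueError("no folding events observed; tau is undefined")
    total_time = sum(t_folded) + sum(t_unfolded)
    return total_time / n_f

# Hypothetical data (arbitrary time units): 2 folding events, 4 unfolded runs.
t_f = [2.0, 4.0]
t_u = [10.0, 10.0, 10.0, 10.0]
tau = folding_time_estimate(t_f, t_u)
print(f"tau = {tau} (same units as the input times)")
```

Including the unfolded trajectories matters: averaging only the observed first passage times would give 3.0 here and badly underestimate the folding time.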

Page 8

Floppy Residues

Page 9

Maximum Likelihood Estimator (MLE)

τ_F ~ 4.3-10 μs from laser-jump and other experiments

τ_F ~ 8 μs from MLE

τ_F ~ 24 μs from MLE + correction of water diffusion coefficient

Page 10

Sensitivity of MLE Results

“With these issues in mind, the calculated rate is well within an order of magnitude of experimental measurements.”

Page 11

III. Transition State Ensemble: Effect of Perturbations

$$P_{X,Y} = \frac{N(X)}{N(X) + N(Y)}$$

N(X) = # of trajectories that meet condition X before Y

$$P_{X,Y}(s) = P_{X,Y}(s')$$

s: unperturbed state

s': perturbed state after 500 ps

Water does not affect dynamics
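The comparison between an unperturbed state s and a perturbed state s' can be sketched as follows, using P_{X,Y} = N(X) / (N(X) + N(Y)). The outcome lists are made-up illustration data, and the binomial standard error is an added assumption for judging whether two values agree; it is not taken from the slides.

```python
import math

def p_xy(outcomes):
    """Return (P_XY, standard error) from a list of 'X' / 'Y' outcomes.

    Each entry records which condition a trajectory met first.
    """
    n_x = outcomes.count("X")
    n_y = outcomes.count("Y")
    n = n_x + n_y
    if n == 0:
        raise ValueError("no trajectories reached X or Y")
    p = n_x / n
    return p, math.sqrt(p * (1.0 - p) / n)   # binomial standard error

# Hypothetical outcomes for trajectories started from s and from s'.
unperturbed = ["X"] * 48 + ["Y"] * 52
perturbed = ["X"] * 45 + ["Y"] * 55

p_s, err_s = p_xy(unperturbed)
p_sp, err_sp = p_xy(perturbed)
combined_err = math.sqrt(err_s**2 + err_sp**2)
consistent = abs(p_s - p_sp) < 2.0 * combined_err

print(f"P(s) = {p_s:.2f}, P(s') = {p_sp:.2f}, consistent: {consistent}")
```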

Page 12

Markov States

• 4,509,355 conformations → 2454 Markov states based on clustering of Cα dRMSD

• No dead ends

[Diagram: states s1, s2, s3, s4, sf]

Page 13

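Cα dRMSD

$$\mathrm{dRMSD}_{ij} = \begin{bmatrix} 0 & \mathrm{dRMSD}_{12} & \mathrm{dRMSD}_{13} & \mathrm{dRMSD}_{14} \\ \mathrm{dRMSD}_{12} & 0 & \mathrm{dRMSD}_{23} & \mathrm{dRMSD}_{24} \\ \mathrm{dRMSD}_{13} & \mathrm{dRMSD}_{23} & 0 & \mathrm{dRMSD}_{34} \\ \mathrm{dRMSD}_{14} & \mathrm{dRMSD}_{24} & \mathrm{dRMSD}_{34} & 0 \end{bmatrix}$$

$$\mathrm{dRMSD}_{ij} = \left[ \frac{1}{N} \sum_k \left( \vec{r}_k^{\,i} - \vec{r}_k^{\,j} \right)^2 \right]^{1/2}$$

k = sum over amino acids

i, j = configurations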
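A minimal NumPy sketch of the pairwise matrix, implementing the formula as written on this slide: the root of the mean squared displacement between matching residues k of configurations i and j. It applies no structural alignment, and the coordinates are random placeholders rather than real Cα positions. The full pairwise array grows quadratically, so this direct form only suits small sets of configurations.

```python
import numpy as np

def pairwise_drmsd(coords):
    """Pairwise matrix from coordinates of shape (n_configs, n_residues, 3).

    Entry (i, j) is sqrt( (1/N) * sum_k |r_k^i - r_k^j|^2 ).
    """
    coords = np.asarray(coords, dtype=float)
    # Broadcast to shape (n_configs, n_configs, n_residues, 3).
    diff = coords[:, None, :, :] - coords[None, :, :, :]
    return np.sqrt((diff**2).sum(axis=-1).mean(axis=-1))

# Placeholder data: 36 residues, three configurations.
rng = np.random.default_rng(0)
base = rng.normal(size=(36, 3))
shifted = base + np.array([3.0, 4.0, 0.0])   # every residue moved by 5 units
other = rng.normal(size=(36, 3))

D = pairwise_drmsd(np.stack([base, shifted, other]))
print(np.round(D, 3))
```

The result is symmetric with a zero diagonal, matching the matrix layout on the slide.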

Page 14

Transition Probabilities and Mean First Passage Time

$$P(s_a, s_b) = \frac{T(s_a, s_b)}{\sum_i T(s_a, s_i)}$$

stable MFPT

MFPT = 3 μs
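A sketch of how transition probabilities and a mean first passage time could be computed from a matrix of observed transition counts T. Row normalization follows P(s_a, s_b) = T(s_a, s_b) / Σ_i T(s_a, s_i). The MFPT step uses the standard linear system for a discrete-time Markov chain, which is an assumption about the method rather than something stated on the slide. The count matrix and lag time are hypothetical, and every state is assumed to have at least one outgoing transition (no dead ends).

```python
import numpy as np

def transition_probabilities(counts):
    """Row-normalize a transition count matrix T into probabilities P."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum(axis=1, keepdims=True)

def mean_first_passage_times(P, target, lag_time=1.0):
    """MFPT from every state to `target`, in units of lag_time.

    Solves m_i = lag_time + sum_{j != target} P_ij * m_j for i != target.
    """
    n = P.shape[0]
    others = [i for i in range(n) if i != target]
    Q = P[np.ix_(others, others)]             # transitions among non-target states
    rhs = np.full(len(others), lag_time)
    m = np.linalg.solve(np.eye(len(others)) - Q, rhs)
    mfpt = np.zeros(n)                        # MFPT from the target to itself is 0
    mfpt[others] = m
    return mfpt

# Hypothetical 3-state example; state 2 plays the role of the folded state.
T = [[8, 2, 0],
     [1, 8, 1],
     [0, 0, 10]]
P = transition_probabilities(T)
mfpt = mean_first_passage_times(P, target=2)
print(P)
print(mfpt)
```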

Page 15

MSM: single exponential

Comparison of Short and Long Times

Page 16

First Passage Time in Random Processes

[Figure labels: folded, unfolded; unfolded, folded, partially unfolded. Plot: P(Δx) versus Δx, Gaussian]
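First passage times for a random process with Gaussian steps can be sketched with a one-dimensional walk. This is an illustrative toy model only: the absorbing barriers at ±`barrier`, the step width `sigma`, and the walker count are hypothetical choices, not parameters from the slides.

```python
import numpy as np

def first_passage_times(n_walkers, barrier, sigma=1.0, max_steps=100_000, seed=0):
    """Steps needed for each 1D Gaussian random walker to first reach |x| >= barrier.

    Walkers that never exit within max_steps keep the value -1.
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(n_walkers)
    fpt = np.full(n_walkers, -1)
    alive = np.ones(n_walkers, dtype=bool)
    for step in range(1, max_steps + 1):
        # Draw Gaussian displacements only for walkers still inside the barriers.
        x[alive] += rng.normal(0.0, sigma, size=alive.sum())
        exited = alive & (np.abs(x) >= barrier)
        fpt[exited] = step
        alive &= ~exited
        if not alive.any():
            break
    return fpt

fpt = first_passage_times(n_walkers=2000, barrier=5.0)
print(f"mean first passage time: {fpt.mean():.1f} steps")
print(f"longest first passage time: {fpt.max()} steps")
```

For small steps the mean exit time scales roughly as (barrier / sigma)², so doubling the barrier distance should roughly quadruple the mean first passage time.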

Page 17

Survival Probability for Two Particles

Page 18

Protein Aggregation

“Molecular simulation of protein aggregation,” Biotechnology & Bioengineering 96 (2007) 1.