Simulations of Stochastic Biological Phenomena

Fernand Hayot, PhD
Department of Neurology
Mount Sinai School of Medicine
[email protected]


Part 1. Modeling Stochastic Systems: The Master Equation

Measurements on single cells rather than across a population of cells: emphasis on variability from cell to cell, rather than on average behavior.

Importance: Cell variability can lead to different phenotypic outcomes; average behavior can mask what actually happens in individual cells, such as graded average response masking all-or-none individual cell response.

Example: Ferrell JE Jr, Machleder EM. The biochemical basis of an all-or-none cell fate switch in Xenopus oocytes. Science. 1998 May 8;280(5365):895-898. PMID: 9572732

Figure (Ferrell and Machleder, 1998): responses of individual oocytes to progesterone stimulation.

Why do we need to consider stochasticity?

(stochasticity = randomness = fluctuations)

Example:

Consider a collection of p identical proteins that decay at a rate k, which means that per unit time a fraction k of the proteins disappears. Deterministically,

$$\frac{dp}{dt} = -k\,p, \quad \text{i.e.,} \quad p(t) = p(0)\,e^{-kt}$$

However, the decay process is stochastic: each protein has a probability k dt of decaying in the time interval (t, t + dt).

Example (continued):

Now consider N boxes, each containing an identical system of proteins, all with the same initial number p(0) and all decaying at the same rate k. Suppose I open the boxes some time later and count the proteins: I will find a different number in each box, the numbers differing by small amounts.

If the numbers are large, these differences do not matter, and the deterministic equation is a good description of what happens in each box.

However, when the number of initial proteins is small, say 10, the differences matter, the effect of the stochasticity of the decay shows up, and the deterministic approach to each box is no longer appropriate.
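The box experiment can be tried numerically. The following Python sketch is an illustration only and is not part of the original slides: it assumes that each protein survives to time t independently with probability exp(-k t), which follows from the decay probability k dt per interval. The function names and all parameter values are hypothetical.

```python
import math
import random

def simulate_boxes(p0, k, t, n_boxes, seed=0):
    """Return the number of surviving proteins in each of n_boxes boxes.

    Assumption: each of the p0 proteins decays independently, so it
    survives until time t with probability exp(-k * t).
    """
    rng = random.Random(seed)
    p_survive = math.exp(-k * t)
    return [sum(1 for _ in range(p0) if rng.random() < p_survive)
            for _ in range(n_boxes)]

def coefficient_of_variation(counts):
    """Standard deviation divided by the mean, across boxes."""
    mean = sum(counts) / len(counts)
    var = sum((c - mean) ** 2 for c in counts) / len(counts)
    return math.sqrt(var) / mean

k, t = 0.01, 50.0  # illustrative values (1/s and s)
for p0 in (10, 1000):
    counts = simulate_boxes(p0, k, t, n_boxes=200)
    deterministic = p0 * math.exp(-k * t)
    # Print: initial number, deterministic prediction, box-to-box spread.
    print(p0, round(deterministic, 1), round(coefficient_of_variation(counts), 3))
```

The relative spread between boxes is much larger for p(0) = 10 than for p(0) = 1000, which is the point made above.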

Example details

N identically prepared cells, numbered i = 1, 2, ..., N. At some time t, measure the copy number $x_i$ of some protein or mRNA in each cell.

One can define:

average: $\langle x \rangle = (x_1 + x_2 + \dots + x_N)/N$

variance: $\sigma^2 = \langle x^2 \rangle - \langle x \rangle^2 = \langle (x - \langle x \rangle)^2 \rangle$

standard deviation = $\sigma$; coefficient of variation = $\sigma/\langle x \rangle$; Fano factor = $\sigma^2/\langle x \rangle$
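The definitions above translate directly into a few lines of Python. This is a minimal sketch; the function name `summary_statistics` and the copy numbers are made up for illustration.

```python
import math

def summary_statistics(counts):
    """Mean, variance, standard deviation, CV and Fano factor of copy numbers."""
    n = len(counts)
    mean = sum(counts) / n
    mean_sq = sum(x * x for x in counts) / n
    # <x^2> - <x>^2, clipped at zero to guard against round-off.
    variance = max(mean_sq - mean ** 2, 0.0)
    sigma = math.sqrt(variance)
    return {
        "mean": mean,
        "variance": variance,
        "std": sigma,
        "cv": sigma / mean,
        "fano": variance / mean,
    }

# Hypothetical copy numbers measured in N = 8 cells.
counts = [4, 7, 5, 6, 9, 3, 6, 8]
stats = summary_statistics(counts)
print(stats)
```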


N large: statistical interpretation

Histogram: number of cells for which $x_i$ lies in some small interval $x < x_i < x + \delta x$.

The histogram is an approximation (better the larger N) to the probability density of the measured protein or mRNA number across the N cells. Knowledge of $\langle x \rangle$ and $\sigma$ (or, better, of the complete probability distribution) allows comparison with known distributions: Poisson, Gaussian, gamma, lognormal.

Noise in protein levels in yeast cells

Bar-Even A, Paulsson J, Maheshri N, Carmi M, O'Shea E, Pilpel Y, Barkai N. Noise in protein expression scales with natural protein abundance. Nat Genet. 2006 Jun;38(6):636-43. PMID: 16715097

The figure shows several genes, conditions and time points.

Thick line in (a): $\log(\sigma^2/\langle p \rangle^2) = \log(1175) - \log\langle p \rangle$

Thin line in (b) is autofluorescence: $\log(\sigma^2/\langle p \rangle^2) = \log(9.9 \times 10^5) - 2\log\langle p \rangle$

Thus, the slope of the autofluorescence line is twice the slope of the thick line.

Deterministic versus stochastic representations

Birth-and-death process: transcription and mRNA degradation

Model: D → D + M, rate $k_1$; M → Φ, rate $k_3$

D = DNA, M = mRNA, Φ = degradation product

ODE for concentrations:

$$\frac{d[M]}{dt} = k_1 [D] - k_3 [M]$$

This works well for a cell population average, or for a single cell when the number n of M molecules is so large that any change to it from the above reactions (a change of +1 or -1) can be treated as a differential. If n is small (say of the order of 100 or less), one has what are called "small copy number" fluctuations, and the appropriate language becomes that of probabilities.
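For orientation, the deterministic equation $d[M]/dt = k_1[D] - k_3[M]$ can be solved in closed form. The expression below is the standard solution of a linear first-order ODE, written under the assumption that [D] is constant; the initial value [M](0) is a free parameter.

```latex
[M](t) = \frac{k_1 [D]}{k_3}\left(1 - e^{-k_3 t}\right) + [M](0)\, e^{-k_3 t},
\qquad
[M]_{\mathrm{ss}} = \lim_{t \to \infty} [M](t) = \frac{k_1 [D]}{k_3}
```

The steady state is reached on the time scale $1/k_3$, the mRNA lifetime.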

The new quantity is now $P(n,t)$, the probability of having n copies of M in the cell at time t. This probability satisfies an evolution equation called the Master equation. For the above birth-and-death process this equation is

$$\frac{\partial P(n,t)}{\partial t} = k_1 \left[ P(n-1,t) - P(n,t) \right] + k_3 \left[ (n+1)\,P(n+1,t) - n\,P(n,t) \right]$$

How does this equation come about?

Derivation of the Master Equation

Model: D → D + M, rate $k_1$; M → Φ, rate $k_3$

D = DNA, M = mRNA, Φ = degradation product

Take a time interval δt sufficiently small that either neither of the two reactions takes place or a single reaction does, but not two. Then ask for $P(n, t + \delta t)$, knowing the state of the cell at time t.

There are three contributions between t and t + δt, where $n_D$ is the number of copies of D:

- "birth" takes place: $k_1\,\delta t\,n_D\,P(n-1, t)$

- "death" takes place: $k_3\,\delta t\,(n+1)\,P(n+1, t)$

- no reaction occurs: $(1 - k_1\,\delta t\,n_D - k_3\,\delta t\,n)\,P(n, t)$

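Put $n_D = 1$: one promoter-transcription factor complex (haploid cell).

Now set $P(n, t+\delta t)$ equal to the sum of the three contributions and replace $[P(n,t+\delta t) - P(n,t)]/\delta t$ by $\partial P(n,t)/\partial t$. The full Master equation is obtained, namely

$$\frac{\partial P(n,t)}{\partial t} = k_1 \left[ P(n-1,t) - P(n,t) \right] + k_3 \left[ (n+1)\,P(n+1,t) - n\,P(n,t) \right]$$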
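Written out, the intermediate step reads as follows. This is a sketch of the algebra for $n_D = 1$, using only the three contributions listed in the derivation: the first line sums them, and the second line subtracts $P(n,t)$ from both sides and divides by δt.

```latex
\begin{aligned}
P(n, t+\delta t) &= k_1\,\delta t\,P(n-1,t) + k_3\,\delta t\,(n+1)\,P(n+1,t)
                    + \left(1 - k_1\,\delta t - k_3\,\delta t\,n\right) P(n,t) \\
\frac{P(n,t+\delta t) - P(n,t)}{\delta t} &= k_1\left[P(n-1,t) - P(n,t)\right]
                    + k_3\left[(n+1)\,P(n+1,t) - n\,P(n,t)\right]
\end{aligned}
```

Letting δt → 0 turns the left-hand side into $\partial P(n,t)/\partial t$.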

Calculate the average of n from the Master equation

Average: multiply the Master equation by n and sum over all n, using $\sum_n P(n,t) = 1$ and $\sum_n n\,P(n,t) = \langle n(t) \rangle$:

$$\frac{d\langle n(t) \rangle}{dt} = k_1 - k_3 \langle n(t) \rangle$$

Result: the average value $\langle n(t) \rangle$ follows the same equation as the concentration [M(t)] in the ODE.

Additional hints relevant to the Problem Set

1. If we multiply the Master equation by $n^2$ and sum, this results in an equation for the variance. The steady-state relation between mean and variance provides important information.

2. Consider the steady state of the Master equation: $\partial P(n,t)/\partial t = 0$, and $(n+1)P(n+1) = nP(n) + (k_1/k_3)\,[P(n) - P(n-1)]$. Notice that $P(n+1)$ is determined once $P(n)$ and $P(n-1)$ are known. Therefore, one can find the steady-state solution recursively, starting with $P(1)$ on the left-hand side (n = 0), with $P(-1) = 0$.
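The recursion of hint 2 can be sketched in Python as follows. This is one possible way to carry it out, not the Problem Set solution: it assumes an arbitrary starting value P(0) = 1 that is normalized at the end, and it truncates at a hypothetical `n_max`, beyond which P(n) is taken to be negligible. The rate values are those used later in the lecture for the MATLAB example.

```python
def steady_state_distribution(k1, k3, n_max):
    """Solve the steady-state Master equation recursively for n = 0..n_max.

    Uses (n+1) P(n+1) = n P(n) + (k1/k3) [P(n) - P(n-1)] with P(-1) = 0,
    starting from an arbitrary P(0) = 1 and normalizing at the end.
    Truncation at n_max is an approximation; round-off can leave tiny
    (order 1e-16) values in the far tail.
    """
    ratio = k1 / k3
    p = [1.0]
    previous = 0.0  # P(-1)
    for n in range(n_max):
        nxt = (n * p[n] + ratio * (p[n] - previous)) / (n + 1)
        previous = p[n]
        p.append(nxt)
    total = sum(p)
    return [x / total for x in p]

k1, k3 = 0.01, 0.0058  # rates in 1/s
p = steady_state_distribution(k1, k3, n_max=30)
mean = sum(n * pn for n, pn in enumerate(p))
variance = sum(n * n * pn for n, pn in enumerate(p)) - mean ** 2
print(mean, variance, variance / mean)  # mean, variance, Fano factor
```

Comparing the printed mean and variance is one way to explore the steady-state relation mentioned in hint 1.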

A model of transcription plus translation

The model of transcription and mRNA degradation was expanded: Thattai M and van Oudenaarden A, PNAS 2001; 98: 8614-8619.

Model of transcription and mRNA degradation only:

D → D + M, rate $k_1$
M → Φ, rate $k_3$

Complete model:

D → D + M, rate $k_1$
M → M + P, rate $k_2$
M → Φ, rate $k_3$
P → Φ, rate $k_4$

The Problem Set requires exploring this model.
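In the deterministic description, the complete model (D → D + M, M → M + P, M → Φ, P → Φ with rates $k_1$ to $k_4$) corresponds to two rate equations. They are written here by analogy with the single-species ODE of Part 1, assuming mass-action kinetics and constant [D]; they are a sketch for orientation and are not taken from the slides.

```latex
\begin{aligned}
\frac{d[M]}{dt} &= k_1 [D] - k_3 [M], &\qquad [M]_{\mathrm{ss}} &= \frac{k_1 [D]}{k_3} \\
\frac{d[P]}{dt} &= k_2 [M] - k_4 [P], &\qquad [P]_{\mathrm{ss}} &= \frac{k_1 k_2 [D]}{k_3 k_4}
\end{aligned}
```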

Part 2. Modeling Stochastic Systems: The Gillespie Algorithm

The Master equation is in most cases too complicated to be dealt with directly. Gillespie's algorithm is a numerical scheme that is equivalent to solving the Master equation.

The crux of the algorithm is the drawing of two random numbers at each time step: the first determines when the next reaction among those considered will take place, and the second chooses which of the reactions will occur.

Suppose there are j = 1, 2, ... reactions. Given the state of the system at time t, $a_j(t)\,dt$ denotes the probability that reaction j will occur in the time interval (t, t + dt). $a_j(t)$ is the product of two parts: the reaction rate $c_j$ for reaction j, and the number of possible reactions in volume V.

Gillespie DT. Exact stochastic simulation of coupled chemical reactions. J. Phys. Chem. 1977; 81, 2340-2361.

Gillespie DT. Approximate accelerated stochastic simulation of chemically reacting systems. J. Chem. Phys. 2001; 115, 1716-1733.

Example: dimerization reaction P1 + P2 → Z, rate $c_j$

Here $a_j(t) = c_j\,P_1(t)\,P_2(t)$, where $P_1(t)\,P_2(t)$ is the product of the numbers of molecules of P1 and P2 at time t.

If P2 is identical to P1, then $a_j(t) = c_j\,P_1 (P_1 - 1)/2$.

Remark: connection between the c's and the corresponding chemical rate constants k that appear in the ODEs (this remark applies as well to the Master equation derivation).

By definition the c's have the dimension of inverse time. When the chemical rate constant has the dimension of inverse time, as is the case for the transcription-translation-degradation model considered previously, c is equal to k. However, for the above dimerization reaction one would have the ODE $d[Z]/dt = k\,[P_1][P_2]$, where k has the dimension of volume over time. In this case $c = k/V$ ($c = 2k/V$ if P1 = P2), where V is the volume in which the reaction takes place (whole cell, cytoplasm, nucleus). (1 nM corresponds to 1 particle in a volume of about 1.66 cubic microns.)
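The conversion between k and c, and the two forms of the dimerization propensity, can be sketched in Python. The function names are hypothetical, and the code assumes that k is given in units of 1/(M s), so that V must be multiplied by Avogadro's number to turn molar concentrations into molecule numbers.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def stochastic_rate_constant(k, volume_litres, identical_reactants=False):
    """Convert a bimolecular rate constant k (1/(M*s)) to c (1/s).

    c = k / V for P1 + P2 -> Z and c = 2k / V for P1 + P1 -> Z, where V
    is expressed here as (Avogadro's number * volume in litres).
    """
    c = k / (AVOGADRO * volume_litres)
    return 2.0 * c if identical_reactants else c

def dimerization_propensity(c, n1, n2=None):
    """Propensity a_j: c*n1*n2 for distinct species, c*n1*(n1-1)/2 if identical."""
    if n2 is None:
        return c * n1 * (n1 - 1) / 2.0
    return c * n1 * n2

# Number of molecules corresponding to 1 nM in 1 cubic micron (1e-15 litres).
molecules_per_nM_um3 = 1e-9 * AVOGADRO * 1e-15
print(molecules_per_nM_um3)        # about 0.6 molecules
print(1.0 / molecules_per_nM_um3)  # volume in cubic microns holding 1 molecule at 1 nM
```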

Implementation of Gillespie’s Algorithm

At time t the numbers of molecules of each of the interacting species are known, so the $a_j(t)$ are known. Define $a_0(t) = \sum_j a_j(t)$.

Take the following steps:

1. Find the time τ at which the next reaction takes place: draw a random number from the exponential probability distribution $p(\tau) = a_0 \exp(-a_0 \tau)$.

2. Choose which reaction takes place at time τ: draw a random number from a uniform distribution between 0 and 1. If that number falls between 0 and $a_1/a_0$, reaction 1 is chosen; if between $a_1/a_0$ and $(a_1 + a_2)/a_0$, reaction 2; and so on.

3. The occurrence of the chosen reaction changes the numbers of the molecules involved; for example, for the dimerization reaction, P1 → P1 - 1, P2 → P2 - 1, Z → Z + 1. Consequently the value of the corresponding $a_j(t + \tau)$ is different from $a_j(t)$.

4. Reiterate, starting from step 1, for as long as one wishes to follow the evolution of the system.
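The four steps can be sketched as one generic Python function. This is an illustrative arrangement, not the lecture's program: the representation of a reaction as a (propensity function, change) pair, the function name `gillespie_step`, and the dimerization rate and initial numbers are all assumptions made for the example.

```python
import math
import random

def gillespie_step(state, reactions, rng):
    """Perform one step of the algorithm and return the waiting time tau.

    state     : dict mapping species name to molecule number
    reactions : list of (propensity_function, change) pairs, where
                propensity_function(state) returns a_j and change is a
                dict of increments applied when the reaction fires
    Returns None if no reaction can occur (a0 = 0).
    """
    propensities = [a(state) for a, _ in reactions]
    a0 = sum(propensities)
    if a0 <= 0.0:
        return None
    # Step 1: waiting time drawn from p(tau) = a0 * exp(-a0 * tau).
    tau = -math.log(1.0 - rng.random()) / a0
    # Step 2: pick reaction j with probability a_j / a0.
    threshold = rng.random() * a0
    cumulative = 0.0
    chosen = len(reactions) - 1
    for j, a_j in enumerate(propensities):
        cumulative += a_j
        if threshold < cumulative:
            chosen = j
            break
    # Step 3: update the molecule numbers.
    for species, delta in reactions[chosen][1].items():
        state[species] += delta
    return tau

# Dimerization P1 + P2 -> Z with an illustrative rate c = 0.001 per second.
c = 0.001
reactions = [(lambda s: c * s["P1"] * s["P2"], {"P1": -1, "P2": -1, "Z": +1})]
state = {"P1": 50, "P2": 30, "Z": 0}
rng = random.Random(1)
time = 0.0
while True:  # Step 4: iterate until no reaction is possible
    tau = gillespie_step(state, reactions, rng)
    if tau is None:
        break
    time += tau
print(time, state)
```

Because the propensities are recomputed from the current state at every call, step 3's change in $a_j$ is taken into account automatically.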

Remarks in connection with Gillespie’s algorithm implementation

If two random numbers x and y are connected through a monotonic function, y = f(x), then knowing the probability density function of x, namely p(x), one finds $p(y) = p(x)\,|dx/dy|$.

For instance, if x is uniformly distributed between 0 and 1, then $y = -(\log x)/a$ is exponentially distributed.

Why is τ exponentially distributed? The system is known at time t. The probability that the next reaction occurs between t + τ and t + τ + dτ is

$p(\tau)\,d\tau$ = p(no reaction between t and t + τ) × p(reaction between t + τ and t + τ + dτ)

$$p(\tau)\,d\tau = \exp(-a_0 \tau)\,a_0\,d\tau$$
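The change-of-variables remark can be checked numerically. With $x = e^{-ay}$ one has $|dx/dy| = a\,e^{-ay}$, so y should have mean 1/a and tail probability $P(y > s) = e^{-as}$. The Python sketch below compares these with samples; the rate a = 0.5, the sample size and the function name are arbitrary choices for illustration.

```python
import math
import random

def sample_exponential(a, rng):
    """Draw y = -log(x)/a with x uniform on (0, 1]; y has density a*exp(-a*y)."""
    x = 1.0 - rng.random()  # random() lies in [0, 1), so x lies in (0, 1]
    return -math.log(x) / a

rng = random.Random(42)
a = 0.5  # illustrative rate
samples = [sample_exponential(a, rng) for _ in range(100000)]
mean = sum(samples) / len(samples)
frac_above = sum(1 for y in samples if y > 2.0) / len(samples)
print(mean, 1.0 / a)                   # sample mean vs. theoretical mean 1/a
print(frac_above, math.exp(-a * 2.0))  # empirical vs. theoretical P(y > 2)
```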

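A MATLAB implementation of Gillespie’s algorithm

core_gillespie.m

Reactions: D → D + M, rate k1; M → Φ, rate k3

r = number of mRNA molecules

Choice of time limit? One needs to know k1 and k3. Take k1 = 0.01 1/s and k3 = 0.0058 1/s; the characteristic times are t1 = 1/c1 = 100 s and t3 = 1/c3 = 172 s.

    % rate constants (1/s), values quoted above
    k1 = 0.01; k3 = 0.0058;

    % initialization of components
    d = 1; r = 0; time = 0;
    timelimit = 500;

    % main loop over time
    while time < timelimit
        a1 = k1*d;                   % propensity of transcription
        a2 = k3*r;                   % propensity of mRNA degradation
        a0 = a1 + a2;
        r1 = rand; r2 = rand;
        tau = -(1./a0)*log(r1);      % waiting time to the next reaction
        yr2 = r2*a0;
        cumprobs = zeros(1,3);       % cumulative propensities
        cumprobs(2) = cumprobs(1) + a1;
        cumprobs(3) = cumprobs(2) + a2;
        for k = 2:length(cumprobs)
            if (cumprobs(k) >= yr2) && (cumprobs(k-1) < yr2)
                mu = k - 1;          % index of the chosen reaction
            end
        end
        if mu == 1
            r = r + 1;               % transcription: one more mRNA
        end
        if mu == 2
            r = r - 1;               % degradation: one fewer mRNA
        end
        time = time + tau;
    end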
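For readers without MATLAB, the same birth-and-death simulation can be sketched in Python. This is a port under stated assumptions rather than the lecture's program: it uses the quoted rates k1 = 0.01 1/s and k3 = 0.0058 1/s, it runs for a much longer (arbitrarily chosen) time than 500 s so that time averages settle, and it adds time-weighted averaging of r, which the MATLAB listing does not do. It also assumes k1*d > 0, so that the total propensity never vanishes.

```python
import math
import random

def simulate_mrna(k1, k3, time_limit, d=1, r=0, seed=0):
    """Gillespie simulation of D -> D + M (rate k1) and M -> 0 (rate k3).

    Returns the time-weighted mean and variance of the mRNA number r
    over the whole run (including the initial transient).
    """
    rng = random.Random(seed)
    time = 0.0
    sum_r = 0.0
    sum_r2 = 0.0
    while True:
        a1 = k1 * d  # transcription propensity
        a2 = k3 * r  # degradation propensity
        a0 = a1 + a2
        tau = -math.log(1.0 - rng.random()) / a0
        # Weight the current value of r by the time spent at that value,
        # without counting time beyond time_limit.
        dt = min(tau, time_limit - time)
        sum_r += r * dt
        sum_r2 += r * r * dt
        if time + tau >= time_limit:
            break
        time += tau
        if rng.random() * a0 < a1:
            r += 1
        else:
            r -= 1
    mean = sum_r / time_limit
    variance = sum_r2 / time_limit - mean ** 2
    return mean, variance

mean, variance = simulate_mrna(k1=0.01, k3=0.0058, time_limit=1.0e6)
print(mean, variance, 0.01 / 0.0058)  # simulated mean, variance, and k1/k3
```

The time limit matters: it should be long compared with the characteristic times 1/k1 and 1/k3, otherwise the averages are dominated by the initial transient.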

Some remarks on Gillespie's program

Efficiency: several tens of reactions; problem of very fast time scales.

Acceleration of Gillespie's algorithm for large systems:

- "tau leap" algorithm: Gillespie DT and Petzold LR, J. Chem. Phys. 2003; 119, 583-591.

- Gibson-Bruck algorithm: Gibson MA and Bruck J, J. Phys. Chem. A 2000; 104, 1876-1889.

Software packages: "Dizzy", Ramsey et al., J. Bioinf. Comp. Biol. 2005; 3, 415-436.

Hybrid models: combining deterministic and stochastic simulations.

REMARKS ON NOISE: Intrinsic and Extrinsic Noise

Until now: small copy number noise, which is intrinsic noise.

Extrinsic noise: for example, cell-to-cell variability of a kinase activated by a stimulus.

Other source of intrinsic noise: transcription and transcriptional bursting.

Example: Human dendritic cells infected by virus exhibit transcriptional bursting. Examine cell-to-cell variability in IFNβ (interferon beta) mRNA. This is evident in a log-log plot of mRNA versus cumulative probability.

Hu J et al. Power-laws in interferon-B mRNA distribution in virus-infected dendritic cells. Biophys. J. 2009; 97, 1984-1989.

Figure panels: Experimental Data; Simulation Results

Model of transcriptional bursting to explain IFNβ mRNA levels in infected dendritic cells

Simplified model:

1. D + P1 ↔ Ds1
2. Ds1 + P2 ↔ Ds11
3. Ds11 ↔ D*
4. D* → D* + M
5. M → 0

P1 and P2 are transcription factors whose binding leads to Ds11. Stochastic TF binding (reactions 1 and 2) is coupled to random bursting (reactions 3 to 5).

References:

J. Hu, S.I. Biswas, S.C. Sealfon, J. Wetmur, C. Jayaprakash, F. Hayot, Biophys. J. 97, 1984-1989 (2009)

S. Iyer-Biswas, F. Hayot, C. Jayaprakash, Phys. Rev. E, 031911 (2009)

J. Hu, S. Sealfon, F. Hayot, C. Jayaprakash, M. Kumar, A.C. Pendleton, A. Ganee, A. Fernandez-Sesma, T.M. Moran, and J.G. Wetmur, Nucleic Acids Res. 35, 5232-5241 (2007)

www.sciencesignaling.org

Slides from a lecture in the course Systems Biology—Biomedical Modeling

Citation: F. Hayot, Simulations of stochastic biological phenomena. Sci. Signal. 4, tr13 (2011).
