Upload
others
View
1
Download
0
Embed Size (px)
Citation preview
Hybrid Multilevel Monte Carlo Simulation ofStochastic Reaction Networks
Alvaro Moraesjoint work with Raul Tempone and Pedro Vilanova
http://sri-uq.kaust.edu.sa
Center for UncertaintyQuantification
Center for UncertaintyQuantification
Center for Uncertainty Quantification Logo Lock-up
Thuwal, KSA, Jan. 6, 2015
January 6, 2015
1 / 39
Our Contribution
This presentation is based on:
Ref 1: A. Moraes, R. Tempone and P. Vilanova,“Hybrid ChernoffTau-leap”, SIAM Multiscale Modeling and Simulation,12(2):581-615, 2014. [Moraes et al., 2014a]
Ref 2: A. Moraes, R. Tempone and P. Vilanova,“Multilevel HybridChernoff Tau-leap”, preprint arXiv: 1403.2943, 2014(submitted). [Moraes et al., 2014c]
Ref 3: A. Moraes, R. Tempone and P. Vilanova,“Multilevel adaptivereaction-splitting simulation method for stochastic reactionnetworks”, preprint arXiv:1406.1989, 2014 (submitted).[Moraes et al., 2014b]
2 / 39
Part I
1 State the problem
2 Exact and approximate simulation of Stochastic ReactionNetworks
3 Chernoff Tau-leap
4 Hybrid paths
5 Global error control
6 Work optimization strategies
7 Some numerical results
3 / 39
Statement of the problemLet X be a Stochastic Reaction Network
X=(X1, . . . ,Xd) : [0,T ]→ Zd+
described by J reaction channels, Rj := (νj , aj), where
• νj ∈ Zd is called stoichiometric vector
• aj : Zd+ → R+ is called propensity function
such that
P(X (t + dt) = x+νj
∣∣ X (t) = x)
= aj(x)dt + o (dt) .
Goal: Given an observable g : Zd+ → R, a tolerance TOL>0, and a
small number α > 0, find an estimator M of E [g(X (T ))] suchthat
P (|E [g(X (T ))]−M| > TOL) < α
with near-optimal computational work.4 / 39
Total error |E [g(X (T ))]−M| vs TOL
Figure : P (|E [g(X (T ))]−M| > TOL) < α = 5%
5 / 39
Example: Gene transcription and translation
• ∅ 25−→ M, a gene is being transcribed into mRNA.
• M1000−−−→ M+P, mRNA is then being translated into proteins.
• P+P0.001−−−→ D, finally the proteins produce stable Dimers.
• M0.1−−→ ∅,P 1−→ ∅ degradation of mRNA and proteins, resp.
Estimate the expected number of Dimers at time T=1 up tocertain tolerance TOL, with high probability.
ν=(νj)Jj=1:=
1 0 00 1 00 −2 1−1 0 00 −1 0
T
and a(X ):=
25
103M10−3P(P−1)
0.1MP
,
(1)
where X (t) = (M(t),P(t),D(t)).6 / 39
Example: Gene transcription and translation
0 0.2 0.4 0.6 0.8 10
1000
2000
3000
4000
5000
6000
Time
Spe
cies
mRNAProteinsDimersMean
0 0.2 0.4 0.6 0.8 110
0
101
102
103
104
Time
Sp
ecie
s
M
P
D
Figure : Exact paths of the time evolution of the species in the genetranscription and translation (GTT) example.
7 / 39
The Stochastic Simulation Algorithm(SSA)[Gillespie, 1976]
It produces statistically correct path-simulations of X . Rememberthat X is a continuous-time Markov chain living in Zd
+.
1 In state x at time t, compute (aj(x))Jj=1 and their suma0(x) =
∑1≤j≤J aj(x).
2 Simulate the time to the next reaction, τ , as an exponentialrandom variable with rate a0(x).Observe: E [τ ] = (a0(x))−1 could be very small!.
3 Simulate independently the next reaction, νj , according to theprobability mass function values (aj(x)/a0(x))Jj=1.
4 Update: t ← t + τ and x ← x+νj .
5 Record (t, x). Return to step one if t < T , otherwise end thesimulation.
8 / 39
The Tau-leap method(TL)[Aparicio and Solari, 2001, Gillespie, 2001]
From Kurtz’s random time change representation [Kurtz, 1978],
X (t) = X (0) +J∑
j=1
Yj
(∫ t
0aj(X (s))ds
)νj ,
where Yj are independent unit-rate Poisson processes, we obtainthe Tau-leap method (kind of forward Euler approximation):Given X (t)=x ∈ Zd
+,
X (t+τ) = x +J∑
j=1
Pj
aj(x)τ︸ ︷︷ ︸λj
νj ,
where Pj(λj) are independent Poisson random variables with rateλj . Danger: X (t+τ) may have some negative components!
9 / 39
A Chernoff bound for the Tau-leap method I
Problem: Given δ>0, find the largest τ = τ(x , δ) s.t.
P(X (t+τ) /∈ Zd
+
∣∣ X (t) = x)≤ ChBnd(x , τ) < δ.
Idea: Use the moment generating function of a linear combinationof independent Poisson random variables to produce a Chernoffbound ChBnd(x , τ). For i = 1, . . . , d , solve numerically
supτi>0
infs>0
exp−sxi + τi
J∑j=1
aj(x)(e−sνji−1)
= δ/d .
We define τCh := minτ1, . . . , τd.
10 / 39
What is a Chernoff bound?
Single-reaction caseLet Q ∼ Poisson(λ), λ > 0. Given n ≥ 0 integer, the Chernoffbound is given by
P (Q ≥ n) ≤ exp(n(1− log(n/λ)− λ)
),
valid for λ < n (otherwise the bound is trivial).Proof: First note that for s > 0 the Markov inequality gives
P (Q ≥ n) = P(esQ ≥ esn
)≤
E[esQ]
esn,
P (Q ≥ n) ≤ exp
(infs>0−sn + λ(es − 1)
).
When λ ∈ (0, n), infs>0−sn + λ(es − 1) is achieved ats = log(n/λ), and its value is n(1− log(n/λ)− λ).
11 / 39
Exact pre-leaping: The Chernoff bound
4 6 8 10Λ
10-6
10-5
10-4
0.001
0.01
0.1
1
Klar H1DL
Chernoff H this workL
Poisson H exact L
Gaussian H approximation L
Figure : Let n = 10 and λ ∈ (2, 10). Semi-logarithmic plot of
P (Q(λ) ≥ n) ≤ ChBnd(n, λ) = exp(n(1− log(n/λ)− λ)
). See Klar’s
bound in [Klar, 2000]. See also [Cao et al., 2005a]
12 / 39
A Hybrid Chernoff Tau-Leap Algorithm
Main issues:
• Exact algorithms like SSA and MNRM may be notcomputationally feasible, since the expected inter-arrival timebetween transitions, τSSA(x) = (a0(x))−1, where x = X (t)could be very small.
• Approximate algorithms that evolve with fixed time steps,like the Tau-leap, may be faster[Aparicio and Solari, 2001, Gillespie, 2001], but introducesdiscretization error and can jump outside Zd
+ (non physicalresults). Pre-leap: adjust the time step to control theone-step exit probability, possibly enforcing too small steps.
Main idea:
• We propose a hybrid algorithm that, at each time step,adaptively switches between SSA and Tau-leap.
13 / 39
One-step switching rule
Algorithm 1 From X (t)=x take one step. T0 is the next grid point.
1: Compute τSSA.2: if K1τSSA > T0−t then3: Use SSA4: else5: Compute τCh6: if τCh ≥ K2τSSA then7: Use Tau-Leap8: else9: Use SSA
10: end if11: end if
Note: K1 is the cost of computing τCh(x) divided by the cost of taking anSSA step. K2 is the cost of taking a Chernoff Tau-leap step divided bythe cost of taking an SSA step plus the cost of computing τCh(x).
14 / 39
One-step switching rule in the GTT model
0 10 20 30 400
5
10
15
20
25
30
35
40
mRNA
Pro
tein
s
Chernoff tau−leap and SSA regions
Tau−leapSSA
0 10 20 30 400
5
10
15
20
25
30
35
40
mRNA
Pro
tein
s
Chernoff tau−leap and SSA regions
Tau−leapSSA
0 10 20 30 400
5
10
15
20
25
30
35
40
mRNA
Pro
tein
s
Chernoff tau−leap and SSA regions
Tau−leapSSA
Figure : Regions of the one-step switching rule in the GTT model. Theblue and red dots show the Chernoff tau-leap and the SSA regions,respectively. From left to right, δ = 10−2, 10−4, 10−6, respectively.
15 / 39
Hybrid Tau-leap algorithm
Algorithm 2 Given X (t0)=x0, simulates a hybrid path.
1: while t < T do2: method ← One-step rule3: if method = SSA then4: Apply SSA and advance one step5: else6: X ′t ← Xt + P(a(Xt)τCh)ν7: if X ′t 6∈ Zd
+ then8: return9: end if
10: Xt ← X ′t11: t ← t + τCh12: NTL ← NTL + 113: end if14: end while
16 / 39
An exiting path is a rare event: the role of δ
Let A be the event that a hybrid trajectory arrived to final time Twithout exiting Zd
+. We show in Ref 1 that
P (Ac) ≤ δE [NTL]− δ2
2(E[N2TL
]− E [NTL]) + o(δ2).
In practice, we use δE [NTL] as an upper bound of P (Ac). We alsoprove that E [NTL] is bounded for polynomial propensity functionsand tends to zero when δ → 0.
Remark: The role of δ is to turn Ac into a rare event. Directsampling of hybrid paths to estimate P(Ac) is non feasible, whilethe estimate of E [NTL] is straightforward.
17 / 39
Global Error DecompositionDefine the Monte Carlo estimator M as
M :=1
M
M∑m=1
(g(X (T ))1A
)(ωm).
Define the computational global error, E , as
E := E [g(X (T ))]−M.
Global Error decomposition (notation: g :=g(X (T )))
E = E [g(X (T ))(1A + 1Ac )]± E [g1A]−M= E [g(X (T ))1Ac ]︸ ︷︷ ︸
=:EE (exit)
+E [(g(X (T ))−g) 1A]︸ ︷︷ ︸=:EI (discretization)
+ E [g1A]−M︸ ︷︷ ︸=:ES (statistical)
.
Problem: Given TOL> 0 and α> 0, find the parameters forcomputing M such that |E|<TOL with confidence level 1−α, andwith nearly optimal computational work.
18 / 39
Estimation procedure
We have a 3-phase estimation procedure:
Phase 1 (off-line) Calibration of virtual machine-dependentquantities, for examples Ci , i = 1, 2, 3.
Phase 2 Given a prescribed TOL and a confidence level α,find:
• δ (upper bound for the one-step exit probabilityfor controlling the global exit error).
• a time mesh of size h (for controlling the timediscretization error).
• M (the number of Monte Carlo realizations forcontrolling the statistical error).
such that the estimator M can be computed withnear-optimal computational work.
Phase 3 Estimation of E [g(X (T ))].
19 / 39
About Phase 2, I
Define the expected cost of a hybrid trajectory,
Ψ(h, δ) := C1E [NSSA,K1(h, δ)] + C2E [NSSA,K2(h, δ)] + C3E [NTL(h, δ)]
+J∑
j=1
E
[∫[0,T ]
CP(aj(X (s))τCh(X (s), δ))1TL(X (s))ds
],
we want to find an approximate solution ofminM,h,δMΨ(h, δ)s.t.EI + EE + ES ≤ TOL
.
where C1,C2,C3 are machine-dependent work quantities (Alg. 1)and CP is the computational work of the Gamma simulationmethod for Poisson variates [Ahrens and Dieter, 1974],
20 / 39
About Phase 2, II
500 1000 1500
2
3
4
5
6
x10−4
λ
CP(λ
)
Poisson random variates computational work model
Actual simulation runtimes
Least squares fit
0 5 10 151
2
x10−4
λ
CP(λ
)
Poisson random variates computational work model
Actual simulation runtimes
Least squares fit
Figure : Computational work of generating Poisson deviates (computerand MATLAB dependent).
21 / 39
About Phase 2, III
For a given TOL > 0, we refine δ until EE ≤ TOL2,minM,h MΨ(h, δ0)s.t.EI (h, δ0) + ES ≤ TOL−TOL2
.
We now refine the mesh until EI (h0, δ0) + ES ≤ TOL−TOL2, usingan intermediate value for Maux=Maux(h, δ,Ψ), to finally get
M(h0, δ0) =
(CA
√S2 (g(X (T ));Ms)
TOL− TOL2 − EI (h0, δ0)
)2
.
Finally, if MΨ < MSSAE [NSSA∗ ] use Hybrid, otherwise use SSA.
22 / 39
Numerical results: global error vs predicted and actual work
101
102
103
104
10−2
10−1
Work (runtime, seconds)
Err
or
bound
Predicted/Actual work vs. Error bound, Genes model
SSA predictedHybrid predictedSSAHybrid
Figure : As the TOL goes to zero, the error bound goes to zero and theexpected computational work of the Hybrid method tends to theexpected work of the SSA. Observe the agreement between predicted andactual work. (1)
23 / 39
Part II
1 The Multilevel Monte Carlo idea
2 A result about complexity of our MLMC algorithm
3 Some numerical results
24 / 39
Introducing some notation
Consider a hierarchy of nested time meshes of the interval [0,T ],indexed by ` = 0, 1, . . . , L, such that
1 ∆t0 is the size of the coarsest time mesh.
2 ∆t` = 2−`∆t0, ` = 1, . . . , L.
Let
• X`(·):=X (·; ∆t`, δ) be a hybrid Chernoff tau-leap pathgenerated using a time mesh of size ∆t` and one-step exitprobability bound δ.
• A`:=ω ∈ Ω : X`(t) ∈ Zd+, ∀t ∈ [0,T ].
• g` := g(X`(T )).
25 / 39
The MLMC estimator and Global Error Decomposition I
Consider the following telescoping decomposition (see Ref 2):
E [gL1AL] = E [g01A0 ] +
L∑`=1
E[g`1A`
− g`−11A`−1
],
which motivates our MLMC estimator of E [g(X (T ))],
ML :=1
M0
M0∑m0=1
g01A0(ωm0) +L∑`=1
1
M`
M∑m`=1
[g`1A`− g`−11A`−1
](ωm`).
We define the computational global error, EL, as
EL := E [g(X (T ))]−ML.
26 / 39
The MLMC estimator and Global Error Decomposition II
Now, consider the following decomposition of EL
EL = E[g(X (T ))(1AL
+ 1AcL)]± E [gL1AL
]−ML
= E[g(X (T ))1Ac
L
]︸ ︷︷ ︸
=:EE ,L (exit)
+E [(g(X (T ))−gL) 1AL]︸ ︷︷ ︸
=:EI ,L (discretization)
+E [gL1AL]−ML︸ ︷︷ ︸
=:ES,L (statistical)
.
Problem: Given TOL> 0, find the parameters for computing ML
such that |EL|<TOL with high probability, and with nearly optimalcomputational work.Issues:
1 Simulate coupled pairs (g`, g`−1) for ` = 1, . . . , L.
2 Estimate accurately the global error components.
27 / 39
Computational Complexity I
Theorem: Computational complexity of the Multilevel HybridChernoff Tau-Leap is
wL(TOL) =
(CA
θ
L∑`=0
√V`ψ`
)2
TOL−2 = O(TOL−2
).
where V0 := Var [g01A0 ], V` := Var[g`1A`
− g`−11A`−1
], ` ≥ 1
and ψ` is the expected computational work per level of simulatingcoupled paths.
28 / 39
Numerical results: global error vs actual work
101
102
103
104
10−2
10−1
Actual work (runtime, seconds)
Err
or
bound
Actual work vs. Error bound, Genes model
SSAHybridHybrid MLslope 1/2
Figure : Actual work for each one of the one hundred calibrations.
29 / 39
Part III
1 This is a new extension of our hybrid Multilevel Monte Carlomethod.
2 The Reaction-splitting strategy is intended to deal with stiffproblems.
3 A novel Control Variate based on a deterministic-time changeapproximation.
4 We preserve the computational complexity O(TOL−2
).
5 Some numerical results.
30 / 39
Example: Virus kinetics
• E1−→ E+G , the viral template (E) forms a viral genome (G).
• G0.025−−−→ E , the genome generates a new template.
• E1000−−−→ E+S , a viral structural protein (S) is generated.
• G+S7.5×10−6
−−−−−−→ V , the virus (V) is produced.
• E0.25−−→ ∅, S 2−→ ∅ degradation reactions.
ν:=
1 0 0 0−1 0 1 00 1 0 0−1 −1 0 10 0 −1 00 −1 0 0
tr
and a(X ):=
E0.025G1000E
7.5×10−6G S0.25E
2 S
,
where X (t) = (G (t), S(t),E (t),V (t)).
31 / 39
Example: Virus kineticsEstimate the expected number of produced viruses (V) at timeT=20 (in days) up to a given TOL with high confidence.
0 5 10 15 200
1000
2000
3000
4000
5000
600020 exact paths
Time
Num
ber
of p
artic
les
GSEV
0 5 10 15 2010
0
101
102
103
104
20 exact paths
Time
Num
ber
of p
artic
les
(log
scal
e)
GSEV
Figure : Exact paths of the time evolution of the species in the viruskinetics (VRK) example.
32 / 39
Numerical results: global error vs predicted work
104
106
108
10−3
10−2
10−1
Predicted work
Err
or
bound
Predicted work vs. Error bound, Virus model
SSAMixed MLslope 1/2Asymptotic
Figure : The quotient WMix/WHyb starts at 2% and reaches itsasymptotic value of 6%.
33 / 39
Final comments
• A novel mixed algorithm that allows us to approach problemsin which there is a natural partition on the reaction channels.
• An 3-phase estimation procedure that provides the elementsfor the multilevel Monte Carlo setting (one-step exitprobabilities, time meshes and number of mixed paths), whichoptimizes the computational work.
• The complexity of the method is of order O(TOL−2
), like in
the SSA, but with a better constant!
• A control variate for the variance of the QoI when usingsingle-level paths (applies for the hybrid and mixed methods).
• We achieved speedups of 102 − 104 with the control variateand the mixed MLMC for large systems (200 reactions &species).
34 / 39
References I
Ahrens, J. and Dieter, U. (1974).Computer methods for sampling from gamma, beta, Poissonand bionomial distributions.Computing, 12:223–246.
Anderson, D. and Higham, D. (2012).Multilevel Monte Carlo for continuous Markov chains, withapplications in biochemical kinetics.SIAM Multiscal Model. Simul., 10(1).
Anderson, D. F. (2007).A modified next reaction method for simulating chemicalsystems with time dependent propensities and delays.The Journal of Chemical Physics, 127(21):214107.
35 / 39
References II
Aparicio, J. P. and Solari, H. (2001).Population dynamics: Poisson approximation and its relationto the langevin processs.Physical Reveiw Letters, 86(18):4183–4186.
Cao, Y., Gillespie, D., and Petzold, L. (2005a).Avoiding negative populations in explicit Poisson tau-leaping.The Journal of Chemical Physics, 123:054104.
Cao, Y., Gillespie, D. T., and Petzold, L. R. (2005b).The slow-scale stochastic simulation algorithm.The Journal of Chemical Physics, 122(1):–.
Giles, M. (2008).Multi-level Monte Carlo path simulation.Operations Research, 53(3):607–617.
36 / 39
References III
Gillespie, D. T. (1976).A general method for numerically simulating the stochastictime evolution of coupled chemical reactions.Journal of Computational Physics, 22:403–434.
Gillespie, D. T. (2001).Approximate accelerated stochastic simulation of chemicallyreacting systems.Journal of Chemical Physics, 115:1716–1733.
Karlsson, J., Katsoulakis, M., Szepessy, A., and Tempone, R.(2010).Automatic Weak Global Error Control for the Tau-LeapMethod.preprint arXiv:1004.2948v3, pages 1–22.
37 / 39
References IV
Klar, B. (2000).Bounds on tail probabilities of discrete distributions.Probab. Eng. Inf. Sci., 14:161–171.
Kurtz, T. G. (1978).Strong approximation theorems for density dependent markovchains.Stochastic Processes and their Applications, 6(3):223 – 240.
Kurtz, T. G. (1982).Representation and approximation of counting processes.Adv. Filtering OptimalStochastic Control 42.
Moraes, A., Tempone, R., and Vilanova, P. (2014a).Hybrid chernoff tau-leap.Multiscale Modeling & Simulation, 12(2):581–615.
38 / 39
References V
Moraes, A., Tempone, R., and Vilanova, P. (2014b).A multilevel adaptive reaction-splitting simulation method forstochastic reaction networks.submitted to SIAM Journal of Scientific Computing, preprintarXiv:1406.1989.
Moraes, A., Tempone, R., and Vilanova, P. (2014c).Multilevel hybrid Chernoff tau-leap.arXiv:1403.2943.
39 / 39