Approximate Bayesian computation scheme for parameter inference and model selection in dynamical systems
Tina Toni, David Welch, Natalja Strelkowa, Andreas Ipsen and Michael P. H. Stumpf
J. R. Soc. Interface (2009) 6, 187–202. doi: 10.1098/rsif.2008.0172. Published online 9 July 2008.
Supplementary data: http://rsif.royalsocietypublishing.org/content/suppl/2009/05/19/6.31.187.DC1.html






Electronic supplementary material is available at http://dx.doi.org/10.1098/rsif.2008.0172 or via http://journals.royalsociety.org.

*Authors and address for correspondence: Centre for Bioinformatics, Imperial College London, London SW7 2AZ, UK ([email protected]; [email protected]).
†Present address: Department of Statistics, Pennsylvania State University, University Park, PA 16802, USA.

Received 30 April 2008. Accepted 12 June 2008.

Approximate Bayesian computation scheme for parameter inference and model selection in dynamical systems

Tina Toni1,2,*, David Welch3,†, Natalja Strelkowa4,

Andreas Ipsen5 and Michael P. H. Stumpf1,2,*

1Centre for Bioinformatics, Division of Molecular Biosciences, 2Institute of Mathematical Sciences, 3Department of Epidemiology and Public Health, 4Department of Bioengineering, and 5Department of Biomolecular Medicine, Imperial College London, London SW7 2AZ, UK

Approximate Bayesian computation (ABC) methods can be used to evaluate posterior distributions without having to calculate likelihoods. In this paper, we discuss and apply an ABC method based on sequential Monte Carlo (SMC) to estimate parameters of dynamical models. We show that ABC SMC provides information about the inferability of parameters and model sensitivity to changes in parameters, and tends to perform better than other ABC approaches. The algorithm is applied to several well-known biological systems, for which parameters and their credible intervals are inferred. Moreover, we develop ABC SMC as a tool for model selection; given a range of different mathematical descriptions, ABC SMC is able to choose the best model using the standard Bayesian model selection apparatus.

Keywords: sequential Monte Carlo; Bayesian model selection; sequential importance sampling; parameter estimation; dynamical systems; sloppy parameters

1. INTRODUCTION

Most dynamical systems studied in the physical, life and social sciences and engineering are modelled by ordinary, delay or stochastic differential equations. However, for the vast majority of systems, and particularly for biological systems, we lack reliable information about parameters and frequently have several competing models for the structure of the underlying equations. Moreover, biological experimental data are often scarce and incomplete, and the likelihood surfaces of large models are complex. The analysis of such dynamical systems therefore requires new, more realistic quantitative and predictive models. Here, we develop novel statistical tools that allow us to analyse such data in terms of dynamical models by (i) providing estimates for model parameter values, and (ii) allowing us to compare the performance of different models in describing the overall data.

In the last decade, extensive research has been conducted on estimating the parameters of deterministic systems. Much attention has been given to local and global nonlinear optimization methods (Mendes & Kell 1998; Moles et al. 2003) and, generally, parameter estimation has been performed by maximum-likelihood estimation (Muller et al. 2004; Timmer & Muller 2004; Baker et al. 2005; Bortz & Nelson 2006). The methods developed for ordinary differential equations have been extended to ordinary differential equations with time delays (Horbelt et al. 2002). Deterministic models have also been parametrized in a Bayesian framework using Bayesian hierarchical models (Putter et al. 2002; Banks et al. 2005; Huang et al. 2006). Simulated annealing, which attempts to avoid getting trapped in local minima, is another well-known optimization algorithm that has been found successful in various applications (Kirkpatrick et al. 1983; Mendes & Kell 1998). There are also several Monte Carlo-based approaches applied to the parameter estimation of deterministic (Battogtokh et al. 2002; Brown & Sethna 2003) and stochastic (Sisson et al. 2007) systems. The parameter estimation for stochastic models has been extensively explored in financial mathematics (Johannes & Polson 2005) and has been applied to biological systems in a frequentist maximum-likelihood (Reinker et al. 2006) and Bayesian (Golightly & Wilkinson 2005, 2006; Wilkinson 2006) framework.

Most commonly, model selection has been performed by likelihood ratio tests (in the case of nested models) or the Akaike information criterion (in the case of non-nested models). Recently, Bayesian methods have increasingly come into use. Vyshemirsky & Girolami (2008) have investigated different ways of computing the Bayes factors for model selection of deterministic differential equation models, and Brown & Sethna (2003) have used the Bayesian information criterion. In population genetics, model selection has been performed using approximate Bayesian computation (ABC) in its basic rejection form (Zucknick 2004; Wilkinson 2007) and coupled with multinomial logistic regression (Beaumont 2008a; Fagundes et al. 2007).

There is thus a wide variety of tools available for parameter estimation and, to a lesser extent, model selection. However, to our knowledge, no method available can be applied to all the different kinds of modelling approaches (e.g. ordinary or stochastic differential equations with and without time delay) without substantial modification, estimate credible intervals from incomplete or partially observed data, reliably explore the whole parameter space without getting trapped in local extrema, and be employed for model selection.

In this paper, we apply an ABC method based on sequential Monte Carlo (SMC) to the parameter estimation and model selection problem for dynamical models. In ABC methods, the evaluation of the likelihood is replaced by a simulation-based procedure (Pritchard et al. 1999; Beaumont et al. 2002; Marjoram et al. 2003; Sisson et al. 2007). We explore the information provided by ABC SMC about the inferability of parameters and the sensitivity of the model to parameter variation. Furthermore, we compare the performance of ABC SMC with other ABC methods. The method is illustrated on two simulated datasets (one from ecology and another from molecular systems biology), and real and simulated epidemiological datasets. As we will show, ABC SMC yields reliable parameter estimates with credible intervals, can be applied to different types of models (e.g. deterministic as well as stochastic models), is relatively computationally efficient (and easily parallelized), allows for discrimination among sets of candidate models in a formal Bayesian model selection sense and gives us an assessment of parameter sensitivity.

2. METHODS

In this section, we review and develop the theory underlying ABC, with emphasis on applications to dynamical systems, before introducing a formal Bayesian model selection approach in an ABC context.

2.1. Approximate Bayesian computation

ABC methods have been conceived with the aim of inferring posterior distributions where likelihood functions are computationally intractable or too costly to evaluate. They exploit the computational efficiency of modern simulation techniques by replacing the calculation of the likelihood with a comparison between the observed and simulated data.

Let θ be a parameter vector to be estimated. Given the prior distribution π(θ), the goal is to approximate the posterior distribution, π(θ|x) ∝ f(x|θ)π(θ), where f(x|θ) is the likelihood of θ given the data x. The ABC methods have the following generic form.

1. Sample a candidate parameter vector θ* from some proposal distribution π(θ).


2. Simulate a dataset x* from the model described by a conditional probability distribution f(x|θ*).

3. Compare the simulated dataset, x*, with the experimental data, x0, using a distance function, d, and tolerance ε; if d(x0, x*) ≤ ε, accept θ*. The tolerance ε ≥ 0 is the desired level of agreement between x0 and x*.

The output of an ABC algorithm is a sample of parameters from a distribution π(θ | d(x0, x*) ≤ ε). If ε is sufficiently small, then the distribution π(θ | d(x0, x*) ≤ ε) will be a good approximation for the posterior distribution π(θ|x0). It is often difficult to define a suitable distance function d(x0, x*) between the full datasets, so one may instead replace it with a distance defined on summary statistics, S(x0) and S(x*), of the datasets. That is, the distance function may be defined as d(x0, x*) = d′(S(x0), S(x*)), where d′ is a distance function defined on the summary statistic space. However, here, as we consider values of a dynamical process at a set of time points, we are able to compare the datasets directly without the use of summary statistics. In any case, the algorithms take the same form.

The simplest ABC algorithm is the ABC rejection sampler (Pritchard et al. 1999), which is as follows.

R1 Sample θ* from π(θ).
R2 Simulate a dataset x* from f(x|θ*).
R3 If d(x0, x*) ≤ ε, accept θ*, otherwise reject.
R4 Return to R1.
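The rejection sampler maps directly onto code. The sketch below is a minimal illustration on a toy problem of our own choosing (inferring the mean of a Gaussian, with the absolute difference of sample means as the distance); the function names are ours, not the paper's.

```python
import random

def abc_rejection(x0, prior_sample, simulate, distance, eps, n_particles):
    """ABC rejection sampler (steps R1-R4): draw theta* from the prior,
    simulate x*, and keep theta* whenever d(x0, x*) <= eps."""
    accepted = []
    while len(accepted) < n_particles:
        theta = prior_sample()          # R1: sample from the prior
        x = simulate(theta)             # R2: simulate a dataset
        if distance(x0, x) <= eps:      # R3: accept or reject
            accepted.append(theta)
    return accepted                     # R4 is the loop above

# Toy model (our illustration, not from the paper): infer the mean of a
# Gaussian; the distance compares sample means.
random.seed(1)
x0 = [random.gauss(2.0, 1.0) for _ in range(50)]
mean = lambda v: sum(v) / len(v)
post = abc_rejection(
    x0,
    prior_sample=lambda: random.uniform(-10.0, 10.0),
    simulate=lambda mu: [random.gauss(mu, 1.0) for _ in range(50)],
    distance=lambda a, b: abs(mean(a) - mean(b)),
    eps=0.2,
    n_particles=100,
)
```

Note how the wide uniform prior makes most proposals miss the tolerance, which is exactly the low-acceptance-rate weakness discussed next.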

The disadvantage of the ABC rejection sampler is that the acceptance rate is low when the prior distribution is very different from the posterior distribution. To avoid this problem, an ABC method based on Markov chain Monte Carlo was introduced (Marjoram et al. 2003). The ABC MCMC algorithm proceeds as follows.

M1 Initialize θ_i, i = 0.
M2 Propose θ* according to a proposal distribution q(θ|θ_i).
M3 Simulate a dataset x* from f(x|θ*).
M4 If d(x0, x*) ≤ ε, go to M5; otherwise set θ_{i+1} = θ_i and go to M6.
M5 Set θ_{i+1} = θ* with probability

\[ \alpha = \min\left(1,\; \frac{\pi(\theta^{*})\, q(\theta_i \mid \theta^{*})}{\pi(\theta_i)\, q(\theta^{*} \mid \theta_i)}\right) \]

and θ_{i+1} = θ_i with probability 1 − α.
M6 Set i = i + 1, go to M2.

The outcome of this algorithm is a Markov chain with the stationary distribution π(θ | d(x0, x*) ≤ ε) (Marjoram et al. 2003). That is, ABC MCMC is guaranteed to converge to the target approximate posterior distribution. Note that if the proposal distribution is symmetric, q(θ_i|θ*) = q(θ*|θ_i), then α depends only on the prior distribution. Furthermore, if the prior is uniform, then α = 1 in M5. Potential disadvantages of the ABC MCMC algorithm are that the correlated nature of samples, coupled with the potentially low acceptance probability, may result in very long chains, and that the chain may get stuck in regions of low probability for long periods of time.
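A minimal sketch of steps M1–M6, assuming a symmetric Gaussian random-walk proposal and a uniform prior so that α = 1 in M5 (the special case noted above); the toy Gaussian-mean model and the choice to start the chain at the sample mean (to avoid a long burn-in) are our own.

```python
import random

def abc_mcmc(x0, simulate, distance, eps, n_iter, theta0,
             step=0.5, lo=-10.0, hi=10.0):
    """ABC MCMC (steps M1-M6) with a symmetric Gaussian proposal and a
    uniform prior on [lo, hi]; alpha then reduces to 1, so M5 accepts
    every proposal that has already passed the distance test."""
    theta = theta0                                  # M1
    chain = [theta]
    for _ in range(n_iter):
        prop = theta + random.gauss(0.0, step)      # M2: propose
        if lo <= prop <= hi:                        # zero prior outside support
            x = simulate(prop)                      # M3
            if distance(x0, x) <= eps:              # M4 -> M5 with alpha = 1
                theta = prop
        chain.append(theta)                         # M6
    return chain

# Toy model (our illustration): the Gaussian-mean problem again.
random.seed(2)
x0 = [random.gauss(2.0, 1.0) for _ in range(50)]
mean = lambda v: sum(v) / len(v)
chain = abc_mcmc(
    x0,
    simulate=lambda mu: [random.gauss(mu, 1.0) for _ in range(50)],
    distance=lambda a, b: abs(mean(a) - mean(b)),
    eps=0.2, n_iter=2000, theta0=mean(x0),
)
```

Starting the chain far from the posterior mode would leave it stuck, which is the failure mode the text describes.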


Table 1. Interpretation of the Bayes factor (adapted from Kass & Raftery 1995).

value of the Bayes factor B12    evidence against m2 (and in favour of m1)
1–3                              very weak
3–20                             positive
20–150                           strong
>150                             very strong


The above-mentioned disadvantages of the ABC rejection and ABC MCMC methods can, at least in part, be avoided in ABC algorithms based on SMC methods, first developed by Sisson et al. (2007). In this paper, we derive ABC SMC from a sequential importance sampling (SIS) algorithm (Del Moral et al. 2006); see appendix A for the derivation and appendix B for a comparison with the algorithm of Sisson et al. (2007).

In ABC SMC, a number of sampled parameter values (called particles), {θ^(1), …, θ^(N)}, sampled from the prior distribution π(θ), are propagated through a sequence of intermediate distributions, π(θ | d(x0, x*) ≤ ε_i), i = 1, …, T−1, until they represent a sample from the target distribution π(θ | d(x0, x*) ≤ ε_T). The tolerances ε_i are chosen such that ε_1 > ⋯ > ε_T ≥ 0; thus the distributions gradually evolve towards the target posterior. For sufficiently large numbers of particles, the population approach can also, in principle, avoid the problem of getting stuck in areas of low probability encountered in ABC MCMC. The ABC SMC algorithm proceeds as follows.¹

S1 Initialize ε_1, …, ε_T. Set the population indicator t = 0.
S2.0 Set the particle indicator i = 1.
S2.1 If t = 0, sample θ** independently from π(θ). Else, sample θ* from the previous population {θ_{t−1}^(i)} with weights w_{t−1} and perturb the particle to obtain θ** ∼ K_t(θ|θ*), where K_t is a perturbation kernel. If π(θ**) = 0, return to S2.1. Simulate a candidate dataset x* ∼ f(x|θ**). If d(x*, x0) ≥ ε_t, return to S2.1.
S2.2 Set θ_t^(i) = θ** and calculate the weight for particle θ_t^(i),

\[ w_t^{(i)} = \begin{cases} 1, & \text{if } t = 0,\\[4pt] \dfrac{\pi(\theta_t^{(i)})}{\sum_{j=1}^{N} w_{t-1}^{(j)} K_t(\theta_{t-1}^{(j)}, \theta_t^{(i)})}, & \text{if } t > 0. \end{cases} \]

If i < N, set i = i + 1, go to S2.1.
S3 Normalize the weights. If t < T, set t = t + 1, go to S2.0.

¹For a more general version of the algorithm, suitable especially for application to stochastic models, see appendix A.

Particles sampled from the previous distribution are denoted by a single asterisk, and after perturbation these particles are denoted by a double asterisk. Here, we choose the perturbation kernel K_t to be a random walk (uniform or Gaussian). Note that for the special case when T = 1, the ABC SMC algorithm corresponds to the ABC rejection algorithm.
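The scheme S1–S3 can be sketched as follows for a single scalar parameter, with a uniform prior and a Gaussian perturbation kernel; this is an illustrative reimplementation on the same toy Gaussian-mean problem, not the paper's C++ code, and the function names and tolerance schedule are our own.

```python
import math
import random

def abc_smc(x0, simulate, distance, eps_seq, n_particles,
            prior_lo=-10.0, prior_hi=10.0, sigma=0.5):
    """ABC SMC (steps S1-S3) for a scalar parameter, with a uniform prior
    and a Gaussian perturbation kernel K_t of standard deviation sigma."""
    prior_pdf = 1.0 / (prior_hi - prior_lo)
    particles, weights = [], []
    for t, eps in enumerate(eps_seq):                        # S1: e1 > ... > eT
        new_particles, new_weights = [], []
        while len(new_particles) < n_particles:              # S2.0 / S2.1
            if t == 0:
                theta2 = random.uniform(prior_lo, prior_hi)  # from the prior
            else:
                theta1 = random.choices(particles, weights)[0]
                theta2 = theta1 + random.gauss(0.0, sigma)   # perturb via K_t
                if not (prior_lo <= theta2 <= prior_hi):
                    continue                                 # pi(theta**) = 0
            if distance(x0, simulate(theta2)) > eps:
                continue                                     # distance test
            if t == 0:                                       # S2.2: weights
                w = 1.0
            else:
                norm = sigma * math.sqrt(2.0 * math.pi)
                denom = sum(wj * math.exp(-(theta2 - pj) ** 2
                                          / (2.0 * sigma ** 2)) / norm
                            for pj, wj in zip(particles, weights))
                w = prior_pdf / denom
            new_particles.append(theta2)
            new_weights.append(w)
        total = sum(new_weights)                             # S3: normalize
        particles = new_particles
        weights = [w / total for w in new_weights]
    return particles, weights

# Toy model (our illustration): Gaussian mean, decreasing tolerance schedule.
random.seed(3)
x0 = [random.gauss(2.0, 1.0) for _ in range(50)]
mean = lambda v: sum(v) / len(v)
particles, weights = abc_smc(
    x0,
    simulate=lambda mu: [random.gauss(mu, 1.0) for _ in range(50)],
    distance=lambda a, b: abs(mean(a) - mean(b)),
    eps_seq=[3.0, 1.0, 0.3, 0.1], n_particles=150,
)
post_mean = sum(p * w for p, w in zip(particles, weights))
```

With a one-element tolerance schedule the loop never enters the perturbation branch and reduces to the rejection sampler, matching the T = 1 remark in the text.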

2.2. Model selection

Here, we introduce an ABC SMC model selection framework that employs standard concepts from Bayesian model selection, including Bayes factors (a comprehensive review of Bayesian model selection can be found in Kass & Raftery 1995). Let m1 and m2 be two models; we would like to choose which model explains the data x better. The Bayes factor is defined as

\[ B_{12} = \frac{P(m_1 \mid x)/P(m_2 \mid x)}{P(m_1)/P(m_2)}, \qquad (2.1) \]

where P(m_i) is the prior and P(m_i|x) is the marginal posterior distribution of model m_i, i = 1, 2. If the priors are uniform, then (2.1) simplifies to

\[ B_{12} = \frac{P(m_1 \mid x)}{P(m_2 \mid x)}. \qquad (2.2) \]

The Bayes factor is a summary of the evidence provided by the data in favour of one statistical model over another (see table 1 for its interpretation).
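Under uniform model priors, equation (2.2) reduces the Bayes factor to a ratio of marginal model posteriors, which ABC SMC estimates by the fraction of particles carried by each model. A small sketch, with hypothetical particle counts of our own:

```python
def bayes_factor(n1, n2):
    """Eq. (2.2): with uniform model priors, B12 is the ratio of marginal
    model posteriors P(m1|x)/P(m2|x), estimated here by particle counts."""
    return n1 / n2

def interpret(b12):
    """Evidence in favour of m1, following the bands of Table 1
    (Kass & Raftery 1995)."""
    if b12 < 1:
        return "favours m2"
    if b12 <= 3:
        return "very weak"
    if b12 <= 20:
        return "positive"
    if b12 <= 150:
        return "strong"
    return "very strong"

# Hypothetical final-population counts: 900 particles carry m1, 100 carry m2.
b12 = bayes_factor(900, 100)
verdict = interpret(b12)
```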

There are several advantages of Bayesian model selection when compared with traditional hypothesis testing. First, the models being compared do not need to be nested. Second, the Bayes factors do not only weigh the evidence against a hypothesis (in our case m2), but can equally well provide evidence in favour of it. This is not the case for traditional hypothesis testing, where a small p-value only indicates that the null model has insufficient explanatory power. However, one cannot conclude from a large p-value that the two models are equivalent, or that the null model is superior, but only that there is not enough evidence to distinguish between the two. In other words, 'failing to reject' the null hypothesis cannot be translated into 'accepting' the null hypothesis (Cox & Hinkley 1974; Kass & Raftery 1995). Third, unlike the posterior probability of the model, the p-value does not provide any direct interpretation of the weight of evidence (the p-value is not the probability that the null hypothesis is true).

Here, we approach the model selection problem by including a 'model parameter' m ∈ {1, …, M}, where M is the number of models, as an additional discrete parameter, and denote the model-specific parameters as θ(m) = (θ(m)^(1), …, θ(m)^(k_m)), m = 1, …, M, where k_m denotes the number of parameters in model m.

In each population, we start by sampling a model indicator m from the prior distribution π(m). For model m, we then propose new particles by perturbing the particles from the previous population specific to m; this step is the same as in the parameter estimation algorithm. The weights for particles θ(m) are also calculated in a similar way as in the parameter estimation algorithm for m.

The ABC SMC algorithm for model selection proceeds as follows.²

²In the stochastic framework, we again suggest using the general form of the algorithm with B_t > 1; see appendix A.


MS1 Initialize ε_1, …, ε_T. Set the population indicator t = 0.
MS2.0 Set the particle indicator i = 1.
MS2.1 Sample m* from π(m). If t = 0, sample θ** from π(θ(m*)). If t > 0, sample θ* from the previous population {θ(m*)_{t−1}} with weights w(m*)_{t−1}, and perturb the particle θ* to obtain θ** ∼ K_t(θ|θ*). If π(θ**) = 0, return to MS2.1. Simulate a candidate dataset x* ∼ f(x|θ**, m*). If d(x*, x0) ≥ ε_t, return to MS2.1.
MS2.2 Set m_t^(i) = m* and add θ** to the population of particles {θ(m*)_t}, and calculate its weight as

\[ w_t^{(i)} = \begin{cases} 1, & \text{if } t = 0,\\[4pt] \dfrac{\pi(\theta^{**})}{\sum_{j=1}^{N} w_{t-1}^{(j)} K_t(\theta_{t-1}^{(j)}, \theta^{**})}, & \text{if } t > 0. \end{cases} \]

If i < N, set i = i + 1, go to MS2.1.
MS3 For every m, normalize the weights. If t < T, set t = t + 1, go to MS2.0.

The outputs of the ABC SMC algorithm are the approximations of the marginal posterior distribution of the model parameter, P(m|x), and the marginal posterior distributions of parameters, P(θ_i|x, m), m = 1, …, M, i = 1, …, k_m. Note that it can happen that a model dies out (i.e. there are no particles left that belong to a particular model) if it offers only a poor description of the data. In this case, the sampling of particles continues from the remaining models only.

The Bayes factors can be obtained directly from P(m|x) using equation (2.2). However, in many cases there will not be a single best and most powerful/explanatory model (Stumpf & Thorne 2006). More commonly, different models explain different parts of the data to a certain extent. One can average over these models to obtain a better inference than from a single model only. The approximation of the marginal posterior distribution of the model, P(m|x), which is the output of the above algorithm, can be used for Bayesian model averaging (Hoeting et al. 1999).

The parameter estimation for each of the models is performed simultaneously with the model selection. The model with the highest posterior probability will typically have the greatest number of particles, thereby ensuring a good estimate of the posterior distribution of the parameters. However, some models are poorly represented in the marginal posterior distribution of m (i.e. only a small number of particles belong to these models), and so the small number of particles does not provide a very good estimate of the posterior distributions of the parameters. Therefore, one might wish to estimate parameters for these models independently.

We note that the ABC SMC model selection algorithm implicitly penalizes models with a large number of parameters; the higher the parameter dimension, the smaller the probability that the perturbed particle is accepted.


2.3. Implementation of the algorithm

The algorithm is implemented in C++. For the ODE solver code, the fourth-order classical Runge–Kutta algorithm from the GNU Scientific Library (Galassi 2003) is used; for the simulation of stochastic models, we use the Gillespie algorithm (Gillespie 1977); and for the simulation of delay differential equations, we implemented the algorithm based on the adaptive step-size ODE solvers from Numerical Recipes in C (Press et al. 1992), extended by code handling the delay part according to Paul (1992) and Enright & Hu (1995).

3. RESULTS

We demonstrate the performance of the ABC algorithms using simulated data from deterministic and stochastic systems. The data points were obtained by solving the systems for some fixed parameters at chosen time points. The sizes of the input datasets were chosen to reflect what can typically be expected in real-world datasets in ecology, molecular systems biology and epidemiology. The first two examples highlight the computational performance of ABC SMC, the problem of inferability of dynamical models and its relationship to parameter sensitivity. The third example illustrates the use of ABC SMC for model selection, which is then further demonstrated in an application to a real dataset.

3.1. Parameter inference for the deterministic and stochastic Lotka–Volterra model

The first model is the Lotka–Volterra (LV) model (Lotka 1925; Volterra 1926) describing the interaction between prey species, x, and predator species, y, with parameter vector θ = (a, b),

\[ \frac{dx}{dt} = ax - xy, \qquad (3.1a) \]

\[ \frac{dy}{dt} = bxy - y. \qquad (3.1b) \]

3.1.1. Computational efficiency of ABC SMC applied to deterministic LV dynamics. The data {(x_d, y_d)} are noisy observations of the simulated system with parameter values set at (a, b) = (1, 1). We sample eight data points (for each of the species) from the solution of the system for parameters (a, b) = (1, 1) and add Gaussian noise N(0, (0.5)²) (figure 1a). Let the distance function d((x, y), (x_d, y_d)) between the data {x_d[i], y_d[i]}, i = 1, …, 8, and a simulated solution for proposed parameters, {(x[i], y[i])}, be the sum of squared errors,

\[ d((x, y), (x_d, y_d)) = \sum_i \left[ (x[i] - x_d[i])^2 + (y[i] - y_d[i])^2 \right]. \qquad (3.2) \]

In appendix C we show that this distance function is, in fact, related to the conventional likelihood treatment of ODEs.
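The data generation and distance computation described above can be sketched as follows; the RK4 stepper stands in for the paper's GSL solver, and the initial conditions and observation times are our own illustrative choices, as the text does not list them.

```python
import random

def lv_rk4(a, b, x0=1.0, y0=0.5, dt=0.01, t_max=15.0):
    """Integrate the LV system (3.1) with classical fourth-order Runge-Kutta.
    Initial conditions and time grid are illustrative assumptions."""
    f = lambda x, y: (a * x - x * y, b * x * y - y)
    xs, ys, ts = [x0], [y0], [0.0]
    x, y, t = x0, y0, 0.0
    while t < t_max - 1e-12:
        k1x, k1y = f(x, y)
        k2x, k2y = f(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
        k3x, k3y = f(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
        k4x, k4y = f(x + dt * k3x, y + dt * k3y)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        y += dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0
        t += dt
        xs.append(x); ys.append(y); ts.append(t)
    return ts, xs, ys

def sample_at(ts, vs, times):
    """Solution value at the grid point closest to each sampling time."""
    return [vs[min(range(len(ts)), key=lambda i: abs(ts[i] - tt))]
            for tt in times]

def sse_distance(x, y, xd, yd):
    """Distance (3.2): sum of squared errors over both species."""
    return sum((xi - xdi) ** 2 + (yi - ydi) ** 2
               for xi, xdi, yi, ydi in zip(x, xd, y, yd))

# Eight noisy observations per species from (a, b) = (1, 1), noise N(0, 0.5^2).
random.seed(4)
ts, xs, ys = lv_rk4(1.0, 1.0)
obs_times = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]   # illustrative times
x_true = sample_at(ts, xs, obs_times)
y_true = sample_at(ts, ys, obs_times)
xd = [v + random.gauss(0.0, 0.5) for v in x_true]
yd = [v + random.gauss(0.0, 0.5) for v in y_true]
d0 = sse_distance(x_true, y_true, xd, yd)   # distance of the true solution
```

The distance `d0` of the noise-free solution to the noisy data plays the role of the 4.23 benchmark quoted below for the paper's own dataset.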

The distance between our noisy data and the deterministic solution for (a, b) = (1, 1), from which the data were generated, is 4.23, so the lowest distance to be


Table 2. Cumulative number of data generation steps needed to accept 1000 particles in each population for deterministic LV dynamics.

population              1       2       3       4       5
data generation steps   26 228  36 667  46 989  49 271  52 194

Figure 1. (a) Trajectories of prey (solid curve) and predator (dashed curve) populations of the deterministic LV system and the data points (circles, prey data; triangles, predator data). (b) Parameters inferred by the ABC rejection sampler.


reached is expected to be close to this number, and we choose the tolerance ε accordingly.

First, we apply the ABC rejection sampler approach with ε = 4.3. The prior distributions for a and b are taken to be uniform, a, b ∼ U(−10, 10). In order to obtain 1000 accepted particles, approximately 14.1 million data generation steps are needed, which means that the acceptance rate (7 × 10⁻⁵) is extremely low. The inferred posterior distributions are shown in figure 1b.

Applying the ABC MCMC scheme outlined above yields results comparable to those of ABC rejection, and after a careful calibration of the approach (using an adaptive Gaussian proposal distribution), we manage to reduce the computational cost markedly (including burn-in, we had to generate between 40 000 and 60 000 simulations in order to obtain 1000 independent particles).

Next, we apply the ABC SMC approach. The prior distributions for a and b are taken to be uniform, a, b ∼ U(−10, 10), and the perturbation kernels for both parameters are uniform, K_t = σU(−1, 1), with σ = 0.1. The number of particles in each population is N = 1000. To ensure the gradual transition between populations, we take T = 5 populations with ε = (30.0, 16.0, 6.0, 5.0, 4.3). The results are summarized in table 2 and figure 2. From the last population (population 5), it can be seen that both parameters are well inferred (a: median = 1.05, 95% quantile range = [1.00, 1.12]; b: median = 1.00, 95% quantile range = [0.87, 1.11]). The outcome is virtually the same as previously obtained by the ABC rejection sampler (figure 1b); however, there is a substantial reduction in the number of steps needed to reach this result. For this model, the ABC SMC algorithm needs 50 times fewer data generation steps than the ABC rejection sampler, and about the same number of data generation steps as the adaptive ABC MCMC algorithm.

The analyses were repeated with different distance functions, such as

\[ d((x, y), (x_d, y_d)) = \sum_i \left( |x[i] - x_d[i]| + |y[i] - y_d[i]| \right) \qquad (3.3) \]

and

\[ d((x, y), (x_d, y_d)) = 2 - \frac{x \cdot x_d}{\lVert x \rVert \lVert x_d \rVert} - \frac{y \cdot y_d}{\lVert y \rVert \lVert y_d \rVert}, \qquad (3.4) \]

where the dot denotes the inner product. As expected, the resulting approximations of posterior distributions are very similar (histograms not shown). Replacing the uniform perturbation kernel by a Gaussian kernel also yields the same results, but requires more simulation steps (results not shown).

3.1.2. ABC SMC inference for stochastic LV dynamics. Having obtained good estimates for the deterministic case, we next try to infer parameters of a stochastic LV model. The predator–prey process can be described by the following rate equations:

a + X → 2X with rate c_1, (3.5a)
X + Y → 2Y with rate c_2, (3.5b)
Y → ∅ with rate c_3, (3.5c)

where X denotes the prey population; Y is the predator population; and a is the fixed amount of resource available to the prey (we fix it to a = 1). These reactions define the stochastic master equation (van Kampen 2007)

\[ \frac{\partial P(x, y, t)}{\partial t} = c_1 a (x-1) P(x-1, y, t) + c_2 (x+1)(y-1) P(x+1, y-1, t) + c_3 (y+1) P(x, y+1, t) - (c_1 a x + c_2 x y + c_3 y) P(x, y, t). \qquad (3.6) \]

The ABC approach can easily be applied to inference problems involving master equations, because there exists an exact simulation algorithm (the Gillespie algorithm; Gillespie 1977; Wilkinson 2006).
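A minimal Gillespie simulator for reactions (3.5a)–(3.5c) might look as follows; the function name and the shortened time horizon are our own choices.

```python
import random

def gillespie_lv(c1, c2, c3, x0, y0, t_max, a=1.0):
    """Exact stochastic simulation of reactions (3.5a)-(3.5c) with the
    Gillespie algorithm; the hazards c1*a*x, c2*x*y and c3*y match the
    terms of the master equation (3.6)."""
    x, y, t = x0, y0, 0.0
    times, xs, ys = [0.0], [x0], [y0]
    while t < t_max:
        h = (c1 * a * x, c2 * x * y, c3 * y)
        h0 = h[0] + h[1] + h[2]
        if h0 == 0.0:
            break                              # both species extinct
        t += random.expovariate(h0)            # waiting time to next event
        r = random.uniform(0.0, h0)            # choose which reaction fires
        if r < h[0]:
            x += 1                             # a + X -> 2X (prey birth)
        elif r < h[0] + h[1]:
            x -= 1; y += 1                     # X + Y -> 2Y (predation)
        else:
            y -= 1                             # Y -> 0 (predator death)
        times.append(t); xs.append(x); ys.append(y)
    return times, xs, ys

# One realization with the rates and initial conditions used in the text;
# the short horizon keeps this example quick.
random.seed(5)
times, xs, ys = gillespie_lv(10.0, 0.01, 10.0, x0=1000, y0=1000, t_max=0.5)
```

Replicated runs of this simulator, averaged, would serve as the "simulated data" that the SMC scheme below compares against the averaged experimental data.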

Our simulated data consist of 19 data points for each species, with rates (c_1, c_2, c_3) = (10, 0.01, 10) and initial


Figure 2. Histograms of populations (a) 0 (uniform prior distribution), (b) 1, (c) 2, (d) 3, (e) 4 and (f) 5 (approximation of posterior distribution) of parameters (i) a and (ii) b of the LV system.

Figure 3. Histograms of the approximated posterior distributions of parameters (a) c_1, (b) c_2 and (c) c_3 of the stochastic LV dynamics.


conditions (X_0, Y_0) = (1000, 1000). The distance function is the square root of the sum of squared errors (3.2), and the SMC algorithm is run for T = 5 populations with ε = (4000, 2900, 2000, 1900, 1800). Especially in laboratory settings, the results from several replicate experiments are averaged over; here, we therefore also use data averaged over three independent replicates. The simulated data at every run are then, just as the experimental data, averaged over three runs. For inference purposes, the average over several runs tends to hold more information about the system's mean dynamics than a single stochastic run. If the experimental data consisted of one run only, i.e. if there were no repetitions, then the inference could in principle proceed in the same way, by comparing the experimental data with a single simulated run. This would result in a lower acceptance rate and, consequently, more data generation steps to complete the inference.

We generate N = 100 particles per population and assign the prior distributions of the parameters to be π(c_1) = U(0, 28), π(c_2) = U(0, 0.04) and π(c_3) = U(0, 28), reflecting the respective orders of magnitude of the simulated parameters (if a larger domain for π(c_2) is taken, there are no accepted instances for c_2 > 0.04). Perturbation kernels are uniform with σ_{c1} = σ_{c3} = 1.0 and σ_{c2} = 0.0025, and B_t = 10. The results are summarized in figure 3.

3.2. Parameter inference for the deterministic and stochastic repressilator model

The repressilator (Elowitz & Leibler 2000) is a popular toy model for gene regulatory systems and consists of three genes connected in a feedback loop, where each gene transcribes the repressor protein for the next gene in the loop. The model consists of six ordinary differential equations and four parameters, which are as follows:

\[ \frac{dm_1}{dt} = -m_1 + \frac{\alpha}{1 + p_3^n} + \alpha_0, \qquad (3.7a) \]

\[ \frac{dp_1}{dt} = -\beta (p_1 - m_1), \qquad (3.7b) \]

\[ \frac{dm_2}{dt} = -m_2 + \frac{\alpha}{1 + p_1^n} + \alpha_0, \qquad (3.7c) \]

Figure 4. (a) Histograms of the approximate marginal posterior distributions of parameters a0, n, b and a of the deterministic repressilator model. (b) The normalized 95% interquantile ranges (qr) of each population. The narrower the interval for a given tolerance εt, the more sensitive the model is to the corresponding parameter. The interquantile range reached in population 9 is determined by the added experimental noise. As ε9 was chosen accordingly, one cannot proceed by lowering the tolerance further. The sharp change in the interquantile ranges, which occurs, for example, for parameter a0 between populations 1 and 2, can be explained by the steep gradient of the likelihood surface along a0. (c) The output (i.e. the accepted particles) of the ABC SMC algorithm as two-dimensional scatterplots. The particles from population 1 are in yellow, particles from population 4 in black, particles from population 7 in blue and those from the last population in red. Islands of particles are observed in population 4 and they can be explained by the multimodality of the fourth intermediate distribution. (d) PCA of the set of accepted particles (population 9). Owing to the dependence of the PCA on the scaling of the original variables, the PCA was performed on the correlation matrix. The first PC explains 70.0% of the total variance, the second 24.6%, the third 5.3% and the fourth 0.1%. Pie charts show the fraction of the length of PCs explained by individual parameters.


dp2/dt = −b(p2 − m2),  (3.7d)

dm3/dt = −m3 + a/(1 + p2^n) + a0,  (3.7e)

dp3/dt = −b(p3 − m3).  (3.7f)


3.2.1. Inferability and sensitivity in deterministic repressilator dynamics. Let q = (a0, n, b, a) be the parameter vector. For the simulated data, the initial conditions are (m1, p1, m2, p2, m3, p3) = (0, 2, 0, 1, 0, 3) and the values of the parameters are q = (1, 2, 5, 1000); for these parameters the repressilator displays limit-cycle behaviour. We assume that only the mRNA (m1, m2, m3) measurements are available and the protein concentrations are


considered as missing data. Gaussian noise N(0, 5^2) is added to the data points. The distance function is defined to be the square root of the sum of squared errors. The prior parameter distributions are chosen as follows: p(a0) = U(−2, 10); p(n) = U(0, 10); p(b) = U(−5, 20); and p(a) = U(500, 2500). We assume that the initial conditions are known.
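As a concrete illustration, equations (3.7a–f) can be integrated numerically to produce the mRNA trajectories from which pseudo-data are built. This is a minimal sketch; the fixed-step Euler integrator, step size and function names are our own choices, not the authors' implementation.

```python
import math

def repressilator_rhs(state, a0=1.0, n=2.0, b=5.0, a=1000.0):
    # equations (3.7a-f); mRNAs m1, m2, m3 are repressed by proteins p3, p1, p2
    m1, p1, m2, p2, m3, p3 = state
    return [-m1 + a / (1.0 + p3 ** n) + a0,
            -b * (p1 - m1),
            -m2 + a / (1.0 + p1 ** n) + a0,
            -b * (p2 - m2),
            -m3 + a / (1.0 + p2 ** n) + a0,
            -b * (p3 - m3)]

def euler(rhs, state, t_end, dt=1e-3):
    # crude fixed-step Euler integration; adequate for a sketch
    for _ in range(int(t_end / dt)):
        deriv = rhs(state)
        state = [s + dt * d for s, d in zip(state, deriv)]
    return state
```

Noisy pseudo-data would then be obtained by recording m1, m2, m3 at the observation times and adding N(0, 5^2) noise to each point.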

The results are summarized in figure 4a–d, where we show the approximate posterior distributions, the changes in 95% interquantile ranges of parameter estimates across the populations and scatterplots of some of the two-dimensional parameter combinations. Each parameter is easily inferred when the other three parameters are fixed (histograms not shown). When the algorithm is applied to all four parameters simultaneously, parameter n is inferred the quickest and has the smallest posterior variance, while parameter a is barely inferable and has large credible intervals (figure 4a,c).

We find that ABC SMC recovers the intricate link between model sensitivity to parameter changes and inferability of parameters. The repressilator system is most sensitive to changes in parameter n and least sensitive to changes in a. Hence, the data appear to contain little information about a. Thus, ABC SMC provides us with a global parameter sensitivity analysis (Sanchez & Blower 1997) on the fly as the intermediate distributions are being constructed. Note that the intermediate distributions are nested in one another (as should be the case in SMC; figure 4c). An attempt to visualize the four-dimensional posterior distributions can be accessed in the electronic supplementary material, where we provide an animation in which the posterior distribution in the four-dimensional parameter space is projected onto two-dimensional planes.

The interquantile ranges and the scatterplots provide an initial impression of parameter sensitivity; however, the first problem with scatterplots is that it is increasingly difficult to visualize the behaviour of the model with increasing parameter dimension. Second, we have to determine the sensitivity when a combination of parameters is varied (and not just individual parameters), and this cannot be visualized via simple one-dimensional interquantile ranges or two-dimensional scatterplots.

We can use principal component analysis (PCA) to quantify the sensitivity of the system (Saltelli et al. 2008). The output of the ABC SMC algorithm, which we are going to use for our sensitivity analysis, is the last population of N particles. Associated with the accepted particles is their variance–covariance matrix, S, of rank p, where p denotes the dimension of the parameter vector. The principal components (PCs) are the eigenvectors of S, which define a set of eigenparameters, ci = ai1 q1 + ... + aip qp. Here, ai = (ai1, ..., aip) is the normalized eigenvector associated with the ith eigenvalue of S, λi, and aij describes the projection of parameter qj onto the ith eigenparameter. The PCA provides an orthogonal transformation of the original parameter space and the PCs can be taken to define a p-dimensional ellipsoid that approximates the population of data points (i.e. the accepted particles), and the eigenvalues specify the p corresponding radii. The variance of the ith PC is given by λi and the total variance of all PCs equals Σ_{i=1}^{p} λi = trace(S). Therefore, the eigenvalue λi


associated with the ith PC explains a proportion

λi / trace(S)  (3.8)

of the variation in the population of points. The smaller the λi, the more sensitive the system is to variation of the eigenparameter ci. The PCA yields only an approximate account of sensitivity, similar to what would be obtained by computing the Hessian around the mode of the posterior distribution.
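For two parameters, the proportion in equation (3.8) can be computed directly from the accepted particles. This is an illustrative sketch (the closed-form eigendecomposition of the 2×2 covariance matrix is our own shortcut, not the authors' code):

```python
import math

def pc_variance_explained(particles):
    # particles: list of (q1, q2) accepted parameter pairs; returns the
    # fraction of total variance (equation 3.8) carried by each of the two PCs
    n = len(particles)
    mean = [sum(p[k] for p in particles) / n for k in (0, 1)]
    c = [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in particles) / (n - 1)
          for j in (0, 1)] for i in (0, 1)]
    tr = c[0][0] + c[1][1]
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    disc = math.sqrt(max(tr * tr / 4.0 - det, 0.0))
    lam = [tr / 2.0 + disc, tr / 2.0 - disc]   # eigenvalues of the covariance
    return [l / tr for l in lam]               # proportions lambda_i / trace(S)
```

For perfectly correlated parameters the smaller proportion collapses to zero, signalling a sloppy direction in parameter space.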

Figure 4d summarizes the output of the PCA. It shows how much of the variance is explained by each PC, and which parameters contribute the most to these PCs. In contrast to the interest in the first PC in most PCA applications, our main interest lies in the smallest PC. The last PC extends across the narrowest region of the posterior parameter distribution, and therefore provides information on the parameters to which the model is most sensitive. In other words, the smaller PCs correspond to stiff parameter combinations, while the larger PCs may correspond to sloppy parameter combinations (Gutenkunst et al. 2007).

The analysis reveals that the last PC mainly extends in the direction of a linear combination of parameters n and b, from which we can conclude that the model is most sensitive to changes in these two parameters. Looking at the third component, the model is somewhat less sensitive to variation in a0. The model is therefore the least sensitive to changes in parameter a, which is also supported by the composition of the second PC. This outcome agrees with the information obtained from the interquantile ranges and the scatterplots.

3.2.2. Inference of the stochastic repressilator dynamics. Next, we apply ABC SMC to the stochastic repressilator model. We transformed the deterministic model (3.7) into the following set of reactions:

∅ → mi with hazard a/(1 + pj^n) + a0,  (3.9a)

mi → ∅ with hazard mi,  (3.9b)

mi → mi + pi with hazard b·mi,  (3.9c)

pi → ∅ with hazard b·pi,  (3.9d)

where i = 1, 2, 3 and, correspondingly, j = 3, 1, 2. The stochastic process defined by these reactions can be simulated with the Gillespie algorithm. True parameters and initial conditions correspond to those of the deterministic case discussed previously. The data include both mRNA and protein levels at 19 time points. Tolerances are chosen as ε = {900, 650, 500, 450, 400}, the number of particles N = 200 and Bt = 5. The prior distributions are chosen as follows: p(a0) = U(0, 10); p(n) = U(0, 10); p(b) = U(−5, 20); and p(a) = U(500, 2500).
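A direct Gillespie simulation of reactions (3.9a–d) might look as follows. This is a sketch under our own naming and data layout (state ordered m1, p1, m2, p2, m3, p3), not the authors' code.

```python
import random

def gillespie_repressilator(theta, state, t_end, rng):
    # theta = (a0, n, b, a); reactions (3.9a-d) for genes g = 0,1,2,
    # where gene g is repressed by protein p[j], j = 2,0,1 (i.e. j = 3,1,2 in
    # the 1-based notation of the text)
    a0, n, b, a = theta
    t = 0.0
    while t < t_end:
        m, p = state[0::2], state[1::2]
        hazards = []
        for g, j in ((0, 2), (1, 0), (2, 1)):
            hazards.append(a / (1.0 + p[j] ** n) + a0)  # transcription (3.9a)
            hazards.append(m[g])                        # mRNA decay    (3.9b)
            hazards.append(b * m[g])                    # translation   (3.9c)
            hazards.append(b * p[g])                    # protein decay (3.9d)
        total = sum(hazards)
        t += rng.expovariate(total)                     # time to next event
        r = rng.uniform(0.0, total)                     # pick the event
        k = 0
        while k < len(hazards) - 1 and r > hazards[k]:
            r -= hazards[k]
            k += 1
        g, reaction = divmod(k, 4)
        if reaction == 0:
            state[2 * g] += 1        # m_g -> m_g + 1
        elif reaction == 1:
            state[2 * g] -= 1        # m_g -> m_g - 1
        elif reaction == 2:
            state[2 * g + 1] += 1    # p_g -> p_g + 1
        else:
            state[2 * g + 1] -= 1    # p_g -> p_g - 1
    return state
```

Because a0 > 0 keeps the total hazard strictly positive, the loop always advances, and a species with zero copies has zero decay hazard, so counts stay non-negative.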

The inference results for comparing the average over 20 simulations with the data generated from the average of 20 simulations are summarized in figure 5a,b. Figure 5a,b shows that parameters n and b get reasonably well inferred, while a0 and a are harder to


infer. It is clearly noticeable that the parameters are better inferred in the deterministic case (figure 4a).

3.2.3. Contrasting inferability for the deterministic and stochastic dynamics. Analysing and comparing the results of the deterministic and stochastic repressilator dynamics shows that parameter sensitivity is intimately linked to inferability. If the system is insensitive to a parameter, then this parameter will be hard (or even impossible) to infer, as varying such a parameter does not vary the output (which here is the approximate posterior probability) very much. In stochastic problems, we may furthermore have the scenario where the fluctuations due to small variations in one parameter overwhelm the signals from other parameters.

3.3. Model selection on different SIR models

We illustrate model selection using a range of simple models that can describe the epidemiology of infectious diseases. SIR models describe the spread of such a disease in a population of susceptible (S), infected (I) and recovered (R) individuals (Anderson & May 1991). The simplest model assumes that every individual can be infected only once and that there is no time delay between the individual getting infected and their ability to infect other susceptible individuals,

Ṡ = a − gSI − dS,  (3.10a)

İ = gSI − vI − dI,  (3.10b)

Ṙ = vI − dR,  (3.10c)

where ẋ denotes the time derivative of x, dx/dt. Individuals, who are born at rate a, are susceptible; d is the death rate (irrespective of the disease class, S, I or R); g is the infection rate; and v is the recovery rate.
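The right-hand side of the basic model (3.10) translates directly into code. A minimal sketch with a hand-rolled Euler stepper (our own helper names and illustrative parameter values, not the authors' implementation):

```python
def sir_rhs(state, a, g, d, v):
    # basic SIR model (3.10): births at rate a, deaths at rate d in all classes
    S, I, R = state
    return (a - g * S * I - d * S,
            g * S * I - v * I - d * I,
            v * I - d * R)

def euler_sir(state, a, g, d, v, t_end, dt=0.1):
    # fixed-step Euler integration of the three coupled ODEs
    for _ in range(int(t_end / dt)):
        dS, dI, dR = sir_rhs(state, a, g, d, v)
        state = (state[0] + dt * dS, state[1] + dt * dI, state[2] + dt * dR)
    return state
```

A useful sanity check: with a = d = 0 the derivatives sum to zero, so the total population S + I + R is conserved step by step.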

The model can be made more realistic by adding a time delay τ between the time an individual gets infected and the time when they become infectious,

Ṡ = a − gSI(t − τ) − dS,  (3.11a)

İ = gSI(t − τ) − vI − dI,  (3.11b)

Ṙ = vI − dR.  (3.11c)

Another way of incorporating the time delay into the model is by including a population of individuals in a latent (L) phase of infection; in this state, they are infected but cannot yet infect others. The equations then become

Ṡ = a − gSI − dS,  (3.12a)

L̇ = gSI − δL − dL,  (3.12b)

İ = δL − vI − dI,  (3.12c)

Ṙ = vI − dR.  (3.12d)

Here, δ denotes the transition rate from the latent to the infective stage.

Another extension of the basic model (3.10) allows the recovered individuals to become susceptible again


(with rate e),

Ṡ = a − gSI − dS + eR,  (3.13a)

İ = gSI − vI − dI,  (3.13b)

Ṙ = vI − (d + e)R.  (3.13c)

There are obviously many more ways of extending the basic model, but here we restrict ourselves to the four models described above. Given the same initial conditions, the outputs of all models are very similar, which makes it impossible to choose the right model by visual inspection of the data alone. Some more sophisticated, statistically based method therefore needs to be applied to select the best available model; here we apply the ABC SMC algorithm for model selection, developed in §2.2. We define a model parameter m ∈ {1, 2, 3, 4}, representing the above models in the same order, and model-specific parameter vectors q(m): q(1) = (a, g, d, v); q(2) = (a, g, d, v, τ); q(3) = (a, g, d, v, δ); and q(4) = (a, g, d, v, e).

The experimental data consist of 12 data points from each of the three groups (S, I and R). If the data are very noisy (Gaussian noise with standard deviation σ = 1 was added to the simulated data points), then the algorithm cannot detect a single best model, which is not surprising given the high similarity of model outputs. However, if intermediate noise is added (Gaussian noise with standard deviation σ = 0.2), then the algorithm produces a posterior estimate with the most weight on the correct model. An example is shown in figure 6, where the experimental data were obtained from model 1 and perturbed by Gaussian noise, N(0, (0.2)^2). Parameter inference is performed simultaneously with the model selection (posterior parameter distributions not shown).

The Bayes factor can be calculated from the marginal posterior distribution of m, which we take from the final population. From 1000 particles, model 1 (the basic model) was selected 664 times, model 2 230 times, model 4 106 times and model 3 was not selected at all in the final population. Therefore, we can conclude from the Bayes factors

B1,2 = 664/230 = 2.9,  (3.14)

B1,4 = 664/106 = 6.3,  (3.15)

B2,4 = 230/106 = 2.2,  (3.16)

that there is weak evidence in favour of model 1 when compared with model 2 and positive evidence in favour of model 1 when compared with model 4. Increasing the amount of data will, however, change the Bayes factors in favour of the true model.
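The Bayes factors above are simply ratios of model counts in the final population. A small hypothetical helper, mirroring the counts quoted in the text:

```python
from collections import Counter

def bayes_factors(model_samples):
    # the marginal posterior P(m) is estimated by the counts in the final
    # population; B_ij = P(m=i)/P(m=j) under a uniform prior over models
    counts = Counter(model_samples)
    return {(i, j): counts[i] / counts[j]
            for i in counts for j in counts if i != j}
```

Feeding in 664, 230 and 106 copies of models 1, 2 and 4 reproduces equations (3.14)–(3.16).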

3.4. Application: common-cold outbreaks on Tristan da Cunha

Tristan da Cunha is an isolated island in the Atlantic Ocean with approximately 300 inhabitants, and it was observed that viral diseases, such as the common cold, break


Figure 6. Histograms show populations (a–i) 1–9 of the model parameter m. Population 9 represents the final posterior marginal estimate of m. Population 0 (the discrete uniform prior distribution) is not shown.


Figure 5. (a) Histograms of the approximated posterior distributions of parameters a0, n, b and a of the stochastic repressilator model. (b) The output of the ABC SMC algorithm as two-dimensional scatterplots. The particles from population 1 are in yellow, particles from population 2 in black, particles from population 3 in blue and those from the last population in red. We note that the projection to parameter n in population 2 is a sample from a multimodal intermediate distribution, while this distribution becomes unimodal from population 3 onwards.


out on the island after the arrival of ships from Cape Town. We use the 21-day common-cold data from October 1967. The data, shown in table 3, were obtained from Hammond & Tyrrell (1971) and Shibli et al. (1971). The data provide only the numbers of infected and recovered individuals, I(t) and R(t), whereas the size of the initial susceptible population S(0) is not known. Therefore, S(0) is an extra unknown parameter to be estimated.

The four epidemic models from §3.3 are used and, because no births and deaths are expected in the short period of 21 days, parameters a and d are set to 0. The tolerances are set to ε = {100, 90, 80, 73, 70, 60, 50, 40, 30, 25, 20, 16, 15, 14, 13.8} and 1000 particles are used. The prior distributions of the parameters are chosen as follows: g ~ U(0, 3); v ~ U(0, 3); τ ~ U(−0.5, 5); δ ~ U(−0.5, 5); e ~ U(−0.5, 5); S(0) ~ U(37, 100); and m ~ U(1, 4), where S(0) and m are discrete. Perturbation kernels are uniform, Kt = σU(−1, 1), with σg = σv = 0.3, στ = σδ = σe = 1.0 and σS(0) = 3.

The target and intermediate distributions of the model parameters are shown in figure 7. The model selection algorithm chooses model (3.12), i.e. the model with a latent class of disease carriers, to be the most suitable one for describing the data; however, it is only marginally better than models (3.10) and (3.11). Therefore, to draw reliable conclusions from the inferred parameters, one might wish to use model averaging over models (3.10)–(3.12). The marginal posterior distributions for the parameters of model (3.12) are shown in figure 8. However, the estimated initial susceptible population size S(0) is low compared with the whole population of the island, which suggests that either the majority of islanders were immune to the given strain of cold or (perhaps more plausibly) the system is not well


represented by our general epidemic models with homogeneous mixing. The estimated durations of the latent period, τ (from model (3.11)) and 1/δ (from model (3.12)), are, however, broadly in line with the established aetiology of the common cold (Fields et al. 1996). Thus, within the set of candidate epidemiological models, the ABC SMC approach selects the most plausible model and results in realistic parameter estimates.


Table 3. Common-cold data from Tristan da Cunha collected in October 1967.

day   1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20 21

I(t)  1  1  3  7  6  10 13 13 14 14 17 10  6  6  4  3  1  1  1  1  0
R(t)  0  0  0  0  5  7  8  13 13 16 16 24 30 31 33 34 36 36 36 36 37


Figure 7. Populations of the marginal posterior distribution of m. Models 1–4 correspond to equations (3.10)–(3.13), respectively. An interesting phenomenon is observed in populations 1–12, where model 2 has the highest probability, in contrast to model 3 having the highest inferred probability in the last population. The most probable explanation for this is that a local maximum favouring model 2 has been passed en route to a global maximum of the posterior probability favouring model 3. Populations (a–o) 1–15. Population 0 (the discrete uniform prior distribution) is not shown.


4. DISCUSSION

Our study suggests that ABC SMC is a promising tool for reliable parameter inference and model selection for models of dynamical systems that can be efficiently simulated. Owing to its simplicity and generality, ABC SMC, unlike most other approaches, can be applied without any change in both deterministic and stochastic contexts (including models with time delay).

The advantage of Bayesian statistical inference, in contrast to most conventional optimization algorithms (Moles et al. 2003), is that the whole probability distribution of the parameter can be obtained, rather


than merely the point estimates for the optimal parameter values. Moreover, in the context of hypothesis testing, the Bayesian perspective (Cox & Hinkley 1974; Robert & Casella 2004) has a more intuitive meaning than the corresponding frequentist point of view. ABC methods share these characteristics.

Another advantage of our ABC SMC approach is that observing the shape of the intermediate and posterior distributions gives (without any further computational cost) information about the sensitivity of the model to different parameters and about the inferability of parameters. All simulations are already part of the parameter estimation, and can be conveniently re-used for the sensitivity analysis via scatterplots or via analysis of the posterior distribution using, for example, PCA. It can be concluded that the model is sensitive to parameters that are inferred quickly (in earlier populations) and that have narrow credible intervals, while it is less sensitive to those that get inferred in later populations and are not very localized by the posterior distribution. If the distribution does not change much between populations and resembles the uniform distribution from population 0, then it can be concluded that the corresponding parameter is not inferable given the available data.

While the parameter estimation for individual models is straightforward when a suitable number of particles are used, more care should be taken in model selection problems; the domains of the uniform prior distributions should be chosen with care and acceptance rates should be closely monitored. These measures should prevent models being rejected in early populations solely due to inappropriately chosen (e.g. too large) prior domains. Apart from a potentially strong dependence on the chosen prior distributions (which is also inherent in standard Bayesian model selection; Kass & Raftery 1995), we also observe a dependency of the Bayes factors on changes in the tolerance levels εt and the perturbation kernel variances. Therefore, care needs to be taken when applying the ABC SMC model selection algorithm.

Finally, we want to stress the importance of monitoring convergence in ABC SMC. There are several ways to see whether a good posterior distribution has been obtained: we can use interquantile ranges or tests of goodness of fit between successive intermediate distributions. A further crucial signature is the number of proposals required to obtain a specified number of accepted particles. This will also impose a practical limit on the procedure.

For the problems in this paper, the algorithm was efficient enough. The examples here were chosen to highlight different aspects of ABC SMC's performance and usability. However, for use on larger systems, the algorithm can be made more computationally efficient by optimizing the number of populations, the distance


Figure 8. Histograms of the approximated posterior distributions of parameters (a) g, (b) v, (c) δ and (d) S(0) of the SIR model with a latent phase of infection (3.12).


function, the number of particles and the perturbation kernels (e.g. adaptive kernels). Moreover, the algorithm is easily parallelized.

5. CONCLUSION

We have developed an SMC ABC algorithm that can be used to estimate model parameters (including their credible intervals) and to distinguish among a set of competing models. This approach can be applied to deterministic and stochastic systems in the physical, chemical and biological sciences, for example biochemical reaction networks or signalling networks. Owing to the link between sensitivity and inferability, ABC SMC can, however, also be applied to larger systems: critical parameters will be identified quickly, while the system is found to be relatively insensitive to parameters that are hard to infer.

Financial support from the Division of Molecular Biosciences (Imperial College), MRC and the Slovenian Academy of Sciences and Arts (to T.T.), EPSRC (to D.W.), the Wellcome Trust (to N.S., A.I. and M.P.H.S.) and EMBO (to M.P.H.S.) is gratefully acknowledged. We thank Ajay Jasra and James Sethna for their helpful discussions and Paul Kirk, David Knowles and Sarah Langley for commenting on this manuscript. We are especially grateful to Mark Beaumont for providing his helpful comments on an earlier version of this manuscript.

APPENDIX A. DERIVATION OF ABC SMC

Here we derive the ABC SMC algorithm from the SIS algorithm of Del Moral et al. (2006) and in appendix B we show why this improves on the ABC partial rejection control (PRC) algorithm developed by Sisson et al. (2007). We start by briefly explaining the basics of importance sampling and then present the SIS algorithm.


Let π be the distribution we want to sample from, our target distribution. If it is impossible to sample from π directly, one can sample from a suitable proposal distribution, η, and use importance sampling weights to approximate π. The Monte Carlo approximation of η is

η_N(x) = (1/N) Σ_{i=1}^{N} δ_{X^(i)}(x),

where X^(i) ~iid η, and δ_{x'}(x) is the Dirac delta function, defined by

∫ f(x) δ_{x'}(x) dx = f(x').

If we assume that π(x) > 0 ⇒ η(x) > 0, then the target distribution π can be approximated by

π_N(x) = (1/N) Σ_{i=1}^{N} w(X^(i)) δ_{X^(i)}(x),

with importance weights defined as

w(x) = π(x) / η(x).

In other words, to obtain a sample from the target distribution π, one can instead sample from the proposal distribution, η, and weight the samples by the importance weights, w.
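Self-normalized importance sampling, as described above, fits in a few lines. A sketch (function names ours); normalizing the weights to sum to one also allows the target density to be unnormalized:

```python
import math
import random

def importance_sample(target_pdf, proposal_pdf, proposal_sampler, n, rng):
    # draw from the proposal eta and weight each draw by w(x) = pi(x)/eta(x);
    # normalizing the weights permits an unnormalized target density
    xs = [proposal_sampler(rng) for _ in range(n)]
    ws = [target_pdf(x) / proposal_pdf(x) for x in xs]
    total = sum(ws)
    return xs, [w / total for w in ws]

# example: estimate the mean of N(0,1) using a U(-5,5) proposal
xs, ws = importance_sample(lambda x: math.exp(-x * x / 2.0),  # unnormalized N(0,1)
                           lambda x: 0.1,                      # U(-5,5) density
                           lambda r: r.uniform(-5.0, 5.0),
                           20000, random.Random(0))
weighted_mean = sum(w * x for w, x in zip(ws, xs))
```

The weighted mean concentrates near 0, the mean of the target, even though every draw came from the uniform proposal.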

In SIS, one reaches the target distribution π_T through a series of intermediate distributions π_t, t = 1, ..., T−1. If it is hard to sample from these distributions, one can use the idea of importance sampling described above to sample from a series of proposal distributions η_t and weight the obtained samples by importance weights

w_t(x_t) = π_t(x_t) / η_t(x_t).  (A 1)

In SIS, the proposal distributions are defined as

η_t(x_t) = ∫ η_{t−1}(x_{t−1}) k_t(x_{t−1}, x_t) dx_{t−1},  (A 2)

where η_{t−1} is the previous proposal distribution and k_t is a Markov kernel.

In summary, we can write the SIS algorithm as follows.

Initialization: Set t = 1. For i = 1, ..., N, draw X_1^(i) ~ η_1. Evaluate w_1(X_1^(i)) using (A 1) and normalize.

Iteration: Set t = t + 1; if t = T + 1, stop. For i = 1, ..., N, draw X_t^(i) ~ k_t(X_{t−1}^(i), ·). Evaluate w_t(X_t^(i)) using (A 1) with η_t(x_t) from (A 2) and normalize.

To apply SIS, one needs to define the intermediate and the proposal distributions. Taking this SIS framework as a base, we now define ABC SMC as a special case of the SIS algorithm, in which we choose the intermediate and proposal distributions in an ABC fashion, as follows. The intermediate distributions are defined as

π_t(x) = (π(x) / B_t) Σ_{b=1}^{B_t} 1(d(D_0, D^(b)(x)) ≤ ε_t),


3. See also http://web.maths.unsw.edu.au/scott/papers/paper_smcabc_optimal.pdf.
4. This, including the experiment below, was suggested by Beaumont (2008b).


where π(x) denotes the prior distribution; D^(1), ..., D^(Bt) are B_t datasets generated for a fixed parameter x, D^(b) ~ p(D|x); 1(·) is an indicator function; and ε_t is the tolerance required from particles contributing to the intermediate distribution π_t. This allows us to define b_t(x) = Σ_{b=1}^{B_t} 1(d(D_0, D^(b)(x)) ≤ ε_t). We define the first proposal distribution to equal the prior distribution, η_1 = π. The proposal distribution at time t (t = 2, ..., T), η_t, is defined as the perturbed intermediate distribution at time t−1, π_{t−1}, such that for every sample, x, from this distribution we have

(i) b_t(x) > 0 (in other words, particle x is accepted at least one out of B_t times), and

(ii) π(x) > 0 (in order for the condition π_t(x) > 0 ⇒ η_t(x) > 0 to be satisfied),

η_t(x_t) = 1(π_t(x_t) > 0) 1(b_t(x_t) > 0) ∫ π_{t−1}(x_{t−1}) K_t(x_{t−1}, x_t) dx_{t−1}
         = 1(π_t(x_t) > 0) 1(b_t(x_t) > 0) ∫ η_{t−1}(x_{t−1}) w_{t−1}(x_{t−1}) K_t(x_{t−1}, x_t) dx_{t−1},  (A 3)

where K_t denotes the perturbation kernel (a random walk around the particle).

To calculate the weights defined by

w_t(x_t) = π_t(x_t) / η_t(x_t),  (A 4)

we need to find an appropriate way of evaluating η_t(x_t), defined in equation (A 3). This can be achieved through the standard Monte Carlo approximation,

η_t(x_t) = 1(π_t(x_t) > 0) 1(b_t(x_t) > 0) ∫ π_{t−1}(x_{t−1}) K_t(x_{t−1}, x_t) dx_{t−1}
         ≈ 1(π_t(x_t) > 0) 1(b_t(x_t) > 0) (1/N) Σ_{x^(i)_{t−1} ~ π_{t−1}} K_t(x^(i)_{t−1}, x_t),

where N denotes the number of particles and {x^(i)_{t−1}}, i = 1, ..., N, are all the particles from the intermediate distribution π_{t−1}. The unnormalized weights (A 4) can then be calculated as

w_t(x_t) = π(x_t) b_t(x_t) / Σ_{x^(i)_{t−1} ~ π_{t−1}} K_t(x^(i)_{t−1}, x_t),

where π is the prior distribution; b_t(x_t) = Σ_{b=1}^{B_t} 1(d(D_0, D^(b)(x_t)) ≤ ε_t); and D^(b), b = 1, ..., B_t, are the B_t simulated datasets for a given parameter x_t. For B_t = 1, the weights become

w_t(x_t) = π(x_t) / Σ_{x^(i)_{t−1} ~ π_{t−1}} K_t(x^(i)_{t−1}, x_t),

for all accepted particles x_t.

The ABC SMC algorithm can be written as follows.

S1: Initialize ε_1, ..., ε_T. Set the population indicator t = 0.

S2.0: Set the particle indicator i = 1.

S2.1: If t = 0, sample x** independently from π(x). If t > 0, sample x* from the previous population {x^(i)_{t−1}} with weights w_{t−1} and perturb the particle to obtain x** ~ K_t(x | x*), where K_t is a perturbation kernel. If π(x**) = 0, return to S2.1. Simulate a candidate dataset D^(b)(x**) ~ f(D | x**) B_t times (b = 1, ..., B_t) and calculate b_t(x**). If b_t(x**) = 0, return to S2.1.

S2.2: Set x^(i)_t = x** and calculate the weight for particle x^(i)_t,

w^(i)_t = b_t(x^(i)_t), if t = 0;
w^(i)_t = π(x^(i)_t) b_t(x^(i)_t) / Σ_{j=1}^{N} w^(j)_{t−1} K_t(x^(j)_{t−1}, x^(i)_t), if t > 0.

If i < N, set i = i + 1 and go to S2.1.

S3: Normalize the weights. If t < T, set t = t + 1 and go to S2.0.

When applying ABC SMC to deterministic systems, we take B_t = 1, i.e. we simulate the dataset for each particle only once.
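Steps S1–S3 can be sketched for a scalar parameter with a uniform perturbation kernel K_t = σU(−1, 1) and B_t = 1. This is a minimal illustrative implementation under our own naming, not the authors' code; the constant kernel density 1/(2σ) is dropped from the weight denominator because it cancels when the weights are normalized in S3.

```python
import random

def abc_smc(prior_sample, prior_pdf, simulate, distance, data,
            epsilons, n_particles, sigma, rng):
    particles, weights = [], []
    for t, eps in enumerate(epsilons):
        new_p, new_w = [], []
        while len(new_p) < n_particles:
            if t == 0:
                cand = prior_sample(rng)                      # S2.1, t = 0
            else:                                             # S2.1, t > 0
                base = rng.choices(particles, weights)[0]     # sample with weights
                cand = base + rng.uniform(-sigma, sigma)      # perturb
            if prior_pdf(cand) == 0.0:
                continue
            if distance(data, simulate(cand, rng)) > eps:     # b_t(cand) = 0
                continue
            if t == 0:
                w = 1.0
            else:
                # S2.2: prior / sum_j w_j K_t(x_j -> cand); the uniform kernel
                # contributes only for |x_j - cand| <= sigma
                denom = sum(w_j for x_j, w_j in zip(particles, weights)
                            if abs(x_j - cand) <= sigma)
                w = prior_pdf(cand) / denom
            new_p.append(cand)
            new_w.append(w)
        total = sum(new_w)                                    # S3: normalize
        particles = new_p
        weights = [w / total for w in new_w]
    return particles, weights
```

On a deterministic toy model, simulate(theta, rng) = theta with data 2.0, every particle in the final population is guaranteed to lie within the last tolerance of the data.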

APPENDIX B. COMPARISON OF THE ABC SMC ALGORITHM WITH THE ABC PRC OF SISSON et al.

In this section, we contrast the ABC SMC algorithm, which we developed in appendix A, with the ABC PRC algorithm of Sisson et al. (2007). The algorithms are very similar in principle; the main difference is that we base ABC SMC on an SIS framework, whereas Sisson et al. use an SMC sampler as the basis for ABC PRC, where the weight calculation is performed through the use of a backward kernel. Both algorithms are explained in detail in Del Moral et al. (2006). The disadvantage of the SMC sampler is that it is impossible to use an optimal backward kernel and it is hard to choose a good one. Sisson et al. choose a backward kernel that is equal to the forward kernel, which we suggest can be a poor choice.3 While this highly simplifies the algorithm, since in the case of a uniform prior distribution all the weights become equal, the resulting posterior distributions can be bad approximations to the true posterior. In particular, using equal weights can profoundly affect the posterior credible intervals.4

Here, we compare the outputs of both algorithms using the toy example from Sisson et al. (2007). The goal is to generate samples from a mixture of two normal distributions, (1/2)φ(0, 1/100) + (1/2)φ(0, 1), where φ(μ, σ²) is the density function of an N(μ, σ²) random


Figure 9. (a) The probability density function of a mixture of two normal distributions, (1/2)φ(0, 1/100) + (1/2)φ(0, 1), taken as a toy example used for a comparison of ABC SMC with ABC PRC. (b,c) Plots show how the variance of the approximated intermediate distributions π_t changes with populations (t = 1, ..., 10 on the x-axis). The red curves plot the variance of the ABC SMC population and the blue curves the variance of the non-weighted ABC PRC populations. The perturbation kernel in both algorithms is uniform, K_t = σU(−1, 1). In (b), σ = 1.5 and this results in poor estimation of the posterior variance with the ABC PRC algorithm. In (c), σ is updated in each population so that it expands over the whole population range. Such a σ is big enough for non-weighted ABC PRC to perform equally well as ABC SMC.


variable (figure 9a). Sisson et al. approximate the distribution well by using three populations with ε_1 = 2, ε_2 = 0.5 and ε_3 = 0.025, respectively, starting from a uniform prior distribution. However, ABC PRC would perform poorly if they had used more populations. In figure 9b,c, we show how the variance of the approximated posterior distribution changes through the populations. We use 100 particles and average over 30 runs of the algorithm, with tolerance schedule ε = {2.0, 1.5, 1.0, 0.75, 0.5, 0.2, 0.1, 0.075, 0.05, 0.03, 0.025}. The red line shows the variance from ABC SMC, the blue line the variance from ABC PRC and the green line the variance from the ABC rejection algorithm (figure 9b,c). In the case when the perturbation is relatively small, we see that the variance resulting from improperly weighted particles in ABC PRC is too small (blue line in figure 9b), while the variance resulting from ABC SMC is ultimately comparable to the variance obtained by the ABC rejection algorithm (green line).

However, we also note that when using many populations and too small a perturbation, e.g. a uniform perturbation K_t = σU(−1, 1) with σ = 0.15, the approximation is not very good (results not shown). One therefore needs to be careful to use a sufficiently large perturbation, irrespective of the weighting scheme. At the moment, however, there are no strict guidelines on how best to do this, and we decide on the perturbation kernel based on experience and experimentation.

For the one-dimensional deterministic case with a uniform prior distribution and uniform perturbation, one can show that all the weights will be equal when σ of the uniform kernel is equal to the whole parameter range. In this case, the non-weighted ABC PRC yields the same results as ABC SMC (figure 9c). However, when working with complex high-dimensional systems, it simply is not feasible to work with only a small number of populations, or a very big perturbation kernel spreading over the whole parameter range. Therefore, we conclude that it is important to use the ABC SMC algorithm owing to its correct weighting.
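The comparison above can be sketched in code. The following is a minimal illustration of weighted ABC SMC with the uniform perturbation kernel K_t = σU(−1, 1) on the toy mixture target; it is not the authors' implementation, and the function names, the single observed datum x* = 0 and the U(−10, 10) prior range are assumptions made for the sketch.

```python
import random

def simulate(theta):
    # Toy likelihood: draw x from (1/2)N(theta, 1/100) + (1/2)N(theta, 1).
    sd = 0.1 if random.random() < 0.5 else 1.0
    return random.gauss(theta, sd)

def abc_smc(epsilons, n=100, sigma=1.5, prior=(-10.0, 10.0)):
    """Minimal weighted ABC SMC for the toy example; observed datum x* = 0."""
    lo, hi = prior
    thetas, weights = [], []
    for t, eps in enumerate(epsilons):
        new_thetas, new_weights = [], []
        while len(new_thetas) < n:
            if t == 0:
                theta = random.uniform(lo, hi)                # sample from the prior
            else:
                base = random.choices(thetas, weights)[0]     # weighted resample prior to perturbation
                theta = base + sigma * random.uniform(-1, 1)  # uniform kernel K_t = sigma * U(-1, 1)
                if not lo <= theta <= hi:
                    continue                                  # zero prior density: reject
            if abs(simulate(theta)) > eps:
                continue                                      # distance to x* = 0 exceeds tolerance
            # SIS-style weight: prior density over the kernel-smoothed previous
            # population; with a uniform prior the numerator is a constant and
            # cancels when the weights are normalized.
            if t == 0:
                w = 1.0
            else:
                w = 1.0 / sum(w_j for th_j, w_j in zip(thetas, weights)
                              if abs(theta - th_j) <= sigma)
            new_thetas.append(theta)
            new_weights.append(w)
        total = sum(new_weights)
        thetas, weights = new_thetas, [w / total for w in new_weights]
    return thetas, weights

def weighted_var(thetas, weights):
    # Weighted variance of the final population, as plotted in figure 9b,c.
    mean = sum(w * th for th, w in zip(thetas, weights))
    return sum(w * (th - mean) ** 2 for th, w in zip(thetas, weights))
```

Tracking `weighted_var` across populations under the paper's tolerance schedule reproduces the qualitative behaviour of figure 9b,c: the weighted variance stays close to that of ABC rejection, whereas dropping the weights lets the variance collapse when σ is small.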

Another difference between the two algorithms is that ABC PRC includes the resampling step in order to maintain a sufficiently large effective sample size (Liu & Chen 1998; Liu 2001; Del Moral et al. 2006). In contrast to non-ABC SIS or SMC frameworks, ABC algorithms, by sampling with weights in step S2.1 prior to perturbation, do not, we suggest, require this additional resampling step.

We note that in SIS, the evaluation of weights after each population is computationally more costly, i.e. O(N²), than the calculation of weights in SMC with a backward kernel, which is O(N), where N denotes the number of particles. However, this cost is negligible in the ABC case, because the vast proportion of computational time in ABC is spent on simulating the model repeatedly. While thousands or millions of simulations are needed, the weights only need to be updated after every population. Therefore, we can easily afford to spend a little more computational time in order to use the correctly weighted version and circumvent the issues related to the choice of backward kernel.
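The two quantities discussed above can be sketched as follows; the helper names and the generic `prior_pdf`/`kernel_pdf` arguments are hypothetical, introduced only to make the O(N²) structure of the SIS weight update and the usual effective sample size formula explicit.

```python
def normalised_weights(prev_thetas, prev_weights, new_thetas, prior_pdf, kernel_pdf):
    """O(N^2) SIS-style update: every new particle is scored against the
    whole previous weighted population before normalization."""
    raw = []
    for theta in new_thetas:
        denom = sum(w_j * kernel_pdf(theta, th_j)
                    for th_j, w_j in zip(prev_thetas, prev_weights))
        raw.append(prior_pdf(theta) / denom)
    total = sum(raw)
    return [w / total for w in raw]

def effective_sample_size(weights):
    # ESS = 1 / sum_i w_i^2 for normalised weights: it equals N when all
    # weights are uniform and 1 when a single particle carries all the mass.
    return 1.0 / sum(w * w for w in weights)
```

The quadratic cost of `normalised_weights` is incurred once per population, which is why it is dwarfed by the thousands or millions of model simulations each population requires.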

APPENDIX C. ABC AND FULL LIKELIHOOD FOR ODE SYSTEMS

We would like to note that the ABC algorithm with the distance function chosen to be the squared error is equivalent to the maximum-likelihood problem for a dynamical system for which Gaussian errors are assumed. In other words, minimizing the distance function

\sum_{i=1}^{n} \sum_{j=1}^{m} (x_{ij} - g_j(t_i, \theta))^2,   (C 1)

where g(t, \theta) \in \mathbb{R}^m is the solution of the m-dimensional dynamical system and D = \{x_i\}_{i=1,\ldots,n}, x_i \in \mathbb{R}^m, are the (m-dimensional) data points measured at times t_1, \ldots, t_n, is equivalent to maximizing the likelihood function

\prod_{i=1}^{n} \frac{1}{(2\pi)^{m/2} |\Sigma|^{1/2}} \exp\left( -\frac{1}{2} (x_i - g(t_i, \theta))^{\mathrm{T}} \Sigma^{-1} (x_i - g(t_i, \theta)) \right),

where \Sigma is diagonal and its entries equal. This can straightforwardly be generalized to the case of multiple time-series measurements. Thus, for deterministic ODE systems, ABC is closely related to standard Bayesian inference where the likelihood is evaluated. This is because ODEs are not based on a probability model, and likelihoods are therefore generally defined in a nonlinear regression context (such as assuming that the data are normally distributed around the deterministic solution; e.g. Timmer & Muller 2004; Vyshemirsky & Girolami 2008).
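The equivalence can be checked numerically. The sketch below uses a hypothetical one-dimensional model g(t, θ) = exp(−θt) (our choice, not from the paper) and verifies on a parameter grid that the θ minimizing the squared-error distance (C 1) is the same θ that maximizes the Gaussian likelihood, since the log-likelihood is a decreasing affine function of the sum of squared errors.

```python
import math

def check():
    # Hypothetical model g(t, theta) = exp(-theta * t); synthetic data at
    # theta = 0.7 with small fixed perturbations standing in for noise.
    times = [0.0, 0.5, 1.0, 1.5, 2.0]
    noise = [0.01, -0.02, 0.015, 0.0, -0.01]
    data = [math.exp(-0.7 * t) + d for t, d in zip(times, noise)]

    def sse(theta):
        # Squared-error distance of equation (C 1) for m = 1.
        return sum((x - math.exp(-theta * t)) ** 2 for x, t in zip(data, times))

    def log_lik(theta, s2=0.05):
        # Gaussian log-likelihood with Sigma = s2 * I: the theta-dependent
        # part is -sse(theta)/(2*s2), so both criteria share one optimum.
        n = len(data)
        return -0.5 * n * math.log(2 * math.pi * s2) - sse(theta) / (2 * s2)

    grid = [i / 1000 for i in range(0, 2001)]
    theta_sse = min(grid, key=sse)   # least-squares estimate
    theta_ml = max(grid, key=log_lik)  # maximum-likelihood estimate
    return theta_sse, theta_ml
```

Both grid searches land on the same parameter value, illustrating why an ABC scheme driven by squared errors targets the same optimum as Gaussian maximum likelihood for a deterministic model.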

REFERENCES

Anderson, R. M. & May, R. M. 1991 Infectious diseases of humans: dynamics and control. New York, NY: Oxford University Press.

Baker, C. T. H., Bocharov, G. A., Ford, J. M., Lumb, P. S., Norton, S. J., Paul, C. A. H., Junt, T., Krebs, P. & Ludewig, B. 2005 Computational approaches to parameter estimation and model selection in immunology. J. Comput. Appl. Math. 184, 50–76. (doi:10.1016/j.cam.2005.02.003)

Banks, H., Grove, S., Hu, S. & Ma, Y. 2005 A hierarchical Bayesian approach for parameter estimation in HIV models. Inverse Problems 21, 1803–1822. (doi:10.1088/0266-5611/21/6/001)

Battogtokh, D., Asch, D. K., Case, M. E., Arnold, J. & Schuttler, H.-B. 2002 An ensemble method for identifying regulatory circuits with special reference to the qa gene cluster of Neurospora crassa. Proc. Natl Acad. Sci. USA 99, 16 904–16 909. (doi:10.1073/pnas.262658899)

Beaumont, M. 2008a Simulations, genetics and human prehistory (eds S. Matsumura, P. Forester & C. Renfrew). McDonald Institute Monographs, University of Cambridge.

Beaumont, M. A. 2008b A note on the ABC-PRC algorithm of Sisson et al. (2007). (http://arxiv.org/abs/0805.3079v1)

Beaumont, M. A., Zhang, W. & Balding, D. J. 2002 Approximate Bayesian computation in population genetics. Genetics 162, 2025–2035.

Bortz, D. M. & Nelson, P. W. 2006 Model selection and mixed-effects modeling of HIV infection dynamics. Bull. Math. Biol. 68, 2005–2025. (doi:10.1007/s11538-006-9084-x)

Brown, K. S. & Sethna, J. P. 2003 Statistical mechanical approaches to models with many poorly known parameters. Phys. Rev. E 68, 021 904. (doi:10.1103/PhysRevE.68.021904)

Cox, R. D. & Hinkley, D. V. 1974 Theoretical statistics. New York, NY: Chapman & Hall/CRC.

Del Moral, P., Doucet, A. & Jasra, A. 2006 Sequential Monte Carlo samplers. J. R. Stat. Soc. B 68, 411–432. (doi:10.1111/j.1467-9868.2006.00553.x)

Elowitz, M. B. & Leibler, S. 2000 A synthetic oscillatory network of transcriptional regulators. Nature 403, 335–338. (doi:10.1038/35002125)

Enright, W. & Hu, M. 1995 Interpolating Runge–Kutta methods for vanishing delay differential equations. Computing 55, 223–236. (doi:10.1007/BF02238433)

Fagundes, N., Ray, N., Beaumont, M., Neuenschwander, S., Salzano, F. M., Bonatto, S. L. & Excoffier, L. 2007 Statistical evaluation of alternative models of human evolution. Proc. Natl Acad. Sci. USA 104, 17 614–17 619. (doi:10.1073/pnas.0708280104)

Fields, B. N., Knipe, D. M. & Howley, P. M. 1996 Fundamental virology. Philadelphia, PA: Lippincott Williams & Wilkins.

Galassi, M. 2003 GNU scientific library reference manual, 2nd edn. Cambridge, UK: Cambridge University Press.

Gillespie, D. 1977 Exact stochastic simulation of coupled chemical reactions. J. Phys. Chem. 81, 2340–2361. (doi:10.1021/j100540a008)

Golightly, A. & Wilkinson, D. 2005 Bayesian inference for stochastic kinetic models using a diffusion approximation. Biometrics 61, 781–788. (doi:10.1111/j.1541-0420.2005.00345.x)

Golightly, A. & Wilkinson, D. 2006 Bayesian sequential inference for stochastic kinetic biochemical network models. J. Comput. Biol. 13, 838–851. (doi:10.1089/cmb.2006.13.838)

Gutenkunst, R., Waterfall, J., Casey, F., Brown, K., Myers, C. & Sethna, J. 2007 Universally sloppy parameter sensitivities in systems biology models. PLoS Comput. Biol. 3, e189. (doi:10.1371/journal.pcbi.0030189)

Hammond, B. J. & Tyrrell, D. A. 1971 A mathematical model of common-cold epidemics on Tristan da Cunha. J. Hyg. 69, 423–433.

Hoeting, J., Madigan, D., Raftery, A. & Volinsky, C. 1999 Bayesian model averaging: a tutorial. Statist. Sci. 14, 382–401. (doi:10.1214/ss/1009212519)

Horbelt, W., Timmer, J. & Voss, H. 2002 Parameter estimation in nonlinear delayed feedback systems from noisy data. Phys. Lett. A 299, 513–521. (doi:10.1016/S0375-9601(02)00748-X)

Huang, Y., Liu, D. & Wu, H. 2006 Hierarchical Bayesian methods for estimation of parameters in a longitudinal HIV dynamic system. Biometrics 62, 413–423. (doi:10.1111/j.1541-0420.2005.00447.x)

Johannes, M. & Polson, N. 2005 MCMC methods for continuous-time financial econometrics. In Handbook of financial econometrics (eds Y. Ait-Sahalia & L. Hansen). Amsterdam, The Netherlands: Elsevier Science Ltd.

Kass, R. & Raftery, A. 1995 Bayes factors. J. Am. Stat. Assoc. 90, 773–795. (doi:10.2307/2291091)

Kirkpatrick, S., Gelatt, C. & Vecchi, M. 1983 Optimization by simulated annealing. Science 220, 671–680. (doi:10.1126/science.220.4598.671)

Liu, J. S. 2001 Monte Carlo strategies in scientific computing. Berlin, Germany: Springer.

Liu, J. S. & Chen, R. 1998 Sequential Monte Carlo methods for dynamic systems. J. Am. Stat. Assoc. 93, 1032–1044. (doi:10.2307/2669847)

Lotka, A. J. 1925 Elements of physical biology. Baltimore, MD: Williams & Wilkins Co.

Marjoram, P., Molitor, J., Plagnol, V. & Tavare, S. 2003 Markov chain Monte Carlo without likelihoods. Proc. Natl Acad. Sci. USA 100, 15 324–15 328. (doi:10.1073/pnas.0306899100)

Mendes, P. & Kell, D. 1998 Non-linear optimization of biochemical pathways: applications to metabolic engineering and parameter estimation. Bioinformatics 14, 869–883. (doi:10.1093/bioinformatics/14.10.869)

Moles, C., Mendes, P. & Banga, J. 2003 Parameter estimation in biochemical pathways: a comparison of global optimization methods. Genome Res. 13, 2467–2474. (doi:10.1101/gr.1262503)

Muller, T. G., Faller, D., Timmer, J., Swameye, I., Sandra, O. & Klingmuller, U. 2004 Tests for cycling in a signalling pathway. J. R. Stat. Soc. Ser. C 53, 557. (doi:10.1111/j.1467-9876.2004.05148.x)

Paul, C. 1992 Developing a delay differential equation solver. Appl. Numer. Math. 9, 403–414. (doi:10.1016/0168-9274(92)90030-H)

Press, W. H., Teukolsky, S. A., Vetterling, W. T. & Flannery, B. P. 1992 Numerical recipes in C: the art of scientific computing, 2nd edn. Cambridge, UK: Cambridge University Press.

Pritchard, J., Seielstad, M. T., Perez-Lezaun, A. & Feldman, M. W. 1999 Population growth of human Y chromosomes: a study of Y chromosome microsatellites. Mol. Biol. Evol. 16, 1791–1798.

Putter, H., Heisterkamp, S. H., Lange, J. M. A. & de Wolf, F. 2002 A Bayesian approach to parameter estimation in HIV dynamical models. Stat. Med. 21, 2199–2214. (doi:10.1002/sim.1211)

Reinker, S., Altman, R. & Timmer, J. 2006 Parameter estimation in stochastic biochemical reactions. IEE Proc. Syst. Biol. 153, 168–178. (doi:10.1049/ip-syb:20050105)

Robert, C. P. & Casella, G. 2004 Monte Carlo statistical methods, 2nd edn. New York, NY: Springer.

Saltelli, A., Ratto, M., Andres, T., Campolongo, F., Cariboni, J., Gatelli, D., Saisana, M. & Tarantola, S. 2008 Global sensitivity analysis: the primer. New York, NY: Wiley.

Sanchez, M. & Blower, S. 1997 Uncertainty and sensitivity analysis of the basic reproductive rate: tuberculosis as an example. Am. J. Epidemiol. 145, 1127–1137.

Shibli, M., Gooch, S., Lewis, H. E. & Tyrrell, D. A. 1971 Common colds on Tristan da Cunha. J. Hyg. 69, 255.

Sisson, S. A., Fan, Y. & Tanaka, M. M. 2007 Sequential Monte Carlo without likelihoods. Proc. Natl Acad. Sci. USA 104, 1760–1765. (doi:10.1073/pnas.0607208104)

Stumpf, M. & Thorne, T. 2006 Multi-model inference of network properties from incomplete data. J. Integr. Bioinform. 3, 32.

Timmer, J. & Muller, T. 2004 Modeling the non-linear dynamics of cellular signal transduction. Int. J. Bifurcat. Chaos 14, 2069–2079. (doi:10.1142/S0218127404010461)

van Kampen, N. G. 2007 Stochastic processes in physics and chemistry, 3rd edn. Amsterdam, The Netherlands: North-Holland.

Volterra, V. 1926 Variazioni e fluttuazioni del numero d'individui in specie animali conviventi. Mem. R. Acad. Naz. dei Lincei 2, 31–113.

Vyshemirsky, V. & Girolami, M. A. 2008 Bayesian ranking of biochemical system models. Bioinformatics 24, 833–839. (doi:10.1093/bioinformatics/btm607)

Wilkinson, D. J. 2006 Stochastic modelling for systems biology. London, UK: Chapman & Hall/CRC.

Wilkinson, R. D. 2007 Bayesian inference of primate divergence times. PhD thesis, University of Cambridge.

Zucknick, M. 2004 Approximate Bayesian computation in population genetics and genomics. MSc thesis, Imperial College London.