Università degli Studi di Pisa
Dipartimento di Informatica
Dottorato di Ricerca in Informatica

Ph.D. Thesis Proposal

Formal Modeling of Biological Systems with Delays

Giulio Caravagna [email protected]

Abstract

Delays in biological systems may appear at any level of detail of the modeled system; in particular, delays may be used to model events whose underlying dynamics cannot be completely observed. There exist constant and variable (e.g. time–dependent, state–dependent) forms of delays, and different modeling techniques for interpreting delays. In this thesis we address the problem of formally modeling biological systems with delays in all their variants and interpretations. In the first part of the thesis we introduce the framework for deterministic modeling of biological systems with delays (e.g. Delay Differential Equations) and we define Delayed Chemical Master Equations. In complete accordance with the standard approach for formally modeling biological systems without delays, we also extend the framework for stochastic modeling by defining some variants of Delayed Stochastic Simulation Algorithms. In the second part of the thesis we address the problem of both qualitative and quantitative formal modeling of biological systems with delays, by using existing formal language theory and by extending well–known formal languages for modeling biological systems without delays.

January 2, 2009

Formal Modeling of Biological Systems with Delays · groups.di.unipi.it/~caravagn/papers/proposal-phd-thesis.pdf · 2009-01-16 · Universita degli Studi di Pisa Dipartimento di Informatica



Preface

This proposal is developed as a thesis in progress. Sections are structured as, at the moment, I expect they will be in the final version: some of them contain results obtained in the first year of my Ph.D. studies, and others contain some ideas on further developments and on results I hope to obtain in the future.

Contents

1 Motivations
  1.1 Scenario
  1.2 Deterministic Models of Biological Systems
  1.3 Stochastic Models of Biological Systems

2 Simulation of Biological Systems with Delays
  2.1 Deterministic Models with Delays
    2.1.1 Deterministic Models with Constant Delays
    2.1.2 Deterministic Models with Variable Delays
  2.2 Stochastic Models with Delays
    2.2.1 Stochastic Models with Constant Delays
    2.2.2 Stochastic Models with Variable Delays
    2.2.3 Approximation Techniques for Stochastic Models

3 Examples
  3.1 Epidemics Models
  3.2 Cellular Models
  3.3 Evolutionary Models

4 Formal Modeling of Biological Systems With Delays
  4.1 Qualitative Modeling of Biological Systems With Delays
  4.2 Quantitative Modeling of Biological Systems With Delays

5 Conclusions


1 Motivations

Biochemistry, often conveniently described as the study of the chemistry of life, is a multi–faceted science that includes the study of all forms of life and that utilizes basic concepts derived from Biology, Chemistry, Physics and Mathematics to achieve its goals. Biochemical research, which arose in the last century with the isolation and chemical characterization of organic compounds occurring in nature, is today an integral component of most modern biological research.

Most biological phenomena of concern to biochemists occur within small, living cells. In addition to understanding the chemical structure and function of the biomolecules that can be found in cells, it is equally important to comprehend the organizational structure and function of the membrane–limited aqueous environments called cells. Attempts to do the latter are now more common than in previous decades. Where biochemical processes take place in a cell and how these systems function in a coordinated manner are vital aspects of life that cannot be ignored in a meaningful study of biochemistry. Cell biology, the study of the morphological and functional organization of cells, is now an established field in biochemical research.

Computer Science and Mathematics can help research in cell biology in several ways. For instance, they can provide biologists with models and formalisms able to describe and analyze complex systems such as cells. This rather new interdisciplinary field of research is named Systems Biology.

The approach used to solve a problem in systems biology is typically the following. First, the biological system has to be identified in all of its components, if possible. In particular, all the involved elements and the interesting events have to be identified; this is generally one of the major problems, because often not all of this information is available, due to the non–observability of some events or to incomplete knowledge of the biological system itself. We say components and events rather than molecules and reactions because systems biology applies to different kinds of biological systems (e.g. biochemistry, cell study, epidemics, population dynamics). Once the system has been identified, it is possible to build a deterministic or a stochastic model; the model is built on the data obtained by observing the biological system and, potentially, on some biologically meaningful conjectures.

These models differ in the frameworks on which they are based. The deterministic model is, in general, a system of ordinary differential equations (ODEs) which models the variation of the concentrations of the involved components. In theory, the ODE model may be studied analytically (e.g. the solution of the equations, the equilibria and the bifurcation points) or via numerical simulation. In practice, for a complex and realistic model the analytical solution is difficult to study or may not be obtainable at all, whereas numerical simulation is always possible.

Models of this kind, although very useful when dealing with biological systems involving a huge number of components, are not always satisfactory. In particular, when the size of the biological system is small, their simulation does not show some behaviors which are observed in the real biological system. To overcome this limitation, stochastic models can be defined.

A stochastic model can be defined only if it is possible to model the biological events with a stochastic behavior, which is typical in biological systems due to the chaotic dynamics of the systems themselves. Such a model, which is built with the same objectives and from the same observations used to build the deterministic one, consists in defining a chemical master equation (CME), a special kind of differential equation which describes the time evolution of the probability of the system to occupy each one of a discrete set of states. Although formally correct, this approach is, in general, not usable because the CME is prohibitively difficult to solve analytically [31]. To fill this gap, starting from the definition of the CME, Gillespie defined the stochastic simulation algorithm (SSA) [31], which computes one trajectory in the state space of the modeled system; this algorithm is exact in the sense that it produces one exact time–evolution of the modeled system, given an initial condition. Unfortunately, the SSA suffers from scalability problems with respect to the size of the modeled system and with respect to the kinetics of the modeled events. In other words, if the concentrations of the involved components are huge (e.g. a cellular interaction model may contain billions of molecules), then the simulation via the SSA may be prohibitively slow. Some variants of this algorithm exist [30, 18] and, to mitigate this scalability problem, some approximations [33, 17] have been developed which are not exact but are usable in practice under some assumptions.

Whatever model is built, if its simulated temporal evolution fits the real observed data, then it can be used as a base for compositionally developing more complex models. In particular, it can be used to make predictions on the biological system by simply adding to the model the modeling of the predictions themselves. This approach can be useful to make biological conjectures on the behavior of more complex systems and, furthermore, to have a simulable model of the biological system.

The following schema summarizes the discussed approach.

This work aims at defining a wider class of biological systems which can be modeled by following the outlined approach. In particular, ordinary differential equations, the CME and the SSA can be used only for systems where the modeled events have no delay. The aim of this thesis is to extend these frameworks so that biological systems with delays can also be modeled.

Delays in biological systems may appear at any level of detail of the modeled system; in particular, delays may be used to model events whose underlying dynamics cannot be completely observed. If it were possible to define an event in terms of all the sub–events which compose it, that would be the best low–level description of the modeled event. If this is not possible, namely there is no full knowledge of all the underlying dynamics composing the main event, then it is reasonable to assume a fixed maximum time within which the sub–events are completed and, consequently, to model the main event with a delay equal to that time. Notice that this is a quite general way of interpreting delays when modeling biological systems; there exist some biological systems for which the biological interpretation of delays is quite different. Some examples of such systems, together with their interpretation of delays, are presented in Section 3.

This work addresses the modeling of biological systems with delays by following this approach. As regards deterministic models, it is possible to define a more general framework of differential equations, the delay differential equations (DDEs), which can naturally model delays. This framework is very general with respect to the kinds of delays it can model, so it is of interest to study whether all these different forms of delay are biologically meaningful. In Section 2.1 some of these variants are discussed.

For the same reasons that led us to build a stochastic simulation framework alongside the deterministic one, it is worth defining, depending on the type of delay we are interested in, a delayed chemical master equation (DCME) conceptually analogous to the CME. The DCME suffers from the same disadvantages as the CME and, consequently, as for non–delayed systems, a delayed stochastic simulation algorithm (DSSA), conceptually analogous to the SSA, has to be defined. Depending on the kind of delays involved in the system, it is possible to define such an algorithm, as shown in Section 2.2. Summarizing, the whole of Chapter 2 is dedicated to the extension of both deterministic and stochastic modeling to biological systems with delays.

From a computer science perspective, Chapter 4 is devoted to the study of formal languages for the modeling of biological systems with delays. In particular, the formal computer science approach, with all its background theory on concurrent systems, has been used in recent years to define formal languages for systems biology.

Many formalisms have been proposed, some as adaptations of existing ones and others designed to be biologically inspired. As regards the former class of languages, it is worth mentioning the Stochastic π–calculus [42], namely a stochastic extension of the π–calculus [39], a process algebra for modeling concurrent processes. In this approach chemically reacting entities are described by processes and biological reactions are modeled as communications on channels; the synchronization of communicating processes is interpreted as the firing of a reaction.

As regards the latter class of formal languages, the wider one, it is worth mentioning the κ–calculus [24], BioAmbients [44], Brane Calculi [19], P Systems [40], Stochastic CLS [38] and Stochastic String Multiset Rewriting [9]. These languages differ in the theory on which they are built, namely process algebra theory, concurrent systems theory, rewriting systems theory or combinations of these. From a biological perspective, these languages make it easy to define biological systems more complex than simple chemically reacting systems. In fact, they generally allow one to express, by using some ad–hoc syntactic operators, biological aggregations of components, arbitrarily nested membranes and more complex and general biological components. Furthermore, they provide some primitive operations for easily modeling biological events, for instance the creation, the dissolution and the merging of membranes and of their content.

All the mentioned languages permit the definition of models on which properties can be verified via model checking or abstract interpretation, but these models can be simulated in a stochastic framework only if a stochastic semantics has been defined. For some of them the stochastic semantics has been defined, and this made it possible to develop specific simulators (e.g. SPiM [53] based on the Stochastic π–calculus, CytoSim and PSym [52] based on P Systems, and CLSm [50] based on Stochastic CLS). These simulators make it possible, in the same fashion as the approximation techniques for ODEs, to trace the time–evolution of the modeled biological systems with respect to some simulation algorithm. The former approach is commonly named qualitative modeling and the latter quantitative modeling.

The algorithm mainly used for simulations is, as expected, the SSA, which can be applied to the result of applying the semantics to the modeled system. To this end, the semantics of these languages is typically given in terms of inference rules which, when applied to a term describing the model, build a labelled transition system (LTS) where the labels represent the stochastic rates of the events which lead from one state to another. From the initial model of the biological system, by applying the rules of the semantics, it is possible to compute all the reachable states, namely the full LTS. This LTS can be used to build a Continuous–Time Markov Chain (CTMC), namely a matrix of all the possible different states enriched with the probability of moving from one state to another. By definition of the CTMC, these probabilities, namely the stochastic behavior of the system, depend only on the current state describing the modeled system. For this reason, it is possible to apply to the CTMC an algorithm, the SSA, in which the probability of choosing one event depends only on the current state of the simulation.
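The step from a rate–labelled LTS to a CTMC can be sketched in a few lines of Python. This is only an illustrative sketch: the encoding of transitions as `(source, rate, target)` triples, the function name `ctmc_from_lts` and the toy rates are assumptions made here, not part of any of the cited formalisms.

```python
from collections import defaultdict

def ctmc_from_lts(transitions):
    """Derive CTMC exit rates and jump probabilities from a rate-labelled LTS.

    `transitions` is an iterable of (source, rate, target) triples; this
    encoding is a hypothetical choice made for the sketch.
    """
    rates = defaultdict(lambda: defaultdict(float))
    for src, rate, dst in transitions:
        rates[src][dst] += rate  # parallel transitions add their rates
    # Total rate of leaving each state.
    exit_rate = {s: sum(out.values()) for s, out in rates.items()}
    # Probability of each jump depends only on the current state (Markov property).
    jump_prob = {
        s: {d: r / exit_rate[s] for d, r in out.items()}
        for s, out in rates.items()
    }
    return exit_rate, jump_prob

# Toy LTS with three states and illustrative rates: A -> B, B -> A, B -> C.
lts = [("A", 2.0, "B"), ("B", 0.5, "A"), ("B", 1.5, "C")]
exit_rate, jump_prob = ctmc_from_lts(lts)
```

The jump probabilities computed for a state use only the rates of the transitions leaving that state, which is exactly the memoryless property the SSA relies on.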

Notice that whenever we consider models with delays, under some interpretations of the delays the probabilities may depend on some past states of the system and not only on the current one. Consequently, the stochastic process built by the SSA would not be correct, due to the non–Markovian behavior of this kind of process. In other words, depending on the past states of the system means that the process is not memoryless, whereas memorylessness is the main feature of a Markov process; consequently, the current semantics with the corresponding LTSs cannot be used any more.

Summarizing, in order to model biological systems with delays both qualitatively and quantitatively, we will discuss whether some of the well–known formal languages can be extended to address qualitative modeling. Moreover, we would like to extend the semantics of some of the mentioned formalisms to quantitative modeling.

1.1 Scenario

In this section we introduce the scenario in which we can define both the deterministic and the stochastic modeling of biological systems.

We consider a well–stirred system of molecules of $N$ chemical species $\{S_1, \ldots, S_N\}$ interacting through $M$ chemical reaction channels $R_1, \ldots, R_M$. We assume the system to be confined in a constant volume and to be in thermal equilibrium at some constant temperature. We denote the number of molecules of species $S_i$ in the system at time $t$ with $X_i(t)$, and we want to study the evolution of the state vector $X(t) = (X_1(t), \ldots, X_N(t))$, assuming that the system was initially in some state $X(t_0) = x_0$.

A reaction channel $R_j$ is characterized mathematically by two quantities. The first is its state–change vector $\nu_j = (\nu_{1j}, \ldots, \nu_{Nj})$, where $\nu_{ij}$ is defined to be the change in the $S_i$ molecular population caused by one $R_j$ reaction; thus, if the system is in state $x$ and a reaction $R_j$ occurs, the system jumps to state $x + \nu_j$.

The other characterizing quantity for reaction channel $R_j$ is its propensity function $a_j(x)$; this is defined, according to [31], so that $a_j(x)\,dt$ is the probability, given $X(t) = x$, that reaction $R_j$ fires in the next infinitesimal time interval $[t; t + dt[$. The probabilistic definition of the propensity function finds its justification in physical theory [31].

In order to correctly define the propensity functions of the reactions we recall the fundamental empirical law governing reaction rates in biochemistry: the law of mass action. This states that for a reaction in a homogeneous medium, the reaction rate is proportional to the concentrations of the individual reactants involved. For example, consider the simple molecular reaction $2A \overset{k}{\mapsto} B$, namely a reaction which transforms two molecules of the reactant $A$ into a single product $B$ with kinetic constant $k$. By the law of mass action, the rate of production of molecule $B$ is

$$\frac{d[B]}{dt} = k[A]^2$$

and, since each reaction consumes two molecules of $A$, the rate of destruction of $A$ is

$$\frac{d[A]}{dt} = -2k[A]^2$$

where $[A]$ and $[B]$ are the standard chemical notations for the concentrations (i.e. moles per unit volume) of the respective molecules. To define the propensity function for such a reaction, according to [31], it would be correct to define it as $a(x) = k[A]^2$, where $[A]$ denotes the concentration of $A$ in $x$.
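As a small worked illustration of the law of mass action, the following Python sketch computes the rate of a reaction from the concentrations of its reactants. The function name `mass_action_rate`, the dictionary encoding of stoichiometry and the numeric values are hypothetical choices made for this example only.

```python
def mass_action_rate(k, concentrations, reactants):
    """Rate of a reaction under the law of mass action.

    `reactants` maps each reactant species to the number of molecules
    consumed by one reaction (hypothetical encoding for this sketch).
    """
    rate = k
    for species, count in reactants.items():
        rate *= concentrations[species] ** count
    return rate

# Reaction 2A -> B with illustrative values k = 0.5 and [A] = 3.0.
conc = {"A": 3.0, "B": 0.0}
rate = mass_action_rate(k=0.5, concentrations=conc, reactants={"A": 2})

d_B = rate        # one B is produced per reaction
d_A = -2 * rate   # two A are consumed per reaction
```

The factor 2 in `d_A` comes from the stoichiometry of the reaction: each firing removes two molecules of A while producing a single B.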

This kinetic law is the fundamental one and the one most commonly used to develop models, both stochastic and deterministic. Other laws exist (e.g. Michaelis–Menten) and will appear later in the examples of biological systems discussed in this thesis.


1.2 Deterministic Models of Biological Systems

Deterministic models have been the most widely used models of biological systems since the last century. Models of this kind consist of a set of ordinary differential equations (ODEs) which describe, in general, the time evolution of the concentrations of the involved species in a given volume.

In mathematics, an ordinary differential equation is a relation that contains a function of only one independent variable and one or more of its derivatives with respect to that variable. The general form of an ordinary differential equation for $X(t) \in \mathbb{R}^n$ is

$$\frac{dX}{dt} = f_x(t, X(t)),$$

where $dX/dt$ may depend on the state of the system at time $t$, $X(t)$, and not on any previous states. Much study has been devoted to the solution of ordinary differential equations. In the case where the equation is linear, it can be solved by analytical methods. Unfortunately, most of the interesting differential equations are non–linear and, with a few exceptions, cannot be solved exactly. Approximate solutions are obtained by the numerical simulation algorithms developed in the last century.

As an example of a deterministic model, let us consider one of the first mathematical models of tumor immunotherapy [35], namely a model of the dynamics between tumor cells, $T(t)$, immune–effector cells, $E(t)$, and interleukin-2 (IL-2), $I_L(t)$, a cytokine which is observed to boost the immune system to fight tumors. For a close examination of the model we refer to [35]; here we simply reproduce the model and discuss how it is built. The model consists of the following set of ODEs:

$$\frac{dE}{dt} = cT(t) - \mu_2 E(t) + \frac{p_1 E(t) I_L(t)}{g_1 + I_L(t)}$$

$$\frac{dT}{dt} = r_2 T(t) - r_2 b\,T(t)^2 - \frac{a E(t) T(t)}{g_2 + T(t)}$$

$$\frac{dI_L}{dt} = \frac{p_2 E(t) T(t)}{g_3 + T(t)} - \mu_3 I_L(t).$$

The first equation describes the rate of change of the effector–cell population. Effector cells are stimulated to grow by two terms. One is a recruitment term, $cT$, due to the direct presence of the tumor, where the parameter $c$ models the antigenicity of the tumor. The other growth term, $(p_1 E I_L)/(g_1 + I_L)$, is a proliferation term whereby effector cells are stimulated by IL-2, which is itself produced by effector cells. This term is of Michaelis–Menten form to indicate the saturated effects of the immune response. The remaining term, $-\mu_2 E$, models the natural lifespan of the effector cells, which is $1/\mu_2$ days on average. The second equation gives the rate of change of the tumor cells. Growth is described by the logistic limiting–growth term $r_2 T - r_2 b T^2$. The loss of tumor cells is represented by an immune–effector cell interaction at rate $a$. This rate constant $a$ represents the strength of the immune response, and the interaction is modeled by Michaelis–Menten kinetics to indicate the limited immune response to the tumor. The last equation gives the rate of change of the concentration of IL-2. Its source is the effector cells that are stimulated by interaction with the tumor; this term also has Michaelis–Menten kinetics to account for the self–limiting production of IL-2. The term $-\mu_3 I_L$ represents the degradation rate of IL-2.
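The three equations can be integrated numerically with any standard scheme. The sketch below uses a classical fourth–order Runge–Kutta step written with the Python standard library only. The parameter values, the initial state and the step size are illustrative placeholders chosen for this sketch; they are not the values used in [35], and the names `tumor_rhs` and `rk4_step` are hypothetical.

```python
def tumor_rhs(state, p):
    """Right-hand side of the tumor immunotherapy ODEs for state (E, T, IL)."""
    E, T, IL = state
    dE = p["c"] * T - p["mu2"] * E + p["p1"] * E * IL / (p["g1"] + IL)
    dT = p["r2"] * T - p["r2"] * p["b"] * T * T - p["a"] * E * T / (p["g2"] + T)
    dIL = p["p2"] * E * T / (p["g3"] + T) - p["mu3"] * IL
    return (dE, dT, dIL)

def rk4_step(f, state, dt, p):
    """One classical fourth-order Runge-Kutta step of size dt."""
    k1 = f(state, p)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)), p)
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)), p)
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)), p)
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

# Illustrative placeholder parameters (assumptions, NOT the values of [35]).
params = {"c": 0.1, "mu2": 0.5, "p1": 1.0, "g1": 1.0, "r2": 1.0, "b": 0.1,
          "a": 1.0, "g2": 1.0, "p2": 1.0, "g3": 1.0, "mu3": 1.0}

state = (1.0, 1.0, 1.0)   # initial (E, T, IL), also a placeholder
trajectory = [state]
for _ in range(1000):     # 1000 steps of size 0.01
    state = rk4_step(tumor_rhs, state, 0.01, params)
    trajectory.append(state)
```

Each entry of `trajectory` is a triple (E, T, IL); plotting the second component against time would show the evolution of the tumor mass for the chosen parameters.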

In [35] the model is fully examined and, by looking at both the time evolution of the numerical simulations and the analytical study of the equilibria of this dynamical system, many results are obtained. In particular, four cases are observed, depending on the parameters and on the initial configuration; in one case the tumor mass prevails and, in the other three cases, oscillations of the tumor mass and of the effector cells are observed. These oscillations reach very small values of the tumor mass, and this is a case in which a stochastic model can show interesting behaviors. In fact, the stochastic behavior can result in cases in which the tumor mass becomes 0 and the immune system spontaneously eradicates the tumor. This phenomenon can be observed in practice but is not observable by examining these equations.

This kind of result reveals the main motivation for defining stochastic models, namely the fact that, under certain conditions, these models show behaviors which are not observable in their deterministic counterparts.

1.3 Stochastic Models of Biological Systems

In this section we briefly discuss the stochastic modeling of biological systems. In particular, we first recall both the definition of the chemical master equation and the stochastic simulation algorithm by Gillespie [31] and then show, via an example, why the stochastic framework is as suitable as the deterministic one for the modeling of biological systems.

The Chemical Master Equation

In this section we recall the definition, given by Gillespie in [31], of the chemical master equation (CME). The CME is a set of first–order ordinary differential equations (ODEs) describing the time evolution of the probability of a system to occupy each one of a discrete set of states. In the definition given by Gillespie [31], the crucial quantity is $P(x, t \mid x_0, t_0)$, namely the probability that, given the initial configuration $X(t_0) = x_0$, at time $t$ the system is described by the state vector $x$, i.e. $X(t) = x$. In order to define the CME, namely the differential equation which describes the variation of such a probability in the infinitesimal time $dt$, the quantity $P(x, t + dt \mid x_0, t_0)$ is defined. Assuming that $dt$ is chosen so small that at most one reaction can fire in the time interval $[t; t + dt[$, such a probability is defined in terms of these two events:

- at time $t$ the system is already in state $x$ and in the infinitesimal time interval $[t; t + dt[$ no reaction fires;

- at time $t$ the system is in state $x - \nu_j$ and reaction $R_j$ fires.

Summing up the probabilities of these two events, we get

$$P(x, t + dt \mid x_0, t_0) = P(x, t \mid x_0, t_0)\left(1 - \sum_{j=1}^{M} a_j(x)\,dt\right) + \sum_{j=1}^{M} P(x - \nu_j, t \mid x_0, t_0)\, a_j(x - \nu_j)\,dt$$

where the first term represents the probability of the former event, and the second term represents the probability of the latter event. By subtracting $P(x, t \mid x_0, t_0)$, dividing by $dt$, and taking the limit $dt \to 0$ we get the CME

$$\frac{\partial P(x, t \mid x_0, t_0)}{\partial t} = \sum_{j=1}^{M} \Big[ P(x - \nu_j, t \mid x_0, t_0)\, a_j(x - \nu_j) - P(x, t \mid x_0, t_0)\, a_j(x) \Big].$$

As shown in [31], the CME is generally difficult to solve: it can be solved analytically only for very few simple systems and, furthermore, numerical solutions may be prohibitively difficult. These difficulties justified the introduction of alternative techniques for the stochastic simulation of biological systems, although the CME remains an important conceptual basis for studying the mathematical foundations of the stochastic simulation algorithm.
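For a very small system the CME can still be integrated directly, which helps to see what the equation describes. The following sketch considers a single degradation reaction that removes one molecule of a species A at a time, with propensity $a(n) = k\,n$ and state change $\nu = -1$, so that the CME reads $dP(n)/dt = k(n+1)P(n+1) - k\,n\,P(n)$. The reaction, the forward Euler scheme, the function name `cme_decay` and the numeric values are all assumptions made for this illustration.

```python
def cme_decay(n0, k, t_end, dt=1e-3):
    """Forward-Euler integration of the CME of a pure degradation reaction.

    P[n] is the probability of having n molecules; the state space
    {0, ..., n0} is finite because molecules can only disappear.
    """
    P = [0.0] * (n0 + 1)
    P[n0] = 1.0                      # initial condition: exactly n0 molecules
    steps = int(round(t_end / dt))
    for _ in range(steps):
        new_P = P[:]
        for n in range(n0 + 1):
            # Probability flowing in from state n + 1 (one molecule degrades).
            inflow = k * (n + 1) * P[n + 1] if n < n0 else 0.0
            # Probability flowing out of state n.
            outflow = k * n * P[n]
            new_P[n] += dt * (inflow - outflow)
        P = new_P
    return P

P = cme_decay(n0=10, k=1.0, t_end=1.0)
total = sum(P)                                  # should stay close to 1
mean = sum(n * p for n, p in enumerate(P))      # expected number of molecules
```

For this linear reaction the expected number of molecules is known to decay as $n_0 e^{-kt}$, so `mean` should be close to $10\,e^{-1}$. The cost of this direct approach grows with the number of reachable states, which is why it does not scale to realistic systems.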

The Stochastic Simulation Algorithm

In this section we recall the definition of the stochastic simulation algorithm (SSA) by Gillespie [31]. This algorithm addresses the following problems: given the system in state $x$ at time $t$, compute the time instant at which the next reaction fires and choose, according to some policy, the reaction to fire. As regards the former problem, it is shown in [31] that the putative time of the next reaction can be chosen by sampling an exponentially distributed random variable with rate $a_0(x) = \sum_{j=1}^{M} a_j(x)$, i.e. with mean $1/a_0(x)$. Such a variable can be sampled with the Monte Carlo inversion method for generating exponentially distributed values. Similarly, the reaction $R_j$ to fire is chosen according to the inequalities $\sum_{i=1}^{j-1} a_i(x) < r_2 \cdot a_0(x) \le \sum_{i=1}^{j} a_i(x)$, where $r_2$ is a random number uniformly distributed in $[0; 1]$. For a proof of correctness of these choices see [31].

Having solved these problems, it is possible to state that, given an initial configuration $X(t_0) = x_0$, one possible time–evolution of the modeled biological system is given by applying the following algorithm.

Algorithm SSA

1. Initialize the time $t = t_0$ and the system state $x = x_0$.

2. With the system in state $x$ at time $t$, evaluate all the $a_j(x)$ and their sum $a_0(x) = \sum_{j=1}^{M} a_j(x)$.

3. Given two random numbers $r_1$ and $r_2$ uniformly distributed in $[0; 1]$, generate values for $\tau$ and $j$ according to

$$\tau = \frac{1}{a_0(x)} \ln\left(\frac{1}{r_1}\right), \qquad \sum_{i=1}^{j-1} a_i(x) < r_2 \cdot a_0(x) \le \sum_{i=1}^{j} a_i(x),$$

then update $x = x + \nu_j$ and $t = t + \tau$, and go to step 2.
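The three steps of the algorithm can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name `ssa_direct`, its argument layout and the stopping rule at a final time `t_end` are assumptions of this sketch, and the random numbers are drawn in $(0; 1]$ so that the logarithm and the selection inequalities are always well defined.

```python
import math
import random
from bisect import bisect_left
from itertools import accumulate

def ssa_direct(x0, nu, propensities, t0, t_end, rng):
    """One SSA trajectory (Direct Method) as a list of (time, state) pairs.

    nu[j] is the state-change vector of reaction j and propensities[j]
    is a function of the state returning a_j(x) (hypothetical layout).
    """
    t, x = t0, list(x0)                       # step 1: initialization
    trajectory = [(t, tuple(x))]
    while True:
        # Step 2: evaluate the propensities and their cumulative sums.
        cumulative = list(accumulate(a_j(x) for a_j in propensities))
        a0 = cumulative[-1]
        if a0 <= 0.0:
            break                             # no reaction can fire any more
        # Step 3: draw r1, r2 in (0, 1] and compute tau and j.
        r1 = 1.0 - rng.random()
        r2 = 1.0 - rng.random()
        tau = math.log(1.0 / r1) / a0
        if t + tau > t_end:
            break                             # next firing falls after t_end
        # Smallest j such that r2 * a0 <= a_1(x) + ... + a_j(x).
        j = bisect_left(cumulative, r2 * a0)
        x = [xi + d for xi, d in zip(x, nu[j])]
        t += tau
        trajectory.append((t, tuple(x)))
    return trajectory

# Usage on an illustrative degradation reaction with propensity 1.0 * x[0].
rng = random.Random(0)
traj = ssa_direct([20], [(-1,)], [lambda x: 1.0 * x[0]], 0.0, 100.0, rng)
```

In the usage example each firing removes one molecule, so the trajectory is a sequence of decreasing counts at increasing times; a different seed yields a different, equally valid trajectory.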

Notice that this algorithm is exact in the sense that it produces one exact trajectory in the state space of the system. Furthermore, it is worth mentioning that many equivalent variants of the SSA exist; they differ in the way in which they compute the putative time of the next reaction and in the way in which they choose the reaction to fire. The variant presented here is named the Direct Method [31]; another one presented in the literature is the First Reaction Method [32]. Other algorithms for the stochastic simulation of biological systems can be found in [30, 18]. All these algorithms are similar and based on the ideas defined by Gillespie; furthermore, they suffer from the same disadvantages as this version of the SSA, as explained in the previous sections.

Example

As an example of a stochastic model let us examine the tumor immunotherapy model presented in Section 1.2. The deterministic model, according to the technique shown in [20], can be translated into the following set of reactions:


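| ODE term | reaction | propensity function |
|---|---|---|
| $cT$ | $T \overset{c}{\mapsto} T + E$ | $c[T]$ |
| $\mu_2 E$ | $E \overset{\mu_2}{\mapsto} \emptyset$ | $\mu_2[E]$ |
| $(p_1 E I_L)/(g_1 + I_L)$ | $I_L + E \mapsto I_L + 2E$ | $(p_1[E][I_L])/(g_1 + [I_L])$ |
| $r_2 T$ | $T \overset{r_2}{\mapsto} 2T$ | $r_2[T]$ |
| $r_2 b T^2$ | $2T \overset{r_2 b}{\mapsto} T$ | $r_2 b[T]^2$ |
| $(aET)/(g_2 + T)$ | $T + E \mapsto E$ | $(a[E][T])/(g_2 + [T])$ |
| $(p_2 ET)/(g_3 + T)$ | $E + T \mapsto E + T + I_L$ | $(p_2[E][T])/(g_3 + [T])$ |
| $\mu_3 I_L$ | $I_L \overset{\mu_3}{\mapsto} \emptyset$ | $\mu_3[I_L]$ |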

As expected, the modeled species are $T$, $E$ and $I_L$, and the model is obtained by defining one reaction for each term of the equations. Consider for instance the term $cT$: it appears only in the first equation and, consequently, it models the introduction of an effector cell $E$; namely its semantics, whenever it fires, is to add one cell $E$ to the population of $E$s. Furthermore, its kinetics in the ODE is $cT$, where $c$ is the antigenicity constant and $T$ is the number of tumor cells in the current state of the system. It is easy to recognize this as law of mass action kinetics and, consequently, to model the ODE term it is enough to use the reaction $T \overset{c}{\mapsto} T + E$ with kinetics $c[T]$ which, when applied, produces one $E$ and consumes no cells $T$, as they appear with the same quantity among reactants and products.

Notice that for those reactions whose propensity function is defined according to the law of mass action, the kinetic constant is specified on top of the rewriting arrow; for those with different kinetics, in this case the terms with Michaelis-Menten kinetics, the kinetic constant is not specified in the reaction.
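The translation from ODE terms to reactions can be sketched in Python as a list of state-change vectors paired with propensity functions. The parameter values in `PARAMS` are placeholders chosen only so that the code runs; they are not the values of [35]. The names `make_reactions` and `drift`, and the state ordering $(T, E, IL)$, are likewise assumptions of this sketch.

```python
# Placeholder parameter values for illustration only (NOT the values used in [35]).
PARAMS = dict(c=0.05, mu2=0.03, p1=0.12, g1=20.0, r2=0.18, b=1.0e-3,
              a=1.0, g2=100.0, p2=5.0, g3=10.0, mu3=10.0)


def make_reactions(p):
    """Return (state-change vector, propensity) pairs; the state is (T, E, IL)."""
    return [
        ((0, +1, 0), lambda s: p["c"] * s[0]),                            # T -> T + E
        ((0, -1, 0), lambda s: p["mu2"] * s[1]),                          # E -> (nothing)
        ((0, +1, 0), lambda s: p["p1"] * s[1] * s[2] / (p["g1"] + s[2])),  # IL + E -> IL + 2E
        ((+1, 0, 0), lambda s: p["r2"] * s[0]),                           # T -> 2T
        ((-1, 0, 0), lambda s: p["r2"] * p["b"] * s[0] ** 2),             # 2T -> T
        ((-1, 0, 0), lambda s: p["a"] * s[1] * s[0] / (p["g2"] + s[0])),   # T + E -> E
        ((0, 0, +1), lambda s: p["p2"] * s[1] * s[0] / (p["g3"] + s[0])),  # E + T -> E + T + IL
        ((0, 0, -1), lambda s: p["mu3"] * s[2]),                          # IL -> (nothing)
    ]


def drift(state, reactions):
    """Sum of state-change vectors weighted by propensities (expected net change rate)."""
    return tuple(sum(v[i] * a(state) for v, a in reactions) for i in range(3))
```

A useful sanity check on such an encoding is that `drift` reproduces the right-hand sides of the deterministic equations term by term, which is what the translation technique is expected to guarantee.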

By simulating this model, instantiated with appropriate parameters which can be found in [35], it is possible to observe that for some initial configurations of the model the tumor is eradicated by the immune system. We recall that this kind of behavior, although realistic, was not observable in the deterministic model, which always showed limit cycles, namely configurations of the model which correspond to cyclic behaviors that, in this case, did not correspond to the spontaneous eradication of the tumor. This kind of result, which appears in [8], is practical evidence that stochastic models are as interesting as deterministic ones.

2 Simulation of Biological Systems with Delays

In this section we define both the deterministic and the stochastic frameworks for modeling biological systems with delays. Deterministic models are presented as an extension of the ODE framework, and stochastic models are presented analogously as an extension of the SSA. At the end of the chapter some application examples are given.

2.1 Deterministic Models with Delays

In mathematics, delay differential equations (DDEs) are a kind of differential equation in which the derivative of the unknown function at a certain time is given in terms of the values of the function at previous times. DDEs are a more general framework than ODEs, and it is easy to note that ODEs are the particular case of DDEs without delays. However, although the framework of DDEs is more expressive than that of ODEs, not all the theoretical results on ODEs are valid for DDEs, as stated in [25].

The general form of a time-delay differential equation for $X(t) \in \mathbb{R}^n$ is

\[ \frac{dX}{dt} = f_x(t, X(t), X_t), \]


where $X_t = \{X(t') : t' \le t\}$ represents the trajectory of the solution in the past. Notice that this formulation is general enough to capture all the forms of delays we are going to analyze in the next sections of this thesis. As already said, the study of different forms of delays is justified by the generality of the DDE framework and by the many models of biological systems in the literature that are based on these different kinds of DDEs.

2.1.1 Deterministic Models with Constant Delays

In this section we study DDEs in their simplest form, namely DDEs with constant delays, which are equations of the form

\[ \frac{dX}{dt} = f_x(t, X(t), X(t-\sigma_1), \ldots, X(t-\sigma_n)) \]

with $\sigma_1 > \ldots > \sigma_n \ge 0$ and $\sigma_i \in \mathbb{R}$.

In order to introduce this framework we start by giving an example. Consider the model of tumor immunotherapy presented in [8], an extension of the model first presented in [35] and discussed in Section 1.2 and in Section 1.3. This model describes the interaction between two different kinds of cells, tumor and effector cells, and one molecule, the interleukin IL-2 for stimulating immunotherapy. The model in [35] has been extended by adding a constant delay $\tau$ in the response of the immune system to the presence of tumor cells. This kind of delay is realistic and justified by experimental observations, which showed a delayed response of the immune system.

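The model with delays is then the following:

\[
\begin{aligned}
\frac{dE}{dt} &= c\,T(t-\tau) - \mu_2 E(t) + \frac{p_1 E(t)\,IL(t)}{g_1 + IL(t)} \\
\frac{dT}{dt} &= r_2 T(t) - r_2 b\,T(t)^2 - \frac{a\,E(t)\,T(t)}{g_2 + T(t)} \\
\frac{dIL}{dt} &= \frac{p_2 E(t)\,T(t)}{g_3 + T(t)} - \mu_3 IL(t)
\end{aligned}
\]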

where the term with delay is $c\,T(t-\tau)$, which models the response of the immune system to the presence of tumor cells. For a closer analysis of this model we refer to [8].
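To illustrate how a constant delay enters a numerical integration, the following sketch applies the explicit Euler scheme to the delayed model, keeping the past states in a list so that $T(t-\tau)$ can be looked up. The function name `simulate_dde`, the use of explicit Euler, and the requirement that $\tau$ be a multiple of the step `dt` are simplifying assumptions of this example; [8] may use a different numerical method.

```python
def simulate_dde(p, history, tau, dt, t_end):
    """Explicit Euler integration of the delayed model (illustrative sketch).

    p:       dict with keys c, mu2, p1, g1, r2, b, a, g2, p2, g3, mu3
    history: function returning the state (E, T, IL) for any time t <= 0
    tau:     constant delay, assumed here to be a multiple of dt
    Returns the list of states at times 0, dt, 2*dt, ..., t_end.
    """
    lag = round(tau / dt)          # delay expressed in integration steps
    n_steps = round(t_end / dt)
    # states[k] is the state at time (k - lag) * dt; start with the history
    states = [history((k - lag) * dt) for k in range(lag + 1)]
    for _ in range(n_steps):
        E, T, IL = states[-1]
        T_delayed = states[-1 - lag][1]   # T(t - tau)
        dE = p["c"] * T_delayed - p["mu2"] * E + p["p1"] * E * IL / (p["g1"] + IL)
        dT = p["r2"] * T - p["r2"] * p["b"] * T * T - p["a"] * E * T / (p["g2"] + T)
        dIL = p["p2"] * E * T / (p["g3"] + T) - p["mu3"] * IL
        states.append((E + dt * dE, T + dt * dT, IL + dt * dIL))
    return states[lag:]
```

The `history` argument plays the role of the initial function required by a DDE: unlike an ODE, the model needs the values of the solution on a whole interval of length $\tau$ before the starting time.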

Other interesting examples of the use of this kind of DDEs concern biological systems at any level of abstraction. These models will be examined in Section 3, where we will also discuss the interpretation of the delays they consider.

2.1.2 Deterministic Models with Variable Delays

In this section we propose to study deterministic models of biological systems with variable delays. In particular, in the literature it is possible to observe two classes of delays. The first concerns delays which depend on time or on concentrations of species, namely DDEs of the form

\[ \frac{dx}{dt} = f_x\bigl(t, x(t), x(t-\tau_1(t, x(t))), \ldots, x(t-\tau_n(t, x(t)))\bigr) \]

where $\tau_i : \mathbb{R}^+ \times \mathbb{R}^n \to \mathbb{R}^+$ is a function of both time and $x(t)$ which represents the delay. The second concerns distributed delays, namely DDEs of the form

\[ \frac{dx}{dt} = f_x\Bigl(t, x(t), \int_{-\infty}^{t} \gamma(t', x(t'))\,dt'\Bigr) \]


where $\int_{-\infty}^{t} \gamma(t', x(t'))\,dt'$ is the integral over time of a function $\gamma : \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}^+$ of both time and state. These kinds of models can be a very useful, although non-trivial, extension of those presented in the previous section.

From an application point of view, we may want to model a biological system with memory in the presence of external impulses. If the time needed to elaborate the response to the external impulses is fixed, then a discrete delay is enough to model such a time quantity. Differently, if the system has a memory, which is typical of most complex cellular systems (e.g. the immune system [8]), then it can be observed that the responses to subsequent similar impulses require different amounts of time to be elaborated. In many cases these amounts depend on the time elapsed between the impulses or on the number of responses to the impulses. To model this kind of behavior, the first form of variable delays can be used. For interesting models of DDEs with variable delays we refer to [5, 6, 36] and to the references therein, where a population dynamics model of mammals is presented by using DDEs with state-dependent delays.

Differently, distributed delays generalize the idea that an event depends on a single past state of the system. In particular, as the integral covers the whole integration interval, these models take into account a dense set of past states rather than a single one. For the use of integro-differential equations and example models we refer to [7] and to the references therein.
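As a rough illustration of how a distributed-delay term could be evaluated numerically, the sketch below approximates the integral of $\gamma(t', x(t'))$ with the trapezoidal rule over a stored history of sampled states. Truncating the lower limit $-\infty$ to the earliest stored time is an assumption of this sketch (reasonable only when $\gamma$ is negligible before that time), and the name `distributed_delay_term` is hypothetical.

```python
def distributed_delay_term(gamma, times, states):
    """Trapezoidal approximation of the integral of gamma(t', x(t')) dt'.

    times:  increasing sample times, the last one being the current time t
    states: the state x(t') stored at each sample time
    The infinite lower limit is truncated to times[0] (an assumption).
    """
    total = 0.0
    for k in range(1, len(times)):
        left = gamma(times[k - 1], states[k - 1])
        right = gamma(times[k], states[k])
        total += 0.5 * (left + right) * (times[k] - times[k - 1])
    return total
```

In contrast with a constant delay, which needs a single look-up in the stored history, this term reads every stored past state at each evaluation.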

2.2 Stochastic Models with Delays

In this section we discuss stochastic models with delays. In particular, with respect to the different forms of delays discussed in the previous sections of this chapter, namely within the deterministic modeling of delayed systems, we will try to define, if possible, equivalent chemical master equations with delays and stochastic simulation algorithms with delays. The motivation for defining stochastic algorithms is the same as explained in Section 1.3, namely the fact that the intrinsic non-deterministic nature of this kind of simulation can show, and consequently justify, behaviors naturally observed in real experiments but not captured by the deterministic models.

2.2.1 Stochastic Models with Constant Delays

In this section we extend the scenario introduced in Section 1.1 by assuming that each reaction $R_j$ has a constant delay $\sigma_j \in \mathbb{R}$ such that $\sigma_j \ge 0$. Notice that this kind of delay is the same as the one presented for the DDEs of Section 2.1.1.

In the first part of this section we will define a delayed chemical master equation in a similar way as we defined the chemical master equation in Section 1.3. After analyzing the equation, in the last part of this section we will define some algorithms for the stochastic simulation of models with constant delays. In particular, we will give two different interpretations of constant delays. The first considers delays as fixed time quantities which are needed, in addition to a stochastic time quantity, to fire a reaction. The second is, in the same sense as for DDEs, the fact that a reaction depends upon a state in the past history of the system. By following the first interpretation we will define, according to [13, 16, 54], a DSSA with scheduled reactions; by following the second, we will define the DSSA with delayed propensity functions [9]. As future work, it will be of interest to formally investigate whether these two variants of the DSSA are equivalent.

Notice that these algorithms, which differ in their interpretation of delays and in some of the strategies they adopt, are all suitable for simulating the time evolution of models of biological systems with delays. However, as expected, their differences imply that some systems may be simulated more faithfully, with respect to the biological interpretation of delays, by one version of the algorithm than by another. This lets the modeler choose the proper version of the algorithm according to the delays being modeled in the system.

A Delayed Chemical Master Equation

In this section we derive, via the same principles used to derive the CME in Section 1.3 and similarly to how it is done in [13], a delayed chemical master equation (DCME) for systems with constant delays.

In order to define the DCME we change the use of the propensity function with respect to the CME. In particular, in the CME the propensity function, which depends on the state at the current time $t$, is computed in the state $x$ given $X(t) = x$, namely the value $a_j(x)$ is computed for any reaction $R_j$. Differently, in the DCME we want to compute the propensity function in a state $x'$ which represents a past state of the system, namely $X(t') = x'$ with $t' \le t$; we will then compute the value $a_j(x')$ for any reaction $R_j$. The choice of the state $x'$ has to be, as expected, consistent with the delay $\sigma_j$. Notice also that, since each reaction may have a different delay, each propensity function potentially depends on a different past state of the system.

Having pointed out this difference, we can now identify the quantities we are interested in to define the DCME. These quantities are conceptually similar to those in the definition of the CME. In particular, we denote as $P(x, t \mid x_0, t_0; \omega)$ the probability that the system is in state $x$ at time $t$ given these two facts: the initial configuration is $X(t_0) = x_0$ and, as we have delays, the value of the state for time instants preceding $t_0$ is given by the function $\omega$, namely $X(t) = \omega(t)$ if $t < t_0$. Notice that the function $\omega$ is needed because, in the time interval $[t_0; t_0 + \sigma_j]$, the value of the propensity function of reaction $R_j$ depends on the state at an instant $t' < t_0$ and, consequently, could not be computed without using $\omega$. We note also that this choice is strictly related to the standard mathematical technique [25] used to solve systems of DDEs by infinite approximations of sets of ODEs.

Analogously to the CME, we want to define the quantity $P(x, t + dt \mid x_0, t_0; \omega)$ assuming that $dt$ is small enough that at most one reaction fires in the time interval $[t; t + dt[$. We have to consider the following events:

- at time $t$ the system is already in state $x$ and, at the delayed time $t - \sigma_j$, the system was in state $x_i$; no reaction fires;

- at time $t$ the system is in state $x - \nu_j$ and, at the delayed time $t - \sigma_j$, the system was in state $x_i$; reaction $R_j$ fires.

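Formally, denoting with $I(x)$ the set of all possible states in which the system can be, we can define $P(x, t + dt \mid x_0, t_0; \omega)$ as follows:

\[
\begin{aligned}
P(x, t + dt \mid x_0, t_0; \omega) ={}& P(x, t \mid x_0, t_0; \omega)\Bigl[1 - \sum_{j=1}^{M} \sum_{x_i \in I(x)} P(x_i, t - \sigma_j \mid x, t; x_0, t_0; \omega)\, a_j(x_i)\,dt\Bigr] \\
&+ \sum_{j=1}^{M} P(x - \nu_j, t \mid x_0, t_0; \omega) \sum_{x_i \in I(x)} P(x_i, t - \sigma_j \mid x - \nu_j, t; x_0, t_0; \omega)\, a_j(x_i)\,dt
\end{aligned}
\]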

Notice that the quantities $P(x_i, t - \sigma_j \mid x, t; x_0, t_0; \omega)$ and $P(x_i, t - \sigma_j \mid x - \nu_j, t; x_0, t_0; \omega)$ denote the probability that the system is in state $x_i$ at time $t - \sigma_j$, knowing that at time $t$ it is in state $x$ and $x - \nu_j$, respectively, and knowing that the initial configuration is described by $x_0$, $t_0$ and $\omega$. Furthermore, notice that the need to couple all the possible system states justifies the introduction of the set $I(x)$; in particular, for those states $x_i$ which have no connection with state $x$ or $x - \nu_j$ such a probability will be equal to 0. As regards the outlined events, the probability of the former is given by the first term of the equation and the probability of the latter by the second one. In order to define the DCME we have to make some algebraic rearrangements of this last equation; in particular, from probability theory, we can prove the following theorem.

Theorem 1. Given events $A$, $B$ and $C$, it holds that

\[ P(A \mid B; C) \cdot P(B \mid C) = P(A; B \mid C) \]

Proof. By definition of conditional probability we have

\[ P(A \mid B; C) \cdot P(B \mid C) = \frac{P(A; B; C)}{P(B; C)} \cdot \frac{P(B; C)}{P(C)} = \frac{P(A; B; C)}{P(C)} = P(A; B \mid C) \]

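By applying this theorem to the terms of the previous equation we get the following equalities

\[
\begin{aligned}
P(x_i, t - \sigma_j \mid x, t; x_0, t_0; \omega) &= \frac{P(x, t; x_i, t - \sigma_j \mid x_0, t_0; \omega)}{P(x, t \mid x_0, t_0; \omega)} \\
P(x - \nu_j, t \mid x_0, t_0; \omega) \cdot P(x_i, t - \sigma_j \mid x - \nu_j, t; x_0, t_0; \omega) &= P(x - \nu_j, t; x_i, t - \sigma_j \mid x_0, t_0; \omega).
\end{aligned}
\]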

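Consequently, the probability of the former event can be rewritten as

\[ P(x, t \mid x_0, t_0; \omega) - \sum_{j=1}^{M} \sum_{x_i \in I(x)} P(x, t; x_i, t - \sigma_j \mid x_0, t_0; \omega)\, a_j(x_i)\,dt \]

and the probability of the latter event becomes

\[ \sum_{j=1}^{M} \sum_{x_i \in I(x)} P(x - \nu_j, t; x_i, t - \sigma_j \mid x_0, t_0; \omega)\, a_j(x_i)\,dt. \]

Summarizing, the quantity $P(x, t + dt \mid x_0, t_0; \omega)$ can be rewritten as follows:

\[
\begin{aligned}
P(x, t + dt \mid x_0, t_0; \omega) ={}& P(x, t \mid x_0, t_0; \omega) \\
&- \sum_{j=1}^{M} \sum_{x_i \in I(x)} P(x, t; x_i, t - \sigma_j \mid x_0, t_0; \omega)\, a_j(x_i)\,dt \\
&+ \sum_{j=1}^{M} \sum_{x_i \in I(x)} P(x - \nu_j, t; x_i, t - \sigma_j \mid x_0, t_0; \omega)\, a_j(x_i)\,dt
\end{aligned}
\]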

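Finally, by simple algebraic rearrangement of this last equation it is possible to get the following DCME:

\[
\begin{aligned}
\frac{\partial P(x, t \mid x_0, t_0; \omega)}{\partial t} ={}& \sum_{j=1}^{M} \sum_{x_i \in I(x)} P(x - \nu_j, t; x_i, t - \sigma_j \mid x_0, t_0; \omega)\, a_j(x_i) \\
&- \sum_{j=1}^{M} \sum_{x_i \in I(x)} P(x, t; x_i, t - \sigma_j \mid x_0, t_0; \omega)\, a_j(x_i)
\end{aligned}
\]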
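The algebraic rearrangement that leads to the DCME can be spelled out as follows; this is only a sketch of the standard argument, assuming that the limit for $dt \to 0$ exists. Moving $P(x, t \mid x_0, t_0; \omega)$ to the left-hand side of the expression for $P(x, t + dt \mid x_0, t_0; \omega)$ and dividing by $dt$ gives

```latex
\begin{align*}
\frac{P(x, t + dt \mid x_0, t_0; \omega) - P(x, t \mid x_0, t_0; \omega)}{dt}
  ={}& \sum_{j=1}^{M} \sum_{x_i \in I(x)}
       P(x - \nu_j, t; x_i, t - \sigma_j \mid x_0, t_0; \omega)\, a_j(x_i) \\
     &- \sum_{j=1}^{M} \sum_{x_i \in I(x)}
       P(x, t; x_i, t - \sigma_j \mid x_0, t_0; \omega)\, a_j(x_i)
\end{align*}
```

The right-hand side no longer depends on $dt$, so letting $dt \to 0$ turns the left-hand side into the partial derivative of $P(x, t \mid x_0, t_0; \omega)$ with respect to $t$.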


We now discuss the relation between the DCME and the CME. In particular, we expect that in the absence of delays the DCME reduces to the CME. The following theorem holds.

Theorem 2. In the absence of delays the delayed chemical master equation reduces to the chemical master equation.

Proof. In order to prove the theorem we have to prove that

\[ \forall j = 1, \ldots, M.\ \sigma_j = 0 \;\Rightarrow\; \frac{\partial P(x, t \mid x_0, t_0; \omega)}{\partial t} = \frac{\partial P(x, t \mid x_0, t_0)}{\partial t} \]

where the left side of the equality is the DCME and the right side is the CME. We note that the probability of being in two different states at the same time instant is 0, namely $P(x, t; x', t) = 0$ if $x \ne x'$, and, furthermore, that without delays we do not need $\omega$ in our initial configuration $x_0, t_0$. Then $P(x - \nu_j, t; x_i, t \mid x_0, t_0; \omega) = P(x - \nu_j, t \mid x_0, t_0)$ when $x - \nu_j = x_i$, and it is 0 otherwise. Consequently, as $x - \nu_j = x_i$ implies $a_j(x - \nu_j) = a_j(x_i)$, we get the following equation

\[ \sum_{j=1}^{M} \sum_{x_i \in I(x)} P(x - \nu_j, t; x_i, t \mid x_0, t_0; \omega)\, a_j(x_i) = \sum_{j=1}^{M} P(x - \nu_j, t \mid x_0, t_0)\, a_j(x - \nu_j). \]

With the same considerations it is possible to show that, without delays, it holds

\[ \sum_{j=1}^{M} \sum_{x_i \in I(x)} P(x, t; x_i, t - \sigma_j \mid x_0, t_0; \omega)\, a_j(x_i) = \sum_{j=1}^{M} P(x, t \mid x_0, t_0)\, a_j(x). \]

By combining these two observations, it holds that, without delays, the DCME reduces, as expected, to the CME.

Finally, we can notice that the DCME has the same disadvantages as the CME, namely it cannot always be solved analytically and its numerical simulation may be difficult. However, it is an important conceptual basis for studying the mathematical foundations of the DSSA with constant delays.

The DSSA with Scheduled Reactions

In this section we introduce the DSSA with scheduled reactions. Its definition appears in [13, 16, 46], while the proof of its correctness is given in [47], which builds a stochastic process that behaves exactly as the one built by the algorithm.

Here we give a formalization of the DSSA presented in the literature with some simple differences; in particular, in our scenario every reaction has a non-negative delay, whereas in the literature the reactions are divided into delayed and non-delayed ones. We remark that this difference is not crucial, as these DSSAs are equivalent.

The main idea behind interpreting delays as scheduled reactions is the following: the delay of a reaction is not interpreted as the fact that propensities have to be computed in the past states of the system, but as an intrinsic minimum quantity of time which has to be spent in order to fire a reaction. In other words, the delay of a reaction is interpreted as a constant time quantity which has to be added to a stochastic time quantity in order to fire the reaction. In particular, the stochastic time quantity is computed, as expected, as in the SSA, namely it is an exponentially distributed number with rate $a_0(t)$, i.e. with mean $1/a_0(t)$. We have to note that, given the system in state $x$ at time $t$, the propensity functions, with this interpretation of delays, are computed in the current state $x$ rather than in a past state of the system. This is in accordance with the fact that the delay will be part of the time spent to fire the reaction. This is the main difference between the version of the DSSA we present here and the one presented in the next sections.

As regards delays, if $\tau$ is the stochastic time quantity computed for the next reaction to fire, say $R_j$, then the reaction will not complete its firing at time $t + \tau$ as in the SSA, but will be scheduled to fire at time $t + \sigma_j + \tau$, and the clock will be updated, as in the SSA, to time $t + \tau$. This scheduling of the firings implies that, sometimes, a reaction was already scheduled to fire within the time interval $[t; t + \tau]$ and, in this case, a proper strategy has to be adopted, which we will discuss later.

In the literature two different scheduling strategies are informally discussed [54]. The first strategy, full-scheduling, completely applies a reaction (i.e. removes its reactants and inserts its products) at the moment in which it is found to be a candidate for firing (i.e. it was scheduled some instants before).

The second strategy divides the firing of a reaction into two different moments. More precisely, the reaction, and consequently its state-change vector, is divided according to its reactants and its products. At the moment in which the reaction is scheduled, the state of the system changes according to the portion of the state-change vector regarding the reactants. At the moment in which the reaction is found to be scheduled, namely at the moment in which full-scheduling would completely apply it, the state of the system changes according to the portion of the state-change vector regarding the products.

Note that the first strategy may schedule a reaction too many times with respect to the number of times it could fire in the current state of the system. This problem, which is typically observed when a consuming reaction has a large delay with respect to the average time step of the DSSA, can be easily solved by checking the applicability of the scheduled reactions at the moment in which they have to be applied.

The biological motivations which justify these two scheduling strategies deserve discussion. In the case of full-scheduling, as we do not remove the reactants from the state of the simulation when we schedule the reaction, we may think of a reaction which, when it starts firing, does not inhibit its reactants from having other interactions within the modeled system. A typical example of this kind of interaction is common in the modeling of epidemiological systems [15, 55]: if we model the infection process with a delay, which can be interpreted as the minimum time needed to get infected after a contact with the illness, then whenever the process of becoming infected starts between two individuals, those individuals are not removed from the population and can interact with the other individuals, spreading the illness. This is a typical use of delays in the definition of epidemiological models. Differently, with the partial-scheduling strategy we remove the reactants from the population at the time in which we schedule the reaction. From a biological perspective this corresponds to preventing the involved reactants from taking part in any other reaction within the modeled system. An example application could be an evolutionary model [37] in which reproduction is a process which takes place in a separate and safe area; in this case we may think that the reactants which are going to give birth to a juvenile move to this safe place and, consequently, do not take part in any other event which may happen within the modeled population until they come back.

Notice that, when modeling, the choice of the DSSA version to use depends on the biological interpretation of the delays in the model with respect to the events of the system.

In order to formally define these two strategies and, consequently, two different DSSAs, let us denote each state-change vector $\nu_j$ as the composition of the state-change vector for reactants, $\nu^r_j$, and the state-change vector for products, $\nu^p_j$, noting that $\nu_j = \nu^r_j + \nu^p_j$.

The former strategy for scheduling reactions leads to the definition of the following version of the DSSA with full-scheduling.

Algorithm DSSA with full-scheduling

1. Initialize the time $t = t_0$ and the system state $x = x_0$.

2. With the system in state $x$ at time $t$, evaluate all the $a_j(X(t))$ and their sum $a_0(t) = \sum_{j=1}^{M} a_j(X(t))$.

3. Given two random numbers $r_1, r_2$ uniformly distributed in the interval $[0; 1]$, generate values for $\tau$ and $j$ according to

   \[ \tau = \frac{1}{a_0(t)} \ln\Bigl(\frac{1}{r_1}\Bigr), \qquad \sum_{i=1}^{j-1} a_i(X(t)) < r_2 \cdot a_0(t) \le \sum_{i=1}^{j} a_i(X(t)). \]

   (a) If a delayed reaction $R_k$ is scheduled at time $t + \tau_k$ with $\tau_k < \tau$ and it can be correctly applied, then update $x = x + \nu_k$ and $t = t + \tau_k$;

   (b) else, schedule $R_j$ at time $t + \sigma_j + \tau$ and set the time to $t + \tau$.

4. Go to step 2.

Notice that this algorithm, which is similar in its schema to the SSA, implements its full-scheduling strategy in step (3a), namely when it executes $x = x + \nu_k$. Furthermore, in order to check whether the scheduled reaction is applicable, we state an informal constraint which can be easily implemented by checking whether the reactants of $R_k$ are a sub-multiset of $x$. This check was not present in the DSSA versions studied in the literature [13, 16, 46], as they simply considered delays in non-consuming reactions, which can always be applied in any state of the system. Our algorithm can thus be seen as a more general version of their DSSA.
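The DSSA with full-scheduling can be sketched in Python as follows. This is an illustrative reading of the algorithm, not the prototype of [51]: the function name `dssa_full`, the dictionary keys, the use of a binary heap for the scheduled reactions and the stopping rule at `t_end` are all assumptions. In particular, the sketch silently discards a scheduled firing whose reactants are no longer available, which is one possible way of resolving a case the algorithm leaves informal.

```python
import heapq
import math
import random


def dssa_full(x0, reactions, t0, t_end, rng=None):
    """DSSA with full-scheduling (illustrative sketch).

    reactions: list of dicts with keys 'reactants' and 'products' (count
    vectors), 'delay' (sigma_j >= 0) and 'propensity' (state -> float).
    """
    rng = rng or random.Random()
    t, x = t0, tuple(x0)
    scheduled = []                        # min-heap of (firing time, reaction index)
    trajectory = [(t, x)]
    while True:
        a = [r["propensity"](x) for r in reactions]   # computed in the current state
        a0 = sum(a)
        tau = math.log(1.0 / (1.0 - rng.random())) / a0 if a0 > 0 else math.inf
        next_scheduled = scheduled[0][0] if scheduled else math.inf
        if min(t + tau, next_scheduled) > t_end:
            break
        if next_scheduled < t + tau:
            # step (3a): a scheduled reaction comes first; discard tau and j
            t, k = heapq.heappop(scheduled)
            rk = reactions[k]
            # applicability check: reactants must be a sub-multiset of x
            if all(xi >= ri for xi, ri in zip(x, rk["reactants"])):
                x = tuple(xi - ri + pi
                          for xi, ri, pi in zip(x, rk["reactants"], rk["products"]))
                trajectory.append((t, x))
        else:
            # step (3b): choose R_j as in the SSA and schedule it
            target, acc = rng.random() * a0, 0.0
            for j, aj in enumerate(a):
                acc += aj
                if target < acc:
                    break
            heapq.heappush(scheduled, (t + reactions[j]["delay"] + tau, j))
            t += tau
    return trajectory
```

With a single consuming reaction $A \to B$ and a large delay, this sketch schedules the reaction more often than it can fire, and the applicability check is what keeps the population of $A$ from becoming negative.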

Similarly, the latter strategy for scheduling reactions leads to the definition of the following version of the DSSA with partial-scheduling.

Algorithm DSSA with partial–scheduling

1. Initialize the time $t = t_0$ and the system state $x = x_0$.

2. With the system in state $x$ at time $t$, evaluate all the $a_j(X(t))$ and their sum $a_0(t) = \sum_{j=1}^{M} a_j(X(t))$.

3. Given two random numbers $r_1, r_2$ uniformly distributed in the interval $[0; 1]$, generate values for $\tau$ and $j$ according to

   \[ \tau = \frac{1}{a_0(t)} \ln\Bigl(\frac{1}{r_1}\Bigr), \qquad \sum_{i=1}^{j-1} a_i(X(t)) < r_2 \cdot a_0(t) \le \sum_{i=1}^{j} a_i(X(t)). \]

   (a) If a delayed reaction $R_k$ is scheduled at time $t + \tau_k$ with $\tau_k < \tau$, then update $x = x + \nu^p_k$ and $t = t + \tau_k$;

   (b) else, schedule $R_j$ at time $t + \sigma_j + \tau$, update $x = x + \nu^r_j$ and set the time to $t + \tau$.

4. Go to step 2.

Notice that this second DSSA is, like the previous one, similar to the SSA, and it implements its partial-scheduling strategy in steps (3a) and (3b). In particular, it removes the reactants of a reaction when the reaction is scheduled, namely when it executes $x = x + \nu^r_j$ at step (3b), and it completes the firing at step (3a), when it executes $x = x + \nu^p_k$. Furthermore, as the scheduling strategy is partial, we do not have to check whether reaction $R_k$ can be applied at step (3a), as the application of only its products is analogous to applying a non-consuming reaction.
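A corresponding sketch of the DSSA with partial-scheduling is given below. As before, the name `dssa_partial`, the data layout and the heap are assumptions of this example. The sketch also assumes that a reaction has zero propensity whenever its reactants are missing (as with mass action kinetics), so that removing the reactants at scheduling time never produces negative counts.

```python
import heapq
import math
import random


def dssa_partial(x0, reactions, t0, t_end, rng=None):
    """DSSA with partial-scheduling (illustrative sketch).

    reactions: list of dicts with keys 'reactants' and 'products' (count
    vectors), 'delay' (sigma_j >= 0) and 'propensity' (state -> float).
    """
    rng = rng or random.Random()
    t, x = t0, tuple(x0)
    scheduled = []                        # min-heap of (completion time, reaction index)
    trajectory = [(t, x)]
    while True:
        a = [r["propensity"](x) for r in reactions]
        a0 = sum(a)
        tau = math.log(1.0 / (1.0 - rng.random())) / a0 if a0 > 0 else math.inf
        next_scheduled = scheduled[0][0] if scheduled else math.inf
        if min(t + tau, next_scheduled) > t_end:
            break
        if next_scheduled < t + tau:
            # step (3a): complete a scheduled firing by adding its products
            t, k = heapq.heappop(scheduled)
            x = tuple(xi + pi for xi, pi in zip(x, reactions[k]["products"]))
        else:
            # step (3b): choose R_j, remove its reactants now, add products later
            target, acc = rng.random() * a0, 0.0
            for j, aj in enumerate(a):
                acc += aj
                if target < acc:
                    break
            heapq.heappush(scheduled, (t + reactions[j]["delay"] + tau, j))
            t += tau
            x = tuple(xi - ri for xi, ri in zip(x, reactions[j]["reactants"]))
        trajectory.append((t, x))
    return trajectory
```

Compared with the full-scheduling sketch, the only differences are the two state updates, which mirrors the observation that the two variants differ only at steps (3a) and (3b).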

As expected, these two variants of the DSSA differ only in steps (3a) and (3b), namely those in which reactions are fired, (3a), and scheduled, (3b). Future work includes an in-depth analysis of these two DSSAs, a formal proof of their correctness and, if possible, a proof of their equivalence.

We now discuss step (3a) of both algorithms. At this step we check for the presence of a reaction $R_k$ scheduled at time $t + \tau_k$ with $\tau_k < \tau$; this last inequality is enough to guarantee that $R_k$ is scheduled in the time interval $[t; t + \tau]$ and, consequently, that it is in conflict with the putative time of the next reaction. Furthermore, independently of the scheduling policy, the application of the scheduled reaction $R_k$ would change the current state of the simulation and, indirectly, the probability of all the reactions, including the next putative reaction $R_j$, in the interval $[t; t + \tau]$. We call discontinuities the time instants at which the values of the propensity functions change. It is intuitive, and this is the choice of the algorithm in both its variants, to discard the current $\tau$ and the index $j$ in order to apply the scheduled reaction $R_k$, to set the clock to $t + \tau_k$ and to restart the algorithm iteration. However, from a formal point of view, it remains to be shown, as future work, that this choice of moving time forward does not affect the probability distributions.

We now discuss the implementation of these algorithms. Both versions of the DSSA are very similar to the SSA: in addition to the data used by the SSA, they only need to store the list of scheduled reactions. It is also easy to note that this list is dynamic and cannot be bounded in space by any parameter, as it deeply depends on the system being modeled. Hence no formal analysis of the space overhead with respect to the SSA can be carried out and, probably, it would be of little interest given the simplicity of these versions of the DSSA. In particular, an implementation of this list as a queue is enough to compute step (3) of these algorithms efficiently. Moreover, as these two variants of the DSSA are very similar, very little information is needed to obtain the different behaviors at steps (3a) and (3b). For a prototype implementation of these algorithms see [51].

The DSSA with Delayed Propensity Functions

In this section we first introduce some notation in order to give a clear exposition of the DSSA with delayed propensity functions [9]; then, after defining the algorithm, we give a formal proof of its correctness and, at the end of the section, we discuss some implementation issues.

Let us denote, with the system at time $t$ and for each reaction $R_j$, the quantity $t_j = t - \sigma_j + \theta_{t,j}$, with $t_j < t$ and $\theta_{t,j} \ge 0$, as the first time instant immediately subsequent to the time instant $t - \sigma_j$ in which a reaction fired. It is clear that, as a reaction fired at time instant $t_j$, the propensity function of at least one reaction changes its value when the time reaches $t + \theta_{t,j}$. Consequently, the summation $a_0(t) = \sum_{j=1}^{M} a_j(X(t - \sigma_j))$ also changes at that time instant. It is clear that the time instant $t + \theta_{t,j}$ represents a discontinuity in the same sense as defined previously. Notice that here we write $a_j(X(t - \sigma_j))$ meaning that, given $X(t - \sigma_j) = x_i$, we want to compute $a_j(x_i)$. This is the basic idea of this version of the DSSA with respect to the ones presented in the previous section, and it justifies its name, DSSA with delayed propensity functions, because, as in the DCME, we compute the propensity functions in a past state of the system.

We can define the DSSA with delayed propensity functions as follows.


Algorithm DSSA with delayed propensity functions

1. Initialize the time $t = t_0$ and the system state $x = x_0$.

2. With the system in state $x$ at time $t$, evaluate all the $a_j(X(t - \sigma_j))$ and their sum $a_0(t) = \sum_{j=1}^{M} a_j(X(t - \sigma_j))$.

3. Given a random number $r_1$ uniformly distributed in the interval $[0; 1]$, generate values for $\tau$ and $\theta_t$ according to

   \[ \tau = \frac{1}{a_0(t)} \ln\Bigl(\frac{1}{r_1}\Bigr), \qquad \theta_t = \min\{\theta_{t,j} \mid j = 1, \ldots, M\}. \]

   (a) If $\tau < \theta_t$ then, given a random number $r_2$ uniformly distributed in the interval $[0; 1]$, select the applicable reaction $R_j$ such that

   \[ \sum_{i=1}^{j-1} a_i(X(t - \sigma_i)) < r_2 \cdot a_0(t) \le \sum_{i=1}^{j} a_i(X(t - \sigma_i)), \]

   then update $x = x + \nu_j$ and $t = t + \tau$, and go to step 2.

   (b) If $\tau \ge \theta_t$, then set $t = t + \theta_t$ and go to step 2.

Notice that this algorithm, when the system is in state $x$ at time $t$, generates an exponentially distributed number $\tau$ with rate $a_0(t)$, i.e. with mean $1/a_0(t)$, which, analogously to the SSA, is considered as the putative time of the next reaction. Furthermore, the value of $\theta_t$ represents the minimum offset which, summed to one of the instants $t - \sigma_j$, reaches a time instant in which a reaction has been applied. Such a value is used to discover whether the value of $a_0(t)$ is constant in the time interval $[t, t + \tau]$ and, consequently, to decide whether a reaction has to be chosen and applied; in other words, $t + \theta_t$ represents the nearest discontinuity point for all the propensity functions when the system is at time $t$. If a reaction is applied, namely $\tau < \theta_t$, then its index $j$ is chosen as in the SSA; otherwise, if no reaction has to be applied, the algorithm simply moves time forward. Notice that generating the value of $j$ only when a reaction will be applied lets us avoid generating a random number $r_2$ and then rejecting it. Furthermore, by applicable reaction $R_j$ we mean a reaction which is applicable in the current state $x$. To implement this check, it is enough to check whether the reactants of $R_j$ are a sub-multiset of $x$ and, if so, to check the other constraints. This control mechanism may seem illogical, given that the probability of the reaction is computed in a past state of the system; however, it is necessary to avoid inconsistencies when the reactants needed to fire the reaction are not present in the current state $x$. Once the reaction has been chosen, the clock is increased to the value $t + \tau$.
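The following Python sketch shows one possible reading of the DSSA with delayed propensity functions. The names (`dssa_delayed_propensity`, `omega`, `seen`), the dictionary keys and the stopping rule at `t_end` are assumptions of this example. Three further choices are made here and are not prescribed by the algorithm: the nearest discontinuity is tracked as an absolute time $t + \theta_t$ to avoid rounding problems, the initial time $t_0$ is treated as a recorded state change so that the switch from $\omega$ to $x_0$ is also seen as a discontinuity, and a selected reaction that is not applicable in the current state only advances the clock.

```python
import math
import random


def dssa_delayed_propensity(x0, reactions, t0, t_end, omega, rng=None):
    """DSSA with delayed propensity functions (illustrative sketch).

    reactions: list of dicts with keys 'change' (state-change vector),
    'reactants' (count vector), 'delay' (sigma_j >= 0), 'propensity'.
    omega(s) returns the state of the system for times s < t0.
    """
    rng = rng or random.Random()
    times, states = [t0], [tuple(x0)]     # recorded state changes, in time order
    seen = [0] * len(reactions)           # entries with time <= t - sigma_j, per reaction
    t, x = t0, tuple(x0)
    while True:
        for j, r in enumerate(reactions):
            while seen[j] < len(times) and times[seen[j]] + r["delay"] <= t:
                seen[j] += 1
        # X(t - sigma_j): the last recorded state visible to R_j, or the history omega
        past = [states[seen[j] - 1] if seen[j] else omega(t - r["delay"])
                for j, r in enumerate(reactions)]
        a = [r["propensity"](s) for r, s in zip(reactions, past)]
        a0 = sum(a)
        tau = math.log(1.0 / (1.0 - rng.random())) / a0 if a0 > 0 else math.inf
        # absolute time t + theta_t of the nearest discontinuity
        next_disc = min((times[seen[j]] + r["delay"]
                         for j, r in enumerate(reactions) if seen[j] < len(times)),
                        default=math.inf)
        if min(t + tau, next_disc) > t_end:
            break
        if t + tau < next_disc:
            # step (3a): choose R_j with the delayed propensities
            target, acc = rng.random() * a0, 0.0
            for j, aj in enumerate(a):
                acc += aj
                if target < acc:
                    break
            t += tau
            rj = reactions[j]
            if all(xi >= ri for xi, ri in zip(x, rj["reactants"])):
                x = tuple(xi + vi for xi, vi in zip(x, rj["change"]))
                times.append(t)
                states.append(x)
        else:
            # step (3b): no firing, move the clock to the discontinuity
            t = next_disc
    return list(zip(times, states))
```

For a decay reaction with a positive delay, the delayed propensity can remain positive after the population has reached zero; the applicability check on the current state is what prevents a firing in that situation.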

In the case of non-application of a reaction, the choice of moving time forward has to be proved correct. We now define the framework in which we can prove its correctness; we plan to do this kind of work also for the previously defined versions of the DSSA.

According to [31, 48], the probability that the system is in state $x$ at time $t + dt$ given that it is in state $x$ at time $t$ can be defined as $P(x, t + dt) = P(x, t)\bigl(1 - \sum_{j=1}^{M} a_j(X(t - \sigma_j))\,dt\bigr)$, leading to the following differential equation

\[ \frac{dP(x, t)}{dt} = -P(x, t)\,a_0(t) \]

This differential equation has the exponential solution exp(− ∫ t+τ

ta0(t′)dt′

)which rep-

resents the probability that no reaction fires in the time interval [t, t+τ ]. The probability

18

Page 20: Formal Modeling of Biological Systems with Delaysgroups.di.unipi.it/~caravagn/papers/proposal-phd-thesis.pdf · 2009-01-16 · Universita degli Studi di Pisa Dipartimento di Informatica

that a single reaction Rj fires at time t + τ taking time dt is, by definition, denoted asaj(X(t + τ − σj))dt. By merging these two independent events we get the probabilitythat no reaction fires in the system during the time interval [t, t + τ ] and, at time t + τ ,reaction Rj fires, namely

pj,t(τ) = aj(X(t + τ − σj)) exp(−

∫ t+τ

t

a0(t′)dt′)

.

Consequently, the probability that no reaction fires in the time interval $[t, t + \tau]$ and one generic reaction fires at time $t + \tau$ can be obtained by summing over all possible rules $R_j$. Formally,

\[ p_t(\tau) = \sum_{j=1}^{M} p_{j,t}(\tau) = a_0(t + \tau) \exp\Big(-\int_t^{t+\tau} a_0(t')\,dt'\Big) \]

denotes the probability density of the putative time of the next reaction, namely $\tau$, given the system at time $t$. By definition, its cumulative distribution function at time $t$ is

\[ F_t(\varphi) = P_t(-\infty \le \tau \le \varphi) = \int_{-\infty}^{\varphi} p_t(\tau)\,d\tau. \]

In order to prove the correctness of this algorithm we must prove that the choice of increasing the clock without applying any rule at step (3b) of the DSSA with delayed propensity functions is correct.

To this aim, let us assume the following scenario:

• the system is in state $x$ at time $t$;

• in the past, precisely at time $t - \sigma_j + \theta_t$, a reaction fired.

We recall that the rate of a reaction $R_j$ with delay $\sigma_j$, namely $a_j(X(t - \sigma_j))$, depends upon the state of the system at time $t - \sigma_j$. As already said, by the latter assumption, if a rule has been applied at that time, then the values of the propensity functions of some reactions and, consequently, their sum, change when time crosses the instant $t + \theta_t$. Let us assume that $a_0(t')$ is defined as follows:

\[ a_0(t') = \begin{cases} \xi_1 & \text{if } t \le t' < t + \theta_t \\ \xi_2 & \text{if } t' \ge t + \theta_t \end{cases} \]

Notice that $a_0(t')$ is defined in terms of the current value of time, $t$, and, in this scenario, the firing of a reaction at time $t - \sigma_j + \theta_t$ results in a change, namely a left–discontinuity, in the value of $a_0$ at $t + \theta_t$. The following figure gives a visual representation of this scenario.

Figure 1: The scenario in which the correctness of the DSSA with delayed propensity functions has to be proved.

Summarizing, in order to prove that our algorithm is correct when we generate a putative time $\tau$ exponentially distributed with rate $a_0(t)$, we have to prove two propositions. The former regards the correctness of step (3a) of the DSSA and the latter that of step (3b). Notice that, in the former proposition, we have to prove that the putative time is an exponentially distributed random variable, as in the SSA. In the latter, instead, we have to prove that we do not lose any probability information when we increase the clock to the value $t + \theta_t$. Intuitively, we will prove that the probability at time $t$ of generating a value of the form $\theta_t + \theta_\varphi$, for a given $\theta_\varphi$, can be expressed in terms of two independent events, one related to generating a value greater than $\theta_t$ at time $t$ and the other related to generating, at time $t + \theta_t$, a value smaller than $\theta_\varphi$.

The theorem, which is summarized in the scenario of Figure 1, is the following.

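Theorem 3. Given $X(t) = x$, if reaction $R_j$ fired at time $t - \sigma_j + \theta_t$, it holds:

(a) if $\varphi < \theta_t$ then $P_t(\tau \le \varphi) = P(\mathrm{Exp}(\xi_1) \le \varphi)$

(b) if $\varphi \ge \theta_t$ then $\exists \theta_\varphi.\ P_t(\tau \le \varphi) = P(\mathrm{Exp}(\xi_1) \le \theta_t) + P(\mathrm{Exp}(\xi_1) > \theta_t, \mathrm{Exp}(\xi_2) \le \theta_\varphi)$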

The proof of the theorem appears in [9]. It is now easy to see that the latter proposition is enough to prove the correctness of step (3b) of the algorithm. In particular, analyzing how it works, we can observe that at step (3) we generate the next putative time, namely $\tau_1$, as an exponentially distributed number with rate $a_0(t) = \xi_1$ and, at step (3b), by increasing the clock by $\theta_t$ we move the system to the same state but at time $t + \theta_t$; furthermore, at the next iteration of the algorithm, again at step (3), we generate an exponentially distributed number with rate $a_0(t + \theta_t) = \xi_2$ which corresponds, as expected, to $\tau_2$.
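As an illustration of statement (b), and not as a substitute for the proof in [9], both sides can be evaluated directly for the piecewise–constant $a_0(t')$ assumed in the scenario. Here we take $\theta_\varphi = \varphi - \theta_t$, which is our reading of the decomposition $\theta_t + \theta_\varphi$, and we use the independence of the two exponential variables:

```latex
\begin{align*}
P_t(\tau \le \varphi)
  &= 1 - \exp\Big(-\int_t^{t+\varphi} a_0(t')\,dt'\Big)
   = 1 - e^{-\xi_1 \theta_t}\, e^{-\xi_2 (\varphi - \theta_t)}, \\
P(\mathrm{Exp}(\xi_1) \le \theta_t)
  + P(\mathrm{Exp}(\xi_1) > \theta_t)\, P(\mathrm{Exp}(\xi_2) \le \theta_\varphi)
  &= \big(1 - e^{-\xi_1 \theta_t}\big)
   + e^{-\xi_1 \theta_t}\big(1 - e^{-\xi_2 \theta_\varphi}\big)
   = 1 - e^{-\xi_1 \theta_t}\, e^{-\xi_2 \theta_\varphi}.
\end{align*}
```

The two right–hand sides coincide exactly when $\theta_\varphi = \varphi - \theta_t$.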

It is now proved that this algorithm, which generates exponentially distributed numbers and, in the case of discontinuities, moves time forward, is correct.
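The same fact can be checked numerically. The following Python sketch, whose function names and parameter values are purely illustrative assumptions, samples the putative time as the algorithm does (a first draw with rate $\xi_1$ and, if it exceeds $\theta_t$, a jump to $\theta_t$ followed by a new draw with rate $\xi_2$) and compares the empirical distribution with $1 - \exp(-\int_t^{t+\varphi} a_0(t')\,dt')$.

```python
import math
import random

def two_step_time(xi1, xi2, theta, rng):
    """Putative time obtained by steps (3) and (3b): redraw with rate xi2 after the jump."""
    tau1 = rng.expovariate(xi1)
    if tau1 < theta:
        return tau1
    return theta + rng.expovariate(xi2)

def exact_cdf(phi, xi1, xi2, theta):
    """1 - exp(-integral of the piecewise-constant a0 over [t, t + phi])."""
    if phi < theta:
        return 1.0 - math.exp(-xi1 * phi)
    return 1.0 - math.exp(-xi1 * theta - xi2 * (phi - theta))

def empirical_cdf(phi, xi1, xi2, theta, n, seed=0):
    """Fraction of n sampled putative times that do not exceed phi."""
    rng = random.Random(seed)
    hits = sum(two_step_time(xi1, xi2, theta, rng) <= phi for _ in range(n))
    return hits / n
```

With a few tens of thousands of samples the two values agree up to the expected sampling error, both for $\varphi < \theta_t$ and for $\varphi \ge \theta_t$.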

For some considerations on the implementation of this DSSA we refer to [9]. Notice that this DSSA requires a more complex implementation than the previous one. In particular, in order to evaluate the propensity functions in past states of the system and to compute the value $\theta_t$, steps (2) and (3) of the DSSA must be computed efficiently. An exhaustive discussion of the implementation of these operations, together with useful suggestions, is presented in [9].

As regards the relation between this DSSA and the two versions previously presented, we conjecture that this version is analogous to the DSSA with full–scheduled reactions. We plan to define an equivalence relation built on the probabilities of the firings and to prove this equivalence.

2.2.2 Stochastic Models with Variable Delays

In this section we expect to investigate different forms of variable delays; in particular, we refer to the ones described in Section 2.1.2. Consequently, we plan to investigate whether it is possible to extend the DCME and the DSSA to handle time–dependent or state–dependent delays or, more generally, delays in any other reasonable form.


2.2.3 Approximation Techniques for Stochastic Models

In this section we expect to investigate approximation techniques for the stochastic simulation of systems with delays. This is needed because the DSSAs provide an exact simulation of a delayed system and are, for the same reasons as the SSA, too slow for complex systems. In particular, it is easy to note that, at each step of the computation, the putative time chosen for the next reaction is inversely related to the size of the system. As a consequence, if the system size grows, then the simulation steps become shorter and the simulation slows down. This limitation is so strong that the DSSA, without any approximation, becomes unusable for many of the models we present in this thesis; however, it is the correct basis from which to propose an efficient approximation of the stochastic simulation of delayed systems.

We plan to introduce approximations to the DSSA in the same fashion as has been done for the SSA with the $\tau$–leaping algorithm [33]. In particular, the key idea can be summarized as follows: at each step of the computation, whenever a reaction is chosen to fire, it is possibly applied more than once in the time interval $[t, t + \tau]$, where $\tau$ is the putative time for the next reaction. More formally, the number of firings in such a time interval is chosen by sampling a Poisson random variable whose parameter is the expected number of firings of that rule in the interval, namely the value computed by its propensity function multiplied by $\tau$. This simple idea, which is at the basis of the $\tau$–leaping algorithm, has been shown to speed up the SSA significantly [33]; however, it may introduce inconsistencies, such as negative populations, if the number of firings is not bounded by some other parameters.
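To fix ideas, a single leap for a system without delays can be sketched in Python as follows. This is a minimal sketch of the standard leaping idea under our own assumptions, not the delayed extension we plan to define: the names `poisson` and `tau_leap_step` are hypothetical, the leap length `tau` is supplied by the caller, and a leap producing a negative population is simply rejected.

```python
import math
import random

def poisson(lam, rng):
    """Poisson sample by Knuth's multiplication method (adequate for small means)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def tau_leap_step(x, reactions, tau, rng):
    """Fire each reaction k_j ~ Poisson(a_j(x) * tau) times over one leap of length tau.

    `reactions` holds (reactants, products, rate_fn) tuples; propensities are
    evaluated in the state x at the beginning of the leap.
    """
    new = dict(x)
    for reactants, products, rate in reactions:
        k = poisson(rate(x) * tau, rng)
        for s, n in reactants.items():
            new[s] = new.get(s, 0) - k * n
        for s, n in products.items():
            new[s] = new.get(s, 0) + k * n
    if any(v < 0 for v in new.values()):
        return None        # assumption: reject the leap, the caller retries with a smaller tau
    return new
```

Rejecting the leap is only one possible way of bounding the number of firings; it is used here because it keeps the sketch short.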

Summarizing, we expect to extend the DSSA, possibly in all its variants, to approximated versions according to this leaping technique.

3 Examples

In this section we show some models of biological systems which have been studied by using the DDE framework. In particular, the models we present here have been studied by using DDEs with constant delays, although for some of them there exist generalizations based upon DDEs with variable delays. For each model presented we give the set of equations defining it, an explanation of the role of delays within the model, and its equivalent stochastic model. For the main results of their analytical study or numerical simulations we refer to the original works.

3.1 Epidemics Models

The study of epidemic models deals with the modeling of populations and of illnesses within them; more precisely, an SIR model is an epidemiological model that computes the theoretical number of people infected with a contagious illness in a closed population over time. The name of this class of models derives from the fact that they involve coupled equations relating the number of susceptible people $S(t)$, the number of infected people $I(t)$ and the number of people who have recovered $R(t)$. These kinds of models have been studied since the beginning of the last century, and in [15, 55] an SIR epidemic model is presented in which a constant infectious period is incorporated as a time delay.

The model has been defined as follows:

\[ \frac{dS}{dt} = \mu - \mu S(t) - \beta S(t) I(t) \]

\[ \frac{dI}{dt} = \beta S(t) I(t) - \beta e^{-\mu\tau} S(t - \tau) I(t - \tau) - \mu I(t) \]

\[ \frac{dR}{dt} = \beta e^{-\mu\tau} S(t - \tau) I(t - \tau) - \mu R(t) \]

where it is assumed that all newborns are susceptible and infection confers permanent immunity. The parameters of the system are as follows: $\mu$ is the natural death and birth rate, $\beta$ is the average number of adequate contacts of an infectious individual per time unit and $\tau$ is the length of the infectious period. The term $\beta e^{-\mu\tau} S(t - \tau) I(t - \tau)$ reflects the fact that an individual has recovered from the infection and is still alive after the infectious period $\tau$. Notice that this interpretation of delays differs from the one suggested in Section 2.2 when presenting the DSSA with scheduled reactions.
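To show how the delayed terms enter a numerical simulation, the system can be integrated with an explicit Euler scheme which keeps the past values of $S$ and $I$. The following Python sketch is only an illustration under our own assumptions: the function name `simulate_sir` is hypothetical, the pre–history is taken constant ($S(t) = S(0)$ and $I(t) = I(0)$ for $t \le 0$), the delay is rounded to a whole number of steps, and no claim is made about the accuracy of the scheme.

```python
import math

def simulate_sir(mu, beta, tau, s0, i0, r0, dt, steps):
    """Explicit Euler integration of the delayed SIR model with constant pre-history."""
    lag = round(tau / dt)                    # delay expressed in integration steps
    S, I, R = [s0], [i0], [r0]
    for n in range(steps):
        s_d = S[n - lag] if n >= lag else s0     # S(t - tau)
        i_d = I[n - lag] if n >= lag else i0     # I(t - tau)
        recovered = beta * math.exp(-mu * tau) * s_d * i_d
        infected = beta * S[n] * I[n]
        S.append(S[n] + dt * (mu - mu * S[n] - infected))
        I.append(I[n] + dt * (infected - recovered - mu * I[n]))
        R.append(R[n] + dt * (recovered - mu * R[n]))
    return S, I, R
```

Since the delayed terms cancel when the three equations are summed, a population which starts with $S + I + R = 1$ keeps this total along the whole computed trajectory, which gives a simple sanity check for the sketch.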

The equivalent stochastic model which describes the time evolution of the population composed of $S$, $I$ and $R$, built according to the technique explained in [20], and which can be simulated by the DSSA, is given in the following table.

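| DDE term | reaction | propensity function | delay |
|---|---|---|---|
| $\mu$ | $\overset{\mu}{\mapsto} S$ | $\mu$ | $0$ |
| $\mu S(t)$ | $S \overset{\mu}{\mapsto}$ | $\mu[S]$ | $0$ |
| $\beta S(t) I(t)$ | $S + I \overset{\beta}{\mapsto} 2I$ | $\beta[S][I]$ | $0$ |
| $\beta e^{-\mu\tau} S(t - \tau) I(t - \tau)$ | $S + I \overset{-\beta e^{-\mu\tau}}{\mapsto} S + R$ | $-\beta e^{-\mu\tau}[S][I]$ | $\tau > 0$ |
| $\mu I(t)$ | $I \overset{\mu}{\mapsto}$ | $-\mu[I]$ | $0$ |
| $\mu R(t)$ | $R \overset{\mu}{\mapsto}$ | $-\mu[R]$ | $0$ |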

3.2 Cellular Models

By cellular models we mean models studying the dynamics of cells and of the biochemical events inside and outside them. As a practical example, we refer to the model of tumor immunotherapy with delay presented in [8] and briefly discussed in Section 2.1; we recall that this is an extension of the model first presented in [35] and discussed here in Section 1.2 and in Section 1.3.

The equivalent stochastic model which describes the time evolution of the cells $E$ and $T$, and of the molecules $I_L$, built according to the technique explained in [20], and which can be simulated by the DSSA, is given in the following table.

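| DDE term | reaction | propensity function | delay |
|---|---|---|---|
| $cT$ | $T \overset{c}{\mapsto} T + E$ | $c[T]$ | $\tau > 0$ |
| $\mu_2 E$ | $E \overset{\mu_2}{\mapsto}$ | $\mu_2[E]$ | $0$ |
| $(p_1 E I_L)/(g_1 + I_L)$ | $I_L + E \mapsto I_L + 2E$ | $(p_1[E][I_L])/(g_1 + [I_L])$ | $0$ |
| $r_2 T$ | $T \overset{r_2}{\mapsto} 2T$ | $r_2[T]$ | $0$ |
| $r_2 b\,T T$ | $2T \overset{r_2 b}{\mapsto} T$ | $r_2 b[T]^2$ | $0$ |
| $(a E T)/(g_2 + T)$ | $T + E \mapsto E$ | $(a[E][T])/(g_2 + [T])$ | $0$ |
| $(p_2 E T)/(g_3 + T)$ | $E + T \mapsto E + T + I_L$ | $(p_2[E][T])/(g_3 + [T])$ | $0$ |
| $\mu_3 I_L$ | $I_L \overset{\mu_3}{\mapsto}$ | $\mu_3[I_L]$ | $0$ |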

For similar models it is worth citing [22], in which a model of HIV infection is presented by using DDEs.

3.3 Evolutionary Models

In this section we present a simple predator–prey model with harvesting and time delays. This class of models plays a crucial role in bio–economics, namely the management of renewable resources, and is based upon the competition between the involved species, together with their evolution, for the purpose of seeking resources to sustain their struggle for existence. Depending on their specific settings of application, these models can be interpreted as resource–consumer, parasite–host, tumor cells and virus–immune system, susceptible–infectious interactions and so on. They deal with general loss–win interactions and hence may have applications outside of ecosystems. There are many studies on the application of these models, from those not considering harvesting to those not considering delays; in this simple application we show a classical predator–prey model enriched with both these features.

Let $X(t)$ and $Y(t)$ denote the prey and predator population densities at time $t$. The reference model, taken from [37], is the following.

\[ \frac{dX}{dt} = r_1 X(t) - \frac{r_1}{K} X(t)^2 - b X(t) Y(t) - H \]

\[ \frac{dY}{dt} = -r_2 Y(t) + c X(t - \tau) Y(t - \tau) \]

In the model, $r_1$ is the rate of increase of the prey population, $r_2$ is the death rate of the predator population, $b$ is the coefficient of the effect of predation on $X$, $c$ is the coefficient of the effect of predation on $Y$, $K$ represents the limitation upon the growth of the prey by predation and $H$ is the constant–rate harvesting of the prey species $X$. The delay $\tau$ is a constant based on the assumption that the rate of change of the predators depends on the number of prey and predators at some previous time.

In order to translate this model into an equivalent stochastic process, let us define the set of species $S = \{X, Y\}$ and six reaction channels $R_1, \ldots, R_6$. According to the techniques shown in [20] it is possible to define the following reactions:

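| DDE term | reaction | propensity function | delay |
|---|---|---|---|
| $r_1 X(t)$ | $X \overset{r_1}{\mapsto} 2X$ | $r_1[X]$ | $0$ |
| $r_1 K^{-1} X(t)^2$ | $2X \overset{r_1/K}{\mapsto} X$ | $(r_1/K)[X]^2$ | $0$ |
| $b X(t) Y(t)$ | $X + Y \overset{b}{\mapsto} Y$ | $b[X][Y]$ | $0$ |
| $H$ | $X \overset{H}{\mapsto}$ | $[H]$ | $0$ |
| $r_2 Y(t)$ | $Y \overset{r_2}{\mapsto}$ | $r_2[Y]$ | $0$ |
| $c X(t - \tau) Y(t - \tau)$ | $X + Y \overset{c}{\mapsto} X + 2Y$ | $c[X][Y]$ | $\tau > 0$ |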

4 Formal Modeling of Biological Systems With Delays

In recent years many formalisms have been proposed for modeling biological systems without delays; some are adaptations of formal languages proposed for different purposes, and others are considered to be biologically inspired. As regards the former class of languages, namely the first to appear chronologically in the literature, it is worth mentioning the Stochastic π–calculus [42], namely a stochastic extension of the π–calculus [39], a process algebra for modeling concurrent processes. In this approach chemically reacting entities are described by processes and biological reactions are modeled as communications on channels: the synchronization of communicating processes is interpreted as the firing of a reaction.

As regards the latter class of formal languages, the wider one, it is worth mentioning the κ–calculus [24], BioAmbients [44], Brane Calculi [19], P Systems [40], Stochastic CLS [38] and Stochastic String Multiset Rewriting [9]. These languages differ in the theory on which they are built: they are based on process algebra theory, on concurrent systems theory, on rewriting systems theory or on combinations of these. From a biological perspective, these languages make it easy to define biological systems more complex than simple chemically reacting systems. In fact, they generally offer the possibility of expressing, by means of some ad–hoc syntactic operators, biological aggregations of components, arbitrarily nested membranes and more complex and general biological components. Furthermore, they provide some primitive operations for easily modeling biological events, for instance the creation, the dissolution and the merging of membranes and of their content.

For all the mentioned languages, if a stochastic semantics has not been defined, it is possible to define models on which properties can be verified via model checking or abstract interpretation, but these models cannot be simulated in a stochastic framework. Conversely, for some of them a stochastic semantics has been defined, and this made it possible to develop specific simulators (e.g. SPiM [53], based on the Stochastic π–calculus, CytoSim and PSym [52], based on P Systems, and the CLSm [50], based on Stochastic CLS). These simulators make it possible, in the same fashion as the approximation techniques for ODEs, to trace the time evolution of the modeled biological systems with respect to some simulation algorithm. The former approach is commonly named qualitative modeling and the latter quantitative modeling.

In the next sections we first examine the possibility of extending some of the existing languages in order to qualitatively model biological systems with delays and, in the last section, we discuss their quantitative modeling.

4.1 Qualitative Modeling of Biological Systems With Delays

There exist many formal languages which make it possible to model time as an explicit feature of the model; most of them can be used for modeling real–time systems of interacting components and, for this reason, they may also be used for qualitatively modeling chemically reacting systems with delays. Among the first to appear in the literature it is worth mentioning Timed Petri Nets [41] and Timed Automata [4]. These formal languages have then been enriched with further features for modeling real–time systems, namely probability and stochasticity. In particular, Stochastic Petri Nets [28], Probabilistic Timed Petri Nets [26], Probabilistic Timed Automata [14] and Stochastic Timed Automata [21] have been defined.

We plan to examine these formal languages to decide whether they are suitable as they are or have to be extended for qualitatively modeling biological systems with delays. To this end, we want to mention, and briefly introduce, an interesting variant of Petri Nets [41, 45], namely Interval Timed Colored Petri Nets [1].

A Petri Net is a formal language for the description of discrete distributed systems. A Petri Net consists of places, transitions, and directed arcs. Arcs run between places and transitions, never between places or between transitions. The places from which an arc runs to a transition are called the input places of the transition; the places to which arcs run from a transition are called the output places of the transition.

Places may contain a multiset of tokens. A distribution of tokens over the places of a net is called a marking. A transition of a Petri Net may fire whenever there is a token in each of its input places; when it fires, it consumes these tokens and places tokens in each of its output places. A firing is, in its simplest interpretation, atomic. Execution of Petri Nets is nondeterministic: when multiple transitions are enabled at the same time, any one of them may fire. If a transition is enabled, it may fire, but it doesn't have to.
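The firing rule just described can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the names `enabled` and `fire` are hypothetical, a marking is a `Counter` from places to numbers of tokens, and arcs may carry a weight (the text above corresponds to weight one).

```python
from collections import Counter

def enabled(marking, inputs):
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking.get(place, 0) >= n for place, n in inputs.items())

def fire(marking, inputs, outputs):
    """Atomically consume the input tokens and produce the output tokens."""
    if not enabled(marking, inputs):
        raise ValueError("transition is not enabled")
    new = Counter(marking)
    new.subtract(inputs)
    new.update(outputs)
    return +new          # unary plus drops the places left without tokens
```

For instance, with marking `Counter({"p1": 2, "p2": 1})`, a transition with input places `p1` and `p2` and two output arcs to `p3` yields the marking `Counter({"p3": 2, "p1": 1})`, after which the same transition is no longer enabled.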

Since firing is nondeterministic, and multiple tokens may be present anywhere in the net, Petri Nets are well suited for modeling the concurrent behavior of distributed systems. Furthermore, a well–founded theory for analyzing Petri Nets has been developed over the last 20 years, and this makes it possible to analyze the following properties: reachability, liveness and boundedness [27]. Moreover, model checking of Timed Petri Nets and Stochastic Petri Nets has been defined [23].

It is common to use Petri Nets to qualitatively model chemically reacting systems. In particular, whenever a chemical species is assigned to each place, the number of occurrences of the chemical species in the state of the modeled system is represented by the number of tokens in the corresponding place. Analogously, a reaction can be modeled as a transition starting from the places representing the reactants and ending in the places representing the products. It is easy to note that this model, in the absence of delays and of complex biological structures, is suitable for modeling chemically reacting systems.

In order to consider biological systems with delays, we should consider more complex variants of these nets; in particular, we consider Interval Timed Colored Petri Nets [1] (ITCPN), which use a timing mechanism where time is associated with tokens. In particular, in an ITCPN model a timestamp is attached to every token, indicating the time at which the token becomes available. Furthermore, each token has a distinctive color. The enabling time of a transition is the maximum timestamp of the tokens to be consumed. Transitions are eager to fire (i.e. they fire as soon as possible), therefore the transition with the smallest enabling time will fire first. Firing is an atomic action, which produces tokens with a timestamp of at least the firing time. The difference between the firing time and the timestamp of such a produced token is called the firing delay. The firing delay of a produced token is specified by an upper and a lower bound, i.e. an interval.
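The timing mechanism can be sketched as follows in Python. The sketch relies on several simplifying assumptions of ours which are not part of the ITCPN definition in [1]: colors are ignored, every transition consumes exactly one token per input place (the earliest available one), and the firing delay is drawn uniformly from its interval, whereas the formalism only constrains it to lie within the bounds. The names `enabling_time` and `fire_earliest` are hypothetical.

```python
import random

def enabling_time(marking, inputs):
    """Maximum timestamp of the tokens to be consumed (None if a place is empty)."""
    stamps = []
    for place in inputs:
        if not marking.get(place):
            return None
        stamps.append(min(marking[place]))     # earliest token of each input place
    return max(stamps)

def fire_earliest(marking, transitions, rng):
    """Fire the enabled transition with the smallest enabling time.

    `marking` maps places to lists of token timestamps; `transitions` maps a name
    to (input places, {output place: (lower bound, upper bound) of the delay}).
    Returns the firing time, or None when no transition is enabled.
    """
    candidates = []
    for name, (inputs, _) in transitions.items():
        e = enabling_time(marking, inputs)
        if e is not None:
            candidates.append((e, name))
    if not candidates:
        return None
    when, name = min(candidates)               # eager firing: smallest enabling time
    inputs, outputs = transitions[name]
    for place in inputs:
        marking[place].remove(min(marking[place]))
    for place, (lo, hi) in outputs.items():
        # assumption: uniform choice of the firing delay within [lo, hi]
        marking.setdefault(place, []).append(when + rng.uniform(lo, hi))
    return when
```

With tokens available at times 0 and 2 in the two input places of a transition, the transition fires at time 2 and, for a delay interval $[1, 3]$, the produced token becomes available at some time between 3 and 5.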

There exist also variants of these Petri Nets; in particular, instead of using interval timing, there exist Petri Net models with fixed delays or stochastic delays. Petri Nets with fixed deterministic delays have been proposed in [56, 43, 49, 34].

If we want to model variable delays by assuming certain delay distributions, i.e. to use a timed Petri Net model with delays described by probability distributions, we can refer to Stochastic Petri Nets [28, 2, 3]. Analysis of Stochastic Petri Nets is possible in theory, since the reachability graph can be regarded, under certain conditions, as a Markov chain or a semi–Markov process. These conditions are severe: all firing delays have to be sampled from an exponential distribution or the topology of the net has to be of a special form [2]. We can notice that the first of these conditions is in complete accordance with Gillespie's algorithm and with the extensions proposed in Section 2.2, in which the firing delays are exponentially distributed values.

In order to have a general framework for modeling delayed events, the ITCPN propose delays described by an interval specifying an upper and a lower bound for the duration of the corresponding activity. On the one hand, interval delays allow for the modeling of variable delays; on the other hand, it is not necessary to determine some artificial delay distribution (as opposed to stochastic delays). Instead, we have to specify bounds. These bounds can be used to verify time constraints. This is very important when modeling time–critical systems, and this kind of analysis may also become useful in verifying the correctness of the modeled biological systems. The ITCPN are, therefore, a prominent candidate for the qualitative modeling of biological systems with delays.

4.2 Quantitative Modeling of Biological Systems With Delays

In this section we expect to consider two formalisms belonging to two different classes of formal languages used in systems biology and to extend their stochastic semantics so that they can be used for modeling delays.

As already said, modeling biological systems without delays is possible with the current semantics based on labeled transition systems (LTS), but, in order to extend these LTSs to delays, it is necessary to define alternative semantics. In particular, we can notice that from the current LTS it is possible to build a Continuous–Time Markov Chain (CTMC), namely a matrix of all the possible different states enriched with the probability of moving from one state to another; these probabilities, by definition of CTMC, depend only on the current state of the modeled system. For this reason, on this CTMC, built starting from the LTS, it is possible to apply the SSA but not the DSSA.


In particular, when considering models with delays, we have to deal with the fact that, under some interpretations of the delays as shown in the previous chapter, the probabilities depend on a trace of the system and not only on the current state. The corresponding stochastic process is not, as it was for non–delayed systems, a Markov process, namely memoryless, and consequently the current semantics cannot be used. It is then necessary to have an LTS in which the current state of the system is enriched with all the past states visited. This implies that two states of a biological system are equal if and only if they share the current state together with the whole trace.
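The notion of equality between enriched states can be illustrated by a small Python sketch. It is purely an assumption of ours about how such configurations might be represented, since the semantics is still to be defined: the name `Configuration` is hypothetical, a state is encoded as a tuple of (species, count) pairs, and the trace also records the time at which each past state was left, which is what a delayed propensity function would need.

```python
from dataclasses import dataclass
from typing import Tuple

State = Tuple[Tuple[str, int], ...]    # a multiset as sorted (species, count) pairs

@dataclass(frozen=True)
class Configuration:
    """Current state enriched with the timed trace of all past states visited."""
    current: State
    trace: Tuple[Tuple[float, State], ...] = ()

    def step(self, time: float, new_state: State) -> "Configuration":
        """Move to new_state at `time`, recording when the old state was left."""
        return Configuration(new_state, self.trace + ((time, self.current),))
```

Two configurations reached through different traces are different even when their current states coincide, which is exactly why the resulting transition system is larger than the one underlying the SSA.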

This kind of work has not been done so far and it is the main theoretical issue of this thesis; nevertheless, it is a result which, together with the formal definition of the delay stochastic simulation framework, could open new research applications in systems biology.

As regards the languages we want to work on, we choose the Stochastic π–calculus [42] and the Stochastic Calculus of Looping Sequences [38]. The motivations for choosing these two formalisms are the following: the Stochastic π–calculus (SPi) is a stochastic variant of the π–calculus [39] process algebra and has been widely used for the description of biological systems; furthermore, many other formalisms are defined as its extensions. The Stochastic CLS [38], instead, is an extension of the Calculus of Looping Sequences (CLS) [11, 12], a rewriting system language in which it is possible to syntactically specify arbitrarily nested membranes and atomic objects, and to define objects inside or on the surface of the membranes. These two languages have been chosen because they can be considered representative of the classes of process algebras and rewriting systems.

SPi processes are based on input and output actions on channels; processes represent biological entities, and a communication on a channel in SPi is interpreted as a chemical reaction. For this reason every channel in SPi is associated with a reaction rate, namely the rate at which the associated reaction happens, and every transition in the semantics of SPi is labeled with a function of the rate of the channel on which the communication has been performed. Typically, this function computes a kinetic value which corresponds to the value computed by the law of mass action for that reaction, namely the value computed by its propensity function. Such a semantics makes it possible to apply the SSA on the obtained LTS. We expect to extend SPi channels with a delay and, consequently, to redefine the communication semantics in order to apply the DSSA.

CLS is a formalism based on term rewriting systems with some features, such as a commutative parallel composition operator, and some semantic means, such as bisimulations, which are common in process algebras. All this makes it possible to combine the simplicity of notation of rewriting systems with the advantage of a form of compositionality.

Given an alphabet of symbols representing basic biological entities, such as genes, proteins and other macromolecules, CLS terms are constructed by applying to these symbols operators of sequencing, looping, containment and parallel composition. Terms constructed by means of these operators represent biological structures such as DNA sequences and membranes. Rewrite rules can be used to model biological events that permit the system to evolve. In particular, they can be used to model biochemical reactions and structure rearrangements such as membrane fusion and dissolution.

As an extension, the Stochastic CLS allows the description of quantitative aspects of the modeled systems, such as the frequency of chemical reactions, and, when reactions are equipped with rates, the LTS obtained by the application of the semantics makes it possible to apply the SSA.

These two languages, once a delay is added to channels in the case of SPi, or to each rewrite rule in the case of Stochastic CLS, become two interesting candidates for the application of the DSSA.


5 Conclusions

In this Ph.D. thesis proposal we addressed the issue of formal modeling of biological systems with delays. Delays in biological systems may appear at any level of detail of the modeled system; in particular, delays may be used to model events for which the underlying dynamics cannot be completely observed. There exist constant and variable (e.g. time–dependent, state–dependent) forms of delays and different modeling techniques for interpreting delays. In this thesis we addressed the problem of formal modeling of biological systems with delays in all their variants and interpretations. In the first part of the thesis we introduced the framework for deterministic modeling of biological systems (e.g. Delay Differential Equations) and we defined Delayed Chemical Master Equations. In complete accordance with the standard approach for formal modeling of biological systems without delays, we also extended the framework for stochastic modeling by defining some variants of Delayed Stochastic Simulation Algorithms. In the second part of the thesis we addressed the problem of both qualitative and quantitative formal modeling of biological systems with delays by using existing formal languages theory and by extending well–known formal languages for modeling biological systems without delays.

References

[1] W.M.P. van der Aalst, Interval Timed Coloured Petri Nets and their Analysis, in:Lecture Notes in Computer Science 691 (1993), 453–472.

[2] M. Ajmone Marsan, G. Balbo, A. Bobbio, G. Chiola, G. Conte, A. Cumani, OnPetri Nets with Stochastic Timing, in: Proceedings of the International Workshopon Timed Petri Nets, Torino, 1985, IEEE Computer Society Press, 80–87.

[3] M. Ajmone Marsan, G. Balbo, G. Conte, A Class of Generalised Stochastic PetriNets for the Performance Evaluation of Multiprocessor Systems, in: ACM Trans-actions on Computer Systems 2 (1984), 93–122.

[4] R. Alur, D.L. Dill, A Theory of Timed Automata, in: Theoretical Computer Science126(1994), 183–235.

[5] O. Arino, K.P. Hadler, M.L. Hbid Existence of Periodic Solutions for Delay Differ-ential Equations with State Dependent Delay, in: Journal of Differential Equations144 (1998), 263–301.

[6] O. Arino, M.L. Hbid, R.B. De La Parra A Mathematical Model of Growth of Pop-ulation of Fish in the Larval Stage: DEnsity–dependence Effects, in: MathematicalBiosciences 150 (1998), 1–20.

[7] C.T.H. Baker, G.A. Bocharov, F.A. Rihan, A Report on the Use of Delay Differen-tial Equations in Numerical Modelling in the Biosciences, Technical Report, TheUniversity of Manchester, ISSN 1360–1725.

[8] R. Barbuti, G. Caravagna, A. D’Onofrio, P. Milazzo, Tumor Suppression by Im-mune System Through Stochastic Limit Cycles, Draft.

[9] R. Barbuti, G. Caravagna, A. Maggiolo-Schettini, P. Milazzo, A Delay StochasticSimulation Algorithm with Delayed Propensity Functions, Draft.

[10] R. Barbuti, G. Caravagna, A. Maggiolo-Schettini, P. Milazzo, An IntermediateLanguage for the Simulation of Biological Systems, in: Proceedings of the 1thInternational Workshop From Biology To Concurrency and Back, FBTC07, in:Electronic Notes in Theoretical Computer Science 194 (2008), 19–34.

27

Page 29: Formal Modeling of Biological Systems with Delaysgroups.di.unipi.it/~caravagn/papers/proposal-phd-thesis.pdf · 2009-01-16 · Universita degli Studi di Pisa Dipartimento di Informatica

[11] R. Barbuti, G. Caravagna, A. Maggiolo-Schettini, P. Milazzo, G. Pardini, TheCalculus of Looping Sequences, in: M. Bernardo, P. Degano and G. Zavattaro(Eds.), Formal Methods for Computational Systems Biology, Lecture Notes inComputer Science, vol. 5016, Springer, 2008, 387–423.

[12] R. Barbuti, A. Maggiolo-Schettini, P. Milazzo, A. Troina, A Calculus of Looping Sequences for Modelling Microbiological Systems, in: Fundamenta Informaticae 72 (2006), 21–35.

[13] M. Barrio, K. Burrage, A. Leier, T. Tian, Oscillatory Regulation of Hes1: Discrete Stochastic Delay Modelling and Simulation, in: PLoS Computational Biology 2(9) (2006), e117.

[14] D. Beauquier, On Probabilistic Timed Automata, in: Theoretical Computer Science 292 (2003), 65–84.

[15] E. Beretta, T. Hara, W. Ma, Y. Takeuchi, Permanence of an SIR Epidemic Model with Distributed Time Delays, in: Tohoku Mathematical Journal 54(2) (2002), 581–591.

[16] D. Bratsun, D. Volfson, L.S. Tsimring, J. Hasty, Delay-induced Stochastic Oscillations in Gene Regulation, in: Proceedings of the National Academy of Sciences of the United States of America 102(41) (2005), 14593–14598.

[17] Y. Cao, D. Gillespie, L. Petzold, The Slow-scale Stochastic Simulation Algorithm, in: Journal of Chemical Physics 122(1) (2005), 014116.

[18] Y. Cao, H. Li, L. Petzold, Efficient Formulation of the Stochastic Simulation Algorithm for Chemically Reacting Systems, in: Journal of Chemical Physics 121(9) (2004), 4059–4067.

[19] L. Cardelli, Brane Calculi. Interactions of Biological Membranes, in: Proceedings of the International Conference on Computational Methods in Systems Biology, CMSB 04, in: Lecture Notes in Computer Science, vol. 3082, Springer, 2005, pp. 257–280.

[20] L. Cardelli, On Process Rate Semantics, in: Theoretical Computer Science 391(3) (2008), 190–215.

[21] C.G. Cassandras, S. Lafortune, Stochastic Timed Automata, in: Introduction to Discrete Event Systems, Springer, 2007, 327–367.

[22] R.V. Culshaw, S. Ruan, A Delay–differential Equation Model of HIV Infection of CD4+ T–cells, in: Mathematical Biosciences 165 (2000), 27–39.

[23] D. D’Aprile, Timed and Stochastic Model Checking of Petri Nets, Ph.D. Thesis, University of Turin, Turin, Italy, 2006.

[24] V. Danos, C. Laneve, Formal Molecular Biology, in: Theoretical Computer Science 325 (2004), 69–110.

[25] R.D. Driver, Ordinary and Delay Differential Equations, Springer–Verlag, New York, 1977, ix+501 pp.

[26] M.A. Escalante, N.J. Dimopoulos, A Probabilistic Approach to Timing Analysis for Synthesis, Technical Report ECE-94-6, University of Victoria, 1994.

[27] J. Esparza, M. Nielsen, Decidability Issues for Petri Nets, in: Petri Nets Newsletter 52 (1994), 245–262.

[28] G. Florin, S. Natkin, Evaluation Based Upon Stochastic Petri Nets of the Maximum Throughput of a Full Duplex Protocol, in: Application and Theory of Petri Nets: Selected Papers from the First and the Second European Workshop, C. Girault and W. Reisig, eds., vol. 52 of Informatik Fachberichte, Springer-Verlag, Berlin, 1982, pp. 280–288.

[29] A. Gallegos, T. Plummer, D. Uminsky, C. Vega, C. Wickman, M. Zawoiski, A Mathematical Model of a Crocodilian Population Using Delay–differential Equations, in: Journal of Mathematical Biology 57 (2008), 737–754.

[30] M.A. Gibson, J. Bruck, Efficient Exact Stochastic Simulation of Chemical Systems with Many Species and Many Channels, in: Journal of Physical Chemistry A 104(9) (2000), 1876–1889.

[31] D. Gillespie, Exact Stochastic Simulation of Coupled Chemical Reactions, in: Journal of Physical Chemistry 81 (1977), 2340–2361.

[32] D. Gillespie, A General Method for Numerically Simulating the Stochastic Time Evolution of Coupled Chemical Reactions, in: Journal of Computational Physics 22(4) (1976), 403–434.

[33] D. Gillespie, Approximate Accelerated Stochastic Simulation of Chemically Reacting Systems, in: Journal of Chemical Physics 115(4) (2001), 1716–1733.

[34] K.M. van Hee, L.J. Somers, M. Voorhoeve, Executable Specifications for Distributed Information Systems, in: Proceedings of the IFIP TC 8 / WG 8.1 Working Conference on Information System Concepts: An In-depth Analysis, E.D. Falkenberg and P. Lindgreen, eds., Namur, Belgium, 1989, Elsevier Science Publishers, Amsterdam, pp. 139–156.

[35] D. Kirschner, J.C. Panetta, Modeling Immunotherapy of the Tumor–immune Interaction, in: Journal of Mathematical Biology 37 (1998), 235–252.

[36] P. Magal, O. Arino, Existence of Periodic Solutions for a State Dependent Delay Differential Equation, in: Journal of Differential Equations 165 (2000), 61–95.

[37] A. Martin, S. Ruan, Predator-prey Models with Delay and Prey Harvesting, in: Journal of Mathematical Biology 43(3) (2001), 247–267.

[38] P. Milazzo, Qualitative and Quantitative Formal Modeling of Biological Systems, Ph.D. Thesis, University of Pisa, Pisa, Italy, 2007.

[39] R. Milner, Communicating and Mobile Systems: the π–Calculus, Cambridge University Press, 1999.

[40] G. Paun, Membrane Computing. An Introduction, Springer, 2002.

[41] C.A. Petri, Kommunikation mit Automaten, Ph.D. Thesis, University of Bonn, 1962.

[42] C. Priami, A. Regev, W. Silverman, E. Shapiro, Application of a Stochastic Name–Passing Calculus to Representation and Simulation of Molecular Processes, in: Information Processing Letters 80 (2001), 25–31.

[43] C. Ramchandani, Performance Evaluation of Asynchronous Concurrent Systems by Timed Petri Nets, Ph.D. Thesis, Massachusetts Institute of Technology, Cambridge, 1973.

[44] A. Regev, E.M. Panina, W. Silverman, L. Cardelli, E.Y. Shapiro, BioAmbients: an Abstraction for Biological Compartments, in: Theoretical Computer Science 325(1) (2004), 141–167.

[45] W. Reisig, Petri nets: an Introduction, Prentice–Hall, Englewood Cliffs, 1985.

[46] M.R. Roussel, R. Zhu, Validation of an Algorithm for Delay Stochastic Simulation of Transcription and Translation in Prokaryotic Gene Expression, in: Physical Biology 3 (2006), 274–284.

[47] R. Schlicht, G. Winkler, A Delay Stochastic Process with Applications in Molecular Biology, in: Journal of Mathematical Biology 57(5) (2008), 613–648.

[48] V. Shahrezaei, J.F. Ollivier, P. Swain, Colored Extrinsic Fluctuations and Stochastic Gene Expression, in: Molecular Systems Biology 4 (2008), 196.

[49] J. Sifakis, Use of Petri Nets for Performance Evaluation, in: Proceedings of the Third International Symposium IFIP W.G. 7.3., Measuring, Modelling and Evaluating Computer Systems (Bonn-Bad Godesberg, 1977), H. Beilner and E. Gelenbe, eds., Elsevier Science Publishers, Amsterdam, 1977, pp. 75–93.

[50] The CLSm web page: http://www.di.unipi.it/~milazzo/biosims/.

[51] The DSSA with Scheduled Reactions, a Java implementation, web page: http://www.di.unipi.it/~caravagn/software/.

[52] The P Systems web page: http://psystems.disco.unimib.it/.

[53] The SPiM web page: http://research.microsoft.com/~aphillip/spim/.

[54] T. Tian, K. Burrage, P.M. Burrage, M. Carletti, Stochastic Delay Differential Equations for Genetic Regulatory Networks, in: Journal of Computational and Applied Mathematics 205(2) (2007), 377–427.

[55] F. Zhang, Z. Li, F. Zhang, Global Stability of an SIR Epidemic Model with Constant Infectious Period, in: Applied Mathematics and Computation 199(1) (2008), 285–291.

[56] W.M. Zuberek, Timed Petri Nets and Preliminary Performance Evaluation, in: Proceedings of the 7th Annual Symposium on Computer Architecture, vol. 8(3) of Quarterly Publication of ACM Special Interest Group on Computer Architecture, 1980, 62–82.
