Diffusion approximation for signaling stochastic networks

Available online at www.sciencedirect.com

Stochastic Processes and their Applications 123 (2013) 2957–2982
www.elsevier.com/locate/spa


Saul C. Leite (a), Marcelo D. Fragoso (b,∗)

(a) Department of Computer Science, Federal University of Juiz de Fora (UFJF), Brazil
(b) Department of Systems and Control, National Laboratory for Scientific Computing (LNCC), Av. Getulio Vargas, 333, Quitandinha, Petropolis, RJ, CEP: 25651-075, Brazil

Received 13 March 2012; received in revised form 1 March 2013; accepted 2 March 2013
Available online 14 March 2013

Abstract

This paper introduces a unified approach to diffusion approximations of signaling networks. This is accomplished by the characterization of a broad class of networks that can be described by a set of quantities which undergo exchanges stochastically in time. We call this class stochastic Petri nets with probabilistic transitions, since it is described as a stochastic Petri net but allows a finite set of random outcomes for each transition. This extension permits effects on the network which are commonly interpreted as “routing” in queueing systems. The class is general enough to include, for instance, G-networks with negative customers and triggers as a particular case. With this class at hand, we derive a heavy traffic approximation, where the processes that drive the transitions are given by state-dependent Poisson-type processes and where the probabilities of the random outcomes are also state-dependent. The objective of this approach is to have a diffusion approximation which can be readily applied in several practical problems. We illustrate the use of the results with some numerical experiments.
© 2013 Elsevier B.V. All rights reserved.

Keywords: Queueing theory; Heavy traffic analysis; Stochastic Petri nets; G-networks

1. Introduction

Petri nets (PNs) are graph-theoretical models of communication systems, such as those characterized by being concurrent, asynchronous and distributed [37]. They were originally introduced by Petri in [39], with the main goal of studying qualitative properties of the system that can be detected using this formalism, such as deadlocks, cycles, and reachability of states [44,40,37]. Besides the obvious gain in theoretical understanding, PNs also introduced a graphical representation which is attractive to practitioners and serves as a link between theory and application. Initial applications included, most notably, the design of communication protocols and performance evaluation of computer systems [37].

∗ Corresponding author. Tel.: +55 24 2233 6008; fax: +55 24 2233 6141.
E-mail addresses: [email protected], [email protected] (M.D. Fragoso).

0304-4149/$ - see front matter © 2013 Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/j.spa.2013.03.002

The idea of introducing time in the description of PNs came later with Ramchandani in [43], and with it came the idea of studying the time evolution and dynamics of these systems. Stochastic Petri nets (SPNs) are natural extensions of timed PNs and were introduced later in [45,38,35,36]. In these models, transitions which change the system state occur stochastically in time and are given by random variables. In all of their varying forms, PNs have been widely applied for performance evaluation of computer systems and of parallel and distributed systems and, more recently, they have been applied extensively in modeling biological (molecular) networks (some examples from the vast literature include [22,19,8,7,48,18,41]).

Stochastic Petri nets have several similarities with queueing systems. In fact, if one is willing to forget for a moment things such as order of service, sojourn times, or other queueing-theoretic problems, we can interpret them simply as systems composed of holding bins containing a certain amount of some quantities and a set of transitions which exchange the quantities in the bins. If these aspects of queueing systems are not relevant, one can indeed use the SPN formalism to describe a single queue, for example. However, one aspect of queueing networks which cannot be modeled by SPNs is routing, since it establishes a certain level of correlation between transitions with different effects on the network. In order to cope with this, a new class of SPNs is introduced in this paper. In this class, each transition has a set of possible effects on the net, one of which is chosen at random, i.e., stochastic Petri nets with probabilistic transitions (from now on, SPN PT). In addition, we derive for this class a heavy traffic approximation, where the processes that drive the transitions are given by state-dependent Poisson-type processes and where the probabilities of the random outcomes are also state-dependent. Favorable features of this generalization are, inter alia, that:

• One can describe, for instance, Jackson-type networks and also an important class of queueing systems called G-networks as an SPN PT (for more details on G-networks see e.g. [14,16,28] and the reviews [2,11,12]).

• Since SPN PT is general enough to include a great deal of networks as particular cases, the heavy traffic approximation obtained here extends to these classes and generalizes the previous result presented in [29] (detail on how our previous work is extended is given below).

Most work dedicated to G-networks and Jackson-type networks, among others, considers the system under stationary regime. As in any regular queueing network, an exact and convenient description of the time evolution of the system state is a hard problem due to its discrete nature and the complex interaction among the queues. One common detour has been simulation, which can be computationally expensive. In addition, simulations are not useful to answer questions related to optimum (or nearly optimum) control, which are of special interest for signaling networks. Another approach to describe the transient evolution of queueing systems is to construct an approximate model. There exist two common approximations, the fluid and the diffusion (or heavy traffic) approximation. Usually, fluid approximations describe the time evolution of the system average by a differential equation. The diffusion approximation differs from the fluid approximation since the “randomness” present in queueing systems is not averaged out (e.g., [25,9,47]) and, instead, it appears in the limit as a Wiener process or sometimes as an Ito integral. Hence, in this aspect, the diffusion approximation is more faithful to the dynamics of the system than the fluid approximation. However, that comes with the addition of an assumption which requires the system to operate under heavy traffic, in the sense that the rate of customers entering any particular queue is close to the rate of customers leaving this queue. This assumption is common in several practical problems and it has been observed that this approximation works well even for systems under moderate traffic (see for instance [25, Chapter 1]).

In the particular case of G-networks, fluid approximations have been considered in [1,17]. A heavy traffic approximation was presented in [29] for G-networks satisfying a certain condition on the network topology, and having only the so-called “negative customer” type of signals. A particular feature of the results derived here is that they generalize our previous result given in [29] in three ways: (i) the heavy traffic limit for a G-network with negative customers with any topology can be set into the framework proposed here; (ii) it is possible to consider different types of signals called “triggers”; and (iii) the model is constructed in a more flexible way, allowing cases which are not covered by G-networks.

We will often use a “chemical reaction” type notation to describe the systems considered here. For example, a network of two queues in tandem will be described by the quantity types $Q_1$ and $Q_2$ and the stochastic transitions $A$, $D$, and $T$. These transitions have the following effects on the system:

$$\emptyset \xrightarrow{A} Q_1$$

$$Q_2 \xrightarrow{D} \emptyset$$

$$Q_1 \xrightarrow{T} Q_2,$$

which are interpreted as the arrival of a customer from an external source (where $\emptyset$ denotes this external source) into queue 1 ($Q_1$), the departure of customers from queue 2 ($Q_2$), and the departure of customers from queue 1 ($Q_1$) which are routed to queue 2 ($Q_2$), respectively. This description also allows us to represent signaling in queueing systems. For example, we could think of the transitions:

$$Q_1 + Q_2 \xrightarrow{T_a} \emptyset,$$

$$Q_1 + Q_2 \xrightarrow{T_b} Q_3.$$

The transition $T_a$ consumes a quantity of type $Q_1$ and of type $Q_2$. This transition can be interpreted as the “negative customer” that leaves queue 1 ($Q_1$) and removes a customer from queue 2 ($Q_2$). Similarly, the transition $T_b$ can be thought of as the effect of a trigger, where a customer in queue 1 ($Q_1$) moves a customer from queue 2 ($Q_2$) into queue 3 ($Q_3$). Also, a system with routing can be represented as, for example:

$$Q_1 + Q_2 \xrightarrow{T} \begin{cases} Q_3 & \text{(with probability } p\text{)} \\ Q_4 & \text{(with probability } q\text{)}, \end{cases}$$

where $p + q = 1$. In this case the outcome of the transition $T$ can produce a quantity of $Q_3$ with probability $p$ and a quantity of $Q_4$ with probability $q$.
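The reaction-type notation above can be turned into net change vectors mechanically. The following is a minimal sketch, not the authors' construction: the function names `net_change` and `expand_outcomes`, the species ordering, and the value $p = 0.3$ are assumptions made only for illustration.

```python
from collections import Counter

def net_change(inputs, outputs, species):
    """Net change vector (outputs minus inputs) over the ordered species list."""
    pre, post = Counter(inputs), Counter(outputs)
    return [post[s] - pre[s] for s in species]

def expand_outcomes(inputs, outcomes, species):
    """Split a probabilistic transition into (probability, net change) pairs."""
    total = sum(p for p, _ in outcomes)
    if abs(total - 1.0) > 1e-12:
        raise ValueError("outcome probabilities must sum to 1")
    return [(p, net_change(inputs, out, species)) for p, out in outcomes]

species = ["Q1", "Q2", "Q3", "Q4"]
# Negative customer Ta: Q1 + Q2 -> (nothing)
ta = net_change(["Q1", "Q2"], [], species)
# Trigger Tb: Q1 + Q2 -> Q3
tb = net_change(["Q1", "Q2"], ["Q3"], species)
# Routing T: Q1 + Q2 -> Q3 w.p. p, Q4 w.p. q (p = 0.3 is an arbitrary choice)
routing = expand_outcomes(["Q1", "Q2"], [(0.3, ["Q3"]), (0.7, ["Q4"])], species)
```

Each outcome of the routing transition ends up with its own net change vector, which is the viewpoint the paper adopts when it treats every individual outcome as a separate transition.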

Our main goal with this approach is to devise a model general enough such that its diffusion approximation can be directly used, in a unified way, in several different practical problems. In order to obtain such an approximation, we adopt an approach in the spirit of [29]. We consider state-dependent transition rates as well as state-dependent outcome probabilities. One of the peculiarities of the approximation for the system presented here is the presence of reflection directions which appear only at corners or edges of the state space, and which may not be convex combinations of the directions at the adjacent faces. Recently, a similar scenario has appeared in [3] but, in general, this is not usual in queueing systems. For non-degenerate diffusions, we show that these directions do not interfere in the dynamics of the limit process using a result from [10] (a similar approach was used in [3, Section 7.4]).

The layout of the paper is as follows: we begin by introducing in Section 2 some notation that will be used throughout the paper. In Section 3, we give a brief introduction to Petri nets, which helps to fix some nomenclature and notation that will be used in the following sections. Next, in Section 4, we introduce the stochastic Petri net model with Poisson-type transitions and state-dependent rates. In Section 4.1, we derive the diffusion approximation for this class of networks. Section 4 serves as a preparation for Section 5, where we introduce the SPN PT model with Poisson-type transitions, state-dependent rates and outcome probabilities. The heavy traffic approximation for this class of networks is derived by showing that it can be reduced to the class of networks given in Section 4. That is, we can see each individual outcome of a transition as an independent transition. This way, the notation needed to introduce the diffusion approximations for the SPN PT is significantly reduced, and the result can be more readily used. At last, in Section 6, we illustrate the use of the results with some practical problems.

2. Preliminaries and notation

In this section, we introduce the bare essentials of the notation that will be used in the paper. Let $\mathbb{N} = \{1, 2, 3, \dots\}$ be the set of natural numbers and let $\mathbb{N}_0 \triangleq \mathbb{N} \cup \{0\}$ stand for the set of non-negative integers. Also, we define $\mathbb{R}_{>0}$ as the set of positive real numbers and $\mathbb{R}_{\ge 0}$ as the set of non-negative real numbers. In addition, define $\bar{\mathbb{R}}_{>0}$ as $\mathbb{R}_{>0} \cup \{\infty\}$. For a set $S \subseteq \mathbb{N}_0$, let $|S|$ be its cardinality, and, for some constant $a \in \mathbb{N}_0$, define $S + a$ as the set $\{i + a \mid i \in S\}$. Also, let $\mathcal{P}(S)$ be the set of subsets of $S$. In addition, for any finite set $S \subset \mathbb{N}$ such that $|S| \ge 1$, and vectors $d_i \in \mathbb{R}^u$, $i \in S$, let $\mathrm{ConvexHull}\{d_i, i \in S\}$ be the set generated by convex combinations of the vectors $d_i$, $i \in S$. For a vector $\nu \in \mathbb{R}^n$ let $\mathrm{diag}(\nu)$ be the $n \times n$ diagonal matrix with diagonal entries given by the vector $\nu$.

For $i \in \{1, \dots, u\}$, define the following half-spaces: $G_i \triangleq \{\xi \in \mathbb{R}^u \mid \xi_i \ge 0\}$. Also, for some given constants $B_i \in \bar{\mathbb{R}}_{>0}$, $i \in \{1, \dots, u\}$, let $G_j \triangleq \{\xi \in \mathbb{R}^u \mid \xi_{j-u} \le B_{j-u}\}$, for each $j \in \{u+1, \dots, 2u\}$ for which $B_{j-u}$ is finite. For convenience, let $F \triangleq \{1, \dots, u\} \cup \{i + u \mid B_i < \infty\}$. In this paper, we consider diffusion processes constrained to the state space $G$ defined as the intersection of these half-spaces: $G \triangleq \bigcap_{i \in F} G_i$. Let $\partial G$ denote the boundary of the state space $G$. Let us define the faces of the state space as $\partial_i \triangleq \{\xi \in G \mid \xi_i = 0\}$ for $i \in \{1, \dots, u\}$ and let $\partial_j \triangleq \{\xi \in G \mid \xi_{j-u} = B_{j-u}\}$, when $B_{j-u} < \infty$, for $j \in \{u+1, \dots, 2u\}$.

When we say “corners” of the state space, we mean any $\bigcap_{k \in K} \partial_k$ with only one element, where $K \subseteq \{1, \dots, 2u\}$ and $|K| \ge 2$. Similarly, an “edge” of the state space refers to $\bigcap_{k \in K} \partial_k$ with more than one element. Let $E$ be the set containing the sets of indexes of possible corners or edges of the state space. For instance, if $|K| \ge 2$ and $\bigcap_{k \in K} \partial_k \ne \emptyset$ then $K \in E$. In addition, let us define the following “extended faces”: $\bar{\partial}_i \triangleq \{\xi \in \mathbb{R}^u \mid \xi_i \le 0\}$ for $i \in \{1, \dots, u\}$ and $\bar{\partial}_j \triangleq \{\xi \in \mathbb{R}^u \mid \xi_{j-u} \ge B_{j-u}\}$, for $j \in \{u+1, \dots, 2u\}$ for which $B_{j-u}$ is finite.

3. Petri nets

Petri nets are described by a set of places, transitions, and directed arcs. Each place has an associated marking which indicates the number of tokens present. For example, the markings can represent the “number of customers” in a queueing system, or the “number of molecules” of a certain type in a biochemical network. Each directed arc links a transition to a place or a place to a transition. When a transition is fired, it moves tokens from places that are linked to it by the directed arcs. An arc that links a place to a transition indicates that the place is “needed” for the transition to take place. Arcs linking transitions to places indicate which places receive output from the transition. The number of tokens that are moved is given by the weights assigned to the arcs. A transition can only be fired if each place that has an arc connecting it to the transition has a marking greater than or equal to the weight of that arc.

Fig. 1. Graphical representation of a Petri net. (a) The black circles inside the nodes are the tokens before the firing of transition $t_1$. The numbers near the arrows indicate the arc weights. Notice that there are enough tokens at the nodes to fire transition $t_1$. (b) The black circles indicate the marking of the net after the firing of the transition $t_1$. At the present state of the network, the transition $t_1$ cannot be fired again since there are not sufficient tokens at $p_2$ for it to take place.

In the example of Fig. 1, we have three places $p_1$, $p_2$, and $p_3$ that have markings 3, 4, and 2, respectively, before the firing of transition $t_1$. The transition $t_1$ needs one token from $p_1$ and three from $p_2$ to produce two tokens of $p_3$. Alternatively, the system can be represented as we have done before:

$$p_1 + 3p_2 \xrightarrow{t_1} 2p_3.$$

We will call a place an input place for the transition $t_i$ if there is a directed arc connecting it to the transition. Similarly, a place will be called an output place for transition $t_i$ if there is a directed arc connecting the transition to the place. For example, in Fig. 1, the places $p_1$ and $p_2$ are input places of $t_1$ and $p_3$ is an output place (it is possible to have a place that is both an input and output of a transition). In addition, we will use the word node as a synonym for place, and use both words interchangeably.

Formally, a Petri net is defined as a directed, weighted, bipartite graph with a marking [37]. That is,

Definition 1 ([37]). A Petri net is the quintuple $(P, T, F, W, M_0)$ where:

$P = \{p_1, \dots, p_u\}$, $u \in \mathbb{N}$, is a finite set of places,
$T = \{t_1, \dots, t_v\}$, $v \in \mathbb{N}$, is a finite set of transitions,
$F \subseteq (P \times T) \cup (T \times P)$ is a set of directed arcs,
$W : F \to \mathbb{N}$ is a weight function,
$M_0 : P \to \mathbb{N}_0$ is the marking of the system,

where $P \cap T = \emptyset$.

Let $X_0 \in \mathbb{N}^u$ be the vector constructed by setting $(X_0)_i \triangleq M_0(p_i)$, for $i \in \{1, \dots, u\}$. Also, let $\mathrm{Pre} \in \mathbb{N}^{v \times u}$ be the matrix constructed with the weights of arcs going from places to transitions, that is $(\mathrm{Pre})_{ki} \triangleq W(p_i, t_k)$, for $(k, i) \in \{1, \dots, v\} \times \{1, \dots, u\}$. Similarly, let $\mathrm{Post} \in \mathbb{N}^{v \times u}$ be the matrix constructed with the weights of arcs going from transitions to places, that is $(\mathrm{Post})_{ki} \triangleq W(t_k, p_i)$, for $(k, i) \in \{1, \dots, v\} \times \{1, \dots, u\}$. The proposition below is a well-known result for Petri nets that can be easily verified and will be used in this paper (see for instance [44, Chapter 5]):

Proposition 2. Let $r \in \{0, 1\}^v$ be a vector indicating the set of transitions to be fired (that is, transition $t_k$ will be fired if $r_k = 1$, otherwise $r_k = 0$). Let $X_0$ be the initial marking of the system. Suppose that there are enough tokens in the input nodes for each transition that will be fired. The new marking $X$ after the firing of the transitions given by $r$ is

$$X = X_0 + A^T r$$

where $A \triangleq \mathrm{Post} - \mathrm{Pre}$.
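As a quick check of Proposition 2, the net of Fig. 1 can be fired numerically. This is an illustrative sketch: the helper name `fire` and the plain-list matrix layout (rows are transitions, columns are places) are assumptions, not notation from the paper.

```python
def fire(x0, pre, post, r):
    """Return X = X0 + A^T r with A = Post - Pre (rows: transitions, cols: places)."""
    v, u = len(pre), len(x0)
    # Enabling check: each fired transition needs Pre[k][i] tokens at place i.
    for i in range(u):
        needed = sum(pre[k][i] * r[k] for k in range(v))
        if x0[i] < needed:
            raise ValueError(f"not enough tokens at place {i + 1}")
    a = [[post[k][i] - pre[k][i] for i in range(u)] for k in range(v)]
    return [x0[i] + sum(a[k][i] * r[k] for k in range(v)) for i in range(u)]

# Net of Fig. 1: p1 + 3 p2 -> 2 p3, initial marking (3, 4, 2).
pre = [[1, 3, 0]]
post = [[0, 0, 2]]
x1 = fire([3, 4, 2], pre, post, [1])
print(x1)  # [2, 1, 4]
```

Calling `fire(x1, pre, post, [1])` a second time raises `ValueError`, in agreement with the caption of Fig. 1: only one token is left at $p_2$, while $t_1$ needs three.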

For further details on Petri nets, see for instance the book by Reisig [44].

4. Stochastic Petri nets

A stochastic Petri net is defined as above except that each transition is assigned a random firing time [32,33]. Let us suppose that $T(t)$ is an $\mathbb{N}_0^v$-valued random variable where each component $T_k(t)$, $k \in \{1, \dots, v\}$, counts the number of firings of transition $t_k$ up to time $t$. Then we can write the marking of the system at any given time $t \in \mathbb{R}_{\ge 0}$ as

$$X(t) = X(0) + A^T T(t),$$

where $X(0)$ is the initial marking of the system, and $A$ is defined as in Proposition 2. We will suppose that each place can hold a maximum number of tokens, which we will call its buffer size and denote by $B_i \in \bar{\mathbb{R}}_{>0}$, $i \in \{1, \dots, u\}$.

Usually, when dealing with Petri nets, a transition can only be fired if there are enough tokens at the input nodes and space in the output nodes for the new tokens created. However, this does not represent every scenario desired. For instance, consider a transition given by

$$Q_1 + Q_2 \xrightarrow{t_1} Q_3,$$

which represents the effect of a trigger in a G-network. Queue 1 ($Q_1$) sends a signal to queue 2 ($Q_2$) to move a customer to queue 3 ($Q_3$). In this case, even if there are no tokens at queue 2, the transition will be fired, removing a customer from queue 1 but not creating any new tokens at queue 3, since there are no tokens at queue 2 to be moved. In order to consider every possible case, the sets $E_{ki}, F_{ki} \subseteq \{1, \dots, u\}$ will be defined, for each $k \in \{1, \dots, v\}$ and $i \in \{1, \dots, u\}$, as $E_{ki} \triangleq \{$indexes of the nodes which, when empty, prevent the effects of transition $k$ from changing the state of node $i\}$, and $F_{ki} \triangleq \{$indexes of the nodes which, when full, prevent the effects of transition $k$ from changing the state of node $i\}$. In the example given above regarding the effect of a trigger, we would have

$$E_{11} = \{1\}, \quad F_{11} = \emptyset$$

$$E_{12} = \{1, 2\}, \quad F_{12} = \emptyset$$

$$E_{13} = \{1, 2\}, \quad \text{and} \quad F_{13} = \{3\}.$$

Clearly, in order to keep the number of tokens non-negative, we assume that $i \in E_{ki}$ if $i$ is an input for transition $k$, that is, $\mathrm{Pre}_{ki} > 0$. Also, if $i$ is an output node for transition $t_k$, i.e. $\mathrm{Post}_{ki} > 0$, we assume that $i \in F_{ki}$, so that the number of tokens of a place does not exceed its maximum buffer size.

This way, we may write the marking of the system at any time $t \ge 0$ as:

$$X(t) = X(0) + \int_0^t A^T(X(s-))\,dT(s), \tag{1}$$

where

$$A_{ki}(\xi) \triangleq A_{ki}\, I\{\xi_j \ge \mathrm{Pre}_{kj}, \text{ for } j \in E_{ki}, \text{ and } \xi_j \le B_j - \mathrm{Post}_{kj}, \text{ for } j \in F_{ki}\},$$

for each $\xi \in \mathbb{R}^u$, where $I\{C\}$ denotes the indicator function of a subset $C$.
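To make the role of the sets $E_{ki}$ and $F_{ki}$ concrete, the sketch below evaluates one row of the state-dependent matrix $A(\xi)$ for the trigger example $Q_1 + Q_2 \to Q_3$. The function name, the 0-based indexing, and the buffer size 5 for queue 3 are assumptions chosen only for illustration.

```python
import math

def state_dependent_row(a_row, pre_row, post_row, e_sets, f_sets, xi, buffers):
    """Row k of A(xi): entry i equals A_ki when every node in E_ki holds enough
    tokens and every node in F_ki has enough free space, and 0 otherwise."""
    out = []
    for i, a_ki in enumerate(a_row):
        ok_empty = all(xi[j] >= pre_row[j] for j in e_sets[i])
        ok_full = all(xi[j] <= buffers[j] - post_row[j] for j in f_sets[i])
        out.append(a_ki if (ok_empty and ok_full) else 0)
    return out

# Trigger Q1 + Q2 -> Q3 (0-based node indexes; buffer of Q3 assumed to be 5).
pre_row, post_row = [1, 1, 0], [0, 0, 1]
a_row = [post_row[i] - pre_row[i] for i in range(3)]   # [-1, -1, 1]
e_sets = [{0}, {0, 1}, {0, 1}]       # E_11, E_12, E_13
f_sets = [set(), set(), {2}]         # F_11, F_12, F_13
buffers = [math.inf, math.inf, 5]

normal = state_dependent_row(a_row, pre_row, post_row, e_sets, f_sets, [2, 3, 1], buffers)
q2_empty = state_dependent_row(a_row, pre_row, post_row, e_sets, f_sets, [2, 0, 1], buffers)
q3_full = state_dependent_row(a_row, pre_row, post_row, e_sets, f_sets, [2, 3, 5], buffers)
```

With queue 2 empty, only queue 1 loses a customer, which is the behavior described for the trigger; with queue 3 full, the customer taken from queue 2 is not added to queue 3.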

The above definition of $A(\cdot)$ leads us to a technical difficulty in the representation of the associated “reflecting process” that appears in the limit of the heavy traffic approximation (a detailed account of this problem is presented in [25, Section 7.1.3]). A common approach to avoid this technical difficulty, which we adopt, is to permit “backlogging” in the system. In the setting presented here, backlogging consists of a simplification of the definition of $A(\cdot)$ to:

$$A_{ki}(\xi) \triangleq A_{ki}\, I\{\xi_j > 0, \text{ for } j \in E_{ki}, \text{ and } \xi_j < B_j, \text{ for } j \in F_{ki}\}.$$

In practice, this means that we are allowing transitions to fire even when there are not enough tokens in the input nodes or not enough buffer space at the output nodes. However, since we will adopt a scaling on $X$ (which will be presented in the next section), the fact that the number of tokens may temporarily be negative or exceed the buffer size does not create great discrepancies between the model and the associated physical problem. In fact, as the scaling parameter $n$ (defined in the next section) increases, this effect will be less significant and it will not show up in the limit. Backlogging has been used before in heavy traffic models of manufacturing systems, see for example [25, Section 7.1]. Also, it is important to notice that systems that have all arc weights equal to 1 are unaltered by backlogging.

Let $(\Omega, P, \mathcal{F}, \mathcal{F}_t)$ be a common stochastic basis for the stochastic processes to be defined, where we assume that the filtration $\mathcal{F}_t$ satisfies the so-called “usual assumptions” (that is, it is right continuous and contains all $P$-null sets of $\mathcal{F}$). We will suppose that each component of $T$ is a stochastic process with cadlag sample paths defined by

$$T_k(t) \triangleq N_k\left(\int_0^t \Lambda_k(X(s))\,ds\right), \tag{2}$$

where $N_k$ is a standard (rate 1) Poisson process, and $\Lambda_k : \mathbb{R}^u \to \mathbb{R}_{\ge 0}$ are measurable, bounded and such that $\int_0^\infty \Lambda_k(X(s))\,ds = \infty$ a.s., for each $k \in \{1, \dots, v\}$. Notice that the latter condition is equivalent to requiring that $T_k(\infty) = \infty$ a.s. (see, for instance, Lemma 17 of [5, p. 41]). In addition, we suppose that the random elements $\{X(0), N_k;\ k \in \{1, \dots, v\}\}$ are mutually independent.
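A sample path of such a network can be generated with a standard Gillespie-type scheme, in which the next firing is drawn among competing exponential clocks with intensities $\Lambda_k(X(t))$. The sketch below is an assumption-laden illustration rather than the procedure used in the paper: the function names, the tandem-queue example, and the constant unit rates are all hypothetical choices.

```python
import random

def simulate(x0, a_of, rates, t_end, seed=0):
    """Simulate the marking process up to time t_end.
    a_of(k, x) returns the state-dependent row A_k(x); rates(x) returns the
    intensities (Lambda_1(x), ..., Lambda_v(x))."""
    rng = random.Random(seed)
    x, t, path = list(x0), 0.0, [(0.0, tuple(x0))]
    while True:
        lam = rates(x)
        total = sum(lam)
        if total <= 0.0:
            break
        t += rng.expovariate(total)            # time until the next firing
        if t > t_end:
            break
        k = rng.choices(range(len(lam)), weights=lam)[0]  # transition that fires
        row = a_of(k, x)                       # effect may be zero when blocked
        x = [xi + d for xi, d in zip(x, row)]
        path.append((t, tuple(x)))
    return path

# Tandem queues: A (arrival to Q1), D (departure from Q2), T (Q1 -> Q2).
def tandem_rows(k, x):
    if k == 0:
        return [1, 0]
    if k == 1:
        return [0, -1] if x[1] > 0 else [0, 0]
    return [-1, 1] if x[0] > 0 else [0, 0]

def unit_rates(x):
    return (1.0, 1.0, 1.0)   # hypothetical constant intensities

path = simulate([0, 0], tandem_rows, unit_rates, t_end=50.0, seed=1)
```

A blocked transition still "fires" here but has no effect on the state, mirroring the indicator in the definition of $A(\xi)$; since all arc weights equal 1, backlogging does not alter this example.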

Using Theorem 8 in [5, p. 27] together with the fact that $\Lambda_k(X(t))$ is the $\mathcal{F}_t$-intensity of $T_k$ (this can be seen using a straightforward adaptation of the ideas of Theorem 16 in [5, p. 41]), we have the following decomposition of $T_k$ into an $\mathcal{F}_t$-martingale and an $\mathcal{F}_t$-predictable process:

$$T_k(t) = M_k(t) + \int_0^t \Lambda_k(X(s))\,ds.$$

Therefore, we can write $X$ in vector notation as

$$X(t) = X(0) + \int_0^t A^T(X(s-))\,dM(s) + \int_0^t A^T(X(s))\Lambda(X(s))\,ds,$$

where $M \triangleq (M_1, \dots, M_v)^T$, and $\Lambda(\cdot) \triangleq (\Lambda_1(\cdot), \dots, \Lambda_v(\cdot))^T$.

In order to have a more convenient notation for the next section, define the set:

$$E(\xi) \triangleq \{i \in \{1, \dots, 2u\} \mid \xi_i \le 0, \text{ for } i \le u, \text{ or } \xi_{i-u} \ge B_{i-u}, \text{ for } i > u\},$$

for each $\xi \in \mathbb{R}^u$. Notice that this set contains the node indexes $i \in \{1, \dots, u\}$ for which $\xi_i$ is zero or negative and the indexes $j \in \{u+1, \dots, 2u\}$ for which $\xi_{j-u}$ is greater than or equal to the maximum buffer size $B_{j-u}$.

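Now, for each subset $S \subseteq \{1, \dots, 2u\}$, define the following:

$$R_{ik}(S) = \begin{cases} A^T_{ik} & \text{if } S \cap (E_{ki} \cup (u + F_{ki})) \ne \emptyset \\ 0 & \text{otherwise,} \end{cases} \tag{3}$$

for each $k \in \{1, \dots, v\}$ and $i \in \{1, \dots, u\}$. Then observe that

$$A^T_{ik}(\xi) = A^T_{ik} - R_{ik}(E(\xi)),$$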

and, hence, we may write

$$X(t) = X(0) + \int_0^t A^T(X(s-))\,dM(s) + A^T\int_0^t \Lambda(X(s))\,ds - \int_0^t R(E(X(s)))\Lambda(X(s))\,ds. \tag{4}$$

This way, $X$ was decomposed into a martingale term, which will give rise to the Wiener process in the limit, a term that determines the drift, and a last term which is associated with the reflection process, which keeps $X$ from leaving the state space.
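The identity $A^T_{ik}(\xi) = A^T_{ik} - R_{ik}(E(\xi))$ behind this decomposition can be checked by brute force on a small example. The sketch below does so for the trigger $Q_1 + Q_2 \to Q_3$ under the backlogged definition of $A(\xi)$; the function names and the buffer size 5 are assumptions for illustration, and indexes are 1-based to match the text.

```python
import itertools
import math

def active_set(xi, buffers):
    """E(xi): indexes i <= u with xi_i <= 0, and u + i with xi_i >= B_i."""
    u = len(xi)
    low = {i + 1 for i in range(u) if xi[i] <= 0}
    high = {u + i + 1 for i in range(u) if xi[i] >= buffers[i]}
    return low | high

def r_entry(a_ki, e_ki, f_ki, s, u):
    """R_ik(S) = A_ki if S meets E_ki or u + F_ki, and 0 otherwise."""
    blocking = set(e_ki) | {u + j for j in f_ki}
    return a_ki if (s & blocking) else 0

def a_backlogged(a_ki, e_ki, f_ki, xi, buffers):
    """A_ki(xi) = A_ki * 1{xi_j > 0 on E_ki and xi_j < B_j on F_ki}."""
    ok = all(xi[j - 1] > 0 for j in e_ki) and all(xi[j - 1] < buffers[j - 1] for j in f_ki)
    return a_ki if ok else 0

def check_identity():
    a_row = [-1, -1, 1]
    e_sets = [{1}, {1, 2}, {1, 2}]
    f_sets = [set(), set(), {3}]
    buffers = [math.inf, math.inf, 5]   # assumed buffer size for Q3
    u = 3
    for xi in itertools.product(range(-1, 7), repeat=u):
        s = active_set(xi, buffers)
        for i in range(u):
            lhs = a_backlogged(a_row[i], e_sets[i], f_sets[i], xi, buffers)
            rhs = a_row[i] - r_entry(a_row[i], e_sets[i], f_sets[i], s, u)
            if lhs != rhs:
                return False
    return True
```

The grid deliberately includes negative and over-full states, since with backlogging the process may temporarily leave the state space.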

4.1. Heavy traffic limit

As is common in heavy traffic approximations, one assumes the existence of a sequence $\{X^n, n \in \mathbb{N}\}$, where for each $n$, $X^n$ is defined as in (1) and the matrix $A$ is independent of the parameter $n$. We assume that this sequence approaches heavy traffic in the sense that the difference between the rate of tokens entering and leaving a node gets smaller as $n$ increases. This is made more precise in Assumption 3. We also assume that the buffer sizes $B^n_i$ are given by $B^n_i \triangleq \sqrt{n} B_i$, where $B_i \in \bar{\mathbb{R}}_{>0}$ is a fixed value independent of $n$. This is a common assumption in heavy traffic approximations.

Now we define the scaled process $x^n$ as

$$x^n(t) \triangleq X^n(nt)/\sqrt{n}.$$
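The scaling speeds up time by a factor $n$ and shrinks space by $\sqrt{n}$. As a small illustration, the hypothetical helpers below apply it to a recorded path, given as a list of (time, state) pairs, and compute the unscaled buffer sizes $B^n_i = \sqrt{n} B_i$; the names and the data layout are assumptions.

```python
import math

def scale_path(path, n):
    """Map a path [(s, X(s))] of X^n to [(t, x^n(t))] with t = s / n and
    x^n(t) = X^n(n t) / sqrt(n)."""
    root = math.sqrt(n)
    return [(s / n, tuple(x / root for x in state)) for s, state in path]

def unscaled_buffers(b, n):
    """Buffer sizes B^n_i = sqrt(n) * B_i (infinite buffers stay infinite)."""
    return [math.sqrt(n) * bi for bi in b]

scaled = scale_path([(0.0, (4, 8)), (100.0, (6, 2))], n=4)
big_buffers = unscaled_buffers([2.0, math.inf], n=9)
```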

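In view of (4) and using a change of variables in the integral terms, we can write:

$$x^n(t) = x^n(0) + \int_0^t A^T(x^n(s-))\,dm^n(s) + \sqrt{n}\,A^T\int_0^t \lambda^n(x^n(s))\,ds - \sqrt{n}\int_0^t R(E(x^n(s)))\lambda^n(x^n(s))\,ds \tag{5}$$

where $m^n(t) \triangleq M^n(nt)/\sqrt{n}$, $\lambda^n_k(\xi) \triangleq \Lambda^n_k(\sqrt{n}\,\xi)$, $x^n(0) \triangleq X^n(0)/\sqrt{n}$, for $\xi \in \mathbb{R}^u_{\ge 0}$ and $t \in \mathbb{R}_{\ge 0}$. Also, $A(\cdot)$ and $E(\cdot)$ were redefined to have $B_i$ in place of $B^n_i$ in their respective definitions. That is, $E(\xi) \triangleq \{i \in \{1, \dots, 2u\} \mid \xi_i \le 0, \text{ if } i \le u, \text{ or } \xi_{i-u} \ge B_{i-u}, \text{ if } i > u\}$, and $A_{ki}(\xi) \triangleq A_{ki}\, I\{\xi_j > 0, \text{ for } j \in E_{ki}, \text{ and } \xi_j < B_j, \text{ for } j \in F_{ki}\}$.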

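In the assumptions that follow, let $\gamma(S) \triangleq -R(S)r$, for each set $S \subseteq \{1, \dots, 2u\}$, where the vector $r \in \mathbb{R}^v_{\ge 0}$ will be defined below in Assumption 3(a). In addition, define $d(\xi) \triangleq \{\gamma(S) \mid S \in \mathcal{P}(E(\xi))\}$, for each $\xi \in \mathbb{R}^u$, where $E(\xi) \triangleq \{i \in \{1, \dots, 2u\} \mid \xi_i = 0, \text{ if } i \le u, \text{ or } \xi_{i-u} = B_{i-u}, \text{ if } i > u\}$ (for $\xi \in G$ this coincides with the set $E(\xi)$ defined above). Let $G$ be the state space defined in Section 2 and $\partial G$ its boundary. Notice that, for $\xi \in \partial G$, $d(\xi)$ is the set of reflection directions appearing at point $\xi$.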

Assumption 3.
(a) Let $\lambda^n(\xi) = r + f(\xi)/\sqrt{n} + o(1/\sqrt{n})$, where $r \in \mathbb{R}^v_{\ge 0}$, $f_k : \mathbb{R}^u \to \mathbb{R}$ are continuous and bounded, for $k \in \{1, \dots, v\}$, and $o(1/\sqrt{n})$ denotes a function of $\xi$ such that $\sqrt{n}\,o(1/\sqrt{n})$ converges uniformly (in $\xi$) to zero.
(b) $A^T r = 0$, which is usually called the “heavy traffic condition”.
(c) For each $\xi \in \partial G$, there exists a $\nu \in \mathrm{ConvexHull}\{n_i, i \in E(\xi)\}$ such that $\nu^T d > 0$ for all $d \in d(\xi)$, where $n_i$ is the inward normal vector at the face $\partial_i$.

Although the state dependence given by Assumption 3(a) is small for large $n$, it has a significant effect in the limit equation. Notice that the functions $f_k$ appear in the drift term of Eq. (6). Similar hypotheses for the rates $\lambda^n$ have been used in heavy traffic approximations, as for instance in [49,31,25]. Hypothesis 3(b) is usual in the heavy traffic literature. As we have already mentioned, this assumption requires the rate of tokens entering a node to be close to the rate of tokens leaving this node. That is the reason why this assumption is usually called the “heavy traffic assumption”. The last condition in Assumption 3 (see, e.g. [25]) is used to establish the weak-sense convergence given by the next theorem. We return to the discussion of this condition in the next section.
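To see how parts (a) and (b) of Assumption 3 act on the decomposition (5), the following informal sketch substitutes the expansion of $\lambda^n$ into the two $\sqrt{n}$ terms of (5). It only uses the identity $A^T(\xi) = A^T - R(E(\xi))$ and the definition $\gamma(S) = -R(S)r$; the remainder term is treated loosely and written simply as $\sqrt{n}\,o(1/\sqrt{n})$.

```latex
\begin{align*}
&\sqrt{n}\,A^T\!\int_0^t \lambda^n(x^n(s))\,ds
  - \sqrt{n}\int_0^t R(E(x^n(s)))\,\lambda^n(x^n(s))\,ds \\
&\quad= \sqrt{n}\int_0^t A^T(x^n(s))
   \Big( r + \tfrac{1}{\sqrt{n}} f(x^n(s)) + o(1/\sqrt{n}) \Big)\,ds \\
&\quad= \underbrace{\sqrt{n}\,A^T r\,t}_{=\,0 \text{ by (b)}}
   \;-\; \sqrt{n}\int_0^t R(E(x^n(s)))\,r\,ds
   \;+\; \int_0^t A^T(x^n(s))\,f(x^n(s))\,ds
   \;+\; \sqrt{n}\,o(1/\sqrt{n}) \\
&\quad= \sqrt{n}\int_0^t \gamma(E(x^n(s)))\,ds
   \;+\; \int_0^t A^T(x^n(s))\,f(x^n(s))\,ds
   \;+\; \sqrt{n}\,o(1/\sqrt{n}).
\end{align*}
```

The first term is the reflection contribution, which is active only while $E(x^n(s))$ is non-empty, and the second is the drift contribution in which the functions $f_k$ survive in the limit.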

It is perhaps noteworthy here that we are not claiming weak-sense uniqueness for the process of the theorem below. However, weak-sense uniqueness holds for the non-degenerate diffusions of the next section, where the “extra” reflection directions at the corners and edges of the state space are dropped.

Theorem 4. Consider Assumption 3, and suppose that $x^n(0)$ converges weakly to $x(0)$. Then $\{\Psi^n\}$ is tight, where

$$\Psi^n(\cdot) \triangleq (x^n(\cdot), m^n(\cdot), z^n(\cdot)),$$

and $z^n(\cdot) \triangleq \sqrt{n}\int_0^{\cdot} \gamma(E(x^n(s)))\,ds$. Take any weakly convergent subsequence and denote its weak-sense limit by $\Psi(\cdot) \triangleq (x(\cdot), m(\cdot), z(\cdot))$. In addition, let $\mathcal{G}_t$ be the minimal $\sigma$-algebra that measures $\{x(0), x(s), m(s), z(s);\ s \le t\}$. Then for $t \ge 0$, $\Psi(\cdot)$ satisfies:

$$x(t) = x(0) + A^T\int_0^t f(x(s))\,ds + A^T m(t) + z(t) \tag{6}$$

where $x(t) \in G$, and $m$ is an $\mathbb{R}^v$-valued $\mathcal{G}_t$-Wiener process with covariance matrix given by $E[m(1)m(1)^T] = \mathrm{diag}(r)$. The process $z$ is such that

$$z(t) = \sum_{e \in F \cup E} \gamma(e)\zeta_e(t),$$

where the $\zeta_e$ are continuous, non-decreasing, satisfy $\zeta_e(0) = 0$, and increase only when $x(t) \in \bigcap_{i \in e} \partial_i$.


Proof. Notice that, using the decomposition (5) with Assumption 3(a) and (b), we can write $x^n$ as

$$x^n(t) = x^n(0) + \int_0^t A^T(x^n(s-))\,dm^n(s) + \int_0^t A^T(x^n(s)) f(x^n(s))\,ds + z^n(t) + \sqrt{n}\,o(1/\sqrt{n}), \tag{7}$$

where $z^n(t) \triangleq -\sqrt{n}\int_0^t R(E(x^n(s)))\,r\,ds = \sqrt{n}\int_0^t \gamma(E(x^n(s)))\,ds$. We are going to use Theorem 3.6.1 in [25, p. 130] to show that $\{z^n\}$ is tight. For that, we need to show that the process $\psi^n(\cdot) \triangleq \int_0^{\cdot} A^T(x^n(s-))\,dm^n(s) + \int_0^{\cdot} A^T(x^n(s)) f(x^n(s))\,ds$ is asymptotically continuous in the sense that for each $\epsilon > 0$ and $T > 0$:

$$\lim_{\delta \to 0} \limsup_n P\Big\{ \sup_{t \le T} \sup_{s \le \delta} |\psi^n(t + s) - \psi^n(t)| \ge \epsilon \Big\} = 0.$$

The above condition can be verified directly for $\int_0^{\cdot} A^T(x^n(s)) f(x^n(s))\,ds$ using the fact that $f_k$ is bounded. For the martingale term, one can use the criterion of Theorem 2.7(b) of [24, p. 10] together with Lemma 12 (in the Appendix) to show tightness, and since the jump sizes of $m^n$ are of the order of $1/\sqrt{n}$, it is also asymptotically continuous.

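Now, observe that the process $z^n$ can be rewritten as:

$$z^n(t) = \sum_{l \in F} \sqrt{n}\int_0^t \gamma(E(x^n(s)))\, I\{x^n(s) \in \bar{\partial}_l, \text{ and } x^n(s) \notin \bar{\partial}_j \text{ for } j \ne l\}\,ds + \sum_{e \in E} \sqrt{n}\int_0^t \gamma(E(x^n(s)))\, I\{x^n(s) \in \textstyle\bigcap_{i \in e} \bar{\partial}_i, \text{ and } x^n(s) \notin \bar{\partial}_j \text{ for } j \notin e\}\,ds.$$

Since $\gamma(E(x^n(s)))$ is constant inside these integrals, we can write

$$\sqrt{n}\int_0^t \gamma(E(x^n(s)))\,ds = \sum_{l \in F} \gamma(l)\zeta^n_l(t) + \sum_{e \in E} \gamma(e)\zeta^n_e(t), \tag{8}$$

where $\zeta^n_S(t) \triangleq \sqrt{n}\int_0^t I\{x^n(s) \in \bigcap_{i \in S} \bar{\partial}_i, \text{ and } x^n(s) \notin \bar{\partial}_j \text{ for } j \notin S\}\,ds$, for any set $S \in F \cup E$. Therefore, we have reflection directions $\gamma(e)$ at the corners and edges of the state space that need not be positive linear combinations of the reflection directions at the adjacent faces. For each $e \in E$, let us define the following half-space: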

$$G_e \triangleq \Big\{ \xi \in \mathbb{R}^u \;\Big|\; \nu_e^T \xi \ge -\sum_{i \in e,\, i > u} \alpha_i B_{i-u} \Big\},$$

where the vector $\nu_e \triangleq \sum_{i \in e} \alpha_i n_i$ is a convex combination of the $n_i$, $i \in e$, such that $\nu_e^T \gamma(e) > 0$, which exists by Assumption 3(c). Notice that $G \subset G_e$. In fact, let $\xi \in G$; then $\xi_i \ge 0$, for $i \in \{1, \dots, u\}$, and $\xi_{j-u} \le B_{j-u}$, for $j \in \{u+1, \dots, 2u\}$ for which $B_{j-u} < \infty$. This way:

$$\nu_e^T \xi = \sum_{i \in e} \alpha_i n_i^T \xi = \sum_{i \in e,\, i \le u} \alpha_i \xi_i - \sum_{i \in e,\, i > u} \alpha_i \xi_{i-u} \ge -\sum_{i \in e,\, i > u} \alpha_i B_{i-u}$$

and so $\xi \in G_e$. Hence, $G$ can be written as $\bigcap_{i \in F} G_i \cap \bigcap_{e \in E} G_e$, where now we can interpret the vectors $\gamma(e)$ as being reflection directions on the face of $G_e$. Notice that we can verify condition (A.5.1) of [25, p. 118] by the description of $G$ and the choice of $\nu_e$ as the normal vector of $G_e$, since $\nu_e^T \gamma(e) > 0$. Also, condition (A.5.3) in [25, p. 121] is verified by Assumption 3(c). In addition, notice that (6.2a) and (6.2b) of [25, p. 130] hold; the former can be seen by how $z^n(t)$ is written in Eq. (8). Hence, Theorem 3.6.1 in [25, p. 130] can be applied to show that $\{z^n, \zeta^n_S;\ S \in F \cup E\}$ is tight and that the limit of any weakly convergent subsequence is continuous with probability one. We can verify that

$$\Big| \int_0^t R(E(x^n(s))) f(x^n(s))\,ds \Big| \le K \sum_{e \in F \cup E} \zeta^n_e(t)/\sqrt{n},$$

where $K$ is a positive constant, using arguments similar to those used to arrive at (8) and the fact that $f$ is bounded. Hence, by the tightness of $\zeta^n_e$, the process taking values $\int_0^t R(E(x^n(s))) f(x^n(s))\,ds$ converges weakly to the “zero” process. Also, by continuity of $f$, we have $\int_0^t A^T f(\phi^n(s))\,ds \Rightarrow \int_0^t A^T f(\phi(s))\,ds$ whenever $\phi^n \Rightarrow \phi$. In addition, with the help of Lemma 12, we have that $\int_0^t R_{ik}(E(x^n(s)))\,dm^n_k(s)$ converges to the “zero” process by the tightness of $\zeta^n_e$. The Doob–Meyer process associated with the martingale $m^n_k$ is given by $\langle m^n_k \rangle(t) = \int_0^t \lambda^n_k(x^n(s))\,ds$ using Lemma 11 (in the Appendix). Also, $\langle m^n_k, m^n_j \rangle = 0$, for $k \ne j$, since the jump times of $m^n_k$ and $m^n_j$ are distinct with probability one. Therefore, using Theorem 2.8.2 in [25, p. 80] and Assumption 3(a), one verifies that $m^n$ converges to the stated limit.

Now take any weakly convergent subsequence of $(x^n, m^n, z^n)$ and denote its limit by $(x, m, z)$. By the properties of $\zeta^n_S$ and its asymptotic continuity, the corresponding weak-sense limit $\zeta_S$ is continuous, non-decreasing, satisfies $\zeta_S(0) = 0$, and increases only at $t$ such that $x(t) \in \bigcap_{i \in S} \partial_i$.

4.1.1. Non-degenerate diffusions

The theorem below simplifies considerably the representation of the reflection process $z$ for the cases where the diffusion is non-degenerate. It shows that the reflection directions that only appear at the corners or edges of the state space do not contribute to the dynamics of the system and can be safely dropped. This implies in particular that there is weak-sense uniqueness for the process satisfying Eq. (9).

In order to see this, first notice that Assumption 3(c) is equivalent to condition (S.b) used in [10]. Since the limit diffusion of Theorem 5 has no reflection directions at the corners or edges of the state space, the “extended” representation of $G$ (used in the proof of Theorem 4 with the half-spaces $G_e$ defined on the corners and edges of the state space) is not needed. This implies that $G$ is simple and both (S.a) and (S.b) conditions of [10, p. 2] hold by Proposition 1.1 of [10, p. 3]. This way, the weak-sense uniqueness result of [10, p. 4] is valid for the process given by Eq. (9). Notice that the same Girsanov transformation argument used in [10, p. 28] and Lemma 6.1 of [46, p. 305] can be used to consider the case where the drift term is not constant but bounded.

Theorem 5. Consider Theorem 4 and suppose in addition that the matrix $\Sigma \triangleq A^T \mathrm{diag}(r) A$ is positive definite. Then the weak-sense limit of $\Psi^n$ (defined as in Theorem 4) satisfies

$$x(t) = x(0) + A^T \int_0^t f(x(s))\,ds + A^T m(t) + \sum_{i\in F} \gamma_i \zeta_i(t) \qquad (9)$$

where $m$ is as defined in Theorem 4, and the $\zeta_i$ are continuous and non-decreasing, satisfy $\zeta_i(0) = 0$, and increase only when $x(t) \in \partial_i$.


Proof. We begin by using the “extended” representation of $G$ as $\bigcap_{i\in F} G_i \cap \bigcap_{e\in E} G_e$ constructed in the proof of Theorem 4. Notice that although it is assumed in [10] that the state space is formed with minimal description, the same argument used in the proof of Theorem 4.4 in [10] can be repeated for state spaces with non-minimal description whenever $u > 1$, as it was done in [3, Section 7.4, Theorem 7.7]. Moreover, notice that condition (S.b) of [10, p. 2] is equivalent to Assumption 3(c) and it is the only condition on the reflection directions used in the proof of this theorem. Therefore, from Theorem 4.4 in [10], it follows that:

$$\zeta_S(t) = \int_0^t I\{x(s) \in \cap_{i\in S}\,\partial_i\}\,d\zeta_S(s) = 0 \quad \text{a.s.}, \qquad (10)$$

for $t > 0$ and each $S \in E$. This way, the reflection term can be simplified by writing $z(t) = \sum_{i\in F} \gamma(i)\zeta_i(t)$.

4.1.2. Weakening the conditions on the function f

In this section, we list a set of results relating to the relaxation of the hypotheses on the function $f(\cdot)$ of Assumption 3. Most of them are adaptations of standard results. The first result concerns the boundedness of $f_k$, which can be replaced by boundedness when $f_k$ is restricted to $G$, as given by the theorem below.

Theorem 6. Assume the conditions of Theorem 5. Replace the conditions on the boundedness of $f_k$, $k \in \{1, \dots, v\}$, by supposing that it is bounded only when restricted to $G$. Then the conclusions of Theorem 5 are still valid.

Proof. By the definition of $x^n$ (see the discussion about backlogging in Section 4), if $x^n(t) \notin G$ then $\mathrm{dist}(x^n(t), G) \le C/\sqrt{n}$, where $C = \max_{ij} |A_{ij}| - 1$ and $\mathrm{dist}(\xi, G) \triangleq \inf\{\max_i |\xi_i - v_i| : v \in G\}$. Let $G^n \triangleq \{\xi \in \mathbb{R}^u \mid \mathrm{dist}(\xi, G) \le C/\sqrt{n}\}$, and define $S_N \triangleq \{\xi \in \mathbb{R}^u \mid |\xi| \le N\}$, for a given $N > 0$. Let $\bar{f}_k$ be the restriction of $f_k$ to the compact set $G^n \cap S_N$, for $k \in \{1, \dots, v\}$, and let $f^N_k$ be the continuous and bounded extension of $\bar{f}_k$ to $\mathbb{R}^u$ (see Tietze’s extension theorem [6, p. 144]). Let $x^{n,N}$ be the process given by Eq. (7) with $f^N_k$ in lieu of $f_k$. By the assumptions of Theorem 5, $\{x^{n,N}\}$ is tight and any weak-sense limit $x^N$ satisfies Eq. (9) with $f^N_k$ in place of $f_k$. Now the arguments of [25, p. 72] can be used to arrive at the result.

The boundedness of $f_k$ can also be replaced by supposing that $f_k$ has at most linear growth, that is: $|f_k(\xi)| \le L(1 + |\xi|)$, for each $k$ and all $\xi \in G$, where $L$ is some positive constant. However, this comes with the addition of the “Set B” assumption (i.e., Assumption 2.1 in [13]) since we will need the Lipschitz property given by Theorem 2.2 of [13]. It is important to mention that, under this “Set B” assumption and having $f_k$ Lipschitz continuous, there is strong-sense uniqueness of the solution to Eq. (9) (see, for instance, Theorem 3.5.2 of [25, p. 124]).

Theorem 7. Assume the conditions of Theorem 4. Replace the conditions on the boundedness of $f_k$, $k \in \{1, \dots, v\}$, by supposing that it has at most linear growth and assume in addition the “Set B” condition on the reflection directions. Then the conclusions of Theorem 4 are still valid.

Proof. The proof is carried out using the truncation technique together with the fact that $\lim_{K\to\infty} P\{\sup_{s\le t} |x(s)| \ge K\} = 0$, which can be verified with the aid of Theorem 2.2 in [13] and the fact that $A^T f(\cdot)$ has at most linear growth.

It is often necessary to allow the functions $f_k$ to be discontinuous. In fact, in control problems such as [27,29,30] (see also Section 6), one often finds the optimal control as a switching curve, that is, the control is applied at a maximum rate after a given threshold. The theorem below extends the results of the previous theorems to include this case.

Theorem 8. Assume the conditions of Theorem 4. Replace the continuity of the functions $f_k$, $k \in \{1, \dots, v\}$, by measurability and suppose that the functions $\phi(\cdot) \mapsto \int_0^{\cdot} f_k(\phi(s))\,ds$ are continuous on $D(\mathbb{R}^u; 0, \infty)$ with probability one with respect to the measure induced by any weak-sense limit $x$. Then the assertions of Theorem 4 are still valid.

Proof. This is verified with the aid of Theorem 2.7 in [4, p. 21]. See also Section 8.4 of [25] for a more detailed discussion and conditions that are easier to verify.

5. Stochastic Petri nets with probabilistic transitions (SPN PT)

In order to consider a more general class of networks that can fully represent the “routing” present in queueing systems, we introduce the stochastic Petri net where the effect of a transition on the network is random. For example, we want to consider transitions such that:

$$p_1 + 3p_2 \xrightarrow{t_1} 2p_3 \quad \text{(with probability } q_1\text{)}$$

$$2p_1 + 2p_2 \xrightarrow{t_1} 2p_3 \quad \text{(with probability } q_2\text{)}$$

$$2p_1 + 3p_2 \xrightarrow{t_1} 2p_3 + p_4 \quad \text{(with probability } q_3\text{)},$$

where $t_1$ is a transition, $p_i$, for $i \in \{1, 2, 3, 4\}$, are nodes, and $q_j$, for $j \in \{1, 2, 3\}$, are the probabilities (which may be state-dependent) of the different effects for transition $t_1$. We will call each of these possible effects of a transition on the network a possible outcome of the transition.

Let us suppose that, for each $k \in \{1, \dots, v\}$, there are $c_k \in \mathbb{N}$ possible outcomes for transition $t_k$. Let $\mathrm{Pre}_k \in \mathbb{N}^{c_k \times u}$ be the matrix constructed with the weights on the arcs linking the places with the transition $t_k$, for every possible outcome. That is, for $(j, i) \in \{1, \dots, c_k\} \times \{1, \dots, u\}$, $(\mathrm{Pre}_k)_{ji}$ is the number of tokens removed from place $p_i$ when transition $t_k$ is fired and the random outcome is $j$. Similarly, define $\mathrm{Post}_k \in \mathbb{N}^{c_k \times u}$ as the matrix constructed with the weights on the arcs linking the transition $t_k$ with the places for every possible outcome. Again, for $(j, i) \in \{1, \dots, c_k\} \times \{1, \dots, u\}$, $(\mathrm{Post}_k)_{ji}$ is the number of tokens received by place $p_i$ when the transition $t_k$ is fired and the random outcome is $j$. Then define $A_k \triangleq \mathrm{Post}_k - \mathrm{Pre}_k$.

In addition, let the sets $E_{kji}, F_{kji} \subseteq \{1, \dots, u\}$ be defined as follows: $E_{kji} \triangleq$ {indexes of the nodes which, when empty, prevent the effect of transition $k$ under outcome $j$ from changing the state of node $i$}; $F_{kji} \triangleq$ {indexes of the nodes which, when full, prevent the effect of transition $k$ under outcome $j$ from changing the state of node $i$}.
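As a worked illustration of these definitions, take the three outcomes of transition $t_1$ listed at the start of this section, with $u = 4$ places and $c_1 = 3$ outcomes. Reading the arc weights directly from those three reactions (the matrices below are derived here for illustration and are not stated in this form in the text), one obtains:

$$\mathrm{Pre}_1 = \begin{pmatrix} 1 & 3 & 0 & 0 \\ 2 & 2 & 0 & 0 \\ 2 & 3 & 0 & 0 \end{pmatrix}, \qquad \mathrm{Post}_1 = \begin{pmatrix} 0 & 0 & 2 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 2 & 1 \end{pmatrix}, \qquad A_1 = \mathrm{Post}_1 - \mathrm{Pre}_1 = \begin{pmatrix} -1 & -3 & 2 & 0 \\ -2 & -2 & 2 & 0 \\ -2 & -3 & 2 & 1 \end{pmatrix}.$$

Row $j$ of $A_1$ is the net change in the token vector $(p_1, p_2, p_3, p_4)$ when $t_1$ fires under outcome $j$.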

Again, we assume a common stochastic basis $(\Omega, P, \mathcal{F}, \mathcal{F}_t)$, satisfying the usual assumptions, where all processes given below are defined. This time, $T_k$ is defined as an $\mathbb{N}^{c_k}$-valued random variable where each component $T_{kj}(t)$, $j \in \{1, \dots, c_k\}$, counts the number of times that transition $t_k$ has been fired under the outcome $j$ by time $t \in \mathbb{R}_{\ge 0}$. Formally, we define $T_k$ as

$$T_k(t) \triangleq \int_0^t I_k(s)\,dN_k(s), \quad \text{where } N_k(t) \triangleq \mathcal{N}_k\left(\int_0^t \Lambda_k(X(s))\,ds\right),$$

where $\mathcal{N}_k$ is a standard Poisson process, and $I_{kj}(t)$ denotes the indicator function of the event that transition $t_k$ is fired under the outcome $j$ at time $t$. Also, we require that $\sum_{j=1}^{c_k} I_{kj}(t) \le 1$, for each $t \ge 0$ and $k$, and that $T_{kj}(\infty) = \infty$ a.s., for each $k$ and $j$. Analogously to what was done previously, define $\mathcal{A}_k(\xi)$, for each $k \in \{1, \dots, v\}$ and $\xi \in \mathbb{R}^u$, as:

$$(\mathcal{A}_k)_{ji}(\xi) \triangleq (A_k)_{ji}\, I\{\xi_l > 0 \text{ for } l \in E_{kji} \text{ and } \xi_l < B_l \text{ for } l \in F_{kji}\}.$$

Hence, the system can be written as:

$$X(t) = X(0) + \sum_{k=1}^{v} \int_0^t \mathcal{A}_k^T(X(s-))\,dT_k(s). \qquad (11)$$

We suppose that the functions $\Lambda_k : \mathbb{R}^u \to \mathbb{R}_{\ge 0}$ are measurable and bounded, the random elements $\{X(0), \mathcal{N}_k, I_k \triangleq (I_{k1}, \dots, I_{kc_k})^T;\ k \in \{1, \dots, v\}\}$ are mutually independent, and that there are measurable functions $Q_{kj} : \mathbb{R}^u \to [0, 1]$ such that $E[I_{kj}(t) \mid \mathcal{F}^r_t] = Q_{kj}(X(t-))$, where $\mathcal{F}^r_t$ is the minimal $\sigma$-algebra that measures all driving processes up to time $t$ except $I_k(t)$.

Under these assumptions, we can show that the stochastic intensity of $T_{kj}(t)$ is given by $Q_{kj}(X(t))\Lambda_k(X(t))$. Indeed, for any $\mathcal{F}_t$-predictable process $C$ we have:

$$E\left[\int_0^\infty C(s) I_{kj}(s)\,dN_k(s)\right] = E\left[\int_0^\infty C(s) Q_{kj}(X(s-))\,dN_k(s)\right] = E\left[\int_0^\infty C(s) Q_{kj}(X(s))\Lambda_k(X(s))\,ds\right]$$

(see, for instance, Definition 7 of [5, p. 27]). In addition, by independence of the driving Poisson processes and the assumptions on $I_k$, we have that the point processes $T_{kj}$ have no common jumps with probability one. This implies, by Theorem 6.19(b) of [23, p. 714] or Theorem 2′ of [34, p. 195], that, for each $k$ and $j$, we can write:

$$T_{kj}(t) = Y_{kj}\left(\int_0^t Q_{kj}(X(s))\Lambda_k(X(s))\,ds\right),$$

where the $Y_{kj}$ are independent standard Poisson processes. But then, each transition $k$ fired under each outcome $j$ can be seen as one of the independent transitions of the model of Section 4 (see the definition given by Eq. (2)). Indeed, let $\mathcal{A}^T(\xi) \triangleq (\mathcal{A}^T_1(\xi), \dots, \mathcal{A}^T_v(\xi))$ and let $l$ be an index for the columns of this matrix $\mathcal{A}^T(\xi)$. Then we can rewrite Eq. (11) as Eq. (1), where each $T_l$ (of Section 4) is a point process with stochastic intensity given by $\Lambda_l(X(t)) \triangleq Q_{kj}(X(t))\Lambda_k(X(t))$, for some $k$ and $j$ corresponding to the index $l$.
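The equivalence above (a transition firing at rate $\Lambda_k$ followed by a random choice of outcome $j$ with probability $Q_{kj}$ behaves like independent transitions with rates $Q_{kj}\Lambda_k$) can be illustrated with a small event-driven simulation. The following is a minimal sketch under simplifying assumptions, not the construction used in the paper: the function name `simulate_spn_pt`, the constant rate, and the outcome probabilities are hypothetical, a firing that would make a place negative is discarded as a whole (a crude stand-in for the sets $E_{kji}$), and finite buffers are not modelled.

```python
import random


def simulate_spn_pt(x0, rates, outcome_probs, effects, t_end, seed=0):
    """Simulate a stochastic Petri net with probabilistic transitions.

    rates[k](x)         -> firing rate Lambda_k(x) of transition k (>= 0)
    outcome_probs[k](x) -> probabilities Q_kj(x) of the outcomes of k
    effects[k][j]       -> token change (row j of A_k) under outcome j
    Simplifying assumption: a firing that would drive a place negative is
    discarded as a whole; finite buffers are not modelled.
    """
    rng = random.Random(seed)
    t, x = 0.0, list(x0)
    path = [(t, tuple(x))]
    while True:
        lam = [rate(x) for rate in rates]
        total = sum(lam)
        if total <= 0.0:
            break
        t += rng.expovariate(total)  # waiting time until the next firing
        if t > t_end:
            break
        # Pick the transition k in proportion to its rate, then its outcome j.
        k = rng.choices(range(len(lam)), weights=lam)[0]
        j = rng.choices(range(len(effects[k])), weights=outcome_probs[k](x))[0]
        new_x = [xi + d for xi, d in zip(x, effects[k][j])]
        if min(new_x) >= 0:
            x = new_x
            path.append((t, tuple(x)))
    return path


# Transition t1 of this section: three outcomes over places (p1, p2, p3, p4).
effects = [[(-1, -3, 2, 0), (-2, -2, 2, 0), (-2, -3, 2, 1)]]
rates = [lambda x: 1.0]                      # hypothetical constant rate
outcome_probs = [lambda x: [0.5, 0.3, 0.2]]  # hypothetical q1, q2, q3
path = simulate_spn_pt((10, 15, 0, 0), rates, outcome_probs, effects, t_end=50.0)
```

Each entry of `path` is a pair (time, token vector); because every outcome of $t_1$ adds two tokens to $p_3$, the third component grows by exactly two per accepted firing.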

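Now, in order to get the heavy traffic approximation, we assume a sequence of networks $\{X^n, n \in \mathbb{N}\}$ and define the scaled process $x^n(t) \triangleq X^n(nt)/\sqrt{n}$. Also, let $\lambda^n_k(\xi) \triangleq \Lambda^n_k(\sqrt{n}\,\xi)$ and $q^n_{kj}(\xi) \triangleq Q^n_{kj}(\sqrt{n}\,\xi)$, for $\xi \in \mathbb{R}^u$, $t \in \mathbb{R}_{\ge 0}$, and $j \in \{1, \dots, c_k\}$, $k \in \{1, \dots, v\}$. Now we add the following assumption: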

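Assumption 9. For $o(\cdot)$ uniform in $\xi \in \mathbb{R}^u$, let

(a) $\lambda^n_k(\xi) = r_k + f_k(\xi)/\sqrt{n} + o(1/\sqrt{n})$, where $r_k \in \mathbb{R}_{\ge 0}$ and $f_k : \mathbb{R}^u \to \mathbb{R}$ are continuous and bounded, for $k \in \{1, \dots, v\}$.

(b) $q^n_{kj}(\xi) = p_{kj} + g_{kj}(\xi)/\sqrt{n} + o(1/\sqrt{n})$, where $p_{kj} \in \mathbb{R}_{\ge 0}$ and $g_{kj} : \mathbb{R}^u \to \mathbb{R}$ are continuous functions, for $k \in \{1, \dots, v\}$ and $j \in \{1, \dots, c_k\}$.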

Notice that $q^n_{kj}$ can be interpreted as the probability of observing a certain outcome $j$ of transition $k$ for the scaled system. In queueing systems, this $q^n_{kj}$ is commonly interpreted as state-dependent “routing probabilities”. Under these assumptions on $\lambda^n_k$ and $q^n_{kj}$, we have that the corresponding “scaled rate” $\lambda^n_l(\xi) \triangleq \Lambda^n_l(\sqrt{n}\,\xi)$ can be written as:

$$\lambda^n_l(\xi) = q^n_{kj}(\xi)\lambda^n_k(\xi) = r_k p_{kj} + \frac{r_k g_{kj}(\xi) + p_{kj} f_k(\xi)}{\sqrt{n}} + o(1/\sqrt{n})$$

and satisfies Assumption 3(a), with constant $r_l \triangleq r_k p_{kj} \in \mathbb{R}_{\ge 0}$ and continuous and bounded function $f_l(\xi) \triangleq r_k g_{kj}(\xi) + p_{kj} f_k(\xi)$. Also, with $A^T \triangleq (A^T_1, \dots, A^T_v)$, we have that Assumption 3(b) can be written as $\sum_{k=1}^{v} A^T_k p_k r_k = 0$, which is the heavy traffic condition.

Similarly, let $(R_k)_{ij}$ be defined as follows for each set $S \subseteq \{1, \dots, 2u\}$:

$$(R_k)_{ij}(S) \triangleq \begin{cases} (A^T_k)_{ij} & \text{if } S \cap (E_{kji} \cup \{u + F_{kji}\}) = \emptyset \\ 0 & \text{otherwise} \end{cases}$$

for each $k \in \{1, \dots, v\}$, $j \in \{1, \dots, c_k\}$, and $i \in \{1, \dots, u\}$. Then, by setting $R(S) \triangleq (R_1(S), \dots, R_v(S))$, the vector $\gamma(S)$ used to define $d(\xi)$ in Assumption 3(c) can be written as $\gamma(S) \triangleq -\sum_{k=1}^{v} R_k(S) p_k r_k$. This way, the results of Sections 4.1, 4.1.1 and 4.1.2 are applicable to the model of this section.

In particular, Theorem 5 can be stated as follows for the SPN PT:

Theorem 10. Consider Assumption 9, the heavy traffic condition $\sum_{k=1}^{v} A^T_k p_k r_k = 0$, and Assumption 3(c) with the vector $\gamma(S)$ defined as $\gamma(S) \triangleq -\sum_{k=1}^{v} R_k(S) p_k r_k$. Assume in addition that $x^n(0)$ converges weakly to $x(0)$ and that the matrix $\Sigma \triangleq \sum_{k=1}^{v} r_k A^T_k \mathrm{diag}(p_k) A_k$ is positive definite. Then the weak-sense limit of $\Psi^n$ (defined as in Theorem 4) satisfies

$$x(t) = x(0) + w(t) + \sum_{k=1}^{v} A^T_k \int_0^t \left[g_k(x(s)) r_k + p_k f_k(x(s))\right] ds + z(t),$$

where $z(t) \triangleq \sum_{i\in F} \gamma_i \zeta_i(t)$, $w$ is an $\mathbb{R}^u$-valued, $\mathcal{G}_t$-Wiener process with covariance matrix given by $\Sigma$, and the $\zeta_i$ are continuous and non-decreasing, satisfy $\zeta_i(0) = 0$, and increase only when $x(t) \in \partial_i$. Here, $\mathcal{G}_t$ is the minimal $\sigma$-algebra that measures $\{x(0), x(s), w(s), z(s);\ s \le t\}$.

6. Numerical experiments

Let us now consider some practical applications that illustrate the use of the heavy traffic approximation presented here. The following examples are considered in the next sections: a predator–prey model, a congestion control system which uses signals to move customers, a load balancing system for parallel processing stations, and a production line with quality assurance. In order to give a general idea of the steps required to use the approximation, some details are given in the derivation of the first example. The presentation is more concise for the remaining examples.

6.1. The Lotka–Volterra predator–prey model

We consider the predator–prey model used in [48] where there are two places $p_1$ and $p_2$ and three transitions $t_1$, $t_2$, and $t_3$ which are given by:

$$p_1 \xrightarrow{t_1} 2p_1, \qquad p_1 + p_2 \xrightarrow{t_2} 2p_2, \qquad p_2 \xrightarrow{t_3} \emptyset.$$

Two additional transitions are added so that the system will not stop if there is a shortage of tokens at $p_1$ or at $p_2$. They are given by:

$$\emptyset \xrightarrow{t_4} p_1, \qquad \emptyset \xrightarrow{t_5} p_2.$$

The corresponding matrix $A^T$ for this system is:

$$A^T = \begin{pmatrix} 1 & -1 & 0 & 1 & 0 \\ 0 & 1 & -1 & 0 & 1 \end{pmatrix},$$

where the columns correspond, in order, to the transitions $t_1, t_2, t_3, t_4, t_5$. Suppose that the transition rates $\Lambda_i(\cdot)$, $i \in \{1, \dots, 5\}$, can be written as the $\lambda^n_k(\cdot)$ given by Assumption 3(a) and that they satisfy Assumption 3(b), for some large value of $n \in \mathbb{N}$. That is, the difference between birth and death rates of prey, or predators, is nearly zero. Then, the number of tokens at each node at time $t$ can be approximated by $x(nt)/\sqrt{n}$, where $x$ is the heavy traffic limit given by Eq. (6), as long as Assumption 3(c) is also satisfied.

For example, suppose that

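$$\Lambda_i(\xi) = \bar{r}_i + F_i(\xi) \quad \text{for } i \in \{1, 2, 3\}, \qquad \Lambda_i(\xi) = \bar{r}_i \quad \text{for } i \in \{4, 5\},$$

where $F_1(\xi) = \bar{k}_1 \xi_1$, $F_2(\xi) = \bar{k}_2 \xi_1 \xi_2$, and $F_3(\xi) = \bar{k}_3 \xi_2$, for positive constants $\bar{k}_1$, $\bar{k}_2$, and $\bar{k}_3$. Then if, for some large $n \in \mathbb{N}$, we have that $b \triangleq \sqrt{n} A^T \bar{r} = \sqrt{n}(\bar{r}_1 - \bar{r}_2 + \bar{r}_4,\ \bar{r}_2 - \bar{r}_3 + \bar{r}_5)^T$, $k_1 \triangleq \bar{k}_1 \cdot n$, $k_2 \triangleq \bar{k}_2 \cdot n\sqrt{n}$, and $k_3 \triangleq \bar{k}_3 \cdot n$ are all of “moderate” size, then we can set:

$$\lambda^n_1(\xi) = r_1 + \frac{k_1 \xi_1}{\sqrt{n}}, \qquad \lambda^n_2(\xi) = r_2 + \frac{k_2 \xi_1 \xi_2}{\sqrt{n}}, \qquad \lambda^n_3(\xi) = r_3 + \frac{k_3 \xi_2}{\sqrt{n}},$$

$$\lambda^n_4(\xi) = r_4 + \frac{b_1}{\sqrt{n}}, \qquad \lambda^n_5(\xi) = r_5 + \frac{b_2}{\sqrt{n}},$$

where $r_i = \bar{r}_i$ for $i \in \{1, 2, 3\}$, $r_4 \triangleq \bar{r}_4 - b_1/\sqrt{n}$, and $r_5 \triangleq \bar{r}_5 - b_2/\sqrt{n}$. This way, $\Lambda_i(\xi) = \lambda^n_i(\xi/\sqrt{n})$ and Assumption 3(a)–(b) are verified for this choice of $\lambda^n_i$. To simplify the discussion, let $b_1 = b_2 = 0$ and, hence, $r_4 = \bar{r}_4$ and $r_5 = \bar{r}_5$.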

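Now let us consider Assumption 3(c). For that, the reflection directions $\gamma(S)$, defined on the faces and edges of the state space, need to be constructed. Let $E_{ki}$ and $F_{ki}$ be as given in Table 1 and suppose that both places $p_1$ and $p_2$ have finite buffers given by $\bar{B}_1 = \sqrt{n} B_1$ and $\bar{B}_2 = \sqrt{n} B_2$, respectively, for constants $B_1, B_2 > 0$. Then the reflection directions at the corners or edges of the state space are given by:

at $\partial_1 \triangleq \{\xi \in \mathbb{R}^2 \mid \xi_1 = 0\}$: $\gamma(\{1\}) = (r_2 - r_1,\ -r_2)^T$

at $\partial_2 \triangleq \{\xi \in \mathbb{R}^2 \mid \xi_2 = 0\}$: $\gamma(\{2\}) = (r_2,\ r_3 - r_2)^T$

at $\partial_3 \triangleq \{\xi \in \mathbb{R}^2 \mid \xi_1 = B_1\}$: $\gamma(\{3\}) = (-r_1 - r_4,\ 0)^T$

at $\partial_4 \triangleq \{\xi \in \mathbb{R}^2 \mid \xi_2 = B_2\}$: $\gamma(\{4\}) = (r_2,\ -r_2 - r_5)^T$

at $\partial_1 \cap \partial_2$: $\gamma(\{1, 2\}) = (r_2 - r_1,\ r_3 - r_2)^T$

at $\partial_1 \cap \partial_4$: $\gamma(\{1, 4\}) = (-r_1 + r_2,\ -r_2 - r_5)^T$

at $\partial_2 \cap \partial_3$: $\gamma(\{2, 3\}) = (-r_1 + r_2 - r_4,\ -r_2 + r_3)^T$

at $\partial_3 \cap \partial_4$: $\gamma(\{3, 4\}) = (-r_1 + r_2 - r_4,\ -r_2 - r_5)^T$,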

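Table 1
Summary of the data used in the predator–prey model.

| Transitions ($k$) | Places ($i$) | $E_{ki}$ | $F_{ki}$ | $E_{ki} \cup \{u + F_{ki}\}$ | $r_k$ | $f_k(\xi)$ |
|---|---|---|---|---|---|---|
| $t_1$ | $p_1$ | {1} | {1} | {1, 3} | $r_1$ | $k_1\xi_1$ |
| | $p_2$ | – | – | – | | |
| $t_2$ | $p_1$ | {1, 2} | {2} | {1, 2, 4} | $r_2$ | $k_2\xi_1\xi_2$ |
| | $p_2$ | {1, 2} | {2} | {1, 2, 4} | | |
| $t_3$ | $p_1$ | – | – | – | $r_3$ | $k_3\xi_2$ |
| | $p_2$ | {2} | – | {2} | | |
| $t_4$ | $p_1$ | – | {1} | {3} | $r_4$ | 0 |
| | $p_2$ | – | – | – | | |
| $t_5$ | $p_1$ | – | – | – | $r_5$ | 0 |
| | $p_2$ | – | {2} | {4} | | |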

Fig. 2. (Left) Sample path of x versus time for the predator–prey model. (Right) Plot of x1 versus x2 for the same path.

and one needs to check whether they satisfy Assumption 3(c). With some algebraic manipulation, it is possible to show that these reflection directions satisfy the required assumption if $r_i > 0$ for all $i \in \{1, \dots, 5\}$. Under this assumption, we also have that the covariance matrix of the process $A^T m(t)$ (present in Eq. (6)), given by

$$A^T \mathrm{diag}(r) A \triangleq \Sigma = \begin{pmatrix} r_1 + r_2 + r_4 & -r_2 \\ -r_2 & r_2 + r_3 + r_5 \end{pmatrix},$$

is positive definite. Then, the reflection directions at the corners of the state space can be dropped by Theorem 5, and we can write the heavy traffic limit for this system as:

$$x(t) = x(0) + \int_0^t \begin{pmatrix} k_1 x_1(s) - k_2 x_1(s) x_2(s) \\ k_2 x_1(s) x_2(s) - k_3 x_2(s) \end{pmatrix} ds + \begin{pmatrix} 2r_2 & -r_2 \\ -r_2 & 2r_3 \end{pmatrix}^{1/2} w(t) + \begin{pmatrix} r_4 & r_2 & -r_2 & r_2 \\ -r_2 & r_5 & 0 & -r_3 \end{pmatrix} \zeta(t),$$

where we used the identities $r_1 + r_4 = r_2$ and $r_2 + r_5 = r_3$, given by the heavy traffic condition, to simplify the covariance and reflection matrices. Also, $w$ is a standard Wiener process and $\Sigma^{1/2}$ is such that $\Sigma = \Sigma^{1/2}(\Sigma^{1/2})^T$.
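The heavy traffic condition and the covariance matrix can be checked numerically. The sketch below uses the rate values $r = (3, 4, 5, 1, 1)$ of the numerical experiment in this section; the helper names (`heavy_traffic_residual`, `covariance`, `is_positive_definite_2x2`) are hypothetical and introduced only for illustration.

```python
# Matrix A^T of the predator-prey net (columns t1, ..., t5) and the rates r.
A_T = [[1, -1, 0, 1, 0],
       [0, 1, -1, 0, 1]]
r = [3.0, 4.0, 5.0, 1.0, 1.0]


def heavy_traffic_residual(a_t, rates):
    """Return A^T r, which must vanish under the heavy traffic condition."""
    return [sum(a * rk for a, rk in zip(row, rates)) for row in a_t]


def covariance(a_t, rates):
    """Return Sigma = A^T diag(r) A as a nested list."""
    return [[sum(ai * rk * aj for ai, rk, aj in zip(row_i, rates, row_j))
             for row_j in a_t]
            for row_i in a_t]


def is_positive_definite_2x2(s):
    """Sylvester's criterion for a symmetric 2x2 matrix."""
    return s[0][0] > 0 and s[0][0] * s[1][1] - s[0][1] * s[1][0] > 0


residual = heavy_traffic_residual(A_T, r)
sigma = covariance(A_T, r)
```

With these values, `sigma` agrees with the simplified form $\begin{pmatrix} 2r_2 & -r_2 \\ -r_2 & 2r_3 \end{pmatrix}$ that appears in the limit equation.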


Fig. 3. Diagram of a queueing system with congestion control.

Fig. 2 shows a sample path of $x$ constructed with Euler’s method [21], where the reflection process is implemented by pushing the process back (along the reflection direction) if it crosses the boundaries of the state space. In this example we used $r_1 = 3$, $r_2 = 4$, $r_3 = 5$, $r_4 = 1$, $r_5 = 1$, $k_1 = 1$, $k_2 = 0.1$, $k_3 = 1$, $x(0) = (0, 0)^T$, and the buffer sizes $B_1$ and $B_2$ are large enough so that they do not interfere with the process in the time interval considered.
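A scheme of this kind can be sketched as follows. This is one possible reading of the description above, not the authors' code: the function name, the step size `dt`, the horizon `T`, and the seed are assumptions, the Cholesky factor is used as one valid choice of $\Sigma^{1/2}$, only the two lower boundaries are handled (the buffers are taken to be large), and the reflection is applied by moving a point that left the quadrant along the corresponding direction $\gamma$ until it is back on the boundary.

```python
import math
import random


def simulate_predator_prey(T=5.0, dt=1e-3, seed=1):
    """Euler scheme with oblique reflection for the predator-prey limit."""
    r2, r3, r4, r5 = 4.0, 5.0, 1.0, 1.0
    k1, k2, k3 = 1.0, 0.1, 1.0
    # Cholesky factor L of Sigma = [[2 r2, -r2], [-r2, 2 r3]], Sigma = L L^T.
    l11 = math.sqrt(2.0 * r2)
    l21 = -r2 / l11
    l22 = math.sqrt(2.0 * r3 - l21 * l21)
    gamma1 = (r4, -r2)  # reflection direction on the face x1 = 0
    gamma2 = (r2, r5)   # reflection direction on the face x2 = 0
    rng = random.Random(seed)
    x1, x2 = 0.0, 0.0
    path = [(x1, x2)]
    sqrt_dt = math.sqrt(dt)
    for _ in range(int(round(T / dt))):
        dw1 = rng.gauss(0.0, 1.0) * sqrt_dt
        dw2 = rng.gauss(0.0, 1.0) * sqrt_dt
        drift1 = k1 * x1 - k2 * x1 * x2
        drift2 = k2 * x1 * x2 - k3 * x2
        x1 += drift1 * dt + l11 * dw1
        x2 += drift2 * dt + l21 * dw1 + l22 * dw2
        if x1 < 0.0:  # push back along gamma1 until x1 = 0
            amount = -x1 / gamma1[0]
            x1 = 0.0
            x2 += gamma1[1] * amount
        if x2 < 0.0:  # push back along gamma2 until x2 = 0
            amount = -x2 / gamma2[1]
            x2 = 0.0
            x1 += gamma2[0] * amount
        path.append((x1, x2))
    return path


path = simulate_predator_prey()
```

Handling the face $x_1 = 0$ before the face $x_2 = 0$ is enough here because the second push has a positive first component, so it cannot move the point back across $x_1 = 0$.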

6.2. Congestion control

Let us now consider the queueing system given by Fig. 3. In this system, there are two queues in tandem and one queue which is not directly connected to the other two. Let the buffer of each queue be finite. Suppose that queue 2 (Q2) needs to avoid buffer overflow and it does that by moving customers from queue 1 (Q1) into queue 3 (Q3) whenever it is almost full. These signals to move a customer from Q1 to Q3 are sent at the moment of departure of customers from Q2. Signals of this type are usually called “triggers” in the G-networks literature [15]. Hence, we can represent the system by:

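$$\emptyset \xrightarrow{t_1} Q_1$$

$$Q_1 \xrightarrow{t_2} Q_2$$

$$\left.\begin{array}{r} Q_1 + Q_2 \\ Q_2 \end{array}\right\} \xrightarrow{t_3} \left\{\begin{array}{l} Q_3 \\ \emptyset \end{array}\right.$$

$$\emptyset \xrightarrow{t_4} Q_3$$

$$Q_3 \xrightarrow{t_5} \emptyset,$$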

where transition $t_3$ has two possible outcomes. One of the outcomes corresponds to the departure of a customer from Q2 and the other corresponds to the departure of a customer from Q2 together with the displacement of one customer from Q1 to Q3.

For this example, we suppose that the transition rates are not state dependent. However, the probability of each outcome of transition $t_3$ will be state dependent and it will be used to define a control policy that avoids buffer overflow in Q2 by moving customers from Q1 to Q3. For brevity, following the same line of reasoning given in the previous section, let us suppose that there are $\lambda^n_i$ satisfying Assumption 9(a)–(b) with data given by $r_i > 0$, for $i = 1, \dots, 5$, $f_1(\cdot) = f_4(\cdot) = 0$, $f_2(\cdot) = b_2$, $f_3(\cdot) = b_3$, and $f_5(\cdot) = b_5$, where $b_2$, $b_3$, and $b_5$ are constants (see Table 2 for reference). Also, let $p_{31} = 0$, $p_{32} = 1$, $g_{31}(\xi) = g(\xi)$ and $g_{32}(\xi) = -g(\xi)$, for $\xi \in \mathbb{R}^3$, where the function $g : \mathbb{R}^3_{\ge 0} \to [0, 1]$ will be chosen later.


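Table 2
Summary of the data used in the congestion control model.

| $t_k$ | $p_i$ | $E_{ki}$ | $F_{ki}$ | $E_{ki} \cup \{u + F_{ki}\}$ | $r_k$ | $f_k(\xi)$ |
|---|---|---|---|---|---|---|
| $t_1$ | $p_1$ | – | {1} | {4} | $r_1$ | 0 |
| | $p_2$ | – | – | – | | |
| | $p_3$ | – | – | – | | |
| $t_2$ | $p_1$ | {1} | {2} | {1, 5} | $r_2$ | $b_2$ |
| | $p_2$ | {1} | {2} | {1, 5} | | |
| | $p_3$ | – | – | – | | |
| $t_{3,1}$ | $p_1$ | {1, 2} | – | {1, 2} | $r_3 p_{31} = 0$ | $r_3 g_{31}(\xi) + p_{31} f_3(\xi) = r_3 g(\xi)$ |
| | $p_2$ | {2} | – | {2} | | |
| | $p_3$ | {1, 2} | {3} | {1, 2, 6} | | |
| $t_{3,2}$ | $p_1$ | – | – | – | $r_3 p_{32} = r_3$ | $r_3 g_{32}(\xi) + p_{32} f_3(\xi) = b_3 - r_3 g(\xi)$ |
| | $p_2$ | {2} | – | {2} | | |
| | $p_3$ | – | – | – | | |
| $t_4$ | $p_1$ | – | – | – | $r_4$ | 0 |
| | $p_2$ | – | – | – | | |
| | $p_3$ | – | {3} | {6} | | |
| $t_5$ | $p_1$ | – | – | – | $r_5$ | $b_5$ |
| | $p_2$ | – | – | – | | |
| | $p_3$ | {3} | – | {3} | | |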

Fig. 4. Plot of the switching curve of optimal control in the (scaled) state space. The control is applied at maximum strength in the region underneath the curve. The contour map of this curve is shown on the left.

As discussed in Section 5, each outcome of transition $t_3$ can be seen as an individual (independent) transition, which we call $t_{3,1}$ and $t_{3,2}$, with rates given by Table 2. This way, the matrix $A^T$ can be written as:

$$A^T = \begin{pmatrix} 1 & -1 & -1 & 0 & 0 & 0 \\ 0 & 1 & -1 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 & 1 & -1 \end{pmatrix},$$

where the columns correspond, in order, to the transitions $t_1, t_2, t_{3,1}, t_{3,2}, t_4, t_5$.


This way, the limit equation for this system can be written as:

$$x(t) = x(0) + \int_0^t \begin{pmatrix} -b_2 - g(x(s)) r_3 \\ b_2 - b_3 \\ g(x(s)) r_3 - b_5 \end{pmatrix} ds + \begin{pmatrix} r_1 + r_2 & -r_2 & 0 \\ -r_2 & r_2 + r_3 & 0 \\ 0 & 0 & r_4 + r_5 \end{pmatrix}^{1/2} w(t) + \begin{pmatrix} r_2 & 0 & 0 & -r_1 & r_2 & 0 \\ -r_2 & r_3 & 0 & 0 & -r_2 & 0 \\ 0 & 0 & r_5 & 0 & 0 & -r_4 \end{pmatrix} \zeta(t),$$

where $w$ is a standard Wiener process. Under the given assumptions, the diffusion is non-degenerate, so that the reflection directions at the corners and edges of the state space can be dropped.

Now we set up a stochastic optimal control problem in order to find an adequate congestion control policy. That is, we want to find an optimal control function $g$ which minimizes the discounted cost:

$$W(x_0, g) = E^g_{x_0}\left[\int_0^\infty e^{-\beta t}\left(c\,g(x(t))\,dt + v_1\,d\zeta_5(t) + v_2\,d\zeta_6(t)\right)\right],$$

where $c$ is a positive constant associated with the cost of moving the customers, $v_1$ is a positive constant associated with the cost of the buffer being full at Q2, $v_2$ is a positive constant associated with the cost of exceeding the buffer of Q3, and $x(0) = x_0$ is the initial condition.

The Markov chain approximation method of Kushner and Dupuis [26] was used to find the optimum control $g$ numerically (we used the implementation of Jarvis and Kushner [20]). The discretization parameter was set to $h = 0.1$, and the constants to $\beta = 0.001$, $c = 1$, $v_1 = v_2 = 200$, $r_1 = r_4 = 1$, $b_1 = b_3 = 0.1$, and $b_2 = 0.2$. Also, the buffer sizes were set to $B_1 = B_2 = B_3 = 10$. Fig. 4 shows the switching curve of the optimal control. This curve separates the state space into a region where the control is applied at a maximum rate and another where the control is not applied at all.

6.3. Load balancing system

We now turn our attention to a system composed of two queues in parallel, as shown in Fig. 5. In this system, both queues receive input at the same time. In other words, the input is forked between the two queues upon arrival. At the time of customer departure, one queue may move a customer from the other queue into itself whenever the system is imbalanced (i.e., when one queue has more customers than the other). This model was used in [30] to devise a balancing policy for parallel processing stations used in web search systems. The system can be represented by the following transitions:

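$$\emptyset \xrightarrow{t_1} Q_1 + Q_2$$

$$\left.\begin{array}{r} Q_2 + Q_1 \\ Q_1 \end{array}\right\} \xrightarrow{t_2} \left\{\begin{array}{l} Q_1 \\ \emptyset \end{array}\right.$$

$$\left.\begin{array}{r} Q_1 + Q_2 \\ Q_2 \end{array}\right\} \xrightarrow{t_3} \left\{\begin{array}{l} Q_2 \\ \emptyset \end{array}\right.$$

where $t_2$ and $t_3$ have two possible outcomes each. Again, by Section 4, each outcome of transitions $t_2$ and $t_3$ can be seen as independent transitions, let us call them $t_{2,1}$, $t_{2,2}$, $t_{3,1}$, and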


Fig. 5. Diagram showing two queues in parallel with forked input.

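Table 3
Summary of the data used in the load balancing problem.

| $t_k$ | $p_i$ | $E_{ki}$ | $F_{ki}$ | $E_{ki} \cup \{u + F_{ki}\}$ | $r_k$ | $f_k(\xi)$ |
|---|---|---|---|---|---|---|
| $t_1$ | $p_1$ | – | {1, 2} | {3, 4} | $r_1$ | 0 |
| | $p_2$ | – | {1, 2} | {3, 4} | | |
| $t_{2,1}$ | $p_1$ | {1} | – | {1} | 0 | $r_2 h_1(\xi)$ |
| | $p_2$ | {1, 2} | – | {1, 2} | | |
| $t_{2,2}$ | $p_1$ | {1} | – | {1} | $r_2$ | $b_2 - r_2 h_1(\xi)$ |
| | $p_2$ | – | – | – | | |
| $t_{3,1}$ | $p_1$ | {1, 2} | – | {1, 2} | 0 | $r_3 h_2(\xi)$ |
| | $p_2$ | {2} | – | {2} | | |
| $t_{3,2}$ | $p_1$ | – | – | – | $r_3$ | $b_3 - r_3 h_2(\xi)$ |
| | $p_2$ | {2} | – | {2} | | |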

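$t_{3,2}$. The matrix $A^T$ for this system is given by:

$$A^T = \begin{pmatrix} 1 & 0 & -1 & -1 & 0 \\ 1 & -1 & 0 & 0 & -1 \end{pmatrix},$$

where the columns correspond, in order, to the transitions $t_1, t_{2,1}, t_{2,2}, t_{3,1}, t_{3,2}$.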

Assume that there are appropriate $\lambda^n_i$ with data given by $r_i > 0$, for $i = 1, \dots, 3$, $f_1(\cdot) = 0$, $f_2(\cdot) = b_2$, and $f_3(\cdot) = b_3$, where $b_2$ and $b_3$ are constants. Let in addition $p_{21} = 0$, $p_{22} = 1$, $g_{21}(\cdot) = h_1(\cdot)$, $g_{22}(\cdot) = -h_1(\cdot)$, $p_{31} = 0$, $p_{32} = 1$, $g_{31}(\cdot) = h_2(\cdot)$, and $g_{32}(\cdot) = -h_2(\cdot)$, where $h_1, h_2 : \mathbb{R}^2_+ \to [0, 1]$ will be chosen later. Table 3 contains a summary of the data used in this problem and the sets $E_{ki}$ and $F_{ki}$.

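This way, the limit equation can be written as:

$$x(t) = x(0) + \int_0^t \begin{pmatrix} h_1(x(s)) r_2 - h_2(x(s)) r_3 - b_2 \\ h_2(x(s)) r_3 - h_1(x(s)) r_2 - b_3 \end{pmatrix} ds + \begin{pmatrix} r_1 + r_2 & r_1 \\ r_1 & r_1 + r_3 \end{pmatrix}^{1/2} w(t) + \begin{pmatrix} r_2 & 0 & -r_1 & -r_1 \\ 0 & r_3 & -r_1 & -r_1 \end{pmatrix} \zeta(t),$$

where $w$ is a standard Wiener process and we used the fact that the diffusion is non-degenerate.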


Fig. 6. Region of the (scaled) state space where the control is active. The region labeled “Control 1” (resp., “Control 2”) is the region where the optimal control $h_1(\cdot)$ (resp., $h_2(\cdot)$) is active. Notice that when control 1 is active, the process is forced down diagonally to the center of the state space.

Now we can find the optimum choice of $h_1$ and $h_2$ that reduces imbalance in the system. For that, we use the following discounted cost:

$$W(x_0, h_1, h_2) = E^h_{x_0}\left[\int_0^\infty e^{-\beta t}\left(v\,|x_1(t) - x_2(t)| + c_1 h_1(x(t)) + c_2 h_2(x(t))\right) dt\right],$$

where $c_1$ and $c_2$ are positive constants associated with the cost of moving customers between the queues and $v$ is a constant associated with the cost of having the queues unbalanced.

Again, we use the Markov chain approximation method to compute the optimal $h = (h_1, h_2)^T$. For that we set $c_1 = c_2 = v = 1$, $\beta = 0.001$, and the discretization parameter was set to 0.1. Also, we let $b_1 = b_2 = 0.21$ and $r_1 = 0.028$. Since the Markov chain approximation method requires finite buffers, we let $B_1 = B_2 = 5$. Again, the optimal control is of the switching type. The state space is divided into a region where the control is applied at the maximum rate and a region where the control is not applied at all. Fig. 6 shows the regions of the state space where the optimal control is active.

6.4. Quality assurance in production lines

Let us now consider a model of a production line consisting of two queues in tandem (see Fig. 7 for reference). Products at the second station need to pass a quality assurance process. The products which do not meet a minimum quality requirement are sent back to the first queue for reprocessing or are sent to a third station, where they are restored and resent to the production line. This problem can be represented by the following transitions:

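$$\emptyset \xrightarrow{t_1} Q_1$$

$$Q_1 \xrightarrow{t_2} Q_2$$

$$Q_2 \xrightarrow{t_3} \emptyset$$

$$Q_2 \xrightarrow{t_4} \left\{\begin{array}{l} Q_1 \\ Q_3 \end{array}\right.$$

$$Q_3 \xrightarrow{t_5} Q_1,$$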


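Table 4
Summary of the data used in the quality assurance model.

| $t_k$ | $p_i$ | $E_{ki}$ | $F_{ki}$ | $E_{ki} \cup \{u + F_{ki}\}$ | $r_k$ | $f_k(\xi)$ |
|---|---|---|---|---|---|---|
| $t_1$ | $p_1$ | – | {1} | {4} | $r_1$ | 0 |
| | $p_2$ | – | – | – | | |
| | $p_3$ | – | – | – | | |
| $t_2$ | $p_1$ | {1} | – | {1} | $r_2$ | $b_2$ |
| | $p_2$ | {1} | {2} | {1, 5} | | |
| | $p_3$ | – | – | – | | |
| $t_3$ | $p_1$ | – | – | – | $r_3$ | $b_3$ |
| | $p_2$ | {2} | – | {2} | | |
| | $p_3$ | – | – | – | | |
| $t_{4,1}$ | $p_1$ | {2} | {1} | {2, 4} | $r_4 p_{41}$ | $p_{41} b_4$ |
| | $p_2$ | {2} | – | {2} | | |
| | $p_3$ | – | – | – | | |
| $t_{4,2}$ | $p_1$ | – | – | – | $r_4 p_{42}$ | $p_{42} b_4$ |
| | $p_2$ | {2} | – | {2} | | |
| | $p_3$ | {2} | {3} | {2, 6} | | |
| $t_5$ | $p_1$ | {3} | {1} | {3, 4} | $r_5$ | $b_5$ |
| | $p_2$ | – | – | – | | |
| | $p_3$ | {3} | – | {3} | | |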

where transition $t_4$ has two possible outcomes. Let us suppose that there are appropriate $\lambda^n_i$ with positive constants $r_i$ and functions given by $f_1(\cdot) = 0$, $f_i(\cdot) = b_i$, for $i \in \{2, 3, 4, 5\}$, where the $b_i$ are constants. Also, let there be appropriate $q^n_{4j}$, $j \in \{1, 2\}$, such that $p_{41} + p_{42} = 1$ and $g_{41}(\cdot) = g_{42}(\cdot) = 0$. In addition, let the queue buffers be finite. Table 4 contains a summary of the data used and the sets $E_{ki}$ and $F_{ki}$.

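In this way, the heavy traffic limit for this system is given by:

$$x(t) = x(0) + \begin{pmatrix} -b_2 + b_5 + p_{41} b_4 \\ b_2 - b_3 - b_4 \\ -b_5 + p_{42} b_4 \end{pmatrix} t + \begin{pmatrix} r_1 + p_{41} r_4 + r_2 + r_5 & -r_2 - p_{41} r_4 & -r_5 \\ -r_2 - p_{41} r_4 & r_2 + r_4 + r_3 & -p_{42} r_4 \\ -r_5 & -p_{42} r_4 & p_{42} r_4 + r_5 \end{pmatrix}^{1/2} w(t)$$

$$\qquad + \begin{pmatrix} r_2 & -p_{41} r_4 & -r_5 & -r_1 - p_{41} r_4 - r_5 & 0 & 0 \\ -r_2 & r_3 + r_4 & 0 & 0 & -r_2 & 0 \\ 0 & -p_{42} r_4 & r_5 & 0 & 0 & -p_{42} r_4 \end{pmatrix} \zeta(t).$$

Again, under the given assumptions, the diffusion is non-degenerate and the reflections at the corners and edges were dropped.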

Acknowledgments

This work was supported in part by the Brazilian National Research Council-CNPq, under the Grants 140687/2005-0, 384163/2009-2, and 302501/2010-0 and by PRONEX under the grant FAPERJ E-26/170.008/2008.


Fig. 7. A diagram of a production line with quality assurance. The dashed lines indicate products which are sent back to the first station for reprocessing or sent to a third station to be restored.

The authors would like to thank the reviewer for his valuable comments, which helped improve the quality of the manuscript.

Appendix. Auxiliary results

In this appendix, we give some results which are used in the main theorems of the paper but whose details are too technical to be shown in the main body of the paper.

Lemma 11. Let $M$ be an $\mathcal{F}_t$-martingale such that $M(t) = \int_0^t H(s)\,(dN(s) - \lambda(s)\,ds)$, where $N$ is a point process with $\mathcal{F}_t$-stochastic intensity $\lambda$ (where $\int_0^t \lambda(s)\,ds < \infty$ a.s.) and $H$ is a predictable process satisfying $E\int_0^t |H(s)|^2 \lambda(s)\,ds < \infty$, $t \ge 0$. Then the Doob–Meyer process associated with $M$ is given by $\langle M\rangle(t) = \int_0^t H^2(s)\lambda(s)\,ds$, for $t \ge 0$.

Proof. From Ito’s formula (given by Theorem 32 of [42, p. 71]) for $M$ with $f(x) = x^2$, we have that $M^2(t) = M^2(0) + \int_0^t 2M(s-)\,dM(s) + \sum_{0<s\le t} (\Delta M(s))^2$, where $\Delta M(s) = M(s) - M(s-)$. Notice that, if $M$ jumps at the instant $s$, then $\Delta M(s) = H(s)$. Hence, we may write: $\sum_{0<s\le t} (\Delta M(s))^2 = \sum_{l=0}^{\infty} H^2(s_l) I\{s_l \le t\} = \int_0^t H^2(s)\,dN(s)$, where $s_l$ denotes the instant of the $l$-th jump of $M$ (or $N$). Therefore, we have that

$$M^2(t) = \int_0^t [2M(s-) + H(s)]\,H(s)\,(dN(s) - \lambda(s)\,ds) + \int_0^t H^2(s)\lambda(s)\,ds = \int_0^t [2M(s-) + H(s)]\,dM(s) + \int_0^t H^2(s)\lambda(s)\,ds.$$

By Theorem T8 of [5, p. 27], we have that $\int_0^t [2M(s-) + H(s)]\,dM(s)$ is an $\mathcal{F}_t$-martingale. By the definition of the Doob–Meyer process associated with $M$ we have the stated result.
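As a quick sanity check of Lemma 11 (an illustration added here, using the simplest possible case), take $H \equiv 1$ and a constant intensity $\lambda(s) \equiv \lambda > 0$, so that $N$ is a Poisson process with rate $\lambda$ and $M$ is the compensated process:

$$M(t) = N(t) - \lambda t, \qquad \langle M\rangle(t) = \int_0^t 1^2 \cdot \lambda\,ds = \lambda t.$$

This agrees with the familiar identity $E[M^2(t)] = \mathrm{Var}(N(t)) = \lambda t$ for a Poisson process.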

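Lemma 12. Let $A$ be an $\mathcal{F}_t$-adapted point process with intensity $\lambda$ (where $\int_0^t \lambda(s)\,ds < \infty$ a.s.), and $H$ an $\mathcal{F}_t$-predictable process such that $E\int_0^t |H(s)|\lambda(s)\,ds < \infty$, $t \ge 0$. Let $M(t) \triangleq A(t) - \int_0^t \lambda(s)\,ds$. Then $E\left[\left(\int_0^t H(s)\,dM(s)\right)^2\right] = E\int_0^t H^2(s)\lambda(s)\,ds$.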

Proof. Notice that $M$ has bounded variation on finite intervals of $[0, \infty)$ a.s., and $\int_0^t H(s)\,dM(s)$ is a martingale by Theorem 8 of [5, p. 27]. Then by Theorem 26 of [42, p. 63], $M$ has quadratic variation given by $[M](t) = \sum_{0<s\le t} (\Delta M(s))^2 = A(t)$. By Corollary 3 and Theorem 29 of [42] (pp. 66 and 68, respectively), $E\left[\left(\int_0^t H(s)\,dM(s)\right)^2\right] = E\int_0^t H^2(s)\,dA(s) = E\int_0^t H^2(s)\lambda(s)\,ds$.

References

[1] A. Arazi, E. Ben-Jacob, U. Yechiali, Controlling an oscillating Jackson-type network having state-dependent servicerates, Mathematical Methods of Operations Research 62 (3) (2005) 453–466.

[2] J. Artalejo, G-networks: a versatile approach for work removal in queueing networks, European Journal ofOperational Research 126 (2000) 233–249.

[3] S. Bhardwaj, R. Williams, Diffusion approximation for a heavily loaded multi-user wireless communication systemwith cooperation, Queueing Systems 62 (4) (2009) 345–382.

[4] P. Billingsley, Convergence of Probability Measures, second ed., John Wiley & Sons, New York, 1999.[5] P. Bremaud, Point Processes and Queues, Martingale Dynamics, Springer-Verlag, New York, 1981.[6] D. Bridges, Foundations of Real and Abstract Analysis, in: Graduate Texts in Mathematics, vol. 146, Springer,

1998.[7] C. Chaouiya, Petri net modelling of biological networks, Briefings in Bioinformatics 8 (4) (2007) 210–219.[8] L. Chen, G. Qi-Wei, M. Nakata, H. Matsuno, S. Miyano, Modelling and simulation of signal transductions in an

apoptosis pathway by using timed Petri nets, Journal of Biosciences 32 (1) (2007) 113–127.[9] H. Chen, D. Yao, Fundamentals of Queueing Networks, Springer-Verlag, New York, 2001.

[10] J.G. Dai, R.J. Williams, Existence and uniqueness of semimartingale reflecting Brownian motions in convexpolyhedrons, Theory of Probability and its Applications 40 (1) (1995) 1–40.

[11] T.V. Do, An initiative for a classified bibliography on g-networks, Performance Evaluation 68 (4) (2011) 385–394.[12] T.V. Do, Bibliography on g-networks, negative customers and applications, Mathematical and Computer Modelling

53 (1/2) (2011) 205–212.[13] P. Dupuis, H. Ishii, On lipschitz continuity of the solution mapping to the skorokhod problem, with applications,

Stochastics and Stochastics Reports 35 (1991) 31–62.[14] E. Gelenbe, Product-form queueing networks with negative and positive customers, Journal of Applied Probability

28 (1991) 656–663.[15] E. Gelenbe, G-networks with triggered customer movement, Journal of Applied Probability 30 (1993) 742–748.[16] E. Gelenbe, P. Glynn, K. Sigman, Queues with negative arrivals, Journal of Applied Probability 28 (1) (1991)

245–250.[17] V. Guffens, E. Gelenbe, G. Bastin, Qualitative dynamical analysis of queueing networks with inhibition, in: Interperf

’06: Proceedings from the 2006 Workshop on Interdisciplinary Systems Approach in Performance Evaluation andDesign of Computer & Communications Sytems, 2006.

[18] S. Hardy, P. Robillard, Modeling and simulation of molecular biology systems using Petri nets: modeling goals ofvarious approaches, Journal of Bioinformatics and Computational Biology 2 (4) (2004) 619–637.

[19] S. Hardy, P. Robillard, Petri net-based method for the analysis of the dynamics of signal propagation in signalingpathways, Bioinformatics 24 (2) (2008) 209–217.

[20] D. Jarvis, H. Kushner, Codes for optimal stochastic control: documentation and users guide, Technical report,Brown University, Lefschetz Center for Dynamical Systems Report 96–3, 1996.

[21] P. Kloeden, E. Platen, H. Schurz, Numerical Solution of SDE Through Computer Experiments, Springer–Verlag,New York, 1994.

[22] I. Koch, W. Reisig, F. Schreiber (Eds.), Modeling in Systems Biology, in: The Petri Net Approach, vol. 16, Springer,2011.

[23] T. Kurtz, Representations of markov processes as multiparameter time changes, The Annals of Probability 8 (4)(1980) 682–715.

[24] T. Kurtz, Approximation of Population Processes, SIAM, Philadelphia, Pennsylvania, 1981.

[25] H. Kushner, Heavy Traffic Analysis of Controlled Queueing and Communication Networks, Springer-Verlag, New York, 2001.

[26] H. Kushner, P. Dupuis, Numerical Methods for Stochastic Control Problems in Continuous Time, Springer-Verlag, New York, 1992.

[27] H. Kushner, J. Yang, D. Jarvis, Controlled and optimally controlled multiplexing systems: a numerical exploration, Queueing Systems 20 (1995) 255–291.

[28] S.C. Leite, Aproximações para redes estocásticas sinalizantes sob tráfego pesado, Ph.D. Thesis, National Laboratory for Scientific Computing, Brazil, 2009.


[29] S. Leite, M. Fragoso, Diffusion approximation of state dependent G-networks under heavy traffic, Journal of Applied Probability 45 (2008) 347–362.

[30] S. Leite, M. Fragoso, Heavy traffic analysis of state-dependent parallel queues with triggers and an application to web search systems, Performance Evaluation 67 (10) (2010) 913–928.

[31] A. Mandelbaum, G. Pats, State-dependent stochastic network. Part I: approximations and applications with continuous diffusion limits, The Annals of Applied Probability 8 (1998) 569–646.

[32] M.A. Marsan, Stochastic Petri nets: an elementary introduction, in: Advances in Petri Nets 1989, Springer-Verlag, New York, 1990, pp. 1–29.

[33] M. Marsan, G. Conte, G. Balbo, A class of generalized stochastic Petri nets for the performance evaluation of multiprocessor systems, ACM Transactions on Computer Systems 2 (2) (1984) 93–122.

[34] P. Meyer, Démonstration simplifiée d’un théorème de Knight, Séminaire de probabilités (Strasbourg) 5 (1971) 191–195.

[35] M. Molloy, On the integration of delay and throughput measures in distributed processing models, Ph.D. Thesis, University of California, Los Angeles, 1981.

[36] M. Molloy, Performance analysis using stochastic Petri nets, IEEE Transactions on Computers C-31 (9) (1982) 913–917.

[37] T. Murata, Petri nets: properties, analysis and applications, Proceedings of the IEEE 77 (4) (1989) 541–580.

[38] S. Natkin, Réseaux de Petri stochastiques, Ph.D. Thesis, CNAM-PARIS, Paris, 1980.

[39] C. Petri, Kommunikation mit Automaten, Ph.D. Thesis, Institut für Instrumentelle Mathematik, Bonn, 1962.

[40] C.A. Petri, Nets, time and space, Theoretical Computer Science 153 (1–2) (1996) 3–48.

[41] J. Pinney, D. Westhead, G. McConkey, Petri net representations in systems biology, Biochemical Society Transactions (2003) 1513–1515.

[42] P. Protter, Stochastic Integration and Differential Equations: A New Approach, Springer-Verlag, New York, 1995.

[43] C. Ramchandani, Analysis of asynchronous concurrent systems by timed Petri nets, Ph.D. Thesis, Massachusetts Institute of Technology, Massachusetts, 1973.

[44] W. Reisig, Petri Nets: Introduction, Springer-Verlag, New York, 1985.

[45] F. Symons, Introduction to numerical Petri nets, a general graphical model of concurrent processing systems, Australian Telecommunications Research 14 (1) (1980) 28–33.

[46] L. Taylor, R. Williams, Existence and uniqueness of semimartingale reflecting Brownian motions in an orthant, Probability Theory and Related Fields 96 (3) (1993) 283–317.

[47] W. Whitt, Stochastic-Process Limits, Springer-Verlag, New York, 2002.

[48] D. Wilkinson, Stochastic Modelling for Systems Biology, Chapman & Hall/CRC, Boca Raton, Florida, 2006.

[49] K. Yamada, Diffusion approximation for open state-dependent queueing networks in the heavy traffic situation, The Annals of Applied Probability 5 (4) (1995) 958–982.
