Chemical Engineering Research and Design 129 (2018) 376–383

journal homepage: www.elsevier.com/locate/cherd

The “No Sampling Parameter Estimation (NSPE)” algorithm for stochastic differential equations

Kirti M. Yenkie b,c, Urmila Diwekar a,b,∗

a Department of Bio Engineering, University of Illinois, Chicago, IL 60607, USA
b Center for Uncertain Systems: Tools for Optimization & Management (CUSTOM), VRI, Crystal Lake, IL 60012, USA
c Department of Chemical Engineering, Rowan University, Glassboro, NJ, USA

∗ Corresponding author at: Center for Uncertain Systems: Tools for Optimization & Management (CUSTOM), Vishwamitra Research Institute, 2714 Crystal Way, Crystal Lake, IL 60012, USA. E-mail address: [email protected] (U. Diwekar).

https://doi.org/10.1016/j.cherd.2017.11.018
0263-8762/© 2017 Institution of Chemical Engineers. Published by Elsevier B.V. All rights reserved.

Article history: Received 5 July 2017; Received in revised form 6 November 2017; Accepted 9 November 2017; Available online 26 November 2017

Keywords: Parameter estimation; Uncertainty; Ito process; Brownian motion; Population dynamics; Process systems engineering

Abstract

The parameter estimation problem in stochastic differential equations (SDEs) has gained much attention in recent years due to their increased applications in pharmaceuticals, ecosystem modeling, and medical data such as EKG, blood pressure, and sugar levels. The predictive power of SDEs lies in the choice of parameter values that describe the real data effectively. The classical SDE parameter estimation methods are largely based on likelihood estimation, which is computationally expensive. In this work, we propose a relatively simplified approach based on a deterministic nonlinear optimization method which does not require sampling. The results from the suggested No Sampling Parameter Estimation (NSPE) algorithm for selected examples are compared with the data, with results from the deterministic ODE (ordinary differential equation) model, and with the traditional methods of maximum likelihood estimation (MLE) and generalized method of moments (GMM) for SDEs. The NSPE algorithm is more accurate and reduces computation time significantly when compared to the traditional methods.

1. Introduction

Modeling the behavior of a process or phenomenon in terms of mathematical equations has been a fundamental research area for many scientists working in systems engineering. Ordinary differential equations (ODEs) are often used to model dynamic systems where the behavior of one or more states is a function of an independent variable. For example, time is an independent variable for systems evolving in time. Such ODE models assume that the system is usually driven by internal mechanisms and that the effect of external influences or disturbances is negligible. However, real life systems such as biological processes (Allen, 2010) are bound to be influenced by disturbances, which cannot be adequately modeled by ODEs. Hence, there is a need for better models which can capture the dynamics of such systems effectively. The study of random processes, fluctuations and non-equilibrium states led to the use of stochastic differential equations (SDEs) to model the time evolution of dynamic systems and the associated noise (Van Kampen, 2007).

Usually, the deterministic ODE model consists of certain parameters whose best-fitting values are estimated by using discrete observations from experiments and by applying the least squares minimization technique (Timmer, 2000); such problems are also referred to as inversion problems (Tang et al., 2005). However, the same approach cannot be applied to SDE models since they have additional parameters due to random influences. In SDEs, along with the experimental data, some prior information is essential in estimating the model parameters. Thus, some researchers have described SDEs as one of the hallmarks in grey box modeling for system identification (Costanzo et al., 2013; Duun-Henriksen et al., 2013; Nielsen et al., 2000). It has been argued that the major difficulty in SDE parameter estimation is the inconsistency in the estimates and the computationally expensive methods (Hurn and Lindsay, 1997). This work aims at proposing a novel approach which is computationally efficient and accurate. The readers are encouraged to read the Supplementary information document for essential terminologies and concepts for understanding SDEs and their properties.




1.1. Existing methods for parameter estimation for SDEs

The existing parameter estimation procedures for SDEs are divided into two categories: likelihood based methods and sample DNA matching methods (Jeisman, 2005). The likelihood based methods try to approximate the maximum likelihood function, while the sample DNA matching methods try to match some characteristics or features of the data. The likelihood methods involve the solution of the Fokker–Planck equation (Hurn and Lindsay, 1997), discrete maximum likelihood estimation (MLE) methods (Hurn and Lindsay, 1997; Le Breton, 1976; Picchini, 2007; Picchini and Ditlevsen, 2011), Hermite polynomial expansion methods (Bakshi, 2005) and Markov chain Monte Carlo (Golightly and Wilkinson, 2011; Martin et al., 2012; Mbalawata et al., 2013) methods. The sample DNA matching methods involve the generalized method of moments (GMM) (Andersen and Sørensen, 1996; Kladivko, 2007), characteristic function (Singleton, 2001), and estimating functions (Kessler and Sørensen, 1999; McLeish and Kolkiewicz, 1997).

Some of the earliest work on the estimation of the drift parameter set in SDEs was done by Le Breton (1976). His approach was based on approximate methods where the continuous time likelihood function was discretized, and the diffusion coefficients were assumed to be independent of the system parameters. Dorogovtsev (1976) proved that the conditional least squares method is equivalent to the maximum likelihood estimator (MLE) and used it for computing the drift parameters. Robinson (1977) applied the MLE methods to a discretely observed Ornstein–Uhlenbeck process. Recent work on parameter estimation for such processes is reported in Favetto and Samson (2010). A lot of work has been done on MLE approximation methods, but they remain inaccurate and time consuming because of sampling requirements. McLeish and Kolkiewicz (1997) used higher order Ito-Taylor expansions for generating optimal estimating functions when the likelihood formula is unknown. Kessler and Sørensen (1999) used estimating functions for a specific class of diffusion processes by utilizing the martingale property of the eigenfunctions of the process. The major drawback in earlier MLE methods is that diffusion processes are usually observed at discrete times, which hinders the explicit computation of the likelihood functions.

In this paper, we propose a novel approach named the ‘No Sampling Parameter Estimation (NSPE)’ algorithm, which utilizes deterministic optimization methods for estimating the deterministic parameters and uses observations to obtain the dependent stochastic parameters in an SDE. This approach has been discussed in our previous studies where it has been compared to discrete event simulations (Yenkie et al., 2016) used for modeling gene signaling networks. Additionally, this approach has been beneficial in uncertainty characterization and modeling in batch crystallizations (Yenkie and Diwekar, 2012), batch reactors for biodiesel production (Benavides and Diwekar, 2012) and in-vitro fertilization (Yenkie and Diwekar, 2015). However, the claim that the proposed NSPE algorithm is less time consuming and provides computationally efficient and reproducible results in comparison to the traditional methods requires a comparison with some existing SDE parameter estimation methods. Thus, in this paper we have selected two traditional methods, the simulated discrete MLE (Hurn and Lindsay, 1997; Picchini, 2007) from the first category of ‘likelihood based methods’ and the GMM (Kladivko, 2007) from the second category of ‘sample DNA matching methods’, and compared their results with the parameters predicted by the NSPE algorithm. The detailed methodology for parameter estimation using the traditional methods (simulated discrete MLE and GMM) is presented in the Supplementary information. The methodology of the proposed NSPE algorithm is discussed in Section 2.

2. Methodology

This section discusses the NSPE algorithm in detail and includes the mathematical expressions and formulae required for understanding the rationale and implementation method adopted in this work. The SDE comprises two parts: (i) the drift term or deterministic part and (ii) the diffusion term or stochastic part. The parameters representing the dynamics of the system, which are part of the deterministic model, are called the SDE parameter set (θ) in this work. The additional parameter associated with random influences or disturbances is called the standard deviation (σ). Because of these two parameter sets, the estimation procedure is divided into two levels:

Outer level: the evaluation of the SDE parameter set θ using deterministic optimization methods.

Inner level: the evaluation of the randomness estimator or standard deviation σ, as a function of the SDE parameter set θ, the model equations and the variance of the available data.

The inner level calculations take precedence over the outer level calculations. The idea behind the approach as well as the procedure for standard deviation (σ) estimation is explained in the following equations and supporting text. The deterministic model for a system can be represented in the general form shown in Eq. (1).

$$ \frac{dx_t}{dt} = f(x_t, \theta) \tag{1} $$

Here, $x_t$ is the state variable of the system, t represents the time and f(x, θ) is the function representing the change in variable x with time t. The mathematical model of a system can be a single ODE or a set of ODEs, depending on the number of state variables in the system.

If the deterministic equations do not follow the exact dynamics of the system based on experiments or observations, there will be an error due to model insufficiency. The LHS (left hand side) and RHS (right hand side) values in Eq. (1), computed from observations recorded at discrete time points, will not be equal, and this difference gives rise to the deterministic modeling error ($\mathrm{error}_{det}$) shown in Eq. (2).

$$ \mathrm{error}_{det}\big|_t = \left[ \frac{dx_t}{dt} - f(x_t, \theta) \right] \tag{2} $$

The deterministic model is therefore not able to capture the dynamics of many processes such as cellular level systems, biomedical observations and natural food chains, and we use SDE models to represent such systems. As an example from the literature, Kirchman et al. (2009) studied the influence of the concentration of nutrients in polar water bodies, temperature and climatic changes on microbial and phytoplankton growth. They discussed the influences of temperature, dissolved oxygen, carbon, etc. on the competitive growth of these phytoplankton and bacterial systems. In one of their studies (see Fig. 1), they reported the effect of dissolved organic carbon on the growth rate of bacteria in different water bodies exposed to varying climatic conditions. They fitted the Monod growth function to the data, and from their results we can deduce that the data (discrete markers) still fluctuate considerably around the predicted kinetics of the growth curve (continuous black profile).

Fig. 1 – Bacterial growth as a function of dissolved organic carbon in the four water bodies around polar region. The Monod growth kinetics cannot represent the overall fluctuation in the growth (Kirchman et al., 2009).

Fig. 2 – Normal distribution and the standard deviation estimation information (Normal distribution - Encyclopedia of Mathematics, 2010).

However, if we assume these disturbances in the growth rate to follow a normal distribution, they can be modeled as Ito processes and thus we can represent the bacterial growth in the form of an SDE. Also, from the density function for the Normal or Gaussian distribution (refer Fig. 2), approximately 68% of the measurement deviations can be covered by ±1σ, while almost 95% can be covered by ±2σ, and this coverage can be further increased to about 99.7% by using ±3σ; this is referred to as the three sigma rule in statistics (“Normal distribution - Encyclopedia of Mathematics,” 2010). Using the same idea, we can use the measured data for the evaluation of the standard deviation parameters in the SDEs for numerous examples.
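The coverage figures of the three sigma rule can be checked directly from the normal distribution. The short sketch below is only an illustration (the function name `normal_coverage` is ours, not the paper's); it uses the identity that the probability of a normal variate lying within k standard deviations of its mean is erf(k/√2).

```python
import math


def normal_coverage(k: float) -> float:
    """Probability that a normal variate lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2.0))


# Coverage for the 1, 2 and 3 sigma bands discussed in the text.
coverage = {k: normal_coverage(k) for k in (1, 2, 3)}
for k, p in coverage.items():
    print(f"within +/-{k} sigma: {100 * p:.2f}%")
```

The three values come out near 68.3%, 95.4% and 99.7%, which is why a band of one to three standard deviations around the deterministic prediction is a natural scale for the noise term.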

The stochastic models are an improvement over the deterministic models and hence the modeling error in the stochastic case ($\mathrm{error}_{stoc}$) shall be reduced by a multiple of the standard deviation (σ) as shown in Eq. (3).

$$ \mathrm{error}_{stoc}\big|_t = \mathrm{error}_{det}\big|_t - j\sigma; \quad j \in \{1, 2, 3\} \tag{3} $$

For ease of understanding, the evaluation method for the standard deviation (σ) is illustrated using the simple Brownian form of the Ito SDE (Eq. (4)).

$$ dx = f(x, \theta)\,dt + \sigma \varepsilon_t \sqrt{dt} \tag{4} $$

This equation represents the change in x over a small time interval dt. Thus, the discretized form of Eq. (4) can be written as Eq. (5).

$$ x_{i+1} - x_i = f(x_i, \theta)\,\Delta t + \sigma \varepsilon_t \sqrt{\Delta t} \tag{5} $$

On rearranging the terms in Eq. (5) such that σ is on the LHS of the equation, we get

$$ \sigma = \left[ \frac{x_{i+1} - x_i}{\Delta t} - f(x_i, \theta) \right] \sqrt{\Delta t} \tag{6} $$

Here, we have removed the unit normal distribution term $\varepsilon_t$, since it can vary in unit proportions from positive to negative values, thus nullifying its overall impact on the standard deviation evaluation. The inner level evaluations can be performed by choosing some initial values for the SDE parameter set θ. The values of $x_i$ and $x_{i+1}$ are available from the measurement data at different time points, and the value of $f(x_i, \theta)$ can be computed by substituting the assumed values for θ in the known function.
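As a rough illustration of the inner level, the sketch below evaluates Eq. (6) on every sampling interval of a short series. The text does not state how the per-interval values are combined into a single σ, so pooling them by root mean square is our assumption, as are the function names and the sample numbers.

```python
import math


def sigma_per_interval(x, dt, f, theta):
    """Eq. (6) on each interval: [(x[i+1] - x[i]) / dt - f(x[i], theta)] * sqrt(dt)."""
    return [((x[i + 1] - x[i]) / dt - f(x[i], theta)) * math.sqrt(dt)
            for i in range(len(x) - 1)]


def sigma_estimate(x, dt, f, theta):
    """Pool the per-interval values by root mean square (assumed, not from the paper)."""
    values = sigma_per_interval(x, dt, f, theta)
    return math.sqrt(sum(v * v for v in values) / len(values))


def exp_drift(x, theta):
    """Drift of the exponential growth model, f(x, theta) = r * x."""
    (r,) = theta
    return r * x


# Hypothetical observations at a fixed sampling interval, with a guessed r = 1.
dt = 0.1
x_obs = [1.0, 1.15, 1.2, 1.4, 1.5]
sigma = sigma_estimate(x_obs, dt, exp_drift, (1.0,))
print(sigma)
```

No random numbers are drawn at any point: σ is a deterministic function of the data and of the current guess for θ, which is what allows the outer level to use an ordinary optimizer.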

The outer level computations provide better estimates for θ. The most commonly used objective function for parameter estimation in deterministic models is the sum of squared errors (Eq. (7)).

$$ \mathrm{Obj}_{SSE} = \sum_i \left[ y_{exp_i} - y_{calc_i} \right]^2 \tag{7} $$

Here, $y_{exp}$ are the experimental measurements or observations at distinct time points and $y_{calc}$ are the values projected at the corresponding time points from the deterministic model. The $y_{calc}$ values for SDE models will have the diffusion term or stochastic part in addition to the deterministic part, as shown in Eq. (8).

$$ y_{calc_i} = f(x_i, \theta)\,\Delta t + \sigma \varepsilon_t \sqrt{\Delta t} \tag{8} $$

This results in a modified objective function with an additional term involving the standard deviation estimated earlier from the inner level calculations, as shown in Eq. (9).

$$ \mathrm{Objective} = \sum_i \left[ \left| y_{exp_i} - f(x_i, \theta)\,\Delta t \right| - \sigma \sqrt{\Delta t} \right]^2 \tag{9} $$

Note that the absolute value of the deterministic error is considered. Since the purpose of the stochastic model is to decrease the model prediction error, a term involving the standard deviation is subtracted from the deterministic error. The premise for this development was mentioned earlier in Eq. (3), where the use of multiples of σ in the objective function was suggested. However, in our studies we have not found any major differences in the results on changing this multiplication factor. The proposed NSPE algorithm can be illustrated in the form of a simple flow diagram as shown in Fig. 3.

The advantage of this approach is that the parameter(s) due to randomness is (are) evaluated from the observed data and model, thus providing us with a new objective function which is free of any probabilistic functions. The methods for deterministic optimization can be applied to solve the new objective function for better estimates of the SDE parameters (θ). The steps involved in this method are given below (Yenkie and Diwekar, 2015).

Step #1: Specify initial values of θ.

Step #2: Use Eq. (6) to obtain the value of the standard deviation (σ).

Step #3: Find the value of the modified objective function (Eq. (9)) for the specified value of θ.

Step #4: Find the derivative value and check if this value of the objective function is optimum. If yes, then stop; else find new values of θ using a deterministic nonlinear programming method and go to step #2.
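The four steps can be sketched as a small two-level loop. This is a minimal illustration under stated assumptions, not the authors' implementation: a plain scan over candidate values of r stands in for the nonlinear programming solver of step #4, the observed increment $x_{i+1} - x_i$ is taken as $y_{exp_i}$ in Eq. (9), the per-interval values of Eq. (6) are pooled by root mean square, and all names and data are hypothetical.

```python
import math


def drift(x, theta):
    """Exponential growth drift f(x, theta) = r * x."""
    (r,) = theta
    return r * x


def residuals(x, dt, theta):
    """Observed increment minus deterministic increment on each interval."""
    return [x[i + 1] - x[i] - drift(x[i], theta) * dt for i in range(len(x) - 1)]


def inner_level_sigma(x, dt, theta):
    """Step 2: Eq. (6) per interval, pooled by root mean square (assumed)."""
    per_interval = [res / math.sqrt(dt) for res in residuals(x, dt, theta)]
    return math.sqrt(sum(s * s for s in per_interval) / len(per_interval))


def objective(x, dt, theta, sigma):
    """Step 3: Eq. (9), with the observed increment used as y_exp (assumed)."""
    return sum((abs(res) - sigma * math.sqrt(dt)) ** 2
               for res in residuals(x, dt, theta))


def nspe_scan(x, dt, candidates):
    """Steps 1 to 4, with a scan over candidate r values replacing the NLP solver."""
    best = None
    for r in candidates:
        theta = (r,)
        sigma = inner_level_sigma(x, dt, theta)   # inner level
        value = objective(x, dt, theta, sigma)    # outer level objective
        if best is None or value < best[2]:
            best = (r, sigma, value)
    return best


# Hypothetical noise-free data generated with r = 1, used only as a sanity check.
dt = 0.1
data = [1.0]
for _ in range(20):
    data.append(data[-1] + 1.0 * data[-1] * dt)

candidates = [0.05 * k for k in range(41)]  # r between 0 and 2
r_hat, sigma_hat, obj_hat = nspe_scan(data, dt, candidates)
print(r_hat, sigma_hat, obj_hat)
```

On noise-free data the scan recovers the generating growth rate with a standard deviation that is numerically zero; with noisy observations the minimum is less sharply defined, and a gradient-based solver such as the one named in step #4 would replace the scan.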

3. NSPE implementation examples

In this section we discuss some examples of stochastic systems and apply the proposed NSPE algorithm. We compare the results obtained from the NSPE algorithm with the results from the observed data, the deterministic model, the simulated discrete MLE method and the GMM method for the first three examples.

Fig. 3 – The proposed two level NSPE algorithm for parameter estimation in SDEs.

3.1. Stochastic exponential function

The first example is the stochastic exponential function (Allen, 2010) and the mathematical SDE representation is shown in Eq. (10). It is a linear SDE and resembles the simple Brownian motion type of Ito process. These models are used to describe the population dynamics of any biological species or particulate systems (Randolph, 1988) and can be combined in larger stochastic population models (Wehrly, 2005).

$$ dx_t = r x_t\,dt + \sigma\,dz_t \tag{10} $$

Here, $x_t$ is the species population, r is the combined growth rate (the difference between the birth and death rate of the species), σ is the standard deviation corresponding to the uncertainty in the system and $dz_t$ is the Wiener process increment. Depending upon the population dynamics, r can be positive when there is an increase in the species population and negative when there is a decline.

On comparison with the standard simple Brownian motion form, shown earlier in Eq. (4), the function f(x, θ) is $r x_t$, and hence the parameter vector θ has just one element, r. As per the suggested two level NSPE algorithm, for the evaluations in the inner level, i.e. the standard deviation (σ), we assume some initial value of r. The value of σ depends on the guess for r, the model equation (Eq. (10)) and the observed data. The values of the corresponding terms can be substituted in Eq. (6) in Section 2 for evaluating σ.

Then we proceed to the outer level, where the modified objective function is solved for estimating better values of r. The optimization problem can be solved by using any of the deterministic optimization methods, and in this work we have applied the non-linear programming optimization methods available in MATLAB, such as fmincon.

After estimating the values of σ and r for the exponential function, the next step is to validate whether the predicted parameters perform better than the deterministic model and the existing methods, and whether they match the actual data. Since we are dealing with stochastic models, the result for the system is the expected value of the maximum possible scenarios projected from the solutions of the SDE. We consider the expectation values over 50 scenarios as the final result projected by the stochastic form of the exponential function. The numerical method used for solving the SDE is the Euler–Maruyama scheme (Kloeden and Platen, 1999) (see Supplementary information).
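A minimal sketch of this validation step is given below: Eq. (10) is integrated with the Euler–Maruyama scheme and the paths are averaged over 50 scenarios. The values r = 1 and σ = 0.5 are the actual values reported for this example in Table 1; the initial population, step size, number of steps and random seed are hypothetical choices made only so that the sketch runs.

```python
import math
import random


def euler_maruyama_exponential(x0, r, sigma, dt, n_steps, rng):
    """One sample path of dx = r*x*dt + sigma*dz using Euler-Maruyama steps."""
    path = [x0]
    for _ in range(n_steps):
        x = path[-1]
        dz = rng.gauss(0.0, 1.0) * math.sqrt(dt)  # Wiener increment
        path.append(x + r * x * dt + sigma * dz)
    return path


def mean_path(x0, r, sigma, dt, n_steps, n_scenarios=50, seed=0):
    """Pointwise average of n_scenarios independent sample paths."""
    rng = random.Random(seed)
    paths = [euler_maruyama_exponential(x0, r, sigma, dt, n_steps, rng)
             for _ in range(n_scenarios)]
    return [sum(p[i] for p in paths) / n_scenarios for i in range(n_steps + 1)]


# r and sigma as in Table 1; x0, dt and n_steps are illustrative only.
expected = mean_path(x0=1.0, r=1.0, sigma=0.5, dt=0.01, n_steps=100)
print(expected[-1])
```

Setting σ to zero reduces every path to the explicit Euler solution of the deterministic model, which gives a simple way to check the integrator before the noise term is switched on.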

3.2. Stochastic logistic function

The second example is a non-linear SDE called the stochastic logistic function (Allen, 2010; Yenkie and Diwekar, 2015), shown in Eq. (11).

$$ dx_t = r x_t \left( 1 - \frac{x_t}{K} \right) dt + x_t \sigma\,dz_t \tag{11} $$

Here, $x_t$ is the species population, r and K are the parameters corresponding to the combined birth and death rates (both are second order functions of $x_t$ as reported by Allen and Allen (2003)), σ is the standard deviation corresponding to the uncertainty in the system and $dz_t$ is the Wiener process increment. The equation has a second order function of x in the deterministic part and also depends on x in the stochastic part. Thus, it closely resembles the geometric Brownian form of the Ito process. The SDE parameter set θ has two constants, r and K, to be estimated in the outer level of the proposed approach.

3.3. Stochastic bimodal function

The third example is the stochastic bimodal function (Eq. (12)), again a non-linear SDE, with a third order function of x in the deterministic part.

$$ dx_t = r x_t \left( 1 - \frac{x_t^2}{K} \right) dt + x_t \sigma\,dz_t \tag{12} $$

These types of stochastic functions have been used by Mao et al. (2002) to model the environmental noise in population systems, ensuring that the potential population explosion possible in the deterministic models could be avoided. The inclusion of environmental noise in population models can alter their dynamics significantly: it can avoid population explosions, make the species extinct or make them persistent.
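To make the difference between the logistic and bimodal drift terms concrete, the sketch below codes the deterministic parts of Eqs. (11) and (12) and evaluates them at their nonzero equilibria, x = K and x = √K respectively. The function names are ours, and r = 1, K = 5 are the actual values listed for these examples in the results tables.

```python
import math


def logistic_drift(x, r, K):
    """Deterministic part of Eq. (11): r * x * (1 - x / K)."""
    return r * x * (1.0 - x / K)


def bimodal_drift(x, r, K):
    """Deterministic part of Eq. (12): r * x * (1 - x**2 / K)."""
    return r * x * (1.0 - x ** 2 / K)


def state_dependent_noise(x, sigma):
    """Diffusion coefficient shared by both models: x * sigma."""
    return x * sigma


r, K = 1.0, 5.0
print(logistic_drift(K, r, K))            # drift vanishes at x = K
print(bimodal_drift(math.sqrt(K), r, K))  # drift vanishes at x = sqrt(K), up to rounding
```

In both models the noise amplitude scales with the population itself, so fluctuations shrink as x approaches zero and grow with larger populations.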

Fig. 4 – Results for the example 1, stochastic exponential growth model (comparison between the data, deterministic, MLE, GMM and new NSPE algorithm).

3.4. Tri-trophic food chain model

After studying the population dynamics examples represented as single SDEs depending upon the birth and death rate dynamics and environmental noise, we analyzed a more realistic model involving a set of differential equations. This last example is a combined model of the population dynamics of different species, previously studied by Shastri and Diwekar (2006) for sustainable ecosystem management. They proposed a tri-trophic food chain model (Eqs. (13)–(16)) comprising prey ($x_1$), predator ($x_2$) and super-predator ($x_3$), in which the death rate ($x_4$) of the predator was modeled as a stochastic process following the geometric mean-reverting Ito process discussed earlier in the literature by Diwekar (2013, 2005).

$$ \frac{dx_1}{dt} = x_1 \left[ r \left( 1 - \frac{x_1}{K} \right) - \left( \frac{a_2 x_2}{b_2 + x_1} \right) \right] \tag{13} $$

$$ \frac{dx_2}{dt} = x_2 \left[ e_2 \left( \frac{a_2 x_1}{b_2 + x_1} \right) - \left( \frac{a_3 x_3}{b_3 + x_2} \right) - x_4 \right] \tag{14} $$

$$ \frac{dx_3}{dt} = x_3 \left[ e_3 \left( \frac{a_3 x_2}{b_3 + x_2} \right) - d_3 \right] \tag{15} $$

$$ \frac{dx_4}{dt} = \eta \left( \bar{x}_4 - x_4 \right) + \frac{\sigma \varepsilon}{\sqrt{\Delta t}}\, x_4 \tag{16} $$

Here, r is the prey growth rate, K is the prey carrying capacity, i = 2 for the predator and i = 3 for the super-predator, $a_i$ is the maximum predation rate, $b_i$ is the half saturation constant, $e_i$ is the efficiency, $d_3$ is the mortality rate of the super-predator and η is the mean reverting coefficient. It can be observed that the dynamics are influenced by the logistic models discussed earlier in Section 3.2.
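The structure of Eqs. (13)–(16) can be sketched as a pair of small functions. This is an illustration only: the parameter values in `PARAMS` are hypothetical placeholders (the text does not list them here), `x4_mean` stands for the level to which the death rate reverts, and the stochastic step is Eq. (16) multiplied through by Δt, i.e. a single Euler–Maruyama step.

```python
import math

# Hypothetical parameter values, chosen only so that the sketch runs.
PARAMS = {"r": 1.0, "K": 1.0, "a2": 1.5, "b2": 0.3, "e2": 1.0,
          "a3": 0.1, "b3": 0.5, "e3": 1.0, "d3": 0.01}


def food_chain_rhs(x1, x2, x3, x4, p):
    """Right-hand sides of Eqs. (13)-(15) for prey, predator and super-predator."""
    dx1 = x1 * (p["r"] * (1.0 - x1 / p["K"]) - p["a2"] * x2 / (p["b2"] + x1))
    dx2 = x2 * (p["e2"] * p["a2"] * x1 / (p["b2"] + x1)
                - p["a3"] * x3 / (p["b3"] + x2) - x4)
    dx3 = x3 * (p["e3"] * p["a3"] * x2 / (p["b3"] + x2) - p["d3"])
    return dx1, dx2, dx3


def death_rate_step(x4, x4_mean, eta, sigma, dt, eps):
    """One step of Eq. (16): mean reversion plus noise proportional to x4."""
    return x4 + eta * (x4_mean - x4) * dt + sigma * eps * math.sqrt(dt) * x4


# Prey alone at its carrying capacity: all three rates are zero.
print(food_chain_rhs(1.0, 0.0, 0.0, 0.4, PARAMS))
# Death rate at its mean level with a zero noise draw: unchanged.
print(death_rate_step(0.4, 0.4, eta=0.5, sigma=0.1, dt=0.01, eps=0.0))
```

Only the predator equation sees the stochastic variable directly; the prey and super-predator respond to it through their coupling with $x_2$.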

4. Results and discussion

The data sets for the first three examples were simulated usingsuitable parameter values and these were used as the basisfor comparison with the estimated parameter values fromthe deterministic ODE and stochastic SDE models. We denotethe suitable parameter values selected for these examples as‘actual values’ in the results. The results for the example-1,the exponential growth function, can be seen in Fig. 4. Thediscrete markers are the data points, the dotted curve is theresult from the deterministic model (i.e. considering only thefirst term in the RHS of Eq. (10)), the results from the simu-lated discrete MLE method and the GMM approach are alsoshown along with the results from the NSPE algorithm (solidgreen curve). The results for the three stochastic methods

(MLE, GMM, NSPE) are evaluated by taking an average over50 scenarios obtained on running the stochastic model using

Table 1 – The comparison of parameter values estimated for exdeterministic, MLE, GMM and NSPE algorithm for the stochasti

Method Parameter Actual value

1. Deterministic r 1

� 0.5

2. MLE r 1

� 0.5

3. GMM r 1

� 0.5

4. NSPE r 1

� 0.5

the estimated parameters from the respective methods. Theparameter values evaluated from the deterministic and thethree stochastic cases are summarized in Table 1. This beinga simple linear model, the deterministic and stochastic mod-els capture the dynamics reasonably well and the parametervalues estimated from all the methods are within reasonableaccuracy.

The results from the example 2, the stochastic logisticfunction is shown in Fig. 5 (Yenkie and Diwekar, 2015). Theprofiles follow the same notations and profile patterns asused for example 1. The parameter values estimated fromthe different methods are shown in Table 2. In this model,we can clearly see the effect of including stochasticity in themodel equations. The deviations from the data in case of thedeterministic model are quite significant while, the stochasticmodels do captures the process noise. However, the devia-tions from the data are quite significant for the simulateddiscrete MLE method. The results for the GMM method do notaccount for the disturbances properly and the profile looksvery smooth, not in agreement with the actual data. An expla-nation for this could be the low standard deviation (�) value inthe GMM case. The solid black profile obtained from the pro-posed NSPE algorithm captures the process noise very welland even the parameter values do not deviate significantlyfrom their actual values, suggesting this new method to be animprovement over the existing methods.

Table 1 – The comparison of parameter values estimated for example 1, the stochastic exponential model, using the deterministic, MLE, GMM and NSPE algorithm for the stochastic model against the actual values.

Estimated value    % Deviation from actual value
1.069              6.86%
N.A.               N.A.
1.028              2.8%
0.5221             4.42%
1.1313             13.13%
0.2573             48.6%
1.053              5.3%
0.5295             5.9%

Table 2 – The comparison of parameter values estimated for example 2, the stochastic logistic model, using the deterministic, MLE, GMM and NSPE algorithm for the stochastic model against the actual values.

Method              Parameter    Actual value    Estimated value    % Deviation from actual value
1. Deterministic    r            1               3.095              209.5%
                    K            5               5.2989             5.78%
                    σ            0.5             N.A.               N.A.
2. MLE              r            1               3.6094             261%
                    K            5               7.3079             46.16%
                    σ            0.5             0.5742             14.84%
3. GMM              r            1               1.4446             44.46%
                    K            5               5.3385             6.77%
                    σ            0.5             0.0353             92.95%
4. NSPE             r            1               1.384              38.4%
                    K            5               5.089              1.78%
                    σ            0.5             0.46005            8%

Fig. 5 – Results for the example 2 (Yenkie and Diwekar, 2015), stochastic logistic function (comparison between the data, deterministic, MLE, GMM and new NSPE algorithm).

We see a similar pattern for example 3, the stochastic bimodal function, in Fig. 6. Again, the stochastic model is efficient in capturing the process noise. With more complex models the deviations from the data increase significantly in the deterministic model projections and hence the stochastic component inclusion becomes significant. The simulated discrete MLE method and the proposed NSPE algorithm are more efficient as compared to the GMM approach. The low approximation of the standard deviation parameter (σ) is the reason for the inefficiency of GMM. Our proposed NSPE algorithm provides a simplified methodology for parameter estimation in complex SDEs as compared to the traditional approaches. Another significant aspect is the elimination of the requirement of sampling while estimating (σ) in this novel approach. The estimated parameter values for example 3 are summarized in Table 3.

Table 3 – The comparison of parameter values estimated for example 3, the stochastic bimodal model, using the deterministic, MLE, GMM and NSPE algorithm for the stochastic model against the actual values.

Method              Parameter    Actual value    Estimated value    % Deviation from actual value
1. Deterministic    r            1               2.103              110.3%
                    K            5               6.5041             30.082%
                    σ            0.5             N.A.               N.A.
2. MLE              r            1               2.1358             113.6%
                    K            5               6.546              30.92%
                    σ            0.5             0.8416             68.32%
3. GMM              r            1               0.7539             24.61%
                    K            5               6.37               27.4%
                    σ            0.5             0.1218             75.65%
4. NSPE             r            1               0.9979             0.215%
                    K            5               5.2223             4.446%
                    σ            0.5             0.5836             16.7%

Fig. 6 – Results for the example 3, stochastic bimodal function (comparison between the data, deterministic, MLE, GMM and new NSPE algorithm).

The results from example 4, the tri-trophic food chain model for sustainable ecosystem management, are presented in Table 4, where the estimated parameter values from the NSPE algorithm are compared with the values stated in literature (Shastri and Diwekar, 2006). The profiles for the variation of the prey (x1), predator (x2) and super-predator (x3) with time are shown in Fig. 7. The variation of the stochastic variable, predator death rate (x4), is shown in Fig. 8 in comparison with the actual death rate from literature. However, the parameter value comparison in Table 4 provides a better idea of the prediction accuracy of the method. The percentage error is below 5% for most of the estimated parameters. The highest percentage error (7.9%) is for the prediction of the standard deviation, σ, which is still low. Also, the standard deviation has been evaluated from the data and more measurements at additional time points can provide better estimates, further reducing the percentage error.

Fig. 7 – Variation of the species population in the example 4, the tri-trophic food chain model with time.

Fig. 8 – The stochastic variable, predator death rate variation with time (actual profile from literature and the projected profile from the new NSPE algorithm) in the example 4, the tri-trophic food chain model.

Table 4 – Comparison of parameter values for example 4, the tri-trophic food chain model for sustainable ecosystem management (Shastri and Diwekar, 2006).

Parameter             Values from literature          Estimated values from    % Deviation from
                      (Shastri and Diwekar, 2006)     NSPE algorithm           actual values
σ                     0.1205                          0.1110                   7.88%
r                     1.2                             1.24                     3.33%
K                     710                             693.9                    2.27%
a2                    2                               2.02                     1%
b2                    227.27                          230.60                   1.46%
e2                    1.35                            1.32                     2.22%
a3                    0.1                             0.10                     0%
b3                    250                             259.46                   3.784%
e3                    1.12                            1.16                     3.57%
d3                    0.04                            0.04                     0%
[illegible symbol]    0.3535                          0.3535                   0%
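The percentage deviations in Table 4 can be checked with a few lines of arithmetic. The sketch below transcribes the literature values and NSPE estimates from the table and recomputes |estimate − literature| / |literature| × 100. The dictionary keys and the function name are illustrative choices, and the final table row, whose parameter symbol did not survive extraction, is left out.

```python
# (literature value, NSPE estimate) pairs transcribed from Table 4.
TABLE_4 = {
    "sigma": (0.1205, 0.1110),
    "r": (1.2, 1.24),
    "K": (710.0, 693.9),
    "a2": (2.0, 2.02),
    "b2": (227.27, 230.60),
    "e2": (1.35, 1.32),
    "a3": (0.1, 0.10),
    "b3": (250.0, 259.46),
    "e3": (1.12, 1.16),
    "d3": (0.04, 0.04),
}


def percent_deviation(actual, estimate):
    """Absolute deviation of the estimate from the actual value, in percent."""
    return abs(estimate - actual) / abs(actual) * 100.0


deviations = {name: percent_deviation(a, e) for name, (a, e) in TABLE_4.items()}
worst = max(deviations, key=deviations.get)

for name, dev in deviations.items():
    print(f"{name:>5s}: {dev:6.2f}%")
print("largest deviation:", worst)
```

The recomputed values agree with the tabulated ones: the standard deviation σ carries the largest deviation, and every other listed parameter stays below 5%.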


5. Conclusions

The proposed new NSPE algorithm involves modifications to the deterministic method of parameter estimation and is based on a technique free of sampling while determining the SDE parameter set. The parameter due to random influences is addressed as the standard deviation and is estimated by analyzing the available data and using the Ito form of the SDEs. Thus, the parameter in the diffusion term is evaluated without any optimization or probabilistic methods. The parameters in the drift term are estimated by using inversion methods which involve non-linear programming optimization methods. However, instead of using the least square error minimization as the objective, the objective function is modified to include the previously evaluated parameter due to randomness. The NSPE algorithm is faster since it does not involve the use of sampling techniques. It is simple and gives promising results, which can save a lot of the computation time required in traditional methods. The comparison of the results with two traditional methods, the simulated discrete MLE and GMM, shows the enhanced prediction accuracy of the NSPE algorithm with the increase in complexity of the SDE models. The solutions are reproducible and consistent and hence this method can be used as an alternative for developing stochastic models for biological systems by analyzing the available data or discrete observations.
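A diffusion parameter can be read directly from data, without optimization or sampling, as the following sketch suggests. It is not the NSPE formula itself, which is not restated in this section. It swaps in the standard realized quadratic variation estimator and assumes a stochastic exponential model of the form dX = rX dt + σX dW. For that form, Ito's lemma gives d(ln X) = (r − σ²/2)dt + σ dW, so the sum of squared increments of ln X over a horizon T approaches σ²T on a fine grid. The values r = 1 and σ = 0.5, the grid and the seed are illustrative assumptions, and random numbers are used only to synthesize the data, not in the estimation step.

```python
import math
import random


def simulate_exponential_sde(x0, r, sigma, dt, n_steps, rng):
    """Exact sampling of the ASSUMED model dX = r X dt + sigma X dW on a uniform grid."""
    path = [x0]
    for _ in range(n_steps):
        z = rng.gauss(0.0, 1.0)
        growth = (r - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z
        path.append(path[-1] * math.exp(growth))
    return path


def sigma_from_data(path, dt):
    """Diffusion parameter from the realized quadratic variation of ln X."""
    log_steps = [math.log(b / a) for a, b in zip(path, path[1:])]
    total_time = dt * len(log_steps)
    return math.sqrt(sum(s * s for s in log_steps) / total_time)


# Synthetic "measurements"; all numerical settings here are illustrative.
rng = random.Random(2018)
data = simulate_exponential_sde(x0=1.0, r=1.0, sigma=0.5, dt=0.005, n_steps=2000, rng=rng)

sigma_hat = sigma_from_data(data, dt=0.005)
print(f"sigma estimated directly from the data: {sigma_hat:.3f}")
```

The estimator is a closed-form function of the observations, so repeated runs on the same data return the same value. Its accuracy improves with the number of measurements, which matches the remark that additional time points can provide better estimates of the standard deviation.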

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at https://doi.org/10.1016/j.cherd.2017.11.018.

References

Allen, L.J.S., 2010. An Introduction to Stochastic Processes with Applications to Biology, 2nd ed. Chapman and Hall/CRC, Boca Raton, FL.

Allen, L.J.S., Allen, E.J., 2003. A comparison of three different stochastic population models with regard to persistence time. Theor. Popul. Biol. 64, 439–449, http://dx.doi.org/10.1016/S0040-5809(03)00104-7.

Andersen, T.G., Sørensen, B.E., 1996. GMM estimation of a stochastic volatility model: a Monte Carlo study. J. Bus. Econ. Stat. 14, 328–352, http://dx.doi.org/10.2307/1392446.

Normal distribution – Encyclopedia of Mathematics [WWW Document], 2010. URL https://www.encyclopediaofmath.org/index.php/Normal_distribution (Accessed 1 October 2017).

Bakshi, G.S., 2005. A refinement to Ait-Sahalia’s (2002) maximum likelihood estimation of discretely sampled diffusions: a closed-form approximation approach. J. Bus. 78, 2037–2052.

Benavides, P.T., Diwekar, U., 2012. Optimal control of biodiesel production in a batch reactor: part II: stochastic control. Fuel 94, 218–226, http://dx.doi.org/10.1016/j.fuel.2011.08.033.

Costanzo, G.T., Sossan, F., Marinelli, M., Bacher, P., Madsen, H., 2013. Grey-box modeling for system identification of household refrigerators: a step toward smart appliances. 2013 4th International Youth Conference on Energy (IYCE). Presented at the 2013 4th International Youth Conference on Energy (IYCE), 1–5, http://dx.doi.org/10.1109/IYCE.2013.6604197.

Diwekar, U., 2005. Green process design, industrial ecology, and sustainability: a systems analysis perspective. Resour. Conserv. Recycl. 44, 215–235, http://dx.doi.org/10.1016/j.resconrec.2005.01.007.

Diwekar, U., 2013. Introduction to Applied Optimization. Springer Science & Business Media.

Dorogovtsev, A., 1976. The consistency of an estimate of a parameter of a stochastic differential equation. Theory Probab. Math. Stat. 10, 73–82, http://dx.doi.org/10.1090/S0094-9000-06-00658-2.

Duun-Henriksen, A.K., Schmidt, S., Røge, R.M., Møller, J.B., Nørgaard, K., Jørgensen, J.B., Madsen, H., 2013. Model identification using stochastic differential equation grey-box models in diabetes. J. Diabetes Sci. Technol. 7, 431–440, http://dx.doi.org/10.1177/193229681300700220.

Favetto, B., Samson, A., 2010. Parameter estimation for a bidimensional partially observed Ornstein–Uhlenbeck process with biological application. Scand. J. Stat. 37, 200–220, http://dx.doi.org/10.1111/j.1467-9469.2009.00679.x.

Golightly, A., Wilkinson, D.J., 2011. Bayesian parameter inference for stochastic biochemical network models using particle Markov chain Monte Carlo. Interface Focus, http://dx.doi.org/10.1098/rsfs.2011.0047.

Hurn, A.S., Lindsay, K.A., 1997. Estimating the parameters of stochastic differential equations by Monte Carlo methods. Math. Comput. Simul. 43, 495–501, http://dx.doi.org/10.1016/S0378-4754(97)00037-2.

Jeisman, J., 2005. Estimation of the Parameters of Stochastic Differential Equations (Ph.D. Thesis). Queensland University of Technology, Brisbane, Queensland.

Kessler, M., Sørensen, M., 1999. Estimating equations based on eigenfunctions for a discretely observed diffusion process. Bernoulli 5, 299–314, http://dx.doi.org/10.2307/3318437.

Kirchman, D.L., Morán, X.A.G., Ducklow, H., 2009. Microbial growth in the polar oceans – role of temperature and potential impact of climate change. Nat. Rev. Microbiol. 7, 451–459, http://dx.doi.org/10.1038/nrmicro2115.

Kladivko, K., 2007. The General Method of Moments (GMM) Using MATLAB: The Practical Guide Based on The CKLS Interest Rate Model (IG410067 and EEA/Norwegian FMP). Department of Statistics and Probability Calculus, University of Economics, Prague, Prague.

Kloeden, P.E., Platen, E., 1999. Numerical Solution of Stochastic Differential Equations, 3rd ed. Springer-Verlag, Berlin-Heidelberg-New York.

Le Breton, A., 1976. On continuous and discrete sampling for parameter estimation in diffusion type processes. In: Wets, R.J.-B. (Ed.), Stochastic Systems: Modeling, Identification and Optimization, I, Mathematical Programming Studies. Springer Berlin Heidelberg, pp. 124–144, http://dx.doi.org/10.1007/BFb0120770.

Mao, X., Marion, G., Renshaw, E., 2002. Environmental Brownian noise suppresses explosions in population dynamics. Stoch. Process. Their Appl. 97, 95–110, http://dx.doi.org/10.1016/S0304-4149(01)00126-0.

Martin, J., Wilcox, L., Burstedde, C., Ghattas, O., 2012. A stochastic Newton MCMC method for large-scale statistical inverse problems with application to seismic inversion. SIAM J. Sci. Comput. 34, A1460–A1487, http://dx.doi.org/10.1137/110845598.

Mbalawata, I.S., Särkkä, S., Haario, H., 2013. Parameter estimation in stochastic differential equations with Markov chain Monte Carlo and non-linear Kalman filtering. Comput. Stat. 28, 1195–1223, http://dx.doi.org/10.1007/s00180-012-0352-y.

McLeish, D.L., Kolkiewicz, A.W., 1997. Fitting diffusion models in finance. Lect. Notes Monogr. Ser. 32, 327–350.

Nielsen, J.N., Madsen, H., Young, P.C., 2000. Parameter estimation in stochastic differential equations: an overview. Annu. Rev. Control 24, 83–94, http://dx.doi.org/10.1016/S1367-5788(00)90017-8.

Picchini, U., Ditlevsen, S., 2011. Practical estimation of high dimensional stochastic differential mixed-effects models. Comput. Stat. Data Anal. 55, 1426–1444, http://dx.doi.org/10.1016/j.csda.2010.10.003.

Picchini, U., 2007. SDE Toolbox: Stochastic Differential Equations with MATLAB. sourceforge.net, http://lup.lub.lu.se/record/4216230.

Randolph, A.D., 1988. Theory of Particulate Processes: Analysis and Techniques of Continuous Crystallization. Academic Press.

Robinson, P.M., 1977. Estimation of a time series model from unequally spaced data. Stoch. Process. Their Appl. 6, 9–24, http://dx.doi.org/10.1016/0304-4149(77)90013-8.

Shastri, Y., Diwekar, U., 2006. Sustainable ecosystem management using optimal control theory: part 2 (stochastic systems). J. Theor. Biol. 241, 522–532, http://dx.doi.org/10.1016/j.jtbi.2005.12.013.

Singleton, K., 2001. Estimation of affine asset pricing models using the empirical characteristic function. J. Econom. 102, 111–141.

Tang, W., Zhang, L., Linninger, A.A., Tranter, R.S., Brezinsky, K., 2005. Solving kinetic inversion problems via a physically bounded Gauss–Newton (PGN) method. Ind. Eng. Chem. Res. 44, 3626–3637, http://dx.doi.org/10.1021/ie048872n.

Timmer, J., 2000. Parameter estimation in nonlinear stochastic differential equations. Chaos Solitons Fractals 11, 2571–2578, http://dx.doi.org/10.1016/S0960-0779(00)00015-1.

Van Kampen, N.G., 2007. Stochastic Processes in Physics and Chemistry, 3rd ed. North Holland, Amsterdam, Boston.

Wehrly, T.E., 2005. Introduction to Stochastic Population Models. Department of Statistics, Texas A&M University.

Yenkie, K.M., Diwekar, U., 2012. Stochastic optimal control of seeded batch crystallizer applying the ito process. Ind. Eng. Chem. Res. 52, 108–122.

Yenkie, K.M., Diwekar, U., 2015. Uncertainty in clinical data and stochastic model for in vitro fertilization. J. Theor. Biol. 367, 76–85.

Yenkie, K.M., Diwekar, U.M., Linninger, A.A., 2016. Simulation-free estimation of reaction propensities in cellular reactions and gene signaling networks. Comput. Chem. Eng. 87, 154–163, http://dx.doi.org/10.1016/j.compchemeng.2016.01.010.