computer methods and programs in biomedicine 89 (2008) 112–122

journal homepage: www.intl.elsevierhealth.com/journals/cmpb

A hybrid method for parameter estimation and its application to biomedical systems

Hugo Alonso a,*, Teresa Mendonça a, Paula Rocha b

a Departamento de Matemática Aplicada, Faculdade de Ciências da Universidade do Porto, Rua do Campo Alegre 687, 4169-007 Porto, Portugal
b Departamento de Matemática, Universidade de Aveiro, Campo de Santiago, 3810-193 Aveiro, Portugal

* Corresponding author. Tel.: +351 220 100 864; fax: +351 220 100 809. E-mail addresses: [email protected] (H. Alonso), [email protected] (T. Mendonça), [email protected] (P. Rocha).

Article history: Received 29 December 2006; Received in revised form 11 October 2007; Accepted 29 October 2007

Keywords: Curve fitting algorithm; Artificial neural network; Neuromuscular blockade

Abstract

A general version of a hybrid method for parameter estimation is presented with a theoretical support and an illustrative example of application. This method consists of a curve fitting algorithm that takes the initial estimate of the parameterization from an artificial neural network. The idea is to improve the convergence of the algorithm to the sought parameterization using a close initial estimate. The motivation arises from biomedical problems where one is interested in obtaining a meaningful estimate so that it can be used for both description and prediction purposes. Two strategies are proposed for the application of the hybrid method: one is of general applicability, the other is intended for systems defined by the series connection of various blocks. The feasibility of the method is illustrated with a case study related to the neuromuscular blockade of patients undergoing general anaesthesia.

© 2007 Elsevier Ireland Ltd. All rights reserved. doi:10.1016/j.cmpb.2007.10.014

1. Introduction

System identification undoubtedly plays an important role in the study of biomedical systems. In particular, obtaining accurate models for patient response to a clinical procedure can be crucial. Consider the example of a patient undergoing general anaesthesia where model reference control is used for drug delivery optimization [1]: the safety of the patient obviously depends on the reliability of the control process, but the latter is based on the identification of the patient dynamics. In this context, the most usual models are parametric, derived from the application of biological, physical and chemical laws; in fact, they are preferred to nonparametric models as they can be used not only for prediction but also for description purposes, offering some insight into the underlying biology [2]. The fitting of a parametric model to individual patient data is made by means of estimating meaningful values for the parameters. However, commonly applied curve fitting algorithms often produce meaningless values because of their sensitivity to the initial guess; two examples are the algorithms of Nelder–Mead [3] and Levenberg–Marquardt [4,5]. This need for a good initial guess motivated our study.

In this paper, our focus is on systems Σ of the form

y_k = f(ϕ_k, θ),  (1)

where y_k is the output at time t_k, ϕ_k is a vector of m variables (such as several inputs) measured up to time t_k and θ is a vector of p parameters. We make no other assumption on f than being continuously differentiable. Our goal is to estimate θ from the collected data (y_k, ϕ_k)_{k=1,…,n}, with n ≥ p. Here, we introduce a general version of a method for parameter estimation and propose two different strategies for its application to the problem of estimating θ. This method is hybrid in the sense that it combines two different types of procedures for


parameter estimation: an iterative curve fitting algorithm and a noniterative artificial neural network. The idea is as follows. Given a system Σ of the form (1), assume that a set T of possible realizations (y, ϕ) and corresponding parameterizations θ is a priori known. Then, an unknown parameterization associated with a new realization can be estimated by curve fitting algorithms that take as initial estimate the vector of parameters corresponding to the "closest" realization in the set. However, even if this set includes many cases, it is common that these algorithms diverge or converge very slowly (and frequently to a local optimum), simply because the initial estimate is not "good enough". This problem can be overcome if the initial parameter estimate is taken not from the set T but rather from a neural network defined by T, i.e., implementing an approximating function of the map between the spaces of realizations and of parameterizations. The hybrid method here proposed consists of a curve fitting algorithm that takes the initial estimate from the output of such a neural network. There are at least two main advantages in taking the initial estimate as proposed, and both follow from the ability of the network to extrapolate the set T and thus to generate, in principle, a "better" initial estimate in a new case. First, there is a potential increase in the chance that the curve fitting algorithm converges and that it converges to a parameterization of interest. Second, there is a potential decrease in the number of iterations and function evaluations by the curve fitting algorithm in the case of convergence. These advantages make our hybrid method suitable for real time estimation.

Other hybrid methods, such as the Continuous Tabu Simplex Search [6], were recently proposed in the literature. However, they differ from the one that we propose here mainly in the following: first, many of them try to find a "promising" region where a point of interest is supposed to be, instead of a "promising" guess of that point as in our method; second, and more importantly, if there is some prior knowledge about the system they do not exploit it, contrary to our method, where a suitable neural network is used to represent and extrapolate it.

A preliminary version of our hybrid method was considered in [7,8]. The idea behind it was first explored in [7], where we proposed the method as the combination between the curve fitting algorithm of Levenberg–Marquardt and a neural network. There, only single-input, single-output systems whose input is a known function of time and whose output is a parametric function of time were considered. In [8], we focused our attention on systems that are also defined by the series connection of various blocks, and proposed a different strategy for the application of the hybrid method. This strategy consists in first estimating all the parameters, and then improving their estimates using the inner signals recovered backwards from the output. The case study in [7,8] concerns the neuromuscular blockade model. In both papers, the set T of a priori information contained thousands of cases and no noise was considered in the simulations.

In this paper, we present the hybrid method as the combination between any curve fitting algorithm and a neural network. Here, we propose two algorithms: Nelder–Mead and Levenberg–Marquardt. With respect to the system, we consider a more general structure (1), where there can be more than one input, where the inputs are not necessarily known functions of time and where the output is a parametric function of (possibly) other variables than time, such as past and present inputs and past outputs. Moreover, although we only take a single output in (1), we show throughout the paper how to extend the application of the hybrid method to systems with multiple outputs. The implementation strategy previously considered for systems defined by the series connection of various blocks is also adapted and presented. In practical terms, we apply both strategies of the hybrid method to the same case study as before, but this time under more restrictive and realistic conditions. In particular, we only take from ten to one hundred cases in the set T with a priori information used to define the network, and we corrupt the observations with noise to simulate real data.

The remainder of the paper is organized as follows. In Section 2, we present theoretical results concerning the identifiability of a system of the form (1). Section 3 describes how to define and apply a neural network to parameter estimation. In Section 4, we briefly introduce two optimization algorithms for curve fitting. Any of these can be combined with a network in the hybrid method, which is presented in Section 5 with two implementation strategies. The case study is given in Section 6. We end the paper with the conclusions in Section 7.

2. Identifiability

Consider a system Σ of the form (1) for which data collection yields a data set D = {(y*_k, ϕ*_k)_{k=1,…,n} | ∃ θ* : y*_k = f(ϕ*_k, θ*), k = 1, …, n}, where n ≥ p. The identifiability of Σ at θ* from the data set D means that θ* can be uniquely determined from D, i.e., that the following holds:

f(ϕ*_k, θ) = f(ϕ*_k, θ*), k = 1, …, n  ⇒  θ = θ*.

Here, we present necessary and sufficient conditions for the local identifiability of Σ, i.e., for the identifiability of Σ in a neighbourhood of θ*.

Let η be a vector of r variables for which there are functions ψ_k : η ↦ ϕ_k, k = 1, …, n. Then, the collected data can be lumped in a system of equations

y = F(η, θ),  (2)

where y = (y_1, …, y_n)^T and F : (η, θ) ↦ (F_1(η, θ), …, F_n(η, θ))^T is such that F_k(η, θ) = f(ψ_k(η), θ), k = 1, …, n.

Example 1. Consider a system determined by

y_k = a y_{k−2} + b y_{k−1} + u_k,  (3)

where y denotes the output, u the input and a, b are parameters. Start by noting that (3) can be cast in the form (1), for instance with ϕ_k = (y_{k−2}, y_{k−1}, u_k)^T and θ = (a, b)^T. Now, assume that the collected data are (y_k, ϕ_k)_{k=1,2,3}. A naive approach would be to take η = (y_{−1}, y_0, y_1, y_2, u_1, u_2, u_3)^T with ψ_k(η) = (η_k, η_{k+1}, η_{k+4})^T, k = 1, 2, 3. However, minding the recurrence of (3), it suffices to take η = (y_{−1}, y_0, u_1, u_2, u_3)^T with ψ_1(η) = (η_1, η_2, η_3)^T and ψ_k(η) = (ψ_{k−1}(η)_2, θ_1 ψ_{k−1}(η)_1 + θ_2 ψ_{k−1}(η)_2 + ψ_{k−1}(η)_3, η_{k+2})^T, k = 2, 3.


The next proposition establishes a necessary and sufficient condition for the local identifiability of Σ at θ* from the data (2).

Proposition 2. Let (y*, η*, θ*) be a solution of (2) for which the n × p matrix

J_{F,θ}(η*, θ*) = (∂F_i/∂θ_j (η*, θ*))_{i=1,…,n, j=1,…,p}  (4)

has full column rank. Then, there exist neighborhoods Y of y*, H of η* and Θ of θ* such that there is a unique continuously differentiable function G : Y × H → Θ that satisfies y = F(η, G(y, η)); moreover, if (y, η, θ) ∈ Y × H × Θ is such that y = F(η, θ), then

G(y, η) = θ.

Reciprocally, if G exists, then J_{F,θ}(η*, θ*) has full column rank.

Proof. See Appendix A. □
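For the system of Example 1, the rank condition (4) is easy to check numerically with a finite-difference Jacobian; the test values below are our own illustrative choices:

```python
import numpy as np

def F(eta, theta):
    """F(eta, theta) = (y_1, y_2, y_3) for y_k = a*y_{k-2} + b*y_{k-1} + u_k,
    with eta = (y_{-1}, y_0, u_1, u_2, u_3) and theta = (a, b)."""
    a, b = theta
    y = [eta[0], eta[1]]
    for uk in eta[2:]:
        y.append(a * y[-2] + b * y[-1] + uk)
    return np.array(y[2:])

def jacobian_theta(eta, theta, h=1e-6):
    """Central finite-difference approximation of the n x p matrix J_{F,theta}."""
    p = len(theta)
    cols = []
    for j in range(p):
        dt = np.zeros(p); dt[j] = h
        cols.append((F(eta, theta + dt) - F(eta, theta - dt)) / (2 * h))
    return np.column_stack(cols)

eta = np.array([1.0, 2.0, 0.1, 0.2, 0.3])
theta = np.array([0.5, -0.3])
J = jacobian_theta(eta, theta)           # 3 x 2 matrix
assert np.linalg.matrix_rank(J) == 2     # full column rank: theta locally identifiable
```

Since y_1 = a y_{−1} + b y_0 + u_1, the first row of J is exactly (y_{−1}, y_0) = (1, 2), and full column rank follows from the independence of the remaining rows.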

Despite its importance, this (existence) result does not provide any further information on how to obtain the function G. In this context, a neural network can be used to implement an approximating function of G, given a set of known observations and corresponding parameterizations

T_G = {(y^(s), η^(s), θ^(s))_{s=1,…,N} | G(y^(s), η^(s)) = θ^(s), s = 1, …, N}.  (5)

The computational time needed to find a network implementing the best approximating function (with respect to T_G and a measure of closeness) increases with the amount of data (mind that we may need to take a large n in order to guarantee that J_{F,θ}(η*, θ*) has full column rank and hence that Σ is identifiable at θ*). This is why it is common in practice to take a compressed form of the observations, namely of y, using for instance Principal Component Analysis [9]. In this case, if

S_y = (1/(N − 1)) ∑_{s=1}^{N} (y^(s) − ȳ)(y^(s) − ȳ)^T

represents the n × n sample covariance matrix, where ȳ = (1/N) ∑_{s=1}^{N} y^(s), then the vector of the first p principal components of y is

z_y = A_y^T (y − ȳ),

where A_y is an n × p matrix whose orthonormal columns are eigenvectors corresponding to the p largest eigenvalues of S_y. In this way, from (2) we obtain

z_y = H(η, θ)  (6)

with H(η, θ) = A_y^T (F(η, θ) − ȳ). Applying Proposition 2 to the compressed data (6), we conclude that Σ is locally identifiable at θ* from (6) if and only if

J_{H,θ}(η*, θ*) = (∂H_i/∂θ_j (η*, θ*))_{i=1,…,p, j=1,…,p}

has full column rank. This means that there is a unique continuously differentiable function J : Z × H → Θ such that

J(z_y, η) = θ.

Note that J_{H,θ} = A_y^T J_{F,θ}; therefore, if we know through J_{F,θ} that G exists, it is easy to verify the existence of J. If this is the case, we can set up a neural network to implement instead an approximating function of J, given the set

T_J = {(z_y^(s), η^(s), θ^(s))_{s=1,…,N} | J(z_y^(s), η^(s)) = θ^(s), s = 1, …, N}

obtained from T_G. In this way, less computational time is expected to find a suitable network.
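The compression step above can be sketched as follows; the synthetic training outputs merely stand in for the realizations collected in T_G, and the dimensions are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n, N, p = 31, 100, 6          # samples per curve, training curves, parameters

# Synthetic training outputs y^(s), stand-ins for the realizations in T_G
Y = rng.standard_normal((N, n))

ybar = Y.mean(axis=0)
S_y = (Y - ybar).T @ (Y - ybar) / (N - 1)   # n x n sample covariance matrix

# Orthonormal eigenvectors of the p largest eigenvalues of S_y
eigval, eigvec = np.linalg.eigh(S_y)        # eigenvalues in ascending order
A_y = eigvec[:, -p:]                        # n x p matrix A_y

def compress(y):
    """z_y = A_y^T (y - ybar): the first p principal components of y."""
    return A_y.T @ (y - ybar)

z = compress(Y[0])
assert z.shape == (p,)
assert np.allclose(A_y.T @ A_y, np.eye(p), atol=1e-10)  # orthonormal columns
```

Each network input is thus reduced from n = 31 to p = 6 values, which is what makes the subsequent architecture search and training cheaper.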

The next proposition relates the system identifiability from original and compressed data, pointing out that data compression may lead to the loss of identifiability.

Proposition 3. With the previous notation, G exists if J does. The converse is false.

Proof. See Appendix A. □

The extension of the theory here presented to the case of a multiple-output system is straightforward: for each extra output, add the corresponding n equations to (2) and make the necessary changes in the remaining developments.

3. On the use of neural networks for parameter estimation

In the previous section, the use of artificial neural networks was suggested for function approximation, namely to approximate G and J. Here, we shall focus on one of these functions since the developments can be easily adapted for the other. Note that if a network implements an approximating function N of, let us say, G, then it can be viewed as a parameter estimation tool: once the data (y, η) are collected and fed into the network, an estimate N(y, η) = θ̂ of G(y, η) = θ is obtained. Several types of neural networks can be considered. In the following, we give a brief overview of the relevant concepts concerning the class of multilayer feedforward networks, which will be used in this paper (for more details, see, for instance [10]). Moreover, we point out how to use the limited knowledge about G represented by the set T_G above in the search for a network implementing the best approximating function.

A neural network is an intricate set of neurons. Each neuron, the neural network basic element, is a unit that receives, processes and transmits data. In mathematical terms, it implements a function x ↦ φ(w_0 + ∑_i w_i x_i), where φ is usually the logistic sigmoid or the hyperbolic tangent, x is the vector of inputs, w_i is a scalar weighing the i-th input and w_0 is the bias. In a multilayer feedforward network, the neurons are disjointly split into ordered layers. Each neuron in one layer is only connected to all the neurons in the next one. Moreover, the neurons in the first/last layer are known as input/output neurons, whereas the other ones are called hidden or intermediate neurons. Assuming that the input neurons do not process data, the overall function N : x ↦ (N_1(x), …, N_q(x))^T

implemented by such a network is defined by

N_ℓ(x) = φ(w_{ℓ0} + ∑_k w_{ℓk} φ(… φ(w_{j0} + ∑_i w_{ji} x_i) …)),  ℓ = 1, …, q,



where N_ℓ is the function corresponding to the ℓ-th output neuron, w_{ji} weighs the connection from the i-th to the j-th neurons and w_{j0} is the bias of the j-th neuron. The class of multilayer feedforward networks satisfies the so-called universal approximation property [11], which states that every function in C^c(R^u), together with its derivatives up to order c, can be approximately implemented by a neural network in a compact subset of R^u up to an arbitrary level of accuracy. However, given that it is an existence result, this provides no information on the number of hidden layers and hidden neurons per layer that should be considered in order to achieve the desired accuracy. In practice, the choice of a suitable network architecture can be done by applying methods of model selection, such as cross-validation (see, for instance, [12]).

In the context of parameter estimation, recall that our goal is to find a network implementing the best approximating function N of G. This network should be such that the number of input neurons is n + r (given that G is a function of y and η) and the number of output neurons is p (since there are p parameters in θ = G(y, η) to estimate). Based on the limited knowledge about G available through the set T_G (5), the number of hidden layers and hidden neurons can be chosen from a pre-specified finite set of candidates by using the aforementioned methods of model selection. Once the architecture considered to be the best is selected, the corresponding network is trained, i.e., we obtain a set of values for the network weights w by solving the optimization problem

min_w ∑_{(y,η,θ) ∈ T_G} ‖(G − N)(y, η)‖² = min_w ∑_{(y,η,θ) ∈ T_G} ‖θ − N(y, η)‖².  (7)

After this off-line procedure, carried out only once, the resulting network implementing the function N is a ready-to-use parameter estimation tool; hence, it can be applied to new data (y, η) in order to obtain an estimate N(y, η) = θ̂ of the unknown parameterization G(y, η) = θ.

Finally, note that in the case of a multiple-output system, the complexity of the function that the network approximates is in principle greater, and thus the number of suitable hidden layers and neurons is expected to increase.
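As a concrete toy illustration of the training problem (7), the sketch below fits a one-hidden-layer feedforward network mapping outputs y to parameters θ by full-batch gradient descent; the curve family, network size and learning rate are our own illustrative assumptions, standing in for a real T_G and a tuned training routine:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy family y(t) = th1 * exp(-th2 * t) sampled at 5 instants: a stand-in T_G
t = np.linspace(0.0, 2.0, 5)
N = 200
theta = np.column_stack([rng.uniform(0.5, 2.0, N), rng.uniform(0.2, 1.5, N)])
Y = theta[:, :1] * np.exp(-theta[:, 1:] * t)        # N x 5 network inputs

# One-hidden-layer network N(y) ~ G(y) with tanh hidden units
H = 16
W1 = rng.standard_normal((5, H)) * 0.5; b1 = np.zeros(H)
W2 = rng.standard_normal((H, 2)) * 0.5; b2 = np.zeros(2)

def forward(X):
    A = np.tanh(X @ W1 + b1)
    return A, A @ W2 + b2

_, out0 = forward(Y)
loss0 = np.mean((out0 - theta) ** 2)

lr = 0.05
for _ in range(3000):
    A, out = forward(Y)
    err = (out - theta) / N                  # gradient of the loss, up to a constant
    gW2 = A.T @ err; gb2 = err.sum(axis=0)
    back = (err @ W2.T) * (1 - A ** 2)       # backpropagate through tanh
    gW1 = Y.T @ back; gb1 = back.sum(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

_, out1 = forward(Y)
loss1 = np.mean((out1 - theta) ** 2)
assert loss1 < 0.5 * loss0      # training reduced the fitting error in (7)
```

The trained weights play the role of the solution of (7); in the paper's setting the architecture itself would additionally be chosen by model selection.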

4. Optimization algorithms for curve fitting

In the context of parameter estimation, an alternative to the use of artificial neural networks is the application of optimization algorithms to curve fitting, i.e., to solve a problem such as

min_θ E(θ) ≡ min_θ ‖y − F(η, θ)‖².  (8)

This problem has a locally unique global minimizer if G in Proposition 2 exists. Note that we could have considered min_θ ‖z_y − H(η, θ)‖² and made a similar comment under the existence of J.

There are several types of optimization algorithms that can be applied to solve the problem above. Those that use only the values of the objective function E are called direct search methods, and those that use the values of its derivatives up to order k, with k ≥ 1, are said to be k-th order methods.


In principle, the efficiency of an algorithm in seeking a global minimizer increases with the amount of information used on E. However, there are many situations in practice where the derivatives of E are unavailable, both explicitly and implicitly. In these cases, neither the local uniqueness of a global minimizer can be established through Proposition 2, nor can k-th order methods be applied. Furthermore, even if the derivatives of E are available, their computation can be very expensive in computational terms, in which case direct search methods should be applied. In this paper, we consider two popular, widely used optimization algorithms: Nelder–Mead [3] and Levenberg–Marquardt [4,5]. They are briefly described in what follows.

The Nelder–Mead simplex algorithm [3] was introduced in 1965, but the first theoretical analysis of its convergence properties was only presented in 1998 by Lagarias et al. [13]. It is a direct search method designed to minimize general multivariate functions. This is done by generating a sequence of simplexes in such a way that from one iteration to the next the values of the objective function at the simplex vertices satisfy some descent condition; hence, at some point in time, we take the "best" vertex of the current simplex as an approximate solution to the minimization problem.

The Levenberg–Marquardt algorithm was introduced by Levenberg [4] and later rediscovered by Marquardt [5]. It is a first order method designed to solve least-squares problems. This is done by first taking an approximation to the objective function around the current estimate of a solution to the problem; depending on the quality of this approximation, a parameter is adjusted so that the search direction of the algorithm is between those determined by the Gauss–Newton method and the steepest descent method. Then, the objective function is minimized along the search direction by applying a line search algorithm (see, for instance [14]). These two steps make up an iteration, and the algorithm iterates until the satisfaction of a stopping condition.

Before moving on, a remark should be made about the case of a multiple-output system. In that case, multi-objective optimization algorithms should be applied to find a single parameterization for all the curve fittings, one per system output (see, for instance [15] for a review on a class of those algorithms known as evolutionary).
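The Levenberg–Marquardt iteration described above can be sketched in a few lines; the exponential model and all numerical values below are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def model(t, th):
    return th[0] * np.exp(-th[1] * t)

def lm_fit(t, y, th0, n_iter=50, lam=1e-3):
    """Minimal Levenberg-Marquardt loop for min_th ||y - model(t, th)||^2."""
    th = np.array(th0, dtype=float)
    for _ in range(n_iter):
        r = y - model(t, th)
        e = np.exp(-th[1] * t)
        # Jacobian of the residuals with respect to (th1, th2)
        J = np.column_stack([-e, th[0] * t * e])
        step = np.linalg.solve(J.T @ J + lam * np.eye(2), -J.T @ r)
        if np.sum((y - model(t, th + step)) ** 2) < np.sum(r ** 2):
            th += step
            lam *= 0.5          # good step: move towards Gauss-Newton
        else:
            lam *= 10.0         # bad step: move towards steepest descent
    return th

t = np.linspace(0.0, 2.0, 30)
theta_true = np.array([1.5, 0.8])
y = model(t, theta_true)

theta_hat = lm_fit(t, y, th0=[1.0, 0.3])   # reasonable initial guess
assert np.allclose(theta_hat, theta_true, atol=1e-5)
```

With a sufficiently poor initial guess the same loop can stall in a local minimum or reject every step, which is precisely the sensitivity that the neural-network initialization of the hybrid method is meant to mitigate.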

5. The hybrid method

In the two previous sections, we presented two different approaches to parameter estimation, namely the use of artificial neural networks and curve fitting algorithms. This was done in the usual way of looking at each of them as an alternative to the other. Here, we propose a hybrid method, where they are rather seen as complements to each other. Next, two application strategies are described.

The first implementation strategy of the hybrid method, named HM1, consists in: first of all, knowing whether the system is identifiable; then, in the affirmative case, defining a neural network that acts as a parametric estimator; and finally, applying a curve fitting algorithm that takes the initial estimate from the network. The algorithmic description is:


(1) Use Proposition 2 to find whether the system (1) is identifiable, i.e., if there is G such that θ = G(y, η). If not, take a larger number of observations and start over again. If this is not possible or does not work, give a suitable value to the least important parameter and repeat this step from the beginning.
(2) Assuming that G exists, gather all a priori information about it in the set T_G (5).
(3) Choose the best architecture from a finite set of candidates for a network implementing an approximating function of G. This can be done using T_G and a method of model selection.
(4) Based on T_G, train a network having the best architecture chosen in step 3, i.e., solve the optimization problem (7). Let N be the function implemented by the trained network.
(5) For a new realization of the system do:
(a) collect the new data y, η;
(b) use the network to generate the estimate θ̂ = N(y, η);
(c) apply an optimization algorithm to the curve fitting problem (8) with the aim of improving the network estimate θ̂, which is taken by the algorithm as the initial guess of the unknown parameterization θ = G(y, η).
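The steps above can be sketched end to end on a toy problem. In the sketch below (all names and values are ours), a linear least-squares regressor trained on a simulated T_G stands in for the neural network of steps 2–4, and a Gauss–Newton loop plays the role of the curve fitting algorithm of step 5(c):

```python
import numpy as np

rng = np.random.default_rng(2)
t = np.linspace(0.0, 2.0, 20)

def model(th):
    return th[0] * np.exp(-th[1] * t)

# Steps 2-4: a priori set T_G and an estimator trained on it (a linear
# least-squares regressor y -> theta, standing in for the trained network)
N = 300
Theta = np.column_stack([rng.uniform(0.5, 2.0, N), rng.uniform(0.2, 1.5, N)])
Y = np.array([model(th) for th in Theta])
X = np.column_stack([Y, np.ones(N)])                 # add a bias column
W, *_ = np.linalg.lstsq(X, Theta, rcond=None)

def net(y):
    return np.append(y, 1.0) @ W

# Step 5: new realization -> initial estimate from the "network" -> curve fit
theta_true = np.array([1.3, 0.9])
y_new = model(theta_true)
th = net(y_new)                                      # (b) initial estimate
for _ in range(30):                                  # (c) Gauss-Newton refinement
    r = y_new - model(th)
    e = np.exp(-th[1] * t)
    J = np.column_stack([-e, th[0] * t * e])
    th = th + np.linalg.solve(J.T @ J + 1e-9 * np.eye(2), -J.T @ r)

assert np.linalg.norm(th - theta_true) < 1e-6        # refined estimate
assert np.linalg.norm(net(y_new) - theta_true) > np.linalg.norm(th - theta_true)
```

The final assertions illustrate the division of labour in HM1: the learned estimator alone is only approximately right, but it places the curve fitting algorithm close enough to converge to the sought parameterization.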

Note that there are more possibilities for the algorithmic description of HM1, which can be obtained if we consider J instead of G and the curve fitting problem min_θ ‖z_y − H(η, θ)‖² instead of (8).

The second implementation strategy of the hybrid method,

named HM2, assumes that the system (1) can be decomposed as the series connection of various blocks, in particular that it can be cast in the form

y^1_k = f_1(ϕ^1_k, θ^1),   y^ℓ_k = f_ℓ(y^{ℓ−1}_k, ϕ^ℓ_k, θ^ℓ),   ℓ = 2, …, b,

where the ℓ-th block has its structure defined by f_ℓ and is parameterized in the p_ℓ parameters in θ^ℓ, y^ℓ_k is the output and ϕ^ℓ_k is a vector of m_ℓ variables measured up to time t_k. This strategy also assumes that y^b_k and ϕ^ℓ_k, ℓ = 1, …, b are available, contrary to the inner signals y^ℓ_k, ℓ = 1, …, b − 1; otherwise HM1 could be individually applied to each of the subsystems. Finally, this strategy assumes that f_ℓ, ℓ = 2, …, b is invertible in y^{ℓ−1}_k so that we can recover this signal using y^ℓ_k, ϕ^ℓ_k, θ^ℓ. The strategy HM2 consists in applying HM1 to estimate all the parameters in a first step, and then improving their estimates using the inner signals recovered backwards from the output. The algorithmic description is:

(1) Estimate θ^1, …, θ^b by applying HM1 with y^b_k and ϕ^ℓ_k, ℓ = 1, …, b.
(2) For ℓ = b, …, 2 do:
(a) Recover y^{ℓ−1}_k using y^ℓ_k, ϕ^ℓ_k and the HM1 estimate of θ^ℓ;
(b) Improve the estimates of θ^1, …, θ^{ℓ−1} by applying HM1 with y^{ℓ−1}_k and ϕ^j_k, j = 1, …, ℓ − 1.

Fig. 1 – Block diagram of the neuromuscular blockade model for atracurium, where u(·) [μg kg⁻¹ min⁻¹] is the drug infusion rate, c_e(·) [μg ml⁻¹] the effect concentration, r(·) (%) the neuromuscular blockade level and a_i [kg ml⁻¹], λ_i [min⁻¹], k_e0 [min⁻¹], C50 [μg ml⁻¹], β patient-dependent parameters.

Note that there are more possibilities for the algorithmic description of HM2, following those of HM1.

Now, if HM2 is used, then in principle only the estimate of θ^b coincides with the one obtained if HM1 were applied to the system as a whole. Note that HM2 applies HM1 as many times as the number b of blocks, using each time a different network and making a different curve fitting. Although this demands a higher computational burden than the first implementation strategy, it can pay off with better estimates of the parameters.

Before ending this section, note that although both application strategies were presented assuming a single-output system in HM1 and single-output subsystems in HM2, it is possible to extend each of the methods combined in the hybrid method to the case of a multiple-output system, as pointed out in our remarks in Sections 3 and 4.

6. Case study: the neuromuscular blockade model

One of the anaesthetic agents usually administered to a patient undergoing general anaesthesia for surgery is the muscle relaxant or neuromuscular blocker. It is used to disable muscle contraction, the resulting paralysis being required to facilitate tracheal intubation and to maintain "ideal" surgical conditions. The level of muscle relaxation is measured from an evoked electrical response at the hand, namely the response of the adductor pollicis muscle to supramaximal train-of-four stimulation of the ulnar nerve. In a clinical environment, the measurement of the neuromuscular blockade level corresponds to the first single response calibrated by a reference twitch, obtained by defining a supramaximal stimulation current (for more details, see, for instance [16]). In the case of an intravenous administration of the muscle relaxant atracurium, the dynamic response of the neuromuscular blockade may be modelled as shown in Fig. 1 [1]. In this model, the drug infusion rate u(·) [μg kg⁻¹ min⁻¹] is related to the effect concentration c_e(·) [μg ml⁻¹] by a linear dynamic pharmacokinetic-pharmacodynamic model, and the latter to the level r(·) (%) of the neuromuscular blockade by a nonlinear static Hill equation [17]. The variable r(·) is normalized between 0% and 100%, 0% corresponding to full paralysis and 100% to full muscular activity. In practice, r(·) is available, contrary to the inner signal c_e(·).

[Fig. 1 – Model of the neuromuscular blockade response to atracurium, where u(·) [μg kg−1 min−1] is the drug infusion rate, r(·) the neuromuscular blockade level, and a_i [kg ml−1], λ_i [min−1], k_e0 [min−1], C50 [μg ml−1] and β the patient-dependent parameters.]

In the beginning of a surgery, it is usual to administer a bolus of atracurium, i.e., a single dose injected over a short period of time, with the purpose of inducing a rapid decrease in the neuromuscular blockade level. The typical bolus of 500 μg kg−1 administered at t = 0 min can be represented in mathematical terms by

u(t) = 500 δ(t) μg kg−1,

where δ is the continuous-time Dirac delta function (in practice, we approximate δ by a difference between two steps). The corresponding effect concentration is

c_e(t) = 500 Σ_{i=1,2} [a_i k_e0/(k_e0 − λ_i)] (e^{−λ_i t} − e^{−k_e0 t}) μg ml−1,

where a_i [kg ml−1], λ_i [min−1] and k_e0 [min−1] are patient-dependent parameters, and hence the level of the neuromuscular blockade is given by

r(t) = 100 / [1 + ((500/C50) Σ_{i=1,2} a_i (k_e0/(k_e0 − λ_i)) (e^{−λ_i t} − e^{−k_e0 t}))^β] %,   (9)

where C50 [μg ml−1] and β are also patient-dependent parameters (the meaning of all parameters can be found in [17]). Note that (9) can be cast in the form (1) by letting θ = (a1, λ1, a2, λ2, k_e0, C50, β)^T, φ_k = t_k and y_k = r(t_k). Our goal is to estimate θ from data collected in the induction phase, where no further drug infusion is administered. The induction phase typically corresponds to the first 10 min and r(·) is sampled every 20 s, so this means that we collect a maximum of 31 samples of data corresponding to {(r(t_k), t_k)}_{k=1,...,31}, where t_k = (k − 1)/3 min.

The first issue to consider in the application of the hybrid method is to know whether the system is identifiable, i.e., whether θ can be uniquely determined from the collected data. Using Proposition 2, it was found that this is not the case, even for a larger amount of data, that is, more than 10 min or 31 samples of data. Moreover, it turns out that it suffices to fix C50, the least important parameter according to a sensitivity analysis [18], for the system parameterized in θ = (a1, λ1, a2, λ2, k_e0, β)^T to become identifiable using only 2 min or 7 samples of data. Finally, if the original data are compressed using principal component analysis, the system remains identifiable. In this context, we investigate whether there is a difference in the practical performance of the hybrid method if, instead of the 2 min that our theoretical results guarantee to be sufficient to identify the system, we consider 5 min or 16 samples of data and 10 min or 31 samples of data.
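The full-column-rank condition behind this identifiability test can also be probed numerically for a candidate model by building a finite-difference Jacobian of the outputs with respect to the parameters. A minimal sketch (the helper below is ours, not from the paper, and the toy models are illustrative):

```python
import numpy as np

def locally_identifiable(model, theta, t, eps=1e-6):
    """Numerically check the full-column-rank Jacobian condition for local
    identifiability of theta at a nominal parameterization (cf. Proposition 2)."""
    y0 = model(t, theta)
    # forward-difference Jacobian, one column per parameter
    J = np.column_stack([(model(t, theta + eps * np.eye(len(theta))[j]) - y0) / eps
                         for j in range(len(theta))])
    return np.linalg.matrix_rank(J) == len(theta)
```

For instance, y = θ1 e^{−θ2 t} passes the check at a generic nominal point, while y = (θ1 + θ2) t fails it, since only the sum of the two parameters is visible in the data.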

The next step in the application of the hybrid method consists in gathering the a priori information that is later used to set up the necessary number of neural networks. In the previous studies [7,8], thousands of noiseless cases were taken. Here, we assess the performance of the hybrid method when only 10, 50 and 100 cases are considered. Moreover, we also assume the presence of noise in the measurement of the neuromuscular blockade level with the aim of studying how the performance is further affected. The overall assessment is carried out in a testing set with 400 different cases. In all 500 cases, the level of the neuromuscular blockade was simulated using the multidimensional log-normal distribution considered in [19] for the parameters a1, λ1, a2, λ2, k_e0, C50, β. During the curve fitting phase of the hybrid method, we assume that C50 is either known or unknown, taking it equal to its mean value in the latter case. This is done in order to investigate how our knowledge about C50 affects the estimation of the remaining parameters. Finally, the measurement noise was assumed log-normal with mean zero and a variance that increases with the blockade level, being 1 for a blockade level of 100%. This noise level simulates the one typically found in filtered real data.
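Given a sampled parameterization, generating one induction-phase case amounts to evaluating Eq. (9) at the 31 sampling instants. A sketch with purely illustrative parameter values (not taken from the paper's data set, and assuming k_e0 ≠ λ_i so the formula is well defined):

```python
import numpy as np

def blockade_level(t, a1, l1, a2, l2, ke0, C50, beta):
    """Blockade level r(t) (%) after a 500 ug/kg bolus at t = 0, per Eq. (9)."""
    # effect concentration c_e(t) [ug/ml]; assumes ke0 != l1, l2
    ce = 500.0 * sum(a * ke0 / (ke0 - l) * (np.exp(-l * t) - np.exp(-ke0 * t))
                     for a, l in ((a1, l1), (a2, l2)))
    return 100.0 / (1.0 + (ce / C50) ** beta)

# induction phase: 10 min sampled every 20 s, i.e. t_k = (k - 1)/3 min, 31 samples
t = np.arange(31) / 3.0
theta = dict(a1=0.04, l1=0.04, a2=0.03, l2=0.3, ke0=0.12, C50=0.4, beta=2.5)
r = blockade_level(t, **theta)
```

At t = 0 the effect concentration is zero, so r starts at 100% (full muscular activity) and then drops rapidly as the bolus takes effect.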

After gathering the data, we applied five-fold cross-validation to choose the best architecture for each network in the two implementation strategies, HM1 and HM2. This was done from a set of 10 candidates, all with one hidden layer and a number of hidden neurons ranging from 1 to 10. Note that, due to the structure of the model depicted in Fig. 1, the strategy HM2 requires two networks, one of which is common to HM1. We concluded that the network common to HM1 and HM2 should have on average six hidden neurons in the case of noiseless data and four hidden neurons in the case of noisy data. As for the other network needed in HM2, it can be the same regardless of the presence of noise in the neuromuscular blockade level, provided that the inner signal c_e(·) is not noisy. In order to achieve this, we recover c_e(·) using the fitted blockade level instead of the observed one, which may be corrupted by noise. In this way, the best architecture can be determined using the noiseless data and should have on average six hidden neurons. After the selection process, all the networks were trained, and this completed the work that needed to be done before the application of the hybrid method.
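The five-fold selection loop can be sketched in pure Python as below. The `fit` and `error` arguments are placeholders standing in for the network training and validation-error routines, which the paper does not spell out; the averaging "model" in the usage line is a trivial stand-in:

```python
import random

def five_fold_indices(n, seed=0):
    """Shuffle case indices 0..n-1 and split them into five disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[f::5] for f in range(5)]

def cross_validate(cases, fit, error):
    """Mean validation error of one candidate architecture over five folds."""
    errs = []
    for fold in five_fold_indices(len(cases)):
        train = [c for i, c in enumerate(cases) if i not in fold]
        valid = [cases[i] for i in fold]
        model = fit(train)                      # train on four folds
        errs.append(sum(error(model, c) for c in valid) / len(valid))
    return sum(errs) / len(errs)

# architecture selection = pick the candidate with the lowest CV error;
# here a trivial stand-in model that averages the training cases
cv = cross_validate(list(range(20)), lambda tr: sum(tr) / len(tr),
                    lambda m, c: abs(m - c))
```

In the paper's setting, `cross_validate` would be evaluated once per candidate hidden-layer size (1 to 10 neurons) and the size with the lowest mean validation error retained.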

Next, we present and analyze the results from the application of the hybrid method to the testing cases, focusing on the first implementation strategy, HM1. Furthermore, we compare these results with the ones that we get if the initial estimate of the parameterization is given not by a neural network but rather by either the "nearest" model or the mean estimate. For a given number of known cases, the nearest model of a testing case is the closest known case in the sense of the Euclidean distance between the corresponding vectors (r(t_1), ..., r(t_k)) of the observed blockade level for the testing case and for the known cases, while the mean estimate is the same for all testing cases and is obtained by averaging the parameterizations of all known cases. The results presented in this section refer to the case where C50 is assumed to be unknown and is replaced by its mean value during the fitting of the neuromuscular blockade model to the observed blockade level. The case where C50 is assumed to be known is considered in Appendix B. There are no significant differences in the results and hence in the conclusions made here.
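The two baseline initializers are straightforward to state in code; a numpy sketch (array and function names are ours):

```python
import numpy as np

def nearest_model_estimate(r_new, R_known, Theta_known):
    """Parameters of the known case whose observed blockade curve is closest,
    in the Euclidean sense, to the new case's curve."""
    i = int(np.argmin(np.linalg.norm(R_known - r_new, axis=1)))
    return Theta_known[i]

def mean_estimate(Theta_known):
    """Average of the parameterizations of all known cases; ignores the data."""
    return Theta_known.mean(axis=0)
```

Here `R_known` stacks one observed blockade curve per row and `Theta_known` the corresponding parameter vectors; the neural-network initializer replaces these look-ups with a learned mapping from curve to parameters.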

In the following, we start by looking at the global estimation error, i.e. the error concerning all the parameters that are identifiable, given by

100 ‖θ̂ − θ‖ / ‖θ‖ %,

where θ̂ represents an estimate of θ. The mean and the standard-deviation of this error in the case of noiseless data are shown in Table 1 for several combinations of the variables estimation time and known cases. Note that the estimation time is related to the amount of data that is collected in a new, previously unseen, case, and that the number of known cases is related to the amount of cases known a priori, i.e. before a new case is presented.

Table 1 – Global estimation error (%) concerning the initial estimate produced by the neural network (NN), the nearest model (NM) and the mean estimate (ME)

Global estimation error (%) for noiseless data

Initial estimate   Estimation time (min)   Known cases
                                           10           50           100
NN                 2                       10.42±9.49   3.69±4.96    3.29±3.43
                   5                        9.45±7.97   3.00±4.31    2.18±2.69
                   10                       9.05±9.32   2.48±3.08    2.29±3.90
NM                 2                       13.02±9.63   9.79±7.32    7.89±5.80
                   5                       12.47±9.68   9.70±7.13    7.97±5.86
                   10                      12.44±9.64   9.76±7.21    8.01±5.86
ME                 2, 5, 10                11.46±7.83   11.36±7.94   11.42±8.13

The results refer to the case of noiseless data and are shown in the form mean ± standard-deviation.

Let us begin by verifying that the network exhibits a better performance than the nearest model and mean estimate methods for all combinations of the variables estimation time and known cases. The network has the ability to produce an initial estimate that is less biased and less variant (the mean and the standard-deviation of the global estimation error are both lower for the network). Another conclusion is that the network is the method showing the greatest improvement in performance with an increase in the number of known cases. In fact, the mean estimate method does not improve and, of the other two methods, the network is the one that best exploits the additional data assumed to be known a priori to considerably reduce the estimation error in a new case. For 100 known cases, the network error has at most a mean value of about 3% and a standard-deviation of about 4%, against the corresponding 8% and 6% of the nearest model error. Finally, for a fixed number of known cases, an increase in the estimation time leads only to a mild (if any) improvement in the network estimation error. Note that the nearest model method does not seem to benefit from that increase and that the mean estimate is by definition independent of the blockade level data and hence of the estimation time considered.

The presence of noise in the data has little impact on the performances of the network and the nearest model method and, as expected, no impact at all on the performance of the mean estimate method. The general comments made before for the case of noiseless data remain valid.

Overall, we confirm our suggestion that an initial estimate is closer to the sought parameterization if it is taken from a neural network that learns how to produce it from some cases known a priori, regardless of their data being clean or contaminated by noise. Moreover, the more of these cases, the better the estimate produced by the network in a new case. Finally, it seems that extending the estimation time is not as important as increasing the number of known cases to significantly improve the quality of the initial estimate. This reflects our theoretical result according to which 2 min of data suffice to identify the parameters, as long as the number of known cases is sufficiently large.

Now, we look at the effect that the initial estimates produced by the neural network and by the nearest model and mean estimate methods have on the performance of two curve fitting algorithms: Levenberg–Marquardt (LVM) and Nelder–Mead (NLM). Note that the two algorithms adjust an initial estimate so that the model fits the observed blockade level and, since they do it in different ways, it is natural that different final estimates are generated from the same initial estimate. Moreover, depending on the initial estimate, the convergence process of each algorithm can end with a final estimate that is closer to or farther from the sought parameterization. In order to investigate these issues, we look at the global estimation error of the final estimate generated by LVM and NLM from the initial estimate produced by the neural network, the nearest model and the mean estimate. The mean and the standard-deviation of this error are shown in Tables 2 and 3, respectively for noiseless and noisy data.

Overall, for a given method of producing the initial estimate, both algorithms tend to improve their performance with an increase in the estimation time and the number of known cases. Nevertheless, NLM seems to generate a final estimate

Table 2 – Global estimation error (%) concerning the final estimate produced by a combination of the neural network (NN), the nearest model (NM) and the mean estimate (ME) with the Levenberg–Marquardt (LVM) and Nelder–Mead (NLM) algorithms

Global estimation error (%) for noiseless data

Method     Estimation time (min)/known cases
           2/10         5/50         10/100
NN+LVM     4.74±9.30    1.79±3.98    1.53±3.84
NN+NLM     3.54±4.64    1.76±1.54    1.37±1.13
NM+LVM     3.61±6.68    3.32±6.34    2.58±4.31
NM+NLM     3.23±2.91    2.01±1.95    2.13±1.96
ME+LVM     5.08±6.49    5.38±7.66    7.73±8.01
ME+NLM     4.99±5.53    3.71±3.78    3.69±4.26

The results refer to the case of noiseless data and are shown in the form mean ± standard-deviation.
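The refinement performed by a Levenberg–Marquardt-type algorithm can be illustrated with a minimal hand-rolled damped Gauss–Newton loop. This is a sketch for exposition only, not the authors' implementation (in practice a library routine would be used), shown here on a one-parameter toy fit:

```python
import numpy as np

def levenberg_marquardt(residual, theta0, n_iter=50, lam=1e-3):
    """Minimal Levenberg-Marquardt loop: refine theta0 to minimise ||residual||^2."""
    theta = np.asarray(theta0, dtype=float)

    def jac(th, eps=1e-6):  # forward-difference Jacobian of the residual
        r0 = residual(th)
        return np.column_stack([(residual(th + eps * np.eye(len(th))[j]) - r0) / eps
                                for j in range(len(th))])

    for _ in range(n_iter):
        r, J = residual(theta), jac(theta)
        # damped normal equations: (J^T J + lam I) step = -J^T r
        step = np.linalg.solve(J.T @ J + lam * np.eye(len(theta)), -J.T @ r)
        if np.sum(residual(theta + step) ** 2) < np.sum(r ** 2):
            theta, lam = theta + step, lam * 0.5  # accept: behave more like Gauss-Newton
        else:
            lam *= 10.0                           # reject: increase damping
    return theta

# toy use: recover the decay rate of y = exp(-0.7 t) from a close initial guess,
# mirroring the role of the network-provided initial estimate
t = np.linspace(0.0, 5.0, 20)
y = np.exp(-0.7 * t)
k_hat = levenberg_marquardt(lambda th: np.exp(-th[0] * t) - y, [0.5])[0]
```

The closer the initial guess, the fewer damped steps are rejected, which is precisely why a good network-provided initialization speeds up and stabilizes the fit.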


Table 3 – Global estimation error (%) concerning the final estimate produced by a combination of the neural network (NN), the nearest model (NM) and the mean estimate (ME) with the Levenberg–Marquardt (LVM) and Nelder–Mead (NLM) algorithms

Global estimation error (%) for noisy data

Method     Estimation time (min)/known cases
           2/10          5/50         10/100
NN+LVM     10.33±8.33    4.53±3.69    4.17±3.34
NN+NLM     10.36±7.44    5.20±3.54    4.62±3.49
NM+LVM     12.89±9.41    9.43±7.06    8.10±6.05
NM+NLM     13.61±9.79    7.19±5.08    5.16±4.77
ME+LVM     11.74±7.79    10.77±7.49   11.57±8.27
ME+NLM     12.07±8.42    8.62±6.09    6.40±4.58


Table 4 – Target infusion dose error (%) following the application of the curve fitting algorithms of Levenberg–Marquardt (LVM) and Nelder–Mead (NLM) with an initial estimate produced by the neural network (NN), the nearest model (NM) and the mean estimate (ME)

Initial estimate    LVM            NLM
NN                  3.87±6.60      9.04±7.76
NM                  6.20±8.98      15.85±12.82
ME                  13.12±13.04    18.66±17.14

The results in Table 3 refer to the case of noisy data and are shown in the form mean ± standard-deviation.

that is most of the time less variant than that generated by LVM. Finally, and most importantly, in practically all cases the global estimation error is lower when the initial estimate is produced by the neural network. As a consequence, so is the fitting error defined as

100 (1/s) Σ_{k=1}^{s} |(r̂(t_k) − r(t_k))/r(t_k)| %,

where r̂(·) is the predicted blockade level and s is the order of the sample associated with the recovery time to reach a blockade level of 10%,¹ considered to be the level for closing the loop in drug delivery control (of course, this value of 10%

may be adjusted according to the clinical requirements). For instance, in the case of 10 min estimation time with NLM and 100 known cases, the mean fitting error in the absence of noise is 27% for the network, 53% for the nearest model and 60% for the mean estimate. In the case of noisy data, the corresponding values are 59%, 70% and 74%, and they are higher than the previous ones mainly because the fitting error accounts for the presence of noise not fitted by the model. Finally, note that the overall values obtained reflect the fact that during the estimation time most of the samples have a value close to zero. Hence, this measure of fitting error is used only with the purpose of comparing the different performances of the methods.

The presented results confirm our suggestion that the performance of a curve fitting algorithm applied to parameter estimation can be improved if the initial estimate is given by a neural network. This has a significant impact on the identification of the patient characteristics, resulting in a better knowledge of the patient response to drug administration. As an example, we consider the prediction of the steady-state drug infusion dose. More concretely, the problem is to determine the constant infusion dose of atracurium that should be given to a particular patient in order to take the limiting value of its neuromuscular blockade level to a prefixed reference after an initial bolus. One way to solve the problem consists in

¹ In our data set, we found that the mean recovery time is about 16 min.

The results in Table 4 refer to the case of 10 min estimation time and 100 known cases and are shown in the form mean ± standard-deviation.

finding the constant infusion dose from the parameterization of the individual model representing the particular patient (see the first part of Appendix C). Following this approach and assuming for the prefixed reference of the blockade level the typical value of 10%, we compared the constant infusion dose associated with each testing case, found using the true model parameterization, with that predicted from the final estimate generated by combining LVM and NLM with the neural network, the nearest model and the mean estimate. The mean and the standard-deviation of the resulting error, which was called target infusion dose error, are given in Table 4 for the case of 10 min estimation time and 100 known cases. It can be seen that the best overall prediction of the target infusion dose is achieved when either fitting algorithm is combined with the network. This is explained by the fact that the parameter that most affects the dose calculation, namely β according to a sensitivity analysis (see the second part of Appendix C), is better estimated when the network is used. For instance, when LVM is applied, β has an estimation error of 0.69% ± 3.63% and 1.39% ± 4.25% if the network and the nearest model are considered, respectively. Finally, note that the greater the target infusion dose error, the more the patient is endangered by under- or over-dosage if the anaesthesiologist decides to follow the suggestion about the constant infusion dose to administer. Therefore, the difference between the performance of the fitting algorithms when combined with the network and when combined with the other two methods is indeed significant in practical terms.

With respect to HM2, the difference to HM1 is that the results are further improved, but at the expense of a higher computational burden. In particular, the global estimation error and the target infusion dose error are reduced by up to one half with an increase in both the estimation time and the number of known cases.

7. Conclusions

In this paper, a general version of a hybrid method for parameter estimation was presented. This method takes the initial estimate of the parameterization from an artificial neural network and then applies a curve fitting algorithm with the aim of refining it. Since the network estimate is close to the sought parameterization, the chance of an effective and rapid convergence of the curve fitting algorithm is high; hence, so is the possibility of obtaining not only a good fitting, but also a good prediction ability. These are important features in many problems arising in biomedicine, such as the identification of the neuromuscular blockade level during anaesthesia, here considered as the case study. Two curve fitting algorithms were suggested: the Levenberg–Marquardt algorithm and the Nelder–Mead algorithm. Moreover, two implementation strategies of the hybrid method were proposed: the first is of general applicability and the second is intended for systems defined by the series connection of various blocks. The second strategy, whenever possible to apply, improves the results of the first one but at the expense of a higher computational burden.

Table 5 – Global estimation error (%) concerning the final estimate produced by a combination of the neural network (NN), the nearest model (NM) and the mean estimate (ME) with the Levenberg–Marquardt (LVM) and Nelder–Mead (NLM) algorithms

Global estimation error (%) for noiseless data

Method     Estimation time (min)/known cases
           2/10         5/50         10/100
NN+LVM     4.78±9.42    1.69±3.56    1.55±3.80
NN+NLM     3.45±3.74    1.86±1.70    1.30±1.01
NM+LVM     3.62±6.88    3.37±6.42    3.07±4.87
NM+NLM     3.20±2.50    2.09±2.12    2.18±2.28
ME+LVM     4.95±6.24    5.46±7.55    8.40±10.84
ME+NLM     4.98±5.37    3.75±4.36    3.85±4.06

The results refer to the case of noiseless data and are shown in the form mean ± standard-deviation.

In our case study, it was possible to conclude that the hybrid method exhibits a performance superior to that of two alternative approaches to the problem of parameter estimation, namely those where the initial estimate given to a curve fitting algorithm is either that of the nearest model or the mean estimate. Furthermore, this was verified for both noiseless and noisy data.

Finally, our preliminary results on using the hybrid method for predicting the steady-state drug infusion dose that should be given to a patient undergoing general anaesthesia were very promising. Therefore, we plan to explore this issue in detail in the near future. Furthermore, we plan to investigate the performance of the hybrid method in the framework of model reference control and to apply it to more complex systems.

Acknowledgments

The first author would like to thank the Fundação para a Ciência e a Tecnologia (FCT) for the financial support (PhD grant) during the course of this project. This work was supported in part by FCT through the Unidade de Investigação Matemática e Aplicações (UIMA), Universidade de Aveiro, Portugal, and through the project IDEA, reference PTDC/EEA-ACR/69288/2006.

Appendix A

Proof of Proposition 2. (If) If J_{F,θ}(θ*, φ*) has full column rank, then it has p linearly independent rows, i_1, ..., i_p ∈ {1, ..., n}. Consider the subsystem of (2) given by

ȳ = F̄(θ, φ),

where ȳ = (y_{i_1}, ..., y_{i_p})^T and F̄ : (θ, φ) ↦ (F_{i_1}(θ, φ), ..., F_{i_p}(θ, φ))^T is such that J_{F̄,θ}(θ*, φ*) = ((∂F_i/∂θ_j)(θ*, φ*))_{i=i_1,...,i_p, j=1,...,p} has full column rank. The existence of G follows from the implicit function theorem (see, for instance, [20]).

(Only if) From G(F(θ, φ), φ) = θ we have ∂G_i/∂θ_j = Σ_k (∂G_i/∂y_k)(∂F_k/∂θ_j) = δ(i − j), where δ(k) = 1 if k = 0 and 0 otherwise. Then

J_{G,y}(y*, φ*) J_{F,θ}(θ*, φ*) = I_p,

where J_{G,y}(y*, φ*) = ((∂G_i/∂y_j)(y*, φ*))_{i=1,...,p, j=1,...,n} and I_p is the identity matrix of order p. This means that J_{F,θ}(θ*, φ*) has a left inverse, and thus has full column rank. □


Proof of Proposition 3. (If) Start by noting that

rank(J_{H,θ}) = rank(A_y^T J_{F,θ}) ≤ min{rank(A_y^T), rank(J_{F,θ})} = min{p, rank(J_{F,θ})} = rank(J_{F,θ}).

If Ḡ exists, then rank(J_{H,θ}(θ*, φ*)) = p by Proposition 2 applied to the compressed data (6); thus, minding the above inequality, rank(J_{F,θ}(θ*, φ*)) = p and G exists by Proposition 2.

(Only if) Let y_k = (d^k sin/dθ^k)(θ) c^{k−1}, where θ ~ U(0, 2π) is the unknown parameter and c ∈ (0, 1) is a known constant. Assume that data are collected at k = 1, 2, yielding y = (y_1, y_2)^T = (cos(θ), −sin(θ)c)^T. We shall prove that θ is uniquely defined around θ* = π if we consider y, but not if we consider the first principal component of y, z_y. The former holds because J_{F,θ}(θ*) = (0, c)^T has full column rank and the result follows from Proposition 2. Now, note that E[y_1] = E[y_2] = 0, E[y_1²] = 1/2 > E[y_2²] = c²/2 and E[y_1 y_2] = 0, where E[·] = (1/(2π)) ∫_0^{2π} · dθ denotes the mathematical expectation with respect to θ; hence, A_y = (1, 0)^T and z_y = A_y^T(y − E[y]) = y_1. Finally, it is easy to see that θ is not uniquely defined around θ* = π from z_y = cos(θ). □
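The compression matrix A_y used above can be obtained from sample data via the singular value decomposition of the centred data matrix. A sketch (our own helper, assuming numpy), exercised on exactly rank-one data whose single principal axis is ±(0.6, 0.8):

```python
import numpy as np

def compress(Y, m):
    """Return the first m principal axes A_y of the rows of Y and the
    compressed data z_y = A_y^T (y - E[y]) for every case."""
    Yc = Y - Y.mean(axis=0)
    _, _, Vt = np.linalg.svd(Yc, full_matrices=False)
    A = Vt[:m].T            # columns = leading principal axes
    return A, Yc @ A

Y = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0], [9.0, 12.0]])
A, Z = compress(Y, 1)
```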

Appendix B

The results presented in this appendix refer to the case where C50 is assumed to be known during the fitting of the neuromuscular blockade model to the observed blockade level and are related to the results presented in Section 6 (see Tables 5–7).

Appendix C

This appendix is divided into two parts. The first part shows how to find the constant infusion dose of atracurium that should be given to a patient in order to take the limiting value of its neuromuscular blockade level to a prefixed reference after an initial bolus. The second part corresponds to a sensitivity analysis of the dose calculation with respect to the identifiable parameters of the neuromuscular blockade model used in the calculation.


Table 6 – Global estimation error (%) concerning the final estimate produced by a combination of the neural network (NN), the nearest model (NM) and the mean estimate (ME) with the Levenberg–Marquardt (LVM) and Nelder–Mead (NLM) algorithms

Global estimation error (%) for noisy data

Method     Estimation time (min)/known cases
           2/10          5/50         10/100
NN+LVM     10.38±8.36    4.55±3.68    4.23±3.53
NN+NLM     10.52±7.53    5.22±3.77    4.92±4.52
NM+LVM     12.82±9.45    9.33±6.95    8.12±6.04
NM+NLM     13.64±9.72    7.16±5.23    4.99±4.02
ME+LVM     11.65±7.76    10.78±7.61   11.81±8.93
ME+NLM     12.43±9.14    8.29±6.02    6.37±4.74

The results refer to the case of noisy data and are shown in the form mean ± standard-deviation.

Table 7 – Target infusion dose error (%) following the application of the curve fitting algorithms of Levenberg–Marquardt (LVM) and Nelder–Mead (NLM) with an initial estimate produced by the neural network (NN), the nearest model (NM) and the mean estimate (ME)

Initial estimate    LVM            NLM
NN                  4.23±6.84      9.50±8.08
NM                  7.06±9.95      14.94±12.41
ME                  12.93±12.70    19.94±18.41

The results refer to the case of 10 min estimation time and 100 known cases and are shown in the form mean ± standard-deviation.

Let us start by noting that in theory the initial bolus administered to a patient has no influence on the limiting value of the blockade level r(·), because this will always be 100% if no further drug is given. Hence, we only consider the effect of a constant infusion dose represented by

u(t) = k H(t − t*) μg kg−1 min−1,

where k determines the quantity of drug administered beyond time t* and H is the Heaviside step function. Minding the structure of the neuromuscular blockade model presented in Section 6, it is easy to see that the limiting value of the effect concentration for this infusion dose is

lim_{t→+∞} c_e(t) = lim_{s→0} s (a1/(s + λ1) + a2/(s + λ2)) (k_e0/(s + k_e0)) L{u(t)}(s)
                 = lim_{s→0} s (a1/(s + λ1) + a2/(s + λ2)) (k_e0/(s + k_e0)) k e^{−st*}/s
                 = k (a1/λ1 + a2/λ2) μg ml−1,

where we applied the Final Value Theorem and L represents the (unilateral) Laplace Transform. Therefore, the corresponding limiting value of the blockade level is

lim_{t→+∞} r(t) = lim_{t→+∞} 100/(1 + (c_e(t)/C50)^β) = 100/(1 + ((k/C50)((a1/λ1) + (a2/λ2)))^β) %.

Finally, in order to achieve the condition lim_{t→+∞} r(t) = ref, where ref is the prefixed reference of r(t), the constant infusion dose that we are looking for should be

u(t) = [C50((100/ref) − 1)^{1/β} / ((a1/λ1) + (a2/λ2))] H(t − t*) μg kg−1 min−1,

where the bracketed factor is precisely k.

Next, we carry out a sensitivity analysis to assess how the calculation of k is affected by each of the identifiable parameters of the neuromuscular blockade model used in the calculation. Let us begin by noting that

∂k/∂a_i = −(1/(λ_i((a1/λ1) + (a2/λ2)))) k,

∂k/∂λ_i = (a_i/(λ_i²((a1/λ1) + (a2/λ2)))) k,

∂k/∂β = −(ln((100/ref) − 1)/β²) k.

From here it can be seen that |∂k/∂a_i| = (λ_i/a_i)|∂k/∂λ_i|, but since λ_i > a_i in our data set, it follows that |∂k/∂a_i| > |∂k/∂λ_i|, i.e. k is more sensitive to a_i than it is to λ_i. On the other hand, given that |∂k/∂a2| = (λ1/λ2)|∂k/∂a1| and that λ1 > λ2 in our data set, we have |∂k/∂a2| > |∂k/∂a1|, i.e. k is more sensitive to a2 than it is to a1. Finally, since |∂k/∂β| = (|ln((100/ref) − 1)| λ2((a1/λ1) + (a2/λ2))/β²)|∂k/∂a2| and |ln((100/ref) − 1)| λ2((a1/λ1) + (a2/λ2))/β² > 1 in our data set if we assume for ref the typical value of 10%, as we did in our practical experiments, it follows that |∂k/∂β| > |∂k/∂a2|, i.e. k is more sensitive to β than it is to a2. Overall, the calculation of k is more influenced by β than by any other parameter.
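The dose formula and its defining property can be checked numerically. The parameter values below are illustrative only, not patient data; the self-check verifies that substituting k back into the steady-state Hill relation recovers the reference level:

```python
def target_dose(a1, l1, a2, l2, C50, beta, ref=10.0):
    """Constant atracurium infusion k [ug kg-1 min-1] that steers the
    steady-state blockade level to ref (%), per the formula above."""
    return C50 * ((100.0 / ref) - 1.0) ** (1.0 / beta) / (a1 / l1 + a2 / l2)

# self-check with illustrative parameter values (not from the paper's data set)
a1, l1, a2, l2, C50, beta = 0.04, 0.04, 0.03, 0.3, 0.4, 2.5
k = target_dose(a1, l1, a2, l2, C50, beta)
r_inf = 100.0 / (1.0 + ((k / C50) * (a1 / l1 + a2 / l2)) ** beta)
```

By construction (k/C50)(a1/λ1 + a2/λ2) = ((100/ref) − 1)^{1/β}, so r_inf equals ref regardless of the particular parameter values chosen.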

References

[1] T. Mendonça, P. Lago, PID control strategies for the automatic control of neuromuscular blockade, Contr. Eng. Pract. 6 (10) (1998) 1225–1231.

[2] H. Derendorf, B. Meibohm, Modeling of pharmacokinetic/pharmacodynamic (PK/PD) relationships: concepts and perspectives, Pharmaceut. Res. 16 (2) (1999) 176–185.

[3] J.A. Nelder, R. Mead, A simplex method for function minimization, Comput. J. 7 (1965) 308–313.

[4] K. Levenberg, A method for the solution of certain problems in least squares, Q. Appl. Math. 2 (1944) 164–168.

[5] D. Marquardt, An algorithm for least-squares estimation of nonlinear parameters, SIAM J. Appl. Math. 11 (1963) 431–441.

[6] R. Chelouah, P. Siarry, A hybrid method combining continuous tabu search and Nelder–Mead simplex algorithms for the global optimization of multiminima functions, Eur. J. Operat. Res. 161 (3) (2005) 636–654.

[7] H. Alonso, H. Magalhães, T. Mendonça, P. Rocha, A hybrid method for parameter estimation, in: Proceedings of the IEEE Symposium on Intelligent Signal Processing, WISP 2005, Faro, Portugal, 2005, pp. 304–309.


[8] H. Alonso, H. Magalhães, T. Mendonça, P. Rocha, Multiple strategies for parameter estimation via a hybrid method: a comparative study, in: Proceedings of the Sixth IFAC Symposium on Modelling and Control in Biomedical Systems (Including Biological Systems) (MCBMS'06), Reims, France, 2006, pp. 87–92.

[9] I.T. Jolliffe, Principal Component Analysis, second ed., Springer-Verlag, New York, 2002.

[10] S. Haykin, Neural Networks: A Comprehensive Foundation, second ed., Prentice-Hall, New Jersey, 1999.

[11] K. Hornik, Approximation capabilities of multilayer feedforward networks, Neural Networks 4 (2) (1991) 251–257.

[12] T. Hastie, R. Tibshirani, J. Friedman, The Elements of Statistical Learning: Data Mining, Inference and Prediction, first ed., Springer-Verlag, New York, 2001.

[13] J.C. Lagarias, J.A. Reeds, M.H. Wright, P.E. Wright, Convergence properties of the Nelder–Mead simplex method in low dimensions, SIAM J. Optim. 9 (1) (1998) 112–147.

[14] D.G. Luenberger, Linear and Nonlinear Programming, second ed., Addison-Wesley, Massachusetts, 1984.

[15] K. Deb, Multi-Objective Optimization using Evolutionary Algorithms, first ed., John Wiley and Sons, West Sussex, 2001.

[16] I. Kalli, Neuromuscular block monitoring, in: R. Kirby, N. Gravenstein, E. Lobato, J. Gravenstein (Eds.), Clinical Anesthesia Practice, W.B. Saunders Company, 2002.

[17] B. Weatherley, S. Williams, E. Neill, Pharmacokinetics, pharmacodynamics and dose–response relationships of atracurium administered i.v., Br. J. Anaesth. 55 (1) (1983) 39s–45s.

[18] H. Alonso, T. Mendonça, P. Rocha, Contributions to parameter estimation using neural networks, Tech. Rep. CM 05/I-35, Universidade de Aveiro, http://pam.pisharp.org/handle/2052/86, 2005.

[19] P. Lago, T. Mendonça, H. Azevedo, Comparison of on-line autocalibration techniques of a controller of neuromuscular blockade, in: Proceedings of the Fourth IFAC Symposium on Modelling and Control in Biomedical Systems, Karlsburg-Greifswald, Germany, 2000, pp. 263–268.

[20] J.E. Marsden, A.J. Tromba, Vector Calculus, fourth ed., W.H. Freeman and Company, New York, 1996.