Predictive Control Based Therapy of Bone Marrow Cancer
Hugo M. Silva
Abstract—Despite its simple appearance, bone is a complex and
dynamic tissue that is remodelled continuously. Bone remodelling is
a delicate process that essentially involves two types of cells:
osteoclasts, which digest and remove old bone, and osteoblasts, which
create new bone. The bone microenvironment is rich in nutrients and
proteins, and hence it is not surprising that bone is a common
site for tumors. From the moment a tumor is
established, it starts to deregulate the healthy balance between
osteoclasts and osteoblasts, accelerating the bone resorption
process, so that it can spread to other organs. Furthermore, the
deregulated balance causes the bone mass density to decrease,
resulting in other kinds of bone disease. This paper starts by
reviewing the mathematical model for tumor growth as well as the
pharmacokinetics and pharmacodynamics models of the drug, so that
the therapy can be as close as possible to reality. Then, the
Nonlinear Model Predictive Control algorithm (NMPC) is used to find
the optimal drug dose, in order to reduce the tumor density. An
exponential reference signal is used in NMPC to produce smaller
tracking errors with respect to a reference that drives the tumor
size to a vanishing value, and the Recursive Least Squares method
is used to learn the parameters of the tumor growth model, in order
to obtain an adaptive NMPC strategy. Finally, this control strategy
is applied to a state-of-the-art bone microenvironment model used
in cancer research, to schedule a therapy for reducing tumor
density. Simulations in MATLAB show that the tumor is eliminated
and the bone mass is recovered in a period of five years, assuming
model validity.
Keywords: Nonlinear Model Predictive Control, Nonlinear models,
Pharmacodynamical Models, Adaptive Control, Bone Marrow Cancer,
Biomedical Systems.
I. INTRODUCTION
A. Motivation and Literature Review
Cancer is the name given to a wide range of diseases that have in
common an abnormal cell reproduction beyond the needs of the
organism. This uncontrolled proliferation leads to the formation of a
cellular mass called a tumor. The bone microenvironment provides a
fertile soil for cancer cells. The reciprocal interaction between
tumor and bone cells, known as the vicious cycle, supports the
establishment and orchestrates the expansion of malignant tumors in
the bone. The therapy used to repress cancer growth has toxic side
effects on the patient. To study that interaction, several
studies have recently been performed using mathematical models and
optimization solvers, allowing a deeper understanding and control
of tumor growth [1]–[10]. Indeed, some of those studies have
used Model Predictive Control (MPC) to compute an optimal
therapeutic schedule. In [7] an optimal chemotherapy dose is found
by solving a convex optimization problem based on linear matrix
inequalities; in [8] it is shown that, even when the system states
are not fully directly measurable and there are mismatches in the
model parameters, MPC still provides a useful schedule for cancer
treatment; in [9] MPC is used to provide a chemotherapy schedule
for mice with breast cancer.
Many disease-related studies assume that it
is possible to directly control the effect of a drug on the target.
The effect that the body has on the drug and the effect that the drug
has on the body are called pharmacokinetics (PK) and
pharmacodynamics (PD), respectively. After the administration of a
drug there are natural processes, such as solubility, distribution,
metabolism and elimination, that affect the drug
concentration that reaches the target organ [14]. Studies [12] and
[14] suggest mathematical models to represent those interactions.
Resistance to drugs is a natural process of the human body and is a
major problem in cancer therapies [16]. In [12] a drug resistance
model for the treatment of HIV is presented, based on the
drug concentration present in the bloodstream.
Bone marrow cancer is a common type of cancer that may
result from metastasis of prostate and breast cancers. Because of
the vicious cycle, only a very low percentage of patients (∼20%)
survive for more than five years after bone marrow cancer is
diagnosed [13]. Since the bone microenvironment is a fertile soil for the
development of this type of cancer, it is crucial to better
understand the interactions between osteoclasts, osteoblasts, bone
density and the tumor. In [11] a model representing the
microenvironment interactions between osteoclasts, osteoblasts
and the bone mass density was developed. A tumor growth model is
proposed in [4], where the model of [11] was adapted to show the
relations that the tumor has with the bone microenvironment. A
recent study [10] employs continuous optimal control to deal
with this disease and uses a classical PI controller to recover the
bone mass density.
B. Paper Contributions and Structure
The main goal of this work is to develop an adaptive control
framework to schedule a therapy that reduces the density of a
cancer tumor. The time evolution of the tumor density T is
represented by a nonlinear function that depends on the tumor
density itself and on the drug effect u. To discover which drug
effect u should be applied, the MPC algorithm is used to
solve an optimization problem. To that end, a quadratic cost
function is used that weights the drug effect u and the error between the
tumor density T and a reference signal Tref. An
exponential reference signal is used to generate smaller errors
between T and Tref. The PK and the PD of a drug are modelled so
that the drug dose is the manipulated variable, which better reflects
reality.
Summarizing, a drug dose d (an impulse signal) generates a drug
concentration c given by its PK model, which in turn
produces a drug effect u given by its PD model. Since MPC computes an
optimal drug effect u∗, but only the drug dose d can be
manipulated, it is necessary to find the optimal drug concentration c∗
that corresponds to the optimal drug effect u∗. To do so,
the inverse PD model is defined and used. To discover which drug
dose d generates a drug concentration c as close as
possible to the optimal drug concentration c∗, a controller with an
asymptotic observer is designed (see Figure 1). Note that the drug
dose could be the number of pills or even the number of
chemotherapy cycles. Since the therapy is not specified in this
work, the drug dose d is dimensionless.
Fig. 1. Block diagram. Blocks in series: Controller, PD−1, PK
control, PD and System; signals: u∗, c∗, c and u.
To find the tumor density model that best fits a patient, the
Recursive Least Squares (RLS) method is used to learn the model
parameters from data, in real time, yielding an adaptive MPC
algorithm. The framework developed is used with the model of [4] to
suppress the tumor and to break the vicious cycle. The recovery of
the bone mass density is enhanced using a classical discrete PI
algorithm.
After this introduction, which presents the motivation, the
literature review and the main contributions, the paper is
structured as follows: the mathematical models of PK, PD and drug
resistance are defined, as well as the tumor density time variation.
Then, the MPC is formulated. Some MPC performance characteristics
are studied and the developed framework is used to eliminate a bone
marrow tumor. To recover a healthy bone mass density, a
discrete PI controller is used.
II. MODELS
A. Pharmacodynamical model
The pharmacodynamical model (Figure 2) is composed of the PK, PD
and drug resistance models, and aims to bring the simulation closer
to a real clinical therapy. The PK control block is designed as in
Figures 3 and 4.
Fig. 2. Pharmacodynamical model (PC) - block diagram. Blocks: PD−1,
PK control, PD and R; signals: c∗, c and c50.
1) PK model: The PK model of a drug is represented by a linear
time-invariant system with two real poles and unit static gain,
C(s) = \frac{p_1 p_2}{(s + p_1)(s + p_2)}, (1)
where s is the complex frequency in rad · s−1 and p1, p2 ∈ ℝ+
are the pole magnitudes. This model relates the drug concentration
c in the bloodstream, as a function of time t, to the therapy dose
administered. Figure 5 shows the impulse response of the PK model.
Fig. 3. Pharmacokinetics control model (PK control) - block
diagram. The PK block represents the transfer function (1). Blocks:
Observer, −K and PK; signals: c∗, e, x, d and c.
Fig. 4. Observer - block diagram. Blocks: −L and B.
Fig. 5. PK model impulse response (drug concentration c versus
time). p1 = p2 = 0.5 · 10−3. The drug is fully eliminated from the
body after approximately 7 hours.
Consider now the equivalent state space representation of the
transfer function (1), represented by matrices A, B and C in the
model
\dot{x} = Ax + Bd,
y = Cx, (2)
where x ∈ ℝ2 is the state and y is the blood drug concentration.
This system is fully controllable and observable and thus
a controller with an asymptotic observer can be designed
to discover the optimal drug dose d. Let the system dynamics with a
controller and an observer be given by
d = -Kx, \quad \dot{x} = (A - BK - LC)x - Le, (3)
where K and L are gain vectors that may be computed using a pole
placement technique. The closed loop system can be defined, in an
equivalent way, by the following new matrices
A_{CL} = A - BK - LC, \quad B_{CL} = -L, \quad C_{CL} = C. (4)
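As a quick numerical check of the open-loop PK model, the sketch below evaluates in closed form the impulse response of a two-pole, unit-gain transfer function p1 p2 / ((s + p1)(s + p2)). The function name `pk_impulse_response`, the repeated-pole tolerance and the choice of seconds as the time unit (suggested by the 7 hour elimination time quoted for Figure 5) are assumptions made for illustration only.

```python
import math

def pk_impulse_response(t, p1, p2):
    """Impulse response of p1*p2 / ((s + p1)(s + p2)) for t >= 0."""
    if t < 0:
        return 0.0
    if abs(p1 - p2) <= 1e-12 * max(p1, p2):
        # Repeated pole p: inverse Laplace transform of p^2/(s + p)^2.
        return p1 * p1 * t * math.exp(-p1 * t)
    # Distinct poles: partial-fraction expansion of the transfer function.
    return p1 * p2 / (p2 - p1) * (math.exp(-p1 * t) - math.exp(-p2 * t))

# Values quoted for Figure 5 (time assumed to be in seconds).
p = 0.5e-3
t_peak = 1.0 / p                      # the repeated-pole response peaks at t = 1/p
peak = pk_impulse_response(t_peak, p, p)
seven_hours = 7 * 3600
residual = pk_impulse_response(seven_hours, p, p) / peak
```

Under these assumptions, `residual` is the fraction of the peak concentration still present after seven hours, which gives a rough sense of what "fully eliminated" means for the chosen poles.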
The discrete response of the PK model with the controller and the
asymptotic observer is given by the response for non-homogeneous
systems, that is composed of the solution of the homogeneous
equation and the input signal, using the superposition theorem
[15]. With the system defined by matrices (4), the drug
concentration evolution in discrete time, for Dirac input signals
d(k), is given by
c(k + \Delta) = C_{CL} \left( e^{\Delta A_{CL}} x(k) + e^{\Delta A_{CL}} B_{CL} \, d(k) \right), (5)
where ∆ is the sampling time.
2) PD model: The PD of a drug is represented by the Hill
equation, assumed to be a static nonlinear relation given by
u(k) = \frac{c(k)}{c_{50}(k) + c(k)}, (6)
where c50 ∈ ℝ+ is the drug concentration value for which the
drug effect is half of the maximum drug effect. It is assumed that
c50 may vary in time depending on the resistance model explained
below. The PD model has a horizontal asymptote when the drug
concentration tends to infinity, meaning that when the drug
concentration increases, the drug effect tends to saturate.
Therefore the PD model is normalized to vary between umin = 0 (no
drug effect) and umax = 1 (maximum drug effect). The drug effect
variation as a function of the drug concentration is presented in
Figure 6.
Fig. 6. Pharmacodynamics - drug concentration in logarithmic scale
(from 10−2 to 102). c50(k) = 1.
3) Drug resistance model: If the drug concentration c is below a
given threshold clim, only weak cells are killed. The cells that
reproduce are resistant to that drug concentration. This
phenomenon is called drug resistance (Figure 7). Let r(k) be the
drug resistance level at time k
r(k) = r(k - 1) + \delta \cdot \max(0, c_{lim} - c(k)), (7)
where δ is the sampling interval and clim the limit above which no
resistance to the drug is developed. When drug resistance is
developed by the body, a higher drug concentration c is needed to
produce the same drug effect u. This can be modelled by increasing the
c50 parameter proportionally to the drug resistance level [12]. Let
c50 be affected by the drug resistance r as follows
c_{50}(k) = c_{50}(0) \cdot (1 + K_r \cdot r(k)), (8)
where c50(0) is the initial value of the c50 parameter and Kr ∈ ℝ+0
is a parameter related to the ability of the disease to develop
resistance to the drug. In Figure 2, the block R is
composed of equations (7) and (8).
Fig. 7. Resistance model with forward Euler integration. δ = 1.
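The two resistance equations (7) and (8) can be sketched as a pair of small update functions. The function names and the numerical values in the usage lines are illustrative assumptions, not values taken from the paper.

```python
def update_resistance(r_prev, c, c_lim, delta=1.0):
    """Equation (7): resistance grows only while c is below the threshold c_lim."""
    return r_prev + delta * max(0.0, c_lim - c)

def c50_with_resistance(c50_initial, r, k_r):
    """Equation (8): c50 increases proportionally to the resistance level r."""
    return c50_initial * (1.0 + k_r * r)

# Illustrative values only: a low concentration builds resistance, a high one does not.
r = 0.0
r = update_resistance(r, c=0.2, c_lim=0.5)   # below the threshold: r increases by 0.3
r = update_resistance(r, c=0.8, c_lim=0.5)   # above the threshold: r is unchanged
c50 = c50_with_resistance(1.0, r, k_r=2.0)
```

Because r never decreases in (7), an early period of under-dosing permanently raises c50 in this model, which is why the controller has an incentive to keep c above clim.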
To finish the explanation of the block diagram of Figure 2, the
inverse pharmacodynamics model PD−1 is the function that gives the drug
concentration as a function of the drug effect. Its graph is that
of Figure 6 with the axes exchanged. Although
the drug effect u was said to vary between umin
= 0 and umax = 1, the inverse pharmacodynamics model is not defined
when u = 1 (the corresponding drug concentration would be
infinite). For that reason, from now on, umax is limited to
99% of the maximum drug effect (umax = 0.99) for both the PD and PD−1
models.
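Solving the Hill equation (6) for the concentration gives c = c50 · u / (1 − u), which makes the singularity at u = 1 explicit. The sketch below implements both directions; the function names are hypothetical, and clamping the effect to [0, umax] inside the inverse is an assumption about how the 0.99 limit would be enforced.

```python
def hill_effect(c, c50):
    """PD model (6): drug effect as a function of drug concentration."""
    return c / (c50 + c)

def inverse_hill(u, c50, u_max=0.99):
    """Inverse PD model: concentration needed to obtain effect u.

    The effect is clamped to [0, u_max] so that the division by (1 - u)
    stays finite (assumed handling of the u = 1 singularity).
    """
    u = min(max(u, 0.0), u_max)
    return c50 * u / (1.0 - u)

# Round trip: the concentration computed for u = 0.8 reproduces that effect.
c_star = inverse_hill(0.8, c50=1.0)
u_back = hill_effect(c_star, c50=1.0)
```

With c50 = 1, the cap umax = 0.99 corresponds to a concentration of about 99, so the last few percent of effect are very expensive in terms of concentration.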
B. Bone microenvironment model with tumor and drug treatment
dynamics
1) Tumor growth model: It is assumed that a cell-kill drug is
administered to the patient to diminish the tumor density. Thus,
the tumor growth model used in [4] is slightly changed to a more
realistic one [5]. Consider that the variation of the tumor density
T(t) is given as a function of continuous time t by (see footnote 1)
\dot{T} = a T \log\left(\frac{\eta}{T}\right) - b T u_2, (9)
where a ∈ ℝ+ is a parameter related to the tumor growth rate,
b ∈ ℝ+0 is the tumor sensitivity to the drug, η ∈ ℝ+ is the plateau
level, T ∈ ]0, η[ is the tumor density and u2 ∈ ℝ+0 is the tumor
cell kill drug effect. In the absence of treatment,
model (9) leads to an S-shaped growth, with η being a horizontal
asymptote. Although the drug effect u2 causes the tumor growth to
decrease, when T becomes small the effect of u2 also decreases, so
there is no danger that T is driven to meaningless negative
values.
Footnote 1: For notation simplicity, the time dependence of
functions is omitted. The dot notation is used for time derivatives
of functions.
2) Bone microenvironment model: Consider the following nonlinear
model, presented in [4], where C(t) and B(t) represent,
respectively, osteoclast and osteoblast activity, and Z(t)
represents the bone mass density, as functions of continuous time
t
C = α1C g11
1+r11 T
(10)
where g••, r••, α•, β• and k• are bone microenvironment model
parameters, and \bar{C} and \bar{B} are the mean values of the
osteoclast and osteoblast functions [11]. The variable u1
represents the osteoblasts recovery drug effect. Since the control
algorithms used operate in discrete time, models (9) and (10) are
approximated using the 4th order Runge-Kutta method, with step size
h. Hereafter, consider y(k) as the discrete version of T(t).
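The discretization step can be illustrated on the Gompertz model (9) alone. The sketch below applies the classical 4th order Runge-Kutta scheme with the input held constant over each step (an assumption), and compares it with the closed-form solution that exists when u2 is constant, obtained by noting that x = log T obeys a linear equation. Function names are hypothetical; the parameter values in the usage lines are those quoted in the caption of Figure 9.

```python
import math

def gompertz_rhs(T, u, a, b, eta):
    """Right-hand side of model (9)."""
    return a * T * math.log(eta / T) - b * T * u

def rk4_step(T, u, a, b, eta, h):
    """One classical RK4 step with the input u held constant over the step."""
    k1 = gompertz_rhs(T, u, a, b, eta)
    k2 = gompertz_rhs(T + 0.5 * h * k1, u, a, b, eta)
    k3 = gompertz_rhs(T + 0.5 * h * k2, u, a, b, eta)
    k4 = gompertz_rhs(T + h * k3, u, a, b, eta)
    return T + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def simulate(T0, u, a, b, eta, h, n_steps):
    """Apply n_steps RK4 steps with a constant drug effect u."""
    T = T0
    for _ in range(n_steps):
        T = rk4_step(T, u, a, b, eta, h)
    return T

def exact_constant_u(T0, u, a, b, eta, t):
    """Closed-form solution for constant u: x = log(T) satisfies
    dx/dt = a*(log(eta) - x) - b*u, a linear first-order equation."""
    x_inf = math.log(eta) - b * u / a
    return math.exp(x_inf + (math.log(T0) - x_inf) * math.exp(-a * t))

a, b, eta, h = 0.15, 1.5, 100.0, 1.0 / 7.0
T_num = simulate(97.5, 0.5, a, b, eta, h, 70)
T_exact = exact_constant_u(97.5, 0.5, a, b, eta, 70 * h)
```

The closed form also shows that a constant effect u drives the tumor density to η · exp(−b u / a) rather than to zero, which is consistent with the remark that T cannot become negative.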
III. RECURSIVE LEAST SQUARES METHOD
The adaptive MPC strategy is obtained by using the RLS method to
estimate the model parameters. The new estimate of the parameters
θ(k+1) is found as a function of the previous estimate θ(k), the
system input u(k) and system output y(k+ 1) [17]. Let the method be
defined by
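\theta(k+1) = \theta(k) + K_g(k+1) \left[ y(k+1) - \varphi^T(k)\theta(k) \right],
K_g(k+1) = \frac{P(k)\varphi(k)}{1 + \varphi^T(k)P(k)\varphi(k)},
P(k+1) = P(k) - \frac{P(k)\varphi(k)\varphi^T(k)P(k)}{1 + \varphi^T(k)P(k)\varphi(k)}, (11)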
where Kg is the Kalman gain, P is the covariance matrix and φ is
the vector with the model dependent variables such that y(k + 1) =
φ(k)T θ. Applying the 4th order Runge-Kutta method to discretize
the Gompertz model (9) yields an accurate discrete model.
However, that model is nonlinear in the parameters. Thus, for
the purpose of parameter estimation, the Gompertz model (9)
is discretized by the 1st order Euler method. The resulting discrete
Gompertz model is given by
y(k+1) = y(k) + h \left[ a \, y(k) \log\left(\frac{\eta}{y(k)}\right) - b \, u(k) \, y(k) \right], (12)
where h ∈ ℝ+ is the discretization step size. Rewriting in
the RLS notation,
\varphi(k) = \left[ h \, y(k) \log\left(\frac{\eta}{y(k)}\right) \quad -h \, u(k) \, y(k) \right]^T, (13)
so that the increment y(k + 1) − y(k) equals φ(k)T θ with θ = [a b]T,
and the method is fully described, assuming that the plateau level η
is well known.
As the MPC algorithm and the RLS method use different discretization
methods to obtain a discrete Gompertz model, it is
expected that the parameters estimated with the Euler method
deviate from the real ones.
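A minimal sketch of the RLS recursion (11) for the two-parameter Euler model is given below. It assumes that the regression target is the increment y(k + 1) − y(k), that the data are noise-free and generated by the Euler model itself (so no discretization mismatch appears), and that the input is a hypothetical sinusoidal drug effect chosen only to excite both parameters. The initial conditions θ(0) = [50 50]T and P(0) = 10^4 I are those quoted in the caption of Figure 15; the true parameters are those of Figure 9.

```python
import math

def regressor(y, u, eta, h):
    """Regressor of the Euler-discretized Gompertz model, as in (13)."""
    return [h * y * math.log(eta / y), -h * u * y]

def rls_update(theta, P, phi, z):
    """One RLS step (11) for a two-parameter model z = phi^T theta."""
    P_phi = [P[0][0] * phi[0] + P[0][1] * phi[1],
             P[1][0] * phi[0] + P[1][1] * phi[1]]
    denom = 1.0 + phi[0] * P_phi[0] + phi[1] * P_phi[1]
    gain = [P_phi[0] / denom, P_phi[1] / denom]
    error = z - (phi[0] * theta[0] + phi[1] * theta[1])
    theta = [theta[0] + gain[0] * error, theta[1] + gain[1] * error]
    # P stays symmetric, so P*phi*phi^T*P equals the outer product of P_phi.
    P = [[P[i][j] - P_phi[i] * P_phi[j] / denom for j in range(2)]
         for i in range(2)]
    return theta, P

a_true, b_true, eta, h = 0.15, 1.5, 100.0, 1.0 / 7.0
theta = [50.0, 50.0]                  # poor initial guess
P = [[1e4, 0.0], [0.0, 1e4]]          # large P(0): low confidence in theta(0)
y = 97.5
for k in range(40):
    u = 0.5 + 0.4 * math.sin(0.5 * k)             # hypothetical exciting input
    phi = regressor(y, u, eta, h)
    y_next = y + a_true * phi[0] + b_true * phi[1]  # Euler model (12)
    theta, P = rls_update(theta, P, phi, y_next - y)
    y = y_next
a_hat, b_hat = theta
```

When the data instead come from a Runge-Kutta simulation or from a real system, the estimates would be expected to settle at values that differ somewhat from the true parameters, as noted above.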
IV. NONLINEAR MPC OF TUMOR GROWTH
A. MPC formalization
At time k, the aim is to find the value that u2(k) should take to
reduce the tumor density. This is done by solving the following
constrained optimization problem in a receding horizon
strategy
\min_{U_k} \; J(y(k), U_k)
\text{s.t.} \quad U_{min} \le U_k \le U_{max}, (14)
where Umin = umin · 1 and Umax = umax · 1 are the constant constraint
vectors with the same dimension as Uk. The cost function J is the
quadratic cost function defined as
J(y(k), U_k) = (Y_{k+1} - Y^*_{k+1})^T (Y_{k+1} - Y^*_{k+1}) + \rho \, U_k^T U_k, (15)
where Yk+1 is the predicted output vector (see footnote 2)
Y_{k+1} = [\, y(k+1|k) \;\; y(k+2|k) \;\; \dots \;\; y(k+N|k) \,]^T, (16)
N is the prediction horizon and ρ ∈ ℝ+ is a tuning parameter.
The reference signal vector Y∗k+1 that Yk+1 should follow is given
by
Y^*_{k+1} = [\, Y_{ref}(k+1) \;\; Y_{ref}(k+2) \;\; \dots \;\; Y_{ref}(k+N) \,]^T, (17)
and Uk is the virtual inputs control vector
U_k = [\, u_2(k) \;\; u_2(k+1|k) \;\; \dots \;\; u_2(k+T_c-1|k) \,]^T. (18)
With an initial condition y(0), the solution of problem (14) is
U^*_k = \arg\min_{U_k} \; J(y(k), U_k)
\text{s.t.} \quad U_{min} \le U_k \le U_{max}, (19)
where the inequality is to be taken elementwise. Note that only the
first element of U∗k , u∗2(k), is actually applied to the system.
The same procedure is repeated at time k + 1, in a receding horizon
strategy, to consider the system output feedback, which gives a
level of robustness to the controller. To solve the optimization
problem (14), the fmincon MATLAB function is used. This function
uses a quasi-Newton algorithm, which needs an initial estimate of
the solution U∗ for every time k. The first estimate of U∗, U0(0),
is a vector of ones with length N. Therapeutically, this estimate
is equivalent to the worst case scenario where the drug effect is
set to the maximum admissible value over the whole prediction horizon.
Although the choice of U0(0) influences the solution of (19), the
initialization U0(0) is not crucial to solving the optimization
problem, as will be seen below. The next estimates are given
by
U_0(k) = [\, U^*_{k-1}(2:\mathrm{end})^T \;\; 1 \,]^T, (20)
for k > 0.
Footnote 2: y(b|a) denotes the system output at time b predicted at
time a, with a ∈ N0, b ∈ N and b > a. For notation simplicity let
y(a|a) = y(a). The same notation is used with the system input u.
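The receding horizon loop described in this section can be sketched as follows. This is not the fmincon-based implementation used in the paper: the solver here is a simple projected gradient descent with finite-difference gradients and backtracking, swapped in so that the example needs only the standard library. The control horizon is assumed equal to the prediction horizon N, the parameter values are those quoted for Figures 9 and 10, and all function names are hypothetical.

```python
import math

A, B, ETA, H = 0.15, 1.5, 100.0, 1.0 / 7.0   # Gompertz parameters and step size
RHO, N = 1e-4, 7                              # input weight, prediction horizon
U_MIN, U_MAX = 0.0, 0.99                      # bounds on the drug effect
T0, T_INF, LAM = 97.5, 10.0, 0.2              # exponential reference parameters

def y_ref(k):
    """Exponential reference starting at T0 and decaying towards T_INF."""
    return T_INF + (T0 - T_INF) * math.exp(-LAM * k)

def f(T, u):
    return A * T * math.log(ETA / T) - B * T * u

def rk4_step(y, u):
    k1 = f(y, u)
    k2 = f(y + 0.5 * H * k1, u)
    k3 = f(y + 0.5 * H * k2, u)
    k4 = f(y + H * k3, u)
    return y + H * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def cost(y, U, k):
    """Quadratic cost (15): tracking error over the horizon plus input penalty."""
    J = 0.0
    for i, u in enumerate(U, start=1):
        y = rk4_step(y, u)
        J += (y - y_ref(k + i)) ** 2 + RHO * u * u
    return J

def clip(u):
    return min(max(u, U_MIN), U_MAX)

def solve(y, U0, k, iters=20, eps=1e-6):
    """Projected gradient descent with backtracking (stand-in for fmincon)."""
    U = [clip(u) for u in U0]
    J = cost(y, U, k)
    for _ in range(iters):
        grad = []
        for i in range(len(U)):
            V = list(U)
            V[i] += eps
            grad.append((cost(y, V, k) - J) / eps)
        step, improved = 1.0, False
        while step > 1e-10:
            V = [clip(u - step * g) for u, g in zip(U, grad)]
            J_new = cost(y, V, k)
            if J_new < J:
                U, J, improved = V, J_new, True
                break
            step *= 0.5
        if not improved:
            break
    return U

def run(n_steps=35):
    """Closed loop: apply only the first input, then shift the warm start (20)."""
    y, U0 = T0, [1.0] * N
    ys, us = [y], []
    for k in range(n_steps):
        U = solve(y, U0, k)
        y = rk4_step(y, U[0])
        ys.append(y)
        us.append(U[0])
        U0 = U[1:] + [1.0]
    return ys, us
```

Calling `run()` returns the simulated tumor densities and the applied drug effects. The solver only accepts steps that lower the cost, so it may stop short of the true optimum; a dedicated constrained optimizer would normally replace it.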
B. MPC performance and features
In order to assess the dependence of the MPC performance on the
parameters that configure it, some simulation results are presented
hereafter, in Figures 9 to 13, with the RLS estimator turned off.
Let the "5% rise time" be the time that y(k) takes to reach 5% of the
difference between the initial tumor density y(0) and the
equilibrium tumor density y(+∞), and let the "Ratio Teq/T∞" be the
offset between the equilibrium of y, Teq, and the reference signal
equilibrium, called T∞. These two performance characteristics are
studied as functions of the maximum drug effect umax and the
optimization parameter ρ.
Fig. 8. Simulation block diagram. Blocks: MPC, PC, T, RLS and z−1;
signals: y(k) and θ(k + 1). The T block illustrates
the tumor growth dynamics given by the Gompertz model.
The experiment is to follow an exponential reference given by
Y_{ref}(k) = T_\infty + (T_0 - T_\infty) e^{-\lambda k}, (21)
with initial tumor density y(0) = 97.5, T∞ = 10 and where λ ∈ ℝ+
is a parameter that defines how quickly the reference signal
varies. The exponential reference signal starts in accordance with
y(0).
Figure 9 shows the typical result of simulating the block diagram
of Figure 8. The tumor density tracks the reference signal and
the system reaches equilibrium in approximately 5 weeks of
therapy. The value chosen for the step size h makes the controller
compute the system input on a daily basis. The bottom graph of
Figure 9 shows the daily drug dose that must be administered to the
patient.
To study the performance of the controller, the block diagram of
Figure 8 is simulated for different umax with a fixed ρ and
vice-versa. Figure 10 shows that as umax decreases, the system
response y becomes slower. Moreover, when umax is small, MPC cannot drive
y to T∞, even if the simulation time is increased. As seen, umax is
set to 1 and by inspection this value allows the MPC to drive y to
T∞ in an admissible time with no offset. Figure 11 shows that as ρ
increases, the bandwidth becomes smaller and the system
response slower. Furthermore, more robustness is given to the
system because the high frequency dynamics are attenuated. Moreover,
the larger ρ is, the bigger the offset will be.
Fig. 9. System input and output with an exponential reference
signal (bottom panel: drug dose, d). Gompertz model parameters: a =
0.15, b = 1.5, η = 100, h = 1/7. Tref parameters: y0 = 97.5, T∞ =
10, λ = 0.2.
Fig. 10. 5% rise time and ratio Teq/T∞ as a function of umax. Tumor
growth parameters: a = 0.15, b = 1.5, η = 1, h = 1/7. Reference
quickness parameter: λ = 0.2. MPC parameters: ρ = 10−4, N = 7.
Two other important characteristics to study are the
simulation cost and the simulation time as functions of
the prediction horizon N. When analysing the cost as a function
of the prediction horizon, two cases were considered: in Test 1,
the MPC tumor density model has no parameter errors, while in Test
2 the MPC tumor density model has ±20% parameter errors. Let J be the
cost that evaluates the whole experiment, with D samples, defined
as
J(N) = \frac{1}{D} \left[ (Y_J - Y^*_J)^T (Y_J - Y^*_J) + \rho \, U_J^T U_J \right], (22)
where YJ is a vector that contains the system outputs, Y∗J is a
vector that contains the reference signal and UJ is a vector that
contains the system inputs.
Fig. 11. 5% rise time and ratio Teq/T∞ as a function of ρ. MPC
parameters: umin = 0, umax = 0.99, N = 7.
Fig. 12. Cost and simulation time of having a prediction horizon of
size N, for Test 1 and Test 2. MPC parameters: ρ = 10−4, umin = 0,
umax = 0.99. The vertical black line points at the chosen
prediction horizon N = 7.
Figure 12 shows that Test 1 has a smaller value of J than
Test 2. This happens because the difference between
T and Tref does not vanish in Test 2, due to the differences
between the model that MPC knows and the real system behaviour.
There is no specific rule to choose the best number of predictors
N. There is, however, a rule of thumb that suggests that the chosen
N should lie after the knee of the cost curve, because more predictors will
cause only a very small decrease of the cost. For the Test 1 case, the
cost curve knee is at N = 2, which means that any N > 2 is
probably a good choice (for the Test 2 case, the cost curve knee is
approximately at N = 5). However, larger values of N do not
reduce the cost J proportionally: when N
increases beyond the knee, the variation of J is almost null. Furthermore, when N
increases, the simulation time increases exponentially (Figure 12).
Thus, the prediction horizon N was set to 7. Since the algorithm
step size h is expressed in days, this prediction horizon value
means that MPC looks one week into the
future. Finally, there is no significant difference in the
simulation time between the two tests, which is to be expected, since
MPC has to make exactly the same computations for both tests.
The initialization of the solution to the optimization problem
(19), U0(0), was chosen to be a vector of ones, representing the
worst case: a non-optimal therapy where the drug effect is always
at its maximum value.
Fig. 13. System input and output for different optimization problem
initializations. Test1: U0(0) = 0.1 · 1. Test2: U0(0) = 0.5 · 1.
Test3: U0(0) = 1. Test4: U0(0) is a random vector between 0 and 1,
where its elements may be different.
Besides the biological interpretation of this choice, any value for
this vector will lead to the same drug effect therapy and
consequently the same system response. Figure 13 shows the system
input and output for four different U0(0). Although the system
input and output are different in the first weeks, all the system
inputs converge to the same result, showing that the value of the
initialization of the optimization problem is not crucial.
V. BONE MASS RECOVERY
In this section, the framework represented by Figure 8 is used with
the model determined in [4], which is represented by the block
diagram of Figure 14, to decrease the tumor density and to
recover bone mass. Two time instants were defined: k1 is the same
as in [4] and represents the beginning of both treatments (tumor
cell kill drug and osteoblasts regulator drug), and k2 is where it
is considered that the tumor is eliminated. For the purpose of
recovering the bone mass density, a discrete PI controller with a
forward Euler integrator is used to compute the drug effect u1,
assumed to be the manipulated variable. The controller is given by
the following difference equation [18]
u1(k) = −a1
a1 = −1, (24)
and e(k) is the error between the osteoblasts reference signal and
the output at time k.
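For illustration, a discrete PI law with a forward Euler integrator can be written in incremental form as u1(k) = u1(k − 1) + Kp (e(k) − e(k − 1)) + Ki h e(k − 1), which has a unit coefficient on u1(k − 1) and is therefore consistent with a1 = −1. The sketch below implements that textbook form; it is an assumption about the structure of the controller rather than the paper's exact equation, and the function name and gains in the usage lines are illustrative.

```python
def make_pi_controller(kp, ki, h):
    """Incremental discrete PI with forward Euler integration (assumed form):
    u(k) = u(k-1) + kp*(e(k) - e(k-1)) + ki*h*e(k-1)."""
    state = {"u": 0.0, "e": 0.0}

    def step(error):
        u = state["u"] + kp * (error - state["e"]) + ki * h * state["e"]
        state["u"], state["e"] = u, error
        return u

    return step

# Illustrative gains only: response to a constant unit error.
pi = make_pi_controller(kp=2.0, ki=1.0, h=0.5)
outputs = [pi(1.0) for _ in range(3)]
```

With a constant error the proportional term acts once and the integral term then adds Ki · h · e at every step, which is the ramp seen in `outputs`.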
Fig. 14. Bone microenvironment block diagram. Blocks: PID, B, C and
Z; signals: Bref(k + 1), B(k), C(k), B(k + 1), C(k + 1) and
Z(k + 1). The B, C and Z blocks
illustrate the osteoblasts, osteoclasts and the bone mass density
dynamics, respectively. The reference signal Bref is equal to the
osteoblasts steady state B.
The approach is the following: between k1 and k2 both the osteoblasts
regulator drug and the tumor cell kill drug are administered to the
patient, and after k2 only the osteoblasts regulator drug is still
administered. This decision is based on [13], which suggests that, to
break the vicious cycle, the osteoclasts number must return to a
normal value, so that excessive bone resorption stops and the
tumor is prevented from spreading to other sites.
Figure 15 shows the estimation of the tumor growth model parameters
from k = k1 onwards. The estimate converges in
approximately 10 days. Although a is close to the real value, b
shows an error when compared to the real one, which is to be expected
given the difference between the Euler and the Runge-Kutta
discretization methods. It is acceptable to consider, for instance,
that this difference is the result of an error in the sensors that
measure the tumor density. Figures 18 and 19 show that, between k =
0 and k1, where no drugs are applied, the tumor density starts to
increase, deregulating the balance between osteoclasts and
osteoblasts, and consequently the bone mass density starts to
decrease from the steady state.
The presence of the vicious cycle is also evident. The amplitude of
the osteoclasts and osteoblasts dynamics changes: osteoclasts
increase in number, destroying more bone than is created, and the
period of the oscillations is also affected. Between k1 and k2, when
both drugs are applied, the tumor density decreases, following the
reference signal given by (21), until the tumor is considered to be
extinct (T < 2%), and the balance between osteoclasts and
osteoblasts starts to recover, as does the bone mass density. After
k2, when only the osteoblasts regulator drug is active, the bone mass
density continues to recover until it reaches the steady state Z =
100. The variation of the cell kill drug concentration c, the c50
parameter and the drug resistance level r are shown in Figure 17.
These graphs are only meaningful while the tumor cell kill drug
therapy is on, which is why those quantities were set to 0 after k2.
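As a quick numerical check of the timing described above, the sketch below evaluates an exponential reference with the parameters listed in Fig. 19 (λ = 0.0044105, T∞ = 0, T0 = y(k1) = 79.51) and the instants k1 = 600 and k2 = 1435 from Fig. 16. The form Tref(k) = T∞ + (T0 − T∞) exp(−λ (k − k1)) is an assumption, since (21) is not reproduced in this section; under it, λ is approximately ln(T0/2)/(k2 − k1), so the reference reaches the 2% extinction level at k2.

```python
import math

LAM, T_INF, T0 = 0.0044105, 0.0, 79.51  # reference parameters from Fig. 19
K1, K2 = 600, 1435                      # therapy instants from Fig. 16
THRESHOLD = 2.0                         # extinction level, T < 2%


def reference(k):
    """Assumed exponential reference signal for k >= K1."""
    return T_INF + (T0 - T_INF) * math.exp(-LAM * (k - K1))


def first_day_below(threshold):
    """First integer day at which the reference is strictly below the threshold."""
    k = K1
    while reference(k) >= threshold:
        k += 1
    return k


print(first_day_below(THRESHOLD), round(reference(K2), 3))
```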
[Figure 15: two panels showing the estimates of a and b over the first 20 days of therapy.]
Fig. 15. Tumor growth model parameters estimation. The tumor growth
model parameters estimation starts when the cell kill drug is being
administered. This figure shows the estimates for the first 20 days
of therapy. The red horizontal lines show the real model
parameters, a = 0.005 and b = 0.7. RLS method initial conditions:
θ(0) = [a b]^T = [50 50]^T, P(0) = 10^4 I, where I is the identity
matrix. This value of P(0) shows the low confidence in the
starting estimate of the parameters.
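The role of the initial conditions above can be illustrated with a minimal recursive least squares sketch. It uses θ(0) = [50 50]^T and P(0) = 10^4 I from the caption, but the regression data are synthetic: a generic linear model y = φ^T θ with random regressors stands in for the discretized tumor model, whose regressor is not restated here. The function names, the noise-free data and the unit forgetting factor are all assumptions.

```python
import numpy as np


def rls_step(theta, P, phi, y, forgetting=1.0):
    """One recursive least squares update for the model y = phi^T theta."""
    phi = phi.reshape(-1, 1)
    gain = P @ phi / (forgetting + (phi.T @ P @ phi).item())
    theta = theta + gain * (y - (phi.T @ theta).item())
    P = (P - gain @ phi.T @ P) / forgetting
    return theta, P


def estimate(regressors, outputs):
    """Run RLS over the data, starting from the initial conditions of Fig. 15."""
    theta = np.array([[50.0], [50.0]])  # theta(0): deliberately poor guess
    P = 1e4 * np.eye(2)                 # P(0): large, i.e. low confidence
    for phi, y in zip(regressors, outputs):
        theta, P = rls_step(theta, P, phi, y)
    return theta.ravel()


# Synthetic, noise-free data generated with the real parameters a and b.
rng = np.random.default_rng(0)
true_theta = np.array([0.005, 0.7])
regressors = rng.standard_normal((20, 2))
outputs = regressors @ true_theta
print(estimate(regressors, outputs))
```

A large P(0) lets the first few samples override the poor initial guess quickly, which is consistent with the fast convergence reported for Fig. 15.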
[Figure 16: three panels showing u1, u2 and d2 over time, horizontal axis from 0 to 2500.]
Fig. 16. Drug effects u1 and u2 and drug dose d2 time variation.
The vertical lines indicate the therapy time instants k1 = 600 and
k2 = 1435. Discrete PID controller parameters: Kd = 0, Ki = −10^−8
and Kp = −10^−7.
VI. CONCLUSIONS
The combination of the MPC and RLS methods provides an adaptive
optimization solver, applied here to the treatment of bone marrow
cancer. Optimizing the therapy can be formulated as a control
problem whose solution provides a drug dose schedule. This
open-loop solution can be transformed into a feedback control law,
with all the inherent advantages, by using the receding horizon
strategy. In the simulations presented, MPC drives the tumor
density to low values, interrupting the vicious cycle and stopping
the tumor from spreading to other organs.
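The receding horizon idea can be sketched in a few lines. This is a simplified illustration under several assumptions, not the NMPC algorithm used in the paper: the tumor model is taken as the Euler-discretized Gompertz model T(k + 1) = T(k) + h T(k) [a ln(η/T(k)) − b u(k)], the reference is an exponential decay with rate λ, the input is held constant over the horizon, and the optimization is a plain grid search. The numerical values come from Fig. 19; the function names are hypothetical.

```python
import numpy as np

A, B, ETA, H = 0.005, 0.7, 100.0, 1.0      # tumor model values (Fig. 19)
RHO, U_MIN, U_MAX, N = 1e-4, 0.0, 0.99, 7  # MPC values (Fig. 19)
LAM, T_INF = 0.0044105, 0.0                # reference values (Fig. 19)


def step(t, u):
    """Assumed Euler-discretized Gompertz model with drug effect u."""
    return t + H * t * (A * np.log(ETA / t) - B * u)


def reference(t0, j):
    """Assumed exponential reference, j steps after the therapy starts."""
    return T_INF + (t0 - T_INF) * np.exp(-LAM * j)


def mpc_input(t, t0, j, candidates):
    """Return the constant input over the horizon with the lowest cost."""
    pred = np.full_like(candidates, t)
    cost = N * RHO * candidates**2  # input penalty for N equal moves
    for i in range(1, N + 1):
        pred = step(pred, candidates)
        cost += (pred - reference(t0, j + i)) ** 2  # tracking error
    return candidates[np.argmin(cost)]


def simulate(t0=79.51, days=835):
    """Closed loop: re-optimize every day and apply only the first input."""
    candidates = np.linspace(U_MIN, U_MAX, 1981)
    t, history = t0, [t0]
    for j in range(days):
        u = mpc_input(t, t0, j, candidates)
        t = step(t, u)
        history.append(t)
    return np.array(history)


history = simulate()
print(history[-1], reference(79.51, 835))
```

Because the optimization is repeated at every step from the current tumor density, prediction errors are corrected by feedback instead of accumulating as they would in a precomputed open-loop schedule.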
MPC also proves to be a powerful and robust tool. When
considering errors between the system model and the
[Figure 17: panels showing the drug concentration, c50 and the drug resistance over time.]
Fig. 17. Drug concentration, c50 and drug resistance time
variation. The pink horizontal line indicates the clim parameter.
Controller and observer pole locations in the complex plane: s = −50
and s = −100, respectively. Controller and observer initial states: [0
0]^T and [1 1], respectively. PK model sampling time: 1. Resistance
model parameters: δ = 1, clim = 5, Kr = 3 and c50(0) = 1.
[Figure 18: panels showing the osteoclasts and osteoblasts activity versus time, k [day], from 0 to 2500.]
Fig. 18. Osteoclasts and osteoblasts activity. C = 5, B = 316, α1 =
3 cell day^−1, α2 = 4 cell day^−1, β1 = 0.2 day^−1, β2 = 0.02 day^−1,
g11 = 1.1, g12 = 0, g21 = −0.5, g22 = 0, r11 = 0.005, r12 = 0, r21
= 0, r22 = 0.2, C(0) = 15, B(0) = 316, y(0) = 1.
actual system output, as well as when using different methods to
discretize the Gompertz model, the controller still drives the
tumor density to the desired reference values. With the addition of a
therapy that regulates the osteoblasts activity, and consequently
recovers a healthy bone mass density, this framework provides an
optimal drug dose to the patient, killing the tumor in approximately
2 years and recovering a normal bone mass density in 5 years,
assuming model validity.
According to [13], and as seen in this work, bone marrow cancer
is able to deregulate the healthy balance between osteoclasts and
osteoblasts in order to accelerate the bone resorption process
which, in turn, promotes further tumor growth, a loop known as the
vicious cycle. Thus, the tumor
[Figure 19: panels showing the bone mass density and the tumor density over time, horizontal axis from 0 to 2500.]
Fig. 19. Bone mass and tumor densities variation. Tumor growth
model parameters: a = 0.005, b = 0.7, η = 100 and h = 1. MPC
parameters: ρ = 10^−4, umin = 0, umax = 0.99, N = 7. Initial tumor
density: y(0) = 1. Tumor density at the beginning of the therapy:
y(k1) = 79.51. Reference signal parameters: λ = 0.0044105, T∞ = 0,
T0 = y(k1).
growth model presented in [4] should be reviewed in order to model
the influence that the osteoclasts and osteoblasts activity has
on the tumor growth dynamics.
For future work, it may be interesting to fit the pharmacodynamical
model (PK, PD and drug resistance) to real data. Furthermore, the
drug toxicity associated with very high drug concentrations also has
to be considered. Finally, it was assumed that the osteoblasts
recovery drug effect u1 can be fully controlled, which is not
realistic. A more realistic therapy must take into account the
pharmacodynamical model of this drug. Although the above aspects
are essential when modelling a realistic cancer therapy, they
exceed the objective of this paper, which was limited to showing
how MPC can be applied together with the pharmacodynamical models,
in an adaptive strategy, to kill a tumor.
REFERENCES
[1] F. Michor, Y. Iwasa and M. Nowak, Dynamics of cancer
progression, Nature Rev. Cancer, 4:197−205, 2004.
[2] J. Domingues, Gompertz model: Resolution and analysis for
tumors, Journal of Mathematical Modelling and Application,
I(7):70−77, 2012.
[3] D. Caiado and J. M. Lemos, Optimal control for cancer therapy
design, Inesc-ID Technical Report, (6/2015), March 2015.
[4] B. Ayati, C. Edwards, G. Webb, and J. Wikswo, A mathematical
model of bone remodeling dynamics for normal bone cell populations
and myeloma bone disease, Biology Direct, 2010, 5:28.
[5] R. Martin and K. Teo, Optimal control of drug administration in
cancer chemotherapy, World Scientific, 1993.
[6] A. Matveev and A. Savkin, Optimal control applied to drug
administration in cancer chemotherapy: the case of several toxicity
constraints, 39th IEEE Conference on Decision and Control,
2000.
[7] P. Bumroongsri and S. Kheawhom, Optimal dosing of breast cancer
chemotherapy using robust MPC based on linear matrix inequalities,
Engineering Journal, 19(1), January 2015.
[8] T. Chen, N. Kirkby, and R. Jena, Optimal dosing of cancer
chemotherapy using model predictive control and moving horizon
state/parameter estimation, Computer methods and programs in
biomedicine, Elsevier, 108(2012), 973−983.
[9] J. Florian, J. Eiseman, and R. Parker, A nonlinear model
predictive control algorithm for breast cancer treatment, Fakultet
for Naturvitenskap og teknologi, 2004.
[10] J.M. Lemos and D. Caiado, Receding horizon control of tumor
growth based on optimal control, 23rd Mediterranean Conference on
Control and Automation, 2015.
[11] S. Komarova, R. Smith, S. Dixon, S. Sims, and L. Wahl,
Mathematical model predicts a critical role for osteoclasts
autocrine regulation in the control of bone remodeling, Bone,
Elsevier, 33:206−215, April 2003.
[12] J.M. Lemos, J. Pinheiro, and S. Vinga, A nonlinear MPC approach
to minimize toxicity in HIV-1 infection multi-drug therapy,
Controlo, 2012.
[13] Y. Zheng, H. Zhou, C. Dunstan, R. Sutherland, and M. Seibel,
The role of the microenvironment in skeletal metastasis, Journal of
Bone Oncology, 2:47−57, December 2013.
[14] S. Jambhekar and P. Breen, Basic pharmacokinetics,
Pharmaceutical Press, 2009.
[15] K. Ogata, Modern Control Engineering, Prentice Hall, 5th ed., 2009.
[16] M. Gottesman, Mechanisms of Cancer Drug Resistance, Annu. Rev.
Med., 53:615−27, 2002.
[17] A. Vahidi, A. Stefanopoulou, and H. Peng, Recursive least
squares with forgetting for online estimation of vehicle mass and
road grade: Theory and experiments, Vehicle System Dynamics:
International Journal of Vehicle Mechanics and Mobility,
43(1):31−55, 2005.