
Predictive Control Based Therapy of Bone Marrow Cancer


Hugo M. Silva
Abstract—Despite its simple appearance, bone is a complex and dynamic tissue that is remodelled continuously. Bone remodelling is a delicate process carried out essentially by two types of cells: osteoclasts, which digest and remove old bone, and osteoblasts, which create new bone. The bone microenvironment is rich in nutrients and proteins, so it is not surprising that bone is a common site for tumors to appear. Once a tumor is established, it disrupts the healthy balance between osteoclasts and osteoblasts, accelerating bone resorption so that it can spread to other organs. Furthermore, the disrupted balance causes the bone mass density to decrease, leading to other bone diseases. This paper starts by reviewing the mathematical model for tumor growth as well as the pharmacokinetic and pharmacodynamic models of the drug, so that the simulated therapy is as close as possible to reality. Then, the Nonlinear Model Predictive Control (NMPC) algorithm is used to find the optimal drug dose that reduces the tumor density. An exponential reference signal, which drives the tumor size to a vanishing value, is used in NMPC to keep tracking errors small, and the Recursive Least Squares method is used to learn the parameters of the tumor growth model, yielding an adaptive NMPC strategy. Finally, this control strategy is applied to a state-of-the-art bone microenvironment model used in cancer research, to schedule a therapy for reducing tumor density. Simulations in MATLAB show that the tumor is eliminated and the bone mass is recovered within a period of five years, assuming model validity.
Keywords: Nonlinear Model Predictive Control, Nonlinear models, Pharmacodynamical Models, Adaptive Control, Bone Marrow Cancer, Biomedical Systems.
I. INTRODUCTION
A. Motivation and Literature Review
A wide range of diseases that have in common an abnormal and unnecessary cell reproduction beyond the needs of the organism are called cancer. This uncontrolled proliferation leads to the formation of a cellular mass called a tumor. The bone microenvironment provides fertile soil for cancer cells. The reciprocal interaction between tumor and bone cells, known as the vicious cycle, supports the establishment and orchestrates the expansion of malignant tumors in the bone. The therapy used to repress cancer growth has toxic side effects on the patient. To study that interaction, several studies have been performed recently using mathematical models and optimization solvers, allowing a deeper understanding and control of tumor growth [1]–[10]. Indeed, some of those studies have used Model Predictive Control (MPC) to compute an optimal therapeutic schedule. In [7] an optimal chemotherapy dose is found by solving a convex optimization problem based on linear matrix inequalities; in [8] it is shown that, even when the system states are not all directly measurable and there are mismatches in the model parameters, MPC still provides a useful schedule for cancer treatment; in [9] MPC is used to provide a chemotherapy schedule for mice with breast cancer.
Many disease-related studies assume that it is possible to directly control the effect of a drug on the target. The effect that the body has on the drug and the effect that the drug has on the body are called pharmacokinetics (PK) and pharmacodynamics (PD), respectively. After a drug is administered, natural processes such as solubility, distribution, metabolism and elimination affect the drug concentration that reaches the target organ [14]. Studies [12] and [14] propose mathematical models to represent those interactions. Resistance to drugs is a natural process of the human body and is a major problem in cancer therapies [16]. In [12] a drug resistance model for the treatment of HIV is presented, based on the drug concentration present in the bloodstream.
Bone marrow cancer is a common type of cancer that may result from metastasis of prostate and breast cancers. Because of the vicious cycle, only a very low percentage of patients (∼20%) survive for more than five years after bone marrow cancer is diagnosed [13]. Since the bone microenvironment is fertile soil for the development of this type of cancer, it is crucial to better understand the interactions between osteoclasts, osteoblasts, bone density and the tumor. In [11] a model representing the microenvironment interactions between osteoclasts, osteoblasts and the bone mass density was developed. A tumor growth model is proposed in [4], where the model of [11] was adapted to show the relations that the tumor has with the bone microenvironment. A recent study [10] employs continuous optimal control to deal with this disease and uses a classical PI controller to recover the bone mass density.
B. Paper Contributions and Structure
The main goal of this work is to develop an adaptive control framework to schedule a therapy that reduces the density of a cancer tumor. The time evolution of the tumor density T is represented by a nonlinear function that depends on the tumor density itself and on the drug effect u. To determine which drug effect u should be applied, the MPC algorithm is used to solve an optimization problem. To that end, a quadratic cost function is used that weights the drug effect u and the error between the tumor density T and a reference signal Tref. An exponential reference signal is used to generate smaller errors between T and Tref. The PK and the PD of a drug are modelled so that the drug dose is the manipulated variable, which better reflects reality.
Summarizing, a drug dose d (an impulse signal) generates a drug concentration c given by the PK model, which in turn produces a drug effect u given by the PD model. Since MPC computes an optimal drug effect u∗, but only the drug dose d can be manipulated, it is necessary to find the optimal drug concentration c∗ that corresponds to the optimal drug effect u∗. To do so, the inverse PD model is defined and used. To determine which drug dose d generates a drug concentration c as close as possible to the optimal drug concentration c∗, a controller with an asymptotic observer is designed (see Figure 1). Note that the drug dose could be the number of pills or even the number of chemotherapy cycles. Since the therapy is not specified in this work, the drug dose d is dimensionless.
[Figure 1: signal chain Controller → PD−1 → PK control → PD → System, with signals u∗, c∗, c and u.]
Fig. 1. Block diagram.
To find the tumor density model that best fits a patient, the Recursive Least Squares (RLS) method is used to learn the model parameters from data in real time, yielding an adaptive MPC algorithm. The framework developed is used with the model of [4] to suppress the tumor and to break the vicious cycle. The bone mass density recovery is enhanced using a classical discrete PI algorithm.
After this introduction, which presents the motivation, the literature review and the main contributions, the paper is structured as follows: the mathematical models of PK, PD and drug resistance are defined, as well as the time variation of the tumor density. Then, the MPC is formulated. Some MPC performance characteristics are studied and the developed framework is used to eliminate a bone marrow tumor. To recover a healthy bone mass density, a discrete PI controller is used.
II. MODELS
A. Pharmacodynamical model
The pharmacodynamical model (Figure 2) is composed of the PK, PD and drug resistance models, and aims to bring the conditions of a real clinical therapy into the simulation. The PK control is designed as in Figures 3 and 4.
[Figure 2: blocks PD−1, PK control, PD and R; signals c∗, c and c50.]
Fig. 2. Pharmacodynamical model (PC) - block diagram.
[Figure 3: the difference e between c∗ and c feeds the Observer, whose state x passes through the gain −K to give the dose d applied to the PK block, which outputs c.]
Fig. 3. Pharmacokinetics control model (PK control) - block diagram. The PK block represents the transfer function (1).
[Figure 4: observer internals, including the blocks −L and B.]
Fig. 4. Observer - block diagram.
1) PK model: The PK model of a drug is represented by a linear time-invariant system with two real poles and unit static gain,
$$C(s) = \frac{p_1 p_2}{(s + p_1)(s + p_2)}, \qquad (1)$$
where s is the complex frequency in rad · s−1 and $p_{1,2} \in \mathbb{R}^+$ are the pole magnitudes. This model gives the drug concentration c in the bloodstream as a function of time t for the therapy dose administered. Figure 5 shows the impulse response of the PK model.
[Figure 5: drug concentration c versus time, horizontal axis from 0 to 2.5 × 10^4.]
Fig. 5. PK model impulse response. p1 = p2 = 0.5 · 10−3. The drug is fully eliminated from the body after approximately 7 hours.
Consider now the equivalent state space representation of the transfer function (1), represented by matrices A, B and C in the model
$$\dot{x} = Ax + Bd, \qquad y = Cx, \qquad (2)$$
where $x \in \mathbb{R}^2$ is the state and y is the blood drug concentration. This system is fully controllable and observable, and thus a controller with an asymptotic observer can be designed to determine the optimal drug dose d. Let the system dynamics with a controller and an observer be given by
$$d = -Kx, \qquad \dot{x} = (A - BK - LC)\,x - Le, \qquad (3)$$
where K and L are gain vectors that may be computed using a pole placement technique. The closed loop system can be defined, in an equivalent way, by the following new matrices
$$A_{CL} = A - BK - LC, \qquad B_{CL} = -L, \qquad C_{CL} = C. \qquad (4)$$
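As a quick numerical check of the open-loop PK model (1), the sketch below evaluates its impulse response in closed form and integrates it to confirm the unit static gain. It is an illustration only: the function names are hypothetical, the closed-form expressions are the standard inverse Laplace transforms of (1), and the pole value is the one quoted in the Figure 5 caption. The controller and observer of (3) and (4) are not modelled here.

```python
import math


def pk_impulse_response(t, p1, p2, dose=1.0):
    """Concentration at time t after an impulsive dose at t = 0,
    for C(s) = p1*p2 / ((s + p1)(s + p2))."""
    if t < 0.0:
        return 0.0
    if math.isclose(p1, p2):
        # Repeated pole: inverse Laplace transform of p^2 / (s + p)^2.
        return dose * p1 * p1 * t * math.exp(-p1 * t)
    # Distinct poles: partial-fraction expansion of (1).
    return dose * p1 * p2 / (p2 - p1) * (math.exp(-p1 * t) - math.exp(-p2 * t))


def response_area(p1, p2, t_end, steps=100_000):
    """Trapezoidal integral of the impulse response on [0, t_end]."""
    dt = t_end / steps
    total = 0.5 * (pk_impulse_response(0.0, p1, p2)
                   + pk_impulse_response(t_end, p1, p2))
    for i in range(1, steps):
        total += pk_impulse_response(i * dt, p1, p2)
    return total * dt


p = 0.5e-3  # pole value taken from the Fig. 5 caption
area = response_area(p, p, 40_000.0)  # close to 1 when the static gain is 1
peak_time = 1.0 / p                   # repeated pole: response peaks at t = 1/p
```

For the repeated pole the response peaks at t = 1/p and has essentially vanished by the end of the horizontal axis of Figure 5, which agrees with the elimination time quoted in its caption.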
The discrete response of the PK model with the controller and the asymptotic observer is given by the response for non-homogeneous systems, that is composed of the solution of the homogeneous equation and the input signal, using the superposition theorem [15]. With the system defined by matrices (4), the drug concentration evolution in discrete time, for Dirac input signals d(k), is given by
$$c(k + \Delta) = C_{CL} \left( e^{\Delta A_{CL}}\, x(k) + e^{\Delta A_{CL}} B_{CL}\, d(k) \right), \qquad (5)$$
where Δ is the sampling time.
2) PD model: The PD of a drug is represented by the Hill equation, assumed to be a static nonlinear relation given by
$$u(k) = \frac{c(k)}{c_{50}(k) + c(k)}, \qquad (6)$$
where $c_{50} \in \mathbb{R}^+$ is the drug concentration for which the drug effect is half of the maximum drug effect. It is assumed that c50 may vary in time depending on the resistance model explained below. The PD model has a horizontal asymptote as the drug concentration tends to infinity, meaning that the drug effect saturates as the drug concentration increases. Therefore the PD model is normalized to vary between umin = 0 (no drug effect) and umax = 1 (maximum drug effect). The drug effect as a function of the drug concentration is presented in Figure 6.
[Figure 6: drug effect versus drug concentration, with the concentration axis on a logarithmic scale from 10^−2 to 10^2.]
Fig. 6. Pharmacodynamics - drug concentration in logarithmic scale. c50(k) = 1.
3) Drug resistance model: If the drug concentration c is below a given threshold clim, only weak cells are killed, and the cells that reproduce are resistant to that drug concentration. This phenomenon is called drug resistance (Figure 7). Let r(k) be the drug resistance level at time k
r(k) = r(k − 1) + δ ·max(0, clim − c(k)), (7)
where δ is the sampling interval and clim is the limit above which no resistance to the drug is developed. When drug resistance is developed by the body, a higher drug concentration c is needed to achieve the same drug effect u. This can be modelled by increasing the c50 parameter proportionally to the drug resistance level [12]. Let c50 be affected by the drug resistance r as follows
c50(k) = c50(0) · (1 +Kr · r(k)), (8)
where c50(0) is the initial value of the c50 parameter and $K_r \in \mathbb{R}_0^+$ is a parameter related to the ability of the disease to develop resistance to the drug. In Figure 2, the block R is composed of equations (7) and (8).
Fig. 7. Resistance model with forward Euler integration. δ = 1.
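The resistance recursion (7) and the c50 update (8) can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function names and the short concentration sequence are hypothetical, while clim = 5, Kr = 3, c50(0) = 1 and δ = 1 are the values quoted in the Figure 17 caption.

```python
def resistance_step(r_prev, c, c_lim, delta=1.0):
    """Eq. (7): resistance accumulates only while c is below c_lim."""
    return r_prev + delta * max(0.0, c_lim - c)


def c50_from_resistance(c50_0, k_r, r):
    """Eq. (8): c50 grows linearly with the resistance level r."""
    return c50_0 * (1.0 + k_r * r)


concentrations = [2.0, 6.0, 4.0]  # hypothetical concentration samples
r = 0.0
history = []
for c in concentrations:
    r = resistance_step(r, c, c_lim=5.0)
    history.append((r, c50_from_resistance(1.0, 3.0, r)))
# history == [(3.0, 10.0), (3.0, 10.0), (4.0, 13.0)]
```

Note that in this model the resistance level never decreases: a sample above clim leaves r unchanged, so c50 can only stay constant or grow.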
To finish the explanation of the Figure 2 block diagram, the inverse pharmacodynamics model PD−1 is the function that gives the drug concentration as a function of the drug effect. The graph of the PD−1 model is the transpose of the graph shown in Figure 6. Although the drug effect u varies between umin = 0 and umax = 1, the inverse pharmacodynamics model is not defined when u = 1 (the corresponding drug concentration would be infinite). For that reason, from now on, umax is limited to 99% of the maximum drug effect (umax = 0.99) for both the PD and PD−1 models.
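Solving the Hill equation (6) for the concentration gives c = c50 · u / (1 − u), which makes the singularity at u = 1 explicit. The sketch below implements both directions with the 0.99 cap described above; the function names are hypothetical and the clipping of out-of-range effects is an assumption made for the example.

```python
import math


def pd_effect(c, c50):
    """Hill equation (6): drug effect for a concentration c >= 0."""
    return c / (c50 + c)


def pd_inverse(u, c50, u_max=0.99):
    """Inverse PD model: concentration that produces the effect u.

    The effect is clipped to [0, u_max] (an assumption of this sketch),
    so the division by (1 - u) is always well defined.
    """
    u = min(max(u, 0.0), u_max)
    return c50 * u / (1.0 - u)


half_effect = pd_effect(1.0, 1.0)                    # 0.5 at c = c50
round_trip = pd_inverse(pd_effect(2.5, 1.0), 1.0)    # recovers about 2.5
capped = pd_inverse(1.0, 1.0)                        # about 99 instead of infinity
```

With c50 = 1, the cap umax = 0.99 corresponds to a concentration of about 99, which bounds the concentration that the PK control can be asked to track.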
B. Bone microenvironment model with tumor and drug treatment dynamics
1) Tumor growth model: It is assumed that a cell-kill drug is administered to the patient to diminish the tumor density. Thus, the tumor growth model used in [4] is slightly changed to a more realistic one [5]. Consider that the variation of the tumor density T(t) as a function of continuous time t is given by¹
$$\dot{T} = aT \log\left(\frac{\eta}{T}\right) - bTu_2, \qquad (9)$$
where $a \in \mathbb{R}^+$ is a parameter related to the tumor growth rate, $b \in \mathbb{R}_0^+$ is the tumor sensitivity to the drug, $\eta \in \mathbb{R}^+$ is the plateau level, $T \in\, ]0, \eta[$ is the tumor density and $u_2 \in \mathbb{R}_0^+$ is the tumor cell-kill drug effect. In the absence of treatment, model (9) leads to an S-shaped growth, with η being a horizontal asymptote. Although the drug effect u2 causes the tumor growth to decrease, the effect of u2 also decreases when T becomes small, so there is no danger that T is driven to meaningless negative values.
¹ For notational simplicity, the time dependence of functions is omitted. The dot notation is used for time derivatives.
2) Bone microenvironment model: Consider the following nonlinear model, presented in [4], where C(t) and B(t) represent, respectively, osteoclast and osteoblast activity, and Z(t) represents the bone mass density, as functions of continuous time t
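[Equation (10), the system of differential equations for C, B and Z, is only partially legible in this copy. The surviving fragment of the osteoclast equation contains the terms α1, C, g11, r11 and T; see [4] for the full system.]
(10)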
where g••, r••, α•, β• and k• are bone microenvironment model parameters, and $\bar{C}$ and $\bar{B}$ are the mean values of the osteoclast and osteoblast functions [11]. The variable u1 represents the osteoblast recovery drug effect. Since the control algorithms used operate in discrete time, models (9) and (10) are approximated using the 4th order Runge-Kutta method with step size h. Hereafter, consider y(k) as the discrete version of T(t).
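A minimal sketch of the 4th order Runge-Kutta discretization, applied here only to the tumor model (9), is given below. The function names are hypothetical, and holding the drug effect constant over each step (zero-order hold) is an assumption of this example; the parameter values are those quoted in the Figure 9 caption.

```python
import math


def gompertz_rate(T, u, a, b, eta):
    """Right-hand side of (9): dT/dt = a*T*log(eta/T) - b*T*u."""
    return a * T * math.log(eta / T) - b * T * u


def rk4_step(T, u, h, a, b, eta):
    """One classical RK4 step of size h, with u held constant over the step."""
    k1 = gompertz_rate(T, u, a, b, eta)
    k2 = gompertz_rate(T + 0.5 * h * k1, u, a, b, eta)
    k3 = gompertz_rate(T + 0.5 * h * k2, u, a, b, eta)
    k4 = gompertz_rate(T + h * k3, u, a, b, eta)
    return T + h / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def simulate(T0, inputs, h, a, b, eta):
    """Return [y(0), y(1), ..., y(len(inputs))] for a sequence of drug effects."""
    ys = [T0]
    for u in inputs:
        ys.append(rk4_step(ys[-1], u, h, a, b, eta))
    return ys


a, b, eta, h = 0.15, 1.5, 100.0, 1.0 / 7.0
untreated = simulate(97.5, [0.0] * 70, h, a, b, eta)  # grows towards eta
treated = simulate(97.5, [0.5] * 70, h, a, b, eta)    # decreases under the drug
```

Without treatment the Gompertz equation has the closed-form solution T(t) = η · exp(log(T0/η) · e^(−a t)), which offers a convenient way to check the accuracy of the discretization for a given step size.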
III. RECURSIVE LEAST SQUARES METHOD
The adaptive MPC strategy is obtained by using the RLS method to estimate the model parameters. The new estimate of the parameters θ(k+1) is found as a function of the previous estimate θ(k), the system input u(k) and system output y(k+ 1) [17]. Let the method be defined by
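$$\theta(k+1) = \theta(k) + K_g(k+1)\left[ y(k+1) - \phi^T(k)\,\theta(k) \right],$$
$$K_g(k+1) = \frac{P(k)\,\phi(k)}{1 + \phi^T(k)\,P(k)\,\phi(k)},$$
$$P(k+1) = P(k) - \frac{P(k)\,\phi(k)\,\phi^T(k)\,P(k)}{1 + \phi^T(k)\,P(k)\,\phi(k)}, \qquad (11)$$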
where Kg is the Kalman gain, P is the covariance matrix and φ is the vector with the model dependent variables such that y(k + 1) = φ(k)T θ. By applying the 4th order Runge-Kutta method to discretize the Gompertz model (9), the result is an accurate discrete model. However the model is nonlinear in the parameters. Thus, consider for the purpose of parameter estimation that the Gompertz model (9) is discretized by the 1st order Euler method. The discrete Gompertz model with this last method is given by
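$$y(k+1) = y(k) + h\left[ a\,y(k)\log\left(\frac{\eta}{y(k)}\right) - b\,u(k)\,y(k) \right], \qquad (12)$$
where $h \in \mathbb{R}^+$ is the discretization step size. Rewriting in the RLS notation,
$$\phi(k) = \left[\; h\,y(k)\log\left(\frac{\eta}{y(k)}\right) \quad -h\,u(k)\,y(k) \;\right]^T, \qquad (13)$$
the method is fully described, assuming that the plateau level η is well known.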
As the MPC algorithm and the RLS method use different discretization methods to obtain a discrete Gompertz model, it is expected that the parameters estimated with the Euler method deviate from the real ones.
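The sketch below runs the RLS recursion (11) on data generated by the Euler model (12), so that the true parameters are recoverable and the estimator can be checked in isolation. Several choices are assumptions of this example and not taken from the paper: the regressed quantity is the increment y(k+1) − y(k), the drug effect is a hypothetical sinusoid chosen to excite both parameters, and all names are illustrative. The initial conditions θ(0) = [50 50]ᵀ and P(0) = 10^4 I follow the Figure 15 caption.

```python
import math


def regressor(y, u, h, eta):
    """Eq. (13): regressor of the Euler-discretized Gompertz model."""
    return [h * y * math.log(eta / y), -h * u * y]


def rls_update(theta, P, phi, z):
    """One step of (11) for two parameters; z is the regressed output."""
    Pphi = [P[0][0] * phi[0] + P[0][1] * phi[1],
            P[1][0] * phi[0] + P[1][1] * phi[1]]
    denom = 1.0 + phi[0] * Pphi[0] + phi[1] * Pphi[1]
    gain = [Pphi[0] / denom, Pphi[1] / denom]
    err = z - (phi[0] * theta[0] + phi[1] * theta[1])
    theta = [theta[0] + gain[0] * err, theta[1] + gain[1] * err]
    # P stays symmetric, so phi^T P equals the transpose of P phi.
    P = [[P[i][j] - Pphi[i] * Pphi[j] / denom for j in range(2)]
         for i in range(2)]
    return theta, P


a_true, b_true, eta, h = 0.15, 1.5, 100.0, 1.0 / 7.0
theta_hat = [50.0, 50.0]
P = [[1e4, 0.0], [0.0, 1e4]]
y = 97.5
for k in range(300):
    u = 0.05 + 0.04 * math.sin(0.7 * k)  # hypothetical exciting input
    phi = regressor(y, u, h, eta)
    y_next = y + phi[0] * a_true + phi[1] * b_true  # Euler model (12)
    theta_hat, P = rls_update(theta_hat, P, phi, y_next - y)
    y = y_next
```

Because the data here come from the Euler model itself, the estimates approach the true values; when the data are generated by the Runge-Kutta model instead, a residual deviation of the kind described above is to be expected.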
IV. NONLINEAR MPC OF TUMOR GROWTH
A. MPC formalization
At time k it is desired to determine which value u2(k) should take to reduce the tumor density. This is done by solving the following constrained optimization problem in a receding horizon strategy
$$\min_{U_k} \; J(y(k), U_k) \quad \text{s.t.} \quad U_{min} \le U_k \le U_{max}, \qquad (14)$$
where Umin = umin ·1 and Umax = umax ·1 are the constant constraint vectors with the same dimension as U . The cost function J is the quadratic cost function defined as
$$J(y(k), U_k) = (Y_{k+1} - Y^*_{k+1})^T (Y_{k+1} - Y^*_{k+1}) + \rho\, U_k^T U_k, \qquad (15)$$
where $Y_{k+1}$ is the predicted output vector²
$$Y_{k+1} = [\,y(k+1|k) \;\; y(k+2|k) \;\; \ldots \;\; y(k+N|k)\,]^T, \qquad (16)$$
N is the prediction horizon and $\rho \in \mathbb{R}^+$ is a tuning parameter. The reference signal vector $Y^*_{k+1}$ that $Y_{k+1}$ should follow is given by
$$Y^*_{k+1} = [\,Y_{ref}(k+1) \;\; Y_{ref}(k+2) \;\; \ldots \;\; Y_{ref}(k+N)\,]^T, \qquad (17)$$
and $U_k$ is the virtual inputs control vector
$$U_k = [\,u_2(k) \;\; u_2(k+1|k) \;\; \ldots \;\; u_2(k+T_c-1|k)\,]^T. \qquad (18)$$
With an initial condition y(0), the solution of problem (14) is
$$U_k^* = \arg\min_{U_k} J(y(k), U_k) \quad \text{s.t.} \quad U_{min} \le U_k \le U_{max}, \qquad (19)$$
where the inequality is to be taken elementwise. Note that only the first element of U∗k, u∗2(k), is actually applied to the system. The same procedure is repeated at time k + 1, in a receding horizon strategy, to take the system output feedback into account, which gives a level of robustness to the controller.
² y(b|a) ≡ system output at time b predicted at time a, with a ∈ N0, b ∈ N and b > a. For notational simplicity let y(a|a) = y(a). The same notation is used with the system input u.
To solve the optimization problem (14), the fmincon MATLAB function is used. This function uses a quasi-Newton algorithm, which needs an initial estimate of the solution U∗ for every time k. The first estimate of U∗, U0(0), is a vector of ones with length N. Therapeutically, this estimate is equivalent to the worst case scenario where the drug effect is set to the maximum admissible value over the whole prediction horizon. Although the choice of U0(0) influences the solution of (19), the initialization U0(0) is not crucial to solve the optimization problem, as will be seen below. The next estimates are given by
$$U_0(k) = [\,U^*_{k-1}(2:\mathrm{end})^T \;\; 1\,]^T, \qquad (20)$$
for k > 0.
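The two building blocks that the optimizer needs, the cost (15) evaluated over model predictions and the shifted initial guess (20), can be sketched as follows. This is not the MATLAB implementation used in the paper and it does not include the constrained solver itself: the function names are hypothetical, the predictions use a Runge-Kutta step of model (9) with the drug effect held constant over each step, and the input vector is assumed to have the same length N as the prediction horizon.

```python
import math


def gompertz_rate(T, u, a, b, eta):
    """Right-hand side of the tumor model (9)."""
    return a * T * math.log(eta / T) - b * T * u


def rk4_step(T, u, h, a, b, eta):
    """One RK4 step of size h with u held constant (an assumption)."""
    k1 = gompertz_rate(T, u, a, b, eta)
    k2 = gompertz_rate(T + 0.5 * h * k1, u, a, b, eta)
    k3 = gompertz_rate(T + 0.5 * h * k2, u, a, b, eta)
    k4 = gompertz_rate(T + h * k3, u, a, b, eta)
    return T + h / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def predict(y0, U, h, a, b, eta):
    """Predicted outputs [y(k+1|k), ..., y(k+N|k)] for the inputs U."""
    Y, y = [], y0
    for u in U:
        y = rk4_step(y, u, h, a, b, eta)
        Y.append(y)
    return Y


def cost(y0, U, Y_ref, rho, h, a, b, eta):
    """Quadratic cost (15): tracking error plus weighted drug effect."""
    Y = predict(y0, U, h, a, b, eta)
    tracking = sum((y - r) ** 2 for y, r in zip(Y, Y_ref))
    return tracking + rho * sum(u * u for u in U)


def warm_start(U_prev_opt):
    """Initial guess (20): drop the applied input, append a trailing 1."""
    return list(U_prev_opt[1:]) + [1.0]


params = dict(h=1.0 / 7.0, a=0.15, b=1.5, eta=100.0)
U = [0.5] * 7
Y = predict(97.5, U, **params)
J = cost(97.5, U, Y, 1e-4, **params)  # only the input penalty remains
```

In a receding horizon loop, a bound-constrained optimizer would minimize `cost` over U starting from `warm_start` of the previous solution, apply the first element, and repeat at the next sample.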
B. MPC performance and features
In order to assess how the MPC performance depends on its configuration parameters, some simulation results are presented below, in Figures 9 to 13, with the RLS estimator turned off. Let the "5% rise time" be the time that y(k) takes to reach 5% of the difference between the initial tumor density y(0) and the equilibrium tumor density y(+∞), and let the "ratio Teq/T∞" be the offset between the equilibrium of y, Teq, and the equilibrium of the reference signal, T∞. These two performance characteristics are studied as functions of the maximum drug effect umax and the optimization parameter ρ.
[Figure 8: blocks MPC, PC, T, RLS and a unit delay z−1; signals y(k) and θ(k + 1).]
Fig. 8. Simulation block diagram. The T block illustrates the tumor growth dynamics given by the Gompertz model.
The experiment is to follow an exponential reference given by
$$Y_{ref}(k) = T_\infty + (T_0 - T_\infty)\,e^{-\lambda k}, \qquad (21)$$
with initial tumor density y(0) = 97.5, T∞ = 10 and where $\lambda \in \mathbb{R}^+$ is a parameter that defines how quickly the reference signal varies. The exponential reference signal starts at the initial tumor density, that is, T0 = y(0).
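The reference (21) and the reference vector of (17) are straightforward to generate. The sketch below uses the values quoted in the Figure 9 caption; the function names are hypothetical, and k is treated as the sample index.

```python
import math


def exponential_reference(k, T0, T_inf, lam):
    """Eq. (21): reference tumor density at sample k."""
    return T_inf + (T0 - T_inf) * math.exp(-lam * k)


def reference_vector(k, N, T0, T_inf, lam):
    """Eq. (17): references for samples k+1, ..., k+N."""
    return [exponential_reference(k + i, T0, T_inf, lam)
            for i in range(1, N + 1)]


Y_star = reference_vector(0, 7, T0=97.5, T_inf=10.0, lam=0.2)
```

A larger λ makes the reference fall faster, which asks for a more aggressive drug effect early in the therapy; a smaller λ spreads the reduction over a longer period.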
Figure 9 shows the typical result of simulating the block diagram of Figure 8. The tumor density tracks the reference signal and the system reaches equilibrium after approximately 5 weeks of therapy. The value chosen for the step size h makes the controller compute the system input on a daily basis. The bottom graph of Figure 9 shows the daily drug dose that must be administered to the patient.
To study the performance of the controller, the block diagram of Figure 8 is simulated for different umax with a fixed ρ, and vice versa. Figure 10 shows that as umax decreases, the system response y becomes slower. Moreover, when umax is small, MPC cannot drive y to T∞, even if the simulation time is increased. As seen, umax is set to 1, and by inspection this value allows the MPC to drive y to T∞ in an admissible time with no offset. Figure 11 shows that as ρ increases, the bandwidth becomes smaller and the system response slower. Furthermore, more robustness is given to the system because the high frequency dynamics are attenuated. Moreover, the offset grows as ρ increases.
[Figure 9: stacked plots over a horizontal axis from 0 to 5; the bottom graph shows the drug dose d.]
Fig. 9. System input and output with an exponential reference signal. Gompertz model parameters: a = 0.15, b = 1.5, η = 100, h = 1/7. Tref parameters: y0 = 97.5, T∞ = 10, λ = 0.2.
[Figure 10: two plots versus umax from 0 to 1, one of them showing the ratio Teq/T∞.]
Fig. 10. 5% rise time as a function of umax. Tumor growth parameters: a = 0.15, b = 1.5, η = 1, h = 1/7. Reference quickness parameter: λ = 0.2. MPC parameters: ρ = 10^−4, N = 7.
Two other important characteristics to study are the simulation cost and the simulation time as functions of the prediction horizon N. When analysing the cost as a function of the prediction horizon, two cases were considered: in Test 1, the MPC tumor density model has no parameter errors, while in Test 2 the MPC tumor density model has ±20% parameter errors. Let J be the cost that evaluates the whole experiment, with D samples, defined as
$$J(N) = \frac{1}{D}\left[ (Y_J - Y_J^*)^T (Y_J - Y_J^*) + \rho\, U_J^T U_J \right], \qquad (22)$$
where YJ is a vector that contains the system outputs, Y ∗J is a vector that contains the reference signal and UJ is a vector that contains the system inputs.
By analysing Figure 12, Test 1 has a smaller value of J than Test 2. This happens because the difference between T and Tref does not vanish in Test 2, due to the differences between the model that MPC knows and the real system behaviour. There is no specific rule to choose the best number of predictors N. There is, however, a rule of thumb suggesting that the chosen N should lie after the knee of the cost curve, because more predictors will cause only a very small decrease of the cost. For the Test 1 case, the cost curve knee is at N = 2, which means that any N > 2 is probably a good choice (for the Test 2 case, the cost curve knee is approximately at N = 5). However, large values of N do not reduce the cost J proportionally: beyond the knee, the variation of J as N increases is almost null. Furthermore, when N increases, the simulation time increases exponentially (Figure 12). Thus, the prediction horizon N was set to 7. Since the algorithm step size h is expressed in days, this prediction horizon means that MPC looks one week into the future. Finally, there is no significant difference in the simulation time between the two tests, which is expected, since MPC has to make exactly the same computations for both tests.
[Figure 11: two plots versus ρ, one of them showing the ratio Teq/T∞.]
Fig. 11. 5% rise time as a function of ρ. MPC parameters: umin = 0, umax = 0.99, N = 7.
[Figure 12: cost and simulation time versus N from 0 to 20, for Test 1 and Test 2.]
Fig. 12. Cost and simulation time of having a prediction horizon of size N. MPC parameters: ρ = 10^−4, umin = 0, umax = 0.99. The vertical black line marks the chosen prediction horizon N = 7.
The initialization of the solution to the optimization problem (19), U0(0), was chosen to be a vector of ones, representing the worst case, a non optimal therapy where the drug effect is always at its maximum value.
[Figure 13: system input and output over a horizontal axis from 0 to 5, for Test1 to Test4.]
Fig. 13. System input and output for different optimization problem initializations. Test1: U0(0) = 0.1 · 1. Test2: U0(0) = 0.5 · 1. Test3: U0(0) = 1. Test4: U0(0) is a random vector between 0 and 1, whose elements may differ.
Besides the biological interpretation of this choice, any value of this vector leads to the same drug effect therapy and consequently the same system response. Figure 13 shows the system input and output for four different U0(0). Although the system input and output differ in the first weeks, all the system inputs converge to the same result, showing that the initialization of the optimization problem is not crucial.
V. BONE MASS RECOVERY
In this section, the framework represented by Figure 8 is used with the model of [4], represented by the block diagram of Figure 14, to decrease the tumor density and to recover bone mass. Two time instants are defined: k1 is the same as in [4] and represents the beginning of both treatments (tumor cell-kill drug and osteoblast regulator drug), and k2 is the instant at which the tumor is considered to be eliminated. To recover the bone mass density, a discrete PI controller with a forward Euler integrator is used to compute the drug effect u1, assumed to be the manipulated variable. The controller is given by the following difference equation [18]
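[Equations (23) and (24), the PI difference equation and its coefficients, are only partially legible in this copy. The surviving fragments are "u1(k) = −a1 ..." from (23) and "a1 = −1" from (24).]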
and e(k) is the error between the osteoblasts reference signal and the output at time k.
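Since the paper's difference equation is not fully legible here, the sketch below shows one common textbook form of a discrete PI controller with a forward Euler integrator, written in incremental form. It is an assumption, not necessarily the author's equation: the function name, the gains and the constant error sequence are all illustrative. The form is at least consistent with the surviving coefficient a1 = −1, since the previous output u(k − 1) enters with unit weight.

```python
def pi_step(u_prev, e, e_prev, kp, ki, h):
    """Incremental PI with forward Euler integration:
    u(k) = u(k-1) + kp*(e(k) - e(k-1)) + ki*h*e(k-1)."""
    return u_prev + kp * (e - e_prev) + ki * h * e_prev


# Hypothetical gains and a constant unit error, to show the integral action.
kp, ki, h = 2.0, 0.5, 1.0
u, e_prev = 0.0, 0.0
outputs = []
for _ in range(3):
    e = 1.0
    u = pi_step(u, e, e_prev, kp, ki, h)
    outputs.append(u)
    e_prev = e
# outputs == [2.0, 2.5, 3.0]
```

The first output is the proportional kick kp · e, and each later sample adds ki · h · e, which is the ramp produced by integrating a constant error with forward Euler.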
[Figure 14: blocks PID, B, C and Z; signals Bref(k + 1), B(k), C(k), B(k + 1), C(k + 1) and Z(k + 1).]
Fig. 14. Bone microenvironment block diagram. The B, C and Z blocks illustrate the osteoblast, osteoclast and bone mass density dynamics, respectively. The reference signal Bref is equal to the osteoblast steady state $\bar{B}$.
The approach is the following: between k1 and k2 both the osteoblast regulator drug and the tumor cell-kill drug are administered to the patient, and after k2 only the osteoblast regulator drug is administered. This decision is based on [13], which suggests that, to break the vicious cycle, the number of osteoclasts must return to a normal value, so that excessive bone resorption stops and the tumor does not spread to other sites.
Figure 15 shows the estimation of the tumor growth model parameters from k = k1. The estimates converge in approximately 10 days. Although a is close to the real value, b shows an error when compared to the real one, which is expected given the difference between the Euler and the Runge-Kutta discretization methods. It is acceptable to consider, for instance, that this difference is the result of an error in the sensors that measure the tumor density. Figures 18 and 19 show that, between k = 0 and k1, where no drugs are applied, the tumor density starts to increase, disrupting the balance between osteoclasts and osteoblasts, and consequently the bone mass density starts to decrease from the steady state.
The presence of the vicious cycle is also evident. The amplitude of the osteoclasts and osteoblasts dynamics changes: osteoclasts increase in number, destroying more bone than is created, and the period of the oscillations is also affected. Between k1 and k2, when both drugs are applied, the tumor density decreases, following the reference signal given by (21), until the tumor is considered extinct (T < 2%); the balance between osteoclasts and osteoblasts starts to recover, as does the bone mass density. After k2, when only the osteoblasts regulator drug is active, the bone mass density continues to recover until it reaches the steady state Z = 100. The variation of the cell kill drug concentration c, the c50 parameter and the drug resistance level r is shown in Figure 17. These graphs are meaningful only while the tumor cell kill drug therapy is on, which is why these quantities are set to 0 after k2.
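As a quick consistency check of the reference trajectory, the sketch below assumes that (21) has the exponential form T_ref(k) = T∞ + (T0 − T∞) exp(−λ(k − k1)). This form agrees with the parameters listed in the Fig. 19 caption but is not restated in this section. With those values, the reference reaches the 2% extinction threshold at about k2 = 1435, that is, roughly 835 days after the start of therapy. Function and variable names are illustrative.

```python
import math

K1, K2 = 600, 1435            # therapy instants from the Fig. 16 caption
LAMBDA = 0.0044105            # decay rate from the Fig. 19 caption
T_INF, T0 = 0.0, 79.51        # T_inf and T0 = y(k1) from the Fig. 19 caption


def reference(k):
    """ASSUMED form of the exponential reference (21), valid for k >= K1."""
    return T_INF + (T0 - T_INF) * math.exp(-LAMBDA * (k - K1))


def days_to_threshold(threshold=2.0):
    """Days after K1 at which the reference crosses the given threshold."""
    return math.log((T0 - T_INF) / (threshold - T_INF)) / LAMBDA
```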
[Figure: estimates of a and b versus time, 0 to 20 days]
Fig. 15. Tumor growth model parameters estimation. The estimation starts when the cell kill drug is first administered. This figure shows the estimates for the first 20 days of therapy. The red horizontal lines show the real model parameters, a = 0.005 and b = 0.7. RLS method initial conditions: θ(0) = [a b]^T = [50 50]^T, P(0) = 10^4 I, where I is the identity matrix. This value of P(0) reflects the low confidence in the starting estimate of the parameters.
[Figure: drug effects u1 and u2 and drug dose d2 versus time, 0 to 2500 days]
Fig. 16. Drug effects u1 and u2 and drug dose d2 time variation. The vertical lines indicate the therapy time instants k1 = 600 and k2 = 1435. Discrete PID controller parameters: Kd = 0, Ki = −10^−8 and Kp = −10^−7.
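For illustration only, a discrete PID law with the gains listed for Fig. 16 can be sketched as below. The positional form, the unit sampling time absorbed into the gains, and the error definition e(k) = Bref − B(k) are assumptions; the exact controller structure is not given in this section, and the class name is hypothetical.

```python
class DiscretePID:
    """Positional discrete PID: u(k) = Kp*e(k) + Ki*sum(e) + Kd*(e(k) - e(k-1))."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0     # running sum of the error
        self.prev_error = 0.0   # error at the previous sample

    def step(self, error):
        """Return the control action for the current error sample."""
        self.integral += error
        derivative = error - self.prev_error
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# Gains from the Fig. 16 caption; with Kd = 0 the law reduces to a PI controller.
pid = DiscretePID(kp=-1e-7, ki=-1e-8, kd=0.0)
```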
VI. CONCLUSIONS
The combination of the MPC and RLS methods provides an adaptive optimization solver, applied here to the treatment of bone marrow cancer. Optimizing the therapy can be formulated as a control problem whose solution provides a drug dose schedule. This open-loop solution can be transformed into a feedback control law, with all the inherent advantages, by using the receding horizon strategy. In the simulations presented, MPC drives the tumor density to low values, interrupting the vicious cycle and stopping the tumor from spreading to other organs.
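The receding horizon idea can be sketched in a few lines. This is a deliberately simplified illustration, not the NMPC algorithm used in the simulations: it assumes the Euler-discretized model dy/dt = a y ln(η/y) − b u y, an exponential reference, a quadratic cost with weight ρ on the input, an input held constant over the horizon, and a grid search in place of a nonlinear programming solver. The numerical values are taken from the Fig. 19 caption; the grid and all names are hypothetical.

```python
import math

A, B, ETA, H = 0.005, 0.7, 100.0, 1.0        # tumor model values (Fig. 19 caption)
RHO, U_MIN, U_MAX, N = 1e-4, 0.0, 0.99, 7    # MPC values (Fig. 19 caption)
LAMBDA, T0 = 0.0044105, 79.51                # reference values (Fig. 19 caption)
# Hypothetical grid of candidate inputs (step 0.005) replacing an NLP solver.
GRID = [U_MIN + i * (U_MAX - U_MIN) / 198 for i in range(199)]


def model_step(y, u):
    """Forward-Euler step of the ASSUMED model dy/dt = a*y*ln(eta/y) - b*u*y."""
    return y + H * (A * y * math.log(ETA / y) - B * u * y)


def reference(j):
    """ASSUMED exponential reference, j days after the start of therapy."""
    return T0 * math.exp(-LAMBDA * j)


def horizon_cost(y, u, refs):
    """Cost over the horizon with u held constant (a simplification)."""
    cost = 0.0
    for r in refs:
        y = model_step(y, u)
        cost += (y - r) ** 2 + RHO * u * u
    return cost


def simulate(days):
    """Receding horizon loop: optimize, apply the first input, shift, repeat."""
    y, worst = T0, 0.0
    for j in range(days):
        refs = [reference(j + i) for i in range(1, N + 1)]
        u = min(GRID, key=lambda cand: horizon_cost(y, cand, refs))
        y = model_step(y, u)   # only the first input is applied to the plant
        worst = max(worst, abs(y - refs[0]))
    return y, worst
```

Calling `simulate(835)` runs the loop over the interval between k1 and k2 and returns the final tumor density together with the largest tracking error. Re-optimizing at every step from the measured state is what turns the open-loop schedule into feedback.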
MPC also proves to be a powerful and robust tool. When errors between the system model and the actual system output are considered, as well as when different methods are used to discretize the Gompertz model, the controller still drives the tumor density to the desired reference values. With the addition of a therapy to regulate the osteoblasts activity, and consequently to recover a healthy bone mass density, this framework provides an optimal drug dose to the patient, killing the tumor in approximately 2 years and recovering a normal bone mass density in 5 years, assuming model validity.
[Figure: drug concentration c, c50 and drug resistance r versus time]
Fig. 17. Drug concentration, c50 and drug resistance time variation. The rose horizontal line indicates the clim parameter. Controller and observer poles are located in the complex plane at s = −50 and s = −100, respectively. Controller and observer initial states: [0 0]^T and [1 1]^T, respectively. PK model sampling time equal to 1. Resistance model parameters: δ = 1, clim = 5, Kr = 3 and c50(0) = 1.
[Figure: osteoclasts and osteoblasts versus time, k [day], 0 to 2500]
Fig. 18. Osteoclasts and osteoblasts activity. C = 5, B = 316, α1 = 3 cell day^−1, α2 = 4 cell day^−1, β1 = 0.2 day^−1, β2 = 0.02 day^−1, g11 = 1.1, g12 = 0, g21 = −0.5, g22 = 0, r11 = 0.005, r12 = 0, r21 = 0, r22 = 0.2, C(0) = 15, B(0) = 316, y(0) = 1.
According to [13], and as seen in this work, bone marrow cancer is able to deregulate the healthy balance between osteoclasts and osteoblasts in order to accelerate the bone resorption process which, in turn, promotes further tumor growth; this is known as the vicious cycle. Thus, the tumor growth model presented in [4] should be reviewed in order to model the interference of the osteoclasts and osteoblasts activity in the tumor growth dynamics.
[Figure: bone mass density and tumor density versus time, 0 to 2500 days]
Fig. 19. Bone mass and tumor densities variation. Tumor growth model parameters: a = 0.005, b = 0.7, η = 100 and h = 1. MPC parameters: ρ = 10^−4, umin = 0, umax = 0.99, N = 7. Initial tumor density: y(0) = 1. Tumor density at the beginning of the therapy: y(k1) = 79.51. Reference signal parameters: λ = 0.0044105, T∞ = 0, T0 = y(k1).
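The drug-free growth phase can be checked by hand. Assuming that, without drugs, the tumor density follows the standard continuous-time Gompertz equation (the exact form of the model is not restated in this section), the closed-form solution with the Fig. 19 parameters a = 0.005, η = 100 and y(0) = 1 reproduces the density reported at the start of therapy, k1 = 600:

```latex
\frac{dT}{dt} = a\,T \ln\!\left(\frac{\eta}{T}\right)
\quad\Longrightarrow\quad
T(t) = \eta \exp\!\left( \ln\!\left(\frac{T(0)}{\eta}\right) e^{-a t} \right),
\\[4pt]
T(600) = 100 \exp\!\left( \ln(0.01)\, e^{-0.005 \cdot 600} \right)
       \approx 100 \exp(-4.6052 \times 0.049787)
       \approx 100 \times 0.7951 = 79.51 .
```

This matches y(k1) = 79.51, which suggests that the reported initial condition of the therapy is the unforced Gompertz response after 600 days.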
For future work, it may be interesting to adapt the pharmacodynamical model (PK, PD and drug resistance) to real data. Furthermore, the drug toxicity associated with high values of drug concentration also has to be considered. Finally, it was assumed that it is possible to have full control over the effect u1 of the osteoblasts recovery drug, which is not realistic. A more realistic therapy must take into account the pharmacodynamical model of this drug. Although the above aspects are essential when modelling a realistic cancer therapy, they are beyond the scope of this paper, which was limited to showing how MPC can be applied together with the pharmacodynamical models, in an adaptive strategy, to kill a tumor.
REFERENCES
[1] F. Michor, Y. Iwasa, and M. Nowak, Dynamics of cancer progression, Nature Rev. Cancer, 4:197–205, 2004.
[2] J. Domingues, Gompertz model: Resolution and analysis for tumors, Journal of Mathematical Modelling and Application, I(7):70–77, 2012.
[3] D. Caiado and J. M. Lemos, Optimal control for cancer therapy design, Inesc-ID Technical Report, (6/2015), March 2015.
[4] B. Ayati, C. Edwards, G. Webb, and J. Wikswo, A mathematical model of bone remodeling dynamics for normal bone cell populations and myeloma bone disease, Biology Direct, 5:28, 2010.
[5] R. Martin and K. Teo, Optimal control of drug administration in cancer chemotherapy, World Scientific, 1993.
[6] A. Matveev and A. Savkin, Optimal control applied to drug administration in cancer chemotherapy: the case of several toxicity constraints, 39th IEEE Conference on Decision and Control, 2000.
[7] P. Bumroongsri and S. Kheawhom, Optimal dosing of breast cancer chemotherapy using robust MPC based on linear matrix inequalities, Engineering Journal, 19(1), January 2015.
[8] T. Chen, N. Kirkby, and R. Jena, Optimal dosing of cancer chemotherapy using model predictive control and moving horizon state/parameter estimation, Computer Methods and Programs in Biomedicine, Elsevier, 108:973–983, 2012.
[9] J. Florian, J. Eiseman, and R. Parker, A nonlinear model predictive control algorithm for breast cancer treatment, Fakultet for Naturvitenskap og teknologi, 2004.
[10] J. M. Lemos and D. Caiado, Receding horizon control of tumor growth based on optimal control, 23rd Mediterranean Conference on Control and Automation, 2015.
[11] S. Komarova, R. Smith, S. Dixon, S. Sims, and L. Wahl, Mathematical model predicts a critical role for osteoclasts autocrine regulation in the control of bone remodeling, Bone, Elsevier, 33:206–215, April 2003.
[12] J. M. Lemos, J. Pinheiro, and S. Vinga, A nonlinear MPC approach to minimize toxicity in HIV-1 infection multi-drug therapy, Controlo, 2012.
[13] Y. Zheng, H. Zhou, C. Dunstan, R. Sutherland, and M. Seibel, The role of the microenvironment in skeletal metastasis, Journal of Bone Oncology, 2:47–57, December 2013.
[14] S. Jambhekar and P. Breen, Basic pharmacokinetics, Pharmaceutical Press, 2009.
[15] K. Ogata, Modern Control Engineering, Prentice Hall, 5th ed., 2009.
[16] M. Gottesman, Mechanisms of Cancer Drug Resistance, Annu. Rev. Med., 53:615–627, 2002.
[17] A. Vahidi, A. Stefanopoulou, and H. Peng, Recursive least squares with forgetting for online estimation of vehicle mass and road grade: Theory and experiments, Vehicle System Dynamics: International Journal of Vehicle Mechanics and Mobility, 43(1):31–55, 2005.