8/11/2019 Note Optimal Control
1/126
NOTES ON OPTIMAL CONTROL THEORY
with economic models and exercises
Andrea Calogero
Dipartimento di Matematica e Applicazioni, Università di Milano-Bicocca
March 17, 2014
Contents
1 Introduction to Optimal Control
1.1 Some examples
1.2 Statement of problems of Optimal Control
1.2.1 Admissible control and associated trajectory
1.2.2 Optimal Control problems
1.2.3 Calculus of Variation problems
2 The simplest problem of OC
2.1 The necessary condition of Pontryagin
2.1.1 The proof in a particular situation
2.2 Sufficient conditions
2.3 First generalizations
2.3.1 Initial/final conditions on the trajectory
2.3.2 On minimum problems
2.4 The case of Calculus of Variation
2.5 Examples and applications
2.5.1 The curve of minimal length
2.5.2 A problem of business strategy I
2.5.3 A two-sector model
2.5.4 A problem of inventory and production
2.6 Singular and bang-bang controls
2.6.1 The building of a mountain road: a singular control
2.7 The multiplier as shadow price I: an exercise
3 General problems of OC
3.1 Problems of Bolza, of Mayer and of Lagrange
3.2 Problems with fixed final time
3.3 Problems with free final time
3.4 Time optimal problem
3.4.1 The classical example of Pontryagin and its boat
3.5 The Bolza problem in Calculus of Variations
3.5.1 Labor adjustment model of Hamermesh
3.6 Infinite horizon problems
3.6.1 The model of Ramsey
3.7 Autonomous problems
3.8 Current Hamiltonian
3.8.1 A model of optimal consumption I
4 Constrained problems of OC
4.1 The general case
4.2 Pure state constraints
4.2.1 Commodity trading
4.3 Isoperimetric problems in CoV
4.3.1 Necessary conditions with regular constraints
4.3.2 The multiplier as shadow price
4.3.3 The foundation of Cartagena
4.3.4 The Hotelling model of socially optimal extraction
5 OC with dynamic programming
5.1 The value function: necessary conditions
5.1.1 The final condition
5.1.2 Bellman's Principle of optimality
5.1.3 The Bellman-Hamilton-Jacobi equation
5.2 The value function: sufficient conditions
5.3 Examples
5.3.1 A problem of business strategy II
5.4 Infinite horizon problems
5.4.1 A model of optimal consumption II
5.5 Problems with discounting and salvage value
5.5.1 A problem of selecting investment
5.6 The multiplier as shadow price II: the proof
8/11/2019 Note Optimal Control
5/126
Chapter 1
Introduction to Optimal
Control
1.1 Some examples
Example 1.1.1. The curve of minimal length and the isoperimetric problem
Suppose we are interested in finding the curve of minimal length joining two distinct points in the plane. Suppose that the two points are $(0,0)$ and $(a,b)$. Clearly we can suppose that $a=1$. Hence we are looking for a function $x:[0,1]\to\mathbb{R}$ such that $x(0)=0$ and $x(1)=b$. The length of such a curve is defined by $\int_0^1 ds$, i.e. as the sum of arcs of infinitesimal length $ds$; by the theorem of Pythagoras we obtain
$$(ds)^2=(dt)^2+(dx)^2\qquad\Longrightarrow\qquad ds=\sqrt{1+\dot x^2}\,dt,$$
where $\dot x=\frac{dx(t)}{dt}$. Hence the problem is
$$\begin{cases}\displaystyle\min_x\int_0^1\sqrt{1+\dot x^2(t)}\,dt\\ x(0)=0,\quad x(1)=b\end{cases}\tag{1.1}$$
It is well known that the solution is a line. We will solve this problem in subsection 2.5.1.
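Before the formal treatment, the claim can be checked numerically: discretizing the length functional $\int_0^1\sqrt{1+\dot x^2}\,dt$, the straight line through the endpoints beats a perturbed competitor with the same endpoints. A minimal sketch (the endpoint $b=1$ and the particular perturbation are arbitrary choices of this sketch, not from the notes):

```python
import math

def length(x, n=10_000):
    """Arc length of t -> x(t) on [0, 1] via int_0^1 sqrt(1 + x'(t)^2) dt."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = i * h
        dx = (x(t + h) - x(t)) / h          # forward-difference slope
        total += math.sqrt(1.0 + dx * dx) * h
    return total

b = 1.0
straight = length(lambda t: b * t)                               # the candidate line
bent = length(lambda t: b * t + 0.3 * math.sin(math.pi * t))     # same endpoints, perturbed

print(straight, bent)   # straight ≈ sqrt(2) ≈ 1.4142, bent is strictly larger
```

Any other perturbation vanishing at both endpoints gives the same outcome, in agreement with the minimality of the line.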
A more complicated problem is to find the closed curve in the plane of assigned length such that the area inside such a curve is maximum: we call this problem the foundation of Cartagena.¹ This is the isoperimetric problem.
Without loss of generality, we consider a curve $x:[0,1]\to\mathbb{R}$ such that $x(0)=x(1)=0$. Clearly the area delimited by the curve and the $t$ axis is given by $\int_0^1 x(t)\,dt$. Hence the problem is
$$\begin{cases}\displaystyle\max_x\int_0^1 x(t)\,dt\\ x(0)=0,\quad x(1)=0\\ \displaystyle\int_0^1\sqrt{1+\dot x^2(t)}\,dt=A>1\end{cases}\tag{1.2}$$
Note that the length of the interval $[0,1]$ is exactly 1 and, clearly, it is reasonable to require $A>1$. We will present the solution in subsection 4.3.3.
Example 1.1.2. A problem of business strategy
A factory produces a unique good with a rate $x(t)$ at time $t$. At every moment, such production can either be reinvested to expand the productive capacity or sold. The initial productive capacity is $\alpha>0$; such capacity grows at the rate of reinvestment. Taking into account that the selling price is constant, what fraction $u(t)$ of the output at time $t$ should be reinvested to maximize total sales over the fixed period $[0,T]$?
Let us introduce the function $u:[0,T]\to[0,1]$; clearly, if $u(t)$ is the fraction of the output $x(t)$ that we reinvest, $(1-u(t))x(t)$ is the part of $x(t)$ that we sell at time $t$ at the fixed price $P>0$. Hence the problem is
$$\begin{cases}\displaystyle\max_{u\in\mathcal C}\int_0^T(1-u(t))x(t)P\,dt\\ \dot x=ux\\ x(0)=\alpha\\ \mathcal C=\{u:[0,T]\to[0,1],\ u\in KC\}\end{cases}\tag{1.3}$$
where $\alpha$ and $T$ are positive and fixed. We will present the solution in subsection 2.5.2 and in subsection 5.3.1.
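The solution in subsection 2.5.2 turns out to be of bang-bang type. Without anticipating the proof, a quick Euler simulation of $\dot x=ux$ already suggests that reinvesting everything at the beginning and selling everything afterwards beats constant fractions (the values $T=2$, $\alpha=1$, $P=1$, the switching instant $t=1$ and the competitor strategies are arbitrary choices of this sketch):

```python
import math

def payoff(u, T=2.0, alpha=1.0, P=1.0, n=20_000):
    """Euler-integrate xdot = u(t) x and accumulate sales int_0^T (1-u) x P dt."""
    dt = T / n
    x, sales = alpha, 0.0
    for i in range(n):
        t = i * dt
        ut = u(t)
        sales += (1.0 - ut) * x * P * dt
        x += ut * x * dt
    return sales

bang = payoff(lambda t: 1.0 if t < 1.0 else 0.0)   # reinvest everything, then sell
none = payoff(lambda t: 0.0)                        # always sell
half = payoff(lambda t: 0.5)                        # constant fraction

print(bang, none, half)   # ≈ e, 2, e - 1: the bang-bang strategy wins
```

Note that even selling everything (payoff $2$) beats the constant fraction $u\equiv 1/2$ (payoff $e-1\approx 1.72$) on this short horizon.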
Example 1.1.3. The building of a mountain road.
The altitude of a mountain is given by a differentiable function $y:[t_0,t_1]\to\mathbb{R}$. We have to construct a road: let us determine the shape of the road, i.e. the altitude $x=x(t)$ of the road in $[t_0,t_1]$, such that the slope of the road never exceeds $\alpha$, with $\alpha>0$, and such that the total cost

¹When Cartagena was founded, it was granted for its construction as much land as a man could circumscribe in one day with his plow: what form should the furrow have in order to enclose the maximum possible land, given the length of the furrow that a man can dig in a day? Or, mathematically speaking, what is the shape with the maximum area among all the figures with the same perimeter?
of the construction
$$\int_{t_0}^{t_1}(x(t)-y(t))^2\,dt$$
is minimal. Clearly the problem is
$$\begin{cases}\displaystyle\min_{u\in\mathcal C}\int_{t_0}^{t_1}(x(t)-y(t))^2\,dt\\ \dot x=u\\ \mathcal C=\{u:[t_0,t_1]\to[-\alpha,\alpha],\ u\in KC\}\end{cases}\tag{1.4}$$
where $y$ is an assigned and continuous function. We will present the solution in subsection 2.6.1.
Example 1.1.4. In boat with Pontryagin.
Suppose we are on a boat that at time $t_0=0$ has distance $d_1>0$ from the pier of the port and has velocity $d_2>0$ in the direction of the port. The boat is equipped with a motor that provides an acceleration or a deceleration. We are looking for a strategy to arrive at the pier in the shortest time with a soft docking, i.e. with vanishing speed at the final time $T$. We denote by $x=x(t)$ the distance from the pier at time $t$, by $\dot x$ the velocity of the boat and by $\ddot x=u$ the acceleration ($\ddot x>0$) or deceleration ($\ddot x<0$). In order to obtain a soft docking, we require $x(T)=\dot x(T)=0$, where the final time $T$ is clearly unknown. We note that our strategy depends only on our choice, at every time, of $u(t)$. Hence the problem is the following:
$$\begin{cases}\displaystyle\min_{u\in\mathcal C}T\\ \ddot x=u\\ x(0)=d_1,\quad \dot x(0)=d_2\\ x(T)=\dot x(T)=0\\ \mathcal C=\{u:[0,\infty)\to[-1,1],\ u\in KC\}\end{cases}\tag{1.5}$$
where $d_1$ and $d_2$ are positive and fixed, and $T$ is free.

This is one of the possible ways to introduce a classic example due to Pontryagin; it shows the various and complex situations that arise in optimal control problems [12]. We will solve this problem in subsection 3.4.1.
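The solution in subsection 3.4.1 is a bang-bang control with at most one switch. As a preview, one can simulate the classical switching-curve feedback for $\ddot x=u$, $|u|\le 1$; in this sketch the sign convention (velocity negative when moving toward the pier), the initial data and the feedback form $u=-\operatorname{sign}\!\big(x+v|v|/2\big)$ are assumptions of the sketch, not taken from the notes:

```python
# Bang-bang feedback for xddot = u, |u| <= 1, driving (x, v) to (0, 0).
# Convention here: x = distance from the pier, v = xdot, so motion toward
# the pier means v < 0; the starting point (1.0, -0.5) is arbitrary.
dt = 1e-4
x, v, t = 1.0, -0.5, 0.0
while (abs(x) > 1e-2 or abs(v) > 1e-2) and t < 10.0:
    s = x + v * abs(v) / 2.0       # s = 0 is the classical switching curve
    u = -1.0 if s > 0 else 1.0     # full thrust on each side of the curve
    x += v * dt
    v += u * dt
    t += dt
print(t)   # a single switch suffices; soft docking in t ≈ 1.6
```

The control switches exactly once (from $u=-1$ to $u=+1$), which is the qualitative behaviour established analytically in subsection 3.4.1.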
Example 1.1.5. A model of optimal consumption.
Consider an investor who, at time $t=0$, is endowed with an initial capital $x(0)=x_0>0$. At any time he and his heirs decide about their rate of consumption $c(t)\ge 0$. Thus the capital stock evolves according to
$$\dot x=rx-c,$$
where $r>0$ is a given and fixed rate of return. The investor's time utility for consuming at rate $c(t)$ is $U(c(t))$, where $U(z)=\ln z$ for $z>0$.
The investor's problem is to find a consumption plan so as to maximize his discounted utility
$$\int_0^\infty e^{-\delta t}U(c(t))\,dt,$$
where $\delta$, with $\delta\ge r$, is a given discount rate, subject to the solvency constraint that the capital stock $x(t)$ must be positive for all $t\ge 0$ and must vanish at infinity. Then the problem is
$$\begin{cases}\displaystyle\max_c\int_0^\infty e^{-\delta t}\ln c(t)\,dt\\ \dot x=rx-c\\ x(0)=x_0>0\\ x>0,\quad \displaystyle\lim_{t\to\infty}x(t)=0\\ c\ge 0\end{cases}\tag{1.6}$$
with $\delta>r\ge 0$ fixed constants. We will solve this problem in subsections 3.8.1 and 5.4.1.
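The problem is solved in subsections 3.8.1 and 5.4.1; meanwhile one can explore it numerically inside the family of proportional rules $c(t)=kx(t)$ with $k>r$ (an assumption of this sketch, chosen so that $x(t)=x_0e^{(r-k)t}\to 0$ and the solvency constraint holds). For such a rule the discounted utility has the closed form $J(k)=\ln(kx_0)/\delta+(r-k)/\delta^2$, and a grid search locates its maximum at $k=\delta$, consistent with consumption proportional to capital:

```python
import math

def J(k, r=0.03, delta=0.08, x0=1.0):
    """Discounted utility of the proportional rule c = k x (valid for k > r):
    x(t) = x0 e^{(r-k)t}, so J(k) = ln(k x0)/delta + (r-k)/delta^2."""
    return math.log(k * x0) / delta + (r - k) / delta**2

ks = [0.03 + 0.001 * i for i in range(1, 200)]   # grid of rules with k > r
best = max(ks, key=J)
print(best)   # ≈ delta = 0.08
```

Indeed $J'(k)=1/(k\delta)-1/\delta^2$ vanishes exactly at $k=\delta$, and $J$ is concave in $k$.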
One of the real problems that inspired and motivated the study of optimal control problems is the following, so-called moonlanding problem. Here we give only the statement of this hard problem: in [7] there is a good exposition (see also [4]).
Example 1.1.6. The moonlanding problem.
Consider the problem of a spacecraft attempting to make a soft landing on the moon using a minimum amount of fuel. To define a simplified version of this problem, let $m=m(t)$ denote the mass, $h=h(t)$ and $v=v(t)$ denote the height and vertical velocity of the spacecraft above the moon, and $u=u(t)$ denote the thrust of the spacecraft's engine. Hence at the initial time we have initial height and vertical velocity $h(0)=h_0>0$ and $v(0)=v_0<0$; at the final and fixed time $t_1$, equal to the first time the spacecraft reaches the moon, we require $h(t_1)=0$ and $v(t_1)=0$. Clearly
$$\dot h=v.$$
Let $M$ denote the mass of the spacecraft without fuel, $c_0$ the initial amount of fuel and $g$ the gravitational acceleration of the moon. The equation of motion of the spacecraft is
$$m\dot v=u-mg,$$
where $m=M+c$ and $c(t)$ is the amount of fuel at time $t$. Let $\alpha$ be the maximum thrust attainable by the spacecraft's engine ($\alpha>0$ and fixed): the thrust $u$, with $0\le u(t)\le\alpha$, is the control for the problem and is related to the amount of fuel by
$$\dot m=\dot c=-ku,$$
with k a positive constant.
[Figure: on the left, the spacecraft at time $t=0$, at height $h_0$ above the moon with velocity $v_0$; on the right, the forces $u$ and $mg$ that act on it.]
The problem is to land using a minimum amount of fuel:
$$\min\,(m(0)-m(t_1))=m(0)+\min\,(-m(t_1)).$$
Let us summarize the problem:
$$\begin{cases}\displaystyle\min_{u\in\mathcal C}\,(-m(t_1))\\ \dot h=v\\ m\dot v=u-mg\\ \dot m=-ku\\ h(0)=h_0,\quad h(t_1)=0\\ v(0)=v_0,\quad v(t_1)=0\\ m(0)=M+c_0\\ \mathcal C=\{u:[0,t_1]\to\mathbb{R},\ 0\le u\le\alpha\}\end{cases}\tag{1.7}$$
where $h_0$, $M$, $c_0$, $-v_0$, $k$, $\alpha$ and $t_1$ are positive constants.
1.2 Statement of problems of Optimal Control
1.2.1 Admissible control and associated trajectory
Let us consider a problem where the development of the system is given by a function
$$x:[t_0,t_1]\to\mathbb{R}^n,\qquad x=(x_1,x_2,\ldots,x_n),$$
with $n\ge 1$. At every time $t$, the value $x(t)$ describes our system. We call $x$ the state variable (or trajectory): the state variable is at least a continuous function. We suppose that the system has an initial condition, i.e.
$$x(t_0)=\alpha,\tag{1.8}$$
where $\alpha=(\alpha_1,\alpha_2,\ldots,\alpha_n)\in\mathbb{R}^n$.
Let us suppose that our system depends on some particular choice (or strategy) at every time. Essentially we suppose that the strategy of our system is given by a function
$$u:[t_0,t_1]\to U,\qquad u=(u_1,u_2,\ldots,u_k),$$
with $k\ge 1$. Here $U$ is a fixed set in $\mathbb{R}^k$ that is called the control set. We call the function $u$ the control variable. Generally in the literature one works with measurable controls; in all these notes we suppose that $u$ is in the space $KC([t_0,t_1])$ of piecewise continuous functions on $[t_0,t_1]$, i.e. $u$ is continuous in $[t_0,t_1]$ up to a finite number of points $\tau$ such that $\lim_{t\to\tau^+}u(t)$ and $\lim_{t\to\tau^-}u(t)$ exist and are finite.
The fact that $u$ determines the system is represented by the dynamics, i.e. the relation
$$\dot x(t)=g(t,x(t),u(t)),\tag{1.9}$$
where $g:[t_0,t_1]\times\mathbb{R}^n\times\mathbb{R}^k\to\mathbb{R}^n$. From a mathematical point of view we are interested in solving the Ordinary Differential Equation (ODE) of the form
$$\begin{cases}\dot x=g(t,x,u)&\text{a.e. in }[t_0,t_1]\\ x(t_0)=\alpha\end{cases}\tag{1.10}$$
where $u$ is an assigned function. In general, without assumptions on $g$ and $u$, it is not possible to guarantee that there exists a unique solution of (1.10) defined on the whole interval $[t_0,t_1]$.

Definition 1.1. We say that a piecewise continuous function $u:[t_0,t_1]\to U$ is an admissible control (or shortly control) for (1.10) if there exists a unique solution of such ODE defined on $[t_0,t_1]$; we call such solution $x$ the trajectory associated to $u$. We denote by $\mathcal C_{t_0,\alpha}$ the set of admissible controls for $\alpha$ at time $t_0$.
We remark that, as a first step, we are interested in the simplest problem of optimal control, i.e. a situation with an initial condition of the type (1.8), with $t_0$, $t_1$ and $\alpha$ fixed, and without conditions on the final value of the trajectory. In the following, we will modify such conditions and the definition of admissible control will change.
Let us give some examples that show the difficulty of associating a trajectory to a control:
Example 1.2.1. Let us consider $T>0$ fixed and
$$\begin{cases}\dot x=2u\sqrt{x}\\ x(0)=0\end{cases}\qquad \mathcal C_{0,0}=\{u:[0,T]\to\mathbb{R},\ u\in KC([0,T])\}$$
Prove that the function $u(t)=a$, with $a$ a positive constant, is not an admissible control, since the two functions $x_1(t)=0$ and $x_2(t)=a^2t^2$ both solve the previous ODE.
Example 1.2.2. Let us consider
$$\begin{cases}\dot x=ux^2\\ x(0)=1\end{cases}\qquad \mathcal C_{0,1}=\{u:[0,3]\to\mathbb{R},\ u\in KC([0,3])\}$$
Prove that the function $u(t)=a$, with $a$ a constant, is an admissible control if and only if $a<1/3$. Prove that the trajectory associated to such control is $x(t)=\dfrac{1}{1-at}$.
Example 1.2.3. Let us consider
$$\begin{cases}\dot x=ux\\ x(0)=1\\ x(2)=3^6\end{cases}\qquad \mathcal C_{0,1}=\{u:[0,2]\to[0,3],\ u\in KC([0,2])\}$$
Prove² that the set of admissible controls is empty.
The following well-known theorem is fundamental:

Theorem 1.1. Let us consider $f=f(t,x):[t_0,t_1]\times\mathbb{R}^n\to\mathbb{R}^n$ and let $f,f_{x_1},\ldots,f_{x_n}$ be continuous in an open set $D\subseteq\mathbb{R}^{n+1}$ with $(\tau,\bar x)\in D\subseteq[t_0,t_1]\times\mathbb{R}^n$. Then there exists a neighborhood $I$ of $\tau$ such that the ODE
$$\begin{cases}\dot x(t)=f(t,x(t))\\ x(\tau)=\bar x\end{cases}$$
admits a unique solution $x=F(t)$ defined in $I$.

Moreover, if there exist two positive constants $A$ and $B$ such that $\|f(t,x)\|\le A\|x\|+B$ for all $(t,x)\in[t_0,t_1]\times\mathbb{R}^n$, then the solution of the previous ODE is defined on the whole interval $[t_0,t_1]$.
Let $u:[t_0,t_1]\to U$ be continuous in $[t_0,t_1]$ up to the points $\tau_1,\tau_2,\ldots,\tau_N$, with $t_0=\tau_0<\tau_1<\tau_2<\ldots<\tau_N<\tau_{N+1}=t_1$, where $u$ has discontinuities of the first kind. Let us suppose that there exists in $[t_0,\tau_1]$ a solution $x_0$ of the ODE (1.9) with initial condition $x_0(t_0)=\alpha$. Let us suppose that there exists a solution $x_1$ of (1.9) in $[\tau_1,\tau_2]$ with initial condition $x_1(\tau_1)=x_0(\tau_1)$. In general, for every $i$ with $1\le i\le N$, let us suppose that there exists a solution $x_i$ of (1.9) in $[\tau_i,\tau_{i+1}]$ with initial condition $x_i(\tau_i)=x_{i-1}(\tau_i)$. Finally we define the function $x:[t_0,t_1]\to\mathbb{R}^n$ by
$$x(t)=x_i(t),\qquad\text{when }t\in[\tau_i,\tau_{i+1}].$$
Such function $x$ is the trajectory associated to the control $u$ and initial datum $x(t_0)=\alpha$. An idea is given by the following pictures:
²Note that $0\le\dot x=ux\le 3x$ and $x(0)=1$ imply $0\le x(t)\le e^{3t}$; hence $x(2)\le e^6<3^6$.
[Figure: the graph of a piecewise continuous control $u$ with jumps at $\tau_1,\tau_2,\tau_3$, and the graph of the associated continuous trajectory $x$.] Here $u$ is an admissible control and $x$ is the associated trajectory, in the case $k=n=1$.
Example 1.2.4. Let
$$\begin{cases}\dot x=ux\\ x(0)=1\end{cases}\qquad \mathcal C_{0,1}=\{u:[0,3]\to\mathbb{R},\ u\in KC([0,3])\}$$
and let $u\in\mathcal C_{0,1}$ be defined by
$$u(t)=\begin{cases}0&\text{for }t\in[0,1)\\ 1&\text{for }t\in[1,2]\\ t&\text{for }t\in(2,3]\end{cases}$$
Prove that $u$ is admissible and that the associated trajectory $x$ is
$$x(t)=\begin{cases}1&\text{for }t\in[0,1]\\ e^{t-1}&\text{for }t\in(1,2]\\ e^{t^2/2-1}&\text{for }t\in(2,3]\end{cases}$$
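The claim of Example 1.2.4 can be cross-checked numerically: an Euler scheme for $\dot x=ux$ with the piecewise control above reproduces the closed-form trajectory at the final time (the step size is an arbitrary choice of this sketch):

```python
import math

def u(t):                      # the piecewise continuous control of the example
    if t < 1.0:
        return 0.0
    if t <= 2.0:
        return 1.0
    return t

def x_exact(t):                # the claimed associated trajectory
    if t <= 1.0:
        return 1.0
    if t <= 2.0:
        return math.exp(t - 1.0)
    return math.exp(t * t / 2.0 - 1.0)

n = 300_000
h = 3.0 / n
x = 1.0
for i in range(n):             # Euler scheme for xdot = u(t) x, x(0) = 1
    x += u(i * h) * x * h
print(x, x_exact(3.0))         # both ≈ e^{7/2} ≈ 33.1
```

Note that the trajectory is continuous at the jump points $t=1$ and $t=2$ of the control, as the glueing construction above requires.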
In the next chapter, in order to simplify the notation, we put $\mathcal C=\mathcal C_{t_0,\alpha}$.
The problem of investigating the possibility of finding admissible controls for an optimal control problem is called controllability. We say that a dynamics is linear if (1.9) is of the form
$$\dot x(t)=A(t)x(t)+B(t)u(t),\tag{1.11}$$
where $A(t)$ is a square matrix of order $n$ and $B(t)$ is a matrix of order $n\times k$; moreover, the elements of such matrices are continuous functions in $[t_0,t_1]$. A fundamental property of controllability of linear dynamics is the following:
Proposition 1.1. If the dynamics is linear, then every piecewise continuous function is an admissible control for (1.10), i.e. there exists the associated trajectory.

The proof of this result is an easy application of the previous theorem: for every $u\in KC$ we have $\|A(t)x(t)+B(t)u(t)\|\le A\|x\|+B$, where
$$A=n\max\{|a_{ij}(t)|:\ t\in[t_0,t_1],\ 1\le i\le n,\ 1\le j\le n\},$$
$$B=n\max\{|b_{ij}(t)|:\ t\in[t_0,t_1],\ 1\le i\le n,\ 1\le j\le k\}\Big(\sup_{t\in[t_0,t_1]}\|u(t)\|\Big),$$
and $a_{ij}(t)$, $b_{ij}(t)$ are the elements of the matrices $A(t)$, $B(t)$ respectively. Since the assumptions of the mentioned theorem hold, there exists a unique function defined in $[t_0,t_1]$ that satisfies (1.10).

For every $\tau\in[t_0,t_1]$, we define the reachable set at time $\tau$ as the set $\mathcal R(\tau,t_0,\alpha)\subseteq\mathbb{R}^n$ of the points $\bar x$ such that there exist an admissible control $u$ and an associated trajectory $x$ with $x(t_0)=\alpha$ and $x(\tau)=\bar x$. From a geometric point of view the situation is the following:
[Figure: an admissible control $u=(u_1,u_2):[t_0,t_1]\to U\subseteq\mathbb{R}^2$, with the values $u(t_0)$, $u(\tau)$ and $u(t_1)$ lying in the control set $U$.]

[Figure: the trajectory $x=(x_1,x_2):[t_0,t_1]\to\mathbb{R}^2$ associated to $u$, with $x(t_0)=\alpha$, $x(\tau)\in\mathcal R(\tau,t_0,\alpha)$ and $x(t_1)\in\mathcal R(t_1,t_0,\alpha)$.]
If we consider Example 1.2.3, we have $3^6\notin\mathcal R(2,0,1)$.
1.2.2 Optimal Control problems
Let us introduce the functional that we would like to optimize. Let us consider the dynamics in (1.10) and a function $f:[t_0,t_1]\times\mathbb{R}^{n+k}\to\mathbb{R}$, the so-called running cost (or running payoff).
The simplest problem
Let $t_1$ be fixed. Let us consider the set of admissible controls $\mathcal C$. We define $J:\mathcal C\to\mathbb{R}$ by
$$J(u)=\int_{t_0}^{t_1}f(t,x(t),u(t))\,dt,$$
where the function $x$ is the (unique) trajectory associated to the control $u$ that satisfies $x(t_0)=\alpha$: this is the reason why $J$ depends only on $u$. Hence our problem is
$$\begin{cases}\displaystyle\max_{u\in\mathcal C}J(u),\qquad J(u)=\int_{t_0}^{t_1}f(t,x,u)\,dt\\ \dot x=g(t,x,u)\\ x(t_0)=\alpha\\ \mathcal C=\{u:[t_0,t_1]\to U\subseteq\mathbb{R}^k,\ u\text{ admissible}\}\end{cases}\tag{1.12}$$
The problem (1.12) is called the simplest problem of Optimal Control (in all that follows we shorten Optimal Control to OC). We say that $u^*\in\mathcal C$ is an optimal control for (1.12) if
$$J(u)\le J(u^*),\qquad\forall u\in\mathcal C.$$
The trajectory $x^*$ associated to the optimal control $u^*$ is called the optimal trajectory.
In this problem and in more general problems, when $f$ and $g$ (and the possible other functions that define the problem) do not depend directly on $t$, i.e. $f(t,x(t),u(t))=f(x(t),u(t))$ and $g(t,x(t),u(t))=g(x(t),u(t))$, we say that the problem is autonomous.
1.2.3 Calculus of Variation problems
A very particular situation appears when the dynamics (1.9) is of the type $\dot x=g(t,x,u)=u$ (and hence $k=n$) and the control set $U$ is $\mathbb{R}^n$. Clearly it
Chapter 2
The simplest problem of OC
2.1 The necessary condition of Pontryagin
We are interested in the problem (1.12). Let us introduce the function
$$(\lambda_0,\lambda)=(\lambda_0,\lambda_1,\ldots,\lambda_n):[t_0,t_1]\to\mathbb{R}^{n+1},$$
with $\lambda_0$ constant. We call such function the multiplier (or costate variable). We define the Hamiltonian function $H:[t_0,t_1]\times\mathbb{R}^n\times\mathbb{R}^k\times\mathbb{R}\times\mathbb{R}^n\to\mathbb{R}$ by
$$H(t,x,u,\lambda_0,\lambda)=\lambda_0 f(t,x,u)+\lambda\cdot g(t,x,u).$$
The following result is fundamental:

Theorem 2.1 (Pontryagin). Let us consider the problem (1.12) with $f\in C^1([t_0,t_1]\times\mathbb{R}^{n+k})$ and $g\in C^1([t_0,t_1]\times\mathbb{R}^{n+k})$. Let $u^*$ be an optimal control and $x^*$ be the associated trajectory. Then there exists a multiplier $(\lambda_0^*,\lambda^*)$, with

- $\lambda_0^*$ constant,
- $\lambda^*:[t_0,t_1]\to\mathbb{R}^n$ continuous,

such that $(\lambda_0^*,\lambda^*)\ne(0,0)$ and

i) (Pontryagin Maximum Principle, shortly PMP) for all $\tau\in[t_0,t_1]$ we have $u^*(\tau)\in\arg\max_{v\in U}H(\tau,x^*(\tau),v,\lambda_0^*,\lambda^*(\tau))$, i.e.
$$H(\tau,x^*(\tau),u^*(\tau),\lambda_0^*,\lambda^*(\tau))=\max_{v\in U}H(\tau,x^*(\tau),v,\lambda_0^*,\lambda^*(\tau));\tag{2.1}$$

ii) (adjoint equation) in $[t_0,t_1]$ we have
$$\dot\lambda^*=-\nabla_xH;\tag{2.2}$$
iii) (transversality condition) $\lambda^*(t_1)=0$;

iv) $\lambda_0^*=1$.

Clearly, in the assumptions of the previous theorem, iv) implies $(\lambda_0^*,\lambda^*)\ne(0,0)$.
The proof of this result is very long and difficult (see [12], [7], [6]): in section 2.1.1 we give a proof in a particular situation. Now let us list some comments and definitions.
We remark that we can rewrite the dynamics (1.9) as
$$\dot x=\nabla_\lambda H.$$
An admissible control $u$ that satisfies the conclusion of the theorem of Pontryagin is called extremal. We call $(\lambda_0^*,\lambda^*)$ the associated multiplier to the extremal $u$ if it satisfies the conclusion of the mentioned theorem. There are two distinct possibilities for the constant $\lambda_0^*$:

a. if $\lambda_0^*\ne 0$, we say that $u$ is normal: in this situation we may assume that $\lambda_0^*=1$;

b. if $\lambda_0^*=0$, we say that $u$ is abnormal. In this case the Hamiltonian $H$ does not depend on $f$, and the Pontryagin Maximum Principle is of no use.
Hence the previous theorem guarantees that
Remark 2.1. In the simplest optimal control problem (1.12) every extremal is normal.
We will see in Example 2.5.6 an abnormal control.
Let us define, for every fixed $\tau\in[t_0,t_1]$, the function $H_\tau:U\to\mathbb{R}$ as
$$H_\tau(v)=H(\tau,x^*(\tau),v,\lambda_0^*,\lambda^*(\tau)).$$
An important necessary condition of optimality in convex analysis¹ implies
¹Theorem. Let $U$ be a closed and convex set in $\mathbb{R}^k$ and $F:U\to\mathbb{R}$ be differentiable. If $v^*$ is a point of maximum for $F$ in $U$, then
$$\nabla F(v^*)\cdot(v-v^*)\le 0,\qquad\forall v\in U.\tag{2.3}$$
Proof: If $v^*$ is in the interior of $U$, then $\nabla F(v^*)=0$ and (2.3) is true. Let $v^*$ be on the boundary of $U$: for all $v\in U$, let us consider the function $f:[0,1]\to\mathbb{R}$ defined by $f(s)=F((1-s)v^*+sv)$. The formula of Maclaurin gives $f(s)-f(0)=f'(0)s+o(s)$, where $o(s)/s\to 0$ for $s\to 0^+$. Since $v^*$ is a maximum we have
$$0\ge F((1-s)v^*+sv)-F(v^*)=f(s)-f(0)=f'(0)s+o(s)=\nabla F(v^*)\cdot(v-v^*)s+o(s).$$
Dividing by $s>0$ and letting $s\to 0^+$, (2.3) follows.
Remark 2.2. Let the control set $U$ be closed and convex and $u^*$ be optimal for (1.12). Since, for every fixed $\tau$, $u^*(\tau)$ is a maximum point for $H_\tau$, the PMP implies
$$\nabla H_\tau(u^*(\tau))\cdot(v-u^*(\tau))\le 0,\qquad\forall v\in U,$$
i.e.
$$\nabla_uH(\tau,x^*(\tau),u^*(\tau),\lambda_0^*,\lambda^*(\tau))\cdot(v-u^*(\tau))\le 0,\tag{2.4}$$
for every $v\in U$, $\tau\in[t_0,t_1]$. In the particular case $U=\mathbb{R}^k$, we can replace the PMP and (2.4) with
$$\nabla_uH(\tau,x^*(\tau),u^*(\tau),\lambda_0^*,\lambda^*(\tau))=0,\qquad\tau\in[t_0,t_1].$$
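As an illustration of how the adjoint equation, the transversality condition and the case $U=\mathbb{R}^k$ of Remark 2.2 are used in practice, consider the toy problem (not taken from these notes) $\max\int_0^1(x-u^2)\,dt$ with $\dot x=u$, $x(0)=0$, $U=\mathbb{R}$. Taking $\lambda_0^*=1$, $H=x-u^2+\lambda u$; the adjoint equation gives $\dot\lambda^*=-1$ with $\lambda^*(1)=0$, so $\lambda^*(t)=1-t$, and $\nabla_uH=0$ gives $u^*(t)=(1-t)/2$. A numerical check that this extremal beats a few competitors:

```python
def J(u, n=20_000):
    """Left-Riemann value of J(u) = int_0^1 (x - u^2) dt with xdot = u, x(0) = 0."""
    h = 1.0 / n
    x, total = 0.0, 0.0
    for i in range(n):
        t = i * h
        total += (x - u(t) ** 2) * h
        x += u(t) * h            # Euler step of the dynamics
    return total

star = J(lambda t: (1.0 - t) / 2.0)      # the extremal control from the PMP
print(star)                               # ≈ 1/12
for v in (lambda t: 0.0, lambda t: 0.5, lambda t: (1.0 - t) / 2.0 + 0.2):
    assert J(v) <= star + 1e-4            # no competitor does better
```

Here the problem is concave in $(x,u)$, so the sufficient conditions of the next section confirm that this extremal is indeed optimal.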
2.1.1 The proof in a particular situation
In this section we consider the simplest optimal control problem (1.12) with two fundamental assumptions that simplify the proof of the theorem of Pontryagin:

a. we suppose that the control set is $U=\mathbb{R}^k$;

b. we suppose that the set $\mathcal C=\mathcal C_{t_0,\alpha}$ of admissible controls does not contain discontinuous functions, is non-empty and is open. We remark that with a linear dynamics these assumptions on $\mathcal C$ are satisfied.
In order to prove the mentioned theorem, we need a technical lemma:

Lemma 2.1. Let $g\in C([t_0,t_1])$ with
$$\int_{t_0}^{t_1}g(t)h(t)\,dt=0\tag{2.5}$$
for every $h\in C([t_0,t_1])$. Then $g$ is identically zero on $[t_0,t_1]$.
Proof. Let us suppose that $g(t')\ne 0$ for some point $t'\in[t_0,t_1]$: we suppose that $g(t')>0$ (if $g(t')<0$ the proof is similar). Since $g$ is continuous, there exists an interval $[t_0',t_1']\subseteq[t_0,t_1]$ containing $t'$ on which $g$ is positive. Let us define the function $h:[t_0,t_1]\to\mathbb{R}$ as
$$h(t)=(t-t_0')(t_1'-t)\,\mathbb{1}_{[t_0',t_1']}(t),$$
where $\mathbb{1}_A$ is the indicator function of the set $A$. Hence
$$\int_{t_0}^{t_1}g(t)h(t)\,dt=\int_{t_0'}^{t_1'}g(t)(t-t_0')(t_1'-t)\,dt>0.\tag{2.6}$$
On the other hand, (2.5) implies that $\int_{t_0}^{t_1}g(t)h(t)\,dt=0$. Hence (2.6) is absurd and there does not exist such a point $t'$.
Theorem 2.2. Let us consider the problem
$$\begin{cases}\displaystyle\max_{u\in\mathcal C}J(u),\qquad J(u)=\int_{t_0}^{t_1}f(t,x,u)\,dt\\ \dot x=g(t,x,u)\\ x(t_0)=\alpha\\ \mathcal C=\{u:[t_0,t_1]\to\mathbb{R}^k,\ u\in C([t_0,t_1])\}\end{cases}$$
with $f\in C^1([t_0,t_1]\times\mathbb{R}^{n+k})$ and $g\in C^1([t_0,t_1]\times\mathbb{R}^{n+k})$. Let $u^*$ be the optimal control and $x^*$ be the optimal trajectory. Then there exists a continuous multiplier $\lambda^*:[t_0,t_1]\to\mathbb{R}^n$ such that
$$\nabla_uH(t,x^*(t),u^*(t),\lambda^*(t))=0,\qquad t\in[t_0,t_1]\tag{2.7}$$
$$\nabla_xH(t,x^*(t),u^*(t),\lambda^*(t))=-\dot\lambda^*(t),\qquad t\in[t_0,t_1]\tag{2.8}$$
$$\lambda^*(t_1)=0,\tag{2.9}$$
where $H(t,x,u,\lambda)=f(t,x,u)+\lambda\cdot g(t,x,u)$.
Proof. Let $u^*\in\mathcal C$ be the optimal control and $x^*$ its trajectory. Let us fix a continuous function $h=(h_1,\ldots,h_k):[t_0,t_1]\to\mathbb{R}^k$. For every constant $\epsilon=(\epsilon_1,\ldots,\epsilon_k)\in\mathbb{R}^k$ we define the function $u_\epsilon:[t_0,t_1]\to\mathbb{R}^k$ by
$$u_\epsilon=u^*+(\epsilon_1h_1,\ldots,\epsilon_kh_k)=(u^*_1+\epsilon_1h_1,\ldots,u^*_k+\epsilon_kh_k).\tag{2.10}$$
Since $\mathcal C$ is open, for every $\epsilon$ with $\|\epsilon\|$ sufficiently small $u_\epsilon$ is an admissible control.² Hence, for such $u_\epsilon$ there exists the associated

²We remark that the assumption $U=\mathbb{R}^k$ is crucial. Suppose, for example, that $U\subset\mathbb{R}^2$ and let us fix $t\in[t_0,t_1]$. If $u^*(t)$ is an interior point of $U$, for every function $h$ and for $\epsilon$ with modulus sufficiently small, we have that $u_\epsilon(t)=(u^*_1(t)+\epsilon_1h_1(t),u^*_2(t)+\epsilon_2h_2(t))\in U$. If $u^*(t)$ lies on the boundary of $U$, it is impossible to guarantee that, for every $h$, $u_\epsilon(t)=u^*(t)+\epsilon h(t)\in U$.

[Figure: the case $u^*(t)$ in the interior of $U$; the case $u^*(t)$ on the boundary of $U$.]
trajectory: we denote by $x_\epsilon:[t_0,t_1]\to\mathbb{R}^n$ the trajectory associated³ to the control $u_\epsilon$ in (2.10). Clearly
$$u_0(t)=u^*(t),\qquad x_0(t)=x^*(t),\qquad x_\epsilon(t_0)=\alpha.\tag{2.11}$$
Now, recalling that $h$ is fixed, we define the function $J_h:\mathbb{R}^k\to\mathbb{R}$ as
$$J_h(\epsilon)=\int_{t_0}^{t_1}f(t,x_\epsilon(t),u_\epsilon(t))\,dt.$$
Since $u^*$ is optimal, $J_h(0)\ge J_h(\epsilon)$ for every $\epsilon$; then $\nabla J_h(0)=0$. Let $\lambda:[t_0,t_1]\to\mathbb{R}^n$ be a generic continuous function. Using the dynamics we have
$$J_h(\epsilon)=\int_{t_0}^{t_1}\Big(f(t,x_\epsilon,u_\epsilon)+\lambda\cdot\big(g(t,x_\epsilon,u_\epsilon)-\dot x_\epsilon\big)\Big)\,dt=\int_{t_0}^{t_1}\big[H(t,x_\epsilon,u_\epsilon,\lambda)-\lambda\cdot\dot x_\epsilon\big]\,dt$$
and, integrating by parts,
$$J_h(\epsilon)=\int_{t_0}^{t_1}\big[H(t,x_\epsilon,u_\epsilon,\lambda)+\dot\lambda\cdot x_\epsilon\big]\,dt-\Big[\lambda\cdot x_\epsilon\Big]_{t_0}^{t_1}.$$
For every $i$, with $1\le i\le k$, we have
$$\frac{\partial J_h}{\partial\epsilon_i}=\int_{t_0}^{t_1}\Big[\nabla_xH(t,x_\epsilon,u_\epsilon,\lambda)\cdot\frac{\partial x_\epsilon}{\partial\epsilon_i}+\nabla_uH(t,x_\epsilon,u_\epsilon,\lambda)\cdot\frac{\partial u_\epsilon}{\partial\epsilon_i}+\dot\lambda\cdot\frac{\partial x_\epsilon}{\partial\epsilon_i}\Big]\,dt-\lambda(t_1)\cdot\frac{\partial x_\epsilon}{\partial\epsilon_i}(t_1)+\lambda(t_0)\cdot\frac{\partial x_\epsilon}{\partial\epsilon_i}(t_0).$$
Note that (2.10) implies $\frac{\partial u_\epsilon}{\partial\epsilon_i}=(0,\ldots,0,h_i,0,\ldots,0)$, and (2.11) implies $\frac{\partial x_\epsilon}{\partial\epsilon_i}(t_0)=0$. Hence, computing at $\epsilon=0$, we obtain
$$\frac{\partial J_h}{\partial\epsilon_i}(0)=\int_{t_0}^{t_1}\Big[\big(\nabla_xH(t,x^*,u^*,\lambda)+\dot\lambda\big)\cdot\frac{\partial x_\epsilon}{\partial\epsilon_i}+\frac{\partial H}{\partial u_i}(t,x^*,u^*,\lambda)\,h_i\Big]\,dt-\lambda(t_1)\cdot\frac{\partial x_\epsilon}{\partial\epsilon_i}(t_1)=0.\tag{2.12}$$

³For example, if $n=k=1$ and the dynamics is linear we have, for every $\epsilon$,
$$\begin{cases}\dot x_\epsilon(t)=a(t)x_\epsilon(t)+b(t)[u^*(t)+\epsilon h(t)]\\ x_\epsilon(t_0)=\alpha\end{cases}$$
and hence
$$x_\epsilon(t)=e^{\int_{t_0}^{t}a(s)\,ds}\Big(\alpha+\int_{t_0}^{t}b(s)[u^*(s)+\epsilon h(s)]\,e^{-\int_{t_0}^{s}a(w)\,dw}\,ds\Big).$$
Now let us choose the function $\lambda$ as the solution of the following ODE:
$$\begin{cases}\dot\lambda=-\nabla_xH(t,x^*,u^*,\lambda)&\text{for }t\in[t_0,t_1]\\ \lambda(t_1)=0\end{cases}\tag{2.13}$$
Since
$$\nabla_xH(t,x^*,u^*,\lambda)=\nabla_xf(t,x^*,u^*)+\lambda\cdot\nabla_xg(t,x^*,u^*),$$
the previous differential equation is linear in $\lambda$. Hence the assumptions of the theorem imply that there exists a unique⁴ solution $\lambda^*\in C([t_0,t_1])$ of (2.13), so that conditions (2.8) and (2.9) hold. For this choice $\lambda=\lambda^*$, (2.12) gives
$$\int_{t_0}^{t_1}\frac{\partial H}{\partial u_i}(t,x^*,u^*,\lambda^*)\,h_i\,dt=0,\tag{2.14}$$
for every $i$, with $1\le i\le k$, and every $h=(h_1,\ldots,h_k)\in C([t_0,t_1])$. Lemma 2.1 and (2.14) imply that $\frac{\partial H}{\partial u_i}(t,x^*,u^*,\lambda^*)=0$ in $[t_0,t_1]$, and hence (2.7).
2.2 Sufficient conditions
In order to study the problem (1.12), one of the main results about sufficient conditions for a control to be optimal is due to Mangasarian (see [11]). Recalling that in the simplest problem every extremal control is normal (see Remark 2.1), we have:

Theorem 2.3 (Mangasarian). Let us consider the maximum problem (1.12) with $f\in C^1$ and $g\in C^1$. Let the control set $U$ be convex. Let $u^*$ be a normal extremal control, $x^*$ the associated trajectory and $\lambda^*=(\lambda^*_1,\ldots,\lambda^*_n)$ the associated multiplier (as in Theorem 2.1). Consider the Hamiltonian function $H$ and let us suppose that

v) the function $(x,u)\mapsto H(t,x,u,\lambda^*)$ is concave for every $t\in[t_0,t_1]$.

Then $u^*$ is optimal.

Proof. The assumptions of regularity and concavity on $H$ imply⁵
$$H(t,x,u,\lambda^*)\le H(t,x^*,u^*,\lambda^*)+\nabla_xH(t,x^*,u^*,\lambda^*)\cdot(x-x^*)+\nabla_uH(t,x^*,u^*,\lambda^*)\cdot(u-u^*),\tag{2.15}$$

⁴We recall that Theorem 1.1 applies to first-order ODEs with continuous coefficients.
⁵We recall that if $F$ is a differentiable function on a convex set $C\subseteq\mathbb{R}^n$, then $F$ is concave in $C$ if and only if, for every $v,v'\in C$, we have $F(v)\le F(v')+\nabla F(v')\cdot(v-v')$.
for every admissible control $u$ with associated trajectory $x$, and for every $t\in[t_0,t_1]$. The PMP implies, see (2.4), that
$$\nabla_uH(t,x^*,u^*,\lambda^*)\cdot(u(t)-u^*(t))\le 0.\tag{2.16}$$
The adjoint equation ii), (2.15) and (2.16) imply
$$H(t,x,u,\lambda^*)\le H(t,x^*,u^*,\lambda^*)-\dot\lambda^*\cdot(x-x^*).\tag{2.17}$$
Since $x$ and $x^*$ are the trajectories associated to $u$ and $u^*$ respectively, by (2.17) we have
$$f(t,x,u)\le f(t,x^*,u^*)+\lambda^*\cdot\big(g(t,x^*,u^*)-g(t,x,u)\big)-\dot\lambda^*\cdot(x-x^*)
= f(t,x^*,u^*)+\lambda^*\cdot(\dot x^*-\dot x)+\dot\lambda^*\cdot(x^*-x)
= f(t,x^*,u^*)+\frac{d}{dt}\big(\lambda^*\cdot(x^*-x)\big).\tag{2.18}$$
Hence, for every admissible control $u$ with associated trajectory $x$, we have
$$\int_{t_0}^{t_1}f(t,x,u)\,dt\le\int_{t_0}^{t_1}f(t,x^*,u^*)\,dt+\Big[\lambda^*\cdot(x^*-x)\Big]_{t_0}^{t_1}
=\int_{t_0}^{t_1}f(t,x^*,u^*)\,dt+\lambda^*(t_1)\cdot\big(x^*(t_1)-x(t_1)\big)-\lambda^*(t_0)\cdot\big(x^*(t_0)-x(t_0)\big);$$
since $x^*(t_0)=x(t_0)=\alpha$ and the transversality condition iii) is satisfied, we obtain that $u^*$ is optimal.
In order to apply such theorem, it is easy to prove the following remark:

Remark 2.3. If we replace assumption v) of Theorem 2.3 with one of the following assumptions:

v') for every $t\in[t_0,t_1]$, let $f$ and $g$ be concave in the variables $x$ and $u$, and let us suppose $\lambda^*(t)\ge 0$ (i.e. for every $i$, $1\le i\le n$, $\lambda^*_i(t)\ge 0$);

v'') let the dynamics of problem (1.12) be linear and, for every $t\in[t_0,t_1]$, let $f$ be concave in the variables $x$ and $u$;

then $u^*$ is optimal.
A further sufficient condition is due to Arrow: we are interested in a particular situation of the problem (1.12), more precisely
$$\begin{cases}\displaystyle\max_{u\in\mathcal C}\int_{t_0}^{t_1}f(t,x,u)\,dt\\ \dot x=g(t,x,u)\\ x(t_0)=\alpha\\ \mathcal C=\{u:[t_0,t_1]\to U,\ u\in KC\}\end{cases}\tag{2.19}$$
with $U\subseteq\mathbb{R}^k$ closed and convex. Let us define the function $U:[t_0,t_1]\times\mathbb{R}^n\times\mathbb{R}^n\to\mathbb{R}^k$ by, for every $(t,x,\lambda)$,
$$U(t,x,\lambda)=\arg\max_{u\in U}H(t,x,u,\lambda),$$
where $H(t,x,u,\lambda)=f(t,x,u)+\lambda\cdot g(t,x,u)$ is the Hamiltonian. Now we define the maximized Hamiltonian function $H^0:[t_0,t_1]\times\mathbb{R}^n\times\mathbb{R}^n\to\mathbb{R}$ by
$$H^0(t,x,\lambda)=H(t,x,U(t,x,\lambda),\lambda).\tag{2.20}$$
We have the following result by Arrow (see [1], [5] section 8.3, [9] part II section 15):

Theorem 2.4 (Arrow). Let us consider the maximum problem (2.19) with $f\in C^1$ and $g\in C^1$. Let $u^*$ be a normal extremal control, $x^*$ be the associated trajectory and $\lambda^*=(\lambda^*_1,\ldots,\lambda^*_n)$ be the associated multiplier. Consider the maximized Hamiltonian function $H^0$ and let us suppose that, for every $t\in[t_0,t_1]$, the function
$$x\mapsto H^0(t,x,\lambda^*)$$
is concave. Moreover, we suppose that the function $U$ along the curve $t\mapsto(t,x^*(t),\lambda^*(t))$ is equal to $u^*$, i.e.
$$u^*(t)=U(t,x^*(t),\lambda^*(t)),\qquad t\in[t_0,t_1].\tag{2.21}$$
Then $u^*$ is optimal.
Proof. Let us consider $t$ fixed in $[t_0,t_1]$ (and hence we write $x^*=x^*(t)$, $u^*=u^*(t)$, and so on). Our aim is to prove relation (2.17) under our new assumptions. First of all we note that the definitions of $H^0$ and $U(t,x,\lambda)$, together with (2.21), imply
$$H^0(t,x^*,\lambda^*)=H(t,x^*,U(t,x^*,\lambda^*),\lambda^*)=H(t,x^*,u^*,\lambda^*)$$
and $H(t,x,u,\lambda^*)\le H^0(t,x,\lambda^*)$ for every $x$, $u$. These relations give
$$H(t,x,u,\lambda^*)-H(t,x^*,u^*,\lambda^*)\le H^0(t,x,\lambda^*)-H^0(t,x^*,\lambda^*).\tag{2.22}$$
Since the function $x\mapsto H^0(t,x,\lambda^*)$ is concave, there exists a supergradient⁶ $a$ at the point $x^*$, i.e.
$$H^0(t,x,\lambda^*)\le H^0(t,x^*,\lambda^*)+a\cdot(x-x^*),\qquad\forall x\in\mathbb{R}^n.\tag{2.23}$$

⁶We recall that, given a function $g:\mathbb{R}^n\to\mathbb{R}$, we say that $a\in\mathbb{R}^n$ is a supergradient (respectively subgradient) at the point $x_0$ if
$$g(x)\le g(x_0)+a\cdot(x-x_0)\quad\forall x\in\mathbb{R}^n\qquad(\text{respectively }g(x)\ge g(x_0)+a\cdot(x-x_0)).$$
A fundamental result in convex analysis is the following:

Theorem 2.5 (Rockafellar). Let $g:\mathbb{R}^n\to\mathbb{R}$ be a concave (convex) function. Then, for every $x_0\in\mathbb{R}^n$, the set of supergradients (subgradients) at $x_0$ is non-empty.
Clearly from (2.22) and (2.23) we have
$$H(t,x,u,\lambda^*)-H(t,x^*,u^*,\lambda^*)\le a\cdot(x-x^*).\tag{2.24}$$
In particular, choosing $u=u^*$, we have
$$H(t,x,u^*,\lambda^*)-H(t,x^*,u^*,\lambda^*)\le a\cdot(x-x^*).\tag{2.25}$$
Now let us define the function $G:\mathbb{R}^n\to\mathbb{R}$ by
$$G(x)=H(t,x,u^*,\lambda^*)-H(t,x^*,u^*,\lambda^*)-a\cdot(x-x^*).$$
Clearly, by (2.25), $G$ has a maximum at the point $x^*$; moreover it is easy to see that $G$ is differentiable. We obtain
$$0=\nabla G(x^*)=\nabla_xH(t,x^*,u^*,\lambda^*)-a.$$
Now the adjoint equation and (2.24) give
$$H(t,x,u,\lambda^*)\le H(t,x^*,u^*,\lambda^*)-\dot\lambda^*\cdot(x-x^*).$$
Note that this last relation coincides with (2.17): at this point, using the same arguments as in the second part of the proof of Theorem 2.3, we are able to conclude the proof.
2.3 First generalizations
2.3.1 Initial/final conditions on the trajectory
What happens if we modify the initial or the final condition on the trajectory? We have found the fundamental ideas in the proof of Theorem 2.2 (see (2.11)), in the proof of Theorem 2.3 and hence in the proof of Theorem 2.4: more precisely, using the notation in (2.11), if t̃ is the initial or the final point of the interval [t0, t1], we have the following two possibilities:

• if x(t̃) = α is fixed, then x*(t̃) = α; hence the variation of x at t̃ is zero and we have no condition on the value λ*(t̃);

• if x(t̃) is free, then x*(t̃) is free too; hence we have no information on the variation of x at t̃, and we have to require the condition λ*(t̃) = 0.

We leave the details to the reader, but it is clear that slight modifications of the initial/final points of the trajectory of the problem (1.12) give some slight differences in the transversality conditions in Theorem 2.2, in Theorem 2.3 and in Theorem 2.4.

Pay attention: if the initial and the final point of the trajectory are both fixed, it is not possible to guarantee that λ*_0 is different from zero, i.e. that the extremal control is normal; note that in the case of an abnormal extremal control the previous sufficient conditions do not work (see Example 2.5.3 and Example 2.5.6).
2.3.2 On minimum problems
Let us consider the problem (1.12), where we replace the maximum with a minimum problem. Since

min ∫_{t0}^{t1} f(t, x, u) dt = −max ∫_{t0}^{t1} (−f(t, x, u)) dt,

clearly it is possible to solve a min problem by passing to a max problem with some minus signs.

Alternatively, a more direct approach consists in replacing some words in all the previous pages as follows:

max ⇝ min,  concave ⇝ convex.

In particular, in (2.1) we obtain the Pontryagin Minimum Principle.
2.4 The case of Calculus of Variation
In this section we will show that the theorem of Euler of Calculus of Variation is an easy consequence of the theorem of Pontryagin of Optimal Control. Suppose we are interested in the problem

max_{x∈C^1} ∫_{t0}^{t1} f(t, x, ẋ) dt
x(t0) = α
(2.26)

with α ∈ R^n fixed. We remark that here x is in C^1. We have the following fundamental result:

Theorem 2.6 (Euler). Let us consider the problem (2.26) with f ∈ C^1. Let x* be optimal. Then, for all t ∈ [t0, t1], we have

d/dt (∇_ẋ f(t, x*(t), ẋ*(t))) = ∇_x f(t, x*(t), ẋ*(t)).   (2.27)

In calculus of variation the equation (2.27) is called the Euler equation (shortly EU); a function that satisfies EU is called an extremal. Let us prove this result. If we introduce a new variable u = ẋ, we rewrite problem (2.26) as

max_{u∈C} ∫_{t0}^{t1} f(t, x, u) dt
ẋ = u
x(t0) = α

Theorem 2.2 guarantees that, for the Hamiltonian H(t, x, u, λ) = f(t, x, u) + λ · u, we have

∇_u H(t, x*, u*, λ*) = 0  ⟹  ∇_u f + λ* = 0   (2.28)
∇_x H(t, x*, u*, λ*) = −λ̇*  ⟹  ∇_x f = −λ̇*   (2.29)
If we take a derivative with respect to time in (2.28) and use (2.29), we have

d/dt (∇_u f) = −λ̇* = ∇_x f;

taking into account ẋ = u, we obtain (2.27). Moreover, we are able to find the transversality condition of Calculus of Variation: (2.9) and (2.28) imply

∇_ẋ f(t1, x*(t1), ẋ*(t1)) = 0.

As in subsection 2.3.1 we obtain

Remark 2.4. Consider Theorem 2.6 and its assumptions, and let us modify slightly the conditions on the initial and the final points of x. We have the following transversality condition:

• if x(t_i) is free, then ∇_ẋ f(t_i, x*(t_i), ẋ*(t_i)) = 0,

where t_i is the initial or the final point of the interval [t0, t1].
A useful remark, in some situations, is that if f does not depend on x, i.e. f = f(t, ẋ), then the Euler equation (2.27) becomes

∇_ẋ f(t, ẋ) = c,

where c is a constant. Moreover, the following remark is not so obvious:

Remark 2.5. If f = f(x, ẋ) does not depend directly on t, then the Euler equation (2.27) becomes

f(x, ẋ) − ẋ · ∇_ẋ f(x, ẋ) = c,   (2.30)

where c ∈ R is a constant.

Proof. Clearly

d/dt f = ẋ · ∇_x f + ẍ · ∇_ẋ f,   d/dt (ẋ · ∇_ẋ f) = ẍ · ∇_ẋ f + ẋ · d/dt (∇_ẋ f).

Now let us suppose that x satisfies the Euler condition (2.27): hence, using the previous two equalities, we obtain

0 = ẋ · (d/dt ∇_ẋ f(x, ẋ) − ∇_x f(x, ẋ))
  = d/dt (ẋ · ∇_ẋ f(x, ẋ)) − d/dt f(x, ẋ)
  = d/dt (ẋ · ∇_ẋ f(x, ẋ) − f(x, ẋ)).

Hence we obtain (2.30).
If we are interested in finding sufficient conditions of optimality for the problem (2.26), since the dynamics is linear, Remark 2.3 implies
Remark 2.6. Let us consider an extremal x* for the problem (2.26) under the assumptions of the theorem of Euler. Suppose that x* satisfies the transversality conditions. If, for every t ∈ [t0, t1], the function f is concave in the variables x and ẋ, then x* is optimal.
2.5 Examples and applications
Example 2.5.1. Consider^7

max ∫_0^1 (x − u²) dt
ẋ = u
x(0) = 2

1st method: Clearly the Hamiltonian is H = x − u² + λu (note that the extremal is certainly normal) and Theorem 2.2 implies

∂H/∂u = 0  ⟹  −2u + λ = 0   (2.31)
∂H/∂x = −λ̇  ⟹  λ̇ = −1   (2.32)
∂H/∂λ = ẋ  ⟹  ẋ = u   (2.33)
λ(1) = 0   (2.34)

Equations (2.32) and (2.34) give λ = 1 − t; consequently, by (2.31), we have u* = (1 − t)/2; hence the initial condition and (2.33) give x*(t) = (2t − t²)/4 + 2. The dynamics is linear and f is concave in x and u; Remark 2.3 guarantees that the extremal u* is optimal.
2nd method: The problem is, clearly, of calculus of variations, i.e.

max ∫_0^1 (x − ẋ²) dt
x(0) = 2

The necessary condition of Euler (2.27) and the transversality condition give

d/dt f_ẋ(t, x*, ẋ*) = f_x(t, x*, ẋ*)  ⟹  −2ẍ* = 1  ⟹  x*(t) = −t²/4 + at + b, a, b ∈ R,
f_ẋ(1, x*(1), ẋ*(1)) = 0  ⟹  −2ẋ*(1) = 0.

An easy calculation, using the initial condition x(0) = 2, implies x*(t) = −t²/4 + t/2 + 2. Since the function (x, ẋ) ↦ (x − ẋ²) is concave, x* is really the maximum of the problem.
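The extremal obtained with the two methods can also be checked numerically. The following Python sketch (the grid, the finite-difference step and the tolerances are arbitrary choices, not part of the text) verifies that the triple of the 1st method satisfies (2.31)–(2.34) and the initial condition:

```python
# Candidate extremal of Example 2.5.1:
#   lambda(t) = 1 - t,  u*(t) = (1 - t)/2,  x*(t) = -t^2/4 + t/2 + 2
lam = lambda t: 1.0 - t
u_star = lambda t: (1.0 - t) / 2.0
x_star = lambda t: -t * t / 4.0 + t / 2.0 + 2.0

h = 1e-6
for i in range(101):
    t = i / 100.0
    # (2.31): dH/du = -2u + lambda = 0
    assert abs(-2.0 * u_star(t) + lam(t)) < 1e-12
    # (2.32): adjoint equation, lambda' = -1 (central difference)
    assert abs((lam(t + h) - lam(t - h)) / (2.0 * h) + 1.0) < 1e-9
    # (2.33): dynamics, x' = u
    assert abs((x_star(t + h) - x_star(t - h)) / (2.0 * h) - u_star(t)) < 1e-6
# (2.34) transversality and the initial condition
assert abs(lam(1.0)) < 1e-12 and abs(x_star(0.0) - 2.0) < 1e-12
print("Pontryagin conditions of Example 2.5.1 verified")
```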
Example 2.5.2. Consider^8

max ∫_0^2 (2x − 4u) dt
ẋ = x + u
x(0) = 5
0 ≤ u ≤ 2

^7 In example 5.3.1 we solve the same problem with dynamic programming.
^8 In example 5.3.3 we solve the same problem with dynamic programming.
Let us consider the Hamiltonian H = 2x − 4u + λ(x + u) (note that the extremal is certainly normal). The theorem of Pontryagin gives

H(t, x*, u*, λ*) = max_{v∈[0,2]} H(t, x*, v, λ*)
⟹  2x* − 4u* + λ*(x* + u*) = max_{v∈[0,2]} (2x* − 4v + λ*(x* + v))   (2.35)
∂H/∂x = −λ̇*  ⟹  λ̇* = −2 − λ*   (2.36)
∂H/∂λ = ẋ*  ⟹  ẋ* = x* + u*   (2.37)
λ*(2) = 0   (2.38)

From (2.35) we have, for every t ∈ [0, 2],

u*(t)(λ*(t) − 4) = max_{v∈[0,2]} v(λ*(t) − 4)

and hence

u*(t) = { 2 for λ*(t) − 4 > 0,
          0 for λ*(t) − 4 < 0,
          ? for λ*(t) − 4 = 0.   (2.39)

(2.36) implies λ*(t) = ae^{−t} − 2, a ∈ R; using (2.38) we obtain

λ*(t) = 2(e^{2−t} − 1).   (2.40)

Since λ*(t) > 4 if and only if t ∈ [0, 2 − log 3), the extremal control is

u*(t) = { 2 for 0 ≤ t ≤ 2 − log 3,
          0 for 2 − log 3 < t ≤ 2.   (2.41)

We remark that the value of the function u* at t = 2 − log 3 is irrelevant. Since the dynamics is linear and the function f(t, x, u) = 2x − 4u is concave, u* is optimal.
If we are interested in finding the optimal trajectory, the relations (2.37) and (2.41), and the initial condition, require us to solve the ODE

ẋ = x + 2 in [0, 2 − log 3),  x(0) = 5.   (2.42)

The solution is x*(t) = 7e^t − 2. Taking into account that the trajectory is a continuous function, by (2.42) we have x*(2 − log 3) = 7e^{2−log 3} − 2 = 7e²/3 − 2. Hence the relations (2.37) and (2.41) require us to solve the ODE

ẋ = x in [2 − log 3, 2],  x(2 − log 3) = 7e²/3 − 2.

We obtain

x*(t) = { 7e^t − 2 for 0 ≤ t ≤ 2 − log 3,
          (7e² − 6)e^{t−2} for 2 − log 3 < t ≤ 2.   (2.43)
[Figure: the optimal control u* and the optimal trajectory x* on [0, 2]; the switching point is 2 − log 3.]
We note that an easy computation gives H(t, x*(t), u*(t), λ*(t)) = 14e² − 12 for all t ∈ [0, 2].
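The bang-bang extremal of this example lends itself to a direct numerical check. The sketch below (the tolerances and the sample grid are arbitrary choices) verifies the switching level λ* = 4 at τ = 2 − log 3, the continuity of x* at the switching point, and the constancy of the Hamiltonian along the extremal:

```python
import math

# Closed-form pieces of Example 2.5.2: (2.40), (2.41) and (2.43)
tau = 2.0 - math.log(3.0)                     # switching time
lam = lambda t: 2.0 * (math.exp(2.0 - t) - 1.0)
u_star = lambda t: 2.0 if t < tau else 0.0
x_star = lambda t: (7.0 * math.exp(t) - 2.0 if t < tau
                    else (7.0 * math.e**2 - 6.0) * math.exp(t - 2.0))
H = lambda t: 2.0 * x_star(t) - 4.0 * u_star(t) + lam(t) * (x_star(t) + u_star(t))

assert abs(lam(tau) - 4.0) < 1e-12            # the switch happens where lambda = 4
assert abs(x_star(tau - 1e-9) - x_star(tau + 1e-9)) < 1e-6   # x* is continuous
for i in range(201):
    t = 2.0 * i / 200.0
    assert abs(H(t) - (14.0 * math.e**2 - 12.0)) < 1e-8      # H is constant
print("bang-bang extremal of Example 2.5.2 verified")
```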
Example 2.5.3. Find the optimal triple for

max ∫_0^4 3x dt
ẋ = x + u
x(0) = 0
x(4) = 3e⁴/2
0 ≤ u ≤ 2

Let us consider the Hamiltonian H = 3x + λ(x + u); it is not possible to guarantee that the extremal is normal, but we try to put λ0 = 1, since this situation is simpler; if we do not find an extremal (more precisely, a normal extremal), then we will pass to the more general Hamiltonian H = 3λ0 x + λ(x + u) (and in this situation the extremal certainly exists). The theorem of Pontryagin gives

H(t, x*, u*, λ*) = max_{v∈[0,2]} H(t, x*, v, λ*)  ⟹  λ* u* = max_{v∈[0,2]} λ* v

⟹  u*(t) = { 2 for λ*(t) > 0,
             0 for λ*(t) < 0,
             ? for λ*(t) = 0.   (2.44)

∂H/∂x = −λ̇*  ⟹  λ*(t) = ae^{−t} − 3, a ∈ R   (2.45)
∂H/∂λ = ẋ*  ⟹  ẋ* = x* + u*   (2.46)

Note that we have to maximize the area under the function t ↦ 3x*(t), and that x*(t) ≥ 0, since x(0) = 0 and ẋ = u + x ≥ x. In order to maximize the area, it is reasonable that the function x* is increasing in an interval of the type [0, τ): hence it is reasonable to suppose that there exists a positive constant τ such that λ*(t) > 0 for t ∈ [0, τ). In this case, (2.44) gives u* = 2. Hence we have to solve the ODE

ẋ = x + 2 in [0, τ),  x(0) = 0.   (2.47)

The solution is x*(t) = 2(e^t − 1). We note that for such a function we have x*(4) = 2(e⁴ − 1) > 3e⁴/2; hence it is not possible that τ ≥ 4: we suppose that λ*(t) < 0 for t ∈ (τ, 4]. Taking into account the final condition on the trajectory, we have to solve the ODE

ẋ = x in (τ, 4],  x(4) = 3e⁴/2.   (2.48)

The solution is x*(t) = 3e^t/2. We do not know the point τ, but certainly the trajectory is continuous, i.e.

lim_{t→τ⁻} x*(t) = lim_{t→τ⁺} x*(t)  ⟹  2(e^τ − 1) = 3e^τ/2,

which implies τ = ln 4. Moreover, since the multiplier is continuous, we are in the position to find the constant a in (2.45): more precisely, λ*(t) = 0 for t = ln 4 implies a = 12. Finally, the dynamics and the running cost are linear and hence the sufficient conditions are satisfied. The optimal triple is

u*(t) = { 2 for t ∈ [0, ln 4),
          0 for t ∈ [ln 4, 4],   (2.49)
x*(t) = { 2(e^t − 1) for t ∈ [0, ln 4),
          3e^t/2 for t ∈ [ln 4, 4],

and λ*(t) = 12e^{−t} − 3. We note that an easy computation gives H(t, x*(t), u*(t), λ*(t)) = 18 for all t ∈ [0, 4].

Example 2.5.4.

min ∫_1^e (3ẋ + tẋ²) dt
x(1) = 1
x(e) = 1

It is a calculus of variation problem. Since f = 3ẋ + tẋ² does not depend on x, the necessary condition of Euler implies

3 + 2tẋ = c,

where c is a constant. Hence ẋ(t) = a/t, a ∈ R, implies the solution x(t) = a ln t + b, a, b ∈ R. Using the initial and the final conditions we obtain the extremal x*(t) = 1. Since f is convex in x and ẋ, the extremal is the minimum of the problem.
Example 2.5.5.

min ∫_0^{√2} (x² − ẋx + 2ẋ²) dt
x(0) = 1

It is a calculus of variation problem; the necessary condition of Euler (2.27) gives

d/dt f_ẋ = f_x  ⟹  4ẍ − ẋ = 2x − ẋ  ⟹  2ẍ − x = 0  ⟹  x(t) = ae^{t/√2} + be^{−t/√2},

for a, b ∈ R. Hence the initial condition x(0) = 1 gives b = 1 − a. Since there is no final condition on the trajectory, we have to satisfy the transversality condition, i.e.

f_ẋ(t1, x*(t1), ẋ*(t1)) = 0  ⟹  4ẋ*(√2) − x*(√2) = 0  ⟹  (4/√2)(ae − (1 − a)/e) − (ae + (1 − a)/e) = 0.

Hence

x*(t) = [(4 + √2)e^{t/√2} + (4 − √2)e² e^{−t/√2}] / [4 + √2 + (4 − √2)e²].

The function f(t, x, ẋ) = x² − ẋx + 2ẋ² is convex in the variables x and ẋ, since its Hessian matrix with respect to (x, ẋ),

d²f = [  2  −1
        −1   4 ],

is positive definite. Hence x* is a minimum.
The following example gives an abnormal control.

Example 2.5.6. Prove that u* = 1 satisfies the PMP with λ0 = 0 and that it is optimal for

max ∫_0^1 u dt
ẋ = (u − u²)²
x(0) = 0
x(1) = 0
0 ≤ u ≤ 2
Clearly H = λ0 u + λ(u − u²)²; the PMP and the adjoint equation give

u*(t) ∈ arg max_{v∈[0,2]} [λ0 v + λ(v − v²)²],   λ̇ = 0.

It is clear that the trajectory associated to u* = 1 is x* = 0. If we consider λ0 = 0 and λ = k, where k is a negative constant, then it is easy to see that (u*, x*, λ0, λ) satisfies the previous necessary conditions.

In order to prove that u* is optimal, we cannot use Mangasarian's theorem. We note that the initial and the final conditions on the trajectory, and the fact that ẋ = (u − u²)² ≥ 0, imply that ẋ = 0 a.e.; hence if a control u is admissible, then we have u(t) ∈ {0, 1} a.e. This implies, for every admissible control u,

∫_0^1 u(t) dt ≤ ∫_0^1 1 dt = ∫_0^1 u*(t) dt;

hence u* is a maximum.
2.5.1 The curve of minimal length
We have to solve the calculus of variation problem (1.1). Since the function f(t, x, ẋ) = √(1 + ẋ²) does not depend on x, the necessary condition of Euler (2.27) gives

f_ẋ = a  ⟹  ẋ/√(1 + ẋ²) = a  ⟹  ẋ = c  ⟹  x(t) = ct + d,

with a, c, d ∈ R constants. The conditions x(0) = 0 and x(1) = b imply x*(t) = bt. The function f is constant, and hence convex, in x, and it is convex in ẋ since

∂²f/∂ẋ² = (1 + ẋ²)^{−3/2} > 0.

This proves that the line x* is the solution of the problem.
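The optimality of the straight line can also be illustrated numerically: among the perturbed curves x(t) = bt + ε sin(πt), which satisfy the same boundary conditions x(0) = 0, x(1) = b, the arclength functional is smallest for ε = 0. A minimal sketch (the perturbation family and the discretization are arbitrary choices):

```python
import math

b = 2.0  # final condition x(1) = b (arbitrary choice)

def length(eps, n=20000):
    """Midpoint-rule approximation of int_0^1 sqrt(1 + x'(t)^2) dt
    for x(t) = b t + eps sin(pi t)."""
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n
        xdot = b + eps * math.pi * math.cos(math.pi * t)
        total += math.sqrt(1.0 + xdot * xdot) / n
    return total

L0 = length(0.0)
# the straight line has length sqrt(1 + b^2)
assert abs(L0 - math.sqrt(1.0 + b * b)) < 1e-6
# every tested perturbation is strictly longer
for eps in (-0.5, -0.1, 0.1, 0.5):
    assert length(eps) > L0
print("the straight line minimizes the arclength functional")
```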
2.5.2 A problem of business strategy I
We solve^9 the model presented in example 1.1.2, formulated with (1.3). We consider the Hamiltonian H = (1 − u)x + λxu: the theorem of Pontryagin implies that

H(t, x*, u*, λ*) = max_{v∈[0,1]} H(t, x*, v, λ*)
⟹  (1 − u*)x* + λ*x*u* = max_{v∈[0,1]} [(1 − v)x* + λ*x*v]
⟹  u*x*(λ* − 1) = max_{v∈[0,1]} [vx*(λ* − 1)]   (2.50)
∂H/∂x = −λ̇*  ⟹  1 − u* + λ*u* = −λ̇*   (2.51)
∂H/∂λ = ẋ*  ⟹  ẋ* = x*u*   (2.52)
λ*(T) = 0   (2.53)
^9 In subsection 5.3.1 we solve the same problem with the Dynamic Programming approach.
Since x* is continuous, x*(0) = α > 0 and u* ≥ 0, from (2.52) we obtain

ẋ* = x*u* ≥ 0,   (2.54)

in [0, T]. Hence x*(t) ≥ α for all t ∈ [0, T]. Relation (2.50) becomes

u*(λ* − 1) = max_{v∈[0,1]} v(λ* − 1).

Hence

u*(t) = { 1 if λ*(t) − 1 > 0,
          0 if λ*(t) − 1 < 0,
          ? if λ*(t) − 1 = 0.   (2.55)

Since the multiplier is a continuous function that satisfies (2.53), there exists τ ∈ [0, T) such that

λ*(t) < 1,  ∀t ∈ [τ, T].   (2.56)

Using (2.55) and (2.56), we have to solve the ODE

λ̇ = −1 in [τ, T],  λ(T) = 0,

which implies

λ*(t) = T − t,  ∀t ∈ [τ, T].   (2.57)

Clearly, we have two cases: T ≤ 1 (case A) and T > 1 (case B).
[Figure: the multiplier λ*; the case T ≤ 1 and the case T > 1.]
Case A: T ≤ 1. In this situation, we obtain τ = 0 and hence u* = 0 and x* = α in [0, T].

[Figure: the optimal control u*, the optimal trajectory x* and the multiplier λ* in case A.]
From an economic point of view, if the time horizon is short, the optimal strategy is to sell all our production without any investment. Note that the strategy u* that we have found is an extremal: in order to guarantee the sufficient conditions for such an extremal we refer the reader to case B.

Case B: T > 1. In this situation, taking into account (2.55), we have τ = T − 1. Hence

λ*(T − 1) = 1.   (2.58)

First of all, if there exists an interval I ⊂ [0, T − 1) such that λ*(t) < 1, then u* = 0 and (2.51) gives λ̇* = −1: this is impossible, since λ*(T − 1) = 1. Secondly, if there exists an interval I ⊂ [0, T − 1) such that λ*(t) = 1, then λ̇* = 0 and (2.51) gives −1 = 0: this is impossible. Let us then suppose that there exists an interval I = [τ′, T − 1) ⊂ [0, T − 1) such that λ*(t) > 1: using (2.55), we have to solve the ODE

λ̇ + λ = 0 in [τ′, T − 1],  λ(T − 1) = 1,

which implies λ*(t) = e^{T−t−1}, for t ∈ [τ′, T − 1].

We remark that the choice τ′ = 0 is consistent with all the necessary conditions. Hence (2.55) gives

u*(t) = { 1 for 0 ≤ t ≤ T − 1,
          0 for T − 1 < t ≤ T.   (2.59)

The continuity of the function x*, the initial condition x(0) = α and the dynamics imply

ẋ = x in [0, T − 1],  x(0) = α,

which gives x*(t) = αe^t; hence

ẋ = 0 in [T − 1, T],  x(T − 1) = αe^{T−1},

which gives x*(t) = αe^{T−1}. Consequently

x*(t) = { αe^t for 0 ≤ t ≤ T − 1,
          αe^{T−1} for T − 1 < t ≤ T.

Recalling that

λ*(t) = { e^{T−t−1} for 0 ≤ t ≤ T − 1,
          T − t for T − 1 < t ≤ T,

we have
[Figure: the optimal control u*, the optimal trajectory x* and the multiplier λ* in case B; the switching point is T − 1.]
In an economic situation where the business strategy can be chosen over a medium or long horizon, the optimal strategy is to direct all output to increasing production and then sell everything to make a profit in the final period.

We remark that we have to prove some sufficient conditions for the triple (x*, u*, λ*) in order to guarantee that u* is really the optimal strategy. An easy computation shows that the Hamiltonian is not concave. We study the maximized Hamiltonian (2.20): taking into account that x*(t) ≥ α > 0, we obtain

H^0(t, x, λ) = max_{v∈[0,1]} [(1 − v)x + λxv] = x + x max_{v∈[0,1]} [(λ − 1)v].

In order to apply Theorem 2.4, using the expression of λ*, we obtain

H^0(t, x, λ*(t)) = { e^{T−t−1}x if t ∈ [0, T − 1),
                     x if t ∈ [T − 1, T],

and

U(t, x, λ*(t)) = { 1 if t ∈ [0, T − 1),
                   0 if t ∈ [T − 1, T].

Note that, for every fixed t, the function x ↦ H^0(t, x, λ*(t)) is concave with respect to x, and that the function U(t, x*(t), λ*(t)) coincides with u*: the sufficient condition of Arrow holds. We note that an easy computation gives H(t, x*(t), u*(t), λ*(t)) = αe^{T−1} for all t ∈ [0, T].
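The economic conclusion can also be illustrated by direct simulation: integrating the dynamics ẋ = xu with an explicit Euler scheme, the bang-bang strategy (2.59) yields a larger total profit than selling everything from the start, and its value agrees with αe^{T−1}. A sketch for T = 3, α = 1 (the horizon, the parameters, the step size and the tolerance are arbitrary choices):

```python
import math

T, alpha = 3.0, 1.0  # horizon T > 1 (case B) and initial stock

def profit(u, n=200000):
    """Euler integration of x' = x u and of the payoff int_0^T (1 - u) x dt."""
    dt = T / n
    x, J = alpha, 0.0
    for i in range(n):
        t = i * dt
        J += (1.0 - u(t)) * x * dt
        x += x * u(t) * dt
    return J

bang_bang = profit(lambda t: 1.0 if t < T - 1.0 else 0.0)  # strategy (2.59)
sell_all = profit(lambda t: 0.0)                           # u = 0 on [0, T]
assert abs(bang_bang - alpha * math.exp(T - 1.0)) < 1e-2   # value alpha e^{T-1}
assert bang_bang > sell_all                                # here sell_all = alpha T
print(bang_bang, sell_all)
```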
2.5.3 A two-sector model
This model has some similarities with the previous one; it is proposed in [15].

Consider an economy consisting of two sectors, where sector no. 1 produces investment goods and sector no. 2 produces consumption goods. Let x_i(t) be the production in sector no. i per unit of time, i = 1, 2, and let u(t) be the proportion of investments allocated to sector no. 1. We assume that ẋ1 = αux1 and ẋ2 = α(1 − u)x1, where α is a positive constant. Hence, the increase in production per unit of time in each sector is assumed to be proportional to the investment allocated to the sector. By definition, 0 ≤ u(t) ≤ 1, and if the planning period starts at t = 0, x1(0) and x2(0) are historically given. In this situation a number of optimal control problems could be investigated. Let us, in particular, consider the problem of maximizing the
total consumption in a given planning period [0, T]. Our precise problem is as follows:

max_{u∈C} ∫_0^T x2 dt
ẋ1 = αux1
ẋ2 = α(1 − u)x1
x1(0) = a1
x2(0) = a2
C = {u : [0, T] → [0, 1] ⊂ R, u ∈ KC}

where α, a1, a2 and T are positive and fixed. We study the case T > 2/α.
We consider the Hamiltonian H = x2 + λ1αux1 + λ2α(1 − u)x1; the theorem of Pontryagin implies that

u* ∈ arg max_{v∈[0,1]} H(t, x*, v, λ) = arg max_{v∈[0,1]} [x2* + λ1αvx1* + λ2α(1 − v)x1*]
⟹  u* ∈ arg max_{v∈[0,1]} (λ1 − λ2)vx1*   (2.60)
∂H/∂x1 = −λ̇1  ⟹  λ̇1 = −λ1αu* − λ2α(1 − u*)   (2.61)
∂H/∂x2 = −λ̇2  ⟹  λ̇2 = −1   (2.62)
λ1(T) = 0   (2.63)
λ2(T) = 0   (2.64)

Clearly (2.62) and (2.64) give us λ2(t) = T − t. Moreover (2.63) and (2.64) in equation (2.61) give λ̇1(T) = 0. We note that

λ1(T) = λ2(T) = 0,  λ̇1(T) = 0,  λ̇2(T) = −1,

and the continuity of the multiplier (λ1, λ2) implies that there exists τ < T such that

T − t = λ2(t) > λ1(t),  ∀t ∈ (τ, T).   (2.65)

Since x1* is continuous, x1*(0) = a1 > 0 and u* ≥ 0, we have x1*(t) > 0; from (2.60) we obtain

u*(t) ∈ arg max_{v∈[0,1]} (λ1(t) − T + t)v = { 1 if λ1(t) > T − t,
                                               ? if λ1(t) = T − t,
                                               0 if λ1(t) < T − t.   (2.66)

Hence (2.65) and (2.66) imply that, in (τ, T], we have u*(t) = 0. Now (2.61) gives, taking into account (2.64),

λ̇1 = −αλ2  ⟹  λ1(t) = (α/2)(t − T)²,  ∀t ∈ (τ, T].
An easy computation shows that the assumption in (2.65) holds for τ = T − 2/α. Hence let us suppose that there exists τ′ < T − 2/α such that

T − t = λ2(t) < λ1(t),  ∀t ∈ (τ′, T − 2/α).   (2.67)

By (2.66) we obtain, in (τ′, T − 2/α), that u*(t) = 1. Now (2.61) gives, taking into account the continuity of λ1 in the point T − 2/α,

λ̇1 = −αλ1  ⟹  λ1(t) = (2/α) e^{−α(t−T+2/α)},  ∀t ∈ (τ′, T − 2/α].

It is easy to verify that λ1 ∈ C^1((τ′, T)): since λ1(T) = λ2(T) and λ1(T − 2/α) = λ2(T − 2/α), the convexity of the functions λ1 and λ2 implies that assumption (2.67) holds with τ′ = 0. Using the dynamics and the initial conditions on the trajectory, we obtain

u*(t) = { 1 for 0 ≤ t ≤ T − 2/α,
          0 for T − 2/α < t ≤ T,

x1*(t) = { a1 e^{αt} for 0 ≤ t ≤ T − 2/α,
           a1 e^{αT−2} for T − 2/α < t ≤ T,

x2*(t) = { a2 for 0 ≤ t ≤ T − 2/α,
           a2 + α(t − T + 2/α) a1 e^{αT−2} for T − 2/α < t ≤ T,

λ1*(t) = { (2/α) e^{−α(t−T+2/α)} for 0 ≤ t ≤ T − 2/α,
           (α/2)(t − T)² for T − 2/α < t ≤ T,

λ2*(t) = T − t.
[Figure: the optimal control u*, the trajectories x1* and x2*, and the multipliers λ1* and λ2*; the switching point is T − 2/α.]
In order to guarantee some sufficient conditions, since H is not concave, we use Arrow's sufficient condition. Taking into account that x1*(t) ≥ a1 > 0, we construct the functions H^0 = H^0(t, x1, x2, λ1, λ2) and U = U(t, x1, x2, λ1, λ2) as follows:

H^0 = max_{v∈[0,1]} [x2 + λ1αvx1 + λ2α(1 − v)x1]
    = x2 + αx1λ2 + αx1 max_{v∈[0,1]} (λ1 − λ2)v
    = { x2 + 2x1 e^{−α(t−T+2/α)} for 0 ≤ t ≤ T − 2/α,
        x2 + α(T − t)x1 for T − 2/α < t ≤ T,

U = { 1 for 0 ≤ t ≤ T − 2/α,
      0 for T − 2/α < t ≤ T.
Note that, for every fixed t, the function (x1, x2) ↦ H^0(t, x1, x2, λ1*, λ2*) is concave and that the function U(t, x1*(t), x2*(t), λ1*(t), λ2*(t)) coincides with u*: the sufficient condition of Arrow holds.
2.5.4 A problem of inventory and production.
A firm has received an order for B > 0 units of product to be delivered by time T (fixed). We are looking for a plan of production that fills the order at the specified delivery date at minimum cost (see [9]). Let x = x(t) be the inventory accumulated by time t: since such inventory level, at any moment, is the cumulated past production, and taking into account that x(0) = 0, we have

x(t) = ∫_0^t p(s) ds,

where p = p(t) is the production at time t; hence the rate of change ẋ of the inventory is the production, and it is reasonable to set ẋ = p.

The unit production cost c rises linearly with the production level, i.e. c = αp, so the total cost of production is cp = αp² = αẋ²; the unit cost of holding inventory per unit time is a constant β. Hence the total cost, at time t, is αu² + βx, with α and β positive constants, and u = ẋ. Our strategy problem is

min_u ∫_0^T (αu² + βx) dt
ẋ = u
x(0) = 0
x(T) = B > 0
u ≥ 0

Let us consider the Hamiltonian H(t, x, u, λ) = αu² + βx + λu: we are not in the situation to guarantee that the extremal is normal, but we try! The necessary conditions are

u*(t) ∈ arg min_{v≥0} (αv² + βx* + λv) = arg min_{v≥0} (αv² + λv)   (2.68)
λ̇ = −∂H/∂x = −β  ⟹  λ(t) = −βt + a,   (2.69)

for some constant a. Hence (2.68) gives these situations:
[Figure: the parabola y = αv² + λv on [0, ∞) in the three cases λ > 0, λ = 0 and λ < 0; its minimum point on [0, ∞) is v = 0 if λ ≥ 0 and v = −λ/(2α) if λ < 0.]
This implies

u*(t) = { 0 if λ(t) ≥ 0,
          −λ(t)/(2α) if λ(t) < 0.

Taking into account (2.69), the sign of λ(t) = −βt + a yields three different situations, according to the position of τ = a/β with respect to the interval [0, T].

First, a ≥ βT implies u* = 0 in [0, T] and hence, using the initial condition, x* = 0 in [0, T]; this is in contradiction with x(T) = B > 0.

Second, 0 < a < βT implies λ(t) ≥ 0 for t ∈ [0, a/β] and λ(t) < 0 for t ∈ (a/β, T],
so that u* = 0 on [0, a/β] and u*(t) = (βt − a)/(2α) on (a/β, T]; integrating the dynamics with x(0) = 0 and imposing x(T) = B gives a = β(T − 2√(αB/β)), which is positive exactly when T > 2√(αB/β). In this case production starts only at time T − 2√(αB/β).

Third, a ≤ 0 implies λ(t) < 0 and u*(t) = (βt − a)/(2α) on [0, T]; imposing x(T) = B gives a = βT/2 − 2αB/T, which is non-positive exactly when T ≤ 2√(αB/β). Hence, if T ≤ 2√(αB/β), then

u*(t) = (β/(2α)) t + (4αB − βT²)/(4αT)  and  x*(t) = (β/(4α)) t² + (4αB − βT²)/(4αT) t.

[Figures: the optimal control u* and the optimal trajectory x* in the two cases T > 2√(αB/β) and T ≤ 2√(αB/β).]

In both cases, we have a normal extremal and a convex Hamiltonian: hence such extremals are optimal.
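For a concrete check of the case T ≤ 2√(αB/β), take α = β = 1, B = 1, T = 1 (arbitrary choices): the formulas above give u*(t) = t/2 + 3/4 and x*(t) = t²/4 + 3t/4. The following sketch verifies that this plan fills the order and costs less than the constant-production plan u ≡ B/T (the discretization is an arbitrary choice):

```python
# Inventory problem with alpha = beta = 1, B = 1, T = 1 (so production starts
# immediately): u*(t) = t/2 + 3/4, x*(t) = t^2/4 + 3t/4.
u_star = lambda t: t / 2.0 + 0.75
x_star = lambda t: t * t / 4.0 + 0.75 * t

def cost(u, n=100000):
    """Euler integration of x' = u and of the total cost int_0^1 (u^2 + x) dt."""
    dt = 1.0 / n
    x, J = 0.0, 0.0
    for i in range(n):
        t = i * dt
        J += (u(t) ** 2 + x) * dt
        x += u(t) * dt
    return x, J

xT, J_star = cost(u_star)
assert abs(xT - 1.0) < 1e-4              # the order x(T) = B is filled
assert abs(x_star(1.0) - 1.0) < 1e-12
x_const, J_const = cost(lambda t: 1.0)   # constant production also fills it...
assert abs(x_const - 1.0) < 1e-4
assert J_star < J_const                  # ...but at a higher cost
print(J_star, J_const)
```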
2.6 Singular and bang-bang controls
The Pontryagin Maximum Principle (2.1) gives us, when it is possible, the value of the control u* at each point τ ∈ [t0, t1]: more precisely, for every τ ∈ [t0, t1] we look for a unique point w = u*(τ) belonging to the control set U such that

H(τ, x*(τ), w, λ0*, λ*(τ)) ≥ H(τ, x*(τ), v, λ0*, λ*(τ)),  ∀v ∈ U.   (2.70)

In some circumstances it is possible that, using only the PMP, the value to assign to u* at a point τ ∈ [t0, t1] cannot be found: examples of this situation we have found in (2.39), (2.44) and (2.55). Now let us consider the set T of the points τ ∈ [t0, t1] such that the PMP gives no information about the value of the optimal control u* at the point τ, i.e. a point τ ∈ T if and only if there does not exist a unique w = w(τ) satisfying (2.70).

We say that an optimal control is singular if T contains some interval of [t0, t1].

In optimal control problems, it is sometimes the case that a control is restricted to lie between a lower and an upper bound (for example when the control set U is compact). If the optimal control switches from one
extreme to the other at certain times (i.e., is never strictly in between the bounds), then that control is referred to as a bang-bang solution, and each time where the switch occurs is called a switching point. For example:

• in example 2.5.2, we know that the control u* in (2.41) is optimal: the value of such control lies, at all times, on the boundary ∂U = {0, 2} of the control set U = [0, 2]; at time τ = 2 − log 3 such optimal control switches from 2 to 0. Hence 2 − log 3 is a switching point and u* is bang-bang;

• in example 2.5.3, the optimal control u* in (2.44) is bang-bang, since its value belongs, at all times, to the boundary ∂U = {0, 2} of the control set U = [0, 2]; the time ln 4 is a switching point;

• in case B of example 1.1.2, the optimal control u* in (2.59) is bang-bang, since its value belongs, at all times, to the boundary ∂U = {0, 1} of the control set U = [0, 1]; the time T − 1 is a switching point.
2.6.1 The building of a mountain road: a singular control
We have to solve the problem (1.4) presented in example 1.1.3 (see [9]). We note that there are no initial or final conditions on the trajectory, and hence we have to satisfy two transversality conditions for the multiplier. Denoting by α > 0 the maximal slope allowed for the road, the Hamiltonian is H = (x − y)² + λu:

(x* − y)² + λu* = min_{v∈[−α,α]} [(x* − y)² + λv]  ⟹  λu* = min_{v∈[−α,α]} λv   (2.71)
∂H/∂x = −λ̇  ⟹  λ̇ = −2(x* − y)  ⟹  λ(t) = b − 2∫_{t0}^t (x*(s) − y(s)) ds, b ∈ R   (2.72)
∂H/∂λ = ẋ  ⟹  ẋ* = u*   (2.73)
λ(t0) = λ(t1) = 0   (2.74)

We remark that (2.72) follows from the continuity of y and x*. The minimum principle (2.71) implies

u*(t) = { −α for λ(t) > 0,
          α for λ(t) < 0,
          ??? for λ(t) = 0.   (2.75)

Relations (2.72) and (2.74) give

λ(t) = −2∫_{t0}^t (x*(s) − y(s)) ds,  ∀t ∈ [t0, t1],   (2.76)
∫_{t0}^{t1} (x*(s) − y(s)) ds = 0.   (2.77)

Let us suppose that there exists an interval [c, d] ⊂ [t0, t1] such that λ = 0: clearly by (2.76) we have, for t ∈ [c, d],

0 = λ(t) = −2∫_{t0}^c (x*(s) − y(s)) ds − 2∫_c^t (x*(s) − y(s)) ds
         = λ(c) − 2∫_c^t (x*(s) − y(s)) ds,  ∀t ∈ [c, d],

and hence, since y and x* are continuous,

d/dt ∫_c^t (x*(s) − y(s)) ds = x*(t) − y(t) = 0.

Hence, if λ(t) = 0 in [c, d], then x*(t) = y(t) for all t ∈ [c, d] and, by (2.73), u*(t) = ẏ(t). We remark that in the set [c, d] the minimum principle has not been useful in order to determine the value of u*. If there exists such an interval [c, d] ⊂ [t0, t1] where λ is null, then the control is singular.

At this point, using (2.75), we are able to conclude that the trajectory x* associated to the extremal control u* is built of intervals where it coincides with the ground, i.e. x*(t) = y(t), and intervals where the slope of the road is maximal, i.e. ẋ*(t) ∈ {−α, α}. Moreover such an extremal satisfies (2.77). Finally, we remark that the Hamiltonian is convex with respect to x and u, for every fixed t: hence the extremal is really a minimum for the problem. Let us give three examples.

Example A: suppose that |ẏ(t)| ≤ α, ∀t ∈ [t0, t1]:
[Figure: the ground y and the road x* = y.]

We obtain x* = y and the control is singular.
Example B: suppose that the slope ẏ of the ground is not contained, for all t ∈ [t0, t1], in [−α, α]:
[Figure: the optimal road x* (solid line) over the ground y (dotted line), and the corresponding control u*.]

In the first picture on the left, the dotted line represents the ground y, the solid line represents the optimal road x*: we remark that, by (2.77), the area of the region between the two mentioned lines is equal to zero, if we take into account the sign of such areas. The control is singular.
Example 2.6.1. Suppose that the equation of the ground is y(t) = e^t for t ∈ [−1, 1], and that the slope of the road must satisfy |ẋ(t)| ≤ 1.

We have to solve

min_u ∫_{−1}^1 (x − e^t)² dt
ẋ = u
−1 ≤ u ≤ 1

We know, from the previous considerations and calculations, that for every t ∈ [−1, 1] one possibility is that x*(t) = y(t) = e^t and λ(t) = 0, with |ẋ*(t)| = |u*(t)| ≤ 1; the other possibility is that ẋ*(t) = u*(t) ∈ {−1, +1}.

We note that for t > 0 the first possibility cannot happen, because ẏ(t) = e^t > 1. Hence let us consider the function

x*(t) = { e^t for t ∈ [−1, σ],
          t − σ + e^σ for t ∈ (σ, 1],

with σ ∈ (−1, 0) such that (2.77) is satisfied:

[Figure: the ground y = e^t and the road x*, which leaves the ground at t = σ with slope 1.]

∫_{−1}^1 (x*(s) − y(s)) ds = ∫_σ^1 (s − σ + e^σ − e^s) ds
                           = 1/2 + 2e^σ + σ²/2 − e − σ − σe^σ = 0.   (2.78)
For the continuous function h : [−1, 0] → R, defined by

h(t) = 1/2 + 2e^t + t²/2 − e − t − te^t,

we have

h(−1) = (−e² + 2e + 3)/e = −(e − 3)(e + 1)/e > 0,  h(0) = 5/2 − e < 0;

hence there certainly exists a point σ ∈ (−1, 0) such that h(σ) = 0, and hence (2.78) holds. Moreover, since h′(t) = (e^t − 1)(1 − t) < 0 in (−1, 0), such a point is unique. Using (2.76), we are able to determine the multiplier:

λ(t) = { 0 if t ∈ [−1, σ],
         −2∫_σ^t (s − σ + e^σ − e^s) ds = −2[(t² − σ²)/2 + (e^σ − σ)(t − σ) + e^σ − e^t] if t ∈ (σ, 1].

Note that in the interval [−1, σ] the PMP in (2.71) becomes

0 = min_{v∈[−1,1]} 0

and gives us no information. Hence u* is singular.
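The switching point σ in (2.78) can be computed numerically: h has opposite signs at −1 and 0, so plain bisection applies, and the zero-mean condition (2.77) can then be checked by quadrature. A sketch (the iteration count, the grid and the tolerances are arbitrary choices):

```python
import math

# h(sigma) = 0 encodes the zero-area condition (2.78)
h = lambda t: 0.5 + 2.0 * math.exp(t) + 0.5 * t * t - math.e - t - t * math.exp(t)

a, b = -1.0, 0.0
assert h(a) > 0.0 > h(b)             # sign change: a root exists in (-1, 0)
for _ in range(80):                  # plain bisection
    m = 0.5 * (a + b)
    if h(m) > 0.0:
        a = m
    else:
        b = m
sigma = 0.5 * (a + b)
assert abs(h(sigma)) < 1e-10

# x*(t) = e^t on [-1, sigma], x*(t) = t - sigma + e^sigma on (sigma, 1]
def x_star(t):
    return math.exp(t) if t <= sigma else t - sigma + math.exp(sigma)

n = 200000                           # midpoint-rule check of (2.77)
total = 0.0
for i in range(n):
    s = -1.0 + 2.0 * (i + 0.5) / n
    total += (x_star(s) - math.exp(s)) * (2.0 / n)
assert abs(total) < 1e-4
print("sigma =", sigma)
```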
2.7 The multiplier as shadow price I: an exercise

Example 2.7.1. Consider, for every (τ, β) ∈ [0, 2] × [0, ∞) fixed, the problem

min ∫_τ^2 (u² + x²) dt
ẋ = x + u
x(τ) = β
u ≥ 0

a. For every fixed (τ, β), find the optimal triple (x*, u*, λ*). Let us denote by (x*_{τ,β}, u*_{τ,β}, λ*_{τ,β}) such a triple.

b. Calculate

min_{{u: ẋ=x+u, x(τ)=β, u≥0}} ∫_τ^2 (u² + x²) dt = ∫_τ^2 ((u*_{τ,β})² + (x*_{τ,β})²) dt

and denote by V(τ, β) such a value.

c. For a given (τ, β), consider a point (t′, x′) ∈ [τ, 2] × [0, ∞) on the optimal trajectory x*_{τ,β}, i.e. x*_{τ,β}(t′) = x′.
[Figure: the optimal trajectory x*_{τ,β} through the point (t′, x′).]

Consider the function V(τ, β) : [0, 2] × [0, ∞) → R defined in b. and compute

(∂V/∂β)(t′, x′).

What do you find?
Solution a. Let us consider the Hamiltonian H = u² + x² + λ(x + u); the theorem of Pontryagin gives

H(t, x*, u*, λ) = min_{v∈[0,∞)} H(t, x*, v, λ)  ⟹  u* ∈ arg min_{v∈[0,∞)} (v² + λv)   (2.79)
∂H/∂x = −λ̇  ⟹  λ̇ = −2x* − λ   (2.80)
λ(2) = 0   (2.81)

For every fixed t, the function v ↦ v² + λv that we have to minimize represents a parabola:

[Figure: the parabola y = v² + λ*v in the three cases λ*(t) < 0, λ*(t) = 0 and λ*(t) > 0; its vertex is at v = −λ*/2.]

Since in (2.79) we minimize over v ∈ [0, ∞), we obtain

u*(t) = { 0 for λ(t) ≥ 0,
          −λ(t)/2 for λ(t) < 0.   (2.82)

Let us suppose that

λ(t) ≥ 0,  ∀t ∈ [τ, 2].   (2.83)

Then (2.82) implies that u* = 0 in [τ, 2]: from the dynamics we obtain

ẋ = x + u  ⟹  ẋ = x  ⟹  x(t) = ae^t, a ∈ R.
The initial condition on the trajectory gives x*(t) = βe^{t−τ}. The adjoint equation (2.80) gives

λ̇ = −2βe^{t−τ} − λ  ⟹  λ(t) = be^{−t} − βe^{t−τ}, b ∈ R.

By the condition (2.81) we obtain

λ(t) = β(e^{4−t−τ} − e^{t−τ}).   (2.84)

Recalling that β ≥ 0, an easy computation shows that λ(t) ≥ 0 for every t ≤ 2: this is coherent with the assumption (2.83). Hence the triple

(u*_{τ,β}, x*_{τ,β}, λ*_{τ,β}) = (0, βe^{t−τ}, β(e^{4−t−τ} − e^{t−τ}))   (2.85)

satisfies the necessary conditions of Pontryagin. In order to guarantee a sufficient condition, note that the Hamiltonian is clearly convex in (x, u): hence u*_{τ,β} is optimal.
Solution b. Clearly, by (2.85),

V(τ, β) = min_{{u: ẋ=x+u, x(τ)=β, u≥0}} ∫_τ^2 (u² + x²) dt
        = ∫_τ^2 ((u*_{τ,β})² + (x*_{τ,β})²) dt
        = ∫_τ^2 (0² + β²e^{2t−2τ}) dt
        = (β²/2)(e^{4−2τ} − 1).   (2.86)

Hence this is the optimal value of the problem, if we work with a trajectory that starts at time τ from the point β.
Solution c. Since

V(τ, β) = (β²/2)(e^{4−2τ} − 1),

we have

(∂V/∂β)(τ, β) = β(e^{4−2τ} − 1).

In particular, if we consider a point (t′, x′) ∈ [τ, 2] × [0, ∞) on the optimal trajectory x*_{τ,β}, i.e., using (2.85), a point (t′, x′) such that

x′ = x*_{τ,β}(t′) = βe^{t′−τ},

we obtain

(∂V/∂β)(t′, x′) = βe^{t′−τ}(e^{4−2t′} − 1) = β(e^{4−t′−τ} − e^{t′−τ}).
Hence we have found that

(∂V/∂β)(t′, x*_{τ,β}(t′)) = λ*_{τ,β}(t′),

i.e.

Remark 2.7. The multiplier λ*, at time t, expresses the sensitivity, the shadow price, of the optimal value of the problem when we modify the initial datum β along the optimal trajectory.

We will see in Theorem 5.4 that this is a fundamental property that holds in the general context and links the multiplier λ* of the variational approach to the value function V of dynamic programming. Two further comments on the previous exercise:

1. The function V = V(τ, β) : [0, 2] × [0, ∞) → R is called the value function and is the fundamental object of dynamic programming. One of its properties is that V(2, β) = 0, ∀β.

2. Consider the points (τ, β) and (τ′, β′) in [0, 2] × [0, ∞): we know that the optimal trajectories are

x*_{τ,β}(t) = βe^{t−τ},  x*_{τ′,β′}(t) = β′e^{t−τ′}.

Now consider (τ′, β′) on the optimal trajectory x*_{τ,β}, i.e. the point (τ′, β′) such that

β′ = x*_{τ,β}(τ′) = βe^{τ′−τ}.

The previous expressions give us that, with this particular choice of (τ′, β′),

x*_{τ′,β′}(t) = β′e^{t−τ′} = βe^{τ′−τ}e^{t−τ′} = βe^{t−τ} = x*_{τ,β}(t).

Hence the optimal trajectory associated to the initial datum (τ′, β′) (with (τ′, β′) belonging to the optimal trajectory associated to the initial datum (τ, β)) coincides with the optimal trajectory associated to the initial datum (τ, β). We will see that this is a fundamental property that holds in general: "the second part of an optimal trajectory is again optimal" is the Principle of Bellman of dynamic programming (see Theorem 5.1).
Chapter 3
General problems of OC
In this chapter we will see more general problems than (1.12). In the literature there are many books devoted to this study (see for example [16], [10], [14], [2], [15]): however, the fundamental tool is the theorem of Pontryagin, which gives a necessary and useful condition of optimality.
3.1 Problems of Bolza, of Mayer and of Lagrange
Starting from the problem (1.12), let us consider t0 fixed and T fixed or free, with T > t0. The problem

max_{u∈C} J(u),  J(u) = ψ(T, x(T))
ẋ = g(t, x, u)
x(t0) = α
(3.1)

with ψ : [t0, ∞) × R^n → R, is called an OC problem of Mayer. The problem

max_{u∈C} J(u),  J(u) = ∫_{t0}^T f(t, x, u) dt + ψ(T, x(T))
ẋ = g(t, x, u)
x(t0) = α
(3.2)

is called an OC problem of Bolza. The problem

max_{u∈C} J(u),  J(u) = ∫_{t0}^T f(t, x, u) dt
ẋ = g(t, x, u)
x(t0) = α
(3.3)

is called an OC problem of Lagrange. The function ψ is usually called the pay off function. We have the following result:
Remark 3.1. The problems (3.1), (3.2) e (3.3) are equivalent.
Clearly the problems (3.1) and (3.3) are particular cases of (3.2). First, let us show how (3.2) can become a problem of Lagrange: we introduce a new variable x_{n+1} defined by x_{n+1}(t) = ψ(t, x(t)). Problem (3.2) becomes
\[
\begin{cases}
J(u)=\displaystyle\int_{t_0}^{T}\bigl(f(t,x,u)+\dot x_{n+1}\bigr)\,\mathrm dt\\
(\dot x,\dot x_{n+1})=\Bigl(g(t,x,u),\dfrac{\mathrm d\psi(t,x(t))}{\mathrm dt}\Bigr)\\
(x(t_0),x_{n+1}(t_0))=(\alpha,\psi(t_0,\alpha))\\
\max_{u\in\mathcal C} J(u)
\end{cases}
\]
that is clearly of Lagrange. Secondly, let us show how (3.3) can become a problem of Mayer: we introduce the new variable x_{n+1} defined by ẋ_{n+1}(t) = f(t, x, u) with the condition x_{n+1}(t₀) = 0. Problem (3.3) becomes
\[
\begin{cases}
J(u)=x_{n+1}(T)\\
(\dot x,\dot x_{n+1})=\bigl(g(t,x,u),f(t,x,u)\bigr)\\
(x(t_0),x_{n+1}(t_0))=(\alpha,0)\\
\max_{u\in\mathcal C} J(u)
\end{cases}
\]
that is of Mayer. Finally, we show how the problem (3.1) becomes a problem of Lagrange: let us introduce the variable x_{n+1} by ẋ_{n+1}(t) = 0 with the condition x_{n+1}(T) = ψ(T, x(T))/(T − t₀). Problem (3.1) becomes
\[
\begin{cases}
J(u)=\displaystyle\int_{t_0}^{T} x_{n+1}\,\mathrm dt\\
(\dot x,\dot x_{n+1})=\bigl(g(t,x,u),0\bigr)\\
x(t_0)=\alpha\\
x_{n+1}(T)=\dfrac{\psi(T,x(T))}{T-t_0}\\
\max_{u\in\mathcal C} J(u)
\end{cases}
\]
that is of Lagrange.
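The Lagrange-to-Mayer reduction can be illustrated numerically: augmenting the state with ẋ_{n+1} = f makes the Mayer pay-off x_{n+1}(T) equal the Lagrange integral. A minimal sketch with forward-Euler integration (the dynamics, running cost and control below are illustrative choices, not taken from the text):

```python
def simulate(t0, T, x0, u, g, f, n=100000):
    """Forward-Euler integration of the augmented system (x', x_{n+1}') = (g, f)."""
    dt = (T - t0) / n
    t, x, xn1 = t0, x0, 0.0                  # x_{n+1}(t0) = 0, as in the reduction
    for _ in range(n):
        dx, dxn1 = g(t, x, u(t)) * dt, f(t, x, u(t)) * dt
        x, xn1, t = x + dx, xn1 + dxn1, t + dt
    return x, xn1

# illustrative data: x' = u with u = 1, so x(t) = t, and running cost
# f = x^2, so the Lagrange value is the integral of t^2 on [0,1], i.e. 1/3
g = lambda t, x, u: u
f = lambda t, x, u: x * x
x_T, J_mayer = simulate(0.0, 1.0, 0.0, lambda t: 1.0, g, f)

assert abs(x_T - 1.0) < 1e-9             # x(1) = 1
assert abs(J_mayer - 1.0 / 3.0) < 1e-4   # Mayer pay-off x_{n+1}(1) equals the integral
```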
3.2 Problems with fixed final time
Let us consider f : [t₀, t₁] × ℝⁿ⁺ᵏ → ℝ, ψ : ℝⁿ → ℝ and let α ∈ ℝⁿ be fixed. Let x = (x₁, x₂, …, xₙ) and let n₁, n₂ and n₃ be non negative, fixed integers such that n₁ + n₂ + n₃ = n. Let us consider the problem
\[
\begin{cases}
\max_{u\in\mathcal C}\displaystyle\int_{t_0}^{t_1} f(t,x,u)\,\mathrm dt+\psi(x(t_1))\\
\dot x=g(t,x,u)\\
x(t_0)=\alpha\\
x_i(t_1)\ \text{free}, & 1\le i\le n_1,\\
x_j(t_1)\ge\beta_j\ \text{with }\beta_j\ \text{fixed}, & n_1+1\le j\le n_1+n_2,\\
x_l(t_1)=\beta_l\ \text{with }\beta_l\ \text{fixed}, & n_1+n_2+1\le l\le n_1+n_2+n_3,\\
\mathcal C=\{u:[t_0,t_1]\to U\subseteq\mathbb R^k,\ u\ \text{admissible}\}
\end{cases}
\tag{3.4}
\]
where t₀ and t₁ are fixed. Since xᵢ(t₁) is fixed for n − n₃ < i ≤ n, the pay-off function ψ depends only on the xᵢ(t₁) with 1 ≤ i ≤ n − n₃.
We have the following necessary condition, a generalization of Theorem 2.1 of Pontryagin (see [12]):
Theorem 3.1. Let us consider the problem (3.4) with f ∈ C¹([t₀, t₁] × ℝⁿ⁺ᵏ), g ∈ C¹([t₀, t₁] × ℝⁿ⁺ᵏ) and ψ ∈ C¹(ℝⁿ). Let u* be an optimal control and x* the associated trajectory. Then there exists a multiplier (λ₀*, λ*), with

λ₀* a constant,
λ* : [t₀, t₁] → ℝⁿ continuous,

such that (λ₀*, λ*) ≠ (0, 0) and

i) the Pontryagin Maximum Principle (2.1) holds;

ii) the adjoint equation (2.2) holds;

iii_{t₁}) the transversality condition is given by

for 1 ≤ i ≤ n₁, we have λᵢ*(t₁) = ∂ψ/∂xᵢ(x*(t₁));

for n₁ + 1 ≤ j ≤ n₁ + n₂, we have λⱼ*(t₁) ≥ ∂ψ/∂xⱼ(x*(t₁)), xⱼ*(t₁) ≥ βⱼ and
\[
\Bigl(\lambda_j^*(t_1)-\frac{\partial\psi}{\partial x_j}(x^*(t_1))\Bigr)\bigl(x_j^*(t_1)-\beta_j\bigr)=0;
\]

iv) λ₀* ≥ 0.
3.3 Problems with free final time
Let us consider f : [t₀, ∞) × ℝⁿ⁺ᵏ → ℝ, g : [t₀, ∞) × ℝⁿ⁺ᵏ → ℝⁿ and ψ : [t₀, ∞) × ℝⁿ → ℝ, and let α ∈ ℝⁿ be fixed. Let x = (x₁, x₂, …, xₙ) and let n₁, n₂ and n₃ be non negative, fixed integers such that n₁ + n₂ + n₃ = n. We consider the problem
\[
\begin{cases}
\max_{u\in\mathcal C}\displaystyle\int_{t_0}^{T} f(t,x,u)\,\mathrm dt+\psi(T,x(T))\\
\dot x=g(t,x,u)\\
x(t_0)=\alpha\\
x_i(T)\ \text{free}, & 1\le i\le n_1,\\
x_j(T)\ge\beta_j\ \text{with }\beta_j\ \text{fixed}, & n_1+1\le j\le n_1+n_2,\\
x_l(T)=\beta_l\ \text{with }\beta_l\ \text{fixed}, & n_1+n_2+1\le l\le n_1+n_2+n_3,\\
\mathcal C=\{u:[t_0,\infty)\to U\subseteq\mathbb R^k,\ u\ \text{admissible}\}
\end{cases}
\tag{3.5}
\]
where t₀ is fixed and T is free with T > t₀. We say that u* is optimal with exit time T* if
\[
\int_{t_0}^{T^*} f(t,x^*,u^*)\,\mathrm dt+\psi\bigl(T^*,x^*(T^*)\bigr)\ \ge\ \int_{t_0}^{T} f(t,x,u)\,\mathrm dt+\psi\bigl(T,x(T)\bigr)
\]
for every admissible control u and for every T ≥ t₀. Hence an optimal control has a (unique) associated exit time. We have the following result, again a generalization of Theorem 2.1 of Pontryagin (see [12]):
Theorem 3.2. Let us consider the problem (3.5) with f ∈ C¹([t₀, ∞) × ℝⁿ⁺ᵏ), g ∈ C¹([t₀, ∞) × ℝⁿ⁺ᵏ) and ψ ∈ C¹([t₀, ∞) × ℝⁿ). Let u* be an optimal control with exit time T* and x* the associated trajectory. Then there exists a multiplier (λ₀*, λ*), with

λ₀* a constant,
λ* : [t₀, T*] → ℝⁿ continuous,

such that (λ₀*, λ*) ≠ (0, 0) and

i) the Pontryagin Maximum Principle (2.1) holds;

ii) the adjoint equation (2.2) holds;

iii_{T*}) the transversality condition is given by

for 1 ≤ i ≤ n₁, we have λᵢ*(T*) = ∂ψ/∂xᵢ(T*, x*(T*));

for n₁ + 1 ≤ j ≤ n₁ + n₂, we have λⱼ*(T*) ≥ ∂ψ/∂xⱼ(T*, x*(T*)), xⱼ*(T*) ≥ βⱼ and
\[
\Bigl(\lambda_j^*(T^*)-\frac{\partial\psi}{\partial x_j}(T^*,x^*(T^*))\Bigr)\bigl(x_j^*(T^*)-\beta_j\bigr)=0;
\]
moreover we have
\[
H\bigl(T^*,x^*(T^*),u^*(T^*),\lambda_0^*,\lambda^*(T^*)\bigr)+\frac{\partial\psi}{\partial t}\bigl(T^*,x^*(T^*)\bigr)=0; \tag{3.6}
\]

iv) λ₀* ≥ 0.
3.4 Time optimal problem

A particular case of a free final time problem (3.5) is the following:
\[
\begin{cases}
\min_{u\in\mathcal C} T\\
\dot x=g(t,x,u)\\
x(0)=\alpha\\
x(T)=\beta
\end{cases}
\tag{3.7}
\]
where α and β are fixed points in ℝⁿ, and T is non negative and free. Hence (3.7) is the problem of transferring α to β in the shortest possible time: such a problem is called a time optimal problem, and its solution has an optimal time T*. Since T = ∫₀ᵀ 1 dt, we define the Hamiltonian as
\[
H(t,x,u,\lambda_0,\lambda)=\lambda_0+\lambda\cdot g(t,x,u) \tag{3.8}
\]
and we have the following result (see for example [16], page 614):

Theorem 3.3. Let us consider the problem (3.7) with g ∈ C¹([t₀, t₁] × ℝⁿ⁺ᵏ). Let u* be an optimal control with exit time T* and x* the associated trajectory. Then there exists a multiplier (λ₀*, λ*), with

λ₀* a constant,
λ* : [t₀, t₁] → ℝⁿ continuous,

such that (λ₀*, λ*) ≠ (0, 0) and

i) the Pontryagin Minimum Principle holds, i.e. in [t₀, t₁]
\[
u^*(t)\in\arg\min_{v\in U} H\bigl(t,x^*(t),v,\lambda_0^*,\lambda^*(t)\bigr);
\]

ii) the adjoint equation (2.2) holds;

iii_{T*}) the transversality condition is given by
\[
H\bigl(T^*,x^*(T^*),u^*(T^*),\lambda_0^*,\lambda^*(T^*)\bigr)\le 0; \tag{3.9}
\]

iv) λ₀* = 1.
As usual, an easy sufficient condition of optimality involves the convexity (note that this is a minimum problem) of the Hamiltonian.

Example 3.4.1. Let us consider
\[
\begin{cases}
\min T\\
\dot x=x+u\\
x(0)=5\\
x(T)=11\\
|u|\le 1
\end{cases}
\]
where T is free. The Hamiltonian is H = 1 + λ(x + u). The necessary conditions give
\[
\text{(PMP)}\qquad u^*(t)\in\arg\min_{|v|\le 1}\,[1+\lambda(x+v)]\ \Rightarrow\ u^*(t)\in\arg\min_{|v|\le 1}\lambda v, \tag{3.10}
\]
\[
\dot\lambda=-\frac{\partial H}{\partial x}=-\lambda\ \Rightarrow\ \lambda(t)=Ae^{-t}, \tag{3.11}
\]
\[
(3.9)\ \Rightarrow\ 1+\lambda(T^*)\bigl(x^*(T^*)+u^*(T^*)\bigr)=1+\lambda(T^*)\bigl(11+u^*(T^*)\bigr)\le 0, \tag{3.12}
\]
for some constant A. It is easy to see that |u| ≤ 1 and (3.12) imply λ(T*) < 0, i.e. A < 0.
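From λ(t) = Ae⁻ᵗ with A < 0 we get λ < 0 everywhere, so (3.10) gives u* ≡ 1; then ẋ = x + 1 with x(0) = 5 yields x(t) = 6eᵗ − 1, and x(T) = 11 gives T* = ln 2. A minimal numerical sketch of this value (the Euler step size below is an arbitrary choice):

```python
import math

def hit_time(target, x0, dt=1e-6):
    """Forward-Euler integration of x' = x + 1 (the control u* = +1) up to x = target."""
    t, x = 0.0, x0
    while x < target:
        x += (x + 1.0) * dt
        t += dt
    return t

T_star = hit_time(11.0, 5.0)
assert abs(T_star - math.log(2.0)) < 1e-3   # closed form: x(t) = 6 e^t - 1, so T* = ln 2
```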
Since the problem is autonomous, Remark 3.3 guarantees that H is constant; moreover, (3.17) implies that H = h with h ∈ ℝ, h ≤ 0. Now, easy computations by (3.15) and (3.16) give
\[
\lambda_2=a,\qquad \lambda_1=-at+b, \tag{3.18}
\]
where a and b are constants. From the (PMP) in (3.14) we have
\[
u^*(t)=\begin{cases}
-1 & \text{if }\lambda_1(t)>0,\\
\ \ 1 & \text{if }\lambda_1(t)<0,\\
\ \ ? & \text{if }\lambda_1(t)=0.
\end{cases}
\]
Let us suppose that λ₁ vanishes on an interval: then we obtain a = b = 0. Since H = h we have, by (3.18),
\[
0\ge h=H=1+\lambda_1(t)u^*(t)+\lambda_2(t)x_1^*(t)=1:
\]
hence a = b = 0 is impossible. Therefore λ₁ is a straight line and there exists at most one point τ ∈ [0, T*] such that λ₁(τ) = 0 and u* has a discontinuity (a jump) there.
Now we study two cases:
case A: Let us suppose λ₁(t) < 0 in an interval (t′, t″) ⊆ (0, T*). We have, for every t ∈ (t′, t″), u*(t) = 1 and
\[
\dot x_1=u\ \Rightarrow\ x_1(t)=t+c,\quad\text{with } c\in\mathbb R, \tag{3.19}
\]
\[
\dot x_2=x_1\ \Rightarrow\ x_2(t)=t^2/2+ct+d,\quad\text{with } d\in\mathbb R. \tag{3.20}
\]
We obtain
\[
x_2=\frac12 x_1^2+d-\frac{c^2}{2}. \tag{3.21}
\]
For the moment it is not easy to find the constants c and d; however, it is clear that (3.21) represents a family of parabolas in the (x₁, x₂)-plane. Moreover, the dynamics ẋ₂ = x₁ gives that if x₁ > 0 then x₂ increases, and if x₁ < 0 then x₂ decreases: hence each such curve carries a direction as time passes.

[Figure: the parabolas (3.21), with their direction, in the (x₁, x₂)-plane.]
case B: Let λ₁(t) > 0 in (t′, t″) ⊆ (0, T*): hence u*(t) = −1 and, as before,
\[
\dot x_1=u\ \Rightarrow\ x_1(t)=-t+e,\quad\text{with } e\in\mathbb R, \tag{3.22}
\]
\[
\dot x_2=x_1\ \Rightarrow\ x_2(t)=-t^2/2+et+f,\quad\text{with } f\in\mathbb R, \tag{3.23}
\]
which imply
\[
x_2=-\frac12 x_1^2+f+\frac{e^2}{2}. \tag{3.24}
\]
Again we have a family of parabolas, each with a precise direction.

[Figure: the parabolas (3.24), with their direction, in the (x₁, x₂)-plane.]
case A+B: Now let us put together the two families of parabolas (3.21) and (3.24). In order to start at time t = 0 from the point (x₁(0), x₂(0)) = (1, 2) and to arrive at the final and unknown time T at the point (x₁(T), x₂(T)) = (0, 0), we can follow some arcs of such parabolas (with the right direction). It is clear that there exist infinitely many such paths: for example (see the figure), starting from A we can move along a parabola until we arrive in B, then follow the dashed line, arrive at the point C, and finally reach the origin along a new parabola.

[Figure: several admissible paths from A to the origin O through the points B, C and D.]

We remark that every time we pass from one curve to another, the optimal control has a discontinuity point, i.e. u* passes from +1 to −1 or vice versa. Since we know that the optimal control has at most one discontinuity point, and since the Hamiltonian is convex in (x₁, x₂, u) for every fixed time, the optimal path is the red line in the figure.

[Figure: the optimal path from A through D to the origin O.]
In the next example we solve a particular case of this general situation:
Example 3.4.2. Let us consider
\[
\begin{cases}
\min_u T\\
\dot x_1=u\\
\dot x_2=x_1\\
x_1(0)=2,\ x_2(0)=1\\
x_1(T)=x_2(T)=0\\
|u|\le 1
\end{cases}
\]
Since A = (x₁(0), x₂(0)) = (2, 1), by (3.22) and (3.23) we obtain e = 2 and f = 1. Now (3.24) gives the curve γ₁ with equation
\[
x_2=-x_1^2/2+3.
\]
The point D is the intersection of the curve γ₂, with equation
\[
x_2=x_1^2/2,
\]
and the curve γ₁: we obtain D = (−√3, 3/2). We note that, starting from A at time t = 0, we arrive in D at time τ_D: such τ_D is, by (3.22), τ_D = 2 + √3.

We restart from D = (x₁(τ_D), x₂(τ_D)) and arrive, on γ₂, at O. By (3.19) we have x₁(τ_D) = τ_D + c = −√3, and hence c = −2(1 + √3). By (3.21) and the equation of γ₂, we have d = c²/2 = 4(2 + √3). We arrive in the origin at the time T* such that, using (3.19), x₁(T*) = T* − 2(1 + √3) = 0; hence
\[
T^*=2(1+\sqrt3).
\]
The optimal control and the optimal trajectory are
\[
u^*(t)=\begin{cases}-1 & \text{for } t\in[0,\,2+\sqrt3],\\ \ \ 1 & \text{for } t\in(2+\sqrt3,\,2(1+\sqrt3)],\end{cases}
\]
\[
x_1^*(t)=\begin{cases}-t+2 & \text{for } t\in[0,\,2+\sqrt3],\\ t-2(1+\sqrt3) & \text{for } t\in(2+\sqrt3,\,2(1+\sqrt3)],\end{cases}
\]
\[
x_2^*(t)=\begin{cases}-t^2/2+2t+1 & \text{for } t\in[0,\,2+\sqrt3],\\ t^2/2-2(1+\sqrt3)t+4(2+\sqrt3) & \text{for } t\in(2+\sqrt3,\,2(1+\sqrt3)].\end{cases}
\]
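The pair (u*, x*) above can be checked by integrating the dynamics ẋ₁ = u, ẋ₂ = x₁ numerically with the bang-bang control switching at τ_D = 2 + √3 (the step count below is an arbitrary choice):

```python
import math

sqrt3 = math.sqrt(3.0)
tau_D = 2.0 + sqrt3              # switching time
T_star = 2.0 * (1.0 + sqrt3)     # claimed optimal exit time

n = 400000
dt = T_star / n
t, x1, x2 = 0.0, 2.0, 1.0        # start at A = (2, 1)
for _ in range(n):
    u = -1.0 if t <= tau_D else 1.0
    x1, x2, t = x1 + u * dt, x2 + x1 * dt, t + dt

# the bang-bang control steers the state (almost exactly) to the origin at T*
assert abs(x1) < 1e-3 and abs(x2) < 1e-3
```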
[Figure: the graphs of u*, x₁* and x₂* as functions of time, with the switch at τ_D = 2 + √3 and the exit time T* = 2(1 + √3).]
3.5 The Bolza problem in Calculus of Variations.
Let us consider the problem
\[
\begin{cases}
J(x)=\displaystyle\int_{t_0}^{T} f(t,x(t),\dot x(t))\,\mathrm dt+\psi(T,x(T))\\
x(t_0)=\alpha,\quad x(T)=\beta_T\\
\max_{x\in A_B} J(x)\\
A_B=\{x:[t_0,\infty)\to\mathbb R^n,\ x\in C^1,\ x(t_0)=\alpha\}
\end{cases}
\tag{3.25}
\]
where t₀ and α are fixed, and T > t₀ and β_T ∈ ℝⁿ are free. We call this problem the Bolza problem of the calculus of variations. Clearly (3.25) is a particular case of (3.5), but let us provide the necessary condition for this particular situation: hence, let us apply Theorem 3.2.
Since u = ẋ, we have H = f(t, x, u) + λ · u: hence, as in (2.28) and (2.29), we have
\[
\text{(PMP)}\qquad \nabla_u f+\lambda=\mathbf 0,
\]
\[
\text{(adjoint equation)}\qquad \nabla_x f=-\dot\lambda,
\]
\[
\text{(transv. cond. iii}_{T^*}\text{)}\qquad \nabla_{\dot x}f+\nabla_x\psi=\mathbf 0\ \text{ and }\ f-\dot x\cdot\nabla_{\dot x}f+\frac{\partial\psi}{\partial t}=0\quad\text{in } t=T^*.
\]
More precisely we have
Theorem 3.4. Let us consider (3.25) with f ∈ C²([t₀, t₁] × ℝ²ⁿ) and ψ ∈ C¹([t₀, t₁] × ℝⁿ). Let x* be an optimal solution with exit time T*. Then x* is extremal (i.e. it satisfies EU). Moreover,

i) if T is fixed and β_T is free, then
\[
\nabla_{\dot x}f\bigl(T,x^*(T),\dot x^*(T)\bigr)+\nabla_x\psi\bigl(T,x^*(T)\bigr)=\mathbf 0; \tag{3.26}
\]

ii) if T is free and β_T is fixed, then
\[
f\bigl(T^*,x^*(T^*),\dot x^*(T^*)\bigr)-\dot x^*(T^*)\cdot\nabla_{\dot x}f\bigl(T^*,x^*(T^*),\dot x^*(T^*)\bigr)+\frac{\partial\psi}{\partial t}\bigl(T^*,x^*(T^*)\bigr)=0; \tag{3.27}
\]

iii) if T and β_T are both free, then we have both (3.26) and (3.27).
Example 3.5.1. Let us consider
\[
\begin{cases}
\min\displaystyle\int_0^1(\dot x^2-x)\,\mathrm dt+x^2(1)\\
x(0)=0.
\end{cases}
\]
The solution of EU is x(t) = −t²/4 + at + b, with a, b ∈ ℝ. The initial condition implies b = 0. Since T = 1 is fixed and the final value is free, from (3.26) we have
\[
2\dot x(1)+2x(1)=0\quad\Rightarrow\quad a=3/8.
\]
Hence the extremal is x*(t) = −t²/4 + 3t/8. Now, let us show that it is a minimum. Let h ∈ C¹([0, 1]) be such that h(0) = 0 and let x = x* + h. Then
\[
\int_0^1(\dot x^2-x)\,\mathrm dt+x^2(1)=\int_0^1\bigl((\dot x^*)^2+2\dot h\dot x^*+\dot h^2-x^*-h\bigr)\,\mathrm dt+(x^*(1))^2+2x^*(1)h(1)+h^2(1)
\]
\[
\ge\int_0^1\bigl((\dot x^*)^2-x^*\bigr)\,\mathrm dt+(x^*(1))^2+\int_0^1(2\dot h\dot x^*-h)\,\mathrm dt+2x^*(1)h(1).
\]
Since h(0) = 0, ẍ*(t) = −1/2, ẋ*(1) = −1/8 and x*(1) = 1/8, we have
\[
\int_0^1(2\dot h\dot x^*-h)\,\mathrm dt+2x^*(1)h(1)=\Bigl[2h\dot x^*\Bigr]_0^1-\int_0^1(2h\ddot x^*+h)\,\mathrm dt+\frac{h(1)}{4}
=-\frac{h(1)}{4}-\int_0^1(-h+h)\,\mathrm dt+\frac{h(1)}{4}=0.
\]
The previous inequality implies that x* is a minimum.
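A numerical spot-check of the minimality argument: the value of the functional at x* should not exceed its value at competitors x* + h with h(0) = 0 (the quadrature rule and the particular perturbations h below are illustrative choices):

```python
import math

def J(x, dx, n=20000):
    """J(x) = integral on [0,1] of (x'^2 - x) dt, plus x(1)^2; midpoint quadrature."""
    dt = 1.0 / n
    s = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        s += (dx(t) ** 2 - x(t)) * dt
    return s + x(1.0) ** 2

x_star = lambda t: -t * t / 4.0 + 3.0 * t / 8.0    # the extremal
dx_star = lambda t: -t / 2.0 + 3.0 / 8.0
J_star = J(x_star, dx_star)

# competitors x* + h with h(0) = 0
perturbations = [(lambda t: t, lambda t: 1.0),
                 (lambda t: math.sin(3.0 * t), lambda t: 3.0 * math.cos(3.0 * t))]
for h, dh in perturbations:
    J_pert = J(lambda t: x_star(t) + h(t), lambda t: dx_star(t) + dh(t))
    assert J_star <= J_pert + 1e-9
```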
3.5.1 Labor adjustment model of Hamermesh.
Consider a firm that has decided to raise its labor input from L₀ to a yet undetermined level L_T after encountering a wage reduction at the initial time t₀ = 0. The adjustment of labor input is assumed to entail a cost C that varies, at every time, with L′(t), the rate of change of L. Thus the firm has to decide on the best speed of adjustment toward L_T as well as on the magnitude of L_T itself. This is the essence of the labor adjustment problem discussed in a paper by Hamermesh.

We assume that the profit of the firm is expressed by a general function π(L), with π″(L) < 0. The labor input is taken to be the unique determinant of profit because we have subsumed all aspects of production and demand in the profit function.

[Figure: the concave profit function π(L).]

The cost of adjusting L is assumed to be
\[
C(L')=bL'^2+c,
\]
with b and c positive constants. Thus the net profit at any time is π(L) − C(L′). The problem of the firm is to maximize the total net profit over time during the process of changing the labor input. Inasmuch as it must choose not only the optimal L_T, but also an optimal time T for completing the adjustment, we have both the terminal state and the terminal time free. Another feature to note about the problem is that we should include not only the net profit from t = 0 to t = T, but also the capitalized value of the profit in the post-T period, which is affected by the choice of L_T and T too.

[Figure: the adjustment path of L, from L₀ at t = 0 to L_T at t = T.]

Since the profit rate at every time t ≥ T is π(L_T), its present value is π(L_T)e^{−ρt}, where ρ > 0 is the given discount rate. So the capitalized value of the post-T profit stream is
\[
\int_T^{\infty}\pi(L_T)e^{-\rho t}\,\mathrm dt=\pi(L_T)\Bigl[-\frac{e^{-\rho t}}{\rho}\Bigr]_T^{\infty}=\frac{\pi(L_T)}{\rho}\,e^{-\rho T}.
\]
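The capitalized-value formula can be confirmed numerically by truncating the improper integral at a large horizon (the profit level, discount rate and time T below are illustrative numbers, not part of the model):

```python
import math

pi_LT, rho, T = 100.0, 0.05, 10.0   # illustrative profit level, discount rate, time T

# truncate the integral of pi_LT * e^(-rho t) over [T, infinity) at a horizon M
# chosen large enough that the tail is negligible; midpoint quadrature
M, n = 400.0, 200000
dt = (M - T) / n
integral = sum(pi_LT * math.exp(-rho * (T + (i + 0.5) * dt)) * dt for i in range(n))

closed_form = pi_LT * math.exp(-rho * T) / rho   # = pi(L_T) e^(-rho T) / rho
assert abs(integral - closed_form) < 1e-3
```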
Hence the problem is
maxL