Optimal Control, TSRT08: Exercises
This version: September 2012

AUTOMATIC CONTROL
REGLERTEKNIK
LINKÖPINGS UNIVERSITET


Preface

    Acknowledgment

    The exercises are modified problems from different sources:

Exercises 2.1, 2.2, 3.5, 4.1, 4.2, 4.3, 4.5, 5.1a) and 5.3 are from

P. Ögren, R. Nagamune, and U. Jönsson. Exercise notes on optimal control theory. Division of Optimization and System Theory, Royal Institute of Technology, Stockholm, Sweden, 2009.

Exercises 1.1, 1.2, 1.3 and 1.4 are from

D. Bertsekas. Dynamic Programming and Optimal Control, volume 1. Athena Scientific, second edition, 2000.

Exercises 8.3, 8.4, 8.6 and 8.8 are based on

Per E. Rutquist and Marcus M. Edvall. PROPT - Matlab Optimal Control Software. TOMLAB Optimization, May 2009.

Abbreviations

    dare    discrete-time algebraic Riccati equation
    dre     discrete-time Riccati equation
    dp      dynamic programming
    hjbe    Hamilton-Jacobi-Bellman equation
    lqc     linear quadratic control
    mil     matrix inversion lemma
    ode     ordinary differential equation
    pde     partial differential equation
    pmp     Pontryagin minimization principle
    tpbvp   two-point boundary value problem
    are     algebraic Riccati equation


Exercises
This version: September 2012

1 Discrete Optimization

In this section, we will consider solving optimal control problems of the form

    minimize    φ(x_N) + Σ_{k=0}^{N-1} f0(k, x_k, u_k)
    subject to  x_{k+1} = f(k, x_k, u_k),
                x_0 given, x_k ∈ X_k,
                u_k ∈ U(k, x_k),

using the dynamic programming (dp) algorithm. This algorithm finds the optimal feedback control u_k*, for k = 0, 1, ..., N-1, via the backwards recursion

    J(N, x) = φ(x),
    J(n, x) = min_{u ∈ U(n,x)} {f0(n, x, u) + J(n+1, f(n, x, u))}.

The optimal cost is J(0, x_0) and the optimal feedback control at stage n is

    u_n* = arg min_{u ∈ U(n,x)} {f0(n, x, u) + J(n+1, f(n, x, u))}.
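As a loose illustration of the backwards recursion (this sketch is not part of the original text), the algorithm takes only a few lines of Matlab for a problem with finite state and control grids; the data below are those of Exercise 1.2, and all variable names are arbitrary.

    % Backward DP over finite grids; data from Exercise 1.2.
    N   = 4;
    X   = 0:5;                          % admissible states
    U   = -5:5;                         % candidate (integer) controls
    f   = @(x,u) x + u;                 % dynamics
    f0  = @(x,u) x^2 + u^2;             % stage cost
    J   = zeros(size(X));               % J(N,x) = phi(x) = 0
    mu  = zeros(N, numel(X));           % optimal feedback u*(n,x)
    for n = N-1:-1:0
        Jnew = zeros(size(J));
        for i = 1:numel(X)
            Uad  = U(X(i)+U >= 0 & X(i)+U <= 5);   % U(n,x) of Exercise 1.2
            cost = arrayfun(@(u) f0(X(i),u) + J(X == f(X(i),u)), Uad);
            [Jnew(i), idx] = min(cost);
            mu(n+1,i)      = Uad(idx);
        end
        J = Jnew;
    end
    % J(X == 5) now equals the optimal cost J(0,5) = 41 of Exercise 1.2.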

1.1 A certain material is passed through a sequence of two ovens. Denote by

    x_0: the initial temperature of the material,
    x_k, k = 1, 2: the temperature of the material at the exit of oven k,
    u_{k-1}, k = 1, 2: the prevailing temperature in oven k.

The ovens are modeled as

    x_{k+1} = (1 - a)x_k + a u_k,   k = 0, 1,

where a is a known scalar from the interval (0, 1). The objective is to get the final temperature x_2 close to a given target T, while expending relatively little energy. This is expressed by a cost function of the form

    r(x_2 - T)² + u_0² + u_1²,

where r > 0 is a given scalar. For simplicity, we assume no constraints on u_k.

(a) Formulate the problem as an optimization problem.
(b) Solve the problem using the dp algorithm when a = 1/2, T = 0 and r = 1.
(c) Solve the problem for all feasible a, T and r.

1.2 Consider the system

    x_{k+1} = x_k + u_k,   k = 0, 1, 2, 3,

with the initial state x_0 = 5 and the cost function

    Σ_{k=0}^{3} (x_k² + u_k²).

Apply the dp algorithm to the problem when the control constraints are

    U(k, x_k) = {u | 0 ≤ x_k + u ≤ 5, u ∈ Z}.

1.3 Consider the cost function

    α^N φ(x_N) + Σ_{k=0}^{N-1} α^k f0(k, x_k, u_k),

where α is a discount factor with 0 < α < 1. Show that an alternate form of the dp algorithm is given by

    V(N, x) = φ(x),
    V(n, x) = min_{u ∈ U(n,x)} {f0(n, x, u) + α V(n+1, f(n, x, u))},

where the state evolution is given as x_{k+1} = f(k, x_k, u_k).

1.4 A decision maker must continually choose between two activities over a time interval [0, T]. Choosing activity i at time t, where i = 1, 2, earns a reward at a rate g_i(t), and every switch between the two activities costs c > 0. Thus, for example, the reward for starting with activity 1, switching to 2 at time t_1, and switching back to 1 at time t_2 > t_1 earns the total reward

    ∫_0^{t_1} g_1(t) dt + ∫_{t_1}^{t_2} g_2(t) dt + ∫_{t_2}^{T} g_1(t) dt - 2c.

We want to find a set of switching times that maximizes the total reward. Assume that the function g_1(t) - g_2(t) changes sign a finite number of times in the interval [0, T]. Formulate the problem as an optimization problem and write the corresponding dp algorithm.

1.5 Consider the infinite time horizon linear quadratic control (lqc) problem

    minimize_{u(·)}  Σ_{k=0}^{∞} (x_k^T Q x_k + u_k^T R u_k)
    subject to       x_{k+1} = A x_k + B u_k,
                     x_0 given,

where Q is a symmetric positive semidefinite matrix, R is a symmetric positive definite matrix, and the pair (A, B) is controllable. Show by using the Bellman equation that the optimal state feedback controller is given by

    u* = μ(x) = -(B^T P B + R)^{-1} B^T P A x,

where P is the positive definite solution to the discrete-time algebraic Riccati equation (dare)

    P = A^T P A - A^T P B (B^T P B + R)^{-1} B^T P A + Q.

1.6 Consider the finite horizon lqc problem

    minimize_{u(·)}  x_N^T Q_N x_N + Σ_{k=0}^{N-1} (x_k^T Q x_k + u_k^T R u_k)
    subject to       x_{k+1} = A x_k + B u_k,
                     x_0 given,

where Q_N, Q and R are symmetric positive semidefinite matrices.

(a) Apply the discrete version of the Pontryagin minimization principle (pmp) and show that the solution to the two-point boundary value problem (tpbvp)

        x_{k+1} = A x_k - (1/2) B R^{-1} B^T λ_{k+1},   x_0 given,
        λ_k = 2 Q x_k + A^T λ_{k+1},   λ_N = 2 Q_N x_N,

    yields the minimizing sequence

        u_k* = -(1/2) R^{-1} B^T λ_{k+1},   k = 0, ..., N-1.

(b) Show that the above tpbvp can be solved using the Lagrange multiplier

        λ_k = 2 S_k x_k,

    where S_k is given by the backward recursion

        S_N = Q_N,
        S_k = A^T (S_{k+1}^{-1} + B R^{-1} B^T)^{-1} A + Q.

(c) Use the matrix inversion lemma (mil) to show that the above recursion for S_k is equivalent to the discrete-time Riccati equation (dre)

        S_k = A^T S_{k+1} A - A^T S_{k+1} B (B^T S_{k+1} B + R)^{-1} B^T S_{k+1} A + Q.

    The mil can be expressed as

        (A + UBV)^{-1} = A^{-1} - A^{-1} U (B^{-1} + V A^{-1} U)^{-1} V A^{-1},
        BV (A + UBV)^{-1} = (B^{-1} + V A^{-1} U)^{-1} V A^{-1},

    for arbitrary matrices A, B, U and V such that the required inverses exist.

(d) Use your results to calculate a feedback from the state x_k,

        u_k* = μ(k, x_k).

2 Dynamic Programming

In this chapter, we will consider solving optimal control problems of the form

    minimize_{u(·)}  φ(x(t_f)) + ∫_{t_i}^{t_f} f0(t, x(t), u(t)) dt
    subject to       ẋ(t) = f(t, x(t), u(t)),
                     x(t_i) = x_i,
                     u(t) ∈ U,   t ∈ [t_i, t_f],

using dynamic programming via the Hamilton-Jacobi-Bellman equation (hjbe):

1. Define the Hamiltonian as

       H(t, x, u, λ) ≜ f0(t, x, u) + λ^T f(t, x, u).

2. Pointwise minimization of the Hamiltonian yields

       μ(t, x, λ) ≜ arg min_{u ∈ U} H(t, x, u, λ),

   and the optimal control is given by

       u*(t) ≜ μ(t, x, V_x(t, x)),

   where V(t, x) is obtained by solving the hjbe in Step 3.

3. Solve the hjbe with boundary conditions:

       -V_t = H(t, x, μ(t, x, V_x), V_x),   V(t_f, x) = φ(x).

2.1 Find the optimal solution to the problem

    minimize_{u(·)}  ∫_0^{t_f} ((x(t) - cos t)² + u²(t)) dt
    subject to       ẋ(t) = u(t),
                     x(0) = 0,

expressed as a system of odes.

2.2 A producer with production rate x(t) at time t may allocate a portion u(t) of his/her production rate to reinvestments in the factory (thus increasing the production rate) and use the rest (1 - u(t)) to store goods in a warehouse. Thus x(t) evolves according to

    ẋ(t) = γ u(t) x(t),

where γ is a given constant. The producer wants to maximize the total amount of goods stored summed with the capacity of the factory at final time.

(a) Formulate the continuous-time optimal control problem.
(b) Solve the problem analytically.

3 PMP: A Special Case

In this chapter, we will start solving optimal control problems of the form

    minimize_{u(·)}  φ(x(t_f)) + ∫_{t_i}^{t_f} f0(t, x(t), u(t)) dt
    subject to       ẋ(t) = f(t, x(t), u(t)),
                     x(t_i) = x_i,

using a special case of the pmp. Assuming that the final time is fixed and that there are no terminal constraints, the problem may be solved using the following sequence of steps:

1. Define the Hamiltonian as

       H(t, x, u, λ) ≜ f0(t, x, u) + λ^T f(t, x, u).

2. Pointwise minimization of the Hamiltonian yields

       μ(t, x, λ) ≜ arg min_u H(t, x, u, λ),

   and the optimal control candidate is given by

       u*(t) ≜ μ(t, x*(t), λ(t)).

3. Solve the adjoint equations

       λ̇(t) = -H_x(t, x*(t), μ(t, x*(t), λ(t)), λ(t)),   λ(t_f) = φ_x(x*(t_f)).

4. Compare the candidate solutions obtained using the pmp.

3.1 Solve the optimal control problem

    minimize_{u(·)}  ∫_0^{t_f} (x(t) + u²(t)) dt
    subject to       ẋ(t) = x(t) + u(t) + 1,
                     x(0) = 0.

3.2 Find the extremals of the functionals

(a) J = ∫_0^1 ẏ dt,
(b) J = ∫_0^1 y ẏ dt,

given that y(0) = 0 and y(1) = 1.

3.3 Find the extremals of the following functionals

(a) J = ∫_0^1 (y² + ẏ² - 2y sin t) dt,
(b) J = ∫_0^1 ẏ²/t³ dt,
(c) J = ∫_0^1 (y² + ẏ² + 2y e^t) dt,

where y(0) = 0.

3.4 Among all curves of length l in the upper half plane passing through the points (-a, 0) and (a, 0), find the one which encloses the largest area in the interval [-a, a], i.e., solve

    maximize_{x(·)}  ∫_{-a}^{a} x(t) dt
    subject to       x(-a) = 0,
                     x(a) = 0,
                     K(x) = ∫_{-a}^{a} √(1 + ẋ(t)²) dt = l.

3.5 From the point (0, 0) on the bank of a wide river, a boat starts with relative speed to the water equal to γ. The stream of the river becomes faster as it departs from the bank, and the speed is g(y) parallel to the bank, see Figure 3.5a.

    Figure 3.5a. The so-called Zermelo problem in Exercise 3.5.

The movement of the boat is described by

    ẋ(t) = γ cos(φ(t)) + g(y(t)),
    ẏ(t) = γ sin(φ(t)),

where φ is the angle between the boat direction and the bank. We want to determine the movement angle φ(t) so that x(T) is maximized, where T is a fixed transfer time. Show that the optimal control must satisfy the relation

    1/cos(φ(t)) + g(y(t))/γ = constant.

In addition, determine the direction of the boat at the final time.

3.6 Consider the electromechanical system in Figure 3.6a. The electrical subsystem contains a voltage source u(t) which passes a current i(t) through a resistor R and a solenoid of inductance l(z), where z is the position of the solenoid core. The mechanical subsystem is the core of mass m, which is constrained to move horizontally and is restrained by a linear spring and a damper.

(a) Use the following variational principle:

        δJ(u(·)) = ∫_{t_i}^{t_f} δL(q(t), u(t)) dt + ∫_{t_i}^{t_f} τ(t)^T δq dt = 0,

    Figure 3.6a. Electromechanical system.

    where q are generalized coordinates, τ are generalized forces and q̇ = u, to derive the generalization of the Euler-Lagrange equations for nonconservative mechanical systems without energy dissipation (no damping):

        d/dt (∂L/∂q̇) - ∂L/∂q = τ.   (3.1)

    Also, apply the resulting equation to the electromechanical system above.

(b) For a system with energy dissipation the Euler-Lagrange equations are given by

        d/dt (∂L/∂q̇) - ∂L/∂q = τ - ∂F/∂q̇,   (3.2)

    where q are generalized coordinates, τ are generalized forces, and F is Rayleigh's dissipation energy defined as

        F(q, q̇) = (1/2) Σ_j k_j q̇_j²,   (3.3)

    where k_j are damping constants. 2F is the rate of energy dissipation due to the damping. Derive the system equations with the damping included.

3.7 Consider the optimal control problem

    minimize_{u(·)}  (γ/2) x(t_f)² + (1/2) ∫_{t_i}^{t_f} u²(t) dt
    subject to       ẋ(t) = u(t),
                     x(t_i) = x_i.

Derive an optimal feedback policy using the pmp.

3.8 Consider the dynamical system

    ẋ(t) = u(t) + w(t),

where x(t) is the state, u(t) is the control signal, w(t) is a disturbance signal, and where x(0) = 0. In so-called H∞ control the objective is to find a control signal that solves

    min_{u(·)} max_{w(·)} J(u(·), w(·)),

where

    J(u(·), w(·)) = ∫_0^T (γ² x²(t) + u²(t) - γ² w²(t)) dt,

where γ is a given parameter and where γ < 1.

(a) Express the solution as a tpbvp using the pmp. Note that you do not need to solve the equations yet.
(b) Solve the tpbvp and express the solution in feedback form, i.e., u(t) = μ(t, x(t)) for some μ.
(c) For what values of γ does the control signal exist for all t ∈ [0, T]?

4 PMP: General Results

In this chapter, we will consider solving optimal control problems of the form

    minimize_{u(·)}  φ(x(t_f)) + ∫_{t_i}^{t_f} f0(t, x(t), u(t)) dt
    subject to       ẋ(t) = f(t, x(t), u(t)),
                     x(t_i) = x_i,   x(t_f) ∈ S_f(t_f),
                     u(t) ∈ U,   t ∈ [t_i, t_f],

using the general case of the pmp:

1. Define the Hamiltonian as

       H(t, x, u, λ) ≜ f0(t, x, u) + λ^T f(t, x, u).

2. Pointwise minimization of the Hamiltonian yields

       μ(t, x, λ) ≜ arg min_{u ∈ U} H(t, x, u, λ),

   and the optimal control candidate is given by

       u*(t) ≜ μ(t, x*(t), λ(t)).

3. Solve the tpbvp

       λ̇(t) = -H_x(t, x*(t), μ(t, x*(t), λ(t)), λ(t)),   λ(t_f) - φ_x(x*(t_f)) ⊥ S_f(t_f),
       ẋ(t) = H_λ(t, x*(t), μ(t, x*(t), λ(t)), λ(t)),   x(t_i) = x_i,   x(t_f) ∈ S_f(t_f).

4. Compare the candidate solutions obtained using the pmp.

4.1 A producer with production rate x(t) at time t may allocate a portion u(t) of his/her production rate to reinvestments in a factory (thus increasing the production rate) and use the rest (1 - u(t)) to store goods in a warehouse. Thus x(t) evolves according to

    ẋ(t) = γ u(t) x(t),   (4.1)

where γ is a given constant. The producer wants to maximize the total amount of goods stored summed with the capacity of the factory at final time. This gives us the following problem:

    maximize_{u(·)}  x(t_f) + ∫_0^{t_f} (1 - u(t)) x(t) dt
    subject to       ẋ(t) = γ u(t) x(t),   0 < γ < 1,
                     x(0) = x_0 > 0,
                     0 ≤ u(t) ≤ 1,   t ∈ [0, t_f].

Find an analytical solution to the problem above using the pmp.

4.2 A spacecraft approaching the face of the moon can be described by the following equations

    ẋ_1(t) = x_2(t),
    ẋ_2(t) = c u(t)/x_3(t) - g(1 - k x_1(t)),
    ẋ_3(t) = -u(t),

where u(t) ∈ [0, M] for all t, and the initial conditions are given by

    x(0) = (h, ν, m)^T

with the positive constants c, g, k, M, h, ν and m. The state x_1 is the altitude above the surface, x_2 the velocity and x_3 the mass of the spacecraft. Calculate the structure of the fuel minimizing control that brings the craft to rest on the surface of the moon. The fuel consumption is

    ∫_0^{t_f} u(t) dt,

and the transfer time t_f is free. Show that the optimal control law is bang-bang with at most two switches.

4.3 A wasp community can in the time interval [0, t_f] be described by

    ẋ(t) = (a u(t) - b) x(t),   x(0) = 1,
    ẏ(t) = c (1 - u(t)) x(t),   y(0) = 0,

where x is the number of worker bees and y the number of queens. The constants a, b and c are given positive numbers, where b denotes the death rate of the workers and a, c are constants depending on the environment. The control u(t) ∈ [0, 1], for all t ∈ [0, t_f], is the proportion of the community resources dedicated to increasing the number of worker bees. Calculate the control u* that maximizes the number of queens at time t_f, i.e., the number of new communities in the next season.

4.4 Consider a dc motor with input signal u(t), output signal y(t) and the transfer function

    G(s) = 1/s².

Introduce the states x_1(t) ≜ y(t) and x_2(t) ≜ ẏ(t) and compute a control signal that takes the system from an arbitrary initial state to the origin in minimal time when the control signal is bounded by |u(t)| ≤ 1 for all t.

4.5 Consider a rocket, modeled as a particle of constant mass m moving in zero gravity empty space, see Figure 4.5a. Let u be the mass flow, assumed to be a known function of time, and let c be the constant thrust velocity. The equations of motion are given by

    ẋ_1(t) = x_3(t),
    ẋ_2(t) = x_4(t),
    ẋ_3(t) = (c/m) u(t) cos v(t),
    ẋ_4(t) = (c/m) u(t) sin v(t).

(a) Assume that u(t) > 0 for all t. Show that cost functionals of the class

        minimize_{v(·)} ∫_0^{t_f} dt   or   minimize_{v(·)} φ(x(t_f))

    Figure 4.5a. A rocket.

    give the optimal control

        tan v*(t) = (a + b t)/(c + d t)

    (the bilinear tangent law).

(b) Assume that the rocket starts at rest at the origin and that we want to drive it to a given height x_{2f} in a given time T such that the final velocity in the horizontal direction x_3(T) is maximized while x_{4f} = 0. Show that the optimal control is reduced to a linear tangent law,

        tan v*(t) = a + b t.

(c) Let the rocket in the example above represent a missile whose target is at rest. A natural objective is then to minimize the transfer time t_f from the state (0, 0, x_{3i}, x_{4i}) to the state (x_{1f}, x_{2f}, free, free). Solve the problem under the assumption that u is constant.

(d) To increase the realism we now assume that the motion is under a constant gravitational force g. The only difference compared to the system above is that the equation for the fourth state is replaced by

        ẋ_4(t) = (c/m) u(t) sin v(t) - g.

    Show that the bilinear tangent law is still optimal for the cost functional

        minimize_{v(·)}  φ(x(t_f)) + ∫_0^{t_f} dt.

(e) In most cases the assumption of constant mass is inappropriate, at least if longer periods of time are considered. To remedy this flaw we add the mass of the rocket as a fifth state. The equations of motion become

        ẋ_1(t) = x_3(t),
        ẋ_2(t) = x_4(t),
        ẋ_3(t) = (c/x_5(t)) u(t) cos v(t),
        ẋ_4(t) = (c/x_5(t)) u(t) sin v(t) - g,
        ẋ_5(t) = -u(t),

    where u(t) ∈ [0, u_max], t ∈ [0, t_f]. Show that the optimal solution for transferring the rocket from a state of given position, velocity and mass to a given altitude x_{2f} using a given amount of fuel, such that the distance x_1(t_f) - x_1(0) is maximized, is

        v*(t) = constant,   u*(t) ∈ {u_max, 0}.

5 Infinite Horizon Optimal Control

In this chapter, we will consider solving optimal control problems of the form

    minimize_{u(·)}  ∫_0^{∞} f0(x(t), u(t)) dt
    subject to       ẋ(t) = f(x(t), u(t)),
                     x(0) = x_0,
                     u(t) ∈ U(x),

using dynamic programming via the infinite horizon Hamilton-Jacobi-Bellman equation (hjbe):

1. Define the Hamiltonian as

       H(x, u, λ) ≜ f0(x, u) + λ^T f(x, u).

2. Pointwise minimization of the Hamiltonian yields

       μ(x, λ) ≜ arg min_{u ∈ U} H(x, u, λ),

   and the optimal control is given by

       u*(t) ≜ μ(x, V_x(x)),

   where V(x) is obtained by solving the infinite horizon hjbe in Step 3.

3. Solve the infinite horizon hjbe:

       0 = H(x, μ(x, V_x), V_x).

5.1 Consider the following infinite horizon optimal control problem

    minimize_{u(·)}  (1/2) ∫_0^{∞} (u²(t) + x⁴(t)) dt
    subject to       ẋ(t) = u(t),
                     x(0) = x_0.

(a) Show that the solution is given by u*(t) = -x²(t) sgn x(t).
(b) Add the constraint |u| ≤ 1. What is now the optimal cost-to-go function J*(x) and the optimal feedback?

5.2 Solve the optimal control problem

    minimize_{u(·)}  (1/2) ∫_0^{∞} (u²(t) + x²(t) + 2x⁴(t)) dt
    subject to       ẋ(t) = x³(t) + u(t),
                     x(0) = x_0.

5.3 Consider the problem

    minimize_{u(·)}  J = ∫_0^{∞} (α² u²(t) + β² y²(t)) dt
    subject to       ÿ(t) = u(t),
                     u(t) ∈ R,   t ≥ 0,

where α > 0 and β > 0.

(a) Write the system in state space form with a state vector x.
(b) Write the cost function J on the form

        ∫_0^{∞} (u(t)^T R u(t) + x(t)^T Q x(t)) dt.

(c) What is the optimal feedback u*(t)?
(d) To obtain the optimal feedback, the algebraic Riccati equation (are) must be solved. Compute the optimal feedback by using the Matlab command are for the case when α = 1 and β = 2 (a call sketch is given after this exercise). In addition, compute the eigenvalues of the system with optimal feedback.
(e) In this case the are can be solved by hand since the dimension of the state space is low. Solve the are by hand or by using a symbolic tool for arbitrary α and β. Verify that your solution is the same as in the previous case when α = 1 and β = 2.
(f) (Optional) Compute an analytical expression for the eigenvalues of the system with optimal feedback.
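For part (d), a minimal sketch could look as follows. It assumes the state vector x = (y ẏ)^T from part (a), so that part (b) gives Q = diag(β², 0) and R = α²; recall that X = are(A, B, C) solves A^T X + XA - XBX + C = 0, so the middle argument must be B R^{-1} B^T.

    % Sketch for Exercise 5.3(d) with alpha = 1, beta = 2.
    alpha = 1;  beta = 2;
    A = [0 1; 0 0];  B = [0; 1];
    Q = diag([beta^2 0]);  R = alpha^2;   % weights from part (b)
    P = are(A, B*(1/R)*B', Q);            % ARE solution
    K = (1/R)*(B'*P);                     % optimal feedback u = -K*x
    eig(A - B*K)                          % closed-loop eigenvalues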

5.4 Assume that the lateral equations of motion (i.e. the horizontal dynamics) for a Boeing 747 can be expressed as

    ẋ(t) = A x(t) + B u(t),   (5.1)

where x = (v r p φ ψ y)^T, u = (δ_a δ_r)^T,

    v: lateral velocity,
    r: yaw rate,
    p: roll rate,
    φ: roll angle,
    ψ: yaw angle,
    y: lateral distance from a straight reference path,
    δ_a: aileron angle,
    δ_r: rudder angle,

and

    A = ( -0.089  -2.19    0      0.319  0     0
           0.076  -0.217  -0.166  0      0     0
          -0.602   0.327  -0.975  0      0     0
           0       0.15    1      0      0     0
           0       1       0      0      0     0
           1       0       0      0      2.19  0 ),

    B = ( 0       0.0327
          0.0264 -0.151
          0.227   0.0636
          0       0
          0       0
          0       0 ),

    C = ( 0  0  0  0  0  1 ).   (5.2)

The task is now to design an autopilot that keeps the aircraft flying along a straight line during a long period of time. First an intuitive approach will be used and then an optimal control approach.

(a) A reasonable feedback, by intuition, is to connect the aileron with the roll angle and the rudder with the (negative) yaw angle. Create a matrix K in Matlab that makes these connections such that u(t) = Kx(t). Then compute the poles of the feedback system and explain why this controller will have some problems.
(b) Use the Matlab command lqr to compute the optimal control feedback (a call sketch is given after this exercise). Make sure you understand the similarities and the differences between lqr and are. Compute the poles for the case when Q = I_{6×6} and R = I_{2×2}.
(c) Explain how different choices of (diagonal) Q and R will affect the performance of the autopilot.
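For part (b), the computation is a one-liner once A and B from (5.2) have been entered (a sketch; lqr returns the gain K for u = -Kx, the are solution S, and the closed-loop poles E):

    Q = eye(6);  R = eye(2);
    [K, S, E] = lqr(A, B, Q, R);   % E = eig(A - B*K), the poles sought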


6 Model Predictive Control

In this exercise session we will focus on computing the explicit solution to the model predictive control (mpc) problem (see [3]) using the multi-parametric toolbox (mpt) for Matlab. The toolbox is added to the Matlab path via the command

    initcourse(tsrt08)

and the mpt manual may be read online at

    www.control.isy.liu.se/lyzell/MPTmanual.pdf

6.1 Here we are going to consider the second order system

    y(t) = 2/(s² + 3s + 2) u(t),   (6.1)

sampled with T_s = 0.1 s to obtain the discrete-time state-space representation

    x_{k+1} = ( 0.7326  -0.1722 ; 0.08611  0.9909 ) x_k + ( 0.1722 ; 0.009056 ) u_k,
    y_k = ( 0  1 ) x_k.   (6.2)

(a) Use the mpt toolbox to find the mpc law for (6.2) using the 2-norm with N = 2, Q = I_{2×2}, R = 0.01, and the input satisfying

        -2 ≤ u ≤ 2   and   ( -10 ; -10 ) ≤ x ≤ ( 10 ; 10 ).

    How many regions are there? Present plots of the control partition and the time trajectory for the initial condition x_0 = (1 1)^T.

(b) Find the number of regions in the control law for N = 1, 2, ..., 14. How does the number of regions seem to depend on N, e.g., is the complexity polynomial or exponential? Estimate the order of the complexity via

        arg min_{α,β} ‖α N^β - n_r‖₂²   and   arg min_{α,β} ‖α(β^N - 1) - n_r‖₂²,

    where n_r denotes the number of regions. Discuss the result. (A script sketch is given below.)
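The following sketch shows how (a) and (b) might be scripted with the mpt commands from the hints. The sysStruct/probStruct field names follow the mpt manual; treat the details as assumptions to be checked against the manual rather than as verified code.

    % (a) explicit MPC law for (6.2)
    sysStruct.A = [0.7326 -0.1722; 0.08611 0.9909];
    sysStruct.B = [0.1722; 0.009056];
    sysStruct.C = [0 1];            sysStruct.D = 0;
    sysStruct.umin = -2;            sysStruct.umax = 2;
    sysStruct.xmin = [-10; -10];    sysStruct.xmax = [10; 10];
    probStruct.norm = 2;  probStruct.N = 2;
    probStruct.Q = eye(2);  probStruct.R = 0.01;
    ctrl = mpt_control(sysStruct, probStruct);
    length(ctrl.Pn)                          % number of regions
    mpt_plotPartition(ctrl);
    mpt_plotTimeTrajectory(ctrl, [1; 1]);

    % (b) region count as a function of N, fitted with lsqnonlin
    Ns = 1:14;  nr = zeros(size(Ns));
    for i = 1:numel(Ns)
        probStruct.N = Ns(i);
        c     = mpt_control(sysStruct, probStruct);
        nr(i) = length(c.Pn);
    end
    p = lsqnonlin(@(p) p(1)*Ns.^p(2) - nr, [1 1]);          % a*N^b
    q = lsqnonlin(@(q) q(1)*(q(2).^Ns - 1) - nr, [1 1.5]);  % a*(b^N - 1)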


7 Numerical Algorithms

All Matlab files that are needed are available for download from the course pages, but some lines of code are missing and the task is to complement the files to be able to solve the given problems. All places where you have to write some code are indicated by three question marks (???).

A basic second-order system will be used in the exercises below, and the model and the problem are presented here. Consider a 1D motion model of a particle with position z(t) and speed v(t). Define the state vector x = (z v)^T and the continuous time model

    ẋ(t) = f(x(t), u(t)) = ( 0 1 ; 0 0 ) x(t) + ( 0 ; 1 ) u(t).   (7.1)

The problem is to go from the state x_i = x(0) = (1 1)^T to x(t_f) = (0 0)^T, where t_f = 2, but such that the control input energy ∫_0^{t_f} u²(t) dt is minimized. Thus, the optimization problem is

    min_u  J = ∫_0^{t_f} f0(x, u) dt
    s.t.   ẋ(t) = f(x(t), u(t)),
           x(0) = x_i,
           x(t_f) = x_f,   (7.2)

where f0(t) = u²(t) and f(·) is given in (7.1).

7.1 Discretization Method 1. In this approach we will use the discrete time model

    x[k+1] = f(x[k], u[k]) = ( 1 h ; 0 1 ) x[k] + ( 0 ; h ) u[k]   (7.3)

to represent the system, where t = kh, h = t_f/N and x[k] = x(kh). The discretized version of the problem (7.2) is then

    min_{u[n], n=0,...,N-1}  h Σ_{k=0}^{N-1} u²[k]
    s.t.  x[k+1] = f(x[k], u[k]),
          x[0] = x_i,
          x[N] = x_f.   (7.4)

Define the optimization parameter vector as

    y = ( x[0]^T u[0]  x[1]^T u[1]  ...  x[N-1]^T u[N-1]  x[N]^T u[N] )^T.   (7.5)

Note that u[N] is superfluous, but is included to make the presentation and the code more convenient. Furthermore, define

    F(y) = h Σ_{k=0}^{N-1} u²[k]   (7.6)

and

    G(y) = ( g_1(y)    )   ( x[0] - x_i               )
           ( g_2(y)    )   ( x[N] - x_f               )
           ( g_3(y)    ) = ( x[1] - f(x[0], u[0])     )
           ( g_4(y)    )   ( x[2] - f(x[1], u[1])     )
           ( ...       )   ( ...                      )
           ( g_{N+2}(y))   ( x[N] - f(x[N-1], u[N-1]) ).   (7.7)

Then the optimization problem in (7.4) can be expressed as the constrained problem

    min_y  F(y)   (7.8)
    s.t.   G(y) = 0.   (7.9)
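Since F is quadratic and G is linear in y, the whole problem (7.8)-(7.9) is a small equality-constrained QP. As a self-contained illustration, independent of the course files (which use fmincon and their own variable names), it can be assembled and solved with quadprog:

    % Sketch: (7.8)-(7.9) as a QP; block structure as in the hint for 7.1(c).
    N  = 20;  tf = 2;  h = tf/N;
    Ad = [1 h; 0 1];  Bd = [0; h];            % model (7.3)
    xi = [1; 1];  xf = [0; 0];
    n  = 3*(N+1);                             % y stacks (x[k]; u[k]), k = 0..N
    H  = zeros(n);                            % F(y) = h*sum u[k]^2
    for k = 0:N-1
        H(3*k+3, 3*k+3) = 2*h;                % u[k] is entry 3k+3 of y
    end
    E = [eye(2) zeros(2,1)];                  % picks x[k] from a block
    F = [Ad Bd];                              % propagates (x[k]; u[k])
    Aeq = zeros(2*(N+2), n);  Beq = zeros(2*(N+2), 1);
    Aeq(1:2, 1:3) = E;           Beq(1:2) = xi;   % x[0] = x_i
    Aeq(3:4, end-2:end) = E;     Beq(3:4) = xf;   % x[N] = x_f
    for k = 0:N-1                                 % x[k+1] - f(x[k],u[k]) = 0
        rows = 4 + 2*k + (1:2);
        Aeq(rows, 3*k+(1:3))   = -F;
        Aeq(rows, 3*k+3+(1:3)) =  E;
    end
    y = quadprog(H, zeros(n,1), [], [], Aeq, Beq);
    u = y(3:3:end-3);                         % extract u[0], ..., u[N-1]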


(a) Show that the discrete time model in (7.3) can be obtained from (7.1) by using the Euler approximation

        x[k+1] = x[k] + h f(x[k], u[k]),   (7.10)

    and complement the file secOrderSysEqDisc.m with this model.

(b) Complement the file secOrderSysCostCon.m with a discrete time version of the cost function.

(c) Note that the constraints are linear for this system model and this is very important to exploit in the optimization routine. In fact the problem is a quadratic programming problem, and since it only contains equality constraints it can be solved as a linear system of equations. Complete the file mainDiscLinconSecOrderSys.m by creating the linear constraint matrix Aeq and the vector Beq such that the constraints defined by G(y) = 0 are expressed as Aeq y = Beq. Then run the Matlab script.

(d) Now suppose that the constraints are nonlinear. This case can be handled by passing a function that computes G(y) and its Jacobian. Let the initial and terminal state constraints, g_1(y) = 0 and g_2(y) = 0, be handled as above and complete the function secOrderSysNonlcon.m by computing

        ceq = ( g_3(y)  g_4(y)  ...  g_{N+2}(y) )   (7.11)

    and the Jacobian

        ceqJac = [∇_y ceq]^T =
            ( ∂g_3/∂x[0]      ∂g_3/∂u[0]      ∂g_3/∂x[1]      ∂g_3/∂u[1]      ...
              ∂g_4/∂x[0]      ∂g_4/∂u[0]      ∂g_4/∂x[1]      ∂g_4/∂u[1]      ...
              ...
              ∂g_{N+2}/∂x[0]  ∂g_{N+2}/∂u[0]  ∂g_{N+2}/∂x[1]  ∂g_{N+2}/∂u[1]  ... )^T.   (7.12)

    Each row in (7.12) is computed in each loop of the for-statement in secOrderSysNonlcon.m. Note that the length of the state vector is 2. Also note that Matlab wants the Jacobian to be transposed, but that operation is already added last in the file. Finally, run the Matlab script mainDiscNonlinconSecOrderSys.m to solve the problem again.

7.2 Discretization Method 2. In this exercise we will also use the discrete time model in (7.3). However, the terminal constraint is removed by including it in the objective function: if the terminal constraint is not fulfilled, it penalizes the objective function. Now the optimization problem can be defined as

    min_{u[n], n=0,...,N-1}  c ‖x(t_f) - x_f‖² + h Σ_{k=0}^{N-1} u²[k]
    s.t.  x[k+1] = f(x[k], u[k]),
          x[0] = x_i,   (7.1)

where f(·) is given in (7.3), N = t_f/h and c is a predefined constant. This problem can be rewritten as an unconstrained nonlinear problem

    min_{u[n], n=0,...,N-1}  J = w(x_i, x_f, u[0], u[1], ..., u[N-1]),   (7.2)

where w(·) contains the system model recursion (7.3). The optimization problem will be solved by using the Matlab function fminunc (a sketch is given after the exercise items below).

(a) Write down an explicit algorithm for the computation of w(·).
(b) Complement the file secOrderSysCostUnc.m with the cost function w(·). Solve the problem by running the script mainDiscUncSecOrderSys.m.
(c) Compare methods 1 and 2. What are the advantages and disadvantages? Which method handles constraints best? Suppose the task was also to implement the optimization algorithm itself; which method would be easiest to implement an algorithm for (assuming that the state dynamics are nonlinear)?
(d) (Optional) Implement an unconstrained gradient method that can replace the fminunc function in mainDiscUncSecOrderSys.m.
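A compact sketch of w(·) and the fminunc call is given below; the constant c and the grid size are illustrative choices, and the course files structure this differently (put w at the end of the script or in a file of its own).

    % Sketch of 7.2: w(.) runs the recursion (7.3) and adds the penalty.
    N  = 20;  tf = 2;  h = tf/N;  c = 100;
    Ad = [1 h; 0 1];  Bd = [0; h];
    xi = [1; 1];  xf = [0; 0];
    u  = fminunc(@(u) w(u, Ad, Bd, xi, xf, h, c), zeros(N,1));

    function J = w(u, Ad, Bd, xi, xf, h, c)
    x = xi;
    for k = 1:numel(u)
        x = Ad*x + Bd*u(k);          % forward recursion (7.3)
    end
    J = c*norm(x - xf)^2 + h*sum(u.^2);
    end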

7.3 Gradient Method. As above, the terminal constraint is added as a penalty in the objective function, in order to introduce the pmp gradient method here in a more straightforward way. Thus, the optimization problem is

    min_u  J = c ‖x(t_f) - x_f‖² + ∫_{t_i}^{t_f} u²(t) dt   (7.1)
    s.t.   ẋ(t) = f(x(t), u(t)),   (7.2)
           x(t_i) = x_i.   (7.3)

(a) Write down the Hamiltonian and show that the Hamiltonian partial derivatives w.r.t. x and u are

        H_x = ( 0  λ_1 ),
        H_u = 2u(t) + λ_2,   (7.4)

    respectively.
(b) What is the adjoint equation and its terminal constraint?
(c) Complement the file secOrderSysEq.m with the system model, the file secOrderSysAdjointEq.m with the adjoint equations, the file secOrderSysFinalLambda.m with the terminal values of λ, and the file secOrderSysGradient.m with the control signal gradient. Finally, complement the script mainGradientSecOrderSys.m and solve the problem (a sketch of the overall iteration is given below).
(d) Try some different values of the penalty constant c. What happens if c is small? What happens if c is large?
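For orientation, the whole gradient iteration can be sketched in a few lines; Euler integration, the fixed step size, and the values of c and N are illustrative simplifications of what the course files do.

    % Sketch of the gradient iteration in 7.3.
    N = 200;  tf = 2;  h = tf/N;  c = 100;  step = 0.01;
    xi = [1; 1];  xf = [0; 0];
    u  = zeros(1, N);
    for iter = 1:2000
        x = zeros(2, N+1);  x(:,1) = xi;        % forward: xdot = (x2; u)
        for k = 1:N
            x(:,k+1) = x(:,k) + h*[x(2,k); u(k)];
        end
        lam = zeros(2, N+1);                    % backward adjoint equation
        lam(:,N+1) = 2*c*(x(:,N+1) - xf);       % lam(tf) = 2c(x(tf) - xf)
        for k = N:-1:1
            lam(:,k) = lam(:,k+1) + h*[0; lam(1,k+1)];  % lamdot = -(0; lam1)
        end
        u = u - step*(2*u + lam(2,1:N));        % step along Hu = 2u + lam2
    end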

7.4 Shooting Method. Shooting methods are based on successive improvements of the unspecified terminal conditions of the two-point boundary value problem. In this case we can use the original problem formulation in (7.2). Complement the file secOrderSysEqAndAdjointEq.m with the combined system and adjoint equations, and the file theta.m with the final constraints. The main script is mainShootingSecOrderSys.m. A sketch of the idea is given below.
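One way to sketch the idea: treat the unknown initial adjoint λ(0) as the shooting parameter and let fsolve drive the terminal state error to zero (a minimal illustration, with u = -λ_2/2 obtained from H_u = 0; function and variable names are ours, not the course files').

    % Shooting sketch for the TPBVP of (7.2).
    tf = 2;  xi = [1; 1];  xf = [0; 0];
    odefun = @(t,z) [z(2); -z(4)/2; 0; -z(3)];   % z = (x1 x2 lam1 lam2)'
    lam0 = fsolve(@(l0) finalErr(l0, odefun, tf, xi, xf), [1; 1]);

    function err = finalErr(lam0, odefun, tf, xi, xf)
    [~, z] = ode45(odefun, [0 tf], [xi; lam0]);  % integrate state + adjoint
    err = z(end, 1:2)' - xf;                     % terminal error x(tf) - xf
    end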

7.5 Reflections.

(a) What are the advantages and disadvantages of discretization methods?
(b) Discuss the advantages and disadvantages of the problem formulation in 7.1 compared to the formulation in 7.2.
(c) What are the advantages and disadvantages of gradient methods?
(d) Compare the methods in 7.1, 7.2 and 7.3 in terms of accuracy and complexity/speed. Also compare the results for different algorithms used by fmincon in Exercise 7.1. Can all algorithms handle the optimization problem?
(e) Assume that there are constraints on the control signal and you can use either the discretization method in 7.1 or the gradient method in 7.3. Which method would you use?
(f) What are the advantages and disadvantages of shooting methods?

8 The PROPT toolbox

PROPT is a Matlab toolbox for solving optimal control problems. See the PROPT manual [5] for more information.

PROPT uses a so-called Gauss pseudospectral method (GPM) to solve optimal control problems. A GPM is a direct method for discretizing a continuous optimal control problem into a nonlinear program (NLP) that can be solved by well-developed numerical algorithms that attempt to satisfy the Karush-Kuhn-Tucker (KKT) conditions associated with the NLP. In pseudospectral methods the state and control trajectories are parameterized using global and orthogonal polynomials, see Chapter 10.5 in the lecture notes. In GPM the points at which the optimal control problem is discretized (the collocation points) are the Legendre-Gauss (LG) points. See [1] for details.

PROPT initialization: PROPT is available in the ISY computer rooms with Linux, e.g. Egypten. Follow these steps to initialize PROPT:

1. In a terminal window, run module add matlab/tomlab
2. Start Matlab, e.g. by running matlab & in a terminal window
3. At the Matlab prompt, run initcourse tsrt08
4. At the Matlab prompt, run initPropt
5. Check that the printed information looks OK.

Further reading:

    http://tomdyn.com
    http://tomopt.com/docs/TOMLAB_PROPT.pdf
    http://fdol.mae.ufl.edu/AIAA-20478.pdf

8.1 Minimum Curve Length; Example 1. Consider the problem of finding the curve with the minimum length from a point (0, 0) to (x_f, y_f). The solution is of course obvious, but we will use this example as a starting point. The problem can be formulated by using an expression for the length of the curve from (0, 0) to (x_f, y_f):

    s = ∫_0^{x_f} √(1 + y′(x)²) dx.   (8.1)

Note that x is the time variable. The optimal control problem is solved in minCurveLength.m by using PROPT/Matlab.

(a) Derive the expression for the length of the curve in (8.1).
(b) Examine the script minCurveLength.m and write down the problem that is solved on a standard optimal control form. Thus, what are φ(·), f(·), f0(·), etc.?
(c) Run the script.
(d) Use Pontryagin's minimum principle to show that the solution is a straight line.

8.2 Minimum Curve Length; Example 2. Consider the minimum length curve problem again. The problem can be re-formulated by using a constant speed model where the control signal is the heading angle. This problem is solved in the PROPT/Matlab file minCurveLengthHeadingCtrl.m.

(a) Examine the PROPT/Matlab file and write down the optimal control problem that is solved in this example, i.e., what are f, f0 and φ in the standard optimal control formulation?
(b) Run the script and compare with the result from the previous exercise.

8.3 The Brachistochrone problem. Find the curve between two points, A and B, that is covered in the least time by a body that starts in A with zero speed and is constrained to move along the curve to point B, under the action of gravity and assuming no friction. The Brachistochrone problem (Gr. brachistos - the shortest, chronos - time) was posed by Johann Bernoulli in Acta Eruditorum in 1696 and the history of this problem is indeed very interesting and involves several of the greatest scientists ever, like Galileo, Pascal, Fermat, Newton, Lagrange, and Euler.

    Figure 8.3a. The brachistochrone problem.

The Brachistochrone problem; Example 1. Let the motion of the particle, under the influence of gravity g, be defined by a time continuous state space model Ẋ = f(X, θ), where the state vector is defined as X = (x y v)^T, (x y)^T is the Cartesian position of the particle in a vertical plane and v is the speed, i.e.,

    ẋ = v sin(θ),
    ẏ = v cos(θ).   (8.1)

The motion of the particle is constrained by a path that is defined by the angle θ(t), see Figure 8.3a.

(a) Give an explicit expression for f(X, θ) (only the expression for v̇ is missing).
(b) Define the Brachistochrone problem as an optimal control problem based on this state space model. Assume that the initial position of the particle is in the origin and the initial speed is zero. The final position of the particle is (x(t_f) y(t_f))^T = (x_f y_f)^T = (10 3)^T.
(c) Modify the script minCurveLengthHeadingCtrl.m of the minimum curve length example 8.2 above and solve this optimal control problem with PROPT.

8.4 The Brachistochrone problem; Example 2. The time for a particle to travel on a curve between the points p_0 = (0, 0) and p_f = (x_f, y_f) is

    t_f = ∫_{p_0}^{p_f} (1/v) ds,   (8.1)

where ds is an element of arc length and v is the speed.

(a) Show that the travel time t_f is

        t_f = ∫_0^{x_f} √( (1 + y′(x)²) / (2 g y(x)) ) dx   (8.2)

    in the Brachistochrone problem, where the speed of the particle is due to gravity and its initial speed is zero.
(b) Define the Brachistochrone problem as an optimal control problem based on the expression in (8.2). Note that the time variable is eliminated and that y is a function of x now.
(c) Use the pmp to show (or derive) that the optimal trajectory is a cycloid

        y′(x) = √( (C - y(x)) / y(x) ),   (8.3)

    with the solution

        x = (C/2)(φ - sin φ),
        y = (C/2)(1 - cos φ).   (8.4)

(d) Try to solve this optimal control problem with PROPT/Matlab, by modifying the script minCurveLength.m from the minimum curve length example. Compare the result with the solution in Example 1 above. Try to explain why we have problems here.

8.5 The Brachistochrone problem; Example 3. To avoid the problem when solving the previous problem in PROPT/Matlab we can use the results in Exercise 8.4c). The cycloid equation (8.3) can be rewritten as

    y(x)(y′(x)² + 1) = C.   (8.1)

PROPT can now be used to solve the Brachistochrone problem by defining a feasibility problem

    min_C  0
    s.t.   y(x)(y′(x)² + 1) = C,
           y(0) = 0,
           y(x_f) = y_f.   (8.2)

Solve this feasibility problem with PROPT/Matlab. Note that the system dynamics in Example 3 (and 4) are not modeled on the standard form as an ode. Instead, differential equations on the form F(ẋ, x) = 0 are called differential algebraic equations (dae), and this is a more general form of system of differential equations.

8.6 The Brachistochrone problem; Example 4. An alternative formulation of the Brachistochrone problem can be obtained by considering the law of conservation of energy (which is related to the principle of least action, see Example 17 in the lecture notes). Consider the position (x y)^T of the particle and its velocity (ẋ ẏ)^T.

(a) Write the kinetic energy E_k and the potential energy E_p as functions of x, y, ẋ, ẏ, the mass m, and the gravity constant g.
(b) Define the Brachistochrone problem as an optimal control problem based on the law of conservation of energy.
(c) Solve this optimal control problem with PROPT. Assume that the mass of the particle is m = 1.

8.7 The Brachistochrone problem; Reflections.

(a) Is the problem in Exercise 8.4 due to the problem formulation or the solution method or both?
(b) Discuss why not only the solution method (i.e. the optimization algorithm) is important, but also the problem formulation.
(c) Compare the different approaches to the Brachistochrone problem in Exercises 8.3-8.6. Try to explain the advantages and disadvantages of the different formulations of the Brachistochrone problem. Which approach can handle additional constraints best?

8.8 The Zermelo problem. Consider a special case of Exercise 3.5 where g(y) = y. Thus, from the point (0, 0) on the bank of a wide river, a boat starts with relative speed to the water equal to γ. The stream of the river becomes faster as it departs from the bank, and the speed is g(y) = y parallel to the bank. The movement of the boat is described by

    ẋ(t) = γ cos(φ(t)) + y(t),
    ẏ(t) = γ sin(φ(t)),

where φ is the angle between the boat direction and the bank. We want to determine the movement angle φ(t) so that x(T) is maximized, where T is a fixed transfer time. Use PROPT to solve this optimal control problem by complementing the file mainZermeloPropt.m, i.e. replace all question marks (???) with some proper code.

8.9 Second-order system. Solve the problem in (7.2) by using PROPT.

Hints
This version: September 2012

1 Discrete Optimization

1.2 Note that the control constraint set and the system equation give lower and upper bounds on x_k. Also note that both u and x are integers, and because of that, you can use tables to examine different cases, where each row corresponds to one valid value of x, i.e.,

    x    J(k, x)   u*
    0    ?         ?
    1    ?         ?
    ...  ...       ...

1.3 Define V(n, x) := α^{-n} J(n, x), where J is the cost-to-go function.

1.4 Let t_1 < t_2 < ... < t_{N-1} denote the times where g_1(t) = g_2(t). It is never optimal to switch activity at any other times! We can therefore divide the problem into N - 1 stages, where we want to determine for each stage whether or not to switch.

1.5 Consider a quadratic cost

    V(x) = x^T P x,

where P is a symmetric positive definite matrix.

2 Dynamic Programming

2.1 V(t, x) is quadratic in x.

2.2 b) What is the sign of x(t)? In the hjbe, how does V(t, x) seem to depend on x?

3 PMP: A Special Case

3.4 Rewrite the equation K(x) = l as a differential equation ż = f(z, x, ẋ), plus constraints, and then introduce two adjoint variables corresponding to x and z, respectively.

3.6 a) The generalized force is u(t).

3.8 a) You can consider (u w)^T as a control signal in the pmp framework.

5 Infinite Horizon Optimal Control

5.2 Make the ansatz V(x) = αx² + βx⁴.

5.3 c) Use Theorem 10 in the lecture notes.
    e) Note that the solution matrix P is symmetric and positive definite, thus P = P^T > 0.

6 Model Predictive Control

6.1 Useful commands: mpt_control, mpt_simplify, mpt_plotPartition, mpt_plotTimeTrajectory and lsqnonlin.

7 Numerical Algorithms

7.1 c) The structure of the matrix Aeq is

        Aeq = ( E  0  0  0  ...  0  0 )
              ( 0  0  0  0  ...  0  E )
              ( F  E  0  0  ...  0  0 )
              ( 0  F  E  0  ...  0  0 )
              ( 0  0  F  E  ...  0  0 )
              ( ...               ... )
              ( 0  0  0  0  ...  F  E ),   (7.1)

    where the size of F and E is (2 × 3).

    d) You can use the function checkCeqNonlcon to check that the Jacobian of the equality constraints is similar to the numerical Jacobian. You can also compare the Jacobian to the matrix Aeq in this example.

8 The PROPT toolbox

8.4 Use the fact that H(y, u, λ) = const. in this case (explain why!).

Solutions
This version: September 2012

1 Discrete Optimization

1.1 (a) The discrete optimization problem is given by

        minimize_{u_0, u_1}  r(x_2 - T)² + Σ_{k=0}^{1} u_k²
        subject to           x_{k+1} = (1 - a)x_k + a u_k,   k = 0, 1,
                             x_0 given,
                             u_k ∈ R,   k = 0, 1.

    (b) With a = 1/2, T = 0 and r = 1 this can be cast on standard form with N = 2, φ(x) = x², f0(k, x, u) = u², and f(k, x, u) = (1/2)x + (1/2)u. The dp algorithm gives us:

        Stage k = N = 2:

            J(2, x) = x².

        Stage k = 1:

            J(1, x) = min_u {u² + J(2, f(1, x, u))}
                    = min_u {u² + ((1/2)x + (1/2)u)²}.   (1.1)

        The minimization is done by setting the derivative, w.r.t. u, to zero, since the function is strictly convex in u. Thus, we have

            2u + ((1/2)x + (1/2)u) = 0,

        which gives the control function

            u_1* = μ(1, x) = -x/5.

        Note that we now have computed the optimal control for each possible state x. By substituting the optimal u_1* into (1.1) we obtain

            J(1, x) = x²/5.   (1.2)

        Stage k = 0:

            J(0, x) = min_u {u² + J(1, f(0, x, u))}
                    = min_u {u² + J(1, (1/2)x + (1/2)u)}.

        Substituting J(1, x) by (1.2) and minimizing by setting the derivative, w.r.t. u, to zero gives (after some calculations) the optimal control

            u_0* = μ(0, x) = -x/21   (1.3)

        and the optimal cost

            J(0, x) = x²/21.   (1.4)

    (c) This can be cast on standard form with N = 2, φ(x) = r(x - T)², f0(k, x, u) = u², and f(k, x, u) = (1 - a)x + au. The dp algorithm gives us:

        Stage k = N = 2:

            J(2, x) = r(x - T)².

        Stage k = 1:

            J(1, x) = min_u {u² + J(2, f(1, x, u))}
                    = min_u {u² + r((1 - a)x + au - T)²}.   (1.5)

        The minimization is done by setting the derivative, w.r.t. u, to zero, since the function is strictly convex in u. Thus, we have

            2u + 2ra((1 - a)x + au - T) = 0,

        which gives the control function

            u_1* = μ(1, x) = ra(T - (1 - a)x) / (1 + ra²).

        Note that we now have computed the optimal control for each possible state x. By substituting the optimal u_1* into (1.5) we obtain (after some work)

            J(1, x) = r((1 - a)x - T)² / (1 + ra²).   (1.6)

        Stage k = 0:

            J(0, x) = min_u {u² + J(1, f(0, x, u))}
                    = min_u {u² + J(1, (1 - a)x + au)}.

        Substituting J(1, x) by (1.6) and minimizing by setting the derivative, w.r.t. u, to zero gives (after some calculations) the optimal control

            u_0* = μ(0, x) = r(1 - a)a(T - (1 - a)²x) / (1 + ra²(1 + (1 - a)²))   (1.7)

        and the optimal cost

            J(0, x) = r((1 - a)²x - T)² / (1 + ra²(1 + (1 - a)²)).   (1.8)

1.2 We have the optimization problem on standard form with N = 4, φ(x) = 0, f0(k, x, u) = x² + u², and f(k, x, u) = x + u. Note that 0 ≤ x_k + u_k ≤ 5 is equivalent to 0 ≤ x_{k+1} ≤ 5. So we know that 0 ≤ x_k ≤ 5 for k = 0, 1, 2, 3, if u_k ∈ U(k, x).

Stage k = N = 4:

    J(4, x) = 0.

Stage k = 3:

    J(3, x) = min_{u ∈ U(3,x)} {f0(3, x, u) + J(4, f(3, x, u))}
            = min_{u ∈ U(3,x)} {x² + u²} = x²,   (1.1)

since u = 0 ∈ U(3, x) for all x satisfying 0 ≤ x ≤ 5.

Stage k = 2:

    J(2, x) = min_{u ∈ U(2,x)} {f0(2, x, u) + J(3, f(2, x, u))}
            = min_{u ∈ U(2,x)} {x² + u² + (x + u)²}
            = (3/2)x² + min_{-x ≤ u ≤ 5-x} {2((1/2)x + u)²}   (1.2)

    x   J(2, x)                                            u*
    0   min_{0 ≤ u ≤ 5} {2u²} = 0                          0
    1   3/2 + min_{-1 ≤ u ≤ 4} {2(1/2 + u)²} = 2           -1, 0
    2   6 + min_{-2 ≤ u ≤ 3} {2(1 + u)²} = 6               -1
    3   27/2 + min_{-3 ≤ u ≤ 2} {2(3/2 + u)²} = 14         -2, -1
    4   24 + min_{-4 ≤ u ≤ 1} {2(2 + u)²} = 24             -2
    5   75/2 + min_{-5 ≤ u ≤ 0} {2(5/2 + u)²} = 38         -3, -2

Stage k = 1:

    J(1, x) = min_{u ∈ U(1,x)} {f0(1, x, u) + J(2, f(1, x, u))}
            = x² + min_{-x ≤ u ≤ 5-x} {u² + J(2, x + u)}   (1.3)

    x   J(1, x)                                            u*
    0   0 + min_{0 ≤ u ≤ 5} {u² + J(2, 0 + u)} = 0         0
    1   1 + min_{-1 ≤ u ≤ 4} {u² + J(2, 1 + u)} = 2        -1
    2   4 + min_{-2 ≤ u ≤ 3} {u² + J(2, 2 + u)} = 7        -1
    3   9 + min_{-3 ≤ u ≤ 2} {u² + J(2, 3 + u)} = 15       -2
    4   16 + min_{-4 ≤ u ≤ 1} {u² + J(2, 4 + u)} = 26      -2
    5   25 + min_{-5 ≤ u ≤ 0} {u² + J(2, 5 + u)} = 40      -3

since the expressions inside the brackets, u² + J(2, x + u), are, for different x and u,

    x\u   -5     -4     -3     -2     -1     0      1      2      3      4      5
    0                                        0+0    1+2    4+6    9+14   16+24  25+38
    1                                 1+0    0+2    1+6    4+14   9+24   16+38
    2                          4+0    1+2    0+6    1+14   4+24   9+38
    3                   9+0    4+2    1+6    0+14   1+24   4+38
    4            16+0   9+2    4+6    1+14   0+24   1+38
    5     25+0   16+2   9+6    4+14   1+24   0+38

Stage k = 0: Note that x_0 = 5!

    J(0, x) = min_{u ∈ U(0,x)} {f0(0, x, u) + J(1, f(0, x, u))}
            = {x = 5} = 25 + min_{-5 ≤ u ≤ 0} {u² + J(1, 5 + u)} = 41,   (1.4)

given by u* = -3, since the expressions inside the brackets, u² + J(1, 5 + u), are, for x = 5 and different u,

    x   u = -5   -4     -3     -2     -1     0
    5   25+0     16+2   9+7    4+15   1+26   0+40

To summarize, the system evolves as follows:

    k   x_k*     u_k*       J_k*
    0   5        -3         41
    1   2        -1         7
    2   1        -1 or 0    2
    3   0 or 1   0          0 or 1

1.3 Apply the general dp algorithm to the given problem:

    J(N, x) = α^N φ(x)   ⟹   α^{-N} J(N, x) =: V(N, x) = φ(x),

    J(n, x) = min_{u ∈ U(n,x)} {α^n f0(n, x, u) + J(n+1, f(n, x, u))}
    ⟹   α^{-n} J(n, x) = min_{u ∈ U(n,x)} {f0(n, x, u) + α^{-n} J(n+1, f(n, x, u))}
    ⟹   α^{-n} J(n, x) =: V(n, x) = min_{u ∈ U(n,x)} {f0(n, x, u) + α · α^{-(n+1)} J(n+1, f(n, x, u))},

where α^{-(n+1)} J(n+1, f(n, x, u)) =: V(n+1, f(n, x, u)). In general, defining V(n, x) := α^{-n} J(n, x) yields

    V(N, x) = φ(x),
    V(n, x) = min_{u ∈ U(n,x)} {f0(n, x, u) + α V(n+1, f(n, x, u))}.

1.4 Define

    x_k = { 0 if on activity g_1 just before time t_k,
            1 if on activity g_2 just before time t_k,

    u_k = { 0 continue current activity,
            1 switch between activities.

The state transition is

    x_{k+1} = f(x_k, u_k) = (x_k + u_k) mod 2,

and the profit for stage k is

    f0(x, u) = ∫_{t_k}^{t_{k+1}} g_{1+f(x,u)}(t) dt - u_k c.

The dp algorithm is then

    J(N, x) = 0,
    J(n, x) = max_u {f0(x, u) + J(n+1, (x + u) mod 2)}.

1.5 The Bellman equation is given by

    V(x) = min_u {f0(x, u) + V(f(x, u))},   (1.1)

where

    f0(x, u) = x^T Q x + u^T R u,
    f(x, u) = A x + B u.

Assume that the cost is on the form V(x) = x^T P x, P > 0. Then the optimal control law is obtained by minimizing (1.1):

    μ(x) = arg min_u {f0(x, u) + V(f(x, u))}
         = arg min_u {x^T Q x + u^T R u + (Ax + Bu)^T P (Ax + Bu)}
         = arg min_u {x^T Q x + u^T R u + x^T A^T P A x + u^T B^T P B u + 2 x^T A^T P B u}.

The stationary point of the expression inside the brackets is obtained by setting the derivative to zero, thus

    2R u* + 2B^T P B u* + 2B^T P A x = 0   ⟹   u* = -(B^T P B + R)^{-1} B^T P A x,

which is a minimum since R > 0 and P > 0. Now the Bellman equation (1.1) can be expressed as

    x^T P x = x^T Q x + u*^T (R u* + B^T P B u* + B^T P A x) + x^T A^T P A x + x^T A^T P B u*
            = x^T Q x + x^T A^T P A x - x^T A^T P B (B^T P B + R)^{-1} B^T P A x,

where the parenthesis multiplying u*^T is zero, and this holds for all x if and only if the dare

    P = A^T P A - A^T P B (B^T P B + R)^{-1} B^T P A + Q   (1.2)

holds. Optimal feedback control and a globally convergent closed loop system are guaranteed by Theorem 2, assuming that there is a positive definite solution P to (1.2).

1.6 (a) 1. The Hamiltonian is

            H(k, x, u, λ) = f0(k, x, u) + λ^T f(k, x, u)
                          = x^T Q x + u^T R u + λ^T (Ax + Bu).

        2. Pointwise minimization yields

            0 = H_u(k, x_k*, u_k*, λ_{k+1}) = 2R u_k* + B^T λ_{k+1}
            ⟹   u_k* = -(1/2) R^{-1} B^T λ_{k+1}.   (1.1)

        3. The adjoint equations are

            λ_k = H_x(x_k*, u_k*, λ_{k+1}) = 2Q x_k* + A^T λ_{k+1},   k = 1, ..., N-1,
            λ_N = φ_x(x_N*) = 2 Q_N x_N*.   (1.2)

        Hence, we obtain

            x_{k+1} = A x_k - (1/2) B R^{-1} B^T λ_{k+1}.   (1.3)

        Thus, the minimizing control sequence is (1.1), where λ_{k+1} is the solution to the tpbvp in (1.2) and (1.3).

    (b) Assume that S_k is invertible for all k. Using

            λ_k = 2 S_k x_k   ⟹   x_k = (1/2) S_k^{-1} λ_k

        in (1.3) yields

            (1/2) S_{k+1}^{-1} λ_{k+1} = A x_k - (1/2) B R^{-1} B^T λ_{k+1}
            ⟹   λ_{k+1} = 2 (S_{k+1}^{-1} + B R^{-1} B^T)^{-1} A x_k.   (1.4)

        Insert this result in the adjoint equation (1.2):

            λ_k = 2 S_k x_k = 2 Q x_k + 2 A^T (S_{k+1}^{-1} + B R^{-1} B^T)^{-1} A x_k,
            S_N x_N = Q_N x_N,

        and since these equations must hold for all x_k we see that the backwards recursion for S_k is

            S_N = Q_N,
            S_k = A^T (S_{k+1}^{-1} + B R^{-1} B^T)^{-1} A + Q.

    (c) Now, the mil yields

            S_k = A^T (S_{k+1}^{-1} + B R^{-1} B^T)^{-1} A + Q
                = A^T (S_{k+1} - S_{k+1} B (R + B^T S_{k+1} B)^{-1} B^T S_{k+1}) A + Q
                = A^T S_{k+1} A - A^T S_{k+1} B (R + B^T S_{k+1} B)^{-1} B^T S_{k+1} A + Q.

        This recursion converges to the dare in (1.2) as N - k → ∞ if (A, B) is controllable and (A, C) is observable, where Q = C^T C.

    (d) By combining (1.1) and (1.4) we get

            u_k* = -R^{-1} B^T (S_{k+1}^{-1} + B R^{-1} B^T)^{-1} A x_k.   (1.5)

        Another matrix identity, also referred to as the mil, is

            BV (A + UBV)^{-1} = (B^{-1} + V A^{-1} U)^{-1} V A^{-1}.

        With this version of the mil we get

            u_k* = -(R + B^T S_{k+1} B)^{-1} B^T S_{k+1} A x_k,   (1.6)

        which converges to the optimal feedback for the infinite time horizon lqc problem as N - k → ∞.

2 Dynamic Programming

2.1 1. The Hamiltonian is given by

        H(t, x, u, λ) ≜ f0(t, x, u) + λ^T f(t, x, u)
                      = (x - cos t)² + u² + λu.

    2. Pointwise optimization yields

        μ(t, x, λ) ≜ arg min_{u ∈ R} H(t, x, u, λ)
                   = arg min_{u ∈ R} {(x - cos t)² + u² + λu} = -λ/2,

       and the optimal control is

        u*(t) ≜ μ(t, x*(t), V_x(t, x*(t))) = -(1/2) V_x(t, x*(t)),

       where V_x(t, x(t)) is obtained by solving the hjbe.

    3. The hjbe is given by

        -V_t = H(t, x, μ(t, x, V_x), V_x),   V(t_f, x) = φ(x),

       where V_t ≜ ∂V/∂t and V_x ≜ ∂V/∂x. In our case, this is equivalent to

        V_t + (x - cos t)² - V_x²/4 = 0,   V(t_f, x) = 0,

       which is a partial differential equation (pde) in t and x. These equations are quite difficult to solve directly and one often needs to utilize the structure of the problem. Since the original cost function

        f0(x(t), u(t)) ≜ (x(t) - cos t)² + u²(t)

       is quadratic in x(t), we make the guess

        V(t, x) ≜ P(t)x² + 2q(t)x + c(t).

       This yields

        V_t = Ṗ(t)x² + 2q̇(t)x + ċ(t),
        V_x = 2P(t)x + 2q(t).

       Now, substituting these expressions into the hjbe, we get

        Ṗ(t)x² + 2q̇(t)x + ċ(t) + (x - cos t)² - (2P(t)x + 2q(t))²/4 = 0,

       which can be rewritten as

        (Ṗ(t) + 1 - P(t)²)x² + 2(q̇(t) - cos t - P(t)q(t))x + ċ(t) + cos²t - q²(t) = 0.

       Since this equation must hold for all x, the coefficients need to be zero, i.e.,

        Ṗ(t) = -1 + P²(t),
        q̇(t) = cos t + P(t)q(t),
        ċ(t) = -cos²t + q²(t),

       which is the sought system of odes. The boundary conditions are given by

        P(t_f) = q(t_f) = c(t_f) = 0,

       which come from V(t_f, x) = 0.
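Numerically (a side remark, not part of the original solution), the odes run backwards from t_f, so a solution is conveniently obtained through the substitution s = t_f - t, which turns the terminal conditions into initial conditions; the value of t_f below is arbitrary.

    % Integrate the Riccati ODEs of Exercise 2.1 backwards via s = tf - t.
    tf  = 5;
    rhs = @(s,z) -[ -1 + z(1)^2; ...
                    cos(tf-s) + z(1)*z(2); ...
                    -cos(tf-s)^2 + z(2)^2 ];   % -(Pdot; qdot; cdot)
    [s, z] = ode45(rhs, [0 tf], [0; 0; 0]);    % P(tf) = q(tf) = c(tf) = 0
    % z(k,:) = (P, q, c) at time t = tf - s(k)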

2.2 (a) The problem is a maximization problem and can be formulated as

        minimize_{u(·)}  -x(t_f) - ∫_0^{t_f} (1 - u(t)) x(t) dt
        subject to       ẋ(t) = γ u(t) x(t),
                         x(0) = x_0 > 0,
                         u(t) ∈ [0, 1],   ∀t.

    (b) 1. The Hamiltonian is given by

            H(t, x, u, λ) ≜ f0(t, x, u) + λ^T f(t, x, u)
                          = -(1 - u)x + λγux.

        2. Pointwise optimization yields

            μ(t, x, λ) ≜ arg min_{u ∈ [0,1]} H(t, x, u, λ)
                       = arg min_{u ∈ [0,1]} {-(1 - u)x + λγux}
                       = arg min_{u ∈ [0,1]} {(1 + λγ)ux} - x
                       = { 1,   (1 + λγ)x < 0,
                           0,   (1 + λγ)x > 0,
                           ū,   (1 + λγ)x = 0,

           where ū is arbitrary in [0, 1]. Noting that x(t) > 0 for all t ≥ 0, we find that

            μ(t, x, λ) = { 1,   λ < -1/γ,
                           0,   λ > -1/γ,
                           ū,   λ = -1/γ.   (2.1)

        3. The hjbe is given by

            -V_t = H(t, x, μ(t, x, V_x), V_x),   V(t_f, x) = φ(x),

           where V_t ≜ ∂V/∂t and V_x ≜ ∂V/∂x. In our case, this may be written as

            V_t - (1 - μ(t, x, V_x))x + V_x γ μ(t, x, V_x) x = 0.

           By collecting μ(t, x, V_x) this may be written as

            V_t + (1 + γV_x) x μ(t, x, V_x) - x = 0,

           and inserting (2.1) yields

            { V_t + γ V_x x = 0,   V_x < -1/γ,
              V_t - x = 0,         V_x > -1/γ,   (2.2)

           with the boundary condition V(t_f, x) = φ(x) = -x. Both the equations above are linear in x, which suggests that V(t, x) = g(t)x for some function g(t). Substituting this into (2.2) yields

            { ġ(t)x + γ g(t)x = 0,   g(t) < -1/γ,
              ġ(t)x - x = 0,         g(t) > -1/γ.

           These equations are valid for every x > 0 and we can thus cancel the x term, which yields

            { ġ(t) + γ g(t) = 0,   g(t) < -1/γ,
              ġ(t) - 1 = 0,        g(t) > -1/γ.

           The solutions to the equations above are

            { g_1(t) = c_1 e^{-γ(t - t*)},   g(t) < -1/γ,
              g_2(t) = t + c_2,              g(t) > -1/γ,

           for some constants c_1, c_2 and t*. The boundary value is

            V(t_f, x) = -x   ⟹   g(t_f) = -1 > -1/γ   (since γ ∈ (0, 1)).

           This gives g(t) = g_2(t) = t + c_2 in the end. From g_2(t_f) = -1, we obtain c_2 = -1 - t_f and

            g(t) = t - 1 - t_f

           in some interval (t*, t_f]. Here, t* fulfills g_2(t*) = -1/γ, and hence

            t* = 1 - 1/γ + t_f,

           which is the time at which g switches. Now, use this as a boundary value for the solution g_1(t) = c_1 e^{-γ(t - t*)} of the other ode:

            g_1(t*) = -1/γ   ⟹   c_1 = -1/γ.

           Thus, we have

            g(t) = { -(1/γ) e^{-γ(t - t*)},   t < t*,
                     t - 1 - t_f,             t > t*,

           where t* = 1 - 1/γ + t_f. The optimal cost function is given by

            V(t, x) = { -(1/γ) e^{-γ(t - t*)} x,   t < t*   (u* = 1),
                        (t - 1 - t_f)x,            t > t*   (u* = 0),

           where t* = 1 - 1/γ + t_f. Thus, the optimal control strategy is

            u*(t) = μ(t, x*(t), V_x(t, x*(t))) = { 1,   t < t*,
                                                   0,   t > t*,

           and we are done. Note also that V_x and V_t are continuous at the switch boundary, thus fulfilling the C¹ condition on V(t, x) of the verification theorem.

3 PMP: A Special Case

3.1 The Hamiltonian is given by

    H(t, x, u, λ) ≜ f0(t, x, u) + λ^T f(t, x, u)
                  = x + u² + λx + λu + λ.

Pointwise minimization yields

    μ(t, x, λ) ≜ arg min_u H(t, x, u, λ) = -λ/2,

and the optimal control can be written as

    u*(t) ≜ μ(t, x*(t), λ(t)) = -λ(t)/2.

The adjoint equation is given by

    λ̇(t) = -λ(t) - 1,

which is a first order linear ode and can thus be solved easily. The standard integrating factor method yields

    λ(t) = C e^{-t} - 1,

for some constant C. The boundary condition is given by

    λ(T) = φ_x(x*(T)) = 0.

Thus,

    λ(t) = e^{T-t} - 1,

and the optimal control is found by

    u*(t) = -λ(t)/2 = (1 - e^{T-t})/2.

3.2 (a) Since

        J = ∫_0^1 ẏ dt = y(1) - y(0) = 1,

    it holds that all y(·) ∈ C¹[0, 1] such that y(0) = 0 and y(1) = 1 are minimal.

    (b) Since

        J = ∫_0^1 y ẏ dt = (1/2) ∫_0^1 (d/dt)(y²) dt = (1/2)(y(1)² - y(0)²) = 1/2,

    it holds that all y(·) ∈ C¹[0, 1] such that y(0) = 0 and y(1) = 1 are minimal.

3.3 (a) Define x = y, u = ẏ and f0(t, x, u) = x² + u² - 2x sin t; then the Hamiltonian is given by

        H(t, x, u, λ) = x² + u² - 2x sin t + λu.

    The following equations must hold:

        y(0) = 0,   (3.1a)
        0 = H_u(t, x, u, λ) = 2u + λ,   (3.1b)
        λ̇ = -H_x(t, x, u, λ) = -2x + 2 sin t,   (3.1c)
        λ(1) = 0.   (3.1d)

    Equation (3.1b) gives λ = -2u = -2ẏ, and with (3.1c) we have

        ÿ - y = -sin t,

    with the solution

        y(t) = c_1 e^t + c_2 e^{-t} + (1/2) sin t,   c_1, c_2 ∈ R,

    where (3.1a) gives the relationship between c_1 and c_2 as

        c_2 = -c_1,

    and where (3.1d) gives

        c_2 = c_1 e² + (e/2) cos 1.

    Consequently, we have

        y(t) = -(e cos 1)/(2(e² + 1)) (e^t - e^{-t}) + (1/2) sin t.

    (b) Define x = y, u = ẏ and f0(t, x, u) = u²/t³; then the Hamiltonian is given by

        H(t, x, u, λ) = u²/t³ + λu.

    The following equations must hold:

        y(0) = 0,   (3.2a)
        0 = H_u(t, x, u, λ) = 2u/t³ + λ,   (3.2b)
        λ̇ = -H_x(t, x, u, λ) = 0,   (3.2c)
        λ(1) = 0.   (3.2d)

    Equation (3.2c) gives λ = c_1, c_1 ∈ R, and (3.2d) that c_1 = 0. Then (3.2b) gives u(t) = 0, which implies y(t) = c_2, c_2 ∈ R. Finally, with (3.2a) this gives

        y(t) = 0.

    (c) Analogous to (a). The system equation is

        ÿ - y = e^t,

    with the solution

        y(t) = c_1 e^t + c_2 e^{-t} + (1/2) t e^t,

    where

        c_2 = -c_1,

    and where

        c_2 = e²(c_1 + 1),

    which gives

        y(t) = -e²/(e² + 1) (e^t - e^{-t}) + (1/2) t e^t.

3.4 Define

    z(t) = ∫_{-a}^{t} √(1 + ẋ(s)²) ds,

and let ẋ = u. Then an equivalent problem is

    minimize_{u(·)}  -∫_{-a}^{a} x(t) dt
    subject to       ẋ(t) = u(t),
                     x(-a) = 0,
                     x(a) = 0,
                     ż(t) = √(1 + u(t)²),
                     z(-a) = 0,
                     z(a) = l.

Thus, the Hamiltonian is

    H = -x + λ_1 u + λ_2 √(1 + u²),

and the following equations must hold:

    λ̇_1 = -H_x = 1,   (3.1a)
    λ̇_2 = -H_z = 0,   (3.1b)
    0 = H_u = λ_1 + λ_2 u (1 + u²)^{-1/2}.   (3.1c)

Equations (3.1a) and (3.1b) give λ_1 = t + c_1 and λ_2 = c_2, where c_1 and c_2 are real constants. These solutions inserted in (3.1c) give

    (t + c_1)(1 + u²)^{1/2} + u c_2 = 0,

and after some rearrangement and taking the square we have

    u²(c_2² - (t + c_1)²) = (t + c_1)².   (3.2)

Note that c_2² - (t + c_1)² ≤ 0 leads to the unrealistic requirement t = -c_1. Hence, we can assume that c_2² - (t + c_1)² > 0 and rewrite (3.2) as

    ẋ = u = ±(t + c_1)/√(c_2² - (t + c_1)²).

The solution is

    x(t) = ∓√(c_2² - (t + c_1)²) + c_3,

where c_3 is constant, and this can be rewritten as the equation of a circle with radius |c_2| and center in (t, x(t)) = (-c_1, c_3), i.e.,

    (x(t) - c_3)² + (t + c_1)² = c_2².

The requirements x(-a) = 0 and x(a) = 0 give

    c_3² + (-a + c_1)² = c_2²,
    c_3² + (a + c_1)² = c_2²,

respectively. Thus, c_1 = 0 and c_3² + a² = c_2². We have three different cases:

l < 2a: No solution, since the distance between (-a, 0) and (a, 0) is longer than the length of the curve l.

2a ≤ l ≤ πa: Let α define the angle from the x(t)-axis to the vector from the circle center to the point (a, 0), i.e., a = c_2 sin α, see Figure 3.4a(a). Furthermore, the length of the circle segment curve is l = 2αc_2. Thus

    sin(l/(2c_2)) = a/c_2,

which can be used to determine c_2.

πa < l: As the previous case, but the length of the circle segment curve is now l = 2(π - α)c_2, see Figure 3.4a(b).

3.5 By introducing x_1 = x, x_2 = y, u = φ and redefining x ≜ (x_1, x_2)^T and φ(T, x(T)) = -x_1(T), the problem at hand can be written on standard form as

    minimize_{u(·)}  -x_1(T)
    subject to       ẋ_1(t) = γ cos(u(t)) + g(x_2(t)),
                     ẋ_2(t) = γ sin(u(t)),
                     x_1(0) = 0,   x_2(0) = 0,
                     u(t) ∈ R,   t ∈ [0, T].

    Figure 3.4a. Optimal curve enclosing the largest area in the upper half plane for the two cases (a) 2a ≤ l ≤ πa and (b) πa < l.

    The Hamiltonian is given by

    H(t, x, u, λ) ≜ f0(t, x, u) + λᵀf(t, x, u)
    = λ1(cos(u) + g(x2)) + λ2 sin(u)
    = λ1 g(x2) + λ1 cos(u) + λ2 sin(u).

    Pointwise optimization yields

    μ̃(t, x, λ) ≜ argminᵤ H(t, x, u, λ)
    = argminᵤ { λ1 cos(u) + λ2 sin(u) }
    = argminᵤ { √(λ1² + λ2²) sin(u + φ) },   ( sin φ = λ1/√(λ1² + λ2²), cos φ = λ2/√(λ1² + λ2²) )

    so the minimizing argument satisfies u + φ = −π/2. Hence,

    tan(u(t)) = sin(−φ(t) − π/2) / cos(−φ(t) − π/2) = cos φ(t) / sin φ(t) = λ2(t)/λ1(t).   (3.1)

    Now, the adjoint equations are

    λ̇1(t) = −∂H/∂x1 (x(t), u(t), λ(t)) = 0,
    λ̇2(t) = −∂H/∂x2 (x(t), u(t), λ(t)) = −g′(x2(t)) λ1(t).

    The first equation yields λ1(t) = c1 for some constant c1. The boundary constraints are given by

    λ(T) − ∂φ/∂x (T, x(T)) ⊥ Sf(T)  ⟹  (λ1(T) + 1, λ2(T))ᵀ = (0, 0)ᵀ,

    and hence λ1(t) ≡ −1. Substituting this into the expression for the optimal control law (3.1) yields

    tan(u(t)) = −λ2(t).

    What remains is to take care of λ2(t), for which no explicit solution can be found. To this end, we need the following general result. A system is said to be autonomous if it does not have a direct dependence on the time variable t, i.e., ẋ(t) = f(x(t), u(t)). On the optimal trajectory, the Hamiltonian for an autonomous system has the property

    H(x(t), u(t), λ(t)) = 0 for all t, if the final time is free,
    H(x(t), u(t), λ(t)) = constant for all t, if the final time is fixed;

    see Chapter 6 in the course compendium for details. In this example the final time is fixed. Thus,

    constant = H(x(t), u(t), λ(t))
    = λ1(t)(cos(u(t)) + g(x2(t))) + λ2(t) sin(u(t))
    = −cos(u(t)) − g(x2(t)) − tan(u(t)) sin(u(t))
    = −g(x2(t)) − (cos(u(t)) + tan(u(t)) sin(u(t)))
    = −g(x2(t)) − 1/cos(u(t)),

    and by dividing the above relation by −1 the desired result is attained.

    3.6 (a) 1. Firstly, let us derive the Euler-Lagrange equations. Adjoining the constraint q̇ = u to the cost function yields

    ∫ₜᵢ^tf ( (∂L/∂u) δu + (∂L/∂q) δq + τᵀ δq + λᵀ(δu − δq̇) ) dt = 0.

    Integrating the last term by parts yields

    ∫ₜᵢ^tf ( (∂L/∂u + λᵀ) δu + (∂L/∂q + τᵀ + λ̇ᵀ) δq ) dt = 0.

    To make the entire integral vanish, we choose

    λ̇ᵀ = −∂L/∂q − τᵀ,
    λᵀ = −∂L/∂u.

    Combining the above, we have

    d/dt (∂L/∂q̇) − ∂L/∂q = τᵀ.

    2. Define the generalized coordinates as q1 = z and q2 with q̇2 = i, and the generalized force τ2 = u(t). The Lagrangian is L = T − V, where the kinetic energy is the sum of the mechanical and the electrical kinetic energies,

    T = (1/2) m q̇1² + (1/2) l(q1) q̇2²,

    and the potential energy is the spring energy,

    V = (1/2) k q1².

    Euler-Lagrange's equations are

    d/dt (∂L/∂q̇1) − ∂L/∂q1 = 0,
    d/dt (∂L/∂q̇2) − ∂L/∂q2 = u(t),

    and this gives the system equations

    m q̈1 + k q1 − (1/2) q̇2² (∂l/∂q1)(q1) = 0,
    d/dt ( l(q1) q̇2 ) = u(t).

    (b) The Rayleigh dissipative energy is

    F(q, q̇) = (1/2) d q̇1² + (1/2) R q̇2².

    Euler-Lagrange's equations of motion are

    d/dt (∂L/∂q̇1) − ∂L/∂q1 = 0 − ∂F/∂q̇1,
    d/dt (∂L/∂q̇2) − ∂L/∂q2 = u(t) − ∂F/∂q̇2,

    and this gives the system equations

    m q̈1 + d q̇1 + k q1 − (1/2) q̇2² (∂l/∂q1)(q1) = 0,
    d/dt ( l(q1) q̇2 ) + R q̇2 = u(t).

    3.7 The Hamiltonian is given by

    H(t, x, u, λ) ≜ f0(t, x, u) + λᵀf(t, x, u) = (1/2)u² + λu.

    Pointwise minimization yields

    μ̃(t, x, λ) ≜ argminᵤ H(t, x, u, λ) = −λ,

    and the optimal control can be written as

    u(t) ≜ μ̃(t, x(t), λ(t)) = −λ(t).

    The adjoint equation is given by

    λ̇(t) = −∂H/∂x (x(t), u(t), λ(t)) = 0,

    which means that λ is constant. The boundary condition is given by

    λ(tf) = ∂φ/∂x (x(tf)) = x(tf)  ⟹  λ(t) ≡ x(tf).

    The final state x(tf) is not given; however, it can be calculated as a function of x(t), assuming that the optimal control signal u(t) = −λ(t) = −x(tf) is applied from t to tf. This will also give us a feedback solution:

    x(tf) = x(t) + ∫ₜ^tf ẋ(s) ds = x(t) − ∫ₜ^tf x(tf) ds = x(t) − x(tf)(tf − t).

    Thus,

    x(tf) = x(t)/(1 + tf − t),

    and the optimal control is found by

    u(t) = −λ(t) = −x(tf) = −x(t)/(1 + tf − t).
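    As a sanity check, the feedback law can be simulated; the closed-loop solution is x(t) = x0(1 + tf − t)/(1 + tf), so the simulated final state should equal x0/(1 + tf). A small sketch with made-up values tf = 2 and x0 = 1:

    tf = 2; x0 = 1;
    [t, x] = ode45(@(t, x) -x/(1 + tf - t), [0 tf], x0);
    [x(end), x0/(1 + tf)]   % simulated and predicted x(tf) agree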

    3.8 (a) Define

    ũ ≜ (u, w)ᵀ.

    With

    f0(x, ũ) ≜ 2x² + u² − γ²w²  and  f(x, ũ) ≜ u + w,

    the Hamiltonian is given by

    H(x, ũ, λ) ≜ f0(x, ũ) + λᵀf(x, ũ) = 2x² + u² − γ²w² + λ(u + w).

    Pointwise optimization yields

    0 = ∂H/∂ũ (x, ũ, λ) = ( 2u + λ, −2γ²w + λ )ᵀ  ⟹  μ̃(t, x, λ) = ( −λ/2, λ/(2γ²) )ᵀ.

    The tpbvp is given by

    λ̇(t) = −∂H/∂x (x(t), μ̃(x(t), λ(t)), λ(t)) = −4x(t),   λ(T) = 0,   (3.1)
    ẋ(t) = ∂H/∂λ (x(t), μ̃(t, x(t), λ(t)), λ(t)) = (1/(2γ²) − 1/2) λ(t),   x(0) = x0.   (3.2)

    (b) Differentiation of (3.2) yields

    ẍ(t) = (1/(2γ²) − 1/2) λ̇(t) = −4(1/(2γ²) − 1/2) x(t),

    where the last equality uses (3.1). This is a second order homogeneous ode,

    ẍ(t) + 2(γ⁻² − 1) x(t) = 0.

    The characteristic polynomial

    h² + 2(γ⁻² − 1) = 0

    has solutions on the imaginary axis (since 0 < γ < 1) given by

    h = ±i √(2(γ⁻² − 1)).

    Thus, the solution is

    x(t) = A sin(rt) + B cos(rt),

    for some constants A and B, where r = √(2(γ⁻² − 1)). Now, the boundary condition λ(T) = 0 implies that ẋ(T) = 0 (see (3.2)), and thus

    [ sin(rt), cos(rt); r cos(rT), −r sin(rT) ] (A; B) = ( x(t); 0 ),

    which has the solution

    (A; B) = 1/(−r sin(rt) sin(rT) − r cos(rt) cos(rT)) [ −r sin(rT), −cos(rt); −r cos(rT), sin(rt) ] ( x(t); 0 )
    = x(t)/(sin(rt) sin(rT) + cos(rt) cos(rT)) ( sin(rT); cos(rT) )
    = x(t)/cos(r(T − t)) ( sin(rT); cos(rT) ).

    Finally, the optimal control is given by

    u(t) = −λ(t)/2 = −(1/2) (2γ²/(1 − γ²)) ẋ(t)   (by (3.2))
    = γ²/(γ² − 1) ( rA cos(rt) − rB sin(rt) )
    = γ²r/(γ² − 1) ( sin(rT) cos(rt) − cos(rT) sin(rt) )/cos(r(T − t)) x(t)
    = γ²r/(γ² − 1) tan(r(T − t)) x(t).

    (c) Since tan(·) is defined everywhere on [0, π/2) and r(T − t) ≤ rT for all t ∈ [0, T], it must hold that rT < π/2. Thus, for

    √(2(γ⁻² − 1)) T < π/2  ⟺  γ > 1/√(1 + (1/2)(π/(2T))²),

    the control signal is defined for all t ∈ [0, T].
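    The bound in (c) is easy to check numerically; a small sketch with test values T = 1 and γ = 0.9 (the two logical conditions below are equivalent and should agree):

    T = 1; gamma = 0.9;
    r = sqrt(2*(gamma^(-2) - 1));
    gamma_min = 1/sqrt(1 + 0.5*(pi/(2*T))^2);
    [r*T < pi/2, gamma > gamma_min]   % equivalent conditions, here both true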

    4 PMP: General Results

    4.1 The problem can be rewritten to standard form as

    minimize over u(·):  −x(T) − ∫₀ᵀ (1 − u(t)) x(t) dt
    subject to  ẋ(t) = γu(t)x(t),   0 < γ < 1,
                x(0) = x0 > 0,
                0 ≤ u(t) ≤ 1, t ∈ [0, T].

    The Hamiltonian is given by

    H(t, x, u, λ) ≜ f0(t, x, u) + λᵀf(t, x, u) = −(1 − u)x + λγux.

    Pointwise optimization of the Hamiltonian yields

    μ̃(t, x, λ) ≜ argmin over u ∈ [0,1] of H(t, x, u, λ)
    = argmin over u ∈ [0,1] of { −(1 − u)x + λγux }
    = argmin over u ∈ [0,1] of { (1 + γλ)ux }
    =
      1,  if (1 + γλ)x < 0,
      0,  if (1 + γλ)x > 0,
      ū,  if (1 + γλ)x = 0,

    where ū is an arbitrary value in [0, 1]. To be able to find an analytical solution, it is important to remove the variable x from the pointwise optimal solution above. Otherwise we are going to end up with a pde, which is difficult to solve. In this particular case it is simple: since x0 > 0, γ > 0 and u(t) ≥ 0, the system equation gives x(t) > 0 for all t ∈ [0, T]. Hence, the optimal control is given by

    u(t) ≜ μ̃(t, x(t), λ(t)) =
      1,  if σ(t) < 0,
      0,  if σ(t) > 0,
      ū,  if σ(t) = 0,

    where we define the switching function as

    σ(t) ≜ 1 + γλ(t).

    The adjoint equation is now given by

    λ̇(t) = −∂H/∂x (t, x(t), u(t), λ(t)) = (1 − u(t)) − γλ(t)u(t)
    =
      −γλ(t),  if σ(t) < 0  (u(t) = 1),
      1,       if σ(t) > 0  (u(t) = 0),
      1,       if σ(t) = 0.   (4.1)

    The boundary constraints are

    λ(T) − ∂φ/∂x (T, x(T)) ⊥ Sf(T)  ⟹  λ(T) + 1 = 0,

    which implies that λ(T) = −1. Thus,

    σ(T) = 1 + γλ(T) = 1 − γ > 0,

    so that u(T) = 0. What remains to be determined is how many switches occur. A hint of the number of switches can often be found by considering the value of σ̇(t)|σ(t)=0. From (4.1), it follows that

    σ̇(t)|σ(t)=0 = γλ̇(t)|1+γλ(t)=0 = γ > 0.

    Hence, there can be at most one switch, since σ(t) can pass zero only once. Since u(T) = 0, it is not possible that u(t) = 1 for all t ∈ [0, T]. Thus,

    u(t) =
      1,  0 ≤ t ≤ t̄,
      0,  t̄ < t ≤ T,

    for some unknown switching time t̄ ∈ [0, T]. The switching occurs when

    0 = σ(t̄) = 1 + γλ(t̄),

    and to find the value of t̄ we need to determine the value of λ(t̄). From (4.1), it holds that (for σ(t) ≥ 0)

    λ̇(t) = 1,   λ(T) = −1,

    which has the solution λ(t) = t − T − 1. Since σ(t̄) = 0 is equivalent to λ(t̄) = −1/γ, we get

    t̄ = T + 1 − 1/γ,

    and the optimal control is thus given by

    u(t) =
      1,  0 ≤ t ≤ T + 1 − 1/γ,
      0,  T + 1 − 1/γ < t ≤ T.

    It is worth noting that when γ is small, t̄ becomes negative, and the optimal control law is then u(t) = 0 for all t ∈ [0, T].
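    For concreteness, the switching time is readily evaluated; γ and T below are made-up values, and the second example illustrates the small-γ case where no switch occurs inside [0, T]:

    gamma = 0.8; T = 2;
    t_bar = T + 1 - 1/gamma   % 1.75, so u = 1 on [0, 1.75] and u = 0 after
    gamma = 0.3;
    t_bar = T + 1 - 1/gamma   % negative, so u = 0 on all of [0, T]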

    4.2 The problem to be solved is

    minimize over u(·):  ∫₀ᵀ u(t) dt
    subject to  ẋ1(t) = x2(t),
                ẋ2(t) = c u(t)/x3(t) − g(1 − k x1(t)),
                ẋ3(t) = −u(t),
                x1(0) = h, x2(0) = ν, x3(0) = m,
                x1(T) = 0, x2(T) = 0,
                0 ≤ u(t) ≤ M.

    The Hamiltonian is given by

    H(t, x, u, λ) ≜ f0(t, x, u) + λᵀf(t, x, u)
    = u + λ1x2 + λ2( cu/x3 − g(1 − kx1) ) − λ3u
    = λ1x2 − λ2g(1 − kx1) + ( 1 + cλ2/x3 − λ3 )u.

    Pointwise minimization yields

    μ̃(t, x, λ) ≜ argmin over u ∈ [0, M] of H(t, x, u, λ)
    = argmin over u ∈ [0, M] of ( 1 + cλ2/x3 − λ3 )u
    =
      M,  if σ < 0,
      0,  if σ > 0,
      ū,  if σ = 0,

    where ū ∈ [0, M] is arbitrary. Thus, the optimal control is expressed by

    u(t) ≜ μ̃(t, x(t), λ(t)) =
      M,  if σ(t) < 0,
      0,  if σ(t) > 0,
      ū,  if σ(t) = 0,

    where the switching function is given by

    σ(t) ≜ 1 + cλ2(t)/x3(t) − λ3(t).

    The adjoint equations are

    λ̇1(t) = −∂H/∂x1 (x(t), u(t), λ(t)) = −gk λ2(t),
    λ̇2(t) = −∂H/∂x2 (x(t), u(t), λ(t)) = −λ1(t),
    λ̇3(t) = −∂H/∂x3 (x(t), u(t), λ(t)) = c λ2(t) u(t)/x3²(t).

    The boundary conditions are given by

    λ(T) − ∂φ/∂x (T, x(T)) ⊥ Sf(T)  ⟹  ( λ1(T), λ2(T), λ3(T) )ᵀ = μ1(1, 0, 0)ᵀ + μ2(0, 1, 0)ᵀ,

    which says that λ3(T) = 0, while λ1(T) and λ2(T) are free. This yields no particular information about the solution to the adjoint equations.

    Now, let us try to determine the number of switches. Using the state dynamics (ẋ3 = −u) and the adjoint equations, it follows that

    σ̇(t) = c λ̇2(t)/x3(t) − c λ2(t) ẋ3(t)/x3²(t) − λ̇3(t)
    = −c λ1(t)/x3(t) + c λ2(t) u(t)/x3²(t) − c λ2(t) u(t)/x3²(t)
    = −c λ1(t)/x3(t).

    Since c > 0 and x3(t) ≥ 0, it holds that sign σ̇(t) = −sign λ1(t), and we need to know which values λ1(t) can take. From the adjoint equations, we have

    λ̈1(t) = −gk λ̇2(t) = gk λ1(t),

    which has the solution

    λ1(t) = A e^(√(gk) t) + B e^(−√(gk) t).

    Now, if A and B have the same signs, λ1(t) will never reach zero. If A and B have opposite signs, λ1(t) has one isolated zero. This implies that σ̇(t) has at most one isolated zero. Thus, σ(t) can pass zero at most twice, and the optimal control is bang-bang. The only possible sequences are thus {M, 0, M}, {0, M} and {M} (since the spacecraft should be brought to rest, all sequences ending with a 0 are ruled out).

    4.3 By introducing x1 = x, x2 = y and redefining x ≜ (x1, x2)ᵀ, the problem at hand can be written on standard form as

    minimize over u(·):  −x2(T)
    subject to  ẋ1(t) = (au(t) − b)x1(t),
                ẋ2(t) = c(1 − u(t))x1(t),
                x1(0) = 1, x2(0) = 0,
                u(t) ∈ [0, 1], t ∈ [0, T].

    The Hamiltonian is given by

    H(t, x, u, λ) ≜ f0(t, x, u) + λᵀf(t, x, u)
    = λ1(au − b)x1 + λ2c(1 − u)x1
    = −λ1bx1 + λ2cx1 + (aλ1 − cλ2)x1u.

    Pointwise minimization yields

    μ̃(t, x, λ) ≜ argmin over u ∈ [0, 1] of H(t, x, u, λ)
    = argmin over u ∈ [0, 1] of (aλ1 − cλ2)x1u
    =
      1,  if (aλ1 − cλ2)x1 < 0,
      0,  if (aλ1 − cλ2)x1 > 0,
      ū,  if (aλ1 − cλ2)x1 = 0,

    where ū ∈ [0, 1] is arbitrary. Now, since the number of worker bees x1(t) must be positive, we define the switching function as

    σ(t) ≜ aλ1(t) − cλ2(t).

    Thus, the optimal control is expressed by

    u(t) ≜ μ̃(t, x(t), λ(t)) =
      1,  if σ(t) < 0,
      0,  if σ(t) > 0,
      ū,  if σ(t) = 0.

    The adjoint equations are

    λ̇1(t) = −∂H/∂x1 (x(t), u(t), λ(t)) = −λ1(t)(au(t) − b) − λ2(t)c(1 − u(t)),
    λ̇2(t) = −∂H/∂x2 (x(t), u(t), λ(t)) = 0.

    The boundary conditions are given by

    λ(T) − ∂φ/∂x (T, x(T)) ⊥ Sf(T)  ⟹  ( λ1(T), λ2(T) + 1 )ᵀ = ( 0, 0 )ᵀ,

    which yields λ1(T) = 0 and λ2(T) = −1. Substituting this into the adjoint equations implies that λ2(t) ≡ −1, and thus

    σ(t) = aλ1(t) + c.

    This yields

    σ(T) = aλ1(T) + c = c > 0,

    so that u(T) = 0. To get the number of switches, consider

    σ̇(t) = aλ̇1(t) = a( −λ1(t)(au(t) − b) − λ2(t)c(1 − u(t)) )   (λ2 ≡ −1)
    = a( −λ1(t)(au(t) − b) + c(1 − u(t)) ),

    which implies that

    σ̇(t)|σ(t)=0 = σ̇(t)|λ1(t)=−c/a = a( (c/a)(au(t) − b) + c(1 − u(t)) ) = c(a − b).

    We have three cases:

    i) a < b: σ(t) crosses zero from positive to negative at most once. Since we have already shown that σ(T) > 0, it follows that no switch is present, and thus u(t) = 0 for all t ∈ [0, T] (since u(T) = 0).

    ii) a > b: σ(t) crosses zero from negative to positive at most once. Thus

    u(t) =
      1,  0 ≤ t ≤ t̄,
      0,  t̄ < t ≤ T,

    for some unknown switching time t̄. To find t̄, note that on [t̄, T] we have u(t) = 0, so the adjoint equation simplifies to

    λ̇1(t) = −λ1(t)(au(t) − b) − λ2(t)c(1 − u(t)) = bλ1(t) + c,

    with the boundary constraint λ1(T) = 0. This first order boundary value problem has the solution

    λ1(t) = (c/b)( e^(b(t−T)) − 1 ).

    For t = t̄, we have σ(t̄) = 0, and thus

    −c/a = λ1(t̄) = (c/b)( e^(b(t̄−T)) − 1 ),

    which can be rewritten as

    t̄ = T + (1/b) log(1 − b/a).

    iii) a = b: We cannot say anything about the number of switches from σ̇|σ=0. Instead, solve

    λ̇1(t) = aλ1(t) + c,   λ1(T) = 0   ⟹   λ1(t) = (c/a)( e^(a(t−T)) − 1 ) ≠ −c/a,

    since e^(a(t−T)) > 0. Hence, there is no switch, and u(t) = 0, t ∈ [0, T].
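    A numerical illustration of case ii); the parameter values are examples only, and the check confirms that the switching function vanishes at the computed t̄:

    a = 2; b = 1; c = 1; T = 5;
    t_bar = T + (1/b)*log(1 - b/a)              % approx 4.31
    lambda1 = @(t) (c/b)*(exp(b*(t - T)) - 1);  % valid on [t_bar, T]
    sigma = @(t) a*lambda1(t) + c;
    sigma(t_bar)                                % ~0 at the switch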

    4.4 The problem can be written on standard form as

    minimize over u(·):  ∫₀^tf 1 dt
    subject to  ẋ1(t) = x2(t),
                ẋ2(t) = u(t),
                x1(0) = x1i, x2(0) = x2i,
                x1(tf) = 0, x2(tf) = 0,
                u(t) ∈ [−1, 1], t ∈ [0, tf].

    The Hamiltonian is given by

    H(t, x, u, λ) ≜ f0(t, x, u) + λᵀf(t, x, u) = 1 + λ1x2 + λ2u.

    Pointwise optimization of the Hamiltonian yields

    μ̃(t, x, λ) ≜ argmin over u ∈ [−1, 1] of H(t, x, u, λ)
    = argmin over u ∈ [−1, 1] of λ2u
    =
      1,   if λ2 < 0,
      −1,  if λ2 > 0,
      ū,   if λ2 = 0,

    where ū is an arbitrary value in [−1, 1]. Hence, the optimal control candidate is given by

    u(t) ≜ μ̃(t, x(t), λ(t)) =
      1,   if λ2(t) < 0,
      −1,  if λ2(t) > 0,
      ū,   if λ2(t) = 0,

    and we define the switching function as

    σ(t) ≜ λ2(t).

    The adjoint equations are now given by

    λ̇1(t) = −∂H/∂x1 (x(t), u(t), λ(t)) = 0,
    λ̇2(t) = −∂H/∂x2 (x(t), u(t), λ(t)) = −λ1(t).

    The first equation yields λ1(t) = c1 for some constant c1. Substituting this into the second equation yields λ2(t) = −c1t + c2 for some constant c2. The boundary conditions are given by

    λ(tf) − ∂φ/∂x (tf, x(tf)) ⊥ Sf(tf)  ⟹  ( λ1(tf), λ2(tf) )ᵀ = μ1(1, 0)ᵀ + μ2(0, 1)ᵀ,

    which does not add any information about the constants c1 and c2. So, how many switches occur? Since σ̇(t) = −c1 is constant, it follows that at most one switch can occur, and therefore

    u(t) =
      −1,  0 ≤ t < t̄,
      1,   t̄ ≤ t ≤ tf,
    or
    u(t) =
      1,   0 ≤ t < t̄,
      −1,  t̄ ≤ t ≤ tf,

    for some t̄ ∈ [0, tf]. We have two cases:

    i) When u(t) = 1, we get ẋ1 = x2 and ẋ2 = 1, so that

    ẋ1/ẋ2 = dx1/dx2 = x2   ⟹   x1 + c3 = (1/2)x2²,

    for some constant c3. The contour curves for some different values of c3 are depicted in Figure 4.4a(a).

    ii) When u(t) = −1, we get ẋ1 = x2 and ẋ2 = −1, so that

    ẋ1/ẋ2 = dx1/dx2 = −x2   ⟹   x1 + c4 = −(1/2)x2²,

    for some constant c4. The contour curves for some different values of c4 are depicted in Figure 4.4a(b).

    We know that u(t) = ±1 with at most one switch. Hence, we must approach the origin along the trajectories depicted in Figure 4.4a(c), for which the control is defined by

    u(t) = −sgn{ x1(t) + (1/2) sgn{x2(t)} x2²(t) }.
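    The feedback law is easy to simulate directly; the sketch below uses the example initial state (3, 0) and plain ode45 without event detection, so after the optimal final time the solution just chatters around the origin:

    ctrl = @(x) -sign(x(1) + 0.5*sign(x(2))*x(2)^2);
    f = @(t, x) [x(2); ctrl(x)];
    [t, x] = ode45(f, [0 5], [3; 0]);
    plot(x(:,1), x(:,2)), xlabel('x_1'), ylabel('x_2')   % two parabolic arcs as in Figure 4.4a(c)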

    Figure 4.4a. Contour plots of the solutions: (a) the trajectories for u = 1, (b) the trajectories for u = −1, and (c) the trajectories approaching the origin, with the regions where u = −1 and u = +1.

    4.5 (a) First consider the time optimal case, i.e., the problem is given by

    minimize over v(·):  ∫₀^tf dt.

    The Hamiltonian is

    H(t, x, v, λ) ≜ f0(t, x, v) + λᵀf(t, x, v)
    = 1 + λ1x3 + λ2x4 + λ3 (c/m) u cos v + λ4 (c/m) u sin v.

    A necessary condition for the Hamiltonian to be minimal is

    0 = ∂H/∂v (t, x, v, λ) = (c/m) u ( −λ3 sin v + λ4 cos v )  ⟹  tan v = λ4/λ3.

    Thus, the optimal control satisfies

    tan( v(t) ) = λ4(t)/λ3(t).   (4.1)

    The adjoint equations are given by

    λ̇1(t) = −∂H/∂x1 (t, x(t), v(t), λ(t)) = 0,
    λ̇2(t) = −∂H/∂x2 (t, x(t), v(t), λ(t)) = 0,
    λ̇3(t) = −∂H/∂x3 (t, x(t), v(t), λ(t)) = −λ1(t),
    λ̇4(t) = −∂H/∂x4 (t, x(t), v(t), λ(t)) = −λ2(t).   (4.2)

    The first two equations yield

    λ1(t) = c1  and  λ2(t) = c2,

    for some constants c1 and c2. Substituting this into the last two equations yields

    λ3(t) = −c1t + c3,   λ4(t) = −c2t + c4,

    for some constants c3 and c4. Finally, the optimal control law (4.1) satisfies

    tan v(t) = λ4(t)/λ3(t) = (c4 − c2t)/(c3 − c1t),

    which is the desired result.

    Now, consider the problem given by

    minimize over v(·):  φ(x(tf)).

    The solution to this problem is quite similar to the one already solved; the only differences are that the Hamiltonian does not contain the constant 1, and that the boundary value conditions for λ(t), when solving the adjoint equations, are different. This does not affect the optimal solution v(t) more than changing the values of the constants c1, ..., c4. Thus, the same expression is attained.

    (b) The optimal control problem can be stated as

    minimize over v(·):  −x3(tf)
    subject to  ẋ1(t) = x3(t),
                ẋ2(t) = x4(t),
                ẋ3(t) = (c/m) u(t) cos v(t),
                ẋ4(t) = (c/m) u(t) sin v(t),
                x2(tf) = x2f, x4(tf) = 0.

    The first steps of the solution in (a) are still valid, and the solution to the adjoint equations is the same. The difference lies in the constraints on the constants c1, ..., c4, which are given by

    λ(tf) − ∂φ/∂x (tf, x(tf)) ⊥ Sf(tf)  ⟹  ( λ1(tf), λ2(tf), λ3(tf) + 1, λ4(tf) )ᵀ = μ1(0, 1, 0, 0)ᵀ + μ2(0, 0, 0, 1)ᵀ,

    which yields λ1(tf) = 0 and λ3(tf) = −1, while the remaining ones are free. Substituting this into (4.2) gives c1 = 0 and thus λ3(t) ≡ −1. Hence,

    tan v(t) = λ4(t)/λ3(t) = c2t − c4,

    which, after renaming the free constants, is the desired linear tangent law tan v(t) = c4 + c2t.

    (c) The optimal control problem can be stated as

    minimize over v(·):  ∫₀^tf dt
    subject to  ẋ1(t) = x3(t),
                ẋ2(t) = x4(t),
                ẋ3(t) = (c/m) u(t) cos v(t),
                ẋ4(t) = (c/m) u(t) sin v(t),
                x1(0) = 0, x2(0) = 0, x3(0) = x3i, x4(0) = x4i,
                x1(tf) = x1f, x2(tf) = x2f.

    This problem is similar to (a) and the only difference is the new boundary conditions, which are now given by

    λ(tf) − ∂φ/∂x (tf, x(tf)) ⊥ Sf  ⟹  ( λ1(tf), λ2(tf), λ3(tf), λ4(tf) )ᵀ = ( μ1, μ2, 0, 0 )ᵀ,

    i.e., λ1(tf) and λ2(tf) are free, while λ3(tf) = λ4(tf) = 0. Therefore, we get λ3(t) = −c1(t − tf), λ4(t) = −c2(t − tf), and

    tan( v(t) ) = λ4(t)/λ3(t) = c2/c1   (constant).

    Since v(t) and u are constant, we can solve the system of equations

    ẋ3(t) = (cu/m) cos v,  x3(0) = x3i   ⟹   x3(t) = (cu/m) t cos v + x3i,
    ẋ4(t) = (cu/m) sin v,  x4(0) = x4i   ⟹   x4(t) = (cu/m) t sin v + x4i,
    ẋ1(t) = x3(t),  x1(0) = 0   ⟹   x1(t) = (cu/m) (t²/2) cos v + x3i t,
    ẋ2(t) = x4(t),  x2(0) = 0   ⟹   x2(t) = (cu/m) (t²/2) sin v + x4i t.

    Using the terminal conditions x1(tf) = x1f and x2(tf) = x2f, we can calculate v and tf.

    (d) The new Hamiltonian is given by

    Hnew = H − λ4g,

    and the rest is the same.

    (e) The problem is

    minimize over v(·):  −x1(tf)
    subject to  ẋ1(t) = x3(t),
                ẋ2(t) = x4(t),
                ẋ3(t) = (c/x5(t)) u(t) cos v(t),
                ẋ4(t) = (c/x5(t)) u(t) sin v(t) − g,
                ẋ5(t) = −u(t),
                x(0) given,
                x2(tf) = x2f, x5(tf) = x5f,
                u(t) ∈ [0, umax], t ∈ [0, tf].

    The Hamiltonian is

    H(t, x, [u, v]ᵀ, λ) ≜ λ1x3 + λ2x4 + λ3 (c/x5) u cos v + λ4 ( (c/x5) u sin v − g ) − λ5u.

    Pointwise minimization,

    μ̃(t, x, λ) ≜ argmin over u ∈ [0, umax], v of H(t, x, [u, v]ᵀ, λ)
    = argmin over u ∈ [0, umax], v of ( λ3 (c/x5) cos v + λ4 (c/x5) sin v − λ5 )u.

    If u = 0, then v can be arbitrary. If u ≠ 0, then tan v = λ4/λ3 as earlier. In any case, the optimal v can be taken as

    tan v(t) = λ4(t)/λ3(t).

    The adjoint equations are

    λ̇1(t) = 0,
    λ̇2(t) = 0,
    λ̇3(t) = −λ1(t),
    λ̇4(t) = −λ2(t),
    λ̇5(t) = (c/x5²(t)) u(t) ( λ3(t) cos v(t) + λ4(t) sin v(t) ).

    Solving the first four yields (as before)

    λ1(t) = c1,  λ2(t) = c2,  λ3(t) = −c1t + c3,  λ4(t) = −c2t + c4.

    The boundary conditions are

    λ(tf) − ∂φ/∂x (tf, x(tf)) ⊥ Sf  ⟹  ( λ1(tf) + 1, λ2(tf), λ3(tf), λ4(tf), λ5(tf) )ᵀ = ( 0, μ1, 0, 0, μ2 )ᵀ,

    and thus λ1(tf) = −1 and λ3(tf) = λ4(tf) = 0, while the remaining ones are free. Thus, we can conclude that c1 = −1 and

    λ3(t) = t − tf,  λ4(t) = c2(tf − t)   ⟹   tan v(t) = λ4(t)/λ3(t) = −c2 = constant.

    Hence, the steering angle v(t) is constant. This gives

    u(t) =
      0,          if σ(t) > 0,
      umax,       if σ(t) < 0,
      arbitrary,  if σ(t) = 0,

    where the switching function is

    σ(t) ≜ λ3(t) (c/x5(t)) cos v + λ4(t) (c/x5(t)) sin v − λ5(t).

    Now, let us look at the switching function. The standard trick for determining the number of switches is to investigate σ̇(t). Here we have

    σ̇(t) = λ̇3(t) (c/x5(t)) cos v + λ̇4(t) (c/x5(t)) sin v − (c ẋ5(t)/x5²(t)) ( λ3(t) cos v + λ4(t) sin v ) − λ̇5(t),

    where the last two terms cancel, since ẋ5 = −u and λ̇5 = (c/x5²) u (λ3 cos v + λ4 sin v). Thus

    σ̇(t) = (c/x5(t)) ( λ̇3(t) cos v + λ̇4(t) sin v )
    = (c/x5(t)) ( cos v − c2 sin v )
    = (c/x5(t)) ( cos v − c2(−c2 cos v) )
    = ( c(1 + c2²)/x5(t) ) cos v ≥ 0

    (we know that cos v must be positive for our purpose), and thus we conclude that there can be at most one switch, and the direction is from u = umax to u = 0.

    5 Infinite Horizon Optimal Control

    5.1 (a) 1. The Hamiltonian is given by

    H(x, u, λ) ≜ f0(x, u) + λᵀf(x, u) = (1/2)u² + (1/2)x⁴ + λu.

    2. Pointwise optimization yields

    μ̃(x, λ) ≜ argmin over u ∈ R of H(x, u, λ)
    = argmin over u ∈ R of { (1/2)u² + (1/2)x⁴ + λu } = −λ,

    and the optimal control is

    u(t) ≜ μ̃( x(t), Vx(x(t)) ) = −Vx(x(t)),

    where Vx(x(t)) is obtained by solving the infinite time horizon hjbe.

    3. The hjbe is given by

    0 = H(x, μ̃(x, Vx), Vx).

    In our case this is equivalent to

    0 = −(1/2)Vx² + (1/2)x⁴,

    which implies that Vx = ±x², and thus V(x) = ±(1/3)x³ + C for some constant C. Since V(x) > 0 for all x ≠ 0 and V(0) = 0, we have

    V(x) = (1/3)|x|³.

    Finally, the optimal control is given by

    u(t) = −Vx(x(t)) = −x²(t) sgn(x(t)),   (5.1)

    which is reasonable if one looks at the optimal control problem we are solving.

    (b) 1. The Hamiltonian will be the same.

    2. Pointwise optimization yields

    μ̃(x, λ) ≜ argmin over |u| ≤ 1 of H(x, u, λ)
    = argmin over |u| ≤ 1 of { (1/2)u² + (1/2)x⁴ + λu }
    =
      −λ,        if |λ| ≤ 1,
      −sgn(λ),   if |λ| > 1,

    and the optimal control is

    u(t) ≜ μ̃( x(t), Vx(x(t)) ) =
      −Vx(x(t)),         if |Vx(x(t))| ≤ 1,
      −sgn(Vx(x(t))),    if |Vx(x(t))| > 1,

    where Vx(x(t)) is obtained by solving the infinite time horizon hjbe.

    3. The hjbe is given by

    0 = H(x, μ̃(x, Vx), Vx).

    In our case this is equivalent to

    0 =
      −(1/2)Vx² + (1/2)x⁴,               if |Vx| ≤ 1,
      1/2 + (1/2)x⁴ − Vx sgn(Vx),        if |Vx| > 1,

    ⟹  Vx =
      x² sgn(x),                  if |Vx| ≤ 1,
      (1/2)(1 + x⁴) sgn(Vx),      if |Vx| > 1.

    Since |x²| ≤ 1 ⟺ |x| ≤ 1 and (1/2)(1 + x⁴) > 1 ⟺ |x| > 1, and sgn(Vx) = sgn(x), we have

    Vx =
      x² sgn(x),               if |x| ≤ 1,
      (1/2)(1 + x⁴) sgn(x),    if |x| > 1,

    which implies

    V(x) =
      (1/3)|x|³,                      if |x| ≤ 1,
      (1/2)|x| + (1/10)|x|⁵ + C,      if |x| > 1.

    Since V(x) must be continuous, we have

    lim over x → 1 of V(x) = V(1)  ⟹  1/2 + 1/10 + C = 1/3  ⟹  C = −4/15.

    We also notice that Vx is continuous, i.e., V ∈ C¹. Thus, the optimal cost-to-go function is

    J(x) = V(x) =
      (1/3)|x|³,                         if |x| ≤ 1,
      (1/2)|x| + (1/10)|x|⁵ − 4/15,      if |x| > 1.

    Finally, the optimal control is given by

    u(t) =
      −Vx(x(t)),       if |x(t)| ≤ 1,
      −sgn(Vx),        if |x(t)| > 1,
    =
      −x²(t) sgn(x(t)),   if |x(t)| ≤ 1,
      −sgn(x(t)),         if |x(t)| > 1,

    which is reasonable if we combine the control signal constraint with the unconstrained optimal control (5.1).
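    The continuity claims are quickly verified numerically; a small sketch of V and a check at the gluing point |x| = 1:

    V = @(x) (abs(x) <= 1).*abs(x).^3/3 + ...
        (abs(x) > 1).*(abs(x)/2 + abs(x).^5/10 - 4/15);
    V(1 - 1e-9) - V(1 + 1e-9)   % ~0, so V is continuous at |x| = 1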

    5.2 1. The Hamiltonian is given by

    H(x, u, λ) ≜ f0(x, u) + λᵀf(x, u) = (1/2)u² + (1/2)x² + x⁴ + λ(x³ + u).

    2. Pointwise optimization yields

    μ̃(x, λ) ≜ argmin over u ∈ R of H(x, u, λ)
    = argmin over u ∈ R of { (1/2)u² + (1/2)x² + x⁴ + λ(x³ + u) } = −λ,

    and the optimal control is

    u(t) ≜ μ̃( x(t), Vx(x(t)) ) = −Vx(x(t)),

    where Vx(x(t)) is obtained by solving the infinite time horizon hjbe.

    3. The hjbe is given by

    0 = H(x, μ̃(x, Vx), Vx).

    In our case this is equivalent to

    0 = −(1/2)Vx² + (1/2)x² + x⁴ + Vx x³.

    We make the ansatz V(x) = αx² + βx⁴, which gives Vx = 2αx + 4βx³, and we have

    0 = −(1/2)(2αx + 4βx³)² + (1/2)x² + x⁴ + (2αx + 4βx³)x³
    = −2α²x² − 8αβx⁴ − 8β²x⁶ + (1/2)x² + x⁴ + 2αx⁴ + 4βx⁶
    = (−2α² + 1/2)x² + (−8αβ + 1 + 2α)x⁴ + (−8β² + 4β)x⁶,

    which is true for all x if α = 1/2 and β = 1/2, and we have

    V(x) = (1/2)(x² + x⁴).

    Finally, the optimal control is given by

    u(t) = −Vx(x(t)) = −x(t) − 2x³(t).
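    The ansatz computation can also be verified symbolically, assuming the Symbolic Math Toolbox is available:

    syms x real
    V = (x^2 + x^4)/2;
    Vx = diff(V, x);
    simplify(-Vx^2/2 + x^2/2 + x^4 + Vx*x^3)   % returns 0, so the hjbe holds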

    5.3 (a) With

    A = [0 1; 0 0],  B = [0; 1],  C = [1 0],

    the system is written in state-space form as

    ẋ(t) = Ax(t) + Bu(t),
    y(t) = Cx(t).   (5.1)

    (b)

    R = α²,  Q = β²CᵀC.   (5.2)

    (c) The optimal feedback is u(t) = −R⁻¹BᵀPx(t), where P is the solution to the are

    PA + AᵀP − PBR⁻¹BᵀP + Q = 0.   (5.3)

    (See Theorem 10 in the lecture notes.)

    (d) The closed loop system is

    ẋ(t) = Ax(t) + B(−R⁻¹BᵀPx(t)) = (A − BR⁻¹BᵀP)x(t).   (5.4)

    Read about are by running help are in Matlab!

    >> A = [0 1; 0 0]; B = [0; 1]; C = [1 0];
    >> alpha = 1; beta = 2;
    >> R = alpha^2; Q = beta^2*(C'*C);
    >> P = are(A, B*inv(R)*B', Q)
    P =
        4.0000    2.0000
        2.0000    2.0000
    >> eig(A - B*inv(R)*B'*P)
    ans =
      -1.0000 + 1.0000i
      -1.0000 - 1.0000i

    (e) Let

    P = [p1 p2; p2 p3].   (5.5)

    Then the are can be expressed as

    [ β² − α⁻²p2²,  p1 − α⁻²p2p3;  p1 − α⁻²p2p3,  2p2 − α⁻²p3² ] = 0,   (5.6)

    or equivalently as

    p1 = α⁻²p2p3,  p2² = α²β²,  p3² = 2α²p2.   (5.7)

    Note that the last equation gives that p2 > 0, and furthermore we have that p1 > 0 and p3 > 0 since P is positive definite. Thus, the solution to the are is

    P = [ β√(2αβ),  αβ;  αβ,  α√(2αβ) ],   (5.8)

    and the optimal control is

    u(t) = −(β/α) x1(t) − √(2β/α) x2(t) = −(β/α) y(t) − √(2β/α) ẏ(t).   (5.9)

    Note that this is a PD-controller that depends only on the fraction β/α.
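    The closed-form solution can be compared with the numerical are solution for arbitrary parameter values; α = 1.3 and β = 0.7 below are made-up test values:

    alpha = 1.3; beta = 0.7;
    A = [0 1; 0 0]; B = [0; 1]; C = [1 0];
    R = alpha^2; Q = beta^2*(C'*C);
    Pnum = are(A, B*inv(R)*B', Q);
    Pform = [beta*sqrt(2*alpha*beta) alpha*beta;
             alpha*beta              alpha*sqrt(2*alpha*beta)];
    norm(Pnum - Pform)   % ~1e-15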

    (f) The system matrix of the closed loop system is

    A − BR⁻¹BᵀP = [ 0, 1; −β/α, −√(2β/α) ],   (5.10)

    with the eigenvalues

    λ = √(β/(2α)) (−1 ± i).   (5.11)
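    A quick numerical check of the eigenvalue formula, reusing the made-up α and β from above:

    Acl = [0 1; -beta/alpha -sqrt(2*beta/alpha)];
    eig(Acl)                             % compare with the formula below
    sqrt(beta/(2*alpha))*[-1+1i; -1-1i]  % same values, possibly reordered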

    5.4 (a) >> A = [-0.089 -2.19 0 0.319 0 0
                    0.076 -0.217 -0.166 0 0 0
                   -0.602 0.327 -0.975 0 0 0
                    0 0.15 1 0 0 0
                    0 1 0 0 0 0
                    1 0 0 0 2.19 0];
    >> B = [0 0.0327
            0.0264 -0.151
            0.227 0.0636
            0 0
            0 0
            0 0];
    >> C = [0 0 0 0 0 1];
    >> K1 = [0 0 0 1 0 0; 0 0 0 0 -1 0]
    K1 =
         0     0     0     1     0     0
         0     0     0     0    -1     0
    >> E1 = eig(A-B*K1)
    E1 =
            0
      -1.0591
      -0.0235 + 0.7953i
      -0.0235 - 0.7953i
      -0.0874 + 0.1910i
      -0.0874 - 0.1910i

    Note that the first eigenvalue is in the origin, and hence the system is not asymptotically stable. This eigenvalue corresponds to the lateral position, and this state will not be driven to zero by this controller. Thus, it is not straightforward to design a controller for this system by intuition.

    (b) Consider the following optimal control problem:

    minimize over u(·):  J = ∫₀^∞ ( x(t)ᵀQx(t) + u(t)ᵀRu(t) ) dt
    subject to  ẋ(t) = Ax(t) + Bu(t),
                u(t) ∈ R², t ≥ 0.

    The Matlab command lqr is used to solve this problem; read the information about lqr by running help lqr in Matlab.

    >> Q = eye(6); R = eye(2);
    >> [K2, S2, E2] = lqr(A, B, Q, R);
    >> K2, E2
    K2 =
        2.7926    4.2189    2.0211    3.2044    9.7671    0.8821
       -1.2595   -6.8818   -1.0375   -2.0986   -6.9901   -0.4711
    E2 =
      -1.2275
      -0.2911 + 0.7495i
      -0.2911 - 0.7495i
      -0.2559 + 0.3553i
      -0.2559 - 0.3553i
      -0.4618

    All eigenvalues are in the left half plane, and the feedback system is asymptotically stable.

    (c) Note that it is not the absolute values of the elements of Q and R that are important, but their ratios. If Q is large and R is small, then the states are forced to zero quickly and the magnitude of the control signal will be large, and vice versa.
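    This trade-off is easy to illustrate by rerunning lqr with a scaled Q; a heavier state penalty typically pushes the closed-loop eigenvalues further into the left half plane at the price of larger feedback gains:

    >> [K3, S3, E3] = lqr(A, B, 100*Q, R);
    >> [max(real(E2)), max(real(E3))]   % the slowest pole moves further left
    >> [norm(K2), norm(K3)]             % ...while the gain grows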


    6 Model Predictive Control

    6.1 (a) The following script solves the problem:

    mptSolutionA.m

    % the zoh system
    close all
    clear all
    s = tf('s');
    G = 2/(s^2+3*s+2);
    Ts = 0.05;
    S = c2d(ss(G),Ts)
    Z.A = S.A;
    Z.B = S.B;
    Z.C = S.C;
    Z.D = S.D;
    Z.umin = -2;
    Z.umax = 2;
    Z.xmin = -[10;10];
    Z.xmax = [10;10];

    prob.norm = 2;
    prob.Q = eye(2);
    prob.R = 1e-2;
    prob.N = 2;
    prob.Tconstraint = 0;
    prob.subopt_lev = 0;
    ctrl = mpt_control(Z,prob);
    figure
    mpt_plotPartition(ctrl)
    figure
    mpt_plotU(ctrl)
    figure
    mpt_plotTimeTrajectory(ctrl, [1;1])

    (b) A Matlab script based on the results in (a):

    mptSolutionB.m

    nr = [];
    n = 1:14;
    for i = n
        prob.N = i;
        ctrl = mpt_control(Z,prob,[]);
        nr = [nr,length(ctrl.dynamics)];
    end
    figure
    plot(nr)
    hold on
    x = lsqnonlin(@(x) monomial(x,[n.';nr.']),[1,2]);
    alpha = x(1)
    beta = x(2)
    plot(alpha*(n).^beta,'r--')
    x = lsqnonlin(@(x) exponent(x,[n.';nr.']),[1,2])
    gamma = x(1)
    delta = x(2)
    plot(gamma*delta.^(n),'go-')
    legend('n_r','\alpha*N^{\beta}','\gamma*\delta^{N}')

    where we defined the functions

    monomial.m

    function f = monomial(x,c)
    N = length(c);
    f = x(1)*c(1:N/2).^x(2)-c(N/2+1:end);
    end

    exponent.m

    function f = exponent(x,c)
    N = length(c);
    f = x(1)*(x(2).^c(1:N/2)-1)-c(N/2+1:end);
    end

    The values of the optimized parameters are given by

    α = 1.7914,  β = 1.9739,  γ = 44.1304,  δ = 1.1665,

    and the complexity, at least in this case, seems to be quadratic rather than exponential, since the monomial fits well and the estimated base of the exponential is close to 1. This shows that the complexity in practice may be better than the theoretical complexity, which is exponential in N.

    7 Numerical Algorithms

    7.1 (a) An Euler discretization of the system with step length h gives

    x[k + 1] = x[k] + h f(x[k], u[k])
    = x[k] + h [0 1; 0 0] x[k] + h [0; 1] u[k]
    = [1 h; 0 1] x[k] + [0; h] u[k].   (7.1)

    secOrderSysEqDisc.m

    function x = secOrderSysEqDisc(x0, u, dt)
    % One Euler step of the double integrator, cf. (7.1).
    x = zeros(2,1);
    x(1) = x0(1) + dt*x0(2);
    x(2) = x0(2) + dt*u;
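    A one-step usage example with made-up numbers:

    >> x0 = [0; 0]; u = 1; dt = 0.1;
    >> x1 = secOrderSysEqDisc(x0, u, dt)
    x1 =
             0
        0.1000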

    (b)

    secOrderSysCostCon.m

    function f = secOrderSysCostCon(X, dt)
    % The control u[k] is every third element of the stacked variable X.
    f = dt*sum(X(3:3:end).^2);

    (c) With the stage variables ordered as (x1[k], x2[k], u[k]), the equality constraints are

    Aeq =
    [ E 0 0 0 . . . 0 0
      0 0 0 0 . . . 0 E
      F E 0 0 . . . 0 0
      0 F E 0 . . . 0 0
      0 0 F E . . . 0 0
      ...
      0 0 0 0 . . . F E ]   (7.2)

    Beq = (xi xf 0 0 . . . 0)ᵀ,   (7.3)

    where

    F = −[1 h 0; 0 1 h]   (7.4)

    and

    E = [1 0 0; 0 1 0].   (7.5)

    The first two block rows fix the initial and final states, and each remaining block row encodes one step of the dynamics (7.1).

    mainDiscLinconSecOrderSys.m

    % Second order opti