An Introduction to Numerical Methods for Differential Games

M. Falcone
Dipartimento di Matematica, SAPIENZA, Rome

School "Game Theory and Dynamic Games"
Campione, September 4-9, 2016


Outline

1 Introduction

2 Dynamic Programming for 1-Player

3 Dynamic Programming for 2-Players


Foreword

Optimal control problems can be solved via the Pontryagin Maximum Principle (open-loop) and by the Dynamic Programming (DP) approach.

However, Pontryagin's Principle does not work for games, so we present the DP approach.

By the Dynamic Programming Principle, we will derive the characterization of the value function in terms of a first-order partial differential equation (PDE), the Isaacs equation.
We will introduce weak (i.e. non-differentiable) solutions, which will allow us to select a unique solution to the Bellman and to the Isaacs equation.

The general framework of this approach is the theory of viscosity solutions (Crandall-Lions, 1984).

Uniqueness is a delicate and fundamental issue, and it is crucial for proving the convergence of numerical approximation schemes.


The numerical solution of optimal control problems via the Dynamic Programming approach is mainly motivated by the search for feedback controls for generic nonlinear Lipschitz continuous dynamics and costs.

The solution of the corresponding Bellman equation in high dimension is a computationally intensive task, and this bottleneck has limited the applications of this theory to industrial cases.

Recently, several new efficient algorithms with limited (and controlled) memory allocation and reasonable CPU times have been developed to mitigate the "curse of dimensionality".


The Zermelo navigation problem

Figure: The rescaled value function (left), feedback control (right)


Trajectories for a pursuit-evasion game

Figure: Optimal trajectories derived by feedback laws. Two panels in the (x1, x2) plane, both axes ranging from -1 to 1, show the paths of the pursuer P and of the evader E.

Left panel: Test 5: P=(0.0,0.3) E=(0.3,0.0)
Right panel: Test 5: P=(-0.1,-0.3) E=(0.1,0.3)


The model problem

Let us consider the nonlinear system dynamics

\[
\begin{cases}
\dot y(t) = f(y(t), a(t), b(t)), & t > 0,\\
y(0) = x,
\end{cases}
\tag{D}
\]

where
$y(t) \in \mathbb{R}^N$ is the state,
$a(\cdot) \in \mathcal{A}$ is the control of player 1 (player a),

\[
\mathcal{A} = \text{admissible control functions of player 1}
= \{ a : [0,+\infty[ \to A, \text{ measurable} \}
\]

(e.g. $\mathcal{A}$ = piecewise constant functions with values in $A$),


The model problem

$b(\cdot) \in \mathcal{B}$ is the control of player 2 (player b),

\[
\mathcal{B} = \{ b : [0,+\infty[ \to B, \text{ measurable} \},
\]

$A, B \subset \mathbb{R}^M$ are given compact sets.
Assume $f$ is continuous and

\[
|f(x,a,b) - f(y,a,b)| \le L\,|x - y| \qquad \forall x, y \in \mathbb{R}^N,\ a \in A,\ b \in B.
\]

Then, for any given controls $a(\cdot) \in \mathcal{A}$, $b(\cdot) \in \mathcal{B}$, there is a unique trajectory of (D), denoted by $y_x(t; a, b)$ (Carathéodory).


Payoff

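The payoff of the game is

\[
t_x(a(\cdot), b(\cdot)) = \min\{ t : y_x(t; a, b) \in \mathcal{T} \} \le +\infty,
\]

where $\mathcal{T} \subseteq \mathbb{R}^N$ is a given closed target, and $t_x = +\infty$ if the trajectory never reaches $\mathcal{T}$.

Goal of the game
Player a wants to minimize the payoff and is called the pursuer, whereas Player b wants to maximize the payoff and is called the evader.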


Example: Minimum time problem

This is a simple example with just 1 player:

\[
\begin{cases}
\dot y = a, & A = \{ a \in \mathbb{R}^N : |a| = 1 \},\\
y(0) = x.
\end{cases}
\]

Then $t_x(a^*)$ is equal to the length of the optimal trajectory joining $x$ and the point $y_x(t_x(a^*))$, thus

\[
t_x(a^*) = \min_{a \in \mathcal{A}} t_x(a) = \mathrm{dist}(x, \mathcal{T})
\]

and any optimal trajectory is a straight line!
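The identity $t_x(a^*) = \mathrm{dist}(x, \mathcal{T})$ can be checked numerically. The sketch below is only an illustration under stated assumptions: the target is taken to be a closed disc (a hypothetical choice, not from the slides), the control always points at the disc centre, and the trajectory is integrated with explicit Euler steps. The function names `min_time_to_disc` and `simulate_straight_line` are made up for this example.

```python
import math

def min_time_to_disc(x, center, radius):
    """Exact minimum time for y' = a, |a| = 1: the distance from x to the disc."""
    return max(0.0, math.dist(x, center) - radius)

def simulate_straight_line(x, center, radius, h=1e-3):
    """Move at unit speed straight toward the disc centre (explicit Euler) and
    return the first discrete time at which the state lies in the target."""
    y = list(x)
    t = 0.0
    while math.dist(y, center) > radius:
        d = math.dist(y, center)
        a = [(c - yi) / d for yi, c in zip(y, center)]  # unit vector toward the centre
        y = [yi + h * ai for yi, ai in zip(y, a)]
        t += h
    return t

# Assumed data: start at (2, 1), target = closed disc of radius 0.5 around the origin.
x0 = (2.0, 1.0)
center, radius = (0.0, 0.0), 0.5
exact = min_time_to_disc(x0, center, radius)
approx = simulate_straight_line(x0, center, radius)
print(f"dist(x, target) = {exact:.4f}, simulated arrival time = {approx:.4f}")
```

The two numbers should agree up to about one time step `h`, since the simulated trajectory is a straight segment travelled at unit speed.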


Example: Pursuit-Evasion Games

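We have two players, each one controlling its own dynamics

\[
\begin{cases}
\dot y_1 = f_1(y_1, a), \\
\dot y_2 = f_2(y_2, b),
\end{cases}
\qquad y_i \in \mathbb{R}^{N/2},\ i = 1, 2.
\tag{PEG}
\]

The target is

\[
\mathcal{T}_\delta \equiv \{ |y_1 - y_2| \le \delta \},\ \delta > 0,
\qquad \text{or} \qquad
\mathcal{T}_0 \equiv \{ (y_1, y_2) : y_1 = y_2 \}.
\]

Then $t_x(a(\cdot), b(\cdot))$ is the capture time corresponding to the strategies $a(\cdot)$ and $b(\cdot)$ of the first and second players.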
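To make the notion of capture time concrete, here is a small simulation sketch. Everything specific in it is an assumption made for illustration: the dynamics are $\dot y_1 = v_P\,a$ and $\dot y_2 = v_E\,b$ with $|a|, |b| \le 1$, the pursuer heads straight at the evader, and the evader flees along the line of sight. These are naive strategies chosen for simplicity, not feedback laws computed from the Isaacs equation, and the name `capture_time` is hypothetical.

```python
import math

def capture_time(p0, e0, v_p=1.0, v_e=0.5, delta=0.05, h=1e-3, t_max=100.0):
    """First discrete time at which |y1 - y2| <= delta, or +inf if not reached by t_max."""
    p, e = list(p0), list(e0)
    t = 0.0
    while math.dist(p, e) > delta and t < t_max:
        d = math.dist(p, e)
        u = [(ei - pi) / d for pi, ei in zip(p, e)]       # unit line-of-sight vector P -> E
        p = [pi + h * v_p * ui for pi, ui in zip(p, u)]   # pursuer chases
        e = [ei + h * v_e * ui for ei, ui in zip(e, u)]   # evader flees
        t += h
    return t if math.dist(p, e) <= delta else math.inf

# Initial positions taken from the first test of the trajectory figure.
t_cap = capture_time((0.0, 0.3), (0.3, 0.0))
print(f"capture time ~ {t_cap:.3f}")
```

With these assumed strategies the distance shrinks at the constant rate $v_P - v_E$, so the result should be close to $(|P - E| - \delta)/(v_P - v_E)$; when $v_E \ge v_P$ the target is never reached and the function returns `inf`, which matches the convention $t_x = +\infty$.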


Dynamic Programming for 1-Player

In this section we assume $B = \{b\}$ (a single point), so the system can be rewritten as

\[
\begin{cases}
\dot y = f(y, a), & t > 0,\\
y(0) = x.
\end{cases}
\]

Define the value function

\[
T(x) \equiv \inf_{a(\cdot) \in \mathcal{A}} t_x(a).
\]

$T(\cdot)$ is the minimum-time function, i.e. it is the best possible outcome of the game for player a, as a function of the initial position $x$ of the system.


Reachable set

DEFINITION
$\mathcal{R} \equiv \{ x \in \mathbb{R}^N : T(x) < +\infty \}$, i.e. the set of starting points from which it is possible to reach the target.

WARNING
The reachable set $\mathcal{R}$ depends on the target and on the dynamics in a rather complicated way.

$\mathcal{R}$ is NOT known in our problem, so we have to determine the couple $(T, \mathcal{R})$ (i.e. it is a free boundary problem).


Dynamic Programming for 1-Player

LEMMA (Dynamic Programming Principle)
For all $x \in \mathcal{R} \setminus \mathcal{T}$ and $0 \le t < T(x)$,

\[
T(x) = \inf_{a(\cdot) \in \mathcal{A}} \{ t + T(y_x(t; a)) \}.
\tag{DPP}
\]

Sketch of the proof
The inequality "≤" follows from the intuitive fact that, for every $a(\cdot)$,

\[
T(x) \le t + T(y_x(t; a)).
\]
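A quick numerical sanity check of (DPP) is possible whenever $T$ is known in closed form. The sketch below assumes a one-dimensional example chosen for illustration: $\dot y = a$ with $|a| \le 1$ and target $]-\infty,-1] \cup [1,+\infty[$, for which $T(x) = 1 - |x|$ on $[-1, 1]$. It also restricts the infimum to constant controls on $[0, t]$, which is enough in this particular example; `dpp_rhs` is a hypothetical helper name.

```python
def T(x):
    """Minimum time for y' = a, |a| <= 1, to leave the interval ]-1, 1[ (assumed example)."""
    return max(0.0, 1.0 - abs(x))

def dpp_rhs(x, t, n_controls=201):
    """Right-hand side of (DPP) over constant controls a in [-1, 1]: min of t + T(x + a*t)."""
    controls = [-1.0 + 2.0 * k / (n_controls - 1) for k in range(n_controls)]
    return min(t + T(x + a * t) for a in controls)

for x, t in ((0.3, 0.2), (-0.6, 0.1), (0.0, 0.5)):
    # Each pair satisfies 0 <= t < T(x), as required by the lemma.
    print(f"x={x:+.1f}, t={t:.1f}: T(x)={T(x):.3f}, DPP right-hand side={dpp_rhs(x, t):.3f}")
```

In each case the minimum is attained by the control that moves straight toward the nearest boundary point, and the two printed values coincide.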


Sketch of the proof

The proof of the opposite inequality "≥" is based on the fact that equality holds if $a(\cdot)$ is optimal for $x$.
For any $\varepsilon > 0$ we can find a minimizing control $a_\varepsilon$ such that

\[
T(x) + \varepsilon \ge t + T(y_x(t; a_\varepsilon));
\]

then we split the trajectory at time $t$ and pass to the limit for $\varepsilon \to 0$.


Sketch of the proof

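To prove the above inequalities rigorously, the following two properties of $\mathcal{A}$ are crucial:

1 $a(\cdot) \in \mathcal{A}$ implies that, $\forall s \in \mathbb{R}$, the function $t \mapsto a(t + s)$ is still in $\mathcal{A}$;

2 $a_1, a_2 \in \mathcal{A}$ implies that, for any given $s > 0$, the new control

\[
a(t) \equiv
\begin{cases}
a_1(t), & t \le s,\\
a_2(t), & t > s
\end{cases}
\]

belongs to $\mathcal{A}$ (concatenation property).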


Concatenation is a crucial property

Note that the DP Principle works for

\[
\mathcal{A} = \{ \text{piecewise constant functions into } A \}
\]

but not for

\[
\mathcal{A} = \{ \text{continuous functions into } A \},
\]

because joining together two continuous controls does not guarantee that the resulting control is continuous.


Getting the Bellman equation

Let us derive the Hamilton-Jacobi-Bellman equation from the DP Principle.
Rewrite (DPP) as

\[
T(x) - \inf_{a(\cdot) \in \mathcal{A}} T(y_x(t; a)) = t
\]

and divide by $t > 0$:

\[
\sup_{a(\cdot) \in \mathcal{A}} \left\{ \frac{T(x) - T(y_x(t; a))}{t} \right\} = 1 \qquad \text{for } t < T(x).
\]

We want to pass to the limit as $t \to 0^+$.


Bellman equation

Assume $T$ is differentiable at $x$ and that the limit for $t \to 0^+$ commutes with the $\sup_{a(\cdot)}$.
Then, if $\dot y_x(0; a)$ exists,

\[
\sup_{a(\cdot) \in \mathcal{A}} \{ -\nabla T(x) \cdot \dot y_x(0; a) \} = 1.
\]

Then, for $\lim_{t \to 0^+} a(t) = a_0$, we get

\[
\sup_{a_0 \in A} \{ -\nabla T(x) \cdot f(x, a_0) \} = 1.
\tag{HJB}
\]

This is the Hamilton-Jacobi-Bellman partial differential equation; for our problem it is a first-order nonlinear PDE.


Bellman equation

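Let us define the Hamiltonian

\[
H_1(x, p) := \max_{a \in A} \{ -p \cdot f(x, a) \} - 1;
\]

we can rewrite (HJB) in short as

\[
H_1(x, \nabla T(x)) = 0 \qquad \text{in } \mathcal{R} \setminus \mathcal{T}.
\]

A natural boundary condition on $\partial\mathcal{T}$ is

\[
T(x) = 0 \qquad \text{for } x \in \partial\mathcal{T}.
\]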
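As a worked instance of this Hamiltonian (an illustration assuming the simple dynamics $\dot y = a$ with $A$ the closed unit ball), the maximum can be computed explicitly by the Cauchy-Schwarz inequality:

```latex
H_1(x, p) = \max_{|a| \le 1} \{ -p \cdot a \} - 1 = |p| - 1,
\qquad \text{attained at } a^* = -\frac{p}{|p|} \quad (p \neq 0).
```

Under this assumption the equation $H_1(x, \nabla T(x)) = 0$ reads $|\nabla T(x)| = 1$, and wherever $T$ is differentiable the maximizing control $a^* = -\nabla T(x)/|\nabla T(x)|$ points in the direction of steepest descent of $T$.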


T verifies the HJB equation

PROPOSITION
If $T(\cdot)$ is $C^1$ in a neighborhood of $x \in \mathcal{R} \setminus \mathcal{T}$, then $T(\cdot)$ satisfies at $x$

\[
\max_{a \in A} \{ -\nabla T(x) \cdot f(x, a) \} = 1.
\]

Sketch of the proof
Let us first prove the inequality "≤".
Fix $a(t) \equiv a_0$ $\forall t$, and set $y_x(t) = y_x(t; a)$. Then (DPP) gives

\[
T(x) - T(y_x(t)) \le t \qquad \forall\, 0 \le t < T(x).
\]


T verifies the HJB equation

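We divide by $t > 0$, getting

\[
\frac{T(x) - T(y_x(t))}{t} \le 1 \qquad \forall\, 0 < t < T(x).
\]

Now let $t \to 0^+$ to get

\[
-\nabla T(x) \cdot \dot y_x(0) = -\nabla T(x) \cdot f(x, a_0) \le 1,
\]

where $\dot y_x(0) = f(x, a_0)$ since $a(t) \equiv a_0$. Since $a_0 \in A$ is arbitrary, we get

\[
\max_{a \in A} \{ -\nabla T(x) \cdot f(x, a) \} \le 1.
\]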


T verifies the HJB equation

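To prove the inequality "≥", we fix $\varepsilon > 0$.
For all $t \in\, ]0, T(x)[$, by (DPP) there exists $\alpha \in \mathcal{A}$ such that

\[
T(x) \ge t + T(y_x(t; \alpha)) - \varepsilon t.
\]

Then

\[
\begin{aligned}
1 - \varepsilon \le \frac{T(x) - T(y_x(t; \alpha))}{t}
&= -\frac{1}{t} \int_0^t \partial_s T(y_x(s; \alpha))\, ds \\
&= -\frac{1}{t} \int_0^t \nabla T(y_x(s)) \cdot \dot y_x(s)\, ds \\
&= -\frac{1}{t} \int_0^t \nabla T(x) \cdot f(x, \alpha(s))\, ds + o(1),
\end{aligned}
\]

where $o(1) \to 0$ as $t \to 0^+$ by the continuity of $\nabla T$ and $f$. The integrand in the last line is bounded above by $\sup_{a \in A} \{ -\nabla T(x) \cdot f(x, a) \}$, so passing to the limit for $t \to 0^+$ we get, for every positive $\varepsilon$,

\[
1 - \varepsilon \le \sup_{a \in A} \{ -\nabla T(x) \cdot f(x, a) \}.
\]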


T verifies the HJB equation

Since $\varepsilon$ is arbitrary, we finally obtain

\[
\sup_{a \in A} \{ -\nabla T(x) \cdot f(x, a) \} \ge 1,
\]

and we conclude the proof.

In conclusion: assuming that $T$ is a differentiable function, we have proved that $T$ satisfies the Bellman equation pointwise in the reachable set $\mathcal{R}$.

WARNING: the reachable set $\mathcal{R}$ is not given in the problem.


Is T regular?

The answer is NO, even in simple cases.

Let us go back to the minimum time example, where $T(x) = \mathrm{dist}(x, \mathcal{T})$. Note that $T$ is not differentiable at $x$ if there exist two distinct points of $\mathcal{T}$ at minimal distance from $x$.

EXAMPLE
Let us take $N = 1$, $f(x, a) = a$, $A = B(0, 1)$ and choose

\[
\mathcal{T} = \,]-\infty, -1] \cup [1, +\infty[\,.
\]

Then, for $x \in [-1, 1]$,

\[
T(x) = 1 - |x|,
\]

which is not differentiable at $x = 0$.


a.e. solutions

Note that in this example the Bellman equation is the eikonal equation

\[
|Du(x)| = 1,
\tag{1}
\]

which has infinitely many a.e. solutions even when we fix the values on the boundary $\partial\mathcal{T}$,

\[
u(-1) = u(1) = 0.
\]

Figure: graph of $u(x) = 1 - |x|$ on $[-1, 1]$.
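One way to see the non-uniqueness is to build a family of "sawtooth" functions. The sketch below is an illustration, with the hypothetical helper names `sawtooth` and `slope`: for each integer $k \ge 0$ it defines $u_k(x)$ as the distance from $x$ to the nearest node of a uniform grid with $2^k$ cells on $[-1, 1]$. Every $u_k$ vanishes at $\pm 1$ and has slope $\pm 1$ away from finitely many kinks, so each one solves (1) almost everywhere; $u_0$ is the function $1 - |x|$.

```python
def sawtooth(x, k):
    """Distance from x in [-1, 1] to the nearest node of a uniform grid with 2**k cells."""
    cell = 2.0 / 2**k
    r = (x + 1.0) % cell          # offset of x inside its cell
    return min(r, cell - r)

def slope(x, k, eps=1e-6):
    """Centered finite-difference approximation of u_k'(x), valid away from the kinks."""
    return (sawtooth(x + eps, k) - sawtooth(x - eps, k)) / (2 * eps)

for k in range(4):
    values = [round(sawtooth(x, k), 3) for x in (-1.0, -0.5, 0.3, 1.0)]
    slopes = [round(slope(x, k), 3) for x in (-0.9, 0.123, 0.61)]
    print(f"k={k}: u_k at (-1, -0.5, 0.3, 1) = {values}, slopes = {slopes}")
```

All these functions satisfy the boundary condition and the equation a.e., which is why a stronger notion of solution is needed to single out one of them.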


Is T continuous?

The continuity of $T$ is also, in general, not guaranteed.
Take the previous example and set $A = [-1, 0]$; then we have

\[
T(1) = 0, \qquad \lim_{x \to 1^-} T(x) = 2.
\]

However, the continuity of $T(\cdot)$ is equivalent to the property of Small-Time Local Controllability (STLC) around $\mathcal{T}$.


Small Time Local Controllability (STLC)

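DEFINITION
Assume $\partial\mathcal{T}$ is smooth. Then the dynamical system satisfies the STLC condition iff

\[
\forall x \in \partial\mathcal{T} \ \ \exists a \in A : \ f(x, a) \cdot \eta(x) < 0,
\tag{STLC}
\]

where $\eta(x)$ is the exterior normal to $\mathcal{T}$ at $x$.

Note that (STLC) guarantees that $\mathcal{R}$ is an open subset of $\mathbb{R}^N$ and that

\[
\lim_{x \to x_0} T(x) = +\infty \qquad \forall x_0 \in \partial\mathcal{R}.
\]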
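The condition can be tested by hand on the one-dimensional example with target $]-\infty,-1] \cup [1,+\infty[$ and $f(x, a) = a$. The sketch below is an illustration under assumptions: $\eta(x)$ is taken to be the exterior normal to the target (so $\eta(-1) = +1$ and $\eta(1) = -1$), the control sets are replaced by a few sample values, and `stlc_holds` is a hypothetical helper name.

```python
def stlc_holds(boundary_normals, controls, f):
    """For each boundary point x, report whether some control a gives f(x, a) * eta(x) < 0."""
    return {x: any(f(x, a) * eta < 0 for a in controls)
            for x, eta in boundary_normals.items()}

def f(x, a):
    return a                              # dynamics y' = a

# Exterior normals to the target at its two boundary points (assumption stated above).
normals = {-1.0: 1.0, 1.0: -1.0}

full_controls = [-1.0, -0.5, 0.0, 0.5, 1.0]   # samples of A = [-1, 1]
left_controls = [-1.0, -0.5, 0.0]             # samples of A = [-1, 0]

print("A = [-1, 1]:", stlc_holds(normals, full_controls, f))
print("A = [-1, 0]:", stlc_holds(normals, left_controls, f))
```

With $A = [-1, 0]$ the check fails at $x = 1$, because no admissible velocity points into the target there; this is the boundary point where the minimum time function of the previous example jumps from 2 to 0.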


Weak solutions (in the viscosity sense)

We want to interpret the HJB equation in a "weak sense" so that $T(\cdot)$ is a (non-classical) "solution" and is also unique (under suitable boundary conditions).

Let us go back to the proof of our proposition.

We proved that

1 $T(x) - T(y_x(t)) \le t$, $\forall t$ small, and $T(\cdot) \in C^1(\mathcal{R})$ imply

\[
H_1(x, \nabla T(x)) \le 0;
\]

2 $T(x) - T(y_x(t)) \ge t(1 - \varepsilon)$, $\forall t, \varepsilon$ small, and $T \in C^1(\mathcal{R})$ imply

\[
H_1(x, \nabla T(x)) \ge 0.
\]


Weak solutions (in the viscosity sense)

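MAIN IDEA: If $\varphi \in C^1$ and $T - \varphi$ has a maximum at $x$, then

\[
T(x) - \varphi(x) \ge T(y_x(t)) - \varphi(y_x(t)) \qquad \forall t,
\]

thus

\[
\varphi(x) - \varphi(y_x(t)) \le T(x) - T(y_x(t)) \le t \qquad \forall t,
\]

so we can replace $T$ by $\varphi$ in the proof of the proposition and get

\[
H_1(x, \nabla\varphi(x)) \le 0.
\]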


Weak solutions (in the viscosity sense)

Similarly, if $\varphi \in C^1$ and $T - \varphi$ has a minimum at $x$, then

\[
T(x) - \varphi(x) \le T(y_x(t)) - \varphi(y_x(t)) \qquad \forall t,
\]

thus

\[
\varphi(x) - \varphi(y_x(t)) \ge T(x) - T(y_x(t)) \ge t(1 - \varepsilon)
\]

and, by the proof of the proposition,

\[
H_1(x, \nabla\varphi(x)) \ge 0.
\]

Thus, the classical proof also works for $T \notin C^1(\mathcal{R})$, simply by replacing $T$ with a "test function" $\varphi \in C^1(\mathcal{R})$.


Viscosity solutions

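DEFINITION (Crandall-Evans-Lions, 1985)
Let $H : \mathbb{R}^N \times \mathbb{R} \times \mathbb{R}^N \to \mathbb{R}$ be continuous and $\Omega \subseteq \mathbb{R}^N$ open.
We say that $u \in C(\Omega)$ is a viscosity subsolution of

\[
H(x, u, \nabla u) = 0 \qquad \text{in } \Omega
\tag{HJB}
\]

if, $\forall \varphi \in C^1$ and $\forall x$ local maximum point of $u - \varphi$,

\[
H(x, u(x), \nabla\varphi(x)) \le 0.
\]

It is a viscosity supersolution of (HJB) if, $\forall \varphi \in C^1$ and $\forall x$ local minimum point of $u - \varphi$,

\[
H(x, u(x), \nabla\varphi(x)) \ge 0.
\]

A viscosity solution is both a sub- and a supersolution in $\Omega$.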
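A worked example may help to see how the definition selects one a.e. solution. The computation below is an illustration for the eikonal equation $|u'| - 1 = 0$ on $\Omega = \,]-1, 1[$, comparing $u(x) = 1 - |x|$ with the a.e. solution $w(x) = |x| - 1$; both are smooth with slope $\pm 1$ away from $x = 0$, so only the kink has to be tested.

```latex
\begin{aligned}
&u(x) = 1 - |x|: \\
&\quad u - \varphi \text{ has a local max at } 0
  \;\Longrightarrow\; \varphi(x) - \varphi(0) \ge -|x|
  \;\Longrightarrow\; \varphi'(0) \in [-1, 1]
  \;\Longrightarrow\; |\varphi'(0)| - 1 \le 0; \\
&\quad u - \varphi \text{ has a local min at } 0
  \;\Longrightarrow\; \varphi'(0) \le -1 \text{ and } \varphi'(0) \ge 1,
  \text{ impossible, so the supersolution test is vacuous.} \\[4pt]
&w(x) = |x| - 1: \\
&\quad \varphi \equiv -1 \text{ gives } w - \varphi = |x|, \text{ with a local min at } 0,
  \text{ but } |\varphi'(0)| - 1 = -1 < 0 .
\end{aligned}
```

So $u$ is a viscosity solution, while $w$ fails the supersolution condition at $x = 0$ even though it satisfies the equation almost everywhere.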


Viscosity solutions

THEOREM
If $\mathcal{R} \setminus \mathcal{T}$ is open and $T(\cdot)$ is continuous, then $T(\cdot)$ is a viscosity solution of the Hamilton-Jacobi-Bellman equation (HJB).

The proof is the argument before the definition! Note that the definition is local (since the definition must be satisfied only at some points).


Relation with classical solutions

PROPOSITION

1 If $u$ is a classical solution of $H(x, u, \nabla u) = 0$ in $\Omega$, then $u$ is a viscosity solution;

2 if $u$ is a viscosity solution of $H(x, u, \nabla u) = 0$ in $\Omega$ and if $u$ is differentiable at $x_0$, then the equation is satisfied in the classical sense at $x_0$, i.e.

\[
H(x_0, u(x_0), \nabla u(x_0)) = 0.
\]


Uniqueness

Next we want to prove the uniqueness of the solution for the Dirichlet boundary value problem

\[
\begin{cases}
u + H(x, \nabla u) = 0 & \text{in } \Omega,\\
u = g & \text{on } \partial\Omega,
\end{cases}
\tag{BVP}
\]

under assumptions satisfied by the Hamiltonian $H_1$.
$T(\cdot)$ can be recovered from the solution of a boundary value problem as follows.


Uniqueness

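We rescale $T$ by the Kruzkov change of variable

\[
V(x) :=
\begin{cases}
1 - e^{-T(x)} & \text{if } T(x) < +\infty, \text{ i.e. } x \in \mathcal{R},\\
1 & \text{if } T(x) = +\infty \ (x \notin \mathcal{R})
\end{cases}
\;=\; \inf_{a(\cdot) \in \mathcal{A}} J(x, a),
\]

where

\[
J(x, a) := \int_0^{t_x(a)} e^{-t}\, dt.
\]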


Uniqueness

Note that, by definition, we have

\[
\nabla V(x) = e^{-T(x)}\, \nabla T(x)
\]

and $1 - V(x) = e^{-T(x)}$, which implies

\[
\nabla T(x) = \frac{\nabla V(x)}{1 - V(x)}.
\]

Then, substituting into the equation for the minimum time, we get the new equation for $V$:

\[
V(x) + \max_{a \in A} [ -f(x, a) \cdot \nabla V(x) ] = 1.
\]


Solving the free boundary problem

From $V$ we can obtain the minimum time function $T$ and the reachable set $\mathcal{R}$ by

\[
T(x) = -\log(1 - V(x)), \qquad \mathcal{R} = \{ x : V(x) < 1 \}.
\]

This is quite important for solving the free boundary problem as well as for the numerical approximation.
In fact, $V$ takes values in $[0, 1]$, so it is bounded and can be computed without knowing $\mathcal{R}$ in advance.
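The change of variable and its inverse are easy to code. The sketch below is a minimal illustration with the hypothetical names `kruzkov` and `inverse_kruzkov`; it uses `math.expm1` and `math.log1p` for accuracy near $T = 0$ and treats $T = +\infty$ (points outside the reachable set) explicitly.

```python
import math

def kruzkov(T):
    """V = 1 - exp(-T); gives V = 1 when T = +inf."""
    return -math.expm1(-T)

def inverse_kruzkov(V):
    """T = -log(1 - V); gives T = +inf when V = 1, i.e. outside the reachable set."""
    return math.inf if V >= 1.0 else -math.log1p(-V)

for T in (0.0, 0.5, 3.0, math.inf):
    V = kruzkov(T)
    print(f"T = {T}: V = {V:.6f}, recovered T = {inverse_kruzkov(V)}")
```

The reachable set is then read off as the set of points where the computed $V$ is strictly below 1.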


Uniqueness

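Moreover, by the DP Principle, $V$ satisfies

\[
\begin{cases}
V + \max_{a \in A} \{ -\nabla V \cdot f(x, a) \} - 1 = 0 & \text{in } \mathbb{R}^N \setminus \mathcal{T},\\
V = 0 & \text{on } \partial\mathcal{T},
\end{cases}
\tag{BVP-B}
\]

which is a special case of (BVP), with

\[
H(x, u, p) = u + H_1(x, p) \equiv u + \max_{a \in A} \{ -f(x, a) \cdot p \} - 1,
\qquad
\Omega = \mathcal{T}^c \equiv \mathbb{R}^N \setminus \mathcal{T}.
\]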
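To show how (BVP-B) can be approximated in practice, here is a minimal sketch of a semi-Lagrangian (discrete dynamic programming) iteration. The scheme itself is an assumption, since no discretization is stated in the text above: it uses the fixed-point form $V(x) = \min_a e^{-h} V(x + h f(x, a)) + 1 - e^{-h}$ on the one-dimensional example $f(x, a) = a$, $A = [-1, 1]$, target $]-\infty,-1] \cup [1,+\infty[$, with only the three controls $a \in \{-1, 0, 1\}$ and a time step $h$ equal to the grid spacing so that no interpolation is needed. The name `solve_min_time_1d` is hypothetical.

```python
import math

def solve_min_time_1d(n=41, tol=1e-12, max_iter=10_000):
    """Fixed-point iteration for V on a uniform grid of [-1, 1]; returns (xs, V, T)."""
    dx = 2.0 / (n - 1)
    h = dx                           # time step: x_i + h*a is a grid node for a in {-1, 0, 1}
    beta = math.exp(-h)              # discount factor over one time step
    xs = [-1.0 + i * dx for i in range(n)]
    V = [1.0] * n                    # initial guess V = 1 (as if nothing were reachable)
    V[0] = V[-1] = 0.0               # boundary nodes lie on the target: V = 0
    for _ in range(max_iter):
        new = V[:]
        for i in range(1, n - 1):
            # a = -1, 0, +1 send node i to nodes i-1, i, i+1
            new[i] = beta * min(V[i - 1], V[i], V[i + 1]) + 1.0 - beta
        diff = max(abs(p - q) for p, q in zip(new, V))
        V = new
        if diff < tol:
            break
    T = [-math.log(1.0 - v) if v < 1.0 else math.inf for v in V]
    return xs, V, T

xs, V, T = solve_min_time_1d()
err = max(abs(t - (1.0 - abs(x))) for x, t in zip(xs, T))
print(f"max error against T(x) = 1 - |x|: {err:.2e}")
```

In this special setting the discrete characteristics land exactly on grid nodes, so the recovered $T$ matches $1 - |x|$ up to rounding; for general dynamics an interpolation of $V$ at the foot of the characteristic would be needed and the scheme would only be approximate.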


Uniqueness

LEMMA
The Minimum Time Hamiltonian $H_1$ satisfies the "structure condition"

\[
|H(x, p) - H(y, q)| \le K (1 + |x|)\, |p - q| + |q|\, L\, |x - y|
\tag{SH}
\]

for any $x, y, p, q$.

THEOREM (Crandall-Lions, 1984)
Assume $H$ satisfies (SH). Let $u, w \in BUC(\Omega)$ be respectively a subsolution and a supersolution of

\[
v + H(x, \nabla v) = 0 \qquad \text{in } \Omega.
\]

If $u \le w$ on $\partial\Omega$, then $u \le w$ in $\Omega$.


Dynamic Programming for 2-Players

What is the value function for the 2-player game?

WARNING: It is not

\[
\inf_{a(\cdot) \in \mathcal{A}} \ \sup_{b(\cdot) \in \mathcal{B}} J(x, a, b),
\]

because this would give Player a a big advantage in choosing his control, since he would know completely the future response of Player b to any control function $a(\cdot) \in \mathcal{A}$.


Nonanticipating Strategies

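A more reasonable information pattern can be modeled by means of the notion of nonanticipating strategies, introduced by Varaiya, Roxin and Elliott-Kalton.

1st Player

\[
\Delta \equiv \{ \alpha : \mathcal{B} \to \mathcal{A} \ | \ b(t) = \tilde b(t) \ \forall t \le t' \ \text{implies} \ \alpha[b](t) = \alpha[\tilde b](t) \ \forall t \le t' \},
\]

2nd Player

\[
\Gamma \equiv \{ \beta : \mathcal{A} \to \mathcal{B} \ | \ a(t) = \tilde a(t) \ \forall t \le t' \ \text{implies} \ \beta[a](t) = \beta[\tilde a](t) \ \forall t \le t' \}.
\]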
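The nonanticipation requirement can be illustrated in discrete time. In the sketch below, which is only a toy model under stated assumptions, a control is a finite tuple of values in $B = \{-1, 1\}$ (one value per time step) and a strategy maps such a tuple to a tuple of responses; the names `is_nonanticipating`, `mirror` and `peek_ahead` are hypothetical.

```python
from itertools import product

def is_nonanticipating(strategy, controls, horizon):
    """True if, whenever two controls agree up to step k, the responses agree up to step k."""
    for b1 in controls:
        for b2 in controls:
            r1, r2 = strategy(b1), strategy(b2)
            for k in range(horizon):
                if b1[: k + 1] == b2[: k + 1] and r1[: k + 1] != r2[: k + 1]:
                    return False
    return True

horizon = 4
controls = list(product((-1, 1), repeat=horizon))   # all step controls with values in {-1, 1}

def mirror(b):
    """alpha[b](t) = -b(t): depends only on the current value of b."""
    return tuple(-v for v in b)

def peek_ahead(b):
    """alpha[b](t) = b(t + 1): needs the future of b, so it anticipates."""
    return tuple(b[1:]) + (b[-1],)

print("mirror is nonanticipating:", is_nonanticipating(mirror, controls, horizon))
print("peek_ahead is nonanticipating:", is_nonanticipating(peek_ahead, controls, horizon))
```

The first strategy passes the check because its response at each step uses only what the opponent has already played, while the second one fails as soon as two controls share a past but differ in the next step.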


Lower Value of a game

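Now we can define the lower value of the game

\[
T(x) \equiv \inf_{\alpha \in \Delta} \ \sup_{b \in \mathcal{B}} t_x(\alpha[b], b),
\]

or, after the change of variable,

\[
V(x) \equiv \inf_{\alpha \in \Delta} \ \sup_{b \in \mathcal{B}} J(x, \alpha[b], b),
\]

where the payoff is

\[
J(x, a, b) = \int_0^{t_x(a, b)} e^{-t}\, dt.
\]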


Value of a game

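Similarly, the upper value of the game is

\[
\widetilde{T}(x) := \sup_{\beta \in \Gamma} \ \inf_{a \in \mathcal{A}} t_x(a, \beta[a]),
\]

or

\[
\widetilde{V}(x) := \sup_{\beta \in \Gamma} \ \inf_{a \in \mathcal{A}} J(x, a, \beta[a]).
\]

DEFINITION
We say that the game has a value if the upper and lower values coincide, i.e. if $T = \widetilde{T}$ or $V = \widetilde{V}$.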


Dynamic Programming Principle for 2 Players

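LEMMA
For all $0 \le t < T(x)$,

\[
T(x) = \inf_{\alpha \in \Delta} \ \sup_{b \in \mathcal{B}} \{ t + T(y_x(t; \alpha[b], b)) \} \qquad \forall x \in \mathcal{R} \setminus \mathcal{T},
\]

and

\[
V(x) = \inf_{\alpha \in \Delta} \ \sup_{b \in \mathcal{B}} \left\{ \int_0^t e^{-s}\, ds + e^{-t}\, V(y_x(t; \alpha[b], b)) \right\} \qquad \forall x \in \mathcal{T}^c \equiv \mathbb{R}^N \setminus \mathcal{T}.
\]

The proof is similar to the 1-player case but more technical, due to the essential use of nonanticipating strategies.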


Isaacs equation

Isaacs' Lower Hamiltonian

\[
H(x, p) := \min_{b \in B} \ \max_{a \in A} \{ -p \cdot f(x, a, b) \} - 1.
\]

The upper values $\widetilde{T}$ and $\widetilde{V}$ satisfy a similar DP Principle.

Isaacs' Upper Hamiltonian

\[
\widetilde{H}(x, p) := \max_{a \in A} \ \min_{b \in B} \{ -p \cdot f(x, a, b) \} - 1.
\]
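The two Hamiltonians can be compared by brute force on sampled control sets. The sketch below is an illustration under assumptions: the state and the controls are scalar, $A = B = [-1, 1]$ is replaced by a uniform grid, and the two dynamics `separable` and `coupled` are made-up examples, as are the function names.

```python
def lower_hamiltonian(x, p, f, A, B):
    """min over b of max over a of -p*f(x, a, b), minus 1 (scalar state)."""
    return min(max(-p * f(x, a, b) for a in A) for b in B) - 1.0

def upper_hamiltonian(x, p, f, A, B):
    """max over a of min over b of -p*f(x, a, b), minus 1 (scalar state)."""
    return max(min(-p * f(x, a, b) for b in B) for a in A) - 1.0

grid = [k / 10.0 for k in range(-10, 11)]      # samples of A = B = [-1, 1]

def separable(x, a, b):
    return 2.0 * a - b                         # each player acts through its own term

def coupled(x, a, b):
    return -(a - b) ** 2                       # the two controls interact

for name, f in (("separable", separable), ("coupled", coupled)):
    lo = lower_hamiltonian(0.0, 1.0, f, grid, grid)
    up = upper_hamiltonian(0.0, 1.0, f, grid, grid)
    print(f"{name}: min-max Hamiltonian = {lo:+.2f}, max-min Hamiltonian = {up:+.2f}")
```

For the separable dynamics the order of min and max does not matter and the two Hamiltonians coincide (a situation often referred to as the Isaacs condition), whereas for the coupled dynamics the min-max version is strictly larger than the max-min one.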


Isaacs equation

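THEOREM (Evans-Souganidis, 1984)
1. If $\mathcal{R} \setminus \mathcal{T}$ is open and $T(\cdot)$ is continuous, then $T(\cdot)$ is a viscosity solution of

\[
H(x, \nabla T) = 0 \qquad \text{in } \mathcal{R} \setminus \mathcal{T}.
\tag{HJI}
\]

2. If $V(\cdot)$ is continuous, then it is a viscosity solution of

\[
V + H(x, \nabla V) = 0 \qquad \text{in } \mathbb{R}^N \setminus \mathcal{T}.
\]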


Basic References

DETERMINISTIC CONTROL PROBLEMS AND GAMES
M. Bardi, I. Capuzzo Dolcetta, Optimal control and viscosity solutions of Hamilton-Jacobi-Bellman equations, Birkhäuser, 1997.

A. I. Subbotin, Generalized solutions of first-order PDEs, Birkhäuser, Boston, 1995.

OTHER NUMERICAL METHODS
Viability kernel method via Viability Theory (Saint-Pierre, Quincampoix, Cardaliaguet, ...)
Stable bridges methods (Patsko, Kumkov, ...)

STOCHASTIC CONTROL PROBLEMS
W.H. Fleming, H.M. Soner, Controlled Markov processes and viscosity solutions, Springer-Verlag, 1998.
