Appl Math Optim 24:317-330 (1991) Applied Mathematics and Optimization © 1991 Springer-Verlag New York Inc.

On Extremal Solutions to Stochastic Control Problems

Vivek S. Borkar

Department of Electrical Engineering, Indian Institute of Science, Bangalore 560012, India

Communicated by S. Mitter

Abstract. We identify two solutions of a controlled diffusion if the corresponding one-dimensional marginals of the state and control process agree. The extreme points of the set of such equivalence classes are shown to correspond to Markov controls.

Key Words. Optimal stochastic control, Controlled diffusion, Extremal solutions, Markov controls, Marginal classes.

1. Introduction

Recently, Haussmann [6] and El Karoui et al. [4] adapted Krylov's Markov selection procedure [12, Chapter 12] to establish the existence of an optimal Markov control for some classical problems in the control of degenerate diffusions (see also Section IV.1 of [3]). The arguments of [4] and [6] suggest the possibility of proving that solutions of this control system which are extremal in some appropriate sense must be Markov. We prove such a result here after introducing a suitable notion of extremality. A precise statement (Theorem 1.1 below) follows after some preliminaries.

Our controlled diffusion is a $d$-dimensional process $X(\cdot) = [X_1(\cdot), \ldots, X_d(\cdot)]^T$ satisfying the stochastic differential equation

$$X(t) = X_0 + \int_0^t m(X(s), u(s))\,ds + \int_0^t \sigma(X(s))\,dW(s), \tag{1.1}$$


where:

(i) $m(\cdot,\cdot) = [m_1(\cdot,\cdot), \ldots, m_d(\cdot,\cdot)]^T : \mathbb{R}^d \times U \to \mathbb{R}^d$ (with $U$ a prescribed compact metric space) is bounded continuous and Lipschitz in its first argument uniformly with respect to the second.

(ii) $\sigma(\cdot) = [[\sigma_{ij}(\cdot)]]_{i,j=1,\ldots,d} : \mathbb{R}^d \to \mathbb{R}^{d \times d}$ is bounded Lipschitz.

(iii) $X_0$ is a random variable taking values in $\mathbb{R}^d$, with a prescribed law $\pi$.

(iv) $W(\cdot) = [W_1(\cdot), \ldots, W_d(\cdot)]^T$ is a $d$-dimensional standard Wiener process independent of $X_0$.

(v) $u(\cdot)$ is a $U$-valued process with measurable paths satisfying the following "nonanticipativity" condition: for $t \ge s \ge y$, $W(t) - W(s)$ is independent of $u(z)$, $W(z)$, $z \le y$.
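To make the dynamics (1.1) concrete, the following is a minimal numerical sketch, not part of the paper: an Euler-Maruyama discretization in dimension $d = 1$ under a feedback (Markov) control. The function `simulate_controlled_diffusion`, the coefficients `drift` and `diffusion`, the feedback law `feedback`, and the simplification $U = [-1, 1]$ (instead of a space of probability measures) are all illustrative assumptions chosen only to satisfy the boundedness and Lipschitz conditions (i) and (ii).

```python
import math
import random


def simulate_controlled_diffusion(m, sigma, control, x0, horizon, n_steps, rng):
    """Euler-Maruyama path of dX = m(X, u) dt + sigma(X) dW with u(t) = control(X(t), t)."""
    dt = horizon / n_steps
    x = x0
    path = [x0]
    for k in range(n_steps):
        t = k * dt
        u = control(x, t)                    # feedback control: uses the current state and time only
        dw = rng.gauss(0.0, math.sqrt(dt))   # Wiener increment over [t, t + dt]
        x = x + m(x, u) * dt + sigma(x) * dw
        path.append(x)
    return path


# Hypothetical coefficients (assumptions for illustration, not taken from the paper).
def drift(x, u):
    return u * math.tanh(x)          # bounded; Lipschitz in x uniformly in u in [-1, 1]


def diffusion(x):
    return 1.0 + 0.5 * math.sin(x)   # bounded Lipschitz


def feedback(x, t):
    return -1.0 if x > 0 else 1.0    # a measurable map v(x, t) with values in [-1, 1]


path = simulate_controlled_diffusion(
    drift, diffusion, feedback, x0=0.5, horizon=1.0, n_steps=1000, rng=random.Random(0)
)
```

The discretization only approximates a solution of (1.1); it is meant to show how the drift sees the control while the diffusion coefficient does not.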

We use the weak formulation of the stochastic control system (1.1). That is, we require it to hold on some probability space with some Wiener process $W(\cdot)$, as opposed to a prescribed probability space and Wiener process. See Chapter I of [3] for background concerning weak (and strong) formulation. In particular, Theorem 1.2.2 of [3] implies that we may take $u(\cdot)$ to be adapted to the natural filtration of $X(\cdot)$. We also use the relaxed control framework of [5] (see also Chapter I of [3]). This means that we take $U$ to be of the form $P(V)$ for some compact metric space $V$, where $P(X)$ for any Polish space $X$ is the Polish space of probability measures on $X$ with the Prohorov topology. Furthermore, we assume that

$$m_i(x, u) = \int_V \bar m_i(x, y)\,u(dy), \qquad 1 \le i \le d,\ x \in \mathbb{R}^d,\ u \in U,$$

for an $\bar m(\cdot,\cdot) = [\bar m_1(\cdot,\cdot), \ldots, \bar m_d(\cdot,\cdot)]^T : \mathbb{R}^d \times V \to \mathbb{R}^d$ which is bounded continuous and Lipschitz in its first argument uniformly with respect to the second.
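The relaxed drift above is an average of $\bar m(x, \cdot)$ against the control measure $u$. A small sketch, assuming a finitely supported $u$ represented as a dictionary of weights, may help fix ideas. The names `relaxed_drift` and `m_bar`, and the particular choice of $\bar m$, are hypothetical and serve only as an illustration.

```python
from fractions import Fraction


def relaxed_drift(m_bar, x, u):
    """m(x, u) = sum over y of m_bar(x, y) * u({y}) for a finitely supported u in P(V)."""
    if sum(u.values()) != 1:
        raise ValueError("u must be a probability measure on V")
    return sum(weight * m_bar(x, y) for y, weight in u.items())


def m_bar(x, y):
    # Hypothetical bounded drift on V = {-1, 1}, Lipschitz in x uniformly in y.
    return y / (1 + x * x)


x = Fraction(1)
mixed = {-1: Fraction(1, 4), 1: Fraction(3, 4)}   # a relaxed control value in P(V)
dirac = {1: Fraction(1)}                          # an ordinary control value, embedded as a Dirac measure

drift_mixed = relaxed_drift(m_bar, x, mixed)
drift_dirac = relaxed_drift(m_bar, x, dirac)
```

Because the drift is linear in $u$, convex combinations of control measures give convex combinations of drifts, which is what makes the relaxed formulation convenient for convexity arguments.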

The process $u(\cdot)$ above is called an admissible control. Call it a Markov control if $u(\cdot) = v(X(\cdot), \cdot)$ for a measurable map $v: \mathbb{R}^d \times \mathbb{R}^+ \to U$. We topologize the path space of $u(\cdot)$ as follows: For $T > 0$, denote by $B_T$ the closed unit ball of $L_\infty[0, T]$ with the weak topology of $L_2[0, T]$ relativized to it. Then $B_T$ is compact metrizable and therefore Polish. Let $B$ denote the closed unit ball in $L_\infty[0, \infty)$ with the coarsest topology required to render continuous the maps $B \to B_T$, $T \ge 0$, that map $f(\cdot) \in B$ to its restriction to $[0, T]$. Let $\{f_i\}$ be a countable dense set in the unit ball of $C_b(V)$. It is then a convergence determining class for $U$. Let $\alpha_i(\cdot) = \int f_i\,du(\cdot)$, $i \ge 1$. Then $\alpha_i(\cdot) \in B$ for each $i$ and $\alpha(\cdot) = [\alpha_1(\cdot), \alpha_2(\cdot), \ldots] \in B^\infty$. The map $\varphi: \mu \in U \to [\int f_1\,d\mu, \int f_2\,d\mu, \ldots] \in [-1, 1]^\infty$ is a homeomorphism between $U$ and $\varphi(U)$. (This is immediate from the fact that $\varphi$ is continuous one-one and $U$ is compact.) We identify $u(\cdot)$ with $\alpha(\cdot)$ via this homeomorphism and use the notation $u(\cdot)$ to represent either the former $U$-valued process as it originally did, or the latter $B^\infty$-valued random variable. The interpretation will be clear from the context.

Let $C_T = C([0, T], \mathbb{R}^d)$, $C = C([0, \infty), \mathbb{R}^d)$, $S = P(C \times B^\infty)$. Let $\Gamma'_\pi \subset S$ be the set of attainable laws of $(X(\cdot), u(\cdot))$ satisfying (1.1) under all admissible controls with the law of $X_0$ being kept fixed at $\pi$. In Section 3 we prove the following:


Lemma 1.1. $\Gamma'_\pi$ is compact.

Processes $(X(\cdot), u(\cdot))$, $(X'(\cdot), u'(\cdot))$ (with laws denoted respectively by $\mathcal{L}(X(\cdot), u(\cdot))$, $\mathcal{L}(X'(\cdot), u'(\cdot)) \in S$) are said to be marginally equivalent if the laws of $(X(t), u(t))$, $(X'(t), u'(t))$ coincide for each $t \ge 0$. Equivalently, we say that their laws are marginally equivalent. This is clearly an equivalence relation on $\Gamma'_\pi$. Equivalence classes under this relation are called marginal classes. Denote by $\langle (X(\cdot), u(\cdot)) \rangle^\sim$ or, equivalently, $\langle \mathcal{L}(X(\cdot), u(\cdot)) \rangle^\sim$, the marginal class containing the law of $(X(\cdot), u(\cdot))$. In view of Lemma 1.1, the set $\Gamma_\pi$ of marginal classes obtained from $\Gamma'_\pi$ by the above procedure is compact in the quotient topology inherited from $S$. Extreme points of $\Gamma_\pi$ are called extremal classes. A marginal class containing some $\mathcal{L}(X(\cdot), u(\cdot))$ for which $X(\cdot)$ is a Markov process is called a Markov class. Our main result, proved in Section 4, is the following:
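Marginal equivalence compares one-dimensional marginals only, so two laws with different joint structure can fall in the same marginal class. The sketch below is a discrete-time, finite-state toy analogue, not the continuous-time setting of the paper; the helper `one_dimensional_marginals` and both example laws are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def one_dimensional_marginals(path_law):
    """Map a law on finite paths (tuples of states) to the list of its time-t marginals."""
    n_times = len(next(iter(path_law)))
    marginals = [defaultdict(Fraction) for _ in range(n_times)]
    for path, weight in path_law.items():
        for t, state in enumerate(path):
            marginals[t][state] += weight
    return [dict(m) for m in marginals]


quarter, half = Fraction(1, 4), Fraction(1, 2)

# Two-step paths on {0, 1}: independent coordinates versus a path frozen at its initial state.
independent = {(0, 0): quarter, (0, 1): quarter, (1, 0): quarter, (1, 1): quarter}
frozen = {(0, 0): half, (1, 1): half}

same_marginals = one_dimensional_marginals(independent) == one_dimensional_marginals(frozen)
same_law = independent == frozen
```

In this toy case the two laws are marginally equivalent although they are different laws, which is the kind of identification the marginal classes perform.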

Theorem 1.1. Every extremal class is a Markov class.

The paper is organized as follows: Section 2 establishes some technical lemmas concerning extremal measures on a product Polish space with a prescribed marginal. Section 3 proves Lemma 1.1 and related results. The results of these two sections are used to prove Theorem 1.1 in Section 4.

We make much use of disintegration of measures. The reader is referred to [11] for background.

2. Extremal Measures with a Given Marginal

This section is devoted to a characterization of extremal measures with a given marginal on a product of Polish spaces. Let $S_1$, $S_2$ be Polish spaces and let $\mathcal{G}_1$, $\mathcal{G}_2$ be their respective Borel $\sigma$-fields. Let $\mu$ be a probability measure on $(S_1 \times S_2, \mathcal{G}_1 \times \mathcal{G}_2)$ and disintegrate it as

$$\mu(dx, dy) = \nu(dx)\,v(x, dy), \tag{2.1}$$

where $\nu$ is the image of $\mu$ under the projection $(S_1 \times S_2, \mathcal{G}_1 \times \mathcal{G}_2) \to (S_1, \mathcal{G}_1)$ and $x \to v(x, \cdot): (S_1, \mathcal{G}_1) \to P(S_2)$ is the regular conditional law, defined $\nu$-a.s. Let $Q$ be the set of all such $\mu$ for which $\nu$ is a prescribed element of $P(S_1)$. $Q$ is clearly closed and convex. Let $Q_e$ denote the set of its extreme points and let $Q_D \subset Q$ be the set of $\mu$ for which $v(x, \cdot)$ as in (2.1) is a Dirac measure for $\nu$-a.s. $x$. We prove below that $Q_e = Q_D$.

Let $f \in C_b(S_2)$ and $q \in P(S_2)$. Let $b \in \mathbb{R}$ be the unique least number such that

$$q(\{x \mid f(x) < b\}) \le \tfrac{1}{2},$$

$$q(\{x \mid f(x) > b\}) \le \tfrac{1}{2}.$$

Let $A_1 = \{x \mid f(x) < b\}$, $A_2 = \{x \mid f(x) > b\}$, $A_3 = \{x \mid f(x) = b\}$, and let $I_{A_i}$, $i = 1, 2, 3$, be their respective indicators. If $q(A_3) = 0$, define $\alpha_1(q), \alpha_2(q) \in P(S_2)$ by

$$\alpha_1(q) = 2 I_{A_1} q,$$

$$\alpha_2(q) = 2 I_{A_2} q.$$


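If $q(A_3) \ne 0$, let $\delta \in [0, 1]$ be such that

$$q(A_1) + \delta q(A_3) = q(A_2) + (1 - \delta) q(A_3) = \tfrac{1}{2}.$$

Define $\alpha_1(q), \alpha_2(q) \in P(S_2)$ by

$$\alpha_1(q) = 2(I_{A_1} + \delta I_{A_3}) q,$$

$$\alpha_2(q) = 2(I_{A_2} + (1 - \delta) I_{A_3}) q.$$

It is not hard to check that $\alpha_1$, $\alpha_2$ are measurable maps $P(S_2) \to P(S_2)$. Also,

$$q = \tfrac{1}{2}(\alpha_1(q) + \alpha_2(q)). \tag{2.2}$$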
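The splitting (2.2) can be checked by hand on a finitely supported $q$. The following sketch assumes $q$ is given as a dictionary of strictly positive exact weights; the function name `median_split` and the example measures are hypothetical, introduced only to illustrate the construction of $b$, $\delta$, $\alpha_1(q)$, and $\alpha_2(q)$.

```python
from fractions import Fraction


def median_split(q, f):
    """Split a finitely supported probability q into (alpha1, alpha2) with q = (alpha1 + alpha2) / 2."""
    half = Fraction(1, 2)

    def mass(predicate):
        return sum(w for x, w in q.items() if predicate(f(x)))

    # For finite support with positive weights, the least b with
    # q(f < b) <= 1/2 and q(f > b) <= 1/2 is attained at a value of f on the support.
    b = next(
        v for v in sorted({f(x) for x in q})
        if mass(lambda fx: fx < v) <= half and mass(lambda fx: fx > v) <= half
    )
    mass_below = mass(lambda fx: fx < b)     # q(A1)
    mass_at = mass(lambda fx: fx == b)       # q(A3), positive here because b is a support value
    delta = (half - mass_below) / mass_at    # solves q(A1) + delta * q(A3) = 1/2

    alpha1, alpha2 = {}, {}
    for x, w in q.items():
        if f(x) < b:
            alpha1[x] = 2 * w
        elif f(x) > b:
            alpha2[x] = 2 * w
        else:
            if delta > 0:
                alpha1[x] = 2 * delta * w
            if delta < 1:
                alpha2[x] = 2 * (1 - delta) * w
    return alpha1, alpha2


q = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
alpha1, alpha2 = median_split(q, lambda x: x)

# A Dirac measure cannot be split: both halves coincide with it.
dirac_halves = median_split({5: Fraction(1)}, lambda x: x)
```

In the first example $f$ is not $q$-a.s. constant and the two halves differ; in the Dirac example they coincide, which matches the role the splitting plays in Lemma 2.2.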

Lemma 2.1. If $\mu \in Q \setminus Q_D$, there exist $A \in \mathcal{G}_1$ and $f \in C_b(S_2)$ such that $\nu(A) > 0$ and, for all $x \in A$, $f$ is not a constant $v(x)$-a.s.

Proof. Let $\{f_i\}$ be a countable subset of $C_b(S_2)$ which separates points of $S_2$. Suppose that, for $\nu$-a.s. $x$, $f_i$ is $v(x)$-a.s. a constant for all $i$. Then $v(x)$ is a Dirac measure for such $x$, contradicting the fact that $\mu \notin Q_D$. Thus there exists an $A' \in \mathcal{G}_1$ with $\nu(A') > 0$ such that, for all $x \in A'$, $f_i$ is not a constant $v(x)$-a.s. for some $i$. Let $A_i = \{x \in S_1 \mid f_i \text{ is not a constant } v(x)\text{-a.s.}\}$. Then $A' = \bigcup_i A_i$. Since $\nu(A') > 0$, $\nu(A_{i_0}) > 0$ for some $i_0$ and the claim follows with $A = A_{i_0}$. $\square$

Remark. The measurability of the set $A_i$ is easy to check. Note that, for $f \in C_b(S_2)$, the set $\{x \in S_1 \mid f \text{ is a constant } v(x)\text{-a.s.}\}$ equals

$$\bigcap_i \left\{x \in S_1 \;\Big|\; \int f g_i\,dv(x) = \int f\,dv(x) \int g_i\,dv(x)\right\} \tag{$*$}$$

for a countable collection $\{g_i\}$ in $C_b(S_2)$ having the property: $\int g_i\,dm_1 = \int g_i\,dm_2$ for all $i$ implies $m_1 = m_2$, for finite signed measures $m_1$, $m_2$ on $S_2$. The measurability of $(*)$ is clear.

Lemma 2.2. $Q_e \subset Q_D$.

Proof. Let $\mu \in Q_e$. Suppose $\mu \notin Q_D$. Pick $A$, $f$ as in the preceding lemma. Then $\alpha_1(v(x, \cdot))$, $\alpha_2(v(x, \cdot))$ must differ from each other for $x \in A$. Define $\mu_1, \mu_2 \in Q$ by

$$\mu_i(dx, dy) = \nu(dx)\,\alpha_i(v(x, dy)), \qquad i = 1, 2.$$

By (2.2), $\mu = (\mu_1 + \mu_2)/2$. Clearly, $\mu_1 \ne \mu_2$. Thus $\mu$ cannot be in $Q_e$, a contradiction. The claim follows. $\square$

To each $\mu \in Q$, (2.1) associates a $\nu$-a.s. equivalence class of maps $v: S_1 \to P(S_2)$. This association is one-one. We topologize the collection $\mathcal{M}$ of $\nu$-a.s. equivalence classes of maps $S_1 \to P(S_2)$ by imposing the coarsest topology that makes the above mapping between $Q$ and $\mathcal{M}$ a homeomorphism. $\mathcal{M}$ will be endowed with its Borel $\sigma$-field. Also, we write $v(x, \cdot)$ of (2.1) as $v_\mu(x, \cdot)$ to make explicit its $\mu$-dependence.

Let $A$ be a measurable subset of $Q$ and $\xi \in P(Q)$ satisfying $\xi(A) = 1$. Let $\bar\mu \in Q$ be the barycenter of $\xi$ [10]. That is, for any $f \in C_b(S_1 \times S_2)$,

$$\int f(x, y)\,\bar\mu(dx, dy) = \int_A \int_{S_1 \times S_2} f(x, y)\,\mu(dx, dy)\,\xi(d\mu). \tag{2.3}$$

Disintegrate $\bar\mu$ as

$$\bar\mu(dx, dy) = \nu(dx)\,\bar v(x, dy). \tag{2.4}$$

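Lemma 2.3. For $\nu$-a.s. $x$, $\bar v(x, \cdot)$ is the barycenter of a probability measure supported on $\{v_\mu(x, \cdot) \mid \mu \in A\}$.

Proof. By (2.1), (2.3), and (2.4),

$$\begin{aligned}
\bar\mu(dx, dy) &= \nu(dx)\,\bar v(x, dy) \\
&= \int_A \xi(d\mu)\,\mu(dx, dy) \\
&= \int_A \xi(d\mu)\,\nu(dx)\,v_\mu(x, dy) \\
&= \nu(dx) \int_A \xi(d\mu)\,v_\mu(x, dy).
\end{aligned}$$

Thus

$$\int_A \xi(d\mu)\,v_\mu(x, dy) = \bar v(x, dy) \qquad \nu\text{-a.s.},$$

proving the claim. $\square$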
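The content of Lemma 2.3 is that disintegration commutes with taking barycenters when all the measures share the marginal $\nu$. A finite sketch with exact arithmetic illustrates this; the helpers `disintegrate` and `barycenter` and the example measures are hypothetical, and a two-point mixing measure $\xi$ is assumed.

```python
from collections import defaultdict
from fractions import Fraction


def disintegrate(mu):
    """Return (marginal nu on x, kernel with kernel[x][y] = mu(y | x)) for a finite mu on pairs."""
    nu = defaultdict(Fraction)
    for (x, _), w in mu.items():
        nu[x] += w
    kernel = defaultdict(dict)
    for (x, y), w in mu.items():
        kernel[x][y] = w / nu[x]
    return dict(nu), dict(kernel)


def barycenter(measures, weights):
    """Mixture sum_k weights[k] * measures[k] of finitely supported measures."""
    mix = defaultdict(Fraction)
    for mu, xi in zip(measures, weights):
        for point, w in mu.items():
            mix[point] += xi * w
    return dict(mix)


half = Fraction(1, 2)
mu1 = {(0, "a"): half, (1, "a"): half}   # Dirac kernels: 0 -> a, 1 -> a
mu2 = {(0, "b"): half, (1, "a"): half}   # Dirac kernels: 0 -> b, 1 -> a
xi = [Fraction(1, 3), Fraction(2, 3)]    # mixing weights on {mu1, mu2}

nu_bar, v_bar = disintegrate(barycenter([mu1, mu2], xi))
_, v1 = disintegrate(mu1)
_, v2 = disintegrate(mu2)

# Kernel of the barycenter versus barycenter of the kernels, point by point in x.
kernels_agree = all(v_bar[x] == barycenter([v1[x], v2[x]], xi) for x in nu_bar)
```

Here both measures have the uniform marginal on $\{0, 1\}$, and the conditional law of the mixture at each $x$ is the same mixture of the individual conditional laws.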

We conclude this section with a slight extension of Choquet's theorem [10] for later use. In the remainder of this paper, "Choquet's theorem" refers to this result.

Lemma 2.4. Let $X$ be a Polish space and let $D \subset P(X)$ be a closed convex set with $D_e \subset D$ the set of its extreme points. Then every element of $D$ is the barycenter of a probability measure on $D_e$.

Proof. If $D$ is compact, this is immediate from the classical Choquet's theorem. Since $X$ is Polish, it is homeomorphic to a $G_\delta$ subset $Y$ of $[0, 1]^\infty$. We identify $X$ with $Y$ and denote by $\bar X$ the closure of $X$ in $[0, 1]^\infty$. (Note that the corresponding identification between $P(X)$ and $P(Y)$ preserves extreme points of convex sets.) $\bar X$ and, therefore, by Prohorov's theorem, $P(\bar X)$ are compact. View $P(X)$ as a subset of $P(\bar X)$ by identifying each element, say $\mu$, of the former with its unique extension $\bar\mu$ in the latter given by: $\bar\mu$ restricts to $\mu$ on $X$ with $\bar\mu(\bar X \setminus X) = 0$. Let $\bar D$ be the closure of $D$ in $P(\bar X)$ and let $\bar D_e$ be the set of its extreme points. By the classical Choquet's theorem, each $\mu \in D$ is the barycenter of some $\nu \in P(\bar D_e)$. Fix $\mu$, $\nu$. If an element $e$ of $D_e$ is not in $\bar D_e$, it must be a convex combination of two distinct elements of $\bar D$, at least one of which must assign a strictly positive mass to $\bar X \setminus X$. (Otherwise these two would also be in $D$, contradicting the fact that $e$ is in $D_e$ and hence in $D$.) Thus $D_e \subset \bar D_e$. Now if $\nu(\bar D_e \setminus D_e) > 0$, we would also have $\mu(\bar X \setminus X) > 0$ because any element of $\bar D \setminus D$ (and hence of $\bar D_e \setminus D_e$) must assign a strictly positive mass to $\bar X \setminus X$. This is clearly not possible. Thus $\nu(D_e) = 1$ and we are done. $\square$

3. Compactness of $\Gamma'_\pi$

Now we prove Lemma 1.1 and some allied results. Denote by $X_t(\cdot)$, $u_t(\cdot)$, $W_t(\cdot)$ the restrictions of $X(\cdot)$, $u(\cdot)$, $W(\cdot)$ to $[0, t]$.

Proof of Lemma 1.1. Argue as on p. 25 of [2] to conclude that the laws of $X(\cdot)$ over all admissible $u(\cdot)$ with fixed $\pi$ are tight in $P(C)$. Also, our choice of topology for $B^\infty$ makes it compact. Thus $\Gamma'_\pi$ is tight and therefore, by Prohorov's theorem, relatively compact. We only need to show that it is closed. Let $(X^n(\cdot), u^n(\cdot), W^n(\cdot))$, $n \ge 1$, each solve (1.1) on some probability space with the law of $X^n(0)$ kept fixed at $\pi$ for all $n$. Suppose that, for some $(X^\infty(\cdot), W^\infty(\cdot), u^\infty(\cdot))$,

$$(X^n(\cdot), W^n(\cdot), u^n(\cdot)) \to (X^\infty(\cdot), W^\infty(\cdot), u^\infty(\cdot))$$

in law as $C^2 \times B^\infty$-valued random variables. For $t \ge 0$, define

$$Y^n(t) = \int_0^t m(X^n(s), u^n(s))\,ds,$$

$$Z^n(t) = \int_0^t \sigma(X^n(s))\,dW^n(s),$$

$n = 1, 2, \ldots$. It is proved in Section 3 of [8] that the laws of $\{Y^n(\cdot)\}$, $\{Z^n(\cdot)\}$ are tight in $P(C)$. Thus we may drop to a subsequence if necessary and suppose that, as $n \to \infty$,

$$(X^n(\cdot), Y^n(\cdot), Z^n(\cdot), W^n(\cdot), u^n(\cdot)) \to (X^\infty(\cdot), Y^\infty(\cdot), Z^\infty(\cdot), W^\infty(\cdot), u^\infty(\cdot)) \tag{3.1}$$

in law, viewed as $C^4 \times B^\infty$-valued random variables, for some limit processes $X^\infty(\cdot), \ldots, u^\infty(\cdot)$ which we investigate next. Let $\{\mathcal{F}^n_t\}$, $n = 1, 2, \ldots, \infty$, be the natural filtration of (i.e., the increasing family of $\sigma$-fields generated by) the processes $(X^n(\cdot), W^n(\cdot), u^n(\cdot))$, $n = 1, 2, \ldots, \infty$, respectively. Then $(Z^n(t), \mathcal{F}^n_t)$ is a zero mean martingale for $n = 1, 2, \ldots$. By a simple monotone class argument, this is seen to be equivalent to the following statement. For any $t \ge s \ge 0$ and $g \in C_b(C_s \times C_s \times B^\infty_s; \mathbb{R}^d)$,

$$E[(Z^n(t) - Z^n(s))^T g(X^n_s(\cdot), W^n_s(\cdot), u^n_s(\cdot))] = 0, \qquad n = 1, 2, \ldots. \tag{3.2}$$

Now

$$\sup_n E[\|Z^n(t)\|^2] \le Kt < \infty$$

for some $K$ depending on the uniform bound on $\sigma_{ij}(\cdot)$. Thus $Z^n(t)$, $n = 1, 2, \ldots$, are uniformly integrable for each $t$. This allows us to pass to the limit as $n \to \infty$ in (3.2) to obtain (3.2) with $n = \infty$. This proves that $(Z^\infty(t), \mathcal{F}^\infty_t)$ is a zero mean martingale. Similar arguments prove that

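$$\left(Z^\infty(t) Z^\infty(t)^T - \int_0^t \sigma(X^\infty(s))\,\sigma^T(X^\infty(s))\,ds,\ \mathcal{F}^\infty_t\right)$$

and

$$\left(Z^\infty(t) W^\infty(t)^T - \int_0^t \sigma(X^\infty(s))\,ds,\ \mathcal{F}^\infty_t\right)$$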

are $\mathbb{R}^{d \times d}$-valued martingales. From Theorem 5.4 of [9, p. 160] we can deduce that

$$Z^\infty(t) = \int_0^t \sigma(X^\infty(s))\,dW^\infty(s), \qquad t \ge 0.$$

Passing to the limit in (1.1), we have

$$X^\infty(t) = X^\infty(0) + Y^\infty(t) + \int_0^t \sigma(X^\infty(s))\,dW^\infty(s), \qquad t \ge 0.$$

By Skorohod's theorem [7, p. 9], we may assume that all the above processes are defined on a common probability space and (3.1) holds a.s. In view of our assumption of Lipschitz continuity of $m(\cdot, u)$, $u \in U$, uniformly in $u$, we have, as $n \to \infty$,

$$\sup_{t \in [0, T]} \|m(X^n(t), u^n(t)) - m(X^\infty(t), u^n(t))\| \to 0 \quad \text{a.s.} \tag{3.3}$$

for each $T > 0$. Since $u^n_T(\cdot) \to u^\infty_T(\cdot)$ a.s. in $B^\infty_T$, it follows from Lemma 3.2 of [2, p. 26] that, a.s.,

$$\int_0^t m(X^\infty(s), u^n(s))\,ds \to \int_0^t m(X^\infty(s), u^\infty(s))\,ds. \tag{3.4}$$

Combining (3.3) and (3.4) with the fact that $Y^n(\cdot) \to Y^\infty(\cdot)$ a.s. in $C$, we deduce that

$$Y^\infty(t) = \int_0^t m(X^\infty(s), u^\infty(s))\,ds, \qquad t \ge 0.$$

Thus $(X^\infty(\cdot), u^\infty(\cdot), W^\infty(\cdot))$ solve (1.1). The only step that remains to be checked is the admissibility of $u^\infty(\cdot)$. For $t \ge s \ge y$, $W^n(t) - W^n(s)$ is independent of $W^n_y(\cdot)$, $u^n_y(\cdot)$ for $n = 1, 2, \ldots$. Since independence is preserved under convergence in law, it follows that $W^\infty(t) - W^\infty(s)$ is independent of $W^\infty_y(\cdot)$, $u^\infty_y(\cdot)$. Thus $u^\infty(\cdot)$ is admissible. $\square$

Corollary 3.1. $\Gamma_\pi$ is compact.

The next major result we prove is Lemma 3.3 below for which we require some preliminaries. Let $\{f_i\}$ be a countable subset of $C_b(\mathbb{R}^d \times U)$ which separates points of $P(\mathbb{R}^d \times U)$. For $i \ge 1$, $\alpha \in (0, \infty)$, define $F_{\alpha i}: \Gamma'_\pi \to \mathbb{R}$, $\pi \in P(\mathbb{R}^d)$, to be the map

$$\mathcal{L}(X(\cdot), u(\cdot)) \in \Gamma'_\pi \ \to\ E\left[\int_0^\infty e^{-\alpha t} f_i(X(t), u(t))\,dt\right].$$


This map is constant on a marginal class and thus can be viewed as a map $\Gamma_\pi \to \mathbb{R}$.

Lemma 3.1. If $\mu_1, \mu_2 \in \Gamma_\pi$ satisfy $F_{\alpha i}(\mu_1) = F_{\alpha i}(\mu_2)$ for all $i \ge 1$ and rational $\alpha$ in $(0, \infty)$, then $\mu_1 = \mu_2$.

Proof. Let $\mu_i = \langle \mathcal{L}(X^i(\cdot), u^i(\cdot)) \rangle^\sim$, $i = 1, 2$. By Lemma 6.2 of [2, p. 36],

$$E[f_i(X^1(t), u^1(t))] = E[f_i(X^2(t), u^2(t))]$$

for $i \ge 1$ and a.e. $t$. The qualification "a.e." can be dropped by taking a suitable version of the $u^i(\cdot)$'s. (This is permissible in view of our topology on $B^\infty$.) From our choice of $\{f_i\}$, it follows that the laws of $(X^1(t), u^1(t))$, $(X^2(t), u^2(t))$ agree for each $t$. $\square$

Let $(X(\cdot), u(\cdot))$ solve (1.1) with $\mathcal{L}(X(\cdot), u(\cdot)) \in \Gamma'_\pi$. Let $p(x, dy)$ denote a representative of the regular conditional law of $(X(\cdot), u(\cdot))$ given $X(0) = x$. From the martingale formulation of (1.1), it is easy to show (see, e.g., the proof of Lemma 4.3 below) that $p(x, dy)$ may be taken to be in $\Gamma'_{\delta_x}$ for each $x$, $\delta_x$ being the Dirac measure at $x$.

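Lemma 3.2. Suppose that for $x$ in a set of strictly positive $\pi$-measure, $\langle p(x, dy) \rangle^\sim$ is not an extremal class in $\Gamma_{\delta_x}$. Then there exist a measurable set $A \subset \mathbb{R}^d$, an $\varepsilon > 0$, and some $(\alpha, i) \in (0, \infty) \times \mathbb{N}$ such that $\pi(A) > 0$ and, for any $x \in A$, there exist $\nu_{1x}, \nu_{2x} \in \Gamma_{\delta_x}$ such that

$$\langle p(x, dy) \rangle^\sim = (\nu_{1x} + \nu_{2x})/2 \tag{3.5}$$

and

$$|F_{\alpha i}(\nu_{1x}) - F_{\alpha i}(\nu_{2x})| \ge \varepsilon. \tag{3.6}$$

Proof. Let $F: \Gamma_{\delta_x} \times \Gamma_{\delta_x} \to \Gamma_{\delta_x}$ denote the map that maps $(\nu_1, \nu_2)$ into $(\nu_1 + \nu_2)/2$. Let

$$A(i, n, \alpha) = \{(\nu_1, \nu_2) \in \Gamma_{\delta_x} \times \Gamma_{\delta_x} \mid |F_{\alpha i}(\nu_1) - F_{\alpha i}(\nu_2)| > 1/n\},$$

$$A' = \bigcup A(i, n, \alpha),$$

where the union is over all $i, n \ge 1$ and all rational $\alpha \in (0, \infty)$. Suppose that

$$\pi(\{x \mid \langle p(x, dy) \rangle^\sim \in F(A')\}) = 0.$$

That is, for $\pi$-a.s. $x$, the following holds: For all $i, n \ge 1$ and all rational $\alpha > 0$,

$$|F_{\alpha i}(\nu_1) - F_{\alpha i}(\nu_2)| \le 1/n$$

whenever $\nu_1, \nu_2 \in \Gamma_{\delta_x}$ satisfy

$$\langle p(x, dy) \rangle^\sim = (\nu_1 + \nu_2)/2.$$

By Lemma 3.1, this implies $\nu_1 = \nu_2$, implying that $\langle p(x, dy) \rangle^\sim$ is an extremal class. This contradicts our hypothesis. Thus

$$\pi(\{x \mid \langle p(x, dy) \rangle^\sim \in F(A')\}) > 0.$$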


But

$$\{x \mid \langle p(x, dy) \rangle^\sim \in F(A')\} = \bigcup \{x \mid \langle p(x, dy) \rangle^\sim \in F(A(i, n, \alpha))\},$$

where the union is over all $i, n \ge 1$ and rational $\alpha > 0$. Thus for some $i, n \ge 1$, $\alpha > 0$,

$$\pi(\{x \mid \langle p(x, dy) \rangle^\sim \in F(A(i, n, \alpha))\}) > 0.$$

This implies the claim. $\square$

Without any loss of generality we may take $A$ to be bounded. If $A$ is not bounded, we can replace it by its intersection with a ball of sufficiently large radius so that the $\pi$-measure of the intersection is strictly positive. Let $A' \subset S$ be defined by

$$A' = \bigcup_{x \in A} \Gamma'_{\delta_x}.$$

The arguments of Lemma 1.1 can be adapted to prove that $A'$ is compact. (Recall that $A$ is now bounded, hence relatively compact.)

The relation of marginal equivalence, originally defined only on $\Gamma'_\pi$, $\pi \in P(\mathbb{R}^d)$, extends in the obvious manner to the whole of $S$, namely, we identify two elements of $S$ if the corresponding $\mathbb{R}^d \times U$-valued canonical processes have the same one-dimensional marginals. Let $S^\sim$ denote the space of the corresponding equivalence classes with the quotient topology inherited from $S$. Let $\Lambda$ denote the set of elements of $S^\sim$ corresponding to $A' \subset S$. Then $\Lambda$ is compact. Let $F: \Lambda \times \Lambda \to S^\sim$ denote the continuous map defined by $(\nu_1, \nu_2) \to (\nu_1 + \nu_2)/2$ and set

$$G' = \{(\nu_1, \nu_2) \in S^\sim \times S^\sim \mid \text{(3.6) holds with } \nu_1, \nu_2 \text{ replacing } \nu_{1x}, \nu_{2x} \text{ resp.}\}.$$

Lemma 3.3. Let the hypothesis of Lemma 3.2 hold. Then $\langle (X(\cdot), u(\cdot)) \rangle^\sim$ cannot be an extremal class.

Proof. Let $A$, $\varepsilon$, $F_{\alpha i}$ be as in Lemma 3.2 with $A$ closed and bounded. For $x \in A$, let

$$K_x = \{(\nu_1, \nu_2) \in \Gamma_{\delta_x} \times \Gamma_{\delta_x} \mid \text{(3.5), (3.6) hold with } \nu_1, \nu_2 \text{ replacing } \nu_{1x}, \nu_{2x}\}.$$

$K_x$ is nonempty by Lemma 3.2. It is easily seen to be compact (since (3.5), (3.6) are preserved under convergence in $\Gamma_{\delta_x} \times \Gamma_{\delta_x}$, which is compact). Let $G \subset \Lambda \times \Lambda$ be closed and therefore compact in $\Lambda \times \Lambda$. The set

$$\{x \in A \mid K_x \cap G \ne \emptyset\} \tag{3.7}$$

equals

$$\{x \in A \mid \langle p(x, dy) \rangle^\sim \in F(G \cap G')\}. \tag{3.8}$$

$G'$ is clearly closed. Thus $G \cap G'$ and hence $F(G \cap G')$ is compact. Since $x \to p(x, dy)$ and therefore $x \to \langle p(x, dy) \rangle^\sim$ is a measurable map, it follows that (3.8) and hence (3.7) is measurable. Thus the multifunction $x \in A \to K_x \subset \Lambda \times \Lambda$ is measurable in the sense of [13, p. 862], and hence weakly measurable in the sense of [13, p. 862], in view of the remarks on p. 863, paragraph 5, of [13]. By Theorem 4.1 of [13, p. 867], there exists a measurable map $x \to (\nu'_x, \nu''_x) \in K_x$ from $A$ to $\Lambda \times \Lambda$ such that (3.5), (3.6) hold with $\nu'_x$, $\nu''_x$ replacing $\nu_{1x}$, $\nu_{2x}$, respectively. Define $\nu'_x = \nu''_x = \langle p(x, dy) \rangle^\sim$ for $x \notin A$, extending thereby the above map to the whole of $\mathbb{R}^d$. Define $\mu_i \in S$, $i = 1, 2$, by

$$\mu_1(dx, dy) = \pi(dx)\,\hat\nu'_x(dy), \qquad \mu_2(dx, dy) = \pi(dx)\,\hat\nu''_x(dy),$$

where $\hat\nu'_x$, $\hat\nu''_x$ are any representatives of $\nu'_x$, $\nu''_x$, respectively. By (3.8) and the fact that $\pi(A) > 0$, we have $\langle \mu_1 \rangle^\sim \ne \langle \mu_2 \rangle^\sim$. By the definition of $\mu_1$, $\mu_2$, we have

$$\langle (X(\cdot), u(\cdot)) \rangle^\sim = (\langle \mu_1 \rangle^\sim + \langle \mu_2 \rangle^\sim)/2.$$

Since $\nu'_x, \nu''_x \in \Gamma_{\delta_x}$ for each $x$, $\langle \mu_1 \rangle^\sim, \langle \mu_2 \rangle^\sim \in \Gamma_\pi$ and hence $\langle (X(\cdot), u(\cdot)) \rangle^\sim$ is not an extremal class. $\square$

4. Extremal Classes are Markov

We now prove Theorem 1.1 via a sequence of lemmas. Suppose $(X_1(\cdot), u_1(\cdot))$, $(X_2(\cdot), u_2(\cdot))$ satisfy (1.1) on some (possibly distinct) probability spaces.

Lemma 4.1. If the laws of $(X_1(T), u_1(T))$ and $(X_2(0), u_2(0))$ agree for some $T > 0$, we can construct on some probability space a process $(X(\cdot), u(\cdot))$ satisfying (1.1) such that the law of $(X(t), u(t))$ agrees with the law of $(X_1(t), u_1(t))$ for $t \in [0, T]$ and with that of $(X_2(t - T), u_2(t - T))$ for $t > T$.

Proof. Let $\Omega = C \times B^\infty$ and let $\mathcal{F}$ be its Borel $\sigma$-field. Define a probability measure $P$ on $(\Omega, \mathcal{F})$ as follows: Let $x \to \mu(x, \cdot): \mathbb{R}^d \times U \to P(\Omega)$ denote any one representative of the regular conditional law of $(X_2(\cdot), u_2(\cdot))$ given $(X_2(0), u_2(0))$. For $0 \le t_1 < t_2 \le \cdots \le t_i \le T \le t_{i+1} \le \cdots \le t_n < \infty$ for $n \ge 1$, $A_j \subset \mathbb{R}^d$ Borel sets for $1 \le j \le n$, $B_j \subset U$ Borel sets for $1 \le j \le n$, define

$$\begin{aligned}
&P(\{(x(\cdot), y(\cdot)) \in C \times B^\infty \mid x(t_j) \in A_j,\ y(t_j) \in B_j \text{ for } 1 \le j \le n\}) \\
&\quad = E\big[I\{X_1(t_j) \in A_j,\ u_1(t_j) \in B_j,\ 1 \le j \le i\}\ \mu\big((X_1(T), u_1(T)), \\
&\qquad\quad \{(x(\cdot), y(\cdot)) \in C \times B^\infty \mid x(t_j - T) \in A_j,\ y(t_j - T) \in B_j,\ i < j \le n\}\big)\big].
\end{aligned}$$

Define the canonical process $(X(\cdot), u(\cdot))$ on $(\Omega, \mathcal{F}, P)$ by

$$X((x(\cdot), y(\cdot)))(t) = x(t), \qquad u((x(\cdot), y(\cdot)))(t) = y(t), \qquad t \ge 0.$$

Then $(X(\cdot), u(\cdot))$ has the desired properties by construction. $\square$

Remark. Note that the claim concerns only one-dimensional marginals and not higher-dimensional marginals for which it would be false. This is precisely why we operate with marginal classes $\langle (X(\cdot), u(\cdot)) \rangle^\sim$ instead of the laws $\mathcal{L}(X(\cdot), u(\cdot))$ throughout.
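The pasting construction of Lemma 4.1 has a simple discrete-time, finite-state analogue, sketched below as a toy illustration rather than a rendering of the proof. The helpers `paste` and `one_dimensional_marginals` and the two example laws are hypothetical; the sketch assumes the terminal law of the first path law equals the initial law of the second.

```python
from collections import defaultdict
from fractions import Fraction


def one_dimensional_marginals(path_law):
    """Map a law on finite paths (tuples of states) to the list of its time-t marginals."""
    n_times = len(next(iter(path_law)))
    marginals = [defaultdict(Fraction) for _ in range(n_times)]
    for path, weight in path_law.items():
        for t, state in enumerate(path):
            marginals[t][state] += weight
    return [dict(m) for m in marginals]


def paste(law1, law2):
    """Follow law1, then continue with law2 conditioned on its initial state matching law1's end."""
    init2 = defaultdict(Fraction)
    for path, w in law2.items():
        init2[path[0]] += w
    pasted = defaultdict(Fraction)
    for p1, w1 in law1.items():
        for p2, w2 in law2.items():
            if p2[0] == p1[-1]:
                # w2 / init2[...] is the conditional law of the second path given its start.
                pasted[p1 + p2[1:]] += w1 * w2 / init2[p2[0]]
    return dict(pasted)


quarter, half = Fraction(1, 4), Fraction(1, 2)
law1 = {(0, 0): half, (1, 1): half}                                          # frozen path
law2 = {(0, 0): quarter, (0, 1): quarter, (1, 0): quarter, (1, 1): quarter}  # independent steps

pasted = paste(law1, law2)
pasted_marginals = one_dimensional_marginals(pasted)
```

In this toy case the pasted law reproduces the one-dimensional marginals of `law1` up to the pasting time and those of `law2` afterwards, which is the only property the lemma asserts.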

Corollary 4.1. Let $\langle (X(\cdot), u(\cdot)) \rangle^\sim$ be an extremal class in $\Gamma_\pi$ with the law of $X(T)$ being $\eta$ for a prescribed $T > 0$. Then $\langle (X(T + \cdot), u(T + \cdot)) \rangle^\sim$ is an extremal class in $\Gamma_\eta$.


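Proof. Suppose not. Then there exist $a \in (0, 1)$ and $\langle (X_i(\cdot), u_i(\cdot)) \rangle^\sim \in \Gamma_\eta$, $i = 1, 2$, such that

$$\langle (X_1(\cdot), u_1(\cdot)) \rangle^\sim \ne \langle (X_2(\cdot), u_2(\cdot)) \rangle^\sim$$

and

$$\langle (X(T + \cdot), u(T + \cdot)) \rangle^\sim = a \langle (X_1(\cdot), u_1(\cdot)) \rangle^\sim + (1 - a) \langle (X_2(\cdot), u_2(\cdot)) \rangle^\sim.$$

Use the above lemma to construct on some probability spaces processes $(\tilde X_i(\cdot), \tilde u_i(\cdot))$, $i = 1, 2$, satisfying (1.1) such that the law of $(\tilde X_i(t), \tilde u_i(t))$ agrees with the law of $(X(t), u(t))$ for $0 \le t \le T$ and with that of $(X_i(t - T), u_i(t - T))$ for $t > T$. Then

$$\langle (\tilde X_1(\cdot), \tilde u_1(\cdot)) \rangle^\sim \ne \langle (\tilde X_2(\cdot), \tilde u_2(\cdot)) \rangle^\sim \quad \text{in } \Gamma_\pi$$

and

$$\langle (X(\cdot), u(\cdot)) \rangle^\sim = a \langle (\tilde X_1(\cdot), \tilde u_1(\cdot)) \rangle^\sim + (1 - a) \langle (\tilde X_2(\cdot), \tilde u_2(\cdot)) \rangle^\sim,$$

contradicting the extremality of $\langle (X(\cdot), u(\cdot)) \rangle^\sim$. The claim follows. $\square$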

Let $\langle (X(\cdot), u(\cdot)) \rangle^\sim$ be an extremal class in $\Gamma_\pi$. Fix $T > 0$. Let $\mu_0 \in P(C_T)$ be the law of $X_T(\cdot)$. Let $f: C_T \to \mathbb{R}^d \times C_T$ denote the map $x(\cdot) \to (x(T), x(\cdot))$ and let $\bar\mu$ denote the image of $\mu_0$ under $f$. Let $\mathcal{C}$ denote the set of measurable maps $\mathbb{R}^d \to C_T$ satisfying the condition: $\phi \in \mathcal{C}$ implies that, for $x \in \mathbb{R}^d$, $\phi(x)$ evaluated at $T$ equals $x$. Let $\nu_0$ denote the law of $X(T)$ and let $M \subset P(C_T)$ be the set of probability measures obtainable as the image of $\nu_0$ under some map belonging to $\mathcal{C}$.

Lemma 4.2. $\bar\mu$ (resp. $\mu_0$) is the barycenter of a probability measure supported on $f_*(M) = \{\mu \mid \mu \text{ is the image of some element of } M \text{ under } f\}$ (resp. $M$).

Proof. In the set-up of Lemma 2.2, let $S_1 = \mathbb{R}^d$, let $S_2 = C_T$, and let $\mathcal{G}_1$, $\mathcal{G}_2$ be the respective Borel $\sigma$-fields. By Lemmas 2.2 and 2.4, it follows that $\bar\mu$ is the barycenter of a probability measure $\xi$ on

$$A = \{\mu \in P(\mathbb{R}^d \times C_T) \mid \nu = \nu_0 \text{ and } v(x) \text{ is a Dirac measure for } \nu_0\text{-a.s. } x\},$$

where $\nu$, $v(\cdot)$ are as in (2.1). Let

$$A' = f_*(P(C_T)) = \{\mu \in P(\mathbb{R}^d \times C_T) \mid \mu \text{ is the image of some element of } P(C_T) \text{ under } f\}.$$

If $\xi(A') < 1$, $\xi((A')^c) > 0$ and hence $\bar\mu((\mathbb{R}^d \times C_T) \setminus f(C_T)) > 0$, a contradiction. Hence $\xi$ is supported on $A'$ and therefore on $A \cap A'$ which is easily seen to be identical to $f_*(M)$. This proves the first claim. Now $f$ is a bijection between $C_T$ and $f(C_T)$. Thus the map $f_*$ which maps the elements of $P(C_T)$ into their images under $f$ is a bijection $P(C_T) \leftrightarrow P(f(C_T))$. Let $\xi_0$ denote the image of $\xi$ under $f_*^{-1}$. Then $\xi_0$ is supported on $M$ and $\mu_0$ is the barycenter of $\xi_0$, proving the second claim. $\square$

Let $(x, y) \in \mathbb{R}^d \times C_T \to q((x, y), dz) \in S$ be any version of the regular conditional law of $(X(T + \cdot), u(T + \cdot))$ given $(X(T), X_T(\cdot))$. Thus the law of

$$(X(T), X_T(\cdot), (X(T + \cdot), u(T + \cdot)))$$

is $\Theta \in P(\mathbb{R}^d \times C_T \times (C \times B^\infty)) \triangleq Q$ given by

$$\Theta(dx, dy, dz) = \bar\mu(dx, dy)\,q((x, y), dz) \tag{4.1}$$

with $x$, $y$, $z$ denoting typical elements of $\mathbb{R}^d$, $C_T$, $C \times B^\infty$, respectively. Let $H \subset Q$ be the set of measures $\gamma$ of the form

$$\gamma(dx, dy, dz) = \nu_0(dx)\,\delta_{\phi(x)}(dy)\,q((x, y), dz) \tag{4.2}$$

for some $\phi \in \mathcal{C}$, $\delta_{\phi(x)}$ being the Dirac measure supported on $\phi(x)$. By Lemma 4.2, $\Theta$ is the barycenter of a probability measure $\xi_1$ on $H$.

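Let $C_0(\mathbb{R}^n)$, $C_0^2(\mathbb{R}^n)$, $n \ge 1$, denote respectively the Banach space of continuous maps $f: \mathbb{R}^n \to \mathbb{R}$ and twice continuously differentiable maps $f: \mathbb{R}^n \to \mathbb{R}$ which (along with the first- and second-order derivatives in the latter case) vanish at infinity. The norms are $\|f\| = \sup_x |f(x)|$ and, for $x = [x_1, \ldots, x_n]^T \in \mathbb{R}^n$,

$$\|f\| + \sum_i \sup_x \left|\frac{\partial f}{\partial x_i}(x)\right| + \sum_{i,j} \sup_x \left|\frac{\partial^2 f}{\partial x_i \partial x_j}(x)\right|,$$

resp. For $f \in C_0^2(\mathbb{R}^d)$, define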

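$$(Lf)(x, u) = \sum_i m_i(x, u) \frac{\partial f}{\partial x_i}(x) + \frac{1}{2} \sum_{i,j,k} \sigma_{ik}(x)\,\sigma_{jk}(x) \frac{\partial^2 f}{\partial x_i \partial x_j}(x).$$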

Lemma 4.3. For $\xi_1$-a.s. $\gamma$, the probability measure $\beta(dz) \in S$ defined by

$$\int f(z)\,\beta(dz) = \int\!\!\int \nu_0(dx)\,q((x, \phi(x)), dz)\,f(z), \qquad f \in C_b(C \times B^\infty), \tag{4.3}$$

is in $\Gamma'_{\nu_0}$.

Proof. Let $m \ge 1$, $t \ge s \ge t_m \ge t_{m-1} \ge \cdots \ge t_1 \ge T$, $f \in C_0^2(\mathbb{R}^d)$, and

$$g \in C_0((\mathbb{R}^d)^m \times B^\infty_{s-T}).$$

Then for $\xi_1$-a.s. $\gamma$,

$$E[\,\cdots \mid X_T(\cdot) = \phi(X(T))\,] = 0 \quad \text{a.s.} \tag{4.4}$$

Let $N$ be a set of zero probability outside which (4.4) holds for all $m \ge 1$, all rational $t \ge s \ge t_m \ge \cdots \ge t_1 \ge T$, all $f$ and $g$ in a countable dense set of $C_0^2(\mathbb{R}^d)$, $C_0((\mathbb{R}^d)^m \times B^\infty_{s-T})$ respectively and hence for all $t \ge s \ge t_m \ge \cdots \ge t_1 \ge T$,

$$f \in C_0^2(\mathbb{R}^d)$$

and

$$g \in C_0((\mathbb{R}^d)^m \times B^\infty_{s-T}).$$

Then from the definition of $q((\cdot, \cdot), \cdot)$, a standard martingale argument shows that outside $N$, the measure $q((X(T), \phi(X(T))), dz)$ in $S$ satisfies the following: Define the canonical process $(Z(\cdot), \zeta(\cdot))$ on $C \times B^\infty$ by

$$Z((x, y))(t) = x(t), \qquad \zeta((x, y))(t) = y(t), \qquad t \ge 0,\ (x, y) \in C \times B^\infty.$$

Then under $q((X(T), \phi(X(T))), dz)$, the law of $(Z(\cdot), \zeta(\cdot))$ is simply the law of an $(X'(\cdot), u'(\cdot))$ satisfying (1.1) with $X'(0) = X(T)$ and $u'(\cdot)$ an admissible control. Hence for $\nu_0$-a.s. $x$, $q((x, \phi(x)), dz)$ satisfies: under $q((x, \phi(x)), dz)$, $(Z(\cdot), \zeta(\cdot))$ as above have the law of some $(X'(\cdot), u'(\cdot))$ satisfying (1.1) with $X'(0) = x$. Thus $\beta(dz)$ is the law of some $(X'(\cdot), u'(\cdot))$ satisfying (1.1) with the law of $X'(0) = \nu_0$. $\square$

Note that in passing we have also proved that $q((x, \phi(x)), dz) \in \Gamma'_{\delta_x}$. Both this and the above lemma are implicitly used in the proof of the next lemma.

Denote the function $\phi \in \mathcal{C}$ appearing in (4.2) as $\phi_\gamma$ and the measure $\beta$ defined correspondingly by (4.3) as $\beta_\gamma$ so as to make explicit their $\gamma$-dependence. Recall that $\Theta$ is the barycenter of a probability measure $\xi_1$ on $H$.

The map $\gamma \to \delta_{\phi_\gamma(x)}(\cdot)$ is to be understood in the sense of the discussion immediately following Lemma 2.2. In particular, it is measurable.

Lemma 4.4. For $x$ outside a set of $\nu_0$-measure zero, $\langle q((x, \phi_\gamma(x)), dz) \rangle^\sim$ is the same for $\xi_1$-a.s. $\gamma$.

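Proof. The law $\hat\mu$ of $(X(T + \cdot), u(T + \cdot))$, given by

$$\hat\mu(dz) = \int_{\mathbb{R}^d \times C_T} \bar\mu(dx, dy)\,q((x, y), dz),$$

satisfies: $\langle \hat\mu \rangle^\sim$ is an extremal class in $\Gamma_{\nu_0}$ (see Corollary 4.1). Disintegrate $\hat\mu$ as

$$\nu_0(dx)\,p(x, dz),$$

where $p(x, dz)$ is a representative of the regular conditional law of $(X(T + \cdot), u(T + \cdot))$ given $X(T) = x$. By Lemma 3.3, $\langle p(x, dz) \rangle^\sim$ is an extremal class in $\Gamma_{\delta_x}$ for $\nu_0$-a.s. $x$. Now $\Theta$ is the barycenter of a probability measure, namely $\xi_1$, on $H$. Thus $\hat\mu$ is the barycenter of a probability measure on $\{\beta_\gamma \mid \gamma \in H\}$. By Lemma 2.3, for $\nu_0$-a.s. $x$, $p(x, dz)$ is the barycenter of a probability measure on

$$\{q((x, \phi_\gamma(x)), dz) \mid \gamma \in H\}$$

and, in turn, $\langle p(x, dz) \rangle^\sim$ is the barycenter of a probability measure on $\{\langle q((x, \phi_\gamma(x)), dz) \rangle^\sim \mid \gamma \in H\}$. For $x$ outside the set of zero $\nu_0$-measure outside which the above holds and $\langle p(x, dz) \rangle^\sim$ is extremal in $\Gamma_{\delta_x}$, we must have

$$\langle p(x, dz) \rangle^\sim = \langle q((x, \phi_\gamma(x)), dz) \rangle^\sim \tag{4.5}$$

for $\xi_1$-a.s. $\gamma$, proving the claim. $\square$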

Proof of Theorem 1.1. It suffices to show that $X(\cdot)$ above is a Markov process and $u(\cdot)$ is a Markov control. We only show the former from which the latter follows as on p. 184 of [6]. Fix $t > 0$ and let $\bar p(x, dz)$, $\bar q((x, y), dz) \in P(\mathbb{R}^d)$ denote the images of $p(x, dz)$, $q((x, y), dz)$ resp. under the map $(x(\cdot), y(\cdot)) \in C \times B^\infty \to x(t) \in \mathbb{R}^d$.


Recalling (4.1), (4.2), we conclude that the law of $(X(T), X_T(\cdot), X(T + t))$ is

$$\begin{aligned}
\bar\mu(dx, dy)\,\bar q((x, y), dz)
&= \int \xi_1(d\gamma)\,\nu_0(dx)\,\delta_{\phi_\gamma(x)}(dy)\,\bar q((x, y), dz) \\
&= \int \xi_1(d\gamma)\,\nu_0(dx)\,\delta_{\phi_\gamma(x)}(dy)\,\bar q((x, \phi_\gamma(x)), dz) \\
&= \nu_0(dx)\,\eta(x, dy)\,\bar p(x, dz)
\end{aligned} \tag{4.6}$$

by (4.5) where

$$\eta(x, dy) \triangleq \int \xi_1(d\gamma)\,\delta_{\phi_\gamma(x)}(dy).$$

From (4.6), it is clear that $X(T + t)$ and $X_T(\cdot)$ are conditionally independent given $X(T)$. Given the arbitrary choice of $T$, $t$, the claim follows. $\square$

We have proved, in fact, a stronger result than Theorem 1.1, namely, that any representative of an extremal class is Markov. We conjecture that an extremal class is a singleton, i.e., it contains exactly one $(X(\cdot), u(\cdot))$.

References

1. Borkar, V. S., A remark on the attainable distributions of controlled diffusions, Stochastics 18 (1986), 17-23.

2. Borkar, V. S., The probabilistic structure of controlled diffusion processes, Acta Appl. Math. 11 (1988), 19-48.

3. Borkar, V. S., Optimal Control of Diffusion Processes, Research Notes in Mathematics, No. 203, Longman, Harlow, 1989.

4. El Karoui, N., Huu Nguyen, D., Jeanblanc-Pique, M., Compactification methods in the control of degenerate diffusions: existence of an optimal control, Stochastics 20(3) (1987), 169-219.

5. Fleming, W. H., Generalized solutions in optimal stochastic control, in E. Roxin, P. T. Liu, and R. L. Sternberg (eds.), Differential Games and Control Theory III, Marcel Dekker, New York, 1977, pp. 147-165.

6. Haussmann, U. G., Existence of optimal Markovian controls for degenerate diffusions, in N. Christopeit, K. Helmes, and M. Kohlmann (eds.), Stochastic Differential Systems, Lecture Notes in Control and Information Sciences, Vol. 78, Springer-Verlag, Berlin, 1986, pp. 171-186.

7. Ikeda, N., Watanabe, S., Stochastic Differential Equations and Diffusion Processes, North-Holland, Amsterdam, Kodansha, Tokyo, 1981.

8. Kushner, H. J., Existence results for optimal stochastic control, J. Optim. Theory Appl. 15 (1975), 347-359.

9. Liptser, R. S., Shiryayev, A. N., Statistics of Random Processes, I: General Theory, Springer-Verlag, New York, 1977.

10. Phelps, R., Lectures on Choquet's Theorem, Van Nostrand, Princeton, NJ, 1966.

11. Schwartz, L., Lectures on Disintegration of Measures, Notes by S. Ramaswamy, Tata Institute of Fundamental Research, Bombay, 1976.

12. Stroock, D. W., Varadhan, S. R. S., Multidimensional Diffusion Processes, Springer-Verlag, New York, 1979.

13. Wagner, D., Survey of measurable selection theorems, SIAM J. Control Optim. 15 (1977), 859-903.

Accepted 5 December 1990