
Appl Math Optim 18:67-83 (1988) Applied Mathematics and Optimization © 1988 Springer-Verlag New York Inc.

Controlled Diffusions with Boundary-Crossing Costs*

Vivek S. Borkar**

Tata Institute of Fundamental Research, Bangalore Center, P.O. Box 1234, Bangalore 560012, India

Communicated by W. Fleming

Abstract. This paper considers control of nondegenerate diffusions in a bounded domain with a cost associated with the boundary-crossings of a subdomain. Existence of optimal Markov controls and a verification theorem are established.

1. Introduction

In the classical treatment of controlled diffusion processes, we typically consider a cost which is the expected value of a "nice" functional of the trajectory of the controlled process. This functional is often the time integral up to a stopping time of a "running cost" function on the state space [1], [5]. This paper considers a situation where, loosely speaking, the running cost is a Schwartz distribution rather than a function. The specific case we consider has a natural interpretation as the cost ("toll") associated with the boundary crossings of a prescribed region.

The precise formulation of the problem is as follows: let U be a compact metric space and X(·) an R^n-valued controlled diffusion on some probability space described by

X(t) = x + \int_0^t m(X(s), u(s))\,ds + \int_0^t \sigma(X(s))\,dW(s) \qquad (1.1)

* Research supported by ARO Contract No. DAAG29-84-K-0005 and AFOSR 85-0227.
** Current address: Laboratory for Information and Decision Systems, Building 35, M.I.T., Cambridge, MA 02139, USA.


for t ≥ 0, where

(i) m(·,·) = [m_1(·,·), ..., m_n(·,·)]^T: R^n × U → R^n is bounded continuous and Lipschitz in its first argument uniformly with respect to the second,

(ii) σ(·) = [[σ_ij(·)]]: R^n → R^{n×n} is bounded, C^2, and satisfies the uniform ellipticity condition

\|\sigma(z) y\|^2 \ge \lambda \|y\|^2 \quad \text{for all } z, y \in R^n, \text{ for some } \lambda > 0,

(iii) W(·) = [W_1(·), ..., W_n(·)]^T is an R^n-valued standard Wiener process,
(iv) u(·) is a U-valued process with measurable sample paths and satisfies the nonanticipativity condition: for t ≥ s ≥ y, W(t) − W(s) is independent of u(y).

Call such a u(·) an admissible control. Call it a Markov control if u(·) = v(X(·)) for some measurable v: R^n → U. In this case, it is well known that (1.1) has a strong solution which is a Markov process [7]. In particular, this implies that Markov controls are admissible. We shall also refer to the map v itself as a Markov control by abuse of terminology.

Let B, D be bounded open sets in R^n with C^2 boundaries ∂B, ∂D, respectively, such that B̄ ⊂ D and x ∈ D\∂B. Let τ = inf{t ≥ 0 | X(t) ∉ D}. Let M(D̄) denote the space of finite nonnegative measures on D̄ with the weakest topology needed to make the maps η → ∫ f dη continuous for f ∈ C(D̄). For x ∈ D̄, define ν_x ∈ M(D̄) by

\int f\, d\nu_x = E\left[\int_0^\tau f(X(t))\,dt\right], \qquad f \in C(\bar D).

Note that ν_x depends on u(·). From the Krylov inequality [5, Section 2.2], it follows that ν_x is absolutely continuous with respect to the Lebesgue measure on D̄ and thus has a density g(x, ·), defined a.e. with respect to the Lebesgue measure. Following standard p.d.e. terminology, we shall call ν_x, g(x, ·) the Green measure and the Green function, respectively. Later we shall show that g(x, ·) is continuous on D\{x}. Let h be a finite signed measure on ∂B, the latter being endowed with the Borel σ-field corresponding to its relative topology. Define the cost associated with the control u(·) as

J_x(u(\cdot)) = \int_{\partial B} g(x, y)\, h(dy). \qquad (1.2)

The control problem is to minimize this over all admissible u(·). For nonnegative h, (1.2) has the heuristic interpretation of the total toll paid whenever X(·) hits ∂B before it exits from D̄.
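To make the toll interpretation concrete, the following sketch (purely illustrative and not part of the analysis; the domain D, the subdomain B, the coefficients, the feedback control, and the mollified toll below are all hypothetical choices) estimates a mollified version of (1.2) by Euler-Maruyama simulation of (1.1), replacing h by a smooth toll concentrated near ∂B in the spirit of the approximation used in Section 4.

```python
# Illustrative sketch only: Monte Carlo estimation of a mollified version of the
# boundary-crossing cost (1.2).  The domain D (ball of radius 2), the subdomain B
# (unit ball), the drift m, the diffusion matrix sigma, the feedback control v,
# and the mollified toll h_eps are hypothetical choices made for illustration.
import numpy as np

rng = np.random.default_rng(0)
dt = 1e-3          # Euler-Maruyama time step
eps = 0.05         # width of the mollified toll around the boundary of B

def in_D(x):                       # D = open ball of radius 2
    return np.linalg.norm(x) < 2.0

def h_eps(x):                      # smooth toll concentrated near dB = unit sphere
    d = abs(np.linalg.norm(x) - 1.0)
    return max(0.0, 1.0 - d / eps) / eps

def v(x):                          # a Markov (feedback) control: push radially outward
    return x / (np.linalg.norm(x) + 1e-9)

def m(x, u):                       # drift m(x, u); here simply the control action
    return u

def sigma(x):                      # diffusion coefficient (uniformly elliptic)
    return np.eye(2)

def run_once(x0):
    """Simulate (1.1) under v until exit from D, accumulating the mollified toll."""
    x, cost, steps = np.array(x0, dtype=float), 0.0, 0
    while in_D(x) and steps < 10**6:          # step cap as a numerical safeguard
        dW = rng.normal(scale=np.sqrt(dt), size=2)
        x = x + m(x, v(x)) * dt + sigma(x) @ dW
        cost += h_eps(x) * dt
        steps += 1
    return cost

x0 = [0.2, 0.0]
estimate = np.mean([run_once(x0) for _ in range(100)])
print("Monte Carlo estimate of the mollified cost J_x:", estimate)
```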

Remark. The restriction x ∉ ∂B simplifies the presentation considerably and is therefore retained. It could be relaxed by imposing suitable conditions on h, the nature of which will become apparent as we proceed.

The main results of this paper are as follows:

(i) There exists a Markov control v which is optimal for all initial conditions x ∈ D\∂B.


(ii) This v is a.e. characterized by a verification theorem involving the value function V: D̄\∂B → R mapping x into inf_{u(·)} J_x(u(·)), in analogy with the classical situation.

For technical reasons we use the relaxed control framework, i.e., we assume that U is the space of probability measures on a compact metric space S with the Prohorov topology and m is of the form

m(y, u) = \int_S b(y, s)\, u(ds) \qquad \text{(termwise integration)}

for some b(·,·) = [b_1(·,·), ..., b_n(·,·)]^T: R^n × S → R^n which is bounded continuous and Lipschitz in its first argument uniformly with respect to the second. This restriction will eventually be dropped.
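As a simple illustration of the relaxed framework (not needed in the sequel): if u ∈ U puts mass λ on s_1 and 1 − λ on s_2 for some s_1, s_2 ∈ S and λ ∈ [0, 1], i.e., u = λδ_{s_1} + (1 − λ)δ_{s_2}, then

m(y, u) = \lambda\, b(y, s_1) + (1 - \lambda)\, b(y, s_2).

Thus relaxed controls convexify the set of available drift vectors at each state, while ordinary S-valued controls correspond to U-valued controls taking values in the Dirac measures on S (cf. Theorem 5.1(iii), which shows that nothing is lost by this enlargement).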

In the next section we establish a compactness result for Green measures. Section 3 derives a corresponding result for Green functions and deduces the existence of an optimal Markov control v for a given initial condition x. Section 4 studies the basic properties of the value function. Section 5 uses these to prove a verification theorem for v which shows, among other things, that v is optimal for any x.

The motivation for this problem comes from the following situation: suppose the controlled process represents a dynamical system (engineering, economic, ...) which is "handled" by two different bureaus A and A', each having a complete observation of the state, but with the proviso that A handles the process when it is in D\B and A' handles it when it is in B. There is no "handling charge" (which, in any case, could be easily accommodated in the present set-up by including an additional "classical" cost). However, there is a charge associated with each transfer of management from A to A' and vice versa, the charge being a function only of the state at the time of transfer and not of the direction of transfer. The term "handling" above is deliberately left nebulous and can be given a variety of interpretations.

2. The Green Measures

The results of this section allow us to restrict our attention to the class of Markov controls and establish a key compactness result for the set of attainable ν_x. We start with some technical preliminaries.

For the purposes of the following two lemmas, we allow the initial condition of (1.1) to be a random variable X_0 (i.e., X(0) = X_0 a.s.) independent of W(·).

Lemma 2.1. For any T > 0, there exists a δ ∈ (0, 1) such that

I\{X_0 \in \bar D\}\, I\{\tau > s\}\, P(\tau > s + T \mid X(y), u(y),\ y \le s) \le \delta \quad \text{a.s.}

under any choice of X_0, u(·), s.

Proof. We need consider only the case P(X_0 ∈ D̄, τ > s) > 0. Let (Ω, F, P) be the underlying probability space. Let Ω̂ = Ω ∩ {X_0 ∈ D̄} ∩ {τ > s}, F̂ = F relativized to Ω̂, X̂_0 = X(s), X̂(·) = X(s + ·), and û(·) = u(s + ·). Instead of the control system described by (X(·), X_0, u(·)) on (Ω, F, P), we could look at (X̂(·), X̂_0, û(·)) on (Ω̂, F̂, P̂), where P̂(A) = P(A)/P(Ω̂) for A ∈ F̂. Thus we may take s = 0. By a simple conditioning argument, it also suffices to consider X_0 = x_0 for some x_0 ∈ D̄. If the claim is false, we can find a sequence of processes X^n(·), n = 1, 2, ..., satisfying (1.1) on some probability space, with x, u(·) replaced by some x_n, u_n(·), respectively, such that if τ^n = inf{t ≥ 0 | X^n(t) ∉ D̄}, then P(τ^n > T) ↑ 1. Using the arguments of [6], we may pick a subsequence of {n}, denoted {n} again, so that x_n → x_∞ for some x_∞ ∈ D̄ and there exists a process X^∞(·) satisfying (1.1) on some probability space with x = x_∞ and u(·) = some admissible control u_∞(·), such that X^n(·) → X^∞(·) in law as C([0, ∞); R^n)-valued random variables. By Skorohod's theorem we may assume that this convergence is a.s. on some common probability space. Let τ^∞ = inf{t ≥ 0 | X^∞(t) ∉ D̄} and σ^∞ = inf{t ≥ 0 | X^∞(t) ∉ D}. From simple geometric considerations we can see that, for each sample point, any limit point of {τ^n} in [0, ∞] must lie between σ^∞ and τ^∞. Under our hypotheses on ∂D and σ, σ^∞ = τ^∞ a.s. Hence τ^n → τ^∞ a.s. Thus

1 = \limsup_n P(\tau^n \ge T) \le P(\tau^\infty \ge T),

implying P(τ^∞ ≥ T) = 1. Thus X^∞(T/2) ∈ D̄ a.s., which we know to be false under our conditions on m, σ. The claim follows by contradiction. □

Lemma 2.2. There exist constants a, b ∈ (0, ∞) such that

P(\tau > t) \le a \exp(-bt)

under any x, u(·). In particular,

E[\exp(a'\tau)] < \infty

uniformly in x, u(·), for some a′ > 0.

Proof. Let T > 0. Then, for n = 1, 2, ...,

P(\tau > nT) = E\big[ E[I\{\tau > nT\} \mid X(y), u(y),\ y \le (n-1)T]\, I\{\tau > (n-1)T\} \big] \le \delta\, P(\tau > (n-1)T)

by the above lemma. Iterating the argument,

P(\tau > nT) \le \delta^n.

The rest is easy.
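To spell out the last step (one admissible choice of constants): for t ≥ 0, take n = ⌊t/T⌋ in the bound above, so that

P(\tau > t) \le P(\tau > nT) \le \delta^{n} \le \frac{1}{\delta} \exp\!\left(-\frac{|\log \delta|}{T}\, t\right),

and the lemma holds with a = 1/δ, b = |log δ|/T; the exponential moment bound then follows, uniformly in x, u(·), for any a′ ∈ (0, b).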

□

Remark. An alternative argument can be given along the lines of pp. 145-146 of [3].

We now state and prove the first main result of this section, which is in the spirit of [2]. Let μ_x denote the probability measure on ∂D defined by

\int f\, d\mu_x = E[f(X(\tau))], \qquad f \in C(\partial D).


Theorem 2.1. For each admissible control u(·), there exists a Markov control which yields the same ν_x and μ_x.

Proof. By Lemma 2.2, E[τ] < ∞. Define a probability measure η on D̄ × S by

\int f(y, s)\,\eta(dy, ds) = E\left[\int_0^\tau \int_S f(X(t), s)\, u(t)(ds)\, dt\right] \Big/ E[\tau], \qquad f \in C(\bar D \times S).

Disintegrate η as

\eta(dy, ds) = \eta_1(dy)\, \eta_2(y)(ds),

where η_1 is the image of η under the projection D̄ × S → D̄ and η_2: D̄ → U is the regular conditional law, defined η_1-a.s. Pick any representative of η_2. Then u′(·) = η_2(X′(·)) defines a Markov control, X′(·) being the solution to (1.1) under u′(·). We shall show that u(·), u′(·) lead to the same ν_x, μ_x.

For y = [y_1, ..., y_n]^T ∈ D, u ∈ U, f ∈ H^2_loc(D), define

(Lf)(y, u) = \sum_{i=1}^{n} m_i(y, u)\, \frac{\partial f}{\partial y_i}(y) + \frac{1}{2} \sum_{i,j,k=1}^{n} \sigma_{ik}(y)\, \sigma_{jk}(y)\, \frac{\partial^2 f}{\partial y_i\, \partial y_j}(y).
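In other words, L is the controlled generator of (1.1): for a fixed action u ∈ U and smooth, compactly supported f, Krylov's extension of the Ito formula [5, Section 2.10] gives

E[f(X(t))] - f(x) = E\left[\int_0^t (Lf)(X(s), u)\,ds\right]

for the solution X(·) of (1.1) run with the constant control u; this is the relation used repeatedly below.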

Let ψ: D̄ → R be smooth and φ: D̄ → R the map that maps x into

E\left[\int_0^{\tau'} \psi(X'(t))\,dt\right],

where τ′ = inf{t ≥ 0 | X′(t) ∉ D̄}. (Recall that X′(0) = x.) Then φ is the unique solution in C(D̄) ∩ H^2_loc(D) to

-(L\varphi)(y, \eta_2(y)) = \psi(y) \ \text{in } D, \qquad \varphi = 0 \ \text{on } \partial D. \qquad (2.1)

(That (2.1) has a unique solution in the given class of functions follows from Theorem 8.30, p. 196, of [4]. That this solution coincides with our definition of φ is an easy consequence of Krylov's extension of the Ito formula as in Section 2.10 of [5].) Consider the process

Y(t) = \varphi(X(t)) + \int_0^t \psi(X(s))\,ds, \qquad t \ge 0.

Another straightforward application of Krylov's extension of the Ito formula yields (see, e.g., p. 122 of [5])

E[Y(\tau)] - E[Y(0)] = E\left[\int_0^\tau \big((L\varphi)(X(t), u(t)) + \psi(X(t))\big)\,dt\right]. \qquad (2.2)

Note that the first equality in (2.1) holds a.e. with respect to the Lebesgue measure. Since ν_x is absolutely continuous with respect to the Lebesgue measure, it holds ν_x-a.s. Hence the right-hand side of (2.2) equals

E\left[\int_0^\tau \nabla\varphi(X(t)) \cdot \big(m(X(t), u(t)) - m(X(t), u'(t))\big)\,dt\right],


which is zero by our definition of u′(·). Thus E[Y(τ)] = E[Y(0)], i.e.,

E\left[\int_0^\tau \psi(X(t))\,dt\right] = E\left[\int_0^{\tau'} \psi(X'(t))\,dt\right].

Since the choice of ψ was arbitrary, it follows that u(·), u′(·) yield the same ν_x. The corresponding claim for μ_x is proved in Theorem 1.2 of [2]. □

The second main result of this section combines the foregoing ideas with those of [6].

Theorem 2.2. The set of the pairs (ν_x, μ_x) as x varies over D̄ and u(·) varies over all Markov controls (equivalently, all admissible controls) is sequentially compact.

Proof. In view of the preceding theorem, it suffices to consider the case of arbitrary admissible controls. Let X^n(·) be a sequence of processes satisfying (1.1) on some probability space with X^n(0) = x_n, u(·) = u_n(·) for some x_n ∈ D̄ and admissible controls u_n(·), n = 1, 2, .... As in the proof of Lemma 2.1, we can arrange to have these defined on a common probability space such that x_n → x_∞ ∈ D̄ and X^n(·) → X^∞(·) a.s. in C([0, ∞); R^n), where X^∞(·) satisfies (1.1) with x replaced by x_∞ and u(·) by some admissible control u_∞(·). Defining τ^n, n = 1, 2, ..., ∞, as in Lemma 2.1, we have τ^n → τ^∞ a.s. Thus, for f ∈ C(D̄),

\int_0^{\tau^n} f(X^n(t))\,dt \to \int_0^{\tau^\infty} f(X^\infty(t))\,dt \quad \text{a.s.,}

f(X^n(\tau^n)) \to f(X^\infty(\tau^\infty)) \quad \text{a.s.}

By Lemma 2.2 we can take expectations in the above to conclude. □

3. Existence of Optimal Markov Controls

This section establishes a compactness result for the Green functions which immediately leads to the existence of an optimal Markov control. We start with several preliminary lemmas.

Let x ∈ D̄ and v a Markov control. As in Section 2.6 of [5], we construct a family of R^n-valued diffusions X^ε, 0 < ε ≤ 1, with X^ε(0) = x for all ε, having drift coefficients m^ε: R^n → R^n and diffusion coefficients σ^ε: R^n → R^{n×n}, respectively, such that

(i) m^ε, σ^ε are smooth and bounded with the same bounds as m, σ, respectively,

(ii) ‖σ^ε(z)y‖^2 ≥ λ‖y‖^2 for all y, z ∈ R^n, with the same λ as in Section 1,
(iii) X^ε(·) → X(·) in law as ε ↓ 0, X(·) being the solution to (1.1) under the Markov control v.

Let ν^ε_x, g^ε(x, ·) denote the Green measure and the Green function, respectively, corresponding to X^ε(·), and ν_x, g(x, ·) those for X(·). Then the arguments of the preceding section can be used to show that

\nu^\varepsilon_x \to \nu_x \ \text{in } M(\bar D) \ \text{as } \varepsilon \to 0. \qquad (3.1)


Lemma 3.1. Given any open set A such that Ā ⊂ D\{x}, there exist an α > 0 and a K ∈ (0, ∞) such that

|g(x, y) - g(x, z)| \le K \|y - z\|^\alpha, \qquad y, z \in A,

under any choice of v.

Proof. Consider a fixed v to start with. Let L_ε denote the extended generator of X^ε(·) and L^*_ε its formal adjoint. Then

L^*_\varepsilon\, g^\varepsilon(x, \cdot) = \delta_x(\cdot)

in the sense of distributions, where δ_x(·) is the Dirac measure at x. Let O be an open neighborhood of x such that Ā ⊂ D\O. Using hypoellipticity of L^*_ε, standard p.d.e. theory tells us that g^ε(x, ·) is a smooth solution of

L^*_\varepsilon\, g^\varepsilon(x, \cdot) = 0

on D\O, extending continuously to D̄\O. By Theorem 8.29, p. 195, of [4], it follows that there exist α > 0, K > 0 such that, for y, z ∈ D̄\O,

|g^\varepsilon(x, z) - g^\varepsilon(x, y)| \le K \|z - y\|^\alpha \big( \|g^\varepsilon(x, \cdot)\|_{L^2(D\setminus O)} + 1 \big).

Letting g^ε(·) = g^ε(x, ·) and ‖·‖_p = ‖·‖_{L^p(D\O)} by abuse of terminology, we have, from the above,

|g^\varepsilon(y)| \le K_1 \big( \|g^\varepsilon\|_2 + 1 \big), \qquad y \in D\setminus O,

for some K_1 > 0. Thus

\|g^\varepsilon\|_2^2 \le K_1 \big( \|g^\varepsilon\|_2 + 1 \big) \|g^\varepsilon\|_1

and thus, for some K_2 > 0,

\big( \|g^\varepsilon\|_2 + 1 \big)^2 \le K_2 \big( \|g^\varepsilon\|_2 + 1 \big)\big( \|g^\varepsilon\|_1 + 1 \big),

implying

\|g^\varepsilon\|_2 \le K_3 \big( \|g^\varepsilon\|_1 + 1 \big)

for some K_3 > 0. Thus, for any q ≥ 1,

|g^\varepsilon(x, z) - g^\varepsilon(x, y)| \le K_4 \|y - z\|^\alpha \big( \|g^\varepsilon\|_q + 1 \big), \qquad y, z \in D\setminus O,

for some K_4 > 0 depending on q. Now for any smooth f supported in D\O and τ(ε) = inf{t ≥ 0 | X^ε(t) ∉ D},

\left| \int g^\varepsilon(x, y) f(y)\, dy \right| = \left| E\left[ \int_0^{\tau(\varepsilon)} f(X^\varepsilon(t))\, dt \right] \right| \le K_5 \|f\|_p

by Theorem 4, p. 54, of [5] for some p ≥ n, K_5 > 0 depending only on D, O, the bounds on m, σ, and the constant λ. Letting q be the conjugate exponent of p (i.e., p^{-1} + q^{-1} = 1), we have

\|g^\varepsilon\|_q \le K_5.

Hence, for K_6 = (K_5 + 1) K_4,

|g^\varepsilon(x, y) - g^\varepsilon(x, z)| \le K_6 \|y - z\|^\alpha, \qquad y, z \in A. \qquad (3.2)


Note that this estimate holds uniformly in ε. Fix z ∈ A. If {g^ε(x, z), 0 < ε ≤ 1} is unbounded, there exists a sequence {ε(n)} in (0, 1] such that

g~(")(x, z)'~oo.

By (3.2) it follows that

g~(")( x, . )'~ oo

uniformly on A. Thus we have

E[\tau(\varepsilon(n))] = \int_D g^{\varepsilon(n)}(x, y)\, dy \uparrow \infty.

Recalling the definition of X^ε(·) from Section 2.6 of [5], and using an argument similar to that of Lemma 2.2, we can show that E[τ(ε)] is bounded uniformly in ε, giving a contradiction. Hence {g^ε(x, z), 0 < ε ≤ 1} is bounded. By the Arzelà-Ascoli theorem, g^ε(x, ·), 0 < ε ≤ 1, is relatively compact in C(D\{x}) with the topology of uniform convergence on compacts. Pick a sequence {ε(n)} in (0, 1] such that ε(n) ↓ 0 and let ĝ(x, ·) be a limit point in C(D\{x}) of {g^{ε(n)}(x, ·)}. Then, for any f ∈ C(D̄) with support in D\{x},

I f(y)g~(")(x, y) dy-> f f(y)g(x, y) dy.

From (3.1) it follows that ĝ(x, ·) = g(x, ·). Letting ε → 0 in (3.2), the claim follows for the given v. That it holds uniformly for all v is clear from the fact that α, K depend only on A, λ, and the bounds on m, σ. □

Corollary 3.1. The set of g(x, ·) as u(·) varies over all Markov (equivalently, all admissible) controls is compact in C(D\{x}).

Proof. Relative compactness of this set follows as above. That any limit point of it is also a Green function for some Markov control can be proved by using the argument of the last part of the proof of Lemma 3.1 in conjunction with Theorems 2.1 and 2.2. □

Theorem 3.1. An optimal Markov control exists.

Proof. Let {u_n(·)} be Markov controls such that

J_x(u_n(\cdot)) \downarrow \inf J_x(u(\cdot)),

where the infimum is over all Markov (equivalently, all admissible) controls. Let {g_n(x, ·)} be the corresponding Green functions. Let u(·) be a Markov control with g(x, ·) the corresponding Green function, such that g_n(x, ·) → g(x, ·) in C(D\{x}) along a subsequence. Thus g_n(x, ·) → g(x, ·) uniformly on ∂B along this subsequence. The optimality of u(·) follows easily from this. □


Let u(·) above be of the form v(X(·)). The above theorem does not tell us whether the same v would be optimal for any choice of x. This issue is settled in Section 5 using the verification theorem, which also allows us to drop the relaxed control framework. As a preparation for that, we derive some regularity properties of the value function V in the next section.

4. Regularity of the Value Function

Recall the definition of the value function V from Section 1.

Lemma 4.1. V is continuous on D̄\∂B.

Proof. Let x(n) → x(∞) in D̄\∂B. For n = 1, 2, ..., let u_n(·) be the optimal Markov control when the initial condition is x(n) and g_n(x(n), ·) the corresponding Green function. By arguments similar to those of the preceding section, we can arrange that (by dropping to a subsequence if necessary) g_n(x(n), ·) → g_∞(x(∞), ·) uniformly on compact subsets of D that are disjoint from {x(n), n = 1, 2, ..., ∞} (in fact, disjoint from x(∞) will do), where g_∞(x(∞), ·) is the Green function for some Markov control u_∞(·) when the initial condition is x(∞). It follows that

J_{x(n)}(u_n(\cdot)) \to J_{x(\infty)}(u_\infty(\cdot)). \qquad (4.1)

Let v be any Markov control. Then

J_{x(n)}(v) \to J_{x(\infty)}(v)

by the Feller property. Since

J_{x(n)}(v) \ge J_{x(n)}(u_n(\cdot)),

we have

J_{x(\infty)}(v) \ge J_{x(\infty)}(u_\infty(\cdot)).

Hence u_∞(·) is optimal for the initial condition x(∞). Then (4.1) becomes

V(x(n)) \to V(x(\infty)). \qquad \square

Let α_ε, 0 < ε ≤ 1, be a family of compactly supported mollifiers and define h_ε: R^n → R by h_ε(y) = ∫ α_ε(y − z) h(dz). Then {h_ε} are smooth with compact supports decreasing to ∂B as ε ↓ 0 (and hence can be assumed to be contained in D for all ε). Also, h_ε → h as ε → 0 in the sense of distributions. Thus

h_\varepsilon(y)\, dy \to h(dy) \quad \text{as } \varepsilon \to 0

as measures in M(D̄). Pick ε(n) ↓ 0 in (0, 1) and denote h_{ε(n)} by h_n by abuse of notation. Define

V_n(x) = \inf E\left[\int_0^\tau h_n(X(t))\,dt\right], \qquad n = 1, 2, \ldots,


the infimum being over all admissible controls. By the results of Chapter IV, Section 3, of [1], this infimum is attained by some Markov control u_n(·) = v_n(X(·)). Letting {g_n(x, ·)} be the corresponding Green functions, we have

V_n(x) = \int g_n(x, y)\, h_n(y)\, dy, \qquad n = 1, 2, \ldots.
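For definiteness, one admissible choice for the mollifiers α_ε above is the standard bump family

\alpha_\varepsilon(y) = \varepsilon^{-n}\, \alpha(y/\varepsilon), \qquad \alpha(y) = C \exp\!\left(\frac{-1}{1 - \|y\|^2}\right) I\{\|y\| < 1\},

with C > 0 chosen so that ∫ α(y) dy = 1. Then h_ε = α_ε * h is smooth, its support is contained in the closed ε-neighbourhood of ∂B (hence in D for all sufficiently small ε), and ∫ f(y) h_ε(y) dy → ∫ f dh for every bounded continuous f, which is the convergence used above.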

Lemma 4.2. V_n → V uniformly on compact subsets of D\∂B.

Proof. Let K ⊂ D\∂B be compact and Q ⊂ D an open neighborhood of ∂B such that Q̄ ∩ K = ∅. By familiar arguments we conclude that g_n(x, ·), n = 1, 2, ..., x ∈ K, is equicontinuous and pointwise bounded on Q. Fix x ∈ K. Again familiar arguments show that any subsequence of {n} has a further subsequence, say {n(k)}, along which

g_{n(k)}(x, \cdot) \to g_\infty(x, \cdot)

uniformly on Q, where g_∞(x, ·) is the Green function corresponding to some Markov control u_∞(·). Without any loss of generality we may assume that the supports of {h_n} are contained in Q. Then, for m = n(k), k = 1, 2, ...,

\left| V_m(x) - \int g_\infty(x, y)\, h(dy) \right| \le \left| \int g_\infty(x, y)\, h_m(y)\, dy - \int g_\infty(x, y)\, h(dy) \right| + \sup_{y \in Q} |g_m(x, y) - g_\infty(x, y)|\, |h|(\partial B).

Hence V_{n(k)}(x) → ∫ g_∞(x, y) h(dy). Let u(·) be an arbitrary Markov control and g(x, ·) the corresponding Green function. Then

\int g(x, y)\, h_n(y)\, dy \to \int g(x, y)\, h(dy).

Since

\int g(x, y)\, h_n(y)\, dy \ge V_n(x),

we have

\int g(x, y)\, h(dy) \ge \int g_\infty(x, y)\, h(dy).

Hence ∫ g_∞(x, y) h(dy) = V(x) and V_n(x) → V(x). Note that, for each n, V_n satisfies L_n V_n = 0 on D\supp(h_n), where L_n is the extended generator of the Markov process corresponding to u_n(·) [1]. Arguments similar to those of Lemma 3.1 can now be employed to show that the V_n(·) are equicontinuous in a neighborhood of K. It follows that the convergence of V_n to V is uniform on K. □


Let A ⊂ D be open with a C^2 boundary ∂A that does not intersect ∂B, and define ξ = inf{t ≥ 0 | X(t) ∉ A} for X(·) as in (1.1) with x ∈ A. Define a measure η_x on Ā by

\int f\, d\eta_x = E\left[\int_0^\xi f(X(t))\,dt\right], \qquad f \in C(\bar A).

We shall briefly digress to insert a technical lemma whose full import is needed only in the next section.

Lemma 4.3. η_x is mutually absolutely continuous with respect to the Lebesgue measure on Ā.

Proof. Absolute continuity of η_x with respect to the Lebesgue measure follows from the Krylov inequality [5, Section 4.6]. To show the converse, first note that, by Theorem 2.1 with A replacing D, it suffices to consider u(·) Markov. Let q(x, ·) denote the density of η_x with respect to the Lebesgue measure. Then q(x, ·) ≥ 0 on A\{x}. Our claim follows if we show that the strict inequality holds. Suppose that, for some y ∈ A\{x}, q(x, y) = 0. Let L_0 denote the extended generator of the Markov process under consideration and L*_0 its formal adjoint. Then L*_0 q(x, ·) = 0 in A\{x}. By the Harnack inequality for uniformly elliptic operators [4, Section 8.8], it follows that q(x, z) = 0 for z in any compact subset of A\{x} containing y. (Though we do not have the a priori regularity of q(x, ·) on A\{x} required to invoke this result, we can go via ε-approximations as in the proof of Lemma 3.1 and use the uniform convergence on compacts thereof.) Thus q(x, ·) = 0 on A\{x}, leading to

\int_A q(x, y)\, dy = E[\xi \mid X(0) = x] = 0.

This contradicts the fact that ξ > 0 a.s. for x ∈ A. The claim follows. □

Let A, x be as above and u_n(·), g_n(x, ·), n = 1, 2, ..., u(·), g(x, ·) be as in Lemma 4.2. Define q_n(x, ·), n = 1, 2, ..., and q(x, ·) correspondingly.

Lemma 4.4. For x ∈ A\∂B,

V(x) = \inf \left[ \int_A q(x, y)\, h(dy) + E[V(X(\xi))] \right] \qquad (4.2)

where the infimum is over all admissible controls. In particular, if ∂B ⊂ D\Ā, this reduces to

V(x) = \inf E[V(X(\xi))]. \qquad (4.3)


Proof. Without any loss of generality, we may assume that the supports of {h_n} are contained in the same connected component of D\∂A as ∂B. Let X^n(·), n = 1, 2, ..., be the solutions to (1.1) under u_n(·), n = 1, 2, ..., respectively, and ξ^n = inf{t ≥ 0 | X^n(t) ∉ Ā}. By the result of Chapter IV, Section 4.3, of [1],

V_n(x) = \int_A q_n(x, y)\, h_n(y)\, dy + E[V_n(X^n(\xi^n))].

As in Theorem 2.2, we can have a process X^∞(·) starting at x and controlled by some Markov control u_∞(·) such that, for ξ^∞ = inf{t ≥ 0 | X^∞(t) ∉ Ā} and f_1 ∈ C(Ā), f_2 ∈ C(∂A),

E\left[\int_0^{\xi^n} f_1(X^n(t))\,dt\right] \to E\left[\int_0^{\xi^\infty} f_1(X^\infty(t))\,dt\right], \qquad E[f_2(X^n(\xi^n))] \to E[f_2(X^\infty(\xi^\infty))].

Define q_∞(x, ·) correspondingly. Arguments similar to Lemma 3.1 show that q_n(x, ·) → q_∞(x, ·) uniformly on compact subsets of A\{x}. Thus

\int_A q_n(x, y)\, h_n(y)\, dy \to \int_A q_\infty(x, y)\, h(dy).

By the conclusion concerning {μ_x} in Theorem 2.2 (with A replacing D) and Lemma 4.2 above,

E[V_n(X^n(\xi^n))] \to E[V(X^\infty(\xi^\infty))].

Thus

V(x) = \int_A q_\infty(x, y)\, h(dy) + E[V(X^\infty(\xi^\infty))].

The results of Chapter IV, Section 4.3, of [1] also imply that if X(·) is the solution to (1.1) under u(·), then

V_n(x) \le \int_A q(x, y)\, h_n(y)\, dy + E[V_n(X(\xi))].

Taking limits,

V(x) \le \int_A q(x, y)\, h(dy) + E[V(X(\xi))].

The claim follows. □

Theorem 4.1. V ∈ H^2_loc(D\∂B) and satisfies

\inf_u\, (LV)(x, u) = 0 \quad \text{a.e. on } D\setminus\partial B. \qquad (4.4)

Proof. Equation (4.3) above implies that V restricted to any A in D\∂B satisfying ∂B ⊂ D\Ā is the value function for the control problem on A with E[V(X(ξ))] as the cost. The claim follows from Chapter IV, Section 2.2, of [1]. □


5. A Verification Theorem

We shall now derive an analog of the classical verification theorem that allows us to improve on Theorem 3.1. Let u(·) = v(X(·)) be a Markov control which is optimal for the initial condition x, X(·) being the corresponding solution to (1.1).

Lemma 5.1.

(LV)(x, v(x)) = 0 \quad \text{a.e. in } D\setminus\partial B. \qquad (5.1)

Proof. Let A be as in the proof of Theorem 4.1. Let x ∈ A and define γ_x ∈ M(D̄) by

\int f\, d\gamma_x = E\left[\int_\xi^\tau f(X(t))\,dt\right] \qquad (5.2)

for f ∈ C(D̄). Then γ_x has a density p(x, ·) with respect to the Lebesgue measure which coincides with g(x, ·) on D\Ā. For any bounded continuous f supported in D\Ā,

\int g(x, y) f(y)\, dy = \int p(x, y) f(y)\, dy = E\left[\int g(X(\xi), y) f(y)\, dy\right]

by virtue of (5.2) and the strong Markov property. Letting f = h_n, n = 1, 2, ..., successively in the above and taking limits,

\int g(x, y)\, h(dy) = E\left[\int g(X(\xi), y)\, h(dy)\right].

Thus V(x) ≥ E[V(X(ξ))]. By (4.3), V(x) = E[V(X(ξ))]. By Krylov's extension of the Ito formula [5, Section 2.10], it follows that

\int (LV)(y, v(y))\, \eta_x(dy) = 0.

By Theorem 4.1 and Lemma 4.3,

(LV)(y, v(y)) \ge 0 \quad \eta_x\text{-a.e. on } A.

Hence

(LV)(y, v(y)) = 0 \quad \eta_x\text{-a.e.}

on A and hence Lebesgue-a.e. by Lemma 4.3. □

A variation on the above theme yields the following.

Lemma 5.2. If a Markov control v is optimal for some initial condition x ∈ D\∂B, it is also optimal for any other initial condition in D\∂B.


Proof. Let A, x be as in Lemma 4.4 with A connected and v an optimal Markov control for X(0) = x. Define g(x, ·), q(x, ·) correspondingly. Then

\int g(x, y)\, h_n(y)\, dy = E\left[\int_0^\tau h_n(X(t))\,dt\right]
= E\left[\int_0^\xi h_n(X(t))\,dt\right] + E\left[\int_\xi^\tau h_n(X(t))\,dt\right]
= \int_A q(x, y)\, h_n(y)\, dy + E\left[\int g(X(\xi), y)\, h_n(y)\, dy\right]

by the strong Markov property. Letting n → ∞,

V(x) = \int_A q(x, y)\, h(dy) + E\left[\int g(X(\xi), y)\, h(dy)\right]
\ge \int_A q(x, y)\, h(dy) + E[V(X(\xi))].

By (4.2), equality must hold. Hence

\int g(X(\xi), y)\, h(dy) = V(X(\xi)) \quad \text{a.s.} \qquad (5.3)

Note that the maps z → V(z) and z → ∫ g(z, y) h(dy) for z ∈ ∂A are continuous. Since the support of X(ξ) is the whole of ∂A (this would follow, e.g., from the Stroock-Varadhan support theorem), this along with (5.3) implies that

V(z) = \int g(z, y)\, h(dy) \qquad \text{for } z \in \partial A,

i.e., v is also optimal for the initial conditions z ∈ ∂A. Since A can be chosen so that ∂A contains any prescribed point of D\∂B, the claim follows. □

This allows us to prove the following converse to Lemma 5.1.

Lemma 5.3. A Markov control v is optimal if (5.1) holds.

Proof. Fix x ∈ D\∂B. Let A_1, A_2 be open sets in D\∂B with C^2 boundaries ∂A_1, ∂A_2, respectively, such that x ∈ A_1 ⊂ Ā_1 ⊂ A_2 and ∂B ⊂ D\Ā_2. Let v_1 be an optimal Markov control. Let v_2(·) = v(·) on A_1 and = v_1(·) elsewhere. Let X(·) be the process starting at x and controlled by v_2. Define the stopping times

\tau_0 = 0,
\tau_1 = \inf\{t \ge 0 \mid X(t) \notin A_2\},
\tau_{2n} = (\inf\{t \ge \tau_{2n-1} \mid X(t) \in A_1\}) \wedge \tau,
\tau_{2n+1} = (\inf\{t \ge \tau_{2n} \mid X(t) \notin A_2\}) \wedge \tau


for n = 1, 2, .... Then τ_n ↑ τ a.s. Define measures β_{2n} on Ā_2 and β_{2n+1} on D̄\A_1, n = 0, 1, 2, ..., by

\int f\, d\beta_{2n} = E\left[\int_{\tau_{2n}}^{\tau_{2n+1}} f(X(t))\,dt\right] \qquad \text{for } f \in C(\bar A_2),

\int f\, d\beta_{2n+1} = E\left[\int_{\tau_{2n+1}}^{\tau_{2n+2}} f(X(t))\,dt\right] \qquad \text{for } f \in C(\bar D \setminus A_1).

Since {β_n} are dominated by ν_x, they have densities with respect to the Lebesgue measure (on Ā_2 or D̄\A_1 as the case may be). Denote these by p_n(x, ·), n = 0, 1, 2, .... By familiar arguments it can be shown that these are continuous on a neighborhood Q of ∂B in D\Ā_2. Letting g(x, ·) be the Green function under v_2, it is clear that

\int g(x, y)\, h(dy) = \sum_{n=0}^{\infty} \int p_{2n+1}(x, y)\, h(dy), \qquad (5.4)

where we have used the fact that the supports of β_n for even n are disjoint from ∂B. Since v, v_1 satisfy (5.1), so does v_2. Hence an application of Krylov's extension of the Ito formula [5, Section 2.10] yields

E[V(X(\tau_{2n}))] = E[V(X(\tau_{2n+1}))], \qquad n = 0, 1, 2, \ldots.

On the other hand, in view of the optimality of v_1, arguments similar to those in Lemma 4.4 show that

E[V(X(\tau_{2n+1}))] = \int p_{2n+1}(x, y)\, h(dy) + E[V(X(\tau_{2n+2}))], \qquad n = 0, 1, 2, \ldots.

It follows that V(x) = V(X(τ_0)) equals the right-hand side of (5.4). Hence v_2 is optimal. Iterating the argument, we construct a sequence of open sets B_2 ⊂ B_3 ⊂ B_4 ⊂ ... in D\∂B increasing to D\∂B and optimal Markov controls v_i, i = 2, 3, ..., such that v_i(·) = v(·) on B_i and = v_1(·) elsewhere. Let {g_i(·, ·)} denote the corresponding Green functions. Fix x ∈ D\∂B. By familiar arguments we have (on dropping to a subsequence if necessary) g_i(x, ·) → g̃(x, ·) uniformly on compact subsets of D\{x}, where g̃(x, ·) is the Green function for some optimal Markov control. Now v_i(x) → v(x) a.e., implying m_j(x, v_i(x)) → m_j(x, v(x)) a.e. as i → ∞, for 1 ≤ j ≤ n. For smooth f: D̄ → R with compact support in D,

\int g_i(x, y)\, (Lf)(y, v_i(y))\, dy = -f(x).

In view of the foregoing, we can let i → ∞ to obtain

\int \tilde g(x, y)\, (Lf)(y, v(y))\, dy = -f(x).

It follows that g̃(x, ·) is the Green function under v. The claim follows. □


The following theorem recapitulates the above results.

Theorem 5.1.

(i) There exists a Markov control v which is optimal under any initial condition in D\∂B.

(ii) A Markov control v is optimal if and only if (5.1) holds.
(iii) An optimal Markov control v may be chosen so that the range of v lies in the set of Dirac measures on S.

Remark. A U-valued control taking values in U′ = {Dirac measures on S} ⊂ U can be associated with an S-valued (i.e., "ordinary" or "pre-relaxation") control in an obvious manner. Thus (iii) above allows us to drop the relaxed control framework and replace (4.4) by

\min_{s \in S} \sum_{i=1}^{n} b_i(x, s)\, \frac{\partial V}{\partial x_i}(x) + \frac{1}{2} \sum_{i,j,k=1}^{n} \sigma_{ik}(x)\, \sigma_{jk}(x)\, \frac{\partial^2 V}{\partial x_i\, \partial x_j}(x) = 0 \quad \text{a.e. on } D \setminus \partial B.

Proof. Only (iii) needs to be proved. Note that the minimum of ∇V(x) · m(x, ·), x ∈ D\∂B, over U will always be attained by an element of U′. Since U′ is a compact subset of U, a standard selection theorem (see, e.g., Lemma 1.1 of [2]) allows us to pick a measurable v: D\∂B → U′ such that

\nabla V(x) \cdot m(x, v(x)) = \min_{u \in U} \nabla V(x) \cdot m(x, u), \qquad x \in D \setminus \partial B.

Set v(x) = an arbitrary fixed element of U′ for x ∈ ∂B. Then v is an optimal Markov control by (ii). □

The above theorem gives a verification theorem and a recipe for constructing an optimal v in terms of the function V. Thus it is desirable to have a good characterization of V. Formal dynamic programming considerations and experience in the classical case lead us to expect that V should be characterized as the unique solution to the Hamilton-Jacobi-Bellman equation

\inf_u\, (LV)(x, u) = -h \ \text{in } D, \qquad V = 0 \ \text{on } \partial D, \qquad (5.5)

in some appropriate sense, where, by abuse of terminology, we have let h denote the Schwartz distribution corresponding to the measure h. It is an interesting open problem to make sense of (5.5), thereby obtaining the said characterization of V.
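As an informal illustration of this recipe (a sketch only: the gradient of V, the drift b, and the finite action set S below are hypothetical stand-ins for the objects of the paper), an ordinary feedback control can be synthesized by pointwise minimization of ∇V(x) · b(x, s) over S, as in the proof of Theorem 5.1(iii):

```python
# Illustrative sketch only: pointwise synthesis of an ordinary feedback control from
# a known value function, following the minimization in the proof of Theorem 5.1(iii).
# The gradient grad_V, the drift b, and the finite action set S are hypothetical
# stand-ins; in the paper S is a compact metric space and V is the value function
# of Section 4.
import numpy as np

S = [np.array(s, dtype=float) for s in [(-1.0, 0.0), (1.0, 0.0), (0.0, -1.0), (0.0, 1.0)]]

def b(x, s):                 # example drift b(x, s)
    return s - 0.5 * x

def grad_V(x):               # placeholder for a (numerically computed) gradient of V
    return x                 # here: the gradient of |x|^2 / 2

def v(x):
    """Feedback control: an action minimizing grad V(x) . b(x, s) over s in S."""
    x = np.asarray(x, dtype=float)
    return min(S, key=lambda s: float(grad_V(x) @ b(x, s)))

print(v([0.3, -0.2]))        # chosen action at a sample state
```

In practice V would have to be obtained numerically, e.g., from the approximations V_n of Section 4; the direct characterization (5.5) of V is left above as an open problem.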

Acknowledgments

The work was done while the author was visiting the Laboratory for Information and Decision Systems, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA.


References

1. A. Bensoussan, Stochastic Control by Functional Analysis Methods, North-Holland, Amsterdam, 1982.

2. V. S. Borkar, A remark on the attainable distributions of controlled diffusions, Stochastics 18 (1) (1986), 17-23.

3. A. Friedman, Stochastic Differential Equations and Applications I, Academic Press, New York, 1975.

4. D. Gilbarg, N. S. Trudinger, Elliptic Partial Differential Equations of Second Order, Springer-Verlag, Berlin, 1977.

5. N. V. Krylov, Controlled Diffusion Processes, Springer-Verlag, New York, 1980.

6. H. J. Kushner, Existence results for optimal stochastic control, J. Optim. Theory Appl. 15 (4) (1975), 347-359.

7. A. Ju. Veretennikov, On strong solutions and explicit formulas for solutions of stochastic differential equations, Math. USSR-Sb. 39 (3) (1981), 387-403.

Accepted 11 May 1987