Bellman, Lee - Functional Equations in Dynamic Programming


Aequationes Mathematicae 17 (1978) 1-18, Birkhäuser Verlag, Basel; University of Waterloo

Expository papers

Functional equations in dynamic programming

RICHARD BELLMAN and E. STANLEY LEE

1. Introduction

The basic form of the functional equation of dynamic programming is:

\[ f(p) = \max_{q} \bigl[ H(p, q, f(T(p, q))) \bigr], \tag{1} \]

where $p$ and $q$ represent the state and decision vectors, respectively, $T$ represents the transformation of the process, and $f(p)$ represents the optimal return function with initial state $p$.
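As a minimal sketch of how equation (1) can be evaluated when the process terminates after finitely many decisions, the recursion can be memoized over states. The names `solve_functional_equation`, `decisions` and `terminal_value`, as well as the toy process at the end, are illustrative assumptions and are not taken from the paper.

```python
from functools import lru_cache

def solve_functional_equation(H, T, decisions, terminal_value):
    """Return f satisfying f(p) = max_q H(p, q, f(T(p, q))).

    Assumes hashable states and that every decision sequence reaches a
    state with no admissible decisions after finitely many steps.
    """
    @lru_cache(maxsize=None)
    def f(p):
        admissible = list(decisions(p))
        if not admissible:              # terminal state: nothing left to decide
            return terminal_value(p)
        return max(H(p, q, f(T(p, q))) for q in admissible)
    return f

# Hypothetical toy process: p units remain and a decision uses q of them.
reward = {1: 1, 2: 3, 3: 4}
f = solve_functional_equation(
    H=lambda p, q, future: reward[q] + future,    # additive return
    T=lambda p, q: p - q,                         # state transformation
    decisions=lambda p: range(1, min(p, 3) + 1),  # admissible decisions
    terminal_value=lambda p: 0,
)
print(f(6))
```

The function `H` is kept general here; the additive form used in the toy process is only one possible choice.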

This functional equation can be studied in several ways, either with respect to the type of processes giving rise to (1), or with respect to the precise form of (1), or with respect to the computational aspects of (1). In this survey article, this functional equation will be treated according to the different types of processes.

In addition to the optimization problems in dynamic programming as shown in (1), the dynamic programming concept can also be used to solve various types of boundary value problems arising in engineering and physical sciences. The dynamic programming concept without optimization is known as invariant imbedding. The resulting functional equation of invariant imbedding is very similar to (1) except for the absence of the maximization operation.

Dynamic programming involves a completely different approach to formulating the problem: Instead of only considering a single problem with a fixed duration, the dynamic programming approach is to consider a family of problems, with duration of the process ranging from zero to the duration of the original problem. In order to consider these different duration processes, the corresponding initial conditions for these processes must also be calculated and interpolated

AMS (1970) subject classification: Primary 39A15. Secondary 49C10, 49C15.
This research was supported by the National Science Foundation under Grant No. MPS-74-15650 and the Energy Research and Development Administration (ERDA) under Contract No. E(04-3)113, Project 19.

Received August 26, 1975 and, in revised form, April 20, 1976.



carefully. The advantages and drawbacks of dynamic programming are precisely due to these differences. For more detailed discussion of these computational problems, see references [1]-[3] and [7]-[11].

Some of the most noticeable advantages of dynamic programming are its adaptive nature and its computational flexibility. By the use of the functional equation of dynamic programming, we can ignore whatever happened in the past and only consider the present and future functions based on the current state of the problem. Furthermore, because of the computational flexibility, the functional equation technique can be applied in different ways, in combination with the classical techniques and separately, with different objectives in mind. Also, many of the difficulties in optimization by the classical techniques, such as relative extrema, inequality constraints, tabulated or discrete functions, nondifferentiable functions and so on, can be avoided with appropriate computational techniques.

Obviously, it is impossible to cover all the functional equations arising in dynamic programming or to list all the references. In this survey article, only some of the important processes will be discussed with emphasis on recent developments.

2. Allocation processes

To begin our discussion, let us consider processes arising in economics, operations research and engineering. The basic question is that of using resources of various types in efficient ways. This problem is ideally suited for the dynamic programming approach.

To illustrate the approach, consider two types of resources which are present in quantities $x$ and $y$, and let $x_i$ and $y_i$, respectively, be the quantities of these resources allocated to the $i$-th activity. If we define the return function as

\[ g_i(x_i, y_i) = \text{the return from the $i$-th activity due to allocations $x_i$ and $y_i$,} \tag{1} \]

the resulting equation of dynamic programming can be represented by [1-3]

\[ f_N(x, y) = \max_{x_N,\, y_N} \bigl[ g_N(x_N, y_N) + f_{N-1}(x - x_N,\, y - y_N) \bigr] \tag{2} \]

where $f_N(x, y)$ is the optimal return function for an $N$-stage process with quantities $x$ and $y$ to allocate among the $N$ stages, $i = 1, 2, \ldots, N$.

Comparing with Equation (1-1), the quantities $x$ and $y$ to be allocated are the state variables, and the amounts allocated at stage $N$, $x_N$ and $y_N$, are the decision variables.
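The allocation recurrence (2) can be sketched for integer quantities as follows. This is only an illustration under stated assumptions: resources are indivisible units, $f_0 = 0$ (unallocated resources earn nothing), and the two return functions `g1` and `g2` are hypothetical examples rather than functions from the paper.

```python
from functools import lru_cache

def allocate(returns, x, y):
    """Optimal total return f_N(x, y) for integer resources x and y.

    returns[i - 1] plays the role of g_i; f_0 = 0 is an assumed
    boundary condition.
    """
    @lru_cache(maxsize=None)
    def f(n, x, y):
        if n == 0:
            return 0.0
        g = returns[n - 1]
        # Decision variables: amounts x_n, y_n given to activity n
        return max(g(xn, yn) + f(n - 1, x - xn, y - yn)
                   for xn in range(x + 1) for yn in range(y + 1))
    return f(len(returns), x, y)

# Hypothetical return functions for two activities
g1 = lambda a, b: 4 * min(a, b) - min(a, b) ** 2
g2 = lambda a, b: a + b
print(allocate([g1, g2], 2, 2))
```

The table `f(n, x, y)` has one entry per stage and per pair of remaining quantities, which is the source of the storage growth with the number of state variables.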



Because of the imbedding nature of this functional equation, the computational effort increases exponentially with the number of state variables. By computational effort, we mean both fast memory requirement and computation time. Even with the most recent computers, computer memory is limited. Thus, we have the dimensionality problem.
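A small arithmetic sketch makes the growth concrete. The grid resolution of 100 points per state variable and the 8 bytes per stored value are assumptions chosen only for illustration.

```python
def table_entries(points_per_variable, n_state_variables):
    """Number of grid values needed to tabulate f over all state variables."""
    return points_per_variable ** n_state_variables

for n in (1, 2, 3, 4):
    entries = table_entries(100, n)
    # 8 bytes per value is an assumed double-precision storage cost
    print(n, "state variables:", entries, "entries,", entries * 8, "bytes")
```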

There are many ways to reduce this memory problem. Some of these techniques are the use of Lagrange multipliers, polynomial approximation, spline approximation and quasilinearization [11].

Equation (2) only represents a simple example of allocation problems. Various complicated allocation problems arising in practice such as the reliability problems, the transportation problems and so on can also be treated by dynamic programming. The essential feature is to view the static allocation processes as dynamic processes for dynamic programming treatment.

3. Optimal routing, inventory and scheduling

Most of the routing, inventory and scheduling problems can be treated as dynamic problems and are thus naturally suited for the functional equation of dynamic programming.

A simple smoothing process can be described as follows: A supply depot is required at a preassigned set of times to meet a set of known demands for services or supplies. If the demand is not met, a penalty is incurred. On the other hand, if the depot is overstaffed, another type of penalty is levied. Because the cost changes with the level of services and supplies and also because the set of demands fluctuates greatly over time, the problem is nontrivial and the stock level must be adjusted to minimize the total cost.

Let $r_1, r_2, \ldots, r_N$ be a preassigned sequence of demands, where $r_k$ is the demand at the $k$-th stage. Let

\[ x_k = \text{the capability of the system at the $k$-th stage,} \quad k = 1, 2, \ldots, N, \tag{1} \]

where $x_0 = c$ is a fixed initial level.

In this example, let us assume that it is required that

\[ x_k \ge r_k, \quad k = 1, 2, \ldots, N. \tag{2} \]

In other words, we insist that the demand always be met.

Let us then introduce two cost functions

\[
\begin{aligned}
\phi_k(x_k - r_k) &= \text{the cost incurred at the $k$-th stage if } x_k > r_k, \\
\psi_k(x_k - x_{k-1}) &= \text{the cost incurred at the $k$-th stage if } x_k \ne x_{k-1}.
\end{aligned} \tag{3}
\]



This latter function measures the cost involved in changing supply or service level. The total cost incurred due to a choice of levels $x_1, x_2, \ldots, x_N$ is given by

\[ C(x_1, x_2, \ldots, x_N) = \sum_{k=1}^{N} \bigl[ \phi_k(x_k - r_k) + \psi_k(x_k - x_{k-1}) \bigr]. \tag{4} \]

Our objective is to choose the $x_k$, $k = 1, 2, \ldots, N$, subject to the condition $x_k \ge r_k$, so as to minimize this function.

In order to treat this minimization problem by means of functional equation techniques, we imbed this problem within the family of problems requiring the minimization of the function

\[ C_R = \sum_{k=R}^{N} \bigl[ \phi_k(x_k - r_k) + \psi_k(x_k - x_{k-1}) \bigr], \tag{5} \]

over the region defined by $x_k \ge r_k$, $k = R, R+1, \ldots, N$, with $x_{R-1} = c$, for $R = 1, 2, \ldots, N$.

Let us define

\[ f_R(c) = \min_{\{x_k\}} C_R, \quad R = 1, 2, \ldots, N, \tag{6} \]

where the minimum is taken over the $x_k$-region defined above.

Then

\[ f_N(c) = \min_{x_N \ge r_N} \bigl[ \phi_N(x_N - r_N) + \psi_N(x_N - c) \bigr], \tag{7} \]

a readily determined function.

The usual argument yields the recurrence relation [1-3]

\[ f_R(c) = \min_{x_R \ge r_R} \bigl[ \phi_R(x_R - r_R) + \psi_R(x_R - c) + f_{R+1}(x_R) \bigr], \tag{8} \]

for $R = 1, 2, \ldots, N-1$. We thus have a simple algorithm for obtaining the computational solution of the optimization problem.

Equation (8) is, again, a particular form of Equation (1-1) with $c$ as the state variable and $x_R$ as the decision variable.
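The backward recursion (7)-(8) of the smoothing problem might be sketched as follows. Several simplifications are assumptions made here for illustration: levels are restricted to the integers `0..max_level`, the cost functions `phi` and `psi` are taken to be the same at every stage, and $f_{N+1} = 0$ is used so that (7) becomes a special case of (8). The function name `smoothing_schedule` is hypothetical.

```python
def smoothing_schedule(demands, c, phi, psi, max_level):
    """Minimize the total cost (4) over integer levels 0..max_level.

    Requires max_level >= max(demands) and 0 <= c <= max_level.
    Returns (minimal cost f_1(c), list of optimal levels x_1..x_N).
    """
    levels = range(max_level + 1)
    f_next = {x: 0.0 for x in levels}        # assumed f_{N+1} = 0
    best_choice = []                         # best_choice[R-1][c] = minimizing x_R
    for R in range(len(demands), 0, -1):
        r = demands[R - 1]
        f_R, choice = {}, {}
        for prev in levels:                  # prev plays the role of the state c
            f_R[prev], choice[prev] = min(
                (phi(x - r) + psi(x - prev) + f_next[x], x)
                for x in levels if x >= r)   # constraint x_R >= r_R
        best_choice.insert(0, choice)
        f_next = f_R
    # Forward pass: read off the optimal levels starting from x_0 = c
    plan, level = [], c
    for choice in best_choice:
        level = choice[level]
        plan.append(level)
    return f_next[c], plan

# Hypothetical costs: unit cost per unit of excess, 2 per unit of change
cost, plan = smoothing_schedule([2, 5, 3], 2, lambda d: d,
                                lambda d: 2 * abs(d), 5)
print(cost, plan)
```

Only one function of a single variable has to be stored at each stage, which is why this one-dimensional problem is computationally easy.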

Optimal replacement, inventory and scheduling problems can be treated in essentially the same way. Consider the following simple inventory problem: We assume that orders for further supplies are made at each of a finite set of times, and immediately fulfilled. After the order has been made and filled, there is a demand made for the item. This demand is satisfied as far as possible, with any excess of demand over supply leading to a penalty cost.

We suppose that the following functions are known:
(a) $\phi(s)\,ds$ = the probability that the demand will lie between $s$ and $s + ds$.
(b) $k(z)$ = the cost of ordering $z$ items to increase the stock level.
(c) $p(z)$ = the cost of ordering $z$ items to meet an excess, $z$, of demand over supply, the penalty cost.

To simplify the situation, let us assume that these functions are independent of time. Our aim is to determine an ordering policy which minimizes the expected cost of carrying out an $N$-stage process. Let us then introduce the function

$f_N(x)$ = the expected cost of an $N$-stage process, starting with a stock level $x$ and using an optimal ordering policy.

Let us suppose that we order at the first stage a quantity $y - x$ to bring the level up to $y$. Then the expected cost is given by the function

\[ k(y - x) + \int_y^{\infty} p(s - y)\,\phi(s)\,ds. \tag{9} \]

Hence,

\[ f_1(x) = \min_{y \ge x} \Bigl[ k(y - x) + \int_y^{\infty} p(s - y)\,\phi(s)\,ds \Bigr]. \tag{10} \]

The usual argumentation yields the recurrence relation [1-3]

\[ f_n(x) = \min_{y \ge x} \Bigl[ k(y - x) + \int_y^{\infty} p(s - y)\,\phi(s)\,ds + f_{n-1}(0) \int_y^{\infty} \phi(s)\,ds + \int_0^{y} f_{n-1}(y - s)\,\phi(s)\,ds \Bigr], \quad n \ge 2, \tag{11} \]

upon an enumeration of the various possibilities corresponding to the different cases of an excess of demand over supply, and supply over demand.
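A discrete analogue of recurrences (10)-(11) can be sketched as follows. The sketch assumes, for illustration only, that demand takes the integer values `0, 1, ..., len(probs) - 1` with probabilities `probs[s]` (replacing the density $\phi$), that stock levels are capped at `max_stock`, and that $f_0 = 0$ so that (10) is the first step of the same loop. The function name `inventory_costs` is hypothetical.

```python
def inventory_costs(n_stages, probs, k, p, max_stock):
    """Return [f_n(0), ..., f_n(max_stock)] for n = n_stages.

    probs[s] is the assumed probability of demand s; k and p are the
    ordering and penalty cost functions.
    """
    f_prev = [0.0] * (max_stock + 1)             # assumed f_0 = 0
    for _ in range(n_stages):
        f_cur = []
        for x in range(max_stock + 1):
            best = min(
                k(y - x) + sum(
                    # demand exceeds supply: penalty, then continue from stock 0
                    q * (p(s - y) + f_prev[0]) if s > y
                    # demand met: continue from the remaining stock y - s
                    else q * f_prev[y - s]
                    for s, q in enumerate(probs))
                for y in range(x, max_stock + 1))  # order up to level y >= x
            f_cur.append(best)
        f_prev = f_cur
    return f_prev

# Hypothetical data: demand is 0 or 1 with equal probability
f2 = inventory_costs(2, [0.5, 0.5], k=lambda z: z, p=lambda z: 3 * z,
                     max_stock=2)
print(f2)
```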

As can be seen from the above examples, various forms of inventory, scheduling, smoothing and routing problems can be treated as dynamic problems and dynamic programming is naturally suited for solving these problems.

4. Calculus of variations

Dynamic programming is ideally suited for treating problems in calculus of variations. In fact, the functional equation approach of dynamic programming yields the fundamental classical results of the calculus of variations and the Hamilton-Jacobi theory. Furthermore, because of the completely different concept in the functional equation approach, many of the shortcomings encountered in the calculus of variations such as the nonlinear two-point boundary problem, inequality constraints and the solution of linear problems, can be overcome by the dynamic programming formulation.

To illustrate the approach, consider the following calculus of variations problem:

Find that function $z(t)$ such that the function $x(t)$ given by the differential equation

\[ \frac{dx}{dt} = f(x, z) \tag{1} \]

and the initial condition

\[ x(0) = c \tag{2} \]

maximizes the integral

\[ J(z) = \int_0^{t_f} h(x, z)\,dt. \tag{3} \]

The function $x(t)$ represents the state of the system and is known as the state variable and the function $z(t)$ is the control or decision variable.

If this problem is treated by the classical calculus of variations, the result will be a two-point boundary-value problem. This problem will be treated by the dynamic programming approach.

To use the invariant imbedding concept, observe that when the maximum of (3) has been obtained, the integral is a function only of the initial condition $c$ and the duration of the process $t_f$. Thus, we wish to imbed the original problem with particular values of $c$ and $t_f$, within a family of processes in which $c$ and $t_f$ are parameters. Define

$g(c, t_f)$ = the maximum value of $J$ where the starting state of the process is $c$ and the total duration is $t_f$.

Thus

\[ g(c, t_f) = \max_{z[0,\,t_f]} J(z) = \max_{z[0,\,t_f]} \int_0^{t_f} h(x, z)\,dt. \tag{4} \]



The maximization is executed by choosing the proper value of $z$ over the interval $[0, t_f]$. The function $g$ will be referred to as the optimum return and $J$, which in general is not the optimum or maximum value, will be called the return or nominal return. The control variable $z(t)$ also is known as a policy. The optimum value of $z(t)$ is the optimal policy.

Let us perturb the duration of the process and consider the original process with duration from $t = 0$ to $t = t_f$ and a neighboring process with duration from $t = \Delta$ to $t = t_f$. However, instead of relating these two processes as has been done in the previous sections, we shall employ a different approach of using the property of $g(c, t_f)$ and the additive property of the integral. The original process can be assumed to be composed of two different processes. The first process has a duration of $t = 0$ to $t = \Delta$ and the second or neighboring process has a duration of $t = \Delta$ to $t = t_f$. We may write

\[
\begin{aligned}
g(c, t_f) &= \max_{z[0,\,t_f]} \int_0^{t_f} h(x, z)\,dt
 = \max_{z[0,\,\Delta]} \max_{z[\Delta,\,t_f]} \Bigl[ \int_0^{\Delta} h(x, z)\,dt + \int_{\Delta}^{t_f} h(x, z)\,dt \Bigr] \\
&= \max_{z[0,\,\Delta]} \Bigl[ \int_0^{\Delta} h(x, z)\,dt + \max_{z[\Delta,\,t_f]} \int_{\Delta}^{t_f} h(x, z)\,dt \Bigr].
\end{aligned} \tag{5}
\]

The second term represents the maximum return from the second process. Obviously, the starting state for this second process is

\[ c + \int_0^{\Delta} f(x, z)\,dt \tag{6} \]

which is obtained from (1). From the definition of $g$ and Eq. (4), we can see that the maximum return from the second process is

\[ g\Bigl( c + \int_0^{\Delta} f(x, z)\,dt,\; t_f - \Delta \Bigr) = \max_{z[\Delta,\,t_f]} \int_{\Delta}^{t_f} h(x, z)\,dt. \tag{7} \]

Substituting (7) into (5), we have

\[ g(c, t_f) = \max_{z[0,\,\Delta]} \Bigl[ \int_0^{\Delta} h(x, z)\,dt + g\Bigl( c + \int_0^{\Delta} f(x, z)\,dt,\; t_f - \Delta \Bigr) \Bigr]. \tag{8} \]



The terms under the integrals may be approximated by

\[
\int_0^{\Delta} h(x, z)\,dt = h(c, z(0))\,\Delta, \qquad
\int_0^{\Delta} f(x, z)\,dt = f(c, z(0))\,\Delta \tag{9}
\]

with terms involving $\Delta^2$ and higher orders of $\Delta$ omitted. Equation (8) now becomes

\[ g(c, t_f) = \max_{z[0,\,\Delta]} \bigl[ h(c, z(0))\,\Delta + g\bigl( c + f(c, z(0))\,\Delta,\; t_f - \Delta \bigr) \bigr]. \tag{10} \]

Using Taylor's series, we obtain

\[ g\bigl( c + f(c, z(0))\,\Delta,\; t_f - \Delta \bigr) = g(c, t_f) + f(c, z(0))\,\Delta\,\frac{\partial g(c, t_f)}{\partial c} - \Delta\,\frac{\partial g(c, t_f)}{\partial t_f} + o(\Delta). \tag{11} \]

Equation (10) becomes

\[ g(c, t_f) = \max_{z[0,\,\Delta]} \Bigl[ h(c, z(0))\,\Delta + g(c, t_f) + f(c, z(0))\,\Delta\,\frac{\partial g(c, t_f)}{\partial c} - \Delta\,\frac{\partial g(c, t_f)}{\partial t_f} + o(\Delta) \Bigr]. \tag{12} \]

Since $g(c, t_f)$ is independent of the choice of $z$, we put it outside of the maximum operation sign. Thus,

\[ 0 = \max_{z[0,\,\Delta]} \Bigl[ h(c, z(0))\,\Delta + f(c, z(0))\,\Delta\,\frac{\partial g(c, t_f)}{\partial c} - \Delta\,\frac{\partial g(c, t_f)}{\partial t_f} + o(\Delta) \Bigr]. \tag{13} \]

In the limit as $\Delta \to 0$, equation (13) becomes

\[ \frac{\partial g(c, t_f)}{\partial t_f} = \max_{y} \Bigl[ h(c, y) + f(c, y)\,\frac{\partial g(c, t_f)}{\partial c} \Bigr] \tag{14} \]



where $y = z(0)$. The initial condition is

\[ g(c, 0) = 0. \tag{15} \]

Equation (14) is the desired relationship. This equation should be compared with the functional equations of invariant imbedding such as (7.13). Note that expression (7.14) also is for a process with zero duration. Although Eq. (14) has been obtained analytically, it also can be obtained by using a basic property of an optimal policy by purely verbal arguments. This basic property is known as the principle of optimality [1]. This principle forms the cornerstone of dynamic programming and is possessed by most multistage decision processes. For simplicity, we have obtained the dynamic programming equations for a system with only one state variable. Similar equations can be obtained for multi-dimensional problems.
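For reference, the multi-dimensional counterpart of (14) and (15) can be written down by repeating the same derivation with a state vector. The following is a sketch under the assumption that the state is $c = (c_1, \ldots, c_n)$ with dynamics $dx_i/dt = f_i(x, z)$; the component notation $f_i$, $c_i$ is introduced here for illustration and does not appear in the original text.

```latex
\[
\frac{\partial g(c, t_f)}{\partial t_f}
  = \max_{y} \Bigl[ h(c, y) + \sum_{i=1}^{n} f_i(c, y)\,
      \frac{\partial g(c, t_f)}{\partial c_i} \Bigr],
\qquad g(c, 0) = 0 .
\]
```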

Notice that the present approach not only avoided the two-point boundary-value difficulties, but also many other difficulties which are usually associated with the calculus-of-variations approach. This is due to the fact that the maximum of (14) not only can be obtained by means of calculus, but also by search techniques incorporated into the numerical solution schemes of (14). With a proper search technique, one can avoid the difficulties in handling inequality constraints and in answering the question whether a true maximum has been obtained. In addition, the dynamic programming approach also can handle unusual functions such as nonanalytical functions. Problems in which the control variable appears linearly also can be treated by dynamic programming, if we remember that constraints must be present in order for a linear problem to have an optimum.

As a price in overcoming these difficulties, one encounters other forms of problems. Obviously, Eq. (14) cannot be solved easily if the dimension of $c$ is large. Although Eq. (14) can be solved by using the difference equation before the limit is taken, there still is the dimensionality difficulty for large dimensions of $c$. This dimensionality difficulty severely limits the number of state variables that can be handled by this approach.
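The difference equation (10), used before the limit is taken, can be sketched numerically for a single state variable. The sketch below rests on several assumptions that are not in the paper: the state is tabulated on a uniform grid, values between grid points are obtained by linear interpolation (clamped at the grid ends), and the maximization is a search over a finite list of candidate controls. The function name `solve_difference_equation` and the test problem are hypothetical.

```python
def solve_difference_equation(h, f, controls, c_lo, c_hi, n_grid, t_f, n_steps):
    """Tabulate g(c, t_f) from g(c,t) = max_y [h(c,y)*D + g(c + f(c,y)*D, t - D)]."""
    dc = (c_hi - c_lo) / (n_grid - 1)
    delta = t_f / n_steps
    grid = [c_lo + i * dc for i in range(n_grid)]

    def interp(values, c):
        # Linear interpolation, clamped to the grid (a boundary assumption)
        pos = min(max((c - c_lo) / dc, 0.0), n_grid - 1.0)
        i = min(int(pos), n_grid - 2)
        w = pos - i
        return (1.0 - w) * values[i] + w * values[i + 1]

    g = [0.0] * n_grid                       # initial condition g(c, 0) = 0
    for _ in range(n_steps):                 # lengthen the duration by delta
        g = [max(h(c, y) * delta + interp(g, c + f(c, y) * delta)
                 for y in controls)          # search over candidate controls
             for c in grid]
    return lambda c: interp(g, c)

# Hypothetical test problem: dx/dt = z with |z| <= 1, maximize the integral of x.
# The continuous optimum is z = 1, giving c*t_f + t_f**2 / 2.
g = solve_difference_equation(h=lambda x, z: x, f=lambda x, z: z,
                              controls=[-1.0, 0.0, 1.0],
                              c_lo=-1.0, c_hi=3.0, n_grid=41,
                              t_f=1.0, n_steps=10)
print(g(0.5))
```

For this test problem the scheme returns approximately 0.95 at $c = 0.5$, compared with the continuous value 1.0; the gap is the first-order discretization error $t_f \Delta / 2$. The table holds `n_grid` values for one state variable, and the same construction with several state variables would need the product of the grid sizes.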

The dynamic programming technique has been used to treat various optimization problems. However, detailed discussions will not be given here. The reader can consult any of the references cited earlier for more details.

5. Markov decision processes

The functional equation of dynamic programming forms the backbone in the formulation and solution of Markov decision processes. Various monographs have been written on this subject (see, for example, [5] and [6]). It is obviously impossible in this review to cover the subject in any significant manner.

Consider a sequential decision stochastic process which is ruled by the transition matrix $P(q) = (p_{ij}(q))$, where

$p_{ij}(q)$ = the probability that the system is in state $j$ at time $t + 1$, given that it was in state $i$ at time $t$, assuming that a decision policy $q$ is used, $i, j = 1, 2, \ldots, N$.

The state of the system can be represented by the transformation equation

\[ x_{t+1}(j) = \sum_{i=1}^{N} p_{ij}\, x_t(i), \quad j = 1, 2, \ldots, N \tag{1} \]

with the initial condition

\[ x_0(i) = c_i, \tag{2} \]

where

$x_t(i)$ = the probability that the system is in state $i$ at $t$, $i = 1, 2, \ldots, N$.
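The transformation equation (1) of this section can be sketched as a repeated vector-matrix product. The sketch assumes a fixed transition matrix `P` (that is, a fixed decision policy) stored as a list of rows with `P[i][j]` standing for $p_{ij}$, and uses zero-based state indices; the function name `propagate` is hypothetical.

```python
def propagate(P, x0, steps):
    """Apply x_{t+1}(j) = sum_i p_ij * x_t(i) for the given number of steps."""
    x = list(x0)
    n = len(x)
    for _ in range(steps):
        x = [sum(P[i][j] * x[i] for i in range(n)) for j in range(n)]
    return x

# Hypothetical two-state chain: state 0 moves to state 1 with probability 1/2,
# state 1 is absorbing.
P = [[0.5, 0.5],
     [0.0, 1.0]]
print(propagate(P, [1.0, 0.0], 2))
```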

We have assumed that the system at any particular time is in one of a finite number of states, $i = 1, 2, \ldots, N$. There is a change of state involved at each stage. Associated with this change of state is a return which is a function of the initial and terminal state and of the decision. Let

\[ R(q) = (r_{ij}(q)) \tag{3} \]

represent the return matrix.

The problem is that of choosing the sequence of decisions which will maximize the expected return obtained from an $n$-stage process, given the initial state of the system.

Using the functional equation technique, let us define the optimal return as

$f_n(i)$ = expected return obtained from an $n$-stage process, starting in state $i$ and using an optimal policy.



The functional equation of dynamic programming is

\[ f_n(i) = \max_{q} \sum_{j=1}^{N} p_{ij}(q) \bigl( r_{ij}(q) + f_{n-1}(j) \bigr) \tag{4} \]

with $i = 1, 2, \ldots, N$, $n = 1, 2, \ldots$, and $f_0(i) = 0$.

One of the computational difficulties in solving the above functional equation is the necessity of storing the transition matrices $p_{ij}(q)$ and $r_{ij}(q)$. If $N$ exceeds a value such as 1,000, the storage requirements become excessive. However, the asymptotic behavior of the functions $x_t(i)$ as $t \to \infty$ can frequently be used to avoid this difficulty. Even if this asymptotic behavior does not exist, approximation can still be used based on this fictitious infinite process behavior.

Many processes occurring in our daily life can be treated as a Markov process. For example, the optimal choice of our daily commuting route from home to office, the optimal operation of a taxicab, scheduling and replacement problems, multi-stage games, and so on.
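The functional equation (4) of this section, with $f_0(i) = 0$, can be sketched as a stage-by-stage recursion. The data layout is an assumption made for illustration: decisions are indexed `0..Q-1`, `P[q][i][j]` stands for $p_{ij}(q)$ and `R[q][i][j]` for $r_{ij}(q)$, with zero-based states. The function name `optimal_returns` and the two-state example are hypothetical.

```python
def optimal_returns(P, R, n_stages):
    """Return (f_n, policy) for the finite-horizon recursion.

    policy[m][i] is a maximizing decision in state i with m + 1 stages
    remaining (ties are broken toward the larger decision index).
    """
    n_states = len(P[0])
    f = [0.0] * n_states                         # f_0(i) = 0
    policy = []
    for _ in range(n_stages):
        new_f, best_q = [], []
        for i in range(n_states):
            value, q = max(
                (sum(P[q][i][j] * (R[q][i][j] + f[j])
                     for j in range(n_states)), q)
                for q in range(len(P)))
            new_f.append(value)
            best_q.append(q)
        f = new_f
        policy.append(best_q)
    return f, policy

# Hypothetical example: decision 0 stays in place, decision 1 switches state.
P = [[[1.0, 0.0], [0.0, 1.0]],
     [[0.0, 1.0], [1.0, 0.0]]]
R = [[[1.0, 0.0], [0.0, 2.0]],
     [[0.0, 0.5], [0.0, 0.0]]]
f2, policy = optimal_returns(P, R, 2)
print(f2, policy)
```

Both `P` and `R` hold one $N \times N$ matrix per decision, which is the storage burden mentioned above.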

6. Fuzzy systems

The functional equation of dynamic programming is ideally suited to solve fuzzy problems. Fuzziness should not be confused with randomness. For example, much of the decision-making in a real world takes place in a fuzzy environment in which the goals, the constraints and the consequences of possible actions are not known precisely.

The theory of fuzzy sets is discussed in detail in the literature (see, for example, [14-18]). Only the dynamic programming application of fuzzy sets will be discussed here.

We shall begin our discussion with multistage decision-making in a fuzzy environment. For simplicity we shall assume that the system under control, $A$, is a time-invariant finite-state deterministic system in which the state, $x_t$, at time $t$, $t = 0, 1, 2, \ldots$, ranges over a finite set $X = \{\sigma_1, \ldots, \sigma_m\}$, and the input, $u_t$, ranges over a finite set $U = \{\alpha_1, \ldots, \alpha_n\}$. The temporal evolution of $A$ is described by the state equation

\[ x_{t+1} = f(x_t, u_t), \quad t = 0, 1, 2, \ldots \tag{1} \]

in which $f$ is a given function from $X \times U$ to $X$. Thus, $f(x_t, u_t)$ represents the successor state of $x_t$ for input $u_t$. Note that if $f$ is a random function, then $A$ is a stochastic system whose state at time $t + 1$ is a probability distribution over $X$, $P(x_{t+1} \mid x_t, u_t)$, which is conditioned on $x_t$ and $u_t$. Analogously, if $f$ is a fuzzy function, then $A$ is a fuzzy system whose state at time $t + 1$ is a fuzzy set conditioned on $x_t$ and $u_t$, which means that it is characterized by a membership function of the form $\mu(x_{t+1} \mid x_t, u_t)$. Since we will not be concerned with such systems in the sequel, it will be understood that $f$ is nonfuzzy unless explicitly stated to the contrary.

We assume that at each time $t$ the input is subjected to a fuzzy constraint $C^t$, which is a fuzzy set in $U$ characterized by a membership function $\mu_t(u_t)$. Furthermore, we assume that the goal is a fuzzy set $G^N$ in $X$, which is characterized by a membership function $\mu_{G^N}(x_N)$, where $N$ is the time of termination of the process. These assumptions are common to most of the problems considered in the sequel.

Let us define the problem more precisely. The system is assumed to be characterized by (1), with $f$ a given nonrandom function. The termination time $N$ is assumed to be fixed and specified. The initial state, $x_0$, is assumed to be given. The problem is to find a maximizing decision.

The decision, viewed as a decomposable fuzzy set in $U \times U \times \cdots \times U$, may be expressed at once as

\[ D = C^0 \cap C^1 \cap \cdots \cap C^{N-1} \cap \bar{G}^N \tag{2} \]

where $\bar{G}^N$ is the fuzzy set in $U \times U \times \cdots \times U$ which induces $G^N$ in $X$. More explicitly, in terms of membership functions, we have

\[ \mu_D(u_0, \ldots, u_{N-1}) = \mu_0(u_0) \wedge \cdots \wedge \mu_{N-1}(u_{N-1}) \wedge \mu_{G^N}(x_N) \tag{3} \]

where $x_N$ is expressible as a function of $x_0$ and $u_0, \ldots, u_{N-1}$ through the iteration of (1).

Our problem, then, is to find a sequence of inputs $u_0, \ldots, u_{N-1}$ which maximizes $\mu_D$ as given by (3). As is usually the case in multi-stage processes, it is expedient to express the solution in the form

\[ u_t = \pi_t(x_t), \quad t = 0, 1, 2, \ldots, N-1, \]

where $\pi_t$ is a policy function. Then, we can employ dynamic programming to give us both the $\pi_t$ and a maximizing decision $u_0^M, \ldots, u_{N-1}^M$.

More specifically, using (2) and (1), we can write

\[ \mu_D(u_0^M, \ldots, u_{N-1}^M) = \max_{u_0, \ldots, u_{N-2}} \; \max_{u_{N-1}} \bigl( \mu_0(u_0) \wedge \cdots \wedge \mu_{N-2}(u_{N-2}) \wedge \mu_{N-1}(u_{N-1}) \wedge \mu_{G^N}(f(x_{N-1}, u_{N-1})) \bigr). \tag{4} \]


Now, if $\gamma$ is a constant and $g$ is any function of $u_{N-1}$, we have the identity

$$\max_{u_{N-1}} \bigl( \gamma \wedge g(u_{N-1}) \bigr) = \gamma \wedge \max_{u_{N-1}} g(u_{N-1}).$$

Consequently, (4) may be rewritten as

$$\mu_D(u_0^M, \ldots, u_{N-1}^M) = \max_{u_0, \ldots, u_{N-2}} \bigl( \mu_0(u_0) \wedge \cdots \wedge \mu_{N-2}(u_{N-2}) \wedge \mu_{G^{N-1}}(x_{N-1}) \bigr) \tag{5}$$

where

$$\mu_{G^{N-1}}(x_{N-1}) = \max_{u_{N-1}} \bigl( \mu_{N-1}(u_{N-1}) \wedge \mu_{G^N}(f(x_{N-1}, u_{N-1})) \bigr) \tag{6}$$

may be regarded as the membership function of a fuzzy goal at time $t = N-1$ which is induced by the given goal $G^N$ at time $t = N$.
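The identity that allows the maximization over $u_{N-1}$ to be moved inside the chain of minima can be justified in a few lines. The sketch below assumes that the maximum of $g$ is attained at some $u^{*}$ (for example, when $U$ is finite); the symbols $g^{*}$ and $u^{*}$ are introduced here only for this argument.

```latex
% Notation: a \wedge b = \min(a, b); g^{*} = \max_{u} g(u), attained at u = u^{*}.
\begin{align*}
\gamma \wedge g(u) &\le \gamma \wedge g^{*} \quad \text{for every } u
  && \text{(since } g(u) \le g^{*} \text{ and } t \mapsto \gamma \wedge t \text{ is nondecreasing)} \\
\gamma \wedge g(u^{*}) &= \gamma \wedge g^{*}
  && \text{(the bound is attained at } u = u^{*}\text{)} \\
\Longrightarrow \quad \max_{u} \bigl( \gamma \wedge g(u) \bigr) &= \gamma \wedge \max_{u} g(u).
\end{align*}
```

In the application above, $\gamma$ stands for $\mu_0(u_0) \wedge \cdots \wedge \mu_{N-2}(u_{N-2})$, which does not depend on $u_{N-1}$.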

On repeating this backward iteration, which is a simple instance of dynamic programming, we obtain the set of recurrence equations

$$\mu_{G^{N-\nu}}(x_{N-\nu}) = \max_{u_{N-\nu}} \bigl( \mu_{N-\nu}(u_{N-\nu}) \wedge \mu_{G^{N-\nu+1}}(x_{N-\nu+1}) \bigr), \qquad x_{N-\nu+1} = f(x_{N-\nu}, u_{N-\nu}), \quad \nu = 1, \ldots, N, \tag{7}$$

which yield the solution to the problem. Thus, a maximizing decision $u_0^M, \ldots, u_{N-1}^M$ is given by the successive maximizing values of $u_{N-\nu}$ in (7), with $u_{N-\nu}^M$ defined as a function of $x_{N-\nu}$, $\nu = 1, \ldots, N$.
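The backward recursion can be sketched in a few lines for a finite state set and a finite input set. Everything concrete here (the toy transition function `f`, the grade tables `mu_c` and `mu_goal`, and the names `fuzzy_dp` and `brute_force`) is a hypothetical illustration and not taken from the paper; the code indexes stages by $t = N - \nu$ rather than by $\nu$.

```python
from itertools import product


def fuzzy_dp(states, inputs, f, mu_c, mu_goal):
    """Backward recursion for the induced fuzzy goals and a maximizing policy.

    goal[t][x] is the grade of the fuzzy goal induced at time t in state x;
    policy[t][x] is an input attaining the maximum at time t in state x.
    """
    n = len(mu_c)
    goal = [None] * (n + 1)
    policy = [None] * n
    goal[n] = dict(mu_goal)                      # given goal at time N
    for t in range(n - 1, -1, -1):               # t = N-1, ..., 0
        goal[t], policy[t] = {}, {}
        for x in states:
            # maximize  mu_t(u) ^ goal_{t+1}(f(x, u))  over the inputs u
            best_u = max(inputs, key=lambda u: min(mu_c[t][u], goal[t + 1][f(x, u)]))
            policy[t][x] = best_u
            goal[t][x] = min(mu_c[t][best_u], goal[t + 1][f(x, best_u)])
    return goal, policy


# Hypothetical toy data: three states, two inputs, horizon N = 2.
states, inputs = [0, 1, 2], [0, 1]


def f(x, u):
    return (x + u) % 3


mu_c = [{0: 0.7, 1: 1.0}, {0: 1.0, 1: 0.6}]
mu_goal = {0: 0.3, 1: 0.9, 2: 0.8}

goal, policy = fuzzy_dp(states, inputs, f, mu_c, mu_goal)

# Roll the policy forward from x0 = 0 to read off a maximizing decision.
x, decision = 0, []
for t in range(len(mu_c)):
    decision.append(policy[t][x])
    x = f(x, decision[-1])


def brute_force(x0):
    """Maximize the decision grade by enumerating every input sequence."""
    best = 0.0
    for seq in product(inputs, repeat=len(mu_c)):
        xs, grade = x0, 1.0
        for t, u in enumerate(seq):
            grade = min(grade, mu_c[t][u])
            xs = f(xs, u)
        best = max(best, min(grade, mu_goal[xs]))
    return best


print(decision, goal[0][0], brute_force(0))
```

The recursion visits each (stage, state, input) triple once, whereas the enumeration grows exponentially with the horizon; on this toy problem the two agree, which is the point of the comparison.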

The above is only one example concerning the application of functional equations in multistage decision-making in a fuzzy environment. The complexity and scale of the further investigation required are tremendous. Some of the facets of the functional theory of decision-making in a fuzzy environment that require further investigation are the question of execution of fuzzy decisions; the way in which the goals and the constraints must be combined when they are of unequal importance or are interdependent; the control of fuzzy systems and the implementation of fuzzy algorithms; the notion of fuzzy feedback and its effect on decision-making; control of systems in which the fuzzy environment is partially defined by exemplification; and decision-making in mixed environments, that is, in environments in which the imprecision stems from both randomness and fuzziness.


7. Invariant imbedding and nonlinear boundary value problems

In many applications in engineering and physical sciences, there occur many two-point or multipoint nonlinear boundary value problems. The functional equation approach can be used effectively to overcome the stability problem in solving these nonlinear boundary value problems.

To illustrate the approach, consider the nonlinear two-point boundary value problem

$$\frac{dx}{dt} = f(x, y, t), \qquad \frac{dy}{dt} = g(x, y, t) \tag{1}$$

with boundary conditions

$$x(0) = c, \qquad y(t_f) = 0 \tag{2}$$

with $0 \le t \le t_f$. In order to avoid the various computational difficulties in solving the above boundary-value problem, we shall convert it into an initial-value problem. In other words, the missing initial condition $y(0)$ will be obtained by using the invariant imbedding concept. To do this, consider the problem with the more general boundary conditions

$$x(a) = c, \qquad y(t_f) = 0 \tag{3}$$

where $a \le t \le t_f$ and $a$ is the starting value of the independent variable $t$. However, it should be kept in mind that $a$ also controls the duration of the process. If $a$ assumes different values from zero to $t_f$, say $a = 0, \Delta, 2\Delta, \ldots$, then there will be a family of problems. Each member of this family has a different starting value of $a$ and is represented by Eqs. (1) and (3). Let us consider obtaining the missing initial conditions $y(a)$ for this family of problems. The idea is that neighboring processes are related to each other. It may be possible to obtain the missing initial condition for the original problem $y(0)$ by examining the relationships between the neighboring processes.

Notice that the missing initial condition $y(a)$ for this family of processes is not only a function of the starting point of the process $a$, but also a function of the starting state or the given initial condition $c$. Define

$r(c, a)$ = the missing initial condition for the system represented by (1) and (3), where the process begins at $t = a$ with $x(a) = c$.

Obviously

$$y(a) = r(c, a). \tag{4}$$

Notice that $x(a)$ and $y(a)$ represent the starting state of the process. We shall consider $r$ as the dependent variable, and $c$ and $a$ as the independent variables. An expression for $r$ in terms of $c$ and $a$ will be obtained. Considering the neighboring process with starting value $a + \Delta$, the missing initial condition of this neighboring process can be related to $y(a)$ by the use of Taylor's series

$$y(a + \Delta) = y(a) + y'(a)\Delta + o(\Delta) \tag{5}$$

where $o(\Delta)$ represents higher-order terms or terms involving powers of $\Delta$ higher than the first. At the starting value $a$, Eq. (1) becomes

$$x'(a) = f(x(a), y(a), a) = f(c, r(c, a), a) \tag{6a}$$

$$y'(a) = g(x(a), y(a), a) = g(c, r(c, a), a). \tag{6b}$$

Substituting (6b) and (4) into (5), we obtain

$$y(a + \Delta) = r(c, a) + g(c, r(c, a), a)\Delta + o(\Delta). \tag{7}$$

On the other hand, the following expression can be obtained for this missing initial condition $y(a + \Delta)$ from Eq. (4):

$$y(a + \Delta) = r(x(a + \Delta), a + \Delta). \tag{8}$$

Again, the expression $x(a + \Delta)$ can be related to its neighboring process $x(a) = c$ by Taylor's series,

$$x(a + \Delta) = x(a) + x'(a)\Delta + o(\Delta) = c + f(c, r(c, a), a)\Delta + o(\Delta). \tag{9}$$

Thus, Eq. (8) becomes

$$y(a + \Delta) = r(c + f(c, r(c, a), a)\Delta + o(\Delta), a + \Delta). \tag{10}$$


Equating Eqs. (7) and (10), we obtain the desired relation

$$r(c, a) + g(c, r(c, a), a)\Delta = r(c + f(c, r(c, a), a)\Delta, a + \Delta) \tag{11}$$

omitting the terms involving powers of $\Delta$ higher than the first. The difference Eq. (11) can be used directly to obtain the missing initial conditions $r(c, a)$. Alternately, a partial differential equation can be obtained from (11). Expanding the right-hand side of (11) by Taylor's series, we obtain

$$r(c + f(c, r(c, a), a)\Delta, a + \Delta) = r(c, a) + f(c, r(c, a), a)\Delta \frac{\partial r(c, a)}{\partial c} + \Delta \frac{\partial r(c, a)}{\partial a} + o(\Delta). \tag{12}$$

In the limit as $\Delta$ tends to zero, the following first-order quasilinear partial differential equation is obtained from (11) and (12):

$$f(c, r(c, a), a)\frac{\partial r(c, a)}{\partial c} + \frac{\partial r(c, a)}{\partial a} = g(c, r(c, a), a). \tag{13}$$

From (3) and (4), it can be seen that

$$r(c, t_f) = 0. \tag{14}$$

Thus, the missing initial conditions $r(c, a)$ for the family of processes, with the starting values of the independent variable $a$ from zero to $t_f$, can be obtained by solving the system (13) and (14).
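As a check on this partial differential equation and its terminal condition, consider a linear test problem with $f(x, y, t) = y$ and $g(x, y, t) = x$. This choice is an assumption made here for illustration and does not appear in the paper; for it, the boundary value problem can be solved in closed form and the result substituted back.

```latex
% Illustrative test problem (an assumption, not from the paper):
%   dx/dt = y,  dy/dt = x,  x(a) = c,  y(t_f) = 0.
\begin{align*}
x(t) &= c\,\frac{\cosh(t_f - t)}{\cosh(t_f - a)}, \qquad
y(t) = -c\,\frac{\sinh(t_f - t)}{\cosh(t_f - a)}, \\
r(c, a) &= y(a) = -c \tanh(t_f - a), \\
\frac{\partial r}{\partial c} &= -\tanh(t_f - a), \qquad
\frac{\partial r}{\partial a} = c\,\operatorname{sech}^{2}(t_f - a), \\
f\,\frac{\partial r}{\partial c} + \frac{\partial r}{\partial a}
  &= c \tanh^{2}(t_f - a) + c\,\operatorname{sech}^{2}(t_f - a) = c = g(c, r, a), \\
r(c, t_f) &= -c \tanh(0) = 0.
\end{align*}
```

Here $f$ is evaluated at the starting state, so $f(c, r, a) = r(c, a)$, and both the differential equation for $r$ and its terminal condition are satisfied.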

The above approach can be used effectively to solve unstable boundary value problems. Furthermore, this approach combined with other techniques such as quasilinearization, spline, and nonlinear summability can be used to solve reasonably large dimensional problems.
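One way to use the difference relation for $r(c, a)$ directly is to march it backward in $a$ on a grid of $c$ values, starting from $r(c, t_f) = 0$, and then to integrate the original system forward as an initial-value problem from $y(0) = r(c, 0)$. The sketch below does this under several assumptions that are not from the paper: the linear test problem $dx/dt = y$, $dy/dt = x$ with $x(0) = 1$, $y(1) = 0$; an explicit first-order step in which $f$ and $g$ are evaluated with $r(c, a + \Delta)$ in place of $r(c, a)$; piecewise-linear interpolation in $c$; and the helper names `interp`, `missing_initial_condition` and `rk4`.

```python
import math
from bisect import bisect_right


def interp(xs, ys, x):
    """Piecewise-linear interpolation on an increasing grid, clamped at the ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x) - 1
    w = (x - xs[i]) / (xs[i + 1] - xs[i])
    return (1 - w) * ys[i] + w * ys[i + 1]


def missing_initial_condition(f, g, t_f, c_grid, steps):
    """March the difference relation for r(c, a) from a = t_f back to a = 0.

    Explicit first-order variant: f and g use r(c, a + delta) instead of
    r(c, a), which alters each step only at order delta**2.
    """
    delta = t_f / steps
    r = [0.0] * len(c_grid)                      # terminal condition r(c, t_f) = 0
    for k in range(steps - 1, -1, -1):
        a = k * delta
        # r(c, a) ~ r(c + f*delta, a + delta) - g*delta, for every grid value c
        r = [
            interp(c_grid, r, c + f(c, rc, a) * delta) - g(c, rc, a) * delta
            for c, rc in zip(c_grid, r)
        ]
    return r


def rk4(f, g, x, y, t0, t1, steps):
    """Integrate dx/dt = f, dy/dt = g from t0 to t1 with classical RK4."""
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        k1 = (f(x, y, t), g(x, y, t))
        k2 = (f(x + h / 2 * k1[0], y + h / 2 * k1[1], t + h / 2),
              g(x + h / 2 * k1[0], y + h / 2 * k1[1], t + h / 2))
        k3 = (f(x + h / 2 * k2[0], y + h / 2 * k2[1], t + h / 2),
              g(x + h / 2 * k2[0], y + h / 2 * k2[1], t + h / 2))
        k4 = (f(x + h * k3[0], y + h * k3[1], t + h),
              g(x + h * k3[0], y + h * k3[1], t + h))
        x += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        y += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        t += h
    return x, y


# Hypothetical linear test problem: dx/dt = y, dy/dt = x, x(0) = 1, y(1) = 0.
f = lambda x, y, t: y
g = lambda x, y, t: x
t_f, c = 1.0, 1.0
c_grid = [0.1 * i for i in range(21)]            # grid covering 0 <= c <= 2

r0 = missing_initial_condition(f, g, t_f, c_grid, steps=2000)
y0 = interp(c_grid, r0, c)                       # approximates y(0) = r(c, 0)
x_end, y_end = rk4(f, g, c, y0, 0.0, t_f, steps=200)

print(y0, -math.tanh(t_f))   # marched value vs. closed form -c*tanh(t_f) for this test problem
print(y_end)                 # should be close to the boundary value y(t_f) = 0
```

The marching step is only first-order accurate in $\Delta$, so the agreement improves roughly in proportion to the number of steps; a higher-order scheme, or a characteristic-based solver for the partial differential equation, would be a natural refinement.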

8. Dynamic programming and the numerical solution of partial differential equations

The dynamic programming functional equations can also be used for solving complicated boundary value problems in partial differential equations. This approach forms a powerful technique for solving linear elliptic and parabolic partial differential equations over regular and irregular regions. Nonlinear equations can also be solved by the combined use of quasilinearization, spline, nonlinear summability and dynamic programming. The interested reader can consult the book by Bellman and Angel [10].

9. Discussion

In this short survey article, we have tried to illustrate the various applications of the functional equations of dynamic programming, starting from simple optimization problems to calculus of variations problems to stochastic problems and, finally, to the numerical solution of differential equations. No attempt was made to cover all the applications of the functional equations. Indeed, it would be impossible even to cover only the important applications. Furthermore, new applications are still being discovered. The reader can refer to some of the books listed at the end of this paper.

Some of the other applications which are not discussed are identification and nonlinear filtering, communication theory, multistage games and simulation, inverse problems, feedback control, transportation problems, and so on.

Our purpose is to illustrate the typical applications of the functional equation and to select a representative process in each typical application. This seems to be the only way to show the versatility of dynamic programming, and at the same time, to obtain a reasonably short survey article.

REFERENCES

[1] BELLMAN, R., Dynamic programming. Princeton University Press, Princeton, N.J., 1957.
[2] BELLMAN, R., Adaptive control processes. A guided tour. Princeton University Press, Princeton, N.J., 1961.
[3] BELLMAN, R. and DREYFUS, S., Applied dynamic programming. Princeton University Press, Princeton, N.J., 1962.
[4] ARROW, K. J., KARLIN, S. and SCARF, H., Studies in the mathematical theory of inventory and production. Stanford University Press, Stanford, 1958.
[5] MINE, H. and OSAKI, S., Markovian decision processes. American Elsevier, New York, 1970.
[6] HOWARD, R. A., Dynamic programming and Markov processes. Wiley, 1960.
[7] BELLMAN, R., Introduction to the mathematical theory of control processes, Volumes I and II, 1967, 1971. Academic Press, New York.
[8] DREYFUS, S., Dynamic programming and the calculus of variations. Academic Press, New York, 1965.
[9] ARIS, R., Discrete dynamic programming. Blaisdell, New York, 1964.
[10] BELLMAN, R. and ANGEL, E., Dynamic programming and partial differential equations. Academic Press, New York, 1972.
[11] LEE, E. S., Quasilinearization and invariant imbedding. Academic Press, New York, 1968.
[12] BELLMAN, R. and WING, G. M., An introduction to invariant imbedding. Wiley, New York, 1975.
[13] WING, G. M., An introduction to transport theory. Wiley, New York, 1962.
[14] BELLMAN, R. and ZADEH, L. A., Decision-making in a fuzzy environment. Management Sciences 17 (1970), B-141-B-164.
[15] BELLMAN, R., A note on cluster analysis and dynamic programming. Mathematical Biosciences 19 (1973), 311-312.
[16] CHANG, S. S. L., Fuzzy dynamic programming and decision-making process. In Proceedings of the 3rd Princeton Conference on Information Sciences, 1969, pp. 200-203.
[17] ZADEH, L. A., Fuzzy sets. Information and Control 8 (1965), 338-353.
[18] ZADEH, L. A., Fuzzy algorithms. Information and Control 12 (1968), 94-102.

University of Southern California,
Los Angeles, California 90007
U.S.A.
and
Kansas State University,
Manhattan, Kansas 66502
U.S.A.
