An inequality with applications to statistical estimation for probabilistic functions of markov...

Preview:

Citation preview

8/14/2019 An inequality with applications to statistical estimation for probabilistic functions of markov processes and to a mo…

http://slidepdf.com/reader/full/an-inequality-with-applications-to-statistical-estimation-for-probabilistic 1/4

AN INEQUALITY WITH APPLICATIONS TO STATISTICAL

E S T I M A T I O N F O R P R O B A B I L IS T IC F U N C T I O N S

O F M A R K O V P R O C E S S E S A N D T O

A M O D E L F O R E C O L O G Y

BY LEONARD E. BAUM AND J . A. EAGON

Communicated by R. C. Buck, November 21, 1966

1. Summary . T h e ob jec t of th i s no t e is to p ro ve the theo rem be low

and ske tch two appl ica t ions , one to s ta t i s t i ca l es t imat ion for (p robabi l is t ic) funct ions of Markov processes [ l ] and one to Blak ley ' s

model for ecology [ 4 ] .

2. Resu l t .

T H E O R E M . Let P(x)=P({xij}) be a polynomial w ith nonnegative

coefficients homogeneous of degree d in its variables { # # } . Let x= {##}

be any point of the doma in D: ##§: ( ) , ] p L i ## = 1, i = l, • • • , p,

j=l, • • • , q% . For x= {xij} £ £ > le t 3(#) = 3 {# #} denote the point of D

whose i, j coordinate is

( dP\ \ f « dP3(*)<i = ( Xij 7 — ) / 2 * *< i —

\ dXij\(X)// , - i dXij (»>

Then P(3(x))>P(x) unless 3 ( x ) = x .

Notation, fi will de no te a dou bly indexed a r r ay of non neg a t ive

i n t e g e r s : fx= {M#}> i=

l > • • • > < lu i=l, • • • , A #* the n de no tes

I l f - i H î - i ^ * S im ila r ly , cM is an abb rev ia t ion for C[M iJ}. The po ly

n o m i a l P({xij}) i s then wr i t t en P(x) = ] C M V ^ -In ou r no t a t i on :

(1) 3(&)*i = ( Z ) «Wnys* ) / JLH CpiiijX».

We wish t o p rove

(2) P O ) = E <^" ̂ z ^ n f i 3(*)«w.

P R O O F .

^ ) = i k n i i 3 ( * ) « }

360

8/14/2019 An inequality with applications to statistical estimation for probabilistic functions of markov processes and to a mo…

http://slidepdf.com/reader/full/an-inequality-with-applications-to-statistical-estimation-for-probabilistic 2/4

l/d+l

^ijld\ d/d+1

STATISTICAL ESTIMATION AN D A MODEL FOR ECOLOGY 3 6 1

W e ap ply H old er ' s inequ a l i ty [6, p . 21] to ob ta in

/ P Qi \

( P Qi / %.. \Mild\ *

( 3 ) x {E^nn~-) }( In the las t b races we have used (x»)d+lld=x* ü f - i H j - i ^

/ t f0

S i nc e

Z ? - i Z î - i M t f / ^ 5 5 1 by hom oge ne i ty o f P , we can ap ply the inequ a l i ty

of geom et r ic an d a r i th m et ic m ean s [6, p . 16] to the dou ble p ro du c ts

of the second brace of (3) to conclude:

P Qi / X.-J \PHld P Qi nu Xss

( 4 ) E v n n f e - i i ^ E L ? ^ -We now subst i tute the def ini t ion (1) of 3(#)# in the express ion on

the r igh t o f (4) and in te rchange the order o f summat ion to ob ta in :P Qi n . . < Y » . .

2- c,x" 2- 2- -r 77-r"\ P Qi

= "J Z <T»*M X S /*</*#

(5) ' ( Z Z Ctfp'ijJP' ) / ( Z <V/4^ M ' )

= — £ Z ^i f £ M*;VMJ / ( Z MW*

M'J

* Z Z^'Mtfo^'-i o - i M'

For each (i, j) the express ion wi th in the bracke ts i s = 1 and by hy

pothesis for each i, Z ? - i x * y = *• Hence the whole las t express ion of (5)

reduces to (1/d) Z f - i Z ? { - i Z M V / 4 0X M

' - B u t thi s is ju st (1 /d) Z</a**/o

• (dP/dXij0) so by t he E ule r theo rem for homog eneous func t ions i t i s

equa l to 2/lcMa;'1.

F ina l ly , i f we use th i s upper bound H^W for the express ion within

the second braces in (3 ) , r a i se bo th s ides o f (3 ) to the (d+l ) s t power ,

and d iv ide bo th s ides o f the resu l t ing inequa l i ty by ÇSpW) we ob

ta in the des i red inequa l i ty (2 ) .

T h a t P(3{xij})>P{xij) if { ## }? *{ ## } follows from (4) an d th e

s t r ic tness o f the inequa l i ty o f geomet r ic and a r i thmet ic means i f a l l

s u m m a n d s a r e n o t e q u a l .

8/14/2019 An inequality with applications to statistical estimation for probabilistic functions of markov processes and to a mo…

http://slidepdf.com/reader/full/an-inequality-with-applications-to-statistical-estimation-for-probabilistic 3/4

362 L. E. BAUM AND J. A. EAGON [May

3. Application 1. T h e f irs t app l ica t ion of th i s theor em i s to s ta t i s t i

ca l es t im at ion for (p robabi l i s ti ca l ly ) lum ped M ark ov cha ins . Le t 5

be th e f inite s t a t e space of a M ar ko v cha in. Le t ƒ be a funct ion f rom

5 to R. L e t yG.RT, T an in teger , be an observ a t ion . In [ l ] the p rob lem

is cons idered of es t im at ing the t rans i t ion probab i l i t ies a# for i, j&S,

given y.

L e t X=(fT)-

1(y). XQS

T. F o r xEX, i, jES, le t Vi j{x ) be t he num

ber of t imes the pat tern • , • , • , i, j , •, -, • oc cu rs in x. The func t ion

P ( { # # } )=

Z » e x TLiJesaVij^

x) m ay t>

ei n t e rp r e t ed a s t he " p robab i l i t y

of observing y given the t rans i t ion probabi l i t i es { # # } . " N o t e t h a t P

i s a homogeneous po lynomia l o f degree T with no nne ga t iv e ( in teger )

coeff ic ients in the var iables a#.

An i te ra t ive procedure for es t imat ing the t rans i t ion probabi l i t i es

{&<;} given y i s sugg ested in [ l ] . If {ay} is an a priori es t ima te , l e t

4 j = ( ] £ * e j r Pij(x) TLk,iG8 öww(af)

/-P({^y})«A'a

m aY be in te rp re ted as

th e "a posteriori expected value of the f requency of t ransi t ion f rom

s t a t e i t o s t a t e j given y and the a priori probabi l i t i es {da}." T h u s

AyXjA'u may be thought o f as an "a p osteriori es t imate o f the t rans i

t ion probabi l i t i es g iven y." Since

A'ufàA'H = a^dP/da^/^aaidP/dai,)

by ou r the ore m app l ied to t he t ran sfo rm atio n 3{#<,-} = {A'^fLA'^}

w e c o n cl ud e t h a t - P ( 3 { a # } ) è P ( { a # } ) . I n o t h e r w o r d s t h e a pos

teriori es t imate o f t rans i t ion probabi l i t i es increases the l ike l ihood of

the g iven obse rva t i on y.

Various resul ts on the convergence of hi l l c l imbing i tera t ion pro

cedures [2 ] , [3 ] , [5] may be adduced to show tha t fo r a lmos t a l l

s ta r t s success ive i t e ra t ions wi l l converge to a connec ted component

of the local m ax im um set of P . If P ha s only f initely m an y local ma x

ima then success ive i t e ra tes converge to a po in t .

This i s the usua l case in the more genera l s i tua t ion cons idered in

[ l ] in which the observa t ion y t a t t ime t i s ob ta in ed f rom th e M ark ov

s t a t e x t a t t ime t accord ing to P(y t = k \ x t =j) = b3-k where bjk is an 5 X r

s toc has t ic m at r ix w hich is a l so to be es t im ated . H ere th e iden ti f iab i l-

i ty problem does not ar ise s ince, according to a theorem of Ted

Pé t r i e [7 ] , " i n gene ra l " no o the r ( a# ) , (b jk) y i e l d s t h e s a m e ^ p r o b a

bilities as a given (a°y), (&°*) (save for the si re label l ings of s ta tes) .

The second appl icat ion is to some resul ts of Blakley and Dixon

[2]» [3], [ 4 ] . Let r be a symmetr ic ^ - l inear fo rm on Rn

t h a t h a s

nonnegat ive coeff ic ients with respect to the s tandard basis for Rn.

L e t g(rj) = T(rj, TJ, • • • , rj) w h e r e rj is a vector in Rn. Since g is then

ju s t a pth. degree homogeneous po lynomia l wi th nonnega t ive coef iv

8/14/2019 An inequality with applications to statistical estimation for probabilistic functions of markov processes and to a mo…

http://slidepdf.com/reader/full/an-inequality-with-applications-to-statistical-estimation-for-probabilistic 4/4

1967I STATISTICAL ESTIMATION AND A MODEL FOR ECOLOGY 363

cients of the components of rj we may apply the theorem of this note

to it. In Blakley's model g is the adaptati on (rate of growth)

of a population. The transformation in Blakley's model cr(rj)==r

]i(dg(v)/drli)/pÇ[(y) is the same as the transformation 3 {#»-,*} where

Xij^rjj, * = 1, i = l , • • • , n.

In Blakley's model if rj is the distribution of genotypes at time t>

then a(r]) is the distribution at time t+l. Thus it follows from the

theorem in this note with i = 1 tha t adaptation is nondecreasing with

time when evolution of the genotypes at a single locus is considered.

Our theorem with i>\ yields the same conclusion under natural

hypotheses for evolution of the genotypes at several loci. This non-

decreasing of the adapta tion with time is clearly a desirable feature

of the model.

REFERENCES

1. L. E. Baum, A statistical estimation procedure for probabilistic functions of

M arkov processes% IDA -CR D Working Paper No. 131.2. G. R. Blakley, Homogeneous non-negative symmetric quadratic transformations,

Bull. Amer. Math. Soc. 70 (1964), 712-715.3. G. R. Blakley and R. D. Dixon, The sequence of iterates of a non-negative nonlinear transformation. Ill, The theory of homogeneous symmetric transformations andrelated differential equations, (to appear) .

4. G. R. Blakley, Natural selection in ecosystems from the standpoint of mathematicalgenetics, (to appear).

5. Wolfgang Hahn, Theory and application of Liapunov's direct method, Prentice-Hall, Englewood Cliffs, N. J., 1963, pp. 139-150.

6. G. H. Hardy, J. E. Littlewood, and G. Pólya, Inequalities, CambridgeUniv. Press, New York, 1959.

7. Ted Pétrie, Classification of equivalent processes which are probabilistic functions

of finite Markov chains, IDA-CR D Working Paper No. 181, ID A- CR D Log No. 8694.

INSTITUTE FOR DEFENSE ANALYSES

Recommended