


ON A STOCHASTIC APPROXIMATION THEOREM IN A HILBERT SPACE AND ITS APPLICATIONS

G. I. SALOV

(Translated by A. B. Aries)

Stochastic approximation procedures for solving Fredholm integral equations of the first kind, arising in signal detection and filtration problems involving random processes, have been described in [1]. It has turned out that the convergence proof given in [1] contains an error; nevertheless, with a minor correction, it does hold for equations of the second kind [2].

The main result of this paper is Theorem 2, which establishes the validity of the theorems given in [1]. We also indicate a stochastic approximation procedure for a system of integral equations, related to the approach taken in [3] for the detection of signals in arbitrary noise (see Theorem 3).

1. Let $(\Omega, \mathcal{A}, P)$ denote a complete probability space, and let $(H, \mathcal{B}(H))$ be a real (infinite-dimensional) separable Hilbert space with the $\sigma$-algebra $\mathcal{B}(H)$ of all Borel sets in $H$. We denote by $X, Y, X_n, Y_n, Z_n$, $n = 1, 2, \ldots$, random variables with values in $H$, i.e., measurable mappings of $(\Omega, \mathcal{A}, P)$ into $(H, \mathcal{B}(H))$.

By the conditional expectation of a random variable $Y$ with respect to a $\sigma$-algebra $\mathcal{F}$, we mean a random variable $E\{Y \mid \mathcal{F}\}$, unique up to equivalence, such that for each $h \in H$, with probability 1,

$(E\{Y \mid \mathcal{F}\}, h) = E\{(Y, h) \mid \mathcal{F}\}.$

If $X$ is an $\mathcal{F}$-measurable random variable, then $E\{(X, Y) \mid \mathcal{F}\} = (X, E\{Y \mid \mathcal{F}\})$. We denote by $\mathcal{F}_n$ the $\sigma$-algebra generated by the random variables $X_1, \ldots, X_n$. In addition, instead of $E\{\cdot \mid \mathcal{F}_n\}$ we shall frequently write $E\{\cdot \mid X_1, \ldots, X_n\}$.

Let $S$ be a measurable mapping of $(H, \mathcal{B}(H))$ into $(H, \mathcal{B}(H))$, and let $\theta \in H$ be a solution of the equation $S(x) = 0$. We recall now two well-known assertions.

Let $X_1$ be an arbitrary random variable with $E\|X_1\|^2 < \infty$. Let $X_2, X_3, \ldots$ be defined by

(1)  $X_{n+1} = X_n - a_n Y_n,$

where $a_1, a_2, \ldots$ is a sequence of positive numbers,

(2)  $\sum_{n=1}^{\infty} a_n = \infty, \qquad \sum_{n=1}^{\infty} a_n^2 < \infty,$

and the variables $Y_n$ are such that, for all $n \ge 1$,

$E\{Y_n \mid X_1, \ldots, X_n\} = E\{Y_n \mid X_n\} = S(X_n).$
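The two requirements in (2) pull in opposite directions: the steps must add up to infinity, yet their squares must be summable. As a quick numerical sketch (the choice $a_n = 1/n$ and the helper names `step` and `partial_sums` are assumptions made for illustration, not taken from the paper), the partial sums of $a_n$ keep growing while those of $a_n^2$ stay below $\pi^2/6$:

```python
import math

def step(n):
    """Step size a_n = 1/n, an assumed standard choice satisfying (2)."""
    return 1.0 / n

def partial_sums(N):
    """Return (sum of a_n, sum of a_n^2) over n = 1..N."""
    s1 = sum(step(n) for n in range(1, N + 1))
    s2 = sum(step(n) ** 2 for n in range(1, N + 1))
    return s1, s2

for N in (10, 1_000, 100_000):
    s1, s2 = partial_sums(N)
    # s1 grows roughly like log N; s2 stays bounded by pi^2 / 6
    print(f"N={N:>6}  sum a_n={s1:8.4f}  sum a_n^2={s2:.6f}  bound={math.pi**2 / 6:.6f}")
```

Any sequence of the form $a_n = c\,n^{-\alpha}$ with $1/2 < \alpha \le 1$ would serve equally well in this check.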

Downloaded 11/21/14 to 130.113.86.233. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php


Theorem 1. For all $n \ge 1$, let

$E\{\|Y_n\|^2 \mid X_1, \ldots, X_n\} \le A + B\|X_n\|^2,$

where $A, B > 0$. Then:

1. If for any $x \in H$

(3)  $(x - \theta, S(x)) \ge 0,$

then the sequence $\|X_n - \theta\|$ converges as $n \to \infty$ with probability 1, and the limit of the sequence $E\|X_n - \theta\|^2$ exists.

2. If for any $x \in H$

(4)  $(x - \theta, S(x)) \ge k\|x - \theta\|^2,$

where $k > 0$, then $\|X_n - \theta\| \to 0$ as $n \to \infty$ with probability 1, and in addition $E\|X_n - \theta\|^2 \to 0$.

Assertion 1 is a version of Lemma II-2 from [4] (see also [5]). The proof of Assertion 2 is similar to the proof in the finite-dimensional case.

2. In this section we establish the convergence of the process (1) for the important case when $S(x) = Rx - z$, where $z \in H$ and $R$ is a completely continuous, self-adjoint, strictly positive operator on $H$: $(Rx, x) > 0$ for any $x \in H \setminus \{0\}$.

Set

(5)  $Y_n = T_n X_n - Z_n,$

where $T_1, T_2, \ldots$ is a sequence of random linear operators, i.e., for each $n \ge 1$ and $\omega \in \Omega$, $T_n(\omega)$ is a bounded linear operator from $H$ into $H$, and, furthermore, for each $x \in H$, $T_n x$ is a random variable with values in $H$.

We shall assume from now on that for each $n \ge 1$ the following general conditions are satisfied:

(i) $E\{\|Z_n\|^2 \mid X_1, \ldots, X_n\} = E\|Z_n\|^2 \le K_1 < \infty$,
(ii) $E\{Z_n \mid X_1, \ldots, X_n\} = EZ_n = z$, $z \in H$,
(iii) $\|T_n(\omega)x\| \le C_n(\omega)\|x\|$, $\omega \in \Omega$, $x \in H$,
(iv) $E\{C_n^2 \mid X_1, \ldots, X_n\} = EC_n^2 \le K_2 < \infty$,
(v) $E\{T_n X_n \mid X_1, \ldots, X_n\} = E\{T_n X_n \mid X_n\} = RX_n$.

If the equation

(6)  $Rx = z$

has a solution $\theta \in H$, then

(7)  $(x - \theta, S(x)) = (R(x - \theta), x - \theta) \ge 0.$

Thus, in this case, only condition (3) can be satisfied, and condition (4) required by Theorem 1 is not: the eigenvalues of a completely continuous operator on an infinite-dimensional space accumulate at zero, so no constant $k > 0$ can serve for all $x$.

In the next theorem, we remove the requirement (4) for the variables $Y_n$ of type (5) we are considering.
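The obstruction to condition (4) of Theorem 1 can be seen on a finite truncation. The following sketch assumes, purely for illustration, a diagonal operator with eigenvalues $\lambda_i = 1/i^2$ (the function name `rayleigh_quotients` is likewise hypothetical). The quotient $(Re_i, e_i)/\|e_i\|^2$ is positive for every basis vector but its minimum shrinks as the truncation dimension grows, so no single $k > 0$ bounds it from below:

```python
import numpy as np

def rayleigh_quotients(d):
    """Return (R e_i, e_i) / ||e_i||^2 for the first d basis vectors.

    R is a hypothetical diagonal operator with eigenvalues 1/i^2,
    truncated to d dimensions.
    """
    lam = 1.0 / np.arange(1, d + 1) ** 2
    R = np.diag(lam)
    basis = np.eye(d)
    return np.array([e @ R @ e / (e @ e) for e in basis])

for d in (5, 50, 500):
    q = rayleigh_quotients(d)
    # every quotient is positive, but the smallest one equals 1/d^2
    print(f"d={d:>3}  smallest quotient = {q.min():.2e}")
```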

Theorem 2. Let conditions (i)-(v) be satisfied. Furthermore, let there exist a solution $\theta \in H$ of eq. (6). Then for any random variable $X_1$ with $E\|X_1\|^2 < \infty$ and sequence $X_2, X_3, \ldots$, defined by

(8)  $X_{n+1} = X_n + a_n(Z_n - T_n X_n),$

we have:

$P\{\|X_n - \theta\| \to 0,\ n \to \infty\} = 1, \qquad E\|X_n - \theta\|^2 \to 0,\ n \to \infty.$

PROOF. By (1), (5), (7), (8) and Assertion 1 of Theorem 1, it suffices to show that

$E\|X_n - \theta\|^2 \to 0, \quad n \to \infty.$


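1. We introduce variables $Y_n(\mu)$ of a more general structure than that of (5):

(9)  $Y_n(\mu) = T_n X_n(\mu) + \mu(X_n(\mu) - \theta) - Z_n,$

where the parameter $\mu \in [0, 1]$. For the variables (9)

$S(X_n(\mu)) = RX_n(\mu) + \mu(X_n(\mu) - \theta) - z,$

so that we have a perturbed equation of the second kind:

(10)  $Rx + \mu x = z + \mu\theta.$

Since the operator $R$ is strictly positive and $\mu \ge 0$, equation (10) has a unique solution $x = \theta$. Furthermore,

(11)  $E\{\|Y_n(\mu)\|^2 \mid \mathcal{F}_n\} \le 4\bigl(K_1 + K_2\|\theta\|^2 + (\mu^2 + K_2)\|X_n(\mu) - \theta\|^2\bigr),$

(12)  $(x - \theta, S(x)) = (R(x - \theta), x - \theta) + \mu\|x - \theta\|^2.$

Therefore all the conditions of Theorem 1 are satisfied when $\mu > 0$. Hence for the sequence of random variables $\{X_n(\mu)\}$ defined inductively by the conditions $X_1(\mu) = X_1$,

(13)  $X_{n+1}(\mu) = X_n(\mu) - a_n\bigl(T_n X_n(\mu) + \mu(X_n(\mu) - \theta) - Z_n\bigr),$

we have $E\|X_n(\mu) - \theta\|^2 \to 0$, $n \to \infty$, for any $\mu > 0$. From (9) and (13) we have

(14)  $\|X_{n+1}(\mu) - \theta\|^2 = \|X_n(\mu) - \theta\|^2 - 2a_n(X_n(\mu) - \theta, Y_n(\mu)) + a_n^2\|Y_n(\mu)\|^2.$

By (11), (12), and (14) (for $\theta = 0$)

$E\|X_{n+1}(\mu)\|^2 \le E\|X_n(\mu)\|^2 + 4a_n^2\bigl(K_1 + (\mu^2 + K_2)E\|X_n(\mu)\|^2\bigr).$

Now it is easy to see that

$E\|X_{n+1}(\mu)\|^2 \le E\|X_1\|^2 \prod_{i=1}^{n}\bigl(1 + 4a_i^2(\mu^2 + K_2)\bigr) + 4K_1\sum_{i=1}^{n} a_i^2 \prod_{j=1}^{n}\bigl(1 + 4a_j^2(\mu^2 + K_2)\bigr).$

By (2) we obtain

(15)  $\sup_{0 \le \mu \le 1}\ \sup_{n \ge 1}\ E\|X_n(\mu) - \theta\|^2 \le K_3 < \infty.$

By (11), (14), and (15), for all $\mu \in [0, 1]$,

(16)  $2\sum_{n=1}^{\infty} a_n E(X_n(\mu) - \theta, S(X_n(\mu))) - E\|X_1 - \theta\|^2 \le \sum_{n=1}^{\infty} a_n^2 E\|Y_n(\mu)\|^2 \le 4\bigl(K_1 + K_2\|\theta\|^2 + (1 + K_2)K_3\bigr)\sum_{n=1}^{\infty} a_n^2.$

Relations (2), (12), (14), and (16) enable us to assert (independently of Theorem 1) that, as $n \to \infty$,

(17)  $E\|X_n(\mu) - \theta\|^2 \to 0 \quad \text{if } \mu > 0.$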
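The role of the perturbation parameter can be checked numerically. The sketch below assumes a 20-dimensional truncation with eigenvalues $1/i^2$ and a randomly drawn $\theta$ (all illustrative choices, as are the names `S_mu` and `check`). It verifies that $(x - \theta, S(x))$ for the perturbed map equals $(R(x - \theta), x - \theta) + \mu\|x - \theta\|^2$, which is at least $\mu\|x - \theta\|^2$, so the strong monotonicity condition of Theorem 1 holds with $k = \mu$:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 20                                   # assumed truncation dimension
lam = 1.0 / np.arange(1, d + 1) ** 2     # assumed eigenvalues of R
R = np.diag(lam)
theta = rng.standard_normal(d)           # assumed solution of R x = z
z = R @ theta

def S_mu(x, mu):
    """Perturbed map S(x) = R x + mu (x - theta) - z."""
    return R @ x + mu * (x - theta) - z

def check(x, mu):
    """Return both sides of the identity for (x - theta, S(x))."""
    e = x - theta
    lhs = e @ S_mu(x, mu)
    rhs = e @ R @ e + mu * (e @ e)
    return lhs, rhs

x = rng.standard_normal(d)
for mu in (0.0, 0.1, 1.0):
    lhs, rhs = check(x, mu)
    print(f"mu={mu:.1f}  lhs={lhs:.6f}  rhs={rhs:.6f}  mu*||x-theta||^2={mu * np.sum((x - theta) ** 2):.6f}")
```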

2. Let $\varphi_i$ be a normalized eigen-element of the operator $R$. We shall show that, for each $\mu \in [0, 1]$,

(18)  $E(X_n(\mu) - \theta, \varphi_i)^2 \to 0,$

as $n \to \infty$, which follows from (17) for $\mu > 0$. By (9) and (13) we have (here and further $\theta = 0$)

(19)  $E(X_{n+1}(\mu), \varphi_i)^2 = E(X_n(\mu), \varphi_i)^2 - 2a_n E(X_n(\mu), \varphi_i)(Y_n(\mu), \varphi_i) + a_n^2 E(Y_n(\mu), \varphi_i)^2.$


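Since

(20)  $E(X_n(\mu), \varphi_i)(Y_n(\mu), \varphi_i) = E\,E\{(X_n(\mu), \varphi_i)(T_n X_n(\mu) - Z_n, \varphi_i) \mid \mathcal{F}_n\} + \mu E(X_n(\mu), \varphi_i)^2$
$\qquad = E(X_n(\mu), \varphi_i)(RX_n(\mu), \varphi_i) + \mu E(X_n(\mu), \varphi_i)^2 = (\lambda_i + \mu)E(X_n(\mu), \varphi_i)^2,$

where $\lambda_i > 0$ is an eigenvalue of the operator $R$, corresponding to the eigen-element $\varphi_i$, we see from (19) that

(21)  $E(X_{n+1}(\mu), \varphi_i)^2 + 2\sum_{j=1}^{n} a_j(\lambda_i + \mu)E(X_j(\mu), \varphi_i)^2 = E(X_1, \varphi_i)^2 + \sum_{j=1}^{n} a_j^2 E(Y_j(\mu), \varphi_i)^2.$

This means, by virtue of (16), that for each $\mu \in [0, 1]$ the series

(22)  $\sum_{n=1}^{\infty} a_n(\lambda_i + \mu)E(X_n(\mu) - \theta, \varphi_i)^2 = f_i(\mu)$

converges; in addition, the limit of the sequence $E(X_n(\mu) - \theta, \varphi_i)^2$ as $n \to \infty$ exists, that is to say, (18) holds by virtue of (2).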
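A noise-free caricature shows how the divergence of $\sum a_n$ drives each coordinate to zero, and why the rate depends on the eigenvalue. Assuming $T_n = R$, $Z_n = z$, $\theta = 0$ and $a_n = 1/n$ (illustrative simplifications; the function name `coordinate_after` is hypothetical), the coordinate along an eigen-element with eigenvalue $\lambda$ satisfies $u_{n+1} = (1 - a_n(\lambda + \mu))u_n$:

```python
def coordinate_after(N, lam, mu=0.0):
    """Coordinate u_{N+1} of the noise-free recursion, starting from u_1 = 1.

    Each step multiplies by 1 - a_n (lam + mu) with the assumed step a_n = 1/n.
    """
    u = 1.0
    for n in range(1, N + 1):
        u *= 1.0 - (lam + mu) / n
    return u

for lam in (0.5, 0.1, 0.01):
    values = [coordinate_after(N, lam) for N in (100, 10_000)]
    # the product decays roughly like N**(-lam): slowly for small eigenvalues
    print(f"lambda={lam:<5} after 100 steps: {values[0]:.4f}   after 10000 steps: {values[1]:.4f}")
```

Every coordinate tends to zero, but not uniformly in the eigenvalue, which is one way to see why the full norm needs a separate argument.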

3. We now note that the sum of the series (22) and also the sum of the series

(23)  $\sum_{n=1}^{\infty} a_n^2 E(Y_n(\mu), \varphi_i)^2 = g_i(\mu)$

are continuous functions on the interval $[0, 1]$. In fact, it is easy to see by induction on $n$ that $E\|X_n(\mu) - X_n(\mu_0)\|^2$ and $E\|Y_n(\mu) - Y_n(\mu_0)\|^2$ tend to 0 as $\mu \to \mu_0$ for each $n$. Due to the inequality

$\bigl|E(\varphi_i, Y_n(\mu))^2 - E(\varphi_i, Y_n(\mu_0))^2\bigr| \le \|\varphi_i\|^2\bigl(E^{1/2}\|Y_n(\mu)\|^2 + E^{1/2}\|Y_n(\mu_0)\|^2\bigr)E^{1/2}\|Y_n(\mu) - Y_n(\mu_0)\|^2$

and the boundedness of $E\|Y_n(\mu)\|^2$, the series (23) consists of continuous functions on $[0, 1]$; it converges uniformly in $\mu$ on the interval $[0, 1]$ by virtue of (2), according to the Weierstrass principle. Finally, by virtue of (18) and (21), the sum of the series (22) is also continuous on $[0, 1]$.

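4. By hypothesis, the sequence of eigen-elements $\varphi_i$ of the operator $R$ is a complete orthonormal system in $H$. Furthermore, $\lambda_i \ge \lambda_{i+1} > 0$ and $\lambda_i \to 0$, $i \to \infty$. We consider now the sequence

$h_n(s, \mu) = E\sum_{i=1}^{\infty} \lambda_i^s (X_n(\mu) - \theta, \varphi_i)^2 = \sum_{i=1}^{\infty} \lambda_i^s E(X_n(\mu) - \theta, \varphi_i)^2,$

where $s \in [0, 1]$. By (13) and (20) we have (here and hereafter $\theta = 0$):

$\sum_{i=1}^{\infty} \lambda_i^s E(X_{n+1}(\mu), \varphi_i)^2 = \sum_{i=1}^{\infty} \lambda_i^s E(X_n(\mu), \varphi_i)^2 - 2a_n\sum_{i=1}^{\infty} \lambda_i^s(\lambda_i + \mu)E(X_n(\mu), \varphi_i)^2 + a_n^2\sum_{i=1}^{\infty} \lambda_i^s E(Y_n(\mu), \varphi_i)^2.$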


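Similarly to (21),

(24)  $\sum_{i=1}^{\infty} \lambda_i^s E(X_{n+1}(\mu), \varphi_i)^2 + 2\sum_{j=1}^{n} a_j \sum_{i=1}^{\infty} \lambda_i^s(\lambda_i + \mu)E(X_j(\mu), \varphi_i)^2 = \sum_{i=1}^{\infty} \lambda_i^s E(X_1, \varphi_i)^2 + \sum_{j=1}^{n} a_j^2 \sum_{i=1}^{\infty} \lambda_i^s E(Y_j(\mu), \varphi_i)^2.$

Using (22) and (23), we have

(25)  $\sum_{i=1}^{\infty} \lambda_i^s \sum_{j=1}^{\infty} a_j(\lambda_i + \mu)E(X_j(\mu), \varphi_i)^2 = \sum_{i=1}^{\infty} \lambda_i^s f_i(\mu),$

(26)  $\sum_{i=1}^{\infty} \lambda_i^s \sum_{j=1}^{\infty} a_j^2 E(Y_j(\mu), \varphi_i)^2 = \sum_{i=1}^{\infty} \lambda_i^s g_i(\mu).$

From (24) and (16) it follows that

(27)  $2\sum_{i=1}^{\infty} f_i(\mu) - E\|X_1\|^2 \le \sum_{i=1}^{\infty} g_i(\mu) = \sum_{n=1}^{\infty} a_n^2 E\|Y_n(\mu)\|^2 < K_4.$

By the Dirichlet principle, the series (25) and (26) converge uniformly in $\mu$ on $[0, 1]$ for each $s > 0$. Due to the continuity of the functions $f_i(\mu)$ and $g_i(\mu)$, the sums of the series (25), (26) are also continuous on $[0, 1]$. Using (24), we find that the limit $h(s, \mu)$ of the sequence $h_n(s, \mu)$ is, for each fixed $s > 0$, a function which is continuous in $\mu$ on the interval $[0, 1]$. Further, since

$0 \le h_n(s, \mu) \le \lambda_1^s E\|X_n(\mu) - \theta\|^2,$

by (17) we have that $h(s, \mu) = 0$ for $\mu > 0$. Therefore, due to the continuity of $h(s, \mu)$ proved above,

(28)  $\lim_{n \to \infty} h_n(s, 0) = h(s, 0) = 0 \quad \text{if } s > 0.$

Finally, we consider the sequence $h_n(s, 0)$. Since

$\lambda_i^s f_i(0) \le \sup_{0 \le s \le 1} \lambda_1^s\, f_i(0) \quad \text{and} \quad \lambda_i^s g_i(0) \le \sup_{0 \le s \le 1} \lambda_1^s\, g_i(0),$

by virtue of (27) for $\mu = 0$ the series (25), (26) converge uniformly in $s$ on the interval $[0, 1]$ by the Weierstrass principle. Using (24) again, we find that the function $h(s, 0)$ is continuous on $[0, 1]$. Therefore, by (28), $h(0, 0) = 0$, which was to be proved.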
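A small simulation can make the statement of Theorem 2 tangible. The sketch below is only an illustration under assumed data: a 5-dimensional truncation with eigenvalues $1, 1/2, \ldots, 1/5$, Gaussian perturbations of $R$ and $z$ that keep $ET_n = R$ and $EZ_n = z$, and the step $a_n = n^{-0.6}$, which satisfies (2). The function name `run_procedure` and all parameters are hypothetical.

```python
import numpy as np

def run_procedure(n_steps=5000, d=5, noise=0.1, seed=0):
    """Iterate X_{n+1} = X_n + a_n (Z_n - T_n X_n) on an assumed toy problem.

    Returns the initial and final distances to the solution theta of R x = z.
    """
    rng = np.random.default_rng(seed)
    lam = 1.0 / np.arange(1, d + 1)          # assumed eigenvalues of R
    R = np.diag(lam)
    theta = np.ones(d)                        # assumed solution
    z = R @ theta
    x = np.zeros(d)
    err0 = np.linalg.norm(x - theta)
    for n in range(1, n_steps + 1):
        a_n = n ** -0.6                       # sum a_n diverges, sum a_n^2 converges
        T_n = R + noise * rng.standard_normal((d, d))   # random operator, E T_n = R
        Z_n = z + noise * rng.standard_normal(d)        # random right side, E Z_n = z
        x = x + a_n * (Z_n - T_n @ x)
    return err0, np.linalg.norm(x - theta)

err0, err = run_procedure()
print(f"initial error {err0:.3f}, error after 5000 steps {err:.3f}")
```

In a finite truncation the smallest eigenvalue is bounded away from zero, so this run does not exhibit the infinite-dimensional difficulty the proof addresses; it only illustrates the mechanics of the iteration.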

3. We apply Theorem 2 to the problem of detection of a signal in arbitrary noise [3].

Assume that the random processes $\xi(t)$ ("noise") and $\zeta(t)$ are measurable on a finite observation interval $I$, and all their sample functions belong to a real Hilbert space $L_2(I)$. The optimal Neyman-Pearson criterion for testing the processes contains a functional, the Radon-Nikodym derivative $F$ on $L_2(I)$. In the general case (without special assumptions concerning $\xi(t)$ and $\zeta(t)$), it is rather difficult to resolve the question of the existence of the functional $F$ and its form. Following [3], we take a different criterion for optimality, which also leads to the Radon-Nikodym functional if the latter exists and has a finite second moment.

We denote by $f$ a functional on $L_2(I)$ such that $E[f^2(\xi)] < \infty$. If a probability measure on the Borel $\sigma$-algebra of the space $L_2(I)$, corresponding to the process $\zeta(t)$, is absolutely continuous with respect to the measure corresponding to the process $\xi(t)$, and if, in addition, $E[F^2(\xi)] < \infty$, then

(29)  $\dfrac{E[f(\zeta) - f(\xi)]}{\bigl\{E[f(\xi) - Ef(\xi)]^2\bigr\}^{1/2}}$

attains a maximum for $f = k_1 F + k_2$. It is intuitively obvious that in the converse case where $F$ does not exist, or $E[F^2(\xi)] = \infty$, the functional $f$ obtained by maximizing (29) will also be appropriate to distinguish between the processes. Therefore, solving an appropriate extremal problem, we can approximate $F$ as desired.

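It is natural to approximate $F$ by a polynomial of degree $k \ge 1$,

(30)  $f(x) = \sum_{p=1}^{k} \int_I \cdots \int_I \varphi_p(t_1, \ldots, t_p)\,x(t_1)\cdots x(t_p)\,dt_1 \cdots dt_p,$

in which $\varphi_p \in L_2(I^p)$ or, equivalently, the vector-function $\varphi = (\varphi_1, \ldots, \varphi_k)$ belongs to the real separable Hilbert space $\mathbf{L}_2 = \bigoplus_{p=1}^{k} L_2(I^p)$ with scalar product

$(g, h) = \sum_{p=1}^{k} (g_p, h_p)_p$

(where $(\cdot, \cdot)_p$ is the scalar product in $L_2(I^p)$), and norm $\|g\|$.

We shall assume from now on that, for $p = 1, \ldots, k$,

$E\Bigl(\int_I \zeta^2(t)\,dt\Bigr)^{p} < \infty, \qquad E\Bigl(\int_I \xi^2(t)\,dt\Bigr)^{2p} = K(\xi; p) < \infty.$

Upon substituting (30) into (29), we may write

$\dfrac{(m_\zeta - m_\xi, \varphi)}{(R\varphi, \varphi)^{1/2}},$

where $m_\zeta, m_\xi \in \mathbf{L}_2$ are vector-functions with the components

$m_p(\zeta; t_1, \ldots, t_p) = E\zeta(t_1)\cdots\zeta(t_p) \quad \text{and} \quad m_p(\xi; t_1, \ldots, t_p) = E\xi(t_1)\cdots\xi(t_p),$

respectively. $R$ is the integral operator mapping $\mathbf{L}_2$ into itself, with matrix kernel

$\mu_{pq}(t_1, \ldots, t_p; u_1, \ldots, u_q) = E\xi(t_1)\cdots\xi(t_p)\xi(u_1)\cdots\xi(u_q) - m_p(\xi; t_1, \ldots, t_p)\,m_q(\xi; u_1, \ldots, u_q).$

The operator $R$ is completely continuous, self-adjoint and positive, which implies the following (see [3]): A necessary and sufficient condition for the functional (30) to ensure the maximum of the expression (29) is that $\varphi \in \mathbf{L}_2$ be a solution of the equation

$R\varphi = m_\zeta - m_\xi,$

that is, of the system of integral equations

(31)  $\sum_{q=1}^{k} \int_I \cdots \int_I \mu_{pq}(t_1, \ldots, t_p; u_1, \ldots, u_q)\,\varphi_q(u_1, \ldots, u_q)\,du_1 \cdots du_q = m_p(\zeta; t_1, \ldots, t_p) - m_p(\xi; t_1, \ldots, t_p) \quad (p = 1, \ldots, k).$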
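The link between maximizing the ratio $(m_\zeta - m_\xi, \varphi)/(R\varphi, \varphi)^{1/2}$ and solving $R\varphi = m_\zeta - m_\xi$ can be checked in a finite-dimensional stand-in. In the sketch below, vectors in $\mathbb{R}^8$ replace elements of the function space, and the matrix `R` and vector `m_diff` are randomly generated assumptions; the name `deflection` is hypothetical. By the Cauchy-Schwarz inequality in the inner product induced by $R$, no trial vector should beat the solution of the linear system:

```python
import numpy as np

def deflection(phi, m_diff, R):
    """Ratio (m_diff, phi) / (R phi, phi)^(1/2) for a candidate phi."""
    return (m_diff @ phi) / np.sqrt(phi @ R @ phi)

rng = np.random.default_rng(1)
d = 8                                         # assumed discretization size
A = rng.standard_normal((d, d))
R = A @ A.T + 0.1 * np.eye(d)                 # assumed symmetric, strictly positive matrix
m_diff = rng.standard_normal(d)               # assumed stand-in for m_zeta - m_xi

phi_star = np.linalg.solve(R, m_diff)         # solves R phi = m_zeta - m_xi
best = deflection(phi_star, m_diff, R)
trials = [deflection(rng.standard_normal(d), m_diff, R) for _ in range(1000)]

print(f"ratio at the solution: {best:.4f}")
print(f"largest ratio over 1000 random candidates: {max(trials):.4f}")
```

The ratio is unchanged when $\varphi$ is multiplied by a positive constant, so the maximizer is determined only up to scale.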

Assume that the operator $R$ is strictly positive. Let $\xi_n(t)$, $\zeta_n(t)$, $t \in I$, $n = 1, 2, \ldots$, be measurable random processes on $(\Omega, \mathcal{A}, P)$, formed by two independent sequences of independent observations of the random processes $\xi(t)$ and $\zeta(t)$. Let $\eta_1 = (\eta_{11}(t_1), \ldots, \eta_{1k}(t_1, \ldots, t_k);\ t_i \in I)$ be a measurable vector random process on $(\Omega, \mathcal{A}, P)$, independent of $\xi_n(t)$, $\zeta_n(t)$, $n \ge 1$, and such that all the sample vector-functions belong to $\mathbf{L}_2$ and, in addition, $E\|\eta_1\|^2 < \infty$. Then we have the following theorem generalizing the theorems from [1], [2] to systems of integral equations.

Theorem 3. If there exists a solution $\varphi \in \mathbf{L}_2$ of the system of integral equations (31) and if, in addition, the sequence of the vector random processes $\{\eta_n = (\eta_{n1}(t_1), \ldots, \eta_{nk}(t_1, \ldots, t_k);\ t_i \in I);\ n \ge 2\}$ on $(\Omega, \mathcal{A}, P)$ is given by

$\eta_{(n+1)p}(t_1, \ldots, t_p) = \eta_{np}(t_1, \ldots, t_p) + a_n\Bigl[\zeta_n(t_1)\cdots\zeta_n(t_p) - \xi_{2n}(t_1)\cdots\xi_{2n}(t_p)$
$\qquad \times \Bigl(1 + \sum_{q=1}^{k} \int_I \cdots \int_I \bigl[\xi_{2n}(u_1)\cdots\xi_{2n}(u_q) - \xi_{2n-1}(u_1)\cdots\xi_{2n-1}(u_q)\bigr]\,\eta_{nq}(u_1, \ldots, u_q)\,du_1 \cdots du_q\Bigr)\Bigr]$

($p = 1, \ldots, k$), then

$P\{\|\eta_n - \varphi\| \to 0,\ n \to \infty\} = 1, \qquad E\|\eta_n - \varphi\|^2 \to 0,\ n \to \infty.$

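PROOF. It is sufficient to verify that conditions (i)-(v) of Theorem 2 are satisfied. We restrict ourselves to some remarks. Here the sequence of vector random processes with the components

$Z_{np}(t_1, \ldots, t_p) = \zeta_n(t_1)\cdots\zeta_n(t_p) - \xi_{2n}(t_1)\cdots\xi_{2n}(t_p)$

corresponds to the random variables $Z_n$. According to the measurability assumptions of the random processes $\zeta_n$ and $\xi_n$, and, by Fubini's theorem, $(x, Z_n)$ is a random variable on $(\Omega, \mathcal{A}, P)$ for all $n \ge 1$ and $x \in \mathbf{L}_2$. Since the space $\mathbf{L}_2$ is separable, the $Z_n$ are random variables on $(\Omega, \mathcal{A}, P)$ with values in $(\mathbf{L}_2, \mathcal{B}(\mathbf{L}_2))$. The same is true for the sequence of random processes $\eta_n$ corresponding to the random variables $X_n$.

$T_n$ is an integral operator with matrix kernel whose $(p, q)$ entry is

$\xi_{2n}(t_1)\cdots\xi_{2n}(t_p)\bigl[\xi_{2n}(u_1)\cdots\xi_{2n}(u_q) - \xi_{2n-1}(u_1)\cdots\xi_{2n-1}(u_q)\bigr].$

Since the variables of the sequence $\xi_n$ are independent,

$E\{T_n\eta_n \mid \eta_1, \ldots, \eta_n\} = E\{T_n\eta_n \mid \eta_n\} = R\eta_n.$

Finally,

$E\{C_n^2 \mid \eta_1, \ldots, \eta_n\} \le 4k\sum_{p=1}^{k} K(\xi; p).$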
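For $k = 1$, a recursion of the form $\eta_{n+1} = \eta_n + a_n[\zeta_n - \xi_{2n}(1 + (\xi_{2n} - \xi_{2n-1}, \eta_n))]$ can be simulated in a finite-dimensional stand-in. Everything concrete below is assumed for illustration: vectors in $\mathbb{R}^3$ replace sample functions, the noise $\xi$ is zero-mean Gaussian with covariance $\mathrm{diag}(1, 0.5, 0.25)$, $\zeta$ is an independent copy of the noise plus a constant mean, and $a_n = (n + 20)^{-0.7}$. Two independent noise observations per step make the random operator unbiased for the covariance. The function name `run_detection_sa` is hypothetical.

```python
import numpy as np

def run_detection_sa(n_steps=20000, seed=0):
    """Simulate the k = 1 update on an assumed toy model in R^3.

    Returns (phi, eta): the exact solution of R phi = m_zeta - m_xi
    and the final iterate.
    """
    rng = np.random.default_rng(seed)
    var = np.array([1.0, 0.5, 0.25])          # assumed noise variances, so R = diag(var)
    m_zeta = np.ones(3)                       # assumed mean of zeta; the noise mean is 0
    phi = m_zeta / var                        # exact solution of R phi = m_zeta - m_xi
    eta = np.zeros(3)
    for n in range(1, n_steps + 1):
        a_n = (n + 20) ** -0.7                # sum a_n diverges, sum a_n^2 converges
        xi_odd = np.sqrt(var) * rng.standard_normal(3)          # xi_{2n-1}
        xi_even = np.sqrt(var) * rng.standard_normal(3)         # xi_{2n}
        zeta = m_zeta + np.sqrt(var) * rng.standard_normal(3)   # zeta_n
        # Z_n = zeta - xi_even; T_n eta = xi_even * (xi_even - xi_odd, eta)
        eta = eta + a_n * (zeta - xi_even * (1.0 + (xi_even - xi_odd) @ eta))
    return phi, eta

phi, eta = run_detection_sa()
print("solution phi:", phi)
print("final iterate:", np.round(eta, 3))
```

The offset 20 in the step size is a practical choice that keeps the first few multiplicative updates small; it does not affect the summability conditions.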

REMARK. It is not hard to extend our result to other systems of integral equations, for example, to those arising in problems of nonlinear filtration of random processes and in problems of filtration of vector random processes.

Received by the editors July 7, 1976

REFERENCES

[1] G. I. SALOV, The method of stochastic approximation for integral equations of detecting, filtering, and predicting random processes, Kibernetika, 2 (1973), pp. 127-132. (In Russian.)

[2] G. I. SALOV, On the method of stochastic approximation for integral equations from the random processes theory, Kibernetika, 2 (1974), pp. 141-143. (In Russian.)

[3] N. G. GATKIN AND YU. L. DALETSKII, On optimal detection of a signal in arbitrary noise, Theory Prob. Appl., 16, 4 (1971), pp. 728-732.

[4] L. SCHMETTERER, L'approximation stochastique, Univ. Clermont-Ferrand, 1972.

[5] E. G. GLADYSHEV, On stochastic approximation, Theory Prob. Appl., 10, 2 (1965), pp. 275-278.

CONFIDENCE REGIONS FOR UNIMODAL AND SYMMETRIC DISTRIBUTION FUNCTIONS

N. A. BOGOMOLOV

(Translated by A. B. Aries)

1. Introduction

The problem of constructing confidence regions for continuous distribution functions can usually be solved by methods of nonparametric statistics (for example, see [1]). Let $T_1, T_2, \ldots, T_n$ be independent random variables with identical continuous distributions. Let