Estimation and Prediction for Mixtures of the Exponential Distribution
Author(s): Herbert Robbins
Source: Proceedings of the National Academy of Sciences of the United States of America, Vol. 77, No. 5, Part 1: Physical Sciences (May, 1980), pp. 2382-2383
Published by: National Academy of Sciences
Stable URL: http://www.jstor.org/stable/8674


Proc. Natl. Acad. Sci. USA Vol. 77, No. 5, pp. 2382-2383, May 1980
Statistics

Estimation and prediction for mixtures of the exponential distribution

(empirical Bayes)

HERBERT ROBBINS

Columbia University, New York, New York 10027

Contributed by Herbert Robbins, February 25, 1980

ABSTRACT Let x be a random variable whose distribution is an unknown mixture of exponentials with different means θ. From a random sample x1, ..., xn of x values we show that E(θ | x > a) can be estimated for any given a > 0. We can therefore predict the average of all future observations taken on those x values in the sample that exceed a.

Let (θ, x, y) be a random vector with positive real components. Concerning the joint distribution of θ, x, and y, we assume that

Given θ, x and y are independent exponential random variables with mean θ. [1]

We make no particular assumption about the distribution function G of θ, and our approach is that of empirical Bayes. Thus, let (θi, xi, yi) for i = 1, ..., n be independent random vectors with the same joint distribution as (θ, x, y). Only the xi are observed.

To estimate θi from xi with minimum mean squared error we should use E(θ | xi). Now, the conditional density and distribution functions of x, given θ, are

f(x | θ) = (1/θ) e^(−x/θ), F(x | θ) = 1 − e^(−x/θ),

and the marginal density and distribution functions of x are

f(x) = ∫ (1/θ) e^(−x/θ) dG(θ), F(x) = 1 − ∫ e^(−x/θ) dG(θ).

Hence,

E(θ | x) = ∫ θ f(x | θ) dG(θ) / f(x) = (1 − F(x)) / f(x).

From a sample x1, ..., xn of x values we can estimate 1 − F(a) for any fixed a > 0 by Σ vi / n, in which by definition

vi = 1 if xi > a, vi = 0 if xi ≤ a.
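As a quick numerical illustration (not part of the paper), the sketch below draws x values from an assumed two-point mixing distribution G and compares Σ vi / n with the true tail probability 1 − F(a) = ∫ e^(−a/θ) dG(θ); the choices of G, a, and n are arbitrary:

```python
import math
import random

random.seed(0)

# Assumed illustrative mixing distribution G: theta = 1 w.p. 0.5, theta = 3 w.p. 0.5.
thetas, probs = [1.0, 3.0], [0.5, 0.5]
a, n = 2.0, 200_000

# Draw x: first theta ~ G, then x | theta ~ Exponential with mean theta.
xs = [random.expovariate(1.0 / random.choices(thetas, probs)[0]) for _ in range(n)]

# v_i = 1 if x_i > a else 0; sum(v_i)/n estimates 1 - F(a).
tail_hat = sum(x > a for x in xs) / n

# True tail: 1 - F(a) = integral of e^(-a/theta) dG(theta).
tail_true = sum(p * math.exp(-a / t) for p, t in zip(probs, thetas))

print(round(tail_hat, 3), round(tail_true, 3))
```

With n this large the two numbers agree to roughly two decimal places, as the later normal limits predict.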

We can also estimate f(a) by any method of density estimation that we choose. To avoid the complications that this involves, we shall here consider instead the problem of estimating

α = E(θ | x > a) = ∫_a^∞ E(θ | x) f(x) dx / (1 − F(a))
  = ∫_a^∞ (1 − F(x)) dx / (1 − F(a))
  = ∫_a^∞ (x − a) dF(x) / (1 − F(a)).
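The chain of equalities above can be sanity-checked numerically. The sketch below (an illustration under an assumed two-point G, not from the paper) compares the integral form ∫_a^∞ (1 − F(x)) dx / (1 − F(a)) with the closed form Σ p θ e^(−a/θ) / Σ p e^(−a/θ) that such a G admits, since P(x > a | θ) = e^(−a/θ):

```python
import math

# Illustrative two-point mixing distribution G (an assumption for this check).
thetas, probs = [1.0, 3.0], [0.5, 0.5]
a = 2.0

def tail(x):
    """1 - F(x) = integral of e^(-x/theta) dG(theta)."""
    return sum(p * math.exp(-x / t) for p, t in zip(probs, thetas))

# alpha via the displayed identity: integral_a^inf (1 - F(x)) dx / (1 - F(a)),
# computed with a simple trapezoidal rule on a truncated grid.
upper, steps = 60.0, 100_000
h = (upper - a) / steps
integral = sum(0.5 * (tail(a + i * h) + tail(a + (i + 1) * h)) * h for i in range(steps))
alpha_integral = integral / tail(a)

# alpha directly: E(theta | x > a) = sum p*theta*e^(-a/theta) / sum p*e^(-a/theta).
alpha_direct = sum(p * t * math.exp(-a / t) for p, t in zip(probs, thetas)) / tail(a)

print(round(alpha_integral, 4), round(alpha_direct, 4))
```

The two values agree to the accuracy of the quadrature, confirming the identity for this G.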

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.


Define

wi = vi (xi − a) = (xi − a)+;

then

α = Ew / Ev,

which can be estimated by Σ wi / Σ vi. As n → ∞,

√n (Σ wi / Σ vi − α) = [Σ (wi − α vi) / √n] / (Σ vi / n) → N(0, E(w − αv)² / E²v).

Now,

E(w − αv)² = E(w²) − 2α Ew + α² Ev = E(w²) − E²w / Ev,

and

E(w − αv)² / E²v = E(w²) / E²v − E²w / E³v ≈ n Σ wi² / v̄² − n (Σ wi)² / v̄³,

in which we define v̄ = Σ vi = the number of terms x1, ..., xn that exceed a.

Hence as n → ∞,

(Σ wi / v̄ − α) / √(Σ wi² / v̄² − (Σ wi)² / v̄³) → N(0,1), [2]

and therefore for large n an approximately 95% confidence interval for α = E(θ | x > a) is given by

Σ wi / v̄ ± 1.96 √(Σ wi² / v̄² − (Σ wi)² / v̄³).
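The interval is easy to compute in practice. The following sketch (an illustration under an assumed two-point G, with a, n, and the seed chosen arbitrarily) forms the interval from simulated data and compares it with the true α available in closed form for this G:

```python
import math
import random

random.seed(1)

# Assumed illustrative mixing distribution G and threshold a (not from the paper).
thetas, probs = [1.0, 3.0], [0.5, 0.5]
a, n = 2.0, 50_000

# True alpha = E(theta | x > a): since P(x > a | theta) = e^(-a/theta),
# alpha = sum p*theta*e^(-a/theta) / sum p*e^(-a/theta).
den = sum(p * math.exp(-a / t) for p, t in zip(probs, thetas))
alpha = sum(p * t * math.exp(-a / t) for p, t in zip(probs, thetas)) / den

xs = [random.expovariate(1.0 / random.choices(thetas, probs)[0]) for _ in range(n)]
w = [max(x - a, 0.0) for x in xs]   # w_i = (x_i - a)^+
vbar = sum(x > a for x in xs)       # number of x_i exceeding a

est = sum(w) / vbar
half = 1.96 * math.sqrt(sum(wi * wi for wi in w) / vbar**2 - sum(w)**2 / vbar**3)
print(f"alpha={alpha:.3f}  interval=({est - half:.3f}, {est + half:.3f})")
```

Note the quantity under the square root is nonnegative by the Cauchy–Schwarz inequality, since (Σ wi)² ≤ v̄ Σ wi².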

The statistic Σ wi / v̄ is also useful as an estimate of two random variables of interest, Σ vi θi / v̄ and Σ vi yi / v̄. Concerning the former, we see that

Σ (wi − vi θi)

is a sum of i.i.d. (independent, identically distributed) random variables with conditional mean, given θ,

E(w − vθ | θ) = E(w | θ) − θ E(v | θ)
= ∫_a^∞ (x − a) (1/θ) e^(−x/θ) dx − θ e^(−a/θ)
= θ e^(−a/θ) ∫_0^∞ t e^(−t) dt − θ e^(−a/θ) = 0,

and hence E(w − vθ) = 0. Moreover,

E(w² | θ) = ∫_a^∞ (x − a)² (1/θ) e^(−x/θ) dx = 2θ² e^(−a/θ),

E(wvθ | θ) = θ E(w | θ) = θ ∫_a^∞ (x − a) (1/θ) e^(−x/θ) dx = θ² e^(−a/θ),

E(v²θ² | θ) = θ² e^(−a/θ),

so that

E[(w − vθ)² | θ] = 2θ² e^(−a/θ) − 2θ² e^(−a/θ) + θ² e^(−a/θ) = θ² e^(−a/θ) = ½ E(w² | θ),

and therefore

E(w − vθ)² = ½ E(w²).

It follows that as n → ∞,

Σ (wi − vi θi) / √(½ Σ wi²) → N(0,1), [3]

and therefore for large n an approximately 95% confidence interval for

Σ vi θi / v̄

is given by

Σ wi / v̄ ± (1.96 / v̄) √(½ Σ wi²).
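In a simulation the unobservable θi can be retained, so the interval can be checked against its random target directly. The sketch below does this under the same assumed two-point G used above (an arbitrary illustration, not from the paper):

```python
import math
import random

random.seed(2)

# Assumed illustrative setup: theta_i ~ G (two-point), x_i | theta_i exponential.
thetas, probs = [1.0, 3.0], [0.5, 0.5]
a, n = 2.0, 50_000

draws = [random.choices(thetas, probs)[0] for _ in range(n)]
xs = [random.expovariate(1.0 / t) for t in draws]

w = [max(x - a, 0.0) for x in xs]   # w_i = (x_i - a)^+
vbar = sum(x > a for x in xs)       # number of x_i exceeding a

est = sum(w) / vbar
half = (1.96 / vbar) * math.sqrt(0.5 * sum(wi * wi for wi in w))

# The unobservable target: average of theta_i over those x_i that exceed a.
target = sum(t for t, x in zip(draws, xs) if x > a) / vbar
print(f"target={target:.3f}  interval=({est - half:.3f}, {est + half:.3f})")
```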

Similarly, Σ (wi − vi yi) is a sum of i.i.d. random variables with conditional mean

E(w − vy | θ) = E(w | θ) − θ E(v | θ) = 0,

and hence E(w − vy) = 0. Moreover,

E[(w − vy)² | θ] = 2θ² e^(−a/θ) − 2θ² e^(−a/θ) + 2θ² e^(−a/θ) = E(w² | θ),


so that

E(w − vy)² = E(w²).

Hence as n → ∞,

Σ (wi − vi yi) / √(Σ wi²) → N(0,1), [4]

and therefore for large n an approximately 95% confidence interval for Σ vi yi / v̄ is given by

Σ wi / v̄ ± (1.96 / v̄) √(Σ wi²).
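This is the prediction interval of the abstract: from the xi alone it covers the average of the future observations yi taken on the sample points exceeding a. A simulation sketch under the same assumed two-point G (an arbitrary illustration) generates both samples and checks the interval against the realized future average:

```python
import math
import random

random.seed(3)

# Assumed illustrative setup (not from the paper): theta_i ~ two-point G;
# given theta_i, x_i and y_i are independent exponentials with mean theta_i.
thetas, probs = [1.0, 3.0], [0.5, 0.5]
a, n = 2.0, 50_000

draws = [random.choices(thetas, probs)[0] for _ in range(n)]
xs = [random.expovariate(1.0 / t) for t in draws]
ys = [random.expovariate(1.0 / t) for t in draws]

w = [max(x - a, 0.0) for x in xs]   # computed from the observed x_i only
vbar = sum(x > a for x in xs)

est = sum(w) / vbar
half = (1.96 / vbar) * math.sqrt(sum(wi * wi for wi in w))

# Future average to be predicted: mean of y_i over those x_i that exceed a.
future = sum(y for x, y in zip(xs, ys) if x > a) / vbar
print(f"future={future:.3f}  interval=({est - half:.3f}, {est + half:.3f})")
```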

It is interesting that we can derive the prediction relation 4 from a much weaker assumption than 1. Under 1 the joint density of (x,y) is

f(x, y) = ∫ (1/θ²) e^(−(x + y)/θ) dG(θ),

a function only of z = x + y. It follows that for any z > 0

The conditional distribution of x, given that x + y = z, is uniform on (0,z). [5]

We now show that 5 implies 4. In fact, for any z > a,

E(w | z) = (1/z) ∫_a^z (x − a) dx = (1/z) ∫_0^(z−a) t dt,

E(vy | z) = (1/z) ∫_a^z (z − x) dx = (1/z) ∫_0^(z−a) t dt,

so E(w − vy) = 0. Likewise, for any z > a,

E(w² | z) = (1/z) ∫_0^(z−a) t² dt,

and

E[(w − vy)² | z] = (1/z) ∫_a^z (2x − a − z)² dx
= (1/2z) ∫_(a−z)^(z−a) t² dt = E(w² | z).

Now 4 follows as before, but with the phantoms θ and G no longer in the picture.
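The conditional moment identities above can be checked by Monte Carlo without any reference to θ or G, directly under statement 5. In the sketch below (an illustration; a, z, and n are arbitrary choices), x is drawn uniform on (0, z) and y = z − x:

```python
import random

random.seed(5)

# Monte Carlo check of the consequences of statement 5: with x uniform on
# (0, z) and y = z - x, the moments of w = (x - a)^+ and v*y should match
# the integrals derived in the text.
a, z, n = 1.0, 4.0, 200_000
w_sum = vy_sum = w2_sum = d2_sum = 0.0
for _ in range(n):
    x = random.uniform(0.0, z)
    w = max(x - a, 0.0)
    vy = (z - x) if x > a else 0.0   # v = 1 only when x > a
    w_sum += w
    vy_sum += vy
    w2_sum += w * w
    d2_sum += (w - vy) ** 2

# Exact values from the integrals: E(w|z) = (z-a)^2/(2z), E(w^2|z) = (z-a)^3/(3z),
# and E[(w - vy)^2|z] = E(w^2|z).
print(round(w_sum / n, 3), round(vy_sum / n, 3), round(w2_sum / n, 3), round(d2_sum / n, 3))
```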

Limit theorems analogous to 2, 3, and 4 can also be obtained for several families other than the exponential, and should have many practical applications, because an estimate of the function h(a) = E(θ | x > a) will often be more important than separate estimates of θ1, ..., θn. Because h(a) is increasing, the point estimate Σ wi / v̄ of α could well be isotonized for all a > 0 to improve its accuracy for sample sizes n that are not very large. What further improvements could be made that would be valid for arbitrary G is an open question.
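The isotonization suggested above can be carried out with the pool-adjacent-violators algorithm. The sketch below (an illustration, not part of the paper) computes the raw estimate ĥ(a) = Σ wi(a) / v̄(a) on a grid of thresholds under an assumed two-point G, then enforces monotonicity; weighting each grid point by v̄(a) is one plausible choice, not the paper's prescription:

```python
import random

random.seed(4)

# Assumed illustrative data: x drawn from a two-point mixture of exponentials.
thetas, probs = [1.0, 3.0], [0.5, 0.5]
n = 2_000
xs = [random.expovariate(1.0 / random.choices(thetas, probs)[0]) for _ in range(n)]

def isotonize(values, weights):
    """Weighted pool-adjacent-violators: least-squares nondecreasing fit."""
    blocks = []  # each block holds [pooled mean, total weight, count]
    for v, w in zip(values, weights):
        blocks.append([v, w, 1])
        # Merge backward while the ordering is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            blocks.append([(w1 * m1 + w2 * m2) / (w1 + w2), w1 + w2, c1 + c2])
    fitted = []
    for m, _, c in blocks:
        fitted.extend([m] * c)
    return fitted

grid = [0.25 * k for k in range(1, 13)]   # grid of thresholds a
raw, vbars = [], []
for a in grid:
    vbar = sum(x > a for x in xs)
    raw.append(sum(max(x - a, 0.0) for x in xs) / vbar)
    vbars.append(vbar)

h_iso = isotonize(raw, vbars)   # nondecreasing version of h_hat over the grid
print([round(v, 2) for v in h_iso])
```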

This research was supported in part by the National Science Foundation.

