How Much Better Are Better Estimators of a Normal Variance
Author(s): Andrew L. Rukhin
Source: Journal of the American Statistical Association, Vol. 82, No. 399 (Sep., 1987), pp. 925-928
Published by: American Statistical Association
Stable URL: http://www.jstor.org/stable/2288806


How Much Better Are Better Estimators of a Normal Variance

ANDREW L. RUKHIN*

This article considers the estimation problem of a normal variance σ² on the basis of a random sample x₁, . . ., x_n under quadratic loss when the mean ξ is unknown. The best equivariant estimator δ₀ = Σ(x_i − x̄)²/(n + 1) is known to be inadmissible, but the extent to which this inadmissibility phenomenon is serious has not previously been considered. Herein, the mean squared error of the minimax and admissible estimator due to Brewster and Zidek (1974) is evaluated, and an explicit formula for that estimator is given. It is shown that this risk function has its maximum at ξ = 0, which is the mode of the corresponding generalized prior density. Locally optimal shrinkage scale-equivariant estimators are introduced and their risks are calculated for several values of the sample size n. It is observed that the Brewster-Zidek estimator has risk function close to that of locally optimal minimax shrinkage estimators, but that the latter cannot give more than 4% relative improvement on the traditional procedure, in the sense that the ratio [(mean squared error of δ₀) − (mean squared error of any shrinkage estimator)] ÷ (mean squared error of δ₀) does not exceed .04 for all parametric values.

KEY WORDS: Quadratic risk; Brewster-Zidek estimator; Efficiency comparison; Shrinkage estimators; Local optimality.

1. INTRODUCTION

Let x₁, . . ., x_n (n ≥ 2) be a random normal sample with unknown mean ξ and unknown variance σ². We consider here the estimation problem of the variance σ² under quadratic loss.

Let X̄ = Σ x_j/n and S² = Σ (x_j − X̄)² be a version of the sufficient statistic. The best unbiased estimator of σ² is δ_U(X̄, S) = S²/(n − 1). This estimator can be easily improved on by δ₀(X̄, S) = S²/(n + 1), which is optimal in the class of all procedures of the form cS² with some positive c. Some other choices of the constant c have been considered in the literature: Lindley (1953) suggested taking c = 1/(n − 2), and Goodman (1960) proposed c = 1/(n − .5). Notice that the estimator δ₀ is more optimistic than δ_U about the value of the unknown variance in the sense that δ₀ < δ_U. The shrinkage estimators considered in this article are even more optimistic than δ₀.
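The optimality of c = 1/(n + 1) follows from minimizing the quadratic E(cS² − 1)² in c. A small pure-Python check (not from the paper) of this claim, using only the chi-square moments E S² = n − 1 and Var S² = 2(n − 1) for S² ~ χ²_{n−1} with σ² = 1:

```python
# Verify that c = 1/(n + 1) minimizes the MSE among estimators c*S^2 when
# sigma^2 = 1, using E S^2 = n - 1 and Var S^2 = 2(n - 1) for S^2 ~ chi^2_{n-1}.

def mse(c, n):
    """E(c*S^2 - 1)^2 for S^2 ~ chi-square with n - 1 degrees of freedom."""
    m = n - 1              # E S^2
    v = 2.0 * (n - 1)      # Var S^2
    return c * c * (v + m * m) - 2.0 * c * m + 1.0

n = 10
# the quadratic in c is minimized at c = m/(v + m^2) = 1/(n + 1)
best_c = (n - 1) / (2.0 * (n - 1) + (n - 1) ** 2)
print(best_c, 1.0 / (n + 1))      # both equal 1/11
print(mse(1.0 / (n - 1), n))      # unbiased estimator delta_U
print(mse(1.0 / (n + 1), n))      # delta_0, MSE = 2/(n + 1)
```

The minimizer c = 1/(n + 1) and the minimal risk 2/(n + 1) are exact consequences of the two moments above, for any n ≥ 2.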

The inadmissibility of δ₀ was established by Stein (1964), and a considerable amount of research has been directed to related problems since this discovery. Brown (1968) extended the inadmissibility result to more general loss functions. Strawderman (1974) obtained a class of minimax estimators, some of which are generalized Bayes. Olkin and Selliah (1977) proved the admissibility of δ₀ within the class of all estimators whose risk is a multiplicative or additive function of ξ and σ. Cohen (1972) obtained improved confidence intervals for σ². Brewster and Zidek (1974) derived a minimax estimator that is admissible within the class of all scale-equivariant procedures. Proskin (1985) proved absolute admissibility of the Brewster-Zidek estimator by formalizing the heuristic argument in Brown (1979). Multivariate extensions are given in Shorrock and Zidek (1976) and Tsui, Weerahandi, and Zidek (1980).

* Andrew L. Rukhin is Professor, Department of Mathematics and Statistics, University of Massachusetts, Amherst, MA 01003. He is currently on leave from Purdue University. The research for this article was supported by National Science Foundation Grant DMS-8401996. Thanks are due to H. Proskin, who kindly supplied a copy of his unpublished thesis. The author is also very grateful to John Deely for many helpful remarks.

In this article we study the possible improvements over δ₀ in terms of quadratic risk. In particular, in Section 2 we evaluate the risk of the Brewster-Zidek estimator. This risk function is shown to have a peculiar form, with a surprising maximum at ξ = 0 (which is the mode of the corresponding generalized prior density). In Section 3 the largest possible improvement among all scale-equivariant minimax shrinkage estimators for small samples is calculated and is shown to be no more than 4%. The estimator attaining this improvement is explicitly given.

2. THE BREWSTER-ZIDEK ESTIMATOR

We consider here scale-equivariant shrinkage estimators of σ² of the form

δ(X̄, S) = S²(1 − φ(U))/(n + 1),   (2.1)

where U = n^{1/2}|X̄|(nX̄² + S²)^{−1/2} = n^{1/2}|X̄|/(Σ x_j²)^{1/2} and φ is a positive measurable function.

The minimax estimator δ₁ obtained by Brewster and Zidek (1974) has this form, as is readily shown. It is generalized Bayes against the prior density

λ(ξ, σ) = ∫₀^∞ exp{−ntξ²/(2σ²)} t^{−1/2}(1 + t)^{−1} dt/σ

with respect to the invariant measure dξ dσ/σ when the scaled quadratic loss (δ/σ² − 1)² is used. Thus one has the following representation of δ₁:

δ₁(x̄, s) = ∫∫ σ^{−n−3} exp{−[n(x̄ − ξ)² + s²]/(2σ²)} λ(ξ, σ) dξ dσ / ∫∫ σ^{−n−5} exp{−[n(x̄ − ξ)² + s²]/(2σ²)} λ(ξ, σ) dξ dσ

= ∫∫ σ^{−n−3} exp{−[nx̄²(1 + t^{−1})^{−1} + s²]/(2σ²)}(1 + t)^{−3/2} t^{−1/2} dt dσ / ∫∫ σ^{−n−5} exp{−[nx̄²(1 + t^{−1})^{−1} + s²]/(2σ²)}(1 + t)^{−3/2} t^{−1/2} dt dσ

= s² ∫₀^u (1 − v²)^{(n−1)/2} dv / [(n + 2) ∫₀^u (1 − v²)^{(n+1)/2} dv]

= (s²/(n + 1)) [1 − u(1 − u²)^{(n+1)/2} / ((n + 2) ∫₀^u (1 − v²)^{(n+1)/2} dv)]

= s²(1 − φ₁(u))/(n + 1).   (2.2)
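A quick numerical sanity check (not from the paper; it uses a composite Simpson rule rather than the paper's methods) that the last two expressions in (2.2) agree, writing I_q(u) for the integral of (1 − v²)^q over [0, u]:

```python
# Check that s^2 I_{(n-1)/2}(u) / ((n+2) I_{(n+1)/2}(u)) equals
# s^2 (1 - phi_1(u))/(n + 1) for the Brewster-Zidek shrinkage factor phi_1.

def simpson(f, a, b, k=2000):
    """Composite Simpson rule with k (even) subintervals."""
    h = (b - a) / k
    s = f(a) + f(b)
    s += 4.0 * sum(f(a + (2 * i - 1) * h) for i in range(1, k // 2 + 1))
    s += 2.0 * sum(f(a + 2 * i * h) for i in range(1, k // 2))
    return s * h / 3.0

def I(q, u):
    # integral of (1 - v^2)^q over [0, u]
    return simpson(lambda v: (1.0 - v * v) ** q, 0.0, u)

n, u, s2 = 5, 0.6, 1.0
phi1 = u * (1.0 - u * u) ** ((n + 1) / 2) / ((n + 2) * I((n + 1) / 2, u))
ratio_form = s2 * I((n - 1) / 2, u) / ((n + 2) * I((n + 1) / 2, u))
shrink_form = s2 * (1.0 - phi1) / (n + 1)
print(ratio_form, shrink_form)   # agree up to quadrature error
```

The agreement rests on the integration-by-parts identity (n + 2) I_{(n+1)/2}(u) = (n + 1) I_{(n−1)/2}(u) + u(1 − u²)^{(n+1)/2}, which holds exactly.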

© 1987 American Statistical Association. Journal of the American Statistical Association, September 1987, Vol. 82, No. 399, Theory and Methods.

Formula (2.2) allows easy calculation of δ₁. Indeed, if n = 2p − 1 with p a positive integer, then

φ₁(u) = (1 − u²)^p / [Σ_{k=0}^p (−1)^k C(p, k) ((2p + 1)/(2k + 1)) u^{2k}],

where C(p, k) denotes the binomial coefficient, and for even values of the sample size n, n = 2p,

φ₁(u) = (1 − u²)^{p+1/2} / [Σ_{k=0}^p ((2p + 1)(2p − 1) ⋯ (2p − 2k + 3))/((2p)(2p − 2) ⋯ (2p − 2k + 2)) (1 − u²)^{p−k+1/2} + ((2p + 1)!!/(2p)!!) (arcsin u)/u]

(the product in the k = 0 term is empty and equals 1). These formulas are convenient for numerical evaluation of the quadratic risk of δ₁ for small sample sizes.

It is easy to see that the mean squared error of δ₁, as well as that of any other estimator of the form (2.1), depends only on η = n^{1/2}|ξ|/σ, and that for any such estimator it follows from Equation (A.3) of the Appendix that

Δ(η) = E_η{[S²/(n + 1) − 1]² − [S²(1 − φ(U))/(n + 1) − 1]²}
= 4e^{−η²/2}(n + 1)^{−2}(d_{n−2})^{−1} [∫₀¹ φ(u)(1 − φ(u)/2)(1 − u²)^{(n+1)/2} π_{n+3}(uη) du − (n + 1) ∫₀¹ φ(u)(1 − u²)^{(n−1)/2} π_{n+1}(uη) du],   (2.3)

where

d_m = ∫₀^∞ s^m e^{−s²/2} ds = 2^{(m−1)/2} Γ(.5(m + 1))

and

π_m(u) = (2π)^{−1/2} ∫₀^∞ s^m e^{−s²/2} cosh su ds.   (2.4)
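As a spot check (not the paper's code), the odd-n closed form for φ₁ can be compared against the defining integral computed by quadrature; a Simpson rule stands in for whatever numerical scheme the reader prefers:

```python
# Compare the closed form of phi_1 for n = 2p - 1 with direct quadrature.
from math import comb

def simpson(f, a, b, k=2000):
    """Composite Simpson rule with k (even) subintervals."""
    h = (b - a) / k
    s = f(a) + f(b)
    s += 4.0 * sum(f(a + (2 * i - 1) * h) for i in range(1, k // 2 + 1))
    s += 2.0 * sum(f(a + 2 * i * h) for i in range(1, k // 2))
    return s * h / 3.0

def phi1_quad(n, u):
    I = simpson(lambda v: (1.0 - v * v) ** ((n + 1) / 2), 0.0, u)
    return u * (1.0 - u * u) ** ((n + 1) / 2) / ((n + 2) * I)

def phi1_closed(p, u):
    # closed form for odd sample size n = 2p - 1
    denom = sum((-1) ** k * comb(p, k) * (2 * p + 1) / (2 * k + 1) * u ** (2 * k)
                for k in range(p + 1))
    return (1.0 - u * u) ** p / denom

p = 3
n = 2 * p - 1
for u in (0.2, 0.5, 0.8):
    print(phi1_quad(n, u), phi1_closed(p, u))   # the pairs agree
```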

Notice that

φ₁(u) = (u/(n + 2)) (d/du) log ∫₀^u (1 − v²)^{(n+1)/2} dv,

so

φ₁′(u) = [1 − (n + 1)u²/(1 − u²)] φ₁(u)/u − (n + 2) φ₁²(u)/u.

Table 1. The Relative Risk Improvements of the Brewster-Zidek Estimator (in percents)

[Columns η = .1, .5, .75, 1.00, 1.50, 2.00, 2.50, 3.00, rows n = 3, . . ., 8; the entries are not legible in this scan.]

Integration by parts now gives

Δ(η) = 4ηe^{−η²/2}(n + 1)^{−2}(n + 2)^{−1}(d_{n−2})^{−1} ∫₀¹ φ₁(u) u (1 − u²)^{(n−1)/2} [(n + 1)π_{n+1}(uη) − .5(1 − u²)π_{n+3}(uη)] du.   (2.5)

It follows that Δ(0) = 0; that is, at η = 0 the risk of the Brewster-Zidek estimator takes its largest value. This is to be contrasted with the fact that ξ = 0 is the mode of the corresponding generalized prior density.

By using the formulas for φ₁ and formulas (A.4)-(A.10) of the Appendix, one can evaluate the integral (2.5), for instance, by the Romberg adaptive extrapolation rule. Notice that for odd sample sizes φ₁ is a rational function, so that (2.5) can be calculated in closed form.

Table 1 contains the results of such numerical integration for the relative risk improvement Δ_rel(η) = Δ(η)/E(δ₀σ^{−2} − 1)² = (n + 1)Δ(η)/2 for 3 ≤ n ≤ 8 and 0 ≤ η ≤ 3. These numerical results and the following asymptotical formulas (2.6) and (2.7), which are useful for larger values of n and η, show that δ₁ cannot give more than 3% relative improvement over δ₀.

Asymptotical formulas when n is large can be derived if one observes that

(n + 2)φ₁(u(n + 1)^{−1/2}) → u e^{−u²/2} / ∫₀^u e^{−t²/2} dt   (n → ∞),

and from Equation (A.11) of the Appendix

Δ_rel(η) ≈ ηe^{−η²/2}(n + 1)^{−1} ∫₀^∞ v²e^{−v²} sinh(vη) [∫₀^v e^{−t²/2} dt]^{−1} dv.   (2.6)

Approximations for Δ(η) when η is large can be obtained from (A.12) of the Appendix. There is an asymptotical series for Δ_rel in powers of η^{−2}:

Δ_rel(η) = A η^{−n−1}[1 − (n + 2)/(2η²) + ⋯],   (2.7)

where

A = 2^{(n+3)/2} Γ(n + 1) Γ(.5(n + 2)) / [Γ(.5) Γ(.5(n − 1)) Γ(.5(n + 3))].

The formula (2.6) gives a good approximation for Δ_rel if n ≥ 10, and (2.7) provides an accurate answer if η > 6.
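The large-n limit quoted above is easy to verify numerically. The following sketch (not from the paper; Simpson quadrature is an assumption of this illustration) evaluates both sides for a moderately large n:

```python
# Illustrate (n + 2) phi_1(u (n + 1)^{-1/2}) -> u e^{-u^2/2} / int_0^u e^{-t^2/2} dt.
from math import exp, sqrt

def simpson(f, a, b, k=2000):
    """Composite Simpson rule with k (even) subintervals."""
    h = (b - a) / k
    s = f(a) + f(b)
    s += 4.0 * sum(f(a + (2 * i - 1) * h) for i in range(1, k // 2 + 1))
    s += 2.0 * sum(f(a + 2 * i * h) for i in range(1, k // 2))
    return s * h / 3.0

def phi1(n, u):
    I = simpson(lambda v: (1.0 - v * v) ** ((n + 1) / 2), 0.0, u)
    return u * (1.0 - u * u) ** ((n + 1) / 2) / ((n + 2) * I)

n, u = 400, 1.0
lhs = (n + 2) * phi1(n, u / sqrt(n + 1.0))
rhs = u * exp(-u * u / 2.0) / simpson(lambda t: exp(-t * t / 2.0), 0.0, u)
print(lhs, rhs)   # close for large n; the discrepancy is O(1/n)
```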

3. LOCALLY OPTIMAL SHRINKAGE ESTIMATORS

It follows from Section 2 that the Brewster-Zidek estimator does not lead to substantial improvement over δ₀.

Here we address the question of how much improvement can be made on δ₀ in the class of shrinkage estimators (2.1).

Let us fix a parametric value η = η₀ and look for a shrinkage estimator δ that minimizes the mean squared error E(δ − 1)² at η₀. We call this δ the locally optimal (at η₀) shrinkage estimator.

Because of (2.3), the nonnegative function φ that minimizes the quadratic risk, or maximizes Δ(η₀), has the form

φ(u) = max{0, 1 − (n + 1)E_{η₀}(S² | U)/E_{η₀}(S⁴ | U)}
= max{0, 1 − (n + 1)π_{n+1}(uη₀)/[(1 − u²)π_{n+3}(uη₀)]},   (3.1)

where π_m is defined by (2.4).
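A small check (not from the paper) that (3.1) behaves as the text claims at η₀ = 0, where π_{n+1}(0)/π_{n+3}(0) = 1/(n + 2); π_m is computed by quadrature, with the integrand truncated at s = 12 (an assumption of this sketch, harmless for the arguments used here):

```python
# Evaluate the locally optimal shrinkage factor (3.1) and compare its
# eta_0 = 0 case with the explicit Stein-type factor given in Section 3.
from math import exp, cosh, sqrt, pi

def simpson(f, a, b, k=4000):
    """Composite Simpson rule with k (even) subintervals."""
    h = (b - a) / k
    s = f(a) + f(b)
    s += 4.0 * sum(f(a + (2 * i - 1) * h) for i in range(1, k // 2 + 1))
    s += 2.0 * sum(f(a + 2 * i * h) for i in range(1, k // 2))
    return s * h / 3.0

def pi_m(m, u):
    # the function (2.4); the tail beyond s = 12 is negligible here
    return simpson(lambda s: s ** m * exp(-s * s / 2.0) * cosh(s * u),
                   0.0, 12.0) / sqrt(2.0 * pi)

def phi_opt(n, eta0, u):
    ratio = (n + 1) * pi_m(n + 1, u * eta0) / ((1.0 - u * u) * pi_m(n + 3, u * eta0))
    return max(0.0, 1.0 - ratio)

n, u = 5, 0.2
print(phi_opt(n, 0.0, u))
print(max(0.0, 1.0 - (n + 1) / ((1.0 - u * u) * (n + 2))))   # same value
```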

For this φ,

Δ(η₀) = 2e^{−η₀²/2}(n + 1)^{−2}(d_{n−2})^{−1} ∫₀^{u₀} (1 − u²)^{(n+1)/2} π_{n+3}(uη₀) [1 − (n + 1)π_{n+1}(uη₀)/((1 − u²)π_{n+3}(uη₀))]² du,   (3.2)

where u₀ solves the equation

(n + 1)π_{n+1}(u₀η₀) = (1 − u₀²)π_{n+3}(u₀η₀).   (3.3)

For numerical calculation of Δ(η₀) put v₀ = u₀η₀, so that

η₀² = v₀²π_{n+3}(v₀)/[π_{n+3}(v₀) − (n + 1)π_{n+1}(v₀)].

This formula gives the value of η₀² as a function of v₀, and

Δ_rel(η₀) = e^{−η₀²/2}(n + 1)^{−1}(d_{n−2})^{−1} η₀^{−n−2} ∫₀^{v₀} (η₀² − v²)^{(n−3)/2} [η₀²(π_{n+3}(v) − (n + 1)π_{n+1}(v)) − v²π_{n+3}(v)]² dv/π_{n+3}(v).
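The v₀ parameterization can be checked directly (a sketch, not from the paper): given v₀, compute η₀² from the displayed formula, set u₀ = v₀/η₀, and verify that (3.3) is satisfied.

```python
# Given v_0, recover eta_0 and u_0 and confirm that u_0 solves (3.3).
from math import exp, cosh, sqrt, pi

def simpson(f, a, b, k=4000):
    """Composite Simpson rule with k (even) subintervals."""
    h = (b - a) / k
    s = f(a) + f(b)
    s += 4.0 * sum(f(a + (2 * i - 1) * h) for i in range(1, k // 2 + 1))
    s += 2.0 * sum(f(a + 2 * i * h) for i in range(1, k // 2))
    return s * h / 3.0

def pi_m(m, u):
    # the function (2.4), truncated at s = 12 (negligible tail)
    return simpson(lambda s: s ** m * exp(-s * s / 2.0) * cosh(s * u),
                   0.0, 12.0) / sqrt(2.0 * pi)

n, v0 = 5, 1.0
a = pi_m(n + 1, v0)
b = pi_m(n + 3, v0)
eta0_sq = v0 * v0 * b / (b - (n + 1) * a)   # denominator is positive
u0 = v0 / sqrt(eta0_sq)
print((n + 1) * a, (1.0 - u0 * u0) * b)     # the two sides of (3.3)
```

The denominator is positive because π_{n+3}(v) > (n + 2)π_{n+1}(v) ≥ (n + 1)π_{n+1}(v), by the recurrence (A.1) of the Appendix; this also guarantees 0 < u₀ < 1.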

The function Δ_rel(η₀) is monotonically increasing in η₀. When η₀ = 0,

φ(u) = max{0, 1 − (n + 1)/[(1 − u²)(n + 2)]},

which corresponds to the original Stein (1964) estimator. As is seen from Table 2, where the function (3.2) is tabulated (see the column corresponding to η₀ = 0), its improvement over δ₀ is small. For larger values of η₀ one obtains more noticeable improvements.

Unfortunately, these improvements are unattainable in the class of minimax estimators. In fact, the functions φ(u) of (3.1) generate minimax estimators only if η₀ ≤ η₁. This value η₁ can be determined by the condition Δ(0) = 0, where Δ corresponds to the estimator (3.1) with η₀ = η₁. Thus these estimators have a risk function similar in form to that of the Brewster-Zidek estimator.

Table 2. The Relative Risk Improvements of Locally Optimal Estimators (in percents)

  n    η₀ = .00    .5     1.0    2.0    3.0     4.0     6.0
  3      1.79     1.80   2.07   4.30   6.78    8.68   10.95
  4      1.83     1.84   2.18   4.61   7.60   10.23   13.24
  5      1.76     1.79   2.09   4.53   7.85   10.78   14.66
  6      1.67     1.68   2.05   4.42   7.80   11.11   15.32
  7      1.57     1.57   2.01   4.30   7.75   12.07   15.91
  8      1.47     1.47   1.96   4.27   7.69   10.59   16.50

Table 3. The Largest Possible Improvements Within the Class of Shrinkage Estimators (in percents)

  n     η₁     u₀    η = .5   1.0    1.5    2.0    2.5    3.0
  3    1.38   .68      .53   1.76   2.88   3.25   2.83   1.98
  4    1.34   .63      .54   1.85   3.00   3.51   2.85   1.92
  5    1.31   .59      .56   1.83   2.93   3.21   2.67   1.75
  6    1.29   .56      .54   1.76   2.81   3.03   2.52   1.67
  7    1.28   .53      .51   1.67   2.66   2.88   2.33   1.48
  8    1.27   .51      .48   1.55   2.50   2.72   2.17   1.28

Table 3 provides the values η₁ at which the largest gain by a scale-equivariant shrinkage estimator δ is attained. The shifted estimator δ(X̄ + c, S) gives the maximal improvement at η = c + η₁. Therefore, if prior information about the ratio η = n^{1/2}|ξ|/σ, say η ≈ η₀, is available, then the best choice of c is c = η₀ − η₁. Notice that the estimator δ cannot be improved on in the class of scale-equivariant shrinkage estimators, but it is not smooth and hence cannot be admissible. It appears that further scale-equivariant improvements are not possible, and the risk of the alternative improvements will be a complicated function of both ξ and σ.

In Table 3 we give the values η₁, the solutions u₀ of (3.3), and the relative risk improvements of the corresponding locally optimal shrinkage estimators for n = 3, . . ., 8. It can be seen that these estimators give just slightly larger maximal improvement over δ₀ than the Brewster-Zidek estimator. Since there is no improvement larger than 3.5%, the answer, perhaps not surprisingly, to the title question of this article is "not much."

This situation changes, however, in the problem of simultaneous estimation of a number of unknown variances, of a covariance matrix, or of a function of the latter. For instance, Lin and Perlman (1985) reported significant improvements on the traditional estimate of a covariance matrix. If this matrix is known to be of a diagonal form, the methods of the present article are applicable, and the corresponding vector estimators provide sizable improvement on the analog of δ₀. In this situation the fact that shrinkage scale-equivariant estimators cannot give substantial improvements for small values of |ξ|/σ, but do noticeably better for larger values of this ratio, can be useful.

APPENDIX: NONCENTRAL BETA DISTRIBUTION

We assemble here some results about the functions (2.4), which are closely related to parabolic cylinder functions. These results are useful for practical evaluation of estimators and their risks in many other estimation problems involving normal parameters (see, e.g., Rukhin 1986).

For m > −1 and u ≥ 0, let

π_m(u) = (2π)^{−1/2} ∫₀^∞ e^{−s²/2} s^m cosh su ds.

Then

π_m′(u) = (2π)^{−1/2} ∫₀^∞ e^{−s²/2} s^{m+1} sinh su ds = [π_{m+2}(u) − (m + 1)π_m(u)]/u,   (A.1)

π_m″(u) = π_{m+2}(u).   (A.2)
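The recurrence (A.1) follows by integrating the sinh integral by parts; a numerical confirmation (not from the paper, with the integrals truncated at s = 12, an assumption of this sketch):

```python
# Check pi'_m(u) = [pi_{m+2}(u) - (m + 1) pi_m(u)]/u with both sides by quadrature.
from math import exp, cosh, sinh, sqrt, pi

def simpson(f, a, b, k=4000):
    """Composite Simpson rule with k (even) subintervals."""
    h = (b - a) / k
    s = f(a) + f(b)
    s += 4.0 * sum(f(a + (2 * i - 1) * h) for i in range(1, k // 2 + 1))
    s += 2.0 * sum(f(a + 2 * i * h) for i in range(1, k // 2))
    return s * h / 3.0

def pi_m(m, u):
    return simpson(lambda s: s ** m * exp(-s * s / 2.0) * cosh(s * u),
                   0.0, 12.0) / sqrt(2.0 * pi)

def dpi_m(m, u):
    # the sinh integral representation of the derivative
    return simpson(lambda s: s ** (m + 1) * exp(-s * s / 2.0) * sinh(s * u),
                   0.0, 12.0) / sqrt(2.0 * pi)

m, u = 4, 0.9
lhs = dpi_m(m, u)
rhs = (pi_m(m + 2, u) - (m + 1) * pi_m(m, u)) / u
print(lhs, rhs)   # equal up to quadrature error
```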


The importance of the functions π_m is due to the fact that for any integrable function φ(U), U = n^{1/2}|X̄|(nX̄² + S²)^{−1/2}, and a > 1 − n,

E_η S^a φ(U) = 2e^{−η²/2} ∫₀¹ φ(u)(1 − u²)^{(n+a−3)/2} π_{n+a−1}(uη) du / d_{n−2}.   (A.3)

Here η = n^{1/2}|ξ|/σ.

Define the following polynomials in u²:

P_p(u) = Σ_{k=0}^p C(2p, 2k)(2p − 2k − 1)!! u^{2k},   (A.4)

Q_p(u) = Σ_{k=0}^p C(2p + 1, 2k + 1)(2p − 2k − 1)!! u^{2k},   (A.5)

and let R_p be determined by

R₀(u) = 1,   R_{p+1}(u) = (2p + 2)R_p(u) + uR_p′(u) + u²Q_p(u)   (A.6)

(here (−1)!! = 1 and C(m, j) denotes the binomial coefficient).

Proposition. If m = 2p with integer p, then

π_m(u) = .5e^{u²/2} P_p(u),   (A.7)

π_m′(u) = .5u e^{u²/2} Q_p(u).   (A.8)

If m = 2p + 1 with integer p, then

π_m(u) = (2π)^{−1/2} R_p(u) + .5u Q_p(u) e^{u²/2} erf(u2^{−1/2}),   (A.9)

π_m′(u) = (2π)^{−1/2} [R_p′(u) + uQ_p(u)] + .5P_{p+1}(u) e^{u²/2} erf(u2^{−1/2}),   (A.10)

where erf denotes the error function.
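A spot check (not from the paper) of (A.7) and (A.9) for the smallest cases, where P₁(u) = 1 + u² and Q₀(u) = R₀(u) = 1, against direct quadrature of (2.4) truncated at s = 12:

```python
# Compare the closed forms of pi_2 and pi_1 with direct numerical integration.
from math import exp, cosh, sqrt, pi, erf

def simpson(f, a, b, k=4000):
    """Composite Simpson rule with k (even) subintervals."""
    h = (b - a) / k
    s = f(a) + f(b)
    s += 4.0 * sum(f(a + (2 * i - 1) * h) for i in range(1, k // 2 + 1))
    s += 2.0 * sum(f(a + 2 * i * h) for i in range(1, k // 2))
    return s * h / 3.0

def pi_m(m, u):
    return simpson(lambda s: s ** m * exp(-s * s / 2.0) * cosh(s * u),
                   0.0, 12.0) / sqrt(2.0 * pi)

u = 0.7
even_closed = 0.5 * exp(u * u / 2.0) * (1.0 + u * u)            # (A.7) with m = 2
odd_closed = (1.0 / sqrt(2.0 * pi)
              + 0.5 * u * exp(u * u / 2.0) * erf(u / sqrt(2.0)))  # (A.9) with m = 1
print(pi_m(2, u), even_closed)
print(pi_m(1, u), odd_closed)
```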

These formulas can be proved by induction if one uses the recurrence formulas (A.1) or (A.2) and the known representations for the parabolic cylinder functions D_{−m}(u). Indeed,

π_m(u) = .5e^{u²/4} Γ(m + 1)[D_{−m−1}(u) + D_{−m−1}(−u)]/(2π)^{1/2}

= .5 (d^m/du^m) e^{u²/2}, where m is even,

= .5 (d^m/du^m)[e^{u²/2} erf(u2^{−1/2})], where m is odd.

Asymptotical formulas can be obtained from Laplace's method, which, for instance, gives

π_m(u(m + 1)^{−1/2}) ≈ e^{−(m+1)/2}(m + 1)^{m/2} 2^{−1/2} cosh u,

π_m′(u(m + 1)^{−1/2}) ≈ e^{−(m+2)/2}(m + 2)^{(m+1)/2} 2^{−1/2} sinh u,   (A.11)

and asymptotical expansions for large u are also easily derived, as follows:

π_m(u) = (2π)^{−1/2} e^{u²/2} [∫_{−u}^∞ (t + u)^m e^{−t²/2} dt + ∫_u^∞ (t − u)^m e^{−t²/2} dt]/2

= .5e^{u²/2} u^m [1 + .5m(m − 1)u^{−2} + ⋯].   (A.12)

[Received April 1986. Revised November 1986.]

REFERENCES

Brewster, J. F., and Zidek, J. V. (1974), "Improving on Equivariant Estimators," The Annals of Statistics, 2, 21-38.

Brown, L. (1968), "Inadmissibility of the Usual Estimators of Scale Parameters in Problems With Unknown Location and Scale Param- eters," Annals of Mathematical Statistics, 39, 29-48.

(1979), "A Heuristic Method for Determining Admissibility of Estimators-With Applications," The Annals of Statistics, 7, 960-994.

Cohen, A. (1972), "Improved Confidence Intervals for the Variance of a Normal Distribution," Journal of the American Statistical Association, 67, 382-387.

Goodman, L. A. (1960), "A Note on the Estimation of Variance," Sankhyā, 22, 221-228.

Lin, S. P., and Perlman, M. D. (1985), "A Monte Carlo Comparison of Four Estimators of a Covariance Matrix," in Multivariate Analysis VI, ed. P. R. Krishnaiah, Amsterdam: Elsevier Science Publishers, pp. 411-429.

Lindley, D. V. (1953), "Statistical Inference" (with discussion), Journal of the Royal Statistical Society, Ser. B, 15, 30-76.

Olkin, I., and Selliah, J. B. (1977), "Estimating Covariances in a Mul- tivariate Normal Distribution," in Statistical Decision Theory and Re- lated Topics II, eds. S. Gupta and D. Moore, New York: Academic Press, pp. 313-326.

Proskin, H. M. (1985), "An Admissibility Theorem With Applications to the Estimation of the Variance of the Normal Distribution," un- published Ph.D. dissertation, Rutgers University, Dept. of Statistics.

Rukhin, A. L. (1986), "Estimating Normal Tail Probabilities," Naval Research Logistics Quarterly, 33, 91-99.

Shorrock, R. W., and Zidek, J. V. (1976), "An Improved Estimator of the Generalized Variance," The Annals of Statistics, 4, 629-638.

Stein, Ch. (1964), "Inadmissibility of the Usual Estimator for the Vari- ance of a Normal Distribution With Unknown Mean," Annals of the Institute of Statistical Mathematics, 16, 155-160.

Strawderman, W. E. (1974), "Minimax Estimation of Powers of the Variance of a Normal Population Under Squared Error Loss," The Annals of Statistics, 2, 190-198.

Tsui, K-W., Weerahandi, S., and Zidek, J. V. (1980), "Inadmissibility of the Best Fully Equivariant Estimator of the Generalized Residual Variance," The Annals of Statistics, 8, 1156-1159.
