Estimation in a Simple Linear Model Using Prior Beliefs

The Canadian Journal of Statistics / La Revue Canadienne de Statistique
Section D: Notes and Students' Corner / Section D: Notes et rubrique des étudiants
Vol. 2, No. 2, 1974, pp. 277-283

ESTIMATION IN A SIMPLE LINEAR MODEL USING PRIOR BELIEFS¹

Jagbir SINGH²

The Ohio State University

Key Words and Phrases: two-stage estimation, linear model, least square estimator, prior beliefs.

ABSTRACT

This paper considers a two-stage estimation of the parameters in a simple linear model using prior beliefs. The proposed estimator is compared not only with the usual least square estimator but is also contrasted with the one proposed in [1]. The effect of the prior beliefs being wrong is investigated.

1. INTRODUCTION

An experimenter's beliefs about the parameters in his model are basic to the development of this paper. To avoid possible ambiguity: when speaking of an experimenter's beliefs, we are not referring to the usual Bayesian situation; no prior distribution is being contemplated on the parameter space. We simply mean that the experimenter sometimes guesses values of parameters in the underlying model without being able, or even caring, to assign a measure of uncertainty. Indeed, the formulation of statistical hypotheses can often be attributed to such guesses. If such prior information is available, it should be utilized for better planning and statistical analysis.

Katti [4] has an original paper on estimation of the mean utilizing what he calls "a guess estimate", i.e., the experimenter's belief is stated as a point estimate. Subsequently, estimation of a mean vector and parameters in linear models have been considered respectively in [1], [2] and [4]. This paper also deals with estimation in a simple linear model, but has features not found in earlier works.

¹ Research partially supported by the Air Force Office of Scientific Research, Office of Aerospace Research, United States Air Force, under AFOSR Grant No. AFOSR-2305-67.

² Now at Temple University, Dept. of Statistics.

To state the basic idea a bit formally, consider the model

$$Y_i = \alpha + \beta X_i + \epsilon_i, \qquad i = 1, 2, \ldots, n,$$

where the $\epsilon_i$'s are generally taken to be independently and normally distributed with $E(\epsilon_i) = 0$ and $E(\epsilon_i^2) = \sigma^2$. The $X_i$'s are known. The usual statistical procedures are available for estimating $\alpha$ and $\beta$. If, however, it is suspected that the true values of $\alpha$ and $\beta$ may be $\alpha_0$ and $\beta_0$, say, or very nearly so, then we ask: Can this information be incorporated into the making of inferences regarding $\alpha$ and $\beta$? In this paper we consider one such alternative procedure, requiring possibly fewer observations than $n$ and yet robust against erroneous belief.

2. MODEL AND ESTIMATION

It becomes easier to explain our estimation procedure if we rewrite the model as follows:

$$Y_{ij} = \alpha + \beta X_{ij} + \epsilon_{ij}, \qquad i = 1, 2; \quad j = 1, 2, \ldots, n_i,$$

where $n_1 + n_2 = n$ is fixed. Instead of conducting the experiment for all $n$ observations right away, the experiment is conducted to take only a preliminary sample of size $n_1$, which is to be decided in an optimum manner. If the preliminary sample is in agreement with the experimenter's prior belief, then the experiment is not planned further. Otherwise the experiment is planned for the remaining $n_2$ observations and both the samples are used. To formalize it, let

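$$Y_i = \begin{bmatrix} Y_{i1} \\ \vdots \\ Y_{in_i} \end{bmatrix}, \qquad Z_i = \begin{bmatrix} 1 & X_{i1} \\ \vdots & \vdots \\ 1 & X_{in_i} \end{bmatrix}, \qquad \epsilon_i = \begin{bmatrix} \epsilon_{i1} \\ \vdots \\ \epsilon_{in_i} \end{bmatrix}, \qquad \theta = \begin{bmatrix} \alpha \\ \beta \end{bmatrix},$$

$i = 1, 2$. Now the model in matrix notation is

$$Y_i = Z_i\theta + \epsilon_i, \qquad i = 1, 2.$$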

We assume that the second stage sample size $n_2$ is an integer multiple of $n_1$. That is, $n_2 = r n_1$. Moreover, it is assumed that the experiment can be so controlled that $S_2 = r S_1$, where $S_i = Z_i'Z_i$, $i = 1, 2$. The above assumptions would be satisfied, for example, when the first stage input matrix $Z_1$ is repeatedly used at the second stage if it is required. In the experimental design context one would say that the design is replicated once or, if necessary, additionally $r$ times.
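The replication assumption can be checked numerically. The following is a minimal sketch, assuming NumPy and a made-up set of four first-stage design points (the values and the helper name `design_matrix` are illustrative, not taken from the paper): stacking the first-stage matrix $Z_1$ a total of $r$ times at the second stage gives $S_2 = r S_1$.

```python
import numpy as np

def design_matrix(x):
    """Return Z with rows (1, x_j), matching theta = (alpha, beta)'."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([np.ones_like(x), x])

# Hypothetical first-stage design points (illustrative values only).
x1 = [1.0, 2.0, 3.0, 4.0]
r = 3

Z1 = design_matrix(x1)
Z2 = np.vstack([Z1] * r)      # reuse the first-stage design r times

S1 = Z1.T @ Z1                # S_1 = Z_1' Z_1
S2 = Z2.T @ Z2                # S_2 = Z_2' Z_2
replicated_ok = np.allclose(S2, r * S1)
```

Any second-stage design whose cross-product matrix equals $r S_1$ would serve equally well; literal replication is simply the easiest way to arrange it.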

Let $\theta_0$ denote the experimenter's belief as a point estimate of $\theta$ and $\hat\theta_i$ its least square estimator based on the $i$-th stage. Denote by $\hat\theta$ the least square estimator based on observations of both the stages and by $R$ a region centered at $\theta_0$ so that

$$R = \{u : (u - \theta_0)' S_1 (u - \theta_0) \le \rho\}.$$

A choice of $\rho$ is discussed shortly. But, first, we define $\tilde\theta$, an alternative and competitor to $\hat\theta$, as follows:

$$\tilde\theta = \begin{cases} \hat\theta_1, & \text{if } \hat\theta_1 \in R, \\ \hat\theta, & \text{if } \hat\theta_1 \notin R. \end{cases}$$
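The definition above can be sketched in code. This is only an illustration under stated assumptions: NumPy is used, the function names are invented here, and `draw_second_stage` is a hypothetical callback standing in for running the second-stage experiment; it is called only when the first-stage estimate falls outside $R$.

```python
import numpy as np

def least_squares(Z, Y):
    """Least square estimator (Z'Z)^{-1} Z'Y."""
    return np.linalg.solve(Z.T @ Z, Z.T @ Y)

def two_stage_estimate(Z1, Y1, theta0, rho, draw_second_stage):
    """Return (estimate, used_second_stage) for the two-stage rule.

    draw_second_stage is a hypothetical callback returning (Z2, Y2).
    """
    theta0 = np.asarray(theta0, dtype=float)
    S1 = Z1.T @ Z1
    theta1 = least_squares(Z1, Y1)
    d = theta1 - theta0
    if d @ S1 @ d <= rho:
        # First-stage estimate lies in R: stop and report it.
        return theta1, False
    # Otherwise take the second sample and pool both stages.
    Z2, Y2 = draw_second_stage()
    Z = np.vstack([Z1, Z2])
    Y = np.concatenate([Y1, Y2])
    return least_squares(Z, Y), True
```

For instance, with noise-free data generated from $Y = 2 + 3X$ and prior belief $(2, 3)'$, the rule stops at the first stage; with prior belief $(0, 0)'$ and a small $\rho$ it proceeds to the second stage.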

The estimator $\tilde\theta$ is well defined once $\rho$ is specified. We see that $R$ is a likelihood ratio acceptance region for testing $\theta = \theta_0$ based on the first stage only. One may, therefore, choose $\rho$ so that a specified level of significance is attained. In fact, this is done by Al-Bayyati and Arnold [1]. Instead, since the primary aim is to estimate $\theta$ rather than to accept or reject the prior belief, we determine $\rho$ so that

$$G(\rho \mid \theta_0) = E[(\tilde\theta - \theta_0)' S_1 (\tilde\theta - \theta_0)]$$

is minimized when $\theta = \theta_0$. When, in particular, $S_1$ is diagonal then $G$ is called the expected weighted mean square. This criterion judges the prior belief $\theta_0$ more rigorously than the approach of [1]. This fact is also illustrated by a numerical example from [1]. Another feature is worth mentioning. Our first stage estimator is $\hat\theta_1$, whereas Al-Bayyati and Arnold use a normalized linear combination $K\hat\theta_1 + (1 - K)\theta_0$, which shrinks to $\theta_0$ for appropriate choice of $K$. In other words, we believe that once having utilized the prior belief in planning the experiment and having judged it rigorously, only the sampled data should be used for estimation, while Al-Bayyati and Arnold not only judge $\theta_0$ less rigorously but also continue using it in exchange for $(1 - K)\%$ of the information in the first stage sample summarized by $\hat\theta_1$.

Let $F_i(u)$ denote the continuous distribution function of the vector $\hat\theta_i$.

We now turn to determining $\rho$ so that $G(\rho \mid \theta_0)$ is minimized.
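One way to see why a finite minimizing $\rho$ exists is to condition on the first-stage estimate. The steps below are a sketch reconstructed from the definitions in this section rather than quoted from the paper. They assume $\theta = \theta_0$, $n_2 = r n_1$ and $S_2 = r S_1$ (so that $\hat\theta = (\hat\theta_1 + r\hat\theta_2)/(r+1)$), and that $\hat\theta_2$ is independent of $\hat\theta_1$ with covariance matrix $\sigma^2 S_2^{-1}$.

```latex
\begin{aligned}
\hat\theta - \theta_0
  &= \frac{(\hat\theta_1 - \theta_0) + r(\hat\theta_2 - \theta_0)}{r+1}, \\[4pt]
E\big[(\hat\theta - \theta_0)' S_1 (\hat\theta - \theta_0) \,\big|\, \hat\theta_1 = u\big]
  &= \frac{(u - \theta_0)' S_1 (u - \theta_0)}{(r+1)^2}
     + \frac{r^2}{(r+1)^2}\,\operatorname{tr}\!\big(S_1\,\sigma^2 S_2^{-1}\big) \\
  &= \frac{(u - \theta_0)' S_1 (u - \theta_0)}{(r+1)^2} + \frac{2 r \sigma^2}{(r+1)^2}, \\[4pt]
(u - \theta_0)' S_1 (u - \theta_0)
  - E\big[\,\cdots \,\big|\, \hat\theta_1 = u\big]
  &= \frac{r(r+2)}{(r+1)^2}
     \left[(u - \theta_0)' S_1 (u - \theta_0) - \frac{2\sigma^2}{r+2}\right].
\end{aligned}
```

The last line is the change in the criterion from stopping at $\hat\theta_1 = u$ instead of continuing. It is negative exactly when the quadratic form is below $2\sigma^2/(r+2)$, so stopping helps only for first-stage estimates that close to $\theta_0$.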

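It can be shown that

$$G(\rho \mid \theta_0) = \frac{2\sigma^2}{r+1} + \frac{r(r+2)}{(r+1)^2} \int_R \left[(u - \theta_0)' S_1 (u - \theta_0) - \frac{2\sigma^2}{r+2}\right] dF_1(u). \qquad (1)$$

Letting $S_1^{1/2}(\hat\theta_1 - \theta_0) = W$, the integral expression in (1) is a constant multiple of

$$\int_{w'w \le \rho} \left[(r+2)\,w'w - 2\sigma^2\right] dH(w),$$

where $H(w)$ is the distribution function of $W$. It follows that $\rho = 2\sigma^2/(r+2)$ minimizes $G(\rho \mid \theta_0)$. If $\sigma^2$ is unknown, we suggest replacing it by $[Y_1'Y_1 - \hat\theta_1' Z_1' Y_1]/(n_1 - 2)$.

The sample size $N$ of our procedure is random with $E[N] = n_1 + n_2 P[\hat\theta_1 \notin R] < n$.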
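The quantities in this passage are easy to compute. The sketch below is illustrative only: the function names are invented here, NumPy is assumed, and the expected sample size is evaluated for the special case $\theta = \theta_0$ with $\sigma^2$ known, where the first-stage quadratic form divided by $\sigma^2$ is central chi-square with 2 degrees of freedom (distribution function $1 - e^{-x/2}$).

```python
import math
import numpy as np

def optimal_rho(sigma2, r):
    """Threshold rho = 2*sigma^2/(r+2), as read from the text."""
    return 2.0 * sigma2 / (r + 2)

def residual_variance(Z1, Y1):
    """First-stage estimate of sigma^2: [Y1'Y1 - theta1' Z1'Y1] / (n1 - 2)."""
    theta1 = np.linalg.solve(Z1.T @ Z1, Z1.T @ Y1)
    n1 = len(Y1)
    return float(Y1 @ Y1 - theta1 @ (Z1.T @ Y1)) / (n1 - 2)

def expected_sample_size(n1, r):
    """E[N] = n1 + n2 * P[theta1 not in R], assuming theta = theta0."""
    eta = 2.0 / (r + 2)                 # rho / sigma^2 at the optimal rho
    p_continue = math.exp(-eta / 2.0)   # P[chi-square(2 df) > eta]
    return n1 + r * n1 * p_continue
```

Under these assumptions, `expected_sample_size(n1, r)` always lies strictly between `n1` and `n1 * (r + 1)`, in line with $E[N] < n$.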

3. ROBUSTNESS

To compare $\tilde\theta$ with $\hat\theta$, first we observe that

$$\hat G = E[(\hat\theta - \theta_0)' S_1 (\hat\theta - \theta_0)] = \frac{2\sigma^2}{r+1}.$$

This and the fact that the integral expression in (1) is negative imply that $\tilde\theta$ is more efficient than $\hat\theta$ when indeed $\theta = \theta_0$. Further, we consider the effect as $\theta_0$ departs from the true $\theta$. Note that under the normality assumption $(\hat\theta_1 - \theta_0)' S_1 (\hat\theta_1 - \theta_0)/\sigma^2$ has a non-central chi-square distribution function $\chi^2(\cdot, \lambda)$ with 2 degrees of freedom and the non-centrality parameter

$$\lambda = (\theta - \theta_0)' S_1 (\theta - \theta_0)/2\sigma^2.$$

Therefore,


$$P[\hat\theta_1 \in R] = \int_R dF_1(u) = \chi^2(\eta, \lambda),$$

where $\eta = \rho/\sigma^2$. Since $\lim_{\lambda \to \infty} \chi^2(\eta, \lambda) = 0$, the departure effect is that one would almost surely go to the second stage and $\tilde\theta$ will be $\hat\theta$. Moreover, we can show using the mean value theorem that

$$G(\rho \mid \theta) = \frac{r(r+2)}{(r+1)^2} \left[(u - \theta)' S_1 (u - \theta) - \rho\right] \chi^2(\eta, \lambda) + \hat G,$$

where $u \in R$ is some point. Defining efficiency $e(\lambda)$ of $\tilde\theta$ relative to $\hat\theta$ as $\hat G/G$, we write

$$e(\lambda) = \left\{1 + \frac{r(r+2)}{2\sigma^2 (r+1)} \left[(u - \theta)' S_1 (u - \theta) - \rho\right] \chi^2(\eta, \lambda)\right\}^{-1}.$$

Since $u \in R$, the expression under the brackets is negative if $\theta = \theta_0$ and would remain negative for $\lambda$ close to zero. The expression becomes positive and increases for a while as $\lambda$ departs away from zero but soon starts decreasing and goes to zero as $\lambda$ increases. That is, $\lim_{\lambda \to \infty} e(\lambda) = 1$. In other words, $\tilde\theta$ remains efficient for values of $\lambda$ in the neighborhood of zero and is less efficient than $\hat\theta$ when $\lambda$ is not close to zero, but the two are equally efficient if $\lambda$ is very large.
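The behaviour of $P[\hat\theta_1 \in R] = \chi^2(\eta, \lambda)$ as $\lambda$ grows can be illustrated numerically. The sketch below is an illustration, not the paper's computation. It uses the standard Poisson-mixture series for the non-central chi-square distribution function and assumes that the paper's $\lambda$ (defined with the factor $1/2\sigma^2$) is the Poisson mean in that series. The function names are invented here.

```python
import math

def chi2_cdf_even(x, df):
    """Central chi-square distribution function for even df, x >= 0."""
    half = x / 2.0
    term, total = 1.0, 0.0
    for j in range(df // 2):
        total += term            # accumulates sum_{j < df/2} half^j / j!
        term *= half / (j + 1)
    return 1.0 - math.exp(-half) * total

def prob_first_stage_in_region(eta, lam, terms=200):
    """P[theta1 in R]: non-central chi-square (2 df) evaluated at eta.

    Series: sum_k Poisson(k; lam) * F_{2+2k}(eta), truncated at `terms`.
    """
    weight = math.exp(-lam)      # Poisson probability of k = 0
    p = 0.0
    for k in range(terms):
        p += weight * chi2_cdf_even(eta, 2 + 2 * k)
        weight *= lam / (k + 1)  # Poisson probability of k + 1
    return p
```

Evaluating `prob_first_stage_in_region(2/3, lam)` for increasing `lam` (the value $\eta = 2/3$ corresponds to $r = 1$ at the optimal $\rho$) shows the probability of stopping at the first stage shrinking toward zero, which is the departure effect described above.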

Similar types of inferences can be reached had we chosen weight matrix $I$ or $(S_1 + S_2)$ instead of $S_1$ in the definition of $G$.

To find an optimum $r$ or, equivalently, $n_1$, we consider the relative efficiency $e(\lambda)$ for $\lambda = 0$. It is possible to write

$$e(0) = 2/[(r+1)m + 2],$$

where $m = 2\chi^2_4(\eta) - \eta\,\chi^2_2(\eta)$. We have already seen at the beginning of this section that $e(0) > 1$. Further behavior of $e(0)$ is illustrated by Table 1. This table suggests that the second stage sample size, if it is needed, should be 3 times that of the first stage.

TABLE 1. Some values of e(0)

r:      1     2     3     4     8
e(0):   1.08  1.15  1.17  1.07  1.05


4. AN EXAMPLE

Al-Bayyati and Arnold consider the following example. For details see their paper [1]. In one experiment of testing a certain influenza virus in mice, previous experiments indicated that the relationship between the scored response $Y$ and virus dilutions in log units was $Y = 6 - 3(X - \bar X)$. Therefore the prior belief is $\theta_0 = (6, -3)'$. In a screening experiment the control was repeated twice in $n_1 = n_2 = 80$ animals, 20 rats at each of 4 dilutions. Thus in this example $r = 1$, which is not an optimum choice. It was found that $\hat\theta_1 = (\ldots)'$. Moreover, we calculated $S_1 = (\ldots)$.

Incidentally, Al-Bayyati and Arnold's setup has $S_1$ diagonal, which need not be so in our setup. We find $\rho = 7.2$, and $(\hat\theta_1 - \theta_0)' S_1 (\hat\theta_1 - \theta_0) = 13.3$. Hence $\hat\theta_1 \notin R$ and consequently our procedure requires the second stage. However, Al-Bayyati and Arnold's procedure would not recommend the second stage sampling. We have judged the prior belief more rigorously. It is interesting to note, however, that had our prior belief been $(6.35, -3.1)'$, which is $\hat\theta$ in this example, we would have found that $\hat\theta_1 \in R$ and stopped at the first stage.
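The decision in this example reduces to comparing two reported numbers, which the short sketch below makes explicit. The function name is invented here. The back-calculated variance is an inference that holds only if the threshold was obtained as $\rho = 2\sigma^2/(r+2)$ with $r = 1$; the paper does not report $\sigma^2$ (or its estimate) in this passage.

```python
def first_stage_decision(q, rho):
    """Stop if the first-stage quadratic form q lies within the region R."""
    return "stop at first stage" if q <= rho else "go to second stage"

rho = 7.2     # threshold reported in the example
q = 13.3      # reported value of the first-stage quadratic form
r = 1         # n1 = n2 = 80

decision = first_stage_decision(q, rho)

# Assumption (not stated in the text): rho = 2*sigma^2/(r+2),
# which would imply this variance value.
implied_sigma2 = rho * (r + 2) / 2.0
```

Since 13.3 exceeds 7.2, the rule calls for the second stage, matching the conclusion drawn in the text.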

This article considers a two-stage estimation, using prior hypotheses, of the parameters of a simple linear model. The proposed estimator is not only compared with the usual least squares estimator but is also contrasted with the one proposed in [1]. The effect of the hypotheses turning out to be false is also examined.

REFERENCES

[1] Al-Bayyati, H.A. and Arnold, J.C. (1972). On double-stage estimation in simple linear regression using prior knowledge. Technometrics, 14, 405-412.

[2] Mayer, L.S., Singh, J. and Willke, T.A. (1974). Utilizing initial estimates in estimating the coefficients in a general linear model. J. Amer. Statist. Assoc., 69, 219-222.

[3] Katti, S.K. (1962). Use of some a priori knowledge in the estimation of means from double samples. Biometrics, 18, 139-147.

[4] Waikar, V.B. and Katti, S.K. (1971). On a two-stage estimate of the mean. J. Amer. Statist. Assoc., 66, 75-81.

Received 17 July 1973

Professor Jagbir Singh
Division of Statistics
The Ohio State University
Columbus, Ohio 43210
U.S.A.
