NON BALANCED EXPERIMENTAL DESIGNS FOR ESTIMATING VARIANCE COMPONENTS
by R. L. Anderson
Presented at Seminar on Sampling of Bulk Materials, Tokyo, Japan, November 15-18, 1965, sponsored by the National Science Foundation and the Japan Society for the Promotion of Science.
Institute of Statistics Mimeograph Series 452


NON BALANCED EXPERIMENTAL DESIGNS FOR ESTIMATING VARIANCE COMPONENTS*

R. L. Anderson

North Carolina State University at Raleigh
Raleigh, North Carolina

1. Introduction

This paper is a summary of recent research at North Carolina State University

on the development of experimental designs to estimate variance components. Much of

the research has been conducted by doctoral candidates in Experimental Statistics

under my direction. The two most recent projects are not completed, but developments

to date are summarized here. In our experience, a knowledge of the magnitude of the

variance components would be useful in the following situations:

(1) Population changes often can be described in terms of variance components, e.g.,

in quantitative genetics. Hence a knowledge of the actual magnitude of these

components is required in assessing various control programs.

(2) The proper allocation of resources to reduce product variability depends on a

knowledge of the relative magnitude of the variance components.

(3) A knowledge of the relative magnitude of the variance components is also needed

to determine the best allocation of funds in sampling to estimate population

means and totals and in planning experiments to compare treatments.

Unfortunately I have had very little experience with the problems faced in the

sampling of bulk materials. It is hoped that my appearance on this symposium will

have two benefits:

(a) Enable me to become familiar with your problems so that some of our

research can be oriented in the direction of helping to solve them.

* Presented at Seminar on Sampling of Bulk Materials, Tokyo, Japan, November 15-18, 1965; sponsored by the National Science Foundation and the Japan Society for the Promotion of Science.


(b) Furnish you with a brief description of current developments in designing

experiments to estimate variance components in the hope that you will be

able to adapt some of the results to bulk sampling situations.

In this paper, it will be assumed that all sources of variation are essentially

random. Two experimental and operational procedures are considered: the nested or

hierarchical type and the classification type. A balanced nested design and a two-way

classification operation with nested-type sub-sampling are illustrated in Table 1.

I have been using as a typical example of a nested-type operation the following:

A pilot study was considered to assess the various sources of variability in the

production and assay for streptomycin before conducting an experiment on the efficacy

of various molds. There were five stages in this process: an initial incubation in

a test tube; a primary inoculation period in a Petri dish; a secondary inoculation

period in another Petri dish; a fermentation period in a bath; and the final assay

of the amount of streptomycin produced. This is a five-stage nested operation.

Another example is a study by Newton, et al. (1951) of the variability of rubber,

considering the following sources of variation: producer's estates, days at a given

estate, bales on a given day, sheets from a given bale, and samples from each sheet.

Technicians then took measurements on the samples.

As an example of a classification-type operation, consider the following: Samples

from each of s = 10 sources of a material are to be analyzed by each of ℓ = 5 laboratories.

Let us assume that n = 2 samples are sent to each laboratory from each

source. This is a two-way classification operation. The purpose of the investigation

is to estimate the lab-to-lab variation, the source-to-source variation, a possible

lab x source interaction (the failure of source differences to be the same from lab

to lab), and the sample-to-sample variation. It is also possible to determine if the

sample-to-sample variation is constant from lab to lab or source to source.

Table 1. Examples of Balanced Nested and Classification-Type Operations

Nested-Type Operation

Stages, each nested within the preceding one: Test Tube; Primary Inoculation; Secondary Inoculation; Fermentation; Assay.

Model: $y_{ijk\ell m} = \mu + t_i + p_{j(i)} + s_{k(ij)} + f_{\ell(ijk)} + a_{m(ijk\ell)}$

Two-Way Classification Operation With Nested-Type Sampling

Number of samples in each cell:

                  Source 1            Source 2         ...      Source 10
Lab.  Tech.   Batch 1  Batch 2    Batch 1  Batch 2     ...   Batch 1  Batch 2
 1     1         2        2          2        2        ...      2        2
 1     2         2        2          2        2        ...      2        2
 2     1         2        2          2        2        ...      2        2
 2     2         2        2          2        2        ...      2        2
 ...
 5     1         2        2          2        2        ...      2        2
 5     2         2        2          2        2        ...      2        2

Additional complications can be added to the last example, which involve combinations of nested and classification-type operations. Suppose each laboratory selects 2 technicians for the study and from each source 2 batches are selected. In this case, 4 samples are sent to each laboratory from each batch, 2 for each technician.

Many classification-type operations will involve more than two classes. Some discussion of the general multi-way classification operation will be included in this paper.

The stochastic models for the observations obtained by use of these operations are given at the bottom of each example in Table 1. In all cases $\mu$ is the general mean and the other parts of the models are assumed to be independently identically distributed random variables. We will assume these random variables are normally distributed; however, this requirement is not always essential. The parameters of interest are the variances of the random variables. For the nested-type operation, these are

$\sigma_t^2,\ \sigma_p^2,\ \sigma_s^2,\ \sigma_f^2,\ \sigma_a^2.$

For the two-way classification operation, they are

$\sigma_s^2,\ \sigma_{b(s)}^2,\ \sigma_\ell^2,\ \sigma_{t(\ell)}^2,\ \sigma_{s\ell}^2,\ \sigma_{st(\ell)}^2,\ \sigma_{b(s)\ell}^2,\ \sigma_{b(s)t(\ell)}^2,\ \sigma_a^2.$

The notation j(i) means the jth sample in the ith class; for example, $p_{j(i)}$ is the jth primary inoculation sample from the ith test tube.

If only one batch is used per source and one technician per lab, one will not be able to separate batch and source variation or technician and lab variation. The model might be written in this manner

$y_{ikm} = \mu + S_i + L_k + (SL)_{ik} + a_{m(ik)},$

where in this case $\sigma_S^2$ is the sum of $\sigma_s^2$ and $\sigma_{b(s)}^2$ above, $\sigma_L^2$ is the sum of $\sigma_\ell^2$ and $\sigma_{t(\ell)}^2$, and $\sigma_{SL}^2$ the sum of the interaction components.


Suppose we have t incubation test tubes and select two samples from each for the

primary inoculation, two from each primary for the secondary, two from each secondary

for the fermentation bath and assay two samples from each fermentation bath, giving

a total of 16t assays. The analyses of variance and the accompanying average values

of the mean squares are given in Table 2 for data obtained from balanced nested,

simple two-way classification and combined nested-classification designs.
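The way a balanced nested plan distributes its degrees of freedom can be sketched in a few lines of Python. This is only an illustration of the (t,2,2,2,2) sampling scheme described above; the function name `balanced_nested_df` and the stage labels are illustrative choices, not notation from the paper.

```python
def balanced_nested_df(t, splits=(2, 2, 2, 2)):
    """Degrees of freedom per stage for a balanced nested design.

    t      : number of units at the first stage (incubation test tubes)
    splits : number of sub-samples taken from each unit at each later stage
    Returns (list of (stage, df), total number of final-stage observations).
    """
    names = ["Incubation", "Primary inoculation", "Secondary inoculation",
             "Fermentation", "Assay"]
    units = t                      # units present at the current stage
    rows = [(names[0], t - 1)]     # first stage: t - 1 degrees of freedom
    for name, m in zip(names[1:], splits):
        rows.append((name, units * (m - 1)))  # (m - 1) d.f. inside each parent unit
        units *= m
    return rows, units


rows, total = balanced_nested_df(t=3)
for stage, df in rows:
    print(f"{stage:22s} {df:3d}")
print("total assays:", total)
```

With t = 3 the assay stage receives 24 of the 47 degrees of freedom while the incubation stage receives only 2, which is the imbalance of information discussed in the text.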

An examination of these analyses of variance reveals that a balanced design will

concentrate most of the information on the estimation of $\sigma_a^2$ and only a small amount of information on the estimation of some of the other variance components, e.g., $\sigma_t^2$ and $\sigma_p^2$ for the nested design and $\sigma_s^2$, $\sigma_{b(s)}^2$, $\sigma_\ell^2$, and $\sigma_{t(\ell)}^2$ for the classification designs. In order to balance the information on the various estimators more nearly,

some form of non-balanced design is necessary. The decision as to how to construct

a non-balanced design is complicated by a number of questions, e.g.,

1) What is the best estimation procedure for a non-balanced design?

2) What criteria of efficiency should be used when a number of variance

components, plus the general mean, must be estimated? In this context,

it should be noted that in many cases it is ratios of variance

components which are of most importance to the investigator.

3) How much will comparisons of different designs be affected by

total sample size and by the values of the parameters to be estimated?

4) How will these results depend on different sampling cost functions?

5) If the optimal results depend on the values of the parameters being

estimated, would a sequential plan be useful? Or should one consider

some form of Bayesian estimation?

One might recommend that maximum likelihood (ML) procedures should be used;

however, these produce equations which are non-linear in the estimators and for

Table 2. Analyses of Variance for Balanced Nested and Classification-Type Data

5-stage nested operation with (t,2,2,2,2) successive samples

                          Degrees of   Coefficients of variance components in average value of mean square
Source of variation       freedom      $\sigma_a^2$  $\sigma_f^2$  $\sigma_s^2$  $\sigma_p^2$  $\sigma_t^2$
Incubation                t - 1             1             2             4             8            16
Primary Inoculation       t                 1             2             4             8
Secondary Inoculation     2t                1             2             4
Fermentation              4t                1             2
Assay                     8t                1

Two-way classification operation

                          Degrees of            Coefficients of variance components in average value of mean square
Source of variation       freedom               $\sigma_a^2$  $\sigma_{s\ell}^2$  $\sigma_\ell^2$  $\sigma_s^2$
Sources                   s-1 (9)                    1            n (2)                             nℓ (10)
Labs                      ℓ-1 (4)                    1            n (2)            ns (20)
S x L                     (s-1)(ℓ-1) (36)            1            n (2)
Analyses*                 (n-1)sℓ (50)               1

* Can be subdivided into ℓ parts with (n-1)s d.f. each for each lab or into s parts with (n-1)ℓ d.f. each for each source.

Combined two-way classification and nested operation

Coefficients of variance components in average value of mean square; the columns are, in order, (1) $\sigma_a^2$, (2) $\sigma_{b(s)t(\ell)}^2$, (3) $\sigma_{b(s)\ell}^2$, (4) $\sigma_{st(\ell)}^2$, (5) $\sigma_{s\ell}^2$, (6) $\sigma_{t(\ell)}^2$, (7) $\sigma_\ell^2$, (8) $\sigma_{b(s)}^2$, (9) $\sigma_s^2$.

                          Degrees of
Source of variation       freedom     (1)   (2)   (3)   (4)   (5)   (6)   (7)   (8)   (9)
Sources                       9        1     2     4     4     8                 20    40
Batches in sources           10        1     2     4                             20
Labs                          4        1     2     4     4     8    40    80
Tech. in labs                 5        1     2           4          40
S x L                        36        1     2     4     4     8
S x T(L)                     45        1     2           4
B(S) x L                     40        1     2     4
B(S) x T(L)                  50        1     2
Analyses*                   200        1

* Again this can be subdivided for internal analysis.

which, in general, there is no closed-form solution. In other words, ML estimates must be obtained in a complicated iterative computing procedure. In addition, small sample properties of ML estimators cannot be obtained except by use of empirical sampling methods for specified sets of parameter values; these properties may not be too good, i.e., the estimators will be biased and may be rather inefficient.

If one is estimating the five variance components and $\mu$ for the five-stage nested operation, it is recognized that a plan which (for a given total number of assays) produces the minimum variance estimator of $\sigma_a^2$ will furnish no estimator of the other components; all 16t assays would be made from the same fermentation bath. The best estimator of the total variance, $\sigma_t^2 + \sigma_p^2 + \sigma_s^2 + \sigma_f^2 + \sigma_a^2$, would involve using 16t test tubes and only one sample from each test tube, from each primary and secondary inoculant, and from each fermentation bath.

In his analysis of the design problem for two-way classification data, Gaylor (1960) showed that the lower bound to the variance of the estimator of any linear combination, $\sigma_\theta^2$, of the variance components is $2\sigma_\theta^4/(N-1)$, where N is the total number of samples obtained. Gaylor showed that the variances of the estimators of the functions given in Table 3 could reach this lower bound, where $\sigma_s^2$ is the sampling variance within cells.

On the other hand, if the interaction variance component were zero, the optimal design to estimate the row component above would use only one column (or the optimal design to estimate the column component above would use only one row). The problem of the number of rows to best estimate $\sigma_r^2$ (or columns to best estimate $\sigma_c^2$) and the number of samples per row (per column) was considered by Crump (1954). Crump also studied the use of various estimators.

Prairie (1962) considered various designs for a three-stage nested operation and advanced a possible general criterion for a two-stage operation.

Table 3. Two-way classification designs which produce most efficient estimators of certain linear functions of the variance components for fixed total sample.*

Function estimated ($\sigma_\theta^2$)                        r     c     $n_{ij}$
$\sigma_s^2$                                                  1     1     N
$\sigma_s^2 + \sigma_{rc}^2 + \sigma_r^2$                     N     1     1
$\sigma_s^2 + \sigma_{rc}^2 + \sigma_c^2$                     1     N     1
$\sigma_s^2 + \sigma_{rc}^2 + \sigma_r^2 + \sigma_c^2$        N     N     1 or 0**

* r rows x c columns with $n_{ij}$ samples in the ith row and jth column, $\sum_{i,j} n_{ij} = N$.

** Each sample to be taken from a different row and column, e.g., row 1 and column 1, row 2 and column 2, .... This design is also most efficient to estimate $\mu$; the variance of

Table 5. Efficiency ratios (E) for $\hat\mu$, $\hat\sigma_b^2$, $\hat\sigma_a^2$, $\hat\rho$; N = 100 1/

$\rho$     a     E($\hat\mu$)   E($\hat\sigma_b^2$)   E($\hat\sigma_a^2$)   E($\hat\rho$)
0.5       10       .25            .91                   .58                  .71
          20       .43            .81                   .88                  .97
          25       .50            .76                   .96                 1.00
          33       .60            .68                  1.00                  .96
          35       .61            .66                   .99                  .94
          50       .75            .51                   .91                  .14

1.0       25       .40            .76                   .77                  .96
          33       .49            .68                   .89                 1.00
          40       .56            .61                   .94                  .97
          50       .67            .51                  1.00                  .88
          60       .71            .40                   .95                  .72

2.0       25       .33            .76                   .61                  .90
          38       .47            .63                   .81                 1.00
          50       .60            .51                   .97                  .97
          66       .69            .34                  1.00                  .72
          75       .82            .25                   .98                  .54

1/ E = (Variance for optimum allocation at optimum a) / (Variance for optimum allocation at given a)

Bush (1962) and Bush and Anderson (1963) compared three estimation procedures for different two-way classification designs.

2. A Two-Stage Nested Design with Varying Number of Classes

In order to indicate how the variance of the estimators of the variance components changes with the design, consider a simple two-stage nested design with only eight observations, for which the number of classes (a) ranges from 2 to 7. Crump (1954) has shown that, if the usual Analysis of Variance estimators are used, the best allocation of a fixed total sample of N into a classes is to have r classes with p+1 observations per class and a - r classes with p observations per class, where

$N = ap + r,\quad 0 \le r < a.$

Crump's result was used to determine the allocation for each of the situations presented in Table 4. For this table, the model is

$y_{ij} = \mu + a_i + b_{j(i)},$

where $\sigma_a^2$ refers to the between-classes variance component and $\sigma_b^2$ the within-classes component. We let $V_a = \mathrm{var}(\hat\sigma_a^2)/2\sigma_b^4$ and $V_b = \mathrm{var}(\hat\sigma_b^2)/2\sigma_b^4$, where the circumflex refers to the usual Analysis of Variance estimator. These estimators are based on the regression sums of squares which would be obtained if the $a_i$ were regarded as constants and the estimation was that of weighted least squares (weighted by the number of observations).

The derivation of the variance of $\hat\sigma_a^2$ is not simple for unbalanced designs, because SSA is not a multiple of $\chi^2$ unless the design is balanced. To illustrate, consider the case for 5 classes with $n_i = 2,2,2,1,1$. The between-classes sum of squares can be subdivided into three orthogonal parts, each having a $\chi^2$-distribution as shown in Table 4c. The sums of squares in Table 4c are independently distributed as $\chi_i^2\sigma_i^2$ with $\mathrm{DF}_i$ degrees of freedom, where $\sigma_i^2$ is the expected value of the mean square.
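Crump's allocation rule quoted in this section (N = ap + r, with r classes of p+1 observations and a - r classes of p observations) can be sketched as follows. The function name `crump_allocation` is a hypothetical label used only for illustration.

```python
def crump_allocation(N, a):
    """Split N observations into a classes as evenly as possible.

    Writes N = a*p + r with 0 <= r < a and returns r class sizes of p + 1
    followed by a - r class sizes of p.
    """
    if not 1 <= a <= N:
        raise ValueError("need 1 <= a <= N")
    p, r = divmod(N, a)
    return [p + 1] * r + [p] * (a - r)


# The eight-observation designs with a = 2, ..., 7 classes
for a in range(2, 8):
    print(a, crump_allocation(8, a))
```

For N = 8 and a = 5 this gives the class sizes 2, 2, 2, 1, 1 used in the illustration above.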

Table 4a. Analysis of variance for two-stage nested design with 8 observations (a classes)

Source of variation    Degrees of Freedom (DF)    Sum of squares    Average value of mean square*
Between classes        a - 1                      SSA               $\sigma_b^2 + k\sigma_a^2$
Within classes         8 - a                      SSB               $\sigma_b^2$

* $\rho = \sigma_a^2/\sigma_b^2$ and $k = (64 - \sum n_i^2)/8(a-1)$.

$\hat\sigma_b^2 = \mathrm{MSB}$; $\hat\sigma_a^2 = (\mathrm{MSA} - \mathrm{MSB})/k$.

Table 4b. Values of $V_a$ and $V_b$ for various designs; N = 8, and for selected values of $\rho$.*

No. classes                       $V_a$ for given $\rho$
   (a)       $n_i$**        .05     .1      .2      .5     1.0     2.0     5.0    10.0     $V_b$
    2        4(2)          .100†   .133†   .213    .573    1.57    5.07    27.6    105     .167
    3        3(2),2        .122    .145    .198†   .419    .993    2.90    14.8    54.9    .200
    4        2(4)          .163    .182    .226    .396†   .812†   2.15    10.1    36.8    .250
    5        2(3),1(2)     .255    .274    .314    .467    .829    1.96    8.60    30.5    .333
    6        2(2),1(4)     .430    .447    .485    .623    .941    1.90†   7.43    25.4    .500
    7        2,1(6)        .937    .953    .988    1.11    1.40    2.22    6.83†   21.6†  1.000

* $V_a = \mathrm{var}(\hat\sigma_a^2)/2\sigma_b^4$; $V_b = \mathrm{var}(\hat\sigma_b^2)/2\sigma_b^4$; values marked † are the minimum for given $\rho = \sigma_a^2/\sigma_b^2$.

** Number in parentheses refers to number of classes with this $n_i$.

Table 4c. Subdivision of SSA for a = 5

Source                  Sum of Squares    DF    Average value of mean squares ($\sigma_i^2$)
Among classes of 2      $\mathrm{SSA}_1$   2    $\sigma_b^2 + 2\sigma_a^2 = \sigma_1^2$
Among classes of 1      $\mathrm{SSA}_2$   1    $\sigma_b^2 + \sigma_a^2 = \sigma_2^2$
Between two groups      $\mathrm{SSA}_3$   1    $\sigma_b^2 + (5/4)\sigma_a^2 = \sigma_3^2$

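Hence

$\mathrm{SSA} = \sum_{i=1}^{3} \mathrm{SSA}_i$ and $k = [2(2) + 1 + 5/4]/4 = 25/16$,

$\hat\sigma_a^2 = \left[\sum_i \mathrm{SSA}_i/(a-1) - \mathrm{MSB}\right]/k$,

$\mathrm{var}(\hat\sigma_a^2)/2\sigma_b^4 = \left[\sum_i (\mathrm{DF})_i\,\sigma_i^4/(a-1)^2 + \sigma_b^4/3\right]/k^2\sigma_b^4$

$= \left(\frac{16}{25}\right)^2\left[\frac{7}{12}\sigma_b^4 + \frac{25}{32}\sigma_b^2\sigma_a^2 + \frac{169}{256}\sigma_a^4\right]/\sigma_b^4$

$= \frac{448}{1875} + \frac{8}{25}\rho + \frac{169}{625}\rho^2.$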
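The value of $V_a$ for any set of class sizes can be checked numerically. The sketch below does not use the orthogonal subdivision of Table 4c; instead it assumes the standard normal-theory variance of the between-classes sum of squares in an unbalanced one-way random model, which is an assumption brought in for illustration and not a formula stated in this paper. The function name `v_a` is likewise illustrative. Exact fractions are used so that the result can be compared with 448/1875 + (8/25)ρ + (169/625)ρ².

```python
from fractions import Fraction


def v_a(sizes, rho):
    """V_a = var(sigma_a^2 estimator) / (2 sigma_b^4) for class sizes `sizes`.

    Assumes (standard result, not quoted in the text):
      var(SSA)/(2 sigma_b^4) = (a-1) + 2 rho (N - S2/N)
                               + rho^2 (S2 - 2 S3/N + S2^2/N^2),
    with S2 = sum n_i^2 and S3 = sum n_i^3, and SSA independent of MSB.
    """
    rho = Fraction(rho)
    N, a = sum(sizes), len(sizes)
    s2 = sum(n ** 2 for n in sizes)
    s3 = sum(n ** 3 for n in sizes)
    k = Fraction(N * N - s2, N * (a - 1))            # coefficient of sigma_a^2 in E(MSA)
    var_ssa = ((a - 1)
               + 2 * rho * Fraction(N * N - s2, N)
               + rho ** 2 * (s2 - Fraction(2 * s3, N) + Fraction(s2 * s2, N * N)))
    var_msb = Fraction(1, N - a)                      # var(MSB)/(2 sigma_b^4)
    return (var_ssa / (a - 1) ** 2 + var_msb) / k ** 2


rho = Fraction(1, 2)
closed_form = Fraction(448, 1875) + Fraction(8, 25) * rho + Fraction(169, 625) * rho ** 2
print(v_a([2, 2, 2, 1, 1], rho) == closed_form)       # agreement with the derivation above
print(round(float(v_a([2, 2, 2, 2], 1)), 3))          # balanced plan with a = 4
```

Under that assumption the function agrees with the closed form obtained from Table 4c for the 2,2,2,1,1 design, and it can be used to tabulate $V_a$ for the other class-size patterns.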

An examination of Table 4 reveals that the best experimental plan to estimate $\sigma_a^2$ varies from a = 2 to a = 7, depending on the value of $\rho$. This is the standard result in designing experiments to estimate variance components; the criterion depends on the values of the parameters to be estimated. Of course, the best plan to estimate $\sigma_b^2$ is that for a = 1, for which $V_b = 1/7 = .143$. Note that if $\rho$ is small, one could do quite well in estimating both $\sigma_a^2$ and $\sigma_b^2$ with small a. For $\rho$ large, the best plan to estimate $\sigma_a^2$ is the worst to estimate $\sigma_b^2$. One other complication is that these estimators are not ML unless the design is balanced. My students have been performing some empirical sampling studies for $\rho = 4$ which indicate tentatively that the ML estimator is superior (on a Mean Square Error basis) to the above weighted analysis of variance estimator. I am not able to state whether this will continue to be true as more samples are collected or for other values of $\rho$.

I should indicate that Crump showed that the optimal number of classes to estimate $\sigma_a^2$ (using the usual analysis of variance) is approximately

$a_1 = N(N\rho + 2)/[N(\rho + 1) + 1].$

Since $a_1$, in general, will not be an integer, it appears that the closest integer to $a_1$ should suffice because the variance profile is quite flat in the neighborhood of the optimum, except that one should use a balanced plan if possible. In general, one does not lose much by using a balanced plan unless $\rho$ is quite large, and in many situations the balanced plan is best.
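As a quick numerical illustration of Crump's approximation, the sketch below evaluates $a_1 = N(N\rho + 2)/[N(\rho + 1) + 1]$ for the eight-observation example. The function name `crump_optimal_classes` is illustrative only, and rounding to the nearest integer follows the suggestion in the text rather than any rule given by Crump.

```python
def crump_optimal_classes(N, rho):
    """Approximate optimal number of classes for estimating sigma_a^2.

    N   : total number of observations
    rho : guessed ratio sigma_a^2 / sigma_b^2
    Returns the (generally non-integer) value a_1.
    """
    return N * (N * rho + 2) / (N * (rho + 1) + 1)


for rho in (0.05, 0.5, 1.0, 2.0, 10.0):
    a1 = crump_optimal_classes(8, rho)
    print(f"rho = {rho:5.2f}   a1 = {a1:5.2f}   nearest integer = {round(a1)}")
```

For N = 8 the value of $a_1$ rises from about 2 when ρ = .05 to about 7 when ρ = 10, in line with the pattern of minima in Table 4b.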

All of these conclusions are based on the assumption that one has a good guess of the value of $\rho$ beforehand, that the analysis of variance estimator is to be used, and that prime interest is centered on estimating $\sigma_a^2$.

In Table 5, I have presented efficiency ratios (using the analysis of variance procedures) for estimating each of four parameters, $\mu$, $\sigma_b^2$, $\sigma_a^2$ and $\rho = \sigma_a^2/\sigma_b^2$, for N = 100 and various values of $\rho$ and number of classes. In the interest of time, I will not dwell on these results; however, one can see that there is considerable loss in efficiency in estimating one parameter if efforts are made to optimize the estimation of some other parameter. Crump shows the effects of incorrectly guessing the value of $\rho$ on the efficiency of estimates of $\sigma_a^2$. In general, if $1/2 < \rho/\rho_0 < 2$, where $\rho_0$ is the guessed $\rho$, the loss in efficiency is less than 10%; this percentage loss decreases as $\rho$ increases.

The problem of needing a prior estimate of $\rho$ in order to select an optimum design

is now being considered by one of my graduate students; a brief report on this

research will be made at the end of this paper.

3. Problem of Reducing Total Variability for a Two-Stage Production Process

Prairie (1961, 1962) considered the problem of the effect of different designs in estimating the variance components needed to allocate funds in the most efficient manner to reduce the total product variance, $\sigma^2 = \sigma_a^2 + \sigma_b^2$. Most of the theoretical results are presented in Table 6. D units are assumed available to reduce $\sigma^2$; $D = D_a + D_b$, where $D_a$ units are to be spent to reduce $\sigma_a^2$ and $D_b$ units to reduce $\sigma_b^2$,

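Table 6. Formulas used in Prairie's problem of reducing total variability for a two-stage production process

(6.1) $\sigma_R^2/\sigma_b^2 = e^{-k_1 D} + \rho$, if $D_a = 0$;
      $= e^{-k_1(D - D_a)} + \rho e^{-k_2 D_a}$, if $0 < D_a < D$;
      $= 1 + \rho e^{-k_2 D}$, if $D_a = D$.

(6.2) $V_0(\hat\rho) = e^{-k_1 D} + \rho$, if $\hat\rho \le \rho_1$;
      $= e^{-k_1(D - C_1)}\,\hat\rho^{\,k_1/C_2} + \rho e^{-k_2 C_1}\,\hat\rho^{\,-k_2/C_2}$, if $\rho_1 \le \hat\rho \le \rho_2$;
      $= 1 + \rho e^{-k_2 D}$, if $\hat\rho \ge \rho_2$.

(6.3) $\hat\rho = \frac{1}{K}\left[\frac{(N-a)\,Q}{(a-1)\,\mathrm{SSE}} - 1\right]$,
      where $K = [N(N - 2p - 1) + ap(p+1)]/N(a-1)$ and
      $Q = \sigma_b^2\left\{[1 + (p+1)\rho]\,\chi^2_{(r-1)} + [1 + p\rho]\,\chi^2_{(a-r-1)} + [1 + a\rho(p+1)p/N]\,\chi^2_{(1)}\right\}$.

(6.4) $P = \mathrm{Prob}[V_0(\hat\rho) \le \beta]$, where $\beta$ is a fixed number between $\min V_0$ and $\max V_0$.

(6.5) $\beta = (9 \min V_0 + \max V_0)/10$; $\beta = (3 \min V_0 + \max V_0)/4$.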

$0 \le D_a, D_b \le D$. It is assumed one unit of funds will reduce either $\sigma_b^2$ by 1% or $\sigma_a^2$ by $100\lambda$%, $\lambda$ a known constant. $\sigma^2$ will be minimized if

$D_a = d_a = C_1 + (\ln\rho)/C_2,$

where $C_2 = k_1 + k_2$ and $C_1 C_2 = D k_1 + \ln(k_2/k_1)$. If $d_a < 0$, set $D_a = 0$; if $d_a > D$, set $D_a = D$.
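The allocation rule can be illustrated with a short sketch. It assumes the exponential form of the reduced variance ratio in (6.1), $e^{-k_1(D - D_a)} + \rho e^{-k_2 D_a}$, with $k_1$ and $k_2$ treated as given positive rate constants, and it takes the interior minimizer to be $d_a = C_1 + (\ln\rho)/C_2$, which is what setting the derivative of that ratio to zero yields. The function names and the numerical values of $k_1$, $k_2$ and D below are hypothetical.

```python
import math


def optimal_da(D, k1, k2, rho):
    """Funds D_a to spend on reducing sigma_a^2, clipped to the interval [0, D]."""
    c2 = k1 + k2
    c1 = (D * k1 + math.log(k2 / k1)) / c2
    d_a = c1 + math.log(rho) / c2          # stationary point of the variance ratio
    return min(max(d_a, 0.0), D)


def reduced_variance_ratio(D, d_a, k1, k2, rho):
    """sigma_R^2 / sigma_b^2 when d_a units go to sigma_a^2 and D - d_a to sigma_b^2."""
    return math.exp(-k1 * (D - d_a)) + rho * math.exp(-k2 * d_a)


D, k1, k2, rho = 100.0, 0.01, 0.01, 1.0    # hypothetical values for illustration
best = optimal_da(D, k1, k2, rho)
print(best, reduced_variance_ratio(D, best, k1, k2, rho))
```

With equal rates and ρ = 1 the funds are split evenly, and moving one unit in either direction increases the ratio; a very small ρ pushes the whole budget to $\sigma_b^2$.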

In order to use these results, one must obtain an estimator of $\rho$, $\hat\rho$. Unfortunately the use of an estimate of $\rho$ results in less than the optimal reduction in total variance; hence, one objective is to minimize the loss due to estimation. The true reduced variance ratio is given by (6.2). In order to make (6.2) small, it is desirable to obtain a $\hat\rho$ as close to $\rho$ as possible.

Prairie considered a two-stage nested design of the Crump-type with N = ap + r and investigated how the true reduced variances (based on $\hat\rho$) changed with a, the number of classes. The estimator of $\rho$ is given by (6.3). Smallness of (6.2) might be a small expected variance ratio, $E(\sigma_R^2 \mid \hat\rho)$. Unfortunately it was not possible to obtain a closed-form expression; Prairie has recently investigated this numerically at Sandia Corporation.

In his thesis, Prairie investigated the effect of changing a (for fixed N) on the maximization of P, equation (6.4). Two values of $\beta$, as shown in (6.5), were used in the study. The parameter values studied were:

$\rho$, $100\lambda$ = 1/10, 1/4, 1/2, 1, 2, 4 and 10; D = 25, 100; for N = 24, a = 2(2)8(4)20; for N = 72, a = 2, 4, 8, 12, 18, 24(12)60.

In general, it appears that if one uses a design that is moderately near the optimal, he will do quite well in achieving his objective of reducing total variance. The study indicated that for most situations an intermediate value of a, say $N/4 \le a \le N/2$, will give almost optimal results. If one value of a were to be recommended, it would be a = N/3.

P was computed by use of Incomplete Beta approximations. Profiles of $V_0(\hat\rho)$ are given in the two Prairie references.

4. Extensions to Multi-Stage Production Processes

In the five-stage nested experiment mentioned earlier, I had proposed a so-called

staggered design. This design was constructed to equalize (as far as possible) the

degrees of freedom for the various parts of the analysis of variance. Prairie (1962)

also considered my staggered (D1) design plus other non-balanced three-stage nested designs, with the model

$y_{ijk} = \mu + a_i + b_{j(i)} + c_{k(ij)}$; i = 1,2,...,a; j = 1,2,...,$b_i$; k = 1,2,...,$n_{ij}$.

Some of these are presented in Table 7, plus a balanced design, for N = 48. When he was a graduate student at Raleigh, L. D. Calvin became interested in these non-balanced designs. He has subsequently used some of them; see Calvin and Miller (1961). Prairie developed a specific procedure for constructing his three-stage designs, called D2-designs:

1) $N = aq_1 + r_1$, $0 \le r_1 < a$.
Assign $q_1 + 1$ units to each of $r_1$ A-classes (designated as group $G_1$) and $q_1$ units to each of $a - r_1$ A-classes (designated as group $G_2$).

2) $b = aq_2 + r_2$, $0 \le r_2 < a$.
To each A-class assign $q_2$ B-classes and then one extra B-class to each of $r_2$ A-classes. Make sure that b > a.

Table 7. Some three-stage nested designs with N = 48 observations.*

(Tree diagrams of the basic plans are not reproduced. Designs shown: the balanced design; two designs with balanced D.F. (15, 16, 16), namely the staggered (D1) design and the Calvin-Miller and Prairie (D2) design; other Prairie designs with unbalanced D.F. (23, 16, 8), (15, 24, 8) and (7, 32, 8); and a special staggered (D1) design for (15, 24, 8).)

* The number below each basic plan is the number of replications used for the complete experiment. The degrees of freedom (D.F.) are between A-classes, B in A classes and C in B in A classes, respectively.

Table 7a. Some Non-Balanced Four and Five-Stage Nested Designs

(Tree diagrams of the basic plans are not reproduced; only the class sizes that remain legible are listed.)

Bainbridge (Calvin-Miller) designs
   Four stages: $n_i = 4$; $n_{ij} = 3,1$; $n_{ijk} = 2,1,1$

Some Prairie designs
   Four stages: $n_i = 4$
   Five stages: $n_i = 5$; $n_{ij} = 3,2$; $n_{ijk} = 2,1,2$

Anderson five-stage staggered design*
   $n_i = 16, 8, 4, 2, 1$ (replicated $a_1, a_2, a_3, a_4, a_5$ times, respectively)
   $n_{ij} = 8, 4, 2, 1, 1$
   $n_{ijk} = 4, 2, 1, 1, 1$
   $n_{ijk\ell} = 2, 1, 1, 1, 1$

* $a_i$ replications of each basic plan plus $a_1$ replications of the full design given at the top of Table 1, giving $\sum_{i=1}^{5} a_i$ A-samples. This design is given in Anderson and Bancroft (1952) with $a_1 = 2$, $a_2 = 2$, $a_3 = 4$, $a_4 = 8$, $a_5 = 0$.

3) Within each A-class, assign the observations ($n_{ij}$) to the B-classes as equally as possible.

When $b \ge N/2$, the D2-design will usually satisfy the relations:

$|n_i - n_t| = 0$ or 1;

$|n_{ij} - n_{tm}| = 0$ or 1.
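Prairie's three construction steps can be sketched as a small function. This is an illustrative reading of the procedure, not Prairie's own algorithm: in particular, the text does not say which A-classes receive the extra B-class in step 2, so the sketch simply gives them to the first $r_2$ A-classes. The names `d2_design` and `design_df` are hypothetical.

```python
def d2_design(N, a, b):
    """Class sizes n_ij of a three-stage D2-type design.

    N : total observations, a : number of A-classes, b : total number of B-classes.
    Returns a list with one entry per A-class; each entry lists the n_ij
    for the B-classes inside that A-class.
    """
    if not a < b <= N:
        raise ValueError("need a < b <= N")
    q1, r1 = divmod(N, a)                       # step 1: observations per A-class
    q2, r2 = divmod(b, a)                       # step 2: B-classes per A-class
    n_i = [q1 + 1] * r1 + [q1] * (a - r1)
    b_i = [q2 + 1] * r2 + [q2] * (a - r2)       # assumption: first r2 A-classes get the extra
    design = []
    for n, bi in zip(n_i, b_i):
        q, r = divmod(n, bi)                    # step 3: spread n as equally as possible
        if q == 0:
            raise ValueError("an A-class has more B-classes than observations")
        design.append([q + 1] * r + [q] * (bi - r))
    return design


def design_df(design):
    """Degrees of freedom (A, B in A, C in B in A) for a three-stage design."""
    a = len(design)
    b = sum(len(A) for A in design)
    N = sum(sum(A) for A in design)
    return a - 1, b - a, N - b


print(design_df(d2_design(48, 16, 32)))   # basic plan 2,1 replicated 16 times
print(design_df(d2_design(48, 8, 40)))    # basic plan 2,1,1,1,1 replicated 8 times
```

With N = 48, the choices (a, b) = (16, 32), (24, 40), (16, 40) and (8, 40) give the degree-of-freedom assignments (15,16,16), (23,16,8), (15,24,8) and (7,32,8) named in Table 7.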

Consider the four D2-designs given in Table 7. In all cases, the $n_i$ are equal for a given design; the $n_{ij}$ are either 2 or 1. My staggered (D1) design does not have these features, e.g., $n_i$ = 2 and 4, $n_{ij}$ = 1 and 2. An additional benefit to be derived from a design such as the D2 (15,16,16) design is the facility to detect variance heterogeneity. If we indicate the three observations for a given A-class as $x_{11}$, $x_{12}$ and $x_{21}$, then a single degree of freedom contribution to SSC is $(x_{11} - x_{12})^2/2$. This can be computed separately for each of the 16 A-classes and tested for heterogeneity; if the A-classes are sampled on successive time periods, one could determine if there was a time trend in the variance estimates. The same procedure can be used with SSB, where the single-degree-of-freedom contribution is $(x_{11} + x_{12} - 2x_{21})^2/6$. For my staggered (D1) design, six observations must be secured at each step, giving two degrees of freedom for C and B at each step but with only 8 steps instead of 16.
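The two single-degree-of-freedom contributions for one A-class of the D2 (15,16,16) design can be computed directly from its three observations. The sketch below only restates the two formulas from the text; the function name and the sample values are made up for illustration.

```python
def single_df_contributions(x11, x12, x21):
    """Single-d.f. contributions of one A-class (B-class sizes 2 and 1).

    x11, x12 : the two observations in the first B-class
    x21      : the single observation in the second B-class
    Returns (contribution to SSB, contribution to SSC).
    """
    ssc = (x11 - x12) ** 2 / 2
    ssb = (x11 + x12 - 2 * x21) ** 2 / 6
    return ssb, ssc


# Hypothetical observations for three successive A-classes (e.g. time periods)
for obs in [(10.0, 12.0, 14.0), (9.5, 9.0, 11.0), (13.0, 10.0, 10.5)]:
    print(obs, single_df_contributions(*obs))
```

Listing the per-class values in time order, as in the loop above, is one simple way to look for the trend in the variance estimates that the text mentions.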

For each of 14 non-balanced designs, Prairie computed the efficiency ratios $E_a$, $E_b$, and $E_c$, where

$E_i = \dfrac{\mathrm{var}(\hat\sigma_i^2)\ \text{for a given non-balanced design}}{\mathrm{var}(\hat\sigma_i^2)\ \text{for the balanced design}},$

assuming the analysis of variance estimation procedure of Table 8. The design parameters considered were $\rho_1 = \sigma_a^2/\sigma_c^2$ and $\rho_2 = \sigma_b^2/\sigma_c^2$ = 1/10, 1/4, 1/2, 1, 2, 4, 10.

Special analyses for my staggered (D1) design and the Calvin-Miller (D2) design are presented in Table 9.


The computation of $\mathrm{var}(\hat\sigma_i^2)$ for the D2-designs (of which the Calvin-Miller is one example) is complicated by the fact that MSA and MSB are usually correlated; however, if the design consists of replications of one basic plan, all mean squares are multiples of $\chi^2$-variates. The mean squares for my staggered design are uncorrelated; however, MSA and MSB are determined by pooling several sums of squares which have different expectations. In Table 9a, each mean square is independently distributed as a multiple of a $\chi^2$-variate; however, the pooled mean squares (Table 8) are distributed as weighted sums of $\chi^2$-variates, e.g.,

$\mathrm{MSA} = (7\,\mathrm{MSA}_1 + 7\,\mathrm{MSA}_2 + \mathrm{MSA}_3)/15$

$= (\chi_1^2\sigma_1^2 + \chi_2^2\sigma_2^2 + \chi_3^2\sigma_3^2)/15,$

where $\chi_1^2$ and $\chi_2^2$ have 7 d.f. each and $\chi_3^2$ has 1 d.f.
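Because the pooled mean square is a weighted sum of independent $\chi^2$-variates, its variance follows from $\mathrm{var}(\chi^2_\nu) = 2\nu$. The sketch below works this out for the pooled MSA of the staggered design, using the expected mean squares of Table 9a; setting all three variance components equal to 1 is an arbitrary choice made only for the example, and the function name is illustrative.

```python
def pooled_ms_variance(dfs, expected_ms):
    """Variance of a pooled mean square sum(chi2_i * sigma_i^2) / sum(df_i).

    dfs         : degrees of freedom of the independent parts
    expected_ms : expected mean squares sigma_i^2 of those parts
    Uses var(chi-square with df degrees of freedom) = 2 * df.
    """
    total_df = sum(dfs)
    return 2 * sum(df * ems ** 2 for df, ems in zip(dfs, expected_ms)) / total_df ** 2


# Expected mean squares from Table 9a with sigma_a^2 = sigma_b^2 = sigma_c^2 = 1
sigma1 = 1 + 2 + 4            # A1 (Group 1)
sigma2 = 1 + 1 + 2            # A2 (Group 2)
sigma3 = 1 + 4 / 3 + 8 / 3    # A3 (Between Groups)
print(pooled_ms_variance([7, 7, 1], [sigma1, sigma2, sigma3]))
```

The same function applies to any pooling of independent mean squares, for example the pooled MSB obtained from B1 and B2.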

A D2-design which consists of replications of one basic plan (such as for (15,16,16)) is easier to administer and is more amenable to sequential experimentation than is my staggered design or any design with several parts in the basic plan. This feature is emphasized by T. R. Bainbridge (1965).

Calvin and Miller (1961) have constructed a four-stage analogue of the

D2(15,16,16) design given by Prairie. Bainbridge (1965) presents the same four-stage

design and extends this principle to the construction of designs for five and six

stages. The Bainbridge four and five-stage designs, my five-stage staggered design,

and possible Prairie four and five-stage designs are presented in Table 7a. It

should be indicated that the Bainbridge four-stage design does not meet the criteria set up by Prairie, because $n_{ij}$ = 1 or 3; similarly for all higher-stage Bainbridge designs. However the benefits mentioned for the three-stage design become even more important for higher stage designs, because it is possible to test for variance heterogeneity at each successive stage and sequential sampling becomes more important as the total number of samples increases.

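Table 8. Analysis of variance for general three-stage nested design

Source of variation              D.F.     Mean Square    Average value of Mean Square
Among A-classes                  a - 1    MSA            $\sigma_c^2 + K_1'\sigma_b^2 + K_2\sigma_a^2$
Among B-classes in A             b - a    MSB            $\sigma_c^2 + K_1\sigma_b^2$
Among C-classes in B in A        N - b    MSC            $\sigma_c^2$

$K_1 = \left[N - \sum_i \sum_j (n_{ij}^2/n_i)\right]/(b-a)$;

$K_1' = \left[\sum_i \sum_j (n_{ij}^2/n_i) - \sum_i \sum_j n_{ij}^2/N\right]/(a-1)$;

$K_2 = \left[N - \sum_i n_i^2/N\right]/(a-1)$.

$\hat\sigma_c^2 = \mathrm{MSC}$; $\hat\sigma_b^2 = \dfrac{\mathrm{MSB} - \mathrm{MSC}}{K_1}$; $\hat\sigma_a^2 = \dfrac{K_1\,\mathrm{MSA} - K_1'\,\mathrm{MSB} - (K_1 - K_1')\,\mathrm{MSC}}{K_1 K_2}$.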

Table 9a. Analysis of variance for the staggered (D1) design in Table 7

Source of variation      D.F.    M.S.     Average value of M.S.
A1 (Group 1)               7     MSA1     $\sigma_c^2 + 2\sigma_b^2 + 4\sigma_a^2 = \sigma_1^2$
A2 (Group 2)               7     MSA2     $\sigma_c^2 + \sigma_b^2 + 2\sigma_a^2 = \sigma_2^2$
A3 (Between Groups)        1     MSA3     $\sigma_c^2 + (4/3)\sigma_b^2 + (8/3)\sigma_a^2 = \sigma_3^2$
B1 (Group 1)               8     MSB1     $\sigma_c^2 + 2\sigma_b^2$
B2 (Group 2)               8     MSB2     $\sigma_c^2 + \sigma_b^2$
C (Group 1)               16     MSC      $\sigma_c^2$

Table 9b. Analysis of variance for the Calvin-Miller (D2) design in Table 7

Source of variation      D.F.    M.S.     Average value of M.S.
A                         15     MSA      $\sigma_c^2 + (5/3)\sigma_b^2 + 3\sigma_a^2$
B                         16     MSB      $\sigma_c^2 + (4/3)\sigma_b^2$
C                         16     MSC      $\sigma_c^2$
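The coefficients $K_1$, $K_1'$ and $K_2$ of Table 8 are easy to evaluate for any set of class sizes. The sketch below represents a design as a list of A-classes, each holding the sizes $n_{ij}$ of its B-classes; this representation and the function name are illustrative choices. Applied to the Calvin-Miller plan (16 A-classes with B-class sizes 2 and 1), it should return the coefficients that appear in Table 9b.

```python
from fractions import Fraction


def three_stage_coefficients(design):
    """K1, K1' and K2 of Table 8 for a three-stage nested design.

    design : list of A-classes; each A-class is a list of B-class sizes n_ij.
    """
    a = len(design)
    b = sum(len(A) for A in design)
    n_i = [sum(A) for A in design]
    N = sum(n_i)
    # sum over i, j of n_ij^2 / n_i
    within = sum(Fraction(sum(n * n for n in A), sum(A)) for A in design)
    # sum over i, j of n_ij^2 / N
    overall = Fraction(sum(n * n for A in design for n in A), N)
    k1 = (N - within) / (b - a)
    k1_prime = (within - overall) / (a - 1)
    k2 = (N - Fraction(sum(n * n for n in n_i), N)) / (a - 1)
    return k1, k1_prime, k2


calvin_miller = [[2, 1]] * 16
print(three_stage_coefficients(calvin_miller))
```

The same call with 8 A-classes of sizes 2,2 and 8 of sizes 1,1 gives the pooled coefficients for the staggered (D1) design.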


It is difficult to summarize Prairie's comparisons in a few words, mainly because

a reduction in variance for the estimate of one variance component will be offset by

an increase for some other component. This reemphasizes the need for a single

criterion.

Table 10 presents selected efficiency ratios for the six non-balanced designs

presented in Table 7. Ratios < 1 indicate superiority for the non-balanced design.

There is a loss of efficiency for $\hat\sigma_c^2$ directly proportional to the reduction in degrees of freedom from 24 for the balanced plan to 8 or 16 for the non-balanced plans. Only if $\rho_2 = \sigma_b^2/\sigma_c^2$ is large do the non-balanced plans give better estimators of $\sigma_b^2$, even though the non-balanced plans have more degrees of freedom for MSB than does the balanced plan with 12 d.f. The non-balanced designs often achieve considerable reduction in the variance of $\hat\sigma_a^2$.

The following summary seems in order for estimating $\sigma_a^2$:

(1) Since my staggered and the D2-designs have about the same efficiencies, the D2-designs might be preferred on the basis of simplicity of conducting the experiments and analyzing the data.

(2) For $\rho_1$ small, D2 with assignment (7,32,8) is very good; also for $\rho_1 = 1$ and $\rho_2$ large.

(3) For $\rho_1$ and $\rho_2$ large, D2 with assignment (15,24,8) is quite good; also for $\rho_1 = 1$ and $\rho_2 \le 1$.

(4) For $\rho_1$ large and $\rho_2 \le 1$, D2 with assignment (23,16,8) is very good.

However, we note that in some instances, there is considerable loss in the efficiency of estimating $\sigma_b^2$, e.g. (4) above. One of the merits of my staggered design is that it does not entail such a large loss. Additional research needs to be centered on obtaining a single criterion to be optimized and on extending Prairie's comparisons to non-balanced designs with more than three stages.

Table 10. Selected efficiency ratios for comparing the non-balanced designs with the balanced design in Table 8*

                                      Values of $E_a$ for $\rho_1$ equal to
$\rho_2$    Design D.F.    Type        1/4       1        4        $E_b$     $E_c$
1/4         (15,16,16)     D1         1.02      .89      .83      [...]     1.50
            (15,16,16)     D2         1.15      .89      .77      1.71      1.50
            (15,24,8)      D1          .88      .84      .82      2.50      3.00
            (15,24,8)      D2          .92      .82      .76      2.72      3.00
            (23,16,8)      D2         1.77      .91      [...]    3.89      3.00
            (7,32,8)       D2         [...]    1.17     1.46      2.27      3.00

1           (15,16,16)     D1          .93      .88      .83      1.05      1.50
            (15,16,16)     D2         1.03      .91      .79      1.15      1.50
            (15,24,8)      D1          .69      .75      .79      1.21      3.00
            (15,24,8)      D2          .72      [...]    .74      1.27      3.00
            (23,16,8)      D2         1.48      .99      .61      1.89      3.00
            (7,32,8)       D2          .46      .86     1.31      1.02      3.00

4           (15,16,16)     D1          .86      .85      .83       .88      1.50
            (15,16,16)     D2          .91      .88      .82       .85      1.50
            (15,24,8)      D1          .55      .60      .70       .67      3.00
            (15,24,8)      D2          .57      .61      [...]     .65      3.00
            (23,16,8)      D2         1.25     1.08      .75       .99      3.00
            (7,32,8)       D2          .28      [...]    .93      [...]     3.00

* Minimum values for given ($\rho_1$, $\rho_2$) underlined.


5. Comparisons of Estimators and Designs for Two-Way Classification Operations

Three unbiased methods for estimating variance components were compared for a two-way classification operation by Bush (1962) and subsequently published in Technometrics in 1963. Two of the procedures, A and B, are based on the respective methods of fitting constants and weighted squares of means described by Yates (1934) for fixed effects models. The third procedure, H, which uses unadjusted sums of squares, was developed by Henderson (1953) explicitly for random models. Bush considered only connected designs. The three procedures were compared on the basis of variances of estimated variance components for a number of experimental designs and parameter values (population variance components) with the aid of a UNIVAC 1105 computer. For each estimating procedure, efficiency factors relative to a balanced design were computed for the various non-balanced designs considered.

The model considered for occupied cells was

$$y_{ijk} = \mu + r_i + c_j + (rc)_{ij} + e_{ijk}, \quad i = 1,2,\ldots,a;\; j = 1,2,\ldots,b;\; k = 1,2,\ldots,n_{ij}.$$

Bush based his results on the cell means,

$$\bar y_{ij\cdot} = \hat\mu_{ij} = \mu + r_i + c_j + (rc)_{ij} + \bar e_{ij\cdot}.$$

The variance components were $\sigma_r^2$, $\sigma_c^2$, $\sigma_{rc}^2$ and $\sigma_e^2$. Eighteen sets of parameter values were used by Bush (1962): $\sigma_r^2$ ranged from 1/2 to 16, $\sigma_c^2$ from 0 to 16, $\sigma_{rc}^2$ from 0 to 16 and $\sigma_e^2 = 1$; four sets are presented in the Technometrics article.

Insufficient computer storage prevented our investigating designs larger than 6 x 6; extensions to designs with more rows and columns are planned in the future. Six non-balanced and one balanced 6 x 6 designs are presented in Table 11.


Table 11. Values of $n_{ij}$ for designs considered by Bush

Equal 6 x 6      S 16             S 22             C 18

1 1 1 1 1 1      1 1 0 0 0 0      2 1 0 0 0 0      1 1 1 0 0 0
1 1 1 1 1 1      1 1 1 0 0 0      1 2 1 0 0 0      1 1 1 0 0 0
1 1 1 1 1 1      0 1 1 1 0 0      0 1 2 1 0 0      0 1 1 1 0 0
1 1 1 1 1 1      0 0 1 1 1 0      0 0 1 2 1 0      0 0 1 1 1 0
1 1 1 1 1 1      0 0 0 1 1 1      0 0 0 1 2 1      0 0 0 1 1 1
1 1 1 1 1 1      0 0 0 0 1 1      0 0 0 0 1 2      0 0 0 1 1 1

C 24             L 20             L 24

2 1 1 0 0 0      1 1 0 0 0 0      1 1 0 0 0 0
2 1 1 0 0 0      1 1 0 0 0 0      1 1 0 0 0 0
0 2 1 1 0 0      1 1 0 0 0 0      2 1 0 0 0 0
0 0 1 1 2 0      1 1 0 0 0 0      1 2 0 0 0 0
0 0 0 1 1 2      1 1 1 1 1 1      1 1 2 1 1 1
0 0 0 1 1 2      1 1 1 1 1 1      1 1 1 2 1 1
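Since Bush restricted attention to connected designs, it may help to see how connectedness can be checked from the $n_{ij}$ alone. The following is a minimal Python sketch, not Bush's procedure: it treats rows and columns as nodes of a graph, joins a row to a column whenever the cell is occupied, and counts components. The function names and the `TWO_BLOCKS` layout (two disjoint 3 x 3 blocks, included only for contrast) are illustrative assumptions; `S16` and `L20` are copied from Table 11.

```python
from collections import deque

# Cell counts n_ij copied from Table 11 (one list per row of the design).
S16 = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
]
L20 = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1],
]
# Hypothetical contrast case: two disjoint 3 x 3 blocks (not one of Bush's designs).
TWO_BLOCKS = [[1, 1, 1, 0, 0, 0]] * 3 + [[0, 0, 0, 1, 1, 1]] * 3


def total_observations(design):
    """Total number of observations, i.e. the sum of all n_ij."""
    return sum(sum(row) for row in design)


def count_components(design):
    """Count connected components of the row/column graph of occupied cells."""
    n_rows, n_cols = len(design), len(design[0])

    def neighbours(node):
        kind, index = node
        if kind == "row":
            return [("col", j) for j in range(n_cols) if design[index][j] > 0]
        return [("row", i) for i in range(n_rows) if design[i][index] > 0]

    nodes = [("row", i) for i in range(n_rows)] + [("col", j) for j in range(n_cols)]
    seen = set()
    components = 0
    for start in nodes:
        if start in seen or not neighbours(start):
            continue  # already visited, or an entirely empty row/column
        components += 1
        seen.add(start)
        queue = deque([start])
        while queue:  # breadth-first search from this row or column
            node = queue.popleft()
            for nxt in neighbours(node):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return components


def is_connected(design):
    """A design is connected when all occupied rows and columns form one component."""
    return count_components(design) == 1
```

The totals returned by `total_observations` should agree with the number in each design label (16 for S 16, 20 for L 20).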

Only when the residual variance ($\sigma_{rc}^2$ or $\sigma_{rc}^2 + \sigma_e^2$) was larger than $\sigma_r^2$ and $\sigma_c^2$ was procedure H superior to procedures A and B for the non-balanced designs; however, for this situation, a balanced design is better than a non-balanced one (and in this case, the estimators are identical). When $\sigma_r^2$ and $\sigma_c^2$ were larger than the residual variance, a non-balanced S or C design was preferred to estimate $\sigma_r^2$ and $\sigma_c^2$; in this case, procedure A was slightly better than B.

The L-designs originally were considered by Gaylor (1960). He also considered some unconnected designs, which he called Balanced Disjoint Rectangles (BDR). Bush's S and C designs are connected designs similar to Gaylor's BDR-designs. Examples of two BDR designs, and the resultant analysis of variance for one of them, are given in Table 12. The major difficulty is how to use the mean square between groups to improve the estimates of $\sigma_r^2$ and $\sigma_c^2$. Note that separate estimates of $\sigma_{rc}^2$ and $\sigma_e^2$ cannot be obtained in Table 12. One might use maximum likelihood with the four mean squares; this is equivalent to an iterated least squares procedure, discussed in my 1961 paper.


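Table 12. Some balanced disjoint rectangle designs and the analysis of variance

First design         Second design

1 1 1 0 0 0          1 1 0 0 0 0 0 0
1 1 1 0 0 0          1 1 0 0 0 0 0 0
1 1 1 0 0 0          0 0 1 1 0 0 0 0
0 0 0 1 1 1          0 0 1 1 0 0 0 0
0 0 0 1 1 1          0 0 0 0 1 1 0 0
0 0 0 1 1 1          0 0 0 0 1 1 0 0
                     0 0 0 0 0 0 1 1
                     0 0 0 0 0 0 1 1

Analysis of variance for second design

Source of variation    D.F.   Mean square   Average value of mean square

Between Groups         3      MSG           $\sigma_e^2 + \sigma_{rc}^2 + 2\sigma_r^2 + 2\sigma_c^2$
Rows in Groups         4      MSR           $\sigma_e^2 + \sigma_{rc}^2 + 2\sigma_r^2$
Columns in Groups      4      MSC           $\sigma_e^2 + \sigma_{rc}^2 + 2\sigma_c^2$
(R x C) in Groups      4      MSE           $\sigma_e^2 + \sigma_{rc}^2$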

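Table 13. Efficiency of some two-way designs for estimating $\sigma_r^2$ and $\rho_r$, N = 30*

$\rho_r$        r     c     s     $E(\hat\sigma_r^2)$   $E(\hat\rho_r)$

[illegible]     3     10    0     .69                   .74
                5     6     0     .95                   .98
                6     5     0     1.00                  1.00
                7     5     2     .98                   .95
                8     4     6     1.00                  .94

1.0             10    3     0     .88                   1.00
                11    3     8     .90                   .97
                14    3     2     .96                   .83
                15    2     0     1.00                  .84

[illegible]     10    3     0     .74                   1.00
                12    3     6     .82                   .97
                15    2     0     .97                   .93
                19    2     11    1.00                  .63

4.0             10    3     0     .59                   1.00
                15    2     0     .84                   .99
                20    2     10    .94                   [illegible]
                24    2     6     1.00                  .53

* r = no. rows; c = no. columns; N = rc if s = 0; N = r(c-1) + s if s > 0. Only three of the four $E(\hat\rho_r)$ values in the last block are legible in the source, so the placement of .53 is uncertain.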


Gaylor considered the problem of optimal designs to estimate $\sigma_r^2$. It was shown first that if the design were restricted to a class of designs in which $n_{ij} = 0$ or $n$ ($n$ integer), $n$ should equal one. Hence each cell would either be empty or contain only one observation. In this case, only $\sigma_r^2$, $\sigma_c^2$ and $\sigma^2 = \sigma_e^2 + \sigma_{rc}^2$ could be estimated. Using Method A (fitting constants), we came up with the following rule (see my 1961 paper, p. 811):

1) If $\rho_r > \sqrt{2}$, use one column with $r = N - r'$ rows and a second column with $r'$ of these rows, where $r'$ is the integer ($\ge 2$) which is closest to $1 + (N - 2)/(\rho_r\sqrt{2})$.

2) When $\rho_r \le \sqrt{2}$, use a balanced design with the number of columns, $c$, as the integer closest to

$$\frac{\rho_r(N - 1/2) + N + 1/2}{\rho_r(N - 1/2) + 2}.$$

In general, $N/c$ will not be an integer; hence it probably would be advisable to use a few more or fewer observations to obtain balance.
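The two-part rule can be sketched as a small Python function. This is an illustration only. It assumes the threshold is $\sqrt{2}$, that $r'$ is the integer nearest $1 + (N-2)/(\rho_r\sqrt{2})$, and that $c$ is the integer nearest $[\rho_r(N - 1/2) + N + 1/2]/[\rho_r(N - 1/2) + 2]$. The function name `gaylor_design`, the helper `nearest_int` and the returned dictionary keys are hypothetical.

```python
import math


def nearest_int(x):
    """Round to the nearest integer (halves round up)."""
    return math.floor(x + 0.5)


def gaylor_design(N, rho_r):
    """Suggest a design for estimating sigma_r^2 from N observations.

    rho_r is the assumed ratio sigma_r^2 / sigma^2 (a guess made by the user).
    """
    root2 = math.sqrt(2)
    if rho_r > root2:
        # Rule 1: one full column of N - r' rows, plus r' of those rows
        # repeated in a second column.
        r_prime = max(2, nearest_int(1 + (N - 2) / (rho_r * root2)))
        return {
            "type": "two-column",
            "rows": N - r_prime,
            "rows_in_second_column": r_prime,
        }
    # Rule 2: balanced design with c columns; N / c need not be an integer.
    c = nearest_int((rho_r * (N - 0.5) + N + 0.5) / (rho_r * (N - 0.5) + 2))
    return {"type": "balanced", "columns": c, "rows_per_column": N / c}
```

For N = 30 this sketch gives a balanced design with 2 columns (15 rows) when $\rho_r = 1$, and 24 rows with 6 of them repeated when $\rho_r = 4$. Both layouts appear in Table 13 with an efficiency of 1.00 for $\hat\sigma_r^2$ (r = 15, c = 2, s = 0 and r = 24, c = 2, s = 6).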

Efficiency factors for some designs used to estimate $\sigma_r^2$ and $\rho_r = \sigma_r^2/\sigma^2$ are presented in Table 13. Gaylor showed that the loss of efficiency due to the use of the incorrect $\rho_r$ is of the same order of magnitude as found by Crump. If only $\sigma_c^2$ or $\rho_c$ is of interest, interchange r and c in the above results.

The results in Table 13 show that if one wants to obtain good estimates of both $\sigma_r^2$ and $\sigma_c^2$, he must alter the design, because an optimal design for $\sigma_r^2$ furnishes little information on $\sigma_c^2$ (note the small number of columns used). This led to Gaylor's investigation of the L-design, which consists of a good design for $\sigma_r^2$ superimposed on a good design for $\sigma_c^2$. The diagonalized designs (S, C or BDR) have the same principle with sequences of balanced designs. Unfortunately no optimal rule for the formation of such designs has been developed.


Computational procedures for non-balanced multi-factor designs may be very complicated. Research on best estimators and designs is being planned.

6. Two-stage Nested Design with Composited Samples

In cases where the measurement cost is high, it may be advantageous to composite some of the samples and take measurements on the composited samples. This may be the case for sampling of many bulk materials, especially those involving chemical assays. One of my graduate students, K. L. Kussmaul, is considering a two-stage sampling problem with compositing of samples within the first-stage (A) classes. The model for the composited measurement will be:

$$y_{ij} = \mu + a_i + \frac{1}{n_{ij}}\sum_{k=1}^{n_{ij}} b_{k(i)}, \quad i = 1,2,\ldots,a;\; j = 1,2,\ldots,r_i,$$

where $n_{ij}$ samples are composited for the $j$th measurement of the $i$th A-class. As before, $\sum_j n_{ij} = n_i$ and $\sum_i\sum_j n_{ij} = N$; in addition, we assume the total number of measurements is $R = \sum_i r_i$.

Four procedures to estimate the variance components $\sigma_a^2$ and $\sigma_b^2$ are being studied:

(1) Method of Weighted Means. Weights are proportional to the number of samples composited per measurement.

$$SSA_1 = \sum_i \frac{1}{n_i}\Big[\sum_j n_{ij}y_{ij}\Big]^2 - \frac{1}{N}\Big[\sum_i\sum_j n_{ij}y_{ij}\Big]^2,$$

$$SSB_1 = \sum_i\sum_j n_{ij}y_{ij}^2 - \sum_i \frac{1}{n_i}\Big[\sum_j n_{ij}y_{ij}\Big]^2.$$

Analysis of Variance

Source of Variation    D.F.    M.S.

A                      a-1     $MSA_1$
B in A                 R-a     $MSB_1$

The estimators are:

$$\hat\mu_1 = \sum_i\sum_j n_{ij}y_{ij}/N; \qquad \hat\sigma_{b1}^2 = MSB_1; \qquad \hat\sigma_{a1}^2 = (MSA_1 - MSB_1)/k_1.$$

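The weighted-means computations of procedure (1) can be sketched in Python as follows. This is a sketch under stated assumptions, not Kussmaul's program. The coefficient $k_1$ is not defined in the surviving text; the code assumes the usual one-way value $k_1 = (N - \sum_i n_i^2/N)/(a - 1)$, which follows because each weighted class mean behaves like the mean of $n_i$ individual samples. The function name and the input layout (one list of $n_{ij}$ and one list of $y_{ij}$ per A-class) are hypothetical.

```python
def weighted_means_estimates(n, y):
    """Method of weighted means for composited two-stage data.

    n[i][j] is the number of samples in composite j of A-class i,
    y[i][j] is the measurement made on that composite.
    """
    a = len(n)                                   # number of A-classes
    R = sum(len(row) for row in n)               # number of measurements
    class_sizes = [sum(row) for row in n]        # n_i
    N = sum(class_sizes)                         # total number of samples
    class_totals = [
        sum(nij * yij for nij, yij in zip(n_row, y_row))
        for n_row, y_row in zip(n, y)
    ]
    grand_total = sum(class_totals)

    between = sum(t * t / ni for t, ni in zip(class_totals, class_sizes))
    ssa = between - grand_total ** 2 / N
    ssb = sum(
        nij * yij ** 2
        for n_row, y_row in zip(n, y)
        for nij, yij in zip(n_row, y_row)
    ) - between

    msa = ssa / (a - 1)
    msb = ssb / (R - a)
    # Assumed coefficient of sigma_a^2 in E(MSA1); see the lead-in.
    k1 = (N - sum(ni ** 2 for ni in class_sizes) / N) / (a - 1)
    return {
        "mu": grand_total / N,
        "sigma_b2": msb,
        "sigma_a2": (msa - msb) / k1,
        "MSA": msa,
        "MSB": msb,
        "k1": k1,
    }
```

The estimate of $\sigma_a^2$ can be negative when $MSA_1 < MSB_1$; the sketch returns the raw value and leaves any truncation at zero to the user.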
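(2) Method of Unweighted Class Means. Each A-class mean ($\bar y_{i\cdot}$) is given equal weight for $SSA_2$, where $\bar y_{i\cdot} = \sum_j (n_{ij}y_{ij})/n_i$; hence,

$$SSA_2 = \sum_i \bar y_{i\cdot}^2 - \frac{1}{a}\Big[\sum_i \bar y_{i\cdot}\Big]^2 = \sum_i \Big[\frac{1}{n_i}\sum_j n_{ij}y_{ij}\Big]^2 - \frac{1}{a}\Big[\sum_i \frac{1}{n_i}\sum_j n_{ij}y_{ij}\Big]^2.$$

The expected value of $MSA_2 = SSA_2/(a-1)$ is $\sigma_a^2 + k_2\sigma_b^2$, where $k_2 = \frac{1}{a}\sum_i \frac{1}{n_i}$. The same estimator of $\sigma_b^2$ is used as in (1), i.e., $\hat\sigma_{b2}^2 = MSB_1$. Hence

$$\hat\sigma_{a2}^2 = MSA_2 - k_2\,MSB_1.$$

(3) Method of Unweighted Measurements.

$$SSA_3 = \sum_i \frac{1}{r_i}\Big[\sum_j y_{ij}\Big]^2 - \frac{1}{R}\Big[\sum_i\sum_j y_{ij}\Big]^2,$$

$$E(SSA_3) = \frac{1}{R}\Big(R^2 - \sum_i r_i^2\Big)\sigma_a^2 + \Big[\sum_i\sum_j \Big(\frac{1}{r_i} - \frac{1}{R}\Big)\frac{1}{n_{ij}}\Big]\sigma_b^2.$$

Again one would use $MSB_1$ to estimate $\sigma_b^2$; hence,

$$\hat\sigma_{a3}^2 = \frac{R\Big\{SSA_3 - \Big[\sum_i\sum_j \Big(\frac{1}{r_i} - \frac{1}{R}\Big)\frac{1}{n_{ij}}\Big]MSB_1\Big\}}{R^2 - \sum_i r_i^2}.$$

An estimator $\hat\mu_3$ is also given at this point, but its expression is illegible in the source.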
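Procedures (2) and (3) differ from (1) only in how the between-class sum of squares is formed, so they can be sketched side by side. The Python below is an illustration under stated assumptions: both estimators subtract the appropriate multiple of $MSB_1$ (the within-class mean square of the weighted analysis, with $R - a$ degrees of freedom), using $k_2 = \frac{1}{a}\sum_i 1/n_i$ for (2) and the two coefficients of $E(SSA_3)$ for (3). Function names and the input layout are hypothetical.

```python
def pooled_msb(n, y):
    """MSB1: weighted within-class mean square with R - a degrees of freedom."""
    ss = 0.0
    for n_row, y_row in zip(n, y):
        n_i = sum(n_row)
        total = sum(nij * yij for nij, yij in zip(n_row, y_row))
        ss += sum(nij * yij ** 2 for nij, yij in zip(n_row, y_row)) - total ** 2 / n_i
    R = sum(len(n_row) for n_row in n)
    return ss / (R - len(n))


def unweighted_class_means_sigma_a2(n, y):
    """Procedure (2): every A-class mean gets equal weight."""
    a = len(n)
    means = [
        sum(nij * yij for nij, yij in zip(n_row, y_row)) / sum(n_row)
        for n_row, y_row in zip(n, y)
    ]
    msa2 = (sum(m * m for m in means) - sum(means) ** 2 / a) / (a - 1)
    k2 = sum(1.0 / sum(n_row) for n_row in n) / a
    return msa2 - k2 * pooled_msb(n, y)


def unweighted_measurements_sigma_a2(n, y):
    """Procedure (3): every measurement gets equal weight."""
    r = [len(y_row) for y_row in y]              # r_i
    R = sum(r)
    totals = [sum(y_row) for y_row in y]
    ssa3 = sum(t * t / r_i for t, r_i in zip(totals, r)) - sum(totals) ** 2 / R
    coef_a = (R * R - sum(r_i * r_i for r_i in r)) / R
    coef_b = sum(
        (1.0 / r_i - 1.0 / R) / nij
        for n_row, r_i in zip(n, r)
        for nij in n_row
    )
    return (ssa3 - coef_b * pooled_msb(n, y)) / coef_a
```

With equal $n_{ij}$ and equal $r_i$ the two functions should return the same value, in line with the remark that all estimation methods then give identical results.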


(4) Method of Maximum Likelihood. This method has been studied only for the case of equal $n_{ij}$ and equal $r_i$, when all estimation methods give identical results.

The evaluation of various compositing plans is based on minimizing the variance of the estimated parameters subject to a fixed sampling cost, $C = aC_a + NC_b + RC_c$, where $C_a$, $C_b$ and $C_c$ are the respective unit A-class, B-class and measurement costs. The usual difficulty faced here is that comparative variances of the three estimators ($\hat\mu$, $\hat\sigma_a^2$, $\hat\sigma_b^2$) are functions of $\rho = \sigma_a^2/\sigma_b^2$; in practice, one probably would make an initial guess ($\rho_0$) and attempt to find compositing plans with high precision for values of $\rho$ in the neighborhood of $\rho_0$.
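To see which combinations $(a, R, N)$ a fixed budget allows, the cost equation can be enumerated directly. The Python sketch below is illustrative only. It assumes integer unit costs, at least two A-classes, $R > a$ (so that the within-class mean square has at least one degree of freedom) and $N \ge R$ (each measurement uses at least one sample). The default costs $C_a = 2$, $C_b = 1$, $C_c = 4$ are those stated in the footnote to Table 14; the function name is hypothetical.

```python
def feasible_plans(total_cost, c_a=2, c_b=1, c_c=4):
    """List all (a, R, N) with a*c_a + N*c_b + R*c_c == total_cost.

    Assumed constraints: a >= 2, R >= a + 1 and N >= R.
    """
    plans = []
    a = 2
    while a * c_a <= total_cost:
        R = a + 1
        # The loop condition is equivalent to requiring N >= R.
        while a * c_a + R * c_c + R * c_b <= total_cost:
            remainder = total_cost - a * c_a - R * c_c
            if remainder % c_b == 0:
                plans.append((a, R, remainder // c_b))
            R += 1
        a += 1
    return plans
```

Under these assumptions `feasible_plans(50)` returns twenty triples, which are the $(a, R, N)$ combinations listed for C = 50 in Table 14. Choosing the $r_i$ and $n_i$ within each triple is a separate step that the sketch does not attempt.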

A number of empirical studies have been made; some of the results are presented in Table 14. These studies indicate that the best compositing plan seems to be balanced or nearly balanced; hence, there is little or no problem of selection of the best estimator, as indicated above.

It may be desirable, especially for very costly measurements, to composite samples from more than one A-class. Cameron (1951) discussed two such plans for sampling to estimate the clean content of shipments of wool; one of these plans was developed by Mr. Tanner, who is presenting a paper on composite samples at this Seminar. At present, Mr. Kussmaul is considering some optimal sampling plans for these more complicated compositing procedures.


Table 14. Efficiencies of Various Compositing Plans in Estimating $\mu$, $\sigma_a^2$ and $\sigma_b^2$ Using Estimating Procedure (1) with Fixed Total Cost C*

Columns under each value of $\rho$: E(mu) = $E(\hat\mu)$, E(b) = $E(\hat\sigma_b^2)$, E(a) = $E(\hat\sigma_a^2)$.

                                          rho = 4                 rho = 1                 rho = 1/4
a   R   N    r_i**        n_i**           E(mu)  E(b)  E(a)       E(mu)  E(b)  E(a)       E(mu)  E(b)  E(a)

C = 50
6   7   10   2,1(5)       2(4),1(2)       100    14    99         95     14    55         75     14    14
5   8   8    2(3),1(2)    2(3),1(2)       82     43    82         77     43    60         60     43    22
5   7   12   2(2),1(3)    3(2),2(3)       89     29    94         91     29    83         80     29    37
5   6   16   2,1(4)       4,3(4)          94     14    100        100    14    94         96     14    42
4   8   10   2(4)         3(2),2(2)       72     57    72         74     57    71         66     57    41
4   7   14   2(3),1       4(2),3(2)       75     43    77         81     43    86         80     43    61
4   6   18   2(2),1(2)    5(2),4(2)       77     29    80         86     29    95         91     29    76
4   5   22   2,1(3)       6(2),5(2)       78     14    [illegible] 89    14    100        100    14    79
3   8   12   3(2),2       4(3)            58     71    53         64     71    63         65     71    55
3   7   16   3,2(2)       6,5(2)          58     57    54         67     57    70         74     57    72
3   6   20   2(3)         7(2),6          59     43    55         69     43    75         81     43    86
3   5   24   2(2),1       8(3)            60     29    56         71     29    78         87     29    96
3   4   28   2,1(2)       10,9(2)         60     14    57         72     14    80         91     14    100
2   9   10   5,4          5(2)            39     100   27         44     100   35         48     100   36
2   8   14   4(2)         7(2)            40     86    28         46     86    38         55     86    48
2   7   18   4,3          9(2)            40     71    [illegible] 48    71    41         60     71    [illegible]
2   6   22   3(2)         11(2)           40     57    29         49     57    42         64     57    [illegible]
2   5   26   3,2          13(2)           40     43    29         49     43    43         66     43    69
2   4   30   2(2)         15(2)           40     29    29         50     29    44         68     29    74
2   3   34   2,1          17(2)           40     14    29         50     14    45         70     14    76

C = 48
4   8   8    2(4)         2(4)            73     57    10?        11?    51?   62         58     57    30
3   6   18   2(3)         6(3)            59     43    55         68     43    73         18?    43    78

* $C = aC_a + NC_b + RC_c$, where $C_a = 2$, $C_b = 1$ and $C_c = 4$; a A-classes, $N = \sum n_i$ samples and $R = \sum r_i$ composites (measurements); $\rho = \sigma_a^2/\sigma_b^2$. E = relative efficiency as percentage. Entries followed by "?" are doubtful readings of the source.

** 1(5) indicates 1,1,1,1,1; etc.


7. Bayesian Estimation of Variance Components in a Two-step Estimation Process for Two-stage Nested Designs

We are now considering a simplified model for a two-stage nested design,

$$y_{ij} = a_i + b_{j(i)}, \quad i = 1,2,\ldots,a;\; j = 1,2,\ldots,n_i,$$

where it is assumed that $\mu$ is zero (or if known to be some other value, the observations are coded so that the new mean is zero). This assumption may not be too unrealistic in many processes, and where it is not true the approximation of subtracting the sample mean should not seriously affect the results. The analysis of variance will now have $a$ degrees of freedom for MSA, and the expected value of MSA is $\sigma_b^2 + N\sigma_a^2/a$. Hence $\hat\sigma_a^2 = a(MSA - MSB)/N$. The optimal allocation of degrees of freedom $(a, N-a)$ to minimize the variance of $\hat\sigma_a^2$ is given by the equation

$$\frac{a}{N-a} = \rho = \frac{\sigma_a^2}{\sigma_b^2}.$$
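The allocation rule can be checked numerically. The Python sketch below is an illustration under assumptions that the text does not spell out: normally distributed effects, a balanced layout with $N/a$ observations per class, and independent mean squares, so that $\mathrm{Var}(MSA) = 2[E(MSA)]^2/a$ and $\mathrm{Var}(MSB) = 2\sigma_b^4/(N-a)$. The function names are hypothetical.

```python
def variance_sigma_a2(a, N, rho, sigma_b2=1.0):
    """Variance of a*(MSA - MSB)/N under the assumptions in the lead-in."""
    expected_msa = sigma_b2 * (1.0 + N * rho / a)   # sigma_b^2 + N*sigma_a^2/a
    var_msa = 2.0 * expected_msa ** 2 / a           # a d.f. for MSA
    var_msb = 2.0 * sigma_b2 ** 2 / (N - a)         # N - a d.f. for MSB
    return (a / N) ** 2 * (var_msa + var_msb)


def optimal_a(N, rho):
    """Solve a / (N - a) = rho for a (not necessarily an integer)."""
    return N * rho / (1.0 + rho)


def best_integer_a(N, rho):
    """Brute-force search over a = 1, ..., N - 1 for the smallest variance."""
    return min(range(1, N), key=lambda a: variance_sigma_a2(a, N, rho))
```

For N = 100, the brute-force minimizer coincides with the closed-form value at $\rho$ = 1/4, 1 and 4 (a = 20, 50 and 80). Since $\rho$ is unknown in practice, the function is best read as a way to explore how sensitive the variance is to a wrong guess.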

A. I. Weiss is investigating a two-stage procedure:

(1) $N_1$ ($< N$) observations are observed first with $a_1$ A-classes, say, $a_1 = N_1/2$. From these are obtained $MSA_1$ and $MSB_1$, based on $a_1$ and $N_1 - a_1$ degrees of freedom, respectively, and $\hat\rho_1 = a_1(MSA_1 - MSB_1)/N_1(MSB_1)$.

(2) The degrees of freedom ($a_2$ and $N_2 - a_2 = N - N_1 - a_2$) for a second step are then allocated as follows to minimize the variance of the combined estimator of $\sigma_a^2$:

$$\frac{a_2}{N - a_1 - a_2} = \frac{N_2}{N}\hat\rho_1.$$

The combined estimator of $\rho$ is

$$\hat\rho_c = (a_1 + a_2)(MSA_c - MSB_c)/N(MSB_c),$$

where $MSA_c = (SSA_1 + SSA_2)/(a_1 + a_2)$ and $MSB_c = (SSB_1 + SSB_2)/(N - a_1 - a_2)$, with $SSA_2$ and $SSB_2$ obtained in the second step.

Of course, the second step allocation is still only a guess, but is based on experimental evidence.
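The first-step and combined estimators of $\rho$ share one formula, which the following Python sketch makes explicit. It is an illustration, not Weiss's procedure: it only pools the sums of squares and degrees of freedom as described above and does not attempt the second-step allocation itself. Function and argument names are hypothetical.

```python
def rho_hat(ss_a, ss_b, a, n_total):
    """Estimate rho as a*(MSA - MSB) / (n_total*MSB).

    ss_a has a degrees of freedom (the mean is assumed known to be zero),
    ss_b has n_total - a degrees of freedom.
    """
    ms_a = ss_a / a
    ms_b = ss_b / (n_total - a)
    return a * (ms_a - ms_b) / (n_total * ms_b)


def combined_rho_hat(ss_a1, ss_b1, a1, ss_a2, ss_b2, a2, n_total):
    """Pool both steps: MSAc has a1 + a2 d.f., MSBc has n_total - a1 - a2 d.f."""
    return rho_hat(ss_a1 + ss_a2, ss_b1 + ss_b2, a1 + a2, n_total)
```

For example, with $N_1 = 40$, $a_1 = 20$, $SSA_1 = 60$ and $SSB_1 = 20$ the first-step estimate is 1; adding a second step with $a_2 = 30$, $SSA_2 = 90$ and $SSB_2 = 30$ for N = 100 leaves the combined estimate at 1. Negative estimates are possible when the A mean square falls below the B mean square.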

Since the above results involve guesses for $\rho$, one naturally considers the use of prior distributions for $\rho$, and collaterally, Bayesian estimates of $\sigma_a^2$. In order to make the mathematics tractable, Weiss is using Gamma-function priors. The prior for $\rho$, written $g(\rho)$ for $0 < \rho < \infty$, is not fully legible in the source; the surviving fragments are the exponent $p + q - 2$ and the term $r_1 + r_2 + r_1 k\rho$.

Weiss conceives of the prior distribution being based on a hypothetical experiment, from which we obtain $r_1 = SSB_0$ and $r_2 = SSA_0$. In order to have a finite second moment for $g(\rho)$, the smallest values of $p$ and $q$ are 2 and 4, respectively. The value of $E(\rho)$ given for $p = 2$ and $q = 4$ is illegible in the source. This "minimum-information" prior distribution is the same as a posterior distribution based on 6 d.f. for $SSA_0$ and 2 d.f. for $SSB_0$, coupled with a non-informative prior ($p = q = 1$; $r_1 = r_2 = 0$).

Weiss concentrated his attention on obtaining good estimators of $\sigma_a^2$ in the neighborhood of $\rho = 1$, i.e., $\sigma_a^2 = \sigma_b^2$; then he studied the robustness of various procedures when $\rho$ deviates from 1, e.g., $\rho$ = 1/10, 1/4, 1/2, 2, 4 and 10. For a standard, he used the usual one-step experiment with N/2 A-classes (and hence N/2 d.f. for both SSA and SSB), so that k = 2.


For all two-step procedures, the first step had $N_1/2$ degrees of freedom for both $SSA_1$ and $SSB_1$, i.e., $a_1/(N_1 - a_1) = 1$ and k = 2. If priors were used, various procedures were introduced to determine the ratio of $r_2$ to $r_1$. In addition to Procedure A, the non-informative prior mentioned above, three minimum information priors were considered:

B. Prior based on $E(\rho^2) = 1$. The allocation to minimize the variance of $\hat\sigma_a^2$ at the first step is

$$\frac{a_1}{N_1 - a_1} = \sqrt{E(\rho^2)} = 1; \qquad r_2/r_1 = 1.$$

Naturally this allocation does not minimize the variance of the combined estimator of $\sigma_a^2$.

C. Prior based on $E(\rho) = 1$. This tends to concentrate information near $\rho = 1$; $r_2/r_1 = 3$.

D. Prior based on $\hat\rho_0 = (MSA_0 - MSB_0)/2MSB_0 = 1$. This returns to the concept of a previous hypothetical experiment, with the estimator concentrated near 1; $r_2/r_1 = 9$.

These priors require further specification, because $MSA_0$ and $MSB_0$ are unknown. Weiss recommends estimating them from the first step.

Allocation at the second step for B, C and D would be based on the optimal allocation

$$\frac{a_2}{N - a_1 - a_2} = \frac{N_2}{N}\sqrt{E(\rho^2)},$$

where $E(\rho^2)$ is determined from the posterior distributions obtained after the first step. After the second step is completed, final posterior distributions were used to obtain the estimates of $\sigma_a^2$. Since this estimation procedure was quite complicated, empirical sampling procedures were used to determine the efficiency relative to the one-step standard procedure. These results for N = 100 are presented in Table 15; other results for N = 25 and 50 are also available. Some one-step prior procedures are included; these are indicated by $N_1 = 100$.


Table 15. Percent Reduction in Mean Square Error of Two-Step and Prior Procedures over One-Step Standard Procedure, N = 100 (a)

                        rho
Prior    N1     1/10    1/4     1/2     1       2       4       10

None     10     25      -3      -30     -40     -19     9       30
None     20     38      17      -14     -20     -4      18      28
None     40     35      14      -4      -7      2       12      24

A        10     -100    -60     -30     -29     -15     8       27
         20     10      7       -13     -12     -6      12      21
         40     44      36      8       -2      -8      5       5

B        10     52      51      20      4       0       7       4
         40     74      57      16      3       15      2       -2
         100    56      51      26      0       10      -2      3

C        40     40      43      14      10      19      15      14
         100    34      36      21      9       5       5       5

D        40     -3      16      -8      1       10      16      22

(a) Negative results indicate the standard procedure is superior; procedures using no prior are unbiased, i.e., mean square error equals variance. Additional sampling is being conducted.

If no prior is used, $N_1 = 40$ is recommended, because at worst it is only 6% less efficient than the one-step plan ($\rho = 1$) and is almost as efficient as $N_1 = 20$ for large or small $\rho$. Of the priors, C in two steps is recommended; in general this shows considerable improvement over the no-prior two-step procedure, especially for $\rho$ between 1/2 and 2. As indicated in Table 15, further sampling is being conducted.

REFERENCES

Anderson, R. L. 1961. Designs for estimating variance components. Proc. Seventh Conf. Des. Expts. Army Res., Dev., Test., ARODR 62-2, 781-823. Inst. of Stat. Mimeo Series No. 310.

Anderson, R. L. and T. A. Bancroft. 1952. Statistical Theory in Research. McGraw-Hill, New York.

Bainbridge, T. R. 1965. Staggered, nested designs for estimating variance components. Industrial Quality Control 22(1):12-20.

Bush, N. 1962. Estimating variance components in a multi-way classification. Unpublished Ph.D. Thesis, North Carolina State University, Raleigh. Inst. of Stat. Mimeo Series No. 324.

Bush, N. and R. L. Anderson. 1963. A comparison of three different procedures for estimating variance components. Technometrics 5:421-440.

Calvin, L. D. and J. D. Miller. 1961. A sampling design with incomplete dichotomy. Agron. J. 53:325-328.

Cameron, J. M. 1951. The use of components of variance in preparing schedules for sampling of baled wool. Biometrics 7(1):83-96.

Crump, P. P. 1954. Optimal designs to estimate the parameters of a variance component model. Unpublished Ph.D. Thesis, North Carolina State University, Raleigh.

Gaylor, D. W. 1960. The construction and evaluation of some designs for the estimation of parameters in random models. Unpublished Ph.D. Thesis, North Carolina State University, Raleigh. Inst. of Stat. Mimeo Series No. 256.

Henderson, C. R. 1953. Estimation of variance and covariance components. Biometrics 9:226-252.

Newton, R. G., M. W. Philpott, H. F. Smith and W. G. Wren. 1951. Variability of Malayan rubber. Indus. and Engr. Chem. 43:329.

Prairie, R. R. 1961. Some results concerning the reduction of product variability through the use of variance components. Proc. Seventh Conf. Des. Expts. Army Res., Dev., Test., ARODR 62-2, 655-688.

Prairie, R. R. 1962. Optimal designs to estimate variance components and to reduce product variability for nested classifications. Unpublished Ph.D. Thesis, North Carolina State University, Raleigh. Inst. of Stat. Mimeo Series No. 313.

Yates, F. 1934. The analysis of multiple classifications with unequal numbers in the different classes. J. Amer. Stat. Ass. 29:51-66.