1 Design & Analysis of Multi-Stratum Randomized Experiments Ching-Shui Cheng June 5, 2008...

Preview:

Citation preview

1

Design & Analysis of Multi-Stratum Randomized Experiments

Ching-Shui Cheng

June 5, 2008

National SunYat-sen University

2

Randomization model, Null ANOVA, Orthogonal designs

3

4

Randomization model for designs with simple block structures

where are the treatment effects.

Nelder (1965a) gives simple rules for determining .

5

6

Block designs (b/k)

The covariance matrix has three different entries: a common variance and two covariances depending on whether the corresponding pair of observations are in the same block or not.

7

A model of this form also arises from the following mixed-

effects model:

Random block effects

8

9

has spectral form

where is the orthogonal projection matrix onto theeigenspace of with eigenvalue

The eigenspaces are independent of the values of thevariances and covarinces: co-spectral.

Each of these eignespaces is called a stratum.

10

Completely randomized design

11

Block design (b/k)

12

Row-column design ( )

13

Total sum of squares

14

Total variability

15

Suppose (in which case does not contain treatment effects, and therefore measures variability among the experimental units)It can be shown that

th stratum variance

Null ANOVA

16

One error term

Complete randomization

17

b/k: s=2

18

19

Two error terms

20

21

22

Three error terms

23

24

Nelder (1965a) gave simple rules for determiningthe degrees of freedom, projections onto the strataand the sums of squares in the null ANOVA for anysimple block structure.

25

Null ANOVA

26

27

28

McLeod and Brewster (2004) Technometrics

29

From the d.f. identity, we can write down a yield identity which gives projections to all the strata. For convenience, we index each unit by multi-subscripts, and as before, dot notation is used for averaging. The following is the rule given by Nelder:

Expand each term in the d.f. identity as a function of the n's; then to each term in the expansion corresponds a mean of the y's with the same sign and averaged over the subscripts for which the corresponding n's are absent.

30

31

32

Miller (1997) Technometrics

33

34

There are 8 nontrivial strata

35

36

37

38

Projections of the data vector onto different

strata are uncorrelated between and

homoscedastic within.

39

Estimates computed in different strata are uncorrelated.

Estimate each treatment contrast in each of the strata in which it is estimable, and combine the uncorrelated estimates from different strata.

Simple analysis results when the treatment contrasts are estimable in only one stratum.

40

Designs such that falls entirely in one stratum are

called orthogonal designs.

Examples:

Completely randomized designs

Randomized complete block designs

Latin squares

41

42

Completely randomized design

43

44

Complete block designs

The two factors T and B satisfy the condition of proportional frequencies.

45

Under a row-column design such that each treatment appears the

same number of times in each row and the same number of times in

each column (such as a Latin square),

46

47

ANOVA for an orthogonal design

48

49

50

51

52

53

54

55

56

Three replications of 23

Treatment structure: 2*2*2

Block structure: 3/8

57

58

Incomplete blocks

The condition of proportional frequencies cannot besatisfied. has nontrivial projections in both the inter- and intra-block strata. Hence there is treatmentinformation in both strata.

Interblock estimateIntrablock estimate Recovery of interblock information

59

60

61

62

63

64

In general, there may be information for treatment contrasts

in more than one stratum.

Analysis is still simple if the space of treatment contrasts

can be decomposed as , where each ,

consisting of treatment contrasts of interest, is entirely in

one stratum.

Orthogonal designs

65

Treatment structure: 22 Block structure: 6/2

Information for the main effects is in the intrablock stratum andinformation for the interaction is in the interblock stratum

66

ANOVAdf

InterblockAB 1Residual 4

IntrablockA 1B 1Residual 4

Total 11

67

ANOVA for 3/2/2df

Replicates 2

InterblockAB 1Residual 2

IntrablockA 1B 1Residual 4

Total 11

68

69

Treatment structure: 3*4Block structure: 3/3/4

70

71

72

Miller (1997)

The experiment is run in 2 blocks and employs

4 washers and 4 driers. Sets of cloth samples

are run through the washers and the samples

are divided into groups such that each group

contains exactly one sample from each washer.

Each group of samples is then assigned to one

of the driers. Once dried, the extent of wrinkling

on each sample is evaluated.

73

Treatment structure:

A, B, C, D, E, F: configurations of washers

a,b,c,d: configurations of dryers

Block structure:2 blocks/(4 washers* 4 dryers)

74

75

Source of variation d.f. block stratum

AD=BE=CF=ab=cd 1 block.wash stratum

A=BC=EF 1B=AC=DF 1 C=AB=DF 1D=BF=CE 1E=AF=CD 1F=BD=AE 1

76

block.dryer stratum

a 1b 1c 1d 1ac=bd 1bc=ad 1

77

block.wash.dryer stratum

Aa=Db 1Ba=Eb 1Ca=Fb 1Da=Ab 1Ea=Bb 1Fa=Cb 1Ac=Dd 1Bc=Ed 1Cc=Fd 1Dc=Ad 1Ec=Bd 1Fc=Cd 1Residual 6Total 31

78

Blocks Block/((Sv/St/Sr)*Sw) 4/((3/2/3)*7) Treatments Var*Time*Rate*Weed 3*2*3*7

79

Factor [nvalues=504;levels=4] Block & [levels=3] Sv,  Sr, Var, Rate  & [levels=2] St, Time & [levels=7] Sw, WeedGenerate Block, Sv, St, Sr, SwMatrix [rows=4;columns=6; \values="b1 b2 Col St Sr Row"\1, 0, 1, 0, 0, 0,\0, 0, 1, 1, 0, 0,\0, 0, 1, 1, 1, 0,\1, 1, 0, 0, 0, 1] CkeyAkey [blockfactor=Block,Sv,St,Sr,Sw; \Colprimes=!(2,2,3,2,3,7);Colmappings=!(1,1,2,3,4,5);Key=Ckey] Var, Time, Rate, WeedBlocks Block/((Sv/St/Sr)*Sw)Treatments Var*Time*Rate*WeedANOVA

80

 

Block stratum 3 Block.Sv stratumVar 2Residual 6 Block.Sw stratumWeed 6Residual 18

Block.Sv.St stratumTime 1Var.Time 2Residual 9  

81

 Block.Sv.Sw stratumVar.Weed 12Residual 36

Block.Sv.St.Sr stratumRate 2Var.Rate 4Time.Rate 2Var.Time.Rate 4Residual 36

 

82

From the archive of S-news email list:

“I'm having trouble with what I'm sure is a very

simple problem, i.e., specifying the correct

syntax for a simple split plot design. To check

my understanding I'm using the cake data set

from Table 7.5 of Cochran & Cox. .....

I can't figure out what error term I need to

specify .....”

83

“Speaking as an absolutely committed devotee of Splus, I think that the Splus syntax for analyzing experimental designs containing random effects absolutely SUCKS!!! …..I am firmly of the opinion that talking about ‘splitplots’ as such is counter-productive, confusing, and antiquated. The ***right*** way to think about such problems is simply to decide which effects are random and which are fixed. Then think about expectedmean squares, and choose the denominator of your F-statistic so that under H_0 the ratio of the expected mean squares is 1 ….”

84

“It's a cross-Atlantic culture clash. …..This divide-and-conquer idea of separating thedata using linear functions with homogeneous error variances, that is, the spectral decomposition of the variance matrix, I find immensely simplifyingand natural. …..The contrary notion of putting everything togetherin one big anova table with interactions between random and fixed effects standing in forwhat really are error terms and pondering which one goes on the bottom of which other is to lose the plot entirely. ……” ---- Bill Venables

85

“The great advantage of the western Pacific view of specifying an experiment is that the commands follow from the design of the experiment. There is no need to know anything about fixed or random effects, or denominators of F ratios. Experimenters know what their treatments are, and they know how they randomised their experiment. That’s all they need to specify the model, and the correct analysis pops out whether it was a split plot, confounded factorial, Latin square, or whatever. …..”

86

Bailey (1981) JRSS, Ser. A

“Although Nelder (1965a, b) gave a unified treatment of what he called ‘simple’ block structures over ten years ago, his ideas do not seem to have gained wide acceptance. It is a pity, because they are useful and, I believe, simplifying. However, there seems to be a widespread belief that his ideas are too difficult to be understood or used by practical statisticians or students.”

Recommended