Ch12 Analysis of Variance

Page 1: Ch12 Analysis of Variance

Ch12 Analysis of Variance

Page 2: Ch12 Analysis of Variance

Outline

Completely randomized designs

Randomized-block designs

Page 3: Ch12 Analysis of Variance

Analysis of Variance

Single Factor Analysis of Variance

Single Factor ANOVA

One Way Analysis of Variance

One Way ANOVA

Page 4: Ch12 Analysis of Variance

Background

If we have, say, 3 treatments to compare (A, B, C), then we would need 3 separate t-tests (comparing A with B, A with C, and B with C). With 7 treatments we would need 21 separate t-tests. This would be time-consuming but, more importantly, it would be inherently flawed: in each t-test we accept a 5% chance of wrongly declaring a difference (when we test at the 0.05 significance level). So in 21 tests we would expect about one false positive (21 × 0.05 ≈ 1) even if none of the treatments really differed. ANalysis Of VAriance (ANOVA) overcomes this problem by enabling us to detect significant differences among the treatments as a whole. We do a single test to see if there are differences between the means at our chosen significance level.

Assumptions: normally distributed populations with equal variances, independent samples, random sampling
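The counting argument in the Background can be checked with a short Python sketch. The function names are illustrative, and the family-wise error rate line assumes the tests are independent, which pairwise t-tests sharing samples are not exactly, so treat that figure as a rough guide only.

```python
from math import comb

def pairwise_tests(k):
    """Number of separate t-tests needed to compare k treatments pairwise."""
    return comb(k, 2)

def expected_false_positives(k, alpha=0.05):
    """Expected number of false rejections if no treatments truly differ."""
    return pairwise_tests(k) * alpha

def familywise_error_rate(k, alpha=0.05):
    """Chance of at least one false rejection, ASSUMING independent tests."""
    return 1 - (1 - alpha) ** pairwise_tests(k)

for k in (3, 7):
    print(k, pairwise_tests(k),
          round(expected_false_positives(k), 2),
          round(familywise_error_rate(k), 3))
```

For 7 treatments this gives 21 tests and roughly one expected false positive, which is the motivation for replacing the pairwise tests with a single F test.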

Page 5: Ch12 Analysis of Variance

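The scheme of one-way classification

$$
\begin{array}{l|l|c|c}
 & \text{Observations} & \text{Means} & \text{Sum of Squares} \\ \hline
\text{Sample } 1: & y_{11}, y_{12}, \ldots, y_{1j}, \ldots, y_{1n_1} & \bar{y}_1 & \sum_{j=1}^{n_1} (y_{1j} - \bar{y}_1)^2 \\
\text{Sample } 2: & y_{21}, y_{22}, \ldots, y_{2j}, \ldots, y_{2n_2} & \bar{y}_2 & \sum_{j=1}^{n_2} (y_{2j} - \bar{y}_2)^2 \\
\quad\vdots & \qquad\vdots & \vdots & \vdots \\
\text{Sample } i: & y_{i1}, y_{i2}, \ldots, y_{ij}, \ldots, y_{in_i} & \bar{y}_i & \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2 \\
\quad\vdots & \qquad\vdots & \vdots & \vdots \\
\text{Sample } k: & y_{k1}, y_{k2}, \ldots, y_{kj}, \ldots, y_{kn_k} & \bar{y}_k & \sum_{j=1}^{n_k} (y_{kj} - \bar{y}_k)^2
\end{array}
$$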

Page 6: Ch12 Analysis of Variance

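Simplify

$$ T_{\cdot} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} y_{ij}, \qquad N = \sum_{i=1}^{k} n_i $$

$$ \bar{y} = \frac{T_{\cdot}}{N} = \frac{\sum_{i=1}^{k} n_i \bar{y}_i}{\sum_{i=1}^{k} n_i} $$

$\bar{y}$ is the overall mean or grand mean of all observations.

$\bar{y}_i$ is the mean of the measurements in the $i$-th sample (for instance, those obtained by the $i$-th laboratory).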

The statistical analysis leading to a comparison of the k different population means consists essentially of splitting the sum of squares about the overall grand mean into a component due to treatment difference, and a component due to error or variation within a sample.

Page 7: Ch12 Analysis of Variance

EX

Suppose 3 drying formulas for curing a glue are studied and the following times observed.

Formula A: 13 10 8 11 8

Formula B: 13 11 14 14

Formula C: 4 1 3 4 2 4

Page 8: Ch12 Analysis of Variance

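Each observation can be decomposed as

$$ \underbrace{y_{ij}}_{\text{observation}} = \underbrace{\bar{y}}_{\text{grand mean}} + \underbrace{(\bar{y}_i - \bar{y})}_{\text{deviation due to treatment}} + \underbrace{(y_{ij} - \bar{y}_i)}_{\text{error}} $$

Repeating the decomposition for each observation, we obtain the arrays

$$
\underbrace{\begin{bmatrix} 13 & 10 & 8 & 11 & 8 & \\ 13 & 11 & 14 & 14 & & \\ 4 & 1 & 3 & 4 & 2 & 4 \end{bmatrix}}_{\text{observation } y_{ij}}
=
\underbrace{\begin{bmatrix} 8 & 8 & 8 & 8 & 8 & \\ 8 & 8 & 8 & 8 & & \\ 8 & 8 & 8 & 8 & 8 & 8 \end{bmatrix}}_{\text{grand mean } \bar{y}}
+
\underbrace{\begin{bmatrix} 2 & 2 & 2 & 2 & 2 & \\ 5 & 5 & 5 & 5 & & \\ -5 & -5 & -5 & -5 & -5 & -5 \end{bmatrix}}_{\text{treatment effects } \bar{y}_i - \bar{y}}
+
\underbrace{\begin{bmatrix} 3 & 0 & -2 & 1 & -2 & \\ 0 & -2 & 1 & 1 & & \\ 1 & -2 & 0 & 1 & -1 & 1 \end{bmatrix}}_{\text{error } y_{ij} - \bar{y}_i}
$$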
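The decomposition can be reproduced numerically. The following is a minimal Python sketch using the drying times from the glue example; the function name `decompose` and the dictionary layout are illustrative choices, not part of the slides.

```python
# Drying times from the glue example (Formulas A, B, C).
samples = {
    "A": [13, 10, 8, 11, 8],
    "B": [13, 11, 14, 14],
    "C": [4, 1, 3, 4, 2, 4],
}

def decompose(samples):
    """Split each observation into grand mean + treatment effect + error."""
    all_obs = [y for ys in samples.values() for y in ys]
    grand_mean = sum(all_obs) / len(all_obs)
    parts = {}
    for name, ys in samples.items():
        sample_mean = sum(ys) / len(ys)
        effect = sample_mean - grand_mean        # deviation due to treatment
        errors = [y - sample_mean for y in ys]   # within-sample deviations
        parts[name] = (effect, errors)
    return grand_mean, parts

grand_mean, parts = decompose(samples)
print("grand mean:", grand_mean)
for name, (effect, errors) in parts.items():
    print(name, "treatment effect:", effect, "errors:", errors)
```

Adding the grand mean, the treatment effect and the error for any cell returns the original observation, which is exactly the identity the arrays express.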

Page 9: Ch12 Analysis of Variance

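Treatment sum of squares:

$$ SS(Tr) = \sum_{i=1}^{k} n_i (\bar{y}_i - \bar{y})^2 $$

Error sum of squares:

$$ SSE = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2 $$

Degrees of freedom for treatment: k - 1

Degrees of freedom for error: N - k

Theorem.

$$ \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y})^2 = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2 + \sum_{i=1}^{k} n_i (\bar{y}_i - \bar{y})^2 $$

That is, with SST denoting the total sum of squares on the left-hand side,

$$ SST = SSE + SS(Tr) $$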
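As a quick numerical check of the theorem, the sketch below computes the three sums of squares for the glue data directly from their definitions and confirms that the total splits into error plus treatment. The helper name `sums_of_squares` is an assumption made for illustration.

```python
# Glue drying times: one inner list per formula (A, B, C).
samples = [
    [13, 10, 8, 11, 8],
    [13, 11, 14, 14],
    [4, 1, 3, 4, 2, 4],
]

def sums_of_squares(samples):
    """Return (SST, SSE, SS(Tr)) computed from their definitions."""
    n_total = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / n_total
    means = [sum(s) / len(s) for s in samples]
    sst = sum((y - grand_mean) ** 2 for s in samples for y in s)
    sse = sum((y - m) ** 2 for s, m in zip(samples, means) for y in s)
    ss_tr = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
    return sst, sse, ss_tr

sst, sse, ss_tr = sums_of_squares(samples)
print(sst, sse, ss_tr)
```

For these data the three values are 302, 32 and 270, so SST = SSE + SS(Tr) holds as stated.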

Page 10: Ch12 Analysis of Variance

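Let $\mu_i$ denote the mean of the $i$-th population and $\sigma^2$ the common variance of the $k$ populations. Each observation can then be written as

$$ Y_{ij} = \mu_i + \epsilon_{ij} $$

where the $\epsilon_{ij}$ are random errors with mean 0 and variance $\sigma^2$. Equivalently,

$$ Y_{ij} = \mu + \alpha_i + \epsilon_{ij} $$

where $\mu$ is the mean of the $\mu_i$ in the experiment (weighted by the sample sizes $n_i$) and $\alpha_i = \mu_i - \mu$ is the effect of the $i$-th treatment; hence

$$ \sum_{i=1}^{k} n_i \alpha_i = 0 $$

The null hypothesis that the k population means are all equal can be replaced by the null hypothesis

$$ \alpha_1 = \alpha_2 = \cdots = \alpha_k = 0 $$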

Page 11: Ch12 Analysis of Variance

The alternative hypothesis is that at least two of the population means are unequal.

To test the null hypothesis that the k population means are all equal, we shall compare two estimates of $\sigma^2$: one based on the variation among the sample means, and one based on the variation within the samples.

Each sum of squares is first converted to a mean square:

$$ \text{mean square} = \frac{\text{sum of squares}}{\text{degrees of freedom}} $$

Page 12: Ch12 Analysis of Variance

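When the population means are equal, both

$$ \text{treatment mean square} = \frac{\sum_{i=1}^{k} n_i (\bar{y}_i - \bar{y})^2}{k - 1} $$

and

$$ \text{error mean square} = \frac{\sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2}{N - k} $$

are estimates of $\sigma^2$.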

Page 13: Ch12 Analysis of Variance

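If the null hypothesis is true, it can be shown that the two mean squares are independent and that their ratio

$$ F = \frac{SS(Tr)/(k - 1)}{SSE/(N - k)} = \frac{\sum_{i=1}^{k} n_i (\bar{y}_i - \bar{y})^2 / (k - 1)}{\sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2 / (N - k)} $$

has an F distribution with k - 1 and N - k degrees of freedom.

A large value of F indicates large differences between the sample means. Therefore, the null hypothesis is rejected at level of significance $\alpha$ if $F > F_\alpha$, where $F_\alpha$ is the upper-$\alpha$ critical value of the F distribution with k - 1 and N - k degrees of freedom.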

Page 14: Ch12 Analysis of Variance

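One-way ANOVA

Source of variation   Sum of squares   Degrees of freedom   Mean square                    Computed f
Treatments            SS(Tr)           k - 1                $s_1^2 = SS(Tr)/(k - 1)$       $f = s_1^2 / s^2$
Error                 SSE              k(n - 1)             $s^2 = SSE/(k(n - 1))$
Total                 SST              nk - 1

The degrees of freedom shown are for k samples of equal size n, so that N = nk. With unequal sample sizes, the error and total degrees of freedom are N - k and N - 1.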

Page 15: Ch12 Analysis of Variance

Solution of EX

Page 16: Ch12 Analysis of Variance

Solution

One-way ANOVA: A, B, C

Source   DF       SS       MS       F       P
Factor    2   270.00   135.00   50.63   0.000
Error    12    32.00     2.67
Total    14   302.00

The computed F = 50.63 exceeds the critical value $F_{0.05}(2, 12) = 3.89$, so we reject the null hypothesis of equal means.
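The Minitab table for the glue example can be rebuilt by hand. The sketch below is a plain-Python version under stated assumptions: the function name `one_way_anova` is illustrative, and the critical value 3.89 is copied from the slide rather than computed, since the standard library has no F distribution.

```python
# Glue drying times for Formulas A, B, C.
samples = {
    "A": [13, 10, 8, 11, 8],
    "B": [13, 11, 14, 14],
    "C": [4, 1, 3, 4, 2, 4],
}
F_CRITICAL = 3.89  # F_0.05(2, 12), taken from the slide, not computed here

def one_way_anova(groups):
    """Return df, sums of squares, mean squares and F for a one-way layout."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_tr = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    sse = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    df_tr, df_err = k - 1, n_total - k
    ms_tr, ms_err = ss_tr / df_tr, sse / df_err
    return {"df": (df_tr, df_err), "ss": (ss_tr, sse),
            "ms": (ms_tr, ms_err), "f": ms_tr / ms_err}

result = one_way_anova(list(samples.values()))
print(f"Factor {result['df'][0]:3d} {result['ss'][0]:8.2f} {result['ms'][0]:8.2f} {result['f']:7.2f}")
print(f"Error  {result['df'][1]:3d} {result['ss'][1]:8.2f} {result['ms'][1]:8.2f}")
print("reject H0:", result["f"] > F_CRITICAL)
```

The printed rows agree with the Factor and Error lines of the table, and the comparison against 3.89 reproduces the decision to reject equal means.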

Page 17: Ch12 Analysis of Variance

Exercise

Assume that we have recorded the biomass of 3 bacteria in flasks of glucose broth, and we used 3 replicate flasks for each bacterium

Replicate Bacterium

A

Bacterium

B

Bacterium C

1 12 20 40

2 15 19 35

3 9 23 42

Page 18: Ch12 Analysis of Variance

Solution

One-way ANOVA: A, B, C

Source DF SS MS F P

Factor 2 1140.2 570.1 64.93 0.000

Error 6 52.7 8.78

Total 8 1192.9

The computed F (about 65) far exceeds the critical value $F_{0.05}(2, 6) = 5.14$, so we reject the null hypothesis of equal means.
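The exercise can also be worked from totals instead of deviations. The sketch below uses the usual computing formulas, SST = Σy² - T²/N and SS(Tr) = ΣTᵢ²/nᵢ - T²/N, which are algebraically equivalent to the definitions but are not written out on the slides, so treat them as an added assumption. The function name is illustrative.

```python
# Biomass readings for the three bacteria (three replicate flasks each).
biomass = {
    "A": [12, 15, 9],
    "B": [20, 19, 23],
    "C": [40, 35, 42],
}

def anova_from_totals(groups):
    """One-way ANOVA via the shortcut formulas based on totals."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_total = sum(sum(g) for g in groups)
    correction = grand_total ** 2 / n_total            # T^2 / N
    sst = sum(y * y for g in groups for y in g) - correction
    ss_tr = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    sse = sst - ss_tr                                  # by the identity SST = SSE + SS(Tr)
    f = (ss_tr / (k - 1)) / (sse / (n_total - k))
    return sst, ss_tr, sse, f

sst, ss_tr, sse, f = anova_from_totals(list(biomass.values()))
print(round(sst, 1), round(ss_tr, 1), round(sse, 1), round(f, 1))
```

The sums of squares round to 1192.9, 1140.2 and 52.7, matching the solution table, and F comes out at about 64.9.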

Page 19: Ch12 Analysis of Variance

12.3 Randomized-Block Designs

         1     2     3     4
   1    13     7     9     3
   2    14     6     3     1
   3    11     5    15     5

Blocks

Two-way ANOVA

Page 20: Ch12 Analysis of Variance

RCB Randomized Complete Block

The randomized block design is an extension of the paired t-test to situations where the factor of interest has more than two levels.

Page 21: Ch12 Analysis of Variance

Example 1:

Suppose we are interested in how weight gain (Y) in rats is affected by Source of protein (Beef, Cereal, and Pork) and by Level of Protein (High or Low).

There are a total of t = 3 × 2 = 6 treatment combinations of the two factors (Beef-High Protein, Cereal-High Protein, Pork-High Protein, Beef-Low Protein, Cereal-Low Protein, and Pork-Low Protein).

Page 22: Ch12 Analysis of Variance

Suppose we have available to us a total of N = 60 experimental rats to which we are going to apply the different diets based on the t = 6 treatment combinations.

Prior to the experimentation the rats were divided into n = 10 homogeneous groups of size 6.

The grouping was based on factors that had previously been ignored (for example, initial weight, appetite, etc.).

Within each of the 10 blocks, each rat is randomly assigned one of the six treatment combinations (diets), so that every diet is used exactly once per block.
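The within-block randomization can be sketched in a few lines of Python. This is only an illustration of the idea: the diet labels follow the six treatment combinations listed earlier, while the function name and the use of a seed for reproducibility are assumptions.

```python
import random

DIETS = ["Beef-High", "Cereal-High", "Pork-High",
         "Beef-Low", "Cereal-Low", "Pork-Low"]

def assign_within_blocks(n_blocks, treatments, seed=None):
    """Give each block its own random ordering of all treatments."""
    rng = random.Random(seed)
    layout = []
    for _ in range(n_blocks):
        order = list(treatments)
        rng.shuffle(order)       # independent randomization inside each block
        layout.append(order)
    return layout

layout = assign_within_blocks(10, DIETS, seed=0)
for block, order in enumerate(layout, start=1):
    # Position p in the list is the diet given to rat p of this block.
    print(block, order)
```

Because the shuffle is restricted to one block at a time, every diet appears exactly once in every block, which is what distinguishes this design from a completely randomized one.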

Page 23: Ch12 Analysis of Variance

The weight gain after a fixed period is measured for each of the test animals and is tabulated on the next slide:

Page 24: Ch12 Analysis of Variance

Randomized Block Design

Weight gains by block; columns (1) to (6) are the six diets.

Block   (1)   (2)   (3)   (4)   (5)   (6)
  1     107    96   112    83    87    90
  2     102    72   100    82    70    94
  3     102    76   102    85    95    86
  4      93    70    93    63    71    63
  5     111    79   101    72    75    81
  6     128    89   104    85    84    89
  7      56    70    72    64    62    63
  8      97    91    92    80    72    82
  9      80    63    87    82    81    63
 10     103   102   112    83    93    81

Page 25: Ch12 Analysis of Variance

Example 2:

The following experiment compares the effect of four different chemicals (A, B, C and D) in producing water resistance (y) in textiles.

A strip of material, randomly selected from each bolt, is cut into four pieces (samples), and the pieces are randomly assigned to receive one of the four chemical treatments.

Page 26: Ch12 Analysis of Variance

This process is replicated three times producing a Randomized Block (RB) design.

Moisture resistance (y) was measured for each of the samples. (Low readings indicate low moisture penetration.)

The data is given in the diagram and table on the next slide.

Page 27: Ch12 Analysis of Variance

Diagram: Blocks (Bolt Samples)

Block 1    Block 2    Block 3
 9.9 C     13.4 D     12.7 B
10.1 A     12.9 B     12.9 D
11.4 B     12.2 A     11.4 C
12.1 D     12.3 C     11.9 A

Page 28: Ch12 Analysis of Variance

Table

Blocks (Bolt Samples)

Chemical 1 2 3

A 10.1 12.2 11.9

B 11.4 12.9 12.7

C 9.9 12.3 11.4

D 12.1 13.4 12.9

Page 29: Ch12 Analysis of Variance

The randomized block design (RBD) consists of a two-step procedure:

1. Matched sets of experimental units, called blocks, are formed, each consisting of $a$ units (one for each of the $a$ treatments). The blocks should consist of experimental units that are as similar as possible (to reduce the within-treatments variation).

2. One experimental unit from each block is randomly assigned to each treatment, resulting in a total of $ab$ responses, where $b$ is the number of blocks.

If every block has responses from all treatments, the design is complete: a randomized complete block design.

Page 30: Ch12 Analysis of Variance

RCB

For example, consider the situation where three different methods were used to predict the shear strength of steel plate girders. Say we use four girders as the experimental units.

Page 31: Ch12 Analysis of Variance

RCB

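With $y_{ij}$ denoting the response for treatment $i$ in block $j$, the treatment, block and grand means are

$$ \bar{y}_{i\cdot} = \frac{1}{b} \sum_{j=1}^{b} y_{ij}, \qquad \bar{y}_{\cdot j} = \frac{1}{a} \sum_{i=1}^{a} y_{ij}, \qquad \bar{y}_{\cdot\cdot} = \frac{1}{ab} \sum_{i=1}^{a} \sum_{j=1}^{b} y_{ij} $$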

The total number of responses is ab.

Page 32: Ch12 Analysis of Variance

RCB

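The appropriate linear statistical model:

$$ y_{ij} = \mu + \tau_i + \beta_j + \epsilon_{ij}, \qquad i = 1, \ldots, a, \quad j = 1, \ldots, b $$

where $\mu$ is the overall mean, $\tau_i$ is the effect of the $i$-th treatment, $\beta_j$ is the effect of the $j$-th block, and $\epsilon_{ij}$ is the random error.

We assume

• treatments and blocks are initially fixed effects

• treatments and blocks do not interact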

Page 33: Ch12 Analysis of Variance

RCB

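The hypotheses of interest are:

$H_0$: all $a$ treatment effects are zero

$H_1$: at least one treatment effect is nonzero

i.e., under $H_0$ there is no treatment effect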

Page 34: Ch12 Analysis of Variance

RCB

The treatment sum of squares has a-1 df

The block sum of squares has b-1 df

The error sum of squares has (a-1)(b-1) df

Page 35: Ch12 Analysis of Variance

The mean squares are:

RCB

Page 36: Ch12 Analysis of Variance

RCB

The expected values of these mean squares are:

Page 37: Ch12 Analysis of Variance

RCB

Page 38: Ch12 Analysis of Variance

RCB

Page 39: Ch12 Analysis of Variance

Minitab

Page 40: Ch12 Analysis of Variance

Two-way ANOVA: response versus row, col

Source DF SS MS F P

row 2 56 28.0000 3.23 0.112

col 3 90 30.0000 3.46 0.091

Error 6 52 8.6667

Total 11 198

Both P-values (0.112 and 0.091) exceed the 0.05 level of significance, so we cannot reject either null hypothesis.

Page 41: Ch12 Analysis of Variance

The Anova Table for Diet Experiment

Source   S.S.          d.f.   M.S.        F       p-value
Block     5992.41667     9    665.82407    9.52   0.00000
Diet      4572.88333     5    914.57667   13.08   0.00000
ERROR     3147.28333    45     69.93963
Total    13712.58       59

Page 42: Ch12 Analysis of Variance

The Anova Table for Textile Experiment

SOURCE   SUM OF SQUARES   D.F.   MEAN SQUARE   F       TAIL PROB.
Blocks    7.17167           2    3.5858        40.21   0.0003
Chem      5.20000           3    1.7333        19.44   0.0017
ERROR     0.53500           6    0.0892
Total    12.90667          11
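The textile table can be reproduced from the twelve readings given earlier. The following plain-Python sketch is one way to do it, assuming rows are chemicals (treatments) and columns are bolts (blocks); the function name `rcb_anova` is illustrative, and tail probabilities are left out because the standard library has no F distribution.

```python
# Moisture resistance readings: one row per chemical, one column per bolt.
readings = {
    "A": [10.1, 12.2, 11.9],
    "B": [11.4, 12.9, 12.7],
    "C": [9.9, 12.3, 11.4],
    "D": [12.1, 13.4, 12.9],
}

def rcb_anova(rows):
    """Randomized complete block ANOVA for a treatments-by-blocks table."""
    a = len(rows)         # number of treatments
    b = len(rows[0])      # number of blocks
    grand = sum(sum(r) for r in rows) / (a * b)
    treatment_means = [sum(r) / b for r in rows]
    block_means = [sum(r[j] for r in rows) / a for j in range(b)]

    ss_treat = b * sum((m - grand) ** 2 for m in treatment_means)
    ss_block = a * sum((m - grand) ** 2 for m in block_means)
    ss_total = sum((y - grand) ** 2 for r in rows for y in r)
    ss_error = ss_total - ss_treat - ss_block   # what is left after both effects

    df_error = (a - 1) * (b - 1)
    ms_error = ss_error / df_error
    return {
        "blocks": (ss_block, b - 1, (ss_block / (b - 1)) / ms_error),
        "treatments": (ss_treat, a - 1, (ss_treat / (a - 1)) / ms_error),
        "error": (ss_error, df_error, None),
        "total": (ss_total, a * b - 1, None),
    }

table = rcb_anova(list(readings.values()))
for source, (ss, df, f) in table.items():
    f_text = "" if f is None else f"{f:8.2f}"
    print(f"{source:<11}{ss:10.5f}{df:4d}{f_text}")
```

The sums of squares, degrees of freedom and F ratios printed here agree with the Blocks, Chem, ERROR and Total rows of the table above.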