SRIT / UICM007 – P & S / Analysis of Variance

SRIT / M & H / M. Vijaya Kumar 1

SRI RAMAKRISHNA INSTITUTE OF TECHNOLOGY

(AN AUTONOMOUS INSTITUTION)

COIMBATORE- 641010

UICM007 & PROBABILITY AND STATISTICS

ANALYSIS OF VARIANCE

History:

The t- and z-tests developed in the 20th century were used until 1918, when Ronald

Fisher created the analysis of variance. ANOVA is also called the Fisher analysis of variance,

and it is the extension of the t- and the z-tests. The term became well-known in 1925, after

appearing in Fisher's book, "Statistical Methods for Research Workers." It was employed in

experimental psychology and later expanded to subjects that are more complex.

Definition:

Analysis of variance

Analysis of variance is a collection of statistical models and their associated

estimation procedures used to analyze the differences among group means in a sample.

ANOVA was developed by statistician and evolutionary biologist Ronald Fisher.

One Way Classification

Completely Randomized Design (CRD)

The one-way analysis of variance (ANOVA) is used to determine whether there are

any statistically significant differences between the means of two or more independent

(unrelated) groups (although you tend to only see it used when there are a minimum of

three, rather than two groups).

For example, you could use a one-way ANOVA to understand whether exam

performance differed based on test anxiety levels amongst students, dividing students into

three independent groups (e.g., low, medium and high-stressed students). Also, it is

important to realize that the one-way ANOVA is an omnibus test statistic and cannot tell

you which specific groups were statistically significantly different from each other; it only

tells you that at least two groups were different. Since you may have three, four, five or

more groups in your study design, determining which of these groups differ from each

other is important.

Advantages of randomized complete block designs

Complete flexibility. Can have any number of treatments and blocks.

Provides more accurate results than the completely randomized design due to

grouping.

Relatively easy statistical analysis even with missing data.

Allows calculation of unbiased error for specific treatments.

Disadvantages of randomized complete block designs

Not suitable for large numbers of treatments because blocks become too large.

Not suitable when complete block contains considerable variability.

Interactions between block and treatment effects increase error.

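ANOVA table for one-way classification ($k$ samples, $N$ observations in all)

Source of variation   Sum of squares   Degrees of freedom   Mean sum of squares    F-ratio
Between samples       SSC              $k - 1$              MSC = SSC/$(k - 1)$    $F$ = MSC/MSE
Within samples        SSE              $N - k$              MSE = SSE/$(N - k)$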
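
The sums of squares in the one-way table are usually obtained from the shortcut (correction-factor) formulas sketched below. The notation is an assumption made here, since the symbols did not survive in the text: $x_{ij}$ is the $i$-th observation in sample $j$, $T_j$ and $n_j$ are the total and size of sample $j$, $T$ is the grand total and $N$ the total number of observations.

```latex
\begin{aligned}
T   &= \sum_{j}\sum_{i} x_{ij}, \qquad \text{C.F.} = \frac{T^{2}}{N},\\
SST &= \sum_{j}\sum_{i} x_{ij}^{2} - \frac{T^{2}}{N},\\
SSC &= \sum_{j} \frac{T_{j}^{2}}{n_{j}} - \frac{T^{2}}{N},\\
SSE &= SST - SSC,\\
F   &= \frac{SSC/(k-1)}{SSE/(N-k)}.
\end{aligned}
```

Subtracting the same constant from every observation changes $T$ and the correction factor but leaves SST, SSC and SSE unchanged, which is why the worked problems code the data before squaring.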

Problem: 1

A random sample is selected from each of three makes of ropes and their breaking

strength (in pounds) are measured with the following results:

   I     II    III
  70    100     60
  72    110     65
  75    108     57
  80    112     84
  83    113     87
        120     73
        107

Test whether the breaking strength of the ropes differs significantly.

Answer:

To simplify the work, we subtract 80 from each entry and form the following table.

S. No      I     II    III
1        -10     20    -20
2         -8     30    -15
3         -5     28    -23
4          0     32      4
5          3     33      7
6                40     -7
7                27
Total    -20    210    -54

Null Hypothesis:

Let us take the null hypothesis that the breaking strength of the ropes does not differ

significantly.

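$N = 18$, $T = \sum\sum x_{ij} = -20 + 210 - 54 = 136$ (coded values)

Correction factor $= \dfrac{T^2}{N} = \dfrac{(136)^2}{18} = 1027.56$

Total sum of squares SST $= \sum\sum x_{ij}^2 - \dfrac{T^2}{N}$

$= [(-10)^2 + (-8)^2 + (-5)^2 + 0^2 + 3^2 + 20^2 + 30^2 + 28^2 + 32^2 + 33^2 + 40^2 + 27^2 + (-20)^2 + (-15)^2 + (-23)^2 + 4^2 + 7^2 + (-7)^2] - 1027.56$

$= 7992 - 1027.56 = 6964.44$

Between ropes (column) sum of squares SSC

$= \dfrac{(-20)^2}{5} + \dfrac{(210)^2}{7} + \dfrac{(-54)^2}{6} - 1027.56 = 80 + 6300 + 486 - 1027.56 = 5838.44$

Error sum of squares SSE $= SST - SSC = 6964.44 - 5838.44 = 1126$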

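ANOVA Table

Source of variation   Degrees of freedom   Sum of squares   Mean square   F-ratio
Between ropes                 2               5838.44         2919.22      38.89
Error                        15               1126.00           75.07
Total                        17               6964.44

Table value:

$F_{0.05}(2, 15) = 3.68$ (5% level).

Conclusion:

Since the calculated $F = 38.89 > 3.68$, we reject the null hypothesis; there is a significant difference

between the breaking strengths of the ropes.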
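
The arithmetic of Problem 1 can be checked with a short script. This is a minimal sketch using only the Python standard library; the function name `one_way_anova` and the dictionary keys are illustrative choices, not part of the notes. It works on the uncoded breaking strengths, so it also confirms that coding by 80 does not affect the sums of squares.

```python
def one_way_anova(groups):
    """One-way ANOVA via the correction-factor (shortcut) method."""
    k = len(groups)                                  # number of samples
    n_total = sum(len(g) for g in groups)            # N
    grand_total = sum(sum(g) for g in groups)        # T
    cf = grand_total ** 2 / n_total                  # correction factor T^2 / N
    sst = sum(x * x for g in groups for x in g) - cf
    ssc = sum(sum(g) ** 2 / len(g) for g in groups) - cf
    sse = sst - ssc
    msc = ssc / (k - 1)
    mse = sse / (n_total - k)
    return {"SST": sst, "SSC": ssc, "SSE": sse,
            "df": (k - 1, n_total - k), "F": msc / mse}


# Breaking strengths from Problem 1 (uncoded values).
ropes = [
    [70, 72, 75, 80, 83],
    [100, 110, 108, 112, 113, 120, 107],
    [60, 65, 57, 84, 87, 73],
]
result = one_way_anova(ropes)
print(result)
```

The script does not look up the critical value; the computed F still has to be compared with the tabulated F for the degrees of freedom it reports.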

Problem: 2

The following are the number of mistakes made in 5 successive days by 4 technicians

working for a photographic laboratory. Test whether the differences among the four

sample means can be attributed to chance. [Test at a level of significance ].

I II III IV

6 14 10 9

14 9 12 12

10 12 7 8

8 10 15 10

11 14 11 11

Answer:

S. No I II III IV

1 6 14 10 9

2 14 9 12 12

3 10 12 7 8

4 8 10 15 10

5 11 14 11 11

Null Hypothesis:

i.e., the difference among the four sample means can be attributed to chance.

Alternative Hypothesis:

There is a significant difference among the four sample means.

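Column totals: $T_1 = 49$, $T_2 = 59$, $T_3 = 55$, $T_4 = 50$

$N = 20$, $T = \sum\sum x_{ij} = 213$

Correction factor $= \dfrac{T^2}{N} = \dfrac{(213)^2}{20} = 2268.45$

Total sum of squares SST $= \sum\sum x_{ij}^2 - \dfrac{T^2}{N} = [517 + 717 + 639 + 510] - 2268.45 = 2383 - 2268.45 = 114.55$

Between column sum of squares SSC

$= \dfrac{(49)^2}{5} + \dfrac{(59)^2}{5} + \dfrac{(55)^2}{5} + \dfrac{(50)^2}{5} - 2268.45 = 2281.40 - 2268.45 = 12.95$

Error sum of squares SSE $= SST - SSC = 114.55 - 12.95 = 101.60$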

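ANOVA Table

Source of variation    Degrees of freedom   Sum of squares   Mean sum of squares   F-ratio
Between technicians            3                12.95              4.317             0.68
Error                         16               101.60              6.350
Total                         19               114.55

Table value:

$F_{0.05}(3, 16) = 3.24$ and $F_{0.01}(3, 16) = 5.29$

Conclusion:

The calculated $F = 0.68$ is less than the table value at either level, so we accept the null hypothesis; there is no significant difference among

the four sample means.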

Problem: 3

As part of the investigation of the collapse of the roof of a building, a testing

laboratory is given all the available bolts that connected all the steel structure at three

different positions on the roof. The forces required to shear each of these bolts (coded

values) are as follows:

Position 1 90 82 79 98 83 91

Position 2 105 89 93 104 89 95 86

Position 3 83 89 80 94

Answer:

To simplify the calculations we subtract 90 from each value.

S. No    Position 1   Position 2   Position 3
1             0           15           -7
2            -8           -1           -1
3           -11            3          -10
4             8           14            4
5            -7           -1
6             1            5
7                         -4
Total       -17           31          -14

Null Hypothesis:

i.e., the difference among the sample means at the three positions is not significant.

Alternative Hypothesis:

: The differences between the sample means are significant.

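$N = 17$, $T = \sum\sum x_{ij} = -17 + 31 - 14 = 0$, so the correction factor $\dfrac{T^2}{N} = 0$

Total sum of squares SST $= \sum\sum x_{ij}^2 - \dfrac{T^2}{N}$

$= [0^2 + (-8)^2 + (-11)^2 + 8^2 + (-7)^2 + 1^2 + 15^2 + (-1)^2 + 3^2 + 14^2 + (-1)^2 + 5^2 + (-4)^2 + (-7)^2 + (-1)^2 + (-10)^2 + 4^2] - 0$

$= 299 + 473 + 166 = 938$

Between column sum of squares SSC

$= \dfrac{(-17)^2}{6} + \dfrac{(31)^2}{7} + \dfrac{(-14)^2}{4} - 0 = 48.17 + 137.29 + 49.00 = 234.45$

Error sum of squares SSE $= SST - SSC = 938 - 234.45 = 703.55$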

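ANOVA Table

Source of variation   Degrees of freedom   Sum of squares   Mean sum of squares   F-ratio
Between positions             2               234.45             117.23            2.33
Error                        14               703.55              50.25
Total                        16               938.00

Table value:

$F_{0.05}(2, 14) = 3.74$ (5% level)

Conclusion:

Since the calculated $F = 2.33 < 3.74$, we accept the null hypothesis; there is no significant difference among

the sample means at the three positions.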
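
Several of the worked problems subtract a constant (80, 90, 40, ...) before squaring. The sketch below, which assumes nothing beyond the standard library and uses the illustrative helper name `sums_of_squares`, checks on the Problem 3 data that this coding leaves SST, SSC and SSE unchanged.

```python
def sums_of_squares(groups):
    """Return (SST, SSC, SSE) for a one-way layout."""
    n_total = sum(len(g) for g in groups)
    cf = sum(sum(g) for g in groups) ** 2 / n_total   # correction factor
    sst = sum(x * x for g in groups for x in g) - cf
    ssc = sum(sum(g) ** 2 / len(g) for g in groups) - cf
    return sst, ssc, sst - ssc


# Shear forces (coded values as given in Problem 3).
positions = [
    [90, 82, 79, 98, 83, 91],
    [105, 89, 93, 104, 89, 95, 86],
    [83, 89, 80, 94],
]
coded = [[x - 90 for x in g] for g in positions]   # subtract 90 from each value

raw_ss = sums_of_squares(positions)
coded_ss = sums_of_squares(coded)
same = all(abs(a - b) < 1e-6 for a, b in zip(raw_ss, coded_ss))
print(coded_ss, same)
```

Only the grand total and the correction factor change under the shift; the differences that enter the ANOVA table do not.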

Problem: 4

A completely randomized design experiment with 10 plots and 3 treatments gave the

following results:

Plot No 1 2 3 4 5 6 7 8 9 10

Treatment A B C A C C A B A B

Yield 5 4 3 7 5 1 3 4 1 7

Analyze the results for treatment effects.

OR

A completely randomized design experiment with ten plots and three treatments gave the

results given below. Analyze the results for the effects of treatments.

Treatment Replications

A 5 7 1 3

B 4 4 7

C 3 1 5

Answer:

S. No    Treatment A   Treatment B   Treatment C
1             5             4             3
2             7             4             1
3             1             7             5
4             3
Total        16            15             9

Null Hypothesis:

There is no significant difference in the effects of treatments.

Alternative Hypothesis:

There is significant difference in the effects of treatments.

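$N = 10$, $T = \sum\sum x_{ij} = 16 + 15 + 9 = 40$

Correction factor $= \dfrac{T^2}{N} = \dfrac{(40)^2}{10} = 160$

Total sum of squares SST $= \sum\sum x_{ij}^2 - \dfrac{T^2}{N}$

$= [5^2 + 7^2 + 1^2 + 3^2 + 4^2 + 4^2 + 7^2 + 3^2 + 1^2 + 5^2] - 160 = 200 - 160 = 40$

Between column sum of squares SSC $= \dfrac{(16)^2}{4} + \dfrac{(15)^2}{3} + \dfrac{(9)^2}{3} - 160 = 64 + 75 + 27 - 160 = 6$

Error sum of squares SSE $= SST - SSC = 40 - 6 = 34$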

ANOVA Table

Source of variation   Degrees of freedom   Sum of squares   Mean sum of squares   F-ratio
Between treatments            2                  6                3.000            0.62
Error                         7                 34                4.857
Total                         9                 40

Table value:

$F_{0.05}(2, 7) = 4.74$ (5% level)

Conclusion:

The calculated $F = 0.62 < 4.74$. We accept $H_0$ and conclude that there is no significant difference between

the effects of treatments.
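
Problem 4 gives the data plot by plot, so the first step is to collect the yields of each treatment. A minimal sketch of that regrouping followed by the one-way sums of squares is shown below; the variable names are illustrative only.

```python
from collections import defaultdict

# Plot-wise records from Problem 4 (plots 1 to 10, in order).
treatments = "A B C A C C A B A B".split()
yields = [5, 4, 3, 7, 5, 1, 3, 4, 1, 7]

# Collect the yields belonging to each treatment.
groups = defaultdict(list)
for label, value in zip(treatments, yields):
    groups[label].append(value)

n_total = len(yields)
cf = sum(yields) ** 2 / n_total                     # correction factor T^2 / N
sst = sum(y * y for y in yields) - cf
ssc = sum(sum(g) ** 2 / len(g) for g in groups.values()) - cf
sse = sst - ssc
f_ratio = (ssc / (len(groups) - 1)) / (sse / (n_total - len(groups)))
print(dict(groups), sst, ssc, sse, round(f_ratio, 3))
```

The regrouped lists reproduce the "Treatment / Replications" form of the problem, up to the order of the replications.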

Practice Problems:

1. The following table shows the lives in hours of four brands of electric lamps:

Brand A 1610 1610 1650 1680 1700 1720 1800

Brand B 1580 1640 1640 1700 1750

Brand C 1460 1550 1600 1620 1640 1660 1740 1820

Brand D 1510 1520 1530 1570 1600 1680

Perform an analysis of variance and test the homogeneity of the mean lives of the

four brands of lamps.

2. Suppose that a random sample of n = 5 was selected from the vineyard properties for

sale in Sonoma County, California, in each of three years. The following data are

consistent with summary information on price per acre for disease-resistant grape

vineyards in Sonoma County. Carry out an ANOVA to determine whether there is

evidence to support the claim that the mean price per acre for vineyard land in

Sonoma County was not the same for each of the three years considered. Test at the

0.05 level and at the 0.01 level.

1996: 30000 34000 36000 38000 40000

1997: 30000 35000 37000 38000 40000

1998: 40000 41000 43000 44000 50000

3. The accompanying data resulted from an experiment comparing the degree of

soiling for fabric copolymerized with the 3 different mixtures of methacrylic acid.

Analyse the classification

Mixture 1 0.56 1.12 0.90 1.07 0.94

Mixture 2 0.72 0.69 0.87 0.78 0.91

Mixture 3 0.62 1.08 1.07 0.99 0.93

Two Way Classification

Randomized Block Design (RBD)

With a randomized block design, the experimenter divides subjects into subgroups

called blocks, such that the variability within blocks is less than the variability between

blocks. Then, subjects within each block are randomly assigned to treatment conditions.

Compared to a completely randomized design, this design reduces variability within

treatment conditions and potential confounding, producing a better estimate of treatment

effects.

The table below shows a randomized block design for a hypothetical medical

experiment.

Gender       Treatment
             Placebo   Vaccine
Male           250       250
Female         250       250

Subjects are assigned to blocks, based on gender. Then, within each block, subjects

are randomly assigned to treatments (either a placebo or a cold vaccine). For this design,

250 men get the placebo, 250 men get the vaccine, 250 women get the placebo, and 250

women get the vaccine.

It is known that men and women are physiologically different and react differently to

medication. This design ensures that each treatment condition has an equal proportion of

men and women. As a result, differences between treatment conditions cannot be

attributed to gender. This randomized block design removes gender as a potential source

of variability and as a potential confounding variable.
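
The assignment step described above (randomize separately inside each block) can be sketched as follows. The function name `randomize_within_blocks`, the subject identifiers and the seed are hypothetical; only the block sizes and the two treatments come from the example.

```python
import random


def randomize_within_blocks(blocks, treatments, seed=None):
    """Assign treatments at random, separately inside each block.

    blocks: dict mapping block name -> list of subject ids.
    Returns a dict mapping subject id -> (block, treatment).
    """
    rng = random.Random(seed)
    assignment = {}
    for block, subjects in blocks.items():
        shuffled = list(subjects)
        rng.shuffle(shuffled)
        # Deal treatments round-robin over the shuffled order so each
        # treatment gets an equal share of the block.
        for position, subject in enumerate(shuffled):
            assignment[subject] = (block, treatments[position % len(treatments)])
    return assignment


# Hypothetical subject ids: 500 men and 500 women, as in the example.
blocks = {
    "Male": [f"M{i}" for i in range(500)],
    "Female": [f"F{i}" for i in range(500)],
}
plan = randomize_within_blocks(blocks, ["Placebo", "Vaccine"], seed=1)

counts = {}
for block, treatment in plan.values():
    counts[(block, treatment)] = counts.get((block, treatment), 0) + 1
print(counts)
```

Because the shuffle happens inside each block, every treatment group ends up with 250 men and 250 women regardless of the seed.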

Advantages of randomized block designs

The precision is more in RBD.

The amount of information obtained in RBD is more as compared to CRD.

RBD is more flexible. Statistical analysis is simple and easy.

Even if some values are missing, still the analysis can be done by using missing

plot technique.

Disadvantages of randomized complete block designs

When the number of treatments is increased, the block size will increase.

If the block size is large maintaining homogeneity is difficult and hence when more

number of treatments is present this design may not be suitable.

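ANOVA table for two-way classification ($r$ rows, $c$ columns)

Source of variation   Sum of squares   Degrees of freedom   Mean sum of squares          F-ratio
Between columns       SSC              $c - 1$              MSC = SSC/$(c - 1)$          $F_C$ = MSC/MSE
Between rows          SSR              $r - 1$              MSR = SSR/$(r - 1)$          $F_R$ = MSR/MSE
Error                 SSE              $(r - 1)(c - 1)$     MSE = SSE/$(r - 1)(c - 1)$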
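
For the two-way layout the shortcut formulas take the form sketched below. The notation is assumed here rather than recovered from the notes: $x_{ij}$ is the observation in row $i$ and column $j$, $T_{i\cdot}$ and $T_{\cdot j}$ are the row and column totals, and $T$ is the grand total of the $rc$ observations.

```latex
\begin{aligned}
T   &= \sum_{i=1}^{r}\sum_{j=1}^{c} x_{ij}, \qquad \text{C.F.} = \frac{T^{2}}{rc},\\
SST &= \sum_{i}\sum_{j} x_{ij}^{2} - \frac{T^{2}}{rc},\\
SSC &= \frac{1}{r}\sum_{j=1}^{c} T_{\cdot j}^{2} - \frac{T^{2}}{rc},\\
SSR &= \frac{1}{c}\sum_{i=1}^{r} T_{i\cdot}^{2} - \frac{T^{2}}{rc},\\
SSE &= SST - SSC - SSR.
\end{aligned}
```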

Problem: 5

The following data represents the number of units of production per day turned out

by different workers using 4 different types of machines.

              Machine Type
Workers     A     B     C     D
   1       44    38    47    36
   2       46    40    52    43
   3       34    36    44    32
   4       43    38    46    33
   5       38    42    49    39

1. Test whether the five men differ with respect to mean productivity and

2. Test whether the mean productivity is the same for the four different machine types.

Answer:

Null hypothesis:

The 5 workers do not differ with respect to mean productivity

The mean productivity is the same for the four different machines.

To simplify calculation let us subtract 40 from each value, the new values are

              Machine Type
Workers     A     B     C     D    Total
   1        4    -2     7    -4      5
   2        6     0    12     3     21
   3       -6    -4     4    -8    -14
   4        3    -2     6    -7      0
   5       -2     2     9    -1      8
Total       5    -6    38   -17     20

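$N = 20$, $T = 20$ (coded values), correction factor $= \dfrac{T^2}{N} = \dfrac{(20)^2}{20} = 20$

Total sum of squares SST $= \sum\sum x_{ij}^2 - \dfrac{T^2}{N}$

$= [(4)^2 + (-2)^2 + (7)^2 + (-4)^2 + (6)^2 + (0)^2 + (12)^2 + (3)^2 + (-6)^2 + (-4)^2 + (4)^2 + (-8)^2 + (3)^2 + (-2)^2 + (6)^2 + (-7)^2 + (-2)^2 + (2)^2 + (9)^2 + (-1)^2] - 20$

$= 594 - 20 = 574$

Between machines (column) sum of squares SSC

$= \dfrac{(5)^2}{5} + \dfrac{(-6)^2}{5} + \dfrac{(38)^2}{5} + \dfrac{(-17)^2}{5} - 20 = 358.8 - 20 = 338.8$

Between workers (row) sum of squares SSR

$= \dfrac{(5)^2}{4} + \dfrac{(21)^2}{4} + \dfrac{(-14)^2}{4} + \dfrac{(0)^2}{4} + \dfrac{(8)^2}{4} - 20 = 181.5 - 20 = 161.5$

Error sum of squares SSE $= SST - SSC - SSR = 574 - 338.8 - 161.5 = 73.7$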

ANOVA table

Source of variation   Degrees of freedom   Sum of squares (SS)   Mean sum of squares (MS)   Variance ratio (F-ratio)
Machines                      3                  338.8                  112.93                  $F_C = 18.39$
Workers                       4                  161.5                   40.38                  $F_R = 6.57$
Error                        12                   73.7                    6.14
Total                        19                  574.0

Table value:

$F_{0.05}(3, 12) = 3.49$ and $F_{0.05}(4, 12) = 3.26$ (5% level)

Conclusion:

$F_C = 18.39 > 3.49$. Hence $H_0$ is rejected. That is, the mean productivity is not the same

for the four machines.

$F_R = 6.57 > 3.26$. Hence $H_0$ is rejected. That is, the mean productivity is not the same

for the five workers.
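
The two-way calculation of Problem 5 can be reproduced with the sketch below. It assumes one observation per cell and uses only the standard library; `two_way_anova` and its result keys are names chosen for this illustration.

```python
def two_way_anova(table):
    """Two-way ANOVA (one observation per cell) for a list of rows."""
    r, c = len(table), len(table[0])
    n_total = r * c
    grand_total = sum(sum(row) for row in table)
    cf = grand_total ** 2 / n_total                       # correction factor
    sst = sum(x * x for row in table for x in row) - cf
    ssr = sum(sum(row) ** 2 for row in table) / c - cf    # between rows
    ssc = sum(sum(col) ** 2 for col in zip(*table)) / r - cf  # between columns
    sse = sst - ssr - ssc
    mse = sse / ((r - 1) * (c - 1))
    return {"SST": sst, "SSC": ssc, "SSR": ssr, "SSE": sse,
            "F_columns": (ssc / (c - 1)) / mse,
            "F_rows": (ssr / (r - 1)) / mse}


# Rows are workers 1-5, columns are machine types A-D (Problem 5).
production = [
    [44, 38, 47, 36],
    [46, 40, 52, 43],
    [34, 36, 44, 32],
    [43, 38, 46, 33],
    [38, 42, 49, 39],
]
result = two_way_anova(production)
print({name: round(value, 2) for name, value in result.items()})
```

Each F-ratio is then compared with the tabulated value for its own degrees of freedom, $(c-1, (r-1)(c-1))$ for columns and $(r-1, (r-1)(c-1))$ for rows.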

Problem: 6

A company appoints 4 salesmen A, B, C and D and observes their sales in 3 seasons:

summer, winter and monsoon. The figures (in lakhs of Rs.) are given in the following table:

Salesmen

Season A B C D

Summer 45 40 38 37

Winter 43 41 45 38

Monsoon 39 39 41 41

Carry out an analysis of variance.

Answer:

Null hypothesis:

There is no significant difference between the sales in the three seasons

There is no significant difference between the four salesmen

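To simplify calculation let us subtract 40 from each value, the new values are

               Salesmen
Season       A     B     C     D    Total
Summer       5     0    -2    -3      0
Winter       3     1     5    -2      7
Monsoon     -1    -1     1     1      0
Total        7     0     4    -4      7

$N = 12$, $T = 7$, correction factor $= \dfrac{T^2}{N} = \dfrac{(7)^2}{12} = 4.083$

Total sum of squares SST $= \sum\sum x_{ij}^2 - \dfrac{T^2}{N}$

$= [(5)^2 + (0)^2 + (-2)^2 + (-3)^2 + (3)^2 + (1)^2 + (5)^2 + (-2)^2 + (-1)^2 + (-1)^2 + (1)^2 + (1)^2] - 4.083 = 81 - 4.083 = 76.917$

Between salesmen (column) sum of squares SSC

$= \dfrac{(7)^2}{3} + \dfrac{(0)^2}{3} + \dfrac{(4)^2}{3} + \dfrac{(-4)^2}{3} - 4.083 = 27 - 4.083 = 22.917$

Between seasons (row) sum of squares SSR

$= \dfrac{(0)^2}{4} + \dfrac{(7)^2}{4} + \dfrac{(0)^2}{4} - 4.083 = 12.25 - 4.083 = 8.167$

Error sum of squares SSE $= SST - SSC - SSR = 76.917 - 22.917 - 8.167 = 45.833$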

ANOVA table

Source of variation   Degrees of freedom   Sum of squares (SS)   Mean sum of squares (MS)   Variance ratio (F-ratio)
Salesmen                      3                  22.917                  7.639                  $F_C = 1.00$
Seasons                       2                   8.167                  4.0835                 $F_R = 0.53$
Error                         6                  45.833                  7.639
Total                        11                  76.917

Table value:

$F_{0.05}(3, 6) = 4.76$ and $F_{0.05}(2, 6) = 5.14$ (5% level)

Conclusion:

$F_C = 1.00 < 4.76$. Hence we accept the null hypothesis. That is, there is no difference

between the sales of the four salesmen.

$F_R = 0.53 < 5.14$. Hence we accept the null hypothesis. That is, there is no difference

between the sales in the seasons.

Problem: 7

Analyse the following RBD and draw your conclusion.

Treatments (columns), Blocks (rows)

12   14   20   22
17   27   19   15
15   14   17   12
18   16   22   12
19   15   20   14

Answer:

Null hypothesis:

There is no significant difference between treatments and blocks

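To simplify calculation let us subtract 15 from each value, the new values are (treatments in columns, blocks in rows, numbered by position)

            Treatments
Blocks      1     2     3     4    Total
  1        -3    -1     5     7      8
  2         2    12     4     0     18
  3         0    -1     2    -3     -2
  4         3     1     7    -3      8
  5         4     0     5    -1      8
Total       6    11    23     0     40

$N = 20$, $T = 40$, correction factor $= \dfrac{T^2}{N} = \dfrac{(40)^2}{20} = 80$

Total sum of squares SST $= \sum\sum x_{ij}^2 - \dfrac{T^2}{N}$

$= [(-3)^2 + (-1)^2 + (5)^2 + (7)^2$

$+ (2)^2 + (12)^2 + (4)^2 + (0)^2$

$+ (0)^2 + (-1)^2 + (2)^2 + (-3)^2$

$+ (3)^2 + (1)^2 + (7)^2 + (-3)^2$

$+ (4)^2 + (0)^2 + (5)^2 + (-1)^2] - 80 = 372 - 80 = 292$

Between column sum of squares SSC $= \dfrac{(6)^2 + (11)^2 + (23)^2 + (0)^2}{5} - 80 = 137.2 - 80 = 57.2$

Between row sum of squares SSR $= \dfrac{(8)^2 + (18)^2 + (-2)^2 + (8)^2 + (8)^2}{4} - 80 = 130 - 80 = 50$

Error sum of squares SSE $= SST - SSC - SSR = 292 - 57.2 - 50 = 184.8$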

ANOVA table

Source of variation   Degrees of freedom   Sum of squares (SS)   Mean sum of squares (MS)   Variance ratio (F-ratio)
Treatments                    3                   57.2                   19.07                  $F_C = 1.24$
Blocks                        4                   50.0                   12.50                  $F_R = 0.81$
Error                        12                  184.8                   15.40
Total                        19                  292.0

Table value:

$F_{0.05}(3, 12) = 3.49$ and $F_{0.05}(4, 12) = 3.26$ (5% level)

Conclusion:

$F_C = 1.24 < 3.49$. Hence the null hypothesis is accepted, that is there is no significant

difference between treatments.

$F_R = 0.81 < 3.26$. Hence the null hypothesis is accepted, that is there is no significant

difference between blocks.

Problem: 8

A set of data involving four tropical feed stuffs A, B, C, D tried on 20 chicks is

given below. All the twenty chicks are treated alike in all respects except the feeding

treatments and each feeding treatment is given to 5 chicks. Analyze the data. Weight gain

of baby chicks fed on different feeding materials composed of tropical feed stuffs.

Feed     Weight gain                      Total
A         55    49    42    21    52       219
B         61   112    30    89    63       355
C         42    97    81    95    92       407
D        169   137   169    85   154       714
Grand Total                               1695

Answer:

Null hypothesis:

There is no significant difference between rows and columns.

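Feed        1     2     3     4     5    Total
A          55    49    42    21    52     219
B          61   112    30    89    63     355
C          42    97    81    95    92     407
D         169   137   169    85   154     714
Total     327   395   322   290   361    1695

$N = 20$, $T = 1695$, correction factor $= \dfrac{T^2}{N} = \dfrac{(1695)^2}{20} = 143651.25$

Total sum of squares SST $= \sum\sum x_{ij}^2 - \dfrac{T^2}{N}$

$= [10335 + 29055 + 35223 + 106832] - 143651.25 = 181445 - 143651.25 = 37793.75$

Between column sum of squares SSC $= \dfrac{(327)^2 + (395)^2 + (322)^2 + (290)^2 + (361)^2}{4} - 143651.25 = 145264.75 - 143651.25 = 1613.50$

Between row sum of squares SSR $= \dfrac{(219)^2 + (355)^2 + (407)^2 + (714)^2}{5} - 143651.25 = 169886.20 - 143651.25 = 26234.95$

Error sum of squares SSE $= SST - SSC - SSR = 37793.75 - 1613.50 - 26234.95 = 9945.30$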

ANOVA table

Source of variation   Degrees of freedom   Sum of squares (SS)   Mean sum of squares (MS)   Variance ratio (F-ratio)
Between columns               4                 1613.50                 403.38                  $F_C = 0.49$
Between rows                  3                26234.95                8744.98                  $F_R = 10.55$
Error                        12                 9945.30                 828.78
Total                        19                37793.75

Table value:

$F_{0.05}(4, 12) = 3.26$ and $F_{0.05}(3, 12) = 3.49$ (5% level)

Conclusion:

$F_C = 0.49 < 3.26$. Hence the null hypothesis is accepted, that is there is no significant

difference between columns.

$F_R = 10.55 > 3.49$. Hence the null hypothesis is rejected, that is there is some

significant difference between rows (the feeding treatments).

Practice problems:

1. The following data represent the time taken by a certain person to travel to work from Monday to Friday by four

different routes.

Days

Routes

Mon Tue Wed Thu Fri

1 22 26 25 25 31

2 25 27 28 26 29

3 26 29 33 30 33

4 26 28 27 30 30

Test at 5% level of significance whether the differences among the means obtained for the

different routes are significant and whether the differences among the means obtained for

the different days of the week are significant.

2. The sales of 4 salesmen in 3 seasons are tabulated here. Carry out an analysis of

variance.

Salesmen

Season A B C D

Summer 36 36 21 35

Winter 28 29 31 32

Monsoon 26 28 29 29

3. Perform a two-way ANOVA on the data given below:

                  Treatment 1
Treatment 2      1     2     3
     1          30    26    38
     2          24    29    28
     3          33    24    35
     4          36    31    30
     5          27    35    33

4. Three varieties of coal were analyzed by four chemists and the ash content is tabulated

below. Perform an analysis of variance.

Chemists

Coal A B C D

I 8 5 5 7

II 7 6 4 4

III 3 6 5 4

Three Way Classification

Latin Square Design (LSD)

A Latin square design is a method of placing treatments so that they appear in a

balanced fashion within a square block or field. Treatments appear once in each row and

column. Replicates are also included in this design.

Treatments are assigned at random within rows and columns, with each treatment once

per row and once per column.

There are equal numbers of rows, columns, and treatments.

Useful where the experimenter desires to control variation in two different directions
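
The defining property listed above (each treatment once per row and once per column, with equal numbers of rows, columns and treatments) is easy to check mechanically. The sketch below builds a simple cyclic square and tests the property; both function names are illustrative. The cyclic layout is only a starting point and is an assumption of this example: in practice the rows, columns and treatment labels would still be permuted at random.

```python
def cyclic_latin_square(treatments):
    """Build an n x n layout by shifting the treatment list one place per row."""
    n = len(treatments)
    return [[treatments[(i + j) % n] for j in range(n)] for i in range(n)]


def is_latin_square(layout):
    """True if every treatment occurs exactly once in each row and column."""
    n = len(layout)
    symbols = set(layout[0])
    if len(symbols) != n or any(len(row) != n for row in layout):
        return False
    rows_ok = all(set(row) == symbols for row in layout)
    cols_ok = all(set(col) == symbols for col in zip(*layout))
    return rows_ok and cols_ok


square = cyclic_latin_square(["A", "B", "C", "D"])
for row in square:
    print(" ".join(row))
print(is_latin_square(square))

broken = [row[:] for row in square]
broken[0][0] = "B"   # A is now missing from row 1 and column 1
print(is_latin_square(broken))
```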

The Latin square design, perhaps, represents the most popular alternative design when

two (or more) blocking factors need to be controlled for. A Latin square design is actually

an extreme example of an incomplete block design, with any combination of levels

involving the two blocking factors assigned to one treatment only, rather than to all!

Advantages of Latin square design

Greater power than the RBD when there are two external sources of variation.

Easy to analyze.

Disadvantages of Latin square design

The number of treatments, rows and columns must be the same.

Squares smaller than 5×5 are not practical because of the small number of degrees of

freedom for error.

The effect of each treatment must be approximately the same across rows and

columns.

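ANOVA table for three-way classification ($n \times n$ Latin square)

Source of variation   Sum of squares   Degrees of freedom   Mean sum of squares          F-ratio
Between columns       SSC              $n - 1$              MSC = SSC/$(n - 1)$          $F_C$ = MSC/MSE
Between rows          SSR              $n - 1$              MSR = SSR/$(n - 1)$          $F_R$ = MSR/MSE
Between treatments    SSK              $n - 1$              MSK = SSK/$(n - 1)$          $F_K$ = MSK/MSE
Error                 SSE              $(n - 1)(n - 2)$     MSE = SSE/$(n - 1)(n - 2)$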
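
For an $n \times n$ Latin square the sums of squares follow the same shortcut pattern, with one extra term for treatments. The notation is assumed for this sketch: $T_{i\cdot}$, $T_{\cdot j}$ and $T_k$ are the row, column and treatment totals, and $T$ is the grand total of the $n^2$ observations.

```latex
\begin{aligned}
\text{C.F.} &= \frac{T^{2}}{n^{2}},\\
SST &= \sum_{i}\sum_{j} x_{ij}^{2} - \frac{T^{2}}{n^{2}},\\
SSC &= \frac{1}{n}\sum_{j=1}^{n} T_{\cdot j}^{2} - \frac{T^{2}}{n^{2}},\qquad
SSR  = \frac{1}{n}\sum_{i=1}^{n} T_{i\cdot}^{2} - \frac{T^{2}}{n^{2}},\\
SSK &= \frac{1}{n}\sum_{k=1}^{n} T_{k}^{2} - \frac{T^{2}}{n^{2}},\\
SSE &= SST - SSC - SSR - SSK.
\end{aligned}
```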

Problem: 9

Set up the analysis of variance for the following results of a Latin Square Design. Use

0.01 level of significance.

A 12   C 19   B 10   D 8
C 18   B 12   D 6    A 7
B 22   D 10   A 5    C 21
D 12   A 7    C 27   B 17

Answer:

Null hypothesis:

There is no significant difference between rows, columns and treatments.

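Rows (i) / Columns (j)      1     2     3     4    Total
1                          12    19    10     8      49
2                          18    12     6     7      43
3                          22    10     5    21      58
4                          12     7    27    17      63
Total                      64    48    48    53     213

Treatment totals: A $= 12 + 7 + 5 + 7 = 31$, B $= 10 + 12 + 22 + 17 = 61$, C $= 19 + 18 + 21 + 27 = 85$, D $= 8 + 6 + 10 + 12 = 36$

$N = 16$, $T = 213$, correction factor $= \dfrac{T^2}{N} = \dfrac{(213)^2}{16} = 2835.56$

Total sum of squares SST $= \sum\sum x_{ij}^2 - \dfrac{T^2}{N} = [669 + 553 + 1050 + 1211] - 2835.56 = 3483 - 2835.56 = 647.44$

Between column sum of squares SSC $= \dfrac{(64)^2 + (48)^2 + (48)^2 + (53)^2}{4} - 2835.56 = 2878.25 - 2835.56 = 42.69$

Between row sum of squares SSR $= \dfrac{(49)^2 + (43)^2 + (58)^2 + (63)^2}{4} - 2835.56 = 2895.75 - 2835.56 = 60.19$

Between treatment sum of squares SSK $= \dfrac{(31)^2 + (61)^2 + (85)^2 + (36)^2}{4} - 2835.56 = 3300.75 - 2835.56 = 465.19$

Error sum of squares SSE $= SST - SSC - SSR - SSK = 647.44 - 42.69 - 60.19 - 465.19 = 79.37$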

ANOVA TABLE

Source of variation   D.o.f   Sum of squares (SS)   Mean sum of squares (MS)   Variance ratio (F-ratio)
Between columns         3       SSC = 42.69               14.23                  $F_C = 1.08$
Between rows            3       SSR = 60.19               20.06                  $F_R = 1.52$
Between treatments      3       SSK = 465.19             155.06                  $F_K = 11.72$
Error                   6       SSE = 79.37               13.23
Total                  15       647.44

Table value:

$F_{0.01}(3, 6) = 9.78$

Conclusion:

Since $F_C = 1.08 < 9.78$ and $F_R = 1.52 < 9.78$, we accept the null hypothesis; that is,

there is no significant difference between the rows or between the columns.

The calculated value $F_K = 11.72 > 9.78$, so we reject the null hypothesis; that is, there is

some significant difference between the treatments.
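
The Latin square calculation of Problem 9 can be cross-checked with the sketch below. It is a minimal standard-library illustration; `latin_square_anova` and its result keys are names invented for this example, and the treatment layout is passed alongside the observations.

```python
def latin_square_anova(values, layout):
    """ANOVA for an n x n Latin square.

    values: n x n list of observations; layout: n x n list of treatment labels.
    """
    n = len(values)
    cf = sum(sum(row) for row in values) ** 2 / (n * n)    # correction factor
    sst = sum(x * x for row in values for x in row) - cf
    ssr = sum(sum(row) ** 2 for row in values) / n - cf
    ssc = sum(sum(col) ** 2 for col in zip(*values)) / n - cf
    totals = {}                                            # treatment totals
    for value_row, label_row in zip(values, layout):
        for x, label in zip(value_row, label_row):
            totals[label] = totals.get(label, 0) + x
    ssk = sum(t ** 2 for t in totals.values()) / n - cf
    sse = sst - ssc - ssr - ssk
    mse = sse / ((n - 1) * (n - 2))
    f = {name: (ss / (n - 1)) / mse
         for name, ss in (("columns", ssc), ("rows", ssr), ("treatments", ssk))}
    return {"SSC": ssc, "SSR": ssr, "SSK": ssk, "SSE": sse, "SST": sst, "F": f}


# Data and layout from Problem 9.
values = [[12, 19, 10, 8], [18, 12, 6, 7], [22, 10, 5, 21], [12, 7, 27, 17]]
layout = [list("ACBD"), list("CBDA"), list("BDAC"), list("DACB")]
result = latin_square_anova(values, layout)
print(result)
```

All three F-ratios share the error mean square with $(n-1)(n-2)$ degrees of freedom, so a single table value serves for rows, columns and treatments.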

Problem: 10

Analyze the variance in the Latin Square of yields (in kgs) of paddy where P,Q,R,S

denote the different methods of cultivation.

S 122   P 121   R 123   Q 122
Q 124   R 123   P 122   S 125
P 120   Q 119   S 120   R 121
R 122   S 123   Q 121   P 122

Examine whether the different methods of cultivation have given significantly

different yields.

Answer:

Null hypothesis:

There is no significant difference between the different methods of cultivation.

To simplify calculations, we subtract 120 from the given values.

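Rows (i) / Columns (j)      1       2       3       4     Total
1                          S 2     P 1     R 3     Q 2       8
2                          Q 4     R 3     P 2     S 5      14
3                          P 0     Q -1    S 0     R 1       0
4                          R 2     S 3     Q 1     P 2       8
Total                       8       6       6      10       30

Treatment totals: P $= 1 + 2 + 0 + 2 = 5$, Q $= 2 + 4 - 1 + 1 = 6$, R $= 3 + 3 + 1 + 2 = 9$, S $= 2 + 5 + 0 + 3 = 10$

$N = 16$, $T = 30$, correction factor $= \dfrac{T^2}{N} = \dfrac{(30)^2}{16} = 56.25$

Total sum of squares SST $= \sum\sum x_{ij}^2 - \dfrac{T^2}{N} = [18 + 54 + 2 + 18] - 56.25 = 92 - 56.25 = 35.75$

Between column sum of squares SSC $= \dfrac{(8)^2 + (6)^2 + (6)^2 + (10)^2}{4} - 56.25 = 59 - 56.25 = 2.75$

Between row sum of squares SSR $= \dfrac{(8)^2 + (14)^2 + (0)^2 + (8)^2}{4} - 56.25 = 81 - 56.25 = 24.75$

Between treatment sum of squares SSK $= \dfrac{(5)^2 + (6)^2 + (9)^2 + (10)^2}{4} - 56.25 = 60.5 - 56.25 = 4.25$

Error sum of squares SSE $= SST - SSC - SSR - SSK = 35.75 - 2.75 - 24.75 - 4.25 = 4$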

ANOVA TABLE

Source of variation   D.o.f   Sum of squares (SS)   Mean sum of squares (MS)   Variance ratio (F-ratio)
Between columns         3           2.75                  0.917                  $F_C = 1.375$
Between rows            3          24.75                  8.25                   $F_R = 12.375$
Between treatments      3           4.25                  1.417                  $F_K = 2.125$
Error                   6           4                     0.667
Total                  15          35.75

Table value:

$F_{0.05}(3, 6) = 4.76$ (5% level)

Conclusion:

Since $F_R = 12.375 > 4.76$, we reject the null hypothesis; that is, there is some significant

difference between the rows.

Since $F_C = 1.375 < 4.76$ and $F_K = 2.125 < 4.76$, we accept the null hypothesis; that is, there

is no significant difference between the columns or between the treatments (methods of cultivation).

Problem: 11

The figures in the following 5 × 5 Latin square are the numbers of minutes that five engines

(columns), tuned up by five mechanics (rows), ran with a gallon

of fuel A, B, C, D and E.

A 31   B 24   C 20   D 20   E 18
B 21   C 27   D 23   E 25   A 31
C 21   D 27   E 25   A 29   B 21
D 21   E 25   A 33   B 25   C 22
E 21   A 37   B 24   C 24   D 20

Use the level of significance to test

The null hypothesis that there is no difference in the performance of the five

engines.

that the persons who tuned up these engines have no effect on their performance.

that the engines perform equally well with each of the fuels.

Answer:

Null hypothesis:

There is no significant difference between the engines, persons and fuels.

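To simplify calculations, we subtract 25 from the given values.

Rows / Columns      1       2       3       4       5     Total
1                  A 6     B -1    C -5    D -5    E -7    -12
2                  B -4    C 2     D -2    E 0     A 6       2
3                  C -4    D 2     E 0     A 4     B -4     -2
4                  D -4    E 0     A 8     B 0     C -3      1
5                  E -4    A 12    B -1    C -1    D -5      1
Total              -10      15       0      -2     -13     -10

Treatment totals: A $= 36$, B $= -10$, C $= -11$, D $= -14$, E $= -11$

$N = 25$, $T = -10$, correction factor $= \dfrac{T^2}{N} = \dfrac{(-10)^2}{25} = 4$

Total sum of squares SST $= \sum\sum x_{ij}^2 - \dfrac{T^2}{N} = [136 + 60 + 52 + 89 + 187] - 4 = 524 - 4 = 520$

Between column sum of squares SSC

$= \dfrac{(-10)^2 + (15)^2 + (0)^2 + (-2)^2 + (-13)^2}{5} - 4 = 99.6 - 4 = 95.6$

Between row sum of squares SSR

$= \dfrac{(-12)^2 + (2)^2 + (-2)^2 + (1)^2 + (1)^2}{5} - 4 = 30.8 - 4 = 26.8$

Between treatment sum of squares SSK

$= \dfrac{(36)^2 + (-10)^2 + (-11)^2 + (-14)^2 + (-11)^2}{5} - 4 = 366.8 - 4 = 362.8$

Error sum of squares SSE $= SST - SSC - SSR - SSK = 520 - 95.6 - 26.8 - 362.8 = 34.8$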

ANOVA TABLE

Source of variation   D.o.f   Sum of squares (SS)   Mean sum of squares (MS)   Variance ratio (F-ratio)
Between columns         4          95.6                  23.9                    $F_C = 8.24$
Between rows            4          26.8                   6.7                    $F_R = 2.31$
Between treatments      4         362.8                  90.7                    $F_K = 31.28$
Error                  12          34.8                   2.9
Total                  24         520.0

Table value:

$F_{0.05}(4, 12) = 3.26$ and $F_{0.01}(4, 12) = 5.41$

Conclusion:

$F_R = 2.31$ is below both table values, so we accept the null hypothesis and there is no significant difference

between the performance of the five mechanics.

$F_C = 8.24$ and $F_K = 31.28$ exceed both table values, so we reject the null hypothesis and there is some

significant difference between the performance of the five engines and between the fuels.

Problem: 12

Four cars and four drivers are employed in a study for possible differences between

four gasoline additives(A, B, C, D). Even though cars can be identical models, slight

systematic differences are likely to occur in their performance, and even though each

driver may do his best to drive the car in the manner required by the test, slight systematic

differences can occur from driver to driver. It would be desirable to eliminate both the

car-to-car and driver-to-driver differences. Carry out the analysis of variance.

Cars

Drivers 1 2 3 4

1 A 24 B 26 D 20 C 25

2 D 23 C 26 A 20 B 27

3 B 15 D 13 C 16 A 16

4 C 17 A 15 B 20 D 20

Use the level of significance to test

Answer: Null hypothesis:

There is no significant difference between the gasoline additives, cars and drivers.

To simplify calculations, we subtract 20 from the given values.

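             Car 1    Car 2    Car 3    Car 4    Total
Driver 1     A 4      B 6      D 0      C 5        15
Driver 2     D 3      C 6      A 0      B 7        16
Driver 3     B -5     D -7     C -4     A -4      -20
Driver 4     C -3     A -5     B 0      D 0        -8
Total         -1        0       -4        8         3

Treatment totals: A $= 4 + 0 - 4 - 5 = -5$, B $= 6 + 7 - 5 + 0 = 8$, C $= 5 + 6 - 4 - 3 = 4$, D $= 0 + 3 - 7 + 0 = -4$

$N = 16$, $T = 3$, correction factor $= \dfrac{T^2}{N} = \dfrac{(3)^2}{16} = 0.5625$

Total sum of squares SST $= \sum\sum x_{ij}^2 - \dfrac{T^2}{N} = [77 + 94 + 106 + 34] - 0.5625 = 311 - 0.5625 = 310.44$

Between column sum of squares SSC

$= \dfrac{(-1)^2 + (0)^2 + (-4)^2 + (8)^2}{4} - 0.5625 = 20.25 - 0.5625 = 19.69$

Between row sum of squares SSR

$= \dfrac{(15)^2 + (16)^2 + (-20)^2 + (-8)^2}{4} - 0.5625 = 236.25 - 0.5625 = 235.69$

Between treatment sum of squares SSK

$= \dfrac{(-5)^2 + (8)^2 + (4)^2 + (-4)^2}{4} - 0.5625 = 30.25 - 0.5625 = 29.69$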

Error sum of squares SSE $= SST - SSC - SSR - SSK = 310.4375 - 19.6875 - 235.6875 - 29.6875 = 25.38$

ANOVA TABLE

Source of variation   D.o.f   Sum of squares (SS)   Mean sum of squares (MS)   Variance ratio (F-ratio)
Between columns         3          19.69                  6.56                   $F_C = 1.55$
Between rows            3         235.69                 78.56                   $F_R = 18.58$
Between treatments      3          29.69                  9.90                   $F_K = 2.34$
Error                   6          25.38                  4.23
Total                  15         310.44

Table value:

$F_{0.05}(3, 6) = 4.76$ and $F_{0.01}(3, 6) = 9.78$

Conclusion:

$F_R = 18.58$ exceeds both table values, so we reject the null hypothesis and there is some significant difference

between the performance of the four drivers.

$F_C = 1.55$ and $F_K = 2.34$ are below both table values, so we accept the null hypothesis and there is no

significant difference between the cars or between the gasoline additives.

Practice problems:

1. A varietal trial was conducted on wheat with 4 varieties in a Latin square design.

The plan of the experiment is given below. Analyse data and interpret the result.

C 25 B 23 A 20 D 20

A 19 D 19 C 21 B 18

B 19 A 14 D 17 C 20

D 17 C 20 B 21 A 15

2. Four varieties A, B, C and D of a fertilizer are tested in a randomized block design

with four replicates. The plot yields in pounds are as follows.

A 12 D 20 C 16 B 10

D 18 A 14 B 11 C 14

B 12 C 15 D 19 A 13

C 16 B 11 A 15 D 20

Analyse the experimental yield.

Two Marks

1. Define Analysis of variance.

Answer:

Analysis of variance is a collection of statistical models and their associated

estimation procedures used to analyze the differences among group means in a sample.

ANOVA was developed by statistician and evolutionary biologist Ronald Fisher.

2. State the basic principles of design of Experiments

Answer:

The major three principles of experimental designs are:

Replication : to provide an estimate of experimental error.

Randomization : to ensure that this estimate is statistically valid.

Local control : to reduce experimental error by making the experiment more efficient.

3. What do you understand by “Design of an experiment”?

Answer:

Design of experiments (DOE) is a systematic method to determine the relationship

between factors affecting a process and the output of that process.

In other words, it is used to find cause-and-effect relationships. This information is

needed to manage process inputs in order to optimize the output.

4. What is the aim of the design of experiment?

Answer:

Design of experiments (DOE) is a systematic method to determine the relationship

between factors affecting a process and the output of that process.

In other words, it is used to find cause-and-effect relationships. This information is

needed to manage process inputs in order to optimize the output.

5. State the assumptions involved in ANOVA.

Answer:

Normality: the experimental errors are normally distributed.

Homogeneity of variances (homoscedasticity): the variances of the treatments are equal.

Independence of samples: each sample is randomly selected and independent of the others.

6. What are the basic steps in ANOVA?

Answer:

Set up the hypotheses.

Determine the level of significance.

Select the appropriate test statistic.

Set up the decision rule.

Compute the test statistic.

Write the conclusion.
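The six steps above can be sketched for a one-way classification as follows. This is a minimal illustration, not a prescribed method: the function name `one_way_anova` and the variable names are hypothetical, the data are the fertilizer yields from practice problem 2 grouped by treatment only (blocks ignored), and the critical value 3.49 is the tabulated 5% value of F for (3, 12) degrees of freedom, which must be read from an F table and supplied by hand.

```python
def one_way_anova(groups):
    """Return (SSC, SSE, df_between, df_within, F) for a list of samples."""
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)
    correction = sum(all_values) ** 2 / n_total           # T^2 / N
    tss = sum(x * x for x in all_values) - correction     # total sum of squares
    ssc = sum(sum(g) ** 2 / len(g) for g in groups) - correction  # between samples
    sse = tss - ssc                                        # within samples
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_ratio = (ssc / df_between) / (sse / df_within)
    return ssc, sse, df_between, df_within, f_ratio


# Step 1: H0: all treatment means are equal; H1: at least one mean differs.
# Step 2: level of significance.
alpha = 0.05
# Step 3: the test statistic is F = MSC / MSE.
# Step 4: decision rule: reject H0 if F exceeds the tabulated value.
#         3.49 is taken from an F table for (3, 12) d.o.f. at the 5% level.
f_critical = 3.49
# Step 5: compute the test statistic (yields grouped by treatment A, B, C, D).
samples = {
    "A": [12, 14, 13, 15],
    "B": [10, 11, 12, 11],
    "C": [16, 14, 15, 16],
    "D": [20, 18, 19, 20],
}
ssc, sse, df_between, df_within, f_ratio = one_way_anova(list(samples.values()))
# Step 6: conclusion.
reject_h0 = f_ratio > f_critical
print(f"SSC={ssc}, SSE={sse}, F={f_ratio:.2f}, reject H0: {reject_h0}")
```

Passing the critical value in by hand keeps the sketch free of external libraries; a statistics package could supply it instead.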

7. When do you apply the analysis of variance technique?

Answer:

When three or more samples are considered at a time, we need to test the hypothesis

that all the samples are drawn from the same population, i.e., that they have the same

mean. In this case we use analysis of variance to test the homogeneity of several means.


8. What is a completely randomized design?

Answer:

The one-way analysis of variance or a completely randomized design is used to

determine whether there are any statistically significant differences between the means of

two or more independent (unrelated) groups (although you tend to only see it used when

there are a minimum of three, rather than two groups).

9. State any two advantages of a Completely Randomized Experimental Design.

Answer:

Complete flexibility: any number of treatments and replications can be used.

The layout and the statistical analysis are easy to carry out.

10. Write down the ANOVA table for one way classification.

Answer:

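| Source of variation | Sum of squares | Degrees of freedom | Mean sum of squares | F-ratio |
|---|---|---|---|---|
| Between samples | SSC | $c - 1$ | $MSC = SSC/(c-1)$ | $F = MSC/MSE$ |
| Within samples | SSE | $N - c$ | $MSE = SSE/(N-c)$ | |

Here $c$ is the number of samples (columns) and $N$ is the total number of observations.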
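The entries of this table can be computed with the usual shortcut formulas. The notation below is an assumption introduced for illustration: $x_{ij}$ is the $i$-th observation in sample $j$, $n_j$ is the size of sample $j$, $T_j$ is its total, $T$ is the grand total, $N$ is the total number of observations and $c$ is the number of samples.

```latex
\[
\begin{aligned}
TSS &= \sum_{j}\sum_{i} x_{ij}^{2} - \frac{T^{2}}{N}, \\
SSC &= \sum_{j} \frac{T_{j}^{2}}{n_{j}} - \frac{T^{2}}{N}, \\
SSE &= TSS - SSC, \\
F   &= \frac{SSC/(c-1)}{SSE/(N-c)}.
\end{aligned}
\]
```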

11. Define: RBD.

Answer:

With a randomized block design, the experimenter divides subjects into subgroups

called blocks, such that the variability within blocks is less than the variability between

blocks. Then, subjects within each block are randomly assigned to treatment conditions.

Compared to a completely randomized design, this design reduces variability within

treatment conditions and potential confounding, producing a better estimate of treatment

effects.


12. Write the ANOVA table for randomized block design.

Answer:

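| Source of variation | Sum of squares | Degrees of freedom | Mean sum of squares | F-ratio |
|---|---|---|---|---|
| Between columns | SSC | $c - 1$ | $MSC = SSC/(c-1)$ | $F_C = MSC/MSE$ |
| Between rows | SSR | $r - 1$ | $MSR = SSR/(r-1)$ | $F_R = MSR/MSE$ |
| Error | SSE | $(r-1)(c-1)$ | $MSE = SSE/[(r-1)(c-1)]$ | |

Here $r$ is the number of rows and $c$ is the number of columns.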
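As a rough sketch of how the entries of this table are obtained, the snippet below computes them for a two-way layout with one observation per cell. The function name `rbd_anova` and the dictionary keys are hypothetical. The data are the fertilizer yields from practice problem 2, rearranged so that each row is one replication (assumed here to be a block) and the columns are the treatments A, B, C and D.

```python
def rbd_anova(table):
    """Two-way ANOVA, one observation per cell.

    table[i][j] is the observation in row (block) i and column (treatment) j.
    """
    r, c = len(table), len(table[0])
    grand_total = sum(sum(row) for row in table)
    correction = grand_total ** 2 / (r * c)                 # T^2 / N
    tss = sum(x * x for row in table for x in row) - correction
    ssr = sum(sum(row) ** 2 for row in table) / c - correction        # between rows
    ssc = sum(sum(col) ** 2 for col in zip(*table)) / r - correction  # between columns
    sse = tss - ssr - ssc                                    # error
    df_c, df_r, df_e = c - 1, r - 1, (r - 1) * (c - 1)
    mse = sse / df_e
    return {
        "SSC": ssc, "SSR": ssr, "SSE": sse,
        "df": (df_c, df_r, df_e),
        "F_columns": (ssc / df_c) / mse,
        "F_rows": (ssr / df_r) / mse,
    }


# Rows: the four replications; columns: treatments A, B, C, D.
yields = [
    [12, 10, 16, 20],
    [14, 11, 14, 18],
    [13, 12, 15, 19],
    [15, 11, 16, 20],
]
result = rbd_anova(yields)
for name, value in result.items():
    print(name, value)
```

Each F-ratio would then be compared with the tabulated F value for its own degrees of freedom, (3, 9) for both ratios in this layout.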

13. Discuss the advantages and disadvantages of Randomized block design.

Answer:

Advantages:

RBD gives greater precision than CRD.

The amount of information obtained in RBD is greater than in CRD.

RBD is more flexible, and its statistical analysis is simple.

Disadvantages:

When the number of treatments is increased, the block size increases.

If the block size is large, maintaining homogeneity within a block is difficult; hence this

design may not be suitable when the number of treatments is large.

14. Compare one-way classification model with two-way classification model.

Answer:

| | One-way ANOVA | Two-way ANOVA |
|---|---|---|
| 1 | We cannot test two sets of hypotheses at a time. | Two sets of hypotheses can be tested at a time. |
| 2 | Data are classified according to one factor. | Data are classified according to two different factors. |


15. Write any two differences between RBD and CRD.

Answer:

| | Completely randomized design | Randomized block design |
|---|---|---|
| 1 | We cannot test two sets of hypotheses at a time. | Two sets of hypotheses can be tested at a time. |
| 2 | Data are classified according to one factor. | Data are classified according to two different factors. |

16. What is meant by Latin square?

Answer:

A Latin square design is a method of placing treatments so that they appear in a

balanced fashion within a square block or field. Treatments appear once in each row and

column. Replicates are also included in this design.

17. What are the advantages of a Latin square design?

Answer:

Greater power than the RBD when there are two external sources of variation.

Easy to analyze.

18. Is 2×2 Latin square Design possible? Why?

Answer:

No, we cannot use a 2×2 Latin square design.

Reason:

In a Latin square design with $n$ treatments, the degrees of freedom for the error sum

of squares (SSE) is $(n-1)(n-2)$.

With $n = 2$ the d.o.f. is 0, so $MSE = SSE/0$ is undefined. Hence the design is not possible.

19. Give the layout of Latin square design with four treatments.

Answer:

With four treatments A, B, C and D one typical arrangement of LSD is as follows

A B D C

B A C D

D C B A

C D A B
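The defining property, that each treatment appears exactly once in every row and every column, can be checked mechanically. The helper `is_latin_square` below is a hypothetical name used only for this sketch; it is applied to the arrangement given above, written row by row.

```python
def is_latin_square(layout):
    """Return True if every symbol occurs exactly once in each row and column."""
    n = len(layout)
    symbols = set(layout[0])
    if len(symbols) != n:
        return False
    # Each row must have n entries and contain every symbol.
    rows_ok = all(len(row) == n and set(row) == symbols for row in layout)
    if not rows_ok:
        return False
    # Each column must contain every symbol as well.
    return all(set(col) == symbols for col in zip(*layout))


# The 4x4 arrangement from the answer, one string per row.
layout = ["ABDC", "BACD", "DCBA", "CDAB"]
layout_is_valid = is_latin_square(layout)
print(layout_is_valid)
```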


20. Write the ANOVA table for Latin Square design.

Answer:

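| Source of variation | Sum of squares | Degrees of freedom | Mean sum of squares | F-ratio |
|---|---|---|---|---|
| Between columns | SSC | $n - 1$ | $MSC = SSC/(n-1)$ | $F_C = MSC/MSE$ |
| Between rows | SSR | $n - 1$ | $MSR = SSR/(n-1)$ | $F_R = MSR/MSE$ |
| Between treatments | SSK | $n - 1$ | $MSK = SSK/(n-1)$ | $F_K = MSK/MSE$ |
| Error | SSE | $(n-1)(n-2)$ | $MSE = SSE/[(n-1)(n-2)]$ | |

Here $n$ is the number of treatments (equal to the number of rows and of columns).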
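A minimal sketch of the computation behind this table is given below. The function name `latin_square_anova` and the dictionary keys are hypothetical. The data are the yields of practice problem 2: that problem is stated as a randomized block design, but its layout happens to place each treatment once in every row and column, so it is reused here purely to illustrate the Latin square arithmetic.

```python
def latin_square_anova(values, layout):
    """ANOVA for an n x n Latin square.

    values[i][j] is the observation in row i, column j;
    layout[i][j] is the treatment label applied to that cell.
    """
    n = len(values)
    df_error = (n - 1) * (n - 2)
    if df_error == 0:
        # For n = 2 (or 1) the error d.o.f. is 0, so MSE cannot be computed.
        raise ValueError("error degrees of freedom is 0; MSE is undefined")
    total = sum(sum(row) for row in values)
    correction = total ** 2 / n ** 2                        # T^2 / N
    tss = sum(x * x for row in values for x in row) - correction
    ssr = sum(sum(row) ** 2 for row in values) / n - correction
    ssc = sum(sum(col) ** 2 for col in zip(*values)) / n - correction
    treatment_totals = {}
    for value_row, layout_row in zip(values, layout):
        for x, label in zip(value_row, layout_row):
            treatment_totals[label] = treatment_totals.get(label, 0) + x
    ssk = sum(t ** 2 for t in treatment_totals.values()) / n - correction
    sse = tss - ssr - ssc - ssk
    mse = sse / df_error
    df = n - 1
    return {
        "SSC": ssc, "SSR": ssr, "SSK": ssk, "SSE": sse,
        "df_error": df_error,
        "F_columns": (ssc / df) / mse,
        "F_rows": (ssr / df) / mse,
        "F_treatments": (ssk / df) / mse,
    }


layout = ["ADCB", "DABC", "BCDA", "CBAD"]
values = [
    [12, 20, 16, 10],
    [18, 14, 11, 14],
    [12, 15, 19, 13],
    [16, 11, 15, 20],
]
lsd = latin_square_anova(values, layout)
for name, value in lsd.items():
    print(name, value)
```

The explicit check on the error degrees of freedom mirrors the reason a 2×2 Latin square cannot be analysed.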

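21. What is a Latin square design? Under what conditions can this design be used?

Answer:

The experimental area is divided into rows and columns, and the treatments are

allocated at random to these rows and columns in such a way that every treatment occurs

once and only once in each row and in each column. Such a layout is known as a Latin

square design. It can be used when there are two sources of variation to control (rows and

columns) and the number of treatments equals the number of rows and the number of columns.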

“Education is not only a ladder of opportunity, but it is also an

investment in our future”

-Ed Markey.