SRIT / UICM007 – P & S / Analysis of Variance
SRIT / M & H / M. Vijaya Kumar 1
SRI RAMAKRISHNA INSTITUTE OF TECHNOLOGY
(AN AUTONOMOUS INSTITUTION)
COIMBATORE- 641010
UICM007 – PROBABILITY AND STATISTICS
ANALYSIS OF VARIANCE
History:
The t- and z-tests developed in the 20th century were used until 1918, when Ronald
Fisher created the analysis of variance. ANOVA is also called the Fisher analysis of variance,
and it is the extension of the t- and z-tests. The term became well known in 1925, after
appearing in Fisher's book, "Statistical Methods for Research Workers." It was first employed in
experimental psychology and later extended to more complex subjects.
Definition:
Analysis of variance
Analysis of variance is a collection of statistical models and their associated
estimation procedures used to analyze the differences among group means in a sample.
ANOVA was developed by statistician and evolutionary biologist Ronald Fisher.
One Way Classification
Completely Randomized Design (CRD)
The one-way analysis of variance (ANOVA) is used to determine whether there are
any statistically significant differences between the means of two or more independent
(unrelated) groups (although you tend to only see it used when there are a minimum of
three, rather than two groups).
For example, you could use a one-way ANOVA to understand whether exam
performance differed based on test anxiety levels amongst students, dividing students into
three independent groups (e.g., low, medium and high-stressed students). Also, it is
important to realize that the one-way ANOVA is an omnibus test statistic and cannot tell
you which specific groups were statistically significantly different from each other; it only
tells you that at least two groups were different. Since you may have three, four, five or
more groups in your study design, determining which of these groups differ from each
other is important.
Advantages of randomized complete block designs
Complete flexibility. Can have any number of treatments and blocks.
Provides more accurate results than the completely randomized design due to
grouping.
Relatively easy statistical analysis even with missing data.
Allows calculation of unbiased error for specific treatments.
Disadvantages of randomized complete block designs
Not suitable for large numbers of treatments because blocks become too large.
Not suitable when complete block contains considerable variability.
Interactions between block and treatment effects increase error.
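ANOVA table for one-way classification
(N = total number of observations, c = number of samples)
Source of variation   Sum of squares   Degrees of freedom   Mean sum of squares   F-ratio
Between samples       SSC              c - 1                MSC = SSC/(c - 1)     F = MSC/MSE
Within samples        SSE              N - c                MSE = SSE/(N - c)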
Problem: 1
A random sample is selected from each of three makes of ropes and their breaking
strength (in pounds) are measured with the following results:
  I     II    III
 70    100     60
 72    110     65
 75    108     57
 80    112     84
 83    113     87
       120     73
       107
Test whether the breaking strength of the ropes differs significantly.
Answer:
To simplify the work, we subtract 80 from each entry and form the following table.
S. No      I     II    III
1        -10     20    -20
2         -8     30    -15
3         -5     28    -23
4          0     32      4
5          3     33      7
6                40     -7
7                27
Total    -20    210    -54
Null Hypothesis:
Let us take the null hypothesis that the breaking strength of the ropes does not differ
significantly.
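$N = 18$, $T = \sum\sum x_{ij} = -20 + 210 - 54 = 136$
Correction factor: $T^2/N = (136)^2/18 = 1027.56$
Total sum of squares SST $= \sum\sum x_{ij}^2 - T^2/N = (198 + 6526 + 1268) - 1027.56 = 6964.44$
Between ropes (column) sum of squares
SSC $= \frac{(-20)^2}{5} + \frac{(210)^2}{7} + \frac{(-54)^2}{6} - \frac{T^2}{N} = 80 + 6300 + 486 - 1027.56 = 5838.44$
Error sum of squares
SSE $=$ SST $-$ SSC $= 6964.44 - 5838.44 = 1126.00$
ANOVA Table
Source of Variation   Degrees of Freedom   Sum of squares   Mean square   F-ratio
Between ropes                  2               5838.44         2919.22      38.89
Error                         15               1126.00           75.07
Total                         17               6964.44
Table value (at the 5% level of significance):
$F_{0.05}(2, 15) = 3.68$.
Conclusion:
Since the calculated $F = 38.89 > F_{0.05}(2, 15) = 3.68$, we reject the null hypothesis; there is a
significant difference between the breaking strengths of the ropes.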
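The hand computation for Problem 1 can be cross-checked with a few lines of Python. This is a minimal sketch, not part of the original notes: the function name `one_way_anova` and the variable names are illustrative assumptions, and the formulas are the usual correction-factor forms SST = ΣΣx² − T²/N and SSC = Σ(Tⱼ²/nⱼ) − T²/N. It works on the raw (uncoded) breaking strengths and allows unequal sample sizes.

```python
def one_way_anova(groups):
    """Return (SST, SSC, SSE, df_between, df_within, F) for a list of samples.

    Hypothetical helper for illustration; samples may have unequal sizes.
    """
    n_total = sum(len(g) for g in groups)
    grand_total = sum(sum(g) for g in groups)
    correction = grand_total ** 2 / n_total                  # T^2 / N
    sst = sum(x * x for g in groups for x in g) - correction
    ssc = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    sse = sst - ssc                                          # within-sample (error) SS
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_ratio = (ssc / df_between) / (sse / df_within)
    return sst, ssc, sse, df_between, df_within, f_ratio


# Rope data from Problem 1 (makes I, II and III), uncoded.
ropes = [
    [70, 72, 75, 80, 83],
    [100, 110, 108, 112, 113, 120, 107],
    [60, 65, 57, 84, 87, 73],
]

sst, ssc, sse, df_b, df_w, f_ratio = one_way_anova(ropes)
print(f"SST={sst:.2f} SSC={ssc:.2f} SSE={sse:.2f} d.f.=({df_b}, {df_w}) F={f_ratio:.2f}")
```

The computed F-ratio is then compared with the tabulated F value for (c − 1, N − c) degrees of freedom, exactly as in the worked solution.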
Problem: 2
The following are the number of mistakes made in 5 successive days by 4 technicians
working for a photographic laboratory. Test whether the differences among the four
sample means can be attributed to chance. [Test at a level of significance ].
I II III IV
6 14 10 9
14 9 12 12
10 12 7 8
8 10 15 10
11 14 11 11
Answer:
S. No I II III IV
1 6 14 10 9
2 14 9 12 12
3 10 12 7 8
4 8 10 15 10
5 11 14 11 11
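Null Hypothesis:
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$, i.e., the difference among the four sample means can be attributed to chance.
Alternative Hypothesis:
$H_1$: There is a significant difference among the four sample means.
Column totals: $T_1 = 49$, $T_2 = 59$, $T_3 = 55$, $T_4 = 50$
$N = 20$, $T = \sum\sum x_{ij} = 213$
Correction factor: $T^2/N = (213)^2/20 = 2268.45$
Total sum of squares SST $= \sum\sum x_{ij}^2 - T^2/N = (517 + 717 + 639 + 510) - 2268.45 = 2383 - 2268.45 = 114.55$
Between column sum of squares
SSC $= \frac{(49)^2 + (59)^2 + (55)^2 + (50)^2}{5} - \frac{T^2}{N} = 2281.40 - 2268.45 = 12.95$
Error sum of squares
SSE $=$ SST $-$ SSC $= 114.55 - 12.95 = 101.60$
ANOVA Table
Source of Variation    Degrees of Freedom   Sum of squares   Mean sum of squares   F-ratio
Between technicians             3               12.95              4.32             0.68
Error                          16              101.60              6.35
Total                          19              114.55
Table value:
$F_{0.05}(3, 16) = 3.24$ and $F_{0.01}(3, 16) = 5.29$
Conclusion:
Since the calculated $F = 0.68$ is less than the table value at either level, we accept the null
hypothesis; there is no significant difference between the four sample means.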
Problem: 3
As part of the investigation of the collapse of the roof of a building, a testing
laboratory is given all the available bolts that connected all the steel structure at three
different positions on the roof. The forces required to shear each of these bolts (coded
values) are as follows:
Position 1 90 82 79 98 83 91
Position 2 105 89 93 104 89 95 86
Position 3 83 89 80 94
Answer:
To simplify the calculations, we subtract 90 from each observation.
S. No    Position 1   Position 2   Position 3
1             0           15           -7
2            -8           -1           -1
3           -11            3          -10
4             8           14            4
5            -7           -1
6             1            5
7                         -4
Total       -17           31          -14
Null Hypothesis:
$H_0: \mu_1 = \mu_2 = \mu_3$, i.e., the difference among the sample means at the three positions is not significant.
Alternative Hypothesis:
$H_1$: The differences between the sample means are significant.
$N = 17$, $T = \sum\sum x_{ij} = -17 + 31 - 14 = 0$
Correction factor: $T^2/N = 0$
Total sum of squares SST $= \sum\sum x_{ij}^2 - T^2/N = (299 + 473 + 166) - 0 = 938$
Between column sum of squares
SSC $= \frac{(-17)^2}{6} + \frac{(31)^2}{7} + \frac{(-14)^2}{4} - \frac{T^2}{N} = 48.17 + 137.29 + 49 - 0 = 234.45$
Error sum of squares
SSE $=$ SST $-$ SSC $= 938 - 234.45 = 703.55$
ANOVA Table
Source of Variation   Degrees of Freedom   Sum of squares   Mean sum of squares   F-ratio
Between positions              2               234.45            117.23            2.33
Error                         14               703.55             50.25
Total                         16               938.00
Table value (at the 5% level of significance):
$F_{0.05}(2, 14) = 3.74$
Conclusion:
Since the calculated $F = 2.33 < F_{0.05}(2, 14) = 3.74$, we accept the null hypothesis; there is no
significant difference between the three sample means.
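Subtracting a constant from every observation (80 in Problem 1, 90 in Problem 3) only simplifies the arithmetic; the sums of squares, and therefore the F-ratio, are unchanged. The sketch below illustrates this with the bolt data of Problem 3. The helper name `sums_of_squares` is an illustrative assumption, not something defined in these notes.

```python
def sums_of_squares(groups):
    """Return (SST, SSC, SSE) for a one-way classification (hypothetical helper)."""
    n_total = sum(len(g) for g in groups)
    grand_total = sum(sum(g) for g in groups)
    correction = grand_total ** 2 / n_total                  # T^2 / N
    sst = sum(x * x for g in groups for x in g) - correction
    ssc = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    return sst, ssc, sst - ssc


# Shear forces from Problem 3, positions 1 to 3.
bolts = [
    [90, 82, 79, 98, 83, 91],
    [105, 89, 93, 104, 89, 95, 86],
    [83, 89, 80, 94],
]
coded = [[x - 90 for x in g] for g in bolts]                 # change of origin

raw_ss = sums_of_squares(bolts)
coded_ss = sums_of_squares(coded)

for name, raw, cod in zip(("SST", "SSC", "SSE"), raw_ss, coded_ss):
    print(f"{name}: raw = {raw:.3f}, coded = {cod:.3f}")
```

Both rows of output should agree up to floating-point rounding, which is why the worked solutions are free to choose any convenient origin.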
Problem: 4
A completely randomized design experiment with 10 plots and 3 treatments gave the
following results:
Plot No 1 2 3 4 5 6 7 8 9 10
Treatment A B C A C C A B A B
Yield 5 4 3 7 5 1 3 4 1 7
Analyze the results for treatment effects.
OR
A completely randomized design experiment with ten plots and three treatments gave the
results given below. Analyze the results for the effects of treatments.
Treatment Replications
A 5 7 1 3
B 4 4 7
C 3 1 5
Answer:
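S. No    Treatment A   Treatment B   Treatment C
1             5             4             3
2             7             4             1
3             1             7             5
4             3
Total        16            15             9
Null Hypothesis:
$H_0$: There is no significant difference in the effects of treatments.
Alternative Hypothesis:
$H_1$: There is a significant difference in the effects of treatments.
$N = 10$, $T = \sum\sum x_{ij} = 16 + 15 + 9 = 40$
Correction factor: $T^2/N = (40)^2/10 = 160$
Total sum of squares SST $= \sum\sum x_{ij}^2 - T^2/N = (84 + 81 + 35) - 160 = 40$
Between column sum of squares
SSC $= \frac{(16)^2}{4} + \frac{(15)^2}{3} + \frac{(9)^2}{3} - \frac{T^2}{N} = 64 + 75 + 27 - 160 = 6$
Error sum of squares
SSE $=$ SST $-$ SSC $= 40 - 6 = 34$
ANOVA Table
Source of Variation   Degrees of Freedom   Sum of squares   Mean sum of squares   F-ratio
Between treatments             2                 6                3.00             0.62
Error                          7                34                4.86
Total                          9                40
Table value (at the 5% level of significance):
$F_{0.05}(2, 7) = 4.74$
Conclusion:
Since the calculated $F = 0.62 < F_{0.05}(2, 7) = 4.74$, we accept $H_0$ and conclude that there is no
significant difference between the effects of treatments.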
Practice Problems:
1. The following table shows the lives in hours of four brands of electric lamps:
Brand A 1610 1610 1650 1680 1700 1720 1800
Brand B 1580 1640 1640 1700 1750
Brand C 1460 1550 1600 1620 1640 1660 1740 1820
Brand D 1510 1520 1530 1570 1600 1680
Perform an analysis of variance and test the homogeneity of the mean lives of the
four brands of lamps.
2. Suppose that a random sample of n = 5 was selected from the vineyard properties for
sale in Sonoma County, California, in each of three years. The following data are
consistent with summary information on price per acre for disease-resistant grape
vineyards in Sonoma County. Carry out an ANOVA to determine whether there is
evidence to support the claim that the mean price per acre for vineyard land in
Sonoma County was not the same for each of the three years considered. Test at the
0.05 level and at the 0.01 level.
1996: 30000 34000 36000 38000 40000
1997: 30000 35000 37000 38000 40000
1998: 40000 41000 43000 44000 50000
3. The accompanying data resulted from an experiment comparing the degree of
soiling for fabric copolymerized with 3 different mixtures of methacrylic acid.
Analyse the classification.
Mixture 1 0.56 1.12 0.90 1.07 0.94
Mixture 2 0.72 0.69 0.87 0.78 0.91
Mixture 3 0.62 1.08 1.07 0.99 0.93
Two Way Classification
Randomized Block Design (RBD)
With a randomized block design, the experimenter divides subjects into subgroups
called blocks, such that the variability within blocks is less than the variability between
blocks. Then, subjects within each block are randomly assigned to treatment conditions.
Compared to a completely randomized design, this design reduces variability within
treatment conditions and potential confounding, producing a better estimate of treatment
effects.
The table below shows a randomized block design for a hypothetical medical
experiment.
Gender        Treatment
           Placebo   Vaccine
Male         250       250
Female       250       250
Subjects are assigned to blocks, based on gender. Then, within each block, subjects
are randomly assigned to treatments (either a placebo or a cold vaccine). For this design,
250 men get the placebo, 250 men get the vaccine, 250 women get the placebo, and 250
women get the vaccine.
It is known that men and women are physiologically different and react differently to
medication. This design ensures that each treatment condition has an equal proportion of
men and women. As a result, differences between treatment conditions cannot be
attributed to gender. This randomized block design removes gender as a potential source
of variability and as a potential confounding variable.
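The randomization step described above (random assignment to treatments within each block) can be sketched as follows. This is only an illustration under stated assumptions: the subject labels, the helper name `assign_within_blocks` and the fixed seed are hypothetical, and 500 subjects per gender are assumed so that each cell receives 250 subjects as in the table.

```python
import random


def assign_within_blocks(blocks, treatments, seed=0):
    """Randomly split each block's subjects evenly across the treatments."""
    rng = random.Random(seed)          # fixed seed only to make the sketch repeatable
    assignment = {}
    for block_name, subjects in blocks.items():
        shuffled = subjects[:]
        rng.shuffle(shuffled)          # randomization happens inside the block
        for i, subject in enumerate(shuffled):
            assignment[subject] = (block_name, treatments[i % len(treatments)])
    return assignment


# Hypothetical subject identifiers: 500 men and 500 women.
blocks = {
    "Male": [f"M{i}" for i in range(500)],
    "Female": [f"F{i}" for i in range(500)],
}
assignment = assign_within_blocks(blocks, ["Placebo", "Vaccine"])

# Count subjects in each (block, treatment) cell.
counts = {}
for block_name, treatment in assignment.values():
    key = (block_name, treatment)
    counts[key] = counts.get(key, 0) + 1
print(counts)
```

Because the shuffle is done separately for each gender, every treatment group ends up with the same proportion of men and women, which is the point of blocking.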
Advantages of randomized block designs
The precision is more in RBD.
The amount of information obtained in RBD is more as compared to CRD.
RBD is more flexible. Statistical analysis is simple and easy.
Even if some values are missing, still the analysis can be done by using missing
plot technique.
Disadvantages of randomized complete block designs
When the number of treatments is increased, the block size will increase.
If the block size is large maintaining homogeneity is difficult and hence when more
number of treatments is present this design may not be suitable.
ANOVA table for two-way classification
(c = number of columns, r = number of rows)
Source of variation   Sum of squares   Degrees of freedom   Mean sum of squares            F-ratio
Between columns       SSC              c - 1                MSC = SSC/(c - 1)              F_C = MSC/MSE
Between rows          SSR              r - 1                MSR = SSR/(r - 1)              F_R = MSR/MSE
Error                 SSE              (r - 1)(c - 1)       MSE = SSE/[(r - 1)(c - 1)]
Problem: 5
The following data represents the number of units of production per day turned out
by different workers using 4 different types of machines.
                Machine Type
Workers      A     B     C     D
   1        44    38    47    36
   2        46    40    52    43
   3        34    36    44    32
   4        43    38    46    33
   5        38    42    49    39
1. Test whether the five men differ with respect to mean productivity and
2. Test whether the mean productivity is the same for the four different machine types.
Answer:
Null hypothesis:
$H_{01}$: The 5 workers do not differ with respect to mean productivity.
$H_{02}$: The mean productivity is the same for the four different machines.
To simplify the calculation, let us subtract 40 from each value. The new values are
                Machine Type
Workers      A     B     C     D    Total
   1         4    -2     7    -4       5
   2         6     0    12     3      21
   3        -6    -4     4    -8     -14
   4         3    -2     6    -7       0
   5        -2     2     9    -1       8
Total        5    -6    38   -17      20
$N = 20$, $T = 20$, correction factor $T^2/N = (20)^2/20 = 20$
Total sum of squares
SST $= \sum\sum x_{ij}^2 - T^2/N = (85 + 189 + 132 + 98 + 90) - 20 = 594 - 20 = 574$
Between machines (column) sum of squares
SSC $= \frac{(5)^2 + (-6)^2 + (38)^2 + (-17)^2}{5} - \frac{T^2}{N} = 358.8 - 20 = 338.8$
Between workers (row) sum of squares
SSR $= \frac{(5)^2 + (21)^2 + (-14)^2 + (0)^2 + (8)^2}{4} - \frac{T^2}{N} = 181.5 - 20 = 161.5$
Error sum of squares
SSE $=$ SST $-$ SSC $-$ SSR $= 574 - 338.8 - 161.5 = 73.7$
ANOVA table
Source of variation   Degrees of freedom   Sum of squares (SS)   Mean sum of squares (MS)   Variance Ratio (F-Ratio)
Machines                       3                 338.8                  112.93                    18.39
Workers                        4                 161.5                   40.38                     6.57
Error                         12                  73.7                    6.14
Total                         19                 574.0
Table value (at the 5% level of significance):
$F_{0.05}(3, 12) = 3.49$ and $F_{0.05}(4, 12) = 3.26$
Conclusion:
$F_C = 18.39 > F_{0.05}(3, 12) = 3.49$. Hence $H_{02}$ is rejected. That is, the mean productivity is not the same
for the four machines.
$F_R = 6.57 > F_{0.05}(4, 12) = 3.26$. Hence $H_{01}$ is rejected. That is, the mean productivity is not the same
for the five workers.
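The two-way computation of Problem 5 can be checked numerically. The following is a minimal sketch assuming one observation per cell; the function name `two_way_anova` and the dictionary keys are illustrative choices, not notation from these notes. It uses the raw production figures (no subtraction of 40).

```python
def two_way_anova(table):
    """Two-way classification with one observation per cell.

    table[i][j] is the value in row i (worker) and column j (machine).
    Hypothetical helper for illustration only.
    """
    r, c = len(table), len(table[0])
    grand_total = sum(sum(row) for row in table)
    correction = grand_total ** 2 / (r * c)                          # T^2 / N
    sst = sum(x * x for row in table for x in row) - correction
    ssr = sum(sum(row) ** 2 for row in table) / c - correction       # between rows
    ssc = sum(sum(col) ** 2 for col in zip(*table)) / r - correction  # between columns
    sse = sst - ssr - ssc
    mse = sse / ((r - 1) * (c - 1))
    return {
        "SST": sst, "SSC": ssc, "SSR": ssr, "SSE": sse,
        "F_columns": (ssc / (c - 1)) / mse,
        "F_rows": (ssr / (r - 1)) / mse,
    }


# Rows: workers 1 to 5; columns: machine types A to D (Problem 5).
production = [
    [44, 38, 47, 36],
    [46, 40, 52, 43],
    [34, 36, 44, 32],
    [43, 38, 46, 33],
    [38, 42, 49, 39],
]

result = two_way_anova(production)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

The two F-ratios are compared with the tabulated F values for (c − 1, (r − 1)(c − 1)) and (r − 1, (r − 1)(c − 1)) degrees of freedom respectively.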
Problem: 6
A company appoints 4 salesmen A, B, C and D and observes their sales in 3 seasons:
summer, winter and monsoon. The figures (in lakhs of Rs.) are given in the following table:
Salesmen
Season A B C D
Summer 45 40 38 37
Winter 43 41 45 38
Monsoon 39 39 41 41
Carry out an analysis of variance.
Answer:
Null hypothesis:
$H_{01}$: There is no significant difference between the sales in the three seasons.
$H_{02}$: There is no significant difference between the four salesmen.
To simplify the calculation, let us subtract 40 from each value. The new values are
                 Salesmen
Season       A     B     C     D    Total
Summer       5     0    -2    -3       0
Winter       3     1     5    -2       7
Monsoon     -1    -1     1     1       0
Total        7     0     4    -4       7
$N = 12$, $T = 7$, correction factor $T^2/N = (7)^2/12 = 4.083$
Total sum of squares
SST $= \sum\sum x_{ij}^2 - T^2/N = (38 + 39 + 4) - 4.083 = 76.917$
Between salesmen (column) sum of squares
SSC $= \frac{(7)^2 + (0)^2 + (4)^2 + (-4)^2}{3} - \frac{T^2}{N} = 27 - 4.083 = 22.917$
Between seasons (row) sum of squares
SSR $= \frac{(0)^2 + (7)^2 + (0)^2}{4} - \frac{T^2}{N} = 12.25 - 4.083 = 8.167$
Error sum of squares
SSE $=$ SST $-$ SSC $-$ SSR $= 76.917 - 22.917 - 8.167 = 45.833$
ANOVA table
Source of variation   Degrees of freedom   Sum of squares (SS)   Mean sum of squares (MS)   Variance Ratio (F-Ratio)
Salesmen                       3                 22.917                  7.639                     1.00
Seasons                        2                  8.167                  4.0835                    0.53
Error                          6                 45.833                  7.639
Total                         11                 76.917
Table value (at the 5% level of significance):
$F_{0.05}(3, 6) = 4.76$ and $F_{0.05}(2, 6) = 5.14$
Conclusion:
$F_C = 1.00 < F_{0.05}(3, 6) = 4.76$. Hence we accept the null hypothesis. That is, there is no significant
difference between the sales of the four salesmen.
$F_R = 0.53 < F_{0.05}(2, 6) = 5.14$. Hence we accept the null hypothesis. That is, there is no significant
difference between the sales in the three seasons.
Problem: 7
Analyse the following RBD and draw your conclusion.
Treatments
Blocks
12 14 20 22
17 27 19 15
15 14 17 12
18 16 22 12
19 15 20 14
Answer:
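Null hypothesis:
There is no significant difference between treatments (columns) or between blocks (rows).
To simplify the calculation, let us subtract 15 from each value. The new values are
                Treatments
Blocks       1     2     3     4    Total
   1        -3    -1     5     7       8
   2         2    12     4     0      18
   3         0    -1     2    -3      -2
   4         3     1     7    -3       8
   5         4     0     5    -1       8
Total        6    11    23     0      40
$N = 20$, $T = 40$, correction factor $T^2/N = (40)^2/20 = 80$
Total sum of squares
SST $= \sum\sum x_{ij}^2 - T^2/N = (84 + 164 + 14 + 68 + 42) - 80 = 372 - 80 = 292$
Between column sum of squares
SSC $= \frac{(6)^2 + (11)^2 + (23)^2 + (0)^2}{5} - \frac{T^2}{N} = 137.2 - 80 = 57.2$
Between row sum of squares
SSR $= \frac{(8)^2 + (18)^2 + (-2)^2 + (8)^2 + (8)^2}{4} - \frac{T^2}{N} = 130 - 80 = 50$
Error sum of squares
SSE $=$ SST $-$ SSC $-$ SSR $= 292 - 57.2 - 50 = 184.8$
ANOVA table
Source of variation   Degrees of freedom   Sum of squares (SS)   Mean sum of squares (MS)   Variance Ratio (F-Ratio)
Treatments                     3                  57.2                  19.07                      1.24
Blocks                         4                  50.0                  12.50                      0.81
Error                         12                 184.8                  15.40
Total                         19                 292.0
Table value (at the 5% level of significance):
$F_{0.05}(3, 12) = 3.49$ and $F_{0.05}(4, 12) = 3.26$
Conclusion:
$F_C = 1.24 < F_{0.05}(3, 12) = 3.49$. Hence the null hypothesis is accepted, that is, there is no significant
difference between treatments.
$F_R = 0.81 < F_{0.05}(4, 12) = 3.26$. Hence the null hypothesis is accepted, that is, there is no significant
difference between blocks.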
Problem: 8
A set of data involving four tropical feed stuffs A, B, C, D tried on 20 chicks is
given below. All the twenty chicks are treated alike in all respects except the feeding
treatments and each feeding treatment is given to 5 chicks. Analyze the data. Weight gain
of baby chicks fed on different feeding materials composed of tropical feed stuffs.
                                   Total
A      55    49    42    21    52    219
B      61   112    30    89    63    355
C      42    97    81    95    92    407
D     169   137   169    85   154    714
Grand Total                         1695
Answer:
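Null hypothesis:
There is no significant difference between rows (feed stuffs) or between columns.
                                        Total
A        55    49    42    21    52      219
B        61   112    30    89    63      355
C        42    97    81    95    92      407
D       169   137   169    85   154      714
Total   327   395   322   290   361     1695
$N = 20$, $T = 1695$, correction factor $T^2/N = (1695)^2/20 = 143651.25$
Total sum of squares
SST $= \sum\sum x_{ij}^2 - T^2/N = (10335 + 29055 + 35223 + 106832) - 143651.25 = 181445 - 143651.25 = 37793.75$
Between column sum of squares
SSC $= \frac{(327)^2 + (395)^2 + (322)^2 + (290)^2 + (361)^2}{4} - \frac{T^2}{N} = 145264.75 - 143651.25 = 1613.50$
Between row sum of squares
SSR $= \frac{(219)^2 + (355)^2 + (407)^2 + (714)^2}{5} - \frac{T^2}{N} = 169886.20 - 143651.25 = 26234.95$
Error sum of squares
SSE $=$ SST $-$ SSC $-$ SSR $= 37793.75 - 1613.50 - 26234.95 = 9945.30$
ANOVA table
Source of variation   Degrees of freedom   Sum of squares (SS)   Mean sum of squares (MS)   Variance Ratio (F-Ratio)
Between Columns                4                1613.50                 403.38                     0.49
Between Rows                   3               26234.95                8744.98                    10.55
Error                         12                9945.30                 828.78
Total                         19               37793.75
Table value (at the 5% level of significance):
$F_{0.05}(4, 12) = 3.26$ and $F_{0.05}(3, 12) = 3.49$
Conclusion:
$F_C = 0.49 < F_{0.05}(4, 12) = 3.26$. Hence the null hypothesis is accepted, that is, there is no significant
difference between columns.
$F_R = 10.55 > F_{0.05}(3, 12) = 3.49$. Hence the null hypothesis is rejected, that is, there is a
significant difference between rows (feed stuffs).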
Practice problems:
1. The following data represent the time taken by a certain person to travel to work from
Monday to Friday by four different routes.
Days
Routes
Mon Tue Wed Thu Fri
1 22 26 25 25 31
2 25 27 28 26 29
3 26 29 33 30 33
4 26 28 27 30 30
Test at 5% level of significance whether the differences among the means obtained for the
different routes are significant and whether the differences among the means obtained for
the different days of the week are significant.
2. The sales of 4 salesmen in 3 seasons are tabulated here. Carry out an analysis of
variance.
Salesmen
Season A B C D
Summer 36 36 21 35
Winter 28 29 31 32
Monsoon 26 28 29 29
3. Perform a two-way ANOVA on the data given below:
                    Treatment 1
                    1     2     3
               1   30    26    38
               2   24    29    28
Treatment 2    3   33    24    35
               4   36    31    30
               5   27    35    33
4. Three varieties of coal were analyzed by four chemists and the ash content is tabulated
below. Perform an analysis of variance.
Chemists
Coal A B C D
I 8 5 5 7
II 7 6 4 4
III 3 6 5 4
Three Way Classification
Latin Square Design (LSD)
A Latin square design is a method of placing treatments so that they appear in a
balanced fashion within a square block or field. Treatments appear once in each row and
column. Replicates are also included in this design.
Treatments are assigned at random within rows and columns, with each treatment once
per row and once per column.
There are equal numbers of rows, columns, and treatments.
Useful where the experimenter desires to control variation in two different directions
The Latin square design is perhaps the most popular alternative design when
two (or more) blocking factors need to be controlled for. A Latin square design is actually
an extreme example of an incomplete block design, with each combination of levels
of the two blocking factors assigned to one treatment only, rather than to all.
Advantages of Latin square design
Greater power than the RBD when there are two external sources of variation.
Easy to analyze.
Disadvantages of Latin square design
The number of treatments, rows and columns must be the same.
Squares smaller than 5×5 are not practical because of the small number of degrees of
freedom for error.
The effect of each treatment must be approximately the same across rows and
columns.
ANOVA table for three-way classification
(n = number of rows = number of columns = number of treatments)
Source of variation   Sum of squares   Degrees of freedom   Mean sum of squares            F-ratio
Between columns       SSC              n - 1                MSC = SSC/(n - 1)              F_C = MSC/MSE
Between rows          SSR              n - 1                MSR = SSR/(n - 1)              F_R = MSR/MSE
Between treatments    SSK              n - 1                MSK = SSK/(n - 1)              F_K = MSK/MSE
Error                 SSE              (n - 1)(n - 2)       MSE = SSE/[(n - 1)(n - 2)]
Problem: 9
Set up the analysis of variance for the following results of a Latin Square Design. Use
0.01 level of significance.
A C B D
12 19 10 8
C B D A
18 12 6 7
B D A C
22 10 5 21
D A C B
12 7 27 17
Answer:
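Null hypothesis:
There is no significant difference between rows, columns and treatments.
Columns (j) /
Rows (i)      1     2     3     4    Total
1            12    19    10     8      49
2            18    12     6     7      43
3            22    10     5    21      58
4            12     7    27    17      63
Total        64    48    48    53     213
Treatment totals: A = 31, B = 61, C = 85, D = 36
$N = 16$, $T = 213$, correction factor $T^2/N = (213)^2/16 = 2835.56$
Total sum of squares
SST $= \sum\sum x_{ij}^2 - T^2/N = (669 + 553 + 1050 + 1211) - 2835.56 = 3483 - 2835.56 = 647.44$
Between column sum of squares
SSC $= \frac{(64)^2 + (48)^2 + (48)^2 + (53)^2}{4} - \frac{T^2}{N} = 2878.25 - 2835.56 = 42.69$
Between row sum of squares
SSR $= \frac{(49)^2 + (43)^2 + (58)^2 + (63)^2}{4} - \frac{T^2}{N} = 2895.75 - 2835.56 = 60.19$
Between treatment sum of squares
SSK $= \frac{(31)^2 + (61)^2 + (85)^2 + (36)^2}{4} - \frac{T^2}{N} = 3300.75 - 2835.56 = 465.19$
Error sum of squares
SSE $=$ SST $-$ SSC $-$ SSR $-$ SSK $= 647.44 - 42.69 - 60.19 - 465.19 = 79.37$
ANOVA TABLE
Source of variation   D.o.f   Sum of squares (SS)   Mean Sum of squares (MS)   Variance Ratio (F-ratio)
Between columns         3       SSC = 42.69               14.23                      1.08
Between rows            3       SSR = 60.19               20.06                      1.52
Between treatments      3       SSK = 465.19             155.06                     11.72
Error                   6       SSE = 79.37               13.23
Total                  15       647.44
Table value:
$F_{0.01}(3, 6) = 9.78$
Conclusion:
Since $F_C = 1.08 < 9.78$ and $F_R = 1.52 < 9.78$, we accept the null hypothesis; that is,
there is no significant difference between the rows or between the columns.
Since the calculated value $F_K = 11.72 > 9.78$, we reject the null hypothesis; that is, there is a
significant difference between the treatments.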
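The Latin square sums of squares for Problem 9 can be verified with a short script. This is a sketch under the usual assumptions (an n × n square with one observation per cell and error degrees of freedom (n − 1)(n − 2)); the function name `latin_square_anova` and the result keys are hypothetical names chosen for illustration.

```python
def latin_square_anova(values, letters):
    """Sums of squares and F-ratios for an n x n Latin square.

    values[i][j]  : observation in row i, column j
    letters[i][j] : treatment letter applied in that cell
    Hypothetical helper for illustration only.
    """
    n = len(values)
    grand_total = sum(sum(row) for row in values)
    correction = grand_total ** 2 / (n * n)                          # T^2 / N
    sst = sum(x * x for row in values for x in row) - correction
    ssr = sum(sum(row) ** 2 for row in values) / n - correction
    ssc = sum(sum(col) ** 2 for col in zip(*values)) / n - correction

    # Accumulate the total of the observations for each treatment letter.
    treatment_totals = {}
    for value_row, letter_row in zip(values, letters):
        for x, letter in zip(value_row, letter_row):
            treatment_totals[letter] = treatment_totals.get(letter, 0) + x
    ssk = sum(t ** 2 for t in treatment_totals.values()) / n - correction

    sse = sst - ssr - ssc - ssk
    mse = sse / ((n - 1) * (n - 2))
    return {
        "SST": sst, "SSC": ssc, "SSR": ssr, "SSK": ssk, "SSE": sse,
        "F_columns": (ssc / (n - 1)) / mse,
        "F_rows": (ssr / (n - 1)) / mse,
        "F_treatments": (ssk / (n - 1)) / mse,
    }


# Data and layout of Problem 9.
values = [
    [12, 19, 10, 8],
    [18, 12, 6, 7],
    [22, 10, 5, 21],
    [12, 7, 27, 17],
]
letters = ["ACBD", "CBDA", "BDAC", "DACB"]

anova = latin_square_anova(values, letters)
for name, value in anova.items():
    print(f"{name}: {value:.2f}")
```

The same function can be applied to the coded values of the later Latin square problems, since a change of origin does not affect the sums of squares.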
Problem: 10
Analyze the variance in the Latin Square of yields (in kgs) of paddy where P,Q,R,S
denote the different methods of cultivation.
S122 P121 R123 Q122
Q124 R123 P122 S125
P120 Q119 S120 R121
R122 S123 Q121 P122
Examine whether the different methods of cultivation have given significantly
different yields.
Answer:
Null hypothesis:
There is no significant difference between the different methods of cultivation.
To simplify calculations, we subtract 120 from the given values.
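Columns (j) /
Rows (i)       1       2       3       4     Total
1            S  2    P  1    R  3    Q  2       8
2            Q  4    R  3    P  2    S  5      14
3            P  0    Q -1    S  0    R  1       0
4            R  2    S  3    Q  1    P  2       8
Total           8       6       6      10      30
Treatment totals: P = 5, Q = 6, R = 9, S = 10
$N = 16$, $T = 30$, correction factor $T^2/N = (30)^2/16 = 56.25$
Total sum of squares
SST $= \sum\sum x_{ij}^2 - T^2/N = (18 + 54 + 2 + 18) - 56.25 = 92 - 56.25 = 35.75$
Between column sum of squares
SSC $= \frac{(8)^2 + (6)^2 + (6)^2 + (10)^2}{4} - \frac{T^2}{N} = 59 - 56.25 = 2.75$
Between row sum of squares
SSR $= \frac{(8)^2 + (14)^2 + (0)^2 + (8)^2}{4} - \frac{T^2}{N} = 81 - 56.25 = 24.75$
Between treatment sum of squares
SSK $= \frac{(5)^2 + (6)^2 + (9)^2 + (10)^2}{4} - \frac{T^2}{N} = 60.5 - 56.25 = 4.25$
Error sum of squares
SSE $=$ SST $-$ SSC $-$ SSR $-$ SSK $= 35.75 - 2.75 - 24.75 - 4.25 = 4$
ANOVA TABLE
Source of variation   D.o.f   Sum of squares (SS)   Mean Sum of squares (MS)   Variance Ratio (F-ratio)
Between columns         3          2.75                   0.917                     1.375
Between rows            3         24.75                   8.25                     12.375
Between treatments      3          4.25                   1.417                     2.125
Error                   6          4.00                   0.667
Total                  15         35.75
Table value (at the 5% level of significance):
$F_{0.05}(3, 6) = 4.76$
Conclusion:
Since $F_R = 12.375 > 4.76$, we reject the null hypothesis; that is, there is a significant
difference between the rows.
Since $F_C = 1.375 < 4.76$ and $F_K = 2.125 < 4.76$, we accept the null hypothesis; that is, there
is no significant difference between the columns or between the treatments (methods of cultivation).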
Problem: 11
The figures in the following 5×5 Latin square are the numbers of minutes that engines
$E_1, E_2, E_3, E_4$ and $E_5$, tuned up by mechanics $M_1, M_2, M_3, M_4$ and $M_5$, ran with a gallon
of fuel A, B, C, D and E.
A B C D E
31 24 20 20 18
B C D E A
21 27 23 25 31
C D E A B
21 27 25 29 21
D E A B C
21 25 33 25 22
E A B C D
21 37 24 24 20
Use the level of significance to test the null hypotheses that
(a) there is no difference in the performance of the five engines;
(b) the persons who tuned up these engines have no effect on their performance;
(c) the engines perform equally well with each of the fuels.
Answer:
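Null hypothesis:
There is no significant difference between the engines, persons and fuels.
To simplify calculations, we subtract 25 from the given values (columns are engines, rows are mechanics).
Rows /
Columns      1        2        3        4        5      Total
1          A   6    B  -1    C  -5    D  -5    E  -7     -12
2          B  -4    C   2    D  -2    E   0    A   6       2
3          C  -4    D   2    E   0    A   4    B  -4      -2
4          D  -4    E   0    A   8    B   0    C  -3       1
5          E  -4    A  12    B  -1    C  -1    D  -5       1
Total        -10       15        0       -2      -13     -10
Treatment totals: A = 36, B = -10, C = -11, D = -14, E = -11
$N = 25$, $T = -10$, correction factor $T^2/N = (-10)^2/25 = 4$
Total sum of squares
SST $= \sum\sum x_{ij}^2 - T^2/N = (136 + 60 + 52 + 89 + 187) - 4 = 524 - 4 = 520$
Between column sum of squares
SSC $= \frac{(-10)^2 + (15)^2 + (0)^2 + (-2)^2 + (-13)^2}{5} - \frac{T^2}{N} = 99.6 - 4 = 95.6$
Between row sum of squares
SSR $= \frac{(-12)^2 + (2)^2 + (-2)^2 + (1)^2 + (1)^2}{5} - \frac{T^2}{N} = 30.8 - 4 = 26.8$
Between treatment sum of squares
SSK $= \frac{(36)^2 + (-10)^2 + (-11)^2 + (-14)^2 + (-11)^2}{5} - \frac{T^2}{N} = 366.8 - 4 = 362.8$
Error sum of squares
SSE $=$ SST $-$ SSC $-$ SSR $-$ SSK $= 520 - 95.6 - 26.8 - 362.8 = 34.8$
ANOVA TABLE
Source of variation   D.o.f   Sum of squares (SS)   Mean Sum of squares (MS)   Variance Ratio (F-ratio)
Between columns         4          95.6                   23.9                      8.24
Between rows            4          26.8                    6.7                      2.31
Between treatments      4         362.8                   90.7                     31.28
Error                  12          34.8                    2.9
Total                  24         520.0
Table value:
$F_{0.05}(4, 12) = 3.26$ and $F_{0.01}(4, 12) = 5.41$
Conclusion:
$F_R = 2.31$ is less than the table value at either level, so we accept the null hypothesis; there is no
significant difference between the five mechanics.
$F_C = 8.24$ and $F_K = 31.28$ exceed the table value at either level, so we reject the null hypothesis;
there is a significant difference between the performance of the five engines and between the fuels.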
Problem: 12
Four cars and four drivers are employed in a study for possible differences between
four gasoline additives(A, B, C, D). Even though cars can be identical models, slight
systematic differences are likely to occur in their performance, and even though each
driver may do his best to drive the car in the manner required by the test, slight systematic
differences can occur from driver to driver. It would be desirable to eliminate both the
car-to-car and driver-to-driver differences. Set up the ANOVA table.
Cars
Drivers 1 2 3 4
1 A 24 B 26 D 20 C 25
2 D 23 C 26 A 20 B 27
3 B 15 D 13 C 16 A 16
4 C 17 A 15 B 20 D 20
Use the level of significance to test
Answer:
Null hypothesis:
There is no significant difference between the gasoline additives, cars and drivers.
To simplify calculations, we subtract 20 from the given values.
            Car 1    Car 2    Car 3    Car 4    Total
Driver 1    A  4     B  6     D  0     C  5       15
Driver 2    D  3     C  6     A  0     B  7       16
Driver 3    B -5     D -7     C -4     A -4      -20
Driver 4    C -3     A -5     B  0     D  0       -8
Total         -1        0       -4        8        3
Treatment totals: A = -5, B = 8, C = 4, D = -4
$N = 16$, $T = 3$, correction factor $T^2/N = (3)^2/16 = 0.5625$
Total sum of squares
SST $= \sum\sum x_{ij}^2 - T^2/N = (77 + 94 + 106 + 34) - 0.5625 = 311 - 0.5625 = 310.44$
Between column sum of squares
SSC $= \frac{(-1)^2 + (0)^2 + (-4)^2 + (8)^2}{4} - \frac{T^2}{N} = 20.25 - 0.5625 = 19.69$
Between row sum of squares
SSR $= \frac{(15)^2 + (16)^2 + (-20)^2 + (-8)^2}{4} - \frac{T^2}{N} = 236.25 - 0.5625 = 235.69$
Between treatment sum of squares
SSK $= \frac{(-5)^2 + (8)^2 + (4)^2 + (-4)^2}{4} - \frac{T^2}{N} = 30.25 - 0.5625 = 29.69$
Error sum of squares
SSE $=$ SST $-$ SSC $-$ SSR $-$ SSK $= 310.44 - 19.69 - 235.69 - 29.69 = 25.38$
ANOVA TABLE
Source of variation   D.o.f   Sum of squares (SS)   Mean Sum of squares (MS)   Variance Ratio (F-ratio)
Between columns         3         19.69                   6.56                      1.55
Between rows            3        235.69                  78.56                     18.57
Between treatments      3         29.69                   9.90                      2.34
Error                   6         25.38                   4.23
Total                  15        310.44
Table value:
$F_{0.05}(3, 6) = 4.76$ and $F_{0.01}(3, 6) = 9.78$
Conclusion:
$F_R = 18.57$ exceeds the table value at either level, so we reject the null hypothesis; there is a
significant difference between the performance of the four drivers.
$F_C = 1.55$ and $F_K = 2.34$ are less than the table value at either level, so we accept the null
hypothesis; there is no significant difference between the cars or between the gasoline additives.
Practice problems:
1. A varietal trial was conducted on wheat with 4 varieties in a Latin square design.
The plan of the experiment is given below. Analyse the data and interpret the result.
C 25 B 23 A 20 D 20
A 19 D 19 C 21 B 18
B 19 A 14 D 17 C 20
D 17 C 20 B 21 A 15
2. Four varieties A, B, C and D of a fertilizer are tested in a randomized block design
with four replications. The plot yields in pounds are as follows.
A 12 D 20 C 16 B 10
D 18 A 14 B 11 C 14
B 12 C 15 D 19 A 13
C 16 B 11 A 15 D 20
Analyse the experimental yield.
Two Marks
1. Define Analysis of variance.
Answer:
Analysis of variance is a collection of statistical models and their associated
estimation procedures used to analyze the differences among group means in a sample.
ANOVA was developed by statistician and evolutionary biologist Ronald Fisher.
2. State the basic principles of design of Experiments
Answer:
The major three principles of experimental designs are:
Replication : to provide an estimate of experimental error.
Randomization : to ensure that this estimate is statistically valid.
Local control : to reduce experimental error by making the experiment more efficient.
3. What do you understand by “Design of an experiment”?
Answer:
Design of experiments (DOE) is a systematic method to determine the relationship
between factors affecting a process and the output of that process.
In other words, it is used to find cause-and-effect relationships. This information is
needed to manage process inputs in order to optimize the output.
4. What is the aim of the design of experiment?
Answer:
Design of experiments (DOE) is a systematic method to determine the relationship
between factors affecting a process and the output of that process.
In other words, it is used to find cause-and-effect relationships. This information is
needed to manage process inputs in order to optimize the output.
5. State the assumptions involved in ANOVA.
Answer:
The experimental errors of the data are normally distributed.
The variances are equal between treatments (homogeneity of variances, i.e. homoscedasticity).
The samples are independent (each sample is randomly selected and independent of the others).
6. What are the basic steps in ANOVA?
Answer:
Set up the hypotheses.
Determine the level of significance.
Select the appropriate test statistic.
Set up the decision rule.
Compute the test statistic.
Write the conclusion.
7. When do you apply the analysis of variance technique?
Answer:
Suppose we consider three or more samples at a time, in this situation we need
another testing hypothesis that all the samples are drawn from the same population, i.e.,
they have the same means. In this case we use analysis of variance to test the homogeneity
of several means.
8. What is a completely randomized design?
Answer:
The one-way analysis of variance or a completely randomized design is used to
determine whether there are any statistically significant differences between the means of
two or more independent (unrelated) groups (although you tend to only see it used when
there are a minimum of three, rather than two groups).
9. State any two advantages of a Completely Randomized Experimental Design.
Answer:
Complete flexibility. Can have any number of treatments and blocks.
Easy to calculate and perform the layouts.
10. Write down the ANOVA table for one way classification.
Answer:
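(N = total number of observations, c = number of samples)
Source of variation   Sum of squares   Degrees of freedom   Mean sum of squares   F-ratio
Between samples       SSC              c - 1                MSC = SSC/(c - 1)     F = MSC/MSE
Within samples        SSE              N - c                MSE = SSE/(N - c)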
11. Define: RBD.
Answer:
With a randomized block design, the experimenter divides subjects into subgroups
called blocks, such that the variability within blocks is less than the variability between
blocks. Then, subjects within each block are randomly assigned to treatment conditions.
Compared to a completely randomized design, this design reduces variability within
treatment conditions and potential confounding, producing a better estimate of treatment
effects.
12. Write the ANOVA table for randomized block design.
Answer:
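(c = number of columns, r = number of rows)
Source of variation   Sum of squares   Degrees of freedom   Mean sum of squares            F-ratio
Between columns       SSC              c - 1                MSC = SSC/(c - 1)              F_C = MSC/MSE
Between rows          SSR              r - 1                MSR = SSR/(r - 1)              F_R = MSR/MSE
Error                 SSE              (r - 1)(c - 1)       MSE = SSE/[(r - 1)(c - 1)]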
13. Discuss the advantages and disadvantages of Randomized block design.
Answer:
Advantages:
The precision is more in RBD.
The amount of information obtained in RBD is more as compared to CRD.
RBD is more flexible. Statistical analysis is simple and easy.
Disadvantages:
When the number of treatments is increased, the block size will increase.
If the block size is large maintaining homogeneity is difficult and hence when more
number of treatments is present this design may not be suitable.
14. Compare one-way classification model with two-way classification model.
Answer:
     One-way ANOVA                          Two-way ANOVA
1    We cannot test two sets of             Two sets of hypotheses can be
     hypotheses at a time.                  tested at a time.
2    Data are classified according to       Data are classified according to
     one factor.                            two different factors.
15. Write any two differences between RBD and CRD.
Answer:
     Completely randomized design           Randomized block design
1    We cannot test two sets of             Two sets of hypotheses can be
     hypotheses at a time.                  tested at a time.
2    Data are classified according to       Data are classified according to
     one factor.                            two different factors.
16. What is meant by Latin square?
Answer:
A Latin square design is a method of placing treatments so that they appear in a
balanced fashion within a square block or field. Treatments appear once in each row and
column. Replicates are also included in this design.
17. What are the advantages of a Latin square design?
Answer:
Greater power than the RBD when there are two external sources of variation.
Easy to analyze.
18. Is 2×2 Latin square Design possible? Why?
Answer:
No, a 2×2 Latin square design cannot be used.
Reason:
In a Latin square design, the degrees of freedom for the error sum of
squares (SSE) is $(n - 1)(n - 2)$.
With $n = 2$, the error d.o.f. is 0, so MSE = SSE/0 is undefined and no F-ratio can be formed.
19. Give the layout of Latin square design with four treatments.
Answer:
With four treatments A, B, C and D one typical arrangement of LSD is as follows
A B D C
B A C D
D C B A
C D A B
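Whether a proposed layout really is a Latin square (every treatment exactly once in each row and each column) is easy to check mechanically. The sketch below, with the hypothetical helper `is_latin_square`, tests the four-treatment arrangement given above, read row by row as ABDC, BACD, DCBA, CDAB.

```python
def is_latin_square(layout):
    """Return True if every symbol occurs exactly once in each row and column."""
    n = len(layout)
    symbols = set(layout[0])
    # The square must be n x n and use exactly n distinct symbols.
    if len(symbols) != n or any(len(row) != n for row in layout):
        return False
    rows_ok = all(set(row) == symbols for row in layout)
    cols_ok = all(set(col) == symbols for col in zip(*layout))
    return rows_ok and cols_ok


layout = ["ABDC", "BACD", "DCBA", "CDAB"]           # arrangement from the answer above
broken = ["ABCD", "ABCD", "BCDA", "CDAB"]           # A repeats in column 1

print(is_latin_square(layout))
print(is_latin_square(broken))
```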
20. Write the ANOVA table for Latin Square design.
Answer:
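(n = number of rows = number of columns = number of treatments)
Source of variation   Sum of squares   Degrees of freedom   Mean sum of squares            F-ratio
Between columns       SSC              n - 1                MSC = SSC/(n - 1)              F_C = MSC/MSE
Between rows          SSR              n - 1                MSR = SSR/(n - 1)              F_R = MSR/MSE
Between treatments    SSK              n - 1                MSK = SSK/(n - 1)              F_K = MSK/MSE
Error                 SSE              (n - 1)(n - 2)       MSE = SSE/[(n - 1)(n - 2)]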
21. What is a Latin square design? Under what conditions can this design be used?
Answer:
The treatments are then allocated at random to these rows and columns in such a
way that every treatment occurs once and only once in each row and in each column. Such
a layout is known as Latin square design.
“Education is not only a ladder of opportunity, but it is also an
investment in our future”
-Ed Markey.