25
1 © 2009, Econ-2030 Applied Statistics-Dr Tadesse Chapter 10: Comparisons Involving Means Introduction to Analysis of Variance Analysis of Variance: Testing for the Equality of k Population Means

Chapter 10: Comparisons Involving Means

  • Upload
    zihna

  • View
    33

  • Download
    2

Embed Size (px)

DESCRIPTION

Chapter 10: Comparisons Involving Means. Introduction to Analysis of Variance. Analysis of Variance: Testing for the Equality of k Population Means. Analysis of Variance. Analysis of Variance (ANOVA) is used to test for the equality of three or more population means. - PowerPoint PPT Presentation

Citation preview

Page 1: Chapter  10:   Comparisons Involving Means

1 1 Slide

Slide

© 2009, Econ-2030 Applied Statistics-Dr Tadesse

Chapter 10: Comparisons Involving Means

Introduction to Analysis of Variance

Analysis of Variance: Testing for the Equality of

k Population Means

Page 2: Chapter  10:   Comparisons Involving Means

2 2 Slide

Slide

© 2009, Econ-2030 Applied Statistics-Dr Tadesse

Analysis of Variance

Analysis of Variance (ANOVA) is used to test for the equality of three or more population means. Analysis of Variance (ANOVA) is used to test for the equality of three or more population means.

We want to use the sample results to test the following hypotheses: We want to use the sample results to test the following hypotheses:

H0: 1=2=3=. . . = k

Ha: Not all population means are equal

Page 3: Chapter  10:   Comparisons Involving Means

3 3 Slide

Slide

© 2009, Econ-2030 Applied Statistics-Dr Tadesse

Introduction to Analysis of Variance

H0: 1=2=3=. . . = k

Ha: Not all population means are equal

If H0 is rejected, we cannot conclude that all population means are different.

If H0 is rejected, we cannot conclude that all population means are different.

Rejecting H0 means that at least two population means have different values.

Rejecting H0 means that at least two population means have different values.

Page 4: Chapter  10:   Comparisons Involving Means

4 4 Slide

Slide

© 2009, Econ-2030 Applied Statistics-Dr Tadesse

Sampling Distribution of the mean Given H0 is True

Introduction to Analysis of Variance

1x 3x2x

Sample means are close together because there is only

one sampling distribution when H0 is true.

22x n

Page 5: Chapter  10:   Comparisons Involving Means

5 5 Slide

Slide

© 2009, Econ-2030 Applied Statistics-Dr Tadesse

Introduction to Analysis of Variance

Sampling Distribution of the mean given H0 is False

33 1x 2x3x 11 22

Sample means come fromdifferent sampling distributionsand are not as close together

when H0 is false.

Page 6: Chapter  10:   Comparisons Involving Means

6 6 Slide

Slide

© 2009, Econ-2030 Applied Statistics-Dr Tadesse

For each population, the response variable is normally distributed. For each population, the response variable is normally distributed.

Assumptions for Analysis of Variance

The variance of the response variable, denoted 2, is the same for all of the populations. The variance of the response variable, denoted 2, is the same for all of the populations.

The observations must be independent. The observations must be independent.

Page 7: Chapter  10:   Comparisons Involving Means

7 7 Slide

Slide

© 2009, Econ-2030 Applied Statistics-Dr Tadesse

Analysis of Variance:Testing for the Equality of k Population

Means

Between-Treatments Estimate of Population Variance

Within-Treatments Estimate of Population Variance

Comparing the Variance Estimates: The F Test

ANOVA Table

Page 8: Chapter  10:   Comparisons Involving Means

8 8 Slide

Slide

© 2009, Econ-2030 Applied Statistics-Dr Tadesse

Between-Treatments Estimateof Population Variance

A between-treatment estimate of 2 is called the mean square treatment and is denoted MSTR.

2

1

( )

MSTR1

k

j jj

n x x

k

Denominator represents the degrees of freedom associated with SSTR

Numerator is the sum of squares

due to treatmentsand is denoted SSTR

Page 9: Chapter  10:   Comparisons Involving Means

9 9 Slide

Slide

© 2009, Econ-2030 Applied Statistics-Dr Tadesse

The estimate of 2 based on the variation of the sample observations within each sample is called the mean square error and is denoted by MSE.

Within-Samples Estimateof Population Variance

kn

sn

T

k

jjj

1

2)1(

MSE

Denominator represents the degrees of freedom

associated with SSE

Numerator is the sum of squares

due to errorand is denoted SSE

Page 10: Chapter  10:   Comparisons Involving Means

10 10 Slide

Slide

© 2009, Econ-2030 Applied Statistics-Dr Tadesse

Comparing the Variance Estimates: The F Test

If the null hypothesis is true and the ANOVA assumptions are valid, the sampling distribution of MSTR/MSE is an F distribution with MSTR d.f. equal to k - 1 and MSE d.f. equal to nT - k.

If the means of the k populations are not equal, the value of MSTR/MSE will be inflated because MSTR overestimates 2.

Hence, we will reject H0 if the resulting value of MSTR/MSE appears to be too large to have been selected at random from the appropriate F distribution.

Page 11: Chapter  10:   Comparisons Involving Means

11 11 Slide

Slide

© 2009, Econ-2030 Applied Statistics-Dr Tadesse

Test for the Equality of k Population Means

F = MSTR/MSE

H0: 1=2=3=. . . = k

Ha: Not all population means are equal

Hypotheses

Test Statistic

Page 12: Chapter  10:   Comparisons Involving Means

12 12 Slide

Slide

© 2009, Econ-2030 Applied Statistics-Dr Tadesse

Test for the Equality of k Population Means

Rejection Rule

where the value of F is based on an F distribution with k - 1 numerator d.f. and nT - k denominator d.f.

Reject H0 if p-value < ap-value Approach:

Critical Value Approach: Reject H0 if F > Fa

Page 13: Chapter  10:   Comparisons Involving Means

13 13 Slide

Slide

© 2009, Econ-2030 Applied Statistics-Dr Tadesse

Sampling Distribution of MSTR/MSE

Rejection Region

Do Not Reject H0Do Not Reject H0

Reject H0Reject H0

MSTR/MSEMSTR/MSE

Critical ValueCritical ValueFF

Sampling Distributionof MSTR/MSE

a

Page 14: Chapter  10:   Comparisons Involving Means

14 14 Slide

Slide

© 2009, Econ-2030 Applied Statistics-Dr Tadesse

ANOVA Table

SST is partitionedinto SSTR and

SSE.

SST’s degrees of freedom

(d.f.) are partitioned into

SSTR’s d.f. and SSE’s d.f.

TreatmentErrorTotal

SSTRSSESST

k – 1

nT – k

nT - 1

MSTRMSE

Source ofVariation

Sum ofSquares

Degrees ofFreedom

MeanSquares

MSTR/MSE

F

Page 15: Chapter  10:   Comparisons Involving Means

15 15 Slide

Slide

© 2009, Econ-2030 Applied Statistics-Dr Tadesse

ANOVA Table

SST divided by its degrees of freedom nT – 1 is the overall sample variance that would be obtained if we treated the entire set of observations as one data set.

SST divided by its degrees of freedom nT – 1 is the overall sample variance that would be obtained if we treated the entire set of observations as one data set.

With the entire data set as one sample, the formula for computing the total sum of squares, SST, is: With the entire data set as one sample, the formula for computing the total sum of squares, SST, is:

2

1 1

SST ( ) SSTR SSEjnk

ijj i

x x

Page 16: Chapter  10:   Comparisons Involving Means

16 16 Slide

Slide

© 2009, Econ-2030 Applied Statistics-Dr Tadesse

ANOVA Table

ANOVA can be viewed as the process of partitioning the total sum of squares and the degrees of freedom into their corresponding sources: treatments and error.

ANOVA can be viewed as the process of partitioning the total sum of squares and the degrees of freedom into their corresponding sources: treatments and error.

Dividing the sum of squares by the appropriate degrees of freedom provides the variance estimates and the F value used to test the hypothesis of equal population means.

Dividing the sum of squares by the appropriate degrees of freedom provides the variance estimates and the F value used to test the hypothesis of equal population means.

Page 17: Chapter  10:   Comparisons Involving Means

17 17 Slide

Slide

© 2009, Econ-2030 Applied Statistics-Dr Tadesse

Example:

Test for the Equality of k Population Means

Janet Reed would like to know ifthere is any significant difference inthe mean number of hours worked per week for the department managersat her three manufacturing plants(in Buffalo, Pittsburgh, and Detroit).

Page 18: Chapter  10:   Comparisons Involving Means

18 18 Slide

Slide

© 2009, Econ-2030 Applied Statistics-Dr Tadesse

Test for the Equality of k Population Means

A simple random sample of fivemanagers from each of the three plantswas taken and the number of hoursworked by each manager for theprevious week is shown on the nextslide. Conduct an F test using a = .05.

Page 19: Chapter  10:   Comparisons Involving Means

19 19 Slide

Slide

© 2009, Econ-2030 Applied Statistics-Dr Tadesse

12345

4854575462

7363666474

5163615456

Plant 1Buffalo

Plant 2Pittsburgh

Plant 3DetroitObservation

Sample MeanSample Variance

55 68 5726.0 26.5 24.5

Test for the Equality of k Population Means

Page 20: Chapter  10:   Comparisons Involving Means

20 20 Slide

Slide

© 2009, Econ-2030 Applied Statistics-Dr Tadesse

Test for the Equality of k Population Means

H0: 1= 2= 3

Ha: Not all the means are equalwhere: 1 = mean number of hours worked per

week by the managers at Plant 1 2 = mean number of hours worked per week by the managers at Plant 2 3 = mean number of hours worked per week by the managers at Plant 3

1. Develop the hypotheses.

Page 21: Chapter  10:   Comparisons Involving Means

21 21 Slide

Slide

© 2009, Econ-2030 Applied Statistics-Dr Tadesse

2. Specify the level of significance. a = .05

Test for the Equality of k Population Means

3. Compute the value of the test statistic.

MSTR = 490/(3 - 1) = 245SSTR = 5(55 - 60)2 + 5(68 - 60)2 + 5(57 - 60)2 = 490

= (55 + 68 + 57)/3 = 60x(Sample sizes are all equal.)

Mean Square Due to Treatments

Page 22: Chapter  10:   Comparisons Involving Means

22 22 Slide

Slide

© 2009, Econ-2030 Applied Statistics-Dr Tadesse

3. Compute the value of the test statistic.

Test for the Equality of k Population Means

MSE = 308/(15 - 3) = 25.667

SSE = 4(26.0) + 4(26.5) + 4(24.5) = 308Mean Square Due to Error

F = MSTR/MSE = 245/25.667 = 9.55

Page 23: Chapter  10:   Comparisons Involving Means

23 23 Slide

Slide

© 2009, Econ-2030 Applied Statistics-Dr Tadesse

TreatmentErrorTotal

490308798

21214

24525.667

Source ofVariation

Sum ofSquares

Degrees ofFreedom

MeanSquares

9.55

F

Test for the Equality of k Population Means

ANOVA Table

Page 24: Chapter  10:   Comparisons Involving Means

24 24 Slide

Slide

© 2009, Econ-2030 Applied Statistics-Dr Tadesse

5. Compare the Test Statistic with the critical Value

Because F = 9.55 > 3.89, we reject H0.

Using the Critical Value Approach

4. Determine the critical value and rejection rule.

Test for the Equality of k Population Means

We have sufficient evidence to conclude that the mean number of hours worked per week by department managers is not the same at all 3 plant.

Based on an F distribution with 2 numeratord.f. and 12 denominator d.f., F.05 = 3.89.

Page 25: Chapter  10:   Comparisons Involving Means

25 25 Slide

Slide

© 2009, Econ-2030 Applied Statistics-Dr Tadesse

Test for the Equality of k Population Means

5. Determine whether to reject H0.

We have sufficient evidence to conclude that the mean number of hours worked per week by department managers is not the same at all 3 plant.

The p-value < .05, so we reject H0.

With 2 numerator d.f. and 12 denominator d.f.,the p-value is .01 for F = 6.93. Therefore, thep-value is less than .01 for F = 9.55.

Using the p –Value Approach

4. Compute the p –value.