Chapter 10 Two-Sample Tests and ANOVA


Page 1: Two Sample Test

Chapter 10

Two-Sample Tests and ANOVA

Page 2: Two Sample Test

Learning Objectives

In this chapter, you learn hypothesis testing procedures to test:
• The means of two independent populations
• The means of two related populations
• The proportions of two independent populations
• The variances of two independent populations
• The means of more than two populations

Page 3: Two Sample Test

Chapter Overview

Two-Sample Tests
• Population Means, Independent Samples
• Means, Related Samples
• Population Proportions
• Population Variances

One-Way Analysis of Variance (ANOVA)
• F-test
• Tukey-Kramer test

Page 4: Two Sample Test

Two-Sample Tests

Examples:
• Population Means, Independent Samples: Mean 1 vs. independent Mean 2
• Means, Related Samples: Same population before vs. after treatment
• Population Proportions: Proportion 1 vs. Proportion 2
• Population Variances: Variance 1 vs. Variance 2

Page 5: Two Sample Test

Difference Between Two Means

Population means, independent samples

Goal: Test a hypothesis or form a confidence interval for the difference between two population means, μ1 – μ2

The point estimate for the difference is X̄1 – X̄2

Three cases are considered:
• σ1 and σ2 known
• σ1 and σ2 unknown, assumed equal
• σ1 and σ2 unknown, not assumed equal

Page 6: Two Sample Test

Independent Samples

Population means, independent samples

• Different data sources
  • Unrelated
  • Independent
• Sample selected from one population has no effect on the sample selected from the other population
• Use the difference between 2 sample means
• Use Z test, a pooled-variance t test, or a separate-variance t test

Page 7: Two Sample Test

Difference Between Two Means

Population means, independent samples:
• σ1 and σ2 known: use a Z test statistic
• σ1 and σ2 unknown, assumed equal: use Sp to estimate the unknown σ; use a t test statistic and pooled standard deviation
• σ1 and σ2 unknown, not assumed equal: use S1 and S2 to estimate the unknown σ1 and σ2; use a separate-variance t test

Page 8: Two Sample Test

σ1 and σ2 Known

Assumptions:
• Samples are randomly and independently drawn
• Population distributions are normal or both sample sizes are at least 30
• Population standard deviations are known

Page 9: Two Sample Test

σ1 and σ2 Known (continued)

When σ1 and σ2 are known and both populations are normal or both sample sizes are at least 30, the test statistic is a Z-value, and the standard error of X̄1 – X̄2 is

$$\sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

Page 10: Two Sample Test

σ1 and σ2 Known (continued)

The test statistic for μ1 – μ2 is:

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
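The Z statistic for the known-σ case can be sketched in a few lines of Python. This is only an illustration: the function name `two_sample_z` and the input values are hypothetical and do not come from the slides.

```python
import math

def two_sample_z(xbar1, xbar2, sigma1, sigma2, n1, n2, hypothesized_diff=0.0):
    """Z statistic for mu1 - mu2 when both population std devs are known."""
    # Standard error of (X1bar - X2bar): sqrt(sigma1^2/n1 + sigma2^2/n2)
    std_error = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    z = ((xbar1 - xbar2) - hypothesized_diff) / std_error
    return z, std_error

# Hypothetical inputs, chosen only to exercise the formula (not from the slides).
z, se = two_sample_z(xbar1=50.0, xbar2=48.0, sigma1=4.0, sigma2=3.0, n1=40, n2=36)
print(f"standard error = {se:.4f}, Z = {z:.4f}")
```

The resulting Z value would then be compared with the standard normal critical value for the chosen α.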

Page 11: Two Sample Test

Hypothesis Tests for Two Population Means

Two Population Means, Independent Samples

Lower-tail test:
H0: μ1 ≥ μ2
H1: μ1 < μ2
i.e.,
H0: μ1 – μ2 ≥ 0
H1: μ1 – μ2 < 0

Upper-tail test:
H0: μ1 ≤ μ2
H1: μ1 > μ2
i.e.,
H0: μ1 – μ2 ≤ 0
H1: μ1 – μ2 > 0

Two-tail test:
H0: μ1 = μ2
H1: μ1 ≠ μ2
i.e.,
H0: μ1 – μ2 = 0
H1: μ1 – μ2 ≠ 0

Page 12: Two Sample Test

Hypothesis tests for μ1 – μ2

Two Population Means, Independent Samples

Lower-tail test:
H0: μ1 – μ2 ≥ 0
H1: μ1 – μ2 < 0
Rejection region: area α in the lower tail. Reject H0 if Z < -Zα

Upper-tail test:
H0: μ1 – μ2 ≤ 0
H1: μ1 – μ2 > 0
Rejection region: area α in the upper tail. Reject H0 if Z > Zα

Two-tail test:
H0: μ1 – μ2 = 0
H1: μ1 – μ2 ≠ 0
Rejection region: area α/2 in each tail. Reject H0 if Z < -Zα/2 or Z > Zα/2

Page 13: Two Sample Test

Confidence Interval, σ1 and σ2 Known

The confidence interval for μ1 – μ2 is:

$$(\bar{X}_1 - \bar{X}_2) \pm Z \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

Page 14: Two Sample Test

σ1 and σ2 Unknown, Assumed Equal

Assumptions:
• Samples are randomly and independently drawn
• Populations are normally distributed or both sample sizes are at least 30
• Population variances are unknown but assumed equal

Page 15: Two Sample Test

σ1 and σ2 Unknown, Assumed Equal (continued)

Forming interval estimates:
• The population variances are assumed equal, so use the two sample variances and pool them to estimate the common σ²
• The test statistic is a t value with (n1 + n2 – 2) degrees of freedom

Page 16: Two Sample Test

σ1 and σ2 Unknown, Assumed Equal (continued)

The pooled variance is

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)}$$

Page 17: Two Sample Test

σ1 and σ2 Unknown, Assumed Equal (continued)

The test statistic for μ1 – μ2 is:

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

Where t has (n1 + n2 – 2) d.f., and

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)}$$

Page 18: Two Sample Test

Confidence Interval, σ1 and σ2 Unknown

The confidence interval for μ1 – μ2 is:

$$(\bar{X}_1 - \bar{X}_2) \pm t_{n_1 + n_2 - 2}\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

Where

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)}$$

Page 19: Two Sample Test

Pooled-Variance t Test: Example

You are a financial analyst for a brokerage firm. Is there a difference in dividend yield between stocks listed on the NYSE and NASDAQ? You collect the following data:

                   NYSE    NASDAQ
Number               21        25
Sample mean        3.27      2.53
Sample std dev     1.30      1.16

Assuming both populations are approximately normal with equal variances, is there a difference in average yield (α = 0.05)?

Page 20: Two Sample Test

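Calculating the Test Statistic

The pooled variance is:

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{(21 - 1)(1.30)^2 + (25 - 1)(1.16)^2}{(21 - 1) + (25 - 1)} = 1.5021$$

The test statistic is:

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{(3.27 - 2.53) - 0}{\sqrt{1.5021\left(\dfrac{1}{21} + \dfrac{1}{25}\right)}} = 2.040$$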
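The arithmetic for the NYSE/NASDAQ example can be checked with a short Python sketch. The function names `pooled_variance` and `pooled_t` are illustrative choices, not something the slides define; the inputs are the summary statistics given in the example.

```python
import math

def pooled_variance(s1, s2, n1, n2):
    """Pooled estimate of the common variance from two sample std devs."""
    return ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / ((n1 - 1) + (n2 - 1))

def pooled_t(xbar1, xbar2, s1, s2, n1, n2, hypothesized_diff=0.0):
    """Pooled-variance t statistic for mu1 - mu2."""
    sp2 = pooled_variance(s1, s2, n1, n2)
    std_error = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return ((xbar1 - xbar2) - hypothesized_diff) / std_error

# Summary statistics from the example: NYSE is sample 1, NASDAQ is sample 2.
sp2 = pooled_variance(s1=1.30, s2=1.16, n1=21, n2=25)
t_stat = pooled_t(xbar1=3.27, xbar2=2.53, s1=1.30, s2=1.16, n1=21, n2=25)
df = 21 + 25 - 2

print(f"Sp^2 = {sp2:.4f}, t = {t_stat:.3f}, d.f. = {df}")
```

Working from summary statistics rather than raw data mirrors how the example is posed, since only the sample sizes, means, and standard deviations are given.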

Page 21: Two Sample Test

Solution

H0: μ1 – μ2 = 0, i.e. (μ1 = μ2)
H1: μ1 – μ2 ≠ 0, i.e. (μ1 ≠ μ2)
α = 0.05
df = 21 + 25 – 2 = 44
Critical Values: t = ±2.0154 (area .025 in each tail)

Test Statistic:

$$t = \frac{3.27 - 2.53}{\sqrt{1.5021\left(\dfrac{1}{21} + \dfrac{1}{25}\right)}} = 2.040$$

Decision: Reject H0 at α = 0.05 (2.040 > 2.0154, so the test statistic falls in the upper rejection region)

Conclusion: There is evidence of a difference in means.
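As a cross-check on the decision, the same example can be run through the pooled-variance confidence interval formula. The sketch below takes the critical value 2.0154 from the slide instead of computing it, since the Python standard library has no t distribution; the variable names are illustrative.

```python
import math

# Summary statistics from the NYSE (1) vs. NASDAQ (2) example.
n1, xbar1, s1 = 21, 3.27, 1.30
n2, xbar2, s2 = 25, 2.53, 1.16
t_crit = 2.0154  # two-tail critical value for 44 d.f. at alpha = 0.05, as given on the slide

sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / ((n1 - 1) + (n2 - 1))
std_error = math.sqrt(sp2 * (1 / n1 + 1 / n2))

diff = xbar1 - xbar2
t_stat = diff / std_error
reject_h0 = abs(t_stat) > t_crit

# 95% confidence interval for mu1 - mu2 using the pooled standard error.
margin = t_crit * std_error
ci = (diff - margin, diff + margin)

print(f"t = {t_stat:.3f}, reject H0: {reject_h0}")
print(f"95% CI for mu1 - mu2: ({ci[0]:.4f}, {ci[1]:.4f})")
```

The interval lies just above zero, which agrees with rejecting H0 in the two-tail test at the same α.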

Page 22: Two Sample Test

σ1 and σ2 Unknown, Not Assumed Equal

Assumptions:
• Samples are randomly and independently drawn
• Populations are normally distributed or both sample sizes are at least 30
• Population variances are unknown but cannot be assumed to be equal

Page 23: Two Sample Test

σ1 and σ2 Unknown, Not Assumed Equal (continued)

Forming the test statistic:
• The population variances are not assumed equal, so include the two sample variances in the computation of the t-test statistic
• The test statistic is a t value (statistical software is generally used to do the necessary computations)

Page 24: Two Sample Test

σ1 and σ2 Unknown, Not Assumed Equal (continued)

The test statistic for μ1 – μ2 is:

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$$
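The slides leave the degrees of freedom for the separate-variance test to software. A minimal sketch follows, assuming the usual Welch-Satterthwaite approximation for the degrees of freedom (that formula is not stated in the slides). It reuses the NYSE/NASDAQ summary statistics purely for illustration; the slides do not work this case.

```python
import math

def separate_variance_t(xbar1, xbar2, s1, s2, n1, n2, hypothesized_diff=0.0):
    """Separate-variance t statistic and approximate degrees of freedom."""
    v1 = s1**2 / n1  # estimated variance of the first sample mean
    v2 = s2**2 / n2  # estimated variance of the second sample mean
    t = ((xbar1 - xbar2) - hypothesized_diff) / math.sqrt(v1 + v2)
    # Assumption: Welch-Satterthwaite approximation (not given in the slides).
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

# Illustration only: summary statistics borrowed from the pooled-variance example.
t_stat, df = separate_variance_t(xbar1=3.27, xbar2=2.53, s1=1.30, s2=1.16, n1=21, n2=25)
print(f"t = {t_stat:.3f}, approximate d.f. = {df:.1f}")
```

Because the two sample standard deviations are similar here, the statistic comes out close to the pooled-variance value; the two approaches diverge more when the sample variances or sample sizes differ substantially.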

Page 25: Two Sample Test

Related Populations

Tests Means of 2 Related Populations
• Paired or matched samples
• Repeated measures (before/after)
• Use difference between paired values: Di = X1i - X2i
• Eliminates variation among subjects
• Assumptions:
  • Both populations are normally distributed
  • Or, if not normal, use large samples

Page 26: Two Sample Test

Mean Difference, σD Known

Related samples

The ith paired difference is Di, where Di = X1i - X2i

The point estimate for the population mean paired difference is D̄:

$$\bar{D} = \frac{\sum_{i=1}^{n} D_i}{n}$$

Suppose the population standard deviation of the difference scores, σD, is known

n is the number of pairs in the paired sample

Page 27: Two Sample Test

Mean Difference, σD Known (continued)

Paired samples

The test statistic for the mean difference is a Z value:

$$Z = \frac{\bar{D} - \mu_D}{\dfrac{\sigma_D}{\sqrt{n}}}$$

Where
μD = hypothesized mean difference
σD = population standard dev. of differences
n = the sample size (number of pairs)

Page 28: Two Sample Test

Confidence Interval, σD Known

Paired samples

The confidence interval for μD is

$$\bar{D} \pm Z\frac{\sigma_D}{\sqrt{n}}$$

Where n = the sample size (number of pairs in the paired sample)

Page 29: Two Sample Test

Mean Difference, σD Unknown

Related samples

If σD is unknown, we can estimate the unknown population standard deviation with a sample standard deviation.

The sample standard deviation is

$$S_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}}$$

Page 30: Two Sample Test

Mean Difference, σD Unknown (continued)

Paired samples

• Use a paired t test; the test statistic for D̄ is now a t statistic, with n - 1 d.f.:

$$t = \frac{\bar{D} - \mu_D}{\dfrac{S_D}{\sqrt{n}}}$$

Where t has n - 1 d.f. and SD is:

$$S_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}}$$

Page 31: Two Sample Test

Confidence Interval, σD Unknown

Paired samples

The confidence interval for μD is

$$\bar{D} \pm t_{n-1}\frac{S_D}{\sqrt{n}}$$

where

$$S_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}}$$

Page 32: Two Sample Test

Hypothesis Testing for Mean Difference, σD Unknown

Paired Samples

Lower-tail test:
H0: μD ≥ 0
H1: μD < 0
Rejection region: area α in the lower tail. Reject H0 if t < -tα

Upper-tail test:
H0: μD ≤ 0
H1: μD > 0
Rejection region: area α in the upper tail. Reject H0 if t > tα

Two-tail test:
H0: μD = 0
H1: μD ≠ 0
Rejection region: area α/2 in each tail. Reject H0 if t < -tα/2 or t > tα/2

Where t has n - 1 d.f.

Page 33: Two Sample Test

Paired t Test Example

• Assume you send your salespeople to a “customer service” training workshop. Has the training made a difference in the number of complaints? You collect the following data:

Number of Complaints

Salesperson    Before (1)    After (2)    Difference, Di = (2) - (1)
C.B.                6             4              -2
T.F.               20             6             -14
M.H.                3             2              -1
R.K.                0             0               0
M.O.                4             0              -4
Total                                           -21

$$\bar{D} = \frac{\sum D_i}{n} = \frac{-21}{5} = -4.2$$

$$S_D = \sqrt{\frac{\sum (D_i - \bar{D})^2}{n - 1}} = 5.67$$

Page 34: Two Sample Test

Paired t Test: Solution

• Has the training made a difference in the number of complaints (at the 0.01 level)?

H0: μD = 0
H1: μD ≠ 0
α = .01, D̄ = -4.2
d.f. = n - 1 = 4
Critical Value = ±4.604 (area α/2 in each tail)

Test Statistic:

$$t = \frac{\bar{D} - \mu_D}{S_D/\sqrt{n}} = \frac{-4.2 - 0}{5.67/\sqrt{5}} = -1.66$$

Decision: Do not reject H0 (t stat is not in the reject region: -4.604 < -1.66 < 4.604)

Conclusion: There is not a significant change in the number of complaints.
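The paired example can be reproduced from the raw complaint counts with the Python standard library. In this sketch the critical value 4.604 is taken from the slide, and the variable names are illustrative.

```python
import math
import statistics

# Complaint counts from the example, in salesperson order C.B., T.F., M.H., R.K., M.O.
before = [6, 20, 3, 0, 4]
after = [4, 6, 2, 0, 0]

# Differences follow the slide's convention: After (2) minus Before (1).
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)

d_bar = statistics.mean(diffs)   # sample mean difference
s_d = statistics.stdev(diffs)    # sample standard deviation (n - 1 denominator)
std_error = s_d / math.sqrt(n)
t_stat = (d_bar - 0) / std_error

t_crit = 4.604  # two-tail critical value for 4 d.f. at alpha = 0.01, as given on the slide
reject_h0 = abs(t_stat) > t_crit

# 99% confidence interval for the mean difference.
ci = (d_bar - t_crit * std_error, d_bar + t_crit * std_error)

print(f"D-bar = {d_bar:.1f}, S_D = {s_d:.2f}, t = {t_stat:.2f}, reject H0: {reject_h0}")
print(f"99% CI for mu_D: ({ci[0]:.2f}, {ci[1]:.2f})")
```

The interval contains zero, which is consistent with not rejecting H0 at the 0.01 level.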

Page 35: Two Sample Test

Two Population Proportions

Population proportions

Goal: test a hypothesis or form a confidence interval for the difference between two population proportions, π1 – π2

Assumptions:
n1π1 ≥ 5, n1(1 - π1) ≥ 5
n2π2 ≥ 5, n2(1 - π2) ≥ 5

The point estimate for the difference is p1 – p2

Page 36: Two Sample Test

Two Population Proportions

Population proportions

Since we begin by assuming the null hypothesis is true, we assume π1 = π2 and pool the two sample estimates

The pooled estimate for the overall proportion is:

$$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2}$$

where X1 and X2 are the numbers from samples 1 and 2 with the characteristic of interest

Page 37: Two Sample Test

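Two Population Proportions (continued)

Population proportions

The test statistic for π1 – π2 is a Z statistic:

$$Z = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

where

$$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2}, \qquad p_1 = \frac{X_1}{n_1}, \qquad p_2 = \frac{X_2}{n_2}$$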

Page 38: Two Sample Test

Confidence Interval for Two Population Proportions

Population proportions

The confidence interval for π1 – π2 is:

$$(p_1 - p_2) \pm Z\sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$$
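A minimal sketch of this interval follows, using the counts from the Proposition A example that appears later in the deck (36 of 72 men, 31 of 50 women). The deck itself does not compute this interval, so the call is illustrative only; the function name `two_proportion_ci` is hypothetical.

```python
import math
from statistics import NormalDist

def two_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for pi1 - pi2 using unpooled sample proportions."""
    p1, p2 = x1 / n1, x2 / n2
    # Two-tail standard normal critical value for the requested confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    std_error = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * std_error, diff + z * std_error

# Sample 1 is men, sample 2 is women.
lower, upper = two_proportion_ci(x1=36, n1=72, x2=31, n2=50)
print(f"95% CI for pi1 - pi2: ({lower:.4f}, {upper:.4f})")
```

Note that the interval uses each sample's own proportion in the standard error, whereas the hypothesis test pools the two samples under H0.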

Page 39: Two Sample Test

Hypothesis Tests for Two Population Proportions

Population proportions

Lower-tail test:
H0: π1 ≥ π2
H1: π1 < π2
i.e.,
H0: π1 – π2 ≥ 0
H1: π1 – π2 < 0

Upper-tail test:
H0: π1 ≤ π2
H1: π1 > π2
i.e.,
H0: π1 – π2 ≤ 0
H1: π1 – π2 > 0

Two-tail test:
H0: π1 = π2
H1: π1 ≠ π2
i.e.,
H0: π1 – π2 = 0
H1: π1 – π2 ≠ 0

Page 40: Two Sample Test

Hypothesis Tests for Two Population Proportions (continued)

Population proportions

Lower-tail test:
H0: π1 – π2 ≥ 0
H1: π1 – π2 < 0
Rejection region: area α in the lower tail. Reject H0 if Z < -Zα

Upper-tail test:
H0: π1 – π2 ≤ 0
H1: π1 – π2 > 0
Rejection region: area α in the upper tail. Reject H0 if Z > Zα

Two-tail test:
H0: π1 – π2 = 0
H1: π1 – π2 ≠ 0
Rejection region: area α/2 in each tail. Reject H0 if Z < -Zα/2 or Z > Zα/2

Page 41: Two Sample Test

Example: Two population Proportions

Is there a significant difference between the proportion of men and the proportion of women who will vote Yes on Proposition A?

• In a random sample, 36 of 72 men and 31 of 50 women indicated they would vote Yes

• Test at the .05 level of significance

Page 42: Two Sample Test

Example: Two population Proportions (continued)

• The hypothesis test is:
H0: π1 – π2 = 0 (the two proportions are equal)
H1: π1 – π2 ≠ 0 (there is a significant difference between proportions)

The sample proportions are:
Men: p1 = 36/72 = .50
Women: p2 = 31/50 = .62

The pooled estimate for the overall proportion is:

$$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2} = \frac{36 + 31}{72 + 50} = \frac{67}{122} = .549$$

Page 43: Two Sample Test

Example: Two population Proportions (continued)

The test statistic for π1 – π2 is:

$$z = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{(.50 - .62) - 0}{\sqrt{.549(1 - .549)\left(\dfrac{1}{72} + \dfrac{1}{50}\right)}} = -1.31$$

Critical Values = ±1.96 for α = .05 (area .025 in each tail)

Decision: Do not reject H0 (-1.31 lies between -1.96 and 1.96)

Conclusion: There is not significant evidence of a difference in proportions who will vote yes between men and women.
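The Proposition A example can be reproduced with the Python standard library. This is a sketch: the function name `two_proportion_z` is illustrative, and the two-tail p-value is an extra that the slides do not report.

```python
import math
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Pooled Z statistic for H0: pi1 - pi2 = 0."""
    p1, p2 = x1 / n1, x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)
    std_error = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / std_error, p_pooled

# Sample 1 is men (36 of 72), sample 2 is women (31 of 50).
z_stat, p_pooled = two_proportion_z(x1=36, n1=72, x2=31, n2=50)

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)     # two-tail critical value
p_value = 2 * NormalDist().cdf(-abs(z_stat))     # two-tail p-value (not in the slides)
reject_h0 = abs(z_stat) > z_crit

print(f"pooled p = {p_pooled:.3f}, z = {z_stat:.2f}, critical = ±{z_crit:.2f}")
print(f"p-value = {p_value:.3f}, reject H0: {reject_h0}")
```

Comparing |z| with the critical value and comparing the p-value with α lead to the same decision; the p-value simply shows how far the result is from the rejection region.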