19
AP Statistics Chapter 25 Paired Samples and Blocks

AP Statistics Chapter 25 Paired Samples and Blocks

Embed Size (px)

Citation preview

Page 1: AP Statistics Chapter 25 Paired Samples and Blocks

AP StatisticsChapter 25

Paired Samples and Blocks

Page 2: AP Statistics Chapter 25 Paired Samples and Blocks

Objectives:

• Paired Data• Paired t-Test• Paired t-Confidence Interval

Page 3: AP Statistics Chapter 25 Paired Samples and Blocks

Paired Data

• Data are paired when the observations are collected in pairs or the observations in one group are naturally related to observations in the other group.

• Paired data arise in a number of ways. Perhaps the most common is to compare subjects with themselves before and after a treatment.– When pairs arise from an experiment, the pairing is a type

of blocking. – When they arise from an observational study, it is a form of

matching.

Page 4: AP Statistics Chapter 25 Paired Samples and Blocks

Paired Data (cont.)

• If you know the data are paired, you can (and must!) take advantage of it. – To decide if the data are paired, consider how they were

collected and what they mean (check the W’s).– There is no test to determine whether the data are paired.

• Once we know the data are paired, we can examine the pairwise differences. – Because it is the differences we care about, we treat them as

if they were the data and ignore the original two sets of data.

Page 5: AP Statistics Chapter 25 Paired Samples and Blocks

Paired Data (cont.)

• Now that we have only one set of data to consider, we can return to the simple one-sample t-test.

• Mechanically, a paired t-test is just a one-sample t-test for the means of the pairwise differences.– The sample size is the number of pairs.

Page 6: AP Statistics Chapter 25 Paired Samples and Blocks

Assumptions and Conditions

• Paired Data Assumption: – Paired data Assumption: The data must be paired.

• Independence Assumption: – Independence Assumption: The differences must be independent of

each other. – Randomization Condition: Randomness can arise in many ways. What

we want to know usually focuses our attention on where the randomness should be.

– 10% Condition: When a sample is obviously small, we may not explicitly check this condition.

• Normal Population Assumption: We need to assume that the population of differences follows a Normal model.– Nearly Normal Condition: Check this with a histogram or Normal

probability plot of the differences.

Page 7: AP Statistics Chapter 25 Paired Samples and Blocks

The Paired t-Test

• When the conditions are met, we are ready to test whether the paired differences differ significantly from zero.

• We test the hypothesis H0: d = 0, where the d’s are the pairwise differences and 0 is almost always 0.

Page 8: AP Statistics Chapter 25 Paired Samples and Blocks

The Paired t-Test (cont.)

• We use the statistic

where n is the number of pairs.

• is the ordinary standard error for the

mean applied to the differences.

• When the conditions are met and the null hypothesis is true, this statistic follows a Student’s t-model on n – 1 degrees of freedom, so we can use that model to obtain a P-value.

0

1n

dt

SE d

dsSE d

n

Page 9: AP Statistics Chapter 25 Paired Samples and Blocks

Example: Matched Pairs t Hypothesis Test

• The following chart shows the number of men and women employed in randomly selected professions. Do these data suggest that there is a significant difference in gender in the workplace?

Page 10: AP Statistics Chapter 25 Paired Samples and Blocks

Solution:

• Choose method– Matched pairs t hypothesis test– The differences are (males – females): 1681, 837, 182, 466, -206,

133, -723, 399.

• Check conditions– Independence: data paired because they count employees in the

same occupations.– Randomization: occupations were randomly selected– Population Size: there are more than 80 occupations.– Sample Size: n=8, a small sample size, must check for skew and

outliers

Page 11: AP Statistics Chapter 25 Paired Samples and Blocks

Solution:

• Hypotheses: H0: μd=0 The mean gender difference in the workplace is 0.

Ha: μd≠0 The mean gender difference in the workplace is not 0.

• Find the t-statistic:

• Find the p-value: p-value = 2P(t ≥ 1.373) = .212• Conclusion: The p-value is too large at any commonly accepted

levels to be able to reject the null hypothesis. There is insufficient evidence to support a difference in gender in the workplace.

Page 12: AP Statistics Chapter 25 Paired Samples and Blocks

Confidence Intervals for Matched Pairs

• When the conditions are met, we are ready to find the confidence interval for the mean of the paired differences.

• The confidence interval is

where the standard error of the mean difference is

The critical value t* depends on the particular confidence level, C, that you specify and on the degrees of freedom, n – 1, which is based on the number of pairs, n.

dsSE d

n

1nd t SE d

Page 13: AP Statistics Chapter 25 Paired Samples and Blocks

Example: Confidence Intervals for Matched Pairs

• Over the ten-year period 1990-2000, the unemployment rates for Australia and the United Kingdom were reported as follows.

Find a 90% confidence interval for the mean difference in unemployment rates for Australia and the united Kingdom.

Page 14: AP Statistics Chapter 25 Paired Samples and Blocks

Solution:

• Choose method.– Matched pairs t confidence interval.– The paired differences are: (AU-UK) .4, .8, .6, .4, 0, -.2, .4, 1.6, 1.6, 1.2, .8.

• Check conditions– Independence: Not, matched pairs is the method to use.– Randomization: Yes, no reason to believe that this sequence of years is not representative of the

mean of the differences in unemployment rates of these 2 countries.– Sample Size: n < 15, check for skew and outliers.

• There is some skewness and no outliers, use t-distribution.

• Calculation the interval– df = n-1 = 10, t* = 1.812, , s = .589– 90% CI is (.369, 1.013)

• Conclusion– We are 90% confident that the mean difference in the unemployment rates between AU and UK is

between .369 and 1.013.

Page 15: AP Statistics Chapter 25 Paired Samples and Blocks

Blocking

• Consider estimating the mean difference in age between husbands and wives.

• The following display is worthless. It does no good to compare all the wives as a group with all the husbands—we care about the paired differences.

Page 16: AP Statistics Chapter 25 Paired Samples and Blocks

Blocking (cont.)

• In this case, we have paired data—each husband is paired with his respective wife. The display we are interested in is the difference in ages:

Page 17: AP Statistics Chapter 25 Paired Samples and Blocks

Blocking (cont.)

• Pairing removes the extra variation that we saw in the side-by-side boxplots and allows us to concentrate on the variation associated with the difference in age for each pair.

• A paired design is an example of blocking.

Page 18: AP Statistics Chapter 25 Paired Samples and Blocks

What Can Go Wrong?

• Don’t use a two-sample t-test for paired data.• Don’t use a paired-t method when the samples

aren’t paired.• Don’t forget outliers—the outliers we care about

now are in the differences.• Don’t look for the difference between means of

paired groups with side-by-side boxplots.

Page 19: AP Statistics Chapter 25 Paired Samples and Blocks

What have we learned?

• Pairing can be a very effective strategy.– Because pairing can help control variability between

individual subjects, paired methods are usually more powerful than methods that compare independent groups.

• Analyzing data from matched pairs requires different inference procedures.– Paired t-methods look at pairwise differences.

• We test hypotheses and generate confidence intervals based on these differences.

– We learned to Think about the design of the study that collected the data before we proceed with inference.