Math 3680 Lecture #13 Hypothesis Testing: The z Test


The One-Sided z Test

Example: Before a design artist was hired to improve its entrance, an average of 3218 people entered a department store daily, with an SD of 287.

Since the entrance was redesigned, a simple random sample of 42 days has been studied. The results are shown on the next slide.

Is this statistically significant for indicating that the average number of people entering the store daily has increased?

3767 3445 3154 3220 3039 3378
3678 3556 3171 3197 3630 3695
3456 3216 3379 3715 3525 3470
3596 3309 3405 3034 3193 3506
3449 3716 3310 3420 3338 3070
3358 3667 3780 3082 3125 3526
3341 3100 3212 3482 3186 3558

Average ≈ 3392

Note: Don’t compare the difference 3392 - 3218 = 174 with the population standard deviation 287. The former applies to an average, while the latter is for individual days.

There are two possibilities:

• The average number of people entering has not changed. The observed sample average of 3,392 can be reasonably attributed to chance fluctuations.

• The difference between the observed average and the expected average is too large to be simply chance. The average number of people entering has increased with the improved entrance.

Definition: Null Hypothesis. The first hypothesis, which asserts simple chance fluctuations, is called the null hypothesis.

Definition: Alternative Hypothesis. The second hypothesis, which asserts that the average has in fact increased, is called the alternative hypothesis.

The null hypothesis is the default assumption. This is the assumption to be disproved.

For example, if the sample average were 3219 per day, that would hardly be convincing evidence. However, if the new sample average were 5000 per day, we could be confident of the lure of the new storefront and rule out simple chance. So where is the “cut-off” value?

Solution.

• H0: μ = 3218 (The average number of customers entering the store has not changed due to the improved storefront)

• Ha: μ > 3218 (The average number of customers entering the store has increased due to the improved storefront)

• We choose α = 0.05

Before continuing, why isn’t Ha written as μ ≠ 3218?

Assuming H0, we have a sample of 42 days which are being drawn from a box with μ = 3218 and σ = 287.

The average has the following moments:

$$E(\bar{X}) = \mu = 3218, \qquad SD(\bar{X}) = \frac{\sigma}{\sqrt{n}} = \frac{287}{\sqrt{42}}$$

• Test statistic:

$$z_s = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{3392 - 3218}{287/\sqrt{42}} \approx 3.929$$

• P-value. Assuming H0, we must find the chance of obtaining a test statistic at least this extreme. For this problem, that means

$$P(Z \ge z_s) = P(Z \ge 3.929) \approx 4.265 \times 10^{-5}$$

• Conclusion: We reject the null hypothesis. There is good reason to believe that the average number of customers has increased after the redesign.
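The steps of the solution above can be sketched in Python with only the standard library. This is a minimal sketch, assuming the rounded sample average 3392 used on the slides; the function name `one_sided_z_test` is hypothetical and not part of the lecture.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_z_test(sample_mean, mu0, sigma, n):
    """Right-tailed z test: return (z_s, P-value) for Ha: mu > mu0."""
    se = sigma / sqrt(n)                   # SD of the sample average
    z_s = (sample_mean - mu0) / se         # test statistic
    p_value = 1.0 - NormalDist().cdf(z_s)  # P(Z >= z_s), assuming H0
    return z_s, p_value


# Numbers from the storefront example; 3392 is the rounded sample average.
alpha = 0.05
z_s, p_value = one_sided_z_test(3392, 3218, 287, 42)

print(f"z_s = {z_s:.3f}, P-value = {p_value:.3e}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

The P-value is compared with α only at the end, so the same function can be reused with a different significance level.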

Excel: With the 42 observations entered in cells A1:D11, use the command

=ZTEST(A1:D11, 3218, 287)

This returns 4.362E-05. The value differs slightly from the hand computation because Excel works with the unrounded sample average (about 3391.8) rather than 3392.

Observations:

1. This test of significance is called the z-test, named after the test statistic.

2. The z-test is best used with large samples – so that the normal approximation may be safely made.

3. Notice we have not proven beyond a shadow of a doubt that the new storefront was effective in increasing the number of patrons.

4. The alternative hypothesis is that the daily average of patrons is greater than 3218. It is not that the new average is exactly equal to 3392. In other words, the alternative hypothesis was a compound hypothesis, not a simple hypothesis.

5. Small values of P are evidence against the null hypothesis; they indicate that something besides chance is at work.

6. We are NOT saying that there is 1 chance in 20,000 for the null hypothesis to be correct.


Another (equivalent) procedure for hypothesis testing:

In the previous problem, if the test statistic were any number greater than 1.645, then we would have obtained a P-value less than 0.05, the specified α. (Why?)

We call zc = 1.645 the critical value, and the interval (1.645, ∞) is called the rejection region. Since zs lies in the rejection region, we choose to reject H0.

[Figure: standard normal curve with a 5% right tail beyond 1.645]

In terms of the customers, we have the critical value

$$x_c = 3218 + z_c \, \sigma_{\bar{X}} = 3218 + (1.645)\frac{287}{\sqrt{42}} \approx 3290.8$$
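The critical-value procedure can be checked numerically. The sketch below is an illustration under the same assumptions as the example (μ = 3218, σ = 287, n = 42, α = 0.05, rounded sample average 3392); the variable names are chosen here and do not come from the lecture.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, alpha = 3218, 287, 42, 0.05
se = sigma / sqrt(n)  # SD of the sample average

# Critical value on the z scale: the point with area alpha to its right.
z_c = NormalDist().inv_cdf(1 - alpha)

# The same cut-off expressed in customers per day.
x_c = mu0 + z_c * se

sample_mean = 3392  # rounded sample average from the example
z_s = (sample_mean - mu0) / se

print(f"z_c = {z_c:.3f}, x_c = {x_c:.1f}")
# Comparing on either scale gives the same decision.
print("reject H0 (z scale):", z_s > z_c)
print("reject H0 (customer scale):", sample_mean > x_c)
```

Because the map from the sample average to z is increasing, the two comparisons always agree, which is why the P-value and critical-value methods are equivalent.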


Hypothesis testing may be correctly conducted by using the P-value (the first method) or by using the critical value (as we just discussed).

In scientific articles, both are usually reported, even though the two methods are logically equivalent.

As we now discuss, the critical value also eases computation of the power of the test.

Example: In the previous example, suppose that the redesign increased the average number of patrons by 100, from 3218 to 3318. How likely is it that a sample of only 42 days will come to the correct conclusion (by rejecting the null hypothesis)?

Note: Recall that this is called the power of the test.

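Solution: Recall the critical value: xc = 3290.8

$$\begin{aligned}
P(\text{Reject } H_0 \mid \mu = 3318) &= P(\bar{X} > 3290.8 \mid \mu = 3318) \\
&= P\left(\frac{\bar{X} - \mu}{\sigma_{\bar{X}}} > \frac{3290.8 - 3318}{287/\sqrt{42}}\right) \\
&= P(Z > -0.6142) \\
&\approx 0.7305
\end{aligned}$$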

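[Figure: null distribution centered at 3218 and alternative distribution centered at 3318, with the critical value 3291 marked. The area to the right of 3291 is 5% under the null distribution and 73.05% under the alternative distribution.]

[Figure: Power of the test (1-β) as a function of the true average, for true averages from 3225 to 3400]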
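The power computation for the storefront example can be repeated for any assumed true average. The following is a minimal sketch, assuming the same μ = 3218, σ = 287, n = 42 and α = 0.05; the helper name `power` is hypothetical. It uses the unrounded critical value, so the result at 3318 is about 0.73 and may differ from 0.7305 in the last digits.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, alpha = 3218, 287, 42, 0.05
se = sigma / sqrt(n)                              # SD of the sample average
x_c = mu0 + NormalDist().inv_cdf(1 - alpha) * se  # critical value, about 3290.8


def power(true_mean):
    """P(sample average > x_c) when the true average is true_mean."""
    return 1.0 - NormalDist(mu=true_mean, sigma=se).cdf(x_c)


print(f"power at 3318: {power(3318):.3f}")

# Tabulate the power over a range of possible true averages.
for m in range(3225, 3401, 25):
    print(m, round(power(m), 3))
```

When the true average equals 3218 the function returns α itself, since rejecting a true null hypothesis is exactly the Type I error.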

Example: The average braking distance from 60 mph of a Mercury Sable is 159 feet, with an SD of 23.5 feet. Sables equipped with (hopefully) improved tires have just undergone early testing; the results of the first 45 tests are shown below. Does this indicate that the new tires have decreased the braking distance? Use α = 0.05.

139.6 170.4 127.9 157.7 172.4 175.4 182.0 157.6 125.8
150.9 157.1 175.4 174.2 136.4 171.1 147.3 137.8 167.6
151.1 159.1 120.1 143.2 143.6 171.8 151.0 146.0 120.3
134.9 150.8 178.8 179.2 126.2 125.5 159.4 132.2 118.2
120.9 165.0 164.4 183.0 175.2 175.5 167.7 163.2 180.0
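One way to set up this example is as a left-tailed z test with H0: μ = 159 and Ha: μ < 159. The slides do not show a worked solution, so the sketch below is an assumption about the intended setup rather than the lecture's own answer; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist, mean

# The 45 braking distances (feet) listed in the example.
distances = [
    139.6, 170.4, 127.9, 157.7, 172.4, 175.4, 182.0, 157.6, 125.8,
    150.9, 157.1, 175.4, 174.2, 136.4, 171.1, 147.3, 137.8, 167.6,
    151.1, 159.1, 120.1, 143.2, 143.6, 171.8, 151.0, 146.0, 120.3,
    134.9, 150.8, 178.8, 179.2, 126.2, 125.5, 159.4, 132.2, 118.2,
    120.9, 165.0, 164.4, 183.0, 175.2, 175.5, 167.7, 163.2, 180.0,
]

mu0, sigma, alpha = 159, 23.5, 0.05
n = len(distances)
x_bar = mean(distances)

z_s = (x_bar - mu0) / (sigma / sqrt(n))
# Left-tailed test: "extreme" means a test statistic this small or smaller.
p_value = NormalDist().cdf(z_s)

print(f"n = {n}, average = {x_bar:.2f}, z_s = {z_s:.3f}, P = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

Under these assumptions the P-value comes out near 0.08, which is above α = 0.05, so this sample alone would not lead to rejecting H0.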

Example: Compute the probability of committing a Type II error if the average braking distance with the improved tires is now 155 feet.
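A Type II error here means failing to reject H0 even though the true average is 155 feet. The sketch below assumes the left-tailed test with α = 0.05, σ = 23.5 and n = 45 from the braking example; it is one possible way to organize the computation, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, alpha = 159, 23.5, 45, 0.05
true_mean = 155
se = sigma / sqrt(n)  # SD of the sample average

# Left-tailed test: reject H0 when the sample average falls below x_c.
x_c = mu0 - NormalDist().inv_cdf(1 - alpha) * se

# Type II error: the sample average stays above x_c although mu = 155.
beta = 1.0 - NormalDist(mu=true_mean, sigma=se).cdf(x_c)

print(f"critical value x_c = {x_c:.2f} feet")
print(f"beta = {beta:.4f}, power = {1 - beta:.4f}")
```

The power of the test is 1 - β, so a large β indicates that 45 tests are not enough to detect a 4-foot improvement reliably.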

Conceptual Questions:

1) We made a test of significance because (choose one)

i) We knew what was in the box, but did not know how the sample would turn out; or

ii) We knew how the sample turned out, but did not know what was in the box.


2) The null hypothesis says that the average of the (sample / box) is 159 feet.


3) True or False:

a) The observed significance level of 8% depends on the data (i.e. sample)

b) There are 92 chances out of 100 for the alternative hypothesis to be correct.


4) Suppose only 10 tests were performed instead of 45. Should we use the normal curve to compute P?


5) True or False:

a) A “highly statistically significant” result cannot possibly be due to chance.

b) If a sample difference is “highly statistically significant,” there is less than a 1% chance for the null hypothesis to be correct.


6) True or False:

a) If P = 43%, then the null hypothesis looks plausible.

b) If P = 0.43%, then the null hypothesis looks implausible.

The Two-Sided z Test

Example: A company claims to have designed a new fishing line that has a mean breaking strength of 8 kg with an SD of 0.5 kg. Consumer Reports tests a random sample of 45 lines; the results are shown below. Test the validity of the company’s claim.

7.56 7.93 7.36 8.19 7.48 7.82 7.43 8.24 7.44
7.96 7.72 7.84 7.56 8.05 8.16 7.45 7.27 7.33
7.36 7.50 8.06 7.51 7.40 8.21 8.18 7.81 8.10
7.81 7.33 7.87 8.15 8.23 7.95 7.72 7.45 7.59
7.84 7.72 8.17 7.40 7.72 7.34 7.74 7.93 7.54

Notes:

• To avoid data snooping, we must use a two-tailed test. Before the tests were actually performed, we had no a priori reason to think that the sample average would return either too high or too low.

• For a two-sided alternative hypothesis, the P-value is twice as large as for a one-sided alternative.
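The two-sided test for the fishing line example can be sketched as follows, with H0: μ = 8 and Ha: μ ≠ 8. The slides do not state a significance level or a worked answer for this example, so the code only reports the test statistic and the two-sided P-value; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist, mean

# The 45 breaking strengths (kg) listed in the example.
strengths = [
    7.56, 7.93, 7.36, 8.19, 7.48, 7.82, 7.43, 8.24, 7.44,
    7.96, 7.72, 7.84, 7.56, 8.05, 8.16, 7.45, 7.27, 7.33,
    7.36, 7.50, 8.06, 7.51, 7.40, 8.21, 8.18, 7.81, 8.10,
    7.81, 7.33, 7.87, 8.15, 8.23, 7.95, 7.72, 7.45, 7.59,
    7.84, 7.72, 8.17, 7.40, 7.72, 7.34, 7.74, 7.93, 7.54,
]

mu0, sigma = 8, 0.5
n = len(strengths)
x_bar = mean(strengths)

z_s = (x_bar - mu0) / (sigma / sqrt(n))
# Two-sided: count both tails, so double the one-tail area beyond |z_s|.
p_value = 2 * NormalDist().cdf(-abs(z_s))

print(f"n = {n}, average = {x_bar:.4f}, z_s = {z_s:.3f}, P = {p_value:.5f}")
```

Doubling the one-tail area is what makes the two-sided P-value twice as large as the one-sided value for the same test statistic.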

Example. Let’s take a look at the results of the Salk vaccine trial, which we first saw back in Lecture #2:

             Number    Polio Cases
Treatment    200745         57
Control      201229        142
Total        401974        199

Does it appear that the vaccine was effective?

Note: If the vaccine were ineffective, then we would expect the 199 polio cases to be distributed between the two groups with

p = 200745/401974 = 0.499398,

and the 57 polio cases among the treated would be just due to a run of luck.
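The note above suggests treating each of the 199 polio cases as landing in the treatment group with chance 200745/401974 if the vaccine had no effect. The sketch below follows that idea using the normal approximation to the count of treated cases. The slides shown here stop before any computation, so this setup, the one-sided tail, and the variable names are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist

n_treatment, n_control = 200745, 201229
cases_treatment, cases_control = 57, 142
total_cases = cases_treatment + cases_control  # 199

# Chance that a given case falls in the treatment group, assuming H0.
p = n_treatment / (n_treatment + n_control)

expected = total_cases * p             # expected treated cases, assuming H0
sd = sqrt(total_cases * p * (1 - p))   # SD of that count

z_s = (cases_treatment - expected) / sd
# One-sided tail: this few treated cases or fewer. Double it for a two-sided test.
p_value = NormalDist().cdf(z_s)

print(f"p = {p:.6f}, expected = {expected:.2f}, SD = {sd:.3f}")
print(f"z_s = {z_s:.3f}, one-sided P = {p_value:.2e}")
```

A test statistic about six SDs below the expected count is very hard to attribute to a run of luck, whichever tail convention is used.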
