Hypothesis Testing

Statistical Hypothesis: General Concepts

The problem confronting the scientist or engineer is not so much the estimation of a population parameter but rather the formation of a data-based decision procedure that can produce a conclusion about some scientific system.

Definition 10.1 A statistical hypothesis is an assertion or conjecture concerning one or more populations.

“The truth or falsity of a statistical hypothesis is never known with absolute certainty unless we examine the entire population.”

“Evidence from the sample that is inconsistent with the stated hypothesis leads to a rejection of the hypothesis, whereas evidence supporting the hypothesis leads to its acceptance.”

“… acceptance of a hypothesis merely implies that the data do not give sufficient evidence to refute it. On the other hand, rejection implies that the sample evidence refutes it. Put another way, rejection means that there is a small probability of obtaining sample information observed when, in fact, the hypothesis is true.”

Elements of a Statistical Test

- Null hypothesis, H0, about one or more population parameters
- Alternative hypothesis, H1, that we will accept if we decide to reject the null hypothesis
- Test statistic, computed from sample data
- Rejection region, indicating the values of the test statistic that will imply rejection of the null hypothesis

Definition 10.2 Rejection of the null hypothesis when it is true is called a type I error. The probability of committing a type I error, also called the level of significance, is denoted by the Greek letter α.

Definition 10.3 Acceptance of the null hypothesis when it is false is called a type II error. The probability of committing a type II error, denoted by β, is impossible to compute unless we have a specific alternative hypothesis.

Notes on important properties of α and β.

1. The type I error and type II error are related. A decrease in the probability of one generally results in an increase in the probability of the other.
2. The size of the critical region, and therefore the probability of committing a type I error, can always be reduced by adjusting the critical value(s).
3. An increase in the sample size n will reduce α and β simultaneously.
4. If the null hypothesis is false, β is a maximum when the true value of a parameter approaches the hypothesized value. The greater the distance between the true value and the hypothesized value, the smaller β will be.

Definition 10.4 The power of a test is the probability of rejecting H0 given that a specific alternative is true.
- The power of a test can be computed as 1 - β.
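
As an illustration of Definition 10.4, consider a lower-tailed test on a mean that rejects H0 when the sample mean falls below a critical value c. The setup is an assumption made for this sketch: a known standard deviation σ, a sample of size n, an approximately normal sample mean, and a specific alternative μ1 (the symbols c, μ1 and Φ, the standard normal cumulative distribution function, are introduced here only for the example). Under those assumptions the power can be written as:

```latex
\[
1 - \beta
  = P\!\left(\bar{X} < c \,\middle|\, \mu = \mu_1\right)
  = \Phi\!\left(\frac{c - \mu_1}{\sigma / \sqrt{n}}\right)
\]
```

The further μ1 lies below c, the larger the argument of Φ and the closer the power is to 1.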

10.3 One- and Two-Tailed Tests

A test of any hypothesis, where the alternative is one-sided, such as

H0: θ = θ0,
H1: θ > θ0,

or perhaps

H0: θ = θ0,
H1: θ < θ0,

is called a one-tailed test.

A test of any statistical hypothesis where the alternative is two-sided, such as

H0: θ = θ0,
H1: θ ≠ θ0,

is called a two-tailed test, since the critical region is split into two parts, often having equal probabilities placed in each tail of the distribution of the test statistic.

Guidelines as to which hypothesis should be stated as H0 and which should be stated as H1:

1. Determine the claim you want to test.
2. Should the claim suggest a simple direction such as more than, less than, superior to, inferior to, and so on, then H1 will be stated using the inequality symbol (< or >) corresponding to the suggested direction.
3. Should the claim suggest a compound direction (equality as well as direction) such as at least, equal to or greater, at most, no more than, and so on, then this entire compound direction (≤ or ≥) is expressed as H0, but using only the equality sign, and H1 is given by the opposite direction.
4. If no direction whatsoever is suggested by the claim, then H1 is stated using the not-equal symbol (≠).

14/309) A manufacturer has developed a new fishing line, which he claims has a mean breaking strength of 15 kilograms with a standard deviation of 0.5 kilogram. To test the hypothesis that μ = 15 kilograms against the alternative that μ < 15 kilograms, a random sample of 50 lines will be tested. The critical region is defined to be x̄ < 14.9.
a. Find the probability of committing a type I error when H0 is true.
b. Evaluate β for the alternatives μ = 14.8 and μ = 14.9 kilograms.
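
The two parts of this exercise can be checked numerically. The sketch below is not the worked solution from the slides; it assumes that, with n = 50, the sample mean is approximately normal with standard deviation σ/√n, and the function names are chosen here for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the exercise statement.
mu0 = 15.0        # hypothesized mean breaking strength (kg)
sigma = 0.5       # population standard deviation (kg)
n = 50            # sample size
critical = 14.9   # H0 is rejected when the sample mean is below this value

# Assumption: the sample mean is approximately normal with this standard error.
std_error = sigma / sqrt(n)


def type_i_error():
    """alpha = P(sample mean < critical value) when mu = mu0."""
    return NormalDist(mu0, std_error).cdf(critical)


def type_ii_error(mu_alt):
    """beta = P(sample mean >= critical value) when mu = mu_alt."""
    return 1.0 - NormalDist(mu_alt, std_error).cdf(critical)


alpha = type_i_error()
beta_148 = type_ii_error(14.8)
beta_149 = type_ii_error(14.9)

print(f"alpha            = {alpha:.4f}")
print(f"beta (mu = 14.8) = {beta_148:.4f}")
print(f"beta (mu = 14.9) = {beta_149:.4f}")
```

When the alternative sits exactly at the critical value (μ = 14.9), half of the sampling distribution lies on each side of it, so β is 0.5. This is consistent with the note that β grows as the true value approaches the hypothesized one.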

10.4 The Use of P-Values in Decision Making

The P-value approach is designed to give the user an alternative (in terms of a probability) to a mere “reject” or “do not reject” conclusion. The P-value computation also gives the user important information when the z-value falls well into the ordinary critical region.

Definition 10.5 A P-value is the lowest level (of significance) at which the observed value of the test statistic is significant.

Summary of the hypothesis testing process

1. State the null hypothesis H0 that θ = θ0.
2. Choose an appropriate alternative hypothesis H1 from one of the alternatives θ < θ0, θ > θ0, or θ ≠ θ0.
3. Choose a significance level of size α.
4. Select the appropriate test statistic and establish the critical region. (If the decision is to be based on a P-value, it is not necessary to state the critical region.)
5. Compute the value of the test statistic from the sample data.
6. Decision: Reject H0 if the test statistic has a value in the critical region (or if the computed P-value is less than or equal to the desired significance level α); otherwise, do not reject H0.
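
Steps 5 and 6 can be sketched in code for the common case where the test statistic is a z-value. The sketch assumes the statistic follows a standard normal distribution under H0; the function names and the labels "less", "greater" and "two-sided" are illustrative choices, not notation from the slides.

```python
from statistics import NormalDist


def z_test_p_value(z, alternative):
    """P-value of an observed z statistic, assuming z ~ N(0, 1) under H0.

    alternative: "less", "greater" or "two-sided", matching the three
    possible forms of H1.
    """
    std_normal = NormalDist()
    if alternative == "less":
        return std_normal.cdf(z)
    if alternative == "greater":
        return 1.0 - std_normal.cdf(z)
    if alternative == "two-sided":
        return 2.0 * (1.0 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")


def decide(p_value, alpha):
    """Step 6: reject H0 when the P-value is at most the significance level."""
    return "reject H0" if p_value <= alpha else "do not reject H0"


# Hypothetical observed statistic, used only to show the calls.
p = z_test_p_value(-2.10, "less")
print(f"P-value = {p:.4f}, decision at alpha = 0.05: {decide(p, 0.05)}")
```

Reporting the P-value alongside the decision lets a reader apply a different significance level without repeating the test.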

Table 10.3 Continued…

Since z < z.05, reject H0. μA - μB < 12

Solution:

Choice of Sample Size

When the alternative hypothesis is μ < μ0, the required sample size is

\[ n = \frac{(z_\alpha + z_\beta)^2 \sigma^2}{\delta^2}. \]

For the one-tailed test, the expression for the required sample size when n = n1 = n2 is given by

\[ n = \frac{(z_\alpha + z_\beta)^2 (\sigma_1^2 + \sigma_2^2)}{\delta^2}. \]
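
The two sample-size expressions can be evaluated as in the sketch below. It reads the formulas as n = (zα + zβ)²σ²/δ² for one mean and n = (zα + zβ)²(σ1² + σ2²)/δ² for two means, and takes δ to be the distance between the hypothesized value and the specific alternative at which the type II error probability should be β; that reading of δ, the function names, and the numbers in the example are assumptions made for illustration.

```python
from math import ceil
from statistics import NormalDist


def z_upper(tail_prob):
    """Value z such that the area to its right under N(0, 1) is tail_prob."""
    return NormalDist().inv_cdf(1.0 - tail_prob)


def sample_size_one_mean(alpha, beta, sigma, delta):
    """One-tailed test on a single mean with known sigma."""
    n = (z_upper(alpha) + z_upper(beta)) ** 2 * sigma ** 2 / delta ** 2
    return ceil(n)  # round up so the requested alpha and beta are met


def sample_size_two_means(alpha, beta, sigma1, sigma2, delta):
    """One-tailed test on two means with equal sample sizes n = n1 = n2."""
    n = (z_upper(alpha) + z_upper(beta)) ** 2 * (sigma1 ** 2 + sigma2 ** 2) / delta ** 2
    return ceil(n)


# Hypothetical inputs: alpha = beta = 0.05, sigma = 5, detect a shift of 1 unit.
print(sample_size_one_mean(0.05, 0.05, sigma=5.0, delta=1.0))
print(sample_size_two_means(0.05, 0.05, sigma1=5.0, sigma2=5.0, delta=1.0))
```

Because δ enters squared in the denominator, halving the shift to be detected roughly quadruples the required sample size.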

Solution:

Testing a Proportion: Small Samples

1. H0: p = p0.
2. One of the alternatives H1: p < p0, p > p0, or p ≠ p0.
3. Choose a level of significance equal to α.
4. Test statistic: Binomial variable X with p = p0.
5. Computations: Find x, the number of successes, and compute the appropriate P-value.
6. Decision: Draw appropriate conclusions based on the P-value.
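
Step 5 of the small-sample procedure can be sketched with exact binomial probabilities. The two-sided rule used below (double the smaller tail probability, capped at 1) is one common convention and is an assumption here, as are the function names and the example numbers.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for a binomial variable with n trials and success probability p."""
    return comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def binomial_p_value(x, n, p0, alternative):
    """Exact P-value for H0: p = p0 given x successes in n trials."""
    lower_tail = sum(binom_pmf(k, n, p0) for k in range(0, x + 1))   # P(X <= x)
    upper_tail = sum(binom_pmf(k, n, p0) for k in range(x, n + 1))   # P(X >= x)
    if alternative == "less":
        return lower_tail
    if alternative == "greater":
        return upper_tail
    if alternative == "two-sided":
        # Assumed convention: double the smaller tail, never exceeding 1.
        return min(1.0, 2.0 * min(lower_tail, upper_tail))
    raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")


# Hypothetical data: 8 successes in 15 trials, testing p0 = 0.7 two-sided.
print(f"{binomial_p_value(8, 15, 0.7, 'two-sided'):.4f}")
```

The P-value is then compared with the chosen α in step 6, exactly as in the general procedure.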

Testing a Proportion: Large Samples

One-tailed Test
H0: p = p0
H1: p > p0 (or p < p0)
Rejection Region: z > zα (or z < -zα)

Two-tailed Test
H0: p = p0
H1: p ≠ p0
Rejection Region: z < -zα/2 or z > zα/2

In both cases the test statistic is

\[ z = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}}, \]

where q0 = 1 – p0
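
A minimal sketch of the large-sample test follows. It assumes the usual normal-approximation statistic z = (p̂ - p0)/√(p0 q0/n) with p̂ = x/n, which is consistent with the definition q0 = 1 – p0 given here; the function name and the example counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(x, n, p0, alternative):
    """Large-sample z test for H0: p = p0 with x successes in n trials.

    Returns the z statistic and its P-value. Assumes n is large enough
    for the normal approximation to the binomial to be reasonable.
    """
    q0 = 1.0 - p0
    p_hat = x / n
    z = (p_hat - p0) / sqrt(p0 * q0 / n)
    std_normal = NormalDist()
    if alternative == "less":
        p_value = std_normal.cdf(z)
    elif alternative == "greater":
        p_value = 1.0 - std_normal.cdf(z)
    elif alternative == "two-sided":
        p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")
    return z, p_value


# Hypothetical data: 70 successes in 100 trials, H0: p = 0.6 against p > 0.6.
z, p_value = proportion_z_test(70, 100, 0.6, "greater")
print(f"z = {z:.3f}, P-value = {p_value:.4f}")
```

Rejecting when the P-value is at most α is equivalent to checking whether z falls in the rejection region listed for the chosen alternative.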

Assumption:

Two Samples: Tests on Two Proportions

Test:
1. H0: p1 = p2
2. H1: p1 < p2, p1 > p2, or p1 ≠ p2

Or equivalently
H0: p1 – p2 = 0
H1: p1 – p2 < 0, p1 – p2 > 0, or p1 – p2 ≠ 0

3. The z-value for testing p1 = p2 is determined from the formula

\[ z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\,[(1/n_1) + (1/n_2)]}}, \]

where p̂, the pooled estimate of the proportion p, is p̂ = (x1 + x2)/(n1 + n2), q̂ = 1 - p̂, and x1 and x2 are the numbers of successes in each of the two samples.

4. The critical regions for the appropriate hypotheses are the following:
- for the alternative p1 ≠ p2 at the α-level of significance, the critical region is z < -zα/2 or z > zα/2.
- for a test where the alternative is p1 < p2, the critical region is z < -zα.
- when the alternative is p1 > p2, the critical region is z > zα.
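
The pooled z-value can be computed as in the sketch below. It takes the sample proportions to be x1/n1 and x2/n2 and uses q̂ = 1 - p̂ for the pooled estimate; the function name and the counts in the example are hypothetical.

```python
from math import sqrt


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 = p2 using the pooled proportion estimate."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)   # pooled estimate of the common p
    q_pooled = 1.0 - p_pooled
    std_error = sqrt(p_pooled * q_pooled * (1.0 / n1 + 1.0 / n2))
    return (p1_hat - p2_hat) / std_error


# Hypothetical data: 120 successes out of 200 versus 240 out of 500.
z = two_proportion_z(120, 200, 240, 500)
print(f"z = {z:.3f}")
```

The resulting z is compared with -zα, zα or ±zα/2 according to the alternative chosen in step 2.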

Goodness-of-Fit Test

Theorem 10.1 A goodness-of-fit test between observed and expected frequencies is based on the quantity

\[ \chi^2 = \sum_{i=1}^{k} \frac{(o_i - e_i)^2}{e_i}, \]

where χ² is a value of a random variable whose sampling distribution is approximated very closely by the chi-squared distribution with v = k – 1 degrees of freedom. The symbols oi and ei represent the observed and expected frequencies, respectively, for the ith cell.

For a contingency table, the general rule for obtaining the expected frequency of any cell is given by the following formula:

\[ \text{expected frequency} = \frac{(\text{column total}) \times (\text{row total})}{\text{grand total}} \]
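
The goodness-of-fit quantity of Theorem 10.1 is simple to compute directly. In the sketch below the die-rolling counts are hypothetical, and the critical value is passed in by the caller from a chi-squared table (11.070 is the tabulated upper 5% point for 5 degrees of freedom) rather than computed.

```python
def chi_square_gof(observed, expected):
    """Return the goodness-of-fit statistic and its degrees of freedom.

    observed, expected: sequences of cell frequencies of equal length k.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    degrees_of_freedom = len(observed) - 1
    return statistic, degrees_of_freedom


# Hypothetical data: 120 rolls of a die; a fair die gives 20 expected per face.
observed = [20, 22, 17, 18, 19, 24]
expected = [20] * 6
statistic, df = chi_square_gof(observed, expected)

critical_value = 11.070  # chi-squared table, upper 5% point, 5 degrees of freedom
print(f"chi-squared = {statistic:.2f} with v = {df}")
print("reject H0" if statistic > critical_value else "do not reject H0")
```

A small statistic means the observed frequencies sit close to the expected ones, so only large values lead to rejection.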

Test for Independence

Calculate

\[ \chi^2 = \sum_{i} \frac{(o_i - e_i)^2}{e_i}, \]

where the summation extends over all rc cells in the r x c contingency table. If χ² > χ²α with v = (r – 1)(c – 1) degrees of freedom, reject the null hypothesis of independence at the α level of significance; otherwise, do not reject the null hypothesis.

In a 2 x 2 contingency table, where we have only 1 degree of freedom, a correction called Yates’ correction for continuity is applied. The correction formula then becomes

\[ \chi^2\,(\text{corrected}) = \sum_{i} \frac{(|o_i - e_i| - 0.5)^2}{e_i}. \]
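
The independence test, including the expected-frequency rule and the optional Yates correction, can be sketched as follows. The table is represented as a list of rows of observed counts; the function names and the 2 x 2 example are hypothetical, and the corrected statistic applies the formula exactly as written above.

```python
def expected_frequencies(table):
    """Expected count per cell: (row total) x (column total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_independence(table, yates=False):
    """Chi-squared statistic and degrees of freedom for an r x c table.

    yates=True subtracts 0.5 from each |o - e| before squaring, as in
    the continuity correction for 2 x 2 tables.
    """
    expected = expected_frequencies(table)
    statistic = 0.0
    for observed_row, expected_row in zip(table, expected):
        for o, e in zip(observed_row, expected_row):
            deviation = abs(o - e)
            if yates:
                deviation -= 0.5
            statistic += deviation ** 2 / e
    rows, cols = len(table), len(table[0])
    return statistic, (rows - 1) * (cols - 1)


# Hypothetical 2 x 2 table of observed counts.
table = [[10, 20],
         [20, 10]]
plain, df = chi_square_independence(table)
corrected, _ = chi_square_independence(table, yates=True)
print(f"chi-squared = {plain:.3f}, corrected = {corrected:.3f}, v = {df}")
```

The corrected value is never larger than the uncorrected one when every |o - e| is at least 0.5, which makes the test slightly more conservative.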

Test for Homogeneity
Used when we are testing the hypothesis that the population proportions within each row are the same.

Testing for Several Proportions
Used when we are interested in testing the null hypothesis

H0: p1 = p2 = … = pk
H1: not all are equal

- To perform this test, we first observe independent random samples of size n1, n2, …, nk from the k populations and arrange the data as in the 2 x k contingency table.
- The test procedure is identical to the test for homogeneity or the test for independence:

\[ \chi^2 = \sum_{i} \frac{(o_i - e_i)^2}{e_i}, \]

where the summation extends over all 2k cells of the table and v = (2 – 1)(k – 1) = k – 1 degrees of freedom.