Hypothesis Testing


Intro to Hypothesis Testing

• Make a conjecture and test its validity

• Null hypothesis, Ho: Make a conjecture about a population parameter

• Alternative hypothesis, Ha: Accepted when the null hypothesis is rejected

• A test statistic is computed from sample data and tested to see if it falls in the rejection region

Testing Errors

• We would like for the test to work properly, i.e. if the conjecture is true the test indicates so, and if it is false, the test indicates that

• Type I Error: This is rejecting the null hypothesis when it is in fact true

• Type II Error: This is when the null hypothesis is not rejected when it is actually false

Graphical Interpretation

Possible Test Outcomes

Example 4.5: Example of Testing Errors

ASSUME:
1. Population of 10,000 people.
2. Test for flu virus that has 95% confidence level.
3. 9,200 test negative for flu.
4. 800 test positive.

Then:
1. Type I error: people who test positive for flu but do not have it: 800 × 0.05 = 40.
2. Type II error: people who have flu but test negative for it: 9,200 × 0.05 = 460. Thus β = 460/10,000 = 0.046 or 4.6%.

460 + 40 = 500 people receive an incorrect test result, or 5% of the population.
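The arithmetic in Example 4.5 can be checked with a few lines of Python. This is only a sketch of the slide's own calculation: the variable names are illustrative, and the 5% error rate is taken as the complement of the stated 95% confidence level.

```python
# Figures from Example 4.5; names are illustrative only.
population = 10_000
tested_negative = 9_200
tested_positive = 800
error_rate = 0.05  # complement of the stated 95% confidence level

false_positives = tested_positive * error_rate  # Type I errors
false_negatives = tested_negative * error_rate  # Type II errors
total_wrong = false_positives + false_negatives

print(round(false_positives), round(false_negatives), round(total_wrong))
print(f"share of population with a wrong result: {total_wrong / population:.1%}")
```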

Type I and Type II Errors

• We can fix the probability of the Type I error: it is α

• Generally, we cannot determine the probability of the Type II error

• The probability of the Type II error decreases with increasing sample size or with increasing probability of the Type I error
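The trade-off in the last bullet can be illustrated numerically. The sketch below is not from the slides: it assumes a hypothetical upper-tail z-test with known standard deviation `sigma` and a hypothetical true mean that exceeds the conjectured value by `shift`. In practice the true shift is unknown, which is why β generally cannot be determined.

```python
from statistics import NormalDist

def type_ii_probability(alpha, n, shift, sigma):
    """Beta for an upper-tail z-test when the true mean exceeds mu_0 by `shift`.

    Hypothetical setup for illustration: known sigma, sample size n.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)  # critical value of the test
    # Probability that the test statistic stays below the critical value
    # even though the null hypothesis is false.
    return std_normal.cdf(z_alpha - shift * n ** 0.5 / sigma)

for alpha in (0.01, 0.05, 0.10):
    for n in (10, 20, 40):
        beta = type_ii_probability(alpha, n, shift=0.5, sigma=1.0)
        print(f"alpha={alpha:.2f}  n={n:2d}  beta={beta:.3f}")
```

Reading down the printed rows, β shrinks as n grows for a fixed α, and shrinks as α grows for a fixed n.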

Outcomes of Hypothesis Test

• Reject the null hypothesis

• Do not reject the null hypothesis

• Caution – not rejecting the null hypothesis does not necessarily mean that the null hypothesis is true

Guilty vs Not Guilty

In the US, a person charged with a crime is assumed innocent until proven guilty. The null hypothesis is that the person is innocent. It is our custom (based on the Constitution) to use a small α (low probability of a Type I error), which makes it unlikely that an innocent person will be wrongly convicted. The trade-off is that the probability of a Type II error increases, and many guilty people will go free.

Types of Tests – One-Tail

ONE-TAIL TEST: Checks whether the test statistic is greater than (or less than) a critical value. Only one tail of the distribution is of concern.
Typical null hypothesis, Ho: Statistic1 = Statistic2
Typical alternative hypothesis, Ha: Statistic1 > Statistic2 or Statistic1 < Statistic2

Types of Tests – Two-Tail

TWO-TAIL TEST: Equivalent to a confidence interval. The test statistic falls either inside the region or outside it.
Typical null hypothesis, Ho: Statistic1 = Statistic2
Typical alternative hypothesis, Ha: Statistic1 ≠ Statistic2

Test Concerning Population Mean

• One-tail test is used to see if a population mean is greater than (or less than) a conjectured value

• The null hypothesis in a one-tail test is stated as an equality, but it makes more sense (in Bon’s opinion) to make it an inequality

• The two-tail test is used to see if a population mean is different from a conjectured value

Hypothesis Test for Mean

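One-tailed test
• Null hypothesis: $H_0: \mu \le \mu_0$ (or $\mu \ge \mu_0$)
• Alternative hyp.: $H_a: \mu > \mu_0$ (or $\mu < \mu_0$)
• Test statistic: $t = \dfrac{\bar{y} - \mu_0}{S/\sqrt{n}}$
• Rejection region: $t > t_{\alpha}$ (or $t < -t_{\alpha}$)

Two-tailed test
• Null hypothesis: $H_0: \mu = \mu_0$
• Alternative hyp.: $H_a: \mu \ne \mu_0$
• Test statistic: $t = \dfrac{\bar{y} - \mu_0}{S/\sqrt{n}}$
• Rejection region: $|t| > t_{\alpha/2}$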

←not appropriate

Better Interpretation

The question is whether the EDM is properly calibrated or not. The calibrated baseline length, 400.008 m, is considered a fact. This hypothesis can be made before going to the calibration range.

Ho: μ = 400.008

Ha: μ ≠ 400.008

$t = \dfrac{\bar{y} - \mu_0}{S/\sqrt{n}} = \dfrac{400.012 - 400.008}{0.002/\sqrt{20}} = 8.94$

$t_{\alpha/2,\,19} = 2.093$

8.94 > 2.093, therefore reject the null hypothesis
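As a check on the EDM calculation, the test statistic can be recomputed in Python. This is a minimal sketch: the variable names are illustrative, and the critical value 2.093 is simply the tabulated figure quoted in the example rather than something the code derives.

```python
import math

n = 20            # number of measurements
y_bar = 400.012   # sample mean (m)
s = 0.002         # sample standard deviation (m)
mu_0 = 400.008    # calibrated baseline length (m)
t_crit = 2.093    # t_{alpha/2, 19}, tabulated value quoted in the example

t = (y_bar - mu_0) / (s / math.sqrt(n))
reject_null = abs(t) > t_crit  # two-tailed rejection rule

print(f"t = {t:.2f}, reject Ho: {reject_null}")
```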

Hypothesis Test for Variance

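One-tailed test
• Null hypothesis: $H_0: \sigma^2 \le \sigma_0^2$ (or $\sigma^2 \ge \sigma_0^2$)
• Alternative hyp.: $H_a: \sigma^2 > \sigma_0^2$ (or $\sigma^2 < \sigma_0^2$)
• Test statistic: $\chi^2 = \dfrac{vS^2}{\sigma_0^2}$
• Rejection region: $\chi^2 > \chi^2_{\alpha}$ (or $\chi^2 < \chi^2_{1-\alpha}$)

Two-tailed test
• Null hypothesis: $H_0: \sigma^2 = \sigma_0^2$
• Alternative hyp.: $H_a: \sigma^2 \ne \sigma_0^2$
• Test statistic: $\chi^2 = \dfrac{vS^2}{\sigma_0^2}$
• Rejection region: $\chi^2 < \chi^2_{1-\alpha/2}$ or $\chi^2 > \chi^2_{\alpha/2}$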

Example 4.7

The owner of a surveying firm wants all surveying technicians to be able to read a particular theodolite to within ±1.5". To test this value, the owner asks a senior field crew chief to perform a reading test with the instrument. The crew chief reads the plates 30 times and obtains S = 0.9". Does this support the 1.5" limit at a 5% significance level? (Use a one-tail test.)

Ho: σ² ≤ (1.5")²   Ha: σ² > (1.5")²

$\chi^2 = \dfrac{vS^2}{\sigma_0^2} = \dfrac{(30-1)(0.9)^2}{1.5^2} = 10.44$

$\chi^2_{0.05,\,29} = 42.56$

10.44 < 42.56, so do not reject the null hypothesis

Example 4.7 - revised

Same problem and data as Example 4.7 (30 readings, S = 0.9", limit of 1.5", 5% significance level, one-tail test), but with the hypotheses reversed.

Ho: σ² ≥ (1.5")²   Ha: σ² < (1.5")²

$\chi^2 = \dfrac{vS^2}{\sigma_0^2} = \dfrac{(30-1)(0.9)^2}{1.5^2} = 10.44$

$\chi^2_{0.95,\,29} = 17.71$

10.44 < 17.71, so reject the null hypothesis

This is a much more definitive statement.
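Both versions of Example 4.7 use the same test statistic and differ only in which tail is tested. The sketch below recomputes the statistic and applies both rejection rules. The variable names are illustrative, and the two critical values are the tabulated figures quoted in the examples, not values computed by the code.

```python
n = 30           # number of readings
s = 0.9          # sample standard deviation (arc-seconds)
sigma_0 = 1.5    # hypothesized limit (arc-seconds)
v = n - 1        # degrees of freedom

chi2 = v * s**2 / sigma_0**2

# Tabulated critical values quoted in the examples (29 degrees of freedom).
chi2_upper_05 = 42.56  # chi^2_{0.05, 29}
chi2_lower_95 = 17.71  # chi^2_{0.95, 29}

reject_original = chi2 > chi2_upper_05  # Ha: sigma^2 > 1.5^2
reject_revised = chi2 < chi2_lower_95   # Ha: sigma^2 < 1.5^2

print(f"chi2 = {chi2:.2f}")
print(f"original form rejects Ho: {reject_original}")
print(f"revised form rejects Ho:  {reject_revised}")
```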

Comments on Example

“This example illustrates an important point to be made when using statistics. The interpretation of statistical testing requires judgment by the person performing the test. It should always be remembered that with a test, the objective is to reject and not accept an hypothesis.” Wolf and Ghilani, p. 75

Hypothesis Test for Variance Ratio

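One-tailed test
• Null hypothesis: $H_0: \sigma_1^2 \le \sigma_2^2$ (or $\sigma_1^2 \ge \sigma_2^2$)
• Alternative hyp.: $H_a: \sigma_1^2 > \sigma_2^2$ (or $\sigma_1^2 < \sigma_2^2$)
• Test statistic: $F = \dfrac{S_1^2}{S_2^2}$
• Rejection region: $F > F_{\alpha, v_1, v_2}$ (or $F < F_{1-\alpha, v_1, v_2}$)

Two-tailed test
• Null hypothesis: $H_0: \sigma_1^2 = \sigma_2^2$
• Alternative hyp.: $H_a: \sigma_1^2 \ne \sigma_2^2$
• Test statistic: $F = \dfrac{S_1^2}{S_2^2}$
• Rejection region: $F < F_{1-\alpha/2, v_1, v_2}$ or $F > F_{\alpha/2, v_1, v_2}$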

Example 4.8

A minimally constrained trilateration network with 24 degrees of freedom has a reference variance $S_0^2 = 0.49$. With full control constraint, $S_0^2 = 2.25$. Are the two reference variances different at a 0.05 level of significance?

$H_0: \sigma_1^2 = \sigma_2^2$

$H_a: \sigma_1^2 \ne \sigma_2^2$

$F = \dfrac{S_1^2}{S_2^2} = \dfrac{2.25}{0.49} = 4.59$

$F_{\alpha/2, v_1, v_2} = F_{0.025,30,24} = 2.21$

$F_{1-\alpha/2, v_1, v_2} = F_{0.975,30,24} = 0.468$

4.59 > 2.21, therefore reject the null hypothesis (the reference variances are different)
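The two-tailed decision in Example 4.8 can be reproduced as follows. This is a sketch only: the variable names are illustrative, and the critical values are the tabulated figures quoted in the example.

```python
s1_sq = 2.25     # reference variance with full control constraint
s2_sq = 0.49     # reference variance, minimally constrained network
f_upper = 2.21   # F_{0.025, 30, 24}, quoted in the example
f_lower = 0.468  # F_{0.975, 30, 24}, quoted in the example

f_stat = s1_sq / s2_sq
# Two-tailed rule: reject when F falls in either tail.
reject_null = f_stat > f_upper or f_stat < f_lower

print(f"F = {f_stat:.2f}, reject Ho: {reject_null}")
```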

Example 4.9

Ron and Kathi are comparing precision (i.e. variance). With 50 degrees of freedom each, Kathi’s variance is 0.81 and Ron’s variance is 1.21. Is Kathi’s precision better at a 0.01 level of significance?

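$H_0: \sigma_R^2 \le \sigma_K^2$

$H_a: \sigma_R^2 > \sigma_K^2$

One-tail test

$F = \dfrac{S_R^2}{S_K^2} = \dfrac{1.21}{0.81} = 1.49$

$F_{\alpha, v_R, v_K} = F_{0.01,50,50} = 1.95$

1.49 is not > 1.95, so Kathi is not better

How about at a 0.1 level of significance?

$F_{\alpha, v_R, v_K} = F_{0.10,50,50} = 1.44$

1.49 > 1.44, so Kathi is better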

Example 4.9 - Extended

It is possible to determine the level of α at the point of rejection through trial and error using STATS (not with tables). Keep entering different values of α until the F value is 1.49.

$F_{\alpha, v_R, v_K} = F_{0.080,50,50} = 1.49$

That’s why a 0.01 level does not indicate Kathi is better (nor 0.05), but at a 0.10 level the test shows she is better.
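The α at the point of rejection is the upper-tail probability of the F distribution at the computed statistic. As an alternative to the trial-and-error search, the sketch below estimates it directly by integrating the F density with Simpson's rule, using only the Python standard library. The function names are hypothetical helpers written for this illustration, and the result is a numerical approximation.

```python
import math

def f_pdf(x, d1, d2):
    """Density of the F distribution with (d1, d2) degrees of freedom.

    Returns 0 at x <= 0, which is correct for d1 > 2 (the case used here).
    """
    if x <= 0:
        return 0.0
    log_pdf = (
        math.lgamma((d1 + d2) / 2) - math.lgamma(d1 / 2) - math.lgamma(d2 / 2)
        + (d1 / 2) * math.log(d1 / d2)
        + (d1 / 2 - 1) * math.log(x)
        - ((d1 + d2) / 2) * math.log(1 + d1 * x / d2)
    )
    return math.exp(log_pdf)

def f_upper_tail(f_value, d1, d2, steps=2000):
    """Approximate P(F > f_value) via Simpson's rule on [0, f_value].

    `steps` must be even.
    """
    h = f_value / steps
    total = f_pdf(0.0, d1, d2) + f_pdf(f_value, d1, d2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, d1, d2)
    return 1.0 - total * h / 3

f_stat = 1.21 / 0.81  # Ron's variance over Kathi's
alpha_at_rejection = f_upper_tail(f_stat, 50, 50)

print(f"F = {f_stat:.2f}, alpha at rejection = {alpha_at_rejection:.3f}")
```

Any significance level above the printed value rejects the null hypothesis, and any level below it does not.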