Chapter 9 Tests of Hypothesis

Single Sample Tests

The Beginnings – concepts and techniques

Chapter 9A

9-1.1 Statistical Hypotheses

Some Definitions

Statistical Hypothesis - An assertion about a population parameter or distribution.

Test of hypothesis – arriving at a decision to reject or not reject a hypothesis based upon a sample from the population.

Null hypothesis – usually the hypothesis of no difference. The assertion that the researcher usually wants to reject.

H0: $\mu = \mu_0$

Alternate Hypothesis – the assertion that is accepted if the null hypothesis is rejected. The assertion that the researcher generally wants to prove.

H1: $\mu \neq \mu_0$

Hypothesis Test on a Population Mean

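Two-Sided Test:

H0: $\mu = \mu_0$
H1: $\mu \neq \mu_0$

One-Sided Tests:

H0: $\mu = \mu_0$
H1: $\mu > \mu_0$

H0: $\mu = \mu_0$
H1: $\mu < \mu_0$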

Test of Hypothesis

If the information in a sample is consistent with the null hypothesis, then we will conclude that the null hypothesis cannot be rejected.

If this information is inconsistent with the null hypothesis, we will conclude that the hypothesis is false and reject the null hypothesis in favor of the alternate hypothesis.

Critical Region – the set of values for the test statistic that results in rejecting the null hypothesis. The test statistic is calculated from the sample; i.e., it is a sample statistic.

If the test statistic $\hat{\theta}$ falls in the critical region, then reject H0.

What can go wrong?

                         H0 true             H0 false

Do not reject H0         correct decision    Type II error

Reject H0 (accept H1)    Type I error        correct decision

How Likely are the errors?

• Type I Error – Incorrectly Rejecting a True Hypothesis

$\alpha$ = P(Type I error)

• $(1-\alpha)$ – probability of not rejecting a true hypothesis

• Type II Error – Incorrectly Accepting a False Hypothesis

$\beta$ = P(Type II error)

• Power of test $(1-\beta)$ – probability of correctly rejecting the null when the alternative is true.

The probability of a type I error is called the significance level of the test.

Our Very First Hypothesis Test

Professor Notso Brite believes that his mean driving time to the campus from his home is 50 minutes, while Dean Nowet Ah disagrees, believing that it takes him, on average, more than 50 minutes.

It is known that the standard deviation of his driving time is 2.5 minutes and driving time is normally distributed.

(Humor me here.)

For the next 10 days, Professor Brite records his driving time, with a sample mean of $\bar{X} = 51.7$ minutes.

Can we accept Dean Nowet Ah’s assertion that the mean driving time must be greater than 50 minutes?

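The Hypothesis

H0: $\mu = 50$ minutes

H1: $\mu > 50$ minutes (one-tailed test)

Given: $\sigma = 2.5$ minutes, $n = 10$, $\bar{X} = 51.7$ minutes

$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{2.5}{\sqrt{10}} = .79$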

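More Probability of a Type I Error

Let’s set the probability of a Type I error = .05. Then

P(Type I error) = P(reject H0 | H0 is correct) = $\alpha$ = .05

H0: $\mu = 50$ minutes

H1: $\mu > 50$ minutes

$P(\bar{X} > \bar{X}_c \mid \mu = 50) = .05$

$P\left(\frac{\bar{X} - \mu_0}{\sigma_{\bar{X}}} > \frac{\bar{X}_c - 50}{.79}\right) = .05$

$P(Z > z_{.05}) = .05; \quad P(Z > 1.6449) = .05$

$\frac{\bar{X}_c - 50}{.79} = 1.6449; \quad \bar{X}_c = 50 + 1.6449(.79) = 51.3$

If $\bar{X} > 51.3$, reject H0.

Since $\bar{X} = 51.7 > 51.3$, reject H0.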
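The critical-value calculation for the driving-time example can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative choices, not notation from the slides, and the standard error is left unrounded, so results can differ from the slide values in the third or fourth decimal.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 50.0      # hypothesized mean under H0 (minutes)
sigma = 2.5     # known population standard deviation (minutes)
n = 10          # sample size
xbar = 51.7     # observed sample mean (minutes)
alpha = 0.05    # chosen probability of a Type I error

std_normal = NormalDist()                 # standard normal: mean 0, sd 1
se = sigma / sqrt(n)                      # standard error of the mean (about .79)
z_alpha = std_normal.inv_cdf(1 - alpha)   # upper-tail critical z (about 1.6449)
xbar_c = mu0 + z_alpha * se               # critical value for the sample mean
reject_h0 = xbar > xbar_c                 # one-tailed (upper) decision rule

print(f"se = {se:.4f}, z_alpha = {z_alpha:.4f}, critical mean = {xbar_c:.4f}")
print("reject H0" if reject_h0 else "do not reject H0")
```

Working on the scale of the sample mean, as here, mirrors the slides; comparing the standardized statistic against `z_alpha` directly gives the same decision.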

What about the Type II Error?

Incorrectly Accepting a False Hypothesis

P(Type II error) = P(not rejecting H0 | H1 is correct) = $\beta$

$\beta = P(\bar{X} < \bar{X}_c \mid \mu > 50)$

H0: $\mu = 50$ minutes

H1: $\mu > 50$ minutes

But professor, that probability depends upon the true value of the population mean under the alternate hypothesis.

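More about Type II Errors

Say the true mean is 51:

$\beta = P(\bar{X} < \bar{X}_c \mid \mu = 51) = P\left(\frac{\bar{X} - \mu}{\sigma_{\bar{X}}} < \frac{51.3 - 51}{.79}\right) = P(z < .3797) = .6479$

Say the true mean is 52:

$\beta = P(\bar{X} < \bar{X}_c \mid \mu = 52) = P\left(\frac{\bar{X} - \mu}{\sigma_{\bar{X}}} < \frac{51.3 - 52}{.79}\right) = P(z < -.8861) = .1878$

Say the true mean is 53:

$\beta = P(\bar{X} < \bar{X}_c \mid \mu = 53) = P\left(\frac{\bar{X} - \mu}{\sigma_{\bar{X}}} < \frac{51.3 - 53}{.79}\right) = P(z < -2.1519) = .0157$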
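The Type II error probabilities for true means of 51, 52, and 53 can be reproduced with a short function. This is a sketch under the example's assumptions (known sigma = 2.5, n = 10, alpha = .05, upper-tailed test); the function name `type_ii_error` is hypothetical. Because the standard error is not rounded to .79, the results agree with the slide values only to about three decimals.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(mu_true, mu0=50.0, sigma=2.5, n=10, alpha=0.05):
    """P(do not reject H0 | true mean is mu_true) for the upper-tailed z-test."""
    se = sigma / sqrt(n)
    xbar_c = mu0 + NormalDist().inv_cdf(1 - alpha) * se   # critical sample mean
    # Under the alternative, the sample mean is Normal(mu_true, se), so beta
    # is the area to the left of the critical value under that distribution.
    return NormalDist(mu_true, se).cdf(xbar_c)

betas = {mu_true: type_ii_error(mu_true) for mu_true in (51, 52, 53)}
for mu_true, beta in betas.items():
    print(f"true mean {mu_true}: beta = {beta:.4f}, power = {1 - beta:.4f}")
```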

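The Situation Graphically Displayed

[Figure: Probability Density Function of $\bar{X}$ under $\mu_0 = 50$ and under $\mu_1 = 51$, x-axis from 47 to 53, with the critical value $\bar{X}_c = 51.3$ marked. The area to the left of $\bar{X}_c$ under the $\mu_1$ curve is Prob = .6479 ($\beta$); the area to the right of $\bar{X}_c$ under the $\mu_0$ curve is Prob = .05 ($\alpha$).]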

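More Graphical Display

[Figure: Probability Density Function of $\bar{X}$ under $\mu_0 = 50$ and under $\mu_1 = 52$, x-axis from 47 to 55, with the critical value $\bar{X}_c = 51.3$ marked. The area to the right of $\bar{X}_c$ under the $\mu_0$ curve is Prob = .05 ($\alpha$); the area to the left of $\bar{X}_c$ under the $\mu_1$ curve is Prob = .1878 ($\beta$).]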

The Operating Characteristic (OC) Curve

[Figure: OC curve. x-axis: True Mean, from 49.5 to 53.5; y-axis: Prob Accept Null Hyp, from 0 to 1.]

The Power of the Test

• The power is computed as $1 - \beta$, and power can be interpreted as the probability of correctly rejecting a false null hypothesis.

• We often compare statistical tests by comparing their power properties.

The Power of the Test

Power of test $(1-\beta)$ - probability of correctly rejecting the null hypothesis when the alternative is true.

Power Curve

[Figure: Power curve. x-axis: True Mean, from 49.5 to 53.5; y-axis: Prob reject null, from 0 to 1.]
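The points on the OC curve and the power curve can be tabulated directly, since the power at each true mean is one minus the OC value. The sketch below assumes the same one-tailed driving-time test (mu0 = 50, sigma = 2.5, n = 10, alpha = .05) and evaluates both curves on the grid of true means shown on the plot axes; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, alpha = 50.0, 2.5, 10, 0.05
se = sigma / sqrt(n)
xbar_c = mu0 + NormalDist().inv_cdf(1 - alpha) * se   # critical sample mean

true_means = [49.5 + 0.5 * k for k in range(9)]       # 49.5, 50.0, ..., 53.5
# OC curve: probability of not rejecting H0 at each true mean.
oc = [NormalDist(m, se).cdf(xbar_c) for m in true_means]
# Power curve: probability of rejecting H0 at each true mean.
power = [1 - p for p in oc]

print("true mean  P(accept H0)  P(reject H0)")
for m, a, r in zip(true_means, oc, power):
    print(f"{m:9.1f}  {a:12.4f}  {r:12.4f}")
```

At a true mean of 50 the power equals alpha, which is why the power curve passes through .05 there.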

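The Prob-Value

H0: $\mu = 50$ minutes

H1: $\mu > 50$ minutes (one-tailed test)

Given: $\sigma = 2.5$, $n = 10$, $\bar{X} = 51.7$

$\text{P-value} = P(\bar{X} \ge 51.7 \mid \mu = 50) = P\left(\frac{\bar{X} - \mu}{\sigma_{\bar{X}}} \ge \frac{51.7 - 50}{.79}\right) = P(z \ge 2.1519) = .0157$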
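The one-tailed P-value is the upper-tail area beyond the observed test statistic. Here is a minimal sketch, again with `statistics.NormalDist` and illustrative variable names; with the unrounded standard error the statistic comes out slightly below the slide's 2.1519, so the P-value matches .0157 only to about three decimals.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, xbar = 50.0, 2.5, 10, 51.7

se = sigma / sqrt(n)                   # standard error of the mean
z0 = (xbar - mu0) / se                 # observed test statistic
p_value = 1 - NormalDist().cdf(z0)     # upper-tail area beyond z0

print(f"z0 = {z0:.4f}, P-value = {p_value:.4f}")
for alpha in (0.10, 0.05, 0.01):
    decision = "reject H0" if p_value < alpha else "do not reject H0"
    print(f"alpha = {alpha:.2f}: {decision}")
```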

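The Prob-Value

[Figure: PDF of X-Bar under $\mu = 50$, x-axis from 47 to 55. The critical value 51.3 cuts off an upper-tail area of $\alpha = .05$; the observed $\bar{X} = 51.7$ cuts off an upper-tail area of P-Value = .0157.]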

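Sample Size Determination

$\alpha = P(\bar{X} > \bar{X}_c \mid \mu = \mu_0) = P\left(z > \frac{\bar{X}_c - \mu_0}{\sigma/\sqrt{n}}\right)$

$\beta = P(\bar{X} < \bar{X}_c \mid \mu = \mu_1) = P\left(z < \frac{\bar{X}_c - \mu_1}{\sigma/\sqrt{n}}\right)$

$z_\alpha = \frac{\bar{X}_c - \mu_0}{\sigma/\sqrt{n}}; \quad -z_\beta = \frac{\bar{X}_c - \mu_1}{\sigma/\sqrt{n}}$

Eliminating $\bar{X}_c$ and solving for $n$:

$n = \frac{(z_\alpha + z_\beta)^2 \sigma^2}{(\mu_1 - \mu_0)^2}$

For 2-tailed test: use $z_{\alpha/2}$ in place of $z_\alpha$.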

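Sample Size in Action

What sample size is needed if the level of significance is one percent and the probability of rejecting the null hypothesis if the true mean is 52 is 95 percent?

$\alpha = .01; \quad z_{.01} = 2.33$

$\beta = 1 - .95 = .05; \quad z_{.05} = 1.645$

$n = \frac{(z_\alpha + z_\beta)^2 \sigma^2}{(\mu_1 - \mu_0)^2} = \frac{(2.33 + 1.645)^2 (2.5)^2}{(2)^2} = 24.69$, so round up to $n = 25$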
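The sample-size calculation can be wrapped in a small helper. This is a sketch, not a definitive implementation: the function name `sample_size` and its `two_tailed` flag are hypothetical, and the two-tailed branch uses the usual approximation of substituting z at alpha/2. It uses exact normal quantiles rather than the table values 2.33 and 1.645, so the unrounded n is slightly smaller than the hand calculation, though it rounds up to the same 25.

```python
from math import ceil
from statistics import NormalDist

def sample_size(mu0, mu1, sigma, alpha, power, two_tailed=False):
    """Return (unrounded n, n rounded up) for a z-test on a mean."""
    std_normal = NormalDist()
    tail = alpha / 2 if two_tailed else alpha
    z_alpha = std_normal.inv_cdf(1 - tail)   # critical z for the Type I error
    z_beta = std_normal.inv_cdf(power)       # z_beta, where beta = 1 - power
    n_exact = ((z_alpha + z_beta) * sigma / (mu1 - mu0)) ** 2
    return n_exact, ceil(n_exact)

n_exact, n_needed = sample_size(mu0=50, mu1=52, sigma=2.5, alpha=0.01, power=0.95)
print(f"unrounded n = {n_exact:.2f}, use n = {n_needed}")
```

Rounding up rather than to the nearest integer keeps the achieved power at or above the requested level.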

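A Two-Tailed Test

H0: $\mu = 50$ minutes

H1: $\mu \neq 50$ minutes (two-tailed test)

Given: $\sigma = 2.5$ minutes, $n = 10$, $\bar{X} = 51.7$ minutes

$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{2.5}{\sqrt{10}} = .79$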

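More Probability of a Type I Error

Let’s set the probability of a Type I error = .05. Then

$P(\bar{X}_{c1} < \bar{X} < \bar{X}_{c2} \mid \mu = 50) = 1 - \alpha = .95$

$P\left(\frac{\bar{X}_{c1} - 50}{.79} < \frac{\bar{X} - \mu_0}{\sigma_{\bar{X}}} < \frac{\bar{X}_{c2} - 50}{.79}\right) = .95$

$P(-z_{.025} < Z < z_{.025}) = .95; \quad P(-1.96 < Z < 1.96) = .95$

$\frac{\bar{X}_{c1} - 50}{.79} = -1.96; \quad \bar{X}_{c1} = 50 - 1.96(.79) = 48.4516$

$\frac{\bar{X}_{c2} - 50}{.79} = 1.96; \quad \bar{X}_{c2} = 50 + 1.96(.79) = 51.5484$

If $\bar{X} < 48.4516$ or $\bar{X} > 51.5484$, reject H0.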
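The two critical values for the two-tailed test can be computed the same way, splitting alpha equally between the tails. A minimal sketch with illustrative variable names follows; since the standard error is not rounded to .79, the critical values differ from 48.4516 and 51.5484 in about the third decimal.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, xbar, alpha = 50.0, 2.5, 10, 51.7, 0.05

se = sigma / sqrt(n)
z_half = NormalDist().inv_cdf(1 - alpha / 2)   # z for alpha/2 in each tail (about 1.96)
lower_c = mu0 - z_half * se                    # lower critical sample mean
upper_c = mu0 + z_half * se                    # upper critical sample mean
reject_h0 = xbar < lower_c or xbar > upper_c   # two-tailed decision rule

print(f"critical values: {lower_c:.4f} and {upper_c:.4f}")
print("reject H0" if reject_h0 else "do not reject H0")
```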

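Probability of a Type II Error

$\beta = P(\bar{X}_{c1} < \bar{X} < \bar{X}_{c2} \mid \mu = \mu_1) = P(48.4516 < \bar{X} < 51.5484 \mid \mu_1 \neq 50)$

$= P\left(\frac{48.4516 - \mu_1}{.79} < \frac{\bar{X} - \mu_1}{\sigma_{\bar{X}}} < \frac{51.5484 - \mu_1}{.79}\right)$

$\mu_1 = 52$:

$\beta = P\left(\frac{48.4516 - 52}{.79} < \frac{\bar{X} - \mu_1}{\sigma_{\bar{X}}} < \frac{51.5484 - 52}{.79}\right) = P(-4.4916 < z < -.5716) = .2838$

$\mu_1 = 49$:

$\beta = P\left(\frac{48.4516 - 49}{.79} < \frac{\bar{X} - \mu_1}{\sigma_{\bar{X}}} < \frac{51.5484 - 49}{.79}\right) = P(-.6941 < z < 3.2258) = .7556$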
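For the two-tailed test, the Type II error probability is the area between the two critical values under the distribution centered at the true mean. The sketch below assumes the example's settings; the function name `type_ii_error_two_tailed` is hypothetical, and results agree with the slide values (.2838 and .7556) to about three decimals because the standard error is not rounded.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error_two_tailed(mu_true, mu0=50.0, sigma=2.5, n=10, alpha=0.05):
    """P(do not reject H0 | true mean is mu_true) for the two-tailed z-test."""
    se = sigma / sqrt(n)
    z_half = NormalDist().inv_cdf(1 - alpha / 2)
    lower_c, upper_c = mu0 - z_half * se, mu0 + z_half * se
    # Sampling distribution of the mean when the true mean is mu_true.
    dist = NormalDist(mu_true, se)
    return dist.cdf(upper_c) - dist.cdf(lower_c)

beta_52 = type_ii_error_two_tailed(52)
beta_49 = type_ii_error_two_tailed(49)
print(f"true mean 52: beta = {beta_52:.4f}")
print(f"true mean 49: beta = {beta_49:.4f}")
```

Because the acceptance region is symmetric about 50, true means equally far above and below 50 give the same beta.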

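A Two-Tailed Prob-Value

$\text{P-value} = 2 P(\bar{X} \ge 51.7 \mid \mu = 50) = 2 P\left(\frac{\bar{X} - \mu}{\sigma_{\bar{X}}} \ge \frac{51.7 - 50}{.79}\right) = 2 P(z \ge 2.1519) = 2(.0157) = .0314$

Reject H0 if $\alpha \ge .0314$

$z_0 = \frac{\bar{X} - \mu_0}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$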
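The two-tailed P-value doubles the tail area beyond the absolute value of the observed statistic. This is a minimal sketch with illustrative names; it uses the unrounded standard error, so the result is close to, but not exactly, the slide's .0314.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, xbar = 50.0, 2.5, 10, 51.7

z0 = (xbar - mu0) / (sigma / sqrt(n))               # observed test statistic
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z0)))  # area in both tails beyond |z0|

print(f"z0 = {z0:.4f}, two-tailed P-value = {p_two_tailed:.4f}")
for alpha in (0.05, 0.01):
    decision = "reject H0" if p_two_tailed <= alpha else "do not reject H0"
    print(f"alpha = {alpha:.2f}: {decision}")
```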

A two-sided confidence interval – a study in comparison

Given: $\sigma = 2.5$ minutes, $n = 10$, and $\bar{X} = 51.7$ minutes

95% confidence interval:

$z_{\alpha/2} = z_{.025} = 1.96$

$\bar{X} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 51.7 \pm 1.96(.79) = (50.1516, 53.2484)$

A 95% confidence interval identifies a set of acceptable hypotheses at the 5% level of significance. A mean of 50 lies outside the interval and is therefore rejected.
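The link between the confidence interval and the two-tailed test can be checked in a few lines: H0 is rejected at level alpha exactly when the hypothesized mean falls outside the (1 - alpha) interval. The sketch below uses illustrative variable names and the unrounded standard error, so the endpoints differ from (50.1516, 53.2484) in about the third decimal.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, xbar, alpha = 50.0, 2.5, 10, 51.7, 0.05

se = sigma / sqrt(n)
z_half = NormalDist().inv_cdf(1 - alpha / 2)
ci_lower = xbar - z_half * se
ci_upper = xbar + z_half * se

# Two-tailed test decision made on the z scale, for comparison.
z0 = (xbar - mu0) / se
reject_by_test = abs(z0) > z_half
reject_by_interval = not (ci_lower <= mu0 <= ci_upper)

print(f"95% CI: ({ci_lower:.4f}, {ci_upper:.4f})")
print(f"mu0 outside interval: {reject_by_interval}; test rejects H0: {reject_by_test}")
```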

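Confidence Intervals and Hypothesis Tests – together at last

[Figure: two number lines. The first is centered at $\mu_0$ with endpoints $\mu_0 - z_{\alpha/2}\sigma_{\bar{x}}$ and $\mu_0 + z_{\alpha/2}\sigma_{\bar{x}}$. The second is centered at $\bar{x}$ with endpoints $\bar{x} - z_{\alpha/2}\sigma_{\bar{x}}$ and $\bar{x} + z_{\alpha/2}\sigma_{\bar{x}}$.]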

9-2 Tests on the Mean of a Normal Distribution, Variance Known

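We wish to test:

H0: $\mu = \mu_0$
H1: $\mu \neq \mu_0$

The test statistic is:

$Z_0 = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$

Reject H0 if the observed value of the test statistic $z_0$ is either:

$z_0 > z_{\alpha/2}$ or $z_0 < -z_{\alpha/2}$

Fail to reject H0 if $-z_{\alpha/2} < z_0 < z_{\alpha/2}$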

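Alternately

H0: $\mu = \mu_0$

H1: $\mu \neq \mu_0$

$\bar{X}_{c1} = \mu_0 - z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$

$\bar{X}_{c2} = \mu_0 + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$

If $\bar{X} < \bar{X}_{c1}$ or $\bar{X} > \bar{X}_{c2}$, reject H0.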

Points to Ponder

When we are considering Type II errors (beta), we use the distribution of the test statistic under the alternative hypothesis.

When we are considering Type I errors (alpha), we use the distribution of the test statistic under the null hypothesis.

Statistical versus Practical Significance

• Statistical significance says nothing about the importance of the difference.

• There may be a statistically significant difference between two values with no practical difference:
   • mean of 50.4 driving minutes versus 49.7 driving minutes
   • large sample sizes will identify a difference

• There may be no statistically significant difference between two values even though there is a significant practical difference:
   • mean of .20 mm in the diameter of a ball-bearing versus .18 mm

Interactions -- alpha, beta, sample size

Alpha/beta tradeoffs. Lower alpha value means a larger beta value. Power of a test is (1-beta). Lower alpha implies we are reluctant to risk rejecting a true hypothesis. But it means we must risk accepting a false one. Only way to improve both is to increase the sample size.

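N fixed and $\alpha$ decreases, then $\beta$ increases

N fixed and $\alpha$ increases, then $\beta$ decreases

N increases, then: $\alpha$ and $\beta$ can both decrease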
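The interaction among alpha, beta, and sample size can be seen by tabulating beta over a small grid. The sketch below assumes the one-tailed driving-time test with a true mean of 52; the function name `type_ii_error_rate` and the particular grid of alpha and n values are illustrative choices, not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error_rate(alpha, n, mu0=50.0, mu1=52.0, sigma=2.5):
    """Beta for the upper-tailed z-test when the true mean is mu1."""
    se = sigma / sqrt(n)
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    # Standardize the critical sample mean under the true mean mu1.
    return NormalDist().cdf(z_alpha - (mu1 - mu0) / se)

alphas = (0.10, 0.05, 0.01)
sizes = (10, 25, 50)

print("alpha " + "".join(f"   n={n:<3d}" for n in sizes))
for alpha in alphas:
    row = "".join(f"  {type_ii_error_rate(alpha, n):6.4f}" for n in sizes)
    print(f"{alpha:5.2f}{row}")
```

Reading down a column shows beta rising as alpha falls for a fixed n; reading across a row shows beta falling as n grows for a fixed alpha.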

On the selection of the level of significance

• Convention is to use .01 or .05

• Consider practical consequences of making a type I or II error

• Consider power of the test and sample size
   • Large N: small difference will be statistically significant; use small $\alpha$ (.01 - .001)
   • Small N: large differences may not be detected; use large $\alpha$ (.05 - .10)

• Consider “true” difference

• Type I versus Type II errors

• Use the P-value and let the reader decide
   • “I’m just reporting the facts; you decide”

General Procedures for Hypothesis Tests

1. Identify the parameter of interest.

2. State the null hypothesis – H0.

3. Specify the alternative – H1.

4. Choose the significance level – alpha – risk of Type I error.

5. Determine the appropriate test statistic.

6. State the rejection region for the statistic.

7. Compute the sample quantities (i.e. from the experiment or measurement) and substitute into the equation for test statistic.

8. Decide whether to reject H0.
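The eight steps above can be sketched as a single function for the known-variance z-test on a mean. This is one possible arrangement, not a prescribed implementation: the names `ZTestResult` and `z_test`, and the `alternative` strings, are hypothetical. The usage line plugs in the driving-time example.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import NormalDist

@dataclass
class ZTestResult:
    z0: float           # observed test statistic
    critical_z: float   # boundary of the rejection region on the z scale
    p_value: float
    reject_h0: bool

def z_test(xbar, n, mu0, sigma, alpha=0.05, alternative="two-sided"):
    """Steps 1-3: parameter is the mean; H0: mu = mu0; H1 set by `alternative`."""
    std_normal = NormalDist()
    # Steps 5 and 7: test statistic computed from the sample quantities.
    z0 = (xbar - mu0) / (sigma / sqrt(n))
    # Steps 4 and 6: significance level alpha fixes the rejection region.
    if alternative == "greater":
        critical_z = std_normal.inv_cdf(1 - alpha)
        p_value = 1 - std_normal.cdf(z0)
        reject = z0 > critical_z
    elif alternative == "less":
        critical_z = -std_normal.inv_cdf(1 - alpha)
        p_value = std_normal.cdf(z0)
        reject = z0 < critical_z
    elif alternative == "two-sided":
        critical_z = std_normal.inv_cdf(1 - alpha / 2)
        p_value = 2 * (1 - std_normal.cdf(abs(z0)))
        reject = abs(z0) > critical_z
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    # Step 8: the decision.
    return ZTestResult(z0, critical_z, p_value, reject)

result = z_test(xbar=51.7, n=10, mu0=50.0, sigma=2.5, alpha=0.05, alternative="greater")
print(result)
```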

A Little Philosophy

• If we reject the null hypothesis:
   • either H1 is true
   • or we were extremely unlucky and hit on the 5 percent of the samples that fall in the critical region
   • We go with the odds and reject the null

• If we fail to reject the null (assume x-bar = 11.1, with $\bar{X}_c = 11.7$ and $\alpha = .05$):
   • H0 is still left standing at the end of the test
   • The alternative hypothesis is what we wish to prove and believe to be correct
   • The sample supports H1 but the test does not allow us to reject H0

• Therefore, we conclude that the evidence does not allow us to reject H0, stopping short of saying we accept H0.

Large Sample Test

In most situations, the population variance is unknown and the population may not be well modeled as a normal distribution.

If n is large (n > 40), the sample standard deviation, s, can be substituted for $\sigma$ with little effect, appealing to the central limit theorem.

Exact tests where the population is normal, $\sigma^2$ is unknown, and n is small result in the t-distribution.
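A large-sample test substitutes the sample standard deviation s for sigma in the z statistic. The sketch below is an upper-tailed version; the function name `large_sample_z_test` is hypothetical, the n > 40 check simply encodes the rule of thumb stated above, and the data are a synthetic placeholder pattern rather than real measurements.

```python
from math import sqrt
from statistics import NormalDist, fmean, stdev

def large_sample_z_test(sample, mu0, alpha=0.05):
    """Upper-tailed large-sample test of H0: mu = mu0, using s in place of sigma."""
    n = len(sample)
    if n <= 40:
        raise ValueError("rule of thumb needs n > 40; a t-test may be more appropriate")
    xbar = fmean(sample)
    s = stdev(sample)                     # sample standard deviation (n - 1 divisor)
    z0 = (xbar - mu0) / (s / sqrt(n))     # approximately standard normal under H0
    p_value = 1 - NormalDist().cdf(z0)
    return z0, p_value, p_value < alpha

# Synthetic placeholder data, NOT real measurements: 50 values cycling 50, 51, ..., 54.
sample = [50.0 + (i % 5) for i in range(50)]
z0, p_value, reject_h0 = large_sample_z_test(sample, mu0=51.5)
print(f"z0 = {z0:.4f}, P-value = {p_value:.4f}, reject H0: {reject_h0}")
```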

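A Little Recap

Tests on a mean, variance known, normal population or large sample size (CLT)

H0: $\mu = \mu_0$
H1: $\mu \neq \mu_0$

$\bar{X}_{c1} = \mu_0 - z_{\alpha/2} \frac{\sigma \text{ or } s}{\sqrt{n}}$; reject if $\bar{X} < \bar{X}_{c1}$

$\bar{X}_{c2} = \mu_0 + z_{\alpha/2} \frac{\sigma \text{ or } s}{\sqrt{n}}$; reject if $\bar{X} > \bar{X}_{c2}$

H0: $\mu = \mu_0$
H1: $\mu > \mu_0$

$\bar{X}_c = \mu_0 + z_\alpha \frac{\sigma \text{ or } s}{\sqrt{n}}$; reject if $\bar{X} > \bar{X}_c$

H0: $\mu = \mu_0$
H1: $\mu < \mu_0$

$\bar{X}_c = \mu_0 - z_\alpha \frac{\sigma \text{ or } s}{\sqrt{n}}$; reject if $\bar{X} < \bar{X}_c$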

Next Time

Time Permitting
