Chapter 9 Hypothesis Testing: One Sample Tests, David Chow, Oct 2014


Page 1: L9_hypo_2014

1

Chapter 9

Hypothesis Testing:

One Sample Tests

David Chow

Oct 2014

Page 2: L9_hypo_2014

2

Learning Objectives

The basic principles of hypothesis testing

Use hypothesis testing to test a mean or proportion

Underlying assumptions

Potential pitfalls and ethical issues

Page 3: L9_hypo_2014

3

Basic Concepts

Page 5: L9_hypo_2014

5

The Process

Claim: The population mean age of security guards is 50.

Draw a sample and find the sample mean.

Population

Sample

Page 6: L9_hypo_2014

6

The Process

Suppose the sample mean age was X̄ = 20

It is different from the assumed μ = 50

What can we conclude?

1. Best educated guess, OR

2. Statistical reasoning (hypothesis testing here):

If the hypothesis were true, the probability of getting such a small X̄ is very small

So the hypothesis is rejected

Page 7: L9_hypo_2014

7

The Process – Graphical Illustration

[Figure: sampling distribution of X̄, centred at μ = 50 with standard error σx = σ / √n. If Ho is true (i.e., if 50 were the population mean), it is unlikely that you would get a sample mean as low as 20, so you reject Ho that μ = 50.]

Page 9: L9_hypo_2014

9

Terminology: Ho and H1

How to set up Ho?

1. Ho usually refers to the status quo

2. Ho always has a number, and an equality

Must include one of “=”, “≤” or “≥”

Hypothesis testing begins by assuming the null is true

Hypotheses must be stated in pairs, i.e., Ho and H1

H1 is the alternative hypothesis. It is the complement of Ho

Eg: If the null is Ho: μ = 3, the alternative is H1: ____.

Page 10: L9_hypo_2014

10

A Quick Recap

A hypothesis test can be one- or two-tailed. The test about μ must take one of the following three forms (μ0 is the hypothesized value of μ):

One-tailed (lower-tail): H0: μ ≥ μ0, H1: μ < μ0

One-tailed (upper-tail): H0: μ ≤ μ0, H1: μ > μ0

Two-tailed: H0: μ = μ0, H1: μ ≠ μ0

Examples

1. State Ho and H1 on mean body temperature

2. State Ho and H1 for the mean age example

Page 11: L9_hypo_2014

11

Test Statistic & Critical Values

Critical values define the “regions of rejection”

[Figure: distribution of the test statistic (e.g., X̄), with a region of rejection in each tail.]

If the sample mean is far from the assumed population mean, the null is rejected.

How far is “far enough” to reject Ho?

We need critical value(s) for our decision.

Page 12: L9_hypo_2014

12

Level of Significance, α

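Claim: The population mean age is 50.

Two-tail test: H0: μ = 50, H1: μ ≠ 50. Rejection regions of area α/2 in each tail.

Lower-tail test: H0: μ ≥ 50, H1: μ < 50. Rejection region of area α in the lower tail.

Upper-tail test: H0: μ ≤ 50, H1: μ > 50. Rejection region of area α in the upper tail.

[Figures: in each sketch, the boundary of the shaded tail area represents the critical value.]

Simple Rule: Rejection region ____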

Page 13: L9_hypo_2014

13

Errors in Decision Making

The conclusion of a hypothesis test is subject to two distinct potential errors:

Type I error: wrongly rejecting a true Ho.

Its probability is α, the probability of drawing an “extreme” sample when Ho is true.

α is set by the researcher in advance.

Type II error: failing to reject a false Ho.

Its probability is called β, which is not selected but computed once n and α are known (for a given true value of the parameter).

Page 14: L9_hypo_2014

14

Errors in Decision Making

Possible Hypothesis Test Outcomes

Decision             Actual situation: Ho true        Actual situation: Ho false

Do not reject Ho     No error (probability 1 - α)     Type II error (probability β)

Reject Ho            Type I error (probability α)     No error (probability 1 - β)

Eg: A Murder Trial

Ho: Innocence

H1: Guilty

Jury decision

Reject the null (i.e., convicting the defendant), or

Do not reject Ho.

Identify the (potential) errors in this decision.

Page 15: L9_hypo_2014

15

Eg: Medical Test

Medical test

Ho: No cancer

H1: Cancer

For this decision, what exactly are the potential errors?

Type I error

wrong diagnosis of cancer -- unnecessary worry or treatment

Type II error

failure to detect cancer -- patient might miss treatment

_____________

_____________

Page 16: L9_hypo_2014

16

α or β: Pick Your Poison?

Given the sample size n and α, β can be computed.

Tradeoff: For a fixed sample size, reducing one type of error generally increases the other.

The only way to reduce both types of error is to increase the sample size, but this may be infeasible.

Which error to choose?

Which one leads to more serious consequences?

(The answer varies on a case-by-case basis.)

Set your α (hence β) accordingly.

The common choice is to set α = 5%.
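The trade-off between the two error probabilities can be sketched numerically. The example below is only an illustration under stated assumptions: a two-tail Z test with σ known, and hypothetical values (μ0 = 50, a true mean of 48, σ = 10, n = 100) that do not come from the slides. The function name `type_ii_error` is likewise made up for this sketch.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(mu0, mu_true, sigma, n, alpha):
    """Probability of NOT rejecting H0: mu = mu0 (two-tail Z test)
    when the population mean is actually mu_true."""
    std_norm = NormalDist()
    z_crit = std_norm.inv_cdf(1 - alpha / 2)   # upper critical value
    std_error = sigma / sqrt(n)
    # Under mu_true the Z statistic is normal with this mean and sd 1.
    shift = (mu_true - mu0) / std_error
    # H0 is not rejected when -z_crit <= Z <= z_crit.
    return std_norm.cdf(z_crit - shift) - std_norm.cdf(-z_crit - shift)


# Hypothetical numbers, chosen only to illustrate the trade-off.
for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error(mu0=50, mu_true=48, sigma=10, n=100, alpha=alpha)
    print(f"alpha = {alpha:.2f}   beta = {beta:.3f}")
```

With these assumed inputs, β grows as α shrinks, and rerunning with a larger n lowers β for the same α, which is the sample-size point made above.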

Page 17: L9_hypo_2014

17

Hypothesis Testing: σ Known

Page 18: L9_hypo_2014

18

Hypothesis Testing: σ Known

Two-tail test for the mean (assume σ is known):

Convert the sample statistic (X̄) to a test statistic (Z):

$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$

Use the Z-table to find the critical Z values, given a specified level of significance α.

Decision Rule: If the test statistic falls in the rejection region, reject Ho; otherwise do not reject Ho

Page 19: L9_hypo_2014

19

Hypothesis Testing: σ Known

For two-tail tests, there are two critical values, and two regions of rejection.

[Figure: for H0: μ = 3 vs H1: μ ≠ 3, the distribution is centred at X̄ = 3 (Z = 0). Reject H0 below the lower critical value -Z or above the upper critical value +Z (area α/2 in each tail); do not reject H0 in between.]

Page 20: L9_hypo_2014

20

Eg: Mean Weight

Example: Test the claim that the true mean weight of chocolate bars manufactured in a factory is 3 ounces.

State the appropriate null and alternative hypotheses:

H0: μ = 3 H1: μ ≠ 3 (two tailed test)

Specify the level of significance

Suppose α = .05 is chosen.

Choose a sample size

Suppose a sample of size n = 100 is selected.

Page 21: L9_hypo_2014

21

Eg: Mean Weight

Determine the appropriate technique

σ is known so this is a Z test

σ = 0.8 is known from past company records

Set up the critical values

For α = .05 the critical Z values are ±1.96

Compute the test statistic based on the sample data:

Suppose n = 100 and X̄ = 2.84

So the test statistic is:

$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{2.84 - 3}{0.8 / \sqrt{100}} = \frac{-0.16}{0.08} = -2.0$$

Page 22: L9_hypo_2014

22

Eg: Mean Weight

Decision: Is the test statistic in the rejection region?

Decision rule: Reject Ho if Z < -1.96 or Z > 1.96; otherwise do not reject.

[Figure: standard normal curve with α = 0.05; reject H0 below -Zα/2 = -1.96 or above +Zα/2 = +1.96 (area α/2 in each tail; upper-tail area = ??); do not reject H0 in between.]

Here, Z = -2.0 < -1.96, so the test statistic is in the rejection region

Conclusion in non-technical terms: Based on the sample evidence, the mean weight of chocolate bars is not equal to 3 ounces.

Page 23: L9_hypo_2014

23

Eg: Volume of Soft-Drink

Set α = 2%. The sample has a mean of 12.19 oz and a size of 36. Past records show that σ = 0.11 oz. Test the claim that μ = 12.00 oz.

(1) Ho: μ = 12.00; H1: μ ≠ 12.00.

(2) Test statistic Z = (X̄ – μ) / σX̄ =

(3) Critical values: Z0.01 =

Hence the region of rejection is:

(4) Finally, the conclusion:

ANSWER:

(2) Test statistic: Z = 10.364

(3) Critical values: ±Z0.01 = ±2.327 (Region of rejection: Z > 2.327 or Z < -2.327)

(4) Conclusion: Reject Ho as the test statistic falls in the region of rejection.

There is sample evidence to reject the claim of μ = 12.00. We conclude that μ ≠ 12.00 instead.

Page 24: L9_hypo_2014

24

Summary: Six Steps

Six Steps of Hypothesis Testing:

1. State the null hypothesis Ho and the alternative H1

2. Choose the level of significance α and the sample size n

3. Determine the appropriate statistical technique and the test statistic to use

4. Find the critical values and determine the rejection region(s)

5. Collect data and compute the test statistic from the sample result

6. Compare the test statistic to the critical value:

Reject Ho if the test statistic falls in the rejection region

Otherwise do not reject Ho

Express the decision in non-technical terms
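As a rough illustration, the six steps for a Z test with σ known can be sketched in a few lines of Python. This is a minimal sketch rather than a prescribed implementation: the function name `z_test` and its `tail` argument are invented here, and the critical value is taken from the standard library's `NormalDist` instead of a printed Z-table, so it may differ from table values in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist


def z_test(xbar, mu0, sigma, n, alpha, tail="two"):
    """Critical-value approach for a one-sample Z test (sigma known).

    Returns (test statistic, critical value, reject H0?).
    tail is "two", "upper" or "lower" (hypothetical argument names).
    """
    std_norm = NormalDist()
    z = (xbar - mu0) / (sigma / sqrt(n))          # step 5: test statistic
    if tail == "two":
        crit = std_norm.inv_cdf(1 - alpha / 2)    # step 4: +/- crit
        reject = abs(z) > crit
    elif tail == "upper":
        crit = std_norm.inv_cdf(1 - alpha)
        reject = z > crit
    elif tail == "lower":
        crit = std_norm.inv_cdf(alpha)
        reject = z < crit
    else:
        raise ValueError("tail must be 'two', 'upper' or 'lower'")
    return z, crit, reject                        # step 6: decision


# Chocolate bar example: H0: mu = 3, alpha = .05, n = 100, sigma = 0.8
z, crit, reject = z_test(xbar=2.84, mu0=3.0, sigma=0.8, n=100, alpha=0.05)
print(f"Z = {z:.2f}, critical values = +/-{crit:.2f}, reject H0: {reject}")

# Soft-drink example: H0: mu = 12.00, alpha = .02, n = 36, sigma = 0.11
z, crit, reject = z_test(xbar=12.19, mu0=12.00, sigma=0.11, n=36, alpha=0.02)
print(f"Z = {z:.3f}, critical values = +/-{crit:.3f}, reject H0: {reject}")
```

Steps 1 to 3 (stating the hypotheses, choosing α and n, picking the technique) correspond to the arguments passed in; the final non-technical wording of the conclusion is still left to the analyst.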

Page 25: L9_hypo_2014

25

Hypothesis Testing: σ Known

p-Value Approach

The p-value is the probability of obtaining a test statistic equal to or more extreme ( < or > ) than the observed sample value, given that Ho is true

It is also called the observed level of significance

Page 26: L9_hypo_2014

26

Hypothesis Testing: σ Known

p-Value Approach

Convert the sample statistic (eg, X̄) to a test statistic (eg, Z statistic).

Obtain the p-value from a statistical table.

Compare the p-value with α:

If p-value < α, reject Ho

If p-value ≥ α, do not reject Ho

If the p-value is low, Ho must go.

Page 27: L9_hypo_2014

27

Hypothesis Testing: σ Known

p-Value Approach

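Example: Mean Weight Again.

Ho: μ = 3.0; H1: μ ≠ 3.0

Sample mean = 2.84, n = 100

X̄ = 2.84 is translated to a Z score of Z = -2.0

P(Z < -2.0) = .0228

P(Z > 2.0) = .0228

p-value = .0228 + .0228 = .0456

[Figure: standard normal curve with critical values ±1.96 (α/2 = .025 in each tail) and the observed Z = -2.0 with its mirror image 2.0, each cutting off a tail area of .0228.]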

Page 28: L9_hypo_2014

28

Hypothesis Testing: σ Known

p-Value Approach

Compare the p-value with α

If p-value < α, reject Ho

If p-value ≥ α, do not reject Ho

Now p-value = .0456

α is chosen to be 0.05

Since .0456 < .05, reject Ho

[Figure: standard normal curve; the tail areas of .0228 beyond Z = ±2.0 lie inside the rejection regions beyond ±1.96 (α/2 = .025 each).]

Page 29: L9_hypo_2014

29

Eg: p-value Approach

1. If you use a 0.05 level of significance in a two-tail hypothesis test, what will you decide if the computed value of the test statistic Z is +2.21?

a. Use the critical value approach.

b. Use the p-value approach.

2. Suppose that in a two-tail hypothesis test, you compute the value of the test statistic Z as -1.38. What is the p-value?

ANSWER

1a. Reject Ho as Z > 1.96

1b. p-value = 2 x (0.01355) = 0.0271

As p-value < 0.05, reject Ho.

2. If Z = -1.38, p-value = 2 x (0.08379) = 0.1676
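The two-tail p-values above can be checked without a table. The sketch below assumes Python's standard-library `NormalDist` as a stand-in for the Z-table; the helper name `two_tail_p_value` is made up for illustration. Because the table values are rounded, results may differ slightly in the fourth decimal place (for example, Z = -2.0 gives about .0455 rather than .0456).

```python
from statistics import NormalDist


def two_tail_p_value(z):
    """Two-tail p-value for an observed Z statistic:
    P(Z <= -|z|) + P(Z >= |z|) under the standard normal."""
    upper_tail = 1 - NormalDist().cdf(abs(z))
    return 2 * upper_tail


alpha = 0.05
for z in (2.21, -1.38, -2.0):
    p = two_tail_p_value(z)
    decision = "reject Ho" if p < alpha else "do not reject Ho"
    print(f"Z = {z:+.2f}   p-value = {p:.4f}   {decision}")
```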

Page 30: L9_hypo_2014

30

Hypothesis Testing: σ Known

Confidence Interval Connections

For X̄ = 2.84, σ = 0.8 and n = 100, the 95% confidence interval is:

$$2.84 - (1.96)\frac{0.8}{\sqrt{100}} \quad \text{to} \quad 2.84 + (1.96)\frac{0.8}{\sqrt{100}}$$

2.6832 ≤ μ ≤ 2.9968

Since this interval does not contain the hypothesized mean (3.0), you reject the null hypothesis at α = .05
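The link between the confidence interval and the two-tail test can be sketched as follows. The function name `z_confidence_interval` is hypothetical, and the critical value comes from `NormalDist` (about 1.95996) rather than the rounded table value 1.96, so the limits agree with the slide to four decimals only after rounding.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(xbar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin


# Chocolate bar example: X-bar = 2.84, sigma = 0.8, n = 100
low, high = z_confidence_interval(xbar=2.84, sigma=0.8, n=100)
mu0 = 3.0
# Two-tail test at alpha = 1 - confidence: reject H0 when mu0 is outside.
reject = not (low <= mu0 <= high)
print(f"{low:.4f} <= mu <= {high:.4f}; reject H0 (mu = {mu0}): {reject}")
```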

Page 31: L9_hypo_2014

31

Hypothesis Testing: σ Known

One Tail Tests

In many cases, the region of rejection is located in one end of the distribution.

In other words, H1 is focused on one direction only.

There is only one region of rejection, whose area is α.

H0: μ ≥ 3, H1: μ < 3. This is a lower-tail test as H1 is focused on the lower tail below the mean of 3.

H0: μ ≤ 3, H1: μ > 3. This is an upper-tail test as H1 is focused on the upper tail above the mean of 3.

Page 32: L9_hypo_2014

32

Eg: Upper Tail Tests

There is only one critical value, since the rejection area is in only one tail.

[Figure: upper-tail test on a distribution centred at μ; do not reject Ho below the critical value, reject Ho in the upper tail of area α above it.]

Similarly, by identifying the correct critical value, you can construct one-sided confidence intervals.

Eg: For an upper tail test, μ ≤ an upper limit.

Page 33: L9_hypo_2014

33

Eg: Phone Bill

A phone industry manager thinks that customer monthly cell phone bills have increased, now averaging more than $52 per month.

The company wishes to test this claim. Past company records indicate that σ = $10.

Form hypotheses:

H0: μ ≤ 52 the mean is less than or equal to $52 per month

H1: μ > 52 the mean is greater than $52 per month (i.e., sufficient evidence exists to support the manager’s claim)

Page 34: L9_hypo_2014

34

Eg: Phone Bill

Suppose that α = .10 is chosen for this test.

Find the rejection region:

[Figure: upper-tail test on the Z axis; do not reject H0 over the lower area 1 - α = .90, reject H0 in the upper tail of area α = .10.]

Page 35: L9_hypo_2014

35

Eg: Phone Bill

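Check the critical value:

Z      .07     .08     .09

1.1    .8790   .8810   .8830

1.2    .8980   .8997   .9015

1.3    .9147   .9162   .9177

The cumulative area closest to .90 is .8997, at Z = 1.28.

Critical Value = 1.28

[Figure: standard normal curve with area .90 below z = 1.28 and area α = .10 above it.]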

Page 36: L9_hypo_2014

36

Eg: Phone Bill

Sample information: n = 64, X̄ = 53.1

σ = 10 was known from past company records

Compute the test statistic:

$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{53.1 - 52}{10 / \sqrt{64}} = 0.88$$

Page 37: L9_hypo_2014

37

Eg: Phone Bill

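[Figure: upper-tail rejection region of area α = .10 above the critical value 1.28 (1 - α = .90 below it); the observed Z = .88 falls in the non-rejection region.]

Decision: Do not reject Ho since Z = 0.88 ≤ 1.28

I.e., there is not sufficient evidence that the mean bill is greater than $52.

Now, use the p-value approach to solve the problem.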
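One way to work the phone bill example with the p-value approach is sketched below. For an upper-tail test the p-value is the area above the observed Z only. The helper name `upper_tail_p_value` is an assumption of this sketch, and `NormalDist` stands in for the Z-table.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_p_value(z):
    """p-value for an upper-tail Z test: P(Z >= z) under H0."""
    return 1 - NormalDist().cdf(z)


# Phone bill example: H0: mu <= 52, H1: mu > 52, alpha = .10
xbar, mu0, sigma, n, alpha = 53.1, 52.0, 10.0, 64, 0.10
z = (xbar - mu0) / (sigma / sqrt(n))
p = upper_tail_p_value(z)
decision = "reject Ho" if p < alpha else "do not reject Ho"
print(f"Z = {z:.2f}, p-value = {p:.4f}, alpha = {alpha}: {decision}")
```

The p-value (about .19) exceeds α = .10, so the decision agrees with the critical-value approach: do not reject Ho.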

Page 38: L9_hypo_2014

38

Hypothesis Testing:

σ Unknown

Page 39: L9_hypo_2014

39

Hypothesis Testing: σ Unknown

If the population standard deviation is unknown, simply replace it by the sample standard deviation S.

Because of this change, you use the t distribution to test Ho.

Check the t-table (given α and df = n-1).

All other steps and concepts are the same.

Reminder:

As in the confidence interval chapter, when the t distribution is used, assume the population is approximately normal.

No need to have n > 30 if we assume a normal population.

Page 40: L9_hypo_2014

40

Hypothesis Testing: σ Unknown

Recall that the t test statistic with n-1 degrees of freedom is:

$$t_{n-1} = \frac{\bar{X} - \mu}{S / \sqrt{n}}$$

Page 41: L9_hypo_2014

41

Eg: Price Watch

The mean cost of a hotel room in New York City is said to be $168 per night. A random sample of 25 hotels resulted in X̄ = $172.50 and S = $15.40. Test at the α = 0.05 level.

(A stem-and-leaf display shows the data are approx. normally distributed)

Ho: μ = 168

H1: μ ≠ 168

Page 42: L9_hypo_2014

42

Eg: Price Watch

H0: μ = 168

H1: μ ≠ 168

α = 0.05

n = 25

σ is unknown, so use a t-statistic

Critical Value: t24 = ± 2.0639

Determine the regions of rejection

[Figure: t distribution with 24 degrees of freedom; reject H0 below -t n-1,α/2 = -2.0639 or above t n-1,α/2 = 2.0639 (α/2 = .025 in each tail); do not reject H0 in between.]

Page 43: L9_hypo_2014

43

Eg: Price Watch

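$$t_{n-1} = \frac{\bar{X} - \mu}{S / \sqrt{n}} = \frac{172.50 - 168}{15.40 / \sqrt{25}} = 1.46$$

[Figure: the observed t = 1.46 lies between the critical values -2.0639 and 2.0639 (α/2 = .025 in each tail).]

Conclusion: Do not reject Ho.

There is not sufficient evidence that the true mean cost is different from $168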

Page 44: L9_hypo_2014

44

Hypothesis Testing:

Connection to Confidence Intervals

For X̄ = 172.5, S = 15.40 and n = 25, the 95% confidence interval is:

$$172.5 - (2.0639)\frac{15.4}{\sqrt{25}} \quad \text{to} \quad 172.5 + (2.0639)\frac{15.4}{\sqrt{25}}$$

166.14 ≤ μ ≤ 178.86

Since this interval contains the hypothesized mean of 168, you do not reject the null hypothesis at α = .05
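The hotel price example can be reproduced with a short sketch. Python's standard library has no t distribution, so the critical value 2.0639 is simply copied from the t-table value quoted in the notes; the function names are hypothetical.

```python
from math import sqrt


def t_statistic(xbar, mu0, s, n):
    """One-sample t statistic with n - 1 degrees of freedom."""
    return (xbar - mu0) / (s / sqrt(n))


def t_confidence_interval(xbar, s, n, t_crit):
    """Confidence interval for the mean, given a t-table critical value."""
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin


# Critical value for alpha = .05 (two-tail), df = 24, taken from the t-table.
T_CRIT_24 = 2.0639

t = t_statistic(xbar=172.50, mu0=168, s=15.40, n=25)
reject = abs(t) > T_CRIT_24
low, high = t_confidence_interval(xbar=172.50, s=15.40, n=25, t_crit=T_CRIT_24)

print(f"t = {t:.2f}, reject Ho: {reject}")
print(f"95% CI: {low:.2f} <= mu <= {high:.2f}")
```

Both routes give the same decision here: |t| is below 2.0639 and the interval contains 168, so Ho is not rejected.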

Page 45: L9_hypo_2014

45

Hypothesis Testing: σ Unknown

It is assumed that the sample statistic is computed from a random sample drawn from a normal distribution.

If the sample size is small (< 30), you should use a histogram to check the normality assumption.

If the sample size is large, the central limit theorem applies.

Page 46: L9_hypo_2014

46

Testing Proportion

Page 47: L9_hypo_2014

47

Hypothesis Testing

Proportions

Involves categorical variables

Two possible outcomes

“Success” (possesses a certain characteristic)

“Failure” (does not possess that characteristic)

Fraction or proportion of the population in the “success” category is denoted by π

Page 48: L9_hypo_2014

48

Hypothesis Testing

Proportions

Sample proportion in the success category is denoted by p

$$p = \frac{X}{n} = \frac{\text{number of successes in sample}}{\text{sample size}}$$

When both nπ and n(1-π) are at least 5, p can be approximated by a normal distribution with mean and standard deviation

$$\mu_p = \pi, \qquad \sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}}$$

Page 49: L9_hypo_2014

49

Hypothesis Testing

Proportions

The sampling distribution of proportion (p) is approximately normal, so the test statistic is a Z value:

$$Z = \frac{p - \pi}{\sqrt{\dfrac{\pi(1-\pi)}{n}}}$$

Page 50: L9_hypo_2014

50

Eg: Testing Proportion

A marketing company claims that it receives an 8% response rate from its mailing.

To test this claim, a random sample of 500 was surveyed, yielding 30 responses.

Test at the α = .05 significance level.

First, check:

n π = (500)(.08) = 40

n(1-π) = (500)(.92) = 460

Page 51: L9_hypo_2014

51

Eg: Testing Proportion

H0: π = .08 H1: π ≠ .08

α = .05

n = 500, p = .06

Critical Values: ± 1.96

Determine region of rejection

[Figure: standard normal curve; reject H0 below -1.96 or above 1.96, with area .025 in each tail.]

Page 52: L9_hypo_2014

52

Eg: Test for Proportion

Of a sample of 899 home-based businesses, 369 are owned by females. Want to test if π = 0.50.

ANSWER

Sample proportion p = 369/899, n = 899.

Ho: π = 0.50; H1: π ≠ 0.50

Test statistic Z = -5.37

At α = 5%, critical values = ±1.96.

Ho is rejected by the sample evidence.

Page 53: L9_hypo_2014

53

Eg: Testing Proportion

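Test Statistic:

$$Z = \frac{p - \pi}{\sqrt{\dfrac{\pi(1-\pi)}{n}}} = \frac{.06 - .08}{\sqrt{\dfrac{.08(1 - .08)}{500}}} = -1.648$$

[Figure: the observed Z = -1.648 lies between the critical values -1.96 and 1.96 (area .025 in each tail).]

Decision: Do not reject Ho at α = .05

Conclusion: There isn’t sufficient evidence to reject the company’s claim of 8% response rate.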
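Both proportion examples can be checked with a short sketch. The function name `proportion_z_test` is made up for illustration; it applies the nπ and n(1-π) ≥ 5 check before using the normal approximation, and takes its critical value from `NormalDist` rather than the table.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(successes, n, pi0, alpha=0.05):
    """Two-tail Z test for a proportion, H0: pi = pi0.

    Returns (sample proportion, Z statistic, reject H0?).
    """
    if n * pi0 < 5 or n * (1 - pi0) < 5:
        raise ValueError("need n*pi0 >= 5 and n*(1 - pi0) >= 5")
    p = successes / n
    # The standard error uses the hypothesized pi0, not the sample p.
    z = (p - pi0) / sqrt(pi0 * (1 - pi0) / n)
    crit = NormalDist().inv_cdf(1 - alpha / 2)
    return p, z, abs(z) > crit


# Mailing example: 30 responses out of 500, H0: pi = .08
p, z, reject = proportion_z_test(successes=30, n=500, pi0=0.08)
print(f"p = {p:.2f}, Z = {z:.3f}, reject Ho: {reject}")

# Home-based businesses: 369 out of 899, H0: pi = .50
p, z, reject = proportion_z_test(successes=369, n=899, pi0=0.50)
print(f"p = {p:.4f}, Z = {z:.2f}, reject Ho: {reject}")
```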

Page 54: L9_hypo_2014

54

Potential Pitfalls and

Ethical Considerations

Use randomly collected data to reduce selection biases

No human subjects without informed consent

Choose the level of significance, α, before data collection

Do not employ “data snooping” to choose between one-tail and two-tail test, or to determine the level of significance

Do not practice “data cleansing” to hide observations that do not support a stated hypothesis

Report all pertinent findings

Page 55: L9_hypo_2014

55

Z or t?

Population Mean (μ)

Z: the sampling distribution is normally distributed if σ is known

t: use the t distribution if σ is unknown

Need to assume a normal population, but n > 30 is also acceptable.

Population Proportion (π)

Z: binomial approximated by normal dist

Page 56: L9_hypo_2014

56

More Examples

Page 57: L9_hypo_2014

57

Eg: Mean Waiting Time

Has the mean waiting time in a fast-food restaurant changed from its previous value of 4.5 minutes?

Past experience shows that the population is normally distributed, with a population standard deviation of 1.2 minutes.

A sample of 25 orders is selected. The sample mean is 5.1 minutes.

The level of significance (α) is 0.05

Page 58: L9_hypo_2014

58

Cont: Critical Value Approach

H0: μ = 4.5 H1: μ ≠ 4.5

α = .05

Sample size n = 25

Determine the appropriate technique

The population is normal and σ is known (σ = 1.2) so this is a Z test

Set up the critical values

For α = .05 the critical Z values are ±1.96

Compute the test statistic based on the sample data:

$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{5.1 - 4.5}{1.2 / \sqrt{25}} = 2.50$$

Page 59: L9_hypo_2014

59

Cont: Critical Value Approach

Decision: Is the test statistic in the rejection region?

Decision Rule: Reject Ho if Z < -1.96 or Z > 1.96; otherwise do not reject.

[Figure: standard normal curve with α = 0.05; reject H0 below -Zα/2 = -1.96 or above +Zα/2 = +1.96 (upper-tail area = 0.025); do not reject H0 in between.]

Here, Z = 2.50 > 1.96, so the test statistic is in the rejection region

Page 60: L9_hypo_2014

60

Cont: p-Value Approach

Use the p-value approach to solve the mean waiting time problem.

Again we compute the test statistic of 2.50.

Probability (test statistic ≥ 2.50) = 1-0.9938 = 0.0062

The p-value for this two-tail test =2 x 0.0062 = 0.0124

Decision rule: if p-value < α, reject Ho. Here 0.0124 < 0.05, so reject Ho.

GRAPH:

α chosen at 0.05

______

______

Page 61: L9_hypo_2014

61

Eg: Mean Monthly Sales

Want to test if average monthly sales is $120.

Suppose α = 0.05, X̄ = $112.85, n = 12 and s = $20.80.

ANSWER

Test statistic t = (X̄ – μ) / (s/√n) = -1.19.

Critical values: ±t0.025, 11 = ±2.2010.

The test statistic falls in the region of nonrejection.

Do not reject Ho. We don’t have enough evidence to reject the claim.

Caution: Do not say “accept Ho”.

How about the p-value approach?

Page 62: L9_hypo_2014

Review Questions

Level of significance

True or False? level of significance = α = confidence level

Types of error

In hypothesis testing if the null hypothesis has been rejected when the alternative hypothesis has been true, which error has been committed?

Setting hypotheses

The manager of an automobile dealership is considering a new bonus plan in order to increase sales. Currently, the mean sales rate per salesperson is five automobiles per month. The correct set of hypotheses for testing the effect of the bonus plan is ____

p-value approach

A two-tailed test is performed at 95% confidence. The p-value is 0.09. What is the decision?

Page 63: L9_hypo_2014

Review Questions: Testing μ

A random sample of 16 statistics examinations from a large population was taken.

The average score in the sample was 78.6 with a standard deviation of 8.0.

Want to know if the average grade of the population is significantly more than 75.

Assume the distribution of the population of grades is normal.

Is it a two-tailed test?

Do you use Z- or t- test?

Compute the test statistic

The p-value is between ___

Page 64: L9_hypo_2014

Review Questions: Testing π

A random sample of 100 people was taken.

85 of the people in the sample favored Candidate A.

Want to find out whether or not the proportion of the

population in favor of Candidate A is significantly more

than 80%.

Set up the hypotheses.

What is the test statistic and the p-value?