L9_hypo_2014

1

Chapter 9

Hypothesis Testing:

One Sample Tests

David Chow

Oct 2014

2

Learning Objectives

The basic principles of hypothesis testing

Use hypothesis testing to test a mean or proportion

Underlying assumptions

Potential pitfalls and ethical issues

3

Basic Concepts

4

The Hypothesis

A hypothesis is a claim (assumption) about a population parameter:

Population Mean

Eg: The mean weight for kindergarten kids is μ = 20 kg

Population Proportion

Eg: The proportion of retirees with smart phones is π = .68


5

The Process

Claim: The population mean age of security guards is 50.

Draw a sample and find the sample mean.

Population

Sample

6

The Process

Suppose the sample mean age was X̄ = 20

It is different from the assumption of μ = 50

What can we conclude?

1. Best educated guess, OR

2. Statistical reasoning (hypothesis testing here):

If the hypothesis were true, the probability of getting such a small X̄ would be very small

So the hypothesis is rejected

7

The Process – Graphical Illustration

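[Figure: sampling distribution of X̄, centered at μ = 50, with standard error σx = σ / √n. The observed sample mean of 20 lies far out in the lower tail.]

IF μ = 50 were the population mean (i.e., if Ho is true), it is unlikely that you would get a sample mean of this value (20) ...

... then you reject Ho that μ = 50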

8

Terminology: Ho

The starting point of hypothesis testing is a hypothesis, usually called the null hypothesis

OR: “the null”, or Ho

Eg: The mean no. of TV sets in US homes is 3

In symbols, it is: H0: μ = 3


9

Terminology: Ho and H1

How to set up Ho?

1. Ho usually refers to the status quo

2. Ho always has a number, and an equality

Must include one of “=”, “≤” or “≥”

Hypothesis testing begins by assuming the null is true

Hypotheses must be stated in pairs, i.e., Ho and H1

H1 is the alternative hypothesis. It is the complement of Ho

Eg: If the null is Ho: μ = 3, the alternative is H1: ____.

10

A Quick Recap

Examples

1. State Ho and H1 on mean body temperature

2. State Ho and H1 for the mean age example

A hypothesis test can be one- or two-tailed. The test about μ must take one of the following three forms (μ0 is the hypothesized value of μ):

One-tailed (lower-tail): H0: μ ≥ μ0, Ha: μ < μ0

One-tailed (upper-tail): H0: μ ≤ μ0, Ha: μ > μ0

Two-tailed: H0: μ = μ0, Ha: μ ≠ μ0

11

Test Statistic & Critical Values

Critical Values define “Regions of Rejection”

[Figure: distribution of the test statistic (e.g., X̄), with a region of rejection in each tail.]

If the sample mean is far from the assumed population mean, the null is rejected.

How far is “far enough” to reject Ho?

We need critical value(s) for our decision.

12

Level of Significance, α

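Claim: The population mean age is 50.

Two-tail test: H0: μ = 50, H1: μ ≠ 50. Rejection regions of area α/2 in each tail.

Lower-tail test: H0: μ ≥ 50, H1: μ < 50. Rejection region of area α in the lower tail.

Upper-tail test: H0: μ ≤ 50, H1: μ > 50. Rejection region of area α in the upper tail.

(In each figure, the boundary of the shaded area represents the critical value.)

Simple Rule: Rejection region ____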

13

Errors in Decision Making

Your conclusion from a hypothesis test is subject to two different potential errors:

Type I error: Wrongly rejecting a true Ho.

Its probability is α, the probability of drawing an “extreme” sample when Ho is true.

It is set by the researcher in advance.

Type II error: Failing to reject a false Ho.

Its probability is called β, which is not selected but computed when n and α are known (for a given true value of the parameter).

14

Errors in Decision Making

Possible Hypothesis Test Outcomes (decision vs. actual situation)

Decision: Do Not Reject Ho

    Ho True: No Error (Probability 1 - α)

    Ho False: Type II Error (Probability β)

Decision: Reject Ho

    Ho True: Type I Error (Probability α)

    Ho False: No Error (Probability 1 - β)

Eg: A Murder Trial

Ho: Innocence

H1: Guilty

Jury decision

Reject the null (i.e., convicting the defendant), or

Do not reject Ho.

Identify the (potential) errors in this decision.

15

Eg: Medical Test

Medical test

Ho: No cancer

H1: Cancer

For this decision, what exactly are the potential errors?

Type I error

wrong diagnosis of cancer -- unnecessary worry or treatment

Type II error

failure to detect cancer -- patient might miss treatment

_____________

_____________

16

α or β: Pick Your Poison?

Given the sample size n and α, β can be computed.

Tradeoff: Reducing one type of error generally increases the other.

The only way to reduce both types of error is to increase the sample size, but this may be infeasible.

Which error to choose?

Which one leads to more serious consequences?

(The answer varies on a case-by-case basis.)

Set your α (hence β) accordingly.

The common choice is to set α = 5%.
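
As a rough illustration of how β follows from n and α, the sketch below computes β for a two-tail Z test. The inputs (μ0 = 3, σ = 0.8, n = 100, and a true mean of 2.8) and the function name type_ii_error are assumptions chosen for illustration only; β always depends on which true mean you assume.

```python
from math import sqrt
from statistics import NormalDist  # standard library, Python 3.8+


def type_ii_error(mu0, mu_true, sigma, n, alpha):
    """Beta for a two-tail Z test of Ho: mu = mu0 when the true mean is mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)   # upper critical value
    shift = (mu_true - mu0) / (sigma / sqrt(n))  # true mean, in standard-error units
    # Probability that Z lands in the non-rejection region (-z_crit, +z_crit)
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)


# Hypothetical inputs, for illustration only
beta_05 = type_ii_error(mu0=3.0, mu_true=2.8, sigma=0.8, n=100, alpha=0.05)
beta_01 = type_ii_error(mu0=3.0, mu_true=2.8, sigma=0.8, n=100, alpha=0.01)
print(round(beta_05, 3), round(beta_01, 3))
```

With these assumed inputs, lowering α from 5% to 1% raises β, which is the tradeoff described above.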

17

Hypothesis Testing: σ Known

18


Two-tail test for the mean (assume σ is known):

Convert the sample statistic (X̄) to a test statistic (Z):

Z = (X̄ − μ) / (σ / √n)

Use the Z-table to find the critical Z values, given a specified level of significance α.

Decision Rule: If the test statistic falls in the rejection region, reject Ho; otherwise do not reject Ho.

19


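For two-tail tests, there are two critical values, and two regions of rejection.

[Figure: two-tail test of H0: μ = 3 vs. H1: μ ≠ 3. Reject H0 below the lower critical value -Z or above the upper critical value +Z (area α/2 in each tail); do not reject H0 in between. The X̄ axis is centered at 3 and the Z axis at 0.]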

20

Eg: Mean Weight

Example: Test the claim that the true mean weight of

chocolate bars manufactured in a factory is 3 ounces.

State the appropriate null and alternative hypotheses:

H0: μ = 3 H1: μ ≠ 3 (two tailed test)

Specify the level of significance

Suppose α = .05 is chosen.

Choose a sample size

Suppose a sample of size n = 100 is selected.

21

Eg: Mean Weight

Determine the appropriate technique

σ is known so this is a Z test

σ = 0.8 is known from past company records

Set up the critical values

For α = .05 the critical Z values are ±1.96

Compute the test statistic based on the sample data:

Suppose n = 100 and X̄ = 2.84

So the test statistic is:

Z = (X̄ − μ) / (σ / √n) = (2.84 − 3) / (0.8 / √100) = −.16 / .08 = −2.0

22

Eg: Mean Weight

Decision: Is the test statistic in the rejection region?

Reject Ho if Z < -1.96 or Z > 1.96; otherwise do not reject.

α = 0.05, so upper-tail area = ??

[Figure: two-tail test. Reject H0 below -Zα/2 = -1.96 or above +Zα/2 = +1.96; do not reject H0 in between.]

Here, Z = -2.0 < -1.96, so the test statistic is in the rejection region

Conclusion in non-technical terms: Based on the sample evidence, the mean weight of chocolate bars is not equal to 3 ounces.
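
The critical-value approach for the chocolate bar example can be sketched in a few lines of Python. This is only an illustrative sketch: the helper name z_statistic is an assumption, and the critical value is taken from the standard library's NormalDist instead of a printed Z-table.

```python
from math import sqrt
from statistics import NormalDist  # standard library, Python 3.8+


def z_statistic(sample_mean, mu0, sigma, n):
    """Z = (sample mean - hypothesized mean) / (sigma / sqrt(n))."""
    return (sample_mean - mu0) / (sigma / sqrt(n))


alpha = 0.05
z = z_statistic(sample_mean=2.84, mu0=3.0, sigma=0.8, n=100)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = .05

# Two-tail decision rule: reject Ho if Z < -z_crit or Z > +z_crit
reject = abs(z) > z_crit
print(f"Z = {z:.2f}, critical values = ±{z_crit:.2f}, reject Ho: {reject}")
```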

23

Eg: Volume of Soft-Drink

Set α = 2%. The sample has a mean of 12.19 oz and a size of 36. Past records show that σ = 0.11 oz. Test the claim that μ = 12.00 oz.

ANSWER:

(1) Ho: μ = 12.00; H1: μ ≠ 12.00.

(2) Test statistic: Z = (X̄ – μ) / σX̄ = (12.19 – 12.00) / (0.11 / √36) = 10.364

(3) Critical values: Z0.01 = 2.327. Hence the region of rejection is: Z > 2.327 or Z < -2.327

(4) Finally, the conclusion: Reject Ho as the test statistic falls in the region of rejection.

There is sample evidence to reject the claim of μ = 12.00. We conclude that μ ≠ 12.00 instead.

24

Summary: Six Steps

Six Steps of Hypothesis Testing:

1. State the null hypothesis Ho and the alternative H1

2. Choose the level of significance α and the sample size n

3. Determine the appropriate statistical technique and the test statistic to use

4. Find the critical values and determine the rejection region(s)

5. Collect data and compute the test statistic from the sample result

6. Compare the test statistic to the critical value:

Reject Ho if the test statistic falls in the rejection region

Otherwise do not reject Ho

Express the decision in non-technical terms

25


p-Value Approach

The p-value is the probability of obtaining a test statistic equal to or more extreme (< or >) than the observed sample value, given Ho is true

It is also called the observed level of significance

26


p-Value Approach

Convert the sample statistic (eg, X̄) to a test statistic (eg, Z statistic).

Obtain the p-value from a statistical table.

Compare the p-value with α:

If p-value < α, reject Ho

If p-value ≥ α, do not reject Ho

Rule of thumb: If the p-value is low, Ho must go.

27


p-Value Approach

Example: Mean Weight Again.

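Ho: μ = 3.0; H1: μ ≠ 3.0

Sample mean = 2.84, n = 100

X̄ = 2.84 is translated to a Z score of Z = -2.0

P(Z < -2.0) = .0228

P(Z > 2.0) = .0228

p-value = .0228 + .0228 = .0456

[Figure: standard normal curve. The tail areas beyond Z = -2.0 and Z = 2.0 are .0228 each; the tail areas beyond the critical values -1.96 and 1.96 are α/2 = .025 each.]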

28


p-Value Approach

Compare the p-value with α

If p-value < α, reject Ho

If p-value ≥ α, do not reject Ho

Now p-value = .0456

α is chosen to be 0.05

Since .0456 < .05, reject Ho

[Figure: tail areas of .0228 beyond Z = ±2.0, each smaller than α/2 = .025 beyond ±1.96.]
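
The two-tail p-value can also be computed directly, without a table. The sketch below uses the standard library's NormalDist; the function name two_tail_p_value is an assumption for illustration. The result is marginally below the table-based .0456 because the table rounds each tail area to .0228.

```python
from statistics import NormalDist  # standard library, Python 3.8+


def two_tail_p_value(z):
    """P(Z <= -|z|) + P(Z >= |z|) under the standard normal distribution."""
    return 2 * NormalDist().cdf(-abs(z))


alpha = 0.05
p_value = two_tail_p_value(-2.0)  # Z score from the mean weight example
decision = "reject Ho" if p_value < alpha else "do not reject Ho"
print(f"p-value = {p_value:.4f}: {decision}")
```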

29

Eg: p-value Approach

1. If you use a 0.05 level of significance in a two-tail hypothesis test, what will you decide if the computed value of the test statistic Z is +2.21?

a. Use the critical value approach.

b. Use the p-value approach.

2. Suppose that in a two-tail hypothesis test, you compute the value of the test statistic Z as -1.38. What is the p-value?

ANSWER

1a. Reject Ho as Z > 1.96

1b. p-value = 2 x (0.01355) = 0.0271

As p-value < 0.05, reject Ho.

2. If Z = -1.38, p-value = 2 x (0.08379) = 0.1676

30


Confidence Interval Connections

For X̄ = 2.84, σ = 0.8 and n = 100, the 95% confidence interval is:

2.84 − (1.96) 0.8/√100 to 2.84 + (1.96) 0.8/√100

2.6832 ≤ μ ≤ 2.9968

Since this interval does not contain the hypothesized mean (3.0), you reject the null hypothesis at α = .05
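
The link between the confidence interval and the two-tail test can be checked numerically. This is a minimal sketch under the same inputs as the slide; the function name z_confidence_interval is an assumption for illustration.

```python
from math import sqrt
from statistics import NormalDist  # standard library, Python 3.8+


def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Two-sided confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


lower, upper = z_confidence_interval(sample_mean=2.84, sigma=0.8, n=100)
mu0 = 3.0
# Reject Ho at alpha = .05 exactly when mu0 falls outside the 95% interval
reject = not (lower <= mu0 <= upper)
print(f"{lower:.4f} <= mu <= {upper:.4f}, reject Ho: {reject}")
```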

31


One Tail Tests

In many cases, the region of rejection is located in one end of the distribution.

In other words, H1 is focused on one direction only.

There is only one region of rejection, whose area is α.

H0: μ ≥ 3, H1: μ < 3

This is a lower-tail test as H1 is focused on the lower tail below the mean of 3.

H0: μ ≤ 3, H1: μ > 3

This is an upper-tail test as H1 is focused on the upper tail above the mean of 3.

32

Eg: Upper Tail Tests

There is only one critical value, since the rejection area is in only one tail.

[Figure: upper-tail test. Do not reject Ho below the critical value; reject Ho above it, where the tail area is α. The distribution is centered at μ.]

Similarly, by identifying the correct critical value, you can construct one-sided confidence intervals.

Eg: For an upper tail test,μ ≤ an upper limit.

33

Eg: Phone Bill

A phone industry manager thinks that customer monthly cell phone bills have increased, now averaging more than $52 per month.

The company wishes to test this claim. Past company records indicate that σ = $10.

Form hypothesis:

H0: μ ≤ 52 the mean is less than or equal to $52 per month

H1: μ > 52 the mean is greater than $52 per month (i.e., sufficient evidence exists to support the manager’s claim)

34

Eg: Phone Bill

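Suppose that α = .10 is chosen for this test.

Find the rejection region:

[Figure: upper-tail test. Reject H0 in the upper tail, which has area α = .10; the area below the critical value is 1 - α = .90. The Z axis is centered at 0.]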

35

Eg: Phone Bill

Check the critical value:

Z      .07     .08     .09

1.1    .8790   .8810   .8830

1.2    .8980   .8997   .9015

1.3    .9147   .9162   .9177

The cumulative area closest to .90 is .8997, at Z = 1.28, which leaves α = .10 in the upper tail.

Critical Value = 1.28

36

Eg: Phone Bill

Sample information: n = 64, X̄ = 53.1

σ = 10 was known from past company records

Compute the test statistic:

Z = (X̄ − μ) / (σ / √n) = (53.1 − 52) / (10 / √64) = 0.88

37

Eg: Phone Bill

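[Figure: upper-tail test with α = .10. The critical value is 1.28; the test statistic Z = .88 falls in the non-rejection region, whose area is 1 - α = .90.]

Decision: Do not reject Ho since Z = 0.88 ≤ 1.28

I.e., There is not sufficient evidence that the mean bill is greater than $52.

Now, use the p-value approach to solve the problem.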
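
One possible way to work the phone bill example with both approaches is sketched below. It uses the standard library's NormalDist in place of the Z-table; the variable names are assumptions for illustration. For an upper-tail test the p-value is the area to the right of the test statistic only.

```python
from math import sqrt
from statistics import NormalDist  # standard library, Python 3.8+

std_normal = NormalDist()
alpha = 0.10

# Test statistic for Ho: mu <= 52, with sample mean 53.1, sigma 10, n 64
z = (53.1 - 52) / (10 / sqrt(64))

# Critical-value approach: a single critical value in the upper tail
z_crit = std_normal.inv_cdf(1 - alpha)  # about 1.28
reject_by_critical_value = z > z_crit

# p-value approach: upper-tail area beyond the observed Z
p_value = 1 - std_normal.cdf(z)
reject_by_p_value = p_value < alpha

print(f"Z = {z:.2f}, critical value = {z_crit:.2f}, p-value = {p_value:.4f}")
```

Both approaches lead to the same decision here: do not reject Ho.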

38

Hypothesis Testing:

σ Unknown

39

Hypothesis Testing: σ Unknown

If the population standard deviation is unknown, simply replace it by the sample standard deviation S.

Because of this change, you use the t distribution to test Ho.

Check the t-table (given α and df = n - 1).

All other steps and concepts are the same.

Reminder:

As in the confidence interval chapter, when t-distribution is used, assume the population is approximately normal.

No need to have n > 30 if we assume a normal population.

40


Recall that the t test statistic with n-1 degrees of freedom is:

t(n-1) = (X̄ − μ) / (S / √n)

41

Eg: Price Watch

The mean cost of a hotel room in New York City is said to be $168 per night. A random sample of 25 hotels resulted in X̄ = $172.50 and S = $15.40. Test at the α = 0.05 level.

(A stem-and-leaf display shows the data are approx. normally distributed)

Ho: μ = 168

H1: μ ≠ 168

42

Eg: Price Watch

H0: μ = 168

H1: μ ≠ 168

α = 0.05

n = 25

σ is unknown, so use a t-statistic

Critical Value: t24 = ± 2.0639

Determine the regions of rejection

[Figure: t distribution centered at 0. Reject H0 below -t(n-1, α/2) = -2.0639 or above t(n-1, α/2) = 2.0639 (area α/2 = .025 in each tail); do not reject H0 in between.]

43

Eg: Price Watch

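t(n-1) = (X̄ − μ) / (S / √n) = (172.50 − 168) / (15.40 / √25) = 1.46

[Figure: t distribution with critical values -2.0639 and 2.0639 (area α/2 = .025 in each tail). The test statistic 1.46 falls between them, in the non-rejection region.]

Conclusion: Do not reject Ho.

There is not sufficient evidence that true mean cost is different from $168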
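
The t statistic for the hotel example can be reproduced with the short sketch below. The function name t_statistic is an assumption for illustration. Because the Python standard library has no t distribution, the critical value 2.0639 is simply copied from the t-table value quoted in the slides.

```python
from math import sqrt


def t_statistic(sample_mean, mu0, s, n):
    """t = (sample mean - hypothesized mean) / (S / sqrt(n)), with n - 1 df."""
    return (sample_mean - mu0) / (s / sqrt(n))


t = t_statistic(sample_mean=172.50, mu0=168, s=15.40, n=25)
t_crit = 2.0639  # t-table value from the slides: alpha/2 = .025, df = 24

# Two-tail decision rule: reject Ho if t < -t_crit or t > +t_crit
reject = abs(t) > t_crit
print(f"t = {t:.2f}, critical values = ±{t_crit}, reject Ho: {reject}")
```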

44

Hypothesis Testing:

Connection to Confidence Intervals

For X̄ = 172.5, S = 15.40 and n = 25, the 95% confidence interval is:

172.5 − (2.0639) 15.4/√25 to 172.5 + (2.0639) 15.4/√25

166.14 ≤ μ ≤ 178.86

Since this interval contains the hypothesized mean of 168, you do not reject the null hypothesis at α = .05

45


It is assumed that the sample statistic comes from

a random sample of a normal distribution.

If the sample size is small (< 30), you should use

a histogram to check the normality assumption.

If the sample size is large, the central limit

theorem applies.

46

Testing Proportion

47

Hypothesis Testing

Proportions

Involves categorical variables

Two possible outcomes

“Success” (possesses a certain characteristic)

“Failure” (does not possess that characteristic)

Fraction or proportion of the population in

the “success” category is denoted by π

48

Hypothesis Testing

Proportions

Sample proportion in the success category is denoted by p:

p = X / n = (number of successes in sample) / (sample size)

When both nπ and n(1-π) are at least 5, p can be approximated by a normal distribution with mean and standard deviation

μp = π

σp = √( π(1 − π) / n )

49

Hypothesis Testing

Proportions

The sampling distribution of the proportion (p) is approximately normal, so the test statistic is a Z value:

Z = (p − π) / √( π(1 − π) / n )

50

Eg: Testing Proportion

A marketing company claims that it receives 8%

responses from its mailing.

To test this claim, a random sample of 500 were

surveyed with 30 responses.

Test at the α = .05 significance level.

First, check:

n π = (500)(.08) = 40

n(1-π) = (500)(.92) = 460

51


H0: π = .08, H1: π ≠ .08

α = .05

n = 500, p = .06

Critical Values: ± 1.96

Determine region of rejection

[Figure: standard normal curve centered at 0. Reject in both tails, below -1.96 or above 1.96 (area .025 each).]

52

Eg: Test for Proportion

Of a sample of 899 home-based businesses, 369 are owned by females. Want to test if π = 0.50.

ANSWER

Sample proportion p = 369/899, n = 899.

Ho: π = 0.50; H1: π ≠ 0.50

Test statistic Z = -5.37

At α = 5%, critical values = ±1.96.

Ho is rejected by the sample evidence.

53


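Test Statistic:

Z = (p − π) / √( π(1 − π) / n ) = (.06 − .08) / √( .08(1 − .08) / 500 ) = −1.648

Decision:

Do not reject Ho at α = .05, since -1.96 < -1.648 < 1.96

[Figure: standard normal curve with rejection regions below -1.96 and above 1.96 (area .025 each). The test statistic -1.648 falls in the non-rejection region.]

Conclusion:

There isn’t sufficient evidence to reject the company’s claim of 8% response rate.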
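
The mailing example, including the check that nπ and n(1-π) are at least 5, might be sketched as follows. The function name proportion_z_test and the choice to raise an error when the check fails are assumptions for illustration, not part of the slides.

```python
from math import sqrt
from statistics import NormalDist  # standard library, Python 3.8+


def proportion_z_test(successes, n, pi0):
    """Two-tail Z test of Ho: pi = pi0 using the normal approximation."""
    # The approximation is only used when both expected counts are at least 5
    if n * pi0 < 5 or n * (1 - pi0) < 5:
        raise ValueError("n*pi and n*(1 - pi) should both be at least 5")
    p = successes / n                          # sample proportion
    z = (p - pi0) / sqrt(pi0 * (1 - pi0) / n)  # test statistic
    p_value = 2 * NormalDist().cdf(-abs(z))    # two-tail p-value
    return p, z, p_value


alpha = 0.05
p, z, p_value = proportion_z_test(successes=30, n=500, pi0=0.08)
reject = p_value < alpha
print(f"p = {p:.2f}, Z = {z:.3f}, p-value = {p_value:.4f}, reject Ho: {reject}")
```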

54

Potential Pitfalls and

Ethical Considerations

Use randomly collected data to reduce selection biases

No human subjects without informed consent

Choose the level of significance, α, before data collection

Do not employ “data snooping” to choose between one-tail and two-tail test, or to determine the level of significance

Do not practice “data cleansing” to hide observations that do not support a stated hypothesis

Report all pertinent findings

55

Z or t?

Population Mean (μ)

Z: the sampling distribution is normal if σ is known

t: use the t-distribution if σ is unknown

Need to assume a normal population. But n > 30 is also acceptable.

Population Proportion (π)

Z: binomial approximated by normal dist

56

More Examples

57

Eg: Mean Waiting Time

Has the mean waiting time in a fast-food restaurant changed from its previous value of 4.5 minutes?

Past experience shows that the population is normally distributed, with a population standard deviation of 1.2 minutes.

A sample of 25 orders is selected. The sample mean is 5.1 minutes.

The level of significance (α) is 0.05

58

Cont: Critical Value Approach

H0: μ = 4.5, H1: μ ≠ 4.5

α = .05

Sample size n = 25

Determine the appropriate technique

The population is normal and σ is known (σ = 1.2) so this is a Z test

Set up the critical values

For α = .05 the critical Z values are ±1.96

Compute the test statistic based on the sample data:

Z = (X̄ − μ) / (σ / √n) = (5.1 − 4.5) / (1.2 / √25) = 2.50

59

Cont: Critical Value Approach


Decision: Is the test statistic in the rejection region?

Decision Rule

Reject Ho if Z < -1.96 or Z > 1.96; otherwise do not reject.

α = 0.05, so upper-tail area = 0.025

[Figure: two-tail test. Reject H0 below -Zα/2 = -1.96 or above +Zα/2 = +1.96.]

Here, Z = 2.50 > 1.96, so the test statistic is in the rejection region

60

Cont: p-Value Approach

Use the p-value approach to solve the mean waiting time problem.

Again we compute the test statistic of 2.50.

Probability (test statistic ≥ 2.50) = 1 - 0.9938 = 0.0062

The p-value for this two-tail test = 2 x 0.0062 = 0.0124

Decision rule: If p < α, reject Ho. With α chosen at 0.05, 0.0124 < 0.05, so reject Ho.

GRAPH: ______

61

Eg: Mean Monthly Sales

Want to test if average monthly sales is $120.

Suppose α = 0.05, X̄ = $112.85, n = 12 and s = $20.80.

ANSWER

Test statistic t = (X̄ – μ) / (s/√n) = -1.19.

Critical values: ±t0.025, 11 = ±2.2010.

The test statistic falls in the region of nonrejection.

Do not reject Ho. We don’t have enough evidence to reject the claim.

Caution: Do not say “accept Ho”.

How about the p-value approach?

Review Questions

Level of significance

True or False? level of significance = α = confidence level

Types of error

In hypothesis testing if the null hypothesis has been rejected when the alternative hypothesis has been true, which error has been committed?

Setting hypotheses

The manager of an automobile dealership is considering a new bonus plan in order to increase sales. Currently, the mean sales rate per salesperson is five automobiles per month. The correct set of hypotheses for testing the effect of the bonus plan is ____

p-value approach

A two-tailed test is performed at 95% confidence. The p-value is 0.09. What is the decision?

Review Questions: Testing μ

A random sample of 16 statistics examinations from a large population was taken.

The average score in the sample was 78.6 with a standard deviation of 8.0.

Want to know if the average grade of the population is significantly more than 75.

Assume the distribution of the population of grades is normal.

Is it a two-tailed test?

Do you use Z- or t- test?

Compute the test statistic

The p-value is between ___

Review Questions: Testing π

A random sample of 100 people was taken.

85 of the people in the sample favored Candidate A.

Want to find out whether or not the proportion of the

population in favor of Candidate A is significantly more

than 80%.

Set up the hypotheses.

What is the test statistic and the p-value?
