Session 9 intro_of_topics_in_hypothesis_testing

Training on Teaching Basic Statistics for Tertiary Level Teachers

Summer 2008

Note: Most of the Slides were prepared by Prof. Josefina Almeda of the School of Statistics, UP Diliman

Introduction to Topics in Hypothesis Testing


Session 7.2

Two Areas of Inferential Statistics

Estimation

Point Estimation

Interval Estimation

Hypothesis Testing

Session 7.3

Research Problem: How effective is Minoxidil in treating male pattern baldness?

Specific Objectives:

1. To estimate the population proportion of patients who will show new hair growth after being treated with Minoxidil.

2. To determine whether treatment using Minoxidil is better than the existing treatment that is known to stimulate hair growth among 40% of patients with male pattern baldness.

Question: How do we achieve these objectives using inferential statistics?

Objective 1 can be answered by ESTIMATION.

Objective 2 can be answered by HYPOTHESIS TESTING.

Session 7.4

What is Hypothesis Testing?

Hypothesis testing is an area of statistical inference in which one evaluates a conjecture about some characteristic of the parent population based upon the information contained in the random sample.

Usually the conjecture concerns one of the unknown parameters of the population.

Session 7.5

What is a Hypothesis?

A hypothesis is a claim or statement about the population parameter.

• Examples of parameters are the population mean or proportion.

• The parameter must be identified before analysis.

Example claim: “This drug is guaranteed to change cholesterol levels (on the average) by more than 30%!”

Session 7.6

Example of Hypothesis

The mean body temperature for patients admitted to elective surgery is not equal to 37.0°C.

Note: The parameter of interest here is μ, which is the mean body temperature for patients admitted to elective surgery.

Session 7.7

Example of a Hypothesis

The proportion of registered voters in Quezon City favoring Candidate A exceeds 0.60.

Note: The parameter of interest here is p which is the proportion of registered voters in Quezon City favoring Candidate A.

Session 7.8

Things to keep in mind

Analyze a sample in an attempt to distinguish between results that can easily occur and results that are highly unlikely

We can explain the occurrence of highly unlikely results by saying either that a rare event has indeed occurred or that things aren’t as they are assumed to be.

Session 7.9

Example to illustrate the basic approach in testing hypothesis

A product called “Gender Choice” claims that couples “increase their chances of having a boy up to 85%, a girl up to 80%.” Suppose you conduct an experiment that includes 100 couples who want to have baby girls, and they all follow the Gender Choice instructions for having a baby girl.

Session 7.10

Example (cont’d)

Using common sense and no real formal statistical methods, what should you conclude about the effectiveness of Gender Choice if the 100 babies include

a. 52 girls

b. 97 girls

Session 7.11

Solution to Example:

a. We normally expect around 50 girls in 100 births. The result of 52 girls is close to 50, so we should not conclude that Gender Choice is effective. If the 100 couples used no special methods of gender selection, the result of 52 girls could easily occur by chance.

Session 7.12

Solution to Example

b. The result of 97 girls in 100 births is extremely unlikely to occur by chance. It could be explained in one of two ways: either an extremely rare event has occurred by chance, or Gender Choice is effective. Because of the extremely low probability of getting 97 girls by chance, the more likely explanation is that the product is effective.

Session 7.13

Explanation of Example

We should conclude that the product is effective only if we get significantly more girls than we would expect under normal circumstances.

Although the outcomes of 52 girls and 97 girls are both “above average,” the result of 52 girls is not significant, whereas 97 girls does constitute a significant result.
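The "common sense" judgment above can be checked with a short calculation. The sketch below assumes that, without any special method, each birth is independently a girl with probability 0.5 (an assumption added here for illustration), and computes the chance of getting at least 52 or at least 97 girls in 100 births. The function name prob_at_least is hypothetical.

```python
from math import comb

def prob_at_least(k, n=100, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k, n + 1))

# Assumption: births are independent and P(girl) = 0.5 when no method is used.
p_52 = prob_at_least(52)
p_97 = prob_at_least(97)

print(f"P(at least 52 girls in 100 births) = {p_52:.4f}")
print(f"P(at least 97 girls in 100 births) = {p_97:.3e}")
```

Under these assumptions, 52 or more girls happens by chance in more than a third of such experiments, while 97 or more girls is so improbable that chance is not a reasonable explanation.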

Session 7.14

Components of a Formal Hypothesis Test

Session 7.15

Null Hypothesis

• denoted by Ho

• the statement being tested

• it represents what the experimenter doubts to be true

• must contain the condition of equality and must be written with the symbol =, ≤, or ≥

When actually conducting the test, we operate under the assumption that the parameter equals some specific value.

Session 7.16

For the mean, the null hypothesis will be stated in one of these three possible forms:

Ho: μ = some value

Ho: μ ≤ some value

Ho: μ ≥ some value

Note: the value of μ can be obtained from previous studies or from knowledge of the population.

Session 7.17

Example of Null Hypothesis

The null hypothesis corresponding to the common belief that the mean body temperature is 37.0°C is expressed as

Ho: μ = 37.0

We test the null hypothesis directly in the sense that we assume it is true and reach a conclusion to either reject Ho or fail to reject Ho.

Session 7.18

Alternative Hypothesis

• denoted by Ha

• is the statement that must be true if the null hypothesis is false

• the operational statement of the theory that the experimenter believes to be true and wishes to prove

• is sometimes referred to as the research hypothesis

Session 7.19

For the mean, the alternative hypothesis will be stated in only one of three possible forms:

Ha: μ ≠ some value

Ha: μ > some value

Ha: μ < some value

Note: Ha is the opposite of Ho. For example, if Ho is given as μ = 37.0, then it follows that the alternative hypothesis is given by Ha: μ ≠ 37.0.

Session 7.20

Note About Using ≤ or ≥ in Ho:

Even though we sometimes express Ho with the symbol ≤ or ≥, as in Ho: μ ≤ 37.0 or Ho: μ ≥ 37.0, we conduct the test by assuming that μ = 37.0 is true.

We must have a single fixed value for μ so that we can work with a single distribution having a specific mean.

Session 7.21

Note About Stating Your Own Hypotheses:

If you are conducting a research study and you want to use a hypothesis test to support your claim, the claim must be stated in such a way that it becomes the alternative hypothesis, so it cannot contain the condition of equality.

Session 7.22

Example in Stating your Hypothesis

If you believe that your brand of refrigerator lasts longer than the mean of 14 years for other brands, state the claim that μ > 14, where μ is the mean life of your refrigerators.

Ho: μ = 14 vs. Ha: μ > 14

Session 7.23

Some Notes:

In this context of trying to support the goal of the research, the alternative hypothesis is sometimes referred to as the research hypothesis.

Also in this context, the null hypothesis is assumed true for the purpose of conducting the hypothesis test, but it is hoped that the conclusion will be rejection of the null hypothesis so that the research hypothesis is supported.

Session 7.24

Example of a Research Problem

Suppose that the government is deciding whether or not to approve the manufacturing of a new drug. A drug is to be tested to find out if it can dissolve cholesterol deposits in the heart’s arteries. A major cause of heart diseases is the hardening of the arteries caused by the accumulation of cholesterol. The Bureau of Food and Drug (BFaD) will not allow the marketing of the drug unless there is strong evidence that it is effective.

Session 7.25

Research Problem (cont’d)

A random sample of 98 middle-aged men has been selected for the experiment. Each man is given a standard daily dosage of the drug for 2 consecutive weeks. Their cholesterol levels are measured at the beginning and at the end of the test. The interest is to determine if the intake of the drug leads to reduced cholesterol levels; that is, a hypothesis test will have to be performed to determine if the drug is effective or not. Based on the results of the experiment, the director of BFaD will decide whether to release the drug to the public or postpone its release and request more research.

Session 7.26

Research Problem (cont’d)

To perform a statistical hypothesis test, we must first identify the parameter of interest and have some educated guess about the true value of the parameter. In the case of the BFaD example, the possible states of the drug’s effectiveness are referred to as hypotheses. Because the director wants only to know whether it is effective or not, one of the following hypotheses applies.

• The drug is ineffective.

• The drug is effective.

Session 7.27

Research Problem (cont’d)

To measure the effectiveness of the drug for each middle-aged man, we can look at the percent change in his cholesterol level between the measurement taken before the drug and the one taken after it. The population of interest consists of the percent changes for all middle-aged men who take the drug.

We summarize effectiveness in terms of the population mean.

Let μ be the population mean of the percent change in cholesterol levels. BFaD decides to classify the drug as effective only if, on the average, it reduces the cholesterol levels by more than 30% (μ > 30%).

Session 7.28

Stating the Null Hypothesis

The null hypothesis represents no practical change in cholesterol levels before and after the drug use. In terms of μ, we say

Ho: μ ≤ 30%

This means that, on the average, the cholesterol level is reduced by 30% or less.

Session 7.29

Stating the Alternative Hypothesis

Ha: μ > 30%

Whenever sample results fail to support the null hypothesis, the conclusion we accept is the alternative hypothesis. In our illustration, if results from the sample of percentage change in cholesterol levels fail to support Ho, then the director concludes Ha and says that the drug is effective.

Session 7.30

What is a Test of Significance?

• A test of significance is a problem of deciding between the null and the alternative hypotheses on the basis of the information contained in a random sample.

• The goal will be to reject Ho in favor of Ha, because the alternative is the hypothesis that the researcher believes to be true. If we are successful in rejecting Ho, we then declare the results to be “significant”.

Session 7.31

Note About Testing the Validity of Someone Else’s Claim

Sometimes we test the validity of someone else’s claim, such as the claim of the Coca Cola Bottling Company that “the mean amount of Coke in cans is at least 355 ml,” which becomes the null hypothesis of Ho: μ ≥ 355.

In this context of testing the validity of someone else’s claim, their original claim sometimes becomes the null hypothesis (because it contains equality), and it sometimes becomes the alternative hypothesis (because it does not contain the equality).

Session 7.32

Two Types of Errors

Type I Error

Type II Error

Session 7.33

Type I Error

• The mistake of rejecting the null hypothesis when it is true.

• It is not a miscalculation or procedural misstep; it is an actual error that can occur when a rare event happens by chance.

• The probability of rejecting the null hypothesis when it is true is called the significance level (α).

• The value of α is typically predetermined, and very common choices are α = 0.05 and α = 0.01.
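To see what a significance level means in practice, the simulation sketch below repeatedly draws samples from a population in which Ho is actually true and counts how often a two-tailed z test at the 0.05 level rejects Ho anyway. The hypothesized mean 37.0 comes from the body temperature example; the standard deviation, sample size, number of trials, and the function name rejects_h0 are illustrative assumptions, not values from the slides.

```python
import random
from statistics import NormalDist, mean

random.seed(2008)  # fixed seed so the run is repeatable

MU0 = 37.0      # hypothesized mean from the body temperature example
SIGMA = 0.4     # ASSUMED known population standard deviation
N = 25          # ASSUMED sample size
ALPHA = 0.05    # significance level

# Two-tailed critical value for a standard normal test statistic
z_crit = NormalDist().inv_cdf(1 - ALPHA / 2)

def rejects_h0():
    """Draw one sample with Ho true and report whether the z test rejects Ho."""
    sample = [random.gauss(MU0, SIGMA) for _ in range(N)]
    z = (mean(sample) - MU0) / (SIGMA / N ** 0.5)
    return abs(z) > z_crit

TRIALS = 4000
type_i_rate = sum(rejects_h0() for _ in range(TRIALS)) / TRIALS
print(f"Proportion of true null hypotheses rejected: {type_i_rate:.3f}")
```

Every rejection counted here is a Type I error, and the observed proportion should land close to the chosen significance level.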

Session 7.34

Examples of Type I Error

1. The mistake of rejecting the null hypothesis that the mean body temperature is 37.0 when that mean is really 37.0.

2. BFaD allows the release of an ineffective medicine

Session 7.35

Type II Error

The mistake of failing to reject the null hypothesis when it is false.

The symbol β (beta) is used to represent the probability of a type II error.

Session 7.36

Examples of Type II Errors

1. The mistake of failing to reject the null hypothesis (μ = 37.0) when it is actually false (that is, the mean is not 37.0).

2. BFaD does not allow the release of an effective drug.

Session 7.37

Summary of Possible Decisions in Hypothesis Testing

Decision | True situation: the null hypothesis is true | True situation: the null hypothesis is false
We decide to reject the null hypothesis. | TYPE I error (rejecting a true null hypothesis) | CORRECT decision
We fail to reject the null hypothesis. | CORRECT decision | TYPE II error (failing to reject a false null hypothesis)

ANALOGY

TRIAL (null hypothesis: The accused did not do it.)

Verdict | Truth: INNOCENT | Truth: GUILTY
GUILTY | Type I Error | Correct
INNOCENT | Correct | Type II Error

Session 7.38

Controlling Type I and Type II Errors

The experimenter is free to determine α. If the test leads to the rejection of Ho, the researcher can then conclude that there is sufficient evidence supporting Ha at the α level of significance.

Usually, β is unknown because it is hard to calculate. The common solution to this difficulty is to “withhold judgment” if the test leads to the failure to reject Ho.

α and β are inversely related. For a fixed sample size n, as α decreases, β increases, and vice versa.

Session 7.39

In almost all statistical tests, both α and β can be reduced by increasing the sample size.

Because of the inverse relationship of α and β, setting a very small α should also be avoided if the researcher cannot afford a very large risk of committing a Type II error.
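The trade-off between the two error probabilities can be made concrete for one simple case. The sketch below computes β for a right-tailed z test of Ho: μ = 14 (the refrigerator example) with a known population standard deviation. The true mean of 14.5 years, the standard deviation of 2 years, the sample sizes, and the function name beta_right_tailed are all hypothetical choices made only for illustration.

```python
from statistics import NormalDist

def beta_right_tailed(alpha, n, mu0, mu1, sigma):
    """P(Type II error) for a right-tailed z test of Ho: mu = mu0
    when the true mean is mu1 > mu0 and sigma is known."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)            # reject Ho when z > z_crit
    shift = (mu1 - mu0) * n ** 0.5 / sigma   # how far the true mean moves the statistic
    return z.cdf(z_crit - shift)             # chance the statistic stays below z_crit

# HYPOTHETICAL inputs: true mean 14.5 years, sigma = 2 years
MU0, MU1, SIGMA = 14.0, 14.5, 2.0

for n in (25, 100):
    for alpha in (0.10, 0.05, 0.01):
        b = beta_right_tailed(alpha, n, MU0, MU1, SIGMA)
        print(f"n = {n:3d}  alpha = {alpha:.2f}  beta = {b:.3f}")
```

Within each sample size, β grows as α shrinks; moving from n = 25 to n = 100 lowers β at every α, which is the sense in which a larger sample reduces both risks.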

Session 7.40

Common Choices of α | Consequences of Type I error
0.01 or smaller | very serious
0.05 | moderately serious
0.10 | not too serious

The choice of α usually depends on the consequences associated with making a Type I error.

Session 7.41

The usual practice in research and industry is to determine in advance the values of α and n, so the value of β is determined.

Depending on the seriousness of a type I error, try to use the largest α that you can tolerate.

For type I errors with more serious consequences, select smaller values of α. Then choose a sample size n as large as is reasonable, based on considerations of time, cost, and other such relevant factors.

Session 7.42

Example to illustrate Type I and Type II Errors

Consider M&Ms (produced by Mars, Inc.) and Bufferin brand aspirin tablets (produced by Bristol-Myers Products).

The M&M package contains 1498 candies. The mean weight of the individual candies should be at least 0.9085 g., because the M&M package is labeled as containing 1361 g.

Session 7.43

The Bufferin package is labeled as holding 30 tablets, each of which contains 325 mg of aspirin.

Because M&Ms are candies used for enjoyment whereas Bufferin tablets are drugs used for treatment of health problems, we are dealing with two very different levels of seriousness.

Session 7.44

If the M&Ms don’t have a population mean weight of 0.9085 g, the consequences are not very serious, but if the Bufferin tablets don’t have a mean of 325 mg of aspirin, the consequences could be very serious.

Session 7.45

If the M&Ms have a mean that is too large, Mars will lose some money but consumers will not complain.

In contrast, if the Bufferin tablets have too much aspirin, Bristol-Myers could be faced with consumer lawsuits.

Session 7.46

Consequently, in testing the claim that μ = 0.9085 g for M&Ms, we might choose α = 0.05 and a sample size of n = 100.

In testing the claim of μ = 325 mg for Bufferin tablets, we might choose α = 0.01 and a sample size of n = 500.

The smaller significance level α and larger sample size n are chosen because of the more serious consequences associated with the commercial drug.

Session 7.47

The Test Statistic - a statistic computed from the sample data that is especially sensitive to the differences between Ho and Ha

• The test statistic should tend to take on certain values when Ho is true and different values when Ha is true.

• The decision to reject Ho depends on the value of the test statistic.

• A decision rule based on the value of the test statistic:

Reject Ho if the computed value of the test statistic falls in the region of rejection.

Session 7.48

Region of Rejection or Critical Region - the set of all values of the test statistic which will lead to the rejection of Ho

Factors that Determine the Region of Rejection

• the behavior of the test statistic if the null hypothesis were true

• the alternative hypothesis: the location of the region of rejection depends on the form of Ha

• level of significance (α): the smaller α is, the smaller the region of rejection

Session 7.49

Critical Value/s - the value or values that separate the critical region from the values of the test statistic that would not lead to rejection of the null hypothesis.

It depends on the nature of the null hypothesis, the relevant sampling distribution, and the level of significance.

Session 7.50

Types of Tests

• Two-tailed Test. If we are primarily concerned with deciding whether the true value of a population parameter is different from a specified value, then the test should be two-tailed. For the case of the mean, we say Ha: μ ≠ μ0.

• Left-tailed Test. If we are primarily concerned with deciding whether the true value of a parameter is less than a specified value, then the test should be left-tailed. For the case of the proportion, we say Ha: P < P0.

• Right-tailed Test. If we are primarily concerned with deciding whether the true value of a parameter is greater than a specified value, then we should use the right-tailed test. For the case of the standard deviation, we say Ha: σ > σ0.

Session 7.51

Level of Significance and the Rejection Region

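[Figure: rejection regions and critical value(s) for three tests]

Ho: μ = 30 vs. Ha: μ < 30 (rejection region in the left tail)

Ho: μ = 30 vs. Ha: μ > 30 (rejection region in the right tail)

Ho: μ = 30 vs. Ha: μ ≠ 30 (rejection regions in both tails)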
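How the critical values move with the level of significance and the form of Ha can be sketched numerically. The example below assumes the test statistic follows the standard normal distribution when Ho is true (an assumption made for illustration; other tests use other sampling distributions). The function name critical_values and the tail labels are hypothetical.

```python
from statistics import NormalDist

def critical_values(alpha, tail):
    """Critical value(s) of a standard normal test statistic.

    tail is "left", "right", or "two", matching the form of Ha.
    """
    z = NormalDist()
    if tail == "left":       # Ha: parameter < value; reject for small statistics
        return (z.inv_cdf(alpha),)
    if tail == "right":      # Ha: parameter > value; reject for large statistics
        return (z.inv_cdf(1 - alpha),)
    if tail == "two":        # Ha: parameter != value; alpha is split between the tails
        c = z.inv_cdf(1 - alpha / 2)
        return (-c, c)
    raise ValueError("tail must be 'left', 'right', or 'two'")

for alpha in (0.10, 0.05, 0.01):
    for tail in ("left", "right", "two"):
        values = ", ".join(f"{v:.3f}" for v in critical_values(alpha, tail))
        print(f"alpha = {alpha:.2f}  {tail:>5}-tailed  critical value(s): {values}")
```

As α gets smaller, the critical values move farther from zero, so the region of rejection shrinks.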

Session 7.52

The p-value - the smallest level of significance at which Ho will be rejected based on the information contained in the sample

An Alternative Form of Decision Rule (based on the p-value)

Reject Ho if the p-value is less than or equal to the level of significance (α).

Session 7.53

Example of Making Decisions Using the p-value

If the level of significance α = 0.05,

p-value | Decision
0.01 | Reject Ho.
0.05 | Reject Ho.
0.10 | Do not reject Ho.
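The p-value decision rule is simple enough to write as a short function. The sketch below applies it to the three p-values in the example and also shows one way a p-value could be obtained from a test statistic, assuming that statistic is standard normal under Ho. The function names decide and p_value are hypothetical.

```python
from statistics import NormalDist

def decide(p, alpha):
    """Decision rule: reject Ho when the p-value is at most alpha."""
    return "Reject Ho" if p <= alpha else "Do not reject Ho"

def p_value(z_stat, tail):
    """p-value of a test statistic ASSUMED to be standard normal under Ho."""
    z = NormalDist()
    if tail == "left":
        return z.cdf(z_stat)
    if tail == "right":
        return 1 - z.cdf(z_stat)
    if tail == "two":
        return 2 * (1 - z.cdf(abs(z_stat)))
    raise ValueError("tail must be 'left', 'right', or 'two'")

ALPHA = 0.05
for p in (0.01, 0.05, 0.10):
    print(f"p-value = {p:.2f}: {decide(p, ALPHA)}")
```

Note that a p-value exactly equal to α leads to rejection, because the rule uses "less than or equal to".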

Session 7.54

Conclusions in Hypothesis Testing

1. Fail to reject the null hypothesis Ho.

2. Reject the null hypothesis Ho.

Notes:

Some texts say “accept the null hypothesis” instead of “fail to reject the null hypothesis.”

Whether we use the term accept or fail to reject, we should recognize that we are not proving the null hypothesis; we are merely saying that the sample evidence is not strong enough to warrant rejection of the null hypothesis.

Session 7.55

Wording of Final Conclusion

Does the original claim contain the condition of equality?

No (Original claim does not contain equality and becomes Ha)

Do you reject Ho?

• Yes (Reject Ho): “The sample data support the claim that … (original claim).” (This is the only case in which the original claim is supported.)

• No (Fail to Reject Ho): “There is not sufficient sample evidence to support the claim that … (original claim).”

Session 7.56

Wording of Final Conclusion

Does the original claim contain the condition of equality?

Yes (Original claim contains equality and becomes Ho)

Do you reject Ho?

• Yes (Reject Ho): “There is sufficient evidence to warrant rejection of the claim that … (original claim).” (This is the only case in which the original claim is rejected.)

• No (Fail to Reject Ho): “There is not sufficient sample evidence to warrant rejection of the claim that … (original claim).”

Session 7.57

Example in Making Final Conclusion

If you want to justify the claim that the mean body temperature is different from 37.0°C, then make the claim that μ ≠ 37.0. This claim will be an alternative hypothesis that will be supported if you reject the null hypothesis of Ho: μ = 37.0.

Session 7.58

Example in Making Final Conclusion

If, on the other hand, you claim that the mean body temperature is 37.0°C, that is, μ = 37.0, you will either reject or fail to reject the claim; in either case, you will not support the original claim.

Session 7.59

A Summary of the Steps in Hypothesis Testing

Determine the objectives of the experiment (responsibility of the experimenter).

1. State the null and alternative hypotheses.

2. Decide on a level of significance, α.

Determine the testing procedure and methods of analysis (responsibility of the statistician).

3. Decide on the type of data to be collected and choose an appropriate test statistic and testing procedure.

Session 7.60

Steps in Hypothesis Testing (cont’d)

4. State the decision rule.

5. Collect the data and compute the value of the test statistic using the sample data.

6. a) If decision rule is based on region of rejection: Check if the test statistic falls in the region of rejection. If yes, reject Ho.

b) If decision rule is based on p-value: Determine the p-value. If the p-value is less than or equal to α, reject Ho.

7. Interpret results.
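The steps can be traced end to end on the BFaD drug example. In the sketch below, only the hypotheses (Ho: μ ≤ 30 vs. Ha: μ > 30) and the sample size of 98 come from the slides. The sample mean, the sample standard deviation, and the choice of α = 0.05 are hypothetical values invented for illustration, and the test statistic is treated as approximately standard normal because the sample is large.

```python
from statistics import NormalDist

# Steps 1-2: hypotheses and level of significance
MU0 = 30.0        # Ho: mu <= 30 vs. Ha: mu > 30 (percent reduction in cholesterol)
ALPHA = 0.05      # ASSUMED level of significance

# Step 5 inputs: HYPOTHETICAL summary statistics, not real data
n = 98            # sample size given in the example
xbar = 32.1       # hypothetical sample mean percent reduction
s = 9.0           # hypothetical sample standard deviation

# Steps 3 and 5: test statistic, approximately standard normal for large n
test_stat = (xbar - MU0) / (s / n ** 0.5)

# Steps 4 and 6a: right-tailed region of rejection
z_crit = NormalDist().inv_cdf(1 - ALPHA)
reject_by_region = test_stat > z_crit

# Step 6b: the same decision through the p-value
p_value = 1 - NormalDist().cdf(test_stat)
reject_by_p = p_value <= ALPHA

# Step 7: interpret
decision = "Reject Ho" if reject_by_p else "Fail to reject Ho"
print(f"test statistic = {test_stat:.3f}, critical value = {z_crit:.3f}")
print(f"p-value = {p_value:.4f}, decision: {decision}")
```

With these made-up numbers both forms of the decision rule lead to rejecting Ho, which in the example would mean the director concludes the drug is effective. With real data the result could of course differ.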

Session 7.61

End of Presentation
