
7.1 – 7.4: Hypotheses tests for means


Page 1: 7.1 – 7.4:  Hypotheses tests  for means

Objective: To test claims about inferences for means, under specific conditions

7.1 – 7.4: HYPOTHESES TESTS FOR MEANS

Page 2: 7.1 – 7.4:  Hypotheses tests  for means

• Hypotheses are working models that we adopt temporarily.

• Our starting hypothesis is called the null hypothesis.

• The null hypothesis, that we denote by H0, specifies a population model parameter of interest and proposes a value for that parameter.

• We usually write down the null hypothesis in the form H0: parameter = hypothesized value.

• The alternative hypothesis, which we denote by HA, contains the values of the parameter that we consider plausible if we reject the null hypothesis.

HYPOTHESES

Page 3: 7.1 – 7.4:  Hypotheses tests  for means

• The null hypothesis specifies a population model parameter of interest and proposes a value for that parameter.

• We might have, for example, H0: µ= 20.

• We want to compare our data to what we would expect given that H0 is true.

• We can do this by finding out how many standard deviations away from the proposed value we are.

• We then ask how likely it is to get results like we did if the null hypothesis were true.

HYPOTHESES (CONT.)

Page 4: 7.1 – 7.4:  Hypotheses tests  for means

• Think about the logic of jury trials:

• To prove someone is guilty, we start by assuming they are innocent.

• We retain that hypothesis (null hypothesis) until the facts make it unlikely beyond a reasonable doubt.

• Then, and only then, we reject the (null) hypothesis of innocence and declare the person guilty.

CONSIDER TRIALS

Page 5: 7.1 – 7.4:  Hypotheses tests  for means

• The same logic used in jury trials is used in statistical tests of hypotheses:

• We begin by assuming that a hypothesis is true.

• Next we consider whether the data are consistent with the hypothesis.

• If they are, all we can do is retain the hypothesis we started with. If they are not, then like a jury, we ask whether they are unlikely beyond a reasonable doubt.

CONSIDER TRIALS (CONT.)

Page 6: 7.1 – 7.4:  Hypotheses tests  for means

There are three possible alternative hypotheses:

HA: parameter < hypothesized value

HA: parameter ≠ hypothesized value

HA: parameter > hypothesized value

ALTERNATIVE HYPOTHESES

Page 7: 7.1 – 7.4:  Hypotheses tests  for means

• HA: parameter ≠ value is known as a two-sided alternative because we are equally interested in deviations on either side of the null hypothesis value.

• For two-sided alternatives, the P-value is the probability of deviating in either direction from the null hypothesis value.

ALTERNATIVE HYPOTHESES (CONT.)

Page 8: 7.1 – 7.4:  Hypotheses tests  for means

• The other two alternative hypotheses are called one-sided alternatives.

• A one-sided alternative focuses on deviations from the null hypothesis value in only one direction.

• Thus, the P-value for one-sided alternatives is the probability of deviating only in the direction of the alternative away from the null hypothesis value.

ALTERNATIVE HYPOTHESES (CONT.)

Page 9: 7.1 – 7.4:  Hypotheses tests  for means

• The statistical twist is that we can quantify our level of doubt.

• We can use the model proposed by our hypothesis to calculate the probability that the event we have witnessed could happen.

• That is just the probability we’re looking for—it quantifies exactly how surprised we are to see our results.

• This probability is called a P-value.

P-VALUES

Page 10: 7.1 – 7.4:  Hypotheses tests  for means

• When the data are consistent with the model from the null hypothesis, the P-value is high and we are unable to reject the null hypothesis.

• In that case, we have to “retain” the null hypothesis we started with.

• We can’t claim to have proved it; instead we “fail to reject the null hypothesis” when the data are consistent with the null hypothesis model and in line with what we would expect from natural sampling variability.

• If the P-value is low enough, we’ll “reject the null hypothesis,” since what we observed would be very unlikely were the null model true.

P-VALUES (CONT.)

Page 11: 7.1 – 7.4:  Hypotheses tests  for means

• If the evidence is not strong enough to reject the presumption of innocence, the jury returns with a verdict of “not guilty.”

• The jury does not say that the defendant is innocent.

• All it says is that there is not enough evidence to convict, to reject innocence.

• The defendant may, in fact, be innocent, but the jury has no way to be sure.

P-VALUES: RETURN TO TRIALS

Page 12: 7.1 – 7.4:  Hypotheses tests  for means

• Said statistically, we will fail to reject the null hypothesis.

• We never declare the null hypothesis to be true, because we simply do not know whether it’s true or not.

• Sometimes in this case we say that the null hypothesis has been retained.

P-VALUES: RETURN TO TRIALS (CONT.)

Page 13: 7.1 – 7.4:  Hypotheses tests  for means

1. A research team wants to know if aspirin helps to thin blood. The null hypothesis says that it doesn’t. They test 12 patients, observe their mean blood levels, given baseline data, and get a P-value of 0.32. They proclaim that aspirin doesn’t work. What would you say?

2. A weight loss drug has been tested in a large clinical trial, and the mean weight loss of its patients is known. Now the scientists want to see if the new, improved version works even better. What would the null hypothesis be?

3. The new drug is tested and the P-value is 0.0001. What would you conclude about the new drug?

EXAMPLES

Page 14: 7.1 – 7.4:  Hypotheses tests  for means

• A P-value is a conditional probability—the probability of the observed statistic given that the null hypothesis is true.

• The P-value is NOT the probability that the null hypothesis is true.

• It’s not even the conditional probability that null hypothesis is true given the data.

• Be careful to interpret the P-value correctly.

THINKING ABOUT P-VALUES

Page 15: 7.1 – 7.4:  Hypotheses tests  for means

• When we see a small P-value, we could continue to believe the null hypothesis and conclude that we just witnessed a rare event. But instead, we trust the data and use it as evidence to reject the null hypothesis.

• However, big P-values just mean that what we observed isn’t surprising. That is, the results are in line with our assumption that the null hypothesis models the world, so we have no reason to reject it.

THINKING ABOUT P-VALUES (CONT.)

Page 16: 7.1 – 7.4:  Hypotheses tests  for means

• Sometimes we need to make a firm decision about whether or not to reject the null hypothesis.

• When the P-value is small, it tells us that our data are rare given the null hypothesis.

• How rare is “rare”?

ALPHA LEVELS

Page 17: 7.1 – 7.4:  Hypotheses tests  for means

• We can define “rare event” arbitrarily by setting a threshold for our P-value.

• If our P-value falls below that point, we’ll reject H0. We call such results statistically significant.

• The threshold is called an alpha level, denoted by α.

ALPHA LEVELS (CONT.)

Page 18: 7.1 – 7.4:  Hypotheses tests  for means

• Common alpha levels are 0.10, 0.05, and 0.01.

• You have the option—almost the obligation—to consider your alpha level carefully and choose an appropriate one for the situation.

• The alpha level is also called the significance level.

• When we reject the null hypothesis, we say that the test is “significant at that level.”

ALPHA LEVELS (CONT.)

Page 19: 7.1 – 7.4:  Hypotheses tests  for means

• What can you say if the P-value does not fall below α?

• You should say that “The data have failed to provide sufficient evidence to reject the null hypothesis.”

• Don’t say that you “accept the null hypothesis.”

• Recall that, in a jury trial, if we do not find the defendant guilty, we say the defendant is “not guilty”—we don’t say that the defendant is “innocent.”

ALPHA LEVELS (CONT.)

Page 20: 7.1 – 7.4:  Hypotheses tests  for means

• The P-value gives the reader far more information than just stating that you reject or fail to reject the null.

• In fact, by providing a P-value to the reader, you allow that person to make his or her own decisions about the test.

• What you consider to be statistically significant might not be the same as what someone else considers statistically significant.

• There is more than one alpha level that can be used, but each test will give only one P-value.

ALPHA LEVELS (CONT.)

Page 21: 7.1 – 7.4:  Hypotheses tests  for means

• What do we mean when we say that a test is statistically significant?

• All we mean is that the test statistic had a P-value lower than our alpha level.

• Don’t be lulled into thinking that statistical significance carries with it any sense of practical importance or impact.

STATISTICALLY SIGNIFICANT

Page 22: 7.1 – 7.4:  Hypotheses tests  for means

DAY 2

Page 23: 7.1 – 7.4:  Hypotheses tests  for means

There are four basic parts to a hypothesis test:

1. Hypotheses

2. Model

3. Mechanics

4. Conclusion

Let’s look at these parts in detail…

REASONING OF HYPOTHESIS TESTING

Page 24: 7.1 – 7.4:  Hypotheses tests  for means

1. Hypotheses

• The null hypothesis: To perform a hypothesis test, we must first translate our question of interest into a statement about model parameters.

• In general, we have H0: parameter = hypothesized value.

• The alternative hypothesis: The alternative hypothesis, HA, contains the values of the parameter we consider plausible when we reject the null.

REASONING OF HYPOTHESIS TESTING (CONT.)

Page 25: 7.1 – 7.4:  Hypotheses tests  for means

2. Model

• To plan a statistical hypothesis test, specify the model you will use to test the null hypothesis and the parameter of interest.

• All models require assumptions, so state the assumptions and check any corresponding conditions.

• Your conditions should conclude with a statement such as:

• Because the conditions are satisfied, I can model the sampling distribution of the mean with a Student’s t-model.

• Watch out, though. It might be the case that your model step ends with “Because the conditions are not satisfied, I can’t proceed with the test.” If that’s the case, stop and reconsider (proceed with caution)

REASONING OF HYPOTHESIS TESTING (CONT.)

Page 26: 7.1 – 7.4:  Hypotheses tests  for means

2. Model

• Don’t forget to name your test!

• The test about means is called a one-sample t-test.

REASONING OF HYPOTHESIS TESTING (CONT.)

Page 27: 7.1 – 7.4:  Hypotheses tests  for means

2. Model (cont.)

One-sample t-test for the mean

• The conditions for the one-sample t-test for the mean are the same as for the one-sample t-interval.

• We test the hypothesis H0: μ = μ0 using the statistic $t_{n-1} = \frac{\bar{y} - \mu_0}{SE(\bar{y})}$

• The standard error of the sample mean is $SE(\bar{y}) = \frac{s}{\sqrt{n}}$

• When the conditions are met and the null hypothesis is true, this statistic follows a Student’s t model with n – 1 df. We use that model to obtain a P-value.

REASONING OF HYPOTHESIS TESTING (CONT.)
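As a rough illustration of the mechanics behind the one-sample t-test, the sketch below computes the standard error, the t statistic, and the degrees of freedom from raw data. The function name `one_sample_t` and the five-value sample are made up for illustration; they do not come from the slides.

```python
import math
import statistics


def one_sample_t(data, mu0):
    """Return (t, standard_error, df) for a one-sample t-test of H0: mu = mu0."""
    n = len(data)
    y_bar = statistics.mean(data)      # sample mean
    s = statistics.stdev(data)         # sample standard deviation (n - 1 divisor)
    se = s / math.sqrt(n)              # standard error of the sample mean
    t = (y_bar - mu0) / se             # how many standard errors y_bar is from mu0
    return t, se, n - 1


# Hypothetical sample, used only to exercise the function.
sample = [1, 2, 3, 4, 5]
t_stat, se, df = one_sample_t(sample, mu0=2)
print(f"t = {t_stat:.3f}, SE = {se:.3f}, df = {df}")
```

The t statistic on its own is not a conclusion; it still has to be converted to a P-value using a t model with n – 1 df.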

Page 28: 7.1 – 7.4:  Hypotheses tests  for means

2. Model (cont.)

Finding the P-Value

Either use the table provided, or you may use your calculator:

• normalcdf( is used for z-scores (if you know σ)

• tcdf( is used for critical t-values (when you use s to estimate σ)

• 2nd Distribution

• tcdf(lower bound, upper bound, degrees of freedom)

REASONING OF HYPOTHESIS TESTING (CONT.)
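For readers without the calculator, the following is a minimal sketch of what a `tcdf(lower bound, upper bound, degrees of freedom)` call computes: the area under a Student's t model between two bounds. It uses only the Python standard library and approximates the area with Simpson's rule, which is an assumption of this sketch and not necessarily how any calculator does it. The helper names `t_pdf`, `t_cdf`, and `p_value` are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of Student's t model with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df):
    """Approximate P(T <= x) using Simpson's rule on [0, |x|] and symmetry."""
    if abs(x) > 1e4:                        # treat bounds like 1E99 as infinite
        return 1.0 if x > 0 else 0.0
    a = abs(x)
    n = 2 * max(100, int(a * 100))          # even number of subintervals
    h = a / n
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3                    # area under the curve from 0 to |x|
    return 0.5 + area if x >= 0 else 0.5 - area


def tcdf(lower, upper, df):
    """Area under the t model between lower and upper (math.inf for a tail)."""
    return t_cdf(upper, df) - t_cdf(lower, df)


def p_value(t, df, tail):
    """tail is '<', '>' or '!=' to match the direction of HA."""
    if tail == "<":
        return tcdf(-math.inf, t, df)
    if tail == ">":
        return tcdf(t, math.inf, df)
    return 2 * tcdf(abs(t), math.inf, df)   # two-sided: count both tails


print(round(p_value(1.0, 2, ">"), 4))
```

The two-sided case doubles one tail because the t model is symmetric, which matches the idea that a two-sided alternative counts deviations in either direction.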

Page 29: 7.1 – 7.4:  Hypotheses tests  for means

3. Mechanics

• Under “mechanics” we place the actual calculation of our test statistic from the data.

• Different tests will have different formulas and different test statistics.

• Usually, the mechanics are handled by a statistics program or calculator, but it’s good to know the formulas.

REASONING OF HYPOTHESIS TESTING (CONT.)

Page 30: 7.1 – 7.4:  Hypotheses tests  for means

3. Mechanics (continued)

• The ultimate goal of the calculation is to obtain a P-value.

• The P-value is the probability that the observed statistic value (or an even more extreme value) could occur if the null model were correct.

• If the P-value is small enough, we’ll reject the null hypothesis.

• Note: The P-value is a conditional probability—it’s the probability that the observed results could have happened if the null hypothesis is true.

REASONING OF HYPOTHESIS TESTING (CONT.)

Page 31: 7.1 – 7.4:  Hypotheses tests  for means

4. Conclusion

• The conclusion in a hypothesis test is always a statement about the null hypothesis.

• The conclusion must state either that we reject or that we fail to reject the null hypothesis.

• And, as always, the conclusion should be stated in context.

REASONING OF HYPOTHESIS TESTING (CONT.)

Page 32: 7.1 – 7.4:  Hypotheses tests  for means

1. Check Conditions and show that you have checked these!

• Random Sample: Can we assume this?

• 10% Condition: Do you believe that your sample size is less than 10% of the population size?

• Nearly Normal:

• If you have raw data, graph a histogram to check to see if it is approximately symmetric and sketch the histogram on your paper.

• If you do not have raw data, check to see if the problem states that the distribution is approximately Normal.

STEPS FOR HYPOTHESIS TESTING

Page 33: 7.1 – 7.4:  Hypotheses tests  for means

2. State the test you are about to conduct

Ex) One-Sample t-Test for Means

3. Set up your hypotheses

H0:

HA:

4. Calculate your test statistic

5. Draw a picture of your desired area under the t-model, and calculate your P-value.

STEPS FOR HYPOTHESIS TESTING (CONT.)

Page 34: 7.1 – 7.4:  Hypotheses tests  for means

6. Make your conclusion.

When your P-value is small enough (or below α, if given), reject the null hypothesis.

When your P-value is not small enough, fail to reject the null hypothesis.

STEPS FOR HYPOTHESIS TESTING (CONT.)

Page 35: 7.1 – 7.4:  Hypotheses tests  for means

Given a set of data:

• Enter data into L1

• Set up STATPLOT to create a histogram to check the nearly Normal condition

• STAT TESTS 2:T-Test

• Choose Stored Data, then specify your data list (usually L1)

• Enter the mean of the null model and indicate where the data are (>, <, or ≠)

Given sample mean and standard deviation:

• STAT TESTS 2:T-Test

• Choose Stats enter

• Specify the hypothesized mean and sample statistics

• Specify the tail (>, <, or ≠)

• Calculate

CALCULATOR TIPS

Page 36: 7.1 – 7.4:  Hypotheses tests  for means

A company has set a goal of developing a battery that lasts over 5 hours (300 minutes) in continuous use. A first test of 12 of these batteries measured the following lifespans (in minutes): 321, 295, 332, 351, 281, 336, 311, 253, 270, 326, 311, and 288. Is there evidence that the company has met its goal?       

EXAMPLE 1
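One possible way to work through the mechanics of this example is sketched below, assuming the conditions have been checked and that the goal "lasts over 300 minutes" translates to an upper-tail alternative. Here ȳ is the sample mean, s the sample standard deviation, and n = 12.

```latex
\begin{align*}
H_0 &: \mu = 300 \qquad H_A : \mu > 300 \\
\bar{y} &= \frac{3675}{12} = 306.25, \qquad s = \sqrt{\frac{9450.25}{11}} \approx 29.31 \\
SE(\bar{y}) &= \frac{s}{\sqrt{n}} \approx \frac{29.31}{\sqrt{12}} \approx 8.46 \\
t_{11} &= \frac{\bar{y} - \mu_0}{SE(\bar{y})} \approx \frac{306.25 - 300}{8.46} \approx 0.74
\end{align*}
```

With 11 df, a t value of about 0.74 sits between the table entries for upper-tail areas of 0.25 and 0.20, so the one-sided P-value is roughly 0.24. That is not small at any common alpha level, so on this reading we fail to reject H0: these 12 batteries do not provide evidence that the mean lifespan exceeds 300 minutes.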

Page 37: 7.1 – 7.4:  Hypotheses tests  for means

Find a 90% confidence interval for the mean lifespan of this type of battery.       

EXAMPLE 1 (CONTINUED)
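A sketch of the interval calculation, using ȳ = 306.25 and s ≈ 29.31 computed from the 12 lifespans and assuming the standard table value t* ≈ 1.796 for 90% confidence with 11 df:

```latex
\begin{align*}
SE(\bar{y}) &= \frac{s}{\sqrt{n}} \approx \frac{29.31}{\sqrt{12}} \approx 8.46 \\
\bar{y} \pm t^{*}_{11} \cdot SE(\bar{y}) &\approx 306.25 \pm 1.796 \times 8.46 \approx 306.25 \pm 15.2 \\
&\approx (291.1,\ 321.4)
\end{align*}
```

The interval contains 300 minutes, so a mean lifespan of 300 minutes remains plausible for this type of battery.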

Page 38: 7.1 – 7.4:  Hypotheses tests  for means

Cola makers test new recipes for loss of sweetness during storage. Trained tasters rate the sweetness before and after storage. Here are the sweetness losses (sweetness before storage minus sweetness after storage) found by 10 tasters for one new cola recipe:

Are these data good evidence that the cola lost sweetness?

EXAMPLE 2 (PARTNERS)

Page 39: 7.1 – 7.4:  Hypotheses tests  for means

DAY 3

Page 40: 7.1 – 7.4:  Hypotheses tests  for means

Psychology experiments sometimes involve testing the ability of rats to navigate mazes. The mazes are classified according to difficulty, as measured by the mean length of time it takes rats to find the food at the end. One researcher needs a maze that will take the rats an average of about one minute to solve. He tests one maze on several rats, collecting the data shown. Test the hypothesis that the mean completion time for this maze is 60 seconds at an alpha level of 0.05. What is your conclusion?

EXAMPLE 3

38.4, 57.6, 46.2, 55.5, 62.5, 49.5, 38.0, 40.9, 62.8, 44.3, 33.9, 93.8, 50.4, 47.9, 35.0, 69.2, 52.8, 46.2, 60.1, 56.3, 55.1

Page 41: 7.1 – 7.4:  Hypotheses tests  for means

• Confidence intervals and hypothesis tests are built from the same calculations.

• They have the same assumptions and conditions.

• You can approximate a hypothesis test by examining a confidence interval.

• Just ask whether the null hypothesis value is consistent with a confidence interval for the parameter at the corresponding confidence level.

CONFIDENCE INTERVALS & HYPOTHESIS TESTS

Page 42: 7.1 – 7.4:  Hypotheses tests  for means

• Because confidence intervals are two-sided, they correspond to two-sided tests.

• In general, a confidence interval with a confidence level of C% corresponds to a two-sided hypothesis test with an α-level of (100 – C)%.

• The relationship between confidence intervals and one-sided hypothesis tests is a little more complicated.

• A confidence interval with a confidence level of C% corresponds to a one-sided hypothesis test with an α-level of ½(100 – C)%.

CONFIDENCE INTERVALS & HYPOTHESIS TESTS (CONT.)
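To make the link between intervals and tests concrete, the sketch below builds a 95% confidence interval from the Example 3 maze times and checks whether the null value of 60 seconds falls inside it, which corresponds to a two-sided test at α = 0.05. The critical value 2.086 is an assumed t-table entry for 20 df, and the variable names are illustrative only.

```python
import math
import statistics

# Maze completion times (seconds) from Example 3.
times = [38.4, 57.6, 46.2, 55.5, 62.5, 49.5, 38.0, 40.9, 62.8, 44.3, 33.9,
         93.8, 50.4, 47.9, 35.0, 69.2, 52.8, 46.2, 60.1, 56.3, 55.1]

MU_0 = 60.0        # null hypothesis value (seconds)
T_STAR = 2.086     # assumed table value: 95% critical t for 20 df

n = len(times)
y_bar = statistics.mean(times)
se = statistics.stdev(times) / math.sqrt(n)   # standard error of the mean

lower = y_bar - T_STAR * se
upper = y_bar + T_STAR * se
reject_h0 = not (lower <= MU_0 <= upper)      # two-sided test at alpha = 0.05

print(f"95% CI: ({lower:.2f}, {upper:.2f}); reject H0: {reject_h0}")
```

This shortcut skips the condition checks; the value 93.8 in particular deserves a look in a histogram before trusting the result.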

Page 43: 7.1 – 7.4:  Hypotheses tests  for means

• Here’s some shocking news for you: nobody’s perfect. Even with lots of evidence we can still make the wrong decision.

• When we perform a hypothesis test, we can make mistakes in two ways:

I. The null hypothesis is true, but we mistakenly reject it.

(Type I error)

II. The null hypothesis is false, but we fail to reject it.

(Type II error)

MAKING ERRORS

Page 44: 7.1 – 7.4:  Hypotheses tests  for means

• Which type of error is more serious depends on the situation at hand. In other words, the importance of the error is context dependent.

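• Here’s an illustration of the four situations in a hypothesis test:

• H0 is true and we reject it: Type I error.

• H0 is true and we fail to reject it: correct decision.

• H0 is false and we reject it: correct decision.

• H0 is false and we fail to reject it: Type II error.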

MAKING ERRORS (CONT.)

Page 45: 7.1 – 7.4:  Hypotheses tests  for means

• http://www.youtube.com/watch?v=Q7fZXEW4mpA

• What type of error was made?

• How about OJ Simpson?

MAKING ERRORS (CONT.)

Page 46: 7.1 – 7.4:  Hypotheses tests  for means

• How often will a Type I error occur?

• A Type I error is rejecting a true null hypothesis. To reject the null hypothesis, the P-value must fall below α. Therefore, when the null is true, that happens exactly with a probability of α. Thus, the probability of a Type I error is our α level.

• When H0 is false and we reject it, we have done the right thing.

• A test’s ability to detect a false null hypothesis is called the power of the test.

MAKING ERRORS (CONT.)
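The claim that a true null hypothesis is rejected with probability α can be checked by simulation. The sketch below repeatedly draws samples from a Normal population in which H0 is true and counts how often a two-sided t-test rejects. The sample size, population standard deviation, seed, and the table value 2.262 (two-sided 5% critical t for 9 df) are all assumptions chosen for illustration.

```python
import math
import random

ALPHA = 0.05
N = 10             # sample size (hypothetical)
T_STAR = 2.262     # assumed table value: two-sided 5% critical t for 9 df
MU_0 = 20.0        # null value, echoing the H0: mu = 20 example
SIGMA = 4.0        # hypothetical population standard deviation


def rejects_null(rng):
    """Draw one sample with H0 true and report whether the t-test rejects."""
    sample = [rng.gauss(MU_0, SIGMA) for _ in range(N)]
    y_bar = sum(sample) / N
    s = math.sqrt(sum((y - y_bar) ** 2 for y in sample) / (N - 1))
    t = (y_bar - MU_0) / (s / math.sqrt(N))
    return abs(t) > T_STAR


def type_one_error_rate(trials=20_000, seed=1):
    """Fraction of simulated samples in which a true H0 is rejected."""
    rng = random.Random(seed)
    return sum(rejects_null(rng) for _ in range(trials)) / trials


rate = type_one_error_rate()
print(f"Rejected a true H0 in {rate:.1%} of samples (alpha = {ALPHA:.0%})")
```

The observed rejection rate should land close to 5%, with small differences due to simulation noise.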

Page 47: 7.1 – 7.4:  Hypotheses tests  for means

• When H0 is false and we fail to reject it, we have made a Type II error.

• We assign the letter β to the probability of this mistake.

• It’s harder to assess the value of β because we don’t know what the value of the parameter really is.

• When the null hypothesis is true, it specifies a single parameter value, H0: parameter = hypothesized value.

• When the null hypothesis is false, we do not have a specific parameter; we have many possible values.

• There is no single value for β; we can think of a whole collection of β’s, one for each incorrect parameter value.

MAKING ERRORS (CONT.)

Page 48: 7.1 – 7.4:  Hypotheses tests  for means

• One way to focus our attention on a particular β is to think about the effect size.

• Ask “How big a difference would matter?”

• We could reduce β for all alternative parameter values by increasing α.

• This would reduce β but increase the chance of a Type I error.

• This tension between Type I and Type II errors is inevitable.

• The only way to reduce both types of errors is to collect more data. Otherwise, we just wind up trading off one kind of error against the other.

MAKING ERRORS (CONT.)

Page 49: 7.1 – 7.4:  Hypotheses tests  for means

• The power of a test is the probability that it correctly rejects a false null hypothesis.

• The power of a test is 1 – β, because β is the probability that a test fails to reject a false null hypothesis and power is the probability that it does reject.

• Whenever a study fails to reject its null hypothesis, the test’s power comes into question.

• When we calculate power, we imagine that the null hypothesis is false.

POWER OF THE TEST

Page 50: 7.1 – 7.4:  Hypotheses tests  for means

• The value of the power depends on how far the truth lies from the null hypothesis value.

• The distance between the null hypothesis value, μ0, and the truth, μ, is called the effect size.

• Power depends directly on effect size. It is easier to see larger effects, so the farther μ is from μ0, the greater the power.

POWER OF THE TEST (CONT.)
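To see how power grows with effect size, sample size, and α, the sketch below computes power exactly for a simplified case: a one-sided (upper-tail) z-test with known σ. This is a deliberate simplification of the t-test used in these slides, chosen because the power then has a closed form; the function name and the numbers are hypothetical.

```python
import math
from statistics import NormalDist


def power_upper_tail_z(effect_size, sigma, n, alpha=0.05):
    """Power of a one-sided (upper-tail) z-test with known sigma.

    effect_size is the distance mu - mu0 between the truth and the null value.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha)      # rejection cutoff in SE units
    shift = effect_size * math.sqrt(n) / sigma    # truth's offset in SE units
    return 1 - NormalDist().cdf(z_crit - shift)   # P(reject H0 | true mean)


# Hypothetical numbers: sigma = 10, n = 25, alpha = 0.05.
for delta in (0, 2, 4, 6):
    print(delta, round(power_upper_tail_z(delta, sigma=10, n=25), 3))
```

With an effect size of 0 the null is actually true, so the "power" collapses to α. Raising α or n in the call shows the trade-offs described above: a larger α buys power at the cost of more Type I errors, while a larger n raises power without that cost.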

Page 51: 7.1 – 7.4:  Hypotheses tests  for means

• Day 1: 7.1-7.4-Set A Book Page (Same as 6.5 B Book Page) # 1 - 5

7.1-7.4-Set B Book Page (Same as 6.1 – 6.4 Book Page) # 23, 24

• Day 2: 7.1-7.4-Set B Book Page (Same as 6.1 – 6.4 Book Page) # 1cd, 2cd, 29, 30, 33, 35

• Day 3: 7.1-7.4-Set B Book Page (Same as 6.1 – 6.4 Book Page) # 22, 25 – 28, 34

ASSIGNMENTS