Example 10.1 Experimenting with a New Pizza Style at the Pepperoni Pizza Restaurant
Concepts in Hypothesis Testing
Background Information
The manager of Pepperoni Pizza Restaurant
has recently begun experimenting with a new
method of baking its pepperoni pizzas.
Background Information – cont’d
He believes that the new method produces a
better-tasting pizza, but he would like to base a
decision on whether to switch from the old
method to the new method on customer
reactions.
Therefore he performs an experiment.
The Experiment
For 100 randomly selected customers who order
a pepperoni pizza for home delivery, he includes
both an old style and a free new style pizza in
the order.
The Experiment – cont’d
All he asks is that these customers rate the
difference between pizzas on a -10 to +10 scale,
where -10 means they strongly favor the old
style, +10 means they strongly favor the new
style, and 0 means they are indifferent between
the two styles.
Once he gets the ratings from the customers,
how should he proceed?
Hypothesis Testing
This example’s goal is to explain hypothesis
testing concepts. We are not implying that the
manager would, or should, use a hypothesis
testing procedure to decide whether to switch
methods.
Hypothesis Testing – cont’d
First, hypothesis testing does not take costs into
account. In this example, if the new method is
more costly, that cost would be ignored by hypothesis
testing.
Second, even if costs of the two pizza-making
methods are equivalent, the manager might
base his decision on a simple point estimate and
possibly a confidence interval.
Null and Alternative Hypotheses
Usually, the null hypothesis is labeled H0 and the
alternative hypothesis is labeled Ha.
The null and alternative hypotheses divide all
possibilities into two nonoverlapping sets,
exactly one of which must be true.
Null and Alternative Hypotheses – cont’d
Traditionally, hypothesis testing has been
phrased as a decision-making problem, where
an analyst decides either to accept the null
hypothesis or reject it, based on the sample
evidence.
One-Tailed Versus Two-Tailed Tests
The form of the alternative hypothesis can be
either one-tailed or two-tailed, depending on
what the analyst is trying to prove.
A one-tailed hypothesis is one where the only
sample results that can lead to rejection of the
null hypothesis are those in a particular
direction; in the pizza example, those where the
sample mean rating is positive.
One-Tailed Versus Two-Tailed Tests – cont’d
A two-tailed test is one where results in either of
two directions can lead to rejection of the null
hypothesis.
Once the hypotheses are set up, it is easy to
detect whether the test is one-tailed or two-
tailed.
One-Tailed Versus Two-Tailed Tests – cont’d
One-tailed alternatives are phrased in terms of “>”
or “<”, whereas two-tailed alternatives are phrased
in terms of “≠”.
The real question is whether to set up hypotheses
for a particular problem as one-tailed or two-tailed.
There is no statistical answer to this question. It
depends entirely on what we are trying to prove.
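As an illustration, the two forms can be written side by side for a population mean μ tested against the value 0. The one-tailed pair is the one used for the pizza ratings later in this deck; the two-tailed pair is shown only for contrast and is not a test the manager runs.

```latex
\begin{align*}
\text{One-tailed:} \quad & H_0: \mu \le 0 \quad \text{versus} \quad H_a: \mu > 0 \\
\text{Two-tailed:} \quad & H_0: \mu = 0 \quad \text{versus} \quad H_a: \mu \ne 0
\end{align*}
```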
Types of Errors
Whether one decides to accept or reject
the null hypothesis, it might be the wrong
decision.
One might reject the null hypothesis when it is
true or incorrectly accept the null hypothesis
when it is false.
These errors are called type I and type II errors.
Types of Errors – cont’d
In general, we commit a type I error when we
incorrectly reject a null hypothesis that is true.
We commit a type II error when we
incorrectly accept a null hypothesis that is false.
These ideas appear graphically below.
Types of Errors – cont’d
• While these errors seem to be equally
serious, actually type I errors have
traditionally been regarded as the more
serious of the two.
• Therefore, the hypothesis-testing procedure
favors caution in terms of rejecting the null
hypothesis.
Significance Level and Rejection Region
The real question is how strong the evidence in
favor of the alternative hypothesis must be to
reject the null hypothesis.
The analyst determines the probability of a type I
error that he is willing to tolerate. This value is
denoted by α and is most commonly equal to
0.05, although α = 0.01 and α = 0.10 are
also frequently used.
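To see what "the probability of a type I error" means in practice, the following minimal sketch simulates many samples for which the null hypothesis is true and counts how often a one-tailed test rejects it anyway. It is an illustration under simplifying assumptions that are not in the slides: ratings are drawn from a normal distribution with mean 0 and a known standard deviation, and a z test is used in place of the t test. The function name and its parameters are hypothetical.

```python
import random
from statistics import NormalDist


def simulate_type_i_rate(alpha=0.05, n=40, sigma=1.0, trials=4000, seed=1):
    """Estimate how often a true null (mean = 0) is rejected at level alpha.

    Assumption for illustration: normal data with known sigma, so a
    one-tailed z test applies.
    """
    rng = random.Random(seed)
    # Critical value: reject when the z statistic exceeds this threshold.
    z_crit = NormalDist().inv_cdf(1 - alpha)
    rejections = 0
    for _ in range(trials):
        # Draw a sample for which the null hypothesis is exactly true.
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        z = (sum(sample) / n) / (sigma / n ** 0.5)
        if z > z_crit:
            rejections += 1  # a type I error: rejecting a true null
    return rejections / trials


rate = simulate_type_i_rate()
print(f"estimated type I error rate: {rate:.3f}")
```

The estimated rate should land close to the chosen significance level, which is the sense in which the analyst "tolerates" that probability of a type I error.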
Significance Level and Rejection Region – cont’d
The value of α is called the significance level of
the test.
Then, given the value of α, we use
statistical theory to determine the rejection
region.
Significance Level and Rejection Region – cont’d
If the sample falls into this region we reject the
null hypothesis; otherwise, we accept it.
Sample evidence that falls into the rejection
region is called statistically significant at the
α level.
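As a rough sketch of how a rejection region can be determined, the code below finds the critical t value for a right-tailed test by numerically integrating the t density and bisecting on the tail probability. The 39 degrees of freedom are borrowed from the pizza example later in this deck; the helper names are hypothetical, and a statistics package or table would normally be used instead of this hand-rolled integration.

```python
import math


def t_right_tail(t, df, steps=1000):
    """P(T >= t) for t >= 0, using Simpson's rule on the t density over [0, t]."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    h = t / steps
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    # Half the probability lies above 0; subtract the area between 0 and t.
    return 0.5 - total * h / 3


def critical_value(alpha, df):
    """Value c with P(T >= c) = alpha, found by bisection on [0, 20]."""
    lo, hi = 0.0, 20.0
    for _ in range(40):
        mid = (lo + hi) / 2
        if t_right_tail(mid, df) > alpha:
            lo = mid  # tail still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2


df = 39  # n - 1 for a sample of 40 ratings
for alpha in (0.10, 0.05, 0.01):
    c = critical_value(alpha, df)
    print(f"alpha = {alpha:.2f}: reject H0 if the t-value exceeds {c:.3f}")
```

A smaller α pushes the critical value further into the tail, so stronger sample evidence is required before the null hypothesis is rejected.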
Significance from p-values
This approach is currently more popular than the
significance level and rejection region approach.
The idea is to avoid the use of the α level
and instead simply report “how significant” the
sample evidence is.
Significance from p-values – cont’d
We do this by means of the p-value. The p-value
is the probability of seeing a random sample at
least as extreme as the one observed, given
that the null hypothesis is true.
Here “extreme” is relative to the null hypothesis.
Significance from p-values – cont’d
In general smaller p-values indicate more
evidence in support of the alternative
hypothesis. If a p-value is sufficiently small,
almost any decision maker will conclude that
rejecting the null hypothesis is the more
“reasonable” decision.
Significance from p-values – cont’d
How small is a “small” p-value? This is largely a matter of semantics, but if the
− p-value is less than 0.01, it provides “convincing” evidence that the alternative hypothesis is true;
− p-value is between 0.01 and 0.05, there is “strong” evidence in favor of the alternative hypothesis;
Significance from p-values – cont’d
− p-value is between 0.05 and 0.10, it is in a “gray area”;
− p-values greater than 0.10 are interpreted as weak or no evidence in support of the alternative.
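The rule of thumb above can be written as a small lookup. This is only a sketch of the verbal scale from these slides; the function name is hypothetical, and the treatment of values that fall exactly on a boundary (0.01, 0.05, 0.10) is an assumption, since the slides do not say which side they belong to.

```python
def describe_p_value(p):
    """Map a p-value to the verbal evidence scale used in the slides.

    Assumption: a boundary value is assigned to the weaker category.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p < 0.01:
        return "convincing evidence for the alternative"
    if p < 0.05:
        return "strong evidence for the alternative"
    if p < 0.10:
        return "gray area"
    return "weak or no evidence for the alternative"


for p in (0.004, 0.03, 0.07, 0.25):
    print(f"p = {p}: {describe_p_value(p)}")
```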
Example 10.1a Experimenting with a New Pizza Style at the Pepperoni Pizza Restaurant
Hypothesis Tests for a Population Mean
Objective
To use a one-sample t test to see
whether consumers prefer the new style
pizza to the old style.
Background Information
Recall that the manager of the Pepperoni Pizza
Restaurant is running an experiment to test the
hypotheses of H0: μ ≤ 0 versus Ha: μ > 0, where μ is
the mean rating in the entire customer population.
Here, each customer rates the difference between
an old-style pizza and a new-style pizza on a -10 to
+10 scale, where negative ratings favor the old-style
pizza and positive ratings favor the new-style pizza.
PIZZA.XLS
The ratings of 40 randomly selected customers and
several summary statistics appear in this file and in the following table.
Summary Statistics
From the summary statistics, we see that the sample mean is 2.10 and the sample standard deviation is 4.717.
The positive sample mean provides some evidence in favor of the alternative hypothesis, but given the rather large standard deviation and the boxplot of ratings shown on the next slide, does it provide enough evidence to reject H0?
Summary Statistics – cont’d
Running the Test
To run the test, we calculate the test statistic, using
the borderline null hypothesis value μ0 = 0, and
report how much probability is beyond it in the right
tail of the appropriate t distribution.
We use the right tail because the alternative is one-
tailed of the “greater than” variety.
Running the Test – cont’d
The test statistic is
$$t\text{-value} = \frac{2.10 - 0}{4.717/\sqrt{40}} = 2.816$$
The probability beyond this value in the right tail of the
t distribution with n-1 = 39 degrees of freedom is
approximately 0.004, which can be found in Excel with
the function TDIST(2.816,39,1).
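The test statistic and its right-tail probability can be checked from the summary statistics alone. The sketch below uses only the values quoted in the slides (sample mean 2.10, sample standard deviation 4.717, n = 40, null value 0); the helper `t_right_tail` is a hypothetical stand-in for Excel's TDIST that integrates the t density numerically, not the method used by Excel or StatTools.

```python
import math


def t_right_tail(t, df, steps=2000):
    """P(T >= t) for t >= 0, using Simpson's rule on the t density over [0, t]."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    h = t / steps
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    # Half the probability lies above 0; subtract the area between 0 and t.
    return 0.5 - total * h / 3


# Summary statistics quoted in the slides.
n = 40
sample_mean = 2.10
sample_sd = 4.717
mu0 = 0.0  # borderline value of the null hypothesis

standard_error = sample_sd / math.sqrt(n)
t_value = (sample_mean - mu0) / standard_error
p_value = t_right_tail(t_value, n - 1)

print(f"standard error     = {standard_error:.4f}")
print(f"t-value            = {t_value:.3f}")
print(f"one-tailed p-value = {p_value:.4f}")
```

The printed values should agree, up to rounding, with the t-value and p-value reported on this slide.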
Running the Test – cont’d
The probability, 0.004, is the p-value for the test.
It indicates that these sample results would be
very unlikely if the null hypothesis were true.
The manager has two choices: he can conclude
that the null hypothesis is true or he can conclude
that the alternative hypothesis is true - and
presumably switch to the new-style pizza. The
second choice appears to be more reasonable.
Using StatTools
Another way to interpret the results is in terms of
traditional significance levels, but the p-value is the
preferred method.
Using StatTools – cont’d
The StatTools One-Sample procedure can be used
to perform this analysis easily. To use it, select the
StatTools/Statistical Inference/One-Sample Analysis
menu item, and choose the Rating variable as the
variable to analyze.
Then fill in the dialog boxes as shown on the
following slides.
One-Sample Hypothesis Test Dialog Box
The Results
Most of this output should be familiar; it mirrors the previous calculations.
The results are significant at the 1% level.
Conclusion
Should the manager switch to the new-style pizza
on the basis of these sample results?
We would probably recommend “yes”. There is no
indication that the new-style pizza costs any more to
make than the old-style pizza, and the sample
evidence is fairly convincing that customers, on
average, will prefer the new-style pizza.
Conclusion – cont’d
Therefore, unless there are reasons for not
switching (for example, costs), we recommend the
switch.