Example 10.1 Experimenting with a New Pizza Style at the Pepperoni Pizza Restaurant Concepts in Hypothesis Testing

Background Information

The manager of Pepperoni Pizza Restaurant

has recently begun experimenting with a new

method of baking its pepperoni pizzas.

Background Information – cont’d

He believes that the new method produces a

better-tasting pizza, but he would like to base a

decision on whether to switch from the old

method to the new method on customer

reactions.

Therefore he performs an experiment.

The Experiment

For 100 randomly selected customers who order

a pepperoni pizza for home delivery, he includes

both an old style and a free new style pizza in

the order.

The Experiment – cont’d

All he asks is that these customers rate the

difference between pizzas on a -10 to +10 scale,

where -10 means they strongly favor the old

style, +10 means they strongly favor the new

style, and 0 means they are indifferent between

the two styles.

Once he gets the ratings from the customers,

how should he proceed?

Hypothesis Testing

This example’s goal is to explain hypothesis

testing concepts. We are not implying that the

manager would, or should, use a hypothesis

testing procedure to decide whether to switch

methods.

Hypothesis Testing – cont’d

First, hypothesis testing does not take costs into account. In this example, if the new method is more costly, that extra cost would be ignored by hypothesis testing.

Second, even if costs of the two pizza-making

methods are equivalent, the manager might

base his decision on a simple point estimate and

possibly a confidence interval.

Null and Alternative Hypotheses

Usually, the null hypothesis is labeled H0 and the alternative hypothesis is labeled Ha.

The null and alternative hypotheses divide all

possibilities into two nonoverlapping sets,

exactly one of which must be true.

Null and Alternative Hypotheses – cont’d

Traditionally, hypothesis testing has been

phrased as a decision-making problem, where

an analyst decides either to accept the null

hypothesis or reject it, based on the sample

evidence.

One-Tailed Versus Two-Tailed Tests

The form of the alternative hypothesis can be either one-tailed or two-tailed, depending on what the analyst is trying to prove.

A one-tailed alternative is one where the only sample results that can lead to rejection of the null hypothesis are those in a particular direction. In the pizza example, these are the results where the sample mean rating is positive.

One-Tailed Versus Two-Tailed Tests – cont’d

A two-tailed test is one where results in either of

two directions can lead to rejection of the null

hypothesis.

Once the hypotheses are set up, it is easy to detect whether the test is one-tailed or two-tailed.

One-Tailed Versus Two-Tailed Tests – cont’d

One-tailed alternatives are phrased in terms of “>” or “<”, whereas two-tailed alternatives are phrased in terms of “≠”.

The real question is whether to set up hypotheses

for a particular problem as one-tailed or two-tailed.

There is no statistical answer to this question. It

depends entirely on what we are trying to prove.

Types of Errors

Whether one decides to accept or reject the null hypothesis, it might be the wrong decision.

One might reject the null hypothesis when it is

true or incorrectly accept the null hypothesis

when it is false.

These errors are called type I and type II errors.

Types of Errors – cont’d

In general, we commit a type I error when we incorrectly reject a null hypothesis that is true. We commit a type II error when we incorrectly accept a null hypothesis that is false.

These ideas appear graphically below.
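The four possible outcomes can also be tabulated with a short Python sketch. The function name `classify_outcome` and its arguments are hypothetical, chosen only for illustration.

```python
def classify_outcome(null_is_true, reject_null):
    """Label the result of a test decision given the (unknown) truth."""
    if null_is_true and reject_null:
        return "type I error"       # rejected a true null hypothesis
    if not null_is_true and not reject_null:
        return "type II error"      # accepted a false null hypothesis
    return "correct decision"

# Print every combination of truth and decision.
for null_is_true in (True, False):
    for reject_null in (True, False):
        truth = "H0 true" if null_is_true else "H0 false"
        decision = "reject H0" if reject_null else "accept H0"
        print(f"{truth:9s} | {decision:9s} | {classify_outcome(null_is_true, reject_null)}")
```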

Types of Errors – cont’d

• While these errors seem to be equally

serious, actually type I errors have

traditionally been regarded as the more

serious of the two.

• Therefore, the hypothesis-testing procedure favors caution in terms of rejecting the null hypothesis.

Significance Level and Rejection Region

The real question is how strong the evidence in favor of the alternative hypothesis must be to reject the null hypothesis.

The analyst determines the probability of a type I error that he is willing to tolerate. The value is denoted by α and is most commonly equal to 0.05, although α = 0.01 and α = 0.10 are also frequently used.

Significance Level and Rejection Region – cont’d

The value of α is called the significance level of the test.

Then, given the value of α, we use statistical theory to determine the rejection region.

Significance Level and Rejection Region – cont’d

If the sample falls into this region we reject the

null hypothesis; otherwise, we accept it.

Sample evidence that falls into the rejection region is called statistically significant at the α level.
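As a rough illustration of what the significance level means, the following Python sketch simulates many samples for which the borderline null hypothesis (a population mean of 0) is true and counts how often a one-tailed t test rejects it at α = 0.05. The sample size of 40, the standard deviation of 5, the number of trials, and all function names are assumptions made for this sketch. The t tail probability is approximated by numerical integration so that only the standard library is needed.

```python
import math
import random
import statistics

def t_right_tail(t, df, steps=1000):
    """Approximate P(T > t) for a t distribution using Simpson's rule."""
    c = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(df * math.pi)
    h = abs(t) / steps  # steps must be even for Simpson's rule
    total = c + c * (1 + t * t / df) ** (-(df + 1) / 2)  # endpoints 0 and |t|
    for i in range(1, steps):
        x = i * h
        total += (4 if i % 2 else 2) * c * (1 + x * x / df) ** (-(df + 1) / 2)
    tail = 0.5 - total * h / 3  # area to the right of |t|
    return tail if t >= 0 else 1 - tail

def t_critical(alpha, df):
    """Find the cutoff with right-tail area alpha by bisection (alpha < 0.5)."""
    lo, hi = 0.0, 20.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_right_tail(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def type_one_error_rate(alpha=0.05, n=40, sd=5.0, trials=4000, seed=1):
    """Fraction of simulated samples (true mean 0) that reject H0."""
    rng = random.Random(seed)
    cutoff = t_critical(alpha, n - 1)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, sd) for _ in range(n)]
        t_value = statistics.fmean(sample) / (statistics.stdev(sample) / math.sqrt(n))
        if t_value > cutoff:  # sample falls into the rejection region
            rejections += 1
    return rejections / trials

rate = type_one_error_rate()
print(f"Estimated type I error rate: {rate:.3f}")
```

The estimated rate should land close to the chosen α, since α is by definition the probability of rejecting a true null hypothesis.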

Significance from p-values

This approach is currently more popular than the significance level and rejection region approach.

This approach is to avoid the use of the α level and instead simply report “how significant” the sample evidence is.

Significance from p-values – cont’d

We do this by means of the p-value. The p-value is the probability of seeing a random sample at least as extreme as the sample observed, given that the null hypothesis is true.

Here “extreme” is relative to the null hypothesis.


In general smaller p-values indicate more

evidence in support of the alternative

hypothesis. If a p-value is sufficiently small,

almost any decision maker will conclude that

rejecting the null hypothesis is the more

“reasonable” decision.


How small is a “small” p-value? This is largely a matter of semantics, but if the

− p-value is less than 0.01, it provides “convincing” evidence that the alternative hypothesis is true;

− p-value is between 0.01 and 0.05, there is “strong” evidence in favor of the alternative hypothesis;

− p-value is between 0.05 and 0.10, it is in a “gray area”;

− p-value is greater than 0.10, it is interpreted as weak or no evidence in support of the alternative.
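These guidelines can be written as a small lookup, sketched here in Python. The function name `evidence_label` is hypothetical, and assigning a p-value that sits exactly on a cutoff (such as 0.05) to the weaker category is an assumption of this sketch, since the guidelines do not address boundary values.

```python
def evidence_label(p_value):
    """Map a p-value to the informal strength-of-evidence categories."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value < 0.01:
        return "convincing"
    if p_value < 0.05:
        return "strong"
    if p_value < 0.10:
        return "gray area"
    return "weak or none"

for p in (0.004, 0.03, 0.07, 0.25):
    print(f"p-value {p:.3f}: {evidence_label(p)} evidence for the alternative")
```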

Example 10.1a Experimenting with a New Pizza Style at the Pepperoni Pizza Restaurant

Hypothesis Tests for a Population Mean

Objective

To use a one-sample t test to see

whether consumers prefer the new style

pizza to the old style.

Background Information

Recall that the manager of the Pepperoni Pizza

Restaurant is running an experiment to test the

hypotheses of H0: μ ≤ 0 versus Ha: μ > 0, where μ is

the mean rating in the entire customer population.

Here, each customer rates the difference between

an old-style pizza and a new-style pizza on a -10 to

+10 scale, where negative ratings favor the old-style

pizza and positive ratings favor the new-style pizza.

PIZZA.XLS

The ratings of 40 randomly selected customers and several summary statistics appear in this file and in the following table.

Summary Statistics

From the summary statistics, we see that the sample mean is 2.10 and the sample standard deviation is 4.717.

The positive sample mean provides some evidence in favor of the alternative hypothesis, but given the rather large standard deviation and the boxplot of ratings shown on the next slide, does it provide enough evidence to reject H0?

Summary Statistics – cont’d

Running the Test

To run the test, we calculate the test statistic, using

the borderline null hypothesis value μ0 = 0, and

report how much probability is beyond it in the right

tail of the appropriate t distribution.

We use the right tail because the alternative is one-tailed of the “greater than” variety.

Running the Test – cont’d

The test statistic is

$$t\text{-value} = \frac{2.10 - 0}{4.717/\sqrt{40}} = 2.816$$

The probability beyond this value in the right tail of the t distribution with n-1 = 39 degrees of freedom is approximately 0.004, which can be found in Excel with the function TDIST(2.816,39,1).
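For readers working outside Excel, the same calculation can be sketched in Python from the summary statistics alone. The variable and function names are hypothetical, and the right-tail probability is approximated by numerically integrating the t density rather than by calling a statistics library, which is a choice made here only to keep the sketch self-contained.

```python
import math

def t_right_tail(t, df, steps=2000):
    """Approximate P(T > t) for a t distribution using Simpson's rule."""
    c = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(df * math.pi)
    h = abs(t) / steps  # steps must be even for Simpson's rule
    total = c + c * (1 + t * t / df) ** (-(df + 1) / 2)  # endpoints 0 and |t|
    for i in range(1, steps):
        x = i * h
        total += (4 if i % 2 else 2) * c * (1 + x * x / df) ** (-(df + 1) / 2)
    tail = 0.5 - total * h / 3  # area to the right of |t|
    return tail if t >= 0 else 1 - tail

# Summary statistics reported for the pizza ratings.
sample_mean = 2.10
sample_sd = 4.717
n = 40
mu0 = 0.0  # borderline null hypothesis value

standard_error = sample_sd / math.sqrt(n)
t_value = (sample_mean - mu0) / standard_error
p_value = t_right_tail(t_value, n - 1)  # one-tailed, "greater than" alternative

print(f"t-value = {t_value:.3f}")
print(f"one-tailed p-value = {p_value:.4f}")
```

The printed values should agree with the t-value of 2.816 and the p-value of approximately 0.004 quoted above.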

Running the Test – cont’d

The probability, 0.004, is the p-value for the test.

It indicates that these sample results would be very unlikely if the null hypothesis were true.

The manager has two choices: he can conclude

that the null hypothesis is true or he can conclude

that the alternative hypothesis is true - and

presumably switch to the new-style pizza. The

second choice appears to be more reasonable.

Using StatTools

Another way to interpret the results is in terms of

traditional significance levels, but the p-value is the

preferred method.

Using StatTools – cont’d

The StatTools One-Sample procedure can be used

to perform this analysis easily. To use it, select the

StatTools/Statistical Inference/One-Sample Analysis

menu item, and choose the Rating variable as the

variable to analyze.

Then fill in the dialog boxes as shown on the

following slides.

One-Sample Hypothesis Test Dialog Box

The Results

Most of this output should be familiar; it mirrors the previous calculations.

The results are significant at the 1% level.
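The 1% claim can also be checked with the rejection region approach: find the cutoff that leaves probability 0.01 in the right tail of the t distribution with 39 degrees of freedom and see whether the t-value of 2.816 lies beyond it. The Python sketch below does this under the assumption that a bisection search over a numerically integrated t density is accurate enough for the purpose; the function names are hypothetical and this is not how StatTools computes its output.

```python
import math

def t_right_tail(t, df, steps=1000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule."""
    c = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(df * math.pi)
    h = t / steps  # steps must be even for Simpson's rule
    total = c + c * (1 + t * t / df) ** (-(df + 1) / 2)  # endpoints 0 and t
    for i in range(1, steps):
        x = i * h
        total += (4 if i % 2 else 2) * c * (1 + x * x / df) ** (-(df + 1) / 2)
    return 0.5 - total * h / 3

def t_critical(alpha, df):
    """Find the cutoff with right-tail area alpha by bisection (alpha < 0.5)."""
    lo, hi = 0.0, 20.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_right_tail(mid, df) > alpha:
            lo = mid  # tail still too large, move the cutoff right
        else:
            hi = mid
    return (lo + hi) / 2

alpha = 0.01
t_value = 2.816           # test statistic from the earlier calculation
cutoff = t_critical(alpha, 39)
is_significant = t_value > cutoff

print(f"Rejection region at alpha = {alpha}: t-value > {cutoff:.3f}")
print(f"t-value {t_value} in rejection region: {is_significant}")
```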

Conclusion

Should the manager switch to the new-style pizza

on the basis of these sample results?

We would probably recommend “yes”. There is no

indication that the new-style pizza costs any more to

make than the old-style pizza, and the sample

evidence is fairly convincing that customers, on

average, will prefer the new-style pizza.

Conclusion – cont’d

Therefore, unless there are reasons for not

switching (for example, costs), we recommend the

switch.
