
Math 1040 Skittles Term Project

Changhun Kim

Math 1040

12/10/17

Report Introduction

The goal of this project is to compare data collected by the class for 2.17-oz bags of Skittles. I was asked to count each color and report the totals to my teacher, who then compiled the data. Once the data was all collected, I applied concepts learned in class to produce the charts and figures below.

Data Collection

Each student in the class will purchase one 2.17-ounce bag of Original Skittles and record the following

data:

My bag (number of candies of each color):

            Red     Orange  Yellow  Green   Purple  Total
Count       11      10      13      12      11      57
Proportion  0.193   0.175   0.228   0.211   0.193   1

Entire class results:

            Red     Orange  Yellow  Green   Purple  Total
Count       332     286     370     270     312     1570
Proportion  0.211   0.182   0.236   0.172   0.199   1

First, each member of the class purchased a 2.17-ounce bag of Original Skittles. I searched far and wide and, after going to several grocery stores, found a bag. I counted the number of red, orange, yellow, green, and purple candies from the bag. My bag of Skittles had 11 red, 10 orange, 13 yellow, 12 green, and 11 purple Skittles.

Using the data compiled from the entire class, record the following information:

The total number of candies in the class sample = 1570 (my own bag contained 57)

Organizing and Displaying Categorical Data: Colors

Class Data for Number of Skittles by Color: Pie Chart (color, count, proportion)

Red, 332, 0.211465
Orange, 286, 0.182166
Yellow, 370, 0.235669
Green, 270, 0.171975
Purple, 312, 0.198726
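As a minimal sketch, the proportions listed above can be reproduced from the class counts. The names `class_counts` and `color_proportions` are my own illustrative choices and are not part of the original project; sorting from largest to smallest count is what gives the order used in a Pareto chart.

```python
# Class totals by color, taken from the class results table.
class_counts = {"Red": 332, "Orange": 286, "Yellow": 370, "Green": 270, "Purple": 312}

def color_proportions(counts):
    """Return (color, count, proportion) rows, largest count first (Pareto order)."""
    total = sum(counts.values())
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return [(color, n, n / total) for color, n in ordered]

for color, n, share in color_proportions(class_counts):
    # Six decimal places, matching the precision used in the list above.
    print(f"{color}, {n}, {share:.6f}")
```

Each proportion is simply the color count divided by the class total of 1570, so the five proportions add up to 1.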

[Pareto chart: The Number of Candies of Each Color in the Sample. Horizontal axis: color (Yellow, Red, Purple, Orange, Green). Vertical axis: number of Skittles candies (0 to 400).]

I recorded the proportion of each color from the sample data gathered from the class. I created a

Pie Chart and a Pareto Chart for the numbers of candies of each color. To create the pie chart for

this data, I listed the color categories in the first five cells of column

Organizing and Displaying Quantitative Data: the Number of Candies per Bag

Column n Mean Std.dev Min Q1 Median Q3 Max

Total 26 60.385 3.817 47 59 61 63 65

[Histogram: Skittles Project Histogram. Frequencies by number of candies per bag: 45-49: 1; 50-54: 1; 55-59: 6; 60-64: 17; 65-69: 1.]

The distribution shown in the histogram is skewed to the left: most bags fall in the 60-64 class and a longer tail extends toward the lower counts, so it does not have a symmetrical bell shape. The graphs shown above were what I expected to see. The histogram shows that most of the bags (17 of 26) contained between 60 and 64 candies. The boxplot shows the five-number summary of the class data (minimum 47, first quartile 59, median 61, third quartile 63, maximum 65). Comparing my bag to the class data, my total of 57 candies is below the class mean of 60.385 and below the first quartile of 59, which makes my bag a bit different from the typical bag.

Reflection

Categorical data consists of names or labels; any numbers involved serve only as identifiers and have no numerical meaning. Some examples would be Social Security numbers and colors. Quantitative data consists of numbers that represent counts or measurements. An example would be time. Quantitative data can take a finite number of values (discrete) or infinitely many values (continuous).

For the calculations, the pie chart and Pareto chart both make sense for the categorical data because they show how the candies are divided among the color categories. Having a histogram for categorical data wouldn't make sense because a histogram deals with numbers and how many times values in each range appear in a data set. For quantitative data, the histogram and boxplot make sense because they show how the number values are distributed. A pie chart wouldn't work for the bag totals because it shows shares of a whole, not the spread of numeric values.

Confidence Interval Estimates

A confidence interval is a range (or an interval) of values used to estimate the true value of a population parameter. The purpose is to estimate a true population value, such as a proportion, from sample data.

Construct a 99% confidence interval estimate for the true proportion of yellow candies.

Construct a 95% confidence interval estimate for the true mean number of candies per bag.

Construct a 98% confidence interval estimate for the standard deviation of the number of candies per bag.
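The first two intervals requested above can be sketched as follows. This is a rough sketch under stated assumptions, not the calculation originally submitted: the function names are illustrative, the proportion interval uses the usual normal approximation, and the t critical value of 2.060 (25 degrees of freedom, 0.05 in two tails) is assumed from a standard t table rather than computed.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence):
    """Normal-approximation interval for a population proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-tailed critical value
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def mean_ci(mean, std_dev, n, t_critical):
    """t interval for a population mean; t_critical is looked up in a t table."""
    margin = t_critical * std_dev / sqrt(n)
    return mean - margin, mean + margin

# 99% interval for the proportion of yellow candies (370 of 1570 in the class data).
print(proportion_ci(370, 1570, 0.99))

# 95% interval for the mean number of candies per bag (n = 26, mean 60.385, s = 3.817).
# ASSUMPTION: 2.060 is the table value of t for 25 degrees of freedom.
print(mean_ci(60.385, 3.817, 26, 2.060))
```

With these inputs the proportion interval comes out to roughly 0.208 to 0.263, and the mean interval rounds to 58.843 to 61.927.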

At the 98% confidence level, if the experiment were repeated many times and an interval estimate for the standard deviation of the number of candies per bag were constructed each time, about 98% of those intervals would contain the true population standard deviation.
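A minimal sketch of the 98% interval for the standard deviation, using the class summary values (n = 26, s = 3.817). The function name is illustrative, and the two chi-square critical values for 25 degrees of freedom are assumed from a standard chi-square table rather than computed.

```python
from math import sqrt

def std_dev_ci(std_dev, n, chi2_left, chi2_right):
    """Chi-square interval for a population standard deviation."""
    scaled = (n - 1) * std_dev ** 2
    # The larger critical value gives the lower limit, and vice versa.
    return sqrt(scaled / chi2_right), sqrt(scaled / chi2_left)

# ASSUMPTION: table values for 25 degrees of freedom with 0.01 in each tail.
CHI2_LEFT = 11.524
CHI2_RIGHT = 44.314

low, high = std_dev_ci(3.817, 26, CHI2_LEFT, CHI2_RIGHT)
print(f"{low:.3f} < sigma < {high:.3f}")
```

Under these assumptions the interval is roughly 2.87 to 5.62 candies. This method requires a normally distributed population, so the result should be read with caution for this data.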

Hypothesis Tests

Hypothesis testing refers to the formal procedures used in statistical analysis to accept or reject

statistical hypotheses. A statistical hypothesis is an assumption about a population parameter. This

assumption may or may not be true. The usual process of hypothesis testing consists of several

steps. A basic outline is as follows:

• Formulate the null hypothesis (H0) and the alternative hypothesis (H1).

• Identify a test statistic that can be used to assess the truth of the null hypothesis.

• Draw a graph to include the test statistic, critical values, and critical region (if using the critical value method).

• Reject the null hypothesis (H0) if the test statistic is in the critical region. Fail to reject the null hypothesis if the test statistic is not in the critical region.

• Restate this previous decision in simple, non-technical terms, and address the original claim.

Use a 0.05 significance level to test the claim that 20% of all Skittles candies are red.

Use a 0.01 significance level to test the claim that the mean number of candies in a bag of Skittles is 55.

The first hypothesis test we had to do was to test the claim that 20% of all Skittles candies are red. With the data collected from our sample, our proportion of red Skittles came to 21.1%. After calculating the critical values at a 0.05 significance level and the z-score, we found that a 20% proportion of red Skittles is very plausible, and therefore we failed to reject the claim. Below are the calculations for the hypothesis test.
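The original worked calculations are not reproduced here, so the following is only a sketch of how the test of the 20% claim could be carried out with the class counts (332 red out of 1570). The function name and return layout are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0, alpha):
    """Two-tailed z test of H0: p = p0 against H1: p != p0."""
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)       # test statistic
    critical = NormalDist().inv_cdf(1 - alpha / 2)   # positive critical value
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, critical, p_value, abs(z) > critical

z, critical, p_value, reject = one_proportion_z_test(332, 1570, 0.20, 0.05)
print(f"z = {z:.3f}, critical values = +/-{critical:.3f}, p-value = {p_value:.3f}")
print("Reject H0" if reject else "Fail to reject H0")
```

The test statistic of about 1.14 falls between the critical values of about -1.96 and 1.96, which agrees with the decision to fail to reject the claim.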

The second hypothesis test we were to complete was to test the claim that the mean number of candies in a bag of Skittles is 55. As discussed in the second confidence interval estimate we did, our sample mean was 60.385, and with a 95% confidence interval we determined that the true mean was between 58.843 and 61.927. With that information we had a pretty good idea that we would be rejecting the claim. By finding the critical values at a 0.01 significance level, we found that in order for us to fail to reject the claim, our t value must fall between -2.787 and 2.787. In reality, the t value came out to be 7.194, which is far outside that range; therefore, we reject the claim that the mean number of candies in a bag of Skittles is 55.
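As a quick check, the t value can be recomputed from the class summary statistics (n = 26, mean 60.385, s = 3.817). This is a sketch with an illustrative function name; the critical value 2.787 is taken directly from the paragraph above rather than computed.

```python
from math import sqrt

def one_sample_t_statistic(sample_mean, claimed_mean, std_dev, n):
    """Test statistic for H0: mu = claimed_mean."""
    return (sample_mean - claimed_mean) / (std_dev / sqrt(n))

# Critical value quoted in the report for a 0.01 significance level, 25 degrees of freedom.
T_CRITICAL = 2.787

t = one_sample_t_statistic(60.385, 55, 3.817, 26)
reject = abs(t) > T_CRITICAL
print(f"t = {t:.3f}")
print("Reject H0" if reject else "Fail to reject H0")
```

The recomputed value rounds to 7.194, matching the reported test statistic, and it lies well beyond 2.787.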

Reflection

To get accurate statistics when constructing an interval estimate or performing a hypothesis test, certain requirements must be met. For constructing a confidence interval estimate for a population proportion, the requirements are as follows: the sample is a simple random sample, the conditions for the binomial distribution are satisfied, and there are at least 5 successes and at least 5 failures. For constructing a confidence interval estimate for a population mean, the requirements are: the sample is a simple random sample, and the population is normally distributed or n > 30. For constructing a confidence interval estimate for a population standard deviation, the requirements are: the sample is a simple random sample, and the population must have normally distributed values, even if the sample is large. For testing a claim about a population proportion, the requirements are: the sample observations are a simple random sample, the conditions for a binomial distribution are satisfied, and the conditions np ≥ 5 and nq ≥ 5 are both satisfied. For testing a claim about a population mean, the requirements are: the sample is a simple random sample, and the population is normally distributed or n > 30. Where the conditions for a binomial distribution must be satisfied, that means there is a fixed number of independent trials with constant probabilities, and each trial has two outcome categories, success or failure.

One possible source of error in using this data has to do with our sample size. We only used 26 bags of Skittles, and because our sample does not appear to be normally distributed, the requirement is that our sample size be greater than 30. Another possible error could come from inaccurate information, whether the data was recorded wrong or some students got the wrong size bag of Skittles. This sample could be improved by using a larger sample size, and also by verifying the data submitted. Students could count the Skittles in class and have another classmate double-check their work to be sure the data was submitted correctly.

It is very interesting and helpful to see how these math problems are applicable to real life. This seems like a very effective way to check the quality control of a product a manufacturer is supplying. It is also helpful to see how a consumer can verify that they are getting the right amount of product they are paying for.
