MATH 1040 Skittles Data Project - Single Mom...

Laura Boren

MATH 1040 Skittles Data Project

For our project in MATH 1040, everyone in the class was asked to buy a 2.17-ounce,

individual-sized bag of skittles and count the number of each color of candy in the bag. The class

data was compiled and used for a number of exercises, each involving a different aspect of

statistics.

For the first part of the project, we determined the proportion of each color of candy and

created a Pareto chart and a pie chart for the total number of each color of candies in the entire

class. We compared the class data to our own personal data and noted any similarities or

differences.

For part 2 of the project we used the skittles data to compute summary statistics: the

mean, standard deviation, and 5-number summary. We made a frequency histogram of the total

number of candies per bag as well as a box plot. Individually, I also wrote a paragraph about the

significance of different qualitative and quantitative methods of analysis.

The last part of the project involved confidence intervals. We found 3 different

confidence intervals for the population proportion, mean, and standard deviation and wrote an

analysis about what each confidence interval meant.

Laura Boren, Melissa Oneal, Justin Peck, Nathan Schafer

Math 1040 Class Skittles Proportions

Color               Count    Proportion of Total
Red Skittles          564    0.199
Orange Skittles       564    0.199
Green Skittles        566    0.199
Purple Skittles       559    0.197
Yellow Skittles       586    0.206
Total Number of Skittles in the class: 2839
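
As a rough check on the proportions above, the following Python sketch recomputes them from the class counts and puts the colors in the order a Pareto chart would use (most frequent first). The variable names are only illustrative and are not part of the original assignment.

```python
# Class color counts, copied from the table above
class_counts = {"Red": 564, "Orange": 564, "Green": 566, "Purple": 559, "Yellow": 586}

total = sum(class_counts.values())
proportions = {color: n / total for color, n in class_counts.items()}

# A Pareto chart orders the bars from most to least frequent
pareto_order = sorted(class_counts, key=class_counts.get, reverse=True)

for color in pareto_order:
    print(f"{color:<7} {class_counts[color]:>4}  {proportions[color]:.3f}")
print("Total  ", total)
```

The five counts add up to the class total of 2839, and yellow comes first in the Pareto ordering.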

MATH 1040 Skittles Data

Laura Boren, Melissa Oneal, Justin Peck, Nathan Schafer

Does the Class data represent a random sample?

Yes, the class data does represent a random sample. Although each student was asked to buy their own

bag of skittles and not every bag of skittles in the region had an equal chance of being selected, the

distribution of skittles from the central plant/warehouse was most likely random. The skittles company

most likely does not count colors as it loads the bags and simply fills them by weight. Assuming

students did not make any biased decisions about which bag to grab off the shelf, every bag produced had

an equal chance of being shipped to any location in the country and being selected at random by a student

in the class.

What would the population be?

In this study, the sample is the class data. Since not everyone in the class is currently living in the same

state, the population would be all 2.17-ounce skittles bags in the United States. Because different

manufacturing plants currently operate overseas, the population can only reasonably be

expanded to include the United States distribution circuit.

Laura Boren

Math 1040 Skittles Data

Skittles Color     Class Total   Proportion   My Total   Proportion
Red Skittles               564        0.199         16        0.258
Yellow Skittles            586        0.206         11        0.177
Orange Skittles            564        0.199         10        0.161
Green Skittles             566        0.199         15        0.242
Purple Skittles            559        0.197         10        0.161
Total Skittles            2839                      62

My skittles bag differed quite a bit from the class data. My bag had a noticeably higher proportion of red and

green skittles than the class total, but like the class data it had the fewest purple skittles (tied with orange in

my bag). I had always assumed that red was the most common skittles color, but that may just be due to

the vibrancy of the color red and it being noticed more. In my skittles bag it was the most common, but that

was not supported by the class data. I was surprised to see yellow skittles being the most common in the class.
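
To make the comparison concrete, the short Python sketch below subtracts each class proportion from the matching proportion in my bag. It uses only the counts from the table above. The names are illustrative, and it is meant as a quick check, not a formal test of whether the differences are statistically significant.

```python
# Counts copied from the table above: (class total, my bag)
counts = {
    "Red": (564, 16),
    "Yellow": (586, 11),
    "Orange": (564, 10),
    "Green": (566, 15),
    "Purple": (559, 10),
}

class_total = sum(c for c, _ in counts.values())
my_total = sum(m for _, m in counts.values())

# Difference in proportion (my bag minus class) for each color
diffs = {
    color: m / my_total - c / class_total
    for color, (c, m) in counts.items()
}

for color, d in sorted(diffs.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{color:<7} {d:+.3f}")
```

Red and green come out positive (more common in my bag than in the class), while yellow, orange, and purple come out negative.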

1. Using the total number of candies in each bag in our class sample, compute the

following measures for the variable “Total candies in each bag”:

(a) mean number of candies per bag

The mean number of candies per bag is 59.1 candies.

(b) standard deviation of the number of candies per bag

The standard deviation per bag is 6.4 candies.

(c) 5-number summary for the number of candies per bag

The 5-number summary is: minimum 34, Q1 58, median 60, Q3 62, maximum 71.

Report these summary statistics rounded to one decimal place, if needed.

Math 1040 Skittle Data 2015

Laura Boren

Skittles Data Part 3

1. From these graphs we can conclude that the frequency histogram is skewed to the left,

although our boxplot appeared rather symmetrical, likely due to not having smaller value

increments on the number line. This skew is consistent with the summary statistics: the median

number of candies per bag is 60 but the mean is only 59.1. One of the main causes of the

negative skew is that several of the skittles bags only had 30-40 candies in them, which is only about

half to two-thirds of the median number of skittles per bag. Those bags represent outliers, and pull the

mean towards the left. My data agrees with the data collected by the whole class because the

highest frequency of candies per bag was between 60-65 candies per bag. My bag had 62

candies, which falls right in that class.
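
The skew and outlier claims can be checked roughly from the summary statistics reported earlier in this project. The Python sketch below uses two common rules of thumb that were not part of the original assignment: Pearson's second skewness coefficient, 3(mean - median)/s, and the 1.5 * IQR fences for flagging outliers.

```python
# Summary statistics reported earlier in this project
mean, median, sd = 59.1, 60, 6.4
minimum, q1, q3, maximum = 34, 58, 62, 71

# Pearson's second skewness coefficient: negative values suggest left skew
skew = 3 * (mean - median) / sd

# 1.5 * IQR fences: values outside them are usually flagged as outliers
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(f"skewness coefficient: {skew:.2f}")
print(f"fences: ({lower_fence}, {upper_fence})")
print("minimum is a low outlier:", minimum < lower_fence)
print("maximum is a high outlier:", maximum > upper_fence)
```

The coefficient is negative, which agrees with the left skew seen in the histogram. The smallest bag (34 candies) falls well below the lower fence of 52, and by this rule the largest bag (71 candies) also sits just above the upper fence of 68.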

2. Categorical variables are also known as qualitative variables. These variables can be put

into different categories, such as a model of car, color, gender, etc. Quantitative data is data that

can be ordered and measured. The number of candies in a bag of skittles is quantitative, whereas

the color of the candy is categorical.

Graphing quantitative data is best done with histograms, stem-and-leaf plots, dot plots, bar

graphs, and box plots. All of these types of graphs can be used to show how the values of a

variable are distributed. Categorical data is best graphed using a method that lets you compare the

groups to one another. A bar graph can work for both quantitative and categorical data, but a pie

chart doesn’t make sense for quantitative data because it is comparing categories to the whole. A

pie chart would effectively show the percentage of each color of skittles in a bag (categorical

data), but cannot effectively be used to show the number of skittles in a bag (quantitative data).

When it comes to calculations, mean and median only make sense for quantitative data.

The mean is the average quantity of something in an entire sample, so it is a

meaningful calculation only when applied to quantitative data. The median represents the middle

value of the data and once again makes sense only when applied to quantitative data.

The best measure of central tendency to apply to categorical data is the mode. When looking at the colors of

candy in a skittles bag, you may not be able to find the average color or the median color, but you

can establish which color occurs the most often. Likewise, when looking at the number of

candies in a skittles bag, the best measures of center are going to be the mean

and median number of skittles.

Laura Boren, Nathan Schafer, Justin Peck, Melissa Oneal

99% Confidence Interval estimate for the population proportion of yellow candies

x = 586

n = 2839

z-value for 99% CI = 2.576

p̂ = 586/2839 = 0.206

Standard error = √[p̂(1 - p̂)/n] = 0.007596

0.206 +/- 2.576 * (0.007596)

0.206 +/- 0.01957

99% Confidence Interval Estimate: (0.186, 0.226)
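
The calculation above can be reproduced with a few lines of Python. This is only a sketch: the count, sample size, and z-value are the ones given above, and the variable names are illustrative.

```python
import math

x, n = 586, 2839     # yellow candies and total candies in the class sample
z = 2.576            # critical value for 99% confidence, as used above

p_hat = x / n
std_error = math.sqrt(p_hat * (1 - p_hat) / n)
margin = z * std_error

lower, upper = p_hat - margin, p_hat + margin
print(f"p_hat = {p_hat:.3f}, SE = {std_error:.6f}, margin = {margin:.5f}")
print(f"99% CI: ({lower:.3f}, {upper:.3f})")
```

Because the code carries the unrounded sample proportion through every step, its lower limit rounds to 0.187 instead of the 0.186 obtained above from the rounded value 0.206. The upper limit matches.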

Confidence intervals for a population proportion are used to estimate, with the

specified degree of confidence, the proportion of a characteristic found within a population. In

relation to the skittles, we are 99% confident that the proportion of yellow skittles among all bags of

skittles in the population falls between 0.186 and 0.226.

95% Confidence Interval estimate for the population mean number of skittles per bag

Sx = 6.38

Sample mean= 59.15

Standard error of the mean = 0.9114

To find the t-value, a t-table was consulted using 50 degrees of freedom. The t-value is 2.009.

59.15 +/- 2.009 * (0.9114)

59.15 + 1.83 = 60.98

59.15 - 1.83 = 57.32

95% Confidence Interval Estimate: (57.32, 60.98)
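
As a quick check of the arithmetic, the interval can be recomputed in Python from the sample mean, standard error, and t-value reported above. The sketch takes those three numbers as given and does not recompute them from the raw bag counts, which are not listed here.

```python
sample_mean = 59.15
std_error = 0.9114   # standard error of the mean, as reported above
t_crit = 2.009       # t-table value for 95% confidence at 50 degrees of freedom

margin = t_crit * std_error
lower, upper = sample_mean - margin, sample_mean + margin
print(f"margin = {margin:.2f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```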

Confidence interval estimates of the population mean use sample data to construct an interval that,

with the specified degree of confidence, the mean of the population should fall within. In

this case, we are 95% confident that the population mean number of skittles per bag is between 57.32 and

60.98.

Laura Boren, Nathan Schafer, Justin Peck, Melissa Oneal

98% confidence interval estimate for the population standard deviation of the number of

candies per bag

s = 6.378

s² = 40.679

1 - α/2 = 0.99

α/2 = 0.01

On the chi-square distribution chart, 50 degrees of freedom was used. The value for χ²(1-α/2) was

29.707. For χ²(α/2) it was 76.154.

Bound = √[ s²(df) / χ² value ], using χ²(α/2) for the lower bound and χ²(1-α/2) for the upper bound

Lower bound: 5.06

Upper bound: 8.11
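
The bounds above can be reproduced with the Python sketch below. One value in it is an assumption and is not stated in this write-up: the numerator uses n - 1 = 48. That value was chosen only because it reproduces the reported bounds of 5.06 and 8.11, while the chi-square values are the ones read from the table at 50 degrees of freedom.

```python
import math

s_squared = 40.679    # sample variance reported above
chi2_right = 76.154   # table value for alpha/2 = 0.01 (50 degrees of freedom)
chi2_left = 29.707    # table value for 1 - alpha/2 = 0.99 (50 degrees of freedom)

# ASSUMPTION: n - 1 = 48 in the numerator. This is not stated in the write-up;
# it is inferred because it reproduces the reported bounds.
n_minus_1 = 48

lower = math.sqrt(n_minus_1 * s_squared / chi2_right)
upper = math.sqrt(n_minus_1 * s_squared / chi2_left)
print(f"98% CI for sigma: ({lower:.2f}, {upper:.2f})")
```

The larger chi-square value goes with the lower bound and the smaller one with the upper bound, because the chi-square value sits in the denominator.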

Confidence interval estimates for the population standard deviation use the sample standard

deviation in order to generate an interval that the population standard deviation of the number of

candies should fall within, with the specified level of confidence. In this case, we are 98%

confident that the population standard deviation is between 5.06 and 8.11 candies. The problem

with confidence interval estimates taken from the sample standard deviation is that the sample

standard deviation may be quite different from the actual population standard deviation.

Laura Boren

The purpose of taking sample data and calculating statistics from them is to apply those

statistics to a larger population. Since a population is larger than a sample, how well a sample

statistic can be used to estimate a population parameter is an issue. A confidence interval helps to

solve that issue by allowing us to provide a range of values that the population parameter is

likely to fall within. The intervals are constructed with a certain level of confidence, reflected as

a percentage such as 95%, 98% or 99%. This means that if the same population were to be

sampled on multiple occasions and a confidence interval calculated each time, about X% of the intervals

would contain the true parameter.
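
The idea of repeated sampling can be illustrated with a small simulation. The Python sketch below is hypothetical and is not based on the class data: it assumes a normal population whose mean and standard deviation are set to the sample values from this project, draws many samples of 51 bags (so that there are 50 degrees of freedom and the t-value of 2.009 applies), and counts how often the 95% interval captures the true mean.

```python
import math
import random
import statistics

random.seed(1040)

# ASSUMPTION: a hypothetical normal population, chosen only for illustration
TRUE_MEAN, TRUE_SD = 59.15, 6.38
N = 51            # hypothetical sample size, so that df = 50
T_CRIT = 2.009    # t-value for 95% confidence at 50 degrees of freedom
TRIALS = 2000

hits = 0
for _ in range(TRIALS):
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    mean = statistics.fmean(sample)
    margin = T_CRIT * statistics.stdev(sample) / math.sqrt(N)
    if mean - margin <= TRUE_MEAN <= mean + margin:
        hits += 1

coverage = hits / TRIALS
print(f"{coverage:.1%} of the intervals contained the true mean")
```

The printed share should land close to 95%, which is what the confidence level describes: a success rate for the method over many samples, not a probability attached to any single interval.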

Laura Boren

Skittle Project Reflection

When I first started the Skittles project, I was intimidated by the process of using

statistical concepts to interpret real-life data. As the project went on I became much more

comfortable with concepts such as confidence intervals and creating Pareto charts and frequency

histograms. In my volunteer work as a lactation educator and also as a nursing student I

sometimes find myself reading and interpreting peer-reviewed clinical research. Understanding

what things like confidence intervals are and what makes data significant or unusual is very

helpful in interpreting such studies and thinking critically about what the data actually means.

There are even some aspects of statistics that I used before taking this class. In Human

Physiology we were required to calculate the mean, median, and standard deviation of lung

inspiratory volume as part of our laboratory unit on the respiratory system.

Taking calculus really helped me to understand real-world math applications, and

statistics only supported what I already knew about the practicality of math. Statistics is a very

fundamental part of scientific literacy and has numerous applications in the world of business

and economics. Completing the skittles project helped me to understand how businesses and

corporations might need to use statistics, particularly standard deviations, in order to produce

accurate and consistent products. Statistics can also be used to calculate demand and determine

shipping and distribution needs, and evaluate product quality and customer satisfaction. In our

skittles project we determined the average proportion of each color of skittles candy that came in

a bag as well as a confidence interval of that population proportion. This could be helpful in

evaluating customer candy preferences and overall satisfaction based on flavor preference. A

company might use similar statistics in real life to ensure product standardization.
