

Jennifer Porter

Math 1040 – Summer 2017

Term Project

July 28, 2017

Skittles Candies Project

This project applies many of the concepts I have learned in Math 1040, including organizing and analyzing data and drawing conclusions using confidence intervals and hypothesis tests. The information below includes the data from my own bag of Skittles as well as the overall class sample. My data was collected along with that of the other students in the class and combined into one data set that is used throughout this project.

Counts         Red   Orange   Yellow   Green   Purple   Total
My Bag          13       19       17       7        8      64
Class Counts   340      370      363     346      342    1761
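The counts in the table can be turned into the percentages that a pie chart displays. The following is a minimal sketch using only the numbers from the table; the variable and function names are illustrative and not part of the original project.

```python
# Counts copied from the table above.
COLORS = ["Red", "Orange", "Yellow", "Green", "Purple"]
my_bag = dict(zip(COLORS, [13, 19, 17, 7, 8]))
class_counts = dict(zip(COLORS, [340, 370, 363, 346, 342]))


def proportions(counts):
    """Return each color's share of the total as a fraction between 0 and 1."""
    total = sum(counts.values())
    return {color: n / total for color, n in counts.items()}


my_props = proportions(my_bag)
class_props = proportions(class_counts)

# Print the two sets of percentages side by side for comparison.
for color in COLORS:
    print(f"{color:<7} my bag {my_props[color]:6.1%}   class {class_props[color]:6.1%}")
```

Printing the shares side by side makes it easy to compare one bag against the class sample without relying on the charts.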


The pie charts above mostly reflect what I expected to see when comparing my data to the class data. Because orange and yellow were so common in my own bag of Skittles, I figured they would be among the highest in the class as well, and that hypothesis was correct. However, I also assumed that red would be one of the most common colors, and I was surprised by the results: red was the lowest color in the class totals. I assumed that purple and green would be the least represented colors, and they were indeed near the bottom of the class counts (342 and 346), just above red (340). You can see in the Pareto charts below that the overall class sample is fairly even across the five colors, with red being the lowest and orange being the highest.


Summary Statistics for the Total Candies in Each Bag:

Mean is 60.7

Standard Deviation is 3.3

5-number Summary: Minimum = 55, Q1 = 58, Median = 61, Q3 = 62, Maximum = 70
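The five-number summary is enough to check whether a bag count is unusually high or low. The sketch below assumes the common 1.5 × IQR rule; the project itself does not say which outlier criterion was used, so treat the rule and the function name as assumptions.

```python
# Five-number summary for total candies per bag, as reported above.
minimum, q1, median, q3, maximum = 55, 58, 61, 62, 70

iqr = q3 - q1                  # interquartile range
lower_fence = q1 - 1.5 * iqr   # values below this are flagged
upper_fence = q3 + 1.5 * iqr   # values above this are flagged


def is_outlier(value):
    """Flag a value that falls outside the 1.5 * IQR fences."""
    return value < lower_fence or value > upper_fence


print("IQR:", iqr)
print("Fences:", lower_fence, upper_fence)
print("Is the 70-candy bag an outlier?", is_outlier(maximum))
print("Is a 64-candy bag an outlier?", is_outlier(64))
```

Under this rule the bag with 70 candies lies above the upper fence, while a bag with 64 candies does not.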


1. The histogram and boxplot above show the distribution of the total candies per bag for the classroom sample. In both graphs, you can see that the distribution is skewed to the right. One bag of Skittles had 70 candies in it, which is more than any other bag and makes it an outlier for the class sample. This bag of 70 candies pulled the mean to the right. The mean is 60.7, and the histogram shows that most of the bags in the class have totals near that value. I did not expect to see a gap in the histogram between 66 and 70, but a gap like this is consistent with the class data being a random sample. The classroom data is fairly close to my own bag of Skittles but does not match it exactly. My bag had 64 candies in it, which is more than most of the class. I was also the only one to have 64 candies in my bag, which I found interesting. My bag of 64 candies is closest to the third quartile (62), but it was still two away from that.

2. Categorical variables are also known as qualitative variables. They are not numerical and describe data that fit into categories. One example would be eye color, with categories such as blue, green, brown, and hazel. Quantitative data is data that can be ordered and measured. The number of candies in a bag of Skittles is quantitative, but the color of each candy is categorical, or qualitative. Pareto charts, bar graphs, and pie charts are the best ways to graph categorical data. The best ways to graph quantitative data are histograms, stem-and-leaf plots, bar graphs, boxplots, and dot plots, all of which show how the values of a variable are distributed. A bar graph could work for both quantitative and qualitative data; a pie chart, however, only makes sense for qualitative data, since it compares categories to the whole. It would effectively show the percentage of each color of Skittles in a bag (qualitative data), but not the number of Skittles in a bag (quantitative data).

Calculation-wise, the mean and median only make sense for quantitative data. The mean represents the average value in a sample (the mean for the class sample in this project is 60.7 candies per bag), so it applies to quantitative data. The median represents the middle value of a data set (in this case the median is 61) and also makes sense only when applied to quantitative data. The mode is the best measure of central tendency to use with categorical or qualitative data. You cannot find the average color or the median color of the candy in a Skittles bag, but you can establish which color occurs most often.
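As a small illustration of the mode for categorical data, the sketch below finds the most frequent color in my bag and in the class sample. It uses only the counts reported in this project; the helper name `modal_color` is illustrative.

```python
from collections import Counter

# Color counts reported in this project.
my_bag = Counter({"Red": 13, "Orange": 19, "Yellow": 17, "Green": 7, "Purple": 8})
class_counts = Counter({"Red": 340, "Orange": 370, "Yellow": 363, "Green": 346, "Purple": 342})


def modal_color(counts):
    """Return the (color, count) pair that occurs most often."""
    return counts.most_common(1)[0]


print("Mode of my bag:", modal_color(my_bag))
print("Mode of class sample:", modal_color(class_counts))
```

There is no comparable calculation for a mean or median color, because the color labels have no numeric value or natural order.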

Confidence Interval

A confidence interval is a range (or interval) of values used to estimate the true value of a population parameter. Statisticians use a confidence interval to describe the amount of uncertainty associated with a sample estimate of a population parameter. The interval is defined by the sample statistic plus or minus the margin of error (confidence interval = sample statistic ± margin of error). A confidence interval allows us to estimate the range in which the true population parameter falls, given what we know about the population from the sample we have observed. Because no estimate can be 100 percent reliable, we need to know how much confidence to place in it. For example, if repeated samples were taken and a 95% confidence interval was computed for each sample, about 95% of the intervals would contain the population mean. Confidence intervals can be computed for various parameters, not just the mean.
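The repeated-sampling interpretation can be illustrated with a short simulation. This is only a sketch under stated assumptions: the population is assumed to be normal, its mean and standard deviation are borrowed from the class sample (60.7 and 3.3), and 2.048 is the tabled t critical value for 95% confidence with 28 degrees of freedom. None of these population values are actually known.

```python
import random
import statistics

random.seed(1040)  # fixed seed so the run is repeatable

# Hypothetical population: normal, with mean and standard deviation
# borrowed from the class sample (an assumption, not a known fact).
TRUE_MEAN, TRUE_SD = 60.7, 3.3
N = 29            # bags per simulated sample
T_CRIT = 2.048    # t critical value for 95% confidence, 28 degrees of freedom
TRIALS = 2000     # number of simulated samples


def interval_contains_mean():
    """Draw one sample, build its 95% t-interval, and check coverage."""
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    x_bar = statistics.mean(sample)
    margin = T_CRIT * statistics.stdev(sample) / N ** 0.5
    return x_bar - margin <= TRUE_MEAN <= x_bar + margin


coverage = sum(interval_contains_mean() for _ in range(TRIALS)) / TRIALS
print(f"Share of intervals containing the true mean: {coverage:.3f}")
```

The printed share should land close to 0.95, though any single run will vary slightly from that value.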

1.)
n = 1761
x = 363
p̂ = 363/1761 = 0.206
z_{α/2} = 2.576
99% confidence that the population proportion of yellow candies is between 0.181 and 0.231
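The interval for the proportion of yellow candies can be reproduced from n and x alone. The sketch below assumes the usual normal-approximation formula, p̂ ± z·sqrt(p̂(1 − p̂)/n), and takes the critical value from Python's standard library rather than from a table; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 1761            # total candies in the class sample
x = 363             # yellow candies in the class sample
confidence = 0.99

p_hat = x / n
# Two-sided critical value z_{alpha/2} from the standard normal distribution.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
# Margin of error for a proportion under the normal approximation.
margin = z * sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - margin, p_hat + margin

print(f"p_hat = {p_hat:.3f}, z = {z:.3f}, margin = {margin:.3f}")
print(f"99% CI: ({lower:.3f}, {upper:.3f})")
```

Rounded to three decimals, the endpoints agree with the interval of 0.181 to 0.231 reported above.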

2.)
x̄ = 60.7
s = 3.3
n = 29
t_{α/2} = 2.048
95% confidence that the population mean for number of candies per bag is between 59.4 and 61.9
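The interval for the mean can be checked the same way with the t-interval formula, x̄ ± t·s/√n. This sketch takes the critical value 2.048 directly from the calculation above, since the Python standard library has no t distribution. Because the inputs are the rounded summary values, the endpoints may differ in the last digit from the interval reported above.

```python
from math import sqrt

x_bar = 60.7    # sample mean of candies per bag
s = 3.3         # sample standard deviation
n = 29          # number of bags
t_crit = 2.048  # t_{alpha/2} for 95% confidence with n - 1 = 28 degrees of freedom

# Margin of error for a mean when the population standard deviation is unknown.
margin = t_crit * s / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"margin of error = {margin:.2f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```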


Reflective Writing

Throughout this class I have learned many new concepts in statistics. I still remember looking at the instructions for this project and thinking to myself that I was reading a foreign language. In the beginning it seemed like it would be a very difficult project. However, as the class went on and I began to understand the concepts from each module better, I realized that I was not only capable of gradually understanding the instructions, but I was also able to perform the correct sequence of steps for each of the exercises in the project. We had to create a random sample of data, have that data organized, create graphs and charts, and interpret what the information meant. As in any other class, after learning the theory, practice makes the difference.

This project allowed us to put into practice key principles studied throughout the term, from using a sampling method to performing hypothesis testing. The most challenging aspect of the project was really understanding each concept and how it applied to the population of Skittles, not just our sample. This project has given me a better understanding of statistics and how it works in the real world. Statistics provides valuable information to society. Since I work in the field of GIS, I have found this class and statistics very beneficial to my job.

This project and the class in general have given me the tools to differentiate between valid and questionable information when analyzing things like graphs, confidence levels, and intervals. I can now understand the language behind the statistical analysis of studies, including simple but important terms such as median, range, mean, and mode that are so frequently used in so many instances.

Throughout this course I have learned how to apply statistical techniques in a daily life situation, as well as how to use a great program, StatCrunch, to create visual representations of the data. I was able to create graphs and find statistical data by simply inputting the data given. I have always struggled with math but found that this course allowed me to see how math does apply to real life.