johnathan-newton
Boxplots (Box and Whisker Plots)
Boxplot and Modified Boxplot
[Figure: HusbandsAndWives box plot of Age_Wife, axis from 15 to 65]
[Figure: HusbandsAndWives box plot of Ht_Husband, axis from 1550 to 1950]
Each section of a boxplot contains 25% of the data.
How to make a Boxplot
• Find the 5 number summary of the data
– Minimum
– Q1 (Quartile 1, 25th percentile)
– Median (50th percentile)
– Q3 (Quartile 3, 75th percentile)
– Maximum
• Scale the axis so all numbers fit appropriately
• Make the box span the quartiles
• Draw a line in the box marking the median
• Extend lines (“whiskers”) to the minimum and maximum
– Modified Boxplot: If there are outliers, extend whiskers to the smallest and largest values that aren’t outliers and put dots where the outliers lie
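The drawing steps above can be sketched as a one-line text picture. This is only an illustration: the function name `ascii_boxplot`, the `width` parameter, and the five-number summary passed in at the end are made up for the example, and the sketch assumes the maximum is larger than the minimum.

```python
def ascii_boxplot(minimum, q1, median, q3, maximum, width=41):
    """Draw a one-line text boxplot from a five-number summary."""
    span = maximum - minimum  # assumes maximum > minimum

    def col(value):
        # Scale the axis: map a data value to a column from 0 to width - 1.
        return round((value - minimum) / span * (width - 1))

    line = [" "] * width
    # Whiskers run from the minimum to Q1 and from Q3 to the maximum.
    for i in range(col(minimum), col(q1)):
        line[i] = "-"
    for i in range(col(q3) + 1, col(maximum) + 1):
        line[i] = "-"
    # The box spans the quartiles.
    for i in range(col(q1), col(q3) + 1):
        line[i] = "="
    line[col(q1)] = "["
    line[col(q3)] = "]"
    # A mark inside the box shows the median.
    line[col(median)] = "|"
    return "".join(line)


# Hypothetical five-number summary, chosen so each section is the same width.
print(ascii_boxplot(0, 10, 20, 30, 40))
# prints ----------[=========|=========]----------
```

Each of the four sections (whisker, half box, half box, whisker) holds 25% of the data, even when the sections have very different widths.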
Finding the Median & Quartiles
• To find the median of a set of data:
– Order the data from least to greatest
– The median is the middle number
– If there is an even number of numbers and there is no one middle number, then average the two middle numbers
• To find the Quartiles:
– Q1 is the median of the lower half of the data
– Q3 is the median of the top half of the data
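The rules above translate almost directly into code. The sketch below is one possible reading, not the only one: the names `median` and `quartiles` are chosen for the example, it assumes at least two data values, and it assumes that when the count is odd the overall median is left out of both halves (other quartile conventions exist and can give slightly different values).

```python
def median(values):
    """Middle value of the ordered data; average the two middle values if the count is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: for an odd count, the overall median belongs to neither half.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]    # values below the middle
    upper = ordered[-half:]   # values above the middle
    return median(lower), median(upper)


# Made-up data with an odd count: the middle value 7 is the median.
odd_data = [7, 1, 3, 9, 5, 11, 13]
print(median(odd_data), quartiles(odd_data))    # prints 7 (3, 11)

# Made-up data with an even count: average the two middle values.
even_data = [1, 3, 5, 7]
print(median(even_data), quartiles(even_data))  # prints 4.0 (2.0, 6.0)
```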
Finding Outliers
IQR (Interquartile Range) = Q3 – Q1
An outlier on the low end is any point lower than Q1 – 1.5(IQR)
An outlier on the high end is any point higher than Q3 + 1.5(IQR)
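As a small sketch of the 1.5(IQR) rule, the two cutoffs can be computed from Q1 and Q3 and then used to filter the data. The function names `outlier_fences` and `find_outliers` and the numbers in the example are hypothetical, picked so the arithmetic is easy to follow.

```python
def outlier_fences(q1, q3):
    """Return the low and high cutoffs: Q1 - 1.5(IQR) and Q3 + 1.5(IQR)."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(values, q1, q3):
    """Keep only the points strictly below the low cutoff or above the high cutoff."""
    low, high = outlier_fences(q1, q3)
    return [v for v in values if v < low or v > high]


# Hypothetical quartiles: IQR = 20 - 10 = 10, so the cutoffs are -5 and 35.
print(outlier_fences(10, 20))                  # prints (-5.0, 35.0)
print(find_outliers([-6, 0, 12, 36], 10, 20))  # prints [-6, 36]
```

A point that lands exactly on a cutoff is not flagged, since the rule says "lower than" and "higher than".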
Make and compare Boxplots: Poverty Rates in the Eastern US

Southern          Poverty (%)    Northern          Poverty (%)
Maryland          6.1            New Hampshire     4.3
Delaware          6.5            Wisconsin         5.6
Florida           9.0            Connecticut       6.2
North Carolina    9.0            New Jersey        6.3
Georgia           9.9            Vermont           6.3
Tennessee         10.3           Indiana           6.7
South Carolina    10.7           Massachusetts     6.7
Alabama           12.5           Michigan          7.4
Kentucky          12.7           Maine             7.8
Virginia          13.9           Ohio              7.8
West Virginia     13.9           Pennsylvania      7.8
Mississippi       16.0           Illinois          7.8
                                 Rhode Island      8.9
                                 New York          11.5
5 Number Summary & Outliers
Southern States
• Min: 6.1
• Q1: 9.0
• Median: 10.5
• Q3: 13.3
• Max: 16.0
• Outliers: < 9.0 – 1.5(13.3 – 9.0)
• < 2.55 so none on low end
• OR > 13.3 + 1.5(13.3 – 9.0)
• > 19.75 so none on high end
Northern States
• Min: 4.3
• Q1: 6.3
• Median: 7.05
• Q3: 7.8
• Max: 11.5
• Outliers: < 6.3 – 1.5(7.8 – 6.3)
• < 4.05 so none on low end
• OR > 7.8 + 1.5(7.8 – 6.3)
• > 10.05 so NY on high end since it is 11.5
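The summaries above can be double-checked with a short script. This is a sketch under stated assumptions: `boxplot_summary` is a name invented for the example, the quartiles are taken as medians of the lower and upper halves of the ordered data, and results are rounded for display because decimal values such as 13.3 are not stored exactly in floating point.

```python
from statistics import median

# Poverty rates (%) from the two columns of the table.
south = [6.1, 6.5, 9.0, 9.0, 9.9, 10.3, 10.7, 12.5, 12.7, 13.9, 13.9, 16.0]
north = [4.3, 5.6, 6.2, 6.3, 6.3, 6.7, 6.7, 7.4, 7.8, 7.8, 7.8, 7.8, 8.9, 11.5]


def boxplot_summary(values):
    """Five-number summary, outlier cutoffs, whisker ends, and outliers."""
    ordered = sorted(values)
    half = len(ordered) // 2
    q1 = median(ordered[:half])    # median of the lower half
    q3 = median(ordered[-half:])   # median of the upper half
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers of a modified boxplot stop at the most extreme non-outliers.
    inside = [v for v in ordered if low_fence <= v <= high_fence]
    return {
        "min": ordered[0],
        "q1": q1,
        "median": median(ordered),
        "q3": q3,
        "max": ordered[-1],
        "fences": (low_fence, high_fence),
        "whiskers": (inside[0], inside[-1]),
        "outliers": [v for v in ordered if v < low_fence or v > high_fence],
    }


for name, data in (("Southern", south), ("Northern", north)):
    s = boxplot_summary(data)
    five = [round(s[k], 2) for k in ("min", "q1", "median", "q3", "max")]
    fences = tuple(round(f, 2) for f in s["fences"])
    print(name, five, "cutoffs:", fences,
          "whiskers:", s["whiskers"], "outliers:", s["outliers"])
```

For the Northern list the only outlier reported is 11.5 (New York), so the whiskers of a modified boxplot would end at 4.3 and 8.9. The Southern list has no outliers, so its whiskers reach the minimum and maximum.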
Boxplots in Calculator
• Enter data into List (Stat Edit)
• Choose 1st boxplot option in StatPlot
• Choose the list you used for Xlist
• Choose 1 for Freq or a 2nd list if data is stored in two lists (values in one, frequency in another)
• Zoom 9 will scale it for you to see the graph
• Press Trace and the arrow keys to see the five number summary and any outliers
Measures of Center
Mean (x̄, μ): add up data values and divide by number of data values
Median (M): list data values in order, locate middle data value; average middle 2 if necessary
Data Set: 19, 20, 20, 21, 22
Mean = 20.4; Median = 20
Data Set: 19, 20, 20, 21, 38
Mean = 23.6; Median = 20
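The two data sets can be checked with Python's standard `statistics` module. This is just a quick verification sketch; the variable names `typical` and `with_extreme` are made up for the example.

```python
from statistics import mean, median

typical = [19, 20, 20, 21, 22]
with_extreme = [19, 20, 20, 21, 38]   # same data with 22 replaced by 38

for data in (typical, with_extreme):
    # Mean: sum of the values divided by how many there are.
    # Median: middle value of the ordered data.
    print(data, "mean =", mean(data), "median =", median(data))

# prints [19, 20, 20, 21, 22] mean = 20.4 median = 20
# prints [19, 20, 20, 21, 38] mean = 23.6 median = 20
```

Changing one value from 22 to 38 moves the mean by 3.2 but leaves the median at 20.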
Robust (Resistant) Statistic
• Robust or resistant: value doesn’t change dramatically when extreme values (including outliers) are added to (or taken out of) the data set.
– Median is resistant.
– Mean is NOT resistant against extreme values. Mean is pulled away from the center of the distribution toward the extreme value (“tails of graph”).
Mean or Median?
Measures of Center on Different Distribution Shapes
[Figure: three distributions: skewed to the left, symmetric, skewed to the right]
In each of the graphs, decide where the mean, median, and mode are relative to one another.
Remember the mean is pulled toward extreme values.
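One way to explore the question is to compute all three measures for small data sets of each shape. The data below are made up for illustration, and the ordering they show is a common pattern for skewed data, not a guarantee for every data set.

```python
from statistics import mean, median, mode

# Made-up data with a long tail on the right.
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]
# Mirror image of the same data, so the long tail is on the left.
left_skewed = [16 - v for v in right_skewed]
# Made-up symmetric data centered on 3.
symmetric = [1, 2, 3, 3, 3, 4, 5]

for name, data in (("left", left_skewed),
                   ("symmetric", symmetric),
                   ("right", right_skewed)):
    print(name, "mean =", mean(data), "median =", median(data), "mode =", mode(data))

# prints left mean = 11.4 median = 13.0 mode = 14
# prints symmetric mean = 3 median = 3 mode = 3
# prints right mean = 4.6 median = 3.0 mode = 2
```

In both skewed examples the mean sits farthest out toward the tail, the mode sits at the peak, and the median falls between them.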