Box Plots & Cumulative Frequency Graphs
Box Plots
• AKA: box and whiskers plots
• Graphical display of the 5 number summary:
– Minimum value
– Lower quartile (Q1)
– Median (Q2)
– Upper quartile (Q3)
– Maximum value
Example. Draw a box plot for the following data:
8 2 3 9 6 5 3 2
2 6 2 5 4 5 5 6
1. Determine the 5 number summary.
2. Create an appropriate number line. Be sure to label the axis.
3. Plot the 5 number summary & connect the appropriate points using a straight edge.
Min: 2 Q1: 2.5 Q2: 5 Q3: 6 Max: 9
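The summary above can be checked with a short sketch. It assumes the data set is the sixteen values 8 2 3 9 6 5 3 2 2 6 2 5 4 5 5 6 and that quartiles are taken as the medians of the lower and upper halves of the sorted data (a common textbook convention; other quartile conventions can give slightly different values). The function name `five_number_summary` is illustrative only.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves convention."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]            # values below the median position
    upper = data[half + (n % 2):]  # values above it (middle value skipped when n is odd)
    return data[0], median(lower), median(data), median(upper), data[-1]

# Assumed reading of the example data: sixteen values
data = [8, 2, 3, 9, 6, 5, 3, 2, 2, 6, 2, 5, 4, 5, 5, 6]
summary = five_number_summary(data)
print(summary)  # (2, 2.5, 5.0, 6.0, 9)
```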
Outliers
• Outliers are noted on box plots with an asterisk or a dot
• Use the 1.5 × IQR criterion
• It is possible to have more than one outlier at either end
• Each whisker extends to the last value that is not an outlier
Example. Draw a box plot for the following data.
1 3 3 5 6 7 7 7 8 8
8 8 9 9 10 10 12 13 14 16
1. Determine the 5 number summary.
2. Test for outliers.
3. Create an appropriate number line. Be sure to label the axis.
4. Plot the 5 number summary & connect the appropriate points using a straight edge.
1. Determine the 5 number summary.
Min: 1 Q1: 6.5 Q2: 8 Q3: 10 Max: 16
2. Test for outliers.
IQR = Q3 – Q1 = 10 – 6.5 = 3.5
Lower boundary: Q1 – 1.5 × IQR = 6.5 – 5.25 = 1.25
Upper boundary: Q3 + 1.5 × IQR = 10 + 5.25 = 15.25
1 is below the lower boundary and 16 is above the upper boundary, so both are outliers.
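The outlier test can be sketched in a few lines. This assumes the same median-of-halves quartile convention as the summary above; the helper name `quartiles` is illustrative only. It also finds where each whisker should end, which is the last value that is not an outlier.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    return median(data[:half]), median(data[half + (n % 2):])

data = [1, 3, 3, 5, 6, 7, 7, 7, 8, 8,
        8, 8, 9, 9, 10, 10, 12, 13, 14, 16]

q1, q3 = quartiles(data)
iqr = q3 - q1
lower_boundary = q1 - 1.5 * iqr
upper_boundary = q3 + 1.5 * iqr

# Values outside the boundaries are outliers; whiskers stop at the
# most extreme values that are still inside the boundaries.
outliers = [x for x in data if x < lower_boundary or x > upper_boundary]
inside = [x for x in data if lower_boundary <= x <= upper_boundary]
whisker_ends = (min(inside), max(inside))

print(lower_boundary, upper_boundary)  # 1.25 15.25
print(outliers)                        # [1, 16]
print(whisker_ends)                    # (3, 14)
```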
How can we make the calculator do this for us?
1. “Stat”, “Edit”: enter data into L1
2. “2nd” “y=” (“Stat Plot”)
3. Select the plot you’d like to use
4. Turn plot “on”
5. Select the type you’d like to use
6. List: L1 (or applicable list)
7. “Graph”
Tips:
- Make sure there aren’t equations in “y=” that will interfere with your box plot
- You can use the calculator to check the box plot that you create
Interpreting Box Plots
• 25% of values are between smallest value & lower quartile (lower whisker)
• 25% of values are between lower quartile & median (lower small rectangle)
• 25% of values are between median & upper quartile (upper small rectangle)
• 25% of values are between upper quartile & largest value (upper whisker)
• 50% of values lie between lower & upper quartiles (entire rectangle)
A set of data with a symmetric distribution will have a symmetric box plot. The whiskers are the same length and the median is in the center of the box.
A set of data which is positively skewed will have a positively skewed box plot. The right whisker is longer than the left whisker and the median line is toward the left side of the box.
A set of data which is negatively skewed will have a negatively skewed box plot. The left whisker is longer than the right whisker and the median line is toward the right side of the box.
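As a rough sketch, the three descriptions above can be turned into a rule of thumb that works from a 5 number summary. The function name `describe_shape` and the three sample summaries are made up for illustration; real data is rarely exactly symmetric, so in practice the comparison is judged by eye rather than by strict equality.

```python
def describe_shape(minimum, q1, med, q3, maximum):
    """Classify a box plot's shape from whisker lengths and median position."""
    left_whisker = q1 - minimum
    right_whisker = maximum - q3
    left_box = med - q1    # part of the box to the left of the median line
    right_box = q3 - med   # part of the box to the right of the median line

    if left_whisker == right_whisker and left_box == right_box:
        return "symmetric"
    if right_whisker > left_whisker and right_box >= left_box:
        return "positively skewed"
    if left_whisker > right_whisker and left_box >= right_box:
        return "negatively skewed"
    return "unclear"

# Hypothetical 5 number summaries, chosen only to illustrate each case
print(describe_shape(0, 2, 5, 8, 10))   # symmetric
print(describe_shape(1, 2, 3, 6, 12))   # positively skewed
print(describe_shape(0, 6, 9, 10, 11))  # negatively skewed
```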
Parallel Box Plots
• A visual comparison of the distribution of two data sets
• Can easily compare descriptive statistics, such as median, range & IQR
Example. A hospital is trialing a new anesthetic drug and has collected data on how long the new and old drugs take before the patient becomes unconscious. They wish to know which drug is more reliable.
Using the box plots below, compare the two drugs for speed and reliability.
Speed: Using the median, 50% of the time the new drug takes 9 seconds or less, compared with 10 seconds for the old drug. We conclude that the new drug is generally a little quicker.
Reliability:
Old drug: Range = 21 – 5 = 16; IQR = 12.5 – 8 = 4.5
New drug: Range = 12 – 7 = 5; IQR = 10 – 8 = 2
The new drug times are less “spread out” than the old drug times. The new drug is more reliable.
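The comparison above is simple arithmetic on the two 5 number summaries. A minimal sketch, using the values quoted in the worked answer (times in seconds); the dictionary layout and the helper name `spread` are illustrative only:

```python
# 5 number summaries read from the parallel box plots (seconds)
old_drug = {"min": 5, "q1": 8, "median": 10, "q3": 12.5, "max": 21}
new_drug = {"min": 7, "q1": 8, "median": 9, "q3": 10, "max": 12}

def spread(summary):
    """Return (range, IQR) for a 5 number summary."""
    return summary["max"] - summary["min"], summary["q3"] - summary["q1"]

old_range, old_iqr = spread(old_drug)
new_range, new_iqr = spread(new_drug)

print(old_range, old_iqr)  # 16 4.5
print(new_range, new_iqr)  # 5 2

# Speed: lower median means quicker. Reliability: smaller spread means more consistent.
new_is_quicker = new_drug["median"] < old_drug["median"]
new_is_more_reliable = new_range < old_range and new_iqr < old_iqr
```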
Cumulative Frequency Graphs
• Cumulative frequency: the sum of all the frequencies up to and including the new value
Race finishing time t    Frequency   Cumulative Frequency
2 h 26 ≤ t < 2 h 28      8           8
2 h 28 ≤ t < 2 h 30      3           8 + 3 = 11
2 h 30 ≤ t < 2 h 32      9           11 + 9 = 20
2 h 32 ≤ t < 2 h 34      11          20 + 11 = 31
2 h 34 ≤ t < 2 h 36      12          31 + 12 = 43
2 h 36 ≤ t < 2 h 38      7           43 + 7 = 50
2 h 40 ≤ t < 2 h 42      5           50 + 5 = 55
2 h 42 ≤ t < 2 h 48      8           55 + 8 = 63
2 h 48 ≤ t < 2 h 56      6           63 + 6 = 69
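The cumulative frequency column is a running total of the frequency column. A minimal sketch using the race frequencies from the table:

```python
from itertools import accumulate

# Frequencies from the race finishing time table, in row order
frequencies = [8, 3, 9, 11, 12, 7, 5, 8, 6]

# accumulate() yields the running total after each row
cumulative = list(accumulate(frequencies))

print(cumulative)      # [8, 11, 20, 31, 43, 50, 55, 63, 69]
print(cumulative[-1])  # 69, the total number of competitors
```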
• To draw a cumulative frequency graph:
– scale and label the axes correctly (variable on the x-axis & cumulative frequency on the y-axis)
– plot the first point (lowest bound, 0)
– plot the middle points (upper bound, corresponding cumulative frequency)
– plot the last point (highest bound, total frequency)
– connect the points with a smooth curve
Example. A supermarket is open 24 hours a day and has free parking. The number of parked cars each hour is monitored over a period of several days. Organize this information into a cumulative frequency table. Then draw a graph of the cumulative frequency.
# of cars parked per hour (n)   Frequency
0 ≤ n < 50                      6
50 ≤ n < 100                    23
100 ≤ n < 150                   41
150 ≤ n < 200                   42
200 ≤ n < 250                   30
250 ≤ n < 300                   24
300 ≤ n < 350                   9
350 ≤ n < 400                   5
# of cars parked per hour (n)   Frequency   Cumulative frequency
0 ≤ n < 50                      6           6
50 ≤ n < 100                    23          29
100 ≤ n < 150                   41          70
150 ≤ n < 200                   42          112
200 ≤ n < 250                   30          142
250 ≤ n < 300                   24          166
300 ≤ n < 350                   9           175
350 ≤ n < 400                   5           180
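The points to plot for this example follow directly from the drawing steps: start at (lowest bound, 0), then pair each interval's upper bound with its cumulative frequency. A minimal sketch (variable names are illustrative):

```python
from itertools import accumulate

# Interval boundaries and frequencies from the parking table
bounds = [0, 50, 100, 150, 200, 250, 300, 350, 400]
frequencies = [6, 23, 41, 42, 30, 24, 9, 5]

cumulative = list(accumulate(frequencies))

# First point is (lowest bound, 0); every other point is
# (upper bound of the interval, cumulative frequency so far).
points = [(bounds[0], 0)] + list(zip(bounds[1:], cumulative))

for x, y in points:
    print(x, y)
```

The last point printed is (400, 180), the highest bound paired with the total frequency.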
Interpreting Cumulative Frequency Graphs
• To find the median, find the value that is 50% of the total frequency. Start at this value on the y-axis, move across to the curve, and then down to the x-axis to read off the median.
• To find the lower quartile, Q1, find the value that is 25% of the total frequency. Start at this value on the y-axis, move across to the curve, and then down to the x-axis to read off Q1.
• To find the upper quartile, Q3, find the value that is 75% of the total frequency. Start at this value on the y-axis, move across to the curve, and then down to the x-axis to read off Q3.
• To find the interquartile range subtract the lower quartile from the upper quartile: IQR = Q3 – Q1.
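Reading a graph by eye can be approximated numerically. The sketch below uses the supermarket parking data and assumes the plotted points are joined by straight line segments rather than a smooth curve, so its answers are estimates that may differ slightly from values read off a hand-drawn graph. The function name `value_at` is illustrative only.

```python
def value_at(points, target):
    """Return the x-value where the cumulative frequency reaches target,
    interpolating linearly between neighbouring plotted points."""
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c0 <= target <= c1 and c1 > c0:
            return x0 + (x1 - x0) * (target - c0) / (c1 - c0)
    raise ValueError("target is outside the range of cumulative frequencies")

# (upper bound, cumulative frequency) points for the parking example
points = [(0, 0), (50, 6), (100, 29), (150, 70), (200, 112),
          (250, 142), (300, 166), (350, 175), (400, 180)]
total = points[-1][1]  # 180

q1 = value_at(points, 0.25 * total)      # cumulative frequency 45
median = value_at(points, 0.50 * total)  # cumulative frequency 90
q3 = value_at(points, 0.75 * total)      # cumulative frequency 135
iqr = q3 - q1

print(round(q1, 1), round(median, 1), round(q3, 1), round(iqr, 1))
# 119.5 173.8 238.3 118.8
```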
Example. Use the cumulative frequency graph to estimate the:
i. median finishing time
ii. number of competitors who finished in less than 2 hours 35 minutes
iii. percentage of competitors who took more than 2 hours 39 minutes to finish
iv. time taken by a competitor who finished in the top 20% of runners completing the marathon
i. Median = 50th percentile. 50% of 69 = 34.5. Start with a cumulative frequency of 34.5 & find the corresponding time: 2 hours 34.5 minutes.
ii. Start on the x-axis at 2 hours 35 minutes & find the corresponding cumulative frequency: 37 people.
iii. Start on the x-axis at 2 hours 39 minutes & find the corresponding cumulative frequency: 52 people took < 2 h 39 min, so 69 – 52 = 17 people took longer. 17 ÷ 69 ≈ 24.6%.
iv. 20% of 69 = 13.8. Find the time corresponding to a cumulative frequency of 13.8: 2 hours 31 minutes.
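Parts i, ii and iv can be sanity-checked against the race table, assuming straight line segments between the plotted points instead of a smooth curve (so small differences from the graph readings are expected). Times are stored as minutes past 2 hours, with interval bounds copied from the table as printed. Part iii starts from the graph reading of 52 quoted in the answers. The function names are illustrative only.

```python
# (minutes past 2 h, cumulative frequency), bounds as printed in the table
points = [(26, 0), (28, 8), (30, 11), (32, 20), (34, 31),
          (36, 43), (38, 50), (42, 55), (48, 63), (56, 69)]
total = points[-1][1]  # 69 competitors

def count_below(t):
    """Estimated number of competitors with finishing time below t."""
    for (t0, c0), (t1, c1) in zip(points, points[1:]):
        if t0 <= t <= t1:
            return c0 + (c1 - c0) * (t - t0) / (t1 - t0)
    raise ValueError("time is outside the table")

def time_at(c):
    """Estimated time by which c competitors have finished."""
    for (t0, c0), (t1, c1) in zip(points, points[1:]):
        if c0 <= c <= c1 and c1 > c0:
            return t0 + (t1 - t0) * (c - c0) / (c1 - c0)
    raise ValueError("count is outside the table")

median_time = time_at(0.5 * total)        # i.   about 34.6 min past 2 h
finished_by_35 = count_below(35)          # ii.  37.0
pct_over_39 = (total - 52) / total * 100  # iii. uses the graph reading of 52
top_20_time = time_at(0.2 * total)        # iv.  about 30.6 min past 2 h

print(round(median_time, 1), finished_by_35,
      round(pct_over_39, 1), round(top_20_time, 1))
# 34.6 37.0 24.6 30.6
```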