sybil-williamson
+
CHAPTER 2 Descriptive Statistics
SECTION 2.1 FREQUENCY DISTRIBUTIONS
+Section 2.1: Frequency Distributions and Their Graphs
GOAL: explore many ways to organize and describe a data set
Center, variability (or spread), and shape
+FREQUENCY DISTRIBUTION
A table that shows classes or intervals of data entries with a count of the number of entries in each class. The frequency f of a class is the number of data entries in the class.
Frequency – how often
Distribution – how spread out/concentrated
Example: Pg. 40
+Example of a Frequency Distribution
Class      Frequency, f
1 – 5      5
6 – 10     8
11 – 15    6
16 – 20    8
21 – 25    5
26 – 30    4
Lower Class Limit – least number that can belong to a class
Upper Class Limit – greatest number that can belong to a class
Class Width – the distance between lower (or upper) limits of consecutive classes
Range – difference between the maximum and minimum data entries
+Guidelines for Creating a Frequency Distribution
1. Determine the range of the data.
2. Determine the number of classes to use.
3. Determine the class width.
4. Find Class Limits.
5. Find the Class Midpoints.
6. Find the Class Boundaries.
7. Tally up the data in each class.
8. Get the FREQUENCY for each class.
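The guidelines above can be sketched in Python. This is only a minimal sketch, not the textbook's procedure: the function name `frequency_distribution`, the `sample` data, and the width rule `range // classes + 1` (which assumes whole-number data and always rounds the width up so every entry fits) are assumptions made for illustration.

```python
def frequency_distribution(data, num_classes):
    """Return one row per class: (lower, upper, midpoint, boundaries, frequency)."""
    low, high = min(data), max(data)
    data_range = high - low                    # step 1: range
    width = data_range // num_classes + 1      # step 3: class width (assumed rule)
    rows = []
    for i in range(num_classes):
        lower = low + i * width                # step 4: class limits
        upper = lower + width - 1
        midpoint = (lower + upper) / 2         # step 5: class midpoint
        boundaries = (lower - 0.5, upper + 0.5)  # step 6: class boundaries
        # steps 7 and 8: tally the entries that fall inside the class limits
        frequency = sum(1 for x in data if lower <= x <= upper)
        rows.append((lower, upper, midpoint, boundaries, frequency))
    return rows


# Hypothetical whole-number data, invented only for this sketch.
sample = [0, 3, 5, 7, 8, 12, 15, 16, 20, 24, 31, 39]
for row in frequency_distribution(sample, 5):
    print(row)
```

With this sample the range is 39, so five classes get a width of 8 and the classes run 0 – 7, 8 – 15, and so on.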
+Definitions – Additional Features of Frequency Distributions
Class Midpoint – Sum of the lower and upper limits of a class divided by two (also known as class mark)
Relative Frequency – portion or percentage of the data that falls in that class. Take the frequency (f) divided by the sample size (n).
Cumulative Frequency – sum of the frequency for that class and all previous classes. The cumulative frequency of the last class is equal to the sample size n
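As a quick check of these definitions, the sketch below computes the midpoint, relative frequency, and cumulative frequency for the classes in the "Example of a Frequency Distribution" slide. The variable names are assumptions chosen for readability.

```python
# Class limits and frequencies from the example frequency distribution.
classes = [(1, 5), (6, 10), (11, 15), (16, 20), (21, 25), (26, 30)]
frequencies = [5, 8, 6, 8, 5, 4]

n = sum(frequencies)  # sample size

# Midpoint: (lower limit + upper limit) / 2
midpoints = [(lower + upper) / 2 for lower, upper in classes]

# Relative frequency: f / n
relative = [f / n for f in frequencies]

# Cumulative frequency: running total of the frequencies
cumulative = []
running_total = 0
for f in frequencies:
    running_total += f
    cumulative.append(running_total)

for (lower, upper), f, mid, rel, cum in zip(classes, frequencies, midpoints, relative, cumulative):
    print(f"{lower}-{upper}: f={f}, midpoint={mid}, relative={rel:.4f}, cumulative={cum}")
```

The last cumulative frequency equals n, and the relative frequencies add up to 1, which is a handy way to catch tally mistakes.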
+Class Example 1 Page 41
+Class Activity/HW
Pg. 51 #27, #28
We’ll be using these frequency distributions again, so make sure to hold onto them.
HAVE DONE FOR TOMORROW, WE NEED THEM!
DO ON SEPARATE PAPER
+Graphs of Frequency Distributions
Frequency Histogram – a bar graph that represents the frequency distribution of a data set
Properties of a Frequency Histogram
1. The horizontal scale is quantitative and measures the data values
2. The vertical scale measures the frequencies of the classes
3. Consecutive bars MUST touch
+Other Types of Graphs
FREQUENCY POLYGON
A line graph that emphasizes the continuous change in frequencies
RELATIVE FREQUENCY HISTOGRAM
Has the same shape/horizontal scale as the frequency histogram
Vertical scale measures RELATIVE frequencies
CUMULATIVE FREQUENCY GRAPH (OGIVE)
Line graph that displays the cumulative frequency of each class at its upper class boundary
+#27: Newspaper Reading Times (min)
Class      Frequency   Midpoint   Relative f   Cumulative f
0 – 7      8           3.5        0.32         8
8 – 15     8           11.5       0.32         16
16 – 23    3           19.5       0.12         19
24 – 31    3           27.5       0.12         22
32 – 39    3           35.5       0.12         25
n = 25
+Class Activity/HW
Using the Frequency Distribution you created for #28 from page 51, complete the following ON GRAPH PAPER:
1. Frequency Histogram
2. Frequency Polygon
3. Relative Frequency Histogram
4. Ogive
**MAKE SURE TO LABEL GRAPHS AND WRITE NEATLY!
(TURN IN WITH FREQUENCY DISTRIBUTION FOR WRITTEN FEEDBACK)
DUE TOMORROW!!!!
+#28 Book Spending Per Semester ($)
Class        Frequency   Midpoint   Relative f   Cumulative f
30 – 113     5           71.5       0.1724       5
114 – 197    7           155.5      0.2414       12
198 – 281    8           239.5      0.2759       20
282 – 365    2           323.5      0.0690       22
366 – 449    3           407.5      0.1034       25
450 – 533    4           491.5      0.1379       29
n = 29
+Pirate Baseball Activity: Due
Given: Pittsburgh Pirates Home Run Data 1961 – 2009
Using this data, create the following USING EIGHT CLASSES:
1. Frequency Distribution (including ALL parts and rel./cum. freq)
2. Frequency Histogram
3. Frequency Polygon
4. Relative Frequency Histogram
5. Ogive
Must include:
Title, Axis Labels, equal class widths
Evidence of ALL calculations (class widths, boundaries, midpoints)
Straight lines
Neatness
Straight Edge
Graph Paper
Then, using your phone or an iPad, look up home run data for 2010, 2011, 2012, 2013, 2014, and 2015.
Create a NEW Frequency Distribution
Two New Charts
Explain how this new data has changed the distribution (one paragraph)
THIS WILL BE GRADED.
Due:
Only given TODAY and TOMORROW to work in class.
+
Section 2.2: More Graphs and Displays
+Stem and Leaf Plot
Display for quantitative data
Gives the feel of a histogram while retaining data values
Easy way to sort data
Stem – the entry’s leftmost digits
Leaf – the entry’s rightmost digits
Examples 1 and 2 on Pages 55 – 56
Ordered/Unordered
MUST ALWAYS INCLUDE A KEY!
+Dot Plot
Each data entry is plotted, using a point, above a horizontal axis
Can see how data is distributed, see specific data entries, and identify unusual data values
Example 3 Pg. 57
+Graphing Qualitative Data Sets: Pie Charts
A circle that is divided into sectors that represent categories
Area of each sector is proportional to the category’s frequency
KEY: To find central angle: MULTIPLY RELATIVE FREQUENCY BY 360°
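The central-angle rule can be sketched as follows. The category names and counts are hypothetical, invented only to show the arithmetic; they do not come from the textbook examples.

```python
# Hypothetical category counts, invented only for illustration.
counts = {"Category A": 10, "Category B": 25, "Category C": 15}

total = sum(counts.values())

central_angles = {}
for category, frequency in counts.items():
    relative_frequency = frequency / total
    # Central angle = relative frequency x 360 degrees
    central_angles[category] = relative_frequency * 360

for category, angle in central_angles.items():
    print(f"{category}: {angle:.1f} degrees")
```

The angles should always add up to 360°, which is a quick check before drawing the sectors with a protractor.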
+Pareto Chart
A vertical bar graph where the height of each bar represents frequency or relative frequency
BARS ARE POSITIONED IN ORDER OF HIGHEST TO LOWEST
REMEMBER: Qualitative Data
Example 5 Page 59
+Graphing Paired Data Sets: Scatter Plot
Paired Data Sets: each entry in one data set corresponds to one entry in a second data set
Scatter Plot: ordered pairs are graphed as points in a coordinate plane
Use to SHOW THE RELATIONSHIP BETWEEN TWO QUANTITATIVE VARIABLES
Example 6 Page 60
+Time Series Chart
Used to graph a time series
Time series – data set composed of quantitative entries taken at regular intervals over a period of time
Example 7 Page 61
Scatter Plot: No Line
Time Series Chart: Connected data points
+GRADED ASSIGNMENT:
Individually, complete the following graphs from pages 64 – 65: #18, #20, #22, #24, #25, #29, #30
Must be handed in by the beginning of class on ________ (only ______ to work in class)
Will be graded for correctness and neatness
Use graph paper, ruler, protractor, and compass!
+
Section 2.3 - Measures of Central Tendency
+Measures of Central Tendency
MEAN, MEDIAN, MODE
Value that represents a TYPICAL, or CENTRAL, entry of the data
+Mean
Population Mean
μ = Σx / N
Sample Mean
x̄ = Σx / n
N = number of entries in a population
n = number of entries in a sample
+Example 1 Pg. 67
The prices (in dollars) for a sample of roundtrip flights from Chicago, Illinois to Cancun, Mexico are listed. What is the mean price of the flights?
872 432 397 427 388 782 397
WHEN CALCULATING GO ONE DECIMAL FURTHER THAN ORIGINAL DATA
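A short sketch of this example, using the sample mean formula x̄ = Σx / n and the rounding rule above (the data are whole dollars, so the mean is reported to one decimal place). The variable names are assumptions.

```python
# Flight prices (in dollars) from Example 1.
prices = [872, 432, 397, 427, 388, 782, 397]

n = len(prices)                # number of entries in the sample
mean_price = sum(prices) / n   # sample mean: sum of entries divided by n

# Round one decimal place further than the original whole-dollar data.
print(round(mean_price, 1))  # 527.9
```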
+Median
Value that lies in the middle of the data when the data is ORDERED
If the data set has an even number of entries, the median is the mean of the two middle data entries
Median divides a data set into TWO equal parts
EX: 4 5 6 8 10 14
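The median rule can be sketched like this. The helper name `median` is an assumption; Python's built-in `statistics.median` follows the same rule.

```python
def median(values):
    """Middle value of the ordered data (mean of the two middle values if n is even)."""
    ordered = sorted(values)   # the data must be ORDERED first
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


# The slide's example has six entries, so the median is the mean of 6 and 8.
print(median([4, 5, 6, 8, 10, 14]))  # 7.0

# With an odd number of entries (the flight prices), the median is the middle entry.
print(median([872, 432, 397, 427, 388, 782, 397]))  # 427
```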
+Mode
Most frequently occurring data point
If ALL occur only ONCE, then there is NO MODE
If two data entries are tied for the greatest frequency, then BOTH are modes and we have a BIMODAL DISTRIBUTION
If more than two modes, we have a MULTIMODAL DISTRIBUTION
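The mode rules above can be sketched with a small helper. The function name `modes` is an assumption, and it assumes a non-empty data set. It returns an empty list when every entry occurs once, to match the "NO MODE" rule.

```python
from collections import Counter


def modes(values):
    """Return every entry tied for the greatest frequency, or [] if all occur once."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # every entry occurs only once: no mode
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([872, 432, 397, 427, 388, 782, 397]))  # [397]
print(modes([4, 5, 6, 8, 10, 14]))                 # [] (no mode)
print(modes([1, 1, 2, 2, 3]))                      # [1, 2] (bimodal)
```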
+Note on Mode
Mode is the only measure of central tendency that MUST be an actual data point.
+Outlier
Data point that is far away from all of the other data points
+Assignment: Part 1 Section 2.3
Pg. 75 – 78 #18 - #34 even
Finding mean, median, and mode.
Label any outliers.
Use correct notation for mean.
(population mean vs. sample mean)
+Today’s Question: How can we describe the “middle” of unequal data?
You have $200 for 17 days, $300 for 5 days, and $150 for 9 days out of a month. What was your average amount of money for the month?
+Weighted Mean
A mean where each data point is not “worth” the same amount.
Entries have varying “weights”.
x̄ = Σ(x * w) / Σw
**Where w is the weight of each entry
+Example: Weighted Mean Vs. Regular Mean
Tests are worth 50% of overall grade, quizzes 30% and homework 20%.
You get 100 in HW, 90 on a quiz, and 80 on a test.
Calculate regular and weighted mean.
Why is one lower than the other?
+Example: Weighted Mean Vs. Regular Mean
You have $200 for 17 days, $300 for 5 days, and $150 for 9 days out of a month.
Calculate regular and weighted mean.
Why is one lower than the other?
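Both weighted mean examples can be worked with the formula x̄ = Σ(x * w) / Σw. This is a minimal sketch; the function name `weighted_mean` and the variable names are assumptions.

```python
def weighted_mean(values, weights):
    """Sum of (value x weight) divided by the sum of the weights."""
    return sum(x * w for x, w in zip(values, weights)) / sum(weights)


# Grade example: test, quiz, homework scores with weights 50%, 30%, 20%.
scores = [80, 90, 100]
grade_weights = [0.50, 0.30, 0.20]
print(round(sum(scores) / len(scores), 2))               # regular mean: 90.0
print(round(weighted_mean(scores, grade_weights), 2))    # weighted mean: 87.0

# Money example: each amount is weighted by the number of days it was held.
amounts = [200, 300, 150]
days = [17, 5, 9]
print(round(sum(amounts) / len(amounts), 2))             # regular mean: 216.67
print(round(weighted_mean(amounts, days), 2))            # weighted mean: 201.61
```

In both examples the weighted mean is lower because the largest weight sits on a lower value: the 80 on the test counts for half the grade, and the $200 amount covers 17 of the 31 days.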
+Mean of a Frequency Distribution
x̄ = Σ(x * f) / n
Where n = Σf, x is the class midpoint, and f is the frequency of each class
+Guidelines: Finding the Mean of a Frequency Distribution (Pg. 72)
1. Find the midpoint of each class.
2. Find the sum of the products of the midpoints and the frequencies: Σ(x * f)
3. Find the sum of the frequencies: n = Σf
4. Find the mean of the frequency distribution: x̄ = Σ(x * f) / n
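As a worked illustration, the sketch below applies these steps to the midpoints and frequencies from the #27 table (Newspaper Reading Times). The variable names are assumptions.

```python
# Midpoints and frequencies from the #27 frequency distribution.
midpoints = [3.5, 11.5, 19.5, 27.5, 35.5]
frequencies = [8, 8, 3, 3, 3]

# Sum of the products of the midpoints and the frequencies.
sum_xf = sum(x * f for x, f in zip(midpoints, frequencies))

# Sum of the frequencies.
n = sum(frequencies)

# Mean of the frequency distribution.
mean = sum_xf / n

print(sum_xf)          # 367.5
print(n)               # 25
print(round(mean, 1))  # 14.7
```

This is an approximation of the true mean, because each entry is treated as if it sat exactly at its class midpoint.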
+The Shape of Distributions (Pg. 73)
Symmetric – can be folded in the middle so that the two halves roughly match
Uniform – Rectangular, equal frequencies
Multimodal – More than one peak
Skewed – a “long tail” on one side
Direction of the skew is the side the tail is on.
Left skewed means the tail is on the left side
Right skewed means the tail is on the right side
+EXAMPLES: Page 73
Mean describes data best when data is symmetric.
Median describes data best when data is skewed or contains outliers.
Mode describes data best when data is nominal level of measurement.
+Assignment: Part 2 Section 2.3
Pg. 77 – 78 #41-#44, #46 - #48, #52- #54
THIS IS A LENGTHY ASSIGNMENT, GET STARTED ON IT!!!
+
Section 2.4: Measures of Variation
+Find the mean, median, and mode.
SET A: 37, 38, 39, 41, 41, 41, 42, 44, 45, 47
SET B: 23, 29, 32, 40, 41, 41, 48, 50, 52, 59
+
+Measures of Variation: Range, Deviation, Variance, Standard Deviation
Range = (Maximum Data Entry) – (Minimum Data Entry)
Range only uses two pieces of data
Variance and Standard Deviation use ALL entries of a data set
+Deviation
Deviation of an entry x in a POPULATION data set is the difference between the entry and the mean μ of the data set.
Deviation of x = x – μ (POPULATION)
Deviation of x = x – x̄ (SAMPLE)
DISTANCE FROM MEAN!
+Calculate Deviations of Company A (SET A)
37, 38, 39, 41, 41, 41, 42, 44, 45, 47
Find the sum of the deviations.
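A short sketch of this exercise, treating the ten entries as the whole data set. The variable names are assumptions.

```python
# Data for Company A (Set A).
set_a = [37, 38, 39, 41, 41, 41, 42, 44, 45, 47]

mean = sum(set_a) / len(set_a)  # 41.5

# Deviation of each entry: entry minus the mean.
deviations = [x - mean for x in set_a]

print(mean)             # 41.5
print(deviations)       # [-4.5, -3.5, -2.5, -0.5, -0.5, -0.5, 0.5, 2.5, 3.5, 5.5]
print(sum(deviations))  # 0.0
```

The deviations always sum to zero, because the positive and negative distances from the mean cancel. That is why the variance squares each deviation before adding them up.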
+POPULATION VARIANCE
For POPULATION DATA
σ^2 = Σ(x – μ)^2 / N
σ is the lowercase Greek letter Sigma
+Population Standard Deviation
Square root of the population variance: σ = √(σ^2)
Roughly the typical distance of the entries from the mean
Larger standard deviation means more spread out data.
+Sample Variance and Sample Standard Deviation
When using sample data, use x̄ instead of μ
Divide by n – 1 instead of N
Sample variance: s^2 = Σ(x – x̄)^2 / (n – 1)
Sample standard deviation: s = √(s^2)
+Calculate sample variance and standard deviation for Company B (SET B).
SET A: 37, 38, 39, 41, 41, 41, 42, 44, 45, 47
SET B: 23, 29, 32, 40, 41, 41, 48, 50, 52, 59
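The sample formulas can be sketched as follows: subtract the sample mean from each entry, square the deviations, add them up, and divide by n – 1. The function names are assumptions; Python's `statistics.variance` and `statistics.stdev` compute the same sample statistics.

```python
import math


def sample_variance(data):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)


def sample_std_dev(data):
    """Square root of the sample variance."""
    return math.sqrt(sample_variance(data))


set_a = [37, 38, 39, 41, 41, 41, 42, 44, 45, 47]
set_b = [23, 29, 32, 40, 41, 41, 48, 50, 52, 59]

print(round(sample_variance(set_a), 2), round(sample_std_dev(set_a), 2))  # 9.83 3.14
print(round(sample_variance(set_b), 2), round(sample_std_dev(set_b), 2))  # 124.72 11.17
```

Both sets have the same mean (41.5), but Set B has a much larger standard deviation, so its entries are far more spread out.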
+
+Assignment: Part 1 Section 2.4
Pg. 92 – 94 #1, 3, 13, 14, 19, 20
+How can we use standard deviation to make decisions about data?
Standard deviation and variance tell us how spread out the data is
+Empirical Rule (68-95-99.7 Rule)
In a BELL-SHAPED distribution,
1. ~68% of data is within 1 Standard Deviation of mean
2. ~95% of data is within 2 Standard Deviations of mean
3. ~99.7% of data is within 3 Standard Deviations of mean
+
+Example:
If 65 men’s heights have a bell-shaped distribution with a mean of 68 inches and a standard deviation of 2.5 inches, what percent of the men are between 68 and 73 inches tall?
How many men is that?
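One way to reason through this example: 73 inches is 2 standard deviations above the mean, and a bell-shaped distribution is symmetric, so the interval from the mean to 73 holds half of the ~95%. The sketch below assumes the upper value sits a whole number of standard deviations above the mean; the variable names are assumptions.

```python
mean = 68      # inches
sd = 2.5       # inches
n_men = 65
upper = 73     # inches

# How many standard deviations above the mean is the upper value?
k = (upper - mean) / sd  # 2.0

# Empirical Rule: approximate share of data within k standard deviations of the mean.
within = {1: 0.68, 2: 0.95, 3: 0.997}

# The interval runs from the mean up to k standard deviations above it,
# which is half of the symmetric "within k" interval.
share = within[int(k)] / 2

print(share)                 # 0.475, i.e. about 47.5%
print(round(share * n_men))  # about 31 men
```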
+Chebychev’s Theorem
In ANY distribution, the portion of data within k standard deviations of the mean (k > 1) is AT LEAST 1 – (1/k^2)
For k = 2: 1 – 1/4 = 3/4, so at least 75%
For k = 3: 1 – 1/9 = 8/9, so at least about 88.9%
+Example:
A sample of 40 runners in a 1 mile race gave a mean of 7 minutes with a standard deviation of 1.25 minutes. What can we say about how many people ran a mile in between 4.5 and 9.5 minutes?
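A sketch of the reasoning for this example: 4.5 and 9.5 minutes are each 2 standard deviations from the mean, so Chebychev's Theorem with k = 2 applies. The function name `chebyshev_minimum` is an assumption.

```python
def chebyshev_minimum(k):
    """Minimum portion of data within k standard deviations of the mean (k > 1)."""
    if k <= 1:
        raise ValueError("Chebychev's Theorem requires k > 1")
    return 1 - 1 / k ** 2


mean = 7.0        # minutes
sd = 1.25         # minutes
n_runners = 40
low, high = 4.5, 9.5

# Both ends of the interval are the same number of standard deviations from the mean.
k = (high - mean) / sd
assert k == (mean - low) / sd

portion = chebyshev_minimum(k)

print(k)                    # 2.0
print(portion)              # 0.75
print(portion * n_runners)  # 30.0, so AT LEAST 30 runners
```

Because the theorem holds for any distribution, the conclusion is only a lower bound: at least 30 of the 40 runners finished between 4.5 and 9.5 minutes.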
+Assignment: Part 2 Section 2.4
Pg. 95 – 97 #29 - #36 ONLY PART A
Pg. 88 has nice picture of Empirical Rule and Bell-Shaped Distributions
+
Section 2.5: Measures of Position
+Fractiles
Numbers that partition, or divide, an ORDERED data set into equal parts
Example: Median – Fractile that divides data set into two equal parts
+Quartiles
Three Quartiles: Q1, Q2, and Q3
Divide an ordered data set into four equal parts
Q1 – First Quartile – about one quarter of the data fall on or below Q1
Q2 – Second Quartile – about half of the data fall on or below Q2
Q2 is the MEDIAN of the data set
Q3 – Third Quartile – about ¾ of the data fall on or below Q3
+Interquartile Range
Difference between the third and first quartiles
IQR = Q3 – Q1
+Box-and-Whisker Plot
Five Number Summary: Minimum, Q1, Median, Q3, Maximum
5, 7, 9, 10, 11, 13, 14, 15, 16, 17, 18, 18, 20, 21, 37
What conclusion can we draw from graph?
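The five-number summary for this data set can be sketched as below. The quartile convention used here (Q1 and Q3 are the medians of the lower and upper halves, leaving out the overall median when n is odd) is an assumption; other conventions can give slightly different quartiles. The function names are also assumptions.

```python
def median_of_ordered(ordered):
    """Median of a list that is already sorted."""
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    # Skip the middle entry when n is odd so it is not counted in either half.
    upper_half = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return (
        ordered[0],
        median_of_ordered(lower_half),
        median_of_ordered(ordered),
        median_of_ordered(upper_half),
        ordered[-1],
    )


data = [5, 7, 9, 10, 11, 13, 14, 15, 16, 17, 18, 18, 20, 21, 37]
minimum, q1, q2, q3, maximum = five_number_summary(data)

print(minimum, q1, q2, q3, maximum)  # 5 10 15 18 37
print(q3 - q1)                       # IQR = 8
```

The right whisker (18 to 37) is much longer than the left one (5 to 10), so the entry 37 sits far from the rest of the data.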
+
+Assignment: Part 1 Section 2.5
Pg. 110 – 111 #17 - #20, #23, #26, #27, #28
+The Standard Score or Z-Score
Measures a data value’s position in the data set
The STANDARD SCORE or Z-SCORE represents the number of standard deviations a given value x falls from the mean μ. To find the z-score for a given value, use the following formula:
z = (Value – Mean) / (Standard Deviation) = (x – μ) / σ
+Z-Score
Can be POSITIVE, NEGATIVE, or ZERO
If z is NEGATIVE, then the corresponding x value is BELOW the mean.
If z is POSITIVE, then the corresponding x value is ABOVE the mean.
If z is ZERO, then the corresponding x value is the MEAN.
+Z-Score Example
Mean speed of vehicles is 56 MPH.
Standard Deviation of 4 MPH.
Car 1: 62 MPH
Car 2: 47 MPH
Car 3: 56 MPH
Calculate the z-score for Cars 1, 2, and 3.
Interpret this information.
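The z-score formula z = (x – μ) / σ applied to the three cars can be sketched as follows. The function name `z_score` is an assumption.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev


mean_speed = 56  # MPH
std_dev = 4      # MPH

speeds = {"Car 1": 62, "Car 2": 47, "Car 3": 56}
z_scores = {car: z_score(speed, mean_speed, std_dev) for car, speed in speeds.items()}

for car, z in z_scores.items():
    print(car, z)
# Car 1 1.5
# Car 2 -2.25
# Car 3 0.0
```

Car 1 is 1.5 standard deviations above the mean, Car 2 is 2.25 standard deviations below the mean, and Car 3 is travelling at exactly the mean speed.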
+
+Z-Scores PLUS the Empirical Rule
Empirical Rule: about 95% of data lies within 2 Standard Deviations of the mean
Z-Score: about 95% of data has a z-score between -2 and 2. These are usual scores.
A z-score less than -2 or greater than 2 we would consider unusual.
A z-score less than -3 or greater than 3 we would consider VERY unusual.
REMEMBER – BELL-Shaped for Empirical Rule
+Assignment: Part 2 Section 2.5
Pg. 111 - 112 #29 - #34
+Section 2.3 Part 1 (Mean, Median, Mode)
18. 6.2, 6, 5
20. 200.4, 186, none
22. 61.2, 55, 80 and 125
24. NP, NP, worse
26. NP, NP, domestic
28. 16.6, 15, none
30. 314.1, 374, none
32. 2.49, 2.35, 4.0
34. 213.4, 214, 217
Section 2.3 Part 2
41. 89
42. 36320
43. 612.73
44. 982.19
46. 84
47. 65
48. 69.7
52. Skewed Right
53. Symmetric
54. Uniform
Section 2.4 Part 1
1. R = 8, M = 7.9, V = 6.1, SD = 2.5
3. R = 12, M = 11.9, V = 17.1, SD = 4.1
19. LA: R = 17.6, V = 37.5, SD = 6.11 LB: R = 8.7, V = 8.71, SD = 2.95
20. Dallas: R = 18.1, V = 37.33, SD = 6.11 Houston: R = 13, V = 12.26, SD = 3.5
Section 2.4 Part 2
29. 68%
30. Between 1500 and 3300
31. a. 51, b. 17
32. a. 38, b. 19
33. 1000, 2000
34. 3325, 1490
35. 24
36. Sentences involving 54.97 and 59.17
Section 2.5 Part 1
17. None
18. SR
19. SL
20. S
23. Q1 = 2, Q2 = 4, Q3 = 5
26. Q1 = 15.125, Q2 = 15.8, Q3 = 17.65
27. a. 5, b. 50%, c. 25%
28. a. 17.65, b. 50%, c. 50%
Section 2.5 Part 2
31. Stats: 1.43, Bio: 0.77. Did better on Stats
32. Stats: -0.43, Bio: -0.77. Did better on Stats
33. Stats: 2.14, Bio: 1.54. Did better on Stats
34. Both 0. Both performed equally.