The Practice of Statistics, 5th Edition
Starnes, Tabor, Yates, Moore
Bedford Freeman Worth Publishers
CHAPTER 2 Modeling Distributions of Data 2.1 Describing Location in a Distribution
Learning Objectives
After this section, you should be able to:
• Percentiles: find and interpret the percentile of an individual value within a distribution of data.
• Cumulative relative frequency graph: estimate percentiles and individual values using a cumulative relative frequency graph.
• Z-score: find and interpret the standardized score (z-score) of an individual value within a distribution of data.
• Transforming data: describe the effect of adding, subtracting, multiplying by, or dividing by a constant on the shape, center, and spread of a distribution of data.
Describing Location in a Distribution
Measuring Position: Percentiles
One way to describe the location of a value in a distribution is to tell
what percent of observations are less than it.
The pth percentile of a distribution is the value with p percent of the
observations less than it.
6 | 7
7 | 2334
7 | 5777899
8 | 00123334
8 | 569
9 | 03
Jenny earned a score of 86 on her test. How did she perform
relative to the rest of the class? Michael got a 73 on his test.
How did he perform? And whose score is more unusual?
Example
Her score was greater than 21 of the 25
observations. Since 21 of the 25, or 84%, of the
scores are below hers, Jenny is at the 84th
percentile in the class’s test score distribution.
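The counting in this example can be checked with a short Python sketch. The scores are read off the stemplot (stems are tens, leaves are ones), and the helper name `percentile_of` is an illustrative assumption, not something from the textbook. It follows the definition used here: the percent of observations strictly less than the value.

```python
# Test scores read from the stemplot (stems are tens, leaves are ones).
scores = [67,
          72, 73, 73, 74,
          75, 77, 77, 77, 78, 79, 79,
          80, 80, 81, 82, 83, 83, 83, 84,
          85, 86, 89,
          90, 93]

def percentile_of(value, data):
    """Percent of observations strictly less than value (hypothetical helper)."""
    below = sum(1 for x in data if x < value)
    return 100 * below / len(data)

jenny = percentile_of(86, scores)    # 21 of the 25 scores are below 86
michael = percentile_of(73, scores)  # 2 of the 25 scores are below 73
print(jenny, michael)
```

Under this definition Michael is at the 8th percentile, because only 2 of the 25 scores fall below 73.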
Practice: Wins in Major League Baseball
• The stemplot below shows the number of wins for each of the 30
Major League Baseball teams in 2012.
Problem: Find the percentiles for the following teams:
(a) The Minnesota Twins, who won 66 games.
(b) The Washington Nationals, who won 98 games.
(c) The Texas Rangers and Baltimore Orioles, who both won 93
games.
Cumulative Relative Frequency Graphs
Percentiles can also be read from a graph. A cumulative relative
frequency graph displays the cumulative relative frequency of each
class of a frequency distribution.
Age of First 44 Presidents When They Were Inaugurated
Age    Frequency  Relative frequency  Cumulative frequency  Cumulative relative frequency
40-44  2          2/44 = 4.5%         2                     2/44 = 4.5%
45-49  7          7/44 = 15.9%        9                     9/44 = 20.5%
50-54  13         13/44 = 29.5%       22                    22/44 = 50.0%
55-59  12         12/44 = 27.3%       34                    34/44 = 77.3%
60-64  7          7/44 = 15.9%        41                    41/44 = 93.2%
65-69  3          3/44 = 6.8%         44                    44/44 = 100%
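As a minimal sketch, the columns of the presidents table can be rebuilt from the class frequencies alone. The variable names are illustrative assumptions; only the class labels and frequencies come from the table.

```python
# Class frequencies from the table of presidents' inauguration ages.
classes = ["40-44", "45-49", "50-54", "55-59", "60-64", "65-69"]
freqs = [2, 7, 13, 12, 7, 3]

total = sum(freqs)  # 44 presidents in all
rows = []
running = 0         # cumulative frequency so far
for label, f in zip(classes, freqs):
    running += f
    # (class, frequency, relative %, cumulative frequency, cumulative relative %)
    rows.append((label, f, 100 * f / total, running, 100 * running / total))

for label, f, rel, cum, cum_rel in rows:
    print(f"{label}  {f:2d}  {rel:5.1f}%  {cum:2d}  {cum_rel:5.1f}%")
```

Each cumulative relative frequency is the running total divided by 44, so the last class must reach 100%. For the 55-59 class the relative frequency works out to 12/44, about 27.3%.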
[Figure: cumulative relative frequency graph. Horizontal axis: Age at inauguration (40 to 70). Vertical axis: Cumulative relative frequency (%) (0 to 100).]
Practice: State median household incomes
• The table and cumulative relative frequency graph below show the
distribution of median household incomes for the 50 states and the
District of Columbia in a recent year.
• Problem: Use the cumulative relative frequency graph for the state
income data to answer each question.
• (a) At what percentile is California, with a median household income
of $57,445?
• (b) Estimate and interpret the first quartile of this distribution.
Median income ($1000s)  Frequency  Relative frequency  Cumulative frequency  Cumulative relative frequency
35 to < 40              1          1/51 = 0.020        1                     1/51 = 0.020
40 to < 45              10         10/51 = 0.196       11                    11/51 = 0.216
45 to < 50              14         14/51 = 0.275       25                    25/51 = 0.490
50 to < 55              12         12/51 = 0.235       37                    37/51 = 0.725
55 to < 60              5          5/51 = 0.098        42                    42/51 = 0.824
60 to < 65              6          6/51 = 0.118        48                    48/51 = 0.941
65 to < 70              3          3/51 = 0.059        51                    51/51 = 1.000
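Reading a cumulative relative frequency graph by eye amounts to interpolating along the line segments between class boundaries. The sketch below does that with the table values. It assumes incomes are spread evenly within each class, and the `interpolate` helper is hypothetical, so the results are estimates rather than exact percentiles.

```python
# Class boundaries ($1000s) and the cumulative relative frequency reached
# at each boundary; 35 is the lower bound of the first class.
bounds = [35, 40, 45, 50, 55, 60, 65, 70]
cum_rel = [0.0, 1/51, 11/51, 25/51, 37/51, 42/51, 48/51, 51/51]

def interpolate(x, xs, ys):
    """Linear interpolation of y at x; xs must be increasing (hypothetical helper)."""
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            frac = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + frac * (ys[i + 1] - ys[i])
    raise ValueError("x is outside the range of xs")

# (a) Approximate percentile for a median household income of $57,445.
california = 100 * interpolate(57.445, bounds, cum_rel)

# (b) First quartile: swap the axes and look up the income at 25%.
first_quartile = interpolate(0.25, cum_rel, bounds)

print(round(california, 1), round(first_quartile, 1))
```

Under these assumptions California lands at roughly the 77th percentile, and the first quartile is about $45,600: roughly 25% of states have a median household income below that amount.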
Measuring Position: z-Scores
A z-score tells us how many standard deviations from the mean an
observation falls, and in what direction.
If x is an observation from a distribution that has known mean and
standard deviation, the standardized score of x is:
$$z = \frac{x - \text{mean}}{\text{standard deviation}}$$
A standardized score is often called a z-score.
Example
Jenny earned a score of 86 on her test. The class mean is 80 and
the standard deviation is 6.07. What is her standardized score?
$$z = \frac{x - \text{mean}}{\text{standard deviation}} = \frac{86 - 80}{6.07} = 0.99$$
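The same calculation can be sketched in Python and applied to Michael's score of 73 from the earlier percentile example, which helps with the question of whose score is more unusual. The function name `z_score` is an illustrative assumption, and the sketch assumes both students are in the same class, with mean 80 and standard deviation 6.07.

```python
def z_score(x, mean, sd):
    """Signed number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

class_mean, class_sd = 80, 6.07  # values given in the example

jenny_z = z_score(86, class_mean, class_sd)    # above the mean, so positive
michael_z = z_score(73, class_mean, class_sd)  # below the mean, so negative
print(round(jenny_z, 2), round(michael_z, 2))
```

Jenny's score is about 0.99 standard deviations above the mean, and Michael's is about 1.15 standard deviations below it. By this measure Michael's score is slightly farther from the mean, so it is the more unusual of the two.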
Year  Player        HR  Mean  SD
1927  Babe Ruth     60  7.2   9.7
1961  Roger Maris   61  18.8  13.4
1998  Mark McGwire  70  20.7  12.7
2001  Barry Bonds   73  21.4  13.2
PRACTICE: The single-season home run record for Major League
Baseball has been set just three times since Babe Ruth hit 60 home runs
in 1927. Roger Maris hit 61 in 1961, Mark McGwire hit 70 in 1998, and
Barry Bonds hit 73 in 2001. In an absolute sense, Barry Bonds had the
best performance of these four players, because he hit the most home
runs in a single season. However, in a relative sense, this may not be true.
Baseball historians suggest that hitting a home run has been easier in
some eras than others. This is due to many factors, including quality of
batters, quality of pitchers, hardness of the baseball, dimensions of
ballparks, and possible use of performance-enhancing drugs. To make a
fair comparison, we should see how these performances rate relative to
those of other hitters during the same year.
Problem: Compute the standardized scores for each performance using
the information in the table. Which player had the most outstanding
performance relative to his peers?
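One way to organize the comparison is sketched below in Python. The data come from the table of home run totals, where Mean and SD are taken to describe the hitters in that same year. The variable names are illustrative assumptions.

```python
# (player, year, home runs, mean for that year, SD for that year)
seasons = [
    ("Babe Ruth", 1927, 60, 7.2, 9.7),
    ("Roger Maris", 1961, 61, 18.8, 13.4),
    ("Mark McGwire", 1998, 70, 20.7, 12.7),
    ("Barry Bonds", 2001, 73, 21.4, 13.2),
]

# Standardize each record against the hitters of the same year.
z_scores = {player: (hr - mean) / sd for player, _, hr, mean, sd in seasons}

for player, z in z_scores.items():
    print(f"{player}: z = {z:.2f}")

# The largest z-score marks the performance farthest above that year's mean.
best = max(z_scores, key=z_scores.get)
print("Largest z-score:", best)
```

With these numbers Babe Ruth's 1927 season has the largest standardized score (about 5.44), ahead of Bonds (about 3.91), McGwire (about 3.88), and Maris (about 3.15), even though Bonds hit the most home runs.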
Homework
• Page 100 #1-18 odd
• Extra Credit Project: Due October 3rd
– Recommended websites: www.censusatschool.com &
www.gapminder.com
– Individual work
– Extra credit: will replace your lowest POP Quiz grade
– Must make a poster and present the results to the class
• Chapter 2 Quiz: October 15th