Chapter 3 Describing Data Using Numerical Measures
Slide 2
Chapter 3 - Chapter Outcomes
After studying the material in this chapter, you should be able to:
- Compute the mean, median, and mode for a set of data and understand what these values represent.
- Compute the range, variance, and standard deviation and know what these values mean.
- Construct a box and whiskers graph and interpret it.
Slide 3
Chapter 3 - Chapter Outcomes (continued)
After studying the material in this chapter, you should be able to:
- Compute the coefficient of variation and z-scores and understand how they are applied in decision-making situations.
- Use numerical measures along with graphs, charts, and tables to effectively describe data.
Slide 4
Parameters and Statistics
A parameter is a measure computed from the entire population. As long as the population does
not change, the value of the parameter will not change.
Slide 5
Parameters and Statistics
A statistic is a measure computed from a sample that has been selected from the population.
The value of the statistic will depend on which sample is selected.
Slide 6
Mean
The mean is a numerical measure of the center of a set of quantitative values, computed by
dividing the sum of the values by the number of values in the data set.
Slide 7
Population Mean
\[ \mu = \frac{\sum_{i=1}^{N} x_i}{N} \]
where:
\( \mu \) = population mean (mu)
\( N \) = number of data values
\( x_i \) = \( i \)th individual value of variable \( x \)
Slide 8
Population Mean Example 3-1 Table 3-1: Foster City Hotel
Data
Slide 9
Population Mean Example 3-1 The population mean for the number
of rooms rented is computed as follows:
Slide 10
Sample Mean
\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \]
where:
\( \bar{x} \) = sample mean (pronounced x-bar)
\( n \) = sample size
\( x_i \) = \( i \)th individual value of variable \( x \)
Slide 11
Sample Mean: Housing Prices Example
{x_i} = {house prices} = {$144,000; 98,000; 204,000; 177,000; 155,000; 316,000; 100,000}
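The slide lists the prices but the computed value did not survive extraction. As a minimal sketch, assuming the seven prices are treated as a sample and using illustrative variable names (`house_prices`, `x_bar`) that are not from the slides, the sample mean can be computed as:

```python
from statistics import mean

# House prices from the slide (in dollars), treated as a sample of n = 7.
house_prices = [144_000, 98_000, 204_000, 177_000, 155_000, 316_000, 100_000]

n = len(house_prices)
x_bar = sum(house_prices) / n  # sum of the values divided by the number of values

# The standard library's mean() should agree with the hand computation.
assert abs(x_bar - mean(house_prices)) < 1e-9

print(f"n = {n}, sum = {sum(house_prices):,}, sample mean = {x_bar:,.2f}")
# prints: n = 7, sum = 1,194,000, sample mean = 170,571.43
```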
Slide 12
Median
The median is the center value that divides data that have been arranged in numerical order
(i.e., an ordered array) into two halves.
Slide 13
Median: Housing Prices Example
{x_i} = {house prices} = {$144,000; 98,000; 204,000; 177,000; 155,000; 316,000; 100,000}
Ordered array: $98,000; 100,000; 144,000; 155,000; 177,000; 204,000; 316,000
Middle value: Median = 155,000
Slide 14
Median: Another Housing Prices Example
{x_i} = {house prices} = {$144,000; 98,000; 204,000; 177,000; 155,000; 316,000; 100,000;
177,000; 177,000; 170,000}
Ordered array: $98,000; 100,000; 144,000; 155,000; 170,000; 177,000; 177,000; 177,000;
204,000; 316,000
Middle values: Median = (170,000 + 177,000)/2 = 173,500
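The two housing examples differ only in whether the number of values is odd or even. A short sketch of that rule, using a hypothetical helper `median_of` (not from the slides) and checking it against Python's `statistics.median`:

```python
from statistics import median

def median_of(values):
    """Return the middle value of the ordered array (average of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                      # odd n: single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even n: average the two middle values

seven_prices = [144_000, 98_000, 204_000, 177_000, 155_000, 316_000, 100_000]
ten_prices = seven_prices + [177_000, 177_000, 170_000]

print(median_of(seven_prices))  # 155000
print(median_of(ten_prices))    # 173500.0

# Cross-check against the standard library.
assert median_of(seven_prices) == median(seven_prices)
assert median_of(ten_prices) == median(ten_prices)
```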
Slide 15
Skewed Data
Right-skewed data: Data are right-skewed if the mean for the data is larger than the median.
Left-skewed data: Data are left-skewed if the mean for the data is smaller than the median.
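To see the mean-versus-median rule in action, the sketch below applies it to the two housing price sets used elsewhere in this chapter. The function name `skew_direction` is illustrative only, and the classification follows the comparison stated on this slide rather than any formal skewness statistic:

```python
from statistics import mean, median

def skew_direction(values):
    """Classify data by comparing the mean with the median (the rule on this slide)."""
    m, med = mean(values), median(values)
    if m > med:
        return "right-skewed"
    if m < med:
        return "left-skewed"
    return "mean equals median"

seven_prices = [144_000, 98_000, 204_000, 177_000, 155_000, 316_000, 100_000]
ten_prices = seven_prices + [177_000, 177_000, 170_000]

print(skew_direction(seven_prices))  # right-skewed (mean is about 170,571; median is 155,000)
print(skew_direction(ten_prices))    # left-skewed (mean is 171,800; median is 173,500)
```

Adding three houses near $177,000 pulls the median above the mean, which is why the second set is classified differently under this rule.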
Slide 16
Skewed Data (Figure 3-3)
[Figure 3-3, three panels: a) Right-Skewed, median to the left of the mean; b) Left-Skewed,
mean to the left of the median; c) Symmetric, mean = median.]
Slide 17
Mode
The mode is the value in a data set that occurs most frequently. A data set may have more
than one mode if two or more values tie for the highest frequency. A data set might not have
a mode at all if no value occurs more than one time.
Slide 18
Mode: Housing Prices Example
{x_i} = {house prices} = {$144,000; 98,000; 204,000; 177,000; 155,000; 316,000; 100,000;
177,000; 177,000; 170,000}
Data array: $98,000; 100,000; 144,000; 155,000; 170,000; 177,000; 177,000; 177,000;
204,000; 316,000
Mean = 1,718,000/10 = 171,800
Median = 173,500
Mode = 177,000
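The three measures for this example can be reproduced with a few lines of Python. The helper `modes` below is a hypothetical name; it is written to follow the slide's definition, returning every value tied for the highest frequency and an empty list when no value repeats (Python's built-in `statistics.multimode` would instead return all values in that case):

```python
from collections import Counter
from statistics import mean, median

def modes(values):
    """Return all values tied for the highest frequency, or [] if no value repeats."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []  # no value occurs more than once, so there is no mode
    return sorted(v for v, c in counts.items() if c == top)

house_prices = [144_000, 98_000, 204_000, 177_000, 155_000,
                316_000, 100_000, 177_000, 177_000, 170_000]

print(mean(house_prices))    # 171800
print(median(house_prices))  # 173500.0
print(modes(house_prices))   # [177000]
```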
Slide 19
Percentiles
The pth percentile in a data array is a value that divides the data into two parts. The lower
segment contains at least p% and the upper segment contains at least (100 - p)% of the data.
The median is the 50th percentile.
Slide 20
Quartiles
Quartiles in a data array are those values that divide the data set into four equal-sized
groups. The median corresponds to the second quartile.
Slide 21
Measures of Variation
A set of data exhibits variation if not all of the data values are the same.
Slide 22
Range
The range is a measure of variation that is computed by finding the difference between the
maximum and minimum values in the data set.
R = Maximum Value - Minimum Value
Slide 23
Interquartile Range
The interquartile range is a measure of variation that is determined by computing the
difference between the third and first quartiles.
Interquartile Range = Third Quartile - First Quartile
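As an illustration of the range and interquartile range, the sketch below uses the ten housing prices from the earlier examples. It relies on `statistics.quantiles` with its default settings, which is an assumption on my part: several conventions for locating quartiles exist, and the textbook's method may give somewhat different first and third quartiles. The second quartile should match the median in any case.

```python
from statistics import quantiles

house_prices = [144_000, 98_000, 204_000, 177_000, 155_000,
                316_000, 100_000, 177_000, 177_000, 170_000]

# Range: maximum value minus minimum value.
data_range = max(house_prices) - min(house_prices)

# Quartiles: three cut points that split the ordered data into four groups.
# NOTE: uses Python's default quantile convention, which may differ from the textbook's.
q1, q2, q3 = quantiles(house_prices, n=4)

# Interquartile range: third quartile minus first quartile.
iqr = q3 - q1

print(f"Range = {data_range:,}")
print(f"Q1 = {q1:,.0f}, Q2 (median) = {q2:,.0f}, Q3 = {q3:,.0f}")
print(f"Interquartile range = {iqr:,.0f}")
```

Because the interquartile range ignores the lowest and highest quarters of the data, it is much less affected by the $316,000 house than the range is.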
Slide 24
Variance & Standard Deviation
The population variance is the average of the squared distances of the data values from the
population mean.
The standard deviation is the positive square root of the variance.
Slide 25
Population Variance
\[ \sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} \]
where:
\( \mu \) = population mean
\( N \) = population size
\( \sigma^2 \) = population variance (sigma squared)
Slide 26
Population Variance (Bryce Lumber Example)
Slide 27
Population Standard Deviation (Bryce Lumber Example)
Slide 28
Sample Variance
\[ s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} \]
where:
\( \bar{x} \) = sample mean
\( n \) = sample size
\( s^2 \) = sample variance
Slide 29
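Sample Standard Deviation
\[ s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} \]
where:
\( \bar{x} \) = sample mean
\( n \) = sample size
\( s \) = sample standard deviation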
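The only computational difference between the population and sample versions is the divisor (N versus n - 1). The sketch below makes that concrete. The Bryce Lumber data are not reproduced in this transcript, so the data set here is made up purely for illustration, and all variable names are my own:

```python
from math import sqrt
from statistics import pvariance, pstdev, variance, stdev

# Hypothetical data chosen for easy arithmetic (NOT the Bryce Lumber data).
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
center = sum(data) / n                                  # mean = 5.0
squared_deviations = sum((x - center) ** 2 for x in data)  # sum of squared distances = 32.0

population_variance = squared_deviations / n        # divide by N      -> 4.0
sample_variance = squared_deviations / (n - 1)      # divide by n - 1  -> 32/7

population_sd = sqrt(population_variance)           # positive square root -> 2.0
sample_sd = sqrt(sample_variance)

# Cross-check the hand formulas against the standard library.
assert abs(population_variance - pvariance(data)) < 1e-9
assert abs(sample_variance - variance(data)) < 1e-9
assert abs(population_sd - pstdev(data)) < 1e-9
assert abs(sample_sd - stdev(data)) < 1e-9

print(population_variance, population_sd)  # 4.0 2.0
```

Treating the same eight values as a sample rather than a population gives a slightly larger variance, because the sum of squared deviations is divided by 7 instead of 8.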
Slide 30
Coefficient of Variation
The coefficient of variation is the ratio of the standard deviation to the mean, expressed as
a percentage. The coefficient of variation is used to measure the relative variation in the
data.
Slide 31
Coefficient of Variation
Population Coefficient of Variation:
\[ CV = \frac{\sigma}{\mu} (100)\% \]
Sample Coefficient of Variation:
\[ CV = \frac{s}{\bar{x}} (100)\% \]
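A brief sketch of both versions of the coefficient of variation follows. The data set is hypothetical (chosen so the population figures come out to round numbers), and `coefficient_of_variation` is an illustrative helper name rather than anything defined in the slides:

```python
from statistics import mean, pstdev, stdev

def coefficient_of_variation(std_dev, mean_value):
    """Standard deviation as a percentage of the mean."""
    return std_dev / mean_value * 100

# Hypothetical data for illustration: mean = 5, population standard deviation = 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]

cv_population = coefficient_of_variation(pstdev(data), mean(data))  # sigma / mu * 100
cv_sample = coefficient_of_variation(stdev(data), mean(data))       # s / x-bar * 100

print(f"Population CV = {cv_population:.1f}%")  # Population CV = 40.0%
print(f"Sample CV     = {cv_sample:.1f}%")
```

Because the result is a percentage of the mean, it can be compared across data sets measured in different units or on very different scales.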
Slide 32
The Empirical Rule
If the data distribution is bell-shaped, then the interval:
- \( \mu \pm 1\sigma \) contains approximately 68% of the values in the population or the sample
- \( \mu \pm 2\sigma \) contains approximately 95% of the values in the population or the sample
- \( \mu \pm 3\sigma \) contains virtually all of the data values in the population or the sample
For a sample, use \( \bar{x} \pm ks \) in place of \( \mu \pm k\sigma \).
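The percentages in the Empirical Rule can be checked numerically under one extra assumption that the slide does not state: that "bell-shaped" means exactly normally distributed. For a normal distribution, the proportion of values within k standard deviations of the mean is erf(k / sqrt(2)). The function name `normal_coverage` is illustrative:

```python
from math import erf, sqrt

def normal_coverage(k):
    """Proportion of a normal distribution lying within k standard deviations of its mean."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {normal_coverage(k):.4f}")
# within 1 standard deviation(s): 0.6827
# within 2 standard deviation(s): 0.9545
# within 3 standard deviation(s): 0.9973
```

Real data that are only roughly bell-shaped will match these figures approximately, which is why the rule is stated with "approximately" and "virtually all".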
Slide 33
The Empirical Rule (Figure 3-11)
[Figure 3-11: bell-shaped curve over x, marking the intervals that contain 68% and 95% of the values.]
Slide 34
Tchebysheff's Theorem
Regardless of how the data are distributed, at least \( (1 - 1/k^2) \) of the values will fall
within \( k \) standard deviations of the mean. For example:
- At least \( (1 - 1/1^2) = 0 = 0\% \) of the values will fall within k = 1 standard deviation of the mean
- At least \( (1 - 1/2^2) = 3/4 = 75\% \) of the values will fall within k = 2 standard deviations of the mean
- At least \( (1 - 1/3^2) = 8/9 \approx 89\% \) of the values will fall within k = 3 standard deviations of the mean
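The bound is simple enough to tabulate directly. The sketch below uses a hypothetical function name, `tchebysheff_lower_bound`, and clamps the result at zero for k below 1, where the formula would otherwise go negative and the theorem says nothing useful:

```python
def tchebysheff_lower_bound(k):
    """Minimum proportion of values within k standard deviations of the mean, for any distribution."""
    if k <= 0:
        raise ValueError("k must be positive")
    # For k < 1 the formula is negative, so the only guaranteed proportion is 0.
    return max(0.0, 1 - 1 / k**2)

for k in (1, 2, 3):
    print(f"k = {k}: at least {tchebysheff_lower_bound(k):.1%}")
# k = 1: at least 0.0%
# k = 2: at least 75.0%
# k = 3: at least 88.9%
```

These guarantees are much weaker than the Empirical Rule's 68%, 95%, and nearly 100%, which is the price of making no assumption about the shape of the distribution.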
Slide 35
Standardized Data Values
A standardized data value refers to the number of standard deviations a value is from the
mean. The standardized data values are sometimes referred to as z-scores.
Slide 36
Standardized Data Values
STANDARDIZED POPULATION DATA
\[ z = \frac{x - \mu}{\sigma} \]
where:
\( x \) = original data value
\( \mu \) = population mean
\( \sigma \) = population standard deviation
\( z \) = standard score (number of standard deviations \( x \) is from \( \mu \))
Slide 37
Standardized Data Values
STANDARDIZED SAMPLE DATA
\[ z = \frac{x - \bar{x}}{s} \]
where:
\( x \) = original data value
\( \bar{x} \) = sample mean
\( s \) = sample standard deviation
\( z \) = standard score
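Both standardization formulas have the same shape: subtract the mean, then divide by the standard deviation. A small sketch with made-up data (mean 5, population standard deviation 2) and an illustrative helper `z_score`:

```python
from statistics import mean, pstdev, stdev

def z_score(x, center, spread):
    """Number of standard deviations that x lies from the mean."""
    return (x - center) / spread

# Hypothetical data for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Treating the data as a population: use mu and sigma.
mu, sigma = mean(data), pstdev(data)
z_population = [z_score(x, mu, sigma) for x in data]

# Treating the data as a sample: use x-bar and s.
x_bar, s = mean(data), stdev(data)
z_sample = [z_score(x, x_bar, s) for x in data]

print(z_population)  # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```

A value of 9 is two population standard deviations above the mean, while a value of 2 is one and a half below it; the sign of z shows the direction.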
Slide 38
Key Terms
Coefficient of Variation, Data Array, Empirical Rule, Interquartile Range, Left-Skewed Data,
Mean, Median, Parameter, Percentiles, Quartiles, Range, Right-Skewed Data, Skewed Data,
Standard Deviation, Standardized Data Values, Statistic
Slide 39
Key Terms (continued)
Symmetric Data, Tchebysheff's Theorem, Variance, Variation