[Figure: Percent of Students with Different Majors. Bar chart; horizontal axis: Majors (Business, Engineering, Liberal Arts, Education, Science, Social Sciences); vertical axis: Percentages.]
3-3: Measures of Variation
Objective: To describe data using measures of variation, such as the range, variance, and standard deviation.
A testing lab wishes to test two experimental brands of outdoor paint to see how long each will last before fading. The testing lab makes 6 gallons of each paint to test. Since different chemical agents are added to each group and only six cans are involved, these two groups constitute two small populations. The results (in months) are shown. Find the mean of each group.
[Figure: Variation in Paint (in months). Dot plots of the fading times for Brand A and Brand B on a common axis from 10 to 60 months.]
Three common measures of spread or variability of a set of data:
◦ Range
◦ Variance
◦ Standard Deviation
Brand A Brand B
10 35
60 45
50 30
30 35
40 40
20 25
Range for set A: 60 – 10 = 50 months
Range for set B: 45 – 25 = 20 months
Rounding Rule for the Standard Deviation: Same as for the mean. Round to one more decimal place than the original data.
Find the variance and the standard deviation for the fading time of paint.
Brand A: 10, 60, 50, 30, 40, 20
Step 1: Find the mean: (10 + 60 + 50 + 30 + 40 + 20)/6 = 210/6 = 35 months
Step 2: Subtract the Mean from each data point.
10 – 35 = -25
60 – 35 = +25
50 – 35 = 15
30 – 35 = -5
40 – 35 = +5
20 – 35 = -15
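Step 3: Square each result.
X – μ              Square
10 – 35 = -25      625
60 – 35 = +25      625
50 – 35 = +15      225
30 – 35 = -5       25
40 – 35 = +5       25
20 – 35 = -15      225
Step 4: Find the sum of the squares: 625 + 625 + 225 + 25 + 25 + 225 = 1750
Step 5: Divide the sum by N to get the variance: 1750/6 ≈ 291.7
Step 6: Take the square root of the variance to get the standard deviation: √291.7 ≈ 17.1 months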
Find the variance and standard deviation for Brand B: 35, 45, 30, 35, 40, 25
1. Find the mean.
2. Subtract mean from each data value.
3. Square each result.
4. Find the sum of the squares.
5. Divide sum by N to get variance.
6. Take square root to get standard deviation.
Column A: X     Column B (step 2): X – μ     Column C (step 3): (X – μ)²
35              35 – 35 = 0                  0² = 0
45              45 – 35 = 10                 10² = 100
30              30 – 35 = -5                 (-5)² = 25
35              35 – 35 = 0                  0² = 0
40              40 – 35 = 5                  5² = 25
25              25 – 35 = -10                (-10)² = 100
1. Calculate the mean: 210/6 = 35 months
4. Find the sum of column C: 0 + 100 + 25 + 0 + 25 + 100 = 250
5. Divide the sum (step 4) by N to get the variance: 250/6 ≈ 41.7
6. Take the square root of the variance (step 5) to get the standard deviation: √(250/6) ≈ 6.5 months
Compare set A to set B
Any conclusions? (see slide 10)
Set A Set B
Variance 291.7 41.7
Standard Deviation 17.1 6.5
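The six steps above can be sketched in Python as a check on the comparison table. This is a minimal sketch, not part of the original slides; the function name `population_variance` is an assumption chosen for illustration. It treats each brand as a small population, so it divides by N.

```python
import math

def population_variance(values):
    """Population variance: the mean of the squared deviations from the mean."""
    n = len(values)
    mu = sum(values) / n                                   # step 1: mean
    squared_deviations = [(x - mu) ** 2 for x in values]   # steps 2 and 3
    return sum(squared_deviations) / n                     # steps 4 and 5

# Fading times in months, taken from the slides.
brand_a = [10, 60, 50, 30, 40, 20]
brand_b = [35, 45, 30, 35, 40, 25]

for name, data in (("A", brand_a), ("B", brand_b)):
    variance = population_variance(data)
    std_dev = math.sqrt(variance)                          # step 6
    print(f"Set {name}: variance = {variance:.1f}, standard deviation = {std_dev:.1f}")
# Set A: variance = 291.7, standard deviation = 17.1
# Set B: variance = 41.7, standard deviation = 6.5
```

Both brands have the same mean (35 months), so the smaller variance for Brand B is what shows that its fading times are more consistent.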
Variance: The average of the squared distances of each value from the mean.
Symbol: σ²
Population Variance: $\sigma^2 = \frac{\sum (X - \mu)^2}{N}$
Where
X: individual value
μ: population mean
N: population size
Computational Formula for s² and s
Variance: $s^2 = \frac{\sum X^2 - (\sum X)^2 / n}{n - 1}$
Standard Deviation: $s = \sqrt{\frac{\sum X^2 - (\sum X)^2 / n}{n - 1}}$
Example 3-23, p. 121
Use the computational formulas for s and s² to find the standard deviation and the variance for the amount of European auto sales (in millions) for a sample of 6 years shown: 11.2, 11.9, 12.0, 12.8, 13.4, 14.3
Answers: s²=1.28 million
s = 1.13 million
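As a check on these answers, the computational formula can be sketched in Python. This is an illustrative sketch only; the function name `sample_variance_shortcut` is an assumption and does not come from the textbook.

```python
import math

def sample_variance_shortcut(values):
    """Sample variance via the computational formula:
    s^2 = (sum of X^2 - (sum of X)^2 / n) / (n - 1)."""
    n = len(values)
    sum_x = sum(values)
    sum_x_squared = sum(x * x for x in values)
    return (sum_x_squared - sum_x ** 2 / n) / (n - 1)

# European auto sales (in millions) for the 6 sampled years.
sales = [11.2, 11.9, 12.0, 12.8, 13.4, 14.3]

s2 = sample_variance_shortcut(sales)
s = math.sqrt(s2)
print(f"s^2 = {s2:.2f}, s = {s:.2f}")
# s^2 = 1.28, s = 1.13
```

Because the data are a sample, the divisor is n – 1 rather than N.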
Variance and Standard Deviation for Grouped Data
Procedure for finding the variance and standard deviation for grouped data is similar to that for finding the mean for grouped data: use the midpoint.
Procedure for Finding the Sample Variance and Standard Deviation for Grouped Data
1) Make a table with the following columns:
   A: Class     B: Frequency ($f$)     C: Midpoint ($X_m$)     D: $f \cdot X_m$     E: $f \cdot X_m^2$
2) Multiply: Frequency * Midpoint (column D)
3) Multiply: Frequency * Midpoint squared (column E)
4) Total columns B, D, and E.
◦ Total of B is $n$.
◦ Total of D is $\sum f \cdot X_m$
◦ Total of E is $\sum f \cdot X_m^2$
Grouped Data-Variance & Standard Deviation cont’d
5) Substitute the values from step 4 into
$s^2 = \frac{\sum f \cdot X_m^2 - (\sum f \cdot X_m)^2 / n}{n - 1}$
6) Take the square root of the variance (step 5) to find the standard deviation.
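The grouped-data procedure can be sketched in Python as follows. The frequency table below is made up for illustration and does not come from the slides; the variable names are likewise assumptions. The sketch builds columns D and E, totals columns B, D, and E, and substitutes the totals into the formula from step 5.

```python
import math

# Hypothetical frequency table: (lower boundary, upper boundary, frequency).
classes = [
    (5.5, 10.5, 1),
    (10.5, 15.5, 2),
    (15.5, 20.5, 3),
    (20.5, 25.5, 5),
    (25.5, 30.5, 4),
    (30.5, 35.5, 3),
    (35.5, 40.5, 2),
]

n = 0               # total of column B
sum_f_xm = 0.0      # total of column D
sum_f_xm_sq = 0.0   # total of column E

for lower, upper, f in classes:
    xm = (lower + upper) / 2      # column C: class midpoint
    n += f
    sum_f_xm += f * xm            # column D: f * Xm
    sum_f_xm_sq += f * xm ** 2    # column E: f * Xm^2

# Step 5: substitute the totals into the grouped-data formula.
s2 = (sum_f_xm_sq - sum_f_xm ** 2 / n) / (n - 1)
# Step 6: the standard deviation is the square root of the variance.
s = math.sqrt(s2)

print(f"n = {n}, total D = {sum_f_xm}, total E = {sum_f_xm_sq}")
print(f"s^2 = {s2:.1f}, s = {s:.1f}")
# n = 20, total D = 490.0, total E = 13310.0
# s^2 = 68.7, s = 8.3
```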
Uses of Variance and Standard Deviation
Determine the spread of the data (large values mean the data are fairly spread out).
Determine the consistency of a variable (for example, nut and bolt diameters must have a small variance and standard deviation).
Determine how many data values fall within a certain interval (Chebyshev: at least 75% fall within 2 standard deviations of the mean).
Used in inferential statistics (we’ll see how later).
Coefficient of Variation
Allows comparison of data with different units (number of sales per salesperson vs. commissions made by salesperson).
Coefficient of Variation:
◦ Denoted: CVar
◦ For Samples: $CVar = \frac{s}{\bar{X}} \cdot 100\%$
◦ For Populations: $CVar = \frac{\sigma}{\mu} \cdot 100\%$
Example-Coefficient of Variation
Example 3-25
The mean of the number of sales of cars over a 3-month period is 87 and the standard deviation is 5. The mean of the commissions is $5225 and the standard deviation is $773. Compare the variations of the two.
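Sales: $CVar = \frac{s}{\bar{X}} \cdot 100\% = \frac{5}{87} \cdot 100\% = 5.7\%$
Commission: $CVar = \frac{s}{\bar{X}} \cdot 100\% = \frac{773}{5225} \cdot 100\% = 14.8\%$
Since the coefficient of variation is larger for commissions, the commissions are more variable than the sales.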
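The same comparison can be sketched in Python. The function name `coefficient_of_variation` is an assumption for illustration; the inputs are the means and standard deviations given in the example.

```python
def coefficient_of_variation(std_dev, mean):
    """Coefficient of variation as a percentage: (standard deviation / mean) * 100."""
    return std_dev / mean * 100

sales_cvar = coefficient_of_variation(5, 87)
commission_cvar = coefficient_of_variation(773, 5225)

print(f"Sales: {sales_cvar:.1f}%")             # Sales: 5.7%
print(f"Commissions: {commission_cvar:.1f}%")  # Commissions: 14.8%
```

Dividing by the mean removes the units, which is what makes a count of cars comparable to a dollar amount.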
Range Rule of Thumb
$s \approx \frac{\text{range}}{4}$
• Only an approximation
• Use only when distribution is unimodal and roughly symmetric
• Can be used to find large value and small value when you know the mean and the standard deviation
  • Large: $\bar{X} + 2s$
  • Small: $\bar{X} - 2s$
For many sets of data, almost all values fall within 2 standard deviations of the mean.
Better approximations can be obtained by using Chebyshev’s Theorem.
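A short Python sketch of the range rule of thumb follows. The function names are assumptions for illustration. It reuses the Brand B paint data purely to show the arithmetic; with only six values, the estimate should be expected to be rough.

```python
def estimate_std_dev(smallest, largest):
    """Range rule of thumb: the standard deviation is roughly range / 4."""
    return (largest - smallest) / 4

def usual_bounds(mean, std_dev):
    """Rough small and large values: mean - 2s and mean + 2s."""
    return mean - 2 * std_dev, mean + 2 * std_dev

# Brand B fading times range from 25 to 45 months.
estimate = estimate_std_dev(25, 45)
print(f"Estimated standard deviation: {estimate}")  # 5.0 (computed value was 6.5)

# Using the computed mean (35) and standard deviation (6.5) for Brand B.
small, large = usual_bounds(35, 6.5)
print(f"Small: {small}, Large: {large}")            # Small: 22.0, Large: 48.0
```

The gap between the estimate of 5.0 and the computed 6.5 illustrates why the rule is only an approximation.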
Chebyshev’s Theorem
Specifies the proportions of the spread in terms of the standard deviation (for any shaped distribution)
Theorem states: The proportion of values from a data set that will fall within k standard deviations of the mean will be at least $1 - \frac{1}{k^2}$, where k is a number greater than 1 (k is not necessarily an integer).
Example of Chebyshev’s Theorem
What percent of the data in a set should fall within 3 standard deviations of the mean?
$1 - \frac{1}{k^2} = 1 - \frac{1}{3^2} = 1 - \frac{1}{9} = \frac{8}{9} \approx 89\%$
So, at least 8/9 (about 89%) of the numbers in the set fall within 3 standard deviations of the mean.
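Chebyshev's bound is easy to tabulate for other values of k. The sketch below is illustrative; the function name `chebyshev_lower_bound` is an assumption.

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k ** 2

for k in (1.5, 2, 3):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.1%}")
# k = 1.5: at least 55.6%
# k = 2: at least 75.0%
# k = 3: at least 88.9%
```

These are lower bounds that hold for a distribution of any shape, so the actual proportion in a given data set may be higher.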
Empirical Rule
Applies only to bell-shaped (normal-shaped) distributions.
Rule states:
◦ Approximately 68% of the data values fall within 1 standard deviation of the mean.
◦ Approximately 95% of the data values fall within 2 standard deviations of the mean.
◦ Approximately 99.7% of the data values fall within 3 standard deviations of the mean.
See Figure 3-4, top of p. 128
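The three percentages in the empirical rule can be checked against an exact normal distribution. This is a minimal sketch using `NormalDist` from Python's standard `statistics` module (Python 3.8 or later); the helper name `proportion_within` is an assumption.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def proportion_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"Within {k} standard deviation(s): {proportion_within(k):.1%}")
# Within 1 standard deviation(s): 68.3%
# Within 2 standard deviation(s): 95.4%
# Within 3 standard deviation(s): 99.7%
```

For a bell-shaped distribution these proportions are well above the Chebyshev minimums, which is why the empirical rule gives tighter statements when it applies.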