rica-joy-pontilar
Measures of variation are used to determine the scatter of values in a distribution. In this chapter, we will consider the following measures of variation: the range, the quartile deviation, the mean deviation, the variance, and the coefficient of variation.
Range
o Range
The difference between the highest and lowest values in the distribution.
RANGE = H - L
Where: H = the highest value
       L = the lowest value
Ungrouped Data
Subtract the lowest score from the highest score.
Example: Find the range of the distribution if the highest score is 100 and the lowest score is 21.
Solution:
Range = highest score - lowest score
      = 100 - 21
      = 79
Grouped Data
To find the range of a frequency distribution, take the difference between the upper boundary of the highest class interval and the lower boundary of the lowest class interval.
Example: Find the range of the frequency distribution.

Class interval    Frequency
100-104            4
105-109            6
110-114           10
115-119           13
120-124            8
125-129            6
130-134            3
                  N = 50

Solution:
Range = upper boundary of the highest class - lower boundary of the lowest class
      = 134.5 - 99.5
      = 35
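The two range calculations above can be sketched in Python. This is a minimal illustration, not a standard routine: the function names `range_ungrouped` and `range_grouped` are made up here, and the grouped version assumes integer class limits, so each class boundary lies 0.5 beyond the stated limit (as in 134.5 and 99.5 above).

```python
def range_ungrouped(values):
    """Range of raw scores: highest value minus lowest value."""
    return max(values) - min(values)


def range_grouped(class_limits):
    """Range of a frequency distribution given (lower, upper) class limits.

    Assumes integer class limits, so the class boundaries are
    upper limit + 0.5 and lower limit - 0.5.
    """
    lowest_lower = min(lower for lower, _ in class_limits)
    highest_upper = max(upper for _, upper in class_limits)
    return (highest_upper + 0.5) - (lowest_lower - 0.5)


# Class intervals from the grouped-data example
classes = [(100, 104), (105, 109), (110, 114), (115, 119),
           (120, 124), (125, 129), (130, 134)]

print(range_ungrouped([100, 21]))  # 79
print(range_grouped(classes))      # 35.0
```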
Quartile Deviations and Mean Deviations
o Quartile Deviations
The quartile deviation is a measure that describes the dispersion of a data set in terms of the distance between selected observation points. The smaller the quartile deviation, the greater the concentration of observations in the middle half of the data set. It belongs to the family of measures of variation that use percentiles, deciles, or quartiles. The quartile deviation (QD) is half the difference between the upper quartile (Q3) and the lower quartile (Q1) of a distribution. The difference Q3 - Q1 is referred to as the interquartile range, so QD is also called the semi-interquartile range.
Formula: QD = (Q3 - Q1) / 2
where Q1 and Q3 are the first and third quartiles and Q3 - Q1 is the interquartile range.
A. Ungrouped Data
Example: Given the data below (n = 25):
33 52 58 41 56
71 77 74 85 45
82 50 62 51 67
79 48 83 43 81
38 79 65 68 59
Solution: Arrange the 25 entries from lowest to highest.
33 38 41 (3rd entry) 43 45
48 (6th entry) 50 51 52 56
58 59 62 65 67
68 71 74 77 (19th entry) 79
79 81 82 (23rd entry) 83 85
Semi-interquartile range: Since Q3 = P75 and Q1 = P25, we find P75 and P25.
For P75:
Cum. freq. of P75 = (75/100) x 25 = 18.75, or 19
This means that P75 is the 19th entry. Therefore, P75 = 77.
For P25:
Cum. freq. of P25 = (25/100) x 25 = 6.25, or 6, which means that P25 is the 6th entry.
P25 = 48
Semi-interquartile range = (Q3 - Q1) / 2 = (P75 - P25) / 2 = (77 - 48) / 2 = 29 / 2 = 14.5
Hence the semi-interquartile range is 14.5.
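The ungrouped procedure can be checked with a short Python sketch. It follows the rounding rule used in the worked example (position = j/100 x n, rounded to the nearest whole entry); other textbooks and libraries interpolate between entries and can give slightly different quartiles. The helper names `percentile_entry` and `quartile_deviation` are hypothetical.

```python
import math

scores = [33, 52, 58, 41, 56, 71, 77, 74, 85, 45, 82, 50, 62,
          51, 67, 79, 48, 83, 43, 81, 38, 79, 65, 68, 59]


def percentile_entry(values, j):
    """j-th percentile taken as the entry at position round(j/100 * n)."""
    ordered = sorted(values)
    n = len(ordered)
    position = math.floor(j / 100 * n + 0.5)  # round half up
    position = max(1, min(position, n))       # keep the position within 1..n
    return ordered[position - 1]              # positions are 1-based


def quartile_deviation(values):
    """Semi-interquartile range: (Q3 - Q1) / 2 with Q3 = P75 and Q1 = P25."""
    q1 = percentile_entry(values, 25)
    q3 = percentile_entry(values, 75)
    return (q3 - q1) / 2


print(percentile_entry(scores, 75))  # 77
print(percentile_entry(scores, 25))  # 48
print(quartile_deviation(scores))    # 14.5
```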
B. Grouped Data
Example:

Class Intervals    f     <cf
21-23              3      3
24-26              4      7
27-29              6     13
30-32             10     23
33-35              5     28
36-38              2     30
                n = 30
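Solution: Note that Q3 - Q1 = P75 - P25. Each percentile is interpolated within its class using
Pj = L + [(j x n / 100 - F) / f] x c
where L is the lower boundary of the class containing Pj, F is the cumulative frequency below that class, f is the frequency of that class, and c is the class size.
For P75:
Cum. freq. of P75 = (75/100) x 30 = 22.5, which falls in the class 30-32
L = 29.5, f = 10, F = 13, c = 3, j = 75
P75 = 29.5 + [(22.5 - 13) / 10] x 3 = 32.35
For P25:
Cum. freq. of P25 = (25/100) x 30 = 7.5, which falls in the class 27-29
L = 26.5, f = 6, F = 7, c = 3, j = 25
P25 = 26.5 + [(7.5 - 7) / 6] x 3 = 26.75
Finally, the interquartile range is P75 - P25 = 32.35 - 26.75 = 5.6, so the quartile deviation is 5.6 / 2 = 2.8.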
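The grouped-data interpolation can be sketched as follows. This is an illustrative helper only: `grouped_percentile` is a hypothetical name, and the code assumes integer class limits (so the lower boundary is the lower limit minus 0.5) and equal class sizes.

```python
def grouped_percentile(classes, j, width):
    """Interpolated j-th percentile of a frequency distribution.

    classes: list of (lower_limit, frequency) pairs in ascending order,
             assumed to have integer class limits.
    width:   class size c.
    """
    n = sum(f for _, f in classes)
    target = j / 100 * n   # cumulative frequency of P_j
    cumulative = 0         # F: cumulative frequency below the current class
    for lower_limit, f in classes:
        if f > 0 and cumulative + f >= target:
            boundary = lower_limit - 0.5  # L: lower class boundary
            return boundary + (target - cumulative) / f * width
        cumulative += f
    raise ValueError("j must be between 0 and 100")


# (lower limit, frequency) for the classes 21-23, 24-26, ..., 36-38
classes = [(21, 3), (24, 4), (27, 6), (30, 10), (33, 5), (36, 2)]

p75 = grouped_percentile(classes, 75, 3)
p25 = grouped_percentile(classes, 25, 3)

print(round(p75, 2))              # 32.35
print(round(p25, 2))              # 26.75
print(round(p75 - p25, 2))        # 5.6  (interquartile range)
print(round((p75 - p25) / 2, 2))  # 2.8  (quartile deviation)
```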
o Mean Deviation
The mean deviation, or average deviation, is the arithmetic mean of the absolute deviations of the values from their mean:
Mean deviation = Σ |xi - x̄| / N
Example: Calculate the mean deviation of the following distribution: 9, 3, 8, 8, 9, 8, 9, 18
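A minimal Python sketch of this calculation, assuming the deviations are taken from the arithmetic mean (the function name `mean_deviation` is chosen here for illustration):

```python
def mean_deviation(values):
    """Arithmetic mean of the absolute deviations from the mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n


data = [9, 3, 8, 8, 9, 8, 9, 18]

print(sum(data) / len(data))  # 9.0  (the mean)
print(mean_deviation(data))   # 2.25
```

The mean is 72 / 8 = 9, the absolute deviations sum to 18, and 18 / 8 = 2.25.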
Mean Deviation for Grouped Data
If the data is grouped in a frequency table, the mean deviation is computed from the class midpoints xi and the frequencies fi:
Mean deviation = Σ |xi - x̄| · fi / N
Example: Calculate the mean deviation of the following distribution:

Class       xi      fi    xi · fi    |xi - x̄|    |xi - x̄| · fi
[10, 15)    12.5     3      37.5       9.286        27.858
[15, 20)    17.5     5      87.5       4.286        21.43
[20, 25)    22.5     7     157.5       0.714         4.998
[25, 30)    27.5     4     110         5.714        22.856
[30, 35)    32.5     2      65        10.714        21.428
Total               21     457.5                    98.57

Mean: x̄ = 457.5 / 21 = 21.786
Mean deviation = 98.57 / 21 = 4.69
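The grouped table can be reproduced with a short sketch. As in the table, each class is represented by its midpoint; the function name `grouped_mean_deviation` is hypothetical.

```python
def grouped_mean_deviation(midpoints, frequencies):
    """Return (mean, mean deviation) for data grouped into classes."""
    n = sum(frequencies)
    mean = sum(x * f for x, f in zip(midpoints, frequencies)) / n
    total_abs_dev = sum(abs(x - mean) * f
                        for x, f in zip(midpoints, frequencies))
    return mean, total_abs_dev / n


midpoints = [12.5, 17.5, 22.5, 27.5, 32.5]  # class marks xi
frequencies = [3, 5, 7, 4, 2]               # fi

mean, md = grouped_mean_deviation(midpoints, frequencies)
print(round(mean, 3))  # 21.786
print(round(md, 3))    # 4.694
```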
Variance
In probability theory and statistics, variance measures how far a set of numbers is spread out. A variance of zero indicates that all the values are identical. Variance is always non-negative: a small variance indicates that the data points tend to be very close to the mean (the expected value) and hence to each other, while a high variance indicates that the data points are very spread out around the mean and from each other.
It is important to distinguish between the variance of a population and the variance of a sample. They have different notation, and they are computed differently.
The variance of a population is denoted by σ², and the variance of a sample by s².
The variance of a population is defined by the following formula:
σ² = Σ (Xi - µ)² / N
where σ² is the population variance, µ is the population mean, Xi is the ith element from the population, and N is the number of elements in the population.
The variance of a sample is defined by a slightly different formula:
s² = Σ (xi - x̄)² / (n - 1)
where s² is the sample variance, x̄ is the sample mean, xi is the ith element from the sample, and n is the number of elements in the sample. Using this formula, the variance of the sample is an unbiased estimate of the variance of the population.
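The difference between the two formulas is only the divisor, which a small sketch makes concrete. The function names and the data values below are made up for illustration and do not come from the text.

```python
def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two values")
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values, mean = 5

print(population_variance(data))        # 4.0
print(round(sample_variance(data), 3))  # 4.571
```

For the same data the sample variance is always the larger of the two, because the same sum of squares (32 here) is divided by n - 1 = 7 instead of N = 8.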
For example, suppose you want to find the variance of scores on a test. Suppose the scores are 67, 72, 85, 93 and 98.
Write down the formula for variance:
σ² = Σ (x - µ)² / N
There are five scores in total, so N = 5:
σ² = Σ (x - µ)² / 5
Find the mean (µ) of the five scores (67, 72, 85, 93, 98): µ = (67 + 72 + 85 + 93 + 98) / 5 = 415 / 5 = 83, so
σ² = Σ (x - 83)² / 5
Now compare each score (x = 67, 72, 85, 93, 98) to the mean (µ = 83):
σ² = [ (67-83)² + (72-83)² + (85-83)² + (93-83)² + (98-83)² ] / 5
Carry out the subtraction in each parenthesis:
67 - 83 = -16
72 - 83 = -11
85 - 83 = 2
93 - 83 = 10
98 - 83 = 15
The formula now looks like this:
σ² = [ (-16)² + (-11)² + (2)² + (10)² + (15)² ] / 5
Then square each parenthesis. We get 256, 121, 4, 100 and 225. This is how:
σ² = [ (-16)x(-16) + (-11)x(-11) + (2)x(2) + (10)x(10) + (15)x(15) ] / 5
σ² = [ 16x16 + 11x11 + 2x2 + 10x10 + 15x15 ] / 5
which equals:
σ² = [ 256 + 121 + 4 + 100 + 225 ] / 5
Then sum the numbers inside the brackets:
σ² = 706 / 5
To get the final answer, divide the sum by 5 (because there were five scores). This is the variance for the dataset:
σ² = 141.2
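The result can be cross-checked with Python's standard `statistics` module, where `pvariance` divides by N (population variance) and `variance` divides by n - 1 (sample variance). Treating the five scores as a sample rather than a population is an assumption added here only to show the contrast.

```python
import statistics

scores = [67, 72, 85, 93, 98]

print(statistics.pvariance(scores))  # 141.2  (706 / 5, population variance)
print(statistics.variance(scores))   # 176.5  (706 / 4, sample variance)
```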