prudence-porter
Measures of Central Tendency
These measures indicate a value that all the observations tend to have, or a value around which all the observations can be assumed to be located or concentrated.

There are three such measures:
i) Mean
ii) Median, Quartiles, Percentiles and Deciles
iii) Mode

Mean
Arithmetic Mean
Harmonic Mean
Geometric Mean
1) Ungrouped data

$$\text{Mean} = \frac{\text{Sum of observations}}{\text{Number of observations}} = \frac{x_1 + x_2 + \dots + x_n}{n}$$
2) Grouped data

When the data is grouped, prepare a frequency table:

| Class Interval | Mid-point of Class Interval ($x_i$) | Frequency ($f_i$) |
|---|---|---|
| -- | $x_1$ | $f_1$ |
| ⋮ | ⋮ | ⋮ |
| -- | $x_k$ | $f_k$ |

$$\bar{x} = \frac{\sum f_i x_i}{\sum f_i}$$

where $x_i$ is the middle point of the i-th class interval, $f_i$ is the frequency of the i-th class interval, $f_i x_i$ is the product of $f_i$ and $x_i$, and $k$ is the number of class intervals.
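The two formulas for the arithmetic mean can be sketched in Python as follows. This is a minimal illustration, not part of the original slides; the function names are hypothetical, and the mid-points and frequencies are those of the grouped-data table used in this document's mean deviation example.

```python
def arithmetic_mean(observations):
    """Mean of ungrouped data: sum of observations / number of observations."""
    return sum(observations) / len(observations)


def grouped_mean(midpoints, frequencies):
    """Mean of grouped data: sum(f_i * x_i) / sum(f_i)."""
    total_frequency = sum(frequencies)
    weighted_sum = sum(f * x for x, f in zip(midpoints, frequencies))
    return weighted_sum / total_frequency


print(arithmetic_mean([1, 2, 3]))  # 2.0

# Class mid-points and frequencies of the five class intervals 2000-3000 ... 6000-7000.
print(grouped_mean([2500, 3500, 4500, 5500, 6500], [2, 5, 6, 4, 3]))  # 4550.0
```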
Median

Whenever there are some extreme values in the data, calculation of the arithmetic mean (A.M.) is not desirable, because the extreme values pull the mean away from the bulk of the observations.

The median of a set of values is defined as the middle-most value of the series when the values are arranged in ascending or descending order.

If the number of observations is odd, the middle-most value is the median.

If the number of observations is even, the average of the two middle-most values is the median.
Example

3144 4784 4923 5034 5424 5561 6505 6707 6874 4187
4310 4506 4745 5071 2717 2796 3144 3527 3098 3534

Ascending order:

2717 2796 3098 3144 3527 3534 3862
4187 4310 4506 4745 4784 4923 5034
5071 5424 5561 6505 6707 6874

The number of observations is 20, which is even, and therefore there is no single middle observation. The two middle-most observations are the 10th and the 11th, i.e. 4506 and 4745.

$$\text{Median} = \frac{4506 + 4745}{2} = \frac{9251}{2} = 4625.5$$
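The odd/even rule for the median can be sketched as a short Python function. The function name is hypothetical; the data are the 20 values of the ascending-order list in the example.

```python
def median(observations):
    """Middle-most value of the ordered data; average of the two
    middle-most values when the number of observations is even."""
    ordered = sorted(observations)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


lives = [2717, 2796, 3098, 3144, 3527, 3534, 3862,
         4187, 4310, 4506, 4745, 4784, 4923, 5034,
         5071, 5424, 5561, 6505, 6707, 6874]
print(median(lives))  # 4625.5
```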
Quartiles

The median divides the data into two parts such that 50% of the observations are less than it and 50% are more than it.

Similarly, there are "Quartiles". There are three quartiles, viz. Q1, Q2 and Q3. These are referred to as the first, second and third quartiles.

The first quartile, Q1, divides the data into two parts such that 25% of the observations are less than it and 75% are more than it.

The second quartile, Q2, is the same as the median.

The third quartile, Q3, divides the data into two parts such that 75% of the observations are less than it and 25% are more than it.
Percentiles

Percentiles split the data into several parts, expressed in percentages.

A percentile, also known as a centile, divides the data in such a way that a given percent of the observations are less than it.

For example, 95% of the observations are less than the 95th percentile.

It may be noted that the 50th percentile, denoted as $P_{50}$, is the same as the median.
Deciles

The deciles divide the data into ten parts.

10% of the observations are less than the first decile, 20% are less than the second decile, and so on.
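As a hedged illustration, quartiles, deciles and percentiles can all be obtained from Python's standard `statistics.quantiles` function by choosing the number of parts. The slides do not prescribe a computational convention at this point; the sketch assumes the function's default "exclusive" method, which places the i-th cut point at position i(n+1)/parts in the ordered data. The data are the 20 values from the median example.

```python
import statistics

lives = [2717, 2796, 3098, 3144, 3527, 3534, 3862,
         4187, 4310, 4506, 4745, 4784, 4923, 5034,
         5071, 5424, 5561, 6505, 6707, 6874]

quartiles = statistics.quantiles(lives, n=4)      # Q1, Q2, Q3
deciles = statistics.quantiles(lives, n=10)       # D1 ... D9
percentiles = statistics.quantiles(lives, n=100)  # P1 ... P99

# Q2, the 5th decile and the 50th percentile all coincide with the median.
print(quartiles[1], deciles[4], percentiles[49])  # 4625.5 4625.5 4625.5
```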
Mode

The mode is defined in such a way that it represents the "fashion" of the observations in the data.

The mode is the most fashionable value, i.e. the value that the maximum number of observations have, or tend to have, as compared to any other value.

Example: the observations are 2, 4, 4, 6, 8, 8, 8, 10, 12.

Here the mode is 8 because 3 observations have this value, more than any other value.
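A minimal sketch of finding the mode by counting how often each value occurs. The function name is hypothetical; if several values tie for the highest count, this sketch returns only the first one encountered.

```python
from collections import Counter


def mode_with_count(observations):
    """Return the most frequent value and the number of times it occurs."""
    value, count = Counter(observations).most_common(1)[0]
    return value, count


print(mode_with_count([2, 4, 4, 6, 8, 8, 8, 10, 12]))  # (8, 3)
```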
Measures of Variation / Dispersion

Measures of variation/dispersion provide an idea of the extent of variation present among the observations. These are:

i) Range
ii) Mean Deviation
iii) Standard Deviation
iv) Coefficient of Variation
Range

It is the simplest measure of variation, and is defined as the difference between the maximum and the minimum values of the observations.

Range = Maximum Value - Minimum Value

Since the range depends only on two values, viz. the minimum and the maximum, and does not utilize the full information in the given data, it is not considered very reliable or efficient.

The coefficient of scatter is another measure based on the range of the data:

$$\text{Coefficient of scatter} = \frac{\text{Range}}{\text{Maximum} + \text{Minimum}} = \frac{\text{Maximum} - \text{Minimum}}{\text{Maximum} + \text{Minimum}}$$

It gives an indication of the variability in the data.
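The range and the coefficient of scatter need only the two extreme values, as this short sketch shows. The function names are hypothetical; the data are the 13 observations used in this document's coefficient-of-range example.

```python
def value_range(observations):
    """Range = maximum value - minimum value."""
    return max(observations) - min(observations)


def coefficient_of_scatter(observations):
    """(maximum - minimum) / (maximum + minimum)."""
    highest, lowest = max(observations), min(observations)
    return (highest - lowest) / (highest + lowest)


data = [103, 50, 68, 110, 105, 108, 174, 103, 150, 200, 225, 350, 103]
print(value_range(data))             # 300
print(coefficient_of_scatter(data))  # 0.75
```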
Mean Deviation

In order to study the variation in data, one method is to take into consideration the deviations of all the observations from their mean.

Example (Mean 50)

| Observation | Deviation from Mean |
|---|---|
| 50 | 0 |
| 49 | -1 |
| 51 | +1 |
| 40 | -10 |
| 10 | -40 |
| 90 | +40 |
Mean Deviation for Ungrouped Data

If the data is ungrouped and the observations for a certain variable x are $x_1, x_2, x_3, \dots, x_n$, then

$$\text{Mean Deviation} = \frac{\sum \lvert x_i - \bar{x} \rvert}{n}$$

For the data comprising observations 1, 2, 3, it can be calculated as follows:

| | $x_i$ | $x_i - \bar{x}$ | $\lvert x_i - \bar{x} \rvert$ |
|---|---|---|---|
| | 1 | -1 | 1 |
| | 2 | 0 | 0 |
| | 3 | +1 | 1 |
| Sum | 6 | 0 | 2 |
| Mean | 2 | 0 | 2/3 |

Thus the mean deviation is 2/3 = 0.67.
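The calculation for the observations 1, 2, 3 can be reproduced with a small sketch; the function name is hypothetical.

```python
def mean_deviation(observations):
    """Average of the absolute deviations from the arithmetic mean."""
    n = len(observations)
    mean = sum(observations) / n
    return sum(abs(x - mean) for x in observations) / n


print(round(mean_deviation([1, 2, 3]), 2))  # 0.67
```

Taking absolute values is what keeps the result from collapsing to zero: the signed deviations -1, 0 and +1 always cancel out.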
Mean Deviation for Grouped Data

| Class Interval | Middle Point of Class Interval ($x_i$) | Frequency ($f_i$) | $f_i x_i$ | $\lvert x_i - \bar{x} \rvert$ | $f_i \lvert x_i - \bar{x} \rvert$ |
|---|---|---|---|---|---|
| 2000-3000 | 2500 | 2 | 5000 | 2050 | 4100 |
| 3000-4000 | 3500 | 5 | 17500 | 1050 | 5250 |
| 4000-5000 | 4500 | 6 | 27000 | 50 | 300 |
| 5000-6000 | 5500 | 4 | 22000 | 950 | 3800 |
| 6000-7000 | 6500 | 3 | 19500 | 1950 | 5850 |
| Sum | | 20 | 91000 | 6050 | 19300 |
| Average | | | 4550 | | 965 |

$$\text{Mean Deviation} = \frac{\sum f_i \lvert x_i - \bar{x} \rvert}{\sum f_i} = \frac{19300}{20} = 965$$

where $x_i$ is the middle point of the i-th class interval, $\bar{x}$ is the mean, and $f_i$ is the frequency of the i-th class interval.
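The grouped calculation can be checked with the following sketch, which weights each absolute deviation by its class frequency. The function name is hypothetical; the mid-points and frequencies are those of the table.

```python
def grouped_mean_deviation(midpoints, frequencies):
    """sum(f_i * |x_i - mean|) / sum(f_i), with the mean taken from the
    class mid-points weighted by frequency."""
    total_frequency = sum(frequencies)
    mean = sum(f * x for x, f in zip(midpoints, frequencies)) / total_frequency
    weighted_abs_dev = sum(f * abs(x - mean) for x, f in zip(midpoints, frequencies))
    return weighted_abs_dev / total_frequency


midpoints = [2500, 3500, 4500, 5500, 6500]
frequencies = [2, 5, 6, 4, 3]
print(grouped_mean_deviation(midpoints, frequencies))  # 965.0
```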
Variance and Standard Deviation

While calculating the mean deviation, the absolute values of the deviations of the observations from the mean were taken, because without doing so the total deviation was zero for the data comprising the values 1, 2 and 3, even though there was variation present among these observations.

However, another way of getting over this problem of the total deviation being zero is to take the squares of the deviations of the observations from the mean.

| | $x_i$ | $x_i - \bar{x}$ | $(x_i - \bar{x})^2$ |
|---|---|---|---|
| | 1 | -1 | 1 |
| | 2 | 0 | 0 |
| | 3 | +1 | 1 |
| Sum | 6 | 0 | 2 |
| Mean | 2 | 0 | 2/3 (= 0.67) |
Calculation of variance and standard deviation for ungrouped data

$$\text{Variance } (\sigma^2) = \frac{1}{n} \sum (x_i - \bar{x})^2 = \frac{1}{3} \times 2 = 0.67$$

The square root of $\sigma^2$, i.e. $\sigma$, is known as the standard deviation.

$$\text{Standard Deviation } (\sigma) = \sqrt{0.67} = 0.82$$
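A minimal sketch of the same calculation, dividing by n as the formula above does (the population form). The function names are hypothetical.

```python
import math


def variance(observations):
    """Mean of the squared deviations from the arithmetic mean (divides by n)."""
    n = len(observations)
    mean = sum(observations) / n
    return sum((x - mean) ** 2 for x in observations) / n


def standard_deviation(observations):
    """Square root of the variance."""
    return math.sqrt(variance(observations))


print(round(variance([1, 2, 3]), 2))            # 0.67
print(round(standard_deviation([1, 2, 3]), 2))  # 0.82
```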
Calculation of Variance and Standard Deviation for Grouped Data

| Class Interval | Mid Point of Class Interval ($x_i$) | Frequency ($f_i$) | $f_i x_i$ | $f_i x_i^2$ | $x_i - \bar{x}$ | $(x_i - \bar{x})^2$ | $f_i (x_i - \bar{x})^2$ |
|---|---|---|---|---|---|---|---|
| 2000-3000 | 2500 | 2 | 5000 | 12500000 | -2050 | 4202500 | 8405000 |
| 3000-4000 | 3500 | 5 | 17500 | 61250000 | -1050 | 1102500 | 5512500 |
| 4000-5000 | 4500 | 6 | 27000 | 121500000 | -50 | 2500 | 15000 |
| 5000-6000 | 5500 | 4 | 22000 | 121000000 | 950 | 902500 | 3610000 |
| 6000-7000 | 6500 | 3 | 19500 | 126750000 | 1950 | 3802500 | 11407500 |
| Sum | | 20 | 91000 | 443000000 | | 10012500 | 28950000 |

Average ($\bar{x}$) = 91000 / 20 = 4550

$$\text{Variance} = \frac{\sum f_i (x_i - \bar{x})^2}{\sum f_i} = \frac{28950000}{20} = 1447500$$

$$\text{S.D.} = \sqrt{\text{Variance}} = \sqrt{1447500} = 1203.12$$
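The grouped variance can be verified with the sketch below, which follows the column-by-column layout of the table. The function name is hypothetical.

```python
import math


def grouped_variance(midpoints, frequencies):
    """sum(f_i * (x_i - mean)^2) / sum(f_i) for grouped data."""
    total_frequency = sum(frequencies)
    mean = sum(f * x for x, f in zip(midpoints, frequencies)) / total_frequency
    squared_dev = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, frequencies))
    return squared_dev / total_frequency


midpoints = [2500, 3500, 4500, 5500, 6500]
frequencies = [2, 5, 6, 4, 3]
var = grouped_variance(midpoints, frequencies)
print(var)                       # 1447500.0
print(round(math.sqrt(var), 2))  # 1203.12
```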
Combining Variances of Two Populations

The mean and S.D. of the "lives" of tyres manufactured by two factories of the "Durable" tyre company, each making 50,000 tyres annually, are given below. Calculate the mean and standard deviation of all the 100,000 tyres produced in a year.

| Group | Mean ('000 km) | S.D. ('000 km) |
|---|---|---|
| 1 | 60 | 8 |
| 2 | 55 | 7 |
We know that if one set of data has $n_1$ observations with mean $m_1$ and s.d. $\sigma_1$, and another set of data has $n_2$ observations with mean $m_2$ and s.d. $\sigma_2$, then the mean ($m$) and variance ($\sigma^2$) of the combined data with $(n_1 + n_2)$ observations are given as

$$m = \frac{n_1 m_1 + n_2 m_2}{n_1 + n_2}$$

$$\sigma^2 = \frac{n_1 (\sigma_1^2 + d_1^2) + n_2 (\sigma_2^2 + d_2^2)}{n_1 + n_2}$$

where $d_1 = m_1 - m$, $d_2 = m_2 - m$, and $m$ is the combined mean of both sets of data.
Factory 1: $n_1 = 50$, $m_1 = 60$ and $\sigma_1 = 8$

Factory 2: $n_2 = 50$, $m_2 = 55$ and $\sigma_2 = 7$

Here $n_1$ and $n_2$ are in thousands of tyres; only their ratio affects the result. Substituting these values in the above formulas:

$$\text{Mean} = \frac{(50 \times 60) + (50 \times 55)}{50 + 50} = \frac{3000 + 2750}{100} = \frac{5750}{100} = 57.5$$

Thus the mean life of the tyres manufactured by the company is 57,500 km.

Therefore,

$d_1 = m_1 - m = 60 - 57.5 = 2.5$
$d_2 = m_2 - m = 55 - 57.5 = -2.5$

$$\text{Variance } (\sigma^2) = \frac{50 \times (8^2 + 2.5^2) + 50 \times (7^2 + 2.5^2)}{50 + 50} = \frac{(50 \times 70.25) + (50 \times 55.25)}{100} = \frac{3512.5 + 2762.5}{100} = \frac{6275}{100} = 62.75$$

Therefore

$$\text{S.D. } (\sigma) = \sqrt{62.75} = 7.92$$

Thus, the S.D. of the lives of tyres produced by the company is 7,920 km.
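The combination formulas translate directly into code. The following is a sketch under the same definitions ($d_1 = m_1 - m$, $d_2 = m_2 - m$); the function name is hypothetical and the group sizes are entered in thousands of tyres, as in the worked example.

```python
import math


def combine_groups(n1, m1, sd1, n2, m2, sd2):
    """Combined mean and variance of two groups, given each group's
    size, mean and standard deviation."""
    total = n1 + n2
    mean = (n1 * m1 + n2 * m2) / total
    d1, d2 = m1 - mean, m2 - mean
    variance = (n1 * (sd1 ** 2 + d1 ** 2) + n2 * (sd2 ** 2 + d2 ** 2)) / total
    return mean, variance


mean, variance = combine_groups(50, 60, 8, 50, 55, 7)
print(mean, variance)                 # 57.5 62.75
print(round(math.sqrt(variance), 2))  # 7.92
```

The $d^2$ terms add the spread between the two factory means to the spread within each factory, which is why the combined S.D. (7.92) is larger than a simple average of 8 and 7.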
Mean Deviation

The mean deviation is defined as

$$\text{Mean Deviation} = \frac{\sum f_i \lvert x_i - \bar{x} \rvert}{\sum f_i}$$

where $x_i$ is the middle point of the i-th class interval, $f_i$ is the frequency of the i-th class interval, and $\bar{x}$ is the arithmetic mean of the I.Q. scores.
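| Class Interval | Frequency ($f_i$) | Mid Point of Class Interval ($x_i$) | $f_i x_i$ | $\lvert x_i - \bar{x} \rvert$ | $f_i \lvert x_i - \bar{x} \rvert$ |
|---|---|---|---|---|---|
| 40-50 | 10 | 45 | 450 | 26.5 | 265.0 |
| 50-60 | 20 | 55 | 1100 | 16.5 | 330.0 |
| 60-70 | 20 | 65 | 1300 | 6.5 | 130.0 |
| 70-80 | 15 | 75 | 1125 | 3.5 | 52.5 |
| 80-90 | 15 | 85 | 1275 | 13.5 | 202.5 |
| 90-100 | 20 | 95 | 1900 | 23.5 | 470.0 |
| Summation | 100 | - | 7150 | | 1450.0 |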
From this data, we get

$$\text{Mean} = \frac{\sum f_i x_i}{\sum f_i} = \frac{7150}{100} = 71.5$$

$$\text{Mean Deviation} = \frac{\sum f_i \lvert x_i - \bar{x} \rvert}{\sum f_i} = \frac{1450}{100} = 14.5$$

Thus the average score is 71.5 and the mean deviation of the scores is 14.5.
Suppose we are required to calculate only the standard deviation for the above data; then the table is constructed as below.

| Class Interval | Frequency ($f_i$) | Mid Point of Class Interval ($x_i$) | $f_i x_i$ | $x_i^2$ | $f_i x_i^2$ |
|---|---|---|---|---|---|
| 40-50 | 10 | 45 | 450 | 2025 | 20250 |
| 50-60 | 20 | 55 | 1100 | 3025 | 60500 |
| 60-70 | 20 | 65 | 1300 | 4225 | 84500 |
| 70-80 | 15 | 75 | 1125 | 5625 | 84375 |
| 80-90 | 15 | 85 | 1275 | 7225 | 108375 |
| 90-100 | 20 | 95 | 1900 | 9025 | 180500 |
| Summation | 100 | | 7150 | | 538500 |

$$\text{Mean} = \frac{\sum f_i x_i}{\sum f_i} = \frac{7150}{100} = 71.5$$

$$\text{S.D.} = \sqrt{\frac{\sum f_i x_i^2 - \left(\sum f_i\right) \bar{x}^2}{\sum f_i}} = \sqrt{\frac{538500 - 100 \times (71.5)^2}{100}} = \sqrt{272.75} = 16.5$$

Thus, the s.d. of the I.Q. scores is 16.5.
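The shortcut form of the standard deviation, which needs only the sums of $f_i x_i$ and $f_i x_i^2$, can be sketched as follows. The function name is hypothetical; the frequencies and mid-points are those of the I.Q. table.

```python
import math


def grouped_sd_shortcut(midpoints, frequencies):
    """sqrt((sum(f*x^2) - N * mean^2) / N), where N = sum(f)."""
    n = sum(frequencies)
    sum_fx = sum(f * x for x, f in zip(midpoints, frequencies))
    sum_fx2 = sum(f * x * x for x, f in zip(midpoints, frequencies))
    mean = sum_fx / n
    variance = (sum_fx2 - n * mean ** 2) / n
    return mean, variance, math.sqrt(variance)


iq_midpoints = [45, 55, 65, 75, 85, 95]
iq_frequencies = [10, 20, 20, 15, 15, 20]
iq_mean, iq_variance, iq_sd = grouped_sd_shortcut(iq_midpoints, iq_frequencies)
print(iq_mean, iq_variance, round(iq_sd, 1))  # 71.5 272.75 16.5
```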
Coefficient of Variation

It is a relative measure of dispersion that enables us to compare two distributions.

It relates the standard deviation and the mean by expressing the standard deviation as a percentage of the mean.

$$\text{C.V.} = \frac{\sigma}{\bar{x}} \times 100$$
Example

For the data 103, 50, 68, 110, 105, 108, 174, 103, 150, 200, 225, 350, 103 find the range, the coefficient of range and the coefficient of quartile deviation.

1) Range = H - L = 350 - 50 = 300

2) Coefficient of range:

$$\frac{H - L}{H + L} = \frac{300}{350 + 50} = 0.75$$

To find Q1 and Q3 we arrange the data in ascending order:

50, 68, 103, 103, 103, 105, 108, 110, 150, 174, 200, 225, 350

With n = 13, Q1 lies at position (n + 1)/4 = 14/4 = 3.5 and Q3 lies at position 3(n + 1)/4 = 10.5.

Q1 = 103 + 0.5 (103 - 103) = 103

Q3 = 174 + 0.5 (200 - 174) = 187

$$\text{Coefficient of QD} = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{187 - 103}{187 + 103} = 0.2897$$
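The positional steps of this example (locate the position, then interpolate between the two neighbouring ordered values) can be sketched as below. The helper names are hypothetical, and the sketch assumes the requested position lies between 1 and n.

```python
import math


def value_at_position(ordered, position):
    """Value at a 1-based, possibly fractional, position in ordered data,
    using linear interpolation between the two neighbouring values."""
    lower = math.floor(position)
    fraction = position - lower
    current = ordered[lower - 1]
    following = ordered[lower] if lower < len(ordered) else current
    return current + fraction * (following - current)


def coefficient_of_quartile_deviation(observations):
    """(Q3 - Q1) / (Q3 + Q1), with Q1 at (n+1)/4 and Q3 at 3(n+1)/4."""
    ordered = sorted(observations)
    n = len(ordered)
    q1 = value_at_position(ordered, (n + 1) / 4)
    q3 = value_at_position(ordered, 3 * (n + 1) / 4)
    return q1, q3, (q3 - q1) / (q3 + q1)


data = [103, 50, 68, 110, 105, 108, 174, 103, 150, 200, 225, 350, 103]
q1, q3, coeff = coefficient_of_quartile_deviation(data)
print(q1, q3, round(coeff, 4))  # 103.0 187.0 0.2897
```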
Example

A purchasing agent obtained a sample of incandescent lamps from each of two suppliers. He had the samples tested in his laboratory for length of life, with the following results.

| Length of life in hours | Sample A | Sample B |
|---|---|---|
| 700 - 900 | 10 | 3 |
| 900 - 1100 | 16 | 42 |
| 1100 - 1300 | 26 | 12 |
| 1300 - 1500 | 8 | 3 |

Which company's lamps are more uniform?
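Sample A

| Class interval | Frequency ($f$) | Midpoint ($x$) | $u = \dfrac{x - 1000}{200}$ | $f u$ | $f u^2$ |
|---|---|---|---|---|---|
| 700 - 900 | 10 | 800 | -1 | -10 | 10 |
| 900 - 1100 | 16 | 1000 | 0 | 0 | 0 |
| 1100 - 1300 | 26 | 1200 | 1 | 26 | 26 |
| 1300 - 1500 | 8 | 1400 | 2 | 16 | 32 |
| Total | 60 | | | 32 | 68 |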
For Sample A:

$$\bar{u}_A = \frac{32}{60} = 0.5333$$

$$\bar{x}_A = 1000 + 200\,\bar{u}_A = 1000 + 200 \times 0.5333 = 1106.67$$

$$\sigma_u^2 = \frac{1}{N} \sum f u^2 - (\bar{u})^2 = \frac{68}{60} - (0.5333)^2 = 1.1333 - 0.2844 = 0.8489$$

$$\sigma_u = \sqrt{0.8489} = 0.9214$$

$$\sigma_x = 200 \times 0.9214 = 184.27$$

$$\text{C.V. for sample A} = \frac{\sigma_A}{\bar{x}_A} \times 100 = \frac{184.27}{1106.67} \times 100 = 16.65\%$$
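Sample B

| Class interval | Frequency ($f$) | Midpoint ($x$) | $u = \dfrac{x - 1000}{200}$ | $f u$ | $f u^2$ |
|---|---|---|---|---|---|
| 700 - 900 | 3 | 800 | -1 | -3 | 3 |
| 900 - 1100 | 42 | 1000 | 0 | 0 | 0 |
| 1100 - 1300 | 12 | 1200 | 1 | 12 | 12 |
| 1300 - 1500 | 3 | 1400 | 2 | 6 | 12 |
| Total | 60 | | | 15 | 27 |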
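As a hedged sketch, the step-deviation method used in these tables (coding each mid-point as u = (x - 1000)/200, then scaling back) can be applied to both samples to compare their coefficients of variation. The function name and parameter names are hypothetical; the frequencies are those given in the problem statement, and the sample with the smaller C.V. is the more uniform one.

```python
import math


def cv_by_step_deviation(midpoints, frequencies, origin=1000, width=200):
    """Mean, S.D. and C.V. (%) of grouped data via coded values
    u = (x - origin) / width."""
    n = sum(frequencies)
    coded = [(x - origin) / width for x in midpoints]
    mean_u = sum(f * u for u, f in zip(coded, frequencies)) / n
    var_u = sum(f * u * u for u, f in zip(coded, frequencies)) / n - mean_u ** 2
    mean_x = origin + width * mean_u
    sd_x = width * math.sqrt(var_u)
    return mean_x, sd_x, sd_x / mean_x * 100


lamp_midpoints = [800, 1000, 1200, 1400]
mean_a, sd_a, cv_a = cv_by_step_deviation(lamp_midpoints, [10, 16, 26, 8])
mean_b, sd_b, cv_b = cv_by_step_deviation(lamp_midpoints, [3, 42, 12, 3])

print(round(mean_a, 2), round(sd_a, 2), round(cv_a, 2))  # 1106.67 184.27 16.65
print(round(mean_b, 2), round(sd_b, 2), round(cv_b, 2))  # 1050.0 124.5 11.86
```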