Measures of Central Tendency

Measures of Central Tendency. These measures indicate a value, which all the observations tend to have, or a value where all the observations can be assumed


These measures indicate a value which all the observations tend to have, or a value where all the observations can be assumed to be located or concentrated.

There are three such measures:

i) Mean

ii) Median, Quartiles, Percentiles and Deciles

iii) Mode


Mean

Arithmetic Mean

Harmonic Mean

Geometric Mean

1) Ungrouped data

$$\text{Mean} = \frac{\text{Sum of observations}}{\text{Number of observations}} = \frac{x_1 + x_2 + \dots + x_n}{n}$$


2) Grouped data

When the data is grouped, prepare a frequency table:

| Class Interval | Mid-point of Class Interval ($x_i$) | Frequency ($f_i$) |
|---|---|---|
| -- | $x_1$ | $f_1$ |
| -- | -- | -- |
| -- | $x_k$ | $f_k$ |

$$\bar{x} = \frac{\sum f_i x_i}{\sum f_i}$$

where $x_i$ is the middle point of the $i$th class interval, $f_i$ is the frequency of the $i$th class interval, $f_i x_i$ is the product of $f_i$ and $x_i$, and $k$ is the number of class intervals.


Median

Whenever there are some extreme values in the data, calculation of the A.M. is not desirable, because the A.M. is pulled towards those extremes.

The median of a set of values is defined as the middlemost value of a series of values arranged in ascending or descending order.

If the number of observations is odd, the value of the middlemost observation is the median.

If the number of observations is even, the average of the two middlemost values is the median.


Example

3144 4784 4923 5034 5424 5561 6505 6707 6874 4187
4310 4506 4745 5071 2717 2796 3144 3527 3098 3534

Ascending Order

2717 2796 3098 3144 3527 3534 3862
4187 4310 4506 4745 4784 4923 5034
5071 5424 5561 6505 6707 6874

The number of observations is 20, which is even, and therefore there is no single middle observation. The two middlemost observations are the 10th and 11th.

$$\text{Median} = \frac{4506 + 4745}{2} = \frac{9251}{2} = 4625.5$$
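The odd/even rule for the median can be sketched in a few lines of Python. This is only an illustration: the function name `median_by_position` is hypothetical, and the list holds the twenty observations as they are listed in the example.

```python
from statistics import median

# The twenty observations as listed in the example.
observations = [
    3144, 4784, 4923, 5034, 5424, 5561, 6505, 6707, 6874, 4187,
    4310, 4506, 4745, 5071, 2717, 2796, 3144, 3527, 3098, 3534,
]

def median_by_position(values):
    """Sort the values, then take the middle one (odd n) or
    the average of the two middle ones (even n)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

result = median_by_position(observations)
print(result)
# Cross-check against the standard library implementation.
print(median(observations))
```

The standard library's `statistics.median` applies the same rule, so the two printed values should agree.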


Quartiles

The median divides the data into two parts such that 50% of the observations are less than it and 50% are more than it.

Similarly, there are "quartiles". There are three quartiles, viz. Q1, Q2 and Q3. These are referred to as the first, second and third quartiles.

The first quartile, Q1, divides the data into two parts such that 25% of the observations are less than it and 75% are more than it.

The second quartile, Q2, is the same as the median.

The third quartile, Q3, divides the data into two parts such that 75% of the observations are less than it and 25% are more than it.


Percentiles

Percentiles split the data into several parts, expressed in percentages.

A percentile, also known as a centile, divides the data in such a way that a given percentage of the observations are less than it.

For example, 95% of the observations are less than the 95th percentile.

It may be noted that the 50th percentile, denoted as $P_{50}$, is the same as the median.


Deciles

The deciles divide the data into ten parts: the first decile (10%), the second decile (20%), and so on.


Mode

It is defined in such a way that it represents the fashion of the observations in the data.

The mode is defined as the most fashionable value, that is, the value which the maximum number of observations have or tend to have as compared to any other value.

Observations are 2, 4, 4, 6, 8, 8, 8, 10, 12

Here the mode is 8 because 3 observations have this value, more than for any other value.
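As a small illustration, the mode of these observations can be found by counting how often each value occurs. The sketch below uses `collections.Counter` from the Python standard library; the variable names are illustrative only.

```python
from collections import Counter

observations = [2, 4, 4, 6, 8, 8, 8, 10, 12]

# Count how many observations take each value.
counts = Counter(observations)

# most_common(1) returns the (value, count) pair with the highest count.
mode_value, mode_count = counts.most_common(1)[0]
print(mode_value, mode_count)
```

If several values shared the highest count, `most_common(1)` would report only one of them, so multimodal data needs a separate check.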


Measures of Variation/Dispersion

Measures of variation/dispersion provide an idea of the extent of variation present among the observations. These are:

i) Range

ii) Mean Deviation

iii) Standard Deviation

iv) Coefficient of Variation


Range

It is the simplest measure of variation, and is defined as the difference between the maximum and the minimum values of the observations.

$$\text{Range} = \text{Maximum Value} - \text{Minimum Value}$$

Since the range depends only on two values, viz. the minimum and the maximum, and does not utilize the full information in the given data, it is not considered very reliable or efficient.

The coefficient of scatter is another measure based on the range of the data:

$$\text{Coefficient of scatter} = \frac{\text{Range}}{\text{Maximum} + \text{Minimum}} = \frac{\text{Maximum} - \text{Minimum}}{\text{Maximum} + \text{Minimum}}$$

It gives an indication of the variability in the data.


Mean Deviation

In order to study the variation in the data, one method could be to take into consideration the deviations of all the observations from their mean.

Example (Mean 50)

| Observation | Deviation from Mean |
|---|---|
| 50 | 0 |
| 49 | -1 |
| 51 | +1 |
| 40 | -10 |
| 10 | -40 |
| 90 | +40 |


Mean Deviation for Ungrouped Data

If the data is ungrouped and the observations for a certain variable $x$ are $x_1, x_2, x_3, \dots, x_n$, then

$$\text{Mean Deviation} = \frac{\sum \lvert x_i - \bar{x} \rvert}{n}$$

For the data comprising observations 1, 2, 3, it can be calculated as follows:

| | $x_i$ | $x_i - \bar{x}$ | $\lvert x_i - \bar{x} \rvert$ |
|---|---|---|---|
| | 1 | -1 | 1 |
| | 2 | 0 | 0 |
| | 3 | +1 | 1 |
| Sum | 6 | 0 | 2 |
| Mean | 2 | 0 | 2/3 |

Thus the mean deviation is 2/3 = 0.67.
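The same calculation can be written as a short Python sketch. The function name `mean_deviation` is a hypothetical helper, not something defined in the slides.

```python
def mean_deviation(values):
    """Average of the absolute deviations of the values from their mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n

# Plain (signed) deviations always sum to zero, which is why
# the absolute values are needed.
data = [1, 2, 3]
signed_total = sum(x - sum(data) / len(data) for x in data)
md = mean_deviation(data)
print(signed_total, round(md, 2))
```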


Mean Deviation for Grouped Data

| Class Interval | Middle Point of Class Interval ($x_i$) | Frequency ($f_i$) | $f_i x_i$ | $\lvert x_i - \bar{x} \rvert$ | $f_i \lvert x_i - \bar{x} \rvert$ |
|---|---|---|---|---|---|
| 2000-3000 | 2500 | 2 | 5000 | 2050 | 4100 |
| 3000-4000 | 3500 | 5 | 17500 | 1050 | 5250 |
| 4000-5000 | 4500 | 6 | 27000 | 50 | 300 |
| 5000-6000 | 5500 | 4 | 22000 | 950 | 3800 |
| 6000-7000 | 6500 | 3 | 19500 | 1950 | 5850 |
| Sum | | 20 | 91000 | 6050 | 19300 |
| Average | | | 4550 | | 965 |


$$\text{Mean Deviation} = \frac{\sum f_i \lvert x_i - \bar{x} \rvert}{\sum f_i} = \frac{19300}{20} = 965$$

where

$x_i$ is the middle point of the $i$th class interval,

$\bar{x}$ is the mean,

$f_i$ is the frequency of the $i$th class interval.
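For grouped data the same idea applies with each midpoint weighted by its class frequency. The sketch below reproduces the table's totals; the function names are hypothetical, and the midpoints and frequencies are copied from the table.

```python
# Midpoints and frequencies of the five class intervals (2000-3000 ... 6000-7000).
midpoints = [2500, 3500, 4500, 5500, 6500]
frequencies = [2, 5, 6, 4, 3]

def grouped_mean(xs, fs):
    """Weighted mean: sum(f * x) / sum(f)."""
    return sum(f * x for x, f in zip(xs, fs)) / sum(fs)

def grouped_mean_deviation(xs, fs):
    """Frequency-weighted average of absolute deviations from the grouped mean."""
    mean = grouped_mean(xs, fs)
    return sum(f * abs(x - mean) for x, f in zip(xs, fs)) / sum(fs)

mean = grouped_mean(midpoints, frequencies)
mean_dev = grouped_mean_deviation(midpoints, frequencies)
print(mean, mean_dev)
```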


Variance and Standard Deviation

While calculating the mean deviation, the absolute values of the deviations of the observations from the mean were taken because, without doing so, the total deviation was zero for the data comprising the values 1, 2 and 3, even though there was variation present among these observations.

However, another way of getting over this problem of the total deviation being zero is to take the squares of the deviations of the observations from the mean.

| | $x_i$ | $x_i - \bar{x}$ | $(x_i - \bar{x})^2$ |
|---|---|---|---|
| | 1 | -1 | 1 |
| | 2 | 0 | 0 |
| | 3 | +1 | 1 |
| Sum | 6 | 0 | 2 |
| Mean | 2 | 0 | 2/3 (= 0.67) |


Calculation of variance and standard deviation for ungrouped data

$$\text{Variance } (\sigma^2) = \frac{1}{n} \sum (x_i - \bar{x})^2 = \frac{1}{3} \times 2 = 0.67$$

The square root of $\sigma^2$, i.e. $\sigma$, is known as the standard deviation.

$$\text{Standard Deviation } (\sigma) = \sqrt{0.67} = 0.82$$


Calculation of Variance and Standard Deviation for Grouped Data

| Class Interval | Mid Point of Class Interval ($x_i$) | Frequency ($f_i$) | $f_i x_i$ | $f_i x_i^2$ | $x_i - \bar{x}$ | $(x_i - \bar{x})^2$ | $f_i (x_i - \bar{x})^2$ |
|---|---|---|---|---|---|---|---|
| 2000-3000 | 2500 | 2 | 5000 | 12500000 | -2050 | 4202500 | 8405000 |
| 3000-4000 | 3500 | 5 | 17500 | 61250000 | -1050 | 1102500 | 5512500 |
| 4000-5000 | 4500 | 6 | 27000 | 121500000 | -50 | 2500 | 15000 |
| 5000-6000 | 5500 | 4 | 22000 | 121000000 | 950 | 902500 | 3610000 |
| 6000-7000 | 6500 | 3 | 19500 | 126750000 | 1950 | 3802500 | 11407500 |
| Sum | | 20 | 91000 | 443000000 | | 10012500 | 28950000 |

Average ($\bar{x}$) = 91000 / 20 = 4550

Variance = 28950000 / 20 = 1447500


$$\text{S.D.} = \sqrt{\text{Variance}} = \sqrt{1447500} = 1203.12$$
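The grouped variance and standard deviation can be checked with a short sketch. It computes the variance two ways: directly from the squared deviations, and from the $f_i x_i^2$ column using the algebraically equivalent form $\sum f_i x_i^2 / \sum f_i - \bar{x}^2$. The variable names are illustrative only.

```python
import math

# Midpoints and frequencies copied from the table.
midpoints = [2500, 3500, 4500, 5500, 6500]
frequencies = [2, 5, 6, 4, 3]

total = sum(frequencies)
mean = sum(f * x for x, f in zip(midpoints, frequencies)) / total

# Direct form: frequency-weighted mean of squared deviations.
variance = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, frequencies)) / total

# Equivalent form using the f * x^2 column.
variance_alt = sum(f * x * x for x, f in zip(midpoints, frequencies)) / total - mean ** 2

std_dev = math.sqrt(variance)
print(mean, variance, variance_alt, round(std_dev, 2))
```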


Combining Variances of Two Populations

The mean and S.D. of the "lives" of tyres manufactured by two factories of the "Durable" tyre company, each making 50,000 tyres annually, are given below. Calculate the mean and standard deviation of all the 100,000 tyres produced in a year.

| Group | Mean ('000 kms.) | S.D. ('000 kms.) |
|---|---|---|
| 1 | 60 | 8 |
| 2 | 55 | 7 |


We know that if there is one set of data having $n_1$ observations with mean $m_1$ and s.d. $\sigma_1$, and another set of data having $n_2$ observations with mean $m_2$ and s.d. $\sigma_2$, then the mean ($m$) and variance ($\sigma^2$) of the combined data with $(n_1 + n_2)$ observations are given as

$$m = \frac{n_1 m_1 + n_2 m_2}{n_1 + n_2}$$

$$\sigma^2 = \frac{n_1 (\sigma_1^2 + d_1^2) + n_2 (\sigma_2^2 + d_2^2)}{n_1 + n_2}$$

where

$$d_1 = m_1 - m, \qquad d_2 = m_2 - m$$

and $m$ is the combined mean of both the sets of data.


Factory 1: $n_1 = 50$, $m_1 = 60$ and $\sigma_1 = 8$

Factory 2: $n_2 = 50$, $m_2 = 55$ and $\sigma_2 = 7$

(Here $n_1$ and $n_2$ are in thousands of tyres; only their ratio affects the result.)

Substituting these values in the above formulas:

$$\text{Mean} = \frac{(50 \times 60) + (50 \times 55)}{50 + 50} = \frac{3000 + 2750}{100} = \frac{5750}{100} = 57.5$$

Thus the mean life of the tyres manufactured by the company is 57,500 kms.


Therefore,

$$d_1 = m_1 - m = 60 - 57.5 = 2.5$$

$$d_2 = m_2 - m = 55 - 57.5 = -2.5$$

$$\text{Variance } (\sigma^2) = \frac{50 \times (8^2 + 2.5^2) + 50 \times (7^2 + 2.5^2)}{50 + 50} = \frac{(50 \times 70.25) + (50 \times 55.25)}{100} = \frac{3512.5 + 2762.5}{100} = \frac{6275}{100} = 62.75$$


Variance ($\sigma^2$) = 62.75

Therefore

$$\text{S.D. } (\sigma) = \sqrt{62.75} = 7.92$$

Thus, the S.D. of the lives of tyres produced by the company is 7,920 kms.
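The combination formulas can be sketched as a small function. The name `combine_groups` and its argument order are assumptions made for illustration; the inputs are the two factories' figures in thousands.

```python
import math

def combine_groups(n1, m1, s1, n2, m2, s2):
    """Combined mean, variance and s.d. of two groups, given each
    group's size, mean and standard deviation."""
    n = n1 + n2
    m = (n1 * m1 + n2 * m2) / n
    d1, d2 = m1 - m, m2 - m  # shift of each group mean from the combined mean
    variance = (n1 * (s1 ** 2 + d1 ** 2) + n2 * (s2 ** 2 + d2 ** 2)) / n
    return m, variance, math.sqrt(variance)

# Factory 1: 50 thousand tyres, mean 60, s.d. 8; Factory 2: 50 thousand, mean 55, s.d. 7.
mean, variance, std_dev = combine_groups(50, 60, 8, 50, 55, 7)
print(mean, variance, round(std_dev, 2))
```

Note that the combined variance exceeds the simple average of the two variances (56.5) because the group means themselves differ; the $d_1^2$ and $d_2^2$ terms account for that spread.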


Mean Deviation

The mean deviation is defined as

$$\text{Mean Deviation} = \frac{\sum f_i \lvert x_i - \bar{x} \rvert}{\sum f_i}$$

where

$x_i$ is the middle point of the $i$th class interval,

$f_i$ is the frequency of the $i$th class interval, and

$\bar{x}$ is the arithmetic mean of the I.Q. scores.


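| Class Interval | Frequency ($f_i$) | Mid Point of Class Interval ($x_i$) | $f_i x_i$ | $\lvert x_i - \bar{x} \rvert$ | $f_i \lvert x_i - \bar{x} \rvert$ |
|---|---|---|---|---|---|
| 40-50 | 10 | 45 | 450 | 26.5 | 265.0 |
| 50-60 | 20 | 55 | 1100 | 16.5 | 330.0 |
| 60-70 | 20 | 65 | 1300 | 6.5 | 130.0 |
| 70-80 | 15 | 75 | 1125 | 3.5 | 52.5 |
| 80-90 | 15 | 85 | 1275 | 13.5 | 202.5 |
| 90-100 | 20 | 95 | 1900 | 23.5 | 470.0 |
| Summation | 100 | - | 7150 | | 1450.0 |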


From this data, we get

$$\text{Mean} = \frac{\sum f_i x_i}{\sum f_i} = \frac{7150}{100} = 71.5$$

$$\text{Mean Deviation} = \frac{\sum f_i \lvert x_i - \bar{x} \rvert}{\sum f_i} = \frac{1450}{100} = 14.5$$

Thus the average score is 71.5 and the mean deviation of the scores is 14.5.


Suppose we are required to calculate only the standard deviation for the above data; then the table is constructed as below.

| Class Interval | Frequency ($f_i$) | Mid Point of Class Interval ($x_i$) | $f_i x_i$ | $x_i^2$ | $f_i x_i^2$ |
|---|---|---|---|---|---|
| 40-50 | 10 | 45 | 450 | 2025 | 20250 |
| 50-60 | 20 | 55 | 1100 | 3025 | 60500 |
| 60-70 | 20 | 65 | 1300 | 4225 | 84500 |
| 70-80 | 15 | 75 | 1125 | 5625 | 84375 |
| 80-90 | 15 | 85 | 1275 | 7225 | 108375 |
| 90-100 | 20 | 95 | 1900 | 9025 | 180500 |
| Summation | 100 | | 7150 | | 538500 |


$$\text{Mean} = \frac{\sum f_i x_i}{\sum f_i} = \frac{7150}{100} = 71.5$$

$$\text{S.D.} = \sqrt{\frac{\sum f_i x_i^2 - \left(\sum f_i\right) \bar{x}^2}{\sum f_i}} = \sqrt{\frac{538500 - 100 \times (71.5)^2}{100}} = \sqrt{272.75} = 16.5$$

Thus, the s.d. of the I.Q. scores is 16.5.
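As a check on the arithmetic, the column totals and the standard deviation for the I.Q. data can be recomputed with the sketch below. It follows the formula that uses $\sum f_i x_i^2$; the variable names are illustrative only.

```python
import math

# Midpoints and frequencies of the six I.Q. class intervals (40-50 ... 90-100).
midpoints = [45, 55, 65, 75, 85, 95]
frequencies = [10, 20, 20, 15, 15, 20]

n = sum(frequencies)
sum_fx = sum(f * x for x, f in zip(midpoints, frequencies))
sum_fx2 = sum(f * x * x for x, f in zip(midpoints, frequencies))

mean = sum_fx / n
# Variance from the f * x^2 column: (sum f x^2 - n * mean^2) / n.
variance = (sum_fx2 - n * mean ** 2) / n
std_dev = math.sqrt(variance)
print(sum_fx, sum_fx2, mean, variance, round(std_dev, 1))
```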


Coefficient of Variation

It is a relative measure of dispersion that enables us to compare two distributions.

It relates the standard deviation and the mean by expressing the standard deviation as a percentage of the mean.

$$\text{C.V.} = \frac{\sigma}{\bar{x}} \times 100$$


Example

For the data

103, 50, 68, 110, 105, 108, 174, 103, 150, 200, 225, 350, 103

find the range, the coefficient of range and the coefficient of quartile deviation.

1) Range = H - L = 350 - 50 = 300

2) Coefficient of range:

$$\frac{H - L}{H + L} = \frac{300}{350 + 50} = 0.75$$


To find Q1 and Q3 we arrange the data in ascending order:

50, 68, 103, 103, 103, 105, 108, 110, 150, 174, 200, 225, 350

With n = 13 observations, the positions of Q1 and Q3 are

$$\frac{n + 1}{4} = \frac{14}{4} = 3.5, \qquad \frac{3(n + 1)}{4} = 10.5$$

so Q1 lies halfway between the 3rd and 4th values, and Q3 halfway between the 10th and 11th values:

Q1 = 103 + 0.5 (103 - 103) = 103

Q3 = 174 + 0.5 (200 - 174) = 187


$$\text{Coefficient of QD} = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{187 - 103}{187 + 103} = \frac{84}{290} \approx 0.2897$$
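The range-based and quartile-based coefficients for this example can be sketched as follows. The helper `quantile_by_position` is hypothetical; it implements the $(n + 1)p$ position rule with linear interpolation used in the worked example, which is only one of several quartile conventions in use, so library routines may give different values.

```python
def quantile_by_position(values, p):
    """Value at 1-based position p * (n + 1) in the sorted data,
    interpolating linearly between neighbouring observations."""
    ordered = sorted(values)
    n = len(ordered)
    position = p * (n + 1)
    lower = int(position)          # whole part of the position
    fraction = position - lower    # fractional part used for interpolation
    if lower < 1 or lower >= n:
        raise ValueError("position falls outside the interior of the data")
    below, above = ordered[lower - 1], ordered[lower]
    return below + fraction * (above - below)

data = [103, 50, 68, 110, 105, 108, 174, 103, 150, 200, 225, 350, 103]

high, low = max(data), min(data)
data_range = high - low
coefficient_of_range = data_range / (high + low)

q1 = quantile_by_position(data, 0.25)
q3 = quantile_by_position(data, 0.75)
coefficient_of_qd = (q3 - q1) / (q3 + q1)

print(data_range, coefficient_of_range, q1, q3, round(coefficient_of_qd, 4))
```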


Example

A purchasing agent obtained a sample of incandescent lamps from each of two suppliers. He had the samples tested in his laboratory for length of life, with the following results.

| Length of life in hours | Sample A | Sample B |
|---|---|---|
| 700 - 900 | 10 | 3 |
| 900 - 1100 | 16 | 42 |
| 1100 - 1300 | 26 | 12 |
| 1300 - 1500 | 8 | 3 |

Which company's lamps are more uniform?


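Sample A

| Class interval | Sample A ($f$) | Midpoint $x$ | $u = \dfrac{x - 1000}{200}$ | $f u$ | $f u^2$ |
|---|---|---|---|---|---|
| 700 - 900 | 10 | 800 | -1 | -10 | 10 |
| 900 - 1100 | 16 | 1000 | 0 | 0 | 0 |
| 1100 - 1300 | 26 | 1200 | 1 | 26 | 26 |
| 1300 - 1500 | 8 | 1400 | 2 | 16 | 32 |
| Total | 60 | | | 32 | 68 |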


$$\bar{u}_A = \frac{32}{60} = 0.533$$

$$\bar{x}_A = 1000 + 200\,\bar{u}_A = 1000 + 200 \times 0.533 = 1106.67$$

$$\sigma_u^2 = \frac{1}{N} \sum f u^2 - (\bar{u})^2 = \frac{68}{60} - (0.533)^2 = 1.1333 - 0.2844 = 0.8489$$

$$\sigma_u = \sqrt{0.8489} = 0.9214$$

$$\sigma_x = 200 \times 0.9214 = 184.27$$

$$\text{C.V. for sample A} = \frac{\sigma_A}{\bar{x}_A} \times 100 = \frac{184.27}{1106.67} \times 100 = 16.65\,\%$$
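The step-deviation calculation for Sample A can be sketched in Python. The function name `step_deviation_summary` is hypothetical; the origin 1000 and class width 200 are the values used in the substitution $u = (x - 1000)/200$. The code carries full floating-point precision, so its last digits can differ slightly from a hand calculation that rounds intermediate steps.

```python
import math

def step_deviation_summary(midpoints, frequencies, origin, width):
    """Mean, s.d. and coefficient of variation computed through the
    coded variable u = (x - origin) / width."""
    n = sum(frequencies)
    us = [(x - origin) / width for x in midpoints]
    u_bar = sum(f * u for u, f in zip(us, frequencies)) / n
    var_u = sum(f * u * u for u, f in zip(us, frequencies)) / n - u_bar ** 2
    mean_x = origin + width * u_bar       # undo the coding for the mean
    sd_x = width * math.sqrt(var_u)       # s.d. scales with the class width
    cv = 100 * sd_x / mean_x
    return mean_x, sd_x, cv

midpoints = [800, 1000, 1200, 1400]
sample_a = [10, 16, 26, 8]

mean_a, sd_a, cv_a = step_deviation_summary(midpoints, sample_a, 1000, 200)
print(round(mean_a, 2), round(sd_a, 2), round(cv_a, 2))
```

The same function can be applied to Sample B's frequencies; the sample with the smaller coefficient of variation is the more uniform one.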


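Sample B

| Class interval | Sample B ($f$) | Midpoint $x$ | $u = \dfrac{x - 1000}{200}$ | $f u$ | $f u^2$ |
|---|---|---|---|---|---|
| 700 - 900 | 3 | 800 | -1 | -3 | 3 |
| 900 - 1100 | 42 | 1000 | 0 | 0 | 0 |
| 1100 - 1300 | 12 | 1200 | 1 | 12 | 12 |
| 1300 - 1500 | 3 | 1400 | 2 | 6 | 12 |
| Total | 60 | | | 15 | 27 |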