Notes 04 Measure of Central Tendency


  • 7/29/2019 Notes 04 Measure of Central Tendency


    Statistical Tools for Managers 56

    Chapter 4

    Measure of Central Tendency

    4.1.1. Properties of a Good Measure of Central Tendency

    A good measure of central tendency should possess as far as possible the following properties:

    a. Easy to understand.

    b. Simple to compute.

    c. Based on all observations.

    d. Uniquely defined.

    e. Possibility of further algebraic treatment.

    f. Not unduly affected by extreme values.

    4.1.2. Common Measures of Central Tendency

    There are three common measures of central tendency:

    a. Mean. The average value.

    b. Median. The middle value.

    c. Mode. The most frequently occurring value.

    4.2. Mean

    There are three types of mean:

    a. Arithmetic Mean (AM).

    b. Geometric Mean (GM).

    c. Harmonic Mean (HM).

    4.2.1. Simple Arithmetic Mean

    4.2.1.1. Simple Arithmetic Mean for Ungrouped Data (AM)

    $$\bar{X} = \frac{x_1 + x_2 + x_3 + \dots + x_N}{N} = \frac{\sum_{i=1}^{N} x_i}{N}$$

    There is a shortcut method of calculation based on a simple fact: if a constant is subtracted from or added to every data point, the arithmetic mean (AM) is reduced or increased by that same amount. Thus,

    $$\bar{X} = A + \frac{\sum_{i=1}^{N} d_i}{N}$$

    Where, A = Arbitrarily selected constant value (assumed mean).

    $d_i = x_i - A$ = Deviation of each observation from the assumed mean.

    N = Number of observations.


    Note that when the assumed mean A is exactly equal to the arithmetic mean $\bar{X}$, the algebraic sum of all deviations is zero. In other words, the algebraic sum of the deviations of all observations about the arithmetic mean is zero:

    $$\sum_{i=1}^{N} d_i = \sum_{i=1}^{N} (x_i - \bar{X}) = 0$$
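The direct formula, the assumed-mean shortcut and the zero-sum property of deviations can be checked numerically. The following is a minimal sketch in Python; the data values and the choice A = 18 are hypothetical and serve only as an illustration.

```python
# Hypothetical observations (not from the text), used only for illustration.
data = [12, 15, 20, 22, 31]
N = len(data)

# Direct method: sum of the observations divided by their number.
mean_direct = sum(data) / N

# Shortcut method: choose an assumed mean A, average the deviations
# d_i = x_i - A, then add A back.
A = 18  # arbitrarily selected constant
deviations = [x - A for x in data]
mean_shortcut = A + sum(deviations) / N

# Deviations taken about the true mean sum to zero (up to rounding error).
sum_dev_about_mean = sum(x - mean_direct for x in data)

print(mean_direct, mean_shortcut, sum_dev_about_mean)
```

Any other value of A should give the same mean; only the size of the intermediate numbers changes.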

    4.2.1.2. Simple Arithmetic Mean for Grouped Data

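    For grouped data, each class is represented by its class mark $m_i$ (the mid-point of the class), and the class frequency $f_i$ acts as its weight. The arithmetic mean is the weighted average of the class marks: the sum of the products $m_i f_i$ divided by the total number of observations N (the sum of all frequencies). Thus, for grouped data with n classes,

    $$\bar{X} = \frac{\sum_{i=1}^{n} m_i f_i}{\sum_{i=1}^{n} f_i} = \frac{\sum_{i=1}^{n} m_i f_i}{N}$$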

    Example 2:

    From the following data, compute the arithmetic mean by the direct method, the shortcut method and the step deviation method.

    Marks             0-10   10-20   20-30   30-40   40-50   50-60
    No. of students      5      10      25      30      20      10

    Solution:

    Let the assumed mean be A = 35 and the step size be h = 10.

    Calculation Table

    Marks   Class mark   No. of students   m_i f_i   Deviation       f_i d_i   Step deviation       f_i d'_i
            (m_i)        (f_i)                       d_i = m_i - A             d'_i = (m_i - A)/h
    0-10    5            5                 25        -30             -150      -3                   -15
    10-20   15           10                150       -20             -200      -2                   -20
    20-30   25           25                625       -10             -250      -1                   -25
    30-40   35           30                1050      0               0         0                    0
    40-50   45           20                900       10              200       1                    20
    50-60   55           10                550       20              200       2                    20
    Total                100               3300                      -200                           -20

    a. Direct Method:

    $$\bar{X} = \frac{\sum_{i=1}^{6} m_i f_i}{\sum_{i=1}^{6} f_i} = \frac{3300}{100} = 33$$


    b. Shortcut Method:

    $$\bar{X} = A + \frac{\sum_{i=1}^{6} f_i d_i}{\sum_{i=1}^{6} f_i} = 35 + \frac{-200}{100} = 35 - 2 = 33$$

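    c. Step Deviation Method:

    $$\bar{X} = A + h \times \frac{\sum_{i=1}^{6} f_i d'_i}{\sum_{i=1}^{6} f_i} = 35 + 10 \times \frac{-20}{100} = 35 - 2 = 33$$

    where $d'_i = (m_i - A)/h$ is the step deviation of each class mark.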

    Note: The answer is the same irrespective of the method used.
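The three methods of Example 2 can be reproduced in a few lines. The sketch below is only an illustration in Python (the variable names are our own, not part of the original notes); it uses the class intervals, frequencies, assumed mean and step size given in the example.

```python
# Class intervals and frequencies from Example 2.
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
freqs = [5, 10, 25, 30, 20, 10]

A = 35   # assumed mean
h = 10   # step size (class width)

mids = [(lo + hi) / 2 for lo, hi in classes]   # class marks m_i
N = sum(freqs)                                  # total frequency

# Direct method: sum(m_i * f_i) / N
direct = sum(m * f for m, f in zip(mids, freqs)) / N

# Shortcut method: A + sum(f_i * d_i) / N, with d_i = m_i - A
shortcut = A + sum(f * (m - A) for m, f in zip(mids, freqs)) / N

# Step deviation method: A + h * sum(f_i * d'_i) / N, with d'_i = (m_i - A) / h
step = A + h * sum(f * (m - A) / h for m, f in zip(mids, freqs)) / N

print(direct, shortcut, step)  # all three methods give 33.0
```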

    4.3.1.6. Merits of Arithmetic Mean

    a. Easy to understand and calculate.

    b. Takes all values into account.

    c. Lends itself to further mathematical treatment.

    d. Since sum of all deviations from Arithmetic mean is zero, it is a point of balance or

    center of gravity.

    e. Sum of the squared deviations from arithmetic mean is always the minimum.

    4.3.1.7. Limitations of Arithmetic Mean

    a. Affected significantly by extreme values.

    b. Cannot be computed for open-end class distribution without some assumptions.

    c. May give fallacious conclusions if we depend totally on Arithmetic mean for decision-

    making.

    d. Cannot be determined by inspection or graphically.

    4.3.1.8. Arithmetic Mean of Combined Data

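    If n groups of sizes $N_1, N_2, \dots, N_n$ have arithmetic means $\bar{X}_1, \bar{X}_2, \dots, \bar{X}_n$, the arithmetic mean of the combined data is

    $$\bar{X}_{12 \dots n} = \frac{N_1 \bar{X}_1 + N_2 \bar{X}_2 + \dots + N_n \bar{X}_n}{N_1 + N_2 + \dots + N_n}$$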
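As an illustration of the combined mean, the sketch below pools two groups. The function name `combined_mean` and the group sizes and means are hypothetical, chosen only to show the arithmetic.

```python
def combined_mean(sizes, means):
    """Mean of the pooled data, given group sizes N_i and group means."""
    total = sum(n * m for n, m in zip(sizes, means))
    return total / sum(sizes)

# Hypothetical example: two sections with 40 and 60 students
# whose mean marks are 50 and 60 respectively.
print(combined_mean([40, 60], [50, 60]))  # 56.0
```

Note that the result (56) is closer to the mean of the larger group; a simple average of the two group means (55) would be wrong here.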

    4.3.2. Weighted Arithmetic Mean

    There are cases where relative importance of the different items is not the same. In such a case, we need

    to compute the weighted arithmetic mean. The procedure is similar to the grouped data calculations

    studied earlier, when we consider frequency as a weight associated with the class-mark. Now suppose the

    data values are $x_1, x_2, x_3, \dots, x_n$ and the associated weights are $W_1, W_2, W_3, \dots, W_n$; then the weighted arithmetic mean is:

    Direct Method

    $$\bar{X}_w = \frac{W_1 x_1 + W_2 x_2 + \dots + W_n x_n}{W_1 + W_2 + \dots + W_n} = \frac{\sum_{i=1}^{n} W_i x_i}{\sum_{i=1}^{n} W_i}$$

    4.3.2.1. Utility of Weighted Mean


    Some of the common applications where weighted mean is extensively used are: -

    a. Construction of index numbers, e.g. the Consumer Price Index, the BSE Sensex, etc., where different weights are associated with different items or shares.

    b. Comparison of the results of two companies when their sizes are different.

    c. Computation of standardized death and birth rates.

    Example 4: Pune University MBA [2770]-104

    The management of hotel has employed 2 managers, 5 cooks and 8 waiters. The monthly

    salaries of the manager, the cook and waiter are Rs. 3000, Rs. 1200 and Rs. 1000 respectively.

    Find the mean salary of the employees. (Note: These salary figures are some 10 to 15 years old; we use them only to illustrate the principle.)

    Solution: Here we need to calculate the weighted average of the salaries, with the number of employees in each category as the weights.

    $$\bar{X}_w = \frac{W_1 x_1 + W_2 x_2 + \dots + W_n x_n}{W_1 + W_2 + \dots + W_n} = \frac{2 \times 3000 + 5 \times 1200 + 8 \times 1000}{2 + 5 + 8} = \frac{20000}{15} = \text{Rs. } 1333.33$$
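The same calculation can be sketched in Python. The helper name `weighted_mean` is our own; the weights are the head-counts of each category of employee, as in the arithmetic above.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(W_i * x_i) / sum(W_i)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

salaries = [3000, 1200, 1000]   # manager, cook, waiter (Rs. per month)
staff = [2, 5, 8]               # number of employees in each category

mean_salary = weighted_mean(salaries, staff)
print(round(mean_salary, 2))  # 1333.33
```

A simple (unweighted) average of the three salary levels would give Rs. 1733.33, which overstates the typical salary because it ignores that most employees are waiters.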

    4.3.3. Geometric Mean (GM)

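    It is defined as the nth root of the product of n values of data. If $x_1, x_2, \dots, x_n$ are the values of data, then the Geometric Mean is

    $$GM = \sqrt[n]{x_1 \times x_2 \times \dots \times x_n}$$

    If the different values are not of equal importance and are assigned different weights, say $w_1, w_2, \dots, w_n$, then the weighted Geometric Mean is given by

    $$GM_w = \left( x_1^{w_1} \times x_2^{w_2} \times \dots \times x_n^{w_n} \right)^{1 / (w_1 + w_2 + \dots + w_n)}$$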

    Geometric Mean is useful to find the average % increase in sales, production, population, etc. It is the

    most representative average in the construction of index numbers.

    Example 5:

    A person takes a home loan for a 10-year term at a floating interest rate on the reducing balance. The interest rates, as they changed from year to year, were (in percent) 5.5, 6.25, 7.5, 6.75, 8.25, 9.5, 10.5, 9, 8.25 and 7.5. What was the average interest rate? Would it have been beneficial for him to take a fixed interest rate of 7.5% per annum on the reducing balance?

    Solution:

    The average interest rate can be found using the G.M. as follows. First we find the index for each year by dividing the % rate by 100 and adding 1. Then we take the G.M. of these indices as the average index. From it we can find the average interest rate.

    $$\text{Average index (G.M.)} = \sqrt[10]{1.055 \times 1.0625 \times 1.075 \times 1.0675 \times 1.0825 \times 1.095 \times 1.105 \times 1.09 \times 1.0825 \times 1.075} = \sqrt[10]{2.137} = 1.0789$$

    Thus, Average Interest Rate = 7.89%

    Hence it would have been beneficial for him to take the fixed interest rate of 7.5% per annum on the reducing balance, since 7.5% is lower than the average floating rate of 7.89%.
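The index-based calculation of Example 5 can be checked with a short script. This is a minimal sketch in Python (3.8 or later, for `math.prod`); the variable names are our own.

```python
import math

rates = [5.5, 6.25, 7.5, 6.75, 8.25, 9.5, 10.5, 9, 8.25, 7.5]  # % per year

# Convert each rate to a growth index, e.g. 5.5% -> 1.055.
indices = [1 + r / 100 for r in rates]

product = math.prod(indices)           # product of the ten indices
gm = product ** (1 / len(indices))     # 10th root = geometric mean
avg_rate = (gm - 1) * 100              # back to a percentage

print(round(product, 3), round(gm, 4), round(avg_rate, 2))  # 2.137 1.0789 7.89
```

For comparison, the arithmetic mean of the ten rates is 7.9%; the geometric mean of the indices is slightly lower, as it always is when the values are not all equal.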

    4.3.4. Harmonic Mean (HM)


    It is defined as the reciprocal of the arithmetic mean of the reciprocals of the individual observations. Thus the Harmonic Mean is

    $$HM = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \dots + \frac{1}{x_n}} = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$$

    Example 6:

    A relay team has four members who have to drive four laps between two fixed points. Average speeds

    that the members can achieve in Km/hr are 280, 360, 380 and 310. Find average speed of the team to

    complete the event.

    Solution:

    The average speed can be calculated as the Harmonic Mean (HM). Thus, the average speed of the team is

    $$HM = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \dots + \frac{1}{x_n}} = \frac{4}{\frac{1}{280} + \frac{1}{360} + \frac{1}{380} + \frac{1}{310}} = 327.69 \text{ km/hr}$$
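The harmonic mean of Example 6 can be verified both from the definition and with Python's standard library. This is only an illustrative sketch; it assumes, as the example does, that all four laps are of equal length.

```python
from statistics import harmonic_mean

speeds = [280, 360, 380, 310]  # km/hr, one per lap of equal length

# Definition: reciprocal of the arithmetic mean of the reciprocals.
hm_manual = len(speeds) / sum(1 / s for s in speeds)

# The standard library function gives the same value.
hm_library = harmonic_mean(speeds)

print(round(hm_manual, 2), round(hm_library, 2))  # 327.69 327.69
```

The arithmetic mean of the four speeds is 332.5 km/hr, which would overstate the team's average speed because more time is spent on the slower laps.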

    4.3.5. Weighted Harmonic Mean

    If a weight is attached to each observation, then the weighted Harmonic Mean is

    $$HM_w = \frac{w_1 + w_2 + \dots + w_n}{\frac{w_1}{x_1} + \frac{w_2}{x_2} + \dots + \frac{w_n}{x_n}} = \frac{\sum_{i=1}^{n} w_i}{\sum_{i=1}^{n} \frac{w_i}{x_i}}$$

    The Harmonic Mean is useful in computing the average rate of increase in profits, the average speed of a journey, the average price of articles sold, etc. For example, if an airplane travels distances $w_1, w_2, w_3, \dots, w_n$ at speeds $x_1, x_2, x_3, \dots, x_n$ km/hr respectively, then the average speed is equal to the weighted Harmonic Mean of the speeds, with the distances $w_1, w_2, w_3, \dots, w_n$ as weights.

    Example 7:

    An aircraft travels 200 km up to the border at a speed of 700 km/hr (economical), then 250 km up to the target in enemy territory at a speed of 950 km/hr. After dropping the bombs it travels at a runaway speed of 1700 km/hr for 150 km up to our nearest border, and then at a speed of 800 km/hr to the base, a distance of 300 km. Find the average speed of the sortie. Also find the mission time.

    Solution:

    For the average speed, we need to find the weighted Harmonic Mean. Thus the average sortie speed is

    $$HM_w = \frac{w_1 + w_2 + \dots + w_n}{\frac{w_1}{x_1} + \frac{w_2}{x_2} + \dots + \frac{w_n}{x_n}} = \frac{200 + 250 + 150 + 300}{\frac{200}{700} + \frac{250}{950} + \frac{150}{1700} + \frac{300}{800}} = \frac{900}{1.01211} = 889.23 \text{ km/hr}$$

    Mission time = total distance / average speed = 900 / 889.23 = 1.012 hr, i.e. approximately 1 hr.
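The weighted harmonic mean is simply total distance divided by total time, which the following sketch makes explicit. It is an illustration in Python using the distances and speeds of Example 7; the variable names are our own.

```python
distances = [200, 250, 150, 300]   # km for each leg (the weights)
speeds = [700, 950, 1700, 800]     # km/hr on each leg

# Time on each leg is distance / speed; their sum is the mission time.
mission_time = sum(d / s for d, s in zip(distances, speeds))

# Weighted harmonic mean of the speeds = total distance / total time.
avg_speed = sum(distances) / mission_time

print(round(avg_speed, 2), round(mission_time, 3))  # 889.23 1.012
```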

    4.4. Median (Md)

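    The median is the middle value of the data when the observations are arranged in ascending or descending order. If the number of observations N is odd,

    $$M_d = \left( \frac{N + 1}{2} \right)^{\text{th}} \text{ observation.}$$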


    If the number of observations is even, then the median is the arithmetic mean of the two middle observations.

    $$M_d = \frac{\left( \frac{N}{2} \right)^{\text{th}} \text{observation} + \left( \frac{N}{2} + 1 \right)^{\text{th}} \text{observation}}{2}$$

    In case of grouped data we first find the value N/2. Then from the cumulative frequency we find the class in which the (N/2)th item falls. Such a class is called the Median Class. Then the median is calculated by the formula:

    $$M_d = L + \frac{\frac{N}{2} - pcf}{f} \times h$$

    Where, L = lower limit of Median class.

    N = Total Frequency.

    pcf = preceding cumulative frequency to the median class.

    f = frequency of median class.

    h = class interval of median class.

    Let us understand the logic of the formula. The median is the value of the (N/2)th observation. This observation falls in the median class, whose lower limit is L. The cumulative frequency of the class preceding the median class is pcf. Thus, the median observation is the (N/2 - pcf)th observation in the median class (counted from the lower limit of the median class). Now, if we assume that all f observations in the median class are evenly spaced from the lower limit L to the upper limit L + h, the value of the median can be found by simple proportion.

    Example 8:

    Calculate the median for the following data.

    Age              20-25   25-30   30-35   35-40   40-45   45-50   50-55   55-60
    No. of workers      14      28      33      30      20      15      13       7

    Solution:

    Age      Frequency (f)   Cumulative frequency (cf)
    20-25    14              14
    25-30    28              42
    30-35    33              75
    35-40    30              105
    40-45    20              125
    45-50    15              140
    50-55    13              153
    55-60    7               160

    Now, N = 160, so N/2 = 80.

    The 80th item lies in the class 35-40.

    Hence, pcf = 75, f = 30, h = 5 and L = 35.

    Therefore, the Median is

    $$M_d = L + \frac{\frac{N}{2} - pcf}{f} \times h = 35 + \frac{\frac{160}{2} - 75}{30} \times 5 = 35.83$$
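The steps of Example 8 (accumulate frequencies, locate the median class, interpolate inside it) can be sketched as a small function. This is a minimal Python illustration that assumes equal class widths; the function name `grouped_median` is our own.

```python
def grouped_median(lower_limits, freqs, h):
    """Median of grouped data with equal class width h.

    lower_limits: lower limit L of each class, in ascending order
    freqs: frequency f of each class
    """
    N = sum(freqs)
    half = N / 2
    pcf = 0  # cumulative frequency of the classes before the current one
    for L, f in zip(lower_limits, freqs):
        if pcf + f >= half:
            # Median class found: interpolate within it.
            return L + (half - pcf) / f * h
        pcf += f
    raise ValueError("no observations")

lowers = [20, 25, 30, 35, 40, 45, 50, 55]
workers = [14, 28, 33, 30, 20, 15, 13, 7]

print(round(grouped_median(lowers, workers, 5), 2))  # 35.83
```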

    4.4.1. Mathematical Properties of median

    a. An important mathematical property of the median is that the sum of the absolute deviations about the median is a minimum, i.e. $\sum |x_i - M_d|$ is minimum.

    b. The median is affected by the total number of observations rather than by the values of the observations.
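The minimum-absolute-deviation property can be illustrated numerically. The sketch below uses hypothetical data and a coarse grid of candidate centres; it is a demonstration, not a proof.

```python
from statistics import median

data = [2, 3, 5, 9, 21]  # hypothetical observations, for illustration only

def sum_abs_dev(c):
    """Sum of absolute deviations of the data about the point c."""
    return sum(abs(x - c) for x in data)

md = median(data)

# Compare the median against a grid of candidate centres 0.0, 0.1, ..., 25.0.
candidates = [c / 10 for c in range(0, 251)]
best = min(candidates, key=sum_abs_dev)

print(md, best, sum_abs_dev(md))  # 5 5.0 25
```

No other point on the grid gives a smaller total than the median does; the arithmetic mean of these data (8) gives a total of 28.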

    4.4.2. Merits of Median

    a. Easy to determine and easy to explain.

    b. Less distorted than arithmetic mean.

    c. Can be computed for open-end distribution.

    d. Median is the only measure of central Tendency that can be used for qualitative ranked

    data.

    4.4.3. Demerits of Median

    a. The data need to be arranged in order first. For a computer, sorting a large data set is an expensive operation.

    b. In case of an even number of observations, the median cannot be exactly determined.

    c. Less familiar than the arithmetic mean.

    d. Does not take into account the actual data values and their spread; it is insensitive to them.

    e. Not capable of algebraic treatment.

    4.5. Quantiles

    Quantiles are related positional measures of Central Tendency. These are useful and frequently employed measures. The most familiar quantiles are Quartiles, Deciles, and Percentiles. We are familiar with percentile scores in competitive aptitude tests and in the examinations of a few institutes. If your score is at the 90th percentile, it means that 90% of the candidates who took the test received a score lower than yours. Similarly, if your income is at the 95th percentile in your organisation, you are in the group of the top 5% highest-paid employees in your company.

    4.5.1. Percentile


    The Pth percentile of a group of observations is that observation below which lie P% (P percent) of the observations. The position of the Pth percentile in the ordered data is given by $\frac{(n + 1) P}{100}$, where n is the number of data points.

    Example 9:

    In a computerized entrance test 20 candidates appear on a particular day. Their scores are: 9, 6, 12, 10,

    13, 15, 16, 14, 14, 16, 17, 16, 24, 21, 22, 18, 19, 18, 20, 17. Find 80th and 90th percentiles of data.

    Solution

    First, we order the data in ascending order.

    6, 9, 10, 12, 13, 14, 14, 15, 16, 16, 16, 17, 17, 18, 18, 19, 20, 21, 22, 24.

    The 80th percentile of the data set is the observation lying in the position:

    $$\frac{(n + 1) P}{100} = \frac{(20 + 1) \times 80}{100} = 16.8$$

    Now, the 16th observation is 19 and the 17th observation is 20. Therefore the 80th percentile is the point lying 0.8 of the way from 19 to 20, which is 19.8.

    The 90th percentile is similarly found as the observation lying in the position:

    $$\frac{(n + 1) P}{100} = \frac{(20 + 1) \times 90}{100} = 18.9$$

    The 18th observation is 21 and the 19th observation is 22. Therefore the 90th percentile is the point lying 0.9 of the way from 21 to 22, which is 21.9.

    4.5.2. Quartile

    Example 10:

    In a computerized entrance test 20 candidates appear on a particular day. Their scores are: 9, 6, 12, 10,

    13, 15, 16, 14, 14, 16, 17, 16, 24, 21, 22, 18, 19, 18, 20, 17. Find the quartiles of data.

    Solution

    First, we order the data in ascending order.

    6, 9, 10, 12, 13, 14, 14, 15, 16, 16, 16, 17, 17, 18, 18, 19, 20, 21, 22, 24.

    a) The first quartile is the observation in position:

    $$\frac{(n + 1) \times 25}{100} = 5.25$$

    The value of the observation corresponding to the 5.25th position is 13.25.

    b) The second quartile, or median, is the observation in position:

    $$\frac{(n + 1) \times 50}{100} = 10.5$$

    The value of the observation corresponding to the 10.5th position is 16.

    c) The third quartile is the observation in position:

    $$\frac{(n + 1) \times 75}{100} = 15.75$$


    Value of the observation corresponding to 15.75th position is 18.75

    Note: The 0th quartile is the same as the 0th percentile, which is the minimum observation. Similarly, the 4th quartile is the 100th percentile, which equals the maximum observation.

    4.5.3. Deciles

    These are the values which divide the total number of observations into 10 equal parts. There are 11 deciles (including the 0th and the 10th). The method of calculating deciles is the same as for percentiles: we can use the percentile formula, substituting P by 10, 20, 30, etc. for the 1st, 2nd, 3rd, etc. deciles.
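The percentile, quartile and decile calculations of this section all use the same position rule, (n + 1)P/100, followed by linear interpolation. The sketch below is one possible Python implementation of that rule; the function name `percentile` is our own, and clamping the position to the range 1 to n (so that P = 0 and P = 100 return the minimum and maximum) is an assumption added for robustness. Library routines such as those in NumPy use other interpolation conventions by default and may give slightly different values.

```python
def percentile(data, p):
    """Value at position (n + 1) * p / 100 in the ordered data (1-based),
    with linear interpolation between neighbouring observations."""
    ordered = sorted(data)
    n = len(ordered)
    pos = (n + 1) * p / 100
    # Assumption: clamp to [1, n] so p = 0 and p = 100 give min and max.
    pos = min(max(pos, 1), n)
    k = int(pos)        # whole part: rank of the lower observation
    frac = pos - k      # fractional part used for interpolation
    if k == n:
        return ordered[-1]
    return ordered[k - 1] + frac * (ordered[k] - ordered[k - 1])

scores = [9, 6, 12, 10, 13, 15, 16, 14, 14, 16,
          17, 16, 24, 21, 22, 18, 19, 18, 20, 17]

# Quartiles (25, 50, 75) and the 80th and 90th percentiles.
for p in (25, 50, 75, 80, 90):
    print(p, round(percentile(scores, p), 2))
```

For the 20 test scores this reproduces the worked values 13.25, 16 and 18.75 for the quartiles and 19.8 and 21.9 for the 80th and 90th percentiles; deciles follow by passing p = 10, 20, 30 and so on.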

    4.6. Mode

    The mode of a data set is the value that occurs most frequently. There are many situations in which the arithmetic mean and the median fail to reveal the true characteristics of the data (the most representative figure), e.g. the most common size of shoes or the most common size of garments. In such cases the mode is the best-suited measure of central tendency. There could be multiple modal values, which occur with equal frequency. In some cases the mode may be absent. For grouped data, the modal class is defined as the class with the maximum frequency. Then the mode is calculated as:

    $$\text{Mode} = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times h$$

    Where,

    L = Lower limit of the modal class.

    $\Delta_1$ = Difference between the frequency of the modal class and that of the preceding class.

    $\Delta_2$ = Difference between the frequency of the modal class and that of the succeeding class.

    h = Size of the modal class.

    Example 11:

    In a computerized entrance test 20 candidates appear on a particular day. Their scores are: 9, 6, 12, 10,

    13, 15, 16, 14, 14, 16, 17, 16, 24, 21, 22, 18, 19, 18, 20, 17. Find the mode of the data.

    Solution:

    Now the value 16 occurs 3 times which is maximum for any observation. Therefore,

    Mode = 16

    Example 12:

    In a computerized entrance test 19 candidates appear on a particular day. Their scores are: 9, 6, 12, 10,

    13, 15, 14, 14, 16, 17, 16, 24, 21, 22, 18, 19, 18, 20, 17. Find the mode of the data.

    Solution:

    Now the values 14, 16, 17 and 18 occur 2 times which is maximum for any observation. Therefore,

    Modes are 14, 16, 17 and 18 (this is a multimodal distribution)

    Example 13:

    In a computerized entrance test 15 candidates appear on a particular day. Their scores are: 9, 6, 12, 10,

    13, 15, 14, 16, 24, 21, 22, 19, 18, 20, 17. Find the mode of the data.

    Solution:

    Now there is no value that occurs more than 1 time. Therefore, the data has no Mode.
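Examples 11 to 13 can be reproduced by counting how often each value occurs. The sketch below is a minimal Python illustration; the function name `modes` is our own, and it follows the convention of these examples that a data set in which no value repeats has no mode.

```python
from collections import Counter

def modes(data):
    """Return the sorted list of modes, or an empty list when no value
    repeats (the convention used in Examples 11 to 13)."""
    counts = Counter(data)
    top = max(counts.values())
    if top == 1:
        return []          # every value occurs once: no mode
    return sorted(v for v, c in counts.items() if c == top)

ex11 = [9, 6, 12, 10, 13, 15, 16, 14, 14, 16,
        17, 16, 24, 21, 22, 18, 19, 18, 20, 17]
ex12 = [9, 6, 12, 10, 13, 15, 14, 14, 16, 17,
        16, 24, 21, 22, 18, 19, 18, 20, 17]
ex13 = [9, 6, 12, 10, 13, 15, 14, 16, 24, 21, 22, 19, 18, 20, 17]

print(modes(ex11))  # [16]
print(modes(ex12))  # [14, 16, 17, 18]
print(modes(ex13))  # []
```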

    4.7. Relationship Among Mean, Median and Mode


    A distribution in which the mean, the median, and the mode coincide is known as a symmetrical (bell shaped) distribution. The Normal Distribution is one such symmetric distribution, which is very commonly used.

    If the distribution is skewed, the mean, the median and the mode are not equal. In a moderately skewed distribution, the distance between the mean and the median is approximately one third of the distance between the mean and the mode. This can be expressed as:

    Mean - Median = (Mean - Mode) / 3

    Mode = 3 * Median - 2 * Mean

    Thus, if we know values of two central tendencies, the third value can be approximately determined in

    any moderately skewed distribution. In any skewed distribution the median lies between the mean and

    mode.
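The expression for the mode follows from the one-third relation by simple rearrangement. Written out step by step in LaTeX:

```latex
\begin{aligned}
\text{Mean} - \text{Median} &= \tfrac{1}{3}\,(\text{Mean} - \text{Mode}) \\
3\,\text{Mean} - 3\,\text{Median} &= \text{Mean} - \text{Mode} \\
\text{Mode} &= 3\,\text{Median} - 2\,\text{Mean}
\end{aligned}
```

As a hypothetical illustration, if Mean = 50 and Median = 48 in a moderately skewed distribution, the mode is approximately 3 × 48 - 2 × 50 = 44. Being an empirical approximation, the relation should not be expected to hold exactly for real data.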

    In case of a right-skewed (positively skewed) distribution, which has a long right tail,

    Mode < Median < Mean


    Where, L = 80, lower limit of the median class.

    N = 1000, total frequency.

    pcf = 50 + x, preceding cumulative frequency to the median class.

    f = 500, frequency of the median class.

    h = 20, class interval of the median class.

    Thus,

    $$87 = 80 + \frac{500 - (50 + x)}{500} \times 20 \;\Rightarrow\; 7 \times 25 = 500 - (50 + x) \;\Rightarrow\; 50 + x = 325$$

    Or, x = 275

    Thus the missing frequency of class 60-79 is 275.

    Also, the frequency of the class 100-119 is (400 - x) = 125.

    ii) Since the highest frequency is in class 80-99, it is the modal class. Now,

    $$\text{Mode} = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times h$$

    Where,

    L = 80, lower limit of the modal class.

    $\Delta_1$ = 225, difference between the frequency of the modal class and that of the preceding class.

    $\Delta_2$ = 375, difference between the frequency of the modal class and that of the succeeding class.

    h = 20, size of the modal class.

    $$\text{Mode} = 80 + \frac{225}{225 + 375} \times 20 = 80 + 7.5 = 87.5$$
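The grouped-data mode formula can be wrapped in a small helper. The sketch below is an illustration in Python; the function name `grouped_mode` and its parameter names are our own. It is applied to the frequencies of the worked solution above (modal class 500, preceding class 275, succeeding class 125).

```python
def grouped_mode(L, f_modal, f_prev, f_next, h):
    """Mode of grouped data: L + d1 / (d1 + d2) * h."""
    d1 = f_modal - f_prev   # excess of the modal class over the preceding class
    d2 = f_modal - f_next   # excess of the modal class over the succeeding class
    return L + d1 / (d1 + d2) * h

# Modal class 80-99 with frequency 500; neighbours have 275 and 125.
print(grouped_mode(L=80, f_modal=500, f_prev=275, f_next=125, h=20))  # 87.5
```

The same helper can be used to check answers to grouped-data exercises: supply the lower limit and width of the modal class together with the three frequencies.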

    Example 20: JHU MBA [102] 2004

    The following data are scores on a management examination taken by a group of 22 people.

    88, 56, 64, 45, 52, 76, 54, 79, 38, 98, 69, 77, 71, 45, 60, 78, 90, 81, 87, 44, 80, 41

    Find the mean, median, standard deviation, and 60th percentile.

    Solution:

    Number of observations N = 22

    a) Mean:

    $$\bar{X} = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{88 + 56 + 64 + 45 + 52 + 76 + 54 + 79 + 38 + 98 + 69 + 77 + 71 + 45 + 60 + 78 + 90 + 81 + 87 + 44 + 80 + 41}{22} = \frac{1473}{22} = 66.9545$$

    b) For calculating median we need to arrange the data in ascending order as follows,

    38, 41, 44, 45, 45, 52, 54, 56, 60, 64, 69, 71, 76, 77, 78, 79, 80, 81, 87, 88, 90, 98

    Since the number of observations is even, hence the median,


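    $$M_d = \frac{\left( \frac{N}{2} \right)^{\text{th}} \text{observation} + \left( \frac{N}{2} + 1 \right)^{\text{th}} \text{observation}}{2} = \frac{11^{\text{th}} \text{ observation} + 12^{\text{th}} \text{ observation}}{2} = \frac{69 + 71}{2} = 70$$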

    c) Pth percentile = $\left( \frac{(n + 1) P}{100} \right)^{\text{th}}$ observation.

    60th percentile = $\left( \frac{(22 + 1) \times 60}{100} \right)^{\text{th}}$ observation = 13.8th observation.

    Since the position is fractional, we need to interpolate between the 13th and 14th observations.

    Now the 13th observation is 76 and the 14th observation is 77. Thus, by interpolating,

    60th percentile = 13.8th observation = 76 + 0.8 × (77 - 76) = 76.8

    4.9. Exercise

    = 58.89

    6. Calculate arithmetic mean and mode from the following:

    Monthly salary (Rs.)   400-600   600-800   800-1000   1000-1200   1200-1400
    Number of workers      4         10        12         6           2

    Ans: Mean = 852.94, Mode = 850 (Pune University BBA [2791]-203)