Statistical Analysis Session 2: Measures of Central Tendency

Page 1: Session 2 - Measures of Central Tendency

Statistical Analysis

Session 2: Measures of Central Tendency

Page 2: Session 2 - Measures of Central Tendency

Why do we need to consider measures of central tendency?

Frequency distributions and graphical representations of data fail to reveal three characteristics of a data set:

Central tendency: the numerical value around which most of the other observations in the data set tend to cluster or group

Variation: the extent to which numerical values are dispersed around the central value

Skewness: the extent to which numerical values depart from a symmetrical distribution around the central value

Page 3: Session 2 - Measures of Central Tendency

Objectives of average (central tendency)

It is useful for extracting and summarizing the characteristics of the entire data set.

Since the average represents the entire data set, it makes comparison between two or more data sets possible, e.g. the performance of a salesperson based on average sales over two months or two years.

It becomes the base for computing other measures such as dispersion, skewness, and kurtosis.

Page 4: Session 2 - Measures of Central Tendency

Measures of Central Tendency

Mathematical averages: arithmetic mean (simple or weighted), geometric mean, harmonic mean

Averages of position: median, quartiles, deciles, percentiles, mode

Page 5: Session 2 - Measures of Central Tendency

Arithmetic Mean - Direct Method

$$\bar{x} = \frac{x_1 + x_2 + \dots + x_n}{n} \qquad (1)$$

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \qquad (2)$$

Example 1: In a survey, the profits earned by five car manufacturing companies were 15, 20, 10, 35, and 32. Find the arithmetic mean of the profits earned.

Using formula (2) above, the arithmetic mean = (15 + 20 + 10 + 35 + 32)/5 = 112/5 = 22.4
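As a quick check of Example 1, the direct method can be sketched in a few lines of Python. The function name `arithmetic_mean` is illustrative and does not come from the slides.

```python
def arithmetic_mean(values):
    """Direct method: sum of the observations divided by their count."""
    if not values:
        raise ValueError("at least one observation is required")
    return sum(values) / len(values)


profits = [15, 20, 10, 35, 32]  # profits of the five companies in Example 1
mean_profit = arithmetic_mean(profits)
print(mean_profit)  # 22.4
```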

Page 6: Session 2 - Measures of Central Tendency

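Arithmetic Mean - Direct Method

Example 2: If A, B, C, and D are four chemicals costing Rs. 15, Rs. 12, Rs. 8, and Rs. 9 per 100 grams, and are contained in a given compound in the ratio 1:2:3:4 parts respectively, what should be the price of the resultant compound?

$$\bar{x}_w = \frac{\sum w_i x_i}{\sum w_i} = \frac{1(15) + 2(12) + 3(8) + 4(9)}{1 + 2 + 3 + 4} = \frac{99}{10} = \text{Rs. } 9.90 \text{ per 100 grams}$$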

Page 7: Session 2 - Measures of Central Tendency

Arithmetic Mean - Short-Cut Method

In this method an arbitrary assumed mean $A$ is taken as the basis for calculating the deviations $d_i = x_i - A$ of the individual values in the data set:

$$\bar{x} = A + \frac{\sum f_i d_i}{\sum f_i}$$

Page 8: Session 2 - Measures of Central Tendency

Arithmetic Mean - Indirect Method

Example 1: The daily earnings (in rupees) of employees working on a daily basis in a firm are given below. Calculate the average daily earnings for all employees.

| Daily earnings (Rs.) | 100 | 120 | 140 | 160 | 180 | 200 | 220 |
|---|---|---|---|---|---|---|---|
| Number of employees | 3 | 6 | 10 | 15 | 24 | 42 | 75 |

Solution -> next slide

Page 9: Session 2 - Measures of Central Tendency

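Solution to the Example

Let the assumed mean be A = 160.

| Daily earnings (Rs.) $x_i$ | Number of employees $f_i$ | $d_i = x_i - A = x_i - 160$ | $f_i d_i$ |
|---|---|---|---|
| 100 | 3 | -60 | -180 |
| 120 | 6 | -40 | -240 |
| 140 | 10 | -20 | -200 |
| 160 | 15 | 0 | 0 |
| 180 | 24 | +20 | 480 |
| 200 | 42 | +40 | 1680 |
| 220 | 75 | +60 | 4500 |
| Total | 175 | | 6040 |

$$\bar{x} = A + \frac{\sum f_i d_i}{\sum f_i} = 160 + \frac{6040}{175} \approx \text{Rs. } 194.51$$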
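The short-cut calculation in the solution table can be reproduced with a small sketch. The function name `mean_short_cut` is an assumption made for illustration; the frequencies are those of the solution table (total 175). The result is compared with the direct weighted sum to show that the choice of assumed mean does not change the answer.

```python
def mean_short_cut(values, freqs, assumed_mean):
    """Short-cut method: A + sum(f_i * d_i) / sum(f_i), with d_i = x_i - A."""
    total_f = sum(freqs)
    sum_fd = sum(f * (x - assumed_mean) for x, f in zip(values, freqs))
    return assumed_mean + sum_fd / total_f


earnings = [100, 120, 140, 160, 180, 200, 220]
employees = [3, 6, 10, 15, 24, 42, 75]  # frequencies from the solution table

mean_earning = mean_short_cut(earnings, employees, assumed_mean=160)
direct = sum(x * f for x, f in zip(earnings, employees)) / sum(employees)
print(round(mean_earning, 2))  # 194.51
print(abs(mean_earning - direct) < 1e-9)  # True
```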

Page 10: Session 2 - Measures of Central Tendency

Exercises on Direct/Indirect Methods

Exercise 1: Calculate the simple and weighted arithmetic mean price per tonne of coal purchased by a company over six months.

| Month | Price per tonne | Tonnes purchased |
|---|---|---|
| January | 4205 | 25 |
| February | 5125 | 30 |
| March | 5000 | 40 |
| April | 5200 | 52 |
| May | 4425 | 10 |
| June | 5400 | 45 |

Page 11: Session 2 - Measures of Central Tendency

Exercises on Direct/Indirect Methods

Exercise 2: The salaries paid by a company to its employees are as follows. Using the indirect method, calculate the mean salary for all employees.

| Designation | Monthly salary (Rs.) | Number of persons |
|---|---|---|
| Senior Manager | 35,000 | 1 |
| Manager | 30,000 | 20 |
| Executives | 25,000 | 70 |
| Jr. Executives | 20,000 | 10 |
| Supervisors | 15,000 | 150 |

Page 12: Session 2 - Measures of Central Tendency

Arithmetic Mean of Grouped Data

Direct method

Indirect methods: short-cut method, step deviation method

The following assumptions should be made:

The class intervals are closed.

The width of each class interval is equal.

The values of observations in each class interval are uniformly distributed between the lower and upper limits.

The mid-value of each class interval represents the average of all values in that class.

Page 13: Session 2 - Measures of Central Tendency

Arithmetic Mean of Grouped Data - Direct Method

$$\bar{x} = \frac{\sum f_i m_i}{\sum f_i}$$

where $m_i$ = mid-value of the ith class interval and $f_i$ = frequency of the ith class interval.

Example 1: A company is planning to improve plant safety. The following data show the accidents that happened over 50 weeks. Calculate the average number of accidents per week.

| No. of accidents | 0-4 | 5-9 | 10-14 | 15-19 | 20-24 |
|---|---|---|---|---|---|
| No. of weeks | 5 | 22 | 13 | 8 | 2 |

Page 14: Session 2 - Measures of Central Tendency

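Solution to the example

| No. of accidents | Mid-value ($m_i$) | No. of weeks ($f_i$) | $f_i m_i$ |
|---|---|---|---|
| 0 – 4 | 2 | 5 | 10 |
| 5 – 9 | 7 | 22 | 154 |
| 10 – 14 | 12 | 13 | 156 |
| 15 – 19 | 17 | 8 | 136 |
| 20 – 24 | 22 | 2 | 44 |
| Total | | 50 | 500 |

$$\bar{x} = \frac{\sum f_i m_i}{\sum f_i} = \frac{500}{50} = 10 \text{ accidents per week}$$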
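The grouped-data direct method can be sketched as follows. The helper name `grouped_mean` and the representation of classes as (lower, upper) pairs are assumptions made for this illustration.

```python
def grouped_mean(class_limits, freqs):
    """Direct method for grouped data: sum(f_i * m_i) / sum(f_i)."""
    mids = [(lower + upper) / 2 for lower, upper in class_limits]
    return sum(f * m for f, m in zip(freqs, mids)) / sum(freqs)


accident_classes = [(0, 4), (5, 9), (10, 14), (15, 19), (20, 24)]
weeks = [5, 22, 13, 8, 2]

mean_accidents = grouped_mean(accident_classes, weeks)
print(mean_accidents)  # 10.0
```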

Page 15: Session 2 - Measures of Central Tendency

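Arithmetic Mean of Grouped Data - Short-Cut Method

Let the assumed mean be A = 12, the mid-value of the class 10 – 14.

| No. of accidents | Mid-value ($m_i$) | $d_i = m_i - A$ | No. of weeks ($f_i$) | $f_i d_i$ |
|---|---|---|---|---|
| 0 – 4 | 2 | -10 | 5 | -50 |
| 5 – 9 | 7 | -5 | 22 | -110 |
| 10 – 14 | 12 (A) | 0 | 13 | 0 |
| 15 – 19 | 17 | 5 | 8 | 40 |
| 20 – 24 | 22 | 10 | 2 | 20 |
| Total | | | 50 | -100 |

$$\bar{x} = A + \frac{\sum f_i d_i}{\sum f_i} = 12 + \frac{-100}{50} = 10 \text{ accidents per week}$$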

Page 16: Session 2 - Measures of Central Tendency

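Arithmetic Mean of Grouped Data - Step Deviation Method

$$\bar{x} = A + \frac{\sum f_i d_i}{\sum f_i} \times h, \qquad d_i = \frac{m_i - A}{h}$$

where A = assumed value for the arithmetic mean, h = width of the class intervals, and $m_i$ = mid-value of the ith class interval. In this method $d_i$ is the deviation measured in units of the class width.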
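A minimal sketch of the step deviation method, applied to the plant-safety accident data from the earlier slides. It assumes the usual form of the formula, mean = A + (sum of f_i d_i / sum of f_i) x h with d_i = (m_i - A)/h; the function name is illustrative, and h = 5 is taken as the spacing between successive mid-values.

```python
def mean_step_deviation(mids, freqs, assumed_mean, width):
    """Step deviation method: A + (sum(f_i * d_i) / sum(f_i)) * h,
    where d_i = (m_i - A) / h is the deviation in units of the class width."""
    steps = [(m - assumed_mean) / width for m in mids]
    sum_fd = sum(f * d for f, d in zip(freqs, steps))
    return assumed_mean + (sum_fd / sum(freqs)) * width


mid_values = [2, 7, 12, 17, 22]  # mid-values of 0-4, 5-9, ..., 20-24
weeks = [5, 22, 13, 8, 2]

step_mean = mean_step_deviation(mid_values, weeks, assumed_mean=12, width=5)
print(step_mean)  # 10.0
```

The result agrees with the direct and short-cut methods, as it should: the three methods differ only in how much arithmetic is done by hand.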

Page 17: Session 2 - Measures of Central Tendency

Arithmetic Mean of Grouped Data

Exercise 1: The following distribution gives the pattern of overtime work done by 100 employees of a company. Calculate the average overtime hours.

| Overtime hours | 10-15 | 15-20 | 20-25 | 25-30 | 30-35 | 35-40 |
|---|---|---|---|---|---|---|
| No. of employees | 11 | 20 | 35 | 20 | 8 | 6 |

Exercise 2: In an examination of 675 candidates, the examiner supplied the following information. Calculate the mean percentage of marks obtained.

| Marks obtained (%) | No. of students |
|---|---|
| Less than 10 | 7 |
| Less than 20 | 39 |
| Less than 30 | 95 |
| Less than 40 | 201 |
| Less than 50 | 381 |
| Less than 60 | 545 |
| Less than 70 | 631 |
| Less than 80 | 675 |

Page 18: Session 2 - Measures of Central Tendency

Calculation of missing values

From the following data, find the missing item X, given that the mean wage of the workers is 115.86.

| Wages (x) | Number of workers (f) |
|---|---|
| 110 | 25 |
| 112 | 17 |
| 113 | 13 |
| 117 | 15 |
| X | 14 |
| 125 | 8 |
| 128 | 6 |
| 130 | 2 |
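One way to approach this kind of problem is to rearrange the definition of the mean: the mean times the total frequency equals the sum of all f x products, so the missing wage is whatever makes up the difference. The sketch below assumes that approach; the function and variable names are illustrative.

```python
def missing_value(known_rows, missing_freq, overall_mean):
    """Solve overall_mean * N = sum(f * x over known rows) + missing_freq * X for X."""
    total_f = sum(f for _, f in known_rows) + missing_freq
    known_sum = sum(x * f for x, f in known_rows)
    return (overall_mean * total_f - known_sum) / missing_freq


# (wage, number of workers) for every row except the missing one
known_rows = [(110, 25), (112, 17), (113, 13), (117, 15), (125, 8), (128, 6), (130, 2)]

missing_wage = missing_value(known_rows, missing_freq=14, overall_mean=115.86)
print(round(missing_wage))  # 120
```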

Page 19: Session 2 - Measures of Central Tendency

Merits and Demerits of Arithmetic Mean

Merits

Calculation of AM is simple.

Calculation is based on all observations, and hence it can be regarded as representative of the given data.

It is capable of being treated mathematically and hence is widely used in statistical analysis.

It represents the center of gravity of the distribution because it balances the magnitudes of the observations that are greater and less than it.

It gives a good basis for comparison of two or more distributions.

Page 20: Session 2 - Measures of Central Tendency

Merits and Demerits of Arithmetic Mean

Demerits

It can be determined neither by inspection nor by graphical location.

The arithmetic mean cannot be computed for qualitative data.

It is affected too much by extreme observations and hence does not adequately represent data containing some extreme observations.

AM cannot be computed when class intervals have open ends.

The simple arithmetic mean gives greater importance to larger values and lesser importance to smaller values.

Page 21: Session 2 - Measures of Central Tendency

Weighted Arithmetic Mean

$$\bar{x}_w = \frac{\sum w_i x_i}{\sum w_i}$$

Example 1: An examination was held to decide the award of a scholarship. The weights of the various subjects are different. The marks obtained by 3 students are given below. Calculate the weighted AM to award the scholarship.

| Subject | Weight | Student A | Student B | Student C |
|---|---|---|---|---|
| Mathematics | 4 | 60 | 57 | 62 |
| Physics | 3 | 62 | 61 | 67 |
| Chemistry | 2 | 55 | 53 | 60 |
| English | 1 | 67 | 77 | 49 |

Page 22: Session 2 - Measures of Central Tendency

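Solution to the example

| Subject | Weight ($w_i$) | A: marks ($x_i$) | A: $x_i w_i$ | B: marks ($x_i$) | B: $x_i w_i$ | C: marks ($x_i$) | C: $x_i w_i$ |
|---|---|---|---|---|---|---|---|
| Mathematics | 4 | 60 | 240 | 57 | 228 | 62 | 248 |
| Physics | 3 | 62 | 186 | 61 | 183 | 67 | 201 |
| Chemistry | 2 | 55 | 110 | 53 | 106 | 60 | 120 |
| English | 1 | 67 | 67 | 77 | 77 | 49 | 49 |
| Total | 10 | 244 | 603 | 248 | 594 | 238 | 618 |

Weighted AM: Student A = 603/10 = 60.3, Student B = 594/10 = 59.4, Student C = 618/10 = 61.8. Student C has the highest weighted mean, although Student B has the highest simple mean (248/4 = 62).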
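The effect of the weights on the ranking can be seen in a short sketch. The helper `weighted_mean` and the dictionary layout are assumptions made for illustration; the marks and weights are those of the example.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(w_i * x_i) / sum(w_i)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


weights = [4, 3, 2, 1]  # Mathematics, Physics, Chemistry, English
marks = {
    "A": [60, 62, 55, 67],
    "B": [57, 61, 53, 77],
    "C": [62, 67, 60, 49],
}

weighted = {student: weighted_mean(m, weights) for student, m in marks.items()}
simple = {student: sum(m) / len(m) for student, m in marks.items()}

best_weighted = max(weighted, key=weighted.get)
best_simple = max(simple, key=simple.get)
print(best_weighted, best_simple)  # C B
```

The simple mean favours the student with the high English mark, while the weighted mean favours the student who did best in the heavily weighted subjects.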

Page 23: Session 2 - Measures of Central Tendency

Geometric Mean

In many business and economic problems we deal with quantities that change over a period of time. In such cases, if we want to know the average rate of change, we use the geometric mean rather than the arithmetic mean.

Example 1: If the population of a country has been growing at rates of 3%, 2.5%, 2.8%, 2%, and 1.9% respectively over the last five years, what has been the average growth rate for the period?

In this case, we need to calculate the geometric mean rather than the arithmetic mean.
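A sketch of how the average growth rate in Example 1 might be computed. It assumes the usual convention of converting each percentage rate r into a growth factor 1 + r/100, taking the geometric mean of the factors via logarithms, and converting back to a percentage; the function name is illustrative.

```python
import math


def average_growth_rate(percent_rates):
    """Geometric mean of the growth factors (1 + r/100), returned as a percentage."""
    factors = [1 + r / 100 for r in percent_rates]
    log_mean = sum(math.log(f) for f in factors) / len(factors)
    return (math.exp(log_mean) - 1) * 100


rates = [3, 2.5, 2.8, 2, 1.9]  # yearly population growth rates in percent
gm_rate = average_growth_rate(rates)
am_rate = sum(rates) / len(rates)
print(round(gm_rate, 3), round(am_rate, 3))
```

For rates this small and this similar, the geometric and arithmetic results are close, with the geometric mean slightly lower; the gap widens as the rates become larger or more variable.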

Page 24: Session 2 - Measures of Central Tendency

Geometric Mean

Example 2: The following table gives the annual rate of growth of sales of a company in the last five years. Calculate the average growth rate over these five years.

| Year | Growth rate (%) | Sales at the end of the year |
|---|---|---|
| 2003 | 5.0 | 105 |
| 2004 | 7.5 | 112.87 |
| 2005 | 2.5 | 115.69 |
| 2006 | 5.0 | 121.47 |
| 2007 | 10.0 | 133.61 |

Page 25: Session 2 - Measures of Central Tendency

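Solution to the example

The average annual growth rate is found from the geometric mean of the yearly growth factors:

$$GM = (x_1 \times x_2 \times x_3 \times x_4 \times x_5)^{1/5} = (1.05 \times 1.075 \times 1.025 \times 1.05 \times 1.10)^{1/5} \approx 1.0597$$

i.e. an average growth of about 5.97 percent per year.

Simplified solution using logarithms:

$$\log(GM) = \frac{1}{n}\sum_{i=1}^{n} \log x_i, \qquad GM = \operatorname{antilog}\left\{\frac{1}{n}\sum_{i=1}^{n} \log x_i\right\}$$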

Page 26: Session 2 - Measures of Central Tendency

Geometric Mean

Exercise 1: The rate of increase in the population of a country during the last three decades was 5 percent, 8 percent, and 12 percent. Find the average rate of growth during the last three decades.

Page 27: Session 2 - Measures of Central Tendency

Uses, Merits and Demerits of GM

Uses

GM is highly useful in averaging ratios, percentages, and rates of increase between two periods.

GM is important for the construction of index numbers.

Merits

The value of GM is not much affected by extreme observations, and it is computed using all observations.

It is useful in studying economic and social data.

Demerits

GM cannot be computed if any item in the series is negative or zero.

It is difficult to calculate.

Page 28: Session 2 - Measures of Central Tendency

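Harmonic Mean

The harmonic mean of a set of observations is defined as the reciprocal of the arithmetic mean of the reciprocals of the individual observations.

$$HM = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}} \quad \text{(for ungrouped data)}$$

$$HM = \frac{N}{\sum f_i \left(\frac{1}{m_i}\right)}, \quad N = \sum f_i \quad \text{(for grouped data)}$$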

Page 29: Session 2 - Measures of Central Tendency

Harmonic Mean

Example 1: An investor buys Rs. 20,000 worth of shares of a company each month. During the first 3 months he bought the shares at prices of Rs. 120, Rs. 160, and Rs. 210. After 3 months, what is the average price paid by him for the shares?

Solution

$$HM = \frac{3}{\frac{1}{120} + \frac{1}{160} + \frac{1}{210}} = \frac{3}{0.019345} \approx \text{Rs. } 155.08$$

Page 30: Session 2 - Measures of Central Tendency

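Harmonic Mean

Example 2: Find the harmonic mean of the following distribution of data.

| Dividend yield (%) | 2 – 6 | 6 – 10 | 10 – 14 |
|---|---|---|---|
| Number of companies | 10 | 12 | 18 |

Solution

| Class (DY) | Mid-value ($m_i$) | No. of companies ($f_i$) | Reciprocal ($1/m_i$) | $f_i (1/m_i)$ |
|---|---|---|---|---|
| 2 – 6 | 4 | 10 | 1/4 | 2.5 |
| 6 – 10 | 8 | 12 | 1/8 | 1.5 |
| 10 – 14 | 12 | 18 | 1/12 | 1.5 |
| Total | | N = 40 | | 5.5 |

$$HM = \frac{N}{\sum f_i (1/m_i)} = \frac{40}{5.5} = 7.27$$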
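The grouped harmonic mean in Example 2 can be checked with a brief sketch. The function name `grouped_harmonic_mean` is an assumption for illustration; the mid-values and frequencies are those of the solution table.

```python
def grouped_harmonic_mean(mids, freqs):
    """Harmonic mean for grouped data: N / sum(f_i / m_i), with N = sum(f_i)."""
    return sum(freqs) / sum(f / m for f, m in zip(freqs, mids))


dividend_mids = [4, 8, 12]    # mid-values of the classes 2-6, 6-10, 10-14
companies = [10, 12, 18]

hm_yield = grouped_harmonic_mean(dividend_mids, companies)
print(round(hm_yield, 2))  # 7.27
```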

Page 31: Session 2 - Measures of Central Tendency

Merits and Demerits of HM / Relationship between AM, GM and HM

Merits

It is based on all observations of the series.

It is suitable for series having wide dispersion.

Demerits

It is difficult to calculate.

It is not often used for analyzing business problems.

Relationship between AM, GM, and HM

If all values are equal, then AM = GM = HM.

If the values are positive and not all equal, then AM > GM > HM.

If the observations form a geometric progression $a, ar, ar^2, ar^3, \dots, ar^n$, then $(GM)^2 = AM \times HM$.
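The relationships above can be checked numerically. The sketch below uses a hypothetical geometric progression with a = 2 and r = 3 (values chosen only for illustration, not taken from the slides) and three small helper functions.

```python
import math


def am(xs):
    """Arithmetic mean."""
    return sum(xs) / len(xs)


def gm(xs):
    """Geometric mean, computed through logarithms (positive values only)."""
    return math.exp(sum(math.log(x) for x in xs) / len(xs))


def hm(xs):
    """Harmonic mean: reciprocal of the mean of the reciprocals."""
    return len(xs) / sum(1 / x for x in xs)


series = [2 * 3 ** k for k in range(5)]  # hypothetical GP: 2, 6, 18, 54, 162

print(am(series) > gm(series) > hm(series))                     # True
print(math.isclose(gm(series) ** 2, am(series) * hm(series)))   # True
```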

Page 32: Session 2 - Measures of Central Tendency

Averages of Position - Median

Median: The median may be defined as the middle value in the data set when the elements are arranged in sequential order (either ascending or descending).

Median for ungrouped data:

If the number of observations (n) is odd, then Median = value of the $\left(\frac{n+1}{2}\right)$th observation.

If the number of observations is even, then Median = average of the $\left(\frac{n}{2}\right)$th and $\left(\frac{n}{2}+1\right)$th observations in the data set.

Exercise 1: What is the median value for the following data set: 3.5, 4, 3.8, 3, 5.5, 5, 4.5? What is the median if 5.8 is added to this data set?
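The odd and even cases can be sketched in a few lines, using the Exercise 1 data as input. The function below is an illustration written for this purpose; the standard-library `statistics.median` follows the same rule.

```python
def median(values):
    """Middle value of the sorted data; mean of the two middle values when n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


data = [3.5, 4, 3.8, 3, 5.5, 5, 4.5]
print(median(data))          # 4
print(median(data + [5.8]))  # 4.25
```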

Page 33: Session 2 - Measures of Central Tendency

Averages of Position - Median

Median for grouped data

$$\text{Median} = l + \frac{\frac{n}{2} - cf}{f} \times h$$

where

l = lower class limit of the median class interval

cf = cumulative frequency of the class prior to the median class interval

f = frequency of the median class

h = width of the median class interval

n = total number of observations in the distribution

Page 34: Session 2 - Measures of Central Tendency

Averages of Position - Median

Exercise 2: A survey was conducted to determine the age in years of 120 automobiles. The result of the survey is given in the table below. What is the median age of the autos?

| Age of auto (years) | 0 – 4 | 4 – 8 | 8 – 12 | 12 – 16 | 16 – 20 |
|---|---|---|---|---|---|
| No. of autos | 13 | 29 | 48 | 22 | 8 |

Solution -> next slide

Page 35: Session 2 - Measures of Central Tendency

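Averages of Position - Median

Solution:

| Age of autos (years) | Number of autos ($f_i$) | Cumulative frequency (cf) |
|---|---|---|
| 0 – 4 | 13 | 13 |
| 4 – 8 | 29 | 42 |
| 8 – 12 (median class) | 48 | 90 |
| 12 – 16 | 22 | 112 |
| 16 – 20 | 8 | 120 |
| Total | 120 | |

Since n/2 = 60, the median lies in the class 8 – 12, the first class whose cumulative frequency reaches 60.

$$\text{Median} = l + \frac{\frac{n}{2} - cf}{f} \times h = 8 + \frac{60 - 42}{48} \times 4 = 8 + 1.5 = 9.5 \text{ years}$$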
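The grouped-median procedure (find the median class from the cumulative frequencies, then interpolate inside it) can be sketched as below. The function name and the representation of classes by their lower limits are assumptions made for this illustration, and equal class widths are assumed.

```python
def grouped_median(lower_limits, freqs, width):
    """Median for grouped data: l + ((n/2 - cf) / f) * h."""
    n = sum(freqs)
    cf = 0  # cumulative frequency of the classes before the current one
    for lower, f in zip(lower_limits, freqs):
        if cf + f >= n / 2:  # first class whose cumulative frequency reaches n/2
            return lower + ((n / 2 - cf) / f) * width
        cf += f
    raise ValueError("frequencies must contain at least one positive value")


age_lower_limits = [0, 4, 8, 12, 16]
autos = [13, 29, 48, 22, 8]

median_age = grouped_median(age_lower_limits, autos, width=4)
print(median_age)  # 9.5
```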

Page 36: Session 2 - Measures of Central Tendency

Partition Values – Quartiles, Deciles, Percentiles

Quartiles: The values of observations in a data set, when arranged in an ordered sequence, can be divided into four equal parts, or quarters, using three quartiles, viz. Q1, Q2, and Q3. The first quartile Q1 divides the distribution in such a way that 25 percent of the observations have a value less than Q1 and 75 percent have a value more than Q1.

[Figure: ordered data divided into four equal parts by Q1, Q2, and Q3]

Page 37: Session 2 - Measures of Central Tendency

Partition Values – Quartiles, Deciles, Percentiles

Deciles: The values of observations in a data set, when arranged in an ordered sequence, can be divided into ten equal parts using nine deciles (D1, D2, ..., D9).

Percentiles: The values of observations in a data set, when arranged in an ordered sequence, can be divided into 100 equal parts using 99 percentiles (P1, P2, ..., P99).
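For grouped data, quartiles, deciles, and percentiles are commonly obtained by the same interpolation used for the grouped median, with n/2 replaced by the required fraction of n. The slides do not state this formula, so the sketch below is an assumption along those lines. It reuses the automobile-age data from the median exercise, and the function name `partition_value` is illustrative.

```python
def partition_value(lower_limits, freqs, width, fraction):
    """Value below which `fraction` of the observations fall:
    l + ((fraction * n - cf) / f) * h, interpolated within the containing class."""
    if not 0 < fraction < 1:
        raise ValueError("fraction must lie strictly between 0 and 1")
    n = sum(freqs)
    target = fraction * n
    cf = 0  # cumulative frequency of the classes before the current one
    for lower, f in zip(lower_limits, freqs):
        if cf + f >= target:
            return lower + ((target - cf) / f) * width
        cf += f
    raise ValueError("frequencies must contain at least one positive value")


age_lower_limits = [0, 4, 8, 12, 16]
autos = [13, 29, 48, 22, 8]

q1 = partition_value(age_lower_limits, autos, 4, 0.25)  # first quartile
q2 = partition_value(age_lower_limits, autos, 4, 0.50)  # median
q3 = partition_value(age_lower_limits, autos, 4, 0.75)  # third quartile
print(round(q1, 2), q2, q3)
```

Deciles and percentiles follow by passing fractions such as 0.7 (D7) or 0.95 (P95).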

Page 38: Session 2 - Measures of Central Tendency

Partition Values – Quartiles, Deciles, Percentiles

Exercise 1: The following is the distribution of weekly wages of 600 workers in a factory.

| Weekly wages (Rs.) | No. of workers |
|---|---|
| Below 375 | 69 |
| 375 – 450 | 167 |
| 450 – 525 | 207 |
| 525 – 600 | 65 |
| 600 – 625 | 58 |
| 625 – 750 | 24 |
| 750 – 825 | 10 |

Find the 1st quartile and 3rd quartile.

Find the 5th decile and 7th decile.

Find the 29th percentile and 95th percentile.

Find the median.

Page 39: Session 2 - Measures of Central Tendency

Averages of Position - Mode

Mode: The mode is the value of an observation that occurs most frequently in the data set, i.e. the point or class mark with the highest frequency.

For grouped data:

$$\text{Mode} = l + \frac{f_m - f_{m-1}}{2f_m - f_{m-1} - f_{m+1}} \times h$$

where

$l$ = lower class limit of the modal class interval

$f_m$ = frequency of the modal class

$f_{m-1}$ = frequency of the class preceding the modal class

$f_{m+1}$ = frequency of the class following the modal class

$h$ = width of the modal class interval

Exercise 1: Find the mode of the distribution in the earlier example.
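A sketch of the grouped-mode calculation, assuming the standard interpolation formula that the symbol definitions above appear to describe: lower limit of the modal class plus (f_m - f_prev) / (2 f_m - f_prev - f_next) times the class width. The slide does not say which earlier example is meant, so the automobile-age distribution from the median slides is used here as an assumption; the function name is illustrative.

```python
def grouped_mode(lower_limits, freqs, width):
    """Mode for grouped data, interpolated within the class of highest frequency."""
    i = max(range(len(freqs)), key=lambda k: freqs[k])  # index of the modal class
    f_m = freqs[i]
    f_prev = freqs[i - 1] if i > 0 else 0                # class preceding the modal class
    f_next = freqs[i + 1] if i < len(freqs) - 1 else 0   # class following the modal class
    return lower_limits[i] + (f_m - f_prev) / (2 * f_m - f_prev - f_next) * width


age_lower_limits = [0, 4, 8, 12, 16]
autos = [13, 29, 48, 22, 8]

modal_age = grouped_mode(age_lower_limits, autos, width=4)
print(round(modal_age, 2))  # 9.69
```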

Page 40: Session 2 - Measures of Central Tendency

Averages of Position - Mode

Graphical method

[Figure: histogram of frequency (axis ticks 5, 10, 15) against class interval (axis ticks 5, 15, 25, 35, 45), with the mode marked]

Page 41: Session 2 - Measures of Central Tendency

Relationship between Mean, Median and Mode

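[Figure: three frequency curves. Symmetrical: Mean = Median = Mode. Negatively skewed: Mean, Median, Mode from left to right. Positively skewed: Mode, Median, Mean from left to right.]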

Mean – Mode = 3(Mean – Median)

For positively skewed distribution, Mean>Median>Mode

For negatively skewed distribution, Mean<Median<Mode
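The empirical relationship Mean – Mode = 3(Mean – Median) can be rearranged to estimate the mode as 3 x Median - 2 x Mean. The sketch below applies it to the automobile-age distribution from the median slides, taking the grouped mean from the class mid-values and the median of 9.5 from the worked solution; the function name is illustrative, and the relationship is only approximate for moderately skewed data.

```python
def mode_from_mean_median(mean, median):
    """Rearranges Mean - Mode = 3 (Mean - Median) into Mode = 3 * Median - 2 * Mean."""
    return 3 * median - 2 * mean


age_mids = [2, 6, 10, 14, 18]   # mid-values of 0-4, 4-8, ..., 16-20
autos = [13, 29, 48, 22, 8]

mean_age = sum(f * m for f, m in zip(autos, age_mids)) / sum(autos)
median_age = 9.5                # from the worked median solution
estimated_mode = mode_from_mean_median(mean_age, median_age)

print(round(mean_age, 2), median_age, round(estimated_mode, 2))
print(mean_age < median_age < estimated_mode)  # True, i.e. slight negative skew
```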

Next Session: Measures of Dispersion