priyankamodak
Statistical Analysis
Session 2: Measures of Central Tendency
Why do we need to consider measures of central tendency?
Frequency distributions and graphical representations of data fail to reveal three characteristics:
Central tendency: the numerical value of an observation around which most other observations in the data set tend to cluster or group.
Variation: the extent to which numerical values are dispersed around the central value.
Skewness: the extent of departure of numerical values from a symmetrical distribution around the central value.
Objectives of average (central tendency)
It is useful for extracting and summarizing the characteristics of the entire data set.
Since an average represents the entire data set, it makes comparison between two or more data sets possible, e.g. the performance of a salesperson based on average sales over two months or two years.
It becomes the base for computing other measures such as dispersion, skewness, and kurtosis.
Measures of Central Tendency
Mathematical averages: arithmetic mean (simple or weighted), geometric mean, harmonic mean
Averages of position: median, quartiles, deciles, percentiles, mode
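Arithmetic Mean - Direct Method
For n observations $x_1, x_2, \dots, x_n$:
$\bar{x} = \dfrac{x_1 + x_2 + \dots + x_n}{n}$ …(1)
$\bar{x} = \dfrac{1}{n}\sum_{i=1}^{n} x_i$ …(2)
Example 1: In a survey, the profits earned by five car manufacturing companies were 15, 20, 10, 35, and 32. Find the arithmetic mean of the profits earned.
Using formula (2) above, the arithmetic mean = (15 + 20 + 10 + 35 + 32)/5 = 22.4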
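As a quick check of Example 1, the direct method can be sketched in Python. The variable names are illustrative and are not part of the slides.

```python
# Direct method: add up the observations and divide by how many there are.
profits = [15, 20, 10, 35, 32]  # profits of the five companies in Example 1

mean_profit = sum(profits) / len(profits)
print(mean_profit)  # 22.4
```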
Arithmetic Mean - Direct Method
Example 2: If A, B, C, and D are four chemicals costing Rs. 15, Rs. 12, Rs. 8 and Rs. 9 per 100 grams, and are contained in a given compound in the ratio of 1:2:3:4 parts respectively, what should be the price of the resultant compound?
Weighted mean price = (15×1 + 12×2 + 8×3 + 9×4)/(1 + 2 + 3 + 4) = 99/10 = Rs. 9.90 per 100 grams
Arithmetic Mean - Short-Cut Method
In this method an arbitrary assumed mean A is taken as the basis for calculating the deviations $d_i = x_i - A$ of the individual values in the data set, so that
$\bar{x} = A + \dfrac{\sum f_i d_i}{\sum f_i}$
Arithmetic Mean - Indirect Method
Example 1: The daily earnings (in rupees) of employees working on a daily basis in a firm are given below. Calculate the average daily earnings for all employees.
Daily earnings (Rs.)   100  120  140  160  180  200  220
Number of employees      3    6   10   15   24   42   75
Solution to the Example
Let the assumed mean be A = 160.
Daily earnings (Rs.) xi   Number of employees fi   di = xi – A = xi – 160   fidi
100                         3                      -60                      -180
120                         6                      -40                      -240
140                        10                      -20                      -200
160                        15                        0                         0
180                        24                      +20                       480
200                        42                      +40                      1680
220                        75                      +60                      4500
Total                     175                                               6040
$\bar{x} = A + \dfrac{\sum f_i d_i}{\sum f_i} = 160 + \dfrac{6040}{175} \approx$ Rs. 194.51
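The short-cut calculation in the table above can be reproduced with a minimal Python sketch. It assumes the frequencies shown in the solution table (15 employees at Rs. 160); the variable names are illustrative.

```python
# Short-cut (assumed mean) method: mean = A + sum(f * d) / sum(f), with d = x - A.
earnings = [100, 120, 140, 160, 180, 200, 220]  # x_i, daily earnings (Rs.)
employees = [3, 6, 10, 15, 24, 42, 75]          # f_i, as in the solution table
A = 160                                         # assumed mean

sum_fd = sum(f * (x - A) for x, f in zip(earnings, employees))
n = sum(employees)
mean_earning = A + sum_fd / n

print(sum_fd, n, round(mean_earning, 2))  # 6040 175 194.51
```

The choice of A does not change the result; it only keeps the deviations small enough to work out by hand.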
Exercises on Direct/Indirect methods
Exercise 1: Calculate the simple and weighted arithmetic mean price per tonne of coal purchased by a company over six months.
Month      Price/tonne   Tonnes purchased
January    4205          25
February   5125          30
March      5000          40
April      5200          52
May        4425          10
June       5400          45
Exercises on Direct/Indirect methods
Exercise 2: The salaries paid by a company to its employees are as follows. Using the indirect method, calculate the mean salary for all employees.
Designation      Monthly salary (Rs.)   Number of persons
Senior Manager   35,000                   1
Manager          30,000                  20
Executives       25,000                  70
Jr. Executives   20,000                  10
Supervisors      15,000                 150
Arithmetic Mean of grouped data
Methods: direct method; indirect methods (short-cut method, step deviation method)
The following assumptions should be made:
The class intervals are closed.
The width of each class interval is equal.
The values of observations in each class interval are uniformly distributed between the lower and upper limits.
The mid-value of each class interval represents the average of all values in that class.
Arithmetic mean of grouped data - Direct method
$\bar{x} = \dfrac{\sum_i f_i m_i}{\sum_i f_i}$
where mi = mid-value of the ith class interval and fi = frequency of the ith class interval.
Example 1: A company is planning to improve plant safety. The following data show the accidents that happened over 50 weeks. Calculate the average number of accidents per week.
No. of accidents   0-4  5-9  10-14  15-19  20-24
No. of weeks         5   22     13      8      2
Solution to the example
No. of accidents   Mid-value (mi)   No. of weeks (fi)   fimi
0 – 4               2                5                   10
5 – 9               7               22                  154
10 – 14            12               13                  156
15 – 19            17                8                  136
20 – 24            22                2                   44
Total                               50                  500
Mean = 500/50 = 10 accidents per week
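The direct method for grouped data can be sketched as follows. The mid-values are derived from the class limits given in the example; the names are illustrative.

```python
# Direct method for grouped data: mean = sum(f * m) / sum(f),
# where m is the mid-value of each class interval.
classes = [(0, 4), (5, 9), (10, 14), (15, 19), (20, 24)]  # no. of accidents
weeks = [5, 22, 13, 8, 2]                                 # f_i

mids = [(low + high) / 2 for low, high in classes]        # 2, 7, 12, 17, 22
sum_fm = sum(f * m for f, m in zip(weeks, mids))
mean_accidents = sum_fm / sum(weeks)

print(sum_fm, mean_accidents)  # 500.0 10.0
```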
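Arithmetic Mean of grouped data - Short-cut method
Let the assumed mean be A = 12, so that di = mi – A.
No. of accidents   Mid-value (mi)   di = mi – 12   No. of weeks (fi)   fidi
0 – 4               2               -10             5                   -50
5 – 9               7                -5            22                  -110
10 – 14            12                 0            13                     0
15 – 19            17                 5             8                    40
20 – 24            22                10             2                    20
Total                                              50                  -100
Mean = A + (sum of fidi)/(sum of fi) = 12 + (-100)/50 = 10 accidents per week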
Arithmetic mean of grouped data - Step deviation method
$\bar{x} = A + \dfrac{\sum_i f_i d_i}{\sum_i f_i} \times h$, with $d_i = \dfrac{m_i - A}{h}$
where
A = assumed value for the arithmetic mean
h = width of the class intervals
mi = mid-value of the ith class interval
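A minimal sketch of the step deviation method, applied to the accident data from the earlier example. It assumes the usual form of the method, in which each deviation is divided by the class width h and the result is scaled back by h at the end; A = 12 and h = 5 are taken from that example.

```python
# Step deviation method: d = (m - A) / h, mean = A + h * sum(f * d) / sum(f).
mids = [2, 7, 12, 17, 22]   # mid-values m_i of the accident classes
weeks = [5, 22, 13, 8, 2]   # frequencies f_i
A = 12                      # assumed mean
h = 5                       # class width (distance between mid-values)

steps = [(m - A) / h for m in mids]                 # -2, -1, 0, 1, 2
sum_fd = sum(f * d for f, d in zip(weeks, steps))   # -20
mean_accidents = A + h * sum_fd / sum(weeks)

print(mean_accidents)  # 10.0
```

The result matches the direct and short-cut methods, as it should; only the size of the intermediate numbers changes.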
Arithmetic mean of grouped data
Exercise 1: The following distribution gives the pattern of overtime work done by 100 employees of a company. Calculate the average overtime hours.
Overtime hours     10-15  15-20  20-25  25-30  30-35  35-40
No. of employees      11     20     35     20      8      6
Exercise 2: In an examination of 675 candidates, the examiner supplied the following information. Calculate the mean percentage of marks obtained.
Marks obtained (%)   No. of students
Less than 10           7
Less than 20          39
Less than 30          95
Less than 40         201
Less than 50         381
Less than 60         545
Less than 70         631
Less than 80         675
Calculation of missing values
From the following data, find the missing item, given that the mean wage of the workers is 115.86.
Wages (X)   Number of workers (f)
110         25
112         17
113         13
117         15
X           14
125          8
128          6
130          2
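One way to recover the missing wage is to use the fact that the mean times the total frequency equals the sum of all f × X products. The sketch below follows that reasoning; the variable names are illustrative.

```python
# mean * sum(f) = sum of f * X over every row, so the missing X is
# (mean * sum(f) - sum of the known f * X products) / (frequency of the missing row).
wages = [110, 112, 113, 117, None, 125, 128, 130]  # None marks the missing wage
workers = [25, 17, 13, 15, 14, 8, 6, 2]
mean_wage = 115.86

total_fx = mean_wage * sum(workers)                                      # about 11586
known_fx = sum(x * f for x, f in zip(wages, workers) if x is not None)   # 9906
missing_f = workers[wages.index(None)]                                   # 14

missing_wage = (total_fx - known_fx) / missing_f
print(round(missing_wage, 2))  # 120.0
```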
Merits and Demerits of Arithmetic Mean
Merits
Calculation of the AM is simple.
Calculation is based on all observations, and hence it can be regarded as representative of the given data.
It is capable of being treated mathematically, and hence is widely used in statistical analysis.
It represents the center of gravity of the distribution because it balances the magnitudes of observations which are greater and less than it.
It gives a good basis for comparison of two or more distributions.
Demerits
It can neither be determined by inspection nor by graphical location.
The arithmetic mean cannot be computed for qualitative data.
It is affected too much by extreme observations, and hence does not adequately represent data containing some extreme observations.
The AM cannot be computed when class intervals have open ends.
The simple arithmetic mean gives greater importance to larger values and lesser importance to smaller values.
Weighted Arithmetic Mean
$\bar{x}_w = \dfrac{\sum_i x_i w_i}{\sum_i w_i}$, where $w_i$ is the weight attached to observation $x_i$.
Example 1: An examination was held to decide the award of a scholarship. The weights of the various subjects are different. The marks obtained by 3 students are given below:
Subject       Weight   Student A   Student B   Student C
Mathematics   4        60          57          62
Physics       3        62          61          67
Chemistry     2        55          53          60
English       1        67          77          49
Calculate the weighted AM to award the scholarship.
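Solution to the exercise
                       Student A          Student B          Student C
Subject       Weight   Marks (xi)  xiwi   Marks (xi)  xiwi   Marks (xi)  xiwi
Mathematics   4        60          240    57          228    62          248
Physics       3        62          186    61          183    67          201
Chemistry     2        55          110    53          106    60          120
English       1        67           67    77           77    49           49
Total         10       244         603    248         594    238         618
Weighted AM: Student A = 603/10 = 60.3, Student B = 594/10 = 59.4, Student C = 618/10 = 61.8. Student C has the highest weighted mean and so earns the scholarship.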
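The weighted means behind the totals above can be checked with a short Python sketch. The dictionary layout and the helper name `weighted_mean` are illustrative choices, not part of the slides.

```python
# Weighted arithmetic mean: sum(x * w) / sum(w), computed for each student.
weights = {"Mathematics": 4, "Physics": 3, "Chemistry": 2, "English": 1}
marks = {
    "A": {"Mathematics": 60, "Physics": 62, "Chemistry": 55, "English": 67},
    "B": {"Mathematics": 57, "Physics": 61, "Chemistry": 53, "English": 77},
    "C": {"Mathematics": 62, "Physics": 67, "Chemistry": 60, "English": 49},
}

def weighted_mean(scores, weights):
    """Return sum(x * w) / sum(w) over the subjects listed in `weights`."""
    return sum(scores[s] * w for s, w in weights.items()) / sum(weights.values())

results = {student: weighted_mean(scores, weights) for student, scores in marks.items()}
best = max(results, key=results.get)

print(results)  # {'A': 60.3, 'B': 59.4, 'C': 61.8}
print(best)     # C
```

Note that the simple (unweighted) means would rank the students differently (A = 61, B = 62, C = 59.5), which is why the weights matter here.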
Geometric Mean
In many business and economic problems we deal with quantities that change over a period of time. In such cases, if we want to know the average rate of change, we use the geometric mean rather than the arithmetic mean.
Example 1: If the population of a country has been growing at rates of 3%, 2.5%, 2.8%, 2% and 1.9% respectively over the last five years, what has been the average growth rate for the period?
In this case, we need to calculate the geometric mean rather than the arithmetic mean.
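Geometric Mean
Example 2: The following table gives the annual rate of growth of sales of a company in the last five years. Calculate the average growth rate over these five years.
Year   Growth rate (%)   Sales at the end of the year
2003    5.0              105
2004    7.5              112.87
2005    2.5              115.69
2006    5.0              121.47
2007   10.0              133.61
Solution to the example
Taking $X_i$ = 100 + growth rate in year i (105, 107.5, 102.5, 105, 110):
$GM = (X_1 \times X_2 \times X_3 \times X_4 \times X_5)^{1/5} \approx 105.97$
so the average annual growth rate is about 5.97 percent (5.9 percent when truncated to one decimal place).
Simplified solution: $\log(GM) = \dfrac{1}{n}\sum_{i=1}^{n} \log X_i$, so $GM = \operatorname{antilog}\left\{\dfrac{1}{n}\sum_{i=1}^{n} \log X_i\right\}$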
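The geometric mean calculation can be sketched in Python as below. It assumes each growth rate is first converted to a growth factor (1 + rate/100), which is equivalent to working with 105, 107.5 and so on; both the direct product form and the logarithm form are shown. The result, about 5.97 percent, is in line with the roughly 5.9 percent quoted in the solution.

```python
import math

# Average growth rate via the geometric mean of the yearly growth factors.
growth_rates = [5.0, 7.5, 2.5, 5.0, 10.0]       # percent, 2003 to 2007
factors = [1 + r / 100 for r in growth_rates]   # 1.05, 1.075, ...

# Direct form: n-th root of the product of the factors.
gm_direct = math.prod(factors) ** (1 / len(factors))

# Logarithm form: antilog of the mean of the logs (natural logs used here).
gm_logs = math.exp(sum(math.log(x) for x in factors) / len(factors))

avg_growth = (gm_direct - 1) * 100
print(round(avg_growth, 2))  # 5.97
```

Compounding 100 at this average rate for five years gives about 133.6, which agrees with the final sales figure in the table.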
Geometric Mean
Exercise 1: The rate of increase in the population of a country during the last three decades is 5 percent, 8 percent and 12 percent. Find the average rate of growth during the last three decades.
Uses, Merits and Demerits of GM
Uses
GM is highly useful in averaging ratios, percentages, and rates of increase between two periods.
GM is important for the construction of index numbers.
Merits
The value of GM is not much affected by extreme observations and is computed by taking all observations.
It is useful in studying economic and social data.
Demerits
GM cannot be computed if any item in the series is negative or zero.
It is difficult to calculate.
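Harmonic Mean
The harmonic mean of a set of observations is defined as the reciprocal of the arithmetic mean of the reciprocals of the individual observations.
$HM = \dfrac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$ (For ungrouped data)
$HM = \dfrac{N}{\sum_i f_i \left(\frac{1}{m_i}\right)}$, with $N = \sum_i f_i$ (For grouped data)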
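Harmonic Mean
Example 1: An investor buys Rs. 20,000 worth of shares of a company each month. During the first 3 months he bought the shares at prices of Rs. 120, Rs. 160 and Rs. 210. After 3 months, what is the average price paid by him for the shares?
Solution
$HM = \dfrac{3}{\frac{1}{120} + \frac{1}{160} + \frac{1}{210}} \approx$ Rs. 155.08
Check: Rs. 60,000 buys 20000/120 + 20000/160 + 20000/210 ≈ 386.9 shares, and 60000/386.9 ≈ Rs. 155.08 per share.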
Harmonic Mean
Example 2: Find the harmonic mean of the following distribution of data.
Dividend yield (%)    2 – 6   6 – 10   10 – 14
Number of companies   10      12       18
Solution
Class (DY)   Mid-value (mi)   No. of companies (fi)   Reciprocal (1/mi)   fi(1/mi)
2 – 6         4               10                      1/4                 2.5
6 – 10        8               12                      1/8                 1.5
10 – 14      12               18                      1/12                1.5
Total                         N = 40                                      5.5
HM = 40/5.5 = 7.27
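The grouped harmonic mean in Example 2 can be verified with a few lines of Python. The names are illustrative.

```python
# Harmonic mean for grouped data: N / sum(f / m), with m the class mid-value.
mids = [4, 8, 12]          # mid-values of the dividend-yield classes
companies = [10, 12, 18]   # frequencies f_i

N = sum(companies)                                            # 40
sum_f_over_m = sum(f / m for f, m in zip(companies, mids))    # 5.5
harmonic_mean = N / sum_f_over_m

print(round(harmonic_mean, 2))  # 7.27
```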
Merits and Demerits of HM / Relationship between AM, GM and HM
Merits
It is based on all observations of the series.
It is suitable for series having wide dispersion.
Demerits
It is difficult to calculate.
It is not often used for analyzing business problems.
Relationship between AM, GM, and HM
If all values are equal, then AM = GM = HM.
If the values are different, then AM > GM > HM.
If the observations take the values a, ar, ar^2, ar^3, …, ar^n, then (GM)^2 = AM × HM.
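These relationships can be checked numerically. The sketch below uses Python's standard `statistics` module on a small hypothetical geometric progression (a = 2, r = 3), chosen purely for illustration.

```python
from statistics import mean, geometric_mean, harmonic_mean

# Hypothetical data in geometric progression: a, ar, ar^2, ar^3 with a = 2, r = 3.
values = [2, 6, 18, 54]

am = mean(values)
gm = geometric_mean(values)
hm = harmonic_mean(values)

# For unequal positive values the ordering AM > GM > HM should hold,
# and for a geometric progression GM squared should equal AM times HM.
ordering_holds = am > gm > hm
product_gap = abs(gm ** 2 - am * hm)

print(am, round(gm, 4), round(hm, 4))
print(ordering_holds, product_gap < 1e-9)
```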
Averages of Position - Median
Median: the middle value in the data set when the elements are arranged in sequential order (either ascending or descending).
Median for ungrouped data:
If the number of observations (n) is odd, then Median = value of the {(n + 1)/2}th observation.
If the number of observations is even, then Median = average of the (n/2)th and (n/2 + 1)th observations in the data set.
Exercise 1: What is the median value for the following data set: 3.5, 4, 3.8, 3, 5.5, 5, 4.5? What is the median if 5.8 is added to this data set?
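The odd and even cases of the median rule can be sketched as a small function. The helper name `median_of` is illustrative; the data are those of Exercise 1, so the printed values can be used to check a hand calculation.

```python
def median_of(values):
    """Median of ungrouped data: middle value if n is odd,
    average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # the ((n + 1) / 2)th observation
    return (ordered[mid - 1] + ordered[mid]) / 2  # (n/2)th and (n/2 + 1)th observations

data = [3.5, 4, 3.8, 3, 5.5, 5, 4.5]
median_odd = median_of(data)            # n = 7
median_even = median_of(data + [5.8])   # n = 8

print(median_odd, median_even)  # 4 4.25
```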
Averages of Position - Median
Median for grouped data:
$\text{Median} = l + \dfrac{\frac{n}{2} - cf}{f} \times h$
where
l = lower class limit of the median class interval
cf = cumulative frequency of the class prior to the median class interval
f = frequency of the median class
h = width of the median class interval
n = total number of observations in the distribution
Averages of Position - Median
Exercise 2: A survey was conducted to determine the age in years of 120 automobiles. The result of the survey is given in the table below. What is the median age of the autos?
Age of auto (years)   0 – 4   4 – 8   8 – 12   12 – 16   16 – 20
No. of autos          13      29      48       22        8
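Averages of Position - Median
Solution:
Age of autos (years)   Number of autos (fi)   Cumulative frequency (cf)
0 – 4                   13                     13
4 – 8                   29                     42
8 – 12                  48                     90    (median class)
12 – 16                 22                    112
16 – 20                  8                    120
Total                  120
n/2 = 60, so the median class is 8 – 12, with l = 8, cf = 42, f = 48 and h = 4.
Median = 8 + ((60 – 42)/48) × 4 = 8 + 1.5 = 9.5 years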
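The grouped-median calculation can be traced step by step in Python. The sketch assumes the interpolation formula Median = l + ((n/2 − cf)/f) × h, with the symbols as defined for grouped data; the variable names are illustrative.

```python
# Median for grouped data: find the class where the cumulative frequency
# first reaches n/2, then interpolate inside that class.
lower_limits = [0, 4, 8, 12, 16]   # lower limit l of each age class
freqs = [13, 29, 48, 22, 8]        # number of autos f_i
h = 4                              # class width

n = sum(freqs)
half = n / 2                       # 60.0

cf = 0                             # cumulative frequency before the current class
for l, f in zip(lower_limits, freqs):
    if cf + f >= half:             # this is the median class
        median_age = l + (half - cf) / f * h
        break
    cf += f

print(l, cf, f, median_age)  # 8 42 48 9.5
```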
Partition Values – Quartiles, Deciles, Percentiles
Quartiles: The values of observations in a data set, when arranged in an ordered sequence, can be divided into four equal parts, or quarters, using three quartiles, viz. Q1, Q2 and Q3. The first quartile Q1 divides the distribution in such a way that 25 percent of the observations have a value less than Q1 and 75 percent of the values are more than Q1.
Partition Values – Quartiles, Deciles, Percentiles
Deciles: The values of observations in a data set, when arranged in an ordered sequence, can be divided into ten equal parts using nine deciles (D1, D2, …, D9).
Percentiles: The values of observations in a data set, when arranged in an ordered sequence, can be divided into 100 equal parts using 99 percentiles (P1, P2, …, P99).
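For grouped data, quartiles, deciles and percentiles are commonly located the same way as the median, with n/2 replaced by k·n/4, k·n/10 or k·n/100. The sketch below assumes that interpolation rule, value = l + ((k·n/parts − cf)/f) × h, and applies it to the automobile age distribution from the median exercise; the function name `partition_value` is hypothetical.

```python
def partition_value(lower_limits, freqs, width, k, parts):
    """k-th partition value when the data are split into `parts` equal parts
    (parts = 4 for quartiles, 10 for deciles, 100 for percentiles)."""
    target = k * sum(freqs) / parts        # position k * n / parts
    cf = 0                                 # cumulative frequency before the class
    for l, f in zip(lower_limits, freqs):
        if f > 0 and cf + f >= target:     # class containing the target position
            return l + (target - cf) / f * width
        cf += f
    raise ValueError("k must lie between 0 and parts")

lower_limits = [0, 4, 8, 12, 16]   # age classes of the automobile survey
freqs = [13, 29, 48, 22, 8]

q1 = partition_value(lower_limits, freqs, 4, 1, 4)       # first quartile
d5 = partition_value(lower_limits, freqs, 4, 5, 10)      # fifth decile = median
p90 = partition_value(lower_limits, freqs, 4, 90, 100)   # 90th percentile

print(round(q1, 2), d5, round(p90, 2))  # 6.34 9.5 15.27
```

The fifth decile, the second quartile and the 50th percentile all coincide with the median, which gives a convenient sanity check.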
Partition Values – Quartiles, Deciles, Percentiles
Exercise 1: The following is the distribution of weekly wages of 600 workers in a factory.
Weekly wages (Rs.)   No. of workers
Below 375             69
375 – 450            167
450 – 525            207
525 – 600             65
600 – 625             58
625 – 750             24
750 – 825             10
Find the 1st quartile and 3rd quartile.
Find the 5th decile and 7th decile.
Find the 29th percentile and 95th percentile.
Find the median.
Averages of Position - Mode
Mode: the value of an observation which occurs most frequently in the data set, i.e. the point or class mark with the highest frequency.
For grouped data:
$\text{Mode} = l + \dfrac{f_m - f_{m-1}}{2f_m - f_{m-1} - f_{m+1}} \times h$
where
$l$ = lower limit of the modal class interval
$f_m$ = frequency of the modal class
$f_{m-1}$ = frequency of the class preceding the modal class
$f_{m+1}$ = frequency of the class following the modal class
$h$ = width of the modal class interval
Exercise 1: Find the mode of the distribution in the earlier example.
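A minimal sketch of the grouped-data mode, assuming the standard interpolation formula Mode = l + (fm − fm−1)/(2fm − fm−1 − fm+1) × h, which matches the symbols listed above. It is applied here to the automobile age distribution from the median exercise; that choice of data is an assumption, since the slide does not say which earlier example is meant.

```python
# Mode for grouped data: locate the class with the highest frequency,
# then interpolate using the frequencies of the neighbouring classes.
lower_limits = [0, 4, 8, 12, 16]   # age classes of the automobile survey
freqs = [13, 29, 48, 22, 8]
h = 4                              # class width

i = freqs.index(max(freqs))                          # index of the modal class
f_m = freqs[i]
f_prev = freqs[i - 1] if i > 0 else 0                # class before the modal class
f_next = freqs[i + 1] if i < len(freqs) - 1 else 0   # class after the modal class

mode_age = lower_limits[i] + (f_m - f_prev) / (2 * f_m - f_prev - f_next) * h
print(round(mode_age, 2))  # 9.69
```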
Averages of Position - Mode
Graphical method
[Figure: histogram of frequency against class interval, with the mode marked on the class-interval axis]
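Relationship between Mean, Median and Mode
[Figure: three frequency curves: a symmetrical curve where mean = median = mode, and two skewed curves with the positions marked in the orders Mean, Median, Mode and Mode, Median, Mean]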
Mean – Mode = 3(Mean – Median)
For positively skewed distribution, Mean>Median>Mode
For negatively skewed distribution, Mean<Median<Mode
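The empirical relation Mean – Mode = 3(Mean – Median) can be rearranged to Mode = 3 × Median – 2 × Mean and tried on the automobile age data from the median exercise. This is only a sketch: the relation is approximate, the median of 9.5 is taken from the worked solution, and the variable names are illustrative.

```python
# Empirical relation: Mean - Mode = 3 * (Mean - Median)
# rearranged to       Mode = 3 * Median - 2 * Mean.
mids = [2, 6, 10, 14, 18]      # mid-values of the age classes 0-4, ..., 16-20
freqs = [13, 29, 48, 22, 8]

mean_age = sum(f * m for f, m in zip(freqs, mids)) / sum(freqs)   # 1132 / 120
median_age = 9.5               # from the worked median solution
mode_estimate = 3 * median_age - 2 * mean_age

print(round(mean_age, 2), median_age, round(mode_estimate, 2))  # 9.43 9.5 9.63
```

Here mean < median < estimated mode, which is the ordering given above for a negatively skewed distribution, although the skew in this data set is slight.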
Next Session : Measures of Dispersion