Measures of Central Tendency

Measures of Central Tendency · (Location) Measures of location indicate where on the number line the data are to be found. Common measures of location are: (i) the Arithmetic Mean,


Measures of Central Tendency (Location)

Measures of location indicate where on the number line the data are to be found. Common measures of location are:

(i) the Arithmetic Mean,
(ii) the Median, and
(iii) the Mode


Mean

The mean is the most widely used average in statistics. It is found by adding up all the values in the data and dividing by how many values there are.

Notation: If the data values are $x_1, x_2, x_3, \ldots, x_n$, then the mean is

$$\bar{x} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n} = \frac{\sum x_i}{n}$$

Here $\bar{x}$ is the mean symbol, and $\sum x_i$ means the total of all the x values.


Q1: Calculate the mean of 1, 2, 3, 4, 5, 6.

Answer:

$$\text{Mean} = \bar{x} = \frac{\sum x_i}{n} = \frac{1 + 2 + 3 + 4 + 5 + 6}{6} = \frac{21}{6} = 3.5$$


Q2: Find the mean of the data 13, 27, 30, 40, 67, 55.

Answer:

$$\text{Mean} = \bar{x} = \frac{\sum x_i}{n} = \frac{13 + 27 + 30 + 40 + 67 + 55}{6} = \frac{232}{6} \approx 38.67$$


Q3: Find the mean of the daily wages of 10 workers: 13, 16, 15, 15, 18, 15, 14, 18, 16, 10.

Answer:

$$\text{Mean} = \bar{x} = \frac{\sum x_i}{n} = \frac{13 + 16 + 15 + 15 + 18 + 15 + 14 + 18 + 16 + 10}{10} = \frac{150}{10} = 15$$
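The three questions above apply the same rule: add the values and divide by how many there are. A minimal Python sketch of that rule follows; the function name `arithmetic_mean` is illustrative and does not come from the slides.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by how many values there are."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


# Data sets taken from Q1, Q2 and Q3.
q1 = arithmetic_mean([1, 2, 3, 4, 5, 6])
q2 = arithmetic_mean([13, 27, 30, 40, 67, 55])
q3 = arithmetic_mean([13, 16, 15, 15, 18, 15, 14, 18, 16, 10])

print(q1, round(q2, 2), q3)
```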


Mean

If data are presented in a frequency table:

Value    Frequency
$x_1$    $f_1$
$x_2$    $f_2$
…        …
$x_n$    $f_n$

then the mean is

$$\bar{x} = \frac{x_1 f_1 + x_2 f_2 + \cdots + x_n f_n}{f_1 + f_2 + \cdots + f_n} = \frac{\sum x_i f_i}{\sum f_i}$$


Mean

Example: The table shows the results of a survey into household size. Find the mean size.

To find the mean, we add a 3rd column to the table.

Household size, x    Frequency, f    x × f
1                    20              20
2                    28              56
3                    25              75
4                    19              76
5                    16              80
6                    6               36
TOTAL                114             343

Mean = 343 ÷ 114 = 3.01
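The third-column method can be sketched in Python as follows. The helper name `frequency_mean` and the list-of-pairs layout are assumptions made for illustration; the numbers are the household data from the table.

```python
def frequency_mean(table):
    """Mean of data given as (value, frequency) pairs: sum(x*f) / sum(f)."""
    total_f = sum(f for _, f in table)
    if total_f == 0:
        raise ValueError("total frequency must be positive")
    total_xf = sum(x * f for x, f in table)
    return total_xf / total_f


# (household size, frequency) pairs from the survey table.
households = [(1, 20), (2, 28), (3, 25), (4, 19), (5, 16), (6, 6)]

total_f = sum(f for _, f in households)        # 114
total_xf = sum(x * f for x, f in households)   # 343
mean_size = frequency_mean(households)

print(total_f, total_xf, round(mean_size, 2))
```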




The Median and Mode

• If the sample data are arranged in increasing order, the median is

(i) the middle value if n is an odd number, or
(ii) midway between the two middle values if n is an even number.

• The mode is the most commonly occurring value.


Example 1 – n is odd

The reordered systolic blood pressure data seen earlier are:

113, 124, 124, 132, 146, 151, and 170.

The Median is the middle value of the ordered data, i.e. 132.

Two individuals have systolic blood pressure = 124 mm Hg, so the Mode is 124.


Example 2 – n is even

Six men with high cholesterol participated in a study to investigate the effects of diet on cholesterol level. At the beginning of the study, their cholesterol levels (mg/dL) were as follows:

366, 327, 274, 292, 274 and 230.

Rearrange the data in numerical order as follows:

230, 274, 274, 292, 327 and 366.

The Median is halfway between the middle two readings, i.e. (274 + 292) ÷ 2 = 283.

Two men have the same cholesterol level, so the Mode is 274.
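Both examples can be checked with Python's standard `statistics` module. This is only a sketch of one way to do the check; the variable names are illustrative.

```python
import statistics

# Systolic blood pressure readings from Example 1 (n = 7, odd).
blood_pressure = [113, 124, 124, 132, 146, 151, 170]
# Cholesterol readings from Example 2 (n = 6, even), in the order recorded.
cholesterol = [366, 327, 274, 292, 274, 230]

# statistics.median sorts the data itself; for even n it averages
# the two middle values.
bp_median = statistics.median(blood_pressure)
chol_median = statistics.median(cholesterol)

# statistics.mode returns the most commonly occurring value.
bp_mode = statistics.mode(blood_pressure)
chol_mode = statistics.mode(cholesterol)

print(bp_median, bp_mode)
print(chol_median, chol_mode)
```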


Median in a Frequency Distribution

12.2 – Measures of Central Tendency

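Example: Find the median for the distribution.

Value (x)        1  2  3  4  5
Frequency (f)    4  3  2  6  8

The position of the median is the sum of the frequencies divided by 2, rounded up to the next whole term when the result is not a whole number.

$$\text{Position of the median} = \frac{\sum f}{2} = \frac{23}{2} = 11.5, \text{ so the median is the 12th term.}$$

Add the frequencies from one end until the running total reaches at least 12: 4, 7, 9, 15. The 12th term falls in the group with value 4, so the median is 4.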
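The running-total search can be sketched in Python as below. The function `frequency_median` is an illustrative helper, not part of the slides; for an even total it averages the two middle terms, matching the earlier definition of the median.

```python
def frequency_median(values, freqs):
    """Median of data given as parallel lists of values and frequencies."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("total frequency must be positive")
    ordered = sorted(zip(values, freqs))

    def term(k):
        """Return the k-th smallest observation (1-based) via running totals."""
        running = 0
        for x, f in ordered:
            running += f
            if running >= k:
                return x

    if n % 2 == 1:
        return term((n + 1) // 2)
    return (term(n // 2) + term(n // 2 + 1)) / 2


# Distribution from the example: 23 observations, so the 12th term is the median.
median_value = frequency_median([1, 2, 3, 4, 5], [4, 3, 2, 6, 8])
print(median_value)
```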


The median of a sample of data organized in a frequency distribution is computed by:

$$\text{Median} = l + \frac{\frac{n}{2} - CF}{f}\,(h)$$

where l is the lower limit of the median class, CF is the cumulative frequency preceding the median class, f is the frequency of the median class, and h is the median class interval.


To determine the median class for grouped data

Construct a cumulative frequency distribution.

Divide the total number of data values by 2.

Determine which class will contain this value. For example, if n=50, 50/2 = 25, then determine which class will contain the 25th value.


Movies showing    Frequency    Cumulative Frequency
1 up to 3         1            1
3 up to 5         2            3
5 up to 7         3            6
7 up to 9         1            7
9 up to 11        3            10


From the table, l = 5, n = 10, f = 3, h = 2, CF = 3.

$$\text{Median} = l + \frac{\frac{n}{2} - CF}{f}\,(h) = 5 + \frac{\frac{10}{2} - 3}{3}\,(2) = 6.33$$
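The grouped-data formula can be sketched in Python as follows. The function name `grouped_median` and the `(lower, upper, frequency)` tuple layout are assumptions for illustration; the classes are assumed to be listed in increasing order.

```python
def grouped_median(classes):
    """Median for grouped data: l + ((n/2 - CF) / f) * h.

    `classes` is a list of (lower_limit, upper_limit, frequency) tuples
    in increasing order.
    """
    n = sum(f for _, _, f in classes)
    if n == 0:
        raise ValueError("total frequency must be positive")
    half = n / 2
    cf = 0  # cumulative frequency of the classes before the current one
    for lower, upper, f in classes:
        if f > 0 and cf + f >= half:
            h = upper - lower  # class interval of the median class
            return lower + (half - cf) / f * h
        cf += f


# "Movies showing" table: the median class is 5 up to 7.
movies = [(1, 3, 1), (3, 5, 2), (5, 7, 3), (7, 9, 1), (9, 11, 3)]
movies_median = grouped_median(movies)
print(round(movies_median, 2))
```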


Mode in a Frequency Distribution

Example: Find the mode for the distribution.

Value (x) 1 2 3 4 5

Frequency (f) 4 3 2 6 8

The mode in a frequency distribution is the value that has the largest frequency.

The mode for this frequency distribution is 5 as it occurs eight times.
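Picking the value with the largest frequency is a one-line search in Python. The sketch below uses the distribution from the example; note that `max` returns the first value found when several share the largest frequency, which is a choice made here for illustration only.

```python
values = [1, 2, 3, 4, 5]
freqs = [4, 3, 2, 6, 8]

# Pair each value with its frequency and keep the pair with the largest frequency.
mode_value, mode_freq = max(zip(values, freqs), key=lambda pair: pair[1])

print(mode_value, mode_freq)
```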


Example: Comparing the Mean, Median, and Mode

Find the mean, median, and mode of the sample ages of a class shown. Which measure of central tendency best describes a typical entry of this data set? Are there any outliers?

Ages in a class

20 20 20 20 20 20 21

21 21 21 22 22 22 23

23 23 23 24 24 65


Solution: Comparing the Mean, Median, and Mode

Mean: $\bar{x} = \frac{\sum x}{n} = \frac{20 + 20 + \cdots + 24 + 65}{20} = \frac{475}{20} \approx 23.8$ years

Median: $\frac{21 + 22}{2} = 21.5$ years

Mode: 20 years (the entry occurring with the greatest frequency)


Solution: Comparing the Mean, Median, and Mode

Mean ≈ 23.8 years Median = 21.5 years Mode = 20 years

• The mean takes every entry into account, but is influenced by the outlier of 65.

• The median also takes every entry into account, and it is not affected by the outlier.

• In this case the mode exists, but it doesn't appear to represent a typical entry.
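To see how strongly the single entry of 65 pulls the mean, the three measures can be recomputed with and without it. This is a sketch using Python's `statistics` module; dropping the outlier is done here purely for comparison and is not a step from the slides.

```python
import statistics

# Ages in the class, as listed in the example.
ages = [20] * 6 + [21] * 4 + [22] * 3 + [23] * 4 + [24] * 2 + [65]
# Same data with the outlier removed, for comparison only.
ages_without_outlier = [a for a in ages if a != 65]

for label, data in (("all ages", ages), ("without 65", ages_without_outlier)):
    print(
        label,
        round(statistics.mean(data), 2),
        statistics.median(data),
        statistics.mode(data),
    )
```

The mean changes noticeably between the two runs, while the median and mode barely move.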


Solution: Comparing the Mean, Median, and Mode

Sometimes a graphical comparison can help you decide which measure of central tendency best represents a data set.

In this case, it appears that the median best describes the data set.


Measures of Variation

There are 3 values used to measure the amount of dispersion or variation (the spread of the group):

1. Range
2. Variance
3. Standard Deviation


Why is it Important?

• You want to choose the best brand of paint for your house. You are interested in how long the paint lasts before it fades and you must repaint. The choices are narrowed down to 2 different paints. The results are shown in the chart. Which paint would you choose?


The chart indicates the number of months a paint lasts before fading.

         Paint A    Paint B
         10         35
         60         45
         50         30
         30         35
         40         40
         20         25
Total    210        210


Does the Average Help?

• Paint A: Avg = 210/6 = 35 months

• Paint B: Avg = 210/6 = 35 months

• They both last 35 months before fading. No help in deciding which to buy.


Consider the Spread

• Paint A: Spread = 60 – 10 = 50 months

• Paint B: Spread = 45 – 25 = 20 months

• Paint B has a smaller spread (range), which means that it performs more consistently. Choose paint B.
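The paint comparison can be reproduced with a short Python sketch. The helper name `value_range` is illustrative; the data are the months-before-fading figures from the chart.

```python
def value_range(data):
    """Range = highest value minus lowest value."""
    return max(data) - min(data)


# Months each paint lasted before fading.
paint_a = [10, 60, 50, 30, 40, 20]
paint_b = [35, 45, 30, 35, 40, 25]

mean_a = sum(paint_a) / len(paint_a)
mean_b = sum(paint_b) / len(paint_b)
range_a = value_range(paint_a)
range_b = value_range(paint_b)

print(mean_a, mean_b)    # the averages are identical
print(range_a, range_b)  # the spreads differ
```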


Range

• The range is the difference between the lowest value in the set and the highest value in the set.

• Range = High - Low


Example

• Find the range of the data set.

• 40, 30, 15, 2, 100, 37, 24, 99

• Range = 100 – 2 = 98


Mean Deviation from the Mean

Let us first understand what ‘mean deviation’ is. Mean deviation is the mean of the absolute deviations of a set of observations, taken from a chosen central value (the mean, the median, or any other fixed value).

The keyword to note in the above definition is ‘absolute’: only the numerical value of the deviation is to be taken, ignoring the sign.


Mean Deviation from the Mean

Mean deviation from the mean for raw data (unclassified):

In this case mean deviation from the mean for a set of n observations is given by

$$\text{M.D.}(\bar{x}) = \frac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n}$$

Mean deviation from the mean for grouped data (classified):

In this case if $x_i$'s are the mid-points of classes with frequency $f_i$, then the mean deviation from the mean is given by

$$\text{M.D.}(\bar{x}) = \frac{\sum_{i=1}^{n} f_i\,|x_i - \bar{x}|}{\sum_{i=1}^{n} f_i}$$


Example: Consider the sample {12, 23, 17, 15, 18}. Find the mean deviation from the mean.

Solution:

$$\bar{x} = \frac{12 + 23 + 17 + 15 + 18}{5} = \frac{85}{5} = 17$$

Data, x    Deviation from mean, x - x̄
12         -5
23          6
17          0
15         -2
18          1


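Mean Absolute Deviation

Note: $\sum (x - \bar{x}) = 0$ (always!)

Mean Absolute Deviation: The mean of the absolute values of the deviations from the mean:

$$\text{Mean absolute deviation} = \frac{1}{n}\sum |x - \bar{x}|$$

For the previous example:

$$\frac{1}{n}\sum |x - \bar{x}| = \frac{1}{5}(5 + 6 + 0 + 2 + 1) = \frac{14}{5} = 2.8$$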
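A minimal Python sketch of the mean absolute deviation, using the sample {12, 23, 17, 15, 18} from the example, is given below. The function name `mean_absolute_deviation` is illustrative.

```python
def mean_absolute_deviation(data):
    """Mean of the absolute deviations of the data from their mean."""
    n = len(data)
    if n == 0:
        raise ValueError("data set must not be empty")
    mean = sum(data) / n
    return sum(abs(x - mean) for x in data) / n


sample = [12, 23, 17, 15, 18]
sample_mean = sum(sample) / len(sample)

# The signed deviations always cancel out, which is why absolute values are used.
signed_total = sum(x - sample_mean for x in sample)
mad = mean_absolute_deviation(sample)

print(sample_mean, signed_total, mad)
```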


Mean Deviation from the Median

The only difference here is that the mean is replaced by the value of the median.

Mean deviation from the median for raw data (unclassified)

In this case mean deviation from the median for a set of n observations is given by

$$\text{M.D.} = \frac{\sum_{i=1}^{n} |x_i - \text{Median}|}{n}$$

Mean deviation from the median for grouped data (classified)

In this case if $x_i$'s are the mid-points of classes with frequency $f_i$, then the mean deviation from the median is given by

$$\text{M.D.} = \frac{\sum_{i=1}^{n} f_i\,|x_i - \text{Median}|}{\sum_{i=1}^{n} f_i}$$


Solution

Calculation of median and mean deviation

x_i      f_i    Cumulative frequency    |x_i - 12|    f_i |x_i - 12|
5        1      1                       7             7
6        5      6                       6             30
7        11     17                      5             55
8        14     31                      4             56
9        16     47                      3             48
10       13     60                      2             26
11       10     70                      1             10
12       70     140                     0             0
13       4      144                     1             4
15       1      145                     3             3
18       1      146                     6             6
20       1      147                     8             8
Total    N = 147                                      253

Here, N = 147, so N/2 = 73.5.

The cumulative frequency just greater than N/2 is 140, and the corresponding value of x is 12.

Hence, median = 12.

The value of the median here signifies that for about half the number of days, approximately 12 students were absent.

The last column multiplies each frequency $f_i$ (not the cumulative frequency) by the absolute deviation $|x_i - 12|$.

$$\text{Mean deviation about median} = \frac{1}{N}\sum f_i\,|x_i - 12| = \frac{253}{147} \approx 1.72$$
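As a cross-check on this kind of calculation, the median and the mean deviation about it can be computed directly from the values and frequencies in the table. This is a sketch; the variable names are illustrative, and each deviation is weighted by the class frequency $f_i$, as the grouped-data formula requires.

```python
# Values and frequencies from the absence table.
values = [5, 6, 7, 8, 9, 10, 11, 12, 13, 15, 18, 20]
freqs = [1, 5, 11, 14, 16, 13, 10, 70, 4, 1, 1, 1]

n = sum(freqs)

# Median: the first value whose cumulative frequency exceeds N/2.
cumulative = 0
median = None
for x, f in zip(values, freqs):
    cumulative += f
    if cumulative > n / 2:
        median = x
        break

# Weight each absolute deviation by the frequency, not the cumulative frequency.
total_abs_dev = sum(f * abs(x - median) for x, f in zip(values, freqs))
mean_dev_about_median = total_abs_dev / n

print(n, median, total_abs_dev, round(mean_dev_about_median, 2))
```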


There are three commonly used measures of spread (or dispersion) – the range, the inter-quartile range and the standard deviation.

Standard deviation

The standard deviation is widely used in statistics to measure spread. It is based on all the values in the data, so it is sensitive to the presence of outliers in the data.

The variance is related to the standard deviation:

variance = (standard deviation)²

The following formulae can be used to find the variance and s.d.

$$\text{variance} = \frac{\sum (x_i - \bar{x})^2}{n}$$

$$\text{s.d.} = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n}}$$


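Standard deviation

Example: The mid-day temperatures (in °C) recorded for one week in June were: 21, 23, 24, 19, 19, 20, 21

First we find the mean:

$$\bar{x} = \frac{21 + 23 + \cdots + 21}{7} = \frac{147}{7} = 21\ ^\circ\text{C}$$

Then we tabulate the deviations and their squares, using

$$\text{variance} = \frac{\sum (x_i - \bar{x})^2}{n}$$

x_i      x_i - x̄    (x_i - x̄)²
21        0          0
23        2          4
24        3          9
19       -2          4
19       -2          4
20       -1          1
21        0          0
Total:               22

So variance = 22 ÷ 7 = 3.143

So, s.d. = $\sqrt{3.143}$ = 1.77°C (3 s.f.)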
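The same steps can be sketched in Python. The slides divide by n, so the comparison uses the population functions `statistics.pvariance` and `statistics.pstdev` rather than the sample versions.

```python
import math
import statistics

temps = [21, 23, 24, 19, 19, 20, 21]  # mid-day temperatures in °C

n = len(temps)
mean = sum(temps) / n

# Sum of squared deviations from the mean, then divide by n.
squared_devs = [(t - mean) ** 2 for t in temps]
variance = sum(squared_devs) / n
sd = math.sqrt(variance)

print(mean, sum(squared_devs), round(variance, 3), round(sd, 2))

# The standard library's population functions give the same results.
print(round(statistics.pvariance(temps), 3), round(statistics.pstdev(temps), 2))
```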


Variance and Standard Deviation

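Standard deviation is defined as the positive square root of the variance.

The value of the variance and standard deviation for grouped data is given by

$$\text{Variance, } \sigma^2 = \frac{\sum_{i=1}^{n} f_i\,(x_i - \bar{x})^2}{\sum_{i=1}^{n} f_i}$$

$$\text{S.D., } \sigma = \sqrt{\frac{\sum_{i=1}^{n} f_i\,(x_i - \bar{x})^2}{\sum_{i=1}^{n} f_i}}$$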


Short-cut Method to Find Out Mean ($\bar{x}$) and Variance ($\sigma^2$)

In order to reduce the calculations involved in finding out the values of mean and variance for grouped data, the following algorithm can be used to calculate the same.

Algorithm for finding out the mean ($\bar{x}$) for grouped data:

1. Write down the frequency table with a column giving the class-marks (mid-points of class intervals).

2. Choose a number ‘A’ (usually the middle or almost middle value of all $x_i$'s) and take deviations $d_i = x_i - A$ about A.

3. Divide each deviation by the class width h. Hence you get $u_i = \dfrac{d_i}{h}$.


4. Multiply the frequencies ($f_i$) with the corresponding $u_i$. Calculate the sum $\sum f_i u_i$.

5. Find the sum of all frequencies, $\sum_{i=1}^{n} f_i = N$.

6. Use the formula

$$\bar{X} = A + h\left(\frac{1}{N}\sum_{i=1}^{n} f_i u_i\right)$$
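The six steps can be sketched in Python as follows. The function name `step_deviation_mean` is illustrative; the sample input uses the class mid-points and frequencies from the expenditure exercise elsewhere in this deck, with A = 25 and h = 10.

```python
def step_deviation_mean(midpoints, freqs, assumed_mean, class_width):
    """Mean of grouped data via X = A + h * (1/N) * sum(f_i * u_i)."""
    n = sum(freqs)  # step 5: N = sum of all frequencies
    if n == 0:
        raise ValueError("total frequency must be positive")
    # Steps 2-3: u_i = (x_i - A) / h
    u = [(x - assumed_mean) / class_width for x in midpoints]
    # Step 4: sum of f_i * u_i
    sum_fu = sum(f * ui for f, ui in zip(freqs, u))
    # Step 6: apply the formula
    return assumed_mean + class_width * sum_fu / n


midpoints = [5, 15, 25, 35, 45]  # class-marks of 0-10, 10-20, ..., 40-50
freqs = [5, 12, 8, 4, 2]
shortcut_mean = step_deviation_mean(midpoints, freqs, assumed_mean=25, class_width=10)

print(round(shortcut_mean, 2))
```

Whatever value is chosen for A, the result agrees with the direct calculation $\sum f_i x_i / N$; a central A simply keeps the intermediate numbers small.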


Short-cut Method to Find Out Mean ($\bar{x}$) and Variance ($\sigma^2$)

Similarly, we can also use a short-cut method to calculate the variance ($\sigma^2$) for grouped data:

1. Write down the frequency table with a column giving the class-marks (mid-points of class intervals).

2. Choose a number ‘A’ (usually the middle or almost middle value of all $x_i$'s) and take deviations $d_i = x_i - A$ about A.

3. Multiply the frequencies ($f_i$) with the corresponding $d_i$. Calculate the sum $\sum f_i d_i$.

4. Obtain the square of the deviations above ($d_i^2$).


Short-cut Method to Find Out Mean ($\bar{x}$) and Variance ($\sigma^2$)

5. Multiply the frequencies ($f_i$) with the corresponding $d_i^2$. Calculate the sum $\sum f_i d_i^2$.

6. Find the sum of all frequencies, $\sum_{i=1}^{n} f_i = N$.

7. Use the formula

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{n} f_i d_i^2 - \left(\frac{1}{N}\sum_{i=1}^{n} f_i d_i\right)^2$$
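A Python sketch of the seven steps is given below. The function name `shortcut_variance` is illustrative; the sample input reuses the mid-points and frequencies of the expenditure exercise in this deck, with A = 25.

```python
import math


def shortcut_variance(midpoints, freqs, assumed_mean):
    """Variance of grouped data via (1/N)*sum(f*d^2) - ((1/N)*sum(f*d))^2."""
    n = sum(freqs)  # step 6
    if n == 0:
        raise ValueError("total frequency must be positive")
    d = [x - assumed_mean for x in midpoints]                  # step 2
    sum_fd = sum(f * di for f, di in zip(freqs, d))            # step 3
    sum_fd2 = sum(f * di ** 2 for f, di in zip(freqs, d))      # steps 4-5
    return sum_fd2 / n - (sum_fd / n) ** 2                     # step 7


midpoints = [5, 15, 25, 35, 45]
freqs = [5, 12, 8, 4, 2]
variance = shortcut_variance(midpoints, freqs, assumed_mean=25)
sd = math.sqrt(variance)

print(round(variance, 2), round(sd, 2))
```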


Class Exercise - 2

The following data represents the expenditure pattern of a student for the month of July. The student gets Rs. 50 every day as pocket money.

Expenditure (Rs.)    Frequency (No. of days)
0-10                 5
10-20                12
20-30                8
30-40                4
40-50                2

Calculate the mean and standard deviation.


Solution

Calculation of mean

Class    x_i    f_i       f_i x_i
0-10     5      5         25
10-20    15     12        180
20-30    25     8         200
30-40    35     4         140
40-50    45     2         90
Total           N = 31    635

Hence, mean

$$\bar{x} = \frac{1}{N}\sum f_i x_i = \frac{635}{31} = 20.48$$


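Calculation of standard deviation

x_i      f_i       u_i = (x_i - 25)/10    f_i u_i    u_i²    f_i u_i²
5        5         -2                     -10        4       20
15       12        -1                     -12        1       12
25       8          0                       0        0       0
35       4          1                       4        1       4
45       2          2                       4        4       8
Total    N = 31                           -14                44

Hence, variance

$$\sigma^2 = h^2\left[\frac{1}{N}\sum f_i u_i^2 - \left(\frac{1}{N}\sum f_i u_i\right)^2\right] = 100\left[\frac{44}{31} - \left(\frac{-14}{31}\right)^2\right] = 121.54$$

Hence, $\sigma = \sqrt{121.54} = 11.02$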


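Standard deviation

There is an alternative formula which is usually a more convenient way to find the variance. We start from

$$\text{variance} = \frac{\sum (x_i - \bar{x})^2}{n}$$

But, since $\sum x_i = n\bar{x}$,

$$\sum (x_i - \bar{x})^2 = \sum \left(x_i^2 - 2\bar{x}x_i + \bar{x}^2\right) = \sum x_i^2 - 2\bar{x}\sum x_i + n\bar{x}^2 = \sum x_i^2 - 2n\bar{x}^2 + n\bar{x}^2 = \sum x_i^2 - n\bar{x}^2$$

Therefore,

$$\text{variance} = \frac{\sum x_i^2}{n} - \bar{x}^2 \quad\text{and}\quad \text{s.d.} = \sqrt{\frac{\sum x_i^2}{n} - \bar{x}^2}$$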


Standard deviation

Example (continued): Looking again at the temperature data for June: 21, 23, 24, 19, 19, 20, 21

We know that

$$\bar{x} = \frac{147}{7} = 21\ ^\circ\text{C}$$

Also,

$$\sum x_i^2 = 21^2 + 23^2 + \cdots + 21^2 = 3109$$

So,

$$\text{variance} = \frac{\sum x_i^2}{n} - \bar{x}^2 = \frac{3109}{7} - 21^2 = 3.143$$

$$\text{s.d.} = \sqrt{3.143} = 1.77\ ^\circ\text{C}$$

Note: Essentially the standard deviation is a measure of how close the values are to the mean value.
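The two variance formulas can be compared numerically on the temperature data. This is a small sketch with illustrative variable names; the two results agree up to floating-point rounding.

```python
import math

temps = [21, 23, 24, 19, 19, 20, 21]
n = len(temps)
mean = sum(temps) / n

# Definition: average squared deviation from the mean.
variance_definition = sum((t - mean) ** 2 for t in temps) / n

# Alternative formula: mean of the squares minus the square of the mean.
sum_of_squares = sum(t ** 2 for t in temps)
variance_alternative = sum_of_squares / n - mean ** 2

sd = math.sqrt(variance_alternative)

print(sum_of_squares, round(variance_definition, 3), round(variance_alternative, 3), round(sd, 2))
```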


Calculating standard deviation from a table

When the data is presented in a frequency table, the formula for finding the standard deviation needs to be adjusted slightly:

$$\text{s.d.} = \sqrt{\frac{\sum f_i x_i^2}{\sum f_i} - \bar{x}^2}$$

Example: A class of 20 students were asked how many times they exercise in a normal week. Find the mean and the standard deviation.

Number of times exercise taken    Frequency
0                                 5
1                                 3
2                                 5
3                                 4
4                                 2
5                                 1


Calculating standard deviation from a table

The table can be extended to help find the mean and the s.d.

No. of times exercise taken, x    Frequency, f    x × f    x² × f
0                                 5               0        0
1                                 3               3        3
2                                 5               10       20
3                                 4               12       36
4                                 2               8        32
5                                 1               5        25
TOTAL:                            20              38       116

$$\bar{x} = \frac{38}{20} = 1.9$$

$$\text{s.d.} = \sqrt{\frac{\sum f_i x_i^2}{\sum f_i} - \bar{x}^2} = \sqrt{\frac{116}{20} - 1.9^2} = 1.48$$
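The extended-table calculation can be sketched in Python as follows. The function name `frequency_mean_and_sd` is illustrative; it divides by the total frequency, as the slides do.

```python
import math


def frequency_mean_and_sd(values, freqs):
    """Mean and standard deviation from a frequency table.

    Uses s.d. = sqrt(sum(f*x^2)/sum(f) - mean^2).
    """
    total_f = sum(freqs)
    if total_f == 0:
        raise ValueError("total frequency must be positive")
    total_xf = sum(x * f for x, f in zip(values, freqs))
    total_x2f = sum(x * x * f for x, f in zip(values, freqs))
    mean = total_xf / total_f
    sd = math.sqrt(total_x2f / total_f - mean ** 2)
    return mean, sd


# Exercise survey: number of times exercised per week and how many students.
times = [0, 1, 2, 3, 4, 5]
students = [5, 3, 5, 4, 2, 1]

exercise_mean, exercise_sd = frequency_mean_and_sd(times, students)
print(round(exercise_mean, 2), round(exercise_sd, 2))
```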