Where are we? Measure of central tendency FETP India


Page 1: Where are we? Measure of central tendency FETP India

Where are we?

Measure of central tendency

FETP India

Page 2: Where are we? Measure of central tendency FETP India

Competency to be gained from this lecture

Calculate a measure of central tendency that is adapted to the sample studied

Page 3: Where are we? Measure of central tendency FETP India

Key issues

• Measures of central tendency
  - Mode
  - Median
  - Mean
  - Geometric mean

• Appropriate applications

Page 4: Where are we? Measure of central tendency FETP India

Summary statistics

• A single value that summarizes the observed values of a variable
  - Part of the data reduction process

• Two types:
  - Measures of location/central tendency/average
  - Measures of dispersion/variability/spread

• Describe the shape of the distribution of a set of observations

• Necessary for precise and efficient comparisons of different sets of data
  - The location (average) and shape (variability) of different distributions may be different

Page 5: Where are we? Measure of central tendency FETP India

Different variability, same location

[Figure: number of people (y-axis) against factor X (x-axis) for Population A and Population B]

Page 6: Where are we? Measure of central tendency FETP India

Different location, same variability

Page 7: Where are we? Measure of central tendency FETP India

Quick definitions of measures of central tendency

• Mode
  - The most frequently occurring observation

• Median
  - The mid-point of a set of ordered observations

• Arithmetic mean
  - The sum of the observations divided by the number of observations

Page 8: Where are we? Measure of central tendency FETP India

The mode

• Definition
  - The mode of a distribution is the value that is observed most frequently in a given set of data

• How to obtain it?
  - Arrange the data in sequence from low to high
  - Count the number of times each value occurs
  - The most frequently occurring value is the mode

Mode

Page 9: Where are we? Measure of central tendency FETP India

The mode

[Figure: frequency distribution (N on the y-axis, scale 0 to 20) with the mode marked]

Mode

Page 10: Where are we? Measure of central tendency FETP India

Examples of mode (1/2): Annual salary (in 100,000 rupees)

• 4, 3, 3, 2, 3, 8, 4, 3, 7, 2

• Arranging the values in order:
  - 2, 2, 3, 3, 3, 3, 4, 4, 7, 8
  - The mode is 3 (it occurs four times)

Mode

Page 11: Where are we? Measure of central tendency FETP India

Examples of mode (2/2): Incubation period for hepatitis affected persons (in days)

• 29, 31, 24, 29, 30, 25

• Arranging the values in order:
  - 24, 25, 29, 29, 30, 31
  - Mode is 29

Mode
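The counting procedure used in the two examples above can be sketched in Python. This is only an illustration: the helper name `mode_of` is hypothetical, and ties are resolved in favour of the value seen first, which is a simplification.

```python
from collections import Counter


def mode_of(values):
    """Return (most frequent value, number of times it occurs).

    Hypothetical helper: if several values tie, the first one
    encountered in the input is returned.
    """
    counts = Counter(values)                  # tally each distinct value
    value, count = counts.most_common(1)[0]   # highest tally first
    return value, count


# Data taken from the two examples above
salaries = [4, 3, 3, 2, 3, 8, 4, 3, 7, 2]   # annual salary, in 100,000 rupees
incubation = [29, 31, 24, 29, 30, 25]       # incubation period, in days

print(sorted(salaries), mode_of(salaries))
print(sorted(incubation), mode_of(incubation))
```

Sorting is not needed for the count itself; it is printed only to mirror the "arrange the values in order" step.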

Page 12: Where are we? Measure of central tendency FETP India

The mode is the only location statistic that can be used when the characteristic itself cannot be measured

Colour preference of people for their cars

Colour preference    Number of people
Green                354
Blue                 852
Gray                 310
Red                  474

Mode: Blue (852 people)

Mode
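For a characteristic that is named rather than measured, the mode is simply the category with the largest count. A minimal sketch using the colour table above (the variable names are illustrative):

```python
# Counts copied from the colour preference table above
colour_counts = {"Green": 354, "Blue": 852, "Gray": 310, "Red": 474}

# The modal category is the key with the largest count
modal_colour = max(colour_counts, key=colour_counts.get)

print(modal_colour, colour_counts[modal_colour])
```

No mean or median can be computed here, because the colours have no numeric value or natural order.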

Page 13: Where are we? Measure of central tendency FETP India

Specific features of the mode

• There may be no mode
  - When each value is unique

• There may be more than one mode
  - When more than 1 peak occurs
  - Bimodal distribution

• The mode can be misinterpreted
  - Is a distribution skewed or bimodal?

• The mode is not amenable to statistical tests

• The mode is not based upon all observations

Mode
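The first two features (no mode, more than one mode) can be illustrated with a short Python sketch. The function `all_modes` and both data sets are hypothetical; returning an empty list when every value is unique is one possible convention, not the only one.

```python
from collections import Counter


def all_modes(values):
    """Return every value that shares the highest frequency.

    Hypothetical convention: an empty list means "no mode"
    (every value occurs exactly once).
    """
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1 and len(counts) > 1:
        return []                      # each value is unique: no mode
    return sorted(v for v, c in counts.items() if c == top)


# Made-up data, for illustration only
unique_values = [1, 2, 3, 4]           # no value repeats
two_peaks = [2, 2, 5, 5, 7]            # 2 and 5 both occur twice

print(all_modes(unique_values))
print(all_modes(two_peaks))
```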

Page 14: Where are we? Measure of central tendency FETP India

The median

• The median describes literally the middle value of the data

• It is defined as the value above and below which half (50%) of the observations fall

Median

Page 15: Where are we? Measure of central tendency FETP India

Computing the median

• Arrange the observations in order from smallest to largest (ascending order) or vice-versa

• Count the number of observations “n”
  - If “n” is an odd number: Median = value of the [(n + 1) / 2]th observation
  - If “n” is an even number: Median = the average of the (n / 2)th and [(n / 2) + 1]th observations

Median

Page 16: Where are we? Measure of central tendency FETP India

Example of median calculation

• What is the median of the following values: 10, 20, 12, 3, 18, 16, 14, 25, 2
  - Arrange the numbers in increasing order: 2, 3, 10, 12, 14, 16, 18, 20, 25
  - n = 9 (odd), so the median is the 5th value: Median = 14

• Suppose there is one more observation (8)
  - 2, 3, 8, 10, 12, 14, 16, 18, 20, 25
  - n = 10 (even), so the median is the mean of the 5th and 6th values: Median = Mean of 12 & 14 = 13

Median
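The odd/even rule and the worked example above can be sketched as follows. The function name `median_by_rank` is hypothetical; Python lists are 0-indexed, so the kth observation is at index k - 1.

```python
def median_by_rank(values):
    """Median using the odd/even rank rule (hypothetical helper)."""
    ordered = sorted(values)           # ascending order
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    if n % 2 == 1:
        # odd n: the (n + 1) / 2 th observation
        return ordered[(n + 1) // 2 - 1]
    # even n: average of the (n / 2)th and (n / 2 + 1)th observations
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


# Values from the example above
nine_values = [10, 20, 12, 3, 18, 16, 14, 25, 2]
ten_values = nine_values + [8]         # one more observation

print(median_by_rank(nine_values))
print(median_by_rank(ten_values))
```

The standard library function `statistics.median` applies the same rule and can be used instead of a hand-written helper.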

Page 17: Where are we? Measure of central tendency FETP India

Advantages and disadvantages of the median

• Advantages
  - The median is unaffected by extreme values

• Disadvantages
  - The median does not contain information on the other values of the distribution
    · Only selected by its rank
    · You can change 50% of the values without affecting the median
  - The median is less amenable to statistical tests

Median

Page 18: Where are we? Measure of central tendency FETP India

Median

The median is not sensitive to extreme values

[Figure: two frequency distributions plotted by class of the variable; both have the same median]

Page 19: Where are we? Measure of central tendency FETP India

Mean (Arithmetic mean / Average)

• Most commonly used measure of location

• Definition
  - Calculated by adding all observed values and dividing by the total number of observations

• Notations
  - Each observation is denoted as $x_1, x_2, \ldots, x_n$
  - The total number of observations: $n$
  - Summation process = Sigma: $\Sigma$
  - The mean: $\bar{x}$

$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$

Mean

Page 20: Where are we? Measure of central tendency FETP India

Computation of the mean

• Duration of stay in days in a hospital: 8, 25, 7, 5, 8, 3, 10, 12, 9
  - 9 observations (n = 9)
  - Sum of all observations = 87
  - Mean duration of stay = 87 / 9 = 9.67

• Incubation period in days of a disease: 8, 45, 7, 5, 8, 3, 10, 12, 9
  - 9 observations (n = 9)
  - Sum of all observations = 107
  - Mean incubation period = 107 / 9 = 11.89

Mean
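The two computations above can be reproduced with a short Python sketch (the function and variable names are illustrative):

```python
def arithmetic_mean(values):
    """Sum of the observations divided by their number."""
    return sum(values) / len(values)


# Series copied from the two examples above
stay_days = [8, 25, 7, 5, 8, 3, 10, 12, 9]
incubation_days = [8, 45, 7, 5, 8, 3, 10, 12, 9]

mean_stay = arithmetic_mean(stay_days)
mean_incubation = arithmetic_mean(incubation_days)

print(sum(stay_days), round(mean_stay, 2))
print(sum(incubation_days), round(mean_incubation, 2))
```

The two series differ in a single observation (25 versus 45), yet the mean moves from 9.67 to 11.89, which shows how one large value shifts it.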

Page 21: Where are we? Measure of central tendency FETP India

Advantages and disadvantages of the mean

• Advantages
  - Has a lot of good theoretical properties
  - Used as the basis of many statistical tests
  - Good summary statistic for a symmetrical distribution

• Disadvantages
  - Less useful for an asymmetric distribution
  - Can be distorted by outliers, therefore giving a less “typical” value

Mean

Page 22: Where are we? Measure of central tendency FETP India

Mean of several groups combined

Group (i)    Size (n_i)    Mean (x_i)    Sum (n_i x_i)
1            10            41            410
2            15            36            540
3            25            42            1050
Total        50            --            2000

Mean of all groups = 2000 / 50 = 40

Crude average = (41 + 36 + 42) / 3 = 39.7
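The table above amounts to a mean weighted by group size. A minimal Python sketch, with illustrative variable names:

```python
# (size n_i, mean x_i) for groups 1 to 3, copied from the table above
groups = [(10, 41), (15, 36), (25, 42)]

total_n = sum(n for n, _ in groups)              # total number of observations
total_sum = sum(n * mean for n, mean in groups)  # sum of n_i * x_i
combined_mean = total_sum / total_n              # mean of all groups combined

# Crude average: ignores group sizes, so it differs from the combined mean
crude_average = sum(mean for _, mean in groups) / len(groups)

print(total_n, total_sum, combined_mean, round(crude_average, 1))
```

The crude average treats the three groups as equally large, which is why it does not match the combined mean.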

Page 23: Where are we? Measure of central tendency FETP India

The geometric mean

• Background
  - Some distributions appear symmetric after log transformation (e.g., neutrophil counts)
  - A log transformation may help describe the central tendency

• Definition
  - The geometric mean is the antilog of the mean of the log values

Geometric mean

Page 24: Where are we? Measure of central tendency FETP India

Calculating a geometric mean

• Observe the set of observations
  - 5, 10, 20, 25, 40

• Take the logarithm (base 10) of these values
  - 0.70, 1.00, 1.30, 1.40 and 1.60

• Calculate the mean of the log values
  - 0.70 + 1.00 + 1.30 + 1.40 + 1.60 = 6.00
  - 6.00 / 5 = 1.20

• Take the antilog of the mean of the log values
  - Antilog (1.20) = 15.85

Geometric mean
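The four steps above can be sketched in Python. Base-10 logarithms are assumed, matching the log values shown; the function name `geometric_mean` is illustrative, and the check for non-positive values is an added safeguard because the logarithm is undefined for them.

```python
import math


def geometric_mean(values):
    """Antilog of the mean of the base-10 logs of the values."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean requires strictly positive values")
    logs = [math.log10(v) for v in values]   # step 2: take the logs
    mean_log = sum(logs) / len(logs)         # step 3: mean of the logs
    return 10 ** mean_log                    # step 4: antilog


observations = [5, 10, 20, 25, 40]           # values from the example above
gm = geometric_mean(observations)

print([round(math.log10(v), 2) for v in observations])
print(round(gm, 2))
```

From Python 3.8 onward, `statistics.geometric_mean` in the standard library computes the same quantity.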

Page 25: Where are we? Measure of central tendency FETP India

Geometric mean of several groups combined

Group (i)    Number of patients (n_i)    Geometric mean (GM)    log GM    n_i * log GM
A            20                          8.5                    0.93      18.60
B            18                          10.2                   1.01      18.18
C            12                          9.4                    0.97      11.64
Total        50                          --                     --        48.42

Overall GM = antilog of (48.42 / 50) = antilog (0.9684) = 9.3

Geometric mean
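The combined geometric mean in the table above is a size-weighted mean on the log scale. A sketch in Python, assuming base-10 logs; unlike the table, the logs are not rounded to two decimals here, so intermediate figures differ slightly while the final result still rounds to 9.3.

```python
import math

# group: (number of patients n_i, geometric mean), copied from the table above
groups = {"A": (20, 8.5), "B": (18, 10.2), "C": (12, 9.4)}

total_n = sum(n for n, _ in groups.values())

# Sum of n_i * log10(GM_i), using unrounded logs
weighted_log_sum = sum(n * math.log10(gm) for n, gm in groups.values())

# Antilog of the weighted mean of the logs
overall_gm = 10 ** (weighted_log_sum / total_n)

print(total_n, round(weighted_log_sum, 2), round(overall_gm, 1))
```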

Page 26: Where are we? Measure of central tendency FETP India

[Figure: frequency distribution (N on the y-axis, scale 0 to 14) over values 0 to 19, with Mean = 10.8, Median = 10 and Mode = 13.5 marked]

Choosing

Page 27: Where are we? Measure of central tendency FETP India

What measure of location to use?

• Consider the duration (days) of absence from work of 21 labourers owing to sickness
  - 1, 1, 2, 2, 3, 3, 4, 4, 4, 4, 5, 6, 6, 6, 7, 8, 9, 10, 10, 59, 80

• Mean = 11 days
  - Not typical of the series as 19 of the 21 labourers were absent for less than 11 days
  - Distorted by extreme values

• Median = 5 days
  - Better measure

Choosing
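The comparison above can be checked with Python's standard `statistics` module. The variable names are illustrative.

```python
import statistics

# Days of absence for the 21 labourers, from the example above
absence_days = [1, 1, 2, 2, 3, 3, 4, 4, 4, 4, 5, 6, 6, 6,
                7, 8, 9, 10, 10, 59, 80]

mean_days = statistics.mean(absence_days)      # pulled up by 59 and 80
median_days = statistics.median(absence_days)  # the 11th ordered value

# How many labourers were absent for fewer days than the mean?
below_mean = sum(d < mean_days for d in absence_days)

print(len(absence_days), round(mean_days), median_days, below_mean)
```

Only the two extreme values (59 and 80) lie above the mean, which is why the median describes the typical absence better.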

Page 28: Where are we? Measure of central tendency FETP India

Choice of measure of central tendency for symmetric distributions

• Any one of the central/location measures can be used

• The mean has definite advantages if subsequent computations are needed

Choosing

Page 29: Where are we? Measure of central tendency FETP India

Choice of measure of central tendency for asymmetric distributions

• For skewed distributions, the mean is not suitable
  - Positively skewed: the mean is pulled upwards and gives a higher value
  - Negatively skewed: the mean is pulled downwards and gives a lower value

• If some observations deviate much more than others in the series, then the median is the appropriate measure

• If the log-transformed distribution is symmetric, the geometric mean may be used

Choosing

Page 30: Where are we? Measure of central tendency FETP India

Key messages

• The mode is the most common value

• The median is suited to data with extreme values

• The mean is suited to symmetric distributions

• The geometric mean may be useful when log-transformed data are symmetric

• The type of distribution determines the measure of central tendency to use