
LECTURE CENTRAL TENDENCIES & DISPERSION POSTGRADUATE METHODOLOGY COURSE



Measures of Central Tendency

Summarize the entire data set into a single value (measurement).

Measures of central tendency include:

– Mode
– Median
– Mean
– Trimmed Mean
– Skewness


Mode

The measurement that occurs most often (with the highest frequency).

Commonly used as a measure of popularity.

There can be more than 1 mode.

Not influenced by extreme measurements.

Applicable to both qualitative and quantitative data.
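The properties of the mode listed above can be checked with a minimal sketch using Python's standard `statistics` module. The two data sets (`colours` and `scores`) are hypothetical and chosen only to show a single mode on qualitative data and two modes on quantitative data.

```python
from statistics import multimode

# Hypothetical data, invented for illustration only.
colours = ["red", "blue", "blue", "green", "red", "blue"]  # qualitative
scores = [2, 3, 3, 5, 5, 9]                                # quantitative

# multimode returns every value that shares the highest frequency,
# in the order each value first appears in the data.
colour_modes = multimode(colours)
score_modes = multimode(scores)

print("Mode(s) of colours:", colour_modes)
print("Mode(s) of scores:", score_modes)
```

`multimode` is used rather than `mode` because it reports all tied values, which matches the point that a data set can have more than one mode.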


Median

The middle value when the measurements are arranged from lowest to highest.

50% of the measurements lie above it and 50% fall below it.

Often used to measure the midpoint of a large set of measurements.

There is only 1 median.

Not influenced by extreme measurements.

Applicable to quantitative data only.
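As a small sketch of the points above, the median can be computed with `statistics.median`. The lists below are hypothetical. With an even number of measurements there is no single middle value, so the function averages the two middle ones; that is one common convention and is assumed here.

```python
from statistics import median

# Hypothetical measurements, invented for illustration only.
odd_count = [7, 1, 5, 3, 9]        # sorted: 1 3 5 7 9
even_count = [7, 1, 5, 3]          # sorted: 1 3 5 7
with_outlier = [7, 1, 5, 3, 900]   # 9 replaced by an extreme value

median_odd = median(odd_count)          # middle value of the sorted list
median_even = median(even_count)        # average of the two middle values
median_outlier = median(with_outlier)   # unchanged by the extreme value

print(median_odd, median_even, median_outlier)
```

Replacing 9 with 900 leaves the median where it was, which illustrates why the median is described as not influenced by extreme measurements.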


Mean

The sum of the measurements divided by the total number of measurements, better known as the average.

There is only 1 mean.

Its value is influenced by extreme measurements.

Applicable to quantitative data only.
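The definition above translates directly into code. The following sketch uses hypothetical data and a helper named `arithmetic_mean` (a name introduced here for illustration) to show how a single extreme measurement pulls the mean while leaving the median almost untouched.

```python
from statistics import median

def arithmetic_mean(values):
    """Sum of the measurements divided by the number of measurements."""
    return sum(values) / len(values)

# Hypothetical measurements, invented for illustration only.
data = [2, 4, 6, 8]
data_with_outlier = [2, 4, 6, 8, 80]

mean_plain = arithmetic_mean(data)
mean_outlier = arithmetic_mean(data_with_outlier)
median_outlier = median(data_with_outlier)

print("Mean without outlier:", mean_plain)
print("Mean with outlier:", mean_outlier)
print("Median with outlier:", median_outlier)
```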


Trimmed Mean

The mean is influenced by extreme values (outliers).

To reduce the effect of outliers, which distort the mean value, a variation of the mean is introduced.

The trimmed mean drops the highest and lowest extreme values and averages the rest.
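One way to sketch the trimmed mean is shown below. The function name `trimmed_mean` and its parameter `k` (how many values to drop from each end) are assumptions made for this illustration; the lecture does not say how many values to drop, and other texts trim a fixed percentage instead.

```python
def trimmed_mean(values, k=1):
    """Drop the k lowest and k highest values, then average the rest.

    k is an illustrative parameter, not something fixed by the lecture.
    """
    if k < 0 or 2 * k >= len(values):
        raise ValueError("k must leave at least one value to average")
    ordered = sorted(values)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

# Hypothetical measurements with one extreme value.
data = [2, 4, 6, 8, 80]

ordinary = sum(data) / len(data)
trimmed = trimmed_mean(data, k=1)   # averages 4, 6 and 8

print("Ordinary mean:", ordinary)
print("Trimmed mean:", trimmed)
```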


Skewness

The relationship between the mode, median, mean and trimmed mean is reflected in the skewness of the data.

Skewness measures how asymmetrically the data is distributed.

Zero skewness
– symmetrical (Mode = Median = Mean)

Positive skewness
– skewed to the right (Mode < Median < Mean)

Negative skewness
– skewed to the left (Mode > Median > Mean)
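The three orderings above can be illustrated with small samples. The data below are hypothetical and were built so that the orderings hold exactly; with real data the ordering is a rule of thumb and can fail, for example when a distribution has several modes.

```python
from statistics import mean, median, mode

def centre_summary(values):
    """Return (mode, median, mean) for a list of measurements."""
    return mode(values), median(values), mean(values)

# Hypothetical samples, constructed for illustration only.
symmetrical = [1, 2, 3, 3, 3, 4, 5]
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]
left_skewed = [15 - y for y in right_skewed]   # mirror image of right_skewed

sym_mode, sym_median, sym_mean = centre_summary(symmetrical)
pos_mode, pos_median, pos_mean = centre_summary(right_skewed)
neg_mode, neg_median, neg_mean = centre_summary(left_skewed)

print("Symmetrical:", sym_mode, sym_median, sym_mean)
print("Right skewed:", pos_mode, pos_median, pos_mean)
print("Left skewed:", neg_mode, neg_median, neg_mean)
```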


Skewness

[Figure: three histograms showing a negatively (left) skewed, a symmetrical, and a positively (right) skewed distribution.]


Measures of Variability / Dispersion

It is not sufficient to describe a data set using only measures of central tendency.

We also need to determine how dispersed (spread out) the data is.

Measures of variability / spread include:

– Range
– Percentile / Quartile
– Deviation / Standard Deviation
– Variance
– Coefficient of variation


Range

The difference between the largest and the smallest measurement of the set.

It is easy to compute but very sensitive to outliers.

It does not give much information about the pattern of variability.
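A minimal sketch of the range, using hypothetical data and a helper named `data_range` (introduced here for illustration), shows how one outlier dominates the result.

```python
def data_range(values):
    """Largest measurement minus smallest measurement."""
    return max(values) - min(values)

# Hypothetical measurements, invented for illustration only.
data = [12, 15, 11, 14, 13]
data_with_outlier = data + [95]

range_plain = data_range(data)
range_outlier = data_range(data_with_outlier)

print("Range without outlier:", range_plain)
print("Range with outlier:", range_outlier)
```

Because only the two extreme values enter the calculation, two data sets with very different patterns of variability can share the same range.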


Percentile / Quartile

The pth percentile of a set of n measurements arranged in order of magnitude is the value that has at most p% of the measurements below it and at most (100 – p)% above it.

Example: the 60th percentile has 60% of the data below it and 40% above it.

The percentiles of interest are the 25th, 50th and 75th percentiles, often called the lower quartile, median, and upper quartile.

Interquartile range – the difference between the upper and lower quartiles.
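Quartiles and the interquartile range can be sketched with `statistics.quantiles`. Several conventions exist for interpolating percentiles, and they can give slightly different values on the same data. The `method="inclusive"` setting used below is an assumption for this example, not the lecture's stated convention, and the data are hypothetical.

```python
from statistics import quantiles

# Hypothetical measurements, already in order of magnitude.
data = [10, 20, 30, 40, 50, 60, 70, 80, 90]

# n=4 splits the data into four parts and returns the three cut points:
# lower quartile (25th), median (50th) and upper quartile (75th percentile).
lower_q, median_q, upper_q = quantiles(data, n=4, method="inclusive")
iqr = upper_q - lower_q

# n=100 returns the 1st to 99th percentiles; index 59 is the 60th.
p60 = quantiles(data, n=100, method="inclusive")[59]

print("Quartiles:", lower_q, median_q, upper_q)
print("Interquartile range:", iqr)
print("60th percentile:", p60)
```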


Variance and Standard Deviation

The variance of a set of n measurements y1, y2, …, yn with mean ȳ is the sum of the squared deviations from the mean divided by n – 1.

The standard deviation of a set of measurements is defined to be the positive square root of the variance.

Both measure how spread out the data is from the mean.
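The definition above can be worked through step by step. The sketch below uses hypothetical data, computes the variance by hand with the n – 1 divisor, and then compares the result with `statistics.variance` and `statistics.stdev`, which use the same divisor.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Hypothetical measurements, invented for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
y_bar = mean(data)

# Squared deviation of each measurement from the mean.
squared_deviations = [(y - y_bar) ** 2 for y in data]

# Variance: sum of squared deviations divided by n - 1.
manual_variance = sum(squared_deviations) / (n - 1)

# Standard deviation: positive square root of the variance.
manual_sd = sqrt(manual_variance)

print("Mean:", y_bar)
print("Variance (by hand):", manual_variance)
print("Variance (library):", variance(data))
print("Standard deviation (by hand):", manual_sd)
print("Standard deviation (library):", stdev(data))
```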


Coefficient of Variation

Measures the variability in the values in a population relative to the magnitude of the population mean.

CV = Standard Deviation / |Mean|

The CV is a unit-free number; it is useful when comparing the variation of different sets of data.
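Because the CV has no units, it allows data measured on different scales to be compared. The sketch below assumes the sample standard deviation (n – 1 divisor) and uses two hypothetical data sets; the function name `coefficient_of_variation` is introduced here for illustration.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation divided by the absolute value of the mean."""
    centre = mean(values)
    if centre == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return stdev(values) / abs(centre)

# Hypothetical measurements in different units.
heights_cm = [160, 165, 170, 175, 180]
weights_kg = [50, 60, 70, 80, 90]

cv_height = coefficient_of_variation(heights_cm)
cv_weight = coefficient_of_variation(weights_kg)

print("CV of heights:", round(cv_height, 4))
print("CV of weights:", round(cv_weight, 4))
```

In this made-up example the weights vary more relative to their mean than the heights do, a comparison that the raw standard deviations (in cm and kg) could not support directly.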


Boxplot

Top line – Maximum

2nd line – Upper Quartile

3rd line – Median

4th line – Lower Quartile

5th line – Minimum
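The five lines of the boxplot correspond to a five-number summary, which can be sketched as follows. The helper `five_number_summary` and the data are hypothetical, and the quartiles again assume the `"inclusive"` interpolation method. Some plotting tools draw the outer lines at the most extreme non-outlying values rather than at the true minimum and maximum, so a drawn boxplot may differ from this summary.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    lower_q, median_q, upper_q = quantiles(values, n=4, method="inclusive")
    return min(values), lower_q, median_q, upper_q, max(values)

# Hypothetical measurements, invented for illustration only.
data = [10, 20, 30, 40, 50, 60, 70, 80, 90]

summary = five_number_summary(data)
labels = ["Minimum", "Lower Quartile", "Median", "Upper Quartile", "Maximum"]

# Print from the top line of the boxplot down to the bottom line.
for label, value in reversed(list(zip(labels, summary))):
    print(f"{label}: {value}")
```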

[Figure: boxplot of Item 5 NF (N = 312), vertical axis from 0 to 8.]