Distribution Summaries Measures of central tendency Mean Median Mode Measures of spread Standard Deviation Interquartile Range (IQR)


Distribution Summaries

Measures of central tendency: Mean, Median, Mode

Measures of spread: Standard Deviation, Interquartile Range (IQR)


Distribution spread

Range

Standard deviation

Variance


Range

The range of a distribution is the difference between the highest value and the lowest value

[Figure: Length of Cohabitation in Months. x-axis: # Months Cohabited (0 to 100); y-axis: Fraction (0 to .6); observed values run from 0 to 103]


Range (cont.)

. sum cohbl

Variable |     Obs        Mean   Std. Dev.       Min        Max
---------+-----------------------------------------------------
cohblnth |     626    11.74601     17.1347         0        103
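As a minimal sketch, the range can be read straight off the Min and Max columns of the summary above. The helper name `value_range` and the short list passed to it are illustrative assumptions, not part of the original data file.

```python
def value_range(values):
    """Return the range: the highest value minus the lowest value."""
    return max(values) - min(values)


# Min and Max as reported in the summary output above
reported_min, reported_max = 0, 103
cohab_range = reported_max - reported_min
print(cohab_range)  # 103

# The same idea applied to a small made-up list of values (hypothetical data)
print(value_range([0, 5, 17, 32, 103]))  # 103
```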


Range (cont.)

. sum cohbl, d

                      # Months Cohabited
-------------------------------------------------------------
      Percentiles      Smallest
 1%            0              0
 5%            0              0
10%            0              0       Obs                 626
25%            0              0       Sum of Wgt.         626

50%            5                      Mean           11.74601
                        Largest       Std. Dev.       17.1347
75%           17             97
90%           32             97       Variance       293.5978
95%           46            103       Skewness       2.304175
99%           79            103       Kurtosis       9.411293


Range (cont.)

[Figure: # Months Cohabited, axis from 0 to 100, with the largest values (97 and 103) labeled]




Variance

The most commonly used measure of spread

One of the most fundamental concepts in statistics


Variance Formula

In words, the variance is the mean squared deviation (from the mean)

A deviation is the difference between a score and the mean of all scores

We square this deviation for each observation

We then take the mean of these squared deviations


Variance Formula (cont.)

$$\sigma^2 = \frac{\sum_{i=1}^{n} \left(X_i - \bar{X}\right)^2}{n}$$

Definitional Formula


Variance Formula (cont.)

$$\sigma^2 = \frac{\sum_{i=1}^{n} X_i^2 - \dfrac{\left(\sum_{i=1}^{n} X_i\right)^2}{n}}{n}$$

Computational Formula


Variance (example)

        Obs   Square   Dev   Dev Sq
          1        1    -2        4
          2        4    -1        1
          3        9     0        0
          4       16     1        1
          5       25     2        4
Sum      15       55     0       10
Mean      3                       2

Variance = (55 - 225 / 5) / 5 = (55 - 45) / 5 = 2
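The worked example can be checked with a short sketch that applies both the definitional and the computational formula to the scores 1 through 5. The variable names are illustrative. Both formulas divide by n, as in the slides; many packages instead report a sample variance that divides by n - 1 (Stata's summarize output does this), which `statistics.variance` shows for comparison.

```python
import statistics

scores = [1, 2, 3, 4, 5]
n = len(scores)
mean = sum(scores) / n  # 3.0

# Definitional formula: the mean of the squared deviations from the mean
var_definitional = sum((x - mean) ** 2 for x in scores) / n

# Computational formula: (sum of squares - (sum of scores)^2 / n) / n
sum_x = sum(scores)                    # 15
sum_x_sq = sum(x * x for x in scores)  # 55
var_computational = (sum_x_sq - sum_x ** 2 / n) / n

print(var_definitional)   # 2.0
print(var_computational)  # 2.0

# Divide-by-n variance from the standard library agrees with both
pop_var = statistics.pvariance(scores)

# Divide-by-(n - 1) variance, shown only for comparison
sample_var = statistics.variance(scores)  # 10 / 4 = 2.5
```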


Why sum the SQUARES?

Recall that the sum of the deviations around the mean is zero

Therefore the average deviation is zero

Squaring any nonzero number, positive or negative, always gives a positive result (and zero squared is zero)

This way we are assured of a sum that is greater than or equal to zero


Compare

$$\sum_{i=1}^{n} \left(X_i - \bar{X}\right) = 0$$

$$\sum_{i=1}^{n} \left(X_i - \bar{X}\right)^2 = \text{pos. number}$$


Compare (cont.)

        Deviations              Squared Deviations
        10 - 12 = -2            10 - 12 = -2     4
        11 - 12 = -1            11 - 12 = -1     1
        12 - 12 =  0            12 - 12 =  0     0
        13 - 12 =  1            13 - 12 =  1     1
        14 - 12 =  2            14 - 12 =  2     4
Sum     60   60    0            60   60    0    10
Mean    12   12    0            12   12    0     2  (Variance)
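The comparison between plain and squared deviations can be reproduced with the same five scores (10 through 14). This is a small sketch with illustrative variable names: the plain deviations cancel to zero, while the squared deviations sum to 10 and average to the variance of 2.

```python
scores = [10, 11, 12, 13, 14]
n = len(scores)
mean = sum(scores) / n  # 12.0

# Plain deviations: negative and positive values cancel out
deviations = [x - mean for x in scores]
print(deviations)       # [-2.0, -1.0, 0.0, 1.0, 2.0]
print(sum(deviations))  # 0.0

# Squared deviations: every term is zero or positive, so nothing cancels
squared_deviations = [d ** 2 for d in deviations]
print(squared_deviations)       # [4.0, 1.0, 0.0, 1.0, 4.0]
print(sum(squared_deviations))  # 10.0

variance = sum(squared_deviations) / n
print(variance)  # 2.0
```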


Standard Deviation

The second most commonly used measure of spread

The square root of the variance

Which brings us back to the original metric or units of measure

$$\text{Standard Deviation} = \sqrt{\text{Variance}} = \sqrt{\sigma^2}$$
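As a quick illustration, taking the square root of the variance for the scores 1 through 5 (variance 2, dividing by n) gives a standard deviation of about 1.414. The variable names here are illustrative only.

```python
import math
import statistics

scores = [1, 2, 3, 4, 5]
n = len(scores)
mean = sum(scores) / n

# Variance: the mean squared deviation, in squared units
variance = sum((x - mean) ** 2 for x in scores) / n  # 2.0

# Standard deviation: the square root of the variance, back in the original units
std_dev = math.sqrt(variance)
print(round(std_dev, 3))  # 1.414

# The standard library's divide-by-n standard deviation gives the same value
library_std_dev = statistics.pstdev(scores)
```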


What are units?

Consider age

Units are years

Deviations are years

Squared deviations are years squared

Summing and taking the mean leaves squared years

Taking square root yields years again


So we have the sd?

The standard deviation is roughly 1/6 of the range (a rule of thumb for approximately normal data)

For a normal distribution, about 68% of observations are within ± 1 σ of the mean

About 95% are within ± 2 σ of the mean

And about 99.7% are within ± 3 σ of the mean
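The shares of a normal distribution that fall within 1, 2, and 3 standard deviations of the mean can be computed directly. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `share_within` is an illustrative assumption.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def share_within(k):
    """Share of a normal distribution lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


shares = {k: share_within(k) for k in (1, 2, 3)}
for k, share in shares.items():
    print(f"within ±{k} sd: {share:.1%}")
# within ±1 sd: 68.3%
# within ±2 sd: 95.4%
# within ±3 sd: 99.7%
```

Because nearly all of a normal distribution lies within ± 3 σ, the range of roughly normal data spans about 6 σ, which is where the 1/6 rule of thumb comes from. For the cohabitation data, 103 / 6 ≈ 17.2, close to the reported standard deviation of 17.13, although that distribution is strongly skewed, so the agreement should not be over-read.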


Variance (example)

Scores: 1 2 3 4 5

Variance = 2, Std. Dev. = 1.414

[Figure: the scores 1 to 5 with the mean marked]


Variability of the scores

The variability, or spread, of the scores is the second characteristic of a distribution that we need to know

The first was the mean, or central location, of the distribution


The mean and variance are independent

Means can change without affecting the variance (or standard deviation)

Standard deviation (or variance) can change without affecting the mean

Two distributions may differ on means or on standard deviations or both (or neither)
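A small sketch with made-up scores illustrates the point: adding a constant to every score moves the mean but leaves the variance alone, while stretching the scores around their mean changes the variance but leaves the mean alone. The lists `shifted` and `stretched` are hypothetical examples, and the variance divides by n as in the earlier slides.

```python
import statistics

base = [1, 2, 3, 4, 5]  # mean 3, variance 2

# Add 10 to every score: the mean moves, the spread does not
shifted = [x + 10 for x in base]  # [11, 12, 13, 14, 15]

# Double each score's distance from the mean (3): the spread grows, the mean stays
stretched = [(x - 3) * 2 + 3 for x in base]  # [-1, 1, 3, 5, 7]

for name, data in [("base", base), ("shifted", shifted), ("stretched", stretched)]:
    print(name, statistics.mean(data), statistics.pvariance(data))
# base: mean 3, variance 2
# shifted: mean 13, variance 2
# stretched: mean 3, variance 8
```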


What makes scores variable?

Why are some scores high and others low?

Why does the variance change?

. tab sex, sum(income1)

            |        Summary of income1
        sex |        Mean   Std. Dev.       Freq.
------------+------------------------------------
     female |   16.207224    10.82088         263
       male |   22.371972   13.304104         289
------------+------------------------------------
      Total |   19.434783   12.557429         552
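One way to read the table above is to split the total variability in income1 into a part within each sex and a part between the two group means. The sketch below is an illustration, not the slides' own method: it copies the means, standard deviations, and frequencies from the table, and assumes the reported standard deviations use an n - 1 denominator (Stata's default). Under that assumption the group figures reproduce the Total row.

```python
import math

# (mean, std. dev., freq.) copied from the table above
groups = {
    "female": (16.207224, 10.82088, 263),
    "male": (22.371972, 13.304104, 289),
}

n_total = sum(n for _, _, n in groups.values())

# Overall mean: the frequency-weighted average of the group means
total_mean = sum(mean * n for mean, _, n in groups.values()) / n_total

# Sum of squares within groups (assumes each sd was computed with n - 1)
within_ss = sum((n - 1) * sd ** 2 for _, sd, n in groups.values())

# Sum of squares between groups: group means around the overall mean
between_ss = sum(n * (mean - total_mean) ** 2 for mean, _, n in groups.values())

total_sd = math.sqrt((within_ss + between_ss) / (n_total - 1))
between_share = between_ss / (within_ss + between_ss)

print(round(total_mean, 4))     # 19.4348, matching the Total row
print(round(total_sd, 4))       # 12.5574, matching the Total row
print(round(between_share, 2))  # 0.06
```

On these figures, roughly 6% of the total sum of squares lies between the two group means; the rest is variation among people of the same sex.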