Part II Σigma Freud & Descriptive Statistics Chapter 3 Viva La Difference: Understanding Variability


What you will learn in Chapter 3

• Variability is valuable as a descriptive tool

• Difference between variance & standard deviation

• How to compute:

    • Range

    • Inter-quartile Range

    • Standard Deviation

    • Variance


Why Variability is Important

Variability

• How different scores are from one particular score

• Spread

• Dispersion

What is the “score” of interest here?

• Ah ha!! It’s the MEAN!!

So…variability is really a measure of how each score in a group of scores differs from the mean of that set of scores.


Measures of Variability

Four measures of variability examine the amount of spread or dispersion in a group of scores…

• Range

• Inter-quartile Range

• Standard Deviation

• Variance

Typically, the average and the variability are reported together to describe a distribution.


Computing the Range

Range is the most “general” estimate of variability…

Two types…

• Exclusive Range: R = h – l

• Inclusive Range: R = h – l + 1

(Note: R is the range, h is the highest score, l is the lowest score)


Measures of variation: Range

• The difference between the highest and lowest numbers in a set of numbers.

2, 35, 77, 93, 120, 540

540 – 2 = 538
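The two range formulas can be sketched in a few lines of Python. This is only an illustration; the function names `exclusive_range` and `inclusive_range` are made up for this example and do not come from the slides.

```python
def exclusive_range(scores):
    """Exclusive range: R = h - l (highest score minus lowest score)."""
    return max(scores) - min(scores)


def inclusive_range(scores):
    """Inclusive range: R = h - l + 1."""
    return max(scores) - min(scores) + 1


scores = [2, 35, 77, 93, 120, 540]
print(exclusive_range(scores))  # 538, matching 540 - 2 on the slide
print(inclusive_range(scores))  # 539
```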


Measures of variation: Range

What is the range of:

2, 3, 3, 3, 4, 5, 6, 6, 7, 9, 11, 13, 15, 15, 15, 16

24, 57, 81, 96, 107, 152, 179, 211

1001, 1467, 1479, 1680, 1134


Interquartile range

Difference between upper (third) and lower (first) quartiles

Quartiles divide data into four equal groups

• Lower (first) quartile is 25th percentile

• Middle (second) quartile is 50th percentile and is the median

• Upper (third) quartile is 75th percentile


Calculating the interquartile range for high temperatures

Date      High Temperature
7-Jan     32
8-Jan     32
6-Jan     35   <=== Bottom Half Middle Value = First Quartile = 35
10-Jan    41
5-Jan     42   <=== Middle Value
4-Jan     43   <=== Middle Value
9-Jan     46
11-Jan    52   <=== Top Half Middle Value = Third Quartile = 52
2-Jan     59
3-Jan     60

Median = Second Quartile = 42.5

interquartile range = 52 – 35 = 17
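The quartile lookup for the ten high temperatures can be sketched in Python as follows. The sketch assumes the position rule used in this chapter (multiply n by .25 or .75 and round a fractional position up to the next data point). Using a whole-number position as-is is an assumption of this example, and the helper name `quartile` is hypothetical. Statistical software often interpolates between data points instead, so its quartiles may differ slightly.

```python
import math


def quartile(scores, fraction):
    """Return the score at position fraction * n, counted from the lowest value.

    A fractional position is rounded up to the next data point.
    """
    ordered = sorted(scores)
    position = math.ceil(fraction * len(ordered))  # 1-based position
    return ordered[position - 1]


highs = [32, 32, 35, 41, 42, 43, 46, 52, 59, 60]
q1 = quartile(highs, 0.25)  # position 2.5 rounds up to the 3rd value
q3 = quartile(highs, 0.75)  # position 7.5 rounds up to the 8th value
iqr = q3 - q1
print(q1, q3, iqr)  # 35 52 17
```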


Stem and Leaf 0730 Q1 Fall 2010 (N=22)

2|349
3|03344555666677779
4|01

Q1 = .25(22) = 5.5th data point; round up to the 6th data point = value of 33

Q2 = (n+1)/2 = 23/2 = 11.5 = average of the 11th and 12th data points = 35.5

Q3 = .75(22) = 16.5th data point; round up to the 17th data point = value of 37


Interquartile range and outliers

A value can be considered an outlier if it falls more than 1.5 times the interquartile range above the upper quartile or more than 1.5 times the interquartile range below the lower quartile

Example for high temperatures

• Interquartile range is 17

• 1.5 times interquartile range is 25.5

• Outliers would be values

    • Above 52 + 25.5 = 77.5 (none)

    • Below 35 – 25.5 = 9.5 (none)


Review: Steps to Quartiles, Interquartile Range, and Checking for Outliers

1) Put values in order (the positions in steps 2 and 3 are counted from the lowest value)

2) Multiply .25 (n) to find the position of Q1

3) Multiply .75 (n) to find the position of Q3

4) Q3 - Q1 = IQR

5) Q1 – 1.5 (IQR) = lower cutoff; any value below it is an outlier

6) Q3 + 1.5 (IQR) = upper cutoff; any value above it is an outlier
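The six steps can be strung together in one short Python function. This is a sketch under the same assumptions as the slides' examples: positions are counted from the lowest value and a fractional position is rounded up. The function name `find_outliers` is hypothetical, and treating a whole-number position as-is is an assumption of this example.

```python
import math


def find_outliers(scores):
    """Follow the six review steps and return the results as a tuple."""
    ordered = sorted(scores)                   # step 1: ascending order
    n = len(ordered)
    q1 = ordered[math.ceil(0.25 * n) - 1]      # step 2: position .25(n), rounded up
    q3 = ordered[math.ceil(0.75 * n) - 1]      # step 3: position .75(n), rounded up
    iqr = q3 - q1                              # step 4
    lower_cutoff = q1 - 1.5 * iqr              # step 5
    upper_cutoff = q3 + 1.5 * iqr              # step 6
    outliers = [x for x in ordered if x < lower_cutoff or x > upper_cutoff]
    return q1, q3, iqr, lower_cutoff, upper_cutoff, outliers


highs = [32, 32, 35, 41, 42, 43, 46, 52, 59, 60]
print(find_outliers(highs))  # (35, 52, 17, 9.5, 77.5, [])
```

For the high temperatures, the cutoffs come out to 9.5 and 77.5 and the list of outliers is empty, which agrees with the worked example.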


Let’s practice Finding Outliers

What is the median, Q1, Q3, range, and IQR for the following? Then check for outliers.

10, 25, 35, 65, 100, 255, 350, 395 (n=8)

10, 65, 75, 99, 299 (n=5)

5, 39, 45, 59, 64, 74 (n=6)


Computing Standard Deviation

Standard Deviation (SD) is the most frequently reported measure of variability

SD = average amount of variability in a set of scores

What do these symbols represent?
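The symbols presumably belong to the sample standard deviation formula. Assuming the standard form, which matches the n – 1 divisor used in this chapter's worked examples, it can be written as:

```latex
s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}
```

Here s is the standard deviation, Σ means "sum of," X is each individual score, X̄ is the mean of all the scores, and n is the sample size.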


Why n – 1?

The sample standard deviation is intended to be an estimate of the POPULATION standard deviation…

• We want it to be an “unbiased estimate”

• Subtracting 1 from n makes the denominator smaller, which makes the SD larger

In other words…we want to be “conservative” in our estimate of the population
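A quick way to see the effect of n – 1 is to compute the standard deviation both ways. The sketch below uses Python's standard `statistics` module, where `pstdev` divides by n and `stdev` divides by n – 1. The five scores are borrowed from a worked example in this chapter; the variable names are illustrative.

```python
import statistics

scores = [20, 31, 50, 69, 80]

sd_n = statistics.pstdev(scores)         # divides by n
sd_n_minus_1 = statistics.stdev(scores)  # divides by n - 1

print(round(sd_n, 2))          # 22.46
print(round(sd_n_minus_1, 2))  # 25.11, the larger (more conservative) value
```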


Things to Remember…

Standard deviation is computed as the average distance from the mean

The larger the standard deviation the greater the variability

Like the mean…standard deviation is sensitive to extreme scores

If s = 0, then there is no variability among scores…they must all be the same value.


Computing Variance

Variance = standard deviation squared

So…what do these symbols represent? Does the formula look familiar?
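Since variance is the standard deviation squared, the formula in question is presumably the sample standard deviation formula without the square root. Assuming the same n – 1 divisor used in this chapter's worked examples:

```latex
s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}
```

The symbols keep their meanings: X is each score, X̄ is the mean, and n is the sample size.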


Standard Deviation or Variance

While the formulas are quite similar…the two are also quite different.

• Standard deviation is stated in original units

• Variance is stated in units that are squared

• Which do you think is easier to interpret???


Same mean, different standard deviation

Sample variance and sample standard deviation: {20, 31, 50, 69, 80}

Each number (x)    Mean    Distance from Mean
20                 50      -30
31                 50      -19
50                 50      0
69                 50      19
80                 50      30


Then square each distance from mean and add together…

(-30)² + (-19)² + (0)² + (19)² + (30)²

900 + 361 + 0 + 361 + 900 = 2522

Divide by N-1 (N=5): 2522/4 = 630.5 = Sample Variance

To find sample standard deviation, take square root of variance = 25.11


Same mean, different standard deviation: {39, 44, 50, 56, 61}

Each number (x)    Mean    Distance from Mean
39                 50      -11
44                 50      -6
50                 50      0
56                 50      6
61                 50      11


Which data set has more variability?

(-11)² + (-6)² + (0)² + (6)² + (11)²

121 + 36 + 0 + 36 + 121 = 314

Divide by N-1 gives us sample variance: 314/4 = 78.5

Square root of 78.5 gives us sample standard deviation = 8.86
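The hand calculation for both data sets can be checked with a short Python sketch that follows the same steps: distance from the mean, square, add, divide by N – 1, take the square root. The function names `sample_variance` and `sample_sd` are illustrative only.

```python
import math


def sample_variance(scores):
    """Sum of squared distances from the mean, divided by N - 1."""
    mean = sum(scores) / len(scores)
    squared_distances = [(x - mean) ** 2 for x in scores]
    return sum(squared_distances) / (len(scores) - 1)


def sample_sd(scores):
    """Square root of the sample variance."""
    return math.sqrt(sample_variance(scores))


wide = [20, 31, 50, 69, 80]
narrow = [39, 44, 50, 56, 61]

print(sample_variance(wide), round(sample_sd(wide), 2))      # 630.5 25.11
print(sample_variance(narrow), round(sample_sd(narrow), 2))  # 78.5 8.86
```

Both sets have a mean of 50, but the first set's standard deviation is almost three times larger, so it has more variability.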


Measures of variation: Standard deviation

How about a more user-friendly equation?

$$S = \sqrt{\frac{\sum x^2 - \frac{\left(\sum x\right)^2}{N}}{N - 1}}$$
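Assuming the user-friendly equation is the usual computational form, S = sqrt((Σx² – (Σx)²/N) / (N – 1)), it needs only the sum of the scores and the sum of the squared scores, with no distances from the mean. The Python sketch below applies that form to the two data sets from the worked examples in this chapter; the function name `sd_computational` is hypothetical.

```python
import math


def sd_computational(scores):
    """Sample SD from the sum of x and the sum of x squared."""
    n = len(scores)
    sum_x = sum(scores)
    sum_x_squared = sum(x * x for x in scores)
    return math.sqrt((sum_x_squared - sum_x ** 2 / n) / (n - 1))


print(round(sd_computational([20, 31, 50, 69, 80]), 2))  # 25.11
print(round(sd_computational([39, 44, 50, 56, 61]), 2))  # 8.86
```

Both results agree with the values obtained by squaring each distance from the mean.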


Using Excel’s VAR Function


Using the Computer to Compute Measures of Variability


Glossary Terms to Know

• Variability

• Range

• Standard deviation

• Mean deviation

• Unbiased estimate

• Variance