McGraw-Hill/Irwin © 2004 by The McGraw-Hill Companies, Inc. All rights reserved. Chapter 10 Describing Data Distributions

Page 1:

Chapter 10

Describing Data Distributions

Page 2:

10-2

Modal and Median Category

Categorical Data Print Output Frequency Table

Occupational Status:

Category           Code   Freq.    Pct.    Adj.    Cum.
Professional          1      37    13.8    14.1    14.1
Mgr., Executive       2      62    23.1    23.6    37.6
Admin., Clerical      3      69    25.7    26.2    63.9
Engr., Technical      4      16     6.0     6.1    70.0
Sales, Marketing      5      30    11.2    11.4    81.4
Craft, Trade          6      22     8.2     8.4    89.7
Semi-Skilled          7      27    10.1    10.3   100.0
Missing Data          0       5     1.9  Missing
Total                       268   100.0   100.0
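
As a rough illustration of the slide title, the modal and median categories can be read off the frequency column. The sketch below is a minimal one: the function names are hypothetical, the missing cases are left out, and the median category is only meaningful if the category order is treated as a ranking, which is an assumption here.

```python
# Valid cases from the Occupational Status table (the 5 missing cases are excluded).
freq = [
    ("Professional", 37),
    ("Mgr., Executive", 62),
    ("Admin., Clerical", 69),
    ("Engr., Technical", 16),
    ("Sales, Marketing", 30),
    ("Craft, Trade", 22),
    ("Semi-Skilled", 27),
]


def modal_category(table):
    """Return the category with the highest frequency."""
    return max(table, key=lambda row: row[1])[0]


def median_category(table):
    """Return the first category whose cumulative count reaches half of the total."""
    total = sum(count for _, count in table)
    running = 0
    for name, count in table:
        running += count
        if running * 2 >= total:
            return name


print("Modal category: ", modal_category(freq))
print("Median category:", median_category(freq))
```

In this table both answers fall on the same row, the one where the Cum. column first passes 50%.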

Page 3:

10-3

Frequency and Percentage Distributions Report Format

Age        Number   Percent
Over 50        94      22.7
36 to 50      188      45.4
18 to 35      132      31.9
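
The Percent column follows directly from the Number column. A minimal sketch, with a hypothetical helper name, that recomputes the percentages from the three counts shown in the report:

```python
# Counts taken from the report-format table.
counts = {"Over 50": 94, "36 to 50": 188, "18 to 35": 132}


def percent_distribution(counts):
    """Convert raw counts to percentages of the total, rounded to one decimal."""
    total = sum(counts.values())
    return {group: round(100 * n / total, 1) for group, n in counts.items()}


pct = percent_distribution(counts)
for group, n in counts.items():
    print(f"{group:<10}{n:>6}{pct[group]:>8.1f}")
```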

Page 4:

10-4

Bar Chart With Frequency Labels

[Figure: horizontal bar chart of Number (axis 0 to 200) by age group, with frequency labels: 18 to 35 = 132, 36 to 50 = 188, Over 50 = 94.]

Page 5:

10-5

Vertical Bar Chart With Percentage Labels

[Figure: vertical bar chart on a 0% to 60% axis, with percentage labels: Over 50 = 22.7%, 36 to 50 = 45.4%, 18 to 35 = 31.9%.]

Page 6:

10-6

Pie Chart With Percentage Labels

[Figure: pie chart with percentage labels: Over 50 = 22.7%, 36 to 50 = 45.4%, 18 to 35 = 31.9%.]

Page 7:

10-7

Descriptive Statistical Tools

Scale              Average   Spread                Shape
Nominal            Mode
Ordinal            Mode      Interquartile Range
                   Median    Data Range
                             Minimum, Maximum
Interval & Ratio   Mean      Standard Deviation    Skewness
                   Mode      Interquartile Range   Kurtosis
                   Median    Data Range
                             Maximum & Minimum

Page 8:

10-8

Choosing an Average

• Mean
  • The sum of the values divided by the number of values
  • Inappropriate for highly skewed distributions
  • Overly sensitive to extreme values

• Median
  • Middle value when arrayed from low to high
  • Unaffected by asymmetry or extreme values

• Mode
  • Peak of a continuous distribution
  • Category with the highest frequency
  • Only legitimate average for nominal data
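
The three averages above can be sketched with Python's standard `statistics` module. The data values are hypothetical and chosen only for illustration; the second list shows the mode applied to nominal labels, where the mean and median are undefined.

```python
from statistics import mean, median, mode

# Hypothetical ratings on a 1-to-5 scale.
ratings = [1, 2, 2, 3, 3, 3, 4, 4, 5]

print("Mean:  ", mean(ratings))    # sum of the values divided by their number
print("Median:", median(ratings))  # middle value when arrayed from low to high
print("Mode:  ", mode(ratings))    # most frequent value

# Hypothetical nominal data: only the mode is a legitimate average here.
colors = ["red", "blue", "red", "green"]
print("Modal color:", mode(colors))
```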

Page 9:

10-9

Measures of Central Tendency

[Figure: a distribution curve with the mode, median, and mean marked.]

Page 10:

10-10

Spread and Standard Deviation

• Standard Deviation
  • Root mean squared deviation from the mean
  • Special properties that make it very useful

• Normal Distributions
  • About 68% of data are within ± 1 S.D. of the mean
  • About 95% of data are within ± 2 S.D. of the mean
  • About 99.7% of data are within ± 3 S.D. of the mean
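
Both points can be checked numerically. The sketch below assumes the population form of the standard deviation (dividing by n), which matches the "root mean squared deviation" wording, and uses the error function to get the normal-curve shares; the function names and sample values are hypothetical.

```python
import math


def std_dev(values):
    """Root mean squared deviation from the mean (population form, divides by n)."""
    m = sum(values) / len(values)
    return math.sqrt(sum((x - m) ** 2 for x in values) / len(values))


def share_within(k):
    """Share of a normal distribution within +/- k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values
print("S.D.:", std_dev(sample))

for k in (1, 2, 3):
    print(f"within +/- {k} S.D.: {100 * share_within(k):.1f}%")
```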

Page 11:

10-11

Spread and Standard Deviation

[Figure: normal curve centered on the mean, with nested bands: about 68% within ± 1 S.D., about 95% within ± 2 S.D., and about 99.7% within ± 3 S.D.]

Page 12:

10-12

Zero Skewness Indicates Symmetry

[Figure: symmetric curve with the mean, median, and mode marked.]

Page 13:

10-13

Positive Skewness Leans Left

[Figure: curve with its peak toward the left and a longer right tail, with the mode, median, and mean marked.]

Page 14:

10-14

Negative Skewness Leans Right

[Figure: curve with its peak toward the right and a longer left tail, with the mode, median, and mean marked.]

Page 15:

10-15

Zero Kurtosis Indicates Normality

[Figure: normal curve with the mean, median, and mode marked.]

Page 16:

10-16

Negative Kurtosis: A Low Peak and High Tails

[Figure: flattened curve with the mean, median, and mode marked.]

Page 17:

10-17

Positive Kurtosis: A High Peak and Low Tails

[Figure: sharply peaked curve with the mean, median, and mode marked.]

Page 18:

10-18

Mean, Median, and Mode

Statistics

Mean     3.800    Skewness   -0.887
Median   4.000    Kurtosis    0.092
Mode     4.000    Std. dev.   1.128
Number   100      Std. err.   0.113

Frequency and Percentage Table

Code    Freq.    Pct.    Adj.    Cum.
1           5     5.0     5.0     5.0
2          10    10.0    10.0    15.0
3          15    15.0    15.0    30.0
4          40    40.0    40.0    70.0
5          30    30.0    30.0   100.0
Total     100   100.0   100.0

[Figure: Bar Chart of codes 1 to 5 on a 0% to 50% axis, with percentage labels 26%, 42%, 16%, 11%, and 5%.]

[Figure: Line Plot of codes 1 to 5 on a 0 to 50 axis.]
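
The printed statistics can be recomputed from the frequency table. The slide does not say which formulas the statistics package used, so the sketch below assumes the common sample-size-adjusted forms (an n - 1 standard deviation, adjusted skewness, and adjusted excess kurtosis); under that assumption the results round to the figures shown. Variable names are hypothetical.

```python
import math
from statistics import median

# Code -> frequency, from the Frequency and Percentage Table.
freq = {1: 5, 2: 10, 3: 15, 4: 40, 5: 30}

n = sum(freq.values())
mean = sum(x * f for x, f in freq.items()) / n
values = [x for x, f in freq.items() for _ in range(f)]
med = median(values)
mode = max(freq, key=freq.get)


def central_moment(k):
    """k-th central moment of the tabulated data (divides by n)."""
    return sum(f * (x - mean) ** k for x, f in freq.items()) / n


m2, m3, m4 = central_moment(2), central_moment(3), central_moment(4)

# Assumed formulas: sample standard deviation and adjusted shape statistics.
std_dev = math.sqrt(m2 * n / (n - 1))
std_err = std_dev / math.sqrt(n)
skewness = (m3 / m2 ** 1.5) * math.sqrt(n * (n - 1)) / (n - 2)
excess = m4 / m2 ** 2 - 3
kurtosis = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * excess + 6)

print(f"Mean {mean:.3f}  Median {med:.3f}  Mode {mode}")
print(f"Std. dev. {std_dev:.3f}  Std. err. {std_err:.3f}")
print(f"Skewness {skewness:.3f}  Kurtosis {kurtosis:.3f}")
```

The negative skewness agrees with the table: most responses sit at codes 4 and 5, with a thinner tail toward code 1.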

Page 19:

10-19

Averages and Outliers

Mean 5.66

Median 4

Mode 4

[Figure: bar chart on a 0 to 30 axis with bars labeled One, Two, Three, Four, Five, Six, and Fifty.]

• At a glance, this bar chart appears to show a symmetrical distribution. In fact, there is radical asymmetry resulting from 5 outliers with values of 50.
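
The effect is easy to reproduce. The slide does not give the underlying frequencies, so the counts below are hypothetical: a symmetric-looking set of bars for the values 1 to 6 plus five cases at 50. They are not expected to match the slide's mean of 5.66, only to show the same pattern.

```python
from statistics import mean, median, mode

# Hypothetical value -> frequency counts; the five cases at 50 are the outliers.
counts = {1: 5, 2: 10, 3: 20, 4: 30, 5: 20, 6: 10, 50: 5}
values = [v for v, c in counts.items() for _ in range(c)]
trimmed = [v for v in values if v != 50]

# The mean is pulled toward the outliers; the median and mode stay at 4.
print("With outliers:   ", mean(values), median(values), mode(values))
print("Without outliers:", round(mean(trimmed), 2), median(trimmed), mode(trimmed))
```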

Page 20:

10-20

Outliers Correctly Shown

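[Figure: frequency chart with a 0 to 30 vertical axis and a continuous 1 to 51 horizontal value axis, so the cases at 50 sit far to the right of the main cluster.]

• This clearer representation of the distribution makes the radical asymmetry obvious.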

Page 21:

10-21

Normal Amount of Data to the Left and Right of the Mean

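[Figure: normal curve divided into segments around the mean: 34% in the segment nearest the mean on each side, 13.5% in the next segment on each side, and 2.5% in each tail.]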

Standard Normal Distribution

Page 22:

10-22

More Data to the Left than to the Right of the Mean

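[Figure: positively skewed curve divided into segments around the mean, with segment labels 7.5%, 9.5%, 0.3%, 0.0%, 47%, and 33%.]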

Positively Skewed Distribution

Page 23:

10-23

More Toward the Center than in the Tails of the Distribution

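[Figure: peaked curve divided into segments around the mean: 40.5% in the segment nearest the mean on each side, 8.0% in the next segment on each side, and 1.5% in each tail.]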

Distribution with Positive Kurtosis

Page 24:

10-24

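More Toward the Tails than in the Center of the Distribution

[Figure: flattened curve divided into segments around the mean: 29% in the segment nearest the mean on each side, 17% in the next segment on each side, and 4% in each tail.]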

Distribution with Negative Kurtosis

Page 25:

10-25

Statistical Inference and Confidence Intervals

• Objective
  • To make inferences about the population based on the sample

• Sample Statistics
  • Used as estimates of the population parameters

• Estimates Are Imperfect
  • The probability of error can be determined

• Confidence Interval
  • The range around the sample mean within which the population parameter is likely to lie at a given probability

Page 26:

10-26

Statistical Inference and Confidence Intervals

• Sampling Distribution of Means
  • The distribution that would result if samples of a given size were taken again and again and the mean of each sample were plotted

• Standard Error of the Estimate
  • The standard deviation of the theoretical sampling distribution of means

• Confidence Interval Probabilities
  • About 68% chance the parameter is within ± 1 S.E.
  • About 95% chance the parameter is within ± 2 S.E.
  • About 99.7% chance the parameter is within ± 3 S.E.
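
The idea of a sampling distribution can be sketched by simulation. Everything in the block is an assumption made for illustration: a normal population with mean 50 and standard deviation 10, samples of 25, and 2,000 repetitions. The spread of the simulated sample means should land close to the population standard deviation divided by the square root of the sample size.

```python
import random
import statistics

random.seed(0)  # fixed seed so repeated runs give the same numbers

# Hypothetical population and sampling plan.
POP_MEAN, POP_SD = 50.0, 10.0
SAMPLE_SIZE, N_SAMPLES = 25, 2000

# Draw many samples of the same size and record the mean of each one.
sample_means = [
    statistics.fmean([random.gauss(POP_MEAN, POP_SD) for _ in range(SAMPLE_SIZE)])
    for _ in range(N_SAMPLES)
]

simulated_se = statistics.stdev(sample_means)   # spread of the sample means
theoretical_se = POP_SD / SAMPLE_SIZE ** 0.5    # sigma / sqrt(n)

print(f"Mean of sample means: {statistics.fmean(sample_means):.2f}")
print(f"Simulated S.E.:       {simulated_se:.2f}")
print(f"Theoretical S.E.:     {theoretical_se:.2f}")
```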

Page 27:

10-27

Confidence Interval Diagram

• Mean = 50
• Standard Error = 5

[Figure: sampling distribution on an axis from 20 to 80, with bands marked "95% C.I." and "99% C.I." around the mean.]
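
With the slide's values, the interval limits are the mean plus or minus a multiple of the standard error. A minimal sketch, assuming the ± 2 S.E. and ± 3 S.E. rules of thumb from the preceding slide; the function name is hypothetical.

```python
def confidence_interval(mean, std_err, n_se):
    """Return (lower, upper) limits at mean +/- n_se standard errors."""
    half_width = n_se * std_err
    return (mean - half_width, mean + half_width)


# Values from the diagram: Mean = 50, Standard Error = 5.
print("+/- 2 S.E.:", confidence_interval(50, 5, 2))
print("+/- 3 S.E.:", confidence_interval(50, 5, 3))
```

The ± 2 S.E. limits correspond to the band the diagram labels 95%, and the ± 3 S.E. limits to the wider band it labels 99%.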

Page 28:

End of Chapter 10