Statistics for Managers Using Microsoft Excel, 3rd Edition, Chapter 5: The Normal Distribution and Sampling Distributions

Page 1: Statistics for Managers Using Microsoft Excel 3 rd Edition Chapter 5 The Normal Distribution and Sampling Distributions

Statistics for Managers Using Microsoft Excel

3rd Edition

Chapter 5: The Normal Distribution and Sampling Distributions

Page 2

Chapter Topics

The normal distribution

The standardized normal distribution

Evaluating the normality assumption

The exponential distribution

Page 3

Chapter Topics (continued)

Introduction to sampling distribution

Sampling distribution of the mean

Sampling distribution of the proportion

Sampling from finite population

Page 4

Continuous Probability Distributions

Continuous random variable: takes values from an interval of numbers, with no gaps

Continuous probability distribution: the distribution of a continuous random variable

Most important continuous probability distribution: the normal distribution

Page 5

The Normal Distribution

“Bell shaped”

Symmetrical

Mean, median and mode are equal

Interquartile range equals 1.33 $\sigma$

Random variable has infinite range

[Figure: bell-shaped curve of f(X) against X, with the mean, median and mode marked at the same point]

Page 6

The Mathematical Model

$$f(X) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{1}{2\sigma^2}(X-\mu)^2}$$

$f(X)$: density of random variable $X$

$\pi = 3.14159$; $e = 2.71828$

$\mu$: population mean

$\sigma$: population standard deviation

$X$: value of random variable ($-\infty < X < \infty$)
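As a rough illustration (not part of the original slides), the density formula can be evaluated directly. The function name `normal_density` and the parameter values $\mu = 5$, $\sigma = 10$ are assumptions chosen for illustration; the standard-library `statistics.NormalDist` is used only as a cross-check.

```python
import math
from statistics import NormalDist


def normal_density(x, mu, sigma):
    """Evaluate f(X) = 1/sqrt(2*pi*sigma^2) * exp(-(X - mu)^2 / (2*sigma^2))."""
    coefficient = 1.0 / math.sqrt(2.0 * math.pi * sigma ** 2)
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)


# Illustrative parameters (assumed, not taken from this slide).
mu, sigma = 5, 10
peak = normal_density(mu, mu, sigma)          # density at the mean
library_peak = NormalDist(mu, sigma).pdf(mu)  # cross-check with the library
print(peak, library_peak)
```

The curve is highest at the mean and symmetric about it, which matches the "bell shaped" description.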

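Page 7

Expectation

$$
\begin{aligned}
E(X) &= \int_{-\infty}^{\infty} x\,\frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2/2\sigma^2}\,dx \\
&= \int_{-\infty}^{\infty} (x-\mu)\,\frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2/2\sigma^2}\,d(x-\mu) + \mu\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2/2\sigma^2}\,dx \\
&= 0 + \mu = \mu
\end{aligned}
$$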

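Page 8

Variance

$$
\begin{aligned}
E\left[(X-\mu)^2\right] &= \int_{-\infty}^{\infty} (x-\mu)^2\,\frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2/2\sigma^2}\,dx \\
&= \sigma^2 \int_{-\infty}^{\infty} y^2\,\frac{1}{\sqrt{2\pi}}\, e^{-y^2/2}\,dy \qquad \left(y = \frac{x-\mu}{\sigma}\right) \\
&= \sigma^2
\end{aligned}
$$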

Page 9

Many Normal Distributions

By varying the parameters $\mu$ and $\sigma$, we obtain different normal distributions

There are an infinite number of normal distributions

Page 10

Finding Probabilities

Probability is the area under the curve!

$P(c \le X \le d) = ?$

[Figure: normal curve f(X) against X with the area between X = c and X = d shaded]

Page 11

Which Table to Use?

An infinite number of normal distributions means an infinite number of tables to look up!

Page 12

Solution: The Cumulative Standardized Normal Distribution

Cumulative Standardized Normal Distribution Table (Portion)

Z      .00     .01     .02
0.0   .5000   .5040   .5080
0.1   .5398   .5438   .5478
0.2   .5793   .5832   .5871
0.3   .6179   .6217   .6255

The table entries are cumulative probabilities. For Z = 0.12 (row 0.1, column .02) the probability is .5478.

Only One Table is Needed

[Figure: standardized normal curve with $\mu_Z = 0$ and $\sigma_Z = 1$; the area to the left of Z = 0.12 is shaded (shaded area exaggerated)]

Page 13

Standardizing Example

$$Z = \frac{X - \mu}{\sigma} = \frac{6.2 - 5}{10} = 0.12$$

[Figure: normal distribution with $\mu = 5$ and $\sigma = 10$, X = 6.2 marked, beside the standardized normal distribution with $\mu_Z = 0$ and $\sigma_Z = 1$, Z = 0.12 marked (shaded area exaggerated)]
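The standardizing step and the table lookup can be reproduced in a few lines. This is a minimal sketch using Python's standard-library `statistics.NormalDist` in place of the printed table (the slides themselves use Excel and PHStat); the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 5, 10   # parameters from the slide
x = 6.2

z = (x - mu) / sigma                    # standardize: Z = (X - mu) / sigma
cumulative = NormalDist().cdf(z)        # area to the left of Z on the standardized curve
direct = NormalDist(mu, sigma).cdf(x)   # same area computed without standardizing

print(round(z, 2), round(cumulative, 4))  # prints 0.12 0.5478
```

Both routes give the same area, which is why a single standardized table is sufficient.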

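Page 14

Example: $P(2.9 \le X \le 7.1) = .1664$

$$Z = \frac{X - \mu}{\sigma} = \frac{2.9 - 5}{10} = -.21 \qquad Z = \frac{X - \mu}{\sigma} = \frac{7.1 - 5}{10} = .21$$

[Figure: normal distribution with $\mu = 5$ and $\sigma = 10$, area between X = 2.9 and X = 7.1 shaded, beside the standardized normal distribution with $\mu_Z = 0$ and $\sigma_Z = 1$, area between Z = -0.21 and Z = 0.21 shaded; each half of the shaded area is .0832 (shaded area exaggerated)]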

Page 15

Example: $P(2.9 \le X \le 7.1) = .1664$ (continued)

Cumulative Standardized Normal Distribution Table (Portion)

Z      .00     .01     .02
0.0   .5000   .5040   .5080
0.1   .5398   .5438   .5478
0.2   .5793   .5832   .5871
0.3   .6179   .6217   .6255

For Z = 0.21 (row 0.2, column .01) the cumulative probability is .5832.

[Figure: standardized normal curve with $\mu_Z = 0$ and $\sigma_Z = 1$; the area to the left of Z = 0.21 is shaded (shaded area exaggerated)]

Page 16

Example: $P(2.9 \le X \le 7.1) = .1664$ (continued)

Cumulative Standardized Normal Distribution Table (Portion)

 Z      .00     .01     .02
-0.3   .3821   .3783   .3745
-0.2   .4207   .4168   .4129
-0.1   .4602   .4562   .4522
 0.0   .5000   .4960   .4920

For Z = -0.21 (row -0.2, column .01) the cumulative probability is .4168, so $P(2.9 \le X \le 7.1) = .5832 - .4168 = .1664$.

[Figure: standardized normal curve with $\mu_Z = 0$ and $\sigma_Z = 1$; the area to the left of Z = -0.21 is shaded (shaded area exaggerated)]
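The two table lookups in this example can be checked numerically. The sketch below again assumes Python's `statistics.NormalDist` as a stand-in for the table; `table_style` mimics the four-decimal table values, while `exact` skips the rounding, so the two can differ slightly in the fourth decimal place.

```python
from statistics import NormalDist

mu, sigma = 5, 10
lower, upper = 2.9, 7.1

z_lower = (lower - mu) / sigma   # about -0.21
z_upper = (upper - mu) / sigma   # about  0.21

standard_normal = NormalDist()
# Mimic the printed table: round each cumulative probability to four decimals.
table_style = round(standard_normal.cdf(z_upper), 4) - round(standard_normal.cdf(z_lower), 4)
# Unrounded answer computed directly on the original scale.
exact = NormalDist(mu, sigma).cdf(upper) - NormalDist(mu, sigma).cdf(lower)

print(round(table_style, 4), exact)
```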

Page 17

Normal Distribution in PHStat

PHStat | Probability & Prob. Distributions | Normal …

Example in Excel spreadsheet

Page 18

Example: $P(X \ge 8) = .3821$

$$Z = \frac{X - \mu}{\sigma} = \frac{8 - 5}{10} = .30$$

[Figure: normal distribution with $\mu = 5$ and $\sigma = 10$, area to the right of X = 8 shaded, beside the standardized normal distribution with $\mu_Z = 0$ and $\sigma_Z = 1$, area to the right of Z = 0.30 shaded; the shaded area is .3821 (shaded area exaggerated)]

Page 19

Example: $P(X \ge 8) = .3821$ (continued)

Cumulative Standardized Normal Distribution Table (Portion)

Z      .00     .01     .02
0.0   .5000   .5040   .5080
0.1   .5398   .5438   .5478
0.2   .5793   .5832   .5871
0.3   .6179   .6217   .6255

For Z = 0.30 (row 0.3, column .00) the cumulative probability is .6179, so $P(X \ge 8) = 1 - .6179 = .3821$.

[Figure: standardized normal curve with $\mu_Z = 0$ and $\sigma_Z = 1$; the area to the left of Z = 0.30 is shaded (shaded area exaggerated)]
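An upper-tail probability is one minus the cumulative probability. A minimal sketch of this example, assuming Python's `statistics.NormalDist` in place of the table:

```python
from statistics import NormalDist

mu, sigma, x = 5, 10, 8

z = (x - mu) / sigma               # 0.30
cumulative = NormalDist().cdf(z)   # P(Z <= 0.30), the table entry
upper_tail = 1 - cumulative        # P(X >= 8) = 1 - P(X <= 8)

print(round(cumulative, 4), round(upper_tail, 4))  # prints 0.6179 0.3821
```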

Page 20

Finding Z Values for Known Probabilities

What is Z given probability = 0.1217 (the area between 0 and Z)?

The cumulative probability is .5000 + .1217 = .6217, which the table gives at Z = .31.

Cumulative Standardized Normal Distribution Table (Portion)

Z      .00     .01     .02
0.0   .5000   .5040   .5080
0.1   .5398   .5438   .5478
0.2   .5793   .5832   .5871
0.3   .6179   .6217   .6255

[Figure: standardized normal curve with $\mu_Z = 0$ and $\sigma_Z = 1$; the area to the left of Z = .31 is .6217 (shaded area exaggerated)]

Page 21

Recovering X Values for Known Probabilities

$$X = \mu + Z\sigma = 5 + (.30)(10) = 8$$

[Figure: normal distribution with $\mu = 5$ and $\sigma = 10$ and the unknown X marked "?", beside the standardized normal distribution with $\mu_Z = 0$ and $\sigma_Z = 1$ and Z = 0.30 marked; the area between the mean and X is .1179 and the upper-tail area is .3821]
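Reading the table "backwards" corresponds to an inverse cumulative distribution function. The sketch below assumes Python's `statistics.NormalDist.inv_cdf` as a stand-in for that reverse lookup; the probabilities .6217 and .6179 are the table values used on these slides.

```python
from statistics import NormalDist

mu, sigma = 5, 10

# Reverse lookup on the standardized table: which Z has cumulative area .6217?
z = NormalDist().inv_cdf(0.6217)

# Recover X from a known Z: X = mu + Z * sigma.
x_from_z = mu + 0.30 * sigma

# Or go straight from cumulative area .6179 (= .5 + .1179) to X.
x_direct = NormalDist(mu, sigma).inv_cdf(0.6179)

print(round(z, 2), x_from_z, round(x_direct, 1))
```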

Page 22

Assessing Normality

Not all continuous random variables are normally distributed

It is important to evaluate how well the data set is approximated by a normal distribution

Page 23

Assessing Normality (continued)

Construct charts:

For small- or moderate-sized data sets, do the stem-and-leaf display and box-and-whisker plot look symmetric?

For large data sets, does the histogram or polygon appear bell-shaped?

Compute descriptive summary measures:

Do the mean, median and mode have similar values?

Is the interquartile range approximately 1.33 $\sigma$?

Is the range approximately 6 $\sigma$?

Page 24

Assessing Normality (continued)

Observe the distribution of the data set:

Do approximately 2/3 of the observations lie between mean $\pm$ 1 standard deviation?

Do approximately 4/5 of the observations lie between mean $\pm$ 1.28 standard deviations?

Do approximately 19/20 of the observations lie between mean $\pm$ 2 standard deviations?

Evaluate normal probability plot:

Do the points lie on or close to a straight line with positive slope?
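The fractions 2/3, 4/5 and 19/20 are approximations to areas under the standardized normal curve. A short sketch that computes those areas, assuming Python's `statistics.NormalDist` (the helper name `share_within` is illustrative):

```python
from statistics import NormalDist

standard_normal = NormalDist()


def share_within(k):
    """Area under the standardized normal curve between -k and +k."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k, rule_of_thumb in ((1, 2 / 3), (1.28, 4 / 5), (2, 19 / 20)):
    print(k, round(share_within(k), 4), round(rule_of_thumb, 4))
```

A data set whose observed shares are far from these values is a candidate for closer inspection with a normal probability plot.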

Page 25

Assessing Normality (continued)

Normal probability plot:

Arrange data into ordered array

Find corresponding standardized normal quantile values

Plot the pairs of points with observed data values on the vertical axis and the standardized normal quantile values on the horizontal axis

Evaluate the plot for evidence of linearity
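The steps above can be sketched as follows. Two things here are assumptions rather than content from the slides: the data values are hypothetical, and the i-th ordered value is paired with the quantile for cumulative area i/(n + 1), which is one common convention among several. Plotting is left out so the sketch needs only the standard library.

```python
from statistics import NormalDist

# Hypothetical data set, for illustration only.
data = [61, 47, 55, 72, 50, 66, 58, 43, 79]

# Step 1: arrange the data into an ordered array.
ordered = sorted(data)
n = len(ordered)

# Step 2: find the standardized normal quantile for each position,
# assuming the i/(n + 1) convention for the cumulative area.
quantiles = [NormalDist().inv_cdf(i / (n + 1)) for i in range(1, n + 1)]

# Step 3: pair quantiles (horizontal axis) with observed values (vertical axis).
points = list(zip(quantiles, ordered))

for z_value, x_value in points:
    print(f"{z_value:6.2f}  {x_value}")
```

Step 4, evaluating linearity, is then a matter of plotting `points` and checking whether they fall close to a straight line with positive slope.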

Page 26

Assessing Normality (continued)

Normal Probability Plot for Normal Distribution

Look for Straight Line!

[Figure: normal probability plot with X (30 to 90) on the vertical axis and Z (-2 to 2) on the horizontal axis]

Page 27

Normal Probability Plot

[Figure: four normal probability plots titled Left-Skewed, Right-Skewed, Rectangular and U-Shaped, each with X (30 to 90) on the vertical axis and Z (-2 to 2) on the horizontal axis]

Page 28

Exponential Distributions

$$P(\text{arrival time} < X) = 1 - e^{-\lambda X}$$

$X$: any value of continuous random variable

$\lambda$: the population average number of arrivals per unit of time

$1/\lambda$: average time between arrivals

$e = 2.71828$

e.g.: Drivers arriving at a toll bridge; customers arriving at an ATM

Page 29

Exponential Distributions (continued)

Describes time or distance between events

Used for queues

Density function: $f(x) = \lambda e^{-\lambda x}$, $x \ge 0$

Parameter: $\lambda$ (the mean and the standard deviation both equal $1/\lambda$)

[Figure: exponential density curves f(X) against X for $\lambda = 0.5$ and $\lambda = 2.0$]

Page 30

Example

e.g.: Customers arrive at the checkout line of a supermarket at the rate of 30 per hour. What is the probability that the arrival time between consecutive customers is greater than five minutes?

$\lambda = 30$, $X = 5/60$ hours

$$P(\text{arrival time} > X) = 1 - P(\text{arrival time} \le X) = 1 - \left(1 - e^{-30(5/60)}\right) = .0821$$
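The arithmetic in this example is short enough to check by hand or in code. A minimal sketch using only Python's `math` module (variable names are illustrative):

```python
import math

arrival_rate = 30        # lambda: customers per hour
x = 5 / 60               # five minutes expressed in hours

p_at_most_x = 1 - math.exp(-arrival_rate * x)   # P(arrival time <= X)
p_more_than_x = 1 - p_at_most_x                 # P(arrival time > X) = e^(-2.5)

print(round(p_more_than_x, 4))  # prints 0.0821
```

Note that the rate and the time must use the same unit; here both are in hours.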

Page 31

Exponential Distribution in PHStat

PHStat | Probability & Prob. Distributions | Exponential

Example in Excel spreadsheet

Page 32

Why Study Sampling Distributions

Sample statistics are used to estimate population parameters

e.g.: $\bar{X} = 50$ estimates the population mean $\mu$

Problems:

Different samples provide different estimates

Large samples give better estimates; large samples cost more

How good is the estimate?

Approach to solution: the theoretical basis is the sampling distribution

Page 33

Sampling Distribution

Theoretical probability distribution of a sample statistic

Sample statistic is a random variable, e.g., sample mean, sample proportion

Results from taking all possible samples of the same size

Page 34

Developing Sampling Distributions

Assume there is a population …

Population size N = 4

Random variable, X, is age of individuals

Values of X: 18, 20, 22, 24, measured in years

[Figure: four individuals labeled A, B, C and D]

Page 35

Developing Sampling Distributions (continued)

Summary Measures for the Population Distribution

$$\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21$$

$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}} = 2.236$$

[Figure: Uniform Distribution; P(X) (0 to .3) against X for A (18), B (20), C (22) and D (24)]

Page 36

Developing Sampling Distributions (continued)

All Possible Samples of Size n = 2

16 Samples Taken with Replacement

1st Obs    2nd Observation
            18      20      22      24
  18      18,18   18,20   18,22   18,24
  20      20,18   20,20   20,22   20,24
  22      22,18   22,20   22,22   22,24
  24      24,18   24,20   24,22   24,24

16 Sample Means

1st Obs    2nd Observation
            18      20      22      24
  18        18      19      20      21
  20        19      20      21      22
  22        20      21      22      23
  24        21      22      23      24

Page 37

Developing Sampling Distributions (continued)

Sampling Distribution of All Sample Means

16 Sample Means

1st Obs    2nd Observation
            18      20      22      24
  18        18      19      20      21
  20        19      20      21      22
  22        20      21      22      23
  24        21      22      23      24

[Figure: Sample Means Distribution; $P(\bar{X})$ (0 to .3) against $\bar{X}$ for the values 18, 19, 20, 21, 22, 23 and 24]

Page 38

Developing Sampling Distributions (continued)

Summary Measures of Sampling Distribution

$$\mu_{\bar{X}} = \frac{\sum_{i=1}^{N} \bar{X}_i}{N} = \frac{18 + 19 + 19 + \cdots + 24}{16} = 21$$

$$\sigma_{\bar{X}} = \sqrt{\frac{\sum_{i=1}^{N} (\bar{X}_i - \mu_{\bar{X}})^2}{N}} = \sqrt{\frac{(18 - 21)^2 + (19 - 21)^2 + \cdots + (24 - 21)^2}{16}} = 1.58$$

Here N = 16 is the number of possible samples.
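The whole development, from the population of four ages to the summary measures of the sampling distribution, can be reproduced by brute-force enumeration. This is a minimal sketch using only the Python standard library; the variable names are illustrative.

```python
from collections import Counter
from itertools import product
from statistics import mean, pstdev

population = [18, 20, 22, 24]   # ages of individuals A, B, C, D

mu = mean(population)           # population mean
sigma = pstdev(population)      # population standard deviation (divides by N)

# All 16 ordered samples of size n = 2, drawn with replacement.
samples = list(product(population, repeat=2))
sample_means = [mean(sample) for sample in samples]

mu_xbar = mean(sample_means)        # mean of the 16 sample means
sigma_xbar = pstdev(sample_means)   # standard error of the mean

# Sampling distribution: how often each sample mean occurs out of 16.
distribution = Counter(sample_means)

print(mu, round(sigma, 3), mu_xbar, round(sigma_xbar, 2))
for value in sorted(distribution):
    print(value, distribution[value] / len(samples))
```

The sample means centre on the population mean, and their spread equals the population standard deviation divided by the square root of the sample size.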

Page 39

Comparing the Population with its Sampling Distribution

Population, N = 4: $\mu = 21$, $\sigma = 2.236$

Sample Means Distribution, n = 2: $\mu_{\bar{X}} = 21$, $\sigma_{\bar{X}} = 1.58$

[Figure: P(X) (0 to .3) against X for A (18), B (20), C (22) and D (24), beside $P(\bar{X})$ (0 to .3) against $\bar{X}$ for the values 18 to 24]

Page 40

Properties of Summary Measures

$\mu_{\bar{X}} = \mu$, i.e., $\bar{X}$ is unbiased

Standard error (standard deviation) of the sampling distribution, $\sigma_{\bar{X}}$, is less than the standard error of other unbiased estimators

For sampling with replacement: $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$; as n increases, $\sigma_{\bar{X}}$ decreases

Page 41

Unbiasedness

[Figure: two sampling distributions $P(\bar{X})$ against $\bar{X}$, one labeled Unbiased and one labeled Biased]

Page 42

Less Variability

[Figure: Sampling Distribution of Mean and Sampling Distribution of Median, $P(\bar{X})$ against $\bar{X}$]

Page 43

Effect of Large Sample

[Figure: sampling distributions $P(\bar{X})$ against $\bar{X}$ for a larger sample size and a smaller sample size]

Page 44

When the Population is Normal

Central Tendency: $\mu_{\bar{X}} = \mu$

Variation (Sampling with Replacement): $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$

Population Distribution: $\mu = 50$, $\sigma = 10$

Sampling Distributions: $\mu_{\bar{X}} = 50$; n = 4 gives $\sigma_{\bar{X}} = 5$; n = 16 gives $\sigma_{\bar{X}} = 2.5$

Page 45

When the Population is Not Normal

Central Tendency: $\mu_{\bar{X}} = \mu$

Variation (Sampling with Replacement): $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$

Population Distribution: $\mu = 50$, $\sigma = 10$

Sampling Distributions: $\mu_{\bar{X}} = 50$; n = 4 gives $\sigma_{\bar{X}} = 5$; n = 30 gives $\sigma_{\bar{X}} = 1.8$
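The standard errors quoted on these two slides follow from dividing the population standard deviation by the square root of n. A small sketch, with an illustrative helper name `standard_error`:

```python
import math


def standard_error(sigma, n):
    """Standard error of the mean for sampling with replacement: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


sigma = 10   # population standard deviation from the slides
for n in (4, 16, 30):
    print(n, round(standard_error(sigma, n), 1))
```

Quadrupling the sample size from 4 to 16 halves the standard error, from 5 to 2.5.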

Page 46

Central Limit Theorem

As sample size gets large enough, the sampling distribution of $\bar{X}$ becomes almost normal regardless of the shape of the population
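A quick simulation can make the theorem concrete. Everything in this sketch is an assumption chosen for illustration rather than content from the slides: the population is exponential with mean 1 and standard deviation 1 (a strongly right-skewed shape), the sample size is 30, and 5,000 samples are drawn with a fixed random seed.

```python
import math
import random
from statistics import mean, pstdev

rng = random.Random(42)   # fixed seed so the run is repeatable

population_mean = 1.0     # an exponential population with rate 1 has mean 1
population_sd = 1.0       # and standard deviation 1
n = 30                    # sample size
replications = 5000       # number of samples drawn

sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n
    for _ in range(replications)
]

observed_mean = mean(sample_means)
observed_se = pstdev(sample_means)
expected_se = population_sd / math.sqrt(n)

print(round(observed_mean, 3), round(observed_se, 3), round(expected_se, 3))
```

A histogram of `sample_means` should look far more bell-shaped than the skewed population it was drawn from, with its centre near the population mean and its spread near $\sigma/\sqrt{n}$.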

Page 47

How Large is Large Enough?

For most distributions, n > 30

For fairly symmetric distributions, n > 15

For normal distribution, the sampling distribution of the mean is always normally distributed

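Page 48

Example: $\mu = 8$, $\sigma = 2$, $n = 25$; $P(7.8 < \bar{X} < 8.2) = ?$

$$P(7.8 < \bar{X} < 8.2) = P\left(\frac{7.8 - 8}{2/\sqrt{25}} < \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} < \frac{8.2 - 8}{2/\sqrt{25}}\right) = P(-.5 < Z < .5) = .3830$$

[Figure: sampling distribution with $\mu_{\bar{X}} = 8$ and $\sigma_{\bar{X}} = 2/\sqrt{25} = .4$, area between 7.8 and 8.2 shaded, beside the standardized normal distribution with $\mu_Z = 0$ and $\sigma_Z = 1$, area between Z = -0.5 and Z = 0.5 shaded; each half of the shaded area is .1915]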
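This example combines the standard error with a normal probability. A minimal sketch, assuming Python's `statistics.NormalDist` in place of the table; because the table rounds each half to .1915, the unrounded answer can differ from .3830 in the fourth decimal place.

```python
import math
from statistics import NormalDist

mu, sigma, n = 8, 2, 25
lower, upper = 7.8, 8.2

sigma_xbar = sigma / math.sqrt(n)     # standard error of the mean: 0.4
z_lower = (lower - mu) / sigma_xbar   # about -0.5
z_upper = (upper - mu) / sigma_xbar   # about  0.5

standard_normal = NormalDist()
probability = standard_normal.cdf(z_upper) - standard_normal.cdf(z_lower)

print(sigma_xbar, round(z_lower, 2), round(z_upper, 2), probability)
```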

Page 49

Population Proportions

Categorical variable

e.g.: Gender, voted for Bush, college degree

Proportion of population having a characteristic: $p$

Sample proportion provides an estimate:

$$p_S = \frac{X}{n} = \frac{\text{number of successes}}{\text{sample size}}$$

If two outcomes, X has a binomial distribution

Possess or do not possess characteristic

Page 50

Sampling Distribution of Sample Proportion

Approximated by normal distribution when $np \ge 5$ and $n(1 - p) \ge 5$

Mean: $\mu_{p_S} = p$

Standard error: $\sigma_{p_S} = \sqrt{\dfrac{p(1 - p)}{n}}$

$p$ = population proportion

[Figure: Sampling Distribution; $P(p_S)$ (0 to .3) against $p_S$ (0 to 1)]

Page 51

Standardizing Sampling Distribution of Proportion

$$Z = \frac{p_S - \mu_{p_S}}{\sigma_{p_S}} = \frac{p_S - p}{\sqrt{\dfrac{p(1 - p)}{n}}}$$

[Figure: sampling distribution of $p_S$ with mean $\mu_{p_S}$ and standard error $\sigma_{p_S}$, beside the standardized normal distribution with $\mu_Z = 0$ and $\sigma_Z = 1$]

Page 52

Example: $n = 200$, $p = .4$; $P(p_S < .43) = ?$

$$P(p_S < .43) = P\left(\frac{p_S - \mu_{p_S}}{\sigma_{p_S}} < \frac{.43 - .4}{\sqrt{\dfrac{.4(1 - .4)}{200}}}\right) = P(Z < .87) = .8078$$

[Figure: sampling distribution of $p_S$ with the area to the left of .43 shaded, beside the standardized normal distribution with $\mu_Z = 0$ and $\sigma_Z = 1$ and the area to the left of Z = .87 shaded]
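The proportion example can be checked the same way. In this sketch (again assuming Python's `statistics.NormalDist` as a stand-in for the table), `table_probability` rounds Z to two decimals as the slide does, while `exact_probability` keeps the unrounded Z, so the two differ slightly.

```python
import math
from statistics import NormalDist

n, p = 200, 0.4
cutoff = 0.43

# Normal approximation conditions: np >= 5 and n(1 - p) >= 5.
conditions_met = n * p >= 5 and n * (1 - p) >= 5

standard_error = math.sqrt(p * (1 - p) / n)   # sigma of p_S
z = (cutoff - p) / standard_error             # about 0.866

table_probability = NormalDist().cdf(round(z, 2))   # uses Z = 0.87, as on the slide
exact_probability = NormalDist().cdf(z)             # uses the unrounded Z

print(conditions_met, round(z, 2), table_probability, exact_probability)
```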

Page 53

Sampling from Finite Population

Modify standard error if sample size (n) is large relative to population size (N): $n > .05N$ or $n/N > .05$

Use finite population correction factor (fpc)

Standard error with FPC:

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N - n}{N - 1}}$$

$$\sigma_{p_S} = \sqrt{\frac{p(1 - p)}{n}}\sqrt{\frac{N - n}{N - 1}}$$
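The effect of the correction factor is easy to see numerically. The population size, sample size, standard deviation and proportion below are hypothetical values chosen for illustration, and the function names are likewise illustrative.

```python
import math


def fpc(population_size, sample_size):
    """Finite population correction factor: sqrt((N - n) / (N - 1))."""
    return math.sqrt((population_size - sample_size) / (population_size - 1))


def se_mean_finite(sigma, sample_size, population_size):
    """Standard error of the mean with the finite population correction."""
    return sigma / math.sqrt(sample_size) * fpc(population_size, sample_size)


def se_proportion_finite(p, sample_size, population_size):
    """Standard error of the proportion with the finite population correction."""
    return math.sqrt(p * (1 - p) / sample_size) * fpc(population_size, sample_size)


# Hypothetical values: N = 1000, n = 100 (n/N = .10 > .05), sigma = 10, p = .4.
N, n = 1000, 100
needs_correction = n / N > 0.05

print(needs_correction, round(fpc(N, n), 4))
print(round(se_mean_finite(10, n, N), 4), round(se_proportion_finite(0.4, n, N), 4))
```

The factor is always at most 1, so the corrected standard error is never larger than the uncorrected one, and it approaches 1 as n becomes small relative to N.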