
Faculdade de Ciências Económicas e Empresariais
UNIVERSIDADE CATÓLICA PORTUGUESA

MBACATÓLICA

Miguel Gouveia
Manuel Leite Monteiro

Quantitative Methods

MBACatólica 2006/07 Métodos Quantitativos 7-2

9. SAMPLING DISTRIBUTIONS


Problem

! A soft-drink vending machine is set so the amount of drink dispensed is a random variable with a mean of 200 milliliters and a standard deviation of 15 milliliters. What is the probability that the average amount dispensed in a random sample of 36 is at least 204 milliliters:

a) if the random variable is normally distributed?

b) if the distribution is unknown?


Distribution of the sample mean

! The sample mean (computed from n observations drawn from a population) is a random variable.

! Our objective is to study the distribution of that variable and to see how it is related to the distribution of the population from which the sample was drawn.


Distribution of the sample mean

! Example: samples (with replacement) of size n = 2 from a population with four values: 1, 2, 3, 4 (µ = 2.5 and σ² = 1.25).

! Possible samples: 16

Samples (first draw, second draw):

1,1   1,2   1,3   1,4
2,1   2,2   2,3   2,4
3,1   3,2   3,3   3,4
4,1   4,2   4,3   4,4

Sample means:

1.0   1.5   2.0   2.5
1.5   2.0   2.5   3.0
2.0   2.5   3.0   3.5
2.5   3.0   3.5   4.0


Distribution of the sample mean

Sample Mean    Nº of samples    Probability
1.0            1                1/16
1.5            2                2/16
2.0            3                3/16
2.5            4                4/16
3.0            3                3/16
3.5            2                2/16
4.0            1                1/16
Total          16               1
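The table above can be reproduced by brute-force enumeration. The following is a minimal sketch, assuming the population {1, 2, 3, 4} and samples of size 2 drawn with replacement as in the example; the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4]
n = 2  # sample size, drawn with replacement

# All 4**2 = 16 equally likely ordered samples and their means.
samples = list(product(population, repeat=n))
means = [Fraction(sum(s), n) for s in samples]

# Sampling distribution: P(mean = m) = (number of samples with mean m) / 16.
counts = Counter(means)
dist = {m: Fraction(c, len(samples)) for m, c in sorted(counts.items())}

# Mean and variance of the sample mean, computed from its distribution.
expected = sum(m * p for m, p in dist.items())
variance = sum((m - expected) ** 2 * p for m, p in dist.items())

for m, p in dist.items():
    print(float(m), counts[m], p)  # fractions are printed in lowest terms
print("E[mean] =", float(expected), " V[mean] =", float(variance))
```

Exact fractions are used so that the probabilities sum to exactly 1 and the moments carry no rounding error.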


Distribution of the sample mean

[Figure: two bar charts of f(x) against x, with the vertical axis running from 0 to 0.3. Left panel: distribution of the population (x = 1, 2, 3, 4). Right panel: distribution of the sample mean (x̄ = 1.0, 1.5, ..., 4.0).]


Distribution of the sample mean

! The mean of the sample mean’s distribution is the mean of the population.

! Concepts of mean being used:

\[ E[\bar{X}] = \sum \bar{x}\, f(\bar{x}) = 2.5 = \mu \]

– $E[\bar{X}]$: expected value (parameter of the mean's distribution)

– $\bar{X}$: random variable

– $\mu$: parameter (parameter of the universe)


Distribution of the sample mean

\[ V[\bar{X}] = \sum (\bar{x} - \mu)^2 f(\bar{x}) = 0.625 = \frac{1.25}{2} = \frac{\sigma^2}{n} \]

! The standard deviation of the sample mean is:

\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \]

! As the sample size (n) increases, the standard deviation of the mean decreases.

! As the population standard deviation (σ) decreases, the standard deviation of the mean also decreases.


Distribution of the sample mean

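[Figure: two bar charts with the vertical axis running from 0 to 0.3. Left panel: f(x) for the population, N = 4, x = 1, 2, 3, 4, with µ = 2.5 and σ² = 1.25. Right panel: f(x̄) for the sample mean (n = 2), x̄ = 1.0, 1.5, ..., 4.0, with E[X̄] = 2.5 and V[X̄] = 0.625.]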


Distribution of the sample mean

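\[ E[\bar{X}] = E\!\left[\frac{X_1 + X_2 + \dots + X_n}{n}\right] = \frac{\mu + \mu + \dots + \mu}{n} = \frac{n\mu}{n} = \mu \]

\[ V[\bar{X}] = V\!\left[\frac{X_1 + X_2 + \dots + X_n}{n}\right] = \frac{\sigma^2 + \sigma^2 + \dots + \sigma^2}{n^2} = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n} \]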


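Distribution of the sample mean for Normal Populations

! The linear combination of independent normal random variables is itself a normal random variable.

! Application:

If X ~ N(µ, σ) (second parameter: standard deviation) then

\[ \bar{X} = \sum_{i=1}^{n} X_i f_i \sim N\!\left(\mu, \frac{\sigma}{\sqrt{n}}\right) \]

with weights $f_i = 1/n$ for the simple sample mean.

! X×Y and X/Y do not have a normal distribution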


Problem

! A soft-drink vending machine is set so the amount of drink dispensed is a random variable with a mean of 200 milliliters and a standard deviation of 15 milliliters. What is the probability that the average amount dispensed in a random sample of 36 is at least 204 milliliters:

a) if the random variable is normally distributed?

b) if the distribution is unknown?


Solution

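! X: quantity of soft drink dispensed, with µ = 200 and σ = 15. Sample size: n = 36

a)

\[ \text{if } X \sim N(200, 15^2) \text{ then } \bar{X} \sim N\!\left(200, \frac{15^2}{36}\right) \]

! Probability that the average amount is at least 204:

\[ P[\bar{X} \ge 204] = P\!\left[\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \ge \frac{204 - 200}{15/\sqrt{36}}\right] = P[Z \ge 1.6] = 1 - 0.9452 = 5.48\% \]

And if the distribution were unknown?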
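The arithmetic of part a) can be checked numerically. This is a small sketch using only the Python standard library (`statistics.NormalDist`), with the problem's values µ = 200, σ = 15, n = 36; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 200.0, 15.0, 36
threshold = 204.0

std_error = sigma / sqrt(n)            # standard deviation of the sample mean: 15 / 6 = 2.5
z = (threshold - mu) / std_error       # standardized threshold: 4 / 2.5 = 1.6
prob = 1.0 - NormalDist().cdf(z)       # P(Z >= 1.6) under the standard normal

print(f"z = {z:.2f}, P(mean >= {threshold}) = {prob:.4f}")
```

The same number is obtained from a printed table by looking up 0.9452 for z = 1.6 and subtracting from 1.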


Central Limit Theorem

! The distribution of a random variable obtained from the sum (mean) of “n” independent and identically distributed (i.i.d) random variables approaches a normal distribution as “n” increases.

! This result holds whatever the distribution of the population, provided its variance is finite.

! If X1, X2, ..., Xn are n i.i.d. random variables with mean µ and variance σ², then, approximately for large n:

\[ \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1) \]
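A simulation can make the theorem tangible. The sketch below is illustrative only: it assumes a strongly skewed population (exponential with mean 1 and standard deviation 1, a choice not taken from the text), draws many samples of size 36, and compares the standardized sample means with the standard normal.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, pstdev

rng = random.Random(0)            # fixed seed so the run is repeatable
mu, sigma = 1.0, 1.0              # exponential(rate=1): mean 1, sd 1 (assumed population)
n, replications = 36, 20000

def standardized_mean():
    """Draw one sample of size n and return (xbar - mu) / (sigma / sqrt(n))."""
    xbar = sum(rng.expovariate(1.0) for _ in range(n)) / n
    return (xbar - mu) / (sigma / sqrt(n))

zs = [standardized_mean() for _ in range(replications)]

# If the normal approximation is good, these should be near 0, 1 and P(Z >= 1.6).
share_above = sum(z >= 1.6 for z in zs) / replications
print("mean of z:", round(mean(zs), 3))
print("sd of z:  ", round(pstdev(zs), 3))
print("share >= 1.6:", round(share_above, 4),
      " normal tail:", round(1 - NormalDist().cdf(1.6), 4))
```

For a skewed population the simulated tail share is close to, but not exactly, the normal value; the gap narrows as n grows.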


Central Limit Theorem

As the sample size increases, the distribution of the sample mean becomes almost Normal, independently of the population’s distribution.


Central Limit Theorem

! What sample size (n) is “large enough”?

– For most population distributions, n>30

– For distributions that are fairly symmetric, n>15 may suffice

– For populations that are normally distributed, the sampling distribution of the mean is always normal, regardless of the sample size.


Solution

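! X: quantity of soft drink dispensed, with µ = 200 and σ = 15. Sample size: n = 36

b)

Since n = 36 is "large", approximately:

\[ \bar{X} \sim N\!\left(200, \frac{15^2}{36}\right) \]

! Probability that the average amount is at least 204:

\[ P[\bar{X} \ge 204] = P\!\left[\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \ge \frac{204 - 200}{15/\sqrt{36}}\right] \approx P[Z \ge 1.6] = 1 - 0.9452 = 5.48\% \]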


10. INTRODUCTION TO STATISTICAL INFERENCE


Statistical Inference

11. Point Estimation

12. Confidence Intervals

13. Hypothesis Tests


Problem

! BankX plans to launch a new financial product different from all the existing ones. A sample of 25 potential investors provided the following information regarding the amount they wish to invest in the new product (normally distributed): $\sum x_i = 1000$ and $\sum (x_i - \bar{x})^2 = 9600$.

a) Compute a point estimate for the average amount invested.

b) Compute a 90% confidence interval for the average amount invested.


Parameters and Statistics

! Parameter: is a numerical value that characterizes the distribution or the universe studied.

! Estimator: is a random variable that can take different values depending on the particular sample drawn.

! Estimate: is a number that is obtained from a specific sample.


11. Point Estimation


Estimators for the mean, variance and proportion

Population’s parameter     Estimator    Estimate
Mean µ                     X̄            x̄
Variance σ²                S²           s²
Proportion p               fn           (fn)
Standard deviation σ       S            s


Estimator’s properties

! Unbiasedness

An estimator is unbiased if the mean of its distribution equals the parameter.

! Efficiency

An unbiased estimator is the most efficient if its variance (around the parameter) is minimal.

! Consistency

An estimator is consistent if, as the sample size increases, its mean approaches the parameter and its variance shrinks toward zero.

Unbiasedness

[Figure: sampling densities f(·) of an unbiased estimator, centred on µ, and of a biased estimator, centred away from µ.]

Efficiency

[Figure: sampling distribution f(·) of the mean and sampling distribution of the median, both drawn around µ.]

Consistency

[Figure: sampling distributions f(·) of an estimator around µ for a small sample and for a large sample.]


Solution

! BankX plans to launch a new financial product different from all the existing ones. A sample of 25 potential investors provided the following information regarding the amount they wish to invest in the new product (normally distributed): $\sum x_i = 1000$ and $\sum (x_i - \bar{x})^2 = 9600$.

a) Compute a point estimate for the average amount invested.

\[ n = 25; \quad \bar{x} = \frac{1000}{25} = 40; \quad s^2 = \frac{9600}{24} = 400 \]

\[ \text{point estimate: } \hat{\mu} = \bar{x} = \frac{1000}{25} = 40 \]

b) Compute a 90% confidence interval for the average amount invested.


12. CONFIDENCE INTERVALS


Point Estimation vs. Confidence Intervals

[Figure: a population whose mean, µ, is unknown; a random sample is drawn from it, with sample mean x̄ = 50; the conclusion reads "I’ve got 95% confidence that µ is located between 40 and 60."]


Confidence Intervals for the mean

! Example for a Normal population (or for “large” samples)

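As:

\[ \bar{X} \sim N\!\left(\mu, \frac{\sigma}{\sqrt{n}}\right) \]

we have

\[ \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1) \]

Thus:

\[ P\!\left[-1.96 < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96\right] = 0.95 \]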


Confidence Intervals for the mean

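which can also be written as:

\[ P\!\left[\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right] = 0.95 \]

! So, we have a 95% confidence interval for the mean:

\[ \bar{x} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + 1.96\frac{\sigma}{\sqrt{n}} \]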


Interpretation of a (1-α)% confidence interval

! (1-α)% is the percentage of confidence intervals,

– from successive samples,

– all with size n,

– drawn from the same population

that include the true value of the parameter being estimated.
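This repeated-sampling interpretation can be illustrated by simulation. The sketch below assumes a hypothetical Normal population (µ = 50, σ = 10, values chosen only for illustration), draws many samples of size 25, builds a 95% interval with known σ from each, and counts how often the interval contains µ.

```python
import random
from math import sqrt
from statistics import NormalDist

rng = random.Random(1)                      # fixed seed so the run is repeatable
mu, sigma, n = 50.0, 10.0, 25               # hypothetical population and sample size
alpha = 0.05
z = NormalDist().inv_cdf(1 - alpha / 2)     # critical value, about 1.96
half_width = z * sigma / sqrt(n)

trials = 10000
covered = 0
for _ in range(trials):
    xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
    # The interval changes from sample to sample; mu stays fixed.
    if xbar - half_width < mu < xbar + half_width:
        covered += 1

coverage = covered / trials
print(f"share of intervals containing mu: {coverage:.3f}")
```

The share should land near 0.95. Any single interval either contains µ or does not; the 95% describes the procedure over repeated samples.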


Interpretation of a (1-α)% confidence interval

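[Figure: sampling distribution of x̄ centred on E[X̄] = µ, with area 1 − α between $\mu - z_{\alpha/2}\,\sigma/\sqrt{n}$ and $\mu + z_{\alpha/2}\,\sigma/\sqrt{n}$ and area α/2 in each tail. Below it, confidence intervals for 10 different samples.]

(1 − α)% of the intervals contain µ and α% don’t.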


(1-α)% CI for the mean: Normal Pop. (or n large) and σ known

! For a Normal population (or large n) with σ known:

1. Define the level of confidence (1-α)%

2. Collect a sample with size n. Compute $\bar{x}$

3. Obtain $z_{\alpha/2}$ from the statistical tables

4. The confidence interval is given by:

\[ \bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \]


Solution

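! BankX plans to launch a new financial product different from all the existing ones. A sample of 25 potential investors provided the following information regarding the amount they wish to invest in the new product (normally distributed): $\sum x_i = 1000$ and $\sum (x_i - \bar{x})^2 = 9600$.

b) Compute a 90% CI for the average amount invested.

\[ n = 25; \quad \bar{x} = \frac{1000}{25} = 40; \quad s^2 = \frac{9600}{24} = 400 \;\Rightarrow\; s = 20 \]

\[ \alpha = 10\% \;\Rightarrow\; z_{\alpha/2} = z_{0.05} = 1.645 \]

\[ \bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \;\Rightarrow\; 40 \pm 1.645\,\frac{20}{\sqrt{25}} = 40 \pm 6.58 \]

With the Normal critical value the interval is (33.42, 46.58). Because σ is not known here but estimated by s = 20 from only 25 observations, the Student's t critical value with 24 degrees of freedom, $t^{(24)}_{0.05} = 1.711$, applies instead, giving $40 \pm 1.711 \times 4$:

CI for µ: (33.156, 46.844)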
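The BankX numbers can be recomputed from the two sums given in the problem. This is a sketch, not the authors' method: it builds the interval once with the Normal critical value (about 1.645) and once with the tabulated Student's t value for 24 degrees of freedom, 1.711, which is an assumption taken from standard t tables rather than from this slide.

```python
from math import sqrt
from statistics import NormalDist

n = 25
sum_x = 1000.0            # sum of the observations
sum_sq_dev = 9600.0       # sum of squared deviations from the sample mean

xbar = sum_x / n                      # point estimate of the mean: 40
s = sqrt(sum_sq_dev / (n - 1))        # sample standard deviation: sqrt(400) = 20
std_error = s / sqrt(n)               # 20 / 5 = 4

z = NormalDist().inv_cdf(0.95)        # Normal critical value for 90% confidence
t_24 = 1.711                          # t table value, 24 df (assumed, not computed here)

ci_z = (xbar - z * std_error, xbar + z * std_error)
ci_t = (xbar - t_24 * std_error, xbar + t_24 * std_error)

print(f"z-based: ({ci_z[0]:.3f}, {ci_z[1]:.3f})")
print(f"t-based: ({ci_t[0]:.3f}, {ci_t[1]:.3f})")
```

The t-based interval is slightly wider, reflecting the extra uncertainty from estimating σ with a small sample.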


Conflict between credibility and precision

! Credibility – Confidence level of an interval

! Precision – Width of the confidence interval

! For a given sample size n:

– More precision means a narrower interval, which implies a lower level of confidence.

– A higher level of confidence implies a wider interval (less precision).

! The only way to increase simultaneously the precision and the credibility of the inference is to increase n.


Problem

! A vending machine is calibrated to pour a quantity of liquid that follows a normal distribution with variance equal to 16 ml². In a sample of 25 drinks, the average was $\bar{x} = 250$ ml.

We want:

a) To construct a 95% confidence interval for the true average quantity of liquid in the drinks served;

b) To determine how many drinks should be included in a new sample if the width of the interval is to be reduced to 2 ml.


Solution

a)

\[ \bar{x} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + 1.96\frac{\sigma}{\sqrt{n}} \]

\[ 250 - 1.96\frac{4}{\sqrt{25}} < \mu < 250 + 1.96\frac{4}{\sqrt{25}} \]

\[ 248.432 < \mu < 251.568 \]

The width of the interval is 3.136 ml.


Solution

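b)

\[ \text{Width} = 2\, z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]

\[ 2 \times 1.96 \times \frac{4}{\sqrt{n}} = 2 \;\Rightarrow\; \sqrt{n} = 7.84 \;\Rightarrow\; n = 61.47, \text{ so } n = 62 \]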
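Both parts of the vending-machine solution follow from the same half-width formula. A minimal sketch, assuming the table value z = 1.96 and the problem's σ² = 16 ml², n = 25 and x̄ = 250 ml:

```python
from math import ceil, sqrt

sigma = sqrt(16.0)        # population standard deviation: 4 ml
n, xbar = 25, 250.0
z = 1.96                  # table value for 95% confidence

# Part a): interval and its width for the sample of 25 drinks.
half_width = z * sigma / sqrt(n)                 # 1.96 * 4 / 5
ci = (xbar - half_width, xbar + half_width)
width = 2 * half_width

# Part b): solve 2 * z * sigma / sqrt(n) = target_width for n, then round up.
target_width = 2.0
n_exact = (2 * z * sigma / target_width) ** 2    # 7.84 squared
n_required = ceil(n_exact)

print(f"CI: ({ci[0]:.3f}, {ci[1]:.3f}), width {width:.3f} ml")
print(f"n for width {target_width} ml: {n_exact:.2f} -> {n_required}")
```

Rounding up rather than to the nearest integer guarantees the width does not exceed the target.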


Problem

! Ten analysts have given the following forecasts of next year's earnings for a stock, which are normally distributed:

Forecast (X_i)    Number of analysts (n_i)
1.40              1
1.43              1
1.44              3
1.45              2
1.47              1
1.48              1
1.50              1

Compute a 95% confidence interval for the population mean of the forecasts.


Population’s Variance unknown

! Until now we have assumed that the variance of the population was known. However, it usually is unknown and has to be estimated.

! We know that

\[ S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} \]

is an unbiased estimator for the population variance:

\[ E[S^2] = \sigma^2 \]
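Unbiasedness of S² can be verified exactly on a tiny case. The sketch below reuses, as an assumption, the four-value population {1, 2, 3, 4} with samples of size 2 drawn with replacement, averages S² over all 16 equally likely samples, and compares the n − 1 divisor with the n divisor.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4]
n = 2  # sample size, drawn with replacement

# Population mean and variance (divisor N, since this is the whole population).
mu = Fraction(sum(population), len(population))
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)

def sum_sq_dev(sample):
    """Sum of squared deviations of a sample from its own mean."""
    xbar = Fraction(sum(sample), len(sample))
    return sum((x - xbar) ** 2 for x in sample)

samples = list(product(population, repeat=n))

# Expected value of each estimator = plain average over the equally likely samples.
mean_s2 = sum(sum_sq_dev(s) / (n - 1) for s in samples) / len(samples)   # divisor n - 1
mean_biased = sum(sum_sq_dev(s) / n for s in samples) / len(samples)     # divisor n

print("sigma^2 =", sigma2)
print("E[S^2] with divisor n-1 =", mean_s2)
print("E[.]  with divisor n   =", mean_biased)
```

With the n − 1 divisor the average equals σ² exactly, while the n divisor underestimates it, which is why S² is defined with n − 1.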


Distribution of the sample mean from a Normal population with unknown σ

! If the population is Normal and σ is replaced by its estimator S, is the distribution of the standardized sample mean still given by

\[ \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim N(0, 1) \; ? \]

For small samples the answer is NO!


Distribution of the sample mean from a Normal population with unknown σ

! With σ unknown, we have a “t” distribution:

\[ \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t(n - 1) \]

where:

\[ S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} \]

t distribution (Student’s distribution)

[Figure: densities of the Normal (0,1) and of the t distribution with df = 5 and df = 13, on a common z, t axis centred at 0.]

The t distribution is also bell shaped and also symmetric, but with wider tails.


Student’s t distribution

Quantiles of t by degrees of freedom (n) and cumulative probability F(x):

n      0.90     0.95     0.975    0.99     0.995
1      3.078    6.314    12.706   31.821   63.656
2      1.886    2.920    4.303    6.965    9.925
3      1.638    2.353    3.182    4.541    5.841
4      1.533    2.132    2.776    3.747    4.604
5      1.476    2.015    2.571    3.365    4.032
6      1.440    1.943    2.447    3.143    3.707
7      1.415    1.895    2.365    2.998    3.499
8      1.397    1.860    2.306    2.896    3.355
9      1.383    1.833    2.262    2.821    3.250
10     1.372    1.812    2.228    2.764    3.169
11     1.363    1.796    2.201    2.718    3.106
12     1.356    1.782    2.179    2.681    3.055
13     1.350    1.771    2.160    2.650    3.012
14     1.345    1.761    2.145    2.624    2.977
15     1.341    1.753    2.131    2.602    2.947
...
26     1.315    1.706    2.056    2.479    2.779
27     1.314    1.703    2.052    2.473    2.771
28     1.313    1.701    2.048    2.467    2.763
29     1.311    1.699    2.045    2.462    2.756
inf    1.282    1.645    1.960    2.326    2.576

[Figure: density of t (df = 3); the area to the right of 3.182 is 1 − 0.975.]


(1-α)% CI for the mean: Normal Pop. and σ unknown

! For a Normal population with σ unknown:

1. Define the level of confidence (1-α)%

2. Collect a sample with size n. Compute $\bar{x}$ and $s$

3. Obtain $t^{(n-1)}_{\alpha/2}$ from the statistical tables

4. The confidence interval is given by:

\[ \bar{x} - t^{(n-1)}_{\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{x} + t^{(n-1)}_{\alpha/2}\frac{s}{\sqrt{n}} \]


Problem

! Ten analysts have given the following forecasts of next year's earnings for a stock, which are normally distributed:

Forecast (X_i)    Number of analysts (n_i)
1.40              1
1.43              1
1.44              3
1.45              2
1.47              1
1.48              1
1.50              1

Compute a 95% confidence interval for the population mean of the forecasts.


Solution

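\[ \bar{x} = 1.45; \quad s = 0.02789; \quad n = 10; \quad df = 9 \]

\[ t^{(9)}_{0.025} = 2.262 \]

\[ 1.45 - 2.262\,\frac{0.02789}{\sqrt{10}} \le \mu \le 1.45 + 2.262\,\frac{0.02789}{\sqrt{10}} \]

\[ 1.43 \le \mu \le 1.47 \]

! For a 99% confidence level, the interval would be:

\[ 1.45 - 3.250\,\frac{0.02789}{\sqrt{10}} \le \mu \le 1.45 + 3.250\,\frac{0.02789}{\sqrt{10}} \]

\[ 1.421 \le \mu \le 1.479 \]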
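The summary statistics and both intervals can be recomputed from the raw forecasts. This sketch uses the standard library for x̄ and s and takes the critical values 2.262 and 3.250 from the t table row for 9 degrees of freedom; the dictionary and variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Forecasts expanded from the frequency table (ten analysts in total).
forecasts = [1.40, 1.43] + [1.44] * 3 + [1.45] * 2 + [1.47, 1.48, 1.50]

n = len(forecasts)
xbar = mean(forecasts)          # sample mean
s = stdev(forecasts)            # sample standard deviation (divisor n - 1)
std_error = s / sqrt(n)

# Two-sided critical values of t with 9 df, read from the table (0.975 and 0.995 columns).
t_table = {0.95: 2.262, 0.99: 3.250}

intervals = {}
for level, t in t_table.items():
    half_width = t * std_error
    intervals[level] = (xbar - half_width, xbar + half_width)
    print(f"{level:.0%} CI: ({intervals[level][0]:.3f}, {intervals[level][1]:.3f})")
```

Raising the confidence level from 95% to 99% widens the interval, since the critical value grows while x̄, s and n stay the same.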


Distribution of the sample mean

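Normal Population

– σ Known, n<30: $\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$

– σ Known, n≥30: $\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$

– σ Unknown, n<30: $\dfrac{\bar{X} - \mu}{S/\sqrt{n}} \sim t(n-1)$

– σ Unknown, n≥30: $\dfrac{\bar{X} - \mu}{S/\sqrt{n}} \sim N(0,1)$ (CLT)

Not Normal Population

– σ Known, n<30: We don’t know the distribution

– σ Known, n≥30: $\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$ (CLT)

– σ Unknown, n<30: We don’t know the distribution

– σ Unknown, n≥30: $\dfrac{\bar{X} - \mu}{S/\sqrt{n}} \sim N(0,1)$ (CLT)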

Confidence interval for a proportion

! The true proportion of a population is p.

The estimator of p is the proportion in the sample, i.e., $f_n = X/n$, where X is a binomial variable:

\[ E[f_n] = \frac{1}{n} E[X] = \frac{np}{n} = p \]

\[ V[f_n] = \frac{1}{n^2} V[X] = \frac{np(1-p)}{n^2} = \frac{p(1-p)}{n} \]


Confidence interval for a proportion

! For a large n:

\[ \frac{f_n - p}{\sqrt{\dfrac{p(1-p)}{n}}} \sim N(0, 1) \]

! The confidence interval is given by:

\[ f_n - z_{\alpha/2}\sqrt{\frac{f_n(1 - f_n)}{n}} < p < f_n + z_{\alpha/2}\sqrt{\frac{f_n(1 - f_n)}{n}} \]


(1-α)% CI for a proportion: with large samples

1. Define the level of confidence (1-α)%

2. Collect a sample of size n. Compute $f_n$

3. Obtain $z_{\alpha/2}$ from the statistical tables

4. The confidence interval is given by:

\[ f_n - z_{\alpha/2}\sqrt{\frac{f_n(1 - f_n)}{n}} < p < f_n + z_{\alpha/2}\sqrt{\frac{f_n(1 - f_n)}{n}} \]


Problem

! We want to estimate the proportion of voters who support a political party. 400 citizens were interviewed and 140 of them stated an intention to vote for that party. Compute a 99% confidence interval for the proportion of votes for that party.


Solution

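\[ n = 400 \]

\[ f_n = 140/400 = 0.35, \quad 1 - f_n = 0.65 \]

\[ 1 - \alpha = 0.99, \quad \alpha/2 = 0.005, \quad z_{\alpha/2} = 2.57 \]

\[ 0.35 - 2.57\sqrt{\frac{0.35 \times 0.65}{400}} \le p \le 0.35 + 2.57\sqrt{\frac{0.35 \times 0.65}{400}} \]

\[ 0.28871 \le p \le 0.41129 \]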
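The interval for the poll can be reproduced in a few lines. A minimal sketch, assuming the critical value 2.57 used in the solution (a finer table gives about 2.576, which widens the interval slightly):

```python
from math import sqrt

n, successes = 400, 140
f_n = successes / n                        # sample proportion: 0.35
z = 2.57                                   # critical value for 99% confidence, as in the solution

std_error = sqrt(f_n * (1 - f_n) / n)      # estimated standard deviation of f_n
ci = (f_n - z * std_error, f_n + z * std_error)

print(f"f_n = {f_n:.2f}, standard error = {std_error:.5f}")
print(f"99% CI: ({ci[0]:.5f}, {ci[1]:.5f})")
```

The standard error uses f_n in place of the unknown p, which is the large-sample approximation behind the formula.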


Selection of the sample size

! The sample size is a decision variable reflecting a conflict between precision and the cost of sampling.

Very large:

• Too expensive

Very small:

• Imprecise results


Selection of the sample size

! Question: for a desired minimum precision, what is the minimum sample size to be drawn?

The choice of n is affected by 3 factors:

1. The level of precision, i.e. the margin of error (half the interval width)

2. Level of confidence

3. The dispersion of the population


Sample size: Estimation of a proportion

! Since the confidence interval is given by:

\[ f_n - z_{\alpha/2}\sqrt{\frac{f_n(1 - f_n)}{n}} < p < f_n + z_{\alpha/2}\sqrt{\frac{f_n(1 - f_n)}{n}} \]

it can also be written as

\[ f_n - e < p < f_n + e \]

with e being the margin of error.


Sample size: Estimation of a proportion

! Fixing e, it is possible to obtain n as:

\[ n = (z_{\alpha/2})^2\, \frac{f_n(1 - f_n)}{e^2} \]

! BUT: the value of $f_n$ is unknown before the sample is drawn.

The value used for $f_n$ should be the one that maximizes p(1-p), i.e., $f_n = 0.5$.


Problem

! Determine the minimum size of a sample in order to compute a 95% confidence interval for the proportion of consumers who are willing to buy a new product, with a margin of error of one percentage point.

! Recompute that sample size if you were sure that, given the high price of the product, no more than 25% of consumers would buy it.


Solution

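\[ e = 0.01; \quad \alpha = 5\%; \quad z_{\alpha/2} = 1.96 \]

\[ n = 1.96^2 \times \frac{0.5 \times 0.5}{0.01^2} = 9604 \]

! If we knew “a priori” that p ≤ 0.25, the largest possible value of p(1-p) would be reached at 0.25, and then

\[ n = 1.96^2 \times \frac{0.25 \times 0.75}{0.01^2} = 7203 \]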
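The two sample sizes can be checked with a small helper. This is a sketch under stated assumptions: the function name `sample_size_proportion` is hypothetical, and exact fractions are used instead of floats so that rounding up to the next integer is not disturbed by floating-point noise.

```python
from fractions import Fraction
from math import ceil

def sample_size_proportion(e, z, p="0.5"):
    """Smallest n with z * sqrt(p * (1 - p) / n) <= e.

    Arguments are passed as decimal strings so they convert to exact fractions.
    The default p = 0.5 is the worst case, where p * (1 - p) is largest.
    """
    e, z, p = Fraction(e), Fraction(z), Fraction(p)
    return ceil(z ** 2 * p * (1 - p) / e ** 2)

n_worst_case = sample_size_proportion("0.01", "1.96")            # no prior knowledge of p
n_with_prior = sample_size_proportion("0.01", "1.96", "0.25")    # p known to be at most 0.25

print(n_worst_case, n_with_prior)
```

Prior knowledge that p is far from 0.5 lowers p(1 − p) and therefore the required sample size.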


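Sample size: Estimation of the mean

! The confidence interval is given by:

\[ \bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \]

and it can be written as:

\[ \bar{x} - e < \mu < \bar{x} + e \]

Thus:

\[ n = (z_{\alpha/2})^2\, \frac{\sigma^2}{e^2} \]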


Sample size: Estimation of the mean

! If σ is unknown:

1. Collect a pilot sample, with a smaller size, to estimate σ.

2. If the population is approximately normal:

P[µ − 2σ < X < µ + 2σ] ≈ 0.95 and P[µ − 3σ < X < µ + 3σ] ≈ 0.997

Therefore (and using past data or subjective evaluations of the population), we can “estimate”:

i. σ ≈ (Percentile 97.5 − Percentile 2.5)/4

ii. σ ≈ (MAX − MIN)/6


Problem

! Suppose you want to estimate the population mean of the analysts' forecasts of next year's stock earnings to within ± 0.01 with 95% confidence. On the basis of past studies, you believe the standard deviation of those forecasts to be 0.03. Find the minimum sample size needed.


Solution

\[ e = 0.01; \quad \sigma = 0.03; \quad \alpha = 5\%; \quad z_{\alpha/2} = 1.96 \]

\[ n = 1.96^2 \times \frac{0.03^2}{0.01^2} = 34.6 \]

We need at least 35 forecasts in our sample.
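The same calculation for the mean, as a short sketch. The helper name `sample_size_mean` is hypothetical; inputs are decimal strings converted to exact fractions so that rounding up is reliable.

```python
from fractions import Fraction
from math import ceil

def sample_size_mean(e, z, sigma):
    """Smallest n with z * sigma / sqrt(n) <= e, for a known or guessed sigma."""
    e, z, sigma = Fraction(e), Fraction(z), Fraction(sigma)
    return ceil((z * sigma / e) ** 2)

exact = (Fraction("1.96") * Fraction("0.03") / Fraction("0.01")) ** 2   # 5.88 squared
n_required = sample_size_mean("0.01", "1.96", "0.03")

print(float(exact), n_required)
```

Because n must be a whole number and the margin of error must not exceed the target, the exact value is always rounded up.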