Chapter 8 Estimation Nutan S. Mishra Department of Mathematics and Statistics University of South Alabama


Chapter 8 Estimation

Nutan S. Mishra

Department of Mathematics and Statistics

University of South Alabama


Statistical Inference

Drawing inference about the unknown population parameter based on the information from a sample.

Statistical inference is studied in two parts:

Estimation and Testing of Hypothesis.


Estimation

When we cannot perform a census, we cannot know the value of the population parameter.

For example, the US Census Bureau may want to find the average monthly household expenditure on eating out.

The population average is unknown.

So we collect a sample and assign a value (or values) to the parameter based on the sample values.

This is estimation.


Examples of Estimation

1. Since we do not know the average expenditure per week on eating out, we collect a sample, compute the sample mean, and assign the value of the sample mean to the unknown population mean.

2. We do not know the proportion p of all smokers in the United States, so we collect a sample of n people, count the number x of smokers among them, and compute the sample proportion x/n.


Estimation procedure

• Select a sample.

• Collect the required information from the members of the sample.

• Calculate the value of the sample statistic.

• Assign the value(s) to the corresponding population parameter.

An estimator may be a point estimator or an interval estimator.


Point Estimation

The single value of a sample statistic is called a point estimate of the corresponding population parameter.

Example:

When the population mean µ is unknown, the sample mean x̄ is an estimator of µ, and we write $\hat{\mu} = \bar{x}$. The sample mean is a point estimator of the population mean.

When the population proportion p is unknown, the sample proportion $\hat{p} = x/n$ is an estimator of p. p̂ is a point estimator of p.
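As a minimal sketch of the two point estimators above, the Python below computes a sample mean and a sample proportion. The data values and the helper names `sample_mean` and `sample_proportion` are made up for illustration and do not come from the slides.

```python
from statistics import mean


def sample_mean(values):
    """Point estimate of the population mean: x-bar = (sum of values) / n."""
    return mean(values)


def sample_proportion(successes, n):
    """Point estimate of the population proportion: p-hat = x / n."""
    if n <= 0:
        raise ValueError("sample size n must be positive")
    return successes / n


# Hypothetical sample of weekly eating-out expenditures (dollars).
expenditures = [42.0, 55.5, 38.0, 61.25, 47.25]
x_bar = sample_mean(expenditures)

# Hypothetical survey: 18 smokers among 120 people sampled.
p_hat = sample_proportion(18, 120)

print(f"x-bar = {x_bar:.2f}")
print(f"p-hat = {p_hat:.3f}")
```

A different sample would give different values of x̄ and p̂; each such value is one estimate produced by the same estimator.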


Point Estimation

Consider the problem of finding the average GPA of all students at USA (around 13,000).

Let µ be the population average; we do not know its value.

Collect a sample of, say, n = 80 students, collect their GPAs, and compute the sample mean.

Suppose the sample mean is 3.04. Then x̄ is called an estimator of µ, and 3.04 is an estimate of µ.

Suppose we collect another sample of size n = 100 and compute its sample mean, and it is 2.99. Then 2.99 is another estimate of µ.


Error in the estimation

In the last chapter we saw that there is a difference between the value of x̄ and the value of µ.

$\bar{x} - \mu$ is the error in estimation.

Margin of error = $1.96\,\sigma_{\bar{x}}$


Interval Estimator

To estimate an unknown value, instead of using a single point value, we use an interval of values.

This is called interval estimation.

In interval estimation, an interval is constructed around the point estimate.

We then state that this interval is likely to contain the unknown value of the population parameter.

Question: how likely? Can we attach a likelihood to our statement?


Interval estimator

Consider the estimation of the population mean GPA of the students at USA.

After collecting a sample, we compute the sample mean x̄. Suppose the sample mean is 3.05.

We add and subtract a number from x̄ and ask: how confident are we that the resulting interval contains the unknown value of µ?

For example, 3.05 + .15 = 3.20 and 3.05 - .15 = 2.90. What is the probability that the value of µ lies between 2.90 and 3.20?

Questions: What number should be added and subtracted? How do we attach a probability (confidence level) to an interval?


Interval estimation

Confidence level and confidence interval:

The confidence level associated with an interval states how much confidence we have that the interval contains the true value of the population parameter.

Such an interval is called a confidence interval.

The confidence level is denoted by (1-α)*100%.


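Interval estimation of µ

Recall that x̄ is a point estimator of µ, and from Chapter 7 that $\bar{x} \sim N(\mu_{\bar{x}}, \sigma_{\bar{x}}) = N(\mu, \sigma/\sqrt{n})$ whenever the sample size is large.

For large samples, the (1-α)*100% confidence interval for µ is given by

$$\bar{x} \pm z\,\sigma_{\bar{x}} \quad \text{if } \sigma \text{ is known}$$

$$\bar{x} \pm z\,s_{\bar{x}} \quad \text{if } \sigma \text{ is unknown}$$

Recall that $\sigma_{\bar{x}} = \sigma/\sqrt{n}$ and $s_{\bar{x}} = s/\sqrt{n}$.

The value of z is read from the z-table for the given confidence level.

$E = z\,\sigma_{\bar{x}}$ or $E = z\,s_{\bar{x}}$ is called the maximum error of estimate for µ.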


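Confidence interval

For large samples, the confidence interval for µ is

$$\bar{x} \pm z\,\sigma_{\bar{x}} \quad \text{if } \sigma \text{ is known}$$

$$\bar{x} \pm z\,s_{\bar{x}} \quad \text{if } \sigma \text{ is unknown}$$

Recall that $\sigma_{\bar{x}} = \sigma/\sqrt{n}$ and $s_{\bar{x}} = s/\sqrt{n}$.

The value of z is read from the z-table for the given confidence level.

Question: How do we compute z for a given (1-α) in the above formula?

For a 95% confidence interval, (1-α) = .95

α = .05, α/2 = .025

The area to the left of z is 1 - .025 = .975, thus z = 1.96

Similarly, we can compute z for 99%, 98%, etc. confidence levels.

[Figure: standard normal curve, horizontal axis from -4 to 4, with area (1-α) between -z and z and area α/2 in each tail]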
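The z-table lookup can also be reproduced numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the function name `z_for_confidence` is a hypothetical helper introduced only for this illustration.

```python
from statistics import NormalDist


def z_for_confidence(confidence):
    """Return z such that the central area under N(0, 1) equals `confidence`."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    alpha = 1 - confidence
    # Each tail holds alpha/2, so the area to the left of z is 1 - alpha/2.
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%} confidence: z = {z_for_confidence(level):.3f}")
```

For 99% confidence the computed value is about 2.576, which a two-decimal z-table reports as 2.58.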


Example of Confidence Interval

8.11 Given n = 64, x̄ = 24.5, and s = 3.1

a. Point estimate of µ is x̄ = 24.5

b. Margin of error associated with the point estimate of µ is $1.96\,s_{\bar{x}}$ = 1.96*s/√n = 1.96*3.1/8 = .7595

c. 99% confidence interval for µ is $\bar{x} \pm z\,s_{\bar{x}}$ = 24.5 ± z*3.1/8

To compute z: 1-α = .99, α = .01, α/2 = .005, z = 2.58

Thus the confidence interval is given by 24.5 ± 2.58*3.1/8 = 24.5 ± .99975 = (23.50025, 25.49975)

d. Maximum error of estimate is .99975


Interpretation of Confidence Interval

In the earlier example we constructed a 99% confidence interval for µ, which is (23.50025, 25.49975).

This means that we are 99% confident that the unknown value of µ lies between 23.50025 and 25.49975.

This does not mean that this particular interval contains µ with probability .99: µ is a fixed value, so a given interval either contains it or does not.

It means that if we draw all possible samples of size 64 from the given population and construct an interval from each, then 99% of all such intervals will contain the value of µ.
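The repeated-sampling interpretation can be illustrated with a small simulation. This is a sketch under stated assumptions: the population is taken to be normal with µ = 24.5 and σ = 3.1 (values borrowed from the example purely as hypothetical population parameters), σ is treated as known, and the function name `coverage` is made up for this illustration.

```python
import random
from statistics import NormalDist


def coverage(mu, sigma, n, confidence, trials, seed=0):
    """Fraction of simulated confidence intervals that contain mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / n ** 0.5  # sigma is treated as known here
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        # Each sample yields its own interval; mu itself never changes.
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials


# Hypothetical population: mu = 24.5, sigma = 3.1, samples of size 64.
rate = coverage(mu=24.5, sigma=3.1, n=64, confidence=0.99, trials=2000)
print(f"share of intervals containing mu: {rate:.3f}")
```

The printed share should be close to .99. Any single interval either contains µ or misses it; the 99% describes the procedure over many samples.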


Interpretation of Confidence Interval

Recall the formula for the confidence interval for µ:

$$\bar{x} \pm z\,\sigma_{\bar{x}} = \bar{x} \pm z\,\frac{\sigma}{\sqrt{n}}$$

Note the following:

• The values in the interval depend on the sample chosen.

• The width of the interval is $2z\,\sigma/\sqrt{n}$.

• A narrow interval is a better interval.

• The width depends on

– the z-value, which in turn depends on the confidence level

– the size of the sample

These are the two quantities which we can control.

To decrease the width of the interval:

– Lower the confidence level (not a good choice)

– Increase the sample size
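To see how the width responds to the two controllable quantities, the sketch below tabulates the width 2zσ/√n for a few sample sizes and confidence levels. The value σ = 3.1 and the function name `interval_width` are assumptions chosen only for illustration.

```python
from statistics import NormalDist


def interval_width(sigma, n, confidence):
    """Width of the large-sample interval: 2 * z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z * sigma / n ** 0.5


sigma = 3.1  # hypothetical standard deviation
for n in (64, 256, 1024):
    for confidence in (0.95, 0.99):
        width = interval_width(sigma, n, confidence)
        print(f"n = {n:4d}, confidence = {confidence:.0%}: width = {width:.4f}")
```

Because the width is proportional to 1/√n, quadrupling the sample size halves the width, while raising the confidence level from 95% to 99% widens the interval.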


Application: Ex. 8.22

X = amount of time spent online per week by mothers with children under age 18.

n = 1000, x̄ = 16.87 hrs, s = 3.2

To construct a 95% confidence interval for µ:

It's a large sample, so the formula to construct such an interval is

$\bar{x} \pm z\,s_{\bar{x}}$ = 16.87 ± z*3.2/√1000 = 16.87 ± 1.96*3.2/√1000 = 16.87 ± .1983 = (16.6717, 17.0683)

Interpretation: If we draw a large number of samples, each of size 1000, and construct a confidence interval corresponding to each sample, then about 95% of all such intervals will trap the true value of µ.
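The arithmetic in Ex. 8.22 can be checked with a short sketch. The function name `large_sample_interval` is a hypothetical helper, and z is computed with `statistics.NormalDist` rather than read from a table, so it is slightly more precise than 1.96.

```python
from math import sqrt
from statistics import NormalDist


def large_sample_interval(x_bar, s, n, confidence):
    """Return (lower, upper, margin) for x_bar +/- z * s / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * s / sqrt(n)
    return x_bar - margin, x_bar + margin, margin


# Values from Ex. 8.22: n = 1000, x-bar = 16.87 hours, s = 3.2.
lower, upper, margin = large_sample_interval(16.87, 3.2, 1000, 0.95)
print(f"margin   = {margin:.4f}")
print(f"interval = ({lower:.4f}, {upper:.4f})")
```

Rounded to four decimals, this agrees with the margin .1983 and the interval (16.6717, 17.0683).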


Small samples case

Objective: To construct a confidence interval for µ when the sample is small.

The t-distribution is used to construct a confidence interval for µ if

1. The population from which the sample is drawn is approximately normal

2. The sample size is small

3. The population standard deviation σ is unknown.

Formula:

The (1-α)*100% confidence interval for µ is

$$\bar{x} \pm t\,s_{\bar{x}}, \quad \text{where } s_{\bar{x}} = \frac{s}{\sqrt{n}}$$

The value of t is obtained from the t-table for n-1 degrees of freedom and the given level of confidence.


What is a t-distribution?

Picture borrowed from: http://www.aiaccess.net/tutor_demo/tutor_t_1.htm

• A specific bell-shaped sampling distribution

• Its only parameter is (n-1), where n is the size of the sample

• (n-1) is called the degrees of freedom

• The shape depends on the degrees of freedom (n-1)

• The t-distribution approaches the standard normal distribution for larger values of n

• Values of t are tabulated for different degrees of freedom and right-tail areas
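The claim that the t-distribution approaches the standard normal can be checked numerically. The sketch below evaluates the standard t density formula with `math.lgamma` and compares it with the N(0, 1) density on a grid; the helper names `t_pdf`, `normal_pdf` and `max_gap`, and the choice of grid, are assumptions made for this illustration.

```python
from math import exp, lgamma, log, pi, sqrt


def t_pdf(t, df):
    """Density of the t-distribution with `df` degrees of freedom."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + t * t / df))


def normal_pdf(z):
    """Density of the standard normal distribution."""
    return exp(-z * z / 2) / sqrt(2 * pi)


def max_gap(df, grid_step=0.1, limit=4.0):
    """Largest absolute density gap between t(df) and N(0, 1) on [-limit, limit]."""
    points = int(round(limit / grid_step))
    return max(
        abs(t_pdf(i * grid_step, df) - normal_pdf(i * grid_step))
        for i in range(-points, points + 1)
    )


for df in (2, 10, 30, 1000):
    print(f"df = {df:4d}: largest gap from N(0, 1) density = {max_gap(df):.5f}")
```

The gap shrinks as the degrees of freedom grow, which is why z-values can stand in for t-values when the sample is large.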


Exercise 8.39, 8.40, 8.41

8.39(a) Area in the right tail = .05, df = 12

From the t-table, the value of t = 1.782

8.40(a) Given that n = 21 and the area in the left tail is .10

Here df = n-1 = 20. Since the t-curve is symmetric, we first find the t-value for area in the right tail = .10 and then attach a negative sign to get the required value.

For df = 20 and area in the right tail = .10, t = 1.325

For df = 20 and area in the left tail = .10, t = -1.325

8.41(a) Given that t-value = 2.467 and df = 28, find the area in the right tail.

In the first column of the t-table, look for 28. Then, in the row for 28, look for the t-value 2.467 and read the corresponding area in the top row.

Area = .01

http://lib.stat.cmu.edu/DASL/
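The table lookups above can be cross-checked without a t-table by integrating the t density numerically. This is a sketch that swaps the table for Simpson's rule; the helper names `t_pdf` and `right_tail_area` and the number of integration steps are assumptions for illustration only.

```python
from math import exp, lgamma, log, pi


def t_pdf(t, df):
    """Density of the t-distribution with `df` degrees of freedom."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + t * t / df))


def right_tail_area(t_value, df, steps=2000):
    """Area to the right of t_value (t_value >= 0), using Simpson's rule."""
    if t_value < 0:
        raise ValueError("use symmetry for negative t-values")
    if steps % 2:
        raise ValueError("steps must be even for Simpson's rule")
    # By symmetry the area right of 0 is 0.5; subtract the area on [0, t_value].
    h = t_value / steps
    total = t_pdf(0.0, df) + t_pdf(t_value, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


# t-values and degrees of freedom taken from the three exercises above.
for t_value, df in ((1.782, 12), (1.325, 20), (2.467, 28)):
    area = right_tail_area(t_value, df)
    print(f"df = {df}, t = {t_value}: right-tail area = {area:.4f}")
```

The three areas come out close to .05, .10 and .01. By symmetry, the left-tail area below -1.325 with df = 20 equals the right-tail area above 1.325.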


Exercise 8.43(a)

Given confidence level = 99%

1-α = .99 α= .01 α/2 = .005

Also given that df = 13

Thus from table for df=13 and α/2 = .005

t-value = 3.012


Exercise 8.49

X = time spent in waiting in a line to….

Assumption: X ~ N(µ,σ), both unknown

Draw a sample of size n = 16

Computed x̄ = 31, s = 7 minutes

To construct a 99% CI for µ

Note that

1. Population is approximately normal

2. Population standard deviation is unknown

3. Sample size is small

Then the formula for the CI is

$\bar{x} \pm t\,s_{\bar{x}}$, where $s_{\bar{x}} = s/\sqrt{n}$

= 31 ± t*7/√16 = 31 ± t*7/4

Computation of t-value: α/2 = .005, df = n-1 = 15, thus from the table

t = 2.947
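Carrying the formula through with the values stated above gives the interval numerically. This is only a sketch of the arithmetic the formula implies; the function name `t_interval` is hypothetical, and the t-value is passed in from the table because the Python standard library has no t-quantile function.

```python
from math import sqrt


def t_interval(x_bar, s, n, t_value):
    """Return (lower, upper, margin) for x_bar +/- t * s / sqrt(n)."""
    margin = t_value * s / sqrt(n)
    return x_bar - margin, x_bar + margin, margin


# Values from Exercise 8.49: n = 16, x-bar = 31, s = 7 minutes.
# t = 2.947 is the table value for df = 15 and alpha/2 = .005.
lower, upper, margin = t_interval(31, 7, 16, 2.947)
print(f"margin   = {margin:.5f}")
print(f"interval = ({lower:.5f}, {upper:.5f})")
```

With these inputs the margin is 2.947*7/4 = 5.15725, so the interval runs from 25.84275 to 36.15725 minutes.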


Exercise 8.49 continued