DEPARTMENT OF QUANTITATIVE METHODS & INFORMATION SYSTEMS

Introduction to Business Statistics
QM 220

Chapter 8
Estimation of the mean and proportion

Dr. Mohammad Zainal
Spring 2008

Estimation: An introduction

Estimation is a procedure by which a numerical value or values are assigned to a population parameter based on the information collected from a sample.

In inferential statistics, μ is called the true population mean and p is called the true population proportion. There are many other population parameters, such as the median, mode, variance, and standard deviation.

Examples of estimation:
mean fuel consumption for a particular model of a car
average time taken by new employees to learn a job
mean housing expenditure per month incurred by households

QM-220, M. Zainal


If we can conduct a census each time we want to find the value of a population parameter, then the estimation procedures are not needed.

For example, if the Kuwaiti Census Bureau can contact every household in Kuwait to find the mean housing expenditure of households, the result of the survey will actually be a census.

However, conducting a census is too expensive and very time consuming, and it is virtually impossible to contact every member of a population.


That is why we usually take a sample from the population and calculate the value of the appropriate sample statistic. Then we assign a value or values to the corresponding population parameter based on the value of the sample statistic.

For example, to estimate the mean housing expenditure per month of all households in Kuwait, the Census Bureau will:
take a sample of households
collect the information on the housing expenditure per month
compute the value of the sample mean
assign values to the population mean


The value assigned to a population parameter based on the value of a sample statistic is called an estimate of the population parameter.

The sample statistic used to estimate a population parameter is called an estimator.

The estimation procedure involves the following steps:
1‐ Select a sample.
2‐ Collect the required information from the members of the sample.
3‐ Calculate the value of the sample statistic.
4‐ Assign value(s) to the corresponding population parameter.

Point and interval estimates

An estimate may be a point estimate or an interval estimate.

A Point Estimate

The value of a sample statistic that is used to estimate a population parameter is called a point estimate.

Suppose the Census Bureau takes a sample of 10,000 households and determines that the mean housing expenditure per month, x̄, for this sample is $1370. Then, using x̄ as a point estimate of μ, the bureau can state that the mean housing expenditure per month, μ, for all households is about $1370.


Usually, whenever we use point estimation, we calculate the margin of error associated with that point estimation.

For the estimation of the population mean, the margin of error is calculated as follows:

\[ \text{Margin of error} = \pm 1.96\,\sigma_{\bar{x}} \quad \text{or} \quad \pm 1.96\,s_{\bar{x}} \]

An Interval Estimate

In interval estimation, instead of assigning a single value to a population parameter, an interval is constructed around the point estimate.


For example, instead of saying that the mean housing expenditure per month for all households is $1370, we may obtain an interval by subtracting a number from $1370 and adding the same number to $1370.

Then we say that this interval contains the population mean, μ.

For purposes of illustration, suppose we subtract $240 from $1370 and add $240 to $1370. Consequently, we obtain the interval ($1370 ‐ $240) to ($1370 + $240), or $1130 to $1610.


Then we state that the interval $1130 to $1610 is likely to contain the population mean, μ, and that the mean housing expenditure per month for all households is between $1130 and $1610.

This procedure is called interval estimation.

The value $1130 is called the lower limit of the interval and $1610 is called the upper limit of the interval.


The question is, what number should we add to and subtract from the point estimate?

The answer to this question depends on two considerations:
1‐ The standard deviation of the sample mean
2‐ The level of confidence to be attached to the interval

First, the larger the standard deviation, the greater is the number subtracted from and added to the point estimate. Second, the quantity subtracted and added must be larger if we want to have a higher confidence in our interval.

Confidence Level and Confidence Interval: Each interval is constructed with regard to a given confidence level and is called a confidence interval.


The confidence level associated with a confidence interval states how much confidence we have that this interval contains the true population parameter.

The confidence level is denoted by (1 ‐ α)100%, where α is the Greek letter alpha. When expressed as a probability, it is called the confidence coefficient and is denoted by 1 ‐ α.

α is called the significance level.

Any value of the confidence level can be chosen to construct a confidence interval; the more common values are 90%, 95%, and 99%. The corresponding confidence coefficients are .90, .95, and .99.


Interval estimation of a population mean: large samples

If the population standard deviation σ is not known, then we use the sample standard deviation s, in which case

\[ s_{\bar{x}} = \frac{s}{\sqrt{n}} \quad \text{is used instead of} \quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \]

The (1 ‐ α)100% confidence interval for μ is

\[ \bar{x} \pm z\,\sigma_{\bar{x}} \quad \text{if } \sigma \text{ is known} \]

\[ \bar{x} \pm z\,s_{\bar{x}} \quad \text{if } \sigma \text{ is unknown} \]

The value of z used here is read from the standard normal distribution table for the given confidence level.


The quantity \(z\,\sigma_{\bar{x}}\) (or \(z\,s_{\bar{x}}\) when σ is not known) in the confidence interval formula is called the maximum error of estimate and is denoted by E.

To find z:

1‐ Divide (1 ‐ α) by 2.

2‐ Locate the answer in the body of the standard normal distribution table and record the corresponding value of z.
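The two table-lookup steps can also be reproduced numerically. The sketch below is an illustration rather than part of the original slides: it assumes Python's standard-library `statistics.NormalDist`, and the helper name `z_for_confidence` is hypothetical. It also assumes the table lists the area between 0 and z, so that locating (1 ‐ α)/2 in the body of the table is the same as asking for the z whose cumulative area is 0.5 + (1 ‐ α)/2.

```python
from statistics import NormalDist


def z_for_confidence(confidence_level):
    """Return z such that the central area under the standard normal curve
    equals confidence_level (for example 0.95)."""
    # Step 1: divide (1 - alpha) by 2 to get the area between 0 and z.
    half_area = confidence_level / 2
    # Step 2: the table lookup is replaced by the inverse CDF. The area to
    # the left of z is 0.5 (the left half of the curve) plus half_area.
    return NormalDist().inv_cdf(0.5 + half_area)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence level: z = {z_for_confidence(level):.3f}")
```

Printed tables give z to two decimal places, which is why rounded values such as 1.65 and 2.58 are commonly used for the 90% and 99% levels.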


Example: A publishing company has just published a new college textbook. Before the company decides the price at which to sell this textbook, it wants to know the average price of all such textbooks in the market. The research department at the company took a sample of 36 comparable textbooks and collected information on their prices. This information produced a mean price of $90.50 for this sample. It is known that the standard deviation of the prices of all such textbooks is $7.50.

(a) What is the point estimate of the mean price of all such college textbooks? What is the margin of error for this estimate?

(b) Construct a 90% confidence interval for the mean price of all such college textbooks.
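One possible way to work through the textbook example is sketched below. The slides do not show a solution, so this is an illustration under stated assumptions: the variable names are hypothetical, the margin of error uses the 1.96 multiplier from the margin of error formula, and z for the 90% level is computed with `statistics.NormalDist` instead of being read from a table.

```python
from math import sqrt
from statistics import NormalDist

n = 36          # sample size
x_bar = 90.50   # sample mean price ($)
sigma = 7.50    # known population standard deviation ($)

# Standard deviation of the sample mean
sigma_x_bar = sigma / sqrt(n)

# (a) Point estimate and margin of error
point_estimate = x_bar
margin_of_error = 1.96 * sigma_x_bar

# (b) 90% confidence interval
z_90 = NormalDist().inv_cdf(0.5 + 0.90 / 2)   # about 1.645
lower = x_bar - z_90 * sigma_x_bar
upper = x_bar + z_90 * sigma_x_bar

print(f"point estimate  = ${point_estimate:.2f}")
print(f"margin of error = ±${margin_of_error:.2f}")
print(f"90% CI: ${lower:.2f} to ${upper:.2f}")
```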


Example: According to CardWeb.com, the mean bank credit card debt for households was $7868 in 2004. Assume that this mean was based on a random sample of 900 households and that the standard deviation of such debts for all households in 2004 was $2070. Make a 99% confidence interval for the 2004 mean bank credit card debt for all households.
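A worked solution for the credit card example might look as follows. The slides do not show one, so this is a sketch that assumes the rounded table value z = 2.58 for a 99% confidence level.

\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{2070}{\sqrt{900}} = 69 \]

\[ \bar{x} \pm z\,\sigma_{\bar{x}} = 7868 \pm 2.58(69) = 7868 \pm 178.02 = \$7689.98 \text{ to } \$8046.02 \]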


The width of a confidence interval depends on the size of the maximum error, E, which depends on the values of z, σ, and n. Why?

But we have no control over σ. Why?

So, the width we can control depends only on:

1‐ The value of z, which depends on the confidence level.
2‐ The sample size n.

The value of z increases as the confidence level increases.

For the same value of σ, an increase in n decreases the value of \(\sigma_{\bar{x}}\), which in turn decreases the size of E when the confidence level remains unchanged.


If we want to decrease the width of a confidence interval, we have two choices:

1‐ Lower the confidence level.
2‐ Increase the sample size.

Lowering the confidence level is not a good choice because a lower confidence level may give less reliable results.

Increasing the sample size, n, is the best way to decrease the width of a confidence interval.


Confidence level and the width of the confidence interval

Reconsider the last example (the bank credit card debt example). Suppose all the information given in that example remains the same. First, let us decrease the confidence level to 95%.

From the normal distribution table, z = 1.96 for a 95% confidence level. Then, using z = 1.96 and \(\sigma_{\bar{x}} = 2070/\sqrt{900} = 69\) in the confidence interval, we obtain

\[ \bar{x} \pm z\,\sigma_{\bar{x}} = 7868 \pm 1.96(69) = 7868 \pm 135.24 = \$7732.76 \text{ to } \$8003.24 \]

The 95% confidence interval is narrower than the 99% interval.


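Sample size and the width of the confidence interval

Reconsider the last example (the bank credit card debt example). Suppose we change n to 2500 and all other information remains the same. Then \(\sigma_{\bar{x}} = 2070/\sqrt{2500} = 41.40\), and with the rounded table value z = 2.58 for a 99% confidence level the interval becomes

\[ \bar{x} \pm z\,\sigma_{\bar{x}} = 7868 \pm 2.58(41.40) = 7868 \pm 106.81 = \$7761.19 \text{ to } \$7974.81 \]

The width of the confidence interval for n = 2500 is smaller than that for n = 900.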
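Both effects on the width can be checked side by side. The following is a minimal sketch, assuming the standard deviation from the credit card example and using `statistics.NormalDist` for z; the function name `interval_width` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def interval_width(sigma, n, confidence_level):
    """Width (upper limit minus lower limit) of the z-based interval for the mean."""
    z = NormalDist().inv_cdf(0.5 + confidence_level / 2)
    max_error = z * sigma / sqrt(n)   # maximum error of estimate, E
    return 2 * max_error


sigma = 2070  # standard deviation of debts from the credit card example ($)

for n in (900, 2500):
    for level in (0.95, 0.99):
        width = interval_width(sigma, n, level)
        print(f"n = {n:4d}, {level:.0%} level: width = ${width:.2f}")
```

For a fixed n the width grows with the confidence level, and for a fixed confidence level it shrinks as n grows, because E is proportional to 1/√n.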


Example: The standard deviation for a population is 6.30. A random sample selected from this population gave a mean equal to 81.90.

a. Make a 99% confidence interval for μ assuming n = 36.
b. Make a 99% confidence interval for μ assuming n = 81.
c. Make a 99% confidence interval for μ assuming n = 100.
d. Does the width of the confidence intervals constructed in parts a through c decrease as the sample size increases? Why?
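The three intervals can be computed in a short loop. This is a sketch and not the official solution: it assumes the rounded table value z = 2.58 for a 99% confidence level, and the variable names are illustrative.

```python
from math import sqrt

sigma = 6.30    # population standard deviation
x_bar = 81.90   # sample mean
z = 2.58        # assumed table value for a 99% confidence level

widths = []
for n in (36, 81, 100):
    max_error = z * sigma / sqrt(n)   # E = z * sigma / sqrt(n)
    lower, upper = x_bar - max_error, x_bar + max_error
    widths.append(upper - lower)
    print(f"n = {n:3d}: {lower:.2f} to {upper:.2f} (width {upper - lower:.2f})")
```

The widths shrink as n grows, since the standard deviation of the sample mean, σ/√n, gets smaller while z stays the same.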

Interval estimation of a population proportion: large samples

Many times we want to estimate the population proportion.

Examples:
The production manager of a company wants to estimate the proportion of defective items produced on a machine.
A bank manager may want to know the percentage of customers who are satisfied with the bank services.

Recall:
The sampling distribution of the sample proportion p̂ is (approximately) normal.
The mean of the sampling distribution of p̂ is equal to the population proportion p.
The standard deviation of the sampling distribution of the sample proportion is \(\sigma_{\hat{p}} = \sqrt{pq/n}\). Because p is unknown, it is estimated by \(s_{\hat{p}} = \sqrt{\hat{p}\hat{q}/n}\), where \(\hat{q} = 1 - \hat{p}\).


The margin of error is

\[ z\,s_{\hat{p}} \quad \text{where} \quad s_{\hat{p}} = \sqrt{\frac{\hat{p}\hat{q}}{n}} \]

The (1 ‐ α)100% confidence interval for p is

\[ \hat{p} \pm z\,s_{\hat{p}} \]


Example: According to a 2002 survey, 20% of Americans needed legal advice during the past year to resolve such thorny issues as family trusts and landlord disputes. Suppose a recent sample of 1000 adult Americans showed that 20% of them needed legal advice during the past year to resolve such family‐related issues.

(a) What is the point estimate of the population proportion? What is the margin of error for this estimate?

(b) Construct a 99% confidence interval for the proportion of all adult Americans who needed legal advice during the past year.
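The legal advice example can be worked through as in the sketch below. It is an illustration under stated assumptions: the margin of error uses the 1.96 multiplier, the 99% interval uses the rounded table value z = 2.58, and the variable names are hypothetical.

```python
from math import sqrt

n = 1000       # sample size
p_hat = 0.20   # sample proportion (point estimate of p)
q_hat = 1 - p_hat

# Estimated standard deviation of the sample proportion
s_p_hat = sqrt(p_hat * q_hat / n)

# (a) Margin of error for the point estimate
margin_of_error = 1.96 * s_p_hat

# (b) 99% confidence interval, assuming the table value z = 2.58
z = 2.58
lower = p_hat - z * s_p_hat
upper = p_hat + z * s_p_hat

print(f"point estimate  = {p_hat:.2f}")
print(f"margin of error = ±{margin_of_error:.4f}")
print(f"99% CI: {lower:.3f} to {upper:.3f}")
```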


Example: According to the analysis of a CNN‐USA TODAY‐Gallup poll conducted in October 2002, "Stress has become a common part of everyday life in the United States. The demands of work, family, and home place an increasing burden on the average American." According to this poll, 40% of Americans included in the survey indicated that they had a limited amount of time to relax (Gallup.com, November 8, 2002). The poll was based on a randomly selected national sample of 1502 adults aged 18 and older. Construct a 95% confidence interval for the corresponding population proportion.
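A worked solution for the poll example, offered as a sketch because the slides do not show one. It uses z = 1.96 for the 95% confidence level, with p̂ = .40, q̂ = .60, and n = 1502.

\[ s_{\hat{p}} = \sqrt{\frac{\hat{p}\hat{q}}{n}} = \sqrt{\frac{(.40)(.60)}{1502}} \approx .01264 \]

\[ \hat{p} \pm z\,s_{\hat{p}} = .40 \pm 1.96(.01264) = .40 \pm .025 = .375 \text{ to } .425 \]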


Example:
a. A sample of 400 observations taken from a population produced a sample proportion of .63. Make a 95% confidence interval for p.
b. Another sample of 400 observations taken from the same population produced a sample proportion of .59. Make a 95% confidence interval for p.
c. Another sample of 400 observations taken from the same population produced a sample proportion of .67. Make a 95% confidence interval for p.
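The three parts differ only in the sample proportion, so they can be sketched with one loop. This assumes z = 1.96 for the 95% confidence level; the variable names are illustrative.

```python
from math import sqrt

n = 400
z = 1.96   # table value for a 95% confidence level

intervals = []
for p_hat in (0.63, 0.59, 0.67):
    # Estimated standard deviation of the sample proportion
    s_p_hat = sqrt(p_hat * (1 - p_hat) / n)
    lower, upper = p_hat - z * s_p_hat, p_hat + z * s_p_hat
    intervals.append((lower, upper))
    print(f"p-hat = {p_hat:.2f}: {lower:.3f} to {upper:.3f}")
```

Each sample gives a different interval for the same p, which illustrates that a confidence interval depends on the particular sample drawn.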

Determining the sample size for the estimation of mean

The main reason why we usually conduct a sample survey instead of a census is our limited resources.

If a smaller sample can serve our purpose, then there is no need to take a bigger sample.

Suppose we run a test to estimate the mean life of a battery. If 40 batteries can give us the required confidence interval, why should we waste our money by buying more batteries?

The question is how we can decide the minimum sample size needed to produce a confidence interval with a given α and a given maximum error E.


Recall that E is a function of z, σ, and n. That is,

\[ E = z \cdot \frac{\sigma}{\sqrt{n}} \]

If we fix z, σ, and E, we can solve for n. The sample size can be found using

\[ n = \frac{z^{2}\sigma^{2}}{E^{2}} \]

If we don’t know σ, then s can be used instead by taking a pilot sample of any arbitrary size.


Example: An alumni association wants to estimate the mean debt of this year's college graduates. It is known that the population standard deviation of the debts of this year's college graduates is $11,800. How large a sample should be selected so that the estimate with a 99% confidence level is within $800 of the population mean?
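The sample size formula for the mean can be applied to the alumni example as sketched below. The function name `sample_size_for_mean` is hypothetical, and z = 2.58 is the assumed rounded table value for a 99% confidence level. The result is rounded up, because rounding down would leave the maximum error slightly above the target.

```python
from math import ceil


def sample_size_for_mean(z, sigma, max_error):
    """Smallest n for which z * sigma / sqrt(n) does not exceed max_error."""
    n_raw = (z ** 2) * (sigma ** 2) / (max_error ** 2)
    # Always round up to a whole observation. The inner round() only
    # guards against floating-point noise in n_raw.
    return ceil(round(n_raw, 6))


n = sample_size_for_mean(z=2.58, sigma=11_800, max_error=800)
print(n)  # 1449
```

With the unrounded z of about 2.576 the required n would come out slightly smaller, so the table value errs on the safe side.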

Determining the sample size for the estimation of proportion

As with the mean, we can determine the sample size for estimating the proportion.

The only difference is the standard deviation.

The sample size can be found using

\[ E = z \cdot \sqrt{\frac{pq}{n}} \quad \Longrightarrow \quad n = \frac{z^{2}\,p\,q}{E^{2}} \]

If p is not known, we choose a conservative sample size n by using p = q = .50. Why?

Alternatively, we can estimate p using a preliminary sample.


Example: Lombard Electronics Company has just installed a new machine that makes a part that is used in clocks. The company wants to estimate the proportion of these parts produced by this machine that are defective. The company manager wants this estimate to be within .02 of the population proportion for a 95% confidence level. What is the most conservative estimate of the sample size that will limit the maximum error to within .02 of the population proportion?
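The conservative choice can be illustrated numerically. In the sketch below, the function name `sample_size_for_proportion` is hypothetical and z = 1.96 is used for the 95% confidence level. The loop shows that the product pq, and therefore n, is largest at p = q = .50, so that choice cannot underestimate the required sample size.

```python
from math import ceil


def sample_size_for_proportion(z, p, max_error):
    """Smallest n for which z * sqrt(p * q / n) does not exceed max_error."""
    q = 1 - p
    n_raw = (z ** 2) * p * q / (max_error ** 2)
    # round() guards against floating-point noise before rounding up.
    return ceil(round(n_raw, 6))


# Compare several candidate values of p for E = .02 and z = 1.96.
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    n_needed = sample_size_for_proportion(1.96, p, 0.02)
    print(f"p = {p:.1f}: p*q = {p * (1 - p):.2f}, n = {n_needed}")

n_conservative = sample_size_for_proportion(z=1.96, p=0.5, max_error=0.02)
print(n_conservative)  # 2401
```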


Example: Consider the previous example again. Suppose a preliminary sample of 200 parts produced by this machine showed that 7% of them are defective. How large a sample should the company select so that the 95% confidence interval for p is within .02 of the population proportion?
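A worked sketch for this example, using the preliminary-sample values p̂ = .07 and q̂ = .93 in place of p and q, with z = 1.96 for the 95% confidence level:

\[ n = \frac{z^{2}\,\hat{p}\,\hat{q}}{E^{2}} = \frac{(1.96)^{2}(.07)(.93)}{(.02)^{2}} = \frac{.25008816}{.0004} \approx 625.22 \]

Rounding up gives n = 626, far fewer than the 2401 required under the most conservative assumption p = q = .50.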

Interval estimation of a population mean: small samples

In a previous section, we considered estimating the population mean for large samples (n ≥ 30).

Using the central limit theorem (CLT), we assumed that the sampling distribution of the sample mean is approximately normal regardless of the shape of the population and whether or not σ is known.

Unfortunately, many times we are restricted to small samples due to the nature of the experiment.

For instance:
Clinical trials
Space missions


If we are dealing with small sample sizes, we will have two scenarios:

1‐ The original population is normal and σ is known.
2‐ The original population is (approximately) normal and σ is unknown.

In the first scenario, we use the normal distribution to construct the confidence interval for μ.

In the second scenario, we can’t use the normal distribution to construct the confidence interval for μ. Instead, we will use another distribution called the t‐distribution.


Conditions under which the t‐distribution is used to make a confidence interval about μ:

1‐ The population from which the sample is drawn is (approximately) normally distributed.
2‐ The sample size is small (that is, n < 30).
3‐ The population standard deviation, σ, is not known.

The t distribution

The t distribution is a specific type of bell‐shaped distribution with a lower height and a wider spread than the standard normal distribution. As the sample size becomes larger, the t distribution approaches the standard normal distribution.


The t distribution has only one parameter, called the degrees of freedom (df). The mean of the t distribution is equal to 0 and its standard deviation is √[df/(df ‐ 2)] (for df > 2).

The units of the t distribution are denoted by t.

df = n – 1


Example: Find the value of t for n = 10 and .05 area in the right tail. Also, find its standard deviation.

Solution: df = n – 1 = 9 → standard deviation = √(9/7) = 1.134

The required value of t for 9 df and .05 area in the right tail is t = 1.833.
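The table value can be cross-checked numerically. The sketch below is an illustration, not the method used in the slides: it integrates the t density with Simpson's rule and finds t by bisection, using only the Python standard library. All function names are hypothetical. In practice the t table or a statistics package would be used instead.

```python
from math import gamma, pi, sqrt


def t_pdf(t, df):
    """Density of the t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def t_right_tail_area(t, df, steps=2000):
    """Area to the right of t (for t >= 0), via Simpson's rule on [0, t]."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    return 0.5 - area_0_to_t   # the curve is symmetric about 0


def t_value(right_tail_area, df):
    """Find the t with the given right-tail area by bisection."""
    low, high = 0.0, 100.0
    for _ in range(60):
        mid = (low + high) / 2
        if t_right_tail_area(mid, df) > right_tail_area:
            low = mid    # tail still too large, so t must be larger
        else:
            high = mid
    return (low + high) / 2


df = 10 - 1
print(f"standard deviation = {sqrt(df / (df - 2)):.3f}")
print(f"t for {df} df and .05 right-tail area = {t_value(0.05, df):.3f}")
```

Because the t distribution is symmetric about 0, the value for a .05 area in the left tail is the negative of the right-tail value.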


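Example: Find the value of t for n = 10 and .05 area in the left tail. Also, find its standard deviation.

Solution: df = n – 1 = 9 → standard deviation = √(9/7) = 1.134

Because the t distribution is symmetric about 0, the required value of t for 9 df and .05 area in the left tail is t = ‐1.833.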


Confidence interval for μ using the t distribution

If the following three conditions hold true, we use the t distribution to make a confidence interval about μ.

1‐ The population from which the sample is drawn is (approximately) normally distributed.
2‐ The sample size is small (that is, n < 30).
3‐ The population standard deviation, σ, is not known.

The (1 ‐ α)100% confidence interval for μ for small samples is

\[ \bar{x} \pm t\,s_{\bar{x}} \quad \text{where} \quad s_{\bar{x}} = \frac{s}{\sqrt{n}} \]

The value of t is obtained from the t distribution table for n ‐ 1 df and the given confidence level.


Example: A doctor wanted to estimate the mean cholesterol level for all adult men living in Dasmah. He took a sample of 25 adult men from Dasmah and found that the mean cholesterol level for this sample is 186 with a standard deviation of 12. Assume that the cholesterol levels for all adult men in Dasmah are (approximately) normally distributed. Construct a 95% confidence interval for the population mean μ.
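The cholesterol example can be sketched as follows. The slides do not show a solution, so this is an illustration: it assumes the t-table value 2.064 for 24 df and .025 area in each tail, and the variable names are hypothetical.

```python
from math import sqrt

n = 25        # sample size
x_bar = 186   # sample mean cholesterol level
s = 12        # sample standard deviation

df = n - 1    # 24 degrees of freedom
t = 2.064     # assumed t-table value for 24 df and .025 area in each tail

s_x_bar = s / sqrt(n)        # estimated standard deviation of the sample mean
max_error = t * s_x_bar
lower, upper = x_bar - max_error, x_bar + max_error

print(f"95% CI: {lower:.2f} to {upper:.2f}")
```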


Example: Twenty‐five randomly selected adults who buy books for general reading were asked how much they usually spend on books per year. The sample produced a mean of $1450 and a standard deviation of $300 for such annual expenses. Assume that such expenses for all adults who buy books for general reading have an approximate normal distribution. Determine a 99% confidence interval for the corresponding population mean μ.
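A worked sketch for this example, assuming the t-table value t = 2.797 for df = 25 ‐ 1 = 24 and .005 area in each tail:

\[ s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{300}{\sqrt{25}} = 60 \]

\[ \bar{x} \pm t\,s_{\bar{x}} = 1450 \pm 2.797(60) = 1450 \pm 167.82 = \$1282.18 \text{ to } \$1617.82 \]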
