DEPARTMENT OF QUANTITATIVE METHODS & INFORMATION SYSTEMS
Introduction to Business Statistics, QM 220
Chapter 8: Estimation of the mean and proportion
Dr. Mohammad Zainal, Spring 2008
Estimation: An introduction
Estimation is a procedure by which a numerical value or values are assigned to a population parameter based on the information collected from a sample.
In inferential statistics, μ is called the true population mean and p is called the true population proportion. There are many other population parameters, such as the median, mode, variance, and standard deviation.
Examples of estimation:
- mean fuel consumption for a particular model of a car
- average time taken by new employees to learn a job
- mean housing expenditure per month incurred by households
QM-220, M. Zainal
If we can conduct a census each time we want to find the value of a population parameter, then the estimation procedures are not needed.
For example, if the Kuwaiti Census Bureau can contact every household in Kuwait to find the mean housing expenditure of households, the result of the survey will actually be a census.
However, conducting a census is too expensive and very time consuming, and it is virtually impossible to contact every member of a population.
That is why we usually take a sample from the population and calculate the value of the appropriate sample statistic. Then we assign a value or values to the corresponding population parameter based on the value of the sample statistic.
For example, to estimate the mean housing expenditure per month of all households in Kuwait, the Census Bureau will:
- take a sample of households
- collect the information on the housing expenditure per month
- compute the value of the sample mean
- assign values to the population mean
The value assigned to a population parameter based on the value of a sample statistic is called an estimate of the population parameter.
The sample statistic used to estimate a population parameter is called an estimator.
The estimation procedure involves the following steps:
- Select a sample.
- Collect the required information from the members of the sample.
- Calculate the value of the sample statistic.
- Assign value(s) to the corresponding population parameter.
Point and interval estimates
An estimate may be a point estimate or an interval estimate.
A Point Estimate
The value of a sample statistic that is used to estimate a population parameter is called a point estimate.
Suppose the Census Bureau takes a sample of 10,000 households and determines that the mean housing expenditure per month, $\bar{x}$, for this sample is $1370. Then, using $\bar{x}$ as a point estimate of μ, the bureau can state that the mean housing expenditure per month, μ, for all households is about $1370.
Usually, whenever we use point estimation, we calculate the margin of error associated with that point estimate.
For the estimation of the population mean, the margin of error is calculated as follows:
Margin of error = $\pm 1.96\,\sigma_{\bar{x}}$ or $\pm 1.96\,s_{\bar{x}}$
An Interval Estimate
In interval estimation, instead of assigning a single value to a population parameter, an interval is constructed around the point estimate.
For the example, instead of saying that the mean housing expenditure per month for all households is $1370, we may obtain an interval by subtracting a number from $1370 and adding the same number to $1370.
Then we say that this interval contains the population mean, μ.
For purposes of illustration, suppose we subtract $240 from $1370 and add $240 to $1370. Consequently, we obtain the interval ($1370 ‐ $240) to ($1370 + $240), or $1130 to $1610.
Then we state that the interval $1130 to $1610 is likely to contain the population mean, μ, and that the mean housing expenditure per month for all households is between $1130 and $1610.
This procedure is called interval estimation.
The value $1130 is called the lower limit of the interval and $1610 is called the upper limit of the interval.
The question is, what number should we add to and subtract from the point estimate?
The answer to this question depends on two considerations:
- The standard deviation of the sample mean
- The level of confidence to be attached to the interval
First, the larger the standard deviation, the greater is the number subtracted from and added to the point estimate.
Second, the quantity subtracted and added must be larger if we want to have a higher confidence in our interval.
Confidence Level and Confidence Interval: Each interval is constructed with regard to a given confidence level and is called a confidence interval.
The confidence level associated with a confidence interval states how much confidence we have that this interval contains the true population parameter.
The confidence level is denoted by (1 ‐ α)100%, where α is the Greek letter alpha. When expressed as probability, it is called the confidence coefficient and is denoted by 1 – α.
α is called the significance level.
Any value of the confidence level can be chosen to construct a confidence interval; the more common values are 90%, 95%, and 99%. The corresponding confidence coefficients are .90, .95, and .99.
Interval estimation of a population mean: large samples
If the population standard deviation σ is not known, then we use the sample standard deviation s, in which case
$s_{\bar{x}} = s/\sqrt{n}$ is used instead of $\sigma_{\bar{x}} = \sigma/\sqrt{n}$.
The (1 ‐ α)100% confidence interval for μ is
$\bar{x} \pm z\,\sigma_{\bar{x}}$ if σ is known
$\bar{x} \pm z\,s_{\bar{x}}$ if σ is unknown
The value of z used here is read from the standard normal distribution table for the given confidence level.
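One way to read the confidence level is as a long-run success rate: if many samples are drawn and an interval $\bar{x} \pm z\,\sigma_{\bar{x}}$ is built from each, roughly (1 ‐ α)100% of those intervals should contain μ. The sketch below simulates this in Python. It assumes a hypothetical normal population (μ = 1370 and σ = 600 are illustrative values, not taken from the slides), and the helper name `coverage` is introduced here only for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, confidence, trials, seed=0):
    """Fraction of simulated intervals x_bar ± z * sigma / sqrt(n) that contain mu."""
    rng = random.Random(seed)                       # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # z for the chosen confidence level
    e = z * sigma / sqrt(n)                         # half-width of every interval
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n from the hypothetical population.
        x_bar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if x_bar - e <= mu <= x_bar + e:
            hits += 1
    return hits / trials

# Hypothetical population values, chosen only for this illustration.
print(coverage(mu=1370, sigma=600, n=36, confidence=0.95, trials=2000))
```

The printed fraction should land near .95; it will not equal .95 exactly because the number of trials is finite.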
The quantity $z\,\sigma_{\bar{x}}$ (or $z\,s_{\bar{x}}$ when σ is not known) in the confidence interval formula is called the maximum error of estimate and is denoted by E.
To find z:
1‐ Divide (1 ‐ α) by 2.
2‐ Locate the answer in the body of the standard normal distribution table and record the corresponding value of z.
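The table lookup above can also be done numerically. This is a minimal sketch in Python, assuming the table in question lists the area between 0 and z (which is why (1 ‐ α) is halved). The helper name `z_for_confidence` is introduced here for illustration. The standard library's `NormalDist.inv_cdf` expects a cumulative area measured from the far left, so 0.5 is added to (1 ‐ α)/2.

```python
from statistics import NormalDist

def z_for_confidence(confidence: float) -> float:
    """Return z such that the area between -z and +z under the standard normal curve equals `confidence`."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Step 1: the area between 0 and z is confidence / 2.
    # Step 2: the cumulative area up to z is therefore 0.5 + confidence / 2.
    return NormalDist().inv_cdf(0.5 + confidence / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence level: z = {z_for_confidence(level):.3f}")
```

A printed table usually gives z to two decimal places, so hand calculations may differ slightly from these values in the last digits.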
Example: A publishing company has just published a new college textbook. Before the company decides the price at which to sell this textbook, it wants to know the average price of all such textbooks in the market. The research department at the company took a sample of 36 comparable textbooks and collected information on their prices. This information produced a mean price of $90.50 for this sample. It is known that the standard deviation of the prices of all such textbooks is $7.50.
(a) What is the point estimate of the mean price of all such college textbooks? What is the margin of error for this estimate?
(b) Construct a 90% confidence interval for the mean price of all such college textbooks.
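The textbook example can be worked through in a few lines. The following Python sketch applies the formulas from the earlier slides (margin of error 1.96 $\sigma_{\bar{x}}$, interval $\bar{x} \pm z\,\sigma_{\bar{x}}$) to the numbers given in the example. The function name `z_interval` is an assumption made for this illustration, and z is computed numerically rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence):
    """Return (lower, upper, E) for the large-sample interval x_bar ± z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    sigma_x_bar = sigma / sqrt(n)  # standard deviation of the sample mean
    e = z * sigma_x_bar            # maximum error of estimate, E
    return x_bar - e, x_bar + e, e

x_bar, sigma, n = 90.50, 7.50, 36  # values stated in the example

# (a) The point estimate is the sample mean; the margin of error uses 1.96.
margin = 1.96 * sigma / sqrt(n)
print(f"point estimate = ${x_bar:.2f}, margin of error = ±${margin:.2f}")

# (b) 90% confidence interval for the mean price.
lower, upper, e = z_interval(x_bar, sigma, n, 0.90)
print(f"90% CI: ${lower:.2f} to ${upper:.2f} (E = {e:.2f})")
```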
Example: According to CardWeb.com, the mean bank credit card debt for households was $7868 in 2004. Assume that this mean was based on a random sample of 900 households and that the standard deviation of such debts for all households in 2004 was $2070. Make a 99% confidence interval for the 2004 mean bank credit card debt for all households.
The width of a confidence interval depends on the size of the maximum error, E, which depends on the values of z, σ, and n. Why?
But we have no control over σ. Why?
So, the width depends only on:
- The value of z, which depends on the confidence level.
- The sample size n.
The value of z increases as the confidence level increases.
For the same value of σ, an increase in n decreases the value of $\sigma_{\bar{x}}$, which, in turn, decreases the size of E when the confidence level remains unchanged.
If we want to decrease the width of a confidence interval, we have two choices:
- Lower the confidence level.
- Increase the sample size.
Lowering the confidence level is not a good choice because a lower confidence level may give less reliable results.
Increasing the sample size n is the best way to decrease the width of a confidence interval.
Confidence level and the width of the confidence interval
Reconsider the last example. Suppose all the information given in that example remains the same. First, let us decrease the confidence level to 95%.
From the normal distribution table, z = 1.96 for a 95% confidence level. Then, using z = 1.96 in the confidence interval, with $\sigma_{\bar{x}} = 2070/\sqrt{900} = 69$, we obtain
$\bar{x} \pm z\,\sigma_{\bar{x}} = 7868 \pm 1.96(69) = 7868 \pm 135.24$, or $7732.76 to $8003.24.
The 95% confidence interval is narrower than the 99% interval.
Sample size and the width of the confidence interval
Reconsider the last example. Suppose we change n to 2500 and all other information remains the same.
The width of the confidence interval for n = 2500 is smaller than that for n = 900.
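Both comparisons (a lower confidence level, then a larger sample) can be checked numerically. The Python sketch below assumes that "the last example" refers to the credit card debt example ($\bar{x}$ = $7868, σ = $2070, n = 900). The function name `z_interval` is hypothetical, and z is computed numerically, so the limits may differ slightly from values obtained with a rounded table z.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence):
    """Return (lower, upper) for the interval x_bar ± z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    e = z * sigma / sqrt(n)  # maximum error of estimate, E
    return x_bar - e, x_bar + e

# Credit card debt example (assumed to be "the last example").
x_bar, sigma = 7868, 2070

# Original setting, lower confidence level, larger sample size.
for confidence, n in [(0.99, 900), (0.95, 900), (0.99, 2500)]:
    lower, upper = z_interval(x_bar, sigma, n, confidence)
    print(f"{confidence:.0%}, n = {n}: ${lower:.2f} to ${upper:.2f}, "
          f"width = {upper - lower:.2f}")
```

Because E is proportional to z and to 1/√n, the second and third rows should both show a smaller width than the first.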
Example: The standard deviation for a population is 6.30. A random sample selected from this population gave a mean equal to 81.90.
a. Make a 99% confidence interval for μ assuming n = 36.
b. Make a 99% confidence interval for μ assuming n = 81.
c. Make a 99% confidence interval for μ assuming n = 100.
d. Does the width of the confidence intervals constructed in parts a through c decrease as the sample size increases? Why?
Interval estimation of a population proportion: large samples
Many times we want to estimate the population proportion.
Examples:
- The production manager of a company wants to estimate the proportion of defective items produced on a machine.
- A bank manager may want to know the percentage of customers who are satisfied with the bank services.
Recall:
- The sampling distribution of the sample proportion $\hat{p}$ is (approximately) normal.
- The mean of the sampling distribution of $\hat{p}$ is equal to the population proportion p.
- The standard deviation of the sampling distribution of the sample proportion is $\sigma_{\hat{p}} = \sqrt{pq/n}$. Because p is unknown, it is estimated by $s_{\hat{p}} = \sqrt{\hat{p}\hat{q}/n}$.
The margin of error is
$z\,s_{\hat{p}}$
The (1 ‐ α)100% confidence interval for p is
$\hat{p} \pm z\,s_{\hat{p}}$
Example: According to a 2002 survey, 20% of Americans needed legal advice during the past year to resolve such thorny issues as family trusts and landlord disputes. Suppose a recent sample of 1000 adult Americans showed that 20% of them needed legal advice during the past year to resolve such family‐related issues.
(a) What is the point estimate of the population proportion? What is the margin of error for this estimate?
(b) Construct a 99% confidence interval for all adult Americans who needed legal advice during the past year.
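As a hedged illustration of the proportion formulas, the Python sketch below works the legal advice example using $s_{\hat{p}} = \sqrt{\hat{p}\hat{q}/n}$ and the interval $\hat{p} \pm z\,s_{\hat{p}}$. The function name `proportion_interval` is an assumption made for this example. The margin of error in part (a) uses 1.96, following the convention stated earlier for the mean.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(p_hat, n, confidence):
    """Return (lower, upper, E) for the large-sample interval p_hat ± z * s_p_hat."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    s_p_hat = sqrt(p_hat * (1 - p_hat) / n)  # estimated standard deviation of p_hat
    e = z * s_p_hat
    return p_hat - e, p_hat + e, e

p_hat, n = 0.20, 1000  # values stated in the example

# (a) The point estimate is the sample proportion; the margin of error uses 1.96.
margin = 1.96 * sqrt(p_hat * (1 - p_hat) / n)
print(f"point estimate = {p_hat:.2f}, margin of error = ±{margin:.4f}")

# (b) 99% confidence interval for the population proportion.
lower, upper, e = proportion_interval(p_hat, n, 0.99)
print(f"99% CI: {lower:.4f} to {upper:.4f}")
```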
Example: According to the analysis of a CNN‐USA TODAY‐Gallup poll conducted in October 2002, "Stress has become a common part of everyday life in the United States. The demands of work, family, and home place an increasing burden on the average American." According to this poll, 40% of Americans included in the survey indicated that they had a limited amount of time to relax (Gallup.com, November 8, 2002). The poll was based on a randomly selected national sample of 1502 adults aged 18 and older. Construct a 95% confidence interval for the corresponding population proportion.
Example:
a. A sample of 400 observations taken from a population produced a sample proportion of .63. Make a 95% confidence interval for p.
b. Another sample of 400 observations taken from the same population produced a sample proportion of .59. Make a 95% confidence interval for p.
c. Another sample of 400 observations taken from the same population produced a sample proportion of .67. Make a 95% confidence interval for p.
Determining the sample size for the estimation of mean
The main reason we usually conduct a sample survey instead of a census is our limited resources.
If a smaller sample can serve our purpose, then there is no need to take a bigger sample.
Suppose we run a test to estimate the mean life of a battery. If 40 batteries can give us the required confidence interval, why should we waste our money by buying more batteries?
The question is how we can decide the minimum sample size that produces a confidence interval with a given α.
Recall that E is a function of z, σ, and n. That is,
$E = z \cdot \sigma/\sqrt{n}$
If we fix z, σ, and E and solve for n, the sample size can be found using
$n = z^2 \sigma^2 / E^2$
If we don't know σ, then s can be used instead by taking a pilot sample of any arbitrary size.
Example: An alumni association wants to estimate the mean debt of this year's college graduates. It is known that the population standard deviation of the debts of this year's college graduates is $11,800. How large a sample should be selected so that the estimate with a 99% confidence level is within $800 of the population mean?
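The alumni example is a direct application of $n = z^2 \sigma^2 / E^2$, rounded up to the next whole number. The Python sketch below is an illustration under two assumptions: the helper name `sample_size_for_mean` is introduced here, and two values of z are compared, z = 2.58 (a typical two-decimal table value for 99%) and a more precise numerical value. The answer is sensitive to that rounding.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(z, sigma, max_error):
    """Smallest whole n for which z * sigma / sqrt(n) does not exceed max_error."""
    return ceil((z * sigma / max_error) ** 2)

sigma, max_error = 11_800, 800  # values stated in the example

z_table = 2.58                                 # two-decimal table value (assumed)
z_exact = NormalDist().inv_cdf(0.5 + 0.99 / 2)  # numerical value, about 2.576

print("n with table z:  ", sample_size_for_mean(z_table, sigma, max_error))
print("n with precise z:", sample_size_for_mean(z_exact, sigma, max_error))
```

Rounding up rather than to the nearest integer keeps the maximum error at or below the target.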
Determining the sample size for the estimation of proportion
Similar to the sample mean, we can determine the sample size for estimating the proportion.
The only difference is the standard deviation.
The sample size can be found using
$E = z \cdot \sqrt{pq/n}$, which gives $n = z^2 pq / E^2$
If p is not known, we choose a conservative sample size n by using p = q = .5. Why?
Otherwise, we take a preliminary sample and use its sample proportion as an estimate of p.
Example: Lombard Electronics Company has just installed a new machine that makes a part that is used in clocks. The company wants to estimate the proportion of these parts produced by this machine that are defective. The company manager wants this estimate to be within .02 of the population proportion for a 95% confidence level. What is the most conservative estimate of the sample size that will limit the maximum error to within .02 of the population proportion?
Example: Consider the previous example again. Suppose a preliminary sample of 200 parts produced by this machine showed that 7% of them are defective. How large a sample should the company select so that the 95% confidence interval for p is within .02 of the population proportion?
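The two Lombard Electronics examples differ only in the value used for p. The Python sketch below applies $n = z^2 pq / E^2$ to both, assuming q = 1 ‐ p. The function name `sample_size_for_proportion` is hypothetical. The conservative choice p = q = .5 is the case where the product pq is largest, so it yields the largest required n.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(p, max_error, confidence):
    """Smallest whole n for which z * sqrt(p * q / n) does not exceed max_error."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return ceil(z ** 2 * p * (1 - p) / max_error ** 2)

max_error, confidence = 0.02, 0.95  # values stated in the examples

# Most conservative case: p = q = .5 maximizes p * q.
print("conservative n:", sample_size_for_proportion(0.5, max_error, confidence))

# Using the preliminary sample proportion of .07 instead.
print("n with p = .07:", sample_size_for_proportion(0.07, max_error, confidence))
```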
Interval estimation of a population mean: small samples
In a previous section, we considered estimating the population mean for large samples (n ≥ 30).
Using the CLT, we assumed that the sampling distribution of the sample mean is approximately normal regardless of the shape of the population and whether or not σ is known.
Unfortunately, many times we are restricted to small samples due to the nature of the experiment.
For instance:
- Clinical trials
- Space missions
If we are dealing with small sample sizes, we will have two scenarios:
1‐ The original population is normal and σ is known.
2‐ The original population is (approximately) normal and σ is unknown.
In the first scenario, we use the normal distribution to construct the confidence interval of μ.
In the second scenario, we can't use the normal distribution to construct the confidence interval of μ. Instead, we will use another distribution called the t‐distribution.
Conditions under which the t‐distribution is used to make a confidence interval about μ:
1‐ The population from which the sample is drawn is (approximately) normally distributed
2‐ The sample size is small (that is, n < 30)
3‐ The population standard deviation, σ, is not known
The t distribution
The t distribution is a specific type of bell‐shaped distribution with a lower height and a wider spread than the standard normal distribution.
As the sample size becomes larger, the t distribution approaches the standard normal distribution.
The t distribution has only one parameter, called the degrees of freedom (df). The mean of the t distribution is equal to 0 and its standard deviation is √[df/(df ‐ 2)], which is defined for df > 2.
The units of the t distribution are denoted by t.
The number of degrees of freedom is
df = n – 1
Example: Find the value of t for n = 10 and .05 area in the right tail. Also, find its standard deviation.
Solution: df = n – 1 = 9 → standard deviation = √(9/7) = 1.134
The required value of t for 9 df and .05 area in the right tail is read from the t distribution table: t = 1.833
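The degrees of freedom and the standard deviation in this example follow directly from df = n – 1 and √[df/(df ‐ 2)]. The Python sketch below checks that arithmetic. The helper name `t_std_dev` is an assumption made for illustration. Python's standard library has no t quantile function, so the t value itself is entered by hand from a standard t table (1.833 for 9 df and .05 in the right tail) rather than computed.

```python
from math import sqrt

def t_std_dev(df):
    """Standard deviation of the t distribution, sqrt(df / (df - 2)); requires df > 2."""
    if df <= 2:
        raise ValueError("the formula needs df > 2")
    return sqrt(df / (df - 2))

n = 10
df = n - 1  # degrees of freedom
print(f"df = {df}, standard deviation = {t_std_dev(df):.3f}")

# Table value entered by hand (not computed): 9 df, .05 area in the right tail.
t_right = 1.833
# The t distribution is symmetric about 0, so the left-tail value is the negative.
t_left = -t_right
print(f"right tail t = {t_right}, left tail t = {t_left}")
```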
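Example: Find the value of t for n = 10 and .05 area in the left tail. Also, find its standard deviation.
Solution: df = n – 1 = 9 → standard deviation = √(9/7) = 1.134
Because the t distribution is symmetric about 0, the required value of t for 9 df and .05 area in the left tail is t = ‐1.833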
Confidence interval for μ using the t distribution
If the following three conditions hold true, we use the t distribution to make a confidence interval about μ.
1‐ The population from which the sample is drawn is (approximately) normally distributed
2‐ The sample size is small (that is, n < 30)
3‐ The population standard deviation, σ, is not known
The (1 ‐ α)100% confidence interval for μ for small samples is
$\bar{x} \pm t\,s_{\bar{x}}$, where $s_{\bar{x}} = s/\sqrt{n}$
The value of t is obtained from the t distribution table for n ‐ 1 df and a given confidence level.
Example: A doctor wanted to estimate the mean cholesterol level for all adult men living in Dasmah. He took a sample of 25 adult men from Dasmah and found that the mean cholesterol level for this sample is 186 with a standard deviation of 12. Assume that the cholesterol levels for all adult men in Dasmah are (approximately) normally distributed. Construct a 95% confidence interval for the population mean μ.
Example: Twenty‐five randomly selected adults who buy books for general reading were asked how much they usually spend on books per year. The sample produced a mean of $1450 and a standard deviation of $300 for such annual expenses. Assume that such expenses for all adults who buy books for general reading have an approximate normal distribution. Determine a 99% confidence interval for the corresponding population mean μ.
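The two small-sample examples (cholesterol level and book spending) both use $\bar{x} \pm t\,s_{\bar{x}}$ with 24 df. The Python sketch below works them under stated assumptions: the function name `t_interval` is hypothetical, and the t values are entered by hand from a standard t table (2.064 for .025 in each tail, 2.797 for .005 in each tail) because the standard library does not provide a t quantile function.

```python
from math import sqrt

def t_interval(x_bar, s, n, t):
    """Return (lower, upper) for the small-sample interval x_bar ± t * s / sqrt(n)."""
    s_x_bar = s / sqrt(n)  # estimated standard deviation of the sample mean
    e = t * s_x_bar
    return x_bar - e, x_bar + e

# Table values for df = 25 - 1 = 24, entered by hand (not computed).
T_95_DF24 = 2.064  # .025 area in each tail
T_99_DF24 = 2.797  # .005 area in each tail

# Cholesterol example: x_bar = 186, s = 12, n = 25, 95% confidence.
chol_lower, chol_upper = t_interval(186, 12, 25, T_95_DF24)
print(f"95% CI for mean cholesterol: {chol_lower:.2f} to {chol_upper:.2f}")

# Book spending example: x_bar = 1450, s = 300, n = 25, 99% confidence.
book_lower, book_upper = t_interval(1450, 300, 25, T_99_DF24)
print(f"99% CI for mean spending: ${book_lower:.2f} to ${book_upper:.2f}")
```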