          Population    Sample
Size      $N$           $n$
Mean      $\mu$         $\bar{X}$
S.D.      $\sigma$      $s$
$\mu$, $\sigma$ are some parameters of the population. In general, $\mu$, $\sigma$ are not known. Suppose we want to know $\mu$ (say). We take a sample, and we will know $\bar{X}$ and $s$. So, what can we say about $\mu$? Can we say $\bar{X}$ is $\mu$?
Can we say $\bar{X}$ is close to $\mu$?
But how close is it?
To estimate $\mu$ by a single number $\bar{X}$ is too “dangerous”! It is much “safer” to estimate $\mu$ by an interval. Based on the data from a random sample, we can have the sample mean and variance; suppose that by some further calculation we can find an interval $(L, U)$
such that $P(L < \mu < U) = 95\%$ (say),
that means there’s a 95% chance that $(L, U)$ traps $\mu$.
We say $(L, U)$ is a 95% confidence interval for $\mu$.
In general, to estimate a parameter $\theta$, if we can find a random interval $(L, U)$ such that $P(L < \theta < U) = k\%$, then $(L, U)$ is called a $k\%$ confidence interval for $\theta$.
But how to find $(L, U)$?
In AL, you are required to construct confidence intervals (C.I.) for (1) the population mean and (2) the population proportion.
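Let’s talk about C.I. for $\mu$. By CLT,
$$\bar{X} \sim N\!\left(\mu, \frac{\sigma^2}{n}\right)$$
Task: Find 95% C.I. for $\mu$.
Suppose $(L, U)$ is a 95% C.I. for $\mu$: $P(L < \mu < U) = 95\%$ --- (1)
By table, $P(-1.96 < z < 1.96) = 95\%$, so
$$P\!\left(-1.96 < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96\right) = 95\%$$
Rearranging,
$$P\!\left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right) = 95\%$$
Comparing with (1),
$$\left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}},\ \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right)$$
is a 95% C.I. for $\mu$.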
How about a 99% C.I. for $\mu$? Ans:
$$\left(\bar{X} - 2.58\frac{\sigma}{\sqrt{n}},\ \bar{X} + 2.58\frac{\sigma}{\sqrt{n}}\right)$$
since $P(-2.58 < z < 2.58) = 99\%$
In general, a $k\%$ C.I. for $\mu$ is
$$\left(\bar{X} - z_c\frac{\sigma}{\sqrt{n}},\ \bar{X} + z_c\frac{\sigma}{\sqrt{n}}\right), \quad \text{where } P(-z_c < z < z_c) = k\%$$
$k\%$ is called the confidence level.
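The general formula can be sketched in a few lines of Python. This is only an illustration: the function names `z_critical` and `mean_ci` are made up here, and the critical value is computed from the standard normal distribution instead of being read from a table.

```python
from math import sqrt
from statistics import NormalDist


def z_critical(level):
    """Return z_c such that P(-z_c < Z < z_c) = level for standard normal Z."""
    # The two tails share 1 - level equally, so z_c is the (0.5 + level/2) quantile.
    return NormalDist().inv_cdf(0.5 + level / 2)


def mean_ci(xbar, sigma, n, level=0.95):
    """C.I. for the population mean when the population s.d. sigma is known."""
    half_width = z_critical(level) * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


print(round(z_critical(0.95), 2))  # 1.96
print(round(z_critical(0.99), 2))  # 2.58
```

A higher confidence level gives a larger `z_critical` and therefore a wider interval.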
Note 1: As the confidence level increases, $z_c$ increases, hence the width of the C.I. increases. Reasonable! To have more chance to “trap” the true $\mu$, we need a wider C.I. But it is close to meaningless to quote a C.I. with a very large range, e.g. to claim that we are 100% confident that the true $\mu$ lies in $(-\infty, \infty)$.
Note 2: In practice, we don’t even know $\sigma$, so we should use the sample s.d. $s$ to replace $\sigma$. More precisely, use $s\sqrt{n/(n-1)}$ instead of $s$.
E.g. 26
The masses of a random sample (in g) are 182, 184, 176, 178, 181, 180, 183, 178, 179, 177, 180, 183, 179, 178, 181, 181. If this sample came from a normal population with $\sigma = 10$ g, obtain a 95% C.I. for the mean mass of the population.
For the sample, $\bar{X} = 180$ and $n = 16$. Hence the 95% C.I. for $\mu$ is
$$\left(180 - 1.96 \cdot \frac{10}{\sqrt{16}},\ 180 + 1.96 \cdot \frac{10}{\sqrt{16}}\right) = (175.1,\ 184.9)$$
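As a check on the arithmetic of E.g. 26, the same calculation can be done in Python. The variable names are chosen here for illustration only.

```python
from math import sqrt
from statistics import mean

masses = [182, 184, 176, 178, 181, 180, 183, 178,
          179, 177, 180, 183, 179, 178, 181, 181]
sigma = 10   # population s.d. given in the question (g)
z_c = 1.96   # table value for 95% confidence

n = len(masses)
xbar = mean(masses)
half_width = z_c * sigma / sqrt(n)
ci = (xbar - half_width, xbar + half_width)

print(n, xbar)                             # 16 180
print(round(ci[0], 1), round(ci[1], 1))    # 175.1 184.9
```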
In the previous question, (175.1, 184.9) is a 95% C.I. for the true mean $\mu$. Am I right in saying that there is a 95% chance that $\mu$ lies in (175.1, 184.9)?
Note 1: $\mu$ is NOT a random variable! It is the interval $(L, U)$ that is random.
Note 2: We can just say that we are 95% confident that $\mu$ lies in $(L, U)$.
How to comprehend this?
Population
Sample 1: $\bar{X}_1$, $(L_1, U_1)$
Sample 2: $\bar{X}_2$, $(L_2, U_2)$
...
Sample n: $\bar{X}_n$, $(L_n, U_n)$
...
If $(L_1, U_1)$, $(L_2, U_2)$, …, $(L_n, U_n)$ are 95% C.I.s, then about 95% of these intervals should include the true mean $\mu$.
For twenty 95% C.I.s, about 19 of them should trap the true mean.
So (175.1, 184.9) is just one of the C.I.s, and it may or may not trap $\mu$.
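The repeated-sampling idea can be tried out with a small simulation. This is a sketch under assumed values: a normal population with mean 180 and s.d. 10, samples of size 16, and a fixed random seed, all chosen here for illustration. The function name `coverage` is also made up.

```python
import random
from math import sqrt


def coverage(mu=180.0, sigma=10.0, n=16, trials=10_000, z_c=1.96, seed=0):
    """Fraction of simulated 95% C.I.s that trap the true mean mu."""
    rng = random.Random(seed)
    half_width = z_c * sigma / sqrt(n)   # same width for every sample (sigma known)
    hits = 0
    for _ in range(trials):
        # Draw one sample and compute its mean.
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        # The interval is random; mu is fixed.
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / trials


print(coverage())  # close to 0.95
```

Each simulated interval either traps $\mu$ or misses it. Only the long-run proportion of intervals that trap $\mu$ is about 95%.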
An example.
Suppose $\{X_1, X_2, \ldots, X_7\}$ = population set.
We take 2-element samples ($n = 2$). Total number of possible samples = ${}_7C_2 = 21$.
Hence we can construct 21 different C.I.s.
We consider the 90% C.I. $\left(\bar{X} - 1.645\, s_{\bar{X}},\ \bar{X} + 1.645\, s_{\bar{X}}\right)$
See the WORDS document now.
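Of the 21 C.I.s, 19 do trap $\mu$.
Please notice that $21 \times 90\% \approx 19$. Also, the sample size is 2, too small! Instead of using
$$s_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
we use the adjusted sample s.d.
$$s_{\bar{X}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N - n}{N - 1}}$$
Refer to P.81 note (ii) in the textbook.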
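E.g. 27
A certain population has $\sigma = 6$. How large a sample is needed so that the width of the 95% C.I. for $\mu$ is 0.5?
$$\text{95\% C.I.} = \left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}},\ \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right)$$
Half width = 0.25:
$$1.96\frac{\sigma}{\sqrt{n}} = 0.25$$
$$1.96 \cdot \frac{6}{\sqrt{n}} = 0.25 \implies \sqrt{n} = 47.04 \implies n \approx 2213$$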
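The sample-size calculation can be sketched as a small function. The name `sample_size_for_width` is hypothetical. The result is rounded up so that the width does not exceed the target. Note that rounding $\sqrt{n} = 47.04$ down to 47 before squaring would give 2209, which is slightly too small.

```python
from math import ceil


def sample_size_for_width(sigma, width, z_c=1.96):
    """Smallest n for which the C.I. width 2 * z_c * sigma / sqrt(n) is at most width."""
    half_width = width / 2
    return ceil((z_c * sigma / half_width) ** 2)


print(sample_size_for_width(6, 0.5))  # 2213
```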
Do you agree?
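If $\sigma$ is known, the C.I. is
$$\left(\bar{X} - z_c\frac{\sigma}{\sqrt{n}},\ \bar{X} + z_c\frac{\sigma}{\sqrt{n}}\right)$$
If $\sigma$ is unknown, the C.I. is
$$\left(\bar{X} - z_c\frac{s}{\sqrt{n}},\ \bar{X} + z_c\frac{s}{\sqrt{n}}\right)$$
Precisely,
$$\left(\bar{X} - z_c\sqrt{\frac{n}{n-1}}\,\frac{s}{\sqrt{n}},\ \bar{X} + z_c\sqrt{\frac{n}{n-1}}\,\frac{s}{\sqrt{n}}\right)$$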
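E.g. 28
A sample of 100 plugs has mean diameter 25.10 cm. If the s.d. of these plugs is 0.12, estimate the population mean diameter at the 95% confidence level.
Now, we don’t know $\sigma$, so use the sample s.d. $s$:
$$\left(\bar{X} - z_c\sqrt{\frac{n}{n-1}}\,\frac{s}{\sqrt{n}},\ \bar{X} + z_c\sqrt{\frac{n}{n-1}}\,\frac{s}{\sqrt{n}}\right)$$
$$= \left(25.10 - 1.96\sqrt{\frac{100}{99}}\,\frac{0.12}{\sqrt{100}},\ 25.10 + 1.96\sqrt{\frac{100}{99}}\,\frac{0.12}{\sqrt{100}}\right) = (25.076,\ 25.124)$$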
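The adjusted-s.d. version of the interval can be checked numerically. This is a minimal sketch; the function name `mean_ci_unknown_sigma` is invented for illustration.

```python
from math import sqrt


def mean_ci_unknown_sigma(xbar, s, n, z_c=1.96):
    """C.I. for the mean using the sample s.d. s adjusted by sqrt(n / (n - 1))."""
    s_adjusted = s * sqrt(n / (n - 1))
    half_width = z_c * s_adjusted / sqrt(n)
    return xbar - half_width, xbar + half_width


lo, hi = mean_ci_unknown_sigma(25.10, 0.12, 100)
print(round(lo, 3), round(hi, 3))  # 25.076 25.124
```

With $n = 100$ the adjustment factor $\sqrt{100/99}$ is about 1.005, so it barely changes the interval. It matters more for small samples.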
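E.g. 31
(a) A two-stage rocket is to be fired to put a satellite into orbit. Due to variation of the specific impulse in the second stage, the velocity imparted in this stage will be normally distributed about 4095 m s$^{-1}$ with s.d. 21 m s$^{-1}$. Find 95% confidence limits for the velocity imparted in this stage.
$\bar{v} = 4095$, $s_1 = 21$, and there is a single firing, so $n = 1$.
$$\text{95\% C.I.} = \left(\bar{v} - 1.96\frac{s_1}{\sqrt{n}},\ \bar{v} + 1.96\frac{s_1}{\sqrt{n}}\right) = \left(4095 - 1.96 \cdot \frac{21}{\sqrt{1}},\ 4095 + 1.96 \cdot \frac{21}{\sqrt{1}}\right)$$
$$= (4054,\ 4136)$$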
(b) In the first stage, the velocity imparted will be normally distributed about 3990 m s$^{-1}$ with s.d. 20 m s$^{-1}$ due to variation of the specific impulse and (independently) with s.d. 8 m s$^{-1}$ due to variation in the time of burning of the charge. Find 90% confidence limits for the velocity imparted in this stage.
$s_2 = 20$, $s_3 = 8$
Combined s.d. $= \sqrt{s_2^2 + s_3^2} = \sqrt{20^2 + 8^2} = 21.54$
With $n = 1$ again,
$$\text{90\% C.I.} = \left(\bar{v} - 1.645\,\frac{\sqrt{s_2^2 + s_3^2}}{\sqrt{n}},\ \bar{v} + 1.645\,\frac{\sqrt{s_2^2 + s_3^2}}{\sqrt{n}}\right)$$
$$= (3990 - 1.645 \times 21.54,\ 3990 + 1.645 \times 21.54)$$
$$= (3955,\ 4025)$$
(c) Given that a final velocity of 8000 m s$^{-1}$ is required to go into orbit and that the second stage fires immediately after the first, find the probability of achieving orbit.
Second stage: mean 4095, variance $21^2$.
First stage: mean 3990, variance $20^2 + 8^2$.
Let $V$ = final velocity.
$E(V) = 3990 + 4095 = 8085$
$Var(V) = 20^2 + 8^2 + 21^2 = 905$
$V \sim N(8085, 905)$
$$P(V > 8000) = P\!\left(z > \frac{8000 - 8085}{\sqrt{905}}\right) = 0.9977$$
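Part (c) adds independent normal contributions, so the means and variances add. A short sketch of the same calculation using the standard library's normal distribution is shown below; the variable names are illustrative. The exact normal c.d.f. gives about 0.9976, while rounding $z$ to 2.83 for a table lookup gives 0.9977.

```python
from math import sqrt
from statistics import NormalDist

# Means add, and variances add for independent contributions.
mean_v = 3990 + 4095
var_v = 20**2 + 8**2 + 21**2

final_velocity = NormalDist(mu=mean_v, sigma=sqrt(var_v))
p_orbit = 1 - final_velocity.cdf(8000)   # P(V > 8000)

print(mean_v, var_v)       # 8085 905
print(round(p_orbit, 3))   # 0.998
```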
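Prerequisite on E.g. 32
Uniform distribution
If $X$ is uniformly distributed on $(a, b)$, its density is $f(x) = \dfrac{1}{b - a}$ for $a < x < b$.
$$E(X) = \int_a^b x \cdot \frac{1}{b - a}\,dx = \frac{a + b}{2}$$
$$Var(X) = E(X^2) - \left(E(X)\right)^2 = \int_a^b x^2 \cdot \frac{1}{b - a}\,dx - \left(\frac{a + b}{2}\right)^2 = \frac{(b - a)^2}{12}$$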
E.g. 32
$10^4$ numbers are to be added, each of which was rounded off to an accuracy of $10^{-m}$. Assuming that the errors arising are mutually independent and uniformly distributed on $(-0.5 \times 10^{-m},\ 0.5 \times 10^{-m})$, find the limits in which the total error will lie with probability 0.99.
Let $X$ = total error. $X = X_1 + X_2 + \cdots + X_{10000}$
Since $X_i$ is uniformly distributed,
$$E(X_i) = \frac{-0.5 \times 10^{-m} + 0.5 \times 10^{-m}}{2} = 0$$
$$Var(X_i) = \frac{\left(0.5 \times 10^{-m} - (-0.5 \times 10^{-m})\right)^2}{12} = \frac{10^{-2m}}{12}$$
$$E(X) = 10000\,E(X_i) = 0$$
$$Var(X) = 10000\,Var(X_i) = \frac{10^{-2(m-2)}}{12}$$
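By CLT,
$$X \sim N\!\left(0, \frac{10^{-2(m-2)}}{12}\right)$$
By table, $P(-2.58 < z < 2.58) = 0.99$, so
$$P\!\left(-2.58 < \frac{X - 0}{\sqrt{10^{-2(m-2)}/12}} < 2.58\right) = 0.99$$
Hence the limits are
$$\pm\, 2.58 \cdot \frac{10^{-(m-2)}}{\sqrt{12}}$$
Hence we can construct the 99% interval for the total error $X$, and this estimate is far better than the worst-case bound! Let’s use $m = 3$ as an example. The worst case is $|X| \le 0.0005 \times 10^4 = 5$, too large for estimation! But the 99% interval is only $(-0.0745,\ 0.0745)$, which is much more “precise”.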
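The comparison between the worst-case bound and the probabilistic limits can be reproduced with a short sketch. The function names are hypothetical, and the critical value is passed in as a parameter. With the exact critical value (about 2.576) the limits for $m = 3$ come out near ±0.0744. A table value rounded to two decimals moves the last digit slightly.

```python
from math import sqrt
from statistics import NormalDist


def total_error_limit(m, z_c, count=10_000):
    """Limit L such that the total rounding error lies in (-L, L), by the CLT."""
    var_single = (10.0 ** (-m)) ** 2 / 12    # variance of one uniform error
    sd_total = sqrt(count * var_single)      # variances add for independent errors
    return z_c * sd_total


def worst_case_bound(m, count=10_000):
    """Bound on |X| if every single error were as large as possible."""
    return count * 0.5 * 10.0 ** (-m)


z_99 = NormalDist().inv_cdf(0.995)           # exact 99% critical value
print(round(total_error_limit(3, z_99), 4))  # 0.0744
print(round(worst_case_bound(3), 4))         # 5.0
```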
Now, let’s talk about C.I. for a proportion.
Suppose you want to look into the proportion of smokers in H.K.
You have interviewed 100 H.K. people and discovered 60 smokers.
Can we say the proportion of smokers among H.K. people is 60%?
However, we can construct a C.I. to estimate the true proportion!
Let $n$ be the sample size.
Let $m$ be the number of “successes” (i.e. “smokers” in the e.g.).
Let $p$ be the true proportion (of “success”), and $q = 1 - p$.
Suppose the population is very large,
then $m$ has a binomial distribution such that
$m \sim B(n, p)$
Suppose further that $n$ is reasonably large.
We can use “normal” to approximate “binomial”:
$m \sim N(np, npq)$ approximately.
Let $P_s$ be the proportion of “success” in the sample.
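$$P_s = \frac{m}{n}$$
$$E(P_s) = E\!\left(\frac{m}{n}\right) = \frac{np}{n} = p$$
$$Var(P_s) = Var\!\left(\frac{m}{n}\right) = \frac{Var(m)}{n^2} = \frac{npq}{n^2} = \frac{pq}{n}$$
Hence $P_s \sim N\!\left(p, \dfrac{pq}{n}\right)$.
In practice, $p$ is unknown. We use $P_s Q_s / n$ to estimate $pq/n$, where $Q_s = 1 - P_s$.
Thus $P_s \sim N\!\left(p, \dfrac{P_s Q_s}{n}\right)$ approximately.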
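Hence
$$P\!\left(-1.96 < \frac{P_s - p}{\sqrt{P_s Q_s / n}} < 1.96\right) = 0.95$$
Rearranging,
$$P\!\left(P_s - 1.96\sqrt{\frac{P_s Q_s}{n}} < p < P_s + 1.96\sqrt{\frac{P_s Q_s}{n}}\right) = 0.95$$
Hence the 95% C.I. for the population proportion $p$ is
$$\left(P_s - 1.96\sqrt{\frac{P_s Q_s}{n}},\ P_s + 1.96\sqrt{\frac{P_s Q_s}{n}}\right)$$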
In general, a $k\%$ C.I. for the population proportion $p$ is
$$\left(P_s - z_c\sqrt{\frac{P_s Q_s}{n}},\ P_s + z_c\sqrt{\frac{P_s Q_s}{n}}\right)$$
where $P(-z_c < z < z_c) = k\%$.
$n > 30$ is required.
E.g. 34
Of 4000 items, 240 are defective. Find a 95% C.I. for the probability $p$ that an item is defective.
$P_s = \dfrac{240}{4000} = 0.06$, $Q_s = 1 - 0.06 = 0.94$
$$\sqrt{\frac{P_s Q_s}{n}} = \sqrt{\frac{0.06 \times 0.94}{4000}} = 0.00375$$
Required 95% C.I. is
$$(0.06 - 1.96 \times 0.00375,\ 0.06 + 1.96 \times 0.00375) = (0.0526,\ 0.0674)$$
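The proportion interval translates directly into code. The sketch below uses a made-up function name, `proportion_ci`, and applies it to the defective-items data and to the earlier smoker survey (60 smokers out of 100).

```python
from math import sqrt


def proportion_ci(successes, n, z_c=1.96):
    """Normal-approximation C.I. for a population proportion."""
    p_s = successes / n
    q_s = 1 - p_s
    standard_error = sqrt(p_s * q_s / n)
    return p_s - z_c * standard_error, p_s + z_c * standard_error


lo, hi = proportion_ci(240, 4000)
print(round(lo, 4), round(hi, 4))   # 0.0526 0.0674

lo, hi = proportion_ci(60, 100)
print(round(lo, 3), round(hi, 3))   # 0.504 0.696
```

The smoker interval is much wider because the sample is only 100 people.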
E.g. 35
Suppose that we know $p = 0.6$ for a Bernoulli population. How large a sample is necessary to be 95% confident that the obtained value $P_s$ lies in (0.5, 0.7)?
$(0.5, 0.7) = (0.6 - 0.1,\ 0.6 + 0.1)$
Let $n$ = sample size.
Hence, for 95% confidence,
$$0.1 = 1.96\sqrt{\frac{0.6 \times 0.4}{n}}$$
On solving,
$n \approx 92$
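Solving the equation for $n$ gives $n = z_c^2\,p(1 - p)/d^2$, where $d$ is the allowed deviation. A minimal sketch, with a hypothetical function name:

```python
def sample_size_for_proportion(p, margin, z_c=1.96):
    """Value of n solving margin = z_c * sqrt(p * (1 - p) / n)."""
    return z_c**2 * p * (1 - p) / margin**2


n_exact = sample_size_for_proportion(0.6, 0.1)
print(round(n_exact, 1))  # 92.2
```

The exact solution is about 92.2, so a sample of about 92 is needed. Taking 93 guarantees that the deviation does not exceed 0.1.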
E.g. 37
(a) Of 50 houseflies, independently subjected to the same insecticide, 38 were killed. Obtain an estimate of $p$, the probability that a housefly is killed by the insecticide. Find also the standard error of the estimate.
$P_s = \dfrac{38}{50} = \dfrac{19}{25}$
$$\text{Standard error} = \sqrt{\frac{P_s Q_s}{n}} = \sqrt{\frac{(19/25)(1 - 19/25)}{50}} = 0.0604$$
(b) Now conduct a larger experiment with the same insecticide so that an estimate with standard error of about 0.03 can be quoted. On the basis of the information in the experiment already conducted, how many houseflies are needed?
$$\text{Standard error} = \sqrt{\frac{P_s Q_s}{n}} = 0.03: \qquad \sqrt{\frac{(19/25)(6/25)}{n}} = 0.03$$
So $n = 203$.
(c) To be absolutely sure of obtaining the desired accuracy, how many houseflies should be taken?
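The standard error depends on $P_s$. $n = 203$ makes the standard error 0.03 only when $P_s = 19/25$.
So what $n$ ensures s.e. $\le 0.03$ irrespective of $P_s$?
For fixed $n$, the s.e. is a function of $P_s$:
$$\text{s.e.} = \sqrt{\frac{P_s(1 - P_s)}{n}}$$
s.e. $\le 0.03$ for every $P_s$ means that the maximum of the s.e. is at most 0.03.
It is easy to show that $P_s(1 - P_s)$ attains its maximum when $P_s = 0.5$.
$$\text{Hence s.e.} \le \sqrt{\frac{0.5(1 - 0.5)}{n}} = \frac{1}{\sqrt{4n}}$$
Then set $\dfrac{1}{\sqrt{4n}} \le 0.03$, which gives $n \ge 277.8$, so $n = 278$.
i.e. Though different samples yield different $P_s$, it is sure that the s.e. is not greater than 0.03 if we take $n = 278$ (or more).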
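The three parts of E.g. 37 can be sketched together. The function names below are invented for illustration. The required $n$ is rounded up so that the standard error does not exceed the target. For the worst case $P_s = 0.5$, the smallest such $n$ comes out at 278, and any larger $n$ also satisfies the bound.

```python
from math import ceil, sqrt


def standard_error(p_s, n):
    """Estimated standard error of a sample proportion."""
    return sqrt(p_s * (1 - p_s) / n)


def n_for_standard_error(p_s, target):
    """Smallest n with sqrt(p_s * (1 - p_s) / n) <= target."""
    return ceil(p_s * (1 - p_s) / target**2)


print(round(standard_error(38 / 50, 50), 4))   # (a) 0.0604
print(n_for_standard_error(38 / 50, 0.03))     # (b) 203
print(n_for_standard_error(0.5, 0.03))         # (c) worst case P_s = 0.5: 278
```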