
Chapter 4: Estimation


CHAPTER 4 : ESTIMATION

Sub-Topic

Introduction.

Point estimation

Interval estimation.

Confidence interval for population mean.

Confidence interval for a difference between two means.

Confidence interval for a population variance.

Confidence interval for ratio of two variances.

Chapter Learning Outcome

Estimate confidence intervals for a single population mean, the difference between two population means, a population variance and the ratio of two variances.

Learning Objective

By the end of this chapter, students should be able to:

Construct confidence intervals for population means and variances.

Choose which distribution should be used to construct a confidence interval.

Choose an appropriate sample size.

Key Term (English to Bahasa Melayu)

English Bahasa Melayu

1. Point estimation → Penganggar titik

2. Interval estimation → Penganggar selang

3. The confidence level → Aras keyakinan

4. Confidence coefficient → Pekali keyakinan


4.1 Introduction

One of the major applications of statistics is estimating population parameters from sample statistics. If a parameter is unknown, we can find an estimator for it and use the resulting estimate in place of the parameter. However, a single estimate does not tell us how reliable it is. This is where confidence intervals come in. Instead of reporting only a point estimate, we report an interval produced by a method that captures the parameter with a stated confidence, such as 95% (or some other percentage). As an example of a parameter estimation problem, suppose that a structural engineer is analyzing the tensile strength of components. Because variability in tensile strength is naturally present between the individual components, owing to differences in raw material batches, manufacturing processes and measurement procedures, the engineer is interested in estimating the mean tensile strength of the components. Knowledge of the statistical sampling properties of the estimator used would enable the engineer to establish the precision of the estimate.

4.2 Point Estimation

Definition 1

A point estimate is a single numerical value that is used to estimate an unknown population parameter.

Example 1

For example, a poll may seek to estimate the proportion of adult residents of a city

that support a proposition to build a new sports stadium. Out of a random sample of

200 people, 106 say they support the proposition. Thus in the sample, 0.53 of the

people supported the proposition. This value of 0.53 is called a point estimate of the

population proportion. It is called a point estimate because the estimate consists of a

single value or point.


Example 2

Suppose a random variable $X$ is normally distributed with unknown population mean $\mu$. After the sample has been selected, the numerical value of $\bar{x}$ is the point estimate of $\mu$. Thus, if the data are $x_1 = 25$, $x_2 = 30$, $x_3 = 29$, $x_4 = 33$, the point estimate of $\mu$ is

$$\bar{x} = \frac{25 + 30 + 29 + 33}{4} = 29.25$$

4.3 Interval estimation

Definition 2

A confidence interval is a set of (real) numbers between two values that is likely to contain the parameter being estimated.

Definition 3

The confidence level of an interval estimate of a parameter is the probability that the interval estimate will contain the parameter. Point estimates are usually supplemented by confidence intervals.

Theory 1

A confidence interval estimate for $\mu$ is an interval of the form $l < \mu < u$. There is a probability of $(1-\alpha)$ of constructing an interval that will contain the true value of $\mu$, where

$$P\{l < \mu < u\} = 1 - \alpha$$

The relationship between $\alpha$ and the confidence level is that the stated confidence level is the percentage equivalent of the decimal value of $(1-\alpha)$.

Definition 4

The end-points or bounds $l$ and $u$ are called the lower-confidence limit and upper-confidence limit, respectively.


Definition 5

$(1-\alpha)$ is called the confidence coefficient.

Example 3

For a 95% confidence interval, $\alpha = 0.05$, since $1 - 0.05 = 0.95$. When $\alpha = 0.01$, then $1 - 0.01 = 0.99$, and the 99% confidence interval is being calculated.

Example 4

If the pollster used a method that contains the parameter 95% of the time it is used, he

or she would arrive at the following 95% confidence interval: 0.46 < π < 0.60. The

pollster would then conclude that somewhere between 0.46 and 0.60 of the population

supports the proposal. The media usually reports this type of result by saying that

53% favor the proposition with a margin of error of 7%.

4.4 Confidence Interval for Single Mean

4.4.1 Large Sample: $n \ge 30$ or $\sigma$ known

Theory 2

If the random variable $X$ has a normal distribution with mean $\mu$ and variance $\sigma^2$, then the sample mean $\bar{X}$ also has a normal distribution with mean $\mu$, but with variance $\sigma^2/n$ (refer to the previous chapter). In other words, $\bar{X} \sim N(\mu, \sigma^2/n)$. In fact, if we have a random variable $X$ which has any distribution (not necessarily normal), by the central limit theorem the distribution of $\bar{X}$ will be approximately normal with mean $\mu$ and variance $\sigma^2/n$ for large $n$, i.e. $n \ge 30$.

Standardizing this, we get:

$$Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$

(since the mean is $\mu$ and the standard deviation is $\sigma/\sqrt{n}$).

From the Normal Distribution section, we know that $P(-1.96 < Z < 1.96) = 0.95$. Hence,

$$P\left(-1.96 < \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} < 1.96\right) = 0.95$$

Rearranging this, we get

$$P\left(\bar{x} - 1.96\,\sigma/\sqrt{n} < \mu < \bar{x} + 1.96\,\sigma/\sqrt{n}\right) = 0.95$$

Thus, the 95% confidence interval for $\mu$ is:

$$\bar{x} - 1.96\,\sigma/\sqrt{n} < \mu < \bar{x} + 1.96\,\sigma/\sqrt{n}$$

The general form is:

$$\bar{x} - z_{\alpha/2}\,\sigma/\sqrt{n} < \mu < \bar{x} + z_{\alpha/2}\,\sigma/\sqrt{n} \tag{4.1}$$

or

$$\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n} \tag{4.2}$$

If the value of $\sigma$ is unknown or not given, the above formula is used with $\sigma$ replaced by $s$ (the sample standard deviation). Hence

$$\bar{x} \pm z_{\alpha/2}\,s/\sqrt{n} \tag{4.3}$$

The term $z_{\alpha/2}\,\sigma/\sqrt{n}$ is called the maximum error of estimate, $E$. For a specific value, if $\alpha = 0.05$, 95% of the sample means will fall within this error value on either side of the population mean. Refer to Figure 1.

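Figure 1: Normal curve with a central area of 95% and an area of $\alpha/2 = 0.025$ in each tail; the central region extends $z_{\alpha/2}\,\sigma/\sqrt{n}$ on either side of the population mean.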


Definition 6

The maximum error of estimate is the maximum likely difference between the point

estimate of a parameter and the actual value of the parameter.

Example 5

Suppose that 100 samples of water from a fresh water lake are taken and the calcium concentration (milligrams per liter) is measured. The average is 0.66 mg/l and the standard deviation is 0.049 mg/l. Construct a 95% confidence interval for the population mean.

Answer Example 5

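$n = 100$, $\bar{x} = 0.66$, $s = 0.049$

$1 - \alpha = 0.95$, so $\alpha = 0.05$

$z_{\alpha/2} = z_{0.05/2} = z_{0.025} = 1.96$

$$\bar{x} \pm z_{\alpha/2}\,s/\sqrt{n} = 0.66 \pm 1.96\left(0.049/\sqrt{100}\right) = 0.66 \pm 0.0096$$

$$0.6504 < \mu < 0.6696$$

That is, based on the sample data, we are 95% confident that the population mean calcium concentration of the fresh water lake is between 0.6504 mg/l and 0.6696 mg/l.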
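The calculation in Example 5 can be checked with a short script. This is only a sketch using Python's standard library; the function name `z_interval` and its parameters are illustrative and not part of the chapter. It applies formula (4.3), with the critical value taken from the standard normal distribution instead of a printed table.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, s, n, confidence):
    """Large-sample interval for a mean: xbar ± z_{alpha/2} * s / sqrt(n)."""
    alpha = 1 - confidence
    # Upper alpha/2 critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * s / sqrt(n)  # maximum error of estimate, E
    return xbar - margin, xbar + margin


# Data of Example 5: n = 100, mean 0.66 mg/l, standard deviation 0.049 mg/l.
lower, upper = z_interval(xbar=0.66, s=0.049, n=100, confidence=0.95)
print(f"{lower:.4f} < mu < {upper:.4f}")
```

The script uses the unrounded critical value (about 1.96), so its limits agree with the hand calculation to four decimal places.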

Theory 3

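Sample size determination is closely related to statistical estimation. Quite often, one asks how large a sample is necessary to make an accurate estimate. To determine the minimum sample size for finding a confidence interval for the mean, the formula for sample size is derived from the maximum error of estimate formula,

$$E = z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \tag{4.4}$$

and this formula is solved for $n$ as follows.

$$\sqrt{n}\,E = z_{\alpha/2}\,\sigma$$

$$\sqrt{n} = \frac{z_{\alpha/2}\,\sigma}{E}$$

Hence,

$$n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2 \tag{4.5}$$

If the result is not a whole number, it is rounded up to the next whole number.

From (4.2), the confidence interval is $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$. Since $E = z_{\alpha/2}\,\sigma/\sqrt{n}$, the confidence interval for $\mu$ can also be written as $\bar{x} \pm E$.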

Example 6

A manufacturer is interested in the output voltage of a power supply used in a PC. Output voltage is assumed to be normally distributed, with standard deviation 1.25 V. How large a sample must be selected if he wants to be 99% confident that the true mean differs from the sample mean by at most 0.3 V?

Answer Example 6

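$\sigma = 1.25$, $1 - \alpha = 0.99$, $\alpha = 0.01$, $E = 0.3$

$z_{\alpha/2} = z_{0.01/2} = z_{0.005} = 2.5758$

$$n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2 = \left(\frac{2.5758\,(1.25)}{0.3}\right)^2 = 115.1866 \approx 116$$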
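Formula (4.5), together with the rounding-up step, can be sketched in a few lines of Python. The helper name `min_sample_size` is hypothetical, and the critical value is computed from the standard normal distribution, so it may differ slightly from a table value in the last decimal places.

```python
from math import ceil
from statistics import NormalDist


def min_sample_size(sigma, max_error, confidence):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= max_error, as in (4.5)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Always round up: rounding down would give a margin larger than max_error.
    return ceil((z * sigma / max_error) ** 2)


# Data of Example 6: sigma = 1.25 V, E = 0.3 V, 99% confidence.
n = min_sample_size(sigma=1.25, max_error=0.3, confidence=0.99)
print(n)
```

Rounding up rather than to the nearest integer guarantees that the achieved maximum error does not exceed the requested $E$.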


4.4.2 Small Sample: $n < 30$ and $\sigma$ Unknown

Theory 4

When $\sigma^2$ is unknown, a logical procedure is to replace $\sigma^2$ with the sample variance $s^2$. The random variable $Z$ now becomes

$$T = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

which has a $t$ distribution with $n - 1$ degrees of freedom. Hence the confidence interval for the population mean is given by:

$$\bar{x} \pm t_{\alpha/2,\,v}\,\frac{s}{\sqrt{n}} \tag{4.6}$$

with degrees of freedom $v = n - 1$.

Example 7

The compressive strength of concrete is being tested by a civil engineer. He tests 10 specimens and the data obtained are presented in Table 1.

Table 1

2590 2530 2510 2566 2541

2557 2582 2550 2583 2599

Construct a 98% confidence interval for the mean strength of the concrete.

Answer Example 7

$\bar{x} = 2560.8$, $s = 28.5455$, $n = 10$

$1 - \alpha = 0.98$, so $\alpha = 0.02$

$t_{\alpha/2,\,v} = t_{0.02/2,\,10-1} = t_{0.01,\,9} = 2.821$

$$\bar{x} \pm t_{\alpha/2,\,v}\,s/\sqrt{n} = 2560.8 \pm 2.821\left(28.5455/\sqrt{10}\right) = 2560.8 \pm 25.4648$$

$$2535.3352 < \mu < 2586.2648$$

That is, based on the sample data, we are 98% confident that the population mean strength of the concrete is between 2535.3352 and 2586.2648.
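The steps of Example 7 can be reproduced from the raw data in Table 1. The sketch below is an illustration only: the name `t_interval` is made up, and because Python's standard library has no $t$ quantile function, the critical value $t_{0.01,9} = 2.821$ is passed in as read from a $t$ table.

```python
from math import sqrt
from statistics import mean, stdev

# Table 1: strength measurements of 10 concrete specimens.
strengths = [2590, 2530, 2510, 2566, 2541, 2557, 2582, 2550, 2583, 2599]

T_CRIT = 2.821  # t_{0.01, 9}, taken from a t table (98% confidence, v = 9)


def t_interval(data, t_crit):
    """Small-sample interval for a mean: xbar ± t_{alpha/2, v} * s / sqrt(n)."""
    n = len(data)
    xbar = mean(data)
    s = stdev(data)  # sample standard deviation (divisor n - 1)
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin


lower, upper = t_interval(strengths, T_CRIT)
print(f"{lower:.4f} < mu < {upper:.4f}")
```

Note that `stdev` uses the divisor $n - 1$, which matches the sample standard deviation $s$ used in formula (4.6).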

Example 8

A civil engineer is analyzing the compressive strength of concrete. A random sample of 12 specimens has a mean compressive strength of 3201.33 psi and a standard deviation of 90 psi. Construct a 95% confidence interval for the mean compressive strength.

Answer Example 8

$\bar{x} = 3201.33$, $s = 90$, $n = 12$

$1 - \alpha = 0.95$, so $\alpha = 0.05$

$t_{\alpha/2,\,v} = t_{0.05/2,\,12-1} = t_{0.025,\,11} = 2.201$

$$\bar{x} \pm t_{\alpha/2,\,v}\,s/\sqrt{n} = 3201.33 \pm 2.201\left(90/\sqrt{12}\right) = 3201.33 \pm 57.1837$$

$$3144.1463 < \mu < 3258.5137$$

That is, based on the sample of 12 specimens, we are 95% confident that the mean compressive strength is between 3144.1463 psi and 3258.5137 psi.

Exercise 4.4

Objective Questions

1. A 95% confidence interval estimate can be interpreted to mean that :

(a) If all possible samples are taken and confidence interval estimates are

developed, 95% of them would include the true population mean

somewhere within their interval.

(b) You have 95% confidence that you have selected a sample whose

interval does include the population mean.

(c) b is true

(d) both a and b are true


2. Which of the following statements is false ?

(a) There is a different critical value for each level of alpha.

(b) You can construct a 100% confidence interval estimate of $\mu$.

(c) Alpha is a proportion in the tails of the distribution that is outside the

confidence interval.

(d) In practice, the population mean is the unknown quantity that is to be

estimated.

3. In the construction of confidence intervals, if all other quantities are

unchanged, an increase in the sample size will lead to ________interval.

(a) a narrower.

(b) a wider.

(c) the same.

(d) a less significant.

4. Other things being equal, as the confidence level for a confidence interval increases, the width of the interval increases.

(a) True.

(b) False.

5. The t distribution is used to construct confidence intervals for the population

mean when

(a) the population standard deviation is known and the sample size is

greater than 30.

(b) the population standard deviation is known or the sample size is

greater than 30.

(c) the population standard deviation is unknown and the sample size is

small.

(d) the population standard deviation is unknown or the sample size is

small.


Subjective Questions

6. The mean yield of a chemical process is being studied by an engineer. From previous experience with this process, the standard deviation of yield is known to be 3. He would like to be 99% confident that the estimate is accurate to within one unit of yield.

(a) Determine the error.

(b) How large a sample is necessary for this study ?

(c) If it was found that the sample mean is 10, find a 99% confidence

interval for the mean yield.

7. In the production of airbag inflators for automotive safety systems, a company

is interested to estimate the true mean of the inflator. Measurements on 20

inflators yielded an average value 2.02 cm and a standard deviation of 0.05.

Find a 98% confidence interval of the true mean.

8. Suppose that in question 7 exercise 4.4 the sample size has been increased

from 20 to 50 inflators. Given that other measurements are still the same. Is

there any change for the confidence interval of the true mean ? If the answer is

yes, find the new confidence interval of the true mean.

9. The mean breaking strength of yarn used in manufacturing drapery material is

being studied. Past experience has indicated that the standard deviation of

breaking strength is 3.1 psi. A random sample of nine specimens is tested, and

the average breaking strength is found to be 99.4 psi. Find a 95% confidence

interval of the mean breaking strength.

10. A research engineer for a tire manufacturer is investigating tire life for a new rubber compound. A sample of 18 tires was tested to end-of-life in a road test; their average life is 63,559 kilometers and the standard deviation is 2,275 kilometers. Find a 99% confidence interval for the mean tire life.


11. The college president asks the statistics teacher to estimate the average age of

the students at their college. How large a sample is necessary ? The statistics

teacher would like to be 99% confident that the estimate should be accurate

within one year. From a previous study, the standard deviation of the ages is

known to be 3 years.

12. An insurance company is trying to estimate the average number of sick days

that full-time food-service workers use per year. A pilot study found the

standard deviation to be 2.5 days. How large a sample must be selected if the

company wants to be 95% confident of getting an interval that contains the

true mean with a maximum error of one day ?

13. A restaurant owner wishes to find the 99% confidence interval of the true

mean cost of a dry martini. How large should the sample be if she wishes to

be accurate within RM0.10 ? A previous study showed that the standard

deviation of the price was RM0.12.

14. A health care professional wishes to estimate the birth weights of infants.

How large a sample must she select if she desires to be 90% confident that the

true mean is within six ounces of the sample mean ? The standard deviation of

the birth weights is known to be eight ounces.

15. Find the sample size needed to estimate the population mean to within one

fifth of a standard deviation with 99% confidence level.

Answer Exercise 4.4

Objective Questions

1. d 2. b

3. a 4. a

5. c


Subjective Questions

6. (a) 1 (b) 60 (c) (9, 11)

7. (1.9916, 2.0484) 8. (2.0036, 2.0364)

9. (97.3747, 101.4253) 10. (62,005.0268, 65,112.9732)

11. 60 12. 25

13. 10 14. 5

15. 167

4.5 Confidence Interval for a Difference Between Two Means

Theory 5

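From the previous chapter, we know that the sampling distribution of $\bar{X}_1 - \bar{X}_2$ is given by

$$\bar{X}_1 - \bar{X}_2 \sim N\left(\mu_1 - \mu_2,\ \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)$$

In the same way as for the confidence interval for a population mean, we obtain the formula for the confidence interval for the difference between two means. The confidence interval in general form is given by:

$$(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \tag{4.7}$$

or

$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \tag{4.8}$$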

We decide whether to use the Z distribution or the t distribution by the same principle as before.

If the sample size is large ($n \ge 30$) or $\sigma$ is known, we use the Z distribution.

If the sample size is small ($n < 30$) and $\sigma$ is unknown, we use the t distribution.

In other words, we use the t distribution only when the sample size is small ($n < 30$) and $\sigma$ is unknown; under all other conditions we use the Z distribution.

Z distribution case

When the value of $\sigma$ is known, we use formula (4.8), but when $\sigma$ is unknown we simply substitute $\sigma$ with the sample standard deviation $s$. Thus from formula (4.8) we obtain:

$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \tag{4.9}$$

t distribution case

The general formula used when we choose the t distribution is given by:

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,v}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \tag{4.10}$$

where $v = n_1 + n_2 - 2$.

In this situation we already know that $n_1 < 30$, $n_2 < 30$ and that the values of $\sigma_1$ and $\sigma_2$ are unknown. When we use the t distribution, we need to take note of both the sample sizes and the information on the population standard deviations. Different information on these changes the form of formula (4.10). The cases can be summarized as follows:

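Case (i): $n_1 = n_2$, $\sigma_1^2 \ne \sigma_2^2$

Since $n_1 = n_2$, we can represent both sample sizes as $n$, so $n_1 = n_2 = n$. From formula (4.10) we obtain:

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,v}\sqrt{\frac{s_1^2}{n} + \frac{s_2^2}{n}} \tag{4.11}$$

with $v = n_1 + n_2 - 2$.

Or it can be simplified as:

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,v}\sqrt{\frac{1}{n}\left(s_1^2 + s_2^2\right)} \tag{4.12}$$

with $v = 2n - 2$.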

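Case (ii): $n_1 = n_2$, $\sigma_1^2 = \sigma_2^2$

Since $\sigma_1^2 = \sigma_2^2$, we can substitute both sample variances with the pooled estimate of the variance, $S_p^2$. So $s_1^2 = s_2^2 = S_p^2$. From formula (4.10) we obtain:

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,v}\sqrt{\frac{S_p^2}{n} + \frac{S_p^2}{n}}$$

Thus,

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,v}\sqrt{\frac{2S_p^2}{n}}$$

with $v = n + n - 2$.

Or it can be simplified as:

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,v}\,S_p\sqrt{\frac{2}{n}} \tag{4.13}$$

with $v = 2n - 2$.

$S_p^2$ is a pooled estimate of the variance, which is given by:

$$S_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$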

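Case (iii): $n_1 \ne n_2$, $\sigma_1^2 = \sigma_2^2$

Since $\sigma_1^2 = \sigma_2^2$, we can substitute both sample variances with the pooled estimate of the variance, $S_p^2$. So $s_1^2 = s_2^2 = S_p^2$. From formula (4.10) we obtain:

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,v}\sqrt{\frac{S_p^2}{n_1} + \frac{S_p^2}{n_2}} \tag{4.14}$$

Or it can be simplified as:

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,v}\,S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \tag{4.15}$$

with $v = n_1 + n_2 - 2$.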

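Case (iv): $n_1 \ne n_2$, $\sigma_1^2 \ne \sigma_2^2$

To construct the confidence interval we still use the general formula in (4.10):

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,v}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

Instead of using $v = n_1 + n_2 - 2$, we must use the following formula to calculate the degrees of freedom:

$$v = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$$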

Example 9

A consumer organization collected data on two types of automobile batteries, A and B. Both populations are normally distributed, with standard deviations of 1.29 for batteries A and 0.88 for batteries B. Summary statistics for 40 observations of each type give mean lives of 32.25 hours and 29.81 hours for batteries A and batteries B, respectively. Construct a 90% confidence interval for the difference between the mean life hours of batteries A and batteries B.

Answer Example 9

$\sigma_A = 1.29$, $\sigma_B = 0.88$

$n_A = n_B = 40$

$\bar{x}_A = 32.25$, $\bar{x}_B = 29.81$

Since $n_A \ge 30$ and $n_B \ge 30$, and $\sigma_A^2$ and $\sigma_B^2$ are known, we use the Z distribution.

$\alpha = 0.10$, $\alpha/2 = 0.05$ and $z_{\alpha/2} = 1.6449$.

Using formula (4.8):

$$(\bar{x}_A - \bar{x}_B) \pm z_{\alpha/2}\sqrt{\frac{\sigma_A^2}{n_A} + \frac{\sigma_B^2}{n_B}} = (32.25 - 29.81) \pm 1.6449\sqrt{\frac{1.29^2}{40} + \frac{0.88^2}{40}} = 2.44 \pm 0.4061$$

$$2.0339 < \mu_A - \mu_B < 2.8461$$

We are 90% confident that the difference between the mean life hours of batteries A and batteries B is between 2.0339 and 2.8461 hours.
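As a cross-check of Example 9, formula (4.8) can be written as a small Python function. The name `two_mean_z_interval` and its argument names are illustrative assumptions; the critical value comes from the standard normal distribution in the standard library.

```python
from math import sqrt
from statistics import NormalDist


def two_mean_z_interval(x1, x2, sd1, sd2, n1, n2, confidence):
    """Interval for mu1 - mu2 with known standard deviations, formula (4.8)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Standard error of the difference between the two sample means.
    std_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    diff = x1 - x2
    return diff - z * std_error, diff + z * std_error


# Data of Example 9: batteries A and B, 40 observations each, 90% confidence.
lower, upper = two_mean_z_interval(32.25, 29.81, 1.29, 0.88, 40, 40, 0.90)
print(f"{lower:.4f} < mu_A - mu_B < {upper:.4f}")
```

The same function covers formula (4.9) if the sample standard deviations are passed in place of the population values.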

Example 10

The diameter of steel rods manufactured on two different extrusion machines is being investigated. Two random samples of size $n_1 = 15$ and $n_2 = 15$ are selected; the sample means are 8.69 cm and 8.51 cm and the sample variances are 0.30 and 0.44 for Machine 1 and Machine 2, respectively. Construct a 95% confidence interval for the difference between the mean diameters of steel rods manufactured by Machine 1 and Machine 2.

Answer Example 10

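$s_1^2 = 0.30$, $s_2^2 = 0.44$

$n_1 = 15$, $n_2 = 15$

$\bar{x}_1 = 8.69$, $\bar{x}_2 = 8.51$

Since $n_1 < 30$ and $n_2 < 30$, and $\sigma_1^2$ and $\sigma_2^2$ are not known, we use the t distribution.

$n_1 = n_2$, so this is case (i) for the t distribution, and we use formula (4.11) in its simplified form (4.12):

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,v}\sqrt{\frac{1}{n}\left(s_1^2 + s_2^2\right)}$$

With $v = 2n - 2$, $v = 2(15) - 2 = 28$.

$1 - \alpha = 0.95$, $\alpha = 0.05$, $\alpha/2 = 0.025$ and $t_{\alpha/2,\,28} = 2.048$

Substituting in the formula:

$$(8.69 - 8.51) \pm 2.048\sqrt{\frac{1}{15}\left(0.30 + 0.44\right)} = 0.18 \pm 0.4549$$

$$-0.2749 < \mu_1 - \mu_2 < 0.6349$$

We are 95% confident that the difference between the mean diameters of steel rods manufactured by Machine 1 and Machine 2 is between -0.2749 cm and 0.6349 cm.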

Example 11

Reconsider the study on the mean diameter of steel rods in Example 10. If it is given that the population variances are equal for both machines, construct the new confidence interval for the difference between the mean diameters of steel rods manufactured by Machine 1 and Machine 2 at the same confidence level.

Answer Example 11

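$s_1^2 = 0.30$, $s_2^2 = 0.44$

$n_1 = 15$, $n_2 = 15$

$\bar{x}_1 = 8.69$, $\bar{x}_2 = 8.51$

We still use the t distribution.

Since $\sigma_1^2 = \sigma_2^2$ and $n_1 = n_2$, this is case (ii) for the t distribution, and we use formula (4.13):

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,v}\,S_p\sqrt{\frac{2}{n}}$$

With $v = 2n - 2$, $v = 2(15) - 2 = 28$.

$1 - \alpha = 0.95$, $\alpha = 0.05$, $\alpha/2 = 0.025$ and $t_{\alpha/2,\,28} = 2.048$

$$S_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(15 - 1)(0.30) + (15 - 1)(0.44)}{15 + 15 - 2} = 0.37$$

$$S_p = 0.6083$$

Substituting in formula (4.13):

$$(8.69 - 8.51) \pm 2.048\,(0.6083)\sqrt{\frac{2}{15}} = 0.18 \pm 0.4549$$

$$-0.2749 < \mu_1 - \mu_2 < 0.6349$$

We are 95% confident that the difference between the mean diameters of steel rods manufactured by Machine 1 and Machine 2 is between -0.2749 cm and 0.6349 cm.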

Example 12

Reconsider the study on the mean diameter of steel rods in Example 10. Suppose that the population variances are equal for both machines and that the sample size for Machine 1 has been changed from 15 to 10. Construct the new confidence interval for the difference between the mean diameters of steel rods manufactured by Machine 1 and Machine 2 at the same confidence level.

Answer Example 12

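$s_1^2 = 0.30$, $s_2^2 = 0.44$

$n_1 = 10$, $n_2 = 15$

$\bar{x}_1 = 8.69$, $\bar{x}_2 = 8.51$

We still use the t distribution.

Since $\sigma_1^2 = \sigma_2^2$ and $n_1 \ne n_2$, this is case (iii) for the t distribution, and we use formula (4.15):

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,v}\,S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

With $v = n_1 + n_2 - 2$, $v = 10 + 15 - 2 = 23$.

$1 - \alpha = 0.95$, $\alpha = 0.05$, $\alpha/2 = 0.025$ and $t_{\alpha/2,\,23} = 2.069$

$$S_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(10 - 1)(0.30) + (15 - 1)(0.44)}{10 + 15 - 2} = 0.3852$$

$$S_p = 0.6206$$

Substituting in formula (4.15), with the factor $\sqrt{1/n_1 + 1/n_2}$ because the sample sizes differ:

$$(8.69 - 8.51) \pm 2.069\,(0.6206)\sqrt{\frac{1}{10} + \frac{1}{15}} = 0.18 \pm 0.5242$$

$$-0.3442 < \mu_1 - \mu_2 < 0.7042$$

We are 95% confident that the difference between the mean diameters of steel rods manufactured by Machine 1 and Machine 2 is between -0.3442 cm and 0.7042 cm.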
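The pooled-variance calculation for case (iii) can be sketched as follows, applying formula (4.15) with the factor $\sqrt{1/n_1 + 1/n_2}$ to the data of Example 12. The function names are hypothetical, and the critical value $t_{0.025,23} = 2.069$ is supplied from a $t$ table because the standard library does not provide $t$ quantiles.

```python
from math import sqrt


def pooled_variance(s1_sq, s2_sq, n1, n2):
    """Pooled estimate S_p^2 of a common variance."""
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)


def pooled_margin(s1_sq, s2_sq, n1, n2, t_crit):
    """Margin t * S_p * sqrt(1/n1 + 1/n2) from formula (4.15)."""
    sp = sqrt(pooled_variance(s1_sq, s2_sq, n1, n2))
    return t_crit * sp * sqrt(1 / n1 + 1 / n2)


# Data of Example 12: n1 = 10, n2 = 15, sample variances 0.30 and 0.44.
T_CRIT = 2.069  # t_{0.025, 23}, taken from a t table
sp2 = pooled_variance(0.30, 0.44, 10, 15)
margin = pooled_margin(0.30, 0.44, 10, 15, T_CRIT)
diff = 8.69 - 8.51
print(f"Sp^2 = {sp2:.4f}")
print(f"{diff - margin:.4f} < mu1 - mu2 < {diff + margin:.4f}")
```

When $n_1 = n_2 = n$, the factor $\sqrt{1/n_1 + 1/n_2}$ reduces to $\sqrt{2/n}$, which is the form used in formula (4.13).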

Example 13

Reconsider the study on the mean diameter of steel rods in Example 10. Suppose that the population variances are not equal for the two machines and that the sample size for Machine 1 has been changed from 15 to 10. Construct the new confidence interval for the difference between the mean diameters of steel rods manufactured by Machine 1 and Machine 2 at the same confidence level.

Answer Example 13

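$s_1^2 = 0.30$, $s_2^2 = 0.44$

$n_1 = 10$, $n_2 = 15$

$\bar{x}_1 = 8.69$, $\bar{x}_2 = 8.51$

We still use the t distribution.

Since $\sigma_1^2 \ne \sigma_2^2$ and $n_1 \ne n_2$, this is case (iv) for the t distribution, and we use formula (4.10):

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,v}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

where

$$v = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}} = \frac{\left(\dfrac{0.30}{10} + \dfrac{0.44}{15}\right)^2}{\dfrac{\left(0.30/10\right)^2}{10 - 1} + \dfrac{\left(0.44/15\right)^2}{15 - 1}} = 21.80 \approx 22$$

$1 - \alpha = 0.95$, $\alpha = 0.05$, $\alpha/2 = 0.025$ and $t_{\alpha/2,\,22} = 2.074$

Substituting in formula (4.10):

$$(8.69 - 8.51) \pm 2.074\sqrt{\frac{0.30}{10} + \frac{0.44}{15}} = 0.18 \pm 0.5052$$

$$-0.3252 < \mu_1 - \mu_2 < 0.6852$$

We are 95% confident that the difference between the mean diameters of steel rods manufactured by Machine 1 and Machine 2 is between -0.3252 cm and 0.6852 cm.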
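The degrees-of-freedom formula for case (iv) is easy to mistype by hand, so a short check may help. The sketch below evaluates it for the data of Example 13; the function name `unequal_variance_df` is an illustrative choice, not terminology from the chapter.

```python
def unequal_variance_df(s1_sq, n1, s2_sq, n2):
    """Approximate degrees of freedom for case (iv): unequal variances."""
    a = s1_sq / n1  # estimated variance of the first sample mean
    b = s2_sq / n2  # estimated variance of the second sample mean
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


# Data of Example 13: s1^2 = 0.30 with n1 = 10, s2^2 = 0.44 with n2 = 15.
v = unequal_variance_df(0.30, 10, 0.44, 15)
print(round(v, 2))
```

The result is generally not a whole number, so it has to be rounded before a $t$ table is consulted; the example rounds it to 22.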

Exercise 4.5

1. For the following problems, state whether the population standard deviations are known or unknown. Then choose the appropriate distribution for constructing the confidence interval for the difference in means.

(a) The burning rates of two different solid-fuel propellants used in aircrew escape systems are being studied. It is known that both propellants have approximately the same standard deviation of burning rate, namely 4 cm/second. Two random samples of 25 specimens each are tested, one for each propellant.

(b) The behavior of a stochastic generator in the presence of external noise is studied. The number of periods was measured in a sample of 100 trains for each of two different levels of noise voltage, 100 mV and 200 mV. For 100 mV, the mean number of periods in a train was 7.9 with a variance of 4.7. For 200 mV, the mean number of periods in a train was 6.5 with a variance of 5.1.

(c) A polymer is manufactured in a batch chemical process. Viscosity

measurements are normally made on each batch, and from past

experience the process has indicated that the variability in the process

is fairly stable with standard deviation of 17. Random sample of 15 are

taken to find their viscosity measurements. A process change is made

which involves switching the catalyst used in the process. Following

the process change, 20 batch viscosity measurements are taken.

Assume the process variability is unaffected by the catalyst change.

(d) The overall distance traveled by a golf ball is tested by hitting the ball with Iron Byron, a mechanical golfer with a swing that is said to emulate the legendary champion, Byron Nelson. Ten randomly selected balls of each of two different brands are tested and the overall distance measured. The data obtained are presented in Table 2.

Table 2

Brand A 275 286 287 260 277 273 269 282 281 275

Brand B 258 245 275 270 266 274 269 266 273 273

(e) A researcher wanted to find out the intentions of young students to enroll in IT courses in the future. He randomly selected fourth, fifth and sixth graders and recorded how many IT courses they intend to take. The information obtained is presented in Table 3.

Table 3

Group n Mean Standard deviation

Males 190 2.82 1.41

Females 220 2.42 1.30


2. Refer to question 1(e) of Exercise 4.5. Suppose that the sample sizes for both males and females have been reduced to 21. Construct a 90% confidence interval for the difference between males and females in the mean number of IT courses planned to be taken in future.

3. A group of dietitians is investigating a diet-modification program to stimulate weight loss. Seven volunteers have participated, and their weights (in kilograms) before and after participation in the program are shown in Table 4:

Table 4

Individual 1 2 3 4 5 6 7

Before 60 71 63 59 75 70 68

After 57 69 56 51 72 66 62

Construct a 98% confidence interval for the difference in weight before and after the program.

4. The usefulness of two different design languages in improving programming tasks has been studied. Forty expert programmers, who are familiar with both languages, are asked to code a standard function in both languages, and the time (in seconds) is recorded. For Design Language 1, the mean time is 255 s with a standard deviation of 26 s, and for Design Language 2, the mean time is 319 s with a standard deviation of 17 s. Construct a 95% confidence interval for the difference in mean coding times between Design Language 1 and Design Language 2.

5. Two types of plastics are suitable for use by an electronics component manufacturer. The breaking strength of these plastics has been studied. It is known that both types have the same standard deviation. The information obtained from random samples is presented in Table 5.

Table 5

Type Sample Size Mean Variance

Type 1 13 173.5 3.7

Type 2 13 160.9 2.9

Construct a 99% confidence interval for the difference in mean breaking strength of the two types of plastics.

6. A study to investigate the melting point of two types of alloy is conducted by melting 50 samples of alloy Type I and 60 samples of alloy Type II. The sample mean and standard deviation for alloy Type I were 139°C and 27°C, while for alloy Type II they were 155°C and 31°C, respectively. Construct a confidence interval estimate for the difference in mean melting point of the two types of alloy at significance level 0.02.

7. The deflection temperature under load for two different types of plastic pipe is normally distributed, with a standard deviation of 16°C for Type A pipe and 21°C for Type B pipe. Two random samples of 13 pipe specimens, one from each pipe type, are tested, and their mean deflection temperatures are 125°C and 97°C for Type A pipe and Type B pipe, respectively. Find a 99% confidence interval for the difference in mean deflection temperatures between the two types of pipe.

8. A chemical engineer wishes to know the difference in mean yield between two catalysts that have been used in a certain chemical process. Table 6 shows the yield data from his experiment.

Table 6

Catalyst X 80 88 76 83 85 85 81 79 75

Catalyst Y 77 85 81 76 81 74 79 80 83

Find a 99% confidence interval for the difference between the mean yields of the two catalysts.

9. Researchers conducted a study to determine whether magnets are effective in treating back pain. A comparison study was conducted by giving 9 volunteers a treatment using magnets, and the pain was measured before and after treatment. Pain was measured using the visual analog scale. Before the treatment, the mean pain is 12.4 with a standard deviation of 2.1. After the treatment, the mean pain is 9.5 with a standard deviation of 2.8. Find a 98% confidence interval for the difference between mean pain before and after the treatment. Assume the population variances before and after the treatment are equal.

10. A study to see the difference between the BMI of men and women was conducted. A random sample of 11 men yields an average BMI of 28.9 and a standard deviation of 6.4. A random sample of 13 women yields an average BMI of 26.1 and a standard deviation of 4. Find a 99% confidence interval for the difference between mean BMI for men and women.

Answer Exercise 4.5

1. (a) known, Z – distribution

(b) unknown, t – distribution

(c) known, Z – distribution

(d) unknown, t – distribution

(e) unknown, Z – distribution

2. (-0.3048, 1.1048) 3. (-5.1154, 14.544)

4. (-73.627, -54.373) 5. (10.6071, 14.5929)

6. (-28.8678, -3.1323) 7. (9.1393, 46.8607)

8. (-3.6243, 7.1791) 9. (-0.1135, 5.914)

10. (-3.6505, 9.2505)


4.6 Confidence Interval for Population Variance

4.6.1 Confidence Interval for Single Population Variance

Theory 6

If $s^2$ is the sample variance from a random sample of $n$ observations from a normal distribution with unknown population variance $\sigma^2$, a confidence interval on $\sigma^2$ is given as:

$$\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,v}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,v}} \tag{4.16}$$

where $\chi^2_{\alpha/2,\,v}$ and $\chi^2_{1-\alpha/2,\,v}$ are the upper and lower $100\alpha/2$ percentage points of the chi-square distribution with $v = n - 1$ degrees of freedom, respectively.

4.6.2 Confidence Interval for Ratio of Two Population Variances

Theory 7

If $s_1^2$ and $s_2^2$ are the sample variances from random samples of sizes $n_1$ and $n_2$, respectively, from two independent normal populations with unknown population variances $\sigma_1^2$ and $\sigma_2^2$, a confidence interval on the ratio $\sigma_1^2/\sigma_2^2$ is given as:

$$\frac{s_1^2}{s_2^2}\cdot\frac{1}{f_{\alpha/2,\,v_1,\,v_2}} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{s_1^2}{s_2^2}\cdot f_{\alpha/2,\,v_2,\,v_1} \tag{4.17}$$

where $f_{\alpha/2,\,v_2,\,v_1}$ and $\dfrac{1}{f_{\alpha/2,\,v_1,\,v_2}}$ are the upper and lower $100\alpha/2$ percentage points of the F distribution with $v_1 = n_1 - 1$ and $v_2 = n_2 - 1$ degrees of freedom, respectively.


Example 14

The life in hours of a heating element used in a furnace is known to be approximately normally distributed. A random sample of 11 heating elements of Type A is selected and found to have a sample standard deviation of 9.7 hours. A random sample of 13 heating elements of Type B is selected and found to have a sample standard deviation of 11.4 hours. Construct a 95% confidence interval for the ratio of population variances $\sigma_A^2/\sigma_B^2$.

Answer Example 14

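$n_A = 11,\ s_A^2 = 9.7^2$; $n_B = 13,\ s_B^2 = 11.4^2$

$v_A = n_A - 1 = 11 - 1 = 10$

$v_B = n_B - 1 = 13 - 1 = 12$

$\alpha = 0.05,\ \alpha/2 = 0.025$

$f_{\alpha/2,\,v_B,\,v_A} = f_{0.025,\,12,\,10} = 3.62$

$\dfrac{1}{f_{\alpha/2,\,v_A,\,v_B}} = \dfrac{1}{f_{0.025,\,10,\,12}} = \dfrac{1}{3.37}$

$$\frac{s_A^2}{s_B^2}\,\frac{1}{f_{\alpha/2,\,v_A,\,v_B}} < \frac{\sigma_A^2}{\sigma_B^2} < \frac{s_A^2}{s_B^2}\,f_{\alpha/2,\,v_B,\,v_A}$$

$$\frac{9.7^2}{11.4^2}\left(\frac{1}{3.37}\right) < \frac{\sigma_A^2}{\sigma_B^2} < \frac{9.7^2}{11.4^2}\,(3.62)$$

$$0.2148 < \frac{\sigma_A^2}{\sigma_B^2} < 2.6209$$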

We are 95% confident that the ratio of population variances, $\sigma_A^2/\sigma_B^2$, is between 0.2148 and 2.6209.
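The arithmetic of Example 14 can be checked with a short Python sketch. The function name `variance_ratio_interval` and its parameter names are hypothetical; the two F critical values (3.37 and 3.62) are the ones quoted in the worked answer and are passed in by hand rather than computed.

```python
def variance_ratio_interval(s1, s2, f_v1_v2, f_v2_v1):
    """Confidence interval for sigma_1^2 / sigma_2^2 following equation (4.17).

    f_v1_v2 is f_{alpha/2, v1, v2} (used as a divisor for the lower limit);
    f_v2_v1 is f_{alpha/2, v2, v1} (used as a multiplier for the upper limit).
    """
    ratio = s1 ** 2 / s2 ** 2          # point estimate s_1^2 / s_2^2
    return ratio / f_v1_v2, ratio * f_v2_v1


# Example 14: Type A is sample 1, Type B is sample 2.
lower, upper = variance_ratio_interval(s1=9.7, s2=11.4, f_v1_v2=3.37, f_v2_v1=3.62)
print(round(lower, 4), round(upper, 4))
```

Note that the degrees of freedom swap order between the two limits, which is the step most easily mistaken when reading the F table.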


Example 15

A polymer is manufactured in a batch chemical process. A random sample of 9 viscosity measurements yields a standard deviation of 19. Following a process change, 8 batch viscosity measurements yield a standard deviation of 16. Construct a 98% confidence interval for the ratio of population variances, $\sigma_1^2/\sigma_2^2$.

Answer Example 15

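$n_1 = 9,\ s_1^2 = 19^2$

$n_2 = 8,\ s_2^2 = 16^2$

$v_1 = n_1 - 1 = 9 - 1 = 8$

$v_2 = n_2 - 1 = 8 - 1 = 7$

$\alpha = 0.02,\ \alpha/2 = 0.01$

$f_{\alpha/2,\,v_2,\,v_1} = f_{0.01,\,7,\,8} = 6.18$

$\dfrac{1}{f_{\alpha/2,\,v_1,\,v_2}} = \dfrac{1}{f_{0.01,\,8,\,7}} = \dfrac{1}{6.84}$

$$\frac{s_1^2}{s_2^2}\,\frac{1}{f_{\alpha/2,\,v_1,\,v_2}} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{s_1^2}{s_2^2}\,f_{\alpha/2,\,v_2,\,v_1}$$

$$\frac{19^2}{16^2}\left(\frac{1}{6.84}\right) < \frac{\sigma_1^2}{\sigma_2^2} < \frac{19^2}{16^2}\,(6.18)$$

$$0.2062 < \frac{\sigma_1^2}{\sigma_2^2} < 8.7148$$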

We are 98% confident that the ratio of population variances, $\sigma_1^2/\sigma_2^2$, is between 0.2062 and 8.7148.


Exercise 4.6

1. The percentage of titanium in an alloy used in aerospace casting is measured in 25 randomly selected parts. The sample standard deviation is 4.8 milligrams. Construct a 90% confidence interval for the population variance.

2. A rivet is to be inserted in a hole. A random sample of 30 parts is selected, and the hole diameter is measured. The sample variance of the measured hole diameters is 0.01 millimeters. Construct a 95% confidence interval for the population variance.

3. The brightness of a television picture tube can be evaluated by measuring the

amount of current required to achieve a particular brightness level. A sample

of 20 tubes results in variance of 16.9. Find the 98% confidence interval for

the population variance.

4. A particular brand of diet margarine was analyzed to determine the level of

polyunsaturated fatty acid (in percentage). A sample of 12 packages resulted

in standard deviation of 1.8. Find the 99% confidence interval for the

population variance.

5. A post-mix beverage machine is adjusted to release a certain amount of syrup into a chamber where it is mixed with carbonated water. A random sample of 30 beverages was found to have a variance of 0.058 fluid milliliters. Construct a 90% confidence interval for the population variance.

6. A manufacturer produces crankshafts for an automobile engine. The wear of a crankshaft after 45,000 km is of interest because it is likely to have an impact on warranty claims. A random sample of 10 shafts is tested and gives a standard deviation of 0.75. Construct a 95% confidence interval for the population standard deviation.


7. Consider the weight data in question 3 Exercise 4.5. Construct a 98% confidence interval for the ratio of population variances, $\sigma_{Before}^2/\sigma_{After}^2$.

8. Two chemical companies can supply a raw material. The concentration of a

particular element in this material is important. The standard deviation of

concentration in a random sample of 8 batches produced by Company A is 3.9

grams per liter, while for Company B, a random sample of 11 batches yields

4.7 grams per liter. Construct a 90% confidence interval for the ratio of population variances, $\sigma_A^2/\sigma_B^2$.

9. A fuel-economy study was conducted for two local automobiles, X and Y. One

vehicle of each brand was selected, and the mileage performance was

observed for 9 tanks of fuel in each car. From the study it was found that X

has a variance of 0.41 liter and Y has a variance of 0.34 liter. Construct a 95% confidence interval for the ratio of population variances.

10. Table 7 shows the data of waiting times (in minutes) of customers at ABC

Bank and XYZ Bank. Construct a 90% confidence interval for the ratio of population variances.

Table 7

ABC Bank 6.1 5.8 7.0 6.4 6.9 7.3 6.5

XYZ Bank 6.8 5.4 7.4 7.1 6.6 6.0


11. Consider the breaking strength of a plastic described in question 5 Exercise 4.5. Construct a 90% confidence interval for the ratio of population variances, $\sigma_{Type\,1}^2/\sigma_{Type\,2}^2$.

12. Consider the deflection temperature under load for two different types of plastic pipe described in question 6 Exercise 4.5. Compute a 98% confidence interval for the ratio of population variances, $\sigma_{Type\,A}^2/\sigma_{Type\,B}^2$.

13. Consider the yields data in question 8 Exercise 4.5. Construct a 95% confidence interval for the ratio of population variances, $\sigma_X^2/\sigma_Y^2$.

14. Consider the study to determine whether magnets are effective in treating back pain described in question 9 Exercise 4.5. Compute a 90% confidence interval for the ratio of population standard deviations, $\sigma_{Before}/\sigma_{After}$.

15. Consider the study to see the difference between BMI of men and women in question 10 Exercise 4.5. Compute a 90% confidence interval for the ratio of population variances, $\sigma_{Male}^2/\sigma_{Female}^2$.

Answer Exercise 4.6

1. (15.1850, 39.9307) 2. (0.006343, 0.0181)

3. (8.8724, 42.0673) 4. (1.3320, 13.6919)

5. (0.0395, 0.0950) 6. (0.5158, 1.3693)

7. (0.0741, 5.3164) 8. (0.2193, 2.5063)


9. (0.2722, 5.3421) 10. (0.1037, 2.253)

11. (0.6051, 4.3788) 12. (0.1686, 4.9384)

13. (0.3519, 5.5311) 14. (0.1635, 1.935)

15. (0.9309, 7.4496)

EXERCISE CHAPTER 4

1. A major truck stop has kept extensive records on various transactions with its

customers. If a random sample of 18 of these records shows average sales of

63.84 gallons of diesel fuel with a standard deviation of 2.75 gallons,

construct a 99% confidence interval for the mean of the population sampled.

2. The data in Table 8 are a sample of the number of fires started by candles at home in each of the past several years. Find the 95% confidence interval for the mean number of fires started by candles at home each year.

Table 8

5460 5900 6090 6310 7160 8440 9930

3. A lathe is set to cut steel bars and considered to be in perfect adjustment if the

average length of the bars it cuts is 7 centimeters. A sample of 31 bars is

selected randomly, and the lengths are measured. It is determined that the

average length of the bars in the sample is 7.055 centimeters with a standard

deviation of 0.35 centimeters. Find a 97% confidence interval for the average length of the bars, and use it to judge whether the lathe is in perfect adjustment.

4. A random sample of 48 days taken at a large hospital shows that an average of 38 patients were treated in the emergency room (ER) per day. The standard deviation of the population is four. Find the 99% confidence interval for the mean number of ER patients treated each day at the hospital.


5. To find an interval estimate of the number of miles a certain brand of tires will last before tread depth falls below the minimal safety threshold, a manufacturer tests 50 tires under various operating conditions. The recorded data yield the estimates $\bar{x} = 32{,}460$ miles and $s = 3106$. Find an 80% confidence interval for the mean number of miles.

6. A machine produces metal rods used in an automobile suspension system. A

random sample of 10 rods was selected, and diameter was measured. The

resulting data in millimeter is shown in Table 9.

Table 9

8.23 8.30 8.27 8.22 8.29

8.39 8.21 8.38 8.35 8.37

Find the 98% confidence interval of the mean rod diameter.

7. The thickness of blended cement for waterproofing application has a normal

distribution. 25 samples of blended cement for water proofing application

were selected randomly and have a mean of 1.17 cm and a standard deviation

of 0.32 cm.

(a) Determine the standard error of estimate if $\alpha = 0.05$.

(b) Find the 99% confidence interval of the mean thickness of blended

cement for waterproofing application.

(c) Find the 90% confidence interval of the mean thickness of blended

cement for waterproofing application.

(d) Is there any difference between the result in question (b) and (c)?

Why ?

8. Table 10 below shows the results of a mouse-infection experiment in which

14 mice in Group A and 11 mice in Group B received the same challenge dose


of bacteria and were then observed daily. If the population variances of both groups are not equal, find the 99% confidence interval for the difference between the average days of death for the two groups of mice infected by the dose of bacteria.

Table 10

Mouse Group    Day of death (post-infection) of individual mouse

A              2, 2, 3, 3, 3, 3, 4, 4, 5, 6, 7, 7, 8, 9

B              1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 5

9. Two similar groups of patients, A and B, consist of 50 and 100 individuals respectively. The first was given a new type of sleeping pill and the second was given a conventional type. The mean number of hours of sleep for patients in group A was 7.82 with a standard deviation of 0.24 h, while the mean number of hours of sleep for patients in group B was 6.75 with a standard deviation of 0.30 h. Find 95% confidence limits for the difference in the mean number of hours of sleep induced by the two types of sleeping pills.

10. A sample of 14 cans of Brand I diet soda gave the mean number of calories of

23 per can with a standard deviation of 3 calories. Another sample of 16 cans

of Brand II diet soda gave the mean number of calories of 25 per can with a

standard deviation of 4 calories. Assume that the calories per can of diet soda

are normally distributed for each of the two brands and that the standard

deviations for the two populations are equal. Find the 99% confidence interval for $\mu_1 - \mu_2$.

11. A study was conducted to investigate some effects of physical training.

Sample data are listed below, with all weights given in kilograms.

Pre training : 99 57 62 69 74 77 59 92 70 85 84


Post training : 94 57 62 69 66 76 58 88 70 84 83

Construct a 95% confidence interval for the difference between the weights in

pre training and the weights in post training by assuming 22

21 .

12. Twelve randomly selected mature Jati trees of one variety have a mean height

of 13.8 feet with a standard deviation of 1.2 feet, and 15 randomly selected

mature Jati trees of another variety have a mean height of 12.9 feet with a

standard deviation of 1.5 feet. Assuming that the random samples were

selected from normal population with equal variances, construct a 95%

confidence interval for the difference between the true average height of the

two kinds of Jati trees.

13. Reconsider question 1 of Exercise Chapter 4 and find the 99% confidence interval for the population standard deviation.

14. A medical researcher wants to determine whether male pulse rates vary more or less than female pulse rates. The statistics that he found from his research are summarized in Table 11:

Table 11

                      Male    Female

Number of samples     7       9

Mean                  69.4    76.3

Standard deviation    11.3    12.5

(a) Construct a 95% confidence interval for the male standard deviation.

(b) Construct a 95% confidence interval for the ratio of the male variance to the female variance.

15. Table 12 below shows the waiting times (in minutes) of customers at the Publician Bank under two arrangements: one where customers enter a single waiting line, and one where customers may enter any one of three different lines that have formed at three teller windows.

Table 12

Single-line system      6.5  6.6  6.7  6.8  7.1  7.3  7.4  7.7  7.7  7.7

Multiple-line system    4.2  5.4  5.8  6.2  6.7  7.7  7.7  8.5  9.3  10.0

(a) Construct a 95% confidence interval for $\sigma$ where the customers enter a single waiting line.

(b) Construct a 95% confidence interval for $\sigma$ where the customers may enter any one of three different lines.

(c) Which arrangement seems better: the single-line system or the multiple-line system? Why?

16. Ten alloys of Brand A had an average magnesium content of 3.1 mg with a standard deviation of 0.5 mg. Eight alloys of Brand B had an average magnesium content of 2.7 mg with a standard deviation of 0.7 mg. Assume that the two sets of data are independent random samples from normal populations with equal variances.

(a) Construct a 99% confidence interval for the population standard deviation of Brand A.

(b) Construct a 98% confidence interval for $\mu_A - \mu_B$.

17. The thickness of blended cement for waterproofing application has a normal distribution. 25 samples of blended cement for waterproofing application were selected randomly and have a mean of 1.17 cm and a standard deviation of 0.32 cm.

(a) Determine the standard error of estimate if $\alpha = 0.05$.

(b) Find the 99% confidence interval of the mean thickness of blended cement for waterproofing application.

(c) Find the 90% confidence interval of the mean thickness of blended cement for waterproofing application.

18. A random sample of the number of farms (in thousands) in various states in

Malaysia is given in Table 13.

Table 13

23 45 10 18 20 39

18 16 29 9 38 33

(a) Find the point estimate of the population variance.

(b) Construct the 90% confidence interval of variance for the number of

farms.

19. Two groups of students are given a problem-solving test, and the results are shown in Table 14. Construct the 95% confidence interval for the ratio of the standard deviations for the two groups, $\sigma_1/\sigma_2$.

Table 14

               Group 1 (Finance Major)    Group 2 (Management Major)

Sample size    13                         9

Variance       15.9                       11.4

20. A study was performed to determine whether men and women differ in their

repeatability in assembling components on printed circuit boards. Two

samples of 15 men and 17 women were selected, and each subject assembled

the units. The two sample standard deviations of assembly time were 1.21

minutes and 1.35 minutes.

(a) Construct a 95% confidence interval for the variance of assembly time for men.

(b) Construct a 90% confidence interval for the ratio of the two variances of assembly times for men and women, $\sigma_{men}^2/\sigma_{women}^2$.

ANSWER EXERCISE CHAPTER 4

1. (61.9616, 65.7184) 2. (5552.120, 8530.735)

3. (6.9186, 7.1914) 4. (36.5104, 39.4895)

5. (31897, 33023) 6. (8.240, 8.362)

7. (a) 0.12544 (b) (0.990992, 1.349008)

(c) (1.060496, 1.279504) (d) Yes.

8. (0.1428, 4.1948) 9. (0.9812, 1.1588)

10. (-5.6104, 1.6104) 11. (-9.7404, 13.5586)

12. (-0.198, 1.998)

13. (a) (7.2817, 24.8868) (b) (44.3625, 16.9966)

14. (0.1757, 4.5764)

15. (a) (0.329, 0.870) (b) (1.253, 3.326)

(c) The single-line system appears to be better, because the variation appears to be significantly lower with a single line.

16. (a) (0.308, 1.139) (b) (-0.33, 1.130)

17. (a) 0.12544 (b) (0.990992, 1.349008)

(c) (1.060496, 1.279504)

18. (a) 139.4242 (b) (77.950, 335.228)

19. (0.576, 2.213)

20. (a) (0.2242, 1.0289) (b) (0.2398, 2.4663)


SUMMARY CHAPTER 4

Confidence Interval for Single Mean

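Maximum error : $E = z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$ , Sample size : $n = \left(\dfrac{z_{\alpha/2}\,\sigma}{E}\right)^2$

(a) Large Sample : $n \ge 30$ or $\sigma$ known

(i) $\sigma$ is known : $\bar{x} - z_{\alpha/2}\,\sigma/\sqrt{n} < \mu < \bar{x} + z_{\alpha/2}\,\sigma/\sqrt{n}$

(ii) $\sigma$ is unknown : $\bar{x} - z_{\alpha/2}\,s/\sqrt{n} < \mu < \bar{x} + z_{\alpha/2}\,s/\sqrt{n}$

(b) Small Sample : $n < 30$ and $\sigma$ unknown

$\bar{x} - t_{\alpha/2,\,v}\,s/\sqrt{n} < \mu < \bar{x} + t_{\alpha/2,\,v}\,s/\sqrt{n}$ ; $v = n - 1$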
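As a rough illustration of the known-$\sigma$ interval and the sample-size formula above, the sketch below uses `statistics.NormalDist` from the Python standard library to obtain $z_{\alpha/2}$. The function names `z_interval` and `sample_size` are hypothetical, and the figures are those of question 4 in Exercise Chapter 4 ($n = 48$, $\bar{x} = 38$, $\sigma = 4$, 99% confidence). The maximum error of 1 patient in the second call is an assumed value chosen only for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, conf):
    """Interval xbar -/+ z_{alpha/2} * sigma / sqrt(n) for a known sigma."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # upper alpha/2 point of Z
    margin = z * sigma / sqrt(n)                   # maximum error E
    return xbar - margin, xbar + margin


def sample_size(sigma, max_error, conf):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= max_error."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return ceil((z * sigma / max_error) ** 2)      # always round up


lower, upper = z_interval(xbar=38, sigma=4, n=48, conf=0.99)
needed = sample_size(sigma=4, max_error=1, conf=0.99)
print(round(lower, 4), round(upper, 4), needed)
```

Because the code uses an unrounded $z_{0.005}$ rather than the table value 2.58, the limits may differ from the printed answer key in the third decimal place.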

Confidence Interval for a Difference Between Two Means

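(a) Z distribution case

(i) $\sigma$ is known : $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$

(ii) $\sigma$ is unknown : $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$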

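(b) t distribution case

(i) $n_1 = n_2$, $\sigma_1^2 \ne \sigma_2^2$ : $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,v}\sqrt{\dfrac{1}{n}\left(s_1^2 + s_2^2\right)}$ ; $v = 2n - 2$

(ii) $n_1 = n_2$, $\sigma_1^2 = \sigma_2^2$ : $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,v}\,S_p\sqrt{\dfrac{2}{n}}$ ; $v = 2n - 2$

$S_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$

(iii) $n_1 \ne n_2$, $\sigma_1^2 = \sigma_2^2$ : $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,v}\,S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$ ; $v = n_1 + n_2 - 2$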


$S_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$

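(iv) $n_1 \ne n_2$, $\sigma_1^2 \ne \sigma_2^2$ : $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,v}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$ ,

$v = \dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$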
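The pooled-variance interval and the degrees-of-freedom formula for unequal variances can be sketched as follows. The function names `pooled_interval` and `welch_df` are hypothetical, and the $t$ critical value is supplied by hand. The usage line takes the figures of question 12 in Exercise Chapter 4, with $t_{0.025,25} = 2.060$ assumed to be read from a standard $t$ table.

```python
from math import sqrt


def pooled_interval(x1, s1, n1, x2, s2, n2, t_crit):
    """Interval for mu_1 - mu_2 assuming equal population variances.

    t_crit is t_{alpha/2, v} with v = n1 + n2 - 2, looked up by the caller.
    """
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)  # S_p^2
    margin = t_crit * sqrt(sp2) * sqrt(1 / n1 + 1 / n2)
    diff = x1 - x2
    return diff - margin, diff + margin


def welch_df(s1, n1, s2, n2):
    """Degrees of freedom v for the unequal-variance case (generally not an integer)."""
    a, b = s1 ** 2 / n1, s2 ** 2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


# Question 12: two varieties of Jati trees, 95% confidence, v = 12 + 15 - 2 = 25.
lower, upper = pooled_interval(13.8, 1.2, 12, 12.9, 1.5, 15, t_crit=2.060)
print(round(lower, 3), round(upper, 3))
```

When `welch_df` returns a non-integer, textbooks differ on whether to round down or to the nearest integer before using the $t$ table; this sketch leaves that choice to the reader.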

Confidence Interval for Single Population Variance

$\dfrac{(n-1)s^2}{\chi^2_{\alpha/2,\,v}} < \sigma^2 < \dfrac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,v}}$ ; $v = n - 1$

Confidence Interval for Ratio of Two Population Variances

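$\dfrac{s_1^2}{s_2^2}\,\dfrac{1}{f_{\alpha/2,\,v_1,\,v_2}} < \dfrac{\sigma_1^2}{\sigma_2^2} < \dfrac{s_1^2}{s_2^2}\,f_{\alpha/2,\,v_2,\,v_1}$ ; $v_1 = n_1 - 1$ and $v_2 = n_2 - 1$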
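Several exercises ask for the ratio of standard deviations rather than variances. One way to obtain it, sketched below, is to take the square root of both limits of the variance-ratio interval. The function name `std_ratio_interval` is hypothetical. The usage line takes the figures of question 19 in Exercise Chapter 4, with $f_{0.025,12,8} = 4.20$ and $f_{0.025,8,12} = 3.51$ assumed to be read from a standard F table.

```python
from math import sqrt


def std_ratio_interval(var1, var2, f_v1_v2, f_v2_v1):
    """Interval for sigma_1 / sigma_2 from the variance-ratio limits.

    f_v1_v2 is f_{alpha/2, v1, v2}; f_v2_v1 is f_{alpha/2, v2, v1}.
    Both limits are positive, so taking square roots preserves the order.
    """
    ratio = var1 / var2                       # s_1^2 / s_2^2
    return sqrt(ratio / f_v1_v2), sqrt(ratio * f_v2_v1)


# Question 19: n1 = 13 (v1 = 12), n2 = 9 (v2 = 8), 95% confidence.
lower, upper = std_ratio_interval(15.9, 11.4, f_v1_v2=4.20, f_v2_v1=3.51)
print(round(lower, 3), round(upper, 3))
```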


CORRECTION PAGE CHAPTER 4
