1 Review #2 Chapter 9 Chapter 10 Chapter 11 and 12


Review #2

Chapter 9

Chapter 10

Chapter 11 and 12


Chapter 9 Sampling Distributions

• A statistic is a random variable describing a characteristic of a random sample.
– Sample mean
– Sample variance

• We use the values of statistics in inferential statistics (making inferences about population characteristics from sample characteristics).

• Statistics have distributions of their own.


Chapter 9 The Central Limit Theorem

• The distribution of the sample mean is normal if the parent distribution is normal.

• The distribution of the sample mean approaches the normal distribution for sufficiently large samples (n ≥ 30), even if the parent distribution is not normal.

• The parameters of the sampling distribution of the mean are:
– Mean: \( \mu_{\bar{x}} = \mu_x \)
– Standard deviation: \( \sigma_{\bar{x}} = \sigma_x / \sqrt{n} \)

(Assumption: The population is sufficiently large. No correction is needed in the calculation of the variance.)


Chapter 9 The Central Limit Theorem

• Problem 1 (Using Excel): Given a normal population whose mean is 50 and whose standard deviation is 5,
– Question 1: Find the probability that a random sample of 4 has a mean between 49 and 52.
– Answer:

\[ P(49 \le \bar{x} \le 52) = P\left(\frac{49-50}{5/\sqrt{4}} \le Z \le \frac{52-50}{5/\sqrt{4}}\right) = P(-.4 \le Z \le .8) \]

[In the Excel worksheet type: =NORMSDIST(.8)-NORMSDIST(-.4). The answer: .443566]


Chapter 9 The Central Limit Theorem

• Problem 1 (Using the table): Given a normal population whose mean is 50 and whose standard deviation is 5,
– Question 1: Find the probability that a random sample of 4 has a mean between 49 and 52.
– Answer:

\[ P(49 \le \bar{x} \le 52) = P\left(\frac{49-50}{5/\sqrt{4}} \le Z \le \frac{52-50}{5/\sqrt{4}}\right) = P(-.4 \le Z \le .8) = .7881 - .3446 = .4435 \]

Normal table


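Chapter 9 The Central Limit Theorem

• Problem 1
– Question 2: Find the probability that a random sample of 16 has a mean between 49 and 52.
– Answer:

\[ P(49 \le \bar{x} \le 52) = P\left(\frac{49-50}{5/\sqrt{16}} \le Z \le \frac{52-50}{5/\sqrt{16}}\right) = P(-.8 \le Z \le 1.6) = .9452 - .2119 = .7333 \]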

Normal table
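The two answers to Problem 1 can also be checked outside Excel. The following is a minimal Python sketch that uses only the standard library's `statistics.NormalDist`; the helper name `prob_mean_between` is an illustrative choice and does not come from the slides.

```python
from math import sqrt
from statistics import NormalDist


def prob_mean_between(lo, hi, mu, sigma, n):
    """P(lo <= sample mean <= hi) for a normal population (or a large sample)."""
    # The sample mean has mean mu and standard deviation sigma / sqrt(n).
    sampling_dist = NormalDist(mu, sigma / sqrt(n))
    return sampling_dist.cdf(hi) - sampling_dist.cdf(lo)


p_n4 = prob_mean_between(49, 52, mu=50, sigma=5, n=4)    # about .4436
p_n16 = prob_mean_between(49, 52, mu=50, sigma=5, n=16)  # about .7333
print(round(p_n4, 4), round(p_n16, 4))
```

The larger sample gives a higher probability because the standard deviation of the sample mean shrinks from 5/2 to 5/4.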


• Problem 2: The amount of time per day spent by adults watching TV is normally distributed with μ = 6 and σ = 1.5 hours.
– Question 1: What is the probability that a randomly selected adult watches TV for more than 7 hours a day?
– Answer:

\( P(X > 7) \) [In Excel type: =1-NORMDIST(7,6,1.5,TRUE), then click anywhere. The answer: .252492]

– Question 2: What is the probability that 5 adults watch TV on the average 7 or more hours?
– Answer:

\[ P(\bar{X} \ge 7) = P\left(Z \ge \frac{7-6}{1.5/\sqrt{5}}\right) = P(Z \ge 1.49) = 1 - .9319 = .0681 \]

Chapter 9 The Central Limit Theorem

Normal table


Chapter 9 The Central Limit Theorem

• Problem 2:
– Question 3: What is the probability that the total time of watching TV of the five adults will not exceed 28 hours?
– Answer:

\[ P(\text{Total time} \le 28) = P(\bar{X} \le 28/5) = P\left(Z \le \frac{5.6-6}{1.5/\sqrt{5}}\right) = P(Z \le -.596) \approx .2755 \]

– Question 4: What total TV watching time is exceeded by only 3% of the population for samples of 5 adults?
– Answer:

\[ P(\text{Total time} > x_0) = P(\text{Average time} > \bar{x}_0) = .03 \]

[In Excel type: =NORMINV(.97,6,.670822), then click anywhere. The answer: \( \bar{x}_0 = 7.2617 \)]

Thus, \( x_0 = 5(7.2617) = 36.31 \)

Comments:
1. Excel returns X for a given left hand tail probability.
2. .670822 = 1.5/5^.5

Normal table
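All four questions of Problem 2 follow from two normal distributions: one for a single adult and one for the mean of five adults. The sketch below reproduces them with Python's standard library; the variable names are illustrative assumptions, not part of the slides.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 6.0, 1.5, 5

single_adult = NormalDist(mu, sigma)          # one adult's daily TV time
mean_of_five = NormalDist(mu, sigma / sqrt(n))  # sample mean of 5 adults

p_single_over_7 = 1 - single_adult.cdf(7)     # Question 1, about .2525
p_mean_over_7 = 1 - mean_of_five.cdf(7)       # Question 2, about .068
p_total_under_28 = mean_of_five.cdf(28 / n)   # Question 3, about .2755

# Question 4: the average exceeded only 3% of the time is the 97th percentile.
mean_cutoff = mean_of_five.inv_cdf(0.97)      # about 7.26 hours
total_cutoff = n * mean_cutoff                # about 36.3 hours
```

Converting a statement about the total into a statement about the average (dividing by n) is what lets the sampling distribution of the mean answer Questions 3 and 4.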


• Problem 3:

Assume that the mean monthly rent paid by students in a particular town is $350 with a standard deviation of $40. A random sample of 100 students who rented apartments was taken.

Question 1: What is the probability that the sample mean of the monthly rent exceeds $355?

\[ P(\bar{X} > 355) = P\left(Z > \frac{355-350}{40/\sqrt{100}}\right) = P(Z > 1.25) = 1 - .8944 = .1056 \]

Chapter 9 The Central Limit Theorem

Normal table


• Problem 3 - continued

Question2: What is the probability that the total revenue from renting 10 randomly selected apartments falls between 3300 and 3700 dollars?

Chapter 9 The Central Limit Theorem

\[ P(3300 < \text{Total rental revenue} < 3700) = P(330 < \text{Average rent} < 370) \]

[In Excel type: =NORMDIST(370,350,12.64911,TRUE)-NORMDIST(330,350,12.64911,TRUE). The answer: 0.886154]

Comment: 12.64911 = 40/10^.5

Normal table


• Problem 3 - continued

Question 3: Let’s assume the population mean was unknown, but the standard deviation was known to be $40. A sample of 100 rentals was selected in order to estimate the mean monthly rent paid by the whole student population. What is the probability that the sample mean differs from the actual mean by more than $5? By more than $10?

Chapter 9 The Central Limit Theorem

Normal table


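• Problem 3 – continued

(i)
\[ P(\bar{X} - \mu > 5 \text{ or } \bar{X} - \mu < -5) = P(\bar{X} - \mu > 5) + P(\bar{X} - \mu < -5) \]
\[ = P\left(\frac{\bar{X} - \mu}{\sigma_{\bar{x}}} > \frac{5}{40/\sqrt{100}}\right) + P\left(\frac{\bar{X} - \mu}{\sigma_{\bar{x}}} < \frac{-5}{40/\sqrt{100}}\right) \]
\[ = P(Z > 1.25) + P(Z < -1.25) = .1056 + .1056 = .2112 \]

(ii)
\[ P(\bar{X} - \mu > 10 \text{ or } \bar{X} - \mu < -10) = P(\bar{X} - \mu > 10) + P(\bar{X} - \mu < -10) \]
\[ = P\left(\frac{\bar{X} - \mu}{\sigma_{\bar{x}}} > \frac{10}{40/\sqrt{100}}\right) + P\left(\frac{\bar{X} - \mu}{\sigma_{\bar{x}}} < \frac{-10}{40/\sqrt{100}}\right) \]
\[ = P(Z > 2.5) + P(Z < -2.5) = 2(1 - .9938) = .0124 \]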

Chapter 9 The Central Limit Theorem
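The two probabilities asked for in Question 3 do not depend on the unknown mean, only on the distance d, the known σ, and n. A minimal Python sketch under that reading follows; the function name is a hypothetical helper, not something defined in the slides.

```python
from math import sqrt
from statistics import NormalDist


def prob_mean_misses_by_more_than(d, sigma, n):
    """P(|sample mean - mu| > d) when sigma is known."""
    z = d / (sigma / sqrt(n))
    # Both tails are counted, so the one-tail area is doubled.
    return 2 * (1 - NormalDist().cdf(z))


p_off_by_5 = prob_mean_misses_by_more_than(5, sigma=40, n=100)    # about .211
p_off_by_10 = prob_mean_misses_by_more_than(10, sigma=40, n=100)  # about .012
```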


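Chapter 9 Sampling distribution of the sample proportion

In a sample of size n, if np > 5 and n(1-p) > 5, then the sample proportion \( \hat{p} = x/n \) is approximately normally distributed with the following parameters:

\[ \mu_{\hat{p}} = p \quad \text{and} \quad \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}, \quad \text{therefore,} \quad Z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} \]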

(Assumption: The population is sufficiently large. No correction is needed in the calculation of the variance).


Sampling distribution of the sample proportion

• Problem 4:
– A commercial by a household appliance manufacturer claims that less than 5% of all of its products require a service call in the first year.
– A survey of 400 households that recently purchased the manufacturer's products was conducted to check the claim.


Problem 4 - Continued: Assuming the manufacturer is right, what is the probability that more than 10% of the surveyed households require a service call within the first year?

\[ P(\hat{p} > .10) = P\left(Z > \frac{.10 - .05}{\sqrt{.05(1 - .05)/400}}\right) = P(Z > 4.59) \approx 0 \]

If indeed 10% of the sampled households reported a call for service within the first year, what does it tell you about the manufacturer's claim?

Sampling distribution of the sample proportion

Normal table
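As a rough check of Problem 4, the sketch below evaluates the normal approximation for the sample proportion in Python. It assumes the boundary value p = .05 from the manufacturer's claim; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

n, p = 400, 0.05                  # sample size and claimed proportion

# Rule of thumb from the slides for using the normal approximation.
approximation_ok = n * p > 5 and n * (1 - p) > 5

std_error = sqrt(p * (1 - p) / n)        # standard deviation of p-hat
z = (0.10 - p) / std_error               # about 4.59
p_more_than_10pct = 1 - NormalDist().cdf(z)
print(approximation_ok, round(z, 2), p_more_than_10pct)
```

The resulting tail probability is a few parts in a million, which is why the slide rounds it to 0.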


Sampling Distribution of the Difference Between two Means

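• If two independent variables are normally distributed with means \( \mu_1, \mu_2 \) and variances \( \sigma_1^2, \sigma_2^2 \) respectively, then \( \bar{x}_1 - \bar{x}_2 \) is also normally distributed with:

\[ \mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2, \qquad \sigma^2_{\bar{x}_1 - \bar{x}_2} = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} \]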


• When at least one of the populations is not normally distributed but the sample sizes are both at least 30, \( \bar{x}_1 - \bar{x}_2 \) is approximately normally distributed, with a mean and a variance as indicated above.

Sampling Distribution of the Difference Between two Means


• Example: A national TV telethon committee is interested in determining whether donations made by males are on the average larger than those made by females by $4. Two samples of 25 males and 25 females were selected, and the donations made were recorded. If the standard deviations of the male and female populations are $2.4 and $1.8 respectively, what is the probability that the sample mean of the male donations exceeds the sample mean of the female donations by at least $5? Assume donations for the two populations are normally distributed.

Sampling Distribution of the Difference Between two Means


• Solution

Sampling Distribution of the Difference Between two Means

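\[ P(\bar{x}_1 - \bar{x}_2 > 5) = P\left(\frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}} > \frac{5 - 4}{\sqrt{2.4^2/25 + 1.8^2/25}}\right) = P(Z > 1.67) \approx .0475 \]

(The term \( 2.4^2/25 \) is for males; the term \( 1.8^2/25 \) is for females.)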
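The telethon example can be sketched in Python as follows. The code assumes, as the example does, that the true difference in mean donations is $4; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

sigma_m, n_m = 2.4, 25   # males
sigma_f, n_f = 1.8, 25   # females
assumed_diff = 4         # mu_1 - mu_2 under the committee's assumption

# Standard deviation of the difference between the two sample means.
se_diff = sqrt(sigma_m**2 / n_m + sigma_f**2 / n_f)   # 0.6
z = (5 - assumed_diff) / se_diff                      # about 1.67
p_exceeds_5 = 1 - NormalDist().cdf(z)                 # about .048
```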


Chapter 10 Introduction to Estimation

• A population’s parameter can be estimated by a point estimator and by an interval estimator.

• A confidence interval with a 1 − α confidence level is an interval estimator that covers the estimated parameter (1 − α)100% of the time.

• Confidence intervals are constructed using sampling distributions.


Confidence interval of the mean – Known Variance

• We use the central limit theorem to build the following confidence interval

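\[ \bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \]

[Figure: standard normal curve with central area 1 − α between −z_{α/2} and z_{α/2}, and area α/2 in each tail]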


• Problem 5: How many classes do university students miss each semester? A survey of 100 students was conducted. (See Data next)

• Assuming the standard deviation of the number of classes missed is 2.2, estimate the mean number of classes missed per student. Use 99% confidence level.

Confidence interval of the mean – Known Variance


– Solution

\[ \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 10.21 \pm 2.575\,\frac{2.2}{\sqrt{100}} = 10.21 \pm .57 \]

1 − α = .99, α = .01, α/2 = .005, z_{α/2} = z_{.005} = 2.575

LCL = 9.64, UCL = 10.78
You can use Data Analysis Plus > Z-Estimate: Mean

Confidence interval of the mean – Known Variance

Data


– Solution (using Data Analysis Plus):
• Shade the data set (you may include the title label)
• Select Data Analysis Plus, then “Z-Estimate: Mean”
• Type in the sigma (2.2), check Labels (if appropriate), type in alpha (.01), click OK.

z-Estimate: Mean
                      Classes
Mean                  10.21
Standard Deviation    2.1756
Observations          100
SIGMA                 2.2
LCL                   9.643316
UCL                   10.77668

Confidence interval of the mean – Known Variance

Data
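The interval for Problem 5 can be reproduced without the Data Analysis Plus add-in. The following Python sketch uses the standard library and the summary values given in the slides; `z_interval` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(x_bar, sigma, n, confidence):
    """Confidence interval for the mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin


lcl, ucl = z_interval(x_bar=10.21, sigma=2.2, n=100, confidence=0.99)
print(round(lcl, 4), round(ucl, 4))
```

Because the code uses the unrounded critical value (about 2.5758 rather than 2.575), its limits agree with the add-in output to about five decimal places rather than with the hand calculation exactly.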


Selecting the sample size

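• The shorter the confidence interval, the more accurate the estimate.

• We can, therefore, limit the width of the interval to 2W, and get

\[ \bar{x} \pm W = \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \quad \text{or} \quad W = z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \]

• From here we have

\[ n = \left(\frac{z_{\alpha/2}\,\sigma}{W}\right)^2 \]

W is called “Margin of error”, or “Bound on the error estimate”.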


• Problem 6: An operations manager wants to estimate the average amount of time needed by a worker to assemble a new electronic component.
• Sigma is known to be 6 minutes.
• The required estimate accuracy is within 20 seconds.
• The confidence level is 90%; 95%.
• Find the sample size.

Selecting the sample size


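– Solution: σ = 6 min; W = 20 sec = 1/3 min

• 1 − α = .90, z_{α/2} = z_{.05} = 1.645

\[ n = \left(\frac{z_{\alpha/2}\,\sigma}{W}\right)^2 = \left(\frac{1.645(6)}{1/3}\right)^2 = 876.75. \quad \text{Take } n = 877. \]

• 1 − α = .95, z_{α/2} = z_{.025} = 1.96

\[ n = \left(\frac{z_{\alpha/2}\,\sigma}{W}\right)^2 = \left(\frac{1.96(6)}{1/3}\right)^2 = 1244.68. \quad \text{Take } n = 1245. \]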

Selecting the sample size
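The sample-size formula lends itself to a short function. The sketch below is one possible Python version using the standard library; the name `sample_size_for_mean` is an assumption made for illustration.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sigma, w, confidence):
    """Smallest n for which the margin of error is at most w (sigma known)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # z_{alpha/2}
    # Always round up: rounding down would give a margin larger than w.
    return ceil((z * sigma / w) ** 2)


n_90 = sample_size_for_mean(sigma=6, w=1 / 3, confidence=0.90)
n_95 = sample_size_for_mean(sigma=6, w=1 / 3, confidence=0.95)
print(n_90, n_95)
```

Sigma and W must be in the same units, which is why the 20 seconds are expressed as 1/3 of a minute.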


Chapter 11 Hypotheses tests

– In hypothesis tests we hypothesize on a value of a population parameter, and test to see if there is sufficient evidence to support our belief.

– The structure of a hypotheses test
• Formulate two hypotheses.
– H0: The one we try to reject in favor of …
– H1: The alternative hypothesis, the one we try to prove.
• Define a significance level α.


Hypotheses tests

– The significance level is the probability of erroneously rejecting the null hypothesis.
α = P(reject H0 when H0 is true)
– Sample from the population and calculate a statistic that provides an indication whether or not the parameter value under H1 is more likely to be true.

– We shall test the population mean assuming the standard deviation is known.


• Problem 7: A machine is set so that the average diameter of ball bearings it produces is .50 inch. In a sample of 100 ball bearings the mean diameter was .51 inch. Assuming the standard deviation is .05 inch, can we conclude at 5% significance level that the mean diameter is not .50 inch?

Hypotheses tests of the Mean – Known Variance


• Solution: The population studied is the ball-bearing diameters.
– We hypothesize on the population mean.
– A good point estimator for the population mean is the sample mean.
– We use the distribution of the sample mean to build a sample statistic to test whether μ = .50 inch.

Hypotheses tests of the Mean – Known Variance


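Solution – (A Two Tail rejection region)
– Define the hypotheses:
• H0: μ = .50
• H1: μ ≠ .50

– Define the rejection region so that the probability of committing a type one error is .05:

\[ P(\bar{X} < \bar{X}_{L1} \text{ or } \bar{X} > \bar{X}_{L2} \text{ given that } \mu = .50) = .05 \]

or,

\[ P(Z < Z_{L1} \text{ or } Z > Z_{L2} \text{ given that } \mu = .50) = .05 \]

If \( \bar{X}_{L1} \) and \( \bar{X}_{L2} \) have symmetrical values around μ, then \( Z_{L1} \) and \( Z_{L2} \) are symmetrical around zero; therefore, \( Z_{L1} = -Z_{\alpha/2} \) and \( Z_{L2} = Z_{\alpha/2} \).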

Hypotheses tests of the Mean – Known Variance


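\[ P(Z < -Z_{.025} \text{ or } Z > Z_{.025} \text{ given that } \mu = .50) = .05 \]

Calculate the value of the sample Z statistic and compare it to the critical value.

Critical Z: Z.025 = 1.96 (obtained from the Z-table)

Build a rejection region: Zsample > Zα/2, or Zsample < −Zα/2

\[ Z_{sample} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{.51 - .50}{.05/\sqrt{100}} = 2 \]

Since 2 > 1.96, there is sufficient evidence to reject H0 in favor of H1 at 5% significance level.

[Figure: standard normal curve with rejection regions below −1.96 and above 1.96]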

Hypotheses tests of the Mean – Known Variance

Solution - A Two Tail rejection region


• We can perform the test in terms of the mean value.
• Let us find the critical mean values for rejection:

\[ \bar{X}_{L2} = \mu_0 + Z_{.025}\frac{\sigma}{\sqrt{n}} = .50 + 1.96(.05)/(100)^{1/2} = .5098 \]

\[ \bar{X}_{L1} = \mu_0 - Z_{.025}\frac{\sigma}{\sqrt{n}} = .50 - 1.96(.05)/(100)^{1/2} = .4902 \]

Since .51 > .5098, there is sufficient evidence to reject the null hypothesis at 5% significance level.

Hypotheses tests of the Mean – Known Variance

Solution - A Two Tail rejection region


• Calculate the p value of this test
• Solution

p-value = P(Z > Zsample) + P(Z < −Zsample) = P(Z > 2) + P(Z < −2) = 2P(Z > 2) = 2[1 − .9772] = .0456

• Since .0456 < .05, H0 is rejected.

Hypotheses tests of the Mean – Known Variance
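The rejection-region and p-value approaches to Problem 7 can be run side by side. The Python sketch below uses only the standard library; the function name and return layout are illustrative assumptions rather than anything prescribed by the slides.

```python
from math import sqrt
from statistics import NormalDist


def two_tail_z_test(x_bar, mu0, sigma, n, alpha):
    """Return (z statistic, critical z, p-value) for H1: mu != mu0."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, z_crit, p_value


z, z_crit, p_value = two_tail_z_test(0.51, 0.50, sigma=0.05, n=100, alpha=0.05)
reject_h0 = abs(z) > z_crit          # rejection-region method
same_decision = p_value < 0.05       # p-value method
```

Both flags come out true here, which illustrates that the two methods always lead to the same decision. The p-value from the code is about .0455; the .0456 on the slide reflects the four-decimal table value .9772.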


• Problem 8
– The average annual return on investment for American banks was found to be 10.2% with a standard deviation of 0.8%.

– It is believed that banks that exercise comprehensive planning do better.

– A sample of 26 banks that exercise comprehensive training provide the following result: Mean return = 10.5%

– Can we infer that the belief about bank performance is supported at 10% significance level by this sample result?

Hypotheses tests of the Mean – Known Variance


• Solution: (A Right Hand Tail Rejection Region)
The population tested is the “annual rate of return”.
– H0: μ = 10.2
– H1: μ > 10.2

• Let us perform the test with the standardized rejection region approach:

Zsample > Z.10 (Right hand tail rejection region)

Z.10 = 1.28. Reject H0 if Zsample > 1.28

Hypotheses tests of the Mean – Known Variance

Data


• Conclusion
– At 10% significance level there is sufficient evidence in the data to reject H0 in favor of H1, since the sample statistic falls inside the rejection region.

• Interpretation:
– If we are willing to accept a 10% chance of making the wrong conclusion, we can conclude that banks conducting comprehensive training perform better than banks that do not.

Hypotheses tests of the Mean – Known Variance

\[ Z_{sample} = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{10.5 - 10.2}{.8/\sqrt{26}} = 1.91 \]


• Let us perform the test with the p-value method:

\[ P(\bar{X} > 10.5 \text{ given that } \mu = 10.2) = P\left(Z > \frac{10.5 - 10.2}{.8/(26)^{1/2}}\right) = P(Z > 1.91) = .5 - .4719 = .0281 \]

• Since .0281 < .10 we reject the null hypothesis at 10% significance level.

Hypotheses tests of the Mean – Known Variance

Data


• Note the equivalence between the standardized method or the rejection region method and the p-value method.

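[Figure: standard normal curve. The critical value Z.10 = 1.28 has a right-tail area of .10, since P(Z > Z.10) = .10; the sample statistic 1.91 has a right-tail area (the p-value) of .0281.]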

Hypotheses tests of the Mean – Known Variance

The statement “the p-value is smaller than alpha” is equivalent to the statement “the test statistic falls in the rejection region”.


• Problem 9
– In the midst of labor-management negotiations, the president of a company argues that the company’s blue collar workers, who are paid an average of $30K a year, are well-paid because the mean annual pay for blue-collar workers in the country is less than $30K.

– This figure is disputed by the union. To test the president’s belief an arbitrator draws a random sample of 350 blue-collar workers from across the country and records their incomes (see file Salaries).

– If the arbitrator assumes that income is normally distributed with a standard deviation of $8,000, can it be inferred at 5% significance level that the company’s president is correct?

Hypotheses tests of the Mean – Known Variance


• Solution (A Left Hand Tail Rejection Region)
The population tested is the annual salary.
– H0: μ = 30K
– H1: μ < 30K
– Left hand tail rejection region: Z < −Z.05 or Z < −1.645

Zsample = (29,119.5 − 30,000)/(8,000/350^.5) = −2.059

Since −2.059 < −1.645 there is sufficient evidence to infer that on the average blue collar workers’ income is lower than $30K at 5% significance level.

Hypotheses tests of the Mean – Known Variance

Data


• Calculate the p-value of this test:
• Solution

p-value = P(Z < Zsample) = P(Z < −2.059) = .0197

Z-Test: Mean
                        Incomes
Mean                    29119.52
Standard Deviation      8460.491
Observations            350
Hypothesized Mean       30000
SIGMA                   8000
z Stat                  -2.059
P(Z<=z) one-tail        0.0197
z Critical one-tail     1.6449
P(Z<=z) two-tail        0.0394
z Critical two-tail     1.96

Hypotheses tests of the Mean – Known Variance


• Problem 7a: Calculate β for the two-tail hypotheses test performed in Problem 7, when the actual mean diameter is .515 inch.

Type II Error
H0: μ = .500
H1: μ = .515

• Solution
– The rejection region in terms of the critical values of the sample mean was found before: X̄L1 = .4902; X̄L2 = .5098.

β = P(Do not reject H0 when H1 is true)
= P(.4902 < x̄ < .5098 when μ = .515)
= P((.4902 − .515)/[.05/(100)^.5] < Z < (.5098 − .515)/[.05/(100)^.5])
= P(−4.96 < Z < −1.04) = P(1.04 < Z < 4.96)
= P(Z < 4.96) − P(Z < 1.04) ≈ 1 − P(Z < 1.04)
= 1 − .8508 = .1492

– This large probability may be reduced by taking larger samples.
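The type II error calculation for Problem 7a can be sketched in Python as below. The acceptance limits are recomputed from μ0, σ, n and α rather than typed in; `beta_two_tail` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def beta_two_tail(mu0, mu_actual, sigma, n, alpha):
    """P(do not reject H0: mu = mu0) when the true mean is mu_actual."""
    se = sigma / sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    lower = mu0 - z_crit * se        # lower critical sample mean
    upper = mu0 + z_crit * se        # upper critical sample mean
    # Distribution of the sample mean under the actual mean.
    actual = NormalDist(mu_actual, se)
    return actual.cdf(upper) - actual.cdf(lower)


beta = beta_two_tail(mu0=0.50, mu_actual=0.515, sigma=0.05, n=100, alpha=0.05)
beta_n400 = beta_two_tail(mu0=0.50, mu_actual=0.515, sigma=0.05, n=400, alpha=0.05)
```

The second call, with a sample size of 400 chosen purely for illustration, shows the effect mentioned on the slide: a larger sample makes β much smaller.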


Ch 12: Inference when the Variance is Unknown

• Generally, the variance may be unknown.

• In this case we change the test statistic from “Z” to “t”, when testing the population mean.

• To test the population proportion we’ll use the normal distribution (under certain conditions).


Testing the mean – unknown variance

• Replace the statistic Z with “t”:

\[ t = \frac{\bar{X} - \mu}{s/\sqrt{n}} \]

The original distribution must be normal (or at least mound shaped).


• Problem 10
– A federal agency inspects packages to determine if the contents are at least as large as advertised.
– A random sample of (i) 5, (ii) 50 containers whose packaging states that the weight was 8.04 ounces was drawn. (Data is provided later.)
– From the sample results…
• Can we conclude that the average weight does not meet the weight stated? (Use α = .05.)
• Estimate the mean weight of all containers with 99% confidence.
• What assumption must be met?

Testing the mean – unknown variance


• Solution
– We hypothesize on the mean weight.
• H0: μ = 8.04
• H1: μ < 8.04

• (i) n = 5. For small samples let us solve manually.
Assume the sample was: 8.07, 8.03, 7.99, 7.95, 7.94
– The rejection region: t < −t_{α,n−1} = −t_{.05,5−1} = −2.132
– The tsample = ?
– Mean = (8.07 + … + 7.94)/5 = 7.996
– Std. Dev. = {[(8.07 − 7.996)^2 + … + (7.94 − 7.996)^2]/4}^{1/2} = 0.054

Testing the mean – unknown variance


– The tsample is calculated as follows:

\[ t = \frac{\bar{X} - \mu}{s/\sqrt{n}} = \frac{7.996 - 8.04}{0.054/\sqrt{5}} = -1.82 \]

– Since −1.82 > −2.132 the sample statistic does not fall in the rejection region. There is insufficient evidence to conclude that the mean weight is smaller than 8.04, at 5% significance level.

Testing the mean – unknown variance
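For the five-container sample, the sample statistics and the t statistic can be computed directly. The Python sketch below uses the standard library for the arithmetic and takes the critical value −2.132 from the t-table as quoted in the slides, since the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

weights = [8.07, 8.03, 7.99, 7.95, 7.94]   # sample given in the slides
mu0 = 8.04                                 # weight stated on the package

x_bar = mean(weights)                      # 7.996
s = stdev(weights)                         # sample std. dev. (n - 1 divisor)
t_sample = (x_bar - mu0) / (s / sqrt(len(weights)))

t_critical = -2.132                        # -t_{.05,4}, read from the t-table
reject_h0 = t_sample < t_critical
print(round(x_bar, 3), round(s, 4), round(t_sample, 2), reject_h0)
```

With the unrounded standard deviation (about .0546) the statistic is about −1.80; either way it lies above −2.132, so H0 is not rejected.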


– (ii) n = 50. To calculate the sample statistics we use Excel, “Descriptive statistics” from the Tools > Data analysis menu. From the sample we obtain: Mean = 8.02; Std. Dev. = .04

– The confidence interval is calculated by

\[ \bar{x} \pm t_{\alpha/2}\frac{s}{\sqrt{n}} = 8.02 \pm 2.678\,\frac{.04}{\sqrt{50}} = 8.02 \pm .015 \]

1 − α = .99, α = .01, α/2 = .005
t_{.005,50−1} = about 2.678 from the t-table

Testing the mean – unknown variance

LCL = 8.005, UCL = 8.035


• Comments
– Check whether it appears that the distribution is normal.

[Figure: histogram of the sample weights; horizontal axis bins 7.93, 7.97, 8.01, 8.05, 8.09, More; vertical axis Frequency, from 0 to 20]

Testing the mean – unknown variance

Data


– To obtain an exact value for t use the TINV function:

Using Excel: =TINV(0.01,49)
• .01 is the two tail probability (= .005*2)
• 49 is the degrees of freedom

The exact value: 2.6799535

Data


• Problem 11
– Engineers in charge of the production of car seats are concerned about the compliance of the springs used with design specifications.
– Springs are designed to be 500mm long.
• Springs too long or too short must be reworked.
• A standard deviation of 2mm in spring length will result in an acceptable number of reworked springs.

– A sample of 100 springs was taken and measured.

Testing the mean – unknown variance


• Problem 11 – continued
– Can we infer at 10% significance level that the mean spring length is not 500mm?

Solution
H0: μ = 500
H1: μ ≠ 500
Since the standard deviation is unknown, we need to run a t-test, assuming the spring length is normally distributed.

Rejection region: t < −t_{α/2} or t > t_{α/2} with d.f. = 99, that is, t < −1.6604 or t > +1.6604

[Figure: t distribution with rejection regions below −1.6604 and above +1.6604; the sample statistic −.12 lies between them]

t-Test of a Mean
Sample mean                  499.9697    t Stat                 -0.12
Sample standard deviation    2.55247     P(T<=t) one-tail       0.4529
Sample size                  100         t Critical one-tail    1.2902
Hypothesized mean            500         P(T<=t) two-tail       0.9057
Alpha                        0.1         t Critical two-tail    1.6604

Testing the mean – unknown variance

Data


Inference about a population proportion

• The test and the confidence interval are based on the approximate normal distribution of the sample proportion, if np > 5 and n(1−p) > 5.

• For the confidence interval of p we have:

\[ \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]

where \( \hat{p} = x/n \)

• For the hypotheses test, we use a Z test.


• Problem 12 (Problem 11 continued). The engineers were interested in the percentage of springs that are the correct length. They marked each spring in the sample as
– Correct – 1;
– Too long – 2;
– Too short – 3.
Can we infer that less than 90% of the springs are the correct length, at 10% sig. level?

Inference about a population proportion


• Problem 12 - Solution
– H0: p = .9
– H1: p < .9
– Rejection region: Z < −Z_α or Z < −1.28

\[ Z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} = \frac{.86 - .9}{\sqrt{.9(1 - .9)/100}} = -1.33 \]

z-Test of a Proportion
Sample proportion          0.86    z Stat                 -1.33
Sample size                100     P(Z<=z) one-tail       0.0912
Hypothesized proportion    0.9     z Critical one-tail    1.2816
Alpha                      0.1     P(Z<=z) two-tail       0.1824
                                   z Critical two-tail    1.6449

Conclusion: Since −1.33 < −1.28 we can infer that less than 90% of the springs are the correct length (do not need reworking).

Inference about a population proportion

Data


• Problem 12 – solution continued
– Let us estimate the proportion of good springs at 99% confidence level.

\[ \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = .86 \pm 2.575\sqrt{\frac{.86(1 - .86)}{100}} = .86 \pm .0894 \]

z-Estimate of a Proportion
Sample proportion               0.86
Sample size                     100
Confidence level                0.99
Confidence Interval Estimate    0.86 ± 0.0894
Lower confidence limit          0.7706
Upper confidence limit          0.9494

Inference about a population proportion

Data


• Problem 12 – solution continued
– Find the sample size if the proportion of good springs is to be estimated to within .035. Consider the given sample an initial sample.

\[ n = \left(\frac{z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})}}{W}\right)^2 = \left(\frac{2.575\sqrt{.86(1 - .86)}}{.035}\right)^2 \approx 652 \]

Inference about a population proportion


• Problem 13
– A consumer protection group runs a survey of 400 dentists to check a claim that more than 4 out of 5 dentists recommend ingredients included in a certain toothpaste.

– The survey results are as follows: 71 – No; 329 – Yes

– At 5% significance level, can the consumer group infer that the claim is true?

Inference about a population proportion


• Problem 13 - Solution
– The two hypotheses are:
H0: p = .8
H1: p > .8

The rejection region: Z > Z_α, where Z.05 = 1.645

\[ Z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} = \frac{.8225 - .8}{\sqrt{.8(1 - .8)/400}} = 1.125 \]

Conclusion: Since 1.125 < 1.645 the consumer group cannot confirm the claim at 5% significance level.

Inference about a population proportion
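The dentist survey in Problem 13 can be checked with a few lines of Python. This is a sketch of the one-sided z-test for a proportion using the standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

yes, n = 329, 400        # survey results from the slides
p0, alpha = 0.8, 0.05    # hypothesized proportion and significance level

p_hat = yes / n                                   # .8225
# The standard error uses p0 because the test assumes H0 is true.
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)        # 1.125
z_crit = NormalDist().inv_cdf(1 - alpha)          # about 1.645
p_value = 1 - NormalDist().cdf(z)                 # about .13
reject_h0 = z > z_crit
```

The p-value of roughly .13 exceeds .05, matching the slide's conclusion that the claim cannot be confirmed.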


Summary Example

• An automotive expert claims that the large number of self-serve gas stations has resulted in poor automobile maintenance, and that the average tire pressure is more than 4.5 psi below its manufacturer specifications.

• A random sample of 50 tires revealed the results stored in the file TirePressure.

• Assume the tire pressure is normally distributed with σ = 1.5 psi, and answer the following questions:


• At 10% significance level can we infer that the expert is correct? What is the p value?

Solution
– The Hypotheses:
H0: μ = 4.5
H1: μ > 4.5

– The rejection region: Z > Z_α or Z > 1.28.
– From the data we have: mean = 5.04, so Z = (5.04 − 4.5)/(1.5/50^.5) = 2.545

– Since 2.545 > 1.28, there is sufficient evidence to infer that the expert is correct.

Summary Example

The p value = P(Sample mean > 5.04 when μ = 4.5) = P(Z > 2.545) = 1 − .9945 = .0055

Tire Pressure


• Find the probability of making a type II error when the actual tire under-inflation is 5 psi on the average.

Solution
The Rejection Region in terms of the sample mean is found first:
Z_L = 1.28 = (X̄_L − 4.5)/(1.5/50^.5), so X̄_L = 4.5 + 1.28(1.5/50^.5) = 4.77.
So, the Rejection Region is: Sample mean > 4.77.

β = P(do not reject H0 when H1 is true) = P(sample mean does not fall in the RR, when μ = 5) = P(x̄ < 4.77 when μ = 5) = P(Z < (4.77 − 5)/(1.5/50^.5)) = P(Z < −1.08) = .1407

From Excel: [=NORMSDIST(-1.077)] = .1407

Summary Example



Inference about the population Variance

• The following statistic is χ² (Chi squared) distributed with n − 1 degrees of freedom:

\[ \chi^2 = \frac{(n-1)s^2}{\sigma^2} \]

• We use this relationship to test and estimate the variance.


Inference about the population Variance

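• The Hypotheses tested are:

\[ H_0: \sigma^2 = \sigma_0^2 \]
\[ H_1: \sigma^2 > \sigma_0^2 \quad \text{or} \quad \sigma^2 < \sigma_0^2 \quad \text{or} \quad \sigma^2 \ne \sigma_0^2 \]

• The rejection region is:

\[ \frac{(n-1)s^2}{\sigma_0^2} > \chi^2_{\alpha,\,n-1} \quad \text{or} \quad \frac{(n-1)s^2}{\sigma_0^2} < \chi^2_{1-\alpha,\,n-1} \]

For the two tail test replace α with α/2.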


Testing the Variance

• Problem 15

• Engineers in charge of the production of car seats are concerned about the compliance of the springs used with design specifications.

• Springs are designed to be 500mm long.

– Springs too long or too short must be reworked.

– A standard deviation of 2mm in springs length will result in an acceptable number of reworked springs.

• A sample of 100 springs was taken and measured.


Testing the Variance

• Problem 15 - continued
Can we infer at 10% significance level that the number of springs requiring reworking is unacceptably large?

The number of springs requiring reworking depends on the standard deviation, or the variance.

H0: σ² = 4
H1: σ² > 4

Rejection region: χ²_Sample > χ²_{α,n−1} with d.f. = 99, that is, χ²_Sample > 117.4069

Chi-squared Test of a Variance
Sample variance          6.515104    Chi-squared Stat                 161.25
Sample size              100         P(CHI<=chi) one-tail             0.0001
Hypothesized variance    4           chi-squared Critical one-tail    117.4069
Alpha                    0.1         P(CHI<=chi) two-tail             0.0002
                                     chi-squared Critical two-tail    77.0463, 123.2252

Data


Testing the Variance

• Problem 15 - conclusion Since 161.25 > 117.4069, we can infer at 10% significance level that the standard deviation is greater than 2, thus the number of springs that require reworking is unacceptably large.
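The chi-squared statistic in Problem 15 is simple arithmetic, sketched below in Python. The critical value 117.4069 is copied from the test output in the slides, because the Python standard library does not provide a chi-squared distribution; `chi_squared_stat` is a hypothetical helper name.

```python
def chi_squared_stat(sample_variance, n, hypothesized_variance):
    """Test statistic (n - 1) s^2 / sigma_0^2 for a test about a variance."""
    return (n - 1) * sample_variance / hypothesized_variance


chi_sq = chi_squared_stat(sample_variance=6.515104, n=100, hypothesized_variance=4)
chi_sq_critical = 117.4069     # chi-squared critical value, alpha = .10, d.f. = 99
reject_h0 = chi_sq > chi_sq_critical
print(round(chi_sq, 2), reject_h0)
```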


Testing the Variance

• Problem 16
• A random sample of 100 observations was taken from a normal population. The sample variance was 29.76.

• Can we infer at 2.5% significance level that the population variance does NOT exceed 30?

• Estimate the population variance with 90% confidence.


Testing the Variance

• Problem 16 – Solution:
• H0: σ² = 30
• H1: σ² < 30

\[ \chi^2 = \frac{(n-1)s^2}{\sigma^2} = \frac{(100 - 1)29.76}{30} = 98.21 \]

Rejection region: χ² < χ²_{1−α,n−1}, that is, χ² < 73.36

Chi-squared Test of a Variance
Sample variance          29.76    Chi-squared Stat                 98.21
Sample size              100      P(CHI<=chi) one-tail             0.4964
Hypothesized variance    30       chi-squared Critical one-tail    73.3611
Alpha                    0.975    P(CHI<=chi) two-tail             0.9928
                                  chi-squared Critical two-tail    97.8956, 98.7740 !


Testing the Variance

• Problem 16 - conclusion Since 98.208 > 73.36 we conclude that there is insufficient evidence at 2.5% significance level to infer that the variance is smaller than 30.


Using Excel

– We can get an exact value of the probability P(χ²_{d.f.} > χ²) = α for a given χ² and known d.f., and then determine the p-value.

– Use the CHIDIST function: =CHIDIST(χ², d.f.)
For example: =CHIDIST(98.208,99) = .50359
That is: P(χ²_99 > 98.208) = .50359

– In our example we had a left hand tail rejection region, and therefore the p-value is P(χ²_99 < 98.208) = 1 − .50359 = .49641 > .025


Using Excel

– We can get the exact χ² value for which P(χ²_{d.f.} > χ²) = α for any given probability α and known d.f., then define the rejection region:

– Use the CHIINV function: =CHIINV(α, d.f.)
For example: =CHIINV(.975,99) = 73.36
That is: P(χ²_99 > ?) = .975. χ² = 73.36
The rejection region is: χ² < 73.36.