Eng ISM Chapter 9 Statistics For Engineering and Sciences by Mandenhall

8/13/2019 Eng ISM Chapter 9 Statistics For Engineering and Sciences by Mandenhall


CHAPTER 9


Categorical Data Analysis

9.2 a. The one-way table is shown below:

Turned Left   Turned Right   Drove Straight   Total
    357           321             294          972

b. The form of the confidence interval is:

$$\hat{p}_1 \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n}}$$

where $\hat{p}_1 = \frac{294}{972} = .3025$ is the sample proportion of cars that drove straight.

For confidence coefficient .95, $\alpha = 1 - .95 = .05$ and $\alpha/2 = .05/2 = .025$. From Table 5, Appendix B, $z_{.025} = 1.96$. The confidence interval is:

$$.3025 \pm 1.96\sqrt{\frac{.3025(1-.3025)}{972}} \Rightarrow .3025 \pm .0289 \Rightarrow (.2736, .3314)$$

c. The form of the confidence interval is:

$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1) + \hat{p}_2(1-\hat{p}_2) + 2\hat{p}_1\hat{p}_2}{n}}$$

where $\hat{p}_1 = \frac{357}{972} = .3673$ (turned left) and $\hat{p}_2 = \frac{321}{972} = .3302$ (turned right).

The confidence interval is:

$$(.3673 - .3302) \pm 1.96\sqrt{\frac{.3673(1-.3673) + .3302(1-.3302) + 2(.3673)(.3302)}{972}} \Rightarrow .0371 \pm .0524 \Rightarrow (-.0153, .0895)$$

We are 95% confident the difference in the proportion of cars turning left and right is contained between -.0153 and .0895.

9.4 a. Let $p_1$ = the proportion of WMU students who agree that their DSIP research experience is valuable to their professional future.

$$\hat{p}_1 = \frac{47}{50} = .94$$

The confidence interval for $p_1$ is

$$\hat{p}_1 \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n}}$$

For confidence coefficient .99, $\alpha = 1 - .99 = .01$ and $\alpha/2 = .01/2 = .005$. From Table 5 in Appendix B, $z_{.005} = 2.576$. The 99% confidence interval for $p_1$ is:

$$.94 \pm 2.576\sqrt{\frac{.94(.06)}{50}} \Rightarrow .94 \pm .087 \Rightarrow (.853, 1.027)$$

b. Let $p_1$ = the proportion of WMU students who agree that their DSIP research experience is valuable to their professional future and let $p_2$ = the proportion of WMU students who are neutral about the statement.

$$\hat{p}_1 = \frac{47}{50} = .94 \quad \text{and} \quad \hat{p}_2 = \frac{3}{50} = .06$$

The confidence interval for $p_1 - p_2$ is:

$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1) + \hat{p}_2(1-\hat{p}_2) + 2\hat{p}_1\hat{p}_2}{n}}$$

For confidence coefficient .99, $\alpha = 1 - .99 = .01$ and $\alpha/2 = .01/2 = .005$. From Table 5 in Appendix B, $z_{.005} = 2.576$. The 99% confidence interval for $p_1 - p_2$ is:

$$(.94 - .06) \pm 2.576\sqrt{\frac{.94(.06) + .06(.94) + 2(.94)(.06)}{50}} \Rightarrow .88 \pm .173 \Rightarrow (.707, 1.053)$$
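The two interval formulas used in Exercises 9.2 and 9.4 can be checked numerically. The following is a minimal sketch in plain Python; the function names `proportion_ci` and `multinomial_diff_ci` are illustrative choices, not names from the text, and the z-values are copied from the solutions rather than computed.

```python
import math

def proportion_ci(count, n, z):
    """Large-sample interval for one multinomial proportion: p +/- z*sqrt(p(1-p)/n)."""
    p = count / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

def multinomial_diff_ci(count1, count2, n, z):
    """Interval for p1 - p2 when both counts come from the same sample of size n.

    The 2*p1*p2 term reflects the negative covariance between two
    cell proportions of a single multinomial sample.
    """
    p1, p2 = count1 / n, count2 / n
    variance = (p1 * (1 - p1) + p2 * (1 - p2) + 2 * p1 * p2) / n
    half_width = z * math.sqrt(variance)
    return (p1 - p2) - half_width, (p1 - p2) + half_width

# Exercise 9.2b: cars that drove straight, 95% interval
ci_straight = proportion_ci(294, 972, 1.96)
# Exercise 9.2c: turned left minus turned right, 95% interval
ci_left_right = multinomial_diff_ci(357, 321, 972, 1.96)
# Exercise 9.4b: agree minus neutral, 99% interval
ci_dsip = multinomial_diff_ci(47, 3, 50, 2.576)

for label, ci in [("9.2b", ci_straight), ("9.2c", ci_left_right), ("9.4b", ci_dsip)]:
    print(label, tuple(round(x, 4) for x in ci))
```

Small differences in the fourth decimal place relative to the worked solutions come from the solutions rounding the sample proportions before computing the half-width.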

9.6 The form of the interval is:

$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1) + \hat{p}_2(1-\hat{p}_2) + 2\hat{p}_1\hat{p}_2}{n}}$$

where $\hat{p}_1 = \frac{58}{90} = .6444$ and $\hat{p}_2 = \frac{15}{90} = .1667$.

For confidence coefficient .95, $\alpha = 1 - .95 = .05$ and $\alpha/2 = .05/2 = .025$. From Table 5, Appendix B, $z_{.025} = 1.96$. The 95% confidence interval is:

$$(.6444 - .1667) \pm 1.96\sqrt{\frac{.6444(.3556) + .1667(.8333) + 2(.6444)(.1667)}{90}} \Rightarrow .4777 \pm .1577 \Rightarrow (.3200, .6354)$$

We are 95% confident the difference between the proportion of subjects who selected brighter side up and the proportion who selected darker side up falls in the interval .3200 to .6354.



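9.8 a. The form of the confidence interval for $p_C$ is:

$$\hat{p}_C \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_C(1-\hat{p}_C)}{n}}$$

$$\hat{p}_C = \frac{n_C}{n} = \frac{22}{100} = .22$$

For confidence coefficient .90, $\alpha = 1 - .90 = .10$ and $\alpha/2 = .10/2 = .05$. From Table 5, Appendix B, $z_{.05} = 1.645$. The confidence interval is:

$$.22 \pm 1.645\sqrt{\frac{.22(1-.22)}{100}} \Rightarrow .22 \pm .068 \Rightarrow (.152, .288)$$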

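b. The form of the confidence interval for $(p_E - p_B)$ is:

$$(\hat{p}_E - \hat{p}_B) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_E(1-\hat{p}_E) + \hat{p}_B(1-\hat{p}_B) + 2\hat{p}_E\hat{p}_B}{n}}$$

$$\hat{p}_E = \frac{n_E}{n} = \frac{19}{100} = .19 \qquad \hat{p}_B = \frac{n_B}{n} = \frac{27}{100} = .27$$

Using the information from part a, the confidence interval is:

$$(.19 - .27) \pm 1.645\sqrt{\frac{.19(1-.19) + .27(1-.27) + 2(.19)(.27)}{100}} \Rightarrow -.08 \pm .111 \Rightarrow (-.191, .031)$$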

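c. $$\hat{p}_A = \frac{n_A}{n} = \frac{17}{100} = .17 \qquad \hat{p}_D = \frac{n_D}{n} = \frac{15}{100} = .15$$

Using the information from part b, the confidence interval is:

$$(.17 - .15) \pm 1.645\sqrt{\frac{.17(1-.17) + .15(1-.15) + 2(.17)(.15)}{100}} \Rightarrow .02 \pm .093 \Rightarrow (-.073, .113)$$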




9.10 a. To determine if the opinions of Internet users are evenly divided among the four

categories, we test:

$H_0: p_1 = p_2 = p_3 = p_4 = .25$

$H_a$: At least two of the proportions differ

b. The expected numbers in each category are:

E(ni)= npi= 328(.25) = 82

The test statistic is:

$$\chi^2 = \sum\frac{[n_i - E(n_i)]^2}{E(n_i)} = \frac{(59-82)^2}{82} + \frac{(108-82)^2}{82} + \frac{(82-82)^2}{82} + \frac{(79-82)^2}{82} = 14.805$$

The rejection region requires $\alpha = .05$ in the upper tail of the $\chi^2$ distribution with df = $k - 1 = 4 - 1 = 3$. From Table 8 in Appendix B, $\chi^2_{.05} = 7.81473$. The rejection region is $\chi^2 > 7.81473$.

Since the observed value of the test statistic does fall in the rejection region ($\chi^2 = 14.805 > 7.81473$), $H_0$ is rejected. There is sufficient evidence to indicate that the opinions of Internet users are not evenly divided among the four categories.

c. A Type I error would occur if we conclude that differences exist when, in fact, they do

not.

A Type II error would occur if we conclude that no differences exist when, in fact, they

do.

d. The expected cell counts must all be at least five and the multinomial assumptions must

be met.
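The goodness-of-fit computation in part b can be reproduced with a few lines of code. This is a minimal sketch in plain Python; the function name `chi_square_gof` is an illustrative choice, and the critical value is copied from the solution (Table 8) rather than computed.

```python
def chi_square_gof(observed, probs):
    """Pearson goodness-of-fit statistic for observed counts and hypothesized cell probabilities."""
    n = sum(observed)
    expected = [n * p for p in probs]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, expected

# Exercise 9.10: four opinion categories, H0: each probability is .25
stat, expected = chi_square_gof([59, 108, 82, 79], [0.25] * 4)

critical = 7.81473  # chi-square .05 critical value for df = 3, taken from the text
print("expected counts:", expected)
print("chi-square:", round(stat, 3), "reject H0:", stat > critical)

# The assumption in part d (all expected counts at least five) can be checked directly.
print("all expected >= 5:", all(e >= 5 for e in expected))
```

The same function applies to Exercise 9.12 by passing the observed counts 27, 24, 22, 10 with the same hypothesized probabilities.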

9.12 To determine if there are significant differences in the percentage of incidents in the four

cause categories, we test:

$H_0: p_1 = p_2 = p_3 = p_4 = .25$

$H_a$: At least two of the proportions differ

The expected numbers in each category are:

E(ni)= npi= 83(.25) = 20.75


The test statistic is:

$$\chi^2 = \sum\frac{[n_i - E(n_i)]^2}{E(n_i)} = \frac{(27-20.75)^2}{20.75} + \frac{(24-20.75)^2}{20.75} + \frac{(22-20.75)^2}{20.75} + \frac{(10-20.75)^2}{20.75} = 8.04$$

The rejection region requires $\alpha = .05$ in the upper tail of the $\chi^2$ distribution with df = $k - 1 = 4 - 1 = 3$. From Table 8 in Appendix B, $\chi^2_{.05} = 7.81473$. The rejection region is $\chi^2 > 7.81473$.

Since the observed value of the test statistic does fall in the rejection region ($\chi^2 = 8.04 > 7.81473$), $H_0$ is rejected. There is sufficient evidence to indicate that there are significant differences in the percentage of incidents in the four cause categories.

9.14 a. To determine if the traffic is equally divided among the three directions, we test:

$H_0: p_1 = p_2 = p_3 = 1/3$

$H_a$: At least two proportions are unequal

The expected number in each category is:

$E(n_i) = np_i = 972\left(\frac{1}{3}\right) = 324$ ($i$ = 1, 2, 3)

The observed and expected category counts are:

            Straight   Turn Right   Turn Left
Observed      294         321          357
Expected      324         324          324

The test statistic is:

$$\chi^2 = \sum\frac{(n_i - np_i)^2}{np_i} = \frac{(294-324)^2}{324} + \frac{(321-324)^2}{324} + \frac{(357-324)^2}{324} = 6.167$$

The rejection region requires $\alpha = .05$ in the upper tail of the $\chi^2$ distribution with df = $k - 1 = 3 - 1 = 2$. From Table 8, Appendix B, $\chi^2_{.05} = 5.99147$. The rejection region is $\chi^2 > 5.99147$.

Since the observed value of the test statistic falls in the rejection region ($\chi^2 = 6.167 > 5.99147$), $H_0$ is rejected. There is sufficient evidence to indicate the traffic is not equally divided at $\alpha = .05$.




b. To determine if more than one-third of all automobiles entering the intersection turn

left, we test:

$H_0: p = 1/3$

$H_a: p > 1/3$

The rejection region for this large-sample, one-tailed test requires $\alpha = .05$ in the upper tail of the $z$ distribution. From Table 5, Appendix B, $z_{.05} = 1.645$. The rejection region is $z > 1.645$.

The test statistic is

$$z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0 q_0}{n}}} = \frac{\dfrac{357}{972} - \dfrac{1}{3}}{\sqrt{\dfrac{\frac{1}{3}\cdot\frac{2}{3}}{972}}} = 2.25$$

Since the observed value of the test statistic falls in the rejection region ($z = 2.25 > 1.645$), $H_0$ is rejected. There is sufficient evidence to indicate the proportion of all automobiles entering this intersection that turn left exceeds 1/3 using $\alpha = .05$.
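The one-sample z statistic in part b is easy to verify. This is a minimal sketch in plain Python; the function name `one_proportion_z` is an illustrative choice, and the critical value 1.645 is copied from the solution.

```python
import math

def one_proportion_z(count, n, p0):
    """Large-sample z statistic for H0: p = p0, using the null standard error."""
    p_hat = count / n
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error

# Exercise 9.14b: 357 of 972 cars turned left, H0: p = 1/3 vs Ha: p > 1/3
z_left = one_proportion_z(357, 972, 1 / 3)
print("z =", round(z_left, 2), "reject H0:", z_left > 1.645)
```

Note that the standard error uses the hypothesized value $p_0$, not the sample proportion, which is what distinguishes the test statistic from the confidence-interval half-width.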

9.16 To determine if three proportions differ, we test:

$H_0: p_1 = p_2 = p_3 = 1/3$

$H_a$: At least two of the proportions differ

The expected cell counts are:

$E(n_i) = np_i = 90\left(\frac{1}{3}\right) = 30$ ($i$ = 1, 2, 3)

            Brighter Side Up   Darker Side Up   Aligned
Observed           58                15            17
Expected           30                30            30

The test statistic is:

$$\chi^2 = \sum\frac{(n_i - np_i)^2}{np_i} = \frac{(58-30)^2}{30} + \frac{(15-30)^2}{30} + \frac{(17-30)^2}{30} = 39.267$$

The rejection region requires $\alpha = .05$ in the upper tail of the $\chi^2$ distribution with df = $k - 1 = 3 - 1 = 2$. From Table 8, Appendix B, $\chi^2_{.05} = 5.99147$. The rejection region is $\chi^2 > 5.99147$.

Since the observed value of the test statistic falls in the rejection region ($\chi^2 = 39.267 > 5.99147$), $H_0$ is rejected. There is sufficient evidence to indicate at least two of the proportions differ at $\alpha = .05$.



9.18 For k= 2:

$$\chi^2 = \sum_{i=1}^{2}\frac{(n_i - np_i)^2}{np_i} = \frac{(n_1 - np_1)^2}{np_1} + \frac{(n_2 - np_2)^2}{np_2}$$

For a binomial experiment, $n_1 = y$, $n_2 = n - y$, $p_1 = p$, and $p_2 = 1 - p$, so

$$\begin{aligned}
\chi^2 &= \frac{(y - np)^2}{np} + \frac{[(n - y) - n(1 - p)]^2}{n(1 - p)} \\
&= \frac{y^2 - 2ynp + n^2p^2}{np} + \frac{(n - y)^2 - 2n(n - y)(1 - p) + n^2(1 - p)^2}{n(1 - p)} \\
&= \frac{y^2 - 2ynp + n^2p^2}{np} + \frac{n^2 - 2ny + y^2 - 2n^2 + 2n^2p + 2ny - 2npy + n^2 - 2n^2p + n^2p^2}{n(1 - p)} \\
&= \frac{y^2(1-p) - 2ynp(1-p) + n^2p^2(1-p) + n^2p - 2nyp + y^2p - 2n^2p + 2n^2p^2 + 2nyp - 2nyp^2 + n^2p - 2n^2p^2 + n^2p^3}{np(1-p)} \\
&= \frac{y^2 - y^2p - 2ynp + 2ynp^2 + n^2p^2 - n^2p^3 + y^2p - 2nyp^2 + n^2p^3}{np(1-p)} \\
&= \frac{y^2 - 2nyp + n^2p^2}{np(1-p)} = \frac{(y - np)^2}{np(1-p)} = \frac{(y - np)^2}{npq} = z^2
\end{aligned}$$
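The identity $\chi^2 = z^2$ for $k = 2$ can also be spot-checked numerically. The following is a minimal sketch in plain Python; the function names and the second and third test cases are illustrative assumptions (only the first case, 357 of 972 with $p = 1/3$, uses numbers that appear in this chapter).

```python
import math

def chi_square_two_cells(y, n, p):
    """Pearson chi-square with k = 2 cells: counts y and n - y, probabilities p and 1 - p."""
    return (y - n * p) ** 2 / (n * p) + ((n - y) - n * (1 - p)) ** 2 / (n * (1 - p))

def z_squared(y, n, p):
    """Square of the large-sample binomial z statistic (y - np) / sqrt(npq)."""
    z = (y - n * p) / math.sqrt(n * p * (1 - p))
    return z * z

# (y, n, p) triples; the last two are arbitrary illustrative values
cases = [(357, 972, 1 / 3), (12, 40, 0.25), (7, 10, 0.6)]
for y, n, p in cases:
    chi2, z2 = chi_square_two_cells(y, n, p), z_squared(y, n, p)
    print(y, n, round(p, 4), round(chi2, 6), round(z2, 6), math.isclose(chi2, z2))
```

A shorter route to the same result is to note that $(n - y) - n(1 - p) = -(y - np)$, so both numerators equal $(y - np)^2$ and the two fractions combine over the common denominator $np(1 - p)$.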

9.20 a. Yes, the sampling appears to satisfy the assumptions of a multinomial experiment. The experiment contains 120 trials and 2(4) = 8 categories. Since the 120 rats were randomly selected, the trials are considered independent and the probabilities are considered constant.

b. $$E(n_{ij}) = \frac{n_{i\cdot}\,n_{\cdot j}}{n}$$

$E(n_{11}) = \frac{80(30)}{120} = 20$     $E(n_{21}) = \frac{40(30)}{120} = 10$

$E(n_{12}) = \frac{80(30)}{120} = 20$     $E(n_{22}) = \frac{40(30)}{120} = 10$

$E(n_{13}) = \frac{80(30)}{120} = 20$     $E(n_{23}) = \frac{40(30)}{120} = 10$

$E(n_{14}) = \frac{80(30)}{120} = 20$     $E(n_{24}) = \frac{40(30)}{120} = 10$


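c. $$\begin{aligned}
\chi^2 &= \sum\sum\frac{[n_{ij} - E(n_{ij})]^2}{E(n_{ij})} \\
&= \frac{(27-20)^2}{20} + \frac{(20-20)^2}{20} + \frac{(19-20)^2}{20} + \frac{(14-20)^2}{20} + \frac{(3-10)^2}{10} + \frac{(10-10)^2}{10} + \frac{(11-10)^2}{10} + \frac{(16-10)^2}{10} = 12.9
\end{aligned}$$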

d. To determine if diet and presence/absence of cancer are independent, we test:

$H_0$: Diet and presence/absence of cancer are independent

$H_a$: Diet and presence/absence of cancer are dependent

The test statistic is $\chi^2 = 12.9$.

The rejection region requires $\alpha = .05$ in the upper tail of the $\chi^2$ distribution with df = $(r-1)(c-1) = (2-1)(4-1) = 3$. From Table 8, Appendix B, $\chi^2_{.05} = 7.81473$. The rejection region is $\chi^2 > 7.81473$.

Since the observed value of the test statistic falls in the rejection region ($\chi^2 = 12.9 > 7.81473$), $H_0$ is rejected. There is sufficient evidence to indicate that diet and presence/absence of cancer are not independent at $\alpha = .05$.
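The expected counts of part b and the test statistic of part c follow mechanically from the row and column totals. This is a minimal sketch in plain Python; the function names are illustrative, and the table layout (rows: cancer present, cancer absent; columns: the four diets) is inferred from the row totals 80 and 40 and column totals of 30 used in part b.

```python
def expected_counts(table):
    """Expected cell counts under independence: (row total)(column total) / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    expected = expected_counts(table)
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df, expected

# Exercise 9.20: observed counts from part c
rats = [
    [27, 20, 19, 14],  # cancer present
    [3, 10, 11, 16],   # cancer absent
]
stat, df, expected = chi_square_independence(rats)
print("expected:", expected)
print("chi-square:", round(stat, 1), "df:", df)
```

With df = 3, the statistic is compared against the df = 3 row of the chi-square table.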

e. Let $p_1$ = proportion of rats on high fat/no fiber diet with cancer and let $p_2$ = proportion of rats on high fat/fiber diet with cancer.

$$\hat{p}_1 = \frac{27}{30} = .9 \qquad \hat{p}_2 = \frac{20}{30} = .667$$

The confidence interval for the difference between two proportions is:

$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$$

For confidence coefficient .95, $\alpha = 1 - .95 = .05$ and $\alpha/2 = .05/2 = .025$. From Table 5, Appendix B, $z_{.025} = 1.96$. The 95% confidence interval is:

$$(.90 - .667) \pm 1.96\sqrt{\frac{.9(.1)}{30} + \frac{.667(.333)}{30}} \Rightarrow .233 \pm .2 \Rightarrow (.033, .433)$$

To obtain the confidence interval for the percentage, multiply the endpoints by 100%. The interval is (3.3%, 43.3%).

We are 95% confident that the difference in the percentage of rats with cancer between those on high fat/no fiber diets and those on high fat/fiber diets is between 3.3% and 43.3%.

Since the rats were divided into groups according to diets, we assume the groups are independent.
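For two independent groups the variance of the difference is the sum of the two binomial variances, with no covariance term. This is a minimal sketch in plain Python; the function name `two_sample_diff_ci` is an illustrative choice and the z-value is taken from the text.

```python
import math

def two_sample_diff_ci(x1, n1, x2, n2, z):
    """Large-sample interval for p1 - p2 from two independent samples."""
    p1, p2 = x1 / n1, x2 / n2
    half_width = z * math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) - half_width, (p1 - p2) + half_width

# Exercise 9.20e: 27 of 30 rats (high fat/no fiber) vs 20 of 30 rats (high fat/fiber)
low, high = two_sample_diff_ci(27, 30, 20, 30, 1.96)
print(round(low, 3), round(high, 3))
print(f"as percentages: ({100 * low:.1f}%, {100 * high:.1f}%)")
```

Contrast this with the single-sample multinomial intervals earlier in the chapter, where the two proportions share one sample and the extra $2\hat{p}_1\hat{p}_2$ term appears.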

9.22 Using MINITAB, the results of the analyses are:

Tabulated statistics: Stops, Kills

Using frequencies in Fr

Rows: Stops   Columns: Kills

           1       2       3       4       5      All

1         32      33      19       5       2       91
       28.31   34.88   18.71    6.57    2.53    91.00

2         24      36      18       8       3       89
       27.69   34.12   18.29    6.43    2.47    89.00

All       56      69      37      13       5      180
       56.00   69.00   37.00   13.00    5.00   180.00

Cell Contents:      Count
                    Expected count

Pearson Chi-Square = 2.171, DF = 4, P-Value = 0.704

Likelihood Ratio Chi-Square = 2.182, DF = 4, P-Value = 0.702

* NOTE * 2 cells with expected counts less than 5

First, we check to see if the assumption about the expected cell counts is met. From the table, there are two expected cell counts that are less than 5. Thus, the results of the test are suspect.

To determine if the number of kills is related to whether the trial was stopped or not, we test:

$H_0$: Number of kills and whether the trial was stopped or not are independent

$H_a$: Number of kills and whether the trial was stopped or not are dependent

The test statistic is $\chi^2 = 2.171$ (from the printout).

The p-value of the test is .704. Since this p-value is so large, $H_0$ is not rejected. There is insufficient evidence to indicate that the number of kills is related to whether the trial was stopped or not at $\alpha = .10$.




9.24 a. The contingency table is shown below:

Altitude (meters)     High    Low    Total
Less than 300          105     85     190
300-600                121     77     198
600 or more             59     17      76
Total                  285    179     464

b. To determine if flight response of the geese depends on altitude of the helicopter, we test:

$H_0$: Flight response and Altitude are independent

$H_a$: Flight response and Altitude are dependent

Statistix was used to create the following printout:

Chi-Square Test for Heterogeneity or Independence
for Count = Altitude Response

                          Response
Altitude                 Low         High
                    +-----------+-----------+
   1    Observed    |     85    |    105    |    190
        Expected    |   73.30   |  116.70   |
        Cell Chi-Sq |    1.87   |    1.17   |
                    +-----------+-----------+
   2    Observed    |     77    |    121    |    198
                    +-----------+-----------+
   3    Observed    |     17    |     59    |     76
        Expected    |   29.32   |   46.68   |
        Cell Chi-Sq |    5.18   |    3.25   |
                    +-----------+-----------+
                         179         285         464

Overall Chi-Square     11.48
P-Value                0.0032
Degrees of Freedom     2

Since $\alpha = .01$ > p-value = .0032, $H_0$ can be rejected. There is sufficient evidence to indicate that flight response of the geese depends on the altitude of the helicopter.

c. The contingency table is shown below:

Lateral distance (meters)    High    Low    Total
Less than 1,000               243     37     280
1,000-2,000                    37     68     105
2,000-3,000                     4     44      48
3,000 or more                   1     30      31
Total                         285    179     464



d. To determine if flight response of the geese depends on lateral distance of the helicopter, we test:

$H_0$: Flight response and Lateral distance are independent

$H_a$: Flight response and Lateral distance are dependent

Statistix was used to create the following printout:

Chi-Square Test for Heterogeneity or Independence
for Count = Lat_Cat Response

                          Response
Lat_Cat                  Low         High
                    +-----------+-----------+
   1    Observed    |     37    |    243    |    280
                    +-----------+-----------+
   2    Observed    |     68    |     37    |    105
        Expected    |   40.51   |   64.49   |
        Cell Chi-Sq |   18.66   |   11.72   |
                    +-----------+-----------+
   3    Observed    |     44    |      4    |     48
                    +-----------+-----------+
   4    Observed    |     30    |      1    |     31
                    +-----------+-----------+
                         179         285         464

Since $\alpha = .01$ > p-value = .0000, $H_0$ can be rejected. There is sufficient evidence to indicate that flight response of the geese depends on the lateral distance of the helicopter.

9.26 a. To find the proportion of censored measurements for each of the six tractor lines, we take the number of censored measurements for each tractor line and divide it by the total number of measurements for that tractor line.

$\hat{p}_1 = \frac{175}{6222} = 0.028$     $\hat{p}_2 = \frac{236}{4692} = 0.050$

$\hat{p}_3 = \frac{319}{7140} = 0.045$     $\hat{p}_4 = \frac{231}{6120} = 0.038$

$\hat{p}_5 = \frac{480}{10353} = 0.046$     $\hat{p}_6 = \frac{187}{4794} = 0.039$

b. Statistix was used to create the following printout:

Chi-Square Test for Heterogeneity or Independence
for Count = Lat_Cat Response

                           Response
Tractor Line          Uncensored    Censored
                    +-----------+-----------+
   1    Observed    |    175    |    6047   |   6222
                    +-----------+-----------+
   2    Observed    |    236    |    4456   |   4692
        Expected    |   194.26  |  4497.74  |
        Cell Chi-Sq |    8.97   |    0.39   |
                    +-----------+-----------+
   3    Observed    |    319    |    6821   |   7140
        Expected    |   295.62  |  6844.38  |
        Cell Chi-Sq |    1.85   |    0.08   |
                    +-----------+-----------+
        Cell Chi-Sq |    1.98   |    0.09   |
                    +-----------+-----------+
        Cell Chi-Sq |    6.15   |    0.27   |
                    +-----------+-----------+
   6    Observed    |    187    |    4607   |   4794
                    +-----------+-----------+
                        1628        37693       39321

To determine if the proportion of censored measurements differs for the six tractor lines, we test:

$H_0$: Measurement type and tractor line are independent

$H_a$: Measurement type and tractor line are dependent

Since $\alpha = .01$ > p-value = .0000, $H_0$ can be rejected. There is sufficient evidence to indicate that the proportion of censored measurements differs for the six tractor lines.

c. While statistically significant, we have no way of knowing when a tractor line will produce a large number of censored measurements and when it will produce a small number of censored measurements. From a practical perspective, not much useful information has been learned.



9.28 a. The contingency table is:

                                  Committee
                         Acceptable   Rejected   Totals
Inspector   Acceptable      101          23        124
            Rejected         10          19         29
Totals                      111          42        153

b. Yes. To plot the percentages, first convert frequencies to percentages by dividing the numbers in each column by the column total and multiplying by 100. Also, divide the row totals by the overall total and multiply by 100.

                                  Committee
                         Acceptable                    Rejected                     Totals
Inspector   Acceptable   (101/111) x 100 = 90.99%      (23/42) x 100 = 54.76%       (124/153) x 100 = 81.05%
            Rejected     (10/111) x 100 = 9.01%        (19/42) x 100 = 45.24%       (29/153) x 100 = 18.95%

[Figure: bar chart of the proportion of items the inspector accepted/rejected (vertical axis, 0 to 1) for each committee classification (Acceptable, Rejected, Total).]

From the plot, it appears there is a relationship.

c. Some preliminary calculations are:

$\hat{E}_{11} = \frac{r_1 c_1}{n} = \frac{124(111)}{153} = 89.961$     $\hat{E}_{12} = \frac{r_1 c_2}{n} = \frac{124(42)}{153} = 34.039$

$\hat{E}_{21} = \frac{r_2 c_1}{n} = \frac{29(111)}{153} = 21.039$     $\hat{E}_{22} = \frac{r_2 c_2}{n} = \frac{29(42)}{153} = 7.961$

To determine if the inspector's classifications and the committee's classifications are related, we test:

$H_0$: The inspector's and committee's classifications are independent

$H_a$: The inspector's and committee's classifications are dependent

The test statistic is

$$\chi^2 = \sum\sum\frac{[n_{ij} - \hat{E}_{ij}]^2}{\hat{E}_{ij}} = \frac{(101-89.961)^2}{89.961} + \frac{(23-34.039)^2}{34.039} + \frac{(10-21.039)^2}{21.039} + \frac{(19-7.961)^2}{7.961} = 26.034$$

The rejection region requires $\alpha = .05$ in the upper tail of the $\chi^2$ distribution with df = $(r-1)(c-1) = (2-1)(2-1) = 1$. From Table 8, Appendix B, $\chi^2_{.05} = 3.84146$. The rejection region is $\chi^2 > 3.84146$.

Since the observed value of the test statistic falls in the rejection region ($\chi^2 = 26.034 > 3.84146$), $H_0$ is rejected. There is sufficient evidence to indicate the inspector's and committee's classifications are dependent at $\alpha = .05$. This indicates that the inspector's and committee's decisions are related: they agree on 101 + 19 = 120 of the 153 items, more often than would be expected under independence.

9.30 We wish to test:

$H_0: p_1 = p_2 = p_3 = p_4 = p_5 = p_6 = p_7 = 1/7$

$H_a$: At least two of the proportions $p_1, p_2, \ldots, p_7$ are different from 1/7

Our statistic is

$$\chi^2 = \sum_{i=1}^{7}\frac{(O_i - e_i)^2}{e_i}$$

The observed counts are found by using the table information:

$O_i$ = (number of specimens)(percentage with manganese nodules)

The expected counts are found by $e_i = n_i p_i$.

These results are summarized as follows:

Age                            Observed              Expected
Miocene-recent                 389(.059) = 23        389(1/7) = 55.6
Oligocene                      140(.179) = 25        140(1/7) = 20.0
Eocene                         214(.164) = 35        214(1/7) = 30.6
Paleocene                      84(.214) = 18         84(1/7) = 12.0
Late Cretaceous                247(.211) = 52        247(1/7) = 35.3
Early and Middle Cretaceous    1120(.142) = 159      1120(1/7) = 160.0
Jurassic                       99(.110) = 11         99(1/7) = 14.1
Total                          323

$$\chi^2 = \frac{(23-55.6)^2}{55.6} + \frac{(25-20.0)^2}{20.0} + \cdots + \frac{(11-14.1)^2}{14.1} = 32.59$$

The rejection region requires $\alpha = .05$ in the upper tail of the $\chi^2$ distribution with $k - 1 = 7 - 1 = 6$ df. From Table 8, Appendix B, $\chi^2_{.05} = 12.5916$. Reject $H_0$ if $\chi^2 > 12.5916$.

Since the observed value of the test statistic falls in the rejection region ($\chi^2 = 32.59 > 12.5916$), $H_0$ is rejected.

9.32 a. To determine if the percentages of the different types of programming statements differ

for the two languages, we test:

$H_0$: The proportions of the different types of programming statements are the same for the two languages

$H_a$: The proportions of the different types of programming statements are different for the two languages

The expected category counts are:

$$E(n_{ij}) = \frac{n_{i\cdot}\,n_{\cdot j}}{n}$$

$E(n_{11}) = \frac{2170(10{,}412)}{19{,}882} = 1136.407$

$E(n_{12}) = \frac{2170(9470)}{19{,}882} = 1033.593$

$\vdots$

$E(n_{52}) = \frac{726(9470)}{19{,}882} = 345.801$

The observed counts, with expected counts in parentheses, are:

               ALGOL                 PASCAL                Totals
IF             125 (1136.407)        2,045 (1033.593)       2,170
FOR            968 (690.223)           350 (627.777)        1,318
IO             135 (1037.953)        1,847 (944.047)        1,982
ASSIGNMENT     8,923 (7167.218)      4,763 (6518.782)      13,686
Other          261 (380.199)           465 (345.801)          726
Totals         10,412                9,470                 19,882

The test statistic is:

$$\chi^2 = \sum\sum\frac{[n_{ij} - E(n_{ij})]^2}{E(n_{ij})} = \frac{(125-1136.407)^2}{1136.407} + \frac{(2045-1033.593)^2}{1033.593} + \cdots + \frac{(465-345.801)^2}{345.801} = 4755.1993$$

The rejection region requires $\alpha = .05$ in the upper tail of the $\chi^2$ distribution with df = $(r-1)(c-1) = (5-1)(2-1) = 4$. From Table 8, Appendix B, $\chi^2_{.05} = 9.48773$. The rejection region is $\chi^2 > 9.48773$.

Since the observed value of the test statistic falls in the rejection region ($\chi^2 = 4755.1993 > 9.48773$), $H_0$ is rejected. There is sufficient evidence to indicate the percentages of the different types of programming statements differ for the two languages at $\alpha = .05$.

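b. The form of the confidence interval for $(p_A - p_P)$ is:

$$(\hat{p}_A - \hat{p}_P) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_A(1-\hat{p}_A)}{n_A} + \frac{\hat{p}_P(1-\hat{p}_P)}{n_P}}$$

$$\hat{p}_A = \frac{X_A}{n_A} = \frac{8923}{10{,}412} = .857 \qquad \hat{p}_P = \frac{X_P}{n_P} = \frac{4763}{9470} = .503$$

For confidence coefficient .95, $\alpha = 1 - .95 = .05$ and $\alpha/2 = .05/2 = .025$. From Table 5, Appendix B, $z_{.025} = 1.96$. The confidence interval is:

$$(.857 - .503) \pm 1.96\sqrt{\frac{.857(1-.857)}{10{,}412} + \frac{.503(1-.503)}{9470}} \Rightarrow .354 \pm .0121 \Rightarrow (.3419, .3661)$$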

9.34 a. The form of the contingency tables will all be:

                           Predicted EVG
                      No          Yes        Total
Defect   FALSE      439 + y      10 - y       449
         TRUE        49 - y        y           49
Total                 488          10          498

b. The hypergeometric formula for these tables is:

$$P(y) = \frac{\dbinom{449}{10-y}\dbinom{49}{y}}{\dbinom{498}{10}}, \quad \text{where } y = 0, 1, 2, \ldots, 10$$

Due to the large sample size, the factorials involved make these probabilities difficult to calculate by hand. The resulting probabilities are shown below:

 y     P(y)
 0     0.3514
 1     0.3914
 2     0.1917
 3     0.0544
 4     0.0099
 5     0.0012
 6     0.0001
 7     0
 8     0
 9     0
10     0

c. The Fisher's exact test p-value can be found by adding the probabilities at least as contradictory as the one observed. P-value = $P(y = 2 \text{ or } 3 \text{ or } \ldots \text{ or } 10) = 0.2572$.

d. We see that these two probabilities are equal.

9.36 a. The form of the confidence interval is:

$$\hat{p}_i \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_i(1-\hat{p}_i)}{n}}$$

$\hat{p}_1 = .60$, $\hat{p}_2 = .23$, $\hat{p}_3 = .17$

For confidence coefficient .95, $\alpha = 1 - .95 = .05$ and $\alpha/2 = .05/2 = .025$. From Table 5, Appendix B, $z_{.025} = 1.96$. The 95% confidence intervals are:

For $p_1$: $.60 \pm 1.96\sqrt{\frac{.60(.40)}{1132}} \Rightarrow .60 \pm .029 \Rightarrow (.571, .629)$

For $p_2$: $.23 \pm 1.96\sqrt{\frac{.23(.77)}{1132}} \Rightarrow .23 \pm .025 \Rightarrow (.205, .255)$

For $p_3$: $.17 \pm 1.96\sqrt{\frac{.17(.83)}{1132}} \Rightarrow .17 \pm .022 \Rightarrow (.148, .192)$

b. We want to test:

$H_0: p_1 = .8,\ p_2 = .1,\ \text{and}\ p_3 = .1$

$H_a$: At least two proportions are different than specified

The expected counts in each category are:

$E(n_1) = np_1 = 1132(.8) = 905.6$

$E(n_2) = np_2 = 1132(.1) = 113.2$

$E(n_3) = np_3 = 1132(.1) = 113.2$

            Appropriate   Inappropriate   Avoidable
Observed        679            261           192
Expected       905.6          113.2         113.2

The test statistic is:

$$\chi^2 = \sum\frac{(n_i - np_i)^2}{np_i} = \frac{(679-905.6)^2}{905.6} + \frac{(261-113.2)^2}{113.2} + \frac{(192-113.2)^2}{113.2} = 304.5$$

The rejection region requires $\alpha = .10$ in the upper tail of the $\chi^2$ distribution with df = $k - 1 = 3 - 1 = 2$. From Table 8, Appendix B, $\chi^2_{.10} = 4.60517$. The rejection region is $\chi^2 > 4.60517$.

Since the observed value of the test statistic falls in the rejection region ($\chi^2 = 304.5 > 4.60517$), $H_0$ is rejected. There is sufficient evidence to indicate at least two proportions are different than specified at $\alpha = .10$.
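The hypergeometric probabilities tabulated for Exercise 9.34 are tedious by hand but immediate with exact integer binomial coefficients. This is a minimal sketch using Python's standard `math.comb`; the function name `hypergeom_pmf` and its parameter names are illustrative, with the margins (49 true defects, 449 non-defects, 10 predicted "Yes") read off the contingency table in Exercise 9.34a.

```python
from math import comb

def hypergeom_pmf(y, defects=49, non_defects=449, predicted_yes=10):
    """P(y true defects among the modules predicted 'Yes'), with all margins fixed."""
    total = defects + non_defects
    return comb(defects, y) * comb(non_defects, predicted_yes - y) / comb(total, predicted_yes)

probs = [hypergeom_pmf(y) for y in range(11)]
for y, p in enumerate(probs):
    print(f"{y:2d}  {p:.4f}")

# Upper-tail p-value used in Exercise 9.34c: P(y >= 2)
p_value = sum(probs[2:])
print("P(y >= 2) =", round(p_value, 4))
```

Because `comb` works with exact integers, the only rounding occurs in the final division, which avoids the overflow problems that large factorials cause on a calculator.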

9.38 The Statistix printout for the analysis appears below:

Chi-Square Test for Heterogeneity or Independence
for count = Year abuse

                                      abuse
Year                    1           2           3           4
                  +-----------+-----------+-----------+-----------+
 1   Observed     |      7    |      5    |      9    |      8    |    29
     Expected     |    9.61   |    8.22   |    5.74   |    5.43   |
     Cell Chi-Sq  |    0.71   |    1.26   |    1.85   |    1.22   |
                  +-----------+-----------+-----------+-----------+
 2   Observed     |     22    |     18    |      6    |      6    |    52
                  +-----------+-----------+-----------+-----------+
 3   Observed     |     12    |     15    |      6    |     12    |    45
                  +-----------+-----------+-----------+-----------+
 4   Observed     |     21    |     15    |     16    |      9    |    61
                  +-----------+-----------+-----------+-----------+
                        62          53          37          35        187

Overall Chi-Square     15.86
P-Value                0.0699
Degrees of Freedom     9

Cases Included 16    Missing Cases 0

To determine if the proportions of different types of abuse are changing over time, we test:

$H_0$: Types of abuse and year are independent

$H_a$: Types of abuse and year are dependent

The expected category counts are shown in the printout.

The test statistic is

$$\chi^2 = \sum\sum\frac{[n_{ij} - E(n_{ij})]^2}{E(n_{ij})} = 15.86 \text{ (from printout)}$$

The rejection region requires $\alpha = .05$ in the upper tail of the $\chi^2$ distribution with df = $(r-1)(c-1) = (4-1)(4-1) = 9$. From Table 8, Appendix B, $\chi^2_{.05} = 16.9190$. The rejection region is $\chi^2 > 16.9190$.

Since the observed value of the test statistic does not fall in the rejection region ($\chi^2 = 15.859 \not> 16.9190$), $H_0$ is not rejected. There is insufficient evidence to indicate the proportions of different types of abuse are changing over time at $\alpha = .05$.

9.40 a. To determine if pesticide depends on orchard type, we test:

$H_0$: Pesticide and orchard type are independent

$H_a$: Pesticide and orchard type are dependent

The test statistic is $\chi^2 = 31000.416$ (from printout). The p-value for the test is $p = .000$. At $\alpha = .01$, $\alpha$ > p-value, and we reject $H_0$. There is sufficient evidence to indicate that pesticide used and orchard type are dependent.

PHStat was used to conduct the desired analysis and the following printout was created:

Observed Frequencies
                            Column variable
Row variable      Almonds      Peaches     Nectarines      Total
Chlor.              41077         4419         11594       57090
Diazinon           102935         9651          5928      118514
Methid.             21240         5198          1790       28228
Parathion          136064        53384         24417      213865
Total              301316        72652         43729      417697

Expected Frequencies
                            Column variable
Row variable        Almonds         Peaches       Nectarines      Total
Chlor.          41183.27505     9929.931697       5976.79325      57090
Diazinon        85492.98756     20613.69636      12407.31608     118514
Methid.         20362.96178      4909.82855      2955.209666      28228
Parathion       154276.7756     37198.54339        22389.681     213865
Total                301316           72652            43729     417697

Data
Level of Significance        0.01
Number of Rows               4
Number of Columns            3
Degrees of Freedom           6

Results
Critical Value               16.8118718
Chi-Square Test Statistic    31000.41584
p-Value                      0

b. We will calculate 95% confidence intervals for the rate of parathion application for the three orchard types.

Almonds: $\hat{p} = \frac{136{,}064}{301{,}316} = .45$

$$\hat{p} \pm z_{.025}\sqrt{\frac{\hat{p}\hat{q}}{n}} \Rightarrow .45 \pm 1.96\sqrt{\frac{.45(.55)}{301{,}316}} \Rightarrow .45 \pm .002$$

Nectarines: $\hat{p} = \frac{24{,}417}{43{,}729} = .56$

$$\hat{p} \pm z_{.025}\sqrt{\frac{\hat{p}\hat{q}}{n}} \Rightarrow .56 \pm 1.96\sqrt{\frac{.56(.44)}{43{,}729}} \Rightarrow .56 \pm .005$$

Peaches: $\hat{p} = \frac{53{,}384}{72{,}652} = .73$

$$\hat{p} \pm z_{.025}\sqrt{\frac{\hat{p}\hat{q}}{n}} \Rightarrow .73 \pm 1.96\sqrt{\frac{.73(.27)}{72{,}652}} \Rightarrow .73 \pm .003$$

9.42 a. Test $H_0: p_1 = p_2 = .5$

$H_a: p_1 \ne p_2$

The test statistic is

$$\chi^2 = \sum_{i=1}^{2}\sum_{j=1}^{2}\frac{(O_{ij} - e_{ij})^2}{e_{ij}}$$

c. Fisher's exact test gives a p-value of $p = 0.0173$. When testing at $\alpha = .01$, $H_0$ cannot be rejected. There is insufficient evidence to detect a difference in proportions, which agrees with our conclusion above in part a.

9.44 The Statistix printout for the analysis is shown below:

Chi-Square Test for Heterogeneity or Independence
for count = Technology Group

                                      Group
Technology              1           2           3           4
                  +-----------+-----------+-----------+-----------+
 1   Observed     |     21    |     42    |     11    |     25    |    99
                  +-----------+-----------+-----------+-----------+
 2   Observed     |     18    |      2    |     16    |     13    |    49
                  +-----------+-----------+-----------+-----------+
 3   Observed     |     11    |      6    |     23    |     12    |    52
                  +-----------+-----------+-----------+-----------+
                        50          50          50          50        200

Cases Included 12    Missing Cases 0

a. To determine if public opinion regarding the choice of future technology options for generating electricity differs among the four groups, we test:

$H_0$: Choice and group are independent

$H_a$: Choice and group are dependent

The test statistic is $\chi^2 = 44.548$.

The rejection region requires $\alpha = .10$ in the upper tail of the $\chi^2$ distribution with df = $(r-1)(c-1) = (3-1)(4-1) = 6$. From Table 8, Appendix B, $\chi^2_{.10} = 10.6446$. The rejection region is $\chi^2 > 10.6446$.

Since the observed value of the test statistic falls in the rejection region ($\chi^2 = 44.548 > 10.6446$), $H_0$ is rejected. There is sufficient evidence to indicate that public opinion does differ among the four groups at $\alpha = .10$.

b. Let $p_1$ = proportion supporting the coal option and $p_2$ = proportion supporting the nuclear option.

To determine if the proportion supporting the coal option exceeds the proportion supporting the nuclear option, we test:

$H_0: p_1 - p_2 = 0$

$H_a: p_1 - p_2 > 0$

$$\hat{p}_1 = \frac{99}{200} = .495 \qquad \hat{p}_2 = \frac{49}{200} = .245 \qquad \hat{p} = \frac{99 + 49}{200 + 200} = .37$$

The rejection region requires $\alpha = .10$ in the upper tail of the $z$ distribution. From Table 5, Appendix B, $z_{.10} = 1.282$. The rejection region is $z > 1.282$.

The test statistic is

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - D_0}{\sqrt{\dfrac{\hat{p}(1-\hat{p}) + \hat{p}(1-\hat{p}) + 2\hat{p}^2}{n}}} = \frac{(.495 - .245) - 0}{\sqrt{\dfrac{.37(.63) + .37(.63) + 2(.37)^2}{200}}} = 4.11$$

Since the observed value of the test statistic falls in the rejection region ($z = 4.11 > 1.282$), $H_0$ is rejected. There is sufficient evidence to indicate the proportion supporting coal exceeds the proportion supporting nuclear at $\alpha = .10$.
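The z statistic for comparing two proportions from the same multinomial sample can be checked as follows. This is a minimal sketch in plain Python that mirrors the formula in the solution; the function name `multinomial_diff_z` is an illustrative choice and the critical value 1.282 is copied from the text.

```python
import math

def multinomial_diff_z(count1, count2, n):
    """z statistic for H0: p1 - p2 = 0 when both counts come from one sample of size n.

    Under H0 the common proportion is estimated by pooling the two cells.
    """
    p1, p2 = count1 / n, count2 / n
    pooled = (count1 + count2) / (2 * n)
    variance = (pooled * (1 - pooled) + pooled * (1 - pooled) + 2 * pooled ** 2) / n
    return (p1 - p2) / math.sqrt(variance)

# Exercise 9.44b: 99 respondents chose coal and 49 chose nuclear, out of 200
z_coal_nuclear = multinomial_diff_z(99, 49, 200)
print("z =", round(z_coal_nuclear, 2), "reject H0:", z_coal_nuclear > 1.282)
```

Algebraically the numerator of the variance collapses to $2\hat{p}$, so the standard error here is simply $\sqrt{2(.37)/200}$.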

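c. The form of the confidence interval is:

$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

$$\hat{p} = \frac{16}{50} = .32$$

For confidence coefficient .90, $\alpha = 1 - .90 = .10$ and $\alpha/2 = .10/2 = .05$. From Table 5, Appendix B, $z_{.05} = 1.645$. The 90% confidence interval is:

$$.32 \pm 1.645\sqrt{\frac{.32(1-.32)}{50}} \Rightarrow .32 \pm .109 \Rightarrow (.211, .429)$$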

9.46 The data were tested using Fisher's exact test and the results are shown below:

Two by Two Tables

+----------+----------+
|          |          |
|    10    |     6    |   16
|          |          |
+----------+----------+
|          |          |
|    12    |     2    |   14
|          |          |
+----------+----------+
     22          8        30

Fisher Exact Tests:   Lower Tail 0.1541   Upper Tail 0.0715   Two Tailed 0.2255

To determine if the fidelity and selectivity are dependent, we test:

$H_0$: Fidelity and Selectivity are independent

$H_a$: Fidelity and Selectivity are dependent

The p-value for the test is 0.2255.

When testing at $\alpha = .05$, $H_0$ cannot be rejected. There is insufficient evidence to indicate that fidelity and selectivity are dependent when testing at $\alpha = .05$.
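The tail probabilities in the printout can be reproduced from the hypergeometric distribution of the upper-left cell. This is a minimal sketch in plain Python; the function name `fisher_exact_2x2` is illustrative, and the two-tailed rule used here (sum every table probability no larger than that of the observed table) is a common convention that is assumed, not confirmed, to be the one the software applied.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Lower-tail and two-tailed Fisher exact p-values for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    lowest = max(0, col1 - row2)   # smallest feasible upper-left count
    highest = min(row1, col1)      # largest feasible upper-left count
    denom = comb(n, col1)
    pmf = {x: comb(row1, x) * comb(row2, col1 - x) / denom
           for x in range(lowest, highest + 1)}
    p_observed = pmf[a]
    lower_tail = sum(p for x, p in pmf.items() if x <= a)
    # small relative tolerance guards against floating-point ties
    two_tailed = sum(p for p in pmf.values() if p <= p_observed * (1 + 1e-9))
    return lower_tail, two_tailed

lower_tail, two_tailed = fisher_exact_2x2(10, 6, 12, 2)
print("lower tail:", round(lower_tail, 4))
print("two tailed:", round(two_tailed, 4))
print("opposite-tail part:", round(two_tailed - lower_tail, 4))
```

Under this convention the difference between the two-tailed and lower-tail values agrees with the figure the printout labels "Upper Tail".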

9.48 Some preliminary calculations are:

$e_1 = e_2 = e_3 = e_4 = e_5 = e_6 = e_7 = e_8 = np_i = 714(.125) = 89.25$

a. To determine if the probabilities of worker accidents are higher for some time periods, we test:

$H_0: p_1 = p_2 = p_3 = p_4 = p_5 = p_6 = p_7 = p_8 = .125$

$H_a$: At least two of the cell probabilities differ from each other

The test statistic is

$$\chi^2 = \sum\frac{(O_i - e_i)^2}{e_i} = \frac{(93-89.25)^2}{89.25} + \frac{(71-89.25)^2}{89.25} + \frac{(79-89.25)^2}{89.25} + \cdots + \frac{(110-89.25)^2}{89.25} = 15.905$$

The rejection region requires $\alpha = .10$ in the upper tail of the $\chi^2$ distribution with df = $k - 1 = 8 - 1 = 7$. From Table 8, Appendix B, $\chi^2_{.10} = 12.0170$. The rejection region is $\chi^2 > 12.0170$.

Since the observed value of the test statistic falls in the rejection region ($\chi^2 = 15.905 > 12.0170$), $H_0$ is rejected. There is sufficient evidence to indicate the probabilities of worker accidents are higher in some time periods at $\alpha = .10$.

b. $$\hat{p}_1 = \frac{98 + 89 + 102 + 110}{714} = \frac{399}{714} = .5588$$

$H_0: p_1 = .5$

$H_a: p_1 > .5$

The test statistic is

$$z = \frac{\hat{p}_1 - p_{10}}{\sqrt{\dfrac{p_{10}\,q_{10}}{n}}} = \frac{.5588 - .5}{\sqrt{\dfrac{.5(.5)}{714}}} = 3.14$$

The rejection region requires $\alpha = .10$ in the upper tail of the $z$ distribution. From Table 5, Appendix B, $z_{.10} = 1.28$. The rejection region is $z > 1.28$.

Since the observed value of the test statistic falls in the rejection region ($z = 3.14 > 1.28$), $H_0$ is rejected. There is sufficient evidence to indicate the probability of an accident during the last 4 hours of a shift is greater than during the first 4 hours at $\alpha = .10$.
