8/13/2019 Eng ISM Chapter 9 Statistics For Engineering and Sciences by Mandenhall
CHAPTER 9
Categorical Data Analysis
9.2 a. The one-way table is shown below:
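b. The form of the confidence interval is:

$$\hat{p}_1 \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n}}, \qquad \hat{p}_1 = \frac{294}{972} = .3025$$

For confidence coefficient .95, $\alpha = 1 - .95 = .05$ and $\alpha/2 = .05/2 = .025$. From Table 5, Appendix B, $z_{.025} = 1.96$. The confidence interval is:

$$.3025 \pm 1.96\sqrt{\frac{.3025(1-.3025)}{972}} \Rightarrow .3025 \pm .0289 \Rightarrow (.2736,\ .3314)$$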
c. The form of the confidence interval is:

$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1) + \hat{p}_2(1-\hat{p}_2) + 2\hat{p}_1\hat{p}_2}{n}}$$

$$\hat{p}_1 = \frac{357}{972} = .3673 \qquad \hat{p}_2 = \frac{321}{972} = .3302$$

The confidence interval is:

$$(.3673 - .3302) \pm 1.96\sqrt{\frac{.3673(1-.3673) + .3302(1-.3302) + 2(.3673)(.3302)}{972}}$$

$$\Rightarrow .0371 \pm .0524 \Rightarrow (-.0153,\ .0895)$$

We are 95% confident that the difference between the proportions of cars turning left and turning right is between $-.0153$ and $.0895$.
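The two interval formulas in Exercise 9.2 can be checked numerically. The following is a minimal sketch, assuming Python with only the standard library. The function names `multinomial_ci` and `multinomial_diff_ci` are illustrative and do not come from the text. Because the code carries unrounded proportions, its endpoints can differ from the hand calculation in the fourth decimal place.

```python
import math

def multinomial_ci(count, n, z):
    """Large-sample interval for one multinomial proportion."""
    p = count / n
    half = z * math.sqrt(p * (1 - p) / n)
    return p - half, p + half

def multinomial_diff_ci(count1, count2, n, z):
    """Large-sample interval for p1 - p2 when both counts come from
    the same multinomial sample of size n."""
    p1, p2 = count1 / n, count2 / n
    variance = (p1 * (1 - p1) + p2 * (1 - p2) + 2 * p1 * p2) / n
    half = z * math.sqrt(variance)
    return (p1 - p2) - half, (p1 - p2) + half

# Counts from the one-way table: 357 left, 321 right, 294 straight.
straight = multinomial_ci(294, 972, 1.96)
left_minus_right = multinomial_diff_ci(357, 321, 972, 1.96)
print([round(x, 4) for x in straight])
print([round(x, 4) for x in left_minus_right])
```

The `+ 2 * p1 * p2` term reflects the negative covariance between two category counts from a single multinomial sample, so it should not be used for two independent samples.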
9.4 a. Let $p_1$ = the proportion of WMU students who agree that their DSIP research experience is valuable to their professional future.

$$\hat{p}_1 = \frac{47}{50} = .94$$
One-way table for Exercise 9.2a:

Turned Left   Turned Right   Drove Straight   Total
    357           321             294          972
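The confidence interval for $p_1$ is $\hat{p}_1 \pm z_{\alpha/2}\sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n}}$.

For confidence coefficient .99, $\alpha = 1 - .99 = .01$ and $\alpha/2 = .01/2 = .005$. From Table 5 in Appendix B, $z_{.005} = 2.576$. The 99% confidence interval for $p_1$ is:

$$.94 \pm 2.576\sqrt{\frac{.94(.06)}{50}} \Rightarrow .94 \pm .087 \Rightarrow (.853,\ 1.027)$$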
b. Let $p_1$ = the proportion of WMU students who agree that their DSIP research experience is valuable to their professional future and let $p_2$ = the proportion of WMU students who are neutral about the statement.

$$\hat{p}_1 = \frac{47}{50} = .94 \quad\text{and}\quad \hat{p}_2 = \frac{3}{50} = .06$$

The confidence interval for $p_1 - p_2$ is:

$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1) + \hat{p}_2(1-\hat{p}_2) + 2\hat{p}_1\hat{p}_2}{n}}$$

For confidence coefficient .99, $\alpha = 1 - .99 = .01$ and $\alpha/2 = .01/2 = .005$. From Table 5 in Appendix B, $z_{.005} = 2.576$. The 99% confidence interval for $p_1 - p_2$ is:

$$(.94 - .06) \pm 2.576\sqrt{\frac{.94(.06) + .06(.94) + 2(.94)(.06)}{50}} \Rightarrow .88 \pm .173 \Rightarrow (.707,\ 1.053)$$
9.6 The form of the interval is:

$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1) + \hat{p}_2(1-\hat{p}_2) + 2\hat{p}_1\hat{p}_2}{n}}$$

$$\hat{p}_1 = \frac{58}{90} = .6444 \qquad \hat{p}_2 = \frac{15}{90} = .1667$$

For confidence coefficient .95, $\alpha = 1 - .95 = .05$ and $\alpha/2 = .05/2 = .025$. From Table 5, Appendix B, $z_{.025} = 1.96$. The 95% confidence interval is:

$$(.6444 - .1667) \pm 1.96\sqrt{\frac{.6444(.3556) + .1667(.8333) + 2(.6444)(.1667)}{90}}$$

$$\Rightarrow .4777 \pm .1577 \Rightarrow (.3200,\ .6354)$$

We are 95% confident that the difference between the proportion of subjects who selected brighter side up and the proportion who selected darker side up falls in the interval .3200 to .6354.
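9.8 a. The form of the confidence interval for $p_C$ is:

$$\hat{p}_C \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_C(1-\hat{p}_C)}{n}}, \qquad \hat{p}_C = \frac{n_C}{n} = \frac{22}{100} = .22$$

For confidence coefficient .90, $\alpha = 1 - .90 = .10$ and $\alpha/2 = .10/2 = .05$. From Table 5, Appendix B, $z_{.05} = 1.645$. The confidence interval is:

$$.22 \pm 1.645\sqrt{\frac{.22(1-.22)}{100}} \Rightarrow .22 \pm .068 \Rightarrow (.152,\ .288)$$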
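b. The form of the confidence interval for $(p_E - p_B)$ is:

$$(\hat{p}_E - \hat{p}_B) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_E(1-\hat{p}_E) + \hat{p}_B(1-\hat{p}_B) + 2\hat{p}_E\hat{p}_B}{n}}$$

$$\hat{p}_E = \frac{n_E}{n} = \frac{19}{100} = .19 \qquad \hat{p}_B = \frac{n_B}{n} = \frac{27}{100} = .27$$

Using the information from part a, the confidence interval is:

$$(.19 - .27) \pm 1.645\sqrt{\frac{.19(1-.19) + .27(1-.27) + 2(.19)(.27)}{100}}$$

$$\Rightarrow -.08 \pm .111 \Rightarrow (-.191,\ .031)$$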
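c. $$\hat{p}_A = \frac{n_A}{n} = \frac{17}{100} = .17 \qquad \hat{p}_D = \frac{n_D}{n} = \frac{15}{100} = .15$$

Using the information from part b, the confidence interval for $(p_A - p_D)$ is:

$$(.17 - .15) \pm 1.645\sqrt{\frac{.17(1-.17) + .15(1-.15) + 2(.17)(.15)}{100}}$$

$$\Rightarrow .02 \pm .093 \Rightarrow (-.073,\ .113)$$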
9.10 a. To determine if the opinions of Internet users are evenly divided among the four
categories, we test:
$H_0: p_1 = p_2 = p_3 = p_4 = .25$
$H_a$: At least two of the proportions differ
b. The expected numbers in each category are:

$$E(n_i) = np_i = 328(.25) = 82$$

The test statistic is:

$$\chi^2 = \sum\frac{[n_i - E(n_i)]^2}{E(n_i)} = \frac{(59-82)^2}{82} + \frac{(108-82)^2}{82} + \frac{(82-82)^2}{82} + \frac{(79-82)^2}{82} = 14.805$$

The rejection region requires $\alpha = .05$ in the upper tail of the $\chi^2$ distribution with df $= k - 1 = 4 - 1 = 3$. From Table 8 in Appendix B, $\chi^2_{.05} = 7.81473$. The rejection region is $\chi^2 > 7.81473$.

Since the observed value of the test statistic falls in the rejection region ($\chi^2 = 14.805 > 7.81473$), $H_0$ is rejected. There is sufficient evidence to indicate that the opinions of Internet users are not evenly divided among the four categories.
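The goodness-of-fit calculation in part b can be reproduced with a few lines of code. This is a sketch only, assuming Python with the standard library. The helper name `chi_square_gof` is hypothetical, and the critical value is copied from the table cited above rather than computed.

```python
def chi_square_gof(observed, probs):
    """Return the chi-square statistic and the expected counts for a
    one-way table under the hypothesized category probabilities."""
    n = sum(observed)
    expected = [n * p for p in probs]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, expected

observed = [59, 108, 82, 79]            # category counts, n = 328
stat, expected = chi_square_gof(observed, [0.25] * 4)

CRITICAL_05_DF3 = 7.81473               # chi-square(.05), df = 3, from the table
reject = stat > CRITICAL_05_DF3
print(round(stat, 3), expected, reject)
```

Passing the hypothesized probabilities as a list lets the same helper serve tests whose null proportions are unequal.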
c. A Type I error would occur if we conclude that differences exist when, in fact, they do
not.
A Type II error would occur if we conclude that no differences exist when, in fact, they
do.
d. The expected cell counts must all be at least five and the multinomial assumptions must
be met.
9.12 To determine if there are significant differences in the percentage of incidents in the four
cause categories, we test:
$H_0: p_1 = p_2 = p_3 = p_4 = .25$
$H_a$: At least two of the proportions differ

The expected numbers in each category are:

$$E(n_i) = np_i = 83(.25) = 20.75$$
The test statistic is:

$$\chi^2 = \sum\frac{[n_i - E(n_i)]^2}{E(n_i)} = \frac{(27-20.75)^2}{20.75} + \frac{(24-20.75)^2}{20.75} + \frac{(22-20.75)^2}{20.75} + \frac{(10-20.75)^2}{20.75} = 8.04$$

The rejection region requires $\alpha = .05$ in the upper tail of the $\chi^2$ distribution with df $= k - 1 = 4 - 1 = 3$. From Table 8 in Appendix B, $\chi^2_{.05} = 7.81473$. The rejection region is $\chi^2 > 7.81473$.

Since the observed value of the test statistic falls in the rejection region ($\chi^2 = 8.04 > 7.81473$), $H_0$ is rejected. There is sufficient evidence to indicate that there are significant differences in the percentage of incidents in the four cause categories.
9.14 a. To determine if the traffic is equally divided among the three directions, we test:

$H_0: p_1 = p_2 = p_3 = 1/3$
$H_a$: At least two proportions are unequal

The expected number in each category is:

$$E(n_i) = np_i = 972\left(\tfrac{1}{3}\right) = 324 \quad (i = 1, 2, 3)$$

The observed and expected category counts are:

            Straight   Turn Right   Turn Left
Observed      294         321          357
Expected      324         324          324

The test statistic is:

$$\chi^2 = \sum\frac{(n_i - np_i)^2}{np_i} = \frac{(294-324)^2}{324} + \frac{(321-324)^2}{324} + \frac{(357-324)^2}{324} = 6.167$$

The rejection region requires $\alpha = .05$ in the upper tail of the $\chi^2$ distribution with df $= k - 1 = 3 - 1 = 2$. From Table 8, Appendix B, $\chi^2_{.05} = 5.99147$. The rejection region is $\chi^2 > 5.99147$.

Since the observed value of the test statistic falls in the rejection region ($\chi^2 = 6.167 > 5.99147$), $H_0$ is rejected. There is sufficient evidence to indicate the traffic is not equally divided at $\alpha = .05$.
b. To determine if more than one-third of all automobiles entering the intersection turn left, we test:

$H_0: p = 1/3$
$H_a: p > 1/3$

The rejection region for this large-sample, one-tailed test requires $\alpha = .05$ in the upper tail of the $z$ distribution. From Table 5, Appendix B, $z_{.05} = 1.645$. The rejection region is $z > 1.645$.

The test statistic is:

$$z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0 q_0}{n}}} = \frac{\dfrac{357}{972} - \dfrac{1}{3}}{\sqrt{\dfrac{\frac{1}{3}\cdot\frac{2}{3}}{972}}} = 2.25$$

Since the observed value of the test statistic falls in the rejection region ($z = 2.25 > 1.645$), $H_0$ is rejected. There is sufficient evidence to indicate the proportion of all automobiles entering this intersection that turn left exceeds 1/3 using $\alpha = .05$.
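The one-tailed test in part b reduces to a single formula, sketched below under the assumption of Python with the standard library. The function name `one_proportion_z` is illustrative. The code keeps full precision, so the statistic comes out slightly below the rounded value reported above.

```python
import math

def one_proportion_z(count, n, p0):
    """Large-sample z statistic for H0: p = p0."""
    p_hat = count / n
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

z = one_proportion_z(357, 972, 1 / 3)   # left turns out of 972 cars
Z_05 = 1.645                            # upper-tail critical value from the table
print(round(z, 3), z > Z_05)
```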
9.16 To determine if the three proportions differ, we test:

$H_0: p_1 = p_2 = p_3 = 1/3$
$H_a$: At least two of the proportions differ

The expected cell counts are:

$$E(n_i) = np_i = 90\left(\tfrac{1}{3}\right) = 30 \quad (i = 1, 2, 3)$$

The observed and expected category counts are:

            Brighter Side Up   Darker Side Up   Aligned
Observed          58                 15            17
Expected          30                 30            30

The test statistic is:

$$\chi^2 = \sum\frac{(n_i - np_i)^2}{np_i} = \frac{(58-30)^2}{30} + \frac{(15-30)^2}{30} + \frac{(17-30)^2}{30} = 39.267$$

The rejection region requires $\alpha = .05$ in the upper tail of the $\chi^2$ distribution with df $= k - 1 = 3 - 1 = 2$. From Table 8, Appendix B, $\chi^2_{.05} = 5.99147$. The rejection region is $\chi^2 > 5.99147$.

Since the observed value of the test statistic falls in the rejection region ($\chi^2 = 39.267 > 5.99147$), $H_0$ is rejected. There is sufficient evidence to indicate at least two of the proportions differ at $\alpha = .05$.
9.18 For $k = 2$:

$$\chi^2 = \sum_{i=1}^{2}\frac{(n_i - np_i)^2}{np_i} = \frac{(n_1 - np_1)^2}{np_1} + \frac{(n_2 - np_2)^2}{np_2}$$

For a binomial experiment, $n_1 = y$, $n_2 = n - y$, $p_1 = p$, and $p_2 = 1 - p$. Then

$$\chi^2 = \frac{(y - np)^2}{np} + \frac{[(n - y) - n(1-p)]^2}{n(1-p)}$$

$$= \frac{y^2 - 2ynp + n^2p^2}{np} + \frac{(n-y)^2 - 2n(n-y)(1-p) + n^2(1-p)^2}{n(1-p)}$$

$$= \frac{y^2 - 2ynp + n^2p^2}{np} + \frac{n^2 - 2ny + y^2 - 2n^2 + 2n^2p + 2ny - 2nyp + n^2 - 2n^2p + n^2p^2}{n(1-p)}$$

$$= \frac{(1-p)(y^2 - 2ynp + n^2p^2) + p(y^2 - 2nyp + n^2p^2)}{np(1-p)}$$

$$= \frac{y^2 - y^2p - 2ynp + 2ynp^2 + n^2p^2 - n^2p^3 + y^2p - 2nyp^2 + n^2p^3}{np(1-p)}$$

$$= \frac{y^2 - 2nyp + n^2p^2}{np(1-p)} = \frac{(y - np)^2}{np(1-p)} = \frac{(y - np)^2}{npq} = z^2$$
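The identity $\chi^2 = z^2$ for $k = 2$ can also be spot-checked numerically. The sketch below assumes Python with the standard library. The function names are illustrative, and apart from the first case (the left-turn count from Exercise 9.14) the inputs are arbitrary values chosen only for the check.

```python
import math

def chi_square_two_cells(y, n, p):
    """Two-category chi-square statistic with n1 = y and n2 = n - y."""
    n1, n2 = y, n - y
    return (n1 - n * p) ** 2 / (n * p) + (n2 - n * (1 - p)) ** 2 / (n * (1 - p))

def z_squared(y, n, p):
    """Square of the large-sample binomial z statistic."""
    return (y - n * p) ** 2 / (n * p * (1 - p))

# (y, n, p) triples: the first is from Exercise 9.14, the rest are arbitrary.
cases = [(357, 972, 1 / 3), (12, 40, 0.25), (7, 10, 0.5)]
checks = [math.isclose(chi_square_two_cells(*c), z_squared(*c)) for c in cases]
print(checks)
```

A numerical check is not a proof, but it guards against sign slips in the algebra.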
9.20 a. Yes, the sampling appears to satisfy the assumptions of a multinomial experiment. The experiment contains 120 trials and 2(4) = 8 categories. Since the 120 rats were randomly selected, the trials are considered independent and the probabilities are considered constant.

b. $$E(n_{ij}) = \frac{n_{i\cdot}\,n_{\cdot j}}{n}$$

$E(n_{11}) = \dfrac{80(30)}{120} = 20 \qquad E(n_{21}) = \dfrac{40(30)}{120} = 10$

$E(n_{12}) = \dfrac{80(30)}{120} = 20 \qquad E(n_{22}) = \dfrac{40(30)}{120} = 10$

$E(n_{13}) = \dfrac{80(30)}{120} = 20 \qquad E(n_{23}) = \dfrac{40(30)}{120} = 10$

$E(n_{14}) = \dfrac{80(30)}{120} = 20 \qquad E(n_{24}) = \dfrac{40(30)}{120} = 10$
c. $$\chi^2 = \sum\frac{[n_{ij} - E(n_{ij})]^2}{E(n_{ij})}$$

$$= \frac{(27-20)^2}{20} + \frac{(20-20)^2}{20} + \frac{(19-20)^2}{20} + \frac{(14-20)^2}{20} + \frac{(3-10)^2}{10} + \frac{(10-10)^2}{10} + \frac{(11-10)^2}{10} + \frac{(16-10)^2}{10} = 12.9$$
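The expected counts in part b and the statistic in part c follow the same recipe for any two-way table. The sketch below assumes Python with the standard library and uses hypothetical helper names. The 2 x 4 table is entered from the observed counts that appear in the sum above.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square_independence(table):
    """Pearson chi-square statistic for a two-way contingency table."""
    expected = expected_counts(table)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))

rats = [[27, 20, 19, 14],   # first row of the table (row total 80)
        [3, 10, 11, 16]]    # second row of the table (row total 40)
expected = expected_counts(rats)
stat = chi_square_independence(rats)
print(expected, round(stat, 3))
```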
d. To determine if diet and presence/absence of cancer are independent, we test:

$H_0$: Diet and presence/absence of cancer are independent
$H_a$: Diet and presence/absence of cancer are dependent

The test statistic is $\chi^2 = 12.9$.

The rejection region requires $\alpha = .05$ in the upper tail of the $\chi^2$ distribution with df $= (r-1)(c-1) = (2-1)(4-1) = 3$. From Table 8, Appendix B, $\chi^2_{.05} = 7.81473$. The rejection region is $\chi^2 > 7.81473$.

Since the observed value of the test statistic falls in the rejection region ($\chi^2 = 12.9 > 7.81473$), $H_0$ is rejected. There is sufficient evidence to indicate that diet and presence/absence of cancer are not independent at $\alpha = .05$.
e. Let $p_1$ = proportion of rats on high fat/no fiber diet with cancer and let $p_2$ = proportion of rats on high fat/fiber diet with cancer.

$$\hat{p}_1 = \frac{27}{30} = .9 \qquad \hat{p}_2 = \frac{20}{30} = .667$$

The confidence interval for the difference between two proportions is:

$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$$

For confidence coefficient .95, $\alpha = 1 - .95 = .05$ and $\alpha/2 = .05/2 = .025$. From Table 5, Appendix B, $z_{.025} = 1.96$. The 95% confidence interval is:

$$(.90 - .667) \pm 1.96\sqrt{\frac{.9(.1)}{30} + \frac{.667(.333)}{30}} \Rightarrow .233 \pm .2 \Rightarrow (.033,\ .433)$$
To obtain the confidence interval for the percentage, multiply the endpoints by 100%. The interval is (3.3%, 43.3%).

We are 95% confident that the difference in the percentage of rats with cancer between those on high fat/no fiber diets and those on high fat/fiber diets is between 3.3% and 43.3%.

Since the rats were divided into groups according to diets, we assume the groups are independent.
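Because the two diet groups are treated as independent samples, the interval in part e uses the two-sample standard error rather than the multinomial one. The following is a minimal sketch, assuming Python with the standard library and an illustrative function name.

```python
import math

def two_sample_diff_ci(x1, n1, x2, n2, z):
    """Large-sample interval for p1 - p2 from two independent samples."""
    p1, p2 = x1 / n1, x2 / n2
    half = z * math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) - half, (p1 - p2) + half

# 27 of 30 rats (high fat/no fiber) versus 20 of 30 rats (high fat/fiber).
low, high = two_sample_diff_ci(27, 30, 20, 30, 1.96)
print(round(100 * low, 1), round(100 * high, 1))   # endpoints as percentages
```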
9.22 Using MINITAB, the results of the analyses are:

Tabulated statistics: Stops, Kills

Using frequencies in Fr

Rows: Stops   Columns: Kills

            1       2       3       4      5      All
1          32      33      19       5      2       91
        28.31   34.88   18.71    6.57   2.53    91.00

2          24      36      18       8      3       89
        27.69   34.12   18.29    6.43   2.47    89.00

All        56      69      37      13      5      180
        56.00   69.00   37.00   13.00   5.00   180.00

Cell Contents: Count
               Expected count

Pearson Chi-Square = 2.171, DF = 4, P-Value = 0.704
Likelihood Ratio Chi-Square = 2.182, DF = 4, P-Value = 0.702

* NOTE * 2 cells with expected counts less than 5

First, we check to see if the assumption about the expected cell counts is met. From the table, there are two expected cell counts that are less than 5. Thus, the results of the test are suspect.

To determine if the number of kills is related to whether the trial was stopped or not, we test:

$H_0$: Number of kills and whether the trial was stopped or not are independent
$H_a$: Number of kills and whether the trial was stopped or not are dependent

The test statistic is $\chi^2 = 2.171$ (from the printout).

The p-value of the test is .704. Since this p-value is so large, $H_0$ is not rejected. There is insufficient evidence to indicate that the number of kills is related to whether the trial was stopped or not at $\alpha = .10$.
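The expected-count check that makes this test suspect can be automated. The sketch below assumes Python with the standard library and recomputes the expected counts from the observed table in the printout. The variable names are illustrative, and the threshold of 5 is the rule stated in the text.

```python
kills = [[32, 33, 19, 5, 2],   # Stops = 1
         [24, 36, 18, 8, 3]]   # Stops = 2

row_totals = [sum(row) for row in kills]
col_totals = [sum(col) for col in zip(*kills)]
n = sum(row_totals)

# Expected count for each cell under independence.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Cells (row, column, expected count) that violate the "at least 5" rule.
small_cells = [(i + 1, j + 1, round(e, 2))
               for i, row in enumerate(expected)
               for j, e in enumerate(row) if e < 5]

stat = sum((o - e) ** 2 / e
           for obs_row, exp_row in zip(kills, expected)
           for o, e in zip(obs_row, exp_row))
print(small_cells, round(stat, 3))
```

One common remedy, not applied in the text, is to combine sparse categories before testing.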
9.24 a. The contingency table is shown below:
b. To determine if flight response of the geese depends on altitude of the helicopter, we test:

$H_0$: Flight response and Altitude are independent
$H_a$: Flight response and Altitude are dependent

Statistix was used to create the following printout:

Chi-Square Test for Heterogeneity or Independence
for Count = Altitude Response

                       Response
Altitude             Low        High
             +-----------+-----------+
1  Observed  |        85 |       105 |    190
   Expected  |     73.30 |    116.70 |
 Cell Chi-Sq |      1.87 |      1.17 |
             +-----------+-----------+
2  Observed  |        77 |       121 |    198
   Expected  |     76.38 |    121.62 |
 Cell Chi-Sq |      0.00 |      0.00 |
             +-----------+-----------+
3  Observed  |        17 |        59 |     76
   Expected  |     29.32 |     46.68 |
 Cell Chi-Sq |      5.18 |      3.25 |
             +-----------+-----------+
                     179         285      464

Overall Chi-Square    11.48
P-Value               0.0032
Degrees of Freedom    2

Since $\alpha = .01 >$ p-value $= .0032$, $H_0$ can be rejected. There is sufficient evidence to indicate that flight response of the geese depends on the altitude of the helicopter.
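The printout's statistic and p-value can be reproduced without statistical software in this particular case. With 2 degrees of freedom, the upper-tail area of the chi-square distribution is exactly $e^{-x/2}$. The sketch below assumes Python with the standard library and an illustrative function name, and the closed form applies only when df = 2.

```python
import math

def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Rows: altitude categories 1-3; columns: Low, High response.
altitude = [[85, 105], [77, 121], [17, 59]]
stat, df = chi_square_independence(altitude)
p_value = math.exp(-stat / 2)   # exact upper-tail area, valid only because df == 2
print(round(stat, 2), df, round(p_value, 4))
```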
The contingency table for part a (flight response by altitude, in meters) is:

                      High    Low    Total
Less than 300          105     85      190
300-600 meters         121     77      198
600 or more             59     17       76
Total                  285    179      464

c. The contingency table (flight response by lateral distance, in meters) is shown below:

                      High    Low    Total
Less than 1,000        243     37      280
1,000-2,000 meters      37     68      105
2,000-3,000 meters       4     44       48
3,000 or more            1     30       31
Total                  285    179      464
d. To determine if flight response of the geese depends on lateral distance of the helicopter, we test:

$H_0$: Flight response and Lateral distance are independent
$H_a$: Flight response and Lateral distance are dependent

Statistix was used to create the following printout:

Chi-Square Test for Heterogeneity or Independence
for Count = Lat_Cat Response

                       Response
Lat_Cat              Low        High
             +-----------+-----------+
1  Observed  |        37 |       243 |    280
   Expected  |    108.02 |    171.98 |
 Cell Chi-Sq |     46.69 |     29.33 |
             +-----------+-----------+
2  Observed  |        68 |        37 |    105
   Expected  |     40.51 |     64.49 |
 Cell Chi-Sq |     18.66 |     11.72 |
             +-----------+-----------+
3  Observed  |        44 |         4 |     48
   Expected  |     18.52 |     29.48 |
 Cell Chi-Sq |     35.07 |     22.03 |
             +-----------+-----------+
4  Observed  |        30 |         1 |     31
   Expected  |     11.96 |     19.04 |
 Cell Chi-Sq |     27.22 |     17.09 |
             +-----------+-----------+
                     179         285      464

Overall Chi-Square    207.80
P-Value               0.0000
Degrees of Freedom    3

Since $\alpha = .01 >$ p-value $= .0000$, $H_0$ can be rejected. There is sufficient evidence to indicate that flight response of the geese depends on the lateral distance of the helicopter.
9.26 a. To find the proportion of censored measurements for each of the six tractor lines, we take the number of censored measurements for each tractor line and divide it by the total number of measurements for that tractor line.

$$\hat{p}_1 = \frac{175}{6047} = 0.028 \qquad \hat{p}_2 = \frac{236}{4692} = 0.050$$

$$\hat{p}_3 = \frac{319}{7140} = 0.045 \qquad \hat{p}_4 = \frac{231}{6120} = 0.038$$
$$\hat{p}_5 = \frac{480}{10353} = 0.046 \qquad \hat{p}_6 = \frac{187}{4794} = 0.039$$
b. Statistix was used to create the following printout:

Chi-Square Test for Heterogeneity or Independence
for Count = Lat_Cat Response

                       Response
Tractor Line    Uncensored    Censored
             +-----------+-----------+
1  Observed  |       175 |      6047 |   6222
   Expected  |    257.61 |   5964.39 |
 Cell Chi-Sq |     26.49 |      1.14 |
             +-----------+-----------+
2  Observed  |       236 |      4456 |   4692
   Expected  |    194.26 |   4497.74 |
 Cell Chi-Sq |      8.97 |      0.39 |
             +-----------+-----------+
3  Observed  |       319 |      6821 |   7140
   Expected  |    295.62 |   6844.38 |
 Cell Chi-Sq |      1.85 |      0.08 |
             +-----------+-----------+
4  Observed  |       231 |      5889 |   6120
   Expected  |    253.39 |   5866.61 |
 Cell Chi-Sq |      1.98 |      0.09 |
             +-----------+-----------+
5  Observed  |       480 |      9873 |  10353
   Expected  |    428.64 |   9924.36 |
 Cell Chi-Sq |      6.15 |      0.27 |
             +-----------+-----------+
6  Observed  |       187 |      4607 |   4794
   Expected  |    198.49 |   4595.51 |
 Cell Chi-Sq |      0.66 |      0.03 |
             +-----------+-----------+
                    1628       37693    39321

Overall Chi-Square    48.09
P-Value               0.0000
Degrees of Freedom    5

To determine if the proportion of censored measurements differs for the six tractor lines, we test:

$H_0$: Measurement type and tractor line are independent
$H_a$: Measurement type and tractor line are dependent

Since $\alpha = .01 >$ p-value $= .0000$, $H_0$ can be rejected. There is sufficient evidence to indicate that the proportion of censored measurements differs for the six tractor lines.

c. Although the result is statistically significant, we have no way of knowing when a tractor line will produce a large number of censored measurements and when it will produce a small number of censored measurements. From a practical perspective, not much useful information has been learned.
9.28 a. The contingency table is:

                                  Committee
                         Acceptable   Rejected   Totals
Inspector   Acceptable      101          23        124
            Rejected         10          19         29
Totals                      111          42        153

b. Yes. To plot the percentages, first convert frequencies to percentages by dividing the numbers in each column by the column total and multiplying by 100. Also, divide the row totals by the overall total and multiply by 100.

                          Acceptable                 Rejected                  Totals
Inspector   Acceptable    101/111 x 100 = 90.99%     23/42 x 100 = 54.76%      124/153 x 100 = 81.05%
            Rejected       10/111 x 100 =  9.01%     19/42 x 100 = 45.24%       29/153 x 100 = 18.95%

From the plot, it appears there is a relationship.

c. Some preliminary calculations are:

$$\hat{E}_{11} = \frac{r_1c_1}{n} = \frac{124(111)}{153} = 89.961 \qquad \hat{E}_{12} = \frac{r_1c_2}{n} = \frac{124(42)}{153} = 34.039$$

$$\hat{E}_{21} = \frac{r_2c_1}{n} = \frac{29(111)}{153} = 21.039 \qquad \hat{E}_{22} = \frac{r_2c_2}{n} = \frac{29(42)}{153} = 7.961$$
[Figure: proportion accepted/rejected by the inspector (vertical axis, 0 to 1) for each committee classification (Acceptable, Rejected, Total).]
To determine if the inspector's classifications and the committee's classifications are related, we test:

$H_0$: The inspector's and committee's classifications are independent
$H_a$: The inspector's and committee's classifications are dependent

The test statistic is

$$\chi^2 = \sum\frac{[n_{ij} - \hat{E}_{ij}]^2}{\hat{E}_{ij}} = \frac{(101-89.961)^2}{89.961} + \frac{(23-34.039)^2}{34.039} + \frac{(10-21.039)^2}{21.039} + \frac{(19-7.961)^2}{7.961} = 26.034$$

The rejection region requires $\alpha = .05$ in the upper tail of the $\chi^2$ distribution with df $= (r-1)(c-1) = (2-1)(2-1) = 1$. From Table 8, Appendix B, $\chi^2_{.05} = 3.84146$. The rejection region is $\chi^2 > 3.84146$.
Since the observed value of the test statistic falls in the rejection region ($\chi^2 = 26.034 > 3.84146$), $H_0$ is rejected. There is sufficient evidence to indicate the inspector's and committee's classifications are dependent at $\alpha = .05$. Because the observed counts in the two agreement cells (101 and 19) exceed their expected counts under independence (89.961 and 7.961), the inspector and committee tend to make the same decisions.
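For a 2 x 2 table the statistic can be cross-checked with the standard shortcut $\chi^2 = n(ad - bc)^2 / [(a+b)(c+d)(a+c)(b+d)]$. This is algebraically equivalent to the cell-by-cell sum but is not the method used in the solution above. The sketch assumes Python with the standard library, and the function name is illustrative.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the 2 x 2 table [[a, b], [c, d]] via the
    shortcut formula (no continuity correction)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Rows: inspector (Acceptable, Rejected); columns: committee (Acceptable, Rejected).
stat = chi_square_2x2(101, 23, 10, 19)
print(round(stat, 3))
```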
9.30 We wish to test:

$H_0: p_1 = p_2 = p_3 = p_4 = p_5 = p_6 = p_7 = 1/7$
$H_a$: At least two of these proportions are different from $p_1 = p_2 = p_3 = p_4 = p_5 = p_6 = p_7 = 1/7$

Our statistic is

$$\chi^2 = \sum_{i=1}^{7}\frac{(O_i - e_i)^2}{e_i}$$

The observed counts are found by using the table information:

$O_i$ = (number of specimens)(percentage with manganese nodules)

The expected counts are found by $e_i = n_ip_i$.

These results are summarized as follows:

Age                            Observed              Expected
Miocene-recent                 389(.059) = 23        389(1/7) = 55.6
Oligocene                      140(.179) = 25        140(1/7) = 20.0
Eocene                         214(.164) = 35        214(1/7) = 30.6
Paleocene                       84(.214) = 18         84(1/7) = 12.0
Late Cretaceous                247(.211) = 52        247(1/7) = 35.3
Early and Middle Cretaceous   1120(.142) = 159      1120(1/7) = 160.0
Jurassic                        99(.110) = 11         99(1/7) = 14.1
Total                                      323

$$\chi^2 = \frac{(23-55.6)^2}{55.6} + \frac{(25-20.0)^2}{20.0} + \cdots + \frac{(11-14.1)^2}{14.1} = 32.59$$
The rejection region requires $\alpha = .05$ in the upper tail of the $\chi^2$ distribution with $k - 1 = 7 - 1 = 6$ df. From Table 8, Appendix B, $\chi^2_{.05} = 12.5916$. Reject $H_0$ if $\chi^2 > 12.5916$.

Since the observed value of the test statistic falls in the rejection region ($\chi^2 = 32.59 > 12.5916$), $H_0$ is rejected.
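The observed and expected counts in this solution are derived quantities, so a short script makes the rounding explicit. The sketch below assumes Python with the standard library. It rounds observed counts to whole specimens and expected counts to one decimal place to mirror the figures used in the solution, and the variable names are illustrative.

```python
# (age, number of specimens, proportion with manganese nodules) from the solution.
specimens = [
    ("Miocene-recent", 389, 0.059),
    ("Oligocene", 140, 0.179),
    ("Eocene", 214, 0.164),
    ("Paleocene", 84, 0.214),
    ("Late Cretaceous", 247, 0.211),
    ("Early and Middle Cretaceous", 1120, 0.142),
    ("Jurassic", 99, 0.110),
]

# Observed counts rounded to whole specimens; expected counts use p_i = 1/7
# and are rounded to one decimal place, as in the solution.
observed = [round(n_i * pct) for _, n_i, pct in specimens]
expected = [round(n_i / 7, 1) for _, n_i, _ in specimens]

stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(observed, expected, round(stat, 2))
```

Using unrounded expected counts changes the statistic only in the second decimal place and does not affect the conclusion.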
9.32 a. To determine if the percentages of the different types of programming statements differ for the two languages, we test:

$H_0$: The proportions of the different types of programming statements are the same for the two languages
$H_a$: The proportions of the different types of programming statements are different for the two languages

The expected category counts are:

$$E(n_{ij}) = \frac{n_{i\cdot}\,n_{\cdot j}}{n}$$

$$E(n_{11}) = \frac{2170(10{,}412)}{19{,}882} = 1136.407$$

$$E(n_{12}) = \frac{2170(9470)}{19{,}882} = 1033.593$$

$$\vdots$$

$$E(n_{52}) = \frac{726(9470)}{19{,}882} = 345.801$$
The observed and expected category counts are (expected counts in parentheses):

                 ALGOL               PASCAL              Totals
IF                 125 (1136.407)    2,045 (1033.593)     2,170
FOR                968 (690.223)       350 (627.777)      1,318
IO                 135 (1037.953)    1,847 (944.047)      1,982
IF ASSIGNMENT    8,923 (7167.218)    4,763 (6518.782)    13,686
Other              261 (380.199)       465 (345.801)        726
Totals          10,412               9,470               19,882

The test statistic is:

$$\chi^2 = \sum\frac{[n_{ij} - E(n_{ij})]^2}{E(n_{ij})} = \frac{(125-1136.407)^2}{1136.407} + \frac{(2045-1033.593)^2}{1033.593} + \cdots + \frac{(465-345.801)^2}{345.801} = 4755.1993$$
The rejection region requires $\alpha = .05$ in the upper tail of the $\chi^2$ distribution with df $= (r-1)(c-1) = (5-1)(2-1) = 4$. From Table 8, Appendix B, $\chi^2_{.05} = 9.48773$. The rejection region is $\chi^2 > 9.48773$.

Since the observed value of the test statistic falls in the rejection region ($\chi^2 = 4755.1993 > 9.48773$), $H_0$ is rejected. There is sufficient evidence to indicate the percentages of the different types of programming statements differ for the two languages at $\alpha = .05$.
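b. The form of the confidence interval for $(p_A - p_P)$ is:

$$(\hat{p}_A - \hat{p}_P) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_A(1-\hat{p}_A)}{n_A} + \frac{\hat{p}_P(1-\hat{p}_P)}{n_P}}$$

$$\hat{p}_A = \frac{X_A}{n_A} = \frac{8923}{10{,}412} = .857 \qquad \hat{p}_P = \frac{X_P}{n_P} = \frac{4763}{9470} = .503$$

For confidence coefficient .95, $\alpha = 1 - .95 = .05$ and $\alpha/2 = .05/2 = .025$. From Table 5, Appendix B, $z_{.025} = 1.96$. The confidence interval is:

$$(.857 - .503) \pm 1.96\sqrt{\frac{.857(1-.857)}{10{,}412} + \frac{.503(1-.503)}{9470}} \Rightarrow .354 \pm .0121 \Rightarrow (.3419,\ .3661)$$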
9.34 a. The form of the contingency tables will all be:

                        Predicted EVG
                      No          Yes       Total
Defect   FALSE     439 + y      10 - y       449
         TRUE       49 - y         y          49
Total                488          10         498

b. The hypergeometric formula for these tables is:

$$\frac{\dbinom{449}{10-y}\dbinom{49}{y}}{\dbinom{498}{10}}, \quad\text{where } y = 0, 1, 2, \ldots, 10$$
Because of the large sample size, the factorials make these probabilities difficult to calculate by hand. The resulting probabilities are shown below:

c. The Fisher's exact test p-value can be found by adding the probabilities of the outcomes at least as contradictory as the one observed. p-value $= P(y = 2 \text{ or } 3 \text{ or } \ldots \text{ or } 10) = 0.2572$.

d. We see that these two probabilities are equal.
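Exact integer arithmetic sidesteps the large factorials mentioned in part b. The sketch below assumes Python 3.8 or later for `math.comb`. The function name and the keyword-argument names are illustrative, while the margins 449, 49 and 10 come from the table in part a.

```python
from math import comb

def hypergeom_pmf(y, defects=49, non_defects=449, predicted_yes=10):
    """P(y of the modules predicted 'Yes' are truly defective) with all
    table margins held fixed."""
    total = defects + non_defects
    return (comb(non_defects, predicted_yes - y) * comb(defects, y)
            / comb(total, predicted_yes))

probs = [hypergeom_pmf(y) for y in range(11)]
p_value = sum(probs[2:])          # P(y = 2 or 3 or ... or 10)
for y, p in enumerate(probs):
    print(y, round(p, 4))
print("p-value:", round(p_value, 4))
```

Python integers have arbitrary precision, so the binomial coefficients are computed exactly before the single division.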
9.36 a. The form of the confidence interval is:

$$\hat{p}_i \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_i(1-\hat{p}_i)}{n}}$$

$$\hat{p}_1 = .60, \quad \hat{p}_2 = .23, \quad \hat{p}_3 = .17$$

For confidence coefficient .95, $\alpha = 1 - .95 = .05$ and $\alpha/2 = .05/2 = .025$. From Table 5, Appendix B, $z_{.025} = 1.96$. The 95% confidence intervals are:

For $p_1$: $.60 \pm 1.96\sqrt{\dfrac{.60(.40)}{1132}} \Rightarrow .60 \pm .029 \Rightarrow (.571,\ .629)$

For $p_2$: $.23 \pm 1.96\sqrt{\dfrac{.23(.77)}{1132}} \Rightarrow .23 \pm .025 \Rightarrow (.205,\ .255)$

For $p_3$: $.17 \pm 1.96\sqrt{\dfrac{.17(.83)}{1132}} \Rightarrow .17 \pm .022 \Rightarrow (.148,\ .192)$

b. We want to test:

$H_0: p_1 = .8,\ p_2 = .1,\ \text{and } p_3 = .1$
$H_a$: At least two proportions are different than specified

The expected counts in each category are:

$E(n_1) = np_1 = 1132(.8) = 905.6$
$E(n_2) = np_2 = 1132(.1) = 113.2$
$E(n_3) = np_3 = 1132(.1) = 113.2$
Probabilities for Exercise 9.34b:

 y     P(y)
 0    0.3514
 1    0.3914
 2    0.1917
 3    0.0544
 4    0.0099
 5    0.0012
 6    0.0001
 7    0
 8    0
 9    0
10    0
The observed and expected category counts (Exercise 9.36b) are:

            Appropriate   Inappropriate   Avoidable
Observed        679            261           192
Expected      905.6          113.2         113.2

The test statistic is:

$$\chi^2 = \sum\frac{(n_i - np_i)^2}{np_i} = \frac{(679-905.6)^2}{905.6} + \frac{(261-113.2)^2}{113.2} + \frac{(192-113.2)^2}{113.2} = 304.5$$

The rejection region requires $\alpha = .10$ in the upper tail of the $\chi^2$ distribution with df $= k - 1 = 3 - 1 = 2$. From Table 8, Appendix B, $\chi^2_{.10} = 4.60517$. The rejection region is $\chi^2 > 4.60517$.

Since the observed value of the test statistic falls in the rejection region ($\chi^2 = 304.5 > 4.60517$), $H_0$ is rejected. There is sufficient evidence to indicate at least two proportions are different than specified at $\alpha = .10$.
9.38 The Statistix printout for the analysis appears below:

Chi-Square Test for Heterogeneity or Independence
for count = Year abuse

                                  abuse
Year                  1           2           3           4
             +-----------+-----------+-----------+-----------+
1  Observed  |         7 |         5 |         9 |         8 |    29
   Expected  |      9.61 |      8.22 |      5.74 |      5.43 |
 Cell Chi-Sq |      0.71 |      1.26 |      1.85 |      1.22 |
             +-----------+-----------+-----------+-----------+
2  Observed  |        22 |        18 |         6 |         6 |    52
   Expected  |     17.24 |     14.74 |     10.29 |      9.73 |
 Cell Chi-Sq |      1.31 |      0.72 |      1.79 |      1.43 |
             +-----------+-----------+-----------+-----------+
3  Observed  |        12 |        15 |         6 |        12 |    45
   Expected  |     14.92 |     12.75 |      8.90 |      8.42 |
 Cell Chi-Sq |      0.57 |      0.40 |      0.95 |      1.52 |
             +-----------+-----------+-----------+-----------+
4  Observed  |        21 |        15 |        16 |         9 |    61
   Expected  |     20.22 |     17.29 |     12.07 |     11.42 |
 Cell Chi-Sq |      0.03 |      0.30 |      1.28 |      0.51 |
             +-----------+-----------+-----------+-----------+
                      62          53          37          35     187

Overall Chi-Square    15.86
P-Value               0.0699
Degrees of Freedom    9

Cases Included 16    Missing Cases 0

To determine if the proportions of different types of abuse are changing over time, we test:

$H_0$: Types of abuse and year are independent
$H_a$: Types of abuse and year are dependent
The expected category counts are shown in the printout.

The test statistic is $\chi^2 = \sum\dfrac{[n_{ij} - E(n_{ij})]^2}{E(n_{ij})} = 15.86$ (from the printout).

The rejection region requires $\alpha = .05$ in the upper tail of the $\chi^2$ distribution with df $= (r-1)(c-1) = (4-1)(4-1) = 9$. From Table 8, Appendix B, $\chi^2_{.05} = 16.9190$. The rejection region is $\chi^2 > 16.9190$.

Since the observed value of the test statistic does not fall in the rejection region ($\chi^2 = 15.859 \not> 16.9190$), $H_0$ is not rejected. There is insufficient evidence to indicate the proportions of different types of abuse are changing over time at $\alpha = .05$.
9.40 a. To determine if pesticide depends on orchard type, we test:

$H_0$: Pesticide and orchard type are independent
$H_a$: Pesticide and orchard type are dependent

The test statistic is $\chi^2 = 31000.416$ (from the printout). The p-value for the test is $p = .000$. At $\alpha = .01$, $\alpha >$ p-value, and we reject $H_0$. There is sufficient evidence to indicate that pesticide used and orchard type are dependent.

PHStat was used to conduct the desired analysis and the following printout was created:

Observed Frequencies
                          Column variable
Row variable      Almonds     Peaches    Nectarines      Total
Chlor.              41077        4419         11594      57090
Diazinon           102935        9651          5928     118514
Methid.             21240        5198          1790      28228
Parathion          136064       53384         24417     213865
Total              301316       72652         43729     417697

Expected Frequencies
                          Column variable
Row variable         Almonds        Peaches     Nectarines      Total
Chlor.           41183.27505    9929.931697     5976.79325      57090
Diazinon         85492.98756    20613.69636    12407.31608     118514
Methid.          20362.96178     4909.82855    2955.209666      28228
Parathion        154276.7756    37198.54339      22389.681     213865
Total                 301316          72652          43729     417697
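Data
Level of Significance            0.01
Number of Rows                   4
Number of Columns                3
Degrees of Freedom               6

Results
Critical Value                   16.8118718
Chi-Square Test Statistic        31000.41584
p-Value                          0

b. We will calculate 95% confidence intervals for the rate of parathion application for the three orchard types.

Almonds: $\hat{p} = \dfrac{136{,}064}{301{,}316} = .45$

$$\hat{p} \pm z_{.025}\sqrt{\frac{\hat{p}\hat{q}}{n}} \Rightarrow .45 \pm 1.96\sqrt{\frac{.45(.55)}{301{,}316}} \Rightarrow .45 \pm .002$$

Nectarines: $\hat{p} = \dfrac{24{,}417}{43{,}729} = .56$

$$\hat{p} \pm z_{.025}\sqrt{\frac{\hat{p}\hat{q}}{n}} \Rightarrow .56 \pm 1.96\sqrt{\frac{.56(.44)}{43{,}729}} \Rightarrow .56 \pm .005$$

Peaches: $\hat{p} = \dfrac{53{,}384}{72{,}652} = .73$

$$\hat{p} \pm z_{.025}\sqrt{\frac{\hat{p}\hat{q}}{n}} \Rightarrow .73 \pm 1.96\sqrt{\frac{.73(.27)}{72{,}652}} \Rightarrow .73 \pm .003$$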
9.42 a. Test $H_0: p_1 = p_2 = .5$
$H_a: p_1 \ne p_2$

The test statistic is:

$$\chi^2 = \sum_{i=1}^{2}\sum_{j=1}^{2}\frac{(O_{ij} - e_{ij})^2}{e_{ij}}$$
c. Fisher's exact test gives a p-value of $p = 0.0173$. When testing at $\alpha = .01$, $H_0$ cannot be rejected. There is insufficient evidence to detect a difference in proportions, which agrees with our conclusion in part a.
9.44 The Statistix printout for the analysis is shown below:

Chi-Square Test for Heterogeneity or Independence
for count = Technology Group

                                  Group
Technology            1           2           3           4
             +-----------+-----------+-----------+-----------+
1  Observed  |        21 |        42 |        11 |        25 |    99
   Expected  |     24.75 |     24.75 |     24.75 |     24.75 |
 Cell Chi-Sq |      0.57 |     12.02 |      7.64 |      0.00 |
             +-----------+-----------+-----------+-----------+
2  Observed  |        18 |         2 |        16 |        13 |    49
   Expected  |     12.25 |     12.25 |     12.25 |     12.25 |
 Cell Chi-Sq |      2.70 |      8.58 |      1.15 |      0.05 |
             +-----------+-----------+-----------+-----------+
3  Observed  |        11 |         6 |        23 |        12 |    52
   Expected  |     13.00 |     13.00 |     13.00 |     13.00 |
 Cell Chi-Sq |      0.31 |      3.77 |      7.69 |      0.08 |
             +-----------+-----------+-----------+-----------+
                      50          50          50          50     200

Overall Chi-Square    44.548
P-Value               0.0000
Degrees of Freedom    6

Cases Included 12    Missing Cases 0

a. To determine if public opinion regarding the choice of future technology options for generating electricity differs among the four groups, we test:

$H_0$: Choice and group are independent
$H_a$: Choice and group are dependent

The test statistic is $\chi^2 = 44.548$.

The rejection region requires $\alpha = .10$ in the upper tail of the $\chi^2$ distribution with df $= (r-1)(c-1) = (3-1)(4-1) = 6$. From Table 8, Appendix B, $\chi^2_{.10} = 10.6446$. The rejection region is $\chi^2 > 10.6446$.

Since the observed value of the test statistic falls in the rejection region ($\chi^2 = 44.548 > 10.6446$), $H_0$ is rejected. There is sufficient evidence to indicate that public opinion does differ among the four groups at $\alpha = .10$.

b. Let $p_1$ = proportion supporting the coal option and $p_2$ = proportion supporting the nuclear option.
To determine if the proportion supporting the coal option exceeds the proportion supporting the nuclear option, we test:

$H_0: p_1 - p_2 = 0$
$H_a: p_1 - p_2 > 0$

$$\hat{p}_1 = \frac{99}{200} = .495 \qquad \hat{p}_2 = \frac{49}{200} = .245 \qquad \hat{p} = \frac{99 + 49}{200 + 200} = .37$$

The rejection region requires $\alpha = .10$ in the upper tail of the $z$ distribution. From Table 5, Appendix B, $z_{.10} = 1.282$. The rejection region is $z > 1.282$.

The test statistic is:

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - D_0}{\sqrt{\dfrac{\hat{p}(1-\hat{p}) + \hat{p}(1-\hat{p}) + 2\hat{p}^2}{n}}} = \frac{(.495 - .245) - 0}{\sqrt{\dfrac{.37(.63) + .37(.63) + 2(.37)^2}{200}}} = 4.11$$

Since the observed value of the test statistic falls in the rejection region ($z = 4.11 > 1.282$), $H_0$ is rejected. There is sufficient evidence to indicate the proportion supporting coal exceeds the proportion supporting nuclear at $\alpha = .10$.
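The statistic in part b compares two categories of the same multinomial sample, so its standard error uses the pooled estimate in all three variance terms. The sketch below assumes Python with the standard library, and the function name is illustrative. Note that $2\hat{p}(1-\hat{p}) + 2\hat{p}^2$ simplifies to $2\hat{p}$, which the code uses as a cross-check.

```python
import math

def multinomial_diff_z(x1, x2, n):
    """z statistic for H0: p1 - p2 = 0 when both counts come from one
    multinomial sample of size n, using the pooled proportion."""
    p1, p2 = x1 / n, x2 / n
    pooled = (x1 + x2) / (2 * n)
    variance = (2 * pooled * (1 - pooled) + 2 * pooled ** 2) / n
    return (p1 - p2) / math.sqrt(variance), pooled

z, pooled = multinomial_diff_z(99, 49, 200)    # coal versus nuclear
z_simplified = (99 - 49) / 200 / math.sqrt(2 * pooled / 200)
print(round(z, 2), round(pooled, 2))
```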
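c. The form of the confidence interval is:

$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}, \qquad \hat{p} = \frac{16}{50} = .32$$

For confidence coefficient .90, $\alpha = 1 - .90 = .10$ and $\alpha/2 = .10/2 = .05$. From Table 5, Appendix B, $z_{.05} = 1.645$. The 90% confidence interval is:

$$.32 \pm 1.645\sqrt{\frac{.32(1-.32)}{50}} \Rightarrow .32 \pm .109 \Rightarrow (.211,\ .429)$$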
9.46 The data were tested using Fisher's exact test and the results are shown below:

Two by Two Tables

+----------+----------+
|    10    |     6    |   16
+----------+----------+
|    12    |     2    |   14
+----------+----------+
     22          8        30

Fisher Exact Tests:  Lower Tail 0.1541   Upper Tail 0.0715   Two Tailed 0.2255
To determine if the fidelity and selectivity are dependent, we test:

$H_0$: Fidelity and Selectivity are independent
$H_a$: Fidelity and Selectivity are dependent

The p-value for the test is 0.2255.

When testing at $\alpha = .05$, $H_0$ cannot be rejected. There is insufficient evidence to indicate that fidelity and selectivity are dependent.
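The two-tailed p-value in the printout can be reproduced from hypergeometric probabilities. The sketch below assumes Python 3.8 or later for `math.comb`. It uses one common two-sided convention, which sums the probabilities of all tables no more likely than the observed one. That convention is an assumption about how the software defines "Two Tailed", although it matches the printed value here. The function name is illustrative.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2 x 2 table [[a, b], [c, d]]:
    sum of the probabilities of all tables with the same margins that are
    no more likely than the observed table."""
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2

    def prob(x):
        # Probability that the top-left cell equals x, margins fixed.
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    lo, hi = max(0, c1 - r2), min(r1, c1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_observed * (1 + 1e-9))

p_two_sided = fisher_exact_two_sided(10, 6, 12, 2)
print(round(p_two_sided, 4))
```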
9.48 Some preliminary calculations are:

$$e_1 = e_2 = e_3 = e_4 = e_5 = e_6 = e_7 = e_8 = np_i = 714(.125) = 89.25$$

a. To determine if the probabilities of worker accidents are higher for some time periods, we test:

$H_0: p_1 = p_2 = p_3 = p_4 = p_5 = p_6 = p_7 = p_8 = .125$
$H_a$: At least two of the cell probabilities differ from each other

The test statistic is:

$$\chi^2 = \sum\frac{(O_i - e_i)^2}{e_i} = \frac{(93-89.25)^2}{89.25} + \frac{(71-89.25)^2}{89.25} + \frac{(79-89.25)^2}{89.25} + \cdots + \frac{(110-89.25)^2}{89.25} = 15.905$$

The rejection region requires $\alpha = .10$ in the upper tail of the $\chi^2$ distribution with df $= k - 1 = 8 - 1 = 7$. From Table 8, Appendix B, $\chi^2_{.10} = 12.0170$. The rejection region is $\chi^2 > 12.0170$.

Since the observed value of the test statistic falls in the rejection region ($\chi^2 = 15.905 > 12.0170$), $H_0$ is rejected. There is sufficient evidence to indicate the probabilities of worker accidents are higher in some time periods at $\alpha = .10$.

b. $$\hat{p}_1 = \frac{98 + 89 + 102 + 110}{714} = \frac{399}{714} = .5588$$

$H_0: p_1 = .5$
$H_a: p_1 > .5$
The test statistic is

$$z = \frac{\hat{p}_1 - p_{10}}{\sqrt{\dfrac{p_{10}\,q_{10}}{n}}} = \frac{.5588 - .5}{\sqrt{\dfrac{.5(.5)}{714}}} = 3.14$$

The rejection region requires $\alpha = .10$ in the upper tail of the $z$ distribution. From Table 5, Appendix B, $z_{.10} = 1.28$. The rejection region is $z > 1.28$.

Since the observed value of the test statistic falls in the rejection region ($z = 3.14 > 1.28$), $H_0$ is rejected. There is sufficient evidence to indicate the probability of an accident during the last 4 hours of a shift is greater than during the first 4 hours at $\alpha = .10$.