View
215
Download
0
Tags:
Embed Size (px)
Citation preview
1
Pertemuan 09Pengujian Hipotesis Proporsi
dan Data Katagorik
Matakuliah : A0392 – Statistik Ekonomi
Tahun : 2006
2
Outline Materi :
• Uji hipotesis proporsi
• Uji hipotesis beda proporsi
• Uji kebebasan data katagorik
3
Summary of Test Statistics to be Used Summary of Test Statistics to be Used in ain a
Hypothesis Test about a Population Hypothesis Test about a Population MeanMean
n n >> 30 ? 30 ?
known ?known ?
Popul. Popul. approx.approx.normal normal
?? known ?known ?
Use Use ss to toestimate estimate
Use Use ss to toestimate estimate
Increase Increase nnto to >> 30 30/
xz
n
/
xz
n
/
xzs n
/
xzs n
/
xz
n
/
xz
n
/
xts n
/
xts n
YesYes
YesYes
YesYes
YesYes
NoNo
NoNo
NoNo
NoNo
4
A Summary of Forms for Null and Alternative Hypotheses about a
Population Proportion
• The equality part of the hypotheses always appears in the null hypothesis.
• In general, a hypothesis test about the value of a population proportion p must take one of the following three forms (where p0 is the hypothesized value of the population proportion).
H0: p > p0 H0: p < p0 H0: p = p0
Ha: p < p0 Ha: p > p0 Ha: p p0
5
Tests about a Population Proportion:Large-Sample Case (np > 5 and n(1 - p) >
5)• Test Statistic
where:
• Rejection Rule
One-Tailed Two-Tailed
H0: pp Reject H0 if z > z
H0: pp Reject H0 if z < -z
H0: pp Reject H0 if |z| > z
zp p
p
0
z
p p
p
0
pp p
n
0 01( ) pp p
n
0 01( )
6
Example: NSC
• Two-Tailed Test about a Population Proportion: Large n
For a Christmas and New Year’s week, the National Safety Council estimated that 500 people would be killed and 25,000 injured on the nation’s roads. The NSC claimed that 50% of the accidents would be caused by drunk driving.
A sample of 120 accidents showed that 67 were caused by drunk driving. Use these data to test the NSC’s claim with = 0.05.
7
Example: NSC
• Two-Tailed Test about a Population Proportion: Large n– Hypothesis
H0: p = .5
Ha: p .5
– Test Statistic
0 (67/ 120) .51.278
.045644p
p pz
0 (67/ 120) .51.278
.045644p
p pz
0 0(1 ) .5(1 .5).045644
120p
p pn
0 0(1 ) .5(1 .5).045644
120p
p pn
8
Example: NSC
• Two-Tailed Test about a Population Proportion: Large n– Rejection Rule
Reject H0 if z < -1.96 or z > 1.96
– Conclusion
Do not reject H0.
For z = 1.278, the p-value is .201. If we reject
H0, we exceed the maximum allowed risk of committing a Type I error (p-value > .050).
9
Hypothesis Testing and Decision Making
• In many decision-making situations the decision maker may want, and in some cases may be forced, to take action with both the conclusion do not reject H0 and the conclusion reject H0.
• In such situations, it is recommended that the hypothesis-testing procedure be extended to include consideration of making a Type II error.
10
Calculating the Probability of a Type II Error in Hypothesis Tests about
a Population Mean
1. Formulate the null and alternative hypotheses.
2. Use the level of significance to establish a rejection rule based on the test statistic.
3. Using the rejection rule, solve for the value of the sample mean that identifies the rejection region.
4. Use the results from step 3 to state the values of the sample mean that lead to the acceptance of H0; this defines the acceptance region.
5. Using the sampling distribution of for any value of from the alternative hypothesis, and the acceptance region from step 4, compute the probability that the sample mean will be in the acceptance region.
xx
11
Example: Metro EMS (revisited)
• Calculating the Probability of a Type II Error
1. Hypotheses are: H0: and Ha:
2. Rejection rule is: Reject H0 if z > 1.645
3. Value of the sample mean that identifies the rejection region:
4. We will accept H0 when x < 12.8323
121.645
3.2/ 40x
z
12
1.6453.2/ 40
xz
3.212 1.645 12.8323
40x
3.212 1.645 12.8323
40x
12
Example: Metro EMS (revisited)
• Calculating the Probability of a Type II Error
5. Probabilities that the sample mean will be in the acceptance region:
Values of 1-
14.0 -2.31 .0104 .989613.6 -1.52 .0643 .935713.2 -0.73 .2327 .767312.83 0.00 .5000 .500012.8 0.06 .5239 .476112.4 0.85 .8023 .197712.0001 1.645 .9500 .0500
12.83233.2/ 40
z
12.83233.2/ 40
z
13
Example: Metro EMS (revisited)
• Calculating the Probability of a Type II Error
Observations about the preceding table:– When the true population mean is close to
the null hypothesis value of 12, there is a high probability that we will make a Type II error.
– When the true population mean is far above the null hypothesis value of 12, there is a low probability that we will make a Type II error.
14
Power of the Test
• The probability of correctly rejecting H0 when it is false is called the power of the test.
• For any particular value of , the power is 1 – .
• We can show graphically the power associated with each value of ; such a graph is called a power curve.
15
Determining the Sample Sizefor a Hypothesis Test About a Population
Mean
where
z = z value providing an area of in the tail
z = z value providing an area of in the tail= population standard deviation
0 = value of the population mean in H0
a = value of the population mean used for the Type II error
Note: In a two-tailed hypothesis test, use z /2 not z
nz z
a
( )
( )
2 2
02
nz z
a
( )
( )
2 2
02
16
Relationship among , , and n
• Once two of the three values are known, the other can be computed.
• For a given level of significance , increasing the sample size n will reduce .
• For a given sample size n, decreasing will increase , whereas increasing will decrease .
17
Inferences About the Difference Between the Proportions of Two
Populations
• Sampling Distribution of
• Interval Estimation of p1 - p2
• Hypothesis Tests about p1 - p2
p p1 2p p1 2
18
• Expected Value
• Standard Deviation
• Distribution Form
If the sample sizes are large (n1p1, n1(1 - p1), n2p2,
and n2(1 - p2) are all greater than or equal to 5), thesampling distribution of can be approximatedby a normal probability distribution.
Sampling Distribution of p p1 2p p1 2
E p p p p( )1 2 1 2 E p p p p( )1 2 1 2
p pp pn
p pn1 2
1 1
1
2 2
2
1 1 ( ) ( ) p p
p pn
p pn1 2
1 1
1
2 2
2
1 1 ( ) ( )
p p1 2p p1 2
19
Interval Estimation of p1 - p2
• Interval Estimate
• Point Estimator of
p p z p p1 2 2 1 2 /p p z p p1 2 2 1 2 /
p p1 2 p p1 2
sp pn
p pnp p1 2
1 1
1
2 2
2
1 1 ( ) ( )
sp pn
p pnp p1 2
1 1
1
2 2
2
1 1 ( ) ( )
20
Example: MRA
MRA (Market Research Associates) is conducting research to evaluate the effectiveness of a client’s new advertising campaign. Before the new campaign began, a telephone survey of 150 households in the test market area showed 60 households “aware” of the client’s product. The new campaign has been initiated with TV and newspaper advertisements running for three weeks. A survey conducted immediately after the new campaign showed 120 of 250 households “aware” of the client’s product.
Does the data support the position that the advertising campaign has provided an increased awareness of the client’s product?
21
Example: MRA
• Point Estimator of the Difference Between the Proportions of Two Populations
p1 = proportion of the population of households “aware” of the product after the new campaign
p2 = proportion of the population of households “aware” of the product before the new
campaign = sample proportion of households “aware” of the
product after the new campaign = sample proportion of households “aware” of the
product before the new campaign
p p p p1 2 1 2120250
60150
48 40 08 . . .p p p p1 2 1 2120250
60150
48 40 08 . . .
p1p1
p2p2
22
Example: MRA
• Interval Estimate of p1 - p2: Large-Sample Case
For = .05, z.025 = 1.96:
.08 + 1.96(.0510)
.08 + .10
or -.02 to +.18– Conclusion
At a 95% confidence level, the interval estimate of the difference between the proportion of households aware of the client’s product before and after the new advertising campaign is -.02 to +.18.
. . .. (. ) . (. )
48 40 1 9648 52
25040 60
150 . . .
. (. ) . (. )48 40 1 96
48 52250
40 60150
23
Hypothesis Tests about p1 - p2
• Hypotheses
H0: p1 - p2 < 0
Ha: p1 - p2 > 0
• Test statistic
• Point Estimator of where p1 = p2
where:
zp p p p
p p
( ) ( )1 2 1 2
1 2
zp p p p
p p
( ) ( )1 2 1 2
1 2
s p p n np p1 21 1 11 2 ( )( )s p p n np p1 21 1 11 2 ( )( )
pn p n pn n
1 1 2 2
1 2
pn p n pn n
1 1 2 2
1 2
p p1 2 p p1 2
24
Example: MRA
• Hypothesis Tests about p1 - p2
Can we conclude, using a .05 level of significance, that the proportion of households aware of the client’s product increased after the new advertising campaign?
p1 = proportion of the population of households
“aware” of the product after the new campaign
p2 = proportion of the population of households
“aware” of the product before the new campaign
– Hypotheses H0: p1 - p2 < 0
Ha: p1 - p2 > 0
25
Example: MRA
• Hypothesis Tests about p1 - p2
– Rejection Rule Reject H0 if z > 1.645
– Test Statistic
– Conclusion Do not reject H0.
p
250 48 150 40250 150
180400
45(. ) (. )
.p
250 48 150 40250 150
180400
45(. ) (. )
.
sp p1 245 55 1
2501150 0514 . (. )( ) .sp p1 2
45 55 1250
1150 0514 . (. )( ) .
z
(. . )
..
..
48 40 00514
080514
1 56z
(. . )
..
..
48 40 00514
080514
1 56
26
Test of Independence: Contingency Test of Independence: Contingency TablesTables
1. Set up the null and alternative hypotheses.1. Set up the null and alternative hypotheses.
2. Select a random sample and record the 2. Select a random sample and record the observedobserved
frequency, frequency, ffij ij , for each cell of the contingency , for each cell of the contingency table.table.
3. Compute the expected frequency, 3. Compute the expected frequency, eeij ij , for , for each cell. each cell.
ei j
ij (Row Total )(Column Total )
Sample Sizee
i jij
(Row Total )(Column Total ) Sample Size
27
Test of Independence: Contingency Tables
4. Compute the test statistic.
5. Reject H0 if (where is the significance level and with n rows and m columns there are
(n - 1)(m - 1) degrees of freedom).
22
( )f e
eij ij
ijji2
2
( )f e
eij ij
ijji
2 2 2 2
28
Example: Finger Lakes Homes (B)
• Contingency Table (Independence) Test Each home sold can be classified according to
price and to style. Finger Lakes Homes’ manager would like to determine if the price of the home and the style of the home are independent variables.
The number of homes sold for each model and price for the past two years is shown below. For convenience, the price of the home is listed as either $65,000 or less or more than $65,000.
Price Colonial Ranch Split-Level A-Frame < $65,000 18 6 19 12 > $65,000 12 14 16 3
29
Contingency Table (Independence) TestContingency Table (Independence) Test
• HypothesesHypotheses
HH00: Price of the home : Price of the home isis independent of independent of the style the style of the home that is purchased of the home that is purchased
HHaa: Price of the home : Price of the home is notis not independent of theindependent of the
style of the home that is purchasedstyle of the home that is purchased
• Expected FrequenciesExpected Frequencies
PricePrice Colonial Ranch Split-Level A-Frame Colonial Ranch Split-Level A-Frame Total Total
<< $99K 18 $99K 18 6 19 12 6 19 12 55 55
> $99K 12 14 16 > $99K 12 14 16 3 45 3 45
Total 30 20 35 Total 30 20 35 15 10015 100
Example: Finger Lakes Homes (B)Example: Finger Lakes Homes (B)
30
• Contingency Table (Independence) Test– Test Statistic
= .1364 + 2.2727 + . . . + 2.0833 = 9.1486– Rejection Rule
With = .05 and (2 - 1)(4 - 1) = 3 d.f.,
Reject H0 if 2 > 7.81
– Conclusion
We reject H0, the assumption that the price of the home is independent of the style of the home
that is purchased.
22 2 218 16 5
16 56 11
113 6 75
6 75 ( . )
.( )
. .( . )
. . 2
2 2 218 16 516 5
6 1111
3 6 756 75
( . ).
( ). .
( . ).
.
. .052 7 81. .052 7 81
Example: Finger Lakes Homes (B)