Upload
leslie-baldwin
View
224
Download
2
Tags:
Embed Size (px)
Citation preview
2
Outline Materi :
• Uji hipotesis rata-rata
• Uji hipotesis beda rata-rata peubah bebas
• Uji hipotesis beda rata-rata peubah berpasangan
3
Inference from Small Samples
I
Some graphic screen captures from Seeing Statistics ®Some images © 2001-(current year) www.arttoday.com
4
Introduction
• When the sample size is small, the estimation and testing procedures of large samples are not appropriate.
• There are equivalent small sample test and estimation procedures for , the mean of a normal populationthe difference between two population
means2, the variance of a normal populationThe ratio of two population variances.
5
The Sampling Distribution of the Sample Mean
• When we take a sample from a normal population, the sample mean has a normal distribution for any sample size n, and
• has a standard normal distribution. • But if is unknown, and we must use s to estimate it,
the resulting statistic is not normalis not normal.
n
xz
/
n
xz
/
normal!not is / ns
x normal!not is / ns
x
x
6
Student’s t Distribution
• Fortunately, this statistic does have a sampling distribution that is well known to statisticians, called the Student’s t distribution, Student’s t distribution, with nn-1 -1 degrees of freedom.degrees of freedom.
ns
xt
/
ns
xt
/
•We can use this distribution to create estimation testing procedures for the population mean .
7
Properties of Student’s t
• Shape depends on the sample size n or the degrees of degrees of freedom, freedom, nn-1.-1.
• As n increase , variability of t decrease because the estimate s is based on more and more information.
• As n increases the shapes of the t and z distributions become almost identical.
•Mound-shapedMound-shaped and symmetric about 0.
•More variable than More variable than zz, with “heavier tails”
AppletApplet
8
Using the t-Table
• Table 4 gives the values of t that cut off certain critical values in the tail of the t distribution.
• Index dfdf and the appropriate tail area a a to find ttaa,,the value of t with area a to its right.
For a random sample of size n = 10, find a value of t that cuts off .025 in the right tail.
Row = df = n –1 = 9
t.025 = 2.262
Column subscript = a = .025
9
Small Sample Inference for a Population Mean
• The basic procedures are the same as those used for large samples. For a test of hypothesis:
.1on with distributi- ta
on basedregion rejection aor values- using/
statistic test theusing
tailedor two one :H versus:HTest
0
a00
ndf
pns
xt
.1on with distributi- ta
on basedregion rejection aor values- using/
statistic test theusing
tailedor two one :H versus:HTest
0
a00
ndf
pns
xt
10
• For a 100(1)% confidence interval for the population mean
.1on with distributi- ta of tailin the
/2 area off cuts that of value theis where 2/
2/
ndf
ttn
stx
.1on with distributi- ta of tailin the
/2 area off cuts that of value theis where 2/
2/
ndf
ttn
stx
Small Sample Inference for a Population Mean
11
Example
A sprinkler system is designed so that the average time for the sprinklers to activate after being turned on is no more than 15 seconds. A test of 5 systems gave the following times:
17, 31, 12, 17, 13, 25 Is the system working as specified? Test using = .05.
specified) as ng(not worki 15:H
specified) as (working 15:H
a
0
specified) as ng(not worki 15:H
specified) as (working 15:H
a
0
12
Example
Data:Data: 17, 31, 12, 17, 13, 25First, calculate the sample mean and standard deviation, using your calculator or the formulas in Chapter 2.
387.75
6115
2477
1
)(
167.196
115
222
nnx
xs
n
xx i
387.75
6115
2477
1
)(
167.196
115
222
nnx
xs
n
xx i
13
ExampleData:Data: 17, 31, 12, 17, 13, 25Calculate the test statistic and find the rejection region for =.05.
5161 38.16/387.7
15167.19
/
:freedom of Degrees :statisticTest
0
ndfns
xt
5161 38.16/387.7
15167.19
/
:freedom of Degrees :statisticTest
0
ndfns
xt
Rejection Region: Reject H0 if t > 2.015. If the test statistic falls in the rejection region, its p-value will be less than = .05.
Rejection Region: Reject H0 if t > 2.015. If the test statistic falls in the rejection region, its p-value will be less than = .05.
14
Conclusion
Data:Data: 17, 31, 12, 17, 13, 25Compare the observed test statistic to the rejection region, and draw conclusions.
.015.2 if HReject
:RegionRejection
38.1 :statisticTest
0
t
t
.015.2 if HReject
:RegionRejection
38.1 :statisticTest
0
t
t
Conclusion: For our example, t = 1.38 does not fall in the rejection region and H0 is not rejected. There is insufficient evidence to indicate that the average activation time is greater than 15.
15:H
15:H
a
0
15:H
15:H
a
0
15
Approximating the p-value
• You can only approximate the p-value for the test using Table 4.
Since the observed value of t = 1.38 is smaller than t.10 = 1.476,
p-value > .10.
16
The exact p-value
• You can get the exact p-value using some calculators or a computer.
One-Sample T: TimesTest of mu = 15 vs mu > 15
Variable N Mean StDev SE MeanTimes 6 19.17 7.39 3.02
Variable 95.0% Lower Bound T PTimes 13.09 1.38 0.113
One-Sample T: TimesTest of mu = 15 vs mu > 15
Variable N Mean StDev SE MeanTimes 6 19.17 7.39 3.02
Variable 95.0% Lower Bound T PTimes 13.09 1.38 0.113
p-value = .113 which is greater than .10 as we approximated using Table 4.
AppletApplet
17
Testing the Difference between Two Means
normal. be must spopulation two the small, are sizes sample the Since
and variancesand and means with2 and 1 spopulation from
drawn are and size of samples random tindependen 9,Chapter in As22
21 .21
21
μμ
nn
normal. be must spopulation two the small, are sizes sample the Since
and variancesand and means with2 and 1 spopulation from
drawn are and size of samples random tindependen 9,Chapter in As22
21 .21
21
μμ
nn
•To test: •H0:D0 versus Ha:one of three
where D0 is some hypothesized difference, usually 0.
18
Testing the Difference between Two Means
•The test statistic used in Chapter 9
•does not have either a z or a t distribution, and cannot be used for small-sample inference. •We need to make one more assumption, that the population variances, although unknown, are equal.
2
22
1
21
21z
ns
ns
xx
2
22
1
21
21z
ns
ns
xx
19
Testing the Difference between Two Means
•Instead of estimating each population variance separately, we estimate the common variance with
2
)1()1(
21
222
2112
nn
snsns
2
)1()1(
21
222
2112
nn
snsns
21
2
021
11
nns
Dxxt
21
2
021
11
nns
Dxxt
has a t distribution with n1+n2-2=(n1-1)+(n2-1) degrees of freedom.
•And the resulting test statistic,
20
Estimating the Difference between Two Means
•You can also create a 100(1-)% confidence interval for 1-2.
2
)1()1( with
21
222
2112
nn
snsns
2
)1()1( with
21
222
2112
nn
snsns
21
22/21
11)(
nnstxx
21
22/21
11)(
nnstxx
Remember the three assumptions:
1. Original populations normal
2. Samples random and independent
3. Equal population variances.
Remember the three assumptions:
1. Original populations normal
2. Samples random and independent
3. Equal population variances.
21
Example• Two training procedures are compared by
measuring the time that it takes trainees to assemble a device. A different group of trainees are taught using each method. Is there a difference in the two methods? Use = .01.
Time to Assemble
Method 1 Method 2
Sample size 10 12
Sample mean 35 31
Sample Std Dev
4.9 4.5
0:H 210
21
2
21
11
0
:statisticTest
nns
xxt
0:H 21a
22
Example
• Solve this problem by approximating the p-value using Table 4.
Time to Assemble
Method 1 Method 2
Sample size 10 12
Sample mean 35 31
Sample Std Dev 4.9 4.5
99.1
121
101
942.21
3135
:statisticTest
t
AppletApplet
942.2120
)5.4(11)9.4(9
2
)1()1(
:Calculate
22
21
222
2112
nn
snsns
23
Example
value)-(2
1)99.1(
)99.1()99.1( :value-
ptP
tPtPp
value)-(2
1)99.1(
)99.1()99.1( :value-
ptP
tPtPp
.025 < ½( p-value) < .05
.05 < p-value < .10
Since the p-value is greater than = .01, H0 is not rejected. There is insufficient evidence to indicate a difference in the population means.
.025 < ½( p-value) < .05
.05 < p-value < .10
Since the p-value is greater than = .01, H0 is not rejected. There is insufficient evidence to indicate a difference in the population means.
df = n1 + n2 – 2 = 10 + 12 – 2 = 20df = n1 + n2 – 2 = 10 + 12 – 2 = 20
24
Testing the Difference between Two Means
•How can you tell if the equal variance assumption is reasonable?
statistic. test ealternativan use
,3smaller
larger ratio, theIf
.reasonable is assumption varianceequal the
,3smaller
larger ratio, theIf
:Thumb of Rule
2
2
2
2
s
s
s
s
statistic. test ealternativan use
,3smaller
larger ratio, theIf
.reasonable is assumption varianceequal the
,3smaller
larger ratio, theIf
:Thumb of Rule
2
2
2
2
s
s
s
s
25
Testing the Difference between Two Means
•If the population variances cannot be assumed equal, the test statistic
•has an approximate t distribution with degrees of freedom given above. This is most easily done by computer.
2
22
1
21
21
ns
ns
xxt
2
22
1
21
21
ns
ns
xxt
1)/(
1)/(
2
22
22
1
21
21
2
2
22
1
21
nns
nns
ns
ns
df
1)/(
1)/(
2
22
22
1
21
21
2
2
22
1
21
nns
nns
ns
ns
df
26
The Paired-Difference Test
•Sometimes the assumption of independent samples is intentionally violated, resulting in a matched-pairsmatched-pairs or paired-difference testpaired-difference test.•By designing the experiment in this way, we can eliminate unwanted variability in the experiment by analyzing only the differences,
ddii = = xx11ii – – xx22ii
•to see if there is a difference in the two population means,
27
Example
• One Type A and one Type B tire are randomly assigned to each of the rear wheels of five cars. Compare the average tire wear for types A and B using a test of hypothesis.
Car 1 2 3 4 5
Type A 10.6 9.8 12.3 9.7 8.8
Type B 10.2 9.4 11.8 9.1 8.3
0:H
0:H
21a
210
0:H
0:H
21a
210
• But the samples are not independent. The pairs of responses are linked because measurements are taken on the same car.
28
The Paired-Difference Test
.1on with distributi- ta
on basedregion rejection aor value- theUse
. s,difference theofdeviation standard andmean
theare and pairs, ofnumber where
/
0
statistic test theusing
0 :H test we0:H test To d0210
ndf
p
d
sdn
ns
dt
i
d
d
.1on with distributi- ta
on basedregion rejection aor value- theUse
. s,difference theofdeviation standard andmean
theare and pairs, ofnumber where
/
0
statistic test theusing
0 :H test we0:H test To d0210
ndf
p
d
sdn
ns
dt
i
d
d
29
Example
Car 1 2 3 4 5
Type A 10.6 9.8 12.3 9.7 8.8
Type B 10.2 9.4 11.8 9.1 8.3
Difference .4 .4 .5 .6 .5
0:H
0:H
21a
210
0:H
0:H
21a
210
.0837
.48 Calculate
1
22
nn
dd
s
n
dd
ii
d
i
.0837
.48 Calculate
1
22
nn
dd
s
n
dd
ii
d
i 8.125/0837.
048.
/
0
:statisticTest
ns
dt
d
8.125/0837.
048.
/
0
:statisticTest
ns
dt
d
30
Example
CarCar 1 2 3 4 5
Type A 10.6 9.8 12.3 9.7 8.8
Type B 10.2 9.4 11.8 9.1 8.3
Difference .4 .4 .5 .6 .5
Rejection region: Reject H0 if t > 2.776 or t < -2.776.
Conclusion: Since t = 12.8, H0 is rejected. There is a difference in the average tire wear for the two types of tires.
31
Some Notes
•You can construct a 100(1-)% confidence interval for a paired experiment using
•Once you have designed the experiment by pairing, you MUST analyze it as a paired experiment. If the experiment is not designed as a paired experiment in advance, do not use this procedure.
n
std d
2/n
std d
2/
32
Key Concepts
I. Experimental Designs for Small SamplesI. Experimental Designs for Small Samples
1. Single random sample: The sampled population must be normal.
2. Two independent random samples: Both sampled populations must be normal.
a. Populations have a common variance 2.
b. Populations have different variances
3. Paired-difference or matched-pairs design: The samples are not independent.