brook-harrell
The t-test
Inferences about Population Means when population SD is unknown
Confidence intervals in z (Review)
Want to estimate the height of students at USF.
Sampled N = 100 students. Found mean = 68 in and SD = 6 in.
Best guess for the population mean is 68 inches, plus or minus some margin.
95% CI = 68 ± (1.96)[6/sqrt(100)] = 68 ± 1.96(.6) = 68 ± 1.18
Interval is 66.82 to 69.18. Intervals built this way will contain the population mean 95% of the time.
$\bar{X} \pm z_{.05}\,\sigma_{\bar{X}}$, where $\sigma_{\bar{X}} = \dfrac{\sigma_X}{\sqrt{N}}$
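The review interval can be reproduced in a few lines of Python. This is a minimal sketch, not part of the original slides: the function name `z_confidence_interval` is an illustrative assumption, and the SD of 6 is treated as if it were the known population SD, as the review example does.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(mean, sd, n, confidence=0.95):
    """Return (lower, upper) for mean ± z * sd / sqrt(n)."""
    # Two-tailed critical z: half of alpha goes in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sd / sqrt(n)
    return mean - margin, mean + margin

lower, upper = z_confidence_interval(mean=68, sd=6, n=100)
print(f"95% CI: {lower:.2f} to {upper:.2f}")  # 95% CI: 66.82 to 69.18
```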
Problem with z
Formulas so far use the population SD, and they have been correct, but the population SD is usually unknown, so we have to estimate it from the sample.
The estimate will be off a bit; it would be nice to account for this.
The statistic called ‘t’ adjusts for error in the estimate of SD. The estimate of SD gets better as sample size increases, so t changes with N. The values of t are close to those of z for large samples, but the t distribution spreads out more and more as the sample size gets smaller.
The t Distribution
We use t when the population variance is unknown (the usual case) and sample size is small (N<100, the usual case). If you use a stat package for testing hypotheses about means, you will use t.
The t distribution is a short, fat relative of the normal. The shape of t depends on its df. As N becomes infinitely large, t becomes normal.
Example values from t and z [t changes with df (N)]

Area beyond value    z       t (df=100)   t (df=25)
.50                  0       0            0
.25                  .67     .68          .68
.025                 1.96    1.98         2.06
.005                 2.57    2.62         2.79
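For readers without a t table at hand, critical values like these can be approximated numerically. The sketch below is an illustration, not the method used in the slides: it integrates the t density with Simpson's rule and finds the critical value by bisection, using only the Python standard library. The function names (`t_pdf`, `area_beyond`, `t_critical`) are assumptions made up for this example.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def area_beyond(x, df, steps=1000):
    """Upper-tail area beyond x (x >= 0), using Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def t_critical(tail_area, df):
    """Value of t that leaves tail_area beyond it, found by bisection."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if area_beyond(mid, df) > tail_area:
            lo = mid   # too much area remains, so move right
        else:
            hi = mid
    return (lo + hi) / 2

for area in (0.25, 0.025, 0.005):
    z = NormalDist().inv_cdf(1 - area)
    print(f"{area:<6} z={z:.2f}  t(df=100)={t_critical(area, 100):.2f}  "
          f"t(df=25)={t_critical(area, 25):.2f}")
```

Computed values may differ from tabled values in the second decimal place, since printed tables round or truncate.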
Degrees of Freedom
For the t distribution, degrees of freedom are always a simple function of the sample size, e.g., (N-1).
One way of explaining df: if we know the total (or mean) and all but one of the scores, the last score is not free to vary. It is fixed by the other scores, so only N-1 scores are free. For example, if 4+3+2+X = 10, then X = 1.
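The same idea in a short Python sketch (the helper name `last_score` is made up for illustration). It also shows why N-1 appears when estimating the SD: deviations from the sample mean must sum to zero, so once N-1 of them are known the last one is fixed.

```python
def last_score(known_scores, total):
    """The remaining score is fixed once the total and the other scores are known."""
    return total - sum(known_scores)

print(last_score([4, 3, 2], total=10))  # 1

scores = [4, 3, 2, 1]
mean = sum(scores) / len(scores)
deviations = [x - mean for x in scores]
# Deviations from the sample mean sum to zero, so only N-1 of them are free.
print(sum(deviations))  # 0.0
```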
t table
Confidence Intervals in t
$\bar{X} \pm t_{.05}\, s_{\bar{X}}$
$s_{\bar{X}} = \dfrac{s_X}{\sqrt{N}}$
$s_X = \sqrt{\dfrac{\sum (X - \bar{X})^2}{N - 1}}$
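A brief sketch of how the sample SD and the standard error of the mean are computed from raw scores. The heights below are hypothetical numbers made up for illustration, and `standard_error` is an assumed helper name; `statistics.stdev` uses the N-1 denominator, which the hand-written line confirms.

```python
from math import sqrt
from statistics import mean, stdev

def standard_error(scores):
    """Estimated standard error of the mean: s_X / sqrt(N)."""
    return stdev(scores) / sqrt(len(scores))

# Hypothetical heights in inches (illustration only, not data from the slides).
heights = [62, 65, 68, 70, 75]
x_bar = mean(heights)

# Sample SD written out by hand with the N-1 denominator.
s_x = sqrt(sum((x - x_bar) ** 2 for x in heights) / (len(heights) - 1))

print(x_bar, round(s_x, 3), round(standard_error(heights), 3))
```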
Want to estimate the height of students at USF. Sampled N = 100 students. Found mean = 68 in and SD = 6 in. Best guess for the population mean is 68 inches, plus or minus some margin.
$t_{.05} = t_{(.05,\ \text{2 tails},\ df = 99)} = 1.98$
95% CI = 68 ± (1.98)[6/sqrt(100)] = 68 ± 1.98(.6) = 68 ± 1.19
Interval is 66.81 to 69.19. Intervals built this way will contain the population mean 95% of the time.
Note this is virtually the same as in z, where interval was 66.82 to 69.18. Matters more when N is small.
CI in t, Example 2
Suppose we want to estimate mean curiosity score for psychology students. Sample N = 25 people, Mean = 52, SD = 10.
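$\bar{X} = 52$; $s_X = 10$; $s_{\bar{X}} = \dfrac{s_X}{\sqrt{N}} = \dfrac{10}{\sqrt{25}} = 2$
$t_{.05} = t_{(.05,\ \text{2 tails},\ df = 24)} = 2.064$
95% CI = $\bar{X} \pm t_{.05}\, s_{\bar{X}} = 52 \pm 2.064(2)$
95% CI = 47.872 to 56.128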
Note: this is same as CI in z, except we use t instead of z. The value of t comes from a table. Tabled value depends on df.
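The curiosity example can be checked with a small function that takes the tabled critical value as an input. This is a sketch under stated assumptions: `t_confidence_interval` is an illustrative name, and the critical value 2.064 is the tabled two-tailed value for df = 24 quoted in the example. The second call swaps in z = 1.96 to show how much narrower the interval would be if the extra uncertainty were ignored.

```python
from math import sqrt

def t_confidence_interval(mean, sd, n, t_crit):
    """Return (lower, upper) for mean ± t_crit * sd / sqrt(n).

    t_crit is the two-tailed tabled value for df = n - 1.
    """
    margin = t_crit * sd / sqrt(n)
    return mean - margin, mean + margin

# Curiosity example: N = 25, mean = 52, SD = 10, tabled t(df=24) = 2.064.
lower, upper = t_confidence_interval(mean=52, sd=10, n=25, t_crit=2.064)
print(f"t interval: {lower:.3f} to {upper:.3f}")

# Same data with z = 1.96 in place of t gives a narrower interval.
z_lower, z_upper = t_confidence_interval(mean=52, sd=10, n=25, t_crit=1.96)
print(f"z interval: {z_lower:.2f} to {z_upper:.2f}")
```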
One-sample t-test
We can use a confidence interval to “test” or decide whether a population mean has a given value. For example, suppose we want to test whether the mean height of women at USF is equal to 68 inches.
Suppose we randomly sample 50 women students at USF. We find that their mean height is 63.05 inches. The SD of height in the sample is 5.75 inches. Then we find the standard error of the mean by dividing SD by sqrt(N): 5.75/sqrt(50) = .81. The critical value of t with (50-1) = 49 df is 2.01 (find this in a t-table). Our confidence interval is, therefore, 63.05 plus/minus 1.63, or 61.42 to 64.68, which does not contain 68. See the graph.
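A minimal sketch of the confidence-interval check for the height example. The function name `ci_contains` is an assumption for illustration, and the critical value 2.01 is the tabled value quoted in the text.

```python
from math import sqrt

def ci_contains(hypothesized_mean, sample_mean, sample_sd, n, t_crit):
    """Build mean ± t_crit * SE and report whether it covers the hypothesized mean."""
    se = sample_sd / sqrt(n)
    lower = sample_mean - t_crit * se
    upper = sample_mean + t_crit * se
    return lower <= hypothesized_mean <= upper, (lower, upper)

contains, (lower, upper) = ci_contains(
    68, sample_mean=63.05, sample_sd=5.75, n=50, t_crit=2.01
)
print(f"{lower:.2f} to {upper:.2f}; contains 68: {contains}")
```

Because 68 falls outside the interval, the data are not consistent with a population mean of 68 at the 95% level.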
One-sample t Example 1
[Figure: histogram of sample height. x-axis: Height in Inches (40 to 80); y-axis: Frequency (0 to 10). Annotations: N = 50, M = 63.05, SD = 5.75; Pop Mean = 68; $s_{\bar{X}} = .81$; t = 2.01; CI = $\bar{X} \pm 1.63$.]
One sample t test: Confidence interval view
[Figure: Histogram of Sample Height. x-axis: Height in Inches (40 to 80).]
Take a sample, set a confidence interval around the sample mean. Does the interval contain the hypothesized value?
Conventional Steps (Cookbook)
1. Choose alpha (.05)
2. State null and alternative hypotheses (H0: pop mean is 68; Ha: pop mean is not 68)
3. Calculate observed stat (t = ?)
4. Find critical value (tcrit = value in table)
5. State decision rule (if |tobs| > tcrit, reject null)
6. State conclusion (pop mean is not 68)
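The cookbook steps can be sketched as a short Python function. This is an illustration, not a replacement for a stat package: `one_sample_t_test` is an assumed name, and the caller supplies the tabled critical value (steps 1 and 4).

```python
from math import sqrt

def one_sample_t_test(sample_mean, sample_sd, n, null_mean, t_crit):
    """Follow the cookbook steps for a two-tailed one-sample t test.

    t_crit is the tabled two-tailed critical value for the chosen alpha
    and df = n - 1 (steps 1 and 4 are done by the caller).
    """
    se = sample_sd / sqrt(n)                # standard error of the mean
    t_obs = (sample_mean - null_mean) / se  # step 3: observed statistic
    reject = abs(t_obs) > t_crit            # step 5: decision rule
    return t_obs, reject

# Height example: H0 is that the population mean is 68 (step 2).
t_obs, reject = one_sample_t_test(63.05, 5.75, 50, null_mean=68, t_crit=2.01)
print(f"t = {t_obs:.2f}, reject H0: {reject}")  # step 6: conclusion
```

With the unrounded standard error the observed t comes out near -6.09; rounding the standard error to .81 first gives -6.11. Either way the decision is the same.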
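One sample t test: t distribution view
[Figure: t distribution view. x-axis: Height in Inches (62 to 70); y-axis: Frequency (0 to 15). Marked values: $\mu = 68$, $\bar{X} = 63.05$, $s_{\bar{X}} = .81$, $\bar{X} - \mu = -4.95$.]
$t = \dfrac{\bar{X} - \mu}{s_{\bar{X}}} = \dfrac{-4.95}{.81} = -6.11$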
The sample mean is roughly six standard deviations (St. Errors) from the hypothesized population mean. If the population mean is really 68 inches, it is very, very unlikely that we would find a sample with a mean as small as 63.05 inches.
One-sample t, Example 2
Over the years, smokers at M’s treatment center report smoking an average of 30 cigs per day. New treatment Smoke-B-Gon pills given to N=25 new clients. Did it help?
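$\bar{X} = 25$, $s_X = 2.52$
$s_{\bar{X}} = \dfrac{s_X}{\sqrt{N}} = \dfrac{2.52}{\sqrt{25}} = .50$
$t_{obs} = \dfrac{\bar{X} - \mu}{s_{\bar{X}}} = \dfrac{25 - 30}{.5} = -10$
$t_{crit} = t_{(.05,\ \text{2 tails},\ df = 24)} = 2.064$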
|tobs| > tcrit. Reject null. Result is significant.
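The t-statistic view and the confidence-interval view always lead to the same decision. The sketch below checks this on the Smoke-B-Gon numbers; the variable names are illustrative, and 2.064 is the tabled critical value quoted in the example.

```python
from math import sqrt

# Smoke-B-Gon example: N = 25, mean = 25, SD = 2.52, null mean = 30,
# tabled two-tailed t(df=24) = 2.064.
n, sample_mean, sample_sd, null_mean, t_crit = 25, 25, 2.52, 30, 2.064

se = sample_sd / sqrt(n)
t_obs = (sample_mean - null_mean) / se
lower, upper = sample_mean - t_crit * se, sample_mean + t_crit * se

reject_by_t = abs(t_obs) > t_crit                    # t-statistic view
reject_by_ci = not (lower <= null_mean <= upper)     # confidence-interval view

print(f"t = {t_obs:.2f}; 95% CI = {lower:.2f} to {upper:.2f}")
print(reject_by_t, reject_by_ci)
```

Here the standard error is kept unrounded (.504), so the observed t is about -9.92 instead of the -10 obtained after rounding to .50. The interval excludes 30, matching the rejection of the null.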
Application
We prefer to use the t test instead of the z test when the _____ is small. 1 mind 2 sample size 3 standard error 4 type II error
Definition
The t test adjusts for error in estimating the population ____ during hypothesis testing. 1 mean 2 median 3 range 4 standard deviation
Application
We compute a one-sample t test and find an obtained value of t of 2.5. The critical (tabled) value of t given the null hypothesis turns out to be 2.01. What do we decide? 1 the result is significant 2 the result is not significant 3 we made a type I error 4 we made a type II error