Hadley Wickham
Stat310
Estimation + Testing
Saturday, 11 April 2009
1. What makes a good estimator?
2. Recap & general strategy
3. Non-symmetric distributions
4. Testing
[Figure: four panels illustrating low bias, low variance; low bias, high variance; high bias, low variance; high bias, high variance]
We can combine bias and variance into the mean squared error:
MSE(θ̂) = E[(θ̂ - θ)²]
MSE(θ̂) = Var(θ̂) + Bias(θ̂, θ)²
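A small simulation can make the decomposition concrete. The sketch below is illustrative only: the normal population, the true mean of 5, the sample size of 10, and the helper name `mse_decomposition` are all assumptions, not part of the slides.

```python
import random
import statistics

def mse_decomposition(estimates, theta):
    """Return (mse, variance, bias) for a collection of estimates of theta."""
    mean_est = statistics.fmean(estimates)
    bias = mean_est - theta
    # Divide-by-N variance, so the identity MSE = Var + Bias^2 holds exactly.
    variance = statistics.pvariance(estimates, mu=mean_est)
    mse = statistics.fmean([(e - theta) ** 2 for e in estimates])
    return mse, variance, bias

random.seed(310)
theta = 5.0  # assumed true mean
# Each estimate is the mean of 10 draws from Normal(theta, sd = 2).
estimates = [
    statistics.fmean([random.gauss(theta, 2.0) for _ in range(10)])
    for _ in range(2000)
]
mse, variance, bias = mse_decomposition(estimates, theta)
print(f"MSE = {mse:.4f}, Var + Bias^2 = {variance + bias ** 2:.4f}")
```

The two printed numbers agree up to floating-point error, because the decomposition is an algebraic identity rather than an approximation.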
Recap
(Z + 1)/5 ~ SomeDistribution(θ, β)
What, mathematically, is a 95% confidence interval around Z?
Write down the steps you’d take to generate such an interval if you knew θ and β
Problem
Y = g(X), Y ~ F(θ) (g has an inverse)
Find a 1 - α confidence interval for X.
i.e. Find a and b so that P(a < X < b) = 1 - α
Solution
1. Find a 1 - α confidence interval for Y: P(c < Y < d) = 1 - α
a. If F is symmetric, then the bounds will be c = F⁻¹(α/2) and d = F⁻¹(1 - α/2)
b. If F isn’t symmetric then it’s harder
2. a = g⁻¹(c), b = g⁻¹(d) (for increasing g; if g is decreasing, the two bounds swap)
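The two steps can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: F is taken to be a standard normal and g(x) = (x + 1)/5, both chosen only for illustration, and `interval_for_x` is a hypothetical helper name.

```python
from statistics import NormalDist

def interval_for_x(dist, g_inverse, alpha=0.05):
    """Equal-tailed 1 - alpha interval for X, where Y = g(X) follows dist."""
    # Step 1: interval (c, d) for Y from the quantile function of F.
    c = dist.inv_cdf(alpha / 2)
    d = dist.inv_cdf(1 - alpha / 2)
    # Step 2: map both bounds back through the inverse of g.
    a, b = g_inverse(c), g_inverse(d)
    # Sorting covers a decreasing g, which reverses the order of the bounds.
    return min(a, b), max(a, b)

# Assumed example: Y = (X + 1) / 5 ~ Normal(0, 1), so g_inverse(y) = 5y - 1.
a, b = interval_for_x(NormalDist(0, 1), lambda y: 5 * y - 1)
print(f"P({a:.2f} < X < {b:.2f}) = 0.95")
```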
Example
340 333 334 332 333 336 350 348 331 344 (mean: 338, sd: 7.01)
Find a 95% confidence interval for μ
(X̄ₙ - μ) / (s / √n) ~ tₙ₋₁
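Working the example through numerically gives the following sketch. It uses the summary values reported on the slide (mean 338, sd 7.01, n = 10); the critical value 2.262 is the 0.975 quantile of a t distribution with 9 degrees of freedom, taken from a standard t table rather than from the slides.

```python
import math

n = 10
sample_mean = 338.0  # summary values as reported on the slide
sample_sd = 7.01
t_crit = 2.262       # 0.975 quantile of t with n - 1 = 9 df (from a t table)

# Rearranging -t < (mean - mu) / (s / sqrt(n)) < t for mu.
margin = t_crit * sample_sd / math.sqrt(n)
lower, upper = sample_mean - margin, sample_mean + margin
print(f"95% CI for mu: ({lower:.2f}, {upper:.2f})")
```

Because the t distribution is symmetric, the interval is simply the sample mean plus or minus the margin.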
More complicated case
Find 95% confidence interval for standard deviation in previous case (sd = 7.01, n = 10)
X = (n - 1)S² / σ²
X ~ χ²(n - 1)
Standard deviation
Find a confidence interval for X ~ χ²(9). Generally want the shortest confidence interval, but hard to find when not symmetric.
Any of the following are correct. The best has the smallest interval.
[Figure: χ²(9) density for x from 0 to 30, with each candidate interval marked]
Candidate intervals, given as (lower, upper) quantiles of χ²(9):
(0.05, 1): (3.33, Inf), length Inf
(0.03, 0.99): (2.85, 21.67), length 18.8
(0.025, 0.975): (2.7, 19.0), length 16.3
(0.01, 0.96): (2.09, 17.61), length 15.5
(0, 0.95): (0.0, 16.9), length 16.9
Your turn
Find 95% confidence interval for the standard deviation (sd = 7.01, n = 10)
P(2.09 < X < 17.61) = 0.95
X = (n - 1)S² / σ²
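One way to carry out the inversion is sketched below, using only the numbers given on the slide (sd = 7.01, n = 10, and the bounds 2.09 and 17.61 for X). The variable names are illustrative.

```python
import math

n = 10
s = 7.01
chi2_lower, chi2_upper = 2.09, 17.61  # P(2.09 < X < 17.61) = 0.95

# X = (n - 1) s^2 / sigma^2 decreases as sigma grows, so the bounds swap:
# 2.09 < X < 17.61  <=>  (n-1)s^2 / 17.61 < sigma^2 < (n-1)s^2 / 2.09
scaled = (n - 1) * s ** 2
sigma_lower = math.sqrt(scaled / chi2_upper)
sigma_upper = math.sqrt(scaled / chi2_lower)
print(f"95% CI for sigma: ({sigma_lower:.2f}, {sigma_upper:.2f})")
```

The interval is not centred on 7.01, which reflects the asymmetry of the χ² distribution.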
Testing
Very closely related to estimation (particularly confidence intervals)
But the point is to answer a yes/no question:
Is the mean of the distribution equal to 0?
Do X and Y have the same mean?
Your turn
The following values have been drawn from a normal distribution with standard deviation 1.
2.9 2.1 3.0 3.2 1.2 3.0 3.3 1.2 2.3 1.5 (mean: 2.37)
Is it possible they came from a normal distribution with mean 1.5?
Example
Create a 95% confidence interval. Is 1.5 inside?
Create a 90% confidence interval. Is 1.5 inside?
…
Or we can look up the value directly, using the cdf
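The confidence-interval approach can be sketched as follows, computing everything from the ten listed values with the known standard deviation of 1. `z_interval` is a hypothetical helper name, and the normal quantiles come from Python's standard library.

```python
from math import sqrt
from statistics import NormalDist, fmean

data = [2.9, 2.1, 3.0, 3.2, 1.2, 3.0, 3.3, 1.2, 2.3, 1.5]
sigma = 1.0  # known standard deviation
mu0 = 1.5    # hypothesised mean

def z_interval(values, sigma, level):
    """Confidence interval for the mean when the standard deviation is known."""
    centre = fmean(values)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / sqrt(len(values))
    return centre - half_width, centre + half_width

for level in (0.95, 0.90):
    low, high = z_interval(data, sigma, level)
    print(f"{level:.0%} CI: ({low:.2f}, {high:.2f}), contains {mu0}: {low < mu0 < high}")
```

Repeating this for narrower and wider levels is tedious, which is why going straight to the cdf is attractive.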
Testing jargon
No: Null hypothesis. Nothing is happening. (Thing we want to disprove)
Yes: Alternative hypothesis. Something interesting is happening.
Major complication:
Absence of evidence is not evidence of absence
Implication
This means we never “accept” the null hypothesis, just “fail to reject” it.
The null distribution is usually a simple case for which we know the distribution
Your turn
Null hypothesis: μ = 1.5
Alternative hypothesis: μ > 1.5 OR μ < 1.5
Under the null hypothesis what is the distribution of the mean?
How does what we saw compare to the null distribution? Is it likely or not?
P-value
The p-value gives us the probability, under the null hypothesis, that we would have seen a value equal to or more extreme than the value we observed.
It measures the strength of evidence for rejecting the null hypothesis.
But we need a cut off to make a yes-no decision. How do we choose that cut off?
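For the running example (ten values, known standard deviation 1, null hypothesis μ = 1.5), a two-sided p-value can be computed directly from the null distribution of the sample mean. This is a sketch, with the sample mean computed from the listed values rather than taken from the slide.

```python
from math import sqrt
from statistics import NormalDist, fmean

data = [2.9, 2.1, 3.0, 3.2, 1.2, 3.0, 3.3, 1.2, 2.3, 1.5]
sigma, mu0 = 1.0, 1.5

# Under the null, the sample mean is Normal(mu0, sigma / sqrt(n)).
null_dist = NormalDist(mu0, sigma / sqrt(len(data)))
observed = fmean(data)

# Two-sided: probability of a mean at least as far from mu0 as the observed one.
distance = abs(observed - mu0)
p_value = null_dist.cdf(mu0 - distance) + (1 - null_dist.cdf(mu0 + distance))
print(f"p-value = {p_value:.4f}")
```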
Errors
What are the possible errors we can make?
False positive. Choose alternative when null is correct. (aka Type 1)
False negative. Choose null when alternative is true. (aka Type 2)
Terminology
Probability of a false positive is called α
Probability of a false negative is called β (so 1 - β is the power of the test)
How are the two related?
Usually care more about false positives
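The trade-off between the two error probabilities can be explored numerically. The sketch below assumes a one-sided z-test with known standard deviation and a specific alternative mean of 2.0; that alternative and the function name `false_negative_rate` are hypothetical choices for illustration.

```python
from math import sqrt
from statistics import NormalDist

def false_negative_rate(alpha, mu0, mu_alt, sigma, n):
    """P(fail to reject mu = mu0) in a one-sided z-test when the true mean is mu_alt > mu0."""
    se = sigma / sqrt(n)
    # Reject the null when the sample mean exceeds this cut-off.
    cutoff = NormalDist(mu0, se).inv_cdf(1 - alpha)
    # Under the alternative, the sample mean is Normal(mu_alt, se).
    return NormalDist(mu_alt, se).cdf(cutoff)

rates = {
    alpha: false_negative_rate(alpha, mu0=1.5, mu_alt=2.0, sigma=1.0, n=10)
    for alpha in (0.10, 0.05, 0.01)
}
for alpha, rate in rates.items():
    print(f"false positive rate {alpha:.2f} -> false negative rate {rate:.3f}")
```

With the sample size fixed, demanding a smaller false positive rate pushes the cut-off outwards and so raises the false negative rate.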
Testing overview
Write down null and alternative hypotheses.
Compute test statistic.
Convert to p-value.
Compare p-value to alpha cut off.
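The four steps can be lined up in a single function. This is a minimal sketch for one particular case, a two-sided z-test with known standard deviation; `z_test` is a hypothetical helper, not a function from any library, and the 0.05 cut-off is an assumed default.

```python
from math import sqrt
from statistics import NormalDist, fmean

def z_test(values, mu0, sigma, alpha=0.05):
    """Two-sided z-test of the null mu = mu0 against the alternative mu != mu0."""
    # Step 1: the hypotheses are fixed by mu0 (null) and the two-sided alternative.
    # Step 2: test statistic, the distance from mu0 in standard errors.
    z = (fmean(values) - mu0) / (sigma / sqrt(len(values)))
    # Step 3: convert to a two-sided p-value.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Step 4: compare the p-value to the alpha cut-off.
    return z, p_value, p_value < alpha

data = [2.9, 2.1, 3.0, 3.2, 1.2, 3.0, 3.3, 1.2, 2.3, 1.5]
z, p_value, reject = z_test(data, mu0=1.5, sigma=1.0)
print(f"z = {z:.2f}, p = {p_value:.4f}, reject null: {reject}")
```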
Next time
Some specific tests.
i.e. for common situations, what is the distribution under the null hypothesis?