24 Est Testing

Hadley Wickham

Stat310: Estimation + Testing

Saturday, 11 April 2009

1. What makes a good estimator?

2. Recap & general strategy

3. Non-symmetric distributions

4. Testing


[Figure: four panels, titled "Low bias, low variance", "Low bias, high variance", "High bias, low variance" and "High bias, high variance"]


Can combine both together to get mean squared error

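$\mathrm{MSE}(\hat{\theta}) = E[(\hat{\theta} - \theta)^2]$

$\mathrm{MSE}(\hat{\theta}) = \mathrm{Var}(\hat{\theta}) + \mathrm{Bias}(\hat{\theta}, \theta)^2$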
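The decomposition can be checked numerically. The following is a minimal simulation sketch under assumptions that are not in the slides: samples of size 10 from a normal distribution with variance 4, and two estimators of that variance (dividing by n or by n - 1). The function name `simulate_mse`, the seed and the number of repetitions are illustrative.

```python
import random
import statistics

def simulate_mse(estimator, true_value, n=10, sigma=2.0, reps=5000, seed=1):
    """Estimate bias, variance and MSE of `estimator` by simulation."""
    rng = random.Random(seed)
    estimates = [
        estimator([rng.gauss(0.0, sigma) for _ in range(n)])
        for _ in range(reps)
    ]
    bias = statistics.fmean(estimates) - true_value
    variance = statistics.pvariance(estimates)  # spread of the estimator itself
    mse = statistics.fmean((e - true_value) ** 2 for e in estimates)
    return bias, variance, mse

true_var = 2.0 ** 2  # assumed true parameter value for this illustration
for name, est in [("divide by n", statistics.pvariance),
                  ("divide by n - 1", statistics.variance)]:
    bias, variance, mse = simulate_mse(est, true_var)
    # The last two printed columns agree: MSE = Var + Bias^2.
    print(f"{name:16s} bias={bias:+.3f} var={variance:.3f} "
          f"mse={mse:.3f} var+bias^2={variance + bias ** 2:.3f}")
```

In this setup the divide-by-n estimator should show a negative bias but a smaller variance, which is the trade-off that MSE summarises in one number.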


Recap

(Z + 1)/5 ~ SomeDistribution(θ, β)

What, mathematically, is a 95% confidence interval around Z?

Write down the steps you’d take to generate such an interval if you knew θ and β


Problem

Y = g(X), Y ~ F(θ) (g has an inverse)

Find a 1 - α confidence interval for X.

i.e. Find a and b so that P(a < X < b) = 1 - α


Solution

1. Find a 1 - α confidence interval for Y: P(c < Y < d) = 1 - α

a. If F is symmetric, then the bounds will be $c = F^{-1}(\alpha/2)$ and $d = F^{-1}(1 - \alpha/2)$

b. If F isn’t symmetric then it’s harder

2. $a = g^{-1}(c)$, $b = g^{-1}(d)$
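The two steps can be sketched in code. This is only an illustration: the helper `interval_for_x` is hypothetical, and the example assumes that the unspecified distribution of (Z + 1)/5 is standard normal, which the slides do not say. Sorting the endpoints covers the case where g is decreasing, in which the two bounds swap.

```python
from statistics import NormalDist

def interval_for_x(inv_cdf, g_inverse, alpha=0.05):
    """Equal-tailed 1 - alpha interval for X, where Y = g(X) has quantile function inv_cdf."""
    # Step 1: bounds c and d for Y.
    c = inv_cdf(alpha / 2)
    d = inv_cdf(1 - alpha / 2)
    # Step 2: map the bounds back through the inverse of g.
    a, b = g_inverse(c), g_inverse(d)
    # If g is decreasing the endpoints arrive in reverse order, so sort them.
    return min(a, b), max(a, b)

# Assumed example: Y = (Z + 1)/5 is standard normal, so g^-1(y) = 5y - 1.
a, b = interval_for_x(NormalDist().inv_cdf, lambda y: 5 * y - 1)
print(f"95% interval for Z: ({a:.2f}, {b:.2f})")
```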


Example

340 333 334 332 333 336 350 348 331 344 (mean: 338, sd: 7.01)

Find a 95% confidence interval for μ

$\frac{\bar{X}_n - \mu}{s/\sqrt{n}} \sim t_{n-1}$
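A minimal sketch of the calculation, using the summary statistics quoted on the slide. The critical value 2.262 (the 0.975 quantile of the t distribution with 9 degrees of freedom) is an assumption taken from a standard t table, since the Python standard library has no t quantile function.

```python
import math

n, mean, sd = 10, 338.0, 7.01   # summary statistics quoted on the slide
T_CRIT = 2.262                  # assumed table value: 0.975 quantile of t with 9 df

# Invert (mean - mu) / (sd / sqrt(n)) ~ t(n - 1) to get bounds for mu.
half_width = T_CRIT * sd / math.sqrt(n)
lower, upper = mean - half_width, mean + half_width
print(f"95% CI for mu: ({lower:.2f}, {upper:.2f})")
```

Because the t distribution is symmetric, the interval is simply the mean plus or minus one half-width.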



More complicated case

Find 95% confidence interval for standard deviation in previous case (sd = 7.01, n = 10)

$X = \frac{(n-1)S^2}{\sigma^2}$

$X \sim \chi^2(n-1)$


Standard deviation

Find a confidence interval for $X \sim \chi^2(9)$. Generally want the shortest confidence interval, but it is hard to find when the distribution is not symmetric.

Any of the following are correct. The best is the shortest interval.


[Figure: six plots of the density of X (x axis 0 to 30, y axis 0.00 to 0.10). The last five each mark one candidate interval, labelled with its lower and upper cumulative probabilities, its endpoints and its length:]

(0.05, 1): (3.33, Inf), length Inf

(0.03, 0.99): (2.85, 21.67), length 18.8

(0.025, 0.975): (2.7, 19.0), length 16.3

(0.01, 0.96): (2.09, 17.61), length 15.5

(0, 0.95): (0.0, 16.9), length 16.9
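The trade-off between the candidate intervals can be checked by simulation. This is a rough sketch, not the method used to draw the plots: it takes the finite intervals from the plot annotations, simulates χ²(9) draws with the standard library (a χ²(k) variable is a gamma variable with shape k/2 and scale 2), and reports each interval's length and approximate coverage. The seed and sample size are arbitrary.

```python
import random

# Finite candidate intervals (lower, upper) taken from the plot annotations.
intervals = [(2.85, 21.67), (2.7, 19.0), (2.09, 17.61), (0.0, 16.9)]

rng = random.Random(310)  # arbitrary seed, for reproducibility only
# chi-square(9) is Gamma(shape=9/2, scale=2).
draws = [rng.gammavariate(4.5, 2.0) for _ in range(100_000)]

def coverage(lo, hi):
    """Fraction of simulated draws that fall strictly inside (lo, hi)."""
    return sum(lo < x < hi for x in draws) / len(draws)

for lo, hi in intervals:
    print(f"({lo}, {hi}): length {hi - lo:.2f}, coverage {coverage(lo, hi):.3f}")
```

Intervals with similar coverage can differ noticeably in length, which is why the choice of tail probabilities matters when the distribution is not symmetric.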


Your turn

Find 95% confidence interval for the standard deviation (sd = 7.01, n = 10)

P(2.09 < X < 17.61) = 0.95

$X = \frac{(n-1)S^2}{\sigma^2}$
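One way to carry out the inversion, as a minimal sketch. It uses only the numbers given on the slide (sd = 7.01, n = 10, and the bounds 2.09 and 17.61 for X). Since X gets smaller as σ gets larger, the upper bound on X gives the lower bound on σ and vice versa.

```python
import math

n, s = 10, 7.01
c, d = 2.09, 17.61   # P(c < X < d) = 0.95 for X ~ chi-square(9), from the slide

# X = (n - 1) s^2 / sigma^2 is decreasing in sigma, so the bounds swap when inverted:
# c < (n - 1) s^2 / sigma^2 < d  is the same as  (n - 1) s^2 / d < sigma^2 < (n - 1) s^2 / c
scaled = (n - 1) * s ** 2
sigma_lower = math.sqrt(scaled / d)
sigma_upper = math.sqrt(scaled / c)
print(f"95% CI for sigma: ({sigma_lower:.2f}, {sigma_upper:.2f})")
```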


Testing


Testing

Very closely related to estimation (particularly confidence intervals)

But point is to answer a yes/no question:

Is the mean of the distribution equal to 0?

Do X and Y have the same mean?


Your turn

The following values have been drawn from a normal distribution with standard deviation 1.

2.9 2.1 3.0 3.2 1.2 3.0 3.3 1.2 2.3 1.5 (mean: 2.37)

Is it possible they came from a normal distribution with mean 1.5?


Example

Create a 95% confidence interval for the mean. Is 1.5 inside?

Create a 90% confidence interval. Is 1.5 inside?

…

Or we can look up the value directly, using the cdf
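The interval-by-interval approach can be sketched as follows. The code uses the ten listed values and the known standard deviation of 1, and computes the sample mean from the data. The helper name `contains_mu0` and the particular confidence levels tried are illustrative choices.

```python
import math
from statistics import NormalDist, fmean

values = [2.9, 2.1, 3.0, 3.2, 1.2, 3.0, 3.3, 1.2, 2.3, 1.5]
sigma, mu0 = 1.0, 1.5                   # known sd and hypothesised mean
xbar = fmean(values)
se = sigma / math.sqrt(len(values))     # standard error of the mean

def contains_mu0(level):
    """True if the `level` confidence interval for the mean contains mu0."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return xbar - z * se < mu0 < xbar + z * se

for level in (0.90, 0.95, 0.99, 0.999):
    print(f"{level:.1%} interval contains {mu0}: {contains_mu0(level)}")
```

The level at which the answer flips from False to True is what the cdf lookup finds in a single step.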


Testing jargon

No: Null hypothesis. Nothing is happening. (Thing we want to disprove)

Yes: Alternative hypothesis. Something interesting is happening.

Major complication:


Absence of evidence is not evidence of absence


Implication

Means we never “accept” the null hypothesis, just “fail to reject” it.

The null hypothesis is usually a simple case for which we know the distribution


Your turn

Null hypothesis: μ = 1.5

Alternative hypothesis: μ > 1.5 OR μ < 1.5

Under the null hypothesis what is the distribution of the mean?

How does what we saw compare to the null distribution? Is it likely or not?


P-value

The p-value gives us the probability, under the null hypothesis, that we would have seen a value equal to or more extreme than the value we observed.

Strength of evidence for rejecting the null hypothesis.

But we need a cut off to make a yes-no decision. How do we choose that cut off?
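For the running example (ten values, known standard deviation 1, null hypothesis μ = 1.5), the p-value can be computed from the null distribution of the sample mean. This is a sketch using the standard library's `NormalDist`. Doubling the smaller tail is one common convention for "more extreme in either direction" and is an assumption here.

```python
import math
from statistics import NormalDist, fmean

values = [2.9, 2.1, 3.0, 3.2, 1.2, 3.0, 3.3, 1.2, 2.3, 1.5]
n = len(values)
xbar = fmean(values)

# Under the null hypothesis, the sample mean is Normal(1.5, 1 / sqrt(n)).
null_dist = NormalDist(mu=1.5, sigma=1.0 / math.sqrt(n))

upper_tail = 1.0 - null_dist.cdf(xbar)            # P(mean >= observed)
lower_tail = null_dist.cdf(xbar)                  # P(mean <= observed)
p_two_sided = 2.0 * min(upper_tail, lower_tail)   # extreme in either direction

print(f"one-sided p = {upper_tail:.4f}, two-sided p = {p_two_sided:.4f}")
```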


Errors

What are the possible errors we can make?

False positive. Choose alternative when null is correct. (aka Type 1)

False negative. Choose null when alternative is true. (aka Type 2)


Terminology

Probability of a false positive called α

Probability of false negative called 1 - β

How are the two related?

Usually care more about false positives
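A small simulation can show how the two error rates move against each other. This is a sketch under assumed settings: a two-sided z-test of μ = 1.5 with n = 10 and known standard deviation 1, and a hypothetical alternative with true mean 2.5 that is not from the slides. The function name `rejection_rate`, the seed and the number of repetitions are also illustrative.

```python
import math
import random
from statistics import NormalDist, fmean

def rejection_rate(true_mean, alpha, mu0=1.5, sigma=1.0, n=10, reps=5000, seed=24):
    """Fraction of simulated samples for which a two-sided z-test rejects mu = mu0."""
    rng = random.Random(seed)
    null_dist = NormalDist(mu0, sigma / math.sqrt(n))
    rejections = 0
    for _ in range(reps):
        xbar = fmean(rng.gauss(true_mean, sigma) for _ in range(n))
        tail = min(null_dist.cdf(xbar), 1.0 - null_dist.cdf(xbar))
        rejections += 2.0 * tail < alpha   # True counts as 1
    return rejections / reps

for alpha in (0.10, 0.05, 0.01):
    false_pos = rejection_rate(true_mean=1.5, alpha=alpha)         # null is true
    false_neg = 1.0 - rejection_rate(true_mean=2.5, alpha=alpha)   # alternative is true
    print(f"alpha={alpha:.2f}  false positive rate={false_pos:.3f}  "
          f"false negative rate={false_neg:.3f}")
```

The false positive rate tracks the chosen cut off, while making the cut off stricter raises the false negative rate for a fixed sample size.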


Testing overview

Write down null and alternative hypotheses.

Compute test statistic.

Convert to p-value.

Compare p-value to alpha cut off.
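The four steps can be put together in one function. This is a minimal sketch for one specific case only, a two-sided z-test with known standard deviation. The function name `z_test` and its return format are hypothetical.

```python
import math
from statistics import NormalDist, fmean

def z_test(sample, mu0, sigma, alpha=0.05):
    """Two-sided z-test of H0: mu = mu0 against H1: mu != mu0, with sigma known."""
    # Step 1: the hypotheses are encoded by mu0 and the two-sided p-value below.
    # Step 2: test statistic, the standardised sample mean.
    z = (fmean(sample) - mu0) / (sigma / math.sqrt(len(sample)))
    # Step 3: convert to a p-value using the standard normal null distribution.
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    # Step 4: compare the p-value with the alpha cut off.
    return z, p_value, p_value < alpha

sample = [2.9, 2.1, 3.0, 3.2, 1.2, 3.0, 3.3, 1.2, 2.3, 1.5]
z, p, reject = z_test(sample, mu0=1.5, sigma=1.0)
print(f"z = {z:.2f}, p = {p:.4f}, reject H0: {reject}")
```

A result of `reject == False` would mean "fail to reject", not "accept".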


Next time

Some specific tests.

i.e. for common situations, what is the distribution under the null hypothesis?

