Transcript
Page 1: Hypothesis Testing

Hypothesis Testing

Lecture 3

Page 2: Hypothesis Testing

Examples of various hypotheses

• Average salary in Copenhagen is larger than in Bælum

• Sodium content in Furresøen is equal to the content in Madamsø

• Proportion of Turks in Århus is the same as in Aalborg

• Average height of men in Sweden is the same as in Denmark

• The average temperature is increasing over time

Page 3: Hypothesis Testing

Formulation of hypothesis

Assume we are interested in a parameter Θ (e.g. the mean of the data). Let Θ0 be a number.

There are three different kinds of hypotheses:

H0: Θ = Θ0    HA: Θ ≠ Θ0

H0: Θ ≥ Θ0    HA: Θ < Θ0

H0: Θ ≤ Θ0    HA: Θ > Θ0

H0 is called the null hypothesis.

HA is called the alternative hypothesis.

Page 4: Hypothesis Testing

Examples of various hypotheses

• Average salary in Copenhagen is larger than in Bælum

H0: μC ≥ μB. HA: μC < μB.

• Sodium content in Furresøen is equal to the content in Madamsø

H0: μF = μM. HA: μF ≠ μM.

• Proportion of Turks in Århus is the same as in Aalborg

H0: PÅ = PA. HA: PÅ ≠ PA.

• Average height of men in Sweden is the same as in Denmark

H0: μS = μD. HA: μS ≠ μD.

• The average temperature is increasing over time

For time 1 ≥ time 2: H0: μ(time 1) ≥ μ(time 2). HA: μ(time 1) < μ(time 2).

Page 5: Hypothesis Testing

[Figure: COMPARE. Normal distribution (average height in Sweden and Denmark). Small difference: equal means. Big difference: not equal means.]

Page 6: Hypothesis Testing

BINOMIAL DISTRIBUTION (Proportion of Turks in Århus and Aalborg)

BIG OR NOT?

Page 7: Hypothesis Testing

The Test Procedure

Formulate a HYPOTHESIS!

Page 8: Hypothesis Testing

Numerically bigger than

Does the data support the hypothesis or not?

Page 9: Hypothesis Testing

Types of errors

• Type I error: Rejecting H0 falsely (H0 is true).

• Type II error: Accepting H0 falsely (H0 is false).

Decision      H0 is true      H0 is false

Reject H0     Type I error    No error

Accept H0     No error        Type II error

Ideally we would like a test where it is difficult to make errors.

Page 10: Hypothesis Testing

Unfortunately

If you make a test where

• it is difficult to make a Type I error, then

• it is easy to make a Type II error,

• and the other way around.
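The trade-off can be illustrated numerically. The sketch below is not from the slides: it assumes a two-sided test on a standardized average that rejects when the average is more than `cutoff` standard errors from the hypothesized value, and it assumes a hypothetical true shift of 2.5 standard errors when computing the Type II error. The names `error_probabilities`, `cutoff` and `true_shift` are made up for illustration.

```python
from statistics import NormalDist


def error_probabilities(cutoff, true_shift):
    """Error probabilities for a test that rejects when |z| > cutoff.

    cutoff     -- critical value in units of SE (hypothetical parameter)
    true_shift -- assumed distance between the true mean and the
                  hypothesized mean, in units of SE (hypothetical)
    """
    standard = NormalDist()  # distribution of z when H0 is true
    # Type I: H0 is true, but z lands outside [-cutoff, cutoff].
    type_1 = 2 * (1 - standard.cdf(cutoff))
    # Type II: H0 is false (mean shifted), but z lands inside [-cutoff, cutoff].
    shifted = NormalDist(mu=true_shift, sigma=1)
    type_2 = shifted.cdf(cutoff) - shifted.cdf(-cutoff)
    return type_1, type_2


for cutoff in (1.0, 2.0, 3.0):
    t1, t2 = error_probabilities(cutoff, true_shift=2.5)
    print(f"cutoff={cutoff}: P(Type I)={t1:.3f}, P(Type II)={t2:.3f}")
```

Raising the cutoff makes a Type I error less likely and a Type II error more likely, which is the trade-off described on this slide.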

Page 11: Hypothesis Testing

Level of significance

So we want to construct a way to decide to

• ACCEPT or

• REJECT

the hypothesis based on data in a way such that

Page 12: Hypothesis Testing

This sounds really technical!!!

Hmm

I don’t like this at all!

Page 13: Hypothesis Testing

Critical Region

Assume

• We want to test if the sodium content here is approx. 3.8 units

• We have data y1, …, yn

• We have calculated average and SE.

[Figure labels: Support that content is < 3.8 | Support that content is 3.8 | Support that content is > 3.8]

Page 14: Hypothesis Testing

What do we know?

If the content is 3.8, then the average is normally distributed with mean 3.8.

With probability 95%, the average is less than 2*SE from 3.8.

If the true content is 3.8, then the average is in the red area with probability 5%.

Page 15: Hypothesis Testing

Test:

• The hypothesis is that the true content is 3.8.

• Estimate mean and SE.

• The critical region is the set of averages more than 2*SE from 3.8.

• If the average is in the critical region, then reject the hypothesis; else accept.

Significance level

Prob(Type I error) = 5 %
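A minimal sketch of this decision rule, assuming the critical region is "average more than 2*SE away from 3.8", as suggested by the 95% statement on the previous slide. The function name `critical_region_test`, the parameter `k` and the second example value (4.1) are hypothetical and not taken from the slides.

```python
def critical_region_test(average, se, hypothesized=3.8, k=2.0):
    """Return 'reject' if the average lies in the critical region, else 'accept'.

    The critical region is assumed to be all averages further than
    k * SE from the hypothesized value (k = 2 gives roughly a 5% level).
    """
    in_critical_region = abs(average - hypothesized) > k * se
    return "reject" if in_critical_region else "accept"


# Average equal to the hypothesized content: no evidence against it.
print(critical_region_test(average=3.8, se=0.1))
# Hypothetical average three standard errors above 3.8.
print(critical_region_test(average=4.1, se=0.1))
```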

Page 16: Hypothesis Testing

Alternative approach

Can we give a number telling us to what extent the observations support the hypothesis?

Yes, of course!

Why do you think I asked?

Hmmm

Supports hypothesis

Here we should definitely reject

Page 17: Hypothesis Testing

If the true content is 3.8 then

and

Assume that we observe an average of 3.8 and SE = 0.1

Then what?

Page 18: Hypothesis Testing

What is the probability of observing this???

95% of data sets will have an average in this area (mean +/- 2 SE)

Assume we obtain an average of 3.8 and standard error SE = 0.1 and the true concentration is 3.8

Page 19: Hypothesis Testing

P-value
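The slide content for the p-value did not survive in the transcript. As a hedged sketch, the usual two-sided p-value is the probability, if the hypothesis is true, of an average at least as far from the hypothesized value as the one observed. The code assumes the average is normally distributed, as stated earlier; the function name `two_sided_p_value` and the second example value (4.0) are hypothetical.

```python
from statistics import NormalDist


def two_sided_p_value(average, se, hypothesized=3.8):
    """Two-sided p-value under a normal approximation for the average."""
    # Distance from the hypothesized value, measured in standard errors.
    z = (average - hypothesized) / se
    # Probability of landing at least this far out, on either side.
    return 2 * (1 - NormalDist().cdf(abs(z)))


# The case from the slides: average 3.8, SE 0.1, hypothesized content 3.8.
print(two_sided_p_value(3.8, 0.1))
# Hypothetical average two standard errors above 3.8.
print(two_sided_p_value(4.0, 0.1))
```

A large p-value means the observations are compatible with the hypothesis; a small one (below 5% at the level used here) means they are not.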

Page 20: Hypothesis Testing

Summing Up

A statistical test can be carried out

1. On a 5% significance level

2. By calculating the p-value

Page 21: Hypothesis Testing

Hypothesis about the Mean

1. Is the concentration 3.8?

2. Is the proportion of Turks in Århus 7.5%?

Normal Distribution

Binomial Distribution

Page 22: Hypothesis Testing

Sodium

1. Are data normal?

2. Estimate average and standard error

3. Calculate the test statistic t

4. Is t bigger than 2 (numerically)? OR

5. Calculate p-value
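Steps 2 to 4 can be sketched as follows. The formula for t is missing from the transcript; the code assumes the common form t = (average - 3.8) / SE with SE = standard deviation / sqrt(n). The measurements are made up for illustration, the function name `one_sample_t` is hypothetical, and the normality check in step 1 is not implemented here.

```python
import math
import statistics


def one_sample_t(data, hypothesized):
    """Return (average, standard error, t) for a hypothesized mean."""
    n = len(data)
    average = statistics.mean(data)
    # Standard error of the average: sample standard deviation / sqrt(n).
    se = statistics.stdev(data) / math.sqrt(n)
    # Assumed test statistic: distance from the hypothesis in units of SE.
    t = (average - hypothesized) / se
    return average, se, t


# Made-up sodium measurements, not data from the lecture.
measurements = [4.0, 4.1, 3.9, 4.0, 3.9, 4.1, 4.0, 4.0]
average, se, t = one_sample_t(measurements, hypothesized=3.8)
print(f"average={average:.3f}, SE={se:.3f}, t={t:.2f}")
print("reject" if abs(t) > 2 else "accept")
```

The cutoff of 2 is the rule of thumb used in these slides; for small samples the exact cutoff from the t distribution is somewhat larger.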

Page 23: Hypothesis Testing

Turks

1. Are data binomial?

2. Calculate proportion p and standard error

3. Calculate the test statistic t

4. Is t bigger than 2 (numerically)?
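The same steps for a proportion can be sketched as below. The formulas are missing from the transcript, so this assumes the standard error sqrt(p * (1 - p) / n) computed from the estimated proportion, and t = (p - 0.075) / SE with the 7.5% taken from the hypothesis stated earlier. The counts (90 out of 1000) are invented for illustration, and the function name `proportion_t` is hypothetical.

```python
import math


def proportion_t(successes, n, hypothesized):
    """Return (proportion, standard error, t) for a hypothesized proportion."""
    p = successes / n
    # Assumed standard error, based on the estimated proportion.
    se = math.sqrt(p * (1 - p) / n)
    t = (p - hypothesized) / se
    return p, se, t


# Invented sample: 90 out of 1000 people, tested against 7.5%.
p, se, t = proportion_t(successes=90, n=1000, hypothesized=0.075)
print(f"p={p:.3f}, SE={se:.4f}, t={t:.2f}")
print("reject" if abs(t) > 2 else "accept")
```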

Page 24: Hypothesis Testing

Last slide before the end

• Is 3.8 in the 95% CI?

• Accept the hypothesis (mean = 3.8) on a 5% significance level

That’s the same!!
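A small check of this equivalence, assuming the 95% CI is taken as average +/- 2*SE and the test accepts when the average is within 2*SE of the hypothesized value. Both function names and the example averages are hypothetical.

```python
def in_confidence_interval(average, se, value, k=2.0):
    """True if value lies inside the interval average +/- k * SE."""
    lower, upper = average - k * se, average + k * se
    return lower <= value <= upper


def accept_by_test(average, se, value, k=2.0):
    """True if the average is within k * SE of the hypothesized value."""
    return abs(average - value) <= k * se


# Hypothetical averages, all with SE = 0.1, tested against 3.8.
for average in (3.5, 3.7, 3.8, 3.95, 4.1):
    by_ci = in_confidence_interval(average, 0.1, 3.8)
    by_test = accept_by_test(average, 0.1, 3.8)
    print(f"average={average}: in CI={by_ci}, accepted={by_test}")
```

For every average in the loop the two answers agree, because both conditions say that the distance between the average and 3.8 is at most 2*SE.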

Page 25: Hypothesis Testing

The End