Say not, "I have found the truth," but rather, "I have found a truth."
Kahlil Gibran, "The Prophet"

Hypothesis testing

What is a hypothesis?

A statement about a population developed for the purpose of testing

•The population is so large that it is not feasible to study all the objects
•The alternative to measuring the entire population is to take a sample from the population
•We can then test a statement to determine whether the sample does or does not support the statement concerning the population

Examples:

•Eighty percent of those who play the state lotteries regularly never win more than 100€ in any one play
•The mean starting salary for graduates of four-year business schools is 3200€ per month
•Thirty-five percent of retirees in the upper Midwest sell their home and move to a warm climate within 1 year of their retirement

What is hypothesis testing?

A procedure based on sample evidence and probability theory to determine whether the hypothesis is a reasonable statement

•Start with a statement, or assumption, about a population parameter, e.g. the mean (the hypothesis)
•We can also verify assumptions about the shape of a statistical distribution

Example:

Hypothesis: The mean monthly commission of sales associates in retail electronics stores is in fact 2000€
•Select a sample from the population to test the assumption μ = 2000
•A sample mean of 1000€ would certainly cause rejection of the hypothesis
•What about a sample mean of 1995€?

Difference of 5€:
•Sampling error?
•Or a statistically significant difference?

Five-step procedure for testing a hypothesis

Step 1: State the Null Hypothesis (H0)

Null hypothesis: A statement about the value of a population parameter

•The hypothesis being tested
•Designated H0 and read "H sub zero"
•H stands for hypothesis
•The subscript zero implies "no difference"
•Often begins by stating: "There is no significant difference between ..."
•Will always contain the equal sign, e.g. H0: μ = 2000

Step 1: Alternate Hypothesis (H1)

Alternate hypothesis: A statement that is accepted if the sample data provide sufficient evidence that the null hypothesis is false

•It is written H1 and is read "H sub one"
•Often called the research hypothesis
•Never contains the equal sign
•e.g. H1: μ ≠ 2000
•We turn to the alternate hypothesis only if the data suggest the null hypothesis is untrue

Step 2: Level of significance

The probability of rejecting the null hypothesis when it is true

•Designated α (alpha)
•Sometimes called the level of risk
•A decision is made to use:
  •the 0.05 level (5% level), traditionally selected for consumer research projects
  •the 0.01 level, for quality assurance
  •the 0.1 level, for political polling
  •or any other value between 0 and 1

Possibility of two types of errors:

Type I error: Rejecting the null hypothesis H0 when it is true
•The probability of committing a type I error is α
•1 − α: the probability of accepting H0 when it is true (accepting a correct hypothesis)

Type II error: Accepting the null hypothesis when it is false
•The probability of committing a type II error is β
•1 − β: the power of the test

Type I and type II errors

•α = P(H1 | H0): the probability of accepting H1 when H0 is true
•β = P(H0 | H1): the probability of accepting H0 when H1 is true

[Figure: curves f(H0) and f(H1) with the areas α, β, 1 − α and 1 − β marked]

Type I and type II errors

•The type I error probability α and the type II error probability β are closely connected
•For a fixed sample size, reducing one type of error enlarges the other
•A compromise is necessary; for this reason α = 0.05 is usually selected

                    Researcher
Null hypothesis     Accepts H0          Rejects H0
H0 is true          Correct decision    Type I error
H0 is false         Type II error       Correct decision
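
The trade-off between α and β can be illustrated numerically. The sketch below assumes a two-tailed test of a mean with known σ; the function name `type_ii_error` and all numbers (μ0 = 2000, true mean 2030, σ = 100, n = 50) are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist


def type_ii_error(mu0, mu_true, sigma, n, alpha):
    """Beta for a two-tailed z-test of H0: mu = mu0 when the true mean is mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)       # critical value u_{1-alpha/2}
    shift = (mu_true - mu0) / (sigma / n ** 0.5)     # standardized true difference
    # H0 is accepted when the test statistic falls between -z_crit and z_crit;
    # under the true mean the statistic is distributed N(shift, 1).
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)


# Hypothetical setting: a smaller alpha gives a larger beta (lower power).
for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error(2000, 2030, 100, 50, alpha)
    print(f"alpha={alpha:.2f}  beta={beta:.3f}  power={1 - beta:.3f}")
```

When the true mean equals μ0, the function returns 1 − α, which is the probability of correctly accepting H0.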

Step 3: Select the test statistic

A value determined from the sample information, used to determine whether to reject the null hypothesis.

For example, in hypothesis testing for the mean, when σ is known or the sample size is large, the test statistic is computed by:

$$u = \frac{\bar{x} - \mu_0}{\sigma}\sqrt{n}$$

The formula depends on the test used.
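
As a minimal sketch, the statistic for a mean with known σ can be computed as follows. The function name `z_statistic` is hypothetical; the values μ0 = 2000 and sample mean 1995 come from the commission example, while σ = 100 and n = 50 are assumed here because the slides do not give them.

```python
from math import sqrt


def z_statistic(sample_mean, mu0, sigma, n):
    """Test statistic u = (x_bar - mu0) / sigma * sqrt(n)."""
    return (sample_mean - mu0) / sigma * sqrt(n)


# sigma and n are assumptions for illustration only.
u = z_statistic(sample_mean=1995, mu0=2000, sigma=100, n=50)
print(f"u = {u:.3f}")
```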

Step 4: Formulate the decision rule

Decision rule: A statement of the specific conditions under which the null hypothesis is rejected and the conditions under which it is not rejected

Critical value: The dividing point between the region where the null hypothesis is rejected and the region where it is not rejected

=> Compute the test statistic, compare it to the critical value, and make a decision to reject or not to reject the null hypothesis.

Two-tailed test

No direction is specified in the alternate hypothesis
H0: μ = μ0
H1: μ ≠ μ0

If $|u_{cal}| \le u_{1-\alpha/2}$ => do not reject H0
If $|u_{cal}| > u_{1-\alpha/2}$ => reject H0
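
The two-tailed rule can be sketched in a few lines, assuming the test statistic follows N(0,1) under H0. The function name `two_tailed_decision` is hypothetical; the critical value is taken from the standard normal quantile function in Python's standard library.

```python
from statistics import NormalDist


def two_tailed_decision(u_cal, alpha=0.05):
    """Apply the rule: reject H0 only if |u_cal| exceeds u_{1-alpha/2}."""
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return "reject H0" if abs(u_cal) > critical else "do not reject H0"


print(two_tailed_decision(2.5))               # compared with about 1.96
print(two_tailed_decision(2.5, alpha=0.01))   # compared with about 2.576
```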

One-tailed test

The alternate hypothesis states a direction, e.g.:
H0: μ ≥ μ0
H1: μ < μ0

•The null hypothesis includes the equal sign
•One way to determine the location of the rejection region is to look at the direction in which the inequality sign in the alternate hypothesis is pointing (either < or >). In this case it points to the left (<), so the rejection region is in the left tail.

Notice

•The critical values for a one-tailed test are different from those for a two-tailed test at the same significance level
•In a two-tailed test we split the significance level in half and put half in the lower tail and half in the upper tail
•In a one-tailed test we put all of the rejection region in one tail
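
To see how the critical values differ, the short sketch below computes both for the standard normal distribution. The function name `critical_values` is hypothetical, and the normal distribution is assumed as the distribution of the test statistic.

```python
from statistics import NormalDist


def critical_values(alpha):
    """Standard normal critical values for one-tailed and two-tailed tests."""
    z = NormalDist()
    return {
        "one_tailed": z.inv_cdf(1 - alpha),        # all of alpha in one tail
        "two_tailed": z.inv_cdf(1 - alpha / 2),    # alpha split between both tails
    }


for alpha in (0.10, 0.05, 0.01):
    cv = critical_values(alpha)
    print(f"alpha={alpha:.2f}  one-tailed={cv['one_tailed']:.3f}  two-tailed={cv['two_tailed']:.3f}")
```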

Differences between one-tailed and two-tailed tests

p-value in hypothesis testing

The probability of observing a sample value as extreme as, or more extreme than, the value observed, given that the null hypothesis is true.

•If p-value < significance level => H0 is rejected
•If p-value > significance level => H0 is not rejected
•It also gives additional insight into the strength of the decision
•A very small p-value, e.g. 0.0001, is strong evidence against H0: such a sample would be very unlikely if H0 were true
•On the other hand, a p-value of 0.2033 means that H0 is not rejected: the sample gives little evidence that H0 is false
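
A p-value for a statistic that follows N(0,1) under H0 might be computed as in this sketch. The function name `p_value` and its `tails` parameter are hypothetical; for a one-tailed test the sketch assumes the observed statistic lies in the direction stated by H1.

```python
from statistics import NormalDist


def p_value(u_cal, tails=2):
    """p-value of a standard normal test statistic (tails = 1 or 2)."""
    upper_tail = 1 - NormalDist().cdf(abs(u_cal))   # area beyond |u_cal| in one tail
    return tails * upper_tail


alpha = 0.05
p = p_value(1.27)   # hypothetical computed statistic
print(f"p = {p:.4f}:", "reject H0" if p < alpha else "do not reject H0")
```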

Testing for a population mean

Let X be normally distributed in the population, X ~ N(μ, σ²)
H0: μ = μ0
H1: μ ≠ μ0

est μ = x̄ and x̄ ~ N(μ, σ²/n)

a) The variance of the population is known; then the test statistic is:

$$u = \frac{\bar{x} - \mu_0}{\sigma}\sqrt{n}, \qquad u \sim N(0,1)$$

If $|u| \le u_{1-\alpha/2}$ => do not reject H0
If $|u| > u_{1-\alpha/2}$ => reject H0

b) The variance of the population is unknown, est σ² = s1², large sample (n > 30)

$$u = \frac{\bar{x} - \mu_0}{s_1}\sqrt{n}$$

N(0,1) can be used

If $|u| \le u_{1-\alpha/2}$ => do not reject H0
If $|u| > u_{1-\alpha/2}$ => reject H0

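c) The variance of the population is unknown, est σ² = s1², small sample (n ≤ 30)

Test statistic:

$$t = \frac{\bar{x} - \mu_0}{s_1}\sqrt{n}$$

Critical value: $t_{\alpha}(n-1)$, from Student's distribution with (n − 1) degrees of freedom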
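
A minimal sketch of the small-sample case follows. The function name `t_statistic` and the sample values are hypothetical. Python's standard library has no Student's t quantile function, so the critical value 2.262 is taken from a t table (two-tailed, α = 0.05, 9 degrees of freedom).

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(sample, mu0):
    """t = (x_bar - mu0) / s1 * sqrt(n), with s1 the sample standard deviation."""
    n = len(sample)
    return (mean(sample) - mu0) / stdev(sample) * sqrt(n)


# Hypothetical sample of n = 10 monthly commissions, tested against mu0 = 2000.
sample = [1990, 2010, 1985, 2005, 1995, 2020, 1980, 2000, 1975, 1990]
T_CRIT = 2.262  # table value: two-tailed, alpha = 0.05, n - 1 = 9 degrees of freedom

t_cal = t_statistic(sample, mu0=2000)
decision = "reject H0" if abs(t_cal) > T_CRIT else "do not reject H0"
print(f"t = {t_cal:.3f}: {decision}")
```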

Two sample test of hypothesis about mean, independent samples

Let variable X1 be normally distributed, X1 ~ N(μ1, σ1²)
Let variable X2 be normally distributed, X2 ~ N(μ2, σ2²)

Assume the means μ1 and μ2 are equal => H0: μ1 = μ2, H1: μ1 ≠ μ2 (two-tailed test)

est μ1 = x̄1 ~ N(μ1, σ1²/n1)
est μ2 = x̄2 ~ N(μ2, σ2²/n2)

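a) The variances of the populations σ1², σ2² are known; then

$$(\bar{x}_1 - \bar{x}_2) \sim N\left(\mu_1 - \mu_2;\ \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)$$

Test statistic (the second form holds under H0, where μ1 − μ2 = 0):

$$u = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{n_2\sigma_1^2 + n_1\sigma_2^2}{n_1 n_2}}} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{n_2\sigma_1^2 + n_1\sigma_2^2}{n_1 n_2}}}$$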
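
The two-sample statistic with known variances could be computed as in the sketch below. The function name `two_sample_z` and the input numbers are hypothetical; the statistic is evaluated under H0, so μ1 − μ2 = 0.

```python
from math import sqrt


def two_sample_z(mean1, mean2, var1, var2, n1, n2):
    """u = (x1_bar - x2_bar) / sqrt((n2*var1 + n1*var2) / (n1*n2)), under H0."""
    return (mean1 - mean2) / sqrt((n2 * var1 + n1 * var2) / (n1 * n2))


# Hypothetical inputs: known population variances 4 and 9, sample sizes 16 and 36.
u = two_sample_z(mean1=10, mean2=8, var1=4, var2=9, n1=16, n2=36)
print(f"u = {u:.3f}")
```

The denominator equals the square root of σ1²/n1 + σ2²/n2, the variance of the difference of the two sample means.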

b) The variances of the populations σ1², σ2² are unknown and both samples are large (n1 > 30, n2 > 30)

We can use the same test statistic as in a)

The variances of the populations are replaced by their point estimates:

est σ1² = s11², est σ2² = s12²

c) The variances of the populations are unknown and at least one sample is small (n1 ≤ 30 or n2 ≤ 30)

=> If we can assume equality of variances, σ1² = σ2² = σ², then we can use the t-test statistic with Student's distribution:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{(n_1-1)s_{11}^2 + (n_2-1)s_{12}^2}{n_1+n_2-2}}}\sqrt{\frac{n_1 n_2}{n_1+n_2}}$$

It is compared with the critical value $t_{\alpha}$ for (n1 + n2 − 2) degrees of freedom
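
A sketch of the pooled-variance statistic is shown below. The function name `pooled_t` and the two tiny samples are hypothetical; the critical value for the returned degrees of freedom would still have to be looked up in a t table.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample1, sample2):
    """Return (t, degrees of freedom) for two independent samples, equal variances assumed."""
    n1, n2 = len(sample1), len(sample2)
    # Pooled estimate of the common variance from both sample variances.
    pooled_var = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / (n1 + n2 - 2)
    t = (mean(sample1) - mean(sample2)) / sqrt(pooled_var) * sqrt(n1 * n2 / (n1 + n2))
    return t, n1 + n2 - 2


t_cal, df = pooled_t([1, 2, 3], [2, 3, 4])   # hypothetical data
print(f"t = {t_cal:.3f} with {df} degrees of freedom")
```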

d) The variances of the populations are unknown, at least one sample is small (n1 ≤ 30 or n2 ≤ 30), and we cannot assume equality of variances (σ1² ≠ σ2², verified by the F test)

=> We can use the Behrens-Fisher test for unequal variances
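
The slides do not give a formula for this case. One widely used approximate solution to the Behrens-Fisher problem is Welch's t-test with Welch-Satterthwaite degrees of freedom; the sketch below shows that swapped-in technique, not necessarily the procedure the lecture intends. The function name `welch_t` and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample1, sample2):
    """Return (t, approximate degrees of freedom) without assuming equal variances."""
    n1, n2 = len(sample1), len(sample2)
    v1 = variance(sample1) / n1   # estimated variance of the first sample mean
    v2 = variance(sample2) / n2   # estimated variance of the second sample mean
    t = (mean(sample1) - mean(sample2)) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation; the result is usually not an integer.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


t_cal, df = welch_t([12, 15, 11, 14, 13], [20, 9, 25, 7, 18, 30])   # hypothetical data
print(f"t = {t_cal:.3f}, df = {df:.2f}")
```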

Two-sample tests of hypothesis: Dependent samples

Samples are dependent, or related

Two types of dependent samples:
1. Those characterized by a measurement, an intervention of some type, and then another measurement
2. Matching or pairing of observations (paired samples)

We make several measurements on the same statistical units, and we get:

x11, x12, …, x1j, …, x1n
x21, x22, …, x2j, …, x2n

In x_ij:
•j = 1, 2, …, n is the index of measurement order
•i = 1, 2 is the index that distinguishes the sets of measurements in time

We can calculate the difference for each pair: d_j = x_1j − x_2j

$$\operatorname{est}\mu_d = \bar{d}, \qquad \operatorname{est}\sigma_d^2 = \frac{1}{n-1}\sum_{j=1}^{n}(d_j - \bar{d})^2$$

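H0: μ1 = μ2, or H0: μ_d = 0
Against the alternate hypothesis H1: μ_d ≠ 0

The test statistic has Student's distribution with (n − 1) degrees of freedom:

$$t = \frac{\bar{d}}{\sqrt{\operatorname{est}\sigma_d^2 / n}} = \frac{\bar{d}}{\sqrt{\dfrac{\sum_{j=1}^{n}(d_j - \bar{d})^2}{n(n-1)}}}$$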

What will be possible results?
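
As an illustration, the paired statistic can be computed from the differences d_j as in this sketch. The function name `paired_t` and the before/after measurements are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t, degrees of freedom) for paired measurements on the same units."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    d = [a - b for a, b in zip(first, second)]   # d_j = x_1j - x_2j
    n = len(d)
    # stdev uses the (n - 1) denominator, matching est sigma_d^2.
    return mean(d) / stdev(d) * sqrt(n), n - 1


t_cal, df = paired_t([10, 12, 9, 11], [8, 11, 9, 8])   # hypothetical before/after values
print(f"t = {t_cal:.3f} with {df} degrees of freedom")
```

The two possible results are: |t| does not exceed the critical value, so H0: μ_d = 0 is not rejected, or |t| exceeds it and H0 is rejected.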

Hypothesis testing of variance

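A) Test of equality of a variance to a constant

H0: σ² = σ0², est σ² = s1²
H1: σ² ≠ σ0²

Test statistic:

$$\chi^2 = \frac{(n-1)s_1^2}{\sigma_0^2}$$

χ² distribution with (n − 1) degrees of freedom

•Do not reject H0 if $\chi^2_{1-\alpha/2} \le \chi^2 \le \chi^2_{\alpha/2}$ (the subscript is the upper-tail probability, so $\chi^2_{1-\alpha/2}$ is the lower critical value)
•Rejection regions: $\chi^2 < \chi^2_{1-\alpha/2}$ (left tail) and $\chi^2 > \chi^2_{\alpha/2}$ (right tail)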
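
A sketch of the variance test is given below. The function names, the sample and σ0² = 400 are hypothetical. The standard library has no χ² quantile function, so the critical values 2.700 and 19.023 are table values for 9 degrees of freedom and α = 0.05.

```python
from statistics import variance


def chi_square_statistic(sample, sigma0_sq):
    """Return (chi-square statistic, degrees of freedom) for H0: sigma^2 = sigma0_sq."""
    n = len(sample)
    return (n - 1) * variance(sample) / sigma0_sq, n - 1


def variance_decision(chi_sq, lower_crit, upper_crit):
    """Two-tailed rule: reject H0 when the statistic falls in either tail."""
    return "do not reject H0" if lower_crit <= chi_sq <= upper_crit else "reject H0"


sample = [1990, 2010, 1985, 2005, 1995, 2020, 1980, 2000, 1975, 1990]   # hypothetical
chi_sq, df = chi_square_statistic(sample, sigma0_sq=400)
# Table values for df = 9, alpha = 0.05 (upper-tail probabilities 0.975 and 0.025).
print(f"chi-square = {chi_sq:.3f}, df = {df}:", variance_decision(chi_sq, 2.700, 19.023))
```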

B) Test for equality of variances in two samples

H0: σ1² = σ2², est σ1² = s11², est σ2² = s12²
H1: σ1² > σ2² (one-tailed test)

Test statistic:

$$F = \frac{s_{11}^2}{s_{12}^2}$$

Fisher distribution with degrees of freedom ν1 = n1 − 1, ν2 = n2 − 1

Note: The higher variance goes in the numerator => F > 1

•$F < F_{\alpha}(\nu_1, \nu_2)$: do not reject H0; the variances of the two populations can be considered equal
•$F \ge F_{\alpha}(\nu_1, \nu_2)$: H0 is rejected; the variance of the first population (numerator) is significantly greater
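
The F statistic with the larger sample variance in the numerator could be computed as follows. The function name `f_statistic` and the data are hypothetical, and the critical value 6.39 is a table value for α = 0.05 with 4 and 4 degrees of freedom. The sketch assumes neither sample has zero variance.

```python
from statistics import variance


def f_statistic(sample1, sample2):
    """Return (F, numerator df, denominator df), larger variance in the numerator."""
    v1, v2 = variance(sample1), variance(sample2)
    if v1 >= v2:
        return v1 / v2, len(sample1) - 1, len(sample2) - 1
    return v2 / v1, len(sample2) - 1, len(sample1) - 1


F_CRIT = 6.39   # table value for alpha = 0.05, nu1 = 4, nu2 = 4
f_cal, nu1, nu2 = f_statistic([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])   # hypothetical data
decision = "reject H0" if f_cal >= F_CRIT else "do not reject H0"
print(f"F = {f_cal:.2f} with ({nu1}, {nu2}) degrees of freedom: {decision}")
```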

References:

•Statistics for Business and Economics, 6e, © 2007 Pearson Education, Inc., Chapters 10 and 11 => Recommended reading (do as I did ;-)
•Slovak lectures by prof. Ing. Zlata Sojková, CSc.
•Other recommended study materials: http://moodle.uniag.sk/fem/course/view.php?id=211 => Moodle course of statistics

That's all, folks
Don't worry, be happy ;-)