Estimating $\pi_0$

Estimating the proportion of true null hypotheses with the method of moments

By Jose M Muino. Email: jmui@igr.poznan.pl

The objective

Objective: to obtain some information ($\pi_0$ and moments) to help in the construction of the critical region in a multiple hypotheses problem.

The situation:

Low sample size

The distribution under the null hypothesis is unknown

But the expectation of the null distribution is known

Definitions

[Figure: the densities $f_0(T)$ and $f_1(T)$ and the mixture $g(T) = \pi_0 f_0(T) + (1 - \pi_0) f_1(T)$, plotted against $t$]

Let $T_i$, $i = 1, \dots, m$, be the test statistics for testing null hypotheses $H_{0,i}$ based on observable random variables.

Assume that $H_{0,i}$ is true with probability $\pi_0$ and false with probability $(1 - \pi_0)$.

Assume $T_i$ follows a density function $f_0(T)$ under $H_{0,i}$ and $f_1(T)$ if $H_{0,i}$ is false.

Assume that the first $m_0 = m\pi_0$ hypotheses $H_{0,i}$ are true, and the next $m_1 = m(1 - \pi_0)$ hypotheses $H_{0,i}$ are false.

$\mu_{j,f(x)}$ denotes the $j$th raw moment of the distribution $f(x)$

$\mu^{c}_{j,f(x)}$ denotes the $j$th central moment of the distribution $f(x)$

The idea

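$$g(T) = \pi_0 f_0(T) + (1 - \pi_0) f_1(T)$$

$$\mu_{1,g(T)} = \pi_0\,\mu_{1,f_0(T)} + (1 - \pi_0)\,\mu_{1,f_1(T)}$$

$$\mu_{2,g(T)} = d_2 + \pi_0\,\mu_{1,f_0(T)}^{2} + (1 - \pi_0)\,\mu_{1,f_1(T)}^{2}$$

Define:

$$d_2 = \pi_0\,\mu^{c}_{2,f_0(T)} + (1 - \pi_0)\,\mu^{c}_{2,f_1(T)}$$

Then:

$$\pi_0 = \frac{\mu_{2,g(T)} - \mu_{1,g(T)}^{2} - d_2}{\mu_{2,g(T)} - 2\,\mu_{1,f_0(T)}\,\mu_{1,g(T)} + \mu_{1,f_0(T)}^{2} - d_2}$$

$$\mu_{1,f_1(T)} = \frac{\mu_{2,g(T)} - \mu_{1,f_0(T)}\,\mu_{1,g(T)} - d_2}{\mu_{1,g(T)} - \mu_{1,f_0(T)}}$$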
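As a quick check on the algebra, the sketch below plugs the exact population moments of a two-component mixture into the two closed-form expressions for $\pi_0$ and $\mu_{1,f_1(T)}$ and confirms that they return the values used to build the mixture. The function name `solve_moments` and the numeric values are illustrative assumptions, not part of the original slides.

```python
def solve_moments(mu1_g, mu2_g, mu1_f0, d2):
    """Solve the two moment equations for pi0 and the mean of f1.

    mu1_g, mu2_g : first and second raw moments of the mixture g(T)
    mu1_f0       : mean of the null density f0(T), assumed known
    d2           : pi0 * var(f0) + (1 - pi0) * var(f1)
    """
    num = mu2_g - mu1_g ** 2 - d2
    den = mu2_g - 2.0 * mu1_f0 * mu1_g + mu1_f0 ** 2 - d2
    pi0 = num / den
    mu1_f1 = (mu2_g - mu1_f0 * mu1_g - d2) / (mu1_g - mu1_f0)
    return pi0, mu1_f1


# Hypothetical mixture: 90% N(0, 1) and 10% N(1, 1).
pi0_true, mean0, mean1, var0, var1 = 0.9, 0.0, 1.0, 1.0, 1.0

# Population moments of the mixture, built from the definitions.
mu1_g = pi0_true * mean0 + (1 - pi0_true) * mean1
d2 = pi0_true * var0 + (1 - pi0_true) * var1
mu2_g = d2 + pi0_true * mean0 ** 2 + (1 - pi0_true) * mean1 ** 2

pi0, mu1_f1 = solve_moments(mu1_g, mu2_g, mean0, d2)
print(pi0, mu1_f1)  # approximately 0.9 and 1.0, up to rounding error
```

Note that both expressions break down when $\mu_{1,g(T)} = \mu_{1,f_0(T)}$, that is, when the mixture mean coincides with the null mean.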

The estimators

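$$\pi_0 = \frac{\mu_{2,g(T)} - \mu_{1,g(T)}^{2} - d_2}{\mu_{2,g(T)} - 2\,\mu_{1,f_0(T)}\,\mu_{1,g(T)} + \mu_{1,f_0(T)}^{2} - d_2}$$

$$\mu_{1,f_1(T)} = \frac{\mu_{2,g(T)} - \mu_{1,f_0(T)}\,\mu_{1,g(T)} - d_2}{\mu_{1,g(T)} - \mu_{1,f_0(T)}}$$

Because:

$$d_2 = \pi_0\,\mu^{c}_{2,f_0(T)} + (1 - \pi_0)\,\mu^{c}_{2,f_1(T)}$$

Then:

$$\hat{d}_2 = \frac{1}{m}\sum_{i=1}^{m}\hat{\mu}^{c}_{2,i}$$

$\mu_{1,f_0(T)}$: assumed known

$$\hat{\mu}_{1,g(T)} = \frac{1}{m}\sum_{i=1}^{m} T_i \qquad \hat{\mu}_{2,g(T)} = \frac{1}{m}\sum_{i=1}^{m} T_i^{2}$$

Any moment

Estimators

Sample level:

$$\hat{d}_j = \frac{1}{m}\sum_{i=1}^{m}\hat{\mu}^{c}_{j,i}$$

Test value level:

$$\hat{\mu}_{j,g(T)} = \frac{1}{m}\sum_{i=1}^{m} T_i^{j}$$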
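The plug-in step can be sketched in a few lines: averages of $T_i$ and $T_i^2$ stand in for the raw moments of $g(T)$, and the average of the per-test central-moment estimates stands in for $d_2$. The function name `estimate_pi0`, its argument names and the toy numbers are assumptions made for illustration only.

```python
def estimate_pi0(T, mu2c_hat, mu1_f0=0.0):
    """Plug-in method-of-moments estimates of pi0 and of the mean of f1.

    T        : list of m test statistics (test value level)
    mu2c_hat : list of m estimated second central moments, one per T_i
               (sample level)
    mu1_f0   : mean of the null distribution, assumed known
    """
    m = len(T)
    mu1_g = sum(T) / m                 # (1/m) * sum of T_i
    mu2_g = sum(t * t for t in T) / m  # (1/m) * sum of T_i^2
    d2 = sum(mu2c_hat) / m             # (1/m) * sum of central-moment estimates
    num = mu2_g - mu1_g ** 2 - d2
    den = mu2_g - 2.0 * mu1_f0 * mu1_g + mu1_f0 ** 2 - d2
    pi0_hat = num / den
    mu1_f1_hat = (mu2_g - mu1_f0 * mu1_g - d2) / (mu1_g - mu1_f0)
    return pi0_hat, mu1_f1_hat


# Toy input (made-up numbers): four test statistics and their variance estimates.
T = [0.0, 0.0, 0.0, 2.0]
mu2c_hat = [0.1, 0.1, 0.1, 0.1]
pi0_hat, mu1_f1_hat = estimate_pi0(T, mu2c_hat)
print(pi0_hat, mu1_f1_hat)
```

Only the known null mean enters at the test level; no assumption about the shape of $f_0(T)$ or $f_1(T)$ is used in this sketch.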

Example: The mean value as test statistic

The properties of the estimators can be studied using Taylor series expansions.

The properties will be illustrated with the example of the mean value as test statistic.

Testing $m$ hypotheses regarding $m$ observed samples $x_{i,j}$, $i = 1, \dots, m$, $j = 1, \dots, n$, using the mean of the observations as the test statistic:

$$T_i = \bar{x}_i, \qquad H_{0,i}: \mu_i = 0, \qquad H_{1,i}: \mu_i \neq 0$$
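For the mean as test statistic, the two levels of information can be computed directly from the raw samples. The sketch below assumes independent observations within each sample, so that the second central moment of $T_i$ is $\sigma_i^2 / n$ and can be estimated by $s_i^2 / n$. The settings ($m_0 = 450$, $m_1 = 50$, $n = 5$, unit variance) and the variable names are hypothetical.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical settings: 450 true nulls with observations from N(0, 1),
# 50 false nulls with observations from N(1, 1), n = 5 observations each.
m0, m1, n = 450, 50, 5
samples = [[random.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m0)]
samples += [[random.gauss(1.0, 1.0) for _ in range(n)] for _ in range(m1)]

# Test value level: T_i is the mean of sample i.
T = [statistics.mean(x) for x in samples]

# Sample level: the variance of a mean of n independent observations is
# sigma^2 / n, estimated here by s_i^2 / n (an assumption of this sketch).
mu2c_hat = [statistics.variance(x) / n for x in samples]

# d2_hat averages the per-sample estimates over all m tests.
d2_hat = sum(mu2c_hat) / len(mu2c_hat)
print(round(d2_hat, 3))  # should be close to 1 / n = 0.2
```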

Properties

Assuming independence

$$\hat{d}_j = \frac{1}{m}\sum_{i=1}^{m}\hat{\mu}^{c}_{j,i}$$

[Equations not recoverable from the extracted slides: properties of $\hat{\pi}_0$ and $\hat{d}_j$]

Numerical Simulations

$m_0 = 450$, $m_1 = 50$, $H_0 \to N(0,1)$, $H_1 \to N(1,1)$ (2000 simulations)

[Figure: (a) bias and (b) standard deviation of the $\pi_0$ estimators versus sample size $n$; legend labels B and S]

$m_0 = 1000$, $m_1 = 500$, $n = 2, \dots, 40$, $f_0(T) = N(0, 5/n)$, $f_1(T) = N(2, 5/n)$

5000 simulations
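A small Monte Carlo study of this kind can be sketched as follows. It is not the code behind the slides: it uses far fewer replications, a single sample size, and it assumes that the second argument of $N(\cdot,\cdot)$ is a variance, so each observation has standard deviation $\sqrt{5}$. The function names are hypothetical.

```python
import random
import statistics


def pi0_hat_from_samples(samples, mu1_f0=0.0):
    """Method-of-moments estimate of pi0 when T_i is the sample mean."""
    m, n = len(samples), len(samples[0])
    T = [sum(x) / n for x in samples]
    d2 = sum(statistics.variance(x) / n for x in samples) / m
    mu1_g = sum(T) / m
    mu2_g = sum(t * t for t in T) / m
    num = mu2_g - mu1_g ** 2 - d2
    den = mu2_g - 2.0 * mu1_f0 * mu1_g + mu1_f0 ** 2 - d2
    return num / den


def simulate(m0, m1, n, mean1, sd, reps, seed=0):
    """Return the Monte Carlo bias and standard deviation of pi0_hat."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(reps):
        # First m0 samples come from the null, the next m1 from the alternative.
        samples = [[rng.gauss(0.0, sd) for _ in range(n)] for _ in range(m0)]
        samples += [[rng.gauss(mean1, sd) for _ in range(n)] for _ in range(m1)]
        estimates.append(pi0_hat_from_samples(samples))
    pi0_true = m0 / (m0 + m1)
    return statistics.mean(estimates) - pi0_true, statistics.stdev(estimates)


# Reduced setting: 20 replications at n = 5 instead of 5000 over n = 2..40.
bias, sd_hat = simulate(m0=1000, m1=500, n=5, mean1=2.0, sd=5 ** 0.5, reps=20)
print(bias, sd_hat)
```

Looping `simulate` over `n` would trace out curves comparable in spirit to the two panels, at a correspondingly higher running time.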

Numerical Simulations

[Figure: (a) bias and (b) standard deviation versus sample size $n$; legend labels B and S]

$m_0 = 1000$, $m_1 = 500$, $n = 2, \dots, 40$, $f_0(T) = N(0, 5)$, $f_1(T) = N(2, 5)$

5000 simulations

From moments to quantiles

A family of distributions (e.g. the Pearson family) can be used to calculate the quantiles from the estimated moments.
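The simplest instance of this step uses only the first two moments and the normal distribution, which belongs to the Pearson family as a limiting case. The sketch below is a deliberately reduced illustration: a fuller treatment would also match higher moments such as skewness and kurtosis. The function name and the one-sided upper-tail critical region are assumptions.

```python
from statistics import NormalDist


def upper_critical_value(mean0, var0, alpha):
    """Upper-tail critical value from the first two moments of the null.

    mean0 : mean of the null distribution (assumed known)
    var0  : second central moment of the null distribution (estimated)
    alpha : nominal type I error
    """
    return NormalDist(mu=mean0, sigma=var0 ** 0.5).inv_cdf(1.0 - alpha)


# Hypothetical null moments for a mean of n = 3 unit-variance observations.
c = upper_critical_value(mean0=0.0, var0=1.0 / 3, alpha=0.05)
print(c)  # reject H0 when T_i exceeds this value
```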

$m_0 = 15000$, $m_1 = 5000$, $f_0(T) = N(0, 1/n)$, $f_1(T) = N(4, 1/n)$

Type I error   n=3 MM   n=3 Classical   n=4 MM   n=4 Classical   n=5 MM   n=5 Classical
0.5            0.487    0.499           0.493    0.5             0.496    0.499
0.1            0.096    0.099           0.097    0.099           0.098    0.099
0.05           0.056    0.049           0.052    0.049           0.051    0.049
0.01           0.022    0.009           0.014    0.01            0.012    0.009
0.001          0.01     0.001           0.003    0.001           0.002    0.0009

100 simulations

From moments to quantiles

$m_0 = 10000$, $m_1 = 10000$, $f_0(T) = N(0, 1/n)$, $f_1(T) = N(4, 1/n)$

Type I error   n=3 MM   n=3 Classical   n=4 MM   n=4 Classical   n=5 MM   n=5 Classical
0.5            0.476    0.499           0.489    0.5             0.494    0.5
0.1            0.096    0.099           0.096    0.099           0.097    0.1
0.05           0.063    0.049           0.054    0.049           0.052    0.05
0.01           0.038    0.009           0.019    0.01            0.015    0.01
0.001          0.029    0.0009          0.007    0.001           0.003    0.001

$m_0 = 18000$, $m_1 = 2000$, $f_0(T) = N(0, 1/n)$, $f_1(T) = N(4, 1/n)$

Type I error   n=3 MM   n=3 Classical   n=4 MM   n=4 Classical   n=5 MM   n=5 Classical
0.5            0.489    0.5             0.495    0.5             0.496    0.5
0.1            0.097    0.1             0.098    0.099           0.098    0.1
0.05           0.054    0.05            0.052    0.049           0.051    0.05
0.01           0.018    0.01            0.013    0.01            0.012    0.01
0.001          0.006    0.0009          0.002    0.001           0.002    0.001

Advantages

Combines information from the sample level and the test value level.

No assumptions about the shape of the distribution (only finite moments are required).

Analytical solution: properties can be obtained and the estimator can be improved.

Disadvantages

A different estimator is needed for each test statistic.

Estimators of the central moments of the test statistics are required.

The estimate can fall outside the parameter space.
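The last point is easy to reproduce: when the averaged central-moment estimate exceeds the observed spread of the test statistics, the plug-in ratio turns negative. The sketch below shows one such case and a simple truncation to $[0, 1]$. Truncation is a common remedy offered here as an assumption, not something the slides prescribe, and the numbers are made up.

```python
def raw_pi0_hat(T, d2, mu1_f0=0.0):
    """Untruncated method-of-moments estimate of pi0."""
    m = len(T)
    mu1_g = sum(T) / m
    mu2_g = sum(t * t for t in T) / m
    num = mu2_g - mu1_g ** 2 - d2
    den = mu2_g - 2.0 * mu1_f0 * mu1_g + mu1_f0 ** 2 - d2
    return num / den


def truncate_to_unit_interval(value):
    """Map an estimate back into the parameter space [0, 1]."""
    return min(1.0, max(0.0, value))


# Made-up case in which d2 is larger than the sample variance of the T_i.
raw = raw_pi0_hat([0.0, 0.0, 0.0, 2.0], d2=0.8)
clipped = truncate_to_unit_interval(raw)
print(raw, clipped)  # a negative raw estimate is clipped to 0.0
```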

Thanks for your attention!!

Questions? Suggestions?

Or write to me at: jmui@igr.poznan.pl

Work funded by Marie Curie RTN: “Transistor” (“Trans-cis elements regulating key switches in plant development”)
