Area Test for Observations Indexed by Time L. B. Green Middle Tennessee State University E. M. Boczko Vanderbilt University


Page 1: Area Test for Observations Indexed by Time

Area Test for Observations Indexed by Time

L. B. Green, Middle Tennessee State University

E. M. Boczko, Vanderbilt University

Page 2: Area Test for Observations Indexed by Time

Outline

Problem
The Null Hypothesis
The Statistic
Determining Significance
Comparison to Other Tests
Extending the Test

Page 3: Area Test for Observations Indexed by Time

The Problem

Four observations of mouse RNA at each of 2, 3, 7, and 21 days after birth.

Test whether there is a change in the metabolic regulation of fatty acid metabolism and, if so, when the change happens.

Page 4: Area Test for Observations Indexed by Time

The Problem

Independent observations at each time, represented by:

$t_i$, $i = 0, 1, \dots, n$ and $X_{i,j}$, $j = 0, 1, \dots, k_i$

A value of zero represents “no change.” Positive values represent an increase, negative values represent a decrease.

Page 5: Area Test for Observations Indexed by Time

The Problem

[Figure: plot against the time points $t_1$, $t_2$, $t_3$, $t_4$]

Page 6: Area Test for Observations Indexed by Time

The Null Hypothesis

There is no change at any time point.

$H_0$: the $X_{i,j}$ are identically distributed, with mean (or median) zero.

Page 7: Area Test for Observations Indexed by Time

The Null Hypothesis

If the null hypothesis is true, then the order of the observations is completely due to chance.

Page 8: Area Test for Observations Indexed by Time

The Statistic

Create a piecewise linear function whose value at each time point is the mean (or median) of the observations at that time point.

Calculate the square of the $L^2$ norm of this function.

Page 9: Area Test for Observations Indexed by Time

The Statistic

[Figure: plot against the time points $t_1$, $t_2$, $t_3$, $t_4$]

$$l := \|f\|_{L^2}^2 = \int_0^{t_n} f^2(t)\,dt$$
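Because the function is linear between consecutive time points, the integral can be evaluated one segment at a time. The following is a short worked derivation for a single segment. The shorthand $a = f(t_i)$, $b = f(t_{i+1})$, $\Delta = t_{i+1} - t_i$ and the substitution $s = (t - t_i)/\Delta$ are introduced here for convenience and do not appear on the slides.

```latex
\begin{aligned}
\int_{t_i}^{t_{i+1}} f^2(t)\,dt
  &= \Delta \int_0^1 \bigl(a + (b - a)s\bigr)^2 \,ds \\
  &= \Delta \left( a^2 + a(b - a) + \tfrac{1}{3}(b - a)^2 \right) \\
  &= \frac{\Delta}{3} \left( a^2 + ab + b^2 \right)
\end{aligned}
```

Summing this expression over all segments gives the squared norm of the whole function.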

Page 10: Area Test for Observations Indexed by Time

The Statistic

$$l = \sum_{i=0}^{n-1} \left(m_i^2 + m_i m_{i+1} + m_{i+1}^2\right) \frac{t_{i+1} - t_i}{3}$$

$$m_i = \frac{1}{k_i + 1} \sum_{j=0}^{k_i} X_{i,j}$$

Note: the $m_i$ may be medians rather than means.
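The statistic on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `area_statistic`, its `center` parameter, and the numeric values in the usage example are all hypothetical. Lists are indexed from zero, and each time point may have a different number of observations.

```python
from statistics import mean, median


def area_statistic(times, groups, center=mean):
    """Squared L2 norm of the piecewise linear curve through (t_i, m_i).

    times  -- increasing time points, one per group
    groups -- groups[i] holds the observations made at times[i]
    center -- summary used for m_i (mean by default, median as an option)
    """
    if len(times) != len(groups) or len(times) < 2:
        raise ValueError("need one group per time point and at least two time points")
    m = [center(g) for g in groups]
    total = 0.0
    for i in range(len(times) - 1):
        dt = times[i + 1] - times[i]
        # Exact integral of the squared linear segment between t_i and t_{i+1}.
        total += (m[i] ** 2 + m[i] * m[i + 1] + m[i + 1] ** 2) * dt / 3.0
    return total


# Hypothetical values: four observations at each of days 2, 3, 7 and 21.
times = [2, 3, 7, 21]
groups = [
    [0.1, -0.2, 0.0, 0.1],
    [0.3, 0.5, 0.4, 0.4],
    [0.9, 1.1, 1.0, 1.0],
    [1.2, 0.8, 1.0, 1.0],
]
print(area_statistic(times, groups))
print(area_statistic(times, groups, center=median))
```

Passing `center=median` gives the median-based variant mentioned in the note above.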

Page 11: Area Test for Observations Indexed by Time

The Statistic

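With $x$ the vector of all observations stacked in time order, the statistic is a quadratic form:

$$l(x) = x^T A^T L A x$$

$A$ averages the observations within each time point, with $n_i$ the number of observations at the $i$-th time point, so that $Ax$ is the vector of means:

$$A = \begin{pmatrix}
\frac{1}{n_1} & \cdots & \frac{1}{n_1} & 0 & \cdots & 0 & \cdots & 0 & \cdots & 0 \\
0 & \cdots & 0 & \frac{1}{n_2} & \cdots & \frac{1}{n_2} & \cdots & 0 & \cdots & 0 \\
\vdots & & \vdots & \vdots & & \vdots & \ddots & \vdots & & \vdots \\
0 & \cdots & 0 & 0 & \cdots & 0 & \cdots & \frac{1}{n_k} & \cdots & \frac{1}{n_k}
\end{pmatrix}$$

$L$ is symmetric and tridiagonal, with one row and one column per time point:

$$L = \begin{pmatrix}
\frac{t_2 - t_1}{3} & \frac{t_2 - t_1}{6} & 0 & \cdots & 0 \\
\frac{t_2 - t_1}{6} & \frac{t_3 - t_1}{3} & \frac{t_3 - t_2}{6} & \ddots & \vdots \\
0 & \frac{t_3 - t_2}{6} & \frac{t_4 - t_2}{3} & \ddots & 0 \\
\vdots & \ddots & \ddots & \frac{t_n - t_{n-2}}{3} & \frac{t_n - t_{n-1}}{6} \\
0 & \cdots & 0 & \frac{t_n - t_{n-1}}{6} & \frac{t_n - t_{n-1}}{3}
\end{pmatrix}$$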
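The quadratic form $x^T A^T L A x$ can be checked numerically. The sketch below builds an averaging matrix and a tridiagonal matrix of segment weights as plain nested lists and evaluates the product. The helper names (`averaging_matrix`, `gram_matrix`, `quadratic_form`) and the sample numbers are hypothetical and chosen only for illustration.

```python
def averaging_matrix(counts):
    """A: one row per time point, with 1/n_i over that time point's observations."""
    total = sum(counts)
    rows, start = [], 0
    for n_i in counts:
        row = [0.0] * total
        for j in range(start, start + n_i):
            row[j] = 1.0 / n_i
        rows.append(row)
        start += n_i
    return rows


def gram_matrix(times):
    """L: symmetric tridiagonal matrix with m^T L m equal to the integral of f^2."""
    k = len(times)
    L = [[0.0] * k for _ in range(k)]
    for i in range(k - 1):
        dt = times[i + 1] - times[i]
        # Each segment adds dt/3 to both of its diagonal entries
        # and dt/6 to the two off-diagonal entries that couple them.
        L[i][i] += dt / 3.0
        L[i + 1][i + 1] += dt / 3.0
        L[i][i + 1] += dt / 6.0
        L[i + 1][i] += dt / 6.0
    return L


def quadratic_form(x, A, L):
    """Evaluate x^T A^T L A x without any external libraries."""
    m = [sum(a * v for a, v in zip(row, x)) for row in A]    # m = A x
    Lm = [sum(c * v for c, v in zip(row, m)) for row in L]   # L m
    return sum(a * b for a, b in zip(m, Lm))                 # m^T (L m)


# Hypothetical data: two observations at each of four time points.
times = [0, 3, 6, 10]
x = [1, 3, 0, 2, -1, 1, 2, 4]
print(quadratic_form(x, averaging_matrix([2, 2, 2, 2]), gram_matrix(times)))
```

Writing the statistic this way makes it explicit that $l$ is a quadratic function of the raw observations, while the segment-by-segment sum is usually the simpler thing to compute.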

Page 12: Area Test for Observations Indexed by Time

Determining Significance

Bootstrap:

Sample from a distribution (constructed from the data) that does satisfy $H_0$.

Calculate new values of $l$ and compare them to the original value.

If $H_0$ is true, the original value will not stand out from the new values.

Page 13: Area Test for Observations Indexed by Time

Determining Significance

Calculate $\bar{X}$, the mean of all the data.

Calculate $Y_{i,j} = X_{i,j} - \bar{X}$.

Repeat the following two steps $B$ times:

Choose a new set of $X^*_{i,j}$ from $\{Y_{i,j}\}$, with replacement.

Calculate the new value of the test statistic, $l^*$.

Then calculate $p = \#\{l^* \ge l\} / B$.

Reject $H_0$ if $p < \alpha$.
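The bootstrap procedure on this slide might look as follows in Python. This is a sketch under stated assumptions rather than the authors' implementation: the names `area_statistic` and `bootstrap_p_value`, the default of 2000 resamples, the fixed seed, and the sample data are all hypothetical, and resampled values that tie with the observed statistic are counted as exceeding it.

```python
import random
from statistics import mean


def area_statistic(times, groups):
    """Squared L2 norm of the piecewise linear curve through the group means."""
    m = [mean(g) for g in groups]
    return sum(
        (m[i] ** 2 + m[i] * m[i + 1] + m[i + 1] ** 2) * (times[i + 1] - times[i]) / 3.0
        for i in range(len(times) - 1)
    )


def bootstrap_p_value(times, groups, n_boot=2000, seed=0):
    """Fraction of bootstrap statistics that are at least as large as the observed one."""
    rng = random.Random(seed)
    observed = area_statistic(times, groups)

    # Re-center every observation on the grand mean so that the
    # resampling distribution has mean zero, as H0 requires.
    all_obs = [x for g in groups for x in g]
    grand_mean = mean(all_obs)
    centered = [x - grand_mean for x in all_obs]

    exceed = 0
    for _ in range(n_boot):
        # Draw, with replacement, a new data set with the same group sizes.
        resampled = [rng.choices(centered, k=len(g)) for g in groups]
        if area_statistic(times, resampled) >= observed:
            exceed += 1
    return exceed / n_boot


# Hypothetical data: four observations at each of four time points.
times = [0, 3, 6, 10]
groups = [
    [0.1, -0.2, 0.0, 0.1],
    [0.3, 0.5, 0.4, 0.4],
    [0.9, 1.1, 1.0, 1.0],
    [1.2, 0.8, 1.0, 1.0],
]
print(bootstrap_p_value(times, groups))
```

The observed statistic is computed from the original, uncentered data, and only the resampling pool is re-centered. Seeding the generator keeps the reported value reproducible from run to run.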

Page 14: Area Test for Observations Indexed by Time

Determining Significance

Why sample from the original data?

The empirical distribution is the closest distribution we have to the true distribution.

Page 15: Area Test for Observations Indexed by Time

Determining Significance

Why re-center the data?

We must ensure that the distribution we are sampling from satisfies H0.

$$Y_{i,j} = X_{i,j} - \bar{X}$$

Page 16: Area Test for Observations Indexed by Time

Determining Significance

Reject $H_0$ if $p < \alpha$, where

$$p = \frac{\#\{l^* \ge l\}}{B}$$

If the sample size is large, this p-value is uniformly distributed under $H_0$. So

$$P(p < \alpha) = \alpha$$

Page 17: Area Test for Observations Indexed by Time

Determining Significance

If the sample size is small:

$t = (0, 3, 6, 10)$

Four observations per time point.

Page 18: Area Test for Observations Indexed by Time

Other Tests

Multiple t-tests

At each time point, perform a t-test to see if the mean is different from zero.

Combine these results using a Bonferroni correction.

Page 19: Area Test for Observations Indexed by Time

Other Tests

Multiple t-tests

Do not deal with time explicitly.

Have very small samples at each time point.

Assume normality of the data.

Page 20: Area Test for Observations Indexed by Time

Other Tests

ANOVA

Test for difference in means using one-way ANOVA.

Doesn’t explicitly deal with time.

Null hypothesis is that the means are the same, not that they are equal to zero.

Assumes normality.

Page 21: Area Test for Observations Indexed by Time

Other Tests

The area test is more powerful than multiple t-tests or ANOVA when applied to simulated data sets.

The data were simulated from distributions whose means increase linearly over time. In this case, power depends on the slope of the line.

Page 22: Area Test for Observations Indexed by Time

Extending the Test

Use median instead of mean at each time point.

This allows the test to be used in cases where the existence of the mean is in doubt.

Page 23: Area Test for Observations Indexed by Time

Extending the Test

Two data sets.

Test to see whether both sets of data come from the same distribution, and there is no change in distribution over time.

$$l := \|f - g\|_{L^2}^2$$

Page 24: Area Test for Observations Indexed by Time

Extending the Test

[Figure: plot against the time points $t_1$, $t_2$, $t_3$, $t_4$]

Page 25: Area Test for Observations Indexed by Time

Extending the Test

Two data sets. Distribution may change over time.

For example: Comparison to a control data set.

Resample within time points rather than across whole set.

$$l := \|f - g\|_{L^2}^2$$

Page 26: Area Test for Observations Indexed by Time

Extending the Test

[Figure: plot against the time points $t_1$, $t_2$, $t_3$, $t_4$]

Page 27: Area Test for Observations Indexed by Time

Thank You