
Sheet no.: 1 | Author: Øyvind Langsrud - a member of the Food Science Alliance | NLH - Matforsk - Akvaforsk

Rotation Tests - Computing exact adjusted p-values in multiresponse experiments

Øyvind Langsrud, MATFORSK, Norwegian Food Research Institute.

Multiple responses are common in industrial and scientific experimentation. Often these multiple response variables are related in some way. Digitisations of continuous curves and several related measures of the same physical phenomena are examples of such data. Ordinary variance analysis (or general linear modelling) of each response variable results in several p-values (raw p-values). We may want to adjust these p-values so that the experimentwise error rate is controlled. Bonferroni adjustment is the best-known method, but it is extremely conservative.

It is, however, possible to compute exact and non-conservative adjusted p-values by using Monte Carlo testing. The unknown covariance matrix is handled by conditioning on sufficient statistics, and this methodology is called rotation tests. Compared to permutation tests, we replace permutations by proper random rotations. Permutation tests avoid the multinormal assumption, but they are limited to relatively simple models. A rotation test, on the other hand, can be applied to adjust p-values from any general linear model. Instead of controlling the experimentwise (or familywise) error rate, we can make a rotation testing method that controls the false discovery rate. This type of p-value adjustment has become very popular in microarray data analysis. More information and free software can be found at http://www.matforsk.no/ola/rotation.htm.


Campylobacter experiment

Three biological replicates (block variable)

312 FT-IR wavelengths as multiple responses (polysaccharide region, 1200-900 cm⁻¹)

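Design: Temperature x Atmosphere, each combination measured at Days 2, 4 and 7

Temperature  Atmosphere    Note
5C           Aerobic       one day cell marked b)
5C           Microaerobic
5C           Anaerobic
25C          Aerobic       two day cells marked b)
25C          Microaerobic
25C          Anaerobic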

[Figure: plot with horizontal axis 0 to 350 and vertical axis -0.04 to 0.08]


Analysis with MINITAB - first wavelength

Analysis of Variance for 1200.6, using Adjusted SS for Tests

Source      DF  Seq SS     Adj SS     Adj MS     F      P
BioRep       2  0.0001545  0.0001407  0.0000703   7.00  0.003
Atmos        2  0.0000639  0.0000698  0.0000349   3.47  0.044
Day          2  0.0001147  0.0000730  0.0000365   3.63  0.038
Temp         1  0.0002607  0.0001881  0.0001881  18.71  0.000
Atmos*Temp   2  0.0000002  0.0000007  0.0000004   0.03  0.966
Day*Temp     2  0.0000082  0.0000082  0.0000041   0.41  0.669
Error       31  0.0003117  0.0003117  0.0000101
Total       42  0.0009139

Least Squares Means for 1200.6

Day  Mean      SE Mean
2    0.005694  0.000774
4    0.003469  0.000915
7    0.002255  0.001069

Because of missing values, Atmos*Day could not be handled by MINITAB


ANALYSIS with 50-50 MANOVA

--- 50-50 MANOVA ver. 1.71 --- 43 objects -- 1 responses:
Source      DF    exVarSS   nPC  nBu  exVarPC  exVarBU  p-Value
BioRep      2     0.155716  1    0    1.000    1.000    0.003046
Atmos       2     0.075998  1    0    1.000    1.000    0.043882
Day         2     0.075725  1    0    1.000    1.000    0.044326
Temp        1     0.298146  1    0    1.000    1.000    0.000014
Atmos*Day   3(4)  0.037330  1    0    1.000    1.000    0.347352
Atmos*Temp  2     0.004477  1    0    1.000    1.000    0.814765
Day*Temp    2     0.000229  1    0    1.000    1.000    0.989520
Error       28    0.303750
- STANDARDIZATION ON ------------

--- 50-50 MANOVA ver. 1.71 --- 43 objects -- 312 responses:
Source      DF    exVarSS   nPC  nBu  exVarPC  exVarBU  p-Value
BioRep      2     0.134434  2    11   0.842    1.000    0.000000
Atmos       2     0.062564  2    11   0.806    1.000    0.000496
Day         2     0.110177  2    11   0.812    1.000    0.000000
Temp        1     0.090016  2    11   0.809    1.000    0.000000
Atmos*Day   3(4)  0.018259  2    11   0.811    1.000    0.929734
Atmos*Temp  2     0.028598  2    11   0.810    1.000    0.102187
Day*Temp    2     0.027059  2    11   0.800    1.000    0.348228
Error       28    0.428493
- STANDARDIZATION ON ------------


Adjusted means as curves

[Figure: adjusted mean curves for Day 2, 4 and 7; horizontal axis 50 to 300, vertical axis -0.02 to 0.06]


Effect of Day - 312 single response p-values

Ordinary significance tests are no longer suitable
  A lot of significant results are caused by random variation (since there are several tests/responses)

The p-values need to be adjusted
  So that they are interpretable


Adjusted p-values

So that the experimentwise (or familywise) error rate is controlled

Bonferroni correction (classical method)
  pAdj = #responses × pRaw
  Conservative upper bound (in practice often too conservative)
  Dependence among responses not investigated

Modern methods
  Make active use of dependence among responses
  Permutation tests
  Rotation tests
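A minimal sketch of the Bonferroni correction, to show how conservative it can be on this data set. The function name and the cap at 1 are assumptions added for illustration (the slide only gives the product); the two raw p-values are taken from the wavelength tables in this talk, with 312 responses.

```python
def bonferroni_adjust(raw_p_values, n_responses):
    """pAdj = #responses * pRaw, capped at 1 (the cap is an added convention)."""
    return [min(1.0, n_responses * p) for p in raw_p_values]


# Raw p-values for the most significant wavelength (1027.99)
# and for the first wavelength (1200.6), as listed in the talk's tables.
raw = [0.000060, 0.044326]
adjusted = bonferroni_adjust(raw, n_responses=312)

for p_raw, p_adj in zip(raw, adjusted):
    print(f"pRaw = {p_raw:.6f}  ->  Bonferroni pAdj = {p_adj:.6f}")
```

For the most significant wavelength the Bonferroni value is about 0.0187, while the FWE-adjusted value reported in the table for the same wavelength is 0.001340.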


Assume a regression model (simplified model without constant term)

$$ Y_{n \times q} = X_{n \times r} B_{r \times q} + E_{n \times q}, \qquad H_0: B = 0 $$

Separate F-tests for each response
  Random variables: $F_1, F_2, \ldots, F_q$
  Observed values: $f_1, f_2, \ldots, f_q$

Maximal F-value (= minimum p-value) obtained for response number k

Raw p-value:

$$ p_k = P(F_k \ge f_k) $$

Adjusted p-value:

$$ \tilde{p}_k = P\left( \max_{1 \le i \le q} (F_i) \ge f_k \right) $$
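The link between this adjusted p-value and the Bonferroni bound can be sketched with a union bound. This is an illustration under an assumption not stated on the slide: all q F-statistics share the same null distribution, as they do when every response is tested against the same design matrix, so that $P(F_i \ge f_k) = p_k$ for every i. Writing $\tilde{p}_k$ for the adjusted p-value:

$$ \tilde{p}_k = P\left( \max_{1 \le i \le q} F_i \ge f_k \right) = P\left( \bigcup_{i=1}^{q} \{ F_i \ge f_k \} \right) \le \sum_{i=1}^{q} P(F_i \ge f_k) = q \, p_k $$

When the responses are strongly correlated, the events overlap heavily and the union is much smaller than the sum, which is where methods that use the dependence gain over Bonferroni.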


Adjusting the minimum p-value by permutations

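Observed maximal F-statistic:

$$ f_{\max} = \max_{1 \le i \le q} (f_i) = G(X, Y) $$

For m = 1, 2, ..., M
  permute data ($Y \to P^{(m)} Y$)
  compute maximal F-statistic from these data

$$ f^{(m)}_{\max} = G(X, P^{(m)} Y) $$

Compute p-value as

$$ \frac{\#\left( f^{(m)}_{\max} \ge f_{\max} \right) + 1}{M + 1} $$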
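The permutation procedure above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the 50-50 MANOVA implementation: the function names, the simulated data and the choice M = 199 are assumptions, and the Monte Carlo p-value uses the common (count + 1) / (M + 1) convention. The function `max_f` plays the role of G(X, Y) for the model without constant term.

```python
import numpy as np


def max_f(X, Y):
    """Largest single-response F-statistic for H0: B = 0 in Y = XB + E."""
    n, r = X.shape
    Q, _ = np.linalg.qr(X)                      # orthonormal basis of col(X)
    ss_model = np.sum((Q.T @ Y) ** 2, axis=0)   # explained sum of squares
    ss_error = np.sum(Y ** 2, axis=0) - ss_model
    return np.max((ss_model / r) / (ss_error / (n - r)))


def permutation_min_p(X, Y, M=199, seed=0):
    """Adjusted p-value for the most significant response, by permutations."""
    rng = np.random.default_rng(seed)
    f_obs = max_f(X, Y)
    count = 0
    for _ in range(M):
        Y_perm = Y[rng.permutation(Y.shape[0])]  # P^(m) Y: shuffle rows of Y
        if max_f(X, Y_perm) >= f_obs:
            count += 1
    return (count + 1) / (M + 1)


# Simulated example: 20 observations, 5 responses, one regressor.
rng = np.random.default_rng(42)
n, q = 20, 5
X = rng.normal(size=(n, 1))
Y = rng.normal(size=(n, q))
Y[:, 0] += 3.0 * X[:, 0]        # only the first response depends on X

p_adjusted = permutation_min_p(X, Y, M=199)
print("adjusted minimum p-value:", p_adjusted)
```

Whole rows of Y are permuted together, so the dependence among the responses is carried into every permuted data set.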


How is dependence handled?

Estimate of covariance matrix under H0:

$$ \frac{1}{n} Y^T Y $$

Estimate based on permuted data:

$$ \frac{1}{n} Y^T P^{(m)T} P^{(m)} Y = \frac{1}{n} Y^T Y $$

The permutation test is a conditional test
  Conditioned on the covariance matrix estimate
  Conditioned on sufficient statistics for the unknown parameters

Fisher's exact test for 2×2 contingency tables is the most famous conditional test
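A quick numerical check of this invariance, as a sketch with made-up data (the sizes and seed are arbitrary assumptions): a permutation matrix P satisfies P^T P = I, so the covariance estimate is the same before and after permuting the rows of Y.

```python
import numpy as np

rng = np.random.default_rng(1)
n, q = 8, 3
Y = rng.normal(size=(n, q))

# Permutation matrix: the identity with its rows shuffled.
P = np.eye(n)[rng.permutation(n)]

cov_observed = Y.T @ Y / n
cov_permuted = (P @ Y).T @ (P @ Y) / n

unchanged = np.allclose(cov_observed, cov_permuted)
print("covariance estimate unchanged by permutation:", unchanged)
```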


Conditional test under multivariate normality?

Need distribution of Y conditioned on $Y^T Y$

Answer
  Y is distributed as $R \, Y_{obs}$
  where $Y_{obs}$ is the observed matrix and R is a uniformly distributed orthogonal matrix (random rotation matrix)

Relation to well-known tests
  t-test, F-tests, Hotelling's T² and Wilks' Λ are special cases of rotation testing
  But the null distributions of these test statistics do not depend on $Y^T Y$
    Conditioning not needed
    Simulations not necessary
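One standard way to simulate a uniformly distributed orthogonal matrix is to take the QR decomposition of a matrix of independent standard normal entries and fix the signs of the columns. The sketch below uses that construction as an assumption; the talk does not say how its software generates rotations. It also checks that rotating the observed matrix leaves Y^T Y unchanged, which is what conditioning on Y^T Y requires.

```python
import numpy as np


def random_rotation(n, rng):
    """Uniformly distributed n x n orthogonal matrix via QR of a Gaussian matrix."""
    A = rng.normal(size=(n, n))
    Q, R = np.linalg.qr(A)
    # Flip column signs so the diagonal of the triangular factor is positive;
    # without this step the distribution of Q is not uniform.
    return Q * np.sign(np.diag(R))


rng = np.random.default_rng(7)
n, q = 10, 4
Y_obs = rng.normal(size=(n, q))

R = random_rotation(n, rng)
Y_rot = R @ Y_obs

print("R is orthogonal:   ", np.allclose(R.T @ R, np.eye(n)))
print("Y^T Y is preserved:", np.allclose(Y_rot.T @ Y_rot, Y_obs.T @ Y_obs))
```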


Adjusting the minimum p-value by rotations

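Observed maximal F-statistic:

$$ f_{\max} = \max_{1 \le i \le q} (f_i) = G(X, Y) $$

For m = 1, 2, ..., M
  simulate rotated data ($Y \to R^{(m)} Y$), where $R^{(m)}$ is a simulated random rotation matrix
  compute maximal F-statistic from these data

$$ f^{(m)}_{\max} = G(X, R^{(m)} Y) $$

Compute p-value as

$$ \frac{\#\left( f^{(m)}_{\max} \ge f_{\max} \right) + 1}{M + 1} $$

In practice: a much more efficient algorithm is applied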
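A direct, unoptimised sketch of the rotation procedure, for the model without constant term. It is not the efficient algorithm mentioned on the slide: it builds a full n x n rotation matrix in every iteration. The function names, the simulated data, M = 199 and the (count + 1) / (M + 1) convention are all assumptions made for illustration.

```python
import numpy as np


def max_f(X, Y):
    """Largest single-response F-statistic for H0: B = 0 in Y = XB + E."""
    n, r = X.shape
    Q, _ = np.linalg.qr(X)
    ss_model = np.sum((Q.T @ Y) ** 2, axis=0)
    ss_error = np.sum(Y ** 2, axis=0) - ss_model
    return np.max((ss_model / r) / (ss_error / (n - r)))


def random_rotation(n, rng):
    """Uniformly distributed orthogonal matrix (QR of a Gaussian matrix)."""
    Q, R = np.linalg.qr(rng.normal(size=(n, n)))
    return Q * np.sign(np.diag(R))


def rotation_min_p(X, Y, M=199, seed=0):
    """Adjusted p-value for the most significant response, by rotations."""
    rng = np.random.default_rng(seed)
    f_obs = max_f(X, Y)
    count = 0
    for _ in range(M):
        Y_rot = random_rotation(Y.shape[0], rng) @ Y   # R^(m) Y
        if max_f(X, Y_rot) >= f_obs:
            count += 1
    return (count + 1) / (M + 1)


# Simulated example: 20 observations, 5 responses, one regressor.
rng = np.random.default_rng(42)
n, q = 20, 5
X = rng.normal(size=(n, 1))
Y = rng.normal(size=(n, q))
Y[:, 0] += 3.0 * X[:, 0]        # only the first response depends on X

p_adjusted = rotation_min_p(X, Y, M=199)
print("adjusted minimum p-value:", p_adjusted)
```

The only change from a permutation version is the line that generates the simulated data: a random rotation takes the place of a random permutation.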


Adjusting the other p-values (permutations or rotations)

Remove response with minimum p-value
Adjust minimum p-value in new data set
and so on
Enforce monotonicity

All calculations can be done simultaneously
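The step-down scheme can be sketched as follows, assuming the F-statistics of all q responses have already been computed for the observed data (`f_obs`) and for M permuted or rotated data sets (`f_sim`, one row per simulation). These names, the toy numbers and the (count + 1) / (M + 1) convention are illustrative assumptions. Ranking by F-value is used in place of ranking by p-value, which is equivalent only when all responses share the same degrees of freedom.

```python
import numpy as np


def step_down_adjust(f_obs, f_sim):
    """Step-down adjusted p-values from observed and simulated F-statistics."""
    f_obs = np.asarray(f_obs, dtype=float)
    f_sim = np.asarray(f_sim, dtype=float)
    M = f_sim.shape[0]

    order = np.argsort(-f_obs)           # most significant response first
    sim_sorted = f_sim[:, order]

    # Column j holds, per simulation, the maximum over the responses that
    # remain after the j most significant ones have been removed.
    remaining_max = np.maximum.accumulate(sim_sorted[:, ::-1], axis=1)[:, ::-1]

    counts = np.sum(remaining_max >= f_obs[order], axis=0)
    p_sorted = (counts + 1) / (M + 1)
    p_sorted = np.maximum.accumulate(p_sorted)   # enforce monotonicity

    p_adj = np.empty_like(p_sorted)
    p_adj[order] = p_sorted              # back to the original response order
    return p_adj


# Toy numbers: 3 responses, M = 3 simulated data sets.
f_obs = [5.0, 1.0, 3.0]
f_sim = [[1.0, 1.0, 1.0],
         [4.0, 2.0, 6.0],
         [0.5, 0.5, 0.5]]

p_adj = step_down_adjust(f_obs, f_sim)
print(p_adj)
```

Because the running maxima for every step are taken from the same simulated data sets, one pass over the simulations serves all q adjusted p-values at once.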


Permutation test or rotation test

Exact permutation testing
  The only assumption: independent observations
  Useless for few observations
  Useless for complex ANOVA and regression models

Exact rotation testing
  Assumes multivariate normality
  Does not need as many observations as permutation testing
  Can be used for complex ANOVA and regression models

F-test rotation test


Adjusted p-values (FWE)

Non-adjusted p-values (RAW)
  False significance at 1% level is expected in 1% of all the investigated responses
  If you have 5000 responses ...

"Classically" adjusted p-values (FWE)
  False significance at 1% level is expected in not more than 1% of all experiments where the method is applied.
  The experimentwise (or familywise) error rate is controlled


False Discovery Rate (FDR)

Adjusted p-values according to False Discovery Rate
  False significance at 1% level is expected in 1% of all cases (responses) reported as significant at 1% level.
  If you have 5000 responses and 200 are reported as significant at 1% level, one will expect two of these to be false.

"q-values" has been proposed as a name instead of "adjusted p-values"


Calculation of FDR adjusted p-values

Several methods exist

Most of them do not handle the dependence among the responses
  but OK if the "weak dependence requirement" is met

New variant based on rotations (or alternatively permutations)
  handles any kind of dependence
  conservative compared to other methods, since the method does not involve an estimate of the number of responses with true null hypotheses
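For orientation, here is a sketch of one widely used FDR method that works on the raw p-values alone: the Benjamini-Hochberg step-up procedure. It is not the rotation-based variant described in this talk and it does not model the dependence among responses. The function name and the four example p-values are made up for illustration.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg FDR-adjusted p-values, in the original order."""
    q = len(p_values)
    order = sorted(range(q), key=lambda i: p_values[i])  # ascending p-values
    adjusted = [0.0] * q
    running_min = 1.0
    # Walk from the largest p-value down, so that each adjusted value is the
    # minimum of p * q / rank over its own rank and all higher ranks.
    for rank in range(q, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * q / rank)
        adjusted[i] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.005]
fdr_adjusted = benjamini_hochberg(raw)

for p_raw, p_fdr in zip(raw, fdr_adjusted):
    print(f"pRaw = {p_raw:.3f}  ->  pAdjFDR = {p_fdr:.3f}")
```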


Adjusted p-values (first 30 wavelengths)

rankNr  varNr  varName  pRaw      pAdjFDR   pAdjFWE
151      1     1200.6   0.044326  0.089791  0.308669
153      2     1199.64  0.044794  0.090035  0.310380
152      3     1198.67  0.044399  0.089791  0.308669
149      4     1197.71  0.043170  0.088950  0.303313
148      5     1196.74  0.041301  0.085699  0.294036
145      6     1195.78  0.038873  0.082265  0.282597
143      7     1194.82  0.036089  0.077248  0.268276
140      8     1193.85  0.033120  0.072581  0.252768
137      9     1192.89  0.030078  0.067314  0.236018
135     10     1191.92  0.027126  0.061613  0.218551
132     11     1190.96  0.024367  0.056594  0.202177
130     12     1189.99  0.021827  0.051500  0.186164
125     13     1189.03  0.019559  0.047599  0.173055
124     14     1188.07  0.017581  0.043434  0.159513
120     15     1187.1   0.015869  0.040234  0.147983
117     16     1186.14  0.014414  0.037658  0.137538
115     17     1185.17  0.013182  0.034951  0.128280
113     18     1184.21  0.012153  0.032861  0.120358
110     19     1183.24  0.011315  0.031373  0.113914
108     20     1182.28  0.010663  0.030080  0.108626
104     21     1181.32  0.010197  0.029590  0.105095
102     22     1180.35  0.009898  0.029204  0.102575
101     23     1179.39  0.009792  0.029204  0.101727
103     24     1178.42  0.009918  0.029204  0.102743
106     25     1177.46  0.010350  0.029702  0.106292
109     26     1176.49  0.011244  0.031373  0.113354
114     27     1175.53  0.012874  0.034500  0.125852
119     28     1174.57  0.015757  0.040234  0.147242
129     29     1173.6   0.021030  0.049988  0.181178
138     30     1172.64  0.031110  0.069138  0.241854


Adjusted p-values (30 most significant wavelengths)

rankNr  varNr  varName  pRaw      pAdjFDR   pAdjFWE
 1      180    1027.99  0.000060  0.001281  0.001340
 2      181    1027.03  0.000062  0.001281  0.001364
 3      179    1028.95  0.000063  0.001281  0.001385
 4      182    1026.06  0.000067  0.001281  0.001471
 5      178    1029.92  0.000070  0.001281  0.001509
 6       49    1154.32  0.000070  0.001281  0.001516
 7       48    1155.28  0.000070  0.001281  0.001526
 8       50    1153.35  0.000075  0.001281  0.001622
 9       47    1156.24  0.000077  0.001281  0.001656
10      183    1025.1   0.000079  0.001281  0.001690
11      177    1030.88  0.000081  0.001281  0.001726
12       51    1152.39  0.000087  0.001323  0.001826
13       46    1157.21  0.000092  0.001351  0.001912
14      176    1031.85  0.000098  0.001352  0.002009
15      184    1024.13  0.000098  0.001352  0.002038
16       52    1151.42  0.000108  0.001433  0.002218
17      175    1032.81  0.000121  0.001527  0.002454
18       45    1158.17  0.000123  0.001527  0.002490
19      185    1023.17  0.000130  0.001562  0.002607
20       53    1150.46  0.000146  0.001701  0.002887
21      174    1033.78  0.000153  0.001728  0.003001
22      186    1022.2   0.000180  0.001968  0.003465
23       44    1159.14  0.000186  0.001983  0.003574
24      173    1034.74  0.000195  0.002025  0.003705
25       54    1149.49  0.000213  0.002143  0.003998
26      172    1035.7   0.000247  0.002392  0.004533
27      187    1021.24  0.000259  0.002451  0.004749
28      171    1036.67  0.000310  0.002848  0.005586
29       43    1160.1   0.000325  0.002923  0.005855
30       55    1148.53  0.000334  0.002940  0.006000


p-values

[Figure: p-value curves labelled RAW, FWE, FDR and Bonfer; horizontal axis 0 to 350, vertical axis 0 to 1]


Rotation Tests - Conclusion

Simulation principle for computing exact Monte Carlo p-value for any test statistic.
  Based on multivariate normal distribution.
  Generalisation of classical tests.
  Related to permutation testing.

Useful for computing adjusted p-values (F-tests)
  FWE, FDR
  General linear models (ANOVA and regression)

Implemented in the 50-50 MANOVA program (www.matforsk.no/ola)