Applying False Discovery Rate (FDR) Control in Detecting Future Climate Changes
ZongBo Shang
SIParCS Program, IMAGe, NCAR
August 4, 2009
North American Regional Climate Change Assessment Program (NARCCAP)
Predicted Changes in Future Winter Temperature (°C)
[Figure: CRCM+CGCM3 Changes in Winter Temperature; axes: Longitude, Latitude; color scale from -2 to 8 °C]
Note: This figure shows the difference between the mean future (2040–2069) winter temperature and the mean current (1970–1999) winter temperature.
Can We Trust What We See?
Note: These two figures show the means of 10 replicate random fields generated from the same Matérn semi-variogram model, but with different random seeds.
[Figures: two panels on a 50 × 50 grid; axes: x, y; color scale from -2 to 2]
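The point of the two panels can be reproduced in a small sketch. Everything here is an assumption made for illustration: an exponential covariance (the Matérn model with smoothness 0.5), a range of 3 grid units, σ = 1, and an 8 × 8 grid instead of the 50 × 50 grid on the slide, so that plain Python is fast enough.

```python
import math
import random

def exponential_cov(points, sigma, range_):
    """Covariance sigma^2 * exp(-d / range_), i.e. Matern with smoothness 0.5."""
    return [[sigma ** 2 * math.exp(-math.dist(p, q) / range_) for q in points]
            for p in points]

def cholesky(a):
    """Lower-triangular L with L L^T = a (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def simulate_field(low, rng):
    """One correlated field: L times a vector of independent standard normals."""
    noise = [rng.gauss(0.0, 1.0) for _ in range(len(low))]
    return [sum(low[i][k] * noise[k] for k in range(i + 1)) for i in range(len(low))]

def mean_of_replicates(low, n_rep, rng):
    """Pointwise mean of n_rep independent replicate fields."""
    fields = [simulate_field(low, rng) for _ in range(n_rep)]
    return [sum(values) / n_rep for values in zip(*fields)]

side = 8  # hypothetical small grid; the slides use 50 x 50
points = [(x, y) for x in range(side) for y in range(side)]
cov = exponential_cov(points, sigma=1.0, range_=3.0)
low = cholesky(cov)

# Same model, different seeds: both means are pure noise, yet they differ.
mean_a = mean_of_replicates(low, 10, random.Random(1))
mean_b = mean_of_replicates(low, 10, random.Random(2))
max_gap = max(abs(a - b) for a, b in zip(mean_a, mean_b))
print(f"largest pointwise gap between the two replicate means: {max_gap:.2f}")
```

Both means come from the same model, so any apparent structure in their difference is spatially correlated noise rather than signal.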
What’s the Problem with Pointwise Two-sample t Tests?
[Figures: two panels on a 50 × 50 grid; axes: x, y. Left: "Two sample t statistic" (color scale from -4 to 4). Right: "Pointwise p-value" (color scale from 0.00 to 0.10)]
$H_0: \mu_1 = \mu_2$
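A rough count shows why uncorrected pointwise tests mislead. The sketch below assumes every null hypothesis is true and that the 50 × 50 tests are independent, so each p-value is uniform on [0, 1]; real fields are spatially correlated, which tends to make the false rejections appear in clumps rather than changing their expected number. The seed and the level 0.05 are illustrative choices.

```python
import random

def count_false_positives(n_tests, alpha, rng):
    """Count p-values below alpha when all nulls are true (uniform p-values)."""
    return sum(1 for _ in range(n_tests) if rng.random() < alpha)

n_tests = 50 * 50  # one test per grid cell
alpha = 0.05       # illustrative pointwise significance level
rng = random.Random(2009)

hits = count_false_positives(n_tests, alpha, rng)
expected = n_tests * alpha
print(f"expected about {expected:.0f} false rejections, simulated {hits}")
```

With 2500 tests at level 0.05, roughly 125 cells are expected to look significant even though nothing has changed.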
False Discovery Rate (FDR) Control
• FDR control bounds the expected proportion of incorrectly rejected null hypotheses (type I errors) among all rejected null hypotheses.
• Less conservative than Bonferroni procedures, with greater power than Familywise Error Rate (FWER) control, at a cost of increasing the likelihood of obtaining type I errors.
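As an illustration of the comparison above, here is a minimal sketch of the standard (unweighted) Benjamini-Hochberg step-up rule next to a Bonferroni correction. The p-values are hypothetical, and the slides do not state which implementation was used.

```python
def benjamini_hochberg(p_values, q):
    """Return booleans: True where the BH step-up rule rejects at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0  # largest rank whose sorted p-value is below its threshold
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k = rank
    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return rejected

def bonferroni(p_values, alpha):
    """Return booleans: True where p <= alpha / m (FWER control)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]

p_values = [0.001, 0.008, 0.012, 0.041, 0.20, 0.74]  # hypothetical
bh = benjamini_hochberg(p_values, q=0.05)
bonf = bonferroni(p_values, alpha=0.05)
print("BH rejections:", sum(bh))
print("Bonferroni rejections:", sum(bonf))
```

On these six p-values BH rejects three hypotheses and Bonferroni rejects two, which is the power gain the bullet describes.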
Applications of FDR in Genes Expression and Microarray
Applications of FDR in Functional Magnetic Resonance Imaging
Definition of False Discovery Rate
                           Declared non-significant   Declared significant   Total
                           (fail to reject)           (reject)
True null hypotheses       U                          V                      m₀
Non-true null hypotheses   T                          S                      m − m₀
Total                      m − R                      R                      m
Let Q = V / (V + S) denote the proportion of rejected null hypotheses that were rejected in error, with Q = 0 when V + S = 0. Note that Q is an unobservable random variable. Define the FDR as the expectation of Q:
$$Q_e = E[Q] = E\left[\frac{V}{V+S}\right] = E\left[\frac{V}{R}\right]$$
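In a simulation the truth is known, so Q can be computed directly for each run; averaging it over runs estimates the FDR. A minimal sketch, with hypothetical truth labels and decisions:

```python
def false_discovery_proportion(is_true_null, rejected):
    """Q = V / R, with Q = 0 when nothing is rejected."""
    v = sum(1 for null, rej in zip(is_true_null, rejected) if null and rej)
    r = sum(1 for rej in rejected if rej)
    return v / r if r > 0 else 0.0

# Hypothetical example: three true nulls, three non-true nulls.
is_true_null = [True, True, True, False, False, False]
rejected = [True, False, False, True, True, False]

q_obs = false_discovery_proportion(is_true_null, rejected)
print(f"V = 1, S = 2, R = 3, Q = {q_obs:.3f}")
```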
False Discovery Rates for Spatial Signals
• Testing on clusters rather than individual locations
• Procedure 1: Weighted Benjamini & Hochberg (BH) procedure
• Procedure 2: Weighted two-stage procedure
• Procedure 3: Hierarchical testing procedure
– Testing stage: control FDR on clusters
– Trimming stage: control FDR on selected points
Reference: Benjamini, Y. and Heller, R. 2007. False discovery rates for spatial signals. Journal of the American Statistical Association. 102:1272-1281.
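The slides do not spell out Procedure 1, so the following is only a sketch of one common weighted step-up rule, in the style of Benjamini and Hochberg's weighted procedure: each cluster's weight is its size, and the threshold for the i-th smallest p-value uses the cumulative weight up to rank i. The cluster p-values and sizes are hypothetical.

```python
def weighted_bh(p_values, weights, q):
    """Weighted step-up rule: reject the k smallest p-values, where k is the
    largest rank with p_(k) <= q * (cumulative weight up to k) / (total weight)."""
    total = float(sum(weights))
    order = sorted(range(len(p_values)), key=lambda i: p_values[i])
    cumulative = 0.0
    k = 0
    for rank, idx in enumerate(order, start=1):
        cumulative += weights[idx]
        if p_values[idx] <= q * cumulative / total:
            k = rank
    rejected = [False] * len(p_values)
    for idx in order[:k]:
        rejected[idx] = True
    return rejected

cluster_p = [0.004, 0.03, 0.02, 0.6]   # hypothetical cluster p-values
cluster_size = [400, 100, 900, 1100]   # hypothetical numbers of grid cells

by_size = weighted_bh(cluster_p, cluster_size, q=0.05)
equal = weighted_bh(cluster_p, [1, 1, 1, 1], q=0.05)
print("size-weighted:", by_size)
print("equal weights:", equal)
```

With equal weights the rule reduces to the ordinary BH procedure. In this example the small cluster with p = 0.03 is rejected under equal weights but not under size weighting, because it adds little cumulative weight.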
Simulation Studies
• 1. Random Fields
• 2. Random Field Block
• 3. Random Field Gradient
[Figures: six panels on a 50 × 50 grid, titled "10 Replicates Average for Setting I" or "10 Replicates Average for Setting II"; axes: x, y; color scales from -1.0 to 1.0 or from -10 to 10]
Simulation Study I: Two Random Fields
Note: These two figures show the means of 10 replicate random fields generated from the same Matérn semi-variogram model, but with different random seeds.
[Figures: two panels on a 50 × 50 grid; axes: x, y; color scale from -2 to 2]
Pre-defined Clusters
[Figure: map of the pre-defined clusters; axis ticks from 10 to 40]
Simulation Study I: Pointwise vs. False Discovery Rate Control
[Figures: two panels on a 50 × 50 grid; axes: x, y. Left: "Pointwise p-value". Right: "Rejection at q-value". Color scale from 0.00 to 0.10]
9 Repeats on Simulation Study I
[Figures: nine panels of rejection maps on a 50 × 50 grid; axes: x, y; color scale from 0.00 to 0.10]
Simulation Study II: Pre-defined Block Trend
[Figures: surface plot of the trend (axes: x, y, trend) and map titled "Trend" on a 50 × 50 grid; color scale from -10 to 10; block labels: 4, -10, 10, -4, 2, -2]
Simulation Study II: Average of 10 Replicates
[Figures: two panels on a 50 × 50 grid; axes: x, y; color scale from -10 to 10; block labels: 4, -10, 10, -4, 2, -2]
"Mean of 10 Replicates from Setting I": Random Field (Matérn, σ = 0.4)
"Mean of 10 Replicates from Setting II": Random Field (Matérn, σ = 0.4) + Block Trends
Simulation Study II: Pointwise vs. False Discovery Rate Control
[Figures: two panels on a 50 × 50 grid; axes: x, y. Left: "Pointwise p-value". Right: "Rejection at q-value". Color scale from 0.00 to 0.10]
9 Repeats on Simulation Study II
[Figures: "Trend" map (axes: x, y; color scale from -10 to 10) followed by nine panels of rejection maps on a 50 × 50 grid; color scale from 0.00 to 0.10]
Simulation Study III: Pre-defined Gradient Trend
[Figures: map titled "Trend" on a 50 × 50 grid (axes: x, y; color scale from -10 to 10) and surface plot of the trend (axes: x, y, trend)]
Simulation Study III: Average of 10 Replicates
[Figures: two panels on a 50 × 50 grid; axes: x, y; color scale from -10 to 10]
"10 Replicates Average for Setting I": Random Field (Matérn, σ = 2)
"10 Replicates Average for Setting II": Random Field (Matérn, σ = 2) + Gradient Trends
Simulation Study III: Pointwise vs. False Discovery Rate Control
[Figures: two panels on a 50 × 50 grid; axes: x, y. Left: "Pointwise p-value". Right: "Rejection at q-value". Color scale from 0.00 to 0.10]
9 Repeats on Simulation Study III
[Figures: "Trend" map (axes: x, y; color scale from -10 to 10) followed by nine panels of rejection maps on a 50 × 50 grid; axes: x, y; color scale from 0.00 to 0.10]
Applying FDR Control for Detecting Future Climate Changes
• Download climate datasets from NARCCAP program
• Calculate seasonal average
• Construct clusters from EPA Eco-regions
• Conduct two-sample t test on temperature/precipitation
• Pointwise p-values and corresponding z scores
• Build semi-variogram model to estimate spatial autocorrelation
• Calculate z score and p-value by cluster
• Reject clusters based on FDR control
http://www.epa.gov/wed/pages/ecoregions/na_eco.htm
GIS: Vector Dataset, Lambert Equal-Area Projection
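The "z score and p-value by cluster" step can be sketched as follows. This is an assumption about the mechanics rather than the exact method used: pointwise z scores inside a cluster are summed and standardized by the standard deviation of that sum, which under unit-variance correlated normals is the square root of the sum of all pairwise correlations. The exponential correlation function, its range, the z scores and the one-sided p-value are hypothetical choices.

```python
import math

def exp_corr(points, range_):
    """Correlation exp(-d / range_) between every pair of locations."""
    return [[math.exp(-math.dist(p, q) / range_) for q in points] for p in points]

def cluster_z(z_scores, corr):
    """Standardized sum of correlated unit-variance z scores."""
    n = len(z_scores)
    var_of_sum = sum(corr[i][j] for i in range(n) for j in range(n))
    return sum(z_scores) / math.sqrt(var_of_sum)

def upper_tail_p(z):
    """One-sided p-value P(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

points = [(0, 0), (0, 1), (1, 0), (1, 1)]  # hypothetical cluster of 4 cells
z_scores = [1.2, 0.9, 1.5, 1.1]            # hypothetical pointwise z scores

identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
z_indep = cluster_z(z_scores, identity)    # ignores spatial correlation
z_corr = cluster_z(z_scores, exp_corr(points, range_=2.0))

print(f"independent: z = {z_indep:.2f}, p = {upper_tail_p(z_indep):.4f}")
print(f"correlated:  z = {z_corr:.2f}, p = {upper_tail_p(z_corr):.4f}")
```

Ignoring positive spatial correlation overstates the cluster z score, which is why the semi-variogram step precedes the cluster test in the list above.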
• 61 regions rejected at q = 0.25 level
• 56 regions rejected at q = 0.1 level
• 54 regions rejected at q = 0.05 level
• 51 regions rejected at q = 0.01 level
H0: Future Winter Temperature Increases by 3 °C
[Figure: sorted cluster p-values (y axis: p-value, 0.0 to 0.5; x axis: 0 to 150) with rejection lines at q = 0.25, q = 0.1, q = 0.05 and q = 0.01]
[Figure: "Rejection at q-value" map; axes: Longitude, Latitude; color scale from 0.00 to 0.25]
[Figure: "CRCM+CGCM3 Changes in Winter Temperature" map; axes: Longitude, Latitude; color scale from -5 to 5]
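A null hypothesis of the form "future winter temperature increases by δ °C" shifts the usual two-sample statistic by δ. The sketch below uses the Welch (unequal variance) form, which is an assumption since the slides only say "two-sample t test"; the temperature series are made up, and converting the statistic to a p-value would need a t distribution function, which is left out here.

```python
import math
import statistics

def shifted_welch_t(future, current, delta):
    """Welch t statistic for H0: mean(future) - mean(current) = delta."""
    se = math.sqrt(statistics.variance(future) / len(future)
                   + statistics.variance(current) / len(current))
    diff = statistics.fmean(future) - statistics.fmean(current)
    return (diff - delta) / se

# Hypothetical winter means (°C) at one grid cell, six years per period.
current = [-5.1, -4.2, -6.0, -4.8, -5.5, -4.9]
future = [-1.0, -0.2, -1.9, -0.6, -1.4, -0.8]

for delta in (0.0, 3.0, 6.0):
    t_stat = shifted_welch_t(future, current, delta)
    print(f"H0: increase of {delta:.0f} °C -> t = {t_stat:.2f}")
```

The statistic falls as δ grows and changes sign once δ exceeds the observed warming, which is why the rejected regions differ from one hypothesized increase to the next.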
FDR Tests on Winter Temperature
[Figures: six "Rejection at q-value" maps; axes: Longitude, Latitude; color scale from 0.00 to 0.25. One panel per null hypothesis:]
• H0: Winter Temperature ↑ 1 °C
• H0: Winter Temperature ↑ 2 °C
• H0: Winter Temperature ↑ 3 °C
• H0: Winter Temperature ↑ 4 °C
• H0: Winter Temperature ↑ 5 °C
• H0: Winter Temperature ↑ 6 °C
FDR Tests on Winter Precipitation
[Figures: eight "Rejection at q-value" maps; axes: Longitude, Latitude; color scale from 0.00 to 0.25. One panel per null hypothesis:]
• H0: Winter Prec ↓ 20 kg/m²
• H0: Winter Prec ↓ 10 kg/m²
• H0: Winter Prec ↑ 10 kg/m²
• H0: Winter Prec ↑ 20 kg/m²
• H0: Winter Prec ↑ 30 kg/m²
• H0: Winter Prec ↑ 50 kg/m²
• H0: Winter Prec ↑ 75 kg/m²
• H0: Winter Prec ↑ 100 kg/m²
Acknowledgement
• Dr. Steve Sain, IMAGe, NCAR
• Drs. Douglas Nychka, Tim Hoar, IMAGe, NCAR
• Dr. Armin Schwartzman, Harvard University
• University of Wyoming
• SIParCS, IMAGe, NCAR
• NARCCAP