Applying False Discovery Rate (FDR) Control in Detecting Future Climate Changes ZongBo Shang SIParCS Program, IMAGe, NCAR August 4, 2009

North American Regional Climate Change Assessment Program (NARCCAP): Predicted Changes in Future Winter Temperature (°C)

[Figure: "CRCM+CGCM3 Changes in Winter Temperature"; axes: Longitude, Latitude; color scale from -2 to 8 °C]
Note: This figure shows the difference between the mean future (2040–2069) winter temperature and the mean current (1970–1999) winter temperature.

Can We Trust What We See?

Note: The two figures each show the mean of 10 replicate random fields generated from the same Matérn semi-variogram model, but with different random seeds.

[Figure: two panels of simulated random-field means; axes: x, y; color scale from -2 to 2]

What’s the Problem with Pointwise Two-sample t Tests?

[Figure: two panels, "Two sample t statistic" (color scale from -4 to 4) and "Pointwise p-value" (color scale from 0.00 to 0.10); axes: x, y]

$H_0: \mu_1 = \mu_2$
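A minimal sketch of why unadjusted pointwise tests are a problem. It assumes a 50 × 50 grid (2500 tests), a pointwise level of 0.05, and independent tests with the null hypothesis true at every location, so each p-value is Uniform(0, 1). These values are illustrative assumptions, not taken from the slides, and the independence assumption ignores the spatial correlation present in real fields.

```python
import random

def count_false_positives(m, alpha, rng):
    """Count p-values below alpha when every null hypothesis is true."""
    # Under a true null, a valid p-value is Uniform(0, 1).
    return sum(1 for _ in range(m) if rng.random() < alpha)

rng = random.Random(2009)   # arbitrary fixed seed for reproducibility
m = 50 * 50                 # assumed grid size: one test per grid cell
alpha = 0.05                # assumed pointwise significance level

expected = m * alpha        # expected number of false positives
observed = count_false_positives(m, alpha, rng)
print(f"expected false positives: {expected:.0f}, observed in one run: {observed}")
```

Even with no real difference between the two fields, roughly 5% of the grid cells are flagged, which is why a map of pointwise p-values can look convincing when nothing is there.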

False Discovery Rate (FDR) Control

• FDR controls the expected proportion of incorrectly rejected null hypotheses (type I errors) among the rejected null hypotheses.

• Less conservative than Bonferroni procedures, with greater power than Familywise Error Rate (FWER) control, at a cost of increasing the likelihood of obtaining type I errors.
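The contrast with Bonferroni can be sketched with the standard (unweighted) Benjamini and Hochberg step-up rule: sort the p-values, find the largest rank k with p_(k) ≤ (k / m) q, and reject the k smallest. The p-values below are made up for illustration and the function name is hypothetical.

```python
def benjamini_hochberg(p_values, q):
    """Return a list of booleans: True where the BH step-up rule rejects."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    # Reject the hypotheses with the k_max smallest p-values.
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject

# Hypothetical p-values, for illustration only.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
bh = benjamini_hochberg(p, q=0.05)
bonferroni = [pv <= 0.05 / len(p) for pv in p]
print(sum(bh), sum(bonferroni))  # number of rejections under each rule
```

On these ten values BH rejects two hypotheses and Bonferroni rejects one, which illustrates the gain in power at the same nominal level.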

Applications of FDR in Gene Expression and Microarray Studies

Applications of FDR in Functional Magnetic Resonance Imaging

Definition of False Discovery Rate

                           Declared non-significant   Declared significant   Total
                           (fail to reject)           (reject)
True null hypotheses       U                          V                      m₀
Non-true null hypotheses   T                          S                      m − m₀
Total                      m − R                      R                      m

Let Q = V / (V + S) define the proportion of errors committed by falsely rejecting null hypotheses. Notice Q is an unobservable random variable. Define the FDR to be the expectation of Q:

$$Q_e = E[Q] = E[V / (V + S)] = E[V / R]$$
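As a small worked example of the notation, the sketch below tallies U, V, T and S for a made-up set of ten hypotheses and computes Q for that single outcome. The data and function names are hypothetical. Setting Q = 0 when nothing is rejected is the usual convention; the FDR itself is the expectation of Q over repeated experiments, so a single Q is only one realization.

```python
def confusion_counts(is_true_null, rejected):
    """Tally the four cells of the table: U, V, T, S."""
    pairs = list(zip(is_true_null, rejected))
    U = sum(1 for null, rej in pairs if null and not rej)      # true null, not rejected
    V = sum(1 for null, rej in pairs if null and rej)          # true null, rejected (false discovery)
    T = sum(1 for null, rej in pairs if not null and not rej)  # false null, not rejected
    S = sum(1 for null, rej in pairs if not null and rej)      # false null, rejected (true discovery)
    return U, V, T, S

def false_discovery_proportion(V, S):
    """Q = V / (V + S), taken as 0 when there are no rejections."""
    R = V + S
    return V / R if R > 0 else 0.0

# Hypothetical outcome: 6 true nulls followed by 4 false nulls.
is_true_null = [True] * 6 + [False] * 4
rejected = [False, False, False, False, False, True, True, True, True, False]

U, V, T, S = confusion_counts(is_true_null, rejected)
Q = false_discovery_proportion(V, S)
print(U, V, T, S, Q)
```

In practice Q cannot be computed because the truth labels are unknown, which is why the procedures control its expectation instead.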

False Discovery Rates for Spatial Signals

• Testing on clusters rather than individual locations

• Procedure 1: Weighted Benjamini & Hochberg (BH) procedure

• Procedure 2: Weighted two-stage procedure

• Procedure 3: Hierarchical testing procedure

– Testing stage: control FDR on clusters

– Trimming stage: control FDR on selected points

Reference: Benjamini, Y. and Heller, R. 2007. False discovery rates for spatial signals. Journal of the American Statistical Association. 102:1272-1281.
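A rough sketch of a weighted step-up rule on cluster p-values, in the spirit of Procedure 1. It assumes weights proportional to cluster size, rescaled to sum to the number of clusters m, and rejects up to the largest rank k whose ordered p-value satisfies p_(k) ≤ (sum of the first k ordered weights / m) q. The exact weighting used in the talk is not stated on the slides, so the weights, data and function name here are assumptions; consult the cited paper for the precise definition.

```python
def weighted_bh(p_values, weights, q):
    """Weighted step-up rule; returns True where a cluster is rejected."""
    m = len(p_values)
    total = sum(weights)
    w = [wi * m / total for wi in weights]  # rescale so the weights sum to m
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    cum_w = 0.0
    for rank, idx in enumerate(order, start=1):
        cum_w += w[idx]  # cumulative weight of the `rank` smallest p-values
        if p_values[idx] <= cum_w / m * q:
            k_max = rank
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject

# Hypothetical cluster p-values and cluster sizes (number of grid points).
cluster_p = [0.004, 0.03, 0.015, 0.5]
cluster_size = [10, 40, 30, 20]
print(weighted_bh(cluster_p, cluster_size, q=0.05))
```

With equal weights the thresholds reduce to (k / m) q, so the rule coincides with the ordinary BH procedure.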

Simulation Studies

• 1. Random Fields

• 2. Random Field Block

• 3. Random Field Gradient

[Figure: six panels of simulated fields; axes: x, y. Two panels are titled "10 Replicates Average for Setting I" (color scale from -1.0 to 1.0) and "10 Replicates Average for Setting II" (color scale from -10 to 10).]

Simulation Study I: Two Random Fields

Note: The two figures each show the mean of 10 replicate random fields generated from the same Matérn semi-variogram model, but with different random seeds.

[Figure: two panels of simulated random-field means; axes: x, y; color scale from -2 to 2]

Pre-defined Clusters

[Figure: the pre-defined clusters on the simulation grid]

Simulation Study I: Pointwise vs. False Discovery Rate Control

[Figure: two panels, "Pointwise p-value" and "Rejection at q-value"; axes: x, y; color scale from 0.00 to 0.10]

9 Repeats on Simulation Study I

[Figure: nine panels, one per repeat; axes: x, y; color scale from 0.00 to 0.10]

Simulation Study II: Pre-defined Block Trend

[Figure: two views of the pre-defined block trend, each titled "Trend"; axes: x, y; color scale from -10 to 10; blocks labeled 4, -10, 10, -4, 2 and -2]

Simulation Study II: Average of 10 Replicates

[Figure: two panels; axes: x, y; color scale from -10 to 10.
"Mean of 10 Replicates from Setting I": Random Field (Matérn, σ = 0.4).
"Mean of 10 Replicates from Setting II": Random Field (Matérn, σ = 0.4) + Block Trends, with blocks labeled 4, -10, 10, -4, 2 and -2.]

Simulation Study II: Pointwise vs. False Discovery Rate Control

[Figure: two panels, the first titled "Pointwise p-value"; axes: x, y; color scale from 0.00 to 0.10]

9 Repeats on Simulation Study II

[Figure: one panel titled "Trend" (axes: x, y; color scale from -10 to 10), followed by nine panels, one per repeat, with a color scale from 0.00 to 0.10]

Study III: Pre-defined Gradient Trend

[Figure: two views of the pre-defined gradient trend, each titled "Trend"; axes: x, y; color scale from -10 to 10]

Study III: Average of 10 Replicates

[Figure: two panels; axes: x, y; color scale from -10 to 10.
First panel: Random Field (Matérn, σ = 2).
Second panel: Random Field (Matérn, σ = 2) + Gradient Trends.]

Simulation Study III: Pointwise vs. False Discovery Rate Control

[Figure: two panels, the first titled "Pointwise p-value"; axes: x, y; color scale from 0.00 to 0.10]

9 Repeats on Simulation Study III

[Figure: one panel titled "Trend" (axes: x, y; color scale from -10 to 10), followed by nine panels, one per repeat, with axes x, y and a color scale from 0.00 to 0.10]

Applying FDR Control for Detecting Future Climate Changes

• Download climate datasets from NARCCAP program

• Calculate seasonal average

• Construct clusters from EPA Eco-regions

• Conduct two-sample t test on temperature/precipitation

• Pointwise p-values and corresponding z scores

• Build semi-variogram model to estimate spatial autocorrelation

• Calculate z score and p-value by cluster

• Reject clusters based on FDR control
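The cluster-level step in the list above can be sketched as follows, under stated assumptions. Pointwise one-sided p-values are converted to z scores, summed within a cluster, and standardized by the variance of the sum, which under the null is the sum of all pairwise correlations. The exponential correlation function and the `corr_range` parameter are hypothetical stand-ins for whatever the fitted semi-variogram model implies; the coordinates and p-values are made up.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution

def exp_correlation(a, b, corr_range):
    """Assumed exponential correlation between two locations."""
    return math.exp(-math.dist(a, b) / corr_range)

def cluster_z_and_p(points, p_values, corr_range):
    """Combine pointwise one-sided p-values into one cluster z score and p-value."""
    # Pointwise z scores: z_i = Phi^{-1}(1 - p_i); requires 0 < p_i < 1.
    z = [STD_NORMAL.inv_cdf(1.0 - p) for p in p_values]
    # Var(sum of z_i) under the null is the sum of all pairwise correlations.
    var_sum = sum(exp_correlation(a, b, corr_range) for a in points for b in points)
    z_cluster = sum(z) / math.sqrt(var_sum)
    p_cluster = 1.0 - STD_NORMAL.cdf(z_cluster)
    return z_cluster, p_cluster

# Hypothetical cluster of four neighboring grid cells.
points = [(0, 0), (0, 1), (1, 0), (1, 1)]
p_values = [0.04, 0.10, 0.07, 0.20]
z_cluster, p_cluster = cluster_z_and_p(points, p_values, corr_range=2.0)
print(round(z_cluster, 3), round(p_cluster, 4))
```

The resulting cluster p-values are what an FDR procedure would then be applied to. Ignoring the correlation (treating the sum's variance as the number of points) would overstate the cluster z score when neighboring cells are positively correlated.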

http://www.epa.gov/wed/pages/ecoregions/na_eco.htm

GIS: Vector Dataset, Lambert Equal-Area Projection

61 regions rejected at q = 0.25 level
56 regions rejected at q = 0.1 level
54 regions rejected at q = 0.05 level
51 regions rejected at q = 0.01 level

H0: Future Winter Temperature Increases by 3 °C

[Figure: three panels.
(1) Plot of p-values (vertical axis "p-value", 0.0 to 0.5; horizontal axis 0 to 150) with legend entries "Reject at q=0.25", "Reject at q=0.1", "Reject at q=0.05" and "Reject at q=0.01".
(2) Map with axes Longitude, Latitude and a color scale from 0.00 to 0.25.
(3) "CRCM+CGCM3 Changes in Winter Temperature"; axes: Longitude, Latitude; color scale from -5 to 5.]

FDR Tests on Winter Temperature

[Figure: six map panels, one per null hypothesis; axes: Longitude, Latitude; color scale from 0.00 to 0.25. Panel titles:
H0: Winter Temperature ↑ 1 °C
H0: Winter Temperature ↑ 2 °C
H0: Winter Temperature ↑ 3 °C
H0: Winter Temperature ↑ 4 °C
H0: Winter Temperature ↑ 5 °C
H0: Winter Temperature ↑ 6 °C]

FDR Tests on Winter Precipitation

[Figure: eight map panels, one per null hypothesis; axes: Longitude, Latitude; color scale from 0.00 to 0.25. Panel titles:
H0: Winter Prec ↓ 20 kg/m²
H0: ↓ 10 kg/m²
H0: ↑ 10 kg/m²
H0: ↑ 20 kg/m²
H0: Winter Prec ↑ 30 kg/m²
H0: ↑ 50 kg/m²
H0: ↑ 75 kg/m²
H0: ↑ 100 kg/m²]

Acknowledgement

• Dr. Steve Sain, IMAGe, NCAR

• Drs. Douglas Nychka, Tim Hoar, IMAGe, NCAR

• Dr. Armin Schwartzman, Harvard University

• University of Wyoming

• SIParCS, IMAGe, NCAR

• NARCCAP
