Current Monthly Homogenization Approaches Benchmarking their Strengths and Weaknesses Victor Venema

Current Monthly Homogenization Approaches …Compare full homogenisation algorithms Benchmark dataset – Monthly temperature and precipitation networks – Most realistic to date


Page 1: Current Monthly Homogenization Approaches …Compare full homogenisation algorithms Benchmark dataset – Monthly temperature and precipitation networks – Most realistic to date

Current Monthly Homogenization Approaches –

Benchmarking their Strengths and Weaknesses

Victor Venema

Page 2

Content

Global Historical Climatology Network (NOAA-GHCNv3)

– Trend: 0.8°C per century since 1880

– Raw data: 0.6°C per century

Need independent lines of research

1. Statistical homogenization

2. Physical understanding (parallel measurements)

3. Modelling (UHI, radiation screens)

Page 3

Homogenisation: WHY?

Example of PAU-UZEIN temperature

1912 PAU-LESCAR (EN) 2005 PAU-UZEIN (AERO)

Slide: Olivier Mestre

Page 4

Pairwise homogenization

http://variable-variability.blogspot.de/2012/08/statistical-homogenisation-for-dummies.html

Page 5

HOME validation study

Compare full homogenisation algorithms

Benchmark dataset

– Monthly temperature and precipitation networks

– Most realistic to date

Configuration

– Typical for Europe

– Number of stations: 5, 9, 15

Page 6

Scatterplots monthly CRMSE

[Figure: six scatterplot panels, one per contribution (ACMANT, PRODIGE monthly, USHCN main, MASH main, C3SNHT, PMFred abs). x-axis: CRMSE inhomogeneous data [°C]; y-axis: CRMSE homogenised data [°C]. Axes run from 0 to 1.5 °C, except the y-axis of the PMFred abs panel, which runs to 2.5 °C.]
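The scatterplots compare the CRMSE of each station before and after homogenisation. As a minimal sketch, assuming CRMSE means the centred root mean square error (the RMSE that remains after each series' own mean is removed, so a constant offset does not count as an error), it could be computed as below. The function name and the toy series are illustrative, not taken from the study.

```python
import math

def crmse(candidate, truth):
    """Centred RMSE: the RMSE left after removing each series' own mean."""
    n = len(truth)
    mean_c = sum(candidate) / n
    mean_t = sum(truth) / n
    squared = [((c - mean_c) - (t - mean_t)) ** 2
               for c, t in zip(candidate, truth)]
    return math.sqrt(sum(squared) / n)

# Toy data (hypothetical): a "true" series and two distorted versions of it.
truth = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
shifted = [t + 5.0 for t in truth]                 # constant offset only
broken = truth[:3] + [t + 1.0 for t in truth[3:]]  # 1 °C break halfway

print(crmse(shifted, truth))  # the constant offset is not penalised
print(crmse(broken, truth))   # the break is
```

A point below the diagonal in such a scatterplot then corresponds to a station whose CRMSE was reduced by homogenisation.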

Page 7

Errors in trends

[Figure, temperature panel: trend difference [°C/100a], axis from -6 to 6, for ACMANT, SNHT DWD, C3SNHT, PMFred abs, PMTred rel, AnClim main, USHCN cx8, USHCN 52x, USHCN main, PRODIGE trendy, PRODIGE monthly, PRODIGE main, MASH main and the inhomogeneous data.]

[Figure, precipitation panel: trend difference [mm/100a], axis from -50 to 50, for Climatol, C3SNHT, PMFred abs, PMTred rel, AnClim main, PRODIGE trendy, PRODIGE monthly, PRODIGE main, MASH main and the inhomogeneous data.]
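The trend difference on the axes can be illustrated with an ordinary least-squares trend expressed per 100 years. This is a sketch under assumptions: annual values, and a difference taken as "tested series minus true series". The function names and the toy data are hypothetical.

```python
def linear_trend_per_century(values, samples_per_year=1):
    """Ordinary least-squares slope, expressed per 100 years."""
    n = len(values)
    times = [i / samples_per_year for i in range(n)]  # time in years
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    cov = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    var = sum((t - mean_t) ** 2 for t in times)
    return 100.0 * cov / var

def trend_difference(tested, truth, samples_per_year=1):
    """Trend of the tested series minus trend of the true series."""
    return (linear_trend_per_century(tested, samples_per_year)
            - linear_trend_per_century(truth, samples_per_year))

# Toy data (hypothetical): a true trend of 0.8 °C per century and a copy
# whose first 50 years read 0.2 °C too cold, which steepens the trend.
truth = [0.008 * year for year in range(100)]
biased = [v - 0.2 if year < 50 else v for year, v in enumerate(truth)]

print(linear_trend_per_century(truth))
print(trend_difference(biased, truth))
```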

Page 8

Lessons

Modern methods are a factor of 2 more accurate

– Multiple breakpoint methods

– Methods that are designed to work with inhomogeneous reference series

Training is important

Automatic methods are as good as manual methods

– No metadata in validation dataset

SNHT is not recommended

Absolute homogenization is a method of last resort

Page 9

Decomposition method on Benchmark

Domonkos, P., V. Venema, O. Mestre. Efficiencies of homogenisation methods: our present knowledge and its limitation. Proceedings of the Seventh seminar for homogenization and quality control in climatological databases, Budapest, Hungary, 24–28 October 2011, WMO report, Climate data and monitoring, WCDMP-No. 78, pp. 11-24, 2013.

[Figure labels: RMSE station trends; CRMSE; annual data]

Page 10

Caveats HOME: ISTI

1. Missing homogenization methods

– Two- or multi-phase regression method

2. Size breaks (random walk or noise)

– Ralf Lindau and Victor Venema. The joint influence of break and noise variance on the break detection capability in time series homogenization.

3. Signal to noise ratio varies regionally

4. Regional trends (absolute homogenization)

5. Length of the series

– Ralf Lindau and Victor Venema. On the multiple breakpoint problem and the number of significant breaks in homogenisation of climate records. Idojaras, 117, no. 1, pp. 1-34, 2013.

6. Non-climatic trend bias

International Surface Temperature Initiative

– Kate Willett et al. A framework for benchmarking of homogenisation algorithm performance on the global scale. Geosci. Instrum. Method. Data Syst., 3, pp. 187-200, 2014.
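Caveat 2 distinguishes two ways of modelling break sizes. One possible reading, sketched here as an assumption: in a "random walk" model every break adds a jump to the previous level, so deviations accumulate, while in a "noise" model every homogeneous segment gets its own independent deviation from the true level. The function name, the break years and the jump standard deviation of 0.8 °C are illustrative choices, not values from the slides.

```python
import random

def break_signal(n_years, break_years, model, sigma=0.8, seed=0):
    """Step function of station-level deviations under two break models."""
    rng = random.Random(seed)
    level = 0.0
    signal = []
    for year in range(n_years):
        if year in break_years:
            jump = rng.gauss(0.0, sigma)
            if model == "random_walk":
                level += jump   # jumps accumulate from break to break
            elif model == "noise":
                level = jump    # each segment gets an independent level
            else:
                raise ValueError(f"unknown model: {model}")
        signal.append(level)
    return signal

breaks = {20, 50, 80}  # hypothetical break years
walk = break_signal(100, breaks, "random_walk")
noise = break_signal(100, breaks, "noise")

# Same seed, same jumps: after the second break the random-walk level is
# the sum of the first two jumps, the noise level is the second jump only.
print(walk[50], noise[20] + noise[50])
```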

Page 11

Radiation error

Page 12

Radiation error

Climates with the largest radiation errors:

* Strong insolation

* Low wind

* Dry ground

* High specific humidity

Page 13

Parallel measurements

Transition to Stevenson screens

North-West Europe: < 0.2°C (Various, Parker)

Basel, Switzerland: 0°C (Wild screen)

Kremsmünster, Austria: 0.2°C (North-wall)

Adelaide, South Australia: 0.2°C (Glaisher stand)

Spain: 0.35°C (French screen)

Sri Lanka: 0.37°C (Tropical screen)

India: 0.42°C (Tropical screen)

Page 14

Sources of global temperature trend bias

Transition to Stevenson screens

Transition to Automatic Weather Stations

Urbanization

Siting

Irrigation

Relocations to airports

Page 15

Research on parallel data

Large database with parallel measurements needed to study daily inhomogeneities

o Study statistical and physical properties of (daily) inhomogeneities

o Dependence on local weather and regional climate

o Most studies are currently about mid-latitudes

o Validate detected inhomogeneities

o Independent evidence for trend bias

Page 16

Parallel Data Initiative

Produce an open database

Initially data is restricted to contributors

– Incentive to contribute

– Until first joint paper(s) by contributors are written

First action: Inventory of parallel datasets

– https://ourproject.org/moin/projects/parallel

– Dozens of datasets available

More information

– http://tinyurl.com/paralleldata

[email protected]

Page 17

Conclusions & outlook

Statistical homogenization improves temperature trend estimates

– Only the best methods improve precipitation trends

Modern homogenization methods are more accurate

1. Statistical homogenization

– Global validation study missing

– Better mathematical understanding of methods

2. Better physical understanding of causes

– http://tinyurl.com/paralleldata

3. More modelling to improve understanding

Page 18

Q&A slides

Page 19

Shorter length, less certainty

[Figure: two panels, n = 21 years and n = 101 years; exceeding probability levels 1/128, 1/64, 1/32, 1/16, 1/8, 1/4]

Ralf Lindau and Victor Venema. On the multiple breakpoint problem and the number of significant breaks in homogenisation of climate records. Idojaras, Quart. journal Hungarian Meteorol. Service, 117, no. 1, pp. 1-34, 2013.

Page 20

Which SNR is sufficient?

RMS skill for random segmentation (0) and standard search (+) for different SNRs.

So far we considered SNR = ½.

Random segmentation and standard search have comparable skills.

Only for SNR > 1 is the standard search significantly better.

[Figure labels: Random, Standard]

Page 21

Surrogate temperature section

Generated homogeneous temperature networks

– Stochastic modelling

– Based on statistical properties of homogenized data

Configuration

– Typical for Europe

– 15 networks

– Length: 100 years

– Number of stations: 5, 9, 15

Added non-climatic changes

– Most realistic to date

Page 22

Beginning

Missing data

Page 23

WWII

Missing data

Page 24

Outliers

Page 25

Breaks

Page 26

Simultaneous breaks

Page 27

Local trends

Page 28

Physical causes of inhomogeneities

Shelter type, exposure

– Radiation & wetting protection

– Natural or forced ventilation

– Snow cover

– Plastic screen: insolation on hot days

Relocation of station

– City -> airport, suburbs, lower heights

– Deurbanisation of network

Instrument

– Response, integration time

– Zero drift, shrinking glass in initial years

– Calibration errors

– Temperature out of range

– Mercury thermometers: T < -39°C

Changes in surroundings

– Urbanization, growing vegetation, irrigation

Definitions

– Computation of daily means

Measurement procedures

– Reading times

Maintenance procedures

– AWS: Icing, damage detection

– Painting & cleaning schedule

Digitisation & database

– Minus sign forgotten

– Station names mixed up

– Pre-homogenised data

Page 29

Correction methodology - inflation

Corrections have a deterministic (explained variance) and a stochastic (unexplained) component

Downscaling: problems with deterministic corrections

– Variance inflation (Von Storch, 1999)

– Quantile matching (Maraun, 2013)

– Unintended change of the trend in the mean

Should correct unexplained variance with noise

Homogenization

– Trend in the difference time series is small

– Gradual inhomogeneities (urbanization)

Maraun, D. Bias correction, quantile mapping, and downscaling: revisiting the inflation issue. J. Clim., 26, pp. 2137-2143, doi: 10.1175/JCLI-D-12-00821.1, 2013.

Von Storch, H. On the Use of "Inflation" in Statistical Downscaling. J. Clim., 12, pp. 3505-3506, 1999.
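The difference between inflating the deterministic part and adding noise for the unexplained part can be sketched with a toy regression. Everything numeric here is an assumption made for illustration: a single predictor, a slope of 0.6 and a residual standard deviation of 0.8, so that only about 36% of the target variance is explained.

```python
import random
import statistics

rng = random.Random(42)
n = 20000
x = [rng.gauss(0.0, 1.0) for _ in range(n)]       # predictor
y = [0.6 * xi + rng.gauss(0.0, 0.8) for xi in x]  # target, variance near 1

mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
         / sum((a - mean_x) ** 2 for a in x))

# Deterministic correction: carries only the explained variance.
deterministic = [mean_y + slope * (a - mean_x) for a in x]
resid_var = statistics.pvariance([b - d for b, d in zip(y, deterministic)])

# Inflation: stretch the deterministic part until its variance matches y.
factor = (statistics.pvariance(y) / statistics.pvariance(deterministic)) ** 0.5
inflated = [mean_y + factor * (d - mean_y) for d in deterministic]

# Randomisation: add noise that carries the unexplained variance instead.
randomised = [d + rng.gauss(0.0, resid_var ** 0.5) for d in deterministic]

for name, series in [("target", y), ("deterministic", deterministic),
                     ("inflated", inflated), ("randomised", randomised)]:
    print(name, round(statistics.pvariance(series), 2))
```

Both the inflated and the randomised series recover the target variance, but inflation does so by exaggerating the response to the predictor, which is the kind of unintended side effect the slide points at.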

Page 30

Correction – change in noise source

Change in cross-correlation

– Relocation, change in noise source

Simple example

– |N1| = |N2|

– No inhomogeneity in distribution

– Jump in difference time series

Station 1: R + W1 + N1, then R + W1 + N2

Station 2: R + W2 + N1

R: regional climate signal

N: instrument-specific error

W: station-specific weather
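The simple example can be simulated under one reading of the slide, stated here as an assumption: station 1 switches from error source N1 (shared with station 2) to an independent source N2 with the same variance, which is how |N1| = |N2| is interpreted. The standard deviations and the series length are hypothetical.

```python
import random
import statistics

rng = random.Random(1)
n = 4000  # time steps before and after the change (hypothetical)

def gauss_series(sigma, length):
    return [rng.gauss(0.0, sigma) for _ in range(length)]

R = gauss_series(1.0, 2 * n)   # regional climate signal
W1 = gauss_series(0.3, 2 * n)  # station-specific weather, station 1
W2 = gauss_series(0.3, 2 * n)  # station-specific weather, station 2
N1 = gauss_series(0.5, 2 * n)  # error source shared by both stations at first
N2 = gauss_series(0.5, 2 * n)  # new, independent source with equal variance

station1 = [R[t] + W1[t] + (N1[t] if t < n else N2[t]) for t in range(2 * n)]
station2 = [R[t] + W2[t] + N1[t] for t in range(2 * n)]
difference = [a - b for a, b in zip(station1, station2)]

var_station_before = statistics.pvariance(station1[:n])
var_station_after = statistics.pvariance(station1[n:])
var_diff_before = statistics.pvariance(difference[:n])
var_diff_after = statistics.pvariance(difference[n:])

print("station 1 variance:", round(var_station_before, 2),
      round(var_station_after, 2))
print("difference variance:", round(var_diff_before, 2),
      round(var_diff_after, 2))
```

In this sketch the distribution of station 1 stays the same, while the variance of the difference series changes abruptly because the shared error no longer cancels.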

Page 31

Correction – change in noise source

Page 32

Research on parallel data

Large database with parallel measurements needed to study daily inhomogeneities

o Generate benchmark data with realistic inhomogeneities

o For example, second cycle of ISTI

o Validate detected inhomogeneities

Page 33

Exposure

Insolation

– Sun, hot ground, scattered radiation

Humidity and clouds

– Infrared radiative cooling

Wind

– Heat exchange

Design

– Size sensor

– Shielding

– Mechanical ventilation

Page 34

Australia: Albany airport and town

Trewin (2012)

Page 35

Parallel measurements – Kremsmünster

Böhm et al. (2010)

Page 36

Kremsmünster – percentiles difference

Böhm et al. (2010)

Page 37

Spain: Montsouris screen, Stevenson observations, Stevenson automatic

Page 38

Montsouris vs. Stevenson: difference as function of Diurnal Temperature Range and Tmax

Murcia: South East Spain, Mediterranean.

La Coruña (Corunna): North West Spain, Atlantic.

July

April


Page 40

Motivation: daily data

“[Inhomogeneous data] affects, in particular, the understanding of extremes, because changes in extremes are often more sensitive to inhomogeneous climate monitoring practices than changes in the mean.”

Trenberth, K.E., et al., 2007: Observations: Surface and Atmospheric Climate Change. In: Climate Change 2007: The Physical Science Basis. Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA.

Page 41

Extremes, mean and variability

Page 42

Importance of changes in variability and mean

The relative sensitivity of an extreme to changes in the mean (dashed line) and in the standard deviation (solid line) for a certain temperature threshold (x-axis). The relative sensitivity to the mean (standard deviation) is the change in probability of an extreme event per change in the mean (or standard deviation), divided by its probability. From Katz and Brown (1992).
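The relative sensitivities in the caption can be worked out in closed form if temperature is assumed to be Gaussian, which is an assumption of this sketch. With z = (threshold - mean) / sd and exceedance probability P = 1 - Φ(z), the derivatives are dP/dmean = φ(z)/sd and dP/dsd = z·φ(z)/sd, and each is divided by P. The function names and the example numbers are illustrative.

```python
import math

def exceedance_probability(threshold, mean, sd):
    """P(X > threshold) for a Gaussian X."""
    z = (threshold - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def relative_sensitivities(threshold, mean, sd):
    """Return ((dP/dmean) / P, (dP/dsd) / P) for a Gaussian X."""
    z = (threshold - mean) / sd
    density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    p = exceedance_probability(threshold, mean, sd)
    to_mean = density / sd / p
    to_sd = z * density / sd / p
    return to_mean, to_sd

# Hypothetical climate: mean 20 °C, standard deviation 5 °C.
for threshold in (25.0, 30.0, 35.0):
    to_mean, to_sd = relative_sensitivities(threshold, 20.0, 5.0)
    print(threshold, round(to_mean, 3), round(to_sd, 3))
```

Under this assumption the ratio of the two sensitivities equals z, so the further the threshold lies in the tail, the more a change in the standard deviation matters relative to a change in the mean.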

Page 43

A priori formula

The different reaction of breaks and noise to randomly inserted breaks makes it possible to estimate break variance and break number a priori.

If we insert many breaks, almost the entire break variance is explained, plus a known fraction of the noise.

At k = nk, half of the break variance is reached (22.8% in total).

No bias component.

[Figure annotations: 0.228, 3.1]

Page 44

Pairwise homogenization

http://variable-variability.blogspot.de/2012/08/statistical-homogenisation-for-dummies.html


Page 50

A blind test of monthly homogenisation algorithms

Victor Venema, O. Mestre, E. Aguilar, I. Auer, J. A. Guijarro, P. Domonkos, G. Vertacnik, T. Szentimrey, P. Stepanek, P. Zahradnicek, J. Viarre, G. Müller-Westermeier, M. Lakatos, C. N. Williams, M. Menne, R. Lindau, D. Rasol, E. Rustemeier, K. Kolokythas, T. Marinova, L. Andresen, F. Acquaotta, S. Fratianni, S. Cheval, M. Klancar, M. Brunetti, C. Gruber, M. Prohom Duran, T. Likso, P. Esteban, T. Brandsma

Meteorological Institute, Bonn

Page 51

Participants returned the data

25 blind contributions

Some algorithms multiple contributions

– Test versions

– Test influence operator (manual methods)

Algorithms/software

– USHCN

– PRODIGE

– MASH

– Craddock

– AnClim

– RhTestV2

– SNHT

– Climatol

– ACMANT

Page 52

Monthly CRMSE complete contributions

[Figure, temperature panel: CRMSE [°C], axis from 0 to 1.4, for ACMANT, SNHT DWD, C3SNHT, PMFred abs, PMTred rel, AnClim main, USHCN cx8, USHCN 52x, USHCN main, PRODIGE trendy, PRODIGE monthly, PRODIGE main, MASH main and the inhomogeneous data.]

[Figure, precipitation panel: CRMSE [mm], axis from 0 to 30, for Climatol, C3SNHT, PMFred abs, PMTred rel, AnClim main, PRODIGE trendy, PRODIGE monthly, PRODIGE main, MASH main and the inhomogeneous data.]

Page 53

Decadal CRMSE complete contributions

[Figure, temperature panel: CRMSE [°C], axis from 0 to 1.4, for ACMANT, SNHT DWD, C3SNHT, PMFred abs, PMTred rel, AnClim main, USHCN cx8, USHCN 52x, USHCN main, PRODIGE trendy, PRODIGE monthly, PRODIGE main, MASH main and the inhomogeneous data.]

[Figure, precipitation panel: CRMSE [mm], axis from 0 to 30, for Climatol, C3SNHT, PMFred abs, PMTred rel, AnClim main, PRODIGE trendy, PRODIGE monthly, PRODIGE main, MASH main and the inhomogeneous data.]

Page 54

Contingency scores

Contribution         No. stations  POD   POFD  Peirce skill score  Heidke skill score  Heidke special
MASH main            111           0.63  0.09  0.53                0.31                -0.20
PRODIGE main         111           0.35  0.02  0.33                0.35                0.41
PRODIGE monthly      111           0.39  0.02  0.37                0.40                0.44
PRODIGE trendy       111           0.35  0.02  0.32                0.35                0.41
USHCN main           111           0.34  0.00  0.33                0.46                0.61
USHCN 52x            111           0.40  0.01  0.39                0.51                0.62
USHCN cx8            111           0.35  0.01  0.35                0.47                0.61
AnClim main          111           0.18  0.03  0.15                0.16                0.20
iCraddock Vertacnik  55            0.60  0.03  0.57                0.54                0.49
PMTred rel           111           0.41  0.04  0.37                0.34                0.27
PMFred abs           111           0.21  0.01  0.20                0.27                0.46
C3SNHT               111           0.23  0.05  0.18                0.16                0.04
SNHT DWD             111           0.12  0.01  0.11                0.15                0.40
Climatol             111           0.38  0.01  0.37                0.45                0.55
ACMANT               111           0.50  0.03  0.47                0.44                0.41
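The first four score columns can be related to a 2x2 contingency table of detected versus true breaks. The sketch below uses the standard textbook definitions: POD is the probability of detection, POFD the probability of false detection, the Peirce skill score is POD minus POFD, and the Heidke skill score measures accuracy relative to chance. How detections were matched to true breaks in the study, and how the "Heidke special" column was defined, is not stated on the slide and is not reproduced here. The counts in the example are hypothetical.

```python
def contingency_scores(hits, false_alarms, misses, correct_negatives):
    """Standard 2x2 contingency-table scores for break detection."""
    a, b, c, d = hits, false_alarms, misses, correct_negatives
    pod = a / (a + c)      # probability of detection
    pofd = b / (b + d)     # probability of false detection
    peirce = pod - pofd    # Peirce skill score
    heidke = (2.0 * (a * d - b * c)
              / ((a + c) * (c + d) + (a + b) * (b + d)))
    return {"POD": pod, "POFD": pofd, "Peirce": peirce, "Heidke": heidke}

# Hypothetical counts, chosen only to give round numbers.
example = contingency_scores(hits=35, false_alarms=2,
                             misses=65, correct_negatives=98)
for name, value in example.items():
    print(name, round(value, 2))
```

In the table above the Peirce column matches POD minus POFD up to rounding, for example 0.35 - 0.02 = 0.33 for PRODIGE main.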

Page 55

Pairwise vs composite reference

Composite reference

– Compute a weighted average of neighbours

– Reduces the influence of inhomogeneities (IH) in single stations

– Careful selection of stations needed

No large breaks for detection

No breaks for corrections

Pairwise

– Need to attribute breaks found in the pairs to a station

– Solution to this problem is still ad-hoc or manual

– Potential for optimal mathematical solution

– Joint detection: all stations simultaneously

– Solving combinatorial problem for large breaks
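The two reference strategies can be contrasted on a toy network. This is a sketch with invented data: three neighbours with equal weights (real methods typically choose weights with care), a candidate station with a 1.0 °C break and one neighbour with its own 0.6 °C break. The function names are hypothetical.

```python
def composite_reference(neighbours, weights):
    """Weighted average of neighbour series (lists of equal length)."""
    total = sum(weights)
    length = len(neighbours[0])
    return [sum(w * series[t] for w, series in zip(weights, neighbours)) / total
            for t in range(length)]

def difference_series(candidate, reference):
    return [c - r for c, r in zip(candidate, reference)]

# Shared climate signal; candidate breaks at t = 5, neighbour B at t = 3.
climate = [0.1 * t for t in range(10)]
candidate = [v + (1.0 if t >= 5 else 0.0) for t, v in enumerate(climate)]
neighbour_a = list(climate)
neighbour_b = [v + (0.6 if t >= 3 else 0.0) for t, v in enumerate(climate)]
neighbour_c = list(climate)

reference = composite_reference([neighbour_a, neighbour_b, neighbour_c],
                                [1, 1, 1])
vs_composite = difference_series(candidate, reference)
vs_pair_b = difference_series(candidate, neighbour_b)

print([round(v, 2) for v in vs_composite])
print([round(v, 2) for v in vs_pair_b])
```

In the composite difference series the neighbour's break is damped to a third of its size but still present, which is why station selection matters. In the pairwise series both breaks appear at full size and still have to be attributed to the right station.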