Peter Gleckler*
Program for Climate Model Diagnosis and Intercomparison (PCMDI)
LLNL, USA
Performance Metrics for Climate Models: WDAC advancements towards routine modeling benchmarks
Climate from Space Week, Geneva, Feb 2013
* Representing WDAC and the WGNE/WGCM Climate Model Metrics Panel
Monitoring evolution of model performance: Example from Numerical Weather Prediction
[Figure: "Weather Prediction Model Metrics" – RMS error (hPa) versus the year the forecast was made, for Day 3 and Day 5 forecasts. Courtesy M. Miller, ECMWF]

The climate modeling community does not yet have routine performance metrics.
What is usually meant by climate model “metrics”?
• “Metrics”, as used here, are succinct and objective measures of the quality of a model simulation – usually a scalar quantity (a minimal sketch follows this list)
• Metrics quantify errors; they are usually not designed to diagnose the reasons for model errors
• Skill in simulating things we have observed (“performance metrics”)
• Model reliability for application (e.g., “projection reliability metrics”)
§ How accurate are model projections of climate change?
§ Extremely valuable… and… extremely difficult
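To make the "scalar quantity" point concrete, here is a minimal sketch of one such metric: an area-weighted RMSE between a simulated and an observed climatology on a latitude-longitude grid. The function is illustrative only, not the panel's actual code.

```python
import numpy as np

def area_weighted_rmse(model, obs, lat):
    """Scalar skill metric: RMSE between a simulated and an observed
    field on a lat-lon grid, weighted by cos(latitude) so that each
    grid cell counts in proportion to its area.
    model, obs: 2-D arrays (nlat, nlon); lat: 1-D latitudes in degrees."""
    w = np.cos(np.deg2rad(lat))[:, np.newaxis]   # (nlat, 1) area weights
    w = np.broadcast_to(w, model.shape)          # match (nlat, nlon)
    return float(np.sqrt(np.average((model - obs) ** 2, weights=w)))
```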
Questions motivating routine benchmarks for climate models
• Of direct concern to the WDAC (WGNE and WGCM):
§ Are models improving?
§ Do some models consistently agree with observations better than others?
§ What do models simulate robustly, and what do they not?
• Related research drivers:
§ How does skill in simulating observed climate relate to projection credibility?
§ Can we justify weighting model projections based on metrics of skill?
What opportunities are there to construct climate model performance metrics?
• Model’s externally “forced” responses on a range of time-scales:
§ Diurnal cycle
§ Annual cycle
§ Volcanic eruptions, changes in solar irradiance, …
• Model’s “unforced” behavior (weather, MJO, ENSO, NAO, PDO, …) – a toy split of the forced annual cycle from unforced anomalies is sketched after this list
• Evaluate model representation of individual processes and co-variability relationships
• Test model ability to solve the “initial value” problem
• Examine how well models perform with added complexity
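As a toy illustration of the forced/unforced distinction above, a sketch that separates the climatological annual cycle from the residual anomalies. The function name and the simple 1-D monthly series are assumptions for illustration.

```python
import numpy as np

def split_annual_cycle(monthly, n_years):
    """Separate a monthly time series into its climatological annual
    cycle (an externally 'forced' response) and the residual
    anomalies ('unforced' variability such as ENSO or the NAO).
    monthly: 1-D array of length 12 * n_years."""
    ts = np.asarray(monthly).reshape(n_years, 12)
    annual_cycle = ts.mean(axis=0)            # 12-value climatology
    anomalies = (ts - annual_cycle).ravel()   # what the cycle leaves behind
    return annual_cycle, anomalies
```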
Taylor diagram for CMIP3 annual cycle global climatology (1980-1999)
• Variable-dependent skill
• Multi-model mean “superiority”
[Taylor diagram: standard deviation versus pattern correlation for each model, with the observed standard deviation (OBS) marked as the reference point]
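For reference, a minimal sketch of the three statistics a Taylor diagram summarizes, computed here on flattened fields with equal weighting; a production calculation would weight by grid-cell area.

```python
import numpy as np

def taylor_stats(model, obs):
    """The statistics a Taylor diagram displays for one field:
    pattern correlation R, the two standard deviations, and the
    centered RMS error E', which are linked geometrically by
    E'**2 = s_m**2 + s_o**2 - 2 * s_m * s_o * R (Taylor 2001)."""
    m = model - model.mean()
    o = obs - obs.mean()
    s_m, s_o = m.std(), o.std()
    r = (m * o).mean() / (s_m * s_o)            # pattern correlation
    e_prime = np.sqrt(((m - o) ** 2).mean())    # centered RMS error
    return r, s_m, s_o, e_prime
```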
Evaluating how well climate models simulate the annual cycle: A “Performance Portrait” of relative errors
[Performance portrait: relative RMSE in the climatological annual cycle (including the spatial pattern), colored from "Best" to "Worst". Rows: climate variables – latent and sensible heat flux at the surface, surface temperature, reflected SW radiation (clear-sky and all-sky), outgoing LW radiation (clear-sky and all-sky), total cloud cover, precipitation, total column water vapor, sea-level pressure, meridional and zonal wind stress, meridional and zonal wind at the surface, specific humidity at 400 and 850 mb, meridional and zonal wind at 200 and 850 mb, temperature at 200 and 850 mb, geopotential height at 500 mb. Columns: models used in the IPCC Fourth Assessment, plus the multi-model mean and median.]

Gleckler, P., K. Taylor and C. Doutriaux, J. Geophys. Res. (2008)
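Each cell of such a portrait is an error normalized by a typical model error. A sketch of the median-based normalization along the lines of Gleckler et al. (2008), with a hypothetical function name:

```python
import numpy as np

def portrait_relative_errors(rmse):
    """Turn an (n_models, n_variables) array of RMSE values into the
    relative errors shown in a portrait plot: the fractional
    deviation from the median error across models, per variable."""
    median = np.median(rmse, axis=0)   # typical model error, per variable
    return (rmse - median) / median    # e.g. +0.20 => 20% worse than typical
```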
Gauged by simple metrics, the structure of relative model errors is complex
Santer et al., PNAS, 2009
What difference does the choice of metric make?

[Figure: rank (0–20) of each CMIP3 model for annual mean precipitation (OBS = GPCP) under three metrics: RMSE, MAE, and AVG. Models: mpi_echam5, miroc3_2_hires, gfdl_cm2_1, cnrm_cm3, miroc3_2_medres, mri_cgcm2_3_2a, ncar_pcm1, giss_model_e_r, giss_model_e_h, ipsl_cm4, ukmo_hadgem1, iap_fgoals1_0_g, ncar_ccsm3_0, csiro_mk3_0, cccma_cgcm3_1, inmcm3_0, gfdl_cm2_0, ukmo_hadcm3, cccma_cgcm3_1_t63, bcc_cm1, bccr_bcm2_0, giss_aom, plus the multi-model mean and median (mean-c06a, median-c06a).]
• Choice of metrics can impact rank
• Outliers (good/bad) robust to choice (in this example)
It is better to be aware of how results are affected by the choice of metric than to rely on a single score (see the sketch below)
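A toy sketch of the rank comparison, using RMSE and MAE on synthetic scores for made-up models; it shows how the leader can change with the metric:

```python
import numpy as np

def rank_models(scores, names):
    """Rank models from best (1) to worst by an error score."""
    order = np.argsort(scores)
    return {names[i]: rank + 1 for rank, i in enumerate(order)}

# Hypothetical error scores for three made-up models under two metrics:
names = ["model_a", "model_b", "model_c"]
rmse = np.array([2.1, 1.8, 3.0])
mae = np.array([1.5, 1.6, 2.4])
print(rank_models(rmse, names))  # {'model_b': 1, 'model_a': 2, 'model_c': 3}
print(rank_models(mae, names))   # {'model_a': 1, 'model_b': 2, 'model_c': 3}
```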
WGCM/WGNE Metrics panel terms of reference
• Identify a limited set of basic climate model performance metrics
• based on comparison with carefully selected observations
• well established in the literature, and preferably in widespread use
• easy to calculate, reproduce, and interpret
• covering a diverse suite of climate characteristics:
§ large- to global-scale mean climate and some variability
§ atmosphere, oceans, land surface, and sea ice
• Coordinate with other WCRP/CLIVAR working groups
• Identify metrics for more focused evaluation (e.g., modes of variability, process level)
• Striving towards a community-based activity by coalescing expertise
• Justify and promote these basic metrics in an attempt to:
§ establish routine community benchmarks
§ facilitate further research into increasingly targeted metrics
• Ensure that these metrics are applied in CMIP5 and widely available
First steps… focus on annual cycle (which is in widespread use)
Standard annual cycle:
§ 15–20 large- to global-scale statistical or “broad-brush” metrics
§ Domains: global, tropical, NH/SH extra-tropics
§ 20-year climatologies: annual mean, 4 seasons
§ Routine metrics: bias, centered RMS, MAE, correlation, standard deviation (sketched below)
§ Field examples: OLR, T850, q, SST, SSH, sea-ice extent
§ Observations: multiple for most cases
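A minimal sketch of those routine statistics for a single field (flattened arrays, equal weighting assumed; a production version would be area-weighted):

```python
import numpy as np

def routine_metrics(model, obs):
    """Compute the routine statistics listed above for one field:
    bias, centered RMS error, MAE, pattern correlation, and the
    two standard deviations (equal weighting for brevity)."""
    m = model - model.mean()
    o = obs - obs.mean()
    return {
        "bias": model.mean() - obs.mean(),
        "centered_rms": np.sqrt(((m - o) ** 2).mean()),
        "mae": np.abs(model - obs).mean(),
        "correlation": (m * o).mean() / (m.std() * o.std()),
        "std_model": model.std(),
        "std_obs": obs.std(),
    }
```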
Extended set of metrics, coordinating with other working groups (in progress):
§ ENSO (CLIVAR Pacific Panel)
§ Monsoons (CLIVAR AAMP)
§ MJO (YOTC Task Force)
§ Carbon cycle in emission-driven ESMs (ILAMB)
Some scratch slides….
The essential role of observations for climate model performance metrics
• obs4MIPs and other efforts strive to advance the connection between data experts and model analysts
• Transparency is crucial:
§ Knowing the data came from the appropriate source (ideally the data experts)
§ Accurate information concerning the data product version
§ Documentation on the data product that is relevant for model analysts
• Quantifying observational uncertainty remains a key challenge:
§ For some fields, model errors remain much larger than observational uncertainty, but in many cases they do not
§ Although inadequate, the common path is to characterize observational uncertainty by using multiple products (see the sketch below)
§ Increasingly, model analysts expect useful quantification of uncertainties
§ New observational ensembles, exploring the impact of processing choices, are of tremendous interest
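One way to read the "multiple products" point above in code: score the same model field against each available observational product and report the spread of the metric as a crude uncertainty estimate. A sketch with hypothetical names:

```python
import numpy as np

def rmse_across_products(model, obs_products):
    """Score one model field against several observational products
    and summarize the spread of the metric across products -- a
    crude stand-in for observational uncertainty.
    obs_products: dict mapping product name to a field array."""
    scores = {name: float(np.sqrt(((model - obs) ** 2).mean()))
              for name, obs in obs_products.items()}
    lo, hi = min(scores.values()), max(scores.values())
    return scores, hi - lo   # per-product scores and their spread
```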
Possible advancements for a community-based effort to establish routine benchmarks for climate models
• The WGNE/WGCM metrics panel is working to develop an analysis package to be shared with all leading modeling groups. It will include simple analysis routines, observational data, and a database of metrics results from all available climate models, enabling modeling groups to compare their results against other models within their model development process.
• The model data conventions applied to CMIP5 continue to transform how model evaluation is done in the research community. In essence, all scientists are using the same data, structured identically for each model under tightly defined metadata conventions. This opens the door to next-generation steps towards a shared environment for model evaluation tools. Careful incorporation of observational data into this framework will be critical, and obs4MIPs and other projects are paving the way.