Peter Gleckler*
Program for Climate Model Diagnosis and Intercomparison (PCMDI)
LLNL, USA
Performance Metrics for Climate Models: WDAC advancements towards routine modeling benchmarks
Climate from Space Week, Geneva, Feb 2013
* Representing WDAC and the WGNE/WGCM Climate Model Metrics Panel
Monitoring evolution of model performance: Example from Numerical Weather Prediction
[Figure: "Weather Prediction Model Metrics" – RMS error (hPa) versus the year the forecast was made, for Day 3 and Day 5 forecasts. Courtesy M. Miller, ECMWF]

The climate modeling community does not yet have routine performance metrics.
What is usually meant by climate model “metrics”?
• “Metrics”, as used here, are succinct and objective measures of the quality of a model simulation – usually a scalar quantity (a minimal sketch follows this list)
• Metrics quantify errors; they are usually not designed to diagnose the reasons for model errors
• Skill in simulating things we have observed (“performance metrics”)
• Model reliability for application (e.g., “projection reliability metrics”)
§ How accurate are model projections of climate change?
§ Extremely valuable… and… extremely difficult
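To make the "scalar quantity" point concrete, here is a minimal sketch of one such metric: an area-weighted RMSE between a simulated and an observed climatology on a latitude-longitude grid. The function is illustrative only, not the panel's actual code.

```python
import numpy as np

def area_weighted_rmse(model, obs, lat):
    """Scalar skill metric: RMSE between a simulated and an observed
    field on a lat-lon grid, weighted by cos(latitude) so that each
    grid cell counts in proportion to its area.
    model, obs: 2-D arrays (nlat, nlon); lat: 1-D latitudes in degrees."""
    w = np.cos(np.deg2rad(lat))[:, np.newaxis]   # (nlat, 1) area weights
    w = np.broadcast_to(w, model.shape)          # match (nlat, nlon)
    return float(np.sqrt(np.average((model - obs) ** 2, weights=w)))
```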
Questions motivating routine benchmarks for climate models
• Of direct concern to the WDAC (WGNE and WGCM):
§ Are models improving?
§ Do some models consistently agree with observations better than others?
§ What do models simulate robustly, and what do they not?
• Related research drivers:
§ How does skill in simulating observed climate relate to projection credibility?
§ Can we justify weighting model projections based on metrics of skill?
What opportunities are there to construct climate model performance metrics?
• Model’s externally “forced” responses on a range of time-scales:
§ Diurnal cycle
§ Annual cycle
§ Volcanic eruptions, changes in solar irradiance, …
• Model’s “unforced” behavior (weather, MJO, ENSO, NAO, PDO, …) – a toy split of the forced annual cycle from unforced anomalies is sketched after this list
• Evaluate model representation of individual processes and co-variability relationships
• Test model ability to solve the “initial value” problem
• Examine how well models perform with added complexity
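As a toy illustration of the forced/unforced distinction above, a sketch that separates the climatological annual cycle from the residual anomalies. The function name and the simple 1-D monthly series are assumptions for illustration.

```python
import numpy as np

def split_annual_cycle(monthly, n_years):
    """Separate a monthly time series into its climatological annual
    cycle (an externally 'forced' response) and the residual
    anomalies ('unforced' variability such as ENSO or the NAO).
    monthly: 1-D array of length 12 * n_years."""
    ts = np.asarray(monthly).reshape(n_years, 12)
    annual_cycle = ts.mean(axis=0)            # 12-value climatology
    anomalies = (ts - annual_cycle).ravel()   # what the cycle leaves behind
    return annual_cycle, anomalies
```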
Taylor diagram for CMIP3 annual cycle global climatology (1980-1999)
• Variable-dependent skill
• Multi-model mean “superiority”
[Taylor diagram: standard deviation versus pattern correlation for each model, with the observed standard deviation (OBS) marked as the reference point]
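For reference, a minimal sketch of the three statistics a Taylor diagram summarizes, computed here on flattened fields with equal weighting; a production calculation would weight by grid-cell area.

```python
import numpy as np

def taylor_stats(model, obs):
    """The statistics a Taylor diagram displays for one field:
    pattern correlation R, the two standard deviations, and the
    centered RMS error E', which are linked geometrically by
    E'**2 = s_m**2 + s_o**2 - 2 * s_m * s_o * R (Taylor 2001)."""
    m = model - model.mean()
    o = obs - obs.mean()
    s_m, s_o = m.std(), o.std()
    r = (m * o).mean() / (s_m * s_o)            # pattern correlation
    e_prime = np.sqrt(((m - o) ** 2).mean())    # centered RMS error
    return r, s_m, s_o, e_prime
```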
Evaluating how well climate models simulate the annual cycle: A “Performance Portrait” of relative errors
[Performance portrait: relative RMSE in the climatological annual cycle (including the spatial pattern), colored from "Best" to "Worst". Rows: climate variables – latent and sensible heat flux at the surface, surface temperature, reflected SW radiation (clear-sky and all-sky), outgoing LW radiation (clear-sky and all-sky), total cloud cover, precipitation, total column water vapor, sea-level pressure, meridional and zonal wind stress, meridional and zonal wind at the surface, specific humidity at 400 and 850 mb, meridional and zonal wind at 200 and 850 mb, temperature at 200 and 850 mb, geopotential height at 500 mb. Columns: models used in the IPCC Fourth Assessment, plus the multi-model mean and median.]

Gleckler, P., K. Taylor and C. Doutriaux, J. Geophys. Res. (2008)
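Each cell of such a portrait is an error normalized by a typical model error. A sketch of the median-based normalization along the lines of Gleckler et al. (2008), with a hypothetical function name:

```python
import numpy as np

def portrait_relative_errors(rmse):
    """Turn an (n_models, n_variables) array of RMSE values into the
    relative errors shown in a portrait plot: the fractional
    deviation from the median error across models, per variable."""
    median = np.median(rmse, axis=0)   # typical model error, per variable
    return (rmse - median) / median    # e.g. +0.20 => 20% worse than typical
```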
Gauged by simple metrics, the structure of relative model errors is complex
Santer et al., PNAS, 2009
What difference does the choice of metric make?

[Figure: rank (0–20) of each CMIP3 model for annual mean precipitation (OBS = GPCP) under three metrics: RMSE, MAE, and AVG. Models: mpi_echam5, miroc3_2_hires, gfdl_cm2_1, cnrm_cm3, miroc3_2_medres, mri_cgcm2_3_2a, ncar_pcm1, giss_model_e_r, giss_model_e_h, ipsl_cm4, ukmo_hadgem1, iap_fgoals1_0_g, ncar_ccsm3_0, csiro_mk3_0, cccma_cgcm3_1, inmcm3_0, gfdl_cm2_0, ukmo_hadcm3, cccma_cgcm3_1_t63, bcc_cm1, bccr_bcm2_0, giss_aom, plus the multi-model mean and median (mean-c06a, median-c06a).]
• Choice of metrics can impact rank
• Outliers (good/bad) robust to choice (in this example)
It is better to be aware of how results are affected by the choice of metric than to rely on a single score (see the sketch below)
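A toy sketch of the rank comparison, using RMSE and MAE on synthetic scores for made-up models; it shows how the leader can change with the metric:

```python
import numpy as np

def rank_models(scores, names):
    """Rank models from best (1) to worst by an error score."""
    order = np.argsort(scores)
    return {names[i]: rank + 1 for rank, i in enumerate(order)}

# Hypothetical error scores for three made-up models under two metrics:
names = ["model_a", "model_b", "model_c"]
rmse = np.array([2.1, 1.8, 3.0])
mae = np.array([1.5, 1.6, 2.4])
print(rank_models(rmse, names))  # {'model_b': 1, 'model_a': 2, 'model_c': 3}
print(rank_models(mae, names))   # {'model_a': 1, 'model_b': 2, 'model_c': 3}
```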
WGCM/WGNE Metrics panel terms of reference
• Identify a limited set of basic climate model performance metrics
• based on comparison with carefully selected observations
• well established in the literature, and preferably in widespread use
• easy to calculate, reproduce, and interpret
• covering a diverse suite of climate characteristics:
§ large- to global-scale mean climate and some variability
§ atmosphere, oceans, land surface, and sea ice
• Coordinate with other WCRP/CLIVAR working groups
• Identify metrics for more focused evaluation (e.g., modes of variability, process level)
• Striving towards a community-based activity by coalescing expertise
• Justify and promote these basic metrics in an attempt to:
§ establish routine community benchmarks
§ facilitate further research into increasingly targeted metrics
• Ensure that these metrics are applied in CMIP5 and widely available
First steps… focus on annual cycle (which is in widespread use)
Standard annual cycle:
§ 15–20 large- to global-scale statistical or “broad-brush” metrics
§ Domains: global, tropical, NH/SH extra-tropics
§ 20-year climatologies: annual mean, 4 seasons
§ Routine metrics: bias, centered RMS, MAE, correlation, standard deviation (sketched below)
§ Field examples: OLR, T850, q, SST, SSH, sea-ice extent
§ Observations: multiple for most cases
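A minimal sketch of those routine statistics for a single field (flattened arrays, equal weighting assumed; a production version would be area-weighted):

```python
import numpy as np

def routine_metrics(model, obs):
    """Compute the routine statistics listed above for one field:
    bias, centered RMS error, MAE, pattern correlation, and the
    two standard deviations (equal weighting for brevity)."""
    m = model - model.mean()
    o = obs - obs.mean()
    return {
        "bias": model.mean() - obs.mean(),
        "centered_rms": np.sqrt(((m - o) ** 2).mean()),
        "mae": np.abs(model - obs).mean(),
        "correlation": (m * o).mean() / (m.std() * o.std()),
        "std_model": model.std(),
        "std_obs": obs.std(),
    }
```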
Extended set of metrics, coordinating with other working groups (in progress):
§ ENSO (CLIVAR Pacific Panel)
§ Monsoons (CLIVAR AAMP)
§ MJO (YOTC Task Force)
§ Carbon cycle in emission-driven ESMs (ILAMB)
Some scratch slides….
The essential role of observations for climate model performance metrics
• obs4MIPs and other efforts strive to advance the connection between data experts and model analysts
• Transparency is crucial:
§ Knowing the data came from the appropriate source (ideally the data experts)
§ Accurate information concerning the data product version
§ Documentation on the data product that is relevant for model analysts
• Quantifying observational uncertainty remains a key challenge:
§ For some fields, model errors remain much larger than observational uncertainty, but in many cases they do not
§ Although inadequate, the common path is to characterize observational uncertainty by using multiple products (see the sketch below)
§ Increasingly, model analysts expect useful quantification of uncertainties
§ New observational ensembles, exploring the impact of processing choices, are of tremendous interest
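One way to read the "multiple products" point above in code: score the same model field against each available observational product and report the spread of the metric as a crude uncertainty estimate. A sketch with hypothetical names:

```python
import numpy as np

def rmse_across_products(model, obs_products):
    """Score one model field against several observational products
    and summarize the spread of the metric across products -- a
    crude stand-in for observational uncertainty.
    obs_products: dict mapping product name to a field array."""
    scores = {name: float(np.sqrt(((model - obs) ** 2).mean()))
              for name, obs in obs_products.items()}
    lo, hi = min(scores.values()), max(scores.values())
    return scores, hi - lo   # per-product scores and their spread
```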
Possible advancements for a community-based effort to establish routine benchmarks for climate models
• The WGNE/WGCM metrics panel is working to develop an analysis package to be shared with all leading modeling groups. It will include simple analysis routines, observational data, and a database of metrics results from all available climate models, enabling modeling groups to compare their results against other models within their model development process.
• The model data conventions applied to CMIP5 continue to transform how model evaluation is done in the research community. In essence, all scientists are using the same data, structured identically for each model under tightly defined metadata conventions. This opens the door to next-generation steps towards a shared environment for model evaluation tools. Careful incorporation of observational data into this framework will be critical, and obs4MIPs and other projects are paving the way.