17 May 2007 RSS Kent Local Group 1
Quantifying uncertainty in the UK carbon flux
Tony O’Hagan, CTCD, Sheffield
Outline
Introduction
Gaussian process emulation
The England and Wales carbon flux 2000
Computer models
In almost all fields of science, technology, industry and policy making, people use mechanistic models to describe complex real-world processes
For understanding, prediction, control
There is a growing realisation of the importance of uncertainty in model predictions
Can we trust them?
Without any quantification of output uncertainty, it’s easy to dismiss them
Examples
Climate prediction
Molecular dynamics
Nuclear waste disposal
Oil fields
Engineering design
Hydrology
Uncertainty analysis
Consider just one source of uncertainty
We have a computer model that produces output y = f(x) when given input x
But for a particular application we do not know x precisely
So X is a random variable, and so therefore is Y = f(X)
We are interested in the uncertainty distribution of Y
How can we compute it?
Monte Carlo
The usual approach is Monte Carlo
Sample values of x from its distribution
Run the model for all these values to produce sample values y_i = f(x_i)
These are a sample from the uncertainty distribution of Y
Neat but impractical if it takes minutes or hours to run the model
We can then only make a small number of runs
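The Monte Carlo recipe above can be sketched in a few lines. This is a minimal illustration only: the function `f` below is a hypothetical cheap stand-in for an expensive simulator, and the normal input distribution is an assumption made for the example, not something taken from the talk.

```python
import numpy as np


def f(x):
    # Hypothetical stand-in for the computer model (a real model may take
    # minutes or hours per run, which is what makes this approach costly).
    return np.sin(3.0 * x) + x ** 2


rng = np.random.default_rng(0)

# Assumed uncertainty distribution for the input X (illustrative only).
x_samples = rng.normal(loc=0.5, scale=0.2, size=10_000)

# One model run per sampled input gives a sample from the distribution of Y.
y_samples = f(x_samples)

mc_mean = y_samples.mean()
mc_sd = y_samples.std(ddof=1)
mc_se = mc_sd / np.sqrt(y_samples.size)  # Monte Carlo standard error of the mean

print(f"mean of Y ~ {mc_mean:.4f} (standard error {mc_se:.4f}), sd of Y ~ {mc_sd:.4f}")
```

The standard error shrinks only as one over the square root of the number of runs, so a budget of a few dozen runs of an expensive model leaves the Monte Carlo estimate quite noisy.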
GP solution
Treat f(·) as an unknown function with Gaussian process (GP) prior distribution
Use available runs as observations without error, to derive posterior distribution (also GP)
Make inference about the uncertainty distribution
E.g. the mean of Y is the integral of f(x) with respect to the distribution of X
Its posterior distribution is normal conditional on GP parameters
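The statement about the mean of Y can be written out as follows. The notation is assumed here rather than taken from the slides: p(x) is the density of X, and m*(x) and v*(x, x') are the posterior mean and covariance functions of the GP given the training runs. Because the mean is a linear functional of f, its posterior is normal conditional on the GP parameters:

```latex
M = \mathrm{E}[Y] = \int f(x)\, p(x)\, dx,
\qquad
M \mid \text{runs} \sim N\!\left( \int m^*(x)\, p(x)\, dx,\;
\iint v^*(x, x')\, p(x)\, p(x')\, dx\, dx' \right).
```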
Gaussian process emulation
Principles of emulation
The GP and how it works
Emulation
A computer model encodes a function that takes inputs and produces outputs
An emulator is a statistical approximation of that function
Estimates what outputs would be obtained from given inputs
With statistical measure of estimation error
Given enough training data, estimation error variance can be made small
So what?
A good emulator estimates the model output accurately
with small uncertainty
and runs “instantly”
So we can do uncertainty analysis etc fast and efficiently
Conceptually, we
use model runs to learn about the function
then derive any desired properties of the model
Gaussian process
Simple regression models can be thought of as emulators
But error estimates are invalid
We use Gaussian process emulation
Nonparametric, so can fit any function
Error measures can be validated
Analytically tractable, so can often do uncertainty analysis etc analytically
Highly efficient when many inputs
Reproduces training data correctly
2 code runs
Consider one input and one output
Emulator estimate interpolates data
Emulator uncertainty grows between data points
3 code runs
Adding another point changes estimate and reduces uncertainty
5 code runs
And so on
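The behaviour shown on the 2, 3 and 5 code run slides can be reproduced with a small sketch. Everything specific here is an assumption for illustration: the `simulator` function is a hypothetical cheap stand-in for a real model, the GP has zero prior mean and a squared-exponential covariance, and the variance and length-scale are fixed by hand rather than estimated as they would be in practice.

```python
import numpy as np


def simulator(x):
    # Hypothetical cheap stand-in for an expensive computer model.
    return np.sin(2.0 * np.pi * x) + 0.5 * x


def sq_exp_kernel(a, b, variance=1.0, length=0.3):
    # Squared-exponential covariance with hand-picked (assumed) parameters.
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length) ** 2)


def gp_emulator(x_train, y_train, x_new, jitter=1e-10):
    # Posterior mean and sd of a zero-mean GP given error-free runs.
    K = sq_exp_kernel(x_train, x_train) + jitter * np.eye(x_train.size)
    K_s = sq_exp_kernel(x_new, x_train)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    var = sq_exp_kernel(x_new, x_new).diagonal() - np.sum(v ** 2, axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))


x_grid = np.linspace(0.0, 1.0, 101)
designs = [
    np.array([0.0, 1.0]),        # 2 code runs
    np.array([0.0, 0.5, 1.0]),   # 3 code runs
    np.linspace(0.0, 1.0, 5),    # 5 code runs
]

max_sds = []
interp_errors = []
for x_train in designs:
    y_train = simulator(x_train)
    _, sd = gp_emulator(x_train, y_train, x_grid)
    mean_at_train, _ = gp_emulator(x_train, y_train, x_train)
    max_sds.append(sd.max())
    interp_errors.append(np.abs(mean_at_train - y_train).max())
    print(f"{x_train.size} runs: largest emulator sd on grid = {sd.max():.3f}, "
          f"largest error at training points = {interp_errors[-1]:.1e}")
```

Because each design contains the previous one, the emulator's standard deviation cannot increase at any input as runs are added, and the posterior mean passes through every training run.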
BACCO
This has led to a wide ranging body of tools for inference about all kinds of uncertainties in computer models
All based on building the GP emulator of the model from a set of training runs
This area is now known as BACCO: Bayesian Analysis of Computer Code Output
BACCO includes
Uncertainty analysis
Sensitivity analysis
Calibration
Data assimilation
Model validation
Optimisation
Etc…
All within a single coherent framework
MUCM
Managing Uncertainty in Complex Models
Large 4-year research grant
Started in June 2006
7 postdoctoral research assistants
4 PhD studentships
Based in Sheffield, Durham, Aston, Southampton, LSE
Objective: to develop BACCO methods into a robust technology that is widely applicable across the spectrum of modelling applications
Example: UK carbon flux in 2000
Vegetation model predicts carbon exchange from each of 707 pixels over England & Wales
Principal output is Net Biosphere Production (NBP)
Accounting for uncertainty in inputs
Soil properties
Properties of different types of vegetation
Aggregated to England & Wales total
Allowing for correlations
Estimate 7.55 Mt C
Std deviation 0.56 Mt C
Analysis by Marc Kennedy and John Paul Gosling
SDGVMd outputs for 2000
Outline of analysis
1. Build emulators for each plant functional type (PFT) at a sample of sites
2. Identify most important inputs
3. Define distributions to describe uncertainty in important inputs
Analysis of soils data
Elicitation of uncertainty in PFT parameters
Need to consider correlations
4. Carry out uncertainty analysis in each sampled site
5. Interpolate across all sites
Mean corrections and standard deviations
6. Aggregate across sites and PFTs
Allowing for correlations
Sensitivity analysis for one pixel/PFT
Elicitation
Beliefs of expert (developer of SDGVMd) regarding plausible values of PFT parameters
Important to allow for uncertainty about mix of species in a pixel and role of parameter in the model
In the case of leaf life span for evergreens, this was more complex
EvNl (evergreen needleleaf) leaf life span
Correlations
A PFT parameter in one pixel may differ from that in another
Because of variation in species mix
Common uncertainty about average over all species induces correlation
Elicit beliefs about average over whole UK
EvNl joint distributions are mixtures of 25 components, with correlation both between and within years
Mean NBP corrections
NBP standard deviations
Land cover (from LCM2000)
Aggregate across 4 PFTs
Sensitivity analysis
Map shows proportion of overall uncertainty in each pixel that is due to uncertainty in the parameters of PFTs
As opposed to soil parameters
Contribution of PFT uncertainty largest in grasslands/moorlands
England & Wales aggregate
PFT           Plug-in estimate (Mt C)   Mean (Mt C)   Variance ((Mt C)²)
Grass         5.28                      4.64          0.2689
Crop          0.85                      0.45          0.0338
Deciduous     2.13                      1.68          0.0128
Evergreen     0.80                      0.78          0.0005
Covariances                                           0.0010
Total         9.06                      7.55          0.3170
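The totals in the table can be checked with a little arithmetic. The sketch below assumes that the Covariances row is the total covariance contribution added to the sum of the four PFT variances, which is consistent with the Total row; the variable names are illustrative.

```python
import math

# (plug-in estimate, mean, variance) for each PFT, copied from the table.
rows = {
    "Grass": (5.28, 4.64, 0.2689),
    "Crop": (0.85, 0.45, 0.0338),
    "Deciduous": (2.13, 1.68, 0.0128),
    "Evergreen": (0.80, 0.78, 0.0005),
}
# Assumed to be the total covariance contribution between PFTs.
covariance_term = 0.0010

total_plug_in = sum(r[0] for r in rows.values())
total_mean = sum(r[1] for r in rows.values())
total_variance = sum(r[2] for r in rows.values()) + covariance_term
total_sd = math.sqrt(total_variance)

print(f"plug-in total = {total_plug_in:.2f} Mt C")
print(f"mean total    = {total_mean:.2f} Mt C")
print(f"variance      = {total_variance:.4f} (Mt C)^2")
print(f"std deviation = {total_sd:.2f} Mt C")
```

The square root of the total variance recovers the 0.56 Mt C standard deviation quoted for the England & Wales aggregate, and almost all of that variance comes from the grass PFT.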
Conclusions
Bayesian methods offer a powerful basis for computation of uncertainties in model predictions
Analysis of E&W aggregate NBP in 2000
Good case study for uncertainty and sensitivity analyses
But needs to take account of more sources of uncertainty
Involved several technical extensions
Has important implications for our understanding of C fluxes
Policy implications