Combining Observations and Models: A Bayesian View
Mark Berliner, OSU Stat Dept
• Bayesian Hierarchical Models
• Selected Approaches
• Geophysical Examples
• Discussion
Main Themes
1) Goal: Develop probability distributions for unknowns of interest by combining information sources: observations, theory, computer model output, past experience, etc.
2) Approaches: Bayesian Hierarchical Models. Incorporate various information sources by modeling
1. priors
2. data models or likelihoods
Bayesian Hierarchical Models
• Skeleton:
1. Data Model: $[Y \mid X, \theta]$
2. Process Model Prior: $[X \mid \theta]$
3. Prior on parameters: $[\theta]$
• Bayes’ Theorem: posterior distribution: $[X, \theta \mid Y]$
• Compare to
“Statistics”: $[Y \mid \theta]\,[\theta]$
“Physics”: $[X \mid \theta(Y)]$
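As a worked statement of the skeleton, writing θ for the collection of unknown parameters (a notational assumption here), Bayes' theorem combines the three stages into the posterior as follows:

```latex
\[
[X, \theta \mid Y]
  \;=\; \frac{[Y \mid X, \theta]\,[X \mid \theta]\,[\theta]}{[Y]}
  \;\propto\; [Y \mid X, \theta]\,[X \mid \theta]\,[\theta],
\qquad
[Y] \;=\; \iint [Y \mid X, \theta]\,[X \mid \theta]\,[\theta]\; dX\, d\theta .
\]
```

The normalizing constant [Y] does not depend on X or θ, which is why the posterior can be explored with only the three modeled stages.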
Approaches
A. Stochastic models incorporating science
• Physical-statistical modeling (Berliner 2003 JGR): from “F=ma” to $[X \mid \theta]$
• Qualitative use of theory (e.g., Pacific SST model; Berliner et al. 2000 J. Climate)
B. Incorporating large-scale computer models
1) From model output to priors $[\theta]$
2) Model output as samples from process model prior $[X \mid \theta]$ (almost!)
3) Model output as “observations” (Y)
C. Combinations
Glacial Dynamics (Berliner et al. 2008 J. Glaciol)
Steady Flow of Glaciers and Ice Sheets
• Flow: gravity moderated by drag (base & sides) & … stuff …
• Simple models: flow from geometry
Data: Program for Arctic Regional Climate Assessment (PARCA) & Radarsat Antarctic Mapping Project (RAMP)
• surface topography (laser altimetry)
• basal topography (radar altimetry)
• velocity data (interferometry)
Modeling: surface: s, thickness: H, velocity: u
Physical Model
• Basal Stress: $\tau = -\rho g H \, ds/dx$ (+ “stuff”)
• Velocities: $u = u_b + \beta_0 H \tau^n$, where $u_b = k \tau^p + (\rho g H)^{-q}$
Our Model
• Basal Stress: $\tau = -\rho g H \, ds/dx + \eta$, where $\eta$ is a “corrector process”; H, s unknown
• Velocities: $u = u_b + \beta H \tau^n + e$, where $u_b = k \tau^p + (\rho g H)^{-q}$ or a constant; $\beta$ is unknown, e is a noise process
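The "flow from geometry" idea can be sketched numerically. The following is a minimal illustration, not the authors' code, taking the stress relation to be the standard driving-stress form tau = -rho * g * H * ds/dx, with rho the ice density (the Greek symbols are an assumption about the slide's notation). The function names, the finite-difference scheme, and the profile values are all hypothetical.

```python
# Sketch: basal (driving) stress from glacier geometry,
# tau = -rho * g * H * ds/dx, evaluated on a regular grid.
RHO_ICE = 917.0   # ice density in kg/m^3 (typical textbook value)
G = 9.81          # gravitational acceleration in m/s^2


def surface_slope(s, dx):
    """Finite-difference ds/dx: centered inside, one-sided at the two ends."""
    n = len(s)
    slope = []
    for i in range(n):
        lo = max(i - 1, 0)
        hi = min(i + 1, n - 1)
        slope.append((s[hi] - s[lo]) / ((hi - lo) * dx))
    return slope


def basal_stress(s, H, dx, rho=RHO_ICE, g=G):
    """tau_i = -rho * g * H_i * (ds/dx)_i, in pascals."""
    return [-rho * g * h * d for h, d in zip(H, surface_slope(s, dx))]


# Hypothetical profile: surface drops 10 m per km, ice is 1000 m thick.
s = [1500.0, 1490.0, 1480.0, 1470.0]   # surface elevation in m
H = [1000.0, 1000.0, 1000.0, 1000.0]   # ice thickness in m
tau = basal_stress(s, H, dx=1000.0)    # roughly 90 kPa at every grid point
```

In the hierarchical version, s and H are themselves unknown and observed with error, so this deterministic calculation becomes the mean of a stochastic stress model with an added corrector term.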
Wavelet Smoothing of Base
Results: Velocity
Results: Stress and Corrector
Paleoclimate (Brynjarsdóttir & Berliner 2009)
Climate proxies: Tree rings, ice cores, corals, pollen, and underground rock provide indirect information on climate
• Inverse problem: proxy = f(climate)
Boreholes: Earth stores information on surface temperatures
• Model: Heat equation
Borehole data = f(surface temperatures)
• Infer boundary condition (initial condition is a nuisance)
Modeling
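• Data Model:
$Y \mid T_r, \theta \sim N(T_r + T_0 \mathbf{1} + q R(k),\ \sigma_Y^2 I)$
($T_r$: true temperature; remaining mean terms: adjustments for rock types, etc.)
• Process Model: heat equation applied to $T_r$, with boundary condition the surface temperature history $T_h$
$T_r \mid T_h, \theta \sim N(B T_h,\ \sigma_r^2 I)$
$T_h \mid \theta \sim N(0,\ \sigma_h^2 I)$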
In progress:
• Combining boreholes (parameters and boundary conditions as samples from a distribution)
• Combining with other sources and proxies
Bayesian Hierarchical Models to Augment the Mediterranean Forecast System (MFS)
Ralph Milliff, CoRA
Chris Wikle, Univ. Missouri
Mark Berliner, Ohio State Univ.
Nadia Pinardi, INGV (Istituto Nazionale di Geofisica e Vulcanologia), Univ. Bologna (MFS Director)
Alessandro Bonazzi, Srdjan Dobricic, INGV, Univ. Bologna
Bayesian Modeling in Support of Massive Forecast Models
1. MFS is an Ocean Model
2. A Boundary Condition/Forcing: Surface Winds
3. Approach: produce surface vector winds (SVW), for ensemble data assimilation
• Exploit abundant, “good” satellite wind data (QuikSCAT)
• Samples from our winds-posterior ensemble for MFS
(Before us: coarse wind field (ECMWF))
“Rayleigh Friction Model” for winds (Linear Planetary Boundary Layer Equations)
Theory
(neglect second-order time derivative)
Discretize:
Our model
BHM Ensemble Winds
[Figure: 10 members selected from the posterior distribution (blue); scale: 10 m/s]
Part B) Information from Models
1) Develop prior from model output
• Think of model output runs $O_1, \ldots, O_n$ as samples from some distribution
• Do data analysis on the O’s to estimate the distribution
• Use the result (perhaps with modifications) as a prior for X
• Example: O’s are spatial fields: estimate the spatial covariance function of X based on the O’s.
• Example: Berliner et al. (2003) J. Climate
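The steps above can be sketched in a few lines. This is a minimal illustration under the assumption that a Gaussian prior is wanted, so that only a mean vector and a covariance matrix need to be estimated from the runs; `prior_from_runs` and the example values are hypothetical.

```python
from statistics import mean


def prior_from_runs(runs):
    """Sample mean vector and sample covariance matrix of equal-length runs.

    Each run is one model output field, flattened to a list of values.
    """
    n = len(runs)
    p = len(runs[0])
    mu = [mean(run[i] for run in runs) for i in range(p)]
    cov = [
        [
            sum((run[i] - mu[i]) * (run[j] - mu[j]) for run in runs) / (n - 1)
            for j in range(p)
        ]
        for i in range(p)
    ]
    return mu, cov


# Hypothetical output: 4 model runs of a field at 3 spatial locations.
runs = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 5.0],
    [3.0, 4.0, 4.0],
    [2.0, 3.0, 4.0],
]
prior_mean, prior_cov = prior_from_runs(runs)
```

With far fewer runs than grid points the sample covariance is singular, which is one reason the result is used "perhaps with modifications", for example by fitting a parametric spatial covariance function to it.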
2) Model output as realizations of prior “trends”
• Process Model Prior
$X = O + \delta$
where $\delta$ is “model error”, “bias”, “offset”
• $[Y \mid X, \theta]$ is measurement error model:
$Y = X + e$
• Substitution yields $[Y \mid O, \delta, \theta]$
$Y = O + \delta + e$
• Modeling $\delta$ is crucial (I have seen $\delta$ set to 0)
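A scalar sketch shows why the model-error term matters. It assumes, purely for illustration, that the model error (called `delta` here, a notational assumption) is N(0, var_delta), the measurement error is N(0, var_e), and the two are independent; the function name and numbers are hypothetical.

```python
def posterior_x(y, o, var_delta, var_e):
    """Posterior mean and variance of X = O + delta given Y = X + e.

    Assumes delta ~ N(0, var_delta) and e ~ N(0, var_e), independent,
    with var_e > 0.
    """
    # Fraction of the discrepancy (y - o) attributed to model error.
    weight = var_delta / (var_delta + var_e)
    return o + weight * (y - o), weight * var_e


# Model output says 13.0, the observation says 15.0.
mean_open, var_open = posterior_x(15.0, 13.0, var_delta=1.0, var_e=1.0)
# Setting the model error to zero: the observation is ignored entirely.
mean_zero, var_zero = posterior_x(15.0, 13.0, var_delta=0.0, var_e=1.0)
```

With equal variances the posterior mean sits halfway between model and observation; with the model error fixed at zero the posterior collapses onto the model output with no uncertainty, regardless of what was observed.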
3) Model output as “observations”
• Data Model: $[Y, O \mid X, \theta]$ ( $= [Y \mid X, \theta]\,[O \mid X, \theta]$ )
• $[O \mid X, \theta]$ to include “bias, offset, ..”
• Previous approach: start by constructing $[X \mid O, \theta]$
This approach: construct $[O \mid X, \theta]$
• Model for “bias” a challenge in both cases
• This is not uncommon, though not always made clear
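Treating model output as a second "measuring device" can be sketched for a scalar X. The example assumes Gaussian distributions throughout, a known bias for the model output, and conditional independence of Y and O given X; in practice the bias would be unknown and given its own prior. All names and numbers are hypothetical.

```python
def combine_sources(prior_mean, prior_var, y, var_y, o, bias_o, var_o):
    """Posterior mean and variance of scalar X from Y and bias-corrected O.

    Assumes X ~ N(prior_mean, prior_var), Y | X ~ N(X, var_y), and
    O | X ~ N(X + bias_o, var_o), with Y and O independent given X.
    """
    # Precisions (inverse variances) add across independent sources.
    precision = 1.0 / prior_var + 1.0 / var_y + 1.0 / var_o
    weighted = prior_mean / prior_var + y / var_y + (o - bias_o) / var_o
    return weighted / precision, 1.0 / precision


post_mean, post_var = combine_sources(
    prior_mean=0.0, prior_var=4.0, y=2.0, var_y=1.0, o=3.5, bias_o=0.5, var_o=2.0
)
```

The observation and the model output enter symmetrically, each weighted by its precision, which is the sense in which combining models resembles combining measuring devices. A prior on X is still required.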
A Bayesian Approach to Multi-model Analysis and Climate Projection
(Berliner and Kim 2008, J Climate)
Climate Projection:
– Future climate depends on future, but unknown, inputs.
– IPCC: construct plausible future inputs, “SRES Scenarios” (CO2 etc.)
– Assume a scenario and get corresponding projection
Hemispheric Monthly Surface Temperatures
• Observations (Y) for 1882-2001.
Data Model: Gaussian with mean = true temp.
& unknown variance (with a change-point)
• Two models (O): PCM (n=4), CCSM (n=1) for 2002-2197, and 3 SRES scenarios (B1, A1B, A2).
Data Model: assumes O’s are Gaussian with mean = $\mu_t + \text{model bias}_t$ (different for the two models) and unknown, time-varying variances (different for the two models)
• All are assumed conditionally independent
Notes (Freeze time)
• Data model for kth ensemble member from Model j:
$O_{jk} = \mu + b_j + e_{jk}$
– $\mu$ is common to both Models
– $b_j$ is Model j bias
– $E(e_{jk}) = 0$ and variances of e’s depend on j
• Computer model model:
$\mu = X + e$ where $E(e) = 0$
• Priors for biases, variances, and X
• Extensions to different model classes (more $\mu$’s) and richer models are feasible.
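A short derivation shows what the two stages imply jointly. It writes μ for the term common to both Models (a notational assumption) and assumes that e and all the e_jk are mutually independent given X and the biases:

```latex
\begin{align*}
O_{jk} &= \mu + b_j + e_{jk}, \qquad \mu = X + e \\
\Rightarrow\quad O_{jk} &= X + b_j + e + e_{jk}, \\
E(O_{jk} \mid X, b_j) &= X + b_j, \\
\operatorname{Var}(O_{jk} \mid X, b_j) &= \operatorname{Var}(e) + \operatorname{Var}(e_{jk}), \\
\operatorname{Cov}(O_{jk}, O_{j'k'} \mid X, b_j, b_{j'}) &= \operatorname{Var}(e)
  \quad \text{for } (j,k) \neq (j',k').
\end{align*}
```

Under these assumptions the outputs are conditionally independent given μ but share the error e once μ is integrated out, so adding ensemble members cannot by itself push the uncertainty about X below Var(e).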
[Figure: IPCC (global), Figure 10.4, alongside ours (NH)]
Discussion: Which approach is best?
• Depends on form and quality of observations and models and practicality
• Developing a prior for X from a scientific model (Part A) offers strong incorporation of theory, but practical limits on the richness of $[X \mid \theta]$ may arise
• Model output as “observations”
– Combining models: just like different measuring devices
– Nice for analysis & mixed (obs. & comp.) design
– Need a prior $[X \mid \theta]$
• Model output as realizations of prior “trends”
– Most common among Bayesian statisticians
– Combining models: like combining experts
Discussion: Models versus Reality
Need for modeling differences between X’s and O’s.
Model “assessment” (“validation”, “verification”) helps, but is difficult in complicated settings:
– Global climate models. Virtually no observations at the scales of the models.
– Tuning. Modify model based on observations.
– Observations are imperfect, and are often output of other physical models.
– Massive data. Comparing space-time fields
Discussion, Cont’d
• Part C) Combining approaches
– Example: Wikle et al. 2001, JASA. Combined observations and large-scale model output as data with a prior based on some physics
• Usually, many physical models. No best one, so it’s nice to be flexible in incorporating their information
Thank You!