Modelling non-stationarity in space and time for air quality data
Peter Guttorp, University of Washington, NRCSE
Slide 2
Outline
Lecture 1: Geostatistical tools
  Gaussian predictions
  Kriging and its neighbours
  The need for refinement
Lecture 2: Nonstationary covariance estimation
  The deformation approach
  Other nonstationary models
  Extensions to space-time
Lecture 3: Putting it all together
  Estimating trends
  Prediction of air quality surfaces
  Model assessment
Slide 3
Research goals in air quality modeling
  Create exposure fields for health effects modeling
  Assess deterministic air quality models
  Interpret environmental standards
  Enhance understanding of complex systems
Slide 4
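The geostatistical setup
Gaussian process $Z(s)$ with $\mu(s) = E\,Z(s)$ and $\mathrm{Var}\,Z(s) < \infty$
Z is strictly stationary if $(Z(s_1),\dots,Z(s_k)) \overset{d}{=} (Z(s_1+h),\dots,Z(s_k+h))$ for all $h$
Z is weakly stationary if $\mu(s) \equiv \mu$ and $\mathrm{Cov}(Z(s_1),Z(s_2)) = C(s_1 - s_2)$
Z is isotropic if weakly stationary and $C(s_1 - s_2) = C_0(\|s_1 - s_2\|)$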
Slide 5
The problem
Given observations at n sites, $Z(s_1),\dots,Z(s_n)$,
estimate $Z(s_0)$ (the process at an unobserved site)
or $\int Z(s)\,d\nu(s)$ (a weighted average of the process)
Slide 6
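A Gaussian formula
If $\begin{pmatrix} X \\ Y \end{pmatrix} \sim N\!\left( \begin{pmatrix} \mu_X \\ \mu_Y \end{pmatrix}, \begin{pmatrix} \Sigma_{XX} & \Sigma_{XY} \\ \Sigma_{YX} & \Sigma_{YY} \end{pmatrix} \right)$
then $(Y \mid X) \sim N\!\left( \mu_Y + \Sigma_{YX}\Sigma_{XX}^{-1}(X - \mu_X),\; \Sigma_{YY} - \Sigma_{YX}\Sigma_{XX}^{-1}\Sigma_{XY} \right)$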
Slide 7
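Simple kriging
Let $X = (Z(s_1),\dots,Z(s_n))^T$, $Y = Z(s_0)$, so $\mu_X = \mu 1_n$, $\mu_Y = \mu$, $\Sigma_{XX} = [C(s_i - s_j)]$, $\Sigma_{YY} = C(0)$, and $\Sigma_{YX} = [C(s_i - s_0)]$.
Thus $\hat Z(s_0) = \mu + \Sigma_{YX}\Sigma_{XX}^{-1}(X - \mu 1_n)$
This is the best linear unbiased predictor for known $\mu$ and $C$ (simple kriging).
Variants:
  ordinary kriging (unknown $\mu$)
  universal kriging ($\mu = A\beta$ for some covariate $A$)
Still optimal for known $C$.
Prediction error variance is given by $C(0) - \Sigma_{YX}\Sigma_{XX}^{-1}\Sigma_{XY}$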
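
As a minimal sketch of the simple kriging predictor, assuming a known mean and a known exponential covariance (the function names `exp_cov` and `simple_kriging`, the covariance model, and the toy data are illustrative assumptions, not part of the lecture):

```python
import numpy as np

def exp_cov(h, sill=1.0, range_=1.0):
    """Exponential covariance C(h) = sill * exp(-h / range_) (assumed example model)."""
    return sill * np.exp(-np.asarray(h, dtype=float) / range_)

def simple_kriging(sites, z, s0, mu, cov=exp_cov):
    """Simple kriging predictor and prediction variance at s0 for known mean mu."""
    sites = np.asarray(sites, dtype=float)
    s0 = np.asarray(s0, dtype=float)
    z = np.asarray(z, dtype=float)
    # Sigma_XX: covariances between all pairs of observed sites
    d_xx = np.linalg.norm(sites[:, None, :] - sites[None, :, :], axis=-1)
    sigma_xx = cov(d_xx)
    # Sigma_XY: covariances between the observed sites and the target site
    sigma_xy = cov(np.linalg.norm(sites - s0, axis=-1))
    # Kriging weights solve Sigma_XX w = Sigma_XY
    weights = np.linalg.solve(sigma_xx, sigma_xy)
    pred = mu + weights @ (z - mu)
    var = cov(0.0) - weights @ sigma_xy
    return pred, var

# Toy data: three sites in the plane (made-up values)
sites = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
z = np.array([1.2, 0.7, 1.9])

pred_at_site, var_at_site = simple_kriging(sites, z, sites[0], mu=1.0)
pred_new, var_new = simple_kriging(sites, z, [0.5, 0.5], mu=1.0)
```

At an observed site the predictor reproduces the observation and the prediction variance is zero; away from the data the variance lies between 0 and C(0).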
Slide 8
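The (semi)variogram
$\gamma(h) = \tfrac{1}{2}\,\mathrm{Var}(Z(s+h) - Z(s))$
Intrinsic stationarity: this variance depends only on the lag $h$
Weaker assumption than weak stationarity ($C(0)$ need not exist)
Kriging can be expressed in terms of the variogram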
Slide 9
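Estimation of covariance functions
Method of moments: square of all pairwise differences, smoothed over lag bins
$\hat\gamma(h) = \frac{1}{2|N(h)|} \sum_{(i,j) \in N(h)} (Z(s_i) - Z(s_j))^2$, where $N(h)$ is the set of site pairs whose separation falls in the lag bin around $h$
Problem: Not necessarily a valid variogram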
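
The method-of-moments estimate can be sketched as follows: half the mean squared difference over all site pairs whose separation falls in each lag bin. The function name `empirical_variogram`, the binning scheme, and the toy data are assumptions for illustration.

```python
import numpy as np

def empirical_variogram(sites, z, bin_edges):
    """Binned method-of-moments semivariogram estimate."""
    sites = np.asarray(sites, dtype=float)
    z = np.asarray(z, dtype=float)
    bin_edges = np.asarray(bin_edges, dtype=float)
    # All unordered pairs of sites
    i, j = np.triu_indices(len(z), k=1)
    dist = np.linalg.norm(sites[i] - sites[j], axis=1)
    sqdiff = (z[i] - z[j]) ** 2
    # Bin index of each pair; pairs outside the edges get -1 or n_bins and are ignored
    which = np.digitize(dist, bin_edges) - 1
    n_bins = len(bin_edges) - 1
    gamma = np.full(n_bins, np.nan)
    counts = np.zeros(n_bins, dtype=int)
    for b in range(n_bins):
        mask = which == b
        counts[b] = mask.sum()
        if counts[b] > 0:
            gamma[b] = 0.5 * sqdiff[mask].mean()
    centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])
    return centers, gamma, counts

# Toy example: four sites on a line with alternating values (made-up data)
sites = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
z = np.array([0.0, 1.0, 0.0, 1.0])
centers, gamma, counts = empirical_variogram(sites, z, [0.5, 1.5, 2.5, 3.5])
```

In this toy example the estimate drops to zero at lag 2 and rises again at lag 3, a small illustration of why the raw estimate need not be a valid variogram.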
Slide 10
Least squares
Minimize $\sum_j \left( \hat\gamma(h_j) - \gamma(h_j; \theta) \right)^2$ over $\theta$
Alternatives:
  fourth root transformation
  weighting by $1/\gamma^2$
  generalized least squares
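
A minimal sketch of a (weighted) least squares variogram fit, assuming an exponential model $\gamma(h) = \sigma^2 (1 - e^{-h/\phi})$ without nugget. For a fixed range $\phi$ the best sill $\sigma^2$ has a closed form, so a grid search over $\phi$ suffices. The model choice, the function name `fit_exponential_variogram`, and the synthetic input are assumptions.

```python
import numpy as np

def fit_exponential_variogram(h, gamma_hat, ranges, weights=None):
    """Least squares fit of sill * (1 - exp(-h / phi)) by grid search over phi."""
    h = np.asarray(h, dtype=float)
    gamma_hat = np.asarray(gamma_hat, dtype=float)
    w = np.ones_like(h) if weights is None else np.asarray(weights, dtype=float)
    best = None
    for phi in ranges:
        g = 1.0 - np.exp(-h / phi)
        # For fixed phi the model is linear in the sill, so the optimum is closed form
        sill = np.sum(w * g * gamma_hat) / np.sum(w * g * g)
        sse = np.sum(w * (gamma_hat - sill * g) ** 2)
        if best is None or sse < best[2]:
            best = (sill, phi, sse)
    return best

# Noise-free synthetic "empirical" variogram values (assumed truth: sill 2, range 1.5)
h = np.linspace(0.25, 5.0, 20)
gamma_hat = 2.0 * (1.0 - np.exp(-h / 1.5))
sill_fit, range_fit, sse_fit = fit_exponential_variogram(
    h, gamma_hat, ranges=np.linspace(0.1, 5.0, 50)
)
```

Passing `weights` proportional to the inverse squared model variogram would give one of the weighted alternatives listed above; generalized least squares would additionally need the covariance of the binned estimates.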
Slide 11
Fitted variogram
Slide 12
Kriging surface
Slide 13
Kriging standard error
Slide 14
A better combination
Slide 15
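Maximum likelihood
$Z \sim N_n(\mu, \Sigma)$, $\Sigma = \alpha[\rho(s_i - s_j; \theta)] = \alpha V(\theta)$
Maximize
$\ell(\mu, \alpha, \theta) = -\frac{n}{2}\log(2\pi\alpha) - \frac{1}{2}\log\det V(\theta) - \frac{1}{2\alpha}(Z - \mu)^T V(\theta)^{-1}(Z - \mu)$
and $\hat\theta$ maximizes the profile likelihood
$\ell^*(\theta) = -\frac{n}{2}\log\hat\alpha(\theta) - \frac{1}{2}\log\det V(\theta)$ (up to a constant), where $\hat\alpha(\theta) = (Z - \hat\mu(\theta))^T V(\theta)^{-1}(Z - \hat\mu(\theta))/n$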
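
A sketch of the profile likelihood computation, under the assumptions of a constant mean $\mu$, a variance scale $\alpha$, and an exponential correlation $\rho(h; \theta) = e^{-h/\theta}$. The mean and scale are maximized out in closed form and the range is found on a grid. All names and the simulated data are illustrative assumptions.

```python
import numpy as np

def full_loglik(mu, alpha, theta, dist, z):
    """Gaussian log-likelihood for Z ~ N(mu * 1, alpha * V(theta)), V = exp(-dist / theta)."""
    n = len(z)
    V = np.exp(-dist / theta)
    r = z - mu
    _, logdet = np.linalg.slogdet(V)
    quad = r @ np.linalg.solve(V, r)
    return -0.5 * n * np.log(2 * np.pi * alpha) - 0.5 * logdet - quad / (2 * alpha)

def profile_loglik(theta, dist, z):
    """Log-likelihood with mu and alpha replaced by their maximizers for this theta."""
    n = len(z)
    V = np.exp(-dist / theta)
    L = np.linalg.cholesky(V)
    # Whiten the data and the vector of ones: L^{-1} z and L^{-1} 1
    Lz = np.linalg.solve(L, z)
    L1 = np.linalg.solve(L, np.ones(n))
    mu_hat = (L1 @ Lz) / (L1 @ L1)          # generalized least squares mean
    resid = Lz - mu_hat * L1
    alpha_hat = (resid @ resid) / n          # ML estimate of the variance scale
    logdet = 2.0 * np.sum(np.log(np.diag(L)))
    ll = -0.5 * n * np.log(2 * np.pi * alpha_hat) - 0.5 * logdet - 0.5 * n
    return ll, mu_hat, alpha_hat

# Simulated data (assumed truth: mu = 3, alpha = 1.5, theta = 2)
rng = np.random.default_rng(0)
sites = rng.uniform(0.0, 10.0, size=(40, 2))
dist = np.linalg.norm(sites[:, None, :] - sites[None, :, :], axis=-1)
z = 3.0 + np.linalg.cholesky(1.5 * np.exp(-dist / 2.0)) @ rng.standard_normal(40)

thetas = np.linspace(0.5, 6.0, 23)
profile = np.array([profile_loglik(th, dist, z)[0] for th in thetas])
theta_hat = thetas[np.argmax(profile)]
ll_hat, mu_hat, alpha_hat = profile_loglik(theta_hat, dist, z)
```

The profile value at each range equals the full log-likelihood evaluated at the maximizing mean and scale, so maximizing over the range alone gives the joint maximum likelihood estimate on the grid.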
Slide 16
A peculiar ml fit
Slide 17
Some more fits
Slide 18
All together now...
Slide 19
Effect of estimating covariance structure
Standard geostatistical practice is to take the covariance as known. When it is estimated, optimality criteria are no longer valid, and plug-in estimates of variability are biased downwards. (Zimmerman and Cressie, 1992)
A Bayesian prediction analysis takes proper account of all sources of uncertainty (Le and Zidek, 1992)
Slide 20
Violation of isotropy
Slide 21
General setup
$Z(x,t) = \mu(x,t) + \nu(x)^{1/2} E(x,t) + \varepsilon(x,t)$
(trend + smooth + error)
We shall assume that $\mu$ is known or constant
$t = 1,\dots,T$ indexes temporal replications
$E$ is $L^2$-continuous, mean 0, variance 1, independent of the error $\varepsilon$
$C(x,y) = \mathrm{Cor}(E(x,t), E(y,t))$
$D(x,y) = \mathrm{Var}(E(x,t) - E(y,t))$ (dispersion)
Slide 22
Geometric anisotropy
Recall that if $C(x,y) = C(\|x - y\|)$ we have an isotropic covariance (circular isocorrelation curves).
If $C(x,y) = C(\|Ax - Ay\|)$ for a linear transformation $A$, we have geometric anisotropy (elliptical isocorrelation curves).
General nonstationary correlation structures are typically locally geometrically anisotropic.
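
A small numerical illustration of geometric anisotropy: an isotropic correlation applied to linearly transformed coordinates has elliptical isocorrelation curves. The exponential correlation, the particular matrix `A`, and the function name are assumptions chosen for illustration.

```python
import numpy as np

def anisotropic_corr(x, y, A, range_=1.0):
    """Exponential correlation evaluated at the transformed distance ||A x - A y||."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return np.exp(-np.linalg.norm(A @ diff) / range_)

# Assumed transformation: rotate by -30 degrees, then stretch the first axis by 2
angle = np.deg2rad(30.0)
R = np.array([[np.cos(angle), -np.sin(angle)],
              [np.sin(angle),  np.cos(angle)]])
A = np.diag([2.0, 1.0]) @ R.T

# Unit directions, and the points that A maps onto the unit circle (an ellipse)
angles = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)
unit = np.column_stack([np.cos(angles), np.sin(angles)])
ellipse_pts = np.linalg.solve(A, unit.T).T

origin = np.zeros(2)
on_ellipse = np.array([anisotropic_corr(p, origin, A) for p in ellipse_pts])
on_circle = np.array([anisotropic_corr(p, origin, A) for p in unit])
```

Correlation with the origin is constant on the ellipse $\{x : \|Ax\| = 1\}$ but varies around the unit circle.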
Slide 23
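The deformation idea
In the geometric anisotropic case, write $C(x,y) = C(\|f(x) - f(y)\|)$ where $f(x) = Ax$.
This suggests using a general nonlinear transformation $f: \mathbb{R}^2 \to \mathbb{R}^d$ from the G-plane (geographic coordinates) to D-space (dispersion coordinates). Usually d=2 or 3.
We do not want f to fold.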
Slide 24
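Implementation
Consider observations at sites $x_1,\dots,x_n$. Let $\hat C_{ij}$ be the empirical covariance between sites $x_i$ and $x_j$.
Minimize $(\theta, f) \mapsto \sum_{i,j} w_{ij} \left( \hat C_{ij} - C(\|f(x_i) - f(x_j)\|; \theta) \right)^2 + \lambda J(f)$
where $J(f)$ is a penalty for non-smooth transformations, such as the bending energy
$J(f) = \iint \left[ \left( \frac{\partial^2 f}{\partial x_1^2} \right)^2 + 2\left( \frac{\partial^2 f}{\partial x_1 \partial x_2} \right)^2 + \left( \frac{\partial^2 f}{\partial x_2^2} \right)^2 \right] dx_1\, dx_2$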
Slide 25
SARMAP
An ozone monitoring exercise in California, summer of 1990, collected data at some 130 sites.
Slide 26
Transformation
This is for hour 16 (4 p.m.)
Slide 27
Thin-plate splines Linear part
Slide 28
A Bayesian implementation
Likelihood:
Prior:
Linear part:
  fix two points in the G-D mapping
  put a (proper) prior on the remaining two parameters
Posterior computed using Metropolis-Hastings
Slide 29
California ozone
Slide 30
Posterior samples
Slide 31
Other applications
  Point process deformation (Jensen & Nielsen, Bernoulli, 2000)
  Deformation of brain images (Worsley et al., 1999)
Slide 32
Isotropic covariances on the sphere
Isotropic covariances on a sphere are of the form
$C(p,q) = \sum_{i=0}^{\infty} a_i P_i(\cos \gamma_{pq})$
where $p$ and $q$ are directions, $\gamma_{pq}$ the angle between them, and $P_i$ the Legendre polynomials.
Example: $a_i = (2i+1)\rho^i$
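
As a hedged numerical check of the Legendre series form, the example coefficients $a_i = (2i+1)\rho^i$ (with $0 < \rho < 1$ assumed) can be summed and compared with the closed form $(1-\rho^2)/(1 - 2\rho\cos\gamma + \rho^2)^{3/2}$, which follows from the generating function of the Legendre polynomials. The function names and the truncation at 200 terms are assumptions.

```python
import numpy as np
from numpy.polynomial import legendre

def sphere_cov(cos_angle, rho, n_terms=200):
    """Truncated series sum_i (2i + 1) rho^i P_i(cos_angle)."""
    i = np.arange(n_terms)
    coeffs = (2 * i + 1) * rho ** i
    # legval evaluates a Legendre series with the given coefficients
    return legendre.legval(cos_angle, coeffs)

def sphere_cov_closed_form(cos_angle, rho):
    """Closed form of the same series, valid for |rho| < 1."""
    return (1.0 - rho**2) / (1.0 - 2.0 * rho * cos_angle + rho**2) ** 1.5

cos_angle = np.linspace(-1.0, 1.0, 41)   # cosine of the angle between two directions
series = sphere_cov(cos_angle, rho=0.5)
closed = sphere_cov_closed_form(cos_angle, rho=0.5)
```

For unit vectors `p` and `q`, the argument `cos_angle` is simply the dot product `p @ q`.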
Slide 33
A class of global transformations
Iteration between simple parametric deformation of latitude (with parameters changing with longitude) and similar deformations of longitude (changing smoothly with latitude). (Das, 2000)
Slide 34
Three iterations
Slide 35
Global temperature
Global Historical Climatology Network: 7280 stations with at least 10 years of data.
A subset of 839 stations with data for 1950-1991 was selected.
Slide 36
Isotropic correlations
Slide 37
Deformation
Slide 38
Assessing uncertainty
Slide 39
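Gaussian moving averages
Higdon (1998), Swall (2000):
Let $\xi$ be a Brownian motion without drift, and $X(s) = \int b(s - u)\, d\xi(u)$.
This is a Gaussian process with correlogram $\rho(d) = \int b(u)\, b(u - d)\, du$ (for a kernel normalized so that $\int b(u)^2\, du = 1$)
Account for nonstationarity by letting the kernel $b$ vary with location:
$\rho(s_1, s_2) = \int b_{s_1}(u)\, b_{s_2}(u)\, du$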
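
A discrete sketch of the moving average construction in one dimension: white noise on a fine grid of knots is smoothed with a Gaussian kernel whose bandwidth varies with location. The kernel shape, the bandwidth function, the grid, and all names are assumptions made for illustration.

```python
import numpy as np

def kernel_matrix(s, knots, bandwidth):
    """K[i, j] = b_{s_i}(u_j): Gaussian kernel centred at s_i with location-dependent width."""
    s = np.asarray(s, dtype=float)[:, None]
    bw = bandwidth(s)
    return np.exp(-0.5 * ((s - knots[None, :]) / bw) ** 2)

def bandwidth(s):
    """Assumed bandwidth: kernels get wider as s increases."""
    return 0.5 + 0.1 * s

knots = np.linspace(-5.0, 15.0, 401)      # grid approximating the integral over u
du = knots[1] - knots[0]
s = np.linspace(0.0, 10.0, 21)            # locations where the process is evaluated

K = kernel_matrix(s, knots, bandwidth)
cov = K @ K.T * du                        # approximates int b_{s1}(u) b_{s2}(u) du
sd = np.sqrt(np.diag(cov))
corr = cov / np.outer(sd, sd)             # normalize to a correlation matrix

# One realization: kernel-weighted sum of independent Gaussian increments
rng = np.random.default_rng(1)
x = K @ rng.standard_normal(len(knots)) * np.sqrt(du)
```

The resulting correlation matrix is positive semidefinite by construction, and neighbouring locations are more strongly correlated where the kernel is wide than where it is narrow, which is the nonstationarity the construction is meant to capture.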
Slide 40
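Kernel averaging
Fuentes (2000): Introduce orthogonal local stationary processes $Z_k(s)$, $k = 1,\dots,K$, defined on disjoint subregions $S_k$ and construct
$Z(s) = \sum_{k=1}^{K} w_k(s) Z_k(s)$
where $w_k(s)$ is a weight function related to $\mathrm{dist}(s, S_k)$. Then
$\mathrm{Cov}(Z(s_1), Z(s_2)) = \sum_{k=1}^{K} w_k(s_1) w_k(s_2) C_k(s_1 - s_2)$
where $C_k$ is the stationary covariance of $Z_k$.
A continuous version has $Z(s) = \int w(s - u) Z_{\theta(u)}(s)\, du$

SARMAP revisited
Spatial correlation structure depends on hour of the day (non-separable):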
Slide 43
Bruno's seasonal nonseparability
Nonseparability generated by seasonally changing spatial term
$Z_1$: large-scale feature
$Z_2$: separable field of local features
(Bruno, 2004)
Slide 44
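A non-separable class of stationary space-time covariance functions
Cressie & Huang (1999): Fourier domain
Gneiting (2001): $f$ is completely monotone if $(-1)^n f^{(n)} \ge 0$ for all $n$.
Bernstein's theorem: $f(t) = \int_0^\infty e^{-rt}\, dF(r)$ for some non-decreasing $F$.
Combine a completely monotone function $\varphi$ and a positive function $\psi$ with completely monotone derivative into a space-time covariance
$C(h,u) = \frac{\sigma^2}{\psi(|u|^2)^{d/2}}\, \varphi\!\left( \frac{\|h\|^2}{\psi(|u|^2)} \right)$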
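
A hedged sketch of one member of this class, assuming the choices $\varphi(t) = \exp(-c\,t^\gamma)$ (completely monotone for $0 < \gamma \le 1$) and $\psi(t) = (a\,t^\alpha + 1)^\beta$ (completely monotone derivative for $0 < \alpha \le 1$, $0 \le \beta \le 1$), with spatial lag norm `h` and time lag `u`. The function name, the parameter defaults, and the small space-time grid are assumptions.

```python
import numpy as np

def gneiting_cov(h, u, sigma2=1.0, a=1.0, c=1.0, alpha=1.0, beta=1.0, gamma=0.5, d=2):
    """sigma2 / psi(u^2)^(d/2) * phi(h^2 / psi(u^2)) for the assumed phi and psi."""
    h = np.abs(np.asarray(h, dtype=float))
    u = np.abs(np.asarray(u, dtype=float))
    psi = (a * u ** (2 * alpha) + 1.0) ** beta
    return sigma2 / psi ** (d / 2) * np.exp(-c * (h**2 / psi) ** gamma)

# Covariance matrix over a small space-time grid: 4 sites in the plane, 3 times
sites = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
times = np.array([0.0, 1.0, 2.0])
pts_s = np.repeat(sites, len(times), axis=0)
pts_t = np.tile(times, len(sites))
H = np.linalg.norm(pts_s[:, None, :] - pts_s[None, :, :], axis=-1)
U = np.abs(pts_t[:, None] - pts_t[None, :])
Sigma = gneiting_cov(H, U)

# Separability would require C(h, u) * C(0, 0) == C(h, 0) * C(0, u)
lhs = gneiting_cov(1.0, 1.0) * gneiting_cov(0.0, 0.0)
rhs = gneiting_cov(1.0, 0.0) * gneiting_cov(0.0, 1.0)
```

With $\beta > 0$ the two sides of the separability identity differ, so the space-time interaction cannot be factored into a purely spatial and a purely temporal term.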
Slide 45
A particular case
Panels: $(\alpha, \gamma) = (1/2, 1/2)$, $(1/2, 1)$, $(1, 1/2)$, $(1, 1)$
Slide 46
Uses for surface estimation
Compliance
  exposure assessment
  measurement
Trend
Model assessment
  comparing (deterministic) model to data
  approximating model output
Health effects modeling
Slide 47
Health effects
Personal exposure (ambient and non-ambient)
Ambient exposure
  outdoor time
  infiltration
Outdoor concentration model for individual i at time t
Slide 48
Seattle health effects study
2 years, 26 10-day sessions
A total of 167 subjects:
  56 COPD subjects
  40 CHD subjects
  38 healthy subjects (over 65 years old, non-smokers)
  33 asthmatic kids
A total of 108 residences:
  55 private homes
  23 private apartments
  30 group homes
Slide 49
Figure labels: pDR, PUF, HPEM, Ogawa sampler
Slide 50
Figure labels: HI, Ogawa sampler, T/RH logger, Nephelometer, Quiet Pump Box, CO2 monitor, CAT
Slide 51
Slide 52
PM2.5 measurements
Slide 53
Where do the subjects spend their time?
Asthmatic kids:
  66% at home
  21% indoors away from home
  4% in transit
  6% outdoors
Healthy (CHD, COPD) adults:
  83% (86, 88) at home
  8% (7, 6) indoors away from home
  4% (4, 3) in transit
  3% (2, 2) outdoors
Slide 54
Panel results
Asthmatic children not on anti-inflammatory medication:
  decrease in lung function related to indoor and to outdoor PM2.5, not to personal exposure
Adults with CV or COPD:
  increase in blood pressure and heart rate related to indoor and personal PM2.5
Slide 55
Slide 56
Slide 57
Trend model
The spatial part of the trend uses covariates $V_{ik}$, such as population density, proximity to roads, local topography, etc.
The temporal part uses basis functions $f_j$, which are smoothed versions of temporal singular vectors (EOFs) of the $T \times N$ data matrix.
We will set $\mu_1(s_i) = \mu_0(s_i)$ for now.
Slide 58
SVD computation
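
A minimal sketch of extracting temporal singular vectors (EOFs) from a $T \times N$ data matrix with the SVD. The synthetic seasonal data, the matrix sizes, and the variable names are assumptions, and the smoothing step mentioned for the trend model is omitted here.

```python
import numpy as np

# Synthetic T x N data matrix: a shared seasonal cycle with site-specific
# amplitudes plus noise (made-up data for illustration only)
rng = np.random.default_rng(0)
T, N = 120, 8
t = np.arange(T)
seasonal = np.sin(2.0 * np.pi * t / 12.0)
amplitude = rng.uniform(0.5, 2.0, size=N)
data = np.outer(seasonal, amplitude) + 0.1 * rng.standard_normal((T, N))

# Thin SVD: data = U @ diag(sing) @ Vt
U, sing, Vt = np.linalg.svd(data, full_matrices=False)

eofs = U[:, :3]                          # leading temporal singular vectors, length T each
loadings = sing[:3, None] * Vt[:3]       # site-specific coefficients, shape (3, N)
explained = sing**2 / np.sum(sing**2)    # share of total sum of squares per component

# Rank-3 approximation of the data from the leading components
approx = eofs @ loadings
```

In this synthetic example the first singular vector recovers the seasonal cycle up to sign and scale; with real data the leading vectors would be smoothed in time before being used as trend basis functions.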
Slide 59
EOF 1
Slide 60
EOF 2
Slide 61
EOF 3
Slide 62
Slide 63
Slide 64
Slide 65
Kriging of 0
Slide 66
Kriging of 2
Slide 67
Quality of trend fits
Slide 68
Observed vs. predicted
Slide 69
Observed vs. predicted, cont.
Slide 70
Conclusions
Good prediction of
  day-to-day variability
  seasonal shape of mean
Not so good prediction of long-term mean
Need to try to estimate
Slide 71
Other difficulties
  Missing data
  Multivariate data
  Heterogeneous (in space and time) geostatistical tools
  Different sampling intervals (particularly a PM problem)