Modelling non-stationarity in space and time for air quality data Peter Guttorp University of Washington [email protected] NRCSE


  • Slide 1
  • Modelling non-stationarity in space and time for air quality data Peter Guttorp University of Washington [email protected] NRCSE
  • Slide 2
  • Outline. Lecture 1: Geostatistical tools (Gaussian predictions; kriging and its neighbours; the need for refinement). Lecture 2: Nonstationary covariance estimation (the deformation approach; other nonstationary models; extensions to space-time). Lecture 3: Putting it all together (estimating trends; prediction of air quality surfaces; model assessment).
  • Slide 3
  • Research goals in air quality modeling: create exposure fields for health effects modeling; assess deterministic air quality models; interpret environmental standards; enhance understanding of complex systems.
  • Slide 4
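  • The geostatistical setup. Gaussian process with $\mu(s) = \mathrm{E}\,Z(s)$ and $\mathrm{Var}\,Z(s) < \infty$. $Z$ is strictly stationary if $(Z(s_1),\dots,Z(s_k)) \overset{d}{=} (Z(s_1+h),\dots,Z(s_k+h))$ for all $h$. $Z$ is weakly stationary if $\mu(s) \equiv \mu$ and $\mathrm{Cov}(Z(s_1),Z(s_2)) = C(s_1-s_2)$. $Z$ is isotropic if weakly stationary and $C(s_1-s_2) = C_0(|s_1-s_2|)$.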
  • Slide 5
  • The problem. Given observations at $n$ sites, $Z(s_1),\dots,Z(s_n)$, estimate $Z(s_0)$ (the process at an unobserved site) or $\int Z(s)\,d\nu(s)$ (a weighted average of the process).
  • Slide 6
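  • A Gaussian formula. If $\begin{pmatrix} X \\ Y \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_X \\ \mu_Y \end{pmatrix}, \begin{pmatrix} \Sigma_{XX} & \Sigma_{XY} \\ \Sigma_{YX} & \Sigma_{YY} \end{pmatrix} \right)$ then $Y \mid X \sim N\left( \mu_Y + \Sigma_{YX}\Sigma_{XX}^{-1}(X-\mu_X),\ \Sigma_{YY} - \Sigma_{YX}\Sigma_{XX}^{-1}\Sigma_{XY} \right)$.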
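The conditioning formula on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the lectures: the function name `gaussian_conditional` and the toy numbers (a bivariate normal with correlation 0.8) are assumptions made for the example.

```python
import numpy as np

def gaussian_conditional(mu_x, mu_y, S_xx, S_yx, S_yy, x):
    """Mean and covariance of Y | X = x for jointly Gaussian (X, Y)."""
    # Solve linear systems instead of forming the inverse of S_xx explicitly.
    cond_mean = mu_y + S_yx @ np.linalg.solve(S_xx, x - mu_x)
    cond_cov = S_yy - S_yx @ np.linalg.solve(S_xx, S_yx.T)
    return cond_mean, cond_cov

# Toy example (made-up numbers): unit variances, correlation 0.8, observe X = 2.
mu_x, mu_y = np.array([0.0]), np.array([0.0])
S_xx = np.array([[1.0]])
S_yx = np.array([[0.8]])
S_yy = np.array([[1.0]])
m, v = gaussian_conditional(mu_x, mu_y, S_xx, S_yx, S_yy, np.array([2.0]))
print(m, v)  # conditional mean 0.8 * 2 and conditional variance 1 - 0.8**2
```

Observing X pulls the prediction of Y toward 0.8 times the observed value and shrinks its variance, which is exactly what kriging exploits in the spatial setting.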
  • Slide 7
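  • Simple kriging. Let $X = (Z(s_1),\dots,Z(s_n))^T$, $Y = Z(s_0)$, so $\mu_X = \mu 1_n$, $\mu_Y = \mu$, $\Sigma_{XX} = [C(s_i-s_j)]$, $\Sigma_{YY} = C(0)$, and $\Sigma_{YX} = [C(s_i-s_0)]$. Thus $\hat Z(s_0) = \mu + [C(s_i-s_0)]^T [C(s_i-s_j)]^{-1} (X - \mu 1_n)$. This is the best linear unbiased predictor for known $\mu$ and $C$ (simple kriging). Variants: ordinary kriging (unknown $\mu$); universal kriging ($\mu = A\beta$ for some covariate $A$). Still optimal for known $C$. Prediction error is given by $C(0) - [C(s_i-s_0)]^T [C(s_i-s_j)]^{-1} [C(s_i-s_0)]$.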
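As a rough sketch of simple kriging with a known mean and covariance, the predictor and its error variance can be computed directly from the covariance matrices. The exponential covariance, the four sites, and the observed values below are illustrative assumptions, not data from the lectures.

```python
import numpy as np

def exp_cov(h, sill=1.0, scale=1.0):
    """Isotropic exponential covariance C(h) = sill * exp(-h / scale) (assumed model)."""
    return sill * np.exp(-h / scale)

def simple_kriging(sites, z, s0, mu, cov=exp_cov):
    """Simple kriging prediction and prediction variance at location s0."""
    d_xx = np.linalg.norm(sites[:, None, :] - sites[None, :, :], axis=-1)
    d_x0 = np.linalg.norm(sites - s0, axis=-1)
    S_xx = cov(d_xx)                 # covariances between observed sites
    c0 = cov(d_x0)                   # covariances between sites and s0
    w = np.linalg.solve(S_xx, c0)    # kriging weights
    pred = mu + w @ (z - mu)
    var = cov(0.0) - w @ c0
    return pred, var

# Made-up configuration: four sites on the unit square.
sites = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
z = np.array([1.2, 2.0, 0.5, 1.5])
pred_obs, var_obs = simple_kriging(sites, z, sites[0], mu=1.0)
pred_mid, var_mid = simple_kriging(sites, z, np.array([0.5, 0.5]), mu=1.0)
pred_far, var_far = simple_kriging(sites, z, np.array([100.0, 100.0]), mu=1.0)
```

At an observed site the predictor reproduces the observation with zero variance; far from all sites it reverts to the mean with variance C(0).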
  • Slide 8
  • The (semi)variogram. $\gamma(h) = \tfrac12 \mathrm{Var}(Z(s+h) - Z(s))$. Intrinsic stationarity: a weaker assumption ($C(0)$ need not exist). Kriging can be expressed in terms of the variogram.
  • Slide 9
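  • Estimation of covariance functions. Method of moments: square of all pairwise differences, smoothed over lag bins, $\hat\gamma(h) = \frac{1}{2|N(h)|} \sum_{(i,j) \in N(h)} (Z(s_i) - Z(s_j))^2$, where $N(h)$ is the set of site pairs whose separation falls in the bin around $h$. Problem: not necessarily a valid variogram.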
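The method-of-moments estimator described here can be sketched as follows: half the mean squared difference over all site pairs in each lag bin. The function name, the bin edges, and the three-point toy data set are assumptions for illustration only.

```python
import numpy as np

def empirical_variogram(sites, z, bin_edges):
    """Binned method-of-moments semivariogram estimate."""
    i, j = np.triu_indices(len(z), k=1)            # all pairs i < j
    dist = np.linalg.norm(sites[i] - sites[j], axis=1)
    sqdiff = (z[i] - z[j]) ** 2
    which = np.digitize(dist, bin_edges) - 1       # bin index per pair
    nbins = len(bin_edges) - 1
    gamma = np.full(nbins, np.nan)                 # NaN marks empty bins
    counts = np.zeros(nbins, dtype=int)
    for b in range(nbins):
        mask = which == b
        counts[b] = mask.sum()
        if counts[b] > 0:
            gamma[b] = 0.5 * sqdiff[mask].mean()
    return gamma, counts

# Toy data: three sites on a line at 0, 1, 2 with values 0, 1, 3.
sites = np.array([[0.0], [1.0], [2.0]])
z = np.array([0.0, 1.0, 3.0])
gamma, counts = empirical_variogram(sites, z, np.array([0.5, 1.5, 2.5]))
```

In the toy data two pairs are at lag 1 (squared differences 1 and 4) and one pair at lag 2 (squared difference 9), so the two bins hold 1.25 and 4.5.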
  • Slide 10
  • Least squares. Minimize $\theta \mapsto \sum_j \left( \hat\gamma(h_j) - \gamma(h_j;\theta) \right)^2$. Alternatives: fourth root transformation; weighting by $1/\gamma^2$; generalized least squares.
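A minimal sketch of a least-squares variogram fit, assuming an exponential model $\gamma(h) = \sigma^2 (1 - e^{-h/\phi})$ without nugget (the model choice, the grid search over $\phi$, and all names are assumptions made for this example). For fixed range $\phi$ the sill enters linearly, so it has a closed-form weighted least-squares solution.

```python
import numpy as np

def fit_exponential_variogram(h, gamma_hat, ranges, weights=None):
    """Grid search over the range; closed-form (weighted) LS sill for each range."""
    w = np.ones_like(h) if weights is None else weights
    best = None
    for phi in ranges:
        g = 1.0 - np.exp(-h / phi)                        # unit-sill model values
        sill = np.sum(w * g * gamma_hat) / np.sum(w * g * g)
        sse = np.sum(w * (gamma_hat - sill * g) ** 2)
        if best is None or sse < best[2]:
            best = (sill, phi, sse)
    return best

# Noise-free check values generated from sill 2.0 and range 1.5 (made up).
h = np.linspace(0.25, 5.0, 20)
gamma_hat = 2.0 * (1.0 - np.exp(-h / 1.5))
ranges = np.linspace(0.1, 5.0, 50)
sill_hat, range_hat, sse = fit_exponential_variogram(h, gamma_hat, ranges)
```

Passing `weights` proportional to the inverse squared model variogram (or to the bin counts) gives the weighted variants listed on the slide; a generalized least-squares fit would additionally need the covariance of the binned estimates.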
  • Slide 11
  • Fitted variogram
  • Slide 12
  • Kriging surface
  • Slide 13
  • Kriging standard error
  • Slide 14
  • A better combination
  • Slide 15
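  • Maximum likelihood. $Z \sim N_n(\mu, \Sigma)$ with $\Sigma = \alpha[\rho(s_i-s_j;\theta)] = \alpha V(\theta)$. Maximize $\ell(\mu,\alpha,\theta) = -\frac{n}{2}\log(2\pi\alpha) - \frac12 \log\det V(\theta) - \frac{1}{2\alpha}(Z-\mu)^T V(\theta)^{-1}(Z-\mu)$, and $\hat\theta$ maximizes the profile likelihood obtained by maximizing $\ell$ over $\mu$ and $\alpha$ for fixed $\theta$.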
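The profile likelihood can be sketched numerically. The version below assumes a constant mean and an exponential correlation function with range $\theta$; for fixed $\theta$ it plugs in the generalized least-squares estimate of the mean and the corresponding variance estimate. The simulated data, the seed, and the grid of ranges are all made up for illustration.

```python
import numpy as np

def exp_corr(dist, theta):
    """Exponential correlation with range theta (an assumed model)."""
    return np.exp(-dist / theta)

def profile_loglik(theta, z, dist):
    """Log-likelihood maximized over the constant mean and the variance alpha."""
    n = len(z)
    V = exp_corr(dist, theta)
    ones = np.ones(n)
    mu_hat = (ones @ np.linalg.solve(V, z)) / (ones @ np.linalg.solve(V, ones))
    r = z - mu_hat
    alpha_hat = (r @ np.linalg.solve(V, r)) / n
    _, logdet = np.linalg.slogdet(V)
    ll = -0.5 * n * np.log(2 * np.pi * alpha_hat) - 0.5 * logdet - 0.5 * n
    return ll, mu_hat, alpha_hat

# Simulated example: 30 random sites, true mean 5, variance 2, range 3.
rng = np.random.default_rng(0)
sites = rng.uniform(0, 10, size=(30, 2))
dist = np.linalg.norm(sites[:, None] - sites[None, :], axis=-1)
z = rng.multivariate_normal(np.full(30, 5.0), 2.0 * exp_corr(dist, 3.0))
thetas = np.linspace(0.5, 8.0, 16)
lls = np.array([profile_loglik(t, z, dist)[0] for t in thetas])
theta_hat = thetas[np.argmax(lls)]
```

Plotting `lls` against `thetas` is a quick way to see how flat the profile likelihood can be in the range parameter for a data set of this size.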
  • Slide 16
  • A peculiar ML fit
  • Slide 17
  • Some more fits
  • Slide 18
  • All together now...
  • Slide 19
  • Effect of estimating covariance structure. Standard geostatistical practice is to take the covariance as known. When it is estimated, optimality criteria are no longer valid, and plug-in estimates of variability are biased downwards (Zimmerman and Cressie, 1992). A Bayesian prediction analysis takes proper account of all sources of uncertainty (Le and Zidek, 1992).
  • Slide 20
  • Violation of isotropy
  • Slide 21
  • General setup. $Z(x,t) = \mu(x,t) + \nu(x)^{1/2} E(x,t) + \varepsilon(x,t)$ (trend + smooth + error). We shall assume that $\mu$ is known or constant; $t = 1,\dots,T$ indexes temporal replications; $E$ is $L^2$-continuous, mean 0, variance 1, independent of the error $\varepsilon$. $C(x,y) = \mathrm{Cor}(E(x,t),E(y,t))$; $D(x,y) = \mathrm{Var}(E(x,t)-E(y,t))$ (dispersion).
  • Slide 22
  • Geometric anisotropy. Recall that if $C(x,y) = C(|x-y|)$ we have an isotropic covariance (circular isocorrelation curves). If $C(x,y) = C(|Ax-Ay|)$ for a linear transformation $A$, we have geometric anisotropy (elliptical isocorrelation curves). General nonstationary correlation structures are typically locally geometrically anisotropic.
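A small numerical illustration of geometric anisotropy: correlation is an isotropic function of distance after the linear map $A$, so points of equal correlation lie on an ellipse in the original coordinates. The matrix `A` and the exponential correlation are hypothetical choices for the example.

```python
import numpy as np

def aniso_corr(x, y, A, corr=lambda r: np.exp(-r)):
    """Correlation C(|Ax - Ay|) under geometric anisotropy."""
    return corr(np.linalg.norm(A @ (x - y)))

# Hypothetical transformation: stretch the second coordinate by a factor 2.
A = np.array([[1.0, 0.0], [0.0, 2.0]])
angles = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)
unit = np.stack([np.cos(angles), np.sin(angles)], axis=1)
# Lags h with |A h| = 1 form an ellipse in the original coordinates.
ellipse = unit @ np.linalg.inv(A).T
origin = np.zeros(2)
corrs = np.array([aniso_corr(h, origin, A) for h in ellipse])
# On a circle of radius 1 the correlation depends on direction.
corr_east = aniso_corr(np.array([1.0, 0.0]), origin, A)
corr_north = aniso_corr(np.array([0.0, 1.0]), origin, A)
```

All eight lags on the ellipse share the same correlation, while unit lags in the two coordinate directions do not.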
  • Slide 23
  • The deformation idea. In the geometric anisotropic case, write $C(x,y) = C(|f(x)-f(y)|)$ where $f(x) = Ax$. This suggests using a general nonlinear transformation $f: \mathbb{R}^2 \to \mathbb{R}^d$ from the G-plane to D-space. Usually $d = 2$ or 3. We do not want $f$ to fold.
  • Slide 24
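  • Implementation. Consider observations at sites $x_1,\dots,x_n$. Let $\hat C_{ij}$ be the empirical covariance between sites $x_i$ and $x_j$. Minimize $(\theta, f) \mapsto \sum_{i,j} w_{ij} \left( \hat C_{ij} - C(|f(x_i)-f(x_j)|;\theta) \right)^2 + \lambda J(f)$, where $J(f)$ is a penalty for non-smooth transformations, such as the bending energy $J(f) = \iint \left[ \left(\frac{\partial^2 f}{\partial x_1^2}\right)^2 + 2\left(\frac{\partial^2 f}{\partial x_1 \partial x_2}\right)^2 + \left(\frac{\partial^2 f}{\partial x_2^2}\right)^2 \right] dx_1\,dx_2$.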
  • Slide 25
  • SARMAP. An ozone monitoring exercise in California, summer of 1990, collected data at some 130 sites.
  • Slide 26
  • Transformation. This is for hour 16 (4 pm).
  • Slide 27
  • Thin-plate splines Linear part
  • Slide 28
  • A Bayesian implementation Likelihood: Prior: Linear part: fix two points in the G-D mapping put a (proper) prior on the remaining two parameters Posterior computed using Metropolis-Hastings
  • Slide 29
  • California ozone
  • Slide 30
  • Posterior samples
  • Slide 31
  • Other applications. Point process deformation (Jensen & Nielsen, Bernoulli, 2000). Deformation of brain images (Worsley et al., 1999).
  • Slide 32
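  • Isotropic covariances on the sphere. Isotropic covariances on a sphere are of the form $C(p,q) = \sum_{i=0}^{\infty} a_i P_i(\cos\gamma_{pq})$ with $a_i \ge 0$ and $\sum_i a_i < \infty$, where $p$ and $q$ are directions, $\gamma_{pq}$ the angle between them, and $P_i$ the Legendre polynomials. Example: $a_i = (2i+1)\rho^i$.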
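For the example coefficients $a_i = (2i+1)\rho^i$ with $0 \le \rho < 1$, the Legendre series can be summed in closed form. Differentiating the Legendre generating function $\sum_i \rho^i P_i(x) = (1 - 2\rho x + \rho^2)^{-1/2}$ gives $\sum_i (2i+1)\rho^i P_i(\cos\gamma) = (1-\rho^2)/(1 - 2\rho\cos\gamma + \rho^2)^{3/2}$. The sketch below, with an arbitrary $\rho = 0.5$ and a truncation at 200 terms (both assumptions), checks the identity numerically.

```python
import numpy as np
from numpy.polynomial.legendre import legval

def sphere_cov_series(angle, rho, n_terms=200):
    """Truncated Legendre series with coefficients a_i = (2i + 1) * rho**i."""
    i = np.arange(n_terms)
    a = (2 * i + 1) * rho ** i
    return legval(np.cos(angle), a)

def sphere_cov_closed(angle, rho):
    """Closed form obtained from the Legendre generating function."""
    return (1 - rho**2) / (1 - 2 * rho * np.cos(angle) + rho**2) ** 1.5

angles = np.linspace(0.0, np.pi, 7)
series = sphere_cov_series(angles, rho=0.5)
closed = sphere_cov_closed(angles, rho=0.5)
# Dividing by the value at angle 0 turns the covariance into a correlation.
correlation = closed / closed[0]
```

The value at angle 0 is $(1+\rho)/(1-\rho)^2$, so the series as written is a covariance rather than a correlation.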
  • Slide 33
  • A class of global transformations. Iteration between simple parametric deformation of latitude (with parameters changing with longitude) and similar deformations of longitude (changing smoothly with latitude). (Das, 2000)
  • Slide 34
  • Three iterations
  • Slide 35
  • Global temperature. Global Historical Climatology Network: 7280 stations with at least 10 years of data. A subset of 839 stations with data for 1950-1991 was selected.
  • Slide 36
  • Isotropic correlations
  • Slide 37
  • Deformation
  • Slide 38
  • Assessing uncertainty
  • Slide 39
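  • Gaussian moving averages. Higdon (1998), Swall (2000): Let $\xi$ be a Brownian motion without drift, and $Z(s) = \int b(s-u)\,d\xi(u)$. This is a Gaussian process with correlogram $\rho(d) = \int b(u)\,b(u-d)\,du$ (for a kernel normalized so that $\int b(u)^2\,du = 1$). Account for nonstationarity by letting the kernel $b$ vary with location: $\rho(s_1,s_2) = \int b_{s_1}(u)\,b_{s_2}(u)\,du$.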
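A one-dimensional sketch of the moving-average construction, assuming Gaussian kernels $b_s(u) = \exp\{-(u-s)^2 / (2\sigma(s)^2)\}$ whose bandwidth $\sigma(s)$ may change with location. The integral of the kernel product is approximated by a Riemann sum on a fine grid; the grid, sites, and bandwidths are made-up values.

```python
import numpy as np

def kernel_matrix(sites, bandwidths, grid):
    """Row k holds the Gaussian kernel centred at sites[k], evaluated on the grid."""
    return np.exp(-0.5 * ((grid[None, :] - sites[:, None]) / bandwidths[:, None]) ** 2)

def moving_average_corr(sites, bandwidths, grid):
    """Correlation matrix from the integral of products of location-specific kernels."""
    B = kernel_matrix(sites, bandwidths, grid)
    du = grid[1] - grid[0]
    C = B @ B.T * du                       # Riemann sum for the kernel-product integral
    sd = np.sqrt(np.diag(C))
    return C / np.outer(sd, sd)            # normalize to unit diagonal

sites = np.array([0.0, 1.0, 2.0])
grid = np.linspace(-20.0, 22.0, 4201)
# Stationary case: the same bandwidth everywhere.
R_const = moving_average_corr(sites, np.array([1.0, 1.0, 1.0]), grid)
# Nonstationary case: the bandwidth grows with location (hypothetical values).
R_var = moving_average_corr(sites, np.array([0.5, 1.0, 2.0]), grid)
```

With a constant bandwidth the correlation depends only on the lag (for this kernel it equals $\exp\{-d^2/(4\sigma^2)\}$); with varying bandwidths the matrix is still a valid correlation matrix because it is a Gram matrix of the kernels.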
  • Slide 40
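  • Kernel averaging. Fuentes (2000): Introduce orthogonal local stationary processes $Z_k(s)$, $k = 1,\dots,K$, defined on disjoint subregions $S_k$ and construct $Z(s) = \sum_{k=1}^K w_k(s) Z_k(s)$, where $w_k(s)$ is a weight function related to $\mathrm{dist}(s,S_k)$. Then $\mathrm{Cov}(Z(s_1),Z(s_2)) = \sum_{k=1}^K w_k(s_1) w_k(s_2) C_k(s_1-s_2)$, where $C_k$ is the covariance function of $Z_k$. A continuous version has $Z(s) = \int w(s-u) Z_{\theta(u)}(s)\,du$.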
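Because the local processes are orthogonal, the covariance of the weighted sum is the sum over $k$ of $w_k(s_1) w_k(s_2) C_k(s_1 - s_2)$. A one-dimensional sketch, assuming Gaussian weights in the distance to each subregion's centre and exponential local covariances (both are illustrative choices, as are all numbers):

```python
import numpy as np

def kernel_average_cov(s, centers, ranges, tau):
    """Covariance of a weighted sum of orthogonal local stationary processes."""
    s = np.asarray(s, dtype=float)
    lag = np.abs(s[:, None] - s[None, :])
    C = np.zeros((len(s), len(s)))
    for c_k, r_k in zip(centers, ranges):
        w = np.exp(-0.5 * ((s - c_k) / tau) ** 2)   # weight decays away from centre k
        C += np.outer(w, w) * np.exp(-lag / r_k)    # local exponential covariance
    return C

# Two hypothetical subregions centred at 0 and 100 with different local ranges.
s = np.linspace(0.0, 100.0, 21)
C = kernel_average_cov(s, centers=[0.0, 100.0], ranges=[5.0, 30.0], tau=10.0)
```

Near each centre the covariance is dominated by that region's local model, and the weights blend the two structures in between.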
  • Slide 41
  • Simplifying assumptions in space-time models. Temporal stationarity: seasonality, decadal oscillations. Spatial stationarity: orographic effects, meteorological forcing. Separability: $C(t,s) = C_1(t) C_2(s)$.
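Separability is convenient computationally: on a full grid of times and sites the space-time covariance matrix is a Kronecker product of the temporal and spatial matrices, and so is its inverse. A small sketch with made-up time points, site coordinates, and exponential correlations:

```python
import numpy as np

def exp_corr(coords, scale):
    """Exponential correlation matrix for one-dimensional coordinates."""
    d = np.abs(coords[:, None] - coords[None, :])
    return np.exp(-d / scale)

times = np.arange(4.0)                 # 4 time points
sites = np.array([0.0, 1.0, 2.5])      # 3 sites on a line (hypothetical)
C1 = exp_corr(times, 2.0)              # temporal covariance C_1
C2 = exp_corr(sites, 1.0)              # spatial covariance C_2
# Observations ordered time-major: index = t * n_sites + s.
C = np.kron(C1, C2)
# The inverse factorizes too, so only the small matrices need inverting.
C_inv = np.kron(np.linalg.inv(C1), np.linalg.inv(C2))
```

The price of this convenience is that the spatial correlation structure is forced to be the same at every time lag, which is the assumption the nonseparable models relax.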
  • Slide 42
  • SARMAP revisited. Spatial correlation structure depends on hour of the day (non-separable):
  • Slide 43
  • Bruno's seasonal nonseparability. Nonseparability generated by seasonally changing spatial term. $Z_1$: large-scale feature. $Z_2$: separable field of local features. (Bruno, 2004)
  • Slide 44
  • A non-separable class of stationary space-time covariance functions. Cressie & Huang (1999): Fourier domain. Gneiting (2001): $f$ is completely monotone if $(-1)^n f^{(n)} \ge 0$ for all $n$. Bernstein's theorem: $f(t) = \int_0^\infty e^{-rt}\,dF(r)$ for some non-decreasing $F$. Combine a completely monotone function $\varphi$ and a function $\psi$ with completely monotone derivative into a space-time covariance $C(h,u) = \frac{\sigma^2}{\psi(|u|^2)^{d/2}}\, \varphi\!\left( \frac{|h|^2}{\psi(|u|^2)} \right)$.
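A sketch of one member of Gneiting's class, assuming the common choices $\varphi(t) = \exp(-c\,t^{\gamma})$ (completely monotone for $0 < \gamma \le 1$) and $\psi(t) = (a\,t^{\alpha} + 1)^{\beta}$ (positive with completely monotone derivative for $0 < \alpha \le 1$, $0 \le \beta \le 1$). The parameter names and default values are assumptions for the example; $h$ is the spatial lag length, $u$ the time lag, and $d$ the spatial dimension.

```python
import numpy as np

def gneiting_cov(h, u, sigma2=1.0, a=1.0, c=1.0, alpha=1.0, beta=1.0, gamma=0.5, d=2):
    """C(h, u) = sigma2 / psi(u^2)^(d/2) * phi(h^2 / psi(u^2)) for the choices above."""
    psi = (a * np.abs(u) ** (2 * alpha) + 1.0) ** beta
    return sigma2 / psi ** (d / 2) * np.exp(-c * np.abs(h) ** (2 * gamma) / psi ** gamma)

# Covariance matrix on a small space-time grid: 3 x 3 sites, 4 time points.
xs = np.array([[i, j] for i in range(3) for j in range(3)], dtype=float)
ts = np.arange(4.0)
pts = [(x, t) for t in ts for x in xs]
H = np.array([[np.linalg.norm(p[0] - q[0]) for q in pts] for p in pts])
U = np.array([[p[1] - q[1] for q in pts] for p in pts])
K = gneiting_cov(H, U)
min_eig = np.linalg.eigvalsh(K).min()
```

With `beta=0` the function $\psi$ is constant and the model no longer depends on the time lag, so the interaction between space and time is governed by `beta`.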
  • Slide 45
  • A particular case (four panels, each with two parameters set to 1/2 or 1: (1/2, 1/2); (1/2, 1); (1, 1/2); (1, 1))
  • Slide 46
  • Uses for surface estimation. Compliance: exposure assessment, measurement. Trend. Model assessment: comparing (deterministic) model to data, approximating model output. Health effects modeling.
  • Slide 47
  • Health effects. Personal exposure (ambient and non-ambient). Ambient exposure: outdoor time, infiltration. Outdoor concentration model for individual i at time t.
  • Slide 48
  • Seattle health effects study. 2 years, 26 10-day sessions. A total of 167 subjects: 56 COPD subjects; 40 CHD subjects; 38 healthy subjects (over 65 years old, non-smokers); 33 asthmatic kids. A total of 108 residences: 55 private homes; 23 private apartments; 30 group homes.
  • Slide 49
  • pDR PUF HPEM Ogawa sampler
  • Slide 50
  • HI Ogawa sampler T/RH logger Nephelometer Quiet Pump Box CO 2 monitor CAT
  • Slide 51
  • Slide 52
  • PM 2.5 measurements
  • Slide 53
  • Where do the subjects spend their time? Asthmatic kids: 66% at home; 21% indoors away from home; 4% in transit; 6% outdoors. Healthy (CHD, COPD) adults: 83% (86, 88) at home; 8% (7, 6) indoors away from home; 4% (4, 3) in transit; 3% (2, 2) outdoors.
  • Slide 54
  • Panel results. Asthmatic children not on anti-inflammatory medication: decrease in lung function related to indoor and to outdoor PM 2.5, not to personal exposure. Adults with CV or COPD: increase in blood pressure and heart rate related to indoor and personal PM 2.5.
  • Slide 55
  • Slide 56
  • Slide 57
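  • Trend model. $\mu(s_i,t) = \mu_1(s_i) + \mu_2(s_i,t)$, with $\mu_1(s_i) = \sum_k \beta_k V_{ik}$, where $V_{ik}$ are covariates, such as population density, proximity to roads, local topography, etc., and $\mu_2(s_i,t) = \sum_j \rho_j(s_i) f_j(t)$, where the $f_j$ are smoothed versions of temporal singular vectors (EOFs) of the $T \times N$ data matrix. We will set $\mu_1(s_i) = \rho_0(s_i)$ for now.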
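The temporal basis functions described here, smoothed temporal singular vectors of the T×N data matrix, can be sketched with a plain SVD. The moving-average smoother, the window length, and the synthetic data (one seasonal signal with site-specific amplitude plus small noise) are assumptions for illustration; the lectures do not specify the smoother.

```python
import numpy as np

def smoothed_eofs(X, n_eof=3, window=5):
    """Leading temporal singular vectors of a T x N matrix, smoothed by a moving average."""
    U, svals, Vt = np.linalg.svd(X, full_matrices=False)
    kernel = np.ones(window) / window
    # Columns of U are the temporal singular vectors; smooth each one in time.
    F = np.column_stack(
        [np.convolve(U[:, j], kernel, mode="same") for j in range(n_eof)]
    )
    return F, svals

# Synthetic data: T = 120 times, N = 10 sites, a period-60 seasonal signal.
T, N = 120, 10
t = np.arange(T)
seasonal = np.sin(2 * np.pi * t / 60.0)
amplitude = np.linspace(1.0, 2.0, N)
rng = np.random.default_rng(1)
X = np.outer(seasonal, amplitude) + 0.01 * rng.standard_normal((T, N))
F, svals = smoothed_eofs(X)
first_corr = abs(np.corrcoef(F[:, 0], seasonal)[0, 1])
```

The first smoothed vector recovers the seasonal shape up to sign and scale; site-specific coefficients for each basis function could then be obtained by regressing each site's series on the columns of `F`.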
  • Slide 58
  • SVD computation
  • Slide 59
  • EOF 1
  • Slide 60
  • EOF 2
  • Slide 61
  • EOF 3
  • Slide 62
  • Slide 63
  • Slide 64
  • Slide 65
  • Kriging of $\rho_0$
  • Slide 66
  • Kriging of $\rho_2$
  • Slide 67
  • Quality of trend fits
  • Slide 68
  • Observed vs. predicted
  • Slide 69
  • Observed vs. predicted, cont.
  • Slide 70
  • Conclusions Good prediction of day-to-day variability seasonal shape of mean Not so good prediction of long-term mean Need to try to estimate
  • Slide 71
  • Other difficulties: missing data; multivariate data; heterogeneous (in space and time) geostatistical tools; different sampling intervals (particularly a PM problem).
  • Slide 72
  • Southern California PM 2.5 data