16
A Novel Use of R to Couple Hierarchical Bayesian Methods with a Spatially Explicit Hydrological Model across Regional and Continental Scales Richard B. Alexander and Gregory E. Schwarz NWQMC 2016 Conference, Tampa, FL, May 2-6, 2016 National Water Quality Assessment Project U.S. Geological Survey Reston, VA

A Novel Use of R to Couple Hierarchical Bayesian …...A Novel Use of R to Couple Hierarchical Bayesian Methods with a Spatially Explicit Hydrological Model across Regional and Continental

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: A Novel Use of R to Couple Hierarchical Bayesian …...A Novel Use of R to Couple Hierarchical Bayesian Methods with a Spatially Explicit Hydrological Model across Regional and Continental

A Novel Use of R to Couple Hierarchical Bayesian Methods with a Spatially Explicit

Hydrological Model across Regional and Continental Scales

Richard B. Alexander and Gregory E. Schwarz

NWQMC 2016 Conference, Tampa, FL, May 2-6, 2016

National Water Quality Assessment Project U.S. Geological Survey

Reston, VA

Page 2: A Novel Use of R to Couple Hierarchical Bayesian …...A Novel Use of R to Couple Hierarchical Bayesian Methods with a Spatially Explicit Hydrological Model across Regional and Continental

Growing Interest in Bayesian Methods

Employs conditional probability theory expressed by Bayes' Theorem:

P[ model| data ] ∝ P[ model ] * P[ data | model ]

Updated knowledge of model (posterior distribution)

New data and the model that best describes the data

Prior model

Posterior (prior & monitoring data)

New Monitoring

Data

Prior (model or knowledge)

Pro

bab

ility

De

nsi

ty

Water-Quality Condition

Optimized watershed models maximize

this probability User decisions about

data sharing over space and time

Page 3: A Novel Use of R to Couple Hierarchical Bayesian …...A Novel Use of R to Couple Hierarchical Bayesian Methods with a Spatially Explicit Hydrological Model across Regional and Continental

Hierarchical Bayesian Methods

Provide flexibility in describing complexities in data over space and time

Model coefficients (and error variance) treated as random variables, with values that vary in time/space

Allow partial pooling of site data across watersheds

Allow modeling of error variance

Hierarchical structure nests higher-order models of parameters (e.g., region) within lower-order structural models (e.g., monitoring stations)

Partially pooling (“data borrowing”) can improve prediction accuracy (bias, precision)

Handles small samples well

Page 4: A Novel Use of R to Couple Hierarchical Bayesian …...A Novel Use of R to Couple Hierarchical Bayesian Methods with a Spatially Explicit Hydrological Model across Regional and Continental

Hierarchical Bayesian Methods

Improve quantification of model uncertainties (state-space methods)

Explore spatial and temporal patterns in the uncertainties

Identify causes of uncertainties

Allows separate estimation of measurement errors in observations vs. real process-related errors associated with the model structure

Produces more accurate estimates of uncertainties and updated predictions of “true” WQ conditions

Provide probabilistic model outcomes

Inclusive of uncertainties

Avoid prediction biases related to transformed response variables

Page 5: A Novel Use of R to Couple Hierarchical Bayesian …...A Novel Use of R to Couple Hierarchical Bayesian Methods with a Spatially Explicit Hydrological Model across Regional and Continental

Challenges in Bayesian Hydrological and Watershed Modeling

Previous Bayesian applications:

evaluated small catchments with little diversity (e.g., soils, climate, land use)

few systematic comparisons of a wide range of hierarchical structures

Reported computational limitations of the available software for hydrological models (e.g., WinBugs, RJAGS)

Declarative programming style (e.g., WinBugs) inconsistent with imperative procedural style (order matters) of many hydrologic models >> complicates use of the software

Page 6: A Novel Use of R to Couple Hierarchical Bayesian …...A Novel Use of R to Couple Hierarchical Bayesian Methods with a Spatially Explicit Hydrological Model across Regional and Continental

A New-Generation Bayesian Method

SPARROW (USGS)

USGS SPARROW water-quality model R scripts, with R libraries under development Novel automated construction and execution of

standard user-selected models: Excel parameter table settings >> Stan script

Evaluated multi-spatial/temporal scale performance Results: Improved quantification of uncertainties

over space/time, with wide-ranging improvements in prediction accuracy (small to ~50%)

Stan Bayesian (Columbia Univ.,

Gelman et al.)

Stan Hamiltonian Monte Carlo (HMC) methods – uses a

more strategic sampling of posterior parameter distributions than Markov Chain Monte Carlo methods

More efficient (~10x faster) and robust R interface (Rstan) uses imperative procedural

language with C++ translator

Page 7: A Novel Use of R to Couple Hierarchical Bayesian …...A Novel Use of R to Couple Hierarchical Bayesian Methods with a Spatially Explicit Hydrological Model across Regional and Continental

USGS SPARROW Water-Quality Model SPAtially Referenced Regression on Watershed Attributes (Smith et al., 1997)

Hybrid mechanistic and statistical features: Spatially explicit (land/water) Mass-balance constraints Non-conservative transport Parameter estimation using least squares optimization

Home page: http://water.usgs.gov/nawqa/sparrow

Capabilities:

Predict annual water-quality load, yield, and concentration for unmonitored stream reaches

Assess effects of hydrological and biogeochemical processes on transport and fate in watersheds

Apportion stream loads to major pollution sources and upstream watersheds

Page 8: A Novel Use of R to Couple Hierarchical Bayesian …...A Novel Use of R to Couple Hierarchical Bayesian Methods with a Spatially Explicit Hydrological Model across Regional and Continental

Mean Annual Streamflow (1997-2007) 1:500,000 RF1 streams (18 HUC-2 regions)

(1,778 calibration sites; 890 validation sites)

Validation site

Calibration site

01

02

03

04

09

12

13

14

15

18

17

10

16

08

11 05

06

07

SPARROW Streamflow Models

Susquehanna River Basin Mean Seasonal Streamflow, 2001-08

1:100,000 NHD streams (19 HUC-8 watersheds) (85 sites)

Validation site

Calibration site

Pennsylvania New York

Maryland New Jersey

Page 9: A Novel Use of R to Couple Hierarchical Bayesian …...A Novel Use of R to Couple Hierarchical Bayesian Methods with a Spatially Explicit Hydrological Model across Regional and Continental

SPARROW Non-Hierarchical Bayesian Streamflow Models (“baseline” models)

R-Square Yield = 0.839 RMSE = 0.34 6 parameters

Susquehanna River Basin - NHD (Mean Seasonal)

Predicted Volume (m3)

Ob

serv

ed V

olu

me

(m

3)

National – RF1 (Mean Annual)

Predicted Volume (m3)

Ob

serv

ed V

olu

me

(m3

)

R-Square Yield = 0.905 RMSE = 0.47 14 parameters

Page 10: A Novel Use of R to Couple Hierarchical Bayesian …...A Novel Use of R to Couple Hierarchical Bayesian Methods with a Spatially Explicit Hydrological Model across Regional and Continental

SPARROW Non-Hierarchical Bayesian Streamflow Models (“baseline” models)

R-Square Yield = 0.839 RMSE = 0.34 6 parameters

Susquehanna River Basin - NHD (Mean Seasonal)

Predicted Volume (m3)

Ob

serv

ed V

olu

me

(m

3)

National – RF1 (Mean Annual)

Predicted Volume (m3)

Ob

serv

ed V

olu

me

(m3

)

R-Square Yield = 0.905 RMSE = 0.47 14 parameters

Hierarchical Bayesian models explore space-time patterns in the model uncertainties, with opportunities to reduce bias and increase precision

Page 11: A Novel Use of R to Couple Hierarchical Bayesian …...A Novel Use of R to Couple Hierarchical Bayesian Methods with a Spatially Explicit Hydrological Model across Regional and Continental

Bayesian SPARROW R Components

Hierarchical rate coefficients for user-specified regions (discrete)

Validation site

Calibration site

01

02

03

04

09

12

13

14

15

18

17

10 16

08

11 05 06

07

National Mean

In-Stream Loss Rate Coefficient

Loss

Rat

e (p

er d

ay

of

wat

er t

rave

l tim

e)

HUC-2 Region

Page 12: A Novel Use of R to Couple Hierarchical Bayesian …...A Novel Use of R to Couple Hierarchical Bayesian Methods with a Spatially Explicit Hydrological Model across Regional and Continental

Bayesian SPARROW R Components

Regional Differences in Prediction Error

Rat

io o

f St

d.

Dev

iati

on

to

Mea

n

HUC-2 Region

Validation site

Calibration site

01

02

03

04

09

12

13

14

15

18

17

10 16

08

11 05 06

07

Hierarchical rate coefficients for user-specified regions (discrete)

Hierarchical error variance for user-defined regions (discrete, continuous)

Page 13: A Novel Use of R to Couple Hierarchical Bayesian …...A Novel Use of R to Couple Hierarchical Bayesian Methods with a Spatially Explicit Hydrological Model across Regional and Continental

Bayesian SPARROW R Components

Validation site

Calibration site

Pennsylvania New York

Maryland New Jersey

River Distance (km) Euclidian Distance (km)

Spatial correlation evident in the process uncertainties among calibration station pairs

Lin

ear

Co

rrel

atio

n

Hierarchical rate coefficients for user-specified regions (discrete)

Hierarchical error variance for user-defined regions (discrete, continuous)

State-space option to quantify measurement and process uncertainties over space/time

Page 14: A Novel Use of R to Couple Hierarchical Bayesian …...A Novel Use of R to Couple Hierarchical Bayesian Methods with a Spatially Explicit Hydrological Model across Regional and Continental

Bayesian SPARROW R Components

Validation site

Calibration site

Pennsylvania New York

Maryland New Jersey

Highest Accuracy for Summer

Updated predictions that include process uncertainties show improved accuracy

HUC-6 Watershed

Mo

del

RM

SE R

atio

(s

tate

-sp

ace

to

no

n-s

tate

-sp

ace)

Process uncertainties include prominent effects of baseflow/groundwater

Hierarchical rate coefficients for user-specified regions (discrete)

Hierarchical error variance for user-defined regions (discrete, continuous)

State-space option to quantify measurement and process uncertainties over space/time

Page 15: A Novel Use of R to Couple Hierarchical Bayesian …...A Novel Use of R to Couple Hierarchical Bayesian Methods with a Spatially Explicit Hydrological Model across Regional and Continental

Bayesian SPARROW R Components

Hierarchical rate coefficients for user-specified regions (discrete)

Hierarchical error variance for user-defined regions (discrete, continuous)

State-space option to quantify measurement and process uncertainties over space/time

Additional model refinements under development:

Seasonal spatial covariance terms for adjacent catchments

Spatial smoothing of process errors for updated predictions (Kalman filtering terms)

Page 16: A Novel Use of R to Couple Hierarchical Bayesian …...A Novel Use of R to Couple Hierarchical Bayesian Methods with a Spatially Explicit Hydrological Model across Regional and Continental

Use of hierarchical Bayesian methods with SPARROW: • Improved understanding of space-time variability in model coefficients

and prediction uncertainties •Wide-ranging improvements in prediction accuracy (small to ~50%) •Enhanced understanding of spatial and temporal variability in

process components

• Improved understanding of large-scale RStan performance across variable time-space scales: Results suggest that seasonal models of >~70,000 km2 and >10 yrs. will require NHD river network thinning

Planned library releases of SPARROW R (GitHub, CRAN): •Classic SPARROW - steady state non-Bayesian version (end 2016) SAS to R translation plus additional enhancements •Bayesian version of classic SPARROW (2017)

Conclusions: A Novel Use of R with SPARROW across Regional and Continental Scales