Statistical Perspective on the Design and Analysis of Natural Resource Monitoring Programs Anthony (Tony) R. Olsen USEPA NHEERL Western Ecology Division

Statistical Perspective on the Design and Analysis of Natural Resource Monitoring Programs

Anthony (Tony) R. Olsen

USEPA NHEERL

Western Ecology Division

Corvallis, Oregon

(541) 754-4790

[email protected]

Web Page: http://www.epa.gov/NHEERL/ARM

National Water Quality Monitoring Council: Monitoring Framework

• Applies to all natural resource monitoring

• Monitoring pieces must be designed and implemented to fit together

• View as information system

• National monitoring requires consistent framework

• Reference: Water Resources IMPACT, September 2003 issue

Impact Article Contributors

• Framework Overview: Charles A. Peters, Robert C. Ward

• The Three C’s: Abby Markowitz, Linda T. Green, James Laine

• Monitoring Objectives: Charles S. Spooner, Gail E. Mallard

• Monitoring Design: Tony Olsen, Dale M. Robertson

• Data Collection: Franceska Wilde, Herbert J. Brass, Jerry Diamond

• Data Management: Karen S. Klima, Kenneth J. Lanfear, Ellen McCarron

• Assess and Interpret: Dennis R. Helsel, Lindsay M. Griffith

• Report Results: Mary Ambrose, Abby Markowitz, Charles Job

Monitoring Program Weaknesses

1. Monitoring results are not directly tied to management decision making (monitoring objectives)

2. Results are neither timely nor communicated to key audiences (convey results)

3. Objectives for monitoring are not clearly, precisely stated and understood (monitoring objectives)

4. Monitoring program not viewed/implemented as an information system (data management, overall)

5. Monitoring measurement protocols, survey design, and statistical analysis become scientifically out-of-date (field/lab methods, monitoring design, data analysis/assessment)

Communicate, Coordinate, Collaborate

• Communication: process of conveying information; can be one way or an exchange of thoughts, messages, or ideas

• Coordination: process in which two or more participants link, harmonize or synchronize interaction and activities

• Collaboration: process in which two or more participants work collectively to deal with issues that they cannot solve individually; partnerships, alliances, teams

• Kish (1965): “The survey objectives should determine the sample design; but the determination is actually a two-way process…”

• Initially objectives are stated in common sense statements – challenge is to transform them into quantitative questions that can be conveyed precisely to intended audience.

• Statistical perspective is key
  - Know whether a monitoring design can answer the question
  - Know when the question is not precise enough – multiple interpretations

[Framework diagram: Develop monitoring objectives; Convey results and findings; Design monitoring program]

Identify Monitoring Objectives

• Objectives determine the monitoring design (yet monitoring design constrains objectives that can be met)
  - Usual to have multiple objectives
  - Precise statements are required
  - Objectives must be prioritized
  - Objectives compete for samples

• Statistical perspective helps identify
  - Target population
  - Subpopulations that require estimates
  - Elements of target population
  - Potential sample frames
  - Variables to be measured
  - Impact of precision required

Example: From Question to Objective

• What is the quality of waters in the United States?

• What is the quality of streams with flowing water during summer in the U.S.?

• What is the biological quality of streams with flowing water during summer in the U.S.?

• How many km of streams with flowing water during the summer are impaired, non-impaired, or marginally-impaired within the U.S.?
  - How is impairment determined?
  - What is meant by summer?
  - Are constructed channels, canals, effluent-dominated streams included?

[Framework diagram: Design monitoring program; Collect field and lab data]


• Key components of monitoring design
  - What will be monitored? (target population)
  - What will be measured? (variables or indicators)
  - When and how frequently will the measurements be taken? (temporal design)
  - Where will the measurements be taken? (site selection)

• Statistical perspective
  - Sample frame and target population
  - Survey design

What is a Target Population?

• Target population denotes the ecological resource for which information is wanted

• Requires a clear, precise definition
  - Must be understandable to users
  - Field crews must be able to determine if a particular site is in the target population

• More difficult to define than most expect.

• Includes definition of what the elements are that make up the target population

Target Population, Sample Frame, Sampled Population

We Live in an Imperfect World…

Ideally, cyan, yellow, gray squares would overlap completely

Basic Spatial Survey Designs

• Simple Random Sample

• Systematic Sample
  - Regular grid
  - Regular spacing on linear resource

• Spatially Balanced Sample
  - Combination of simple random and systematic characteristics
  - Guarantees all possible samples are distributed across the resource (target population)
  - Generalized Random Tessellation Stratified (GRTS) design
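The first two designs can be illustrated on a linear resource such as a stream. The following is a minimal sketch, assuming a hypothetical 100 km resource and a sample of 10 sites; the function names and numbers are illustrative and not taken from the presentation.

```python
import random

def simple_random_sample(total_km, n, rng):
    """Draw n independent, uniformly distributed locations along the resource."""
    return sorted(rng.uniform(0.0, total_km) for _ in range(n))

def systematic_sample(total_km, n, rng):
    """Regular spacing along the resource with one random start."""
    step = total_km / n
    start = rng.uniform(0.0, step)  # random start within the first interval
    return [start + k * step for k in range(n)]

rng = random.Random(1)  # fixed seed only so the sketch is repeatable
srs = simple_random_sample(100.0, 10, rng)       # hypothetical 100 km resource
sys_sample = systematic_sample(100.0, 10, rng)

# Gaps between neighbouring sites: irregular for the simple random sample,
# constant (10 km here) for the systematic sample.
srs_gaps = [b - a for a, b in zip(srs, srs[1:])]
sys_gaps = [b - a for a, b in zip(sys_sample, sys_sample[1:])]
```

Comparing `srs_gaps` with `sys_gaps` shows why simple random samples can leave stretches of the resource unsampled while systematic samples cannot.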

Generalized Random Tessellation Stratified (GRTS) Survey Designs

• Probability sample producing design-based estimators and variance estimators

• Give another option to simple random sample and systematic sample designs
  - Simple random samples tend to “clump”
  - Systematic samples difficult to implement for aquatic resources and do not have design-based variance estimator

• Emphasize spatial-balance
  - Every replication of the sample exhibits a spatial density pattern that closely mimics the spatial density pattern of the resource
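The idea of spatial balance can be sketched in a few lines. This is not the GRTS algorithm itself: it is a simplified stand-in, assuming equal inclusion probabilities, that orders points along a quadrant-recursive (Morton) address and then takes a systematic sample along that ordering. Real GRTS additionally randomizes the hierarchical addresses and supports unequal probabilities. The function names and the 32 x 32 frame are hypothetical.

```python
import random

def morton_address(x, y, bits=8):
    """Interleave the bits of quantized x and y (both assumed in [0, 1))."""
    cells = 1 << bits
    xi = min(int(x * cells), cells - 1)
    yi = min(int(y * cells), cells - 1)
    code = 0
    for b in range(bits):
        code |= ((xi >> b) & 1) << (2 * b)
        code |= ((yi >> b) & 1) << (2 * b + 1)
    return code

def spatially_balanced_sample(points, n, rng):
    """Systematic sample along the Morton ordering (simplified, not GRTS)."""
    ordered = sorted(points, key=lambda p: morton_address(p[0], p[1]))
    step = len(ordered) / n
    start = rng.uniform(0.0, step)
    last = len(ordered) - 1
    return [ordered[min(int(start + k * step), last)] for k in range(n)]

# Hypothetical frame: a 32 x 32 grid of points in the unit square.
frame = [((i + 0.5) / 32, (j + 0.5) / 32) for i in range(32) for j in range(32)]
sample = spatially_balanced_sample(frame, 16, random.Random(7))

# Count sampled points per quadrant of the unit square.
per_quadrant = {}
for x, y in sample:
    key = (x >= 0.5, y >= 0.5)
    per_quadrant[key] = per_quadrant.get(key, 0) + 1
```

Because each quadrant occupies a contiguous block of the ordering, the 16 sampled points split evenly across the four quadrants here, which a simple random sample would not guarantee.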

Spatial Balance: 256 points

Why aren’t Basic Designs Sufficient?

• Monitoring objectives may include requirements that basic designs can’t address efficiently
  - Estimates for particular subpopulations require greater sampling effort
  - Administrative restrictions and operational costs

• Natural resource in study region makes basic designs inefficient
  - Resource may be known to be restricted to particular subregions

• Complex designs may be more cost-effective

National Wadeable Stream Assessment 2004

Survey Design & Response Design

• Survey design is the process of selecting sites at which a response will be determined
  - Which sites will be visited (spatial component)
  - In which monitoring season sites will be visited (temporal component, panel design)

• Response design is the process of obtaining a response at a site
  - When site is to be visited within a monitoring season
    - A single index period visit during a monitoring season
    - Multiple visits during monitoring season: e.g. monthly, quarterly
  - Field plot design
  - Process of going from basic field measurements to indicators

[Framework diagram: Design monitoring program; Collect field and lab data; Compile and manage data]

• Components
  - Field methods (response design)
  - Laboratory methods
  - Measurement quality objectives
  - Quality assurance & quality control
  - Logistical plan and gaining site access

• Statistical perspective
  - Experimental designs to determine cost-effective and scientifically-defensible response designs
  - Statistical quality control
  - Methods for minimizing non-response

[Figure: Species Richness (% of Maximum) versus Stream Length (Channel Width Units); x-axis 0 to 80, y-axis 0 to 100]

Response Design - Fish

• 1-pass sampling

• Spread effort throughout reach

• Get “common” species in approx. relative abundance

[Figure: stream reach layout with transects A through K and flow direction marked. Distance between transects = 4 times mean wetted width at X-site. Total reach length = 40 times mean wetted width at X-site (minimum = 150 m)]

Sampling points

• L = Left, C = Center, R = Right

• First point (transect B) determined at random

• Subsequent points assigned in order L, C, R
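The reach geometry and point assignment described for this layout can be sketched as follows. This is an illustration under stated assumptions, not the field protocol: it assumes transect A sits at position 0, that transects are spread evenly when the 150 m minimum applies, and that the L, C, R order cycles after a random first point. All function names are hypothetical.

```python
import random

TRANSECTS = list("ABCDEFGHIJK")  # 11 transects, A through K

def reach_length_m(mean_wetted_width_m):
    """Total reach length: 40 times mean wetted width, minimum 150 m."""
    return max(40.0 * mean_wetted_width_m, 150.0)

def transect_positions_m(mean_wetted_width_m):
    """Evenly spaced transects; spacing is 4 widths unless the minimum applies."""
    length = reach_length_m(mean_wetted_width_m)
    spacing = length / (len(TRANSECTS) - 1)
    return {t: k * spacing for k, t in enumerate(TRANSECTS)}

def assign_sampling_points(transects, rng):
    """Random first point, then cycle through L, C, R (assumed cycling rule)."""
    order = ["L", "C", "R"]
    first = rng.randrange(3)
    return {t: order[(first + k) % 3] for k, t in enumerate(transects)}

positions = transect_positions_m(10.0)  # hypothetical 10 m mean wetted width
# The slide names transect B as the first point, so start the cycle there.
points = assign_sampling_points(TRANSECTS[1:], random.Random(3))
```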

Response Design: Benthos and Periphyton

US Forest Service Forest Inventory and Analysis (FIA) Plot Design

Minimizing Non-Response: Prairie Potholes

• Landowner contact procedure
  - Obtain owner list from USDA ASCS local office
  - Cover letter explained study, random selection, measurements, walking access only, timing/duration of visit, offer to honor special owner conditions
  - Consent form
  - Map identifying wetland to be visited
  - Telephone contact 2-4 weeks after letter – list of FAQs and answers provided to personnel
  - Second letter 5-6 weeks after initial letter

• Access rates: private land 42%

• 25% of access approvals required multiple contacts

• From Lesser et al. (2001)

[Framework diagram: Collect field and lab data; Compile and manage data; Assess and interpret data]

• Components: compile and manage data
  - Data entry
  - Database development
  - Metadata
  - Data preservation
  - Data discovery and retrieval

• Statistical perspective
  - Statistical QA checking of data
  - Access to auxiliary data used in statistical analyses
  - Influence retrieval and database design
  - Importance of preserving design information

Examples

• STORET modified to include survey design information
  - Which sites are part of the survey design
  - Stratification, weights, cluster variables

• USGS NWIS and NWISWeb
  - NWIS focus on input/site specific (typically time focus)
  - NWISWeb focus on retrieval (typically spatial focus)

• National Resource Inventory’s analysis database
  - Statistical imputation for missing data
  - Statistical creation of pseudo points
    - Incorporate known information
    - Link across years for consistency
  - Determination of single weight for each point in database
  - Results in a single, consistent database for 1982, 1987, 1992, … that is easy to use for statistical analyses

[Framework diagram: Compile and manage data; Assess and interpret data]


• Derived indicator construction

• Statistical Design-Based estimation

• Statistical Model-assisted and model-based estimation
  - Inference to unsampled locations
  - Spatial pattern inference (or where is the map!)

• Semi-empirical modeling
  - Incorporating physical processes
  - Empirical statistical modeling using auxiliary data

Design-Based Population Estimation

• Scientific inference from sample to population

• Minimizes assumptions used in the inference process

• Relies on principles of statistical survey design and analysis

• Natural resource programs that use design-based estimation
  - Forest Inventory & Analysis
  - National Resource Inventory
  - National Wetland Status and Trends Program
  - National Agricultural Statistics Service programs
  - Environmental Monitoring and Assessment Program (EMAP)
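Design-based estimation can be illustrated with the Horvitz-Thompson estimator, in which each sampled unit is weighted by the inverse of its inclusion probability. The sketch below is a generic textbook form, not the estimator of any particular program listed above; the stream segments, lengths, and inclusion probabilities are hypothetical.

```python
def horvitz_thompson_total(values, inclusion_probs):
    """Estimate a population total as the sum of y_i / pi_i over the sample."""
    return sum(y / pi for y, pi in zip(values, inclusion_probs))

def estimated_proportion(values, indicators, inclusion_probs):
    """Ratio of two Horvitz-Thompson totals (share of the total in a class)."""
    in_class = [y * z for y, z in zip(values, indicators)]
    return (horvitz_thompson_total(in_class, inclusion_probs)
            / horvitz_thompson_total(values, inclusion_probs))

# Hypothetical sample of four stream segments.
length_km = [2.0, 1.0, 3.0, 4.0]          # length of each sampled segment
inclusion_prob = [0.5, 0.25, 0.5, 0.25]   # probability each segment was selected
impaired = [1, 0, 1, 0]                   # 1 = impaired, 0 = not impaired

total_km = horvitz_thompson_total(length_km, inclusion_prob)
impaired_km = horvitz_thompson_total(
    [y * z for y, z in zip(length_km, impaired)], inclusion_prob)
share_impaired = estimated_proportion(length_km, impaired, inclusion_prob)
```

The only inputs are the measurements and the design information (inclusion probabilities), which is why preserving design information in the database matters for later analysis.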

Estimating Site Occupancy Rates

• MacKenzie, D. I., J. D. Nichols, G. B. Lachman, S. Droege, J. A. Royle, and C. A. Langtimm. 2002. Estimating site occupancy rates when detection probabilities are less than one. Ecology 83:2248-2255.

• Likelihood based model for estimation
  - Assumes simple random sample of sites
  - Similar to closed-population, mark-recapture model
  - Estimate probability of occupancy and probability of detection

• Estimation with complex survey designs
  - Maximum likelihood as before
  - Likelihood must incorporate survey design
    - Stratification
    - Unequal probability of selection
    - Cluster sample
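The simple-random-sample case of the occupancy likelihood can be sketched as below. This is a minimal illustration assuming constant occupancy probability (psi) and constant detection probability (p) across sites and visits, fitted by a coarse grid search rather than a proper optimizer. It does not incorporate stratification, unequal selection probabilities, or clustering. The detection counts are hypothetical.

```python
import math

def log_likelihood(psi, p, detections, visits):
    """Log-likelihood for constant psi and p; detections[i] = number of visits
    (out of `visits`) on which the species was detected at site i."""
    ll = 0.0
    for d in detections:
        if d > 0:
            # Site is occupied and the species was detected d times.
            ll += math.log(psi) + d * math.log(p) + (visits - d) * math.log(1.0 - p)
        else:
            # Either occupied but never detected, or not occupied.
            ll += math.log(psi * (1.0 - p) ** visits + (1.0 - psi))
    return ll

def fit_by_grid(detections, visits, steps=199):
    """Coarse grid search over (psi, p); adequate only for illustration."""
    grid = [(k + 1) / (steps + 1) for k in range(steps)]  # 0.005 to 0.995
    best = max((log_likelihood(psi, p, detections, visits), psi, p)
               for psi in grid for p in grid)
    return best[1], best[2]

# Hypothetical data: 10 sites, 3 visits each.
detections = [3, 2, 1, 1, 0, 0, 0, 0, 2, 1]
naive_occupancy = sum(1 for d in detections if d > 0) / len(detections)
psi_hat, p_hat = fit_by_grid(detections, visits=3)
```

With imperfect detection, `psi_hat` exceeds `naive_occupancy`, since some sites with no detections are estimated to be occupied.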

[Figure: Northeast Lake Fish Species PAO. Proportion occupied (axis 0 to 0.8), adjusted (Adj PAO) and unadjusted (Unadj PAO), for Rock Bass, Yellow Bullhead, Brown Bullhead, American Eel, White Sucker, Northern Pike, Chain Pickerel, and Brook Trout]

Statistical Model-assisted and Model-based Estimation

• Improve estimation based on complete coverage information

• Adjustment for non-response at the site level

• Small area estimation

• Spatially-explicit model of probability of impairment
  - Identification of “hot spots” likely to be impaired

• Will see increased use of these techniques

Semi-parametric Small Area Model: Northeast Lakes ANC prediction for HUCs

• J. Breidt, J. Opsomer, G. Ranalli, G. Claeskens, G. Kauermann

• Colorado State University STARMAP research program sponsored by USEPA STAR grants program

Semi-empirical Modeling: USGS NAWQA

• SPARROW relates in-stream water-quality measurements to spatially referenced characteristics of watersheds, including contaminant sources and factors influencing terrestrial and stream transport.

• The model empirically estimates the origin and fate of contaminants in streams, and quantifies uncertainties in these estimates based on model coefficient error and unexplained variability in the observed data.

Estimated nitrogen export (kg/km2/yr) for watersheds of the conterminous United States.

http://water.usgs.gov/nawqa/sparrow/wrr97/fig1.html

[Framework diagram: Assess and interpret data]



• Questions to ask when planning reporting
  - What is the objective for communicating the results?
  - Who is the target audience?
  - What message do we want to convey?
  - What formats will be used to convey the message?

• Statistical perspective
  - Clarity on scope of inference: target population/sampled population
  - Reporting of precision for results
  - Construction of statistical tables
  - Construction of presentation quality statistical graphics
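Reporting precision usually means attaching a margin of error to each estimate. The sketch below is a minimal illustration that assumes a simple random sample and a normal approximation; a complex survey design would need a design-based variance estimator instead. The counts are hypothetical.

```python
import math

def proportion_with_margin(successes, n, z=1.96):
    """Estimated proportion and approximate 95% margin of error (SRS assumption)."""
    p_hat = successes / n
    standard_error = math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat, z * standard_error

def format_percent(p_hat, margin):
    """Format an estimate as 'estimate ± margin' in whole percent."""
    return f"{100 * p_hat:.0f} ± {100 * margin:.0f}%"

# Hypothetical result: 60 of 200 sampled sites classified as degraded.
p_hat, margin = proportion_with_margin(60, 200)
label = format_percent(p_hat, margin)
```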

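[Figure: IBI Results Geographic Distribution. Pie charts for four regions, with an Insufficient Data category. Percentage labels in extraction order: Valleys 10%, 21%, 34%, 28%; North-Central Appalachians 10%, 14%, 31%, 41%; Ridge and Blue Ridge 15%, 22%, 35%, 11%; Western Appalachians 35%, 3%, 28%, 26%]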

Estuarine Stressor Comparison

[Figure: benthic invertebrate condition and stressors associated with degraded condition, Louisianian Province and Virginian Province. Condition panels: Degraded 30 ± 6% and Undegraded 70 ± 6%; Degraded 18 ± 8% and Undegraded 82 ± 8%. Stressor panels: Metals 42%, Toxicity 4%, Contaminants 28%, Low D.O., Habitat 14%, Unknown 10%; Unknown 39%, Contaminants 10%, Both 2%, Low Dissolved Oxygen 49%]

MAIA: Relative Risk Assessment

$$\mathrm{RR} = \frac{\Pr(\text{Poor BMI} \mid \text{Poor SED})}{\Pr(\text{Poor BMI} \mid \text{OK SED})}$$

“The risk of Poor BMI is 1.6 times greater in streams with Poor SED than in streams with OK SED.”
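The relative risk ratio can be computed from site-level records as sketched below. This is an illustration only: the records and the use of survey weights are assumptions, with counts chosen so that the ratio comes out at 1.6. They are not the MAIA data.

```python
def relative_risk(records):
    """records: iterable of (weight, sed_poor, bmi_poor) with boolean flags.
    Returns Pr(Poor BMI | Poor SED) / Pr(Poor BMI | OK SED), weight-based."""
    total_weight = {True: 0.0, False: 0.0}   # keyed by sed_poor
    poor_bmi_weight = {True: 0.0, False: 0.0}
    for weight, sed_poor, bmi_poor in records:
        total_weight[sed_poor] += weight
        if bmi_poor:
            poor_bmi_weight[sed_poor] += weight
    risk_given_poor_sed = poor_bmi_weight[True] / total_weight[True]
    risk_given_ok_sed = poor_bmi_weight[False] / total_weight[False]
    return risk_given_poor_sed / risk_given_ok_sed

# Hypothetical equal-weight records (weight, sed_poor, bmi_poor).
records = (
    [(1.0, True, True)] * 40 + [(1.0, True, False)] * 60 +
    [(1.0, False, True)] * 50 + [(1.0, False, False)] * 150
)
rr = relative_risk(records)  # 0.40 / 0.25
```

In a probability survey the weights would be the design weights (for example, stream length represented by each site), so the risks refer to the target population rather than to the sampled sites alone.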

Lake Ontario Diporeia Spatial Pattern

Summary

• Statistical perspective is pervasive throughout the monitoring framework

• Substantial advances in incorporating statistical perspective in monitoring have been made during the last half of the 20th century

• Many statistical methodology advances are on the horizon that will improve monitoring cost-effectiveness

• Incorporating a statistical perspective throughout the development and implementation of a monitoring program is no longer optional – it is essential

When will natural resource monitoring programs be able to support an Environmental Statistics Briefing Room?
