Best Model
Dylan Loudon
Linear Regression Results
Erin Alvey
Who will you trust?
• Field technicians?
• Software programmers?
• Statisticians?
• Instructors?
• GIS technicians?
• Other researchers?
• Yourself?
Regression (Correlation) Modeling
• Creates a model in N-Dimensional “Hyper-Space”
• Defined by:
– Covariates
– Response variables
– Mathematics used to create the model
– Statistics used to optimize parameters
– Options for model evaluation
– Predictor variables
Multiple Linear Regression
Linear Regression: 2 Predictors
Mathworks.com
Non-Linear Regression
Regression Methods
• Continuous Regression:
– Linear Regression
– Generalized Linear Models (GLMs)
– Generalized Additive Models (GAMs)
• Categorical Regression (trees):
– Regression Trees
– Classification and regression trees (CART)
• Machine Learning:
– Maximum Entropy (Maxent)
– NPMR, HEMI, BRTs, etc.
Brown Shrimp Size
• Add graph from work
Terminology
• Plant uses:
– Measured value and response variable
– Explanatory variable
• I prefer:
– Response variable
– I’ll use “measured value” to identify measured values in field data
– Covariate: Explanatory variable used to build the model
– Predictor: Explanatory variable used to predict
Douglas Fir Habitat Model
[Figure: Habitat Quality (0 to 1) plotted against Precipitation (mm), 0 to 10000]
[Diagram labels: Predictor, Model, Prediction]
[Diagram labels: Predictor, Model, Prediction, Field Data, Covariate, Model Selection and Parameter Estimation]
[Diagram labels: Predictor, Model, Prediction, Field or Sample Data, Covariate, Model Selection and Parameter Estimation, Model Validation]
Douglas-Fir sample data
Lat        Lon          F3  MeanTemp  Precip
40.893634  -121.802272  41  69        1070
40.987702  -122.117088  45  96        1406
40.987702  -122.117088  40  96        1406
40.987702  -122.117088  43  96        1406
40.987702  -122.117088  42  96        1406
40.987702  -122.117088  46  96        1406
Create the Model
[Workflow diagram labels: Model “Parameters”, Precip, To Points, Extract, Text File, To Raster, Prediction, Attributes]
X         Y         MeanTemp  Precip  Predict
-123.677  41.61906  71        1548    193.6
-123.344  41.61906  55        1212    150.4
-123.011  41.61906  79        887     187.5667
-122.677  41.61906  68        584     155.4667
-122.344  41.61906  102       513     221.1
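The prediction step can be sketched in a few lines of Python. The slides do not state the fitted model; the coefficients below are an inference from the five rows of the table, whose Predict column happens to be reproduced by 2 × MeanTemp + Precip / 30. Treat `TEMP_COEF`, `PRECIP_COEF`, and the function names as illustrative assumptions, not the author's parameters.

```python
# Predictor rows copied from the table above: (X, Y, MeanTemp, Precip, Predict).
ROWS = [
    (-123.677, 41.61906, 71, 1548, 193.6),
    (-123.344, 41.61906, 55, 1212, 150.4),
    (-123.011, 41.61906, 79, 887, 187.5667),
    (-122.677, 41.61906, 68, 584, 155.4667),
    (-122.344, 41.61906, 102, 513, 221.1),
]

# Assumed coefficients, inferred from the rows above (not stated in the slides).
TEMP_COEF = 2.0
PRECIP_COEF = 1.0 / 30.0


def predict(mean_temp, precip):
    """Apply the assumed linear model to the predictors at one location."""
    return TEMP_COEF * mean_temp + PRECIP_COEF * precip


def max_abs_error(rows):
    """Largest gap between the sketch's prediction and the table's Predict column."""
    return max(abs(predict(t, p) - target) for _, _, t, p, target in rows)


if __name__ == "__main__":
    for x, y, t, p, target in ROWS:
        print(x, y, round(predict(t, p), 4), target)
```

The same function can be applied to predictor values from a different scenario, which is the sense in which predictors may differ from the covariates used for training.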
Data
• Response Variable
– From the field data (sample data)
• Covariates
– From the field or remotely sensed
• Predictors
– Typically remotely sensed
– Sample as covariates for training
– Can be different for predicting to new scenarios
Response Variable
• What is the:
– Spatial uncertainty?
– Temporal uncertainty?
– Measurement uncertainty?
• Will it answer your question?
Covariate Variables
• What is the:
– Spatial uncertainty?
– Temporal uncertainty?
– Measurement uncertainty?
• How well does the collection time of the covariates match the field data?
• Do they co-vary with the phenomena?
• Do the covariates “correlate”?
Types of uncertainty
• Accuracy (bias)
• Precision (repeatability)
• Reliability (consistency of a set of measurements)
• Resolution (fineness of detail)
• Logical consistency
– Adherence to structural rules, attributes, and relationships
• Completeness
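The first two items can be made concrete with a small sketch. It assumes repeated measurements of one quantity whose reference ("true") value is known, and uses the mean offset for accuracy (bias) and the sample standard deviation for precision; the function names are illustrative.

```python
import statistics


def bias(measurements, reference):
    """Accuracy as bias: mean measurement minus the reference value."""
    return statistics.mean(measurements) - reference


def precision(measurements):
    """Precision as repeatability: sample standard deviation of the repeats."""
    return statistics.stdev(measurements)


if __name__ == "__main__":
    repeats = [10.1, 10.2, 10.3]  # made-up repeated readings
    print(bias(repeats, 10.0), precision(repeats))
```

A set of readings can be precise (small spread) and still inaccurate (large bias), which is why the two are listed separately.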
Types of Errors
• Gross errors
– Transcription
– Sinks in DEMs
• Random
– Estimated using probability theory
• Systematic errors
– “Drift” in instruments
– Dropped lines in Landsat
Gross Errors
• Lat/Lon:
– Reversed
– 0, names, dates, etc.
• Dates:
– Extended in databases
• Measurements:
– Inconsistent units
– Inconsistent protocols
– What can you expect from a field team?
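A minimal screening sketch for the Lat/Lon cases above, assuming coordinates arrive as decimal degrees. The flag names and the reversal heuristic (a latitude outside ±90 paired with a longitude that would be a valid latitude) are illustrative choices, not an established standard.

```python
def flag_coordinate(lat, lon):
    """Return a list of gross-error flags for one latitude/longitude pair."""
    flags = []
    if lat == 0 and lon == 0:
        # (0, 0) is a common placeholder for a missing position.
        flags.append("zero")
    if not -90 <= lat <= 90:
        flags.append("lat_out_of_range")
    if not -180 <= lon <= 180:
        flags.append("lon_out_of_range")
    if 90 < abs(lat) <= 180 and abs(lon) <= 90:
        # Swapping the two values would give a valid pair.
        flags.append("possibly_reversed")
    return flags


if __name__ == "__main__":
    print(flag_coordinate(40.893634, -121.802272))  # as recorded
    print(flag_coordinate(-121.802272, 40.893634))  # same point, reversed
```

A reversed pair whose values both fall within ±90 passes this check, so a map of the points is still needed.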
Occurrences of Polar Bears
From The Global Biodiversity Information Facility (www.gbif.org, 2011)
Systematic Errors
Landsat Scan line Error
Response Variable Qualification Tools
• Maps (various resolutions)
• Examine the data values:
– How many digits?
– Repeating patterns, gross errors?
• “Documentation”
• Measurements:
– Occurrences?
– Binary: Histogram
– Categorical: Histogram
– Continuous: Histogram
What’s the Impact on Models?
Significant Digits
• How many digits to represent 1 meter?
– Geographic: Lat/Lon?
– UTM: Eastings/Northings?
Significant Digits
• Geographic:
– 1 digit = 1 degree
– 1 degree ~ 110 km
– 0.00001 ~ 1.1 meters
• UTM:
– 1 digit = 1 meter
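The geographic arithmetic can be checked with a short sketch. It uses the slide's rounded figure of 110 km per degree; the optional latitude argument applies the cosine shrinkage of east-west distance for longitude. The names are illustrative.

```python
import math

METERS_PER_DEGREE = 110_000  # rounded figure from the slide (1 degree ~ 110 km)


def meters_per_decimal_place(places, latitude_deg=None):
    """Ground distance represented by the last decimal place of a coordinate.

    With latitude_deg given, the result is the east-west size of a longitude
    step at that latitude; otherwise it is the north-south size.
    """
    size = METERS_PER_DEGREE * 10 ** (-places)
    if latitude_deg is not None:
        size *= math.cos(math.radians(latitude_deg))
    return size


if __name__ == "__main__":
    for places in range(7):
        print(places, meters_per_decimal_place(places))
```

Five decimal places therefore correspond to roughly a meter, so coordinates stored with fewer places cannot support meter-scale modeling.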
Covariate Qualification
• Maps
• Documentation
• Examine the data:
– How many digits?
– Integer or floating point?
– Repeating patterns?
• Histograms
CONUS Annual Precip.
Covariate Uncertainty
[Histogram: Min Temp of Coldest Month. X-axis: Degrees C Times 10 (-231 to 102). Y-axis: Number of Pixels, Scaled to 1 (0.00 to 1.20)]
Min Temp of Coldest Month
[Histogram. X-axis: Degrees C Times 10 (-230 to 103). Y-axis: Number of Occurrences, Scaled to 1 (0.00 to 1.20)]
Min Temp: Environment
Histograms
hist(Temp,breaks=400)
Covariate Correlation
• Correlation Plots
• Pearson product-moment correlation coefficient
• Spearman’s rho – nonparametric correlation coefficient
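Both coefficients can be sketched with the standard library alone. Pearson's coefficient is computed from its definition, and Spearman's rho is computed as the Pearson correlation of the ranks, with tied values given their average rank. The example inputs are the MeanTemp and Precip columns of the prediction table; the function names are illustrative.

```python
import math


def pearson(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


if __name__ == "__main__":
    mean_temp = [71, 55, 79, 68, 102]
    precip = [1548, 1212, 887, 584, 513]
    print(pearson(mean_temp, precip), spearman(mean_temp, precip))
```

Because it works on ranks, Spearman's rho equals 1 for any strictly increasing relationship, linear or not, whereas Pearson's coefficient measures linear association only.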
Correlation plots
California Correlations
California Predictors
Response vs. Covariates
• For Occurrences:
– Histogram covariates at occurrences vs. overall covariates
• For Binary Data:
– Histogram covariates for each value
• For Categorical Data:
– Histogram covariates for each value
– Or scatter plots
• For Continuous Data:
– Scatter plots
Covariate Occurrence Histograms
Precipitation with Douglas-Fir Occurrences
Douglas Fir Model In HEMI 2
Green: Histogram of all of California
Red: Histogram of Douglas-Fir Occurrences
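Comparing the two histograms requires a common scale, since the study area has far more pixels than there are occurrences. One way to do that, sketched here, is to bin both sets over the same range and divide each by its tallest bin, which matches the "Scaled to 1" axes in the histograms above. The precipitation values are made up for illustration and the function name is an assumption.

```python
def scaled_histogram(values, low, high, bins):
    """Bin values over [low, high] and scale so the tallest bin equals 1."""
    width = (high - low) / bins
    counts = [0] * bins
    for value in values:
        if low <= value <= high:
            # A value equal to high goes in the last bin.
            index = min(int((value - low) / width), bins - 1)
            counts[index] += 1
    peak = max(counts)
    if peak == 0:
        return [0.0] * bins
    return [count / peak for count in counts]


if __name__ == "__main__":
    # Hypothetical precipitation values (mm), not taken from the slides.
    environment = [200, 300, 350, 500, 800, 900, 1200, 1400, 1600, 2500]
    occurrences = [1200, 1400, 1406, 1600]
    print(scaled_histogram(environment, 0, 3000, 6))
    print(scaled_histogram(occurrences, 0, 3000, 6))
```

Bins where the occurrence histogram is high but the environment histogram is low indicate conditions the species appears to select for.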
Doug-Fir Height vs. Precip.
Douglas Fir Height
Terrestrial Predictors
• Elevation:
– Slope
– Aspect
– Absolute Aspect
• Distance to:
– Roads
– Streams (streamline)
• Climate
– Precip
– Temp
• Soil Type
• RS:
– Landsat
– MODIS
– NDVI, etc.
Marine Predictors
• Temp
• DO2
• Salinity
• Depth
• Rugosity (roughness)
• Current (at depths)
• Wind
More Complicated
• Associated species
• Trophic levels
• Temporal
• Cyclical
Predictor Layers
• Means, mins, maxes
• Range of values
• Heterogeneity
• Spatial layers:
– Distance to…
– Topography: elevation, slope, aspect
Field Data and Predictors
• As close to field measurements as possible
• Clean and aggregate data as needed
– Documenting as you go
• Estimate overall uncertainty
• Answer the question:
– What spatial, temporal, and measurement scales are appropriate to model at given the data?
Temporal Issues
• Divide data into months, seasons, years, decades.
– Consistent between predictors and response
• Extract predictors as close to sample location and dates as possible
• Use the “best” predictor layers
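Assigning each sample date to temporal bins can be sketched as below, so that predictors and response can then be grouped by the same keys. The season table assumes meteorological seasons in the Northern Hemisphere, which is a choice made for this sketch and not something the slides specify.

```python
from datetime import date

# Assumed convention: meteorological seasons, Northern Hemisphere.
SEASONS = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "fall", 10: "fall", 11: "fall",
}


def temporal_bins(sample_date):
    """Return the month, season, year, and decade for one sample date."""
    return {
        "month": sample_date.month,
        "season": SEASONS[sample_date.month],
        "year": sample_date.year,
        "decade": sample_date.year // 10 * 10,
    }


if __name__ == "__main__":
    print(temporal_bins(date(2011, 7, 4)))
```

Under this convention December falls in the same calendar year as the preceding fall, so a winter that spans two years needs extra handling if seasons must be contiguous.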
Additional Slides
Dimensions of uncertainty
• Space
• Time
• Attribute
• Scale
• Relationships
Basic Tools
• Histograms: What is the distribution of occurrences of values (range and shape)
• Scattergrams: What is the relationship between response and predictor variables and between predictor variables
• QQPlots: Are the residuals normally distributed?
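The points of a normal QQ plot can be computed without a plotting library, as sketched below. The residuals are standardized and sorted, then paired with standard normal quantiles at the plotting positions (i + 0.5) / n, which is one common convention among several and an assumption of this sketch. Points that fall close to the 1:1 line suggest approximately normal residuals.

```python
from statistics import NormalDist, mean, stdev


def qq_points(residuals):
    """Return (theoretical quantile, standardized residual) pairs, sorted."""
    n = len(residuals)
    center = mean(residuals)
    spread = stdev(residuals)
    observed = sorted((r - center) / spread for r in residuals)
    standard_normal = NormalDist()
    theoretical = [standard_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, observed))


if __name__ == "__main__":
    residuals = [-1.2, -0.4, 0.1, 0.3, 1.5]  # made-up residuals
    for theoretical, observed in qq_points(residuals):
        print(round(theoretical, 3), round(observed, 3))
```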
Types of Data
• “God does not play dice” – Einstein
• “the end of certainty” – Prigogine, 1977 Nobel Prize
• What remains is:
– Quantifiable probability with uncertainty
Uncertainty Factors
• Inherent uncertainty in the world
• Limitation of human cognition
• Limitation of measurement
• Uncertainty in processing and analysis