Calibration, Sensitivity Analysis and Uncertainty Analysis for Computationally Expensive
Models
Prof. Christine Shoemaker
Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman
School of Civil and Environmental Engineering and School of Operations Research and Industrial Engineering
Cornell University
South Florida Water District Morning Meeting, Sept. 24, 2003
Models Help Extract Information
Point Data
from monitoring or experiments at limited number of points in space and time
Model
that describes temporal and spatial connections
Forecasts (with statistical representation)
Comparison of Alternative Management Options
Understanding Processes
from Point Data to Processes Continuous in Space and Time
Steps in Modeling
• Calibration—selecting parameter values within acceptable limits to fit the data as well as possible
• Validation—applying the model and calibrated parameters to independent data set
• Sensitivity Analysis—assessing the impact of changes in uncertain parameter values on model output
• Uncertainty Analysis—assessing the range of model outcomes likely given uncertainty in parameters, model error, and exogenous factors like weather.
Computationally Expensive Models
• It is difficult to calibrate for many parameters with existing methods with a limited number of simulations.
• Most existing uncertainty methods require thousands of simulations.
• We can only do a limited number of model simulations if models take hours to run.
• Our methods are designed to reduce the number of simulations required to do good calibration and sensitivity analysis.
Methods and Applications
• We will discuss a general methodology for calibration, sensitivity analysis and uncertainty analysis that can be applied to many types of computationally expensive models.
• We will present numerical examples for two “real life” applications: a watershed and a groundwater remediation problem.
1. Effective Use of Models and Observations Through Calibration, Sensitivity Analysis and Uncertainty Analysis
A description of the technical approach and “real life” applications, including:
1. Sensitivity Analysis for a large number of parameters, with application to a large watershed.
2. Optimization methods for calibration, with application to groundwater remediation based on field data.
3. Uncertainty Analysis based on the groundwater model.
Cannonsville Watershed
• Cannonsville Reservoir Basin – agricultural basin (1200 km2)
• Supply of New York City drinking water
• To avoid an $8 billion water filtration plant, need model analysis to help manage phosphorus
• Watershed subject to economic constraints if P violates the TMDL.
Monitoring Stations
[Map: monitoring stations on the subwatershed boundaries, rivers and streams – USGS flow gauges, climate stations, and sediment monitoring stations at Trout Creek, W. Br. Delaware R. @ Walton, Little Delaware R., W. Br. Delaware @ Delhi, Beerston, and Town Brook. Scale bar: 5–10 km.]
There are over 20,000 data points for this watershed.
Questions
• Using all this data, can we develop a model that is a useful forecasting tool to assess the impact of weather and phosphorus management actions on future loading to the reservoir?
• What phosphorus management strategies should be undertaken, if any?
I. Methodology for Sensitivity Analysis of a Model with Many Parameters: Application to Cannonsville Basin
• Joint work with Jennifer Benaman (Cornell Ph.D. in Civil and Environmental Engineering, 2003)
• Funded by EPA Star Fellowship
Sensitivity Analysis with Many Parameters
• Sensitivity Analysis measures the change in model output associated with a change (perturbation) in model input (e.g. in parameter values).
• Purposes include:
– To help select which parameters should be adjusted in a calibration and which can be left at default values.
– This makes multivariate sensitivity and uncertainty analysis more feasible for computationally expensive models.
Sensitivity Analysis with Many Parameters – Additional Purposes
– To prioritize additional data collection, and
– To estimate potential errors in model forecasts that could be due to parameter value errors.
• Sensitivity Analysis and calibration are difficult with a large number of parameters.
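The univariate (one-at-a-time) perturbation idea described above can be sketched in a few lines. This is a minimal illustration, not the authors' full method: `model` is a hypothetical stand-in for an expensive simulation, and the index used here is simply the normalized output change per unit relative perturbation.

```python
import numpy as np

def model(params):
    # Hypothetical stand-in for an expensive simulation: any scalar output.
    x, y, z = params
    return 3.0 * x + 0.1 * y ** 2 + 0.01 * z

def oat_sensitivity(model, base, delta=0.1):
    """One-at-a-time sensitivity: normalized change in output when each
    parameter is perturbed by a fraction `delta` of its base value."""
    base = np.asarray(base, dtype=float)
    f0 = model(base)
    indices = []
    for i in range(len(base)):
        p = base.copy()
        p[i] *= (1.0 + delta)            # perturb one parameter only
        # relative output change divided by relative input change
        indices.append(abs((model(p) - f0) / f0) / delta)
    return np.array(indices)

s = oat_sensitivity(model, base=[1.0, 2.0, 3.0], delta=0.1)
ranking = np.argsort(s)[::-1]            # most influential parameter first
```

Ranking parameters by such an index is what lets the analysis flag which parameters to carry into calibration and which to leave at default values.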
Questions
• Can we develop a sensitivity analysis method that:
– is robust (doesn’t depend strongly on our assumptions)?
– is computationally efficient for a large number of parameters (hundreds)?
– allows us to consider many different model outputs simultaneously?
• 160 parameters
– 35 basinwide
– 10 vary by land use (10 x 5 land uses)
– 7 vary by soil (7 x 10 soil types)
– 2 additional for corn and hay
– 1 additional for pasture
• Ranges obtained from literature, databases, and SWAT User’s Manual
Choose Parameters → Establish Parameter Ranges → Choose Output Variables of Concern
Application to Cannonsville Watershed
Output Variables of Concern
• Basinwide (average annual from 1994-1998)
– Surface water runoff
– Snowmelt
– Groundwater flow
– Evapotranspiration
– Sediment yield
• Location in-stream (monthly average over entire simulation)
– Flow @ Beerston
– Flow @ Trout Creek
– Flow @ Town Brook
– Flow @ Little Delaware River
– Sediment load @ Beerston
– Sediment load @ Town Brook
Final Results
Percentage of times each parameter is in the ‘Top 20’ under four weighting methods:
A = All Equal Weights, B = Focus on Beerston, C = Focus on Calibration, D = Focus on Basinwide Management

Parameter        A    B    C    D
APMBASIN       100  100  100  100
BIOMIXBASIN    100  100  100  100
CN2CSIL        100  100  100  100
CN2FRSD        100  100  100  100
CN2PAST        100  100  100  100
RSDCOPAST      100  100  100  100
SLSUBBSNBASIN  100  100  100  100
SMFMNBASIN     100  100  100  100
T_BASEPAST     100  100  100  100
T_OPTPAST      100  100  100  100
USLEKNY129     100  100  100  100
ESCONY129      100   75   75  100
SMTMPBASIN     100   75   75  100
LAT_SEDBASIN   100   50  100  100
CN2HAY          75   75   75   75
ESCONY132       75   75   75   50
GWQMNBASIN      75   75   75   75
TIMPBASIN       75   50   75   75
BIO_MINPAST     75   50   50   75
ROCKNY132       75   25   50   50
REVAPMNBASIN    50   50   50   75
ROCKNY129       50   25   50   25
USLEPCSIL       25   25   50   25
HVSTICSIL       25   25   25   50
USLECPAST       25   25   25   25
SMFMXBASIN      25    0    0   50
GSIPAST          0    0   25    0
ROCKNY026        0    0   25    0

Parameters at 100 in every column are in the top 20 for ALL cases; those at 75 are in the top 20 most of the time.
Computational Issues
• We have a robust method for determining the importance and sensitivity of parameters.
• An advantage is that the number of model simulations is independent of the number of output variables, sensitivity indices, or weighting factors considered in the combined sensitivity analysis. (Almost no extra computation is required to do many output variables, indices or weightings.)
• The number of simulations is simply the number required to do a single (non-robust) univariate sensitivity analysis multiplied by the number of perturbation methods (= 2 in this example).
Next Steps
• Once the most important parameters have been identified we can extend the analysis to more detailed analyses, including:
– Multivariate sensitivity analysis (changes in more than one parameter at a time)
– Uncertainty Analysis (e.g. GLUE)
• Both of these analyses are highly computationally demanding and can hence only be done with a small number of parameters.
• The (univariate) sensitivity analysis done here can identify the small number of parameters on which these analyses should be focused.
Questions
• Can we develop a sensitivity analysis method that:
– is robust (doesn’t depend strongly on our assumptions)?
– is computationally efficient for a large number of parameters (hundreds)?
– allows us to consider many different model outputs simultaneously?
– Yes, the results for Cannonsville indicate this is possible with this methodology.
– Models with longer simulation times require more total simulation time or fewer parameters.
II: Use of Response Surface Methods in Non-Convex Optimization, Calibration and Uncertainty Analysis
• Joint work with
– Pradeep Mugunthan (PhD Candidate in Civil and Environmental Engineering)
– Rommel Regis (Postdoctoral Fellow with PhD in Operations Research)
• Funded by three National Science Foundation (NSF) projects
Computational Effort for Trial and Error (Manual) Calibration
• Assume that you have P parameters and you want to consider N levels of each.
• Then the total number of combinations of possible sets of parameters is N^P.
• So with 10 parameters, considering only 2 values each (a very crude evaluation), there are 2^10 = 1,024 possible combinations, too many to evaluate all of them for a computationally expensive function.
• With 8 parameters, considering a more reasonable 10 values each gives 10^8 = 100 million possible combinations of parameters!
• With so many possibilities it is hard to find good solutions by trial and error with few (e.g. 100) function evaluations.
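The combinatorial explosion above is easy to verify directly:

```python
# Size of an exhaustive grid search: N levels for each of P parameters.
def grid_size(n_levels, n_params):
    return n_levels ** n_params

print(grid_size(2, 10))   # 1,024 combinations even at 2 crude levels
print(grid_size(10, 8))   # 100,000,000 combinations at 10 levels

# At 3 hours per simulation, even the crude 1,024-run grid would take
# 3,072 hours (about 128 days) if run serially.
hours = grid_size(2, 10) * 3
```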
Automatic Calibration
• We would like to find the set of parameter values (decision variables) such that
– the calibration error (objective function) is minimized,
– subject to constraints on the allowable range of the parameter values.
This is an Optimization Problem.
It can be a global optimization problem.
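The calibration-as-optimization setup can be sketched as follows. This is an illustration only: the model is a toy exponential standing in for an expensive simulation, and scipy's `differential_evolution` is used as a generic bound-constrained global optimizer (it is not the RBF method developed in this talk).

```python
import numpy as np
from scipy.optimize import differential_evolution

# Synthetic "observations" from a toy model with known true parameters
# (a stand-in for a real calibration data set).
t = np.linspace(0.0, 10.0, 50)

def simulate(params):
    a, k = params
    return a * np.exp(-k * t)            # stand-in for an expensive simulation

observed = simulate((2.0, 0.3))

def calibration_error(params):
    # Objective: sum of squared errors between model output and observations.
    return np.sum((simulate(params) - observed) ** 2)

# Constraints: allowable ranges of the parameter values.
bounds = [(0.1, 5.0), (0.01, 1.0)]
result = differential_evolution(calibration_error, bounds, seed=1, tol=1e-8)
best_params, best_error = result.x, result.fun
```

Note that a generic global optimizer like this may need thousands of objective evaluations, which is exactly what the surrogate methods below are designed to avoid.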
NSF Project 1: Function Approximation Algorithms for Environment Analysis with
Application to Bioremediation of Chlorinated Ethenes
• Title: “Improving Calibration, Sensitivity and Uncertainty Analysis of Data-Based Models of the Environment”.
• The project is funded by the NSF Environmental Engineering Program.
• The following slides will discuss the application of these concepts to uncertainty analysis.
“Real World Problem”: Engineered Dechlorination by Injection of Hydrogen Donor and Extraction
We have developed a user-friendly transport model of engineered anaerobic degradation of chlorinated ethenes that models chemical and biological species and utilizes MT3D and RT3D.
This model is the application for the function approximation research.
Optimization
• Because our model is computationally expensive, we need to find a better way than trial and error to get a good calibration set of parameters.
• Optimization can be used to efficiently search for a “best” solution.
• We have developed optimization methods that are designed for computationally expensive functions.
Optimization
• Our goal is to find the minimum of f(x), where x є D.
• We want to do very few evaluations of f(x) because it is “costly to evaluate.”
Here f(x) can be a measure of error between model prediction and observations, and x can be the parameter values.
Global versus Local Minima
[Figure: F(x) vs. X (parameter value), showing a local minimum and the global minimum.]
Many optimization methods only find one local minimum.
We want a method that finds the global minimum.
Experimental Design with Symmetric Latin Hypercube (SLHD)
• To fit the first function approximation we need to have evaluated the function at several points.
• We use a symmetric Latin Hypercube (SLHD) to pick these initial points.
• The number of points we evaluate in the SLHD is (d+1)(d+2)/2, where d is the number of parameters (decision variables).
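One common construction of a symmetric Latin hypercube is sketched below (this is a standard construction, not necessarily the exact design generator the authors used): each column is a Latin hypercube in which rows come in mirrored pairs, so the design is symmetric about the center of the domain.

```python
import numpy as np

def symmetric_latin_hypercube(m, d, rng=None):
    """Symmetric Latin hypercube design: m points in d dimensions on (0, 1)^d.
    Rows i and m-1-i sum to 1 in every column (symmetry about the center)."""
    rng = np.random.default_rng(rng)
    design = np.empty((m, d))
    k = m // 2
    for j in range(d):
        perm = rng.permutation(k) + 1          # levels 1..k for the lower half
        flip = rng.random(k) < 0.5             # randomly mirror each pair
        lo = np.where(flip, m + 1 - perm, perm)
        design[:k, j] = lo
        design[m - k:, j] = (m + 1 - lo)[::-1] # mirrored partner rows
        if m % 2 == 1:
            design[k, j] = (m + 1) / 2         # center point for odd m
    return (design - 0.5) / m                  # map levels to (0, 1)

d = 4                                          # number of parameters
m = (d + 1) * (d + 2) // 2                     # initial points, as in the talk
X = symmetric_latin_hypercube(m, d, rng=42)
```

Each column uses every level 1..m exactly once (the Latin property), and the (d+1)(d+2)/2 count matches the number of coefficients needed to fit a full quadratic model in d variables.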
One Dimensional Example of Experimental Design to Obtain Initial Function Approximation
[Figure: objective function f(x), a measure of error, vs. x (parameter value; one-dimensional example). Each plotted point is a costly function evaluation (e.g. over 0.5 hour of CPU time for one evaluation).]
Function Approximation with Initial Points from Experimental Design
[Figure: f(x) vs. x (parameters); a smooth approximation drawn through the initial evaluated points.]
In real applications x is multidimensional since there are many parameters (e.g. 10).
Update in Function Approximation with New Evaluation
[Figure: f(x) vs. x (parameter value); the approximation updated with a new evaluated point.]
The update is done in each iteration of each algorithm. The function approximation is a guess of the value of f(x) for all x.
Use of Derivatives
• We use the gradient-based methods only on the function approximations R(x) (for which accurate derivatives are inexpensive to compute).
• We do not try to compute gradients/derivatives for the underlying costly function f(x).
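The core surrogate step can be sketched as follows: fit a radial basis function interpolant R(x) through the evaluated points (here a cubic RBF with a linear polynomial tail, the form used in Gutmann-style methods), then search the cheap R(x) instead of the costly f(x). This is a simplified illustration of the idea, not the full algorithm, which also balances exploration against exploitation when picking the next point.

```python
import numpy as np

def fit_cubic_rbf(X, f):
    """Interpolate f through points X with a cubic RBF plus linear tail:
    s(x) = sum_i lam_i * ||x - x_i||^3 + c0 + c . x"""
    n, d = X.shape
    r = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    P = np.hstack([np.ones((n, 1)), X])
    A = np.block([[r ** 3, P], [P.T, np.zeros((d + 1, d + 1))]])
    coef = np.linalg.solve(A, np.concatenate([f, np.zeros(d + 1)]))
    lam, c = coef[:n], coef[n:]
    def surrogate(x):
        x = np.atleast_2d(x)
        r = np.linalg.norm(x[:, None, :] - X[None, :, :], axis=2)
        return (r ** 3) @ lam + c[0] + x @ c[1:]
    return surrogate

def f(x):
    # Toy stand-in for the costly function; 2 parameters for illustration.
    return np.sum((x - 0.3) ** 2, axis=-1)

rng = np.random.default_rng(0)
X = rng.random((12, 2))                 # points already evaluated
s = fit_cubic_rbf(X, f(X))

# Minimize the cheap surrogate on a dense grid (derivatives of s are also
# cheap, so a gradient-based local search could be used instead).
g = np.stack(np.meshgrid(np.linspace(0, 1, 101),
                         np.linspace(0, 1, 101)), -1).reshape(-1, 2)
x_next = g[np.argmin(s(g))]             # candidate for next costly evaluation
```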
Our RBF Algorithm
• Our paper on the RBF optimization algorithm will appear soon in the Journal of Global Optimization.
• The following graphs show a related RBF method labeled “Our RBF” as well as an earlier RBF optimization method suggested by Gutmann (2000) in the Journal of Global Optimization, labeled “Gutmann RBF”.
Comparison of RBF Methods on a 14-dimensional Schoen Test Function (Average of 10 trials)
[Plot: objective function (mean of the best value in 30 runs, 15–45) vs. number of function evaluations (120–300); curves for Our RBF (ExpRBF-L), Gutmann RBF, and Greedy RBF. Lower is better.]
Comparison of RBF Methods on a 12-dimensional Groundwater Aerobic Bioremediation Problem ( a PDE system)
(Average of 10 trials)
[Plot: objective function (mean of the best value in 10 runs, 400–1100) vs. number of function evaluations (80–200); curves for Our RBF (ExpRBF-L), Gutmann RBF, and Greedy RBF.]
The following results are from:
NSF Project 1: Function Approximation Algorithms for Environment Analysis with
Application to Bioremediation of Chlorinated Ethenes
Now a real costly function – DECHLOR: Transport Model of Anaerobic Bioremediation of Chlorinated Ethenes
• This model was originally developed by Willis and Shoemaker based on kinetics equations by Fennell and Gossett.
• This model will be our “costly” function in the optimization.
• Model based on data from a field site in California.
Complex model: 18 species at each of thousands of nodes of finite difference model
[Diagram: reaction network of donors (lactate, butyrate, propionate, acetate, H2) and chlorinated ethenes (PCE → TCE → DCE → VC → ethene) via the dechlorinator, with fermentation and methanogenesis pathways (Lac2Ace, Lac2Prop, But2Ace, Prop2Ace, Hyd2Meth).]
Example of Objective Function for Optimization of Chlorinated Ethene Model
SSE = Σ_{j=1}^{J} Σ_{i=1}^{I} Σ_{t=1}^{T} (Y_tij^o − Y_tij^s)²

where SSE is the sum of squared errors between observed and simulated chlorinated ethenes;
Y_tij^o is the observed molar concentration of species j at time t, location i;
Y_tij^s is the simulated molar concentration of species j at time t, location i;
t = 1 to T represents time points at which measured data are available;
j = 1 to J represents PCE, TCE, DCE, VC and ethene, in that order;
i = 1 to I is the set of monitoring locations.
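The SSE objective above is a direct array computation. A minimal sketch, with synthetic arrays standing in for the real observed and simulated concentrations:

```python
import numpy as np

rng = np.random.default_rng(0)
T, I, J = 4, 3, 5                       # time points, locations, species
# Y_obs[t, i, j]: observed molar concentration of species j at time t, location i
Y_obs = rng.random((T, I, J))
Y_sim = Y_obs + 0.1                     # stand-in for simulated concentrations

# Triple sum over t, i and j of the squared differences:
sse = np.sum((Y_obs - Y_sim) ** 2)      # here 60 * 0.1**2 = 0.6
```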
Algorithms Used for Comparison of Optimization Performance on Calibration
• Stochastic Greedy Algorithm
– Neighborhood defined to make the search global
– Neighbors generated from a triangular distribution around the current solution; moves only to a better solution.
• Evolutionary Algorithms
– Derandomized evolution strategy (DES) with lambda = 10, b1 = 1/n and b2 = 1/n^0.5 (Ostermeier et al. 1992)
– Binary or real genetic algorithm (GA): population size 10, one-point crossover, mutation probability 0.1, crossover probability 1
• RBF Function Approximation Algorithms
– RBF-Gutmann: radial basis function approach, with cycle length five and an SLH space-filling design
– RBF-Cornell: our radial basis function approach
• FMINCON
– Derivative-based optimizer in Matlab with numerical derivatives
• 10 trials of 100 function evaluations were performed for the heuristic and function approximation algorithms for comparison.
Comparison of algorithms for NS as objective function on a hypothetical problem
[Plot: −(Average NS) vs. number of function evaluations (30–90); curves for FMINCON, RBF-CORNELL (ours), RBF-GUT, FMINCON+RBF, DES, RealGA, BinaryGA. The lower curve is better. Average is based on 10 trials; the best possible value for −NS is −1; 28 experimental design evaluations were done.]
Boxplot comparing best objective value (CNS) produced by the algorithms in each trial over 10 trials
[Boxplot: “ours” marks our RBF method; markers indicate the average and an outlier.]
Conclusions
• Optimizing costly functions is typically done only once.
• The purpose of our examination of multiple trials is to see how well one is likely to do when solving the problem only once.
• Hence we want the method that has both the smallest Mean objective function value and the smallest Variance.
• Our RBF has both the smallest Mean and the smallest Variance.
• The second best method is Gutmann RBF, so RBF methods seem very good in general.
Alameda Field Data
• The next step was to work with a real field site.
• We obtained data from a DOD field site studied by a group (including Alleman, Morse, Gossett, and Fennell).
• Running the simulation model takes about three hours for one run of the chlorinated ethene model at this site because of the nonlinearities in the kinetics equations.
Site Layout
Range of objective values for SSE objective function at Alameda field site - Mean, min and max are shown for each
algorithm
[Chart: SSE (mM²), from 150,000 to 650,000, for DES, FA-Gutmann, FA-RS (ours), and FMINCON (gradient-based); the mean, min and max over trials are shown for each algorithm.]
Conclusions on RBF Optimization of Calibration
• Radial Basis Function Approximation Methods can be used effectively to find optimal solutions of costly functions.
• “Our RBF” performed substantially better than the previous RBF method by Gutmann on the difficult chlorinated ethene remediation problem, especially because our RBF is robust (small variance).
• Both Genetic algorithms and derivative-based search did very poorly.
• The two RBF methods did much better on the Alameda field data problem than other methods.
However, 300 hours is a long time to wait!
Solution: Parallel Algorithms
• We would like to be able to speed up calculations for costly functions by using parallel computers.
• To get a good speedup on a parallel computer, you need an algorithm that parallelizes efficiently.
• We are developing such an algorithm through a second NSF grant (from the Computer and Information Science Directorate).
III: Uncertainty Analysis
• Modelers have discovered that there is often more than one set of parameters that gives an “adequate” fit to the data.
• One approach to assessing the uncertainty associated with a model output is to look at the weighted mean and the variability of the output associated with all the sets of parameters that give an equally good fit.
[Figure: f(x) vs. x (parameters). More than one parameter value might give acceptable goodness of fit. If we impose a “filter” and allow only the acceptable points, then only the black points (those below the acceptable threshold) are incorporated in the analysis.]
Uncertainty Analysis: GLUE Approach
• GLUE is a methodology (by Beven and co-workers) used largely in watershed modeling (where computation times are not long).
Uncertainty Analysis via GLUE: Dots are Model Simulations of Parameter Combinations Chosen at
Random (Two Parameter Example)
[Scatter plot over parameter 1 and parameter 2: filled dots are parameter combinations that give R² greater than 0.75; open dots give R² less than 0.75.]
GLUE Methodology (used mostly in watershed modeling)
• Step 1: Select combinations of parameter values at random and simulate the model for each combination.
• Step 2: Compare the goodness of fit (e.g. R²) of each model simulation with the data.
• Step 3: Weight the output of the acceptable simulations to determine variability characteristics of the model output (e.g. mean and variance of the amount of contamination remaining after N years).
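The three GLUE steps above can be sketched as follows. This is a toy illustration: the exponential model is a hypothetical stand-in for the watershed model, and the R² > 0.75 filter mirrors the scatter-plot example above.

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0.0, 10.0, 50)

def simulate(params):
    a, k = params
    return a * np.exp(-k * t)          # stand-in for the watershed model

observed = simulate((2.0, 0.3)) + rng.normal(0.0, 0.02, t.size)

def r_squared(sim, obs):
    ss_res = np.sum((obs - sim) ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Step 1: random parameter combinations within the allowable ranges.
samples = np.column_stack([rng.uniform(0.1, 5.0, 2000),
                           rng.uniform(0.01, 1.0, 2000)])
# Step 2: goodness of fit for each simulation.
r2 = np.array([r_squared(simulate(p), observed) for p in samples])
# Step 3: keep only "behavioural" sets (R^2 > 0.75) and weight by fit.
keep = r2 > 0.75
weights = r2[keep] / r2[keep].sum()
output = samples[keep, 0] * samples[keep, 1]   # any model output of interest
mean = np.sum(weights * output)
var = np.sum(weights * (output - mean) ** 2)
```

Note that every one of the 2000 samples costs a full model run, and a stringent filter may leave very few behavioural points – exactly the problem discussed next.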
Problems with GLUE Methodology
• We applied GLUE to the Cannonsville Watershed SWAT model predictions for sediment (a very hard quantity to model).
• We did 20,000 Monte Carlo runs (which took about three weeks of computer time).
• Of the 20,000 runs, only two were within the allowable R². (Only two!)
• This does not adequately characterize uncertainty, and it is not computationally feasible to make more runs.
• For computationally expensive models like our groundwater problem or your Everglades problem, it is not feasible to run the model 20,000 times!
• Hence GLUE has the problem that it finds very few samples within an acceptable level (filter) if the filter is fairly stringent.
Groundwater Example Used for Numerical Comparison with GLUE
• 2-D confined aquifer contaminated with chlorinated ethenes.
• Same PDE equations as the earlier field case
• 400 m long, 100 m wide
• Modeled using a coarse 10 m x 10 m finite difference grid
– Simulation time for the 6-month calibration period was approximately 3/4 minute on a Pentium4® 3 GHz computer
– Typical simulation time for long-term forecast scenarios is of the order of several hours to days
Calibration Problem
• Calibration of 3 parameters was considered – 2 biological parameters and one biokinetic parameter
• Synthetic observations were generated for a period of 6 months using a known set of parameters
• Optimal calibration was attempted using a response surface (RS) optimization method (Regis and Shoemaker, 2004)
• GLUE based calibration/uncertainty assessment was also performed for comparison
Output Definition
• Output: the total moles of toxic compounds (chlorinated ethenes) remaining in the aquifer at the final time period. (This cannot be measured but can be estimated through the model.)
• Uncertainty in the Output was analyzed using GLUE and RS based methods
Goodness-of-fit Measure
• Nash-Sutcliffe Efficiency Measure (Nash and Sutcliffe, 1970)
• The optimization algorithm was set up to minimize CNS = 1 − NS, so that a CNS of zero is best.

NS = 1 − [ Σ_j Σ_t Σ_i (C_sim,j,t,i − C_obs,j,t,i)² ] / [ Σ_j Σ_t Σ_i (C_obs,j,t,i − C_avg,i)² ]

where C_sim and C_obs are the simulated and observed concentrations of species j at time t and location i, and C_avg,i is the average observed concentration at location i.
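The CNS measure can be computed directly. A minimal sketch; note the axis over which C_avg is averaged (per location, over species and time) is an assumption about the C_avg,i term in the formula:

```python
import numpy as np

def cns(C_sim, C_obs):
    """CNS = 1 - NS (Nash-Sutcliffe efficiency); 0 is a perfect fit.
    Arrays are indexed [j, t, i] = [species, time, location]. The observed
    average C_avg is taken per location i (assumed interpretation of the
    C_avg,i term in the formula)."""
    C_avg = C_obs.mean(axis=(0, 1), keepdims=True)
    return np.sum((C_sim - C_obs) ** 2) / np.sum((C_obs - C_avg) ** 2)

rng = np.random.default_rng(0)
C_obs = rng.random((5, 4, 3))           # 5 species, 4 times, 3 locations
perfect = cns(C_obs, C_obs)             # 0.0: NS = 1
worse = cns(C_obs + 0.5, C_obs)         # > 0: fit degrades
```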
Uncertainty Estimates for Output Total Moles of Chlorinated Ethenes Remaining
Bounds obtained using a filter of 0.01 for CNS
[Chart: total moles of chlorinated ethenes remaining (141.00–147.00) for G2000, G1000, G500, RS200, RSG20k, and TRUE; the number of points after applying the filter is shown for each (values shown: 5, 6, 12, 126, 35).]
Our Method 1 (RS200) uses 200 function evaluations.
Uncertainty Estimates for Output Total Moles of Chlorinated Ethenes Remaining
Bounds obtained using a filter of 0.01 for CNS
[Same chart as above.] GLUE with 500 function evaluations (G500).
Uncertainty Estimates for Total Moles of Chlorinated Ethenes
Bounds obtained using a filter of 0.01 for CNS
[Same chart as above.] Each marker is the mean; the range covers 99% of the data. TRUE is the true answer.
Uncertainty Estimates for Total Moles of Chlorinated Ethenes
Bounds obtained using a filter of 0.01 for CNS
[Same chart as above, with the number of points after applying the filter shown.] RS200 uses 200 function evaluations. G200 found 0 solutions (none) for this filter. G500 found only 5 solutions. TRUE is the true answer.
Uncertainty Estimates for Total Moles of Chlorinated Ethenes
Bounds obtained using a filter of 0.01 for CNS
[Same chart as above.] The mean estimate is almost perfect for our RS method and is far off for the GLUE method with 250% as many points evaluated!
Uncertainty Estimates for Total Moles of Chlorinated Ethenes
Bounds obtained using a filter of 0.01 for CNS
[Same chart as above.] Even with 2000 function evaluations, GLUE has a much worse mean than our RS method with only 1/10 as many function evaluations.
Our Method 2 (RSG)
• Step 1: Same as in Method 1.
• Step 2: Construct a function approximation surface of the output.
• Step 3: Draw a large number of samples from the function approximation. Do further function evaluations if the function approximation is negative and refit the function approximation.
• Step 4: Filter out points that are not acceptable and compute statistics.
• Determine statistical characteristics of the model output (e.g. mean and variance of the amount of contamination remaining after N years) at all acceptable points.
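The RSG steps above can be sketched as follows. This is a toy illustration: the `costly_cns` and `costly_output` functions are hypothetical stand-ins for the simulation, and scipy's `RBFInterpolator` serves as a generic response surface (not the authors' exact surrogate).

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

rng = np.random.default_rng(0)

def costly_cns(params):
    # Stand-in for the expensive simulation + goodness-of-fit computation.
    return np.sum((params - 0.4) ** 2, axis=-1)

def costly_output(params):
    # Stand-in for the model output (e.g. total moles remaining).
    return 140.0 + 10.0 * np.sum(params, axis=-1)

# Steps 1-2: ~200 costly evaluations (random here; the talk reuses the
# points already evaluated during RS calibration) and response surfaces.
X = rng.random((200, 3))
cns_surface = RBFInterpolator(X, costly_cns(X))
out_surface = RBFInterpolator(X, costly_output(X))

# Step 3: a very large cheap sample from the function approximations.
Z = rng.random((20000, 3))
cns_hat = cns_surface(Z)

# Step 4: filter to acceptable points and compute output statistics.
keep = cns_hat < 0.01
stats = out_surface(Z[keep])
mean, var = stats.mean(), stats.var()
```

The 20,000 surrogate evaluations cost essentially nothing, which is why RSG can supply many filtered points where GLUE finds only a handful.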
Uncertainty Estimates for Total Moles of Chlorinated Ethenes
Bounds obtained using a filter of 0.01 for CNS
[Same chart as above.] Our Method 2 with 200 function evaluations and 20,000 samples from the response surface.
Difference Between Method 1 and Method 2
The uncertainty analysis in Method 1 is based only on actual function evaluations.
The uncertainty analysis in Method 2 is based on a very large number of samples from the function approximation.
Comments on Results
• A strict filter produces very few points with GLUE
– even after 2000 function evaluations, only 12 points remain after filtering
• Our RS method produces the tightest bounds and also provides more points for uncertainty assessment with only 200 function evaluations
– Limited with respect to sample independence
• The RSG provides an improvement over GLUE
– Independent samples for uncertainty assessment
– A larger sample size for a tight filter
Effect of Relaxing Filter – CNS of 0.1
Empirical 98% Bounds obtained using a filter of 0.1 for CNS
[Chart: total moles of chlorinated ethenes (135.00–165.00) for RS200, G200, G500, G1000, G2000, RSG20k, and TRUE; numbers of points after filtering shown: 90, 12, 44, 84, 167, 1542.]
Comparison of percentage of points after filtering
[Chart: percentage of points remaining after filtering (0–120) vs. CNS filter (0.01, 0.1, 0.3, 1, inf) for RS200, G200, G500, G1000, G2000, RSG20k.]
Percentage of Points for Different Filters
[Chart: percentage of points remaining after filtering (0–50) vs. CNS filter (0.01, 0.1, 0.3) for RS200, G200, G500, G1000, G2000, RSG20k.]
Advantages of Method 2
• The samples are independent
• Reuses information from calibration
• Computationally cheap
– uses only the same number of costly function evaluations as in the regular RS optimization method (200 in these examples)
– can obtain goodness-of-fit and output values for many thousands of points
Summary
• Models can help us use data taken at a small scale and at discrete time points to understand and manage environmental processes over large spatial areas and time frames.
• Development of computationally efficient methods for automatic calibration, sensitivity analysis and uncertainty analysis is very important.
New Project 2: Parallel Optimization Algorithms
• Funded by the Computer Science (CISE) Directorate at NSF
• The method is general and can be used for a wide range of problems including other engineering systems in addition to environmental systems.
• This research is underway.
2. How are calibration, sensitivity analysis and uncertainty analysis used in environmental analyses?
3. What are the alternatives to sensitivity analysis and uncertainty analysis?
How do we address the uncertainties that are not directly related to parameter uncertainty, such as data uncertainty?
My NSF Projects
• NSF-Environmental Engineering: applications of methods to watershed and groundwater
• NSF-Advanced Computing: development of parallel algorithms for function approximation optimization
• NSF-Statistics: development of an integration of Bayesian statistical methods with function approximation optimization for computationally expensive functions.
• All this previously funded research can be useful in applications to the Everglades.