Upload
megan-jimenez
View
215
Download
1
Tags:
Embed Size (px)
Citation preview
Work-package 6Statistical integration
Allard de Wit & Raymond van der Wijngaart
Content of this workshop
Work package description About crop yield forecasting Yield forecasting models Statistical forecasting assumptions The CGMS Statistical Toolbox
Work package description
Statistical integration for an E-Agriculture service
To integrate the crop yield indicators generated by modelling platforms and those derived from remote sensing information. The indicators generated from the previous work-packages will be ingested into a statistic analysis platform which has to integrate different statistical approaches, to be regionally flexible and specific and to be reproducible for a posterior evaluation.
Work package description
Objectives Implement the CST for crop yield forecasting for
the target regions Load database with yield predictors derived
from crop modelling and/or remote sensing based approaches
Organize a workshop for piloting of the CST and improve understanding and experience with the tool by the end users.
Raise awareness and gain experience with the CST which should lead to improved yield forecasts.
Work package descriptionNo Work Package title Typ
eLeader Person-
monthsstart end
6 Statistic integration
Alterra 11 24 36
6.1 Statistic platform design
RTD Alterra 6.5 24 36
6.2 Statistic platform pilot
RTD INRA 4.5 30 36
No Deliverable name Nature Dissemination
Delivery date
6.1.1 CST setup for target regions
Report Public 36
6.2.1 Piloting and workshop report
Report Public 36
About crop yield forecasting
Estimates of crop yield or production for the current season before the harvest
For administrative regions Often calibrated against regional statistical
data Continuous “assimilation” of data as the
growing season progresses Improve accuracy during the growing season Better then a baseline forecast (i.e. average
or trend)
About crop yield forecasting
“the art of identifying the factors that determine the spatial and inter-annual variability of crop yields” (René Gommes, FAO 2003).
About those factors“ … AND IF IT GETS ENOUGH RAIN, AND SUN, AND IF IT ISN'T KILLED BY HAIL, AND IF IT ISN’T DAMAGED BY FROST, AND IF WE CAN GET IT OFF BEFORE IT’S COVERED BY SNOW, AND IF WE GET IT TO THE ELEVATORS, AND IF THE TRAINS ARE RUNNING, AND IF THE GRAIN HANDLERS AREN’T IN STRIKE, AND IF … “
Type of forecasting systems
Judgement, based on stakeholders that reach consensus on the expected yield given all available information
Statistical, based on functional relationships between a crop yield indicator and the crop yield statistics (e.g. time trend models and/or CGMS simulation result)
Combinations of the above (European MARS system)
Parameterize the model in time or in space?
Build a time-series model for one region and multiple years
Build a spatial model for multiple regions and one year
Combine the two above (even more difficult!)
Reasons for preferring a time-series modelSeveral effects: Socio-economic factors differ between
regions (example: Germany 7.24 ton/ha, Poland 3.44
ton/ha). Crop yields often show an upward (or
downward) trend over the years Magnitude of simulated year-to-year
variability in crop yield differs from variability in regional statistics.
Statistical forecasting models
Parametric models: Regression analyses: (multiple -) regression
between crop yield statistics and crop indicators Non-parametric models:
Scenario analyses: Find similar years and use these to forecast
Neural networks: train a neural network to recognize yield-indicator relationships
Statistical forecasting assumptions
Uses time-series of historic statistics and crop yield indicators
Parameterizes a forecasting model explaining the relation using a best fit criterion (mean squared error)
Crop yield = f( time-trend + indicators(i1, i2, …) ) Uses (multiple) linear regression The model parameters are derived at several time
steps during the growing season (i.e. each dekad) Forecast model is then applied in prognostic mode
to forecast the current season’s yield
Statistical forecasting assumptions
0
1970
year
simulated yield
observed yield
trend
2010 2011
simulated indicator
forecasted yield
2000
Long term technological trend CGMS predicts deviation from trend caused by weather
Statistical forecasting assumptions
Advantages: Simple, understandable Hypothesis testing (statistical significance) Provides models with predictive power
Example of regression analyses for winter-wheat in Hebei province
Winter-wheat yield -- Hebei province
2.5
3
3.5
4
4.5
5
5.5
6
1992 1994 1996 1998 2000 2002 2004
Year
Yie
ld [
ton
/ha
]
R 2̂ = 0.521
Time-trend analyses
Strange value, discarded!
Compare residuals with CGMS resultsHebei winter-wheat - Deviations from the trend
-0.5
-0.4-0.3
-0.2-0.1
0
0.10.2
0.30.4
0.5
1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004
Year
Yie
ld a
no
ma
ly [
ton
/ha
]
Hebei winter-wheat - CGMS pot. yield storage organs
0
1000
2000
3000
4000
5000
6000
1994 1995 1996 1998 2000 2001 2002 2003
Year
Bio
mas
s [k
g/h
a]
Fit linear model through yield anomaliesResiduals vs CGMS simulation results
-0.5
-0.25
0
0.25
0.5
3000 3500 4000 4500 5000 5500
CGMS simulated yield [kg/ha]
Yie
ld a
no
mal
y [t
on
/ha]
R2 = 0.601
Build and test complete forecasting model
Our complete forecasting model is specified by:
1. The time trend model (R2 = 0.521) +2. The yield anomaly model as a function of
the CGMS simulation results (R2 = 0.601)
The CGMS statistical toolbox (CST)
Manual analyses is error prone CST does several analyses: time trend analyses,
(multiple) regression analyses and scenario analyses
Each model is tested whether it improves prediction beyond the trend only
Hypothesis testing for determining significance of results
Lists the models that were analyzed
Lists the statistical indicators of each
model
Lists the t-values of each component in each model
(significance testing)
Coloring of according to
statistical significance and sign of the model
coefficient