Statistical simulation model for air
temperatures: non stationarity, non
linearity and boundedness
Thi Thu Huong HOANG (EDF R&D), Sylvie PAREY (EDF R&D), Didier DACUNHA-CASTELLE (Université Paris Sud)
1st of June, 2012
1- Simulation model for air temperatures- 1st June 2012
Outline
Introduction
Pre-processing
Model and estimation procedure
Application to air temperature
Conclusion and perspectives
INTRODUCTION
Context
• EDF is interested in the impact of climate change on energy.
• This will help outline the supply/demand balance of the 21st century and inform decision-making procedures related to adaptation strategies.
• We concentrate on temperature, a key parameter influencing energy:
  • it reduces or increases the demand
  • hot or cold waves can affect overhead lines
  • extreme temperatures can have adverse effects on thermal production and on renewable energy sources that are sensitive to climate variables
Objectives
Build a simulation model
• for (maximum or minimum) daily air temperature at a fixed location
• for a month or a season
• of good quality for both the bulk and the tails of the distribution
• able to easily produce a large number of realistic trajectories
Some features of numerical climate models
• Very large models, complexity, heavy computation
• Non-global chaos, chaotic sub-models
• Coupled PDEs: fluid mechanics, atmosphere, ocean, radiation scheme (solar variations), water cycle, greenhouse gases (CO2, …), aerosols
• Physical parameterizations
• Remaining uncertainties: clouds, carbon cycle, ice melting
• Uncertainties: how to estimate them?
• No stochasticity: pseudo-simulation, sensitivity to small variations in the initial conditions ⇒ limited number of trajectories
• Still difficulties in reproducing variability and extremes
[Diagram: initial conditions → MODEL → trajectories]
Difficulties in the stochastic modelling of temperature
• Large number of stations, complicated spatial correlations
• Non-stationary, non-linear
• Two periodicities (in mean and in variance), possibly not constant over long periods
• Boundedness: only a very careful application of extreme value theory shows that every model has to take this feature into account
• Continuous-time process versus discrete measurements: temperature is a continuous-time process but is observed at discrete time steps. How can the properties of a continuous-time model be applied to discrete observations?
PRE-PROCESSING: FROM NON-STATIONARITY TOWARDS STATIONARITY
Pre-processing
The aim is to remove the trends in mean and in variance and the (additive and multiplicative) seasonalities, to obtain a reduced series that is as stationary as possible.
The treatment uses both nonparametric and parametric approaches:
$X(t) = m(t) + S(t) + s(t)\,S_V(t)\,Z(t)$
$m(t), s(t)$: mean and scale functions; $S(t), S_V(t)$: seasonalities in mean and in scale
Estimation procedure: estimate $m(t)$ by loess and $S(t)$ by a trigonometric function from the series $X(t)$, then $s(t)$ by loess and $S_V^2(t)$ by a trigonometric function from the series $[X(t) - \hat m(t) - \hat S(t)]^2$.
For $m(t), s(t)$, the modified partitioned cross-validation (1) is used; for $S(t), S_V^2(t)$, the Akaike criterion is used.
The reduced series: $Z_t = (X_t - \hat m_t - \hat S_t) / (\hat s_t\,\hat S_{V,t})$
(1) Modified partitioned CV: new algorithm for correlated data (thesis of Hoang, 2010)
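The reduction step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: a centred moving average stands in for loess, the harmonics are fitted by projection (which coincides with least squares only for whole 365-day years), the window and number of harmonics are fixed by hand rather than chosen by cross-validation or the Akaike criterion, and the data are synthetic. All function names are hypothetical.

```python
import math
import random

PERIOD = 365  # days per year (leap days ignored in this sketch)


def moving_average(x, half_window):
    """Centred moving average; a crude stand-in for the loess estimate."""
    n = len(x)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half_window), min(n, i + half_window + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out


def seasonal_fit(x, n_harmonics):
    """Mean plus harmonics, fitted by projection (assumes whole years of daily data)."""
    n = len(x)
    fit = [sum(x) / n] * n
    for j in range(1, n_harmonics + 1):
        w = 2 * math.pi * j / PERIOD
        a = 2 / n * sum(v * math.cos(w * t) for t, v in enumerate(x))
        b = 2 / n * sum(v * math.sin(w * t) for t, v in enumerate(x))
        fit = [f + a * math.cos(w * t) + b * math.sin(w * t)
               for t, f in enumerate(fit)]
    return fit


def reduce_series(x, half_window=365, n_harmonics=2):
    """Return Z = (X - m - S) / (s * S_V) with all four components estimated."""
    m_hat = moving_average(x, half_window)                 # trend in mean
    S_hat = seasonal_fit([xi - mi for xi, mi in zip(x, m_hat)], n_harmonics)
    resid = [xi - mi - si for xi, mi, si in zip(x, m_hat, S_hat)]
    sq = [r * r for r in resid]
    s2_hat = moving_average(sq, half_window)               # trend in variance
    SV2_hat = seasonal_fit([q / s2 for q, s2 in zip(sq, s2_hat)], n_harmonics)
    # Guard against a non-positive fitted seasonal variance.
    return [r / math.sqrt(s2 * max(sv2, 1e-8))
            for r, s2, sv2 in zip(resid, s2_hat, SV2_hat)]


# Synthetic daily series: linear trend, seasonal mean, seasonal scale.
random.seed(0)
n = 15 * PERIOD
x = [12 + 0.0003 * t + 5 * math.cos(2 * math.pi * t / PERIOD)
     + (3 + math.cos(2 * math.pi * t / PERIOD)) * random.gauss(0, 1)
     for t in range(n)]
z = reduce_series(x)
mean_z = sum(z) / n
var_z = sum((v - mean_z) ** 2 for v in z) / n
print(round(mean_z, 3), round(var_z, 3))
```

The reduced series should come out roughly centred with variance close to one; the moving average is biased near the ends of the record, which is one reason a local regression such as loess is preferable in practice.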
MODEL AND ESTIMATION PROCEDURE FOR THE REDUCED SERIES
Characteristics of the reduced series
• short memory
• seasonality remains in the correlation and in the volatility
• cyclo-stationary
• bounded in the tails
• non-linearity
• volatility depends on the state
Studies:
• tests of trend (2) on the basic statistics of $Z_t$ (mean, variance, skewness, kurtosis)
• test of cyclo-stationarity (3) on the extremes of $Z_t$ (see Parey et al., 2012)
• study of the trend and the seasonality in the series $Z_t Z_{t-k}$
(2), (3): a new test proposed in Hoang (2010); the principle of the test is described in the next slide
Principle of our test of trend (or of stationarity)
Setting: $X_k = X(t_k)$, $t_k \in (0,1)$, $X_k \sim G(x, \theta)$ with a known or unknown distribution $G$, and $\theta(t) \in C$, a compact subset of $\mathbb{R}$
Hypothesis tested: $\theta$ is constant
Let $\hat c_n$ be the constant estimator of $\theta$ and $\hat\theta_n$ the nonparametric (spline or loess) estimator of $\theta$
Idea: compare these two estimators through the distance $\Delta = \|\hat\theta_n - \hat c_n\|$
The asymptotic behaviour of the test is proven (Hoang, 2010)
In practice: build a test on $\Delta$ (build the distribution of $\Delta$ under the H0 hypothesis by simulation if the law is known, or by bootstrap otherwise)
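As a rough illustration of the "in practice" step, the sketch below builds the null distribution of Δ by simulation for a Gaussian noise with known standard deviation. It is a minimal sketch under assumptions that are not in the slides: the constant estimator is the sample mean, a moving average replaces the spline or loess estimator, the distance is a discretised L² norm, and the names `smooth`, `delta` and `trend_test` are hypothetical.

```python
import math
import random


def smooth(x, half_window):
    """Moving-average estimate of theta(t); a stand-in for splines or loess."""
    n = len(x)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half_window), min(n, i + half_window + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out


def delta(x, half_window):
    """Discretised L2 distance between the smooth and the constant estimators."""
    c_hat = sum(x) / len(x)
    theta_hat = smooth(x, half_window)
    return math.sqrt(sum((th - c_hat) ** 2 for th in theta_hat) / len(x))


def trend_test(x, sigma, half_window=25, n_sim=200, seed=0):
    """Return (observed delta, p-value) with the H0 law of delta built by simulation."""
    rng = random.Random(seed)
    n = len(x)
    c_hat = sum(x) / n
    observed = delta(x, half_window)
    # Under H0 the series is a constant plus Gaussian noise of known sigma.
    null = [delta([rng.gauss(c_hat, sigma) for _ in range(n)], half_window)
            for _ in range(n_sim)]
    p_value = sum(d >= observed for d in null) / n_sim
    return observed, p_value


rng = random.Random(1)
n = 300
flat = [rng.gauss(0.0, 1.0) for _ in range(n)]                     # theta constant
trending = [2.0 * k / n + rng.gauss(0.0, 1.0) for k in range(n)]   # theta(t) = 2t
delta_flat, p_flat = trend_test(flat, sigma=1.0)
delta_trend, p_trend = trend_test(trending, sigma=1.0)
print(p_flat, p_trend)
```

When the law of the noise is unknown, the simulated samples would be replaced by bootstrap resamples of the residuals, as the slide indicates.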
Asymptotics of the test
Consider that the law of ε is known. Let $\theta_0$ be the true value of $\theta$.
We have:
• $\hat\theta_n \to \theta_0$ when $n \to \infty$ (always)
• $\hat c_n \to \theta_0$ when $n \to \infty$ (only if $\theta_0$ is constant!)
We proved (Hoang, 2010) the following (t is supposed to belong to (0,1): $t = t_k = k/n$).
Theorem (m.l.e. theory and Le Cam's point of view): when $\theta_0$ is not a constant, so that there is $a$ with $\|\theta_0 - c\| \ge a$ for all $c \in F$, we have almost surely, for $n$ high enough, $\|\hat\theta_n - \hat c_n\| > a/2 > 0$.
The theorem can also be proven for an unknown law of ε (based on least squares estimation).
Theory of the diffusion process with inaccessible boundaries
Temperature has Markov properties and is bounded → diffusion process with inaccessible boundaries $r_1$ and $r_2$ ($r_1 < \infty$, $r_2 < \infty$):
$dZ_t = b(Z_t)\,dt + a(Z_t)\,dW_t$
where b is the drift, a is the diffusion coefficient and W is a Brownian motion.
The invariant marginal density of the continuous-time process can be estimated from the discrete observations.
Moreover, we proved (Hoang, 2010) that if ξ < 0, the maximum domain of attraction of the continuous-time process is the same as that of an i.i.d. discrete sample with the same marginal density.
Conclusion: we can apply the extreme value properties of the bounded diffusion process to a discrete-time sample. We do not need the exact formula of the discrete Markov chain.
The SFHAR (seasonal functional heteroscedastic autoregressive) model
First-order Euler scheme of a discrete diffusion:
$Z_t = b(Z_{t-1}) + a(Z_{t-1})\,\varepsilon_t, \quad \varepsilon_t \sim N(0,1)$
Remark: this is a discrete approximation of the bounded diffusion. With a Gaussian noise, the model is not bounded but 'almost' bounded.
Extension: SFHAR model
$Z(t) = \Big[\theta_0 + \sum_{j=1}^{p}\Big(\theta_{1,j}\cos\frac{2\pi j t}{365} + \theta_{2,j}\sin\frac{2\pi j t}{365}\Big)\Big]\,Z(t-1) + a(t, Z_{t-1})\,\varepsilon_t, \quad \varepsilon_t \sim N(0,1)$
Form of a:
$\hat a^2(t, Z_{t-1}) = (Z_{t-1} - \hat r_1)(\hat r_2 - Z_{t-1}) \sum_{k=0}^{5}\sum_{j=1}^{p}\Big(\alpha_{1,j,k}\cos\frac{2\pi j t}{365} + \alpha_{2,j,k}\sin\frac{2\pi j t}{365}\Big) Z_{t-1}^{k}$
Estimate $a^2(t, Z_{t-1})$ with constraints $C(\hat r_1, t)$, $C(\hat r_2, t)$ and $\hat a^2(t) > 0\ \forall t$:
• zero outside the boundaries
• positive
• constraints C on the first derivative, from the continuous-time diffusion process (see thesis of Hoang, 2010):
$(a^2)'(r_1) = \dfrac{2\,b(r_1, t)}{1 - 1/\xi_1}$ and $(a^2)'(r_2) = \dfrac{2\,b(r_2, t)}{1 - 1/\xi_2}$
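To make the recursion concrete, here is a minimal simulation sketch of a seasonal heteroscedastic AR(1) step of the kind described on this slide. It is a simplified illustration, not the fitted model: a single harmonic is used, the polynomial in $Z_{t-1}$ inside the volatility is reduced to degree 0, and the boundaries and coefficients (`r1`, `r2`, `theta`, `alpha`) are made-up values chosen so that the simulated series has a variance near one.

```python
import math
import random

OMEGA = 2 * math.pi / 365


def drift(t, z_prev, theta):
    """Seasonal AR(1) drift: (theta0 + theta1 cos + theta2 sin) * Z_{t-1}."""
    phi = theta[0] + theta[1] * math.cos(OMEGA * t) + theta[2] * math.sin(OMEGA * t)
    return phi * z_prev


def volatility(t, z_prev, r1, r2, alpha):
    """Square root of a^2, where a^2 vanishes at the boundaries r1 and r2."""
    season = alpha[0] + alpha[1] * math.cos(OMEGA * t) + alpha[2] * math.sin(OMEGA * t)
    a2 = (z_prev - r1) * (r2 - z_prev) * season
    return math.sqrt(max(a2, 0.0))  # a^2 is set to zero outside the boundaries


def simulate(n_days, theta, alpha, r1, r2, seed=0):
    """Iterate Z_t = b(t, Z_{t-1}) + a(t, Z_{t-1}) * eps_t with eps_t ~ N(0, 1)."""
    rng = random.Random(seed)
    z, path = 0.0, []
    for t in range(n_days):
        z = drift(t, z, theta) + volatility(t, z, r1, r2, alpha) * rng.gauss(0.0, 1.0)
        path.append(z)
    return path


# Illustrative (hypothetical) parameter values.
path = simulate(20 * 365, theta=(0.7, 0.05, 0.0), alpha=(0.034, 0.007, 0.0),
                r1=-4.0, r2=4.0)
mean_path = sum(path) / len(path)
var_path = sum((v - mean_path) ** 2 for v in path) / len(path)
print(round(var_path, 3), round(min(path), 3), round(max(path), 3))
```

Because the noise is Gaussian, nothing forbids a step beyond `r1` or `r2`; the volatility simply shrinks near the boundaries, which is the sense in which the discrete model is only 'almost' bounded.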
Estimation procedure and optimisation
• Estimation of the autoregressive part (AR(1)): choose the number of cosine and sine terms by an Akaike criterion
• Estimation of the volatility through maximum likelihood with constraints:
  • find the initial values using least squares estimation: this is a least squares problem with equality and inequality constraints → transform it into a quadratic programming problem and use the algorithm of Goldfarb and Idnani (1982, 1983)
  • maximum likelihood estimation with constraints: use the least squares results as initial values and use the Nelder and Mead algorithm (1965)
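The choice of the number of harmonics in the autoregressive part can be illustrated as follows. This sketch makes simplifying assumptions that the slides do not state: the coefficients are fitted by ordinary least squares through the normal equations, the Akaike criterion is the Gaussian constant-variance form $n\log(\mathrm{RSS}/n) + 2k$, and the data are simulated with a known seasonal coefficient. The helper names are hypothetical.

```python
import math
import random


def solve(A, c):
    """Solve A beta = c by Gaussian elimination with partial pivoting."""
    n = len(c)
    M = [row[:] + [ci] for row, ci in zip(A, c)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    beta = [0.0] * n
    for i in range(n - 1, -1, -1):
        beta[i] = (M[i][n] - sum(M[i][k] * beta[k] for k in range(i + 1, n))) / M[i][i]
    return beta


def fit_seasonal_ar1(z, p):
    """Least squares fit of Z_t = (theta0 + sum_j cos/sin terms) Z_{t-1}; returns (beta, AIC)."""
    rows, y = [], []
    for t in range(1, len(z)):
        g = [1.0]
        for j in range(1, p + 1):
            w = 2 * math.pi * j * t / 365
            g += [math.cos(w), math.sin(w)]
        rows.append([gi * z[t - 1] for gi in g])
        y.append(z[t])
    k = 2 * p + 1
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    c = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(A, c)
    rss = sum((yi - sum(b * ri for b, ri in zip(beta, r))) ** 2
              for r, yi in zip(rows, y))
    n = len(y)
    return beta, n * math.log(rss / n) + 2 * k


# Simulated series whose AR coefficient is 0.6 + 0.2 cos(2 pi t / 365).
rng = random.Random(0)
z = [0.0]
for t in range(1, 10 * 365):
    phi = 0.6 + 0.2 * math.cos(2 * math.pi * t / 365)
    z.append(phi * z[-1] + rng.gauss(0.0, 1.0))

fits = {p: fit_seasonal_ar1(z, p) for p in range(4)}
best_p = min(fits, key=lambda p: fits[p][1])
best_beta = fits[best_p][0]
print(best_p, [round(b, 3) for b in best_beta[:3]])
```

For the volatility part, the constrained least squares and constrained maximum likelihood steps would need a quadratic programming solver and a simplex optimiser, which are outside the scope of this sketch.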
APPLICATION TO AIR TEMPERATURE
Observation data
Maximum or minimum daily temperature for one month or one season at one location
Europe: ECA&D, 1950-2009
• homogeneous ("useful") Tx and Tn series
• <3 years of missing data ⇒ 106 Tx series and 120 Tn series
United States: NCDC, Global Historical Climatology Network
• series with <4 years of missing data
• beginning before 1966 and ending after 2008
• 86 Tx series, 85 Tn series
Validation criteria
Residuals:
• whiteness of the residuals and of the squared residuals
• tests of normality of the residuals
Comparison to the observations:
• basic statistics: mean, variance, skewness, kurtosis
• marginal law: density function, test of homogeneity
• quantiles
• temperature at a fixed date, for X
• extreme parameters, for Z
• proportion of outliers, for X: we expect x% of the observations to be lower than the estimated x-percentile of the simulations
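The proportion-of-outliers criterion amounts to comparing the observations with a quantile estimated from the pooled simulations. A minimal sketch, assuming a linearly interpolated empirical quantile (the slides do not say which quantile estimator is used) and hypothetical function names:

```python
import math


def empirical_quantile(values, p):
    """Empirical quantile of order p with linear interpolation between order statistics."""
    s = sorted(values)
    pos = p * (len(s) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])


def tail_proportion(observations, simulations, p, upper):
    """Share of observations beyond the simulated quantile of order p."""
    q = empirical_quantile(simulations, p)
    if upper:
        count = sum(o > q for o in observations)
    else:
        count = sum(o < q for o in observations)
    return count / len(observations)


# Toy data: the simulated values are 1, 2, ..., 100.
simulations = [float(v) for v in range(1, 101)]
observations = [50.0, 99.5, 100.0, 10.0]
upper_share = tail_proportion(observations, simulations, 0.99, upper=True)
lower_share = tail_proportion(observations, simulations, 0.01, upper=False)
print(upper_share, lower_share)
```

For a well-calibrated model, the share of observations above the simulated Q99% should be close to 1%; a much smaller share suggests that the simulations are less bounded than the observations, and a much larger one that they are more bounded.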
Results: Europe (Déols in France)
Model for a month and for a season
• a estimated non-parametrically by splines
• a estimated parametrically
Tmax: a tends to increase with Z(t-1)
Tmin: a tends to decrease with Z(t-1)
[Figure: estimated a against Z(t-1), 8 Jan; panels Tmax and Tmin; curves: non-parametric, parametric without constraints, parametric with constraints]
Tmin: the lower bound is unrealistic because ξ ≈ 0
Results: Europe (Déols in France)
The residuals: normality is not rejected by the Kolmogorov-Smirnov test, but it is rejected about half of the time (especially for winter and summer) by the Shapiro-Wilk or Anderson-Darling tests, which give more weight to the tails.
[Figure panels: Tmax, August; Tmin, January]
Results: Europe (Déols in France)
Tmax, July
Statistic   X observations   X simulations               Z observations   Z simulations
Mean        25.247           25.263 (24.947, 25.632)     0.003            0.007 (-0.066, 0.093)
Variance    18.084           17.826 (15.88, 19.682)      0.957            0.935 (0.843, 1.019)
Skewness    0.297            0.434 (0.31, 0.551)         0.365            0.439 (0.308, 0.576)
Kurtosis    -0.466           0.412 (0.133, 0.712)        -0.404           0.446 (0.165, 0.807)

Tmax, summer
Statistic   X observations   X simulations               Z observations   Z simulations
Mean        24.235           24.232 (24.046, 24.401)     -0.001           -0.002 (-0.046, 0.04)
Variance    20.097           19.245 (18.188, 20.255)     1.01             0.981 (0.931, 1.04)
Skewness    0.32             0.435 (0.339, 0.512)        0.39             0.575 (0.488, 0.656)
Kurtosis    -0.195           0.534 (0.331, 0.779)        -0.317           0.624 (0.407, 0.875)

Tmin, January
Statistic   X observations   X simulations               Z observations   Z simulations
Mean        0.944            0.987 (0.554, 1.37)         -0.004           0.002 (-0.09, 0.081)
Variance    23.24            22.014 (19.79, 23.951)      0.976            0.949 (0.846, 1.038)
Skewness    -0.571           -0.308 (-0.523, -0.137)     -0.522           -0.324 (-0.541, -0.157)
Kurtosis    0.836            0.275 (-0.189, 1.259)       -0.269           0.283 (-0.181, 1.119)

Tmin, winter
Statistic   X observations   X simulations               Z observations   Z simulations
Mean        1.289            1.242 (0.986, 1.513)        0.007            -0.006 (-0.06, 0.052)
Variance    21.398           25.191 (23.574, 26.714)     0.99             1.181 (1.106, 1.249)
Skewness    -0.402           -0.319 (-0.434, -0.202)     -0.311           -0.271 (-0.378, -0.16)
Kurtosis    0.457            0.553 (0.279, 0.812)        0.089            0.43 (0.204, 0.682)

• The simulations represent the mean and variance correctly, but not the skewness and kurtosis
• The results seem better for a month than for a season
Results: Europe (Déols in France)
[Figure panels: Tmax, July; Tmax, summer; Tmax, October; Tmax, autumn]
• The results are better for individual months than for seasons
• The results are better for the intermediate seasons (spring, autumn), whose distributions are more symmetric
Results: Europe (Déols in France)
The observed temperatures are usually within the confidence interval of the simulations, except in very special cases
Results: Europe (Déols in France)
Tmax, July
• 0.2% of observations higher than the estimated Q99% from simulations
• 1% of observations higher than the estimated Q98% from simulations
Tmax, summer
• 0.45% of observations higher than the estimated Q99% from simulations
• 1.4% of observations higher than the estimated Q98% from simulations
Tmin, January
• 1.5% of observations lower than the estimated Q1% from simulations
• 3.0% of observations lower than the estimated Q2% from simulations
Tmin, winter
• 0.6% of observations lower than the estimated Q1% from simulations
• 1.4% of observations lower than the estimated Q2% from simulations
Panel annotations: "Problem when ξ ≈ 0, simulations more bounded" (one panel); "Simulations less bounded" (two panels)
Results: United-States (Minneapolis)
Model for a month and for a season
• a estimated non-parametrically by splines: the same behaviour as in Europe
• a estimated parametrically
Tmax: a tends to increase with Z(t-1)
Tmin: a tends to decrease with Z(t-1)
[Figure panels: Tmax; Tmin]
Results: United-States (Minneapolis)
The residuals: normality is rejected more often than for Déols by the normality tests (Kolmogorov-Smirnov, Shapiro-Wilk and Anderson-Darling)
[Figure panels: Tmax, July; Tmin, January]
Results: United-States (Minneapolis)
Tmax, July
Statistic   X observations   X simulations               Z observations   Z simulations
Mean        28.598           28.45 (28.191, 28.655)      0.042            0.003 (-0.065, 0.057)
Variance    14.93            14.515 (13.57, 15.681)      0.993            0.99 (0.928, 1.065)
Skewness    -0.104           0.025 (-0.101, 0.103)       -0.129           0.023 (-0.087, 0.108)
Kurtosis    0.08             0.102 (-0.088, 0.317)       0.056            0.094 (-0.062, 0.286)

Tmax, summer
Statistic   X observations   X simulations               Z observations   Z simulations
Mean        27.192           27.221 (27.053, 27.414)     -0.006           0.001 (-0.039, 0.049)
Variance    19.526           19.498 (18.686, 20.443)     1.002            0.992 (0.961, 1.033)
Skewness    -0.303           -0.136 (-0.216, -0.058)     -0.131           0.068 (-0.007, 0.124)
Kurtosis    0.179            0.209 (0.046, 0.361)        -0.1             0.041 (-0.079, 0.156)

Tmin, January
Statistic   X observations   X simulations               Z observations   Z simulations
Mean        -15.022          -14.86 (-15.33, -14.407)    0.042            -0.001 (-0.06, 0.057)
Variance    62.904           60.028 (54.635, 65.691)     0.993            0.976 (0.884, 1.065)
Skewness    -0.129           -0.184 (-0.284, -0.09)      -0.129           -0.17 (-0.268, -0.076)
Kurtosis    -0.76            -0.064 (-0.215, 0.151)      0.056            -0.105 (-0.255, 0.08)

Tmin, winter
Statistic   X observations   X simulations               Z observations   Z simulations
Mean        -12.856          -12.90 (-13.28, 12.559)     0.008            0 (-0.052, 0.044)
Variance    61.257           59.037 (55.691, 62.453)     1.002            0.983 (0.937, 1.031)
Skewness    -0.239           -0.287 (-0.353, -0.229)     -0.186           -0.257 (-0.321, -0.207)
Kurtosis    -0.659           -0.079 (-0.179, 0.024)      -0.731           -0.076 (-0.162, 0.011)

• The simulations represent all the basic statistics rather well
• These results are better than those for Déols
• The simulations represent the observations better when the observations are less asymmetric and their (normalized) kurtosis is close to that of a Gaussian law (0)
Results: US (Minneapolis)
• The results are better for individual months than for seasons
• The simulations perform better in the bulk than for Déols because the skewness and kurtosis of the observations (X) in Minneapolis are closer to those of a Gaussian law
[Figure panels: Tmax, July; Tmax, summer; Tmin, January; Tmin, winter]
Results: US (Minneapolis)
Tmax, July
• 0.8% of observations higher than the estimated Q99% from simulations
• 2.1% of observations higher than the estimated Q98% from simulations
Tmax, summer
• 0.45% of observations higher than the estimated Q99% from simulations
• 1.3% of observations higher than the estimated Q98% from simulations
Tmin, January
• 0.35% of observations lower than the estimated Q1% from simulations
• 1.3% of observations lower than the estimated Q2% from simulations
Tmin, winter
• 0.4% of observations lower than the estimated Q1% from simulations
• 1.2% of observations lower than the estimated Q2% from simulations
Panel annotations: "Simulations rather good for extremes" (one panel); "Simulations less bounded" (three panels)
CONCLUSION & PERSPECTIVES
Conclusion
• Volatility is not constant but linear in the centre
• Temperature is generally bounded
• Our model performs better in the bulk when the skewness and kurtosis of the observations are close to those of a Gaussian law (because a Gaussian law is used for the residuals)
• Our model generally gives better results for individual months than for seasons
• Taking constraints at the boundaries into account improves the simulation of the extremes (see Hoang et al., 2011)
• The model (with constraints at the boundaries) is not suitable when the data are not bounded (ξ ≥ 0)
Perspectives
• Use transformations to make the series more symmetric
• Bootstrap the residuals instead of simulating the Gaussian law N(0,1), to address the problem of skewed data
References
Hoang T.T.H., Modélisation de séries chronologiques non stationnaires, non linéaires. Application à la définition des tendances sur la moyenne, la variabilité et les extrêmes de la température de l'air en Europe, 2010
Parey S., Hoang T.T.H and Dacunha-Castelle D., The role of variance in the evolution of observed temperature extremes in Europe and in the United States, submitted to Climatic Change, 2012, under revision
D. Goldfarb and A. Idnani. Dual and Primal-Dual Methods for Solving Strictly Convex Quadratic Programs. In J. P. Hennart (ed.), Numerical Analysis, Springer-Verlag, Berlin, 226–239, 1982
D. Goldfarb and A. Idnani. A numerically stable dual method for solving strictly convex quadratic programs. Mathematical Programming, 27, 1–33, 1983
Nelder, J. A. and Mead, R. A simplex algorithm for function minimization. Computer Journal 7, 308–313, 1965
Hoang T.T.H., Dacunha-Castelle D. and Benmenzer G., Estimation of a diffusion model with trends taking into account the extremes. Application to temperature in France, Environmetrics, 22 (3), 464-479, 2011
Principle of our test of trend (or of stationarity)
The considered model: X(t) = θ(t) + ε(t), where the law of ε is known or unknown
Hypothesis tested: θ is constant / θ is not constant
Let:
• $\hat c_n$ be the constant estimator of θ (by m.l.e. if the law of ε is known, by least squares if not)
• $\hat\theta_n$ be the nonparametric estimator of θ (by splines if the law of ε is known, by loess if not)
Idea: compare these two estimators through the L² distance $\Delta = \|\hat\theta_n - \hat c_n\|$
The asymptotic behaviour of the test is proven (Hoang, 2010)
In practice: build a test on ∆ (build the empirical distribution of ∆ under the H0 hypothesis from simulations if the law is known, or from bootstrap samples if not)