Research Journal in Engineering and Applied Sciences (RJEAS) 2(5): 370-375, Emerging Academy Resources (2013) (ISSN: 2276-8467)
www.emergingresource.org
MODELING AND FORECASTING MAXIMUM TEMPERATURE OF WARRI CITY - NIGERIA
¹Daniel Eni, ¹Adeyeye Fola J. and ²Duke, S. Orok Okor
¹Department of Mathematics and Computer Science, Federal University of Petroleum Resources, Effurun, Nigeria.
²Department of Computer Science, Cross River University of Technology.
Corresponding Author: Daniel Eni
ABSTRACT
The influence of temperature on environmental factors and human endeavors cannot be overemphasized. It spans agricultural processes such as the rate of soil respiration and degradation, the carbon cycle and the seasons, among many others, up to global climatic change. These underline the importance of temperature and the need to develop modeling and forecasting tools as strategies for long-term planning. Herein lies the motivation for studying and modeling patterns of temperature in Warri, a town in Nigeria, using seasonal ARIMA models. We obtained historical data of average monthly maximum temperature for the period 1994-2008 for the study, and those of 2009 for forecast validation of the chosen model, from the National Meteorological Center, Oshodi, Nigeria. Models were identified by visual inspection of both the sample ACF and sample PACF to postulate several possible models; the best was then chosen using the Residual Sum of Squares (RSS) and Akaike's Information Criterion (AIC), complemented with Schwartz's Bayesian Criterion (SBC). The chosen model is the SARIMA (1, 1, 1) (0, 1, 2) process, which met the criterion of model parsimony with a low AIC value of -818.31944 and SBC value of -805.84746. Model adequacy checks show that the model is appropriate. The model was used to forecast temperature for 2009, and the forecast compared very well with the observed empirical data for 2009. Researchers will find this result useful in building a temperature component into a general climatic forecasting model. Environmental managers who require long-term temperature forecasts will also find the identified model very useful.
© Emerging Academy Resources
KEYWORDS: ACF, PACF, ARIMA, Maximum Temperature, AIC, SBC.
INTRODUCTION
Climate change is one of the biggest environmental threats to food, availability of water, forest biodiversity and livelihood (Chung et al., 2011). Temperature variability is one of the most influential components of climate change. Many studies have been conducted by scientists and researchers on the influence of temperature on the natural environment and human endeavors. Bond-Lamberty and Thompson (2010) found the role of temperature in controlling the rate of soil respiration to be positive, while Bindraban and coauthors (2012) found the impact of temperature on soil degradation to be high. On the other hand, Grace (2004) found a positive correlation between ambient temperature and the carbon cycle, while Steltzer and Post (2009) found the same to be the case with growing seasons. These studies underline the importance of temperature and the need to develop modeling and forecasting tools as strategies for long-term planning. In fact, Romilly (2005) noted that modeling the variation of the Earth's surface temperature and making dependable forecasts underlie the foundation of sound environmental policies. Herein lies the motivation for studying and modeling patterns of temperature in Warri, a town in Nigeria.

Warri is located at latitude 5°31′N and longitude 5°45′E and has two distinct seasons: the rainy season (May-October) and the dry season (November-April). It has a mean annual temperature of 32.8 °C and an annual rainfall of 2673.8 mm. Warri is a major oil city located in the Niger Delta region of Nigeria. The main focus of this work is to determine an appropriate seasonal ARIMA model that can adequately predict temperature for Warri city.

The seasonal multiplicative ARIMA (Autoregressive Integrated Moving Average) model is of the form

$$\phi(B)\,\Phi(B^s)\,z_t = C + \theta(B)\,\Theta(B^s)\,a_t \qquad (1)$$

where

$$z_t = \nabla^d \nabla_s^D \log y_t ,$$

$y_t$ is the observed temperature data at time $t$, $\nabla = 1 - B$ is the regular difference and $\nabla_s = 1 - B^s$ is the seasonal difference. $D$ is the order of the seasonal difference while $d$ is the order of the regular difference. $C$ is a constant and $a_t$ is a white noise process. $\phi(B)$ is the regular autoregressive polynomial of order $p$ while $\Phi(B^s)$ is the seasonal autoregressive polynomial of order $P$. Similarly, $\theta(B)$ is the regular moving average polynomial of order $q$ while $\Theta(B^s)$ is the seasonal moving average polynomial of order $Q$. Sometimes, the model (1) is denoted SARIMA (p, d, q)(P, D, Q). The ARIMA model (1) is said to be invertible if all the roots of the moving average polynomial $\theta(B)\,\Theta(B^s)$ lie outside the unit circle. Note that the differenced series $z_t$ is taken to be stationary.

Many models can be formed from (1). These models are made of either past observed values together with a white noise term, white noise terms only, or a mixture of both. The major contribution of Box and Jenkins was to provide a general strategy in which three stages of model building were given prominence: model identification, estimation and diagnostic checks (see, for example, Hipel et al. (1977) and McLeod (1995)). Several researchers and scientists have used these models for technical and scientific studies. Burinskiene and Rudzkieme (2005) used ARIMA models to model and forecast tourism development in Lithuania, while Kohansai and Rezazachen (2013) modeled and predicted the water stability level in Zarrin Dast town, among several others.

MATERIAL AND METHOD
We obtained historical data of average monthly maximum temperature for the period 1994-2008 for the study, and those of 2009 for forecast validation of the chosen model, from the National Meteorological Center, Oshodi, Nigeria. To detect the possible presence of seasonality, trend, time-varying variance and other nonlinear phenomena, we inspect the time plot of the observed data side by side with the plots of the sample autocorrelation function (ACF) and sample partial autocorrelation function (PACF). This helps us determine the possible order of differencing and the necessity of a logarithmic transform to stabilize the variance. Non-stationary behavior is indicated by the refusal of both the ACF values $\rho_k$ and the PACF values $\phi_{kk}$ to die out quickly. Also, the possible need for seasonal differencing is indicated by large ACF values $\rho_k$ at lags $s, 2s, \ldots, ns$. Our technique is to apply both simple and seasonal differencing until the data are stationary. Stationary behavior is indicated by either a cut or an exponential decay of the ACF values $\rho_k$ as well as the PACF values $\phi_{kk}$.
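The transformation $z_t = \nabla^d \nabla_s^D \log y_t$ described above can be sketched in a few lines of Python. This is only an illustration, not the software the authors used: the function names `difference` and `stationary_transform` are hypothetical, and the input is the 1994 and 1995 columns of the appendix followed by the January 1996 value, chosen just to have more than 13 observations.

```python
import math

def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

def stationary_transform(y, d=1, D=1, s=12):
    """Take logs, then apply d regular and D seasonal (period s) differences."""
    z = [math.log(v) for v in y]
    for _ in range(d):
        z = difference(z, 1)
    for _ in range(D):
        z = difference(z, s)
    return z

# Appendix values for 1994 and 1995, followed by January 1996.
y = [33, 34.2, 34.2, 33.2, 33, 31.1, 29.2, 28.9, 29.8, 31.6, 31.1, 33.7,
     33.4, 34.4, 34.2, 33.4, 32.4, 31.3, 29.1, 29.4, 30.1, 30.8, 33.2, 32.8,
     32.7]
z = stationary_transform(y)

# One regular and one seasonal difference drop d + s*D = 13 observations.
print(len(y), len(z))
```

Each differencing pass shortens the series by its lag, so with d = 1, D = 1 and s = 12 the transformed series has 13 fewer points than the original.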
Model identification is by comparing the theoretical patterns of the ACF and PACF of the various ARIMA models with those of the sample ACF and PACF computed from the empirical data (Janacek and Swift, 1993). A suitable model is inferred by matching these patterns. Generally (Brooks, 2002), ARIMA (0, d, q) is indicated by spikes up to lag q and a cut to zero thereafter of the ACF values $\rho_k$, complemented by an exponential decay or damped sine wave of the PACF values $\phi_{kk}$. Conversely, ARIMA (p, d, 0) is identified by an exponential decay or damped sine wave of the ACF values $\rho_k$, complemented by spikes up to lag p and a cut to zero thereafter of the PACF values $\phi_{kk}$. When the process is an ARIMA (0, d, q) × (0, D, Q), spikes will be noticed in the ACF up to lag q + Qs, while ARIMA (p, d, 0) × (P, D, 0) is indicated by spikes up to lag p + Ps and a cut to zero thereafter of the PACF. However, the mixed SARIMA model is difficult to identify by visual inspection of ACF and PACF plots only.

In this work, we use the model identification discussed above to give a rough guess of possible values of p, q, P and Q, from which several models are postulated. We then use the model selection criteria of the Residual Sum of Squares, RSS (Box and Jenkins, 1976), and Akaike's Information Criterion, AIC (Akaike, 1974), to choose the best model. The AIC computation is based on the formula

$$\mathrm{AIC} = -2\log L + 2m ,$$

where $m = p + q + P + Q$ is the number of parameters in the model and $L$ is the likelihood function. The best model is the one with the lowest AIC value. It is however noted that the likelihood is likely to increase with the addition of more parameters to the model. This further reduces the value of the AIC, leading to the choice of a model with many parameters. Wei (1990) emphasizes the need for the chosen model to meet the criteria of model adequacy and parsimony. For this reason we complement the RSS and AIC with Schwartz's Bayesian Criterion, SBC (Schwartz, 1978). The SBC computation is based on the formula

$$\mathrm{SBC} = -2\log L + m\log n ,$$

where $m = p + q + P + Q$ is the number of parameters in the model, $L$ is the likelihood function and $n$ is the number of observations. The SBC introduces a penalty function to check excess parameters in the model.

Having identified a suitable SARIMA model, the next stage is the estimation of its parameters, which is done by the exact maximum likelihood method due to Melard (1984). Forecasting is by least squares, using an algorithm due to Brockwell and Davis (1991). When the estimated parameters are not significant, we do correlation analysis to remove redundant parameters.
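The two criteria above differ only in their penalty term, so SBC - AIC = m(log n - 2). The sketch below writes both formulas as small Python functions and uses that relation to back out the n implied by the AIC and SBC values reported in Table 1 for SARIMA (1, 1, 1)(0, 1, 2). The function names are hypothetical, and the implied n is an inference from the tabulated numbers rather than a figure stated in the paper.

```python
import math

def aic(log_lik, m):
    """AIC = -2 log L + 2m, with m the number of model parameters."""
    return -2.0 * log_lik + 2.0 * m

def sbc(log_lik, m, n):
    """SBC = -2 log L + m log n, with n the number of observations."""
    return -2.0 * log_lik + m * math.log(n)

def implied_n(aic_value, sbc_value, m):
    """Solve SBC - AIC = m (log n - 2) for n."""
    return math.exp((sbc_value - aic_value) / m + 2.0)

# Table 1 row for SARIMA (1, 1, 1)(0, 1, 2): m = p + q + P + Q = 4.
n_hat = implied_n(-818.31944, -805.84746, 4)
print(round(n_hat, 1))  # close to 167
```

A value near 167 would be consistent with 180 monthly observations (1994-2008) less the 13 lost to one regular and one seasonal difference of period 12.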
The test for model adequacy requires residual analysis, which is done by inspecting the ACF of the residuals obtained by fitting the identified model. If the model is adequate, the residuals should be a white noise process. Under the assumption that the residual is a white noise process, the standard error of the autocorrelation values should be approximately $1/\sqrt{n}$ (Anderson, 1942). Hence, under the white noise assumption, 95% of the autocorrelation values should fall within the range $\pm 1.96/\sqrt{n}$. If more than 5% fall outside this range, the residual process is not white noise. We complement the visual inspection of the residual ACF with the portmanteau test of Ljung and Box (1978). This test provides a Q statistic defined by

$$Q = n(n+2)\sum_{k=1}^{m} (n-k)^{-1}\, r_k^2 \qquad (2)$$

where $r_k$ is the autocorrelation of the residuals at lag $k$, $m$ is here the number of lags included in the sum, and $n = N - d - sD$ is the number of observations remaining after differencing a series of length $N$. $Q$ is approximately distributed as $\chi^2_{m-p-q-P-Q}$. The technique is to choose a level of significance and compare the computed $Q$ with the tabulated $\chi^2$ value with $m-p-q-P-Q$ degrees of freedom. If the model is inappropriate, the $Q$ value will be inflated when compared with the tabulated $\chi^2$.

RESULT AND DISCUSSION
To decide on the presence of trend and time-varying variance, we inspect the time plot of the Warri maximum temperature data in Fig 1 side by side with the ACF and PACF of the data, shown in Fig 2 and Fig 3 respectively.
Fig 1. Time Plot of Maximum Temperature [x-axis: MONTH, period 12; y-axis: Max Temp]
Fig 2. ACF Plot of Max Temp [x-axis: Lag Number; y-axis: ACF; coefficients shown with confidence limits]
Fig 3. PACF Plot of Max Temp [x-axis: Lag Number; y-axis: Partial ACF; coefficients shown with confidence limits]
Examination of Fig 1 clearly shows the presence of time-varying variance and seasonal variation, while the refusal of the ACF and PACF values to decay in Figs 2 and 3 respectively indicates a regular trend. However, we are unable to decide at this stage on the presence or otherwise of a seasonal trend. We take logarithms and a first regular difference so as to stabilize the variance and remove the trend. A time plot of maximum temperature after the logarithm and first difference transform is shown in Fig 4.
Fig 4. Time Plot of Maximum Temp (Logarithm Transform and First Difference) [x-axis: MONTH, period 12; y-axis: Max Temp]
On inspecting Fig 4, we note the strong presence of seasonal factors and suspect the presence of seasonal trend. This is confirmed by very high spikes at and around seasonal lags of the ACF as shown in Fig 5.
Fig 5. ACF Plot (Logarithm and First Difference Transform) [x-axis: Lag Number; y-axis: ACF; coefficients shown with confidence limits]
We complete the data preparation process by additionally performing a first order seasonal difference and the time plot is shown in Fig 6.
Fig 6. Time Plot of Max Temp (Logarithm, First Difference and Seasonal Difference Transform) [x-axis: MONTH, period 12; y-axis: Max Temp]
Visual examination of Fig 6 shows that the process is now stationary. For clarity of discussion, we from now on refer to the maximum temperature after the logarithm, first regular difference and first seasonal difference transformations as the Stationary Process of the Maximum Temperature. Hence we expect a seasonal ARIMA process of the form

$$\mathrm{SARIMA}\,(p, 1, q)(P, 1, Q)_{12} .$$

The orders p, q, P and Q are identified by visual inspection of the ACF and PACF of the stationary process of the maximum temperature, shown in Figs 7 and 8, to propose several possible models, followed by the use of the AIC and SBC model selection criteria to pick the most appropriate model. We expect the ACF in Fig 7 to cut at lag q + Qs. We notice a cut after lag 25, which with s = 12 corresponds to q + Qs = 1 + 2(12) = 25 and so suggests a moving average parameter of order one (q = 1) and a seasonal moving average parameter of order two (Q = 2). Similarly, from the PACF in Fig 8, we notice a cut at lag 25, suggesting an AR parameter of order one (p = 1) and a seasonal autoregressive parameter of order two (P = 2). Since our strategy is not to have mixed seasonal factors, we postulate two models from which, based on the model selection criteria of RSES, AIC and SBC, the best is selected.
Fig 7. ACF Plot of Stationary Process of Max Temp [x-axis: Lag Number; y-axis: ACF; coefficients shown with confidence limits]
Fig 8. PACF of the Stationary Process of Max Temp [x-axis: Lag Number; y-axis: Partial ACF; coefficients shown with confidence limits]
The two models are SARIMA (1, 1, 1) (0, 1, 2) and SARIMA (1, 1, 1) (2, 1, 0). We extend the search to models around the two already mentioned. The result is shown in Table 1.

Table 1: Postulated Models and Performance Evaluation
Model                        RSES         AIC           SBC
SARIMA (1, 1, 1)(2, 1, 0)    .07842732    -797.81253    -785.34056
SARIMA (1, 1, 1)(0, 1, 2)    .06904940    -818.31944    -805.84746
SARIMA (1, 1, 0)(1, 1, 2)    .08470936    -783.31059    -770.83862
SARIMA (1, 1, 1)(1, 1, 2)    .06774596    -817.60735    -802.01738
SARIMA (1, 1, 0)(0, 1, 2)    .08598198    -783.37908    -774.0251
SARIMA (0, 1, 1)(1, 1, 2)    .07209436    -809.62574    -797.15376
SARIMA (0, 1, 1)(0, 1, 2)    .07849310    -799.52263    -790.16865
SARIMA (0, 1, 0)(1, 1, 2)    .10477684    -750.3175     -740.96352
SARIMA (0, 1, 0)(0, 1, 2)    .10536211    -751.00055    -744.76457

From Table 1, we note that in terms of AIC and SBC, the SARIMA (1, 1, 1) (0, 1, 2) model performed best. However, it is in competition with SARIMA (1, 1, 1) (1, 1, 2), which has the lowest RSES. This notwithstanding, we choose SARIMA (1, 1, 1) (0, 1, 2) as the best in terms of model parsimony and performance based on AIC and SBC. We estimated the parameter values of the chosen model as shown below.

Table 2: Parameters in the Model
Parameter    B            SEB          T-RATIO      APPROX. PROB.
AR1          .23653389    .07697176    3.072996     .00248460
MA1          .97208876    .03819124    25.453188    .00000000
SMA1         .66546734    .13438867    4.951811     .00000182
SMA2         .22351861    .09816789    2.276901     .02409261

We note that all the parameters are significant. The chosen model is mathematically of the form

$$(1 - 0.2365B)(1 - B)(1 - B^{12})\log y_t = (1 - 0.9720B)(1 - 0.6655B^{12} - 0.2235B^{24})\,a_t$$

$$(1 - 0.2365B)\,x_t = (1 - 0.9720B - 0.6655B^{12} + 0.6469B^{13} - 0.2235B^{24} + 0.2172B^{25})\,a_t$$

$$x_t = 0.2365\,x_{t-1} + a_t - 0.9720\,a_{t-1} - 0.6655\,a_{t-12} + 0.6469\,a_{t-13} - 0.2235\,a_{t-24} + 0.2172\,a_{t-25}$$

where $x_t = (1 - B)(1 - B^{12})\log y_t$.
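The cross terms in the expanded moving-average side of the fitted model come from multiplying the regular and seasonal MA polynomials. A minimal Python check of that product, using the Table 2 estimates rounded to four decimals, is sketched below; the helper `poly_mul` and the names `theta` and `Theta` are hypothetical and introduced only for this illustration.

```python
def poly_mul(a, b):
    """Multiply two polynomials in B, each given as a {power: coefficient} dict."""
    out = {}
    for i, ai in a.items():
        for j, bj in b.items():
            out[i + j] = out.get(i + j, 0.0) + ai * bj
    return out

# Rounded estimates from Table 2.
theta = {0: 1.0, 1: -0.9720}                  # regular MA: 1 - 0.9720 B
Theta = {0: 1.0, 12: -0.6655, 24: -0.2235}    # seasonal MA: 1 - 0.6655 B^12 - 0.2235 B^24

ma = poly_mul(theta, Theta)

# Print the coefficient attached to each power of B in the product.
for power in sorted(ma):
    print(f"B^{power}: {ma[power]:+.4f}")
```

The product has terms at powers 0, 1, 12, 13, 24 and 25 only, and the two cross terms (powers 13 and 25) are the products of the regular MA coefficient with each seasonal MA coefficient, which is why they carry a positive sign.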
To verify the suitability of the model, we plot the autocorrelation values of the residual against lag as shown in Fig 9.
Fig 9. ACF Plot of Residuals [x-axis: Lag Number; y-axis: ACF; coefficients shown with confidence limits]
We note on inspection of Fig 9 that there is no significant spike at any lag, indicating that the residual process is random. We complement this with the portmanteau test of Ljung and Box. Computation of the Q value of the portmanteau test, using the first 25 autocorrelation values of the residuals, gives 18.468. Since this is below the tabulated chi-square value of 32.7 with 21 degrees of freedom (25 lags less the 4 estimated parameters) at the 5% level of significance, we conclude that the model is a good fit.
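The residual check just described can be sketched as follows. The functions compute the sample autocorrelations, the Q statistic of equation (2) and the comparison with a tabulated chi-square value. The function names are hypothetical, the critical value is passed in by hand rather than computed, and the short residual list is made up purely to exercise the code; only the figures 18.468, 25, 21 and 32.7 come from the text.

```python
import math

def sample_acf(x, max_lag):
    """Sample autocorrelations r_1..r_max_lag of the sequence x."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    return [
        sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n)) / c0
        for k in range(1, max_lag + 1)
    ]

def ljung_box_q(r, n):
    """Q = n(n+2) * sum_{k=1}^{m} r_k^2 / (n - k), where m = len(r)."""
    return n * (n + 2) * sum(rk ** 2 / (n - k) for k, rk in enumerate(r, start=1))

def is_adequate(q, critical_value):
    """Retain the model when Q does not exceed the tabulated chi-square value."""
    return q <= critical_value

# Figures quoted in the text: Q = 18.468 from 25 lags, 25 - 4 = 21 degrees
# of freedom, tabulated 5% critical value 32.7.
print(is_adequate(18.468, 32.7))  # True

# Hypothetical residuals, used only to show the calls.
resid = [0.01, -0.02, 0.015, -0.005, 0.0, 0.02, -0.01, -0.015, 0.005, 0.01]
r = sample_acf(resid, 3)
bound = 1.96 / math.sqrt(len(resid))  # approximate 95% white-noise limit
print(ljung_box_q(r, len(resid)), bound)
```

In practice the residuals would come from the fitted model, and the number of lags and degrees of freedom would be chosen as in the text.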
Forecast and Model Validation
Below is the 2009 forecast using SARIMA (1, 1, 1) (0, 1, 2), together with the empirically observed data for the year.

Table 3: Forecast for 2009
Month        Jan     Feb     March   April   May     June    July    Aug     Sept    Oct     Nov     Dec
Forecast     33.06   34.28   33.98   33.08   32.64   30.82   28.99   29.88   29.87   31.20   32.19   33.39
Observed     33.4    34.4    34.2    33.4    32.4    31.3    29.1    29.4    30.1    30.8    33.2    32.8
Difference   -0.34   -0.12   -0.22   -0.32   0.24    -0.48   -0.11   0.48    -0.23   0.4     -1.01   0.59

A t-test of equality of means shows that the difference between the forecast and observed means is not significant at the 1% level of significance. We therefore conclude that the chosen model can adequately be used to forecast maximum temperature.

CONCLUSION
We have shown that time series ARIMA models can be used to model and forecast maximum temperature. The identified SARIMA (1, 1, 1) (0, 1, 2) model has proved to be adequate in forecasting maximum temperature for at least one year. Researchers will find this result useful in building a temperature component into a general climatic forecasting model. Environmental managers who require long-term temperature forecasts will also find the identified model very useful. However, because the data span only fifteen years, we have not been able to identify the changing pattern of fluctuations of maximum temperature over a century, as this would require at least one hundred years of data.

REFERENCES
Anderson, R. L. (1942). Distribution of Serial Correlation Coefficient. Annals of Mathematical Statistics 13(1), 1-13.
Akaike, H. (1974). A New Look at Statistical Model Identification. IEEE Transactions on Automatic Control 19(6), 716-723.
Bindraban, P. S. and Coauthors (2012). Assessing the Impact of Soil Degradation on Food Production. Current Opinion in Environmental Sustainability 4, 478-488.
Box, G. E. P. and Jenkins, G. M. (1976). Time Series Analysis: Forecasting and Control. Holden-Day, San Francisco, USA.
Brockwell, P. J. and Davis, R. A. (1991). Time Series: Theory and Methods. Springer.
Brooks, C. (2002). Introductory Econometrics for Finance. Cambridge University Press, UK.
Chung, E. S., Park, K. and Lee, K. S. (2011). The Relative Impact of Climate Change and Urbanization on the Hydrological Response of a Korean Urban Watershed. Hydrological Processes 25, 544-560.
Grace, J. (2004). Understanding and Managing Global Carbon Cycle. Journal of Ecology 92, 189-202.
Hipel, K. W., McLeod, A. I. and Lennox, W. (1977). Advances in Box-Jenkins Modeling: Model Construction. Water Resources Research 13, 567-575.
Ljung, G. M. and Box, G. E. P. (1978). On the Measure of Lack of Fit in Time Series Model. Biometrika 65, 297-303.
Mayhew, P. J., Jarkins, E. B. and Banton, T. B. (2008). A Long-term Association Between Global Temperature and Biodiversity, Origin and Estimation on the Fossil Record. Proceedings of the Royal Society B 275, 47-53.
McLeod, A. I. (1995). Diagnostic Checking of Periodic Autoregression Models With Application. The Journal of Time Series Analysis 15, 221-233.
Melard, G. (1984). A Fast Algorithm for the Exact Likelihood of Autoregressive-Moving Average Models. Applied Statistics 33(1), 104-119.
Romilly, P. (2005). Time Series Modeling of Global Mean Temperature for Managerial Decision Making. Journal of Environmental Management 76, 61-70.
Schwartz, G. E. (1978). Estimating the Dimension of a Model. Annals of Statistics 6(2), 461-464.
Steltzer, H. and Post, E. (2009). Seasons and Life Cycles. Science 324, 886-887.

APPENDIX: Maximum Temperature (°C) for Warri (1994-2009)

        1994  1995  1996  1997  1998  1999  2000  2001  2002  2003  2004  2005
JAN     33    33.4  32.7  32.8  32.5  32.2  33    32.8  33.1  33.1  32.8  32.3
FEB     34.2  34.4  33.1  34.1  34.2  32.6  33.8  34.1  34.6  34.7  34.2  34.8
MARCH   34.2  34.2  33.3  32.8  33.3  33.5  34.7  33.9  33.2  33.8  34.5  33.5
APRIL   33.2  33.4  32.9  32    32.2  33    33.2  32.7  32.5  33.1  32.7  33.2
MAY     33    32.4  32.5  31.8  31.5  32.5  32.5  32.3  32.3  32.5  31.6  32.7
JUNE    31.1  31.3  30.9  30.1  31.1  30.8  30.6  30.6  30.5  30    30.9  30.6
JULY    29.2  29.1  29.1  28.8  29.1  28.1  28.4  29.7  29.2  28.9  28.7  28.7
AUG     28.9  29.4  29    29.4  29.6  29.5  28.1  27.9  28.7  28.6  28.5  29.4
SEPT    29.8  30.1  28.8  30.4  30.1  28.7  29.7  29.7  28.9  30    30.4  30.9
OCT     31.6  30.8  30.5  31.2  32.4  29.7  29.9  30.1  30.3  32.1  30    31.2
NOV     31.1  33.2  33.2  33.3  32.7  32.5  32.6  32.8  32.6  32.9  31.4  33.6
DEC     33.7  32.8  32.7  32.6  32.3  33.5  33.3  33.4  33.4  32.4  33.2  33.3

        2006  2007  2008  2009
JAN     32.5  33.3  33    33.4
FEB     34.3  34.1  34.2  34.4
MARCH   33.4  33.7  34.2  34.2
APRIL   33    33    33.2  33.4
MAY     32.4  33    33    32.4
JUNE    29.5  31.1  31.1  31.3
JULY    27.8  28.9  29.2  29.1
AUG     27.9  29.1  28.9  29.4
SEPT    29.5  30.5  29.8  30.1
OCT     31.2  32.5  31.6  30.8
NOV     32.8  33.4  31.1  33.2
DEC     33.2  33.1  33.7  32.8
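As a supplement to the forecast validation in Table 3, the sketch below summarises the 2009 forecast errors with a few standard measures: mean error, mean absolute error and root mean squared error. It also computes a paired t statistic on the monthly differences. That statistic is an assumed variant shown for illustration only; the paper does not say which form of t-test the authors applied, so the number it produces should not be read as their result.

```python
import math

# Forecast and observed rows of Table 3 (January to December 2009).
forecast = [33.06, 34.28, 33.98, 33.08, 32.64, 30.82,
            28.99, 29.88, 29.87, 31.20, 32.19, 33.39]
observed = [33.4, 34.4, 34.2, 33.4, 32.4, 31.3,
            29.1, 29.4, 30.1, 30.8, 33.2, 32.8]

errors = [f - o for f, o in zip(forecast, observed)]
n = len(errors)

mean_error = sum(errors) / n                       # average signed error (bias)
mae = sum(abs(e) for e in errors) / n              # mean absolute error
rmse = math.sqrt(sum(e * e for e in errors) / n)   # root mean squared error

# Paired t statistic on the differences (assumption: not necessarily the authors' test).
sd = math.sqrt(sum((e - mean_error) ** 2 for e in errors) / (n - 1))
t_stat = mean_error / (sd / math.sqrt(n))

print(f"mean error {mean_error:.3f}, MAE {mae:.3f}, RMSE {rmse:.3f}, t {t_stat:.3f}")
```

The largest single miss is in November, where the forecast falls about one degree below the observed value.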