STAT 443 FORECASTING PROJECT Bharat Khanna Nabil Jalil Umer Iqbal Taimur Mazhar




Time Series Analysis of the Public Sector from 1981-2009

Introduction

The topic of interest was the trend in the number of people employed by the government. The natural first approach was to fit the data to a linear regression model using explanatory variables, which were chosen in the order GDP, total wages paid and exchange rate. Employment by the government has historically been relatively stable, showing an increasing linear trend. Harsh winters would also imply a seasonal structure in employment in a country such as Canada.

Keeping this in mind, the group set out to find a model capable of forecasting future employment patterns.

Summary

Preliminary testing showed the linear regression model to be a poor fit. This led to an analysis of the acf and pacf of the employment data, and eventually a SARIMA(0,1,1)x(0,1,1) model with a seasonal period of 12 was chosen after testing many candidates. This model showed good predictive power on the testing set and was hence judged adequate for predicting future movements in employment.
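For reference, the chosen model can be written out with the backshift operator B (so that B applied to a series shifts it back one month). This is only a sketch: it assumes the series is log-transformed and the seasonal period is 12, as in the fitted model call shown later in the report. The symbols X_t (employment in month t), Z_t (white noise), theta and Theta are notation introduced here, and the plus signs in the moving-average factors follow the sign convention documented for R's arima-style fits.

```latex
(1 - B)(1 - B^{12}) \log X_t = (1 + \theta B)(1 + \Theta B^{12}) Z_t,
\qquad Z_t \sim \mathrm{WN}(0, \sigma^2)
```

Under this convention, theta and Theta correspond to the ma1 and sma1 coefficients in the fitted output.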


Analysis

The first step was to inspect the data itself and look for trends. The time series plot looks as follows:

Fig. 1 Fig. 2

From the plot in Fig. 1, we see an upward trend with time. We suspect seasonality, since nearly identical periodic waves appear every year. The variance also appears to increase slightly with time.

At this point the last three months of data were set aside as the testing set. The time series of the remaining training set is shown in Fig. 2.


The group started with a linear regression model having GDP, Wages, Time and Exchange Rate as explanatory variables. Time was included because of the time trend displayed in the time series plot of employment. The results of the first run showed that Exchange Rate was not a significant factor, so it was dropped from the model. The R results for the remaining variables are as follows:

Call:
lm(formula = Employment ~ Time + Wages + GDP)

Residuals:
    Min      1Q  Median      3Q     Max
-323429  -40515    5281   51680  149931

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  1.644e+06  5.039e+04  32.631  < 2e-16 ***
Time        -3.680e+03  2.111e+02 -17.437  < 2e-16 ***
Wages        1.835e-04  5.508e-06  33.319  < 2e-16 ***
GDP          4.648e-06  1.028e-06   4.521 8.58e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 72590 on 332 degrees of freedom
Multiple R-squared: 0.8916, Adjusted R-squared: 0.8906
F-statistic: 909.9 on 3 and 332 DF, p-value: < 2.2e-16
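As a quick consistency check on the output above, the adjusted R-squared can be reproduced from the reported R-squared and degrees of freedom. The sketch below is in Python rather than the R used in the report, and the variable names are chosen here for illustration.

```python
# Values copied from the regression output above.
r_squared = 0.8916
residual_df = 332      # n - k - 1
n_predictors = 3       # Time, Wages, GDP

# Number of observations implied by the residual degrees of freedom.
n_obs = residual_df + n_predictors + 1

# Adjusted R-squared penalizes R-squared for the number of predictors.
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / residual_df

print(n_obs)
print(round(adj_r_squared, 4))
```

The implied sample size of 336 is consistent with 28 years of monthly observations in the training set.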


The large R-squared value looked promising, since it means the model explains most of the variation in employment within the sample. The variables Time, Wages and GDP all appeared significant. However, when the residuals were analyzed, there was significant correlation between them, which led the group to reject the linear regression model. The following are the time series plot of the residuals and the acf and pacf plots of the residuals:

Fig. 3

From the above plot, we see that the variance does not appear to be constant, indicating heteroscedasticity: the residuals fluctuate about the mean with an amplitude that changes over time. The plot also does not appear to be very symmetrical about the mean.


Fig. 4 Fig. 5

From the very slow decay in the acf plot, and from inspection of the time series plot of the residuals in Fig. 3, it was evident that the residuals were correlated. We carried out a Durbin-Watson test to verify this and got the following:

> dwtest(regmodel)

        Durbin-Watson test

data:  regmodel
DW = 1.2372, p-value = 3.416e-13
alternative hypothesis: true autocorrelation is greater than 0

The very small p-value leads us to reject the null hypothesis of no autocorrelation in favour of positive first-order serial correlation in the residuals.
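To make the statistic concrete, the following is a minimal Python sketch (the report itself used R's dwtest) of how a Durbin-Watson statistic is computed from a residual series. The toy residuals are hypothetical and are not the project data. The last lines use the standard approximation DW ≈ 2(1 - r1) to read off the lag-1 autocorrelation implied by the reported value.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squares."""
    num = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    den = sum(e ** 2 for e in residuals)
    return num / den

# Hypothetical residuals, for illustration only (not the project data).
toy = [1.0, 0.8, 0.9, 0.4, -0.2, -0.6, -0.5, 0.1, 0.6, 0.7]
dw_toy = durbin_watson(toy)

# DW is approximately 2 * (1 - r1), where r1 is the lag-1 autocorrelation.
reported_dw = 1.2372
implied_r1 = 1 - reported_dw / 2

print(round(dw_toy, 3))
print(round(implied_r1, 4))
```

Values of DW well below 2, as here, point to positive autocorrelation; values near 2 point to none.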


The acf plot shows extremely slowly decaying autocorrelations and once again shows that the linear regression model is inadequate, in that its residuals are in fact correlated. A runs test and a Bartels test were also run, and the very low p-values, below 5 percent for both, confirmed that we had significant evidence to reject the null hypothesis that the residuals were uncorrelated.

Fig. 6 Fig. 7

It was clear the data needed differencing to get rid of the upward trend. The data was differenced and the time series plot and acf and pacf plots were as follows:

Fig. 8

We also applied a variance-stabilizing log transformation to the data, shown in the time series plot on the left above. The variance did not really stabilize; it looks more as if there are two different variances. We also tried stabilizing the data with the square root function, but that did not help much either, and the result was quite similar to that of the log transformation. Differencing did get rid of the time trend, as the data no longer increases with time.
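The transformations discussed above can be sketched in a few lines. This is an illustrative Python version (the report used R), run on a hypothetical series made of a linear trend plus a repeating 12-month pattern; the helper name `difference` and all data values are assumptions made for the example. A lag of 12 gives a seasonal difference for monthly data.

```python
import math

def difference(series, lag=1):
    """Return the lag-`lag` difference x[t] - x[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Hypothetical monthly series: linear trend plus a repeating 12-month pattern.
pattern = [0, 1, 3, 4, 6, 8, 9, 8, 6, 4, 2, 1]
series = [1000 + 5 * t + 10 * pattern[t % 12] for t in range(48)]

logged = [math.log(x) for x in series]          # variance-stabilizing transform
first_diff = difference(series, lag=1)          # removes the linear trend
seasonal_diff = difference(first_diff, lag=12)  # removes the 12-month pattern

print(len(first_diff), len(seasonal_diff))
print(set(seasonal_diff))
```

On this artificial series the two differences remove the trend and the seasonal pattern exactly, leaving only zeros; on real data they leave a stationary-looking remainder to be modelled.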


Fig. 9 Fig. 10

The acf plot had smoothed out significantly. However, the peaks at lag 12 were still very noticeable. The pacf showed similar spikes at lags 12 and 24. The time series plot of the differenced data also showed seasonal variation. It was hence decided to follow up on our initial intuition that the data has a seasonal period of 12 months, and the data was differenced seasonally. The resulting acf and pacf plots and the time series plot of the differenced data are as follows:

Fig. 11

Most of the seasonal variation seems to have disappeared, and the only anomaly in the data seems to occur at around the 175th to the 250th period mark. If we observe the initial data plot in Fig. 1, we can see employment is tumultuous during that period: there seems to be a sudden dip in employment followed by a large spike. This is more of an exception than a rule. Keeping this in mind, the group decided the anomaly from the 175th to the 250th period mark can be largely ignored.

Having applied an ordinary first difference to remove the trend, followed by a lag-12 seasonal difference, the group knew they were dealing with a seasonal ARIMA (SARIMA) model. The next step was to read the acf and pacf plots of the ordinarily and seasonally differenced data to help estimate the p, q, P and Q parameters of the SARIMA model. The acf plot is as follows:

Fig. 12

The acf plot shows at most one significant spike followed by a group of insignificant spikes, with the exception of the ones at lags 12 and 18. The group decided these were spurious rather than evidence of another trend, since the acf values seemed to level off after lag 1 until those spikes. Hence p was chosen to be either 0 or 1. Tests were also conducted with p=2 to make sure our assumptions were correct; results on this are to follow.


The pacf plot looked as follows:

Fig. 13

There were no significant spikes after the first spike, so q values of 0 and 1 would be tested. To double check, q=2 was also included in the tests performed.

Once the candidate values of p and q were determined, the group ran 36 different models to find which gave the lowest AIC value. The results are given in Table 1.
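The count of 36 follows from the orders that appear in Table 1: three values of p, two of q, three of P and two of Q, with d = D = 1 throughout. A minimal Python sketch (the report itself used R; variable names are chosen here for illustration) that enumerates the candidates in the same row order as the table:

```python
from itertools import product

# Candidate orders read off Table 1; d = D = 1 throughout.
p_values = (0, 1, 2)
q_values = (1, 2)
P_values = (0, 1, 2)
Q_values = (1, 2)

# product() varies its last argument fastest, so p changes fastest and Q slowest.
candidates = [
    ((p, 1, q), (P, 1, Q))
    for Q, P, q, p in product(Q_values, P_values, q_values, p_values)
]

print(len(candidates))
print(candidates[0])
print(candidates[-1])
```

Each candidate would then be fitted and its AIC recorded, and the model with the smallest AIC kept.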


Table 1

Model #  (p,d,q)  (P,D,Q)  sigma sq   AIC
 1       (0,1,1)  (0,1,1)  8.18E-05   -2113.54
 2       (1,1,1)  (0,1,1)  8.16E-05   -2112.45
 3       (2,1,1)  (0,1,1)  8.15E-05   -2110.73
 4       (0,1,2)  (0,1,1)  8.16E-05   -2112.61
 5       (1,1,2)  (0,1,1)  8.15E-05   -2110.78
 6       (2,1,2)  (0,1,1)  8.32E-05   -2110.78
 7       (0,1,1)  (1,1,1)  8.17E-05   -2112.3
 8       (1,1,1)  (1,1,1)  8.14E-05   -2111.25
 9       (2,1,1)  (1,1,1)  8.13E-05   -2109.62
10       (0,1,2)  (1,1,1)  8.13E-05   -2111.44
11       (1,1,2)  (1,1,1)  8.13E-05   -2109.62
12       (2,1,2)  (1,1,1)  8.13E-05   -2107.64
13       (0,1,1)  (2,1,1)  8.16E-05   -2110.36
14       (1,1,1)  (2,1,1)  8.2E-05    -2111.26
15       (2,1,1)  (2,1,1)  8.13E-05   -2107.64
16       (0,1,2)  (2,1,1)  8.14E-05   -2109.4
17       (1,1,2)  (2,1,1)  8.13E-05   -2107.66
18       (2,1,2)  (2,1,1)  8.13E-05   -2105.66
19       (0,1,1)  (0,1,2)  8.17E-05   -2112.26
20       (1,1,1)  (0,1,2)  8.14E-05   -2111.21
21       (2,1,1)  (0,1,2)  8.13E-05   -2109.58
22       (0,1,2)  (0,1,2)  8.14E-05   -2111.4
23       (1,1,2)  (0,1,2)  8.13E-05   -2109.59
24       (2,1,2)  (0,1,2)  8.13E-05   -2107.6
25       (0,1,1)  (1,1,2)  8.17E-05   -2110.29
26       (1,1,1)  (1,1,2)  0.08141    -2109.24
27       (2,1,1)  (1,1,2)  8.13E-05   -2107.58
28       (0,1,2)  (1,1,2)  8.14E-05   -2108.43
29       (1,1,2)  (1,1,2)  8.13E-05   -2107.54
30       (2,1,2)  (1,1,2)  8.13E-05   -2105.62
31       (0,1,1)  (2,1,2)  8.17E-05   -2108.3
32       ERROR
33       (2,1,1)  (2,1,2)  8.13E-05   -2105.61
34       (0,1,2)  (2,1,2)  8.14E-05   -2107.43
35       (1,1,2)  (2,1,2)  8.13E-05   -2105.62
36       (2,1,2)  (2,1,2)  8.13E-05   -2103.63

The selected model was the SARIMA(0,1,1)x(0,1,1) model because it gave the lowest AIC.

Call:
arima0(x = log(Employment), order = c(0, 1, 1), seasonal = list(order = c(0, 1, 1), period = 12))

Coefficients:
          ma1     sma1
      -0.3020  -0.4997
s.e.   0.0561   0.0470

sigma^2 estimated as 8.184e-05:  log likelihood = 1059.77,  aic = -2113.54
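The reported AIC can be reproduced from the log likelihood in the fitted model output above. The sketch below is in Python rather than R, and it assumes that three parameters are counted (ma1, sma1 and the innovation variance), which is the count that reproduces the reported figure.

```python
# Values copied from the fitted model output above.
log_likelihood = 1059.77
n_params = 3   # assumed count: ma1, sma1 and the innovation variance sigma^2

# AIC = 2k - 2 * log-likelihood; smaller is better.
aic = 2 * n_params - 2 * log_likelihood

print(round(aic, 2))
```

Because each extra coefficient adds 2 to the AIC, a larger model is preferred only if it raises the log likelihood by more than 1 per added parameter, which none of the other candidates in Table 1 managed.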

Residual Analysis on the chosen model

Once the model was chosen, the group had to validate the results. The first step was to plot the acf and pacf of the residuals to check for serial correlation:

Fig. 14


The acf and pacf plots show largely no correlation between the residuals. Most of the spikes are insignificant, as they lie within the 95 percent confidence bounds. On the pacf plot, there is one spike at lag 18, which is likely to be the result of an outlier.

The following tests were done for serial correlation:

> runs.test(res)

        Runs Test - Two sided

data:  res
Standardized Runs Statistic = -1.7274, p-value = 0.0841

> bartels.test(res)

        Bartels Test - Two sided

data:  res
Standardized Bartels Statistic = -1.7513, RVN Ratio = 1.806, p-value = 0.0799

Both the runs and Bartels tests have p-values greater than 5 percent; therefore there is no significant evidence to reject the null hypothesis that the residuals are uncorrelated.
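Both reported p-values are consistent with treating the standardized statistic as a standard normal variable and taking a two-sided tail probability. A minimal Python sketch of that conversion (the helper name is chosen here for illustration; the report used R):

```python
import math

def two_sided_p_value(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

runs_p = two_sided_p_value(-1.7274)     # standardized runs statistic
bartels_p = two_sided_p_value(-1.7513)  # standardized Bartels statistic

print(round(runs_p, 4), round(bartels_p, 4))
```

Both values fall between 5 and 10 percent, so the conclusion would change at a 10 percent significance level.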

Next the group decided to test for normality in the residuals. A QQ plot and the Shapiro-Wilk test were run:


> shapiro.test(res)

        Shapiro-Wilk normality test

data:  res
W = 0.8146, p-value < 2.2e-16

The residuals did not turn out to be normally distributed: the very low p-value of the SW test rejects the hypothesis that the residuals are normally distributed. The group decided to run the SW test again after removing outliers that could be influencing the result (2 were removed from the right tail and 1 from the left tail). The results are as follows:

        Shapiro-Wilk normality test

data:  res[-outliers]
W = 0.8914, p-value = 2.504e-14

While the p-value increased, it was still far below the 5 percent significance level. Upon further analysis, the group found that most of the outliers lie within the 175th to 250th period band mentioned earlier. On removing all the outliers from that region (about 50 data points, 25 from either tail), the SW test returned a p-value of 0.07, which meant normality of the remaining residuals could no longer be rejected. These points, however, make up a sizeable share of the data.

The period, though, as has been stated, looks more like an exception than a rule, and it was decided to analyze the residuals further to see if the model was a good fit.

The following is a histogram of the residuals of the chosen model:

Fig.15


The histogram, on the right, is roughly bell shaped, although the largest and smallest residuals occur with low frequency.

tsdiag returned the following results:

Fig. 17

As is evident from both the tsdiag output and Fig. 16, the residuals look like white noise. Even though the normality assumption is not satisfied, the acf plot of the residuals indicates that the residuals are not correlated, and the Ljung-Box test gives high p-values for all lags. It was decided to proceed to prediction intervals to see how well the model fits the testing set.

Fig. 16
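For readers unfamiliar with the Ljung-Box test mentioned above, the statistic can be sketched as follows. This is an illustrative Python version (the report relied on R's tsdiag), with helper names chosen here and a hypothetical, deliberately autocorrelated series rather than the project residuals.

```python
def autocorrelation(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    ck = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return ck / c0

def ljung_box(x, max_lag):
    """Ljung-Box Q statistic: n (n + 2) * sum over k of r_k^2 / (n - k)."""
    n = len(x)
    return n * (n + 2) * sum(
        autocorrelation(x, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )

# Hypothetical series, for illustration only: strongly autocorrelated.
alternating = [1.0, -1.0] * 20
q_alt = ljung_box(alternating, max_lag=5)

# 11.07 is the 5 percent chi-square critical value for 5 degrees of freedom.
print(q_alt > 11.07)
```

A large Q, as for this alternating series, gives a small p-value and signals autocorrelation; the high p-values reported for the model residuals correspond to small Q values at every lag.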

The observed values for Jan, Feb and Mar of 2009 are:

           Jan-09     Feb-09     Mar-09
Actual  3,506,579  3,577,568  3,599,991

Table 2

The 95 percent prediction interval was calculated as:

> p.95<-exp(p.95)
> p.95
        [,1]    [,2]    [,3]
[1,] 3466818 3528836 3591964
[2,] 3531518 3608711 3687592
[3,] 3539571 3628865 3720410

                Jan-09     Feb-09     Mar-09
Lower Bound  3,466,818  3,531,518  3,539,571
Fit          3,528,836  3,608,711  3,628,865
Upper Bound  3,591,964  3,687,592  3,720,410
Actual       3,506,579  3,577,568  3,599,991

Table 4

The 99 percent prediction interval was calculated as:

> p.99<-exp(p.99)
> p.99
        [,1]    [,2]    [,3]
[1,] 3447428 3528836 3612167
[2,] 3507445 3608711 3712901
[3,] 3511786 3628865 3749847

                Jan-09     Feb-09     Mar-09
Lower Bound  3,447,428  3,507,445  3,511,786
Mean         3,528,836  3,608,711  3,628,865
Upper Bound  3,612,167  3,712,901  3,749,847
Actual       3,506,579  3,577,568  3,599,991

Table 5

From the tables above, we notice that the actual values lie within both the 95% and 99% prediction intervals for all three months. The predicted values are quite close to the actual employment levels for all three months. This is evidence that our chosen model has good predictive ability. The 99% prediction intervals are wider than the 95% prediction intervals, which makes sense since we are trying to predict with a higher level of confidence.
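The coverage claim can be checked directly from the numbers in the tables above. The following Python sketch (names chosen here for illustration; the report used R) tests whether each observed value falls inside its interval and computes the percentage error of each point forecast.

```python
months = ["Jan-09", "Feb-09", "Mar-09"]
actual = [3506579, 3577568, 3599991]

# (lower, fit, upper) per month, copied from the tables above.
pi_95 = [
    (3466818, 3528836, 3591964),
    (3531518, 3608711, 3687592),
    (3539571, 3628865, 3720410),
]
pi_99 = [
    (3447428, 3528836, 3612167),
    (3507445, 3608711, 3712901),
    (3511786, 3628865, 3749847),
]

def covered(intervals, observed):
    """True for each month whose observed value lies inside its interval."""
    return [lo <= y <= hi for (lo, _, hi), y in zip(intervals, observed)]

# Percentage error of the point forecast relative to the observed value.
pct_error = [100 * (fit - y) / y for (_, fit, _), y in zip(pi_95, actual)]

print(covered(pi_95, actual), covered(pi_99, actual))
print([round(e, 2) for e in pct_error])
```

All three forecasts overshoot the observed value, each by less than 1 percent.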

Conclusion

To summarize, we first tried a multiple regression model with different explanatory variates. After finding that its residuals were correlated, we arrived at a SARIMA(0,1,1)x(0,1,1) model by inspecting the acf and pacf of the data and through further trial and error with different models. We also tried a Holt-Winters model with multiplicative seasonality, although this is not shown in the report; we rejected that model because its residuals were also serially correlated.

The SARIMA model was chosen primarily due to the seasonality observed from the

various ACF, Time Series and PACF plots.

We conclude that our SARIMA model was good at forecasting employment levels. Among the models considered, it fits the data best: it had the lowest AIC value, and all the actual values in the testing set were within the prediction intervals and close to the predicted values. The residuals are uncorrelated but not normal. Although normality is preferred, it is not a condition for a good predictive model. Our residuals may not be normal because of several outliers due to financial turmoil in the recent past, like the recent credit crunch that has led to a recession. Also, quite a few jobs were lost due to the harsh weather conditions in Canada.

The model can be improved by adding more data, taking employment levels from a longer time period. We can also add regressors to our model to improve its predictive power.
