Spatial Panel Data Forecasting over Different Horizons, Cross-Sectional and Temporal Dimensions


Matías Mayor (University of Oviedo)

Roberto Patuelli (University of Bologna & RCEA)

Introduction

• Rise of theoretical and empirical spatial econometrics literature, but temporal aspect has received less attention

• Spatial dimension of labour markets a relevant topic, spatial autocorrelation emerging in particular from labour mobility over different regions

• Importance of spatial aspect for forecasting pointed out by Giacomini and Granger (2004; ‘ignoring spatial autocorrelation, even when it is weak, leads to highly inaccurate forecasts’) and Hernández-Murillo and Owyang (2006)

• Different methods proposed:

– Static panel models (Baltagi and Li 2004; Longhi and Nijkamp 2007; Fingleton 2009; Baltagi et al. 2012; Fingleton and Palombi 2013)

– Dynamic panel models (Kholodilin et al. 2009; Baltagi et al. 2013) (improves forecast performance in particular when the forecasting horizon is longer)

– VAR models (Schanne et al. 2010) (similar conclusion to Kholodilin et al.)

Introduction (2)

• We compare two methods to obtain unemployment forecasts in (small) administrative units, and observe their performance between different countries

– a spatial vector autoregressive (SVAR) model (Beenstock and Felsenstein 2007; Kuethe and Pede 2011)

– a dynamic heterogeneous-coefficients panel data model based on an eigenvector-decomposition spatial filtering (SF) procedure (Griffith 2000, 2003).

• We exploit the strong heterogeneity in the size of NUTS regions to investigate the variation in the forecasting performance

• The two methods belong to two separate traditions: VAR models represent the mainstream (time-series) forecasting tradition, while the SF-enhanced dynamic panel model attempts to merge the panel data modelling tradition with the spatial statistics one, within a semi-parametric framework.

Spatial VAR Models

• A VAR model (Sims 1980) can be written as a set of symmetric equations in which each (dependent) variable is described by a set of its own lags and the lags of other variables in the system

• VAR models assume the absence of spatial spillovers

• A few proposals introduce spatial relationships in a VAR framework. Because the number of parameters to estimate then increases quadratically with the number of spatial units, some of the existing proposals use spatial contiguity information to limit the number of parameters

– Pan and LeSage (1995) propose to use spatial contiguity information as an alternative prior in a Bayesian VAR model

– Di Giacinto (2003) defines parameter constraints in a structural VAR model based on the neighbouring structure

– Schanne et al. (2010), based on the Global VAR (GVAR) model of Pesaran et al. (2004), use geographical information to include spatial connections between regions. Advantage of the GVAR model: inclusion of a temporal dimension within the spatial dependence process

• Some authors consider only contemporaneous spatial processes (Longhi and Nijkamp 2007; Kholodilin et al. 2008), whereas others specify only a temporally lagged type of spatial dependence (Hernández-Murillo and Owyang 2006).

Spatial VAR Models (2)

• We follow the approach by Beenstock and Felsenstein (2007), where traditional VAR methods and modern spatial panel data techniques are ‘mixed’

• Beenstock and Felsenstein allow for both contemporaneous and serially lagged spatially correlated variables

– Highly nonlinear model, because of the contemporaneous spatial autoregressive process

– They restrict the coefficient of the endogenous contemporaneous spatial lag to zero, linearizing the model

– Novelty is the inclusion of the spatial cross-regressive lags

– A further advantage is the possibility of testing for the significance of regional spillovers by means of Granger causality tests.

– Since Wy and the residuals are not independent, the model is estimated by SUR (seemingly unrelated regressions)

– If only one lag is allowed for:

$$
y_{i,t} = c_i + \alpha_{i,1}\, y_{i,t-1} + \theta_{i,1} \sum_{j=1}^{n} w_{i,j}\, y_{j,t} + \lambda_{i,1} \sum_{j=1}^{n} w_{i,j}\, y_{j,t-1} + \varepsilon_{i,t}
$$
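The one-lag specification combines, for each region i, its own temporal lag with spatially weighted values of the other regions. As a rough illustration only, the sketch below assembles the intercept, own-lag and temporally lagged spatial-lag regressors for one region. The function names, the toy weights matrix and the toy data are all hypothetical, the contemporaneous spatial lag would be built the same way from the current period, and the estimation step (SUR in the slides) is not shown.

```python
def spatial_lag(W, y):
    """Return the spatial lag of y: for each region i, sum_j W[i][j] * y[j]."""
    return [sum(w_ij * y_j for w_ij, y_j in zip(row, y)) for row in W]


def svar_regressors(W, Y, i):
    """Build (target, regressor-row) pairs for region i with one lag.

    Y is a list of time periods; Y[t][j] is the value for region j at time t.
    Each regressor row is [1, y_{i,t-1}, sum_j w_ij * y_{j,t-1}]:
    intercept, own temporal lag, and temporally lagged spatial lag.
    """
    rows = []
    for t in range(1, len(Y)):
        wy_prev = spatial_lag(W, Y[t - 1])
        rows.append((Y[t][i], [1.0, Y[t - 1][i], wy_prev[i]]))
    return rows


# Toy example (made-up numbers): three regions on a line, row-standardized W.
W = [
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 1.0, 0.0],
]
Y = [
    [4.0, 6.0, 8.0],  # t = 0
    [5.0, 6.0, 7.0],  # t = 1
    [5.0, 7.0, 9.0],  # t = 2
]
design = svar_regressors(W, Y, i=1)  # rows for the middle region
```

Each element of `design` pairs the value to be explained with its regressor row, so a per-region least-squares or SUR routine could consume it directly.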

Spatial Filtering

• Our alternative approach decomposes the autoregressive processes according to exogenous spatial patterns representative of accessibility/contiguity relations between the regions

• Benefits

– We obtain an explicit model of the spatial patterns without being over-restrictive by imposing (probably erroneous) regime-specific constraints

– We are able to estimate the model more parsimoniously, while covering the most relevant spatial structures

Spatial Filtering (2)

• Griffith’s (2003) spatial filtering (SF) approach is based on the computational formula of Moran’s I (MI, Moran 1948) statistic

• This eigenvector decomposition technique extracts n orthogonal numerical components from an n × n normalized spatial weights matrix

• C can be used to obtain, given X, the numerator of MI, and its extreme eigenvalues are approximately the extreme values of MI (Griffith 2000). Because of this mathematical relation, the eigenvectors of C represent all mutually exclusive (orthogonal and independent) spatial patterns implied by W

• They are extracted in decreasing order of spatial autocorrelation (MI): e.g. E1 has the largest MI achievable, given W, and all subsequent eigenvectors maximize MI while being orthogonal to previously extracted eigenvectors

• The set of eigenvectors explaining spatial patterns in the variable of interest can be found by regressing it stepwise on the eigenvectors

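$$
I = \frac{N}{\sum_{i}\sum_{j} w_{ij}} \cdot \frac{\sum_{i}\sum_{j} w_{ij}\,(x_i - \bar{x})(x_j - \bar{x})}{\sum_{i} (x_i - \bar{x})^{2}}
$$

$$
\mathbf{C} = \left(\mathbf{I} - \mathbf{1}\mathbf{1}^{\mathsf{T}}/n\right) \mathbf{W} \left(\mathbf{I} - \mathbf{1}\mathbf{1}^{\mathsf{T}}/n\right)
$$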
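The eigenvector extraction can be illustrated with a small numerical sketch. It assumes NumPy is available and a symmetric binary contiguity matrix W (so that the doubly centred matrix is symmetric and `numpy.linalg.eigh` applies); the four-region toy matrix and the function names are hypothetical, and the stepwise selection of eigenvectors is not shown.

```python
import numpy as np


def moran_i(x, W):
    """Moran's I of vector x under spatial weights W."""
    z = x - x.mean()
    return (len(x) / W.sum()) * (z @ W @ z) / (z @ z)


def spatial_filter_eigenvectors(W):
    """Eigen-decompose C = M W M, with M = I - 11'/n.

    Assumes W is symmetric, so C is symmetric as well.
    Returns eigenvalues and eigenvectors sorted by decreasing eigenvalue.
    """
    n = W.shape[0]
    M = np.eye(n) - np.ones((n, n)) / n
    C = M @ W @ M
    values, vectors = np.linalg.eigh(C)  # eigh returns ascending order
    order = np.argsort(values)[::-1]
    return values[order], vectors[:, order]


# Toy binary contiguity matrix: four regions on a line (1-2-3-4).
W = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

values, E = spatial_filter_eigenvectors(W)

# For unit-length, zero-mean eigenvectors, MI is the eigenvalue rescaled
# by n / (sum of all weights), so MI decreases along the sorted columns.
mi_from_eigenvalues = (W.shape[0] / W.sum()) * values
mi_first = moran_i(E[:, 0], W)  # MI of the first extracted pattern (E1)
```

In this toy case the first column of `E` carries the largest Moran's I attainable for the given W, and `mi_first` agrees with the rescaled leading eigenvalue.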

Spatial Filtering (3)

• Griffith (2008) showed that SF can also help explain spatial heterogeneity in regression coefficients. An equivalent to geographically weighted regression (GWR) can be computed by interacting the explanatory variables (X) with the eigenvectors

• Patuelli et al. (2012) used it in a dynamic panel to construct a spatial filter representation of the serial autoregressive coefficients, allowing for improved inference in unit root testing

$$
y_{i,t} = c + \alpha\, y_{i,t-1} + \sum_{m=1}^{k} \beta_{m} E_{i,m}\, y_{i,t-1} + \sum_{m'=1}^{k'} \gamma_{m'} E_{i,m'} + \varepsilon_{i,t},
$$
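One way to read the spatial filter representation is that each region receives its own autoregressive coefficient and its own intercept, both written as a common value plus a linear combination of the selected eigenvectors. The sketch below is only an illustration of that idea: the function names, parameter names and all numbers are made up, and the eigenvector values are toy values rather than the output of an actual decomposition.

```python
def sf_coefficient(base, loadings, e_row):
    """Return base + sum_m loadings[m] * e_row[m] for one region."""
    return base + sum(b * e for b, e in zip(loadings, e_row))


def sf_one_step_forecast(y_prev, c, gamma, alpha, beta, E_ar, E_c):
    """One-step-ahead forecasts for all regions.

    E_ar[i] holds region i's values on the eigenvectors selected for the
    autoregressive coefficient; E_c[i] does the same for the intercept.
    The two sets of eigenvectors may differ.
    """
    forecasts = []
    for i, y in enumerate(y_prev):
        alpha_i = sf_coefficient(alpha, beta, E_ar[i])  # region-specific AR term
        c_i = sf_coefficient(c, gamma, E_c[i])          # region-specific intercept
        forecasts.append(c_i + alpha_i * y)
    return forecasts


# Toy example with two regions and one eigenvector in each filter.
forecasts = sf_one_step_forecast(
    y_prev=[4.0, 8.0],
    c=1.0, gamma=[0.5],
    alpha=0.5, beta=[0.25],
    E_ar=[[1.0], [-1.0]],
    E_c=[[2.0], [0.0]],
)
```

Only k + k' + 2 parameters are needed here regardless of the number of regions, which is the parsimony argument made for the SF approach.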

The Data

• We test the forecasting performance of SVAR and SF on three data sets, for Spain, Switzerland and France. We use official regional unemployment rates at the NUTS-3 level

• All three data sets have satisfactory but different temporal (T) and spatial (n) dimensions, and the geographical size of the spatial (administrative) units differs widely. The average area of Spanish provinces is about 10,499 km², while for the Swiss cantons it is 1,582 km². French departments are 7,030 km² on average.

• Data for Spain: quarterly unemployment rates by province, for the period 1976–2008, 47 provinces

• Data for Switzerland: monthly unemployment rates, for the period 1975–2008, 26 cantons

• Data for France: quarterly unemployment rates, for the period 1982–2011, 96 departments

The Data (2)

Country       n    t
Switzerland   26   384
Spain         47   132
France        96   120

Forecasting Strategy

• 1) Short-run forecasting

– We evaluate the short-run predictive power of the two methods. To do so, we use a rolling window

– For each model and data set, estimates are obtained using a fixed-size window of observations

– The forecasting window rolls over two years, providing one-step-ahead forecasts over 8 quarters for Spain and 24 months for Switzerland. Given cross-sectional dimensions, the overall number of forecasted values is (8 * 47 =) 376 for Spain and (24 * 26 =) 624 for Switzerland

Forecasting Strategy (2)

• 2) Expanding forecasting horizon

– We then evaluate the predictive power of the same methods over longer forecasting horizons, again using a (one-year) rolling window and a fixed-size window of observations

– We provide forecasts until two years ahead, i.e. over 8 quarters for Spain and France, and 24 months for Switzerland. Given cross-sectional dimensions, the overall number of forecasted values is (4 * 8 * 47 =) 1504 for Spain, (12 * 24 * 26 =) 7488 for Switzerland, and (4 * 8 * 96 =) 3072 for France
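The forecast counts follow from the number of forecast origins, the number of horizons per origin and the number of regions. The sketch below lays out a fixed-size rolling window and recomputes those counts. The window lengths (100 and 300 observations) are purely hypothetical, since the slides do not state them, and the function names and the convention that the last target is the last observation are assumptions as well.

```python
def rolling_forecast_plan(n_obs, window, horizon, n_origins):
    """List ((train_start, train_end), target_indices) per forecast origin.

    The estimation window has fixed size and moves forward by one period
    per origin; the last target of the last origin is the last observation.
    """
    start = n_obs - window - horizon - n_origins + 1
    if start < 0:
        raise ValueError("not enough observations for this window/horizon")
    plan = []
    for origin in range(n_origins):
        train_start = start + origin
        train_end = train_start + window  # exclusive upper bound
        targets = list(range(train_end, train_end + horizon))
        plan.append(((train_start, train_end), targets))
    return plan


def n_forecasts(n_regions, n_obs, window, horizon, n_origins):
    """Total number of forecasted values across regions, origins and horizons."""
    plan = rolling_forecast_plan(n_obs, window, horizon, n_origins)
    return n_regions * sum(len(targets) for _, targets in plan)


# Short run: one-step-ahead forecasts rolled over two years.
short_run = {
    "Spain": n_forecasts(47, 132, window=100, horizon=1, n_origins=8),
    "Switzerland": n_forecasts(26, 384, window=300, horizon=1, n_origins=24),
}

# Longer horizons: up to two years ahead, origins rolled over one year.
long_run = {
    "Spain": n_forecasts(47, 132, window=100, horizon=8, n_origins=4),
    "Switzerland": n_forecasts(26, 384, window=300, horizon=24, n_origins=12),
    "France": n_forecasts(96, 120, window=100, horizon=8, n_origins=4),
}
```

The totals do not depend on the chosen window length, only on the number of origins, horizons and regions.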

Evaluation of Forecasts

• Forecasting performance is summarized by means of statistical indicators

– mean square error (MSE)

– mean absolute error (MAE)

– mean absolute percentage error (MAPE) (to account for scale heterogeneity)

– Moran’s I (MI)

• We use a nonparametric test to assess if two models are equally accurate: the sign test (ST, Lehmann 1998)

– Does not rely on the usual assumptions of most tests (e.g. Diebold-Mariano or Wilcoxon tests), as it does not require normal distribution or symmetry between the two vectors

– Based on the comparison of forecasting errors. If the methods tested present a similar forecasting performance, the share of SF (Model 2) forecasts with a greater error than the corresponding SVAR (Model 1) forecast may be expected to be 50%

– Does not provide insights on the error distribution, but only on comparative forecasting, pairwise. In practice, it tests the hypothesis of equality in the medians

$$
S = \frac{C - p/2}{\sqrt{p}/2},
$$

– where C is the number of times that Model 2 shows a higher error than Model 1, and p is the number of forecasts. Under the null hypothesis of equal accuracy, S approximately follows a standard normal distribution N(0, 1)
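The accuracy indicators and the sign test are simple enough to sketch in a few lines. The version below assumes the statistic is the usual normal approximation to a binomial count, S = (C - p/2) / (sqrt(p)/2), compares absolute errors, and ignores ties; the function names and the toy numbers are hypothetical and not taken from the study.

```python
import math


def mse(actual, forecast):
    """Mean square error."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)


def sign_test(errors_model1, errors_model2):
    """Return (C, S, two-sided p-value) for paired forecasting errors.

    C counts how often Model 2 has the larger absolute error.
    """
    p = len(errors_model1)
    c = sum(1 for e1, e2 in zip(errors_model1, errors_model2) if abs(e2) > abs(e1))
    s = (c - p / 2) / (math.sqrt(p) / 2)
    p_value = math.erfc(abs(s) / math.sqrt(2))  # equals 2 * (1 - Phi(|S|))
    return c, s, p_value


# Toy example: Model 2 is worse in 12 of 16 paired forecasts.
errors_1 = [1.0] * 16
errors_2 = [2.0] * 12 + [0.5] * 4
c, s, p_value = sign_test(errors_1, errors_2)

# Toy accuracy indicators for two forecasts.
actual, forecast = [10.0, 20.0], [12.0, 19.0]
indicators = (mse(actual, forecast), mae(actual, forecast), mape(actual, forecast))
```

A large positive S points to Model 1 (SVAR in the slides' convention) and a large negative S to Model 2 (SF).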

1) Results for Switzerland

• SVAR shows better forecasting performance than SF, although differences are considerably reduced when MAPE is considered. In any case, numerical distance is rather small

• In all cases, SF presents a high level of variability in comparison to SVAR (see graphs)

• Finally, the sign test is performed along three dimensions

– all forecasting errors are pooled (for all cross-sectional units and all forecasting periods)

– the average forecasting errors by canton are analysed

– the average forecasting errors per period are compared

• The results show a statistically better performance of the SVAR model only when forecasting errors by region are analysed

1) Results for Switzerland (2)

• From spatial methods, we might expect forecasting errors with no spatial autocorrelation…

• For the majority of forecasting periods, there is no significant spatial autocorrelation, for both SVAR and SF models, but SVAR seems to produce less spatially autocorrelated forecasting errors

• Overall, our findings are not surprising, since T >> n clearly favours a time-series-related method like the SVAR.

2) Results for Switzerland

2) Results for Switzerland (2)


• Sign test (MeAPE):

– Test is not significant for the first two forecasting horizons, but then becomes significant in favour of SVAR up to the two-year horizon

Horizon (months)   Winner   Horizon (months)   Winner
1                  -        13                 SVAR
2                  -        14                 SVAR
3                  SVAR     15                 SVAR
4                  SVAR     16                 SVAR
5                  SVAR     17                 SVAR
6                  SVAR     18                 SVAR
7                  SVAR     19                 SVAR
8                  SVAR     20                 SVAR
9                  SVAR     21                 SVAR
10                 SVAR     22                 SVAR
11                 SVAR     23                 SVAR
12                 SVAR     24                 SVAR

(- : test not significant)


1) Results for Spain

• Findings for Spain differ from the ones for Switzerland: the SF model has gained in competitiveness from the different data structure

– In particular, the SVAR model appears to be more competitive with regard to MSE and MAE (when the error is not standardized), while the SF model minimizes percentage error (MAPE), winning six out of eight comparisons

• It is now the SVAR model that presents a higher heterogeneity in forecasting errors (see graphs) (maybe due to the increase in cross-sectional dimension?)

– Also noteworthy: generalized increase in forecasting errors over time, and in particular in the last two quarters, coinciding with the 2008 financial crisis, which had a strong impact on the Spanish labour market

• In all cases, sign tests are not significant, suggesting an overall equivalence between the SVAR and the SF models

1) Results for Spain (2)

• As for the case of Switzerland, both methods produce spatially uncorrelated forecasting errors in most cases, but the SVAR model appears to account better for the true spatial correlation in the dependent variable. In any case, the levels of spatial autocorrelation of forecasting errors, when significant, are very low

2) Results for Spain

2) Results for Spain (2)


• Sign test (MeAPE):

– Test is not significant for the first two forecasting horizons, then becomes significant in favour of SVAR

– BUT SF becomes competitive again at the two-year forecasting horizon

Horizon (quarters)   Winner   Horizon (quarters)   Winner
1                    -        5                    SVAR
2                    -        6                    SVAR
3                    SVAR     7                    SVAR
4                    SVAR     8                    -

(- : test not significant)


2) Results for France

2) Results for France (2)


• Sign test (MeAPE):

– SVAR appears to be superior for short-term forecasts

– BUT SF becomes competitive – and now wins! – when approaching the two-year forecasting horizon

Horizon (quarters)   Winner   Horizon (quarters)   Winner
1                    SVAR     5                    SVAR
2                    SVAR     6                    -
3                    SVAR     7                    SF
4                    SVAR     8                    SF

(- : test not significant)


Rejoinder

• Differences in data structure (between the Swiss, Spanish and French data sets) appear to be a discriminating factor in terms of forecasting accuracy.

• Short-run forecasting

• SVAR seems preferable to the SF model when T >> n and the spatial units have smaller size (i.e., the Swiss data).

• When moderate n and T are used, we do not find stable significant differences between the two competing methods. SVAR appears to minimize errors on the scale of the unemployment rates (MSE and MAE), while the SF model is preferable when percentage error is considered (MAPE)

– This finding is justified by methodological aspects, as the SF model computes a geographical approximation of both the autoregressive coefficients and of the fixed/random effects. As such, it may be less efficient in estimating outliers (e.g., change in high unemployment areas), while it may be expected to provide smoother findings on the spatial patterning of coefficients

• Finally, the SVAR model shows a smaller number of spatially autocorrelated errors for both the Swiss and the Spanish data sets, although most estimations produced uncorrelated errors for both methods

Rejoinder (2)

• Expanding forecasting horizon

• Consistently with previous results, median and average errors appear to be tied to the data structure (n and T)

– Both methods may deserve their own niche in regional forecasting

• Sign tests on median equivalence tend to prefer SVAR, but for longer forecasting horizons SF becomes competitive (Spain/France)

• Forecasting errors of SF show stronger residual spatial autocorrelation, while SVAR forecasting errors often end up having negative spatial autocorrelation

• More questions to be answered:

– Are n and T influencing our results, or are the geographical characteristics of the regions or macro attributes?

– What happens for small-n, small-T?

– Not possible to test a data structure opposite to the one of Switzerland (e.g., German NUTS-3, for which n >> T), as SVAR cannot be estimated in such case

– If the SF model improves its performance for longer horizons, could the spatial autocorrelation of its forecasting errors follow the same pattern?

– How can we improve forecasting performance by considering neighbouring regions across national borders? (NARSC 2013)

Thank you!

Roberto Patuelli

Department of Economics


www.unibo.it

