Panel Models, Spatial Econometrics,
and Spatial Panel Models
+ Some of Quantitative Geography
Oriented Stuff
Overview:
1. Spatial Econometrics in Quantitative Geography
2. What is a Panel Model
3. Spatial Weights Matrices
4. Spatio-Temporal Models (DGPs)
5. Some R resources for Spatial Panel Models
6. Final Thoughts
What is Panel Data?
• Repeat observations on the same set of units over time
– Education and Income on Individuals from age 18 to 50 (longitudinal Study)
– Investment in Education and Average Income across US States from 1980 to 2000
• Pros
– More data! (N x T observations)
– Might better approximate an experimental structure (i.e., what are the impacts of a policy change that occurs in a certain year?)
• Typically considered a ‘social’ science issue because it concerns data on discrete units (people, counties, species, markets) in discrete time (one observation per year, month, week, et cetera)
Quantitative Geography
• Spatial Statistics
– Spatial Econometrics
– Point Pattern Processes
– Spatial Mixed (Hierarchical) Models
• Geostatistics
– Kriging, Interpolation
– Continuous Space-time analysis
• Spatial Optimization
– Resource Extraction, Reserve Design
– Network Optimization (shortest path, TSP)
Underlying Spatial Structure (Support)
• Discrete
– Events (disease, crime)
– Objects (regions, cities)
• Continuous (Geostatistics)
– Environment (temperature, elevation)
– Social (house prices, dense urban areas)
• Depending on the scale of analysis (city block, region,
country) a spatial structure could be either discrete or
continuous
• The nature of the structure determines the tools we use to
analyze it
Got Structure? Choose your Weapons…
• Events:
– Spatial Poisson Regression, Cluster Analysis
• Regions:
– Spatial Regression (Econometrics)
• Continuous Field:
– Geostatistics (kriging)
• The biggest limiting factor for Regions is the lack of precise distance measurements
• So we often resort to contiguity-based measures of influence -> the W matrix
Economics 245a
What is a W matrix?
An N x N matrix of weights that specifies the
degree of influence between spatial unit i
and its neighbors j.
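To make the definition concrete, here is a minimal sketch of a contiguity-based W for units laid out on a regular grid. The grid layout, the rook (shared-edge) rule, and the function names `rook_contiguity` and `row_standardize` are illustrative assumptions, not something taken from the slides. Row standardization (each row sums to 1) is a common convention, so that W times y is a weighted average of neighboring values.

```python
def rook_contiguity(n_rows, n_cols):
    """Binary rook-contiguity matrix for the cells of an n_rows x n_cols grid.

    Cell (r, c) is unit i = r * n_cols + c. Two units are neighbors when
    their cells share an edge. The diagonal stays zero by convention.
    """
    n = n_rows * n_cols
    w = [[0.0] * n for _ in range(n)]
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    w[i][rr * n_cols + cc] = 1.0
    return w


def row_standardize(w):
    """Scale each row to sum to 1; rows with no neighbors are left as zeros."""
    out = []
    for row in w:
        total = sum(row)
        out.append([v / total if total > 0 else 0.0 for v in row])
    return out


# Hypothetical example: 9 regions arranged in a 3 x 3 grid.
W = row_standardize(rook_contiguity(3, 3))
```

In this example the center cell (unit 4) has four neighbors with weight 0.25 each, while a corner cell (unit 0) has two neighbors with weight 0.5 each.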
Typical W matrices (Getis 2004)
1. Spatially Contiguous Neighbors
2. Inverse distance raised to a power
3. Length of shared border divided by perimeter
4. N nearest neighbors
5. All weighted centroids within distance d
6. Lots more…
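Two of the distance-based options in the list above can be sketched as follows. This assumes each unit is represented by a single coordinate pair (for example a centroid); the function names, the sample coordinates, and the choice of Euclidean distance are illustrative assumptions.

```python
import math


def inverse_distance_weights(coords, power=1.0):
    """w_ij = 1 / d_ij**power for i != j, zero on the diagonal.

    Assumes no two units share the same coordinates (d_ij > 0).
    """
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d = math.dist(coords[i], coords[j])
                w[i][j] = 1.0 / d ** power
    return w


def k_nearest_weights(coords, k):
    """w_ij = 1 if j is among the k nearest units to i, else 0.

    Note that the result is generally not symmetric.
    """
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(coords[i], coords[j]),
        )
        for j in others[:k]:
            w[i][j] = 1.0
    return w


# Hypothetical centroids for four units on a line.
coords = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (7.0, 0.0)]
W_idw = inverse_distance_weights(coords, power=2.0)
W_knn = k_nearest_weights(coords, k=1)
```

The nearest-neighbor version shows why the choice of W matters: unit 2's nearest neighbor is unit 1, but unit 1's nearest neighbor is unit 0, so influence is not reciprocal under this specification.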
How do we use W Matrices?
• Spatial Lag:
• Spatial Error:
• MANY Extensions:
– SARAR (Spatial Lag + Spatial Error), SARMA (Spatial Autoregressive
Moving Average), Spatial Durbin (lagged regressors), et cetera
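For reference, the two base models are usually written as below. This is the standard textbook notation, given here as an assumption since the slide's own equations are not reproduced in the text: y is the N x 1 outcome vector, X the regressor matrix, W the N x N weights matrix, rho the spatial lag parameter, lambda the spatial error parameter, and epsilon an i.i.d. disturbance.

```latex
% Spatial lag model: the outcome depends on neighboring outcomes
y = \rho W y + X\beta + \varepsilon
\quad\Longleftrightarrow\quad
y = (I - \rho W)^{-1}(X\beta + \varepsilon)

% Spatial error model: only the disturbance is spatially correlated
y = X\beta + u, \qquad u = \lambda W u + \varepsilon
```

The reduced form of the lag model shows why a change in one unit's X spills over to every connected unit through the inverse of (I - rho W).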
What happens when we Ignore Spatial Correlation?
• Spatial Lag:
– Biased estimates (omitted variable bias -> rho*Wy is in the error term)
– Misinterpreted marginal effects (emanating and spill-over effects)
• Spatial Error:
– Estimates unbiased, standard errors possibly too small…
More on Correlation (why we use W)
• Correlation generally manifests in the error term (residuals)
• Serial correlation (through time) and spatial correlation (across space) can bias our estimates of beta and our standard errors
• If the correlation is the result of an omitted variable that is correlated with one of our regressors (X) then it will bias our estimate of beta
• If the correlation is independent of our regressors, but correlated with our outcome variable, then our standard errors will be biased downward, which can lead to a false rejection of H0
What is Panel Data?
• Repeat observations on the same set of units over time
– Education and Income on Individuals from age 18 to 50 (longitudinal Study)
– Investment in Education and Average Income across US States from 1980 to 2000
• Pros
– More data! (N x T observations)
– Might better approximate an experimental structure (i.e., what are the impacts of a policy change that occurs in a certain year?)
• Cons
– Attrition
– Correlation up the wazoo: observation it is correlated with it-1, and possibly with jt and even jt-1
Theoretical Models in a Spatial Panel
Setting (from Anselin 2008)
• Pure Space Recursive
– Too many parameters to identify
yi in time t is dependent on a weighted average of
neighboring yj’s in time t-1
• Time-Space Recursive
yi in time t is dependent on a weighted average of
neighboring yj’s in time t-1 AND, the value of yi in
time t-1
• Time-Space Simultaneous
yi in time t is dependent on a weighted average of
neighboring yj’s in time t AND, the value of yi in time
t-1
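The three verbal descriptions above correspond roughly to the following equations. The notation is an assumption chosen for illustration (it follows the usual presentation of these models, but the slide's own symbols are not in the text): y_t is the N x 1 vector of outcomes in period t, phi the own time-lag coefficient, gamma the space-time lag coefficient, and rho the contemporaneous spatial lag coefficient.

```latex
% Pure space recursive: neighbors' values in t-1 only
y_t = \gamma W y_{t-1} + X_t\beta + \varepsilon_t

% Time-space recursive: own value in t-1 and neighbors' values in t-1
y_t = \phi\, y_{t-1} + \gamma W y_{t-1} + X_t\beta + \varepsilon_t

% Time-space simultaneous: own value in t-1 and neighbors' values in t
y_t = \phi\, y_{t-1} + \rho W y_t + X_t\beta + \varepsilon_t
```

Only the simultaneous form puts W y_t on the right-hand side, so it is the only one of the three that requires solving for y_t through (I - rho W)^{-1}.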
What if W Changes Over Time?
W Matrices in a Panel Setting
• Many spatial models are forced to rely on analyst-
specified measures of influence (the W matrix)
• If W is misspecified, it can lead to biased estimates and
misinterpretation of model results (e.g., for prediction,
simulation)
• In a panel setting, W could change through time
– (ex: Trade, Agriculture, Migration)
• Current spatial panel routines do not facilitate
different W specifications through time (or much else)
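A time-varying W is easy to express in a simulation even when estimation routines do not support it. The sketch below generates data from a time-space recursive process with a different W in each period. The recursive form is chosen here only because it needs no matrix inversion; the function names, parameter values, and the two example matrices are all hypothetical.

```python
import random


def spatial_lag(w, y):
    """Return W y: for each unit, the weighted sum of the other units' values."""
    return [sum(w_ij * y_j for w_ij, y_j in zip(row, y)) for row in w]


def simulate_time_space_recursive(w_by_period, phi, gamma, y0, sigma=1.0, seed=0):
    """Simulate y_t = phi * y_{t-1} + gamma * W_t y_{t-1} + e_t.

    w_by_period holds one weights matrix per simulated period, so W may
    differ from period to period. e_t is i.i.d. normal with s.d. sigma.
    Returns the list [y_0, y_1, ..., y_T].
    """
    rng = random.Random(seed)
    path = [list(y0)]
    for w_t in w_by_period:
        prev = path[-1]
        wy = spatial_lag(w_t, prev)
        path.append(
            [phi * prev[i] + gamma * wy[i] + rng.gauss(0.0, sigma)
             for i in range(len(prev))]
        )
    return path


# Hypothetical example: three units on a line in period 1,
# then all units connected to each other in period 2 (a "growing" W).
W_line = [[0.0, 1.0, 0.0], [0.5, 0.0, 0.5], [0.0, 1.0, 0.0]]
W_full = [[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]]

# sigma=0 switches the noise off so the effect of the change in W is visible.
path = simulate_time_space_recursive(
    [W_line, W_full], phi=0.5, gamma=0.25, y0=[1.0, 0.0, 0.0], sigma=0.0
)
```

In period 1 unit 2 is not a neighbor of unit 0, so the initial shock to unit 0 does not reach it; once W becomes fully connected in period 2, it does.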
[Figures: simulation results for four scenarios]
– rho=.5, W=C
– rho=Var, W=C
– rho=.5, W=Var
– rho=Var, W=Var
Summary of Results
• True value of rho always within two standard errors of the estimates
• The BalRe and FE methods falsely accepted the null about 2% of the time
in scenarios 3 and 4 (no SEs for KKPRe)
• rho estimates are sensitive to data
• Time-varying W exerts more influence than time varying rho
• Fixed effects estimation tends to be more conservative, and
random effects tend to be closer to the true value
• A ‘growing W’ will tend to cause underestimation of rho
Final Thoughts: W matrices in Panel Models
• If a satisfactory measure of distance (geographic or
otherwise) can be found, direct representation of
spatial correlation using a variogram is probably a
better approach
• Still an active research area, especially with
variograms applied to non-geographic distances
• Panel structure gives more flexibility in defining a W
matrix (also active area of research)
What to do if you have a spatial panel?
1. Think about what the ideal theoretical form is
2. Fit a model with dummy variables for spatial units, and
temporal units
3. Test for Serial Correlation and/or Spatial Correlation
– If serial correlation, but no spatial correlation is found,
use HAC standard errors
– If spatial correlation, but no serial correlation, use cluster
robust standard errors or, fit a spatial error model
4. If both, think about W, and refit the model with SHAC
standard errors
5. Try fitting T cross-sectional models, with different W’s, look
at Hierarchical Models… USE SIMULATIONS TO TEST THE
EFFECTIVENESS OF YOUR METHOD!!!!!
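As one way to carry out the spatial-correlation check in step 3, the sketch below computes Moran's I for a vector of values (for example, the residuals for one time period) together with a one-sided permutation p-value. Moran's I is a swapped-in choice here, since the slides do not name a specific test, and the function names are hypothetical. A permutation test on regression residuals is only a rough screening device, because residuals are not exactly exchangeable.

```python
import random


def morans_i(values, w):
    """Moran's I = (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2,
    where z are deviations from the mean and S0 is the sum of all weights."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in w)
    num = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    den = sum(zi * zi for zi in z)
    return (n / s0) * num / den


def permutation_p_value(values, w, n_perm=999, seed=0):
    """Share of random relabelings whose Moran's I is at least as large as
    the observed one (one-sided test for positive spatial correlation)."""
    rng = random.Random(seed)
    observed = morans_i(values, w)
    shuffled = list(values)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if morans_i(shuffled, w) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)


# Hypothetical example: four units on a line with binary contiguity weights.
W_line4 = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
i_trend = morans_i([1.0, 2.0, 3.0, 4.0], W_line4)          # smooth gradient
i_alternating = morans_i([1.0, -1.0, 1.0, -1.0], W_line4)  # checkerboard
p_trend = permutation_p_value([1.0, 2.0, 3.0, 4.0], W_line4)
```

The smooth gradient gives a positive I and the alternating pattern a negative one. With only four units the permutation p-value is not informative; it is included to show the mechanics.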
Panel Models and Spatial Econometrics in R
• spdep: basic spatial econometrics
• sphet: SHAC standard errors
• plm: basic panel models
• splm: spatial panel models (still in alpha)
• spacetime: continuous spatial-temporal models
• lme4: linear mixed effects models
Data Manipulation in R
• reshape2: melt and cast are all you need!
• plyr: take data apart and put it back together again
• apply, lapply, tapply: the apply family is your friend
as is their cousin: aggregate
• grep, gsub, strsplit: quick, dirty, but powerful string
parsing
• ggplot2: The only graphics package you will ever
need…maps and so much more!
R Resources
• Rseek: http://www.rseek.org/
• R Journal
• R Bloggers
• Journal of Statistical Software
• Springer UseR! Series
Questions?
"Values in close spatial proximity may be similar
not because of spatial autocorrelation but
because the values are independent realizations
from distributions with similar means"
(Schabenberger and Gotway 2005, p. 22).