On the Dynamics of Interstate Migration:
Migration Costs and Self-Selection
Christian Bayer∗
Falko Juessen†‡
Universität Bonn and IGIER; Technische Universität Dortmund and IZA
First version: February 15, 2006
This version: July 29, 2011
Abstract
This paper develops a dynamic structural model of migration decisions that is aggre-
gated to describe the behavior of interregional migration. Our structural approach
allows us to deal with dynamic self-selection problems that arise from the endogene-
ity of location choice and the persistence of migration incentives. The self-selection
problem is solved by keeping track of the distribution of migration incentives over
time. This econometric treatment has important consequences for the estimation
of structural parameters such as migration costs. For US interstate migration, we
obtain a cost estimate of roughly two-thirds of an average annual household income.
We also show that the treatment of income persistence has important consequences
for comparative statics of the model as well as microeconomic age patterns of mi-
gration.
KEYWORDS: Dynamic self-selection, dynamic discrete choice, aggregate migration,
indirect inference
JEL-codes: C24, C25, E24, J61
∗ Universität Bonn, Department of Economics, Adenauerallee 24-42, 53113 Bonn, Germany; Tel.: +49-228-73 4073; email: [email protected].
† Technische Universität Dortmund, Department of Economics, 44221 Dortmund, Germany; phone: +49-231-755-3291; fax: +49-231-755-3069; email: [email protected]
‡ We would like to thank three anonymous referees for their valuable and helpful comments. All errors are ours. We would further like to thank Francesc Ortega, Andreas Schabert and conference participants at the NASM 2006, the SED Meeting 2006, the EEA Meeting 2006, the VfS Meeting 2006, the SMYE 2007, the SCE Meeting 2007, the ERSA Meeting 2007, LAMES 2008, and at seminars held at IZA, Universität Bonn, the EUI, and Università Bocconi for their helpful comments and suggestions. Part of this paper was written while C. Bayer was visiting fellow at Yale University and Jean Monnet fellow at the European University Institute. He is grateful for the support of these institutions. Financial support by the Rudolf Chaudoire Foundation is gratefully acknowledged. The research has been supported by DFG under Sonderforschungsbereich 475 and 823. We would like to thank Christian Wogatzke for excellent research assistance. A previous version of the paper has been circulated under the title "A generalized options approach to aggregate migration with an application to US federal states".
1 Introduction
Migration choices are important economic decisions. Migration allows individual agents
to evade adverse shocks to their income and it is an important way of macroeconomic
adjustment (Blanchard and Katz, 1992, and Decressin and Fatas, 1995). Many fac-
tors influence the decision to migrate and a vast empirical literature has analyzed how
migration decisions are driven by economic incentives, in particular by income differ-
entials.1 Since migration is a dynamic discrete choice problem, advances in modelling
these problems2 have opened up new frontiers for empirical research on migration too.
This triggered a recent interest in structural models of migration.3 Common to these
papers is an i.i.d. assumption for the agents’ incomes after controlling for observables.
In this paper, we highlight that a deviation from this i.i.d. assumption has stark
consequences for the estimation of structural parameters, the comparative statics of
migration with respect to migration costs, and the age patterns of migrants. This is
because of dynamic self-selection. If (residual) incomes are autocorrelated (as shown
by e.g. Storesletten, Telmer and Yaron (2004) or Low, Meghir and Pistaferri (2010)),
repeated decision making implies that neither migrants nor the population taking migration
decisions are a random sample with respect to income. The income of an agent
is typically highest in the place where she currently lives because, in the past, she will have
selected herself into a region where she is best off.4
In non-repeated discrete-choice modelling ("now-or-never" type of decisions), various
solutions to self-selection problems have been discussed; see Heckman and Robb (1985)
for an overview. In the context of migration, the role of such static self-selection for the
estimation of migration gains was discussed by Nakosteen and Zimmer (1980).5 Their
proposed solution builds on a selection model of the type popularized by Heckman (1974,
1976, 1978) and Lee (1978, 1979). However, it rests on the assumption of non-repeated
discrete choice and on residual income heterogeneity being i.i.d.
We first elaborate on the difference between dynamic and static self-selection in a
stylized two period setup that has the advantage of analytical tractability. Thereafter,
we develop a fully dynamic model of repeated migration choices. This model allows
1 See Greenwood (1975, 1985, and 1997) and Cushing and Poot (2004) for survey articles.
2 See Keane and Wolpin (2009), Norets (2009), Aguirregabiria and Mira (2010).
3 See e.g. Armenter and Ortega (2010), Coen-Pirani (2010), Gemici (2011), or Kennan and Walker (2011).
4 Norets (2008) shows that wrongly assuming i.i.d. unobservables can create significant estimation biases in dynamic discrete choice models and therefore (Norets, 2009) develops a Bayesian estimation technique for this class of models with serially correlated unobservables.
5 Examples of further studies addressing static self-selection in migration are: Borjas (1987), Borjas, Bronars, and Trejo (1992), Tunali (2000), and Hunt and Mueller (2004).
us to take a classical, simulation-based estimation approach of the structural parame-
ters while taking serial correlation in potential incomes and self-selection into account.
Our approach relies on explicitly modelling the dynamics of the distribution of poten-
tial incomes. Our modelling strategy follows Caballero and Engel’s (1999) paper on
investment, which highlights the interaction of lumpy investment and the evolution of
investment incentives. In the spirit of their model, we develop a microeconomic struc-
tural model of migration which can be used to describe the simultaneous evolution of
unobservable migration incentives and migration rates at an aggregate level. This allows
us to identify the model parameters from the business cycle frequency fluctuations in
migration rates. We use annual US state level migration flows from 1989-2008 from
the IRS. An advantage of our approach is that we can easily combine information from
different levels of aggregation. Specifically, we also exploit information on dispersions of
household incomes by state and year from the Current Population Survey (CPS).
In estimating our model, we obtain four important findings. First, we estimate
migration costs to be US$ 34,248 for a typical move between US states. This number
is substantially smaller than the ones reported in previous contributions, such as Davies,
Greenwood and Li (2001), but in line with Kennan and Walker’s (2011) estimate, at
least when they take expected payoff shocks into account. Second, we show that ignoring
the endogeneity and the dynamics of the distribution of unobserved potential incomes
can generate a substantial bias in estimated migration costs. Third, we show
that the comparative statics of the model with respect to exogenous changes in migration
costs, for example due to more or less liquid housing markets, change substantially with
the assumptions on whether and how persistence of potential income differences
across states is modelled. Fourth, we also document migration dynamics at the micro level that
differ from those of a model which does not keep track of the incentive distribution. One of
the best documented facts from microdata is that younger households are more likely
to migrate than older ones. The prominent explanation for this is the so-called human
capital channel where migration is an investment in human capital that pays off longer
for younger agents (Sjaastad, 1962). A problem with this explanation is that it cannot
capture the sharp decline in migration rates between ages 20 and 30.
We shut down this human capital channel and apply a perpetual-youth model instead
where the decision problem of the agent is stationary and independent of the agent’s
age. Nonetheless, age influences migration in our model because it is an argument of
the distribution of migration incentives. As in Jovanovic’s (1979) job search model,
the match between agent and region becomes more efficient as agents get older, since
agents have selected themselves into their preferred region. This mechanism, while in
principle discussed in parallel work by Coen-Pirani (2010) and Kennan and Walker
(2011), provides in our setup a new quantitative explanation for the empirical age-
migration pattern. We show that autocorrelated incomes are key to the close quantitative
match of observed and model-implied age patterns if one does not want to rely on age-
dependent migration costs as in Kennan and Walker (2011). To make this point we
show that one obtains very different and counterfactual results if approximating the
persistence in incomes by a mixture of an i.i.d. and a fixed effect component.
Kennan and Walker (2011) have a framework where migration is an experience good
and choice is between 50 regions whereas we assume that the household knows alternative
opportunities at each point in time, modelled in a bi-regional setup. We use a bi-regional
setup because simulating the dynamic evolution of migration incentives is numerically
intense even if solving the microeconomic decision problem itself is quick. In Kennan
and Walker (2011), income dynamics is given by a combination of fixed location-specific
shocks and an i.i.d. component, whereas we model it as an autoregressive process. To
match age patterns of migration, Kennan and Walker consider age-specific migration
preferences. At the same time, they account for further household characteristics, ob-
taining identification from cross-individual variations in migration patterns, while our
identification relies on business cycle frequency movements in migration and hence im-
plicitly controls for factors that do not change over time.
Gemici (2011) also exploits differences in migration patterns across households and
provides a dynamic model of family migration decisions. Her model centers on
the issues of intra-household bargaining and the private externalities that job
offers (and moving choices) of one spouse cause for the other. Gemici (2011) models
persistence in income as a constant job-specific effect and households decide to change
jobs (and consequently place of residence) if they obtain a favorable job offer.
The remainder is organized as follows. Section 2 illustrates in a stylized two-period
model why dynamic self-selection implies that the evolution of migration incentives and
migration choices need to be estimated simultaneously. Section 3 extends the model
to a setup where an agent maximizes life-time well-being by repeated location choice
in a perpetual-youth model. Section 4 shows how to aggregate this model. Section 5
confronts the model with data and presents the estimates of the structural parameters.
Section 6 investigates the role of different assumptions on the persistence of incomes
for the comparative statics of the model and the age patterns of migration our model
implies. Section 7 concludes and an Appendix provides detailed proofs, details about
the numerical model, and some further robustness checks.
2 Why a dynamic model of migration and migration incentives?
Most micro studies and lately also more macro studies on migration link the individual
migration decision to a probabilistic model in which agent i migrates at time t if the
long-term gain in utility terms obtained by migration is large enough and exceeds some
threshold value c, see for example Davies, Greenwood, and Li (2001), Hunt and Mueller
(2004), or Kennan and Walker (2011).
2.1 Endogenous initial state
To illustrate the problems induced by dynamic self-selection in such a setup, we first
consider in this section a two-period (t = 0, 1), bi-regional example with regions A and B.
In Section 3 we develop an infinite horizon, dynamic discrete choice model of migration
that can solve the problems highlighted here.
Let yiAt indicate whether agent i resides at time t in region A (yiAt = 1 if i in A and
yiAt = 0 if i in B). The decision problem in t = 1 can then be written as
\[
y_{iA1} =
\begin{cases}
1 & \text{if } y^*_{iA1} > 0 \\
0 & \text{if } y^*_{iA1} \le 0
\end{cases}
\tag{1}
\]
where y∗iA1 is the latent utility agent i enjoys from living in region A relative to living
in region B (including any migration costs incurred). Equation (2) below gives a parametric
form to this utility difference:
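\[
y^*_{iA1} =
\begin{cases}
u_{iA1} - (u_{iB1} - c) & \text{if agent } i \text{ lives in } A \text{ at time } 0 \\
(u_{iA1} - c) - u_{iB1} & \text{if agent } i \text{ lives in } B \text{ at time } 0
\end{cases}
= \gamma (w_{iA1} - w_{iB1}) - c (1 - 2 y_{iA0}) + \nu_{i1}.
\tag{2}
\]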
We assume that the flow utility uij1 from living in region j depends only on incomes
wij1. The parameter γ measures the marginal utility of income. The utility costs of
migration are described by c. The stochastic component νi1 reflects differences across
agents, omitted migration incentives, and/or some variability of migration costs.
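As a purely illustrative sketch, the decision rule (1) with the parametric latent utility (2) can be written in a few lines of Python. The function names and all numerical values below are hypothetical and only serve to show how the sign of the migration cost term depends on the initial location.

```python
def latent_utility(w_a, w_b, y_a0, gamma, c, nu=0.0):
    """Latent utility of living in A relative to B in period 1, as in equation (2).

    y_a0 is 1 if the agent lived in A at time 0 and 0 if she lived in B.
    """
    return gamma * (w_a - w_b) - c * (1 - 2 * y_a0) + nu


def lives_in_a(w_a, w_b, y_a0, gamma, c, nu=0.0):
    """Decision rule (1): the agent resides in A in period 1 iff latent utility is positive."""
    return 1 if latent_utility(w_a, w_b, y_a0, gamma, c, nu) > 0 else 0


# Hypothetical numbers: moving costs 0.5 in utility terms and gamma = 1.
# An income gap of 0.3 in favor of B is too small to make a resident of A move.
stays_in_a = lives_in_a(w_a=1.0, w_b=1.3, y_a0=1, gamma=1.0, c=0.5)
# A resident of B facing the same incomes stays in B.
stays_in_b = lives_in_a(w_a=1.0, w_b=1.3, y_a0=0, gamma=1.0, c=0.5)
# An income gap of 0.8 exceeds the cost, so the resident of A moves to B.
moves_to_b = lives_in_a(w_a=1.0, w_b=1.8, y_a0=1, gamma=1.0, c=0.5)
```

For a resident of A the cost enters with a positive sign (staying saves c), for a resident of B with a negative sign, which is what the term c (1 − 2yiA0) encodes.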
Typically, we are interested in the structural parameters γ and c and hence would
estimate (1), or rather its parametric form (2), to infer these parameters with a discrete
choice estimator suitable for the distribution of shocks νi1, e.g. running a probit
estimation of migration choice on income differences as potential migration gains. Such
a direct approach is in general not feasible as potential migration gains are unobservable
to the econometrician, i.e. we observe wij1 only if the agent chooses to live in j.
A standard approach to solve this problem is to proxy the unobservable potential
income by the income a similar agent realizes in the other region using a Mincer-type
wage regression
\[ w_{ij1} = \zeta_{j1} z_{i1} + w^*_{ij1}, \]
where ζj1 measures the sensitivity of wages to observables zi1 and w∗ij1 is residual wage
heterogeneity.
Nakosteen and Zimmer (1980) highlighted that the self-selection of agents has to
be taken into account when estimating the average unobserved potential income gains,
ζj1zi1. We assume this problem to be solved, since we are here not interested in the effect
of classical self-selection. Therefore, we assume that the econometrician actually knows
ζj1 and thus also the average gain from migration.6 Nonetheless, if wage residuals w∗ij1
are autocorrelated, the structural estimation of the decision problem defined in (1) and
(2) will be biased if the place of residence is a result of past decision making (dynamic
self-selection).
Replacing wij1 by the estimates ζj1zi1 in (1), we obtain for the latent variable y∗iA1
\[
y^*_{iA1} = \gamma (\zeta_{A1} - \zeta_{B1}) z_{i1} - c (1 - 2 y_{iA0})
+ \underbrace{\gamma (w^*_{iA1} - w^*_{iB1}) + \nu_{i1}}_{:= \eta_{i1}}.
\tag{3}
\]
The proxy model (3), which is now feasible to estimate (say, with a probit estimator
matching the distribution of ηi1), contains a composed error term that combines the original error νi1 from
the discrete choice problem (1) and a measurement error γ (w∗iA1 − w∗iB1) that captures
the residual income heterogeneity across agents after controlling for observables zi1. We
assume this term is orthogonal to (ζA1 − ζB1) zi1, so that migration choices can be regressed
on the imputed income differentials (ζA1 − ζB1) zi1. However, for unbiased
estimates of c it is necessary that ηi1, and hence γ (w∗iA1 − w∗iB1), is also orthogonal
to the previous place of residence yiA0.
When studying regional migration this assumption is typically not satisfied. To see
how this leads to a bias in the estimate of migration costs c, consider the following two
scenarios, where in both scenarios the average migration rate is small, individual wages
are unobservable, and location B offers on average higher wages than location A.
6 In the terminology of the econometric literature on selection, this assumption means that the problem of estimating treatment effects can be readily solved. This selection problem led Nakosteen and Zimmer (1980) to advocate a joint estimation of the latent income variable and the migration choice based on a model of the type popularized by Heckman (1974, 1976, 1978) and Lee (1978, 1979). See Heckman and Robb (1985) for various consistent estimators of ζj1.
• In scenario 1, agents are initially randomly distributed across the two locations.
In this scenario, for migration rates to be low, it must be that migration costs are
large and hinder agents from moving from region A to B.
• In scenario 2, agents are initially self-selected in the two locations, such that they
are where they earn most. Suppose further extreme wage persistence: individual
wages remain constant over time. Now, we observe zero migration even in the
absence of migration costs, because households already are in the region where
they earn most. Any household that migrates would actually incur an income loss
and an aggregate income difference is not informative about the latent gain (or
loss) from moving for any given household.
If one mistakes scenario 2 for scenario 1, migration costs will be overestimated. A
setup in which time periods are relatively short, e.g. years, and where agents have repeat-
edly faced the decision to migrate is more like scenario 2 as there is high autocorrelation
in incomes. This holds true even after controlling for individual characteristics and fixed
individual heterogeneity, see for example Storesletten, Telmer and Yaron (2004) or Low,
Meghir and Pistaferri (2010).
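The contrast between the two scenarios can be illustrated with a small simulation. This is a stylized sketch under assumptions of our own choosing: wages are constant over time (the extreme persistence of scenario 2), an agent moves whenever her wage gain exceeds the cost c, and the wage distributions, sample size, and cost values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical wage draws; location B pays more on average than location A.
w_a = 0.0 + rng.normal(size=n)
w_b = 0.2 + rng.normal(size=n)


def migration_rate(in_a_0, c):
    """Share of agents who move when the wage gain from moving exceeds the cost c."""
    gain = np.where(in_a_0, w_b - w_a, w_a - w_b)
    return float(np.mean(gain > c))


random_start = rng.random(n) < 0.5    # scenario 1: initial location is random
selected_start = w_a > w_b            # scenario 2: agents start where they earn most

rate_s1_free = migration_rate(random_start, c=0.0)     # roughly half of all agents move
rate_s1_costly = migration_rate(random_start, c=3.0)   # low migration requires a large c
rate_s2_free = migration_rate(selected_start, c=0.0)   # no one moves even though c = 0
```

Observing a low migration rate is therefore compatible with a large c under random initial locations and with c = 0 under self-selected initial locations, which is the identification problem described in the text.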
To make the above argument formal, assume w∗ijt follows an AR(1) process
\[ w^*_{ij1} = \rho w^*_{ij0} + \varepsilon_{ij1} \]
with i.i.d. innovations εij1. The initial conditions w∗ij0 are i.i.d., drawn from a normal
distribution $N(0, \sigma_0^2)$. Replacing w∗ij1 in (3), we obtain
\[
y^*_{iA1} = \gamma (\zeta_{A1} - \zeta_{B1}) z_{i1} - c (1 - 2 y_{iA0})
+ \underbrace{\gamma \left( \rho (w^*_{iA0} - w^*_{iB0}) + \varepsilon_{iA1} - \varepsilon_{iB1} \right) + \nu_{i1}}_{= \eta_{i1}}.
\]
As long as ρ ≠ 0, corr (yiA0, ηi1) ≠ 0 if the location in the previous period yiA0 is a
function of the previous period’s residual income difference (w∗iA0 − w∗iB0). In general
this will be the case if the location yiA0 has been a result of migration choice and thus
is not random.
Scenario 2 refers to the case where each household initially is in the region where it
earns most income
\[
y_{iA0} =
\begin{cases}
1 & \text{if } w_{iA0} > w_{iB0} \\
0 & \text{if } w_{iA0} \le w_{iB0}.
\end{cases}
\]
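We can calculate the covariance of the composed error term ηi1 with yiA0 as (see
Appendix A)
\[
\operatorname{cov} (y_{iA0}, \eta_{i1}) = 2 \gamma \rho \sigma_0 \, \phi \left( \frac{(\zeta_{A1} - \zeta_{B1}) z_{i1}}{2 \sigma_0} \right) > 0,
\tag{4}
\]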
where φ is the density of the standard normal distribution. Note that the covariance
is positive, implying an upwards bias in the estimate of migration costs c.7 In addition,
the bias is neither constant across individuals nor across time. It is largest when the
deterministic differences (ζA1 − ζB1) zi1 / (2σ0) are small, i.e. when regions are much alike.
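The sign of this covariance can be checked by simulation. The following sketch draws initial wage residuals, assigns initial locations either by self-selection or at random, and computes the sample covariance between yiA0 and the composed error ηi1. All parameter values are hypothetical, the deterministic income difference is set to zero, and residuals are assumed independent across regions.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
gamma, sigma0, sigma_eps, sigma_nu = 1.0, 1.0, 0.5, 0.5   # hypothetical values
d = 0.0   # deterministic income difference between A and B, assumed zero here


def cov_location_error(rho, self_selected):
    """Sample covariance between the initial location y_A0 and the composed error eta_1."""
    w0_a = rng.normal(0.0, sigma0, n)
    w0_b = rng.normal(0.0, sigma0, n)
    if self_selected:
        y_a0 = d + w0_a - w0_b > 0        # agents start where they earn most
    else:
        y_a0 = rng.random(n) < 0.5        # initial location unrelated to income
    eps_a = rng.normal(0.0, sigma_eps, n)
    eps_b = rng.normal(0.0, sigma_eps, n)
    nu = rng.normal(0.0, sigma_nu, n)
    eta = gamma * (rho * (w0_a - w0_b) + eps_a - eps_b) + nu
    return float(np.cov(y_a0.astype(float), eta)[0, 1])


cov_selected = cov_location_error(rho=0.9, self_selected=True)    # clearly positive
cov_no_persistence = cov_location_error(rho=0.0, self_selected=True)   # close to zero
cov_random_start = cov_location_error(rho=0.9, self_selected=False)    # close to zero
```

In line with the discussion of (4), the covariance is positive only when incomes are persistent and the initial location is itself the outcome of a choice.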
The bias vanishes if ρ = 0, which corresponds to the model considered by Nakosteen
and Zimmer (1980). It also vanishes if the location in period t = 0, yiA0, is not related
to the income difference in t = 0. That initial location is unrelated to initial income
differences is likely for example if one looks at location at the time of birth compared to
the location at another fixed age. Research on internal migration, however, has typically
not looked at this type of data. Migration data that comes at a yearly frequency typically
reflects the behavior of households who already faced migration decisions - even if most
of them decided not to move.
2.2 Dynamics of the distribution of potential incomes
The two-period model introduced above reflects this repeated decision process only up
to a limit, though it highlights the general problem arising from dynamic self-selection.
To fully address the dynamic character of the migration decision, we extend the model
to the infinite horizon in the next section. There, one important element will be the
dynamics of the distribution of unobserved migration incentives sketched in Figure 1.
Suppose the composed error term ηit is initially normally distributed as in Figure
1 (a). The figure displays the distribution of potential incomes, γ (ζAt − ζBt) zit + ηit.
Low values imply that income in region B is favorable, high values imply better income
prospects in region A. In the absence of migration costs, all agents with γ (ζAt − ζBt) zit + ηit < 0
decide to live in region B and they decide to live in region A otherwise.
As a result of this self-selection, the distribution of income differences changes for
the next period. No agent who lives in region B prefers to live in region A, see Figure
1(b). Effectively, the right-hand part of the distribution in Figure 1(a) has been cut as
all agents with higher income in region A have chosen A as the region to live in.
Adding a normally distributed idiosyncratic income shock to the persistent income
difference leads to the distribution of income differences as displayed in Figure 1(c). The
colored-in region indicates the set of agents that will migrate from B to A after the
7 The covariance is sufficient to argue that a bias will be present. However, if the migration probability is given by a non-linear model such as logit or probit, it is not easily possible to derive an explicit bias formula as in a linear regression model.
Figure 1: Distribution of potential incomes in region A relative to B
Panels: (a) overall population; (b) conditional on living in region B; (c) conditional on living in region B after first move, after idiosyncratic shocks; (d) conditional on living in region B after first move, after idiosyncratic and aggregate shocks.
Axes: horizontal, yiAt: log income difference between state A and state B (marked "Live in B" and "Live in A"); vertical, population density.
Shaded area: mass of agents who are better off living in region A instead of region B.
idiosyncratic shocks occurred.
Besides idiosyncratic shocks, aggregate shocks to the average income difference
γ (ζAt − ζBt) zit also influence the migration decisions of agents. Figure 1(d) shows the
distribution of migration incentives as in Figure 1(c), but after an adverse shock to
region B. Aggregate shocks shift the income differences for all agents and thus shift the
distribution of income differences before migration without directly altering its shape.
By comparing Figures 1(c) and 1(d), one can see that the shape of the distribution after
migration (the region not colored in) differs between both figures. As a consequence, one
needs to keep track of the evolution of the incentive distribution to determine aggregate
migration. Therefore, we develop a model based on dynamic optimal migration decisions
in the presence of persistent shocks to income. This model can then be aggregated and
used to simulate the evolution of migration and its incentives over time.
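The sequence of steps sketched in Figure 1 can be mimicked in a few lines. This is only a stylized illustration with hypothetical shock sizes and without migration costs; it is not the aggregation procedure developed in the following sections.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# (a) Overall population: income difference between A and B, standard normal by assumption.
x = rng.normal(0.0, 1.0, n)

# (b) Without migration costs, agents with a negative difference live in region B.
x_b = x[x < 0]

# (c) Idiosyncratic shocks (hypothetical std. dev. 0.3) hit the residents of B.
x_b_c = x_b + rng.normal(0.0, 0.3, x_b.size)
movers_c = float(np.mean(x_b_c > 0))     # mass that now prefers A and migrates

# (d) An additional aggregate shock adverse to B (hypothetical size 0.2) shifts everyone.
x_b_d = x_b_c + 0.2
movers_d = float(np.mean(x_b_d > 0))

# Distribution of incentives among those who remain in B after migration.
stayers_c = x_b_c[x_b_c <= 0]
stayers_d = x_b_d[x_b_d <= 0]
```

The aggregate shock raises the migration rate and, because a different part of the distribution is cut off, also changes the distribution of incentives among stayers (compare `stayers_c.mean()` and `stayers_d.mean()`). This is why the incentive distribution has to be tracked over time.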
3 A simple stochastic model of migration decisions
We consider an economy with two regions, A and B. This economy is inhabited by a
continuum of agents of measure 1. Agents maximize future well-being over an infinite
horizon by location choice. In each period a constant fraction δ of randomly selected
agents dies and is replaced by newborn agents so that the overall population remains
constant ("perpetual youth model"). We model the economy in discrete time and at
each point in time an agent decides in which region to live and work. First, we consider
the decision problem of an individual agent i living in region j = A,B. Thereafter, we
discuss aggregation and the dynamics of the distribution of migration incentives.
Living in region j at time t gives the agent utility wijt that we interpret as utility
from income, which is stochastic in our model. We assume incomes to be composed
of a persistent (autocorrelated) component wijt and a transitory (i.i.d.) component
ϕijt.8 Both components vary over time and across individuals. We assume that only
the persistent component wijt is observed before migration. Consequently, in describing
migration behavior, we can focus exclusively on the effect of persistent variations in
potential incomes. Changes in the transitory component are realized after migration and
hence do not affect migration choice. Therefore, we drop ϕijt for notational convenience
when describing migration as a function of incomes. However, when confronting our
model with data, including data on aggregate income, we need to take transitory income
fluctuations into account.
Moving from one region to the other comes at a cost. When an agent moves, she is
8 See the evidence on transitory income fluctuations provided by Storesletten, Telmer, and Yaron (2004) for instance.
subject to a disutility c that enters additively in her utility function. We assume that
migration costs, c, are constant across agents and over time. The instantaneous utility
function uit(j, k) is given by
\[ u_{it} (j, k) = w_{ijt} - I_{j \neq k} \, c \tag{5} \]
for an agent that has lived in region k in the preceding period and now lives in region j.
Here, I denotes an indicator function, which equals 1 if the agent has moved from region
k to j and 0 if the agent had lived in region j in the preceding period. Our assumption
of utility being linear in income can be understood as assuming complete markets and
perfect consumption insurance, where the allocation problem then simplifies to locating
the agent where she is most productive (taking migration costs into account).9
The agent discounts future utility by a pure time-discount factor below one and further takes into account the
probability of dying δ, such that she effectively discounts future utility by a factor β < 1, the product of the pure discount factor and the survival probability (1 − δ), and maximizes the so discounted sum of expected future utility by her location choice.
The agent knows the distribution of the persistent component of income wijt and forms
rational expectations. With wijt being stochastic, the potential migrant waits for good
income opportunities.10
The distribution of migration incentives is assumed to be log-normal. In particular,
we assume that log income in the two regions (free of transitory shocks), denoted wijt from here on,
follows an AR(1) process with normally distributed innovations ξijt and autoregressive
coefficient ρ:
\[ w_{ijt} = \mu_j (1 - \rho) + \rho \, w_{ijt-1} + \xi_{ijt}, \quad j = A, B. \tag{6} \]
This process holds for the whole continuum of agents and each agent draws her own
series of innovations ξijt for both regions. The expected value of log income in region j
is µj . The innovations ξijt are composed of aggregate as well as idiosyncratic components.
They have mean zero, are serially uncorrelated, but may be correlated across regions A,B
(see Section 4.2.2). Note that transitory shocks to income, ϕijt, which are irrelevant for
migration choices will be added to aggregate income when matching our model to data.
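To fix ideas, the process in (6) can be simulated for a cross-section of agents in one region. The sketch below uses purely idiosyncratic innovations and hypothetical parameter values (it ignores the aggregate component of ξijt), and checks that the cross-sectional mean settles at µj and the standard deviation at its stationary value.

```python
import numpy as np

rng = np.random.default_rng(3)
mu, rho, sigma_xi = 10.0, 0.9, 0.1       # hypothetical values for one region j
n_agents, n_periods = 20_000, 200

w = np.full(n_agents, mu)                # start every agent at the mean of log income
for _ in range(n_periods):
    xi = rng.normal(0.0, sigma_xi, n_agents)     # idiosyncratic innovations only
    w = mu * (1.0 - rho) + rho * w + xi          # law of motion as in equation (6)

# Stationary standard deviation of an AR(1) process with innovation std. dev. sigma_xi.
stationary_sd = sigma_xi / np.sqrt(1.0 - rho ** 2)
cross_section_mean = float(w.mean())
cross_section_sd = float(w.std())
```

The intercept µj (1 − ρ) is what makes µj the long-run mean of log income, independently of the degree of persistence ρ.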
The income distribution and migration costs, together with the utility function and
9 Any deviation from this complete markets assumption makes wealth of the agent an important state variable of the agent’s decision problem and we want to abstract from this complication.
10 Our model is based on the real-options approach to migration suggested by Burda (1993) and Burda et al. (1998). Since the latter two papers only look at migration as a once and for all decision, they preclude return migration and do not have to study the evolution of migration incentives, to which past migration decisions feed back.
the discount factor define the decision problem for the potential migrant. The optimiza-
tion problem is described by the following Bellman equation:
\[
V (k, w_{iAt}, w_{iBt}) = \max_{j = A, B} \left\{ \exp (w_{ijt}) - I_{\{k \neq j\}} c + \beta E_t V (j, w_{iAt+1}, w_{iBt+1}) \right\}.
\tag{7}
\]
Here, Et denotes the expectations operator with respect to information at time t and k
denotes the current region of the agent.11 The optimization problem is stationary and
in particular it is independent of age as agents die with a constant probability δ.
The optimal policy is relatively simple. The agent migrates from region k to region j
if and only if the costs of migration are lower than the sum of direct benefits of migration
exp (wijt) − exp (wikt) and the expected value gain
\[
\Delta V (w_{iAt}, w_{iBt}) := \beta E_t \left[ V (B, w_{iAt+1}, w_{iBt+1}) - V (A, w_{iAt+1}, w_{iBt+1}) \right].
\tag{8}
\]
This means that the agent migrates from A to B if and only if
\[
c \le \exp (w_{iBt}) - \exp (w_{iAt}) + \Delta V (w_{iAt}, w_{iBt}) =: c (w_{iA}, w_{iB}).
\tag{9}
\]
This gives a critical level of costs ciA =: c (wiA, wiB) at which agent i living in region
A and facing potential incomes wiA, wiB is indifferent between moving and not moving
to region B. Note that due to individual differences in incomes the critical cost levels
for moving, ciA, differ across individuals, while migration costs c are common. This
introduces heterogeneity in migration decisions. A person moves from A to B if and
only if c ≤ ciA. Conversely, a person living in region B moves to region A if and only
if c ≤ ciB = −ciA. Note that ciA can be positive as well as negative. If ciA is positive,
region B is more attractive. If it is negative, region A is more attractive and a person
living in region A would only have an incentive to move to region B if migration costs
were negative.
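The Bellman equation (7) and the resulting trigger rule can be illustrated with a small value function iteration. The sketch below is not the numerical procedure of this paper: it discretizes each region's log income with a Tauchen-type grid, assumes innovations that are independent across regions, and uses hypothetical values for β, c, and the income process. The function `tauchen` and all variable names are our own.

```python
import math
import numpy as np


def norm_cdf(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def tauchen(mu, rho, sigma, n=9, width=3.0):
    """Discretize w' = mu*(1-rho) + rho*w + xi, xi ~ N(0, sigma^2), on an n-point grid."""
    sd = sigma / math.sqrt(1.0 - rho ** 2)            # stationary std. dev. of w
    grid = np.linspace(mu - width * sd, mu + width * sd, n)
    half = (grid[1] - grid[0]) / 2.0
    P = np.empty((n, n))
    for a in range(n):
        mean = mu * (1.0 - rho) + rho * grid[a]
        for b in range(n):
            lo = -math.inf if b == 0 else (grid[b] - half - mean) / sigma
            hi = math.inf if b == n - 1 else (grid[b] + half - mean) / sigma
            P[a, b] = norm_cdf(hi) - norm_cdf(lo)
    return grid, P


beta, c = 0.9, 0.5                                   # hypothetical parameter values
grid_a, P_a = tauchen(mu=0.0, rho=0.9, sigma=0.1)
grid_b, P_b = tauchen(mu=0.0, rho=0.9, sigma=0.1)
n = grid_a.size

# Flow income exp(w_j) on the (w_A, w_B) grid for living in j = A (index 0) or B (index 1).
flow = np.stack([np.repeat(np.exp(grid_a)[:, None], n, axis=1),
                 np.repeat(np.exp(grid_b)[None, :], n, axis=0)])


def choice_values(V):
    """Value of living in j today before migration costs: exp(w_j) + beta * E_t V(j, w')."""
    EV = np.stack([P_a @ V[j] @ P_b.T for j in range(2)])
    return flow + beta * EV


V = np.zeros((2, n, n))                              # V[k, a, b]: k is the current region
for _ in range(5000):
    q = choice_values(V)
    V_new = np.stack([np.maximum(q[0], q[1] - c),    # k = A: stay, or pay c and move to B
                      np.maximum(q[0] - c, q[1])])   # k = B: pay c and move to A, or stay
    converged = np.max(np.abs(V_new - V)) < 1e-10
    V = V_new
    if converged:
        break

q = choice_values(V)
cbar = q[1] - q[0]          # critical cost level as in (9), including the value gain (8)
move_a_to_b = c <= cbar     # residents of A move iff c <= cbar
move_b_to_a = c <= -cbar    # residents of B move iff c <= -cbar
```

With a strictly positive cost the two move regions never overlap, and between them lies an inaction region in which agents stay wherever they currently live.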
4 Aggregate migration and the dynamics of income distributions
4.1 Aggregate migration
Given this trigger rationale for migration, the hazard rate
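\[
\Lambda_j (w_A, w_B) :=
\begin{cases}
1 & \text{if } j = A \text{ and } c \le c (w_A, w_B), \text{ or if } j = B \text{ and } c \le -c (w_A, w_B) \\
0 & \text{otherwise}
\end{cases}
\]
11 Existence and uniqueness of the value function is proved in Appendix B.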
determines whether a person living in region j moves to the other region if she faces the
potential incomes (wA, wB).
Now, consider the distribution Ft of (potential) incomes (wA, wB) and household
locations. Suppose this joint income and location distribution is the distribution after
the income shocks ξijt have been realized, but before migration decisions have been
taken. Let fjt denote the conditional density of this distribution, conditional on the
household living in region j at time t. Then, the actual fraction Λjt of households living
in j that migrate to the other region evaluates as
\[
\Lambda_{jt} := \int \Lambda_j (w_A, w_B) \cdot f_{jt} (w_A, w_B) \, dw_A \, dw_B.
\tag{10}
\]
This aggregate migration hazard can be thought of as a weighted mean of all microeco-
nomic migration hazards Λj (wA, wB), weighted by the density of income pairs (wA, wB)
from distribution Ft.
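On a discrete grid, the integral in (10) becomes a weighted sum. The following sketch approximates the aggregate hazard for residents of region A. The grid, the density, and the cost value are hypothetical, and for brevity the threshold ignores the expected value gain ∆V, so it is not the threshold of the full model.

```python
import numpy as np

w_grid = np.linspace(-1.0, 1.0, 41)                  # hypothetical grid for log incomes
wa, wb = np.meshgrid(w_grid, w_grid, indexing="ij")  # wa varies along rows, wb along columns
cell = (w_grid[1] - w_grid[0]) ** 2                  # area of one grid cell

c = 0.5                                              # hypothetical migration cost
cbar = np.exp(wb) - np.exp(wa)                       # simplified threshold without Delta V
hazard_a = (c <= cbar).astype(float)                 # micro hazard: 1 if a resident of A moves

# Hypothetical conditional density of (w_A, w_B) among residents of A, normalized on the grid.
dens = np.exp(-0.5 * ((wa / 0.4) ** 2 + (wb / 0.4) ** 2))
f_a = dens / (dens.sum() * cell)

# Aggregate hazard: micro hazards weighted by the density of income pairs, as in (10).
agg_hazard_a = float(np.sum(hazard_a * f_a) * cell)
```

Because the micro hazard is an indicator, the aggregate hazard is simply the probability mass of the conditional density that lies in the move region.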
4.2 Dynamics of income distributions
The distribution Ft (and hence fjt) evolves over time and is a result of direct shocks to
income just as it is a result of past migration. In addition it is altered by the death and
birth of agents. We need to characterize the law of motion for Ft to close our model and
to obtain the sequence of aggregate migration rates.
4.2.1 The effect of migration on income distributions
In order to follow the evolution of Ft we need to characterize both the evolution of
the fraction Pjt of households living in each region and the conditional distribution of
incomes fjt (conditional on a household actually living in a specific region j).
The proportion of households living in region j at time t+ 1 is a result of migration
decisions at time t. The law of motion for Pjt is thus given by
\[
P_{jt+1} = (1 - \Lambda_{jt}) P_{jt} + \Lambda_{-jt} P_{-jt}.
\tag{11}
\]
The first part of the sum reflects the fraction of households that remain in region j,
where (1 − Λjt) is the probability to stay in region j. The second part is the fraction
of households that migrate from region −j to region j. The probability δ of a household
dying does not influence the proportion of households living in each region as a dying
household is by assumption replaced by a newborn one in the same region.
Since the microeconomic migration hazard depends on (wA, wB) , different potential
incomes result in different propensities to migrate. As a consequence, migration changes
not only the fraction Pjt of households living in region j at time t, but also the conditional
distribution of income, fjt. For example, households living in region A, earning a low
current income, wA, but facing a substantially higher potential income in B, wB, will
probably migrate. As a result, the number of those households will drop to zero in region
A after migration decisions have been taken, while the number of households facing a
smaller income differential might not change (recall Figure 1).
The distribution of migration incentives is thus a function of past migration decisions,
and we can express the new density of households with income (wA, wB) in region j after
migration, fjt, by
\[ f_{jt}(w_A, w_B) = \left[1 - \Lambda_{jt}(w_A, w_B)\right] \frac{f_{jt}(w_A, w_B)\, P_{jt}}{P_{jt+1}} + \Lambda_{-jt}(w_A, w_B)\, \frac{f_{-jt}(w_A, w_B)\, P_{-jt}}{P_{jt+1}}. \qquad (12) \]
The probability [1 − Λjt (wA, wB)] is again the probability of staying in region j. The
term fjt (wA, wB) Pjt weights this probability and is the unconditional income density
for region j before migration has taken place. To obtain the conditional density after
migration, the unconditional income density, fjt (wA, wB) Pjt, is divided by Pjt+1, which
is the fraction (or probability) of households living in region j after migration (i.e. in
time t + 1). The second part of the sum is constructed analogously.
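The bookkeeping in equations (11) and (12) can be illustrated on a discretized income grid. The following is a sketch under stated assumptions: income pairs are collapsed into a small number of cells, the hazards are made-up numbers, and the function name `migration_step` is hypothetical.

```python
def migration_step(P, f, hazard):
    """One migration round for two regions 'A' and 'B'.

    P[j]         : share of households living in region j before migration
    f[j][i]      : conditional probability mass of income cell i in region j
    hazard[j][i] : probability that a household in region j and cell i moves
    """
    other = {"A": "B", "B": "A"}
    # Aggregate hazards: discrete analogue of equation (10).
    agg = {j: sum(h * m for h, m in zip(hazard[j], f[j])) for j in P}
    # Equation (11): stayers plus arrivals from the other region.
    P_next = {j: (1 - agg[j]) * P[j] + agg[other[j]] * P[other[j]] for j in P}
    # Equation (12): conditional density after migration.
    f_next = {}
    for j in P:
        k = other[j]
        f_next[j] = [
            ((1 - hazard[j][i]) * f[j][i] * P[j] + hazard[k][i] * f[k][i] * P[k]) / P_next[j]
            for i in range(len(f[j]))
        ]
    return P_next, f_next

# Hypothetical example with two income cells per region.
P = {"A": 0.5, "B": 0.5}
f = {"A": [0.5, 0.5], "B": [0.5, 0.5]}
hazard = {"A": [0.0, 0.2], "B": [0.1, 0.0]}
P_next, f_next = migration_step(P, f, hazard)
```

Dividing by the new population share keeps each conditional density normalized, which is the role of Pjt+1 in (12).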
4.2.2 The effect of income shocks on the income distribution
Besides migration, shocks to income also change the distribution of income pairs, Ft. The
shocks to income can be differentiated along two dimensions: One dimension is aggregate
vs. idiosyncratic, the other one is region-specific vs. economy-wide. For a single agent
we can decompose the total potential income wijt in region j (see equation 6) into
an aggregate regional component zjt and an individual-specific regional component w∗ijt,
driven by shocks θjt and εijt, respectively:
\[ \begin{aligned}
w_{ijt} &= z_{jt} + w^{*}_{ijt}, \\
z_{jt} &= \mu_j (1 - \rho) + \rho z_{jt-1} + \theta_{jt}, \\
w^{*}_{ijt} &= \rho w^{*}_{ijt-1} + \varepsilon_{ijt}, \qquad j = A, B.
\end{aligned} \qquad (13) \]
In case agent i is newborn in period t, we assume that she begins life without any past
idiosyncratic income advantage or disadvantage in region j, i.e. we set w∗ijt−1 = 0. Note
that t refers to natural time and not to the age of the agent. Further note that εijt and
θjt simply add additional structure to the income shock
ξijt = εijt + θjt
in equation (6). We assume for convenience that the autocorrelation of aggregate and
idiosyncratic shocks is the same.
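A minimal simulation sketch of the income process (13) for one agent and one region may help fix ideas. It assumes normally distributed shocks and ignores the cross-regional correlation of shocks; the function name `simulate_income` and the starting value of the aggregate component are illustrative choices.

```python
import random

def simulate_income(T, mu, rho, sd_theta, sd_eps, seed=0):
    """Simulate w_ijt = z_jt + w*_ijt for T periods (single region, single agent)."""
    rng = random.Random(seed)
    z = mu          # assumption: aggregate component starts at its mean
    w_star = 0.0    # newborn agent: no past idiosyncratic advantage
    path = []
    for _ in range(T):
        z = mu * (1 - rho) + rho * z + rng.gauss(0.0, sd_theta)   # aggregate part
        w_star = rho * w_star + rng.gauss(0.0, sd_eps)            # idiosyncratic part
        path.append(z + w_star)
    return path

# Example call with parameter values of the magnitude used in the estimation section.
path = simulate_income(T=40, mu=10.5, rho=0.952, sd_theta=0.00751, sd_eps=0.172)
```

Because both components share the autoregressive parameter, the total shock εijt + θjt drives a single AR(1) process for wijt around µj.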
The regional-aggregate shock θjt for region j hits all agents equally and changes
their potential income for region j. Note that this shock does not depend on the actual
region the agent lives in. For example, a positive shock θAt > 0 increases the potential
income in region A for agents that currently live in this region as well as for agents
that are currently living in region B. They realize this potential income by deciding to
actually live in region A. The importance of economy-wide business cycles relative to the
size of region-specific aggregate fluctuations is reflected by the correlation ψθ between
aggregate shocks θAt and θBt. The higher ψθ is, the more important economy-wide
shocks are relative to region-specific ones.
However, aggregate shocks are typically only a minor source of income variation
for an agent. Agents differ in various personal characteristics that result in different
income profiles over time. Individuals differ in their skills and while the demand may
grow for the skill of one person, demand may deteriorate for another person’s skills.
This heterogeneity is captured by the idiosyncratic shocks (εiAt, εiBt). If εiAt is positive,
income prospects of the individual agent i increase in region A. The correlation ψε
between εiAt and εiBt reflects economy-wide demand shifts for a person’s individual skills.
Since we assume aggregate and idiosyncratic shocks to be independent, the variance of
the total shock to income, ξijt, is the sum of the variances of idiosyncratic and aggregate
shocks: σ²ξ = σ²ε + σ²θ.
Persistence in incomes is captured by the autoregressive parameter ρ in equation
(13). In our baseline setup, we abstain from including permanently fixed individual
differences (fixed effects) because this makes the model numerically more tractable.
However, we compare it to a setup in which agents can be of 5 different types with fixed
preferences for or against region A but have i.i.d. income shocks otherwise.12
Aggregate and idiosyncratic shocks to income, birth and death of households, as
well as income persistence jointly determine the transition from fjt to fjt+1; details
are provided in Appendix C. The latter density now determines migration decisions in
time t + 1, starting the cycle over again. As a result, it is both past income shocks and
past migration decisions that drive the incentives to migrate. Making this explicit and
keeping track of the distributional dynamics of migration incentives is the key element
of our model, as it distinguishes our approach from other empirical models of migration.
12 While the solution of the restricted dynamic programming problem of the agent can be obtained
quickly, the simulation of the distribution of migration incentives is numerically involved. The
computation time for the estimation amounts to roughly 12h on an 8-core Xeon (Clovertown) 3GHz machine.
The alternative specification with fixed effects can be allowed for by modelling K types of agents that
have a fixed income advantage, κk ∈ R, from staying in region A instead of region B. The model is then
solved for each different type of agent as it is solved for the single type. An important aspect is that
a κk-type agent, upon dying in one region, may be replaced by a different type in that region, leading to
an initial misallocation.
4.3 Aggregate income
To link our model to aggregate data, we finally need to describe the evolution of aggregate
regional realized incomes. For region j, log aggregate income wjt is given by
\[ w_{jt} = \ln\left( \int \exp(w_j)\, f_{jt}(w_A, w_B)\, dw_A\, dw_B \right) + \phi_{jt}, \qquad (14) \]
where the first term is persistent, realized income. The second term, the transitory
income component ϕjt, measures fluctuations in income at a high frequency that are
irrelevant to the migration decision. More generally, it captures the idea that in reality
income measures migration incentives imperfectly. One reason is that any empirical
income concept is noisy as such. The inclusion of ϕjt reflects this agnostic view.
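A discretized sketch of equation (14): log aggregate income is the log of the density-weighted mean of income levels, plus the transitory component. The two-point income distribution and the function name `log_aggregate_income` are illustrative assumptions.

```python
import math

def log_aggregate_income(log_incomes, masses, phi=0.0):
    """Discrete analogue of equation (14): ln(sum of exp(w_j) * mass) + phi."""
    level = sum(math.exp(w) * m for w, m in zip(log_incomes, masses))
    return math.log(level) + phi

# Hypothetical region in which half of the residents earn log income 10 and half 11.
w_agg = log_aggregate_income([10.0, 11.0], [0.5, 0.5])
```

Since the average is taken over income levels rather than logs, the result exceeds the mean log income of 10.5 in this example whenever incomes are dispersed.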
5 Estimation
5.1 Estimation technique and estimated parameters
We rely on an indirect inference procedure to find the parameters of our model
that most closely match the observed patterns of migration in the data.
In particular, we apply a method of simulated moments (MSM) as has been proposed by
Gourieroux, Monfort, and Renault (1993) to obtain estimates of structural parameters
when the likelihood function of the structural model becomes intractable. This estimator
relies on numerical simulation of the model. Details are provided in Appendix E.
The idea behind a method of simulated moments is to: first, choose a set of moments
that captures the characteristics of the data, second, simulate the structural economic
model, and third, find parameters such that the simulated moments replicate the ob-
served moments closely.
A simulation of our model yields migration and income data for two regions. Of
course, the actual migrant faces a more complex decision problem than in our model.
Including D.C. as a destination region, an agent has to decide between 50 possible
alternative states to which she can move. To make this comparable to our model, the
50 alternatives in the data have to be aggregated to a single complementary region
for each of the 51 states.13 The average income of the alternative region is proxied
by the population-weighted average income over all 50 alternative states. This data is
combined with migration data from the Internal Revenue Service (IRS). This database
contains annual state-to-state migration flow data for the US for the period 1989-2008.
13 Generating artificial bi-regional data means that we technically assume the best income opportunity
over all alternative regions to follow a log-normal distribution, as assumed in our model. An approxima-
We simulate our model for 51 pairs of regions and 70 years, but we drop the first 50 years
for each region to minimize the influence of our initial choice of the income distribution
F0. We choose F0 to equal the ergodic distribution in the absence of aggregate shocks; see
Appendix D for details. To reduce simulation uncertainty, we replicate each simulation
5 times and use the averages over these simulations.
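The construction of the complementary region's income can be sketched as follows. The state labels and numbers are made up, and whether the average is taken over logs or levels is not specified here, so the sketch simply averages whatever income measure is passed in.

```python
def complementary_income(income, population, state):
    """Population-weighted average income over all states other than `state`."""
    others = [s for s in income if s != state]
    total_pop = sum(population[s] for s in others)
    return sum(income[s] * population[s] for s in others) / total_pop

# Hypothetical three-state example (the data use 50 states plus D.C.).
income = {"X": 9.8, "Y": 10.0, "Z": 9.6}
population = {"X": 1.0, "Y": 3.0, "Z": 1.0}
w_comp_X = complementary_income(income, population, "X")
```

Applying the function to each state in turn yields one bi-regional pair (own state, complementary region) per state.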
We estimate all parameters of the model except for the discount factor β, the prob-
ability of dying δ, and average incomes. As we work with annual data, we choose the
discount factor to be β = 0.95. We fix the probability of dying to δ = 2.5% to reflect an
average working-life expectancy of 40 years and set the mean log household income to
µA = µB = 10.5 (roughly US$ 45,000).
All other parameters of our model are estimated. Our primary estimation target
is migration costs, c. Besides migration costs, we need to estimate the correlation of
persistent shocks to income across regions, ψε and ψθ, and the importance of common
shocks across individuals, i.e. the variance of aggregate shocks σ²θ. We assume ψε = ψθ =
ψ, while we fix the correlation of transitory shocks ψϕ to the one of realized incomes in
the data (see Section 5.3.1 for a discussion of this choice). Finally, we need to estimate
the parameters of the idiosyncratic income process (ρ, σ²ε). We assume that autocorrelation
is the same for aggregate and individual shocks. This latter assumption is made for
convenience. Our complete set of estimated parameters is Θ = (c, ψ, ρ, σ²θ, σ²ϕ, σ²ε).
5.2 Data
To estimate the model we exploit data on state-to-state migration rates and household
level and aggregate data on labor incomes.
5.2.1 IRS migration data
We use state-to-state migration data for the period 1989-2008 provided by the US Inter-
nal Revenue Service (IRS). The IRS calculates state-level (and county-level) migration
data for the entire United States based on year-to-year address changes reported on
individual income tax returns filed prior to late September of each calendar year. This
means the migration data is obtained by matching the Social Security number of the
primary taxpayer from one year to the next. The IRS data identifies households with
an address change since the previous year and then totals migration to and from each
state in the US to every other state. Given these bilateral migration flows, aggregate
gross immigration and outmigration for the 50 US states and the District of Columbia
can be computed, i.e. the number of households who moved to a state and the number
of households leaving a state, respectively. Migration rates are calculated by expressing
gross immigration as a proportion of the total number of households
(migrants and non-migrants) reported in the IRS data set.
tion of this sort cannot be avoided by assuming an extreme value distribution for incomes. This would
only work if migration incentives were serially uncorrelated.
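The aggregation of bilateral flows into gross in- and out-migration rates described above can be sketched in a few lines. The flow counts, household totals, and the function name `migration_rates` are hypothetical and do not reflect the layout of the IRS files.

```python
def migration_rates(flows, households):
    """Gross in- and out-migration rates from bilateral flows.

    flows[o][d]   : number of households moving from state o to state d
    households[s] : total number of households (migrants and non-migrants) in s
    """
    states = list(households)
    inflow = {s: sum(flows[o].get(s, 0) for o in states if o != s) for s in states}
    outflow = {s: sum(n for d, n in flows[s].items() if d != s) for s in states}
    in_rate = {s: inflow[s] / households[s] for s in states}
    out_rate = {s: outflow[s] / households[s] for s in states}
    return in_rate, out_rate

# Hypothetical three-state example.
flows = {
    "X": {"Y": 30, "Z": 10},
    "Y": {"X": 20, "Z": 5},
    "Z": {"X": 10, "Y": 15},
}
households = {"X": 1000, "Y": 2000, "Z": 500}
in_rate, out_rate = migration_rates(flows, households)
```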
The IRS migration data represents between 95 and 98 percent of total annual filings.
According to Gross (2005), the IRS migration data may be the largest data set that tracks
movement of both households and people from state to state. A particular advantage
of the data set is the relatively large time period covered (1989-2008) and the almost
universal coverage of households. This is important for our study as our identification
strategy exploits the time-series business-cycle volatility of state-level migration rates.
A shortcoming of the IRS data is that it does not represent the entire US popula-
tion. Households who are not required to file income tax returns are not covered. As a
result the IRS data under-represents the poor and the elderly (also excluded is a small
percentage of tax returns filed after late September of the filing year). However, com-
pared to other sources of migration data, such as the Current Population Survey (CPS)
for example, a decisive advantage of the IRS data is the size of the population that is
sampled. The CPS on average covers roughly 1000 households per state and year, with
much smaller numbers for smaller states. This introduces significant sampling variation
in migration rates that dominates the business cycle fluctuations at the state level that
we want to measure and exploit for identification, see Section 5.3.
5.2.2 Income data
State level income data is taken from the Regional Economic Accounts provided by the
BEA. We use as income data the average wage per job (Table CA34), which is the income
concept most closely related to our model. The data is deflated using the CPI.
To relate the dispersion of household incomes our model predicts to actual data we
use data from the March Supplement Files of the CPS of the US Census.14 We match
the dispersion of incomes across households in our model to the cross-sectional dispersion
of gross earnings in the CPS.
In the CPS, respondents are interviewed to obtain information about the employment
status and earnings of each member of the household 16 years of age and older. The
sample of the CPS is representative of the civilian non-institutional population. Gross
annual earnings are defined as income from wages and salaries including pay for overtime.
Nominal earnings are deflated with the CPI and expressed in 2006 dollars. We use the
same period of time as for the IRS migration data (1989-2008).
14 We obtained the data through Unicon Research, http://www.unicon.com/.

Table 1: Descriptive statistics
                     raw data                 state-wise linearly detrended & demeaned
         mean     std      min      max       mean     std      min      max
INM      0.0373   0.0161   0.0130   0.1093    0.0373   0.0025   0.0261   0.0527
OUTM     0.0365   0.0143   0.0186   0.1146    0.0365   0.0023   0.0280   0.0758
Y        9.7994   0.1691   9.4341   10.4556   9.7994   0.0201   9.7456   9.8754
YC       9.8743   0.0677   9.7703   9.9823    9.8743   0.0168   9.8423   9.9046
YSTD     0.4656   0.0195   0.4008   0.5374    0.4656   0.0195   0.4008   0.5374
INM: In-migration rate from IRS data, OUTM: Out-migration rate from IRS data, Y: Average
wage per job (in logs) from BEA data, YC: Average wage per job in the complementary region
(in logs) from REIS data, YSTD: Cross-sectional standard deviation of log residual earnings
from CPS.
Our selected sample comprises civilians aged 23 to 55. We drop individuals who work
fewer than 5 hours a week or fewer than 4 weeks a year and obtain earnings residuals from a
regression of log labor earnings on a set of age, year, state, and education dummies. To
control for outliers, we run the regression twice, dropping (for each age) the top and bottom
0.5 percentiles based on the residuals from the first-step regression. When relating the
dispersion of log income residuals from the CPS data to our model, we take into account
the demographic structure in our model and calculate, for each state and year, a weighted
standard deviation of log earnings residuals with the model-implied population weights
that depend on δ.
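The weighted standard deviation can be sketched as follows. The weighting scheme shown is an assumption made for illustration: with a constant death probability δ and replacement by newborns, the stationary share of households surviving to a given age declines geometrically, so each age is weighted by (1 − δ) raised to the number of years since entry. The function names and the entry age of 23 are illustrative.

```python
import math

def weighted_std(values, weights):
    """Weighted standard deviation around the weighted mean."""
    total = sum(weights)
    mean = sum(v * w for v, w in zip(values, weights)) / total
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return math.sqrt(var)

def age_weights(ages, delta, entry_age=23):
    """Assumed model-implied weights: survival probability since the entry age."""
    return [(1 - delta) ** (a - entry_age) for a in ages]

# Hypothetical earnings residuals for four respondents of different ages.
residuals = [0.1, -0.3, 0.5, -0.2]
ages = [23, 30, 40, 55]
ystd = weighted_std(residuals, age_weights(ages, delta=0.025))
```

In practice the weights would also have to account for the age composition of the sample itself; the sketch leaves this out.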
5.2.3 Descriptive statistics
As we focus on the business-cycle behavior of the data, we remove a state-specific linear
time trend and state fixed effects from both migration and income data. Arguably,
using an HP-filter with the usual weights would remove too much of the fluctuations from
slowly evolving migration rates. Shimer (2005) makes a similar argument for filtering labor
market flows. Results do not qualitatively change if we use state-wise HP(100)-filtering
instead.15 Table 1 presents some descriptive statistics for the data used in the estimation.
After filtering out trends and taking out fixed state differences, in- and out-migration
rates are weakly negatively correlated (correlation coefficient: -0.31) and show mild
persistence (autocorrelation: 0.62). Overall migration activity (sums of in- and out-
migration) is roughly acyclical (correlation coefficient with income: 0.08). Further
moments of the data are displayed in Table 3 where we compare these to the matched
moments from simulations of our model.
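The state-wise removal of a linear time trend and a state mean described above amounts to taking residuals from an OLS regression of each state's series on a constant and a time index. A minimal sketch, with the function name `detrend_linear` chosen for illustration:

```python
def detrend_linear(series):
    """Residuals from an OLS fit of one state's series on a constant and a linear trend."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    slope = sxy / sxx
    return [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(series)]

# Hypothetical migration-rate series for one state; the function is applied state by state.
cyclical = detrend_linear([0.036, 0.038, 0.037, 0.040, 0.039])
```

The residuals have zero mean and no linear trend by construction, so only fluctuations around the state-specific trend remain.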
5.3 Identification
Our identification strategy is to match time-series volatilities in migration, i.e. the
business-cycle behavior. The idea behind identifying from business-cycle-frequency
fluctuations is that such an approach controls by construction for fixed state differences
like location, size, and permanent or compensating income differentials, as
these differences do not change over the cycle. Similarly, this identification is arguably
not affected by non-economic migration incentives, again because they remain constant at
business cycle frequency.
5.3.1 Moments
Our identification strategy implies as an obvious first target to match the volatility of
migration rates σ (mjt) (over time and averaged across states). Given the volatility of
aggregate income shocks, this volatility measures how sensitive migration is to aggregate
conditions.
A more direct measure of this sensitivity is a regression of migration rates on the
incomes of the destination and the source region. To make such a regression scale-invariant
with respect to incomes, we use log-deviations from average incomes as the income
variables, i.e. we estimate
\[ m_{jt} = \alpha_0 + \alpha_1 (w_{jt} - w_{j\cdot}) + \alpha_2 (w_{-jt} - w_{-j\cdot}) + u_{jt}. \]
Higher migration costs make migration less sensitive to aggregate income shocks. The
intercept α0 also reveals information about migration incentives. For example, higher
migration costs will typically lead to lower migration rates on average. Since this moment
does not strictly follow our strategy of identifying from business cycle fluctuations,
we run one (exactly identified) estimation in which we exclude average migration
rates from the set of moment conditions.
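For a single region pair, the sensitivity regression can be sketched with closed-form OLS on demeaned incomes. The sketch treats one state's time series in isolation, whereas the estimation pools states; the function name `migration_regression` and the argument names are illustrative.

```python
def mean(xs):
    return sum(xs) / len(xs)

def migration_regression(m, w_dest, w_source):
    """OLS of migration rates on demeaned log incomes; returns (alpha0, alpha1, alpha2)."""
    x1 = [w - mean(w_dest) for w in w_dest]        # destination income deviations
    x2 = [w - mean(w_source) for w in w_source]    # source income deviations
    a0 = mean(m)    # regressors have zero mean, so the intercept is the average rate
    y = [v - a0 for v in m]
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    a1 = (s22 * s1y - s12 * s2y) / det
    a2 = (s11 * s2y - s12 * s1y) / det
    return a0, a1, a2
```

Because incomes enter as deviations from their averages, α0 equals the average migration rate, which is why it can be dropped from the moment set without affecting the slope coefficients.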
To estimate the parameters of the income process (ψ, ρ, σ²θ, σ²ϕ, σ²ε) we need further
informative moments on income. Aggregate shocks θAt, θBt are common across
individuals. Hence, both θAt and θBt are contained in realized aggregate incomes wAt, wBt
(which we observe from the REIS data). Note that as migration induces selection, aggregate
realized incomes will differ from the average potential income that all agents in the
economy would obtain when living in a given state (where the realized incomes are those of
agents who actually choose to live in that given state). For this reason, the
correlation of realized incomes σ (wAt, wBt) and their variance σ² (wjt) are also not identical to
the correlation, ψ, of potential incomes and their variance. Nonetheless, we can expect
the observable σ (wAt, wBt) and σ² (wjt) to contain information on the correlation ψθ
between θAt and θBt and on their variance σ²θ.
15 Results are available in Appendix F.
To estimate the parameters of the idiosyncratic income process, σ²ε and ρ, we exploit
information on the cross-sectional variance of realized incomes σ² (wiAt), which we observe from
the CPS data on household earnings. Again, migration affects the mapping from potential
to realized incomes.
In summary, the mapping of parameters of the income process (ψε, ψθ, ρ, σ²θ, σ²ϕ,
and σ²ε) to the discussed income moments depends on all model parameters including
migration costs. Nonetheless, these parameters can be identified if their variations lead
to changes in moments of observables; see Section 5.3.2.
However, for the idiosyncratic shocks the identification problem is more severe. While
the cross-sectional variance of incomes inherits the size of shocks σ²ε and their persistence
ρ, there is by construction no income data that allows us to infer the regional correlation
of idiosyncratic income shocks ψε. A given agent is either in one or the other region, so
that the shock εijt is observable only in one or the other region (and only for stayers).
And as the shock εijt refers to "residual" income after eliminating predictable income
components (such as regional averages), there is no agent in the other region that could
be matched in order to infer εiAt and εiBt simultaneously. There is no way to resolve this
problem, so we need to assume that aggregate and individual correlation coefficients
are equal, i.e. ψε = ψθ = ψ. As argued, the correlation of aggregate shocks to potential
income, ψθ, translates to some extent into the correlation of realized incomes.
5.3.2 How variations in parameters affect moments
As discussed above, all model parameters affect more than one moment at the same
time. In fact, most parameters affect all moments simultaneously. Table 2 summarizes
these effects in a stylized way. Importantly, the parameters have quite different impacts
on the various moments, such that their combinations identify parameters. Technically
speaking, the Jacobian of moments with respect to parameters has full rank. This is
a necessary condition for identification of the model parameters. In the following, we
discuss how changes in model parameters affect the moments we aim to match.
The volatility of realized aggregate incomes increases in all "aggregate" parameters
(c, ψ, σ²θ, σ²ϕ). Of course the reasons are different for the various parameters.
The variances of persistent and transitory aggregate shocks, σ²θ and σ²ϕ, have a direct
and hence large effect. The impact of the persistent shock is somewhat muted by offsetting
migration decisions. An increase in migration costs or in the regional correlation of
shocks limits the extent to which households can use migration to evade adverse income
shocks and hence indirectly increases income volatility.
The correlation of incomes is unaffected by the variance of transitory and persistent
aggregate shocks. Only the covariance of shocks has a direct and positive impact
on the comovement of realized incomes, while higher migration costs decrease this
comovement as they limit the extent of income synchronization through migration.
The volatility of migration rates is affected by all parameters except the variance
of transitory shocks. An increase in the aggregate income volatility σ²θ directly increases
fluctuations in migration rates, because the distribution of potential incomes experiences
larger shifts. By contrast, higher migration costs or more correlated incomes decrease
the volatility of migration rates because migration rates respond less to income shocks
or because income shocks are less differential, respectively. The micro-parameters
(ρ, σ²ε) also have a large impact on the volatility of migration rates. If idiosyncratic incomes
become more volatile or more persistent, this increases the option value of migration
and hence (like a migration-cost increase) decreases migration volatility.
The sensitivity to income differences reacts to changes in model parameters as
does the volatility of migration rates: Migration costs c, idiosyncratic income dispersion,
σε, and income persistence, ρ, all decrease the sensitivity of migration rates to aggregate
income differentials. They shift out the migration trigger Λi. In contrast to its null
effect on migration volatility, a larger variance of transitory shocks, σ²ϕ, decreases the
measured sensitivities of migration to aggregate income differentials, as it decreases the
signal-to-noise ratio. The opposite holds for a larger variance of persistent aggregate
shocks, σ²θ, which increases the signal-to-noise ratio.
Average migration rates are affected by c, ψ, σ²ε, and ρ. The effect on average
migration rates is different between ρ and σε. In line with its effect on the volatility
of migration rates, an increase in ρ decreases average migration. Higher persistence
increases the extent of self-selection because a given region is preferable to an agent for
a longer period of time. The volatility of idiosyncratic incomes, σ²ε, has the opposite
effect. Although it increases option values of migration and hence shifts out Λi, it more
strongly increases the frequency at which this migration trigger is hit. Consequently,
average migration rates increase in σ²ε. The variance of aggregate income shocks, σ²θ, has
almost no impact on aggregate migration rates. Aggregate shocks shift which region is
currently preferable to the average agent but do not contribute notably to the frequency
at which agents migrate. Income risk of agents is predominantly idiosyncratic. If potential
incomes correlate more strongly (higher ψ), less is to be gained from relocation and
migration rates are lower on average; the same holds for higher migration costs.

Table 2: Simulated moments estimation: Stylized Jacobian
                                                  Moments
                            Migration Rates    Income                          Sensitivity
Parameters                  σ(mjt)    α0       σ(wjt)    σAB(wjt)   σ(wijt)    α1      α2
Migration Costs, c          - -       - -      +         -          +          - -     - -
Covariance of Shocks, ψ     -         -        +         +          +          - -     - -
Aggregate Shock, σ²θ        + +       0        + +       0*         0          + +     + +
Transitory Shock, σ²ϕ       0         0        + +       0*         0          - -     - -
Autocorrelation, ρ          - -       - -      0         +          + +        - -     - -
Idiosyncratic Shock, σ²ε    - -       + +      0         +          + +        - -     - -
σ(mjt): time-series standard deviation of migration rates, α0: average migration rate,
σ(wjt): time-series standard deviation of state-level average incomes, σAB(wjt): correlation
of state-level average incomes across states, σ(wijt): cross-sectional standard deviation of
household incomes, α1,2: sensitivity of migration rates to home and destination log-incomes.
The table contains the signs of the entries of the Jacobian of the moment condition with
respect to parameters. "+ +" stands for a strongly positive reaction, "+" for a positive
reaction, "0" for roughly no reaction, "-" for a negative reaction, and "- -" for a strongly
negative reaction of the respective moment to a change in model parameters.
*If the data moment is not perfectly matched, the composition of persistent and transitory
shocks might matter, because income without transitory shocks in the simulation has a
covariance different from the one of transitory shocks.
The dispersion of household incomes strongly depends on both the dispersion
of idiosyncratic income shocks and their persistence, σε and ρ. Both parameters directly
increase the dispersion of realized incomes. To a far lesser extent, higher migration
costs and more correlated income shocks also increase this dispersion. In both cases, it
becomes more difficult for households to evade negative income shocks through migration.
5.3.3 Practical implementation
For the estimation, we match our set of estimated sample moments,
ϱS = {σ (mjt), σ (wAt, wBt), σ (wjt), σ (wijt), α0, α1, α2}, to their corresponding
estimates from simulated data from our model, i.e. we simulate our model for a given
vector of model parameters Θ and calculate the distance between the moments obtained
from this simulation, ϱ (Θ), and the sample moments ϱS. We use the covariance matrix of
ϱS obtained by 10,000 bootstrap replications as a weighting matrix so that our distance
and goodness-of-fit measure is
\[ L = \left(\varrho_S - \varrho(\Theta)\right)' \, \mathrm{cov}(\varrho_S)^{-1} \left(\varrho_S - \varrho(\Theta)\right). \]
Under the null hypothesis of our model being the data generating process, cov(ϱS)⁻¹
is the optimal weighting matrix. The actual estimation is carried out by minimizing the
distance measure L numerically using a Nelder-Mead simplex algorithm.
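The distance measure is a quadratic form in the moment gap. The following sketch evaluates it for a given inverse covariance matrix; the moment values and the weighting matrix are placeholders, and the function name `moment_distance` is illustrative.

```python
def moment_distance(sample_moments, simulated_moments, inv_cov):
    """Quadratic form d' W d with d = sample - simulated and W = cov(sample)^(-1)."""
    d = [s - m for s, m in zip(sample_moments, simulated_moments)]
    n = len(d)
    return sum(d[i] * inv_cov[i][j] * d[j] for i in range(n) for j in range(n))

# Placeholder example with two moments and an identity weighting matrix.
L_value = moment_distance([1.0, 2.0], [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

In an estimation loop, the simulated moments would be recomputed for every candidate parameter vector Θ, and a derivative-free optimizer (for example `scipy.optimize.minimize` with `method="Nelder-Mead"`) would search over Θ for the smallest distance.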
5.4 Estimation results
Table 3 displays the point estimates of the matched moments calculated from the IRS,
REIS and CPS data and the corresponding moments obtained from the simulation of
our model under the estimated parameters. Parameter estimates are reported in Table
4. The column "(I) Baseline" refers to the estimation results from a specification setting
δ = 0.025, matching all discussed moments, and estimating the entire set of parameters.
Columns (II) to (IV) report robustness checks where we estimate an exactly identified
model excluding the average migration rate from the set of matched moments, set the
average working life to 50 years, or estimate without transitory shocks, respectively.
Table 3: Simulated moments estimation: moments estimates
                                             Data     Simulation
                                                      (I)        (II)         (III)      (IV)      (V)
Moment                                                Baseline   exclude α0   δ = 0.02   σϕ = 0    unconditional
Std. of migration rates, σ (mijt)            0.002    0.002      0.002        0.002      0.004     0.002
Corr. of agg. incomes, σ (wAt, wBt)          0.620    0.589      0.591        0.587      0.416     0.540
Std. of agg. incomes, σ (wjt)                0.019    0.019      0.019        0.019      0.013     0.021
Average migration rate, α0                   0.037    0.037      0.038        0.037      0.038     0.037
Sensitivity to destination income, α1        0.045    0.049      0.048        0.050      0.229     0.070
Sensitivity to source income, α2             -0.053   -0.049     -0.049       -0.050     -0.212    -0.049
Cross-sectional std. of incomes, σ (wijt)    0.466    0.466      0.466        0.466      0.470     0.466
*: not matched. The column ‘Data’ refers to the moments estimated from the combined
REIS/IRS/CPS data set, with data on 50 US states and D.C. over the period 1989-2008. The
columns ‘Simulation’ refer to the moments estimated from the simulation of the model using
the parameters given in Table 4. Both actual and simulated data are within-transformed and
state-wise linearly de-trended. The simulations generate a panel of 51 region-pairs and a
70-year history of migration and income data. The first 50 years of simulated data are dropped
in order to minimize the influence of initial values. Each simulation is repeated 5 times and
data moments are compared to the average over the 5 replications of the simulation.
The final column, (V), reports estimates where migration-induced income dynamics is
ignored. We discuss the results for this specification in the next section.
Overall our model is able to replicate the observed moments closely. In fact, the
χ² (1)-distributed overidentification test reported at the bottom of Table 4 does not
reject our model at the 5% level. The estimated migration costs are US$
34,248. This is substantially smaller than the estimates reported in previous contribu-
tions such as Davies, Greenwood, and Li (2001), but in line with Kennan and Walker
(2011) when they take into account expected pay-off shocks conditional on migration.
Parameter estimates from the robustness checks do not differ qualitatively from our
baseline specification.16

Table 4: Simulated moments estimation: structural parameter estimates
                                          (I)        (II)         (III)      (IV)      (V)
                                          Baseline   exclude α0   δ = 0.02   σϕ = 0    unconditional
Autocorrelation of income, ρ              0.952      0.951        0.948      0.936     0.627
                                          (0.020)    (0.168)      (0.018)    (0.005)   (0.569)
Std. of idiosyncratic shocks, σε          0.172      0.173        0.175      0.195     0.366
                                          (0.022)    (0.256)      (0.019)    (0.005)   (0.206)
Std. of transitory shocks, σϕ             0.019      0.019        0.019      0         0.019
                                          (0.0001)   (0.001)      (0.001)    n.a.      (0.001)
Std. of aggregate shocks, σθ (in %)       0.751      0.747        0.765      1.261     1.196
                                          (0.082)    (0.131)      (0.069)    (0.030)   (0.197)
Correlation of shocks across regions, ψ   0.316      0.334        0.309      0.216     0.379
                                          (0.343)    (0.354)      (0.325)    (0.046)   (0.102)
Migration cost, c, in logs                10.441     10.421       10.425     10.707    11.349
                                          (0.199)    (0.292)      (0.224)    (0.038)   (1.097)
Migration cost, c in $                    34,248     33,541       33,682     44,667    84,843
Moment distance, χ² (1)                   3.053      2.913        3.514      1253      84.78
p-value                                   0.081      n.a.         0.061      0         0
Standard errors in parentheses. Estimation is carried out using the simulated moments
estimator by Gourieroux, Monfort, and Renault (1993), which chooses structural model
parameters by matching the moments from a simulated panel of regions with data moments as
displayed in Table 3. For details on the simulation, see notes to Table 3.
The estimated standard deviation of aggregate income shocks is 0.75% while the
standard deviation in idiosyncratic shocks is 17.2%. Hence aggregate shocks make up
0.18% of the total variance in income. There is a significant transitory income compo-
nent (measurement error) in the aggregate income fluctuations, which has an estimated
standard deviation of 1.9%. This means that transitory fluctuations in aggregate income
add a variance term that has about 40% of the long-run variance of the sum of potential
incomes and measurement error. However, migration smooths realized incomes so that
transitory shocks make up more of the aggregate variance in realized incomes.
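The 40% figure can be checked against the baseline estimates in Table 4. The following is a minimal sketch, assuming that the persistent aggregate component is a stationary AR(1) process with the estimated autocorrelation ρ (so that its long-run variance is σθ²/(1 − ρ²)) and that the transitory component is i.i.d.; the function name is ours and purely illustrative.

```python
# Baseline point estimates from Table 4, column (I).
RHO = 0.952            # autocorrelation of income
SIGMA_THETA = 0.00751  # std. of aggregate shocks (0.751%)
SIGMA_PHI = 0.019      # std. of transitory shocks


def transitory_share(rho, sigma_theta, sigma_phi):
    """Share of the transitory variance in the long-run variance of the sum
    of the persistent aggregate component and the transitory component.

    Assumption: the persistent component is a stationary AR(1), so its
    long-run variance is sigma_theta^2 / (1 - rho^2).
    """
    long_run_var = sigma_theta ** 2 / (1.0 - rho ** 2)
    transitory_var = sigma_phi ** 2
    return transitory_var / (long_run_var + transitory_var)


share = transitory_share(RHO, SIGMA_THETA, SIGMA_PHI)
print(f"transitory share of long-run variance: {share:.3f}")
```

Under these assumptions the share comes out a little below 40%, in line with the statement in the text.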
The estimated standard deviation and persistence of idiosyncratic incomes are in line
with the numbers reported, for example, in Storesletten et al. (2004). The estimated
correlation of latent shocks to potential income across regions is 31.6%. This is roughly
16 Further robustness checks are provided in Appendix F, including alternative filtering of trends and alternative definitions of aggregate incomes.
half the observed correlation of realized incomes (62%, see Table 3). The key difference
between the two is that realized income comprises self-selection of the agents into the
region in which they are better off. Differential shocks to regional incomes are partly
offset by migration, while common income shocks do not trigger moves that offset the
shock. After an adverse income shock to a region, the low income agents of that region
move to the region that has become relatively richer. This dampens the income decrease
in the region hit by the shock and decreases average income in the other region. Hence,
migration ties together the average realized incomes in both regions more closely than
potential incomes are.
5.5 Ignoring dynamic self-selection
In the final column (V) of Table 4 we report the estimates for an approximate version
of our model. There, we purposely ignore the dynamic self-selection that shapes the
distributions of potential incomes and replace the conditional density in (10) by its
unconditional counterpart. If self-selection played no role, this replacement would be
innocuous: the former place of residence would not be informative about unobservable
migration incentives, and conditional and unconditional distributions would coincide.
Hence, we should obtain estimation results similar to our baseline specification if
self-selection were of no concern. One may be tempted to think so, as annual migration
rates are small.
The column ‘unconditional’ in Table 4 reports the estimation results from this
exercise ignoring dynamic self-selection, i.e. using the unconditional income distributions
instead of the ones conditional on the place of residence. Neglecting self-selection is
anything but harmless. The point estimates of all model parameters change substantially.
Most importantly, and in line with our argument in Section 2, the estimated migration
costs, at US$ 84,843, are substantially larger than in the baseline estimation and
all other robustness checks. Hence, treating migration as a dynamic decision problem
at the micro level without taking care of dynamic self-selection in the aggregation may
lead to a severe bias.
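Why small annual migration rates do not make the unconditional distribution a good approximation can be illustrated with a stylized simulation. The sketch below is not the estimated model: it assumes a single household whose potential income differential between regions A and B follows an AR(1) process with unit stationary variance, and it replaces the dynamic programming solution by a myopic threshold rule (move when the other region is better by more than `threshold`). The function name, the threshold value and the seed are illustrative assumptions.

```python
import math
import random


def simulate_selection(rho=0.952, threshold=1.0, n_periods=300_000, seed=1):
    """Simulate one long household history under a myopic threshold rule.

    d is the potential income differential (region A minus region B), an
    AR(1) process scaled to unit stationary variance. The household moves
    when the other region beats the current one by more than `threshold`.
    Returns (migration rate, mean of d over the periods spent in region A).
    """
    rng = random.Random(seed)
    sigma_eps = math.sqrt(1.0 - rho ** 2)
    d = rng.gauss(0.0, 1.0)
    in_a = rng.random() < 0.5  # random initial assignment to a region
    moves = 0
    periods_in_a = 0
    sum_d_in_a = 0.0
    for _ in range(n_periods):
        d = rho * d + sigma_eps * rng.gauss(0.0, 1.0)
        if in_a and d < -threshold:
            in_a = False
            moves += 1
        elif (not in_a) and d > threshold:
            in_a = True
            moves += 1
        if in_a:
            periods_in_a += 1
            sum_d_in_a += d
    return moves / n_periods, sum_d_in_a / periods_in_a


migration_rate, mean_d_given_a = simulate_selection()
print(f"migration rate per period: {migration_rate:.3f}")
print(f"mean differential among residents of A: {mean_d_given_a:.3f}")
```

The unconditional mean of the differential is zero by construction. Conditional on living in region A, however, the mean should be clearly positive even though only a few percent of households move per period, which is why substituting the unconditional density for the conditional one is not innocuous.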
6 (How) to model persistence matters
6.1 Two alternatives
Next, we show that modelling persistence in incomes and the way it is modelled have
stark consequences that are rooted in dynamic self-selection but go beyond a potential
bias in parameter estimates. For this purpose, we re-estimate the model for two
specifications. In the first one, we fix the autocorrelation ρ at zero. In the second one,
we additionally introduce fixed household effects in incomes as an alternative way to
model persistence. In this second alternative, an agent permanently faces higher income
potential in one or the other region, but is randomly assigned to one of the regions at
birth. In this specification, we again need to take the self-selection of agents into regions
into account when estimating the model. The key difference to our baseline specification
is that under the fixed effects specification, persistent heterogeneity is revealed at labor
market entry while heterogeneity of agents is slowly building up over time in our baseline
specification. Estimation results for both experiments are reported in Table 5.

Table 5: Simulated moments estimation: Estimation results from models without dynamic self-selection

                                         baseline    ρ = 0       fixed effects
Autocorrelation of income, ρ             0.952       0           0
                                         (0.020)     -           -
Std. of fixed idiosyncratic effects, σκ  0           0           0.347
                                         -           -           (0.289)
Std. of idiosyncratic shocks, σε         0.172       0.442       0.420
                                         (0.022)     (0.001)     (0.063)
Std. of transitory shocks, σϕ            0.019       0.019       0.017
                                         (0.001)     (0.001)     (0.001)
Std. of aggregate shocks, σθ (in %)      0.751       1.225       1.562
                                         (0.082)     (0.053)     (0.095)
Correlation of shocks across regions, ψ  0.316       0.457       0.451
                                         (0.343)     (0.124)     (0.098)
Migration cost, c, in logs               10.441      10.476      10.031
                                         (0.199)     (0.103)     (0.358)
Migration cost, c, in $                  34,248      35,471      22,715
Moment distance, χ2(1)                   3.053       117.828     116.952
p-value                                  0.081       0           0

See notes to Table 4. The fixed effects model assumes seven types of agents with equal
population size and different permanent attachment to either region A or B. As the migration
triggers under ρ = 0 are further out in the ergodic unconditional income distribution, both the
ρ = 0 and fixed effects model are solved with a finer grid for the income process.

In interpreting the estimation results some care needs to be taken. Overall the
point estimates remain relatively similar (except for the variance of idiosyncratic income
shocks). Even though we obtain similar parameter estimates, the model under ρ = 0
exhibits very different elasticities with respect to migration costs, as we will discuss later.
Migration cost estimates nonetheless remain similar because the lower ρ is, the smaller
is the dispersion of the present values of the income streams a household obtains by
remaining in one region forever. In other words, the lower ρ is, the less extreme
are potential gains from migration.
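The link between ρ and the dispersion of present values can be made concrete with a back-of-the-envelope calculation. The sketch below works in logs and is only an approximation of the model's present values in levels: for a stationary AR(1) deviation w with innovation standard deviation σε, the expected discounted sum of current and future deviations is w/(1 − βρ). The discount factor β = 0.95 is an assumed value chosen for illustration, not an estimate from the paper.

```python
import math


def pv_dispersion(rho, sigma_eps, beta):
    """Std. of the expected discounted sum of log-income deviations.

    For a stationary AR(1), the current deviation w has standard deviation
    sigma_eps / sqrt(1 - rho^2), and E_t sum_s beta^s w_{t+s} = w / (1 - beta*rho).
    """
    stationary_std = sigma_eps / math.sqrt(1.0 - rho ** 2)
    return stationary_std / (1.0 - beta * rho)


BETA = 0.95  # assumed discount factor, for illustration only

# Point estimates from Table 5: baseline column and rho = 0 column.
baseline = pv_dispersion(rho=0.952, sigma_eps=0.172, beta=BETA)
no_autocorr = pv_dispersion(rho=0.0, sigma_eps=0.442, beta=BETA)
print(f"baseline: {baseline:.2f}, rho = 0: {no_autocorr:.2f}")
```

Although the estimated innovation variance is much larger under ρ = 0, the dispersion of discounted income prospects is far smaller than in the baseline, so similar migration costs can rationalize the same migration rate.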
Having this in mind, we set up an alternative specification that captures persistence
in incomes. The final column in Table 5 reports results from a specification where we
set the autocorrelation ρ in (13) to zero, but introduce an additional fixed effect that
increases (or decreases) household i's potential log-income in region A by κi permanently.
This implies that (13) is modified for region A to

$$w_{iAt} = z_{At} + w^{*}_{iAt} + \kappa_i.$$

To solve the model, we then specify that an agent can be one of 7 types, κi ∈ {κ1, ..., κ7},
each making up 1/7 of the entire population. We assume that upon birth, the type of the
household is randomly assigned. Note that this implies that households will dynamically
self-select based on the realization of κi, but they may exhibit a strong mismatch with
their region when born. This specification is closest to the setup Kennan and Walker
(2011) consider. Still there are differences: in our setup agents know their potential
incomes before the migration decision (migration is not an experience good) and migration
costs are fixed; in addition, our estimation strategy is different.
Again the estimated migration costs do not differ significantly from our baseline
result. They are in fact very close to what Kennan and Walker (2011) obtain when they
take into account the expected i.i.d. payoff shock of a migrant (their migration costs are
stochastic). Our estimation results indicate a substantial amount of perfectly persistent
heterogeneity, σκ = 0.347, making up about 40% of the total income variance.
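The 40% share follows directly from the point estimates in the fixed effects column of Table 5. The short sketch below assumes that the total idiosyncratic income variance is the sum of the fixed-effect variance and the variance of the (serially uncorrelated) idiosyncratic shocks, ignoring the much smaller aggregate and transitory components.

```python
SIGMA_KAPPA = 0.347  # std. of fixed idiosyncratic effects (Table 5)
SIGMA_EPS = 0.420    # std. of idiosyncratic shocks (Table 5, fixed effects column)


def fixed_effect_share(sigma_kappa, sigma_eps):
    """Variance share of the permanent component when rho = 0."""
    return sigma_kappa ** 2 / (sigma_kappa ** 2 + sigma_eps ** 2)


share = fixed_effect_share(SIGMA_KAPPA, SIGMA_EPS)
print(f"fixed-effect share of income variance: {share:.2f}")
```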
Although they have small effects regarding parameter estimates, model assumptions
on whether and how to model autocorrelations in income have strong consequences in
terms of model behavior. To illustrate this, we first look at exogenous changes in mi-
gration costs (compared to the estimated ones) and calculate the migration response.
Second, we analyze the age-patterns in migration predicted by our baseline model and
the alternative model specification with fixed effects. We show that only the baseline
specification is able to account for the empirically observed age patterns (without intro-
ducing age-dependent migration costs).
6.2 Counterfactual experiments
We consider three counterfactual experiments, where we vary migration costs in the
baseline, the no-autocorrelation, and the fixed-effects version of our model. In Table
6 we report migration rates and average incomes for these experiments together with
the numbers under the estimated migration costs. We leave all parameters other than
migration costs as estimated.
In the first experiment we set migration costs to zero. This allows us to determine a
steady state in which only the distribution of migration incentives and not costs of moves
determine the migration rate. In a situation in which unobservable migration incentives
are serially uncorrelated, i.e. drawn completely anew every period, migration rates are
50% on average in the absence of migration costs. In such a situation of zero costs
and i.i.d. incentives, every period half of the population in one region is better off by
moving to the other one. In fact this is what we find for the no-autocorrelation model. By
contrast, both our baseline model and the fixed effects model display migration rates well
below 50% in the absence of migration costs. In both setups, distributions of migration
incentives result from past migration decisions. Agents self-select into the region where
they are better off. Only those agents that have been on the margin (on the verge of
moving in the preceding period) are likely to migrate in the current period. Yet, the
difference between the fixed effects and the baseline AR-1 specification for income
persistence is tremendous. Migration rates increase to only 12.6% in the baseline but to
37.7% in the fixed effects specification, even though in both setups the migration rate is
3.7% under the estimated costs. Under the AR-1 specification, dynamic self-selection plays a
much bigger role in determining migration rates than in the fixed effects setup.
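The zero-cost logic can be illustrated in a stripped-down setting. The sketch below assumes a single stationary AR(1) income differential between the two regions and no births, deaths, aggregate or transitory shocks, so it is not the estimated model. Without migration costs a household simply lives where current income is higher, so it moves whenever the differential changes sign. For a Gaussian AR(1) this happens with probability arccos(ρ)/π, which is 50% for ρ = 0 and about 10% for ρ = 0.952. Function names and the seed are illustrative.

```python
import math
import random


def sign_switch_rate(rho, n_periods=200_000, seed=0):
    """Share of periods in which an AR(1) income differential changes sign,
    i.e. the migration rate under zero migration costs in this toy setting."""
    rng = random.Random(seed)
    sigma_eps = math.sqrt(1.0 - rho ** 2)  # normalizes stationary variance to 1
    d = rng.gauss(0.0, 1.0)
    switches = 0
    for _ in range(n_periods):
        d_next = rho * d + sigma_eps * rng.gauss(0.0, 1.0)
        if (d_next > 0) != (d > 0):
            switches += 1
        d = d_next
    return switches / n_periods


def closed_form(rho):
    """Probability that two standard normals with correlation rho differ in sign."""
    return math.acos(rho) / math.pi


for rho in (0.0, 0.952):
    print(rho, round(sign_switch_rate(rho), 3), round(closed_form(rho), 3))
```

The toy number for ρ = 0.952 is of the same order as, but not identical to, the 12.6% reported for the baseline model, which additionally features newborn households that are randomly assigned to a region.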
A problem with the zero-cost counterfactual could be that the cost decrease differs
across models and this might be responsible for the different changes in migration
rates. Therefore, we consider a second experiment, where a migration subsidy of $10,000
is awarded. Yet, this does not change the picture, as one can see from Table 6. Migration
rates respond much more strongly in the fixed effects and no-autocorrelation specifications
than they do in our baseline setup.
Finally, we consider an increase in migration costs by $30,000, which roughly doubles
costs in the baseline model. One may think of such an experiment as a simulation of the
effect of a less liquid housing market such as after the financial crisis. Again, the different
empirical models make very different predictions. Our baseline model predicts a mild
decrease in migration activity by one percentage point, while the fixed effects model
predicts a decline by more than 2.5 percentage points. Also the impact on average
incomes is much larger in the latter model. Average incomes decline by 1.5% ($ 850) in
the fixed effects setup compared to 1% ($ 500) in our baseline model. In summary, if
one thinks about effects of changing migration costs, it very much matters for the results
how one models income persistence.

Table 6: Simulation results: variations in migration costs

Model                       Migration costs     Average          Average       Average
                                                migration rate   log income    income
Baseline (AR-1),            zero costs          12.6%            10.84         50,869
ρ = 0.952                   10,000$ subsidy     4.3%             10.83         50,362
                            as estimated        3.7%             10.82         50,161
                            30,000$ increase    2.7%             10.81         49,662
No autocorrelation,         zero costs          50.0%            10.77         47,382
ρ = 0                       10,000$ subsidy     8.2%             10.67         43,217
                            as estimated        3.7%             10.64         41,898
                            30,000$ increase    0.4%             10.61         40,376
Fixed effects, no further   zero costs          37.7%            10.91         54,721
autocorrelation, ρ = 0      10,000$ subsidy     10.4%            10.91         54,666
                            as estimated        3.7%             10.89         53,745
                            30,000$ increase    1.2%             10.88         52,892

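The responses quoted in the text can be recomputed from the entries of Table 6. The snippet below is only a bookkeeping aid: it stores the reported migration rates and average incomes and expresses each counterfactual relative to the scenario with estimated costs. The dictionary layout and function name are ours.

```python
# (average migration rate in %, average income in $) as reported in Table 6.
TABLE_6 = {
    "baseline": {
        "zero": (12.6, 50869), "subsidy": (4.3, 50362),
        "estimated": (3.7, 50161), "increase": (2.7, 49662),
    },
    "no_autocorrelation": {
        "zero": (50.0, 47382), "subsidy": (8.2, 43217),
        "estimated": (3.7, 41898), "increase": (0.4, 40376),
    },
    "fixed_effects": {
        "zero": (37.7, 54721), "subsidy": (10.4, 54666),
        "estimated": (3.7, 53745), "increase": (1.2, 52892),
    },
}


def response(model, scenario):
    """Change relative to the estimated-cost scenario, returned as
    (percentage-point change in migration rate, % change in average income)."""
    rate, income = TABLE_6[model][scenario]
    base_rate, base_income = TABLE_6[model]["estimated"]
    return rate - base_rate, 100.0 * (income - base_income) / base_income


for model in TABLE_6:
    d_rate, d_income = response(model, "increase")
    print(f"{model}: {d_rate:+.1f} pp migration, {d_income:+.2f}% income")
```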
6.3 Age patterns of migration
There is a second point for which we want to highlight the differences implied by dif-
ferent strategies to model income persistence. So far, we focused on the estimation of
migration costs and aggregate measures in describing the role played by dynamic self-
selection. However, current contributions to the empirical study of migration go beyond
the representative agent assumption of our homogeneous migration-cost model (with
heterogeneous incentives). A well documented pattern in migration data is that younger
agents are significantly more likely to move than older agents.
The standard explanation of this pattern relies on the investment character of mi-
gration choices, the so-called human capital theory of migration. This theory rests on
the fact that younger agents face a longer period in which their migration choices can
pay off. As a consequence younger agents are more likely to migrate just as younger
agents are more likely to invest in human capital. While this human capital theory of
migration is able to explain well the difference in migration between job starters and
agents close to retirement, it has difficulties in explaining the sharp decline in migration
rates between ages 20 and 35 (Kennan and Walker, 2011). Accordingly some authors
have suggested age-dependence of migration costs.17
Our model provides one possible explanation for the age-migration relation that
does not rely on age-dependent migration preferences. In our model, agents start their
working life randomly assigned to one of the two regions. Agents then repeatedly choose
whether to stay or to move to the alternative region observing their current potential
income differences, which result from their history of income shocks. This sequence of
income shocks and migration choices has two consequences. First, over the course of
their lives, agents accumulate income risk since income is highly autocorrelated. Thus
potential incomes become more dispersed between regions as agents get older. Second,
an agent stays in a given region if she earns more income in her current region than
in the alternative one. This means that ex-ante (i.e. before income shocks realize) the
match between agent and region becomes more efficient as agents get older. They have
selected themselves into their preferred region. This increasing match efficiency implies
that migration rates generally decline in age. This effect is similar to the role of age in
Jovanovic’s (1979) job search model and has in principle been discussed in work parallel
to ours (Kennan and Walker, 2011, Coen-Pirani, 2010). Here we show that the way
persistence in incomes is modelled is decisive for the extent to which this selection channel
can quantitatively account for the age patterns in migration as observed in the data.
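The first consequence, the accumulation of income risk with age, can be illustrated with the variance of an AR(1) process. The sketch below assumes that the persistent component of potential income starts at zero at labor market entry (age 21) and receives one shock per year, using the baseline estimates of ρ and σε; the starting condition and the function name are our assumptions for illustration.

```python
def potential_income_variance(age, entry_age=21, rho=0.952, sigma_eps=0.172):
    """Variance of an AR(1) component that starts at zero at labor market
    entry and receives one shock per year (the entry-year shock included)."""
    years = age - entry_age + 1
    if years <= 0:
        raise ValueError("age must not be below the entry age")
    return sigma_eps ** 2 * (1.0 - rho ** (2 * years)) / (1.0 - rho ** 2)


for age in (21, 25, 30, 40, 50):
    print(age, round(potential_income_variance(age), 3))
```

The variance rises monotonically from σε² at entry towards the stationary value σε²/(1 − ρ²), so differences in potential incomes between regions fan out as a cohort ages.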
To investigate this, we simulate 50,000 households over 50 periods of time for each
state, repeat the simulation 5 times, and store both the households’ ages and their migration
choices for the last 20 years of the simulation. This allows us to calculate migration rates
by age as displayed in Figure 2. We assume that a household enters the labor market at
the age of 21.18 One can see that the migration rate declines as the household becomes
older, but the drop in migration rates is smoothed over different ages. With our baseline
estimates, agents have a probability to migrate of 10% at the time they enter the labor
market, while agents who are 20 years older only migrate with a probability of 3%. This
age pattern is induced by agents choosing their optimal location over time.
We compare this implied age pattern of migration with observed migration rates by
age in the Current Population Survey (CPS). The available IRS data do not provide a
breakdown of migration by age. However, here we can use the CPS data on migration rates,
as we are not interested in fluctuations over time and states but just in the average age
pattern. CPS migration rates are overall slightly lower than the migration rates in the
IRS data.
17 Family formation and education choice (see for example Gemici, 2011) are examples of other potential explanations of the sharp decrease in migration rates between age 20 and age 35.
18 This corresponds to the midpoint estimate of first and last transition from school to work as reported in Jacob and Weiss (2010) for the US.

Figure 2: Simulation results: migration rates by age
[Figure: migration rate (vertical axis, 0 to 0.25) against age (horizontal axis, 20 to 50); series: Model (AR1), Model (Fixed effects), CPS.]
Model: Relative frequencies of migration conditional on age. The frequencies are obtained by
simulating the behavior (migration, income, death and birth) of a cross-section of 50,000
households for 50 years. Reported frequencies are obtained by averaging over 5 repeated
simulations using only the final 20 years of data in each simulation. "AR-1" refers to our
baseline estimate, "Fixed effects" to the model which captures income persistence by fixed
effects instead.
CPS: Average interstate migration rates of households by age from the Current Population
Survey (CPS), civilian population of age 21-50. Average over the years 1989-2004; there is no
data for 1995.

As our model predicts, migration rates fall quickly in the first years after labor market
entry. However, our model underpredicts the decline in migration rates at later ages.
This is likely due to the eternal youth structure of our model that shuts down the human
capital channel of migration described before. Understanding the dynamic self-selection
channel as complementary to the human capital channel, we thus expect a richer model
encompassing both channels to fit the observed age patterns in migration rates closely
without making migration costs age-dependent.19

Figure 3: Simulation results: income dispersion by age
[Figure: income variance (vertical axis, 0 to 0.35) against age (horizontal axis, 20 to 50); series: AR1 realized, AR1 potential, Fixed effects realized, Fixed effects potential.]
"Realized": Variance of realized incomes in the economy. The variances are obtained by
simulating the behavior (migration, income, death and birth) of a cross-section of 50,000
households for 50 years. Reported variances are obtained by averaging over 5 repeated
simulations using only the final 20 years of data in each simulation. "Potential": Variance of
potential incomes as given by the income process. "AR-1" refers to our baseline estimate,
"Fixed effects" to the model which captures income persistence by fixed effects instead.

Figure 2 also reveals that migration rates do not decline monotonically in age in
our model. At the time of entry into the labor market there is not much heterogeneity
across agents, but this heterogeneity quickly grows as agents accumulate shocks to their
potential incomes. This means that initially only few agents observe income differences
between both regions large enough to make them move. Therefore, under our estimated
migration costs, migration rates are highest one year after labor market entry. The same
holds true for the CPS data. After the first period there are many agents who would
earn more in the other region but not sufficiently more to justify a move. In the second
period, many of these agents observe an income shock that is large enough to induce a
move, because income shocks are large relative to the observed heterogeneity at early
ages. In the following periods the effect of dynamic self-selection dominates the effect of
increasing heterogeneity, leading to the inverse hump-shaped pattern of migration rates.
19 Of course, the close match of simulated and actual age patterns depends on our choice of age 21 being the age of labor market entry. If labor market entry is around that age, the match is roughly as good as displayed in Figure 2. If labor market entry were substantially later, the match would be worse.
The figure looks very different for the model with fixed effects, see again Figure 2.
Here we see a sharp spike of migration activity at labor market entry and a quick decline
thereafter. Since agents know where they are permanently better off at the time of
labor market entry, they seek to realize the income gain by immediate migration. In our
baseline model, by contrast, regional attachments slowly unfold in the course of time
and selection happens over a much longer time period.
For the same reason, we obtain fairly different results on the effect of migration on
the dispersion of incomes, see Figure 3. In the fixed effects model there is a substantial
difference between the realized income dispersion and the dispersion of potential incomes
already at labor market entry and it is almost independent of age. This is because agents
know the fixed effects part of potential income already at the time of labor-market entry
and hence, if they are in a region that is permanently a poor match at this time, they
move away quickly. By contrast, in our baseline model the effect of migration on income
dispersion builds up with age as the income dispersion fans out. Only for fairly old workers
is there a large difference between realized and potential income dispersion. For the overall
population, however, the reduction in income dispersion by migration is relatively small
as the fraction of older workers is small in our model.20
7 Conclusion
We have provided a model of aggregate migration with microeconomic foundation. The
paper is a contribution to the recently evolving literature on structural models of mi-
gration. We explicitly deal with the problem that potential gains from migration are
unobservable and display a dynamic character. This dynamic character of migration
incentives has two aspects: First, the individual gains from migration evolve stochasti-
cally over time, but will typically be highly persistent. Second, at an aggregate level, the
distribution of migration incentives is a result of past migration decisions themselves.
Starting from the microeconomic decision problem allows us to keep track of the
dynamics of the incentive distribution. These dynamics are driven by (dynamic) self-selection.
Neglecting this self-selection results in biased estimates of structural parameters, such
as migration costs. The estimated migration costs amount to about US$ 34,248, which
corresponds to two-thirds of an average annual income.
20 Note that we take the population distribution in age into account when matching the wage dispersion data, see Section 5.2.2.
Our analysis calls for a careful treatment of the self-selection problem when economic
incentives are not fully observable but persistent. This is particularly relevant for the
analysis of migration. Rather than being drawn every period anew, migration incentives
have a long memory. One example of this long memory of migration incentives is the
persistence that income displays. We integrated the persistence of unobserved migration
incentives in a structural dynamic microeconomic model of the migration decision. This
consequently allowed us to simulate the joint behavior of the observed migration rates,
of the unobserved migration incentives, and of their observable proxies, i.e. incomes.
Addressing the partial unobservability of migration incentives may not only be impor-
tant to macro-studies of migration. Also at a micro level, potential incomes are typically
unobservable and have to be proxied. However, such approximation regularly neglects
self-selection. If households live in their preferred place of residence as a result of their
location choice, and if all observable things are equal, then it must be the unobserved
component of their preferences that is in favor of the place in which they actually live.
Besides unobservable parts of income, this unobservable component of preferences can
also comprise different valuations of amenities and social networks. Even these factors
can be expected to exhibit persistence.
Future research should develop a richer microeconomic model that integrates more
information into the macroeconomic analysis, for example labor market conditions and
amenities. Additionally, it would be desirable to extend our bi-regional approach to the
case of multiple regions, as in Davies, Greenwood and Li (2001) and Kennan and Walker
(2011). Further aspects, such as the interaction of migration and local labor markets,
could be analyzed in a general equilibrium framework as in Coen-Pirani (2010), but
our results call for an explicit treatment of the dynamic structure and persistence of
migration incentives.
Taking a more general perspective, our paper highlights the role of dynamic self-
selection in a model with imperfectly observed incentives. One can expect the econo-
metric issues that we raise to carry over to other examples of dynamic discrete choice
problems. Examples would be labor-market participation (see Keane and Wolpin, 2009)
or product switching (see e.g. Sweeting, 2007). In these frameworks, too, our suggested
solution may well be applicable: explicit aggregation and taking the incentive dynamics
seriously.
References
[1] Adda, J. and R. Cooper (2003): "Dynamic Economics: Quantitative Methods and
Applications", MIT Press, Cambridge.
[2] Aguirregabiria, V. and P. Mira (2010): "Dynamic Discrete Choice Structural Mod-
els: A Survey", Journal of Econometrics, 156, 38-67.
[3] Armenter, R. and F. Ortega (2010): "Credible Redistribution Policies and Migration
across U.S. States", Review of Economic Dynamics 13, 403-423.
[4] Blanchard, O. and L. Katz (1992): "Regional evolutions", Brookings Papers on
Economic Activity, 1, 1-75.
[5] Breitung, J. and W. Meyer (1994): "Testing for unit roots in panel data: are wages
on different bargaining levels cointegrated?", Applied Economics, 26, 353-361.
[6] Burda, M. (1993): "The Determinants of East-West German Migration: Some First
Results", European Economic Review, 37, 452-462.
[7] Burda, M., M. Müller, W. Härdle, and A. Werwatz (1998): "Semiparametric Analy-
sis of German East-West Migration Intentions: Facts and Theory", Journal of Ap-
plied Econometrics, 13, 525-541.
[8] Borjas, G. J. (1987): "Self-selection and the earnings of immigrants", American
Economic Review, 77, 531-553.
[9] Borjas, G. J., S. Bronars, and S. Trejo (1992): "Self-selection and internal migration
in the United States", Journal of Urban Economics, 32, 159-185.
[10] Caballero, R. and E. M. R. A. Engel (1999): "Explaining Investment Dynamics in
U.S. Manufacturing: A Generalized (S,s) Approach", Econometrica, 67, 783-826.
[11] Chow, C. S. and J. N. Tsitsiklis (1991): "An optimal multigrid algorithm for con-
tinuous state discrete time stochastic control", IEEE Transactions on Automatic
Control, 36, 898-914.
[12] Cushing, B. and J. Poot (2004): "Crossing boundaries and borders: Regional science
advances in migration modelling", Papers in Regional Science, 83, 317-338.
[13] Coen-Pirani, D. (2010): "Understanding Gross Workers Flows Across U.S. States",
Journal of Monetary Economics, 57, 769-784.
[14] Davies, P. S., M. J. Greenwood, and H. Li (2001): "A Conditional Logit Approach
to U.S. State-to-State Migration", Journal of Regional Science, 41, 337-360.
[15] Decressin, J. and A. Fatas (1995): "Regional Labor Market Dynamics in Europe",
European Economic Review, 39, 1627-1655.
[16] Gemici, A. (2011): "Family Migration and Labour Market Outcomes", mimeo,
NYU.
[17] Gourieroux, C., A. Monfort, and E. Renault (1993): "Indirect inference", Journal
of Applied Econometrics, 8, 85-118.
[18] Greenwood, M. J. (1975): "Research on internal migration in the United States: A
survey", Journal of Economic Literature, 13, 397-433.
[19] Greenwood, M. J. (1985): "Human migration: Theory, models and empirical stud-
ies", Journal of Regional Science, 25, 521-544.
[20] Greenwood, M. J. (1997): "Internal migration in developed countries", in: Rosen-
zweig, M. R. and O. Stark (eds.), Handbook of population and family economics,
Volume 1B, Elsevier, Amsterdam.
[21] Gross, E. (2005): "Internal Revenue Service Area-To-Area Migration Data:
Strengths, Limitations, and Current Trends", American Statistical Association 2005
Conference Paper.
[22] Hassler, J., J. V. R. Mora, K. Storesletten, and F. Zilibotti (2005): "A Positive
Theory Of Geographic Mobility And Social Insurance", International Economic
Review, 46, 263-303.
[23] Heckman, J. J. (1974): "Shadow Prices, Market Wages, and Labor Supply", Econo-
metrica, 42, 679-94.
[24] Heckman, J. J. (1976): "The common structure of statistical models of truncation,
sample selection, and limited dependent variables and a simple estimator for such
models", Annals of Economic and Social Measurement, 5, 475-492.
[25] Heckman, J. J. (1978): "Dummy Endogenous Variables in a Simultaneous Equations
System", Econometrica, 46, 931-59.
[26] Heckman, J. J. and R. Robb (1985): "Alternative Methods for Evaluating the Impact
of Interventions", Journal of Econometrics, 30, 239-267.
[27] Hunt, G. L. and R. E. Mueller (2004): "North American Migration: Returns to
Skill, Border Effects and Mobility Costs", Review of Economics and Statistics, 86,
988-1007.
[28] Jacob, M. and F. Weiss (2010): "From Higher Education to Work", Higher Educa-
tion, 60, 529-542.
[29] Jovanovic, B. (1979): "Job Matching and the Theory of Turnover", Journal of
Political Economy, 87, 972-990.
[30] Keane, M. P. and K. I. Wolpin (2009): "Empirical Applications of Discrete Choice
Dynamic Programming Models", Review of Economic Dynamics, 12, 1-22.
[31] Kennan, J. and J. R. Walker (2011): "The Effect of Expected Income on Individual
Migration Decisions", Econometrica, 79, 211-251.
[32] Lee, L. F. (1978): "Unionism and Wage Rates: A Simultaneous Equation Model with
Qualitative and Limited Dependent Variables", International Economic Review, 19,
415-33.
[33] Lee, L. F. (1979): "Identification and Estimation in Binary Choice Models with
Limited (Censored) Dependent Variables", Econometrica, 47, 977-96.
[34] Levin, A., C. F. Lin, and C. S. J. Chu (2002): "Unit Root Tests in Panel Data:
Asymptotic and Finite Sample Properties", Journal of Econometrics, 108, 1-24.
[35] Low, H., C. Meghir, and L. Pistaferri (2010): "Wage Risk and Employment Risk
over the Life Cycle", American Economic Review, 100, 1432-67.
[36] Nakosteen, R. A. and M. Zimmer (1980): "Migration and Income; The Question of
Self-Selection", Southern Economic Journal, 46, 840-851.
[37] Norets, A. (2008): "Implementation of Bayesian Inference in Dynamic Discrete
Choice Models", mimeo, Princeton University.
[38] Norets, A. (2009): "Inference in Dynamic Discrete Choice Models with Serially
Correlated Unobserved State Variables", Econometrica, 77, 1665-1682.
[39] Shimer, R. (2005): "The Cyclical Behavior of Equilibrium Unemployment and Va-
cancies", American Economic Review, 95, 25-49.
[40] Sjaastad, L. (1962): "The costs and returns of human migration", Journal of Polit-
ical Economy, 70, 80-93.
[41] Storesletten, K., C. I. Telmer, and A. Yaron (2004): "Cyclical Dynamics in Idio-
syncratic Labor-Market Risk", Journal of Political Economy, 112, 695-717.
[42] Sweeting, A. (2007): "Dynamic Product Repositioning in Differentiated Product
Industries: The Case of Format Switching in the Commercial Radio Industry",
NBER-WP 13522.
[43] Tauchen, G. (1986): "Finite state Markov-chain approximation to univariate and
vector autoregressions", Economics Letters, 20, 177-181.
[44] Tunali, I. (2000): "Rationality of Migration", International Economic Review, 41,
893-920.
Appendix
A Derivation of equation (4)
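For the covariance between the error term $\eta_{i1}$ and location in period 0, $y_{iA0}$ (as described in Section 2, eq. (4)), we obtain:
$$
\begin{aligned}
\operatorname{cov}(y_{iA0}, \eta_{i1}) &= E(y_{iA0}\eta_{i1}) - E(\eta_{i1})\,E(y_{iA0}) \\
&= E(\eta_{i1} \mid y_{iA0} = 1)\Pr(y_{iA0} = 1) - E(\eta_{i1})\Pr(y_{iA0} = 1).
\end{aligned}
$$
Since $E(\eta_{i1})$ is by assumption zero, this simplifies to
$$
\operatorname{cov}(y_{iA0}, \eta_{i1}) = E(\eta_{i1} \mid y_{iA0} = 1)\Pr(y_{iA0} = 1).
$$
Using the definition of
$$
\eta_{i1} := \gamma\,(w^*_{iA1} - w^*_{iB1}) + \nu_{i1}
$$
and
$$
w^*_{ij1} = \rho\, w^*_{ij0} + \varepsilon_{ij1}
$$
we can rewrite $\operatorname{cov}(y_{iA0}, \eta_{i1})$ as
$$
\begin{aligned}
\operatorname{cov}(y_{iA0}, \eta_{i1}) &= E\big(\gamma\,(\rho\,(w^*_{iA0} - w^*_{iB0}) + \varepsilon_{iA1} - \varepsilon_{iB1}) + \nu_{i1} \mid w_{iA0} > w_{iB0}\big)\Pr(w_{iA0} > w_{iB0}) \\
&= \gamma\rho\, E\big((w^*_{iA0} - w^*_{iB0}) \mid w_{iA0} > w_{iB0}\big)\Pr(w_{iA0} > w_{iB0}),
\end{aligned}
$$
where the second equality follows from $\varepsilon_{ijt}$ and $\nu_{it}$ being orthogonal to all information available at time $t-1$ and mean zero. Making use of the definition of $w_{ijt} := \zeta_{jt} z_{it} + w^*_{ijt}$ and the normality of $w^*_{ijt}$ we obtain
$$
\begin{aligned}
\operatorname{cov}(y_{iA0}, \eta_{i1}) &= \gamma\rho\, E\big((w^*_{iA0} - w^*_{iB0}) \mid (w^*_{iA0} - w^*_{iB0}) > -(\zeta_{A1} - \zeta_{B1}) z_{i1}\big)\Pr(w_{iA0} > w_{iB0}) \\
&= 2\gamma\rho\sigma_0\, \frac{\phi\!\left(\frac{-(\zeta_{A1} - \zeta_{B1}) z_{i1}}{2\sigma_0}\right)}{1 - \Phi\!\left(\frac{-(\zeta_{A1} - \zeta_{B1}) z_{i1}}{2\sigma_0}\right)}\, \Pr\big((w^*_{iA0} - w^*_{iB0}) > -(\zeta_{A1} - \zeta_{B1}) z_{i1}\big) \\
&= 2\gamma\rho\sigma_0\, \phi\!\left(\frac{(\zeta_{A1} - \zeta_{B1}) z_{i1}}{2\sigma_0}\right),
\end{aligned}
$$
where $\phi$ and $\Phi$ are the probability density function and cumulative distribution function of a standard normal distribution, respectively.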
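The last two equalities rest on a standard truncated-normal identity: for $X \sim N(0, s^2)$, $E(X \mid X > a)\Pr(X > a) = s\,\phi(a/s)$, combined with the symmetry of $\phi$. The following is a minimal numerical check of that identity, not part of the derivation itself. The values of `sigma0` and `a` are hypothetical, and `s = 2 * sigma0` simply mirrors the scale that appears in the closed form above.

```python
import math

def phi(x):
    """Standard normal pdf."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def partial_mean(a, s, upper=12.0, steps=200_000):
    """E[X * 1{X > a}] for X ~ N(0, s^2), midpoint rule on [a, upper * s]."""
    hi = upper * s
    h = (hi - a) / steps
    total = 0.0
    for m in range(steps):
        x = a + (m + 0.5) * h
        total += x * phi(x / s) / s
    return total * h

sigma0 = 0.3        # hypothetical value
s = 2.0 * sigma0    # scale appearing in the closed form of Appendix A
a = -0.25           # hypothetical threshold, stands in for -(zeta_A1 - zeta_B1) * z_i1

numeric = partial_mean(a, s)
mills = s * phi(a / s) / (1.0 - Phi(a / s))   # E[X | X > a]
closed = mills * (1.0 - Phi(a / s))           # equals s * phi(a / s)
symmetric = s * phi(-a / s)                   # same value, by symmetry of phi
print(numeric, closed, symmetric)
```

The three printed numbers should agree up to the quadrature error of the midpoint rule.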
B Existence and uniqueness of the value function
We begin with proving existence and uniqueness of the value function. Notation is as in
the main text throughout this appendix, unless stated otherwise.
For the ease of exposition, we assume that the income process is only approximately
log-normal. In particular, we assume that income has a finite support.
Definition 1 Let $W = [\underline{W}, \overline{W}]$ be the support of $w$.
Definition 2 Define a mapping $T$ according to the migration problem of a household, that is
$$
T(u)(k, w_{iAt}, w_{iBt}) = \max_{j=A,B}\left\{ \exp(w_{ijt}) - I_{\{k \neq j\}}\, c + \beta E_t\, u(j, w_{iAt+1}, w_{iBt+1}) \right\}. \tag{15}
$$
The mapping $T$ is defined on the set of all real-valued, bounded functions $\mathcal{B}$ that are continuous with respect to $w_{A,B}$ and have domain $D = \{A, B\} \times W^2$.
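Lemma 3 The mapping $T$ preserves boundedness.
Proof. To show that $T$ preserves boundedness one has to show that for any bounded function $u$ also $Tu$ is bounded. Consider $u$ to be bounded from above by $\overline{u}$ and bounded from below by $\underline{u}$. Then, $Tu$ is bounded, because
$$
Tu = \max_{j=A,B}\left\{ \exp(w_{ijt}) - I_{\{k \neq j\}}\, c + \beta E_t\, u(j, w_{iAt+1}, w_{iBt+1}) \right\} \leq \exp(\overline{W}) + \beta \overline{u} < \infty, \tag{16}
$$
and
\begin{align}
Tu &= \max_{j=A,B}\left\{ \exp(w_{ijt}) - I_{\{k \neq j\}}\, c + \beta E_t\, u(j, w_{iAt+1}, w_{iBt+1}) \right\} \tag{17} \\
&\geq \max_{j=A,B}\left\{ \exp(w_{ijt}) - I_{\{k \neq j\}}\, c + \beta \underline{u} \right\} \geq \exp(\underline{W}) + \beta \underline{u} > -\infty. \tag{18}
\end{align}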
Lemma 4 The mapping $T$ preserves continuity.
Proof. Since $Tu$ is the maximum of two continuous functions, it is itself continuous.
Lemma 5 The mapping $T$ satisfies Blackwell's conditions.
Proof. First, we need to show that for any $u_1(\cdot) < u_2(\cdot)$ the mapping $T$ preserves the inequality. Since both the expectations operator and the max operator preserve the inequality, $T$ does also. Second, we need to show that $T(u + a) \leq Tu + \gamma a$ for any constant $a$ and some $\gamma < 1$. Straightforward algebra shows that
$$
T(u + a) = Tu + \beta a. \tag{19}
$$
Since $\beta < 1$ by assumption, $T$ satisfies Blackwell's conditions.
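Proposition 6 The mapping $T$ has a unique fixed point on $\mathcal{B}$, and hence the Bellman equation has a unique solution.
Proof. Follows from the last three Lemmas: $T$ maps the space of bounded continuous functions, which is complete under the sup norm, into itself, and by Blackwell's conditions $T$ is a contraction with modulus $\beta$, so the contraction mapping theorem applies.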
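The properties used in the proof can be illustrated numerically on a small discretized version of the operator in (15). The sketch below is only an illustration under strong simplifying assumptions: the five-point income grid, the transition matrix `P`, the independence of the two income chains, and the values of `BETA` and `COST` are all hypothetical and are not taken from the paper.

```python
import math
import random

BETA, COST = 0.95, 0.5                 # hypothetical discount factor and migration cost
GRID = [-0.4, -0.2, 0.0, 0.2, 0.4]     # hypothetical finite support for log income
N = len(GRID)
# Hypothetical transition matrix: stay with prob. 0.6, else move uniformly.
P = [[0.6 if a == b else 0.4 / (N - 1) for b in range(N)] for a in range(N)]

def expect(u, j, a, b):
    """E_t u(j, w_A', w_B') with independent income chains in both regions."""
    return sum(P[a][a2] * P[b][b2] * u[j][a2][b2]
               for a2 in range(N) for b2 in range(N))

def T(u):
    """Bellman operator of eq. (15); u[k][a][b] with k = 0 (region A) or 1 (region B)."""
    out = [[[0.0] * N for _ in range(N)] for _ in range(2)]
    for k in range(2):
        for a in range(N):
            for b in range(N):
                w = (GRID[a], GRID[b])
                out[k][a][b] = max(
                    math.exp(w[j]) - (COST if j != k else 0.0) + BETA * expect(u, j, a, b)
                    for j in range(2))
    return out

def sup_dist(u, v):
    """Sup-norm distance between two functions on the discrete domain."""
    return max(abs(u[k][a][b] - v[k][a][b])
               for k in range(2) for a in range(N) for b in range(N))

def shift(u, c):
    """Add the constant c to the function u."""
    return [[[x + c for x in row] for row in plane] for plane in u]

random.seed(0)
u1 = [[[random.uniform(-5, 5) for _ in range(N)] for _ in range(N)] for _ in range(2)]
u2 = [[[random.uniform(-5, 5) for _ in range(N)] for _ in range(N)] for _ in range(2)]

# Contraction with modulus BETA, and the discounting property of eq. (19).
contraction_ok = sup_dist(T(u1), T(u2)) <= BETA * sup_dist(u1, u2) + 1e-12
discounting_gap = sup_dist(T(shift(u1, 2.0)), shift(T(u1), BETA * 2.0))

# Iterating T from the zero function approaches the unique fixed point.
v = [[[0.0] * N for _ in range(N)] for _ in range(2)]
for _ in range(600):
    v = T(v)
residual = sup_dist(T(v), v)
print(contraction_ok, discounting_gap, residual)
```

The same iteration started from any other bounded function would approach the same fixed point, which is the content of Proposition 6.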
C Effect of income shocks on the distribution of income
Idiosyncratic shocks, aggregate shocks, death and birth of agents, and the persistence of the income process determine the transition of the distribution of income incentives after migration to the distribution of migration incentives before migration in the next period. For surviving households the income distribution at the beginning of period $t+1$ results from adding idiosyncratic and aggregate shocks to the distribution of income after migration in period $t$, $\tilde{F}_t$, of which $\tilde{f}_{jt}(w_A, w_B)$ is the conditional density, see (12). This means that for a surviving household an income of $w_{ijt+1}$ in period $t+1$ can result from any possible combination of $w_{ijt}$ and $\xi_{ijt+1} = \theta_{jt+1} + \varepsilon_{ijt+1}$ for which
$$
w_{ijt+1} = \mu_j (1 - \rho) + \rho\, w_{ijt} + \theta_{jt+1} + \varepsilon_{ijt+1} \tag{20}
$$
holds. Solving this equation for $w_{ijt}$, we obtain
$$
w^*_j(w_{ijt+1}, \theta_{jt+1}, \varepsilon_{ijt+1}) := w_{ijt} = \frac{w_{ijt+1} - (\theta_{jt+1} + \varepsilon_{ijt+1})}{\rho} - \mu_j \frac{(1 - \rho)}{\rho}. \tag{21}
$$
This $w^*_j(w_{ijt+1}, \theta_{jt+1}, \varepsilon_{ijt+1})$ is the time-$t$ potential income in region $j$ that is consistent with a future potential income of $w_{ijt+1}$ and realizations of shocks $\theta_{jt+1} + \varepsilon_{ijt+1}$ at the beginning of period $t+1$. Now suppose that both kinds of shocks, $\theta$ and $\varepsilon$, have been realized. Then, $w^*_{A,B}$ is a one-to-one mapping of future incomes $(w_{iAt+1}, w_{iBt+1})$ to current income $(w_{iAt}, w_{iBt})$.
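The inversion from (20) to (21) can be checked with a short round trip. This is only a sketch: the parameter values and shock realizations below are hypothetical and serve purely as an example.

```python
def next_income(w, mu, rho, theta, eps):
    """Eq. (20): income in t+1 given income w in t and the two shocks."""
    return mu * (1.0 - rho) + rho * w + theta + eps

def past_income(w_next, mu, rho, theta, eps):
    """Eq. (21): the time-t income consistent with w_next and the shocks."""
    return (w_next - (theta + eps)) / rho - mu * (1.0 - rho) / rho

mu, rho = 10.0, 0.95        # hypothetical parameter values
theta, eps = 0.01, -0.2     # hypothetical shock realizations
w_t = 10.3                  # hypothetical time-t (log) income

w_next = next_income(w_t, mu, rho, theta, eps)
recovered = past_income(w_next, mu, rho, theta, eps)
print(w_next, recovered)
```

For given shocks, each future income maps back to exactly one current income, which is the one-to-one property used in the retrospective argument.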
The conditional density of observing the future income pair $(w_{iAt+1}, w_{iBt+1})$ can thus be obtained by looking backward. The income pair $(w^*_A, w^*_B)$ of past incomes for a surviving household corresponds uniquely to a future income pair $(w_{iAt+1}, w_{iBt+1})$. Consequently, we can express the density of the income distribution at time $t+1$ (for surviving households) using the income distribution after migration $\tilde{F}_t$ and its conditional density $\tilde{f}_{jt}$. The density of the income distribution $\hat{F}_{t+1}$ conditional on surviving and the region and the vector of shocks is given by
$$
\hat{f}_{jt+1}(w_A, w_B \mid \theta_{At+1}, \theta_{Bt+1}, \varepsilon_{iAt+1}, \varepsilon_{iBt+1}) = \tilde{f}_{jt}\big(w^*_A(w_A, \theta_{At+1}, \varepsilon_{iAt+1}),\, w^*_B(w_B, \theta_{Bt+1}, \varepsilon_{iBt+1})\big). \tag{22}
$$
Weighting this density with the density of the idiosyncratic shocks $h(\varepsilon_{iAt+1}, \varepsilon_{iBt+1})$ yields the density of observing the future income pair $(w^*_A, w^*_B)$ together with the idiosyncratic shocks $(\varepsilon_{iAt+1}, \varepsilon_{iBt+1})$:
$$
\tilde{f}_{jt}\big(w^*_A(w_A, \theta_{At+1}, \varepsilon_{iAt+1}),\, w^*_B(w_B, \theta_{Bt+1}, \varepsilon_{iBt+1})\big) \cdot h(\varepsilon_{iAt+1}, \varepsilon_{iBt+1}).
$$
Integrating over all possible idiosyncratic shocks $(\varepsilon_{iAt+1}, \varepsilon_{iBt+1})$ yields the density $\hat{f}_{jt+1}$ of the income distribution before migration and conditional on surviving in period $t+1$ for a certain combination of aggregate shocks $(\theta_{At+1}, \theta_{Bt+1})$:
$$
\hat{f}_{jt+1}(w_A, w_B \mid \theta_{At+1}, \theta_{Bt+1}) = \int\!\!\int \tilde{f}_{jt}\big(w^*_A(w_A, \theta_{At+1}, \varepsilon_A),\, w^*_B(w_B, \theta_{Bt+1}, \varepsilon_B)\big) \cdot h(\varepsilon_A, \varepsilon_B)\, d\varepsilon_A\, d\varepsilon_B, \quad j = A, B. \tag{23}
$$
Finally, the actual conditional distribution of potential incomes $F_{jt+1}$ and its density $f_{jt+1}$ is determined by a convex combination of $\hat{f}_{jt+1}$ (for surviving households) and the distribution of income shocks (for newborn households)
$$
f_{jt+1}(w_A, w_B \mid \theta_{At+1}, \theta_{Bt+1}, z_{At+1}, z_{Bt+1}) = (1 - \delta)\, \hat{f}_{jt+1}(w_A, w_B \mid \theta_{At+1}, \theta_{Bt+1}) + \delta\, h(\varepsilon_A - z_{At+1}, \varepsilon_B - z_{Bt+1}).
$$
For given aggregate states and shocks, this new distribution determines migration from region $j$ to region $-j$ according to equation (10) for period $t+1$.
The evolution of income distributions can thus be summarized as follows. Between two consecutive periods, the conditional distribution of potential incomes first evolves as a result of migration decisions, moving the density from $f_{jt}$ to $\tilde{f}_{jt}$. Thereafter, the distribution is altered by aggregate and idiosyncratic shocks to income, moving the density from $\tilde{f}_{jt}$ to $\hat{f}_{jt+1}$. Finally, a fraction of households dies and for this fraction the distribution $\hat{f}_{jt+1}$ is replaced by the distribution of income shocks. This leads to the new distribution $f_{jt+1}$, which determines migration decisions in period $t+1$, starting the cycle over again.
D Invariant distribution
We prove that migration decisions and idiosyncratic shocks to income imply that potential income follows an ergodic Markov process if there are no aggregate shocks. Therefore, there is an invariant distribution to which the sequence of income distributions converges. For simplicity, we present the proof for an arbitrary discrete approximation of the model with a continuous state space for income.
Lemma 7 Assume an arbitrary discretization of the state space with $n$ points for the potential income in each of the regions. Then, we can capture the transition from $f_t$ to $f_{t+1}$, which are the unconditional densities of the distribution of households over both regions and potential incomes, in a matrix
$$
\Gamma = \begin{pmatrix} \Pi (I - D_A) & \Pi D_B \\ \Pi D_A & \Pi (I - D_B) \end{pmatrix} \in \mathbb{R}^{2n^2 \times 2n^2}
$$
(see footnote 21). In this matrix, $\Pi$ denotes the transition matrix that approximates the AR(1)-process for income (including birth and death) by a Markov chain, see Adda and Cooper (2003, pp. 56) for details. Matrix $D_j$, $j = A, B$, is the $n^2 \times n^2$ diagonal matrix with the migration hazard rates for each of the $n^2$ income pairs of the income grid.
Proof. First, we take a discrete state space of $n$ possible incomes for each region, $w_{A1}, \dots, w_{An}$ and $w_{B1}, \dots, w_{Bn}$. Second, we denote the vector of probabilities that describes the distribution of potential incomes and household locations in the following form
$$
f = \big( f(A, w_{A1}, w_{B1}) \;\dots\; f(A, w_{An}, w_{B1}) \;\dots\; f(A, w_{An}, w_{Bn}) \;\; f(B, w_{A1}, w_{B1}) \;\dots\; f(B, w_{An}, w_{Bn}) \big)'. \tag{24}
$$
Analogously, we define the distribution after migration but before idiosyncratic shocks, $\tilde{f}$. Taking our law of motion from (23), we obtain as a discretized analog
$$
f_{t+1} = \begin{pmatrix} \Pi & 0 \\ 0 & \Pi \end{pmatrix} \tilde{f}_t. \tag{25}
$$
Now, define $d_h \in \{0, 1\}$ as the fraction of households that migrate and are in the $h$-th income and location triple given our vectorization of the income grid. This means that $d_h = \Lambda_j(w_{Ak}, w_{Bl})$, $h = 1, \dots, 2n^2$, where $(j, w_{Ak}, w_{Bl})$ is the $h$-th element in the vectorized grid. Moreover, define $D = \operatorname{diag}(d)$ as the diagonal matrix with migration rates on the diagonal and $D_A$ and $D_B$ as the diagonal matrices with only the first $n^2$ and the last $n^2$ elements of $d$, respectively. Then, we can describe the transition from $f_t$ to $\tilde{f}_t$, the distribution after migration but before idiosyncratic shocks, by
$$
\tilde{f}_t = \begin{pmatrix} I - D_A & D_B \\ D_A & I - D_B \end{pmatrix} f_t. \tag{26}
$$
Combining the last two equations, we obtain
$$
f_{t+1} = \begin{pmatrix} \Pi (I - D_A) & \Pi D_B \\ \Pi D_A & \Pi (I - D_B) \end{pmatrix} f_t. \tag{27}
$$

Footnote 21: Since we work with a discretization, strictly speaking $f$ is not a density, but a vector of probabilities for drawing a location-income possibility vector from a given element of the grid.
Lemma 8 For any distribution of idiosyncratic shocks with support equal to $W^2$, matrix $\Pi$ has only strictly positive entries.
Proof. If the idiosyncratic shocks have support equal to $W^2$, then every pair of potential incomes can be reached from every other pair of incomes as a result of the shock, because we assume the shocks to income to be approximately log-normal. Thus, all entries of $\Pi$ are strictly positive.
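Lemma 9 $\Gamma^2$ has only positive entries.
Proof. Due to $c_A = -c_B$, we can assume an ordering of states such that we can write
$$
D_A = \begin{pmatrix} I_{n_a} & 0 \\ 0 & 0 \end{pmatrix} \quad \text{and} \quad D_B = \begin{pmatrix} 0 & 0 \\ 0 & I_{n_b} \end{pmatrix},
$$
without loss of generality, where $I_z$ is a $z \times z$ identity matrix. Accordingly, we define partitions of $\Pi$ such that
$$
\Pi = \begin{pmatrix} A_1 & A_2 \\ A_3 & A_4 \end{pmatrix} = \begin{pmatrix} B_1 & B_2 \\ B_3 & B_4 \end{pmatrix} = \begin{pmatrix} C_1 & C_2 \\ C_3 & C_4 \end{pmatrix} = \begin{pmatrix} D_1 & D_2 \\ D_3 & D_4 \end{pmatrix},
$$
where $A_1 \in \mathbb{R}^{(n^2 - n_a) \times (n^2 - n_a)}$, $B_1 \in \mathbb{R}^{n_b \times n_b}$, $C_1 \in \mathbb{R}^{n_a \times n_a}$, $D_1 \in \mathbb{R}^{(n^2 - n_b) \times (n^2 - n_b)}$. This yields for $\Gamma^2$ after some tedious algebra
$$
\Gamma^2 = \begin{pmatrix}
B_2 C_3 & A_2 A_4 & B_2 C_4 & A_2 B_4 \\
B_4 C_3 & A_4 A_4 & B_4 C_4 & A_4 B_4 \\
D_1 C_1 & C_1 A_2 & D_1 D_1 & C_1 B_2 \\
D_3 C_1 & C_3 A_2 & D_3 D_1 & C_3 B_2
\end{pmatrix}.
$$
Each entry of this matrix is positive, since $\Pi$ and hence its partitions are positive.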
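Proposition 10 Under the assumptions of the above Lemmas, migration and idiosyncratic shocks define an ergodic process with a stationary distribution $F_0 = \lim_{n \to \infty} B^n e_i$.
Proof. The above Lemma directly implies the ergodicity of the Markov chain: because $\Gamma^2$ has only strictly positive entries, the chain is irreducible and aperiodic, so it has a unique stationary distribution to which it converges from any initial state.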
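The construction in Lemmas 7 to 9 can be illustrated with a very small example. The sketch below builds the matrix of eq. (27) for four income pairs, checks that its square is strictly positive, and finds the stationary distribution by repeated multiplication. The transition matrix `Pi` and the migration indicators `d_A` and `d_B` are hypothetical choices made only for this illustration.

```python
def matmul(X, Y):
    """Plain matrix product for lists of lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

m = 4  # number of income pairs (n^2 with n = 2)
# Hypothetical strictly positive, column-stochastic income transition matrix Pi.
Pi = [[0.7 if i == j else 0.1 for j in range(m)] for i in range(m)]
d_A = [1, 0, 0, 0]   # assumed policy: households in A migrate only at income pair 0
d_B = [0, 0, 0, 1]   # assumed policy: households in B migrate only at income pair 3

def block(d, stay):
    """Pi (I - D) if stay is True, else Pi D, for D = diag(d)."""
    return [[Pi[i][j] * ((1 - d[j]) if stay else d[j]) for j in range(m)]
            for i in range(m)]

top = [ra + rb for ra, rb in zip(block(d_A, True), block(d_B, False))]
bottom = [ra + rb for ra, rb in zip(block(d_A, False), block(d_B, True))]
Gamma = top + bottom   # the 2m x 2m matrix of eq. (27)

col_sums = [sum(Gamma[i][j] for i in range(2 * m)) for j in range(2 * m)]
Gamma2 = matmul(Gamma, Gamma)
all_positive = all(x > 0 for row in Gamma2 for x in row)

# Repeated multiplication starting from a unit vector e_0.
f = [[1.0 if i == 0 else 0.0] for i in range(2 * m)]
for _ in range(1000):
    f = matmul(Gamma, f)
gap = max(abs(a[0] - b[0]) for a, b in zip(matmul(Gamma, f), f))
print(all_positive, sum(x[0] for x in f), gap)
```

Starting the iteration from a different unit vector would lead to the same limit, which is what ergodicity means here.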
E Numerical aspects
The first step in solving the model numerically is to obtain a solution to (7). We do so by value-function iteration (see footnote 22). For this value-function iteration, we first approximate the bivariate process of potential incomes for an individual agent in regions A and B
$$
w_{ijt} = \mu_j (1 - \rho) + \rho\, w_{ijt-1} + \xi_{ijt} \tag{28}
$$
by a Markov chain. Because $w_A$ and $w_B$ are correlated through the correlation structure in $\xi$, it is easier to work with the orthogonal components $(w^+_A, w^+_B)$ of $(w_A, w_B)$ in the value function iteration.
We evaluate the value function on an equi-spaced grid for the orthogonal components with a width of $\pm 3.5\, \sigma^+_{A,B}$ around their means, where $\sigma^+_{A,B}$ denote the long-run standard deviations of the orthogonal components. The grid is chosen to capture almost all movements of the income distribution $F$ later on (see footnote 23). Given this grid, we use Tauchen's (1986) algorithm to obtain the transition probabilities for the Markov-chain approximation of the income process in (28).
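For a single component of the income process, the discretization step can be sketched as follows. This is a univariate illustration of Tauchen's (1986) method on a grid of $\pm 3.5$ long-run standard deviations, not the bivariate implementation used for the paper. The values `rho=0.952` and `sigma_shock=0.172` are borrowed from the baseline column of Table 8, while `mu=0.0` and the grid size of 16 are assumptions made for the example.

```python
import math

def norm_cdf(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def tauchen(n, mu, rho, sigma_shock, width=3.5):
    """Tauchen (1986) for w' = mu*(1-rho) + rho*w + xi, xi ~ N(0, sigma_shock^2).

    Returns an equi-spaced grid of n points covering mu +/- width long-run
    standard deviations and the n x n row-stochastic matrix P with
    P[i][j] = Pr(w' = grid[j] | w = grid[i]).
    """
    sigma_lr = sigma_shock / math.sqrt(1.0 - rho ** 2)   # long-run standard deviation
    lo = mu - width * sigma_lr
    step = 2.0 * width * sigma_lr / (n - 1)
    grid = [lo + i * step for i in range(n)]
    P = []
    for w in grid:
        mean = mu * (1.0 - rho) + rho * w                # conditional mean of w'
        row = []
        for j, g in enumerate(grid):
            upper = (g + step / 2.0 - mean) / sigma_shock
            lower = (g - step / 2.0 - mean) / sigma_shock
            if j == 0:
                row.append(norm_cdf(upper))              # all mass below the first midpoint
            elif j == n - 1:
                row.append(1.0 - norm_cdf(lower))        # all mass above the last midpoint
            else:
                row.append(norm_cdf(upper) - norm_cdf(lower))
        P.append(row)
    return grid, P

grid, P = tauchen(n=16, mu=0.0, rho=0.952, sigma_shock=0.172)
row_sums = [sum(row) for row in P]
print(grid[0], grid[-1], min(row_sums), max(row_sums))
```

Each row of `P` sums to one by construction, because the interval probabilities telescope across grid points.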
We apply a multigrid algorithm (see Chow and Tsitsiklis, 1991) to speed up the
calculation of the value function. This algorithm works iteratively. It first solves the
dynamic programming problem for a coarse grid and then doubles the number of grid
points in each iteration until the grid is fine enough. In between iterations the solution
for the coarser grid is used to generate the initial guess for the value-function iteration
on the new grid by spline interpolation. The initial grid has 16 × 16 points (income A × income B) and the final grid has 128 × 128 points.
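The coarse-to-fine idea can be sketched on a one-dimensional toy problem. The example below is not the paper's migration problem: the operator `bellman`, its parameters, and the mean-reverting state are hypothetical, and piecewise-linear interpolation is used in place of spline interpolation to keep the sketch dependency-free. It only illustrates how the solution on a coarse grid serves as the initial guess on the next finer grid.

```python
def interp(grid, values, x):
    """Piecewise-linear interpolation on an equi-spaced grid (x inside the grid range)."""
    step = grid[1] - grid[0]
    pos = (x - grid[0]) / step
    i = min(int(pos), len(grid) - 2)
    t = pos - i
    return (1.0 - t) * values[i] + t * values[i + 1]

def make_grid(n, lo=0.0, hi=2.0):
    """Equi-spaced grid with n points on [lo, hi]."""
    return [lo + i * (hi - lo) / (n - 1) for i in range(n)]

BETA, RHO, MEAN, OUTSIDE = 0.9, 0.8, 1.0, 8.0   # hypothetical toy parameters

def bellman(grid, v):
    """Toy operator: collect x and continue from RHO*x + (1-RHO)*MEAN, or take OUTSIDE."""
    return [max(x + BETA * interp(grid, v, RHO * x + (1.0 - RHO) * MEAN), OUTSIDE)
            for x in grid]

def solve(grid, v, tol=1e-8):
    """Value-function iteration from the initial guess v; returns (solution, iterations)."""
    iterations = 0
    while True:
        new = bellman(grid, v)
        iterations += 1
        if max(abs(a - b) for a, b in zip(new, v)) < tol:
            return new, iterations
        v = new

# Coarse-to-fine: start with 16 points, then double until 128 points are reached.
n = 16
grid = make_grid(n)
v, _ = solve(grid, [0.0] * n)
while n < 128:
    n *= 2
    finer = make_grid(n)
    guess = [interp(grid, v, x) for x in finer]   # warm start from the coarser solution
    grid = finer
    v, warm_iters = solve(grid, guess)

_, cold_iters = solve(grid, [0.0] * len(grid))    # same fine grid, started from zero
residual = max(abs(a - b) for a, b in zip(bellman(grid, v), v))
print(len(v), warm_iters, cold_iters, residual)
```

Comparing `warm_iters` with `cold_iters` shows how many iterations on the finest grid the warm start saves in this toy setting.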
The solution of (7) yields the optimal migration policy and thus the microeconomic migration hazard rates $\Lambda_j$. With these hazard rates, we can obtain a series of aggregate migration rates for a simulated economy as described in detail in Section 4.2 for any realization of aggregate shocks $(\theta_{jt})_{j=A,B}^{t=1 \dots T}$ and an initial distribution $F_0$.

Footnote 22: See for example Adda and Cooper (2003) for an overview of dynamic programming techniques.
Footnote 23: The choice of $\pm 3.5\, \sigma^+_{A,B}$ is motivated as follows. We obtain in the estimation that about 99% of the income shocks is due to the idiosyncratic component. Therefore, we can expect 99.9% of the mass of the income distribution to fall within $\pm 3.29 \cdot \sqrt{0.99}\, \sigma^+_{A,B} \cong \pm 3.27\, \sigma^+_{A,B}$ around the mean of the distribution for any given year. Additionally, the mean income for each year moves within the band $\pm 3.29 \cdot \sqrt{0.01}\, \sigma^+_{A,B} \cong \pm 0.33\, \sigma^+_{A,B}$ in again 99.9% of all years. Since the sum of both components is $\pm 3.6\, \sigma^+_{A,B}$, a grid variation of $\pm 3.5\, \sigma^+_{A,B}$ should not truncate the income distribution.
This means that we need an initial distribution of income $F_0$ to solve the sequential problem. Following Caballero and Engel's (1999) suggestion, we use the ergodic distribution of income $F$ that would be obtained in the absence of aggregate income shocks (see footnote 24).
To simulate a series of migration rates that correspond to the aggregate migration hazards $(\Lambda_{At,Bt})_{t=1 \dots T}$, we draw a series of aggregate shocks (to the orthogonal basis) $(\theta^+_{At}, \theta^+_{Bt})_{t=1 \dots T}$ from a normal distribution with variance $\phi \cdot (\sigma^+_{A,B})^2$, $\phi \in [0, 1]$. The weight $\phi$ measures the relative importance of aggregate shocks, relative to idiosyncratic shocks, i.e. $\phi = \frac{\sigma^2_\theta}{\sigma^2_\theta + \sigma^2_\varepsilon}$. Correspondingly, the orthogonal components of the idiosyncratic shocks have variance $(1 - \phi) \cdot (\sigma^+_{A,B})^2$.
As we did to approximate the microeconomic income process for value function iteration, we also discretize the distribution of migration incentives over the chosen grid of income to simulate its evolution. Accordingly, we replace the conditional density in (10) by discrete probabilities. This means that for grid points $(x_{Ak}, x_{Bl})$, $k, l = 1, \dots, 64$ ($k, l$ being the index of grid points), with a distance of $2h$ in between points, we calculate the probabilities initially (for $t = 0$ and before the first aggregate shock) as
$$
p_{k,l,0} = \int_{x_{A,k} - h}^{x_{A,k} + h} \int_{x_{B,l} - h}^{x_{B,l} + h} f_0(x_1, x_2)\, dx_1\, dx_2.
$$
An aggregate shock $\theta_j$ in $t = 0$ now implies that the off-grid pair $(x_{A,k} + \theta_A, x_{B,l} + \theta_B)$ occurs with probability $p_{k,l,0}$ after the aggregate shock. To re-obtain on-grid probabilities, we use spline interpolation methods to find the on-grid probability after aggregate but before idiosyncratic shocks, $\tilde{p}_{k,l,1}$, restricting $\tilde{p}$ to take values between 0 and 1. That is, for each $t$ we define a function $\tau_t$ with $\tau_t(x_{A,k} + \theta_{A,t+1}, x_{B,l} + \theta_{B,t+1}) := p_{k,l,t}$ and obtain $\tilde{p}_{k,l,t+1}$ as
$$
\tilde{p}_{k,l,t+1} = \hat{\tau}_t(x_{A,k}, x_{B,l}),
$$
where $\hat{\tau}_t$ is the interpolation of $\tau_t$. Idiosyncratic shocks are accounted for by multiplying the after-aggregate-shock, on-grid income probabilities with the transition probability matrix obtained from Tauchen's algorithm, which yields $p_{k,l,t+1}$, the probability to fall in the $(k, l)$-cluster after idiosyncratic shocks. The effect of migration on the distribution of migration incentives is captured by using a discretized version of (12).
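The re-gridding step after an aggregate shock can be sketched in one dimension. The example below makes several assumptions that are not taken from the paper: it uses piecewise-linear instead of spline interpolation, assigns zero probability outside the shifted grid, renormalizes the result to sum to one, and starts from a discretized standard normal on a hypothetical 64-point grid.

```python
import math

def norm_cdf(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def shift_distribution(grid, p, theta):
    """Move probabilities p on an equi-spaced grid by an aggregate shock theta.

    tau is defined on the shifted (off-grid) points grid[k] + theta with values p[k];
    on-grid probabilities are read off a piecewise-linear interpolant of tau.
    """
    n = len(grid)
    step = grid[1] - grid[0]
    shifted = []
    for x in grid:
        pos = (x - theta - grid[0]) / step       # position of x among the shifted nodes
        if pos < -1e-9 or pos > n - 1 + 1e-9:
            value = 0.0                          # assumption: no mass outside the shifted grid
        else:
            pos = min(max(pos, 0.0), n - 1.0)
            i = min(int(pos), n - 2)
            t = pos - i
            value = (1.0 - t) * p[i] + t * p[i + 1]
        shifted.append(min(1.0, max(0.0, value)))    # restrict values to [0, 1]
    total = sum(shifted)
    return [v / total for v in shifted]          # assumption: renormalize to sum to one

# Hypothetical example: 64 grid points on [-3.5, 3.5], initial cell probabilities
# from a standard normal, integrated over cells of half-width h around each point.
n = 64
grid = [-3.5 + 7.0 * k / (n - 1) for k in range(n)]
h = (grid[1] - grid[0]) / 2.0
cells = [norm_cdf(x + h) - norm_cdf(x - h) for x in grid]
p0 = [c / sum(cells) for c in cells]

p1 = shift_distribution(grid, p0, theta=0.1)
mean0 = sum(x * q for x, q in zip(grid, p0))
mean1 = sum(x * q for x, q in zip(grid, p1))
print(mean0, mean1, sum(p1))
```

In this sketch the mean of the discretized distribution moves by approximately the size of the shock, apart from small boundary effects at the edges of the grid.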
Footnote 24: This distribution is calculated by assuming that idiosyncratic shocks ω have the full variance of ξ.
We calculate aggregate migration rates by directly simulating the evolution of the incentive distribution, rather than by a Monte Carlo method based on drawing a sample of agents, because the latter is not adequate in our case. We focus on aggregate behavior, but aggregate shocks turn out to be relatively small (being responsible for less than 1% of the total variation in income, see the discussion in Section 5.4). Hence, with a Monte Carlo approximation, sampling variation would most likely exceed the true aggregate variation of income.
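A back-of-the-envelope calculation illustrates the order of magnitude involved, here expressed for migration rates rather than income. It uses the baseline moments reported in Table 7 (an average migration rate of 0.037 and a standard deviation of aggregate migration rates of 0.002) and assumes, purely for illustration, that a Monte Carlo sample consists of independent Bernoulli draws of the migration decision.

```python
import math

avg_rate = 0.037    # average migration rate (Table 7, baseline)
agg_std = 0.002     # std. of aggregate migration rates (Table 7, baseline)

def sampling_std(n_agents, p=avg_rate):
    """Std. of a simulated migration rate from n_agents independent Bernoulli(p) draws."""
    return math.sqrt(p * (1.0 - p) / n_agents)

# Sample size at which pure sampling noise is as large as the aggregate variation.
n_equal = avg_rate * (1.0 - avg_rate) / agg_std ** 2

for n in (1_000, 10_000, 1_000_000):
    print(n, sampling_std(n), sampling_std(n) / agg_std)
print(n_equal)
```

Under these assumptions, a simulated sample needs several thousand agents just for sampling noise to fall to the size of the aggregate variation, and far more for it to become negligible.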
F Further robustness checks
Tables 7 and 8 provide the simulated moments and parameter estimates for further ro-
bustness checks. These are: (I) HP-filtering of individual state series instead of linear
detrending, (II) using a common linear trend instead of a state-specific one, (III) re-
placing average wage per job by personal income, or (IV) by disposable income, or (V)
using the average wage per job of the 5 nearest neighbor states instead of all 50 alter-
natives, and finally (VI) replacing the average cross-sectional standard deviation by the
autocorrelation of aggregate incomes in the set of moments to be matched. Overall the
estimated parameters do not change by much, with the exception of the inclusion of au-
tocorrelation in aggregate incomes where transitory shocks almost vanish and migration
costs triple.
Table 7: Robustness checks: moment estimates

Columns: (I) HP-filtered trend; (II) common linear trend; (III) personal income; (IV) disposable income; (V) income of nearest neighbors; (VI) including autocorrelation of incomes.

Moment                                      Baseline  (I)       (II)      (III)     (IV)      (V)       (VI)
Std. of migration rates, σ(m_ijt)           0.002     0.002     0.004     0.002     0.002     0.002     0.003
                                            (0.002)   (0.002)   (0.004)   (0.002)   (0.002)   (0.002)   (0.002)
Corr. of agg. incomes, σ(w_At, w_Bt)        0.589     0.69      0.454     0.527     0.483     0.553     0.475
                                            (0.620)   (0.724)   (0.432)   (0.555)   (0.504)   (0.574)   (0.620)
Std. of agg. incomes, σ(w_jt)               0.019     0.014     0.026     0.024     0.023     0.019     0.019
                                            (0.019)   (0.013)   (0.027)   (0.025)   (0.023)   (0.019)   (0.019)
Average migration rate, α0                  0.037     0.037     0.037     0.037     0.037     0.037     0.038
                                            (0.037)   (0.037)   (0.037)   (0.037)   (0.037)   (0.037)   (0.037)
Sensitivity to destination income, α1       0.049     0.063     0.039     0.029     0.028     0.043     0.074
                                            (0.045)   (0.041)   (0.052)   (0.044)   (0.036)   (0.040)   (0.045)
Sensitivity to source income, α2            -0.049    -0.063    -0.041    -0.03     -0.029    -0.043    -0.111
                                            (-0.053)  (-0.042)  (-0.058)  (-0.045)  (-0.052)  (-0.043)  (-0.053)
Cross-sectional std. of incomes, σ(w_ijt)   0.466     0.466     0.465     0.465     0.465     0.466     n/a
                                            (0.466)   (0.466)   (0.466)   (0.466)   (0.466)   (0.466)   n/a
Autocorrelation of aggregate incomes        n/a       n/a       n/a       n/a       n/a       n/a       0.681
                                            n/a       n/a       n/a       n/a       n/a       n/a       (0.744)

Actual data moments in brackets.
Table 8: Robustness checks: structural parameter estimates

Columns: (I) HP-filtered trend; (II) common linear trend; (III) personal income; (IV) disposable income; (V) income of nearest neighbors; (VI) including autocorrelation of incomes.

                                            Baseline  (I)       (II)      (III)     (IV)      (V)       (VI)
Autocorrelation of income, ρ                0.952     0.951     0.952     0.952     0.951     0.953     0.979
                                            (0.020)   (0.031)   (0.012)   (0.032)   (0.027)   (0.020)   (0.003)
Std. of idiosyncratic shocks, σ_ε           0.172     0.173     0.172     0.175     0.174     0.171     0.737
                                            (0.022)   (0.035)   (0.016)   (0.033)   (0.028)   (0.021)   (0.025)
Std. of transitory shocks, σ_ϕ              0.019     0.015     0.023     0.025     0.023     0.019     0.001
                                            (0.001)   (0.001)   (0.001)   (0.001)   (0.001)   (0.001)   (0.016)
Std. of aggregate shocks, σ_θ (in %)        0.751     0.619     0.987     0.768     0.751     0.742     3.805
                                            (0.082)   (0.105)   (0.102)   (0.123)   (0.104)   (0.081)   (0.128)
Correlation of shocks across regions, ψ     0.316     0.329     0.305     0.199     0.267     0.346     0.766
                                            (0.343)   (0.364)   (0.195)   (0.501)   (0.462)   (0.377)   (0.062)
Migration cost, c, in logs                  10.441    10.447    10.462    10.541    10.505    10.416    11.461
                                            (0.199)   (0.277)   (0.238)   (0.362)   (0.240)   (0.208)   (0.093)
Migration cost, c in $                      34248     34449     34961     37836     36489     33394     94939
Moment distance, χ²(1)                      3.053     26.02     6.331     12.333    9.448     0.806     127.83
p-value                                     0.081     0         0.012     0         0.002     0.369     0

See notes to Table 4.