

Journal of Econometrics 59 (1993) 99-123. North-Holland

Identification and estimation of dynamic models with a time series of repeated cross-sections*

Robert Moffitt* Brown University, Providence, RI 02912, USA

Repeated cross-sectional data contain information on independent cross-sections of individual units at two or more points in time. Estimation of dynamic models with such data is made difficult by the general lack of information on lagged dependent and independent variables and the consequent unobservability of the intertemporal covariances needed to identify and estimate dynamic models. It is demonstrated here that the parameters of such models, both linear and nonlinear, both with and without fixed individual effects, are identified and can be consistently estimated with the imposition of certain restrictions. The paper includes an examination of the identification and estimation with repeated cross-sectional data of dynamic discrete dependent variable models, which can be parameterized in terms of transition rates between the different cross-sections.

1. Introduction

Repeated cross-sectional (RCS) data contain information from independently drawn sets of cross-sections of a population at two or more points in time. While such data obviously provide more information than data from a single cross-section, RCS data are generally regarded as inferior to true panel data - that is, data on the same cross-sectional units over time - for the estimation of dynamic models. However, it has been pointed out previously that at least one class of models - linear with fixed effects - is identified and can be consistently estimated with RCS data [Browning et al. (1985), Deaton (1985)].¹ The present paper extends this point in several ways: (1) the identification conditions for the linear fixed effects model are stated explicitly; (2) estimation methods for the linear fixed effects model are demonstrated which make use of the individual micro data and which economize on parameters; (3) autoregressive linear models are considered; and (4) the methods are extended to models with discrete dependent variables, both with and without fixed effects. It is shown that the dynamic models in all cases are identified and that consistent estimating techniques are available from RCS data with the imposition of certain restrictions.

Correspondence to: Robert Moffitt, Department of Economics, Brown University, Providence, RI 02912, USA.

*The author would like to thank Christopher Flinn, James Heckman, Franco Peracchi, Marno Verbeek, and the participants of seminars at several universities for comments. The comments of three anonymous referees were also helpful. This paper was presented at the August 1990 World Congress of the Econometric Society in Barcelona and the CIDE Conference on ‘The Econometrics of Panels and Pseudo Panels’ in Venice in October 1990.

¹ It has also been shown by Heckman and Robb (1985) that certain classes of models for the impact of interventions can be consistently estimated with RCS data.

0304-4076/93/$06.00 © 1993 Elsevier Science Publishers B.V. All rights reserved

Efficiency questions are not considered in the paper, nor is the relative efficiency of true panel data and RCS data examined directly. That is an important topic but one left for future work. The analysis raises a number of other important questions as well which are left for future investigation; these are discussed at the end of the paper.

The availability of methods for the estimation of dynamic models with RCS data is important for applied work in countries without panel data, as is the situation in many Western European countries and many developing countries. However, it is also important even in countries, such as the U.S., where panel data are often available, for the available panels are often inferior to the available cross-sections in some respects. For example, the U.S. Current Population Survey (CPS) has larger samples, more representative samples over time because they are unaffected by attrition, and more consistently defined questions over time than the available U.S. panels. While a direct comparison of this type will not be made in the paper, it is worth noting that the relative desirability of the particular panels and RCS data sets that may be available in a country may not be clear-cut.

The analysis of RCS data is also of interest because such data provide a connecting link between micro and aggregate data. As will become clear below, the methods for estimating dynamic models with RCS data are explicitly or implicitly grouping methods akin to those used to generate aggregate data series. A comparison of estimates from RCS data and from true panel data would shed light on, for example, whether the commonly observed differences in parameter estimates from aggregate data and panel data in some areas of research (e.g., of life cycle models) are the result of the panel nature of the latter or simply their individual micro nature.

Section 2 of the paper outlines the classes of models to be considered. Linear models are discussed in section 3 and discrete-dependent-variable models are discussed in section 4. Section 5 reports two empirical illustrations, and a summary and set of conclusions follows in section 6.

2. Classes of models considered

The classes of models considered in the paper are specializations of the general model:

$$y_{it}^* = \alpha y_{i,t-1} + X_{it}'\beta + f_i + \varepsilon_{it}, \quad i = 1, \ldots, N, \quad t = 1, \ldots, T, \tag{1}$$

$$E(\varepsilon_{it}^2) = \sigma^2, \quad E(\varepsilon_{it}\varepsilon_{j\tau}) = 0, \quad \forall i, j, \; t \neq \tau, \tag{2}$$

where $y_{it}^*$ is a latent endogenous continuous variable for cross-sectional unit $i$ at time $t$, $y_{it}$ is its (possibly dichotomous) realization, $X_{it}$ is a $(K \times 1)$ vector of strictly exogenous (w.r.t. $\varepsilon_{it}$) variables for $i$ at $t$, $\beta$ is its associated coefficient vector, $f_i$ is a fixed effect orthogonal to $\varepsilon_{it}$ but not to $X_{it}$ and $y_{i,t-1}$, and $\varepsilon_{it}$ is an error term with scalar covariance matrix. The dynamics in the model arise both from the presence of the fixed effect and from the lagged endogenous variable. Fixed effects rather than random effects are considered because the latter, at least if orthogonal to $X_{it}$, are less likely to raise issues of consistency (e.g., if $\alpha = 0$, then consistent estimates in the presence of orthogonal random effects can be obtained by pooling the cross-sections).² Only first-order lags in $y_{it}$ are considered for simplicity; generalization to higher-order lags is straightforward. A nonscalar covariance matrix of $\varepsilon_{it}$ is also not considered because serial correlation parameters in that matrix will generally not be identified with RCS data, and identification of the other parameters in the model will generally not hinge on the absence of such correlation in any case. Finally, it is assumed for simplicity that the cross-sections at each $t$ are of the same size ($N$), that the population is closed with respect to in- and out-migration, and that there are no births or deaths.

Letting $y_{it}$ denote the observed value of the endogenous variable for unit $i$ at time $t$, the critical characteristic of RCS data is that $y_{it}$ is observed but $y_{i,t-1}$ is not. Thus no estimate of $\mathrm{cov}(y_{it}, y_{i,t-1})$ is available in the data. Covariances are needed to estimate (1) not only because of the presence of $y_{i,t-1}$ as a regressor but also because of the presence of the fixed effect. The question of the paper is the nature of the restrictions that must be imposed to estimate such models in the absence of information on the covariances.

Four special cases will be separately considered:

I. Linear Models ($y_{it} = y_{it}^*$)

(A) $\alpha = 0$ (fixed effects model),

(B) $f_i = 0$ (autoregressive model).

II. Binary Choice Limited Dependent Variable Models ($y_{it} = 1$ if $y_{it}^* \geq 0$; $y_{it} = 0$ otherwise)

(A) $\alpha = 0$ (binary choice fixed effects model),

(B) $f_i = 0$ (Markov model).

Extensions to cases which are combinations of these are straightforward and not discussed.³

² The ‘fixed effect’ in this model can be considered to be either a fixed unknown constant or a stochastic term possibly correlated with $X_{it}$ (i.e., a ‘correlated random effect’). The distinction will have no operational content for the analysis below.

3. Linear models

3.1. Fixed effects models

To keep matters simple we shall initially consider a model with a single regressor. In addition, we shall henceforth make explicit the nonpanel nature of the data by indexing individuals in each cross-section by $i(t)$, for in RCS data the individuals are potentially different in each cross-section. We therefore have

$$y_{i(t)t} = \beta_0 + \beta_1 x_{i(t)t} + f_{i(t)} + \varepsilon_{i(t)t}, \quad i(t) = 1, \ldots, N, \quad t = 1, \ldots, T. \tag{3}$$

Estimation of (3) by least squares on the pooled RCS data set will yield inconsistent estimates for $\beta_1$ if $x_{i(t)t}$ and $f_{i(t)}$ are correlated, since the latter is unobserved and hence must be omitted from the regression. Considering identification indirectly by search for a consistent estimator, we seek an instrumental variables (IV) estimator that yields a consistent estimate of $\beta_1$, and hopefully an estimator for which identification conditions are well-known. For this we require an instrument for $x_{i(t)t}$ that will, among other things, be asymptotically uncorrelated with $f_{i(t)}$.

The most important class of instruments is that consisting of functions of $t$. Let us specify a linear projection of $x_{i(t)t}$ onto $K$ such functions as follows:

$$x_{i(t)t} = \sum_{k=1}^{K} \delta_k g_k(t) + \omega_{i(t)t}, \tag{4}$$

where the $g_k$ are known functions and where $\omega_{i(t)t}$ is a residual orthogonal to those functions by construction. Using a function of $t$ as an instrument is sensible because $x_{i(t)t}$ presumably varies with $t$; if it did not, $\beta_1$ could not be identified even with true panel data.⁴ Also, functions of $t$ should be uncorrelated with true fixed effects because the latter are time-invariant and hence must be uncorrelated with functions of $t$.

³ In the binary choice model the lagged dependent variable could be in latent form rather than in the form of the observable. The analysis would be simpler in that case because an analytic reduced form could be derived and because instrumental variables could be used, as they cannot be in the case actually considered here (see below).

Formally, let $\hat{x}_t$ be the least squares prediction from (4). Then consistency of the IV estimator of $\beta_1$ requires that at least one element of the set $\{\delta_k\}$ be nonzero and that $\mathrm{plim}\,[(1/NT) \sum_{i(t),t} \hat{x}_t f_{i(t)}] = 0$.⁵ The first of these conditions should hold if $\hat{x}_t$ varies with $t$, as just noted. Whether the second condition will hold is more questionable, for the nature of the RCS data implies that the fixed effects in the sample will differ in each cross-section ($t$) because there are potentially different individuals in each.

The fixed effect $f_{i(t)}$ can be decomposed into its sample mean at each $t$ ($\bar{f}_t$) and its deviation from that mean ($v_{i(t)}$):

$$f_{i(t)} = \bar{f}_t + v_{i(t)}. \tag{5}$$

The former can be further decomposed into the true mean fixed effect in the total population ($f^*$), which is time-invariant by necessity - here the assumption of a closed population is important - and a deviation from that mean arising from sampling error in the data ($v_t$):

$$\bar{f}_t = f^* + v_t. \tag{6}$$

Thus we have

$$f_{i(t)} = f^* + v_t + v_{i(t)}. \tag{7}$$

Consistency of the IV estimator requires that $\hat{x}_t$ be asymptotically uncorrelated with $f^*$, $v_t$, and $v_{i(t)}$. Since $f^*$ is a constant, the condition is necessarily met for it. However, it is met for the latter two variables as well as $N \to \infty$, though not as $T \to \infty$. As $N \to \infty$, the sampling error $v_t$ goes to 0; further, the distribution of $v_{i(t)}$ becomes independent of $t$ (and hence of $\hat{x}_t$) because the sample of individuals in each successive cross-section becomes identical when the total population is sampled, even if their identities are not known.
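The argument above can be illustrated with a small simulation. The sketch below is not taken from the paper: the data-generating process, the parameter values, and the function names (`simulate_rcs`, `pooled_ols_slope`, `iv_slope_time_instrument`) are all assumptions chosen for illustration. It draws a fresh set of individuals in every period, lets the regressor depend on both $t$ and the fixed effect, and compares pooled least squares with an IV estimator whose instrument is the projection of $x$ on $(1, t)$.

```python
import numpy as np

def simulate_rcs(n_per_period=20000, n_periods=6, beta1=0.5, seed=0):
    """Independent cross-sections from y = 1 + beta1*x + f + eps (illustrative DGP)."""
    rng = np.random.default_rng(seed)
    t = np.repeat(np.arange(1, n_periods + 1), n_per_period)
    f = rng.normal(size=t.size)            # fixed effects of newly drawn individuals
    omega = rng.normal(size=t.size)
    x = 0.3 * t + 0.8 * f + omega          # x trends with t and is correlated with f
    eps = rng.normal(size=t.size)
    y = 1.0 + beta1 * x + f + eps
    return t, x, y

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

def pooled_ols_slope(x, y):
    # Least squares on the pooled cross-sections; f is omitted.
    return ols(np.column_stack([np.ones_like(x), x]), y)[1]

def iv_slope_time_instrument(t, x, y):
    # First stage: project x on g(t) = (1, t); the prediction varies only with t.
    G = np.column_stack([np.ones_like(x), t])
    x_hat = G @ ols(G, x)
    # Second stage: regress the individual y on (1, x_hat).
    return ols(np.column_stack([np.ones_like(x), x_hat]), y)[1]

t, x, y = simulate_rcs()
b_ols = pooled_ols_slope(x, y)
b_iv = iv_slope_time_instrument(t, x, y)
print(f"pooled OLS: {b_ols:.3f}   IV with time instrument: {b_iv:.3f}")
```

Under these assumptions the pooled estimate is pulled well away from the true slope of 0.5 by the omitted fixed effect, while the IV estimate should settle near 0.5 as the per-period sample size grows.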

Remark 1. The two conditions for consistency are stronger than what is required for true panel data. With panel data, $\beta_1$ can be identified even if $\delta_k = 0 \; \forall k$ because $\omega_{i(t)t}$ still provides variation over $t$ conditional on $i(t)$. Furthermore, since $f_{i(t)} = f_i \; \forall t$, unbiased estimates of $\beta_1$ can be obtained even at finite $N$ by conditioning on $f_i$.

⁴ However, variation in $x_{i(t)t}$ over $t$ could arise from $\omega_{i(t)t}$ only; see below.

⁵ By assumption, $\hat{x}_t$ is asymptotically uncorrelated with $\varepsilon_{i(t)t}$.

Remark 2. The restrictions necessary for identification and estimation with RCS data are needed to circumvent what is a type of missing data problem, namely, that the past and future values of $y_{it}$ and $x_{it}$ for the same individual $i$ are missing. But the lack of data on $x_{it}$ at different $t$ for the same $i$ is more serious than the lack of data on similar $y_{it}$. With true panel data the conventional within estimator for $\beta_1$ can be written as an instrumental variables estimator with $(x_{it} - \bar{x}_{i\cdot})$ as instrument ($\bar{x}_{i\cdot}$ = mean over $t$ for each $i$):

$$\hat{\beta}_1(\text{within}) = \frac{\sum_i \sum_t y_{it}(x_{it} - \bar{x}_{i\cdot})}{\sum_i \sum_t x_{it}(x_{it} - \bar{x}_{i\cdot})}. \tag{8}$$

If the available RCS data contained information on histories of the $x_{it}$ of the individuals in the sample, even if not the $y_{it}$ histories, the right-hand side of (8) could be computed even for such data. The resulting estimator would be consistent under the same conditions as is the within estimator because they are identical.
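The equivalence between the within estimator and the IV form with $(x_{it} - \bar{x}_{i\cdot})$ as instrument can be checked numerically. The following is a minimal sketch on simulated panel data; the data-generating process and variable names are illustrative assumptions, not part of the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n, T, beta1 = 500, 5, 0.5                       # illustrative panel dimensions
f = rng.normal(size=(n, 1))                     # individual fixed effects
x = 0.8 * f + rng.normal(size=(n, T))           # regressor correlated with f
y = beta1 * x + f + rng.normal(size=(n, T))

# IV form: instrument is the deviation of x from its individual mean.
x_dev = x - x.mean(axis=1, keepdims=True)
beta_iv = (y * x_dev).sum() / (x * x_dev).sum()

# Conventional within estimator: regress demeaned y on demeaned x.
y_dev = y - y.mean(axis=1, keepdims=True)
beta_within = (y_dev * x_dev).sum() / (x_dev ** 2).sum()

print(beta_iv, beta_within)
```

The two numbers agree to rounding error because the individual means of $y$ and $x$ are orthogonal to the within-individual deviations of $x$. Note that only the $x$ histories enter the instrument, which is the point of the remark.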

The model thus far requires time series aggregation - the instrument $\hat{x}_t$ varies only with $t$, not with $i(t)$.⁶ More efficient estimates can be obtained if the micro data contain information on exogenous time-invariant characteristics which are determinants of $x$. In many models of individual behavior, year of birth qualifies as such a variable. If there are ‘cohort’ effects in $x$, then an instrument for $x_{i(t)t}$ can be based on a linear projection onto functions of both $t$ and $c_{i(t)}$, the year of birth of individual $i(t)$:

$$x_{i(t)t} = \sum_{k} \delta_{1k} g_{1k}(t) + \sum_{m} \delta_{2m} g_{2m}(c_{i(t)}, t) + \sum_{j} \delta_{3j} g_{3j}(c_{i(t)}) + \omega_{i(t)t}, \tag{9}$$

where $g_{1k}$, $g_{2m}$, and $g_{3j}$ are known functions. Unfortunately, an instrument based on (9) is likely to be correlated with $f_{i(t)}$ because the fixed effect is likely to be correlated with cohort as well (indeed, this may be one source of the covariation between $x_{i(t)t}$ and $f_{i(t)}$). Denoting $\bar{f}_{c(t)}$ as the mean fixed effect for those sampled at $t$ who were born in year $c$; $v_{i(t)}$ as the deviation of $f_{i(t)}$ from that mean; $f_c^*$ as the true mean for cohort $c$ in the total population; and $v_{c(t)}$ as the deviation of $\bar{f}_{c(t)}$ from $f_c^*$, we have

$$f_{i(t)} = f_c^* + v_{c(t)} + v_{i(t)}, \tag{10}$$

in analogy with (7). Since $f_c^*$ can be represented with cohort dummies explicitly in the main equation (3), consistency again requires that $\hat{x}_{i(t)t}$ be asymptotically uncorrelated with $v_{c(t)}$ and $v_{i(t)}$. Consistency follows in this case as $N \to \infty$, holding $T$ and the number of cohorts fixed. The reasoning is identical to that given previously with the modification that the number of observations in each cohort-$t$ cell goes to infinity.⁷

⁶ The left-hand side need not be grouped w.r.t. (i.e., aggregated over) $t$ for consistent estimation. See footnote 8 below.

Remark 3. Browning et al. (1985) computed group means of $y_{i(t)t}$ and $x_{i(t)t}$ within cohort-age cells and regressed the $y$ means on the $x$ means and cohort dummies. This procedure is a special case of the one outlined here, with a full set of age, cohort, and cohort-age interaction dummies appearing in (9). It is well-known that grouping methods are instrumental variables methods, with cohort and age serving as the grouping criteria in this case. Only age ($t$) is excluded from the main equation and hence it serves as the identifying grouping variable, just as in the simpler model.⁸
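The grouping interpretation can be made concrete with a short sketch. The simulated design below (equal-sized cohort-age cells, the coefficient values, and the helper `cell_means`) is an assumption made for illustration and is not the Browning et al. data or code. It compares a regression of cell means of $y$ on cell means of $x$ and cohort dummies with a regression of the individual $y$ on the same cell means of $x$ and cohort dummies.

```python
import numpy as np

rng = np.random.default_rng(2)
n_cohorts, n_ages, cell_size = 4, 5, 200          # equal-sized cells (assumption)
cohort = np.repeat(np.arange(n_cohorts), n_ages * cell_size)
age = np.tile(np.repeat(np.arange(n_ages), cell_size), n_cohorts)
cell = cohort * n_ages + age                      # cohort-age cell index

f = 0.5 * cohort + rng.normal(size=cell.size)     # fixed effect with a cohort component
x = 0.4 * age + 0.3 * cohort + 0.8 * f + rng.normal(size=cell.size)
y = 0.5 * x + f + rng.normal(size=cell.size)

def cell_means(v):
    """Mean of v within each cohort-age cell."""
    return np.bincount(cell, weights=v) / np.bincount(cell)

# Grouped regression: cell means of y on cell means of x and cohort dummies.
cell_cohort = np.arange(n_cohorts * n_ages) // n_ages
X_cells = np.column_stack([cell_means(x), np.eye(n_cohorts)[cell_cohort]])
b_grouped = np.linalg.lstsq(X_cells, cell_means(y), rcond=None)[0]

# Micro regression: individual y on the cell mean of x and cohort dummies.
X_micro = np.column_stack([cell_means(x)[cell], np.eye(n_cohorts)[cohort]])
b_micro = np.linalg.lstsq(X_micro, y, rcond=None)[0]

print(b_grouped[0], b_micro[0])   # slope on x from each regression
```

With equal cell sizes the two coefficient vectors coincide exactly, so grouping the dependent variable changes nothing about the point estimates; with unequal cells the same holds if the grouped regression is weighted by cell size.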

It may be noted that a full set of cohort dummies may not be required in the main equation. The equation

$$y_{i(t)t} = \beta_0 + \beta_1 x_{i(t)t} + \sum_{j=1}^{J} \gamma_j h_j(c_{i(t)}) + \varepsilon_{i(t)t}, \tag{11}$$

where the $h_j$ are known functions, has as a special case equation (3) with (10) substituted in and with cohort dummies used for the $f_c^*$. In most applications, it is likely that $y$ will vary smoothly with cohort effects and, hence, those effects will be representable with fewer parameters than would be required with a full set of cohort dummies. Efficiency gains could thus result. An example of the degree to which such parsimony can be achieved in one application is given in the empirical section below.⁹

Remark 4. Deaton (1985) noted an errors-in-variables problem with the Browning et al. procedure that arises because the means of $x_{i(t)t}$ for each cohort-age cell are error-filled measures of true cohort-age means. In the model here that problem appears in the terms $v_{c(t)}$ and $v_{i(t)}$. Those terms represent errors in the cohort dummies for $f_c^*$ which arise because those dummies imperfectly proxy the cross-section-specific means $\bar{f}_{c(t)}$. Those errors may be correlated with $\hat{x}_{i(t)t}$ in small samples, but this correlation disappears as $N \to \infty$, as noted previously.

⁷ See Verbeek and Nijman (1992) and Angrist (1991) for a discussion of this condition as well.

⁸ Grouping the dependent variable, $y_{i(t)t}$, is unnecessary for identification and point estimation. It is straightforward to show that a least squares regression of the cell means of $y$ on the cell means of $x$ yields estimates identical to those of a regression of the individual $y_{i(t)t}$ on the cell means of $x$. However, standard errors may be more accurate if $y$ is grouped. The efficiency and optimality of IV methods are not discussed here, only their identification.

⁹ Parsimony in (9) is not particularly desirable asymptotically. There is no large-sample efficiency loss from including all possible functions of $t$ and $c_{i(t)}$ in the instrument. However, there may be small-sample losses.

Moving to the general case, new issues are raised. The general case can be written

$$y_{i(t)t} = X_{i(t)t}'\beta + f_{i(t)} + \varepsilon_{i(t)t}, \tag{12}$$

where $X_{i(t)t}$ is now a $(K \times 1)$ vector of regressors potentially correlated with $f_{i(t)}$. Assume that there exists an $(L \times 1)$ vector $Z_{i(t)}$ of time-invariant variables [for individual $i(t)$] including not only cohort but also, for example, sex, race, years of education (if schooling has been completed), and residential location (if mobility can be ignored). Also let $W_{i(t)t}$ be an $(M \times 1)$ vector of time-varying variables uncorrelated with $f_{i(t)}$, which may consist only of functions of $t$. Then the linear projections upon which the IV method is based can be written as

$$X_{i(t)t} = \delta_1 W_{i(t)t} + \delta_2 Z_{i(t)} + \omega_{i(t)t}, \tag{13}$$

$$f_{i(t)} = Z_{i(t)}'\gamma + v_{i(t)}, \tag{14}$$

where $\delta_1$ is a $(K \times M)$ matrix of coefficients, $\delta_2$ is a $(K \times L)$ matrix of coefficients, $\omega_{i(t)t}$ is a $(K \times 1)$ vector of errors, $\gamma$ is an $(L \times 1)$ vector of coefficients, and $v_{i(t)}$ represents the remaining individual effect conditional on $Z_{i(t)}$. Eq. (14) is a generalization of (10), with $Z_{i(t)}'\gamma$ capturing the effects of all time-invariant variables, not just cohort, and with $v_{i(t)}$ in (14) representing both errors in (10) combined. Letting $y$, $X$, $Z$, and $v$ be the vectors and matrices stacked over all $NT$ observations and letting $U = [X \; Z]$ and $\hat{U} = [\hat{X} \; Z]$, where $\hat{X}$ is the matrix of least squares predictions from (13), the IV estimator for $\beta$ and $\gamma$ is $(\hat{U}'U)^{-1}\hat{U}'y$ and consistency requires that $\mathrm{plim}\,[(1/NT)\,\hat{U}'v] = 0$ and that $\hat{U}$ be of full column rank, $(K + L)$. As in the simpler case, consistency is achievable only as $N \to \infty$ holding $T$ fixed.¹⁰
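A compact sketch of the estimator $(\hat{U}'U)^{-1}\hat{U}'y$ follows. Everything in it beyond the formula itself is an assumption introduced for illustration: the function name `rcs_iv`, the simulated design with two regressors, a cohort variable in $Z$, and $W$ made up of $t$ and a hypothetical aggregate series `u_rate` that varies only over time.

```python
import numpy as np

def rcs_iv(y, X, Z, W):
    """IV estimator (U_hat' U)^{-1} U_hat' y with U = [X Z] and U_hat = [X_hat Z]."""
    F = np.column_stack([W, Z])                       # first-stage regressors in (13)
    X_hat = F @ np.linalg.lstsq(F, X, rcond=None)[0]  # least squares predictions of X
    U = np.column_stack([X, Z])
    U_hat = np.column_stack([X_hat, Z])
    return np.linalg.solve(U_hat.T @ U, U_hat.T @ y)

rng = np.random.default_rng(3)
n_per_period, n_periods = 20000, 6
t = np.repeat(np.arange(1, n_periods + 1), n_per_period).astype(float)
n = t.size
# Hypothetical aggregate series, one value per period.
u_rate = np.array([5.0, 7.0, 6.0, 4.0, 8.0, 5.0])[t.astype(int) - 1]
cohort = rng.integers(0, 4, size=n).astype(float)

Z = np.column_stack([np.ones(n), cohort])             # time-invariant variables
W = np.column_stack([t, u_rate])                      # time-varying instruments
beta = np.array([0.5, -0.3])
gamma = np.array([1.0, 0.7])

v = rng.normal(size=n)                                # individual effect net of Z
X = np.column_stack([
    0.3 * t + 0.5 * cohort + 0.8 * v + rng.normal(size=n),
    0.5 * u_rate - 0.2 * cohort + 0.5 * v + rng.normal(size=n),
])
f = Z @ gamma + v                                     # fixed effect as in (14)
y = X @ beta + f + rng.normal(size=n)

theta = rcs_iv(y, X, Z, W)                            # ordered as (beta, gamma)
print(theta)
```

In this design both regressors are correlated with $v_{i(t)}$, yet the estimates should be close to the assumed $(\beta, \gamma)$ because the instruments depend only on $W$ and $Z$, which are independent of $v_{i(t)}$.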

Remark 5. This generalization permits ‘grouping’ conditional on any set of time-invariant variables, not just cohort. The sample sizes in most data sets would not permit literal grouping by cells of $Z_{i(t)}$. The regression-based procedure proposed here is one method for dealing with this problem (literal grouping by sufficiently large cells is still an alternative).¹¹

¹⁰ Also required is $\mathrm{plim}\,[(1/NT)\,\hat{U}'\varepsilon] = 0$; this is assumed throughout. If this assumption fails, e.g., because of time-specific shocks, different instruments may be needed. See below.

¹¹ Time-specific variables such as the aggregate unemployment rate may also be entered and used to ‘group’ in this regression sense.

Remark 6. There may be difficulties in meeting the rank condition if the dimension of $X_{i(t)t}$ is high and if $W_{i(t)t}$ only contains functions of $t$, for identification in this case requires an additional independent function of $t$ for each additional variable in $X_{i(t)t}$. The problem is likely to be especially severe if functions of $t$ are included themselves in $X_{i(t)t}$, for, while they can serve as their own instruments, they will make it more difficult to identify the coefficients of other variables in $X_{i(t)t}$ because additional functions of $t$ will be required.
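The rank condition in this remark can be checked directly on the instrument matrix. The sketch below uses an assumed simulated design with two regressors and a hypothetical helper `instrument_matrix_rank`; it counts the singular values of $\hat{U} = [\hat{X} \; Z]$ that are not negligible, first with a single function of $t$ in $W$ and then with two.

```python
import numpy as np

def instrument_matrix_rank(X, Z, W, rel_tol=1e-8):
    """Numerical column rank of U_hat = [X_hat Z], with X_hat fitted from [W Z]."""
    F = np.column_stack([W, Z])
    X_hat = F @ np.linalg.lstsq(F, X, rcond=None)[0]
    s = np.linalg.svd(np.column_stack([X_hat, Z]), compute_uv=False)
    return int((s > rel_tol * s[0]).sum())

rng = np.random.default_rng(5)
n_per_period, n_periods = 1000, 6
t = np.repeat(np.arange(1, n_periods + 1), n_per_period).astype(float)
n = t.size
cohort = rng.integers(0, 4, size=n).astype(float)
Z = np.column_stack([np.ones(n), cohort])             # L = 2
X = np.column_stack([                                 # K = 2
    0.3 * t + rng.normal(size=n),
    0.1 * t**2 + rng.normal(size=n),
])

# One function of t: both predicted regressors lie in span{t, 1, cohort}.
rank_one_time_function = instrument_matrix_rank(X, Z, t[:, None])
# Two functions of t: one independent time function per regressor.
rank_two_time_functions = instrument_matrix_rank(X, Z, np.column_stack([t, t**2]))

print(rank_one_time_function, rank_two_time_functions)
```

With one time function the matrix has four columns but rank three, so $K + L = 4$ coefficients cannot be identified; adding a second independent function of $t$ restores full column rank.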

Remark 7. If $X_{i(t)t}$ is redefined to include time-invariant variables for which ‘structural’ coefficients in the $\beta$ vector are defined, their coefficients cannot be identified if those variables are allowed to be correlated with $f_{i(t)}$. This result is similar to the nonidentification of the coefficients on time-invariant variables in the panel data model with fixed effects.

Remark 8. If $X_{i(t)t}$ is not strictly exogenous w.r.t. $\varepsilon_{i(t)t}$, then additional restrictions on the choice of instruments may result. For example, if aggregate or individual shocks affect both $X_{i(t)t}$ and $\varepsilon_{i(t)t}$, instruments which use only variables in an information set at some prior time may appear in (13). More generally, whatever orthogonality conditions are suggested by theory may be used to generate the instruments. See Blundell et al. (1989) for a discussion of this issue in the context of the Browning et al. grouping estimator.

3.2. Autoregressive models

The autoregressive model is

$$y_{i(t)t} = \alpha y_{i(t),t-1} + X_{i(t)t}'\beta + Z_{i(t)}'\gamma + \varepsilon_{i(t)t}, \quad i = 1, \ldots, N, \quad t = 2, \ldots, T, \tag{15}$$

where all variables are as defined previously, but where $X_{i(t)t}$ is defined to include only time-varying variables distinct from the vector of time-invariant variables $Z_{i(t)}$. Estimation of the autoregressive model with RCS data can be achieved with methods similar to the IV methods used for the fixed effects model if an instrument for $y_{i(t),t-1}$ can be constructed, though here it must be 2SLS since the true value of $y_{i(t),t-1}$ is not observed. Consider a linear projection using the observations at $t-1$ on $y_{i(t-1),t-1}$:

$$y_{i(t-1),t-1} = W_{i(t-1),t-1}'\delta_1 + Z_{i(t-1)}'\delta_2 + \omega_{i(t-1),t-1}, \tag{16}$$

where $W_{i(t-1),t-1}$ is a vector of time-varying variables. The vector of time-invariant variables is assumed to be identical in (15) and (16) although it is conceivable that they may differ.

Inserting a predicted variable $\hat{y}_{i(t),t-1}$ obtained from least-squares estimation of (16) (using $W_{i(t),t-1}$ and $Z_{i(t)}$) into (15) in place of $y_{i(t),t-1}$ and applying least squares will yield consistent estimates of $\beta$ and $\gamma$ provided that $\hat{y}_{i(t),t-1}$ is asymptotically uncorrelated with $\varepsilon_{i(t)t}$. Identification of the coefficients requires that the stacked matrix $[\hat{y} \; X \; Z]$ be of full column rank.
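The two-step procedure can be sketched on simulated data. The design below is an illustration under stated assumptions, not the paper's application: a single time-invariant variable $z$, a regressor $x$ that is drawn independently in each period around a time trend (so that current $x$ is unrelated to the unobserved part of the lagged outcome), a process started from zero, and a first stage fitted separately in each period on the constant and $z$ only. The names `draw_cross_section` and `rcs_2sls` are hypothetical.

```python
import numpy as np

ALPHA, BETA, GAMMA = 0.5, 1.0, 1.0        # illustrative true values

def draw_cross_section(t, n, rng):
    """Fresh individuals observed only at period t; their history is simulated but discarded."""
    z = rng.normal(size=n)
    y = np.zeros(n)                        # process starts from y_0 = 0
    for s in range(1, t + 1):
        x = 0.5 * s + rng.normal(size=n)   # x independent over time for each individual
        y = ALPHA * y + BETA * x + GAMMA * z + rng.normal(size=n)
    return y, x, z                         # period-t values only

def rcs_2sls(sections):
    blocks, ys = [], []
    for prev, cur in zip(sections[:-1], sections[1:]):
        y_prev, _, z_prev = prev
        y_cur, x_cur, z_cur = cur
        # First stage on the t-1 sample: project y_{t-1} on (1, z).
        F_prev = np.column_stack([np.ones_like(z_prev), z_prev])
        a, b = np.linalg.lstsq(F_prev, y_prev, rcond=None)[0]
        # Predicted lag for the different individuals sampled at t.
        y_lag_hat = a + b * z_cur
        blocks.append(np.column_stack([y_lag_hat, x_cur, np.ones_like(z_cur), z_cur]))
        ys.append(y_cur)
    # Second stage: pooled least squares with the predicted lag.
    return np.linalg.lstsq(np.vstack(blocks), np.concatenate(ys), rcond=None)[0]

rng = np.random.default_rng(6)
sections = [draw_cross_section(t, 20000, rng) for t in range(1, 7)]
alpha_hat, beta_hat, const_hat, gamma_hat = rcs_2sls(sections)
print(alpha_hat, beta_hat, const_hat, gamma_hat)
```

Identification of $\alpha$ here comes from the fact that the first-stage intercept and slope change from period to period, so the predicted lag is not a linear combination of the constant, $z$, and $x$. If $x$ were serially correlated for the same individual, the predicted lag would need to account for that and this simple version would no longer be appropriate.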

Remark 1. As in the simple case considered in the fixed effects model, the $Z_{i(t)}$ could consist of cohort functions and the $W_{i(t)t}$ vector could consist of functions of $t$ alone. The model would then be clearly identified by the variation in $t$ in the instrument. Inserting the mean $\bar{y}_{i(t-1),t-1}$ of the sample at $t-1$ in the same cohort as that at $t$ is a simple example of such an instrument. But (15) and (16) generalize that method to one permitting regression-based ‘grouping’ on any other set of time-invariant variables $Z_{i(t)}$ and any other time-varying variables that may be available. Indeed, even a single cross-section may be used to estimate the model if there are no cohort effects, if $t$ is defined as age, and if $y_{i(t),t-1}$ is constructed from the units aged $t-1$ in the same cross-section.

Remark 2. Construction of an instrument $\hat{y}_{i(t),t-1}$ using variables in $W_{i(t)t}$ other than $t$ itself requires knowing the history of those variables for the cross-sectional units at $t$. This is a strong data requirement unlikely to be satisfied by most RCS data sets. Variables that may be thought to be in $W_{i(t)t}$ but which are not observed must themselves be projected onto functions of $t$ and whatever time-varying variables are observed in much the same manner as $x_{i(t)t}$ was projected in the fixed effects model.¹² The rank condition for identification may be difficult to meet in this circumstance if $X_{i(t)t}$ itself contains time-varying functions of $t$ or $t$ itself.

Remark 3. A more formal approach to the problem may be taken by imposing the structure (15) on all previous $t$ and by constructing an instrument for $y_{i(t),t-1}$ from the reduced form of the equation. For illustration, assume that the process has a finite start date and that the initial value of $y_{i(t)t}$ is determined by the function

$$y_{i(t)1} = X_{i(t)1}'\beta + Z_{i(t)}'\gamma + \varepsilon_{i(t)1}, \tag{17}$$

which is simply (15) evaluated at $y_{i(t)0} = 0$. A more general formulation could be given in which the variables, coefficients, and error variance in (17) are less strictly tied to those in (15). The reduced form of (15) can be shown to be

$$y_{i(t),t-1} = \sum_{\tau=1}^{t-1} \alpha^{t-1-\tau} X_{i(t)\tau}'\beta + Z_{i(t)}'\gamma \left( \frac{1 - \alpha^{t-1}}{1 - \alpha} \right) + v_{i(t),t-1}, \tag{18}$$

where

$$v_{i(t),t-1} = \sum_{\tau=1}^{t-1} \alpha^{t-1-\tau} \varepsilon_{i(t)\tau}, \tag{19}$$

which can be estimated on the $t-1$ sample and used to construct an instrument for the $t$ sample. Note that in this case the absence of any time-varying variables $X_{i(t)t}$ in the model still permits identification because the autoregressive structure implies that the cumulative effects of $Z_{i(t)}$ on $y_{i(t),t-1}$ vary with $t$, thus permitting the construction of an instrument not linearly dependent on $Z_{i(t)}$ in (15). More flexible specifications of the autoregressive structure of the model would generate less rigid forms of the $t$-dependence of the effect of $Z_{i(t)}$ in (18).

¹² It should be noted that some time-varying variables like the number and ages of children may be backcast with considerable accuracy, and that aggregate variables such as the unemployment rate will presumably also be measurable in the past.
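The reduced form (18)-(19) can be verified numerically by iterating (17) and (15) forward for one individual and comparing the result with the closed-form expression. The sketch below uses made-up values for the history of $X$, the errors, and the parameters; the function names are hypothetical.

```python
import numpy as np

def lagged_y_recursive(X, z_gamma, eps, alpha, beta):
    """Iterate (17) and then (15) through period t-1; rows of X and eps are periods 1..t-1."""
    y = 0.0                                        # y_0 = 0
    for x_tau, e_tau in zip(X, eps):
        y = alpha * y + x_tau @ beta + z_gamma + e_tau
    return y

def lagged_y_reduced_form(X, z_gamma, eps, alpha, beta):
    """Closed form (18) with the error term (19)."""
    n = len(eps)                                   # n = t - 1
    weights = alpha ** (n - 1 - np.arange(n))      # alpha^(t-1-tau) for tau = 1..t-1
    systematic = weights @ (X @ beta) + z_gamma * (1 - alpha ** n) / (1 - alpha)
    return systematic + weights @ eps

rng = np.random.default_rng(7)
alpha, beta, z_gamma = 0.6, np.array([0.5, -0.2]), 1.3   # illustrative values
X = rng.normal(size=(5, 2))                        # X history for periods 1..5 (t = 6)
eps = rng.normal(size=5)

y_rec = lagged_y_recursive(X, z_gamma, eps, alpha, beta)
y_rf = lagged_y_reduced_form(X, z_gamma, eps, alpha, beta)
print(y_rec, y_rf)
```

The factor multiplying $Z_{i(t)}'\gamma$ depends on $t$ through $\alpha^{t-1}$, which is what allows the instrument to vary independently of $Z_{i(t)}$ across periods.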

Remark 4. It may be of interest to note that least squares estimates of (15) on true panel data using the observed values of $y_{i(t),t-1}$ will generate inconsistent estimates of the parameters if there is individual-specific serial correlation in $\varepsilon_{i(t)t}$ (e.g., from ‘unobserved heterogeneity’). But this problem does not appear with RCS data because IV methods are used by necessity. These methods are consistent in the presence of such heterogeneity.¹³

4. Discrete dependent variables

4.1. Fixed effects models

Only binary choice models will be considered; extensions to other discrete models and to limited dependent variable models are straightforward. Write the fixed effects model as

$$y_{i(t)t}^* = X_{i(t)t}'\beta + f_{i(t)} + \varepsilon_{i(t)t}, \tag{20}$$

$$y_{i(t)t} = \begin{cases} 1 & \text{if } y_{i(t)t}^* \geq 0, \\ 0 & \text{otherwise.} \end{cases} \tag{21}$$

¹³ Note that one cannot use the lagged $y_{i(t),t-1}$ as an instrument - that is not observable. Using it in true panel data would be erroneous if there is first-order serial correlation. Instead, lagged $y_{i(t-1),t-1}$ must be used, but that variable is identical for all individuals $i(t)$ who are of the same age or cohort.

Assuming si(r)t * N(0, l), the issue is whether the IV or 2SLS methods discussed in section 3.1 can be applied here. 2SLS methods for limited dependent variable models without fixed effects were first used by Nelson and Olsen (1978), who simply estimated a Tobit model with predicted regressors. Amemiya (1979) demonstrated the consistency of the estimator and derived its asymptotic covariance matrix. A number of other consistent 2SLS and IV methods for probit and Tobit have since been developed; Newey (1987) considers their relative efficiencies. Thus consistency of 2SLS in such models has been estab- lished. Identification requires meeting the same conditions as in the linear model.

Relatively little is altered when the fixed effects model in (20)-(21) is considered, although an additional normality assumption is required. In the case where there are no cohort effects, insertion of (7) into (20) generates a composite error term vi + Vi(r) + &i(t)t. As N -00, vi -+ 0, as before, and, further, the distribution of the within-period individual fixed effects vi(f) converges to a distribution uncorrelated with the instruments for Xi(t)r. With the additional (not necessarily innocuous) assumption of normality of Vi(t) + &i(t)t, the consistency results in the articles cited in the previous paragraph are immediately applicable. Unlike the case with true panel data, where large T is required for consistent estimation of logit and probit fixed effects models [Chamberlain (1980), Heckman (1981a)], here the individual fixed effects are not estimated and hence no inconsistency in their estimates at finite T is transmitted to /3.

If cohort fixed effects are estimated, consistency of the 2SLS-estimator-plus-cohort-dummies requires that $N \to \infty$ while the number of cohorts is held fixed, or, alternatively, that $N/N_c \to \infty$, where $N_c$ is the number of cohorts. However, this requirement is present even in the linear model, where the cohort dummies are already error-filled proxies for true mean cohort fixed effects in finite-$N$ samples. Thus we have consistency only with $N \to \infty$ in both the linear and probit models. The same applies for the general case shown in (20) using (13) to construct instruments.

4.2. Markov models

A first-order autoregressive model between binary outcome variables is equivalent to a first-order Markov model.$^{14}$ That model can be characterized by two transition rates, one each for the probabilities of inflow and outflow from

$^{14}$ If the lagged endogenous variable is $y^*_{i,t-1}$ rather than $y_{i,t-1}$, simpler methods more directly analogous to those in the linear autoregressive model can be used.


each of the two states. Define the model as follows:$^{15}$

$$p_{it} = \mathrm{Prob}(y_{it} = 1), \tag{22}$$

$$\mu_{it} = \mathrm{Prob}(y_{it} = 1 \mid y_{i,t-1} = 0), \tag{23}$$

$$\lambda_{it} = \mathrm{Prob}(y_{it} = 0 \mid y_{i,t-1} = 1). \tag{24}$$

Then we have the accounting identity

$$p_{it} = \mu_{it}(1 - p_{i,t-1}) + (1 - \lambda_{it})\,p_{i,t-1} = \mu_{it} + \eta_{it}\,p_{i,t-1}, \tag{25}$$

where $\eta_{it} = 1 - \lambda_{it} - \mu_{it}$. Eq. (25), a standard flow equation in the literature on Markov processes, relates the two marginal probabilities at $t$ and $t - 1$ to the two transition rates.

That the parameters of (25) are identified with panel data in cases in which they are not with RCS data is intuitive and forms the basis for a fundamental nonidentification result for RCS data. For example, reinterpreting the cross-sectional index $i$ as indexing groups of cross-sectional units within which frequency estimates of $\mu$ and $\lambda$ can be estimated with a given panel sample, nonidentification in the RCS case can be seen from (25) by merely noting that the two parameters $\mu_{it}$ and $\lambda_{it}$ for group $i$ at period $t$ cannot be identified solely from knowledge of $p_{it}$ and $p_{i,t-1}$. Thus a model with $\mu_{it}$ and $\lambda_{it}$ completely unrestricted with respect to $i$ and $t$ cannot be estimated with RCS data.$^{16}$
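The nonidentification argument can be illustrated numerically. The following is a minimal sketch, not part of the original analysis: the function name `next_marginal` and the two sets of transition rates are hypothetical values chosen for illustration. It shows that two quite different pairs of entry and exit rates produce the same marginal probability at $t$ from the same marginal probability at $t - 1$, so the two cross-sectional marginals alone cannot distinguish them.

```python
def next_marginal(mu, lam, p_prev):
    """Flow equation (25): inflow from state 0 plus survivors in state 1."""
    return mu * (1.0 - p_prev) + (1.0 - lam) * p_prev


p_prev = 0.40  # marginal probability observed in the cross-section at t - 1

# Two hypothetical sets of transition rates (entry rate mu, exit rate lam).
p_sticky = next_marginal(mu=0.30, lam=0.20, p_prev=p_prev)
p_churning = next_marginal(mu=0.50, lam=0.50, p_prev=p_prev)

# Both values equal 0.5 up to floating-point rounding.
print(p_sticky, p_churning)
```

Any pair of rates on the line $\mu(1 - p_{t-1}) + (1 - \lambda)p_{t-1} = p_t$ fits the two observed marginals equally well, which is why restrictions across units or periods are needed.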

Identification is possible with restrictions imposed over i and/or t, however. Consider two examples.

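Example 1: Time-homogeneous and unit-homogeneous hazards. If $\mu_{it} = \mu$ and $\lambda_{it} = \lambda$ for all $i$ and $t$, then, letting $\eta = 1 - \lambda - \mu$, we have

$$p_{it} = \mu + \eta\,p_{i,t-1}, \tag{26}$$

with reduced form

$$p_{it} = \mu\,\frac{1 - \eta^{t}}{1 - \eta}, \tag{27}$$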

$^{15}$ The notation for $i(t)$ is not used in this section for simplicity.

$^{16}$ More generally, in a transition model with $L$ possible states rather than 2, the $L(L - 1)$ unique elements of an $L \times L$ transition matrix cannot be estimated from the marginal probabilities at $t - 1$ and $t$ alone.


assuming $p_{i0} = 0$.$^{17}$ Thus the profile of $p_{it} = p_t$ over $t$ is a two-parameter function; hence both $\mu$ and $\lambda$ are identified provided estimates of $p_t$ at two different $t$ are available. This result obviously generalizes to cases in which the hazards are homogeneous w.r.t. $t$ but only within groups defined over $i$.
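As a worked illustration of this identification result, the sketch below recovers constant hazards from the marginal probabilities in the first two periods of the process. It assumes the initial condition $p_0 = 0$, writes $\mu$ for the entry rate, $\lambda$ for the exit rate and $\eta = 1 - \lambda - \mu$, and uses the fact that iterating the first-order flow equation from $p_0 = 0$ gives $p_1 = \mu$ and $p_2 = \mu + \eta p_1$. The function names and the numerical values are hypothetical, and periods 1 and 2 are chosen only because they give a closed-form inversion; other pairs of periods would require a numerical solve.

```python
def marginal(mu, lam, t):
    """Marginal probability after t periods, starting from p_0 = 0.

    Requires mu + lam > 0 so that eta differs from 1.
    """
    eta = 1.0 - lam - mu
    return mu * (1.0 - eta ** t) / (1.0 - eta)


def hazards_from_two_marginals(p1, p2):
    """Invert p_1 = mu and p_2 = mu + eta * p_1 for (mu, lam)."""
    mu = p1
    eta = (p2 - p1) / p1
    lam = 1.0 - eta - mu
    return mu, lam


# Hypothetical true hazards, used only to generate the two marginals.
p1 = marginal(mu=0.20, lam=0.10, t=1)
p2 = marginal(mu=0.20, lam=0.10, t=2)

mu_hat, lam_hat = hazards_from_two_marginals(p1, p2)
print(mu_hat, lam_hat)
```

With sample frequencies in place of the exact marginals, the same inversion would give estimates rather than exact values, and nothing in this sketch accounts for their sampling error.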

Models with this type of homogeneity imposed have been studied extensively in the statistical literature, where the predominant approach considers least squares estimation of (26) using aggregate frequency estimates of the $p_{it} = p_t$, thus identifying the hazards from an estimated slope and intercept [Miller (1952), Madansky (1959), Lee et al. (1970), Lawless and McLeish (1984), Kalbfleisch and Lawless (1985)]. These studies typically ignore the boundedness of the variable $p_t$ as well as the presence of sampling error in the frequency estimates and consequent errors-in-variables difficulties in least squares estimation.$^{18}$
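The least squares approach described above can be sketched as follows. This is an illustration under simplifying assumptions, not a reproduction of any of the cited estimators: the aggregate series is generated without sampling error from hypothetical hazards, and the helper `ols_line` is a plain bivariate regression written out by hand. The intercept of the regression of $p_t$ on $p_{t-1}$ estimates the entry rate and the slope estimates $1 - \lambda - \mu$.

```python
def ols_line(x, y):
    """Return (intercept, slope) of the least squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical constant hazards used to build an error-free aggregate series.
true_mu, true_lam = 0.20, 0.10
series = [0.0]  # p_0 = 0
for _ in range(6):
    series.append(true_mu + (1.0 - true_lam - true_mu) * series[-1])

# Regress p_t on p_{t-1}: intercept = mu, slope = 1 - lam - mu.
intercept, slope = ols_line(series[:-1], series[1:])
mu_hat = intercept
lam_hat = 1.0 - slope - mu_hat
print(mu_hat, lam_hat)
```

In practice the $p_t$ are sample frequencies, so the regressor is measured with error and the fitted values are not constrained to lie in the unit interval, which are the two difficulties noted in the text.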

Example 2: Linear probability model for hazards. Let $\mu_{it} = X_{it}\theta_1$ and $\eta_{it} = 1 - \lambda_{it} - \mu_{it} = X_{it}\theta_2$, where $X_{it}$ is a vector of observables for unit $i$ at time $t$ (perhaps including independent functions of $t$). Then we have

$$p_{it} = X_{it}\theta_1 + (X_{it}\theta_2)\,p_{i,t-1}. \tag{28}$$

Estimation of (28) can be considered using methods analogous to the 2SLS methods proposed for the autoregressive linear model. Construction of a predicted value of $\hat{p}_{i,t-1}$, possibly by explicit calculation of the reduced form for (28), and substitution of the dichotomous $y_{it}$ and $y_{i,t-1}$ for $p_{it}$ and $p_{i,t-1}$ leads to the same identification condition as that for the linear model, namely, that the predicted value $\hat{p}_{i,t-1}$ have sufficient variation independent of $X_{it}$ to meet the rank condition, which will often mean variation over $t$.$^{19}$ The rank condition will involve interactions of $\hat{p}_{i,t-1}$ and $X_{it}$ as well since the parameters $\theta_2$ are identified from such interactions.$^{20}$

This example demonstrates that the proportionality restrictions inherent in the index functions $X_{it}\theta_1$ and $X_{it}\theta_2$ are themselves sufficient for identification of transition rates from RCS data. However, while such restrictions are not

$^{17}$ The value of $p_{i0}$ is not the first observed outcome of the process (that is $p_{i1}$) but is instead the value of the state prior to the beginning of the process. As most outcome variables are defined, $p_{i0}$ will be zero: an individual is unemployed at the beginning of an unemployment spell, not working prior to entering the labor force, unmarried at the beginning of the lifetime, and so on.

$^{18}$ However, some authors consider ML estimation to address the first problem, though still taking the sample estimate of $p_{t-1}$ as exogenous and error-free.

$^{19}$ Note that this implies that the parameters of transition rates cannot be identified for steady state processes.

$^{20}$ As in the linear case, it can also be shown that the predicted value $\hat{p}_{i,t-1}$ will vary with $t$ even if all the $X_{it}$ are time-invariant.


necessary with true panel data, in practice most studies using panel data nevertheless impose such index function restrictions in the specification of their transition rates.

Imposing the index function restriction in a proper model of binary choice requires leaving the linear probability model. The specification and estimation of a probit model illustrates such a proper probability model identifiable with RCS data. Let

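$$y^*_{it} = X_{it}\theta_1 + (X_{it}\theta_2)\,y_{i,t-1} + \varepsilon_{it}, \tag{29}$$

$$y_{it} = \begin{cases} 1 & \text{if } y^*_{it} \geq 0, \\ 0 & \text{otherwise}, \end{cases} \tag{30}$$

where $\varepsilon_{it} \sim N(0, 1)$. The hazard rates are

$$\mu_{it} = F[X_{it}\theta_1], \tag{31}$$

$$\lambda_{it} = 1 - F[X_{it}(\theta_1 + \theta_2)], \tag{32}$$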

where $F$ is the unit normal c.d.f. Consistent instrumental variable and two-stage estimation methods are not available for this model because of the nonlinear errors-in-variables problem that would be created by instrumenting $y_{i,t-1}$. However, the reduced form can be estimated directly. Letting $p_{it} = \mathrm{Prob}(y_{it} = 1)$, the reduced form for $p_{it}$ can be shown to be

$$p_{it} = \mu_{it} + \sum_{r=1}^{t-1} \mu_{i,t-r} \left( \prod_{s=t-r+1}^{t} \eta_{is} \right), \tag{33}$$

where $\eta_{is} = 1 - \lambda_{is} - \mu_{is}$. Consistent estimates of $\theta_1$ and $\theta_2$ can be obtained by maximization of

$$L = \sum_i \sum_t \left[ y_{it} \log(p_{it}) + (1 - y_{it}) \log(1 - p_{it}) \right], \tag{34}$$

using (31) and (32) in (33).$^{21}$

Computing $p_{it}$ by means of (33) is equivalent to integrating out all possible histories for each individual $i$ at time $t$ to derive an expression for the marginal probabilities that are observed. This integration over histories is identical to the formal solution to the initial conditions problem in panel data [Heckman

$^{21}$ Including fixed effects in the model cannot be addressed with predicted values of $X$ as discussed previously because of similar nonlinear errors-in-variables problems. In this case the function in (13) could be estimated jointly with (29)-(30) using FIML.


(1981b, pp. 181-185)]. The problem in that case arises when several periods of a process are observed for an agent but the beginning of the process is not; hence the unobserved history must be integrated out. The situation here is identical except that there is only a single observed time period.
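The computation of the marginal probabilities and of the log likelihood can be sketched in a few lines. This is a minimal illustration, assuming probit hazards of the form in (31)-(32) and an initial state probability of zero before the first period; the function names, the data layout (each observation carries its own backcast covariate history) and the parameter values are hypothetical. Rather than evaluating the sum of products in the reduced form directly, the sketch iterates the flow equation forward one period at a time, which yields the same value.

```python
import math


def norm_cdf(z):
    """Standard normal c.d.f., written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def dot(x, theta):
    return sum(a * b for a, b in zip(x, theta))


def marginal_prob(x_history, theta1, theta2):
    """Marginal probability of state 1 in the last period of x_history.

    x_history lists the covariate vectors for periods 1, ..., t; the
    state probability before period 1 is taken to be zero.
    """
    p = 0.0
    for x in x_history:
        mu = norm_cdf(dot(x, theta1))                      # entry rate
        stay = norm_cdf(dot(x, theta1) + dot(x, theta2))   # 1 - exit rate
        p = mu * (1.0 - p) + stay * p                      # flow equation
    return p


def log_likelihood(sample, theta1, theta2):
    """Sum of Bernoulli log likelihoods over (y, x_history) observations."""
    total = 0.0
    for y, x_history in sample:
        p = marginal_prob(x_history, theta1, theta2)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total


# Hypothetical sample: an intercept-only model, observed states after 1 and 2 periods.
sample = [(1, [[1.0]]), (0, [[1.0], [1.0]])]
print(log_likelihood(sample, theta1=[0.0], theta2=[0.0]))
```

A numerical optimizer would then be applied to `log_likelihood` over the two parameter vectors; the choice of optimizer is left open here.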

It may be noted that the assumption of a scalar covariance matrix for $\varepsilon_{it}$ is less innocuous in this model than in the linear autoregressive model. Whereas the presence of serial correlation in the linear model, such as that generated by unobserved heterogeneity, only affects the variance of the reduced form error term in (18), in the probit model it affects the form of the reduced form expression (33), where the elements of the covariance matrix of $\varepsilon_{it}$ would enter nonlinearly with the regressor variables in $X_{it}$ because the complete distribution of $\varepsilon_{it}$ would have to be integrated out. No simple identification restrictions can be derived guaranteeing separation of the elements of the covariance matrix and the parameters of the transition function in this case.$^{22}$

5. Empirical illustrations

The major issue raised by the discussion is whether the identification restrictions necessary for the estimation of the various dynamic models with RCS data are indeed met in common areas of application. That exercise requires the use of panel data and therefore will be conducted in future work. Instead, this section provides two illustrations of the estimation of dynamic models with RCS data, one a linear fixed effects model and one a Markov model. The first is treated only briefly since it is less novel than the latter.

5.1. Linear fixed effects

To illustrate the IV method for the linear fixed effects model, the life cycle model of labor supply of Browning et al. (1985) is estimated with U.S. RCS data (Browning et al. used U.K. data). In the model, $y_{it}$ is hours of work and $x_{it}$ is the log of the real discounted hourly wage rate. The fixed effects are given a specific interpretation, as including the marginal utility of wealth, and the coefficient on $x_{it}$ is interpreted as an intertemporal substitution effect of this wage. The data set is the U.S. Current Population Survey (CPS) and the sample includes white males 20-59 from 21 annual waves, 1968 to 1988.$^{23}$

$^{22}$ A less severe problem also arises with incomplete panel data when the covariance matrix estimated for the included periods must be extrapolated to unobserved pre-sample periods.

$^{23}$ Only the March files are used. Hours of work are measured as the annual amount over the year prior to the survey and the wage rate is measured as the ratio of earnings in the prior year to hours worked. The wage is discounted with annualized three-month T-bill rates using a 1978 base. To keep the estimation problem manageable and to make the sample size roughly the same as that used by Browning et al., the data are randomly subsampled down to a total of 15,500 over the years.


Table 1 shows IV estimates of eq. (3) with the fixed effects projected onto cohort and individual characteristics for education, number of children, family size, and regional location, and with $x_{it}$ projected onto these same variables plus age.$^{24}$ Year dummies are also included in both hours and wage equations for consistency with Browning et al. Columns (1) and (2) show wage and hours estimates, respectively, with a relatively unrestricted age and cohort specification, and columns (3) and (4) show estimates after a specification search on the form of the age and cohort effects. F-tests on the age, cohort, year effects, and on the seven individual characteristics are also shown.

Two conclusions can be drawn from the results. First, a considerable amount of parsimony is achieved in the specification of age and cohort effects. The unrestricted specification in the first two columns contains 27 parameters for age and cohort effects in the wage equation and 12 parameters for cohort effects in the hours equation. Although this is much more restricted than a full specification of all single-year cohort and age effects and interactions (the sample size is insufficient for such a specification in any case), most of the age and cohort parameters are insignificant. The more restrictive model shown in the table, which has quadratic age and cohort effects in the wage equation and cubic cohort effects in the hours equation, cannot be rejected at the 10 percent level but only contains seven age-cohort parameters in the wage equation and three cohort parameters in the hours-worked equation.

Second, the relative sizes of the F-statistics imply that the presence of the individual characteristics is considerably more important than either age, cohort, or year effects. Thus the IV procedure proposed in section 3 above, which makes full use of the individual micro data and of the cross-sectional variation in individual characteristics within age-cohort cells, significantly improves the fit of the model.$^{25}$
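The logic of IV estimation with repeated cross-sections can be illustrated with a small simulation. The sketch below is not the specification behind Table 1: it uses simulated data, a single regressor, no individual characteristics, and the simplest possible instrument set, a full set of cohort-by-period dummies, in which case the first-stage prediction is just the cohort-period cell mean of the regressor. All names and parameter values are hypothetical. The second stage regresses the outcome on the predicted regressor and cohort dummies, implemented here by demeaning within cohort.

```python
import random

random.seed(0)
BETA = 1.0
N_COHORTS, N_PERIODS, CELL_SIZE = 5, 6, 200

# Each period draws a fresh set of individuals from every cohort.
rows = []  # (cohort, period, x, y)
for c in range(N_COHORTS):
    for t in range(N_PERIODS):
        for _ in range(CELL_SIZE):
            f = 0.5 * c + random.gauss(0.0, 1.0)             # fixed effect
            x = 0.5 * t + 0.8 * f + random.gauss(0.0, 1.0)   # correlated with f
            y = BETA * x + f + random.gauss(0.0, 1.0)
            rows.append((c, t, x, y))


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def slope(xs, ys):
    """Least squares slope of ys on xs (with an intercept)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    return sxy / sxx


# Pooled OLS ignores the fixed effect and is biased upward in this design.
ols = slope([r[2] for r in rows], [r[3] for r in rows])

# First stage: predicted x is the cohort-period cell mean.
cells = {}
for c, t, x, _ in rows:
    cells.setdefault((c, t), []).append(x)
x_hat = {key: mean(vals) for key, vals in cells.items()}

# Second stage: y on predicted x and cohort dummies (within-cohort demeaning).
xs, ys = [], []
for c in range(N_COHORTS):
    members = [r for r in rows if r[0] == c]
    xh = [x_hat[(c, r[1])] for r in members]
    yv = [r[3] for r in members]
    mxh, myv = mean(xh), mean(yv)
    xs += [a - mxh for a in xh]
    ys += [b - myv for b in yv]
iv = slope(xs, ys)

print(round(ols, 3), round(iv, 3))
```

Identification in this sketch comes entirely from the variation of the cell means over periods within a cohort; with individual characteristics added to both stages, as in the text, the first stage would also exploit variation within cells.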

5.2. Markov model

To illustrate the Markov model the CPS data are used to estimate a model of female labor supply. The dependent variable $y_{it}$ is defined to equal 1 if the individual is employed and 0 if not. The model in (29)-(30) is estimated with ML by maximizing the likelihood function in (34). The sample includes white married women 20-59 in the 21 years 1968-1988.

$^{24}$ Some of the individual characteristics are time-varying, inconsistent with an interpretation as fixed effects. They may alternatively be considered as time-varying variables in $X_{it}$ in (12) for which no instruments are needed.

$^{25}$ The wage coefficient in column (4) implies an intertemporal wage elasticity of 0.68, considerably larger than that obtained by Browning et al. or, for example, MaCurdy (1981). It appears to be a result neither of the parsimony of the age and cohort effects nor of the inclusion of the individual characteristics; the difference seems to arise from the differences in the data sets used. However, the year effects are still significant.


Table 1

Labor supply and wage estimates for white males; n = 15,500.$^{a}$

| | Unrestricted age and cohort effects$^{b}$: log discounted hourly wage (1) | Unrestricted age and cohort effects$^{b}$: annual hours worked (2) | Restricted age and cohort effects$^{c}$: log discounted hourly wage (3) | Restricted age and cohort effects$^{c}$: annual hours worked (4) |
|---|---|---|---|---|
| Log discounted hourly wage$^{d}$ | - | 964.6$^{e}$ (139.1) | - | 1037.3$^{e}$ (142.7) |
| Individual characteristics | | | | |
| Education | 0.056$^{e}$ (0.002) | -28.0$^{e}$ (8.1) | 0.056$^{e}$ (0.002) | -32.3$^{e}$ (8.3) |
| No. children < 6 | 0.011 (0.012) | 6.7 (11.2) | 0.010 (0.012) | 4.9 (11.2) |
| No. children 6-17 | 0.002 (0.010) | 12.8 (9.3) | -0.001 (0.010) | 12.9 (9.3) |
| Family size | 0.006 (0.008) | -6.0 (7.9) | 0.008 (0.008) | -6.4 (7.9) |
| Northeast | 0.116$^{e}$ (0.016) | -159.0$^{e}$ (22.2) | 0.116$^{e}$ (0.016) | -167.9$^{e}$ (22.5) |
| North central | 0.062$^{e}$ (0.015) | -38.0$^{e}$ (16.8) | 0.063$^{e}$ (0.015) | -43.4$^{e}$ (17.0) |
| West | 0.074$^{e}$ (0.015) | -119.2$^{e}$ (18.1) | 0.075$^{e}$ (0.015) | -125.6$^{e}$ (18.3) |
| F-statistics | | | | |
| Age effects | 6.4$^{e}$ (8) | - | 10.0$^{e}$ (5) | - |
| Cohort effects | 4.2$^{e}$ (12) | 2.7$^{e}$ (12) | 7.3$^{e}$ (5) | [illegible] |
| Year effects | 1.9$^{e}$ (20) | 5.5$^{e}$ (20) | 1.8$^{e}$ (20) | [illegible] (20) |
| Individual characteristics | 151.1$^{e}$ (7) | 14.3$^{e}$ (7) | 151.8$^{e}$ (7) | 15.1$^{e}$ (7) |
| R-squared | 0.123 | 0.032 | 0.122 | 0.031 |

$^{a}$ Standard errors in parentheses; for F-statistics, number of parameters in parentheses.

$^{b}$ Wage equation: Piecewise-linear age splines at five-year intervals from age 20 to 59; piecewise-linear cohort splines at five-year intervals from 1908 to 1967; interactions of linear cohort variable with all age splines; year dummies. Labor supply equation: Same but without age-related variables.

$^{c}$ Wage equation: Age, age squared, cohort, cohort squared, age x cohort, age x cohort squared, cohort x age squared, year dummies. Labor supply equation: cohort, cohort squared, cohort cubed, year dummies.

$^{d}$ Predicted from wage equation.

$^{e}$ Significant at 10% level.


Table 2 shows the results of three specifications of the hazards of varying degrees of simplicity. Column (1) presents estimates of a model with constant hazards and implies transition rates of $\mu = 0.43$ and $\lambda = 0.35$ [see (31)-(32) for definitions]. These annual transition rates are implausibly high and exceed those typically found in panel data, for example, by Heckman and Willis (1977), who found $\mu = 0.139$ and $\lambda = 0.144$.$^{26}$ However, a model with constant hazards is a gross violation of the data, for it implies a monotonically rising life cycle profile of employment rates [see (27)], contrary to the actual shapes of such profiles (see below).
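The monotonicity implied by constant hazards is easy to verify numerically. The sketch below takes the two transition rates quoted in the text for column (1), assumes an initial employment probability of zero, and iterates the flow equation forward; the ten-period horizon and the variable names are arbitrary choices made for illustration. Because $1 - \lambda - \mu = 0.22$ is positive, the profile rises at every step toward the steady state $\mu/(\mu + \lambda)$.

```python
mu, lam = 0.43, 0.35          # constant hazards quoted in the text
eta = 1.0 - lam - mu          # persistence term, 0.22 here

p = 0.0                       # assumed initial condition
profile = []
for t in range(1, 11):
    p = mu + eta * p          # flow equation with constant hazards
    profile.append(p)

steady_state = mu / (mu + lam)
print([round(v, 4) for v in profile], round(steady_state, 4))
```

A profile of this kind can never reproduce the fall in employment rates later in the life cycle, which is why covariates are added to the hazards in the remaining columns.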

The second column shows an expanded version of this model with two time-invariant variables, education and cohort, and one time-varying variable, the U.S. unemployment rate for prime-age males, which can be backcast over the past periods of the life cycle for each observation in the data.$^{27}$ The parameters are again well-determined, with entry rates positively affected by education and cohort and with exit rates negatively affected by education and positively affected by cohort. Higher unemployment rates lower entry rates, but lower exit rates as well, a result discussed further below.

Column (3) shows the most well-specified model, including quadratic age terms in both hazards (age can be perfectly backcast), variables for children (backcast from current numbers and ages of children in the household), and three initial conditions variables. The initial conditions variables permit the first transition in the process, $\mu_{i1}$ (i.e., entry starting from $p_{i0} = 0$ at age 19 to some positive value at age 20), to differ from the rest of the profile of entry rates by an amount varying by education and cohort. Most of the parameters of the model are fairly precisely determined. Moreover, the results are consistent with findings from panel data: both entry and exit are concave in age and are affected by children of different ages in the expected directions. The unemployment rate decreases the exit rate, consistent with added-worker behavior.

Fig. 1 shows the actual and fitted employment probabilities in the data as well as the estimated transition rates. The model fits the actual profile reasonably well except at the very beginning of the life cycle, where an initial rise and subsequent fall in actual employment rates is not fully captured; the introduction of more age terms would improve this fit. More interesting are the estimated transition rates shown in the figure, which 'explain' the employment profile by the pattern of entry and exit rates shown. Both entry and exit rates rise initially though the former eventually dominates and employment probabilities rise, but entry rates are forced down and exit rates forced up (by rising numbers

$^{26}$ These estimates are averages over the four transition rates between their five years, 1967-1971, shown in their fig. 2, pp. 46-47. The estimates are based on a sample of 1583 white women in the Panel Study of Income Dynamics.

$^{27}$ Given the ages of the women in the data, this requires the use of unemployment rates back to 1928.


Table 2

Markov estimates for employment status for white females; n = 19,892.$^{a}$

| | (1) | (2) | (3) |
|---|---|---|---|
| $\theta_1$ ($\mu$) | | | |
| Intercept | -0.308$^{c}$ (0.070) | -2.649$^{c}$ (0.137) | -3.649$^{c}$ (0.566) |
| Education | | 0.041$^{c}$ (0.011) | 0.043$^{c}$ (0.008) |
| Cohort | | 0.043$^{c}$ (0.002) | 0.013$^{c}$ (0.005) |
| Unemployment rate | | -0.185$^{c}$ (0.026) | -0.012 (0.019) |
| Age | | | 0.085$^{c}$ (0.026) |
| Age squared/100 | | | -0.120$^{c}$ (0.035) |
| No. children < 6 | | | -0.774$^{c}$ (0.052) |
| No. children 6-17 | | | 0.099$^{c}$ (0.022) |
| $D_{20}$$^{b}$ | | | -0.273 (0.504) |
| $D_{20}$ x Education | | | 0.026$^{c}$ (0.007) |
| $D_{20}$ x Cohort | | | 0.035$^{c}$ (0.021) |
| $\theta_2$ ($\lambda$) | | | |
| Intercept | 0.476$^{c}$ (0.149) | 3.986$^{c}$ (0.347) | 5.570$^{c}$ (1.121) |
| Education | | -0.007 (0.027) | -0.025 (0.016) |
| Cohort | | -0.103$^{c}$ (0.007) | -0.023$^{c}$ (0.013) |
| Unemployment rate | | 0.647$^{c}$ (0.073) | 0.134$^{c}$ (0.045) |
| Age | | | -0.113$^{c}$ (0.047) |
| Age squared/100 | | | 0.142$^{c}$ (0.062) |
| No. children < 6 | | | 0.366$^{c}$ (0.072) |
| No. children 6-17 | | | 0.091$^{c}$ (0.050) |
| Log likelihood | -13734.3 | -13263.9 | -12606.5 |

$^{a}$ Asymptotic standard errors in parentheses.

$^{b}$ Equals 1 if age = 20, 0 if not.

$^{c}$ Significant at 10% level.

[Figure: actual employment probability ($p$ by age) and fitted series, plotted against age.]

Fig. 1. Actual and fitted employment probabilities, and fitted transition rates, by age.

of young children), eventually reaching levels that begin pulling down the employment rate. As the children age, entry rates are pulled back up slowly and exit rates are pushed down, but the direct effects of age in the hazards slow this process by exerting an upward effect on exit and a downward effect on entry. As a result, the exit effect continues to dominate and employment rates continue to decline. These life cycle profiles of entry and exit, and the reasons for their patterns, are quite consistent with those found in the literature using panel data even though transitions are not directly observed in the RCS data used here.

It is of some interest in the application at hand to note the difference between the profiles shown in fig. 1, which confound true within-cohort profiles with across-cohort profiles, and those for a single woman or group of women from the same cohort and with the same characteristics. Fig. 2 shows the simulated profile for a woman aged 45 and with the mean values of the other variables in the data. The employment profile has the same general shape as that in fig. 1 but shows much more curvature, implying that the quasi-cross-sectional profiles in fig. 1 were too flat if taken as true cohort profiles. This is again a finding commonly reported in other such comparisons in the literature. The shapes of all three curves in fig. 2 are explained by the same forces discussed for the profiles in fig. 1, although the trends are more dramatic in fig. 2.

Finally, table 3 provides a comparison of the fitted model with a conventional cross-sectional probit model in which employment status is made a function of only contemporaneous characteristics of the observation. To date, such probit

[Figure: simulated employment probability and transition rates, plotted against age.]

Fig. 2. Simulated life cycle profile of employment and transition rates for women with mean values of the characteristics.

models have been the predominant type of model estimated with RCS data. The probit coefficients are shown in the first column and the effects of a unit change in each of the regressors on the employment probability are shown in the next two columns for both the cross-sectional and the Markov model (standard errors are obtained with the delta method). For the Markov model, the effects shown represent those resulting from a unit change in each regressor variable in both the $X_{it}\theta_1$ and $X_{it}\theta_2$ vectors, and thus represent the combined effect of each variable on the employment rate working through the lagged distribution of entry and exit. As the table shows, the probit and Markov estimates are close for the estimates of education and children but less so for the other variables. The age effect in the Markov model is lower than that in the probit model, but this


Table 3

Comparisons with cross-sectional probit.$^{a}$

| | Cross-sectional probit coefficients (1) | $\partial p/\partial X$$^{b}$: Markov model (2) | $\partial p/\partial X$$^{b}$: Cross-sectional probit (3) |
|---|---|---|---|
| Intercept | -3.297$^{d}$ (0.216) | - | - |
| Education | 0.072$^{d}$ (0.004) | 0.030$^{d}$ (0.002) | 0.029$^{d}$ (0.001) |
| Cohort | 0.026$^{d}$ (0.003) | 0.003 (0.002) | 0.010$^{d}$ (0.001) |
| Unemployment rate | -0.007 (0.008) | [illegible] (0.010) | -0.003 (0.003) |
| Age | 0.079$^{d}$ (0.008) | -0.001 (0.002) | 0.005$^{d}$ (0.001) |
| Age squared/100$^{c}$ | -0.087$^{d}$ (0.010) | - | - |
| No. children < 6 | -0.516$^{d}$ (0.015) | -0.172$^{d}$ (0.009) | -0.197$^{d}$ (0.005) |
| No. children 6-17 | -0.101$^{d}$ (0.009) | -0.053$^{d}$ (0.005) | -0.040$^{d}$ (0.004) |
| Log likelihood | -12635.7 | - | - |

$^{a}$ Standard errors in parentheses.

$^{b}$ $p$ = probability of employment. In the Markov model, $\partial p/\partial X$ is the derivative of eq. (33) w.r.t. elements of $X$ common to $\mu$ and $\eta$. In the cross-sectional probit model, $\partial p/\partial X = f(X\beta)\beta$, where $X\beta$ is the probit index. All effects are evaluated at the means of the data.

$^{c}$ In columns (2) and (3), a unit change in age squared is not separately shown; its effect is captured in the prior row.

$^{d}$ Significant at 10% level.

difference arises because a one-year increase in the age of an observation in the Markov model implies a different history of the children variables. In particular, if the same number and ages of children are observed for the older woman, the children must therefore have been born later in her life cycle and, in this case, must have pushed down her employment rate history and therefore her contemporaneous employment rate.$^{28}$ The unemployment rate shows no effect in the

$^{28}$ The age effects for the Markov model could be estimated, alternatively, by holding the children history fixed, although some assumption about new births at the higher age would have to be made. However, this would still not be comparable to the cross-sectional probit coefficients because the contemporaneous numbers and ages of children would not be held constant. Thus it is not possible to obtain completely comparable age effects in the two models. The same obtains for the unemployment rate (see below).


probit model but a positive one in the Markov model, consistent with an added-worker effect for married women. The difference is in part a result of the differing amounts of information used in the two estimation procedures, for the Markov model uses the entire life cycle history of unemployment rates for each observation while the probit model uses only the contemporaneous unemployment rate. However, there is also a lack of strict comparability between the two estimated effects, for the Markov coefficient is obtained by shifting the entire history of unemployment rates up by one unit.$^{29}$

The differences in the two models are not surprising since the Markov model contains a formal lag structure. Also not surprisingly, comparison of the likelihood functions for the two models indicates that the Markov model provides a better fit even after adjustment for the number of parameters using the Akaike criterion. Nevertheless, it does confirm that a dynamic model fit to RCS data is superior in many respects to the cross-section models usually estimated.
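The Akaike comparison can be reproduced approximately from the reported log likelihoods. In the sketch below the log likelihoods are those printed in Tables 2 and 3, while the parameter counts (19 for column (3) of the Markov model, 8 for the cross-sectional probit) are an assumption obtained by counting the coefficients listed in those tables; the criterion is written in its usual form, twice the number of parameters minus twice the log likelihood, with smaller values preferred.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion; smaller values indicate a better fit."""
    return 2.0 * n_params - 2.0 * log_likelihood


# Log likelihoods as reported; parameter counts are assumed from the tables.
aic_markov = aic(log_likelihood=-12606.5, n_params=19)
aic_probit = aic(log_likelihood=-12635.7, n_params=8)

print(aic_markov, aic_probit, aic_markov < aic_probit)
```

Under these counts the Markov model retains the lower criterion value despite its eleven additional parameters.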

6. Summary and conclusions

The results of this analysis suggest that many dynamic models are identified with repeated cross-sectional data under what appear to be relatively mild restrictions. This raises a number of questions and issues for future work. The most important is whether the identification conditions do indeed hold, an issue that can be investigated with panel data. Such an investigation is not entirely straightforward, however, if the panel data suffer from measurement error with well-known deleterious consequences for fixed-effects estimators. Conditional on identification, the relative efficiencies of panel and RCS estimators of dynamic models could be considered as a second issue. Other issues worth examination are (1) the importance of serial correlation and unobserved heterogeneity in the Markov model and (2) means of incorporating the effects of migration, mortality, and birth rates into RCS estimates.

References

Amemiya, T., 1979, The estimation of a simultaneous-equation Tobit model, International Economic Review 20, 169-181.

Angrist, J., 1991, Grouped-data estimation and testing in simple labor supply models, Journal of Econometrics 47, 243-266.

Blundell, R., M. Browning, and C. Meghir, 1989, A microeconometric model of intertemporal substitution and consumer demand, Discussion paper no. 89-11 (University College London, London).

Browning, M., A. Deaton, and M. Irish, 1985, A profitable approach to labor supply and commodity demands over the life cycle, Econometrica 53, 503-544.

$^{29}$ The Markov model permits the estimation of the effect of any alteration in the time pattern of historical unemployment rates.


Chamberlain, G., 1980, Analysis of covariance with qualitative data, Review of Economic Studies 47, 225-238.

Deaton, A., 1985, Panel data from time series of cross sections, Journal of Econometrics 30, 109-126.

Heckman, J., 1981a, Statistical models for discrete panel data, in: C. Manski and D. McFadden, eds., Structural analysis of discrete data with econometric applications (MIT, Cambridge, MA) 114-178.

Heckman, J., 1981b, The incidental parameters problem and the problem of initial conditions in estimating a discrete time-discrete data stochastic process, in: C. Manski and D. McFadden, eds., Structural analysis of discrete data with econometric applications (MIT, Cambridge, MA) 179-195.

Heckman, J. and R. Robb, 1985, Alternative methods for evaluating the impact of interventions, in: J. Heckman and B. Singer, eds., Longitudinal analysis of labor market data (Cambridge University, Cambridge) 156-245.

Heckman, J. and R. Willis, 1977, A beta-logistic model for the analysis of sequential labor force participation by married women, Journal of Political Economy 85, 27-58.

Kalbfleisch, J.D. and J.F. Lawless, 1985, The analysis of panel data under a Markov assumption, Journal of the American Statistical Association 80, 863-871.

Lawless, J.F. and D.L. McLeish, 1984, The information in aggregate data from Markov chains, Biometrika 71, 419-430.

Lee, T.C., G.C. Judge, and A. Zellner, 1970, Estimating the parameters of the Markov probability model from aggregate time series data (North-Holland, Amsterdam).

MaCurdy, T., 1981, An empirical model of labor supply in a life-cycle setting, Journal of Political Economy 89, 1059-1085.

Madansky, A., 1959, Least squares estimation in finite Markov processes, Psychometrika 24, 137-144.

Miller, G., 1952, Finite Markov processes in psychology, Psychometrika 17, 149-167.

Nelson, F.D. and L. Olson, 1978, Specification and estimation of a simultaneous-equation model with limited dependent variables, International Economic Review 19, 695-709.

Newey, W., 1987, Efficient estimation of limited dependent variable models with endogenous explanatory variables, Journal of Econometrics 36, 231-250.

Verbeek, M. and T. Nijman, 1992, Can cohort data be treated as genuine panel data?, Empirical Economics 17, 9-23.