

JOURNAL OF APPLIED ECONOMETRICS
J. Appl. Econ. 21: 1265–1293 (2006)
Published online in Wiley InterScience (www.interscience.wiley.com) DOI: 10.1002/jae.910

INTERGENERATIONAL MOBILITY AND SAMPLE SELECTION IN SHORT PANELS

MARCO FRANCESCONIᵃ AND CHETI NICOLETTIᵇ*

ᵃ Department of Economics, University of Essex, Colchester, UK
ᵇ Institute for Social and Economic Research, University of Essex, Colchester, UK

SUMMARY

Using data from the first 11 waves of the BHPS, this paper measures the extent of the selection bias induced by standard coresidence conditions (a bias that is expected to be severe in short panels) on measures of intergenerational mobility in occupational prestige. We try to limit the impact of other selection biases, such as those induced by labour market restrictions that are typically imposed in intergenerational mobility studies, by using different measures of socio-economic status that account for missing labour market information. We stress four main results. First, there is evidence of an underestimation of the true intergenerational elasticity, the extent of which ranges between 12% and 39%. Second, the proposed methods used to correct for the selection bias seem to be unable to attenuate it, except for the propensity score weighting procedure, which performs well in most circumstances. This result is confirmed both under the assumption of missing-at-random data and under the assumption of not-missing-at-random data. Third, the two previous sets of results (direction and extent of the bias, and differential abilities to correct for it) are also robust when we account for measurement error. Fourth, restricting the sample to a period shorter than the 11 waves under analysis leads to a severe sample selection bias. In the cases when the analysis is limited to eight waves, this bias ranges from about 40% to 65%. Copyright © 2006 John Wiley & Sons, Ltd.

Received 27 September 2004; Revised 20 September 2005

1. INTRODUCTION

Opinion among economists about the extent of intergenerational mobility has markedly changed in the last twenty years. Twenty years ago, the general view, based primarily on data from the United States, was that there was little persistence in economic status across generations (Becker and Tomes, 1986). A more recent wave of studies in the United States and several other industrialized countries has questioned that view, and found evidence suggestive of far less mobile societies than was earlier believed.¹ Two important problems, which marred early studies, have been underlined. First, the samples used in most analyses were non-representative. In particular, they tended to be of small size and refer to highly homogeneous groups of the population of children and parents (e.g., individuals were from a specific region or city or they were twins). Second, long-run permanent economic status was poorly measured. Most of the early studies used single-year or other short-run measures of economic status, generally proxied by earnings or income.

* Correspondence to: Cheti Nicoletti, Institute for Social and Economic Research, Wivenhoe Park, Colchester CO4 3SQ, UK. E-mail: [email protected]
Contract/grant sponsor: Economic and Social Research Council; Contract/grant number: RES-000-23-0185.
¹ See Solon (1999) for a comprehensive review of this more recent literature.

To limit such problems, the more recent literature uses longitudinal household data from national probability samples. These data tend to avoid sample homogeneity (but may still lead to relatively small estimating samples). Moreover, if they have a long enough time series component, they allow researchers to compute better measures of long-run status, which are typically obtained after averaging data over several years. For example, using five-year averages, Solon (1992) and Zimmerman (1992) estimate intergenerational correlations that are about twice as large as those found in the earlier research surveyed by Becker and Tomes (1986). Using even more years of data, the estimates reported in Mazumder (2005) reveal intergenerational correlations that are about 50% greater than those reported in Solon’s and Zimmerman’s studies. Recent research has also considerably improved our understanding of the mechanisms that link one generation to the next. Several studies have modelled and estimated the effect on intergenerational mobility of family background (Shea, 2000), neighbourhood influences (Page and Solon, 2003) and assortative mating (Chadwick and Solon, 2002).
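The gain from averaging several years of data can be illustrated with a small simulation. The sketch below is not taken from the studies cited above. It assumes a classical errors-in-variables setup in which the father's observed status in each year equals his permanent status plus independent transitory noise; the function names, variances and sample size are illustrative choices. Under these assumptions the OLS slope converges to beta * var_perm / (var_perm + var_trans / T), so it moves towards the true beta as the number of averaged years T grows.

```python
import numpy as np


def ols_slope(x, y):
    """OLS slope of y on x in a regression with an intercept."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


def simulate_averaging(beta=0.4, sigma_perm=1.0, sigma_trans=1.0,
                       years=(1, 5, 15), n=100_000, seed=0):
    """Estimated slope when father's status is a T-year average of noisy reports.

    All parameter values are assumptions made for illustration only.
    """
    rng = np.random.default_rng(seed)
    father_perm = rng.normal(0.0, sigma_perm, n)          # permanent status
    son = beta * father_perm + rng.normal(0.0, 1.0, n)    # son's status
    slopes = {}
    for t in years:
        # Average of t yearly transitory shocks for each father.
        noise = rng.normal(0.0, sigma_trans, (n, t)).mean(axis=1)
        slopes[t] = ols_slope(father_perm + noise, son)
    return slopes


if __name__ == "__main__":
    for t, b in simulate_averaging().items():
        print(f"T = {t:2d} years: estimated slope = {b:.3f}")
```

With equal permanent and transitory variances, as assumed here, a single-year measure recovers only about half of the true slope, and the shortfall shrinks as more years are averaged.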

However, researchers have virtually ignored one important issue that may plague all intergenerational correlation estimates, namely the issue of sample selection.² The extent of intergenerational mobility is usually measured by estimating a relationship between a measure of son’s or daughter’s economic status (e.g., earnings or income) and the same measure of economic status for his or her parent(s). Common practice has been to exclude all records of data where parents or children report no earnings or income (because, for example, they were unemployed at the time of the survey). Two exceptions are the studies by Couch and Lillard (1998) and Minicozzi (2003). Both conclude that there exists an important role for assumptions on labour market selection in identifying intergenerational income mobility, but their evidence is mixed. Couch and Lillard assign one dollar of income to individuals who have a valid report of no earnings, and find that more selected samples lead to higher correlations between sons’ and fathers’ incomes (i.e., less mobility). Minicozzi uses a different method and estimates different Manski-type bounds around children’s income. Contrary to Couch and Lillard, she finds that dropping both unemployed and part-time employed sons leads to a higher degree of mobility than if part-time employed sons had been included.

In intergenerational mobility analyses, however, there may also be selection problems that are induced by restrictions not driven by unemployment, part-time employment or lack of labour market information. In the panel data usually used in such analyses, parents and children must be found living in the same household for some time during the panel years and followed over time. Information on children’s or parents’ status may be censored, for example, by non-ignorable attrition or by the fact that the time series component of longitudinal datasets is not long enough. The aim of this paper is precisely to assess how severe the problem of short panels can be for the estimation of intergenerational mobility between sons’ and fathers’ socio-economic status. In short panels, in fact, the choice of the child generation is typically constrained by the trade-off between younger and older children. The former group is likely to be a random sample of children who co-reside with their parents, but their observed socio-economic status is almost certainly a noisy measure of long-run status (the condition underlying the choice of this group is what we call the adulthood condition). The latter group will have better measures of status, but its inclusion may bias the sample towards children who co-reside with their parents at late ages (this imposes what we call the co-residence condition).

² There are different types of selection that may occur here. In Section 2 we shall clarify which forms of selection we are interested in and address in this study.

To address the problem induced by these two conditions (especially the latter one), we introduce a new methodology which we apply to a short panel using the first 11 waves of the British Household Panel Survey (BHPS), covering the period 1991–2001.³ All BHPS respondents aged 16 or more are asked to report the occupation of their parents when they were aged 14. This information is based on retrospective questioning (and therefore may suffer from recall error), but it relies neither on the adulthood condition nor on the co-residence condition. Using a continuous index of occupational prestige, which relates strongly to labour income (see Ermisch et al., forthcoming), we then estimate intergenerational elasticities in occupational prestige that are assumed to have no selection bias. We take these estimates to be our bias-free benchmarks throughout the study. By linking children to parents over the 11 years of the panel and imposing a standard co-residence condition, we obtain a new selected subsample of father–son pairs. For this subsample, we again estimate intergenerational elasticities. Comparing these elasticities to our benchmarks provides us with a direct measure of the extent of the co-residence selection bias we are interested in. We then evaluate two approaches that correct for this potential bias. The first belongs to the general class of Heckman-type sample selection corrections. The second is within the class of models based on propensity score estimation, which we estimate using procedures based on inverse propensity score weighting and on dummies for different levels of the propensity score. Finally we conduct some sensitivity analyses to assess whether the elasticities estimated on the selected sample are robust to varying the length of the panel.
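The logic of comparing a benchmark estimate with a selected-sample estimate, and then reweighting by the inverse propensity score, can be sketched in a few lines. This is a minimal simulated illustration, not the authors' specification: the selection rule (lower-status sons are more likely to be observed co-residing), the logistic propensity model with the son's status as its only predictor, and every function name are assumptions introduced here. The weighting works in this sketch because selection is random conditional on a variable that is observed for the whole sample.

```python
import numpy as np


def wls_slope(x, y, w):
    """Weighted least-squares slope of y on x (regression with intercept)."""
    xm, ym = np.average(x, weights=w), np.average(y, weights=w)
    return np.sum(w * (x - xm) * (y - ym)) / np.sum(w * (x - xm) ** 2)


def fit_logit(X, s, iters=25):
    """Logistic regression by Newton-Raphson; X must contain a constant column."""
    b = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ b))
        grad = X.T @ (s - p)
        hess = (X * (p * (1.0 - p))[:, None]).T @ X
        b += np.linalg.solve(hess, grad)
    return b


def simulate_ipw(beta=0.4, n=200_000, seed=0):
    """Benchmark, selected-sample and inverse-propensity-weighted slopes."""
    rng = np.random.default_rng(seed)
    yf = rng.normal(size=n)                                   # father's status
    ys = beta * yf + rng.normal(0.0, np.sqrt(1 - beta**2), n)  # son's status
    # Assumed selection rule: P(observed as a father-son pair) falls with ys.
    selected = rng.random(n) < 1.0 / (1.0 + np.exp(ys))
    # Propensity score estimated on the full sample from a variable seen for all.
    X = np.column_stack([np.ones(n), ys])
    b = fit_logit(X, selected.astype(float))
    p_hat = 1.0 / (1.0 + np.exp(-X @ b))
    ones = np.ones(n)
    return {
        "benchmark": wls_slope(yf, ys, ones),
        "selected": wls_slope(yf[selected], ys[selected], ones[selected]),
        "ipw": wls_slope(yf[selected], ys[selected], 1.0 / p_hat[selected]),
    }


if __name__ == "__main__":
    for name, value in simulate_ipw().items():
        print(f"{name:>9}: {value:.3f}")
```

In this simulated setting the selected-sample slope falls below the benchmark, while the weighted slope is close to it. Whether the same holds in real data depends on how well the propensity model captures the actual selection process.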

The rest of the paper is organized as follows. Section 2 describes the issue of selection we are concerned with in this study and casts our contribution within the relevant literature. Section 3 presents the data source, the estimating samples, our measures of socio-economic status, and the other variables used in estimation. Section 4 discusses our methodology to gauge the extent of selection bias in our sample and to correct for it (and Appendix A provides a detailed statistical description of the assumptions needed to estimate our correction models). Section 5 reports our results and shows a number of sensitivity checks. Section 6 concludes.

2. THE ISSUE

In analysing the extent of intergenerational mobility, researchers have used several different measures of long-run socio-economic status (e.g., educational attainment, labour income, hourly wages and occupational indices). Each measure has advantages and disadvantages.⁴ Because of restrictions imposed by our data, we will use an index of occupational prestige computed according to the technique proposed by Goldthorpe and Hope (1974), which is widely used especially in the sociological literature.⁵ There are also several popular methods of examining intergenerational correlations in socio-economic status (e.g., linear regressions, quantile analysis, and transition matrices). In this paper, we perform our analysis using a log-linear regression model which assumes that the log of a son’s permanent socio-economic status in family $i$, denoted by $y_i^s$, can be expressed as a linear function of his father’s log permanent socio-economic status, $y_i^f$, according to

$$y_i^s = \alpha + \beta y_i^f + u_i \tag{1}$$

where $\beta$ is our parameter of interest that denotes the intergenerational elasticity of son’s status with respect to father’s status; $\alpha$ is the intercept term that represents the change in status common to the son’s generation; and $u_i$ is a random disturbance. Assuming (for simplicity and momentarily) that the underlying variances in father’s and son’s status are equal (so that $\beta$ coincides with the correlation between father’s and son’s status), a value of $\beta = 1$ indicates a situation of complete immobility, whereby (apart from the influence of $u_i$) the sons’ position in their status distribution is fully determined by their fathers’ position. A value of $\beta = 0$ instead indicates a situation of complete mobility or regression to the mean, whereby the child’s position is completely independent of his father’s; and with intermediate values of $\beta$ between 0 and 1, the status distribution still regresses to the mean, but at a rate that decreases the higher is $\beta$.⁶

³ In this paper we specifically focus on (and model) this selection problem only, and abstract from the selection issues related to labour market conditions, attrition or non-response. However, we will try to account for the issues related to non-response and/or out-of-labour-market conditions in ways similar to those used by Couch and Lillard (1998). An empirical analysis of intergenerational mobility where all these different types of selection are jointly modelled and fully accounted for bears investigation in future work.
⁴ For a review, see the discussions in Atkinson et al. (1983), Bowles and Gintis (2002) and Erikson and Goldthorpe (2002).
⁵ Goldthorpe and Hope (1974) suggest that the scale which results from their occupational prestige grading exercise should not be viewed as a grading of social status stricto sensu, i.e., as tapping some underlying structure of social relations of ‘deference, acceptance and derogation’ (p. 10). It should instead be viewed as ‘a judgement which is indicative of what might be called the “general goodness” or . . . the “general desirability” of occupations’ (pp. 11–12). In intergenerational mobility studies, this index may be characterized by misclassification error when a new occupation is observed or an old occupation disappears. Using the same measure of status, Ermisch et al. (forthcoming) document a stark case of mean displacement, with the father’s distribution being skewed to the right and the son’s being bimodal. By diminishing the signal proportion of the sample variation in measured scores, this is likely to produce intergenerational correlations that are downward biased. But this is possibly true in both our benchmark sample and our selected sample, so that our inference on bias magnitude and correction may not be directly affected. In addition, Ermisch et al. (forthcoming) present a measurement model that shows the conditions under which the estimates based on the Hope–Goldthorpe index are consistent estimates of parameters similar to those in the economic literature.

Equation (1) can be derived from a model with utility-maximizing parents as in Becker and Tomes (1979, 1986), which we do not present for convenience (see also Solon, 1999, for a succinct exposition). Despite its simplicity, this model is rich enough to illustrate several aspects of the intergenerational transmission of economic status, and this may help explain its popularity in the economics literature. One important implication of the model is that it shows that intergenerational transmission (as captured by $\beta$) may occur through a multitude of processes, e.g., the importance that parents place on children’s future earnings, the return to human capital investment, the strength of the intergenerational transmission of endowments (which are embedded in $u$), and the relative magnitudes of the variances in the endowments’ sub-components (such as market luck and endowment luck). The difficulty of empirically distinguishing these various processes is one of the major messages of Goldberger’s (1989) work, which proposes an alternative ‘mechanical’ interpretation of equation (1) that does not rest on utility maximization and does not give a causal meaning to $\beta$. Irrespective of the interpretation, however, both economic and mechanical models of intergenerational transmission operate under the notion of permanent or lifetime incomes, $y_i^s$ and $y_i^f$. Therefore, equation (1) imposes the strong restriction of a constant $\beta$ coefficient.⁷

⁶ It is possible, of course, to have a negative $\beta$ (i.e., reversal). In these circumstances, fathers with status above (or below) the mean will have sons with below (or above) average status levels.
⁷ Repeated observations on $y_i^s$ and $y_i^f$ would allow us to relax this restriction to some extent. But, as Section 3 clarifies, the data in our key baseline sample contain only one observation on $y_i^f$, making it impossible to estimate random-coefficient analogues of (1).

Another important aspect to bear in mind is that, although there are many reasons to expect a positive intergenerational correlation in socio-economic status, the correlation need not be large. As previously mentioned, $\beta$ is linked to numerous (deep) parameters, and theory does not say how large or small we should expect them to be. Getting a clearer idea of the size of intergenerational mobility in a society is an empirical matter, and this crucially hinges on how well permanent incomes are measured, which is precisely what we focus on in this study.
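The interpretation of equation (1) can be checked numerically. The following is a minimal sketch on simulated data, assuming normally distributed status with equal variances for fathers and sons; the function name, the intercept value and the sample size are illustrative choices rather than anything taken from the paper. With equal variances, the OLS slope and the father-son correlation should both be close to the chosen value of beta.

```python
import numpy as np


def simulate_elasticity(beta, n=100_000, seed=0):
    """Simulate equation (1) with Var(yf) = Var(ys) = 1; return (OLS slope, correlation)."""
    rng = np.random.default_rng(seed)
    yf = rng.normal(0.0, 1.0, n)
    # Disturbance variance 1 - beta^2 keeps the son's variance equal to the father's.
    u = rng.normal(0.0, np.sqrt(1.0 - beta**2), n)
    ys = 0.1 + beta * yf + u          # 0.1 is an arbitrary intercept (alpha)
    slope, _intercept = np.polyfit(yf, ys, 1)
    corr = np.corrcoef(yf, ys)[0, 1]
    return slope, corr


if __name__ == "__main__":
    for beta in (0.0, 0.4, 1.0):
        slope, corr = simulate_elasticity(beta)
        print(f"beta = {beta:.1f}: slope = {slope:.3f}, correlation = {corr:.3f}")
```

Setting beta to 1 reproduces complete immobility (the disturbance vanishes under the equal-variance assumption), and setting it to 0 reproduces complete mobility.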

If $y_i^s$ and $y_i^f$ are observed for all sons and fathers in the sample, then ordinary least squares (OLS) estimation will produce a consistent estimate of $\beta$. If, however, either $y_i^s$ or $y_i^f$ is observed only for some father–son pairs because of selection, the OLS estimate of $\beta$ obtained from these censored data is inconsistent. There are a number of reasons why information on $y^f$–$y^s$ pairs may be censored. First, observations where a parent or a child has no required status information are usually deliberately dropped from the analysis. For example, by excluding fathers and sons that were not in full-time jobs (i.e., working on average 30 hours per week at least 30 weeks per year), Zimmerman (1992) implicitly assumes exogenous selection into full-time employment. Solon (1992), Dearden et al. (1997), Couch and Dunn (1997) and Chadwick and Solon (2002), among others, exclude cases in which income observations are non-positive, and thus implicitly assume exogenous selection into employment. These assumptions are not consistent with standard economic results according to which selection into the labour force or into full-time employment is likely to be correlated with potential earnings (Heckman, 1979; Vella, 1998).⁸
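The source of the inconsistency can be made explicit with a standard conditional-expectation argument. The selection indicator $S_i$ (equal to 1 when the father–son pair is observed) is notation introduced here for illustration and is not used in the text above. Taking expectations of equation (1) over the observed pairs gives

$$E\left[y_i^s \mid y_i^f, S_i = 1\right] = \alpha + \beta y_i^f + E\left[u_i \mid y_i^f, S_i = 1\right].$$

If the last term does not vary with $y_i^f$, it only shifts the intercept and OLS on the observed pairs still recovers $\beta$. If instead selection is related to $u_i$ in a way that differs across values of $y_i^f$, the last term acts like an omitted regressor correlated with $y_i^f$, and the OLS slope no longer converges to $\beta$.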

Second, information on $y^f$–$y^s$ pairs may be censored because information on child’s or father’s status is missing (item non-response) or because one of the two individuals is not in the sample (unit non-response). By dropping father–child observations in which at least one has missing status information or has attrited out of the survey under analysis, most studies implicitly assume that non-response is random or that attrition is ignorable.⁹ Again, these assumptions run counter to the findings that non-response is likely to be systematically correlated with observable and unobservable characteristics, or that estimates of intergenerational relationships are likely to be affected by differential attrition (Fitzgerald et al., 1998).

Third, many longitudinal studies typically restrict their analysis to children from specific birth cohorts, say, born between years $b_1$ and $b_2$, $b_1 < b_2$ (e.g., Solon, 1992; Couch and Dunn, 1997; Bjorklund and Jantti, 1997; Chadwick and Solon, 2002; Minicozzi, 2003). The restriction on $b_2$ (which we label adulthood condition) is motivated by the need to assure that children’s socio-economic status is observed as far out as possible on their life cycle (when they are ‘adult’) so that their observed status can be taken as a reliable measure of long-run permanent status.¹⁰ This need has driven researchers to choose a relatively low $b_2$. The restriction on $b_1$ (which we label coresidence condition) is instead motivated by the need to avoid overrepresenting children who left home at later ages. If there are unobserved factors affecting children’s later socio-economic status (through $u_i$ in (1)) that also influence children’s chances of living (or ‘co-residing’) with their parents, then the OLS estimate of $\beta$ will suffer from sample selection bias. So the need to limit this bias has led researchers to choose a relatively high $b_1$. This double choice produces the selection problem inherent in all short panels. Our paper primarily focuses on the co-residence condition.

⁸ This is the type of selection examined by Couch and Lillard (1998) and Minicozzi (2003).
⁹ Selection based on the availability of other relevant variables for children and parents (e.g., age, hours of work, education) will induce similar problems. See Couch and Lillard (1998) for a review of studies that used sample selection criteria on such variables.
¹⁰ Clearly the selection generated by the adulthood condition and the selection induced by the child’s labour market conditions described above are closely related. Our empirical analysis will in part address this issue.


In longitudinal birth cohort datasets that follow individuals born in one specific week over time (such as the National Child Development Study used by Dearden et al., 1997), the problem of choosing $b_1$ and $b_2$ does not exist since $b_1$ and $b_2$ are equal and coincide with the year of the start of the survey, $b_0$. The drawbacks of such data sources, however, are that intergenerational analyses can be done only after a relatively long period of time since the surveys started (which may lead to serious attrition bias), and generalizations to other (younger or older) cohorts are not straightforward. In nationally representative family-based longitudinal data (such as the Panel Study of Income Dynamics used by Solon, 1992, or the German Socioeconomic Panel used by Couch and Dunn, 1997), the problem exists and may be serious, especially when $b_1 < b_2 \le b_0$. For example, in Solon (1992) $b_1 = 1951$, so that children are aged at most 17 in 1968 (i.e., $b_0$, the first year of the survey analysed by Solon) to address the coresidence condition, and $b_2 = 1959$, so that children are aged at least 25 in 1984 (the last year of observation) to address the adulthood condition. Apart from attrition issues, this problem disappears if the panel data have a long enough time series component, so that $b_2 \ge b_1 \ge b_0$. But for analyses based on short panels, such as that used in this paper, the problem is potentially severe. Our two main objectives therefore are (a) to quantify the extent to which standard estimates of $\beta$ obtained from short panels are affected by this coresidence bias, and (b) to evaluate alternative econometric techniques to alleviate its effects. In Appendix A we shall return to the different types of selection that may bias our estimates, and formally describe the assumptions imposed by the various methods used to correct for selection bias. Before explaining these methods, in the next section we describe the data used in our empirical application.
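The arithmetic behind the two cohort bounds can be written as a small helper. The function and its argument names are hypothetical; it only restates the rule implied by the Solon (1992) example, namely that the oldest admissible cohort is fixed by a maximum age in the first survey year and the youngest by a minimum age in the last year of observation.

```python
def cohort_window(first_year, last_year, max_age_first_year, min_age_last_year):
    """Return (b1, b2): the earliest and latest admissible birth years.

    b1 follows from the coresidence condition (aged at most
    max_age_first_year in the first survey year); b2 follows from the
    adulthood condition (aged at least min_age_last_year in the last year).
    """
    b1 = first_year - max_age_first_year
    b2 = last_year - min_age_last_year
    return b1, b2


if __name__ == "__main__":
    # Values quoted in the text for Solon (1992): survey starts in 1968,
    # last observation in 1984, age limits of 17 and 25.
    print(cohort_window(1968, 1984, 17, 25))   # (1951, 1959)
    # Purely hypothetical: the same age limits applied to an 11-year
    # panel running from 1991 to 2001.
    print(cohort_window(1991, 2001, 17, 25))   # (1974, 1976)
```

Applying the same age limits to an 11-year panel leaves only three birth cohorts, which illustrates why the two conditions are hard to satisfy jointly in short panels. The second call is an illustration only and is not the sample definition used in this paper.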

3. DATA

The data we use are from the first 11 waves of the British Household Panel Survey (BHPS) collected over the period 1991–2001. Since Autumn 1991 the BHPS has annually interviewed a representative sample of about 5500 households covering more than 10 000 individuals. All adults and children in the first wave are designated as original sample members. Ongoing representativeness of the non-immigrant population has been maintained by using a ‘following rule’ typical of household panel surveys: at the second and subsequent waves, all original sample members are followed (even if they moved house or if their households split up). Personal interviews are collected, at approximately one-year intervals, for all adult members of all households containing either an original sample member, or an individual born to an original sample member.¹¹ The sample therefore remains broadly representative of the population of Britain as it changes over time.¹² From the BHPS, we select three different samples and employ various measures of the occupational status variable for sons and fathers, with the double aim of gauging the extent of selection bias of interest here and of attenuating the measurement error problem inherent in all intergenerational studies. We now turn to describe samples and variables.

¹¹ Individuals are defined as ‘adult’ (and are therefore interviewed) from their 16th birthday onwards.
¹² Of the individuals interviewed in 1991, 88% were re-interviewed in wave 2 (1992). The wave-on-wave response rates from the third wave onwards have been consistently above 95%. See Taylor (2002) for a full description of the dataset. Detailed information on the BHPS can also be obtained at <http://www.iser.essex.ac.uk/bhps/doc>. The households from the European Community Household Panel subsample (followed since the seventh wave in 1997), those from the Scotland and Wales booster subsamples (added to the BHPS in the ninth wave) and those from the Northern Ireland booster subsample (which started in wave 11) are excluded from our analysis.


3.1. Samples

Our main analysis is restricted to 2691 men (or sons) born between 1966 and 1985, who have at least one valid interview over the panel period under study. This represents our Full Sample. The BHPS asks all adult respondents aged 16 or more to provide information about their parents’ occupations when they (the respondents) were aged 14, and releases data on an index of occupational prestige introduced by Goldthorpe and Hope (1974). This index ranges from 5 to 95, with greater values indicating higher occupational prestige, and it is highly correlated with earnings.¹³ Between 1991 and 2001, the BHPS data indicate a correlation between gross monthly earnings and the Hope–Goldthorpe (HG) index of 0.70 for men and 0.75 for women. Because the position of individuals in the occupational hierarchy is relatively stable over the life cycle, the HG scale is also likely to be an adequate measure of people’s permanent socio-economic status (Nickell, 1982). Ermisch et al. (forthcoming) show that using occupational prestige scores of fathers and offspring as a proxy of permanent income produces similar results to those using average earnings data for Germany, and such results are consistent with those found for Britain. Of course, different correlations between occupational score measures and average earnings may arise in countries other than Germany and Britain. Extending our work to other countries is interesting and should bear further analysis in the future.

All the individuals in the Full Sample who could be successfully matched to their father are part of our second sample, which we refer to as the Restricted Sample. There are 1114 such father–son pairs. Unlike the Full Sample, this imposes stringent adulthood and coresidence conditions.¹⁴ Individuals born in 1966 were aged 25 in the first year of the panel (1991) and 35 in the last (2001): they could have lived with their parents at any age between those two years. With a median home-leaving age of about 23–24 (Ermisch and Di Salvo, 1997; Ermisch and Francesconi, 2000), co-residence at such ages means that the Restricted Sample overrepresents sons who left home at late ages. At the other extreme, individuals born in 1985 were aged 16 in 2001 (the last year of analysis): although they are likely to be a random sample of young people living with their parents, their HG index is arguably a noisy measure of long-run status. The comparison of the intergenerational elasticities obtained from the Full Sample and the Restricted Sample will provide us with a measure of the extent of the selection in short panels, under the maintained assumption that Full Sample estimates do not suffer from selection bias.

As discussed in the Introduction, one of the issues emphasized by recent empirical studies is the lack of reliable measures of fathers’ long-run permanent status. In this respect our dataset is no exception (see the next subsection). To gauge the impact of the related bias, we thus analyse a third sample (of fathers). In the second (1992) and eleventh (2001) waves of the BHPS, adult respondents were asked to provide information on all their children regardless of where they lived. This, which we call Supplemental Sample, is composed of 1434 men whose sons were born between 1966 and 1985 (the same year-of-birth selection used in the Full Sample). This should provide us with a random sample of fathers with sons born in those years.15 As measurement error corrections would typically require information on the variance of fathers’ permanent status (Zimmerman, 1992), the Supplemental Sample allows us to estimate any moment of distribution of fathers’ HG scores, both unconditional and conditional on son’s age.

13 Phelps Brown (1977) reports a strong log-linear relationship between median gross weekly earnings and the HG score, with a rise of 1 unit in the index being associated with an increase of 1.031% in earnings. Nickell (1982) finds a correlation between the HG score and the average hourly earnings of 0.85.

14 Although the date-of-birth restrictions imposed on the Full Sample are the same as those imposed here for comparability purposes, they could be relaxed and allow us to concentrate only on prime-age men in employment.

15 As opposed to the Full Sample, however, the Supplemental Sample does not contain information on sons’ HG scores, except for those cases in which (as in the Restricted Sample) father and son were observed to live in the same household at least once over the sample period.

3.2. Socio-economic Status Variables

We use two alternative measures of sons’ socio-economic status. Assuming exogenous selection into the labour market, the first measure (labelled $HG^s_1$) is given by the average HG score over all waves after excluding the cases with missing status information either because the son does not work or because his information is genuinely missing.16 The second measure ($HG^s_2$) is given by the average HG score over all waves after replacing the cases in which the son is not in paid employment (except for those who are in full-time education) with the minimum HG score observed in the whole sample.17 Table I shows that the mean values of $HG^s_1$ and $HG^s_2$ are 43 and 41 respectively for sons in the Full Sample, and 40 and 39 for sons in the Restricted Sample. Regardless of the sample, therefore, $HG^s_2$ is smaller than $HG^s_1$ (as we replaced the cases with missing prestige information with the minimum HG observed), but their differences are not statistically significant. Imputed values, in fact, amount only to 9% in the case of the Full Sample and 5% in the case of the Restricted Sample.
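The construction of the two sons' measures can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the paper: the per-wave records, the status labels (`"employed"`, `"not_employed"`, `"education"`, `"missing"`) and the function names are hypothetical, and the sketch assumes that waves with genuinely missing information are dropped from both averages.

```python
from statistics import mean

def hg_s1(waves):
    """First measure: average HG score over waves with an observed score."""
    observed = [score for status, score in waves if status == "employed"]
    return mean(observed) if observed else None

def hg_s2(waves, sample_min):
    """Second measure: non-employment waves are set to the sample minimum.

    Waves in full-time education or with genuinely missing information are
    skipped (an assumption of this sketch).
    """
    values = []
    for status, score in waves:
        if status == "employed":
            values.append(score)
        elif status == "not_employed":
            values.append(sample_min)
    return mean(values) if values else None

# Hypothetical work history: (labour market status, HG score) for each wave.
waves = [("employed", 40.0), ("not_employed", None),
         ("education", None), ("employed", 50.0)]
print(hg_s1(waves))                   # average of the two observed scores
print(hg_s2(waves, sample_min=18.0))  # the non-employment wave enters as 18.0
```

By construction the second measure can never exceed the first for a given son, which matches the ordering of the sample means discussed above.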

As mentioned above, one of the major difficulties in estimating intergenerational elasticities lies in the fact that father’s status, $y^f_i$, is measured with error. The key problem is the lack of direct measures of permanent status. In the case of the Full Sample, in particular, the BHPS provides us with only one single-year measure of fathers’ occupational prestige (when sons were aged 14).18 Although the HG index is an arguably good proxy for long-run status, a single-year measure may still be tainted by transitory fluctuations in fathers’ careers. In addition, the BHPS elicits this information by asking respondents to report their parents’ occupation when they were aged 14. The retrospective questioning of children to obtain data on parents may of course generate recall errors.19 Both types of errors (due to measurement and recall) may be such that the variance of observed status is greater than the variance of permanent status, leading the OLS estimate of $\beta$ in (1) to be biased downward.20 Estimation of (1) is further complicated by the fact that fathers’ and sons’ occupations may refer to different points in their life cycle. Although this age variation could also bias the estimates of intergenerational mobility, no general conclusion on the direction of this bias can be reached (Jenkins, 1987).21
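The downward bias mentioned above can be illustrated with a small simulation. The sketch assumes classical measurement error (noise that is independent of permanent status and of the son's error term), in which case the OLS slope shrinks towards zero by the ratio of the permanent variance to the observed variance. All parameter values and names below are illustrative assumptions, not estimates from the BHPS.

```python
import random

def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def reliability(var_permanent, var_noise):
    """Share of the observed variance that is due to permanent status."""
    return var_permanent / (var_permanent + var_noise)

random.seed(0)
beta, n = 0.4, 20000                                  # illustrative values
perm = [random.gauss(0, 10) for _ in range(n)]        # permanent father status
son = [beta * p + random.gauss(0, 5) for p in perm]   # son status
obs = [p + random.gauss(0, 10) for p in perm]         # noisy single-year measure

slope_true = ols_slope(perm, son)   # close to beta
slope_obs = ols_slope(obs, son)     # close to beta * reliability(100, 100)
print(round(slope_true, 3), round(slope_obs, 3))
```

With equal permanent and transitory variances the reliability ratio is one half, so the slope on the noisy measure is roughly half the true one.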

16 Several studies have argued that averaging status data over time reduces the impact of the transitory component of the status variable (thus reducing the potential of errors-in-variables bias) and yields a more accurate measure of permanent status. See, among others, Solon (1992), Zimmerman (1992), Dearden et al. (1997), and Mazumder (2005).

17 This minimum value does not refer to the minimum score in the individual’s work history (because this information is not available), but to the overall minimum score observed in the Full Sample. For a similar treatment of non-random selection into the labour force, see Couch and Lillard (1998).

18 In the case of the Restricted Sample, instead, multiple measures of fathers’ HG scores are available in principle. For that sample, however, there are still measurement problems in that fathers and sons are observed at different points in their life cycle (see below). Clearly, the Restricted Sample is expected to suffer from the selection bias discussed in Section 2.

19 Again, our Restricted Sample does not have to face this problem as fathers and sons report independently their own occupational information.

20 Because our Supplemental Sample is not affected by such errors, it should give us an idea of how serious this problem might be for the Full Sample.

21 Using American and Canadian data, Grawe (2006) documents that income variance increases as father’s income is measured later in the life cycle, and this leads to lower estimates of the intergenerational elasticity. On the other hand, as sons become older, estimates of income persistence are higher. Evidence for other countries, however, is not yet available.


Table I. Summary statistics by sample

| Variables | Full sample: Mean (SD) | Full sample: N | Restricted sample: Mean (SD) | Restricted sample: N | Supplemental sample: Mean (SD) | Supplemental sample: N |
|---|---|---|---|---|---|---|
| Hope–Goldthorpe scores (sons) | | | | | | |
| $HG^s_1$ | 42.99 (13.33) | 2117 | 40.00 (11.63) | 969 | | |
| $HG^s_2$ | 40.92 (14.47) | 2311 | 38.88 (12.30) | 1022 | | |
| Hope–Goldthorpe score (fathers) | | | | | | |
| $HG^f_1$ | 49.63 (15.70) | 1152 | 47.57 (15.14) | 397 | | |
| $HG^f_2$ | 50.03 (15.64) | 1572 | 49.41 (15.40) | 817 | | |
| $HG^f_3$ | 48.93 (12.02) | 2691 | 48.87 (13.21) | 1114 | | |
| $HG^f_{SS}$ | | | | | 47.80 (15.86) | 1427 |
| $HG^f_{SS}$ | | | | | 47.68 (15.01) | 1430 |
| Other characteristics | | | | | | |
| Father’s age | | | 48.21 (7.54) | 1062 | 46.93 (8.17) | 1434 |
| House price index (log) | 11.08 (0.26) | 2691 | 11.05 (0.32) | 1114 | | |
| Son’s characteristics | | | | | | |
| Age | 23.15 (4.51) | 2691 | 21.30 | 1114 | 19.76 (6.64) | 1434 |
| Year of birth | 1974 (5.32) | 2691 | 1976 (5.20) | 1114 | 1976 (5.80) | 1434 |
| Ethnic origin: | | | | | | |
| White (base) | 0.94 | 2691 | 0.94 | 1114 | | |
| Black | 0.02 | 2691 | 0.01 | 1114 | | |
| Indian | 0.02 | 2691 | 0.02 | 1114 | | |
| Pakistani/Bangladeshi | 0.01 | 2691 | 0.02 | 1114 | | |
| Other | 0.01 | 2691 | 0.01 | 1114 | | |
| Religious attendance: | | | | | | |
| Protestant | 0.12 | 2691 | 0.11 | 1114 | | |
| Catholic | 0.06 | 2691 | 0.05 | 1114 | | |
| Other denomination | 0.02 | 2691 | 0.03 | 1114 | | |
| Region of sons’ residence: | | | | | | |
| Greater London (base) | 0.105 | 2691 | 0.996 | 1114 | | |
| Rest of South East | 0.186 | 2691 | 0.193 | 1114 | | |
| South West | 0.086 | 2691 | 0.084 | 1114 | | |
| Anglia and Midlands | 0.237 | 2691 | 0.234 | 1114 | | |
| North West | 0.101 | 2691 | 0.111 | 1114 | | |
| Rest of North | 0.163 | 2691 | 0.170 | 1114 | | |
| Wales | 0.044 | 2691 | 0.048 | 1114 | | |
| Scotland | 0.077 | 2691 | 0.060 | 1114 | | |

Note: N indicates the number of sons in the Full and Restricted Samples, and the number of fathers in the Supplemental Sample.

To address at least part of such problems we use three different measures of fathers’ socio-economic status.22 The first uses the single-year measure reported by sons, which identifies fathers’ occupational prestige when their sons were aged 14, and excludes all cases with missing information (because either the son refused to answer, or he did not know his father’s occupation, or his father did not work when he was 14). This is the measure available for the Full Sample, $HG^f_1$. For all sons in the Full Sample who do not report the occupational information on their fathers but could be successfully matched to them, we replace the missing values on paternal occupational prestige with the HG scores directly reported by the fathers when their sons were aged 14 (or as close as possible to that age). Of course, this replacement can be done only for those father–son pairs that are in the Restricted Sample. We denote this measure $HG^f_2$. Our third measure, $HG^f_3$, replaces the missing values of $HG^f_2$ with the average (mean) score observed in the entire Full Sample.23 There are clearly other types of imputation for father’s missing values. For example, for fathers who are observed to live with their sons in the Restricted Sample, regression-based imputation methods could be a compelling alternative. We do not pursue this, however, because of space limitation.

22 More sophisticated imputation procedures than the one we use here would require information on fathers (e.g., age, education, and work experience), which is typically not available for father–son pairs in the Full Sample.
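The three fathers' measures form a chain of fallbacks, which a short Python sketch may make concrete. The function name, argument names and the numeric values are hypothetical; in particular, `imputation_mean` stands in for the Full Sample mean score used for the third measure and is passed in as an assumption rather than computed here.

```python
def father_measures(son_report, father_report, imputation_mean):
    """Return (HGf1, HGf2, HGf3) for one father-son pair; None means missing.

    son_report      -- father's HG score as recalled by the son (age 14)
    father_report   -- HG score reported by the matched father, if any
    imputation_mean -- mean score used to fill the remaining gaps (assumed)
    """
    hg_f1 = son_report
    hg_f2 = hg_f1 if hg_f1 is not None else father_report
    hg_f3 = hg_f2 if hg_f2 is not None else imputation_mean
    return hg_f1, hg_f2, hg_f3

# Hypothetical pairs: son reports; only the father reports; nobody reports.
print(father_measures(52.0, 47.0, 49.0))
print(father_measures(None, 47.0, 49.0))
print(father_measures(None, None, 49.0))
```

Each step can only add observations, which is why the number of usable cases grows from the first measure to the third.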

In the case of the Supplemental Sample we compute two measures of fathers’ HG scores. The first, $HG^f_4$, is the index obtained from the father in the second or eleventh wave when he is asked to report information about all his children. The second, which we denote $HG^f_5$, is the average of the HG scores across all waves in which the father is observed. If this information is missing, we replace it with the occupational prestige reported for the most recent job. Both averaging across waves and replacing missing data are meant to reduce the problems induced by missing or inapplicable cases and measurement error (see subsection 5.2.3).

Table I shows that the distributions of $HG^f_1$, $HG^f_2$ and $HG^f_3$ are relatively similar in the Full and Restricted Samples, with means ranging between 47.6 and 50 and standard deviations ranging between 12 and 15.7. The mean prestige scores in the Supplemental Sample are close to the scores in the Full Sample, which suggests that measurement and recall errors affecting the status measures in the Full Sample may not be substantial. In addition, we never reject the hypothesis that the mean HG scores and their standard deviations are the same in the Full and Restricted Samples, for sons or for fathers, even at significance levels as high as 10%.

3.3. Other Variables

Table I lists the summary statistics of the other variables used in the analysis. As in several other studies, the model for son’s status in equation (1) is extended to include son’s age and its square (Solon, 1992; Zimmerman, 1992; Dearden et al., 1997; Couch and Dunn, 1997; Corak and Heisz, 1999). Although the age range of 16 to 34 years is the same in both samples, the mean age of sons in the Full Sample is 23.2, about 2 years greater than in the Restricted Sample. We do not have information on fathers’ age in the Full Sample, but in the Restricted Sample their mean age is 48.2, while it is 46.9 in the Supplemental Sample, whose higher standard deviation reflects its greater heterogeneity.

23 In our empirical analysis, we used two other measures. One replaced the missing values of $HG^f_2$ with the minimum observed score in the Full Sample. The assumption behind this measure is that the missing values are for fathers with low permanent incomes. This is justified by the fact that the majority of such cases are missing due to inapplicability (that is, fathers who are likely to be unemployed or out of the labour market). But differently from the case of sons (see above), in this case we can distinguish fathers who were unemployed from fathers who do not have occupational information for other reasons (e.g., death and out of the labour market) only in 6% of the cases. For this reason, the results obtained with this measure are likely to be problematic, and are therefore not shown here. However, they are reported in Francesconi and Nicoletti (2004). Another measure replaces the missing values of $HG^f_2$ with the median score in the Full Sample. This measure assumes that the missing values refer to fathers who are randomly spread over the entire distribution of occupational prestige, and the median minimizes the mean absolute errors and is robust to outliers. The results obtained from this measure are virtually identical to those obtained with $HG^f_3$ (mean), and are thus not reported. They can be obtained from the authors on request.


The next section will illustrate our methods to account for sample selection biases. A number of other variables are used to model the probability of observing fathers and sons living in the same household at least once during the sample period. These are reported in Table I. Approximately 94% of the sons in both samples are white. The average year of birth for sons in the Full Sample was 1974, compared with 1976 for those in the Restricted Sample. Religious attendance is another factor that is believed to have a deep effect on the likelihood of young people leaving the parental home (Cherlin, 1992). Although religious views on family formation are varied, strong religious beliefs are one cultural source of ideas that encourages the maintenance of traditional values (Wilcox, 2002). For each of the three religious denominations considered here (Catholic, Protestant, and other religions), ‘attendance’ is defined as attending religious services at least once a month.24 In both samples, less than 20% of the young men are religiously active, more than 40% have no religious affiliation and the rest have an affiliation but do not attend services regularly.

Many studies of household formation have underlined the importance of the price of housing (e.g., Haurin et al., 1994; Ermisch, 1999). House prices indeed can affect the likelihood of observing fathers and sons living together, and so they may determine the selection into the Restricted Sample.25 In Table I, the (log) average prices of housing are relatively similar in the Full and Restricted Samples, with their corresponding levels being about 65 000 and 63 000, respectively. These averages mask large differences across regions and over time. Eight regional dichotomous variables are included in the selection model for the standard regions of Great Britain.26 These are the same regions used to assign annual house prices, and so identification of the effects of the price of housing on the probability of not living apart from parents is based on the differences in patterns of house price changes among regions. Table I shows that the regional distributions of sons are similar across samples.

24 The ‘Protestant’ group includes: Church of England (Anglican), Church of Scotland, Free Presbyterian, Episcopalian, Methodist, Baptist, Congregationalist, and other Christian denominations. The ‘other religions’ include: Muslim, Hindu, Jewish, Sikh, and other non-Christian denominations. The omitted category includes those with no religious affiliation as well as those who have a religious affiliation but attend religious services only infrequently. Distinguishing between such two groups does not change our results.

25 The price of housing is an ambiguous concept when there are different housing tenures, non-neutral tax treatment of them, and probable imperfections in financial markets (Ermisch and Di Salvo, 1997; Ermisch, 1999). In Britain in 2002, nearly 90% of households are either owner-occupiers (68%) or ‘social tenants’ (22%). The latter primarily includes households who rent their dwelling from local authorities. Social housing is not allocated by price, but by administrative procedures, which give priority to families with children and the elderly. While only a small proportion of all households rent from private landlords, it is a relatively important sector for young people leaving their parental home, being the destination of 45% of all departures and 33% of departures among those who are not full-time students. Owner-occupation is the destination for 56% of non-student departures. Information on rents in the private market is not available in the BHPS, and there are barriers to entry into social rental housing for young people. Our measure of the price of housing is the same as that used by Ermisch (1999), and is given by the average ‘mix-adjusted’ house price relative to the retail price index in any given year for the region in which a person resided in that year. It adjusts for changes in the mix of the size and type of house (e.g., detached, semi-detached, flat, etc.), but does not adjust for quality change. This measure is likely to capture a large proportion of the variation in a measure of the annual ‘user cost of housing’ for owner-occupiers, because mortgage and income tax rates are set nationally and relative house prices show much larger variation over time than these. It also could be viewed as an indicator of housing market conditions, in both rental and owner-occupied markets. For individuals in the Full Sample who could not be matched with their fathers, the price of housing refers to the price observed in the first wave they were in the panel. For those who co-reside with their fathers (and therefore, all sons in the Restricted Sample), this variable is measured at the last wave they were observed living together.

26 The regions are Greater London (which is our base category), South East, South West, East Anglia and East and West Midlands, North West (including Yorkshire and Humberside), Rest of the North, Wales, and Scotland.


4. METHODS

The first objective of our analysis is to determine the impact of sample selection assumptions on intergenerational mobility estimates. We begin by estimating an extended (still very parsimonious) specification of equation (1) on the Full and Restricted Samples, namely:

$$y^s_i = \alpha + \beta y^f_i + A_i'\gamma + \varepsilon_i \qquad (2)$$

where $A_i$ is a vector containing the son’s age and age squared,27 and $\varepsilon_i$ is the new error term. Throughout the paper, we work under the assumption that the estimates of $\beta$ obtained from the Full Sample are free of selection bias, that is, $E(\varepsilon \mid y^f, A) = 0$ (see also Appendix A). Comparing then the estimates of $\beta$ obtained from the Full Sample to those obtained from the Restricted Sample provides us with a measure of the extent of the bias. We will perform this comparison by using each of the two HG score measures for sons alternatively with each of the three measures for fathers.

After having established the magnitude of the bias (if any), we consider different estimation methods to correct for it. For this purpose, we apply five different methods.28 The first method is based on a maximum likelihood (ML) estimation of the main equation (2) jointly with the following selection equation:

$$r^*_i = Z_i'\theta + v_i \qquad (3)$$

with

$$r_i = 1 \ \text{if } r^*_i > 0, \qquad r_i = 0 \ \text{otherwise} \qquad (4)$$

where $r^*_i$ is a latent variable with associated indicator function $r_i$ that takes value 1 if the father’s status, $y^f_i$, is observed, and 0 if $y^f_i$ is missing; $Z_i$ is a vector of explanatory variables (possibly including $A_i$) that determine the probability of observing a son living with his father during the survey years and are assumed for simplicity to be observed for all individuals; $v_i$ is a zero-mean error term with $E(\varepsilon_i \mid v_i) \neq 0$; and $\theta$ is a conformable vector of parameters to be estimated. Under the assumption that $\varepsilon_i$ and $v_i$ are independently and identically distributed $N(0, \Sigma)$, with $\Sigma$ being a full-rank variance–covariance matrix, and $(\varepsilon_i, v_i)$ independent of $Z_i$ and $y^f_i$, it is straightforward to estimate all the parameters in (2) and (3) using a standard ML procedure.

The second method follows the parametric two-step (TS) estimation procedure suggested by Heckman (1979), which, like the previous method, imposes a bivariate normal distribution on the error terms, $\varepsilon_i$ and $v_i$. The idea here is that in the conditional expectation of (2), after normalizing $\sigma_v$ to 1, the term $E(\varepsilon_i \mid Z_i, r_i = 1)$ equals $\sigma_{\varepsilon v}\left(\phi(Z_i'\theta)/\Phi(Z_i'\theta)\right)$, where $\phi(\cdot)$ and $\Phi(\cdot)$ denote the probability density and cumulative distribution functions of the standard normal distribution, and the term in parentheses is known as the inverse Mills ratio. Therefore $E(\varepsilon_i \mid Z_i, r_i = 1)$ is different from zero as long as the error terms in (2) and (3) are correlated, that is, $\sigma_{\varepsilon v} \neq 0$. The TS method is first to obtain an estimate of the inverse Mills ratio using a probit model, and then obtain a consistent estimate of $\beta$ (and the other parameters in (2)) by OLS.29 Without imposing any exclusion restriction, both ML and TS methods will identify our model (2)–(4) via functional form (i.e., through the nonlinearity implied by the error structure). However, in the analysis below we introduce exclusion restrictions, with our selection equation including the same variables in (2) plus dummy variables for region of residence, religious attendance and race, and the regional house price index.30

27 The introduction of $A_i$ in (2) is motivated by the fact that fathers and sons are not observed at the same point in their life cycle. A more flexible specification would require us to include father’s age as well, which would account for different age profiles in socio-economic status between generations (Solon, 1992). However, we cannot do this because father’s age is not available in our data.

28 For a comprehensive review, see Vella (1998).
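The second step of the two-step procedure can be sketched with the Python standard library. This is a simplified illustration rather than the estimator used in the paper: the selection index is treated as known instead of being estimated by a first-step probit, the data are simulated, and every name and parameter value (`beta`, `rho`, the helper `ols`) is an assumption made for the example.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def inverse_mills(index):
    """phi(index) / Phi(index) for the standard normal distribution."""
    return STD_NORMAL.pdf(index) / STD_NORMAL.cdf(index)

def ols(X, y):
    """OLS coefficients from the normal equations, solved by Gauss-Jordan."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

random.seed(1)
beta, rho, n = 0.2, 0.7, 40000            # illustrative values only
X_naive, X_ts, y = [], [], []
for _ in range(n):
    x, z, v = (random.gauss(0, 1) for _ in range(3))
    eps = rho * v + (1 - rho ** 2) ** 0.5 * random.gauss(0, 1)
    index = x + z                         # selection index, assumed known here
    if index + v > 0:                     # outcome observed only if selected
        y.append(1.0 + beta * x + eps)
        X_naive.append([1.0, x])
        X_ts.append([1.0, x, inverse_mills(index)])

b_naive = ols(X_naive, y)[1]   # ignores selection
b_ts = ols(X_ts, y)[1]         # adds the inverse Mills ratio as a regressor
print(round(b_naive, 3), round(b_ts, 3))
```

In this design the variable `z` enters the selection index but not the outcome equation, playing the role of an exclusion restriction; in practice the index would be replaced by its probit estimate and the standard errors adjusted accordingly.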

A criticism of the ML and TS methods is their reliance on distributional assumptions on $\varepsilon_i$ and $v_i$ (Lee, 1982, 1984; Newey, 1999). In particular, bivariate normality implies that the relationship between the error terms is linear. Our third method tries to test for departures from normality by including terms that capture systematic deviations from linearity in the disturbances. This method again involves a two-step procedure, in which the first step is identical to that of TS. The second step estimates the conditional expectation of (2), but, rather than including the first-step estimate of the inverse Mills ratio as an additional regressor, it includes a polynomial expression that is proportional to the inverse Mills ratio and takes the form

$$E(\varepsilon_i \mid Z_i, r_i) = \left[\frac{\phi(Z_i'\theta)}{\Phi(Z_i'\theta)}\right] \sum_{j=0}^{J} \delta_j (Z_i'\theta)^j$$

where $J$ is the order of the polynomial and $\delta_j$ are parameters to be estimated. If $J = 0$, this expression reduces to the inverse Mills ratio term of the TS method. But coefficients of the terms for $j = 1, \ldots, J$ that are statistically different from zero lead us to reject the hypothesis that the disturbances are bivariate normal. In our application $J = 4$. We refer to this method as TSN, to underline its two-step nature and the fact that it tests for non-normality.
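The additional second-step regressors implied by the polynomial expression above can be generated as follows. The sketch takes the selection index as given (in practice it would be the fitted probit index), and the function name `tsn_terms` is hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def tsn_terms(index, order=4):
    """Regressors mills(index) * index**j for j = 0, ..., order."""
    mills = STD_NORMAL.pdf(index) / STD_NORMAL.cdf(index)
    return [mills * index ** j for j in range(order + 1)]

# With order=4 there are five terms; the first is the inverse Mills ratio.
terms = tsn_terms(0.5)
print(len(terms), round(terms[0], 4))
```

Under bivariate normality only the first term (j = 0) should matter, so a joint test that the coefficients on the remaining terms are zero serves as the non-normality check described in the text.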

Our last two methods rely less on distributional assumptions by imposing an index restriction according to which $E(\varepsilon_i \mid Z_i, r_i = 1) = g(Z_i'\theta) = h(\Pr(r_i = 1 \mid Z_i))$, where both $g$ and $h$ are unknown functions, and $\Pr(r_i = 1 \mid Z_i) = \Phi(Z_i'\theta)$ is the propensity score. The fourth method follows a two-step propensity score stratification (PSS) procedure by slightly modifying the estimator proposed by Cosslett (1991). In the first step we estimate the selection model parametrically via probit (rather than non-parametrically as suggested by Cosslett). In the second step, we estimate our main equation (2) while approximating the selection correction, $g(\cdot)$, by $J$ indicator variables, $d_j = \{I(\Pr(r_i = 1 \mid Z_i) \in \Delta_j)\}$, where $\Delta_j$, $j = 1, \ldots, J$, is one of the $J$ intervals that partition the [0, 1] support of the propensity score.31 In our analysis, we use three alternative specifications of the PSS method. In one, we consider four dummy variables indicating whether the predicted propensity score is in the bottom quartile, or between the 25th percentile and the median, or between the median and the 75th percentile, or in the top quartile. In another, we consider 10 dummy variables that partition the predicted propensity score distribution in deciles. In the last, we partition the [0, 1] support of the propensity score in equally spaced intervals controlling for the balancing score property.32
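The quartile and decile versions of the stratification step amount to turning fitted propensity scores into interval dummies. A minimal sketch, assuming the scores have already been predicted by the first-step probit (the values below are made up, and `stratum_dummies` is a hypothetical helper):

```python
from bisect import bisect_right
from statistics import quantiles

def stratum_dummies(scores, n_strata=4):
    """One row of 0/1 stratum indicators per propensity score."""
    cuts = quantiles(scores, n=n_strata)      # n_strata - 1 interior cut points
    strata = [bisect_right(cuts, p) for p in scores]
    return [[1 if s == j else 0 for j in range(n_strata)] for s in strata]

scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]   # hypothetical fitted scores
dummies = stratum_dummies(scores)                   # quartile indicators
for p, row in zip(scores, dummies):
    print(p, row)
```

Setting `n_strata=10` gives the decile version. When the dummies are added to a regression that already contains a constant, one of them has to be omitted to avoid perfect collinearity.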

Finally, our fifth method applies the propensity score weighting (PSW) procedure recently used in Robins and Rotnitzky (1995) and Wooldridge (2002). This is again a two-step method with the first step requiring the usual estimation of the selection model via probit. In the second step, we estimate the main equation (2) using weighted least squares with weights given by the inverse propensity scores obtained from the first step.

A key issue in selection problems has to do with the way in which the selection process itself is believed to be generated. If the probability that $r_i = 1$ is independent of $\{y^s_i, y^f_i, Z_i\}$, then the data are missing completely at random, the missing data problem can be ignored, and we can estimate equation (2) using the subsample of father–child pairs with observations on both $y^s_i$ and $y^f_i$ (and other regressors). If instead $r_i$ is not independent of $y^s_i$, but, conditional on $y^s_i$ and $Z_i$, it is uncorrelated with $y^f_i$, the data are said to be missing at random (MAR), and consistent estimates of $\beta$ can be obtained from (2) using the subsample of individuals with complete observations and considering the missing covariate problem as if it were a missing dependent variable problem (and, thus, applying the five correction methods outlined above).

If $r_i$ is independent neither of $y^s_i$ nor of $y^f_i$, the data are not missing at random (NMAR), and consistent estimates of $\beta$ can be obtained only by inverting (2) into

$$y^f_i = a + b y^s_i + A_i'c + \eta_i \qquad (5)$$

which allows us to consider the problem of the missing $y^f_i$ as a standard missing dependent variable problem and to apply again our five correction methods straightforwardly.33 In the reverse-OLS regressions (5), $a$, $b$, and $c$ are parameters, and $\eta_i$ is an error term. Since our parameter of interest is $\beta$ in equation (2), this can be recovered from (5) through

$$\beta = b \left( \frac{\sigma^2_{\tilde{y}^s, \tilde{y}^s}}{\sigma^2_{\tilde{y}^f, \tilde{y}^f}} \right) \qquad (6)$$

where $\sigma^2_{\tilde{k}, \tilde{k}}$ is the variance of $\tilde{k}$, $\tilde{k} = \tilde{y}^s, \tilde{y}^f$, and $\tilde{k}$ is the residual of the regression of $k$ on $A$.

Equation (5) is a direct application of the results shown in Frisch and Waugh (1933). From (5) we recover $\beta$ under the assumption that $b$ is consistently estimated, and the conditions under which estimation of $b$ is consistent are presented in Appendix A for each of the correction methods used in our analysis.

Now, $y^s_i$ and $A_i$ are observed for all father–son pairs, so the estimation of $\sigma^2_{\tilde{y}^s, \tilde{y}^s}$ is unproblematic. However, because $y^f_i$ is observed only for father–child pairs for whom both the adulthood and the coresidence conditions hold, the possibility of missing $y^f_i$’s must be accounted for in estimating $\sigma^2_{\tilde{y}^f, \tilde{y}^f}$. For this purpose, the information contained in the Supplemental Sample is crucial. Specifically, for all fathers in that sample, we regress the log of their HG index on a constant, their son’s age and age squared, and use the residuals of this regression to compute $\sigma^2_{\tilde{y}^f, \tilde{y}^f}$. Therefore, for each of the five methods described above (ML, TS, TSN, PSS, and PSW), we will present two sets of estimates of $\beta$, one set that assumes data missing at random and another set that assumes data not missing at random.

Table II. Assumptions imposed by different estimators

| Label | Assumption | Estimator | Equation |
|---|---|---|---|
| A1 | $(r \perp\!\!\!\perp y^s \mid y^f, A)$ | OLS | (2) |
| A2 | $(r \perp\!\!\!\perp y^f \mid y^s, A)$ | OLS | (5) |
| A3 | $(r \perp\!\!\!\perp y^f \mid y^s, A, Z)$ | ML, TS, PSS, TSN | (2) |
| A4 | $(y^s \perp\!\!\!\perp Z \mid y^f, A)$ | ML, TS, PSS, TSN | (2) |
| A5 | $(r \perp\!\!\!\perp y^s \mid y^f, A, Z)$ | PSW | (2) |
| A6 | $(y^f \perp\!\!\!\perp Z \mid y^s, A)$ | ML, TS, PSS, TSN | (5) |
| A7 | $(r \perp\!\!\!\perp y^f \mid y^s, A, Z)$ | PSW | (5) |
| A8 | $(r \perp\!\!\!\perp y^f \mid A, Z)$ | PSW | (2) |

Note: OLS, ordinary least squares method; ML, maximum likelihood estimator; TS, parametric two-step estimation procedure; TSN, two-step procedure with non-normality test; PSS, propensity score stratification procedure; PSW, propensity score weighting procedure.

29 Following Heckman (1979), we compute consistent estimates of the variance–covariance matrix of $\beta$ and $\gamma$ in the second step by accounting for the heteroskedasticity of the error terms induced by the truncation of $\varepsilon$ and the pre-estimation of the inverse Mills ratio.

30 Similar restrictions are used in Vella (1998). Clearly, for our exclusion restrictions to be valid, we need region, race and the house price index: (a) to be predictive of $r_i$; (b) to assign the realization of $r_i$ randomly across families given the other observables; and (c) to be exogenous to $\varepsilon_i$.

31 Although similar to ours, the procedure by Cosslett (1991) suggests a partition of the support of $Z'\theta$ instead of the support [0, 1] of the propensity score.

32 This property requires that the distribution of the variables $Z$ does not differ between sons who are observed living with their fathers and sons who are not, given that the propensity score lies in each of the intervals which partition the [0, 1] support. If the balancing score property is not satisfied for a specific interval, we split it into disjoint smaller subintervals, until the property is satisfied. See Rosenbaum and Rubin (1983) for further details.

33 It is worthwhile noticing that we estimated one specification of the selection model (3)–(4) in relation to (2) and another specification in relation to (5). In the first, the $Z_i$ vector excludes both $y^s_i$ (obviously because it is the dependent variable) and $y^f_i$ (because we impose MAR). In the second specification, we exclude $y^f_i$ (since it is the dependent variable in (5)) but we include $y^s_i$.
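The second step of the propensity score weighting (PSW) procedure described earlier in this section is a weighted least squares regression with weights equal to the inverse of the fitted propensity scores. The sketch below uses made-up scores in place of first-step probit predictions and a constant-only regression so that the effect of the weights is easy to follow; `wls` is a hypothetical helper, not a library function.

```python
def wls(X, y, weights):
    """Weighted least squares: solves (X'WX) b = X'Wy by Gauss-Jordan."""
    k = len(X[0])
    A = [[sum(w * r[i] * r[j] for r, w in zip(X, weights)) for j in range(k)]
         + [sum(w * r[i] * t for r, t, w in zip(X, y, weights))]
         for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Hypothetical full sample: two low-status and two high-status units. One of
# the low-status units is unobserved, so the observed one has score 0.5.
y_obs = [10.0, 20.0, 20.0]
scores = [0.5, 1.0, 1.0]              # assumed fitted propensity scores
X = [[1.0], [1.0], [1.0]]             # constant-only regression

b_unweighted = wls(X, y_obs, [1.0] * 3)[0]
b_psw = wls(X, y_obs, [1.0 / p for p in scores])[0]
print(b_unweighted, b_psw)
```

The weighted estimate gives the under-represented unit twice the weight of the others, which restores the mean of the hypothetical full sample; the same function accepts additional columns in `X` for a regression such as (2).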

Table II summarizes the assumptions imposed by each of the methods described in this section to obtain a consistent estimate of the intergenerational correlation parameter.34 For example, assumption A1, which is needed for the OLS estimator of $\beta$ in (2) to be consistent, requires $y^s_i$ to be independent of $r_i$ conditional on $y^f_i$ and $A_i$. This means that the residuals in (2) must not depend on the selection process. However, such a dependency could be driven either by unobserved variables (in which case the ML, TS, TSN, and PSS methods and their corresponding assumptions will apply) or only by observed variables that are excluded from the main equation (2). In this latter case, assumption A5 becomes relevant.35 Finally, assumption A3 identifies our notion of MAR, which naturally will be relaxed when we estimate our models under the hypothesis of NMAR data. Appendix A provides a formal clarification of all these assumptions.
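The mapping in equation (6) between the reverse-regression slope $b$ and $\beta$ holds because both slopes share the same covariance and differ only in the variance used as denominator. This can be checked numerically with a toy example. The data below are made up, and the sketch assumes there are no age controls, so that residualizing on $A$ reduces to demeaning.

```python
def residualize(values):
    """Residuals from a regression on a constant only (demeaning)."""
    m = sum(values) / len(values)
    return [v - m for v in values]

def slope(x, y):
    """OLS slope of y on x for series that are already demeaned."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def variance(x):
    """Variance of a demeaned series."""
    return sum(a * a for a in x) / len(x)

ys = residualize([1.0, 2.0, 3.0, 4.0, 5.0])     # hypothetical sons' status
yf = residualize([2.0, 1.0, 4.0, 3.0, 10.0])    # hypothetical fathers' status

b = slope(ys, yf)                    # reverse regression (5): y^f on y^s
beta_direct = slope(yf, ys)          # direct regression (2): y^s on y^f
beta_recovered = b * variance(ys) / variance(yf)   # equation (6)
print(b, beta_direct, beta_recovered)
```

In the application described in the text the two variances need not come from the same observations: the variance of fathers' residualized status is taken from the Supplemental Sample, which is what makes the recovery feasible when fathers' status is missing for part of the sample.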

5. RESULTS

5.1. The Extent of Selection Bias

Table III reports the estimated intergenerational elasticities using six different combinations of the son’s and father’s status variables for the Full and Restricted Samples computed under the assumption that $y^s_i$ is independent of $r_i$ conditional on all the explanatory variables in (2). The estimates from the Full Sample range between 0.17 and 0.22, while those from the Restricted Sample range between 0.10 and 0.20.^36 The last column of Table III reveals that the gap between the two sets of estimates is statistically significant at conventional levels.^37

^34 The notation ‘$\perp$’ indicates statistical independence.

^35 A similar reasoning is valid when the reverse-regression model (5) is used.

^36 In spite of being smaller than the elasticities reported in many recent studies that use earnings or income as measures of status (Solon, 1999), our estimates are close to those shown in Atkinson et al. (1983) for Britain, when they use net family incomes rather than earnings as their variables of interest. They are also close to those reported in Blanden et al. (2004), where the log of children’s earnings is regressed on the log of parental income, and to those reported in Ermisch and Francesconi (2004), who also use HG scores.

^37 Using a probit model in which $r_i$ is determined by $y^s_i$, $y^f_i$ and $A_i$, we can also check if assumption A1 holds by testing whether the coefficient on $y^s_i$ is significantly different from zero. Differently from the Chow test shown in Table III, this


Table III. Estimated $\beta$ for different measures of status by sample

| Measures of status | Full Sample | N | Restricted Sample | N | Chow test | p-value |
|---|---|---|---|---|---|---|
| 1. $HG^s_1$ and $HG^f_1$ | 0.224 (0.025) | 1035 | 0.198 (0.039) | 377 | 3.192 | [0.013] |
| 2. $HG^s_2$ and $HG^f_1$ | 0.215 (0.029) | 1092 | 0.185 (0.042) | 388 | 2.644 | [0.032] |
| 3. $HG^s_1$ and $HG^f_2$ | 0.173 (0.022) | 1386 | 0.110 (0.030) | 728 | 4.856 | [0.000] |
| 4. $HG^s_2$ and $HG^f_2$ | 0.170 (0.025) | 1467 | 0.105 (0.033) | 763 | 5.552 | [0.000] |
| 5. $HG^s_1$ and $HG^f_3$ | 0.172 (0.022) | 2117 | 0.108 (0.030) | 969 | 5.620 | [0.000] |
| 6. $HG^s_2$ and $HG^f_3$ | 0.168 (0.013) | 2311 | 0.103 (0.033) | 1022 | 8.387 | [0.000] |

Note: N denotes the number of cases (sons and fathers) used in estimation. Standard errors are reported in parentheses. In the last two columns we report the values of the Chow tests with their corresponding p-values in brackets. The null hypothesis of the Chow test is that the intergenerational elasticity (net of age and age squared) estimated with the Full Sample is equal to the corresponding intergenerational elasticity estimated with the Restricted Sample.

Taking the values from the Full Sample as benchmark estimates free of selection bias, we always find that the Restricted Sample leads to an underestimation of the true intergenerational elasticity, which is in line with the results obtained in a different context by Minicozzi (2003). The magnitude of the bias varies from moderate (between 12% and 14% in the first two rows) to large (between 36% and 39% in the other rows). The extent of the bias therefore depends on how socio-economic status is measured.^38
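The percentages quoted above follow directly from the point estimates in Table III. The short sketch below recomputes them as the share of the Full Sample elasticity that is lost in the Restricted Sample; the function name `relative_bias` is introduced here only for illustration.

```python
# (label, Full Sample estimate, Restricted Sample estimate), copied from Table III
TABLE_III = [
    ("HG^s_1, HG^f_1", 0.224, 0.198),
    ("HG^s_2, HG^f_1", 0.215, 0.185),
    ("HG^s_1, HG^f_2", 0.173, 0.110),
    ("HG^s_2, HG^f_2", 0.170, 0.105),
    ("HG^s_1, HG^f_3", 0.172, 0.108),
    ("HG^s_2, HG^f_3", 0.168, 0.103),
]

def relative_bias(full, restricted):
    """Share of the benchmark elasticity lost in the restricted sample."""
    return (full - restricted) / full

bias_pct = [round(100 * relative_bias(f, r)) for _, f, r in TABLE_III]

for (label, _, _), pct in zip(TABLE_III, bias_pct):
    print(f"{label}: downward bias of about {pct}%")
```

The first two rows give 12% and 14%, and the remaining four rows fall between 36% and 39%.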

Biases of the order of almost 40% are arguably large. But even those found in the first two rows of Table III are sizeable, in the sense that they are likely to be consequential to mobility. This is because of our measure of socio-economic status. In general, values of $\beta$ of 0.18 or 0.19 suggest patterns of intergenerational mobility that are relatively similar to those implied by values of about 0.21 or 0.22. The sons’ (conditional) probability of staying in or moving to different points of the HG scores distribution varies by 1 or 2 percentage points as long as their father’s prestige does not lie at the extremes of his distribution. However, if father’s prestige is at the extremes (e.g., bottom and top deciles), the differences in probability are larger (of the order of 3–4 percentage points) and such differences may underpin substantially different occupations and earnings. For example, almost 90% of fathers in the top decile are managers (across all industrial sectors) and professionals (e.g., engineers, architects, university professors, medical doctors, solicitors and chartered accountants). This is true only for 58% of fathers in the ninth decile of the HG score distribution (which lies between the 80th and the 90th percentiles). Those occupational differences are reflected in substantial pay differentials. For example, the 2001 average monthly earnings are £2100 for fathers in the ninth decile, while fathers in the top decile earn £2500 per month, approximately 15% more. In sum, a downward bias of up to 25% in intergenerational elasticities based on occupational prestige measures is likely to provide us with a different picture of social mobility even if the ‘true’ value of the elasticity is low.

^37 (continued) test does not require linearity in the intergenerational mobility equation, but imposes a parametric probability model for the selection process. The results are shown in Appendix Table A.I (panel (a)), which reports the statistics only when $HG^s_1$ is used. Identical results were obtained when $HG^s_2$ was used, and are therefore not shown for simplicity. In two out of three cases assumption A1 is rejected at the 10% level (rows one and two). This indicates that selection issues are relevant. Importantly, in only one case (row three), this test contradicts the results from the Chow test at the 10% level. In this case, however, the normality assumption imposed by the probit specification is rejected (see Appendix A for further discussion on the normality assumption tests).

^38 Following the suggestion of a referee to account for differential propensity to report valid occupational information, we also performed our analysis including the number of years used in averaging for both fathers and sons as separate covariates in the Restricted Sample. In general, the added variables are not statistically significant, except when we use $HG^f_3$. In any case, the results on the estimated $\beta$ and the extent of the bias do not change.

5.2. Correcting for Sample Selection Bias

The results for the (probit) selection model are reported in Appendix Table A.II. We show three specifications. Specification [1] is relevant for equation (2), specifications [2] and [3] pertain to model (5) but use two different measures of $y^s_i$, $HG^s_1$ and $HG^s_2$ respectively. These results are obtained when we use $HG^f_1$. Therefore, matched father–son pairs for which information on father’s occupational prestige is missing have been dropped, thus reducing the size of the estimating sample.^39 This allows us to keep two of the sources of potential sample selection (coresidence and missing father’s information) separate, and concentrate only on the coresidence process. When the missing father’s data are imputed (i.e., when we use either $HG^f_2$ or $HG^f_3$), sample sizes are larger. The results for these other models, which are not shown for the sake of brevity, are similar to those reported in Appendix Table A.II.

Our estimates confirm a number of previous results (e.g., Haurin et al., 1994; Ermisch and Di Salvo, 1997). Older sons are less likely to be matched with their fathers, and so are young black (Caribbean or African) men (albeit this is not statistically significant). Men of Indian ethnic origin are instead more likely to be observed living with their fathers than otherwise identical white men. This is also the case for young men who regularly attend religious services (but only if they are non-Christians). Individuals who live in Greater London are more likely to co-reside with their fathers in any of the panel years as compared to individuals in other parts of the country. Higher house prices significantly reduce the probability of observing father–son pairs, suggesting a lower probability of co-residence in the panel in times or areas with relatively higher house prices. This is in contrast to the findings reported in other studies (e.g., Ermisch, 1999), although such studies were typically interested in the household formation process rather than the co-residence of fathers and sons (indeed a number of sons who could not be matched with their fathers in our sample are found to live with their mothers).

With methods allowing for selection on unobservables, i.e., correlation between the error terms in the selection equation and the main equation (ML, TS, PSS, and TSN), the choice of appropriate exclusion restrictions is crucial for identification. Our identifying variables (ethnicity, religion, region of residence, and local house prices) have been selected on the basis of data availability and previous research on household formation (e.g., Haurin et al., 1994). How well such variables identify the selection process of interest here is an empirical matter, and we shall defer the discussion of this point to the next two subsections. Of course, there could be other ‘exogenous’ (perhaps more convincing) variables to identify father–son co-residence, such as time changes in divorce laws or in the tax/benefit system which have been used in other related intergenerational studies (e.g., Gruber, 2004). The modelling of the selection process along these lines, however, is not straightforward given the short sampling period of our data and the one-shot measurement of $r$, and in any case it arguably does not represent the main contribution of our study.

^39 See the values of N for the Full Sample in the first two rows of Table III.


Table IV. Correcting for sample selection bias under MAR

Columns: sample type (Full, Restricted) and correction methods (ML, TS, TSN, PSS, PSW).

| Measures of status | Full | Restricted | ML | TS | TSN | PSS | PSW |
|---|---|---|---|---|---|---|---|
| 1. $HG^s_1$ and $HG^f_1$ | 0.224 (0.025) | 0.198 (0.039) | 0.198 (0.039) | 0.198 (0.039) | 0.200 (0.039) | 0.199 (0.039) | 0.208 (0.060) |
| 2. $HG^s_1$ and $HG^f_2$ | 0.173 (0.022) | 0.110 (0.030) | 0.110 (0.030) | 0.110 (0.030) | 0.109 (0.030) | 0.110 (0.030) | 0.145 (0.038) |
| 3. $HG^s_1$ and $HG^f_3$ | 0.172 (0.022) | 0.108 (0.030) | 0.108 (0.030) | 0.108 (0.030) | 0.108 (0.030) | 0.109 (0.030) | 0.129 (0.041) |

Note: Standard errors are reported in parentheses.

5.2.1 MAR Data

We now turn to see how well the methods described in Section 4 reduce the sample selection bias that affects $\beta$ obtained from the Restricted Sample. Table IV shows these results under the assumption of MAR data.^40 For expositional convenience, the first two columns of Table IV report again the elasticities shown in Table III for the three possible combinations of $HG^s_1$ and $HG^f$. The results for the combinations of $HG^s_2$ and $HG^f$ are similar, and therefore not reported.^41 All correction methods seem to be unable to attenuate the bias, except for the PSW method.^42 For example, in row one (where the bias is 12%), the PSW-corrected value of $\beta$ is 0.21, or 7% smaller than the corresponding true value in the Full Sample. In row two, where the bias is larger (36%), the PSW method reduces the bias by 56%.
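The two summary figures used in this paragraph (the bias remaining after PSW correction, and the share of the original bias that PSW removes) can be recomputed from the point estimates in Table IV. The helper names below are introduced only for this illustration.

```python
# (Full, Restricted, PSW) point estimates copied from Table IV, son's measure HG^s_1
TABLE_IV = {
    "HG^f_1": (0.224, 0.198, 0.208),
    "HG^f_2": (0.173, 0.110, 0.145),
    "HG^f_3": (0.172, 0.108, 0.129),
}

def bias_reduction(full, restricted, corrected):
    """Fraction of the gap between Full and Restricted closed by the correction."""
    return (corrected - restricted) / (full - restricted)

def remaining_bias(full, corrected):
    """Shortfall of the corrected estimate relative to the Full Sample benchmark."""
    return (full - corrected) / full

reduction_pct = {k: round(100 * bias_reduction(*v)) for k, v in TABLE_IV.items()}
remaining_pct = {k: round(100 * remaining_bias(v[0], v[2])) for k, v in TABLE_IV.items()}

for k in TABLE_IV:
    print(f"{k}: PSW removes {reduction_pct[k]}% of the bias, {remaining_pct[k]}% remains")
```

For $HG^f_1$ the corrected estimate is 7% below the benchmark, and for $HG^f_2$ PSW removes 56% of the bias, leaving a shortfall of about 16%.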

This method is valid when assumptions A5 and A8 are satisfied (see Table II). The tests reported in Appendix Table A.I panels (c) and (h) reveal that neither assumption can be rejected at the 1% level. Unsurprisingly, the fact that the correction methods other than PSW perform poorly is confirmed by the strong rejection of A4 regardless of the measures of socio-economic status (panel (d)). These results may indicate that the selection of interest here is primarily driven by observables. But, because of our earlier discussion on exclusion restrictions, they may also suggest problems with our identifying variables.
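To make the idea of selection on observables concrete, the sketch below simulates a toy version of propensity score weighting. It is a generic inverse-probability-weighting example under stated assumptions, not the estimator as implemented in the paper: $Z$ is a single binary variable, selection depends on $Z$ only, the propensity score is estimated by the selected share within each value of $Z$ (rather than by a probit), age controls are omitted, and all parameter values are hypothetical.

```python
import random

random.seed(0)
N = 20000
BETA, GAMMA = 0.1, 1.0            # hypothetical coefficients for the toy model
P_SELECT = {0: 0.95, 1: 0.05}     # hypothetical P(r = 1 | Z): selection on Z only

def weighted_slope(x, y, w):
    """Slope of a weighted least-squares regression of y on x with intercept."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    return sxy / sxx

# Z is excluded from the main equation but is related to both y^f and y^s.
data = []
for _ in range(N):
    z = random.randrange(2)
    yf = z + random.gauss(0.0, 1.0)
    ys = BETA * yf + GAMMA * z + random.gauss(0.0, 1.0)
    r = random.random() < P_SELECT[z]
    data.append((yf, z, ys, r))

# Propensity score estimated by the share of selected cases for each value of Z.
n_all = {0: 0, 1: 0}
n_sel = {0: 0, 1: 0}
for _, z, _, r in data:
    n_all[z] += 1
    n_sel[z] += int(r)
p_hat = {z: n_sel[z] / n_all[z] for z in n_all}

sel = [d for d in data if d[3]]
x_all, y_all = [d[0] for d in data], [d[2] for d in data]
x_sel, y_sel = [d[0] for d in sel], [d[2] for d in sel]

beta_full = weighted_slope(x_all, y_all, [1.0] * N)
beta_restricted = weighted_slope(x_sel, y_sel, [1.0] * len(sel))
beta_psw = weighted_slope(x_sel, y_sel, [1.0 / p_hat[d[1]] for d in sel])

print(f"full: {beta_full:.3f}  restricted: {beta_restricted:.3f}  PSW: {beta_psw:.3f}")
```

With these hypothetical parameters the population slope of $y^s$ on $y^f$ is 0.30 in the full sample and about 0.15 in the unweighted restricted sample; any single simulated draw differs from these values by sampling noise. Weighting each selected case by the inverse of its estimated propensity restores the full-sample mix of $Z$, which is why the weighted slope moves back towards the benchmark.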

^40 We conducted a number of tests on the significance of the inverse Mills ratio in the TS method, the polynomial terms in the TSN procedure, and the correction dummy variables in the PSS estimation. In all cases, we cannot reject the hypothesis that the coefficients of such variables are zero. However, as the Chow tests reported in Table III reveal, these tests seem to be not powerful enough to detect significant differences in intergenerational elasticities.

^41 This similarity emerges also in all subsequent analyses. The results can be obtained from the authors.

^42 Here and in the subsequent correction exercises, the results on the PSS method refer only to the case in which the correction term is approximated by four dummy variables that partition the predicted propensity score distribution in quartiles. The results obtained when 10 dummy variables (corresponding to deciles) are included as well as the results that control for the balancing score are not shown for convenience, but are similar to those reported. They can be obtained from the authors.

5.2.2 NMAR Data

The procedure we adopt to obtain consistent estimates of $\beta$ when the data are NMAR is: (i) to estimate the reverse regression (5); and (ii) to multiply the $b$ coefficient by the ratio in variances as shown in (6). The first two columns of Table V report the estimates of $b$ for the Full and Restricted Samples, respectively, again only for the $HG^s_1$–$HG^f$ combinations. These estimates (in either sample) are always greater than those found for $\beta$ in Table III, except for the case of $HG^f_3$. Given the link between $b$ and $\beta$, such a result implies that the variability in status among fathers is greater than that among sons, with the opposite holding in the case of $HG^f_3$ possibly as a result of imputation (see also the standard deviations of $HG^s$ and $HG^f$ in Table I). This relationship is plausible if the variance in status increases over the life cycle (Grawe, unpublished, 2003). Comparing the estimates of the two samples in Table V, we find that the Restricted Sample produces a smaller $b$ in all three cases. However, as revealed in the Chow tests reported in the third column of the table, the differences are never statistically significant at the 5% level. Thus, if we were interested in $b$ rather than $\beta$, OLS estimation of (5) on the Restricted Sample would not lead to biased estimates. Interestingly, the test for the validity of assumption A2 leads to the same result (see Appendix Table A.I panel (b)).

Table V. Estimated $b$ and $\sigma^2_{\tilde{y}^f,\tilde{y}^f}$ for measures of status by sample

| Measures of status | $b$: Full Sample | $b$: Restricted Sample | Chow test | $\sigma^2_{\tilde{y}^f,\tilde{y}^f}$: Full Sample | $\sigma^2_{\tilde{y}^f,\tilde{y}^f}$: Supplemental Sample | Equality test |
|---|---|---|---|---|---|---|
| 1. $HG^s_1$ and $HG^f_1$ | 0.331 (0.036) | 0.324 (0.064) | 0.871 [0.481] | 0.117 (0.005) | 0.132 (0.005) | 1.104 [0.085] |
| 2. $HG^s_1$ and $HG^f_2$ | 0.251 (0.032) | 0.165 (0.045) | 2.223 [0.064] | 0.116 (0.004) | 0.132 (0.005) | 1.121 [0.030] |
| 3. $HG^s_1$ and $HG^f_3$ | 0.157 (0.033) | 0.121 (0.034) | 1.168 [0.323] | 0.068 (0.003) | 0.132 (0.005) | 1.914 [0.000] |

Note: Standard errors are reported in parentheses. Standard errors for $\sigma^2_{\tilde{y}^f,\tilde{y}^f}$ are computed using bootstrapping methods with 1000 replications. The p-values of the Chow test and Equality test are in brackets. The null hypothesis of the Chow test is that the coefficient $b$ (net of age and age squared) estimated with the Full Sample is equal to the corresponding coefficient estimated with the Restricted Sample. The null hypothesis of the equality test is that the variance of $\tilde{y}^f$ computed with the Full Sample is equal to the corresponding variance computed with the Supplemental Sample.
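The two-step NMAR procedure rests on an algebraic link between the direct and the reverse regression slopes. The sketch below checks that link on simulated data, assuming (as the surrounding text and the note to Table VI suggest) that model (5) regresses $y^f$ on $y^s$ and that expression (6) rescales $b$ by the ratio of the son's to the father's variance. Covariates such as age are omitted, so raw variances stand in for the residual variances, and the data-generating values are hypothetical.

```python
import random

random.seed(2)
N = 5000

# Hypothetical status measures; 0.2 and the standard deviations are illustrative.
yf = [random.gauss(0.0, 1.3) for _ in range(N)]
ys = [0.2 * f + random.gauss(0.0, 1.0) for f in yf]

def mean(v):
    return sum(v) / len(v)

def cov(u, v):
    """Sample covariance (divisor N); cov(u, u) is the variance."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)

beta_direct = cov(ys, yf) / cov(yf, yf)   # slope of y^s on y^f, as in (2)
b_reverse = cov(ys, yf) / cov(ys, ys)     # slope of y^f on y^s, as in (5)

# Rescale the reverse-regression slope by var(y^s) / var(y^f).
beta_from_b = b_reverse * cov(ys, ys) / cov(yf, yf)

print(f"direct beta:       {beta_direct:.4f}")
print(f"reverse b:         {b_reverse:.4f}")
print(f"beta implied by b: {beta_from_b:.4f}")
```

When all three quantities come from the same sample the two routes coincide exactly. The correction becomes informative when $b$ is taken from the Restricted Sample while the father's variance is taken from a sample that is not affected by the co-residence condition.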

The next two columns of Table V present the estimates of the variance in father’s status, $\sigma^2_{\tilde{y}^f,\tilde{y}^f}$, computed on the Full and Supplemental Samples, respectively.^43 We cannot reject the hypothesis that $\sigma^2_{\tilde{y}^f,\tilde{y}^f}$ under the Full Sample is equal to $\sigma^2_{\tilde{y}^f,\tilde{y}^f}$ under the Supplemental Sample for the first measure of father’s status, $HG^f_1$. In the other cases, however, the two variances are significantly different at standard levels.

The intergenerational elasticities implied by the estimates of $b$ and $\sigma^2_{\tilde{y}^f,\tilde{y}^f}$ according to expression (6) are in Table VI. The first column shows the values of $\beta$ found when $b$ and both $\sigma^2_{\tilde{y}^f,\tilde{y}^f}$ and $\sigma^2_{\tilde{y}^s,\tilde{y}^s}$ are from the Full Sample. This produces the same estimates of $\beta$ reported in Table III, which, in what follows, are again taken as our new benchmark. The values in the second and third columns differ by the estimates of $b$ (which are computed on the Full and Restricted Samples, respectively), but use the same ratio of variances, namely the ratio between the variance in son’s status computed on the Full Sample and the variance in father’s status computed on the Supplemental Sample. The Restricted Sample (column 3) underestimates the true value of $\beta$, with the bias ranging from moderate (about 8% in the first row) to large (40% and 56% in rows 2 and 3, respectively).

Table VI. Correcting for sample selection bias under NMAR

Columns: sample type (Full/Full^a, Full/Suppl.^b, Restricted/Suppl.^c) and correction methods^c (ML, TS, TSN, PSS, PSW).

| Measures of status | Full/Full^a | Full/Suppl.^b | Restricted/Suppl.^c | ML | TS | TSN | PSS | PSW |
|---|---|---|---|---|---|---|---|---|
| 1. $HG^s_1$ and $HG^f_1$ | 0.224 (0.025) | 0.206 (0.022) | 0.206 (0.040) | 0.208 (0.042) | 0.208 (0.042) | 0.211 (0.041) | 0.216 (0.043) | 0.225 (0.063) |
| 2. $HG^s_1$ and $HG^f_2$ | 0.173 (0.022) | 0.157 (0.020) | 0.103 (0.028) | 0.100 (0.031) | 0.101 (0.030) | 0.104 (0.030) | 0.102 (0.031) | 0.147 (0.036) |
| 3. $HG^s_1$ and $HG^f_3$ | 0.172 (0.022) | 0.098 (0.012) | 0.075 (0.021) | 0.075 (0.022) | 0.075 (0.022) | 0.074 (0.022) | 0.075 (0.022) | 0.104 (0.031) |

Note: Standard errors are reported in parentheses. They are computed under the assumption that $\left(\sigma^2_{\tilde{y}^s,\tilde{y}^s} / \sigma^2_{\tilde{y}^f,\tilde{y}^f}\right)$ is constant.
a Full Sample is used to estimate both $b$ and $\sigma^2_{\tilde{y}^f,\tilde{y}^f}$.
b Full Sample is used to estimate $b$, and Supplemental Sample is used to estimate $\sigma^2_{\tilde{y}^f,\tilde{y}^f}$.
c Restricted Sample is used to estimate $b$, and Supplemental Sample is used to estimate $\sigma^2_{\tilde{y}^f,\tilde{y}^f}$.

^43 Notice that the variances for the Full Sample are identical each time we use the same measure of father’s status, while those for the Supplemental Sample never change because they are based on the same measure of status. We do not report the estimates of the variance for the Restricted Sample, because they will yield the same estimates of $\beta$ as those shown in Table IV. We also do not report the estimates of the variance in son’s status, $\sigma^2_{\tilde{y}^s,\tilde{y}^s}$, since $y^s_i$ and the other relevant variables in (5) are observed for all father–son pairs.

Table VI demonstrates that the PSW method corrects adequately for the bias in all cases, reducing it by 17 percentage points in row three and eliminating it altogether in row one. Interestingly, this good performance is confirmed by the fact that assumption A7 is never rejected (see Appendix Table A.I). The other methods perform less satisfactorily, perhaps because their underlying assumption A6 is always rejected. These findings strongly confirm those obtained above under the hypothesis of MAR data, and again, with the usual caveats, stress the importance of selection on observables.

5.2.3 Measurement Error

As mentioned in the Introduction, a number of studies have emphasized the role played by measurement error in socio-economic status in estimating intergenerational mobility. The unavailability of measures of permanent status generally leads to downward-biased elasticities (that is, greater mobility). Assuming a classical measurement error model,^44 one way to attenuate the bias has been to average over repeated observations on father’s status (Solon, 1992; Zimmerman, 1992). It is straightforward to show that

$$\beta_{avg} = \beta \left( \frac{\sigma^2_{\tilde{y}^f,\tilde{y}^f}}{\sigma^2_{\tilde{y}^f_{avg},\tilde{y}^f_{avg}}} \right) \qquad (7)$$

where $\beta$ is the elasticity obtained from model (2), $\sigma^2_{\tilde{y}^f,\tilde{y}^f}$ is the variance of the residual of the regression of $y^f_{it}$ on $A_{it}$ for all years $t$ in which fathers are observed, $\sigma^2_{\tilde{y}^f_{avg},\tilde{y}^f_{avg}}$ is the variance of the residual of the regression of time averages of $y^f$ on $A$, and $\beta_{avg}$ is the elasticity obtained from (2) when $y^f_i$ is replaced by its time average, $y^f_{i,avg}$. The problem with our data is that in either the Full or Restricted Samples we do not have repeated observations on father’s status. To compute the term at the denominator in parentheses in expression (7), therefore, we have to resort to the ‘external’ information contained in the Supplemental Sample. In particular, our measure of interest is the average of the father’s HG scores across all waves in which he is observed, $HG^f_5$ (see Subsection 3.2).

Table VII. Correcting for sample selection bias under MAR in presence of measurement error

Columns: sample type (Full, Restricted) and correction methods (ML, TS, TSN, PSS, PSW).

| Measures of status | Full | Restricted | ML | TS | TSN | PSS | PSW |
|---|---|---|---|---|---|---|---|
| 1. $HG^s_1$ and $HG^f_1$ | 0.230 (0.026) | 0.203 (0.040) | 0.203 (0.040) | 0.203 (0.040) | 0.205 (0.040) | 0.204 (0.040) | 0.213 (0.061) |
| 2. $HG^s_1$ and $HG^f_2$ | 0.176 (0.022) | 0.112 (0.030) | 0.112 (0.030) | 0.112 (0.030) | 0.111 (0.030) | 0.112 (0.030) | 0.147 (0.039) |
| 3. $HG^s_1$ and $HG^f_3$ | 0.102 (0.013) | 0.064 (0.018) | 0.064 (0.018) | 0.064 (0.018) | 0.064 (0.018) | 0.065 (0.018) | 0.077 (0.024) |

Note: Standard errors are reported in parentheses. They are computed under the assumption that $\sigma^2_{\tilde{y}^f,\tilde{y}^f} / \sigma^2_{\tilde{y}^f_{avg},\tilde{y}^f_{avg}}$ is constant.

^44 This assumes that the measurement errors for sons and fathers are mutually uncorrelated and also uncorrelated with the permanent component of status (see Fuller, 1987).
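The logic behind expression (7) can be illustrated with a small simulation of classical measurement error. In the sketch below the father's observed status in each year equals a permanent component plus independent noise; the son's status depends on the permanent component only. All values (a true slope of 0.4, unit variances, five yearly observations) are hypothetical, and age controls are omitted so that raw variances replace the residual variances in (7).

```python
import random

random.seed(3)
BETA, N, T = 0.4, 5000, 5   # hypothetical slope, sample size, years observed

def variance(v):
    m = sum(v) / len(v)
    return sum((a - m) ** 2 for a in v) / len(v)

def ols_slope(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sum((a - mx) ** 2 for a in x)

permanent = [random.gauss(0.0, 1.0) for _ in range(N)]           # father's permanent status
ys = [BETA * p + random.gauss(0.0, 1.0) for p in permanent]      # son's status
yearly = [[p + random.gauss(0.0, 1.0) for _ in range(T)] for p in permanent]

single = [obs[0] for obs in yearly]          # one noisy yearly observation
average = [sum(obs) / T for obs in yearly]   # time average over T years

beta_single = ols_slope(single, ys)
beta_avg = ols_slope(average, ys)

# Expression (7): rescale the single-year slope by the ratio of variances.
beta_rescaled = beta_single * variance(single) / variance(average)

print(f"single-year slope:      {beta_single:.3f}")
print(f"time-average slope:     {beta_avg:.3f}")
print(f"single-year, rescaled:  {beta_rescaled:.3f}")
```

In this setting the single-year slope is attenuated, averaging moves it closer to the hypothetical true value, and rescaling the single-year slope by the variance ratio gives approximately the same figure as using the time average directly. The rescaling route is the relevant one when repeated observations are available only in an external sample.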

The results that adjust $\beta$ to account for measurement error according to (7) are reported in Table VII. The first two rows show only marginally greater elasticities than the estimates from the Full Sample in Table III. This underlines that the $HG^f_1$ and $HG^f_2$ measures are likely to be contaminated only by small measurement error. When $HG^f_3$ is used, the elasticities are instead about 40% smaller (row three). In this case, the variance of $HG^f_3$ is likely to be underestimated because of imputation. In any case, the occupational prestige measures used here seem to be good proxies of permanent economic status. Importantly, all the findings that emerged earlier are still valid. We highlight two such findings. First, using the Restricted Sample leads to underestimating the true intergenerational elasticity (with biases ranging between 12% and 37%). Second, all correction methods perform poorly, in the sense that they are unable to attenuate the selection bias, apart from the PSW procedure. In fact, depending on the father’s measure, the bias reduction achieved by the PSW-corrected estimates ranges between 35% and 56%.

5.3. The Effect of Changing the Length of the Panel

The BHPS at present is a short panel to study intergenerational mobility issues as compared to other available longitudinal data sources (e.g., the Panel Study of Income Dynamics and the National Longitudinal Survey in the United States, or the National Child Development Study in Britain). There are, however, even shorter panels of data for Britain as well as other countries, some of which have been discontinued and thus will never reach a long enough time series component (e.g., the European Household Panel Study, which has been collected only eight times). In what follows we present a sensitivity check of our analysis when the length of the sample is restricted to a period shorter than 11 waves. In Table VIII we report the $\beta$ estimates computed using three new samples, namely, the subsamples of sons who co-reside with their fathers in at least one wave during the first eight, six, and four waves of the BHPS. For each subsample we compute the measures of occupational prestige using only the information available in the fictitiously shorter panel period (that is, the first eight, six, and four waves, respectively).

Table VIII. Intergenerational elasticities using panels of different length

| Measures of status | Full (11 waves)^a | Restricted (11 waves)^a | Restricted, 8 waves | Restricted, 6 waves | Restricted, 4 waves |
|---|---|---|---|---|---|
| 1. $HG^s_1$ and $HG^f_1$ | 0.224 | 0.198 | 0.129 (0.002), N = 342 | 0.098 (0.029), N = 318 | 0.031 (0.513), N = 297 |
| 2. $HG^s_1$ and $HG^f_2$ | 0.173 | 0.110 | 0.061 (0.089), N = 566 | 0.030 (0.427), N = 472 | 0.012 (0.795), N = 397 |
| 3. $HG^s_1$ and $HG^f_3$ | 0.172 | 0.108 | 0.060 (0.086), N = 714 | 0.028 (0.472), N = 566 | 0.011 (0.810), N = 442 |

Note: Standard errors are reported in parentheses. N denotes the number of observations.
a For simplicity standard errors and number of observations are not reported. They can be found in Table III.

Two comments are in order. First, limiting the analysis to a smaller number of waves leads to smaller sample sizes, and this seems to decrease the estimation precision quite substantially. While in the Restricted Sample based on 11 waves all $\beta$ estimates are significantly different from zero at 1% significance level, in the subsamples based on eight, six, and four waves there are only two cases out of nine in which we can reject the assumption of a zero intergenerational elasticity. Second, almost invariably the intergenerational elasticity declines as the length of observation (and analysis) shrinks. In most cases this decline is considerable. This has substantial effects on the underestimation of $\beta$. In fact, while the downward bias ranged from 12% to 39% when we used all 11 waves, it ranges from 42% to 65% when eight waves are used, and from 86% to 94% when four waves are used. With very short panels, therefore, the statistical inference on the intergenerational elasticity estimates is problematic and the extent of selection bias greatly exacerbated.
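The bias ranges quoted for the shorter panels can be recomputed from the point estimates in Table VIII, taking the 11-wave Full Sample as the benchmark. The sketch below does this for each panel length; it covers only the three $HG^s_1$ combinations shown in the table, so the 11-wave range it returns is slightly narrower than the 12% to 39% range, which also reflects the $HG^s_2$ rows of Table III.

```python
# Point estimates copied from Table VIII (rows: HG^f_1, HG^f_2, HG^f_3 with HG^s_1)
FULL = [0.224, 0.173, 0.172]     # Full Sample, 11 waves (benchmark)
RESTRICTED = {
    11: [0.198, 0.110, 0.108],
    8: [0.129, 0.061, 0.060],
    6: [0.098, 0.030, 0.028],
    4: [0.031, 0.012, 0.011],
}

bias_range = {}
for waves, estimates in RESTRICTED.items():
    # Downward bias as a percentage of the benchmark, rounded to whole points.
    pct = [round(100 * (f - e) / f) for f, e in zip(FULL, estimates)]
    bias_range[waves] = (min(pct), max(pct))

for waves, (low, high) in bias_range.items():
    print(f"{waves:>2} waves: downward bias between {low}% and {high}%")
```

The six-wave panel, not singled out in the text, falls in between, with biases of roughly 56% to 84%.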

As a way to isolate an explanation of why limiting the analysis to shorter panels introduces such problems, we repeated this exercise on three other samples, that is, the subsamples of sons who co-reside with their fathers in at least one wave during the last four, six, and eight waves of the BHPS. The results from this analysis (not presented for convenience) are qualitatively similar to those shown in Table VIII. However, if we obtain the occupational prestige information on the last four, six, or eight waves but consider all sons who co-reside with their fathers even in earlier waves, the estimates of $\beta$ are not reduced; in fact, they are close to those obtained from the Full Sample shown in Table III. Taken together, these results suggest that the lower estimated values of $\beta$ in Table VIII are mainly driven by the fact that in short panels there are fewer sons in the upper tail of the HG distribution, and thus the age effects are not captured well. This is consistent with the evidence presented in Grawe (unpublished, 2003). However, artificially truncating the panel from below seems to produce highly selected samples of older sons living with their fathers (thus exacerbating the co-residence condition problem), and this in turn tends to offset the smaller bias driven by the adulthood condition.


6. CONCLUSION

Using data from the first 11 waves of the BHPS, this paper measures the extent of the selection bias induced by adulthood and coresidence conditions (bias that is expected to be severe in short panels) on measures of intergenerational mobility in occupational prestige. We try to limit the impact of other selection biases, such as those induced by labour market restrictions that are typically imposed in intergenerational mobility studies, by using different measures of socio-economic status that account for missing labour market information.

We stress four main results. First, there is evidence of an underestimation of the true intergenerational elasticity. The extent of the downward bias is moderate in some cases (of the order of 12–14%) and large in others (of the order of almost 40%). The consequences in terms of intergenerational mobility of such biases are noticeable especially at the extremes of the occupational prestige distribution.

Second, the proposed methods used to correct for the selection bias seem to be unable to attenuate it, except for the propensity score weighting procedure, which performs well in most circumstances. In some cases, the PSW-corrected estimates of the intergenerational elasticity reduce the bias by more than one-half (from 36% to 16%). This result is confirmed both under the assumption of missing-at-random data as well as under the assumption of not-missing-at-random data. Third, the two previous sets of results (direction and extent of the bias, and differential abilities to correct for it) emerge also when we account for measurement error. Fourth, restricting the sample to a period shorter than the 11 waves under analysis exacerbates the extent of sample selection bias. Even restricting the analysis to eight waves would produce a downward bias ranging from 42% to 65%. Part of this underestimation seems to be due to life cycle effects, whereby short panels are less likely to have sons in the upper tail of the earnings distribution.

In light of these results, some extensions should be considered for future research. It is possible that the selection process analysed here is stronger in some parts of the occupational prestige distribution than in others. If the strength of the selection process is greater where the status distribution is generally more rigid, namely at its extremes (Corak and Heisz, 1999; Ermisch and Francesconi, 2004), then the magnitude of the selection bias may be greater than what we gauged here. In addition, it would be profitable to consider other correction methods to attenuate the bias (e.g., Manski bounds on quantile regressions). This would provide us with additional tools to address the selection issue and would also allow us to see how generalizable our conclusions are on this matter. Another extension is to account for other selection processes (for instance, the selection into employment) more formally. This would help to reach a fuller characterization of the extent of the selection bias and identify its different sources more precisely than what we have done in this paper. It would also give us greater credibility to analyse the same selection problems for women. Finally, while the size of the Restricted Sample compares favourably with other studies, a richer data source or a longer panel of data would allow us to check the robustness of our findings and increase the range of hypotheses that could be tested.

APPENDIX

As discussed in Section 2, estimation of the intergenerational transmission based on the relationship between our first measure of sons’ occupational prestige ($HG^s_1$) and our first measure of fathers’ occupational prestige ($HG^f_1$) using the Restricted Sample is potentially affected by three different sources of sample selection:


(a) father–son co-residence;

(b) missing data on father’s occupational prestige, $HG^f_1$, if the son does not know or refuses to report his father’s occupation, or if the father was not working when his son was 14 years old; and

(c) missing data on son’s occupational prestige, $HG^s_1$, if the son refuses to report his own occupation, is unemployed or out of the labour market.

We correct for the potential bias induced by father–son co-residence (bias (a)) by applying different types of estimation on the Restricted Sample, which impose different assumptions on the selection process. We account for missing data on $HG^f_1$ (bias (b)) by using several methods to replace the missing information. In particular, $HG^f_2$ replaces the missing values for fathers who live with their sons in at least one wave during the panel with the occupational prestige score directly reported by the fathers when their sons were 14 (or as close as possible to age 14). $HG^f_2$ may still have missing values because fathers were not working during the time they lived with their sons. Thus, $HG^f_3$ replaces the missing values in $HG^f_2$ with the average prestige scores computed over the entire fathers’ distribution. Finally, we tackle bias (c) by replacing the missing values on $HG^s_1$ for sons who are neither working nor in education with the minimum (sons’) score observed in the data. The new measure is labelled $HG^s_2$.

Although we acknowledge biases (b) and (c) and, to some extent, we do take them into account in estimation as explained above (and in the text), the main focus of our analysis is to assess the potential bias generated by the co-residence selection (bias (a)). For this purpose, we compare a number of different estimators which account for this type of selection with the Restricted Sample. By using the Full Sample we can then empirically gauge the size of the potential bias, and test the assumptions imposed by the various estimators. The aim of this Appendix is to discuss the assumptions underlying such estimators and their corresponding tests.
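As an illustration only, the replacement rules for missing prestige scores described above can be sketched in Python. The function and argument names are hypothetical, missing values are represented by `None`, and the average used for $HG^f_3$ is assumed here to be taken over the non-missing values of $HG^f_2$; none of this is taken from the code used in the paper.

```python
from statistics import mean


def impute_father_prestige(hgf1, father_reported):
    """Return (HGf2, HGf3) from son-reported scores and fathers' own reports.

    hgf1            : son-reported father scores, None when missing
    father_reported : scores reported by co-resident fathers, None otherwise
    """
    # HGf2: when the son's report is missing, fall back on the father's own report.
    hgf2 = [s if s is not None else f for s, f in zip(hgf1, father_reported)]
    # HGf3: fill the remaining gaps with the average over observed fathers' scores
    # (assumption: the average is computed on the non-missing HGf2 values).
    average = mean(v for v in hgf2 if v is not None)
    hgf3 = [v if v is not None else average for v in hgf2]
    return hgf2, hgf3


def impute_son_prestige(hgs1, neither_working_nor_in_education):
    """Return HGs2: idle sons with a missing score get the minimum observed score."""
    floor = min(v for v in hgs1 if v is not None)
    return [
        floor if (v is None and idle) else v
        for v, idle in zip(hgs1, neither_working_nor_in_education)
    ]
```

Sons whose score is missing for other reasons (for example, because they are still in education) keep a missing value under this sketch.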

In what follows we keep the notation introduced in Section 4, with the measures of socio-economic status for sons and fathers denoted by $y^s$ and $y^f$, respectively. We assume that $E(\varepsilon \mid y^f, A) = 0$ and $E(\eta \mid y^s, A) = 0$ so that, given a random sample, equations (2) and (5) can be properly estimated by OLS. Unfortunately, we can observe $y^f$ only if sons are observed to live with their father in at least one wave of the panel, which corresponds to saying that the co-residence indicator dummy $r$ takes value 1. Since $r$ may depend on observed and unobserved variables, the subsample of coresident sons, that is, our Restricted Sample, is not a random sample, and therefore the OLS estimation on the Restricted Sample can be inconsistent.

OLS estimation of (2) with the Restricted Sample is consistent if $E(\varepsilon r \mid y^f, A) = 0$. Then, a sufficient condition for consistency is $(y^s \perp\!\!\!\perp r \mid y^f, A)$, which is assumption A1 (see Table III). In fact, under this condition, we can write

$$E(\varepsilon r \mid y^f, A) = E(y^s_i - \alpha - \beta y^f_i - A_i'\gamma \mid y^f, A, r = 1)\,\Pr(r = 1 \mid y^f, A) = 0$$

Similarly, equation (5) is consistently estimated by OLS on the Restricted Sample if assumption A2 in Table III holds.
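The role of assumption A1 can be illustrated with a small simulation, which is a sketch under stated assumptions and not part of the original analysis. The data-generating process, the value $\beta = 0.3$, the two selection rules and all names below are hypothetical, and the covariates $A$ are omitted. Under the first rule, selection depends on $y^f$ and independent noise only, so that A1 holds; under the second rule, selection depends on $y^s$, so that A1 fails.

```python
import random


def ols_slope(x, y):
    """Slope of a simple OLS regression of y on x (with intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate_selection(n=40000, beta=0.3, seed=0):
    """Return OLS slopes on two selected subsamples of the same population."""
    rng = random.Random(seed)
    yf_a1, ys_a1 = [], []      # selection depends on yf only (A1 holds)
    yf_bad, ys_bad = [], []    # selection depends on ys (A1 fails)
    for _ in range(n):
        yf = rng.gauss(0.0, 1.0)
        ys = beta * yf + rng.gauss(0.0, 1.0)   # hypothetical version of equation (2)
        noise = rng.gauss(0.0, 1.0)            # unobserved part of the selection rule
        if yf + noise > 0:
            yf_a1.append(yf)
            ys_a1.append(ys)
        if ys + noise < 0:
            yf_bad.append(yf)
            ys_bad.append(ys)
    return ols_slope(yf_a1, ys_a1), ols_slope(yf_bad, ys_bad)
```

Calling `simulate_selection()` returns the two slopes. In this design the first should be close to the assumed $\beta$, whereas the second should be attenuated, which is the direction of bias discussed in the text.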


Two other sufficient conditions for consistency of the OLS estimator of (2) are $(\varepsilon \perp\!\!\!\perp Z \mid y^f, A)$ and $(\varepsilon \perp\!\!\!\perp r \mid y^f, A, Z)$. Thus,

$$E(\varepsilon r \mid y^f, A) = E_Z[E(\varepsilon \mid y^f, A, Z, r = 1)\,\Pr(r = 1 \mid y^f, A, Z)] = E_Z[E(\varepsilon \mid y^f, A)\,\Pr(r = 1 \mid y^f, A, Z)] = 0$$

where $E_Z$ is the mathematical expectation with respect to $Z$.

By analogy, alternative sufficient conditions for consistency of the OLS estimates of (5) are $(\eta \perp\!\!\!\perp r \mid y^s, A, Z)$, which is equivalent to assumption A7 in Table II and $(\eta \perp\!\!\!\perp Z \mid y^s, A)$, which corresponds to assumption A6 in the same table.

Condition A7 alone is sufficient for consistency of the PSW estimator for equation (5). In other words, the instrumental variables exclusion restriction A6 is not necessary for consistency of the PSW when A7 holds. This is because consistency of PSW is satisfied if

$$E\left[\frac{\eta\, r}{p(Z, y^s)} \,\Big|\, y^s, A\right] = 0,$$

where $p(\cdot)$ is the propensity score function. Since, in our application, the vector $A$ is included in $Z$, this is dropped from the propensity score, and $\Pr(r = 1 \mid y^s, A, Z) = p(y^s, Z)$. This moment condition holds if A7 is satisfied, and thus

$$E\left[\frac{\eta\, r}{p(Z, y^s)} \,\Big|\, y^s, A\right] = E_Z\left[E(\eta \mid y^s, A, Z, r = 1)\,\frac{\Pr(r = 1 \mid y^s, Z)}{p(y^s, Z)}\right] = E_Z E(\eta \mid y^s, A, Z) = 0$$

It should be emphasized that, when both A6 and A7 hold, the above moment condition is satisfied even if the propensity score is inconsistently estimated and $\Pr(r = 1 \mid y^s, Z) \neq p(y^s, Z)$.
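The moment condition above suggests weighting each selected observation by the inverse of its propensity score. The following Python sketch illustrates this on simulated data. It is a simplified illustration under stated assumptions: the propensity score is taken as known rather than estimated by a probit, the covariates $A$ are omitted, and the data-generating process, the coefficient `theta` and all function names are hypothetical. The error is correlated with `z`, so that an exclusion restriction in the spirit of A6 fails, while selection depends only on the regressor and `z`, so that a condition in the spirit of A7 holds.

```python
import math
import random


def norm_cdf(t):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def weighted_slope(x, y, w):
    """Slope of a weighted least squares regression of y on x (with intercept)."""
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    return sxy / sxx


def psw_demo(n=50000, theta=0.3, seed=1):
    """Return (unweighted, propensity-score-weighted) slopes on the selected sample."""
    rng = random.Random(seed)
    xs, ys, weights = [], [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)                 # regressor (plays the role of ys)
        z = rng.gauss(0.0, 1.0)                 # selection covariate
        eta = 0.8 * z + rng.gauss(0.0, 0.5)     # error correlated with z
        y = theta * x + eta
        p = 0.1 + 0.8 * norm_cdf(x + z)         # assumed known propensity score
        if rng.random() < p:                    # r = 1: observation is selected
            xs.append(x)
            ys.append(y)
            weights.append(1.0 / p)
    unweighted = weighted_slope(xs, ys, [1.0] * len(xs))
    weighted = weighted_slope(xs, ys, weights)
    return unweighted, weighted
```

In this design the unweighted slope on the selected sample should fall clearly below `theta`, while the weighted slope should be close to it. The propensity score is bounded away from zero by construction, which keeps the weights from becoming extreme.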

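The PSW estimator for equation (2) is consistent if both $(r \perp\!\!\!\perp y^s \mid A, Z)$ and $(y^f \perp\!\!\!\perp r \mid y^s, A, Z)$ (i.e., assumption A3) hold, or, alternatively, if both $(r \perp\!\!\!\perp y^s \mid Z, A, y^f)$ (which is equivalent to assumption A5) and $(y^f \perp\!\!\!\perp r \mid A, Z)$ (i.e., assumption A8) hold.45

Formally, this means that:

$$E\left(\frac{\varepsilon\, r}{p(Z)} \,\Big|\, y^f, A\right) = E_Z\left[E\left(\frac{\varepsilon}{p(Z)} \,\Big|\, y^f, A, Z, r = 1\right)\Pr(r = 1 \mid y^f, Z)\right] = E_Z E(\varepsilon \mid y^f, A, Z, r = 1) = 0$$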

As in the case of equation (5), the PSW estimator for (2) does not require the instrumental variables exclusion restriction (i.e., assumption A4).

The use of a proper instrumental variables exclusion assumption is instead necessary for the empirical identification of the remaining four correction methods used in our analysis, namely, ML, TS, PSS, TSN. For equation (2) this assumption is given by A4, whereas for equation (5) it is given by A6.

When estimating (2), ML, TS, PSS, TSN allow for dependence of the selection process on $y^s$ and $Z$. However, they impose the independence condition $(r \perp\!\!\!\perp y^f \mid y^s, Z)$, which is assumption A3 imposed by PSW too.

45 Notice that A7 is equivalent to A3 (and thus in Table A.I the test results for the two assumptions are identical).


Furthermore, the ML, TS, PSS, TSN estimators assume, when applied to equation (2), the separability condition $(\varepsilon \perp\!\!\!\perp A, y^f, Z \mid m(Z), r = 1)$, and, when applied to equation (5), the condition $(\eta \perp\!\!\!\perp A, y^s, Z \mid h(Z, y^s), r = 1)$. These conditions allow one to account for sample selection in (2) and (5) by adding an additional (separable) explanatory term, $m(\cdot)$ or $h(\cdot)$. Specifically, $m(Z)$ and $h(Z, y^s)$ take the form of the inverse Mills ratio for ML and TS, a polynomial expression proportional to the inverse Mills ratio for TSN, and the propensity score for PSS.

Finally, ML and TS impose also a distributional assumption, requiring that the errors in the intergenerational equation and in the selection model are jointly distributed as a bivariate normal.
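To make the role of the inverse Mills ratio concrete, the sketch below computes $\lambda(t) = \phi(t)/\Phi(t)$ and checks by simulation the standard bivariate normal result that, when selection occurs if an index plus a standard normal error $u$ is positive and $\operatorname{corr}(\varepsilon, u) = \rho$ with unit variances, the mean of $\varepsilon$ among selected units equals $\rho\,\lambda(\text{index})$. This is an illustration under assumed parameter values, with hypothetical function names, and is not the estimation code used in the paper.

```python
import math
import random


def inverse_mills(t):
    """Inverse Mills ratio: standard normal density over distribution function."""
    pdf = math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))
    return pdf / cdf


def mean_error_among_selected(index=0.5, rho=0.6, n=40000, seed=2):
    """Simulated mean of the outcome error for units with index + u > 0."""
    rng = random.Random(seed)
    total, count = 0.0, 0
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)                                  # selection error
        eps = rho * u + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        if index + u > 0:                                        # r = 1
            total += eps
            count += 1
    return total / count
```

Comparing `mean_error_among_selected()` with `0.6 * inverse_mills(0.5)` shows why adding the ratio as a regressor can absorb the selection term when the bivariate normality assumption holds.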

The availability of the Full Sample, which is arguably unaffected by the co-residence selection (bias (a)), allows us to compute straightforward tests for all these assumptions, which are summarized in Table II. Without the Full Sample, such assumptions would be falsifiable only by imposing other untestable assumptions. Assumptions A1, A2, A3, A5, A7, and A8 are all conditional independence assumptions between the co-residence indicator dummy $r$ and some variables, say $G$, conditional on a set of other variables, say $V$, namely, $(r \perp\!\!\!\perp G \mid V)$. We verify such independence assumptions by estimating probit models of $r$ as a function of $G$ and $V$, and testing whether the coefficients of $G$ are significantly different from zero. The instrumental variables exclusion assumptions A4 and A6 are instead tested by estimating equations (2) and (5), respectively, with the instrumental variables added as explanatory variables and checking whether their coefficients are jointly significantly different from zero. The numerical results of all such tests are reported in Table A.I.
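The probit-based test of a conditional independence assumption $(r \perp\!\!\!\perp G \mid V)$ can be sketched as follows. This is a minimal illustration with one variable in $G$ and one in $V$, a hand-written Fisher scoring routine in place of a statistical package, and simulated data; the function names and the data-generating process are hypothetical and not taken from the paper.

```python
import math
import random


def norm_pdf(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def norm_cdf(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    k = len(matrix)
    a = [row[:] + [1.0 if i == j else 0.0 for j in range(k)]
         for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        a[col] = [v / a[col][col] for v in a[col]]
        for row in range(k):
            if row != col:
                factor = a[row][col]
                a[row] = [vr - factor * vc for vr, vc in zip(a[row], a[col])]
    return [row[k:] for row in a]


def probit_fit(X, r, iterations=15):
    """Probit by Fisher scoring; returns coefficients and standard errors."""
    k = len(X[0])
    b = [0.0] * k
    for _ in range(iterations):
        score = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for xi, ri in zip(X, r):
            t = sum(bj * xj for bj, xj in zip(b, xi))
            prob = min(max(norm_cdf(t), 1e-10), 1.0 - 1e-10)
            dens = norm_pdf(t)
            g = dens * (ri - prob) / (prob * (1.0 - prob))
            h = dens * dens / (prob * (1.0 - prob))
            for j in range(k):
                score[j] += g * xi[j]
                for m in range(k):
                    info[j][m] += h * xi[j] * xi[m]
        cov = invert(info)   # inverse expected information
        b = [bj + sum(cov[j][m] * score[m] for m in range(k))
             for j, bj in enumerate(b)]
    return b, [math.sqrt(cov[j][j]) for j in range(k)]


def independence_z_stat(G, V, r):
    """z-statistic on G in a probit of r on a constant, V and G."""
    X = [[1.0, v, g] for g, v in zip(G, V)]
    b, se = probit_fit(X, r)
    return b[2] / se[2]


def simulate(depends_on_g, n=4000, seed=3):
    """Simulated co-residence indicator that may or may not depend on G."""
    rng = random.Random(seed)
    G, V, r = [], [], []
    for _ in range(n):
        g, v = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        index = 0.5 * v + (0.5 * g if depends_on_g else 0.0)
        G.append(g)
        V.append(v)
        r.append(1 if index + rng.gauss(0.0, 1.0) > 0 else 0)
    return G, V, r
```

A large absolute value of `independence_z_stat(*simulate(True))` signals a rejection of conditional independence, whereas the statistic for `simulate(False)` should be comparable in size to a standard normal draw.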

Finally, we verify the normality assumption imposed by the probit selection model by modifying the score test proposed in Machin and Stewart (1990) for ordered probit models, which in turn modifies the score test for a grouped dependent variable introduced by Chesher and Irish (1987). When the selection model includes $Z$, we find that normality is never rejected at standard levels of significance. Conversely, excluding $Z$ and including only $y^s$, $y^f$, and $A$ will generally lead to a rejection of the normality assumption (see Francesconi and Nicoletti, 2004, for the numerical values of such tests).

Table A.I. Tests on the assumptions imposed by different estimators and normality

| | (a) Assumption A1 | (b) Assumption A2 | (c) Assumption A3 | (d) Assumption A4 |
|---|---|---|---|---|
| $HG^s_1$ and $HG^f_1$ | −2.06 [0.04] | −1.92 [0.06] | −0.97 [0.33] | 2.17 [0.01] |
| $HG^s_1$ and $HG^f_2$ | −1.92 [0.06] | −0.95 [0.34] | 0.04 [0.97] | 1.87 [0.02] |
| $HG^s_1$ and $HG^f_3$ | −1.51 [0.13] | −1.21 [0.23] | −0.48 [0.63] | 3.07 [0.00] |

| | (e) Assumption A5 | (f) Assumption A6 | (g) Assumption A7 | (h) Assumption A8 |
|---|---|---|---|---|
| $HG^s_1$ and $HG^f_1$ | −2.79 [0.01] | 1.69 [0.04] | −0.97 [0.33] | −2.48 [0.01] |
| $HG^s_1$ and $HG^f_2$ | −2.19 [0.03] | 2.19 [0.00] | 0.04 [0.97] | −2.26 [0.02] |
| $HG^s_1$ and $HG^f_3$ | −1.50 [0.13] | 2.23 [0.00] | −0.48 [0.63] | −2.05 [0.04] |

Note: p-values of the tests are in brackets. For the definition of each assumption and the estimator they apply to, see Table III.


Table A.II. Selection equation

| Variable | Model 1 Coeff. (SE) | Model 2 Coeff. (SE) | Model 3 Coeff. (SE) |
|---|---|---|---|
| Year of birth | | | |
| 1966–70 | 0.925 (0.238) | 1.042 (0.242) | 1.006 (0.235) |
| 1971–75 | 1.694 (0.238) | 1.757 (0.242) | 1.773 (0.235) |
| 1981–85 | 1.629 (0.330) | 1.529 (0.334) | 1.750 (0.321) |
| House price (log) | −2.174 (0.290) | −2.216 (0.295) | −2.265 (0.291) |
| Region of residence | | | |
| Rest of South East | −0.400 (0.170) | −0.471 (0.173) | −0.411 (0.170) |
| South West | −0.693 (0.215) | −0.833 (0.221) | −0.784 (0.216) |
| Anglia and Midlands | −1.161 (0.224) | −1.298 (0.230) | −1.282 (0.226) |
| North West | −1.100 (0.247) | −1.245 (0.253) | −1.199 (0.249) |
| Rest of North | −0.958 (0.237) | −1.107 (0.243) | −1.081 (0.239) |
| Wales | −1.270 (0.297) | −1.435 (0.303) | −1.367 (0.298) |
| Scotland | −1.285 (0.251) | −1.406 (0.257) | −1.377 (0.251) |
| Ethnic origin | | | |
| Black | −0.476 (0.412) | −0.568 (0.414) | −0.555 (0.399) |
| Indian | 1.155 (0.394) | 1.139 (0.387) | 1.198 (0.390) |
| Pakistani/Bangladeshi | 0.301 (0.432) | 0.283 (0.440) | 0.497 (0.408) |
| Other | −0.426 (0.460) | −0.378 (0.475) | −0.353 (0.420) |
| Religious attendance | | | |
| Protestant | −0.194 (0.135) | −0.217 (0.136) | −0.206 (0.132) |
| Catholic | −0.272 (0.187) | −0.248 (0.189) | −0.214 (0.186) |
| Other denomination | 1.131 (0.356) | 1.191 (0.364) | 1.233 (0.353) |
| log $HG^s_1$ | | −0.789 (0.155) | |
| log $HG^s_2$ | | | −0.412 (0.128) |
| Constant | 23.355 (3.343) | 26.829 (3.479) | 25.878 (3.413) |
| N | 1035 | 1035 | 1092 |
| $\chi^2$ [p-value] | 226.5 [0.000] | 252.7 [0.000] | 259.0 [0.000] |
| Pseudo $R^2$ | 0.167 | 0.186 | 0.182 |

Note: Estimates are obtained from probit models. N is the number of sons.

ACKNOWLEDGEMENTS

We thank Jo Blanden, Nick Buck, Simona Comi, Miles Corak, Christian Dustmann, John Ermisch, Stephen Jenkins, Dean Lillard, Robert Moffitt, Paul Schultz, Wilbert Van der Klaauw, the Co-editor (Herman van Dijk), and two anonymous referees for their valuable comments on earlier drafts of the paper. The financial support of the Economic and Social Research Council under the ‘Intergenerational mobility in Britain: Does sample selection matter?’ Award No. RES-000-23-0185 is gratefully acknowledged.

REFERENCES

Atkinson AF, Maynard AK, Trinder CG. 1983. Parents and Children: Incomes in Two Generations. Heinemann: London.

Becker GS, Tomes N. 1979. An equilibrium theory of the distribution of income and intergenerational mobility. Journal of Political Economy 87: 1153–1189.

Becker GS, Tomes N. 1986. Human capital and the rise and fall of families. Journal of Labor Economics 4: S1–S39.


Bjorklund A, Jantti M. 1997. Intergenerational income mobility in Sweden compared to the United States. American Economic Review 87: 1009–1018.

Blanden J, Goodman A, Gregg P, Machin S. 2004. Changes in intergenerational mobility in Britain. In Generational Income Mobility in North America and Europe, Corak M (ed.). Cambridge University Press: Cambridge, UK; 122–146.

Bowles S, Gintis H. 2002. The inheritance of inequality. Journal of Economic Perspectives 16: 3–30.

Chadwick L, Solon G. 2002. Intergenerational income mobility among daughters. American Economic Review 92: 335–344.

Cherlin AJ. 1992. Marriage, Divorce and Remarriage. Harvard University Press: Cambridge, MA.

Chesher A, Irish M. 1987. Residual analysis in the grouped and censored normal linear model. Journal of Econometrics 34: 33–61.

Corak M, Heisz A. 1999. The intergenerational earnings and income mobility of Canadian men: evidence from longitudinal income tax data. Journal of Human Resources 34: 504–533.

Cosslett S. 1991. Semiparametric estimation of a regression model with sample selectivity. In Nonparametric and Semiparametric Methods in Econometrics and Statistics, Barnett WA, Powell J, Tauchen G (eds). Cambridge University Press: Cambridge, UK; 175–197.

Couch KA, Dunn TA. 1997. Intergenerational correlations in labor market status: a comparison of the United States and Germany. Journal of Human Resources 32: 210–232.

Couch KA, Lillard DR. 1998. Sample selection rules and the intergenerational correlation of earnings. Labour Economics 5: 313–329.

Dearden L, Machin S, Reed H. 1997. Intergenerational mobility in Britain. Economic Journal 107: 47–66.

Erikson R, Goldthorpe JH. 2002. Intergenerational inequality: a sociological perspective. Journal of Economic Perspectives 16: 31–44.

Ermisch J. 1999. Prices, parents, and young people’s household formation. Journal of Urban Economics 45: 47–71.

Ermisch J, Di Salvo P. 1997. The economic determinants of young people’s household formation. Economica 64: 627–644.

Ermisch J, Francesconi M. 2000. Patterns of household and family formation. In Seven Years in the Lives of British Families, Berthoud R, Gershuny J (eds). Policy Press: Bristol; 21–44.

Ermisch J, Francesconi M. 2004. Intergenerational mobility in Britain: new evidence from the BHPS. In Generational Income Mobility in North America and Europe, Corak M (ed.). Cambridge University Press: Cambridge, UK; 147–189.

Ermisch J, Francesconi M, Siedler T. Intergenerational economic mobility and assortative mating. Economic Journal (forthcoming).

Fitzgerald J, Gottschalk P, Moffitt R. 1998. An analysis of the impact of sample attrition on the second generation of respondents in the Michigan Panel Study of Income Dynamics. Journal of Human Resources 33: 300–344.

Francesconi M, Nicoletti C. 2004. Intergenerational mobility and sample selection in short panels. Institute for Social and Economic Research, University of Essex, Colchester, Working Paper No. 2004-17, September.

Frisch R, Waugh FV. 1933. Partial time regressions as compared with individual trends. Econometrica 1: 387–401.

Fuller WA. 1987. Measurement Error Models. Wiley: New York.

Goldberger AS. 1989. Economic and mechanical models of intergenerational transmission. American Economic Review 79: 504–513.

Goldthorpe JH, Hope K. 1974. The Social Grading of Occupations. Clarendon Press: Oxford.

Grawe ND. 2006. The extent of lifecycle bias in estimates of intergenerational earnings persistence. Labour Economics (forthcoming).

Gruber J. 2004. Is making divorce easier bad for children? The long-run implications of unilateral divorce. Journal of Labor Economics 22: 799–833.

Haurin DR, Hendershott PH, Kim D. 1994. Housing decisions of American youth. Journal of Urban Economics 35: 28–44.

Heckman J. 1979. Sample selection as a specification error. Econometrica 47: 153–161.

Jenkins SP. 1987. Snapshots versus movies: ‘lifecycle bias’ and the estimation of intergenerational earnings inheritance. European Economic Review 31: 1149–1158.


Lee L–F. 1982. Some approaches to the correction of selectivity bias. Review of Economic Studies 49: 355–372.

Lee L–F. 1984. Tests for the bivariate normal distribution in econometric models with selectivity. Econometrica 52: 843–863.

Machin SJ, Stewart MB. 1990. Union and financial performance of British private sector establishment. Journal of Applied Econometrics 5: 327–350.

Mazumder B. 2005. Fortunate sons: new estimates of intergenerational mobility in the United States using social security earnings data. Review of Economics and Statistics 87: 235–255.

Minicozzi AL. 2003. Estimation of sons’ intergenerational earnings mobility in the presence of censoring. Journal of Applied Econometrics 18: 291–314.

Newey WK. 1999. Two step series estimator of Heckman selection model. Mimeo, Department of Economics, Massachusetts Institute of Technology.

Nickell S. 1982. The determinants of the occupational success in Britain. Review of Economic Studies 49: 45–53.

Page ME, Solon G. 2003. Correlations between sisters and neighbouring girls in their subsequent income as adults. Journal of Applied Econometrics 18: 545–562.

Phelps Brown H. 1977. The Inequality of Pay. Oxford University Press: New York.

Robins J, Rotnitzky A. 1995. Semiparametric efficiency in multivariate regression models with missing data. Journal of the American Statistical Association 90: 122–129.

Rosenbaum PR, Rubin DB. 1983. The central role of the propensity score in observational studies for causal effects. Biometrika 70: 41–55.

Shea J. 2000. Does parents’ money matter? Journal of Public Economics 77: 155–184.

Solon G. 1992. Intergenerational income mobility in the United States. American Economic Review 82: 393–408.

Solon G. 1999. Intergenerational mobility in the labor market. In Handbook of Labor Economics, Vol. 3, Ch. 29, Ashenfelter O, Card D (eds). Elsevier: Amsterdam; 1761–1800.

Taylor MF. (ed.) with Brice J, Buck N, Prentice-Lane E. 2002. British Household Panel Survey User Manual Volume A: Introduction, Technical Report and Appendices. University of Essex: Colchester, UK.

Vella F. 1998. Estimating models with sample selection bias: a survey. Journal of Human Resources 33: 127–169.

Wilcox WB. 2002. Then comes marriage? Religion, race, and marriage in urban America. Working paper 2002-13-FF. Center for Research on Child Wellbeing, Princeton University: Princeton, NJ.

Wooldridge JM. 2002. Inverse probability weighted M-estimators for sample selection, attrition, and stratification. Portuguese Economic Journal 1: 117–139.

Zimmerman DJ. 1992. Regression toward mediocrity in economic stature. American Economic Review 82: 409–429.
