THE
QUARTERLY JOURNAL OF ECONOMICS
Vol. 129 November 2014 Issue 4
WHERE IS THE LAND OF OPPORTUNITY? THE GEOGRAPHY OF INTERGENERATIONAL MOBILITY IN THE UNITED STATES*
Raj Chetty
Nathaniel Hendren
Patrick Kline
Emmanuel Saez
We use administrative records on the incomes of more than 40 million children and their parents to describe three features of intergenerational mobility in the United States. First, we characterize the joint distribution of parent and child income at the national level. The conditional expectation of child income given parent income is linear in percentile ranks. On average, a 10 percentile increase in parent income is associated with a 3.4 percentile increase
*The opinions expressed in this article are those of the authors alone and do not necessarily reflect the views of the Internal Revenue Service or the U.S. Treasury Department. This work is a component of a larger project examining the effects of tax expenditures on the budget deficit and economic activity. All results based on tax data in this article are constructed using statistics originally reported in the SOI Working Paper “The Economic Impacts of Tax Expenditures: Evidence from Spatial Variation across the U.S.,” approved under IRS contract TIRNO-12-P-00374 and presented at the National Tax Association meeting on November 22, 2013. We thank David Autor, Gary Becker, David Card, David Dorn, John Friedman, James Heckman, Nathaniel Hilger, Richard Hornbeck, Lawrence Katz, Sara Lalumia, Adam Looney, Pablo Mitnik, Jonathan Parker, Laszlo Sandor, Gary Solon, Danny Yagan, numerous seminar participants, and four anonymous referees for helpful comments. Sarah Abraham, Alex Bell, Shelby Lin, Alex Olssen, Evan Storms, Michael Stepner, and Wentao Xiong provided outstanding research assistance. This research was funded by the National Science Foundation, the Lab for Economic Applications and Policy at Harvard, the Center for Equitable Growth at UC-Berkeley, and the Laura and John Arnold Foundation. Publicly available portions of the data and code, including intergenerational mobility statistics by commuting zone and county, are available at http://www.equality-of-opportunity.org.
© The Author(s) 2014. Published by Oxford University Press, on behalf of President and Fellows of Harvard College. All rights reserved. For Permissions, please email: [email protected]. The Quarterly Journal of Economics (2014), 1553–1623. doi:10.1093/qje/qju022. Advance Access publication on September 14, 2014.
Downloaded from http://qje.oxfordjournals.org/ at University of Ottawa on January 9, 2015
in a child’s income. Second, intergenerational mobility varies substantially across areas within the United States. For example, the probability that a child reaches the top quintile of the national income distribution starting from a family in the bottom quintile is 4.4% in Charlotte but 12.9% in San Jose. Third, we explore the factors correlated with upward mobility. High mobility areas have (i) less residential segregation, (ii) less income inequality, (iii) better primary schools, (iv) greater social capital, and (v) greater family stability. Although our descriptive analysis does not identify the causal mechanisms that determine upward mobility, the publicly available statistics on intergenerational mobility developed here can facilitate research on such mechanisms.
JEL Codes: H0, J0, R0.
I. Introduction
The United States is often hailed as the “land of opportunity,” a society in which a person’s chances of success depend little on his or her family background. Is this reputation warranted? We show that this question does not have a clear answer because there is substantial variation in intergenerational mobility across areas within the United States. The United States is better described as a collection of societies, some of which are “lands of opportunity” with high rates of mobility across generations, and others in which few children escape poverty.
We characterize intergenerational mobility using information from deidentified federal income tax records, which provide data on the incomes of more than 40 million children and their parents between 1996 and 2012. We organize our analysis into three parts.
In the first part, we present new statistics on intergenerational mobility in the United States as a whole. In our baseline analysis, we focus on U.S. citizens in the 1980–1982 birth cohorts, the oldest children in our data for whom we can reliably identify parents based on information on dependent claiming. We measure these children’s income as mean total family income in 2011 and 2012, when they are approximately 30 years old. We measure their parents’ income as mean family income between 1996 and 2000, when the children are between the ages of 15 and 20.1
1. We show that our baseline measures do not suffer from significant life cycle or attenuation bias (Solon 1992; Zimmerman 1992; Mazumder 2005) by establishing that estimates of mobility stabilize by the time children reach age 30 and are not very sensitive to the number of years used to measure parent income.
Following the prior literature (e.g., Solon 1999), we begin by estimating the intergenerational elasticity of income (IGE) by regressing log child income on log parent income. Unfortunately, we find that this canonical log-log specification yields very unstable estimates of mobility because the relationship between log child income and log parent income is nonlinear and the estimates are sensitive to the treatment of children with zero or very small incomes. When restricting the sample between the 10th and 90th percentiles of the parent income distribution and excluding children with zero income, we obtain an IGE estimate of 0.45. However, alternative specifications yield IGEs ranging from 0.26 to 0.70, spanning most of the estimates in the prior literature.2
To obtain a more stable summary of intergenerational mobility, we use a rank-rank specification similar to that used by Dahl and DeLeire (2008). We rank children based on their incomes relative to other children in the same birth cohort. We rank parents of these children based on their incomes relative to other parents with children in these birth cohorts. We characterize mobility based on the slope of this rank-rank relationship, which identifies the correlation between children’s and parents’ positions in the income distribution.3
We find that the relationship between mean child ranks and parent ranks is almost perfectly linear and highly robust to alternative specifications. A 10 percentile point increase in parent rank is associated with a 3.41 percentile increase in a child’s income rank on average. Children’s college attendance and teenage birth rates are also linearly related to parent income ranks. A 10 percentile point increase in parent income is associated with a 6.7 percentage point (pp) increase in college attendance rates and a 3 pp reduction in teenage birth rates for women.
In the second part of the article, we characterize variation in intergenerational mobility across commuting zones (CZs). Commuting zones are geographical aggregations of counties that are similar to metro areas but cover the entire United States,
2. In an important recent study, Mitnik et al. (2014) propose a new dollar-weighted measure of the IGE and show that it yields more stable estimates. We discuss the differences between the new measure of mobility proposed by Mitnik et al. and the canonical definition of the IGE in Section IV.A.
3. The rank-rank slope and IGE both measure the degree to which differences in children’s incomes are determined by their parents’ incomes. We discuss the conceptual differences between the two measures in Section II.
including rural areas (Tolbert and Sizer 1996). We assign children to CZs based on where they lived at age 16 (that is, where they grew up), irrespective of whether they left that CZ afterward. When analyzing CZs, we continue to rank both children and parents based on their positions in the national income distribution, which allows us to measure children’s absolute outcomes as we discuss later.
The relationship between mean child ranks and parent ranks is almost perfectly linear within CZs, allowing us to summarize the conditional expectation of a child’s rank given his parents’ rank with just two parameters: a slope and intercept. The slope measures relative mobility: the difference in outcomes between children from top versus bottom income families within a CZ. The intercept measures the expected rank for children from families at the bottom of the income distribution. Combining the intercept and slope for a CZ, we can calculate the expected rank of children from families at any given percentile p of the national parent income distribution. We call this measure absolute mobility at percentile p. Measuring absolute mobility is valuable because increases in relative mobility have ambiguous normative implications, as they may be driven by worse outcomes for the rich rather than better outcomes for the poor.
We find substantial variation in both relative and absolute mobility across CZs. Relative mobility is lowest for children who grew up in the Southeast and highest in the Mountain West and the rural Midwest. Some CZs in the United States have relative mobility comparable to the highest mobility countries in the world, such as Canada and Denmark, while others have lower levels of mobility than any developed country for which data are available.
We find similar geographical variation in absolute mobility. We focus much of our analysis on absolute mobility at p = 25, which we call “absolute upward mobility.” This statistic measures the mean income rank of children with parents in the bottom half of the income distribution given linearity of the rank-rank relationship. Absolute upward mobility ranges from 35.8 in Charlotte to 46.2 in Salt Lake City among the 50 largest CZs. A 1 standard deviation increase in CZ-level upward mobility is associated with a 0.2 standard deviation improvement in a child’s expected rank given parents at p = 25, 60% as large as the effect of a 1 standard deviation increase in his own parents’ income. Other measures of upward mobility exhibit similar spatial variation. For instance,
the probability that a child reaches the top fifth of the income distribution conditional on having parents in the bottom fifth is 4.4% in Charlotte, compared with 10.8% in Salt Lake City and 12.9% in San Jose. The CZ-level mobility statistics are robust to adjusting for differences in the local cost of living, shocks to local growth, and using alternative measures of income.
Absolute upward mobility is highly correlated with relative mobility: areas with high levels of relative mobility (low rank-rank slopes) tend to have better outcomes for children from low-income families. On average, children from families below percentile p = 85 have better outcomes when relative mobility is greater; those above p = 85 have worse outcomes. Location matters more for children growing up in low-income families: the expected rank of children from low-income families varies more across CZs than the expected rank of children from high-income families.
The spatial patterns of the gradients of college attendance and teenage birth rates with respect to parent income across CZs are very similar to the variation in intergenerational income mobility. This suggests that the spatial differences in mobility are driven by factors that affect children while they are growing up rather than after they enter the labor market.
In the final part of the article, we explore such factors by correlating the spatial variation in mobility with observable characteristics. To begin, we show that upward income mobility is significantly lower in areas with larger African American populations. However, white individuals in areas with large African American populations also have lower rates of upward mobility, implying that racial shares matter at the community level.
We then identify five factors that are strongly correlated with the variation in upward mobility across areas. The first is segregation: areas that are more residentially segregated by race and income have lower levels of mobility. Second, areas with more inequality as measured by Gini coefficients have less mobility, consistent with the “Great Gatsby curve” documented across countries (Krueger 2012; Corak 2013). Top 1% income shares are not highly correlated with intergenerational mobility either across CZs within the United States or across countries, suggesting that the factors that erode the middle class may hamper intergenerational mobility more than the factors that lead to income growth in the upper tail. Third, proxies for the quality of the K-12 school system are positively correlated with mobility.
Fourth, social capital indexes (Putnam 1995), which are proxies for the strength of social networks and community involvement in an area, are also positively correlated with mobility. Finally, mobility is significantly lower in areas with weaker family structures, as measured, for example, by the fraction of single parents. As with race, parents’ marital status does not matter purely through its effects at the individual level. Children of married parents also have higher rates of upward mobility in communities with fewer single parents. Interestingly, we find no correlation between racial shares and upward mobility once we control for the fraction of single parents in an area.
We find modest correlations between upward mobility and local tax policies and no systematic correlation between mobility and local labor market conditions, rates of migration, or access to higher education. In a multivariable regression, the five key factors described above generally remain statistically significant predictors of both relative and absolute upward mobility, even in specifications with state fixed effects. However, we emphasize that these factors should not be interpreted as causal determinants of mobility because all of these variables are endogenously determined and our analysis does not control for numerous other unobserved differences across areas.
Our results build on an extensive literature on intergenerational mobility, reviewed by Solon (1999) and Black and Devereux (2011). Our estimates of the level of mobility in the United States as a whole are broadly consistent with prior results, with the exception of Mazumder’s (2005) and Clark’s (2014) IGE estimates, which imply much lower levels of intergenerational mobility. We discuss why our findings may differ from their results in Online Appendices D and E. Our focus on within-country comparisons offers two advantages over the cross-country comparisons that have been the focus of prior comparative work (e.g., Björklund and Jäntti 1997; Jäntti et al. 2006; Corak 2013). First, differences in measurement and methods make it difficult to reach definitive conclusions from cross-country comparisons (Solon 2002). The variables we analyze are measured using the same data sources across all CZs. Second, and more important, we characterize both relative and absolute mobility across CZs. The cross-country literature has focused exclusively on differences in relative mobility; much less is known about how the prospects of children from low-income families vary across countries when measured on a common absolute scale (Ray 2010).
Our analysis also relates to the literature on neighborhood effects, reviewed by Jencks and Mayer (1990) and Sampson et al. (2002). Unlike recent experimental work on neighborhood effects (e.g., Katz, Kling, and Liebman 2001; Oreopoulos 2003), our descriptive analysis does not shed light on whether the differences in outcomes across areas are due to the causal effect of neighborhoods or differences in the characteristics of people living in those neighborhoods. However, in a follow-up paper, Chetty and Hendren (2014) show that a substantial portion of the spatial variation documented here is driven by causal effects of place by studying families that move across areas with children of different ages.
The article is organized as follows. We begin in Section II by defining the measures of intergenerational mobility that we study and discussing their conceptual properties. Section III describes the data. Section IV reports estimates of intergenerational mobility at the national level. In Section V, we present estimates of absolute and relative mobility by commuting zone. Section VI reports correlations of our mobility measures with observable characteristics of commuting zones. Section VII concludes. Statistics on intergenerational mobility and related covariates are publicly available by commuting zone, metropolitan statistical area, and county on the project website (www.equality-of-opportunity.org).
II. Measures of Intergenerational Mobility
At the most general level, studies of intergenerational mobility seek to measure the degree to which a child’s social and economic opportunities depend on his parents’ income or social status. Because opportunities are difficult to measure, virtually all empirical studies of mobility measure the extent to which a child’s income (or occupation) depends on his parents’ income (or occupation).4 Following this approach, we aim to characterize the
4. This simplification is not innocuous, as a child’s realized income may differ from his opportunities. For instance, children of wealthy parents may choose not to work or may choose lower-paying jobs, which would reduce the persistence of income across generations relative to the persistence of underlying opportunities.
joint distribution of a child’s lifetime pretax family income ($Y_i$), and his parents’ lifetime pretax family income ($X_i$).5
In large samples, one can characterize the joint distribution of $(Y_i, X_i)$ nonparametrically, and we provide such a characterization in the form of a 100 × 100 centile transition matrix below. However, to provide a parsimonious summary of the degree of mobility and compare rates of mobility across areas, it is useful to characterize the joint distribution using a small set of statistics. We divide measures of mobility into two classes that capture different normative concepts: relative mobility and absolute mobility. In this section, we define a set of statistics that we use to measure these two concepts empirically and compare their conceptual properties.
II.A. Relative Mobility
One way to study intergenerational mobility is to ask, “What are the outcomes of children from low-income families relative to those of children from high-income families?” This question, which focuses on the relative outcomes of children from different parental backgrounds, has been the subject of most prior research on intergenerational mobility (Solon 1999; Black and Devereux 2011).
The canonical measure of relative mobility is the elasticity of child income with respect to parent income, $\frac{dE[\log Y_i \mid X_i = x]}{d \log x}$, commonly called the intergenerational income elasticity (IGE). The most common method of estimating the IGE is to regress log child income ($\log Y_i$) on log parent income ($\log X_i$), which yields a coefficient of

$$\mathrm{IGE} = \rho_{XY} \frac{SD(\log Y_i)}{SD(\log X_i)}, \qquad (1)$$

where $\rho_{XY} = \mathrm{Corr}(\log X_i, \log Y_i)$ is the correlation between log child income and log parent income and $SD(\cdot)$ denotes the standard deviation. The IGE is a relative mobility measure because it measures the difference in (log) outcomes between children of high versus low income parents.
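Equation (1) is the standard identity between an OLS slope and a correlation scaled by a ratio of standard deviations. The following minimal Python sketch checks it numerically. The data are synthetic draws chosen for illustration only (the means, variances, and the 0.4 slope are assumptions, not estimates from the tax data), and the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Synthetic log incomes (illustrative assumptions, not the paper's data).
log_x = rng.normal(10.5, 0.8, n)                     # log parent income
log_y = 6.0 + 0.4 * log_x + rng.normal(0.0, 0.7, n)  # log child income

# IGE as the OLS slope from regressing log child income on log parent income.
ige_ols = np.cov(log_x, log_y, ddof=1)[0, 1] / np.var(log_x, ddof=1)

# IGE from equation (1): correlation times the ratio of standard deviations.
rho_xy = np.corrcoef(log_x, log_y)[0, 1]
ige_eq1 = rho_xy * np.std(log_y, ddof=1) / np.std(log_x, ddof=1)

print(ige_ols, ige_eq1)  # the two numbers agree up to floating-point error
```

Because the identity is algebraic, the two quantities coincide in any sample; only the value they share depends on the simulated data.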
5. If taxes and transfers do not generate rank reversals (as is typically the case in practice), using post-tax income instead of pretax income would have no effect on our preferred rank-based measures of mobility. See Mitnik et al. (2014) for a comparison of pretax and post-tax measures of the IGE of income.
An alternative measure of relative mobility is the correlation between child and parent ranks (Dahl and DeLeire 2008). Let $R_i$ denote child i’s percentile rank in the income distribution of children and $P_i$ denote parent i’s percentile rank in the income distribution of parents. Regressing the child’s rank $R_i$ on his parents’ rank $P_i$ yields a regression coefficient $\rho_{PR} = \mathrm{Corr}(P_i, R_i)$, which we call the rank-rank slope.6 The rank-rank slope $\rho_{PR}$ measures the association between a child’s position in the income distribution and his parents’ position in the distribution.
To understand the connection between the IGE and the rank-rank slope, note that the correlation of log incomes $\rho_{XY}$ and the correlation of ranks $\rho_{PR}$ are closely related scale-invariant measures of the degree to which child income depends on parent income.7 Hence, equation (1) implies that the IGE combines the dependence features captured by the rank-rank slope with the ratio of standard deviations of income across generations.8 The IGE differs from the rank-rank slope to the extent that inequality changes across generations. Intuitively, a given increase in parents’ incomes has a greater effect on the level of children’s incomes when inequality is greater among children than among parents.
We estimate both the IGE and the rank-rank slope to distinguish differences in mobility from differences in inequality and to provide a comparison to the prior literature. However, we focus primarily on rank-rank slopes because they prove to be much more robust across specifications and are thus more suitable for comparisons across areas from a statistical perspective.
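To make the rank-rank slope concrete, here is a minimal Python sketch on simulated data. It assumes bivariate log-normal incomes with a log-income correlation of 0.5 (an arbitrary illustrative value), converts incomes to ranks, and checks two properties stated in the text and its footnotes: the regression slope of child rank on parent rank equals the rank correlation, and under log-normality the rank correlation is close to $6 \arcsin(\rho_{XY}/2)/\pi$. The helper `percentile_rank` is a hypothetical name.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
rho_xy = 0.5  # assumed correlation of log incomes (illustrative)

# Draw correlated log incomes from a bivariate normal distribution.
cov = [[1.0, rho_xy], [rho_xy, 1.0]]
log_x, log_y = rng.multivariate_normal([0.0, 0.0], cov, size=n).T

def percentile_rank(values):
    """Rank in (0, 1]; ties are not handled because the draws are continuous."""
    return (np.argsort(np.argsort(values)) + 1) / len(values)

p_rank = percentile_rank(np.exp(log_x))  # parent rank P_i
r_rank = percentile_rank(np.exp(log_y))  # child rank R_i

# Rank-rank slope: OLS slope of child rank on parent rank.
slope = np.polyfit(p_rank, r_rank, 1)[0]

# The slope equals the rank correlation because both ranks are uniform.
rho_pr = np.corrcoef(p_rank, r_rank)[0, 1]

# Closed form for the bivariate log-normal case.
approx = 6 * np.arcsin(rho_xy / 2) / np.pi

print(slope, rho_pr, approx)
```

Ranks are computed on income levels rather than logs to emphasize that any monotone transformation of income leaves the rank-rank slope unchanged.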
II.B. Absolute Mobility
A different way to measure intergenerational mobility is to ask, “What are the outcomes of children from families of a given income level in absolute terms?” For example, one may be
6. The regression coefficient equals the correlation coefficient because both child and parent ranks follow a uniform distribution by construction.
7. For example, if parent and child income follow a bivariate log normal distribution, $\rho_{PR} = \frac{6 \arcsin(\rho_{XY}/2)}{\pi} \approx \frac{3\rho_{XY}}{\pi} = 0.95\rho_{XY}$ when $\rho_{XY}$ is small (Trivedi and Zimmer 2007).
8. More generally, the joint distribution of parent and child incomes can be decomposed into two components: the joint distribution of parent and child percentile ranks (the copula) and the marginal distributions of parent and child income. The rank-rank slope depends purely on the copula, whereas the IGE combines both components.
interested in measuring the mean outcomes of children who grow up in low-income families. Absolute mobility may be of greater normative interest than relative mobility. Increases in relative mobility (i.e., a lower IGE or rank-rank slope) could be undesirable if they are caused by worse outcomes for the rich. In contrast, increases in absolute mobility at a given income level, holding fixed absolute mobility at other income levels, unambiguously increase welfare if one respects the Pareto principle (and if welfare depends purely on income).
We consider three statistical measures of absolute mobility. Our primary measure, which we call absolute upward mobility, is the mean rank (in the national child income distribution) of children whose parents are at the 25th percentile of the national parent income distribution.9 At the national level, this statistic is mechanically related to the rank-rank slope and does not provide any additional information about mobility.10 However, when we study small areas within the United States, a child’s rank in the national income distribution is effectively an absolute outcome because incomes in a given area have little effect on the national distribution.
The second measure we analyze is the probability of rising from the bottom quintile to the top quintile of the income distribution (Corak and Heisz 1999; Hertz 2006), which can be interpreted as a measure of the fraction of children who achieve the “American Dream.” Again, when the quintiles are defined in the national income distribution, these transition probabilities can be interpreted as measures of absolute outcomes in small areas. Our third measure is the probability that a child has family income above the poverty line conditional on having parents at the 25th percentile. Because the poverty line is defined in absolute dollar terms in the United States, this statistic measures the
9. This measure is the analog of the rank-rank slope in terms of absolute mobility. The corresponding analog of the IGE is the mean log income of children whose parents are at the 25th percentile. We do not study this statistic because it is very sensitive to the treatment of zeros and small incomes.
10. We show below that the rank-rank relationship is approximately linear. Because child and parent ranks each have a mean of 0.5 by construction in the national distribution, the mean rank of children with parents at percentile p is simply $0.5 + \rho_{PR}(p - 0.5)$. Conceptually, the slope is the only free parameter in the linear national rank-rank relationship. Intuitively, if one child moves up in the income distribution in terms of ranks, another must come down.
fraction of children who achieve a given absolute living standard.11
It is useful to analyze multiple measures of mobility because the appropriate measure of intergenerational mobility depends on one’s normative objective (Fields and Ok 1999). Fortunately, we find that the patterns of spatial variation in absolute and relative mobility are very similar using alternative measures. In addition, we provide nonparametric transition matrices and marginal distributions that allow readers to construct measures of mobility beyond those we consider here.
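As a worked illustration of absolute mobility at percentile p under a linear rank-rank relationship, the sketch below evaluates the fitted line at p = 25. Ranks are expressed on a 0 to 100 scale here (footnote 10 uses a 0 to 1 scale). The national slope of 0.341 comes from the text; the commuting-zone intercept and slope are hypothetical values chosen only to show the calculation, and the function names are assumptions.

```python
def absolute_mobility(intercept, slope, p):
    """Expected child rank (0-100) for parents at national percentile p."""
    return intercept + slope * p

def national_absolute_mobility(slope, p):
    """National case: both ranks average 50, so the line passes through (50, 50)."""
    return 50.0 + slope * (p - 50.0)

# National rank-rank slope reported in the text.
national_slope = 0.341
y25_national = national_absolute_mobility(national_slope, 25)

# Hypothetical commuting zone (illustrative intercept and slope only).
cz_intercept, cz_slope = 33.0, 0.30
y25_cz = absolute_mobility(cz_intercept, cz_slope, 25)

print(y25_national, y25_cz)
```

At the national level the intercept is pinned down by the slope, which is why absolute upward mobility carries independent information only within areas such as CZs.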
III. Data
We use data from federal income tax records spanning 1996–2012. The data include both income tax returns (1040 forms) and third-party information returns (such as W-2 forms), which give us information on the earnings of those who do not file tax returns. We provide a detailed description of how we construct our analysis sample starting from the raw population data in Online Appendix A. Here, we briefly summarize the key variable and sample definitions. Note that in what follows, the year always refers to the tax year (i.e., the calendar year in which the income is earned).
III.A. Sample Definitions
Our base data set of children consists of all individuals who (i) have a valid Social Security number or individual taxpayer identification number, (ii) were born between 1980 and 1991, and (iii) are U.S. citizens as of 2013. We impose the citizenship requirement to exclude individuals who are likely to have immigrated to the United States as adults, for whom we cannot measure parent income. We cannot directly restrict the sample to individuals born in the United States because the database only records current citizenship status.
We identify the parents of a child as the first tax filers (between 1996 and 2012) who claim the child as a child dependent and were between the ages of 15 and 40 when the child was born.
11. Another intuitive measure of upward mobility is the fraction of children whose income exceeds that of their parents. This statistic turns out to be problematic for our application because we measure parent and child income at different ages and because it is very sensitive to differences in local income distributions.
If the child is first claimed by a single filer, the child is defined as having a single parent. For simplicity, we assign each child a parent (or parents) permanently using this algorithm, regardless of any subsequent changes in parents’ marital status or dependent claiming.12
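The linking rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Claim` record, its field names, and the function `link_parents` are hypothetical, the age bounds are treated as inclusive, age at birth is approximated by the difference in birth years, and ties within a tax year are broken by input order. None of these details is specified in the text.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Claim:
    """One dependent claim for a child (hypothetical record layout)."""
    tax_year: int
    filer_id: str
    filer_birth_year: int
    spouse_id: Optional[str] = None  # None when the claimant is a single filer

def link_parents(claims, child_birth_year, first_year=1996, last_year=2012):
    """Return the earliest eligible claim for one child, or None if none qualifies."""
    eligible = [
        c for c in claims
        if first_year <= c.tax_year <= last_year
        # Filer must have been aged 15 to 40 when the child was born (assumed inclusive).
        and 15 <= child_birth_year - c.filer_birth_year <= 40
    ]
    return min(eligible, key=lambda c: c.tax_year, default=None)

# Illustrative claims for one child born in 1981.
claims = [
    Claim(1997, "grandparent", 1930),       # aged 51 at the birth: not eligible
    Claim(1999, "mother", 1958, "father"),  # first eligible claim, joint filers
    Claim(2003, "stepfather", 1955),        # eligible but later
]
link = link_parents(claims, child_birth_year=1981)
single_parent = link is not None and link.spouse_id is None
print(link.filer_id, single_parent)
```

The assignment is permanent in the sense used in the text: later claims, such as the 2003 claim in the example, do not change the link.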
If parents never file a tax return, we cannot link them to their child. Although some low-income individuals do not file tax returns in a given year, almost all parents file a tax return at some point between 1996 and 2012 to obtain a tax refund on their withheld taxes and the Earned Income Tax Credit (Cilke 1998). We are therefore able to identify parents for approximately 95 percent of the children in the 1980–1991 birth cohorts. The fraction of children linked to parents drops sharply prior to the 1980 birth cohort because our data begin in 1996 and many children begin to leave the household starting at age 17 (Online Appendix Table I). This is why we limit our analysis to children born during or after 1980.
Our primary analysis sample, which we refer to as the core sample, includes all children in the base data set who (i) are born in the 1980–1982 birth cohorts, (ii) for whom we are able to identify parents, and (iii) whose mean parent income between 1996 and 2000 is strictly positive (which excludes 1.2% of children).13
For some robustness checks, we use the extended sample, which imposes the same restrictions as the core sample, but includes all birth cohorts from 1980 to 1991. There are approximately 10 million children in the core sample and 44 million children in the extended sample.
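The core and extended sample restrictions amount to simple filters on the base data set. The sketch below expresses them in Python for records stored as dictionaries. The field names (`birth_year`, `parents_linked`, `mean_parent_income_1996_2000`) and the example records are hypothetical and serve only to illustrate the three stated conditions.

```python
def in_extended_sample(child):
    """1980-1991 cohorts, parents identified, strictly positive mean parent income."""
    return bool(
        1980 <= child["birth_year"] <= 1991
        and child["parents_linked"]
        and child["mean_parent_income_1996_2000"] > 0
    )

def in_core_sample(child):
    """Same restrictions as the extended sample, limited to the 1980-1982 cohorts."""
    return in_extended_sample(child) and child["birth_year"] <= 1982

# Hypothetical records for illustration.
children = [
    {"birth_year": 1981, "parents_linked": True, "mean_parent_income_1996_2000": 52000.0},
    {"birth_year": 1988, "parents_linked": True, "mean_parent_income_1996_2000": 30000.0},
    {"birth_year": 1980, "parents_linked": True, "mean_parent_income_1996_2000": 0.0},
    {"birth_year": 1982, "parents_linked": False, "mean_parent_income_1996_2000": 40000.0},
]
core = [c for c in children if in_core_sample(c)]
extended = [c for c in children if in_extended_sample(c)]
print(len(core), len(extended))
```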
1. Statistics of Income Sample. Because we can only reliably link children to parents starting with the 1980 birth cohort in the population tax data, we can only measure earnings of children up to age 32 (in 2012) in the full sample. To evaluate whether
12. Twelve percent of children in our core sample are claimed as dependents by different individuals in subsequent years. To ensure that this potential measurement error in linking children to parents does not affect our findings, we show that we obtain similar estimates of mobility for the subset of children who are never claimed by other individuals (row 9 of Online Appendix Table VII).
13. We limit the sample to parents with positive income because parents who file a tax return (as required to link them to a child) yet have zero income are unlikely to be representative of individuals with zero income and those with negative income typically have large capital losses, which are a proxy for having significant wealth.
estimates of intergenerational mobility would change significantly if earnings were measured at later ages, we supplement our analysis using annual cross-sections of tax returns maintained by the Statistics of Income (SOI) division of the Internal Revenue Service (IRS) prior to 1996. The SOI cross-sections provide identifiers for dependents claimed on tax forms starting in 1987, allowing us to link parents to children back to the 1971 birth cohort using an algorithm analogous to that described above (see Online Appendix A for further details). The SOI cross-sections are stratified random samples of tax returns with a sampling probability that rises with income; using sampling weights, we can calculate statistics representative of the national distribution. After linking parents to children in the SOI sample, we use population tax data to obtain data on income for children and parents, using the same definitions as in the core sample. There are approximately 63,000 children in the 1971–1979 birth cohorts in the SOI sample (Online Appendix Table II).
III.B. Variable Definitions and Summary Statistics
In this section, we define the key variables we use to measure intergenerational mobility. We measure all monetary variables in 2012 dollars, adjusting for inflation using the consumer price index (CPI-U).
1. Parent Income. Following Lee and Solon (2009), our primary measure of parent income is total pretax income at the household level, which we label parent family income. More precisely, in years where a parent files a tax return, we define family income as adjusted gross income (as reported on the 1040 tax return) plus tax-exempt interest income and the nontaxable portion of Social Security and Disability (SSDI) benefits. In years where a parent does not file a tax return, we define family income as the sum of wage earnings (reported on form W-2), unemployment benefits (reported on form 1099-G), and gross social security and disability benefits (reported on form SSA-1099) for both parents.14 In years where parents have no
14. The database does not record W-2s and other information returns prior to 1999, so nonfilers’ income is coded as 0 prior to 1999. Assigning nonfiling parents 0 income has little effect on our estimates because only 2.9% of parents in our core sample do not file in each year prior to 1999 and most nonfilers have very low W-2 income. For instance, in 2000, median W-2 income among nonfilers was $29.
tax return and no information returns, family income is coded as zero.15
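The three cases in this definition (filing year, non-filing year with information returns, and no records at all) can be sketched as a single function. This is an illustrative sketch only; the class and field names (`TaxReturn`, `InfoReturns`, and their attributes) are hypothetical and do not correspond to the actual database schema.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class TaxReturn:
    """Hypothetical fields taken from a filed Form 1040."""
    adjusted_gross_income: float
    tax_exempt_interest: float
    nontaxable_ss_and_disability: float


@dataclass
class InfoReturns:
    """Hypothetical per-parent totals from third-party information returns."""
    w2_wages: float = 0.0
    unemployment_benefits: float = 0.0      # Form 1099-G
    gross_ss_and_disability: float = 0.0    # Social Security benefit statement


def parent_family_income(tax_return: Optional[TaxReturn],
                         info_returns: Sequence[InfoReturns] = ()) -> float:
    """Family income for one year, following the three cases in the text."""
    if tax_return is not None:
        # Filing year: AGI plus tax-exempt interest plus nontaxable benefits.
        return (tax_return.adjusted_gross_income
                + tax_return.tax_exempt_interest
                + tax_return.nontaxable_ss_and_disability)
    # Non-filing year: sum information returns over both parents.
    # With no information returns the sum is empty, so income is coded as zero.
    return float(sum(r.w2_wages + r.unemployment_benefits + r.gross_ss_and_disability
                     for r in info_returns))
```

Passing `None` with no information returns yields zero, which corresponds to the "true zero" case rather than a missing value.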
Our baseline income measure includes labor earnings and capital income as well as unemployment insurance, Social Security, and disability benefits. It excludes nontaxable cash transfers such as Temporary Assistance to Needy Families and Supplemental Security Income, in-kind benefits such as food stamps, all refundable tax credits such as the EITC, nontaxable pension contributions (such as to 401(k)s), and any earned income not reported to the IRS. Income is always measured prior to the deduction of individual income taxes and employee-level payroll taxes.
In our baseline analysis, we average parents’ family income over the five years from 1996 to 2000 to obtain a proxy for parent lifetime income that is less affected by transitory fluctuations (Solon 1992). We use the earliest years in our sample to best reflect the economic resources of parents while the children in our sample are growing up.16 We evaluate the robustness of our findings using data from other years and using a measure of individual parent income instead of family income. We define individual income as the sum of individual W-2 wage earnings, unemployment insurance benefits, SSDI payments, and half of household self-employment income (see Online Appendix A for details).
Furthermore, we show below that defining parent income based on data from 1999 to 2003 (when W-2 data are available) yields virtually identical estimates (Table I, row 5). Note that we never observe self-employment income for nonfilers and therefore code it as 0; given the strong incentives for individuals with children to file created by the EITC, most nonfilers likely have very low levels of self-employment income as well.
15. Importantly, these observations are true zeros rather than missing data. Because the database covers all tax records, we know that these individuals have 0 taxable income.
16. Formally, we define mean family income as the mother’s family income plus the father’s family income in each year from 1996 to 2000 divided by 10 (by 5 if we only identify a single parent). For parents who do not change marital status, this is simply mean family income over the five-year period. For parents who are married initially and then divorce, this measure tracks the mean family incomes of the two parents over time. For parents who are single initially and then get married, this measure tracks individual income prior to marriage and total family income (including the new spouse’s income) after marriage. These household measures of income increase with marriage and naturally do not account for cohabitation; to ensure that these features do not generate bias, we assess the robustness of our results to using individual measures of income.
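The averaging rule in footnote 16 can be illustrated with a short sketch. It assumes each identified parent is represented by a hypothetical mapping from year to that parent's family income; the function name and the treatment of missing years as zero are assumptions made here for illustration.

```python
from typing import Mapping, Sequence

YEARS = range(1996, 2001)  # 1996 through 2000 inclusive


def mean_parent_family_income(parents: Sequence[Mapping[int, float]]) -> float:
    """parents: one {year: family income} mapping per identified parent (1 or 2)."""
    if len(parents) not in (1, 2):
        raise ValueError("expected one or two identified parents")
    # Years with no recorded income count as zero.
    total = sum(p.get(year, 0.0) for p in parents for year in YEARS)
    # Divide by 10 when two parents are identified, by 5 for a single parent.
    return total / (len(YEARS) * len(parents))
```

For two parents who stay married, both mappings hold the same household income in every year, so the result is simply the five-year mean of household income. After a divorce the two mappings diverge and the result is the mean of the two parents' family incomes.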
2. Child Income. We define child family income in the same way as parent family income. In our baseline analysis, we average child family income over the last two years in our data (2011 and 2012), when children are in their early thirties. We report results using alternative years to assess the sensitivity of our findings. For children, we define household income based on current marital status rather than marital status at a fixed point in time. Because family income varies with marital status, we also report results using individual income measures for children, constructed in the same way as for parents.
3. College Attendance. We define college attendance as an indicator for having one or more 1098-T forms filed on one’s behalf when the individual is aged 18–21. Title IV institutions (all colleges and universities as well as vocational schools and other postsecondary institutions eligible for federal student aid) are required to file 1098-T forms that report tuition payments or scholarships received for every student. Because the forms are filed directly by colleges, independent of whether an individual files a tax return, we have complete records on college attendance for all children. The 1098-T data are available from 1999 to 2012. Comparisons to other data sources indicate that 1098-T forms capture college enrollment quite accurately overall (Chetty, Friedman, and Rockoff 2014, Appendix B).17
4. College Quality. Using data from 1098-T forms, Chetty, Friedman, and Rockoff (2014) construct an earnings-based index of "college quality" using the mean individual wage earnings at age 31 of children born in 1979–1980 based on the college they attended at age 20. Children who do not attend college are included in a separate "no college" category in this index. We assign each child in our sample a value of this college quality index based on the college in which they were enrolled at age
17. Colleges are not required to file 1098-T forms for students whose qualified tuition and related expenses are waived or paid entirely with scholarships or grants. However, the forms are frequently available even for such cases, presumably because of automated reporting to the IRS by universities. Approximately 6% of 1098-T forms are missing from 2000 to 2003 because the database contains no 1098-T forms for some small colleges in these years. To verify that this does not affect our results, we confirm that our estimates of college attendance by parent income gradients are very similar for later birth cohorts (not reported).
20. We then convert this dollar index to percentile ranks within each birth cohort. The children in the no-college group, who constitute roughly 54 percent of our core sample, all have the same value of the college quality index. Breaking ties at the mean, we assign all of these children a college quality rank of approximately $\frac{54}{2} = 27$.18
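The tie-breaking rule can be written out as a short worked formula. The symbols $s$ (the percentage of a cohort in the tied no-college group) and $b$ (the percentage of the cohort ranked strictly below that group) are notation introduced here for illustration only.

```latex
\[
\text{rank}_{\text{no college}} = b + \frac{s}{2},
\qquad \text{so with } b \approx 0 \text{ and } s \approx 54: \quad
0 + \frac{54}{2} = 27 .
\]
```

When a small share of college attendees falls below the no-college group, $b$ is slightly positive and the assigned rank shifts up by that amount.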
5. Teenage Birth. We define a woman as having a teenage birth if she ever claims a dependent who was born while she was between the ages of 13 and 19. This measure is an imperfect proxy for having a teenage birth because it only covers children who are claimed as dependents by their mothers. Nevertheless, the aggregate level and spatial pattern of teenage births in our data are closely aligned with estimates based on the American Community Survey.19
6. Summary Statistics. Online Appendix Table III reports summary statistics for the core sample. Median parent family income is $60,129 (in 2012 dollars). Among the 30.6% of children matched to single parents, 72.0% are matched to a female parent. Children in our core sample have a median family income of $34,975 when they are approximately 30 years old; 6.1% of children have zero income in both 2011 and 2012; 58.9% are enrolled in a college at some point between the ages of 18 and 21; and 15.8% of women have a teenage birth.
In Online Appendix B and Appendix Table IV, we show that the total cohort size, labor force participation rate, distribution of child income, and other demographic characteristics of our core sample line up closely with corresponding estimates in the Current Population Survey and American Community Survey. This confirms that our sample covers roughly the same nationally representative population as previous survey-based research.
18. The exact value varies across cohorts. For example, in the 1980 birth cohort, 55.1% of children do not attend college. We assign these children a rank of $\frac{55.1}{2} + 0.2 = 27.7\%$ because 0.2% of children in the 1980 birth cohort attend colleges whose mean earnings are below the mean earnings of those not in college.
19. Of women in our core sample, 15.8% have teenage births; the corresponding number is 14.6% in the 2003 ACS. The unweighted correlation between state-level teenage birth rates in the tax data and the ACS is 0.80.
IV. National Statistics
We begin our empirical analysis by characterizing the relationship between parent and child income at the national level. We first present a set of baseline estimates of relative mobility and then evaluate the robustness of our estimates to alternative sample and income definitions.20
IV.A. Baseline Estimates
In our baseline analysis, we use the core sample (1980–1982 birth cohorts) and measure parent income as mean family income from 1996 to 2000 and child income as mean family income in 2011–2012, when children are approximately 30 years old. Figure I Panel A presents a binned scatter plot of the mean family income of children versus the mean family income of their parents. To construct this figure, we divide the horizontal axis into 100 equal-sized (percentile) bins and plot mean child income versus mean parent income in each bin.21 This binned scatter plot provides a nonparametric representation of the conditional expectation of child income given parent income, $E[Y_i \mid X_i = x]$. The regression coefficients and standard errors reported in this and all subsequent binned scatter plots are estimated on the underlying microdata using OLS regressions.
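The binned scatter plot construction described above can be sketched in a few lines. This is an illustrative implementation under simple assumptions (observations are sorted by parent income and cut into bins of as equal a size as possible); the function name `binned_means` is hypothetical and this is not the code used to produce Figure I.

```python
from typing import List, Sequence, Tuple


def binned_means(parent_income: Sequence[float],
                 child_income: Sequence[float],
                 n_bins: int = 100) -> List[Tuple[float, float]]:
    """Return (mean parent income, mean child income) for each parent-income bin."""
    if len(parent_income) != len(child_income):
        raise ValueError("inputs must have the same length")
    # Sort observations by parent income only.
    pairs = sorted(zip(parent_income, child_income), key=lambda pc: pc[0])
    n = len(pairs)
    points = []
    for b in range(n_bins):
        # Integer arithmetic keeps bin sizes within one observation of each other.
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue  # can happen when there are fewer observations than bins
        mean_parent = sum(p for p, _ in chunk) / len(chunk)
        mean_child = sum(c for _, c in chunk) / len(chunk)
        points.append((mean_parent, mean_child))
    return points
```

Plotting the returned points gives the binned scatter; as the text notes, slopes and standard errors would still be estimated on the underlying microdata rather than on these bin means.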
The conditional expectation of children’s income given parents’ income is strongly concave. Below the 90th percentile of parent income, a $1 increase in parent family income is associated with a 33.5 cent increase in average child family income. In contrast, between the 90th and 99th percentile, a $1 increase in parent income is associated with only a 7.6 cent increase in child income.
20. We do not present estimates of absolute mobility at the national level because absolute mobility in terms of percentile ranks is mechanically related to relative mobility at the national level (see Section II). Although one can compute measures of absolute mobility at the national level based on mean incomes (e.g., the mean income of children whose parents are at the 25th percentile), there is no natural benchmark for such a statistic as it has not been computed in other countries or time periods.
21. For scaling purposes, we exclude the top bin (parents in the top 1%) in this figure only; mean parent income in this bin is $1,408,760 and mean child income is $113,846.
FIGURE I
Association between Children’s and Parents’ Incomes
Panel A: Level of Child Family Income vs. Parent Family Income. Panel B: Log Child Family Income vs. Log Parent Family Income.
These figures present nonparametric binned scatter plots of the relationship between child income and parent income. Both panels are based on the core sample (1980–1982 birth cohorts) and baseline family income definitions for parents and children. Child income is the mean of 2011–2012 family income (when the child is approximately 30 years old), whereas parent income is mean family income from 1996 to 2000. Incomes are in 2012 dollars. To construct Panel A, we bin parent family income into 100 equal-sized (centile) bins and plot the mean level of child income versus mean level of parent income within each bin. For scaling purposes, we do not show the point for the top 1% in Panel A. In the top 1% bin, mean parent income is $1.4 million and mean child income is $114,000. In Panel B, we again bin parent family income into 100 bins and plot mean log income for children (left y-axis) and the fraction of children with zero family income (right y-axis) versus mean parents’ log income. Children with zero family income are excluded from the log income series. In both panels, the 10th and 90th percentile of parents’ income are depicted in dashed vertical lines. The coefficient estimates and standard errors (in parentheses) reported on the figures are obtained from OLS regressions on the microdata. In Panel A, we report separate slopes for parents below the 90th percentile and parents between the 90th and 99th percentile. In Panel B, we report slopes of the log-log regression (i.e., the intergenerational elasticity of income or IGE) in the full sample and for parents between the 10th and 90th percentiles.
TABLE I
INTERGENERATIONAL MOBILITY ESTIMATES AT THE NATIONAL LEVEL

| Row | Child’s outcome | Parent’s income def. | (1) Core sample | (2) Male children | (3) Female children | (4) Married parents | (5) Single parents | (6) 1980–1985 cohorts | (7) Fixed age at child birth |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Log family income (excluding zeros) | Log family income | 0.344 (0.0004) | 0.349 (0.0006) | 0.342 (0.0005) | 0.303 (0.0005) | 0.264 (0.0008) | 0.316 (0.0003) | 0.361 (0.0008) |
| 2 | Log family income (recoding zeros to $1) | Log family income | 0.618 (0.0009) | 0.697 (0.0013) | 0.540 (0.0011) | 0.509 (0.0011) | 0.528 (0.0020) | 0.580 (0.0006) | 0.642 (0.0018) |
| 3 | Log family income (recoding zeros to $1,000) | Log family income | 0.413 (0.0004) | 0.435 (0.0007) | 0.392 (0.0006) | 0.358 (0.0006) | 0.322 (0.0009) | 0.380 (0.0003) | 0.434 (0.0009) |
| 4 | Family income rank | Family income rank | 0.341 (0.0003) | 0.336 (0.0004) | 0.346 (0.0004) | 0.289 (0.0004) | 0.311 (0.0007) | 0.323 (0.0002) | 0.359 (0.0006) |
| 5 | Family income rank | Family income rank (1999–2003) | 0.339 (0.0003) | 0.333 (0.0004) | 0.344 (0.0004) | 0.287 (0.0004) | 0.294 (0.0007) | 0.323 (0.0002) | 0.357 (0.0006) |
| 6 | Family income rank | Top par. income rank | 0.312 (0.0003) | 0.307 (0.0004) | 0.317 (0.0004) | 0.256 (0.0004) | 0.253 (0.0006) | 0.296 (0.0002) | 0.327 (0.0006) |
| 7 | Individual income rank | Family income rank | 0.287 (0.0003) | 0.317 (0.0004) | 0.257 (0.0004) | 0.265 (0.0004) | 0.279 (0.0007) | 0.286 (0.0002) | 0.292 (0.0006) |
| 8 | Individual earnings rank | Family income rank | 0.282 (0.0003) | 0.313 (0.0004) | 0.249 (0.0004) | 0.259 (0.0004) | 0.272 (0.0007) | 0.283 (0.0002) | 0.287 (0.0006) |
| 9 | College attendance | Family income rank | 0.675 (0.0005) | 0.708 (0.0007) | 0.644 (0.0007) | 0.641 (0.0006) | 0.663 (0.0013) | 0.678 (0.0003) | 0.661 (0.0010) |
TABLE I
(CONTINUED)

| Row | Child’s outcome | Parent’s income def. | (1) Core sample | (2) Male children | (3) Female children | (4) Married parents | (5) Single parents | (6) 1980–1985 cohorts | (7) Fixed age at child birth |
|---|---|---|---|---|---|---|---|---|---|
| 10 | College quality rank (P75–P25 gradient) | Family income rank | 0.191 (0.0010) | 0.188 (0.0014) | 0.195 (0.0015) | 0.174 (0.0014) | 0.172 (0.0020) | 0.198 (0.0007) | 0.189 (0.0022) |
| 11 | Teenage birth (females only) | Family income rank | −0.298 (0.0006) | | | −0.231 (0.0007) | −0.322 (0.0016) | −0.285 (0.0004) | −0.290 (0.0011) |
| | Number of observations | | 9,867,736 | 4,935,804 | 4,931,066 | 6,854,588 | 3,013,148 | 20,520,588 | 2,250,380 |

Notes. Each cell in this table reports the coefficient from a univariate OLS regression of an outcome for children on a measure of their parents’ incomes with standard errors in parentheses. All rows report estimates of slope coefficients from linear regressions of the child outcome on the parent income measure except row 10, in which we regress college quality rank on a quadratic in parent income rank (as in Figure IV Panel A). In this row, we report the difference between the fitted values for children with parents at the 75th percentile and parents at the 25th percentile using the quadratic specification. Column (1) uses the core sample of children, which includes all current U.S. citizens with a valid SSN or ITIN who are (i) born in birth cohorts 1980–1982, (ii) for whom we are able to identify parents based on dependent claiming, and (iii) whose mean parent income over the years 1996–2000 is strictly positive. Columns (2) and (3) limit the sample used in column (1) to males or females. Columns (4) and (5) limit the sample to children whose parents were married or unmarried in the year the child was linked to the parent. Column (6) uses all children in the 1980–1985 birth cohorts. Column (7) restricts the core sample to children whose parents both fall within a 5-year window of median parent age at time of child birth (age 26–30 for fathers; 24–28 for mothers); we impose only one of these restrictions for single parents. Child family income is the mean of 2011–2012 family income, while parent family income is the mean from 1996 to 2000. Parent top earner income is the mean income of the higher-earning spouse between 1999–2003 (when W-2 data are available). Child’s individual income is the sum of W-2 wage earnings, UI benefits, and SSDI benefits, and half of any remaining income reported on the 1040 form. Individual earnings includes W-2 wage earnings, UI benefits, SSDI income, and self-employment income. College attendance is defined as ever attending college from age 18 to 21, where attending college is defined as presence of a 1098-T form. College quality rank is defined as the percentile rank of the college that the child attends at age 20 based on the mean earnings at age 31 of children who attended the same college (children who do not attend college are included in a separate "no college" group); see Section III.B for further details. Teenage birth is defined as having a child while between age 13 and 19. In columns (1)–(5) and (7), income percentile ranks are constructed by ranking all children relative to others in their birth cohort based on the relevant income definition and ranking all parents relative to other parents in the core sample. Ranks are always defined on the full sample of all children; that is, they are not redefined within the subsamples in columns (2)–(5) or (7). In column (6), parents are ranked relative to other parents with children in the 1980–1985 birth cohorts. The number of observations corresponds to the specification in row 4. The number of observations is approximately 7% lower in row 1 because we exclude children with zero income. The number of observations is approximately 50% lower in row 11 because we restrict to the sample of female children. There are 866 children in the core sample with unknown sex, which is why the number of observations in the core sample is not equal to the sum of the observations in the male and female samples.
1. Log-Log Intergenerational Elasticity Estimates. Partly motivated by the nonlinearity of the relationship in Figure I Panel A, the canonical approach to characterizing the joint distribution of child and parent income is to regress the log of child income on the log of parent income (as discussed in Section II), excluding children with zero income. This regression yields an estimated IGE of 0.344, as shown in the first column of row 1 of Table I.
Unfortunately, this estimate turns out to be quite sensitive to changes in the regression specifications for two reasons, illustrated in Figure I Panel B. First, the relationship between log child income and log parent income is highly nonlinear, consistent with the findings of Corak and Heisz (1999) in Canadian tax data. This is illustrated in the series in circles in Figure I Panel B, which plots mean log child income versus mean log family income by percentile bin, constructed using the same method as Figure I Panel A. Because of this nonlinearity, the IGE is sensitive to the point of measurement in the income distribution. For example, restricting the sample to observations between the 10th and 90th percentile of parent income (denoted by the vertical dashed lines in the graph) yields a considerably higher IGE estimate of 0.452.
Second, the log-log specification discards observations with zero income. The series in triangles in Figure I Panel B plots the fraction of children with zero income by parental income bin. This fraction varies from 17% among the poorest families to 3% among the richest families. Dropping children with zero income therefore overstates the degree of intergenerational mobility. The way these zeros are treated can change the IGE dramatically. For instance, including the zeros by assigning those with zero income an income of $1 (so that the log of their income is zero) raises the estimated IGE to 0.618, as shown in row 2 of Table I. If instead we treat those with 0 income as having an income of $1,000, the estimated IGE becomes 0.413. These exercises show that small differences in the way children’s income is measured at the bottom of the distribution can produce substantial variation in IGE estimates.
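The sensitivity to the treatment of zeros can be explored with a small sketch of the log-log regression. The code below is illustrative only: it computes a univariate OLS slope with the standard library, and the `zero_recode` argument (a hypothetical name) either drops children with zero income or replaces their income with a chosen dollar value before taking logs.

```python
import math
from typing import Optional, Sequence


def ols_slope(x: Sequence[float], y: Sequence[float]) -> float:
    """Slope from a univariate OLS regression of y on x (with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def ige(parent_income: Sequence[float],
        child_income: Sequence[float],
        zero_recode: Optional[float] = None) -> float:
    """Log-log IGE. Parent income is assumed strictly positive.

    zero_recode=None drops children with zero income; otherwise zeros are
    replaced by the given dollar amount (e.g. 1.0 or 1000.0).
    """
    log_parent, log_child = [], []
    for p, c in zip(parent_income, child_income):
        if c <= 0:
            if zero_recode is None:
                continue
            c = zero_recode
        log_parent.append(math.log(p))
        log_child.append(math.log(c))
    return ols_slope(log_parent, log_child)
```

Calling `ige(parents, children)`, `ige(parents, children, zero_recode=1.0)`, and `ige(parents, children, zero_recode=1000.0)` on the same data shows how much the estimate moves with the recoding choice when zeros are concentrated among children of low-income parents.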
Columns (2)–(7) in Table I replicate the baseline specification in column (1) for alternative subsamples analyzed in the prior literature. Columns (2)–(5) split the sample by the child’s gender and the parents’ marital status in the year they first claim the child. Column (6) replicates column (1) for the extended sample of 1980–1985 birth cohorts. Column (7) restricts the
sample to children whose mothers are between the ages of 24 and 28 and fathers are between 26 and 30 (a 5-year window around the median age of birth). This column eliminates variation in parent income correlated with differences in parent age at child birth and restricts the sample to parents who are younger than 50 years when we measure their incomes (for children born in 1980). Across these subsamples, the IGE estimates range from 0.264 (for children of single parents, excluding children with zero income) to 0.697 (for male children, recoding zeroes to $1).
The IGE is unstable because the income distribution is not well approximated by a bivariate log-normal distribution, a result that was not apparent in smaller samples used in prior work. This makes it difficult to obtain reliable comparisons of mobility across samples or geographical areas using the IGE. For example, income measures in survey data are typically top-coded and sometimes include transfers and other sources of income that increase incomes at the bottom of the distribution, which may lead to larger IGE estimates than those obtained in administrative data sets such as the one used here.
In a recent paper, Mitnik et al. (2014) propose a new measure of the IGE, the elasticity of expected child income with respect to parent income $\left(\frac{d \log E[Y_i \mid X_i = x]}{d \log x}\right)$, which they show is more robust to the treatment of small incomes. In large samples, one can estimate this parameter by regressing the log of mean child income in each percentile bin (plotted in Figure I Panel A) on the log of mean parent income in each bin. In Online Appendix C, we show that Mitnik et al.’s statistic can be interpreted as a dollar-weighted average of elasticities (placing greater weight on high-income children), whereas the traditional IGE weights all individuals with positive income equally. These two parameters need not coincide in general and the "correct" parameter depends on the policy question one seeks to answer. However, it turns out that in our data, the Mitnik et al. dollar-weighted IGE estimate is 0.335, very similar to our baseline IGE estimate of 0.344 when excluding children with zero income (Online Appendix Figure I Panel A).22
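The binned estimator described above (regressing the log of mean child income in each bin on the log of mean parent income in each bin) can be sketched as follows. This is a rough illustration, not the estimator as implemented by Mitnik et al. or in this paper; the function name `binned_ige` and the equal-count binning rule are assumptions.

```python
import math
from typing import Sequence


def binned_ige(parent_income: Sequence[float],
               child_income: Sequence[float],
               n_bins: int = 100) -> float:
    """Slope of log(mean child income) on log(mean parent income) across bins."""
    pairs = sorted(zip(parent_income, child_income), key=lambda pc: pc[0])
    n = len(pairs)
    log_x, log_y = [], []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue
        # Children with zero income stay in the bin mean, so none are dropped.
        mean_parent = sum(p for p, _ in chunk) / len(chunk)
        mean_child = sum(c for _, c in chunk) / len(chunk)
        if mean_parent > 0 and mean_child > 0:
            log_x.append(math.log(mean_parent))
            log_y.append(math.log(mean_child))
    if len(log_x) < 2:
        raise ValueError("need at least two bins with positive mean incomes")
    mx = sum(log_x) / len(log_x)
    my = sum(log_y) / len(log_y)
    sxy = sum((a - mx) * (c - my) for a, c in zip(log_x, log_y))
    sxx = sum((a - mx) ** 2 for a in log_x)
    return sxy / sxx
```

Because logs are taken of bin means rather than of individual incomes, a child with zero income lowers the bin mean instead of being excluded, which is why this statistic is less sensitive to the treatment of small incomes.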
22. Mitnik et al. (2014) find larger estimates of the dollar-weighted IGE in their sample of tax returns. A useful direction for further work would be to understand why the two samples yield different IGE estimates.
In another recent study, Clark (2014) argues that traditional estimates of the IGE understate the persistence of status across generations because they are attenuated by fluctuations in realized individual incomes across generations. To resolve this problem, Clark estimates the IGE based on surname-level means of income in each generation and obtains a central IGE estimate of 0.8, much larger than that in prior studies. In our data, estimates of mobility based on surname means are similar to our baseline estimates based on individual income data (Online Appendix Table V). One reason that Clark (2014) may obtain larger estimates of intergenerational persistence is that his focus on distinctive surnames partly identifies the degree of convergence in income between racial or ethnic groups (Borjas 1992) rather than across individuals (see Online Appendix D for further details).23
2. Rank-Rank Estimates. Next we present estimates of the rank-rank slope, the second measure of relative mobility discussed in Section II. We measure the percentile rank of parents $P_i$ based on their positions in the distribution of parent incomes in the core sample. Similarly, we define children’s percentile ranks $R_i$ based on their positions in the distribution of child incomes within their birth cohorts. Importantly, this definition allows us to include zeros in child income.24 Unless otherwise noted, we hold the definition of these ranks fixed based on positions in the aggregate distribution, even when analyzing subgroups.
Figure II Panel A presents a binned scatter plot of the mean percentile rank of children $E[R_i \mid P_i = p]$ versus their parents’ percentile rank $p$. The conditional expectation of a child’s rank given his parents’ rank is almost perfectly linear. Using an OLS regression, we estimate that a 1 percentage point (pp) increase in parent rank is associated with a 0.341 pp increase in the child’s mean
23. For example, Clark (2014, p. 60, Figure 3.10) compares the outcomes of individuals with the surname Katz (a predominantly Jewish name) versus Washington (a predominantly black name). This comparison generates an implied IGE close to 1, which partly reflects the fact that the black-white income gap has changed very little over the past few decades. Estimates of the IGE based on individual-level data (or pooling all surnames) are much lower because there is much more social mobility within racial groups.
24. In the case of ties, we define the rank as the mean rank for the individuals in that group. For example, if 10% of a birth cohort has zero income, all children with zero income would receive a percentile rank of 5.
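The tie rule in footnote 24 can be illustrated with a short sketch that assigns every member of a tied group the midpoint of the percentile interval the group occupies. The midpoint convention and the function name `percentile_ranks` are assumptions made here for illustration; the exact ranking code used for the paper is not shown in the text.

```python
from typing import List, Sequence


def percentile_ranks(incomes: Sequence[float]) -> List[float]:
    """Percentile rank (0 to 100) for each income, with ties given the mean rank."""
    n = len(incomes)
    order = sorted(range(n), key=lambda i: incomes[i])
    ranks = [0.0] * n
    start = 0
    while start < n:
        # Find the end of the block of observations tied with position `start`.
        end = start
        while end + 1 < n and incomes[order[end + 1]] == incomes[order[start]]:
            end += 1
        # The block spans percentiles 100*start/n to 100*(end+1)/n; use its midpoint.
        midpoint = 100.0 * (start + end + 1) / (2 * n)
        for k in range(start, end + 1):
            ranks[order[k]] = midpoint
        start = end + 1
    return ranks
```

With a cohort in which 10% of children have zero income, the zero-income block spans percentiles 0 to 10, so each of those children receives a rank of 5, matching the example in the footnote.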
FIGURE II
Association between Children’s and Parents’ Percentile Ranks
Panel A: Mean Child Income Rank vs. Parent Income Rank in the U.S. (x-axis: Parent Income Rank; y-axis: Mean Child Income Rank; Rank-Rank Slope = 0.341 (0.0003)). Panel B: Cross-Country Comparisons (series for Canada, Denmark, and the United States; Rank-Rank Slope (Canada) = 0.174 (0.005); Rank-Rank Slope (Denmark) = 0.180 (0.006)).
These figures present nonparametric binned scatter plots of the relationship between children’s and parent’s percentile income ranks. Both figures are based on the core sample (1980–1982 birth cohorts) and baseline family income definitions for parents and children. Child income is the mean of 2011–2012 family income (when the child is approximately 30 years old), and parent income is mean family income from 1996 to 2000. Children are ranked relative to other children in their birth cohort, and parents are ranked relative to all other parents in the core sample. Panel A plots the mean child percentile rank within each parent percentile rank bin. The series in triangles in Panel B plots the analogous series for Denmark, computed by Boserup, Kopczuk, and Kreiner (2013) using a similar sample and income definitions. The series in squares plots estimates of the rank-rank series using the decile-decile transition matrix from Corak and Heisz (1999). The series in circles in Panel B reproduces the rank-rank relationship in the United States from Panel A as a reference. The slopes and best-fit lines are estimated using an OLS regression on the microdata for the United States and on the binned series (as we do not have access to the microdata) for Denmark and Canada. Standard errors are reported in parentheses.
rank, as reported in row 4 of Table I. The rank-rank slope estimates are generally quite similar across subsamples, as shown in columns (2)–(7) of Table I.
Figure II Panel B compares the rank-rank relationship in the United States with analogous estimates for Denmark constructed using data from Boserup, Kopczuk, and Kreiner (2013) and estimates for Canada constructed from the decile transition matrix reported by Corak and Heisz (1999).25 The relationship between child and parent ranks is nearly linear in Denmark and Canada as well, suggesting that the rank-rank specification provides a good summary of mobility across diverse environments. The rank-rank slope is 0.180 in Denmark and 0.174 in Canada, nearly half that in the United States.
Importantly, the smaller rank-rank slopes in Denmark and Canada do not necessarily mean that children from low-income families in these countries do better than those in the United States in absolute terms. It could be that children of high-income parents in Denmark and Canada have worse outcomes than children of high-income parents in the United States. One
TABLE II
NATIONAL QUINTILE TRANSITION MATRIX

                            Parent quintile
Child quintile       1        2        3        4        5
1                  33.7%    24.2%    17.8%    13.4%    10.9%
2                  28.0%    24.2%    19.8%    16.0%    11.9%
3                  18.4%    21.7%    22.1%    20.9%    17.0%
4                  12.3%    17.6%    22.0%    24.4%    23.6%
5                   7.5%    12.3%    18.3%    25.4%    36.5%

Notes. Each cell reports the percentage of children with family income in the quintile given by the row conditional on having parents with family income in the quintile given by the column for the 9,867,736 children in the core sample (1980–1982 birth cohorts). See notes to Table I for income and sample definitions. See Online Appendix Table VI for an analogous transition matrix constructed using the 1980–1985 cohorts.
25. Both the Danish and Canadian studies use administrative earnings information for large samples as we do here. The Danish sample, which was constructed to match the analysis sample in this article as closely as possible, consists of children in the 1980–1981 birth cohorts and measures child income based on mean income between 2009 and 2011. Child income in the Danish sample is measured at the individual level, and parents' income is the mean of the two biological parents' income from 1997 to 1999, irrespective of their marital status. The Canadian sample is less comparable to our sample, as it consists of male children in the 1963–1966 birth cohorts and studies the link between their mean earnings from 1993 to 1995 and their fathers' mean earnings from 1978 to 1982.
cannot distinguish between these possibilities because the ranks are defined within each country. One advantage of the within–United States CZ-level analysis implemented below is that it naturally allows us to study both relative and absolute outcomes by analyzing children's performance on a fixed national scale.
3. Transition Matrixes. Table II presents a quintile transition matrix: the probability that a child is in quintile m of the child income distribution conditional on his parent being in quintile n of the parent income distribution. One statistic of particular interest in this matrix is the probability of moving from the bottom quintile to the top quintile, a simple measure of success that we return to later. This probability is 7.5% in the United States, compared with 11.7% in Denmark (Boserup, Kopczuk, and Kreiner 2013) and 13.4% in Canada (Corak and Heisz 1999). In this sense, the chances of achieving the American dream are considerably higher for children in Denmark and Canada than those in the United States.
In Online Data Table I, we report a 100 × 100 percentile-level transition matrix for the United States. Using this matrix and the marginal distributions for child and parent income in Online Data Table II, one can construct any mobility statistic of interest for the U.S. population.26
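As a rough illustration of how mobility statistics can be read off a transition matrix, the sketch below uses the quintile-level figures from Table II rather than the 100 × 100 online matrix. It is not the authors' procedure: summarizing each quintile by its midpoint rank (10, 30, 50, 70, 90) is an assumption made here, and the helper names are hypothetical. Fitting a line through the binned means is similar in spirit to the way the Canadian series in Figure II is built from a decile transition matrix.

```python
# Table II: rows are child quintiles 1-5, columns are parent quintiles 1-5,
# entries are percentages conditional on the parent quintile.
TABLE_II = [
    [33.7, 24.2, 17.8, 13.4, 10.9],
    [28.0, 24.2, 19.8, 16.0, 11.9],
    [18.4, 21.7, 22.1, 20.9, 17.0],
    [12.3, 17.6, 22.0, 24.4, 23.6],
    [7.5, 12.3, 18.3, 25.4, 36.5],
]

# Assumption for illustration: each quintile is represented by its midpoint rank.
MIDPOINTS = [10.0, 30.0, 50.0, 70.0, 90.0]


def bottom_to_top(matrix):
    """Percent of children reaching the top quintile given bottom-quintile parents."""
    return matrix[4][0]


def mean_child_rank(matrix, parent_q):
    """Approximate mean child rank for a parent quintile (0-indexed column)."""
    column = [row[parent_q] for row in matrix]
    total = sum(column)  # columns sum to about 100, up to rounding in the table
    return sum(m * p for m, p in zip(MIDPOINTS, column)) / total


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


binned = [mean_child_rank(TABLE_II, q) for q in range(5)]
slope = ols_slope(MIDPOINTS, binned)
print(f"bottom-to-top probability: {bottom_to_top(TABLE_II)}%")
print(f"approximate rank-rank slope from quintile bins: {slope:.3f}")
```

Because the quintile bins are coarse, the resulting slope should be read only as an approximation to the microdata estimate reported in Figure II Panel A; the percentile-level matrix in Online Data Table I would support a finer version of the same calculation.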
IV.B. Robustness of Baseline Estimates
We now evaluate the robustness of our estimates of intergenerational mobility to alternative specifications. We begin by evaluating two potential sources of bias emphasized in prior work: life cycle bias and attenuation bias.
1. Life Cycle Bias. Prior research has shown that measuring children's income at early ages can understate intergenerational persistence in lifetime income because children with high lifetime incomes have steeper earnings profiles when they are young (Solon 1999; Grawe 2006; Haider and Solon 2006). To evaluate whether our baseline estimates suffer from such life cycle bias, Figure III Panel A plots estimates of the rank-rank slope by the age at which the child's income is measured. We construct the
26. All of the online data tables are available at http://www.equality-of-opportunity.org/index.php/data.
FIGURE III
Robustness of Intergenerational Mobility Estimates

[Figure omitted. Panel A: Lifecycle Bias: Rank-Rank Slopes by Age of Child. Panel B: Attenuation Bias: Rank-Rank Slopes by Number of Years Used to Measure Parent Income.]

This figure evaluates the robustness of the rank-rank slope estimated in Figure II Panel A to changes in the age at which child income is measured (Panel A) and the number of years used to measure parents' income (Panel B). In both panels, child income is defined as mean family income in 2011 to 2012. In Panel A, parent income is defined as mean family income from 1996 to 2000. Each point in Panel A shows the slope coefficient from a separate OLS regression of child income rank on parent income rank, varying the child's birth cohort and hence the child's age in 2011–2012 when the child's income is measured. The circles use the extended sample in the population data, and the tr