Health and Wages: Panel Data Estimates Considering Selection and Endogeneity
Robert Jäckle, Oliver Himmler
Journal of Human Resources, Volume 45, Number 2, Spring 2010, pp. 364-406 (Article)
Published by University of Wisconsin Press
DOI: https://doi.org/10.1353/jhr.2010.0018
For additional information about this article: https://muse.jhu.edu/article/466823/summary


THE JOURNAL OF HUMAN RESOURCES • 45 • 2

Health and Wages: Panel Data Estimates Considering Selection and Endogeneity

Robert Jäckle, Oliver Himmler

ABSTRACT

This paper complements previous studies on the effects of health on wages by addressing the problems of unobserved heterogeneity, sample selection, and endogeneity in one comprehensive framework. Using data from the German Socio-Economic Panel (GSOEP), we find the health variable to suffer from measurement error and a number of tests provide evidence that selection corrections are necessary. Good health leads to higher wages for men, while there appears to be no significant effect for women. Contingent on the method of estimation, healthy males earn between 1.3 percent and 7.8 percent more than those in poor health.

I. Introduction

Does superior health enable individuals to command higher wages? This question has spurred research in both labor and health economics, and consequently led to the identification of two major channels of interaction. First, health as part of human capital may positively affect labor market productivity and hence wages. Second, as Grossman (2001) points out, if marginal benefits of investment in health increase with the salary, health should rise with wages. Thus, reverse causality may lead to biased estimates of the health effect. A number of further challenges need to be considered: while inaccuracies in assessing health status may introduce bias due to measurement error whenever self-reported health satisfaction is used in the estimations, another problem that remains unappreciated in most earlier studies is nonrandom sample selection. Since labor market participation is endogenous and health status is one of the influences driving selection, failing to apply selection correction methods may result in inconsistent estimation. Finally, an issue particularly relevant in the health context is unobserved heterogeneity. Whenever unobserved factors such as genetic endowment are correlated with health, the use of panel data techniques to account for omitted variable bias is called for.

Robert Jäckle is a senior researcher at TNS Infratest Social Research, Munich. Oliver Himmler is a research associate at Goettingen University. The authors thank Christian Holzner, Georg Wamser, Joachim Winter, and participants at various seminars and conferences for helpful discussions. Three anonymous referees are thanked for insightful suggestions. Special thanks go to Joachim Wolff who kindly shared his GSOEP preparation files. All remaining deficiencies are the authors’ responsibility. Data in this article are from the German Socio-Economic Panel (GSOEP), available from the German Institute for Economic Research (DIW).
[Submitted February 2007; accepted December 2008]
ISSN 0022-166X E-ISSN 1548-8004 © 2010 by the Board of Regents of the University of Wisconsin System

The impact of health on wages has been studied using a variety of econometric approaches, accounting for the above problems to different extents: Gambin (2005) investigates the relationship between health and wages for 14 European countries employing fixed (FE) and random effects (RE) estimation. She proposes that for men, self-reported health has a greater effect than for females, while in the case of chronic diseases the opposite holds true. An econometric model that accounts for the simultaneous effects of health and wages in a structural multiequation system has been suggested by Lee (1982). His approach is based on a generalized version of Heckman’s (1978) treatment model. Using a cross-sectional sample of male U.S. citizens, he finds that health and wages are strongly interrelated, that is, wages positively affect health and vice versa. In a similar vein, Cai (2007) estimates a multiequation system using cross-sectional Australian data and finds health to have a positive effect on wages once endogeneity is accounted for. He also finds that there is no endogenous selection present in his data. Haveman, Wolfe, Kreider, and Stone (1994) estimate a multiple equation system for working time, wages, and health, employing generalized method of moments techniques on panel data. They find that in the male U.S. population poor health affects wages negatively. The effect of self-assessed general and psychological health on wages is at the core of Contoyannis and Rice’s (2001) study using the British Household Panel Survey. They apply FE and RE instrumental variable estimators and conclude that reduced psychological health decreases male wages, while positive self-assessed health increases hourly wages for women. While each of these papers tackles at least one of the mentioned econometric issues, to our knowledge there is no study that accounts for unobserved heterogeneity, nonrandom sample selection, and endogeneity in one framework.

In order to fill this gap, we utilize a recently developed estimation method proposed by Semykina and Wooldridge (2006), which extends Wooldridge’s (1995) method of testing and correcting for sample selection in fixed effects models. The latter estimator has been contrasted with alternative methods proposed by Kyriazidou (1997) and Rochina-Barrachina (1999) in an application to female wage equations by Dustmann and Rochina-Barrachina (2007). While Kyriazidou’s (1997) estimator implies homoskedastic idiosyncratic errors over time, Rochina-Barrachina (1999) does not rely on this assumption. The drawback of her method, however, is that it assumes joint normality of the error terms in the probit and the main equation. Wooldridge’s (1995) method relies on standard probit estimates for each year in order to calculate annual inverse Mills ratios (IMRs) and explicitly models the conditional mean of the error terms in the main equation. Its advantage over the other models that have been suggested is that it does not rely on any known distribution of the errors in the equation of interest, and allows them to be time heteroskedastic and serially correlated in an unspecified way. One approach to expanding these three estimators to account for nonstrict exogeneity and measurement error is presented in Dustmann and Rochina-Barrachina (2007). Similarly, Semykina and Wooldridge (2006) enhance Wooldridge’s (1995) estimator and demonstrate how to test and control for sample selection in a fixed effects model with endogeneity. The reason we choose to adopt the Semykina and Wooldridge (2006) approach in this paper is that, unlike the alternative methods, it allows for time heteroskedasticity and autocorrelation in the error terms of both equations.

The estimator is applied to male and female samples taken from the German Socio-Economic Panel (GSOEP). We find the health variable to be reported with error and a number of tests provide evidence that corrections for nonrandom selection into the work force are indicated in both the female and male sample. We show that the impact of health on wages is statistically different from zero for men only. For them, a highly significant effect of health on wages, associated with a health premium of up to 7.8 percent, is found and cannot be eradicated by applying selection correction. Considering nonrandom selection into the work force is, however, associated with lower wages on each health level for both genders.

The remainder of this paper is structured as follows: the starting point is a discussion of specification issues and resulting problems, followed by a detailed overview of the estimation methods in Section III. The ensuing Section IV provides data descriptions and discusses various specifications of the health variable. In Section V, we report estimation and test results. Section VI concludes.

II. Model Specification and Resulting Problems

To fix ideas, a simple model of how health affects wages is presented. A firm produces $Y_t$ at time $t = 1, 2, \ldots, T$, using effective labor $L_t$ as the single input. The firm’s production function is given by $Y_t = F(L_t)$, and the amount of effective labor can be written as

$$L_t = \sum_{i=1}^{n} p_i(E_i, a_{i,t}, h_{i,t}) \cdot \ell_{i,t}, \tag{1}$$

where $\ell_{i,t}$ is labor supply of employee $i$, and $p_i(\cdot)$ is an unknown function that determines the effectiveness of an individual’s working hours $\ell_{i,t}$. This function takes as arguments the years of education $E_i$, age $a_{i,t}$, and state of health $h_{i,t}$. In what follows, we refer to the first two variables as the human capital part of $p_i(\cdot)$ and to the latter part as health effect.

Workers are paid according to their marginal productivity, and so the log wage of each employee can be written as

$$\log w_{i,t} = \log\left[\frac{dF}{dL_t} \cdot \frac{\partial L_t}{\partial \ell_{i,t}}\right] = \log F_{L_t} + \log p_i(E_i, a_{i,t}, h_{i,t}), \tag{2}$$

such that wages are determined by firm-level supply and demand factors $\log F_{L_t}$ as well as by the employee-level human capital and health effects.
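The step from Equation 1 to Equation 2 can be made explicit. The following is a sketch, writing $\ell_{i,t}$ for employee $i$'s labor supply and assuming, as the text does, that wages equal marginal products (the normalization of the output price to one is an added assumption):

```latex
\begin{align*}
w_{i,t} &= \frac{\partial Y_t}{\partial \ell_{i,t}}
         = \frac{dF}{dL_t} \cdot \frac{\partial L_t}{\partial \ell_{i,t}}, \\
\frac{\partial L_t}{\partial \ell_{i,t}} &= p_i(E_i, a_{i,t}, h_{i,t})
  \quad \text{since } L_t \text{ is linear in } \ell_{i,t} \text{ in Equation 1}, \\
\log w_{i,t} &= \log F_{L_t} + \log p_i(E_i, a_{i,t}, h_{i,t}),
  \qquad F_{L_t} \equiv \frac{dF}{dL_t}.
\end{align*}
```

Because the logarithm separates the two factors additively, the firm-level term and the individual-level term can be approximated by separate groups of regressors.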


In what follows, we describe the operationalization of the latter two effects and derive the baseline econometric model.

A. The Human Capital Part

The human capital part of $p_i(\cdot)$ is approximated using a specification similar to Mincer (1958 and 1974). He suggested that log wages are linear in the years of schooling, and linear and quadratic in the years of labor market experience. Romeu Gordo (2006), however, finds evidence for the existence of a positive relationship between unemployment and health satisfaction using GSOEP data. On this account, we include unemployment rather than working experience. Adding an age variable then implicitly controls for work experience as well. Furthermore, human capital theory suggests using firm tenure as a proxy for the firm-specific investment in human capital. Because firm tenure (and its square) is more closely related to labor productivity than the general working experience is, it should cause an extra increase in wages.

B. The Health Effect

As stated earlier, health is an essential part of human capital and will thus affect labor market productivity which in turn determines wages. We use self-assessed health satisfaction as our key explanatory variable, the definition and functional form of which is discussed in detail in Section IV.B.

C. Dependent Variable and Baseline Specification

While health as a part of human capital directly affects productivity, it also can be considered an endogenous capital stock, which according to Grossman (2001) determines the amount of time an individual can spend participating in the labor market. One reason that the number of hours worked diverges somewhat across individuals may therefore lie in differences in health status and so we will use hourly wages rather than monthly earnings as the dependent variable.

The above model can then be parameterised as follows:

$$\log w_{i,t} = b_{B,t}\beta + f_{i,t}\phi + a_{i,t}\alpha + \gamma E_i + \mathit{ue}_{i,t}\upsilon + \mathit{ft}_{i,t}\tau + \mathit{ch}_{i,t}\chi + \delta\, g(h_{i,t}) + \mathit{du}_{i,t}\mu + \text{error}, \tag{3}$$

where $w_{i,t}$ are hourly wages, $b_{B,t}$ is a vector that approximates firm level supply and demand forces $(\log F_{L_t})$ by using the average number of job-seekers, notified vacancies, and (un)employment figures at the federal state (Bundesstaat) level $B$.¹ The vector of dummy variables $f_{i,t}$ captures four different categories of firm size, $a_{i,t}$ is the vector of a third-order polynomial of age and $E_i$ denotes years of schooling or training. Second-order polynomials of unemployment experience and firm tenure are captured in $\mathit{ue}_{i,t}$ and $\mathit{ft}_{i,t}$, respectively. $\mathit{ch}_{i,t}$ are the number of children in three age categories, and $g(h_{i,t})$ is a yet to be determined function of the health variable. Finally, $\mathit{du}_{i,t}$ are indicator variables for firm sector, occupational status,² East Germany, part-time work, nationality, children, and time periods.³

In the estimation of the parameter vector $(\beta', \phi', \alpha', \gamma, \upsilon', \tau', \chi', \delta, \mu')'$ in Equation 3, a number of problems arise. To start with, Grossman (2001) suggests that the rate of return to (gross) investment in health equals the additional availability of healthy time, evaluated at the hourly wage rate. This means that health should rise with wages as the marginal benefits of health investment increase with the wage rate, implying that $h_{i,t}$ is simultaneously determined along with $w_{i,t}$. As we employ self-reported health satisfaction, measurement error also can be an important source of bias. In the absence of an “objective” measure, such as a physician’s evaluation of overall health, $\delta$ will likely be biased toward zero. Another problem arises if a random sample drawn from the overall population is not available. In this study, we aim to identify the effect of health on the labor market productivity for all individuals, so a bias may result from the fact that individuals endogenously decide to participate in the labor market. If some of the factors determining participation also affect health and wages, selection correction methods are in order. Omitted variable bias is also a cause of concern. Disregarding, for example, the genetic endowment of a person could lead to biased estimates as it may at the same time impact health status and hourly wages.⁴ The following section explains how we deal with these issues econometrically.

1. Data provided by the German Federal Employment Agency, Nuremberg.
2. Interaction terms between the occupational status and the health variables as well as between age and health were found to be statistically insignificant and consequently dropped from the final model.
3. Variable descriptions are shown in Appendix Tables A1 and A2.
4. Past shocks (such as heart attacks, accidents, etc.) may affect current state of health (Contoyannis, Jones, and Rice 2004, and Halliday 2008). As far as differences in the ability to cope with such (past) health shocks aren’t covered by unobserved effects, endogeneity may be introduced. Considering the full dynamics of health on top of all sources of endogeneity mentioned above is, however, beyond the scope of this paper.

III. Econometric Approach

As indicated above, the goal of this work is to make statements about the impact of health on wages for the entire population. Thus, with panel data, employing a simple within estimator is a reasonable approach only when we can be sure that the decision to participate in the labor market is either randomly determined, or fully covered by the observable variables, or the fixed effect. In the context of this paper it is entirely conceivable that unobserved time varying health determinants such as the lifestyle an individual engages in (think of alcohol, nicotine, sports), or motivation affect selection and will not be covered by the fixed effect. This kind of selection will then influence wages through the error term and lead to inconsistent estimation. To overcome the selection problem, the following model is estimated:

$$w^*_{i,t} = \theta_0 + x_{i,t}\theta_1 + y^*_{i,t}\theta_2 + c_i + u_{i,t}, \tag{4}$$

$$w_{i,t} = w^*_{i,t}, \quad y_{i,t} = y^*_{i,t} \quad \text{if } S_{i,t} = 1 \text{ and unobserved otherwise}, \tag{5}$$

$$S^*_{i,t} = \pi_0 + k_i + z_{i,t}\pi + e_{i,t}; \qquad S_{i,t} = 1[S^*_{i,t} > 0], \tag{6}$$

where all variables superscripted with an asterisk are latent variables. In Equation 4, $w^*_{i,t}$ are hourly wages and the $1 \times K$ vector $x_{i,t}$ comprises those explanatory variables in Equation 3 that we observe irrespective of participation, including health. Variables that can only be observed for those who work make up $y^*_{i,t}$ and are imposed as exclusion restrictions on the participation equation. Unobserved individual characteristics are contained in $c_i$, $u_{i,t}$ is an unobserved error term and $S_{i,t}$ in Equation 5 denotes labor market participation. Equation 6 describes a person’s decision to participate in the labor market, where $S^*_{i,t}$ is the latent propensity to work, $1[\cdot]$ is an indicator function which equals one if its argument is true, and the $1 \times G$ vector $z_{i,t}$ is a superset of $x_{i,t}$. Though not strictly necessary, it is advantageous to have $G > K$, which is why we add exclusion restrictions to $z_{i,t}$ that drive selection but can at the same time be omitted from Equation 4. The individual effect $k_i$ is composed of unobserved characteristics and exhibits no variation over time. Furthermore, $e_{i,t}$, which is normally distributed with standard deviation $\sigma^{e}_{t}$, is uncorrelated with $k_i$ and $z_i$, where $z_i = (z_{i,1}, \ldots, z_{i,T})$ and $t = 1, 2, \ldots, T$.

Following Mundlak (1978), Chamberlain (1984), and Wooldridge (1995), write $k_i$ as a linear projection onto the time averages of $z_{i,t}$, denoted $\bar{z}_i$, a constant as well as an error $a_i$. Then, Equation 6 can be rewritten as:

$$S^*_{i,t} = \pi_0 + \bar{z}_i\xi + z_{i,t}\pi + v_{i,t}, \tag{7}$$

where the composite error term $v_{i,t} = a_i + e_{i,t}$ is independent of $z_i$ and allowed to be heterogeneously distributed over time and there are no restrictions imposed on the correlation between $v_{i,t}$ and $v_{i,s}$ for $s \neq t$.
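The Mundlak-type device behind Equation 7 amounts to adding each person's time averages of the selection covariates as extra regressors. A minimal Python sketch of that data step follows, assuming a long-format panel held as a list of `(person_id, year, z)` tuples; the function name `add_time_averages` and the toy values are hypothetical and not taken from the paper.

```python
from collections import defaultdict

def add_time_averages(panel):
    """Append each person's time average of z to every one of their rows.

    panel: list of (person_id, year, z) tuples, where z is a list of floats.
    Returns a list of (person_id, year, z, z_bar) tuples in the input order.
    Averages are taken over the periods in which the person is observed.
    """
    sums = {}
    counts = defaultdict(int)
    for pid, _, z in panel:
        if pid not in sums:
            sums[pid] = [0.0] * len(z)
        sums[pid] = [s + v for s, v in zip(sums[pid], z)]
        counts[pid] += 1
    means = {pid: [s / counts[pid] for s in total] for pid, total in sums.items()}
    return [(pid, year, z, means[pid]) for pid, year, z in panel]

# Toy panel with two people and two covariates (hypothetical values).
panel = [
    (1, 1995, [2.0, 0.0]),
    (1, 1996, [4.0, 1.0]),
    (2, 1995, [1.0, 1.0]),
    (2, 1996, [3.0, 1.0]),
]
augmented = add_time_averages(panel)
```

Each row now carries both the period value and the person-level mean, which is the pair of regressors a year-specific probit of this type would use.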

Two assumptions concerning the wage equation (Wooldridge 1995 and 2002) ensure that no restrictions are imposed on how $u_{i,t}$ relates to $v_{i,s}$, $s \neq t$.⁵ First, $u_{i,t}$ is a linear function of $v_{i,t}$ and mean independent of $z_i$ conditional on $v_{i,t}$. Second, similar to the selection equation, the unobserved effect is modelled as a projection of $c_i$ onto $(\bar{x}_i, \bar{y}^*_i, v_{i,t})$ and an error term $b_i$.⁶ This method specifically models the unobserved effect such that correlation between $c_i$ and $(\bar{x}_i, \bar{y}^*_i, v_{i,t})$ is possible. Under these assumptions, Equation 4 can be rewritten as:

$$w^*_{i,t} = \theta_0 + \bar{x}_i\eta_1 + x_{i,t}\theta_1 + \bar{y}^*_i\eta_2 + y^*_{i,t}\theta_2 + \rho_t\lambda_{i,t} + r_{i,t}, \tag{8}$$

where $r_{i,t} = b_i + l_{i,t}$ and $l_{i,t}$ is the remaining part of $u_{i,t}$ after including the inverse Mills ratios (IMRs). The IMRs $\lambda_{i,t}$ are obtained by estimating Equation 7 with standard probit methods for each $t$. Because $S_{i,s}$ ($s \neq t$) does not influence $\lambda_{i,t}$, the error term $r_{i,t}$ is allowed to be correlated with $\lambda_{i,s}$. Equation 8 (with $\lambda_{i,t}$ replaced by $\hat{\lambda}_{i,t}$) can therefore be consistently estimated by pooled OLS. We follow Wooldridge (1995) and construct standard errors robust to serial correlation and heteroskedasticity, which also are adjusted for the additional variation introduced by the estimation of $T$ probit models in the first step.

5. Dustmann and Rochina-Barrachina (2007) call this condition “contemporaneous exogeneity” of the selection process.
6. It should be noted that this assumption is rather restrictive, as it allows only for time-invariant unobserved effects to be correlated with the explanatory variables in Equation 4. For time-variant latent variables Wooldridge’s (1995) estimator may thus be inconsistent.
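The first-step quantity used above can be illustrated briefly. For a standard probit, the inverse Mills ratio of a participant is the standard normal density divided by the standard normal distribution function, both evaluated at the fitted index. The sketch below assumes the year-specific fitted indices are already available; the names `inverse_mills_ratio` and `imrs_by_year` and the index values are hypothetical, and the probit estimation itself is not shown.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def inverse_mills_ratio(index):
    """Return phi(index) / Phi(index) for a fitted probit index.

    No guard is included for extremely negative indices, where Phi
    underflows to zero.
    """
    return _STD_NORMAL.pdf(index) / _STD_NORMAL.cdf(index)

def imrs_by_year(fitted_index):
    """Map {year: {person_id: probit index}} to the same structure of IMRs.

    Only participants enter the wage equation, so in practice the caller
    would pass the indices of working individuals only.
    """
    return {
        year: {pid: inverse_mills_ratio(idx) for pid, idx in by_person.items()}
        for year, by_person in fitted_index.items()
    }

# Hypothetical fitted indices from two year-specific probits.
fitted = {1995: {1: 0.0, 2: 1.5}, 1996: {1: -0.5, 2: 2.0}}
lambdas = imrs_by_year(fitted)
```

The ratio falls as the index rises, so individuals who are very likely to work receive a small correction term and those at the margin of participation a large one.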

While estimation of Equation 8 assumes (strict) exogeneity of the explanatory variables, Semykina and Wooldridge (2006) provide an estimation method based on Wooldridge (1995) that allows for endogeneity in the presence of unobserved heterogeneity and sample selection: analogous to the above derivations, the starting point is the model in Equations 4, 5, and 6. Presume, however, that the health variable (as part of $x_{i,t}$ in Equation 4) is correlated with $u_{i,t}$. As it stands, health is part of $z_{i,t}$ but at the same time $u_{i,t}$ must not be correlated with $z_{i,t}$. Hence, the health variable is removed from $z_{i,t}$ and replaced by a proxy for health which exhibits no correlation with $u_{i,t}$ and can thus serve as an additional exclusion restriction in the participation equation. The resulting $1 \times G$ vector is denoted $q_{i,t}$ and its time averages $\bar{q}_i$ and $q_{i,t}$ itself also replace $\bar{z}_i$ and $z_{i,t}$ in Equation 7.

An estimator that allows $v_{i,t}$ in Equation 7 to be correlated with $u_{i,t}$ and $c_i$ in Equation 4 when the health variable is endogenous can be obtained by maintaining the assumptions underlying Equation 8 and replacing $\bar{x}_i$ with $\bar{q}_i$. Thus, analogous to Equation 8 we can write:

$$w^*_{i,t} = \theta_0 + \bar{q}_i\eta_1 + x_{i,t}\theta_1 + \bar{y}^*_i\eta_2 + y^*_{i,t}\theta_2 + \rho_t\lambda_{i,t} + r_{i,t}. \tag{9}$$

Again, the first step is to estimate $T$ standard probit models, and calculate the IMRs $\hat{\lambda}_{i,t}$. Because $r_{i,t}$ is allowed to be correlated with $\lambda_{i,s}$ for $s \neq t$ (that is, $\lambda_{i,t}$ is not strictly exogenous in Equation 9), a consistent way of estimating Equation 9 is pooled 2SLS, where $1, \bar{q}_i, q_{i,t}, \bar{y}^*_i, y^*_{i,t}, \hat{\lambda}_{i,t}$ serve as (their own) instruments. Standard errors robust to serial correlation and heteroskedasticity are calculated as suggested by Semykina and Wooldridge (2006). They are adjusted for the additional variation introduced by the estimation of $T$ probit models in the first step and they also account for the use of the pooled 2SLS estimator.

IV. Data and Descriptives

The data used in this analysis are taken from twelve consecutive annual waves of the German Socio-Economic Panel Study (GSOEP), provided by the German Institute for Economic Research (DIW). The GSOEP, which is representative of the German population, started in 1984 with about 12,200 observations from the western German states. In June 1990, another 4,400 individuals living in the territory of the former German Democratic Republic were added in order to expand the GSOEP to the eastern part of Germany.

A. Sample Construction

For the empirical analysis, we use observations from all subsamples between 1995 and 2006, with the exception of samples G (“Oversampling of High Income Households”) and H (“Refreshment 2006”).⁷ We extract data on the variables described in Appendix Table A1. The sample is constrained to persons older than 17 and younger than 66 years. Also excluded are those who are self-employed, self-employed in the agricultural sector, work in the family business, are on maternity leave, drafted for mandatory military or civilian service, as well as individuals who serve an apprenticeship, trainees, interns, volunteers, aspirants, pensioners, and those still in education. Marginally or irregularly employed persons also are removed from the estimation sample. We also exclude (severely) handicapped people from the analysis, for two reasons. First, firms may discriminate against handicapped individuals, irrespective of their productivity. Hence, their wages may be artificially low, or they might even drop out of the labor market due to discrimination, which is not meant to be captured in the selection equation. Second, in Germany severely handicapped people often work at special “sheltered workshops,” where they are not paid according to their marginal productivity.

7. 1995 is chosen as starting point because the key variable “number of doctor visits” is not available in 1994. Subsamples A through D constitute the “base data,” subsamples E and F are refreshment samples, which start in 1998 and 2000, respectively. The 2006 refreshment sample H is excluded, because by definition every person in this sample is observable for only one year.

Hourly wages are derived by dividing gross individual earnings in the month before the interview by 4.3 (the average number of weeks per month) and then dividing the resulting weekly wage by the usual working time per week.⁸ Any extra salaries like Christmas or holiday bonuses, thirteenth monthly pay, or child benefits are not taken into account. Suspiciously high or low wage rates were manually checked and dropped if necessary. Wages (as well as all other financial variables) are deflated to their year 2001 real values using the eastern and western CPIs and, if necessary, converted into Euro equivalents.⁹
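The wage construction just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `hourly_wage` and `deflate`, the index base of 100, and the example figures are hypothetical, while the divisor 4.3 and the rule that contractual hours replace usual hours when they are larger come from the text and footnote 8.

```python
WEEKS_PER_MONTH = 4.3  # average number of weeks per month, as in the text

def hourly_wage(gross_monthly_earnings, usual_weekly_hours,
                contractual_weekly_hours=None):
    """Gross hourly wage from last month's earnings and weekly hours.

    Contractual hours replace usual hours whenever they exceed them.
    """
    hours = usual_weekly_hours
    if contractual_weekly_hours is not None and contractual_weekly_hours > hours:
        hours = contractual_weekly_hours
    if hours <= 0:
        raise ValueError("weekly hours must be positive")
    weekly_wage = gross_monthly_earnings / WEEKS_PER_MONTH
    return weekly_wage / hours

def deflate(nominal, cpi, base_cpi=100.0):
    """Express a nominal amount in base-year prices (assumed index base 100)."""
    return nominal * base_cpi / cpi

# Hypothetical worker: 2,580 per month, 38 usual and 40 contractual hours.
wage = hourly_wage(2580.0, 38.0, contractual_weekly_hours=40.0)
real_wage = deflate(wage, cpi=105.0)
```

In this example the contractual 40 hours are used, so the nominal hourly wage is 2,580 / 4.3 / 40, and the deflator then scales it to base-year prices.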

Participation in the labor market is defined as having worked for pay in the month before the interview. In the participation equations, both working and nonworking adults are used for estimation. Since the econometric approach includes linear probability models, which exploit within transformations, individuals who appear for only one year are removed from the estimation sample.

Appendix Table A2 shows how the stepwise exclusion of different groups leads to an estimation sample of 9,277 females and 8,847 males, resulting in 57,203 and 57,419 observations, respectively. For the estimation of the wage equations, persons who participate in the labor market for only one year are dropped from the sample. Due to this restriction and because individuals with missing wages who declare participation are defined as participating in the selection equations, the number of observations in the wage equations differs from the working population in the probit sample.

In the time period considered, about 69 percent of the female and around 86 percent of the male sample population participate in the labor market and male real hourly wages are on average about 0.22 log points higher than those of women. Appendix Tables A7 and A8 compare variables in the participation equations for working and nonworking individuals, Appendix Tables A9 and A10 provide detailed summary statistics for variables used in the wage equations.

8. Usual hours are chosen due to their invariance to short term health problems. Including the effects of short-term health issues on hours may bias hourly wages upward as paid sick days are common practice in Germany. Contractual hours are used instead of usual working time whenever the former exceed the latter.
9. For this purpose, Consumer Price Indices included in the $pequiv files of the GSOEP are used.


B. Health Variable

The GSOEP health measure asks individuals to state how satisfied they currently are with their health on a category scale ranging from zero to ten. As the functional form is a priori unclear, three specifications of the health variable are employed in order to gain insight into the relationship between health and participation/wages: (i) without any further transformations, implying a log-linear relationship; (ii) using a log-log model, as suggested by Equation 2;¹⁰ and (iii) splitting health satisfaction into four dummy variables, thus producing a flexible nonlinear specification.¹¹ Appendix Table A4 shows results of these preliminary regressions for both the wage and participation equations.¹²

The coefficients of the health variable(s) turn out to be significantly different from zero in all specifications, for both women and men, and in the wage and participation equations. An important observation is that health satisfaction affects wages and labor market participation nonlinearly. This becomes evident in both the log-log and the dummy specification. In the latter, throughout the categories excellent, good, and medium health, an increasing effect of health satisfaction at a diminishing rate is revealed. For example, in the case of female labor supply, reducing health from excellent to good (0.001) has a much smaller effect than reducing it further to medium health (0.021). Equally stated, diminishing health from excellent to good (0.006) in the male sample affects wages less strongly than reducing it from good to medium (0.03).

Based on these results and given that almost 90 percent of the observations are allocated over the categories excellent, good, and medium health (see Appendix Table A3), the health measure should exhibit some kind of nonlinear specification, where wages and the probability to work increase with health at diminishing rates. For pragmatic reasons, instead of choosing the more flexible dummy variable specification, we decide to rely on the log-log structure. First, its functional form most closely approximates the model suggested in Equation 2. Second, only one instrument is needed when implementing the log-log form, which is especially important for the IV-approaches of the participation equations in V.A. Finally, it still allows for increasing returns to health at a decreasing rate, the relevant functional form for 90 percent of all observations.

The observed mean of this log-health variable for working females between 1995 and 2006 is 2.579, while the value for nonworking women is smaller at 2.482 log points. For males, the working to nonworking health ratio is 2.594 to 2.40. The hypothesis of the equality of means between the working and nonworking group can be rejected on the basis of two standard t-tests, t = 25.22 (p-value = 0) for females and t = 39.73 (p-value = 0) for males.
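The log-type and dummy specifications of the health variable can be sketched as follows. The sketch assumes that the transformation in footnote 10 is $g(h) = \log(h + \sqrt{h^2 + 1})$, which satisfies $g(0) = 0$ as the footnote requires, and it uses the category bounds from footnote 11; the function names are hypothetical.

```python
import math

def log_health(h):
    """Log-type transform g(h) = log(h + sqrt(h**2 + 1)), so that g(0) = 0.

    Assumed reading of footnote 10; this equals the inverse hyperbolic sine.
    """
    return math.log(h + math.sqrt(h * h + 1.0))

def health_category(h):
    """Four-category split of the 0-10 health satisfaction scale."""
    if not 0 <= h <= 10:
        raise ValueError("health satisfaction must lie between 0 and 10")
    if h <= 4:
        return "poor"       # base category
    if h <= 6:
        return "medium"
    if h <= 8:
        return "good"
    return "excellent"

# Increments shrink as health satisfaction rises (diminishing rate).
low_step = log_health(5) - log_health(4)
high_step = log_health(9) - log_health(8)
```

Under this reading, a one-point improvement at the lower end of the scale moves the transformed variable by more than the same improvement at the upper end, which is the diminishing-rate pattern described above.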

10. Health satisfaction is transformed as follows: $g(h_{i,t}) = \log\left(h_{i,t} + \sqrt{h_{i,t}^2 + 1}\right)$, which is a parallel translation of the log function, where $g(h_{i,t} = 0) = 0$.
11. According to the frequency distribution in the appendix (Appendix Table A3), we define poor (Category 0–4), medium (Category 5–6), good (Category 7–8), and excellent health (Category 9–10), where the first one serves as base category.
12. To make parameters directly interpretable, we employ linear probability models to estimate the participation equations in Columns 4, 5, and 6. In all specifications further explanatory variables (see Tables 3, 4 and Appendix Tables A6, A7) are included but not reported.


V. Empirical Results

A. Participation Equations

Health is expected to influence the decision to participate in the labor market as wellas wages. Thus, in order to gain insight on the extensive margin, Appendix TablesA5 and A6 present estimation results for the Mundlak-type specification needed forthe Wooldridge (1995) and Semykina and Wooldridge (2006) estimators as well asfive additional specifications. The exclusion restrictions we propose are: nonlaborincome, a binary variable for having a partner, partner’s net wage and second degreepolynomials of the partner’s age, labor market experience, and education as well asan indicator variable for whether the partner variables were missing though thepresence of a partner is reported.

As a means of coping with the possible endogeneity of health in the participation equation, we employ computationally undemanding (FE-)IV linear probability specifications in Columns 3 and 4. Here, the number of doctor visits in the last three months serves as an instrument for the health variable.13 The intuition is that “doctor visits” approximate past investment and depreciation in health and account for past shocks affecting current health satisfaction. At the same time, “doctor visits” should not have an effect on wages other than through health status.14 Columns 1 and 2 display pooled OLS and within results to allow a check of the IV specifications against naive estimators.

The estimated coefficients of the health variable turn out to be significantly different from zero for both women and men and in all four linear specifications. Comparing the parameters in Columns 3, 4 with Columns 1, 2 shows that, as is expected in the presence of measurement error, the coefficients of health satisfaction using IV methods are larger than those in the pooled OLS or within model.15 On the other hand, the inclusion of unobserved effects reduces the estimated parameters in Columns 2 and 4 in comparison to Columns 1 and 3, that is, correlation between the health variable and latent individual heterogeneity is associated with an upward bias.
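The attenuation argument can be illustrated with a small simulation. This is a sketch on purely hypothetical data, not the GSOEP sample: the variable names, the true effect `beta`, and all noise levels are assumptions chosen for illustration. It uses the simple just-identified IV estimator, which coincides with 2SLS when there is one instrument, one endogenous regressor, and no further covariates.

```python
import random

random.seed(0)
n = 20_000
beta = 0.05  # hypothetical true effect of health on the outcome

# Hypothetical data: latent health, an error-ridden self-report, and an
# instrument ("doctor visits") that rises when latent health is low.
true_h = [random.gauss(0, 1) for _ in range(n)]
obs_h = [h + random.gauss(0, 1) for h in true_h]
visits = [-h + random.gauss(0, 1) for h in true_h]
y = [beta * h + random.gauss(0, 0.1) for h in true_h]

def cov(a, b):
    """Sample covariance of two equally long lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (w - mb) for x, w in zip(a, b)) / (len(a) - 1)

# OLS slope on the noisy self-report is biased toward zero.
ols = cov(obs_h, y) / cov(obs_h, obs_h)
# IV slope uses only the variation in reported health that moves with the instrument.
iv = cov(visits, y) / cov(visits, obs_h)

print(f"true beta = {beta}, OLS = {ols:.4f}, IV = {iv:.4f}")
```

In this setup the OLS slope is pulled toward roughly half the true value because the reporting error has the same variance as latent health, while the IV slope stays close to `beta`; this mirrors the pattern of larger IV coefficients described above, under the assumption that the instrument is unrelated to the reporting error.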

Column 5 provides a pooled probit model, which assumes that the explanatory variables are independent of any unobserved effect.16 Column 6 applies the Mundlak specification, as laid out in Section III, to the pooled sample. Based on the above mentioned hints that “health satisfaction” may be endogenous in the selection equations, in the pooled probit (Column 5) and Mundlak-type (Column 6) specifications the possibly endogenous “health satisfaction” variable is replaced by “number of doctor visits” which we assume to be exogenous and which reflects health satisfaction. Thus, the “doctor visits” variable effectively serves as an additional exclusion restriction, increasing their total number to eleven. This procedure follows Semykina and Wooldridge (2006) and Dustmann and Rochina-Barrachina’s (2007) method. It is strictly necessary for the Semykina and Wooldridge (2006) estimator and also applied to the pooled probit estimator in order to enable comparison. In line with Columns 1 through 4, a higher number of doctor visits (that is, lower health) is associated with a significantly lower probability of participation in both probit specifications.

13. For an example of how an endogenously reported health measure may affect wages see Stern (1989). In his paper he uses symptoms or diseases as instruments for endogenously reported disability and labor force participation.

14. One issue our instrument probably doesn’t resolve is that people may justify nonparticipation in the labor market by reporting low health, such that there is actually an omitted variable, say, “motivation.” If these individuals visit physicians in order to justify their nonparticipation in the same fashion, the instrument may be invalid. However, as long as physicians do not issue sick notes to people who are healthy, there is really no reason to arrange such appointments. Additionally, as long as “motivation” is time invariant, the individual effects in Column 6 should take care of the problem.

15. Heteroskedasticity robust, regression based Hausman tests in the spirit of Wooldridge (2002) confirm systematic differences between the health coefficients in Columns 3, 4 and 1, 2.

16. In Columns 5 and 6, a robust variance covariance matrix accounts for the fact that observations are correlated within individuals over time. Under more restrictive assumptions, “traditional” random effects probit estimation is possible; results for these models are available on request.

As coefficients in linear and nonlinear models cannot readily be compared, Table 1 provides participation probabilities of “average” individuals, who differ only with respect to their state of health (actually, they differ only with respect to the mean values of health/doctor visits within each of the four health categories poor, medium, good, and excellent; see Section VA). For a healthy woman, the pooled probit probability of participation (Column 5) is 3 percentage points higher than for a female of poor health. Controlling for correlated individual effects (Column 6) reduces the probability difference to a mere 1.5 percentage points. The linear specifications in Columns 1 and 2 reveal the same pattern: When applying pooled OLS the probability to work is about 11 percentage points higher for healthy than for unhealthy women; the gap shrinks to two percentage points when implementing the within transformation. Columns 3 and 4 display the instrumental variables estimates. Again, the fixed effects approach reduces the probability gap; however, the magnitude of the gaps is larger than without controlling for endogeneity.

The male probit estimates show the probability difference between healthy and unhealthy individuals to vary between one percentage point when the pooled probit estimator is considered and 0.5 percentage points when controlling for the interaction between individual effects and the health variable. In the linear specifications, the corresponding values (Columns 1 and 2) are around 13 and four percentage points, respectively. Finally, allowing for the endogeneity of health satisfaction expands the probability gap to 22 and 16 percentage points, respectively.
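The percentage-point gaps discussed in the two preceding paragraphs follow directly from the entries of Table 1. As a worked check, the sketch below recomputes the difference between the excellent and poor health rows for each estimator; the dictionary layout and the function name `gaps` are illustrative choices, while the numbers are copied from Table 1.

```python
# Participation probabilities (percent) from Table 1, Columns 1-6.
columns = ["OLS", "Within", "2SLS", "FE-2SLS", "Probit", "Mundlak Probit"]
table1 = {
    "men": {
        "poor": [76.9, 83.3, 71.1, 74.9, 93.9, 94.6],
        "excellent": [90.0, 87.2, 92.6, 90.9, 95.1, 95.1],
    },
    "women": {
        "poor": [61.6, 67.3, 54.1, 55.8, 71.4, 72.7],
        "excellent": [72.1, 69.4, 75.6, 74.8, 74.4, 74.2],
    },
}

def gaps(group: str) -> list:
    """Excellent-minus-poor participation gap in percentage points, per column."""
    rows = table1[group]
    return [round(e - p, 1) for p, e in zip(rows["poor"], rows["excellent"])]

for group in ("men", "women"):
    for name, gap in zip(columns, gaps(group)):
        print(f"{group:5s} {name:15s} {gap:5.1f} pp")
```

The gaps shrink once individual effects are controlled for (Columns 2, 4, and 6 versus 1, 3, and 5) and widen under the IV specifications, which is the pattern the text describes.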

Results for most of the other variables are as expected (see Appendix Tables A5 and A6). For both women and men, the participation probability increases with age (at a decreasing rate) and education.17 Living in the eastern part of Germany is associated with lower participation for men, while the effect is positive for women (the female population in the eastern region also has a higher participation probability than their western counterparts, probably rooted in the socialist past). Being of non-German origin and the amount of nonlabor income have a negative influence on the probability of labor market participation. An increasing labor market attachment of the partner tends to reduce the probability to work for women and has a tendency to increase the participation probability in the male population, yet some of the partner and children variables exhibit the same sign for women and men, which means that the effects are somewhat ambiguous overall. For both sexes, the number of children in different age categories mostly reduces the individuals’ labor market attachment and the partner’s net wage is associated with a decreasing working probability in most specifications for both females and males.

17. For women, in some of the linear specifications the probability to work decreases with age.

Table 1
Participation Probabilities (in Percent), by Different Health Groups, Women and Men, 1995–2006

Men
                    OLS       Within    2SLS      FE-2SLS   Probit    Mundlak Probit
                    (1)       (2)       (3)       (4)       (5)       (6)
Poor health         76.9      83.3      71.1      74.9      93.9      94.6
Medium health       84.2      85.5      83.0      83.8      94.7      94.9
Good health         87.7      86.5      88.8      88.1      95.0      95.0
Excellent health    90.0      87.2      92.6      90.9      95.1      95.1

Women
                    OLS       Within    2SLS      FE-2SLS   Probit    Mundlak Probit
                    (1)       (2)       (3)       (4)       (5)       (6)
Poor health         61.6      67.3      54.1      55.8      71.4      72.7
Medium health       67.4      68.5      66.0      66.3      73.2      73.6
Good health         70.3      69.1      71.8      71.5      74.0      74.0
Excellent health    72.1      69.4      75.6      74.8      74.4      74.2

Source: GSOEP 1995–2006, own calculations. Participation probabilities are based on different binary choice models (see Appendix Tables A5 and A6). 57,203 observations from 9,277 female persons and 57,419 observations from 8,847 male individuals. Except for health satisfaction in Columns 1–4 and doctor visits in Columns 5 and 6, probabilities are evaluated at the mean values of all covariates. The state of health is defined as: poor (Categories 0–4), medium (Categories 5–6), good (Categories 7–8), and excellent (Categories 9–10) health.

B. Wage Equations

Since the core interest of this study is the estimation of the wage equation (Equation 3), results for six different estimation methods are given in Tables 3 and 4. Columns 1 through 3 in each table display results for OLS, FE, and Wooldridge’s (1995) estimator, all of which assume health to be exogenously determined. In both Tables 3 and 4, endogeneity of health is allowed for in the pooled 2SLS (Column 4) and FE-2SLS (Column 5) specifications as well as in Semykina and Wooldridge’s (2006) estimator (Column 6).


A. The Instruments

For Specifications 4 through 6, the set of instruments consists of all 11 variables that serve as exclusion restrictions in the participation equations (including “doctor visits,” see Section VA). To check the rank conditions on the 2SLS estimators, F-tests on the joint significance of the instruments in the first step regressions are conducted. For both women and men, and for all econometric models the null hypotheses are rejected at any sensible level. Overidentification tests strongly reject the null hypotheses of no correlation between the instruments and the error of the wage equation for both sexes in the pooled IV and FE-2SLS estimations (Columns 4 and 5). When testing for overidentifying restrictions in Semykina and Wooldridge’s (2006) framework, however, no correlation between the instruments and the error in the wage equation is detected. This is in line with Semykina (2007), who shows that if instruments enter the selection equation, “[. . .] they will be inevitably correlated with [. . .],” the error term of the selected sample. Consequently, if a selection bias exists, which is the case here (see Table 2), overidentification tests will detect endogeneity of the exclusion restrictions. Thus, rejecting the null hypothesis in the pooled IV and FE-2SLS approach is just another way of stating that selection bias is present.

B. The Selection Effects

A preliminary check for the presence of selection bias can be carried out by Wald tests on the joint significance of the inverse Mills ratios (Table 2). In Columns 1 and 2 we follow Wooldridge (1995) and conduct “variable addition” tests, as first proposed by Verbeek and Nijman (1992). It is assumed that no further endogeneity problems occur and under the null the standard within estimator is valid. In Columns 3 and 4 tests in the spirit of Semykina and Wooldridge (2006) are carried out, where under the null hypothesis the FE-2SLS estimator is valid. For women and men alike, the null hypothesis is strongly rejected and this evidence of selection bias in both the FE and the FE-2SLS framework indicates that use of the methods introduced in Section III is in order.18
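For readers unfamiliar with the correction term, the sketch below computes an inverse Mills ratio from a probit index. It assumes the standard textbook definition, the normal density divided by the normal distribution function evaluated at the index; the exact form used in the estimators of Section III is not reproduced here, and the function names and index values are illustrative.

```python
import math

def norm_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z: float) -> float:
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def inverse_mills(z: float) -> float:
    """Inverse Mills ratio phi(z) / Phi(z) for a probit index z."""
    return norm_pdf(z) / norm_cdf(z)

# Hypothetical probit indices: a higher index means a higher
# participation probability and a smaller correction term.
for z in (-1.0, 0.0, 1.0, 2.0):
    print(f"index={z:+.1f}  P(participate)={norm_cdf(z):.3f}  IMR={inverse_mills(z):.3f}")
```

Because the ratio falls as the participation probability rises, a negative coefficient on it in a wage equation corresponds to higher wages for individuals who are more likely to work.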

C. Health and Wages

While good health significantly increases participation for both men and women, the impact of health on the wage rate differs quite a bit across genders. For males (Table 3), the parameter of the health variable using pooled OLS (0.041) is higher than the coefficient in the fixed effects model (0.013). Both effects are significantly different from 0 at the 1 percent level. Controlling for selection lowers the significance level to 5 percent and reduces the coefficient even further (0.011), but differences between the FE and the Wooldridge (1995) estimator are practically small. This suggests that using the FE estimator already accounts for most of the bias introduced by the correlation between the health variable and unobserved individual heterogeneity. Turning to the 2SLS models, a comparison of the parameters shows that the coefficients of health satisfaction in Columns 1, 2, and 3 are smaller than their 2SLS counterparts in Columns 4, 5, and 6, which is to be expected if self-assessed health is error-ridden. Within the instrumental variable framework, the (significantly estimated) parameters again exhibit substantial differences. Using pooled 2SLS is associated with a coefficient of 0.046, whereas implementing FE-2SLS yields the highest parameter of 0.062. Though less precisely estimated, controlling for selection scales the health coefficient down to 0.041. For the Mundlak-type estimators in Columns 3 and 6, a Wald test of the joint significance of the unobserved individual effects is carried out and in both cases indicates correlated individual effects. Selection tests, where now the assumptions under the null hypothesis are more restrictive than those underlying the tests in Table 2, again reject the null of no selection effects in Columns 3 and 6. Finally, endogeneity tests show systematic differences between the health coefficients in Columns 2 and 5.

18. For both women and men, the inverse Mills ratios are negatively correlated with wages in most years (coefficients not reported). Since the IMRs are inversely related to the estimated probabilities of being employed, the negative coefficients indicate that a higher participation probability is associated with an above average salary.

Table 2
IMR Tests, Women and Men, 1995–2006

                        Within^a              FE-2SLS^b
                        Male      Female      Male      Female
Wald test, χ²(12)       139.80    44.52       131.68    44.11
P-values                0.000     0.000       0.000     0.000
N                       47,746    37,670      47,746    37,670

Source: GSOEP 1995–2006, own calculations. Within and FE-2SLS estimation. Robust p-values are reported under the test statistics. a. Wald tests on the joint significance of the IMRs are provided. It is assumed that there are no further endogeneity problems. Under the null hypothesis the within estimators are valid. b. Wald tests on the joint significance of the IMRs are provided. Under the null hypothesis the FE-2SLS estimators are valid.
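The p-values of 0.000 reported with the Wald statistics in Table 2 can be checked by hand. The sketch below assumes that each statistic is asymptotically chi-square distributed with 12 degrees of freedom, one per inverse Mills ratio (Tables 3 and 4 refer to "12 IMRs"); that distributional assumption and the helper name `chi2_sf_even_df` are not taken from the text. For an even number of degrees of freedom the upper tail probability has a closed form, so no statistics library is needed.

```python
import math

def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper tail probability P(X > x) for a chi-square variable with even df.

    Uses the closed form exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j  # builds (x/2)^j / j! incrementally
        total += term
    return math.exp(-half) * total

# Wald statistics from Table 2 (within male/female, FE-2SLS male/female).
wald_stats = [139.80, 44.52, 131.68, 44.11]
for w in wald_stats:
    print(f"Wald = {w:7.2f}  p-value = {chi2_sf_even_df(w, 12):.2e}")
```

Under this assumption even the smallest statistic lies far in the tail, so all four p-values round to 0.000 at three decimals.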

The same six econometric models using the female sample are presented in Table 4. The results, however, are less intuitive than in the male sample. As with men, selection corrections are indicated by Wald tests on the joint significance of the IMRs for the models in Columns 3 and 6. In these specifications Wald tests confirm the presence of correlated individual effects just as in the male sample, whereas endogeneity tests suggest that the health variable is exogenous in Columns 1, 2, and 3. Throughout all specifications, only pooled OLS points to a significant effect of health for females. Therefore, summarizing the above, it seems that for women, health has only a negligible effect on wages (intensive margin), though there exists a significant effect on labor market participation (extensive margin).

In an attempt to give an idea of the economic significance of the above results and in order to facilitate comparison of the various estimators, Table 5 provides predicted wages for four “average” individuals, who differ only in their state of


Table 3
Wage Equations, Men, 1995–2006

                                    OLS^a          Within^a       Wooldr95^c     2SLS^b         FE-2SLS^b      SemWool06^d
                                    (1)            (2)            (3)            (4)            (5)            (6)

Log health satisfaction             0.041          0.013          0.011          0.046          0.062          0.041
                                    (0.004)***     (0.004)***     (0.005)**      (0.014)***     (0.02)***      (0.024)*
Age                                 0.083                                        0.083
                                    (0.007)***                                   (0.007)***
Age squared                         -0.002         -0.002         -0.002         -0.002         -0.002         -0.002
                                    (0.0002)***    (0.0002)***    (0.0003)***    (0.0002)***    (0.0002)***    (0.0003)***
Age triple                          1.00e-05       1.00e-05       1.00e-05       1.00e-05       1.00e-05       1.00e-05
                                    (1.29e-06)***  (1.72e-06)***  (2.21e-06)***  (1.29e-06)***  (1.73e-06)***  (2.22e-06)***
Unemployment experience             -0.047         -0.105         -0.096         -0.047         -0.106         -0.097
                                    (0.003)***     (0.012)***     (0.017)***     (0.003)***     (0.012)***     (0.017)***
Unemployment experience squared     0.003          0.004          0.004          0.003          0.004          0.004
                                    (0.0004)***    (0.002)*       (0.003)        (0.0004)***    (0.002)*       (0.003)
Firm tenure                         0.014          0.004          0.005          0.014          0.005          0.005
                                    (0.0005)***    (0.0008)***    (0.001)***     (0.0005)***    (0.0008)***    (0.001)***
Firm tenure squared                 -0.0002        -0.0001        -0.0001        -0.0002        -0.0001        -0.0001
                                    (1.00e-05)***  (0.00002)***   (0.00003)***   (1.00e-05)***  (0.00002)***   (0.00003)***
Education                           0.034                                        0.034
                                    (0.0008)***                                  (0.0008)***
Dummy education                     -0.023         -0.006         -0.003         -0.023         -0.006         -0.003
                                    (0.004)***     (0.004)        (0.005)        (0.004)***     (0.004)        (0.005)
Parttime                            -0.115         -0.038         -0.025         -0.115         -0.036         -0.025
                                    (0.016)***     (0.019)**      (0.022)        (0.016)***     (0.019)*       (0.022)
Foreigner                           -0.002                                       -0.002
                                    (0.005)                                      (0.005)

State-level variables
  Log unemployment (federal state)  -0.115         -0.017         -0.005         -0.115         -0.015         -0.005
                                    (0.006)***     (0.017)        (0.021)        (0.006)***     (0.017)        (0.021)
  Log vacancies (federal state)     -0.040         -0.003         -0.004         -0.040         -0.002         -0.004
                                    (0.008)***     (0.007)        (0.009)        (0.008)***     (0.007)        (0.009)
  Log employed (federal state)      0.171          0.025          0.017          0.171          0.023          0.018
                                    (0.01)***      (0.018)        (0.023)        (0.01)***      (0.018)        (0.023)
East Germany                        -0.221         -0.041         -0.037         -0.221         -0.041         -0.037
                                    (0.006)***     (0.01)***      (0.012)***     (0.006)***     (0.01)***      (0.012)***

Number of children
  Up to 2 years of age              0.036          0.013          0.01           0.036          0.013          0.01
                                    (0.005)***     (0.005)**      (0.006)*       (0.005)***     (0.005)***     (0.006)*
  3–5 years of age                  0.036          0.018          0.015          0.036          0.019          0.015
                                    (0.004)***     (0.005)***     (0.006)***     (0.004)***     (0.005)***     (0.006)***
  6–16 years of age                 0.013          0.002          0.0006         0.013          0.002          0.0009
                                    (0.003)***     (0.003)        (0.004)        (0.003)***     (0.003)        (0.004)
Dummy no children                   -0.018         -0.008         -0.008         -0.017         -0.007         -0.008
                                    (0.005)***     (0.006)        (0.007)        (0.005)***     (0.006)        (0.007)

Firm size (base category: <20 employees)
  20–199                            0.087          0.046          0.036          0.087          0.045          0.036
                                    (0.005)***     (0.006)***     (0.008)***     (0.005)***     (0.006)***     (0.008)***
  200–1,999                         0.153          0.059          0.047          0.153          0.058          0.047
                                    (0.005)***     (0.008)***     (0.009)***     (0.005)***     (0.008)***     (0.009)***
  ≥2,000                            0.192          0.067          0.055          0.192          0.066          0.054
                                    (0.005)***     (0.008)***     (0.01)***      (0.005)***     (0.008)***     (0.01)***
Firm size missing                   0.08           0.022          0.037          0.08           0.021          0.038
                                    (0.017)***     (0.017)        (0.019)*       (0.017)***     (0.017)        (0.019)**
Constant                            -0.146                                       -0.159
                                    (0.106)                                      (0.113)

N                                   47,746         47,746         47,746         47,746         47,746         47,746
DF                                  47,695         40,020         47,651         47,695         40,020         47,641

Wald tests on the joint significance of
  12 IMRs                                                         83.04***                                     64.21***
  11 time dummies                   349.62***      266.15***      138.27***      349.71***      265.53***      137.37***
  6 occupation dummies              2709.41***     17.80***       728.22***      2685.21***     17.97***       687.12***
  9 sector dummies                  1369.84***     65.81***       398.07***      1372.60***     66.01***       405.00***
  Unobserved effects^e                                            1,080.87***                                  1,146.49***

Source: GSOEP 1995–2006, own calculations. Standard errors in parentheses: * significance at ten, ** at five, and *** at 1 percent. Year, sector, and occupation dummies included, but not reported. a. Standard errors are robust to serial correlation and heteroskedasticity; b. robust standard errors as in a, but the 2SLS estimator is used and accounted for; c. robust standard errors as in a, but the variation introduced by the probit first-stage estimation is accounted for; d. robust standard errors as in c, but the 2SLS estimator is used and accounted for; e. $\chi^2$ test statistics for the joint significance of 35 variables (vector $x_i$) or 45 variables (vector $q_i$) are reported.


Table 4
Wage Equations, Women, 1995–2006

                                    OLS^a          Within^a       Wooldr95^c     2SLS^b         FE-2SLS^b      SemWool06^d
                                    (1)            (2)            (3)            (4)            (5)            (6)

Log health satisfaction             0.023          0.007          0.005          0.008          0.03           0.007
                                    (0.005)***     (0.005)        (0.005)        (0.014)        (0.022)        (0.024)
Age                                 0.08                                         0.08
                                    (0.008)***                                   (0.008)***
Age squared                         -0.002         -0.002         -0.002         -0.002         -0.002         -0.002
                                    (0.0002)***    (0.0003)***    (0.0003)***    (0.0002)***    (0.0003)***    (0.0004)***
Age triple                          9.57e-06       1.00e-05       1.00e-05       9.59e-06       1.00e-05       1.00e-05
                                    (1.60e-06)***  (2.25e-06)***  (2.69e-06)***  (1.60e-06)***  (2.25e-06)***  (2.85e-06)***
Unemployment experience             -0.029         -0.091         -0.088         -0.03          -0.091         -0.088
                                    (0.002)***     (0.017)***     (0.021)***     (0.002)***     (0.017)***     (0.021)***
Unemployment experience squared     0.001          0.002          0.002          0.001          0.002          0.002
                                    (0.0002)***    (0.003)        (0.004)        (0.0002)***    (0.003)        (0.004)
Firm tenure                         0.016          0.003          0.003          0.016          0.003          0.003
                                    (0.0007)***    (0.001)***     (0.001)**      (0.0007)***    (0.001)***     (0.001)**
Firm tenure squared                 -0.0002        -0.00007       -0.00009       -0.0002        -0.00007       -0.00009
                                    (0.00002)***   (0.00003)**    (0.00004)**    (0.00002)***   (0.00003)**    (0.00004)**
Education                           0.042                                        0.042
                                    (0.0009)***                                  (0.0009)***
Dummy education                     -0.030         -0.013         -0.010         -0.030         -0.012         -0.010
                                    (0.005)***     (0.006)**      (0.008)        (0.005)***     (0.006)**      (0.008)
Parttime                            -0.047         0.018          0.025          -0.047         0.018          0.025
                                    (0.004)***     (0.007)**      (0.008)***     (0.004)***     (0.007)**      (0.008)***
Foreigner                           0.014                                        0.014
                                    (0.006)**                                    (0.006)**

State-level variables
  Log unemployment (federal state)  -0.101         0.0008         0.018          -0.101         0.002          0.017
                                    (0.007)***     (0.018)        (0.023)        (0.007)***     (0.018)        (0.024)
  Log vacancies (federal state)     -0.041         -0.003         0.002          -0.040         -0.003         0.002
                                    (0.008)***     (0.009)        (0.01)         (0.008)***     (0.009)        (0.012)
  Log employed (federal state)      0.133          0.031          0.005          0.132          0.031          0.006
                                    (0.012)***     (0.022)        (0.029)        (0.012)***     (0.022)        (0.03)
East Germany                        -0.190         -0.039         -0.037         -0.191         -0.039         -0.037
                                    (0.007)***     (0.013)***     (0.014)**      (0.007)***     (0.013)***     (0.014)**

Number of children
  Up to 2 years of age              0.047          -0.013         0.031          0.047          -0.013         0.031
                                    (0.017)***     (0.017)        (0.021)        (0.017)***     (0.017)        (0.022)
  3–5 years of age                  0.025          -0.009         0.015          0.024          -0.008         0.014
                                    (0.009)***     (0.01)         (0.013)        (0.009)***     (0.01)         (0.014)
  6–16 years of age                 -0.006         -0.012         -0.004         -0.006         -0.012         -0.004
                                    (0.005)        (0.007)*       (0.009)        (0.005)        (0.007)*       (0.009)
Dummy no children                   -0.011         0.008          0.007          -0.011         0.008          0.007
                                    (0.008)        (0.01)         (0.012)        (0.008)        (0.01)         (0.013)

Firm size (base category: <20 employees)
  20–199                            0.088          0.031          0.026          0.088          0.031          0.026
                                    (0.005)***     (0.007)***     (0.009)***     (0.005)***     (0.007)***     (0.009)***
  200–1,999                         0.139          0.052          0.044          0.138          0.052          0.044
                                    (0.005)***     (0.008)***     (0.01)***      (0.005)***     (0.008)***     (0.01)***
  ≥2,000                            0.177          0.055          0.04           0.177          0.054          0.04
                                    (0.006)***     (0.009)***     (0.011)***     (0.006)***     (0.009)***     (0.011)***
Firm size missing                   0.102          0.04           0.071          0.102          0.039          0.072
                                    (0.02)***      (0.017)**      (0.021)***     (0.02)***      (0.017)**      (0.021)***
Constant                            0.08                                         0.122
                                    (0.12)                                       (0.125)

N                                   37,670         37,670         37,670         37,670         37,670         37,670
DF                                  37,619         31,063         37,575         37,619         31,063         37,565

Wald tests on the joint significance of
  12 IMRs                                                         53.52***                                     44.50***
  11 time dummies                   142.69***      203.37***      110.07***      143.24***      203.97***      96.64***
  6 occupation dummies              2,293.65***    23.41***       705.71***      2,292.66***    23.37***       664.39***
  9 sector dummies                  537.381***     25.40***       162.58***      538.35***      25.45***       162.13***
  Unobserved effects^e                                            985.57***                                    1,024.72***

Source: GSOEP 1995–2006, own calculations. Standard errors in parentheses: * significance at ten, ** at five, and *** at 1 percent. Year, sector, and occupation dummies are included, but not reported. a. Standard errors are robust to serial correlation and heteroskedasticity; b. robust standard errors as in a, but the 2SLS estimator is used and accounted for; c. robust standard errors as in a, but the variation introduced by the probit first-stage estimation is accounted for; d. robust standard errors as in c, but the 2SLS estimator is used and accounted for; e. $\chi^2$ test statistics for the joint significance of 35 variables (vector $x_i$), or 45 variables (vector $q_i$) are reported.


Table 5
Wage Predictions (per Hour), by Different Health Groups, Women and Men, 1995–2006

Men
                    OLS       Within    Wooldr95  2SLS      FE-2SLS   SemWool06
                    (1)       (2)       (3)       (4)       (5)       (6)
Poor health         12.137    12.115    11.162    12.086    11.612    10.967
Medium health       12.475    12.218    11.246    12.463    12.104    11.274
Good health         12.641    12.268    11.286    12.650    12.350    11.425
Excellent health    12.752    12.301    11.313    12.774    12.514    11.526

Women
                    OLS       Within    Wooldr95  2SLS      FE-2SLS   SemWool06
                    (1)       (2)       (3)       (4)       (5)       (6)
Poor health         9.870     8.882     5.779     9.993     8.712     5.782
Medium health       10.020    8.925     5.799     10.045    8.891     5.811
Good health         10.095    8.947     5.809     10.071    8.980     5.826
Excellent health    10.144    8.960     5.815     10.088    9.038     5.835

Source: GSOEP 1995–2006, own calculations. Predicted hourly wages (in Euros, deflated to unity at year end 2001) based on different wage equations (see Tables 3 and 4). 47,746 observations from 7,679 male persons and 37,670 observations from 6,560 female individuals. Except for health satisfaction and the IMRs, wages are evaluated at the mean values of all explanatory variables. The state of health is defined as: poor (Category 0–4), medium (Category 5–6), good (Category 7–8), and excellent (Category 9–10) health.

health.19 For a male in excellent health, pooled OLS predicts real wages to be about 5 percent (0.62 Euro) higher than for a male person suffering from poor health.20 Accounting for individual heterogeneity in Column 2 reduces the wage gap to about 0.19 Euro or 1.5 percent. Whenever nonrandom selection into the work force (Wooldridge 1995) is additionally considered, hourly wages decline on each health level and the wage differential in Column 3 shrinks to 1.3 percent (0.15 Euro) when compared to the within estimator. Predictions are slightly different when implementing instrumental variable techniques. The wage gap between individuals who are highly satisfied with their health status and those suffering from poor health is about 0.69 Euro (5.7 percent) in Column 4 and 0.90 Euro (7.8 percent) in Column 5. Again, wages are reduced in each health group whenever sample selection is accounted for. Here, the wage premium for being in excellent health is estimated to be around 0.56 Euro (5.1 percent).

19. Since the individuals’ health status also affects the probability of participating, we calculate “mean” IMRs (over time and person), which differ with respect to the corresponding health groups. Hence, the simulations in Table 5 are based on mean values of all explanatory variables in the wage equations except for health satisfaction and doctor visits, respectively, as well as the inverse Mills ratios.

20. In order to obtain hourly wages we exponentiate predicted log wages. This procedure differs somewhat from the one proposed by Kennedy (1983). However, we conduct sensitivity tests based on the pooled OLS estimates and find the differences between Kennedy’s method and our predictions to be negligible.

The differences in real hourly wages between healthy and unhealthy females range from 0.6 percent to 3.7 percent, conditional on the method of estimation. Since all estimators with the exception of pooled OLS lack sufficient precision, simulation results in the female sample must be interpreted with caution. Keeping this in mind, the decline in wages induced by accounting for nonrandom selection into work (Columns 3 and 6) is fairly strong in the female sample. This finding is rooted in the fact that integrating the large share of nonparticipating females into the work force would drastically reduce average wages.
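The wage gaps quoted for men and women can be recomputed from the rounded entries of Table 5. The sketch below is a worked check; the data layout and the function name `wage_gaps` are illustrative, and the percentage is taken relative to the poor health wage, which is an assumption about how the reported percentages were formed. Because the table entries are rounded to three decimals, the results can differ from the reported figures in the last digit.

```python
# Predicted hourly wages (Euro) from Table 5, Columns 1-6.
columns = ["OLS", "Within", "Wooldr95", "2SLS", "FE-2SLS", "SemWool06"]
table5 = {
    "men": {
        "poor": [12.137, 12.115, 11.162, 12.086, 11.612, 10.967],
        "excellent": [12.752, 12.301, 11.313, 12.774, 12.514, 11.526],
    },
    "women": {
        "poor": [9.870, 8.882, 5.779, 9.993, 8.712, 5.782],
        "excellent": [10.144, 8.960, 5.815, 10.088, 9.038, 5.835],
    },
}

def wage_gaps(group: str) -> list:
    """(Euro gap, percent gap relative to poor health) per column."""
    rows = table5[group]
    return [
        (round(e - p, 3), round(100.0 * (e - p) / p, 2))
        for p, e in zip(rows["poor"], rows["excellent"])
    ]

for group in ("men", "women"):
    for name, (euro, pct) in zip(columns, wage_gaps(group)):
        print(f"{group:5s} {name:10s} {euro:6.3f} Euro  {pct:5.2f} percent")
```

For men the gaps run from about 0.15 Euro under the Wooldridge (1995) estimator to about 0.90 Euro under FE-2SLS; for women the percentage gaps span roughly 0.6 to 3.7 percent, in line with the ranges given in the text.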

D. Other Results

Aside from our main interest in the effect of health on wages, concave wage profiles are found with respect to firm tenure in all specifications and for women and men. Given the high unemployment rates in Germany, it is of practical relevance to see that in all models past unemployment periods go with significantly lower wages (at an increasing rate), whereas education positively affects participation and comes with a rate of return per additional year of schooling of 4.2 percent for women and approximately 3.4 percent for men.

Results for most of the other variables are as expected. For both women and men, wages increase at a decreasing rate with age. Working in the eastern part of Germany or being in parttime employment reduces salaries. In the pooled specifications in Columns 1 and 4, a larger average number of job seekers in the federal state negatively influences wages, whereas an increase in the number of employed raises the wage rate. Women and men working in large firms (≥2,000 employees) earn significantly more than those in medium-sized firms, who in turn earn more than those employed in small firms. These effects persist when controlling for individual heterogeneity and selection, albeit smaller in magnitude. The number of children (especially 0 to 5 year olds) is associated with higher wages in the male sample, whereas results in the female sample are insignificant in most specifications. Finally, as for the structural factors affecting wages, we find industry and occupational wage differentials in all models and irrespective of gender; Wald tests confirm the joint significance of six occupational and nine sector dummies at any sensible level.

VI. Conclusion

In this article, we employ recently developed estimation methods by Semykina and Wooldridge (2006) in order to control for selection, individual heterogeneity, and endogeneity in one comprehensive framework and apply them to the question of whether health has a causal effect on wages. A number of tests provide evidence that corrections for nonrandom selection into the work force are necessary in both the female and male sample, and the health variable is found to suffer from measurement error. Our results show that good health raises wages for men, while for women there appears to be no significant effect. The fact that predicted participation probabilities in the probit models are more contingent on health for the average woman than they are for the average man is in line with this finding, providing tentative evidence that for females health may mainly affect participation, while for males the effect is essentially to be found on the intensive margin.

A question for further research that immediately comes to mind is whether investment in improved health status is worthwhile at the micro or macro level. While monetary gains from being in good health are calculated in Table 5, these results are not very informative from a welfare point of view, which must take into account individual utility gains and adequate cost measures. Another task for future research could be the estimation of an “all-encompassing” model which takes into account all sources of endogeneity mentioned in this article and additionally tries to address dynamic effects in the state of health.

Appendix

Table A1
Description of Variables

| Variable | Description |
| --- | --- |
| Probit | dummy variable indicating participation in the labor market (probit = 1) or no participation (probit = 0) |
| Log hourly wage | log earnings per hour (deflated to 2001 Euros) |
| Health satisfaction | variable indicating current health satisfaction of an individual; categories range from 0–10; transformation: $f(h_{i,t}) = \log\left(h_{i,t} + \sqrt{h_{i,t}^{2} + 1}\right)$ |
| Age | age in years |
| Education | amount of education or training in years |
| Dummy education | whenever years of education or training decrease over time, the lower values are changed to the former maximum and the dummy education variable set to one |
| Unemployment experience | duration of unemployment in a person’s career; in years, with months in decimal form |
| Firm tenure | duration of time with firm; in years, with months in decimal form |
| Log nonlabor income | logged household income minus net wage income (in 2001 Euros) |
| Number of doctor visits | number of doctor visits in the last three months |
| Parttime | dummy variable indicating parttime work |
| Foreigner | dummy variable indicating non-German nationality |
| Firm size | four dummy variables indicating different firm sizes; categories: up to 20 employees; 20–199 employees; 200–1999 employees; larger than 2000 employees |
| Occupation | seven occupation dummies, constructed using the Erikson, Goldthorpe Class Category IS88 (base: high service) |
| Sector | ten aggregated sector dummies, based on the NACE classification (base: agriculture, forestry, fishing) |
| Time | 11 time dummies (1996–2006) (base: 1995) |
| State level variables | |
| Log unemployment^a | (log) yearly averages of job seekers in the individual state of residence |
| Log vacancies | (log) yearly average of notified vacancies (per state) |
| Log employed | (log) yearly average of employed persons (per state) |
| Dummy East Germany | dummy variables indicating where a person lives (probit equation) or works (wage equation); Region = 0 if Western Germany |
| Parent variables | |
| Number of children^b | number of children in three categories: (1) up to two years old; (2) between 3–5 years old; (3) between 6–16 years old |
| Dummy no children | dummy variable indicating the presence of children under the age of 17 (dummy no children = 1 if no children present) |
| Partner or spouse variables^c | |
| Single | dummy variable indicating whether a person has a partner/is married (single = 1 if person has no partner) |
| Flag missing | dummy variable indicating missing data on partner/spouse variables (dummy flag missing = 1 if partner present but data missing) |
| Net wage | net wage of partner or spouse |
| Age | age in years of partner or spouse |
| Experience | labor market experience of partner/spouse |
| Education | amount of education or training in years of partner/spouse |

a. Unemployment, vacancy, and employment figures are provided by the Federal Employment Agency, Nuremberg; b. All children variables equal zero, if dummy no children = 1; c. All partner/spouse variables equal zero, if single = 1 or flag missing = 1.
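The health satisfaction transformation in Table A1 can be checked numerically. The sketch below assumes the transformation is f(h) = log(h + sqrt(h^2 + 1)), that is, the inverse hyperbolic sine, which stays defined at the lowest category h = 0. The function name `health_transform` is ours and purely illustrative. Under this reading, categories 4 and 9 map to roughly 2.095 and 2.893, which coincide with the 10th and 90th percentiles of log health satisfaction reported in Table A9.

```python
import math


def health_transform(h: float) -> float:
    """Assumed transformation f(h) = log(h + sqrt(h^2 + 1)).

    This is the inverse hyperbolic sine; unlike a plain log it is
    defined for the lowest satisfaction category, h = 0.
    """
    return math.log(h + math.sqrt(h * h + 1.0))


# Tabulate the transformed value for every satisfaction category 0..10.
for category in range(0, 11):
    print(category, round(health_transform(category), 3))
```

The transformed scale is close to log(2h) for larger categories, so coefficients on it can be read approximately like coefficients on a logged regressor.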


Table A2
Stepwise Adjustment of Samples (in Percent) and Average Years of Individuals in Sample

| Step | Men: Observations | Men: Percent | Men: Individuals | Men: Average Years In Sample | Women: Observations | Women: Percent | Women: Individuals | Women: Average Years In Sample |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| (1) Complete sample | 104,540 | | 16,037 | 6.52 | 113,468 | | 16,997 | 6.68 |
| (2) Between 18 and 65 years | 88,194 | 15.64 | 14,181 | 6.22 | 92,445 | 18.53 | 14,467 | 6.39 |
| (3) Not pensioners | 81,108 | 8.03 | 13,378 | 6.06 | 84,654 | 8.43 | 13,677 | 6.19 |
| (4) Not in education | 77,163 | 4.86 | 12,969 | 5.95 | 80,081 | 5.40 | 13,141 | 6.09 |
| (5) Not self-employed | 69,974 | 9.32 | 12,249 | 5.71 | 76,378 | 4.62 | 12,855 | 5.94 |
| (6) Not on maternity leave | 69,922 | 0.07 | 12,246 | 5.71 | 72,461 | 5.13 | 12,726 | 5.69 |
| (7) Not military/civilian service | 69,651 | 0.39 | 12,219 | 5.70 | 72,455 | 0.008 | 12,723 | 5.69 |
| (8) No apprenticeship or similar | 66,203 | 4.95 | 11,765 | 5.63 | 69,428 | 4.18 | 12,316 | 5.64 |
| (9) Not marginally or irregularly parttime employed | 64,983 | 1.84 | 11,601 | 5.60 | 65,028 | 6.34 | 12,008 | 5.42 |
| (10) Not in “sheltered workshops” | 64,866 | 0.18 | 11,589 | 5.60 | 64,901 | 0.20 | 11,997 | 5.41 |
| (11) With valid information on all probit variables | 57,419 | 11.48 | 8,847 | 6.49 | 57,203 | 11.86 | 9,277 | 6.17 |
| (12) Labor market participants | 49,397 | | 8,239 | 6.00 | 39,336 | | 7,305 | 5.38 |
| (13) With valid information on all wage equation variables^a | 47,746 | 3.34 | 7,679 | 6.22 | 37,670 | 4.24 | 6,560 | 5.74 |

Data: GSOEP, samples A–F, 1995–2006. The sample is pooled on the individual-year level. a. For estimating earnings equations, individuals who work for only one year are dropped from the sample. Observations with missing data on wages are included in the participation equation if they report to have worked for pay in the month before the interview.
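The percent column of Table A2 appears to report the share of observations lost at each step relative to the preceding step. This is our reading of the table, not something the authors state. The sketch below uses the male observation counts for steps (1) to (11) as we read them from Table A2; the names `men_obs` and `stepwise_loss` are illustrative.

```python
# Observations remaining for men after steps (1)-(11), as read from Table A2.
men_obs = [104540, 88194, 81108, 77163, 69974, 69922,
           69651, 66203, 64983, 64866, 57419]


def stepwise_loss(obs):
    """Percent of observations dropped at each step, relative to the previous step."""
    return [round(100.0 * (prev - cur) / prev, 2)
            for prev, cur in zip(obs, obs[1:])]


losses = stepwise_loss(men_obs)
for step, loss in enumerate(losses, start=2):
    print(f"step ({step}): {loss:.2f} percent dropped")
```

Under this assumption the computed shares match the percent column for men, for example 15.64 percent at step (2) and 11.48 percent at step (11).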


Table A3
Frequency Distribution, Health Satisfaction

| Category | Overall Sample: Absolute | Overall Sample: Percent | Male Sample: Absolute | Male Sample: Percent | Female Sample: Absolute | Female Sample: Percent |
| --- | --- | --- | --- | --- | --- | --- |
| 00 | 814 | 0.71 | 371 | 0.65 | 443 | 0.77 |
| 01 | 769 | 0.67 | 371 | 0.65 | 398 | 0.70 |
| 02 | 2,351 | 2.05 | 1,155 | 2.01 | 1,196 | 2.09 |
| 03 | 4,828 | 4.21 | 2,372 | 4.13 | 2,456 | 4.29 |
| 04 | 6,004 | 5.24 | 2,898 | 5.05 | 3,106 | 5.43 |
| 05 | 14,718 | 12.84 | 6,840 | 11.91 | 7,878 | 13.77 |
| 06 | 12,053 | 10.52 | 6,093 | 10.61 | 5,960 | 10.42 |
| 07 | 21,008 | 18.33 | 10,776 | 18.77 | 10,232 | 17.89 |
| 08 | 29,600 | 25.82 | 15,004 | 26.13 | 14,596 | 25.52 |
| 09 | 13,767 | 12.01 | 7,038 | 12.26 | 6,729 | 11.76 |
| 10 | 8,710 | 7.60 | 4,501 | 7.84 | 4,209 | 7.36 |
| All | 114,622 | 100.00 | 57,419 | 100.00 | 57,203 | 100.00 |

Data: GSOEP, samples A–F, 1995–2006. Observations are on individual-year level.


Table A4
Different Functional Forms of the Health Variable

Male Sample

| | Wage Equation (1) | Wage Equation (2) | Wage Equation (3) | Participation Equation (4) | Participation Equation (5) | Participation Equation (6) |
| --- | --- | --- | --- | --- | --- | --- |
| Linear health satisfaction | 0.008 (0.0008)*** | | | 0.019 (0.0008)*** | | |
| Log health satisfaction | | 0.041 (0.004)*** | | | 0.109 (0.004)*** | |
| Medium health | | | 0.003 (0.005) | | | 0.08 (0.005)*** |
| Good health | | | 0.033 (0.005)*** | | | 0.115 (0.005)*** |
| Excellent health | | | 0.039 (0.006)*** | | | 0.113 (0.005)*** |
| N | 47,746 | 47,746 | 47,746 | 57,419 | 57,419 | 57,419 |

Female Sample

| | Wage Equation (1) | Wage Equation (2) | Wage Equation (3) | Participation Equation (4) | Participation Equation (5) | Participation Equation (6) |
| --- | --- | --- | --- | --- | --- | --- |
| Linear health satisfaction | 0.005 (0.0009)*** | | | 0.015 (0.0009)*** | | |
| Log health satisfaction | | 0.023 (0.005)*** | | | 0.087 (0.004)*** | |
| Medium health | | | -0.012 (0.006)* | | | 0.073 (0.006)*** |
| Good health | | | 0.013 (0.006)** | | | 0.094 (0.006)*** |
| Excellent health | | | 0.017 (0.007)*** | | | 0.095 (0.006)*** |
| N | 37,670 | 37,670 | 37,670 | 57,203 | 57,203 | 57,203 |

Data: GSOEP, samples A–F, 1995–2006. All specifications (including the linear probability models in Columns 4, 5, and 6) are estimated using pooled OLS. Robust standard errors are in parentheses: * significance at ten, ** at five, and *** at 1 percent. The state of health in Columns 3 and 6 is defined as: poor (Category 0–4), medium (Category 5–6), good (Category 7–8), and excellent (Category 9–10). Further explanatory variables are included (see Tables 3, 4 and 10, 11), but not reported.
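Because the wage equations in Table A4 use log hourly wages, the coefficients on the health dummies in Column 3 are differences in log points relative to poor health. A minimal sketch of the usual conversion to percent differences, 100 * (exp(b) - 1), is given below using the male-sample coefficients from Table A4. The helper name `log_points_to_percent` is ours, and the conversion ignores sampling error in the coefficients, so it is an illustration rather than the authors' simulation procedure.

```python
import math


def log_points_to_percent(b: float) -> float:
    """Convert a log-wage coefficient on a dummy into a percent wage difference."""
    return 100.0 * (math.exp(b) - 1.0)


# Male sample, wage equation, Column 3 of Table A4 (base category: poor health).
male_health_dummies = {"medium": 0.003, "good": 0.033, "excellent": 0.039}

for state, coef in male_health_dummies.items():
    print(f"{state}: {log_points_to_percent(coef):.2f} percent")
```

For coefficients this small the percent difference is only slightly larger than 100 times the coefficient itself.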


Table A5
Participation Equation, Men, 1995–2006

| | OLS^a (1) | Within^a (2) | 2SLS^b (3) | FE-2SLS^ab (4) | Probit^c (5) | Mundlak Probit^cd (6) |
| --- | --- | --- | --- | --- | --- | --- |
| Age | 0.029 (0.006)*** | | 0.030 (0.006)*** | | 0.166 (0.045)*** | |
| Age squared | -0.0005 (0.0002)*** | 0.0005 (0.0002)*** | -0.0005 (0.0002)*** | 0.0005 (0.0002)** | -0.004 (0.001)*** | -0.001 (0.001) |
| Age triple | 1.36e-06 (1.20e-06) | -7.46e-06 (1.72e-06)*** | 1.23e-06 (1.20e-06) | -7.77e-06 (1.73e-06)*** | 0.00002 (9.30e-06)** | -3.36e-06 (1.00e-05) |
| Education | 0.015 (0.0006)*** | | 0.014 (0.0006)*** | | 0.109 (0.009)*** | |
| Dummy education | -0.005 (0.004) | -0.012 (0.004)*** | -0.004 (0.004) | -0.012 (0.004)*** | -0.066 (0.034)* | -0.049 (0.034) |
| Foreigner | -0.042 (0.005)*** | | -0.043 (0.005)*** | | -0.224 (0.052)*** | |
| Log health satisfaction | 0.109 (0.004)*** | 0.032 (0.005)*** | 0.178 (0.012)*** | 0.133 (0.020)*** | | |
| Doctor visits | | | | | -0.028 (0.002)*** | -0.013 (0.002)*** |
| Log nonlabor income | -0.036 (0.0004)*** | -0.033 (0.0006)*** | -0.036 (0.0004)*** | -0.033 (0.0006)*** | -0.362 (0.025)*** | -0.367 (0.025)*** |
| State level variables | | | | | | |
| Log unemployment (federal state) | -0.069 (0.005)*** | -0.095 (0.016)*** | -0.068 (0.005)*** | -0.092 (0.016)*** | -0.438 (0.067)*** | -0.447 (0.117)*** |
| Log vacancies (federal state) | 0.008 (0.007) | -0.021 (0.008)*** | 0.006 (0.007) | -0.021 (0.008)*** | 0.078 (0.052) | -0.064 (0.051) |
| Log employed (federal state) | 0.060 (0.009)*** | 0.101 (0.019)*** | 0.060 (0.009)*** | 0.096 (0.019)*** | 0.353 (0.088)*** | 0.353 (0.144)** |
| East-Germany | -0.028 (0.006)*** | -0.013 (0.022) | -0.027 (0.006)*** | -0.016 (0.022) | -0.178 (0.065)*** | -0.151 (0.161) |
| Number of children | | | | | | |
| ≤2 years of age | 0.003 (0.005) | 0.012 (0.006)** | 0.001 (0.005) | 0.011 (0.006)* | 0.062 (0.045) | 0.061 (0.044) |
| 3–5 years of age | -0.010 (0.004)** | 0.005 (0.005) | -0.010 (0.004)** | 0.005 (0.005) | -0.053 (0.038) | 0.005 (0.039) |
| 6–16 years of age | -0.016 (0.003)*** | -0.009 (0.004)** | -0.017 (0.003)*** | -0.008 (0.004)** | -0.075 (0.026)*** | -0.049 (0.030)* |
| Dummy no children | -0.066 (0.005)*** | -0.011 (0.006)* | -0.064 (0.005)*** | -0.010 (0.006) | -0.216 (0.055)*** | 0.010 (0.048) |
| Partner or spouse variables | | | | | | |
| Single | 0.708 (0.052)*** | 0.040 (0.061) | 0.689 (0.052)*** | 0.026 (0.062) | 4.420 (0.489)*** | 1.196 (1.038) |
| Net wage partner/spouse | -0.00002 (3.17e-06)*** | -0.00005 (4.63e-06)*** | -0.00002 (3.18e-06)*** | -0.00005 (4.69e-06)*** | -0.0002 (0.00003)*** | -0.0004 (0.00005)*** |
| Age partner/spouse | 0.019 (0.002)*** | 0.013 (0.003)*** | 0.019 (0.002)*** | 0.013 (0.003)*** | 0.135 (0.016)*** | 0.093 (0.024)*** |
| Age squared partner/spouse | -0.0002 (0.00002)*** | -0.0001 (0.00004)*** | -0.0002 (0.00002)*** | -0.0001 (0.00004)*** | -0.001 (0.0002)*** | -0.0009 (0.0003)*** |
| Experience partner/spouse | -0.0007 (0.0006) | -0.004 (0.002)** | -0.0009 (0.0006) | -0.004 (0.002)** | -0.002 (0.007) | -0.012 (0.014) |
| Experience squared partner/spouse | -0.00003 (0.00002) | 0.00009 (0.00005)* | -0.00002 (0.00002) | 0.00009 (0.00005)* | -0.0002 (0.0002) | 0.0002 (0.0004) |
| Education partner/spouse | 0.060 (0.006)*** | -0.021 (0.011)* | 0.058 (0.006)*** | -0.023 (0.011)** | 0.279 (0.060)*** | -0.044 (0.152) |
| Education squared partner/spouse | -0.002 (0.0002)*** | 0.0005 (0.0004) | -0.002 (0.0002)*** | 0.0006 (0.0004) | -0.010 (0.002)*** | -0.0003 (0.006) |
| Dummy flag missing | 0.847 (0.054)*** | 0.136 (0.060)** | 0.827 (0.053)*** | 0.122 (0.061)** | 5.196 (0.504)*** | 1.633 (1.039) |
| Constant | -0.633 (0.111)*** | | -0.819 (0.114)*** | | -4.305 (0.912)*** | |
| Time dummies χ²(11) | 40.7*** | 25.74*** | 41.704*** | 24.954*** | 30.269*** | 17.273* |
| Unobserved effects χ²(36) | | | | | | 451.32*** |
| LL | | | | | -16,958.95 | -16,718.69 |

Source: GSOEP 1995–2006, own calculations. Different binary choice specifications. 57,419 observations from 8,847 individuals. Standard errors in parentheses: * significance at ten, ** at five, and *** at 1 percent. Year dummies are included in each procedure but not reported. a) Robust standard errors are provided; b) t-tests on the significance of the instrument in the 1st step regressions confirm that the rank condition for identification of the IV estimators is fulfilled; heteroskedasticity robust, regression based Hausman tests provide evidence for the endogeneity of the health variable on the 1 percent significance level. c) Standard errors are robust to serial correlation in the individual scores across t; d) unobserved effects are specified as a linear projection on the (within) means of the regressors.
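Footnote d describes the Mundlak probit as specifying the unobserved effects as a linear projection on the (within) means of the regressors. In practice this means augmenting each person-year row with that person's time averages of the time-varying regressors before estimating a pooled probit. The sketch below shows only this augmentation step on a toy panel; the function `add_within_means`, the column names, and the data are hypothetical and are not taken from the GSOEP or from the authors' code.

```python
from collections import defaultdict


def add_within_means(rows, id_key, regressors):
    """Return copies of rows with '<name>_mean' holding each individual's time average."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    # First pass: accumulate per-individual sums and observation counts.
    for row in rows:
        counts[row[id_key]] += 1
        for name in regressors:
            sums[row[id_key]][name] += row[name]
    # Second pass: attach the individual-specific means to every row.
    augmented = []
    for row in rows:
        new_row = dict(row)
        for name in regressors:
            new_row[f"{name}_mean"] = sums[row[id_key]][name] / counts[row[id_key]]
        augmented.append(new_row)
    return augmented


# Hypothetical unbalanced toy panel: person 1 is observed twice, person 2 once.
panel = [
    {"pid": 1, "year": 1995, "doctor_visits": 2.0},
    {"pid": 1, "year": 1996, "doctor_visits": 4.0},
    {"pid": 2, "year": 1995, "doctor_visits": 0.0},
]
augmented = add_within_means(panel, "pid", ["doctor_visits"])
for row in augmented:
    print(row)
```

A joint test that the coefficients on the added mean columns are zero is then one way to read the "Unobserved effects" statistic at the bottom of the table, although the exact test the authors ran is not spelled out here.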


Table A6
Participation Equation, Women, 1995–2006

| | OLS^a (1) | Within^a (2) | 2SLS^b (3) | FE-2SLS^ab (4) | Probit^c (5) | Mundlak Probit^cd (6) |
| --- | --- | --- | --- | --- | --- | --- |
| Age | -0.010 (0.006) | | -0.009 (0.006) | | 0.097 (0.041)** | |
| Age squared | 0.0007 (0.0002)*** | 0.0001 (0.0002) | 0.0007 (0.0002)*** | 0.0002 (0.0002) | -0.0009 (0.001) | -0.002 (0.001) |
| Age triple | -1.00e-05 (1.26e-06)*** | -4.98e-06 (1.67e-06)*** | -1.00e-05 (1.26e-06)*** | -5.33e-06 (1.69e-06)*** | -8.06e-06 (8.00e-06) | -6.01e-07 (7.87e-06) |
| Education | 0.026 (0.0008)*** | | 0.024 (0.0008)*** | | 0.103 (0.007)*** | |
| Dummy education | 0.002 (0.005) | 0.006 (0.005) | 0.005 (0.005) | 0.006 (0.005) | -0.015 (0.032) | 0.020 (0.025) |
| Foreigner | -0.094 (0.006)*** | | -0.093 (0.006)*** | | -0.289 (0.044)*** | |
| Log health satisfaction | 0.087 (0.004)*** | 0.017 (0.004)*** | 0.176 (0.012)*** | 0.156 (0.019)*** | | |
| Doctor visits | | | | | -0.021 (0.002)*** | -0.011 (0.001)*** |
| Log nonlabor income | -0.032 (0.0006)*** | -0.024 (0.0006)*** | -0.032 (0.0006)*** | -0.023 (0.0006)*** | -0.145 (0.006)*** | -0.112 (0.004)*** |
| State level variables | | | | | | |
| Log unemployment (federal state) | -0.088 (0.007)*** | -0.085 (0.018)*** | -0.088 (0.007)*** | -0.084 (0.018)*** | -0.322 (0.057)*** | -0.323 (0.092)*** |
| Log vacancies (federal state) | -0.048 (0.008)*** | -0.034 (0.008)*** | -0.051 (0.008)*** | -0.034 (0.008)*** | -0.173 (0.051)*** | -0.137 (0.038)*** |
| Log employed (federal state) | 0.120 (0.012)*** | 0.153 (0.023)*** | 0.123 (0.012)*** | 0.157 (0.023)*** | 0.439 (0.080)*** | 0.604 (0.117)*** |
| East-Germany | 0.048 (0.008)*** | 0.023 (0.030) | 0.051 (0.008)*** | 0.025 (0.030) | 0.125 (0.058)** | 0.064 (0.165) |
| Number of children | | | | | | |
| ≤2 years of age | -0.317 (0.010)*** | -0.203 (0.010)*** | -0.319 (0.010)*** | -0.204 (0.010)*** | -0.981 (0.044)*** | -0.641 (0.040)*** |
| 3–5 years of age | -0.221 (0.007)*** | -0.116 (0.007)*** | -0.221 (0.007)*** | -0.114 (0.007)*** | -0.677 (0.032)*** | -0.359 (0.029)*** |
| 6–16 years of age | -0.105 (0.004)*** | -0.030 (0.005)*** | -0.106 (0.004)*** | -0.030 (0.005)*** | -0.315 (0.023)*** | -0.100 (0.021)*** |
| Dummy no children | -0.050 (0.007)*** | -0.001 (0.008) | -0.050 (0.007)*** | -0.001 (0.008) | -0.051 (0.044) | 0.031 (0.034) |
| Partner or spouse variables | | | | | | |
| Single | 0.261 (0.075)*** | -0.038 (0.136) | 0.270 (0.077)*** | -0.038 (0.137) | 0.466 (0.437) | -0.101 (0.402) |
| Net wage partner/spouse | -0.00005 (2.79e-06)*** | -0.00003 (2.85e-06)*** | -0.00005 (2.85e-06)*** | -0.00003 (2.92e-06)*** | -0.0002 (0.00002)*** | -0.0001 (1.00e-05)*** |
| Age partner/spouse | 0.006 (0.002)** | 0.005 (0.004) | 0.007 (0.002)*** | 0.007 (0.004) | 0.009 (0.015) | 0.014 (0.017) |
| Age squared partner/spouse | -0.0001 (0.00002)*** | 8.58e-06 (0.00004) | -0.0001 (0.00003)*** | -9.99e-06 (0.00004) | -0.0002 (0.0002) | 0.00009 (0.0002) |
| Experience partner/spouse | 0.005 (0.001)*** | -0.003 (0.002) | 0.005 (0.001)*** | -0.003 (0.002) | 0.014 (0.009) | -0.012 (0.011) |
| Experience squared partner/spouse | -0.0001 (0.00002)*** | -0.00003 (0.00004) | -0.0001 (0.00002)*** | -0.00003 (0.00004) | -0.0003 (0.0002)* | -0.0001 (0.0002) |
| Education partner/spouse | 0.021 (0.008)*** | -0.024 (0.017) | 0.019 (0.008)** | -0.029 (0.017)* | 0.037 (0.054) | -0.071 (0.061) |
| Education squared partner/spouse | -0.0006 (0.0003)** | 0.0006 (0.0006) | -0.0006 (0.0003)** | 0.0009 (0.0006) | -0.001 (0.002) | 0.002 (0.002) |
| Dummy flag missing | 0.291 (0.076)*** | -0.015 (0.137) | 0.298 (0.077)*** | -0.016 (0.138) | 0.519 (0.441) | -0.063 (0.404) |
| Constant | 0.010 (0.128) | | -0.253 (0.133)* | | -2.051 (0.794)*** | |
| Time dummies χ²(11) | 169.007*** | 25.96*** | 59.867*** | 59.595*** | 99.211*** | 43.765*** |
| Unobserved effects χ²(36) | | | | | | 691.46*** |
| LL | | | | | -28,309.25 | -27,964.73 |

Source: GSOEP 1995–2006, own calculations. Different binary choice specifications. 57,203 observations from 9,277 persons. Standard errors in parentheses: * significance at ten, ** at five, and *** at 1 percent. Year dummies are included in each procedure but not reported. a) Robust standard errors are provided; b) t-tests on the significance of the instrument in the first step regressions confirm that the rank condition for identification of the IV estimators is fulfilled; heteroskedasticity robust, regression based Hausman tests provide evidence for the endogeneity of the health variable on the 1 percent significance level. c) Standard errors are robust to serial correlation in the individual scores across t; d) unobserved effects are specified as a linear projection on the (within) means of the regressors.


Table A7
Summary, Participation Equation, Men, 1995–2006

| | Entire Sample | Probit = 0 | Probit = 1 |
| --- | --- | --- | --- |
| Probit | 0.860 (0.347) | 0 (0) | 1 (0) |
| Age | 41.481 (10.939) | 43.022 (13.452) | 41.231 (10.453) |
| Age squared | 1,840.336 (927.205) | 2,031.813 (1,132.517) | 1,809.240 (885.515) |
| Age triple | 86,389.600 (62,600.990) | 102,508.200 (76,750.590) | 83,771.970 (59,579.680) |
| Education | 12.192 (2.606) | 11.192 (2.158) | 12.354 (2.636) |
| Dummy education | 0.145 (0.353) | 0.162 (0.369) | 0.143 (0.350) |
| Foreigner | 0.128 (0.334) | 0.189 (0.392) | 0.118 (0.323) |
| Doctor visits | 1.892 (3.694) | 2.732 (5.115) | 1.755 (3.388) |
| Log health satisfaction | 2.567 (0.411) | 2.400 (0.581) | 2.594 (0.370) |
| Log nonlabor income | 5.713 (2.881) | 7.712 (1.140) | 5.388 (2.946) |
| State level variables | | | |
| Log unemployment (federal state) | 12.797 (0.550) | 12.755 (0.547) | 12.804 (0.550) |
| Log vacancies (federal state) | 10.459 (0.822) | 10.258 (0.865) | 10.491 (0.810) |
| Log employed (federal state) | 14.689 (0.740) | 14.512 (0.784) | 14.718 (0.729) |
| Dummy East-Germany | 0.257 (0.437) | 0.386 (0.487) | 0.237 (0.425) |
| Number of children | | | |
| Up to two years old | 0.081 (0.287) | 0.052 (0.233) | 0.086 (0.295) |
| Between 3–5 | 0.117 (0.350) | 0.073 (0.287) | 0.124 (0.359) |
| Between 6–16 | 0.475 (0.813) | 0.338 (0.738) | 0.497 (0.822) |
| Dummy no children | 0.601 (0.490) | 0.737 (0.440) | 0.579 (0.494) |
| Partner or spouse variables^a | | | |
| Single | 0.225 (0.417) | 0.334 (0.472) | 0.207 (0.405) |
| Net wage partner/spouse | 587.805 (638.363) | 499.016 (653.367) | 600.032 (635.307) |
| Age partner/spouse | 40.912 (10.068) | 44.535 (11.591) | 40.413 (9.735) |
| Age squared partner/spouse | 1,775.179 (850.881) | 2,117.684 (1,007.411) | 1,728.013 (815.836) |
| Experience partner/spouse | 10.565 (9.156) | 13.017 (11.176) | 10.227 (8.788) |
| Experience squared partner/spouse | 195.440 (298.963) | 294.326 (393.187) | 181.823 (280.838) |
| Education partner/spouse | 11.902 (2.418) | 11.181 (2.330) | 12.001 (2.413) |
| Education squared partner/spouse | 147.499 (63.827) | 130.442 (57.959) | 149.848 (64.240) |
| Dummy flag missing | 0.025 (0.156) | 0.015 (0.123) | 0.026 (0.160) |
| N | 57,419 | 8,022 | 49,397 |

Source: GSOEP 1995–2006, own calculations. All summary statistics are on individual-year level. Standard errors are in parentheses. a) The reported sample statistics for these variables are conditional on nonmissing data (Dummy flag missing = 0) and having a partner/being married (Single = 0).


Table A8
Summary, Participation Equation, Women, 1995–2006

| | Entire Sample | Probit = 0 | Probit = 1 |
| --- | --- | --- | --- |
| Probit | 0.688 (0.463) | 0 (0) | 1 (0) |
| Age | 41.714 (11.026) | 43.930 (11.973) | 40.708 (10.413) |
| Age squared | 1,861.637 (933.362) | 2,073.182 (1,049.170) | 1,765.550 (858.688) |
| Age triple | 87,851.120 (63,049.890) | 103,525.300 (73,259.050) | 80,731.660 (56,400.720) |
| Education | 11.926 (2.467) | 11.099 (2.209) | 12.302 (2.486) |
| Dummy education | 0.131 (0.337) | 0.136 (0.342) | 0.129 (0.335) |
| Foreigner | 0.124 (0.329) | 0.195 (0.396) | 0.091 (0.288) |
| Doctor visits | 2.575 (4.013) | 3.013 (4.844) | 2.375 (3.554) |
| Log health satisfaction | 2.549 (0.425) | 2.482 (0.499) | 2.579 (0.384) |
| Log nonlabor income | 5.870 (2.842) | 6.997 (1.923) | 5.358 (3.038) |
| State level variables | | | |
| Log unemployment (federal state) | 12.801 (0.558) | 12.841 (0.569) | 12.783 (0.553) |
| Log vacancies (federal state) | 10.459 (0.823) | 10.541 (0.792) | 10.422 (0.835) |
| Log employed (federal state) | 14.692 (0.742) | 14.766 (0.724) | 14.658 (0.748) |
| Dummy East Germany | 0.252 (0.434) | 0.202 (0.402) | 0.275 (0.446) |
| Number of children | | | |
| Up to two years old | 0.039 (0.200) | 0.082 (0.286) | 0.020 (0.140) |
| Between 3–5 | 0.100 (0.326) | 0.175 (0.423) | 0.066 (0.264) |
| Between 6–16 | 0.502 (0.818) | 0.648 (0.952) | 0.436 (0.740) |
| Dummy no children | 0.607 (0.488) | 0.517 (0.500) | 0.648 (0.478) |
| Partner or spouse variables^a | | | |
| Single | 0.212 (0.409) | 0.145 (0.352) | 0.242 (0.428) |
| Net wage partner/spouse | 1,450.245 (1,117.829) | 1,409.635 (1,234.772) | 1,471.591 (1,050.547) |
| Age partner/spouse | 45.789 (11.164) | 47.774 (12.125) | 44.745 (10.474) |
| Age squared partner/spouse | 2,221.217 (1,046.820) | 2,429.394 (1,169.002) | 2,111.787 (958.543) |
| Experience partner/spouse | 22.569 (11.186) | 24.399 (11.769) | 21.607 (10.743) |
| Experience squared partner/spouse | 634.506 (525.858) | 733.833 (582.168) | 582.293 (485.620) |
| Education partner/spouse | 12.171 (2.622) | 11.776 (2.560) | 12.379 (2.630) |
| Education squared partner/spouse | 155.015 (71.257) | 145.226 (68.108) | 160.161 (72.329) |
| Dummy flag missing | 0.041 (0.199) | 0.032 (0.175) | 0.046 (0.209) |
| N | 57,203 | 17,867 | 39,336 |

Source: GSOEP 1995–2006, own calculations. All summary statistics are on individual-year level. Standard errors are in parentheses. a. The reported sample statistics for these variables are conditional on nonmissing data (Dummy flag missing = 0) and having a partner/being married (Single = 0).
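The participation shares in Tables A7 and A8 (0.860 for men, 0.688 for women) give a rough feel for why the selection correction matters more in the female sample. The sketch below evaluates the inverse Mills ratio, λ(z) = φ(z)/Φ(z), at the probit index implied by each share. This is a deliberate simplification and an assumption on our part: the authors compute mean IMRs over time and person from the estimated probit models, not from the raw shares, and the function names are illustrative.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def inverse_mills_ratio(z: float) -> float:
    """lambda(z) = phi(z) / Phi(z) for a probit index z."""
    return STD_NORMAL.pdf(z) / STD_NORMAL.cdf(z)


def imr_at_participation_rate(p: float) -> float:
    """IMR at the index z solving Phi(z) = p (illustrative shortcut)."""
    return inverse_mills_ratio(STD_NORMAL.inv_cdf(p))


imr_men = imr_at_participation_rate(0.860)    # share with Probit = 1, Table A7
imr_women = imr_at_participation_rate(0.688)  # share with Probit = 1, Table A8
print(f"men: {imr_men:.3f}, women: {imr_women:.3f}")
```

The ratio is roughly twice as large at the female participation share, which is consistent with the stronger wage decline reported for women once nonrandom selection is accounted for.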


Table A9
Summary, Wage Equation, Men, 1995–2006

| | Mean | Standard deviation | 10th percentile | 90th percentile |
| --- | --- | --- | --- | --- |
| Log hourly wage | 2.571 | 0.427 | 2.051 | 3.093 |
| Log health satisfaction | 2.596 | 0.366 | 2.095 | 2.893 |
| Age | 41.174 | 10.350 | 28.0 | 56.0 |
| Age squared | 1,802.446 | 875.795 | 784.0 | 3,136.0 |
| Age triple | 83,201.150 | 58,815.110 | 21,952.0 | 175,616.0 |
| Unemployment experience | 0.403 | 1.099 | 0 | 1.200 |
| Unemployment experience squared | 1.371 | 9.031 | 0 | 1.440 |
| Firm tenure | 11.392 | 10.118 | 1.100 | 27.200 |
| Firm tenure squared | 232.165 | 351.910 | 1.210 | 739.840 |
| Education | 12.371 | 2.634 | 10.5 | 18.0 |
| Dummy education | 0.143 | 0.350 | 0 | 1 |
| Parttime | 0.022 | 0.147 | 0 | 0 |
| Foreigner | 0.117 | 0.321 | 0 | 1 |
| State level variables | | | | |
| Log unemployment (federal state) | 12.805 | 0.550 | 12.160 | 13.660 |
| Log vacancies (federal state) | 10.494 | 0.809 | 9.192 | 11.428 |
| Log employed (federal state) | 14.720 | 0.728 | 13.586 | 15.563 |
| Dummy East Germany | 0.221 | 0.415 | 0 | 1 |
| Number of children | | | | |
| Up to two years old | 0.087 | 0.296 | 0 | 0 |
| Between 3–5 | 0.125 | 0.361 | 0 | 1 |
| Between 6–16 | 0.501 | 0.825 | 0 | 2 |
| Dummy no children | 0.575 | 0.494 | 0 | 1 |
| Firm size (<20 employees)^a | | | | |
| 20–199 | 0.301 | 0.459 | 0 | 1 |
| 200–1999 | 0.235 | 0.424 | 0 | 1 |
| ≥2,000 | 0.256 | 0.437 | 0 | 1 |
| Firm size missing | 0.023 | 0.149 | 0 | 0 |
| Occupation dummies (high service) | | | | |
| Low service | 0.185 | 0.388 | 0 | 1 |
| Routine nonmanual | 0.040 | 0.196 | 0 | 0 |
| Skilled manual | 0.305 | 0.460 | 0 | 1 |
| Semi-unskilled manual | 0.212 | 0.408 | 0 | 1 |
| Farm labor | 0.012 | 0.109 | 0 | 0 |
| Missing occupation | 0.090 | 0.287 | 0 | 0 |
| Sector dummies (agriculture, forestry, fishing) | | | | |
| Unknown sector | 0.029 | 0.169 | 0 | 0 |
| Energy, water, mining | 0.016 | 0.124 | 0 | 0 |
| Manufacturing | 0.363 | 0.481 | 0 | 1 |
| Construction | 0.108 | 0.311 | 0 | 1 |
| Trade | 0.084 | 0.278 | 0 | 0 |
| Transport, communication | 0.042 | 0.200 | 0 | 0 |
| Financial services, insurance | 0.024 | 0.153 | 0 | 0 |
| Other services | 0.090 | 0.287 | 0 | 0 |
| State | 0.230 | 0.421 | 0 | 1 |
| Exclusion restrictions/instruments | | | | |
| Doctor visits (last 3 months) | 1.745 | 3.328 | 0 | 4.0 |
| Log nonlabor income | 5.382 | 2.939 | 0 | 8.114 |
| Single | 0.205 | 0.404 | 0 | 1 |
| Flag missing | 0.026 | 0.161 | 0 | 0 |
| Net wage partner/spouse^b | 602.324 | 633.724 | 0 | 1,450.677 |
| Age partner/spouse | 40.308 | 9.656 | 28.0 | 54.0 |
| Age squared partner/spouse | 1,717.997 | 806.792 | 784.0 | 2,916.0 |
| Experience partner/spouse | 10.179 | 8.735 | 0.800 | 23.700 |
| Experience squared partner/spouse | 179.920 | 277.926 | 0.640 | 561.690 |
| Education partner/spouse | 12.017 | 2.415 | 9.0 | 16.0 |
| Education squared partner/spouse | 150.249 | 64.354 | 81.0 | 256.0 |

Source: GSOEP 1995–2006, own calculations. All summary statistics are on individual-year level (47,746 observations). Individuals with participation in only 1 year and individuals with missing wages are dropped from the sample. a. For dummy variables, the base categories are given in parentheses; b. the reported sample statistics for these variables are conditional on nonmissing data (Dummy flag missing = 0) and having a partner/being married (Single = 0).


Table A10
Summary, Wage Equation, Women, 1995–2006

                                            Mean    Standard        10th        90th
                                                   deviation  percentile  percentile
Log hourly wage                            2.350       0.432       1.807       2.845
Log health satisfaction                    2.580       0.381       2.095       2.893
Age                                       40.661      10.321        26.0        55.0
Age squared                            1,759.862     849.240       676.0     3,025.0
Age triple                            80,244.280  55,620.660    17,576.0   166,375.0
Unemployment experience                    0.488       1.209           0       1.500
Unemployment experience squared            1.699      12.051           0       2.250
Firm tenure                                9.340       8.688       0.900      22.900
Firm tenure squared                      162.715     271.736       0.810     524.410
Education                                 12.325       2.481        10.0        16.0
Dummy education                            0.129       0.335           0           1
Parttime                                   0.382       0.486           0           1
Foreigner                                  0.090       0.286           0           0
State level variables
  Log unemployment (federal state)        12.785       0.552      12.150      13.630
  Log vacancies (federal state)           10.422       0.835       9.118      11.428
  Log employed (federal state)            14.658       0.748      13.560      15.562
  Dummy East Germany                       0.268       0.443           0           1
Number of children
  Up to two years old                      0.019       0.139           0           0
  Between 3–5                              0.065       0.262           0           0
  Between 6–16                             0.434       0.736           0           2
  Dummy no children                        0.648       0.478           0           1
Firm size (<20 employees)^a
  20–199                                   0.295       0.456           0           1
  200–1999                                 0.219       0.414           0           1
  ≥2,000                                   0.193       0.395           0           1
  Firm size missing                        0.026       0.160           0           0
Occupation dummies (high service)
  Low service                              0.256       0.437           0           1
  Routine nonmanual                        0.197       0.398           0           1
  Skilled manual                           0.068       0.251           0           0
  Semi-unskilled manual                    0.173       0.378           0           1
  Farm labor                               0.009       0.092           0           0
  Missing occupation                       0.229       0.421           0           1
Sector dummies (agriculture, forestry, fishing)
  Unknown sector                           0.032       0.176           0           0
  Energy, water, mining                    0.004       0.060           0           0
  Manufacturing                            0.167       0.373           0           1
  Construction                             0.016       0.125           0           0
  Trade                                    0.154       0.361           0           1
  Transport, communication                 0.023       0.150           0           0
  Financial services, insurance            0.031       0.172           0           0
  Other services                           0.204       0.403           0           1
  State                                    0.364       0.481           0           1
Exclusion restrictions/instruments
  Doctor visits (last 3 months)            2.365       3.518           0         5.0
  Log nonlabor income                      5.346       3.039           0       8.175
  Single                                   0.242       0.428           0           1
  Flag missing                             0.046       0.208           0           0
  Net wage partner/spouse^b            1,476.293   1,048.302           0   2,644.976
  Age partner/spouse                      44.698      10.410        31.0        59.0
  Age squared partner/spouse           2,106.237     951.393         961       3,481
  Experience partner/spouse               21.570      10.701         7.0        36.0
  Experience squared partner/spouse      579.760     482.760        49.0     1,296.0
  Education partner/spouse                12.395       2.632       10.50        18.0
  Education squared partner/spouse       160.553      72.444     110.250       324.0

Source: GSOEP 1995–2006, own calculations. All summary statistics are on individual-year level (37,670 observations). Individuals with participation in only one year and individuals with missing wages are dropped from the sample. a. For dummy variables, the base categories are given in parentheses; b. the reported sample statistics for these variables are conditional on nonmissing data (Dummy flag missing = 0) and having a partner/being married (Single = 0).

