Using matched substitutes to adjust for nonignorable
nonresponse: an empirical investigation using labour market
data
Richard Dorsett
Policy Studies Institute
23 June 2004
Abstract
This paper assesses the potential for reducing attrition bias by replacing survey dropouts with individuals from a refreshment sample, identified using propensity score matching. By linking administrative records with survey data, it is possible to observe outcomes for dropouts and therefore to test models of attrition. Doing so reveals the dropout process to be nonignorable such that the commonly-used method of reweighting non-dropouts is ineffective in overcoming attrition bias. By constructing artificial refreshment samples, the potential for replacing dropouts with individuals from a targeted refreshment sample is demonstrated. Where the targeting of the refreshment sample is imperfect, the success of the approach is more qualified. However, controlling for historic outcomes, the attrition bias can still be reduced. Keywords: attrition, propensity score matching, refreshment sample, unemployment. Address for correspondence: Richard Dorsett, Policy Studies Institute, 100 Park Village East, London NW1 3SR, UK. Tel: 020 7468 2316; Fax: 020 7388 0914; Email: [email protected]
1
Using matched substitutes to adjust for nonignorable nonresponse: an empirical
investigation using labour market data
1. Introduction
Increasingly, research in economics, sociology and other disciplines makes use of panel
data. The appeal of such data (in which the same units are observed at multiple points in
time) is that they permit the consideration of more complex relationships than is possible
using cross-section data. There are now well-established panel data sets in the UK
(British Household Panel Survey and Labour Force Survey, for example) and new panels
are being developed (the ESRC-funded Millennium Birth Cohort Study, for example).
The availability of such rich data has prompted developments in statistical and
econometric theory, and methods of examining and modelling micro-relationships that
allow for dynamics have grown in sophistication.
However, the drawback to panel data is that they are often subject to more severe missing
data problems than are cross-section data. Specifically, individuals who participate in the
first wave of a panel may drop out in later waves. This may mean that those who
continue to respond become unrepresentative of the original sample. Hence, problems of
inference can arise, particularly if the tendency to drop out is systematically related to an
outcome of interest. In view of this, methods to overcome the problem of attrition bias
have a potentially important empirical role.
2
Dolton (2002) introduced a new method of addressing the problem of attrition by
replacing survey dropouts with substitutes drawn from a refreshment sample and selected
via propensity score matching. However, in his empirical application, dropouts’
outcomes are never observed so it is not possible to know whether the substitution is
successful in overcoming attrition bias. In this paper, administrative data are linked to
longitudinal survey data collected for the purpose of evaluating a UK active labour
market programme (the New Deal for Young People -NDYP), thereby allowing the
unemployment status for all wave 1 respondents to be observed on an ongoing basis,
regardless of whether they subsequently dropped out. By comparing the level of
unemployment for non-dropouts to that for wave 1 respondents as a whole, the extent to
which attrition biases estimates of unemployment (the substantive focus of this paper)
can be quantified. Analogous comparisons provide a means of assessing the extent to
which replacing dropouts with substitutes from a refreshment sample can overcome
attrition bias.
The empirical part of the paper begins by formally testing the model generating attrition.
This in itself is an unusual step since most analyses are based on an untestable
assumption as to which attrition model is appropriate. However, when survey data can be
linked to administrative data in the way described above, such a test can provide a useful
guide as to how best to proceed to overcome attrition bias. Second, the success of
approaches in the spirit of Dolton (2002) is considered. A complication with the data at
hand is that no refreshment samples are available. To overcome this, artificial
refreshment samples are constructed. Summarising the results, when a suitable
3
refreshment sample is available, the approach works well. When the refreshment sample
is not ideal, the success of the approach is more qualified. However, the desired result
can be achieved by controlling for administratively-held historic outcomes in an intuitive
way.
The structure of the paper is as follows. In the next section, the two dominant models of
attrition are discussed along with a recently advanced test that is possible when a
refreshment sample is available or when, as in this case, outcome data are available in
administrative records for the full sample. In Section 3, the use of refreshment sampling
with matching (RSM) is described. The structure of the data is presented in Section 4
and the underlying model of attrition investigated. Section 5 describes the approach that
was taken to construct the artificial refreshment samples. The effectiveness of RSM in
overcoming attrition bias is assessed in Section 6. Section 7 discusses the implications of
the results.
2. Background – models of attrition
There has long been an awareness of the problems caused by sample attrition and
numerous methods of addressing the problem exist. The choice of appropriate method is
dependent on the process generating the missing values. Where observations are missing
completely at random (MCAR), no bias results from basing subsequent analysis on only
non-attriting respondents. However, the MCAR model will rarely apply in practice. In
4
fact, it can be tested empirically since it is a special case of the missing at random (MAR)
model. The MAR model implies that the missing data process is ignorable (Rubin, 1976;
Little and Rubin, 1987) and is sometimes referred to as selection on observables
(Fitzgerald et al., 1998). If MAR holds, multiple imputation methods (see King, 2001 for
a recent example) can be used to allow unbiased inferences (Little and Rubin, 1989).
Alternatively, weights can be used to restore the sample to representativeness. This is the
more common approach among econometricians.
While MAR may be a valid model in some instances, it will often be the case that the
outcome variable of interest is not independent of the attrition process. In such a
scenario, the process generating attrition is nonignorable. In a labour market context, this
can be exemplified by those changing job being more likely to drop out of the survey
(due to lack of time, moving house or some other reason). Hausman and Wise (1979)
(HW) developed a two-step procedure for the nonignorable case which is in the spirit of
Heckman’s selection model (Heckman, 1979). This involves estimating the probability
of dropping-out and using the results to construct an adjustment term that can be included
as a regressor in the outcome equation to correct for the selected nature of the resulting
sample. This requires an assumption about the form of the joint distribution of the errors
in the selection and outcome equations. Credible implementations also require a suitable
instrument (a variable that influences attrition but not outcomes). This can be a major
obstacle in practice.
5
In those empirical analyses that acknowledge the problem of attrition, the approach used
to tackle it is typically based on an untestable assumption about the process generating
the missing values. Such assumptions may be explicit or implicit but they exist
nonetheless. In some applications, a refreshment sample may be available. This is a
random sample from the original population that can be used to replace individuals who
have dropped out of the panel (Ridder, 1992). For the case where a refreshment sample
exists, Hirano, Imbens, Ridder and Rubin (2001) (HIRR) develop a model of attrition that
nests both the MAR and HW models and allows their assumptions to be tested.
Adopting the HIRR notation, the panel data cover two periods and are characterised by
attrition. Let YBitB be a vector of all time-varying variables for individual i at time t and let
XBi B be a vector of all fixed variables for individual i. In the first time period, a random
sample of size NBPB from a fixed population is drawn. This is the panel. For each
individual in the panel, XBi B and YBi1 B are observed. A subset of size NBBPB of this sample does
not drop out of the second stage survey. This is the balanced panel (BP) and for these
individuals YBi2 B is observed. The remaining NBIPB = NBPB – NBBPB comprise the incomplete panel
(IP). Those in the IP drop out of the survey in the second stage and YBi2B is therefore not
observed. In addition to the panel data set, in the second period a new random sample
from the original population is drawn. This is the refreshment sample (size NBRB).
The issue of initial nonresponse is not considered in this paper. The focus is on those
individuals who are interviewed in the first time period. Some will not respond when
approached a second time. Let WBi B be an indicator of willingness to respond at the second
6
wave; WBi B = 1 represents those who respond at the second wave and WBi B = 0 represents
those who do not. HIRR show that knowing the joint distribution of (YB1B, YB2 B, X) or
possibly the conditional distribution of (YB1 B, YB2 B) given X allows the MAR and HW models
of attrition to be tested. Using a refreshment sample, these distributions can be identified.
With this notation, the MAR model implies
W C YB2 B | YB1 B, X
where C denotes independence. In this scenario, first period variables can be used to
explain drop-out and the attrition problem in the sample can be overcome by weighting
the balanced panel observations by the inverse of the probability of attriting or by
multiple imputation.
Under HW,
W C YB1 B | YB2 B, X
which implies that the probability of attrition depends on contemporaneous variables but
not on first period variables. This is known as selection on unobservables since attrition
depends partly on variables that are not observed when the individual drops out.
7
HIRR show that the restrictions implied by the MAR and HW models allow the resulting
marginal distribution of the second period outcome to be tested against that of the
refreshment sample. Denoting the conditional probability Pr(YBi2 B = 1 | YBi1 B = y, WBi B = w) by
qByw B and the probability Pr(Y Bi1 B = y, W Bi B = w) by r Byw B , the marginal distribution of YBi2 B can be
calculated as Pr(Y Bi2 B = 1) = q B00B . r B00B + qB01B . r B01B + qB10B . r B10B + qB11B . r B11B. The terms qB00 B and
q B10B cannot be observed but can be retrieved for the MAR and HW models due to the
linear restrictions these models imply. Specifically, MAR implies
qB00B = qB01 B
and
qB10B = qB11.B
HW implies
qB00B =
{r B10B . r B01B . (1 - qB01B) – r B11 B . r B00B . (1 - qB11 B)} / {rB00 B . rB11 B . qB11B . (1 – qB01 B) / qB01 B – r B11B . r B00B . (1 – qB11 B)}
and
qB10B = {qB00 B . rB00B . qB11 B . rB11 B} / {qB01B . r B01B . r B10B} .
8
The approach taken in this paper when examining the form of attrition is similar to that of
HIRR. However, instead of estimating the marginal distribution of YBi2 Bfrom a
refreshment sample, the linked administrative data allow it to be observed for all wave 1
respondents.
3. Refreshment sampling with matching
As noted, knowledge of the model generating survey dropout is useful since it provides
guidance on how best to address the attrition problem. Under MAR, the consensus
among statisticians is that multiple imputation methods are preferable. A more common
approach among econometricians is to simply reweight the balanced panel. One
drawback of this approach is that no account is taken of the additional uncertainty in
subsequent inference introduced by the fact that the weights are themselves estimates.
Under HW, assuming a suitable instrument can be found, the approach described in
Section 2 must be altered for each specific application. This can be cumbersome in
practice. Furthermore, estimating more complicated models under HW can be
prohibitively difficult.
In view of these complications, other approaches are of great potential interest. One
solution is simply to replace dropouts with survey substitutes from a refreshment sample.
Rubin and Zanutto (2001) provide a summary of the use of such refreshment samples. A
general problem with this approach is that since such substitutes are, by their very nature,
9
respondents, they may differ from those dropouts they replace. This can be addressed to
some extent by targeting the refreshment sample. That is, the refreshment sample can be
drawn disproportionately from those who have low response rates or are likely to drop
out. Furthermore, steps can be taken to help ensure that the substitutes are similar to the
dropouts with regard to a number of characteristics.
Dolton (2002) used single nearest-neighbour propensity score matching to identify
suitable replacements from a targeted refreshment sample. The theoretical foundations of
propensity score matching were first presented in Rosenbaum and Rubin (1983). It has
become popular mainly as an evaluation technique (see, for example, Heckman et al.,
1999). In that context, its use is to estimate the effects of some treatment. In order to
estimate the treatment effect on a particular outcome, an estimate of the outcome that
would have resulted in the absence of treatment is required. This counterfactual outcome
is unobservable. Furthermore, it cannot credibly be inferred directly from the outcomes
of the untreated since they are likely to differ substantially in their characteristics.
Propensity score matching offers a means of estimating the counterfactual.
In order to provide a context for RSM, it is useful to consider the details of the matching
approach. Formally, the impact of treatment on outcome Y is
(1) ∆ = YP
1P – YP
0P
10
where YP
1P is the potential outcome conditional on treatment and YP
0P is the potential
outcome conditional on non-treatment. The mean effect of treatment on the treated is:
(2) θ = E(∆ | S = 1) = E(YP
1P – YP
0P | S = 1) = E(YP
1P | S = 1) - E(YP
0P | S = 1)
where S=1 denotes treatment and S=0 denotes non-treatment. The problem arises from
the term E(YP
0P | S = 1). This is the mean of the counterfactual which, since it is
unobservable, must be identified and estimated.
Matching estimators rely on the Conditional Independence Assumption (CIA):
(3) YP
0P C S | X = x, ∀ x∈ χ .
where X is a set of conditioning variables and χ denotes the part of the attribute space for
which the treatment effect is defined. Intuitively, the assumption states that, conditional
on observable characteristics, potential outcomes are independent of receiving treatment.
This allows non-treated outcomes to be used to infer counterfactual outcomes for the
treated. However, this is only valid if there are non-treated individuals for all values of X
for the treated (the support condition):
(4) Pr ( S = 1 | X = x ) < 1.
11
The parameter θ cannot be identified for values of X which do not satisfy this support
condition. Matching operates by constructing counterfactual outcomes using the
outcomes of the matched individuals from the non-treated group. The mean impact of
treatment can then be estimated as the mean difference in the outcomes of the matched
pairs.
Rosenbaum and Rubin showed that if assumption (3) holds then
(5) YP
0P C S | P(X) = p(x), ∀x∈ χ
where P(X)=Pr(S=1|X) is the propensity score, the probability of receiving treatment.
The important advantage of Rosenbaum and Rubin’s innovation is that the
dimensionality of the match can be reduced to one. Rather than matching on a vector of
characteristics, it is possible to match on just the propensity score. Having done so, the
mean effect is again estimated as the mean difference in the outcomes of the matched
pairs.
The key innovation of the RSM approach is to use matching as a means of replacing
survey dropouts. In this application, ‘treatment’ is dropping out of the survey. For each
dropout, a substitute from the refreshment sample is identified through matching as that
individual who is most similar with regard to a number of key characteristics. By so
doing, RSM may offer a means of restoring the representativeness of the sample and
thereby overcoming the attrition problem.
12
Operationally, the process of identifying the substitutes is as follows:
1. construct a refreshment sample (see Section 5)
2. pool the survey dropouts and the individuals from the refreshment sample into a
single dataset
3. estimate the probability of being a dropout rather than in the refreshment sample
(the propensity score) using a probit model.
4. initialise the matching weight of all members of the refreshment sample to 0.
5. identify the closest match for each dropout:
a) choose one dropout, i
b) find the individual, j, in the refreshment sample with the closest value of
the propensity score to that of i
c) increment the matching weight of j by 1
d) return to a) until all dropouts have been matched.
This procedure results in the identification of refreshment sample individuals who can be
used to replace dropouts, thereby allowing subsequent analysis to be based on a complete
dataset. It is clear from the algorithm that a single refreshment sample member may
substitute for multiple dropouts. This must be accounted for when carrying out further
analysis. The extent to which this process overcomes attrition bias can be assessed by
comparing the mean outcome of the dropouts with the (weighted) mean outcome of the
replacements. This is analogous to the estimation of θ in equation (2). The size and
13
significance of ^θ in this context provides an indication as to how successfully the survey
substitutes capture the outcomes of the dropouts they replace.
4. Data – characterising attrition in the NDYP surveys
NDYP is an active labour market programme introduced in Britain in 1998. It is
mandatory for all those aged 18-24 who have been claiming unemployment benefits for a
period of six months or more. It was evaluated using a survey design that involved two
stages of interviews with the same individuals. The sample was drawn from those
entering NDYP between 29 August and 27 November 1998. Survey interviews took
place at about 6 and 15 months after sampling and the results of these surveys are
reported in Bryson et al. (2000) and Bonjour et al. (2001). As noted in Section 1, survey
data were linked to administrative data allowing unemployment outcomes for the full
sample to be observed regardless of whether individuals responded to the survey.
The structure of the NDYP data is set out in Table 1. The entries for the balanced and
incomplete panels indicate what would be observable were the unemployment outcome
only observed for those who responded to the survey. Since the survey data are linked to
administrative records, the entries for the panel show the distribution of stage 2 outcomes
for the balanced and incomplete panel combined. This allows the true probability of
being unemployed at stage 2 to be observed.
14
<TABLE 1>
Knowing the true probability of stage 2 unemployment, the restrictions imposed by the
attrition models can be tested. The parameters that are directly estimable from Table 1
are given in Table 2.
<TABLE 2>
The estimates of qB00 B and qB10B implied by MAR and HW are given in Table 3. Using these
estimates, the probability of stage 2 unemployment can be estimated. Under the MAR
model, this is estimated at 0.496 while the HW estimate is 0.428. The true probability is
0.449. Bootstrapping the difference between the true and the model-based estimates
results in absolute t-statistics of 8.814 and 1.155 for MAR and HW respectively. From
this, it appears that the HW model better characterises the data and in fact the MAR
model is emphatically rejected.
<TABLE 3>
Table 4 shows how unemployment differs over time between the drop-outs and those
who remain in the sample. The first column shows the probability of being unemployed
around the time of the second stage interviews. The difference between the stayers and
the dropouts is statistically significant. The subsequent columns show how this
difference evolves. In fact, very soon after the point of sampling, significant differences
15
are evident. The size of these differences grows over time, reducing only at the most
recent observation period. The statistical significance of these differences is consistent
throughout. Those dropping out of the survey are less likely to be unemployed. This
accords with the situation of those entering employment being less likely to respond to
surveys due to lack of available time, moving to take up work, or some other reason. It
should be remembered that the client group for NDYP is young and therefore more likely
to be geographically mobile.
<TABLE 4>
This divergence between stayers and dropouts in the probability of unemployment is
shown in more detail in Figure 1 (where the vertical lines indicate approximate fieldwork
dates).
<FIGURE 1>
5. Creating refreshment samples
Since no refreshment samples were available with the NDYP data, such samples were
artificially constructed. The approach taken to achieve this is illustrated in Figure 2.
<FIGURE 2>
16
The starting point is the original sampling frame drawn from administrative records and
including all NDYP entrants, regardless of whether they responded to the first or second
interview. Of these, a proportion responded to the first wave of interviews. Since the
focus in this paper is on attrition rather than nonresponse, the main interest is in those
who responded at wave 1 (denoted R=1 in Figure 2). Two uniformly distributed random
numbers, ε B1B and εB2 B, were generated for each individual in the sample frame: ε B1B was used to
divide the wave 1 respondents into two equally-sized sub-samples. The basic idea is to
use the RSM technique to address the attrition in one of these sub-samples (referred to as
the ‘test’ sample in the remainder of this paper) using the other sample (the ‘non-test’
sample) to construct a refreshment sample.
Given the process by which these sub-samples were identified, they will exhibit similar
(within sampling variation) patterns of attrition to that among the wave 1 respondents as a
whole. Since there were approximately 6,000 wave 1 respondents, of whom nearly 2,600
dropped out, the test sample and the non-test sample both comprise roughly 3,000
individuals, of whom roughly 1,300 drop out. Hence, the potential of the RSM approach
can be explored by examining how well the artificially-constructed refreshment samples
can provide substitutes for the dropouts from the test sample in such a way that the
attrition bias is attenuated.
In fact, a number of different refreshment samples were constructed. The targeted
refreshment sample (TRS) is constructed as all dropouts from the non-test sample. This
17
will be roughly equal in size to the number of dropouts in the test sample. As noted
above, the aim of a TRS is that it over-represents those likely to drop out. In the artificial
case considered in this paper, the targeting is absolute in the sense that the refreshment
sample includes only those actually who drop out. Hence, it allows the RSM approach to
be investigated in the ideal case where the targeting of the refreshment sample is perfect.
The second refreshment sample considered is the non-targeted refreshment sample
(NTRS). This is constructed as a random sample of the non-test sample of size equal to
the TRS. In practice, this was achieved by selecting those individuals from the non-test
sample with a value of εB2 B smaller than the overall proportion of wave 1 respondents who
drop out. The interest of the NTRS is that it provides a means of comparing the
effectiveness of RSM when a targeted refreshment sample is available with its
effectiveness when the refreshment sample is not targeted or is not perfectly targeted.
Since targeting a refreshment sample may, in practice, be difficult, it is of interest to
investigate whether RSM works when used with a refreshment sample that is not
targeted.
The third refreshment sample is the ‘practical’ refreshment sample (PRS). The purpose
of this refreshment sample is to approach a more realistic situation whereby the
refreshment sample is not perfectly targeted but is imperfectly targeted. In practice, the
targeting of the refreshment sample will be less precise than the TRS considered here and
will contain those individuals who are merely suspected of being more likely to drop out
of the sample. This requires some judgement on the part of the survey designer. The
18
purpose of the PRS is to investigate this more realistic scenario. In this case, the
sampling frame was randomly divided using εB1 B and the probability of responding at wave
1 was modelled (for those with εB1 B ≥ 0.5) jointly with the probability of responding at
wave 2 for the wave 1 respondents (that is, for those with εB1 B ≥ 0.5 and R=1). A sub-
sample (of equal size to the TRS) of those in the non-test sample for whom this
probability is the greatest was chosen as the PRS.
The remainder of the paper assesses the performance of RSM using these three
refreshment samples. In addition, using the balanced panel as a refreshment sample is
also considered. This is of interest since, in practice, BP exists without the need to carry
out an additional survey. Consequently, should RSM be successful when using BP as a
refreshment sample, this indicates a simple way to address the attrition problem.
However, given the rejection of MAR as the model of attrition, it is unlikely that
replacing dropouts with substitutes from BP will be successful. To see this, note that the
resulting dataset would simple amount to a reweighting of BP which is only valid if
MAR holds. More promising perhaps, is to combine BP with one of the other
refreshment samples, as in Dolton (2002). The effectiveness of this is considered below.
6. Assessing RSM
The algorithm set out in Section 3 was followed in order to generate an estimate of θ that
shows, for a given refreshment sample, the extent to which RSM succeeded in replacing
19
dropouts in such a way that attrition bias was overcome. However, a valid concern is that
the results may be specific to the particular test sample and refreshment samples
constructed. That is, since the definition of these samples contains a random element, it
may be that this random component is driving the results. To address this, the entire
process of constructing refreshment samples and estimating θ was bootstrapped (10,000
replications).
The key results in this section are based on the bootstrapped estimates. However, to
provide an insight into the mechanism of RSM, the next sub-section describes the process
of implementing RSM using just one version of each of the refreshment samples (i.e. not
bootstrapped). This is intended to allow an insight into the more formal (i.e.
bootstrapped) results that follow.
6.1Implementing RSM – descriptives and diagnostics
The TRS and the NTRS are straightforward to construct. The PRS is more involved
since it involves simultaneous estimation of wave 1 and wave 2 response. The aim is to
construct a refreshment sample that over-represents those predicted to drop out. It is
useful to consider how such a sample might be constructed in practice. The PRS is
drawn from the same population as the main sample. However, it only becomes relevant,
and therefore only has to be surveyed, after the second wave of main sample interviews.
20
At this stage, only those variables that exist in the administrative records are available.
These will often be fairly limited in scope.
The results of estimating a model of wave 1 and wave 2 response using administrative
data are given in Table 5. As an overall comment, the process of initial nonresponse is
characterised differently from that of attrition. Most notably, unemployment at roughly
the time of the wave 1 fieldwork is an important correlate of initial response, while
unemployment at the time when the second interviews were carried out is strongly
associated with wave 2 response. This accords with the earlier comment on attrition
being nonignorable in this dataset.
<TABLE 5>
Using these results, the probability of responding at wave 1 but not at wave 2 was
generated. The PRS was drawn from those wave 1 respondents who had the highest
probability of dropping out in the second wave. It was constructed to be equal in size to
the TRS.
Figure 3 depicts the change over time in unemployment for the three refreshment samples
and for that half of the sampling frame from which they were drawn. Clear differences
are evident. For the sampling frame, unemployment fell from a level of 78 per cent in
December 1998 to 38 per cent in June 2000. The level of unemployment in the NTRS is
similar to that of the sampling frame at the start of the period. The differences increase
21
around the time of the wave 1 fieldwork and then decline thereafter. The results of Table
5 have shown initial response to be positively associated with unemployment. Since all
those in the NTRS respond at wave 1, the expectation is that their level of unemployment
around the time of the wave 1 interview will be higher than among the sample frame
whose definition does not constrain them to respond at wave 1. A similar logic explains
the trend seen among the TRS. Wave 2 response is associated with remaining
unemployed. Since none of those in the TRS responds at wave 2, they are more likely to
have left unemployment.
The PRS displays a very different trend, although still consistent with expectations.
Since this refreshment sample comprises those individuals from the non-test sample who
are most likely to respond at wave 1 and not respond at wave 2, it will over-represent
those with the characteristics associated with such response behaviour. Since
unemployment status at the time of interview is positively associated with response, it
follows that the PRS will disproportionately represent those who were unemployed at the
time of the first survey but who had exited unemployment by the time of the second
survey. This gives rise to the trend seen in Figure 3.
<FIGURE 3>
Table 6 present some descriptive statistics for the samples. The first column of figures
describes the test sample as a whole. Ideally, the end result of the RSM process will be to
replace dropouts such that this profile is mirrored in the resulting refreshed sample. The
second and third columns describe respectively the characteristics of those who drop out
22
and those who remain in the test sample. The most notable differences between the two
groups are that those in the IB are more likely than those in the BP to be from a minority
ethnic group and less likely to own their accommodation. In the fourth column, the
profile for the NTRS is given. This should be identical (within sampling variation) to the
test sample and, indeed, no major differences from column 1 are evident. Similarly, the
fifth column which describes the characteristics of those in the TRS should resemble the
second column since it comprises all dropouts from the non-test sample. The final
column describes those in the PRS. They tend to be younger than those in the other
samples, are less often from a minority ethnic group and more often unpartnered. They
are less likely to have been employed and more likely to have been unemployed at the
time of wave 1 interviews.
<TABLE 6>
The results of estimating the propensity score are presented in Table 7. The first column
shows the results using a pooled dataset made up of the IP and the BP to estimate the
probability of being in the IP. The variables included in the specification were chosen as
those most likely to be related to dropping out. The next three columns present
analogous results when the model is estimated on the IP and, respectively, the NTRS, the
TRS and the PRS. The final three columns correspond to the preceding three but include
the BP as an additional potential source of refreshment substitutes. A significant positive
coefficient for a given characteristic indicates that that characteristic is more commonly
found among the IP than it is among those in the refreshment sample. As expected, there
23
are no significant coefficients when the model is estimated using the IP and TRS. The
most significant coefficients are evident when the model is estimated using the IP and the
PRS.
<TABLE 7>
These results were used to generate the propensity scores required for matching. Single
nearest neighbour matching was used since, intuitively, this relates more directly than
other matching methods to the desired aim of identifying an individual substitute for each
dropout. Table 8 summarises the matching weights for the identified substitute samples.
The results show that the balanced panel combined with the TRS performs best, with a
mean weight of 1.31. This combination also performs best when considering the decile
of comparators with the largest weights since they account for only 20 per cent of
dropouts. This can be interpreted as a measure of the concentration of matches in a small
number of individuals and indicates whether there may be an over-reliance on individual
comparators.
<TABLE 8>
In Table 9, a different type of diagnostic is presented that illustrates the extent to which
the matching has balanced the characteristics of the dropouts with those of the
refreshment sample. Consider the first pair of columns for which the BP is used as the
refreshment sample. In the first column, the (pre-matching) differences between
24
dropouts and those in the BP are given. For each variable listed, the cell entry represents
the mean difference between the dropouts and the BP expressed as a percentage of its
standard error. Thus, it provides, on a comparable metric, the degree of similarity
between the means of the two groups. Clearly, some characteristics are more similar than
others. For example, there is little difference in the proportion who have any educational
qualifications but there is a marked difference in the proportion belonging to a minority
ethnic group. Averaging across all variables yields the entry in the final row which
captures the overall level of balance between the two groups across all variables. The
second column provides analogous results to those in the first column, except that now
the dropouts are compared with the substitutes from the BP who were identified through
matching.
<TABLE 9>
Comparing the two columns provides an indication of the extent to which matching
improves the balance between the two groups; in the case of the BP, balance improves
from 5.01 to 2.29. This level of post-matching balance is comparable to that for both the
NTRS and the TRS (2.57 and 2.44, respectively). However, the balance is not as good
when using the PRS (5.85). For NTRS, TRS and PRS, balance is always improved by
pooling with the BP.
25
6.2 Can RSM work?
In the remainder of this section, the success of RSM in overcoming attrition bias is
considered. As noted in Section 3, this is done by comparing the mean outcome of the
dropouts with the (weighted) mean outcome of their replacements. The size and
significance of ^θ indicates how successfully the survey substitutes capture the outcomes
of the dropouts they replace. A significant difference suggests that the outcomes of the
dropouts have not been captured by the replacements.
Table 10 presents these results for a number of refreshment samples. The first row
relates to unemployment at around the time of the second wave of interviews.
Subsequent rows relate to unemployment at December 1998 and at quarterly intervals
thereafter, up until June 2000. This allows the evolution of differences to be observed.
For each unemployment outcome considered, ^θ is given, together with the associated t-
statistic and the mean squared error (multiplied by 100 for ease of interpretation). A
negative value of ^θ shows that unemployment is lower among the dropouts than it is
among those selected to replace them.
<TABLE 10>
The first column shows the results when using the BP as a refreshment sample. As
mentioned before, a reweighting of the BP is only appropriate if attrition is MAR so it is
not surprising that this approach is not successful in overcoming attrition bias. From
26
June 1999 onwards, the replacements are more likely than the dropouts to be
unemployed. Using the NTRS (column 2) is more successful but still ^θ is significant
from December 1999 onwards. The level of bias is noticeably smaller than when the BP
is used, however. By June 2000, mean unemployment was 7 percentage points higher
among replacements drawn from the NTRS than for dropouts. The analogous figure
when drawing replacements from the BP was 12 percentage points.
The results show that taking replacements from the TRS can overcome attrition bias
(column 3). At all points, ^θ is zero and the mean squared error is also smaller than with
any of the other refreshment samples. The more realistic case of the PRS is shown in
column 4. These results reflect the pattern seen in Figure 3. In the early quarters, ^θ is
negative but in September and December 1999 ^θ becomes positive, indicating that
unemployment is higher among the dropouts than among their replacements. For the last
two quarters, ^θ is small and insignificant, indicating that attrition has been overcome
from around the time of wave 2 interviews onwards.
The remaining three columns in Table 10 show the results when combining the BP with
each of the other refreshment samples. Drawing replacements from BP and NTRS
combined (column 5) is less successful than drawing replacements from NTRS alone.
Similarly, drawing replacements from BP and TRS combined (column 6) is less
successful than drawing replacements from NTRS alone. The results when combining
BP with PRS are more mixed. Column 7 shows that ^θ is smaller and less often
27
significant in the early quarters compared to when replacements are taken from PRS
alone. However, attrition bias is evident for the last two quarters when taking
replacements from BP and PRS combined.
The implication of these results is that RSM can overcome attrition bias but that it needs
a targeted refreshment sample. Using the BP as a source of possible replacements
reduces the effectiveness of RSM in this analysis. The PRS can overcome attrition bias
from the point of dropout onwards but will misrepresent unemployment at earlier periods.
In view of the relevance of the PRS, it is appropriate to investigate whether it is possible
to improve on its performance. The important role of wave 1 and wave 2 unemployment
in the construction of the PRS has already been noted. The RSM approach was repeated
adding wave 1 and wave 2 unemployment status to the list of variables included in the
estimation of the propensity score. The reason for doing this is that by including these
variables, the matching process seeks to balance them across the dropouts and the
refreshment samples. In other words, the substitutes should be similar to the dropouts in
terms of the outcome variables. The results are shown in Table 11.
<TABLE 11>
The first two columns shows the results when using the BP and the NTRS respectively as
refreshment samples. Relative to the earlier results for these refreshment samples, there
is some improvement in performance. However, attrition bias is still evident from
28
December 1999 onwards in the case of BP and from March 2000 onwards in the case of
NTRS. The results for the PRS (column 3) show ^θ to be insignificant throughout,
suggesting that attrition bias has been successfully overcome. This is due in part to the
higher standard errors implied in column 3, suggesting some imprecision will be
introduced to any subsequent analysis. The mean squared error is comparable to that for
the NTRS. The final two columns show the results when combining BP with NTRS and
PRS respectively. As before, this serves to worsen the performance of RSM.
7. Summary and discussion
In this paper, administrative data were linked to survey data in order to demonstrate the
extent to which attrition in a longitudinal survey can bias analysis of unemployment.
Those dropping out of the sample were shown to be more likely to have exited
unemployment than those who remained in the sample such that estimates of
unemployment based on the latter group would tend to be biased upwards. The most
common approach to dealing with such attrition bias is to re-weight the continuing
respondents such that the sample remains representative with regard to observable
characteristics. In practice, the MAR assumption that validates this approach is rarely
questioned. In this paper, a formal test of the attrition model strongly rejected the MAR
assumption, indicating that an alternative means of addressing the attrition bias was
required. Where it is possible to link survey and administrative records in the manner
29
described in this paper, it seems that performing such a test could routinely be used to
provide a valuable insight into the appropriate means of dealing with attrition.
The substantive results have shown that when a targeted refreshment sample is available
it is possible to use matching to replace dropouts in such a way that attrition bias is
rendered insignificant. The targeting is important: using a non-targeted refreshment
sample or replacing dropouts with individuals in the balanced panel resulted in an over-
representation of the unemployed. Furthermore, while it may be intuitively appealing to
boost the number of potential replacements by combining the targeted refreshment
sample with the balanced panel, the results suggest that doing so may be counter-
productive.
Although the approach appears to work when a targeted refreshment sample is available,
the targeting is likely to be imperfect in practice. An insight into the more realistic
scenario of a refreshment sample comprising those merely suspected of being more likely
to drop out was provided by the ‘practical’ refreshment sample. The results showed that
it was possible to use this refreshment sample to overcome attrition bias, albeit possibly
at the price of higher variance for any subsequent analysis.
While caution must be exercised when generalising these results, they suggest that the
RSM approach may be effective in practical applications. The targeting of the
refreshment sample is of central importance.
30
In the remainder of this paper, some consideration is given to the situation where it is not
possible to link administrative records to survey data in the manner described above. In
this case, estimating a model in order to predict dropout may not be possible so an
alternative means of identifying the refreshment sample is needed. One approach would
be to base the targeting of the refreshment sample on an understanding gained from
experience of characteristics associated with nonresponse or dropout. This is the
approach taken in Dolton (2002).
An alternative approach is to intensify efforts to achieve interviews with those who
dropped out and to regard the newly achieved interviews as the refreshment sample. In
effect, what this would amount to is boosting the proportion of observations accounted
for by the balanced panel and drawing the refreshment sample from those who dropped
out but were subsequently recovered. The justification for this is that the refreshment
sample thus defined would tend to have characteristics more similar to the dropouts as a
whole. Some support for this is suggested by Lynn et al. (2001) who show that hard-to-
get respondents in the UK Family Resources Survey (i.e. those needing six or more visits
to achieve an interview or those who are persuaded to co-operate after initially refusing)
are more likely to be in work than other respondents.
There are a number of possible means of increasing the chances of eventually achieving
interviews with dropouts. For example, if the data are collected by means of postal
questionnaire or telephone interview then there is the option of instead using face-to-face
interviews for hard-to-get cases since these tend to achieve a higher response rate.
31
Response rates can also be improved by tracing address changes more assiduously,
increasing the number of call-backs or varying the times at which interviewers call. In
some cases, financial incentives to participate may be appropriate.
Acknowledgements
The author gratefully acknowledges financial support from the ESRC (ref: R000223901).
Helpful comments on earlier versions of this paper were given by Peter Dolton, Jeff
Smith and Michael White as well as participants at seminars at the Policy Studies
Institute and the University of Westminster; the annual general meeting of the Statistical
Society of Canada in Halifax, Nova Scotia; the second workshop on Research
Methodology in Amsterdam; and the Royal Statistical Society one day conference on
attrition and non-response in social surveys in London. I am also grateful to the
Department for Work and Pensions for allowing me access to the data. The usual
disclaimer applies.
32
References
Bonjour, D., Dorsett, R., Knight, G., Lissenburgh, S., Mukherjee, A., Payne, J., Range,
M., Urwin, P., White, M. (2001) New Deal for Young People: national survey of
participants: stage 2. Employment Service Report 67, Sheffield.
Bryson, A., Knight, G. and White, M. (2000) New Deal for Young People: national
survey of participants: stage 1. Employment Service Report, 44, Sheffield.
Dolton, P. (2002) Reducing attrition bias using targeted refreshment sampling and
matched imputation. University of Newcastle mimeo.
Fitzgerald, J., Gottschalk, P. and Moffitt, R. (1998) An analysis of sample attrition in
panel data: the Michigan Panel Study of Income Dynamics. Journal of Human
Resources, 33, 251-299.
Hausman and Wise (1979) Attrition bias in experimental and panel data: the Gary
Income Maintenance Experiment. Econometrica, 47, 455-474.
Heckman, J. (1979) Sample selection bias as a specification error Econometrica 47: 153-
61.
Heckman, J., Lalonde, R. and Smith, J. (1999) The economics and econometrics of active
labour market programs. In Handbook of labor economics vol III (eds. Ashenfelter,
O. and Card, D.), pp. 1865-2097. Amsterdam: North Holland.
Hirano, K., Imbens, G., Ridder, G and Rubin, D. (2001) Combining panel data sets with
attrition and refreshment samples. Econometrica, 69, 1645-1659.
King, G. (2001) Analysing incomplete political science data: an alternative algorithm for
multiple imputation. American Political Science Review, 95, 49-69.
Little, R. and Rubin, D. (1987) Statistical analysis with missing data. New York: Wiley.
33
Little, R. and Rubin, D. (1989) The analysis of of social science data with missing values.
Sociological Methods and Research, 18, 292-326.
Lynn, P., Clarke, P, Martin, J. and Sturgis, P. (2001) The effects of extended interviewer
efforts on nonresponse bias. In Survey nonresponse (eds. Groves, R., Dillman, D.,
Eltinge, J. and Little, R.), pp. 135-147. New York: Wiley.
Ridder, G. (1992) An empirical evaluation of some models for non-random attrition in
panel data. Structural change and economic dynamics, 3, 337-355
Rosenbaum, P. and Rubin, D (1983) The central role of the propensity score in
observational studies for causal effects. Biometrika, 70, 41-50.
Rubin, D. (1976) Inference and missing data Biometrika, 63, 581-592.
Rubin, D. and Zanutto, E. (2001) Using matched substitutes to adjust for nonignorable
nonresponse through multiple imputations. . In Survey nonresponse (eds. Groves,
R., Dillman, D., Eltinge, J. and Little, R.), pp. 389-402. New York: Wiley.
Tables and figures Table 1: Summary statistics: YBitB indicates unemployment at time t, WBi B indicates willingness to respond to stage 2 survey. (Sub) sample YBi1B YBi2B WBiB NBalanced panel 0 0 1 792 0 1 1 279 1 0 1 859 1 1 1 1405 3335Incomplete panel 0 0 969 1 0 1606 2575Panel 0 3256 1 2654 5910 Table 2: Directly estimable parameters W=0 W=1 qBy1 B
YB1B=0 0.164 0.181 0.261 (0.005) (0.005) (0.013)YB1B=1 0.272 0.383 0.621 (0.006) (0.006) (0.010)Standard errors in parentheses Table 3: Model-based parameters MAR HW TRUEqB00B 0.261 0.145 (0.013) (0.024)qB10B 0.621 0.440 (0.010) (0.050)Pr(yB2B=1) 0.496 0.428 0.449 (0.008) (0.019) (0.006)Standard errors in parentheses Table 4: Probability of being unemployed by year. Claimant unemployed (%): N Stage 2 Week commencing: Interview 28/12/98 29/3/99 28/6/99 27/9/99 27/12/99 27/3/00 26/6/00Non-drop-outs 50.5 81.5 68.4 60.3 52.6 49.4 49.4 46.1 3358Drop-outs 37.7 78.9 64.8 54.1 44.5 38.1 34.9 33.9 2590%-point difference 12.8 2.6 3.6 6.3 8.1 11.3 14.4 12.2 P-value (%) 0.0 1.2 0.4 0.0 0.0 0.0 0.0 0.0
Table 5: Modelling probability of responding to wave 1 and to wave 2 interviews using administrative data Response at
wave 2 Response at wave 1
Age: 18-19 0.177 0.183 (2.06)* (3.17)** Age: 20-21 0.104 0.145 (1.24) (2.48)* Age: 22-23 0.208 0.017 (2.48)* (0.29) Female 0.061 0.091 (1.13) (2.29)* Partner 0.175 0.301 (1.75) (4.20)** Number of JSA claims since Jan 1995 -0.030 -0.031 (2.68)** (4.03)** Region: northern -0.005 -0.184 (0.04) (2.09)* Region: north west 0.002 -0.217 (0.02) (2.71)** Region: Yorkshire & Humberside -0.046 -0.022 (0.44) (0.26) Region: Wales -0.321 -0.132 (2.29)* (1.23) Region: west midlands -0.048 -0.146 (0.40) (1.58) Region: east midlands and eastern -0.079 -0.136 (0.71) (1.56) Region: south west -0.313 0.033 (1.72) (0.22) Region: London and south east -0.449 -0.507 (3.65)** (6.89)** Area type: rural, high unemployment 0.618 0.214 (5.02)** (2.54)* Area type: rural/urban, tight labour market 0.185 0.213 (1.91) (3.12)** Area type: rural/urban, high unemployment 0.334 0.193 (3.33)** (2.66)** Area type: rural/urban, tight labour market 0.091 -0.049 (1.08) (0.81) Area type: urban, high unemployment 0.184 0.072 (2.48)* (1.33) Preferred occupation: clerical and secretarial 0.096 (2.22)* rural area 0.301 (2.18)* Unemployed week beginning 15-Feb-1999 0.370 (9.72)** Unemployed week beginning 25-Oct-1999 0.214 (4.19)** Constant 0.165 -0.035 (0.84) (0.37) Observations 5517 5517 Absolute value of z-statistics in parentheses * significant at 5% level; ** significant at 1% level
Table 6: Characteristics of the sample and refreshment samples Test sample IB BP NTRS TRS PRSaged 18-19 0.24 0.22 0.25 0.25 0.22 0.35aged 20-21 0.33 0.33 0.34 0.33 0.33 0.37aged 22-23 0.25 0.26 0.25 0.24 0.26 0.13aged 24-25 0.17 0.18 0.17 0.18 0.19 0.18female 0.29 0.28 0.29 0.30 0.29 0.31ethnic minority 0.47 0.51 0.44 0.46 0.49 0.32has partner 0.13 0.12 0.14 0.15 0.15 0.07dependent children in household 0.09 0.08 0.10 0.10 0.10 0.10no educational qualifications 0.35 0.35 0.35 0.34 0.37 0.37no vocational qualifications 0.56 0.58 0.54 0.53 0.57 0.54drivers licence 0.23 0.24 0.22 0.24 0.24 0.22drivers licence and car 0.14 0.14 0.14 0.15 0.15 0.14long-term health problem 0.24 0.24 0.25 0.24 0.24 0.21housing: social housing 0.51 0.51 0.51 0.52 0.52 0.53housing: mortgage or owned 0.30 0.26 0.32 0.29 0.25 0.28employed at first interview 0.25 0.25 0.25 0.25 0.26 0.20unemployed at first interview 0.47 0.49 0.46 0.47 0.50 0.54number of jsa claims since jan 1995 4.10 4.15 4.07 4.12 4.13 4.10total days unemployed before ND 642.99 649.66 637.99 639.40 655.28 624.48duration of current claim at ND entry 291.09 288.34 293.15 288.66 283.84 288.08Local unemployment rate at ND entry 5.74 5.54 5.89 5.79 5.63 5.37N 2974 1274 1700 1300 1316 1300
Table 7: Results of estimating the propensity score using the incomplete panel and varying the refreshment sample BP NTRS TRS PRS BP &
NTRS BP & TRS
BP & PRS
aged 18-19 -0.109 -0.119 -0.007 -0.652 -0.113 -0.059 -0.348 (1.19) (1.22) (0.07) (6.63)** (1.42) (0.75) (4.41)** aged 20-21 -0.056 -0.030 -0.020 -0.382 -0.049 -0.034 -0.195 (0.71) (0.36) (0.25) (4.55)** (0.72) (0.51) (2.89)** aged 22-23 -0.037 0.033 -0.001 0.228 -0.011 -0.018 0.042 (0.50) (0.42) (0.02) (2.61)** (0.17) (0.28) (0.64) female -0.030 -0.067 -0.033 -0.112 -0.047 -0.029 -0.066 (0.55) (1.17) (0.58) (1.91) (1.00) (0.63) (1.41) ethnic minority 0.155 0.114 0.050 0.543 0.135 0.109 0.295 (3.20)** (2.21)* (0.98) (10.03)** (3.19)** (2.59)** (6.96)** has partner -0.165 -0.185 -0.164 0.533 -0.172 -0.161 0.082 (1.82) (1.95) (1.73) (4.83)** (2.18)* (2.03)* (1.01) dependent children -0.031 -0.008 0.014 -0.451 -0.018 -0.009 -0.204 (0.29) (0.07) (0.13) (4.01)** (0.20) (0.09) (2.26)* no educational qualifications -0.018 0.002 -0.030 0.003 -0.010 -0.023 -0.012 (0.34) (0.03) (0.54) (0.06) (0.21) (0.49) (0.26) no vocational qualifications 0.116 0.150 0.050 0.187 0.125 0.083 0.142 (2.34)* (2.83)** (0.94) (3.43)** (2.89)** (1.93) (3.27)** drivers licence 0.190 -0.021 0.065 -0.068 0.089 0.130 0.084 (2.16)* (0.24) (0.72) (0.72) (1.20) (1.73) (1.11) drivers licence and car -0.147 -0.006 -0.101 -0.039 -0.075 -0.114 -0.103 (1.43) (0.05) (0.96) (0.35) (0.85) (1.30) (1.15) long-term health problem -0.021 -0.016 -0.020 0.120 -0.020 -0.021 0.035 (0.37) (0.27) (0.33) (1.92) (0.40) (0.43) (0.72) housing: social housing -0.201 -0.124 0.014 -0.057 -0.162 -0.094 -0.134 (3.17)** (1.88) (0.22) (0.84) (2.98)** (1.76) (2.46)* housing: mortgage or owned -0.365 -0.183 0.045 -0.119 -0.279 -0.178 -0.252 (5.17)** (2.46)* (0.61) (1.56) (4.57)** (2.94)** (4.11)** employed at first interview 0.066 0.059 -0.068 0.182 0.060 0.011 0.089 (1.00) (0.83) (0.95) (2.40)* (1.03) (0.19) (1.52) unemp at first interview 0.078 0.037 -0.075 -0.050 0.056 0.010 0.014 (1.37) (0.60) (1.21) (0.80) (1.12) (0.20) (0.29) num. JSA claims since 1/95 0.002 -0.006 0.005 0.011 -0.002 0.003 0.004 (0.14) (0.47) (0.43) (0.86) (0.17) (0.33) (0.39) total days unemp before ND -0.000 -0.000 -0.000 -0.000 -0.000 -0.000 -0.000 (0.21) (0.10) (0.46) (4.17)** (0.22) (0.31) (2.27)* length of claim at ND entry -0.000 -0.000 0.000 -0.000 -0.000 -0.000 -0.000 (0.98) (0.73) (0.54) (0.12) (1.01) (0.39) (0.91) local unemp rate at ND entry -0.059 -0.045 -0.020 0.069 -0.051 -0.042 -0.013 (4.57)** (3.27)** (1.41) (4.38)** (4.51)** (3.64)** (1.09) Constant 0.283 0.324 0.112 -0.199 -0.128 -0.239 -0.288 (1.99)* (2.10)* (0.74) (1.23) (1.03) (1.94) (2.29)* Observations 2954 2552 2566 2562 4244 4258 4254 Absolute value of z-statistics in parentheses. * significant at 5% level; ** significant at 1% level
Table 8: Distribution of weights for the matched samples Type of refreshment sample: BP NTRS TRS PRS BP &
NTRS BP & TRS
BP & PRS
weight: 1 488 334 382 353 639 721 6722 194 209 202 149 193 192 1823 62 80 62 63 56 36 494 24 25 34 29 11 11 105 8 22 16 24 5 1 36 2 5 5 7 47 5 3 4 5 8 1 1 2 3 9 1 1
10 1 11 2 12 1 13 1 14 15 16 17 18 19 1
N 785 679 707 640 904 961 920mean weight 1.61 1.86 1.79 1.97 1.40 1.31 1.37% top decile 25 24 25 0.30 22 20 22
Table 9: Measures of concordance between dropouts and refreshment samples BP NTRS TRS PRS BP & NTRS BP & TRS BP & PRS
Pre/post matching: pre post pre post pre post pre post pre post pre post pre postaged 18-19 5.42 1.50 6.02 4.29 0.80 3.43 28.38 0.53 5.68 0.56 2.74 4.15 15.62 4.73aged 20-21 1.65 7.57 0.41 0.51 0.68 3.37 9.59 5.98 0.76 1.18 1.23 0.34 5.12 5.18aged 22-23 3.06 1.45 5.11 2.92 1.21 2.53 33.52 7.68 3.95 0.18 2.25 2.36 15.37 1.70Female 2.42 2.81 4.19 5.24 1.39 0.88 7.20 12.67 3.19 2.10 1.97 0.70 4.50 1.92ethnic minority 13.53 1.43 10.06 0.16 4.63 5.39 39.49 4.26 12.02 0.64 9.64 0.79 24.49 0.64has partner 5.68 1.65 7.65 7.22 8.85 0.69 16.20 2.15 6.54 2.81 7.08 0.23 3.09 2.97dependent children in household 4.07 3.31 4.74 2.47 4.85 1.37 4.74 6.87 4.37 0.28 4.42 6.33 4.37 1.10no educational qualifications 1.00 2.82 2.78 4.82 2.86 1.15 2.69 4.62 1.77 0.66 0.69 3.47 0.60 1.16no vocational qualifications 7.29 3.52 10.55 1.12 3.13 0.48 8.70 6.07 8.70 1.12 5.48 2.24 7.90 2.88drivers licence 6.12 2.27 0.85 0.74 0.02 2.96 4.79 4.32 3.06 1.68 3.43 2.43 5.54 2.64drivers licence and car 0.53 0.45 2.45 0.89 2.99 0.89 1.71 9.59 1.36 1.80 1.61 3.59 0.44 6.58long-term health problem 2.25 0.56 0.88 3.72 1.26 4.27 5.93 0.95 1.66 2.41 1.82 1.85 1.24 4.68housing: social housing 0.90 1.43 2.11 0.63 1.45 0.95 4.42 9.99 1.43 0.32 1.14 1.11 2.43 5.87housing: mortgage or owned 13.56 0.35 5.90 1.42 3.14 2.00 3.50 12.32 10.27 1.41 6.43 1.24 9.25 3.88employed at first interview 0.09 0.55 0.09 3.29 2.01 1.27 11.52 2.09 0.09 3.48 0.93 2.37 4.83 3.34unemployed at first interview 5.22 3.02 3.33 2.38 2.06 5.23 10.99 1.59 4.40 3.65 2.04 6.50 1.80 4.12number of jsa claims since jan 1995 3.66 0.50 1.61 1.96 0.89 5.21 2.18 2.84 2.77 1.87 2.43 0.83 3.02 6.39total days unemployed before ND 2.53 4.29 2.23 1.02 1.22 2.99 5.54 8.78 2.40 4.53 0.90 3.53 3.82 1.29duration of current claim at ND entry 1.77 3.75 0.12 2.63 1.74 2.47 0.10 8.02 1.07 1.27 0.28 3.11 0.97 1.62TTWA unemployment rate at ND 19.45 2.62 13.72 3.87 5.16 1.22 9.72 5.60 16.97 0.24 13.33 2.80 7.33 0.42Mean 5.01 2.29 4.24 2.57 2.52 2.44 10.55 5.85 4.62 1.61 3.49 2.50 6.09 3.15
Table 10: Differences in outcomes between dropouts and substitutes from refreshment samples BP NTRS TRS PRS BP &
NTRS BP & TRS
BP & PRS
unemployed at W2 -0.13* -0.07* 0.00 0.08* -0.10* -0.07* -0.04 (6.07) (2.94) (0.01) (2.31) (5.12) (3.46) (1.72) [1.67] [0.58] [0.06] [0.82] [1.09] [0.56] [0.24]unemployed Dec 98 -0.02 -0.02 0.00 -0.09* -0.02 -0.02 -0.05* (1.48) (0.80) (0.00) (3.31) (1.32) (0.94) (2.59) [0.09] [0.07] [0.04] [0.83] [0.07] [0.05] [0.26]unemployed Mar 99 -0.03 -0.02 0.00 -0.07* -0.03 -0.02 -0.05* (1.74) (0.86) (0.01) (2.24) (1.51) (1.02) (2.22) [0.15] [0.09] [0.05] [0.67] [0.11] [0.07] [0.30]unemployed Jun 99 -0.06* -0.03 0.00 0.03 -0.05* -0.03 -0.03 (3.03) (1.44) (0.01) (0.79) (2.59) (1.69) (1.10) [0.43] [0.17] [0.06] [0.19] [0.28] [0.16] [0.12]unemployed Sep 99 -0.08* -0.04 0.00 0.16* -0.06* -0.04* 0.01 (3.73) (1.81) (0.01) (3.64) (3.15) (2.09) (0.50) [0.66] [0.26] [0.06] [2.61] [0.44] [0.24] [0.10]unemployed Dec 99 -0.11* -0.06* 0.00 0.11* -0.09* -0.06* -0.02 (5.30) (2.53) (0.01) (2.93) (4.42) (2.98) (0.82) [1.26] [0.45] [0.06] [1.40] [0.83] [0.43] [0.11]unemployed Mar 00 -0.14* -0.08* 0.00 0.03 -0.11* -0.08* -0.07* (6.82) (3.29) (0.00) (0.94) (5.79) (3.86) (3.31) [2.09] [0.70] [0.06] [0.18] [1.35] [0.68] [0.59]unemployed Jun 00 -0.12* -0.07* 0.00 0.02 -0.10* -0.07* -0.06* (5.74) (2.82) (0.01) (0.61) (4.96) (3.29) (2.95) [1.47] [0.51] [0.06] [0.12] [0.96] [0.50] [0.45]* - significant at 5% level; t-statistics in parentheses; 100*mean square error in brackets; bootstrapped (10,000 reps)
Table 11: Differences in outcomes between dropouts and substitutes from refreshment samples, including wave 1 and wave 2 unemployment status in propensity score model BP NTRS PRS BP & NTRS BP & PRSunemployed at W2 -0.07* -0.04 -0.03 -0.06* -0.06* (3.81) (1.91) (0.71) (3.29) (3.31) [0.57] [0.21] [0.29] [0.37] [0.39]unemployed Dec 98 -0.01 -0.01 -0.01 -0.01 0.00 (0.48) (0.29) (0.17) (0.44) (0.28) [0.03] [0.04] [0.15] [0.03] [0.02]unemployed Mar 99 0.00 0.00 0.00 0.00 0.00 (0.17) (0.07) (0.09) (0.14) (0.08) [0.03] [0.04] [0.20] [0.03] [0.03]unemployed Jun 99 -0.02 -0.01 -0.01 -0.02 -0.02 (1.11) (0.46) (0.26) (0.90) (0.88) [0.08] [0.06] [0.20] [0.06] [0.06]unemployed Sep 99 -0.01 0.00 0.01 -0.01 -0.01 (0.58) (0.24) (0.24) (0.47) (0.40) [0.04] [0.04] [0.21] [0.03] [0.03]unemployed Dec 99 -0.05* -0.03 -0.02 -0.04* -0.04* (2.67) (1.33) (0.47) (2.30) (2.39) [0.27] [0.12] [0.23] [0.18] [0.21]unemployed Mar 00 -0.10* -0.06* -0.04 -0.08* -0.08* (5.20) (2.51) (1.07) (4.43) (4.47) [1.12] [0.38] [0.37] [0.72] [0.74]unemployed Jun 00 -0.09* -0.05* -0.04 -0.07* -0.07* (4.25) (2.15) (1.01) (3.74) (3.76) [0.77] [0.29] [0.34] [0.52] [0.53]* - significant at 5% level; t-statistics in parentheses; 100*mean square error in brackets; bootstrapped (10,000 reps)
prop
ortio
n un
empl
oyed
probability of unemployment over time
stayer dropout
01jan1999 01apr1999 01jul1999 01oct1999 01jan2000 01apr2000
.2
.4
.6
.8
Figure 1: Difference between stayers and dropouts in the probability of being unemployed
Figure 2: Constructing the refreshment samples
0.30
0.40
0.50
0.60
0.70
0.80
0.90
Dec-98
Feb-99
Apr-99
Jun-9
9
Aug-99
Oct-99
Dec-99
Feb-00
Apr-00
Jun-0
0
prop
ortio
n un
empl
oyed
sample frameNTRSTRSPRS
Figure 3: Differences between samples in the changing probability of unemployment
Sampling frame (including initial nonrespondents)
N=11,159
Non-test samplen=N*pr(R=1)/2
Test samplen=N*pr(R=1)/2
TRSnTRS =nIP
NTRSnNTRS =nTRS
PRSnPRS =nTRS
BPnBP = n*pr(w=1)
IPnIP = n*pr(w=0)
ε1 < 0.5, R=1 ε1 ≥ 0.5, R=1
ε2 < Pr(w=0)
nTRS cases with largest pr(w=0)w=0w=1 w=0
Sampling frame (including initial nonrespondents)
N=11,159
Non-test samplen=N*pr(R=1)/2
Test samplen=N*pr(R=1)/2
TRSnTRS =nIP
NTRSnNTRS =nTRS
PRSnPRS =nTRS
BPnBP = n*pr(w=1)
IPnIP = n*pr(w=0)
ε1 < 0.5, R=1 ε1 ≥ 0.5, R=1
ε2 < Pr(w=0)
nTRS cases with largest pr(w=0)w=0w=1 w=0
Sampling frame (including initial nonrespondents)
N=11,159
Non-test samplen=N*pr(R=1)/2
Test samplen=N*pr(R=1)/2
TRSnTRS =nIP
NTRSnNTRS =nTRS
PRSnPRS =nTRS
BPnBP = n*pr(w=1)
IPnIP = n*pr(w=0)
ε1 < 0.5, R=1 ε1 ≥ 0.5, R=1
ε2 < Pr(w=0)
nTRS cases with largest pr(w=0)w=0w=1 w=0