

7 Methodology (Ed. Hendrik Jürges)

7.1 History of the Development Process: Pilots, Pre-tests, and Main Study 350 (Axel Börsch-Supan)
7.2 Instruments: LMU, CAPI, DROP-OFF, and CMS 351 (Marcel Das, Corrie Vis, and Bas Weerman)
7.3 Translation Process 352 (Janet Harkness)
7.4 Sample Design 352 (Anders Klevmarken)
7.5 The SHARE-SRC Train-The-Trainer Programme 353 (Kirsten Alcser and Grant Benson)
7.6 Fieldwork and Sample Management 354 (Oliver Lipps and Giuseppe De Luca)
7.7 Survey Response 355 (Giuseppe De Luca and Franco Peracchi)
7.8 Item Response 356 (Adriaan Kalwij and Arthur van Soest)
7.9 Computing a Comparable Health Index 357 (Hendrik Jürges)
7.10 Income Imputation 357 (Omar Paccagnella and Guglielmo Weber)
7.11 Wealth Imputation 358 (Dimitrios Christelis, Tullio Jappelli, and Mario Padula)
7.12 Methodological Issues in the Elicitation of Subjective Probabilities 359 (Luigi Guiso, Andrea Tiseno, and Joachim Winter)


This chapter provides short accounts of various methodological aspects of the Survey of Health, Ageing and Retirement in Europe and its co-ordination by the Mannheim Research Institute for the Economics of Aging (MEA) at the University of Mannheim, Germany. It includes an overview of both the substantive and technical development of the common survey instrument and its translation into the different languages. Further, we briefly describe the sample design and weighting strategy in the participating countries, our train-the-trainer programme that aimed at implementing common practices in each country, and the fieldwork and survey management. The chapter further presents basic information about unit and item non-response rates. Finally, we include short methodological notes on the comparability of subjective health data, the imputation of missing information on income and wealth, and the elicitation of subjective probabilities. Further details will be available in a separate technical reports volume, to be published later in 2005.

7.1 History of the Development Process: Pilots, Pre-Tests, and Main Study Axel Börsch-Supan

The SHARE development process iterated in four stages between questionnaire development and data collection. In the first stage, starting in January 2002, the working groups produced an English-language draft questionnaire, starting from the HRS and ELSA instruments plus survey instruments in Germany, Italy, and Sweden that addressed relevant questions. This draft questionnaire was piloted in the UK in September 2002 with the help of the National Centre for Social Research (NatCen, London), which also conducted the first wave of ELSA.

Based on the lessons from this pilot, the English-language questionnaire was thoroughly revised and, with the help of the Language Management Utility (LMU), translated into all SHARE languages. These language elements were fed into a common CAPI programme. The second stage culminated in a first all-country pilot, which applied this instrument simultaneously in all SHARE countries using quota samples of some 75 individuals in each country (June 2003).

In the third stage, after further refinements of the instrument, the full questionnaire was pre-tested in January/February 2004 using genuine probability samples (some 100 primary respondents per country plus their spouses). This all-country pre-test also tested the country-specific logistics and the procedures to achieve probability samples.

During the fourth stage, an extensive statistical analysis of the pilot and pre-test results was conducted under the AMANDA project, also financed by the European Commission. The improvements based on these analyses led to the final design of the instrument. The first prototype wave of about 1,500 households per country began in late April 2004 and was finished in most countries in October 2004. Supplementary data collection is still going on.

The articles in this book are based on an early and incomplete release of the SHARE data, created in November 2004 (“Release 0”). It includes 18,169 individuals in 12,512 households with completed interviews. The French data were only partial, and the November release did not contain Belgian data. While we have done a host of cross-checks, an extensive consistency and plausibility check of all data, with a subsequent imputation process, remains to be done. All results in this book are therefore preliminary.

In April 2005, a more complete data set (“Release 1”) will be accessible to the entire


research community. It will contain about 4,000 additional individual interviews in about 2,500 households, plus added generated and imputed variables. We hope that many researchers will take the opportunity to work with these fascinating data.

A final release with the complete data set – about 27,000 individuals – with an extensive set of generated and imputed variables (“Release 2”) is planned for the first half of 2006.

7.2 Instruments: LMU, CAPI, DROP-OFF, and CMS Marcel Das, Corrie Vis, and Bas Weerman

Although the actual fieldwork in SHARE was carried out by a different agency for each country, the programming of the individual instruments was done centrally by CentERdata, a survey research institute affiliated with Tilburg University in the Netherlands. The data were collected using a computer-assisted personal interviewing (CAPI) program, supplemented by a self-completion paper-and-pencil questionnaire. The set-up of this CAPI program allowed each country involved to use exactly the same underlying structure of meta-data and routing. The only difference across countries was the language. This mechanism, in which question texts are separated from question routing, enforces the comparability of all country-specific translations with a generic questionnaire.

The CAPI program was written in Blaise, a computer-assisted interviewing system and survey processing tool for the Windows operating system, developed by Statistics Netherlands and also used by the US Health and Retirement Study. The generic CAPI instrument was directly implemented in Blaise, and the generic texts (in English) were stored in an external database. After several rounds of revisions of the generic instrument, the different countries translated their versions of the instrument over the Internet using the Language Management Utility (LMU), developed by CentERdata. Another program was written to convert the translated question texts, interviewer instructions, answer categories, fill texts, and other instrument texts (such as error messages) from the LMU database into a country-specific survey instrument based on the blueprint of the generic version. Yet another program was developed to produce a paper version of the separate country-specific CAPI instruments, as well as the generic English version.
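The separation of question texts from generic routing can be illustrated with a minimal sketch. This is not the actual LMU or Blaise code; the question id, texts, and routing entries are invented for illustration. The point is that every country shares one routing table, while only the text table differs by language.

```python
# Hypothetical sketch of the text/routing separation (illustrative ids and
# texts, not the real SHARE instrument): the generic instrument defines
# answer categories and routing once; question texts live in an external
# table keyed by question id and language, as in the LMU database.

QUESTION_TEXTS = {
    ("ph003", "en"): "Would you say your health is ...",
    ("ph003", "de"): "Würden Sie sagen, Ihre Gesundheit ist ...",
}

GENERIC_ROUTING = [
    # (question id, answer categories, id of the next question)
    ("ph003", ["excellent", "very good", "good", "fair", "poor"], "ph004"),
]

def render_question(qid, language):
    """Combine the shared routing entry with a country-specific text."""
    for q, categories, nxt in GENERIC_ROUTING:
        if q == qid:
            return {"text": QUESTION_TEXTS[(qid, language)],
                    "categories": categories, "next": nxt}
    raise KeyError(qid)

# The same routing drives every country; only the displayed text differs.
print(render_question("ph003", "de")["next"])  # -> ph004
```

Because the routing table is shared, any two language versions of a question are guaranteed to have identical answer categories and identical successors, which is the comparability property described above.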

There were only a few exceptions to the generic blueprint of the questionnaire. First, country-specific parts were introduced when institutions were fundamentally different, e.g. in the health care section. Second, country specifics could be introduced by skipping irrelevant answer categories and by adding new country-specific answer categories in the LMU. These exceptions never led to a different sequence of questions for a specific country.

Next to the CAPI instrument, a Case Management System (CMS) was developed to manage the co-ordination of the fieldwork. Only three countries used their own system: France, Switzerland, and the Netherlands. The CMS basically consists of a list of all households in the gross sample that should be approached by the interviewer. Contact notes and registrations, appointments with respondents, and area and case information could be entered in the system, and the system enforced common procedures for re-contacting respondents and for handling non-response.

Some additional tools that converted the CMS into a complete Sample Management System (SMS) were developed. One tool facilitated the merging of all CMS databases that came back from the field, the preparation for sending the interview data, and the actual sending (via FTP) to the central management team. Another tool generated a progress report on the basis of the CMS databases.

All data that came back from the field were processed, converted to SPSS and STATA


data files, and put on a secure web site. The so-called keystroke files, which register all keystroke activity during the fieldwork, formed the basis for additional files containing information about the time spent on the different modules and on the interview in total.
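Deriving per-module timings from such files can be sketched as follows. The log format used here (module name plus timestamp per keystroke) is an assumption for illustration, not the actual SHARE keystroke-file layout, and the module names and times are invented.

```python
# Illustrative sketch (assumed log format, not the real keystroke files):
# if each keystroke carries the module it occurred in and a timestamp,
# per-module durations follow from the first and last event per module.
from datetime import datetime

keystrokes = [
    ("demographics", "2004-05-10 10:00:05"),
    ("demographics", "2004-05-10 10:06:55"),
    ("health",       "2004-05-10 10:07:10"),
    ("health",       "2004-05-10 10:19:40"),
]

def module_durations(events):
    """Seconds between the first and last keystroke of each module."""
    first, last = {}, {}
    for module, stamp in events:
        t = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        first.setdefault(module, t)
        last[module] = t
    return {m: (last[m] - first[m]).total_seconds() for m in first}

print(module_durations(keystrokes))
# {'demographics': 410.0, 'health': 750.0}
```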

7.3 Translation Process Janet Harkness

Due to the complex nature of the SHARE questionnaire, the translation process constituted a considerable challenge. The costs and effort called for in survey translation are often underestimated. Thus, although each participating country in SHARE organised its own translation effort, the central co-ordinator initiated several activities to support the individual translation efforts:

• First, SHARE countries were provided with guidelines recommending how to go about hiring translators, testing translators, organising the translation, and reviewing and assessing the translation. The model advocated followed, in simplified form, that used in the European Social Survey (see ESS documents at http://www.europeansocialsurvey.org). The guidelines advocated organising a team to complete the translation and to review translations. The team would bring together the language and translation skills, survey questionnaire know-how, and substantive expertise needed to handle the SHARE questionnaire modules. In the ESS, the translation guidelines are closely linked to procedural specifications that participating countries have to meet. This was not the case in SHARE; participants were offered the guidelines as recommendations. Ultimately each country decided on its own procedures.

• Second, the project co-ordinator commissioned an expert in survey translation to advise SHARE participants on any translation queries they might have.

• Third, the project co-ordinator commissioned a professional review of a sample of the first draft of SHARE translations. SHARE countries were provided with feedback from an external set of translators, each working in their language of first expertise. The translators commented in detail on selected questions and submitted a brief general appraisal of the translation draft, pointing out areas where improvements could be made. This procedure was repeated for a later draft of the questionnaire, and feedback was again provided to SHARE participants. The pretest-and-pilot design of the SHARE study, coupled with the translation guidelines and appraisals, provided the SHARE project with a rare opportunity to refine and correct the source questionnaire and the translated versions.

7.4 Sample Design Anders Klevmarken

In the participating SHARE countries the institutional conditions with respect to sampling are so different that a uniform sampling design for the entire project was infeasible. Good sampling frames for our target population of individuals aged 50+ and households with at least one 50+ individual did not exist or could not be used in all countries. In most countries there were registers of individuals that permitted stratification by age. In some countries these registers were administered at a regional level; Germany and the Netherlands are two examples. In these cases we needed a two- or multi-stage design in which regions were sampled first and individuals then selected within regions. In the two Nordic countries, Denmark and Sweden, we could draw the samples from national population registers and thus use a relatively simple and efficient design. In France and Spain it became possible to get access to population registers through co-operation with the national statistical office, while in other countries no such co-operation was possible. In three countries, Austria, Greece, and Switzerland, we had to use telephone directories as sampling frames, with pre-screening in the field for eligible sample participants. As a result, the sampling designs used vary from simple random selection of households to rather complicated multi-stage designs. These differences are reflected in the design weights, which are all equal in Denmark (which used simple random sampling of households) but very different in, for instance, Italy. There are also national differences in efficiency. The simple Nordic designs are likely to be more efficient than some of the complex multi-stage designs used in central and southern Europe.

In the three countries that used telephone directories and in Denmark the final sampling unit was a household, while in all other countries the final unit of selection was an individual. Since all 50+ individuals of a household and any of their partners were included in the sample independently of how it was selected, the inclusion probability of a household is by design the same as that of any of the included household members. In the countries that used an individual as the final unit of selection, the inclusion probabilities are proportional to the number of household members aged 50+, information that only became available in the interviews. In these countries it was thus only possible to compute design weights for responding households.
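The proportionality argument above can be made concrete with a small sketch. The sampling fraction is an invented illustrative number, not a SHARE parameter; the point is only that a household with k members aged 50+ is k times as likely to enter an individual-based sample, so its design weight is proportional to 1/k.

```python
# Sketch of design weights under individual-based selection (illustrative
# sampling fraction, not a SHARE value): a household with k members aged
# 50+ can enter the sample through any of them, so its inclusion
# probability is k times the per-person selection probability.

FRAME_SAMPLING_FRACTION = 0.001  # assumed per-person selection probability

def household_design_weight(members_50_plus):
    """Design weight = inverse inclusion probability of the household."""
    inclusion_prob = FRAME_SAMPLING_FRACTION * members_50_plus
    return 1.0 / inclusion_prob

# A household with one 50+ member gets twice the weight of a 50+ couple:
assert household_design_weight(1) == 2 * household_design_weight(2)
```

Since the count of 50+ members is only observed in the interview, these weights can only be computed for responding households, as noted above.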

Unit non-response was compensated for by adjusting the design weights, using a calibration approach. In most countries the calibration was done to national population totals decomposed by age and gender; in two countries more information could be used, and in two countries just national totals by gender were used.
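A minimal post-stratification sketch of this calibration step, under invented population totals and weights (the real SHARE calibration may use a more general calibration estimator): design weights within each age-by-gender cell are scaled so that the weighted respondent counts reproduce the known population totals.

```python
# Minimal post-stratification sketch of the calibration step (hypothetical
# totals and weights): within each age x gender cell, design weights are
# scaled so that their weighted sum matches the population total.

population_totals = {("50-64", "f"): 900, ("50-64", "m"): 850,
                     ("65+", "f"): 700, ("65+", "m"): 550}

respondents = [
    {"cell": ("50-64", "f"), "design_weight": 10.0},
    {"cell": ("50-64", "f"), "design_weight": 12.0},
    {"cell": ("65+", "m"), "design_weight": 11.0},
]

def calibrate(sample, totals):
    """Scale design weights cell by cell to match population totals."""
    weighted = {}
    for r in sample:
        weighted[r["cell"]] = weighted.get(r["cell"], 0.0) + r["design_weight"]
    for r in sample:
        r["cal_weight"] = r["design_weight"] * totals[r["cell"]] / weighted[r["cell"]]
    return sample

calibrated = calibrate(respondents, population_totals)
# Weighted counts now reproduce the population total in each observed cell:
total_f = sum(r["cal_weight"] for r in calibrated if r["cell"] == ("50-64", "f"))
assert abs(total_f - 900) < 1e-9
```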

7.5 The SHARE-SRC Train-The-Trainer Programme Kirsten Alcser and Grant Benson

A train-the-trainer (TTT) programme was developed for the SHARE project by the Survey Research Center (SRC) of the University of Michigan at Ann Arbor, providing centralised training of local survey agency trainers in order to facilitate standard training of interviewers and standardisation of the data collection processes in the respective countries. Training tools were developed by SRC in close co-operation with MEA and CentERdata, including an Interviewer Project Manual describing all SHARE field protocols; a Facilitator Guide with PowerPoint slides and training scripts; a CD-based training on gaining respondent co-operation; and training videos to illustrate (a) the correct interpretation and recording of call attempts, and (b) physical measurements. All materials were translated from the English deliverables into the language of the country before being distributed to the local interviewers.

A TTT was conducted prior to each pilot/pretest and production data collection. After the initial TTT training, subsequent training sessions were abbreviated, covering primarily changes or additions to SHARE, as these evolved. A final product included a prototype agenda for the two-day training of SHARE interviewers in the host countries (see Table 1).


7.6 Fieldwork and Sample Management Oliver Lipps and Giuseppe De Luca

Each individual survey agency managed its own fieldwork following its established protocols, subject to a set of requirements from the SHARE co-ordinating team, enforced by the design of the common electronic case management system (CMS). Most important among these requirements were measures to minimise the number of households unwilling or unable to participate in the survey. For example, advance letters explaining the importance of the study were sent to each household in the gross sample before the interviewer contacted them in person. At this stage, some countries also offered monetary or other incentives for participation. If a first attempt to gain the household’s co-operation had been unsuccessful, the address was given to a new interviewer with special experience in gaining co-operation. If respondents were unable to participate due to health reasons, we asked for consent to have the interview done by a proxy respondent, e.g. an adult child.

During the field period, the SHARE co-ordinator set up a procedure to monitor the fieldwork in each participating country in real time, in parallel to the survey agencies. Every two weeks, at pre-specified dates, the survey agencies sent their updated CAPI and CMS data to CentERdata, where the data were processed and then made available to the project co-ordinator. These data were then used by the co-ordination team to follow the progress made in each country. At any time during the entire field period it was thus possible to monitor (with a maximum lag of two weeks):

• how many households had been contacted

• how many interviews had been conducted

• which interviewers were actively working on SHARE and which were currently inactive

• what were the main reasons for non-contact

• what were the main reasons for non-interviews

Table 1

SHARE Two-Day Training Plan

Topic                                                          Length (minutes)
Day 1:
  Introductions, welcome, logistics                                        15
  SHARE project and questionnaire overview                                 45
  Laptop overview and instrument installation check                        30
  Overview of Case Management System                                       75
  Overview of the Blaise interview program                                 45
  SHARE questionnaire walk-through
  (scripted mock scenario recommended): first half session                150
Day 2:
  Questions and answers from Day 1                                         15
  SHARE questionnaire walk-through
  (scripted mock scenario recommended): second half session               120
  Proxy interviews                                                         45
  Importance of response rates                                             30
  Contacting households                                                    60
  Practice using the Case Management System                                60
  Gaining respondent co-operation                                          60

Total time in training (excluding breaks): 12.5 hours


Given this information, the co-ordinator was able to identify possible problems in the field, and their reasons, very early in the process. Strategies for coping with such problems could be discussed with the country teams and survey agencies and implemented without unnecessary delay.

7.7 Survey Response Giuseppe De Luca and Franco Peracchi

Survey participation may be viewed as the result of a sequential process involving eligibility, contact of the eligible units, and response by the contacted units. For SHARE, the analysis of survey participation depends crucially on whether or not the sampling frame contains preliminary information on the eligibility status of the sample units. Countries that use telephone directories as sampling frames (namely Austria, Greece, and Switzerland) have a higher probability of selecting ineligible sample units. However, once the effects of the different frames on eligibility rates are taken into account, one can compare response rates across all countries involved in the project.

Overall, the SHARE data release on which all results presented in the present volume are based (“Release 0”) contains 18,169 individuals in 12,512 households. The unweighted country average of household response rates is 55.4% (57.4% among the countries under EU contract); see Table 2. France and the Netherlands have the highest response rates (69.4% and 61.6%, respectively), Switzerland the lowest (37.6%). Focusing on the reasons for household non-response, refusal to participate in the survey is the main reason (28.9%), although in some countries a non-negligible fraction of non-response is also due to non-contact (12.4% in Spain) and other non-interview reasons (17.1% and 14.2% in Sweden and Germany, respectively). An analysis of individual response rates and within-household response rates suggests that most non-response in SHARE occurs at the household level, and that the response behaviour of individuals within a household is strongly and positively related. The unweighted country average of within-household individual response rates is 86.3%. Preliminary response analysis by subgroup of the target population reveals only small differences in the patterns of survey participation by gender and age group.

Table 2

Country              Household Response Rate    Individual Response Rate
                                                (within household)

Sweden                       42.1%                      83.8%
Denmark                      61.1%                      93.0%
Netherlands                  61.6%                      87.9%
Germany                      60.2%                      86.5%
France                       69.4%                      91.7%
Switzerland                  37.6%                      86.9%
Austria                      57.3%                      87.4%
Italy                        54.1%                      79.7%
Spain                        50.2%                      73.8%
Greece                       60.2%                      91.8%
Total                        55.4%                      86.3%
Total (EU-funded)            57.4%                      86.0%
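The unweighted country average reported in the text can be reproduced directly from the ten country rows of Table 2; the sketch below checks the household-rate average (the within-household individual rates average the same way).

```python
# Reproducing the unweighted country average of household response rates
# from the ten country rows of Table 2:

household_rates = {
    "Sweden": 42.1, "Denmark": 61.1, "Netherlands": 61.6, "Germany": 60.2,
    "France": 69.4, "Switzerland": 37.6, "Austria": 57.3, "Italy": 54.1,
    "Spain": 50.2, "Greece": 60.2,
}

average = sum(household_rates.values()) / len(household_rates)
assert round(average, 1) == 55.4  # matches the reported country average
```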


7.8 Item Response Adriaan Kalwij and Arthur van Soest

As is common in household surveys, respondents sometimes answered questions with “I don’t know” (DK) or “I’d rather not say” (RF, refusal). This behaviour is called item non-response. For an overwhelmingly large majority of the variables in SHARE, item non-response is a minor problem, since the percentage of DKs or RFs is quite small. For example, there is hardly any item non-response in physical or mental health variables; in well-being, labour force status, and job satisfaction; or in basic demographics and information on children. Somewhat larger item non-response rates are found for qualitative questions on pension entitlements, expectations, asset ownership, or the nature of the assets.

The type of question that suffers substantially from item non-response is the question on amounts of income, expenditure, or values of assets. In this respect, however, SHARE does not differ much from comparable surveys like ELSA or HRS. For example, owners of shares of stock or stock mutual funds are asked the total value of their (household’s) shares of stock and stock mutual funds. In SHARE, 30.2% of the owners answer DK or RF, compared to 35.0% in the 2002 wave of HRS.

Respondents answering DK or RF are asked a number of subsequent questions on whether the amount is larger than, smaller than, or about equal to a given amount. This so-called unfolding bracket design was already used in HRS 1992 and proved to be an effective way to collect categorical information on the initial non-respondents. For example, with bracket questions on the amounts €25,000, €50,000, and €100,000, for those who go through all the bracket questions we know whether the amount is less than €25,000, about €25,000, between €25,000 and €50,000, about €50,000, etc. As in HRS, a large fraction of initial non-respondents appear to be willing to answer the bracket questions. For example, for shares of stocks and stock mutual funds, 45.4% of initial non-respondents in SHARE complete the brackets, compared to 41.2% in HRS. For 16.5% of all owners in SHARE, there is no information on the amount at all, compared to 18.6% in HRS. Thus SHARE compares favourably to HRS in this respect, something that is generally also found for other amount questions.
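The bracket walk described above can be sketched as follows. This is a simplified illustration using the €25,000/€50,000/€100,000 cut-offs from the example; a real instrument varies entry points and cut-offs by country.

```python
# Simplified sketch of the unfolding bracket logic (cut-offs taken from
# the example above; real instruments vary cut-offs across countries).
BRACKETS = [25_000, 50_000, 100_000]

def bracket_interval(answers):
    """Map a sequence of 'less'/'about'/'more' answers to an interval.

    answers[i] compares the unknown amount with BRACKETS[i]; the walk
    stops at the first 'less' or 'about' answer.
    """
    lower = 0
    for cutoff, answer in zip(BRACKETS, answers):
        if answer == "about":
            return (cutoff, cutoff)
        if answer == "less":
            return (lower, cutoff)
        lower = cutoff  # 'more': continue to the next cut-off
    return (lower, None)  # more than the largest cut-off

assert bracket_interval(["less"]) == (0, 25_000)
assert bracket_interval(["more", "about"]) == (50_000, 50_000)
assert bracket_interval(["more", "more", "more"]) == (100_000, None)
```

Each initial non-respondent who completes the walk thus contributes a known interval (or point) for the amount, which is exactly the categorical information used later in the imputation step.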

For studies that use income or income components, wealth or wealth components, etc., as one of the right-hand variables, missing information on one of these variables is a problem. Deleting observations with missing information is often an unattractive option for two reasons. The first reason is that a smaller sample size results in an efficiency loss. The second reason is that deleting missing data may yield biased inferences when item non-response is related to the variable of interest. For instance, the reason for item non-response may be related to the same factors that drive income or health of the respondent and deleting missing data would then lead to a selective sample.

Therefore, instead of deleting missing data, the missing values are replaced by imputed values, i.e., observed values of other respondents that are similar to the respondent considered in certain relevant aspects. Many imputation methods exist. For the data release used by all papers in this volume, we followed the procedure of Hoynes et al. (1998). Imputations were first done recursively for a small set of core variables (income from employment, self-employment, or public pensions; value of owner-occupied housing; amount held in shares of stock and stock mutual funds; amount held in checking and saving accounts; and food consumption). This is done to guarantee that imputations respect the correlation structure of these variables. For example, respondents with missing food consumption but with high (observed or imputed) earnings were assigned an observed (probably relatively high) food consumption amount of another respondent with similarly high earnings but with observed food consumption. The imputed values are flagged (i.e., an indicator variable is constructed indicating the level of imputation), and flags and imputed variables will be included in the public release of the data. More refined imputation methods will be applied to later data releases.
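A single step of this donor-based idea can be sketched as follows. The donor values and the nearest-neighbour matching rule are illustrative assumptions; the actual procedure (after Hoynes et al. 1998) is recursive over several core variables and more elaborate than this.

```python
# Simplified single-step sketch of donor-based imputation (illustrative
# donor values; the real SHARE procedure is recursive over several core
# variables): a missing food consumption value is borrowed from the
# observed respondent whose earnings are closest.

donors = [  # (earnings, observed food consumption), invented numbers
    (12_000, 3_100), (28_000, 4_800), (55_000, 7_200),
]

def impute_food(earnings):
    """Borrow food consumption from the donor with the closest earnings."""
    closest = min(donors, key=lambda d: abs(d[0] - earnings))
    return closest[1]

# A respondent with high earnings inherits a high consumption value,
# preserving the correlation between the two variables:
assert impute_food(60_000) == 7_200
assert impute_food(10_000) == 3_100
```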

7.9 Computing a Comparable Health Index Hendrik Jürges

Subjective data, such as self-assessed health, can be subject to cross-country bias for several reasons. However, there is a fairly straightforward way to compute a single measure of health that is comparable across countries. The main requisite is objective data on the respondent’s health: self-reported diagnosed chronic conditions, mental illnesses, symptoms (especially pain), or functional limitations. If available, one can also use medical records, and measurements and tests such as blood samples, grip strength, balance, gait speed, etc. The absence of any conditions, symptoms, or limitations implies perfect health, i.e. an index value of 1. The presence of a condition reduces the health index by some given amount or percentage, the so-called disability weight. The disability weight of each condition or symptom is assumed to be the same for each respondent.

Disability weights are often derived from expert judgements or from surveys specialised in eliciting health preferences, using time trade-offs or standard gambles. In SHARE, we are able to compute disability weights from within our sample (Cutler and Richardson 1997) by estimating ordered response (e.g. ordered probit) models of self-reported health (which ranges e.g. from “excellent” to “poor”) on a large number of variables representing chronic conditions, symptoms, ADL problems, depression, physical functioning, height, weight, and cognitive functioning. We can also include our measures of grip strength and walking speed, and basic demographic variables like age and sex. The health index is then computed as the linear prediction from this regression (the latent variable), normalised to 0 for the worst observed health state (often referred to as “near death”) and 1 for the best observed health state (referred to as “perfect health”). This procedure implies disability weights for each condition or impairment that are equal to the respective (also normalised) regression parameters. Since the variable on which we base this measurement is self-reported health itself (and thus potentially subject to cross-cultural bias), we account for country-specific reporting styles by modelling the latent variable thresholds as a function of country of residence (i.e. we basically have fixed country effects at each threshold). Thus thresholds are allowed to vary across countries, while disability weights are constrained to be the same in each country.
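The final normalisation step can be sketched with toy numbers. The coefficients and condition names below are invented, not estimated SHARE parameters; the estimation of the ordered model itself is omitted. The sketch only shows how the linear prediction is rescaled so that the worst observed state maps to 0 and the best to 1.

```python
# Sketch of the health-index normalisation (hypothetical coefficients,
# not estimated SHARE parameters; the ordered-model estimation itself is
# omitted): the linear prediction is rescaled to [0, 1] over the sample.

coefficients = {"adl_problem": -0.9, "chronic_condition": -0.5, "pain": -0.7}

def linear_prediction(conditions):
    """Latent-variable prediction: sum of the relevant coefficients."""
    return sum(coefficients[c] for c in conditions)

# Predictions for every respondent in a (toy) sample:
sample = [[], ["chronic_condition"], ["adl_problem", "pain", "chronic_condition"]]
xb = [linear_prediction(s) for s in sample]
worst, best = min(xb), max(xb)

def health_index(conditions):
    """0 = worst observed health state, 1 = best observed health state."""
    return (linear_prediction(conditions) - worst) / (best - worst)

assert health_index([]) == 1.0  # no conditions: index value 1
assert health_index(["adl_problem", "pain", "chronic_condition"]) == 0.0
```

After this rescaling, each normalised coefficient acts as the disability weight of its condition: adding that condition lowers the index by the same amount for every respondent, as assumed above.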

7.10 Income Imputation Omar Paccagnella and Guglielmo Weber

The Definition of Income: Total income is the sum of some income components measured at the individual level and some at the household level. The basic definition used in the SHARE project reflects money income before taxes on a yearly basis (2003) and includes only regular payments. Lump-sum payments and financial support provided by parents, relatives, or other people are not included.

The available data at the individual level include: income from employment; income from self-employment or work for a family business; income from (public or private) pensions or invalidity or unemployment benefits; income from alimony or other private regular payments; and income from long-term care insurance (only for Austria and Germany).

The available data at the household level include: income from household members not interviewed; income from other payments, such as housing allowances, child benefits, poverty relief, etc.; income actually received from secondary homes, holiday homes or real estate, land, or forestry; and capital income (interest from bank accounts, transaction accounts, or saving accounts; interest from government or corporate bonds; dividends from stocks or shares; and interest or dividends from mutual funds or managed investment accounts). For homeowners, the data at the household level also include imputed rent, based on the self-assessed home value minus the net residual value of the debt (payments for mortgages or loans). The interest rate used for imputed rents is fixed at 4% for all countries.
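The imputed-rent rule above is simple arithmetic, sketched here with invented amounts: 4% of the self-assessed home value net of the outstanding mortgage or loan debt.

```python
# The imputed-rent rule as arithmetic (illustrative amounts): 4% of the
# self-assessed home value minus the net residual value of the debt.

IMPUTED_RENT_RATE = 0.04  # fixed at 4% for all countries

def imputed_rent(home_value, mortgage_debt):
    return IMPUTED_RENT_RATE * (home_value - mortgage_debt)

# A €200,000 home with €50,000 of outstanding debt yields €6,000 per year:
assert abs(imputed_rent(200_000, 50_000) - 6_000) < 1e-6
```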

The SHARE definition of income does not include home businesses and “other types of debts”: in the latter case we are not able to separate the amount of the debts on cars and other vehicles from the total amount of debts.

Imputations: Whenever a respondent did not know or refused to give the exact amount in a certain question, unfolding brackets (UB) questions were asked to recover that value (see above). Different cut-offs were used across countries.

As far as UB observations are concerned, we implemented a simple hot-deck procedure to impute values for those cases in which the exact amount is missing. At this stage, only the amount variable is imputed. Also, we imputed one variable at a time and performed only one round of imputations for each variable. No stratification was made, except by country (due to the differences in the cut-offs).
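A minimal sketch of such a hot-deck step, grouping donors by country and UB interval and drawing a random donor for each missing amount. The record fields (`country`, `bracket`, `amount`) are hypothetical; this illustrates the general technique, not the SHARE implementation.

```python
import random

def hotdeck_impute(records, key, cell_keys, seed=0):
    """Simple hot-deck: fill each missing value with a randomly drawn
    donor value from the same cell (here: same country and UB interval)."""
    rng = random.Random(seed)
    # Group donor (non-missing) values by cell.
    donors = {}
    for r in records:
        if r[key] is not None:
            cell = tuple(r[k] for k in cell_keys)
            donors.setdefault(cell, []).append(r[key])
    # Draw a donor for each recipient; may fail if a cell has no donors,
    # mirroring the failure case mentioned in the text.
    for r in records:
        if r[key] is None:
            cell = tuple(r[k] for k in cell_keys)
            if donors.get(cell):
                r[key] = rng.choice(donors[cell])
    return records

data = [
    {"country": "DE", "bracket": 2, "amount": 1200.0},
    {"country": "DE", "bracket": 2, "amount": 1500.0},
    {"country": "DE", "bracket": 2, "amount": None},
]
hotdeck_impute(data, "amount", ("country", "bracket"))
print(data[2]["amount"])  # one of the two donor values
```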

In the event of a "refusal" or "don't know" answer to all UB questions, we stratify by country and age classes, except for financial assets, where income is computed on the basis of the stock values (whether exact records exist or are themselves imputed).

In the event of "invalid" ("refusal", "don't know", or missing) values on frequency variables (for instance, the period covered by a payment and the number of months in which the respondent received the payment in 2003), a linear regression technique was applied to impute such frequencies. In particular, we used the linear regression only for the frequencies of received pensions. The regression conditions upon the following independent variables: age, sex, and dummy indicators for whether the associated amount variable belongs to the intervals defined by the 1st, 2nd, and 3rd quartiles.
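The regression-based frequency imputation can be sketched as follows. The function name and input layout are assumptions, and NumPy's least-squares routine stands in for whatever estimation code was actually used; the design matrix follows the covariates listed above (age, sex, and quartile dummies for the associated amount).

```python
import numpy as np

def impute_frequency(age, sex, amount, freq):
    """Impute missing frequency values (e.g. months of pension receipt)
    by OLS on age, sex, and quartile dummies of the associated amount.
    Inputs are 1-D sequences; missing frequencies are np.nan.
    Illustrative sketch, not the SHARE code."""
    amount = np.asarray(amount, dtype=float)
    q1, q2, q3 = np.percentile(amount, [25, 50, 75])
    X = np.column_stack([
        np.ones_like(amount),                  # intercept
        np.asarray(age, dtype=float),
        np.asarray(sex, dtype=float),          # e.g. 1 = female, 0 = male
        (amount <= q1).astype(float),          # quartile-interval dummies
        ((amount > q1) & (amount <= q2)).astype(float),
        ((amount > q2) & (amount <= q3)).astype(float),
    ])
    freq = np.asarray(freq, dtype=float)
    obs = ~np.isnan(freq)
    # Fit on observed rows, predict the missing ones.
    beta, *_ = np.linalg.lstsq(X[obs], freq[obs], rcond=None)
    imputed = freq.copy()
    imputed[~obs] = X[~obs] @ beta
    return imputed

age = [52, 60, 67, 71, 75, 58, 63]
sex = [0, 1, 1, 0, 1, 0, 1]
amount = [800, 1200, 1500, 2100, 900, 1700, 1300]
freq = [12, 12, 6, np.nan, 12, 6, np.nan]
print(impute_frequency(age, sex, amount, freq))
```

As the text notes, the coefficients would be estimated separately for each frequency variable within each country; the sketch shows a single fit.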

We produce the estimated coefficients for each frequency variable within each country. In a few cases the hot-deck procedure may fail because there are no donors that can be used for that specific interval.

7.11 Wealth Imputation
Dimitrios Christelis, Tullio Jappelli, and Mario Padula

The Definition of Wealth: SHARE contains information on the ownership and value of the following assets.

• Real assets, i.e. the ownership and value of the primary residence, of other real estate, of the share owned of own businesses and of owned cars.

• Gross financial assets, i.e. the ownership and value of bank accounts, government and corporate bonds, stocks, mutual funds, individual retirement accounts, contractual savings for housing and life insurance policies.

• Mortgages and financial liabilities.


The values of these variables are summed over all household members in order to generate the corresponding household-level variables. As with income, whenever a respondent did not know or refused to give the exact amount in a certain question, unfolding brackets (UB) questions were asked to recover that value, where different entry points were used across countries.

Imputations: Imputation is performed using the hotdeck imputation package in Stata, which is based on the approximate Bayesian bootstrap described in Rubin and Schenker (1986). This procedure classifies the non-missing observations into cells (by some variables, e.g. unfolding bracket values, age, etc.), draws bootstrap samples from each cell, and uses values from these samples to impute the missing observations in that cell.

We impute asset values in two steps. (1) If an individual responds "don't know" or refuses to answer the ownership question, then ownership is imputed, using country and age as classificatory variables for the hot-deck procedure. (2) The amount is imputed when ownership is imputed; when the individual responds don't know/refusal and either does not start the unfolding brackets procedure, does not complete it, or completes it without giving a specific amount as an approximate answer; or when the original answer is deemed illegitimate for other reasons.
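The two-step logic, with the approximate Bayesian bootstrap of Rubin and Schenker (1986) as the draw mechanism, might look like this in outline. Cell construction is simplified to a single donor list per step, and all names and example values are illustrative.

```python
import random

def abb_impute(donors, n_missing, rng):
    """Approximate Bayesian bootstrap (Rubin and Schenker 1986):
    first resample the donor pool with replacement, then draw the
    imputed values from that bootstrap sample."""
    boot = [rng.choice(donors) for _ in donors]
    return [rng.choice(boot) for _ in range(n_missing)]

rng = random.Random(42)

# Step 1: impute ownership within a country/age cell.
owners = [1, 1, 0, 1, 0, 1]               # observed ownership in the cell
missing_ownership = abb_impute(owners, 2, rng)

# Step 2: impute amounts for (actual or imputed) owners, using the
# bracket-value cell when it is known and the age cell otherwise.
amounts = [15_000.0, 22_000.0, 18_500.0]  # observed amounts in the cell
missing_amounts = abb_impute(amounts, 2, rng)

print(missing_ownership, missing_amounts)
```

The extra resampling step (drawing the bootstrap pool before drawing donors) is what distinguishes the approximate Bayesian bootstrap from a plain hot-deck: it propagates uncertainty about the donor distribution itself.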

In the end we divided the variables into three groups according to the criteria by which the cell classification for imputation was made (all imputations were made separately for each country).

• Housing, bank accounts and cars: These variables contained numerous positive non-missing values, reflecting the wide ownership of the corresponding assets. When we did not know the bracket value, we used age as an additional classificatory variable; when we knew the bracket value, we used it together with age.

• Mortgage: We needed to link the value of the mortgage to the value of the house, in order to avoid as much as possible the case where the imputed value of the mortgage is greater than the value of the house. Thus, when we did not know the bracket value of the mortgage, we used the bracket value of the house as a classificatory variable; when we knew the bracket value of the mortgage, we used it for the imputation and excluded the bracket value of the house, because its inclusion would have made the cells too thin.

• Other real estate, bonds, stocks, mutual funds, individual retirement accounts, contractual savings for housing, life insurance, own business and owned share thereof, and financial liabilities: These variables exhibited relatively few positive non-missing values. We used age to define the imputation cells when we did not know the bracket value, while we used the bracket value for their definition when we knew it.
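The three cell-classification rules above can be summarised in a small dispatch function. The field names, age classes, and the `WIDELY_HELD` set are assumptions made for this sketch; country is always part of the cell key since all imputations are country-specific.

```python
# Hypothetical field names; not SHARE's variable names.
WIDELY_HELD = {"house", "bank_account", "car"}

def imputation_cell(rec):
    """Return the hot-deck cell key for a record, following the three
    groups described in the text."""
    asset, country = rec["asset"], rec["country"]
    bracket, age_class = rec.get("bracket"), rec["age_class"]
    if asset == "mortgage":
        # Link the mortgage to the house: use the mortgage bracket when
        # known, the house bracket otherwise; never combine both, since
        # that would make the cells too thin.
        key = bracket if bracket is not None else rec.get("house_bracket")
        return (country, asset, key)
    if asset in WIDELY_HELD:
        # Plenty of donors: age always, plus the bracket when known.
        return (country, asset, age_class, bracket)
    # Sparsely held assets: bracket alone when known, age alone otherwise.
    if bracket is not None:
        return (country, asset, bracket)
    return (country, asset, age_class)

print(imputation_cell(
    {"asset": "mortgage", "country": "IT", "bracket": None,
     "house_bracket": 3, "age_class": 2}))  # ('IT', 'mortgage', 3)
```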

7.12 Methodological Issues in the Elicitation of Subjective Probabilities
Luigi Guiso, Andrea Tiseno, and Joachim Winter

Non-response rates for the subjective expectations questions are generally low. For the "sunny day" question, the non-response rate is 3.2%, and for the subjective survival question it is 7.9%. There is only minor variation in non-response rates across countries: the smallest non-response rates (below 5%) are observed in Austria, Switzerland, and Germany; the largest non-response rate to the subjective survival question, about 15%, is observed in Spain.


An issue that has received some attention in the literature on probabilistic expectations is rounding to certain "focal values" (in particular, to 0%, 50%, and 100%, and to a lesser degree to other multiples of 10%). Even more striking than rounding is the excessive use of 50% responses. Some authors argue that a 50% response reflects "epistemic" uncertainty about the event in question (e.g., Bruine de Bruin et al. 2002). In this case, 50% responses would be similar to a "don't know" response, and they would have to be dealt with differently than other multiples of 10% generated by rounding. While a deeper analysis of this issue is beyond the scope of this paper, it is nevertheless interesting to see whether the phenomenon of rounding and excessive 50% responses is present in the SHARE data as well, and even more importantly, whether there are any striking differences in response behaviour to probabilistic expectations questions across participating countries.

For instance, an analysis of the responses to the "sunny day" question confirms findings of other surveys such as HRS: Most of the responses are at focal values, in particular multiples of 10%, with a peak at 50% that cannot easily be explained by rounding. Overall, however, only about one fifth of all responses are at 50%, which is less than what has been found in other surveys. In the SHARE data, the prevalence of 50% responses is similar in all questions—between 20% and 30% of all responses. Second, there is some variation across countries. The question with the largest degree of cross-country variation in the use of 50% responses is the "sunny day" question, and it seems likely that the observed differences are due to actual differences in weather conditions and not in response behaviour—the Mediterranean countries simply have better weather, so the entire response distribution should be shifted to the right, reducing the number of 50% responses. For the other questions, the variation is rather small.

Future research will have to test whether these differences correctly reflect differences in the underlying expectations across countries or whether there are country-specific response styles for probabilistic expectations questions. Another methodological issue related to probabilistic expectations questions is whether there is a general tendency by respondents to be optimistic (i.e., to report high probabilities for positive and low probabilities for negative events) in hypothetical choice questions. A first impression of whether this effect exists can be obtained by correlating responses to a question that likely reflects an individual's overall optimism (in the case of SHARE, we use the "sunny day" question for this purpose) with the responses to substantive probabilistic questions.
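Such a check amounts to computing rank correlations between the "sunny day" responses and each substantive question. Below is a self-contained Spearman implementation (the Pearson correlation of average ranks, which handles the tied responses that focal values produce), applied to made-up probability responses rather than SHARE data.

```python
def ranks(xs):
    """Average ranks, 1-based; tied values share the mean of their ranks."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1                      # extend over the tie group
        avg = (i + j) / 2 + 1
        for k in order[i:j + 1]:
            r[k] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman rank correlation = Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5

# Hypothetical probability responses (% chances), not SHARE data:
sunny = [70, 50, 90, 30, 60, 80]
survive = [60, 50, 80, 40, 50, 70]
print(round(spearman(sunny, survive), 3))
```

Applied to the actual data, this yields coefficients like those in Table 3; the associated p-values test independence between the two response variables.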

Table 3 shows the correlation of responses to the substantive expectations questions with responses to the "sunny day" question. While all correlations are statistically significant at any conventional significance level (due to the large sample size), the absolute size of the correlation coefficients is small, which can be taken as evidence against a general tendency to be optimistic or pessimistic.

Table 3: Correlation of Responses to the Substantive Expectations Questions with Responses to the "Sunny Day" Question

Question                              Correlation   p-value       N
ex007  Decrease in pensions             -0.0755      0.000     6240
ex008  Increase in retirement age       -0.0563      0.000     6268
ex009  Survival to target age            0.0979      0.000    15108
ex010  Better standard of living         0.1154      0.000    15618
ex011  Worse standard of living          0.0262      0.001    15531

Notes: Reported correlations are Spearman's rank correlation coefficients. The p-value is for the null hypothesis that the row variable is independent of the response to the "sunny day" question.

References

Bruine de Bruin, W., P. S. Fischbeck, N. A. Stiber, and B. Fischhoff. 2002. What number is fifty-fifty? Redistributing excessive 50% responses in elicited probabilities. Risk Analysis 22:713-23.

Cutler, D. M., and E. Richardson. 1997. Measuring the health of the U.S. population. Brookings Papers on Economic Activity: Microeconomics 1997:217-71.

Hoynes, H., M. Hurd, and H. Chand. 1998. Household wealth of the elderly under alternative imputation procedures. In Inquiries in the Economics of Aging, ed. David Wise, 229-57. Chicago: The University of Chicago Press.

Rubin, D. B., and N. Schenker. 1986. Multiple imputation for interval estimation from simple random samples with ignorable nonresponse. Journal of the American Statistical Association 81:366-74.