Sample data for health analysts: Welcome!

ESDS Government

Samples of Anonymised Records


Today

• Introduce some key data resources for health research

• Provide examples of how the data can be used

• Demonstrate different routes for accessing data

Introduction to Government Microdata

• Focussing on some key individual-level sample datasets gathered by government for policy making purposes
– Health Surveys
– Census Microdata

• Introduce the data and its features

• How to access data, resources and support

• Future developments

• Hands-on session before lunch

Why should you want to know about the data?

Because the data are...

• Very cost effective: data free of charge to not-for-profit researchers

• Saves time: no need to conduct survey

• Access to high quality, well documented data

• Can provide nationally representative data - allows generalisation to population

• Allows historical and geographical comparisons to be made

• ESRC funded data support services

“Microdata”?

Individual cases are maintained – each record contains a wide range of characteristics

What are they like?

• Sample data

• Snapshot data (“cross-sectional”)

• Long-running series

• Large

• Flexible

Typically...

Government Surveys

• Conducted by:
– Office for National Statistics (ONS)
– NatCen
– or similar...

• Collected face to face
– CAPI (and CASI)

• Household Surveys
– Typically exclude communal establishments
– May include one, all or some residents

• Limited range of topics

Health Surveys

Detailed Health Information
• Health Survey for England from 91
• Welsh Health Survey continuous from ‘03
• Scottish Health Survey 95, 98, 03/04

Some useful indicators: general health, GP/hospital visits, smoking/drinking, contraception
• General Household Survey from 71 (less recently)

Contains some health data
• Omnibus
• Time Use Survey
• Labour Force Survey

Census microdata

• Wide-ranging but brief

• Self completion

• Decennial

• Samples of Anonymised Records available for 91 and 01

• Sample size up to 5%

• Contains only basic health indicators


What data are available?

Which datasets might be most suitable for:

• Comparing the health of unpaid carers across local authority areas

• Comparing obesity levels across minority groups

• Comparing the relationship between education and contraception type over 3 decades

• Looking at the impact of job type on accidents at work

Why use...? SARs vs HSE

• Detailed health data
– SARs: Basic indicators only (LTLI, General Health, Caring)
– HSE: Health focussed, inc. nurse visit

• Sample size
– SARs: Up to 5% (n=3 million)
– HSE: Varies, e.g. c. 15k full interviews in 2007

• Lowest Geography
– SARs: Down to LAD
– HSE: Region/SHA (published modelled figures for lower levels, see Scholes et al 2008)

• Currency
– SARs: 2001 data available
– HSE: 2008 data available

• Other
– Links to other census products
– Easy to use
– Provides key indicators

From the census...

• Can be used alongside other census products

• Promise of confidentiality – microdata is not public data

• Gaps due to non-response were filled using ‘imputation’ (borrowing data from similar respondents)

• Simple random sample (approximately)

• Very large samples 0.6 to 3 million cases


Census topics 2001

On
• Employment
• Education
• Household structure
• Housing
• Transport
• Ethnicity & Religion
• Health & Care

Not on
• Type of illness or disability
• Income

Health in the census

• Long term limiting illness in 1991 and 2001

• New to 2001:
– Caring
– General health

Range of census files: What do you want?

Files compared:
• Area tables
• Samples of Anonymised Records
– Individual licensed (EUL)
– Small Area Microdata
– Special licence household SAR
– Individual CAM
– Household CAM
• Longitudinal Studies

Criteria:
• Do you want smaller than local authority geography?
• Local authority
• Individual flexibility
• Individual detail
• Household members linked to each other
• Ability to follow individual over time

[Comparison table: each file rated against each criterion on a scale from "Less" through "Good" and "V.G" to "Excellent"; the individual cell values are not recoverable here.]

SARs Use

• Carers by local authority area (Buckner and Yeandle)

• Care homes in 1991/2001 (Banks et al 2006)

• High Levels of Ill health in Scotland exist once key socio-economic factors are controlled (Popham 2005)

Dahlberg et al. 2007

SARs Access

www.ccsr.ac.uk/sars

• Registration for End User Licence files
– Academic (online registration using federated access)
– Non-academic, form on site. No charge for not-for-profit. Name up to 10 users

• Approved Research files
– Start at the SARs site; you should be routed to the right place
– Approved Researcher legal status

Health Survey for England

• Executed by the Joint Health Surveys Unit of NatCen and UCL

• C. 2.5k accesses in 2009 – but nearly all academic

• Design varies to reflect topic:
– 2004 ethnic minority groups
– 2005 older people
– 2006 cardiovascular disease
– 2007 knowledge and attitudes
– 2008 physical activity and fitness

• Data collected directly from 13+ year olds

• Regional/SHA geography

• Published reports most recently from NHS Information Centre

HSE 08: Physical Activity & Fitness

More interesting design, requires weighting

For main interview and nurse visit if consent obtained

• Main sample (max 2 households) 64% response rate.
– Main sample 15k adults and 3.5k children (0-15)
– Child boost 73% response rate n=4.5k
– Nurse visit for 10k adults and 2.4k children

• Sub-sample of c. 1/3 of sampling points
– Up to 1 adult (+ 1 child) per household
– asked to wear accelerometer
– perform step test during nurse visit
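Because this design requires weighting, estimates should be computed with the weight variable supplied in the data file rather than as simple counts. The following is a minimal sketch of the difference, assuming invented records and hypothetical field names (`active`, `wt_int`); it is not the HSE's own weighting procedure, and the real weight variables are described in the survey documentation.

```python
# Illustrative records only: the values and the field names
# "active" and "wt_int" are invented for this sketch.
records = [
    {"active": 1, "wt_int": 0.6},
    {"active": 0, "wt_int": 1.4},
    {"active": 1, "wt_int": 0.9},
    {"active": 0, "wt_int": 1.1},
    {"active": 1, "wt_int": 0.5},
]


def unweighted_proportion(rows, var):
    """Share of respondents with var == 1, counting every case equally."""
    return sum(r[var] for r in rows) / len(rows)


def weighted_proportion(rows, var, weight):
    """Share with var == 1, each case counting in proportion to its weight."""
    total_weight = sum(r[weight] for r in rows)
    return sum(r[var] * r[weight] for r in rows) / total_weight


unweighted = unweighted_proportion(records, "active")
weighted = weighted_proportion(records, "active", "wt_int")
print(f"Unweighted: {unweighted:.1%}, weighted: {weighted:.1%}")
```

In this toy example the active respondents carry smaller weights, so the weighted estimate is lower than the unweighted one. Which way the correction goes in real data depends on who is over- or under-represented in the achieved sample.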

Scottish Health Survey

• 95, 98, 03/04 (08-11 in progress)

• Closely modelled on HSE

• Sample large enough to distinguish 7 regions

• 2003 focus on CVD: saliva, urine and ECG measures taken

• Health Board Geography

Welsh Health Survey

• Quite different!

• 1995 (not available)

• 1998 self completion (n=30k)
– WHA geography

• 2003 onwards – continuous survey
– household interview and self completion
– in 2007, n=14k adults and 3k children
– general health, specific illness, NHS use, subjective lifestyle
– Regional geography

General Household Survey (GHS), aka General Lifestyle Survey (GLF)

• multi-purpose continuous survey since 1971

• face to face interviews

• all adults in 13,000 addresses annually (72% response)

• regular health-related lifestyle topics
– smoking - prevalence, consumption, dependency
– drinking - amount & frequency

• occasional lifestyle topics
– e.g. sport & leisure

• extensive socio-economic data
– household composition, education, employment, housing tenure

• Change to panel design in 2005

Not the only data!

• Other surveys in ESDS Government contain health data, as we’ve seen, e.g.
– Omnibus (aka Opinions)
– Labour Force Survey
– British Social Attitudes...

What you can do with the data – practical uses and examples

• Looking at change over time

• Look at sub-populations

• Using the flexibility of the data to look at alternative definitions

[Figure: Smoking and social class - men. Percentage (%) by year, 1994 to 2001, for all men, social classes I & II, and social classes IV & V. Source: HSE; Marmot, M (2003)]

Using successive cross-sectional data over time

Pros…

• Reasonable amount of comparability

• You can pool years/quarters to look at periods

• Data is representative at each time point

• Good at looking at impacts on groups (not individuals)

Cons…

• Limits to continuity in the data (e.g. ethnicity, SOC)

• Cannot establish individual change

Survey data is subject to sampling error!

Example: Pregnancy and Employment

• Using 1998-99 General Household Survey data alone there are only 168 pregnant women aged 16-49

• 95% Confidence interval for % pregnant women economically inactive: 34.2 – 49.1%

• Combined 3 years’ data to obtain sample of 465 pregnant women

• Confidence interval using 3 years’ data: 34.9 – 43.9%
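The narrowing of the interval when years are pooled can be illustrated with the usual normal-approximation confidence interval for a proportion. This is a sketch under stated assumptions: the slide gives only the intervals, so the point estimates below are taken to be the interval midpoints, and the calculation treats the data as a simple random sample (no design effect). The function name `proportion_ci` is made up for this example.

```python
import math


def proportion_ci(p, n, z=1.96):
    """Normal-approximation confidence interval for a proportion p from n cases."""
    se = math.sqrt(p * (1 - p) / n)
    return p - z * se, p + z * se


# ASSUMPTION: point estimates are the midpoints of the published intervals.
p_one_year = (0.342 + 0.491) / 2      # 168 pregnant women, one year of data
p_three_years = (0.349 + 0.439) / 2   # 465 pregnant women, three years pooled

low1, high1 = proportion_ci(p_one_year, 168)
low3, high3 = proportion_ci(p_three_years, 465)

print(f"One year:    {low1:.1%} to {high1:.1%} (width {high1 - low1:.1%})")
print(f"Three years: {low3:.1%} to {high3:.1%} (width {high3 - low3:.1%})")
```

Under these assumptions the one-year interval comes out very close to the published figures, and the pooled interval is close to, though not identical with, the published one. The general point holds regardless: the width shrinks roughly with the square root of the sample size, so nearly tripling the number of cases cuts the width by about 40%.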

Small populations are often too small

What are ‘hours worked’?

• Is it just paid work? Or unpaid as well?
• Hours usually worked, or actually worked last week?
• In main job, or in any job?
• What about students?
• Overtime – paid?
• Overtime – unpaid?
• Lunch hours?
• Do non-workers work zero hours or should they be excluded?
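The questions above matter because each answer gives a different figure from the same records. A minimal sketch of that, assuming invented respondents and hypothetical field names (`usual_paid`, `paid_overtime`, `unpaid_overtime`, `in_work`) that do not come from any real survey file:

```python
# Hypothetical respondent records; the field names are invented for this sketch.
respondents = [
    {"id": 1, "usual_paid": 37.5, "paid_overtime": 2.0, "unpaid_overtime": 3.0, "in_work": True},
    {"id": 2, "usual_paid": 16.0, "paid_overtime": 0.0, "unpaid_overtime": 0.0, "in_work": True},
    {"id": 3, "usual_paid": 0.0, "paid_overtime": 0.0, "unpaid_overtime": 0.0, "in_work": False},
]


def mean_hours(rows, *, paid_overtime=False, unpaid_overtime=False, non_workers=False):
    """Mean weekly hours under one chosen definition of 'hours worked'.

    paid_overtime / unpaid_overtime: add those hours to usual paid hours.
    non_workers: if True, people not in work count as zero hours;
                 if False, they are excluded from the average.
    """
    hours = []
    for r in rows:
        if not r["in_work"] and not non_workers:
            continue
        total = r["usual_paid"]
        if paid_overtime:
            total += r["paid_overtime"]
        if unpaid_overtime:
            total += r["unpaid_overtime"]
        hours.append(total)
    return sum(hours) / len(hours)


narrow = mean_hours(respondents)
broad = mean_hours(respondents, paid_overtime=True, unpaid_overtime=True)
with_zero = mean_hours(respondents, non_workers=True)
print(f"Usual paid hours, workers only: {narrow:.2f}")
print(f"Including all overtime:         {broad:.2f}")
print(f"Non-workers counted as zero:    {with_zero:.2f}")
```

Because microdata keep the individual records, an analyst can build whichever definition suits the research question, which is not possible with published tables that fix one definition in advance.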

Using the flexibility of the data to look at alternative definitions

Limitations

• Cannot attempt to identify respondents – so not possible to do direct matching under End User Licence
– matching on known area is OK
– probabilistic approaches OK

• PCT geography not available
– region lowest level on most data
– some LAD geography

• Not public data
– Must register

Accessing Data through the ESDS

• All users can access study descriptions, online documentation, including questionnaires, free of charge without registering with ESDS

• In order to access the datasets you need to register with ESDS
– Register online using your UK Federation username and password
– Simple online form, takes about 10 minutes
– You need to register a usage of the data as part of this process
– Non-commercial users: free of charge
– Commercial users: £500 charge per study, and will need to apply for a UK Federation username and password
– You need to agree to the End User Licence when you register

Longitudinal

• Census longitudinal studies:
– LS
– SLS
– NILS
– See www.census.ac.uk

• ESDS longitudinal studies e.g.:
– cohort studies NCDS, BCS70, MCS
– British Household Panel Survey
– English Longitudinal Study of Ageing
– See www.esds.ac.uk/longitudinal

and more...

• Drug use, Smoking and Drinking among Young People in England
• Psychiatric Morbidity Surveys
• National Survey of NHS Patients
• Adult Dental Health
• Infant Feeding Survey
• National Diet & Nutrition Survey
• National Survey of Sexual Attitudes & Lifestyles (NATSAL)
• Active People Survey
• Taking Part Survey
• ...

What data is available for my topic?

• Census.ac.uk has a search portal for data

• SARs team about to publish a health topic guide

• ESDS Government www.esds.ac.uk/government
– health theme guide
– survey pages
– survey finder
– omnibus module pages
– variables search
– publications and registered usage pages
– Nesstar

• ESDS Data Catalogue

• ESDS theme page on health: http://www.esds.ac.uk/themes/health/

• Survey Question Bank (Survey Resources Network)
– http://surveynet.ac.uk/sqb/
– http://surveynet.ac.uk/srn/introduction.asp

Finding out about the data

Unless...

• you can track your variable back to the question(s) asked on the questionnaire,
• know who the questions were asked of
• and what was done with the raw data to turn it into the final data
• and what you must do to be able to use it

You don’t understand the data!

Documentation

• Questionnaires
• Codebooks
• Information about derived variables
• Technical information on sampling, weighting etc.

Future developments

• Census 2011
– Range of new questions on identity and language
– Less on socio-economic characteristics
– Northern Ireland and Scotland: information on specific disabilities

• ONS Integrated Household Survey
– GHS, Labour Force Survey, Expenditure & Food Survey, English Housing Survey, Omnibus Surveys, Life Opportunities Survey
– Approx 500k individuals in c. 250k households
– Core questionnaire
  • Economic Activity
  • Education
  • Health & Disability
  • Identity
  • Income
– PCT level estimates?
– Outputs expected 2010
– http://www.esds.ac.uk/government/cps/

• IC Survey Consultation - a new Health and Social Care Survey
– Fieldwork Jan 2011, reporting late 2012
– Join IC’s Health Surveys Programme Network
– [email protected]
– https://www.wave.ic.nhs.uk/Services/ICHSPN/default.aspx

Help and Support

• SARs
– www.ccsr.ac.uk/sars
– [email protected]
– Annual User Meeting

• ESDS Government
– www.esds.ac.uk/government
– [email protected]
– GHS/GLF and Health Surveys Annual User Meetings