ALSPAC Record Linkage to External Databases. Andy Boyd, ALSPAC, Social Medicine, University of Bristol

Page 1: ALSPAC Record Linkage to External Databases

ALSPAC Record Linkage to External Databases

Andy Boyd

ALSPAC, Social Medicine

University of Bristol

Page 2: ALSPAC Record Linkage to External Databases

The data sources and processes involved

• The processes involved in linkage projects

• Overview of ALSPAC’s existing data linkage projects

• National Pupil DB & Geographic linkage as examples

• Data Availability & Linkage Problems

Page 3: ALSPAC Record Linkage to External Databases

Processes involved in linkage projects

• Find the contact

• Ethics – informed consent and/or Section 60 support

• Data Security

• HM Revenue & Customs

• Creating a linkage data set

• Data QC checks

• Identifiers

• Formats and data ‘normalisation’
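
To illustrate the 'normalisation' step, here is a minimal Python sketch that brings names, postcodes and dates of birth into one consistent format before two data sets are compared. The slides do not say which rules ALSPAC applies, so the function names (`normalise_name`, `normalise_postcode`, `normalise_dob`) and the list of accepted date formats are assumptions made purely for illustration.

```python
import re
import unicodedata
from datetime import date, datetime


def normalise_name(name: str) -> str:
    """Upper-case a name, strip accents and punctuation, collapse spaces."""
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    spaced = ascii_only.replace("-", " ")          # Smith-Jones -> Smith Jones
    letters = re.sub(r"[^A-Za-z ]", "", spaced)    # O'Brien -> OBrien
    return " ".join(letters.upper().split())


def normalise_postcode(postcode: str) -> str:
    """Return a UK postcode as 'OUTWARD INWARD' (inward part is 3 characters)."""
    compact = re.sub(r"\s+", "", postcode).upper()
    if len(compact) < 5:
        return compact  # too short to split; left for manual checking
    return f"{compact[:-3]} {compact[-3:]}"


def normalise_dob(text: str) -> date:
    """Parse a date of birth written in one of a few assumed formats."""
    for fmt in ("%d/%m/%Y", "%Y-%m-%d", "%d-%m-%Y"):
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"Unrecognised date format: {text!r}")
```

Applying the same functions to both sides of a linkage means that differences in case, spacing or date layout do not show up as false mismatches.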

Page 4: ALSPAC Record Linkage to External Databases

Processes involved in linkage projects cont…

• Who links the data?

– One of the two parties or an independent 3rd party

• Processing the data

– Anonymity vs sufficient data for research

– Ages in months & years

– First half of postcode

– Recode unusual outcomes into wider categories
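
The three anonymisation steps listed above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the `min_count` threshold of 5 and the "Other" label are hypothetical and are not taken from ALSPAC's procedures.

```python
from collections import Counter
from datetime import date


def age_in_months(dob: date, event_date: date) -> int:
    """Completed months between a date of birth and an event date."""
    months = (event_date.year - dob.year) * 12 + (event_date.month - dob.month)
    if event_date.day < dob.day:
        months -= 1  # the current month is not yet complete
    return months


def postcode_first_half(postcode: str) -> str:
    """Keep only the outward part of a UK postcode (e.g. 'BS8 1TZ' -> 'BS8')."""
    compact = "".join(postcode.split()).upper()
    return compact[:-3] if len(compact) >= 5 else compact


def recode_rare(values, min_count=5, other_label="Other"):
    """Replace categories seen fewer than min_count times with a wider label."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other_label for v in values]
```

Each function discards detail that could identify an individual (exact dates, full postcode, rare outcomes) while keeping a coarser value that is still usable for research.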

Page 5: ALSPAC Record Linkage to External Databases

Major External Databases

Health related datasets

• Office for National Statistics (ONS) Tracing

– Cancer Registry & GRO

• NSTS (NHS Strategic Tracing Service)

• Electronic antenatal & birth records

• PCT data (Exeter DB, My Quest)*

Non health Datasets

• National Pupil Database (DCSF, DIUS*, UCAS*)

• ALSPAC Schools Collection

• G.I.S Datasets (Geographic Information Systems)

• DWP*

• Home Office*

* Linkage currently being investigated

Page 6: ALSPAC Record Linkage to External Databases

National Pupil Database

• Maintained by Dept. for Children, Schools & Families (DCSF)

• Covers all state maintained schools in England

• Census, formerly annual, now at 3 time points

• Data at school and pupil level

• Key data include:

– Exam results

– Attendance

– Pupil demographics (including address, ethnicity, Free School Meals, Special Educational Needs)

– School characteristics (pupil numbers, staff pupil ratios)

Page 7: ALSPAC Record Linkage to External Databases

NPD – How we did it

• 3rd party conducted match – The Fischer Trust – independent charity

• Provided data on the eligible cohort

• ALSPAC & DCSF provided the following linkage variables:

– Surname, Forename, Familiar name

– Date of Birth, Gender

– Postcode, Previous Postcode & Postcode accuracy flag

– Current School (from ALSPAC data collection)
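
As a rough illustration of how linkage variables like these can be combined, the Python sketch below performs a simple deterministic match. The algorithm actually used by the third party is not described in these slides, so everything here is an assumption: the field names (`cohort_id`, `external_id`, `familiar_name`, `previous_postcode`), the exact-match rule on surname, date of birth and gender, and the use of postcodes only as a tie-breaker.

```python
from collections import defaultdict


def link_key(rec):
    """Blocking key: surname, date of birth and gender must agree exactly."""
    return (rec["surname"].upper(), rec["dob"], rec["gender"].upper())


def first_names(rec):
    """Forename and familiar name, either of which may match."""
    names = {rec.get("forename", ""), rec.get("familiar_name", "")}
    return {n.upper() for n in names if n}


def postcodes(rec):
    """Current and previous postcode, compared without spaces."""
    codes = {rec.get("postcode", ""), rec.get("previous_postcode", "")}
    return {c.replace(" ", "").upper() for c in codes if c}


def link(cohort, external):
    """Return {cohort_id: external_id} for unambiguous matches only."""
    index = defaultdict(list)
    for ext in external:
        index[link_key(ext)].append(ext)

    links = {}
    for person in cohort:
        candidates = [
            e for e in index.get(link_key(person), [])
            if first_names(person) & first_names(e)
        ]
        if len(candidates) > 1:
            # Tie-break on any shared current or previous postcode.
            candidates = [e for e in candidates if postcodes(person) & postcodes(e)]
        if len(candidates) == 1:
            links[person["cohort_id"]] = candidates[0]["external_id"]
    return links
```

Cases left with zero or several candidates stay unlinked in this sketch; in practice they would need clerical review or a probabilistic method.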

Page 8: ALSPAC Record Linkage to External Databases

NPD - Details

• ALSPAC Cohort covers three academic years

• We hold data on all young people across these three years – approx. 600,000 cases a year

• Figures based on eligible cohort:

– 17,671 linked (86%)

• Majority of unlinked cases thought to be in private education (will be in NPD from KS4)

Page 9: ALSPAC Record Linkage to External Databases

NPD - Advantages

• Covers all English state schools

• Good match rate for eligible cohort

• Regular updates

• Access to ‘confidential’ variables

• PLUG workshops provide good opportunities to discuss data and solutions to problems

Page 10: ALSPAC Record Linkage to External Databases

NPD - Problems

• Central ID QC issues (a few duplicates)

• Only covers English state maintained schools until KS4, then re-link – extra costs and bias until then

• Data collection methods/standards vary from school to school

• Documentation (lack of)

• Size of raw data, time consuming to process

• Fixed time point census, doesn’t record all school movements (especially annual census)

Page 11: ALSPAC Record Linkage to External Databases

G.I.S Data

• Spatial data held at many geographic levels

• Geographies range in scale from 0.1 metres to regional/national data

• Tied together via postcode or grid reference as central ID

• Key data include:

– NSPD (was All Fields Postcode Directory) - geo linking database

– Deprivation & Socio Economic indices (IMD, Townsend, Acorn)

– Census data

Page 12: ALSPAC Record Linkage to External Databases

G.I.S – How we link cases to data

• Master file of Postcodes

• Postcodes linked to grid reference

• Grid references of various scales

• PCs/GridRef mapped to:

– Electoral geographies

– Census geographies

• Ethics:

– We don’t generally identify residence at PC or equivalent level

Figure: Ordnance Survey – The National Grid
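
The postcode-to-geography chain described on this slide amounts to a keyed lookup. The Python sketch below assumes a hypothetical master file held as a dictionary from postcode to grid reference and area codes; the field names (`easting`, `northing`, `ward`, `census_area`) are placeholders, not the real NSPD column names.

```python
def attach_geography(cases, postcode_lookup):
    """Map each case's postcode to broader geographies via a master file.

    The output deliberately omits the postcode and grid reference, so only
    the broader area codes travel with the case.
    """
    linked = []
    for case in cases:
        key = case["postcode"].replace(" ", "").upper()
        geo = postcode_lookup.get(key)  # None when the postcode is not in the file
        linked.append({
            "case_id": case["case_id"],
            "ward": geo["ward"] if geo else None,
            "census_area": geo["census_area"] if geo else None,
        })
    return linked
```

Cases whose postcode is missing from the master file come back with empty geography fields, which makes gaps in the address data easy to count.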

Page 13: ALSPAC Record Linkage to External Databases

G.I.S - Details

• 50,000 ALSPAC address points, each associated with a date range which can then be linked to ALSPAC data collection

• Linkage examples:

– Indices of multiple deprivation

– Travel from home to school patterns

– Cancer rates and residential distance from power lines

Figure: The geographic relation between household income and polluting factories – FoE 1999
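
Linking an address point to a data collection by date range can be sketched as an interval lookup. In this Python example the record layout (`start`, `end`, with `end=None` meaning "still current") is an assumption for illustration and does not reflect how ALSPAC stores its address history.

```python
from datetime import date


def address_at(address_history, when):
    """Return the address spell that covers the date `when`, or None.

    address_history: list of dicts with 'start' and 'end' dates;
    an 'end' of None is treated as an open-ended (current) address.
    """
    for spell in address_history:
        end = spell["end"] or date.max
        if spell["start"] <= when <= end:
            return spell
    return None  # the date falls in a gap in the address record
```

Given the completion date of a questionnaire or clinic visit, the function returns the address in force on that day, so area-level measures are attached to the right residence.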

Page 14: ALSPAC Record Linkage to External Databases

G.I.S advantages

• Many data sets in public domain (or available through ‘athens’)

• Many geographies are broad enough to not identify cohort members

• National picture (some exclude Scotland)

Page 15: ALSPAC Record Linkage to External Databases

G.I.S Problems

• Shifting geographies across time points

• Royal Mail change postcodes

• Postcode not precise enough in some cases

• Postcode boundaries are not coterminous with other geographic boundaries

Page 16: ALSPAC Record Linkage to External Databases

Accuracy issues with analysis at postcode level

Figure: address level vs postcode level

Page 19: ALSPAC Record Linkage to External Databases

Data Availability & Linkage Problems

• Cohort Data

• GIS Data

• GIS Ethics

Page 20: ALSPAC Record Linkage to External Databases

Linkage problems with the cohort data

• Missing data

– Especially problematic for the cases who didn’t enrol in the original recruitment

– Partners

– 69 cases with no known birth outcome

– Gaps in the address data

• However…

– ONS matched 99.7% of mothers, so we have their old & new NHS numbers and cleaned data (original recruitment cases only)

Page 21: ALSPAC Record Linkage to External Databases

Linkage problems we encounter

• Many of the early records are paper based or in varied formats

• Quality Control – ONS data returned to us with 37 incorrect ALSPAC IDs

• Unknown methods – No documentation from ONS or Fischer regarding the quality of the match

• Lack of uniqueness in the ID (either duplicates or multiple IDs per case)
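
Both kinds of ID problem named in the last bullet can be detected with a simple check on a returned linkage file. The Python sketch below is illustrative only; the field names `external_id` and `cohort_id` are assumptions.

```python
from collections import defaultdict


def id_problems(records, id_field="external_id", case_field="cohort_id"):
    """Find external IDs shared by several cases and cases with several IDs."""
    cases_by_id = defaultdict(set)
    ids_by_case = defaultdict(set)
    for rec in records:
        cases_by_id[rec[id_field]].add(rec[case_field])
        ids_by_case[rec[case_field]].add(rec[id_field])

    # One external ID attached to more than one cohort case (duplicates).
    shared_ids = {i: sorted(c) for i, c in cases_by_id.items() if len(c) > 1}
    # One cohort case carrying more than one external ID.
    multi_id_cases = {c: sorted(i) for c, i in ids_by_case.items() if len(i) > 1}
    return shared_ids, multi_id_cases
```

Running a check like this when a file comes back gives a list of cases to query with the data provider before any analysis data set is built.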

Page 22: ALSPAC Record Linkage to External Databases

GIS Data Availability

• Collected as administrative resource

• Not yet cleaned, documented and presented to usual ALSPAC standards

• Initiatives under way to validate and fill gaps in record

• Schools GIS data in the main not processed

• Aim to build into standard ALSPAC resource

Page 23: ALSPAC Record Linkage to External Databases

GIS Ethics

• Postcode level or greater accuracy treated as a personal identifier

• Research proposals to use these data need ALSPAC Law & Ethics Approval

• Broader geographical data can be released in normal manner

• A two-stage process is used to collect and process precise data

Page 24: ALSPAC Record Linkage to External Databases

GIS Ethics

Step 1 – Postcodes (or full address) provided to researcher with unique collection ID with no other data attached

Step 2 – Researcher attaches their data and returns file to ALSPAC

Step 3 – ID converted to the appropriate collaborator ID, postcode data removed

Step 4 – Requested ALSPAC data added to the file and data sent to the researcher
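
The four steps above can be sketched in Python to show how the ID swap keeps postcodes and cohort data apart. This is a minimal illustration, not ALSPAC's implementation: the function names, the use of random hex tokens as collection IDs and the dictionary layouts are all assumptions.

```python
import secrets


def step1_release_postcodes(cohort_postcodes):
    """Step 1: give each postcode a one-off collection ID, nothing else attached.

    cohort_postcodes: {internal_id: postcode}.
    Returns the file for the researcher and a key map that never leaves the study.
    """
    key_map, release = {}, []
    for internal_id, postcode in cohort_postcodes.items():
        collection_id = secrets.token_hex(8)
        key_map[collection_id] = internal_id
        release.append({"collection_id": collection_id, "postcode": postcode})
    return release, key_map


def step2_attach(release, researcher_data_by_postcode):
    """Step 2 (researcher side): attach external data to each postcode row."""
    return [dict(row, **researcher_data_by_postcode.get(row["postcode"], {}))
            for row in release]


def step3_and_4(returned, key_map, collaborator_ids, cohort_data):
    """Steps 3 and 4: swap in the collaborator ID, drop the postcode,
    then add the requested cohort variables."""
    final = []
    for row in returned:
        internal_id = key_map[row["collection_id"]]
        clean = {k: v for k, v in row.items()
                 if k not in ("collection_id", "postcode")}
        clean["collaborator_id"] = collaborator_ids[internal_id]
        clean.update(cohort_data.get(internal_id, {}))
        final.append(clean)
    return final
```

At no point does the researcher hold a file containing both a postcode and cohort data, because the only link between the two is the key map kept by the study.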

Page 25: ALSPAC Record Linkage to External Databases

Andy Boyd