The Welsh Health Survey in the SAIL databank

Mark Atkinson, Jonathan Kennedy and Sinead Brophy

College of Medicine, Swansea University

Health Surveys User Conference, 10 July 2015



Contents

• SAIL databank
• Reasons for linking WHS data
• Linkage to WHS data
• Looking at a few example variables
  o BMI
  o Smoking
  o Exercise
  o Health Conditions

SAIL - Split file approach to anonymisation

Datasets in SAIL

• We have a large variety of routine data:
  • GP (70 % coverage)
  • Hospital inpatient
  • Hospital outpatient
  • A&E
  • Education (National Pupil Database)
  • National Survey for Wales

Reasons for linking WHS data

1. To compare with a dataset with similar variables.
   a) Missing data in GP database
   b) Look at the WHS data which is more complete

2. To supplement the WHS data with additional information.

Comparison between datasets

             WHS                GP data
Time         Single point       Multiple points
Coverage     Nearly complete    Incomplete
Questions    Identical          Variable

WHS data Linkage

                        WHS
GP             Not present   Present      Total
Not present    10 (3%)       175 (51%)    185 (54%)
Present        11 (2%)       145 (43%)    156 (46%)
Total          21 (6%)       320 (94%)    341

BMI – data in both datasets

BMI

Green line – gradient = 1

Red line, actual gradient of the data = 0.886

WHS value 5 % smaller than corresponding GP value
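The gradient quoted for the red line is the slope of a straight line fitted to paired values. A minimal sketch of that calculation, assuming an ordinary least-squares fit of the WHS value on the GP value (the slides do not say which fitting method was used). The function name `ols_fit` and the paired values are made up for illustration and are not SAIL data.

```python
def ols_fit(x, y):
    """Return (gradient, intercept) of the least-squares line y = gradient * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    gradient = sxy / sxx
    return gradient, mean_y - gradient * mean_x


# Hypothetical paired BMI values (not taken from the survey or GP records).
gp_bmi = [20.0, 25.0, 30.0, 35.0]
whs_bmi = [19.5, 24.0, 28.0, 33.0]

gradient, intercept = ols_fit(gp_bmi, whs_bmi)
```

A gradient below 1, as on the slide, means the fitted WHS value rises by less than one unit for each unit of the GP value; the green line (gradient = 1) is what perfect agreement would look like.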

BMI - Standardised histogram of WHS data

[Figure: histogram; x-axis: BMI, y-axis: standardised frequency]

Weights

Green line – gradient = 1

Red line, actual gradient of the data = 0.879

WHS value 4.4 % smaller than corresponding GP value

Alcohol and Diet

• Alcohol
  – 79 % of those with GP data have GP data on alcohol consumption under the 136.. Read code header

• Diet
  – 43 % of those with GP data have GP data on diet under any of the three Read code headers
    • 1F% Dietary history
    • 13A% Diet – patient initiated
    • 13B% Diet - medical
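Checking whether a GP record has data "under a header" amounts to a prefix match on the Read code. A minimal sketch, assuming that trailing dots and the `%` wildcard in a header both mean "any code below this point in the hierarchy". The function names and the example code strings passed to them are illustrative only; no clinical meaning is claimed for them.

```python
def under_header(read_code, header):
    """True if a Read code sits under a header such as '136..' or '13A%'."""
    # Trailing '.' padding and the '%' wildcard are stripped to get the prefix.
    prefix = header.rstrip(".%")
    return read_code.startswith(prefix)


# The three diet headers listed on the slide.
DIET_HEADERS = ["1F%", "13A%", "13B%"]


def has_diet_code(read_codes):
    """True if any code in a patient's record falls under a diet header."""
    return any(under_header(code, h) for code in read_codes for h in DIET_HEADERS)


# Illustrative strings only.
print(under_header("1366.", "136.."))     # under the alcohol header
print(has_diet_code(["1383.", "13A1."]))  # second string matches 13A%
```

The match is case-sensitive on purpose, since Read codes distinguish upper and lower case characters.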

Exercise

Comparison of the variables

Read Codes
• 1383. = Enjoys light exercise

Exercise

• 62 % of those with GP data have GP data on exercise under the 138 Read code header

                 WHS
GP               No exercise   Light   Medium   Heavy
No exercise      6             <5      <5       <5
Light            11            29      24       9
Medium           5             11      32       19
Heavy            <5            <5      <5       8
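One way to summarise the table above is the share of people placed in the same exercise category by both sources. Because small counts appear only as "<5", that share can only be bounded. A rough sketch, assuming each suppressed cell holds between 0 and 4 people (the slides do not state the suppression rule) and that no diagonal cell is suppressed, which holds for this table. The function name `agreement_bounds` is hypothetical.

```python
# Cells copied from the table above; None marks a suppressed "<5" cell.
# Rows are GP categories, columns are WHS categories, both in the order
# No exercise, Light, Medium, Heavy.
table = [
    [6, None, None, None],
    [11, 29, 24, 9],
    [5, 11, 32, 19],
    [None, None, None, 8],
]


def agreement_bounds(table, max_suppressed=4):
    """Return (lowest, highest) possible share of people on the diagonal."""
    agree = sum(table[i][i] for i in range(len(table)))
    known = sum(c for row in table for c in row if c is not None)
    hidden = sum(1 for row in table for c in row if c is None)
    # Lowest share: every suppressed cell is as full as allowed.
    lowest = agree / (known + hidden * max_suppressed)
    # Highest share: every suppressed cell is empty.
    highest = agree / known
    return lowest, highest


lowest, highest = agreement_bounds(table)
print(f"agreement between {lowest:.0%} and {highest:.0%}")
```

Under these assumptions the two sources agree for somewhere between roughly 42 % and 49 % of people, so fewer than half are placed in the same category.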

Two types of missingness in GP data

1. Some have no relevant codes
2. Some have ambiguous codes which carry no information

1st type: in WHS, with GP data but no 138 codes

WHS
No exercise   16 (14 %)
Light         20 (18 %)
Medium        46 (41 %)
Heavy         31 (27 %)

2nd type: where GP exercise status is unknown

WHS
No exercise   9 (21 %)
Light         14 (33 %)
Medium        15 (35 %)
Heavy         5 (12 %)
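The percentages for the two kinds of missingness can be reproduced from the counts, which also makes the groups easy to compare. A small sketch using the counts from the two tables above; the helper names `percentages` and `medium_or_heavy_share` are made up for illustration.

```python
def percentages(counts):
    """Whole-number percentage of the group total for each WHS category."""
    total = sum(counts.values())
    return {level: round(100 * n / total) for level, n in counts.items()}


def medium_or_heavy_share(counts):
    """Fraction of the group reporting medium or heavy exercise in WHS."""
    return (counts["Medium"] + counts["Heavy"]) / sum(counts.values())


# 1st type: GP data present but no 138 codes.
no_138_codes = {"No exercise": 16, "Light": 20, "Medium": 46, "Heavy": 31}
# 2nd type: a 138 code is present but exercise status is unknown.
status_unknown = {"No exercise": 9, "Light": 14, "Medium": 15, "Heavy": 5}

print(percentages(no_138_codes))
print(percentages(status_unknown))
print(medium_or_heavy_share(no_138_codes), medium_or_heavy_share(status_unknown))
```

On these counts about 68 % of the first group report medium or heavy exercise in WHS, against about 47 % of the second, so the two kinds of missing GP data do not describe the same mix of people.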

Smoking

• 96 % of those with GP data have GP data on smoking under the 137 Read code header

• Note that people who report as never smokers in WHS may have evidence in the GP data of being ex-smokers

                   WHS
GP                 missing   never   ex    smoker
No information     <5        9       <5    5
never              <5        65      5     <5
ex                 <5        55      67    <5
smoker             <5        12      19    41
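The note about never smokers can be quantified from the "never" column of the table above, which has no suppressed cells. A minimal sketch using those four counts; the variable names are illustrative.

```python
# GP smoking status of people who report "never" in WHS
# (the WHS "never" column of the table above).
whs_never_by_gp = {"No information": 9, "never": 65, "ex": 55, "smoker": 12}

total_whs_never = sum(whs_never_by_gp.values())
shares = {status: n / total_whs_never for status, n in whs_never_by_gp.items()}

for status, share in shares.items():
    print(f"GP {status}: {share:.0%}")
```

Of the 141 people who report never smoking in WHS, 55 (about 39 %) are recorded as ex-smokers in the GP data and 12 (about 9 %) as smokers.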

Health conditions

• Health conditions are notoriously under-reported in questionnaires

• The use of linked GP and hospital episode diagnoses will supplement WHS data
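Supplementing the survey in this way can be pictured as taking, for each respondent, the union of self-reported conditions and recorded diagnoses. A minimal sketch under loud assumptions: the identifiers, condition names and the `supplement` function are all hypothetical, and real linkage in SAIL works on anonymised linking fields rather than the toy keys used here.

```python
def supplement(whs, gp, hospital):
    """For each WHS respondent, merge self-reported and recorded conditions."""
    return {
        person: reported | gp.get(person, set()) | hospital.get(person, set())
        for person, reported in whs.items()
    }


# Hypothetical records keyed by a made-up anonymised identifier.
whs = {"A001": {"asthma"}, "A002": set()}
gp = {"A001": {"asthma", "hypertension"}, "A002": {"diabetes"}}
hospital = {"A002": {"diabetes"}}

combined = supplement(whs, gp, hospital)
print(combined)
```

In this toy example respondent A002 reports no conditions in the survey but gains one from the linked records, which is the kind of under-reporting the slide refers to.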

Conclusions

• WHS data can aid the understanding of GP data because of greater completeness and standardisation

• GP data can complement WHS data because of the temporal dimension it can bring

• GP and hospital episode diagnoses can complement and enrich WHS data