Big Data for Security and Resilience
Challenges and Opportunities
John Parkinson
Ethics and Politics in Big Data
Health Data
CPRD
Clinical Practice Research Datalink
Medicines, Devices, CT, Regulator
Biologics, Standards & Controls
More dimensions to data
PHE
NHSE
HSCIC
CPRD Research
Surveillance
BI
Unlocking the potential of NHS Patient Data in Research
Observational & Interventional
OTHER
The Power of the 54 million
in the NHS… with 25 years of coded data
Big data is high volume, high velocity, and/or high variety information
assets that require new forms of processing to enable enhanced
decision making, insight discovery and process optimization
Data Mining (hypothesis generation)
Hypothesis Strengthening
Hypothesis Testing
BIG, Truthful Results
BIG, Truthful, Powerful Results
Ethics, Linkage, Security
SCIENCE
Risk-based approach
Big Health Data Cycle
Transform Data
Instance it
Listen, Hear, Deliver
N of 1 From the BIG
NHS Datasets
Social Care Data
Local Data
Disease Reg.
5
NHS Audits 40
Granularity
100%
1° Care, 8500 Practices, 8 LRNs %
Very detailed coded record
2° Care
HES + Rx
100% ICD10, OPCS4, Drugs
Lab Path Microb.
Death, Birth, Demography
100%
National central Data - ONS
Environ. Air Poll.
Club Card Data
?
Linkage on NHS# and/or PC, sex, DOB
Approved Researchers
Clinical Trials
Biosample, PROs
Consented
100% PHE Cancer, BINOCAR
Genetic data
Digital path. data
future
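The linkage rule on the datasets slide (NHS number, and/or postcode, sex and date of birth) can be sketched as a two-step deterministic match. This is a minimal illustration, not CPRD's actual linkage procedure: the `Person` type, the `link` function, the field names and the sample records are all hypothetical, and the NHS numbers are synthetic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Person:
    """Hypothetical minimal record; real datasets carry many more fields."""
    record_id: str
    nhs_number: Optional[str]
    postcode: str
    sex: str
    dob: str  # ISO date, e.g. "1950-03-02"


def demographic_key(p: Person) -> tuple:
    """Fallback key: normalised postcode, sex and date of birth."""
    return (p.postcode.replace(" ", "").upper(), p.sex.upper(), p.dob)


def link(left: list, right: list) -> list:
    """Match on NHS number first; otherwise on a unique demographic key."""
    by_nhs = {p.nhs_number: p for p in right if p.nhs_number}
    by_demo = {}
    for p in right:
        by_demo.setdefault(demographic_key(p), []).append(p)

    matches = []
    for p in left:
        if p.nhs_number and p.nhs_number in by_nhs:
            matches.append((p.record_id, by_nhs[p.nhs_number].record_id, "nhs_number"))
            continue
        # Drop candidates whose NHS number is present but different.
        candidates = [
            c for c in by_demo.get(demographic_key(p), [])
            if not (p.nhs_number and c.nhs_number and p.nhs_number != c.nhs_number)
        ]
        if len(candidates) == 1:  # ambiguous demographic matches are left unlinked
            matches.append((p.record_id, candidates[0].record_id, "postcode+sex+dob"))
    return matches


# Synthetic example records (not real patients).
gp = [
    Person("gp-1", "9000000001", "AB1 2CD", "F", "1950-03-02"),
    Person("gp-2", None, "ef3 4gh", "m", "1947-11-30"),
    Person("gp-3", None, "ZZ9 9ZZ", "F", "1980-01-01"),
]
hes = [
    Person("hes-1", "9000000001", "AB1 2CD", "F", "1950-03-02"),
    Person("hes-2", "9000000002", "EF34GH", "M", "1947-11-30"),
]
links = link(gp, hes)
```

Here `gp-1` links on NHS number, `gp-2` links on the demographic fallback, and `gp-3` stays unlinked. Leaving ambiguous fallback matches unlinked is a design choice of this sketch; a production system would more likely score or review them.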
Triple linkage - MI - n = 17,964
BMJ paper, May 21st 2013
3188
CPRD 13380, 74.5%
MINAP 9438, 52.5%
HES 11998, 67.9%
3532
5561
1806
1290
1099
1488
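A quick arithmetic check of the triple-linkage figures, as a sketch: it assumes the seven standalone numbers on the slide are the seven regions of the three-way Venn diagram (with the fused "12901099" read as 1290 and 1099), and it recomputes each source's share of the 17,964 total. The variable and function names are illustrative only.

```python
TOTAL = 17_964  # MI records in the triple linkage, as printed on the slide

# Per-source counts as printed on the slide.
source_counts = {"CPRD": 13_380, "MINAP": 9_438, "HES": 11_998}

# Assumed to be the seven Venn regions; which region is which is not stated.
venn_regions = [3188, 3532, 5561, 1806, 1290, 1099, 1488]


def share(count: int, total: int) -> float:
    """Percentage of the total, rounded to one decimal place."""
    return round(100 * count / total, 1)


regions_total = sum(venn_regions)
shares = {name: share(count, TOTAL) for name, count in source_counts.items()}

for name, pct in shares.items():
    print(f"{name}: {source_counts[name]} of {TOTAL} = {pct}%")
```

Under that reading the seven regions add up to exactly 17,964, and the CPRD and MINAP shares reproduce the printed 74.5% and 52.5%. The printed HES count gives about 66.8% rather than the printed 67.9%, so one of the two HES figures may have been transcribed imprecisely.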
GPRD - Clinical Record
Liveflag  Audit seq  Oper. ID  Sys date  Sys time  Clintype  ClinID  Patient ID  Consultation ID  Topic  Category  Staff ID  Event date  End date  In practice  Private  Medcode  Data1  Data2
2         4732314    000p      20050404  142154    000s      0?Oe    01aE        3qDi             1      SED002    000m      20050404              Y            N        246..00  70     148
1         4732337    000p      20050404  143202    000s      0?Oe    01aE        3qDi             1      SED002    000m      20050404              Y            N        246..00  70     148
1         4732613    000@      20050404  151722    000s      0?Of    013O        3qEn             1      SED002    000A      20050404              Y            N        246..00  105    200
1         4732660    000@      20050404  154024    000s      0?Og    00Zz        3qEv             1      SED002    000A      20050404              Y            N        246..00  90     174
1         4733221    000@      20050404  173149    000s      0?Oh    03EG        3qGF             1      SED002    000A      20050404              Y            N        246..00  71     146
1         4733312    000@      20050404  181118    000s      0?Oi    00eL        3qGP             1      SED002    000A      20050404              Y            N        246..00  95     146
1         4734735    000A      20050405  85938     000s      0?Oj    02gL        3qIS             1      SED002    000@      20050405              Y            N        246..00  80     140
1         4734759    000@      20050405  90422     000s      0?Ok    01a@        3qIP             1      SED002    000A      20050405              Y            N        246..00  78     122
1         4734762    000m      20050405  90434     000s      0?Ol    00ov        3qIU             1      SED002    000j      20050405              Y            N        246..00  85     155
1         4734820    000A      20050405  90905     000s      0?Om    02NI        3qIZ             1      SED002    000@      20050405              Y            N        246..00  90     160
1         4734829    000N      20050405  90948     000s      0?On    02wM        3qIV             1      SED002    000B      20040730              Y            N        246..00  61     123
1         4734842    000m      20050405  91143     000s      0?Oo    02XG        3qIb             1      SED002    000j      20050405              Y            N        246..00  100    155
1         4734844    000m      20050405  91158     000s      0?Op    02XG        3qIb             1      SED002    000j      20050404              Y            N        246..00  102    153
1         4734849    000m      20050405  91222     000s      0?Oq    02XG        3qIb             1      SED002    000j      20050404              Y            N        246..00  108    156
2         4734930    000Q      20050405  92235     000s      0?Or    01JH        3qIj             1      SED002    000B      20050405              Y            N        246..11  78     140
1         4734932    000m      20050405  92313     000s      0?Os    00Uu        3qIi             1      SED002    000j      20050405              Y            N        246..00  75     118
1         4734955    000N      20050405  92711     000s      0?Ot    01wo        3qIn             1      SED002    000@      20041013              Y            N        246..00  76     140
1         4734960    000Q      20050405  92745     000s      0?Or    01JH        3qIj             1      SED002    000B      20050405              Y            N        246..11  78     140
1         4735043    000h      20050405  94227     000s      0?Ou    01bm        3qIu             1      SED002    000e      20050405              Y            N        246..00  70     120
1         4735087    000m      20050405  94500     000s      0?Ov    039n        3qJ0             1      SED002    000j      20050405              Y            N        246..00  80     130
1         4735131    000@      20050405  95000     000s      0?Ow    001R        3qJ3             1      SED002    000A      20050405              Y            N        246..00  81     124
1         4735142    000h      20050405  95232     000s      0?Ox    01cH        3qJ7             1      SED002    000e      20050405              Y            N        246..00  60     110
1         4735224    000N      20050405  101052    000s      0?Oy    00Ny        3qJ9             3      SED002    000B      20050315              Y            N        246..00  80     130
1         4735297    000m      20050405  101935    000s      0?Oz    01wo        3qJL             1      SED002    000j      20050405              Y            N        246..00  85     160
1         4735316    000h      20050405  102047    000s      0?P0    01lN        3qJK             1      SED002    000e      20050405              Y            N        246..00  70     110
1         4735346    000N      20050405  102343    000s      0?P1    005p        3qJM             3      SED002    000A      20050307              Y            N        246..00  66     109
1         4735352    000N      20050405  102429    000s      0?P2    005p        3qJM             3      SED002    000A      20050307              Y            N        246..00  64     105
1         4735353    000Q      20050405  102434    000s      0?P3    01SU        3qJR             1      SED002    000A      20050405              Y            N        246..11  80     130
1         4735357    000A      20050405  102446    000s      0?P4    00Ul        3qJT             1      SED002    000@      20050405              Y            N        246..00  80     140
2         4735400    000@      20050405  102954    000s      0?P5    02kD        3qJU             1      SED002    000A      20050405              Y            N        246..00  103    147
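To show how a coded clinical record of this shape can be summarised, here is a minimal sketch that parses three of the records above and averages the two trailing values. It rests on several assumptions not stated on the slide: that each record has 18 whitespace-separated values with End date blank, that the leading digit of each record is the Liveflag, and that Data1 and Data2 hold the pair of blood-pressure readings for Medcode 246..00. The column and function names are illustrative.

```python
from statistics import mean

# Assumed column order, with the (blank) End date column omitted.
COLUMNS = [
    "liveflag", "audit_seq", "oper_id", "sys_date", "sys_time", "clintype",
    "clin_id", "patient_id", "consultation_id", "topic", "category",
    "staff_id", "event_date", "in_practice", "private", "medcode",
    "data1", "data2",
]

# Three records copied from the slide, one per line.
RAW = """\
2 4732314 000p 20050404 142154 000s 0?Oe 01aE 3qDi 1 SED002 000m 20050404 Y N 246..00 70 148
1 4732613 000@ 20050404 151722 000s 0?Of 013O 3qEn 1 SED002 000A 20050404 Y N 246..00 105 200
1 4732660 000@ 20050404 154024 000s 0?Og 00Zz 3qEv 1 SED002 000A 20050404 Y N 246..00 90 174
"""


def parse_records(raw: str) -> list:
    """Split each line on whitespace and pair the values with COLUMNS."""
    records = []
    for line in raw.splitlines():
        values = line.split()
        if len(values) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} values, got {len(values)}")
        records.append(dict(zip(COLUMNS, values)))
    return records


def mean_readings(records: list, medcode_prefix: str = "246") -> tuple:
    """Mean of Data1 and Data2 over records whose Medcode starts with the prefix."""
    selected = [r for r in records if r["medcode"].startswith(medcode_prefix)]
    return (
        mean(int(r["data1"]) for r in selected),
        mean(int(r["data2"]) for r in selected),
    )


records = parse_records(RAW)
mean_data1, mean_data2 = mean_readings(records)
```

Filtering on the Medcode prefix before averaging keeps unrelated clinical events out of the summary, which matters once the extract contains more than one kind of measurement.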
100 million “active years” of data (all England: one billion). 100 billion is the total number of people ever born/lived.
Medical codes: 107,000. Product codes: 59,000. 3 Terabytes of data, × 6 at any one time; 180 TB in total; for the UK it would be 2 Petabytes.
Clinical Trials
Phase 3 to Real World
N
Practice Filters: Post Code, LRN, CLRN, AHSN (Options, Modify)
CPRD Trialaviz showing where the patients are by “large population” regions
The CPRD Trialaviz viewer shows, for a random series of patients who meet set criteria, their record in two views: across a number of years, and then magnified to a daily basis. In this case a further magnification of 1° therapy events shows what was prescribed. Any event can be hovered over to understand the journey of care.
Population
The Power of the 54 million
with real time
fast query capability
auto run capability
algorithmic capability
Mean Blood Pressure
Mean LDL
Fever/ microbiology
Mumps
Thanks