2015 PhUSE SDE, Copenhagen, 10 June 2015
Jean-Marc Ferran, Consultant & Owner
• Thanks to Sarah Nolan (University of Liverpool) and Khaled El Emam (Privacy Analytics) for their input
DeID Standards
Risk
Data Utility
RWD Studies
MD5
Pharma Employees
CROs
Researchers (Portal)
Researchers (Data is sent)
Public (Web)
Legal Framework
Technical Framework & Controls
Data De-Identification
Data De-Identification
Processes
Quasi/Direct Identifiers
Assessment
Rules
Residual Risk
PatientID  DoB        Age  Gender  Race   Country        PartnerAge
1          12APR1963  51   Male    White  Canada         48
2          28MAY1974  40   Male    Asian  France         41
3          06MAY1961  53   Male    White  United States  36
4          28MAY1954  60   Female  Black  Spain          65
5          14JUL1969  45   Male    Black  Brazil         41
6          13AUG1964  50   Female  White  Argentina      45
7          18MAR1961  53   Male    White  United States  48
8          22JAN1961  53   Male    White  United States  37
9          27SEP1924  90   Male    White  Canada         73
10         07FEB1956  58   Male    White  Canada         62
PatientID  Age Category  Age  Gender  Race   Country        PartnerAge
1          <89           51   Male    White  Canada
2          <89           40   Male    Asian  France
3          <89           53   Male    White  United States
4          <89           60   Female  Black  Spain
5          <89           45   Male    Black  Brazil
6          <89           50   Female  White  Argentina
7          <89           53   Male    White  United States
8          <89           53   Male    White  United States
9          ≥89           .    Male    White  Canada
10         <89           58   Male    White  Canada
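The change between the original table and the one with the Age Category column can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `first_pass`, the constant `AGE_CAP`, and the dictionary layout are ours, not part of the presentation. The sketch drops DoB and PartnerAge, flags ages of 89 and over, and blanks the exact age for those patients (shown as "." on the slide).

```python
AGE_CAP = 89  # threshold shown on the slide; the constant name is an assumption


def first_pass(record):
    """Return a copy of `record` without DoB/PartnerAge and with age top-coded."""
    over_cap = record["Age"] >= AGE_CAP
    return {
        "PatientID": record["PatientID"],
        "AgeCategory": "≥89" if over_cap else "<89",
        # Exact age is withheld for patients at or above the cap.
        "Age": None if over_cap else record["Age"],
        "Gender": record["Gender"],
        "Race": record["Race"],
        "Country": record["Country"],
    }


# Two rows taken from the example table (patients 1 and 9).
patients = [
    {"PatientID": 1, "DoB": "12APR1963", "Age": 51, "Gender": "Male",
     "Race": "White", "Country": "Canada", "PartnerAge": 48},
    {"PatientID": 9, "DoB": "27SEP1924", "Age": 90, "Gender": "Male",
     "Race": "White", "Country": "Canada", "PartnerAge": 73},
]

deidentified = [first_pass(p) for p in patients]
for row in deidentified:
    print(row)
```

Building a new dictionary from an explicit list of retained fields, rather than deleting fields from the input, means any column not named here is dropped by default.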
PatientID  Age Category 2  Age  Gender  Race   Continent      PartnerAge
1          50-59                Male    White  North America
2          40-49                Male    Asian  Europe
3          50-59                Male    White  North America
4          60-69                Female  Black  Europe
5          40-49                Male    Black  South America
6          50-59                Female  White  South America
7          50-59                Male    White  North America
8          50-59                Male    White  North America
9          ≥89                  Male    White  North America
10         50-59                Male    White  North America
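The further generalisation in the table above (ten-year age bands, country replaced by continent) might look like the following in Python. It is a minimal sketch, not the presenter's implementation: `age_band`, `generalise`, and the `CONTINENT` lookup are hypothetical names, and the lookup covers only the six countries in the example.

```python
# Hypothetical lookup limited to the countries that appear in the example.
CONTINENT = {
    "Canada": "North America",
    "United States": "North America",
    "France": "Europe",
    "Spain": "Europe",
    "Brazil": "South America",
    "Argentina": "South America",
}


def age_band(age, cap=89, width=10):
    """Map an exact age to a band such as '50-59'; ages >= cap share one band."""
    if age >= cap:
        return f"≥{cap}"
    low = (age // width) * width
    high = min(low + width - 1, cap - 1)  # keep the last band below the cap
    return f"{low}-{high}"


def generalise(patient_id, age, gender, race, country):
    """Replace exact age and country with coarser categories."""
    return {
        "PatientID": patient_id,
        "AgeCategory2": age_band(age),
        "Gender": gender,
        "Race": race,
        "Continent": CONTINENT[country],
    }


# Rows 1, 4 and 9 from the example table.
rows = [
    generalise(1, 51, "Male", "White", "Canada"),
    generalise(4, 60, "Female", "Black", "Spain"),
    generalise(9, 90, "Male", "White", "Canada"),
]
for row in rows:
    print(row)
```

A country missing from the lookup raises `KeyError`, which is deliberate here: silently passing an unmapped country through would leave a more specific value in the output than intended.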
PatientID  DoB  Age  Gender  Race  Country  PartnerAge
1
2
3
4
5
6
7
8
9
10
PatientID  DoB        Age  Gender  Race   Country        PartnerAge
1          12APR1963  51   Male    White  Canada         48
2          28MAY1974  40   Male    Asian  France         41
3          06MAY1961  53   Male    White  United States  36
4          28MAY1954  60   Female  Black  Spain          65
5          14JUL1969  45   Male    Black  Brazil         41
6          13AUG1964  50   Female  White  Argentina      45
7          18MAR1961  53   Male    White  United States  48
8          22JAN1961  53   Male    White  United States  37
9          27SEP1924  90   Male    White  Canada         73
10         07FEB1956  58   Male    White  Canada         62
Size 1: 100.0%
Patients having same characteristics for important quasi identifiers
PatientID  Age Category  Age  Gender  Race   Country        PartnerAge
1          <89           51   Male    White  Canada
2          <89           40   Male    Asian  France
3          <89           53   Male    White  United States
4          <89           60   Female  Black  Spain
5          <89           45   Male    Black  Brazil
6          <89           50   Female  White  Argentina
7          <89           53   Male    White  United States
8          <89           53   Male    White  United States
9          ≥89           .    Male    White  Canada
10         <89           58   Male    White  Canada
Size 3: 33.3%
Patients having same characteristics for important quasi identifiers
PatientID  Age Category 2  Age  Gender  Race   Continent      PartnerAge
1          50-59                Male    White  North America
2          40-49                Male    Asian  Europe
3          50-59                Male    White  North America
4          60-69                Female  Black  Europe
5          40-49                Male    Black  South America
6          50-59                Female  White  South America
7          50-59                Male    White  North America
8          50-59                Male    White  North America
9          ≥89                  Male    White  North America
10         50-59                Male    White  North America
Size 5: 20.0%
Patients having same characteristics for important quasi identifiers
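$$\operatorname{Average}_{\text{Patients}}\left(\frac{1}{\operatorname{Size}(\text{EquivalenceClass}[\text{Patient}])}\right)$$

$$\max_{i}\left(\frac{1}{\operatorname{Size}(\text{EquivalenceClass}[i])}\right)$$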
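The two measures above can be computed directly from the example tables. The sketch below is illustrative only: it assumes that an equivalence class is the set of patients sharing identical values on every quasi-identifier listed, and the name `risk_summary` and the tuple layout are ours. It reports the average of 1/size over patients, the maximum of 1/size over classes, and the size of the largest class.

```python
from collections import Counter


def risk_summary(rows):
    """rows: one tuple of quasi-identifier values per patient."""
    class_sizes = Counter(rows)  # equivalence class -> number of patients
    per_patient = [1 / class_sizes[row] for row in rows]
    return {
        "average": sum(per_patient) / len(per_patient),
        "maximum": max(per_patient),  # driven by the smallest class
        "largest_class": max(class_sizes.values()),
    }


# (Age, Gender, Race, Country) after the age >= 89 rule; None marks a withheld age.
by_country = [
    (51, "Male", "White", "Canada"), (40, "Male", "Asian", "France"),
    (53, "Male", "White", "United States"), (60, "Female", "Black", "Spain"),
    (45, "Male", "Black", "Brazil"), (50, "Female", "White", "Argentina"),
    (53, "Male", "White", "United States"), (53, "Male", "White", "United States"),
    (None, "Male", "White", "Canada"), (58, "Male", "White", "Canada"),
]

# (Age band, Gender, Race, Continent) after further generalisation.
by_continent = [
    ("50-59", "Male", "White", "North America"), ("40-49", "Male", "Asian", "Europe"),
    ("50-59", "Male", "White", "North America"), ("60-69", "Female", "Black", "Europe"),
    ("40-49", "Male", "Black", "South America"), ("50-59", "Female", "White", "South America"),
    ("50-59", "Male", "White", "North America"), ("50-59", "Male", "White", "North America"),
    ("≥89", "Male", "White", "North America"), ("50-59", "Male", "White", "North America"),
]

country_risk = risk_summary(by_country)
continent_risk = risk_summary(by_continent)
print(country_risk)
print(continent_risk)
```

On these ten rows the largest classes have 3 and 5 patients, matching the "Size 3: 33.3%" and "Size 5: 20.0%" figures, and the average falls from about 0.8 to about 0.6. The maximum stays at 1 in both cases because some patients remain alone in their class.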
Hrynaszkiewicz et al., BMJ 2010: Fewer than 3 quasi identifiers
Disease Population in Geographical Location: Prob = 1/XXXXX
All Similar Clinical Trials: Prob = 1/XXX
All Similar Sponsor Clinical Trials: Prob = 1/XX
Clinical Trial: Prob = 1/X
Proactive: Outside a Request
Use Company/Industry Guidelines
Compare to SAP
Good common sense…
Reactive: Based on a Request
Use Company/Industry Guidelines
Focus on what is needed
Negotiate with researcher
Minimum Data Utility
Quasi/Direct Identifiers
Data Rules
Risk
5th Clinical Trial Data Transparency Forum Heidelberg, Germany, 23rd April
Clinical Study Data Request.com and the SAS Data Access System:
An Academic Researcher’s Experience
Sarah J. Nolan, Department of Biostatistics
University of Liverpool
Academic Context
• October 2011: 3-year research project for the Cochrane Epilepsy Group, funded by the National Institute for Health Research (NIHR): “Clinical and cost effectiveness of interventions for epilepsy in the NHS”
o Cochrane Individual Participant Data (IPD) Network Meta Analysis (10 drugs)
o Carbamazepine (CBZ), Phenytoin (PHT), Valproate (VPA), Phenobarbitone (PB), Oxcarbazepine (OXC), Lamotrigine (LTG), Gabapentin (GBP), Topiramate (TPM), Levetiracetam (LEV), Zonisamide (ZNS)
Network Meta Analysis: Data Requesting
o 29 existing studies (n=5881)
– 12 Academic studies (n=1383)
– 13 Pharmaceutical studies (n=3320)
– 4 Government studies (n=1178)
o Data provided from 18 studies (n=4697, 80%)
– 2 Academic studies (n=286, 21%)
– 13 Pharmaceutical studies (n=3320, 100%)
– 3 Government studies (n=1091, 93%)
Network Meta-Analysis
SAS Multi Sponsor Data Access Environment
Standard operating procedure for the project
1) Perform a detailed check of the data for content
• Go back to data providers if anything is missing
2) Consistency check against the publication
• Check inconsistencies with data providers
3) Prepare analysis variables for the network meta-analysis outcomes
Completed in approximately 5 working days
• Real World Data is any data from external “Real World” sources
– Insurance claims databases
– Electronic medical/health records
– Social media feeds
– Web trawling of online documentation
– Biosensor device data
– Mobile App data
– Genomic/Proteomic/Xxxxxx-omic data
– Publicly available environmental data
– Marketing survey data
Source: PhUSE German SDE 2014 – Rob Walls, Roche
• Many De-Identification Standards Available
• Increasing awareness around Risk Assessment
• Data Utility is key and must be assessed properly, including reproduction of results
• Increase of Public Data and Real World Data Studies will require data de-identification to evolve
Jean-Marc Ferran, Consultant & Owner, Qualiance ApS
dk.linkedin.com/in/jeanmarcferran/
Twitter: @Qualiance
• [1] Clinical Trial Transparency Regulatory Landscape – Ben Rotz – Eli Lilly and Company – Clinical Trial Data Transparency Forum – 11 February 2014
• [2] EMA Guidelines 0070 (Draft) – June 2013 – http://www.ema.europa.eu/docs/en_GB/document_library/Other/2013/06/WC500144730.pdf
• [3] Hrynaszkiewicz I, Norton M L, et al. Preparing raw clinical data for publication: guidance for journal editors, authors, and peer reviewers. British Medical Journal 2010; 340:304–307 – http://www.bmj.com/content/340/bmj.c181
• [4] The Twelve Characteristics of an Anonymization Methodology, Khaled El Emam – http://www.privacyanalytics.ca/wp-content/uploads/2013/07/TwelveCharacteristics.pdf
• [5] A De-identification Strategy Used for Sharing One Data Provider’s Oncology Trials Data through the Project Data Sphere Repository, Malin, 2013 – https://www.google.com/url?q=https://www.projectdatasphere.org/projectdatasphere/html/resources/PDF/DEIDENTIFICATION&sa=U&ei=D69iU47DBYaN4ASf4YGgBw&ved=0CBsQFjAA&usg=AFQjCNGaWTa9-cXwUpP9q6UfE9FBjPV4vw
• [6] Preparing individual patient data from clinical trials for sharing: the GlaxoSmithKline approach – Pharmaceutical Statistics 2014