Privacy and Publication: challenges and opportunities for clinical data
Varsha Khodiyar, PhD
Data Curation Editor, Scientific Data
Nature Publishing Group
[email protected]
@varsha_khodiyar
@scientificdata
Big Data Opportunities Using the NDA, 17th October 2015
Reporting bias impacts human health
Oseltamivir: “only...effective...for the prevention and treatment of symptoms of influenza”
Cochrane Database Syst Rev. 2012 DOI: 10.1002/14651858.CD008965.pub3
Reboxetine: “overall an ineffective and potentially harmful antidepressant”
BMJ 2010;341:c4737
Statins: “beneficial effect…on atrial fibrillation...is not supported by a comprehensive review of published and unpublished evidence”
BMJ 2011;342:d1250
Withholding data impacts human health
Increasing support for data transparency
• Funder/institution policy and mandates [1]
• Regulatory agencies (EMA)
• Legislation (FDAAA)
• Non-governmental/academic (IOM, YODA)
• Industry (CSDR)
• Journals and ICMJE [2]
1. Hahnel, Mark (2015): Global funders who require data archiving as a condition of grants. figshare. http://dx.doi.org/10.6084/m9.figshare.1281141
2. http://www.icmje.org/news-and-editorials/principles_data_sharing_jan2014.html
Publishers/journals and data access
• More reliable evidence – and papers
• Journal mission/goals
• Help community derive maximum benefit from research
• Content innovation (facilitate more use and reuse)
• Reliability (peer review)
• Discoverability and visibility (bibliographic databases)
• Linking and licensing content (open access)
• Permanence (content and links)
• Credit/incentives (article types and citations)
• Encouraging and implementing good practice and policies
Journal data policies
• Willingness to share stated (Annals of Internal Medicine)
• Data sharing implied by submission (BioMed Central*)
• Data sharing implied as a condition of publication (Nature*)
• Mandated data sharing with statement in paper (PLOS, BMJ for clinical trials)
• Mandated data sharing with statement and link to data (non-medical journals e.g. ecology, animal genomics)
• Mandated open data as a condition of submission (Scientific Data, GigaScience, F1000Research)
*Minimum requirement – some disciplines/journals may mandate
STRONGER
1. Vines, T. H. et al. Mandated data archiving greatly improves access to research data. FASEB J. (2013). doi:10.1096/fj.12-218164
Data sharing via supplementary files
Sandercock et al: The International Stroke Trial database. Trials 2011, 12:101 doi:10.1186/1745-6215-12-101
Data sharing via repository links
Role of data journals/articles
• Data peer review
• Outlet for ‘unpublishable’ data
• Data discoverability
• Data reusability
• Permanence of datasets
• Robust links with repositories
• Credit/reward data generators
• “Intelligently open data”
Scientific Data
Scientific Data peer review
Peer review focuses on:
• Completeness (can others reproduce?)
• Consistency (were community standards followed?)
• Integrity (are data in the best repository?)
• Experimental rigour and technical quality (were the methods sound?)
Does not focus on:
• Perceived impact/importance
• Size/complexity of data
An example Data Descriptor
Human readable representation of study, i.e. article (HTML & PDF)
Machine readable representation of study, i.e. metadata
Scientific Data structured metadata
In-house curation team:
• assists users to submit the structured content via simple templates and an internal authoring tool
• performs value-added semantic annotation of the experimental metadata
analysis method script
Data file or record in a database
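As a rough illustration of what a machine readable representation of a study could look like, the sketch below builds a small record linking an article to its data and serialises it as JSON. The field names and the `build_record` helper are hypothetical and are not Scientific Data's actual metadata schema; the two URLs are the ones quoted later in this talk.

```python
import json


def build_record(article_url, data_url, access, methods):
    """Return a minimal study description (hypothetical field names)."""
    if access not in {"open", "restricted"}:
        raise ValueError(f"unknown access level: {access}")
    return {
        "article": article_url,  # human readable representation
        "dataset": {"location": data_url, "access": access},
        "methods": list(methods),  # free-text labels, placeholder only
    }


record = build_record(
    article_url="http://www.nature.com/articles/sdata201531",
    data_url="http://dx.doi.org/10.7910/DVN/25833",
    access="restricted",
    methods=["placeholder method label"],
)

# Serialise so that software, not only people, can read the description.
print(json.dumps(record, indent=2))
```

A record like this can describe a dataset openly even when the data themselves sit behind an access request.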
Data on (reasonable) request - issues
• Meta-analysis fails to launch when <40% IPD available – unanswered requests and refusal to share
Systematic Reviews 2014, 3:97 doi:10.1186/2046-4053-3-97
• Poor availability of psychological research data (only 64/249 datasets available)
American Psychologist 2006, 61(7) doi:10.1037/0003-066X.61.7.726
• Data received from 1/10 authors publishing in PLOS Medicine and PLOS Clinical Trials
PLoS ONE 2009, 4(9): doi:10.1371/journal.pone.0007078
• 38% of 394 requested datasets received from APA journal authors
Collabra 2015, 1(1): doi:10.1525/collabra.13
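To put the last three figures on a common scale, the short sketch below converts them to percentages. It uses only the numbers quoted on this slide; the dictionary labels are shorthand introduced here, and the calculation is illustrative rather than a reanalysis of the cited studies.

```python
from fractions import Fraction

# Shares of requested datasets actually obtained, as quoted on the slide.
reported = {
    "American Psychologist 2006": Fraction(64, 249),
    "PLoS ONE 2009": Fraction(1, 10),
    "Collabra 2015": Fraction(38, 100),
}


def as_percent(share):
    """Convert an exact fraction to a percentage rounded to one decimal."""
    return round(float(share * 100), 1)


rates = {study: as_percent(share) for study, share in reported.items()}

for study, pct in rates.items():
    print(f"{study}: {pct}% of requested datasets obtained")
```

On these figures, every study obtained well under half of the datasets it asked for.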
Clinical researchers support sharing
Rathi V, Dzara K, Gross CP, Hrynaszkiewicz I, Joffe S, Krumholz HM, Strait KM, Ross JS: Sharing of clinical trial data among trialists: a cross sectional survey. BMJ 2012;345:e7570
• Sharing de-identified data via repositories should be required (236 respondents, 74%)
• Investigators should share de-identified data on request (229 respondents, 72%)
What are researchers’ concerns?
Reproduced from: Rathi V, Dzara K, Gross CP, Hrynaszkiewicz I, Joffe S, Krumholz HM, Strait KM, Ross JS: Sharing of clinical trial data among trialists: a cross sectional survey. BMJ 2012;345:e7570
Better ways to share on request
Yale Open Data Access (YODA) & Clinical Study Data Request (CSDR) projects:
• Data Use Agreements (DUAs)
• Controlled access environment
• Scientific validity of reanalysis checked
• Independent governance
• Data anonymisation checks
http://yoda.yale.edu/
https://www.clinicalstudydatarequest.com/
Better way to publish data on request
• Sensitive data repositories (e.g. UKDA): permanence, curation, persistent identifiers, versioning
• Data-on-request services (e.g. YODA): independent governance, scientific review and transparency of access requests, DUAs
• Journals/publishers: peer review, visibility, credit/citations, robust links
A robust data-on-request workflow?
Hrynaszkiewicz, I., Khodiyar, V., Hufton, A. & Sansone, S. A. Publishing descriptions of non-public clinical datasets: guidance for researchers, repositories, editors and funding organisations. BioRxiv http://dx.doi.org/10.1101/021667 (2015).
Open access Data Descriptor
http://www.nature.com/articles/sdata201531
Linked to restricted access data
http://dx.doi.org/10.7910/DVN/25833
All approved repositories: http://www.nature.com/sdata/data-policies/repositories
Key recommendations
• Clinical researchers: Prepare to share on request, with short embargoes
• Repositories: Develop mechanisms to host clinical data non-publicly and manage access requests; collaborate with journals
• Editors and publishers: Check policy compliance for every submission and facilitate peer reviewer access to data; collaborate with repositories
• Sponsors and funders: Partner with trusted repositories and ensure that data access requests are proportionately reviewed without introducing unnecessary barriers
Repositories for non-public data should
• Provide stable identifiers for metadata records
• Allow access to data with the minimum of restrictions, codified in DUAs
• Ideally be independent of the study sponsors
• Have a transparent and persistent system for requesting access to data and reviewing requests to access data
• Allow access to data in a timely manner
• Ensure long-term preservation of data in their non-public form
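The criteria above could be treated as a simple checklist when assessing a candidate repository. The sketch below is one hypothetical way to encode them; the `RepositoryProfile` class, its field names and the example values are invented for illustration and do not describe any real repository or an official assessment tool.

```python
from dataclasses import dataclass, fields


@dataclass
class RepositoryProfile:
    """One flag per criterion listed on the slide (names are hypothetical)."""
    stable_identifiers: bool
    minimal_restrictions_in_dua: bool
    independent_of_sponsors: bool  # the slide marks this one as "ideally"
    transparent_request_system: bool
    timely_access: bool
    long_term_preservation: bool


def unmet_criteria(profile):
    """List the names of the criteria the repository does not yet meet."""
    return [f.name for f in fields(profile) if not getattr(profile, f.name)]


# Made-up example: a sponsor-run repository that meets everything else.
example = RepositoryProfile(
    stable_identifiers=True,
    minimal_restrictions_in_dua=True,
    independent_of_sponsors=False,
    transparent_request_system=True,
    timely_access=True,
    long_term_preservation=True,
)

print(unmet_criteria(example))
```

Returning the unmet items, rather than a single pass or fail, keeps the "ideally" criterion visible without forcing it to be a hard requirement.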
Visit nature.com/sdata
Email [email protected]
Tweet @ScientificData
Honorary Academic Editor: Susanna-Assunta Sansone
Managing Editor: Andrew L. Hufton
Data Curation Editor: Varsha K. Khodiyar
Advisory Panel and Editorial Board including senior researchers, funders, librarians and curators