NPG Scientific Data - Metabolomics Society meeting, Tsuruoka, Japan, 2014

Page 1

Data Consultant, Honorary Academic Editor

Associate Director, Principal Investigator

Helping you publish, discover and reuse research data

Susanna-Assunta Sansone, PhD

Page 2

Credit to: https://projects.ac/blog/five-top-reasons-to-protect-your-data-and-practise-safe-science/

Page 3

Journal publishing: the changing landscape

•  Human Genome, 2001: 62 pages, 150 authors, 49 figures, 27 tables

•  ENCODE Project, 2012: 30 papers, 3 journals

Page 4

http://www.flickr.com/photos/andrevanbortel/3745527869/sizes/m/in/photostream/

Several data-related activities at NPG

•  Figure source data

•  Extended data

•  Data citation

•  Code review

•  Reproducibility checklist

•  Linked Data release - CC0

….and…..

Page 5

Helping you publish, discover and reuse research data

Visit nature.com/scientificdata

Email [email protected]

Tweet @ScientificData

Supported by:

•  Honorary Academic Editor: Susanna-Assunta Sansone, PhD

•  Managing Editor: Andrew L Hufton, PhD

•  Editorial Curator: Victoria Newman

•  Advisory Panel and Editorial Board, including senior researchers, funders, librarians and curators

Page 6

•  Introduction
   o  concepts and principles
   o  working with repositories

•  Data Descriptor
   o  structured experimental metadata

•  Content
   o  examples

Page 7

Launched on May 27th, 2014

Credit for sharing your data

Focused on reuse and reproducibility

Peer reviewed, curated

Promoting Community Data Repositories

Open Access

A new online-only publication for descriptions of scientifically valuable datasets in the life, environmental and biomedical sciences, but not limited to these

By the Rat Genome Sequencing & Mapping Consortium

Page 8

Data Descriptor: narrative and structure

•  Article or "narrative component" (PDF and HTML)

•  Experimental metadata or "structured component" (in-house curated, machine-readable formats)

Page 9

Data Descriptor: narrative

Sections:

•  Title
•  Abstract
•  Background & Summary
•  Methods
•  Technical Validation
•  Data Records
•  Usage Notes
•  Figures & Tables
•  References
•  Data Citations

Focus on data reuse

•  Detailed descriptions of the methods and technical analyses supporting the quality of the measurements.
•  Does not contain tests of new scientific hypotheses.

In traditional publications this information is not provided in a sufficiently detailed manner. However, this information is essential for understanding, reusing, and reproducing datasets.

Page 10

Data Descriptor: narrative

Joint Declaration of Data Citation Principles by the Data Citation Synthesis Group, including:

-  CODATA
-  Research Data Alliance (RDA)
-  Force11

Page 11

Data Descriptor: structure (CC0)

In-house curation team:

•  assists users to submit the structured content via simple templates and an internal authoring tool

•  performs value-added semantic annotation of the experimental metadata

For advanced users/service providers willing to export ISA-Tab for direct submission, we will release a technical specification.

Diagram labels: analysis, method, script; data file or record in a database
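
As a rough illustration of what a tab-delimited, ISA-Tab-style export might look like, the sketch below writes and re-reads a small study table in Python. The column headers follow general ISA-Tab naming (Source Name, Characteristics[...], Protocol REF, Sample Name). The row values and function names are invented for illustration, and this is not the Scientific Data technical specification.

```python
import csv
import io

# Column headers in the general ISA-Tab study-table style.
# The values in ROWS are hypothetical and only illustrate the layout.
HEADERS = [
    "Source Name",
    "Characteristics[organism]",
    "Protocol REF",
    "Sample Name",
]

ROWS = [
    ["source1", "Homo sapiens", "sample collection", "sample1"],
    ["source2", "Homo sapiens", "sample collection", "sample2"],
]


def write_study_table(headers, rows):
    """Serialise the header and rows as tab-delimited text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(headers)
    writer.writerows(rows)
    return buffer.getvalue()


def read_study_table(text):
    """Parse tab-delimited text back into one dict per row."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return list(reader)


study_text = write_study_table(HEADERS, ROWS)
records = read_study_table(study_text)
print(study_text)
```

A database or authoring tool could produce the same kind of table from its own records. A real submission would also need the investigation and assay files that ISA-Tab defines.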

Page 12

Making data discoverable

•  Export to various formats (ISA-Tab, RDF, etc.)

•  Linking between research papers, Data Descriptors, and data records

Page 13

Helping authors find the right place for the data

•  We currently recognize over 50 public data repositories, and provide advice on the best place for authors to archive their data

•  We have integrated systems with both:

[Chart: “Omics” is emphasized among basic life-sciences repositories. Categories: DNA and protein sequence; Functional genomics; Genetic association and genome variation; Metagenomics; Molecular interactions; Organism- or disease-specific; Proteomics; Taxonomy and species diversity; Traces and sequencing reads]

Page 14

Relation with traditional articles - content and time

Scientific hypotheses:

•  Synthesis
•  Analysis
•  Conclusions

Methods and technical analyses supporting the quality of the measurements:

•  What did I do to generate the data?
•  How was the data processed?
•  Where is the data?
•  Who did what when?

•  BEFORE: get your data to the community as soon as possible (see NPG pre-publication policy)

•  AT THE SAME TIME: publish your Data Descriptor(s) alongside research article(s)

•  AFTER: expand on your research articles, adding further information for reuse of the data

Page 15

Peer review process focused on quality and reuse

Evaluation is not based on the perceived impact or novelty of the findings

•  Experimental Rigour and Technical Data Quality
   o  Were the data produced in a rigorous and methodologically sound manner?
   o  Was the technical quality of the data supported convincingly with technical validation experiments and statistical analyses of data quality or error, as needed?
   o  Are the depth, coverage, size, and/or completeness of these data sufficient for the types of applications or research questions outlined by the authors?

•  Completeness of the Description
   o  Are the methods and any data-processing steps described in sufficient detail to allow others to reproduce these steps?
   o  Did the authors provide all the information needed for others to reuse this dataset or integrate it with other data?
   o  Is this Data Descriptor, in combination with any repository metadata, consistent with relevant minimum information or reporting standards?

•  Integrity of the Data Files and Repository Record
   o  Have you confirmed that the data files deposited by the authors are complete and match the descriptions in the Data Descriptor?
   o  Have these data files been deposited in the most appropriate available data repository?
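
The file-integrity question in these review criteria (are the deposited data files complete, and do they match the descriptions in the Data Descriptor?) can be illustrated with a minimal Python sketch. It compares the file names listed in a descriptor with those found in a repository record. The file names and the function name are hypothetical, and this is not a description of the journal's actual review tooling.

```python
def compare_file_lists(described, deposited):
    """Report files that appear in only one of the two lists."""
    described_set = set(described)
    deposited_set = set(deposited)
    return {
        # Listed in the Data Descriptor but absent from the repository.
        "missing_from_repository": sorted(described_set - deposited_set),
        # Present in the repository but not described anywhere.
        "undescribed_in_descriptor": sorted(deposited_set - described_set),
    }


# Hypothetical example: file names are invented for illustration.
described_files = ["raw/run01.mzML", "raw/run02.mzML", "processed/peaks.csv"]
deposited_files = ["raw/run01.mzML", "processed/peaks.csv", "README.txt"]

report = compare_file_lists(described_files, deposited_files)
for issue, names in report.items():
    print(issue, names)
```

A name-level comparison only catches missing or extra files. Checking that file contents are intact would need something further, such as checksums published alongside the data.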

Page 16

Current content is diverse – bimonthly releases

•  Neuroscience, ecology, epidemiology, environmental science, functional genomics, and our first metabolomics

•  New data sets and previously published data sets

•  Data sets in figshare, OpenfMRI, GEO, GenomeRNAi, ArrayExpress and MetaboLights

•  Code deposited in figshare and GitHub

•  Little data (KB) to Big(ish) data (300+GB)

•  Individual datasets, compendium and citizen science

•  Academic and industry authors

Page 17

Kirwan: MS metabolomics

richer ISA-Tab

Page 18

Interested in collaborating or submitting?

•  Do you run a data resource we should recognize?
   o  See the list of criteria databases should meet on our website

•  Are you interested in facilitating submission to us?
   o  See our ISA-Tab specification on the website
      -  you can implement and export in this format from your authoring/curation tool, or from your database

•  Do you want to submit Data Descriptor(s)?
   o  Check suitability by sending a pre-submission enquiry; we accept:
      -  Submissions in the life, environmental and biomedical sciences, but not limited to these
      -  Experimental and computational datasets
      -  Individual datasets, curated aggregations, and collections
      -  Unpublished data and follow-up, with additional information for wider reuse