Project 3TU.Datacentrum. Im@c, September 24th 2009. Jeroen Rombouts, MSc, Project manager 3TU.Datacentrum

Imac 090924


Page 1: Imac 090924

Project 3TU.Datacentrum

Im@c, September 24th 2009
Jeroen Rombouts, MSc

Project manager 3TU.Datacentrum

Page 2: Imac 090924

Presentation outline

Why care about research data?

What do data producers have to say?

Page 3: Imac 090924

Why care? 1/3

[Diagram labels: Research; Manuscript, Publication; Data, Metadata; Repository, Library]

Page 4: Imac 090924

Why care? 2/3

Risks of current research data management

• Physical decay of storage media;

• Loss of descriptive (meta)data;

• Loss of ‘rendering’ capabilities (contemporary applications for viewing and analysing data).

Reasons for long-term preservation and access

• Data value (cost intensive, valorisation, continuous datasets);

• Research quality (verification, knowledge transfer, sharing).

Page 5: Imac 090924

Why care? 3/3

Project setting

• Plan of the National Science Foundation regarding preservation of digital scientific output (2006);

• OAIS reference model (2002 by CCSDS) becomes ISO standard (2009);

• KNAW starts Dutch data repository for humanities and social sciences: DANS (Data Archiving and Networked Services) (2005);

• No initiatives for engineering and science in the Netherlands.

Page 6: Imac 090924

The 3TU.Datacentrum 1/8

Project description

• Builds on two previous projects;
- E-Archiving: digital depot
- Darelux: Data Archiving River Environment Luxemburg

• Time frame of 3 years, 2008 - 2010;
- Financed mainly by 3TU.Federation
- Datasets from TUD, TU/e and UT, later other science data

• Goal: long-term access to research data.

Page 7: Imac 090924

The 3TU.Datacentrum 2/8

Tasks

• Implement and run ‘data-archive’ (facilitate data producers);
- Collect, preserve, publish and provide access to data
- Beta version: drietu2.3tu.nl/repository/collection:all/view/html

• Data management consultancy;
- Select and develop formats, metadata, tools, etc.

Collaboration

With DANS, SURF, Koninklijke Bibliotheek and others:

• “DRIVER-II” (EU-7FP), Demonstrator for Enhanced Publications;

• “Waardevolle Data & Diensten” (SURFshare), identify added value of data repository for data producers.

• Partner in DataCite consortium with TIB Hannover, ETH Zurich, INIST (France), British Library, DTU Copenhagen, NRC-CISTI (Canada), California Digital Library.

Page 8: Imac 090924

The 3TU.Datacentrum 3/8

Which data to preserve? And why?

• Data of ‘enhanced publications’ (underlying data and visualisations linked to publications). Increase publication value (stronger basis, more citations, …);

• Data generated by ‘hard to repeat’ processes. E.g. high cost, (environmental) observations, complex or continuous experiments, …;

• Data collected with public funding. Conditions by funding organisations or publishers like Nature Publishing Group, NWO, governmental organisations, universities, …;

• Preferably open access data with potential for reuse (verification, new research, …). Increase visibility, efficiency and quality of research efforts.

Page 9: Imac 090924

The 3TU.Datacentrum 4/8

• Technical infrastructure (server, platform, websites, formats & models)

• Dataset Darelux (2.0): http://drietu2.3tu.nl/repository/resource:study-CITG/view/html

• Dataset Flame (BagIt): http://drietu2.3tu.nl/datasets/flame/

• Dataset Wind speed/Solar radiation: http://drietu2.3tu.nl/datasets/windzon/

• Datasets ‘on the way’: NNV Survey ‘job market physicists’, Enhanced Publication ‘combustion’, Waterlab, Biotechnology, Remote sensing, ‘Tire noise’

Page 10: Imac 090924

Related ‘results’ 5/8

• Partner in DataCite consortium with TIB Hannover, ETH Zurich, INIST (France), British Library, DTU Copenhagen, NRC-CISTI (Canada), California Digital Library. Aim: “to support researchers by providing methods for them to locate, identify, and cite research datasets with confidence”;

• Founding member of COAR: Confederation of Open Access Repositories (October);

• Provide input for “Nota Wetenschappelijke informatievoorziening” (OC&W), “Toekomst voor ons digitaal geheugen” (NCDD);

• Partner in “Nationale Coalitie Digitale Duurzaamheid” (www.ncdd.nl);

• Coordinating “Forum onderzoeksdata”.

Page 11: Imac 090924

The 3TU.Datacentrum 6/8

Page 12: Imac 090924
Page 13: Imac 090924

The 3TU.Datacentrum 8/8

The benefits for data producers and data consumers

• Increased visibility of research output (metadata in repository networks, assigning DOIs, facilitating increased citation rates for ‘enhanced publications’, ...);

• Improved quality of datasets (quality assurance for multi-user setup, checks on ingest, …);

• (Long-term) preservation of, and access to, valuable research data;

• Distribution of research data for reuse, including administration and usage statistics;

• Advice on data management, rights, formats, metadata, etc.

Page 14: Imac 090924

What do data producers say? 1/2

• Nobody needs my data

• Data transfer not needed, every PhD does own project

• Our datasets are confidential

• Interesting but not for me

• Only for long-term continuous data

• Datasets are stored by publisher

• No time!

• Our research is once only

Page 15: Imac 090924

What do data producers say? 2/2

• Surprising our university had no facility for data preservation

• Transfer of data between PhDs can be improved

• Would like to publish data

• Good opportunity to share datasets we bought

• Very useful, essential metadata often missing

• Much to improve in reuse of data

• When can I store my datasets?

Page 16: Imac 090924

Questions? Suggestions?

Nature News Special on Data Sharing (September 2009): www.nature.com/news/specials/datasharing/index.html

Toekomst voor ons digitaal geheugen: http://www.ncdd.nl/documents/NCDDToekomst2009_000.pdf

Page 17: Imac 090924

Resources

• The 3TU.Datacentrum project: www.datacentrum.3tu.nl

• “Unavailability of online supplementary scientific information from articles published in major journals”: doi:10.1096/fj.05-4784lsf

• “Going, Going, Gone: Lost Internet References”: doi:10.1126/science.1088234

• “Sharing Detailed Research Data Is Associated with Increased Citation Rate”: doi:10.1371/journal.pone.0000308

• “To share or not to share”: www.rin.ac.uk/data-publication

• “NSF’s Cyberinfrastructure Vision for 21st century Discovery”: www.nsf.gov/od/oci/ci_v5.pdf

• “SURF Direct”, Digitale rechten: onderzoeksdata (digital rights for research data, in Dutch): www.surf.nl/surfdirect

• Nature News Special on Data Sharing (September 2009): www.nature.com/news/specials/datasharing/index.html

• Toekomst voor ons digitaal geheugen: http://www.ncdd.nl/documents/NCDDToekomst2009_000.pdf