The Data Management Ecosystem 4 April 2013 University of California Curation Center California Digital Library




Page 1: The Data Management Ecosystem

The Data Management Ecosystem

4 April 2013

University of California Curation Center

California Digital Library

Page 2: The Data Management Ecosystem

The research data problem

• Journal article

– Uniquely and persistently identified

– Concept of “publish”

– Multiple copies

– Easily findable

– Services: impact metrics, citation tracking, etc.

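• Research data

– Uniquely and persistently identified? Nope

– Concept of “publish”? Not really

– Multiple copies? Typically one

– Easily findable? Difficult

– Services (impact metrics, citation tracking, etc.)? Nope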

Research data is seen as a second-class citizen in the scholarly record.

Page 3: The Data Management Ecosystem

An ecosystem of inter-dependent partners

Besides data repository and publisher partners...

• researchers

• educators

• citizen science groups

• funders

• tenure and promotion committees

Libraries as neutral connection partners

Page 4: The Data Management Ecosystem

Where can libraries make a difference?

[Diagram: Research & Scholarship Lifecycle. Labels: Research, Collect, Save, Publish, Share, Create Knowledge]

Page 5: The Data Management Ecosystem

Collect > Publish > Share > Save > Research

Capture today’s web; build tomorrow’s archives

Create, edit, share, and save data management plans

Open source curation add-in for Microsoft Excel

Page 6: The Data Management Ecosystem

Collect > Publish > Share > Save > Research

Create and manage persistent identifiers: ARKs, DOIs, etc.

An infrastructure to publish and get credit for sharing research data

Page 7: The Data Management Ecosystem

Collect > Publish > Share > Save > Research

Curation repository: store, manage, preserve, and share research data

Open deposit, open access repository for spreadsheet data

Data Observation Network for Earth

Page 8: The Data Management Ecosystem

Collect > Publish > Share > Save > Research

What’s missing to complete the “incentive” circuit?

• Impact measures, citation tracking

“Connecting the data to the research it informs”

Altmetrics tools to measure non-traditional products and uses, etc.

Page 9: The Data Management Ecosystem

Stable storage: Merritt repository

• Curation repository open to the UC community and beyond

• Discipline / content agnostic

• Micro-services architecture

• Easy-to-use UI or API

• Hosted or locally deployed

Primary Functions

1. Deposit

2. Manage (metadata, versions, etc.)

3. Access (expose)

4. Share (with other researchers)

5. Preserve
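
The five primary functions can be pictured with a small in-memory sketch. This is not Merritt's actual interface: the `ToyRepository` class, its method names, and the use of SHA-256 digests for fixity checking are all assumptions made purely for illustration.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class ToyRepository:
    """Hypothetical in-memory stand-in for a curation repository (not Merritt's API)."""

    # object id -> list of versions; each version is (content, metadata, sha256 digest)
    objects: dict = field(default_factory=dict)
    # object id -> set of user names allowed to read the object
    shared_with: dict = field(default_factory=dict)

    def deposit(self, object_id, content, metadata):
        """Deposit / Manage: store content as a new version and return its number."""
        digest = hashlib.sha256(content).hexdigest()
        versions = self.objects.setdefault(object_id, [])
        versions.append((content, dict(metadata), digest))
        return len(versions)

    def share(self, object_id, user):
        """Share: grant another researcher read access."""
        self.shared_with.setdefault(object_id, set()).add(user)

    def access(self, object_id, user, version=None):
        """Access: return (content, metadata) for a version, latest by default."""
        if user not in self.shared_with.get(object_id, set()):
            raise PermissionError(f"{user} may not read {object_id}")
        versions = self.objects[object_id]
        chosen = versions[-1] if version is None else versions[version - 1]
        content, metadata, _digest = chosen
        return content, metadata

    def verify(self, object_id):
        """Preserve: recompute every digest and confirm nothing has changed."""
        return all(
            hashlib.sha256(content).hexdigest() == digest
            for content, _metadata, digest in self.objects[object_id]
        )


repo = ToyRepository()
first_version = repo.deposit("dataset-1", b"a,b\n1,2\n", {"title": "Example"})
second_version = repo.deposit("dataset-1", b"a,b\n1,3\n", {"title": "Example, corrected"})
repo.share("dataset-1", "alice")
latest_content, latest_metadata = repo.access("dataset-1", "alice")
```

Earlier versions stay retrievable (`repo.access("dataset-1", "alice", version=1)`), and `repo.verify("dataset-1")` stands in for the periodic fixity checks a preservation service would run.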

Page 10: The Data Management Ecosystem

EZID: Long term identifiers made easy

• Precise identification of a dataset (DOI or ARK)

• Credit to data producers and data publishers

• A link from the traditional literature to the data (DataCite)

• Exposure and research metrics for datasets (Web of Knowledge, Google)

Primary Functions

1. Create persistent identifiers

2. Manage identifiers (and associated metadata) over time

3. Resolve identifiers
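
A minimal sketch of the create, manage, and resolve cycle, assuming a simple in-memory registry. It does not use or imitate the real EZID API: the `ToyIdentifierRegistry` class, its methods, the `ark:/99999/x` prefix, and the example.org URLs are hypothetical placeholders.

```python
import itertools


class ToyIdentifierRegistry:
    """Hypothetical sketch of create / manage / resolve (not the EZID API)."""

    def __init__(self, prefix):
        self.prefix = prefix                 # placeholder identifier prefix
        self._counter = itertools.count(1)   # source of unique suffixes
        self._records = {}                   # identifier -> target URL and metadata

    def create(self, target_url, **metadata):
        """Create: mint a new identifier that points at target_url."""
        identifier = f"{self.prefix}{next(self._counter):06d}"
        self._records[identifier] = {"target": target_url, "metadata": dict(metadata)}
        return identifier

    def update(self, identifier, target_url=None, **metadata):
        """Manage: change the target or metadata; the identifier itself never changes."""
        record = self._records[identifier]
        if target_url is not None:
            record["target"] = target_url
        record["metadata"].update(metadata)

    def resolve(self, identifier):
        """Resolve: return the current location for an identifier."""
        return self._records[identifier]["target"]


registry = ToyIdentifierRegistry("ark:/99999/x")
dataset_id = registry.create("https://example.org/data/v1", title="Example dataset")

# The dataset moves to a new home, but citations keep using the same identifier.
registry.update(dataset_id, target_url="https://example.org/archive/data/v1")
current_location = registry.resolve(dataset_id)
```

The point of the sketch is the separation of concerns: a citation records only `dataset_id`, while the registry absorbs any later change of location.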

Take control of the management and distribution of your research, share and get credit for it, and build your reputation through its collection and documentation

Page 11: The Data Management Ecosystem

Discovery: DataCite consortium

• Technische Informationsbibliothek (TIB), Germany

• Australian National Data Service (ANDS)

• The British Library

• California Digital Library, USA

• Canada Institute for Scientific and Technical Information (CISTI)

• L’Institut de l’Information Scientifique et Technique (INIST), France

• Library of the ETH Zürich

• Library of TU Delft, The Netherlands

• Office of Scientific and Technical Information, US Department of Energy

• Purdue University, USA

• Technical Information Center of Denmark

Page 12: The Data Management Ecosystem

Member Nodes

• diverse institutions

• serve local community

• provide resources for managing their data

New distributed framework

Coordinating Nodes

• retain complete metadata catalog

• subset of all data

• perform basic indexing

• provide network-wide services

• ensure data availability (preservation)

• provide replication services

Flexible, scalable, sustainable network
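
The division of labor between member nodes and coordinating nodes can be sketched as a toy model. This is an illustration under stated assumptions, not the network's actual protocol or API: the `MemberNode` and `CoordinatingNode` classes, their methods, and the sample dataset are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class MemberNode:
    """Hypothetical member node: holds data for its local community."""
    name: str
    holdings: dict = field(default_factory=dict)  # data id -> (metadata, content)


@dataclass
class CoordinatingNode:
    """Hypothetical coordinating node: keeps the catalog and tracks replicas."""
    catalog: dict = field(default_factory=dict)    # data id -> metadata
    locations: dict = field(default_factory=dict)  # data id -> set of node names

    def harvest(self, node):
        """Copy metadata (not data) from a member node into the catalog."""
        for data_id, (metadata, _content) in node.holdings.items():
            self.catalog[data_id] = metadata
            self.locations.setdefault(data_id, set()).add(node.name)

    def replicate(self, data_id, source, target):
        """Copy an object to a second member node so it survives a node outage."""
        target.holdings[data_id] = source.holdings[data_id]
        self.locations.setdefault(data_id, set()).add(target.name)

    def search(self, term):
        """Basic indexing: find data ids whose title contains the term."""
        return sorted(
            data_id
            for data_id, metadata in self.catalog.items()
            if term.lower() in metadata.get("title", "").lower()
        )


node_a = MemberNode("node-a")
node_a.holdings["obs-1"] = ({"title": "Stream temperature observations"}, b"t,c\n0,4.1\n")
node_b = MemberNode("node-b")

coordinator = CoordinatingNode()
coordinator.harvest(node_a)
coordinator.replicate("obs-1", node_a, node_b)
matches = coordinator.search("temperature")
```

In this model the coordinating node never needs the data itself to answer a search; it needs only the metadata catalog and a record of which member nodes hold each object.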

Page 13: The Data Management Ecosystem

The rest of the story

www.cdlib.org/uc3

[email protected]

[email protected] for service questions