27
Digital | Curation | Centre Adding value to open access research data: reflections on the process of data curation Dr Liz Lyon, DCC Associate Director Outreach Director, UKOLN, University of Bath, UK Funded by: This work is licensed under a Creative Commons Licens Attribution-ShareAlike 2. 3 rd European Conference on Research Infrastructures

Digital | Curation | Centre Adding value to open access research data: reflections on the process of data curation Dr Liz Lyon, DCC Associate Director

Embed Size (px)

Citation preview

Page 1: Digital | Curation | Centre Adding value to open access research data: reflections on the process of data curation Dr Liz Lyon, DCC Associate Director

Digital | Curation | Centre

Adding value to open access research data: reflections on the process of data curation

Dr Liz Lyon,

DCC Associate Director Outreach Director, UKOLN, University of Bath, UK

Funded by:

This work is licensed under a Creative Commons LicenseAttribution-ShareAlike 2.0

3rd European Conference on Research Infrastructures

Page 2: Digital | Curation | Centre Adding value to open access research data: reflections on the process of data curation Dr Liz Lyon, DCC Associate Director

2

Digital | Curation | Centre

For later use? In use now (and the future)?

What is digital curation?

Data preservation Data curation

Static Dynamic

“maintaining and adding value to a trusted body of digital information for current and future use”

Page 3: Digital | Curation | Centre Adding value to open access research data: reflections on the process of data curation Dr Liz Lyon, DCC Associate Director

3

Digital | Curation | Centre

(Very simple) e-Research Cycle and Data Curation

Formulate hypothesis / ideas, test, experiment, observe: data creation,

collection & capture

Adding value: Data linking, annotation,

visualisation, simulation

(New) knowledge extraction: data mining, modelling, analysis, synthesis

e-Infrastructure

Open access

Collaboration

Scholarly communications: data disclosure, publication, citation, discovery, re-use

Data management storage & validation: description, deposit,

self-archiving, preservation,

certification

Data processing

Data processingData processing

Data processing

Data processing

This work is licensed under a Creative Commons LicenseAttribution-ShareAlike 2.0

Page 4: Digital | Curation | Centre Adding value to open access research data: reflections on the process of data curation Dr Liz Lyon, DCC Associate Director

4

Digital | Curation | Centre

(Very simple) e-Research Cycle and Data Curation

Formulate hypothesis / ideas, test, experiment, observe: data creation,

collection & capture

Adding value: Data linking, annotation,

visualisation, simulation

(New) knowledge extraction: data mining, modelling, analysis, synthesis

e-Infrastructure

Open access

Collaboration

Scholarly communications: data disclosure, publication, citation, discovery, re-use

Data management storage & validation: description, deposit,

self-archiving, preservation,

certification

Data processing

Data processingData processing

Data processing

Data processing

Page 5: Digital | Curation | Centre Adding value to open access research data: reflections on the process of data curation Dr Liz Lyon, DCC Associate Director

5

Digital | Curation | Centre

Curation issues 1: Data capture & integration

into research workflows

• R4L Repository for the Laboratory Project (JISC-funded) automated data capture from instrumentation, deposit of results (chemistry)

• SMART TEA electronic Laboratory notebook + annotations

Page 6: Digital | Curation | Centre Adding value to open access research data: reflections on the process of data curation Dr Liz Lyon, DCC Associate Director

6

Digital | Curation | Centre

– Access Grid – Collaborative telematic art– Modify spaces for performers – Interplay: Hallucinations

Page 7: Digital | Curation | Centre Adding value to open access research data: reflections on the process of data curation Dr Liz Lyon, DCC Associate Director

7

Digital | Curation | Centre

Human discourse : supporting “persistent conversations”?

• MEMETIC Project

• JISC-funded

• Virtual Research Environments Programme

• Compendium software + Access Grid

Page 8: Digital | Curation | Centre Adding value to open access research data: reflections on the process of data curation Dr Liz Lyon, DCC Associate Director

8

Digital | Curation | Centre

(Very simple) e-Research Cycle and Data Curation

Formulate hypothesis / ideas, test, experiment, observe: data creation,

collection & capture

Adding value: Data linking, annotation,

visualisation, simulation

(New) knowledge extraction: data mining, modelling, analysis, synthesis

e-Infrastructure

Open access

Collaboration

Scholarly communications: data disclosure, publication, citation, discovery, re-use

Data management storage & validation: description, deposit,

self-archiving, preservation,

certification

Data processing

Data processingData processing

Data processing

Data processing

Page 9: Digital | Curation | Centre Adding value to open access research data: reflections on the process of data curation Dr Liz Lyon, DCC Associate Director

9

Digital | Curation | Centre

Learning & Teaching workflows

Research & e-Science workflows

Aggregator services: national, commercial

Repositories : institutional, e-prints, subject, data, learning objects

Institutional presentation services: portals, Learning Management Systems, u/g, p/g courses, modules

Harvestingmetadata

Data creation / capture / gathering: laboratory experiments, Grids, fieldwork, surveys, media

Resource discovery, linking, embedding

Deposit / self-archiving

Peer-reviewed publications: journals, conference proceedings

Publication

Validation

Data analysis, transformation, mining, modelling

Resource discovery, linking, embedding

Deposit / self-archiving

Learning object creation, re-use

Searching , harvesting, embedding

Quality assurance bodies

Validation

Presentation services: subject, media-specific, data, commercial portals

Resource discovery, linking, embedding

The scholarly knowledge cycle.

Liz Lyon, Ariadne, July 2003.

This work is licensed under a Creative Commons LicenseAttribution-ShareAlike 2.0

© Liz Lyon (UKOLN, University of Bath), 2005

Page 10: Digital | Curation | Centre Adding value to open access research data: reflections on the process of data curation Dr Liz Lyon, DCC Associate Director

10

Digital | Curation | Centre

Federated repository architectures & repository services

fusion layer ‘repository federator’

repository repository repository repository repository

portal portal portal portal portal

heterogeneous - metadataformats, content formats,identifiers, packagingstandards

homogeneous - metadataformats, content formats,identifiers, packagingstandards

From Andy Powell: http://www.ukoln.ac.uk/distributed-systems/jisc-ie/arch/presentations/jiie-jcs-2005/

• Global

• Inter-disciplinary

• Cross-sectoral

• Multiple format types

• Data, eprints, images…….

• e-Framework: JISC & DEST

• Defining common services + domain-specific services

Page 11: Digital | Curation | Centre Adding value to open access research data: reflections on the process of data curation Dr Liz Lyon, DCC Associate Director

11

Digital | Curation | Centre

eBank UK Project• Two key themes:

– Open access to datasets– Linking research data to publications and to learning

• UKOLN, University of Southampton, University of Manchester• e-Science application ‘Combechem’ : Grid-enabled combinatorial

chemistry + National Crystallography Service• Resource Discovery Network / PSIgate physical sciences portal

http://www.ukoln.ac.uk/projects/ebank-uk/

Page 12: Digital | Curation | Centre Adding value to open access research data: reflections on the process of data curation Dr Liz Lyon, DCC Associate Director

12

Digital | Curation | Centre

A data repository entry

Page 13: Digital | Curation | Centre Adding value to open access research data: reflections on the process of data curation Dr Liz Lyon, DCC Associate Director

13

Digital | Curation | Centre

Access to the underlying data: complex objects

ecrystals.chem.soton.ac.uk

Page 14: Digital | Curation | Centre Adding value to open access research data: reflections on the process of data curation Dr Liz Lyon, DCC Associate Director

14

Digital | Curation | Centre

Curation issues 2: describing data

• Validation, publication & discovery of data models & schema

• Managing complex objects • Metadata packaging standards

– METS– MPEG 21 DIDL

• Semantic descriptions– Formal controlled vocabularies– High-level and domain ontologies– Inter-disciplinary discovery

• Informal approaches Web 2.0 “folksonomies”

Page 15: Digital | Curation | Centre Adding value to open access research data: reflections on the process of data curation Dr Liz Lyon, DCC Associate Director

15

Digital | Curation | Centre

JISC PALS

Dictate project

Research data?

Blogs & informal communications?

Page 16: Digital | Curation | Centre Adding value to open access research data: reflections on the process of data curation Dr Liz Lyon, DCC Associate Director

16

Digital | Curation | Centre

(Very simple) e-Research Cycle and Data Curation

Formulate hypothesis / ideas, test, experiment, observe: data creation,

collection & capture

Adding value: Data linking, annotation,

visualisation, simulation

(New) knowledge extraction: data mining, modelling, analysis, synthesis

e-Infrastructure

Open access

Collaboration

Scholarly communications: data disclosure, publication, citation, discovery, re-use

Data management storage & validation: description, deposit,

self-archiving, preservation,

certification

Data processing

Data processingData processing

Data processing

Data processing

Page 17: Digital | Curation | Centre Adding value to open access research data: reflections on the process of data curation Dr Liz Lyon, DCC Associate Director

17

Digital | Curation | Centre

Curation issues 3: Persistent identifiers for data citation

• Identify use cases: depositor, author, service provider, reader, publisher, ?

• Schemes: DOI, Handle, ARK, PURL• Global identification: express as http URIs• Added value services: CrossRef, resolution service,

integration (Globus), look-up service• Domain identifiers: e.g. International Chemical Identifier

(INChI) codes• Google molecules using InChIs demo: Peter Murray-Rust,

Uni Cambridge

Page 18: Digital | Curation | Centre Adding value to open access research data: reflections on the process of data curation Dr Liz Lyon, DCC Associate Director

18

Digital | Curation | Centre

One approach to data citation using DOIs

• Publication & citation of scientific primary data project National Library for Science & Technology (TIB), University of Hanover, Germany STD-DOI Project http://www.std-doi.de

• DOI registry for datasets• Data publication agents: World Data Center Climate,

GeoForschungsZentrum Potsdam • Data requirements: quality control, long-term curation,

use DOI resolver• Exemplar data citation:

– Kamm, H; Machon, L; Donner, S (2004): Gas chromatography (KTB Field Lab), GFZ Potsdam. doi:10.1594/GFZ/ICDP/KTB/ktb-geoch-gaschr-p

Page 19: Digital | Curation | Centre Adding value to open access research data: reflections on the process of data curation Dr Liz Lyon, DCC Associate Director

19

Digital | Curation | Centre

(Very simple) e-Research Cycle and Data Curation

Formulate hypothesis / ideas, test, experiment, observe: data creation,

collection & capture

Adding value: Data linking, annotation,

visualisation, simulation

(New) knowledge extraction: data mining, modelling, analysis, synthesis

e-Infrastructure

Open access

Collaboration

Scholarly communications: data disclosure, publication, citation, discovery, re-use

Data management storage & validation: description, deposit,

self-archiving, preservation,

certification

Data processing

Data processingData processing

Data processing

Data processing

Page 20: Digital | Curation | Centre Adding value to open access research data: reflections on the process of data curation Dr Liz Lyon, DCC Associate Director

20

Digital | Curation | Centre

Adding value: eBank linking data to publications

Page 21: Digital | Curation | Centre Adding value to open access research data: reflections on the process of data curation Dr Liz Lyon, DCC Associate Director

21

Digital | Curation | Centre

Linking research to learning - embedding eBank aggregator service in a science portal for student learners

Page 22: Digital | Curation | Centre Adding value to open access research data: reflections on the process of data curation Dr Liz Lyon, DCC Associate Director

22

Digital | Curation | Centre

UK Digital Curation Centre

• Delivering services

• Development activities

• Research agenda

• Outreach Programme

• http://www.dcc.ac.uk/

Page 23: Digital | Curation | Centre Adding value to open access research data: reflections on the process of data curation Dr Liz Lyon, DCC Associate Director

23

Digital | Curation | Centre

Adding value through annotation

DCC Research Agenda at the University of Edinburgh

• Databases: Annotation scoping report

• AstroDAS distributed annotation servers

• New annotation model + prototype: top-ranked demonstration at recent DB conference

Page 24: Digital | Curation | Centre Adding value to open access research data: reflections on the process of data curation Dr Liz Lyon, DCC Associate Director

24

Digital | Curation | Centre

(Very simple) e-Research Cycle and Data Curation

Formulate hypothesis / ideas, test, experiment, observe: data creation,

collection & capture

Adding value: Data linking, annotation,

visualisation, simulation

(New) knowledge extraction: data mining, modelling, analysis, synthesis

e-Infrastructure

Open access

Collaboration

Scholarly communications: data disclosure, publication, citation, discovery, re-use

Data management storage & validation: description, deposit,

self-archiving, preservation,

certification

Data processing

Data processingData processing

Data processing

Data processing

Page 25: Digital | Curation | Centre Adding value to open access research data: reflections on the process of data curation Dr Liz Lyon, DCC Associate Director

25

Digital | Curation | Centre

Page 26: Digital | Curation | Centre Adding value to open access research data: reflections on the process of data curation Dr Liz Lyon, DCC Associate Director

26

Digital | Curation | Centre

Curation issues 5: workforce development, capacity building & achieving cultural change

• DCC Outreach & Services:– [email protected]

(legal - technical guidance) – Curation Manual– Workshops, Information Days– 2nd International Conference

November 2006

• NSF Report : “Data scientist”• Develop hybrid skills• Embed in u/g, p/g curriculum• Facilitate collaboration:

researchers, data centres, digital libraries & archives communities

Page 27: Digital | Curation | Centre Adding value to open access research data: reflections on the process of data curation Dr Liz Lyon, DCC Associate Director

Digital | Curation | Centre

Thank you.

[email protected]

Join the DCC Associates Network at www.dcc.ac.uk