eBank UK: linking research data, scholarly communication and learning.

Dr Liz Lyon, UKOLN, University of Bath

Dr Simon Coles, School of Chemistry, University of Southampton

AHM, Nottingham, September 2004

Overview

• In context: scholarly communications
– Open Access
– Data, information, workflows and provenance
• The data publication bottleneck
– e-Science and crystallography
– Comb-e-chem Project
• eBank UK
– Information architecture and data flow
– Interoperability issues
• Challenges for the future

Scholarly communications

Current chemistry publishing protocols

Ideas and interpretations

Results & derived data

Hooks into the literature

Raw data!

The government line

“It is envisaged that the sharing of primary data would prevent unnecessary repetition of experiments and enable scientists to build directly on each others’ work, creating greater efficiencies and productivity in the research process.”

Research & e-Science workflows

[Diagram: The scholarly knowledge cycle. Liz Lyon, eBank UK article, Ariadne, July 2003.]

Diagram labels:
• Data creation / capture / gathering: laboratory experiments, Grids, fieldwork, surveys, media
• Data analysis, transformation, mining, modelling
• Data curation: databases & databanks
• Repositories: institutional, e-prints, subject, data, learning objects
• Aggregator services: national, commercial
• Presentation services: subject, media-specific, data, commercial portals
• Peer-reviewed publications: journals, conference proceedings
• Deposit / self-archiving
• Harvesting metadata
• Searching, harvesting, embedding
• Resource discovery, linking, embedding
• Linking
• Publication
• Validation

Learning & Teaching workflows

[Diagram: the learning and teaching cycle.]

Diagram labels:
• Learning object creation, re-use
• Repositories: institutional, e-prints, subject, data, learning objects
• Aggregator services: national, commercial
• Institutional presentation services: portals, Learning Management Systems, u/g, p/g courses, modules
• Presentation services: subject, media-specific, data, commercial portals
• Peer-reviewed publications: journals, conference proceedings
• Quality assurance bodies
• Deposit / self-archiving
• Harvesting metadata
• Searching, harvesting, embedding
• Resource discovery, linking, embedding
• Validation

Learning & Teaching workflows

Research & e-Science workflows

[Diagram: the Research & e-Science cycle and the Learning & Teaching cycle from the preceding two slides combined in a single figure, with the same labels.]

Learning & Teaching workflows

Research & e-Science workflows

[Diagram: the combined Research & e-Science and Learning & Teaching figure again, with the same labels, except that the aggregator services box now reads “Aggregator services: eBank UK”.]

The Data Publication Bottleneck

The data deluge

Data Overload!

How do we disseminate?

EPSRC National Crystallography Service

CombeChem: An EPSRC pilot project

[Diagram labels: X-Ray e-Lab; Properties e-Lab; Diffractometer; Analysis; Properties; Simulation; Video; Grid Middleware; Structures Database]

Entire E-Science Cycle

Encompassing experimentation, analysis, publication, research, learning

[Diagram labels: Grid; E-Experimentation; E-Scientists; Graduate Students; Undergraduate Students; Virtual Learning Environment; Digital Library; Institutional Archive; Local Web Publisher; Holdings; Technical Reports; Reprints; Peer-Reviewed Journal & Conference Papers; Preprints & Metadata; Certified Experimental Results & Analyses; Data, Metadata & Ontologies]

The eBank UK Project

eBank UK project

• JISC-funded for 1 year from September 2003
• UKOLN at the University of Bath (lead), University of Southampton, University of Manchester
• “Building the links between research data, scholarly communication and learning”
• Exemplar: e-Science testbed ‘Combechem’
– Grid-enabled combinatorial chemistry
– Crystallography, laser and surface chemistry examples
– Development of an e-Lab using pervasive computing technology
– National Crystallography Service
• Resource Discovery Network / PSIgate physical sciences portal
• http://www.ukoln.ac.uk/projects/ebank-uk/

The project team

• UKOLN
– Michael Day
– Monica Duke
– Rachel Heery
– Liz Lyon
– + Andy Powell
• Southampton
– Les Carr
– Simon Coles
– Jeremy Frey
– Chris Gutteridge
– Mike Hursthouse
• Manchester
– John Blunden-Ellis

First steps: establishing common ground…

• Understand the data creation process
• Terminology and definitions
– Data
– Metadata
– Datafile
– Dataset
– Data holding
• Different views
– Digital library researchers, computer scientists, chemists
– Generic vs specific
– Modeller vs practitioner
• Aim for a common ontology
• Modelling the domain
• Creating a metadata schema

Progress update

• Version 2.0 eBank metadata schema
• Enhanced ePrints.org software
• Pilot institutional e-data repository for harvesting (raw, derived, results data)
• Exports records as ebank_dc and oai_dc
• Validation of schema
• Pilot eBank UK aggregator service
• Developing search interface Version 1.0
• Testing with PSIgate physical sciences portal
– embedding eBank UK

Crystallography workflow

• Initialisation: mount new sample on diffractometer & set up data collection
• Collection: collect data
• Processing: process and correct images
• Solution: solve structures
• Refinement: refine structure
• CIF: produce CIF (Crystallographic Information File format)
• Report: generate Crystal Structure Report

RAW DATA DERIVED DATA RESULTS DATA
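The ordered stages of the crystallography workflow can be sketched as a small Python enumeration. This is only an illustration under stated assumptions: the names `Stage`, `next_stage` and `remaining` are hypothetical and not part of any eBank UK software, and the sketch deliberately does not assign stages to raw, derived or results data, since the slide does not spell out that mapping.

```python
from enum import Enum


class Stage(Enum):
    """Workflow stages, in the order listed on the slide (hypothetical model)."""
    INITIALISATION = "mount new sample on diffractometer and set up data collection"
    COLLECTION = "collect data"
    PROCESSING = "process and correct images"
    SOLUTION = "solve structures"
    REFINEMENT = "refine structure"
    CIF = "produce CIF (Crystallographic Information File)"
    REPORT = "generate Crystal Structure Report"


def next_stage(stage):
    """Return the stage that follows `stage`, or None after the last one."""
    stages = list(Stage)  # Enum members iterate in definition order
    position = stages.index(stage)
    return stages[position + 1] if position + 1 < len(stages) else None


def remaining(stage):
    """List the names of all stages still to run after `stage`."""
    stages = list(Stage)
    return [s.name for s in stages[stages.index(stage) + 1:]]


if __name__ == "__main__":
    for s in Stage:
        print(f"{s.name.title()}: {s.value}")
```

Modelling the stages as an ordered enumeration keeps the sequence explicit, which could be useful when recording how far a given sample has progressed.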

Deposition into the archive

An Archive entry

ecrystals.chem.soton.ac.uk

For a demo come to the JISC booth!

Today @ 13:00 & during tea

All the way back to the underlying data…

Some metadata issues

• Using simple and qualified Dublin Core
• Additional chemical information in schema for harvesting e.g. empirical formula
• Schema contains International Chemical Identifier (InChI)
• Links to all datasets associated with an experiment
• Links to individual datasets within an experiment
• Links to eprints (and other published literature) derived from the data
• Using vocabularies specific to crystallography
• Engaging the broader scientific community to ensure different schemas are compliant and standards can emerge

Data flow in eBank

[Diagram. Model input: Andy Powell, UKOLN.]

Diagram labels:
• Institutional repository; Deposit
• Crystal structure (data holding); Crystal structure report (HTML); Dataset (three shown)
• ebank_dc record (XML); dc:type=“CrystalStructure” and/or “Collection”
• Eprint oai_dc record (XML); dc:type=“Eprint” and/or “Text”
• Eprint “jump-off” page (HTML); Eprint manifestation (e.g. PDF)
• Link labels: dc:identifier; dcterms:references; dcterms:isReferencedBy; Linking
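To make the linking properties in the data flow more concrete, here is a minimal Python sketch that assembles a Dublin Core style record for a data holding. Several things are assumptions rather than the project's actual schema: the wrapper element name `record`, the function name `build_record`, the example.org URLs, and in particular the direction of the links (that the data record points at its datasets with `dcterms:references` and at a related eprint with `dcterms:isReferencedBy`). Only the two Dublin Core namespaces are standard.

```python
import xml.etree.ElementTree as ET

# Standard Dublin Core namespaces.
DC = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"
ET.register_namespace("dc", DC)
ET.register_namespace("dcterms", DCTERMS)


def build_record(report_url, dataset_urls, eprint_url=None):
    """Build an illustrative data-holding record (not the real ebank_dc schema)."""
    record = ET.Element("record")  # hypothetical wrapper element
    ET.SubElement(record, f"{{{DC}}}type").text = "CrystalStructure"
    # Assumed: dc:identifier carries the URL of the crystal structure report.
    ET.SubElement(record, f"{{{DC}}}identifier").text = report_url
    # Assumed: one dcterms:references link per dataset in the data holding.
    for url in dataset_urls:
        ET.SubElement(record, f"{{{DCTERMS}}}references").text = url
    # Assumed: dcterms:isReferencedBy points at an eprint derived from the data.
    if eprint_url is not None:
        ET.SubElement(record, f"{{{DCTERMS}}}isReferencedBy").text = eprint_url
    return record


if __name__ == "__main__":
    example = build_record(
        "https://repository.example.org/structure/1",
        ["https://repository.example.org/structure/1/raw",
         "https://repository.example.org/structure/1/cif"],
        eprint_url="https://eprints.example.org/42",
    )
    print(ET.tostring(example, encoding="unicode"))
```

Keeping the links as plain URL values mirrors the slide's emphasis on linking between the data holding, its datasets and any eprint.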

Data flow in eBank

[Diagram: the data flow extended with harvesting services. Model input: Andy Powell, UKOLN.]

Diagram labels:
• Institutional repository; Deposit
• Crystal structure (data holding); Crystal structure report (HTML); Dataset (three shown)
• ebank_dc record (XML); dc:type=“CrystalStructure” and/or “Collection”
• Eprint oai_dc record (XML); dc:type=“Eprint” and/or “Text”
• Eprint “jump-off” page (HTML); Eprint manifestation (e.g. PDF)
• eBank UK aggregator service; ePrint UK aggregator service; Subject service; PSIgate portal
• Harvesting OAI-PMH ebank_dc; Harvesting OAI-PMH oai_dc (twice)
• Searching, linking and embedding (three times)
• Link labels: dc:identifier; dcterms:references; dcterms:isReferencedBy; Linking
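The OAI-PMH harvesting step named in the diagram could look roughly like the following Python sketch, which only builds request URLs. `ListRecords`, `metadataPrefix`, `from`, `set` and `resumptionToken` are standard OAI-PMH request arguments; the base URL and the function names are hypothetical, and the sketch assumes the repository exposes `ebank_dc` as a metadata prefix, as the slide's labels suggest.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix, from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL for one metadata format."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        params["from"] = from_date  # selective harvesting by datestamp
    if set_spec is not None:
        params["set"] = set_spec    # selective harvesting by set
    return f"{base_url}?{urlencode(params)}"


def resume_url(base_url, token):
    """Build the follow-up request for the next page of a result list.

    In OAI-PMH the resumptionToken is an exclusive argument, so it is
    sent with the verb only.
    """
    return f"{base_url}?{urlencode({'verb': 'ListRecords', 'resumptionToken': token})}"


if __name__ == "__main__":
    base = "https://repository.example.org/oai"  # hypothetical endpoint
    print(list_records_url(base, "ebank_dc"))
    print(list_records_url(base, "oai_dc", from_date="2004-09-01"))
```

An aggregator would issue the first request, parse the XML response, and keep following resumption tokens until none is returned.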

Harvesting: OAIster

Linking and aggregating: Search & discover

For a demo come to the JISC booth!

Today @ 13:00 & during tea or the buffet

Linking and aggregating: Hit browsing

And finally…eBank embedded in a science portal

Currently we are……

• Assessing outcomes of a Consultation Workshop held in August e.g.
– Cost-benefit issues for researchers?
– RAE / assessment impact?
– Disciplinary differences?
• Presenting a demonstrator
• Completing supporting studies on (1) Provenance and (2) Data models and schema
• Promoting Open Access and Open eData Archives to international crystallographic organisations, publishers, learned societies
• Phase 2 proposal funding sought for further 12 months

Challenges for the future

Phase 2 plan…….(1)

• Continue to progress towards generic metadata schemas
• Validation against other schema
– CLRC Scientific Metadata Model
• Modify Eprints.org software to allow for more generic scientific data and schemas
• Metadata enhancement: subject keyword additions based on knowledge of keywords in related publications
• Investigate identifiers e.g. International Chemical Identifier (InChI code)
• Explore context sensitive linking: find me
– Datasets by this person; Journal articles by this person; Datasets related to this subject; Journal articles on this subject; Learning objects by this person; Learning objects on this subject
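The "find me" queries listed under context sensitive linking can be illustrated with a small in-memory Python sketch. Everything here is hypothetical: the `Record` class, the `find` function and the sample records are invented for illustration, and a real service would presumably query harvested metadata rather than a Python list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A simplified metadata record (hypothetical, for illustration only)."""
    title: str
    kind: str              # "dataset", "article" or "learning object"
    creators: tuple = ()
    subjects: tuple = ()


def find(records, kind, creator=None, subject=None):
    """Return records of one kind, filtered by creator and/or subject."""
    hits = []
    for record in records:
        if record.kind != kind:
            continue
        if creator is not None and creator not in record.creators:
            continue
        if subject is not None and subject not in record.subjects:
            continue
        hits.append(record)
    return hits


# Invented sample records.
SAMPLE = [
    Record("Structure A", "dataset", ("Researcher X",), ("crystallography",)),
    Record("Paper on structure A", "article", ("Researcher X",), ("crystallography",)),
    Record("Intro to diffraction", "learning object", ("Lecturer Y",), ("crystallography",)),
]

if __name__ == "__main__":
    # "Datasets by this person"
    print(find(SAMPLE, "dataset", creator="Researcher X"))
    # "Learning objects on this subject"
    print(find(SAMPLE, "learning object", subject="crystallography"))
```

Each of the six queries on the slide is then one call to `find` with a different combination of kind, creator and subject.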

Phase 2…….(2)

• Full embedding into the crystallographic research and publishing communities
• Chemistry workflow embedding
– SMART TEA e synthesis Lab
– Other analytical techniques in chemistry
• e-Learning embedding and pedagogic evaluation
– Undergraduate chemical informatics courses
– Introduction to visiting schools
• Expand into other physical, mathematical, geological and engineering sciences
• Feasibility study in related domains – bio and medical sciences
• Feasibility study in unrelated domains – arts and humanities

Thank you.

Questions?…..