

                                                             

JISC Joint Programmes Meeting 2005

eBank UK: linking research data, learning and scholarly communications.

Dr Liz Lyon, UKOLN, University of Bath

Dr Simon Coles, School of Chemistry, University of Southampton

                                                             

The wider context

                                                             


Why create the e-Framework? The JISC strategic context

Sarah Porter, 2005

end-user desktop/browser

presentation
• institutional portals
• subject portals
• learning management systems
• media-specific portals

fusion
• brokers
• aggregators
• catalogues
• indexes

provision
• JISC-funded content providers
• institutional content providers
• external content providers

OpenURL link servers

shared infrastructure
• authentication/authorisation (Athens)
• institutional profiling services
• terminology services
• service registries
• identifier services
• metadata schema registries

© Andy Powell (UKOLN, University of Bath), 2005

This work is licensed under a Creative Commons License Attribution-ShareAlike 2.0

JISC Information Environment architecture

                                                             


Learning & Teaching workflows

Research & e-Science workflows

Aggregator services: national, commercial

Repositories: institutional, e-prints, subject, data, learning objects

Institutional presentation services: portals, Learning Management Systems, u/g, p/g courses, modules

Harvesting metadata

Data creation / capture / gathering: laboratory experiments, Grids, fieldwork, surveys, media

Resource discovery, linking, embedding

Deposit / self-archiving

Peer-reviewed publications: journals, conference proceedings

Publication

Validation

Data analysis, transformation, mining, modelling

Resource discovery, linking, embedding

Deposit / self-archiving

Learning object creation, re-use

Searching, harvesting, embedding

Quality assurance bodies

Validation

Presentation services: subject, media-specific, data, commercial portals

Resource discovery, linking, embedding

The scholarly knowledge cycle.

Liz Lyon, Ariadne, July 2003.

This work is licensed under a Creative Commons License Attribution-ShareAlike 2.0

© Liz Lyon (UKOLN, University of Bath), 2005

                                                             


Data Overload!

How do we disseminate?

EPSRC National Crystallography Service

eScience - the data deluge

                                                             


The scholarly knowledge cycle, with eBank UK as the aggregator service.

                                                             

The eBank UK Project

                                                             


eBank UK: background

• JISC-funded September 2003, Phase 2 February 2005
• UKOLN at the University of Bath (lead), University of Southampton, University of Manchester
• Exemplar: e-Science testbed ‘CombeChem’
– Grid-enabled combinatorial chemistry
– Crystallography, laser and surface chemistry examples
– Development of an e-Lab using pervasive computing technology
– National Crystallography Service
• Resource Discovery Network / PSIgate physical sciences portal

• http://www.ukoln.ac.uk/projects/ebank-uk/

                                                             


The project team

• UKOLN
– Michael Day
– Monica Duke
– Rachel Heery
– Traugott Koch
– Liz Lyon
– + Andy Powell
• Southampton
– Les Carr
– Simon Coles
– Jeremy Frey
– Chris Gutteridge
– Mike Hursthouse
– Andrew Milstead
• Manchester
– John Blunden-Ellis

                                                             


Data Flow in eBank UK

Institutional repository
• Create
• Deposit (Deposition Interface)
• Submit
• Store/link: Data files, Metadata
• Present: HTML (Local archive search interface)

OAI-PMH
• Harvest (XML)

eBank aggregator
• Index and Search
• Present: HTML (Service Provider interfaces e.g. Subject Portal)
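The harvest step in this flow runs over OAI-PMH. As a minimal sketch, not the project's own code, the following builds a ListRecords request and reads identifiers and titles out of a response. The base URL and the sample response are invented for illustration; the verb, the metadataPrefix argument and the namespaces are those of the OAI-PMH 2.0 and Dublin Core specifications.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def parse_records(xml_text):
    """Return (OAI identifier, dc:title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    results = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        title = record.findtext(".//dc:title", namespaces=NS)
        results.append((identifier, title))
    return results


# Hypothetical response, trimmed to the elements used above.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example crystal structure</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    # The repository address is a placeholder, not a real endpoint.
    print(list_records_url("http://repository.example.org/oai"))
    print(parse_records(SAMPLE))
```

A real harvester would also follow resumptionToken elements to page through large result sets; that is left out here to keep the sketch short.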

                                                             


eBank data model

Model input: Andy Powell, UKOLN.

Records and resources
• ebank_dc record (XML): dc:type=“CrystalStructure” and/or “Collection”
• Crystal structure report (HTML)
• Crystal structure (data holding): Dataset, Dataset, Dataset
• Eprint oai_dc record (XML): dc:type=“Eprint” and/or “Text”
• Eprint “jump-off” page (HTML)
• Eprint manifestation (e.g. PDF)

Linking
• dc:identifier
• dcterms:references
• dcterms:isReferencedBy

Services
• Institutional repository (Deposit)
• eBank UK aggregator service
• ePrint UK aggregator service
• Subject service
• PSIgate portal
• Harvesting: OAI-PMH ebank_dc; OAI-PMH oai_dc
• Searching, linking and embedding
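The linking in this model can be sketched as two records that point at each other. The example below is only an illustration: the flat record element is a simplification rather than the real ebank_dc schema, the URLs are invented, and the direction of the two dcterms properties follows the Dublin Core definitions (the eprint references the dataset, the dataset is referenced by the eprint), which is an assumption about the diagram.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"
ET.register_namespace("dc", DC)
ET.register_namespace("dcterms", DCTERMS)


def make_record(identifier, types, link_term=None, link_target=None):
    """Build a flat, simplified record element (not the real ebank_dc schema)."""
    record = ET.Element("record")
    ET.SubElement(record, f"{{{DC}}}identifier").text = identifier
    for value in types:
        ET.SubElement(record, f"{{{DC}}}type").text = value
    if link_term and link_target:
        ET.SubElement(record, f"{{{DCTERMS}}}{link_term}").text = link_target
    return record


def linked_target(record, link_term):
    """Return the URL a record points at through the given dcterms property."""
    return record.findtext(f"{{{DCTERMS}}}{link_term}")


# Hypothetical URLs for a crystal structure report and an eprint jump-off page.
DATA_URL = "http://repository.example.org/structure/42"
EPRINT_URL = "http://eprints.example.org/1234/"

# Assumed direction: dataset isReferencedBy eprint, eprint references dataset.
dataset = make_record(DATA_URL, ["CrystalStructure", "Collection"],
                      "isReferencedBy", EPRINT_URL)
eprint = make_record(EPRINT_URL, ["Eprint", "Text"], "references", DATA_URL)

if __name__ == "__main__":
    print(ET.tostring(dataset, encoding="unicode"))
    print(ET.tostring(eprint, encoding="unicode"))
```

Because each record carries the other's dc:identifier value, a service that has harvested both can move from data to publication and back without a separate link table.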

                                                             


CombeChem: An EPSRC pilot project

X-Ray e-Lab

Diffractometer

Analysis

Properties

Properties e-Lab

Simulation

Video

Grid Middleware

Structures Database

                                                             


Crystallography data: The publication problem

[Figure: chemical structure diagrams]

25,000,000

2,000,000

300,000

                                                             


Crystallography workflow

RAW DATA, DERIVED DATA, RESULTS DATA

• Initialisation: mount new sample, set up data collection
• Collection: collect data
• Processing: process and correct images
• Solution: solve structures
• Refinement: refine structure
• CIF: produce CIF (Crystallographic Information File)
• Validation: chemical & crystallographic checks
• Report: generate Crystal Structure Report
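The eight stages above form a strict sequence, which can be sketched as an ordered enumeration. This is an illustrative sketch only: the class and function names are invented here, and the rule that a stage counts as done only when every earlier stage is done is an assumption rather than something the slide states.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Workflow stages, numbered in the order the slide lists them."""
    INITIALISATION = 1
    COLLECTION = 2
    PROCESSING = 3
    SOLUTION = 4
    REFINEMENT = 5
    CIF = 6
    VALIDATION = 7
    REPORT = 8


def remaining_stages(completed):
    """Return the stages still to run, given the set already completed.

    Assumption: the sequence is strict, so the result starts at the first
    stage missing from `completed`, even if later stages are marked done.
    """
    for stage in Stage:
        if stage not in completed:
            return [s for s in Stage if s >= stage]
    return []


if __name__ == "__main__":
    done = {Stage.INITIALISATION, Stage.COLLECTION, Stage.PROCESSING}
    print([s.name for s in remaining_stages(done)])
```

A deposition tool could use a check like this to decide whether an entry is ready for the repository, for example by requiring that nothing remains before REPORT.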

                                                             


A data repository entry

                                                             


Access to the underlying data

ecrystals.chem.soton.ac.uk

                                                             


Harvesting: OAIster

                                                             


Aggregating: search & discover

                                                             


Linking data to publications

                                                             


eBank embedded in a science portal

                                                             


Current Developments: Deposition and validation tools

Validation

File format manipulation

                                                             


Current Developments: Integration into crystallographic publishing practices

Publishers’ seal of approval

                                                             


Current Developments: Ontologies for aggregating, linking & discovery

• Transform the ‘list’ into an ‘ontology’

• Embed ontology into the deposition process

• Publish keywords in OAI

• Aggregators use keywords for linking with the broader literature

• Researchers use keyword ontology in search and discovery services
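The step where aggregators use published keywords to link datasets with the broader literature can be sketched as an inverted index. This is a simplified illustration with invented records and identifiers; it matches flat, case-normalised keywords only, whereas an ontology would also bring in broader and narrower terms.

```python
from collections import defaultdict


def build_keyword_index(records):
    """Map each normalised keyword to the identifiers of records that carry it."""
    index = defaultdict(set)
    for record in records:
        for keyword in record["keywords"]:
            index[keyword.strip().lower()].add(record["id"])
    return index


def related_records(record, index):
    """Return identifiers of other records sharing at least one keyword."""
    related = set()
    for keyword in record["keywords"]:
        # .get avoids creating empty entries in the defaultdict on lookup.
        related |= index.get(keyword.strip().lower(), set())
    related.discard(record["id"])
    return related


# Invented records standing in for harvested dataset and eprint metadata.
RECORDS = [
    {"id": "dataset:1", "keywords": ["Hydrogen bonding", "Polymorphism"]},
    {"id": "eprint:7", "keywords": ["polymorphism"]},
    {"id": "eprint:9", "keywords": ["Catalysis"]},
]

if __name__ == "__main__":
    index = build_keyword_index(RECORDS)
    print(related_records(RECORDS[0], index))
```

Normalising case and whitespace before indexing is the main design choice here: it lets "Polymorphism" on a dataset match "polymorphism" on an eprint without any vocabulary mapping.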

                                                             


eBank: linking to learning

• Embedding in e-Learning processes
• Evaluating the pedagogical benefits
– MChem course
– Chemical informatics course

                                                             

Issues and challenges

                                                             


1. Issues: research data as content

• Sharing it!
• Data diversity
– Homo- or heterogeneous
– Raw and derived / processed
– Sensitivity
– Fast or slow growth in volume
• Repository evolution:
– Likelihood to scale up (from bytes to petabytes)
– Quality assurance (from the start)
– Community-based standards development (“folksonomies”)
– Build robust services

                                                             


2. Issues: generic data models, metadata schema & terminology

• Validation against other schema
– CCLRC Scientific Data Model Version 2
• Complex digital objects and packaging options
– METS
– MPEG 21 DIDL
• Terminologies
– Domain: crystallography
– Inter-disciplinary e.g. biomaterials
– Metadata enhancement: subject keyword additions to datasets based on knowledge of keywords in related publications
– Meaningful resource discovery?

                                                             


3. Issues: linking and identifiers

• Links to individual datasets within an experiment
• Links to all datasets associated with an experiment or a data collection
• Links to derived eprints and published literature
• Context sensitive linking: find me
– Datasets by this author / creator
– Datasets related to this subject
– Learning objects by this author / creator
– Learning objects related to this subject
• Identifiers and persistence
– “generic”
– domain: International Chemical Identifier (InChI code)
• Resource discovery: Google Scholar?
• Provenance: authenticity, authority, integrity?
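The "find me" queries listed above can be sketched as a single filter over harvested metadata. The Resource class, its field names and the sample entries are all hypothetical; a real service would run such queries against an index rather than a list in memory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical, minimal description of a harvested resource."""
    identifier: str
    kind: str              # e.g. "Dataset" or "LearningObject"
    creators: tuple = ()
    subjects: tuple = ()


def find_me(resources, kind, creator=None, subject=None):
    """Return resources of one kind, optionally narrowed by creator and/or subject."""
    matches = []
    for resource in resources:
        if resource.kind != kind:
            continue
        if creator is not None and creator not in resource.creators:
            continue
        if subject is not None and subject not in resource.subjects:
            continue
        matches.append(resource)
    return matches


# Invented sample data.
RESOURCES = [
    Resource("dataset:1", "Dataset", ("A. Researcher",), ("crystallography",)),
    Resource("dataset:2", "Dataset", ("B. Researcher",), ("crystallography",)),
    Resource("lo:1", "LearningObject", ("A. Researcher",), ("crystallography",)),
]

if __name__ == "__main__":
    # "Datasets by this author / creator"
    print(find_me(RESOURCES, "Dataset", creator="A. Researcher"))
    # "Learning objects related to this subject"
    print(find_me(RESOURCES, "LearningObject", subject="crystallography"))
```

Matching creators by exact string is fragile in practice, which is one reason persistent identifiers appear on the same slide.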

                                                             


4. Issues: embedding and workflow

• Into the crystallographic publishing community: International Union of Crystallography
• Into the chemistry research workflow
– SMART TEA Digital Lab Book e-synthesis Lab
– Other analytical techniques and instrumentation
– RAE procedures?
• Into the curriculum and e-Learning workflows
– MChem course
– Undergraduate Chemical Informatics courses

                                                             


Next in Phase 2…

• Full embedding into the crystallographic research and publishing communities

• Chemistry workflow embedding
– R4L: Repository for the Laboratory
– Related sub-domains of chemistry: SPECTRa
• e-Learning embedding and pedagogic evaluation
– Assess role in u/g chemical informatics courses
– Introducing school children to e-research
• Enabling interdisciplinary research
– Physical, mathematical, earth, environmental and engineering sciences

                                                             

Thank you.

Questions?
