Preservation in Institutional Repositories
Preliminary conclusions of the SHERPA DP project

Gareth Knight
Digital Preservation Officer, AHDS
25 October 2006

Funded by: © AHDS


SHERPA DP Project

• Acronym: Securing a Hybrid Environment for Research Preservation and Access: Digital Preservation

• Development Partners: AHDS at King’s College London (Lead), Nottingham, Glasgow, Edinburgh, White Rose Consortium, London Leap Consortium

• Duration: 2 years, March 2005 – February 2007

• Funding: JISC and CURL

• Programme: JISC Digital Preservation and Records Management Programme


SHERPA DP Project

Purpose:

To create a collaborative, shared preservation environment for the SHERPA project framed around the OAIS Reference Model.

Aims:

1. To develop a prototype preservation environment for SHERPA Partners based on the OAIS reference model, including a set of protocols and software tools.

2. To establish a workflow and procedures to suit the needs of institutional repositories and the preservation service.

3. To provide guidance on the ingest process, to encourage the deposit of formats that will minimise long-term operational costs.

4. To develop an exemplar for an outsourced preservation service.

5. To create a User Guide that recommends standards, best practice, protocols and processes that may be used in the management, preservation and presentation of e-print repositories.


Why distribute preservation functions?

• In many IRs, there is a scarcity of staff with the necessary preservation skills and expertise

• Institutional repositories lack the time to implement preservation

• Potential cost savings in terms of staff time and equipment?

• Seeking to remove repetition of services

• Preservation is not inherent in most repository software. DSpace and EPrints are primarily about submission, basic storage and access (for the moment)


Repository Landscape

[Diagram: partner repositories grouped by software (ePrints repository software, DSpace repository software), by content (published & peer-reviewed papers, electronic theses & dissertations) and by scope (single-institution repositories, multi-institution repositories).]

Repositories shown: ERPA ePrints, Nottingham ePrints, Edinburgh Research Archive, White Rose consortium, SOAS Eprint repository, Nottingham eTheses, Modern Languages Publications Archive, UCL ePrints, Imperial Eprints, Royal Holloway, LSE Research Online, Kings ePrints, JeLit Journal of eLiteracy, Glasgow ePrints Service, Birkbeck ePrints


OAIS Functional Model

[Diagram: the OAIS functional model. Functional entities: Ingest, Data Management, Archival Storage, Access, Administration, Preservation Planning. External entities: Producer, Consumer, Management. Labelled flows: SIP, AIP, DIP, Descriptive Info, queries, result sets, orders.]


Distributed OAIS Model

A disaggregated OAIS

SIP = E-print & discovery MD

AIP = E-print, discovery & preservation MD

DIP = E-print and discovery MD

[Diagram: OAIS functions divided between the Institutional Repository (Content Provider) and the Preservation Service (Service Provider). Ingest, Data Management, Archival Storage, Access and Administration each appear twice, once per organisation; Preservation Planning appears once. The Depositor and the Researcher are the external actors, and SIP, AIP and DIP flows connect the components.]
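The package definitions on this slide can be read as three small record types. The following is a minimal sketch under that reading, not the project's data model: the class names, field names and the two helper functions are hypothetical, and metadata is represented as plain dictionaries purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SIP:
    """Submission package: the e-print plus discovery metadata."""
    eprint: bytes
    discovery_md: dict


@dataclass(frozen=True)
class AIP:
    """Archival package: e-print, discovery and preservation metadata."""
    eprint: bytes
    discovery_md: dict
    preservation_md: dict


@dataclass(frozen=True)
class DIP:
    """Dissemination package: the e-print plus discovery metadata."""
    eprint: bytes
    discovery_md: dict


def generate_aip(sip: SIP, preservation_md: dict) -> AIP:
    # The AIP carries everything in the SIP plus preservation metadata.
    return AIP(sip.eprint, dict(sip.discovery_md), dict(preservation_md))


def generate_dip(aip: AIP) -> DIP:
    # The DIP drops the preservation metadata again.
    return DIP(aip.eprint, dict(aip.discovery_md))
```

The only structural difference between the three packages in this sketch is the preservation metadata, which exists in the AIP alone.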


Generic Workflow

[Flowchart of numbered steps in two lanes: Content Provider (Institutional Repository) and Service Provider (Preservation Service). Decision points carry Yes/No branches. Step labels recovered from the figure:]

• Submit data & metadata
• Validation successful? (No: request resubmission)
• Is metadata complete?
• Enhance metadata
• Copy SIP to repository store
• E-print in appropriate deposit format?
• Migrate to dissemination format
• Copy DIP to repository store & dissemination server
• Make available in catalogue
• Researcher (Consumer) accesses data
• Metadata transfer
• Data transfer
• Validation successful? (No: request resubmission)
• Create preservation metadata
• Generate AIP
• Transfer AIP to preservation store
• Risk assessment reveals problems? (Yes: implement migration strategy)
• Schedule obsolescence monitoring
• Deposit format obsolete?
• Create new dissemination format
• Record details of migration action


Practical Workflow

The AIP must be prepared prior to ingest into Fedora:

• Accept SIP (harvest metadata, process harvested metadata, extract digital objects)
• Generate AIP (normalise datastreams and create preservation metadata for SIP & AIP)
• Data Management (integrity check, format obsolescence, format migration, AIP additions)
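The integrity check listed under Data Management is commonly implemented as a fixity comparison. The sketch below assumes that approach: a SHA-256 digest is recorded for each datastream when the AIP is generated and recomputed later. The function names and the dictionary layout are hypothetical; the slides do not say which algorithm or tooling the project uses.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Hex SHA-256 digest of a datastream."""
    return hashlib.sha256(data).hexdigest()


def integrity_check(stored: dict, recorded: dict) -> list:
    """Return the names of datastreams that are missing or whose
    current digest differs from the digest recorded at ingest."""
    failures = []
    for name, expected in recorded.items():
        data = stored.get(name)
        if data is None or sha256_digest(data) != expected:
            failures.append(name)
    return sorted(failures)
```

An empty result means every recorded datastream is present and unchanged; any name returned is a candidate for restoration from another copy.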

Change Content Provider practices to support appropriate services:

• Ingest policy - encourage preservation formats
• Dissemination policy - encourage distribution of original deposited formats
• Licence agreement


Minimum Requirements for Preservation

Technical

1. Expose basic metadata to identify new submissions.
2. Provide some method of identifying data objects associated with a metadata record.
3. Provide some method of authenticating data objects associated with a metadata record.
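One way a repository could meet requirement 1 is to expose its records through OAI-PMH, so that the preservation service can ask for everything added since its last harvest. The slides do not name the protocol, so treat OAI-PMH here as an assumption; the function names and the example base URL are hypothetical. The sketch builds a `ListRecords` request and extracts identifiers and datestamps from a response.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace of OAI-PMH 2.0 response documents.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url: str, from_date: str,
                     metadata_prefix: str = "oai_dc") -> str:
    """Build a ListRecords request for records changed since from_date."""
    query = urlencode({
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        "from": from_date,
    })
    return f"{base_url}?{query}"


def new_submissions(response_xml: str) -> list:
    """Return (identifier, datestamp) pairs found in a response."""
    root = ET.fromstring(response_xml)
    found = []
    for header in root.iterfind(".//oai:header", OAI_NS):
        identifier = header.findtext("oai:identifier", namespaces=OAI_NS)
        datestamp = header.findtext("oai:datestamp", namespaces=OAI_NS)
        found.append((identifier, datestamp))
    return found
```

The sketch deliberately leaves out the network request and resumption tokens; it only shows how basic exposed metadata is enough to detect new submissions.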

Policy

4. Establish policies that identify preferred file formats for deposit, and inform the Producer (depositor) and preservation service provider of these requirements.
5. Create and implement a deposit licence that:
   • Establishes permission for the Content Provider to allocate responsibility for preservation to a third party.
   • Establishes permission to transform the submitted resource (e-print) for the purpose of preservation and accessibility.


Best Practice Requirements for Preservation

Technical

6. Expose a full record of all metadata stored by the IR, including descriptive, administrative and preservation metadata.
7. Provide a detailed description of the metadata schema implemented, including a list of elements and vocabulary.
8. Co-operate with the partner institution to identify methods that may be used to return metadata and data to the institutional repository.

Policy

9. Co-operate with the Preservation Service Provider to review and potentially revise ingest policies to ensure SIPs are deposited in formats appropriate for preservation.


Service Provider Responsibilities

Storage:
• Provide a permanent storage facility and disaster recovery capabilities
• Manage storage hierarchy

Preservation Planning:
• Evaluate contents of archive and undertake risk assessment
• Develop recommendations for preservation standards and policies
• Life cycle management: monitor changes in technology environment, users’ service requests, and knowledge base

Preservation Action:
• Develop and implement migration plans
• Create and manage multiple copies of content, including off-site storage
• Record appropriate information on any changes
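Evaluating the contents of the archive and assessing risk can start from a simple format profile. The sketch below is illustrative only: it assumes each archived object already has a format label, and that the preservation service keeps its own list of formats it considers at risk. The function names, the labels and the risk list are all hypothetical.

```python
from collections import Counter


def format_profile(holdings: dict) -> Counter:
    """Count archived objects per format label."""
    return Counter(holdings.values())


def at_risk_objects(holdings: dict, at_risk_formats: set) -> list:
    """Return the objects whose format is on the at-risk list."""
    return sorted(
        name for name, fmt in holdings.items() if fmt in at_risk_formats
    )
```

The objects returned by `at_risk_objects` are the natural input to a migration plan, and each migration carried out would then be recorded against the object.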


System Architecture

• Fedora Server (initially version 2.1.1).

• FedoraGSearch generic search plug-in (currently a beta version; it will be bundled with Fedora in the future)

• MySQL database server (initially version 5.0).

• Elated web interface to Fedora (used for SHERPA DP web interface)

• JHOVE (initially version 1.1)

• DROID

• Format registry (e.g. GDFR)
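Format identification tools in this kind of architecture generally work by matching byte signatures at known positions in a file. The sketch below illustrates that idea with a tiny hand-written signature table. It is not how DROID or JHOVE are invoked and does not use any format registry; the table and the function name are assumptions made for illustration.

```python
# (leading bytes, MIME type) pairs; a real registry holds far more
# signatures and also matches at offsets other than the file start.
SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"%!PS", "application/postscript"),
]


def identify_format(data: bytes) -> str:
    """Return the MIME type whose signature matches the start of data."""
    for magic, mime in SIGNATURES:
        if data.startswith(magic):
            return mime
    return "application/octet-stream"
```

Identifying by content rather than by file extension matters for preservation because extensions supplied at deposit can be missing or wrong.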


Preliminary Conclusions

• There is no out-of-the-box solution to preservation.

• The location of preservation activities is unimportant. However, appropriate repository services must exist.

• Repository interoperability is possible where appropriate standards exist.

• Preservation begins on ingest!

• Further investigation of OAIS-compliant models to represent distributed services is necessary.


Further Information

URL:

http://www.sherpadp.org.uk/

Contact

[email protected]

[email protected]