27
1 Archiving Workflow between a Local Repository and the National Library Archive Experiences from the DiVA Project Eva Müller, Peter Hansson, Uwe Klosa, Stefan Andersson Electronic Publishing Centre Uppsala University Library, Sweden

1 Archiving Workflow between a Local Repository and the National Library Archive Experiences from the DiVA Project Eva Müller, Peter Hansson, Uwe Klosa,

  • View
    218

  • Download
    4

Embed Size (px)

Citation preview

Page 1: 1 Archiving Workflow between a Local Repository and the National Library Archive Experiences from the DiVA Project Eva Müller, Peter Hansson, Uwe Klosa,

1

Archiving Workflow between a Local Repository and the National Library ArchiveExperiences from the DiVA Project

Eva Müller, Peter Hansson, Uwe Klosa, Stefan Andersson

Electronic Publishing CentreUppsala University Library, Sweden

Page 2: 1 Archiving Workflow between a Local Repository and the National Library Archive Experiences from the DiVA Project Eva Müller, Peter Hansson, Uwe Klosa,

2

focus on

• Using URN:NBN as persistence identifier for referencing and as identifier in an archiving workflow

• Workflow between a local repository and the national library archive

Page 3: 1 Archiving Workflow between a Local Repository and the National Library Archive Experiences from the DiVA Project Eva Müller, Peter Hansson, Uwe Klosa,

3

Outline

• DiVA project and its objectives• DiVA publishing system• DiVA Archive and archiving

workflow• URN:NBN and it’s role in the

workflow• Conclusions and next steps

Page 4: 1 Archiving Workflow between a Local Repository and the National Library Archive Experiences from the DiVA Project Eva Müller, Peter Hansson, Uwe Klosa,

4

DiVA ProjectDigitala Vetenskapliga Arkivet (Digital Scientific Archive)

• Since September 2000 • Objectives:

– Technical solutions & well functioning workflow supporting full-text publication of doctoral theses, working papers…

– Explore ways to ensure the future use and understanding of the digital objects in the archive

Page 5: 1 Archiving Workflow between a Local Repository and the National Library Archive Experiences from the DiVA Project Eva Müller, Peter Hansson, Uwe Klosa,

5

DiVA Publishing System

• Focus on workflow– reuse and enhance the data directly from

the source document,originally created by authors, for metadata and a digital master for an electronic & printed version

– store & checksum the files– assign a persistent identifier– send a copy to the National Library Archive

Page 6: 1 Archiving Workflow between a Local Repository and the National Library Archive Experiences from the DiVA Project Eva Müller, Peter Hansson, Uwe Klosa,

6

Author

DiVA Manager

LocalRepository

Word Processor

Word ProcessingFormat (Template)

DiVADocument

Format

Local Long-termStorage

Long-termstoragepackages

DiVADocument

Format

Page 7: 1 Archiving Workflow between a Local Repository and the National Library Archive Experiences from the DiVA Project Eva Müller, Peter Hansson, Uwe Klosa,

7

Implementation

• Java – XML technologies• Currently an Oracle database used for

indexing and searching• Architecture: component-based design

– Modularity and reusability of the components– Possibility to seamlessly replaced modules

with improved implementations of the component

Page 8: 1 Archiving Workflow between a Local Repository and the National Library Archive Experiences from the DiVA Project Eva Müller, Peter Hansson, Uwe Klosa,

8

DiVA system

• Used by 5 universities in Sweden– Stockholm, Umeå, Uppsala och

Örebro university + Södertörns högskola

• Soon 1 university in Denmark– Århus University (Staatsbibliotek i

Århus)

Page 9: 1 Archiving Workflow between a Local Repository and the National Library Archive Experiences from the DiVA Project Eva Müller, Peter Hansson, Uwe Klosa,

9

Issues

• How can we ensure the future access and understanding of documents we produce locally?

• What factors increase potential for success?

• Can these factors be integrated into an automated and low-cost workflow?

Page 10: 1 Archiving Workflow between a Local Repository and the National Library Archive Experiences from the DiVA Project Eva Müller, Peter Hansson, Uwe Klosa,

10

How can we ensure accessibility in the future?

• A stable point of reference (persistent identifier)

• Use human-readable, non-proprietary storage format

• Storage in several locations

Page 11: 1 Archiving Workflow between a Local Repository and the National Library Archive Experiences from the DiVA Project Eva Müller, Peter Hansson, Uwe Klosa,

11

How can we minimize risks for data loss?

• Multiple copies in different locations

• Mechanism to keep track of copies

Page 12: 1 Archiving Workflow between a Local Repository and the National Library Archive Experiences from the DiVA Project Eva Müller, Peter Hansson, Uwe Klosa,

12

Analysis of interest

– Authors • Dissemination of their intellectual output

– Universities• Track research output• Reduce publishing cost• Increase impact

– National libraries• Legal deposit

Page 13: 1 Archiving Workflow between a Local Repository and the National Library Archive Experiences from the DiVA Project Eva Müller, Peter Hansson, Uwe Klosa,

13

Strategies

• Decision to use XML as a primary storage format

• Decision to use URN:NBN • Decision to cooperate with the

Royal Library • Decision to fit all needs into an

automated workflow

Page 14: 1 Archiving Workflow between a Local Repository and the National Library Archive Experiences from the DiVA Project Eva Müller, Peter Hansson, Uwe Klosa,

14

DiVA Document Format

• Internal format • Version 1.0 (described in XML Schema)

– 99 elements– Component based– Extensible– Administrative elements are combined with

descriptive elements– DocBook DTD is used for the content part

of the document

Page 15: 1 Archiving Workflow between a Local Repository and the National Library Archive Experiences from the DiVA Project Eva Müller, Peter Hansson, Uwe Klosa,

15

Web Services

Author

Metadata DisseminationServices

Dublin Core

MARC 21

TEI Header

Endnote

Reference Manager

URN:NBN Register Format

Export Formats

DiVA Document Format

LocalRepository

Word Processor

Word ProcessingFormat (Template)

Content DisseminationServices

PDF

DocBook

TEI

HTML

Export Formats

Page 16: 1 Archiving Workflow between a Local Repository and the National Library Archive Experiences from the DiVA Project Eva Müller, Peter Hansson, Uwe Klosa,

16

Implementation of the Archiving Workflow• Assignment of the URN:NBN to the resources• Implementation of URN:NBN Resolution

Service• URN:NBN as a unique identifier within the

archive• URN:NBN as a naming convention for files,

directories and archival packages• URN:NBN as a part of disseminated metadata

Page 17: 1 Archiving Workflow between a Local Repository and the National Library Archive Experiences from the DiVA Project Eva Müller, Peter Hansson, Uwe Klosa,

17

Assignment of the URN:NBN

Sub domain – managed locallyStructure URN:NBN:se:X:divaURN:NBN:se:uu:diva+locally

managed serial numberURN:NBN is used as identifier for

each item – an item is a single publication without consideration of format

Page 18: 1 Archiving Workflow between a Local Repository and the National Library Archive Experiences from the DiVA Project Eva Müller, Peter Hansson, Uwe Klosa,

18

Implementation of URN:NBN Resolution Service

• Only basic functionally, more development planned

• Implemented as a java-servlet and contains an harvester which can harvest URN:URL-bindings from many different repositories

Page 19: 1 Archiving Workflow between a Local Repository and the National Library Archive Experiences from the DiVA Project Eva Müller, Peter Hansson, Uwe Klosa,

19

Resolution ServiceConfiguration File

Other

URN:NBNRegister Format

request

response

request

response

URN:NBN:se to URLmappings

requeste.g. http://urn.kb.se/resolve?urn=

responseuser redirected to an URL

DiVA

User

Royal Library

RepositoriesURN:NBN

Register Format

URN:NBNresolution

service

Page 20: 1 Archiving Workflow between a Local Repository and the National Library Archive Experiences from the DiVA Project Eva Müller, Peter Hansson, Uwe Klosa,

20

URN:NBN as a naming convention for files, directories and archival packages

Page 21: 1 Archiving Workflow between a Local Repository and the National Library Archive Experiences from the DiVA Project Eva Müller, Peter Hansson, Uwe Klosa,

21

Workflow to the National Library Archive

• Archiving packages– Administrative and descriptive

metadata stored in XML– Today – content in pdf and where

possible even in XML– Each manifestation is bundled to

AP• General data • Format specific data

Page 22: 1 Archiving Workflow between a Local Repository and the National Library Archive Experiences from the DiVA Project Eva Müller, Peter Hansson, Uwe Klosa,

22

metadatacontent

stylesheetsschemas

checksums

checksum

name: URN:NBN:se:[specific part]

Archiving Package

Page 23: 1 Archiving Workflow between a Local Repository and the National Library Archive Experiences from the DiVA Project Eva Müller, Peter Hansson, Uwe Klosa,

23

Central

Local

URN:NBNResolution

Service

Long-termStorage

LibraryCatalogue

Long-termStorage

urn:nbn:se:….urn:nbn:se:.. ->http://www…urn:nbn:se:.. ->http://www...urn:nbn:se:.. ->http://www...

List of URN:NBNto URL mappings

Metadata & Content

Repository

Long-term storage packages

Long-term storage packages MARC 21XML

Metadata

Page 24: 1 Archiving Workflow between a Local Repository and the National Library Archive Experiences from the DiVA Project Eva Müller, Peter Hansson, Uwe Klosa,

24

Conclusions • Low-cost system that supports a fully

automated workflow from the point of submission works well

• Using harvesting model for updates to the mapping registry makes the management of URN:NBN simple

• Automatic creation of MARC21 records makes cataloguing faster and less expensive

• Push model to deliver archival packages makes this process more reliable and easier to manage

Page 25: 1 Archiving Workflow between a Local Repository and the National Library Archive Experiences from the DiVA Project Eva Müller, Peter Hansson, Uwe Klosa,

25

• Use of XML for all metadata associated with each archival package increases the likelihood of future understanding of the digital objects in the archive and offers the potential to easily extract document metadata, if necessary

• Modularity of technical solution offers advantage of component reusability and is a solid basis for further local and cooperative development

• Long-term access to institutional research publications can be assured with cooperation from national libraries

Page 26: 1 Archiving Workflow between a Local Repository and the National Library Archive Experiences from the DiVA Project Eva Müller, Peter Hansson, Uwe Klosa,

26

Next steps

New project funded by Royal Library’s Department for National Co-ordination and Development (BIBSAM)– Examine and evaluate current solutions– Develop and implement a generalized

archiving workflow between a local repository and a national archive focusing on the variety of publishing platforms and systems