Steps in a Digital Preservation Workflow Bill LeFurgy, wlef@loc.gov Library of Congress March 7, 2012 Hosted by ALCTS, the Association for Library Collections and Technical Services


Page 1

Steps in a Digital Preservation Workflow

Bill LeFurgy, wlef@loc.gov

Library of Congress

March 7, 2012

Hosted by ALCTS, the Association for Library Collections and Technical Services

Page 2

What is Covered Here

• A high-level introduction to workflows in a digital preservation context

• Outline of how to conceptualize a workflow, including life cycle considerations

• Variables that influence the design and execution of workflows

• Consideration of some existing models, architectures and tools

Page 3

Workflow Advice in a Nutshell

• Start where you are

• Ask questions

  – What do you need to do?

    • Which content (limited is ok)

    • How to manage/preserve/make available (limited is ok)

  – What capabilities do you have?

    • Staff

    • Infrastructure/services

  – What is a basic workflow that you can undertake?

• Develop a model

  – Test

  – Revise, improve

  – Repeat

Page 4

What is a Workflow?

• Sequence of connected steps to accomplish an activity from start to finish

• Declared as the work of a person or group of persons

• Often repeatable over time

• Abstract representation of actual work

• Can be simple or complex
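To make the definition above concrete, here is a minimal Python sketch of a workflow as an ordered, repeatable sequence of steps. It is an illustration only: the `Workflow` class and the step names `receive`, `describe` and `store` are hypothetical and do not come from any tool or model cited in these slides.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# An "item" is a plain dictionary of facts recorded about one piece of content.
Item = Dict[str, Any]


@dataclass
class Workflow:
    """An ordered, repeatable sequence of named steps (hypothetical model)."""
    name: str
    steps: List[Callable[[Item], Item]] = field(default_factory=list)

    def run(self, item: Item) -> Item:
        # Apply each step in order and record which steps were completed.
        for step in self.steps:
            item = step(item)
            item.setdefault("history", []).append(step.__name__)
        return item


def receive(item: Item) -> Item:
    # Hypothetical step: note that the content has arrived.
    item["received"] = True
    return item


def describe(item: Item) -> Item:
    # Hypothetical step: attach a minimal description.
    item["description"] = f"Content titled {item['title']}"
    return item


def store(item: Item) -> Item:
    # Hypothetical step: note that the content has been stored.
    item["stored"] = True
    return item


# A simple three-step pilot workflow, run from start to finish on one item.
pilot = Workflow("pilot", [receive, describe, store])
result = pilot.run({"title": "Annual report 2011"})
print(result["history"])
```

Because the steps are held in a list, the same workflow can be rerun on new content, and a step can be added, removed or reordered without touching the others.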

Page 5

Workflows in a Digital Preservation Context

• Sequence of steps involved in placing digital content under preservation control (however defined)

• Highly variable according to institutional policy, capacity, content type—one size does not fit all

• Variability includes scale, maturity, complexity, process, tools, automation…

• Continual development from community experience

• Distinct from digitization! (But can be linked)

Page 6

Workflows in an Institutional Context

• Workflows are developed as part of an overall institutional approach, which is informed by current community concepts (e.g., OAIS)

• Workflows are one element of an interlinked institutional approach

http://www.dpworkshop.org/dpm-eng/program/index.html

Page 7

Planning and Starting a Workflow

Ideally, an institution will have policies that drive workflows

Goportis Project: http://www.digitalpreservationsummit.de/presentations/altenhoener.pdf

Page 8

Digital Life Cycle

In developing a workflow, consider a digital life cycle model: the basic stages content moves through, from creation to ongoing management and access over time

JISC: http://www.dlib.org/dlib/july04/beagrie/07beagrie.html

Page 9

Digital Life Cycle Models and Digital Workflows

• Concepts are closely related

• Life cycle models are high-level abstractions of stages that digital content moves through during stewardship

• Models often represented as diagrams to give the big picture of what digital stewardship involves

• Diagrams can be useful in identifying generic workflow sequences

• Diagrams vary in detail and complexity

Page 10

DigitalNZ: http://makeit.digitalnz.org/

Page 11

Digital Curation Centre: http://www.dcc.ac.uk/resources/curation-lifecycle-model

Page 12

CASPAR: http://www.casparpreserves.eu/other-caspar-products/caspar_workflow.jpg

Page 13

At the most basic level….

Workflow and Preservation Tasks

Workflows focus on concrete actions needed to process individual batches or streams of content (images, video, etc.)
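As an illustration of one concrete action applied to a batch, the Python sketch below records a checksum for every file in a batch and later checks that nothing has changed. Checksum-based integrity checking is an assumption chosen for illustration; the slides do not prescribe it. The function names `checksum`, `build_manifest` and `verify` and the sample file names are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Dict, List


def checksum(path: Path) -> str:
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(batch_dir: Path) -> Dict[str, str]:
    """Map each file in the batch (by relative path) to its checksum."""
    return {
        p.relative_to(batch_dir).as_posix(): checksum(p)
        for p in sorted(batch_dir.rglob("*"))
        if p.is_file()
    }


def verify(batch_dir: Path, manifest: Dict[str, str]) -> List[str]:
    """Return names of files that are missing or whose checksum changed."""
    current = build_manifest(batch_dir)
    return [name for name, value in manifest.items()
            if current.get(name) != value]


# Usage on a throwaway batch with made-up file names and contents.
with tempfile.TemporaryDirectory() as tmp:
    batch = Path(tmp)
    (batch / "image1.tif").write_bytes(b"example image data")
    (batch / "video1.mov").write_bytes(b"example video data")
    manifest = build_manifest(batch)
    problems = verify(batch, manifest)
    print(problems)  # empty list: nothing has changed since the manifest was built
```

The same pattern applies to any per-batch action: define the step once, run it over every file in the batch, and keep a record of the result.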

Page 14

Penn State Libraries: http://www.ijdc.net/index.php/ijdc/article/download/191/256

Narrative use cases can be used to model workflow processes

Page 15

Workflows can tie steps to specific tools

Page 16

Archivematica: http://archivematica.org/wiki/images/d/dc/Archivematica-architecture-7May2010-2.png

Page 17

Carolina Digital Repository: https://cdr.lib.unc.edu/external?page=about.technology

Workflows can refer to distributed services

Page 18

Public Record Office Victoria: http://www.dlib.org/dlib/november07/waugh/11waugh.html

Workflows can drill down into details for one process, such as ingest

Page 19

Portico: http://www.portico.org/digital-preservation/services/preservation-approach/preservation-step-by-step#step3

Workflows can be described without recourse to flow chart diagrams

Page 20

Incremental Development is the Key

• Everybody is looking to optimize and do better!

• Important thing is to establish and document basic policies, processes

• Useful to start with a pilot workflow and modify, extend as needed

• Workflows usually change over time based on experience, improved tools, other factors

• Learn by doing

Page 21

For More Information

• “A Framework for Distributed Preservation Workflows,” http://ijdc.net/index.php/ijdc/article/view/157/220

• Archivematica, http://archivematica.org/

• Carolina Digital Repository, https://cdr.lib.unc.edu

• CASPAR, http://www.casparpreserves.eu/

• Digital Curation Centre, http://www.dcc.ac.uk/resources/curation-lifecycle-model

• Goportis Project, http://www.goportis.de/en/our-services/digital-preservation.html

• Portico, http://www.portico.org/digital-preservation/services/preservation-approach/preservation-step-by-step

• “Responding to the Call to Curate: Digital Curation in Practice at Penn State University,” http://www.ijdc.net/index.php/ijdc/article/download/191/256

• “Review of Data Management Lifecycle Models,” http://opus.bath.ac.uk/28587/

• “Select for Success: Key Principles in Assessing Repository Models,” http://www.dlib.org/dlib/july07/rieger/07rieger.html

• “Taverna and myExperiment: Tools for creating and sharing workflows” (PPTX), http://wiki.opf-labs.org/download/attachments/8356515/SCAPE-IntroductionToTaverna-myExperiment-HackathonYork2011.pptx

• “The Design and Implementation of an Ingest Function to a Digital Archive,” http://www.dlib.org/dlib/november07/waugh/11waugh.html

• Wellcome Library Digital Curation Workflow (PPT), http://library.wellcome.ac.uk/assets/wtx055599.ppt

• Yale Digital Preservation Service Level 1 Matrix (PDF), http://odai.yale.edu/node/262/attachment