Planning for Digital Preservation

Planning for Preservation

Digital preservation issues come up much faster than traditional preservation issues
Digital resources need on-going attention
Build a preservation strategy into your project from the start
Keep dealing with the short-term issues and you won’t ever need to face the long-term problem

Issues

The content of digital resources is only accessible with the aid of intermediary technologies
Digital resources are complex
They rely on a specific combination of formats, software and hardware to operate correctly
I.T. develops rapidly, and resources can become obsolete very quickly

Three Key Areas

Content – the bits and bytes

Technologies: software systems; hardware; websites; access and delivery systems

Organisational

Planning for the Future

Short-term: Initial technology still current and actively supported - 0-5 years
Medium-term: Initial technology still in use and supported, but no longer used for new work - 5-10 years
Long-term: Initial technology no longer used or supported - 10+ years

….. In the Short Term

Making digital assets available

Website administration
Website updates
Software and operating system patches
Periodic backups
Periodic checks on master copies
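
A periodic check on master copies can be as simple as comparing checksums against a stored list. The sketch below is one possible approach, not a prescribed tool: it assumes the masters live in a single directory tree, and the function names (`build_manifest`, `check_masters`) are made up for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(master_dir: Path) -> dict[str, str]:
    """Map each file path (relative to master_dir) to its checksum."""
    return {
        path.relative_to(master_dir).as_posix(): sha256_of(path)
        for path in sorted(master_dir.rglob("*"))
        if path.is_file()
    }


def check_masters(master_dir: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare the current state of master_dir with a previously stored manifest."""
    current = build_manifest(master_dir)
    return {
        "missing": sorted(set(manifest) - set(current)),
        "unexpected": sorted(set(current) - set(manifest)),
        "changed": sorted(
            name
            for name in manifest.keys() & current.keys()
            if manifest[name] != current[name]
        ),
    }
```

Build the manifest once when the masters are created, keep it somewhere separate from the masters, and run `check_masters` on whatever schedule suits the project. Any entry under "missing" or "changed" is a prompt to restore from a backup.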

…….. In the Medium Term

Keeping your existing digital outputs ‘up and running’

Upgrading operating systems and software
Upgrading hardware
Replacing hardware components
Refreshing master copies
Periodic backups
Periodic checks on master copies

…….. In the Longer Term

Overcoming technological obsolescence to preserve a usable digital resource

Introducing completely new software
Replacing entire hardware systems
Enhancing functionality
Periodic backups
Periodic checks on master copies

During the Data Creation Phase

Importance of backups
Preferably more than one copy, on and off site
Appropriate frequency
More than one file format
Check your backups
But backup is not preservation!
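
"Check your backups" can be made routine with a short script. The following is a rough sketch, assuming the backup is a plain directory copy of the working (master) directory; `check_backup` is an illustrative name, and `filecmp.cmpfiles` from the Python standard library does the byte-for-byte comparison.

```python
import filecmp
from pathlib import Path


def relative_files(root: Path) -> list[str]:
    """List every file under root, as paths relative to root."""
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file()
    )


def check_backup(master_dir: Path, backup_dir: Path) -> dict[str, list[str]]:
    """Compare each master file with its backup copy, byte for byte."""
    names = relative_files(master_dir)
    # shallow=False forces a content comparison instead of trusting size and date.
    same, different, errors = filecmp.cmpfiles(
        master_dir, backup_dir, names, shallow=False
    )
    return {
        "identical": sorted(same),
        "different": sorted(different),
        "missing_or_unreadable": sorted(errors),
    }
```

Run it against each copy (on site and off site) in turn. A backup that has never been checked is only an assumption.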

What to Preserve?

Significant Characteristics

Very difficult to preserve everything (data, functionality and interaction) about a digital resource
Documented or commonly understood significant characteristics help simplify preservation action

Analogue……

Book - Significant: Words, paragraphs, chapters, author, publication date, …

Not Significant: Binding, print run, font, colour of paper, …

Newspapers - Significant: Words, paragraphs, headlines, size of type, date, page number of article, …

Not Significant: Size of page, spacing, text justification, colour of paper, …

Digital………

There is a shared understanding of what is important in a paper-based resource
Less agreement about what is important in a digital resource
Complicated to decide as software and formats support many options that are not knowingly used but have default settings

Questions to ask….

What are the significant characteristics of your digital outputs?

What are the digital objects that make up your resource?
What is the purpose of your digital resource?

Think about the problem in terms of content and purpose
Very difficult (if not impossible) to ensure your resource stays exactly the same in the future
What can change without adverse effects?
What changes must be limited, and by how much?
How can you check changes are acceptable?

Assessing the scale of the Preservation Task

Estimating volume and type:
Textual Documents
Still Images
Moving Images
Audio files
Numeric dataset
Database
Markup Documents (XML etc.)
CAD
GIS
Virtual reality
Website
Software executable
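
A first estimate of volume and type can be produced by walking the project directory and tallying files by category. This is a minimal sketch: the extension-to-category table is a hypothetical starting point that covers only a few of the types above and should be extended to match the formats your project actually uses.

```python
from collections import defaultdict
from pathlib import Path

# Hypothetical mapping from file extension to resource type; adjust as needed.
CATEGORIES = {
    ".txt": "Textual documents",
    ".doc": "Textual documents",
    ".pdf": "Textual documents",
    ".tif": "Still images",
    ".jpg": "Still images",
    ".mpg": "Moving images",
    ".mov": "Moving images",
    ".wav": "Audio files",
    ".mp3": "Audio files",
    ".csv": "Numeric datasets",
    ".xml": "Markup documents",
    ".html": "Website",
}


def summarise_collection(root: Path) -> dict[str, dict[str, int]]:
    """Count files and total bytes per resource type under root."""
    totals = defaultdict(lambda: {"files": 0, "bytes": 0})
    for path in root.rglob("*"):
        if path.is_file():
            category = CATEGORIES.get(path.suffix.lower(), "Unclassified")
            totals[category]["files"] += 1
            totals[category]["bytes"] += path.stat().st_size
    return dict(totals)
```

A large "Unclassified" total is itself useful information: it points to formats that nobody has yet reviewed.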

Risk Assessment for file formats used

Review data types and file formats

Assess the risks associated with those file formats

Establish policy for dealing with them
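
The three steps above (review, assess, establish policy) can be recorded in a simple risk register. The sketch below is illustrative only: the four criteria, the one-point-per-factor scoring and the policy wording are assumptions chosen to echo the questions asked elsewhere in these slides, not an established scoring scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FormatProfile:
    """Answers to a few yes/no questions about one file format."""
    name: str
    open_specification: bool
    licence_fee: bool
    widely_used: bool
    tools_on_several_platforms: bool


def risk_score(profile: FormatProfile) -> int:
    """Count risk factors: one point for each unfavourable answer."""
    return sum([
        not profile.open_specification,
        profile.licence_fee,
        not profile.widely_used,
        not profile.tools_on_several_platforms,
    ])


def policy_for(profile: FormatProfile) -> str:
    """Turn a risk score into a (hypothetical) policy decision."""
    score = risk_score(profile)
    if score == 0:
        return "keep and review periodically"
    if score <= 2:
        return "keep, but add to technology watch"
    return "plan migration to a lower-risk format"
```

Filling in one `FormatProfile` per format found in the collection gives a documented basis for the policy, which matters more than the exact thresholds.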

Preservation Metadata

Metadata needed to manage preservation of digital collections: technical; administrative
Not necessarily a “complete set” of preservation metadata elements
Possible sources:

OCLC/RLG Working Group; the Consultative Committee for Space Data Systems; CEDARS project; The UK National Archives (formerly the Public Record Office); Arts and Humanities Data Service; NEDLIB project; California Digital Library; Harvard University Library

File Structure

Create an overview of the file structure
Create a list of all files
Create a logical file strategy from the outset

Choose consistent filenames
Avoid re-using the same filename, even in separate folders.
Store files in a logical order with systems and contents files kept apart.
Summary of contents may be included with each file.
Keep a record of encryption keys – important for preservation.
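
Both the list of all files and the check for re-used filenames are easy to automate. A minimal sketch, assuming the project sits under one root directory; the function names are illustrative, and filenames are compared case-insensitively on the assumption that the files may one day move to a file system that ignores case.

```python
from collections import defaultdict
from pathlib import Path


def list_files(root: Path) -> list[str]:
    """Return every file under root as a sorted list of relative paths."""
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file()
    )


def reused_filenames(root: Path) -> dict[str, list[str]]:
    """Find filenames that occur in more than one folder."""
    by_name = defaultdict(list)
    for relative_path in list_files(root):
        by_name[Path(relative_path).name.lower()].append(relative_path)
    return {name: paths for name, paths in by_name.items() if len(paths) > 1}
```

The output of `list_files` can be saved alongside the project documentation as the file list, and `reused_filenames` flags the names that need changing.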

Preservation Strategies: Content

Migration: convert the data to work with new applications
Emulation: convert the data, application (and operating system) to work on new hardware
Technology preservation: keep everything running
Virtual computing: create a standard ‘virtual’ runtime environment
Migration on demand: convert original format directly into up-to-date format

Theory ----- Practice

In practice, migration is the simplest and most common approach
Limitations of migration are:

Can be difficult to ensure accurate migration
Does not capture functionality, only (possibly partial) data
May need to be repeated frequently
Might lead to ‘mutation’ over time
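
Accuracy is easier to ensure when every migration is followed by an automatic check. The sketch below uses a deliberately simple case, moving a text file from a legacy character encoding to UTF-8, and then confirms that the characters are unchanged. The function names are illustrative, and the source encoding must be known in advance, because a wrong guess can decode without error and still give the wrong text.

```python
from pathlib import Path


def migrate_text(source: Path, target: Path, source_encoding: str) -> None:
    """Re-encode a text file as UTF-8, leaving the source file untouched."""
    # Strict decoding raises an error on bytes that are invalid in source_encoding.
    text = source.read_bytes().decode(source_encoding)
    target.write_bytes(text.encode("utf-8"))


def migration_preserved_text(source: Path, target: Path, source_encoding: str) -> bool:
    """Check that the migrated file decodes to exactly the same characters."""
    original = source.read_bytes().decode(source_encoding)
    migrated = target.read_bytes().decode("utf-8")
    return original == migrated
```

This only verifies content, which is the point made above: a check like this says nothing about functionality. Keeping the original file until the check passes, and running the same check after each later migration, is one way to catch ‘mutation’ early.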

Migrating to new standards – but which one?

"The good thing about standards is that there are so many to choose from" (A. Tanenbaum)

Quicktime 1.0 (1992)
MPEG-1 (1992)
Real Media (1995)
MPEG-2 (1996)
RealVideo (1997)
MPEG-4 (1999)
Quicktime 5.0 (1999)
Active Streaming Format (1999)
DIVX 5.0 (2002)

The number of A/V “de-facto” standard formats has exploded in the past five years, and this does not cover the dozens of audio and video codec combinations!

Measuring Longevity of Standard

Who developed it?
  Microsoft, Moving Picture Experts Group (MPEG), etc.

Has it received mainstream support?
  Can your hardware save data in that format?

What organisations are using it?
  Is it used in industry?

Is it widely accepted by the professional and amateur community?

Technology watch – check web sites, developer forums and newsgroups.

Has it been submitted as an ISO standard?

Measuring Longevity of Standard

Are there any legal actions to change the standard?

Is there a licensing fee?

What tools are available to create and manipulate the format?

  Open source vs. proprietary
  PRONOM – The National Archives' database of 250 software products, 550 file formats and 100 manufacturers

Can I execute these tools on my computer?
  Java, Windows-only, Mac-only

Choosing a Suitable Migration Path

What are the main features?
  Small file size, streaming support

Will it support your specialist needs?
  Subtitles, DRM, Internet delivery, etc.

Does it provide sufficient quality?
  Lossless vs. lossy compression.

Will it impose any restrictions on use?
  Can it actually be played by your target audience?

Is the standard stable or does it change frequently?

How will this affect your desire to use the format?

Migration problems

Have you encountered any problems when accessing these files in other applications?

  Quirks (text not displaying, desynchronised audio/video, upside-down video playback).
  Version incompatibilities

Migrating to other formats
  Are there any other problems when exporting to other formats? E.g. lossless-to-lossless conversion, uneditable output
  Document quirks and incompatibilities for later reference.

Updating Hardware

Hardware has changed dramatically in the last 3 years

Memory – DDR vs. SD-RAM
CPU – pin compatibility
Graphics cards – AGP 2x, 4x, 8x
Operating system – will Windows NT4/98 run on newer hardware?

Do you upgrade existing hardware or replace it with new equipment?

Updating Software

Software changes on a frequent basis
Four service packs available for Windows 2000.
Microsoft issues 3 patches per week on average.
Legal action forces changes to plug-in handling.
In addition, there are an estimated 20 unpatched vulnerabilities in Internet Explorer alone (PivX Solutions).

Do you upgrade to a later operating system or continue to use an operating system & software with known security flaws?

Preserving Your Website: technical issues

Standards And Formats
  Has the Web site been designed using open standards, which should help future-proofing?
  Have proprietary formats been used (for which backwards compatibility may not be considered)?

Architecture & Implementation
  Has the technical architecture of the Web site been documented?
  Can you continue to use technical systems after funding has finished?

Preserving Your Website: content issues

Accuracy:
  Is the content of the Web site accurate today?
  Who will make changes, and how?
  Could the content of the Web site be misleading in the future?

Usability:
  Maintaining links – short, medium and long term

Legal:
  Is the Web site legal (accessibility; copyright; defamation; IPR; …)?
  Will the Web site be legal tomorrow, if new legislation is enacted?
  How will you know – who will make necessary changes?

Maintaining a Website

Run a link check across the Web site. Fix broken internal links and as many external links as is reasonable. Document the link report.

Run HTML (and CSS) validation checks across the Web site. Fix as many invalid pages as is reasonable. Document the findings.

Run an accessibility check across the Web site. Fix as many inaccessible pages as is reasonable. Document the findings.
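
Dedicated link checkers and validators exist for all three checks. As an illustration of what the link check involves, here is a minimal sketch that works on a local copy of the site using only the Python standard library. It tests internal links against the files on disk and merely lists external links for separate checking. The names `LinkCollector` and `check_page` are made up for this example.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote, urldefrag, urlparse


class LinkCollector(HTMLParser):
    """Collect the values of href and src attributes from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def check_page(page: Path, site_root: Path) -> dict[str, list[str]]:
    """Report broken internal links and list external ones for a single page."""
    collector = LinkCollector()
    collector.feed(page.read_text(encoding="utf-8"))
    report = {"broken_internal": [], "external": []}
    for link in collector.links:
        target, _fragment = urldefrag(link)  # ignore "#section" parts
        parsed = urlparse(target)
        if parsed.scheme or parsed.netloc:
            # Anything with a scheme (http:, mailto:, ...) is not checked here.
            report["external"].append(link)
        elif parsed.path:
            path = unquote(parsed.path)
            base = site_root if path.startswith("/") else page.parent
            if not (base / path.lstrip("/")).exists():
                report["broken_internal"].append(link)
    return report
```

Running `check_page` over every HTML file and saving the combined output gives the documented link report asked for above.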

Maintaining a Website

Address technical areas:

Remove any backend scripts which are no longer needed

Remember that scripts, etc. are liable to go wrong.

Ensure that applications are configured to break gracefully and provide meaningful errors – tell users who to contact if they find an error
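
What "break gracefully" might look like in a backend script is sketched below. It is not tied to any particular web framework: the decorator wraps a request handler, logs the technical details for the maintainer and returns a readable message to the user. The contact address and function names are placeholders.

```python
import logging
from functools import wraps

CONTACT = "webmaster@example.org"  # placeholder contact address

logger = logging.getLogger("site")


def fail_gracefully(handler):
    """Wrap a handler so that unexpected errors produce a helpful message."""

    @wraps(handler)
    def wrapper(*args, **kwargs):
        try:
            return handler(*args, **kwargs)
        except Exception:
            # Full traceback goes to the log, not to the user.
            logger.exception("Unhandled error in %s", handler.__name__)
            return (
                "Sorry, something went wrong while handling your request. "
                f"Please report this to {CONTACT}."
            )

    return wrapper
```

The user sees who to contact, and the maintainer still gets the detail needed to fix the script, even years after the original developer has moved on.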

Procedures framework

From start to finish:
Creation and Management Manuals within Procedures Framework

Key File Format Conversion Guides

Digital Object Preservation Handbook: a ‘how-to’ guide

Options for Ensuring Preservation

Once a project is completed……………

Live (supported) system
Archived
Organisational Repository
‘Shelved’
Abandoned

Not Recommended……..

Abandoned
  May be appropriate, but probably isn’t; think about archiving the resource instead

‘Shelved’
  Don’t - shelving a digital resource without active, on-going attention is highly likely to result in its loss
  Media degradation
  Software and hardware obsolescence
  Loss of knowledge about the resource

Recommended……. But Think About

Live System
  Importance of functionality/interface
  Organisational buy-in: who is running the system, and what is their commitment to it?
  What will happen if the system is shut down?
  Is the digital resource completed or on-going?
  Who pays?

Recommended…… But Think About

Deposit in an Archive
  Is the digital resource going to a trusted archive?
  Are only some aspects of the resource being archived?
  Will it be available for others to use?
  Will the resource be updated in the future?
  Costs?

Recommended…….. But think about

Establish a Repository
  Business model and financial plan
  Management and administrative processes
  Policies and procedures
  Systems and tools
  Software and hardware
  Resource curation
  Metadata and documentation
  Preservation management

Establishing Requirements

A pragmatic approach – workable and achievable
Preservation requirements
Establish common practices, procedures and use of standards
Investigate and establish hardware, systems, and tools requirements
Investigate and evaluate products
Business planning and costings

Developing the Architecture

The architecture must support:
  The entire activity cycle including ingest, data management, storage, long term preservation, discovery, access and delivery
  All necessary security aspects
  Complex resources
  Discovery and delivery options

Summary

Build in preservation right from the start
Document decisions/policies/procedures
Balance longevity with innovation
Be ruthless about what you must keep and what can be discarded
Think content and functionality
Planning
  It’s a continuous process – not a one-off