Working with a Preservation Software Vendor - The Kentucky Experience Glen McAninch Kentucky Department for Libraries and Archives November 2014 Best Practices Exchange Montgomery, Alabama

Working with a Preservation Software Vendor - The Kentucky Experience

Glen McAninch

Kentucky Department for Libraries and Archives

November 2014

Best Practices Exchange

Montgomery, Alabama

Who We Are

Kentucky Department for Libraries and Archives

• State Library

• State Archives

• State Records Management

Background for Using Preservica

1996 Began downloading publications

2003 Began archiving governors’ websites

2006 e-Archives available to public

2009 Launched DSpace application

2010 GeoMAPP grant project – Demo of Safety Deposit Box (SDB)

2010 Began evaluation of Archive-It – Became full member in 2012

2012 (Nov.) Purchased Preservica Service

2012 Began a pilot of Preservica Service

2011 Participated in a test audit for ISO Standard for Trustworthy Digital Repositories

KDLA Collection File Types

• Web sites

• Publications

• Minutes

• Geospatial datasets (maps)

• Databases

• Digital images

• Video

• Audio recordings

Volume of Electronic Collections

• Approximately 7 TB in e-Archives (includes multiple copies)

  – 1.38 TB on file server

    • 1/2 TB of high-resolution digital photos (Preservica)

    • 1/3 TB of textual material, digital photographs, audio/video, and website snapshots

  – 1.6 TB of GIS data (vector data and imagery)

  – 300 GB of databases (governors’ constituent mail and vital statistics)

  – 100 GB of scanned images (historic documents) (DSpace)

KDLA Electronic Collections

• DSpace is currently the primary access tool

– <15,000 records in DSpace

– ~160 GB in files in Asset Store of DSpace

– Some publications cross-linked to library catalog

But several things are NOT in DSpace

– Snapshots of elected officials’ websites (Archive-It)

– GIS data

  • Some shapefiles and map images in DSpace

  • Geodatabase snapshots on file server and in Preservica

– Photos (governors & some agencies) in Preservica

– Backlog waiting to be added

Support for DSpace

• Open source software

• Large consortium to plan updates

• Changed from in-house support to State technology unit support

• Moved from open source database to Oracle

• Greater expertise in managing & customizing – good wiki to diagnose problems

• Greater cost in the state system

Business Challenges

• Automate largely distributed process of ingest and preservation to lessen cost of labor and be more efficient

• Import effectively from our access system

• Establish connections from our network to Preservica and back again

• Gain intellectual and physical control over growing number of electronic records

• Manage the holdings through time

Tech Support Small % of Budget

[Bar chart for Preservica & DSpace. Categories: Labor, Storage & servers, Outsource services, Technology support, Conversion, Training. Vertical axis runs from 0 to 160, labeled Low to High.]

KDLA Customization of Preservica

• 6-week test of Preservica in spring of 2012

• Purchased Preservica in November 2012

• KDLA customization needs

– Workflow: began building our own workflow using existing tools (January 2013)

– DSpace import (early 2014)

– Improvement of uploads (August 2014?)

– Public interface: Version 1 (August 2014), Version 2 (October 2014)

– Management of local copy (in progress)

Preservica as a Business Solution

• Tools (full preservation services):

– Use of templates to automate descriptive and accession metadata insertion

– Automated technical data extraction from files

– Format normalization and migration

• Administrative control of the archive

• Cloud storage

• Little or no on-site maintenance

• Search features enable batch operations

Dashboard with Customizable Graphs

Ingest and Normalization Processes

• Format validation

• Fixity checking with checksum

• Virus checking

• Metadata extraction

• Metadata generation through templates

• Creation of thumbnails

• Normalization of format
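The fixity step in the list above can be illustrated with a minimal sketch. It assumes SHA-256 as the checksum algorithm and hypothetical function names (`sha256_of`, `fixity_ok`); the slides do not state which algorithm or interface Preservica uses, so this is only a generic illustration of checksum-based fixity checking.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_ok(path: Path, recorded_digest: str) -> bool:
    """Compare the file's current digest with the one recorded at ingest."""
    return sha256_of(path) == recorded_digest.lower()
```

A digest recorded at ingest and recomputed later should match; any mismatch signals that the file changed or was damaged in storage.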

Preservica SIP Creator

• Locate batch of files to accession

• Select structures

• Select collection from pick list

Preservica SIP Creator (lower)

• Draw metadata from folder titles or preassign to a field

• Apply integrity checks

• Attach metadata (accession) to the package as template
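Drawing metadata from folder titles, with preassigned values applied on top, can be sketched as follows. The folder layout (agency/series/file), the field names, and the function `metadata_from_path` are all assumptions for illustration; this is not SIP Creator's actual behavior or API.

```python
from pathlib import PurePosixPath
from typing import Dict, Optional


def metadata_from_path(
    relative_path: str, preassigned: Optional[Dict[str, str]] = None
) -> Dict[str, str]:
    """Derive descriptive fields from folder titles (hypothetical layout)."""
    parts = PurePosixPath(relative_path).parts
    if len(parts) < 3:
        raise ValueError("expected a path shaped like agency/series/file")
    record = {
        "agency": parts[0],                        # top-level folder title
        "series": parts[1],                        # second-level folder title
        "title": PurePosixPath(parts[-1]).stem,    # file name without extension
    }
    # Preassigned values (for example, accession fields from a template)
    # are added last and override anything derived from the path.
    record.update(preassigned or {})
    return record
```

For example, `metadata_from_path("Governor/Photographs/inauguration_001.tif", {"accession": "2014-001"})` yields the agency, series, and title from the path plus the preassigned accession value (the sample path and accession number are made up).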

Workflows are Customizable

Ingest for Preservation

Preservation Processes

• Virus scan

• Characterization and format validation

• Normalization & creation of DIPs (original and derivative managed together)

• Generates checksums

• Distributes metadata and indexes it

• Generates thumbnails

Agency/Series/“Deliverable unit” Organization

Accession Metadata From Template

Dublin Core Descriptive Metadata

Administration Reports

Explorer Searching Capabilities

Integrity Checking

• Each storage unit customizable by:

– Volume size (portion of repository)

– Days between checks

– Repair damaged files manually (automated not practical)
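The "days between checks" setting can be pictured with a small scheduling sketch. The `StorageUnit` class, its fields, and `units_due` are hypothetical names chosen for illustration; they do not describe Preservica's configuration model.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class StorageUnit:
    name: str                  # label for a portion of the repository
    days_between_checks: int   # configured interval for this unit
    last_checked: date         # date of the most recent integrity check


def units_due(units: List[StorageUnit], today: date) -> List[str]:
    """Return the names of units whose check interval has elapsed."""
    return [
        unit.name
        for unit in units
        if today - unit.last_checked >= timedelta(days=unit.days_between_checks)
    ]
```

Units with a shorter interval come due more often, so a small or high-value portion of the repository could be checked more frequently than bulk storage.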

Benefits

• Automate the ingest and preservation processes

• Accession information addresses elements of provenance for users

• Migration of formats in the future

• Continual integrity checking

• Export to both a local copy and display copy

• Collaboration to keep improving the product, such as adding access mechanisms

Development of a Preservica Universal Access Interface

• Preservica conference calls with user group to determine needs

• Sample constructs were presented

• Users suggested supplementing search with browse

• Need to customize metadata displayed

• Customizing of opening page

• Search result filtering

• Thumbnails and viewers

Customization of Universal Access Site

• Use of the WordPress-based interface

– KDLA name and logo

– Selection of collections to feature with images

– Selection of background and font colors

• Customization within UA transform code

– Selection of fields to display at various levels, including Dublin Core descriptive elements and Preservica-generated preservation metadata

KDLA: Opening Page & Detail Page

• Browsing collections

• Keyword searching

• “Featured” or “other items of interest” allows further browsing

• Images and descriptions of featured collections selected by archivist

KDLA: Browse Collections (Agencies)

• Browse collections by agency and view the series under each agency

• The user can select “view details” for metadata

• Browse a collection further

KDLA: Gallery Display of Images

• Thumbnail display of folders with a single image allows the user to quickly review the images in a gallery display

KDLA: Folder with Map Images

• This record of map images imported from our DSpace repository shows Dublin Core metadata at the bottom

• Allows users to view many images in an additional display.

KDLA: Video Record for Download or Play

• This screen allows users to either download or play video

• When “play video” is selected, the video is automatically converted to a version that plays on users’ computers

• KDLA has not yet loaded lengthy videos

KDLA: Broad Search “Agriculture”

• Searching with broad keyword terms like “Agriculture”

• Filter by using the “Refine your search” feature to retrieve at the file, record, or collection levels

• Search only in a specific collection

KDLA: Keyword Search for “bees AND statistics”

• Full-text searches of keywords in documents as well as metadata fields

• Boolean operators can also be applied to refine a search
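A Boolean AND query such as “bees AND statistics” can be sketched in a few lines. This is a toy illustration over an in-memory dictionary, assuming single-word terms joined by an uppercase AND; the names `tokenize` and `search_and` are hypothetical, and the sketch says nothing about how Universal Access implements search.

```python
import re
from typing import Dict, List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search_and(query: str, documents: Dict[str, str]) -> List[str]:
    """Return ids of documents containing every term joined by AND."""
    terms = [t.strip().lower() for t in re.split(r"\s+AND\s+", query) if t.strip()]
    return [
        doc_id
        for doc_id, text in documents.items()
        if all(term in tokenize(text) for term in terms)
    ]
```

Each additional AND term can only narrow the result set, which is why Boolean operators help refine a broad keyword search.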

Summary of Preservica Universal Access Capabilities

• Online public access to preserved records

• Customizable interface

• Commitment to adding features

• Follows suggestions by customers

• Open source tool for managing opening page

Preservica User Group Activities

• State Archives members at SAA meeting

– Michigan, Kentucky, New York, Wisconsin, Vermont

– Alabama (pilot)

• Regular meetings

– US group began at SAA 2014

– Sarah Grimm (Wisc.) chair

• Blog/Press releases

• Collaboration on development

– DSpace (Kentucky)

– Universal Access (Michigan and Kentucky)

Preservica Development

– Email:

  • Microsoft Email PST Ingest (October 2014)

  • Email Rendering (October 2014)

  • Email Search (October 2014)

– Records management – exchange with Enterprise Content Management Systems based on retention schedules

  • Lotus Notes Export/Ingest (previous)

  • SharePoint/Outlook Export/Ingest (October 2014)

  • FileNet, HP TRIM, and OpenText Export/Ingest (2015)

– Access systems – export/import from:

  • DSpace (Spring 2014)

  • CONTENTdm (October 2014)

– Universal Access

  • Launched (July 2014)

  • Upgrade (October 2014)

Contact Information

• Glen McAninch – Kentucky Dept. for Libraries and Archives

– Manager of Technology Analysis & Support Branch

[email protected]

• Beth Shields – Kentucky Dept. for Libraries and Archives

– Technology Analysis & Support Branch

[email protected]