Harvard Library’s Digital Preservation Repository, the Digital Repository Service (DRS)
NISO-NFAIS Joint Virtual Conference
Andrea Goethals, HL Digital Preservation Services 12/7/2016
AGENDA
• DRS Overview
• Highlights of Current Work
• Challenges
• Future Work
• Q&A
DRS
Overview
Current work
Challenges
Future work
Q & A
WHAT IS THE DRS?
• Harvard-maintained service for digital content for:
  • long-term preservation
    • keep the content safe
    • keep the information usable long-term on modern platforms
  • delivery to users
• A service - not storage or a tool
  • Includes preservation & IT staff actively monitoring the content and systems
  • Includes documented policies, practices & preservation plans
  • Uses technology & systems but these change over time
KEY POLICIES
• What can be deposited into the DRS
• Who can deposit to the DRS
• Obligations of collection managers
• Responsibilities of DRS staff
• Retention policies
• Discovery & access policies
• Delivery services
• Preservation services
DRS Policy Guide: http://hul.harvard.edu/ois/systems/drs/policyGuide/DRS_Policy_Guide-Printable.pdf
KEY STRATEGIES
• Format guidance (preferred & accepted for deposit)
• Deposit tools with automatic technical characterization
• Content validated against documented content models
• Constant bit integrity checking
• Regular storage refreshes
• Format migrations
• Expert networks
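The bit integrity checking strategy listed above can be sketched as a periodic fixity audit: recompute each stored file's checksum and compare it with the value recorded at deposit. This is a minimal illustration, not the DRS implementation. The choice of SHA-256, the manifest layout (relative path mapped to hex digest), and the function names `sha256_of` and `check_fixity` are all assumptions made for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(manifest: dict[str, str], root: Path) -> dict[str, str]:
    """Compare stored files against a manifest of expected digests.

    `manifest` maps a path relative to `root` to the digest recorded at
    deposit time (a hypothetical layout chosen for this sketch).
    Returns a status per path: 'ok', 'mismatch', or 'missing'.
    """
    report = {}
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file():
            report[rel_path] = "missing"
        elif sha256_of(target) == expected:
            report[rel_path] = "ok"
        else:
            report[rel_path] = "mismatch"
    return report
```

In a repository that keeps several copies, a 'mismatch' or 'missing' result on one copy would typically trigger a repair from another copy that still verifies.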
WHAT’S IN THE DRS?
• 63 million files, 204 TB per copy
• Many formats
  • Images, audio, text, digitized books, web sites, documents, biomedical image stacks, email, “opaque objects” and soon video
  • Primarily digitized images and text
[Chart: Image ~ 2/3; Text ~ 1/3; rest of formats (audio, documents, etc.)]
TECHNOLOGY
• Modular architecture
  • Front ends (deposit, delivery, management)
  • Middleware (APIs)
  • Back end (preservation storage, database, index)
• Combination of:
  • Third-party tools
    • Open-source and/or free software (Linux, Apache tools, Java, SOLR, etc.)
    • Commercial off-the-shelf software (Oracle, LuraTech Image Server)
  • Custom services (for authorization, authentication, persistent naming, viewing/updating metadata, deleting content, etc.)
COLLABORATIVELY MANAGED
DRS Business Owner: Digital Preservation Services
Key Responsibilities:
• Preservation & usage policies, strategies
• Manage & communicate about service
• Represent users’ & content’s preservation needs
• Define high-level enhancement roadmaps
• Preservation plans
• Preservation outreach, consulting, guidelines
DRS Technology Owner: Library Technology Services
Key Responsibilities:
• Technology & security policies, strategies
• Manage hardware, software, development
• Bug fixes & enhancements
• System monitoring & scaling
• Refine roadmaps based on resources
• System testing & documentation
• User support & training on systems
HIGHLIGHTS OF CURRENT WORK
• Metadata migration
• Rollout of video preservation services
METADATA MIGRATION PROJECT
• Last piece of the DRS2 Project (move to the next-generation DRS)
  • Previously completed:
    • Transition all infrastructure to standards-based object model and metadata schemas
    • New modern management tools and API-based services layer
    • Support for more formats
  • Metadata migration
    • Re-describe all the content at the object and file-level
    • Result: more accurate and detailed metadata to support curatorial management and preservation planning
VIDEO SERVICES PROJECT
• Part of a larger project to add DRS support for formats most-requested by curators:
  • video
  • word processing
  • CAD (2D and 3D)
  • disk images
  • RAW camera images
  • (image sequences for scanned film)
• New fast-tracking process working with consultants to help with the analysis
VIDEO SERVICES PROJECT
• DRS support
  • Video content model
  • DRS tool enhancements
    • Deposit (FITS, etc.)
    • Delivery (Streaming Delivery Service based on JWPlayer)
• Media Preservation Services enhancements
  • Purchased video playback technology, professional digitization technology, additional SAN storage
  • Custom routing of data for deposit from MPS SAN to the DRS loader
  • Wrote custom DRS deposit tools
  • Staff training
CHALLENGES
• Long-running back-end projects, for example:
  • Format migrations
  • Architecture migrations
  • Metadata schema migrations
  • Data model migrations
  Invisible to most + resource-heavy!
• Support for the long tail of formats
  • Necessary but arguably less-impactful
LONG TAIL
[Bar chart: Curator Requests to Add Format Support to the HL DRS (2004-2016). Share of requests per format, in descending order: 19%, 15%, 10%, 8%, 8%, 7%, 6%, 6%, 6%, 5%, 5%, 1%, 1%, 1%, 1%, 1%, 0%]
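To make the long-tail shape concrete, the rounded percentages from the chart can be accumulated to see how few formats cover most of the requests. This is only a worked-arithmetic sketch: the values are the chart's rounded shares in descending order, the format names behind each bar are not given here, and `categories_needed` is a helper name invented for the example.

```python
from itertools import accumulate

# Rounded shares (%) of curator requests per format, as shown in the chart.
shares = [19, 15, 10, 8, 8, 7, 6, 6, 6, 5, 5, 1, 1, 1, 1, 1, 0]

# Running total after each successive format, largest first.
cumulative = list(accumulate(shares))


def categories_needed(target_percent: int) -> int:
    """Return how many of the most-requested formats reach the target share."""
    for count, total in enumerate(cumulative, start=1):
        if total >= target_percent:
            return count
    raise ValueError("target exceeds the total of the listed shares")


print(categories_needed(50))  # 4 formats cover 52% of requests
print(categories_needed(80))  # 9 formats cover 85% of requests
print(sum(shares[-6:]))       # the last 6 formats together account for 5%
```

On these figures, the six least-requested formats together account for about 5% of requests, which matches the "necessary but arguably less-impactful" point on the Challenges slide.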
FUTURE WORK
• Format migrations (RealAudio, SMIL, Kodak PhotoCD)
• Easier deposits
  • User-friendly for humans
  • More automated streams from systems inside and outside of Harvard
• Medium-term preservation
• Full support through delivery for disk images, CAD, email
Q & A