Challenges of Digital Media Preservation Karen Cariani, Director Media Library and Archives Dave MacCarn, Chief Technologist


Challenges of Digital Media Preservation

Karen Cariani, Director, Media Library and Archives
Dave MacCarn, Chief Technologist

Who we are: WGBH Media Library and Archives

• Preservation needs are more complicated
  - New and changing content formats
  - Network connections
  - Software
  - Storage media
  - Hardware
• Access expectations challenging
  - Faster access
  - Anywhere, anytime

Transition challenges (Analog to Digital)

Content formats

Storage and retrieval

How do we:

• Capture the audio and video generated by myriad cameras
• Store the project information to allow potential re-edit
• Store files with rich, meaningful metadata
• Store born-digital materials
• Display and retrieve born-digital materials

Access: Organizational Issues

Metadata

• Descriptive metadata
  - Need description for video to be useful, findable
  - How to capture that
  - How to make sure it is linked to video files

Folder Structure

• Create folders by card
  - Assign unique number
  - Continue numbers
  - Add description
  - Place ENTIRE card contents into this folder!!
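The folder-by-card steps above can be sketched in a few lines of Python. This is only an illustration, not the workflow actually used: the zero-padded `NNNN_description` naming scheme and the function names `next_card_number` and `ingest_card` are assumptions.

```python
import shutil
from pathlib import Path


def next_card_number(archive_root: Path) -> int:
    """Continue numbering from the highest card number already in the archive."""
    numbers = []
    for folder in archive_root.iterdir():
        prefix = folder.name.split("_", 1)[0]
        if folder.is_dir() and prefix.isdigit():
            numbers.append(int(prefix))
    return max(numbers, default=0) + 1


def ingest_card(card_root: Path, archive_root: Path, description: str) -> Path:
    """Copy the ENTIRE contents of one camera card into a new numbered folder."""
    archive_root.mkdir(parents=True, exist_ok=True)
    number = next_card_number(archive_root)
    # Hypothetical naming scheme: unique number plus a short description.
    safe_description = "_".join(description.split())
    target = archive_root / f"{number:04d}_{safe_description}"
    # copytree preserves the card's own directory layout and raises an
    # error if the target folder already exists, so nothing is overwritten.
    shutil.copytree(card_root, target)
    return target
```

Copying the whole card tree, rather than picking out individual clips, keeps any sidecar files the camera wrote next to the video.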


Original footage

© 2011 WGBH


Proposed tapeless workflow

• Create a mapping document between FileMaker and the DAM
  - Used to generate an XML stylesheet
• Video is ingested simultaneously with the metadata from FileMaker using the XML stylesheet
• Technical metadata is ingested simultaneously with the video and production data using the XML generated by the source digital files
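A minimal sketch of the mapping idea, assuming a plain Python dictionary stands in for the mapping document and the XML is built directly rather than through a stylesheet. The FileMaker field names, the DAM element names, and `build_dam_record` are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping document: FileMaker field name -> DAM element name.
FIELD_MAP = {
    "Card Number": "identifier",
    "Shoot Description": "description",
    "Shoot Date": "dateCreated",
}


def build_dam_record(filemaker_row: dict, video_filename: str) -> ET.Element:
    """Build one XML record that carries the video reference and its metadata."""
    record = ET.Element("record")
    ET.SubElement(record, "videoFile").text = video_filename
    for fm_field, dam_element in FIELD_MAP.items():
        value = filemaker_row.get(fm_field)
        if value:  # skip fields left empty in the production database
            ET.SubElement(record, dam_element).text = str(value)
    return record


row = {"Card Number": "0001", "Shoot Description": "Interview", "Shoot Date": ""}
print(ET.tostring(build_dam_record(row, "0001_clip.mov"), encoding="unicode"))
```

Keeping the video filename and the descriptive fields in the same record is what links the description to the file at ingest time.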


Challenges - again

• Access issues
  - File size
  - Formats: for playback
  - Usable
  - Search/findable
    - Metadata
    - Organize files
• Preservation issues
  - Copies
  - Formats: for migration
  - Being able to play again later
  - Speed of access (big file size): to use/process
  - Migration ease

• File management
  - Where are the files?
• Needed for access to files
  - Large preservation files
  - Smaller access, proxy files
• Network speed
  - Larger files need a faster network to meet speed expectations
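The link between file size and network speed is simple arithmetic. The sketch below is illustrative only: the 100 GB preservation file, 1 GB proxy, and 1 Gbit/s link are assumed figures, not measurements from this project, and `transfer_seconds` is a hypothetical helper.

```python
def transfer_seconds(file_size_bytes: int,
                     link_bits_per_second: float,
                     efficiency: float = 1.0) -> float:
    """Estimate how long a file takes to cross a network link.

    efficiency is the usable fraction of the nominal link speed (0 < e <= 1).
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return (file_size_bytes * 8) / (link_bits_per_second * efficiency)


GIGABYTE = 10**9
GIGABIT_PER_SECOND = 1e9

# Assumed sizes: a 100 GB preservation file versus a 1 GB proxy.
print(transfer_seconds(100 * GIGABYTE, GIGABIT_PER_SECOND))  # 800.0 seconds
print(transfer_seconds(1 * GIGABYTE, GIGABIT_PER_SECOND))    # 8.0 seconds
```

Under these assumptions the proxy arrives in seconds while the preservation file takes more than ten minutes, which is why both kinds of file are kept.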

Software / Network

Issues with current file mgmt systems/software

• Preservation not a priority
• Interface issues
  - Access vs. preservation
• IT relationship
  - Tech support
  - Vendor reliance issues
  - Need library-based system for archivist needs rather than traditional IT company needs
• Expense
  - License cost
  - Development
  - Customizations

Access

• Can find
• Can view
• Can select
• Can get out of system
• Can reuse in editing system

Preservation Needs

• Multiple copies
• Validity
• Bit quality checks
• Long-lasting storage
• Regular migration
• Persistence

Challenges of preservation and access

• For preservation
  - Want to capture as close to original as possible
  - Originals may be many different formats
  - Will need to make sure you can export and use different formats in future
  - File format issues
  - Fixity check big files
• For access
  - Want one consistent format for playback/access
  - Needs to be easy to migrate, use
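Fixity checking a big file does not require loading it into memory. A minimal sketch, assuming SHA-256 as the checksum and an 8 MB read size; neither choice comes from the slides, and `file_checksum` and `verify_fixity` are hypothetical names.

```python
import hashlib

CHUNK_SIZE = 8 * 1024 * 1024  # assumed read size: 8 MB at a time


def file_checksum(path, algorithm: str = "sha256") -> str:
    """Hash a file in fixed-size chunks so memory use stays flat."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(CHUNK_SIZE), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path, expected_hex: str) -> bool:
    """Return True when the file still matches the checksum recorded at ingest."""
    return file_checksum(path) == expected_hex.lower()
```

The checksum would be recorded once at ingest and compared again after every copy or migration.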


What makes video different?

• Preservation files are large
  - Uncompressed
  - Slow to move around
• Need proxy files for viewing
  - Smaller size for quick transport over network
• Complicated formats
  - Not just one file type
  - Codecs, wrappers, frame rate, etc.

Technology Mix

Hydra project

• Combine preservation system with access system
• Better interface
• Flexible design
• Easy to evolve

Insert graphic

• Blacklight
• Hydra heads
• Hydra mgmt layer
• Fedora repository
• HSM storage system

Fundamental Assumption #1

No single system can provide the full range of repository-based solutions for a given institution’s needs,

…yet sustainable solutions require a common repository infrastructure.


Fundamental Assumption #2

No single institution can resource the development of a full range of solutions on its own,

…yet each needs the flexibility to tailor solutions to local demands and workflows.

Hydra Philosophy -- Community

• An open architecture, with many contributors to a common core

• Collaboratively built “solution bundles” that can be adapted and modified to suit local needs

• A community of developers and adopters extending and enhancing the core

• “If you want to go fast, go alone. If you want to go far, go together.”


CRUD in Repositories

Major Hydra Components

Hardware/Storage media: HSM

• Access
  - Online
    - XX bytes
    - Spinning disk
  - Offline
  - Nearline
• Preservation (offline)
  - Robotic tape library system
  - LTO-4 data tapes
  - 2 copies
  - One stored off site
• Migration needed every 3-5 years
  - Both tape migration to newer formats
  - Technology migration
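The copy rule above (2 copies, one off site) can be expressed as a small check. This is a sketch under assumptions: the `StoredCopy` record, its fields, and the extra requirement that all copies share one checksum are illustrative, not a description of the tape library software.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class StoredCopy:
    location: str   # hypothetical label, e.g. a tape barcode
    offsite: bool   # True if this copy is stored off site
    checksum: str   # checksum recorded when the copy was written


def meets_copy_policy(copies: Sequence[StoredCopy], min_copies: int = 2) -> bool:
    """Check for enough copies, at least one off site, and matching checksums."""
    if len(copies) < min_copies:
        return False
    if not any(copy.offsite for copy in copies):
        return False
    # Assumption: every copy should carry the same checksum.
    return len({copy.checksum for copy in copies}) == 1
```

A check like this could run after each tape migration to confirm that no file was left with a single copy.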

New Storage Types and Costs

• Need hierarchical storage (HSM)
  - Video files are large
  - Spinning disks are expensive
  - Tape can help save cost
  - Tape copies/migration can be automated
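A rough sketch of how a hierarchical storage policy might decide where a file lives, using the online, nearline, and offline tiers named earlier. The 30-day and 365-day thresholds, the rule that proxies stay online, and `choose_tier` itself are assumptions for illustration; a real HSM applies its own rules.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds, not taken from any particular HSM product.
NEARLINE_AFTER = timedelta(days=30)
OFFLINE_AFTER = timedelta(days=365)


def choose_tier(last_accessed: datetime, now: datetime, is_proxy: bool) -> str:
    """Pick a storage tier from how recently a file was used."""
    if is_proxy:
        # Assumption: small proxy files stay on spinning disk for quick viewing.
        return "online"
    age = now - last_accessed
    if age >= OFFLINE_AFTER:
        return "offline"   # tape on a shelf
    if age >= NEARLINE_AFTER:
        return "nearline"  # tape inside the robotic library
    return "online"        # spinning disk
```

Because the decision is a pure function of file age and type, moving files down to tape can run unattended on a schedule.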


New Storage Types and Costs

• But HSM has licensing issues
  - Some systems cost by gigabyte managed
  - Need open source alternative

Q & A

Karen: [email protected]

Dave: [email protected]
