Workshop: Preservation and Access for Audio and Video. Richard Wright, BBC R&D – PrestoCentre. Goportis Digital Preservation Summit, Hamburg – 19 Oct 2011

Workshop on Preservation and Access for Audio and Video

Page 1: Workshop on Preservation and Access for Audio and Video

Workshop: Preservation and Access for Audio and Video

Richard Wright

BBC R&D – PrestoCentre

Goportis Digital Preservation Summit

Hamburg – 19 Oct 2011

Page 2: Workshop on Preservation and Access for Audio and Video

We are on a journey

Page 3: Workshop on Preservation and Access for Audio and Video

Which began here:

Page 4: Workshop on Preservation and Access for Audio and Video

And here:

Page 5: Workshop on Preservation and Access for Audio and Video

And here:

phonautograph, developed in 1857 by Parisian inventor Édouard-Léon Scott de Martinville

Page 6: Workshop on Preservation and Access for Audio and Video

And here:

Page 7: Workshop on Preservation and Access for Audio and Video

And here:

"The cinema is an invention without a future"

- Louis Lumière

Page 8: Workshop on Preservation and Access for Audio and Video

And here:

Page 9: Workshop on Preservation and Access for Audio and Video

All in the 19th Century

In the 20th century came broadcasting

Page 10: Workshop on Preservation and Access for Audio and Video

And the format wars:

List formats and dates

Page 11: Workshop on Preservation and Access for Audio and Video

Leaving us with this:

Page 12: Workshop on Preservation and Access for Audio and Video

The BBC Archive, 1995

• About 1 million hours BBC radio and television

• 1.5 million items of film and videotape

• 750,000 radio recordings

• 3 million photographs

• 1.2 million commercial recordings (“grams”)

• 4 million items of sheet music

• 22 million newspaper cuttings

• 550,000 document files; 20,000 rolls of microfilm

• 500,000 phonetic pronunciations

Page 13: Workshop on Preservation and Access for Audio and Video

TV Archive Holdings, 1999

Film 30%

D3 6%

Digibeta 1%

Betacam 11%

VHS 14%

U-Matic 4.5%

1” C Format 12%

2” Quad 1%

Ekta, Reversal 12%

Page 14: Workshop on Preservation and Access for Audio and Video

TV Formats and Holdings

• 2” Quad, 60s-80s: Dr Who, Dad’s Army, Steptoe & Son, Forsyte Saga, Fawlty Towers, Secret Army

• 1” C, 70s-90s: Yes Minister, EastEnders, Angels, Wogan, All Our Working Lives

• Film: 50 years, 40s-90s: Man Alive, 1984, Ascent of Man, British Empire, Omnibus, Pride and Prejudice (1995)

• Ektachrome: 67-82 news: Vietnam War, Yom Kippur War, all major domestic stories

• U-Matic: 82-90s: Lockerbie, General Elections in 1983 & 1987, the Gulf War

Page 15: Workshop on Preservation and Access for Audio and Video

Radio Archive, 1999

• Radio holdings: 750,000 recordings, 300,000 hours

• Fewer technical problems with audio recordings

1/4” tapes in regions 37%

1/4” tapes in London 32%

CD Sound effects 0.02%

LP Sound effects 0.4%

DAT NCA sequence programmes, bulletins 1%

1/4” News bulletins 3%

Cassette NCA sequence programmes 4%

CD Programme extract compilation 0.1%

1/4” complete programmes 15%

1/4” film unit tapes 1%

DAT, London & Region 2%

LP & 78RPM Programme Extract, 5% of sound holdings

Page 16: Workshop on Preservation and Access for Audio and Video

Radio Formats and Holdings

• Radio One sessions: 40k recordings, all BBC copyright; mainly ¼” tape

– Rolling Stones, Beatles, Who, Jimi Hendrix, Led Zeppelin, The Fall ... and so many more

• News off-air on DAT 1990-2001, then CD to 06

• Files from tapeless production onto CD (6 yrs)

• Shellac and vinyl pressings from 20s-60s

– Special problems with lacquer, acetate, aluminium master discs: 16”, fragile, deteriorate

Page 17: Workshop on Preservation and Access for Audio and Video

The Problem: Analogue Media

Decaying, Obsolete, Fragile

Presto Survey, 2001: 5 million hours of holdings (10 European broadcasters)

Page 18: Workshop on Preservation and Access for Audio and Video

Decaying Obsolete Fragile

• Obsolescence: at least 2/3 of the material

• Deterioration: approximately 1/3 of the material

• Fragile media: roughly 1/4 of the material

Overall: 70% of holdings have problems (2001)

The Solution: digitisation

Page 19: Workshop on Preservation and Access for Audio and Video

Obsolescence

• Videotape

– 2”; 1”; U-Matic: no playback equipment

• Film

– Disappearing in post production

• Audio formats

– Grams: no playback equipment (in BBC !!)

– ¼” no longer accepted in BBC radio production and playout systems

Page 20: Workshop on Preservation and Access for Audio and Video

Deterioration

• Videotape – decay of adhesive

– 2”; 1”; U-Matic (30% read failures at BBC)

• Audio – decay of adhesive: ¼” tape

– Lacquer discs

• Magnetic sound tracks: vinegar syndrome

• Other acetate

• Decay of film splices

• General decay of polymer materials

– Even the sleeves on vinyl LPs

Page 21: Workshop on Preservation and Access for Audio and Video

Fragile Media

• Vinyl (45s, LPs)

– and shellac (78s)

• Film

– 10 plays per print (videotape: 50)

• Video or audiotape

– physical damage

– magnetic fields

Page 22: Workshop on Preservation and Access for Audio and Video

Size of the Problem – in Europe

• Presto: found 5 million hours 2001

– Mainly broadcast archives

• Prestospace: found 10 million hours 2004

– Broadcast and large national collections

• TAPE: found additional 20 million hours

– In collections not covered previously

• UNESCO estimate: 200 million hours worldwide (100 million in Europe)

Page 23: Workshop on Preservation and Access for Audio and Video

Where is the material?

• Broadcast archives 30% (roughly)

• National collections 15%

• Other major collections 15%

• Small and specialist collections 40%

NB: all these figures refer to archived material ONLY (TAPE survey)

Page 24: Workshop on Preservation and Access for Audio and Video

What to do about it? Presto preservation factory

• Efficient workflow for digitisation

– Staff specialisation

– Cartography and Triage

Adam Smith: “the division of labour in pin manufacturing – and the great increase in the quantity of work that results” (UK £20 note)

wiki.prestospace.org preservation guide

Page 25: Workshop on Preservation and Access for Audio and Video

Problems with the solution: digitisation not fully accepted

“You’re not preserving anything; you’re only making more proxies and adding to the problem”

• Not accepted as a solution for film

• Not easy to implement for video (in full quality); problems: encodings, compression, formats, file size, bandwidth ...

• But – very much accepted for audio: BWF

Page 26: Workshop on Preservation and Access for Audio and Video

Problems with the solution: needs Digital Preservation

The approach in 2006:

•Media

•Multiple copies

•Maintenance

•Migration

Page 27: Workshop on Preservation and Access for Audio and Video

Media

• Datatape: cheaper than hard drives

– Needs an expensive tape drive

– And has reliability issues

• Optical is cheapest of all

– But isn’t really mass storage (DVD = 4.7 GB)

• New DVD format(s) promise 20 to 100 GB

– And has reliability issues

• Hard drive prices have dropped sharply

– Easiest to automate management

– And has reliability issues

Page 28: Workshop on Preservation and Access for Audio and Video

Multiple copies

• Two copies

– Two technologies

• In two places

• But fastest recovery is by mirroring

– Which means identical technologies

• Big arguments about RAID vs simpler options vs more complex options

Page 29: Workshop on Preservation and Access for Audio and Video

Maintenance

• Life cycle management

• Should be every archive’s built-in process

• Begins with blank media

– Then the writing

– Then the initial checking

– Then the periodic checking: aerobics, scrubbing

• Ends with migration to the next format
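The "initial checking" and "periodic checking" steps above are commonly implemented as fixity checks: record a checksum for each file when it is written, then recompute and compare on every scrub. The sketch below is one minimal way to do that, not the procedure used by any particular archive; the function names `build_manifest` and `scrub` and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Initial checking: record a checksum for every file under root."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def scrub(root: Path, manifest: dict[str, str]) -> list[str]:
    """Periodic checking: list files that are missing or no longer match."""
    failures = []
    for name, expected in manifest.items():
        path = root / name
        if not path.is_file() or sha256_of(path) != expected:
            failures.append(name)
    return failures
```

A scrub that returns an empty list means every file still matches its recorded checksum; any name it returns is a candidate for repair from another copy before the next migration.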

Page 30: Workshop on Preservation and Access for Audio and Video

Migration

• A fact of life

• Every five years

• Can involve a lot of manual handling (of datatapes or optical media)

• Or can be nearly transparent (disc upgrades) – but: every three years!

• Best practice: uncompressed file formats

Page 31: Workshop on Preservation and Access for Audio and Video

Digital Preservation 2009: formal management model

[Flowchart. Boxes: START HERE; Is the format a problem? (YES / NO); Archive for a few years; What cost/quality/risk option can you afford?; Uncompress; Compress lossless; Compress lossy; END HERE. Step labels: (1), (2), (3), (4), (5a), (5b), (5c).]

Page 32: Workshop on Preservation and Access for Audio and Video

... with emulation

[Flowchart. Boxes: START HERE; Is the format at risk? (YES / NO); Archive for a few years; What cost/quality/risk can you afford?; Uncompress; Compress lossless; Compress lossy; Multivalent; END HERE.]

Page 33: Workshop on Preservation and Access for Audio and Video

Stepping back: the real problem with storage (1)

Medium   Bits/cm²   Life (years)
Stone    10         10 000
Paper    10^4       1000
Film     10^7       100
Disc     10^10      10

Each change 1000 times cheaper, but lasts 1/10th as long
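The pattern in the table can be checked with a few lines of arithmetic. This sketch reads the density column as powers of ten (10, 10^4, 10^7, 10^10) and the lifetime column as years; both readings are assumptions, since the slide shows neither superscripts nor units. It uses storage density as a stand-in for the "1000 times cheaper" claim.

```python
# Figures from the table above: (medium, bits per cm², lifetime).
# Lifetime is assumed to be in years; the slide does not state the unit.
MEDIA = [
    ("Stone", 10**1, 10_000),
    ("Paper", 10**4, 1_000),
    ("Film", 10**7, 100),
    ("Disc", 10**10, 10),
]


def step_ratios(media):
    """Return (from, to, density gain, lifetime loss) for each step."""
    steps = []
    for (old, dens_old, life_old), (new, dens_new, life_new) in zip(media, media[1:]):
        steps.append((old, new, dens_new // dens_old, life_old // life_new))
    return steps


for old, new, gain, loss in step_ratios(MEDIA):
    print(f"{old} -> {new}: {gain}x denser, lifetime divided by {loss}")
```

Every step gives the same pair of ratios, which is the point of the slide: each thousandfold gain in density has come with a tenfold loss in lifetime.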

Page 34: Workshop on Preservation and Access for Audio and Video

The problem (2)

• Current storage media are unreliable

– Discs fail

– Data tapes fail

– Optical media fail (and are easily damaged)

– Companies fail

– Exceptions? Glass discs; holographic media; going back to film; digital film

Page 35: Workshop on Preservation and Access for Audio and Video

The problem (3)

• Storage isn’t just about media

– Encoding and obsolescence

– File formats and obsolescence

– File management systems and obsolescence

– Physical interfaces and obsolescence

– Operating systems and obsolescence

– System complexity and associated risks

– Human errors

– Cost: continuous maintenance

Page 36: Workshop on Preservation and Access for Audio and Video

What is the cost of continuous maintenance?

• You need a model of storage operating and replacement costs, into the future

• What storage? So you need a storage strategy:

– Allocation of storage: primary, backup, cloud ...

– Operation of storage: cycles for copying, checking

– Some idea of relating costs to risks !!!

• NOT available from storage vendors

Page 37: Workshop on Preservation and Access for Audio and Video

Simple Preservation Model

Page 38: Workshop on Preservation and Access for Audio and Video

And now: one PrestoPRIME tool

• A model for storage systems, to calculate

– Cost

– Risk

– Loss

– And compare what-if scenarios

• http://prestoprime.it-innovation.soton.ac.uk/

Page 39: Workshop on Preservation and Access for Audio and Video
Page 40: Workshop on Preservation and Access for Audio and Video
Page 41: Workshop on Preservation and Access for Audio and Video

Storage Systems

HDD in servers

• Migration required every 4 years

• Running costs – Access: €0.1 per GB; Storage: €1 per GB per year

• Corruption rates – Access: avg. 1 in 500 files; Latent: avg. 1 in 750 files per year

HDD on shelves

• Migration required every 4 years

• Running costs – Access: €1 per GB; Storage: €0.25 per GB per year

• Corruption rates – Access: avg. 1 in 100 files; Latent: avg. 1 in 500 files per year

Page 42: Workshop on Preservation and Access for Audio and Video

More Storage Systems

Data tapes in a robot

• Migration required every 6 years

• Running costs – Access: €0.2 per GB; Storage: €0.4 per GB per year

• Corruption rates – Access: avg. 1 in 1×10^4 files; Latent: avg. 1 in 1×10^5 files per year

Data tapes on shelves

• Migration required every 6 years

• Running costs – Access: €1 per GB; Storage: €0.1 per GB per year

• Corruption rates – Access: avg. 1 in 1×10^4 files; Latent: avg. 1 in 1×10^5 files per year
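To get a feel for what the latent corruption rates mean in practice, they can be turned into expected numbers of damaged files. This is a rough sketch and not the PrestoPRIME model itself. It assumes the tape figures read as 1 in 1×10^4 and 1 in 1×10^5, that corruption events are independent, and that rates stay constant over time. The collection size `N_FILES` is a hypothetical value chosen for illustration.

```python
# Latent corruption rates per file per year, as quoted on the slides.
# The tape values assume "1x105" on the slide means 1 x 10^5.
LATENT_RATE = {
    "HDD in servers": 1 / 750,
    "HDD on shelves": 1 / 500,
    "Data tapes in a robot": 1 / 1e5,
    "Data tapes on shelves": 1 / 1e5,
}


def expected_corrupt_per_year(n_files: int, rate: float) -> float:
    """Expected number of files that silently go bad in one year."""
    return n_files * rate


def p_unchecked_copy_corrupt(rate: float, years: float) -> float:
    """Chance that a single, never-checked copy is corrupt after `years`.

    Assumes independent years and a constant rate (a simplification).
    """
    return 1 - (1 - rate) ** years


N_FILES = 100_000  # hypothetical collection size

for system, rate in LATENT_RATE.items():
    per_year = expected_corrupt_per_year(N_FILES, rate)
    after_25 = p_unchecked_copy_corrupt(rate, 25)
    print(f"{system}: ~{per_year:.1f} files/year, "
          f"{after_25:.2%} chance per file after 25 unchecked years")
```

The second function shows why regular scrubbing matters: without checking and repair, the per-file risk compounds year after year.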

Page 43: Workshop on Preservation and Access for Audio and Video
Page 44: Workshop on Preservation and Access for Audio and Video

Storage Configuration

Found 3 storage configurations. Add...

Disk with Tape

System 1: HDD in servers

Files accessed avg of 0.25 times per year, staying constant

Scrubbing every 1 year(s)

System 2: Data tapes in a robot

Files accessed avg of 0 times per year, staying constant

Scrubbing every 3 year(s)

Page 45: Workshop on Preservation and Access for Audio and Video
Page 46: Workshop on Preservation and Access for Audio and Video

File Collections

• Found 1 file collection. Add...

• read-only

• Default File Collection

• Length of cost/loss projection is 25 year(s).

• Files: 100 thousand initially, staying constant.

• Average file size: 25 GB.

Page 47: Workshop on Preservation and Access for Audio and Video
Page 48: Workshop on Preservation and Access for Audio and Video

Plans

Found 3 plans. Add...

Disk and Tape edit Delete Evaluate

File Collection: Default File Collection

25 year lifetime. 100 files, avg. 25 GB in size.

Storage Configuration: Disk with Tape

Uses HDD in servers and Data tapes in a robot systems.
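As a back-of-envelope check on the "Disk and Tape" plan, the running costs quoted for the two storage systems can be applied to the file collection. This is a simplified sketch, not the PrestoPRIME tool's calculation: it ignores migration, scrubbing and recovery costs and any change in prices over time. It takes the collection size as the "100 thousand" files from the File Collections slide (the Plans slide shows "100 files"), which is an assumption. The class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class StorageSystem:
    name: str
    storage_eur_per_gb_year: float  # running cost of keeping data
    access_eur_per_gb: float        # cost of reading data back
    accesses_per_file_year: float   # average accesses per file per year


def annual_cost(system: StorageSystem, n_files: int, avg_gb: float) -> float:
    """Yearly storage plus access cost in euros for one storage system."""
    total_gb = n_files * avg_gb
    storage = total_gb * system.storage_eur_per_gb_year
    access = total_gb * system.accesses_per_file_year * system.access_eur_per_gb
    return storage + access


# Figures taken from the Storage Systems and Storage Configuration slides.
DISK = StorageSystem("HDD in servers", 1.0, 0.1, 0.25)
TAPE = StorageSystem("Data tapes in a robot", 0.4, 0.2, 0.0)

N_FILES = 100_000  # assumed: "100 thousand" from the File Collections slide
AVG_GB = 25
YEARS = 25

per_year = sum(annual_cost(s, N_FILES, AVG_GB) for s in (DISK, TAPE))
print(f"Running cost per year: EUR {per_year:,.0f}")
print(f"Over {YEARS} years: EUR {per_year * YEARS:,.0f}")
```

Under these assumptions the plan costs roughly EUR 3.6 million a year, almost all of it storage rather than access. That is why the choice of storage system dominates any what-if comparison.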

Page 49: Workshop on Preservation and Access for Audio and Video
Page 50: Workshop on Preservation and Access for Audio and Video
Page 51: Workshop on Preservation and Access for Audio and Video

http://prestoprime.it-innovation.soton.ac.uk/

Now: Three Areas of Digital Preservation, and PrestoPRIME tools

1) Digitisation – going digital

2) Digital Workflows – working digitally

3) Digital Preservation (proper)

Page 52: Workshop on Preservation and Access for Audio and Video

1. Digitisation – Key Ideas

• Cartography and Triage

– Make a map of your holdings

– Decide on priorities

• Make a preservation plan

– Digitisation: in-house or a service provider?

• Better – Faster – Cheaper

– Division of labour: Adam Smith, industrial process

– Lower prices by contractors for archive work

Joanneum: Quality Analysis Tool

Page 53: Workshop on Preservation and Access for Audio and Video

2- Working with digital content (lots of files)

It’s all about management

– DAM/MAM and Trusted Repositories – what do they do, what don’t they do – White Papers

– Storage – ITI online free tools

– Metadata – Joanneum mapping and validation; “tag gardening” Univ Amsterdam; fingerprinting INA

– Digital library technology – RAI, BBC MXF support

– Access – Joanneum time-based navigation, annotation

– Rights – RAI ontology, Eurix implementation

Page 54: Workshop on Preservation and Access for Audio and Video

3- Preserving the digital content

• Preservation Platform: P4 = PrestoPRIME Preservation Platform, Eurix; Rosetta, Ex Libris

• Standards: OAIS; formal control; formal preservation actions, e.g. migration; P4

• Emulation – Multivalent, Univ of Liverpool

• Formats, carriers, storage: planning and strategy: PrestoPRIME white papers

• Managing and maintaining storage into the future – SLAs for outsourced service; white papers, software for real-time SLA monitoring; modelling and simulation tools

Page 55: Workshop on Preservation and Access for Audio and Video

Access: Audiovisual Content and Digital Libraries

• Digitisation makes audiovisual content available to web access, including digital libraries

• Broadcast archive projects: Birth of TV, Video Active, EUscreen (link to Europeana)

• BUT – what are digital libraries doing to provide access to audiovisual content?

Page 56: Workshop on Preservation and Access for Audio and Video

Four requirements for sensible access

• Granularity

• Navigation

• Reference and Citation

• Annotation

Page 57: Workshop on Preservation and Access for Audio and Video

Granularity - division into meaningful units

• Keyframes

• Other methods to represent video

• and audio:

Page 58: Workshop on Preservation and Access for Audio and Video

Navigation

• "Click and play" on visual representation of the meaningful units

Page 59: Workshop on Preservation and Access for Audio and Video

Reference and Citation

• the core requirement for scholarly discourse

– along with a major change in attitude!

• Needs a permanent place for “things to be”

– Hence the need for stable audiovisual collections

“Hamlet, for example, is comparable to Saxo Grammaticus' Gesta Danorum.[citation needed] King Lear is based on King Leir in Historia Regum Britanniae by Geoffrey of Monmouth, retold in 1587 by Raphael Holinshed.[citation needed]”

Wikipedia

Page 60: Workshop on Preservation and Access for Audio and Video

Annotation

• the core requirement for social web = interactivity

• individual interacts with content

• individuals interact with other individuals

Page 61: Workshop on Preservation and Access for Audio and Video

Thank You

Preservation Guide

preservationguide.co.uk

PrestoCentre prestocentre.eu

Richard Wright [email protected]