

Page 1: UK LOCKSS Alliance: Today’s scholarly content, secured for tomorrow

UK LOCKSS Alliance: Today’s scholarly content, secured for tomorrow

Adam Rusbridge

UK LOCKSS Alliance Coordinator

EDINA, University of Edinburgh

8th March 2012

Trust and eJournals Workshop, London


Summary

• LOCKSS: Digital equivalent of the physical shelf

• Sufficient rights to access content as needed

• Financial control and governance over systems

• Automate preservation functions where possible

• LOCKSS provides generic preservation capacity

• Customise the distributed architecture according to community needs

• Modeling the total cost of long-term storage

www.flickr.com/photos/guitarlogy/5387073471/


Community Action for Assured Access

A co-operative organization to ensure continuing sustainable access to scholarly work over the long term.

Since 2008, UK libraries have been collaborating to build national ‘network level’ infrastructure and to coordinate the preservation of electronic material of local and UK interest.

Support Service at EDINA provides underlying coordination, support and development

JISC Collections organises membership subscriptions and gives guidance and support

JISC prompted the initial project led by the Digital Curation Centre (2006-08)

17 member institutions

De Montfort University

King’s College London

London School of Economics

Natural History Museum

Open University

Royal Holloway, University of London

University of Birmingham

University of Edinburgh

University of Glasgow

University of Hertfordshire

University of Huddersfield

University of Newcastle Upon Tyne

University of Oxford

University of Salford

University of St. Andrews

University of Warwick

University of York

Steering Committee directs activity (next meeting May 2012)

Phil Adams (De Montfort University)

Lisa Cardy (London School of Economics)

Geoff Gilbert (University of Birmingham) 

Tony Kidd (University of Glasgow)

Liz Stevenson (University of Edinburgh)

Lorraine Estelle (JISC Collections)

Peter Burnhill (EDINA)

Adam Rusbridge (EDINA)


Technical Infrastructure

- Preserves content as published
  - Preserve the record: web archiving
  - Fetches content from a server

- Preserves integrity
  - Audit protocol to prevent damage
  - Tamper resistant

- Avoids single point of failure
  - Distributed network to avoid points of failure
  - Model on success of print collections (and operation of the library)
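
The idea of an audit across a distributed network can be illustrated with a deliberately simplified sketch. It is not the actual LOCKSS polling protocol: it assumes each peer simply hashes its copy, the majority digest is treated as correct, and any dissenting peer is repaired from an agreeing one. The peer names and the `audit_and_repair` function are hypothetical.

```python
import hashlib
from collections import Counter


def digest(content: bytes) -> str:
    """Return the SHA-256 hex digest of one peer's copy."""
    return hashlib.sha256(content).hexdigest()


def audit_and_repair(copies: dict[str, bytes]) -> tuple[dict[str, bytes], list[str]]:
    """Compare all peers' copies and repair those that disagree with the majority.

    `copies` maps a (hypothetical) peer name to the bytes that peer holds.
    Returns the repaired copies and the sorted list of peers that were repaired.
    """
    votes = Counter(digest(c) for c in copies.values())
    winning_digest, count = votes.most_common(1)[0]
    # Assumption: a strict majority is required before anything is overwritten.
    if count <= len(copies) // 2:
        raise ValueError("no majority agreement; manual intervention needed")

    good_copy = next(c for c in copies.values() if digest(c) == winning_digest)
    damaged = sorted(p for p, c in copies.items() if digest(c) != winning_digest)
    repaired = {p: (good_copy if p in damaged else c) for p, c in copies.items()}
    return repaired, damaged
```

With three or more peers, a single damaged copy is outvoted and replaced, which is the sense in which a distributed network avoids a single point of failure.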


MetaArchive

• A distributed digital preservation solution depends on a collaborating set of institutions agreeing to preserve each other’s content.

• Requires central coordination; shared enthusiasm, resources and benefit

• Successful models initiated where community / shared need already in place.

• MetaArchive is a cooperative, not a vendor (conceived 2004)
  • Goal is not to make profits, but to improve each member's situation.
  • Distribute across geography: diversify funding, politics, economy
  • Replicate content, lower barriers of entry

• Educopia Institute - non-profit administrative organisation
  • Coordination role; arrange legal agreements and commitments to preserve member content
  • Sustained by affordable cooperative fee memberships set by members
  • Supplemented by grants and contracts


Costs

• Equipment
  • Each institution required to contribute a server to the network
  • As of June 2011: $4,600 for a 16TB machine

• Staffing
  • 2% of a systems administrator’s time
  • Administrator/point of contact
  • Software engineer who preps content for ingest
    • (latter two roles needed for outsourced solution).

• Storage
  • $1.00/GB/year for content stored in the network.
  • ‘Conspectus’ to organise where content is stored


Tiered Membership

• Sustaining Members: $5,500/year
  • Leadership, development, governance

• Preservation Members: $3,000/year
  • Benefit from shared preservation model

• Collaborative Members: (varies, but e.g. $4,000/year for 20)
  • Consortia that share a server, and so look like one organisation.
  • Allows existing consortia to preserve co-hosted content for a fraction of what it would cost to do so as individual members.
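
As a rough illustration, the MetaArchive figures quoted on the Costs and Tiered Membership slides can be combined into a yearly estimate. This is a minimal sketch under stated assumptions: the fees are simply additive, the server is a one-off purchase in the first year, and staffing time is excluded because the slides give no price for it. The function name `annual_cost` is hypothetical.

```python
# Figures taken from the slides (USD).
MEMBERSHIP_USD = {"sustaining": 5500, "preservation": 3000}
SERVER_USD = 4600                 # 16TB machine, as of June 2011
STORAGE_USD_PER_GB_YEAR = 1.00    # content stored in the network


def annual_cost(tier: str, stored_gb: float, first_year: bool = False) -> float:
    """Estimate one year's cost for a member storing `stored_gb` of content.

    Assumptions: membership and storage fees are additive, the server is
    bought once (counted only when `first_year` is True), staffing excluded.
    """
    cost = MEMBERSHIP_USD[tier] + STORAGE_USD_PER_GB_YEAR * stored_gb
    if first_year:
        cost += SERVER_USD
    return cost


# Example: a Preservation Member storing 500 GB in its first year.
print(annual_cost("preservation", 500, first_year=True))
```

Collaborative Members are left out because the slide describes their fee only as varying.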


Private LOCKSS Networks (PLNs) in the UK: Member Survey

• Share resources & responsibility, build community, keep costs low

• Preservation policy

• Content and Collections

• Organisational architecture

• Costs and Resources


Initial Conclusions

• Survey response rate: 50% of members

• Institutions seeking affordable solutions to digital preservation.
  • e-preservation strategies have yet to be developed.

• Extent of digital assets requiring preservation unknown
  • Systematic audits have not yet been carried out.

• Prefer architecture where content is stored at more than one location
  • However a fully distributed approach was not favoured.

• Mixed enthusiasm for a PLN
  • Need to demonstrate PLN is low-cost and sustainable

• Need clear and demonstrable financial benefits

• Need a shared interest in preserving a particular body or type of content.

• Difficult to gain acceptance and commitment without these benefits

• Moving forward: establish a UK PLN, or join the MetaArchive as a Collaborative Member?


Pricing of Long-term Cloud Storage

• David Rosenthal has been looking at cost models for long-term storage: http://blog.dshr.org

• Does it make economic sense to store data in the cloud in the long term?

• Kryder's Law: a 30-year history of exponential increase in disk capacity at roughly constant cost.

• The cost of storing bits for the long term depends on current price and how fast it is dropping.

• How long can we expect Kryder's Law to continue?

• Indications that Kryder's Law is slowing down
  • 4TB disks now available, but slower than expected.
  • Driver for 3.5" disks has been desktop PCs. Volume market is now 2.5" disks: same curve but higher price/byte.
  • By 2020, ought to have 14TB 2.5" drive @ $40
  • Consumers may prefer a 2TB 1" drive for $15 and less power draw

Year    Cost per GB
2002    $3.98
2004    $1.25
2006    $0.64
2008    $0.29
2010    $0.08?
2012    $0.06?
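
The point that long-term cost depends on both the current price and how fast it is dropping can be made concrete with a small calculation. This is a simplified sketch and not David Rosenthal's model: it counts media cost only, and it assumes a constant yearly price decline, replacement of the media every few years at the then-current price, and no discounting. Both function names and the four-year replacement cycle are hypothetical.

```python
def annual_decline(price_start: float, price_end: float, years: float) -> float:
    """Constant yearly fractional price drop implied by two price points."""
    return 1.0 - (price_end / price_start) ** (1.0 / years)


def media_cost_forever(price_now: float, decline: float,
                       replace_every_years: int = 4) -> float:
    """Total media spend per GB if the media is re-bought indefinitely.

    Each replacement costs price_now * (1 - decline) ** (years elapsed), so the
    purchases form a geometric series with the closed-form sum used below.
    """
    if not 0.0 < decline < 1.0:
        raise ValueError("decline must be between 0 and 1 for the sum to converge")
    ratio = (1.0 - decline) ** replace_every_years
    return price_now / (1.0 - ratio)


# Decline implied by the table's 2002 and 2008 figures ($3.98 -> $0.29 per GB).
rate = annual_decline(3.98, 0.29, 6)
print(f"implied decline: {rate:.1%} per year")

# The same starting price costs far more to sustain if the decline slows.
print(media_cost_forever(1.0, rate), media_cost_forever(1.0, 0.10))
```

Under these assumptions, a slowdown in Kryder's Law raises the lifetime cost of every gigabyte already committed to storage, which is why the rate matters as much as today's price.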


Cloud Storage Price History

http://blog.dshr.org/2012/02/cloud-storage-pricing-history.html

• Price of cloud storage is dropping around an order of magnitude more slowly than raw disk prices.

• There is a recurrent cost for storage in the cloud. As collections grow, will the cost of cloud storage grow faster than the cost of storing the same content locally?

• Research to model total costs over time – local hardware, maintenance, location, power, bandwidth, staffing.

Provider        Price (Year of Launch)    Current Price    % decrease
Amazon S3       $0.15/GB/mo (2006)        $0.125/GB/mo     3%/yr
RackSpace       $0.15/GB/mo (2008)        $0.15/GB/mo      0%/yr
Windows Azure   $0.15/GB/mo (2009)        $0.14/GB/mo      3%/yr
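
To see how a recurrent charge accumulates as a collection grows, the following sketch totals cloud storage spend over several years. It assumes billing once per year at that year's collection size and price, a constant growth rate, and a constant price decline. The growth rate and collection size in the example are hypothetical; only the $0.125/GB/mo price and 3%/yr decline come from the table above.

```python
def cumulative_cloud_cost(initial_gb: float, annual_growth: float,
                          price_per_gb_month: float,
                          annual_price_decline: float, years: int) -> float:
    """Total spend over `years`, charging each year at that year's size and price."""
    total = 0.0
    size_gb = initial_gb
    price = price_per_gb_month
    for _ in range(years):
        total += size_gb * price * 12          # twelve monthly charges
        size_gb *= 1.0 + annual_growth         # collection grows
        price *= 1.0 - annual_price_decline    # price falls slowly
    return total


# Hypothetical 1 TB collection growing 30% a year, at the table's S3 figures.
print(cumulative_cloud_cost(1000, 0.30, 0.125, 0.03, 10))
```

When the collection grows faster than the price falls, each year's bill is larger than the last, which is the concern behind modelling total costs over time.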


Principles of LOCKSS: Building Trusted Archives

• LOCKSS software can be used to provide general, shared preservation capacity

• Responsibility spread across the community
  • Shepherded by strong universities with strong collection policies

• Further assessment of UK Private LOCKSS Networks
  • Model selected depends on scale of content & community enthusiasm

• Further assessment to understand the total cost of storage

www.flickr.com/photos/guitarlogy/5387073471/


Find out more…

http://www.lockssalliance.ac.uk

[email protected]

@EDINA_eJournals