Report on Preservation of ETDs: The LOCKSS Prototype. The work of Kamini Santhanagopalan, Virginia Tech Graduate Student in Computer Science. Reported at the 9th International Symposium on ETDs, Quebec City. Presented by: Gail McMillan, Director, Digital Library and Archives, Virginia Tech


Report on Preservation of ETDs: The LOCKSS Prototype

The work of Kamini Santhanagopalan, Virginia Tech Graduate Student in Computer Science

Reported at the 9th International Symposium on ETDs, Quebec City

Presented by: Gail McMillan, Director, Digital Library and Archives, Virginia Tech


Agenda

• Goals
• What is LOCKSS?
• Participating Universities
• International ETD Preservation
• Analysis and Results
• Conclusion


Digital Preservation

• Goal: information should be
  • Readable
  • Usable in the future
• Preservation is NOT just backup
• Existing preservation techniques
  • Floppy, CD, and hard disk drives
  • Central and distributed database servers


Technical Infrastructure Goals

• Build on successful LOCKSS open-source model
• Create dark archive for locally produced digital content
• Use off-the-shelf hardware
• Use open-source software
• Easy replication
• Demonstrate LOCKSS scalability


LOCKSS: Lots of Copies Keep Stuff Safe

• Peer-to-peer digital preservation system
• Open-source software
• Turns an inexpensive desktop computer into a digital preservation appliance
• Easy, inexpensive way to
  • Collect
  • Store
  • Preserve
  • Provide access to the contents (or not)


Functions of LOCKSS (1)

• Collect
  • Via a web crawler
  • Appropriate crawl rules are specified
• Preserve and audit
  • Every institution preserves
    • Its own contents
    • Contents of partner universities
  • Contents are polled to determine authenticity and reinstate bad files
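
The polling step above can be pictured as a vote among peers over content hashes. The sketch below is a deliberately simplified majority vote, not the actual LOCKSS polling protocol, which is more elaborate. The function names and the peer labels are hypothetical and chosen only for illustration.

```python
import hashlib
from collections import Counter


def digest(content: bytes) -> str:
    """Return a SHA-256 hex digest used to compare copies."""
    return hashlib.sha256(content).hexdigest()


def poll_and_repair(copies: dict) -> dict:
    """Replace any copy that disagrees with the majority version.

    `copies` maps a peer name (hypothetical, e.g. "M1") to that peer's
    bytes for one file. Raises ValueError when no strict majority exists.
    """
    votes = Counter(digest(content) for content in copies.values())
    winner, count = votes.most_common(1)[0]
    if count <= len(copies) / 2:
        raise ValueError("no majority; cannot decide which copy is authentic")
    # Take any copy that matches the winning digest as the repair source.
    good = next(c for c in copies.values() if digest(c) == winner)
    return {
        peer: (content if digest(content) == winner else good)
        for peer, content in copies.items()
    }
```

With three peers holding `b"thesis"`, `b"thesis"`, and a damaged `b"thes1s"`, the damaged copy is replaced by the version the other two agree on.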


Functions of LOCKSS (2)

• Provide access
  • By running web proxies
  • Open or restricted access
    • Dark archives for partners' ETDs
    • Levels of access controlled at originating institutions
• Administration
  • Via a web user interface
  • Controlling access to cached contents and other functions


LOCKSS Preservation

• Contents of each university (nodes M1 through M5) preserved at every other university
• Multiple, dispersed copies
• Not a backup: nothing is overwritten
• All versions retained

[Diagram: nodes M1, M2, M3, M4, and M5]
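
To make "nothing is overwritten" concrete, here is a minimal sketch of an append-only store that keeps every version of a file. It is an illustration only and says nothing about how LOCKSS stores content on disk; the class name `VersionedStore` and its methods are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class VersionedStore:
    """Append-only store: a new version is added, old ones are never replaced."""

    _versions: dict = field(default_factory=dict)

    def put(self, url: str, content: bytes) -> int:
        """Record `content` for `url` and return the number of versions held."""
        history = self._versions.setdefault(url, [])
        # Only append when the content actually changed.
        if not history or history[-1] != content:
            history.append(content)
        return len(history)

    def latest(self, url: str) -> bytes:
        """Return the most recent version."""
        return self._versions[url][-1]

    def history(self, url: str) -> list:
        """Return a copy of all retained versions, oldest first."""
        return list(self._versions[url])
```

A plain backup would keep only what `latest` returns; here `history` still exposes the earlier versions.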


ASERL-LOCKSS-ETD Initiative

• Florida State University
• Georgia Institute of Technology
• University of Kentucky
• University of Tennessee
• Vanderbilt University
• Virginia Polytechnic Institute and State University

http://www.aserl.org/


Preservation using LOCKSS

Prerequisites

• Minimum hardware configuration
• LOCKSS software installed on all participating partners' systems
• Permissions for the LOCKSS system to collect, preserve, periodically validate, and repair ETDs


Example Hardware Configuration

Enterprise (3 TB)

• Dell PowerEdge Server 1850 (LOCKSS): $3,500
• Dell PowerEdge Server 1850 (firewall): $2,500
• Dell/EMC AX100 SAN (3 TB): $10,000
• Red Hat Enterprise AS: 2 @ $50 = $100
• UPS: $700
• Server rack: $1,200
• Grand total: $16,800.00; with rack: $18,000.00

Desktop (200 GB)

• Intel-based desktop (LOCKSS, 200 GB): $500
• Intel-based desktop (firewall): $350
• CentOS Linux: $0
• UPS: $50
• Grand total: $900.00
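
The totals on this slide can be checked with a few lines of arithmetic. The prices below are copied from the slide; the dictionary keys are just labels chosen for this sketch.

```python
# Prices in US dollars, as listed on the slide.
enterprise = {
    "PowerEdge 1850 (LOCKSS)": 3500,
    "PowerEdge 1850 (firewall)": 2500,
    "Dell/EMC AX100 SAN (3 TB)": 10000,
    "Red Hat Enterprise AS (2 @ $50)": 2 * 50,
    "UPS": 700,
}
server_rack = 1200

desktop = {
    "Intel desktop (LOCKSS, 200 GB)": 500,
    "Intel desktop (firewall)": 350,
    "CentOS Linux": 0,
    "UPS": 50,
}

enterprise_total = sum(enterprise.values())
enterprise_with_rack = enterprise_total + server_rack
desktop_total = sum(desktop.values())

print(f"Enterprise: ${enterprise_total:,} (with rack: ${enterprise_with_rack:,})")
print(f"Desktop: ${desktop_total:,}")
```

The computed sums match the grand totals given on the slide: $16,800 without the rack, $18,000 with it, and $900 for the desktop configuration.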


Participating Universities

• International universities
  • Pontifícia Universidade Católica do Rio de Janeiro, Brazil
  • Humboldt-Universität, Germany
  • University of Cape Town, South Africa
• US universities
  • Florida State University
  • Georgia Tech
  • Virginia Tech


International ETD Preservation (1)

• For international universities
  • KS (Kamini Santhanagopalan) wrote plug-ins to collect contents (ETDs) from the 3 universities
• For US universities
  • Verified and reused OAI plug-ins for the 3 universities
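
For readers unfamiliar with OAI-PMH, a harvest is driven by plain HTTP requests such as `ListRecords`. The sketch below only builds such a request URL; it does not show how a LOCKSS OAI plug-in is written. The base URL `https://example.edu/oai` and the function name are assumptions for illustration, while `verb`, `metadataPrefix`, `set`, and `resumptionToken` are standard OAI-PMH arguments.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc",
                     set_spec=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL.

    In OAI-PMH, resumptionToken is an exclusive argument: when it is
    given, no other argument besides the verb is sent.
    """
    if resumption_token is not None:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository endpoint, used only as an example.
print(list_records_url("https://example.edu/oai"))
```

A harvester would request the first URL, then keep requesting with the returned resumption token until the repository stops supplying one.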


International ETD Preservation (2)

• Example ETD collection
  • University of Cape Town ETD collection
  • Manifest (i.e., permissions) page: http://pubs.cs.uct.ac.za/lockss/manifest.html
• Screenshots of the UCT plug-in and the crawl results of contents follow
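
A manifest page grants permission by carrying a permission statement in its text. The sketch below checks a page's HTML for such a statement. The exact wording used here is an assumption based on the commonly cited LOCKSS permission statement, and the function name is hypothetical; the slides do not show the text of the UCT manifest page.

```python
import re

# Assumed wording of the LOCKSS permission statement (not taken from the slides).
PERMISSION_STATEMENT = (
    "LOCKSS system has permission to collect, preserve, and serve "
    "this Archival Unit"
)


def has_lockss_permission(html: str) -> bool:
    """Return True if the page text contains the permission statement.

    Whitespace is collapsed and case is ignored so that line breaks in
    the HTML source do not hide the statement.
    """
    normalized = re.sub(r"\s+", " ", html).lower()
    return PERMISSION_STATEMENT.lower() in normalized


sample = """<html><body>
<p>LOCKSS system has permission to collect,
preserve, and serve this Archival Unit</p>
</body></html>"""
print(has_lockss_permission(sample))
```

A crawler would run a check like this on the manifest page before collecting anything else from the site.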


University of Cape Town Plug-in (1)


UCT plug-in: Crawl Results with

• Level (depth) = 4

• Fetch delay = 6 seconds
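
The two crawl settings above, a depth limit and a delay between fetches, can be sketched as a breadth-first crawl. This is not the LOCKSS crawler: the function `crawl`, its parameters, and the choice to count the start page as level 1 are all assumptions made for illustration. Link extraction is passed in as a function so the sketch runs without network access.

```python
import time
from collections import deque


def crawl(start_url, get_links, max_depth=4, fetch_delay=6.0,
          allowed_prefix="", sleep=time.sleep):
    """Breadth-first crawl limited by depth, with a pause between fetches.

    get_links(url) stands in for fetching a page and returning its links.
    allowed_prefix plays the role of a very simple crawl rule: only URLs
    starting with it are followed. The start page counts as level 1.
    """
    seen = {start_url}
    fetched = []
    queue = deque([(start_url, 1)])
    while queue:
        url, depth = queue.popleft()
        if fetched:
            sleep(fetch_delay)  # be polite: wait before every fetch after the first
        fetched.append(url)
        if depth >= max_depth:
            continue  # do not follow links beyond the depth limit
        for link in get_links(url):
            if link not in seen and link.startswith(allowed_prefix):
                seen.add(link)
                queue.append((link, depth + 1))
    return fetched
```

Passing `sleep=lambda seconds: None` makes the function convenient to try on an in-memory link graph without waiting six seconds per page.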


Harvested International ETD Collections


Harvested American ETD Collection [source: http://lockss-etd.lib.vt.edu:8081/DaemonStatus ]


Tutorial on how to write plug-ins

• KS developed a mini-tutorial: http://scholar.lib.vt.edu/lockss/introduction.htm
  • 10 screens
• This tutorial can be
  • Generalized for ETD plug-ins
  • Extended to write OAI plug-ins


Conclusion and Future Work

• International ETDs can be harvested and preserved using LOCKSS and OAI-PMH
• It requires cooperation and collaboration from participating universities
• Future work
  • An online portal open for the public to view certain details
  • Brazil expressed interest in formalizing ETD preservation for the NDLTD using LOCKSS


Acknowledgements

• Special thanks to LOCKSS (Stanford University)
  • Thomas Robertson
  • Seth Morabito
• Thanks to all participating universities
  • Florida State
  • Georgia Tech
  • Humboldt-Universität, Germany
  • Pontifícia Universidade Católica do Rio de Janeiro, Brazil
  • University of Cape Town, South Africa
  • Virginia Tech


Send Questions/Comments to

[email protected]