University of Kentucky ETD 2001
Long-term Retention of ETDs
Beth Kraemer and Tom Teper
University of Kentucky
Why is the Archiving of Electronic Documents a Hot Topic?
• Increase in “Born Digital” Documents
• Electronic is different from paper
• Future is unknown
• Recovery is more expensive than planning
Why Long-Term Retention is an issue for ETDs
• Prevalence of PDF as “archival” format for ETDs
• Lack of archival planning by ETD programs
Format of Original
• Paper
    One format
    Eye-legible, “permanent” back-up
    No dynamic parts
• Electronic
    Innumerable format types
    No innate back-up
Factors Affecting the Ability to Archive Electronic Formats
• Well-documented
• Well-tested
• Non-proprietary
• Platform independent (hardware and software)
• If forced to choose proprietary, look at export formats
What is “preservation”?
• Preservation – An umbrella term that concerns itself with providing access to materials for as long as needed by whoever needs them.
    Cannot exist without the assistance of other branches of libraries.
• Asset Management – the business of providing access and protecting the institution’s investment.
Semantic Arguments I
• “Killing the Duck to Keep the Quack”
- Simon Pockley
Short-term access
Long-term access
Preservation
Semantic Arguments II
• Digital Preservation – a misnomer
• Preservation through use of digital imagery – a possibility
• Preserving digital information – a necessity
Requirements of “Digital Preservation”
• Make use possible
• Protect the original item
• Protect the surrogate
- Paul Conway, “Preservation in the Digital World”
What about ETDs, when the surrogate is the original?
Preserving Electronic Records
• Emulation – the process of using new technology to “emulate” the functionality of obsolete hardware/software
• Migration – according to CLIR and the RLG, a set of organized tasks designed to achieve the periodic transfer of digital materials from one hardware/software configuration to another, or from one generation of computer technology to another.
Requirements for Preserving Born-Digital Information
• Ability to be copied perfectly
• Ability to be accessed without geographic restraint
• Machine readability
• Ability to retain unique functionality.
- Jeff Rothenberg, “Avoiding Technological Quicksand”
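One way to check the first requirement, the ability to be copied perfectly, is a fixity check: compute a cryptographic digest of the original and of the copy and compare them. The sketch below is an illustration in Python, not something taken from the slides; the function names `sha256_of` and `copy_and_verify` and the file names are hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source: Path, target: Path) -> bool:
    """Copy source to target and report whether the two digests match."""
    shutil.copyfile(source, target)
    return sha256_of(source) == sha256_of(target)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        original = Path(workdir) / "thesis.txt"  # hypothetical file name
        original.write_bytes(b"Chapter 1. Introduction\n")
        copy = Path(workdir) / "thesis_copy.txt"  # hypothetical file name
        print(copy_and_verify(original, copy))
```

A matching digest only shows that the bits were copied intact. It says nothing about whether the format will still be readable later, which is the concern of the other three requirements.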
Three Preservation Models
• Technology-based
    LOCKSS
    CEDARS
• Analog
• Put it on the shelf and hope for the best
A Cautionary Tale…
• Who drives development?
    The Government
    Industry
    Records Managers
    Bell & Howell (UMI)
    DjVu, or having faith in non-standard
“Now you can easily convert your paper documents to high-quality digital images with unbelievably small file sizes that can be viewed by anyone.”
Problems with PDF
• Proprietary
• De facto standards are not standards
• Can’t export to functional format
• Image v. text formats
    Multiple PDF formats
    Problems with non-Western characters
Bias in the ETD World
• Access NOW v. future
• One Document = One Format
• All document aspects deserve the same preservation attention
Considerations (Preliminary Issues)
• Ignoring “media” issues for now
• Importance of developing “standards”
• We’re not necessarily doing it right
Considerations
• Format
    Require archive-friendly submission, or
    Convert to archive-friendly
• Procedure
    Balance
        Ease of production (Students)
        Ease of migration (Library)
Format Considerations
• Different Formats for Different Purposes
    Submission
    Retention
    Delivery
• Document as a Sum of its Parts
    Different formats for different archival needs
Procedure Considerations
• Identify “migratability” of various formats
    Best
    OK
    Unacceptable
• Talk with ETD authors so they can choose
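The two steps above, rating formats and then letting authors choose, could be recorded in a small lookup like the following Python sketch. The three ratings come from the slide; everything else is an assumption made for illustration, including the placeholder format names, the example ratings, and the helper `choices_for_authors`.

```python
from enum import IntEnum


class Migratability(IntEnum):
    """The three ratings named on the slide, ordered worst to best."""
    UNACCEPTABLE = 0
    OK = 1
    BEST = 2


# Illustrative ratings only: placeholders for this sketch, not the authors' assessment.
EXAMPLE_RATINGS = {
    "format_a": Migratability.BEST,
    "format_b": Migratability.OK,
    "format_c": Migratability.UNACCEPTABLE,
}


def choices_for_authors(ratings, minimum=Migratability.OK):
    """Return formats rated at least `minimum`, best first, then by name."""
    allowed = [(name, rating) for name, rating in ratings.items() if rating >= minimum]
    ordered = sorted(allowed, key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ordered]


print(choices_for_authors(EXAMPLE_RATINGS))
```

Publishing the list with its ratings, rather than a single mandated format, keeps the choice with the author while making the migration cost of each option visible.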
Example
Creation (Submission):
• Word
• HTML
Archival:
• (Native)
• HTML
• ASCII
Delivery:
• HTML
• PDF
• TIFF
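The example can also be written down as data, which makes it easy to see which formats serve more than one purpose. The Python sketch below copies the format lists from the slide. The helper names and the reading that any archival or delivery format not accepted at submission must be produced by conversion are assumptions for illustration.

```python
# Formats per purpose, copied from the example slide.
FORMAT_PLAN = {
    "submission": ["Word", "HTML"],
    "archival": ["(Native)", "HTML", "ASCII"],
    "delivery": ["HTML", "PDF", "TIFF"],
}


def purposes_for(format_name, plan=FORMAT_PLAN):
    """List the purposes a format serves in the plan."""
    return [purpose for purpose, formats in plan.items() if format_name in formats]


def conversions_needed(plan=FORMAT_PLAN):
    """Archival and delivery formats that are not accepted at submission.

    Assumption: such formats have to be produced by conversion.
    "(Native)" stands for whatever file was submitted, so it is skipped.
    """
    accepted = set(plan["submission"])
    needed = []
    for purpose in ("archival", "delivery"):
        for fmt in plan[purpose]:
            if fmt == "(Native)" or fmt in accepted or fmt in needed:
                continue
            needed.append(fmt)
    return needed


print(purposes_for("HTML"))
print(conversions_needed())
```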
“Considerations” Summary
• Archiving, publishing, maintaining ETDs will require work.
• Electronic documents will have to be converted to be accessible and functional in the future.
• Discuss and evaluate format options
• Stop viewing born-digital documents through paper vision
• Stop thinking of a document as one static, physical item
• Allow ETD authors to make decisions
• Be prepared for change
• Work toward standards
• Cooperation?
Short-Term Benefits v. Long-Term Costs I
• Preservation’s motto: “Think Twice, Cut Once.”
• ETDs: “Do it once, do it right”
- Paul Conway, “Handbook for Digital Projects”
Short-Term Benefits v. Long-Term Costs II
• Proactive Preservation v. Accident and Rescue
    ISO Archiving Standards Working Group
    National Archives of Australia
- Seamus Ross, “Changing Trains at Wigan: Digital Preservation and the Future of Scholarship”
• More costly for born-digital items…
Short-Term Benefits v. Long-Term Costs III
• Memory
• The “Long Nineteenth and Short Twentieth Centuries”
- Eric Hobsbawm, “The Age of Revolution,” “The Age of Capital,” “The Age of Empire,” and “The Age of Extremes”
The Future…
• PDF will become (remain) archival standard
    Lacks features of archival standard
    Technology tends to change!
• PDF will give way to other formats
    Backfiles will have to be migrated or will be lost
…and Choices
• Scholarly Obligations
• Obligations of the ETD Community
Selected Bibliography
• Avoiding Technological Quicksand. Jeff Rothenberg. http://www.clir.org/pubs/abstract/pub77.html. 1999.
• “File Formats to Support Long-Term Access to Electronic Records.” Managing Electronic Records 2000. Richard Fisher and Charles Dollar. 2000.
• Permanent Digital Records and the PDF Format. Stephen J. Gilheany, http://www.ArchiveBuilders.com 22025v177.
• Preservation in the Digital World. Paul Conway, http://www.clir.org/pubs/abstract/pub62.html. 1996.
• Can Libraries of Digital Materials Last Forever?, Richard H. Ekman, Change, March/April 2000
• Best Practices for Digital Archiving: An Information Life Cycle Approach, Gail M. Hodge, D-Lib Magazine, January 2000.
• LC21: A Digital Strategy for the Library of Congress (2001), http://www.nap.edu/books/0309071445/html/
• Archiving the Digital Public Record: An Internet Snail’s Pace, Henry M. Gladney, iMP Magazine, Oct. 23, 2000.
• Using Emulation to Preserve Digital Documents, Jeff Rothenberg, NELiNET, June 5, 2000.
• “Preserving Digital Information: Report of the Task Force on Archiving of Digital Information.” Council on Library and Information Resources and Research Libraries Group, ftp://ftp.rlg.org/pub/archtf/final-report.pdf, 1996.
• The Crisis in Electronic Government Record Keeping: A Strategy for Long-term Storage, Philip Coombs, Library Computing, 18(3), 2000.
• Phase II Report of the Working Group on Permanence of NLM Electronic Publications, 10/13/00, http://www.nlm.nih.gov/pubs/reports/permanence.pdf.
• Ross, Seamus. “Changing Trains at Wigan: Digital Preservation and the Future of Scholarship.” London: National Preservation Office, 2000.
• Russell, Kelly. “Digital Preservation and the Cedars Project Experience.” Presented at Preservation 2000: An International Conference on the Preservation and Long Term Accessibility of Digital Materials. York, England: 2000. http://www.rlg.org/events/pres-2000/russell.html.
• Reich, Vicky and David S.H. Rosenthal. “LOCKSS (Lots of Copies Keep Stuff Safe).” Presented at Preservation 2000: An International Conference on the Preservation and Long Term Accessibility of Digital Materials. York, England: 2000. http://www.rlg.org/events/pres-2000/reich.html.
• Barnum, George D. “The Federal Depository Library Program Electronic Collection: Preserving a Tradition of Access to United States Government Information.” Presented at Preservation 2000: An International Conference on the Preservation and Long Term Accessibility of Digital Materials. York, England: 2000. http://www.rlg.org/events/pres-2000/barnum.html.