This article was downloaded by: [McMaster University] On: 05 November 2014, At: 06:27 Publisher: Routledge Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK Collection Management Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/wcol20 Digital Preservation and Access M. Stuart Lynn a a Information Resources and Communications , University of California Office of the President , USA Published online: 23 Sep 2008. To cite this article: M. Stuart Lynn (1998) Digital Preservation and Access, Collection Management, 22:3-4, 55-63, DOI: 10.1300/J105v22n03_06 To link to this article: http://dx.doi.org/10.1300/J105v22n03_06 PLEASE SCROLL DOWN FOR ARTICLE Taylor & Francis makes every effort to ensure the accuracy of all the information (the “Content”) contained in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content. This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is

Digital Preservation and Access


Digital Preservation and Access: Liberals and Conservatives

M. Stuart Lynn

M. Stuart Lynn is Associate Vice President for Information Resources and Communications, University of California Office of the President. At the time of this symposium, he was Vice President for Information Technologies at Cornell University.

[Haworth co-indexing entry note]: “Digital Preservation and Access: Liberals and Conservatives.” Lynn, M. Stuart. Co-published simultaneously in Collection Management (The Haworth Press, Inc.) Vol. 22, No. 3/4, 1998, pp. 55-63; and: Going Digital: Strategies for Access, Preservation, and Conversion of Collections to a Digital Format (ed: Donald L. DeWitt) The Haworth Press, Inc., 1998, pp. 55-63.

The sentry in Gilbert and Sullivan’s operetta Iolanthe noted that he sometimes thought it comical that every boy or girl born into the world alive is either a little liberal or else a little conservative.

In the world of digital technologies, every little piece of information is electronically reduced to a collection of liberals and conservatives, otherwise known as 0s and 1s. This almost miraculous world of digital technologies continues its relentless march, what George Gilder termed “the descent into the microcosm.” Costs continue to halve every two or three years, or performance or capacity doubles over the same time frame. And there is no end in sight. In recent years, we have seen the compact disk, a digital device, bury the analog long-playing record; newspapers that transmit photographs in scanned digital form; and more recently the dramatic shift in the U.S. away from current analog standards of television technology and towards a digital standard for high definition television (HDTV). And the ways in which we communicate and collaborate are being transformed by these same digital technologies.

THE INTERNET PHENOMENON

About a year or so ago, a now famous cartoon of a dog appeared in the New Yorker magazine. In that cartoon, the dog was standing before a networked personal computer. The caption was, “On the Internet nobody knows you’re a dog.” I surmise that future linguists will regard that cartoon as the defining point when the word Internet fully entered the vernacular. Now we are bombarded with Internet literature, books, and articles in both electronic and paper forms, the latter being out of date by the time they reach the bookstores. Tools such as Gopher and Mosaic are revolutionizing the ways in which we access information across the world’s networks. Traffic across the global Internet continues to grow at astounding rates, compounding at over 14 percent per month, and the number of people connected directly or indirectly to the Internet grows at similar rates.
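To give a feel for what compounding at 14 percent per month implies, the following is a minimal sketch of the arithmetic. It treats the quoted "over 14 percent" as exactly 14 percent and assumes the rate stays constant, which is a simplification; the function names are illustrative only.

```python
import math


def doubling_time_months(monthly_rate: float) -> float:
    """Months needed for traffic to double at a constant monthly growth rate."""
    return math.log(2) / math.log(1 + monthly_rate)


def annual_growth_factor(monthly_rate: float) -> float:
    """Factor by which traffic multiplies after 12 months of compounding."""
    return (1 + monthly_rate) ** 12


if __name__ == "__main__":
    rate = 0.14  # assumption: "over 14 percent per month" taken as exactly 14%
    print(f"doubling time: {doubling_time_months(rate):.1f} months")
    print(f"growth over one year: x{annual_growth_factor(rate):.1f}")
```

Under these assumptions, traffic doubles roughly every 5.3 months and grows nearly fivefold in a year.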

Daily we are bombarded with newspaper articles about the National Information Infrastructure, about mega-mergers among telephone and cable TV companies and network and content providers. Much of this is hoopla that has less to do with dissemination and communication of educational and scholarly information and more to do with how to deliver 500 channels of mud wrestling to the home to feed the world’s insatiable appetite for passive entertainment and numbing out. These pastimes mean big money for these large corporations. However, while they talk about what can and should and should not be done, and who should do it and when, and with whom, the Internet continues its relentless march forward. The Internet is where the real action is, fueled by the rapid advances in digital technologies, computers, and communications.

THE SHIFT TO DIGITAL

We are all aware that digital technologies hold great promise for the world’s libraries as well as for the publishing industry, revolutionizing how we capture, store, disseminate, and access information and how we cope with the exponential growth of recorded knowledge. This conference focuses on how these technologies can be applied to the preservation of scholarly materials, such as books, serials, photographs, manuscripts, videos, and slides; and to enabling worldwide access to these digitally stored materials across global networks.

Digital technologies decline rapidly in cost and improve in performance. What is not practicable today will be tomorrow. By contrast, the costs of analog technologies such as paper, video, sound, and even microfilm decline slowly, if at all. In fact, they tend to increase. If cost decreases occur, it is generally because of the economies of scale associated with mass manufacturing, so that, for example, the economics of traditional book and journal publishing are closely tied to the need for large markets.

However, it is not only declining costs that are providing the impetus for the shift to digital. There are other motivators, such as the reliability of reproduction and of transmission at great distances. Photocopies, for example, lose in quality at each successive stage of reproduction, as do microfilms with each generational copy. For these and other reasons, it is only a matter of time before the cost/performance curves cross and digital technologies come to dominate in any given area. Those wretched liberals and conservatives will be everywhere.

MAJOR OBSTACLES

Digital technologies offer the potential for access at a distance to the intellectual resources of our libraries. The rapid, indeed exponential, growth in the reach of the world’s open networks, in particular the Internet, opens the door to the realization of Erasmus’s 16th-century dream of a library “with no limits other than the world itself.”

What are some of the obstacles that inhibit the realization of this dream, that inhibit turning the promise of digital technologies into reality? What do these obstacles mean for preservation and access? I would like to briefly review three major hurdles to be overcome:

• The hurdle of converting between the analog world and the digital world and vice versa, for example, scanning paper books on the one hand and providing access to those scanned images on the other.
• The hurdle of ensuring that these scanned digital images can be stored in a form that will be accessible 500 years from now to meet preservation requirements.
• The hurdle of implementing the storage and distribution systems needed to provide access at a distance across the world’s networks.

Analog to Digital and Vice Versa

The reality is that, notwithstanding all the potential advantages of digital technologies, at the level of actual use analog technologies are better suited to the needs of human beings. Unlike computers, we human liberals and conservatives do not think in 1s and 0s but prefer the warm and fuzzy world of paper, sound, and video that we can touch, see, read, and hear. One of the challenges is indeed in how we convert back and forth between the use of digital technologies for storage and transmission and analog technologies for human presentation and interaction.

We can, for example, scan brittle books to convert them to digital images in a number of ways. One way is to scan the books directly using high-resolution production scanning equipment. At Cornell University, in a joint pilot project with the Xerox Corporation and the Commission on Preservation and Access (the CLASS Project), we have scanned over 750,000 pages (about 2,000 books or book equivalents) in a production setting at moderately high resolution, reducing them to digital form. From these digital images we have produced high-quality paper facsimiles, printed on acid-free paper, that should last several hundred years; indeed, our faculty often prefer the crisp new facsimiles to the brittle and often crumbling originals. These facsimiles can be bound and reshelved for traditional forms of access or printed on demand in response to researchers’ needs.

We have also prototyped network access to the scanned images from computer workstations located across our campus and across the Internet worldwide. Directly from their desktops, researchers can rapidly access and browse Cornell’s embryonic digital library.

Bitmap Images. I must emphasize that we are talking about scanned pages that are essentially digital photographs, or so-called bitmap images. We do not convert these images to machine-readable text through optical character recognition (OCR) technology, although we have the option of doing so later. There are several reasons for this:

• OCR technology is not sufficiently accurate without hand editing, which is relatively expensive, particularly when applied to many of the older fonts. Our goal has been to develop a process that is cost competitive with microfilming. As OCR technology improves, it can always be applied later to the scanned bitmap images.
• For preservation purposes, we wish to capture the original format of the book.
• Books contain substantial amounts of image material such as halftones, engravings, and graphs, not to mention material such as mathematical equations, that do not lend themselves to OCR technology.
• Bitmap image formats provide a lowest common denominator that enables us to capture most classes of materials and freely interchange the digital images among different computer environments.
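Because a bitmap records every pixel of the page rather than its characters, its raw size depends only on page dimensions, scanning resolution, and bit depth. The sketch below estimates that uncompressed size. The 600 dpi figure and the letter-size page in the usage note are assumptions chosen for illustration (the text says only "moderately high resolution"), and the calculation ignores file headers, row padding, and compression.

```python
def bitmap_bytes(width_in: float, height_in: float, dpi: int,
                 bits_per_pixel: int = 1) -> int:
    """Uncompressed size in bytes of a scanned page stored as a bitmap.

    bits_per_pixel=1 models a black-and-white (bitonal) scan;
    8 would model 256-level grayscale.
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


if __name__ == "__main__":
    # Assumed example: an 8.5 x 11 inch page scanned bitonally at 600 dpi.
    size = bitmap_bytes(8.5, 11, dpi=600)
    print(f"{size:,} bytes uncompressed")
```

For the assumed 600 dpi bitonal scan this comes to 4,207,500 bytes per page before compression, which is why storage and compression choices matter at the scale of hundreds of thousands of pages.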

I suggest that for retrospective digital conversion of books for preservation purposes, image scanning will dominate. Relatively few documents will be converted to structured text formats such as SGML (Standard Generalized Markup Language). The costs would be too high. This is in contrast to what will dominate for new texts, particularly those that originate in electronic form.

The major component of image capture and storage production costs is the labor cost of handling fragile pages. These production costs are therefore quite comparable to the costs of microfilming. Our studies also show these costs to be less than those of photocopying. Incidentally, since the copyright has expired on many if not most embrittled books, we have been able to build much of the infrastructure for the digital library unencumbered by having to address complex issues of control of intellectual properties.

Microfilm and Bitmap Images. We can also produce microfilm, for archival purposes, from the scanned digital images. With today’s technologies, the quality of the digitally produced microfilm on the whole might not be as good as that obtained from directly microfilmed books using conventional methods; however, preliminary evidence suggests that it is good enough to meet required standards.

Another way to produce digital images is to microfilm the books first using conventional photographic techniques and later scan the microfilm itself into a digital image whenever the film-scanning technology is adequate. Yale University is undertaking a project testing this approach. The advantage is that to preserve the intellectual content of the book we can exploit the inherently higher resolution and superior archival quality of film, while providing for improved access across digital networks now or at any time in the future. Today, we may scan microfilm in production settings at relatively low resolution at perhaps only a small increment to the original cost of producing the original microfilm. At some point in the future, for the same small incremental cost, we will be able to scan the microfilm at very high resolution that will allow us to capture the fine details of the original document.

The key point is that, either way, we can have our cake and eat it too. Interchanging both ways between the digital world and the analog world of microfilm appears both practical and achievable, the tradeoffs being ones of permanence, resolution, and speed. We can exploit the preservation advantages of microfilm today and the access advantages of digital technologies tomorrow. We can transmit the scanned digital images to distant computer workstations for viewing, and we can print out high-quality paper facsimiles whenever and wherever they are needed.

The world’s investment in microfilming for preservation continues to be a wise one, in spite of the emerging advantages of digital technologies. Microfilming today does not preclude exploiting the advantages of digital access in the future.

Deacidification to slow down the deterioration processes also remains a promising area for investigation, even if only as a holding strategy pending the viable application of digital or other technologies.

Other Applications of Digital Technology in Preservation. The CLASS Project at present only spans monochrome materials less than 8 1/2 x 11 inches in size that contain text, line art, and halftone images. In another collaborative project involving Cornell, the University of Southern California, and the Kodak Corporation, we have studied the application of PhotoCD capture and storage technologies to the digital preservation of continuous-tone monochrome and color imagery and to larger formats.

All of these projects and other related activities reflect the extraordinary versatility of digital technologies. Indeed, Cornell’s own investigations parallel other activities, underway or planned, across the world. The extraordinarily impressive task undertaken by Seville’s Archivo General de Indias in scanning about 10 percent of its manuscript holdings, or around 10 million images, has also underscored the potential of these technologies. I revisited this project about a month ago, and it was gratifying to watch scholars in the reading room accessing manuscripts at workstations (reducing wear and tear on the originals as well as significantly speeding up access times) and digitally enhancing the manuscripts on the fly to make text legible that had long been obscured by bleed-through or other problems.

The Cost of Conversion. Conversion still remains the expensive part of the process, approximately $100 per book including the costs of selection, regardless of technology used. Let us put this figure in perspective. If there are indeed 11 million unique titles across the research libraries of the U.S. that will become embrittled, the cost of converting these books amounts to about $500,000 per year for 20 years for each of the 120 members of the Association of Research Libraries (ARL), representing about 5,000 books per year per institution. This does not count the costs of long-term storage and delivery systems. Nevertheless, the conversion cost of $500,000 per year can be compared with an average figure of about $4 million per year spent for new acquisitions. I recognize that acquisition budgets are under enormous pressure, and I am certainly not suggesting substitution; however, the decision to reformat a book is almost equal to the decision to buy a new book if the alternative is that the book will become completely unusable. I simply note that the cost of digitally preserving our old library is about 12 percent of the annual cost of acquiring new materials.
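The figures in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above (11 million titles, $100 per book, 120 ARL members, 20 years, $4 million in annual acquisitions); the constant and function names are illustrative, not part of any published cost model.

```python
UNIQUE_TITLES = 11_000_000         # embrittled unique titles, as quoted
COST_PER_BOOK = 100                # dollars per book, including selection
ARL_MEMBERS = 120                  # institutions sharing the work
YEARS = 20                         # duration of the conversion effort
ACQUISITIONS_PER_YEAR = 4_000_000  # average annual acquisitions spend, dollars


def conversion_budget() -> dict:
    """Spread the total conversion cost evenly over members and years."""
    total_cost = UNIQUE_TITLES * COST_PER_BOOK
    member_years = ARL_MEMBERS * YEARS
    cost_per_member_year = total_cost / member_years
    return {
        "total_cost": total_cost,
        "cost_per_member_per_year": cost_per_member_year,
        "books_per_member_per_year": UNIQUE_TITLES / member_years,
        "share_of_acquisitions": cost_per_member_year / ACQUISITIONS_PER_YEAR,
    }


if __name__ == "__main__":
    for name, value in conversion_budget().items():
        print(f"{name}: {value:,.2f}")
```

The total comes to $1.1 billion, or about $458,000 and 4,600 books per institution per year, which the text rounds to $500,000 and 5,000 books. As a share of a $4 million acquisitions budget that is between 11 and 12.5 percent, depending on whether the unrounded or rounded figure is used, consistent with the "about 12 percent" quoted.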

I have mainly focused in these remarks on conversion of books, manuscripts, and photographs. But the same essential issues apply to many other formats, such as videotapes, where, I suggest, we are almost compelled to consider digital alternatives because of the rapidity with which these tapes are changing before our eyes.

Ensuring Longevity

The second hurdle to be overcome underscores the wisdom of choosing microfilm as today’s basic preservation standard. Unlike microfilm technology, digital technologies change rapidly. The formats in which data are stored are also subject to evolution. Standards are fluid and are often replaced even before they have matured from working to accepted standards. As a result we cannot guarantee that images scanned and stored in digital form today will be accessible even five years from now, let alone 500 years, any more than today we can easily read the punched cards of yesterday.

Keeping Pace with Technology. Contrary to what is often assumed, however, the primary issue is not the longevity of stored images on some given medium. There is no reason to assume that we will want to keep these digital books on the medium of original capture. Today, for example, we could store about 20 scanned books on a compact disk; 100 years from now we will be able to store the entire 5 million volumes of today’s Cornell Library in the same physical space. People ask me how long such-and-such a medium will last. My answer is, I hope not longer than 10 years.

Space savings alone, however, will only encourage periodic transference to new media. It is the need to keep up with changing formats, software, and other technologies that makes such periodic “refreshing” an absolute necessity. Fortunately, Cornell’s studies suggest that the continuing costs of technology refreshing are more than offset by cost savings associated with space compaction, an attractive feature in the context of the burgeoning costs associated with traditional library growth. It is possible to hypothesize that we will refresh not just because we have to, but because we cannot afford not to. Our challenge is to formalize the task.
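The space compaction at stake can be made concrete from the two figures quoted above: about 20 scanned books per compact disk today, and a 5 million volume library fitting in the same physical space 100 years from now. The sketch below is a rough calculation under the assumption that storage density grows by steady doubling; the function names are illustrative.

```python
import math

BOOKS_PER_DISK_TODAY = 20    # scanned books per compact disk, as quoted
LIBRARY_VOLUMES = 5_000_000  # volumes in the library, as quoted
YEARS = 100                  # horizon of the prediction


def disks_needed(volumes: int, books_per_disk: int) -> int:
    """Number of disks required to hold the whole library today."""
    return math.ceil(volumes / books_per_disk)


def implied_doubling_period_years(density_factor: float, years: float) -> float:
    """Years per doubling of density needed to reach density_factor in `years`."""
    return years / math.log2(density_factor)


if __name__ == "__main__":
    disks = disks_needed(LIBRARY_VOLUMES, BOOKS_PER_DISK_TODAY)
    period = implied_doubling_period_years(disks, YEARS)
    print(f"disks needed today: {disks:,}")
    print(f"implied doubling period: {period:.1f} years")
```

The library would need 250,000 disks today, so the prediction requires a 250,000-fold gain in density, or about 18 doublings in a century (one roughly every five and a half years). Measured against the doubling every two or three years cited at the start of this article, the 100-year estimate is a cautious one.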

Perpetual Technology “Refreshing.” In business terms, technology refreshing simply represents a form of continuous inventory management, a task that will of necessity occupy increasing attention of librarians as they struggle to cope with deteriorating analog media, from acidic paper to videotapes. Indeed, the need to institutionalize such technology refreshing (that is, to ensure that the means exist to guarantee continuous attention to the need well beyond our lifetimes) is conceptually no different from entrusting microfilm repositories to maintain in perpetuity correct temperature and humidity settings. It may, however, require us to consider the finances of libraries in wholly different ways if libraries are not to implode under the weight of exponential growth.
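One small, mechanical part of refreshing is copying stored images to new media and confirming that nothing changed in transit. The following is a minimal sketch of that single step, not a description of Cornell's process: the use of SHA-256 checksums is an assumption introduced here, the function names are hypothetical, and the sketch does not address the harder problem of migrating between file formats.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Checksum a file in chunks so large scans need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def refresh_copy(source: Path, destination: Path) -> str:
    """Copy one stored image to new media and verify the copy is bit-identical.

    Returns the checksum so it can be recorded for the next refresh cycle.
    """
    before = sha256_of(source)
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source, destination)
    after = sha256_of(destination)
    if before != after:
        raise IOError(f"copy of {source} does not match the original")
    return after
```

Recording the returned checksum alongside each image would let a later refresh cycle detect silent corruption before copying it forward, which is one way the "continuous inventory management" described above could be made routine.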

Indeed, many if not most libraries have already crossed the digital Rubicon of technology refreshing. Digital online catalogs replace analog card catalogs. We entrust our computer centers to maintain such digital catalogs for eternity, refreshing the databases as technologies change. Imagine the consequences to our research libraries if those digital catalogs became inaccessible because of some future disruption in the refreshing process. A library without an index virtually ceases to exist.

Even as digital technologies evolve to provide improved worldwide access, the stable and standardized properties of microfilm continue to demand our attention for preservation reformatting. This will likely continue until we fully come to grips with the issues of how to institutionalize the process of refreshing our stored digital images. However, these issues are not insurmountable, and we may rapidly approach the point where it makes sense to consider digital technologies for preservation, at least in a hybrid partnership with microfilm.

Providing Worldwide Access

Thirdly, there is the issue of access. Research libraries throughout the world justifiably pride themselves on open access, a fundamental underpinning of a democratic society. Open access, however, must not be confused with free access. It is a well-maintained fiction that our libraries are free to all scholars; this is misleading not because it costs to acquire and maintain collections, but because easy or free access is largely limited to those scholars in physical proximity to those collections. It is expensive and time-consuming for scholars to come from a distance. The irony is that in maintaining open access to our “analog” libraries, we may be inhibiting equitable access by scholars and others from institutions across the nation and around the world, who would benefit from open access at a distance to digitized collections.

Equal Access to All. The promise of digital collections is indeed that they can equitably be accessed by scholars from wherever they may be, that is, access from any place and at any time. And not just by scholars, but also by citizens everywhere, from young elementary school students to life-long learners, provided, of course, they are not network-challenged! The further promise is that as we come to implement the necessary infrastructure, the exponentially declining costs of digital technologies will ultimately place such access within the financial range of all scholars everywhere. It has been suggested that scholarship follows access, not collection, and that the greatest collections in the world have diminished scholarly value if access is inhibited.

A Time of Prototypes and Experimentation. The creation of electronic virtual libraries spanning the globe heralds the fulfillment of Erasmus’s dream, but there is much to do before we can close the gap between promise and reality. Erasmus may yet have to wait awhile.

We are in the early stages. We still have much to learn about the best ways to store, share, index, and access digital information. Navigation among the growing cornucopia of distributed network resources presents extraordinarily difficult challenges. We still have to agree on standard ways of exchanging digital information. We are only just coming to grips with the complex issues surrounding control of intellectual property and ensuring a fair return to copyright holders.

Rapid progress, however, is being made, both in the private and higher educational sectors. Organizations such as RLG, the Commission on Preservation and Access, OCLC, the Coalition for Networked Information, and professional library associations are leading the way in addressing the complex issues involved, and prototype projects are underway. The projects I have described here and the many others at other institutions more than hint at what is to come, as does the explosive growth worldwide of network-accessible information resources on the Internet. There is a surge in experimentation in creation of electronic journals, even though at this early stage it is not clear how these are to meet the exacting standards of the academy or to become financially self-sustaining. The recent National Science Foundation digital library initiative has stimulated new thinking and creative proposals.

THE BEST OF BOTH WORLDS

Attitudes are changing rapidly. Liberals and conservatives alike (librarians, as well as publishers, scholars, and technologists) no longer view the electronic library in the manner of St. Augustine, who prayed, “Lord, give me chastity, but not yet,” but as a concept whose time is upon us, not necessarily to replace the paper library, but to augment it in ways that combine the benefits of both.

Digital technologies, providing access at a distance across the world’s electronic highways to our intellectual resources, will realize Erasmus’s dream over the coming years and decades. As we move forward in this regard, however, we must be mindful of the primary and urgent need to preserve those intellectual resources and to use the mix of technologies, analog or digital, most appropriate to the task.
