

Digital Preservation: Strategies for Indian Libraries 7th International CALIBER 2009

Digital Preservation: Strategies for Indian Libraries

Ravinder Kumar Chadha

Keywords: Digital Preservation, IPR

1. Introduction

Easy Internet access presumes that everyone can capture, access, and use the world’s accumulated digital information. People, businesses, institutions, and governments invest time and effort to create and capture digital information for instantaneous access by anyone. Researchers are striving to make this information available to communities worldwide. Unfortunately, nobody can guarantee the continued preservation and accessibility of digital information generated in this era of rapid technological advances.

Traditionally, preserving things meant keeping them unchanged; however, the digital environment has fundamentally changed our concept of preservation requirements. If we hold on to digital information without modifications, accessing the information will become increasingly more difficult, if not impossible. Digital preservation presents its own unique challenges, arising from the basic nature of digital data: it is machine-readable, not eye-readable. Unlike the fairly straightforward process of decoding other machine-dependent media, such as microfilm or LPs, maintaining digital data in a form that is intelligible to humans involves the use of a complex set of tightly interwoven technologies.

Even if we could find a physical medium to contain unaltered digital data permanently, formats for recording the information would change and the hardware and software needed to recover the information would become obsolete. Digital preservation is plagued by short media life, obsolete hardware and software, and slow read times of old media. Rapid technological advances do not solve the problem; instead, we need to migrate digital materials from one technology generation to another every few years. For digital records, the preservation issues extend beyond media life considerations. Devices for reading these media rapidly become obsolete; the various formats for digital documents and images introduce additional complications. Using research to develop policies, procedures, standards, and protocols based on solid frameworks provides accurate concepts and essential attributes of preservation in the digital information life cycle.

2. Advantages of Digital Access

Digital technology offers distinctive advantages to institutions with impressive collections of scholarly resources. Information content can be delivered directly to the reader without human intervention. Readers can retrieve information content in digital form remotely, although such delivery may tax the capabilities of even the most sophisticated projection equipment and networks. Digital image quality is extraordinary and is improving constantly. It is now possible to represent almost any type of traditional research material with such visual quality that reference to the original materials is unnecessary for most, if not all, purposes. The power of full-text searching and sophisticated, cross-collection indexing affords readers the opportunity to make new uses of traditional research resources. Newly developed system interfaces (the look and feel of the computer screen) combined with new ways to deliver manageable portions of large image data files promise to revolutionize the ways in which research materials are used for teaching and learning. It is no wonder that there is a nearly overwhelming rush to jump on the digital bandwagon.

7th International CALIBER-2009, Pondicherry University, Puducherry, February 25-27, 2009

© INFLIBNET Centre, Ahmedabad

3. Digital Preservation Strategies

UNESCO’s Guidelines for the Preservation of Digital Heritage (2003) group these strategies under the following four categories:

3.1 Short-term Strategies
Bit-stream Copying
Refreshing
Replication
Technology Preservation or Computer Museum
Backwards Compatibility and Version Migration

3.2 Medium- to Long-term Strategies
Migration
Viewers and Migration at the Point of Access
Canonicalization
Emulation

3.3 Investment Strategies
Restricting Range of Formats and Standards
Reliance on Standards
Data Abstraction and Structuring
Encapsulation
Software Re-engineering
Universal Virtual Computer

3.4 Alternative Strategies
Analogue Backups
Digital Archaeology or Data Recovery

4. Combinations

These strategies have been shown to work in certain circumstances over limited periods of time. None of them has proven itself against unknown threats over centuries of change. Most of these strategies are, however, being used in the management of data, and it is likely that combinations of these strategies will continue to be researched and proposed for large-scale, long-term preservation. It is, therefore, reasonable for preservation programmes to look for multiple strategies, especially if they are responsible for a range of materials over extended periods.

4.1 Short-term Strategies

Short-term digital preservation strategies are likely to work for a short period of time only. These strategies include bit-stream copying, refreshing, replication, technology preservation or computer museum, backwards compatibility and version migration.

4.2 Bit-stream Copying

Bit-stream copying, commonly known as “backing up data”, refers to the process of making an exact duplicate of a digital object. It deals only with the question of data loss due to hardware and media failure, whether resulting from normal malfunction and decay, malicious destruction or natural disaster. It should be considered the minimum maintenance strategy for even the most lightly valued, ephemeral data.
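The idea of an exact duplicate can be illustrated with a short sketch that copies a file and then checks that the copy is bit-identical by comparing checksums. This is only an illustration under stated assumptions: the function names `sha256_of`, `bitstream_copy` and `demo`, and the choice of SHA-256 as the checksum, are not taken from the paper.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def bitstream_copy(source: Path, target: Path) -> str:
    """Copy source to target and confirm the duplicate is bit-identical."""
    expected = sha256_of(source)
    shutil.copyfile(source, target)
    if sha256_of(target) != expected:
        raise IOError(f"copy of {source} does not match the original")
    return expected


def demo() -> bool:
    """Copy a small sample file (hypothetical data) and verify the result."""
    with tempfile.TemporaryDirectory() as tmp:
        original = Path(tmp) / "original.bin"
        duplicate = Path(tmp) / "duplicate.bin"
        original.write_bytes(b"sample digital object")
        checksum = bitstream_copy(original, duplicate)
        same_bytes = duplicate.read_bytes() == original.read_bytes()
        return same_bytes and len(checksum) == 64
```

Recording the checksum returned by `bitstream_copy` alongside the copy would allow later checks that neither copy has silently decayed.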

4.3 Refreshing

Refreshing essentially means copying digital information from one long-term storage medium to another of the same type, with no change whatsoever in the bit-stream (e.g. from an older CD-RW to a new CD-RW). “Modified refreshing” is the copying to another medium of a similar type with no change in the bit-pattern that is of concern to the application and operating system using the data, e.g. from a QIC tape to a 4mm tape; or from a 100 MB Zip disk to a 750 MB Zip disk. Refreshing is a necessary component of any successful digital preservation project. It potentially addresses both decay and obsolescence issues related to the storage media.

4.4 Replication

Replication is a term used for several digital preservation strategies. Bit-stream copying is a form of replication. LOCKSS (Lots of Copies Keep Stuff Safe) is a consortium-based form of replication, while peer-to-peer data trading is an open, free-market form of replication. LOCKSS uses low-cost tools to crawl the Web to cache “redundant, distributed, decentralized” e-journal presentation files for which a library has a subscription or license. LOCKSS supports the traditional model whereby individual libraries build and maintain local collections of journals, and work is underway to develop a user interface for local collection management of e-journals cached using the LOCKSS system. A LOCKSS Alliance of participating libraries has been formed and the system is currently in beta test mode. The intention of replication is to enhance the longevity of digital documents while maintaining their authenticity and integrity through copying and the use of multiple storage locations.
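How multiple copies help maintain integrity can be sketched with a simple audit that compares checksums across storage locations and flags the copies that disagree with the majority. This is a toy illustration only and not the LOCKSS protocol; the function name `audit_replicas` and the site names are hypothetical.

```python
import hashlib
from collections import Counter
from typing import Dict, List, Tuple


def audit_replicas(replicas: Dict[str, bytes]) -> Tuple[str, List[str]]:
    """Return the majority checksum and the sites whose copy differs from it.

    Assumption: the most common checksum is treated as the good copy.
    On a tie, Counter.most_common keeps the first checksum encountered.
    """
    digests = {site: hashlib.sha256(data).hexdigest() for site, data in replicas.items()}
    consensus, _count = Counter(digests.values()).most_common(1)[0]
    damaged = sorted(site for site, digest in digests.items() if digest != consensus)
    return consensus, damaged


# Hypothetical example: three libraries hold a copy, one has been corrupted.
copies = {
    "site_a": b"journal article, volume 1",
    "site_b": b"journal article, volume 1",
    "site_c": b"journal artic1e, volume 1",
}
good_digest, sites_to_repair = audit_replicas(copies)
```

A damaged site could then be repaired by fetching a copy from any site whose checksum matches `good_digest`.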

4.5 Technology Preservation

Technology preservation is based on keeping and maintaining the technical environment that was used for the creation of the content, including operating systems, original application software, media drives, etc. It is sometimes called the “computer museum” solution. In other words, technology preservation becomes applicable to digital materials that are left on obsolete storage media when the hardware and software required to access them have been discarded. Technology preservation is more of a disaster recovery strategy for use on digital objects that have not been subjected to a proper digital preservation strategy.

4.6 Backwards Compatibility and Version Migration

This strategy relies on the ability of current versions of software to interpret and present digital material created with previous versions of the same software and to save them in current format. In the case of backwards compatibility, the presentation may be limited to temporary viewing, whereas version migration permanently converts documents into a format that can be presented by the current version of the software.

5. Medium to Long-term Preservation Strategies

Strategies proposed for medium and long-term preservation are likely to work for a long period of time. Such strategies should be used for digital materials that are likely to be of value for a long period of time. Medium and long-term preservation strategies include:

5.1 Migration

Migration is a broader and richer concept of digital preservation than “refreshing”. Migration is a set of organized tasks designed to achieve the periodic transfer of digital materials from one hardware / software configuration to another, or from one generation of computer technology to a subsequent generation. The purpose of migration is to preserve the integrity of digital objects and to retain the ability for clients to retrieve, display, and otherwise use them in the face of constantly changing technology. Migration includes refreshing as a means of digital preservation but differs from it in the sense that it is not always possible to make an exact digital copy or replica of a database or other information object as hardware and software change and still maintain the compatibility of the object with the new generation of technology.

5.2 Viewers and Migration at the Point of Access

Migration or providing a viewing facility at the point of access has been proposed as an alternative to recurring and incremental migration. The process involves the use of appropriate viewers, software tools or transformation methods that provide accessibility at the time of access, using the original data stream. For example, the VERS strategy converts documents to PDF format on the basis that third-party viewers for PDF may be constructed from the format specification.

Limitations of this approach include: i) viewers may not be available for all formats, such as executable files; ii) viewers may be able to represent some, but not all, elements of digital materials; iii) the gap between the original format and the prevailing technologies at the time of access may be too great for the tools or methods to cope with; and iv) viewers, tools or methods, and corresponding metadata must also be maintained or adjusted as technologies change.

5.3 Canonicalization

Canonicalization is a technique designed to allow determination of whether the essential characteristics of a document have remained intact through a conversion from one format to another.

Canonicalization relies on the creation of a representation of a type of digital object that conveys all its key aspects in a highly deterministic manner. Once created, this form could be used to algorithmically verify that a converted file has not lost any of its essence. Canonicalization has been postulated as an aid to integrity testing of file migration, but it has not been implemented.
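The idea of checking a conversion against a deterministic representation can be illustrated for the simplest case, plain text. The sketch below is a toy example and not an implementation of any published scheme: the choice of what counts as "essential" (Unicode NFC text, line breaks, no trailing whitespace) and the function names are assumptions made for illustration.

```python
import hashlib
import unicodedata


def canonical_form(text: str) -> str:
    """Reduce text to an assumed canonical form: NFC, LF line ends, no trailing blanks."""
    normalized = unicodedata.normalize("NFC", text)
    lines = normalized.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    return "\n".join(line.rstrip() for line in lines).strip("\n")


def canonical_digest(text: str) -> str:
    """Checksum of the canonical form, usable as a compact fingerprint."""
    return hashlib.sha256(canonical_form(text).encode("utf-8")).hexdigest()


def conversion_preserved_content(before: str, after: str) -> bool:
    """True if both versions share the same canonical form."""
    return canonical_digest(before) == canonical_digest(after)


# Hypothetical conversion: CRLF line ends and a decomposed accent in the old
# file, LF line ends and a precomposed accent in the converted file.
old_version = "Cafe\u0301 \r\nmenu\r\n"
new_version = "Caf\u00e9\nmenu\n"
truncated_version = "Caf\u00e9\n"
```

Here `conversion_preserved_content(old_version, new_version)` holds although the byte streams differ, while the truncated version fails the check. Richer object types would need a much richer canonical form.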

5.4 Emulation

Emulation uses a special type of software, called an emulator, to translate instructions from original software to execute on new platforms. The old software is said to run “in emulation” on newer platforms. This method attempts to simplify digital preservation by eliminating the need to keep old hardware working. Emulation combines software and hardware to reproduce in all essential characteristics the performance of another computer of a different design, allowing programs or media designed for a particular environment to operate in a different, usually newer environment. Emulation requires the creation of emulator programs that translate code and instructions from one computing environment so it can be properly executed in another.

6. Investment Strategies

Investment preservation strategies involve investment of efforts at the time of archiving digital materials. Such strategies include: Restricting Formats and Standards, Reliance on Standards, Data Abstraction and Structuring, Encapsulation, Software Re-engineering and Universal Virtual Computer.

6.1 Restricting Formats and Standards

Preservation programmes may decide to store data only in a limited range of formats and standards. This can be achieved either by only accepting material in specified formats or by converting material from other formats before storage. All digital objects of a particular type within an archival repository (e.g., colour images, structured text) can be converted into a single chosen file format that is thought to embody the best overall compromise amongst characteristics such as functionality, longevity, and preservability. For example, most textual and graphical information can be converted into PDF format. The UK Archaeology Data Service (ADS), for instance, specifies a preferred (but not exclusive) range of formats for deposit and provides guidelines for depositors on creating or preparing materials for submission.

The strategy does not necessarily solve the access problem unless the obsolescence of the formats and standards used is handled effectively through some other strategy. This strategy imposes serious restrictions on the range of materials that a preservation programme can accept. Moreover, the process of conversion from the original format may cause some loss of essential elements.
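The two routes described in this section, accepting only specified formats or converting other formats before storage, can be sketched as a simple ingest rule. The format lists below are hypothetical and are not a recommendation from the paper or from any repository; real programmes would identify formats by content rather than by file extension.

```python
from pathlib import PurePath

# Hypothetical policy tables, for illustration only.
PREFERRED_FORMATS = {".tif", ".tiff", ".pdf", ".xml"}
CONVERTIBLE = {".doc": ".pdf", ".jpg": ".tif", ".bmp": ".tif"}


def ingest_decision(filename: str) -> str:
    """Decide whether a deposited file is accepted, converted or rejected."""
    suffix = PurePath(filename).suffix.lower()
    if suffix in PREFERRED_FORMATS:
        return "accept"
    if suffix in CONVERTIBLE:
        return "convert to " + CONVERTIBLE[suffix]
    return "reject"


decisions = [ingest_decision(name) for name in ("scan.TIF", "report.doc", "game.exe")]
```

The "reject" branch makes the cost of the strategy visible: anything outside the two tables cannot enter the archive at all.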

6.2 Reliance on Standards

This preservation strategy involves the use of open, widely available and supported standards and file formats that are likely to remain stable for a longer period of time, discarding proprietary or less-supported standards. Such standards or formats may either be formally agreed or may be de facto standard formats that have been widely adopted by industry. For example, the majority of digitisation programmes choose TIFF (Tagged Image File Format) as an open, stable and widely supported standard for creation of preservation master images. Similarly, most publishers use PDF as the de facto standard for electronic distribution of their research articles, due to the availability of PDF readers for all platforms. Reliance on standards may lessen the immediate threat to a digital document from obsolescence, but it is not a permanent preservation solution.

6.3 Data Abstraction and Structuring

Data abstraction, sometimes also called normalization, involves analyzing and tagging data so that the functions, relationships and structure of specific elements can be described. Using data abstraction, the representation of content can be liberated from specific software applications, so that the digital contents can be read using different applications as technology changes. Data abstraction makes a document application-independent and simplifies the transport of data between platforms and over generations of technology. The technique, however, has its limitations: it requires extensive development of tools and methods for analysis and processing in order to correctly represent and tag each type of data. Moreover, the technology eventually used for presentation may still limit what functions can be represented.

6.4 Encapsulation

Encapsulation may be seen as a technique of grouping together digital objects and metadata necessary to describe and provide access to that object. The grouping process lessens the likelihood that any critical component necessary to decode and render a digital object will be lost. Encapsulation is considered a key element of emulation.

Encapsulation may also bundle metadata that describe or provide links to the software applications and platform used for the original contents, considering the fact that it is impractical and unnecessary to encapsulate the software. The Open Archival Information System (OAIS) Reference Model, for example, describes incorporating data objects and their associated metadata into Archival Information Packages (AIPs).
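Grouping an object with the metadata needed to interpret it can be sketched as a small package format. The layout below (a ZIP file with a `data/` folder and a `metadata.json` file) is a hypothetical illustration and is not the OAIS Archival Information Package structure; the function names and metadata fields are likewise assumptions.

```python
import hashlib
import io
import json
import zipfile
from typing import Dict, Tuple


def encapsulate(name: str, payload: bytes, metadata: Dict[str, str]) -> bytes:
    """Bundle a digital object with descriptive metadata in one ZIP package."""
    record = dict(metadata)
    record["filename"] = name
    record["sha256"] = hashlib.sha256(payload).hexdigest()
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("data/" + name, payload)
        package.writestr("metadata.json", json.dumps(record, indent=2, sort_keys=True))
    return buffer.getvalue()


def open_package(package_bytes: bytes) -> Tuple[bytes, Dict[str, str]]:
    """Read the object back and check it against the recorded checksum."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        record = json.loads(package.read("metadata.json"))
        payload = package.read("data/" + record["filename"])
    if hashlib.sha256(payload).hexdigest() != record["sha256"]:
        raise ValueError("payload does not match the recorded checksum")
    return payload, record


# Hypothetical object and metadata; the software is referenced, not bundled.
package = encapsulate(
    "thesis.txt",
    b"Chapter 1 ...",
    {"format": "plain text, UTF-8", "created_with": "reference to the original editor"},
)
restored, description = open_package(package)
```

The `created_with` field stands in for the link to the original software that the section describes.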

6.5 Software Re-engineering

Digital materials are mostly tied to the application software used for creating them. The application software, in turn, is dependent on a specific system or platform in order to function. Application software is most affected by changes in technology. Moreover, it is usually unsuited for preservation strategies, including regular migration. Software re-engineering may offer a number of strategies for transforming software as technologies change, similar to transformation of data formats. Some possibilities include:

Adjustment and re-compiling of source code for a new platform;

Reverse-engineering of compiled code into higher level code and porting that to the new platform;

Re-coding of the software from scratch, or re-coding in another programming language; and

Translation of compiled binary instructions for one platform directly into binary instructions for another platform.

Re-engineering an application requires source code, which may not be available except for open source programmes and software developed in-house. Even when source code is available, porting to other platforms is not a trivial job; it requires considerable time and effort per object. Moreover, compilers or interpreters for the code language are required on the new platform.

6.6 Universal Virtual Computer

Universal Virtual Computer is a form of emulation. It requires development of a computer program independent of any existing hardware or software that could simulate the basic architecture of every computer since the beginning, including memory, a sequence of registers, and rules for how to move information among them. Users could create and save digital files using the application software of their choice, but all files would also be backed up in a way that could be read by the universal computer. To read the file in the future would require only a single emulation layer, between the universal virtual computer and the computer of that time.

This approach requires substantial investments both at the time of archiving, in developing encoding methods or UVC-native interpretive programmes for each data type, and at the time of restoration, in developing a UVC emulator and restore programmes. Moreover, if original data objects are abstracted or transformed for encoding purposes, such transformation may discard essential characteristics.

The proof-of-concept prototype for the UVC approach (Lorie, 2002) has been used to produce a logical schema, decoder programme and representation mechanism for PDF documents, such that the document content can be represented using a UVC interpreter and restore programme.

7. Alternative Strategies

Alternative strategies to digital preservation include taking analogue backups of documents (print or microfilm) or recovering data from obsolete digital media.


7.1 Analogue Backups

Analogue backups combine the conversion of digital objects into analogue form with the use of durable analogue media, e.g., taking high-quality printouts or the creation of silver halide microfilm from digital images. An analogue copy of a digital object can, in some respects, preserve its content and protect it from obsolescence, without sacrificing any digital qualities, including sharability and lossless transferability. Text and monochromatic still images are the most amenable to this kind of transfer. Given the cost and limitations of analogue backups, and their relevance to only certain classes of documents, the technique only makes sense for documents whose contents merit the highest level of redundancy and protection from loss.

Limitations of this strategy include: i) advantages offered by digital technology, such as convenience of use, storage efficiency, and search and navigation possibilities, are lost; ii) the strategy does not completely remove the threat of technological obsolescence; and iii) long-term stability of analogue material may depend on expensive storage environments that prove to be less reliable than well-managed computer systems based on high levels of redundancy.

7.2 Digital Archaeology

Digital archaeology includes methods and procedures to rescue content from damaged media or from obsolete or damaged hardware and software environments. Digital archaeology is explicitly an emergency recovery strategy and usually involves specialized techniques to recover bit-streams from media that have been rendered unreadable, either due to physical damage or hardware failure such as head crashes or magnetic tape crinkling. Digital archaeology is generally carried out by commercial data recovery companies, which maintain a variety of storage hardware (including obsolete types) plus special facilities such as clean rooms for dismantling hard disk drives. Given enough resources, readable bit-streams can often be recovered even from heavily damaged media (especially magnetic media), but if the content is old enough, it may not be possible to make it renderable and/or understandable.

7.3 Combination Strategies

As mentioned before, no single strategy is appropriate for all data types, situations, or institutions. A number of strategies may, therefore, be necessary to cover the range of objects and characteristics to be preserved. Preservation programmes should also consider the potential benefits of redundancy in pursuing more than one strategy. It may be noted that even with good planning, a single strategy may fail, leaving the programme with no means of access. Several digital preservation projects use more than one approach, for example:

Standards such as TIFF for image collections are often chosen in preparation for eventual migration to other standard formats over the long-term;

The VERS strategy couples the use of standards (PDF, XML) to the future use of viewers and the likely migration of XML encoded metadata in the future;

Persistent archives use data abstraction with a view to eventual migration: migration of the data, the mark-up system and the supporting software, and upgrading of hardware;

The Universal Virtual Computer (UVC) approach combines data abstraction with rules for migration of data objects at the point of access, and an emulation approach for software objects. The “durable encoding” approach adds the use of fundamental standards for encoding data, including encoding that could be understood by the UVC.

8. IPR Issues

IPR issues are not simple in the digital preservation world, where migration copies, archival copies, derivative versions, and other states of an object exist over a period of time. Meeting legal requirements for preserving digital objects requires careful, comprehensive, ongoing approaches that avoid risk to the organization.

Each institution must determine if the material it is seeking to preserve is in the public domain, or if someone other than the institution owns the copyright. It may want to try to locate the copyright owner and license (perhaps at a cost) those rights needed to preserve a work. Alternatively, it may conclude that preservation activities are authorized by “fair use”. Under copyright law, the author or copyright holder has the exclusive right to copy a work. In addition, the copyright holder has the right to:

prepare derivative works based upon the copyrighted work

distribute copies of the copyrighted work to the public

perform some copyrighted works publicly

display some copyrighted works publicly

in the case of sound recordings, perform the copyrighted work publicly by means of a digital audio transmission in the United States

control access to a work protected by the use of a technological measure

Digital preservation strategies may impinge on these rights. Migration, for example, may be a violation of the copyright owner’s right to prepare a derivative work. Making a digital work broadly available may impinge on the copyright owner’s distribution, performance, and display rights. Preserving a password-protected or encrypted file may require violating the copyright owner’s exclusive right to control access. Given that almost everything is copyrighted and the copyright owner has extensive exclusive rights, how can digital preservation occur without risking copyright infringement?

8.1 US Copyright Law Section 108 (Limitations on exclusive rights: Reproduction by libraries and archives)

Section 108 of the US Copyright Law, as modified by the Digital Millennium Copyright Act of 1998, has a provision that allows libraries and archives to copy, digitise and make accessible published documents in their collections. The amended Section 108 also permits up to 3 digital copies of unpublished and damaged works for preservation of copyrighted material, provided that digital copies are not made available to the public outside the library premises or put on the Internet. It further permits a library to copy a work into a new format if the original format becomes obsolete, that is, if the machine or device used to render the work perceptible is no longer manufactured or is no longer reasonably available in the commercial marketplace.

In order to be able to take advantage of the exemptions, certain ground rules must be met. The library or archives must be open to the public; the copying cannot be for “direct or indirect commercial advantage;” and any copies made must carry a notice of copyright.

Assuming that those conditions are met, libraries and archives can engage in limited copying for preservation purposes without fear of infringement. However, certain other requirements apply:


You must own a copy of the original.

The copying must be solely for preservation or security.

The original must be “damaged, deteriorating, lost, or stolen,” or the existing format in which the work is stored is obsolete.

A reasonable investigation reveals that an unused copy cannot be obtained at a fair price.

8.2 US Copyright Law Section 107 (Limitations on exclusive rights: Fair use)

Another exemption libraries and archives can use for their digital preservation programs is Section 107, Fair Use. Fair use is a judicially interpreted doctrine decided on a case-by-case basis. You have no assurance that any specific use is fair until a judge tells you it is fair. And while fair use is supposed to favour reproduction done for the purpose of teaching, scholarship, or research, not all copying done for such purposes is automatically fair. In determining whether a use is fair, a court must consider at least four factors. These are:

the purpose of the use (including whether the use transforms the original into something new or merely replicates the original)

the nature of the original material (whether it is primarily creative or factual)

the amount of the original duplicated

the effect on the potential market for, or value of, the original

Given the social benefit of preservation, it seems likely that the courts would tolerate a preservation program that sought only to preserve digital information but did not seek to distribute it to others.

Any digital preservation program is likely to exist in a grey area of legality. It is important for those charged with digital preservation responsibilities to understand that, while many actions may be acceptable, the area in which they can work with legal certainty (primarily the exceptions afforded by Sections 107 and 108) is extremely limited. It is imperative, therefore, that digital preservation programs remain in close contact with their institution’s legal advisors to ensure that they do not place their institution at an unacceptable level of risk.

9. Strategies for Indian Librarianship

The Indian IT industry is progressing at a very fast pace, and India is considered a superpower in the IT field. We have been generating huge amounts of digital information for the last 25 years. We have, however, not fully realised the importance of preserving the digital data being generated by government agencies, research organisations, academic institutions, cultural and commercial organisations, etc. There is an immediate need to address the issue of digital preservation at the national level and to formulate a National Digital Preservation Policy. This Conference can recommend the same to the Ministry of Culture and the Ministry of Information Technology.

In India, digital preservation will need to be a distributed responsibility. This is partly because of the enormous amount of digital material being produced by a large number of organisations and partly because of the problems related to digital preservation mentioned earlier. Decisions regarding preservation of digital information need to be taken at an early stage, and those creating digital data are logically the ones best able to undertake that initial activity. Solutions are also not going to be of the “one size fits all” kind. Different approaches have to be adopted for different types of digital resources and, while duplication of effort is to be avoided, a certain amount of judicious overlap can be beneficial, particularly in these early stages of developing digital repositories.

The role of organisations creating digital materials is both crucial and difficult to integrate into a coherent infrastructure for preserving digital materials, mainly because they may be reluctant to hand over their materials elsewhere. Libraries and archives have established their credentials for preserving print materials over a very long timeframe. In these very early stages of developing digital repositories, it may be difficult for creators to assign the same level of trust to librarians for preserving digital materials. Library professionals therefore have to take the lead and undertake the responsibility of preserving the important data of their organisations.

Some creators of digital materials may be best placed to undertake preservation responsibility because of their in-depth knowledge of the subject matter, but they may well lack the necessary archiving skills. The optimum solution in these cases might be an alliance between an organisation skilled in managing digital data and the creators, so that those with the greatest knowledge of the material maintain control over decisions on what content needs to be preserved and at what intervals.

The need for the development of reliable tools and services has been recognised throughout the world and related developments are taking place everywhere. This is yet another example of the global nature of digital preservation and the tendency for the same issues to emerge in different parts of the globe at much the same time. While focussing primarily on developing the Indian digital preservation agenda, we have to recognise that digital preservation is very much a global issue and it is critically important to establish good lines of communication with all those engaged in digital preservation efforts.

Indian libraries should also attend to the cultural record that is being created in digital forms. The digital environment is still relatively uncultivated at this stage, but the need is urgent, the time is opportune and the conditions are fertile for a strong, far-sighted set of cultivating actions to help ensure that the digital record ultimately matures and flourishes. Analysing the emerging digital environment and setting up digital archives will help identify the most demanding preservation needs. The following issues need to be addressed urgently.

i) Primary responsibility for preserving digital data rests with the creators, providers and owners of digital information.

ii) Long-term preservation of digital information on a scale adequate for the demands of future research and scholarship will require a deep infrastructure capable of supporting a distributed system of digital archives.

iii) A critical component of the digital archiving infrastructure is the existence of a sufficient number of trusted organizations capable of storing, migrating and providing access to digital collections.

iv) A process of certification for digital archives is needed to create an overall climate of trust about the prospects of preserving digital information.

v) Certified digital archives must have the right and duty to exercise an aggressive rescue function as a fail-safe mechanism for preserving valuable digital information that is in jeopardy of destruction, neglect or abandonment by its current custodian.

In view of the above, it is proposed that India set up a National Centre for Digital Preservation (NCDP). The NCDP will not be a repository for digital data but will work towards a more effective and efficient infrastructure for digital preservation within the country. It will set the tone for initiating digital preservation activities in a coordinated manner.

The NCDP may develop pilot projects, propose support structures, and develop best practice. In particular, it may undertake the following activities:

9.1 To coordinate with existing and potential digital archives around the country and provide coordinating services for better preservation of digital data.

Action is urgently needed to ensure that documents, software products and other digital information objects are preserved before they slip irrevocably away. The proposed NCDP may undertake a project designed with this particular focus as a cooperative venture so as to develop strategies for preserving precious data in distributed digital archives. Because the objects in this focal area are at such risk of loss, the project would also provide a useful means of exploring the operations of certification and fail-safe mechanisms for digital archives.

9.2 To initiate a national debate on setting up advanced digital archives, particularly with respect to removing legal and economic barriers to preservation.

A national debate may be sponsored by the NCDP to generate creative thinking about the commitment to the development of digital archives. It might be focused on fostering creative alliances, especially with publishers, and practical, joint efforts designed to lower the legal and economic barriers to the effective operation of digital archives.

9.3 To recommend archival application of technologies and services, such as hardware and software emulation algorithms, transaction systems for property rights and authentication mechanisms, which promise to facilitate the preservation of digital data.

Only through early and active use will digital archives be able to influence the development of key new technologies and services and help to ensure that they support information longevity. Moreover, there is a growing need for evidence that digital archives can practically and effectively incorporate in their daily operations automated systems for emulating obsolete hardware and software, transacting intellectual property and using cryptographic and other mechanisms for creating trusted distribution channels for digital information.
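One of the simplest cryptographic mechanisms an archive can adopt in its daily operations is a checksum manifest for detecting altered or missing files. The following is only a minimal sketch under stated assumptions: the function names (`sha256_of`, `build_manifest`, `changed_files`) and the manifest layout (relative path mapped to a SHA-256 digest) are hypothetical illustrations, not a scheme proposed in this paper or prescribed by any standard.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: Path) -> dict[str, str]:
    """Map each file's relative path to its digest (hypothetical layout)."""
    return {
        p.relative_to(folder).as_posix(): sha256_of(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def changed_files(folder: Path, manifest: dict[str, str]) -> list[str]:
    """List manifest entries that are now missing or whose content differs."""
    current = build_manifest(folder)
    return sorted(
        name for name, digest in manifest.items() if current.get(name) != digest
    )
```

An archive might build the manifest at ingest, store it separately from the collection, and rerun `changed_files` on a schedule or after each migration; an empty result suggests that the stored files still match what was deposited.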

9.4 To develop a national information infrastructure in which longevity of digital information is an explicit goal.

The NCDP may work towards developing a distributed network of linked digital information archives in which digital information will flourish over the long term. Communication and information network policy decisions regarding pricing, security and network extension will greatly affect the viability of these archives and their efforts to preserve digital information. These policy decisions need to be informed by an understanding of the importance and complexity of digital preservation.

9.5 To prepare a white paper on the legal and institutional foundations needed for the development of a National Depository of digital data.


The NCDP may work out a proposal for amending the Delivery of Books Act 1956 so as to have a legally mandated system of deposit for published works, in which publishers are required to place with a certified digital archive a copy of a work in a standard archival format in addition to the printed copies.

9.6 To create subject digital repositories.

ICSSR, NISCAIR, DESIDOC, NML, INSA and other such apex bodies in specialised subject areas should be encouraged to create digital repositories in their subject areas.

9.7 To examine, test and implement emerging standards and tools regarding formats, hardware/software, security, access rights management, etc.

The NCDP may work out standards for digital archives and administer the process of digital archival certification. The appropriate individuals and organizations now need to begin systematically identifying and describing the standards, criteria and mechanisms for archival certification, and thereby launch the process that would lead ultimately to a formal certification program.

9.8 To act as the national agency for coordinating digital preservation initiatives in the country and also to coordinate with other countries.

There is considerable evidence of worldwide interest in the means of preserving digital information. A number of agencies are working on it at national and regional levels, viz. the European Union, the Consortium of University Research Libraries in Great Britain and a national Working Party in Australia on the management of material in electronic format. They have generated working papers on the topic of digital preservation and invited international collaboration. India needs a nodal agency to identify and facilitate international collaborative efforts in the field of digital preservation.

9.9 To identify current best practices and to benchmark such practices for use in the country.

a. The design of systems that facilitate archiving at the creation stage.

There is a need to provide long-term access to government and scholarly data being produced in the country. This will require studying how publishers are redesigning the creation process to support their electronic publishing programs, what software they are using, and how they have influenced software producers to modify their products, and then suggesting solutions for various organisations in the country.

b. Storage of massive quantities of culturally valuable digital information.

There is an immediate need to develop large digital archives for socially and culturally important information. Examples include the archives of census data, remote sensing satellite imagery, weather data, land records, scholarly output, research data, etc.

c. Requirements and standards for describing and managing digital information.

Descriptive information about the content of digital objects, their origins and provenance and their management over time is critical for both long-term preservation and future use of digital information. Standards and best practices for describing and managing digital information are needed to track changes in ownership or control over digital objects throughout their life cycle, to administer intellectual property rights, and to document any changes in the format and structure of digital objects that may ensue from migration.


A responsible digital archive must provide to its users what it knows about the provenance and context of its objects so that users can make informed decisions about the reliability and quality of the evidence before them. The Ministry of IT, the Ministry of Culture, professional bodies, and Library and Information Science professionals need to collaborate in an evaluation and expansion of descriptive standards and practices so that they satisfy the special requirements of digital preservation and access.
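The kind of descriptive record discussed under item c can be pictured as a small data structure that keeps an object's rights holder, custodian and format together with a dated history of changes. This is a hedged sketch only: the class names, field names and event kinds below are hypothetical and are not taken from any existing descriptive standard or from a system described in this paper.

```python
from dataclasses import dataclass, field


@dataclass
class PreservationEvent:
    """One dated change in custody or format (hypothetical structure)."""
    date: str    # ISO 8601 date, e.g. "2009-02-25"
    kind: str    # "custody_transfer" or "format_migration"
    detail: str  # human-readable description of what changed
    agent: str   # organisation that carried out the change


@dataclass
class DigitalObjectRecord:
    """Descriptive and provenance information for one digital object."""
    identifier: str
    title: str
    creator: str
    rights_holder: str
    current_format: str
    custodian: str
    events: list[PreservationEvent] = field(default_factory=list)

    def transfer_custody(self, date: str, new_custodian: str) -> None:
        """Record a change in who controls the object."""
        detail = f"{self.custodian} -> {new_custodian}"
        self.events.append(
            PreservationEvent(date, "custody_transfer", detail, new_custodian)
        )
        self.custodian = new_custodian

    def migrate(self, date: str, new_format: str, agent: str) -> None:
        """Record a change of format resulting from migration."""
        detail = f"{self.current_format} -> {new_format}"
        self.events.append(
            PreservationEvent(date, "format_migration", detail, agent)
        )
        self.current_format = new_format

    def provenance(self) -> list[str]:
        """Return the history as readable lines, oldest first."""
        return [f"{e.date} {e.kind}: {e.detail} ({e.agent})" for e in self.events]
```

Because every change is appended rather than overwritten, a user consulting the record can see both the current state of the object and how it got there, which is the information the text says an archive should be able to disclose.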

d. Migration paths for digital preservation of culturally valuable digital information.

Data migration is a common practice as organizations preserve their essential data records through successive changes in hardware and software. Cultural archives that have been collecting digital objects have also had to begin migrating them as the hardware and software on which they were created have become obsolete. What is the range of experience of different organizations with archiving different types of content? What can be learned and generalized from these experiences? How do strategies compare among different organizations for archiving similar materials? Are there economies of scale that could be achieved by combining efforts across digital archives? What are the costs of the different strategies employed? What strategies have failed? In what ways have practices improved over time? On the basis of an analysis of the above, the NCDP may formulate recommendations for a national digital programme and initiate dialogue at the international level with similar agencies.

10. Conclusion

Decisions about preserving information should consider the costs. We can use current technology to determine the costs of retaining information; however, both expenditures and technology will evolve. Whereas we can project the costs for basic elements of technology, such as digital media per unit volume of information and unit processing by computers, there are no proven techniques for estimating the costs of long-term digital information preservation.

We can now make information easily available to communities worldwide via the Internet. We, however, face the challenge of preserving digital information with its paradox of short media life, obsolete hardware and software, slow read times of old media, and defunct Web sites. Despite the wealth of accumulated, technology-generated information, we currently lack proven methods for preserving this information or for using optimal technology tools to access it and determine its authenticity. Failure to address these digital preservation problems is analogous to squandering potential professional, personal, and economic gains, contributing to cultural and intellectual poverty, and resulting in exorbitant costs for recovery. We are compelled to meet the research challenge to resolve the conflict between the creation context and the use context to facilitate digital information preservation.

There are numerous challenges before us, but also enormous opportunities to contribute to the development of a national infrastructure that positively supports the long-term preservation of digital information. Such an infrastructure is a desirable outcome that will benefit us only if we conceive and structure it to benefit posterity.

About Author

Dr. Ravinder Kumar Chadha, Joint Secretary, Parliament of India, Lok Sabha Secretariat, New Delhi 110001.

E-mail: [email protected]