
BAV and the FITS (Flexible Image Transport System) format

40 Years of Experience in Long-Term Digital Conservation

-------------------------------------------------------

The Vatican Library is by its very constitution a library of preservation: for more than 500 years it has fulfilled its fundamental role of preserving, protecting and restoring the patrimony of books it holds. This daunting task has been made even more demanding because the Vatican Library has never forgotten that free consultation of its immense patrimony by scholars from around the world is a mandate from which it cannot be dispensed. In keeping with this role, and wishing also to preserve its manuscripts for future generations, the Library started a project of "long-term digital preservation" that, on the one hand, gives every possible guarantee of the longevity of the technological product and, on the other, makes the manuscripts held by the Vatican Library available to a wide range of users worldwide. The project, begun in 2007, was preceded by a long period of preliminary studies, which laid down the guidelines to follow during the planning stages. The analysis carried out in that initial period highlighted the uniqueness of the endeavor and the impossibility of drawing on experience from similar projects of long-term digital conservation. To our knowledge, no archive of comparable importance anywhere in the world has so far had to store at least 10 petabytes of data.


After the preliminary phase, the Vatican Library's plan had clearly defined the basic parameters that would guarantee the longevity of such an undertaking. Drawing on the expertise of our photographic laboratory, we carefully selected the acquisition device during the test bed.

With the same attention we identified the image management software applications and applied the device calibration procedures with extreme care, so as to keep the color profile exact and stable throughout the long phases of the acquisition project.

A detailed study was devoted to defining the graphic parameters of acquisition, with the aim of establishing the exact values by which an image should be evaluated in relation to the sensitivity of the sensor used, its number of lines and the exact shape of its pixels.


The preliminary study, together with the analysis of the data gathered during the test bed, led to the selection of Metis planetary scanners, which use trilinear CCD sensors designed and manufactured by Kodak. These KLI sensors were designed about 20 years ago (and updated over time) for military and professional use only. They are certainly more expensive, but they remain unsurpassed in quality and in their endurance under production workloads. Kodak recently sold its CCD sensor division to Truesense Imaging, which continues development and updating using the latest technologies. The sensors for the Metis planetary scanners are purchased directly from Kodak headquarters in the United States, without going through European subsidiaries. This guarantees careful selection according to production standards and the rejection of components at the margins of production tolerance.

Above all, however, the two fundamental parameters that form the solid foundation of this project are the choice of the digital preservation format and strict control of technological obsolescence, that is, of the continued readability of the storage devices.


Obviously, the control of technological obsolescence is strictly the responsibility of the data center management and therefore depends entirely on consolidated planning and strict enforcement of the policies that govern these issues.

The choice of FITS as the format for preserving cultural heritage assets, on the other hand, was the result of a long and fruitful collaboration that began with researchers from the Roman astrophysics community and then spread to the entire international scientific community, from which we have received strong encouragement to continue in that direction. The main feature of the FITS format is that it is an open source format, designed by NASA in the 1970s. Analyzing the acquisition processes of the astronomical and astrophysical worlds, we noticed great similarity with the processes for acquiring cultural assets, and complete agreement on conservation issues. As the crowning achievement of this intensive collaboration, keywords for cataloguing the world's cultural heritage are now being defined. These keywords will enter the FITS standard and merge with the existing ones for astrophysics and space physics, creating the first format in XML with powerful descriptive characteristics for different storage environments.


The FITS format: digital preservation between manuscripts and stars

The digital storage market conventionally uses the TIFF standard. A file in this format is limited in size to 4 GiB (about 4.3 gigabytes), because TIFF addresses its contents with 32-bit offsets, and although new solutions have been proposed to overcome this problem, the large-file extension ("Grand Large TIFF") has not been successful and is not free. Furthermore, TIFF does not allow three-dimensional visualization. It is a proprietary format belonging to Adobe, created in 1992 and last updated in 1998.

The FITS format comes to the world of conservation with very advanced technology and about 40 years of experience in digital preservation. Its design originated in NASA laboratories for the conservation of images from lunar missions and dates back to the seventies, even though it was officially presented in 1981. It is updated at least every six months, and the updates are the work of the entire global scientific community, involving American, Japanese and European working groups. The product is "open source", which makes it practically one of a kind; considering the enormous volumes of data normally handled in long-term digital preservation projects, it is essential that the data be kept in an open format, free of any commercial constraint.

So the format that preserves the galaxies, planets and stars is now also used to conserve the cultural heritage of the Vatican Library. Once one grasps that the source code of FITS is in the public domain, and that improvements to it can be shared with the scientific and astronomical communities that already use it, one immediately sees how the versatility of this format can follow, step by step, the evolving needs of every phase of the digitization process. In practice, FITS is a dynamic format that supports the curator in every aspect of improvement as digital preservation evolves. The files it generates are not limited in size and are already prepared for a third or fourth dimension.

I hope that this conference can compare the different conservation policies expressed by the most popular formats. It would also be interesting to understand why there is so little openness to analyzing new issues of long-term preservation, both in the world of cultural heritage and in the world of public administration. These two hemispheres need constant dialogue to protect the goods whose conservation is to be decided. Frequent technological review would not only make conservation processes more efficient and less costly; above all, in the delicate selection carried out through the parameters set out in tender specifications, it would keep the technology required by those specifications up to date, and so would surely yield the most desirable result for the digital preservation process under consideration.
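The size comparison between the two formats can be checked with a little arithmetic. The sketch below is only illustrative: it assumes the classic TIFF ceiling of 2^32 bytes that follows from 32-bit offsets, and the FITS rule that a data array occupies |BITPIX|/8 bytes per sample, padded to whole 2880-byte blocks. The scan dimensions are hypothetical and are not taken from the Library's project.

```python
import math

TIFF_MAX_BYTES = 2**32  # classic TIFF stores file offsets as 32-bit unsigned integers
FITS_BLOCK = 2880       # FITS files are built from 2880-byte blocks


def fits_data_bytes(bitpix, axes):
    """Size in bytes of a FITS primary data array, padded to whole blocks.

    bitpix: bits per sample (negative values denote floating point in FITS).
    axes:   length of each axis, e.g. (width, height, planes).
    """
    raw = abs(bitpix) // 8 * math.prod(axes)
    blocks = -(-raw // FITS_BLOCK)  # integer ceiling division
    return blocks * FITS_BLOCK


# Hypothetical scan: 16 bits per sample, 40000 x 30000 pixels, 3 color planes.
size = fits_data_bytes(16, (40000, 30000, 3))
print(f"FITS data size: {size:,} bytes")
print(f"Fits in a classic TIFF: {size <= TIFF_MAX_BYTES}")
```

An image of this hypothetical size needs about 7.2 gigabytes of pixel data, which exceeds what 32-bit offsets can address, while the FITS layout places no comparable ceiling on the array.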

One always gains from comparison with other organizations that address such complex issues, because what prevails is the spirit that animates projects of long-term digital preservation, from which the protection of cultural property derives the greatest benefit. Nor should we forget that, through this process, the dissemination of the holdings gains momentum, like a flywheel, alongside traditional media channels. Digital conservation is thus a means of protection, because it spares the original from exposure to bacteriological agents and sudden changes in temperature during consultation; but it is also a tool for disseminating virtual pages that will be an indispensable object of study for paleographers.

Benefiting from the experience gained during the test bed and from the comparisons made with several technology companies over these years, the Vatican Library has progressively consolidated the guidelines that have become the backbone of the long-term digital conservation project.

Therefore FITS becomes the only format kept in archival storage. The storage workflow will use a script, already created during the beta test phase, that exports the XML descriptive data produced by the acquisition device, which normally works in TIFF, into the FITS format used for conservation.

The system will convert FITS to TIFF on demand, restoring the original XML information from the device.
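As a rough illustration of the idea, and not the Library's actual script, the sketch below carries XML descriptive data into FITS header cards and reads it back using only the Python standard library. The XML element names, the resulting keywords (`SHELFMRK`, `SCANNER`) and the tiny 8-bit grey image are all hypothetical; a production workflow would more plausibly rely on an established FITS library such as CFITSIO or astropy.

```python
import xml.etree.ElementTree as ET

BLOCK = 2880  # FITS headers and data are padded to multiples of 2880 bytes


def card(key, value):
    """Format one 80-character header card: keyword, '= ', then the value."""
    if isinstance(value, bool):
        text = ("T" if value else "F").rjust(20)
    elif isinstance(value, int):
        text = str(value).rjust(20)
    else:
        text = "'" + str(value).replace("'", "''").ljust(8) + "'"
    line = f"{key:<8}= {text}"
    if len(key) > 8 or len(line) > 80:
        raise ValueError(f"card too long for this sketch: {key}")
    return line.ljust(80)


def to_fits(rows, metadata_xml):
    """Build a single-HDU FITS file (bytes) from 8-bit grey rows plus XML metadata."""
    height, width = len(rows), len(rows[0])
    cards = [card("SIMPLE", True), card("BITPIX", 8), card("NAXIS", 2),
             card("NAXIS1", width), card("NAXIS2", height)]
    # Each child element of the (hypothetical) device XML becomes one keyword.
    for elem in ET.fromstring(metadata_xml):
        cards.append(card(elem.tag.upper(), elem.text or ""))
    cards.append("END".ljust(80))
    header = "".join(cards).encode("ascii")
    header += b" " * (-len(header) % BLOCK)   # headers are padded with spaces
    data = b"".join(bytes(row) for row in rows)
    data += b"\0" * (-len(data) % BLOCK)      # data are padded with zeros
    return header + data


def read_header(fits_bytes):
    """Recover the keywords knowing only the 80-byte card layout."""
    found = {}
    for i in range(0, len(fits_bytes), 80):
        line = fits_bytes[i:i + 80].decode("ascii")
        key = line[:8].strip()
        if key == "END":
            break
        if line[8:10] == "= ":
            found[key] = line[10:].strip().strip("'").strip()
    return found


# Hypothetical device metadata and a 3 x 2 pixel image.
device_xml = "<scan><shelfmrk>Vat.lat.1</shelfmrk><scanner>Metis</scanner></scan>"
fits_file = to_fits([[0, 128, 255], [10, 20, 30]], device_xml)
header = read_header(fits_file)
print(header["SHELFMRK"], header["NAXIS1"], header["NAXIS2"])
```

Because the header is plain ASCII in fixed 80-character cards, the descriptive data can be recovered without the software that wrote the file, which is the property a FITS-to-TIFF reconversion step would depend on.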

This new philosophy of conservation can be extended to any type of document, because the FITS platform will be able to meet the needs arising from administrative records, documents and fiscal matters. The structure of these documents lends itself well to preservation in FITS format: files stored as images are no longer bound by the copyright conditions of the software that created them, and even after decades it will be enough to re-open the image file and submit it to whatever ICR or OCR software is in use 10 or 20 years from now. The file will be fully reusable without having to use, or even know, the software that generated it.


The diagram below shows an example of how a data center that distributes and maintains unified data on a FITS platform might evolve. The system distributes files on demand, using an open source process of re-conversion from FITS to TIFF or to other formats.

Thank you for your attention.
