Summary of Review of International Initiatives

    McDonald, John. 2006. A Review of Relevant International Initiatives.

    http://www.collectionscanada.gc.ca/digital-initiatives/012018-3300-e.html

    Summarized by Ernest Hoffman

    This report provides an overview of current digital information initiatives and organizations from

    around the world, together with analysis of how they could inform the development of CDIS. It

    focuses on national and regional initiatives that have articulated a strategy, assessed the

    feasibility of a strategy, or constitute a de facto strategy.

    The report is focused on 'born digital' information, based on the assumption that the greatest

    challenges being faced by national governments concern the management of information created

    electronically, whose ongoing use and preservation in digital form are dependent upon

    technologies that will change over time, and because approaches to managing born-digital

    information are in flux.

    Most of the initiatives address born-digital data such as government records, scientific research,

    etc., and so are unrelated to our project. As we are concerned with the preservation of complete

    webpages, including news content together with links, comments, social media elements, ads,

    etc., the areas of the report relevant to our project are those on web-harvesting policies, systems,

    and practices.

    The most developed web harvesting systems covered in the report are:

    The Swedish Royal Library's Kulturarw3 project, which has been harvesting Swedish

    websites since 1996. Their approach has often been cited as an example of a 'whole domain' or

    'comprehensive collection' and is based on the approach taken by the Internet Archive in the

    U.S. This is the most comprehensive and longest-running program of its kind in the world, and

    would provide a good starting point for a Canadian online news harvesting program.

    The e-Depot of the National Library of the Netherlands, an automated system for the

    ingestion, description, management and long-term storage of electronic publications. Like the

    Swedish program, it is characterized as actually more of a storage system than an active

    preservation system, and was created in collaboration with IBM. Unlike the Swedish program,

    however, web-harvesting is only one aspect of the e-Depot, which functions as a highly developed

    automation model for all kinds of records, and could provide a good template for large-scale

    digital acquisition in Canada.

    Australia's PANDORA initiative (Preserving and Accessing Networked Documentary

    Resources of Australia) which is a national network of distributed archives similar to the TDR

    network envisioned by the CDIS, in which different institutions are responsible for different types of

    information. PANDORA collects both websites and discrete publications, so presumably it is

    archiving Australian news websites.

    The New Zealand National Library has also established a "trusted digital repository" based on

    the Library's own "digital information strategy", which has among its objectives "To ensure the

    long-term storage and preservation of New Zealand's online heritage", which would also include

    online news content.

    On the policy and rights side, the report highlights the Danish Royal Library's Digital Policies

    Framework. The review of existing policies and the development of a Strategic Plan led to the

    establishment of the "Hybrid Library" initiative. Danish legislation on archiving web pages now

    allows the Royal Library to harvest published documents without problems of copyright. Both

    their review of existing policies and the legislation which allows them to harvest would be worth

    looking into.

    Outside of rights issues, the main challenge to automated mass-archiving of websites is accessing

    them later, so the ability to emulate the original hardware and software environment is essential.

    The report highlights CAMiLEON (Creative Archiving at Michigan & Leeds: Emulating the Old on

    the New), a joint initiative between the University of Michigan and Leeds University. It ran from

    1999 to 2003 and explored various emulation techniques for the preservation of digital

    information, so its findings are likely to bear on website retrieval and emulation.

    Because this report provides only brief descriptions of the initiatives it covers, its main value is

    that it provides links to the most important plans and projects underway around the world. It

    would be worthwhile to search the domains of the various national libraries, government sites

    and other organizations listed in the report for keywords related to web-harvesting, as well as those

    related to online news in particular. This is especially the case for the British and U.S. digital

    preservation initiatives, which are too large and complex to have been given detailed treatment in

    this report, but which no doubt have web-harvesting and news-specific elements within them.
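
    The keyword search suggested above could be organized roughly as follows. This is only a sketch:

    the domains are placeholders rather than the actual sites listed in the report, the keyword list is

    our own guess, and the site: operator is assumed to be supported by whichever search engine is used.

```python
from itertools import product


def build_site_queries(domains, keywords):
    """Return one site-restricted query string per (domain, keyword) pair."""
    queries = []
    for domain, keyword in product(domains, keywords):
        # Quote the keyword so multi-word phrases are searched as a unit.
        queries.append(f'site:{domain} "{keyword}"')
    return queries


# Placeholder domains (hypothetical); substitute the sites listed in the report.
domains = ["library.example.org", "archive.example.net"]

# Keywords taken from the topics discussed in this summary.
keywords = ["web harvesting", "online news"]

for query in build_site_queries(domains, keywords):
    print(query)
```

    Each printed line can be pasted into a search engine, which keeps the search consistent across

    institutions and makes it easy to rerun with a revised keyword list.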

    The report also contains a list of journals where digital preservation research is published, and

    these could also be searched with our chosen keywords.