An Emerging Market


  • 8/9/2019 An Emerging Market


    An Emerging Market: Establishing Demand for Digital Preservation Tools and Services


    Executive Summary

    Digital data are all-pervasive. Across industry, government and the public sector, digital files are vital to modern business. They facilitate communications, allow efficient exchange of information and reduce running costs. Digital data appear to be just a mouse-click away, whenever we need them, but for how long?

    Like analogue records, digital data are susceptible to decay. Bits and bytes may disappear over time and hardware and software can become obsolete, leaving existing data unreadable. Storage deterioration and lack of long-term management undermine the readability and accessibility of digital content. Data loss may have serious repercussions for an organisation. It could mean the loss of valuable research data, removing an organisation's competitive advantage. Or it could result in the disappearance of crucial audit trails, with undesirable financial and legal consequences for any regulated organisation. It is therefore vital to carry out active digital preservation, that is, to ensure ongoing, meaningful access to digital information for as long as it is required [1].

    This is not just a protective measure; it is also a business opportunity: a new market for digital preservation solutions and products.

    This Planets white paper examines digital preservation issues (strategic, technical and economic) from the perspective of vendors and suppliers. It draws on qualitative analysis of 18 interviews with leading IT companies based in the US, Europe, the Middle East and Australasia. Their thoughts and opinions are summarised in this paper and shed light on the emerging digital preservation market.

    A Planets White Paper by Pauline Sinclair (Tessella) and Amir Bernstein (Swiss Federal Archives)

    Published July 2010


    Key Findings

    • The digital preservation market is still in its infancy; there is plenty of potential for growth as it affects all business sectors.

    • Currently, engagement is led by the memory institutions [2] at national and international level. Engagement is also high in government and research organisations and is emerging in the private sector.

    • Legal obligation is the key driver for organisations to engage in digital preservation, although additional motivations vary by sector.

    • Although digital preservation is business critical, many organisations do not have a policy to cover it. Where policies exist, their comprehensiveness is variable.

    • There is a lack of information on the costs of digital preservation and its benefits (both tangible and intangible), which makes it hard to put together a convincing business case.

    • Budgets for digital preservation are generally short-term and tend to be project-based.

    • There is a perceived immediate need to preserve documents, images, audio, websites, video, spreadsheets and emails.

    • Migration is strongly preferred to emulation to preserve digital material. However, this may change, as emulation has a distinct role to play and there is some interest in emulation tools among the briefing participants.

    • Participants thought that the most important factors for a digital preservation solution are that it should: maintain authenticity, reliability and integrity; adhere to metadata standards; and check that records have not been damaged.

    • Scalability of digital preservation solutions to high volumes of content and high ingest rates is regarded as important, but scalability of access is not yet important.

    • While attendees thought standards are important in digital preservation, particularly OAIS and ISAD(G), they also thought that there are currently too many standards.

    • There is still a need for guidance (particularly training), exchange of best practice, and awareness-raising through conferences and workshops.

    • There is confusion as to what is meant by digital preservation. The difference between passive preservation (storing multiple backups of data) and active preservation (using migration or emulation to provide access to obsolete formats) needs to be clarified. It should be emphasised that only active preservation can ensure that digital material can be accessed in the future.

    [1] Mind the Gap: Assessing digital preservation needs in the UK, 2006, Waller, M., and Sharpe, R., Digital Preservation Coalition.

    [2] "Memory institutions" is a collective term for museums, archives and libraries.


    Methodology

    In 2009, Planets conducted a series of interviews with 18 IT companies, who between them cover all aspects of the digital preservation market, in order to better understand that market. In-depth structured interviews were held with each of the participants. Two-thirds of the interviews were held in Brussels and the remainder were conducted over the phone; all were held over a period of 3 weeks. Each interview covered exactly the same topics, with the facilitators following a script to ensure there was no bias. Initially, the participants were asked quantitative questions, where their choice of answers was constrained, and then these answers were explored further in open discussions. The questions were grouped into three sections covering: the digital preservation products and services offered by the participants, their views on the digital preservation needs of the market, and opportunities for Planets and the participants to work together.

    Of necessity, only a relatively small number of suppliers and vendors could be interviewed. However, the participants were picked to form a representative sample of the IT market in terms of size, market coverage and involvement in digital preservation. So, while the quantitative results reported in this white paper are not statistically significant, they are indicative of the trends and so offer a useful insight into the digital preservation market.

    The Participants

    The 18 participants in the Planets Supplier and Vendor Briefings comprised senior representatives (including 6 CEOs and 8 IT managers) from IT companies across the digital preservation market. The participating companies, who between them represent millions of customers, are mainly based in Europe but also in North America, the Middle East and Australasia. They range from small companies with up to 75 employees, an annual turnover equivalent to 0.5 to 4.5 million euros and activities on a national or a regional level (two or three countries) to multinational giants with thousands of employees and an annual turnover equivalent to billions of euros.

    The companies derive between 10% and 80% of their revenue from digital preservation services and have been involved in digital preservation for between 2 and 25 years. They all supply clients in the archives, libraries, public administration and commercial sectors and cover a broad spectrum of content segments, ranging from the sciences and medicine, via economics and the law, to the arts and humanities.

    Digital preservation suppliers are starting to follow the trend of providing services on-line and storage in the cloud, with 9 of the 14 participants who answered the question currently doing so. Of these, two-thirds offer data storage or preservation-planning services, only a third offer data analysis and one or two offer data normalisation, data migration or emulation services. The majority of the participants receive data on-line and subject it to a range of quality checks before accepting it.

    The Market

    What is Digital Preservation?

    Digital preservation means different things to different people. The main source of confusion is between the passive and active approaches to preservation. The first usually refers to the long-term storage of data. This approach aims to preserve the bit-streams through multiple backups. It ensures data are retained but, as systems are regularly replaced and formats constantly change, this approach cannot guarantee that the data will be readable or understandable in the future. Active preservation starts with the passive approach but uses either regular data migration or the provision of emulation tools to ensure that data can be read, accessed and understood for as long as necessary. For many this is seen as too complicated and so it is the passive approach that wins: "Most institutions regard preservation as storage."

    In the US, digital preservation tends to mean passive preservation, as exemplified by LOCKSS (Lots of Copies Keeps Stuff Safe), while in Europe the active approach is more common and this is the approach that Planets has taken. Participants also identified differences in attitudes by sector. In the private sector, the emphasis is on storage and immediate access. In the public sector, and particularly memory institutions, it is on active preservation and long-term access. There is also some confusion between preservation and digitisation. While digitisation makes material visible and disseminates it widely, it does not in itself guarantee long-term availability.

    "There is still a lot of confusion, a lot of work in education to be done."
    Source: 2009 Planets supplier / vendor briefing participant


    A Market in its Infancy

    There is emerging engagement in digital preservation in other market sectors, specifically:

    • The financial sector, with banks and insurance companies obliged to retain audit trails and which, since the 1980s, have been moving towards digital record management systems.

    • Healthcare organisations, which are required to retain patients' records.

    • The pharmaceutical industry, which is interested in preserving accumulated scientific knowledge.

    • The oil industry, which is looking for possibilities to reanalyse old seismographic data in search of new oil and gas reserves.

    • The photographic industry, where the growth in digital images has been and continues to be huge.

    Engagement is also perceived to be led by large organisations with capital and in-house or third-party IT resources. This is unsurprising as the digital preservation market is still in its infancy and so economies of scale have not yet been brought to bear; it is only those with plenty of resources and strong motives who have started to tackle it. Engagement is thought, by some, to be higher in the US than Europe.

    Legislation introduced over the past decade has required organisations to retain digital information for significant periods of time (ranging from 6 months to 10 years or more). Such legislation includes the Data Protection Directive 95/46/EC, as enacted in the countries of the EU, and MiFID in Europe; Sarbanes-Oxley and HIPAA in the US; and the Basel II Accord worldwide. It applies not just to organisations' core business data but also to day-to-day administration data such as contracts, pension plans and health and safety records. Financial institutions are particularly concerned with compliance and the need to keep audit trails of transactions to minimise liability.

    While legal obligation is the key driver for most organisations, data security, business continuity and re-analysis of data are also considered to be important motivations for engaging in digital preservation.

    Different drivers are perceived to be at work in different sectors; a secondary driver for many in the private sector is the commercial need to reanalyse data. Outside memory institutions, engagement is motivated by external imperatives rather than a belief in the inherent value in retaining information.

    Figure 1: Customer Segments Targeted by the Planets Supplier and Vendor Briefings Participants

    "10% do digital preservation, 70% are thinking about it but don't know what to do and 20% are not aware of it."
    Source: 2009 Planets supplier / vendor briefing participant

    The market is perceived to be at a relatively early stage in its development. Engagement in digital preservation is being led by the memory institutions and is also thought to be high among government and research organisations. This is reflected in the customer sectors in which participating companies have operations.

    [Bar chart: Customer Segments Targeted by the Vendors and Suppliers. Vertical axis: No. Respondents (0 to 14). Horizontal axis categories: Regional Government; International Org.s Archives; National Archives; State Archives; International Org.s Libraries; National Libraries; Private Libraries & Archives; Central Government; Local public bodies; University Archives; State Libraries; University Libraries; Private Libraries; Local Government; Financial Institutions; Museum Archives; Government Agencies; Industrial Companies; Insurance Companies; Legal Firms; Other.]


    The Planets Market Survey [3] demonstrated the correlation between articulating a digital preservation policy and engagement in digital preservation activities. However, the briefings participants saw things as less clear-cut. Although some of their customers have such policies, many have only inadequate policies or none at all. For some, this is because they are dependent on a single vendor's solutions and so don't see the need to write their own policy, while others are just getting on with it. The public sector and memory institutions lead the way with the development of policies and are perceived to have policies that are comprehensive and that relate to long-term preservation. Where policies exist in the private sector, they may be incomplete or concerned with storage and access only. As with engagement, it is often larger companies with their own IT resources that have developed a policy, while small companies lack one. Organisations in the US are less likely to have a policy than organisations in Europe, despite greater pressure to retain data.

    Articulating a digital preservation policy is only the first step, though, and as one representative noted: "Writing policies is cheap, implementing them is expensive." Unfortunately, digital preservation is often seen as a luxury, particularly in the current economic climate. The case has not been made to prioritise digital preservation and consequently it is subject to economic conditions. There is a lack of data on the costs of digital preservation, both initial set-up costs and long-term running costs, and the benefits (such as data reuse, avoidance of fines from regulators, compliance with legislation), both tangible and intangible, have not been clearly articulated. All of this makes it hard to put together a solid business case.

    It is not surprising, therefore, that where digital preservation budgets exist, they are short-term: typically 1 to 3 years ahead and rarely more than 10 years, although memory institutions and government are thought to take a longer-term perspective. This is a problem when legislation demands that organisations retain data for longer periods than the budgets cover, as it hinders planning. In many cases funding is strictly project-based. This may be no bad thing, as it can lead on to more long-term funding when a case can be made to sustain the digital preservation system created in an initial project.

    In some cases, budgets may be spent preferentially on digitisation. Several participants commented that end-users are more interested in ensuring wider access to, and greater visibility of, their existing material through digitisation than in engaging in digital preservation. "Most want to put their material online and are not interested in preservation." Through such actions end-users believe they have preserved their analogue data but, in the long term, the digital surrogates created by digitisation need to be preserved, just like born-digital material. One vendor commented that the focus on digitisation is the result of funding being available to support it: "Funds [exist] for digitisation but not for preservation. This may be the wrong order but, on the other hand, it is pragmatic. They digitise and then look for strategies and solutions to retain the content."

    "Digital preservation is still considered a one-time project; usually capital budget and not revenue budget is used."
    Source: 2009 Planets supplier / vendor briefing participant

    Figure 2: Main Motives for Organisations to Engage in Long-term Digital Preservation

    [3] The Digital Divide: Assessing Organisations' Preparations for Digital Preservation, A Planets White Paper, 2010, Sinclair, P.

    [Bar chart: Main Motives for Organisations to Engage with Long-term Digital Preservation. Vertical axis: No. Respondents (0 to 18). Horizontal axis categories: Legal obligation; Data security; Business continuity; Future re-analysis of data by end users; Business value derived from long-term access; Other.]


    The Nature of the Solution

    The briefings participants thought a wide range of factors were important for end-users when evaluating a digital preservation solution. Factors considered important or very important by more than three-quarters of the participating suppliers and vendors were that the solution: maintains authenticity, reliability and integrity; adheres to metadata standards; checks records have not been damaged; retrieves content by description; is able to store many different types of content; and characterises records by extracting metadata.
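    One common way to check that records have not been damaged is a fixity check: a checksum is recorded for each file at ingest and recomputed later for comparison. The sketch below is illustrative only and is not taken from the Planets tools; the function names (`sha256_of`, `build_manifest`, `find_damaged`), the choice of SHA-256 and the in-memory manifest are all assumptions made for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Record a checksum for every file under root (hypothetical ingest step)."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_damaged(root: Path, manifest: dict[str, str]) -> list[str]:
    """List files that are missing or whose checksum no longer matches."""
    damaged = []
    for name, expected in manifest.items():
        target = root / name
        if not target.is_file() or sha256_of(target) != expected:
            damaged.append(name)
    return damaged
```

    In practice the manifest would be stored separately from the content it describes, and `find_damaged` would be run on a schedule. A check of this kind only detects bit-level damage; it says nothing about whether the format can still be read, which is the concern of active preservation.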

    Whether or not networked services are important depends on the end-user. Small companies, whose core business is not archiving, are not looking for software or a solution to install in-house but for an on-line service to handle the small quantities of data they need to preserve. For them, networked services are very important. However, for organisations with large quantities of data to archive, of the order of terabytes or petabytes, networked services are useless as the data volumes would overload the network. Additionally, not all organisations trust someone else to look after their data, particularly if they are confidential or commercially sensitive.

    Scalability of digital preservation solutions to high volumes of content and high ingest rates is important, while scalability to high access rates is not thought to be so important. This may reflect the embryonic nature of the market, with end-users currently being concerned with gathering and storing digital information rather than disseminating it widely. However, for many organisations the data they need to store are confidential and have a limited audience, so high access rates will never be important.

    The attendees agreed that standards are important, particularly OAIS [4] and ISAD(G) [5], but several commented that currently there are too many standards. This plethora of standards has several sources. There are the archiving metadata standards, such as Dublin Core, PREMIS and METS; the electronic records management standards, such as ISO 15489, MoReq2 and DoD 5015.2; and other relevant standards such as those for digital archiving (e.g. OAI-PMH, OAI-ORE, TRAC), IT security (e.g. ISO 17799, ISO 27001, ISO 27002), and domain-specific standards (e.g. BIP 0008 for UK courts of law).

    As for the strategy used for active preservation of digital files, migration was overwhelmingly preferred to emulation (over 85% of participants said their end-users only use migration). This is due to the perception that migration is more sustainable, less resource hungry, and easier to control, since it is easier to see what has been done and compare the results. By contrast, emulation is seen as being too complicated for many types of data and difficult to manage, and there is only limited knowledge of how to deal with it.

    Despite the reservations about emulation, eight attendees expressed interest in Planets emulation tools to ensure access to old records for decades to come. Attendees recognised that migration and emulation have distinct roles and that both strategies have a place in active digital preservation. Potentially this could break the vicious cycle surrounding emulation, whereby the lack of familiarity with emulation, and the perception that it is hard to implement, lead to a lack of demand for emulation tools which, in turn, means a lack of products and of effort to educate end-users. This is important as the two techniques address different digital preservation needs. Emulation is particularly important for providing access to digital objects which have dynamic behaviour or where users need to interact with the object, such as multimedia materials, geographic information systems (GIS) and educational software.

    Whatever digital preservation solution is implemented, it will need to preserve a broad range of digital material. The participants perceived an immediate need to preserve documents, images, audio, websites, video, spreadsheets and emails, with different types of business prioritising different object types. Emails are an interesting case as they are treated legally as correspondence and thus covered by recent legislation, so organisations are required to retain them for specific periods of time. However, their preservation is complicated by the need to handle the wide range of attachments that can be associated with them. As one vendor said: "Email preservation is very interesting but seldom dealt with." Websites are also seen as being important but difficult to preserve. There are different approaches to preserving them, with one vendor recommending preserving the database behind the website and another taking snapshots. Given the broad range of material needing preservation, it is perhaps not surprising that participants saw potentially high interest in a service to recover obsolete files, a service that is currently lacking.

    "Standards are on the whole highly important. Unfortunately there are too many standards, which threaten to make standards themselves redundant."
    Source: 2009 Planets supplier / vendor briefing participant

    [4] 2003 Reference Model for the Open Archival Information System

    [5] General International Standard for Archival Description


    [Bar chart: Types of Digital Material Suppliers' & Vendors' Products/Users Need to Preserve. Vertical axis: No. Respondents. Horizontal axis categories: Documents; Images; Audio; Websites; Video; Spreadsheets; Emails; Databases; Scientific Data; eBooks and eJournals; Software/Source Code; ISO/disc images; GIS.]

    Opportunities

    Participants recognised that there is an emerging market in digital preservation with the potential for growth, as the need to preserve digital content affects all business sectors. Most said that they had an active interest in developing the digital preservation market or tapping into existing demand through provision of tools and services. They recognise that there is scope to stimulate demand through development of archiving standards, provision of strategic advice on implementation, and development of tools for planning and executing digital preservation schemes.

    Participants in the briefings were clear that there is still considerable work to be done to raise awareness about the importance of preserving digital information for the long term. This needs to start by clarifying exactly what is meant by digital preservation: both the difference between active and passive preservation and the fact that digitisation is not digital preservation. In particular, the consequences of not carrying out digital preservation need to be emphasised.

    Outside memory institutions, awareness falls, and management in particular needs to be made more aware of the importance of digital preservation. As one consultant put it: "[Digital preservation] is business critical, but it's not being thought about strategically." There is a need for a business case (scare stories) to be articulated and to motivate organisations to deal with digital preservation. For those aware of the importance of digital preservation there is a need for guidance, the exchange of best practice, and help with creating a business case. Participants see a demand for conferences, workshops and user groups, due to their role in sharing information. Practical measures include providing access to preservation planning tools and allowing users to test planned preservation actions in a controlled environment on sample data.

    Attendees drew attention to areas where they would like to see change. In particular, they would like to see cost models for digital preservation, which are currently missing and are necessary for building a solid business case. Such models need to put a price on both the costs (e.g. loss of audit trails) and the benefits (e.g. re-use of data), and also demonstrate the price of inaction.

    In addition, participants think the current proliferation of standards needs to be curbed and that some consolidation needs to take place. Of course this will require agreement on which standards are important and need to be kept, which are specialised and should no longer be universal, and which are no longer needed and should be eliminated.

    The briefings highlight a number of opportunities that are ripe for exploitation. The most immediate need is for digital preservation consultancy, providing guidance on what needs to be archived, how to do so and how to build a business case for digital preservation. More digital preservation tools for characterisation and migration are required to cover the vast range of formats that digital material comes in. There is potentially a niche market providing on-line digital preservation services to smaller companies for whom archiving is not a core activity. Finally, there is a need to educate the digital preservation community about the role emulation has to play, which needs to be backed up by the provision of practical emulation tools.

    Figure 3: The Types of Digital Material that Vendors' and Suppliers' Products/Users Need to Preserve

    [Figure 3 chart scale and legend. Vertical axis: 0 to 18 respondents. Series: Now; Medium term; Long-term.]


    Summary

    The digital preservation market is still in its infancy. However, the relevance of digital preservation is extending far beyond memory institutions into all business sectors. For many, though, digital preservation is seen as an onerous obligation, not an opportunity to realise the long-term value of existing resources. Memory institutions are culturally predisposed to retaining information and see inherent long-term value in it. By contrast, other sectors are having this imposed on them from outside, by legislation or business pressures, and regard it as another drain on the bottom line. This is manifested in the lack of digital preservation policies, or their incomplete nature, and the short-term, project-based nature of budgets, which vanish in times of economic uncertainty. It is also seen in the confusion about what digital preservation entails.

    Solutions already need to deal with a wide variety of content and this will only increase in the future. It is therefore no surprise that scalability to high rates of ingest and volumes of content is important now. Although preservation is not yet deemed to be necessary for as long as 50 years, the ability to trust a preservation solution to keep an object intact and accessible is regarded as important. The need for preservation standards is recognised, as is the need for their rationalisation. While there is a strong preference for migration, there is emerging recognition that emulation has a place for particular types of content.

    Future engagement requires work to be done to raise awareness about the importance of preserving digital information and to articulate a business case for it. At the most basic level there needs to be a clear definition of what digital preservation is. Following on from that, the costs and benefits of digital preservation need to be set out and the cost of no action clearly demonstrated. Further guidance on how to implement digital preservation is required and there is a demand for information and training. There is an ongoing need for the development of tools and services, in particular to cater for the needs of smaller organisations with fewer resources. Finally, workable approaches to emulation need to be developed so that end-users are predisposed to use it where appropriate.


    Acknowledgements

    Planets would like to thank the individuals and organisations who attended the supplier/vendor briefings and also the British Library and the Swiss Federal Archives for undertaking the work on behalf of the project.

    About Planets

    Planets (Preservation and Long-term Access through Networked Services) was a four-year project co-funded by the European Commission under the Information Society Technologies (IST) priority of the Sixth Framework Programme for research, technological development and demonstration (IST-033789). Between 2006 and 2010, Planets worked to develop a framework and a suite of practical tools and services that will enable institutions in Europe to manage and access their digital collections for the long term. Co-ordinated by the British Library, the project brought together the expertise of 16 national libraries, archives, research institutions and technology companies in Europe.

    Planets technology provides access, through a single open-source application, to a range of Planets and third-party digital preservation tools and services which support and automate a range of processes. These include creating a preservation policy, planning to preserve specific content or collections, identifying the significant properties of collections and individual objects to be preserved, assisting with the identification and selection of tools and services, validating action that has been taken and determining the extent to which it has been successful.

    In addition, Planets has developed a controlled experimental environment where tools, services and workflows can be tested using pre-defined sample content to assess their suitability for use in digital preservation. The outcomes of such experiments will be used to update information available to Planets users about the tools' appropriateness to preserve particular content.

    Planets results will be maintained and developed by the Open Planets Foundation (OPF), an independent, not-for-profit company. OPF's members share a commitment to ensuring long-term access to digital content. OPF will provide hosted access to Planets services, technical support and training. It will also coordinate development of Planets services, tools and technology by supporting and engaging with the Planets Open Source community. Members benefit by sharing experience and know-how in a community of experts.

    Further Information

    For more information about Planets visit: www.planets-project.eu

    To join our user community and receive regular updates about Planets, register at: www.planets-project.eu/community

    To download Planets reports and newsletters visit: www.planets-project.eu/publications

    To find out more about the Open Planets Foundation, including membership, visit: www.openplanetsfoundation.org

    You can email your questions to us at: [email protected]