26
PARSE.Insight Framework and Lesson Learned David Giaretta (STFC)

PARSE.Insight Framework and Lesson Learned David Giaretta (STFC)

Embed Size (px)

Citation preview

Page 1: PARSE.Insight Framework and Lesson Learned David Giaretta (STFC)

PARSE.Insight Framework and Lesson Learned

David Giaretta (STFC)

Page 2: PARSE.Insight Framework and Lesson Learned David Giaretta (STFC)

Overview

• Aims

• Survey

• Roadmap

Page 3: PARSE.Insight Framework and Lesson Learned David Giaretta (STFC)

Alliance for Permanent Access

• The Alliance aims to develop a shared vision and framework for a sustainable organisational infrastructure for permanent access to scientific information

The British Library European Organization for Nuclear Research [CERN]CSC — IT Center for ScienceDelegation of the Finnish Academies of Science and

Letters Deutsche Nationalbibliothek Digital Preservation Coalition European Science Foundation [ESF] European Space Agency [ESA] Helmholtz-Gemeinschaft Deutscher Forschungszentren International Association of Scientific, Technical & Medical

Publishers Joint Information Systems Committee [JISC] Koninklijke Bibliotheek Max-Planck-Gesellschaft NESTOR Kompenteznetzwerk Nationale Coalitie Digitale Duurzaamheid [NCDD] Portico Science & Technology Facilities Council [STFC]

http://www.alliancepermanentaccess.org/

Page 4: PARSE.Insight Framework and Lesson Learned David Giaretta (STFC)

Alliance for Permanent Access

• The Alliance aims to develop a shared vision and framework for a sustainable organisational infrastructure for permanent access to scientific information

The British Library European Organization for Nuclear Research [CERN]CSC — IT Center for ScienceDelegation of the Finnish Academies of Science and

Letters Deutsche Nationalbibliothek Digital Preservation Coalition European Science Foundation [ESF] European Space Agency [ESA] Helmholtz-Gemeinschaft Deutscher Forschungszentren International Association of Scientific, Technical &

Medical Publishers Joint Information Systems Committee [JISC] Koninklijke Bibliotheek Max-Planck-Gesellschaft NESTOR Kompenteznetzwerk Nationale Coalitie Digitale Duurzaamheid [NCDD] Portico Science & Technology Facilities Council [STFC]

http://www.alliancepermanentaccess.org/PARSE.Insight

Page 5: PARSE.Insight Framework and Lesson Learned David Giaretta (STFC)

PARSE.Insight

• PARSE.Insight aims to provide:– Insight and understanding into the capabilities and practices

within the various research communities– An inventory of current and planned research and development

relating to e-infrastructures and permanent access– A roadmap for a support e-infrastructure for maintaining long-term

accessibility and usability of scientific and other digital information in Europe

– Identification of gaps in the existing and planned infrastructure– Progress towards a standard for evaluating the sustainability and

trustworthiness of digital repositories

Page 6: PARSE.Insight Framework and Lesson Learned David Giaretta (STFC)

Motivation

• Concern with data and documents

• Need for supporting e-infrastructure– What should this look like?– How can it be developed?– What timescale?

• The role of the Alliance for Permanent Access

Page 7: PARSE.Insight Framework and Lesson Learned David Giaretta (STFC)

Infrastructures for preservation

• Social / Legal / Financial / Organisational

• Agreements / Trust / Standards

• Costs/ Benefits/ Rewards

• Technical components

Page 8: PARSE.Insight Framework and Lesson Learned David Giaretta (STFC)

Lessons from other Infrastructures

• Need to “grow”, “encourage”, “foster” rather than “build”• include organisational, financial, legal & marketing• Provide services rather than specific technologies• Tackle “choke points”• Various phases of development

Page 9: PARSE.Insight Framework and Lesson Learned David Giaretta (STFC)

Approach

• Approach based on evidence from community insight …

• … while taking full account of current work on digital preservation

• Coverage of disciplines: wide and deep

• Coverage of resources: data and documents

Page 10: PARSE.Insight Framework and Lesson Learned David Giaretta (STFC)

Approach (2)

• Top-down:– Desk research– Targeted surveys to stakeholders in science– Interviews– Workshops and conferences

• Bottom-up: case studies in 3 communities:– Case 1: High Energy Physics (HEP)– Case 2: Earth Observation (EO)– Case 3: Social Sciences & Humanities (SSH)

Page 11: PARSE.Insight Framework and Lesson Learned David Giaretta (STFC)

Encouraging Organisational and Social change

• Policies: mandates for depositing research data and funding agencies requirements:

• Robust and reliable deposit places, where researchers can be sure their data will not get lost, be corrupted or misused with correct right access mechanisms.

• Elements that increase comfort levels so that new users will know how to use and interpret the available data.

• Communication and awareness around these issues. • Have publication of data as valued and as

referencable as is a publication of a paper in a journal.

Page 12: PARSE.Insight Framework and Lesson Learned David Giaretta (STFC)

12

Benefits

• No organisation can do everything that is required for digital preservation forever

• Need to share the cost/effort

• Need to identify commonalities– None will be a perfect fit for all purposes

Page 13: PARSE.Insight Framework and Lesson Learned David Giaretta (STFC)

Insight: stakeholders

Research

• Research institutes (non-profit)

• Universities

• Academic libraries

Data management (preservation)

• Data centres (profit / non-profit)

• Libraries

• Archives

Funding/policy

• National Funding organisations

• European funding

• Corporate funding

Publishing

• General (cross-community) publishers

• Specific (community) publishers

Page 14: PARSE.Insight Framework and Lesson Learned David Giaretta (STFC)

General Surveys to stakeholders

Research

1397 responses

Data management (preservation)

273 responses

Funding/policy

< responses

Publishing

186 responses

Plus a similar number from in-depth case studies

Page 15: PARSE.Insight Framework and Lesson Learned David Giaretta (STFC)

About researchers

Communities aggregated to:• Agriculture & Nutrition• Behavioural Sciences• Humanities• Life Sciences• Medicine• Social Sciences• Physical Sciences• Socio-Cultural Sciences• Technology

Based on KNAW classification (Royal Netherlands Academy of Arts and Sciences)

Page 16: PARSE.Insight Framework and Lesson Learned David Giaretta (STFC)

Reasons for preservation (R)

It is unique

It potentially has economic value

It may stimulate inter-disciplinary collaborations.

It allows for re-analysis of existing data.

It may serve validation purposes in the future.

It will stimulate the advancement of science.

If research is publicly funded, the results should become public property and therefore properly

preserved.

Page 17: PARSE.Insight Framework and Lesson Learned David Giaretta (STFC)

What? Data spectrum (R)

Page 18: PARSE.Insight Framework and Lesson Learned David Giaretta (STFC)

Sharing of data (R)

How open is your data?

Page 19: PARSE.Insight Framework and Lesson Learned David Giaretta (STFC)

Sharing of data (R)

Which constrains do you see in making data open?

Page 20: PARSE.Insight Framework and Lesson Learned David Giaretta (STFC)

Threats to preservation

1. The ones we trust to look after the digital holdings may let us down.

2. The current custodian of the data, whether an organisation or project, may cease to exist at some point in the future.

3. Loss of ability to identify the location of data.4. Access and use restrictions (e.g. Digital Rights

Management) may not be respected in the future.5. Evidence may be lost because the origin and authenticity

of the data may be uncertain.6. Lack of sustainable hardware, software or support of

computer environment may make the information inaccessible.

7. Users may be unable to understand or use the data e.g. the semantics, format or algorithms involved.

Page 21: PARSE.Insight Framework and Lesson Learned David Giaretta (STFC)

Threats to preservation (R)

The ones we trust to look after the digital holdings may let us down

The current custodian of the data may cease to exist

Loss of ability to identify the location of data

Access and use restrictions may not be respected in the future

Evidence may be lost

Lack of sustainable hardware/software

Users may be unable to understand or use the data

Page 22: PARSE.Insight Framework and Lesson Learned David Giaretta (STFC)

Threats to preservation (R)

Users may be unable to understand or use the data e.g. the semantics, format or algorithms involved.

Page 23: PARSE.Insight Framework and Lesson Learned David Giaretta (STFC)

Threat Requirement for solution

Users may be unable to understand or use the data e.g. the semantics, format, processes or algorithms involved

Ability to create and maintain adequate Representation Information

Non-maintainability of essential hardware, software or support environment may make the information inaccessible

Ability to share information about the availability of hardware and software and their replacements/substitutes

The chain of evidence may be lost and there may be lack of certainty of provenance or authenticity

Ability to bring together evidence from diverse sources about the Authenticity of a digital object

Access and use restrictions may make it difficult to reuse data, or alternatively may not be respected in future

Ability to deal with Digital Rights correctly in a changing and evolving environment

Loss of ability to identify the location of data An ID resolver which is really persistent

The current custodian of the data, whether an organisation or project, may cease to exist at some point in the future

Brokering of organisations to hold data and the ability to package together the information needed to transfer information between organisations ready for long term preservation

The ones we trust to look after the digital holdings may let us down

Certification process so that one can have confidence about whom to trust to preserve data holdings over the long term

Page 24: PARSE.Insight Framework and Lesson Learned David Giaretta (STFC)

FUTURE

• Users may be unable to understand or use the data e.g. the semantics, format, processes or algorithms involved

• Non-maintainability of essential hardware, software or support environment may make the information inaccessible

• The chain of evidence may be lost and there may be lack of certainty of provenance or authenticity

• Access and use restrictions may not be respected in the future

• Loss of ability to identify the location of data

• The current custodian of the data, whether an organisation or project, may cease to exist at some point in the future

• The ones we trust to look after the digital holdings may let us down

Page 25: PARSE.Insight Framework and Lesson Learned David Giaretta (STFC)

Links• CASPAR:

– www.casparpreserves.eu – http://www.casparpreserves.eu/Members/cclrc/Deliverables/caspar-

validation-evaluation-report/at_download/file - Validation report– http://www.youtube.com/watch?v=Yun9hkPPF9M - cartoon

• PARSE.Insight: www.parse-insight.eu

• Alliance for Permanent Access www.alliancepermanentaccess.eu

• Digital Curation Centre: www.dcc.ac.uk

• Audit and certification: wiki.digitalrepositoryauditandcertification.org

• OAIS: http://public.ccsds.org/publications/archive/650x0b1.pdf

http://public.ccsds.org/sites/cwe/rids/Lists/CCSDS%206500P11/Overview.aspx

http://developers.casparpreserves.eu

http://sourceforge.net/projects/digitalpreserve

Page 26: PARSE.Insight Framework and Lesson Learned David Giaretta (STFC)

END