PEPRS: Recording The Extent Preserved. Peter Burnhill, EDINA, University of Edinburgh, with sincere thanks to Regina Reynolds. ALA Holdings Forum, New Orleans, 25th June 2011: “Universal and repurposed holdings information - emerging initiatives and projects”, 4:00/5:30pm, MCC Room 355. … part of shared task to ensure ease and continuity of access.


Page 1: PEPRS: Recording The Extent Preserved

PEPRS: Recording The Extent Preserved

Peter Burnhill

EDINA, University of Edinburgh

with sincere thanks to Regina Reynolds

ALA Holdings Forum, New Orleans, 25th June 2011

… part of shared task to ensure ease and continuity of access

“Universal and repurposed holdings information - emerging initiatives and projects”

4:00/5:30pm MCC Room 355

Page 2: PEPRS: Recording The Extent Preserved


This presentation is in 3 parts

1. Why the interest in the ‘holdings statement’

– ‘experiential knowledge’ from union catalogues

– Moving from human-readable to computational

2. An introduction to PEPRS and peprs.org

– What is available now or ‘real soon now’

– What is unresolved but important and needs doing

3. Focus on record of extent preserved

– Extent issued; extent held on shelf or digitally secured

(but first a little bit of ‘institutional’ background to start)

Page 3: PEPRS: Recording The Extent Preserved


Brief introductions

1. EDINA

– UK national academic data centre – http://edina.ac.uk

– Designated and funded by JISC – http://www.jisc.ac.uk/

* The agency for innovative use of digital technology for UK research and education

– Based at University of Edinburgh

* Research-led University, with Library founded in 1580

2. ISSN International Centre

– Directs and coordinates the ISSN Network of 88 national ISSN Centres

– Based in Paris, France

Page 4: PEPRS: Recording The Extent Preserved

What is PEPRS?

• JISC-funded project

• led by EDINA & ISSN IC

• to provide an online registry on what e-journals are being preserved

– who is doing this and how

– and the extent of content preserved

• a registry of keepers of (e-)journal content

Page 5: PEPRS: Recording The Extent Preserved

Experience and implications (1) Union catalogues

1. SALSER (union catalogue of serials in Scotland, est. 1994)

– http://edina.ac.uk/salser/ – all life is there

– no de-duplication at the title level, nor at the holdings level

– Holdings statements once described as “highly variable and mostly poor”


Page 6: PEPRS: Recording The Extent Preserved

2. SUNCAT, the UK union catalogue of serials

• 80 largest research & university libraries

– inc British Library, Cambridge, Oxford, Edinburgh, Glasgow

– 3.5m ‘library records’: over 4.7m ‘item holding records’ + 2.8m ‘titles’ in CONSER, ISSN & DOAJ databases

• FRBR-like matching to provide search at title-level

– http://www.suncat.ac.uk/

• No comparison of information at holdings level

– change in local holdings statement is biggest cause of updating

• Helping UK Research Reserve discover ‘candidate titles’ for print archiving

– UKRR plans to keep minimum of 3 copies

* OPAC holdings statements not reliable enough for disposal decisions

Page 7: PEPRS: Recording The Extent Preserved

Importance of knowing what was & was not issued

• Always been a problem for librarians who need to claim back for what does not arrive

• Now a problem for the ‘preservers’!

Exploring data flow on ‘issues’ into SUNCAT: to help librarians know what had not been issued(!)

– ONIX for Serials and serials holdings format

Page 8: PEPRS: Recording The Extent Preserved

Experience and implications (2): access to articles

The article has always been the ‘information object of desire’. Now with an established digital world (but not a ‘digital only’ world), the focus is on ‘entitlement’ & ‘access’ - not ‘holdings’

Page 9: PEPRS: Recording The Extent Preserved


Assisting access to articles online remotely

1. A&I and machine-to-machine access

– linking via OpenURL to articles online

2. Institutions arrange licence & remote access to publishers’ content via ERM (not the OPAC)

3. Recent focus on role of ERM, and union catalogues, to record ‘entitlement’ in event of cancellation

4. Renewed attention on ‘digital shelf for back copy’

– for assurance of continuity of access

Page 10: PEPRS: Recording The Extent Preserved

Scholarly Communication (retaining focus on formal (£) economy for licensed online access to article-length work published in journals, but conscious of the ‘open’)

[Diagram. Labels: Reader (article); Publisher (serial, issue, article); Library (serial); authentication (Shibboleth); Licence = authorisation; ‘discover’; ‘request’; ‘locate/access’; OPAC; A&I; OpenURL Resolver; union cat; Serials managers; ISSN & other metadata; DOI & other metadata; ‘Holdings’ metadata]

P.Burnhill, EDINA/JISC, 2005 (updated 2011)

Page 11: PEPRS: Recording The Extent Preserved


Is this a case of ‘middle child syndrome’?

an emotional scarring condition with neglect, forgotten dates, and sometimes in bad cases forgetting they even exist.

Middle children are known for ending up with things that are too big for the baby and too small for the oldest.

Page 12: PEPRS: Recording The Extent Preserved

Holdings statements as the “middle child”

1. In OPACs and union catalogues, holdings statements are difficult to understand, often regarded as wrong, and some think them unreformable.

2. The eldest (the journal title information) always takes precedence, but can help a lot if well defined

3. The youngest (the wild article child) is ‘just there’

Page 13: PEPRS: Recording The Extent Preserved

PEPRS: Piloting an E-journal Preservation Registry Service

Idea of a registry raised in literature, ca. 2003/4, and then again in 2006:

“either .. clarity of public statement by each agency

or through a registry by which it would be plain what content was being archived, and therefore what was not.”

(US) CLIR Report, 2006

Page 14: PEPRS: Recording The Extent Preserved

PEPRS: Development

• Scoping study in 2007 by Rightscom and Loughborough University led on to a JISC-funded Project:

– Partners: EDINA & ISSN International Centre

* Phase 1: August 2008 – July 2010: ‘investigate, prototype and build’

* Phase 2: August 2010 – July 2012: ‘preparing for service & governance’

– Initially UK in scope, we now judge PEPRS as necessarily international

* Literature is international – so is ISSN

* Every nation needs one

* Growing international support

Page 15: PEPRS: Recording The Extent Preserved


On the road … and hosted at http://edina.ac.uk/presentations.html

1. JISC Journals Working Group (London, August 2008)

2. ISSN National Directors Meeting (Tunis, September 2008)

3. NASIG, 24th Annual Conference (Asheville NC, USA, 4 June 2009)

4. Library of Chinese Academy of Science (Beijing, 15 September 2009)

5. ISSN National Directors Meeting (Beijing, 17 September 2009)

6. PARSE.Insight Workshop (Darmstadt, Germany, 21 September 2009)

7. Knowledge Exchange Workshop (Edinburgh, October 2009)

8. E-journals are Forever Workshop, JISC/DPC (London, April 2010)

9. IFLA 2010 (Gothenburg, September 2010)

10. RLUK Conference (Edinburgh, 11 November 2010)

11. Columbia Univ. (NYC, 23 November 2010); UKSG (Spring 2011)

12. … ISSN Governing Body (Paris, April 2011)

13. … ARL (Montreal, May 2011) and welcomed invite to ALA, New Orleans

P.Burnhill, F.Pelle, P.Godefroy, F.Guy, M.Macgregor, A.Rusbridge & C.Rees. Piloting an e-journals preservation registry service. Serials 22(1) March 2009. [UK Serials Group]

P.Burnhill. Tracking e-journal preservation: archiving registry service anyone? Against the Grain 21(1) February 2009, pp. 32, 34, 36

Page 16: PEPRS: Recording The Extent Preserved

Piloting an E-journals Preservation Registry Service

[Diagram: Abstract Data Model, Figure 1 in reference paper in Serials, March 2009. Labels: ISSN Register; METADATA on extant e-journals; Digital Preservation Agencies, e.g. CLOCKSS, Portico; BL, KB; UK LOCKSS Alliance etc.; METADATA on preservation action; E-Journal Preservation Registry; E-J Preservation Registry Service; SERVICES: user requirements; Data dependency (a), (b)]

Page 17: PEPRS: Recording The Extent Preserved


Information about the archiving organisations

• Wanting to work with those who have ‘archival intent’, i.e., the keepers of content for the long term

• Five pilot participants:

– British Library

– CLOCKSS Archive

– e-Depot [Koninklijke Bibliotheek (KB), Dutch Royal Library]

– Global LOCKSS Network

– Portico

* preparing to include more in some kind of self-registration

Page 18: PEPRS: Recording The Extent Preserved

Participants self-state* the following:

• Overview & background: A short summary of each archiving initiative.

• Ingest & preservation workflow: Steps taken to ingest content & preserve it over time.

• Library access to content: In general terms, the conditions under which a library can access the content archived for each initiative.

• Auditing of content, policies and procedures (both internal and external activities): Steps taken to ensure the ongoing authenticity and accessibility of content and to monitor the development of the approach over time.

• Latest data: With direct link to the archiving agency's holdings information, or to the archiving agency's home page if the holdings information is not available.

[*PEPRS is not an audit*]

Page 19: PEPRS: Recording The Extent Preserved

Public Beta now live! after field-testing with archiving organisations [British Library, CLOCKSS, LOCKSS, KB & Portico] + associates

http://peprs.org

Page 20: PEPRS: Recording The Extent Preserved


Simple search shows that this journal is being preserved

A Quick Look

get same result searching on (either) ISSN

Page 21: PEPRS: Recording The Extent Preserved


* CLOCKSS also archives Springer content; not shown here

Passing glance at the variation in ‘holdings information’ reflecting what the archiving organisations hold as metadata

Page 22: PEPRS: Recording The Extent Preserved


What happens when print ISSN is entered?

Note key role of ISSN-L; even if the ‘print ISSN’ is entered, the preservation status of the e-journal is found
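The role of ISSN-L described here can be sketched as a two-step lookup: resolve whichever ISSN is entered to its linking ISSN, then look up preservation records keyed by that ISSN-L. This is only an illustration of the idea; the ISSN values, table contents, agency names and the function name `preservation_status` are invented placeholders, not data or interfaces from the registry.

```python
# Hypothetical table: each medium-specific ISSN maps to one linking ISSN (ISSN-L).
ISSN_TO_ISSN_L = {
    "1111-1111": "1111-1111",  # print version (placeholder value)
    "2222-2222": "1111-1111",  # online version (placeholder value)
}

# Hypothetical preservation records, keyed by ISSN-L rather than by medium.
PRESERVED_BY = {
    "1111-1111": ["Agency A", "Agency B"],
}


def preservation_status(issn: str) -> list[str]:
    """Return the keepers recorded for the title, whichever ISSN was entered."""
    issn_l = ISSN_TO_ISSN_L.get(issn.strip().upper())
    if issn_l is None:
        return []  # ISSN unknown to this toy table
    return PRESERVED_BY.get(issn_l, [])


# Both the print and the online ISSN lead to the same preservation record.
print(preservation_status("1111-1111"))
print(preservation_status("2222-2222"))
```

Keying the preservation records by ISSN-L is the design choice that makes the print and online queries agree; keying by medium-specific ISSN would require duplicating each record.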

Page 23: PEPRS: Recording The Extent Preserved


Allows a library to upload a list of ISSNs to check preservation status.

Being field-tested in the UK and by 2CUL (Columbia & Cornell)

* COMING SOON *
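Before a list of ISSNs is checked against any registry, it helps to reject malformed entries. The sketch below uses the standard ISSN check character (modulus 11, weights 8 down to 2) to triage an uploaded list. The slides do not describe the upload format, so the one-ISSN-per-line input and the function names `is_valid_issn` and `triage` are assumptions; the sample values only exercise the arithmetic.

```python
import re

# Four digits, optional hyphen, three digits, then a digit or X as check character.
ISSN_PATTERN = re.compile(r"^(\d{4})-?(\d{3})([\dX])$")


def is_valid_issn(text: str) -> bool:
    """Check the ISSN shape and its modulus-11 check character."""
    match = ISSN_PATTERN.match(text.strip().upper())
    if match is None:
        return False
    digits = match.group(1) + match.group(2)
    # Weights run from 8 down to 2 across the first seven digits.
    total = sum(int(d) * w for d, w in zip(digits, range(8, 1, -1)))
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return match.group(3) == expected


def triage(lines: list[str]) -> tuple[list[str], list[str]]:
    """Split an uploaded list into well-formed ISSNs and rejected entries."""
    valid, rejected = [], []
    for line in lines:
        entry = line.strip()
        if not entry:
            continue  # skip blank lines in the upload
        (valid if is_valid_issn(entry) else rejected).append(entry)
    return valid, rejected


valid, rejected = triage(["0378-5955", "0378-5954", "", "not an issn"])
print(valid, rejected)
```

A passing check character shows only that an entry is well formed, not that the ISSN has been assigned or that the title is preserved.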

Page 24: PEPRS: Recording The Extent Preserved


* COMING SOON *

We are exploring the standards to use for m2m use of the registry service, so PEPRS could be used within union catalogues and other serial services.

Page 25: PEPRS: Recording The Extent Preserved


Variation in how ‘holdings’ are expressed to PEPRS by the agencies

The volume is often the work unit in archiving, plus whatever metadata there is at hand associated with that unit of effort

* Dates are in the metadata, not in the workflow *

Page 26: PEPRS: Recording The Extent Preserved


More variation in a list from OUP

Mix of Arabic volume numbers & Roman numerals; dates are derived from metadata
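A mix of Arabic and Roman volume labels has to be normalised before volumes can be compared or counted. A minimal sketch, assuming each volume label arrives as a bare string such as "12" or "XII"; the function names are hypothetical and the Roman numeral conversion does no strict validation of numeral form.

```python
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(numeral: str) -> int:
    """Convert a Roman numeral such as 'XIV' to an integer (no strict validation)."""
    total = 0
    previous = 0
    # Read right to left: a smaller value before a larger one is subtractive.
    for char in reversed(numeral.upper()):
        value = ROMAN_VALUES[char]
        if value < previous:
            total -= value
        else:
            total += value
            previous = value
    return total


def normalise_volume(label: str) -> int:
    """Accept '12', 'XII' or 'xii' and return the volume number as an int."""
    text = label.strip()
    if text.isascii() and text.isdigit():
        return int(text)
    if text and all(c in ROMAN_VALUES for c in text.upper()):
        return roman_to_int(text)
    raise ValueError(f"unrecognised volume label: {label!r}")


print([normalise_volume(v) for v in ["1", "II", "iii", "IV", "XIV", "40"]])
# prints [1, 2, 3, 4, 14, 40]
```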

Page 27: PEPRS: Recording The Extent Preserved


note (simple) variation in Publisher information, across the archiving agencies, and ISSN Register

Page 28: PEPRS: Recording The Extent Preserved


Matters unresolved (1): things in initial project scope

PEPRS-specific

• What users ‘really want to know’ via release of Public Beta

– about archiving agencies and their preservation policy & practices

– feedback on functionality; opportunity for social media

• How to be an international registry of global keepers

– Governance: UK (JISC/SCONUL/RLUK); EU (Knowledge Exchange; LIBER); USA (ARL); International (IFLA, ICOLC, ISSN-IC; EU) ??

Relevant for ‘Holdings Forum’

• Assigning ISSNs to preserved e-serials that are reported

1. ‘E-journals’ that come to notice

* ISSN-IC is devising workflow to assign ISSNs as required

2. ‘D-journals’, digitised content from print journals

* some have print ISSN, some not; problematic but essential to make progress

• Issues/volumes, not just titles

– extent preserved; common/conversion [action in Phase 2]

Page 29: PEPRS: Recording The Extent Preserved


Matters unresolved (2): challenging the scope of PEPRS

1. ‘Continuity of access’, not just preservation

– archiving agencies may want to detail current access offer

* how should PEPRS try to adapt?

2. What about repositories of digitized journals?

– HATHI Trust has over 210,000 titles

* of which only about 1/3 have an ISSN in the record

3. What about print archiving?

– CRL’s PAPR initiative, for print journals

* significant proportion will not have had an ISSN assigned

Common challenges relevant for Holdings Forum

• All have serials where ISSN not yet assigned by ‘big sister’

– If it is worth preserving it should have a serials identifier!

* Good News: ISSN Network has issued over 80,000 already

• All tackling ‘middle sister’ problem of Issues/Volumes

Page 30: PEPRS: Recording The Extent Preserved


‘holdings information’ in OPACs has ‘middle child’ conflict

‘holdings’ in OPAC conflates:

– information for humans (patrons/readers) about access to content

with

– possession of that content

Maybe OK for print journals, but we need a different approach for journal content in digital format, where access and stewardship have differing requirements

Page 31: PEPRS: Recording The Extent Preserved


What’s the way forward?

Let’s accept that the OPAC holding statement is just a ‘human-readable string’

We need radical reform, with means to ingest and store structured metadata on issues (with their tables of contents) that allows:

a) transformation to allow helpful display for humans

b) computation by software/agents to support lots more
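The two uses named above, display for humans and computation by software, can both be served from one structured record. A minimal sketch, assuming the record is simply a set of volume numbers; the "v.1-3" display style and the function names are illustrative choices, not a format proposed in the slides.

```python
def to_ranges(numbers: set[int]) -> list[tuple[int, int]]:
    """Collapse a set of volume numbers into sorted, inclusive (start, end) runs."""
    runs: list[tuple[int, int]] = []
    for n in sorted(numbers):
        if runs and n == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], n)  # extend the current run
        else:
            runs.append((n, n))  # start a new run
    return runs


def display_statement(volumes: set[int]) -> str:
    """(a) Render the structured record as a string for human readers."""
    parts = [f"v.{a}" if a == b else f"v.{a}-{b}" for a, b in to_ranges(volumes)]
    return ", ".join(parts)


def missing_volumes(issued: set[int], held: set[int]) -> set[int]:
    """(b) Computation for software agents: what was issued but is not held."""
    return issued - held


held = {1, 2, 3, 5, 7, 8}
issued = set(range(1, 9))
print(display_statement(held))                 # v.1-3, v.5, v.7-8
print(sorted(missing_volumes(issued, held)))   # [4, 6]
```

The human-readable string is derived from the structured data, never the reverse, which is the direction of travel the slide argues for.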

Page 32: PEPRS: Recording The Extent Preserved


• Information for machines on what is held by a keeper:

• We are working on an ‘arithmetic’ representation

• the norm/expectation being some matrix expression, with ‘additions’ and ‘subtractions’ about that norm

• Ingesting data flows from Publishers & Digitizers …

… that can be parsed ‘volume by volume’

… but expect the operational definition of ‘volumes’ to differ, as the workflows for Publishers and Digitizers are not the same, and so their respective ‘units of work’ differ:

① The issue as is published

② The bound volume as was digitized
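One possible reading of the ‘arithmetic’ representation is a regular norm (a range of volumes times a fixed number of issues per volume) plus explicit additions and subtractions. The slides do not specify the actual representation, so the `Extent` class, its field names and the sample values below are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field

Issue = tuple[int, str]  # (volume number, issue label)


@dataclass
class Extent:
    """A norm (volumes x issues per volume) plus exceptions to that norm."""
    first_volume: int
    last_volume: int
    issues_per_volume: int
    additions: set[Issue] = field(default_factory=set)     # e.g. supplements
    subtractions: set[Issue] = field(default_factory=set)  # e.g. gaps

    def issues(self) -> set[Issue]:
        """Expand the compact expression into the explicit set of issues."""
        norm = {
            (vol, str(num))
            for vol in range(self.first_volume, self.last_volume + 1)
            for num in range(1, self.issues_per_volume + 1)
        }
        return (norm | self.additions) - self.subtractions


# Extent issued: volumes 1-3, four issues each, plus one supplement.
issued = Extent(1, 3, 4, additions={(2, "Suppl.1")})
# Extent held by a keeper: the same norm, minus two issues.
held = Extent(1, 3, 4, subtractions={(1, "4"), (3, "2")})

not_preserved = issued.issues() - held.issues()
print(sorted(not_preserved))
```

Expressing both "extent issued" and "extent held" in the same form reduces the comparison to a set difference, whichever unit of work (issue or bound volume) the data originally arrived in.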

Universal and repurposed holdings information …

Page 33: PEPRS: Recording The Extent Preserved


Concluding thoughts …

Our common task is to ensure ease and continuity of access

Because the role of libraries, individually and collectively, as trusted keepers of scholarly information has been challenged by the new economics of the digital …

Each Keeper needs to be sure about what it holds

– on a (digital) shelf held with ‘archival intent’

… doing so in ways that all others can know who is keeping what?

– publish that metadata so the machine can understand!

That’s true for e-journal content, and probably true for both digitized journal content and also of print …

Page 34: PEPRS: Recording The Extent Preserved

hence interest in registries:

peprs.org =>

thekeepers.org


Page 35: PEPRS: Recording The Extent Preserved


THANK YOU

Acknowledgements due to all members of the PEPRS Project Team,

and in particular to Morag Macgregor for the software engineering

And thanks again to Regina Reynolds for adding Expression to this Work [Manifestation/Item?]

Contact details:

[email protected] and [email protected]