Moving Forward with the OpenDOAR Directory Peter Millington SHERPA Technical Development Officer University of Nottingham, England

Moving Forward with the OpenDOAR Directory

Peter Millington
SHERPA Technical Development Officer

University of Nottingham, England

Outline

• Brief introduction to OpenDOAR
  – What it is. Project time line
• OAI-PMH harvesting exercise
  – Modus operandi
  – Results for re-use policies
  – Technical issues & performance
• Conclusions & Recommendations
• Prototype ‘policy generator’ tool
• Questions & Feedback

What is OpenDOAR?

• Directory of Open Access Repositories
• Coverage
  – Institutional & subject-based repositories; funders’ OA archives
  – Not covering: OA journals – see DOAJ – http://www.doaj.org/
• Authoritative evaluated data
  – More than auto-harvested OAI data
  – Proactive – more than data supplied by repository administrators
  – Periodic review for currency and functionality
• Target users
  – Search service providers, OA stakeholders, end-users
  – Active dialogue with providers, administrators, funders, etc.
• http://www.opendoar.org/

OpenDOAR Project Time Line

• Started early 2005
  – University of Nottingham & University of Lund
  – Funded by: OSI, JISC, CURL & SPARC Europe
• First public version
  – January 2006
  – Data built on work by Tim Brody, Southampton, & others
  – 380 repositories (04-May-2006)
• Developing Version 2
  – Additional fields & views
  – Due summer 2006

Harvesting Modus Operandi

• Aims
  – Familiarisation with OAI-PMH
  – Investigation of repositories’ policies
• OAI-PMH protocol
  – 315 repositories in OpenDOAR with an OAI Base URL
  – verb=Identify – policies from eprints.xsd schema
  – Timings recorded & technical glitches noted
• Microsoft Excel Macros
  – Prompted for operator interventions
  – Such events would hamper auto-harvesting
• PHP
  – Firewall problems – needed to use HTTP proxy server
  – PHP functions would not handle HTTPS
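
The harvesting step described above (an OAI-PMH request with verb=Identify, sent through an HTTP proxy where needed, with the response time recorded) might look roughly like the following in Python. This is a minimal sketch and not the Excel or PHP code used in the exercise; the function names, the example base URL, and the default timeout are assumptions made for illustration.

```python
import time
import urllib.request
from urllib.parse import urlencode


def identify_url(base_url):
    """Append verb=Identify to an OAI base URL."""
    # Some base URLs already carry a query string, so pick the right separator.
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + urlencode({"verb": "Identify"})


def fetch_identify(base_url, proxy=None, timeout=30):
    """Fetch the Identify response and return (body_bytes, seconds_elapsed).

    `proxy` is a hypothetical proxy address such as "http://proxy.example.org:8080".
    """
    handlers = []
    if proxy:
        # Route both HTTP and HTTPS requests through the same proxy.
        handlers.append(urllib.request.ProxyHandler({"http": proxy, "https": proxy}))
    opener = urllib.request.build_opener(*handlers)

    started = time.perf_counter()
    with opener.open(identify_url(base_url), timeout=timeout) as response:
        body = response.read()
    return body, time.perf_counter() - started


# Building the request URL needs no network access:
print(identify_url("http://repository.example.org/oai"))
```

Calling `fetch_identify` for each base URL inside a `try`/`except` that logs timeouts and HTTP errors would give a record of timings and technical glitches without operator intervention.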

eprints.xsd Policy Criteria

• content
  – Text and/or a URL linking to text describing the content of the repository
  – It would be appropriate to indicate the language(s) of the metadata/data in the repository
• metadataPolicy
  – Text and/or a URL linking to text describing policies relating to the use of metadata harvested through the OAI interface
• dataPolicy
  – Text and/or a URL linking to text describing policies relating to the data held in the repository
  – This may also describe policies regarding downloading data (full-content)
• submissionPolicy
  – Text and/or a URL linking to text describing policies relating to the submission of content to the repository (or other accession mechanisms)
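
To illustrate how the four criteria above could be read out of an Identify response, here is a hedged Python sketch. The sample response is invented for the example, the namespace URIs are quoted from memory and should be checked against the published schemas, and `extract_policies` and `classify` are hypothetical helper names. The three categories returned by `classify` are one possible way of sorting repositories into "no data", "undefined" and "policy given".

```python
import xml.etree.ElementTree as ET

# Assumed namespace for the eprints description container (verify against eprints.xsd).
EPRINTS_NS = "http://www.openarchives.org/OAI/1.1/eprints"
POLICY_FIELDS = ("content", "metadataPolicy", "dataPolicy", "submissionPolicy")

# Invented sample response, trimmed to the parts used here.
SAMPLE_IDENTIFY = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <Identify>
    <repositoryName>Example Repository</repositoryName>
    <description>
      <eprints xmlns="http://www.openarchives.org/OAI/1.1/eprints">
        <content><text>Research papers in English</text></content>
        <metadataPolicy><text>Metadata may be re-used for not-for-profit purposes</text></metadataPolicy>
        <dataPolicy><URL>http://repository.example.org/policies.html</URL></dataPolicy>
        <submissionPolicy><text>undefined</text></submissionPolicy>
      </eprints>
    </description>
  </Identify>
</OAI-PMH>"""


def extract_policies(identify_xml):
    """Return {field: {"text": ..., "url": ...}} for the four policy fields."""
    root = ET.fromstring(identify_xml)
    eprints = root.find(f".//{{{EPRINTS_NS}}}eprints")
    policies = {}
    for field in POLICY_FIELDS:
        node = eprints.find(f"{{{EPRINTS_NS}}}{field}") if eprints is not None else None
        text = url = None
        if node is not None:
            # Empty or whitespace-only elements are treated as missing.
            text = (node.findtext(f"{{{EPRINTS_NS}}}text") or "").strip() or None
            url = (node.findtext(f"{{{EPRINTS_NS}}}URL") or "").strip() or None
        policies[field] = {"text": text, "url": url}
    return policies


def classify(policy):
    """Sort one policy entry into 'no data', 'undefined' or 'policy given'."""
    if not policy["text"] and not policy["url"]:
        return "no data"
    if not policy["url"] and policy["text"].lower() == "undefined":
        return "undefined"
    return "policy given"


for name, policy in extract_policies(SAMPLE_IDENTIFY).items():
    print(name, "->", classify(policy))
```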

Metadata Policy Results

• No policy info for two thirds of repositories
  – Technical problems with 9%
  – No data provided for 40%
  – ‘Undefined’ for 17% – EPrints default settings
• Policies given
  – Nearly all permit re-use for non-commercial purposes
  – A third seem to allow commercial re-use
• Many policies copied from other repositories
  – e.g. CogPrints
• Issues for service providers
  – Lack of easily accessible policy statements
  – Prohibited re-sale of metadata – Why prohibited?

Full Data Policy Results

• Also no policy info for two thirds of repositories
  – Technical problems with 9%
  – No data provided for 42%
  – ‘Undefined’ for 17%
• Policies given
  – Re-sale of full items nearly universally prohibited
  – Unclear policy in ~7% of cases
  – 7% prohibit harvesting by robots
• Prohibited harvesting by robots
  – Total prohibition prevents full text indexing and analysis
  – Transient harvesting should be permitted – e.g. CalTech

Content Policies

• Repository Type
  – Institutional or departmental repository
  – Multi-institution subject-based repository
• Subject Specialities
  – Up to three, or ‘many’
• Type of Material
  – e.g. Research papers, Theses, etc.
• Publication Status
  – Pre-prints (not peer-reviewed)
  – Final peer-reviewed drafts (post-prints)
  – Published versions
• Individual tagging with peer-review and publication status
• Principal Languages
  – Up to three

Submission Policies

• Eligible Depositors
  – Role and/or Organisation unit
  – Or their delegated agents
• Deposition Rules
  – Who can deposit what – usually own work only
  – Mandatory deposition of metadata
• Moderation (vetting)
  – What, if anything, is vetted by the administrator
  – e.g. eligibility, relevance, valid layout. Exclusion of spam
• Content Quality Control (Peer review)
  – Responsibility for the validity and authenticity of the content
  – Not checked, or checking by internal subject specialists
• Copyright Policy
  – Responsibility for copyright clearance
  – Dealing with proven copyright violations

Interim Conclusions

• The eprints.xsd is not working
  – Not used at all – or left ‘undefined’
  – Muddled entries – e.g. items under wrong heading
• Why?
  – Lack of awareness of its existence
  – Unsupported by repository software package
  – Insufficient guidance – possible language issues
  – Some policies not covered – e.g. preservation
• But…
  – Copying indicates a desire for model policies
  – Plenty of good examples on which to base models
  – Would be very useful to service providers, advocates, etc.

Recommendations

• For Repository Administrators
  – Ensure the eprints.xsd schema is in your OAI configuration
  – Put real policy info in the schema – not just ‘undefined’
  – Fix any technical issues
  – Avoid using HTTPS
• For OpenDOAR
  – Encourage repository administrators to improve matters
  – Provide model policies
  – Provide a ‘policy generator’ tool for administrators
• Future Work
  – Update eprints.xsd or replace with something new
  – Re-analyse annually to monitor progress

OpenDOAR Policy Generator

• Aims
  – Capturing policies using standard formulae
  – Tool to help administrators formulate their policies
• Analysis of policies
  – Identification of recurring phrases and concepts
  – Natural language cluster analysis
• Selection of statements & options
  – Appropriate to the policy type
  – And meaningful
• OpenDOAR policy recommendations
  – Minimum options – achieving OA goals but restricted
  – Optimum options – refinements for more use or better quality

Proposed Minimum Metadata Policy

• Anyone may access the metadata free of charge.

• The metadata may be re-used in any medium

– without prior permission for not-for-profit purposes

– provided the OAI Identifier and/or a link to the original metadata record are given.

• The metadata must not be re-used in any medium

– for commercial purposes without formal permission.
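
As a rough illustration of how a policy generator could assemble statements like these from a chosen option, the sketch below builds the metadata policy as a list of sentences. It is not the OpenDOAR tool: the function name and its parameter are hypothetical, the minimum wording is taken from the statements above, and the wording used when commercial re-use is allowed is invented for the example.

```python
def metadata_policy(allow_commercial_reuse=False):
    """Return the metadata policy as a list of statements.

    allow_commercial_reuse=False gives the proposed minimum policy;
    True is a hypothetical 'optimum' variant that permits re-sale.
    """
    attribution = (
        "provided the OAI Identifier and/or a link to the original "
        "metadata record are given."
    )
    statements = ["Anyone may access the metadata free of charge."]

    if allow_commercial_reuse:
        # Invented wording, shown only to demonstrate the option switch.
        statements.append(
            "The metadata may be re-used in any medium without prior "
            "permission, including for commercial purposes, " + attribution
        )
    else:
        statements.append(
            "The metadata may be re-used in any medium without prior "
            "permission for not-for-profit purposes, " + attribution
        )
        statements.append(
            "The metadata must not be re-used in any medium for commercial "
            "purposes without formal permission."
        )
    return statements


for line in metadata_policy():
    print("•", line)
```

Keeping each statement as a separate string means the same selection could later be rendered as plain text, HTML, or a repository configuration entry.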

Proposed Minimum Full Data Policy

• Anyone may access full items free of charge.
• Single copies of full items can be:
  – Reproduced & displayed or performed in any format or medium
  – for personal research or study, educational, or not-for-profit purposes
  – without prior permission or charge.
• Full items must not be harvested by robots
  – except transiently for full-text indexing or citation analysis
• Full items must not be sold commercially
  – in any format or medium
  – without formal permission of the copyright holders.

Proposed Minimum Submission Policy

• Items may only be deposited by accredited members of the organisation, or their delegated agents.
• Authors/Depositors may archive only their own work.
• The administrator only vets items for the exclusion of spam.
• The validity and authenticity of the content of submissions is the sole responsibility of the depositor.
• Any copyright violations are entirely the responsibility of the authors/depositors.
• If the repository receives proof of copyright violation, the relevant item will be removed immediately.

Optimum Policy Ideas

• Metadata Policy
  – Allow re-sale of metadata
  – Increased visibility outweighs ‘exploitation’
• Full Data Policy
  – Allow multiple copying – for educational purposes
  – Allow full harvesting – LOCKSS-like preservation
• Submission Policy
  – Mandatory deposition of metadata
  – Mandatory deposition of thesis full texts

What Next?

• Consultation
  – SHERPA partners
  – Other interested parties
• Policy generator
  – End-user testing – volunteers needed
  – Ideas for output – e.g. text for EPrints configuration
• Refining recommended policies
  – Ideas for minimum and optimum options
  – Feedback on our proposals
• Aiming for release summer 2006
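
One of the output ideas mentioned above is text that an administrator could drop into a repository's OAI configuration. The following Python sketch shows one way generated policy text might be wrapped in an eprints description block. It is an assumption-laden illustration: `eprints_description` is a hypothetical helper, the namespace URI is quoted from memory, and the order of the URL and text children should be checked against eprints.xsd before the output is used.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for the eprints description container (verify against eprints.xsd).
EPRINTS_NS = "http://www.openarchives.org/OAI/1.1/eprints"
FIELD_ORDER = ("content", "metadataPolicy", "dataPolicy", "submissionPolicy")

# Serialise with a default xmlns instead of an "ns0:" prefix.
ET.register_namespace("", EPRINTS_NS)


def eprints_description(policies):
    """Render {field: {"text": ..., "url": ...}} as an <eprints> XML string."""
    root = ET.Element(f"{{{EPRINTS_NS}}}eprints")
    for field in FIELD_ORDER:
        entry = policies.get(field)
        if not entry:
            continue  # fields that were not supplied are simply left out
        node = ET.SubElement(root, f"{{{EPRINTS_NS}}}{field}")
        # Child order (URL before text) is an assumption, not confirmed by the slides.
        if entry.get("url"):
            ET.SubElement(node, f"{{{EPRINTS_NS}}}URL").text = entry["url"]
        if entry.get("text"):
            ET.SubElement(node, f"{{{EPRINTS_NS}}}text").text = entry["text"]
    return ET.tostring(root, encoding="unicode")


print(eprints_description({
    "metadataPolicy": {"text": "Anyone may access the metadata free of charge."},
    "dataPolicy": {"url": "http://repository.example.org/policies.html"},
}))
```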

Any Questions or Feedback?

http://www.opendoar.org/

Contact: Peter Millington

[email protected]

OpenDOAR Organisation

• The OpenDOAR Team
  – University of Nottingham, England
    • Bill Hubbard, Gareth Johnson, Peter Millington
  – University of Lund, Sweden
    • Lars Bjørnshauge, Kristoffer Lundqvist, Salam Baker Shanawa
• Our Funders
  – Open Society Institute (OSI)
  – Joint Information Systems Committee (JISC)
  – Consortium of Research Libraries (CURL)
  – SPARC Europe