Institutional Repositories & Discipline Based Repositories Pauline Simpson National Oceanography Centre, Southampton GRADE Kick Off Meeting 28 Sep 2005

Page 1:

Institutional Repositories & Discipline Based Repositories

Pauline Simpson

National Oceanography Centre, Southampton

GRADE Kick Off Meeting

28 Sep 2005

Page 2:

Outline

Geospatial data =

Institutional and Subject Repositories

Repository choices

Data Centres

Possible solutions

Page 3:

Geospatial Data

Scope within GRADE

Numerical data, raw and analyzed

Information Products
  Publications
  CD Roms, DVD
  Learning Objects

Page 4:

Repositories are spreading because …

Supplementary to traditional publication
Do not affect current research publication processes
Give easy access
Give rapid access
Give long-term access
Increase readership and use of material
They offer advantages to institutions
They offer advantages to research funders
They offer new ways for information to be linked and used

Page 5:

Subject/Discipline Based Repositories

Relies on peer interaction – no mandate
Individual agreements have to be struck
No definitive boundaries
Quality control issues
Sustainability issues
Transitory – collection at risk
Responsibility for preservation

Issues over the return on the money and effort invested

? A trusted repository? Supported by ….

Subject repositories often managed by an individual for a group

Page 6:

Subject repositories are archives which collect and manage material relating to one or more related subject areas. A number currently exist mainly within science subjects.

Significant subject repositories include many using e-Prints or DSpace software:

ArXiv - http://xxx.arxiv.cornell.edu/ (physics, mathematics, non-linear science and computer science)
Cogprints - http://cogprints.ecs.soton.ac.uk/ (Cognitive sciences including psychology, neuroscience, linguistics and other related areas)
CiteSeer - http://citeseer.nj.nec.com/cs (computer science)
HTP Prints - http://htpprints.yorku.ca/ (History and theory of psychology)
PubMedCentral - http://www.pubmedcentral.nih.gov/ (US National Library of Medicine's digital archive of life sciences journal literature)
PhilSci Archive - http://philsci-archive.pitt.edu/ (philosophy of science)
E-LIS - http://eprints.rclis.org/ (library and information science)
RePEc (Research Papers in Economics)

Page 7:

Institutional Repositories

Freely accessible web-based databases providing access to the full text of scholarly material produced by members of an institution.

Digital collections that capture and preserve the intellectual output of the communities.

What are the essential elements?
  Institutionally defined: Content - generated by the community
  Scholarly content: published articles, books, book sections, preprints and working papers, conference papers, enduring teaching materials, student theses, data-sets, etc.
  Cumulative & perpetual: preserve ongoing access to material
  Interoperable & open access: free, online, global, utilising standards: OAI, Dublin Core etc
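
To make "utilising standards: OAI, Dublin Core" concrete, here is a minimal sketch of what a simple Dublin Core description of this presentation could look like when wrapped in the `oai_dc` container that OAI-compliant repositories commonly expose. The helper name `build_dc_record` and the choice of elements are illustrative assumptions, not something the slides prescribe; the `type` value in particular is made up.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for the oai_dc container and the simple Dublin Core elements.
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_dc_record(fields):
    """Return an oai_dc XML string.

    `fields` maps a Dublin Core element name (e.g. "title") to a list of
    values, because every Dublin Core element is repeatable.
    (Hypothetical helper, written for this illustration only.)
    """
    ET.register_namespace("oai_dc", OAI_DC_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for name, values in fields.items():
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


# Title, creator and date are taken from the title slide.
record_xml = build_dc_record({
    "title": ["Institutional Repositories & Discipline Based Repositories"],
    "creator": ["Simpson, Pauline"],
    "date": ["2005-09-28"],
    "type": ["Presentation"],  # illustrative value, not from the slides
})
print(record_xml)
```

Because the same small set of elements is used whatever the discipline, a harvester can index records like this from many repositories without knowing anything about their local schemas.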

Page 8:

Institutional Repositories

Institutions are logical implementers of repositories because they can take responsibility for:

– Centralising a distributed activity
– Framework and Infrastructure
– Permanence that can sustain changes
– Stewardship of Digital assets
– Preservation policy for long term access
– Provide central digital showcase for the research, teaching and scholarship of the institution

“a trusted repository” supported by the Information Community

Page 9:

Institutional Repository Software for geo data

OSI Directory of Institutional Repository Software V.3 http://www.soros.org/openaccess/software/

E-Prints (GNU) [http://software.eprints.org/]. Open-source OAI-compliant software developed at University of Southampton to enable anyone to set up their own Open Archives-compliant institutional archive. Originally programmed for subject repositories but now re-engineered for IR. Does not identify treatment of datasets, though can cover bibliographic description.

DSpace: Durable Digital Depository [http://dspace.org/]. Open-source software developed at MIT for their own repository; released as open source software in Nov. 2002.

Overtly identifies datasets. Offers opportunity to explore the issues surrounding the incorporation of different metadata standards within one system…. Different disciplines have adopted different sets of metadata standards to accommodate their particular data needs.

Two examples are the CSDGM standard for geospatial data and the DICOM standard for digital imaging in medicine. … develop more general standards, such as Dublin Core, which proposes a basic set of common elements that can be used across many different disciplines and document types.

(DC and MARC are norms)
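
One way to picture the relationship between a discipline standard and Dublin Core is a crosswalk that maps the richer geospatial description onto the common elements. The sketch below assumes a flattened dictionary keyed by CSDGM-style short names (`origin`, `pubdate`, `westbc` and so on); the mapping table, the function names and the sample values are all hypothetical, and the coverage string only loosely follows the DCMI Box convention.

```python
# Hypothetical crosswalk: flattened CSDGM-style short names -> Dublin Core elements.
CSDGM_TO_DC = {
    "title": "title",
    "origin": "creator",        # originator of the data set
    "pubdate": "date",
    "abstract": "description",
    "themekey": "subject",
}


def bounding_box_to_coverage(record):
    """Fold the four bounding coordinates into one coverage string.

    Returns None unless all four coordinates are present.
    """
    keys = ("westbc", "eastbc", "southbc", "northbc")
    if not all(key in record for key in keys):
        return None
    template = ("westlimit={westbc}; eastlimit={eastbc}; "
                "southlimit={southbc}; northlimit={northbc}")
    return template.format(**record)


def crosswalk(record):
    """Map a flattened geospatial record to Dublin Core element lists."""
    dc = {}
    for source_name, dc_name in CSDGM_TO_DC.items():
        if source_name in record:
            dc.setdefault(dc_name, []).append(str(record[source_name]))
    coverage = bounding_box_to_coverage(record)
    if coverage:
        dc["coverage"] = [coverage]
    return dc


# Made-up sample record, for illustration only.
sample = {
    "title": "Example bathymetry grid (made-up sample)",
    "origin": "Example Survey Team",
    "pubdate": "2005",
    "westbc": -6.0, "eastbc": -1.5, "southbc": 49.0, "northbc": 51.0,
    "procdesc": "detail with no Dublin Core counterpart in this mapping",
}
print(crosswalk(sample))
```

The sample's `procdesc` entry is dropped by the mapping, which illustrates the trade-off the slide points at: a general standard travels across disciplines, but the detail that data specialists rely on stays behind in the discipline standard.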

Page 10:

https://dspace.ucalgary.ca/handle/1880/33 need to register to search

Page 11:

http://careo.ucalgary.ca/cgi-bin/WebObjects/CAREO.woa - information products

Page 12:

Repository Choices

Subject - arXiv, Cogprints, RePEc
Institutional - Southampton, Glasgow, Nottingham (SHERPA), MBA UK
National - DARE (all universities in the Netherlands), Scotland, British Library (proposal)
National / Subject - ODINPubAfrica
International - Internet Archive ‘Universal’, OAIster
Regional - White Rose UK
Consortia - SHERPA-LEAP (London E-prints Access Project)
Funding Agency - NIH (PubMed), Wellcome Trust (UK PubMed), NERC
Project - Public Knowledge Project EPrint Archive
Conference - 11th Joint Symposium on Neural Computation, May 15 2004
Personal - peer to peer, web pages etc
Media Type - VCILT Learning Objects Repository, NTDL (Theses)
Publisher - journal archives
Data Repositories/Archives - NODC, BODC, DOD, JODC, BADC etc

Science, particularly Environmental Science, is well served
Logical host for numeric datasets

Page 13:

Data Centres / Archives / Repositories

Within organisational infrastructures but not defined by it
National responsibilities
Subject and Technical Specialists, quality control of content
Secure storage and migration policies
Well developed Metadata schema & Standards
  DIF – Directory Interchange Format, FGDC etc
  ISO 19115
    the minimum set of metadata required to serve the full range of metadata applications (data discovery, determining data fitness for use, data access, data transfer, and use of digital data);
    optional metadata elements - to allow for a more extensive standard description of geographic data, if required;
    a method for extending metadata to fit specialized needs.
    Though ISO 19115:2003 is applicable to digital data, its principles can be extended to many other forms of geographic data such as maps, charts, and textual documents as well as non-geographic data.

“a trusted repository” supported by the Data Management Community

Page 14:

ARCHIMEDE: A Canadian software solution for institutional repositories [http://archimede.bibl.ulaval.ca/di/Welcome.do]. OAI compliant software developed by Laval University Library. Archimede has been developed in a multilingual perspective, with internationalization as a focus. The text (or content) of the interface is independent and not embedded in the code making it relatively easy to develop an interface in a specific language without having to work on the code itself. English, French and Spanish interfaces are already offered in Archimede. That feature allows also the user to switch easily from language to language anywhere and anytime during his search and retrieval process.

Berkeley Electronic Press [http://www.bepress.com/repositories.html]. Commercial OAI-compliant software used by the University of California’s eScholarship Repository.

CERN Document Server Software (CDSware) [http://cdsware.cern.ch/]. OAI compliant software developed by, maintained by, and used at, the CERN Document Server.

Project Tapir [http://sourceforge.net/projects/tapir-eul]: Tapir provides additional functionality to digital asset management software DSpace primarily designed for Electronic Theses and Dissertations supervision, submission and dissemination. See Queen's University Project.

Fedora™ Project: An Open-Source Digital Repository Management System [http://www.fedora.info/]. Jointly developed by the University of Virginia and Cornell University, Fedora is a general-purpose digital object repository system that can be used in whole or part to support a variety of use cases including: institutional repositories, digital libraries, content management, digital asset management, scholarly publishing, and digital preservation.

Greenstone [http://www.greenstone.org/cgi-bin/library?a=p&p=home]. Suite of open-source multilingual software for building and distributing digital library collections. Produced by the New Zealand Digital Library Project at the University of Waikato, and developed and distributed (since 2000) in cooperation with UNESCO and the Human Info NGO. Presently in limited use at New Zealand Digital Library Project and some other sites.

OCLC Research Software [http://www.oclc.org/research/software/default.htm]. A list of open source software developed by the Online Computer Library Center (OCLC) to build a repository and harvest data according to OAI-PMH standards.

FIGARO, i-TOR, etc

Page 15:

Dilemma for Researcher

Mandates from major funding agencies now require grantees to deposit research output in a ‘designated repository’ or ‘any’
  Wellcome Trust (UK PubMed) - £400 million producing 3500 papers per year
  RCUK

Where should the full text of their research be deposited?

Researcher wants to enter metadata and deposit only once and perhaps deposit all related material in one place?

Situation at present
  Harvesting, but harvester is not the choice of the depositor
  Duplicate keying metadata into repositories of choice
  Cannot target multiple repositories with one exercise

Does it matter where it is deposited since Google Scholar, Yahoo, Scopus, will pick it up wherever it is?

Page 16:

Repositories taking over the world?

Turf War
  Not between Institutional and Subject Repositories – complementary and should coexist
  Possibly between Text based and Numeric based repositories
  Repositories of whatever flavour v. Data Centres

Are both spilling over into each other's territory?

The Cavalry: JISC Digital Repositories Programme
  Strand: Linking Text and Data

Page 17:

[Diagram of the repository landscape. Labelled components: Repositories (institutional, e-prints, subject, data, learning objects); Learning & Teaching workflows (learning object creation, re-use); Research & e-Science workflows (data creation / capture / gathering: laboratory experiments, Grids, fieldwork, surveys, media; data analysis, transformation, mining, modelling); Aggregator services; Institutional presentation services (portals, Learning Management Systems, u/g, p/g courses, modules); Presentation services (subject, media-specific, data, commercial portals); Peer-reviewed publications (journals, conference proceedings); Quality assurance bodies. Labelled flows: Deposit / self-archiving; Harvesting metadata; Searching, harvesting, embedding; Resource discovery, linking, embedding; Publication; Validation.]

From: Lyon : CNI - JISC - SURF Conference, May 2005

Page 18:

CLADDIER Project ** (Citation, Location And Deposition in Discipline and Institutional Repositories)

The CLADDIER system will be a step on the road to a situation where (in this case, environmental) scientists will be able to move seamlessly from information discovery (location), through acquisition to deposition of new material, with all the digital objects correctly identified and cited. The lessons learned will be applicable to the relationships between other discipline based repositories and institutional repositories.

** JISC Digital Repositories Programme 2005 -

Page 19:

Persistent identifiers
  semantically transparent
  Versioning
Dataset Citations
  Publishing practice
Automated Linking both ways
  citation
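
"Automated linking both ways" can be read as keeping a forward and a reverse index over persistent identifiers, so that a paper leads to the datasets it cites and a dataset leads back to the papers that cite it. The sketch below is only one possible reading: the `CitationIndex` class and the `hdl:example/...` identifiers are invented for illustration, and the version suffix on the dataset identifier is an assumption about how versioning might be expressed.

```python
from collections import defaultdict


class CitationIndex:
    """Hypothetical in-memory index of publication <-> dataset citations."""

    def __init__(self):
        self._cites = defaultdict(set)     # publication id -> dataset ids
        self._cited_by = defaultdict(set)  # dataset id -> publication ids

    def add_citation(self, publication_id, dataset_id):
        # Record the link once in each direction so both lookups stay cheap.
        self._cites[publication_id].add(dataset_id)
        self._cited_by[dataset_id].add(publication_id)

    def datasets_cited_by(self, publication_id):
        return sorted(self._cites.get(publication_id, ()))

    def publications_citing(self, dataset_id):
        return sorted(self._cited_by.get(dataset_id, ()))


index = CitationIndex()
# Made-up identifiers; ".v2" marks a specific dataset version.
index.add_citation("hdl:example/paper-1", "hdl:example/dataset-7.v2")
index.add_citation("hdl:example/paper-2", "hdl:example/dataset-7.v2")

print(index.datasets_cited_by("hdl:example/paper-1"))
print(index.publications_citing("hdl:example/dataset-7.v2"))
```

Citing a specific version keeps the link meaningful after the dataset is revised, which is presumably why versioning sits beside persistent identifiers on the slide.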

Page 20:

Where to Deposit

One outcome of CLADDIER Project

‘pull’ = Harvesting
‘push’ = CLADDIER outcome

Enable researcher to deposit in one repository and choose to upload (push) the metadata to another repository of choice.

Logical to ‘push’ from IR to Subject?
Redundancy of records?
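
The difference between ‘pull’ and ‘push’ can be sketched with two toy repositories. Everything here is hypothetical (the `Repository` class, the function names, the `source_repository` field); it is not the CLADDIER design, only an illustration of who chooses where the metadata ends up.

```python
class Repository:
    """Toy in-memory repository: identifier -> metadata dictionary."""

    def __init__(self, name):
        self.name = name
        self.records = {}

    def deposit(self, identifier, metadata):
        self.records[identifier] = dict(metadata)

    def list_records(self):
        # What a harvester would be allowed to read.
        return dict(self.records)


def harvest(source, harvester):
    """Pull: the harvesting service decides to copy the source's records."""
    for identifier, metadata in source.list_records().items():
        harvester.deposit(identifier, metadata)


def deposit_and_push(home, identifier, metadata, targets):
    """Push: the depositor keys the metadata once and picks the targets."""
    home.deposit(identifier, metadata)
    for target in targets:
        # Remember where the record came from, so duplicates can be traced.
        target.deposit(identifier, {**metadata, "source_repository": home.name})


institutional = Repository("institutional")
subject = Repository("subject")
aggregator = Repository("aggregator")

deposit_and_push(institutional, "example-id-1",
                 {"title": "Made-up example paper"}, targets=[subject])
harvest(institutional, aggregator)

print(sorted(subject.records), sorted(aggregator.records))
```

Both routes leave copies of the same record in more than one place, which is the "redundancy of records" question on the slide; carrying the originating repository along with the pushed copy is one possible way to keep such duplicates identifiable.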

Page 21:

Thank You

Pauline Simpson (…@noc.soton.ac.uk)

Page 22:

Data Centres

Discovery metadata - What data sets hold the sort of data I am interested in? This enables organisations to know and publicise what data holdings they have.

Exploration metadata - Do the identified data sets contain sufficient information to enable a sensible analysis to be made for my purposes? This is documentation to be provided with the data to ensure that others use the data correctly and wisely.

Exploitation metadata - What is the process of obtaining and using the data that are required? This helps end users and provider organisations to effectively store, reuse, maintain and archive their data holdings.