Creating a Digital Repository: The Caltech Experience Kimberly Douglas and Ed Sponsler DASER Summit...

Creating a Digital Repository: The Caltech Experience

Kimberly Douglas and Ed Sponsler

DASER Summit

November 22, 2003

Caltech Library System

library@caltech.edu

Caltech

285 Tenure-track faculty

1000 Postdoctoral researchers

900 Undergraduates

1100 Graduate students

Caltech Library System

• 48 FTEs (13 Librarians, 6 IT Staff)

• 4 Libraries

• 500,000 Volumes

• 3556 Paid Print Journal Subscriptions

• 2116 Paid E-Journal Subscriptions

• $7,100,000 Annual Budget (FY 03)

Caltech CODA

• Collection of Open Digital Archives: CODA

Mus. A passage of more or less independent character introduced after the completion of the essential parts of a movement, so as to form a more definite and satisfactory conclusion. – OED

URL: http://coda.caltech.edu

• Nearly 2,000 documents available now!
– Theses, Technical Reports, Conference papers, Books, Non-Research collections.

Minimal Start-up

• Dedicate an IT applications developer
– Minimum 40% time commitment

• Librarian
– Minimum 15% time commitment

• Find an initial development platform
– Old workstation with a 10GB disk will do

• Use Open Source software
– Install Linux with standard applications
– Install E-Print software

Tools Developed

• Author permission form

• Repository policy document

• Conversion process from print to digital
– .ps, .pdf production
– OCR

• Unique digital identifier
– caltechCSTR:2000.001

• Persistent resolver mechanism
– resolver.caltech.edu/[Identifier]
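The identifier and resolver pairing can be illustrated with a minimal sketch. It assumes the identifier pattern is collection, colon, four-digit year, dot, three-digit sequence with an optional letter suffix, which is inferred only from the examples in these slides. The LOCATIONS table and its example.org address are hypothetical stand-ins for wherever a document currently lives; this is not the PURR implementation.

```python
import re

# Assumed pattern, inferred from examples such as caltechCSTR:2000.001
# and caltechCSTR:2001.000a; the real rules may differ.
IDENTIFIER_RE = re.compile(
    r"^(?P<collection>[A-Za-z0-9]+):(?P<year>\d{4})\.(?P<sequence>\d{3}[a-z]?)$"
)

RESOLVER_BASE = "http://resolver.caltech.edu/"

# Hypothetical lookup table: identifier -> current location of the document.
LOCATIONS = {
    "caltechCSTR:2000.001": "http://example.org/archive/cstr/2000/001.pdf",
}


def parse_identifier(identifier):
    """Split an identifier into collection, year, and sequence parts."""
    match = IDENTIFIER_RE.match(identifier)
    if match is None:
        raise ValueError(f"unrecognized identifier: {identifier!r}")
    return match.groupdict()


def resolver_url(identifier):
    """Build the persistent URL that readers cite."""
    parse_identifier(identifier)  # reject malformed identifiers early
    return RESOLVER_BASE + identifier


def resolve(identifier):
    """Return the current location, or None if the identifier is unknown."""
    return LOCATIONS.get(identifier)


print(resolver_url("caltechCSTR:2000.001"))
print(parse_identifier("caltechCSTR:2001.000a"))
```

The point of the indirection is that the cited URL stays fixed while the entry in the lookup table can change when the document moves.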

The E-Prints Choice

• Why in 2000?
– No cost to acquire
– Open source
– Very functional to purpose of open access

• Why still in 2003?
– Centered around metadata
– Generic application of LAMP
– Low barriers

Caltech’s Implementation

• Design
– Multiple instances
  • Different policies and authority
  • Different presentation objectives/needs
– Unique identifier in URL for persistent resolution
– Standardized to streamline implementation

• Staffing
– Subject librarians oversee metadata

• Content
– Convert from print to create critical mass
– Individual recruitment and nurturing
– PDF primarily
– Archive original source format where possible

Author Permission Agreement

I hereby grant to [Caltech] the irrevocable, non-exclusive royalty free right to reproduce, distribute, display, and perform this work in any format including electronic formats throughout the world for educational, research and scientific non-profit uses during the full term of copyright including renewals and extensions via the Digital Collections mechanisms maintained by the Caltech Library System. I also hereby grant to Caltech the non-exclusive right to sub-license these rights to others should the Institute forego the ability to maintain distribution. I warrant that I have the copyright to make this grant to Caltech unencumbered and complete.

Once this paper is so published, it may not be withdrawn. With the approval of the repository administration revisions to available documents within this service will be accepted.

The following Notice Concerning Terms and Conditions of Use will be included with the electronic distribution copies of the work: You are granted permission for individual, educational, research and non-commercial reproduction, distribution, display and performance of this work in any format.

http://resolver.library.caltech.edu/caltechCSTR:2001.000a

Recruitment Strategies

• Identify the faculty champion

• Appeal to group/dept. needs or objectives

• Reduce effort or worry of research or departmental support staff

Caltech – OAI Data Provider
http://coda.caltech.edu

2000 - Computer Science Technical Reports (CaltechCSTR)
409 reports, 1980’s to present; new reports online since 1995.

2001 - Cavitation 2001 Proceedings (CAV2001)
110 papers; born online in June 2001

2001 - Earthquake Engineering Research Lab. Reports (CaltechEERL)
191 reports, 1970’s to 2001; converted to online in Summer 2001

Caltech – OAI Data Provider
http://coda.caltech.edu

2002 - Parallel and Distributed Systems Group E-Tech. Reports (CaltechPARADISE)
52 papers, 1992 to date; harvest faculty site; have alert on his page, adding pdf to .ps

2002 - Electronic Theses and Dissertations (CaltechETD)
600+ theses; 1960’s to date; older theses converted and current theses submitted by student

2002 - Books by Caltech Authors (CaltechBOOK)
2 books converted by authors

Caltech – OAI Data Provider
http://coda.caltech.edu

2002 - Caltech Oral Histories (CaltechOH)
36 records input by the Caltech Archives staff

2003 - Engineering and Science (journal) (CaltechES)
Caltech’s alumni magazine converted to pdf by Library staff

2003 - Caltech Control and Dynamical Systems Technical Reports (CaltechCDSTR)
Author requested that they be harvested

200? - VLSI Conferences (CaltechVLSI)
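Since each archive is exposed as an OAI data provider, a harvester talks to it with plain HTTP requests defined by OAI-PMH. The sketch below only builds request URLs and makes no network call. The verbs Identify and ListRecords and the oai_dc metadata prefix come from the OAI-PMH standard; the base URL is a hypothetical placeholder, because the slides give the CODA home page but not the endpoint path of any archive.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; the actual OAI base URL of each archive
# is not given in the slides.
BASE_URL = "http://example.org/oai"


def oai_request(base_url, verb, **arguments):
    """Build an OAI-PMH request URL for the given verb and arguments."""
    params = {"verb": verb}
    params.update(arguments)
    return base_url + "?" + urlencode(params)


# Ask the repository to describe itself.
identify_url = oai_request(BASE_URL, "Identify")

# Ask for all records as unqualified Dublin Core.
records_url = oai_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc")

print(identify_url)
print(records_url)
```

A harvester would fetch these URLs and parse the XML responses; that step is left out here because it depends on the live endpoint.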

% Time by Repository

[Chart: percentage of time (0% to 100%) by staff category for each repository. Staff categories: Librarian; Systems; Support & Doc. Del.]

• Cav2001: 148 hrs in 3.5 mos. (110 papers)

• EERL: 191 hrs in 8 mos. (287 repts.)

• Paradise: 35 hrs in 2 mos. (45 repts.)
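The totals on this chart allow a rough effort-per-document comparison. The sketch below divides the reported hours by the item counts and by the elapsed months, using only the figures in the chart labels. It assumes the hours cover all three staff categories combined, which the slide does not state explicitly.

```python
# (repository, total hours, elapsed months, items), as labeled on the chart
projects = [
    ("Cav2001", 148, 3.5, 110),
    ("EERL", 191, 8, 287),
    ("Paradise", 35, 2, 45),
]


def hours_per_item(hours, items):
    """Average staff hours spent per document."""
    return hours / items


def hours_per_month(hours, months):
    """Average staff hours spent per elapsed month."""
    return hours / months


for name, hours, months, items in projects:
    print(
        f"{name}: {hours_per_item(hours, items):.2f} hrs/item, "
        f"{hours_per_month(hours, months):.1f} hrs/month"
    )
```

On these figures each repository works out to between roughly two thirds of an hour and one and a third hours per document.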

Future

• Federating tools

• Archival format – structured documents

• Archiving research data and the companion analytical tools

To find out more…

Sponsler, Ed (2001) PURR - The Persistent URL Resource Resolver. URL: http://resolver.caltech.edu/CaltechLIB:2001.003

Sponsler, Ed and Van de Velde, Eric F. (2001) Eprints.org Software: a Review. URL: http://resolver.caltech.edu/CaltechLIB:2001.004

Ramachandran, Hema and Sponsler, Ed and O'Donnell, Jim (2002) Digital Collections: Making it Happen. URL: http://resolver.caltech.edu/CaltechLIB:2002.010
