Digital Object Identifiers for EOSDIS data
ESIP Winter Meeting, Jan 6, 2011
John Moses, ESDIS
Publishable Persistent Data Identifiers
• Want unique and lasting data identifiers for publication
  – More frequent and consistent citation of EOSDIS datasets
  – To find the NASA data used in research
    • Regardless of where it is moved to or who becomes responsible for it
    • The LTA corollary: To find the documentation for archived data
  – To enable metrics collection on cited datasets
• Digital Object Identifiers (DOIs) have emerged as the most accepted data identifier in the publishing community.
What is a Digital Object Identifier?
• The DOI® system and the Handle System: Internet infrastructure
  – DOI is an application of the Handle System to intellectual property
  – Owned by the International DOI Foundation (IDF), www.doi.org. The system grew out of the publishing industry (circa 2000)
  – Internet resolution service for unique and persistent identifiers of digital objects
• Consists of a two-part alphanumeric string: doi:[prefix]/[suffix]
  – E.g., 10.1234/123: the prefix is 10.1234, where 10 is the DOI registry identifier and 1234 identifies the Registrant (e.g., NASA)
  – The suffix is an alphanumeric string that identifies the data item, as decided by the registrant or agent
  – Example for an earthquake event dataset authored by an automated system:
    • Geofon operator (2009): GEOFON event gfz2009kciu (NW Balkan Region) GeoForschungsZentrum Potsdam (GFZ). [doi:10.1594/GFZ.GEOFON.gfz2009kciu]
• Citation and location information is maintained at the DOI registry by an IDF Registration Agent (RA) through a subscription provider
  – Citation and location information can be updated as frequently as desired by the Registrant subscription holder
  – The desire is for one DOI per data item, but the registry does not preclude multiple registrations/publishers of ‘similar’ data
  – Existing DOIs can be migrated to a new or different Registration Agent and/or owner at any time
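The two-part structure described above can be illustrated with a minimal Python sketch. It splits a DOI into the registry identifier, the registrant code, and the suffix, and builds a URL for the public resolver at https://doi.org/. The names `ParsedDOI`, `parse_doi`, and `resolver_url` are hypothetical helpers introduced for illustration; they are not part of any DOI tooling mentioned in the slides.

```python
from typing import NamedTuple


class ParsedDOI(NamedTuple):
    directory: str   # "10", the DOI registry identifier
    registrant: str  # registrant code, e.g. "1594"
    suffix: str      # item identifier chosen by the registrant or agent


def parse_doi(text: str) -> ParsedDOI:
    """Split a DOI such as 'doi:10.1594/GFZ.GEOFON.gfz2009kciu' into its parts."""
    doi = text.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:]
    # Only the first "/" separates prefix from suffix; the suffix may contain more.
    prefix, slash, suffix = doi.partition("/")
    if not slash or not suffix:
        raise ValueError(f"no suffix in DOI: {text!r}")
    directory, dot, registrant = prefix.partition(".")
    if directory != "10" or not dot or not registrant:
        raise ValueError(f"unexpected prefix in DOI: {text!r}")
    return ParsedDOI(directory, registrant, suffix)


def resolver_url(text: str) -> str:
    """URL asking the DOI resolver to redirect to the registered location.

    Sketch only: suffix characters that need URL escaping are not handled.
    """
    d = parse_doi(text)
    return f"https://doi.org/{d.directory}.{d.registrant}/{d.suffix}"
```

For the GEOFON example, `parse_doi` yields registrant code `1594` and suffix `GFZ.GEOFON.gfz2009kciu`. Because the resolver looks up the current location at request time, the URL stays valid when the data is moved.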
Implementing DOIs for EOSDIS
– Developing the ops concept and scope of ESDIS-DAAC roles/responsibilities
  • Guidelines for DOI suffix profile, citation/location information
  • California Digital Library (Joan Starr), Dept of Energy (Sharon Jordan), NASA Scientific & Technical Information (Gerald Steeman)
  • Want DOIs to be attractive to users; solicit feedback from DAAC UWGs
  • Request, assign, monitor DOIs, citation & location info
  • Add DOIs to DAAC product citation web pages
  • Add DOIs to GCMD and ECHO through metadata updates
  • Embed DOIs into product metadata at next reprocessing
  • Add DOI metadata to NTRS for searchable documentation
  • Set up metrics collection from journal citation reports
Implementation in Interoperable Architectures
[Figure: Metadata flows in NASA Earth Science Data Systems. Recoverable labels: Provenance collection, Provenance Services, DOI, DOI tools, NASA Technical Reports Server.]
Operations Concept
• DAACs propose the datasets, and the order in which they are to be assigned DOIs
  – Provide citation and location information according to ESDIS guidelines
  – Post online DOI information as assigned
  – Work with instrument teams and product generation teams to get DOIs embedded into product granule-level metadata
  – Work with NTRS to add DOI to searchable metadata
• ESDIS approves the process for assigning DOI structure names
  – Approve new DOIs and best practices; support nominal rate of requests
    • Establish suffix naming convention; avoid organization references
    • Assign new DOIs at dataset level, and when there is a major change in product version
    • Support ancillary (e.g., non-NASA) ES datasets on non-interference basis
  – Keep DOI suffix structure master, which ensures consistency and uniqueness
    • i.e., maintain two levels: <measurement project or mission-instrument>/<product id (version)>
  – Monitor maintenance of citation and location information
  – Provide guidance on implementation of DOIs in metadata
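The two-level suffix convention and the "suffix structure master" can be sketched as a small in-memory registry. This is an illustration under stated assumptions, not the ESDIS design: the slides do not say how the version is attached, so joining product id and version with a dot is an assumption, as are the allowed character set and the names `make_suffix` and `SuffixRegistry`. The product id used in the usage note is illustrative only.

```python
import re

# Assumption: suffix parts are limited to letters, digits, ".", "_" and "-".
_ALLOWED = re.compile(r"^[A-Za-z0-9._-]+$")


def make_suffix(group: str, product_id: str, version: str) -> str:
    """Build '<group>/<product id>.<version>'.

    'group' stands for the measurement project or mission-instrument level.
    The '.' between product id and version is an assumed convention.
    """
    for part in (group, product_id, version):
        if not _ALLOWED.match(part):
            raise ValueError(f"invalid suffix component: {part!r}")
    return f"{group}/{product_id}.{version}"


class SuffixRegistry:
    """Hypothetical master list that rejects duplicate suffixes."""

    def __init__(self) -> None:
        self._issued: set[str] = set()

    def register(self, group: str, product_id: str, version: str) -> str:
        suffix = make_suffix(group, product_id, version)
        # DOI names are case-insensitive, so compare in lower case.
        key = suffix.lower()
        if key in self._issued:
            raise ValueError(f"suffix already issued: {suffix}")
        self._issued.add(key)
        return suffix
```

With this sketch, `SuffixRegistry().register("AQUA-AIRS", "AIRX3STD", "005")` returns `AQUA-AIRS/AIRX3STD.005`. A major product version change registers a new suffix, while a repeat of the same group, product, and version is rejected.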
Member Institute using DataCite (RA): California Digital Library and EZID
• EZID is a service providing researchers a way to manage identifiers persistently for datasets, files, and resources of all types. The service is available via a machine-to-machine programming interface (an API) and as a web user interface.
• Core functions:
  – Create a persistent identifier: DOI
  – Add object location (URL landing page, separate from citation)
  – Add citation metadata (DataCite repository; mandatory elements shown below)
    • Creator (person or organization)
    • Title (long name of dataset)
    • Publisher (holder of the data, i.e., the organization making it available)
    • Publication Year (year when data was, or will be, first available)
  – Update object location
  – Update object metadata
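The mandatory citation elements and the separate object location can be modeled as a small record type. The sketch below makes no calls to EZID or DataCite. `CitationRecord` and `relocate` are hypothetical names, the landing-page URLs are placeholders, and the citation layout is an assumption loosely modeled on the GEOFON example earlier in the deck.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CitationRecord:
    creator: str           # person or organization
    title: str             # long name of dataset
    publisher: str         # organization making the data available
    publication_year: int  # year data was, or will be, first available
    location: str          # landing page URL, kept separate from the citation

    def __post_init__(self) -> None:
        # All four citation elements and the location are treated as mandatory.
        for name in ("creator", "title", "publisher", "location"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} is mandatory")
        if not 1000 <= self.publication_year <= 9999:
            raise ValueError("publication_year must be a four-digit year")

    def citation(self, doi: str) -> str:
        """Render a citation string (assumed layout, not a DataCite rule)."""
        return (f"{self.creator} ({self.publication_year}): {self.title}. "
                f"{self.publisher}. doi:{doi}")


def relocate(record: CitationRecord, new_url: str) -> CitationRecord:
    """Return a copy with an updated location; citation fields are unchanged."""
    return replace(record, location=new_url)
```

Updating the location leaves the DOI and the citation untouched, which is the property that lets a dataset move between hosts without breaking published references.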
IF THERE IS MORE TIME
DOI Persistence
Registration Agent: DataCite
• DataCite established a scientific data application with the IDF.
• The service is run by an open membership organization of government and education libraries, focused on improving the scholarly infrastructure around datasets.
• Most appropriate RA because of its focus on working with data centers to assign persistent identifiers to datasets, leveraging the Digital Object Identifier (DOI) infrastructure.
• United States Member Institutes
  – California Digital Library (Founding Member)
    • Recommended subscription provider because of bulk pricing and EZID Web/API services
  – Office of Scientific and Technical Information, US Department of Energy (new Member, Dec 2010)
  – Purdue University Libraries (Member)
  – Interuniversity Consortium for Political and Social Research - ICPSR (Associate Member)
  – Microsoft Research (Associate Member)
TIB: German National Library of Science and Technology
DAAC Recommendations for DOIs
• ORNL already using CrossRef RA but would like to transition to DataCite and California Digital Library’s EZID (Bob Cook)
  – Views EZID as having better services, more closely aligned to our community
  – DataCite and EZID have started discussions with Thomson Reuters to get data citations (i.e., DOIs) in their index; expose EZID catalog.
  – Expects easy transition: DOI remains the same; get machine-readable citation from CrossRef; script using EZID API to populate DataCite metadata.
• NSIDC recommends using DOIs and EZID (Ruth Duerr & Mark Parsons)
  – bulk pricing
  – a web service as well as API for bulk processing
  – ARKs as well as DOIs, to support granule citations
• GSFC in contact with CDL about DOIs and EZID (Chris Lynnes)
  – Involved in structure for AIRS observations in CMIP5
  – Interested in a project-wide approach and service
• PODAAC has expressed interest in assigning DOIs (Andy Bingham)