Rapid Visual OAI Tool

S. Kothamasa, K. Maly, M. Zubair (Old Dominion University)

X. Liu (Los Alamos National Laboratory)

RCDL 2003, St. Petersburg

Outline

Open Archives Initiative (OAI).

Rapid Visual OAI Tool (RVOT).

Future Work.

Statistics & Feedback.


Motivation for OAI

One of the biggest obstacles to transparent resource discovery is the fact that many digital libraries use different, proprietary technologies that do not allow for interoperability.

Approaches for Interoperability

• Federation
- Tight integration (DLs adhere to a common specification)

• Harvesting
- Loose integration (DLs agree to expose their collections in a standard way while keeping their proprietary implementations)

• Gathering
- Distributed search; no cooperation required

The Open Archives Initiative (OAI) supports the Harvesting approach (metadata harvesting only).


Open Archives Initiative (OAI) Framework (http://www.openarchives.org/)

The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) comes out of an international effort to build bridges across islands of digital libraries.

The OAI protocol defines a data provider and service provider model and permits metadata harvesting of a data provider by a service provider.

• Data Providers support the OAI protocol as a means of exposing metadata about the content in their systems.

• Service Providers issue OAI protocol requests to the systems of data providers and use the returned metadata as a basis for building value-added services.


Core Concepts of OAI-PMH 2.0

• Metadata format – Dublin Core (DC): 15 elements, all repeatable and all optional.

- Title
- Creator
- Subject
- Description
- Publisher
- Contributor
- Date
- Type
- Format
- Identifier
- Source
- Language
- Relation
- Coverage
- Rights
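Because every Dublin Core element is optional and repeatable, a record is naturally modelled as a mapping from element name to a list of values. The sketch below builds an oai_dc record with Python's standard library; the function names and the sample values are illustrative assumptions, not part of RVOT.

```python
import xml.etree.ElementTree as ET

# Namespaces used by the oai_dc metadata format.
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

# The 15 unqualified Dublin Core elements.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def build_oai_dc(fields):
    """Build an <oai_dc:dc> element from {element name: [values]}.

    Any element may be absent (optional) or carry several values (repeatable).
    """
    unknown = set(fields) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    ET.register_namespace("oai_dc", OAI_DC_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for name in DC_ELEMENTS:
        for value in fields.get(name, []):
            ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return root


def to_xml(fields):
    """Serialize the record to an XML string."""
    return ET.tostring(build_oai_dc(fields), encoding="unicode")


# Hypothetical record: one title, two creators, nothing else.
example = {"title": ["Rapid Visual OAI Tool"], "creator": ["K. Maly", "M. Zubair"]}
```

Calling `to_xml(example)` yields a single `oai_dc:dc` element with one `dc:title` child and two `dc:creator` children; an empty mapping is also a valid input and produces an element with no children.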


• Metadata format – Parallel Metadata

- The OAI metadata harvesting protocol supports the notion of parallel metadata sets, allowing collections to expose metadata in formats that are specific to their applications and domains. The OAI technical framework places no limitations on the nature of such parallel sets, other than that the metadata records be structured as XML data that have a corresponding XML schema for validation.


• Metadata Harvesting

- Move away from distributed searching, which cannot scale well to a large number of participants.

- Extract metadata from various sources.

- Build services on local copies of the metadata; the data itself remains at the remote repositories.


• OAI Request and OAI Response.

- OAI Request for Metadata is embedded in HTTP.

- OAI Response to OAI Request is encoded in XML.

- The XML Schema specification for the OAI Response is provided in the OAI-PMH document.
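A minimal sketch of this request/response cycle, assuming a hypothetical repository at http://localhost:8080/oai: the request is an ordinary HTTP query string carrying a verb and its arguments, and the response is an XML document in the OAI-PMH 2.0 namespace. The sample response is hand-written for illustration and trimmed to a few fields; no network access is involved.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def build_request_url(base_url, verb, **arguments):
    """Encode an OAI request as an HTTP GET URL."""
    return f"{base_url}?{urlencode({'verb': verb, **arguments})}"


# Hand-written, abbreviated Identify response (all values are placeholders).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2003-10-29T10:00:00Z</responseDate>
  <request verb="Identify">http://localhost:8080/oai</request>
  <Identify>
    <repositoryName>Example Repository</repositoryName>
    <baseURL>http://localhost:8080/oai</baseURL>
    <protocolVersion>2.0</protocolVersion>
    <adminEmail>admin@example.org</adminEmail>
  </Identify>
</OAI-PMH>"""


def parse_identify(xml_text):
    """Pull a few fields out of an Identify response."""
    identify = ET.fromstring(xml_text).find("oai:Identify", OAI_NS)
    names = ("repositoryName", "baseURL", "protocolVersion", "adminEmail")
    return {name: identify.findtext(f"oai:{name}", namespaces=OAI_NS) for name in names}


url = build_request_url("http://localhost:8080/oai", "Identify")
info = parse_identify(SAMPLE_RESPONSE)
```

In a live harvest the text passed to `parse_identify` would come from fetching `url`, for example with `urllib.request.urlopen`.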


Harvester (Service Provider) and Repository (Data Provider)

Supporting protocol requests:
• Identify
• ListMetadataFormats
• ListSets

Harvesting protocol requests:
• ListRecords
• ListIdentifiers
• GetRecord
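The six requests differ in the arguments they accept. The table below follows the OAI-PMH 2.0 specification as best recalled here and should be checked against it; the `check_request` helper is a hypothetical name used only for this sketch. It returns the protocol's `badVerb` or `badArgument` error code, or `None` when the request is well formed.

```python
# Arguments per verb under OAI-PMH 2.0 ("verb" itself is always required).
LIST_OPTIONAL = {"from", "until", "set", "resumptionToken"}
VERB_ARGUMENTS = {
    "Identify": {"required": set(), "optional": set()},
    "ListMetadataFormats": {"required": set(), "optional": {"identifier"}},
    "ListSets": {"required": set(), "optional": {"resumptionToken"}},
    "ListIdentifiers": {"required": {"metadataPrefix"}, "optional": LIST_OPTIONAL},
    "ListRecords": {"required": {"metadataPrefix"}, "optional": LIST_OPTIONAL},
    "GetRecord": {"required": {"identifier", "metadataPrefix"}, "optional": set()},
}


def check_request(verb, arguments):
    """Return None if the request is well formed, else an OAI-PMH error code."""
    spec = VERB_ARGUMENTS.get(verb)
    if spec is None:
        return "badVerb"
    names = set(arguments)
    if "resumptionToken" in names:
        # resumptionToken is exclusive: it must be the only argument.
        allowed = "resumptionToken" in spec["optional"] and names == {"resumptionToken"}
        return None if allowed else "badArgument"
    if not spec["required"] <= names:
        return "badArgument"  # a required argument is missing
    if names - spec["required"] - spec["optional"]:
        return "badArgument"  # an argument this verb does not accept
    return None
```

For example, `check_request("GetRecord", {"identifier": "oai:mlib:123a"})` reports `badArgument` because `metadataPrefix` is missing.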


Request (Harvester → Repository): Identify

Response (Repository → Harvester):
• Repository name
• Base-URL
• Admin e-mail
• OAI protocol version
• Description Container


Request (Harvester → Repository): ListMetadataFormats

Response (Repository → Harvester), repeated for each format:
• Format prefix
• Format XML schema


Request (Harvester → Repository): ListSets

Response (Repository → Harvester), repeated for each set:
• Set Specification
• Set Name


Request (Harvester → Repository): ListRecords
* from=a
* until=b
* set=klm
* metadataPrefix=oai_dc

Response (Repository → Harvester), repeated for each record:
• Identifier
• Datestamp
• Metadata
• About Container
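On the harvester side, a ListRecords response is unpacked record by record into identifier, datestamp and metadata. The sketch below parses a hand-written sample response (its datestamp and metadata values are placeholders); `parse_list_records` is a hypothetical helper name, and resumption tokens and the optional about container are left out for brevity.

```python
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
}

# Hand-written sample with a single record (values are placeholders).
SAMPLE_LIST_RECORDS = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2003-10-29T10:00:00Z</responseDate>
  <request verb="ListRecords" metadataPrefix="oai_dc">http://localhost:8080/oai</request>
  <ListRecords>
    <record>
      <header>
        <identifier>oai:mlib:123a</identifier>
        <datestamp>2003-10-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Rapid Visual OAI Tool</dc:title>
          <dc:creator>K. Maly</dc:creator>
          <dc:creator>M. Zubair</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def parse_list_records(xml_text):
    """Return one dict per record: identifier, datestamp, DC metadata."""
    records = []
    root = ET.fromstring(xml_text)
    for record in root.iterfind("oai:ListRecords/oai:record", NS):
        header = record.find("oai:header", NS)
        metadata = {}
        for element in record.iterfind("oai:metadata/oai_dc:dc/*", NS):
            name = element.tag.split("}", 1)[1]  # drop the "{namespace}" part
            metadata.setdefault(name, []).append(element.text)
        records.append({
            "identifier": header.findtext("oai:identifier", namespaces=NS),
            "datestamp": header.findtext("oai:datestamp", namespaces=NS),
            "metadata": metadata,
        })
    return records


records = parse_list_records(SAMPLE_LIST_RECORDS)
```

Each DC element is kept as a list because the elements are repeatable; here the sample record carries two creators.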


Request (Harvester → Repository): ListIdentifiers
* from=a
* until=b
* set=klm
* metadataPrefix=oai_dc

Response (Repository → Harvester), repeated for each record:
• Identifier
• Datestamp


Request (Harvester → Repository): GetRecord
* identifier=oai:mlib:123a
* metadataPrefix=oai_dc

Response (Repository → Harvester):
• Identifier
• Datestamp
• Metadata
• About


OAI Request and OAI Response


What does it mean to make an existing DL OAI-compliant?


OAI Framework and RVOT


Rapid Visual OAI Tool (http://rvot.sourceforge.net)

• Rapid Visual OAI Tool (RVOT) can be used to graphically construct an OAI-PMH (OAI 2.0 compliant) repository from a collection of files.

• The records in the original collection can be in any one of the supported formats. The formats currently supported are RFC1807, a MARC subset, and COSATI.

• RVOT helps to define the mapping visually from a native format to oai_dc format, and once this is done, metadata is converted from native format to DC.
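The idea of a native-to-DC mapping can be sketched in a few lines. The example below assumes RFC 1807 style input, where each field is written as `TAG:: value` and a value may continue on following lines. The mapping table, the function names and the sample record (including its ID) are illustrative assumptions; RVOT's own mapping is defined through its graphical interface and is not reproduced here.

```python
import re

# Illustrative mapping from RFC 1807 tags to Dublin Core elements (an assumption).
RFC1807_TO_DC = {
    "ID": "identifier",
    "TITLE": "title",
    "AUTHOR": "creator",
    "ABSTRACT": "description",
    "KEYWORD": "subject",
    "DATE": "date",
}

FIELD_RE = re.compile(r"^([A-Z][A-Z0-9-]*)::\s*(.*)$")


def parse_rfc1807(text):
    """Return (tag, value) pairs; continuation lines are joined to the last field."""
    fields = []
    for line in text.splitlines():
        match = FIELD_RE.match(line)
        if match:
            fields.append([match.group(1), match.group(2).strip()])
        elif line.strip() and fields:
            fields[-1][1] = f"{fields[-1][1]} {line.strip()}".strip()
    return [(tag, value) for tag, value in fields]


def to_dublin_core(native_fields, mapping=RFC1807_TO_DC):
    """Apply the mapping; native tags without a DC counterpart are dropped."""
    dc = {}
    for tag, value in native_fields:
        if tag in mapping and value:
            dc.setdefault(mapping[tag], []).append(value)
    return dc


# Hypothetical native record used only to exercise the sketch.
SAMPLE = """BIB-VERSION:: CS-TR-v2.1
ID:: EXAMPLE//CS-TR-2003-01
ENTRY:: October 29, 2003
TITLE:: Rapid Visual OAI Tool
AUTHOR:: Maly, K.
AUTHOR:: Zubair, M.
ABSTRACT:: A tool for building an OAI-PMH
  repository from a collection of files.
END:: EXAMPLE//CS-TR-2003-01
"""

dc_record = to_dublin_core(parse_rfc1807(SAMPLE))
```

Because the mapping is plain data, it only has to be written once per native format and can then be applied to every record in the collection.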


• The tool is self-contained; it comes with a lightweight HTTP server and an OAI-PMH request handler.

• The design of RVOT is such that it can be easily extended to support other metadata formats.
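To give a feel for what a self-contained HTTP server with an OAI-PMH request handler involves, here is a minimal sketch using Python's standard library. It is not RVOT's implementation (RVOT is described in the user feedback as a Java program); the repository details are placeholders, and only the Identify verb is answered.

```python
from datetime import datetime, timezone
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse
from xml.sax.saxutils import escape

# Placeholder repository description (assumed values).
REPOSITORY = {
    "repositoryName": "Example Repository",
    "baseURL": "http://localhost:8080/oai",
    "adminEmail": "admin@example.org",
}


def identify_body(repo):
    """Identify payload; the last three values are placeholders."""
    return (
        "<Identify>"
        f"<repositoryName>{escape(repo['repositoryName'])}</repositoryName>"
        f"<baseURL>{escape(repo['baseURL'])}</baseURL>"
        "<protocolVersion>2.0</protocolVersion>"
        f"<adminEmail>{escape(repo['adminEmail'])}</adminEmail>"
        "<earliestDatestamp>2003-01-01</earliestDatestamp>"
        "<deletedRecord>no</deletedRecord>"
        "<granularity>YYYY-MM-DD</granularity>"
        "</Identify>"
    )


def handle_query(query_string, repo=REPOSITORY):
    """Turn an HTTP query string into an OAI-PMH XML response."""
    verb = parse_qs(query_string).get("verb", [""])[0]
    if verb == "Identify":
        body = identify_body(repo)
    else:
        # Simplification: this sketch rejects everything but Identify.
        # A real data provider must also serve the other five verbs.
        body = '<error code="badVerb">Only Identify is implemented in this sketch</error>'
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">'
        f"<responseDate>{stamp}</responseDate>"
        f"<request>{escape(repo['baseURL'])}</request>"
        f"{body}</OAI-PMH>"
    )


class OAIHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        payload = handle_query(urlparse(self.path).query).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/xml; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8080):
    """Run the data provider until interrupted."""
    HTTPServer(("localhost", port), OAIHandler).serve_forever()
```

Calling `serve()` starts the server; a request to `http://localhost:8080/oai?verb=Identify` then returns the Identify response built by `handle_query`.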


RVOT Architecture


RVOT - Components

• Metadata Manager
- Native to DC Mapping Definition Tool
- Native to DC Metadata Converter
- DC Metadata Publishing Tool

• OAI Webserver
- Lightweight HTTP Server
- OAI-PMH Data Provider

• Graphical User Interface
- Metadata Mapping Definition Interface
- DC Metadata Publishing Interface
- Administration (repository specific)
- Other Interfaces (Logs, Help..)


Extending RVOT

• Metadata Parser

• Native Metadata Parser extending ‘Parser’ Interface


Flow Diagram


RVOT Main Interface


Interface to Specify Native Files Directory


RVOT Metadata Mapping Interface


Interface to Publish Dublin Core Metadata


RVOT – Highlights

• Visually construct an OAI-PMH (OAI 2.0 compliant) repository from a collection of metadata files.

• Interface for mapping native elements to DC elements.
– The mapping needs to be defined only once.
– Create, Modify, View and Delete options available.

• HTTP server to handle OAI requests.
– OAI 2.0 requests supported.
– Server configuration options.
– Start and stop the server.

• DC metadata publishing tool.
– Can create DC metadata files.
– Add, Modify, View and Delete options available.


RVOT – Installation

• Download RVOT from http://rvot.sourceforge.net/
• Unzip/Untar the downloaded package.
• Add native (custom) metadata parser (optional).
• Compile the package (compile.bat or compile.sh).
• Start RVOT (startup.bat or startup.sh).
• Provide Repository specific information.
• Specify native metadata directory location.
• Specify Native to DC Mapping.
• Start OAI webserver.
• Issue OAI Requests.


RVOT – Future Work

• At present RVOT supports only native format files located on the local hard disk. We plan to extend this to native metadata files reached through URLs or stored in databases.

• Provide Import/Export of DC metadata from/to Static Repository Format.

• Visually generate parsers by specifying the required parameters.


Usage Statistics

                                                               

User Feedback

“..The interface looks and feels elegant and easy to work with, everything seems to work straightforward. For me as a user it actually nearly looks too much streamlined, as I can do "only" the "simple" conversions. The complicated stuff (xml-description, http-request-handling etc.) is of course hidden from the users view..”
- Jürgen Beling, University of Trier, Germany

“..You can use this software to turn a collection of files into an OAI 2.0 compatible repository. No database needed, and it works like a charm. I dl-ed the files and had a personal OAI compatible repository up in less than 10 minutes. Let me repeat, less than 10 minutes. Me! I had nothing of value to put in there, but it can be done. And what's more, you can easily map one metadata set onto another, say Dublin Core..”
- Henk Ellermann, Erasmus Electronic Publishing Initiative, Holland

“..I like your Java repository program with the light weight http server. We will use it to build a repository for the datasets we'd like to expose. Also, I will be teaching a graduate course in W3C & related standards next fall. One module will be OAI-PMH and I plan to have students install, configure and test your program as a lab exercise. Nothing like hands on to show how protocols work..”
- Larry Mongin, Indiana University, USA