Alternative Architecture for Information in Digital Libraries Onno W. Purbo [email protected]

Alternative Architecture for Information in Digital Libraries



Page 1: Alternative Architecture for Information in Digital Libraries

Alternative Architecture for Information in Digital Libraries

Onno W. Purbo [email protected]

Page 2: Alternative Architecture for Information in Digital Libraries

Reference

William Y. Arms, Christophe Blanchi, Edward A. Overly, “An Architecture for Information in Digital Libraries,” Corporation for National Research Initiatives, Reston, Virginia, February 1997. http://www.dlib.org/dlib/february97/cnri/02arms1.html

Page 3: Alternative Architecture for Information in Digital Libraries

The Structure of Information

Digital data in the digital library is held as digital objects.
Each digital object carries metadata and a unique identifier (handle).
A group of digital objects forms a set of digital objects.
Different types of material are organized into categories.

Page 4: Alternative Architecture for Information in Digital Libraries

Components of the Computer System

Page 5: Alternative Architecture for Information in Digital Libraries

Work Flow Example

Search: Z39.50 – returns a list of digital objects identified by handle.
Select
Retrieval: Repository Access Protocol (RAP)
Display

Page 6: Alternative Architecture for Information in Digital Libraries

Information Architecture

Page 7: Alternative Architecture for Information in Digital Libraries

Structure of Information in a Digital Library

Relationship (chapter, index)
Format (SGML, HTML)
Version
Rights & permissions
Computer system & network (dial-up vs. broadband)

Page 8: Alternative Architecture for Information in Digital Libraries

Basic Principles

User and application programs must be flexible.
Collections must be straightforward to manage.
The information architecture must reflect the economic, social & legal framework.

Page 9: Alternative Architecture for Information in Digital Libraries

Data type, structural metadata

Data type – technical properties of data, format & processing.
Structural metadata – type, version, relationship of digital material.
Meta-object – reference to a set of digital objects.

Page 10: Alternative Architecture for Information in Digital Libraries

Guidelines for all categories

All data is given an explicit data type.
All metadata is encoded explicitly.
Handles are given to individual items of intellectual property.
Meta-objects are used to aggregate digital objects.
Handles are used to identify items listed in meta-objects.

Page 11: Alternative Architecture for Information in Digital Libraries

An Example of the Use of Meta-objects

Scanned photographs
Digital objects for a scanned photograph
Digital objects for individual versions
Meta-object
Handles for scanned photographs
Depositing a scanned photograph

Page 12: Alternative Architecture for Information in Digital Libraries

Digital objects for a scanned photograph

Low resolution “thumbnail”
High resolution “reference” image

Page 13: Alternative Architecture for Information in Digital Libraries

Digital objects for individual versions

Key metadata. Used to manage the object in a networked environment. It includes the handle and the rights and permissions associated with the digital object.

Structural metadata. Includes fields for description, owner, handle of the meta-object, data size, data type (e.g., "jpg"), version number, date deposited, use (e.g., "thumbnail"), and the date of last revision.

Image data. The scanned image itself.

Page 14: Alternative Architecture for Information in Digital Libraries

Meta-object

Key metadata. Includes the handle and the rights and permissions associated with the digital object.

Structural metadata. Includes a description, the owner, the number of versions, the date deposited, the use ("meta-object"), and the date of last revision.

Data about each version. For each of the three scanned versions (e.g., the thumbnail), there is a package of information including the handle of the version and the relationship among the versions.

Page 15: Alternative Architecture for Information in Digital Libraries

Handles for scanned photographs

Control identifier: 3a16116r.jpg
Replace the control identifiers by handles, which provide a unique, persistent, location-independent name for each item: loc.ndlp.amrlp/3a16116

Terminology to describe handles:
"loc.ndlp.amrlp" is the naming authority
"3a16116" is a locally unique string

For convenience in processing, use sequence numbers:
loc.ndlp.amrlp/3a16116.1
loc.ndlp.amrlp/3a16116.2
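The handle terminology above can be sketched in a few lines of Python. This is a minimal illustration, assuming a handle is written as naming authority, "/", local string, with an optional ".N" sequence suffix as in the examples on this slide. The names `Handle` and `parse_handle` are hypothetical and are not part of the handle system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Handle:
    naming_authority: str      # e.g. "loc.ndlp.amrlp"
    local_name: str            # locally unique string, e.g. "3a16116"
    sequence: Optional[int]    # sequence number of a version, if present


def parse_handle(text: str) -> Handle:
    """Split a handle into naming authority, local string, and sequence number.

    Assumption: the first "/" separates the naming authority from the local
    part, and a trailing ".<digits>" on the local part is a sequence number.
    """
    authority, sep, local = text.partition("/")
    if not sep or not authority or not local:
        raise ValueError(f"not a handle: {text!r}")
    name, dot, suffix = local.rpartition(".")
    if dot and name and suffix.isdecimal():
        return Handle(authority, name, int(suffix))
    return Handle(authority, local, None)


meta = parse_handle("loc.ndlp.amrlp/3a16116")
version = parse_handle("loc.ndlp.amrlp/3a16116.2")
```

Here `meta` has no sequence number, while `version` carries sequence number 2 under the same naming authority and local string.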

Page 16: Alternative Architecture for Information in Digital Libraries

Meta-object identifies two images

Page 17: Alternative Architecture for Information in Digital Libraries

Depositing a scanned photograph

Human
Machine

Page 18: Alternative Architecture for Information in Digital Libraries

Depositing a scanned photograph - human

Selection of the material that will be made into each digital object.
Specification of the metadata for those fields that require judgment.

Page 19: Alternative Architecture for Information in Digital Libraries

Depositing a scanned photograph - machine

Creation of the meta-object and the links to other digital objects.
Depositing the digital objects in the repository.
Registering the handles in the handle system.

Page 20: Alternative Architecture for Information in Digital Libraries

Access to a scanned photograph

Bibliographic entries in search systems refer to the scanned photograph by the handle of the meta-object.

If a user requests a summary of the photograph, the "thumbnail" image is provided.

If the user requests access to the photograph without specifying which version, the "access" image is provided.
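The access rules above can be sketched as a small lookup on the meta-object. This is a minimal Python illustration, not the repository's actual logic: the `MetaObject` class, the `select_version` function, and the assignment of sequence numbers to the "thumbnail" and "access" uses are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class MetaObject:
    handle: str
    versions: Dict[str, str]   # use label -> handle of that version


def select_version(meta: MetaObject, request: Optional[str] = None) -> str:
    """Return the handle of the version that answers a request.

    "summary" maps to the "thumbnail" version; no request maps to the
    "access" version; any other value is treated as an explicit use label.
    """
    if request == "summary":
        use = "thumbnail"
    elif request is None:
        use = "access"
    else:
        use = request
    try:
        return meta.versions[use]
    except KeyError:
        raise LookupError(f"{meta.handle} has no version with use {use!r}") from None


# Hypothetical meta-object; which sequence number holds which use is assumed.
photo = MetaObject(
    handle="loc.ndlp.amrlp/3a16116",
    versions={
        "thumbnail": "loc.ndlp.amrlp/3a16116.1",
        "access": "loc.ndlp.amrlp/3a16116.2",
    },
)
```

A search system would hold only `photo.handle`; the choice of version is made at access time.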

Page 21: Alternative Architecture for Information in Digital Libraries

Technical Information

Page 22: Alternative Architecture for Information in Digital Libraries

Digital Object

Page 23: Alternative Architecture for Information in Digital Libraries

Digital Object

Key-metadata. The key-metadata is the information stored in the digital object that is needed to manage the digital object in a networked environment -- for example to store, replicate, or transmit the object without providing access to the content. This includes terms and conditions, and the handle.

Digital material. The digital material (or data) comprises a set of sequences of bits.

Page 24: Alternative Architecture for Information in Digital Libraries

Digital Objects Internal Structure

An element is a bit sequence comprising an elementary unit of information. An element has its own ID.

A package is a collection of elements and other packages, with its own ID.

A digital object is a package with key-metadata for use in a networked environment. The ID is a handle.

Page 25: Alternative Architecture for Information in Digital Libraries

Data Element

Page 26: Alternative Architecture for Information in Digital Libraries

Data Element

Data element. A data element is any bit-sequence.

Element ID. The element ID is the internal identifier of the element within the digital object. Unlike a handle, which is unique and known publicly, the element ID is of local importance only.

Attributes. Attributes are the information that is needed to process the element. They include a role, which defines the function of the element (such as "DTD" in the SGML world), and a type, which includes technical information (such as "jpeg").

Page 27: Alternative Architecture for Information in Digital Libraries

A Package

Page 28: Alternative Architecture for Information in Digital Libraries

Packages

Packages are used to group or associate elements and other packages. A package has a package ID.

If the package is a digital object, the package ID is a handle. Otherwise, it is the internal identifier of the package within the digital object. Unlike a handle, which is unique and known publicly, such a package ID is of local importance only. The content of a package consists of elements and other packages.
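The element, package, and digital object definitions on the preceding slides can be sketched as nested data classes. This is a minimal Python illustration under stated assumptions: the class and field names are hypothetical, key-metadata is reduced to a plain dictionary, and nothing here reflects an actual implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Union


@dataclass
class Element:
    element_id: str   # local identifier, meaningful only inside the object
    data: bytes       # any bit-sequence
    role: str         # function of the element, e.g. "DTD"
    type: str         # technical information, e.g. "jpeg"


@dataclass
class Package:
    package_id: str   # local ID, or a handle if the package is a digital object
    contents: List[Union[Element, "Package"]] = field(default_factory=list)

    def elements(self) -> Iterator[Element]:
        """Yield every element, descending into nested packages."""
        for item in self.contents:
            if isinstance(item, Element):
                yield item
            else:
                yield from item.elements()


@dataclass
class DigitalObject(Package):
    # Key-metadata: e.g. terms and conditions (the handle is the package ID).
    key_metadata: Dict[str, str] = field(default_factory=dict)

    @property
    def handle(self) -> str:
        return self.package_id


# Hypothetical digital object holding one element directly and one in a package.
photo = DigitalObject(
    package_id="loc.ndlp.amrlp/3a16116.1",
    contents=[
        Element("e1", b"\xff\xd8", role="image", type="jpeg"),
        Package("p1", [Element("e2", b"caption", role="text", type="ascii")]),
    ],
    key_metadata={"terms": "example terms and conditions"},
)
```

Only `photo.handle` is publicly known; "e1", "p1", and "e2" are local identifiers inside the object.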

Page 29: Alternative Architecture for Information in Digital Libraries

Handle & Handle System

Page 30: Alternative Architecture for Information in Digital Libraries

Handle & Handle System

The digital library is assembled from a great variety of components. They include people, computers, networks, repositories, databases, search systems, Web servers, digital objects, elements of objects, bibliographic records, and many more. Keeping track of these components requires a systematic approach to identification.

http://www.handle.net

Page 31: Alternative Architecture for Information in Digital Libraries

Typical handle record

Page 32: Alternative Architecture for Information in Digital Libraries

Handle record for web

Page 33: Alternative Architecture for Information in Digital Libraries

Handle System

To resolve a handle is to present a handle to the handle system and receive as a reply information about the item identified.

The handle system is a distributed computer system, with many computers distributed across the world. CNRI manages a global handle registry and there are local handle services operated by other organizations, e.g. http://www.handle.net/

Page 34: Alternative Architecture for Information in Digital Libraries

Naming Authority

Handles are created by naming authorities, administrative units that are authorized to create and edit handles.

Page 35: Alternative Architecture for Information in Digital Libraries

The Repository

Page 36: Alternative Architecture for Information in Digital Libraries

Structure of a Repository

A repository is a system for network-based storage of and access to digital objects.

All interaction with the repository uses a simple protocol, known as the Repository Access Protocol (RAP). RAP has a small number of fundamental operations, such as "deposit", which stores a digital object in the repository, and "access", which provides access to a digital object.

Thus RAP provides a clearly defined, open interface for the repository that allows others to write clients and higher level interfaces.

Page 37: Alternative Architecture for Information in Digital Libraries

Structure of Repository

Page 38: Alternative Architecture for Information in Digital Libraries

Structure of Repository

Repository shell. The repository shell is the part of the repository that interfaces with the outside world. It implements the RAP protocol.

Persistent store. Information in the repository is held in the persistent store. The persistent store is completely hidden from the outside.

Object management layer. The object management layer provides an interface between the services provided by the persistent store and the object-oriented functions required by the repository shell.

Page 39: Alternative Architecture for Information in Digital Libraries

The Repository Access Protocol (RAP)

VerifyHandle. Confirm that a handle has been registered in the handle system.
AccessRepoMeta. Access the repository metadata.
Verify_DO. Confirm that a repository stores a digital object with a specified handle.
AccessMeta. Access the metadata for a specified digital object.
Access_DO. Access the digital object.
Deposit_DO. Deposit a digital object in a repository.
Delete_DO. Delete a digital object from a repository.
MutateMeta. Edit the metadata for a digital object.
Mutate_DO. Edit a digital object.
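The nine operations listed above can be sketched as methods on an in-memory repository. This is a minimal Python illustration only: the slides give the operation names but not their parameters or return values, so every signature below is an assumption, the handle system is replaced by a plain set of registered handles, and the `InMemoryRepository` class is hypothetical.

```python
from typing import Dict, Set


class InMemoryRepository:
    """Toy repository exposing one method per RAP operation (signatures assumed)."""

    def __init__(self, repo_meta: Dict[str, str], registered_handles: Set[str]):
        self._repo_meta = dict(repo_meta)
        self._registered = registered_handles          # stand-in for the handle system
        self._data: Dict[str, bytes] = {}              # stand-in for the persistent store
        self._meta: Dict[str, Dict[str, str]] = {}

    def verify_handle(self, handle: str) -> bool:      # VerifyHandle
        return handle in self._registered

    def access_repo_meta(self) -> Dict[str, str]:      # AccessRepoMeta
        return dict(self._repo_meta)

    def verify_do(self, handle: str) -> bool:          # Verify_DO
        return handle in self._data

    def access_meta(self, handle: str) -> Dict[str, str]:   # AccessMeta
        return dict(self._meta[handle])

    def access_do(self, handle: str) -> bytes:         # Access_DO
        return self._data[handle]

    def deposit_do(self, handle: str, data: bytes, meta: Dict[str, str]) -> None:  # Deposit_DO
        self._data[handle] = data
        self._meta[handle] = dict(meta)

    def delete_do(self, handle: str) -> None:          # Delete_DO
        del self._data[handle]
        del self._meta[handle]

    def mutate_meta(self, handle: str, changes: Dict[str, str]) -> None:  # MutateMeta
        self._meta[handle].update(changes)

    def mutate_do(self, handle: str, data: bytes) -> None:  # Mutate_DO
        if handle not in self._data:
            raise KeyError(handle)
        self._data[handle] = data
```

Clients call only these methods; the two private dictionaries play the role of the hidden persistent store.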

Page 40: Alternative Architecture for Information in Digital Libraries

Handle system to access DO

Page 41: Alternative Architecture for Information in Digital Libraries

Example RAP Work Flow

The handle "loc.ndlp/1234" is sent to the handle system. It resolves to data type "handle" (HDL), value "loc/repos1". This is interpreted as information that the digital object is stored in the repository identified by the given handle.

The handle "loc/repos1" is sent to the handle system. It resolves to information of type "RAP". This is information that the repository implements RAP. The corresponding data is a reference to a CORBA Object Request Broker (ORB).

The command "Access_DO (loc.ndlp/1234)" is now sent to the repository.
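The three steps above can be traced in a short sketch. This is a minimal Python illustration under stated assumptions: the handle system is modeled as a dictionary of typed records, the ORB reference is replaced by a hypothetical string key ("orb-ref-1"), and the repository is a plain dictionary rather than a CORBA object. The handles and the type names "HDL" and "RAP" come from the slide; everything else is invented for the example.

```python
from typing import Dict, List, Tuple

# A handle record is a list of (data type, value) pairs.
HandleRecord = List[Tuple[str, str]]

HANDLE_SYSTEM: Dict[str, HandleRecord] = {
    "loc.ndlp/1234": [("HDL", "loc/repos1")],   # object -> repository handle
    "loc/repos1": [("RAP", "orb-ref-1")],       # repository -> RAP endpoint (assumed key)
}

# Stand-in for repositories reachable through an ORB reference.
REPOSITORIES: Dict[str, Dict[str, bytes]] = {
    "orb-ref-1": {"loc.ndlp/1234": b"digital object bytes"},
}


def resolve(handle: str, wanted_type: str) -> str:
    """Present a handle to the handle system and return the value of one type."""
    for data_type, value in HANDLE_SYSTEM[handle]:
        if data_type == wanted_type:
            return value
    raise LookupError(f"{handle} has no data of type {wanted_type}")


def access_do(handle: str) -> bytes:
    repo_handle = resolve(handle, "HDL")      # step 1: which repository holds it
    orb_ref = resolve(repo_handle, "RAP")     # step 2: how to reach that repository
    repository = REPOSITORIES[orb_ref]
    return repository[handle]                 # step 3: Access_DO on the repository


content = access_do("loc.ndlp/1234")
```

The client starts with nothing but the handle of the digital object; both the repository and its network endpoint are discovered through resolution.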

Page 42: Alternative Architecture for Information in Digital Libraries

Benefit Using Handle

Since the digital object is identified by a handle, if it is moved to another repository the only change required is to alter the data in the first of the handle records in the figure.

Since the repository is identified by a handle, if the repository is moved to a different computer or otherwise changed, but its handle remains the same, altering the single data item in the second handle record in the figure is the only change needed, for all the digital objects stored in the repository.

Page 43: Alternative Architecture for Information in Digital Libraries

User Interface

Page 44: Alternative Architecture for Information in Digital Libraries

User Interface System

Page 45: Alternative Architecture for Information in Digital Libraries

Client via CGI-BIN

Page 46: Alternative Architecture for Information in Digital Libraries

DO sets as hierarchies

Page 47: Alternative Architecture for Information in Digital Libraries

Hierarchies

Level 0: contains the digitized image, sound, text, or other data.

Level 1: is a parent of digital objects of Level 0. Upon encountering a digital object of this type, the digital object browser extracts the content of all the child Level 0 digital objects and displays them in an indexed list to the user. This type has been used to display indexes of thumbnail images.

Level 2: is a parent of digital objects of Level 1.
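The level scheme above can be sketched as a small tree. This is a minimal Python illustration, assuming that an object's level is one more than the highest level among its children; the `Node` class and the `level` and `browse` functions are hypothetical names, and the indexed-list format is invented for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    handle: str
    content: str = ""                         # Level 0 payload (image, sound, text, ...)
    children: List["Node"] = field(default_factory=list)


def level(node: Node) -> int:
    """Level 0 has no children; a parent sits one level above its children."""
    if not node.children:
        return 0
    return 1 + max(level(child) for child in node.children)


def browse(node: Node) -> List[str]:
    """For a Level 1 object, list the content of its Level 0 children with an index."""
    if level(node) != 1:
        raise ValueError(f"{node.handle} is not a Level 1 object")
    return [f"{i}. {child.content}" for i, child in enumerate(node.children, start=1)]


# Hypothetical handles and content.
thumbs = Node("example/index", children=[
    Node("example/a", "thumbnail A"),
    Node("example/b", "thumbnail B"),
])
collection = Node("example/top", children=[thumbs])
```

With these objects, `browse(thumbs)` produces the indexed list of thumbnails, and `collection` is a Level 2 object.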