Introducing “Pergamos”
Libraries Computer Center
Department of Informatics & Telecommunications
University of Athens
A FEDORA-based Digital Library System utilizing Digital Object Prototypes
Kostas [email protected]
European FEDORA User Meeting, Copenhagen, 28 September 2005
September 28, Copenhagen | European FEDORA User Meeting 2005

Outline
- Motivation – The University of Athens (UoA) DL
- Digital Objects (DOs)
  - DO Storage (FEDORA)
  - DO Manipulation (DL Application Logic)
- Digital Object Prototypes
  - Automatic DO Type Conformance
  - Scope of Prototypes & Collection Management
  - Implementation Details
- A Preview of Pergamos
- Discussion

The UoA DL Project
- Over 1 million objects originating from 8 disparate collections
  - Folklore notebooks, Ancient papyri, UoA Historical Archive, Byzantine music manuscripts, Theatrical photos & brochures, Informatics research papers and dissertations, Medical images, Press articles
- Heterogeneous material, in terms of content type, metadata, structure, user requirements
- Mostly digitized material, requiring detailed cataloging

UoA DL Project Metadata
- Build a Web-based DL System to handle all material
- Centralized DL approach due to
  - Existing hardware infrastructure
  - Funding restrictions
  - Administration simplicity
- FEDORA is our DO Repository

UoA DL Project Metadata (Contd.)
- Small Team
  - 2.5 developers, 1 librarian, 1 manager
  - Requirements, Specifications, Development, Digitization & Cataloging Management …
  - … while everyday tasks keep running!
- Cataloging Personnel
  - Scholars & Experts in each collection’s domain (not librarians)
- Strict Schedule
  - First Collection deadline: early 2006
  - Project deadline: end of 2006

Motivation
- Simplify & speed up the cataloging process
  - Provide effective Web-based cataloging interfaces
  - Automate content ingestion
- Decrease development time
  - Avoid custom coding for each content variation
  - Elaborate on reusable and configurable DL modules
- Provide the means to treat content variations in a unified manner

Digital Objects
- A Digital Object is a human-generated artifact consisting of the digital content and related information

FEDORA
- FEDORA Digital Object Model
  - Content Models, Datastreams, Behavior Definitions, Mechanisms & Disseminators
- FEDORA is a DO Repository
  - Focuses on how each DO part is encoded & stored
  - Effectively handles issues related to storage, preservation & versioning, searching & indexing, interoperability

DL Application Logic
- Cataloging, Workflows, Collection Building & Management, User Interfaces, etc.
- DL Modules manipulate DOs at a higher level of abstraction
  - Focus on the overall behavior of the DO (what the DO parts are and how they behave)
- DOs reflect the underlying “real world” objects – they behave according to their nature, their essence, their type

An example – Theatrical Collection
- Albums containing photos of National Theater Performances
- What is a Photo DO?
  - A digital image stored in various formats (e.g. high quality, www quality, thumbnail) accompanied by the metadata required for describing the picture
- What is an Album DO?
  - A container of Photo DOs accompanied by theatrical play metadata

A 2nd example – Historical Archive
- University’s Senate Session Proceedings > Folders > Sessions > Items
- What is an Item DO?
  - A digital image (capturing 1 or 2 pages) stored in various formats (e.g. high quality, www quality, thumbnail)
- What is a Session DO?
  - A container of Item DOs + metadata
- What is a Folder DO?
  - A container of Session DOs + metadata

DO Typing Information
- FEDORA Content Models express DO Typing information
- Content Models are metadata attributes (e.g. “photo”, “album”) that we use as a guide
- Humans interpret Content Models, not the DL System
  - Manual resolution of DO Typing issues

Problems
- Catalogers carry out manual XML editing at a low level of abstraction, with overly technical, complex & over-detailed semantics
- Developers generate ad-hoc, custom & non-reusable implementations of DO types’ variations of behavior
- DL modules exhibit limited evolution and configuration capabilities
DO Typing Information
The DL System should resolve DO Typing issues automatically (in a manner transparent to the DL Application Logic)

Automatic DO Type Conformance
- The designer specifies the various DO types …
- … and the DL System makes DOs conform to these type specifications automatically
- How?

The OO Viewpoint
- In the OO model an object is itself aware of its “nature” and behaves accordingly
- Objects are conceived as instances of a type, automatically conforming to the type’s definitions & specifications
- OO types are separate entities (named either classes or prototypes)

Digital Object Prototypes
- A DO Prototype is a DO Type Specification, a separate entity that defines the DO’s:
  - Constituent parts – metadata sets, files, structure, etc.
  - Private behaviors – DO internal operations such as serializations, validations, assignment of default values, content conversions, etc.
  - Public behaviors (behavior schemes) – the DO external interface, consisting of high-level operations such as Detail View, Browse View, Edit View, etc.

DO Prototypes & Instances
- The designer carries out the definition of DO Prototypes – the DL System handles the rest
- DO Prototypes represent the realization of the Content Model notion in an OO fashion:
  - The process of generating a DO from a Prototype is called instantiation
  - The resulting object is an instance of the prototype
  - A DO instance automatically conforms to the Prototype’s specifications
- Stored DOs vs DO instances

Digital Object Dictionary
- The runtime environment in which DO instances and Prototypes operate:
  - Instantiation of DOs based on the prototype specifications (private behaviors: load & parse XML, assign default values, etc.)
  - Exposure of the public DO behaviors in a high-level, uniform API (for use by DL Modules)
  - Serialization of the DO instance back to FEDORA (private behaviors: serialize data structures in XML, perform validations, etc.)

Expression of DL Application Logic
A DL Module performs the following steps:
1. Acquire the DO Instance
   do = dictionary.acquireObject("type")
   do = dictionary.acquireObject("uoadl:1024")
2. Perform operations upon it
   do.getMDSet("DC").getField("title")
   dictionary.executeBehavior(do, "editView")
3. Store the DO in the repository
   dictionary.saveObject(do)
Cleaner, simpler, more effective
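
The three steps above can be sketched as a small runnable Python example. It is a sketch under stated assumptions, not the actual DO Dictionary: a plain dict stands in for FEDORA, the methods are snake_case counterparts of the slide's `acquireObject`, `executeBehavior` and `saveObject`, and the rule that a registered type name creates a new DO while any other key is treated as a pid is an assumption about how the two `acquireObject` calls differ.

```python
import itertools
from typing import Callable, Dict


class DigitalObject:
    """Minimal DO instance: a pid, a type name and metadata sets."""

    def __init__(self, pid: str, type_name: str, md_sets: Dict[str, Dict[str, str]]):
        self.pid = pid
        self.type_name = type_name
        self.md_sets = md_sets


class Dictionary:
    """Hypothetical stand-in for the DO Dictionary; a dict replaces FEDORA."""

    def __init__(self, prototypes: Dict[str, dict]):
        self.prototypes = prototypes              # type name -> prototype spec
        self.repository: Dict[str, dict] = {}     # pid -> stored DO
        self._ids = itertools.count(1024)         # made-up pid counter

    def acquire_object(self, key: str) -> DigitalObject:
        if key in self.prototypes:
            # A type name: instantiate a new, empty DO from the prototype.
            proto = self.prototypes[key]
            md = {name: dict.fromkeys(fields, "")
                  for name, fields in proto["md_sets"].items()}
            return DigitalObject(f"uoadl:{next(self._ids)}", key, md)
        # Otherwise treat the key as a pid and load the stored DO.
        stored = self.repository[key]
        md = {name: dict(values) for name, values in stored["md_sets"].items()}
        return DigitalObject(key, stored["type"], md)

    def execute_behavior(self, do: DigitalObject, scheme: str) -> str:
        behavior: Callable[[DigitalObject], str] = (
            self.prototypes[do.type_name]["behaviors"][scheme]
        )
        return behavior(do)

    def save_object(self, do: DigitalObject) -> None:
        self.repository[do.pid] = {
            "type": do.type_name,
            "md_sets": {name: dict(values) for name, values in do.md_sets.items()},
        }


dictionary = Dictionary({
    "dl.theatre.photo": {
        "md_sets": {"DC": ["title"]},
        "behaviors": {"editView": lambda do: f"edit form for {do.pid}"},
    }
})

do = dictionary.acquire_object("dl.theatre.photo")   # 1. acquire
do.md_sets["DC"]["title"] = "Rehearsal photo"        # 2. operate
view = dictionary.execute_behavior(do, "editView")
dictionary.save_object(do)                           # 3. store
reloaded = dictionary.acquire_object(do.pid)
```

The calling module never touches XML or datastreams; everything type-specific is resolved inside the dictionary from the prototype.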

3-tier DL Architecture (Separation of Concerns)
- Storage
- DO Typing & Instantiation
- Composition of DO behaviors

Scope of Prototypes
- Should we have global DO Types?
- Collection-pertinent types: A DO Prototype is defined in the context of a Collection
  - Support fine-grained definition of collection-specific kinds of material
- Hierarchical naming scheme for types
  - Theatrical Collection Photo: dl.theatre.photo
  - Medical Collection Photo: dl.medical.photo
  - Stored in the “contentModel” metadata attribute
- Avoid type collisions
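
As an illustration of how the hierarchical names keep types apart, here is a minimal Python sketch. The helper `split_type_name` and the `registry` dict are hypothetical; only the two type names come from the slide.

```python
from typing import Dict, Tuple


def split_type_name(type_name: str) -> Tuple[str, str]:
    """Split 'dl.theatre.photo' into its collection path and local type."""
    collection, _, local = type_name.rpartition(".")
    if not collection:
        raise ValueError(f"type name has no collection part: {type_name!r}")
    return collection, local


# Registry keyed by the full hierarchical name, so two collections can
# both define a 'photo' type without colliding.
registry: Dict[str, str] = {
    "dl.theatre.photo": "prototype for theatrical photos",
    "dl.medical.photo": "prototype for medical images",
}

for name in registry:
    collection, local = split_type_name(name)
    print(f"{name}: collection={collection}, type={local}")
```

Both entries share the local name `photo`, yet each resolves to its own prototype because the lookup key includes the collection path.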

Collection Management
- DL = Hierarchy of DO instances
  - Collections are also DOs
  - The DL itself is a DO, representing the “super-collection” (the collection of all the collections)
- Easily add new collections & sub-collections
- All content is modeled in a unified manner & can be characterized
- Allow the DL designer to work out the details of each collection independently, yet in a uniform manner

Implementation details
- DO Prototypes are
  - Specified in XML form
  - Stored in the “TEMPLATE” datastream of the appropriate Collection DO
  - Loaded, parsed & interpreted by the DO Dictionary in its bootstrap procedure
  - Transparent to FEDORA
- DO Instances are supplied with the “CONTAINER” datastream, containing the pids of the DOs they “contain”
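
Since each DO lists the pids of the DOs it contains, the whole DL can be walked as a tree. The following Python sketch assumes the CONTAINER datastreams have already been read into a dict; the pids and the `walk` helper are made up for the example, and nothing here reflects the actual datastream format.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical snapshot of CONTAINER datastreams: pid -> contained pids.
containers: Dict[str, List[str]] = {
    "uoadl:1": ["uoadl:10"],                     # the DL ("super-collection")
    "uoadl:10": ["uoadl:100"],                   # a collection
    "uoadl:100": ["uoadl:1000", "uoadl:1001"],   # a folder holding two DOs
}


def walk(pid: str, depth: int = 0) -> Iterator[Tuple[int, str]]:
    """Yield (depth, pid) for a DO and everything it contains, depth-first."""
    yield depth, pid
    for child in containers.get(pid, []):
        yield from walk(child, depth + 1)


for depth, pid in walk("uoadl:1"):
    print("  " * depth + pid)
```

A DO with no entry in `containers` is simply a leaf, so collections, folders and items are all handled by the same traversal.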

DO Prototypes in detail
- MD Sets
  - Specification of each individual field (label, description, multi-value, mandatory, UI characteristics)
  - Serialization information (how to store it in FEDORA)
  - Field mappings (under development)
- Files: Automatic conversions (tiff -> jpeg + thumb)
- Batch Import: automatically create DOs from zip bundles
- Structure: allowed children types
- Browsers: browse field
- Indices: e.g. subject catalog
- Behavior schemes: atomic DO elements
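
To make the per-field specification concrete, here is a minimal Python sketch of how `mandatory` and `multi-value` flags could drive validation. The `FieldSpec` class, the `validate` function and the sample fields are assumptions for illustration; the real prototypes are specified in XML, whose schema is not shown in these slides.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class FieldSpec:
    """Hypothetical per-field specification inside an MD Set."""
    label: str
    mandatory: bool = False
    multi_value: bool = False


def validate(specs: Dict[str, FieldSpec], values: Dict[str, List[str]]) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for name, spec in specs.items():
        present = [v for v in values.get(name, []) if v.strip()]
        if spec.mandatory and not present:
            problems.append(f"{spec.label}: value required")
        if not spec.multi_value and len(present) > 1:
            problems.append(f"{spec.label}: only one value allowed")
    for name in values:
        if name not in specs:
            problems.append(f"{name}: field not defined in the prototype")
    return problems


dc_specs = {
    "title": FieldSpec(label="Title", mandatory=True),
    "subject": FieldSpec(label="Subject", multi_value=True),
}

print(validate(dc_specs, {"title": ["Senate session"], "subject": ["history", "senate"]}))
print(validate(dc_specs, {"title": ["a", "b"], "color": ["red"]}))
```

The same specification could also drive the cataloging form (labels, required markers, repeatable inputs), so the rules are written once rather than per screen.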

Pergamos
- Historical Archive (production)
- Folklore Notebooks (testing)
- Theatrical Collection, Medical Images & Byzantine music manuscripts (finalization of requirements & specifications)
- Undergoing development … the remaining collections are coming next
- Historical Archive will be published in early 2006 …
  - … with a multi-lingual UI, hopefully!

Public DO Behaviors
FEDORA Behaviors | Behavior Schemes
Are defined in each DO separately | Are defined once and in one place (in the Prototype)
Operate on the datastreams | Operate on the atomic elements of a DO
Invoked directly on the DO | Invoked as in OO Dynamic Method Dispatch
Require the a priori existence of datastreams | Instantiation (empty DO)
Generic | Targeted at UI issues
Exposed as Web services | Web services will be of use after the DL has been built

Future Work
- Fully implement the OO paradigm
  - OO Inheritance for DO Prototypes (e.g. the Notebook type derives from the Book type)
  - OO Polymorphism for DO instances (e.g. the DO “uoadl:1234” is both a Notebook & a Book)
- Supply general-purpose linking capabilities that exceed structural relations (FEDORA Metadata for Object-to-Object Relationships?)
- Deliver on schedule…

Conclusions
- If in doubt, use FEDORA
  - Flexible & Extensible (they mean it)
  - 1 year of Pergamos development, 2 months of testing & 3 months of production use (Historical Archive) with no serious problems
  - Though, Sandy & Carl, I’d be grateful for some minutes of your time!!!
- DO Prototypes: a realization of Content Models in OO terms, implemented on top of FDOM to handle DO Typing issues automatically
- Detailed report on Pergamos to appear…

Thank You
- Questions? Comments?
- For details: "On the Effective Manipulation of Digital Objects: A Prototype-based Instantiation Approach", Kostas Saidis, George Pyrounakis, Mara Nikolaidou, Proc. 9th European Conference on Research and Advanced Technology for Digital Libraries, ECDL 2005, Vienna, Austria, September 2005
- email: [email protected]