The Fedora Digital Repository Project and the National Science Digital Library (NSDL) July 26, 2005 Dean B. Krafft Cornell University


Page 1: The Fedora Digital Repository Project and the National Science Digital Library (NSDL)

The Fedora Digital Repository Project and the National Science Digital Library (NSDL)

July 26, 2005

Dean B. Krafft, Cornell University

Page 2: The Fedora Digital Repository Project and the National Science Digital Library (NSDL)

Fedora: Repository Middleware

• A Flexible, Extensible Digital Object Repository Architecture

• An architecture and toolkit (like IIS or SQL Server), not a vertical application

• Audience: system builders – 12 major university or national (Denmark) digital libraries

• DSpace in contrast: a vertical application with a fixed workflow targeted at users

• So far incorporated in two commercial products: VTLS’s Vital digital library, and Company X’s product – finalist for a large government contract

Page 3: The Fedora Digital Repository Project and the National Science Digital Library (NSDL)

Fedora: Project Details

• Collaboration of Cornell and UVa

• Development team of 10 developers + leads

• Currently implemented in Java; licensed under Mozilla Public License

• Funded by Mellon: starting 2nd 3yr $1.4m grant

• Cornell leads: Sandy Payette & Carl Lagoze

• 20,000+ downloads, active user community

• Use cases: Digital Asset Management, Scholarly Publishing, Information Network Overlay, Institutional Repository, Digital Archive and Records Management, Digital Library

Page 4: The Fedora Digital Repository Project and the National Science Digital Library (NSDL)

Fedora Digital Object Model Component View

• Persistent ID (PID): digital object identifier

• Reserved Datastreams: key object metadata
  – Dublin Core (DC)
  – Audit Trail (AUDIT)
  – Relations (RELS-EXT)

• Datastreams: set of content or metadata items (local or external URL redirects)

• Disseminators: web-service methods for distributing views of recombined content
  – Default Disseminator
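
The component view can be read as a simple data structure. The Python sketch below only illustrates that structure; it is not Fedora's Java implementation or its XML storage format. The class and field names (`DigitalObject`, `Datastream`, `Disseminator`, `location`, `methods`), the PID `demo:1`, the method name, and the example URL are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Reserved datastream IDs named on the slide.
RESERVED_IDS = ("DC", "AUDIT", "RELS-EXT")


@dataclass
class Datastream:
    """A content or metadata item; may point to local or external content."""
    ds_id: str
    location: str          # hypothetical field: local path or external URL
    external: bool = False


@dataclass
class Disseminator:
    """A named set of web-service methods exposing views of the content."""
    name: str
    methods: List[str] = field(default_factory=list)


@dataclass
class DigitalObject:
    pid: str                                   # persistent ID
    datastreams: Dict[str, Datastream] = field(default_factory=dict)
    disseminators: List[Disseminator] = field(default_factory=list)

    def add_datastream(self, ds: Datastream) -> None:
        self.datastreams[ds.ds_id] = ds

    def reserved(self) -> List[str]:
        """IDs of reserved datastreams present on this object."""
        return [i for i in RESERVED_IDS if i in self.datastreams]

    def content(self) -> List[str]:
        """IDs of ordinary (non-reserved) datastreams."""
        return [i for i in self.datastreams if i not in RESERVED_IDS]


# Hypothetical example object, for illustration only.
obj = DigitalObject(pid="demo:1")
obj.add_datastream(Datastream("DC", "dc.xml"))
obj.add_datastream(Datastream("RELS-EXT", "rels-ext.rdf"))
obj.add_datastream(Datastream("IMAGE", "http://example.org/img.jpg", external=True))
obj.disseminators.append(Disseminator("Default", ["viewDublinCore"]))
print(obj.reserved(), obj.content())
```

Keeping reserved and ordinary datastreams in one mapping, and distinguishing them by ID, is a choice made for this sketch to keep it short.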

Page 5: The Fedora Digital Repository Project and the National Science Digital Library (NSDL)

Fedora Repository Service

• Set of SOAP/REST services: Manage, Access, Search, Query

• Fundamental store is XML, with RDBMS cache (Oracle, MySQL), and RDF triple store for relationship queries

• Modular architecture: Manage, Access, Storage, Dissemination, Authentication, Authorization, RDF Resource Index
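
As a rough illustration of what a REST-style call to the Query service might look like, the sketch below builds a GET URL with Python's standard library. The slides do not give the endpoint or its parameters: the `risearch` path, the parameter names (`type`, `lang`, `format`, `query`), the host, and the query text are assumptions to verify against the documentation for the Fedora release in use.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_query_url(base_url: str, query: str, lang: str = "itql",
                    result_type: str = "tuples", fmt: str = "Sparql") -> str:
    """Build a GET URL for a REST-style query service.

    ASSUMPTION: the 'risearch' path and the parameter names
    (type, lang, format, query) are not given on the slides.
    """
    params = {"type": result_type, "lang": lang, "format": fmt, "query": query}
    return base_url.rstrip("/") + "/risearch?" + urlencode(params)


# Hypothetical host and query text, for illustration only.
url = build_query_url(
    "http://localhost:8080/fedora",
    "select $s from <#ri> where $s <urn:example:memberOf> <info:fedora/demo:1>",
)

# Decode the query string again to confirm it round-trips safely.
decoded = parse_qs(urlsplit(url).query)
print(decoded["lang"][0], decoded["type"][0])
```

Using `urlencode` matters here because query languages of this kind use characters such as `$`, `<`, and `#` that are not safe in a raw URL.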

Page 6: The Fedora Digital Repository Project and the National Science Digital Library (NSDL)

Fedora 2.0 Capabilities

• Object-to-object Relationships
  – Ontology of common relationships (RDF schema)
  – Relationships stored in special datastream (RELS-EXT)

• Resource Index (RI)
  – RDF-based index of repository (Kowari triple-store)
  – Graph-based index includes:
    • Object properties and Dublin Core
    • Object Relationships and Object Disseminations
  – Powerful querying of graph of inter-related objects
  – REST-based query interface (using RDQL or ITQL)
  – Results in different formats (triples, tuples, SPARQL)

• Fedora 2.1 (August 2005) adds
  – Plug-in Authentication modules
  – Fine-grained Authorization using XACML XML-based policies
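
To make "querying a graph of inter-related objects" concrete, here is a minimal in-memory sketch of triple pattern matching in Python. It stands in for the idea only: it is not Kowari, RDQL, or ITQL, and the object IDs and predicate names (`isMemberOf`, `isPartOf`) are hypothetical.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical relationship triples (subject, predicate, object), of the
# kind a RELS-EXT datastream would contribute to an RDF index.
TRIPLES: List[Triple] = [
    ("demo:article1", "isMemberOf", "demo:collection1"),
    ("demo:article2", "isMemberOf", "demo:collection1"),
    ("demo:image1", "isPartOf", "demo:article1"),
]


def match(triples: List[Triple], s: Optional[str] = None,
          p: Optional[str] = None, o: Optional[str] = None) -> List[Triple]:
    """Return triples matching the pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


# One-hop query: everything that is a member of the collection.
members = [s for s, _, _ in match(TRIPLES, p="isMemberOf", o="demo:collection1")]

# Two-hop query: parts of any member of the collection.
parts = [s for m in members for s, _, _ in match(TRIPLES, p="isPartOf", o=m)]
print(members, parts)
```

A real triple store indexes the triples rather than scanning a list, but the query shape (fix some positions, leave others as variables, then join on shared values) is the same.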

Page 7: The Fedora Digital Repository Project and the National Science Digital Library (NSDL)

Fedora Service Framework (v2.1 & Planned 2005/6-2006/7)

[Diagram: the Fedora Repository Service surrounded by tiers labeled "Services" and "Apps". Labels recoverable from the diagram:]

• Preservation Integrity Service

• External Workflow

• JHOVE

• GDFR

• Basic Workflow Service

• Fedora-Web-IR Administrator

• OAI Provider Service

• Directory Ingest Service

• Web-based submission and basic workflow

• Federation PID Resolution Service

• Preservation Monitoring Service

• Event Notification Service

• Fedora Search Service

• Dynamic Disseminator Service

• Policy Builder

• Other Service

Page 8: The Fedora Digital Repository Project and the National Science Digital Library (NSDL)

National Science Digital Library

• K-gray Science, Technology, Engineering, and Mathematics (STEM) education

• NSF-created brand and home for digital resources of known high quality

• Community of users, contributors and institutions (as providers and consumers)

• Creates context for resources (e.g. lesson plans, standards alignment, ratings, annotations, reviews, brands)

• Guides selection & use; not just discovery

Page 9: The Fedora Digital Repository Project and the National Science Digital Library (NSDL)
Page 10: The Fedora Digital Repository Project and the National Science Digital Library (NSDL)

Program Details

• Major NSF Division of Undergraduate Education program, over $20m/yr funding

• Over 120 NSF grants in program

• Core Integration collaboration of UCAR, Columbia University and Cornell University

• Cornell provides core technical infrastructure: Fedora-based repository, Lucene-based search, nsdl.org portal

• Columbia: Shibboleth authentication; SDSC: Storage Resource Broker archive

Page 11: The Fedora Digital Repository Project and the National Science Digital Library (NSDL)

What Fedora Provides NSDL

• Objects: Aggregators (collections), Metadata Providers, Agents, Resources (with local or remote content), Metadata

• Relationships: Structural (part of), Equivalence, Membership, arbitrary graph queries

• Network overlay architecture: A lens for viewing science content on the net, whether content is local, remote, or archived – it all has a repository-based URL

• Web services: disseminations are arbitrary recombinations of content

• Authentication/Authorization: Collections and services manage their own repository content
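
One example of a graph query over Membership relationships is finding every aggregator a resource belongs to, directly or through nested aggregators. The Python sketch below does this with a breadth-first traversal. The identifiers and edges are hypothetical, and treating aggregators as members of other aggregators is an assumption made for illustration rather than a statement of how the NSDL repository is populated.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical memberOf edges: node -> aggregators it belongs to.
MEMBER_OF: Dict[str, List[str]] = {
    "resource:A": ["aggr:physics"],
    "resource:B": ["aggr:physics", "aggr:recommended"],
    "aggr:physics": ["aggr:nsdl-collections"],
}


def all_aggregators(node: str, member_of: Dict[str, List[str]]) -> Set[str]:
    """Every aggregator reachable from node via memberOf (breadth-first)."""
    seen: Set[str] = set()
    queue = deque(member_of.get(node, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited; also guards against cycles
        seen.add(current)
        queue.extend(member_of.get(current, []))
    return seen


print(sorted(all_aggregators("resource:B", MEMBER_OF)))
```

The `seen` set keeps the traversal finite even if the membership graph contains a cycle.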

Page 12: The Fedora Digital Repository Project and the National Science Digital Library (NSDL)

NSDL Fedora-Based Repository

[Diagram: a graph of example repository objects for the NSDL Recommender Service and an Example Collection, connected by the relationships listed below. The edge layout is not recoverable from the extracted text.]

Types of Objects

• Agents

• Aggregators

• Metadata Providers

• Resources

• Metadata

Types of Relationships

• metadataProviderFor (mdp4)

• aggregatorFor (agg4)

• providedBy (pBy)

• metadataFor (m4)

• memberOf (mOf)
  – 1st: a recommended resource
  – 2nd: makes it a “blessed” NSDL Collection

Objects shown: NSDL BigBang, NSDL Agent 1000, MDP 3000, Aggr 2002, M 4002, NSDL Collections 1002, Aggr 2005, M 4005, NSDL Recommended 1005, NSDL RS Agent 1004, MDP 3004, Example Agent 10010, MDP 10011, Aggr 10012, Aggr 2004, M 10005, Example.org 10006, M 10007. Some edges carry the label repBy, which the legend does not expand.

Page 13: The Fedora Digital Repository Project and the National Science Digital Library (NSDL)

Appendix – Additional Information

• Fedora website: http://www.fedora.info

• NSDL website: http://nsdl.org

• An Information Network Overlay Architecture for the NSDL by Lagoze, Krafft et al.: http://www.arxiv.org/abs/cs.DL/0501080

• Fedora: An Architecture for Complex Objects and their Relationships by Lagoze, Payette et al.: http://www.arxiv.org/abs/cs.DL/0501012

Page 14: The Fedora Digital Repository Project and the National Science Digital Library (NSDL)

Selected Fedora Adopters

• Current Users:
  – National Science Digital Library (NSDL): Core Integration
  – University of Virginia – digital library
  – VTLS – library systems vendor selling Fedora-based product
  – Tufts University – digital library and university records management
  – OhioLink – statewide consortium of academic libraries
  – Northwestern: Library and Academic Technologies – digital library
  – ARROW: National Library of Australia and Monash University – nationally distributed institutional repository project
  – Royal Library Denmark, National Library, and DTU – integrated national digital library
  – Rutgers University – digital library
  – Indiana University – digital library
  – American Geophysical Union – repository of back issues of journals
  – Library of Congress – National Digital Newspaper Project
  – University of Delaware – digital library
  – Hamilton College – digital library
  – Cornell CIT – Electronic File Cabinet to manage office records
  – Tibetan Buddhist Resource Center – digital library
  – Yale University – manage university records
  – DISA – South Africa, History of Apartheid resistance – record repository

• Interesting new proposals
  – Company X finalist for large government contract
  – Cornell Lab of Ornithology (data + tools + documents)

Page 15: The Fedora Digital Repository Project and the National Science Digital Library (NSDL)

Fedora Development Consortium

• Advisory Board
  – University of Virginia
  – Tufts
  – VTLS
  – ARROW (Monash University and Nat’l Lib Australia)
  – Harris Corp.
  – Danish Royal Library and DTU
  – Northwestern University
  – NSDL – Core Integration

• Mission
  – Requirements Definition, Specifications. Joint Development
  – Commission of Working Groups
    • Content Modeling
    • Outreach and Education
    • Workflow and Service-Oriented Processes
  – Recommendation for Long-Term sustainability model
    • Governance and Funding
    • Set Fedora Free – full open source model (e.g., public SourceForge)
    • Code Maintenance (UVA until 2012; plan for beyond)