DSpace & DuraCloud Integrations

Preview:

DESCRIPTION

DSpace & DuraCloud Integrations talk, as presented as part of the DuraCloud Workshop at Open Repositories 2011 on June 6, 2011. More Information on work presented in these slides can be found at: * https://wiki.duraspace.org/display/DSPACE/ReplicationTaskSuite

Citation preview

Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org

DSpace + DuraCloud IntegrationsTim Donohue

DuraSpace

Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org

1. DSpace AIP Backup & Restore (1.7 +)(Initial DuraCloud use case: Backup & Restore)

2. DSpace Curation Task System (1.7 +)

3. DSpace Replication Task Suite (1.8)

Basis for DSpace Integration

Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org

• Primary Use Cases– Backup & Restore of DSpace Content

• All content or just partial (Community/Collection/Item)

– Migration/Export of DSpace Content• All content or just partial (Community/Collection/Item)

– DuraCloud Integration

Intro to Archival Info Pkgs (1.7+)

Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org

How to Backup DSpace (pre-1.7)

Full Database Backup

FolderBackup

Database Assetstore Folder

Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org

How to Restore All (pre-1.7)

Full Database Backup

FolderBackup

Database Assetstore Folder

Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org

How to Restore a Collection (pre-1.7)

Full Database Backup

FolderBackup

Database Assetstore Folder

Temporary Database

Temporary Folder?

Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org

How to Restore a Collection (pre-1.7)

Full Database Backup

FolderBackup

Database Assetstore Folder

Temporary Database

Temporary Folder?

Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org

Backup via Archival Info Pkgs

Package for each Community,

Collection & Item

AIP backup

Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org

Restore All via Archival Info Pkgs

AIP backup

Package for each Community,

Collection & Item

Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org

Restore a Collection via AIPs

AIP backup

Collection AIP

Items in Collection

1 2

Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org

What’s in an AIP?

METS(DIM / MODS / PREMIS /

METSRights)

LicenseContent Files

or Logos

*Also a BagIt version in works

Other Filesin Bundles

(optional)

Archival Information Package (AIP)

Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org

What’s in an AIP?

Descriptive Metadata: DIM & MODS Tech/Preservation Metadata: PREMIS

Related Object AIPs

METS(DIM / MODS / PREMIS /

METSRights)

LicenseContent Files

or Logos Other Filesin Bundles

(optional)

Rights Metadata: METSRights

Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org

The “Site” AIP

Top-Level Community

AIPsMETS

(DIM / MODS / PREMIS / METSRights)

Special AIP for site-wide info/metadata:(e.g. Group Memberships, EPeople)

Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org

Restore All In-Archive Content (Files + Metadata)Restore All People & GroupsRestore All Permissions / Access RightsRestore Community / Collection Logos, Metadata,

Rights & Item TemplatesRestore Community / Collection / Item HierarchyRestore In-Process / Incomplete ItemsRestore Collection OAI-PMH/ORE Harvest SettingsRestore all configuration files (dspace.cfg, etc.)

What can AIPs restore?

Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org

Migrate a Collection

Items in Collection

One DSpace Install Another DSpace Install

Collection AIP

12

Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org

Migrate Content

12

(Future work)One DSpace Install

Collection AIP

Items in Collection

Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org

Package for each Community,

Collection & Item

DuraCloud Integration (1.7.x)

Local “Watch” Folder

1

./dspace packager -d

java -jar synctool.jar

[1]

[2]

Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org

DuraCloud Integration (1.7.x)

Package for each Community,

Collection & Item

Local Folder

2

java -jar retrievaltool.jar

./dspace packager -r

[1]

[2]

Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org

New: DSpace Replication Suite in 1.8

Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org

• Enables a basic ‘microservices’ approach to curating DSpace objects

• Anyone can build a task & share it.• Currently tasks must be written in Java

– Working on JRuby & Jython integration (1.8?)

• “Frees” admin tasks from Command Line– Can now run from Admin UI or CLI

DSpace Curation System (1.7+)

Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org

• A set of curation tasks geared towards ‘replicating’ (backup/restore/audit) content

• “Wraps” 1.7 DSpace AIP Backup & Restore

DSpace Replication Suite (1.8)

Backup content to AIP (filesystem or DuraCloud)Restore/Replace from AIPAudit AIP (compare to DSpace content)Basic IO Tracking of AIP Upload/DownloadsAll replication tasks can be run via Admin UI

Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org

Package for each Community,

Collection & Item

DuraCloud Integration (1.8.x)

Local Temp Folder(Cache)

ORCommand line Curation Tools

1

“Replication Task Suite”:• Suite of Curation Tasks• One step Sync process• Via UI or CLI

1

Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org

Package for each Community,

Collection & Item

DuraCloud Integration (1.8.x)

Local Temp Folder(Cache)

ORCommand line Curation Tools

“Replication Task Suite”:• One step Retrieve process• Via UI or CLI• Also ‘auditing’ tools

1

1

Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org

DSpace Replication Suite Demo

Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org

Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org

Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org

Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org

Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org

Cannot yet take advantage of DuraCloud streaming capabilities (AIPs are zip files)

Cannot yet take advantage of DuraCloud transformation services (AIPs are zip files)

Known Limitations

Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org

• Working towards “unzipped” AIPs (1.8?)– METS file & Content files stored in an AIP ‘folder’

but NOT zipped up– Support for DuraCloud streaming, etc.

• DSpace UI Streaming Integration (@mire)• ‘Auto-Sync’ options

– Updates in DSpace -> DuraCloud (queued?)– Updates via DuraCloud services -> DSpace?

Next Steps

Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org

• MIT : Richard Rodgers & Wendy Bossons – Developed Curation Task Framework– Developed initial Replication Suite tasks

• @mire : Mark Diggory– Look for @mire’s “Integrating DuraCloud Services

in DSpace” talk on Friday at 3:30pm

In Large Thanks to…

Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org

• Replication Task Suite:– https://wiki.duraspace.org/display/DSPACE/Replic

ationTaskSuite

• AIP Backup & Restore:– https://wiki.duraspace.org/display/DSDOC/AIP+Ba

ckup+and+Restore

• Curation Task System:– https://wiki.duraspace.org/display/DSDOC/Curati

on+System

For More Information

Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org

• Package : http://www.flickr.com/photos/halfbisqued/2353845688/• Harddrive & Terminal icons: http://tango.freedesktop.org/Tango_Desktop_Project• Folder icon: http://www.openclipart.org/detail/13740• Database icon: http://www.openclipart.org/detail/68413• Zip Pkg icon: http://veryicon.com/icons/system/capital-icon-suite-mac/zip-10.html• File icons: http://veryicon.com/icons/system/rhor-v2-part-3/• Checkmark & Delete icons: http://veryicon.com/icons/system/on-stage/• Tools Icon: http://veryicon.com/icons/system/azullustre/

Photo/Icon Acknowledgments