View
1.305
Download
0
Category
Tags:
Preview:
DESCRIPTION
DSpace & DuraCloud Integrations talk, as presented as part of the DuraCloud Workshop at Open Repositories 2011 on June 6, 2011. More Information on work presented in these slides can be found at: * https://wiki.duraspace.org/display/DSPACE/ReplicationTaskSuite
Citation preview
Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org
DSpace + DuraCloud IntegrationsTim Donohue
DuraSpace
Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org
1. DSpace AIP Backup & Restore (1.7 +)(Initial DuraCloud use case: Backup & Restore)
2. DSpace Curation Task System (1.7 +)
3. DSpace Replication Task Suite (1.8)
Basis for DSpace Integration
Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org
• Primary Use Cases– Backup & Restore of DSpace Content
• All content or just partial (Community/Collection/Item)
– Migration/Export of DSpace Content• All content or just partial (Community/Collection/Item)
– DuraCloud Integration
Intro to Archival Info Pkgs (1.7+)
Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org
How to Backup DSpace (pre-1.7)
Full Database Backup
FolderBackup
Database Assetstore Folder
Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org
How to Restore All (pre-1.7)
Full Database Backup
FolderBackup
Database Assetstore Folder
Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org
How to Restore a Collection (pre-1.7)
Full Database Backup
FolderBackup
Database Assetstore Folder
Temporary Database
Temporary Folder?
Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org
How to Restore a Collection (pre-1.7)
Full Database Backup
FolderBackup
Database Assetstore Folder
Temporary Database
Temporary Folder?
Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org
Backup via Archival Info Pkgs
Package for each Community,
Collection & Item
AIP backup
Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org
Restore All via Archival Info Pkgs
AIP backup
Package for each Community,
Collection & Item
Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org
Restore a Collection via AIPs
AIP backup
Collection AIP
Items in Collection
1 2
Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org
What’s in an AIP?
METS(DIM / MODS / PREMIS /
METSRights)
LicenseContent Files
or Logos
*Also a BagIt version in works
Other Filesin Bundles
(optional)
Archival Information Package (AIP)
Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org
What’s in an AIP?
Descriptive Metadata: DIM & MODS Tech/Preservation Metadata: PREMIS
Related Object AIPs
METS(DIM / MODS / PREMIS /
METSRights)
LicenseContent Files
or Logos Other Filesin Bundles
(optional)
Rights Metadata: METSRights
Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org
The “Site” AIP
Top-Level Community
AIPsMETS
(DIM / MODS / PREMIS / METSRights)
Special AIP for site-wide info/metadata:(e.g. Group Memberships, EPeople)
Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org
Restore All In-Archive Content (Files + Metadata)Restore All People & GroupsRestore All Permissions / Access RightsRestore Community / Collection Logos, Metadata,
Rights & Item TemplatesRestore Community / Collection / Item HierarchyRestore In-Process / Incomplete ItemsRestore Collection OAI-PMH/ORE Harvest SettingsRestore all configuration files (dspace.cfg, etc.)
What can AIPs restore?
Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org
Migrate a Collection
Items in Collection
One DSpace Install Another DSpace Install
Collection AIP
12
Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org
Migrate Content
12
(Future work)One DSpace Install
Collection AIP
Items in Collection
Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org
Package for each Community,
Collection & Item
DuraCloud Integration (1.7.x)
Local “Watch” Folder
1
./dspace packager -d
java -jar synctool.jar
[1]
[2]
Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org
DuraCloud Integration (1.7.x)
Package for each Community,
Collection & Item
Local Folder
2
java -jar retrievaltool.jar
./dspace packager -r
[1]
[2]
Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org
New: DSpace Replication Suite in 1.8
Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org
• Enables a basic ‘microservices’ approach to curating DSpace objects
• Anyone can build a task & share it.• Currently tasks must be written in Java
– Working on JRuby & Jython integration (1.8?)
• “Frees” admin tasks from Command Line– Can now run from Admin UI or CLI
DSpace Curation System (1.7+)
Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org
• A set of curation tasks geared towards ‘replicating’ (backup/restore/audit) content
• “Wraps” 1.7 DSpace AIP Backup & Restore
DSpace Replication Suite (1.8)
Backup content to AIP (filesystem or DuraCloud)Restore/Replace from AIPAudit AIP (compare to DSpace content)Basic IO Tracking of AIP Upload/DownloadsAll replication tasks can be run via Admin UI
Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org
Package for each Community,
Collection & Item
DuraCloud Integration (1.8.x)
Local Temp Folder(Cache)
ORCommand line Curation Tools
1
“Replication Task Suite”:• Suite of Curation Tasks• One step Sync process• Via UI or CLI
1
Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org
Package for each Community,
Collection & Item
DuraCloud Integration (1.8.x)
Local Temp Folder(Cache)
ORCommand line Curation Tools
“Replication Task Suite”:• One step Retrieve process• Via UI or CLI• Also ‘auditing’ tools
1
1
Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org
DSpace Replication Suite Demo
Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org
Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org
Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org
Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org
Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org
Cannot yet take advantage of DuraCloud streaming capabilities (AIPs are zip files)
Cannot yet take advantage of DuraCloud transformation services (AIPs are zip files)
Known Limitations
Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org
• Working towards “unzipped” AIPs (1.8?)– METS file & Content files stored in an AIP ‘folder’
but NOT zipped up– Support for DuraCloud streaming, etc.
• DSpace UI Streaming Integration (@mire)• ‘Auto-Sync’ options
– Updates in DSpace -> DuraCloud (queued?)– Updates via DuraCloud services -> DSpace?
Next Steps
Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org
• MIT : Richard Rodgers & Wendy Bossons – Developed Curation Task Framework– Developed initial Replication Suite tasks
• @mire : Mark Diggory– Look for @mire’s “Integrating DuraCloud Services
in DSpace” talk on Friday at 3:30pm
In Large Thanks to…
Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org
• Replication Task Suite:– https://wiki.duraspace.org/display/DSPACE/Replic
ationTaskSuite
• AIP Backup & Restore:– https://wiki.duraspace.org/display/DSDOC/AIP+Ba
ckup+and+Restore
• Curation Task System:– https://wiki.duraspace.org/display/DSDOC/Curati
on+System
For More Information
Licensed under Creative Commons Attribution-Share Alike 3.0 Unported License (CC BY-SA 3.0)To request other use: info@duracloud.org
• Package : http://www.flickr.com/photos/halfbisqued/2353845688/• Harddrive & Terminal icons: http://tango.freedesktop.org/Tango_Desktop_Project• Folder icon: http://www.openclipart.org/detail/13740• Database icon: http://www.openclipart.org/detail/68413• Zip Pkg icon: http://veryicon.com/icons/system/capital-icon-suite-mac/zip-10.html• File icons: http://veryicon.com/icons/system/rhor-v2-part-3/• Checkmark & Delete icons: http://veryicon.com/icons/system/on-stage/• Tools Icon: http://veryicon.com/icons/system/azullustre/
Photo/Icon Acknowledgments
Recommended