NGDA Architecture Update
Greg Janée
Greg Janée • May 16, 2005 2
Three motivations
• Archival has to be cheap & easy
  – little incentive
  – no funding
• Need to archive data semantics
  – key differentiator from text, audio, video
• Focus on long-term preservation
  – need to migrate whole systems
Typical repository architecture
[Diagram: a system connected to storage, databases, and a handle resolver; the arrangement is labeled "fragile".]
NGDA architecture
[Diagram labels: storage subsystem; standard, public data model; archival system; ADL; OAI; bulk loader; databases, caches, etc.; Web; access; ingest.]
Post-NGDA architecture
[Diagram labels: storage subsystem; standard, public data model; Web.]
Storage system requirements
• Requirements:
  – associate UUIDs/RIDs with bitstreams
  – retrieve global/local bitstream by UUID/RID
  – determine (parent) UUID of any bitstream
  – list all UUIDs
• Satisfied by:
  – any filesystem
  – tag URIs for UUIDs
    • tag:library.ucsb.edu,2005:identifier
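The four requirements above can be sketched on top of an ordinary filesystem. This is only an illustration, not the NGDA implementation: the class name `FilesystemStore`, its method names, and the one-directory-per-UUID layout are all hypothetical, and it assumes a RID is an identifier local to its parent UUID, which the slides do not spell out.

```python
from pathlib import Path
from urllib.parse import quote, unquote

# Tag URI prefix taken from the slide; the part after it is the identifier.
TAG_PREFIX = "tag:library.ucsb.edu,2005:"


class FilesystemStore:
    """Hypothetical store: one directory per UUID, one file per RID."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    @staticmethod
    def _safe(name):
        # Percent-encode so a tag URI becomes a single safe path segment.
        if name in (".", ".."):
            raise ValueError("identifier must not be '.' or '..'")
        return quote(name, safe="")

    def put(self, uuid, rid, data):
        """Associate a bitstream with a UUID and a RID local to it."""
        directory = self.root / self._safe(uuid)
        directory.mkdir(exist_ok=True)
        path = directory / self._safe(rid)
        path.write_bytes(data)
        return path

    def get(self, uuid, rid):
        """Retrieve a bitstream by UUID and RID."""
        return (self.root / self._safe(uuid) / self._safe(rid)).read_bytes()

    def parent_uuid(self, path):
        """Determine the parent UUID of a stored bitstream from its path."""
        return unquote(Path(path).parent.name)

    def list_uuids(self):
        """List all UUIDs known to the store."""
        return sorted(unquote(p.name) for p in self.root.iterdir() if p.is_dir())
```

Because the mapping relies only on directories and files, the same layout could be copied between storage systems with ordinary file tools, which fits the "any filesystem" point above.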
Archival objects
[Diagram labels: directory (UUID); component (RID); UUID.]
Archival objects
• Directory info per component
  – named relationship/position
  – format & semantics
    • by UUID references to definitions
  – fixity: checksum
  – provenance: isDerivative
  – policy: mutability
  – rights
• Components may be provided by archive itself
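The per-component directory information listed above might be modeled roughly as follows. This is a sketch under stated assumptions: the class and field names are hypothetical, SHA-256 is an assumed choice of checksum (the slides name no algorithm), and treating isDerivative and mutability as booleans is a guess.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ComponentEntry:
    """Directory information for one component (field names are hypothetical)."""
    rid: str                          # component identifier within the object
    relationship: str                 # named relationship/position
    format_uuid: str                  # UUID reference to a format definition
    semantics_uuid: Optional[str]     # UUID reference to a semantic definition
    checksum: str                     # fixity: hex SHA-256 digest (assumed algorithm)
    is_derivative: bool = False       # provenance
    mutable: bool = False             # policy
    rights: Optional[str] = None

    def verify(self, data: bytes) -> bool:
        """Fixity check: recompute the digest and compare with the stored one."""
        return hashlib.sha256(data).hexdigest() == self.checksum


@dataclass
class ArchivalObject:
    """An archival object: a UUID plus a directory of component entries."""
    uuid: str
    directory: Dict[str, ComponentEntry] = field(default_factory=dict)

    def add(self, entry: ComponentEntry) -> None:
        self.directory[entry.rid] = entry
```

Keeping format and semantics as UUID references rather than inline descriptions means the definitions can themselves be stored as objects in the archive.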
Example
[Diagram: Object x with components x.tiff, x.fgdc, and x.gif; other labels: USGS, DOQQ, GeoTIFF, FGDC, TIFF; relationship labels: metadata, derived, subtypeOf.]
Archives
• Archive = set of archival objects
  – no structure
  – no free-floating bitstreams
• In anticipation of federation:
  – associations may cross archive boundaries
  – archival objects may not
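One way to read the two federation rules above: an object lives in exactly one archive, while its associations are plain UUID references that may point into another archive. The sketch below illustrates that reading only; the `Archive` class, its methods, and `check_disjoint` are hypothetical names, not part of NGDA.

```python
class Archive:
    """Hypothetical archive: a flat set of objects keyed by UUID."""

    def __init__(self, name):
        self.name = name
        self.objects = {}  # uuid -> set of UUIDs the object is associated with

    def add(self, uuid, references=()):
        """Add an object together with its associations to other objects."""
        self.objects[uuid] = set(references)

    def external_references(self):
        """Associations that point outside this archive (allowed)."""
        return {
            ref
            for refs in self.objects.values()
            for ref in refs
            if ref not in self.objects
        }


def check_disjoint(archives):
    """Map each UUID to its archive; raise if an object is in two archives."""
    owner = {}
    for archive in archives:
        for uuid in archive.objects:
            if uuid in owner:
                raise ValueError(
                    f"{uuid} is in both {owner[uuid]} and {archive.name}"
                )
            owner[uuid] = archive.name
    return owner
```

For instance, an object `u1` in archive A may reference `u3` held by archive B, but adding `u1` to B as well would make `check_disjoint` raise.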
Object types
• Content
• Format definition
• Semantic definition
• Provider
• Organizational structures
  – collection
  – series
  – ingest session
Archive-provider agreement
• Defines
  – common structure of objects to be ingested
  – necessary validations
  – associations to other objects
  – policies, rights, etc.
• Represents choke point
  – requires human evaluation
Deferred functionality
• Incremental ingest
• Object revisions
• Rights
• 3rd-party access
• Federation
Status
• Starting development now
• Approach: iterative refinement