Tools Development and Demonstration: North Carolina Geospatial Data Archiving Project

NCSU Libraries

Jim Tuttle
North Carolina State University Libraries

Process Overview

• Data transfer
• Threat and format analysis, validation
• Archive package organization
• Selective format migration
• Metadata normalization and supplementation
• Source metadata translation
• Statistics collection
• Extra-repository AIP management

Data Transfer

• Python md5sum comparison
• 'Transfer set' metadata capture in 'Seed file'
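A minimal sketch of the checksum comparison step, assuming the sender supplies a manifest that maps relative paths to MD5 digests. The manifest shape and the function names `md5_of_file` and `verify_transfer` are illustrative assumptions, not the project's actual scripts.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_transfer(manifest, received_dir):
    """Compare received files against a {relative_path: md5_hex} manifest.

    The manifest format is a hypothetical stand-in for whatever the sender provides.
    Returns a dict mapping each relative path to "ok", "mismatch" or "missing".
    """
    results = {}
    for rel_path, expected in manifest.items():
        target = Path(received_dir) / rel_path
        if not target.is_file():
            results[rel_path] = "missing"
        elif md5_of_file(target) == expected.lower():
            results[rel_path] = "ok"
        else:
            results[rel_path] = "mismatch"
    return results
```

Any path reported as "mismatch" or "missing" would be a candidate for re-transfer before the set moves on to analysis.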

Threat and format analysis, validation

Python wrappers for the following:

• Virus – ClamAV
• Compressed files (tar, zip, gzip, bzip)
• Geodatabases (extension and size)
• Executable files (magic numbers)
• JHOVE validation
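The magic-number checks in the list above can be sketched roughly as follows. This is an illustration under assumptions: the signature table covers only a few well-known formats, and the names `classify_header` and `classify_file` are hypothetical rather than taken from the project's wrappers.

```python
# Leading byte signatures for a few common formats (deliberately not exhaustive).
MAGIC_SIGNATURES = [
    (b"\x7fELF", "executable (ELF)"),
    (b"MZ", "executable (DOS/Windows)"),
    (b"\x1f\x8b", "compressed (gzip)"),
    (b"BZh", "compressed (bzip2)"),
    (b"PK\x03\x04", "compressed (zip)"),
]

# POSIX tar archives carry the marker "ustar" at byte offset 257.
TAR_MAGIC_OFFSET = 257
TAR_MAGIC = b"ustar"


def classify_header(header):
    """Return a label for the leading bytes of a file, or None if unrecognized."""
    for signature, label in MAGIC_SIGNATURES:
        if header.startswith(signature):
            return label
    end = TAR_MAGIC_OFFSET + len(TAR_MAGIC)
    if header[TAR_MAGIC_OFFSET:end] == TAR_MAGIC:
        return "archive (tar)"
    return None


def classify_file(path):
    """Read the first 512 bytes of a file and classify them by signature."""
    with open(path, "rb") as handle:
        return classify_header(handle.read(512))
```

Short signatures such as "MZ" can also match ordinary text files, so a hit is better treated as a flag for review than as a verdict.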

Archive package organization

• ESRI ArcGIS toolbar for selected formats

• Rule-based Python logic
  – filestem – extension relationships (multi-file format validation)
  – directory structure
• Manual intervention
  – metadata.doc
• NOID assignment
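The filestem and extension rule can be illustrated with a short sketch. It assumes a single rule, that a shapefile needs its .shp, .shx and .dbf components; the rule table and function names are hypothetical and are not the project's actual rule set.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Hypothetical rule table: primary extension -> extensions that must share its filestem.
REQUIRED_EXTENSIONS = {
    ".shp": {".shp", ".shx", ".dbf"},  # mandatory shapefile components
}


def group_by_filestem(filenames):
    """Group extensions by filestem (path minus final suffix).

    Note: a name like "roads.shp.xml" lands under the stem "roads.shp" in this sketch.
    """
    groups = defaultdict(set)
    for name in filenames:
        path = PurePosixPath(name)
        groups[str(path.with_suffix(""))].add(path.suffix.lower())
    return groups


def find_incomplete_datasets(filenames):
    """Return {filestem: sorted missing extensions} for multi-file datasets."""
    problems = {}
    for stem, extensions in group_by_filestem(filenames).items():
        for primary, required in REQUIRED_EXTENSIONS.items():
            if primary in extensions:
                missing = required - extensions
                if missing:
                    problems[stem] = sorted(missing)
    return problems
```

Datasets flagged this way would presumably fall to the manual intervention step rather than being packaged automatically.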

Selective Format Migration

• Conversions using ArcGIS toolbar
  – e00 interchange to coverage to shapefile
  – geodatabase to raster, shapefile, etc.

• Original files retained

Metadata Normalization & Supplementation

• Agency-specific XML templates in ArcCatalog with synchronization flags

• Provenance and curation metadata scripted

Source Metadata Translation

• Hub-and-spoke model a la Echo Depository
  – repository agnostic
  – modular conversion hub
  – facilitate repository software migration & inter-archive exchange
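The hub-and-spoke idea can be sketched as a pair of conversions per format, so that any two formats translate through a shared intermediate record instead of needing a converter for every pair. Everything below is an assumption for illustration: the flat dictionaries, the `fgdc_like` and `dc_like` field maps, and the function names are not the project's schemas or code.

```python
# Hypothetical spokes: native field name -> hub field name.
FIELD_MAPS = {
    "fgdc_like": {"title": "title", "origin": "creator", "abstract": "description"},
    "dc_like": {
        "dc:title": "title",
        "dc:creator": "creator",
        "dc:description": "description",
    },
}


def to_hub(record, fmt):
    """Convert a native record into the hub representation, dropping unmapped fields."""
    mapping = FIELD_MAPS[fmt]
    return {mapping[key]: value for key, value in record.items() if key in mapping}


def from_hub(hub_record, fmt):
    """Convert a hub record into the named native format."""
    reverse = {hub: native for native, hub in FIELD_MAPS[fmt].items()}
    return {reverse[key]: value for key, value in hub_record.items() if key in reverse}


def translate(record, source, target):
    """Translate between two spokes by passing through the hub."""
    return from_hub(to_hub(record, source), target)
```

Adding a format in this arrangement means writing one new field map, which is the property that makes the hub modular and repository agnostic.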

Statistics Collection

• Python scripted statistics generation:
  – number of files by format
  – cumulative size by format
  – mean file size
  – collection size
  – agency contribution
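Most of the statistics listed above can be gathered in one directory walk. The sketch below assumes that "format" can be approximated by file extension, and it leaves out agency contribution because the slides do not say how files are attributed to agencies. The function name `collect_statistics` is hypothetical.

```python
import os
from collections import defaultdict


def collect_statistics(root):
    """Walk a directory tree and summarize file counts and sizes by extension."""
    counts = defaultdict(int)
    sizes = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Assumption: the lowercased extension stands in for the file format.
            extension = os.path.splitext(name)[1].lower() or "(none)"
            counts[extension] += 1
            sizes[extension] += os.path.getsize(os.path.join(dirpath, name))
    total_files = sum(counts.values())
    total_size = sum(sizes.values())
    return {
        "files_by_format": dict(counts),
        "size_by_format": dict(sizes),
        "collection_size": total_size,
        "mean_file_size": total_size / total_files if total_files else 0.0,
    }
```

Sizes are in bytes as reported by the file system.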

Extra-repository AIP management

• Workflow Management Database populated as a spoke on the metadata/ingest hub

• External tracking of NOID, Handle, ISO keywords, other metadata for interaction with other systems
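One way to picture the external tracking described above is a small table keyed by NOID. This sketch uses SQLite purely for illustration; the table layout, column names, status values and function names are all assumptions, since the slides do not describe the actual Workflow Management Database.

```python
import sqlite3


def open_tracking_db(path=":memory:"):
    """Open (or create) a tracking database with a hypothetical one-table schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS aip (
               noid TEXT PRIMARY KEY,
               handle TEXT,
               iso_keywords TEXT,
               status TEXT
           )"""
    )
    return conn


def record_aip(conn, noid, handle=None, iso_keywords=(), status="received"):
    """Insert or update the tracking row for one archival package."""
    conn.execute(
        "INSERT OR REPLACE INTO aip VALUES (?, ?, ?, ?)",
        (noid, handle, ";".join(iso_keywords), status),
    )
    conn.commit()


def lookup_aip(conn, noid):
    """Return the tracking row for a NOID as a dict, or None if it is unknown."""
    row = conn.execute(
        "SELECT noid, handle, iso_keywords, status FROM aip WHERE noid = ?",
        (noid,),
    ).fetchone()
    if row is None:
        return None
    keywords = row[2].split(";") if row[2] else []
    return {"noid": row[0], "handle": row[1], "iso_keywords": keywords, "status": row[3]}
```

Keeping this record outside the repository lets other systems look up a package by identifier without querying the repository software directly.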

Questions?

Jim Tuttle
Geospatial Data Librarian & Project Coordinator

NCGDAP
NCSU Libraries
jim_tuttle at ncsu dot edu

http://www.lib.ncsu.edu/ncgdap/