The North Carolina Geospatial Data Archiving Project Steven P. Morris North Carolina State University Libraries Maintaining Long-Term Access to Geospatial Data October 27, 2006

The North Carolina Geospatial Data Archiving Project

Steven P. Morris
North Carolina State University Libraries

Maintaining Long-Term Access to Geospatial Data October 27, 2006

Note: Percentages based on the actual number of respondents to each question 2

NC Geospatial Data Archiving Project

• Partnership between university library (NCSU) and state agency (NCCGIA), with Library of Congress under the National Digital Information Infrastructure and Preservation Program (NDIIPP)
• One of 8 initial NDIIPP partnerships
• Focus on state and local geospatial content in North Carolina (state demonstration)
• Tied to NC OneMap initiative, which provides for seamless access to data, metadata, and inventories
• Objective: engage existing state/federal geospatial data infrastructures in preservation
• Serve as catalyst for discussion within industry

Targeted data: Digital orthophotography

• 85+ NC counties with orthophotos
• 1-5 flights per county
• 30-200 GB per flight
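The figures above imply a rough storage range for the orthophoto collection. The sketch below multiplies them out. It treats 85 counties as a floor (the slide says "85+"), uses decimal terabytes, and the function name is hypothetical.

```python
def orthophoto_storage_range_gb(counties=85, flights=(1, 5), gb_per_flight=(30, 200)):
    """Return (low, high) collection size in GB from the per-county ranges."""
    low = counties * flights[0] * gb_per_flight[0]
    high = counties * flights[1] * gb_per_flight[1]
    return low, high


low_gb, high_gb = orthophoto_storage_range_gb()
# Convert to decimal terabytes (1 TB = 1000 GB) for readability.
print(f"{low_gb / 1000:.2f} TB to {high_gb / 1000:.0f} TB")
```

Under these assumptions the range spans more than an order of magnitude, which is why per-county flight counts matter for capacity planning.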

Targeted data: Vector data (w/tabular)

Economic, infrastructure, and ethnographic data

Today’s geospatial data as tomorrow’s cultural heritage

Future uses of data are difficult to anticipate (as with Sanborn Maps).

Risks to State/Local Geospatial Data

• Producer focus on current data
  • Data overwrite as common practice
• Future support of data formats in question
  • No open, supported format for vector data
• Shift to web services-based access
  • Data becoming more ephemeral
• Inadequate or nonexistent metadata
  • Impedes discovery and use
• Increasing use of spatial databases for data management
  • The whole is greater than the sum of the parts

Challenge: Vector Data Formats

• No widely supported, open vector formats for geospatial data
  • Spatial Data Transfer Standard (SDTS) not widely supported
  • Geography Markup Language (GML): diversity of application schemas and profiles threatens permanent access
• Spatial databases
  • The whole is more than the sum of the parts, and the whole is very difficult to preserve
  • Can export individual data layers for curation
  • Some are considering the spatial database as the primary archival platform

Challenge: Cartographic Representation

Counterpart to the map is not just the dataset but also models, symbolization, classification, annotation, etc.

Challenge: Geospatial Web Services

• How to capture records from decision-making processes?
• Possible: Atlas collections from automated image capture
• Web 2.0 impact: Emerging tiling and caching schemes (archive target?)

Different Ways to Approach Preservation

Technical solutions: How do we archive acquired content over the long term?

• Build a data repository: not as an end in itself but as a catalyst for discussion within the data community
• Develop a repository ingest workflow: create technical points of engagement with the NDIIPP partners

Different Ways to Approach Preservation

Cultural/Organizational solutions: How do we make the data more preservable, and more likely to be archived, from the point of production?

• Engage data producer community and spatial data infrastructure through outreach and engagement; influence practice
• Sell the problem to software vendors and standards development efforts
• Find overlap with more compelling business problems: disaster preparedness, business continuity, road building, etc.
• Start a discussion about roles at the local, state, and federal level

NCGDAP Technical Approach

• Receive data as is: variety of distribution methods
• Migration of some at-risk formats
• Metadata remediation, standardization, and synchronization
• Distilling complex objects into repository ingest items (not easy)
• Using DSpace for demonstration purposes (keeping repository platform at arm's length)
• In development: use METS record as dormant item “brain” within the repository
• Some unsustainable activities, undertaken as a learning experience
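To make the METS idea concrete, here is a minimal sketch of a METS wrapper that lists the files of one ingest item and ties them together in a structural map. It is not the project's actual profile: the `build_mets` helper, the `USE` and `TYPE` values, the file ID pattern, and the example file names are all assumptions for illustration. Only the METS and XLink namespaces and the `fileSec`/`structMap` element names come from the METS schema itself.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def build_mets(item_id, file_paths):
    """Build a bare METS document: one fileGrp plus a flat structMap."""
    mets = ET.Element(f"{{{METS_NS}}}mets", {"OBJID": item_id})
    file_sec = ET.SubElement(mets, f"{{{METS_NS}}}fileSec")
    # "original" is an illustrative USE value, not a project convention.
    file_grp = ET.SubElement(file_sec, f"{{{METS_NS}}}fileGrp", {"USE": "original"})
    struct_map = ET.SubElement(mets, f"{{{METS_NS}}}structMap")
    div = ET.SubElement(struct_map, f"{{{METS_NS}}}div", {"TYPE": "bundle"})
    for i, path in enumerate(file_paths, start=1):
        file_id = f"FILE{i:03d}"
        file_el = ET.SubElement(file_grp, f"{{{METS_NS}}}file", {"ID": file_id})
        ET.SubElement(
            file_el,
            f"{{{METS_NS}}}FLocat",
            {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": path},
        )
        # Each file is referenced from the structural map by its ID.
        ET.SubElement(div, f"{{{METS_NS}}}fptr", {"FILEID": file_id})
    return mets


# Hypothetical item with two files.
record = build_mets("example-item", ["data/layer.shp", "data/layer.shp.xml"])
print(ET.tostring(record, encoding="unicode"))
```

Because the record only points at files and describes their relationships, it can travel with the item regardless of which repository platform holds the content.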

Building Data Bundles: The Zip Codes Example

Where is the Dataset?

Here’s One!

Files

• Multi-file dataset
• Georeferencing
• Metadata file
• Symbolization file
• Additional documentation
• License
• Disclaimer
• More

Metadata

• FGDC
• Acquisition metadata
• Transfer metadata
• Ingest metadata
• Archive rights
• Archive processes
• Collection metadata
• Series metadata
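One way to picture the "Files" list is as an inventory that sorts what arrives into roles and flags what is missing. The sketch below assumes, purely for illustration, that the multi-file dataset is a shapefile with common sidecar files. The suffix-to-role mapping, the function names, and the file names are hypothetical and are not taken from the project's workflow.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Hypothetical mapping from file suffix to the roles named on the slide.
ROLE_BY_SUFFIX = {
    ".shp": "dataset",
    ".shx": "dataset",
    ".dbf": "dataset",
    ".prj": "georeferencing",
    ".xml": "metadata",
    ".lyr": "symbolization",
    ".pdf": "documentation",
    ".txt": "documentation",
}


def inventory_bundle(filenames):
    """Group file names by role; unknown suffixes land under 'other'."""
    roles = defaultdict(list)
    for name in filenames:
        suffix = PurePosixPath(name).suffix.lower()
        roles[ROLE_BY_SUFFIX.get(suffix, "other")].append(name)
    return dict(roles)


def missing_roles(inventory, required=("dataset", "georeferencing", "metadata")):
    """List required roles that have no file in the inventory."""
    return [role for role in required if role not in inventory]


# Hypothetical delivery that arrived without a projection file.
files = ["zipcodes.shp", "zipcodes.shx", "zipcodes.dbf", "zipcodes.shp.xml"]
bundle = inventory_bundle(files)
print(bundle)
print("missing:", missing_roles(bundle))
```

A check like this only covers the files themselves; the acquisition, transfer, and ingest metadata listed above would still have to be recorded by the archive.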

Hub-and-Spoke Metadata Workflow

Cultural: Changing Industry Thinking

• Is the geospatial industry “temporally-impaired”?
  • Lack of access to older data
  • Lack of tool/model support for temporal analysis
  • Metadata: poor support for changing data
  • Education: building class projects around available data (i.e., not temporal)
• Increased interest now in temporal applications?
  • Increased demand for temporal data?
  • Improved tool support: ArcGIS 9.2 animation tools; Geodatabase History, etc.

Cultural: Content Exchange Networks

• Solving the present-day problems of data sharing is a prerequisite to solving the problem of long-term access
• Leveraging more compelling business problems: disaster preparedness and business continuity needs can put the data in motion (siphon off to the archive)
• Engage existing spatial data infrastructure in archiving and preservation
• Content exchange network technical challenges:
  • Rights management
  • Large-scale transfers on network
  • Content packaging (MPEG-21 DIDL, XFDU, METS, …)

Cultural: Engaging Standards Efforts

• Nov. 2005: EDINA and NCSU present on preservation challenges at the OGC Technical Committee Meeting
• Key points of intersection with standards efforts:
  • GML archival profile?
  • Content packaging and content exchange
  • Metadata support for temporal entities
  • Archival use cases in GeoDRM
• Oct. 2006: meeting of Ad Hoc Historical Data Working Group at OGC TC; plans to develop a formal Data Preservation Working Group

Sept. 2006 Frequency of Capture Survey

• Survey objective:
  • Document current practices for obtaining archival snapshots of county/municipal geospatial vector data layers
  • Seek guidance about frequency of capture
• Survey topics:
  • General questions about data archiving practice
  • Specific questions about parcels, street centerlines, jurisdictional boundaries, and zoning
• Survey subjects:
  • All 100 counties and 25 municipalities
  • 58% response rate
  • Survey conducted September 2006

NC County/Municipal Agency Frequency of Capture: Parcel Data

[Chart. Categories: Annually, Every 6 Months, Quarterly, Monthly, Weekly or Daily, Not Saved. Values: 42%, 9%, 9%, 14%, 13%, 13% (value-to-category pairing not recoverable).]

Percentages are based on the respondents who indicated that they actually archive some data.

What About Commercial Data?

Project Status Cultivating a commercial market for older data.

Part of “permanent access” is marketing, advertising, and putting older data into the path of the user

New Challenges: “Platial” vs. Spatial Imagery

• Mobile, LBS, and social networking applications drive demand for place-based data
• Example sources:
  • Oblique imagery
  • Street-view imagery (e.g., A9.com)
  • Transportation Dept. videologs
• Long-term cultural heritage value in non-overhead imagery: more descriptive of place and function

New Challenges: Ajax Applications, Google Earth and All That

• Emerging online environments are increasingly used to make decisions; how are these decisions documented?
• How far will KML go?
• Temporal component in emerging tiling & caching standards?

• Web mashup interactions with existing systems spur creation of intermediate content layers: e.g., tiling and caching of WMS services

• Identification of a standard tiling scheme may create a new preservation opportunity (temporal axis on caches?)
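A "temporal axis on caches" could be as simple as adding a capture date to the tile address. The slide does not name a tiling scheme, so the sketch below assumes the widely used XYZ ("slippy map") Web Mercator addressing. The layer name, key layout, and `.png` suffix are hypothetical.

```python
import math
from datetime import date


def tile_xy(lon_deg, lat_deg, zoom):
    """Return the XYZ tile column and row containing a lon/lat point."""
    n = 2 ** zoom
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat)) / math.pi) / 2.0 * n)
    # Clamp so points on the eastern or southern edge stay in range.
    return min(x, n - 1), min(y, n - 1)


def temporal_tile_key(layer, captured, zoom, x, y):
    """Build a cache key that carries the capture date as an extra axis."""
    return f"{layer}/{captured.isoformat()}/{zoom}/{x}/{y}.png"


# Hypothetical example: a point near Raleigh, NC, captured on the talk date.
x, y = tile_xy(-78.6382, 35.7796, zoom=12)
print(temporal_tile_key("ortho", date(2006, 10, 27), 12, x, y))
```

With the date in the key, successive captures of the same tile sit side by side instead of overwriting one another, which is the preservation opportunity the bullet points to.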

Questions?

Contact:

Steve Morris
Head, Digital Library Initiatives
NCSU Libraries
ph: (919) [email protected]

http://www.lib.ncsu.edu/ncgdap