Geospatial Data Preservation Challenges at the Sub-National Level: The North Carolina Experience
Steve Morris
Head of Digital Library Initiatives
North Carolina State University Libraries
Cambridge Conference July 18, 2007
Outline
• Project background
• Targeted geospatial content
• Risks to data
• Value in older data
• Challenges (technical and organizational)
• Solutions (?)
• Next steps
NC Geospatial Data Archiving Project
• Partnership between university library (NCSU) and NC Center for Geographic Information & Analysis
• Part of the Library of Congress National Digital Information Infrastructure and Preservation Program (NDIIPP)
• Focus on state and local geospatial content in North Carolina (state demonstration)
• Tied to NC OneMap initiative, which provides for seamless access to data, metadata, and inventories
• Objective: engage existing state/federal geospatial data infrastructures in preservation
• Serve as catalyst for discussion within industry
NCGDAP Goals
Repository Goal
• Capture at-risk data
• Explore technical and organizational challenges
Project End Goal
• Data Producers: Improved temporal data management practices
• Archives: More efficient means of acquiring and preserving data; progress towards best practices
Temporal data management vs. long-term preservation
Collection Focus: State and Local Government Geospatial Data
• 96 of 100 North Carolina Counties have GIS systems as do many municipalities
• Over 30 state agency data producers
• Exceptional value
  – Detailed, current, accurate
• Exceptional risk
  – Inconsistent or nonexistent archiving practices
  – Complicated formats and complex objects
Source: NC OneMap
Carrboro, NC : Population 17,797 (2005 est.)
22 downloadable GIS data layers
3 OGC WMS services (web services)
10 web mapping applications
9 downloadable PDF map layers
NCGDAP Data Types – Vector GIS
• Cadastral (tax parcels)
• Street centerlines
• Zoning
• Topographic contours
• School, sheriff, fire
• Voting precincts
• More …
• County, municipal, state
• Detailed, accurate, current
• Frequently updated
NCGDAP Data Types – Digital Orthophotography
• All 100 NC counties with orthos
• 1-5 flight years per county
• 30-300 GB per flight
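The figures above imply a rough statewide storage range. The short calculation below is only a back-of-the-envelope sketch, not a figure from the project: it assumes the 30-300 GB value applies to each county flight, that every county falls within the stated 1-5 flight-year range, and that 1 TB = 1,000 GB. The function name `statewide_range_gb` is hypothetical.

```python
# Back-of-the-envelope bounds for statewide orthophoto volume.
# Assumption: "30-300 GB per flight" is per county, per flight year.
COUNTIES = 100
FLIGHT_YEARS = (1, 5)      # min and max flight years per county
GB_PER_FLIGHT = (30, 300)  # min and max size of one county flight


def statewide_range_gb(counties, flight_years, gb_per_flight):
    """Return (low, high) total gigabytes across all counties."""
    low = counties * flight_years[0] * gb_per_flight[0]
    high = counties * flight_years[1] * gb_per_flight[1]
    return low, high


low_gb, high_gb = statewide_range_gb(COUNTIES, FLIGHT_YEARS, GB_PER_FLIGHT)
print(f"Lower bound: {low_gb:,} GB ({low_gb / 1000:g} TB)")
print(f"Upper bound: {high_gb:,} GB ({high_gb / 1000:g} TB)")
```

Under these assumptions the bounds are 3 TB and 150 TB; the true total depends on how flight counts and file sizes are actually distributed across counties.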
Note: Percentages based on the actual number of respondents to each question
NCGDAP Data Types – Cartographic
GIS Software
Software project file (.mxd, .apr, …)
Data layer file (.avl, .lyr, …)
PDF map exports
Web Services-based representations
Other Data Types – Place-based Data
Mobile, LBS, and social networking applications
Long-term cultural heritage value in non-overhead imagery: more descriptive of place and function
Oblique Imagery
Road Videologs
Tax Dept. Photos
Street View Images
Digital Preservation Points of Failure
• Data is not saved, or …
• can’t be found, or …
• media is obsolete, or …
• media is corrupt, or …
• format is obsolete, or …
• file is corrupt, or …
• meaning is lost
Solutions:
• Migration
• Emulation
• Encapsulation XML
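The "file is corrupt" failure point is commonly guarded against with fixity checks: record a checksum for every file at ingest, then recompute and compare later. The sketch below is an illustration only and is not described in the presentation. It assumes SHA-256 from Python's standard `hashlib`; the function names `sha256_of`, `build_manifest`, and `verify_manifest` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large imagery files need not fit in memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root, manifest):
    """Return {relative_path: 'missing' | 'changed'} for files failing fixity."""
    root = Path(root)
    problems = {}
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file():
            problems[rel] = "missing"
        elif sha256_of(target) != expected:
            problems[rel] = "changed"
    return problems
```

A manifest built at ingest and stored separately from the data lets a later `verify_manifest` run flag lost or altered files. It addresses only the "not saved", "can't be found", and "corrupt" failure points, not obsolete formats or lost meaning.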
Risks to Geospatial Data
• Producer focus on current data
  – Data overwrite as common practice
• Future support of data formats in question
  – No open, supported format for vector data
• Shift to web services-based access
  – Data becoming more ephemeral
• Inadequate or nonexistent metadata
  – Impedes discovery and use
• Increasing use of spatial databases for data management
  – The whole is greater than the sum of the parts
Value in Older Data: Solving Business Problems
Suburban Development 1993/2002
Near Mecklenburg-Cabarrus County border
Land use change analysis
Real estate trends analysis
Site location analysis
Disaster response
Resolution of legal challenges
Impervious surface maps
Value in Older Data: Cultural Heritage
Future uses of data are difficult to anticipate (as with Sanborn Maps)
Challenge: Vector Data Formats
No widely-supported, open vector formats for geospatial data
• Spatial Data Transfer Standard (SDTS) not widely supported
• Geography Markup Language (GML) – diversity of application schemas and profiles a challenge for “permanent access”
Spatial Databases
• The whole is more than the sum of the parts, and the whole is very difficult to preserve
• Can export individual data layers for curation, but relationships and context are lost
• Some thinking of using the spatial database as the primary archival platform
Challenge: Cartographic Representation
Counterpart to the map is not just the dataset but also models, symbolization, classification, annotation, etc.
Challenge: Geospatial Web Services
• How to capture records from decision-making processes?
• Possible: Atlas collections from automated image capture
• Web 2.0 impact: Emerging tiling and caching schemes (archive target?)
Challenge: Preservation Metadata
[Chart: Metadata Archived? Percentage of respondents (0% to 70%) by category: FGDC format, locally defined metadata, NC OneMap metadata starter block, none]
Results from a 2006 survey of all 100 NC counties and 25 largest NC municipalities
Challenge: Data Capture
[Chart: Jurisdictions Archiving Snapshots (legend: Yes, No, No response)]
Response: yes = 65.3%, no = 34.7%*
(out of 57.6% response rate)
2006 Frequency of Capture Survey targeting North Carolina counties and municipalities
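Because the yes/no percentages apply only to respondents, the share of all surveyed jurisdictions confirmed as archiving snapshots is smaller. The calculation below is a reading aid, not a result reported in the survey; it simply multiplies the stated response rate by the stated yes/no shares. The variable names are hypothetical.

```python
# Figures as stated on the slide.
RESPONSE_RATE = 0.576  # share of surveyed jurisdictions that responded
YES_SHARE = 0.653      # share of respondents archiving snapshots
NO_SHARE = 0.347       # share of respondents not archiving snapshots

# Re-express each group as a share of ALL surveyed jurisdictions.
confirmed_yes = RESPONSE_RATE * YES_SHARE
confirmed_no = RESPONSE_RATE * NO_SHARE
no_response = 1 - RESPONSE_RATE

print(f"Confirmed archiving:     {confirmed_yes:.1%}")
print(f"Confirmed not archiving: {confirmed_no:.1%}")
print(f"No response (unknown):   {no_response:.1%}")
```

On these numbers, about 37.6% of surveyed jurisdictions are confirmed to archive snapshots, and the practice of the 42.4% that did not respond is unknown.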
Data Capture Survey Results: Overview
• Two-thirds of responding agencies create and retain periodic snapshots
• Long-term retention more common in counties with larger populations
• Storage environments vary, with servers and CD-ROMs most common
• Offsite storage (or both onsite and offsite) is used by nearly half of the respondents
• Popularity of historic images has resulted in scanning and geo-referencing of hardcopy aerial photos among one-third of the respondents
Solutions: Content Exchange Infrastructure
• Volume of state/federal requests for local data (“contact fatigue”) spurs rethinking of archive strategy for data acquisition
• Leveraging more compelling business reasons to put the data in motion (disaster preparedness, highway construction, census, …)
• Content exchange networks:
  – Minimize need to make contact
  – Add technical, administrative, descriptive metadata
  – Establish rights and provenance
Informing and Leveraging Other Infrastructure
Orthophoto Data Distribution System
Efficient transfer of large quantities of imagery
Street Centerline Data Distribution System
Efficient transfer of data from 100 counties, with metadata and clarified rights
NC GIS Inventory
• Efficient data identification
• Adding preservation elements
NC OneMap Data Download and Viewer
• Public access
• Data visualization
Solutions: Engaging Standards Efforts
• Partnered with EDINA (UK) and NARA to approach the Open Geospatial Consortium (OGC) in 2005-2006
• Working Group charter approved by OGC Technical Committee plenary Dec. 2006
Points of Engagement with the OGC
• GML for archiving
• Geo Rights Management – adding archive use cases
• Content packaging
• Saving data state in web services interactions
• Content replication (OGC/Open Grid Forum talks)
• Persistent identifiers
• Data versioning (metadata and catalog support)
• Cartographic representation
Cross-fertilize between library/archives and geospatial communities
Role of Commercial Data Providers
Project Status
Cultivating a commercial market for older data.
Part of “permanent access” is marketing, advertising, and putting older data into the path of the user
Signs of Hope
• Software vendors are more keenly aware of temporal data management as a customer problem
• Consulting firms increasingly see temporal data management and archiving as a business opportunity
• Innovative practices emerging at local and state level to complement and inform national level activities
Viral adoption of archiving practices vs. mandated archiving practices: which will have more effect?
Next Steps
Technical
• Refining repository ingest workflow (currently using DSpace)
• Further investigation into use of METS (Metadata Encoding and Transmission Standard) and PREMIS (Preservation Metadata: Implementation Strategies)
• Content exchange tests with other organizations
Organizational
• OGC Data Preservation Working Group
• Engaging State Archives: Local records outreach and records retention practices
• Work towards formulating best practices for data capture by local agencies
• Content exchange networks
Questions?
Steve Morris
Head, Digital Library Initiatives
NCSU Libraries
ph: (919) [email protected]
http://www.lib.ncsu.edu/ncgdap