GIS Data Curation in Libraries
A panel will explore the future of GIS data curation in libraries. Speakers will address the traditional ways libraries incorporate GIS services, how researchers use GIS data throughout the research life cycle, and finally the potential and challenges of GIS data curation.
Michael Elliott, Assistant Professor of Biostatistics at SLU
Karen Hogenboom, Numeric and Spatial Data Librarian at UIUC
Cynthia Hudson, Digital Data Outreach Librarian at WUSTL
Jennifer Moore, GIS / Anthropology Librarian at WUSTL
Chris Freeland, Associate University Librarian at WUSTL
GIS in Libraries (Karen)
Digital Assets Management Systems (Chris)
Data Curation in Libraries (Cynthia)
Case Study in the Research Lifecycle (Michael)
Curating GIS Data (Jennifer)
Discussion and Questions
My Experience as a Public Health Faculty Member Using GIS Data
Michael B. Elliott, Ph.D.
Assistant Professor
Public health has a long history with spatial data: John Snow in 19th-century London
Obesity Trends* Among U.S. Adults, BRFSS, 1985–2010
(*BMI ≥30, or ~30 lbs. overweight for a 5’4” person)
[Series of annual maps, one per year from 1985 through 2010. Legend categories by period:]
1985–1990: No Data, <10%, 10%–14%
1991–1996: No Data, <10%, 10%–14%, 15%–19%
1997–2000: No Data, <10%, 10%–14%, 15%–19%, ≥20%
2001–2004: No Data, <10%, 10%–14%, 15%–19%, 20%–24%, ≥25%
2005–2010: No Data, <10%, 10%–14%, 15%–19%, 20%–24%, 25%–29%, ≥30%
Prevalence* of Self-Reported Obesity Among U.S. Adults, BRFSS, 2011
*Prevalence reflects BRFSS methodological changes in 2011, and these estimates should not be compared to previous years.
[Map legend: 15%–<20%, 20%–<25%, 25%–<30%, 30%–<35%, ≥35%]
How I’ve used GIS data in my research
Associate aspects of the neighborhood (built environment) with behaviors and chronic disease (diabetes)
Start with diabetes mortality rate
Look at poverty status
Look at location of parks
Look at location of fast food chains
Look at location of convenience stores
Look at location of grocery stores
When you put it all together
What takes the most time?
Finding data
Modifying / limiting shape files
Re-finding data
Where do I try to find my data?
Census Bureau
MSDIS (Missouri Spatial Data Information Service)
City/County Departments of Planning
CDC
Various pay sources
Where to store all this data?
What a mess!
Problems
Trying to locate where I stored files for different projects
Trying to remember what I named the files (especially when I accepted ArcMap’s default names)
Trying to remember how I changed the files
Concerns about the quality of the files
Lack of access to colleagues’ files across department, college, university (city? etc.)
Lack of normalization of shape file projections
Lack of metadata
Disrupted linkages when switching computers, changing file structure, or updating software
Using Dropbox as a temporary collaborative solution does not fix the problem
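Several of the problems above (lost files, forgotten names, missing projection and metadata files) could be surfaced with a simple inventory script. The following is a minimal sketch, not a tool used in the presentation: the function names and the choice of sidecar extensions are assumptions made for illustration, and it relies only on the Python standard library.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path

# Companion files checked for each .shp (an illustrative choice):
# .prj carries the projection; .shp.xml is where ArcGIS usually writes metadata.
SIDECARS = (".shx", ".dbf", ".prj", ".shp.xml")


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def inventory_shapefiles(root: Path) -> list[dict]:
    """Walk `root` and describe every .shp file found beneath it."""
    records = []
    for shp in sorted(root.rglob("*.shp")):
        # Build each sidecar name from the stem, e.g. parks.shp -> parks.prj
        missing = [ext for ext in SIDECARS
                   if not shp.with_name(shp.stem + ext).exists()]
        stat = shp.stat()
        records.append({
            "path": shp.relative_to(root).as_posix(),
            "size_bytes": stat.st_size,
            "modified_utc": datetime.fromtimestamp(
                stat.st_mtime, tz=timezone.utc).isoformat(timespec="seconds"),
            "sha256": sha256_of(shp),
            "missing_sidecars": missing,
        })
    return records
```

Running `inventory_shapefiles(Path("some/project/folder"))` (a hypothetical path) returns one record per shapefile. Entries whose `missing_sidecars` list includes `.prj` or `.shp.xml` are the ones lacking a projection definition or metadata, and matching checksums point to duplicate copies.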
Possibilities for the future…
GIS Services in Academic Libraries
Karen Hogenboom
Numeric and Spatial Data Librarian
University of Illinois at Urbana-Champaign
Consultations with GIS Users
Finding data
Help with choosing or using software
Data management (and curation)
Metadata
Database design
Etc…
Providing Access to Data
Compilations of trusted sources: http://www.library.illinois.edu/sc/datagis
Geo-portals: http://geodata.tufts.edu
Subscriptions to data sources
SimplyMap
Social Explorer
Geolytics
Small topical data sets (countrydata.com, UNIDO Industrial Statistics)
(Geo)Data Literacy
Data literate students must “be able to access, assess, manipulate, summarize, and present data.”1
Workshops (geographic concepts and software, finding data)
Sessions with classes/groups
Online guides: http://libguides.com
1 Milo Schield, “Information Literacy, Statistical Literacy, and Data Literacy,” IASSIST Quarterly (Summer/Fall 2004): 7-11.
Accessing Academic Library GIS Services
Data Curation in Libraries
The model and existing tools to get you there...
Cynthia Hudson
Digital Data Outreach Librarian
Washington University in St. Louis
Adapted from: Dorothea Salo “Librarians love data”
DCC Curation Lifecycle Model
http://www.dcc.ac.uk/resources/curation-lifecycle-model
CONCEPTUALIZE
CREATE OR RECEIVE
APPRAISE & SELECT
INGEST
PRESERVATION ACTION
STORE
ACCESS, USE & REUSE
TRANSFORM
GIS Data Curation: Challenges & Potential
Jennifer Moore
Jennifer Moore | GIS Outreach & Anthropology Librarian | Washington University Libraries | [email protected] | @anthrolibrarian
Curation Lifecycle from the DCC http://www.dcc.ac.uk/resources/curation-lifecycle-model
Curation Lifecycle Model as a Guide for GIS Data
Provenance
Photo by Silver Stack http://www.flickr.com/photos/silverstack/7163871656/
Collection?
Licensed?
Purchased?
Public Domain?
Two issues:
Who/when/how/where was it originally collected
Where/when/how did the researcher get it?
CREATE/RECEIVE
PRESERVE
STORE
Authoritative?
Quality?
Photo from woodlywonderworks http://www.flickr.com/photos/wwworks/2222523486/
What does authoritative mean for GIS data?
Original, raw data?
Confirmed by local sources?
Centuries-long problem for cartographers
Now there are many collectors of GIS data; some argue this makes the question of quality harder to answer
CREATE/RECEIVE
Derivatives
Versioning
Photo by Luz http://www.flickr.com/photos/luzbonita/2353227140/
Derivatives
Authority
Accuracy
Currency
TRANSFORM
APPRAISE/SELECT
PRESERVE
ACCESS/REUSE
Diverse
Structured
Layered
Needs attribution
Photos by Doug88888 http://www.flickr.com/photos/doug88888/3220357081/
Data Complexity
Photo by Artform Canado http://www.flickr.com/photos/artform/3266013003/
File size: robust
Formats: obsolete, proprietary, versatile
Best practices: naming conventions, metadata
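A naming convention is easier to follow when it can be generated and checked by a script. The sketch below assumes a hypothetical pattern, `<project>_<theme>_<yyyymmdd>_v<NN>.<ext>`, which is not a convention recommended in the presentation; the function names are likewise illustrative.

```python
import re
from datetime import date
from typing import Optional

# Hypothetical convention: project_theme_yyyymmdd_vNN.ext, all lowercase.
NAME_PATTERN = re.compile(
    r"^(?P<project>[a-z0-9]+)_(?P<theme>[a-z0-9]+)_"
    r"(?P<date>\d{8})_v(?P<version>\d{2})\.(?P<ext>[a-z0-9]+)$"
)


def build_name(project: str, theme: str, when: date,
               version: int, ext: str) -> str:
    """Compose a file name and confirm it matches the convention."""
    name = (f"{project.lower()}_{theme.lower()}_"
            f"{when:%Y%m%d}_v{version:02d}.{ext.lower()}")
    if not NAME_PATTERN.match(name):
        raise ValueError(f"name does not fit the convention: {name!r}")
    return name


def parse_name(name: str) -> Optional[dict]:
    """Return the parts of a conforming name, or None if it does not conform."""
    match = NAME_PATTERN.match(name)
    return match.groupdict() if match else None
```

For example, `build_name("stl", "parks", date(2013, 5, 1), 2, "shp")` yields `stl_parks_20130501_v02.shp`, while `parse_name("Export_Output.shp")` returns `None`, flagging a file that kept a software default name.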
Data management
CONCEPTUALIZE
metadata
Data that informs us about the data. Necessary for data management, preservation and discovery.
Data curators often find that researchers do not document their data accurately.
But researchers don’t want to learn a metadata standard to make their data useful; they just want to fill in a form.
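The "just fill in a form" idea can be illustrated with a small record type that lists a handful of fields and reports which ones are still blank. This is a sketch under stated assumptions: the field names are hypothetical, chosen to echo the provenance questions raised earlier, and the record does not conform to FGDC, ISO 19115, or any other metadata standard.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DatasetRecord:
    """A deliberately small, non-standard metadata form (illustrative only)."""
    title: str
    creator: str
    source: str             # who originally collected the data
    obtained_from: str      # where the researcher got it
    obtained_on: str        # ISO date, e.g. "2013-05-01"
    coordinate_system: str  # e.g. an EPSG code
    license: str
    description: str = ""
    derived_from: list[str] = field(default_factory=list)

    def missing_fields(self) -> list[str]:
        """Names of required fields that were left blank."""
        required = ("title", "creator", "source", "obtained_from",
                    "obtained_on", "coordinate_system", "license")
        return [name for name in required if not getattr(self, name).strip()]

    def to_json(self) -> str:
        """Serialize the record so it can be stored next to the data file."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)
```

A curator could later map such a record onto a formal standard; the point of the sketch is only that the researcher sees a short form rather than the standard itself.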
metadata
FGDC metadata. I mean, really. FGDC is RIDICULOUSLY complex, and tool support for it is therefore nonexistent. Who thought this would work, and have they been fired yet? - Dorothea Salo
ISO 19115? Geography Markup Language (GML)?
Photo by Davewing68 http://www.flickr.com/photos/davewing68/2834143854/
Data Access and Support
ACCESS/REUSE
CONCEPTUALIZATION
Good Examples
http://cugir.mannlib.cornell.edu/
http://inside.uidaho.edu/
http://www.geomapp.net/
Steps Forward
Create a Geospatial Data Collection Policy (model: NGDA)
Anticipate the difficulties of geospatial data curation
Foster a culture of data best practices
Develop relationships with other institutions
Establish a GeoPortal following OAIS standard guidelines
Bibliography
Bethune, Alec, Butch Lazorchak, and Zsolt Nagy. 2009. “GeoMAPP: A Geospatial Multistate Archive and Preservation Partnership.” Journal of Map & Geography Libraries 6 (1): 45–56. doi:10.1080/15420350903432630.
Bose, Rajendra, and Femke Reitsma. 2006. “Advancing Geospatial Data Curation.” http://www.era.lib.ed.ac.uk/handle/1842/1074.
Downs, Robert R., and Robert S. Chen. 2005. “Organizational Needs for Managing and Preserving Geospatial Data and Related Electronic Records.” Data Science Journal 4 (0): 255–271.
Erwin, Tracey, and Julie Sweetkind-Singer. 2009. “The National Geospatial Digital Archive: A Collaborative Project to Archive Geospatial Data.” Journal of Map & Geography Libraries 6 (1): 6–25. doi:10.1080/15420350903432440.
Gold, Anna K. 2007. “Cyberinfrastructure, Data, and Libraries, Part 2: Libraries and the Data Challenge: Roles and Actions for Libraries.” Office of the Dean (Library): 17.
Jenkins, Keith. 2013. “Expert Feedback on Geospatial Data Curation.” http://guides.library.cornell.edu/profile.php?uid=1097
Kenyon, Jeremy. 2012. “Geospatial Data Curation at the University of Idaho.” Journal of Web Librarianship 6 (4): 251–262.
Salo, Dorothea. 2013. “Expert Feedback on Geospatial Data Curation.” http://dsalo.info/
Shaon, Arif, and Andrew Woolf. 2011. “Long-term Preservation for Spatial Data Infrastructures: a Metadata Framework and Geo-portal Implementation.” D-Lib Magazine 17 (9): 1–.
Steinhart, Gail. 2006. “Libraries as Distributors of Geospatial Data: Data Management Policies as Tools for Managing Partnerships.” Edited by Gail Steinhart. Library Trends 55 (2): 264–284.
Stoltenberg, Jaime. 2013. “Expert Feedback on Geospatial Data Curation.” http://www.library.wisc.edu/directory/staff/Jaime-Stoltenberg
Sweetkind, Julie, Mary Lynette Larsgaard, and Tracey Erwin. 2006. “Digital Preservation of Geospatial Data.” Library Trends 55 (2): 304–314.
Xia, Jingfeng. 2012. “Metrics to Measure Open Geospatial Data Quality.” Issues in Science & Technology Librarianship (68): 7.
GIS & Digital Asset Management Systems (DAMS)
Chris Freeland
Associate University Librarian
Twitter: @chrisfreeland
What is a Digital Asset Management System?
Combination of hardware & software used to store and access digital objects
Documents
Images / Photos
Video
Audio
Datasets
DAMS
SAN / DB
Metadata / Files
UIs / APIs:
Add/Edit/Delete
Access control
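The pieces named above (files, metadata, add/edit/delete operations, access control) can be made concrete with a toy in-memory model. This is only an illustrative sketch: the class and method names are hypothetical, a Python dictionary stands in for the SAN and database, and it does not reflect how Fedora Commons or any real DAMS is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """One digital object: its bytes, descriptive metadata, allowed readers."""
    content: bytes
    metadata: dict
    readers: set = field(default_factory=set)


class TinyDams:
    """Toy store exposing add / edit / delete plus a read-access check."""

    def __init__(self) -> None:
        self._assets: dict[str, Asset] = {}

    def add(self, asset_id: str, content: bytes, metadata: dict,
            readers: set) -> None:
        if asset_id in self._assets:
            raise KeyError(f"asset already exists: {asset_id}")
        self._assets[asset_id] = Asset(content, dict(metadata), set(readers))

    def edit_metadata(self, asset_id: str, **changes) -> None:
        # Raises KeyError if the asset is unknown.
        self._assets[asset_id].metadata.update(changes)

    def delete(self, asset_id: str) -> None:
        del self._assets[asset_id]

    def get(self, asset_id: str, user: str) -> Asset:
        asset = self._assets[asset_id]
        if user not in asset.readers:
            raise PermissionError(f"{user} may not read {asset_id}")
        return asset
```

Even in this toy form, nothing in the store understands coordinates or projections: spatial data is just another file with a metadata dictionary attached.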
Kinds of DAMS
Enterprise
Institutional
Personal
Connecting GIS & DAMS
…little to no native support, requires custom programming
Putting it all together
Tropicos: http://www.tropicos.org
Missouri Botanical Garden’s botanical information system
4 million+ specimen records
1.2 million plant names
98,000 collectors / authors
140,000 images
Maps via ESRI tools & other technologies…
ArcIMS in 2000, only recently taken offline
ArcGIS Server 9.3 & JavaScript API in 2010
Digital Asset Management via Fedora Commons
[Architecture diagram: GIS / DAMS. Labels: UI / API, App, DB, File System; ArcGIS API for JavaScript, ASP.NET (C#), ArcGIS Server, SQL Server, Spatial Data; djatoka, Fedora Commons, MySQL, Image Metadata, Images]
GIS & DAMS: Conclusions
Libraries have invested in DAMS for media storage & delivery
Opportunities for use with custom GIS apps, but requires customization / tradeoffs
It DOES work
It IS NOT simple
Move towards community-supported research data portals will probably win
Discussion and Questions
Thank you!