DESCRIPTION
Data curation has emerged as a strategic growth area for academic libraries. Many libraries have conducted needs assessments as a precursor to developing services; however, there have been few comparisons of the findings across institutions. This panel brings together four librarians from different institutions to discuss both common and distinct findings from their respective needs assessments. The panelists will speculate on the application of these findings at their specific libraries and in academic libraries generally.
COMING TO AN UNDERSTANDING
A Cross-institutional Examination of Assessments of Data Curation Needs
Jake Carlson - Purdue University
Dianne Dietrich - Cornell University
Gail Steinhart - Cornell University
Alison Valk - Georgia Institute of Technology
Stephanie Wright - University of Washington
Dianne Dietrich
Planning & Data Management Plans
May 2010: NSF press release indicating intent to require data management plans with grant proposals.
October 2010: NSF releases specifics for data management plan requirement.
December 2010: Cornell survey distributed to PIs and Co-PIs of NSF grants.
January 2011: NSF requirement goes into effect.
Planning and Data Management Plans
How prepared are researchers to address data management plan requirements?
What is the potential impact of researcher plans on existing Cornell services?
Planning and Data Management Plans
Figure: Percentage of respondents who answered "I'm not sure" for questions where that was an option (axis: 0% to 100%).
Each bar represents a question where respondents were asked to select "Yes", "No", or "I'm not sure".
Adapted from Steinhart, et al. (2012) Prepared to Plan? A Snapshot of Research Readiness to Address Data Management Planning Requirements. Journal of eScience Librarianship 1(2).
Planning and Data Management Plans
Figure: Responses to the question: "Given the NSF expectation to share data ... how much data would you intend to share?"
Categories: No data, Up to 1 GB, 1 GB - 100 GB, 100 GB - 1 TB, 1 TB - 100 TB, More than 100 TB (axis: 0% to 50%).
Adapted from Steinhart, et al. (2012) Prepared to Plan? A Snapshot of Research Readiness to Address Data Management Planning Requirements. Journal of eScience Librarianship 1(2).
Planning and Data Management Plans
Have you produced or do you anticipate producing metadata for this project?
I do not plan to create metadata: 26%
I'm not sure if I plan to create metadata: 32%
If you plan on creating metadata, does it conform to known standards in your discipline?
Yes: 30%
I'm not sure: 61%
No: 9%
Adapted from Steinhart, et al. (2012) Prepared to Plan? A Snapshot of Research Readiness to Address Data Management Planning Requirements. Journal of eScience Librarianship 1(2).
Planning and Data Management Plans
Figure: Anticipated Backup Strategy by Size of Data
x-axis (Backup Strategy): Own infrastructure, Campus solution, Commercial solution
y-axis: Number of responses (0 to 70)
Legend (size of data): More than 100 TB, 1 TB - 100 TB, 100 GB - 1 TB, 1 GB - 100 GB, Up to 1 GB
Adapted from Steinhart, et al. (2012) Prepared to Plan? A Snapshot of Research Readiness to Address Data Management Planning Requirements. Journal of eScience Librarianship 1(2).
Stephanie Wright
Management
Management: UW
Background
Services Survey & Interviews
Management: Organization
Survey
Guidance on data organization (file structure, file naming, etc.) ranked 13th out of 14
Tracking updates to data (versioning) ranked 8th
Image Credit: radrice “data cat finds no data” http://blog.looxii.com/wp-content/uploads/2011/06/new-data-cat.jpg
Management: Organization
Interviews
Whatever makes sense to organizer
More planning, better organization
Especially true of larger, well-funded projects
“But that really was sort of something we addressed after the fact, after we started to go, ‘Huh, I’m naming them this way, you’re naming them that way, and I have no idea what your naming conventions mean.’”
(Health)
Management: Description
Survey
1/3 didn’t know of metadata standard
16% were able to identify metadata standard
Metadata service ranked 10th out of 14
Image & Quote Credit: NYU Health Sciences Libraries “Data Sharing and Management Snafu in 3 Short Acts” http://www.youtube.com/watch?v=N2zK3sAtr-4
“Everything you need to know about the data is in the article.”
Management: Description
Interviews
Documentation is biggest challenge in data management
Recognize role of metadata
Time consuming, no immediate benefit
Data planning vs. data forensics
“If I was gonna make (the data) available to other people, I would feel some responsibility in documenting it a little bit better.”
(Social Sciences)
Management: Summary
Services needed:
Training on best practices or general strategies
Tools that integrate description and organization of data into the workflow
“I kind of feel like we’re just making our way through the wilderness. And if there were somebody who could kind of hold our hands and say, ‘Look, data management is important and here are some strategies for going about it…’ That would be great.”
(Health)
Jake Carlson
Sharing
Sharing: Purdue
Background on Purdue’s work:
Primarily Interview Driven
• Data Curation Profiles
• Data Management Plans
• Data Information Literacy
Sharing
Willingness to Share
Generally, faculty are open to sharing their data with others.
There is an “underground economy” of data sharing.
Factors in deciding whether or not to share:
What will this person do with my data?
How much time & effort will it take me?
Image Credit: andrew_mc_d “Share” http://www.flickr.com/photos/andrew_mc_d/452728652/
Sharing
Control
Issues in sharing data publicly:
Timing over when to release data.
Use - If anyone can get the data, anyone can use it for whatever they want to
Misinterpretation - there’s no guarantee that someone won’t misconstrue the data
Sharing
Attribution
Generally expressed as need for others to cite the data set (though not always)
“So for in my personal opinion, data citations won’t help me too much. Paper citations count for everything. It counts for impact of the paper, it counts for tenure, it counts for the profile of my work.”
- Professor of Biochemistry
Sharing
Documentation and Description
"If you ask someone if you can see their raw data, you might as well be asking if you can look at their underwear. It's really problematic."
- Agronomy Professor
Sharing
Services for Data Sharing at Purdue
Consultation & Collaboration with Data Producers
Support "local" sharing: Workflows, Documentation, Description
Support "external" sharing: Workflows, Documentation, Description
Alison Valk
Preservation
Background
“Develop campus partnerships to collect, manage, share, and preserve Georgia Tech digital research data.”
“Improve and develop new resources & services to assist researchers with data stewardship”
Preservation
IRB-approved research to determine gaps in data curation services provided to researchers.
Data assessment survey
Series of campus wide interviews
NSF DMP content analysis
Preservation
By combining information gathered via the survey and the interviews, we developed a clearer picture of the research data curation needs on campus.
Out of 77 who completed the survey:
o 44 agreed to be interviewed
o 26 interviews completed
Preservation
Interview Team
Chris Doty
Susan Parham
Elizabeth Rolando
Alison Valk
10 Interview questions
“How important is it for you to archive / preserve your data?”
“How important is it for you or others to have access to your data over the long-term?”
Preservation
Transcribe interviews
Web application for Qualitative & Mixed Methods research
Visualize major discussion points or code correlations
Code
Correlation between the cost of working with data and how strongly participants feel data should be preserved…
Preservation
Storage prices no longer cost prohibitive
Preservation
Lack of metadata or curation = unusable data
Data is often “lost” when project participants such as grad students leave institution
Computing professor: “I don’t want to micromanage my research assistants”
Preservation
Some researchers are using cloud-based tools, such as Dropbox, for archiving, with little concern for the associated security risks.
Preservation
Next Steps:
Select Case studies-
o Researchers have volunteered to allow us to archive their research data.
Increased Outreach- New Services
o Customized DMPtool
o Departmental Data Management Workshops
o More robust web presence
o Proof-of-concept Library hosted Research Data Repository
Preservation
Questions?
Jake Carlson @jrcarlso jakecarlson@purdue.edu
Dianne Dietrich @nemka dd388@cornell.edu
Gail Steinhart @gailst gss1@cornell.edu
Alison Valk @valkcano alison.valk@library.gatech.edu
Stephanie Wright @shefw swright@uw.edu