george-pope
New research data management world
Federal funding agencies now expect researchers to include data management plans in proposals
Article 26. Sharing of Findings, Data, and Other Research Products
a. NSF expects significant findings from research and education activities it supports to be promptly submitted for publication, with authorship that accurately reflects the contributions of those involved. It expects investigators to share with other researchers, at no more than incremental cost and within a reasonable time, the data, samples, physical collections and other supporting materials created or gathered in the course of the work. It also encourages grantees to share software and inventions or otherwise act to make the innovations they embody widely useful and usable.
b. Adjustments and, where essential, exceptions may be allowed to safeguard the rights of individuals and subjects, the validity of results, or the integrity of collections or to accommodate legitimate interests of investigators.
Context: The scientist’s view
• Design experiments that will collect the data needed to answer the research question(s)
• Process those data in a way that will produce the results needed to draw sensible conclusions
• Publish those conclusions so that they can be shared with the wider scientific community and ultimately the public
Investigators often feel that their responsibility ends here.
Context: The data manager’s view
• Consolidate the data collected by investigators and produce datasets in formats that encourage re-use by other investigators
• Make those data accessible to the wider scientific community and potential citizen scientists
• Publish the metadata necessary to enable the data to be interpreted by third parties
The investigators’ responsibility actually ends HERE!
The “reusable” data challenge
How do we:
• Consolidate interdisciplinary data from disparate sources in ways that provide practical (and achievable) opportunities for powerful, complex, synthetic analysis?
• Encourage commonality in scientific measurements and data standards so that we can perform a meaningful comparison between “apples” and “apples”?
• Increase data re-use to extract the maximum possible value from precious research dollars?
• Share these data with collaborators in a way that supports answering the big questions?
Someone’s responsibility ends HERE!
Current data management process
• Researcher completes project
• DM asks for data and metadata
• DM asks for data and metadata again
• Researcher plans next project
• DM pledges eternal friendship and reminds about metadata
• Researcher publishes research
• Researcher starts next project
• DM dies waiting for metadata
• Researcher sends data to DM
• DM rescinds friendship pledge while pleading for metadata
Realistically, where do we want to be?
• A single, consolidated repository (“virtual” notebook) where we can store information about projects, organization, methods and protocols, and datasets (metadata)
• The means to enter data wherever we may be
• A data catalog that helps find useful research data (both published and unpublished)
• Visualization tools to help assess the value of data
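One way to picture the consolidated repository described above is as a small set of linked records. The sketch below is only an illustration under assumptions: the class names `ProjectRecord` and `DatasetMetadata`, their fields, and the `missing_metadata` helper are hypothetical and are not taken from the system described in these slides.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DatasetMetadata:
    """Hypothetical metadata record for one dataset."""
    title: str
    attributes: List[str] = field(default_factory=list)  # names of the data columns
    methods: str = ""                                     # field or lab methods description
    published: bool = False


@dataclass
class ProjectRecord:
    """Hypothetical project record holding organization, protocols and datasets."""
    name: str
    organization: str
    protocols: List[str] = field(default_factory=list)
    datasets: List[DatasetMetadata] = field(default_factory=list)

    def missing_metadata(self) -> List[str]:
        """Return titles of datasets that lack a methods description or attributes."""
        return [d.title for d in self.datasets if not d.methods or not d.attributes]


project = ProjectRecord(name="Urban heat study", organization="Example Institute")
project.datasets.append(
    DatasetMetadata("Air temperature", ["site", "date", "temp_c"], "Hourly loggers")
)
project.datasets.append(DatasetMetadata("Soil moisture"))
print(project.missing_metadata())
```

Keeping the metadata next to the project record is what lets a data manager see, at any time, which datasets still need a description.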
A catalog to help find and evaluate data
GIOS DB:
1. Organization data
2. Public datasets metadata
3. Internal datasets metadata
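A catalog of this kind could be searched by keyword across public and internal dataset metadata. The following is a minimal sketch, assuming a simple in-memory list; `CatalogEntry`, `search_catalog`, and the `include_internal` flag are hypothetical names chosen for illustration, not part of the GIOS database.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CatalogEntry:
    """Hypothetical catalog entry describing one dataset."""
    title: str
    keywords: List[str]
    public: bool  # True for public datasets, False for internal ones


def search_catalog(entries: List[CatalogEntry], term: str,
                   include_internal: bool = False) -> List[CatalogEntry]:
    """Return entries whose title or keywords contain the term (case-insensitive).

    Internal entries are returned only when include_internal is True.
    """
    needle = term.lower()
    results = []
    for entry in entries:
        if not entry.public and not include_internal:
            continue
        haystack = [entry.title.lower()] + [k.lower() for k in entry.keywords]
        if any(needle in text for text in haystack):
            results.append(entry)
    return results


catalog = [
    CatalogEntry("Stream water chemistry", ["water", "nitrate"], public=True),
    CatalogEntry("Household water use survey", ["water", "survey"], public=False),
    CatalogEntry("Bird census", ["birds"], public=True),
]
public_hits = search_catalog(catalog, "water")
all_hits = search_catalog(catalog, "water", include_internal=True)
print([e.title for e in public_hits])
print([e.title for e in all_hits])
```

Filtering on visibility inside the search keeps unpublished datasets discoverable to team members without exposing them to outside users.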
Data management workflow
Project stages: project initiation, data collection, data analysis, project completion, research publication
Tasks (Data Management System):
• Create project record
• Create metadata record(s)
• Describe field methods
• Describe lab methods
• Describe data attributes
• Create datasets
• Finalize datasets
• Submit data and metadata
• Publish data and metadata
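The workflow tasks can be tracked as a simple checklist so that the remaining metadata work is visible while the project is still running. This is a sketch only: the task names come from the workflow slide, but the ordering used here is an assumption (the slide layout does not fix it), and `remaining_tasks` is a hypothetical helper.

```python
from typing import Iterable, List

# Task names from the workflow slide; this ordering is an assumption.
WORKFLOW_TASKS = [
    "Create project record",
    "Create metadata record(s)",
    "Describe field methods",
    "Describe lab methods",
    "Describe data attributes",
    "Create datasets",
    "Finalize datasets",
    "Submit data and metadata",
    "Publish data and metadata",
]


def remaining_tasks(completed: Iterable[str]) -> List[str]:
    """Return the workflow tasks not yet completed, in workflow order."""
    done = set(completed)
    unknown = done - set(WORKFLOW_TASKS)
    if unknown:
        raise ValueError(f"Unknown task(s): {sorted(unknown)}")
    return [task for task in WORKFLOW_TASKS if task not in done]


todo = remaining_tasks(["Create project record", "Describe field methods"])
print(todo[0])
print(len(todo))
```

Recording each task as it is finished reflects the "Pay As You Go" idea: the list shrinks during the project instead of being tackled all at once at the end.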
What does this mean to investigators?
• Effort required to input project and dataset information
• “Pay As You Go” model reduces the back end effort when you really want to be planning for the future
• A single resilient place to store project information – accessible by all team members
• Latest “version” always available
• Content available as input to manuscripts
• The next data management plan is… Easy!
Project Timeline
• Design phase: December – February
• Development:
  – Organization/project module: January – April
  – Metadata module: March – July
• Assistance with testing: June – July
• Training: July – August