Magle data curation in libraries

Why would I let a librarian anywhere near my research data?

C. Tobin Magle, Biomedical Sciences Research Support Specialist

http://www.slideshare.net/CTobinMagle/magle-data-curation-in-libraries

Updates from the library

• 24/7 badge access

• Website refresh

• Research support pages

• Free Interlibrary loan for all!

• New off campus login

• Water bottle filling stations

Questions

• Why should I care about data management?

• What do libraries have to do with it?

• What is Tobin up to in this area?

Why should I care about data management?

Rinehart, AK. “Getting emotional about data” College & Research Libraries News September 2015 vol. 76 no. 8 437-440

Anonymous researcher’s submission to “Day of Data” at Brown University

But more people are doing research

https://nexus.od.nih.gov/all/2012/06/27/what-weve-learned-about-graduate-students/

17% of data is lost per year after publication

doi:10.1016/j.cub.2013.11.014
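
To give a feel for what a loss rate like this means over time, here is a minimal sketch. It assumes the slide's figure can be read as a constant 17% annual loss of whatever data is still available, which is a simplification made for illustration and not a claim about the cited study's method. The function name and the chosen time spans are hypothetical.

```python
def fraction_remaining(years: int, annual_loss: float = 0.17) -> float:
    """Fraction of data still available after `years`.

    Assumption: a constant share (`annual_loss`) of the data that is
    still available disappears every year, so losses compound.
    """
    return (1.0 - annual_loss) ** years


# Print the share of data still available at a few illustrative time points.
for years in (1, 5, 10, 20):
    print(f"{years:>2} years: {fraction_remaining(years):.1%} still available")
```

Under this assumption, roughly 16% of the data would still be available ten years after publication.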

The majority of research data aren’t curated

doi:10.1353/lib.0.0036

<22% of NIH grants require a Data Sharing Plan

Thus, we are losing a lot of data that could be repurposed

Research funding is tight

From: The Anatomy of Medical Research: US and International Comparisons. JAMA. 2015;313(2):174-189. doi:10.1001/jama.2014.15939

[Figure: research funding by source. Legend: NIH; Pharma; Med. Device Companies; Biotech; State/local; Private funds; Other Fed.]

Funders want to do more with less

Hence, data sharing

http://figshare.com/blog/2015_The_year_of_open_data_mandates/143

NSF post-award requirements

“Investigators are expected to share with other researchers, at no more than incremental cost and within a reasonable time, the primary data, samples, physical collections and other supporting materials created or gathered in the course of work under NSF grants. Grantees are expected to encourage and facilitate such sharing.”

http://www.nsf.gov/pubs/policydocs/pappguide/nsf11001/aag_6.jsp#VID4

2003 NIH data sharing policy

“The NIH endorses the sharing of final research data to serve these and other important scientific goals. The NIH expects and supports the timely release and sharing of final research data from NIH-supported studies for use by other researchers.”

http://grants.nih.gov/grants/guide/notice-files/NOT-OD-03-032.html

NIH Genomic Data Sharing (2014)

“To promote robust sharing of human and non-human data from a wide range of genomic research and to provide appropriate protections for research involving human data, the National Institutes of Health (NIH) issued the NIH Genomic Data Sharing Policy (GDS Policy) on August 27, 2014 in the NIH Guide Grants and Contracts.”

http://grants.nih.gov/grants/guide/notice-files/NOT-OD-14-124.html

White House’s 2013 OSTP memorandum

“The Obama Administration is committed to the proposition that citizens deserve easy access to the results of research their tax dollars have paid for. That’s why, in a policy memorandum released today, OSTP Director John Holdren has directed Federal agencies with more than $100M in R&D expenditures to develop plans to make the results of federally funded research freely available to the public—generally within one year of publication.”

http://www.whitehouse.gov/blog/2013/02/22/expanding-public-access-results-federally-funded-research

I’m going to have to curate my data…

But it can also be good for your career

Make Lemonade:

• Reduce data loss

• More data usage

• More exposure to collaborators

• More competitive grant applications

How are libraries getting involved?

• We don’t make the rules

• We want to provide guidance

• Research data management services

• NLM Administrative Supplements

But libraries don’t have anything to do with research data…

“The biggest challenge that libraries face in building data management services is the researchers’ perception that librarians do not understand research data and have no role to play in data management.”

J Med Libr Assoc. 2015 Jul;103(3):131-5. doi: 10.3163/1536-5050.103.3.005.PMID: 26213504

Libraries are changing:

Strength: organizing and finding information

• Old role: Finding and cataloging books

• New role: Finding and cataloging electronic resources

• Informationist’s role: Finding datasets for data repurposing and helping researchers curate their own

Libraries are providing data services

Research Data Lifecycle

From: https://www.lib.msu.edu/rdmg/servcat/

• DMPs

• Software

• Data cleaning

• Deposit in Repository

• Metadata

• Search for datasets

We care what you think

Help us help you!

Data Management Challenges

Basic Science

• Lack of standards and procedures

    • “Reinventing the wheel”

• Disconnect among data types

    • Imaging and tabular data stored separately

• Staff leave with their data

Clinical

• Data quality

    • Collection inconsistent among staff

• Changing data format among statistical programs

J Med Libr Assoc. 2015 Jul;103(3):131-5. doi: 10.3163/1536-5050.103.3.005.PMID: 26213504

Do researchers want to share data?

Basic Science

• Not really (27%)

• Concerns

    • Negative experiences

    • Privacy

    • “My data is too specialized to be of use to others”

    • Lack of infrastructure

    • Curation takes time

Clinical

• Mostly (58%)

• Concerns

    • As long as they know who’s using it.

• Interest in using other researchers’ data

J Med Libr Assoc. 2015 Jul;103(3):131-5. doi: 10.3163/1536-5050.103.3.005.PMID: 26213504

Librarians are receiving grant funding

Informationist Funding

NLM Administrative Supplements for Informationist Services

Purposes:

(1) To enhance collaborative, multi-disciplinary basic and clinical research by integrating an information specialist into the research team in order to improve the capture, storage, organization, management, integration, presentation and dissemination of biomedical research data

(2) To assess and document the value and impact of the informationist’s participation.

http://www.nlm.nih.gov/ep/AdminSupp.html

Project background

Dr. Kechris’s R01 proposal generated miRNA expression data from the LXS recombinant inbred mouse panel as a resource for the research community.

Planned to share data in the PhenoGen database

NLM Informationist Awards

Aims:

1. Make data and code publicly available with appropriate metadata

2. Create tutorials to facilitate data reuse

3. Assess efficacy of Aims 1 + 2

Aim 1: Make data/code/metadata public

• Deposit raw miRNA data in public repositories

    • NCBI (SRA/GEO/BioProject/BioSample)

    • PhenoGen (new functionality to support NGS data)

• Standardize and apply metadata

• Make analysis workflows (R code) available in GitHub

• Repository entry to link all materials from this project

    • Including tutorials from Aim 2
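
As a rough illustration of what "standardize and apply metadata" and a single linking entry could look like in practice, the sketch below defines one record that points to the data, code, and tutorials and reports which fields are still empty. Everything in it is hypothetical: the class name, the field names, and the example.org placeholder URLs are assumptions made for illustration, not the project's actual schema or repository format.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass, field


@dataclass
class DatasetRecord:
    """Hypothetical linking record for one project's shared materials."""

    title: str
    creators: list[str]
    description: str
    repository_links: dict[str, str] = field(default_factory=dict)
    code_url: str = ""
    tutorial_urls: list[str] = field(default_factory=list)

    def missing_fields(self) -> list[str]:
        """Return the names of fields that are still empty, in declaration order."""
        return [name for name, value in asdict(self).items() if not value]


# Placeholder values only; none of these identifiers or URLs are real.
record = DatasetRecord(
    title="Example miRNA expression dataset",
    creators=["A. Researcher"],
    description="Placeholder description of the dataset.",
    repository_links={"raw_data": "https://example.org/raw-data"},
)

# Lists the fields that still need to be filled in before the entry is complete.
print(record.missing_fields())
```

Checking a record like this before deposit is one simple way to keep metadata consistent across datasets; a real project would follow the required fields of the target repository.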

Aim 2: Facilitate data reuse with tutorials

Variety of formats:

• Video Tutorials: Adobe Captivate

• Written tutorials: Blog

    • https://hslnews.wordpress.com/category/bioinformatics-bites/

• Guide on the Side:

    • http://hslibrarytraining.ucdenver.edu

Aim 3: Assess efficacy of Aims 1 and 2

• Monitor data usage

    • Citation

    • Downloads (Google Analytics)

• Surveys and assessments about tutorials

    • Are the tutorials helping others use the data?

Acknowledgements

HSL Faculty and Staff

• Melissa Desantis

• Lisa Traditi

• Kristen Desanto

• John Jones

• Lilian Hoffecker

• Ben Harnke

• Ruby Nugent

Kechris Group/PhenoGen

• Katerina Kechris

• Boris Tabakoff

• Laura Saba

• Spencer Mahaffey

• Pam Russell

Contact Information

Tobin.magle@ucdenver.edu

Phone: 303-724-2114

Twitter: @tobinmagle

http://orcid.org/0000-0003-3185-7034
