Data Sharing Practices: Implications for Curation and Re-use Carole L. Palmer & Tiffany Chao Center for Informatics Research in Science & Scholarship Graduate School of Library & Information Science University of Illinois at Urbana-Champaign GSLIS Research Showcase 30 March 2012


Page 1

Data Sharing Practices: Implications for Curation and Re-use

Carole L. Palmer & Tiffany Chao

Center for Informatics Research in Science & Scholarship
Graduate School of Library & Information Science

University of Illinois at Urbana-Champaign

GSLIS Research Showcase
30 March 2012

Page 2

Team members: Carole Palmer, Tiffany Chao, Nic Weber, Karen Baker, Andrea Thomer

• small science

• complex, heterogeneous data

• implications for data curation

• value for re-use across disciplines

- Data Practices team

Comparative analysis of researchers in the earth and life sciences

• Qualitative analysis of worksheets and interviews conducted with scientists.

• Investigation of data production and use in relation to curation needs, cultures of sharing, and re-use potential.

Page 3

Curation Profiles Project: What can be shared when?

Field: Agronomy

• Specific research area: water quality, drainage, and plant growth
• Form to be shared: cleaned, reviewed sensor; hand-collected samples
• Formats: .xls
• Type of data set: approx. 100 files
• Size: ~1 MB each, up to 20 MB
• Shared when: after publication

Field: Geology

• Specific research area: rock, water, and microbes
• Form to be shared: averaged sensor; hand-collected samples; photographs
• Formats: .xls; .jpg
• Type of data set: 1 file; images
• Size: < 1 MB
• Shared when: after publication

Field: Civil Engineering

• Specific research area: traffic movement
• Form to be shared: cleaned, normalized sensor
• Formats: MySQL; PostgreSQL
• Type of data set: 1 database
• Size: approx. 1000 K/day
• Shared when: 1 month to 1 year embargo

Page 4

Production vs. reuse / wholes and parts

Geobiology

• Data unit: time series (site specific): spreadsheets, microscopy images, annotated digital “field photos”
• User communities: Geobiology, Geology, Chemistry, Microbiology, U.S. Park Service
• Sharing conventions: by request; no repository; mostly post-pub, some unpublished

Volcanology

• Data unit: rock profile: physical rock, thin section, chemical analysis, photographs, field notes
• User communities: Geology – igneous petrology, Geophysics, Geochemistry
• Sharing conventions: by request; no repository

Soil ecology

• Data unit: database: multiple abiotic soil measures, associated metadata
• User communities: Biochemistry, Earthworm ecology
• Sharing conventions: public resource collection

Sensor science

• Data unit: database: soil data, sensor data
• User communities: Network Science, Computer Science
• Sharing conventions: reference data industry; limits – customization, “vertical” dev.

Page 5

Far from collective, shared data infrastructure

Curation of functional groupings:

• Scholarly record of data collected and analyzed
• Unit for long-term preservation
• Organization for retrieval
• Raw material for future research

Exposing data is very different from supplying it by request. Complex mis-use concerns:

• Misinterpretation – presumed problems
• Misappropriation – actual premature re-use
• Disregard of good faith practices – how used, what referenced