Challenges for libraries in data curation Consol Garcia Librarian, Biblioteca Campus del Baix Llobregat Universitat Politècnica de Catalunya- Barcelona Tech PKP 2011, Berlin

Challenges for libraries in data curation

Consol Garcia
Librarian, Biblioteca Campus del Baix Llobregat
Universitat Politècnica de Catalunya - Barcelona Tech

PKP 2011, Berlin

PKP Conference, Berlin 2011

Challenges for libraries in data curation

Something about data: why data? why open? why share data?

Something about libraries: what we need to do, what challenges we face, opportunities, social network

What's going on? European Research Council, JISC, SURF, Germany

Data, Linked Data, Linked Open Data

✔ Because computationally intensive science is being carried out

✔ This is a data society, due to e-science (new methodology emerging from broadband communications networks, software and infrastructure)

✔ Web of data intended to enable computers to understand the semantics (…)

Open Data / Open Access

Regarding data...

✔ Are we where OA was 10 years ago?
✔ Both share goals
✔ Same stakeholders and work teams
✔ Biomed, Wellcome Trust, scholarly publishing, researchers, ...
✔ Movements that benefit from each other

The Berlin Declaration mentions metadata, raw data and other materials

Sharing data

✔ It's much more than sharing
✔ It's deposit, preservation
✔ It's access, use and reuse

Data life cycle

✔ It's much more than a life span
✔ It's a cycle that, properly managed, will enable access, evaluation and re-use over time

Data Life cycle

Why share data?

Why share data?

✔ To verify data
✔ To retain data integrity
✔ PARSE study (98%): if research is publicly funded, the results should become public property and be properly preserved

Why not share it?

✔ Researchers want to use their results as intellectual capital
✔ Researchers can sell their data
✔ It takes time, effort and money
✔ No data standards within a discipline
✔ Idiosyncratic research practices

What's important? Technical perspective

What's important? Cultural barriers

✔ Scientists must be aware of data management
✔ Changing the culture of science from publications to data
✔ Ensure proper citation (technology will help)
✔ Social tools have great potential to speed up scientific discovery

What needs to happen?

✔ Build an infrastructure
  ✔ (is kind of everybody's problem, and therefore it's nobody's problem, Boyle)
✔ Design good online tools
  ✔ (understand how science works)
✔ Create cultural change
  ✔ Top-down strategy (open access movement)
  ✔ Bottom-up (how to measure contributions)

Arxiv and SPIRES

When will it happen?

✔ When researchers find it useful
✔ When researchers get credit for doing it
✔ When funders require it
✔ When publishers require it or find it useful
✔ When the recommendations and policies are checked

When should it be done?

✔ Before publication
  ✔ Human Genome Project
  ✔ Arxiv.org
✔ After publication
  ✔ NIH
  ✔ PANGAEA

Examples

✔ Protein Data Bank (prepublication)
✔ NCBI: GenBank
✔ Sloan Digital Sky Survey
✔ PLOS and PMC (editors)
✔ Arxiv.org (repositories)
✔ DataONE (Data Observation Network for Earth) under the NSF DataNet programme
✔ PANGAEA
✔ A lot of work has been done, but technical, legal and cultural barriers remain

Where to begin: roadmap

✔ Self-assessment / Data audit
  ✔ Digital Curation Centre checklist
✔ Should provide guidance for different actors:
  ✔ Data producers (quality of data)
  ✔ Data users (fair use)
  ✔ Funding agencies (mandate data)
  ✔ Repositories (storage, preservation of data, DRAMBORA)
✔ On the requirements in the international projects

What's going on?

✔ Carlos Morais Pires: e-infrastructures and scientific data
✔ Some EU projects aimed at environment-related data (ENV 2012.6)
✔ JISC MRD Programme
✔ University of Edinburgh has a policy plan for RDM
✔ PANGAEA, funded by the German Research Council in 2010
✔ Seal of Approval (DANS)
✔ Spain is just beginning

Librarian's role

✔ Increasing volume & types of data
✔ Subject librarian could be the curator
✔ Should have skills in and knowledge of:
  ✔ Scientific research
  ✔ Operating systems
  ✔ Database management systems
  ✔ Scripting languages
✔ Some functions:
  ✔ Streamline submission to databases
  ✔ Automate curation
  ✔ Standardize data
  ✔ Facilitate contributions to annotation
  ✔ Editing and teaching?

Librarian's role

✔ Self-teaching, on-the-job experience
✔ Training at LIS:
  ✔ University of Illinois at Urbana-Champaign
  ✔ Digital Curation Centre DC101
✔ Journals and conferences
✔ Could help with
  ✔ Metadata
  ✔ Copyright
  ✔ Advocacy
  ✔ Archiving and long-term preservation
  ✔ Citing data

Conclusions

✔ Librarians could/will be involved at any stage in the research process and collect the pieces

✔ Deep partnership between library and researchers is necessary

✔ Focus on small-scale solutions

✔ Be aware of metadata schemas and vocabularies within a discipline

✔ Librarians and researchers are still learning how to manage research data

Conclusions

✔ Different approaches depending on:
  ✔ funding agencies
  ✔ subject disciplines
    ✔ Physics
    ✔ Meteorology
    ✔ Astronomy
    ✔ Life science
  ✔ world region
  ✔ age of researchers
✔ 50% of the respondents to the Tenopir et al. survey reported that neither the organization nor the project provides funds to manage data

Conclusions

✔ No data standards within a discipline

✔ Complexity among data objects

✔ Some communities are willing to share, but there is no data center to send the data to

✔ Sometimes the problem lies in the quantity and quality of data

✔ Mandates flexibility

Thank you!