DATA CURATION: CHALLENGES AND OPPORTUNITIES FOR RESEARCH LIBRARIES Brian E. C. Schottlaender The Audrey Geisel University Librarian 26 September 2012 OSU “Library Futures” Seminar 1

DATA CURATION: CHALLENGES AND OPPORTUNITIES FOR RESEARCH LIBRARIES

Brian E. C. Schottlaender
The Audrey Geisel University Librarian

26 September 2012 OSU “Library Futures” Seminar 1

SHOULD I TALK ABOUT …

• … declining:
  – budgets?
  – numbers of staff?
  – transactions?
• … closing branch libraries?
• … “rationalizing” collections?
• … repurposing space?
• … bottom-up strategic planning?
• … moving to a service program–based organizational structure?

NO, I THINK I’LL TALK ABOUT …

DATA CURATION

OVERVIEW

• The Scholarly Record
• Stewardship
• Data Curation
• Why do data need to be curated?
• Why should libraries curate data?
• What should research libraries do?

THE SCHOLARLY RECORD?

The scholarly record is …

“… that which has already been written in all disciplines ... that stable body of graphic information, upon which each discipline bases its discussions, and against which each discipline measures its progress.”

Ross Atkinson. “Text Mutability and Collection Administration.” Library Acquisitions: Practice & Theory, Vol. 14 (1990)

WHAT DOES THE SCHOLARLY RECORD INCLUDE?

• E-only journals
• Reviews
• Preprints and working papers
• Encyclopedias, dictionaries, and annotated content

Nancy L. Maron and K. Kirby Smith. Current Models of Digital Scholarly Communication: Results of an Investigation Conducted by Ithaka for the Association of Research Libraries (November 2008)

[Diagram: “THE SCHOLARLY RECORD.” Labels: Scholarly Publishing (e.g., journal articles); Libraries; Trusted Third Parties (e.g., JSTOR, Portico); Stable.]

WHAT DOES THE SCHOLARLY RECORD INCLUDE?

• E-only journals
• Reviews
• Preprints and working papers
• Encyclopedias, dictionaries, and annotated content
• Data resources

Nancy L. Maron and K. Kirby Smith. Current Models of Digital Scholarly Communication: Results of an Investigation Conducted by Ithaka for the Association of Research Libraries (November 2008)

[Diagram: “THE SCHOLARLY RECORD.” Labels: Scholarly Publishing (e.g., journal articles); Libraries; Trusted Third Parties (e.g., JSTOR, Portico); Stable.]

WHAT DOES THE SCHOLARLY RECORD INCLUDE?

• E-only journals
• Reviews
• Preprints and working papers
• Encyclopedias, dictionaries, and annotated content
• Data resources
• Blogs
• Discussion forums
• Professional and academic hubs

Nancy L. Maron and K. Kirby Smith. Current Models of Digital Scholarly Communication: Results of an Investigation Conducted by Ithaka for the Association of Research Libraries (November 2008)

[Diagram: “THE SCHOLARLY RECORD.” Column headings: INPUTS, OPERATORS, OUTPUTS. Labels: Scholarly Raw Material (e.g., archives, data); Scholarly Publishing (e.g., journal articles); Scholarly Inquiry/Discourse (e.g., blogs, wikis, open notebooks); Archives, Data Centers [Some in Libraries; Some Not]; Libraries; Trusted Third Parties (e.g., JSTOR, Portico); ?????; Infrastructures largely self-contained; Stable; Less Stable; Very unstable, Emergent.]

STEWARDSHIP 1

“Stewardship is a core value that includes notions of mission, responsibility, integrity, trust, accountability, service, preservation and sustainability for future use.” 

Sharon E. Farb. “Libraries, Licensing, and the Challenge of Stewardship.” First Monday, Vol. 11, No. 7 (3 July 2006)

“As a society and as educational institutions, we have a collective responsibility to preserve and make available, along a continuum of a life cycle, our digital heritage.”

Jeffrey L. Horrell. “Converting and Preserving the Scholarly Record: An Overview.” LRTS, Vol. 52, No. 1 (January 2008)

STEWARDSHIP 2

• “There is a need for a close linking between digital data archives, scholarly publications, and associated communication. The potential for an expanded role for research libraries in the area of digital data stewardship affords opportunities to address these important linkages.”

• “Stakeholder groups have different expertise, outlooks, assumptions, and motivations … Collaboration models to share expertise and resources will be critical.”

To Stand the Test of Time: Long-Term Stewardship of Digital Data Sets in Science and Engineering. A Report to the National Science Foundation from the ARL Workshop on New Collaborative Relationships (2006)

STEWARDSHIP 3

• “Historically, universities have played a leadership role in the advancement of knowledge and shouldered substantial responsibility for the long-term preservation of knowledge through their university libraries. An expanded role for some research and academic libraries and universities, along with other partners, in digital data stewardship is a topic for critical debate and affirmation.”

• “The scale of the challenge regarding the stewardship of digital data requires that responsibilities be distributed across multiple entities and partnerships that engage institutions, disciplines, and interdisciplinary domains.”

To Stand the Test of Time … (2006)

DATA CURATION: WHAT IS IT?

“The activity of managing and promoting the use of data from its point of creation, to ensure it is fit for contemporary purpose, and available for discovery and reuse. For dynamic datasets this may mean continuous enrichment or updating to keep it fit for purpose. Higher levels of curation will also involve maintaining links with annotation and other published materials.”

Philip Lord, Alison Macdonald, Liz Lyon, and David Giaretta. “From Data Deluge to Data Curation.” eScience All Hands Meeting 2004 (2004)

DATA CURATION: WHAT’S IT INCLUDE?

• Design
• Creation or Collection
• Processing
• Analysis
• Appraisal
• Selection
• Description
• Discovery
• Dissemination
• Repurposing
• Storage
• Preservation
• Etc.

CURATION MODEL

Panos Constantopoulos, et al. “DCC&U: An Extended Digital Curation Lifecycle Model.” The International Journal of Digital Curation, Issue 1, Vol. 4 (2009)

ACTORS …

“As we move from small to large scale data sharing, where data are managed and maintained for broad access, we also are seeing an increase in the number and type of intermediaries. Intermediaries, in the form of organizations and the people who work for them, prepare data for reuse by eliciting, organizing, storing, packaging and/or preserving data, and by performing various roles in dissemination and facilitation …”

Ixchel M. Faniel and Ann Zimmerman. “Beyond the Data Deluge: A Research Agenda for Large-Scale Data Sharing and Reuse.” The International Journal of Digital Curation, Issue 1, Vol. 6 (2011)

… AND STAKEHOLDERS

• Disciplinary experts
• Functional experts
  – Developers
  – Curators
  – Preservationists
• Users
• Archives
• Data Centers
• Libraries
• Institutions
• Professional Societies
• Publishers
• Governments

THE CURATION ECOSYSTEM 1

[Diagram labels: Systems Providers; Data Providers; Service Providers; Funders; Policy Makers; Data Consumers.]

THE CURATION ECOSYSTEM 2

“… the activities of curation are highly interconnected within a system of systems, including institutional, national, scientific, cultural, and social practices as well as economic and technological systems. Data curation is a nascent set of technologies and practices emerging in the context of this complex and rapidly evolving socio[economic]-technical ecosystem.”

Anna Gold. “Data Curation and Libraries: Short-Term Developments, Long-Term Prospects.” http://digitalcommons.calpoly.edu/cgi/viewcontent.cgi?article=1027&context=lib_dean

WHY DO DATA NEED TO BE CURATED?

• “The more effectively that data can be manipulated, mined, managed, analyzed and served to communities, the better the conduct of science can be supported.”

• “The more we can eliminate boundaries in this exponentially growing sea of data, the better data can be shared enabling multidisciplinary and collaborative research …”

• “The more effectively students and faculty gain the data intensive knowledge and skills, the larger the impact will be on science and society.”

NSF-OCI Task Force on Data and Visualization. Report Draft Final (March 7, 2011)

WHY DO DATA NEED TO BE CURATED?

  – BECAUSE DATA REUSE REQUIRES IT.
• WHY DO DATA NEED TO BE REUSED?
  – BECAUSE TRANS-DOMAIN RESEARCH REQUIRES IT.
• WHY IS TRANS-DOMAIN RESEARCH IMPORTANT?
  – BECAUSE SOLVING GRAND CHALLENGES REQUIRES IT.
• WHY IS SOLVING GRAND CHALLENGES IMPORTANT?
  – BECAUSE THEY AFFECT ALL OF US.

WHY DO DATA NEED TO BE CURATED? 3

BECAUSE THE GOVERNMENT SAYS SO.

WHY SHOULD RESEARCH LIBRARIES CURATE DATA?

• Because we can: “Research libraries, archives, and other stewardship institutions have the capacity to aggregate and hold data, manage metadata, deal with rights management and access, and help users.”

• Because we must: “… uncurated data are as good as lost, even if the bits are stored forever, because they cannot be interpreted correctly.”

• Because, left to their own devices, scientists won’t: “… many if not most scientists focus on the shortest path to a particular scientific result rather than the best long-term solution for data reuse or data-service …”

NSF-OCI Task Force on Data and Visualization. Report Draft Final (March 7, 2011)

WHAT SHOULD RESEARCH LIBRARIES DO?

1. Stop waiting and start proactive engagement locally.
2. Stake a claim in the production cycle.
3. Start retraining and repurposing staff.
4. Be a doer, not a broker, wherever possible.
5. Consider digital curation collaborations.
6. Actualize collaborative engagement.

Tyler Walters and Katherine Skinner. New Roles for New Times: Digital Curation for Preservation. Association of Research Libraries (2011)

WHAT HAVE I DONE?

• Reached out to the San Diego Supercomputer Center (on whose Executive Committee I sit) to co-create the campus’ Research Cyberinfrastructure Initiative (RCI), funded by the Chancellor.

• Leveraged the NDSA-funded Chronopolis Federated Preservation Environment to create a Research Data Curation Services Program.

• Hired a Director, and reallocated portions of two domain specialists and a metadata analyst to her.

• Created Sample Data Management Plans for various NSF Directorates.

• Launched five curation pilots in the Humanities and the Sciences.

• Joined DPN and am preparing to field-test Chronopolis as a DPN data triad.

AND SO …

AN EXAMPLE

AND SO …

CONCLUSION

• Digital scholarly output cannot be de-coupled from the raw material and inquiry operations that generate that output, at least not as easily as analog scholarly output can be.

• It can’t be, it needn’t be, and it shouldn’t be.

• Its stewardship calls for a more expansive view of what constitutes the scholarly record, a view that encompasses more and different inputs, outputs, and stakeholders; and a more distributed and interoperant organizational and technical infrastructure.

QUESTIONS?
