Brenda Johnson, Dean of University Libraries Gary Charbonneau, Systems Librarian Julie Bobay, Associate Dean for Collection Development and Scholarly Communication Statewide IT Conference, Indiana University Sept. 27, 2010 HathiTrust: A Big Idea with Bold Plans

Brenda Johnson, Dean of University Libraries

Gary Charbonneau, Systems Librarian

Julie Bobay, Associate Dean for Collection Development and Scholarly Communication

Statewide IT Conference, Indiana University

Sept. 27, 2010

HathiTrust: A Big Idea with Bold Plans

HathiTrust - Outline

A Big Idea
• Mission and Goals; Partners; Governance

Content and Use
• Relationship to Google Books and Internet Archive
• Size, characteristics of content
• A few words about technology

Bold Plans

September 27, 2010, Statewide IT Conference, Indiana University

Importance of A Name

• Hathi (pronounced hah-tee)

Hindi word for elephant, an animal highly regarded for its memory, wisdom, and strength

• Trust

A core value of research libraries and one of their greatest assets. In combination, the words convey the key benefits researchers can expect from a first-of-its-kind shared digital repository

• There’s an elephant in the library.

What is HathiTrust?

• Started in 2008 as a partnership among research libraries, HathiTrust is an open web resource that aggregates, preserves and provides access to the collections of member libraries.

• Initial purpose was to provide a trusted shared repository for books and journals digitized by and available through Google Books and Internet Archive

Google Books/Internet Archive

• In 2004, Google began digitizing the books and journals of many major research libraries in the U.S. – including, starting in 2008, IU’s

• Some libraries, including the University of California, had similar digitization projects with the Internet Archive

• Books and journals digitized from these projects were deposited in HathiTrust

Current HathiTrust Partners: 29 and Counting

Columbia University

Dartmouth College

University of California system (11 libraries)

CIC (Committee on Institutional Cooperation) (12 libraries):
– University of Chicago
– University of Illinois
– Indiana University
– University of Iowa
– University of Michigan
– Michigan State University
– University of Minnesota
– Northwestern University
– Ohio State University
– Pennsylvania State University
– Purdue University
– University of Wisconsin, Madison

New York Public Library

Princeton University

University of Virginia

Yale University

If Google and Internet Archive have these books, why do we need HathiTrust?

HathiTrust’s mission is much broader than simply to replicate Google Books:

Contribute to the common good by collecting, organizing, preserving, communicating, and sharing the record of human knowledge.

Why do we need HathiTrust? (1)

Preservation…For The Long Term

• Better entrusted to research libraries than to a private corporation, even a benevolent one

• Not just preserving bits

• Full preservation program, including active curation, metadata, migration, management plans, etc.

• Seeking TRAC Certification (Trustworthy Repository Audit and Certification)

Why do we need HathiTrust? (2)

Expanded access and discoverability

• Full-text access to pre-1923 books and journals, plus those which have had rights cleared

• Beyond full-text keyword search: enhanced discoverability options

Why do we need HathiTrust? (3)

Focus on scholarly values and needs

• Develop content, access and functionality that meet the needs of researchers

• Share expertise and cost of preserving and providing access to scholarly record among institutions who share this fundamental mission

HathiTrust: Getting Started

• Initial development responsibility: University of Michigan, with mirror site at IUPUI, administered by UITS Enterprise Infrastructure

• Much future development will be distributed among partner institutions under direction of HathiTrust Executive Committee

A Unique Partnership

• HathiTrust is library work at scale; an early example of an “above-campus” service

• A new experiment in collaboration

Not a separate entity; not a 501(c)(3) like Sakai, Kuali, DuraSpace or many open source software projects

Instead, a jointly funded, jointly governed, jointly developed partnership.

• Together, we are HathiTrust.

Sustainability: HathiTrust Governance 2008-2012

• Executive Committee

Budget, finances, decision making

• Strategic Advisory Board

Guidance on policy and planning

• HathiTrust staff

• Working groups and committees

Current Working Groups

• Discovery Interface
• Collections
• Quality
• Communication
• Usability
• Storage
• Development Environment
• Research Center

Financial contributions of partners

HathiTrust Functional Framework

Next steps in governance

• 5-year agreements, reviewed in the third year of every term

• First Constitutional Convention will be in 2012

• Partners will determine governance structures and partnership models, effective 2013

Focus On Users

• Preservation…with access
• Benefits to IU researchers and their colleagues around the world:
– Ensure long-term preservation and access
– Increase discoverability
– Create scholarly tools
– Expand content beyond Google and Internet Archive

HathiTrust – constantly changing

• Rapid growth and development; fluid environment

• Next few slides describe HathiTrust currently

• Will follow with discussion about future plans

HathiTrust - Content

• The vast majority of what is currently in HathiTrust consists of files received from Google from volumes digitized by Google for Google Book Search

• Almost all of the remainder consists of files received from Internet Archive. Much of the content from University of California comes by way of Internet Archive

HathiTrust Content (2)

• Since not all of Google’s “library partners” are members of HathiTrust, and none of Google’s publisher partners are, HathiTrust is still (mostly) a subset of what is in Google Book Search. However….

HathiTrust Content (3)

• Because of HathiTrust’s copyright clearance project, there are some things available in full text in HathiTrust that are only available in “snippet view” in Google.

• Because of Internet Archive, there are probably some things in HathiTrust that are not available in Google at all.

HathiTrust - focus on collections

• HathiTrust is about collections, not simply Google digitization

• For example:
– access for persons with print disabilities
– opening access for public domain volumes
– collection building tool
– high-quality bibliographic data necessary for scholarly work

Content Growth

Content Distribution

Language Distribution (1)

Language Distribution (2)

Dates

Originating Institution

Content Over Time

HathiTrust DataGrid

• Using Isilon Clustered Storage System
• Similar principles to a datagrid using WAFS (OneFS)
– Wide Area File System (2.3 PB per file system)
– Automated data replication among nodes
– Currently Two Nodes
• Ann Arbor – University of Michigan
• Indianapolis – Indiana University NOC
• Connected via I-Light and Michigan Lambda Rail

HathiTrust Grid

Indianapolis Ann Arbor

Isilon OneFS currently supports up to 2.3 PB between two nodes

More on HathiTrust Technology

http://www.hathitrust.org/technology

A Use Case

• IUB scholar needed quick access to a definitive 52-volume set of Voltaire’s work published in late 1800s; deadline approaching

• Had been transferred to the Auxiliary Library Facility

• Available in HathiTrust and Google Books
• Google Books not usable for this scholarly purpose
• Able to do work much more efficiently and quickly in HathiTrust

HathiTrust’s Bold Plans

• We believe the HathiTrust of tomorrow will look very different from the HathiTrust of today

• Google and Internet Archive digitized volumes are just the beginning

• The sky’s the limit (or, more accurately, the combined will and resources of the partnership are the limit)

Vision for the future: More Content

• Current and backlist scholarly monographs
• Born-digital materials
• Some locally-digitized collections
• Some non-book/non-journal resources

…anything that is appropriate for a research library collection AND IS A SHARED PRIORITY FOR PARTNERS

Vision for the future: More Content (2)

• More full-text:

Google Book Settlement - if approved:
– could receive all Google-digitized files to preserve
– could make much more full-text available

• Rights-clearing project - open access to public domain materials

Vision for the Future: More Functionality

• Research tools
– Computational research
– Advanced collection builders
– Advanced discovery
• Expanded quality processes
• Rigorous preservation guarantees
• Defining paths for fair uses
• Tools for shared print collection management

Vision for the Future: Enhanced Discoverability

• Not just keyword searching of full-text
• Highly-functional bibliographic access
- HathiTrust catalog
- Integration into other discovery tools: IUCAT, WorldCat, Discovery Services

HathiTrust and local digital library initiatives

• HathiTrust is a solution for large-scale, shared high-priority needs of partners; currently optimized for digitized monographs and journals

• Partners will identify priorities for content and functionality development

• HathiTrust will not supplant all institutionally-based digital library initiatives

• Local digital library collections and services will still be needed

How Can HathiTrust Make a Difference?

• Future not yet known precisely, but…
• For the first time in history, HathiTrust has:
- defined a large-scale partnership to achieve a large-scale goal
- built the first version of a very large, high-quality shared repository
• Building blocks for ensuring that research collections, print and digital:
- are preserved, curated, highly discoverable and accessible
- retain their research value in a digital platform

Some lessons learned so far

• HathiTrust can serve as shared repository for mass digitized library collections

• HathiTrust can provide organizational structure for other collaborations
– Shared print collection management
– Bibliographic integration

• The research library community is able to collaborate deeply to attain shared goals

HathiTrust Mission - redux

Contribute to the common good by collecting, organizing, preserving, communicating, and sharing the record of human knowledge.

Credits

Our thanks to colleagues who generously granted us permission to use their slides for this presentation:

John Wilkin, HathiTrust Executive Director

Jeremy York, HathiTrust Project Librarian

Heather Christenson, Mass Digitization Project Manager, California Digital Library

Also, many of the ideas for this presentation are based on:

Courant, Paul N. and John Wilkin. “Building ‘Above Campus’ Library Services.” Educause Review, July/August 2010, 74-75.
