Progress towards the Digital Future
Dr. Gloriana St. Clair
Dean of University Libraries
Digital Libraries Colloquium
February 22, 2006
Purpose
Introduce the NIH Public Access Policy as a major first step toward solving dysfunction in scholarly communications
Compare the NSF-funded Million Book Project with Google Print and the Open Content Alliance
Foster an interchange about these developments relevant to education in Qatar
Scholarly Communications and NIH
Essence of the problem
Promise of open access
NIH Public Access Policy proposal
Current situation
Essence of the Problem
Provost Mark Kamlet at Carnegie Mellon’s Open Access Forum (2004)
Essence of the Problem
Serial & Monograph Costs, 1986-2003*
North American research libraries
*Source: www.arl.org/stats/
Essence of the Problem
Mismatched Motives
            Author                       Publisher
Goal        Wide distribution of work    Increase revenue
Reward      Reputation, P&T, grants      Maximized profit
Strategy    Publish                      Control access & price
What is “Open Access”?
Immediate free availability on the public Internet
Research literature that scholars produce without expectation of payment (e.g., journal articles)
Recognizes that the value of research increases with use
Exploits economics of Internet
An access model, not a business model
What Open Access Can Achieve
Expand information usage and application.
Remove barriers that make content scarce.
Weaken the position of publishers that use their monopoly position to support excessively high prices.
Focus economic return on value addition (rather than content control).
Eliminate systemic inefficiencies by unbundling functions.
Introduce price competition.
Benefits outweigh dislocations.
NIH Public Access Policy Proposal
Elias Zerhouni’s memo urging authors of NIH-funded research to submit final manuscripts to PubMed Central
Why Pennsylvania Mattered
NIH Public Access Proposal began as a rider to NIH appropriations bill (ATA)
Arlen Specter is the chair of the Appropriations Committee
Rick Johnson called me because I have a certain reputation, having left Journal of Academic Librarianship when Elsevier purchased it
Carnegie Mellon provost Mark Kamlet is a medical economist who understands the dysfunction and favors change
Why Pennsylvania Mattered
Carnegie Mellon lobbyist, Maureen McFalls, has a daughter in library school and was a great advocate
Carnegie Mellon Libraries asked for letters from PALCI, PALINET, Pennsylvania Library Association, etc.
ATA and SPARC arranged for Carnegie Mellon, Pitt, Penn, and Penn State provosts to send a joint letter
Provost Kamlet made telephone calls and visits on short notice
Why Pennsylvania Mattered
Provosts’ joint letter supporting expanded access to NIH-funded research
ATA
NIH Policy Fulfills its Promise
Accessible electronic information is more desirable than print
Author goals of wide distribution and reputation are met
Publishers are not put out of business but the public is better served
Health domain has huge popular interest
NIH Policy Problem
Slow uptake
Nature, New England Journal of Medicine
Google in the News, January 2006
Founder Sergey Brin said he believed Google is "doing the right thing" with their work in China:
"We ultimately made a difficult decision, but we felt that by participating there, and making our services more available, even if not to the 100 percent that we ideally would like, that it will be better for Chinese Web users, because ultimately they would get more information, though not quite all of it."1
“This is the day the world changes.”
John Wilkin, University of Michigan3
“… commercialize the great research libraries with a handshake, suddenly and epochally.”
Rory Litwin, in Library Juice2
Advent of Google Print (late 2004)
A Closer Look at the Players …
Compare the NSF-funded Million Book Project with Google Print and the Open Content Alliance (OCA)
Project the impact of partnership: Million Book Project and OCA
Digital v. Paper
Rory Litwin, “On Google’s Monetization of Libraries”4
1. Privacy [cookies]
2. Introduction of commercial bias
3. Questions about democratization and equity of access
4. Disintermediation issues
5. Decontextualization of knowledge
6. Closing of the information commons
In our rapidly changing world, lifelong learning and access to books have become essential to employment, health, peace, and prosperity. Greater public access to information is consistent with the goals of education and democracy.
Million Book Project
Dr. Raj Reddy
MBP Vision
To create online access to all published works …
Searchable, browsable and navigable by humans and machines …
Free-to-read
Instantly available
In any language
At any literacy level
Anywhere in the world
Google Print Vision
To organize the world’s information and make it useful and web-accessible.
“How many users will find, and then buy, books they never could have discovered any other way? How many out-of-print and backlist titles will find new and renewed sales life? How many future authors will make a living through their words solely because the Internet has made it so much easier for a scattered audience to find them?”5
OCA Vision6
To collect all published information, and make it accessible to everyone, no matter where they are in the world
Access to information is a key ingredient to education and an open society
We have the necessary technologies, and the will for an open society … Will we make it happen?
MBP Partners
The National Science Foundation has awarded the Million Book Project four grants for equipment and planning
Partners include government and academic institutions in India, China, and Egypt; academic libraries in the U.S.; OCLC; and the United Nations Food and Agriculture Organization
Newest partner is the Open Content Alliance
Google Print Leader & Partners
Google, Inc.
U. Michigan
Stanford University
Harvard University
U. Oxford
New York Public Library
Brewster Kahle, director and co-founder of the Internet Archive
I can do this
OCA Leaders
OCA Partners
Adobe Systems Incorporated
Biodiversity Heritage Library, a cooperative project of:
  American Museum of Natural History
  Harvard U. Botany Libraries
  Harvard U. Library of the Museum of Comparative Zoology
  Missouri Botanical Garden
  Natural History Museum, London
  The New York Botanical Garden
  Royal Botanic Gardens, Kew
  Smithsonian Institution Libraries
Columbia University
Emory University
European Archive
HP Labs
Johns Hopkins University Libraries
McMaster University
Memorial University of Newfoundland
Million Book Project
Missouri Botanical Garden
MSN
National Archives (United Kingdom)
O'Reilly Media
Prelinger Archives
Research Libraries Group (RLG)
Rice University
Smithsonian Institution Libraries
University of British Columbia
University of California
University of Ottawa
University of Pittsburgh
University of Toronto
University of Virginia
York University
MBP Collections
Books for College Libraries (best books)
University presses and scholarly societies (with copyright permission)
U.N.’s Food and Agriculture Organization content
National Agricultural Library
Academic libraries with agriculture collections
Janet McCue, Cornell University
will coordinate the agriculture collections
Google Print Collections
Stanford – 40,000-volume pilot
Harvard – 40,000-volume pilot from a 15-million-volume collection
U. Michigan – virtually the entire collection, adding seven million volumes to the search engine; Michigan to “receive and own a high quality digital copy” and to provide access7
New York Public Library – a subset of a 20-million-volume collection. Selection criteria: in the public domain (pre-1923), interesting, not too fragile
OCA Collections
Will seed the archive with partners’ collections (below). Will scan U.S. collections in situ at 10¢ per page.
European Archive
Internet Archive
National Archives (UK)
O'Reilly Media
Prelinger Archives
University of California
University of Toronto
MBP Research Initiatives
Machine translation
Massive distributed database
Storage formats
Use of digital libraries
Distribution and sustainability
Security
Search engines
Image processing
Optical Character Recognition (OCR)
Language processing
Copyright laws
Research: Arabic OCR
R&D in Arabic OCR “Million Book Project at Bibliotheca Alexandrina,” by Youssef Eldakar et al.
Journal of the Zhejiang University SCIENCE: Special Proceedings Issue of the 1st Int’l Conference on Universal Digital Library (ICUDL November 2005): p. 1331.
Research: Indian and Chinese OCR and Language Translation
OCA Research Initiatives
… In discussions with major publishers and the organizations that represent them in order to explore legal, sustainable business models through which more copyrighted content can be made widely available. … OCA looks forward to continued dialogue with publishers in order to explore and build solutions that benefit the entire community of Internet users.8
Exploring and/or creating inexpensive digitization techniques 9
Worries
Copyright, Copyright, Copyright
Printing [Good News]
Working with Publishers
Copyright, Copyright, Copyright …
Copyright is the biggest reality that we all face.
MBP Copyright Strategy
Focusing on available out-of-copyright materials (government documents, pre-1923 collections, etc., as well as indigenous cultural treasures …)
Incised palm leaves to be digitized
Saraswathi Mahal Library, India
Google Print Copyright Strategy10
For books in copyright, a Google Print search displays “snippet[s] of text”
A ‘snippet’ is defined as three lines
Search returns three snippets per book and indicates how many times search terms appear
Search also returns bibliographic data about the book, and information on where to buy the book or find it at a local library
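The snippet rules on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Google's implementation: the function name `snippet_search`, the representation of a book as a list of text lines, the case-insensitive substring match, and the choice to center each snippet on a matching line are all hypothetical. Only the limits (three lines per snippet, three snippets per book, a count of term occurrences) come from the slide.

```python
SNIPPET_LINES = 3       # a snippet is three lines (from the slide)
SNIPPETS_PER_BOOK = 3   # at most three snippets are shown per book (from the slide)


def snippet_search(book_lines, term):
    """Return (occurrence_count, snippets) for `term` in one book.

    Hypothetical helper: `book_lines` is a list of text lines and
    matching is a case-insensitive substring test (both assumptions).
    """
    needle = term.lower()
    total = 0
    snippets = []
    last_line_used = -1  # index of the last line already shown in a snippet

    for i, line in enumerate(book_lines):
        hits = line.lower().count(needle)
        if hits == 0:
            continue
        total += hits  # every occurrence is counted, even if not displayed
        # Show a new snippet only while quota remains and this matching
        # line is not already visible in the previous snippet.
        if len(snippets) < SNIPPETS_PER_BOOK and i > last_line_used:
            start = max(0, min(i - 1, len(book_lines) - SNIPPET_LINES))
            end = min(len(book_lines), start + SNIPPET_LINES)
            snippets.append(book_lines[start:end])
            last_line_used = end - 1
    return total, snippets
```

Under these assumptions, the reader sees at most nine lines of any in-copyright book per search, while the occurrence count still reflects the whole text.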
Google Print Strategy Adjustments
“Google said yesterday that it would temporarily halt its program to make searchable, digital copies of the vast contents of three university libraries to give publishers and other copyright holders the chance to opt out of having their protected works copied.”11
OCA Copyright Strategy
The OCA is committed to respecting the copyrights of content owners. … OCA contributors must secure the permission of all concerned copyright holders prior to submitting materials to the OCA for digitization or inclusion in the archive.12
MBP Working with Publishers
Focusing with increasing success on gaining permission from university presses and scholarly societies to digitize books and provide access to searchable full-text.
The MBP approach is to request permission for a range of years, for example, everything published prior to 1990. A publisher can specify the cut-off year or, alternatively, specify the list of titles for which they grant non-exclusive permission to digitize in the MBP.
Google Print Working with Publishers
“We’ve already had great success working with publishers directly to add their works to our index through our Publisher Program, and when we add books with publisher permission, we can offer more information and a much richer user experience.”13
OCA Working with Publishers14
OCA has been in discussions with major publishers and the organizations that represent them in order to explore legal, sustainable business models through which more copyrighted content can be made widely available. O'Reilly Media is one commercial publisher that has already agreed to make certain content available to the OCA.
OCA looks forward to continued dialogue with publishers in order to explore and build solutions that benefit the entire community of Internet users.
2005 MBP Partners’ Meeting
Partners met in Hangzhou, China
Status of Collection Digitization
India   200,000 volumes
China   400,000 volumes
Egypt    20,000 volumes
Total   620,000 volumes
Big New Ideas: Reading Online
“Will people read on screens in the future?”15
People often print long online documents (i.e., users still find hard copy easier to use)
Hardware improvement?
Navigation improvement?
Building on previous user behavior to help guide new users from page to page online?
Big New Ideas: Synthetic Documents
Bypass current copyright restrictions with machine-made synthetic documents that transmit intellectual content.16
Big New Ideas: Big Finish17
Finish the Million Book Project by the next Partners Meeting (fall 2006)
Congregate all the metadata in one place
Conclusions
Future of libraries is digital
Technical progress is substantial
Intellectual property laws continue to be a major barrier
Social reactions deserve more research by librarians
Q & A
Digital future and education in Qatar … What lies ahead?
Thank You
Gloriana St. Clair
And thanks to my generous collaborators:
Mark Kamlet, Provost, Carnegie Mellon
Rick Johnson, Executive Director, SPARC
If you would like an electronic copy of this talk, contact Cindy Carroll, [email protected]
Endnotes
1. Brin, Sergey. Quoted by David Kirkpatrick, “Google Founder Defends China Portal.” Fortune (January 25, 2006). Available: http://money.cnn.com/2006/01/25/news/international/davos_fortune/.
2. Litwin, Rory. “On Google’s Monetization of Libraries.” Library Juice 7, 26 (December 17, 2004). Available: http://www.libr.org/Juice/issues/vol7/LJ_7.26.html#3.
3. Wilkin, John. Quoted in “Google to Scan Books from Major Libraries.” MSNBC Tech News & Reviews (December 14, 2004). Available: http://www.msnbc.msn.com/id/6709342.
4. Litwin.
5. Schmidt, Eric. “Books of Revelation.” The Wall Street Journal (October 18, 2005). Available: http://googleblog.blogspot.com/2005/10/point-of-google-print.html. Mr. Schmidt is Google CEO.
6. Kahle, Brewster. “Towards Universal Access to all Knowledge: Internet Archive.” Journal of the Zhejiang University SCIENCE: Special Proceedings Issue of the 1st International Conference on Universal Digital Library (ICUDL November 2005).
7. University of Michigan (Nancy Connell). “Google/U-M Project Opens the Way to Universal Access to Information.” University of Michigan News Service (December 14, 2004). Available: http://www.umich.edu/news/?Releases/2004/Dec04/library/index.
8. Open Content Alliance. “Open Content Alliance FAQ.” Available: http://www.opencontentalliance.org/faq.html.
9. Kahle.
10. University of Michigan. “Google/U-M Project Questions and Answers.” The University Record Online (January 7, 2005). Available: http://www.umich.edu/~urecord/0405/Dec13_04/lib_qa.shtml.
11. Wyatt, Edward. “Google Library Database is Delayed.” New York Times (August 13, 2005). Available: http://query.nytimes.com/gst/abstract.html?res=F50614FD3E5A0C708DDDA10894DD404482.
12. “Open Content Alliance FAQ.”
13. Smith, Adam. “Discovering Hard-to-find Books.” Available: http://googleblog.blogspot.com/2005/10/discovering-hard-to-find-books.html. Mr. Smith is Senior Business Product Manager for Google Print.
14. “Open Content Alliance FAQ.”
15. Lesk, Michael. “The Qualitative Advantages of Information: Bigger is Better.” Journal of the Zhejiang University SCIENCE: Special Proceedings Issue of the 1st International Conference on Universal Digital Library (ICUDL November 2005): p. 1176.
16. Shamos, Michael I. “Machines as Readers: A Solution to the Copyright Problem.” Journal of the Zhejiang University SCIENCE: Special Proceedings Issue of the 1st International Conference on Universal Digital Library (ICUDL November 2005): pp. 1179-1187.
17. Proposed by N. Balakrishnan, Supercomputer Education & Research Center, Indian Institute of Science (ICUDL, November 2005).