The Corruption of a Text: Looking at George Herbert's The Temple in Digitized Repositories


DESCRIPTION

When mass digitization of books started in the mid-2000s, several libraries and researchers saw this as a boon to general research. After almost a decade of working with these texts it is important to examine the textual integrity of the files that are provided through services like Google Play and The HathiTrust. This paper will explore the quality of the plain text files of George Herbert’s "The Temple: Sacred Poems and Private Ejaculations" available through these online repositories when compared to the original manuscript. There are currently seven editions of the book listed on Google Books as free eBooks and about 15 versions available at HathiTrust.org. "The Temple" was chosen as it includes concrete poetry and has a varied publishing history. The text, across its varied editions, will give a clear indication of how well these digital repositories provide access points to the text beyond the basic scan of the physical page. The paper concludes with a discussion of the limitations of the texts that were found to have been included in the mass digitization programs.


The Corruption of a Text: Looking at George Herbert’s The Temple in Digitized Repositories

INQUIRY

What is the quality and availability of a text through online bookstores and repositories after more than two decades of digitization projects?

BACKGROUND

Digitization Projects

Brief History- Markup

1971- Project Gutenberg

1976- Oxford Text Archive

1982- Perseus Digital Library

1987- Text Encoding Initiative

1988- Women Writers Project

1992- University of Virginia Etext Center

Brief History- Mass Digitization

Open Content Alliance

Universal Digital Library

Library of Congress American Memory Project

Google Books

HathiTrust

The Internet Archive

Google Library Project

Started in 2002.

In 2004 partnered with Harvard, University of Michigan, The New York Public Library, Oxford University, and Stanford.

In 2006 partnered with University of California, University Complutense of Madrid, University of Wisconsin-Madison, and University of Virginia.

http://books.google.com/

HathiTrust

Founded in 2008.

Has over 90 member organizations.

Collective repository that includes content digitized by the Google Books project and the Internet Archive.

12 million items, 4.7 million in the public domain.

http://www.hathitrust.org/

The Internet Archive

Founded in 1996 in San Francisco.

Archives text, audio, moving images, archived web pages and software, including arcade games.

Has 7 million texts currently archived.

Provides specialized services for adaptive reading and information access for persons with disabilities.

https://archive.org/

Production of Digitized Content

Library identifies content for digitization.

Google digitizes books.

Library will review samples against benchmarking guidelines and confer with Google.

Digital copy will be a set of images and OCR files plus metadata.

Finished copies reside with library and Google for use as outlined in agreement.

Based on the 2005 Agreement between University of Michigan & Google

Finger spam from http://fonerbooks.blogspot.com/2009/09/are-google-books-on-demand-books-ready.html, last accessed 10/31/14

George Herbert

George Herbert, 1593-1633. Photograph: Tarker/Corbis

The Temple- History

George Herbert (1593-1633) was an Anglican priest and poet.

Original ms. lost around the time Herbert died. By summer 1633 a manuscript emerged from the religious community Little Gidding. It was donated to the Bodleian Library in 1735.

First edition was either based on the Little Gidding manuscript or was developed alongside it by scribes.

Went through several printings where poems were combined and editorial corruption occurred.

The Temple- Structure

The poems within the book are arranged as a map. They reflect both a physical church and the journey of the penitent.

“The stanzaic form, structure, and metrics of the individual poems within The Temple have a major significance for meaning within each poem and within the sequence… Each step and every building block is a poem of the sequence. Thus which poems are included and their arrangement is of prime importance.” -John Shawcross

The Temple- Type of Poems

“His poems exhibit a very wide range of ingenuity and improvisation. There are hidden acrostics, emblematic forms, anagrams, puns, and considerable learning.”

Poems like “The Agonie,” “The Collar,” and “The Storm” rely on irregular indents or hanging lines and a variety of line lengths to express the disorder or agitation that motivates the poem.

-Anthony Hecht, The Essential Herbert
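One way to check whether a plain-text file keeps this kind of layout is to compare the leading whitespace of each line against a trusted transcription. The following is a minimal Python sketch, not the method used in this study; the function names and the placeholder sample lines are hypothetical.

```python
def indent_profile(text):
    """Return the leading-space count of each non-blank line."""
    profile = []
    for line in text.splitlines():
        if line.strip():
            expanded = line.expandtabs(4)  # treat a tab as 4 spaces (assumption)
            profile.append(len(expanded) - len(expanded.lstrip(" ")))
    return profile


def indentation_preserved(reference, candidate):
    """Fraction of reference lines whose indentation matches the candidate.

    Lines are paired by position, so this assumes both texts share the
    same line breaks; a real comparison would need to align lines first.
    """
    ref = indent_profile(reference)
    cand = indent_profile(candidate)
    if not ref:
        return 1.0
    matches = sum(1 for r, c in zip(ref, cand) if r == c)
    return matches / len(ref)


# Hypothetical sample: a shaped stanza and a flattened OCR-style copy.
reference = "first line\n    indented line\n        deeper line\n"
flattened = "first line\nindented line\ndeeper line\n"

print(indentation_preserved(reference, reference))
print(indentation_preserved(reference, flattened))
```

A low score for a poem that depends on indents or hanging lines suggests the text layer has lost the layout even if every word was recognized correctly.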

DISCOVERY

Repositories

45 versions from a search on HathiTrust for title= “The Temple” by author= “Herbert, George, 1593-1633.”

16 versions in The Internet Archive for title= “The Temple” by author= “Herbert, George, 1593-1633.”

369 results for “The Temple Herbert, George, 1593-1633 site:books.google.com.”

Online Bookstore

Amazon: 5 editions. Cost between $0.99 and $9.99.

iBooks: 4 titles total for George Herbert. Cost between $8.99 and $10.99. All 4 titles are also available on Amazon, Barnes and Noble and Google Play.

Barnes and Noble: 20 versions of The Temple available as Nook Books.

Sample OCR text

http://books.google.com/books?id=ilwJAAAAQAAJ&pg=PA36&focus=viewport&dq=editions:-J3ocwe0sHMC&output=text#c_top

http://books.google.com/books?id=ilwJAAAAQAAJ&pg=PA36&focus=viewport&dq=editions:-J3ocwe0sHMC&output=text#c_top

http://books.google.com/books?id=y9cNAAAAQAAJ&pg=PA6&focus=viewport&dq=editions:-J3ocwe0sHMC&output=text#c_top

http://books.google.com/books?id=kdcNAAAAQAAJ&pg=PA38&focus=viewport&dq=editions:-J3ocwe0sHMC&output=text#c_top
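A rough way to quantify how far an OCR text layer like the ones sampled above drifts from a trusted transcription is a character-level similarity ratio. The sketch below uses Python's standard difflib module; the function name and sample strings are hypothetical, and this is not necessarily how the comparison in this study was made.

```python
import difflib


def similarity(reference, ocr_text):
    """Return a 0..1 similarity ratio between two strings.

    Whitespace runs are collapsed first, so this measures character
    accuracy only and deliberately ignores layout.
    """
    ref = " ".join(reference.split())
    ocr = " ".join(ocr_text.split())
    return difflib.SequenceMatcher(None, ref, ocr, autojunk=False).ratio()


# Hypothetical strings: the second imitates a common OCR confusion
# in early printed books (long s read as f).
reference = "sacred poems and private ejaculations"
ocr_text = "facred poems and private ejaculations"

print(similarity(reference, reference))
print(similarity(reference, ocr_text) < 1.0)
```

Because whitespace is collapsed, a high ratio here says nothing about whether a shaped poem kept its form; it only indicates how many characters were recognized correctly.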

The Altar

Nook Ebook

Nook Ebook

Titus Edition Ebook

The Agonie

Nook Ebook

Nook Ebook

Titus Edition Ebook

Easter Wings

Nook Ebook

Nook Ebook

Titus Edition Ebook

Conclusions

Currently only one e-book version comes close to the intent of the author: Titus Book’s Kindle edition of The Temple: Sacred Poems and Private Ejaculations.

Online repositories contain multiple copies of different editions with no clear designations. Repositories host duplicate files from other repositories.

Thank You!

Twitter: Rodzvilla

LinkedIn: http://www.linkedin.com/pub/john-rodzvilla/2/589/a88