Reading Rare Books Online

For many researchers today (whether academic or simply curious), one of the greatest benefits of recent technological progress is the ability to conduct archival research at home, in your pajamas, or at two in the morning. (Or, all three at the same time.) For readers with access, electronic databases including Early English Books Online (EEBO) offer thousands of early and rare printed materials that can be downloaded to a home computer, printed out, consulted in a PDF reader, or marked electronically. I recently read Robert Tofte’s poetry collection Laura (London: 1597) on my iPad, for instance.

The EEBO database consists of thousands of early titles originally published between 1475 and 1700 (the periods covered in the short-title catalogs of Pollard & Redgrave and Wing), which were transferred to microfilm in the 1930s by the University of Michigan and have since been digitized. After a centuries-long journey through manuscript, print, microfilm, and digital media, the text images are sometimes poor in quality and therefore hard to read. Below is an example of the kind of “show-through” you can find in an EEBO document (this is taken from the 1644 edition of John Milton’s The Doctrine and Discipline of Divorce):

Milton, EEBO Text

Despite these occasional shortcomings, EEBO is ultimately an invaluable resource, and it continues to grow. Beginning in 1999, a collaborative effort between ProQuest LLC, the University of Michigan, and Oxford University known as the TCP (Text Creation Partnership) began to key the full texts of first editions in order to make them searchable by keyword. Now in its second phase, the TCP seeks to bring its total to 70,000 titles and includes the collaboration of over 150 libraries. I recently had the pleasure of hearing Martin Mueller speak on EEBO, and I share his enthusiasm for a project that certainly has its “noise,” but that probably promises more good than ill. In fact, it opens a new generation of scholars to the textual and editorial practice that has been mostly taken for granted in the academy for decades. It does matter what editions we read.

And yet. We must temper our enthusiasm, for although EEBO is an invaluable resource, it does not and will not replace archival research. At least, not yet. There are physical aspects of rare books that cannot be fully conveyed through these digitized microfilm copies, such as watermarks, physical dimensions, and bindings, each of which offers important clues about the production, consumption, and circulation of a given book. Additionally, EEBO images (often from copies in the British Library and the Huntington Library) represent a very small sample of the surviving copies of a given publication. Far from being identical, copies of early books often have very subtle differences in terms of press variants and error corrections. Fortunately, scholars and librarians are becoming increasingly aware of the value of retaining “duplicate” copies of early books in the effort to digitize them. Claire Stewart recently pointed me toward this HathiTrust duplicates report, which acknowledges the value of “duplicates” for scholars in certain fields (see p. 6). It’s my belief that the effort to digitize our cultural heritage will lead us back toward the material, the physical, and the artifact, and I’m thinking more about this after reading Bethany Nowviskie’s MLA 2013 paper, published just yesterday.

EEBO is not alone in its home delivery of rare books to readers and researchers. Other projects including Google Books, HathiTrust, and the Internet Archive contain millions of printed books from earlier eras, and in some cases allow readers to download the whole artifact. I want to use the rest of my time here, however, to show some of the potential and limitations of the Internet Archive, mainly in order to call attention to some of its unusual features. Here is what you find when you search for John Milton’s The Doctrine and Discipline of Divorce: a copy of the 1645 pirated edition held by the Boston Public Library. I came across this in November while researching Milton’s pamphlets:

Milton, Internet Archive

The Internet Archive allows you to do with this book some of the same things you can do in EEBO. For instance, you can page through the artifact in its entirety; you can download it to your computer; you can peruse the ASCII text (although EEBO’s TCP project currently only has available first-edition keyed texts, so this one would not be there). However, this online archive also allows you to do some different things that come slightly closer to the archival visit. For instance, the images of the artifact appear in color, as opposed to black-and-white (although you have the choice to download the PDF here in color or in black-and-white). The resolution of the images is not excellent. There is, however, a two-page layout and a page-turning animation effect that you can opt for, which I have found for modern texts in iBooks, but less commonly among early modern digital archives. You can also “play” the book as a slideshow and watch the pages turn rhythmically, one after another. It’s a bit mesmerizing. I admit I’m not sure how useful it is to be able to “play” and “pause” a book like this, though. Below is an image of the “page-turn,” although you have to see it in action to really get the full effect.

Milton, Internet Archive (page turn)

The final aspect of this interface I’ll consider here is perhaps among the most promising, but the least successful. If you press the sound button in the top-right corner, you can hear a simulated female voice read the text. This could be a useful feature, but the OCR rendering of the text is confused by the typography of this early modern book, and systematically garbles the “long s” into an “f” sound. There are other problems with it as well. Olin Bjork and John Rumrich have recently collaborated on a Paradise Lost audiotext, and their work suggests that the visual and the aural can indeed work together productively in a hypertextual archive site. The Internet Archive’s current “iffues” suggest that we still have many years of hard work ahead of us, but we should not abandon the effort on account of the “noise” we will inevitably encounter.