disembodying the past to preserve it

What follows is a keynote I gave at the Digital Preservation 2013 conference on July 23, 2013. If you’re curious, there’s a video up of the talk and the Q & A as well and a pdf of the slides I showed (some of which vary from what I’ve shown here).

“Disembodying the past to preserve it”

I am, as you’ve heard, not someone who focuses on issues of digital preservation. I’m a book historian and performance scholar who works at a cultural heritage organization that is focused on the preservation and exploration of centuries-old objects. I think about the digital and preservation from the perspective of someone who studies the past and seeks new ways to make it accessible to scholars and the public.

So since I spend a lot of time thinking about the history of books and since so many people see the rise of the digital as heralding the end of print, I thought I would start off by looking back at the earliest surviving instances of moveable type in the West. We all know, I think, that the first book printed by Johannes Gutenberg was the bible in 1455. But that wasn’t the earliest instance of print. Gutenberg’s first printed texts were indulgences—short formulaic texts sold by the Church and its deputies to fund various enterprises by promising purchasers they wouldn’t need to spend as much time in purgatory for their sins.

1454 indulgence (Rylands Library)

What we’re looking at is one of the earliest surviving copies of these get-out-of-jail-free cards. Now held at the University of Manchester’s Rylands Library, this indulgence was printed in 1454 and was issued to a specific buyer in 1455 on the 27th of February. (You can see why printed indulgences were so handy—the bulk of the text is the same from one to the next, and small blank spaces can be left to be filled in by hand with the particulars for each sinner.)

There are other copies of the Gutenberg indulgences that have survived. This one is a slightly later issue (it’s the 31-line indulgence, not the 30-line, for those of you who are bibliophiles). Now part of Princeton’s collections, this indulgence was issued on 29 April 1455 in Pfullendorf to Johannes Grosshans—you can just barely make out the fact that there is a manuscript insertion here, but this copy hasn’t survived in nearly as nice shape as the last one we looked at.

It’s astonishing, actually, that any of these indulgences survived. Very few of them did—even though print runs for indulgences were huge, often in the thousands, there are only 50 recorded surviving copies of the 31-line indulgence, and a mere 8 extant copies of the 30-line. When you look at an indulgence, it’s easy to see why they wouldn’t survive in large numbers. They’re just flimsy little things. Compare these two images, the first of someone holding Princeton’s indulgence I mentioned above…

… and this of a bound incunable (also in Princeton’s collections):

The first can be held in one hand (even in its framed state) while the other rests heavily in a chair (don’t try that in your reading room, please!). I think you can guess my point: the Aristotle is big and it’s durable because it’s big. You can’t easily tear or lose this book. But a single sheet of paper? That gets misplaced, it gets accidentally destroyed, it gets forgotten. A light breeze could blow it away if you weren’t paying attention. And once the holder has died? Do you need to hang onto an indulgence as a record of your grandfather’s purchase?

The disposability of indulgences is why they haven’t survived. But it’s also why the ones that survived did. Here’s what I mean: because the indulgences weren’t seen as precious documents to save, they were perfect to reuse for other purposes. And the early indulgences that survived often did so because they were used as waste paper in bindings. Without launching into a lecture on early modern binding structures, I’ll just say that bindings often incorporated paper left over from other projects. Endpapers, spine linings, structural supports—binders needed materials to finish their books. And why would you use good blank paper—paper that could be used for other purposes—when you had scrap paper at hand? And so odds and ends of printed paper were incorporated into the bindings of books:

If we turn back to the images of the indulgences I’ve shown, you’ll see that being treated as disposable is how they survived. The 30-line indulgence now at Manchester was preserved in a binding—you can see the evidence of holes in the corners and the stain left from the leather turn-ins. And Princeton’s copy survived as pastedowns in a binding from the early 1470s. With Cambridge University Library’s copy of Wynkyn de Worde’s 1498 indulgence we see something slightly different: these are indulgences that were never sold and are still in sheet form, preserved in the binding of a bible. This is one of my favorite examples, because it doubles as evidence of something we normally wouldn’t see, the production technique displayed in the unfinished object.

Because it was disposable, it was preserved. It’s not a preservation technique I’d recommend, but it’s worked for more than a few texts.

I’ll let you deal with what this might mean for digital preservation (I know just enough about digital forensics to gather that bits of data cling to other bits of data and that you might be looking to recover someone’s novel only to find that other records of their life are interspersed with it). Instead, I’ll ask what lessons we might learn from this about using digital iterations of material objects.

For starters, it’s worth pointing out that I wouldn’t have been able to give this talk if these objects hadn’t been photographed and shared online. It was because I was looking for images of indulgences for a different talk that I came across these pictures and noticed that they all looked like binder’s waste. Discoverability shouldn’t be news, but it shouldn’t be forgotten either.

The problem that we’re facing, in my world, is that the digital objects we’re producing sometimes lead to wonky discoveries. Here’s one thing that has been bothering me recently: the size of books.

psalms unscaled

Here we have two books of psalms, one printed in Geneva in 1576 (on the left) and one in Florence in 1566 (on the right). They are, to all appearances, the same size.

psalms scaled

But this is how their comparative sizes should be displayed: the Italian psalter is 21 centimeters tall and the Geneva psalter is 13 centimeters, or about the height of a Sharpie. (Projected on a screen, of course, they appear to be significantly larger than a Sharpie, although perhaps on your device’s screen they are significantly smaller—a not unrelated oddity of working with digitizations of material objects: size isn’t stable.)
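For anyone who wants to reproduce this kind of adjustment, here is a rough sketch of the arithmetic in Python. It is an illustration under stated assumptions, not a record of how the slides were made: the 21 cm and 13 cm heights come from the psalters above, but the pixel heights, the dictionary keys, and the `scale_factors` function are all hypothetical. The idea is simply to pick one book as the standard and resize every other image so that a centimeter of book takes up the same number of pixels everywhere.

```python
def scale_factors(books, standard):
    """Return the factor by which to resize each image so that every image
    shows the same number of pixels per centimeter as the standard book."""
    ref = books[standard]
    ref_px_per_cm = ref["pixel_height"] / ref["height_cm"]
    factors = {}
    for name, book in books.items():
        px_per_cm = book["pixel_height"] / book["height_cm"]
        factors[name] = ref_px_per_cm / px_per_cm
    return factors


# Physical heights are from the talk; the pixel heights are made up
# (both images are assumed to be displayed at the same height on screen).
books = {
    "florence_1566": {"height_cm": 21, "pixel_height": 1000},
    "geneva_1576": {"height_cm": 13, "pixel_height": 1000},
}

factors = scale_factors(books, standard="florence_1566")
for name, factor in factors.items():
    new_height = round(books[name]["pixel_height"] * factor)
    print(f"{name}: resize to {factor:.0%} -> {new_height} px tall")
```

With these made-up pixel heights, the Florence psalter stays as it is and the Geneva psalter shrinks to 13/21 of its displayed height, or about 62%.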

Here we see a collection of books as we would see them in the Folger’s digital image collection, displayed side-by-side:

sizeofbooks unscaled

Here are those same images shown in relation to each other—I arbitrarily chose one book as my standard, and then calculated the scale and adjusted the images from there:

sizeofbooks scaled only

This slide does a much better job of conveying the relative size of these books. But it’s a rotten way of browsing through a large collection of images if you’re at all interested in any feature other than size. In other words, if you want to treat these images as books—as objects that you hold in your hand and read—then you’re going to be dissatisfied. They’re always going to be digital surrogates (a phrase I hate), lacking the primacy of the original.

But what if we took the disembodied aspect of these images of books as an opportunity rather than a failure?

Here’s a fun fact about early printing that is all about its material process: many printed works are illustrated with woodcuts, images that are literally made from blocks of wood.

(I just want to make the aside that it’s pretty effing amazing that the Folger has the piece of wood that made that exact print—the other amazing thing is that on the other side of this piece of wood is carved another woodcut! Just like you’d reuse scrap paper in your bindings, you’d repurpose pieces of wood.) In any case, one of the results of illustrations being made from blocks of wood is that the blocks of wood could be reused to print illustrations in different works.

The Broadside Ballads Online project at the Bodleian has taken this fact and combined it with image search technology to begin to explore how images in Renaissance ballads are used and reused. Alexandra Franklin did some excellent work with this, starting with noticing a distinctive hat used in Unconstant Phillis, a late 17th-century lament by a shepherd about the woman he loves:


Using image match, Franklin searched across their ballad collection for other instances of the hat, pulling up 8 hits, including this one from The Noble Gallant.


What’s particularly fun about this reuse is that we see that although the hat is the same, the man wearing the hat is not!

Why might this be a useful discovery? Tracing the use of a woodblock across multiple printings and multiple works can help date printing; it can also help us think about iconography and shifting discourses. For me, this is also a useful discovery for the way it turns material objects into digital ones that can be dismembered and rearranged. It’s not strictly necessary to use digital tools to do this sort of image-hunting work: Ruth Luborsky and Elizabeth Ingram compiled their Guide to English Illustrated Books without the use of image matching technology. But it’s certainly much easier to do it with bits than with books.

What can digitization offer that material objects cannot? Tools to reshape objects that would break under physical pressure. The work done on The Great Parchment Book by the University College London Centre for Digital Humanities is the most recent and exciting example of what those possibilities are. The Great Parchment Book is a survey compiled in 1639 of all those estates in Derry managed by the City of London through the Irish Society and the City of London livery companies. It’s a remarkable set of records. But it’s also a collection of 165 leaves that were badly damaged in a fire in 1786. Through careful preservation, about 50% of the text was recovered, but the brittle, wrinkled parchment remained an intractable obstacle to further work. After extensive physical preservation work on the manuscript and detailed imaging, however, the UCL team was able to virtually unwrinkle the pages (read about the preservation and digitization processes). About 90% of the text of the Great Parchment Book is now readable, and available for examination online as images of the leaves, enhanced images, or a transcription of the text.

In both of these cases, digitization makes available objects for study that would otherwise be restricted, either because they’re too fragile to handle or they’re too dispersed to work with. For someone invested in cultural heritage, these are remarkable accomplishments. We can’t study the past if we can’t access its records and artifacts.

But both of these projects are ones that require significant investments of time, money, and people. They’re not lightweight experimentations—you need high-resolution images, you need expertise in image manipulation, you need the physical objects at hand.

I want to end with a look at something that is the opposite of all this, something that builds off of what has already been done, publicizing and redeploying images without adding to them or, indeed, displaying them.

The Library of Aleph is a twitter account that tweets the captions of prints and photographs in the Library of Congress’s digital collections. The tweets are nothing more than the captions—no images themselves, no links to them. Just the captions, with occasional reminders that anyone can find these images by searching the Library of Congress. Here’s one tweet: “House burning during Groveland reign of terror—Negroes driven from homes throughout area.” Here’s a screenshot of the corresponding record:

screenshot of the Library of Congress record for “House burning during Groveland reign of terror”

The Library of Aleph’s tweetstream the day after the verdict in George Zimmerman’s trial was announced was a relentless account of the history of African-Americans, from slavery through Jim Crow through the Civil Rights Movement. The person who created The Library of Aleph hadn’t created it for this purpose—it was really an account he put together to tweet out some of the interesting images he was finding without cluttering up his main account. But in his anger after the verdict, it became a platform for remembering and reliving our past.

I bring it up here because of this paradox: what makes the tweets so powerful is that they are disconnected from the material object they’re referencing. They’re just captions. We might gloss over images but I think we pause over these. What are we reading? Who wrote the captions? What does it mean to choose these words to describe these images?

I love the way @libraryofaleph connects the past to the present and the present to the past. Things that speak to us today can speak to what spoke to us in the past, and digital technologies can bring them together. But what I really take out of this in terms of what cultural heritage organizations can do with digital tools to preserve our past is that this is an account that came not from the Library of Congress, but from an unaffiliated user. The Library of Congress did all the hard work in collecting these works, in digitizing them, in creating their metadata, in making them discoverable, and then in making it all open so that somebody else could do something powerful with it.

And it’s that that cultural organizations need to think about in the use of the digital objects we are creating. We need to make them open so that other people can do things with them that it would never occur to us to do ourselves. Preserve your data, create your metadata carefully, and then release it. Make it open so that it can be used, so that we can learn from it, and so that it can continue to be discovered by future users.

What do we want from online facsimiles of Shakespeare?

Over at The Collation last week, I wrote a blog post providing a quick explanation for what might be gained from looking at multiple copies of digital facsimiles of the First Folio and linking to the eight copies I’ve found. Mostly what I was interested in there was the availability of such things and a taste of the joys of copy-specific reading. Here I want to look at what actually matters to me a bit more: the usability of such resources. It should be perfectly clear, but I’m going to say it anyway, just to be safe. This is my personal site and I am not representing the Folger’s point of view here, only my own as a user of such resources.

Before I look at specific examples, here’s what I want as a scholar:

  • high-resolution cover-to-cover images, zoomable to, say, larger-than-life size with full clarity so that I can pick out details on pieces of type;
  • choice between viewing as pages and viewing as a book with two-page spreads that, ideally, convey the depth of the book and the shifting balance of pages as you move through it so that I know where in the volume I am;
  • navigation synced to plays (with modern acts and scene divisions), to signature marks, and to page numbers so that I can easily find my way to wherever I want;
  • cataloging information that tells me something about the copy I’m looking at; at a minimum, shelfmark and identification of which pages are not original F1 leaves, but preferably including information about provenance, binding, marginalia, uncorrected pages, and other copy-specific details;
  • cataloging information that tells me something about the digital surrogate I’m looking at; at a minimum, when it was built and who built it;
  • a CC-NC or, even better, a CC-BY license that will allow for downloading and reusing at a minimum specific images and, preferably, the entire work, so that I can share it in my teaching and scholarship and so that I can compare multiple copies;
  • and since I’m dreaming here, the ability to read offline in a friendly interface so that I can access it even when I’m (gasp!) without a good internet connection;
  • and the ability to add my own annotations, so that I can keep track of what I’m finding.

Well. That’s not asking for much. Maybe someday all my desires will be met, but it certainly hasn’t happened yet. And it won’t happen unless we start advocating for what we need in these resources. (My focus here is on the First Folio, but my points hold for any digitized book and, to a slightly lesser degree, to any digitized textual work. We all need transparent records, digital copies that are available to reuse, and high-resolution images that convey not only the words on the item but the physical manifestation of that item.)

So in the spirit of helpful critique, here’s how the 8 copies currently available for free online stack up. (N.b. I’ve linked to catalog records where I can find them in the headlines. West numbers refer to the number given the copies in Anthony James West’s census of First Folios. Full descriptions of all these copies can be found in West and Rasmussen’s Descriptive Catalogue; quick takes on which copies have leaves missing or in uncorrected states can be found at the end of my Collation post.)

tl;dr I imagine many of you won’t read all the way through this 4,000-word post, so here’s the unsurprising upshot: digital projects don’t age well. For the First Folio, that means that what used to be cutting-edge in terms of quality and interface can, 5 to 10 years later, be woefully behind the times in offering what users want. We need to plan ahead so that we can offer users the tools they want now while also building in reuse for future users. That means, above all, super high-quality images, open access, clear documentation, and constant exploration.

Folger Shakespeare Library, copy no. 68 (West 126)

Available as page spreads in the Folger’s digital image collection, as page spreads and as a pdf through the World Digital Library; downloadable as individual page openings and as a pdf.

This tends to be my go-to copy of the First Folio when I need to check what’s in it and when I need to zoom in for good details. I like this copy in part because it’s a complete copy—all the leaves are original F1 leaves, rather than facsimiles or replacements from other copies or later editions—and it has some nice manuscript markings setting off some passages. (I always like reminders that books aren’t pristine collectibles but objects to be used.) I also like the Folger’s digitization of it. It’s cover-to-cover, so you can see the Wodehouse bookplate in the front, and the full-page spreads show the depth of the volume as well: in this image from Measure for Measure, you can easily spot from the visible page edges that we’re still near the front of the volume.

Folger STC 22273 Fo.1 no.68, sig. F4v-F5r

I like, too, that the images are high-resolution enough that you can really zoom in and see details. Here is a screenshot of the most zoomed-in view of the same opening (click on the image to enlarge):

screenshot of zoomed-in view of Folger’s copy 68, including the navigation tool in the lower right

I also like that it’s easy to download individual images from the Folger’s Luna interface; the full-page spread that you see above is the maximum size that is allowed for downloading; click on it and you’ll see how big it is. And it’s easy to link to individual openings (see the examples that I’ve included above). If you’re a more adept Luna user (or if you’ve read Jim Kuhn’s tooltip posts in The Collation), you’ll find that you can link to zoomed-in views or to views of multiple images, too (see, for instance, this comparison of two copies). Information from the Folger’s incredibly thorough catalog records is included in Luna, so it’s easy to understand what you’re looking at without leaving the image; information is also provided identifying the image, including the signature marks (but not, however, page numbers or modern act/scene divisions). There isn’t any obvious information about when the images were taken or anything else about the dates or tools involved in the production of the digital images and interface.

But what I don’t like about using this also comes from the Luna interface. It’s not particularly easy or intuitive to quickly find your way to specific places in the volume. If you’re looking for something in King Lear and you know your way through the First Folio already, your instinct will be to scroll through the thumbnails until you find the beginning of the play. But that gets annoying. When I was looking for a page that included some of the marginalia I mentioned, the easiest way to do it was to flip through the pdf I downloaded from the World Digital Library, find the opening that looked good, and then locate that in Luna by skimming the thumbnails. The pdf is not very high resolution in and of itself, so while it’s fine for full-page spreads, if you try to zoom in to see details, it blurs out pretty quickly. Here’s a rough equivalent to the zoomed-in view from the online version, this one from the pdf (it’s about 200%):

screenshot of zoomed-in detail of copy 68 pdf

So I switch back and forth between the two when I’m trying to locate the details of something specific. It’s also worth mentioning, speaking of ease of use, that the pdf file has most of the page openings rotated 180°. The best option is to download it and then use a tool like PDF Toolkit to rotate all but the first and last two pages of the file.
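If you would rather not count pages by hand before doing that rotation, a few lines of Python can work out which pages need turning. This is only a sketch: the `pages_to_rotate` function and its parameters are hypothetical, the page count in the example is made up, and the defaults assume that “all but the first and last two pages” means the first page and the last two pages (change `keep_first` to 2 if you read it the other way). The actual rotating is still left to whatever PDF tool you prefer.

```python
def pages_to_rotate(total_pages, keep_first=1, keep_last=2):
    """Return the 1-based page numbers that need a 180° turn, skipping
    `keep_first` pages at the front and `keep_last` pages at the back."""
    if keep_first + keep_last >= total_pages:
        return []  # nothing left in the middle to rotate
    return list(range(keep_first + 1, total_pages - keep_last + 1))


# A made-up 10-page file: pages 2 through 8 would be rotated.
print(pages_to_rotate(10))
```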

If you’re looking for a specific play, it’s possible to construct a search to just pull up, say, Measure for Measure in copy 68. Use “Advanced Search” to search “Call Number” for “STC 22273 Fo.1 no.68” and “Image Details” for “Measure for Measure”; when you get the results, sort them by “Multiple Page Sort Order” and *phew* you’ve got your play. There used to be links on the Folger’s website for early copies of the plays, including in F1 and relevant quartos, and my understanding is those should be restored soon (I’ll update this with the relevant link when that happens).

The World Digital Library’s interface is built around the same images that the Folger provides through Luna. It seems like it should be slightly easier to navigate (instead of going through thumbnails, you can click arrows to turn to the next or previous opening), but I find the transitions between the images to be so slow as to be more frustrating than helpful.

The Folger has licensed this, as with the rest of the items in its digital image collection, with a CC-NC license.

Folger Shakespeare Library, copy no. 5 (West 63)

Available as page spreads in the Folger’s Digital Image Collection, as a pdf from Octavo editions, and as page spreads through the Rare Book Room; downloadable as individual page openings and as a pdf.

This is my go-to pdf, though not my go-to online digital copy. It’s a good copy of the First Folio—only a few original leaves missing—and was one of the first really high-resolution digitizations of the First Folio that was easily available. Octavo chose it as the basis of their edition of the First Folio, published on CD-ROM in, I think, 2001. That edition, available for purchase but also freely through the Folger’s catalog record, is rich not only for the nice digitization of the First Folio, but for the incredible contextual material accompanying it, including essays by Arthur Freeman, Stephen Orgel, and A.R. Braunmuller, as well as a copy of Peter Blayney’s amazing booklet on the First Folio. It’s a wealth of information. (The pdf file at the Folger has been corrupted; at the moment it has the contextual material, but not F1 or Blayney. I think it will be replaced with a better file soon; I’ll update this when that happens.) If I had to recommend one digital First Folio to an interested non-specialist, the Octavo pdf is what I’d choose, given the strengths of its digitization, its ease of navigation, and its fabulous contextual material.

The pdf is easy to navigate—you can use the arrows to move back and forth, of course, but each play and act/scene is bookmarked so that you can jump straight to it. And the resolution is ok, maybe slightly better than the WDL pdf of copy 68:

screenshot of zoomed-in detail of copy 5 pdf

The digital copy in Luna has the same interface advantages (cover-to-cover! full-page openings showing depth!) and disadvantages (argh, navigation!) of copy 68, but you can’t zoom in to quite the same level of detail in copy 5:

screenshot of zoomed-in detail of Folger copy 5

The images through the Rare Book Room are the same as in Luna, but with a different interface. I like the ability to use arrows to navigate through the book, but it doesn’t let you zoom in very much.

Bodleian Library, University of Oxford, Arch. G c.7 (West 31)

Available online in page spreads, downloadable as individual page images, and eventually as a pdf (I’m positive I saw something about this latter point somewhere, but now I’m not finding my steps back to it; I hope it’s actually true and not something I totally hallucinated).

This is the most recent addition to the collection of digitized First Folios; you can learn more about the publicly funded campaign to digitize it on the project’s blog, “Sprint for Shakespeare.” It’s also a (kind of) infamous copy of the First Folio. The Bodleian acquired it in 1624 and then—gasp!—subsequently got rid of it, apparently in 1664 when the Third Folio (which it purchased) made the First “obsolete.” The Bodleian’s copy of the First reappeared at the University in 1905, when Gladwyn Turbutt (an Oxford undergraduate whose family had owned the book since the early eighteenth century) brought it in for advice about the binding. It was immediately recognized as the long-gone Bodleian deposit copy and the Turbutt family delayed its sale to the highest bidder (*cough cough Henry Folger*) to give the Bodleian a chance to raise the money to purchase it, which it did. What makes this copy interesting is not only the Bod’s foolishly getting rid of it and its spectacular return, but the fact that the book remained in its original binding and shows all the wear-and-tear of its usage.

The online interface of this copy’s digitization comes in two options. The first is through the BookReader interface developed by the Internet Archive. It lets you navigate the book either through thumbnail images or in one-page or two-page spreads. There are bookmarks to let you easily jump to specific acts and scenes in specific plays, and in the two-page opening view, you can easily see a visualization of page edges to show where you are in the volume and to flip ahead to specific pages:

our favorite opening in Measure for Measure, as in the Bodleian’s interface

I generally find this a good way to navigate a book—it’s easy to work out where you are and to get to where you want to be. The fake visualization of the fake edges is a little weird, though, and while I love the pop-up that shows you where your mouse is when it’s jumping ahead, I couldn’t actually get the book to jump when I clicked on the margins. In this particular incarnation, however, I’m disturbed by the gutter issues. Since many of the images of individual pages include a glimpse across the gutter of the opposite page (not a bad thing for an individual image), when they’re stitched together into a page opening, the weirdness across the gutter is, well, weird.

On the other hand, working with the individual page images is super easy. There’s a link that takes you to pages of thumbnails, each of which is identified both by signature mark and by play title and page number; the image below, for example, is labeled “F4v / MM p.68.”

downloadable image from the Bodleian’s copy

It’s a good resolution (click on the image above to be able to enlarge it to its full size), and I like the touch of including the ruler to indicate the leaf’s size. It’s also wonderful that the Bodleian has released this under a CC-BY license and clearly indicates what that means in terms of usage. They have a great statement on the site’s accessibility, too (it looks like that’s something required by Oxford; I wish more places would keep accessibility in mind as they are designing their sites). I wish the site was more clearly linked to the cataloging information (I got to the catalog record by navigating through the link to the Bodleian’s main website in the upper right corner of the site and then searching in the catalog for the shelfmark). And I haven’t figured out an obvious way to be able to link to specific openings in the book view, although you can link to individual page images (check out the front pastedown, for instance!).

The Bodleian’s biggest strength is combining ease of use for non-scholars and for scholars. By using the Internet Archive interface, they’ve made it friendly for general browsers to look through the book. (As Pip Willcox said in a tweet to me, given that the public funded the project, the Library felt strongly that it had to be fully open-access—and, obviously, that it had to be friendly to use.) And by separating the book into individual page images, they’ve made it usable for scholars like me. That, I think, is key to successful digital First Folios—they need to work not only for the finicky experts but for the general user. And I think the Bodleian has mostly achieved that, unlike most of the other digital copies.

Meisei University (West 201)

Available online as individual page images with transcriptions of marginalia; downloadable as individual page images. Akihiro Yamada’s book on the marginalia and his transcriptions is available in its entirety as well.

This is a fascinating copy of the First Folio, and the most interesting copy of the ones that have been put online. It has extensive early modern marginalia dating, according to Akihiro Yamada’s study of it, from the 1620s or 1630s in a Scottish hand. There are underlinings, dots, and marginal notes focusing primarily on summarizing the play. The annotations are understandably the focus of this interface; it’s really not designed to read the play easily, but does offer a range of ways into the marginalia. You can navigate to individual pages by choosing the play and either act/scene/lines or through-line numbers; you can also navigate by signature marks or by image number. Once you’re at a page, you can then use the arrows to browse to the next or the previous image. It’s not instantly obvious—the landing page is a black screen instructing you to “Please Search Page Image” rather than the first leaf of the volume. But once you’re on an image, you have the option of enlarging the page (the initial result is thumbnail sized), enlarging the marginal notes, and reading a transcription of the note.

Here, for example, is a screenshot of a page from Measure for Measure, enlarged to its largest size, along with a detail of the marginalia and its transcription:

screenshot of Meisei copy showing marginalia and its transcription

As you can tell, the page itself doesn’t enlarge particularly well, though the details of the marginal notes are a bit better. You can also download a page image:

downloaded page image from Meisei copy (© Meisei University)

It’s pretty small, but its resolution is okay (as elsewhere, click to enlarge to its full size). Their copyright page states that “All rights reserved; no part of this database may be reproduced or reprinted in any form, except for non-profit-making, educational or scholarly use. In such cases, please cite the copyright of Meisei University and write to the contact address.”

As I said, though, the point of this digitization isn’t to read the First Folio, it’s to study its marginalia. So while you can search by page image, you can also search the marginalia, using either the “lexicon” option (words that appear in Yamada’s index) or the “concordance” option (words from the entire marginalia corpus):

searching the Meisei marginalia corpus

I don’t really have a project that would use this, but I like that they’ve enabled it. The (comparatively) low resolution and the sometimes not-intuitive interface are likely residues of the project’s age: it began in 2002 and ended in 2008. Quibbles about that aside, it’s wonderful that Meisei, which bought the book in 1980, has taken these steps to make its riches explorable.

Since I’ve already gone on long enough, and am feeling a bit burnt out, my examinations of the remaining online First Folios are going to be speedier.

University of Pennsylvania, Furness Library (West 180)

Available as individual page images and possible to save as jpegs. Penn’s interface makes it easy to navigate the First Folio by play or by page number. It’s also possible to compare F1 to other works if you follow the “select a text for comparison” option in the upper left—if you’re interested in Hamlet, there’s a lot of goodies in there, and it’s nicely displayed side-by-side. Information about the specifics of the copy or of the image isn’t obvious. (I happen to know that this interface was put together in the mid-1990s, since I had grad school friends who worked on it.) The book begins not with its cover (although you can catch glimpses of it and page edges here) but with the blank recto of Ben Jonson’s poem, “To the reader” (sig. ΠA1r). (Penn also has a copy of the Peter Blayney booklet I mentioned above, which is handy to know if it’s not available through the Octavo pdf.)

Penn’s First Folio, with a screenshot of our Measure for Measure page in its largest state

Brandeis (West 153) and New South Wales (West 192)

I’ve grouped these together because they are both available at Internet Shakespeare Editions as individual page images that can be saved as jpegs. The Brandeis copy is also available through Perseus, though I find that interface doesn’t have much to recommend it over the ISE. Neither one of these is a great facsimile (the NSW is oddly pink and there’s a lot of bleed-through in the Brandeis—shooting with a black page behind the leaf would have helped with that, wouldn’t it?), but the interface is easy to navigate. I particularly like how if you’re looking at a specific place in one copy, you can jump straight to that location in the other copy. There’s also a nice “compare” feature, just as in the Penn above, but here with more options.

MM in Brandeis copy

MM in New South Wales copy

Miami University (West 174)

Available online as individual page images that can be saved as jpegs. I love that Miami has done this but it’s very hard to use and the quality of the digitization, by current standards, is not good. It’s easy to initially find your way through the book to a specific play, as you can see from the screenshot below, and it generates a stable URL to bring you back to a specific page (here’s my MM example).

the navigation interface for Miami’s copy

Once you’re at the page you want, you can zoom in pretty deeply (the pull-down menu says “100%”) but it’s actually larger-than-life-size, as you can tell if you click on the image below to see its full size.

detail of Miami’s MM

I’m not a big fan of the quality of the image (bleed-through is a lot more distracting in images than it is in real life), but it’s more than usable. What I find really difficult is that once you’ve zoomed into the image, you lose all the navigation tools—the sidebar menus that let you move to different pages completely disappear. Even if you return to a smaller size (the zoom level pull-down is still at the top of the screen), that doesn’t return the sidebars. Perhaps there’s a metaphor in here, about not being able to see the forest for the trees.

A final note, for those of you who actually read all the way through this: I am delighted that all these copies exist and that all these institutions have made them openly available. Where I offer criticisms it’s in the spirit of love and improvement. As I said above, it’s amazing how quickly these projects age: ones built just a decade ago look impossibly old-fashioned and not up to snuff. By looking at how they all stack up, I hope anyone planning to digitize copies of books thinks not only about how they’re being used but also about how they can be remade so that they continue to be used.

where material book culture meets digital humanities

Below is the text from a talk I gave at the Geographies of Desire conference, held at the University of Maryland on April 27-28. Almost everything that I said there is something that I’ve said here before, so faithful readers won’t find much that’s new. But I promised I’d stick it up here, so here it is! If you’re simply looking for the set of links to the resources I mentioned, you can find those on Pinboard. I haven’t included all of my slides here, but you can find those here. I haven’t included all my ad-libbing either, but you would have had to have been there for that.

“Where material book culture meets digital humanities”

Discussions about early modern books and digital tools have tended to focus on one of two responses. One of the first things that people focus on is the amazing access that digital tools have given us to early modern works. Instead of schlepping from library to library across the globe—a series of journeys that many scholars could not easily afford—we can access nearly all extant early modern printed English books, and many continental ones, from our desktops. Thanks to EEBO (Early English Books Online), ECCO (Eighteenth Century Collections Online), and Gallica (the digital collection of the Bibliothèque nationale), among others, digital facsimiles are available for us to consult and download entire works from the early modern printed world.

There are limitations, of course. One is the quality of the images. EEBO consists of digital facsimiles not of early books, but of microfilms of early books. As a result, it doesn’t always capture what we might want it to. Here we see an image from EEBO of the second quarto of Hamlet.

opening from a Folger Q2 Hamlet, as in EEBO

You can see one column of text on each page, along with a whole bunch of other junk. Here’s the same page opening from the Folger’s reproduction of that book:

same opening, same copy, in a high resolution image from the Folger

There’s still ink bleeding through from the other sides of these leaves, but it’s a bit easier to sort out what’s what.

Then there’s this, another image of not-quite visible ink mixed in on the page:

opening from a 1557 Primer, as in EEBO

But this is an instance of red ink not reproducing clearly.

same opening, in a high-resolution image from the Folger

And because the red isn’t visible, you miss in the EEBO copy what’s really a great mistake on this page, the moment where the phrase “of the five corporall joyes of our Ladie” is really a correction for the mistaken “joyes of our lorde.”

never mix up your lord and your lady

My favorite EEBO moment, however, is this one: the title page of a 1612 elegy mourning the death of Prince Henry.

the title page of STC 23576 as in EEBO

This is how the image appears in EEBO; but this is how the image appears in their reproduction of the second state of this edition.

title page of STC 23577 as in EEBO

Do you see what happened? It’s a mourning book, and it was printed on pages bordered in black and sometimes entirely in black, with a xylographic title page, that is, a title page in which white lettering appears on a black background. But when the microfilm was being processed, someone clearly didn’t believe what they were seeing and they assumed it was a mistake, that it should be black on white, and so they reversed the negative, producing a facsimile of a book that doesn’t exist.

There are resources that provide higher quality digital facsimiles of early modern books and that, unlike EEBO and ECCO, are free to use. The Folger has digitized many works in their entirety, including all copies of the pre-1642 Shakespeare quartos and a couple of first folios. The British Library has digitized some of their collection, cover-to-cover, as have many other libraries, including those of the University of Pennsylvania, Princeton, University of Oklahoma, and the Bavarian State Library. The English Broadside Ballad Archive now includes some high-resolution color facsimiles, and the Universal Short-Title Catalogue (covering all books printed in Europe in the 15th and 16th centuries) includes links to digital copies from many European libraries.

Digital tools have, without a doubt, increased our access to facsimiles of early modern books. If I can sit in my study in Rockville and study Erasmus’s 1516 translation of the New Testament by looking at a copy currently held in Basel, that’s a win.

If one dominant way of thinking about digital tools and early modern books is in terms of access, another has been in terms of text. Access is about text of course—what we’re gaining access to is the ability to read texts. But there are also digital tools that don’t simply read texts, they distant read them. EEBO-TCP can make research a bit easier if you’re interested, say, in sassafras and want to find instances of it being discussed. In the right hands, you can do much more interesting types of computational analysis that can reveal things that would be difficult to see otherwise. Recent work by Michael Witmore and Jonathan Hope, for instance, reveals that genre is marked not only in terms of plot, but also linguistically at the sentence level—histories and comedies and tragedies are genres that are grammatically inflected. That seems like a win to me, too.

These tools that I’ve just described rely on the ways we have always read books, albeit with increases in distance or speed (you can read a book held at the Folger Shakespeare Library from your study in Gdansk; you can analyze the texts of the entire Shakespeare corpus in a matter of minutes rather than years). I want to take this moment to wonder what new possibilities we might imagine. How might we use digital tools to look at texts differently? How might we use digital tools to represent texts differently? Can we move away from reading text to studying the physical characteristics of text, characteristics that can reveal important information about the content of the text and the cultural and historical creation of the artifact?

The multi-spectral imaging done by the Lazarus team of the Archimedes Palimpsest gives a hint of how digital tools might let us see things that would otherwise go unseen. The Archimedes Palimpsest is a 13th-century Byzantine prayerbook written over a 10th-century manuscript containing writings of the Greek mathematician Archimedes, as well as multiple other works from various periods. Using multi-spectral imaging, along with other tools, the team was able to recover visual access to much of the earliest writings in the book. Google took the project’s dataset and made a “Google book” of the earliest state of the codex resulting in a digital reproduction of a book that exists, but is not visible to us just by looking at it.

One recent paper about the use of densitometers to study levels of dirt on the pages of medieval manuscripts suggests that we can learn about book usage through analyzing how and where dirt is distributed across a book. It might seem obvious that pages that are used more often will be dirtier, and that is in part what the author found, but the use of the densitometer revealed that it’s more complicated than we can always assess with the naked eye. The paper’s author, Kathryn Rudy, points out, for example, that she had assumed that two different patterns of dirt on an opening came from two different users, but the densitometer’s analysis suggested that the patterns were similar enough that they were likely to have been made by the same person—perhaps they held the book in different ways suitable for different prayers. The analysis also pointed out that even books that retain visible marks might have been cleaned by modern owners to such a degree that the dirt is no longer viable as an analytical tool, something that might help us think about the changes books undergo during modern ownership.

Studying the distribution of dirt is just the beginning of how we might begin to use technology to help us understand books in new ways. A colleague in Antwerp reports that German books held in Belgium smell different than German books held in Germany. The cause lies in how the paper was treated: paper needs to be treated with sizing agents so that it handles ink properly (instead of absorbing ink, ink sits on the surface of the paper and dries there, producing crisp and legible marks). His speculation is that books in Germany were sized in a multi-stage sequence, with the last step taking place after the book had been printed, perhaps as part of the binding process. Books that remained in Germany after they were printed went through this final process; books that were shipped outside of Germany seem to have missed that final stage, resulting in a noticeably different smell because of their different chemical properties. If this is the case, the smell of early German books can help scholars understand not only the physical acts of making paper and books, but can help us trace the circulation of early printed works. Using computers to analyze the smells of books and software to map those smells could help researchers learn how books were made and sold and used.

We could also use new technologies to explore the other senses we use when handling books. The feel of paper (or parchment) is another element of books that has more to offer than nostalgic fetishizing: the thickness, color, and pliability of paper can tell us about the costs of production, in part, but also give insight into the experience of using the book and its intended audience. How might the characteristics of feel be represented in digital media? Could a 3D printer replicate samples of different paper qualities? Could we project back from a paper’s physical characteristics today to how it might have appeared and felt when it was made?

The three-dimensional aspects of paper extend beyond what can be felt by human touch. The process of making books in the letterpress period—and the process of writing on leaves of paper and parchment in all periods—is a process of putting pressure on the paper, leaving behind an indentation on one side of the leaf and an extrusion on the other side of the leaf. In most cases, the indentations are visible because the instrument causing them (type, woodblock, stylus) left behind ink markings. In other cases, there are indentations without ink, sometimes caused when two sheets of paper are accidentally run through the press, sometimes left behind when the bearing type used to even out the blank spaces in a page leaves behind blind impressions.

Folger STC 7043.2, leaf F1v under raking light (click to see this image compared to one under normal light)

There are also the indentations left behind during the papermaking process from the wires and frames used in the forms. Once we start thinking in these terms, we can find more topographical variations on leaves of paper: wormholes, dog-eared corners, holes left from stitches sewing gatherings and the binding together, plate marks from engravings. What might we learn from visualizing books not as texts to be read but as topographical maps?

Another option would be to use digital tools to visualize the context of books, to encounter them not in isolated codices, but in libraries. This 360° panoramic view inside the Strahov Monastery’s Library in Prague lets you not only see the entire room but also zoom in to see the titles of the books on the shelves. This is primarily a pretty picture, but imagine if this technology was married to something that let you look at catalog records of the books that you’re seeing, or switch from catalog records to a view of a book on a shelf.

screenshot of a zoomed out view of Strahov Library

titles on books at the far end of the library

If we could use digital tools to estrange ourselves from our books, to defamiliarize what we think we know, we might learn something new about how they were made and how they are used. People keep pointing out to me that we are in the incunabula age of digital texts. We are. And that’s what makes it so exciting.


traces of my dad

Arnold Werner, 1938-2007 (self-portrait)

When I was a kid, my father wrote a weekly column for the student newspaper at Michigan State University, where he taught. “The Doctor’s Bag” ran in the State News for six years, from 1969 through 1975. It was eventually syndicated and ran in 50 campus newspapers, with a circulation of around 600,000. What this means, in part, is that when I was little people used to ask me if my dad was “The Doctor’s Bag.” (That’s how they used to phrase it: Is your dad “The Doctor’s Bag?”) I had no idea what the column was; I just knew he wrote it. At some point, I gathered that it was a medical advice column answering students’ questions about all things health related. It wasn’t until I was an adult and Dad sent me copies of the entire run of the column that I sat down and read them.

I can hardly begin to describe how much I love those columns. I love them for what they reveal about college life in America in the early 70s. The questions students asked! They’re what you imagine—a lot of questions about sex and drinking and drugs. But there’s more to them, too, like the struggle of living in a dorm that has more people than your home town. The overwhelming impression you get, reading them all through, is how much they didn’t know, and the pent-up longing to ask someone who will take them seriously and give them real answers. I suppose if I’d read them as a kid I would have been horrified that my dad talked about this stuff, but you know, he was a psychiatrist, so it’s hardly like I didn’t expect him to talk about everything under the sun. As an adult, I’m impressed with how deftly he answers their questions.

column printed in the 1975 SUNY Albany student paper

letter from 1970 printed in the Stony Brook Statesman

I love them, too, for the window into my father’s personality. They are both funny and earnest, just like he was. They lecture sometimes and joke at other times.

letter from 1972 Doctor's Bag

And they’re amazing for the controversies they raised. Honestly, reading the columns now, it’s hard to appreciate what the scandal is. But people wrote letters in complaining about them. The head of Albany’s Student Health Services complained:

a mild letter to the editor

In June 1970, a couple of Michigan legislators attacked his columns on the House and Senate floors for being “almost indescribable filth” and were outraged that they were being published at a public university. Think of the taxpayers! In 1973 the editor of a student paper was suspended both for having printed disrespectful pictures of Santa Claus and for running my dad’s column. Apparently a mother of a student once sent a letter to my dad chiding him to “think of your own mother before you put these letters in;” little did she realize that Dad did think of his mother and often mailed his column to my grandparents. (They were only disapproving when he appeared in the National Enquirer.)

Today is the 5th anniversary of my father’s death. I miss him. I’ve written before, glancingly, about him in a post on the intangibles of books. I have some of his childhood books, complete with his name carefully inscribed on the inside cover, and I cherish those books, even when I have no desire to read them. Those books are a connection to him. And when someone you love is gone, you need to find connections.

inscription on inside cover

The last years of his life were not good ones. He had cerebral palsy, and while it didn’t really interfere with the bulk of his life—he was an avid biker, faithfully doing the DALMAC ride from Lansing to Mackinaw, even once as 4 days of 100-mile trips—it made his old age miserable. Well, I say old age, but I really mean his 60s, which is not very old. He was only 68 when he died, both much too young and after too much pain and suffering.

excerpt from Parade magazine, 1974

I am glad his death has receded enough that I can remember the joy of his life rather than the pain of its end. And I am glad that there are traces of some of that life still online. The digitization of college newspapers means that some of my dad’s columns are available for all to see, along with this Parade magazine piece about the youth of 1974, and, weirdly, a 1996 Weekly World News piece on “how to blow your stack without looking like a butthead!” I’m glad, too, that you can find some of the results that came out of a workshop on cerebral palsy and aging that we held in his honor. There’s a piece from Developmental Medicine and Child Neurology and, if that’s too long, a slide set on the subject.

There’s much of his life that isn’t out there—his photography, his hobby of rebuilding old cars, his bicycling, his woodworking. And his other psychiatric work, the stuff that got published in academic journals, is locked up in their hands (though your library might have a copy of the psychiatric glossary he edited for the APA in 1980). His columns, too, are probably still owned by the syndication company (someday I’ll retrieve his papers from the lawyers and see what his contract stipulated). The bits and pieces of the online traces of my dad add up to someone who is kind of him, but who isn’t all of him. And there was so much of him when he was alive.

It wasn’t until he died that I began to appreciate the staggering challenges of all the stuff we leave behind. There are his newspaper columns, thousands of photographs and negatives, the records of his life. Dad was a pack rat, which makes the task more challenging. And he was enough of a public figure that it’s hard to resist the feeling that someone somewhere might find this material interesting. Not for what it says about him, but for what it says about the times he lived through. Those Doctor’s Bag columns are full of nuggets. At some point, I’ll do something about that. If I was a researcher in the history of medicine, or the culture of mid-twentieth-century America, I’d find useful material in there. And there’s more, too. Maybe someone would want to know this story: My dad volunteered for the Vietnam War after he’d completed med school, but the army wouldn’t take him because of the cerebral palsy—he limped and certainly couldn’t run. And what happened a few years later? They tried to draft him, but he said no: you didn’t want me then, you can’t have me now. I have all that documentation, because that’s the kind of thing he saved. What do I do with that? Is that just family history, or does that mean something to someone else?

I don’t know what the answers to those questions are. Maybe I’ll just hang onto everything until it’s my kids’ turn to deal with it. Is that what happened to all those old books we have in libraries? The immediate family couldn’t bear to get rid of them and so they hung onto them until finally they became old enough to be wanted beyond the family? Maybe. At some point, I suppose, these things either won’t mean anything to anyone, and they can be tossed, or they will become interesting through sheer survival through the ages. Maybe it doesn’t matter which.

I’m grateful that he wrote these columns and that I can still read them. I’m grateful that he had enough pride in them to save them and to pass them on to his daughters. I’m grateful that he loved us as much as he did, and that when it was time for him to die, that we were there by his side. He taught me how to write, how to use a camera, develop negatives, and print film. We argued about my curfew, butted heads because we were both stubborn, and watched Battleship Potemkin together. I loved him dearly. And I miss him a little bit less when I come across the traces of his life that have been scattered across the world.

2nd row, 2nd from the right


the serendipity of the unexpected, or, a copy is not an edition

My last post focused on my frustration with the assumption that digitization is primarily about access to text:

But access is not all that digitization can do for us. Why should we limit ourselves to thinking about digital facsimiles as being akin to photographs? Why should we think about these artifacts in terms only of the texts they transmit? Let’s instead think about digitization as a new tool that can do things for us that we wouldn’t be able to see without it. Let’s use digitization not only to access text but to explore the physical artifact.

I spent the remainder of that post brainstorming some suggestions about what digitization might enable other than access to text, and there were some great comments about the ramifications of textualizing the digital that I’m still mulling over. In this post I want to offer some examples of why we might want to look at books rather than digital surrogates as a way of approaching the relationship between digital and physical from another angle.

So why might we want to look at physical books rather than digital surrogates, other than a fetish of smell and a sense of the magical presence of the original? Here are a few examples that start to get at what physical books offer that digital surrogates miss.

The making of the book

One of the things that we learn from examining books is how they are made. There are all sorts of things you can see in physical books that reveal their making: watermarks and chain lines in the paper (which can help date a book, as Carter Hailey has shown); sewing structures in bindings (which can reveal if pages have been cut out or added in later); and pasted-in cancels or errata slips (which show changes made to correct errors or in response to censorship).

It’s true that some of these features could be incorporated into digital surrogates; as my photo above shows, it’s absolutely possible for digital images to reveal paste-ins such as this one. But it’s also true that most users of digital surrogates of early modern books rely on EEBO, in which the equivalent page appears thusly:

The text might be the same, but there is no indication in the EEBO reproduction that the text appears on a slip of paper that has been pasted onto the page (although a keen eye might notice that the lines are not quite square with the rest of the page). If all you care about is text, that’s fine, but you’re missing a key part of the book’s history. And what happens if you were looking at a copy that didn’t have the errata slip (not all copies did)? You might never know it was an option–see the later section, “a copy is not an edition” for more on that.

My larger point here is that while digital surrogates could convey some aspects of this category of information, they generally do not.

The history of the book’s use

The best thing about old books, I think, is their longevity and the traces of the history that they carry with them. Inscriptions, marginalia, doodles, vandalism, erasures, cutting out images and leaves–none of those are captured if your focus is solely on the text, and all of them have something to tell us about how a book was used.

(This is an image of one of the Folger’s three copies of Sacro Bosco’s Sphaera Mundi; this one happens to be heavily marked up, with additional diagrams penned in and plenty of annotations, but the other two are significantly less annotated. It looks yellow, by the way, because I took the photo myself, sans flash, per Folger requirements. There’s not a whole lot of light in the Old Reading Room, and I did a sort of half-hearted job of color-correcting my picture.)

Here’s a detail from a blank leaf from the middle of a 1550 Chaucer Works that’s covered in marginalia. I count at least four hands in this picture, three sixteenth-century and one twentieth-century. There’s other marginalia in this book, too:

So add two more hands (this time a seventeenth-century and a twentieth-century one that is mistaken about the date) to the collection of readers who have left their traces in this book. And then there’s this, from the same volume:

So that’s one more inscriber (although he doesn’t appear to have been an owner of this book). There’s also the cover of the book, with two more names incorporated into the binding, eighteenth-century descendants of Frances Wolfreston. I’ve written about this book and this collector before, so I won’t go on further here. But this kind of passage through history tells us not only about one particular family and one particular book, but gives a window into different responses to and uses of Chaucer’s poetry.

I suppose it’s possible that digitizing could capture this; if I can take pictures of these inscriptions, there’s no reason a digitization project couldn’t. Oh, except time and money. This is a big book of more than 300 pages, and only one of two copies we have of this imprint, and one of five copies of this edition, and one of I-don’t-know-how-many mid-sixteenth-century copies of Chaucer’s Works. Are we talking about a project that will digitize all pages, cover to cover, of all these books? Who’s going to fund that? Is it worth funding that as opposed to, say, funding digitizing all the books that were once owned by Ben Jonson?

The serendipity of finding the unexpected

This sounds a lot like the sort of paean to open stacks and browsing that you hear from some book fetishists. But I’m talking about something a bit more complicated, I think. When you look at a digital surrogate, someone has already made the decision for you about what you want to see and how you are going to use it. There’s no reason you can’t use the surrogate differently, but every choice they’ve made impacts your ability to circumvent it. For instance, some of the biggest digitization projects out there don’t include blank leaves or pages in their digitization. ECCO is the chief culprit that comes to mind: they prioritize text, and so their surrogates start with the frontispiece or title page, then move to the dedication/preface/letter to the reader/start of the text. But you know what’s missing? The blank verso of the title page. Does that matter? I don’t know. It might. It depends on what you’re looking for. But you’ll never know that you might be looking for an answer that depends on that blank presence if you don’t know that it’s not there.

Jeffrey Todd Knight has written about this serendipity in his most recent article, “Invisible Ink: A Note on Ghost Images in Early Modern Printed Books”. His focus is on ghost images left behind when some of the Pavier Quartos were bound together in early collections with non-Shakespearean works, so that an image of the title page of Heywood’s A Woman Killed with Kindness appears faintly on the verso of the last leaf of Henry V. Knight draws out some of the implications of what this means for the Pavier Quartos and our understanding of Shakespeare book history–go read the piece yourself–but he also makes a broader argument for the need to consider the invisible, reminding us that “it has been easy to forget that text reproduction technologies, at every level, carry biases”:

The onscreen interfaces that give us Shakespeare and Heywood’s plays today are not transparent windows onto the text themselves; they define and regulate a field of visibility, as do all forms of curation going back to the early copies, which also carried biases.

What I would emphasize is that we don’t know what we’re missing until it’s possible to see it. We can’t see the blank pages or the invisible ink if we’re experiencing a book through decisions that have already eliminated the possibility of seeing them.

A copy is not an edition

This is true for all books, but especially early modern books: no two copies are the same, whether through the process of how they were printed or their subsequent use by readers. The practice of stop-press changes being made at any stage of production, and at multiple stages of production, and the habit of mixing sheets so that “uncorrected” sheets can appear in the same copy as “corrected” sheets, means that any book that had changes made during the printing process will exist in different states. Nor are those states typically indicated on a book’s title page or (depending on how important a book is seen as being and how well cataloged it has been) in its bibliographic record. You can only find these variants by looking at multiple copies of a single edition–hence the brilliance of things like the Hinman collator, the Lindstrand Comparator, the McLeod Portable Collator, and Hailey’s Comet, all devices that allow their user to compare multiple copies of books without actually reading them. (Reading, as any copy editor will tell you, will only distract you from what you’re actually seeing on the page; collators–or, as I like to think of them, the original textual/machine hacks–let you look instead of read.)

Why do we care about these textual variants? We might care because they tell us something about how the book was made, because it might say something about economic or societal pressures (if they were changes introduced in response to something other than correcting an error), or because the presence of and differences between states might help us understand the range of textual meaning available.

One example of the stop-press changes can be seen in the second quarto of Hamlet, in which Hamlet’s lines to Osric exist in three states, two of which can be compared in the Shakespeare Quartos Archive:

We are in no danger of missing the many textual variants of any of Shakespeare’s plays. But imagine if we were talking about, say, a Webster play, or a poem in praise of Queen Elizabeth. Those have not been as extensively, even fanatically, collated as Shakespeare’s works have. Nor have the funds been lavished on reproducing many digital surrogates of a single edition, as they have with Hamlet.

Remember above when I pointed out that some copies of The General History of Virginia have an errata slip, but others do not? If you encounter a digital surrogate of a copy that has the slip, you will treat that copy as if it represents the entire run of that edition, as if all copies of that edition of General History have errata slips. What is the characteristic of a single copy becomes a characteristic of the edition. But a copy is not an edition; what is unique to a copy is not common to the edition.

This list of possibilities has gone on long enough. Nearly everything I write about on this blog dwells on what we discover from looking at one copy of one book. If you’re curious to see more examples of what you might find from looking at physical books rather than digital surrogates, try some of the following posts in addition to the ones I link to above about blanks and Frances Wolfreston: “an armorial binding mystery”, focusing on overlapping book stamps; “essayes of a prentise”, about the significance of the binding on a volume of James I’s writings; “David and Goliath, redux”, about finding the same pattern of embroidered bindings; and “bibles for historical occasions”, in which I think about, um, bibles used on historical occasions.

I would love to hear more thoughts from you on this subject, particularly lucid examples of what we can learn from looking at physical books beyond the sort of intangibility of emotions that books can evoke.

fetishizing books and textualizing the digital

For some time I’ve been perplexed by the way both pro-digitization and pro-book people talk about digitizing books. A crude characterization of the ways in which the two sides depict the argument as having two sides might look like this:

pro-digitization: Look, I can access all these wonderful old materials without leaving my armchair!

pro-book: Those aren’t books; you can’t feel the paper and breathe in their smell!

pro-digitization: But we can create a universal library!

pro-book: You’re not creating a library, you’re destroying libraries!

pro-digitization: Nyah nyah!

pro-book: Pfft!

And there you go. The digitization folks talk about access and the book folks talk about being in the presence of the object. Neither side tends to present a more nuanced sense of how they might each have something to offer the other, or to recognize that there might be other considerations and uses at stake.

Lest you think I’m exaggerating, consider the most recent salvos in this inanity: the op-eds from Tristram Hunt and James Gleick. Tristram Hunt’s “Online is fine, but history is best hands on” was published in The Observer on July 3rd. In it, Hunt argued that the digital is not only not fine, but an impediment to studying history:

Yet when everything is down-loadable, the mystery of history can be lost. Why sit in an archive leafing through impenetrable prose when you can slurp frappucino while scrolling down Edmund Burke documents?

But it is only with MS in hand that the real meaning of the text becomes apparent: its rhythms and cadences, the relationship of image to word, the passion of the argument or cold logic of the case. Then there is the serendipity, the scholar’s eternal hope that something will catch his eye. Perhaps another document will come up in the same batch, perhaps some marginalia or even the leaf of another text inserted as a bookmark. There is nothing more thrilling than untying the frayed string, opening the envelope and leafing through a first edition in the expectation of unexpected discoveries. None of that is possible on an iPad.

This is such ridiculous tripe, it’s hard to know where to start. The basis of his argument seems to be that access diminishes value: if you can come across this stuff too easily, you won’t really have earned understanding it. In fact, James Gleick’s op-ed, “Books and Other Fetish Objects” (New York Times, July 17th), takes it down pretty nicely: “I’m not buying this. I think it’s sentimentalism, and even fetishization. It’s related to the fancy that what one loves about books is the grain of paper and the scent of glue.” (A point of clarification: the link to http://smellofbooks.com/ is there in the NYT piece; I’m not sure I’ve seen them do that kind of bloggy commentary before, but kudos to Gleick for getting it in there.)

Gleick’s piece isn’t all snark, far from it. His main point is that

It’s a mistake to deprecate digital images just because they are suddenly everywhere, reproduced so effortlessly. We’re in the habit of associating value with scarcity, but the digital world unlinks them. You can be the sole owner of a Jackson Pollock or a Blue Mauritius but not of a piece of information — not for long, anyway. Nor is obscurity a virtue.

He makes another, excellent point about the value of such digital repositories as London Lives: “They enrich cyberspace, particularly because without them the online perspective is so foreshortened, so locked into the present day.” That’s a key point, I think, especially for those of us who study the past; it’s easy to lose sight of the past, to rewrite it to suit present needs.

As you can guess, I find much more that is compelling in Gleick’s point of view than I do in Hunt’s. But Gleick goes astray in the sentence that immediately follows the passage in the block quote: “A hidden parchment page enters the light when it molts into a digital simulacrum. It was never the parchment that mattered.”

“It was never the parchment that mattered.”

Now if all you’re interested in is the text, the words on the page, then maybe the parchment might not matter. The Magna Carta matters beyond its material presence, that is certainly true. But no text exists outside of its material manifestation. And this is where the pro-digitization folks seem blind to me. So much of that rhetoric has focused on access: let’s digitize these books/manuscripts/bits of paper and parchment so that more people can read them. And that is a great thing, it really is, especially when they are open access and available to folks who might not be able to travel far and wide to research libraries and who might not have the right credentials to get into those libraries. That sort of access is radical.

But access is not all that digitization can do for us. Why should we limit ourselves to thinking about digital facsimiles as being akin to photographs? Why should we think about these artifacts in terms only of the texts they transmit? Let’s instead think about digitization as a new tool that can show us things that we wouldn’t be able to see without it. Let’s use digitization not only to access text but to explore the physical artifact. What would be the book equivalent of the extreme zooms you have in Google Art Project’s depiction of Vincent Van Gogh’s “Starry Night” or of the alternate lighting view of their image of Chris Ofili’s “No Woman, No Cry”? Could we have digitized books that let us virtually unsew the leaves and examine the formes in which they were printed? Could we strip black ink off of pages to let us better see the watermarks and chainlines? Could we alternate between regular light and raking light so that we can see the impression left by bearing type? Those are pretty tame suggestions off the top of my head. What could we come up with if we put some open-minded bibliographers and keen coders in a room together?

This post has gone on long enough, but I’ll scout down some examples of what we cannot see when we think about digitization as being only about text and post them next time.

(A note about the images: The top is a screenshot of the zoomed-in “Starry Night”; the bottom is a screenshot of a zoomed-in detail of “No Woman, No Cry” showing the photos of Stephen Lawrence within the woman’s tears. Go to Google Art Project and see the paintings for yourself.)