disembodying the past to preserve it

What follows is a keynote I gave at the Digital Preservation 2013 conference on July 23, 2013. If you’re curious, there’s a video of the talk and the Q&A, as well as a PDF of the slides I showed (some of which differ from what I’ve shown here).

“Disembodying the past to preserve it”

I am, as you’ve heard, not someone who focuses on issues of digital preservation. I’m a book historian and performance scholar who works at a cultural heritage organization that is focused on the preservation and exploration of centuries-old objects. I think about the digital and preservation from the perspective of someone who studies the past and seeks new ways to make it accessible to scholars and the public.

Since I spend a lot of time thinking about the history of books, and since so many people see the rise of the digital as heralding the end of print, I thought I would start by looking back at the earliest surviving instances of moveable type in the West. We all know, I think, that the first book printed by Johannes Gutenberg was the bible in 1455. But that wasn’t the earliest instance of print. Gutenberg’s first printed texts were indulgences: short formulaic texts sold by the Church and its deputies to fund various enterprises by promising purchasers they wouldn’t need to spend as much time in purgatory for their sins.

1454 indulgence (Rylands Library)

What we’re looking at is one of the earliest surviving copies of these get-out-of-jail-free cards. Now held at the University of Manchester’s Rylands Library, this indulgence was printed in 1454 and was issued to a specific buyer in 1455 on the 27th of February. (You can see why printed indulgences were so handy—the bulk of the text is the same from one to the next, and small blank spaces can be left to be filled in by hand with the particulars for each sinner.)

There are other copies of the Gutenberg indulgences that have survived. This one is a slightly later issue (it’s the 31-line indulgence, not the 30-line, for those of you who are bibliophiles). Now part of Princeton’s collections, this indulgence was issued on 29 April 1455 in Pfullendorf to Johannes Grosshans. You can just barely make out the manuscript insertion here, but this copy hasn’t survived in nearly as nice shape as the last one we looked at.

It’s astonishing, actually, that any of these indulgences survived. Very few of them did—even though print runs for indulgences were huge, often in the thousands, there are only 50 recorded surviving copies of the 31-line indulgence, and a mere 8 extant copies of the 30-line. When you look at an indulgence, it’s easy to see why they wouldn’t survive in large numbers. They’re just flimsy little things. Compare these two images, the first of someone holding Princeton’s indulgence I mentioned above…

… and this of a bound incunable (also in Princeton’s collections):

The first can be held in one hand (even in its framed state) while the other rests heavily in a chair (don’t try that in your reading room, please!). I think you can guess my point: the Aristotle is big and it’s durable because it’s big. You can’t easily tear or lose this book. But a single sheet of paper? That gets misplaced, it gets accidentally destroyed, it gets forgotten. A light breeze could blow it away if you weren’t paying attention. And once the holder has died? Do you need to hang onto an indulgence as a record of your grandfather’s purchase?

The disposability of indulgences is why they haven’t survived. But it’s also why the ones that survived did. Here’s what I mean: because the indulgences weren’t seen as precious documents to save, they were perfect to reuse for other purposes. And the early indulgence that survived often did so because it was used as waste paper in bindings. Without launching into a lecture on early modern binding structures, I’ll just say that bindings often incorporated paper leftover from other projects. Endpapers, spine linings, structural supports—binders needed materials to finish their books. And why would you use good blank paper—paper that could be used for other purposes—when you had scrap paper at hand? And so odds and ends of printed paper were incorporated into the bindings of books:

If we turn back to the images of the indulgences I’ve shown, you’ll see that being treated as disposable is how they survived. The 30-line indulgence now at Manchester was preserved in a binding: you can see the evidence of holes in the corners and the stain left from the leather turn-ins. And Princeton’s copy survived as pastedowns in a binding from the early 1470s. With Cambridge University Library’s copy of Wynkyn de Worde’s 1498 indulgence we see something slightly different: these are indulgences that were never sold and are still in sheet form, preserved in the binding of a bible. This is one of my favorite examples, because it doubles as evidence of something we normally wouldn’t see, the production technique displayed in the unfinished object.

Because it was disposable, it was preserved. It’s not a preservation technique I’d recommend, but it’s worked for more than a few texts.

I’ll let you deal with what this might mean for digital preservation (I know just enough about digital forensics to gather that bits of data cling to other bits of data and that you might be looking to recover someone’s novel only to find that other records of their life are interspersed with it). Instead, I’ll ask what lessons we might learn from this about using digital iterations of material objects.

For starters, it’s worth pointing out that I wouldn’t have been able to give this talk if these objects hadn’t been photographed and shared online. It was because I was looking for images of indulgences for a different talk that I came across these pictures and noticed that they all looked like binder’s waste. Discoverability shouldn’t be news, but it shouldn’t be forgotten either.

The problem that we’re facing, in my world, is that the digital objects we’re producing sometimes lead to wonky discoveries. Here’s one thing that has been bothering me recently: the size of books.

psalms unscaled

Here we have two books of psalms, one printed in Geneva in 1576 (on the left) and one in Florence in 1566 (on the right). They are, to all appearances, the same size.

psalms scaled

But this is how their comparative sizes should be displayed: the Italian psalter is 21 centimeters tall and the Geneva psalter is 13 centimeters, or about the height of a Sharpie. (Projected on a screen, of course, they appear to be significantly larger than a Sharpie, although perhaps on your device’s screen they are significantly smaller—a not unrelated oddity of working with digitizations of material objects: size isn’t stable.)
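The arithmetic behind a scaled comparison like this is simple enough to sketch. What follows is a minimal illustration, not the process I actually used: pick one book as the standard, work out how many pixels represent a centimeter in its image, and resize every other image to match. The function name `scaled_heights` and the 600-pixel image heights are assumptions made up for the example; only the 21 cm and 13 cm heights come from the psalters above.

```python
def scaled_heights(books, standard):
    """Return display heights in pixels so all images share one pixels-per-cm scale.

    books: dict mapping a label to (real_height_cm, image_height_px)
    standard: label of the book whose image keeps its current pixel height
    """
    std_cm, std_px = books[standard]
    px_per_cm = std_px / std_cm  # the scale is set by the chosen standard
    # Every book is drawn at its real height times that shared scale.
    return {label: round(cm * px_per_cm) for label, (cm, _px) in books.items()}


# Real heights are from the talk; the 600 px image heights are hypothetical.
psalters = {
    "Florence 1566": (21.0, 600),
    "Geneva 1576": (13.0, 600),
}
display = scaled_heights(psalters, standard="Florence 1566")
print(display)  # {'Florence 1566': 600, 'Geneva 1576': 371}
```

Both images start at the same pixel height, which is why the unscaled books look identical in size. Once they share a scale, the Geneva psalter shrinks to roughly 62% of the height of the Florentine one.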

Here we see a collection of books as we would see them in the Folger’s digital image collection, displayed side-by-side:

sizeofbooks unscaled

Here are those same images shown in relation to each other—I arbitrarily chose one book as my standard, and then calculated the scale and adjusted the images from there:

sizeofbooks scaled only

This slide does a much better job of conveying the relative size of these books. But it’s a rotten way of browsing through a large collection of images if you’re at all interested in any feature other than size. In other words, if you want to treat these images as books—as objects that you hold in your hand and read—then you’re going to be dissatisfied. They’re always going to be digital surrogates (a phrase I hate), lacking the primacy of the original.

But what if we took the disembodied aspect of these images of books as an opportunity rather than a failure?

Here’s a fun fact about early printing that is all about its material process: many printed works are illustrated with woodcuts, images that are literally made from blocks of wood.

(I just want to make the aside that it’s pretty effing amazing that the Folger has the piece of wood that made that exact print—the other amazing thing is that on the other side of this piece of wood is carved another woodcut! Just like you’d reuse scrap paper in your bindings, you’d repurpose pieces of wood.) In any case, one of the results of illustrations being made from blocks of wood is that the blocks of wood could be reused to print illustrations in different works.

The Broadside Ballads Online project at the Bodleian has taken this fact and combined it with image search technology to begin to explore how images in Renaissance ballads are used and reused. Alexandra Franklin did some excellent work with this, starting with noticing a distinctive hat used in Unconstant Phillis, a late 17th-century lament by a shepherd about the woman he loves:

unconstant_phillis_hat

Using image match, Franklin searched across their ballad collection for other instances of the hat, pulling up 8 hits, including this one from The Noble Gallant.

gallant_hat

What’s particularly fun about this reuse is that we see that although the hat is the same, the man wearing the hat is not!

Why might this be a useful discovery? Tracing the use of a woodblock across multiple printings and multiple works can help date printing; it can also help us think about iconography and shifting discourses. For me, this is also a useful discovery for the way it turns material objects into digital ones that can be dismembered and rearranged. It’s not strictly necessary to use digital tools to do this sort of image-hunting work: Ruth Luborsky and Elizabeth Ingram compiled their Guide to English Illustrated Books without the use of image matching technology. But it’s certainly much easier to do it with bits than with books.

What can digitization offer that material objects cannot? Tools to reshape objects that would break under physical pressure. The work done on The Great Parchment Book by the University College London Centre for Digital Humanities is the most recent and exciting example of what those possibilities are. The Great Parchment Book is a survey compiled in 1639 of all those estates in Derry managed by the City of London through the Irish Society and the City of London livery companies. It’s a remarkable set of records. But it’s also a collection of 165 leaves that were badly damaged in a fire in 1786. Through careful preservation, about 50% of the text was recovered, but the brittle, wrinkled parchment remained an intractable obstacle to further work. After extensive physical preservation work on the manuscript and detailed imaging, however, the UCL team was able to virtually unwrinkle the pages (read about the preservation and digitization processes). About 90% of the text of the Great Parchment Book is now readable, and available for examination online as images of the leaves, enhanced images, or a transcription of the text.

In both of these cases, digitization makes objects available for study that would otherwise be restricted, either because they’re too fragile to handle or because they’re too dispersed to work with. For someone invested in cultural heritage, these are remarkable accomplishments. We can’t study the past if we can’t access its records and artifacts.

But both of these projects are ones that require significant investments of time, money, and people. They’re not lightweight experimentations—you need high-resolution images, you need expertise in image manipulation, you need the physical objects at hand.

I want to end with a look at something that is the opposite of all this, something that builds off of what has already been done, publicizing and redeploying images without adding to them or, indeed, displaying them.

The Library of Aleph is a Twitter account that tweets the captions of prints and photographs in the Library of Congress’s digital collections. The tweets are nothing more than the captions: no images themselves, no links to them. Just the captions, with occasional reminders that anyone can find these images by searching the Library of Congress. Here’s one tweet: “House burning during Groveland reign of terror—Negroes driven from homes throughout area.” Here’s a screenshot of the corresponding record:

Library of Congress record for “House burning during Groveland Reign of Terror--Negroes driven from homes throughout area”

The Library of Aleph’s tweetstream the day after the verdict in George Zimmerman’s trial was announced was a relentless account of the history of African-Americans, from slavery through Jim Crow through the Civil Rights Movement. The person who created The Library of Aleph hadn’t created it for this purpose; it was really an account he put together to tweet out some of the interesting images he was finding without cluttering up his main account. But in his anger after the verdict, it became a platform for remembering and reliving our past.

I bring it up here because of this paradox: what makes the tweets so powerful is that they are disconnected from the material object they’re referencing. They’re just captions. We might gloss over images but I think we pause over these. What are we reading? Who wrote the captions? What does it mean to choose these words to describe these images?

I love the way @libraryofaleph connects the past to the present and the present to the past. Things that speak to us today can speak to what spoke to us in the past, and digital technologies can bring them together. But what I really take out of this in terms of what cultural heritage organizations can do with digital tools to preserve our past is that this is an account that came not from the Library of Congress, but from an unaffiliated user. The Library of Congress did all the hard work in collecting these works, in digitizing them, in creating their metadata, in making them discoverable, and then in making them open so that somebody else could do something powerful with them.

And that is what cultural organizations need to think about in the use of the digital objects we are creating. We need to make them open so that other people can do things with them that it would never occur to us to do ourselves. Preserve your data, create your metadata carefully, and then release it. Make it open so that it can be used, so that we can learn from it, and so that it can continue to be discovered by future users.