In my last post I discussed the urgent problem of preserving "born digital" collections (that is, creative drafts produced using a computer rather than paper and pencil), and the very real possibility that a large portion of our cultural history will be lost unless we solve it quickly. Today, though, the sun is shining, the weather is warm, and the days are getting longer, so I turn to the happier subject of the really remarkable things a scholar can learn using born digital data.
In 2008, I began working with the digital files in the Jonathan Larson collection at the Library of Congress. From 1989 to 1996, Larson wrote the words and music to the musical RENT (now opening in a new production Off-Broadway). Tragically, just before the show's opening night Off-Broadway, Larson died of an aortic aneurysm caused by Marfan syndrome. Several years later, his papers, including about 180 floppy disks, were donated to the Library of Congress. With the permission of the Larson estate and the Library staff, I migrated all of the data from these disks to a more stable storage medium at the Library.
The final draft of RENT on Larson's disks was saved at 12:38 PM on Monday, January 15, 1996, using a copy of Microsoft Word 5.1 for the Macintosh. When the file is opened with a vintage copy of the software, it's possible to see it more or less as Larson saw it in 1996 (see, for example, the screenshot below).
When I open the same file with a simple text editor like TextWrangler, though, the text appears to be somewhat different. In the picture below, you can see that the line that reads "Before the virus takes hold" in Word appears as "Before you enter the light."
But I'm looking at the same file! What's going on? It turns out early versions of Microsoft Word had a setting called "fast save" to speed up the frequent action of writing to a file (a slow process in those days of floppy disks and computers that ran only about 2% as fast as today’s iPhones). "Fast save" worked by appending revisions to the end of a file rather than completely overwriting the existing text. Word 5.1 knew to look for these revisions and integrate them into the main text when the file was opened. A text editor, on the other hand, just displays the text in the order it finds it, so the older wording still appears in the body of the document. When these files are opened with a software tool called a hex editor, though, it's possible to look at the groups of text that represent the revisions recorded in a single "fast save." In the picture below you can see an example of a hex editor view of the file with the relevant text highlighted.
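For readers who would like to try this kind of inspection themselves, here is a minimal Python sketch of what a hex editor and a plain text scan reveal. It is only an illustration: the function names `text_runs` and `hex_dump` are my own, and the `sample` bytes are an invented stand-in, not the actual Word 5.1 file layout. The sketch also does not interpret Word's revision records; it simply lists every run of readable text in the order it sits in the file, which is enough to show older wording in the body and newer wording appended later.

```python
import re


def text_runs(data: bytes, min_len: int = 4):
    """Yield (offset, text) for each run of printable ASCII bytes in data."""
    # Bytes 0x20-0x7e are the printable ASCII characters.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")


def hex_dump(data: bytes, width: int = 16) -> str:
    """Format bytes the way a hex editor shows them: offset, hex, text."""
    lines = []
    for i in range(0, len(data), width):
        chunk = data[i:i + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Non-printable bytes are shown as dots in the text column.
        text_part = "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in chunk)
        lines.append(f"{i:08x}  {hex_part:<{width * 3 - 1}}  {text_part}")
    return "\n".join(lines)


# Invented toy data (an assumption, not the real file format): an older
# line in the "body", some binary filler, then a revision appended at the end.
sample = (
    b"\x00\x01Before you enter the light\x0d"
    b"\x00\x00\xff\xfe"
    b"Before the virus takes hold\x0d"
)

if __name__ == "__main__":
    print(hex_dump(sample))
    for offset, text in text_runs(sample):
        print(f"{offset:6d}  {text}")
```

To examine a real file, read it in binary mode (for example `open(path, "rb").read()`, where `path` is whatever file you want to look at) and pass the bytes to the same functions. Characters outside plain ASCII, such as curly quotes, would need a suitable decoding step that this sketch leaves out.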
Like paper palimpsest drafts in which a revision is written over an earlier text (which is itself still visible), Jonathan Larson's Microsoft Word files provide scholars and artists a fascinating glimpse into his creative process. For textual scholars practicing genetic criticism and lyricists trying to improve their craft by studying the practices of a master, this kind of information can be invaluable. However, it is information that is in grave danger of being lost unless it is preserved quickly. Jonathan Larson's and Howard Ashman's collections were relatively easy to preserve, in part because their untimely deaths meant that only a couple of decades had passed since the files were created. If today's artists and writers want to preserve their digital work in a similar way, it is important that they begin to develop a plan now. As Digital Curator, I'm happy to help!