Silences And Sounds: Making magic with old texts – how one scholar uses Transkribus (2/16/25)

A snippet of the Thalbach Chronicle (Bregenz VLA Thalbach Hs 9) and the logo/motto from Transkribus: "Unlock the past with Transkribus"

I’m not a modern-day marvel; digital humanities *seems* cool, but it’s not my training and not my natural modality. I live on a farm, with all the attendant joys of rural internet. (Failed the ping test recently? Us too!) I have read a number of DH articles with interest, and adopted some of the intellectual practices that volume assessment allows for. But at last, the moment has come: it’s time to learn a new tool in order to make my regular work go faster.

There’s this monastic chronicle (Bregenz VLA Thalbach Hs 9), you see, and it’s unedited. That there is, in fact, data from within “my” convent is a wonderful thing, and I’m thrilled to have access. There are two big problems, however. First, it has not been digitized. And second, I don’t read (well, I didn’t read) 18th-century Kurrentschrift. So, here’s what I did.

STEP ONE: GET PHOTOS. With the permission of the archive, I was able to photograph the chronicle. My system relies on the sequential auto-numbering of photos, and is woefully “brute force” for the more sophisticated user. It is also (I confess it now!) simply a set of photos on my cell phone. No fancy lighting, no high-tech imaging for the ages; these are functional photos for use as a musicologist, not reflecting the book-history elements (which may be interesting but are not my raison d’être). Before I start, I prepare a written description of the object, with special note of handwriting changes, format changes (from 18 to 24 lines, for example), and so on.

To organize my photos, I start by taking a picture of the description of the MS that I prepared in advance, and of its gathering structure that I prepared on-site. Then I take pictures of the outsides of the MS and of the visually interesting bits that caught my early attention. (These pictures are for slide decks for any talk I might give – they’re the visual capture of the “coolness” of the item.)

Then, I take a picture of the gathering description for the first gathering, prepared in pencil in the archive. For more complicated gatherings, that includes a gathering structure diagram, but it’s often just a few lines of text. That becomes my photographic “label” for the section. Then, I take sequential pictures of the pages in the gathering. Occasionally I repeat a page if I want to be sure that I captured some particular detail, but mostly I move from first page to last page in the gathering. When I’m done with the gathering, I take a picture of the wooden table as a marker. Why? Because that’s going to leap out at me when I’m looking at all these photos of handwritten pages on my phone!

Now, onto the next gathering, and the next, and the next after that. Each starts with a header photo; each is sequential; each ends with a picture of the wooden table. At the very end, I go back and take pictures of details – with a label card or paper-pad notation first telling what gathering and folio it’s from, and why I thought it important (“detail of the insertion in the right-hand margin showing a different hand”).

Lastly, I go home and back everything up into a file folder. At this point, the sophisticate would probably rename all the photos, but I let the assigned image number stand for the page. YMMV. [That’s the increasingly old-fashioned phrase “Your Mileage May Vary,” if we happen to live in different acronym worlds.]

ORGANIZING IMAGES ON MY HARD DRIVE

With a couple hundred images, managing the inventory can seem daunting, but I’ve developed some habits over the years. I’m a spreadsheet person; spreadsheets make my heart sing. I love me some useful spreadsheets. So, for each of the important manuscripts in my life, I have a translation table: Photo Image number, gathering, folio or page, content, commentary, and then columns of whatever I’m interested in (concordances or chapter number or dates or places or whatever – that’s for the assessment phase).
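A translation table like this can also be kept as a plain CSV file. The following is a minimal sketch in Python, not a description of the actual spreadsheet: the column names follow the list above, while the helper names are hypothetical and the sample row simply reuses the header example that appears later in this post.

```python
import csv
import io

# Core columns of the translation table; extra columns (concordances,
# chapter numbers, dates, places) could be appended to this list.
COLUMNS = ["image_number", "gathering", "folio_or_page", "content", "commentary"]

def write_translation_table(rows, handle):
    """Write rows (dicts keyed by COLUMNS) as CSV to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)

def lookup_by_image(rows, image_number):
    """Return the first row whose image number matches, or None."""
    for row in rows:
        if row["image_number"] == image_number:
            return row
    return None

# Hypothetical sample row, for illustration only.
sample_rows = [
    {"image_number": 1363, "gathering": "Gath3", "folio_or_page": "p. 34",
     "content": "", "commentary": ""},
]

buffer = io.StringIO()
write_translation_table(sample_rows, buffer)
table_text = buffer.getvalue()
```

Keeping the image number as the first column reflects the idea that the photo number, not a renamed file, is the key that everything else hangs on.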

So with the Chronicle, I now had a mass of mostly indecipherable eighteenth-century text entries carefully organized into a folder and managed via spreadsheet. Now it was time to start reading. Except, I don’t (yet) read Kurrentschrift, so I’ve got a whole mess of gobbledygook. Enter the wonderful world of technology! I know AI has its issues – not least of which is its ecological impact – but there are tasks for which it is exceptionally well suited, and it turns out that teaching the novice how to read new scripts is, for me, one of those true talents.

TRANSKRIBUS

I’ve heard of Transkribus (at https://www.transkribus.org/) for years; the idea of an app that can decode various historical scripts is an attractive short-cut for handwriting styles I don’t know, particularly since my focus is the content more than its presentation. I’m an extractivist: I want to know what the chronicle actually says and add those data-points to the story I’m telling. Also, I’m not keen to prepare editions – a chronicle is a side-witness to the music for me, not a central focus of my work. Many of my decisions reflect that perspective. I didn’t seek out a colleague for collaboration, for one thing, nor go to a paleographic institute. Hooray for brute force, right?

I searched out the Transkribus website and read all the (very helpful) guides that were prepared. I even watched two of the introductory videos, though I had to go to town for them to download at playable speeds. And, they have a capacity to try a few sample pages lower down on the page (scroll down to “try it out”). I chose a representative image and uploaded it to see what it did. Magic! From the loops and lines of Kurrentschrift emerged words that were, for the most part, German dialect, and familiar in style and spellings from other texts from the area. Success! I admit that I scooped up the sample reading and dumped it into a document file; I wanted to be sure that whatever I had, I saved.

The next step of learning was a several-day project. One of the best things about the Transkribus tool is that it has a lot of models, each trained on a particular set of documents. These models are available to apply to your document(s), and some of them work better than others. I literally made a list of ALL of the models that covered German Kurrentschrift of the 18th century and tested them with two different pages from my chronicle. For each, I did A/B testing: was this model better than that one? I kept notes on which ones did well, and went back to a couple of the models three or four times until I settled in on the one that seemed the most accurate on a first pass. I know that I could train the model for MY project, but I wasn’t interested in that this first time through, in part because I was a complete script-reading newbie, and didn’t want to mis-train the AI.

Once I had a model in hand – and had taken careful notes on its model number and name for scholarly purposes – it was time to start the transcription project. So, I created a free account (which currently gives you 100 pages of transcription free per month), and priced out the subscription model I’d use once we’re in the new fiscal year at my University.

As I planned my project, I realized that organizing the materials is an important consideration. There are “collections” in Transkribus, and “documents” within the collections. As a reminder to the reader: I’m not aiming at edition prep; I’m working toward extracting my data. So I created a hodge-podge organization that made sense to me. Instead of a collection that was the entire chronicle – something that I believe would probably be best practice – I broke out my chronicle into its gatherings, so I can navigate to-and-fro easily.

And then, I uploaded subsections of the gatherings as documents, rather than the entire gathering at a go or (at the other end of the spectrum) the individual leaves of the chronicle. This is being created for my convenience, after all, and this first go-round I wasn’t certain how things worked. I have between 4 and 16 pages in each “document.” I did learn that the Windows folder bugaboo, randomization, occasionally impacted my uploads, which is one of the reasons that I kept my “documents” short. I also decided to retain document naming based on image number; for me, my spreadsheet is the controlling document. Renaming is both time-intensive and an area in which error can enter. As a result, my documents are named such compelling things as “IMG_1421-1427.” It works for me. (On the other hand, my naming for the gatherings is a bit more obvious to the outsider: “ChronikGath3” works here, and continuously typing in “ThalbachChronikGath3” just seemed like more work than needful since I’m not contemplating doing this with other chronicles, at least not in the next three years.)
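One way to guard against shuffled file order before uploading is to sort the photos by their image number and group them into short batches named by their first and last image. The sketch below is a hedged illustration in Python, assuming filenames of the form IMG_1421.jpg (the extension is an assumption) and a fixed maximum batch size; it only plans the batches locally and does not talk to Transkribus. The function names are hypothetical.

```python
import re

def image_number(filename):
    """Extract the numeric part of a name like 'IMG_1421.jpg'."""
    match = re.search(r"(\d+)", filename)
    if match is None:
        raise ValueError(f"no image number in {filename!r}")
    return int(match.group(1))

def plan_documents(filenames, max_pages=16):
    """Sort files by image number, then split them into short documents.

    Returns a list of (document_name, files) pairs, where the name
    follows the 'IMG_<first>-<last>' pattern.
    """
    ordered = sorted(filenames, key=image_number)
    documents = []
    for start in range(0, len(ordered), max_pages):
        chunk = ordered[start:start + max_pages]
        first = image_number(chunk[0])
        last = image_number(chunk[-1])
        documents.append((f"IMG_{first}-{last}", chunk))
    return documents
```

In practice the batches described above follow subsections of a gathering rather than a fixed count, so the chunk boundaries would come from the spreadsheet, not from `max_pages`.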

Finally, after uploading the first document in the first collection, it was time to drive. I selected my pages, hit the “Recognize” button, and was taken to the interface. I added the “public model” that I had selected through testing, then took a deep breath, and hit “recognize.” The job runs in the background, and eventually the selected pages will have header colors that turn orange, to signal that the draft text is ready to review.

USING AI TEXT RECOGNITION TO CREATE A SEAT-OF-THE-PANTS EDITION

Here’s the part where things get wonderful. The AI model I chose is actually pretty decent with my text. As a new reader of Kurrentschrift, it took me a while to get the hang of it, but I used the process to teach myself the reading skills which will be necessary to me for this document and a couple of others upcoming. (I’m a 14th-15th century scholar; our handwriting is MUCH more legible, thank you very much!) For those who are in my boat, here are a few things I did that made learning to read the script go quickly.

First, I pull the transcribed text into a document file so that it’s on my local machine. (Remember, I’m that “rural internet” guru; failure to reach the world as a whole is as regular an experience as is going grocery shopping.) Alas, I haven’t been using the export function, though it’s there; instead, I cut-and-paste. It’s a rube’s approach, I know, but it’s fast, and it puts everything in a space I can edit with my own tools and habits. (I’m on LibreOffice these days; again, YMMV. But it’s free, and it doesn’t keep trying to put everything in OneDrive. Which is out in cyberspace. And often unavailable here at the farm. I’m glaring at you, Microsoft.)

To manage these texts, I insert headers for each individual page in all caps (to stand out from the transcribed text). For my purposes, the image number and the MS gathering and folio numbers suffice – along the lines of “PHOTO 1363 CHRONICLE GATH3 p. 34”. Also, like the AI transcription, I honor the line breaks of the original, so that toggling from transcription to image and back is easy. (Also, I insert my cut-and-paste as unformatted text; others might want the line numbers, but there were enough errors in line identification that I found it easier to do without.) This was a good cross-check on the randomized ordering that Windows imposes on file transfer; by checking each image against its image number and page or folio number, I was able to ensure that the order of my text was in fact the order of the chronicle (except that the chronicle gatherings are actually out of order, but that’s a fault in the manuscript, not the editor nor the technology!).
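The header convention and the ordering cross-check lend themselves to a small script. What follows is a minimal sketch in Python, assuming every header matches the “PHOTO 1363 CHRONICLE GATH3 p. 34” pattern exactly (folio-style references would need a looser pattern); the function names are hypothetical. It checks only that photo numbers ascend, since the page numbers may legitimately jump when gatherings are bound out of order.

```python
import re

# Matches headers such as "PHOTO 1363 CHRONICLE GATH3 p. 34".
HEADER = re.compile(r"^PHOTO (\d+) CHRONICLE GATH(\d+) p\. (\d+)$")

def page_header(photo, gathering, page):
    """Build the all-caps header line for one transcribed page."""
    return f"PHOTO {photo} CHRONICLE GATH{gathering} p. {page}"

def photo_numbers(text):
    """Collect the photo numbers from every header line, in document order."""
    numbers = []
    for line in text.splitlines():
        match = HEADER.match(line.strip())
        if match:
            numbers.append(int(match.group(1)))
    return numbers

def photos_in_order(text):
    """Return True if the photo numbers in the headers are ascending."""
    numbers = photo_numbers(text)
    return numbers == sorted(numbers)
```

Running `photos_in_order` over a pasted transcription flags a shuffled upload without requiring anyone to reread the pages themselves.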

Second, I got myself a couple of tables of cursive letterforms compared to Fraktur letter forms, so that the basic shapes were something I could puzzle through. I admit that my first pass awareness-level was so low that on the first four pages I read, the only word I could decode independently was “septuagesima.” However, once I learned that those really precise looking “n’s” were actually the letter “e,” I started to see the handwriting emerge from the page.

Third, it is my practice to work through systematically, allowing “bad readings” in order to get from zero to literate. I mangled my way through the first four pages, by which time the d’ as “der” and the dß as “das” were pretty clear. I go line by line, and I’ve learned to highlight the relevant information in different highlighter colors as I go. (For me, yellow is people, green is liturgy, blue is date or place, red is music, sweet music.) My goal is extraction, not perfection. It’s embarrassing to note that neither the AI nor I at first recognized the swoop at the end of words as an “-n.” Likewise, it took a while before I was confident enough to simply obliterate the AI’s suggestions for my own reading of a word. That said, it’s truly a case of learn-by-doing; as I hit page 20, I was starting to read each word instead of decoding it letter by letter.

Fourth, as a matter of process, I’m comfortable leaving in uncertainties. This work isn’t directly for publication, so if I wasn’t sure of a word, I would simply accept it or type in my best guess, then put in square brackets another possible reading, and frame things with question marks. For instance: “unser lieben Erbar [? frawen?] officii” – even as a newbie reader, the word “erbar” makes no sense here, but rather than worry about it at length, I put in my contextual reading and then moved on. I can search those up and revisit them after I’ve plowed through the first time.
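Because the uncertain readings follow a consistent bracket-and-question-mark pattern, they can be searched up mechanically on the second pass. Here is a minimal sketch in Python, assuming the markup always looks like “[? frawen?]”; the function name is hypothetical and a word processor's own find function would do the same job.

```python
import re

# Matches "[? frawen?]" and captures the alternative reading inside.
UNCERTAIN = re.compile(r"\[\?\s*([^\]]*?)\s*\?\]")

def find_uncertain_readings(text):
    """Return (line_number, alternative_reading) pairs for every flagged word."""
    found = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in UNCERTAIN.finditer(line):
            found.append((line_number, match.group(1)))
    return found
```

Since the transcription keeps the manuscript's line breaks, the reported line number points straight back to the matching line on the photographed page.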

Finally, as I indicated before, I reward myself with the “ping” of a data finding by using those highlighter buttons liberally. As I look back now over less than a month of intermittent work, I’ve got a long roster of people and events to code into my other note-taking systems. I haven’t harvested them yet, but they’ll be easy to identify as I finish up the process. Having those rewards in sight makes the days of “ugh, I can’t DO this” more bearable. And each time I return to the document, more and more of it looks like German instead of just “ink scrawls.”

MY TAKEAWAYS: THE MAGIC OF TECHNOLOGY

The reality is, the technology is remarkably impressive. Even without training it on my manuscript, it’s getting 75 to 80% of the text down properly. (It confuses Q for G, though: Quardian is not a word. Maybe next time I’ll try training the model.) That’s amazing!
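The post does not say how the 75 to 80% figure was arrived at, but one rough way to put a number on it is to compare the raw AI text for a page with the corrected version and count how many words survived unchanged. The sketch below is an assumption-laden illustration in Python using the standard library's difflib; the function name is hypothetical, and the sample phrase in the usage note is invented around the “Quardian” example.

```python
from difflib import SequenceMatcher

def word_accuracy(raw, corrected):
    """Estimate the share of corrected words that the raw text already had right.

    Both texts are split on whitespace and aligned word by word; the result
    is matched words divided by the number of words in the corrected text.
    """
    raw_words = raw.split()
    corrected_words = corrected.split()
    if not corrected_words:
        return 0.0
    matcher = SequenceMatcher(a=raw_words, b=corrected_words, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(corrected_words)
```

For example, comparing the invented raw line “der Quardian und das convent” with the correction “der Guardian und das convent” counts four of five words as correct. Word-level scoring is harsh on single-letter slips like Q for G; a character-level comparison would rate the same line more generously.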

It’s working from manuscript, and that’s an imperfect environment. Every so often, particularly when the scribe’s lines have a waver to them or when the page was curved in the photo, it mangles lines and mixes up word order – the manual corrective is absolutely necessary.

The benefit is that as a scholar, I’m ten times more competent with the script now than I was at the end of the first week. Having learned to read a cursive 16th-century hand without AI assistance, I can testify to the massive jump-start that having a plausible transcript provides, as long as I’m working systematically, letter by letter and word by word.

It’s just like practicing. If you work on the details and the techniques, there comes that moment where all of a sudden your perspective shifts from notes on the page to the sounds of the past. And that, my friends, is magical.

ACCESS
https://www.transkribus.org/
https://www.youtube.com/@transkribus

Sunday, February 16, 2025
