
Saturday, March 22, 2025

How To Jump-Start a Scholarly Article: The Plan

Elements of article planning: beginning stages
 

I’m newly collaborating with a colleague, and it’s made me hyperaware of my own system of article writing. We all have quirky shortcuts, and writing is ALWAYS a case of “you do you” (and I do mean that in the nicest way possible: you really should follow the thing that works for you as a writer!). But I also love hearing how other people approach things, so I thought I might usefully share my own standard ramp-up.

First, of course, is the good idea or the new data point. Something important enough to be written up, something bright and shiny, something interesting enough to pause a colleague in the hallway and talk about it. (Brainstorming ideas is a separate process, and not the subject of today’s post.)

And then, it’s time to do something with it, and that’s where we begin.

WHERE TO TARGET:

First, I figure out where I might like to write about it. I am a bona fide nerd; I keep a spreadsheet of possible venues, categorized by area. Oooo, this is medieval. No, it’s musicological. Gee, I should write for the regional history crowd. This one is definitely monastic. Whatever it is, there’s an outlet for that. (Fun fact: I also scrutinize my own article bibliographies for venues where my words might someday reside. Where do my so-interesting colleagues publish? Write that place down!)

So, I generally try to envision two or three different journals where this future “thing” might go. That will be shaped by style, by content, by my current proclivity for footnotes, and by the capacity to write mere words, or to include images/tables/figures, depending on my mood and on the nature of the bright idea.

I then sit with recent issues of each, getting a sense for what the current editor/editorial team seems to like. From that, I usually find a clear target, the place I want to publish this bright shiny thing. First task in writing is always to know your audience. Done!

HOW TO START:

Starting the writing is hard, I know. So, shortcuts help. I use that “sit with the journal” time to get a sense of the shape of articles from that journal. 

  • How many intro paragraphs?
  • Does the article use sub-headers or continuous narrative?
  • How much space devoted to the author’s method and how detailed does it get?
  • How many “big sections” are typically in the main body?
  • How many images / tables / figures are there?
  • What kind of conclusion does it use, and how “big picture” does it get?
  • What’s the total word count for each article?
  • How many sources are cited? Is this one of those tour-de-force places demonstrating complete bibliographic control, or is it more “here are a bunch of related books”? I have sometimes switched target journal based on those practices. Ahem.

I make a little table from 3 to 5 of the recent articles, and I also use that time to capture the citation conventions (and translation habits) that are typical. Yes, I know that most journals have a style guide for authors, but I am here to tell you that they are … uneven … in their level of detail.
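For anyone who prefers to keep that little table in code rather than on paper, here is a minimal Python sketch. The column names and every number in it are hypothetical placeholders, not tallies from a real journal; the idea is only to show how three sampled articles boil down to a "typical" shape.

```python
from statistics import median

# Hypothetical tallies from three recent articles in one target journal.
# Swap in your own counts; none of these figures comes from a real venue.
articles = [
    {"intro_paragraphs": 2, "body_sections": 3, "figures": 1, "word_count": 8000, "sources_cited": 45},
    {"intro_paragraphs": 3, "body_sections": 4, "figures": 4, "word_count": 9500, "sources_cited": 60},
    {"intro_paragraphs": 4, "body_sections": 3, "figures": 2, "word_count": 11000, "sources_cited": 90},
]


def typical_shape(rows):
    """Return the median of each tallied feature across the sampled articles."""
    return {key: median(row[key] for row in rows) for key in rows[0]}


if __name__ == "__main__":
    for feature, value in typical_shape(articles).items():
        print(f"{feature}: {value}")
```

The median is used rather than the mean so that one unusually long article does not skew the picture; that choice is a matter of taste.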

Why do all that work on topics unrelated to my bright shiny thing? Because these become your formulaic guide to how to approach your own writing.

BUILDING THE (BROAD, VERY BROAD) OUTLINE

For me, the next step is building my own article’s broad outline – capture the content in 5 to 7 big strokes, and distribute the number of paragraphs according to the shape you want it to have. Is there a climax to the article? A place where that treasured story needs to go?

Here’s where math comes in. My standard default article length is around 35 paragraphs, though my most recent article was actually 52 paragraphs after revisions, so, yeah. Choose a number somewhere between 30 and 60 paragraphs – those tables you built will help.

Of those 35 paragraphs, I figure I’ll spend about 3 paragraphs for the intro and 3 to 5 for the conclusion. Method might fall in intro or in main body, depending on my thinking and on the habits of the journal. We’re in the humanities; there is no one journalistic formula.

Then 25 to 28 “main body” paragraphs gives me space for about three categories of supporting data.  What are they? I try to come up with provisional sub-headers, since that will shape content disposition.
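The arithmetic is simple enough to do on a napkin, but here is a small Python sketch of it for those who like to fiddle with the numbers. The function name and the default values (35 total, 3 for the intro, 4 for the conclusion, 3 body sections) are illustrative assumptions drawn from the figures in this post, not a rule.

```python
def paragraph_budget(total=35, intro=3, conclusion=4, sections=3):
    """Split the main-body paragraphs as evenly as possible across sections."""
    body = total - intro - conclusion
    if body < sections:
        raise ValueError("not enough paragraphs left for the main body")
    # divmod gives the even share per section plus any leftover paragraphs;
    # the leftovers go to the earliest sections, one each.
    base, extra = divmod(body, sections)
    per_section = [base + 1 if i < extra else base for i in range(sections)]
    return {"intro": intro, "body": per_section, "conclusion": conclusion}


if __name__ == "__main__":
    print(paragraph_budget())
```

In practice the split is rarely even: if the article has a climax, that section probably deserves the extra paragraphs, so treat the output as a starting point to adjust by hand.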

As a frame of reference, that collaborative article that got me thinking about this? We have 6 “content points” identified, one of which has 4 subtopics. We’ll do more extensive outlining next week, but I already feel good about where this is going.

This is also a good moment to just free associate. Do I already know of subtopics? Are there authors I should cite? Can I bullet point any of this? Whatever you have an answer to, and this is important, WRITE IT DOWN. At this point, my “progress” might look like a bullet-point list, or a mind map, or a scrawled flowchart, or several pages of word-doodles in my notebook. But it’s a first-round “capture” of what I think I might be doing.

BUILDING THE BIBLIOGRAPHY

Then comes my favorite part: building out my reading list. I do love to go trolling through the literature. I want one of those, and one of those, and three of those… My habit is to have bibliography in at least two and maybe three areas.

One is the content area, obviously, and that often includes going through my old bibliographic lists. Is this a case of “go deep” on the monastic element? The musical one? The “cool thinking about contemporary topics using the past as a case study”? The list of citations will vary depending on the answer to that question.

But the part that’s the most fun is the “how could I approach this topic” reading. There’s a whole set of topic-adjacent literature to draw on, some from people whose work I know, and others who are new to me. That’s the real permissive joy of scholarship: the adding something new to one’s own perspective.

What that looks like for me is usually thinking about one of two things: methods that match, but content that differs, or content that’s similar but spaces or times that are different. For the former, it’s reading about community music – a scholarship largely focused on 20th/21st-century musicking experiences – and then applying it to 15th-century Vorarlberg. For the latter, it’s reading about chaplains in England and Bavaria in order to write about chaplains in Bregenz. (Austrian-focused chaplain lit would already be in my own content area.)

This “breadth” gives me a focus for reading. My lists of “new lit” typically run from 30 to 90 items for an article, though the handbook article I wrote recently wound up at 125 items or so. (Yikes!) 

I would like to take this moment to thank the staff of the Interlibrary Loan Office without whom life would be much much much more complicated. You make what we do possible.

Not all of these articles and book chapters are going to be in the bibliography, obviously, but it gives me a chance to poke at the shape of the field. I’ll cruise through them at the rate of about 10 articles a week. Some just get the “AIC treatment” – Abstract, Introduction, Conclusion, and a bit of “what is this doing” by flipping through the middle sections of the article. Others I read fully. Still others get extensive notes and make it into my everything notebook. But all of them bring me joy. (Except that one. That one was terrible. That gives me an excuse to grump about the state of scholarship. Grump grump grump. Which brings me a bit of joy. Plus, I now get to DELIBERATELY omit it from my bibliography.)

Once I’m into reading, I’m into writing. And with an outline in hand and a bunch of notes from my reading, I’m no longer facing the proverbial “scary blank page” that causes me angst. (Note: if you came here hoping for actual writing strategies, you might look at my discussions of strategies to avoid writer's block or strategies to organize writing tasks so you'll actually do them.)

And that, friends, is how I get started with an article. Figure out the “where” and how it does its business, map out a high-level overview of the current project, and generate the bibliography to support that work. Then it's time to go play with your material and do more formal writing. GOOD LUCK!

Sunday, February 16, 2025

Making magic with old texts – how one scholar uses Transkribus (2/16/25)

A snippet of the Thalbach Chronicle (Bregenz VLA Thalbach Hs 9) and the logo/motto from Transkribus: "Unlock the past with Transkribus"

I’m not a modern-day marvel; digital humanities *seems* cool, but it’s not my training and not my natural modality. I live on a farm, with all the attendant joys of rural internet. (Failed the ping test recently? Us too!) I have read a number of DH articles with interest, and adopted some of the intellectual practices that volume assessment allows for. But at last, the moment has come: it’s time to learn a new tool in order to make my regular work go faster.

There’s this monastic chronicle (Bregenz VLA Thalbach Hs 9), you see, and it’s unedited. That there is, in fact, data from within “my” convent is a wonderful thing, and I’m thrilled to have access. There are two big problems, however. First, it has not been digitized. And second, I don’t read (well, I didn’t read) 18th-century Kurrentschrift. So, here’s what I did.

STEP ONE: GET PHOTOS. With the permission of the archive, I was able to photograph the chronicle. My system relies on the step-by-step auto-numbering of photos, and is woefully “brute force” for the more sophisticated user. It is also (I confess it now!) simply a set of photos on my cell phone. No fancy lighting, no high-tech imaging for the ages; these are functional photos for use as a musicologist, not reflecting the book-history elements (which may be interesting but are not my raison d’être). Before I start, I prepare a written description of the object, with special note of handwriting changes, format changes (from 18 to 24 lines, for example), and so on.

To organize my photos, I start by taking a picture of the description of the MS that I prepared in advance, and of its gathering structure that I prepared on-site. Then I take pictures of the outsides of the MS and of the visually interesting bits that caught my early attention. (These pictures are for slide decks for any talk I might give – they’re the visual capture of the “coolness” of the item.)

Then, I take a picture of the gathering description for the first gathering, prepared in pencil in the archive. For more complicated gatherings, that includes a gathering structure diagram, but it’s often just a few lines of text. That becomes my photographic “label” for the section. Then, I take sequential pictures of the pages in the gathering. Occasionally I repeat a page if I want to be sure that I captured some particular detail, but mostly I move from first page to last page in the gathering. When I’m done with the gathering, I take a picture of the wooden table as a marker. Why? Because that’s going to leap out at me when I’m looking at all these photos of handwritten pages on my phone!

Now, onto the next gathering, and the next, and the next after that. Each starts with a header photo; each is sequential; each ends with a picture of the wooden table. At the very end, I go back and take pictures of details – with a label card or paper-pad notation first telling what gathering and folio it’s from, and why I thought it important (“detail of the insertion in the right-hand margin showing a different hand”).

Lastly, I go home and back everything up into a file folder. At this point, the sophisticate would probably rename all the photos, but I let the assigned image number stand for the page. YMMV. [That’s the increasingly old-fashioned phrase “Your Mileage May Vary,” if we happen to live in different acronym worlds.]

ORGANIZING IMAGES ON MY HARD DRIVE

With a couple hundred images, managing the inventory can seem daunting, but I’ve developed some habits over the years. I’m a spreadsheet person; spreadsheets make my heart sing. I love me some useful spreadsheets. So, for each of the important manuscripts in my life, I have a translation table: Photo Image number, gathering, folio or page, content, commentary, and then columns of whatever I’m interested in (concordances or chapter number or dates or places or whatever – that’s for the assessment phase).
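For fellow spreadsheet people, here is a minimal Python sketch of that translation table as a CSV file. The columns follow the list above, but the snake_case spellings and the sample row are illustrative assumptions, not a fixed template; extra columns for concordances, dates, or places can be appended to `COLUMNS` as needed.

```python
import csv
import io

# Core columns of the translation table; the exact spellings are assumptions.
COLUMNS = ["image", "gathering", "folio_or_page", "content", "commentary"]


def build_table(rows):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# One sample row; content and commentary get filled in while reading.
sample_rows = [
    {"image": "IMG_1363", "gathering": "3", "folio_or_page": "p. 34",
     "content": "", "commentary": ""},
]

if __name__ == "__main__":
    print(build_table(sample_rows))
```

Saving the returned text to a `.csv` file gives something that LibreOffice Calc or any other spreadsheet program can open directly.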

So with the Chronicle, I now had a mass of mostly indecipherable eighteenth-century text entries carefully organized into a folder and managed via spreadsheet. Now it was time to start reading. Except, I don’t (yet) read Kurrentschrift, so I’ve got a whole mess of gobbledygook. Enter the wonderful world of technology! I know AI has its issues – not least of which is its ecological impact – but there are tasks for which it is exceptionally well suited, and it turns out that teaching the novice how to read new scripts is, for me, one of those true talents.

TRANSKRIBUS

I’ve heard of Transkribus (at https://www.transkribus.org/) for years; the idea of an app that can decode various historical scripts is an attractive short-cut for handwriting styles I don’t know, particularly since my focus is the content more than its presentation. I’m an extractivist: I want to know what the chronicle actually says and add those data-points to the story I’m telling. Also, I’m not keen to prepare editions – a chronicle is a side-witness to the music for me, not a central focus of my work. Many of my decisions reflect that perspective. I didn’t seek out a colleague for collaboration, for one thing, nor go to a paleographic institute. Hooray for brute force, right?

I searched out the Transkribus website and read all the (very helpful) guides that were prepared. I even watched two of the introductory videos, though I had to go to town for them to download at playable speeds. And, they have a capacity to try a few sample pages lower down on the page (scroll down to “try it out”). I chose a representative image and uploaded it to see what it did. Magic! From the loops and lines of Kurrentschrift emerged words that were, for the most part, German dialect, and familiar in style and spellings from other texts from the area. Success! I admit that I scooped up the sample reading and dumped it into a document file; I wanted to be sure that whatever I had, I saved.

The next step of learning was a several-day project. One of the best things about the Transkribus tool is that it has a lot of subsets that use certain sets of documents as training tools. These models are available to apply to your document(s), and some of them work better than others. I literally made a list of ALL of the models that covered German Kurrentschrift of the 18th century and tested them with two different pages from my chronicle. For each, I did A/B testing: was this model better than that one? I kept notes on which ones did well, and went back to a couple of the models three or four times until I settled on the one that seemed the most accurate on a first pass. I know that I could train the model for MY project, but I wasn’t interested in that this first time through, in part because I was a complete script-reading newbie, and didn’t want to mis-train the AI.

Once I had a model in hand – and had taken careful notes on its model number and name for scholarly purposes – it was time to start the transcription project. So, I created a free account (which currently gives you 100 pages of transcription free per month), and priced out the subscription model I’d use once we’re in the new fiscal year at my University.

As I planned my project, I realized that organizing the materials is an important consideration. There are “collections” in Transkribus, and “documents” within the collections. As a reminder to the reader: I’m not aiming at edition prep; I’m working toward extracting my data. So I created a hodge-podge organization that made sense to me. Instead of a collection that was the entire chronicle – something that I believe would probably be best practice – I broke out my chronicle into its gatherings, so I can navigate to-and-fro easily.

And then, I uploaded subsections of the gatherings as documents, rather than the entire gathering at a go or (at the other end of the spectrum) the individual leaves of the chronicle. This is being created for my convenience, after all, and this first go-round I wasn’t certain how things worked. I have between 4 and 16 pages in each “document.” I did learn that the Windows folder bugaboo, randomization, occasionally impacted my uploads, which is one of the reasons that I kept my “documents” short. I also decided to retain document naming based on image number; for me, my spreadsheet is the controlling document. Renaming is both time-intensive and an area in which error can enter. As a result, my documents are named such compelling things as “IMG_1421-1427.” It works for me. (On the other hand, my naming for the gatherings is a bit more obvious to the outsider: “ChronikGath3” works here, and continuously typing in “ThalbachChronikGath3” just seemed like more work than needful since I’m not contemplating doing this with other chronicles, at least not in the next three years.)
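One way to guard against that shuffled ordering is to sort the files by image number before deciding what goes into each upload. The Python sketch below is an assumption-laden illustration: it expects phone-style names such as `IMG_1421.jpg`, cuts the sorted list into fixed-size batches (the batches described above follow the gatherings instead), and works entirely on local filenames without touching Transkribus.

```python
import re


def image_number(filename):
    """Extract the camera's running number from a name like 'IMG_1421.jpg'."""
    match = re.search(r"(\d+)", filename)
    if match is None:
        raise ValueError(f"no image number in {filename!r}")
    return int(match.group(1))


def plan_documents(filenames, max_pages=16):
    """Sort files numerically and group them into named upload batches."""
    ordered = sorted(filenames, key=image_number)
    documents = []
    for start in range(0, len(ordered), max_pages):
        chunk = ordered[start:start + max_pages]
        first, last = image_number(chunk[0]), image_number(chunk[-1])
        # Name each batch after its first and last image, e.g. IMG_1421-1427.
        documents.append((f"IMG_{first}-{last}", chunk))
    return documents


if __name__ == "__main__":
    shuffled = ["IMG_1424.jpg", "IMG_1421.jpg", "IMG_1423.jpg", "IMG_1422.jpg"]
    for name, pages in plan_documents(shuffled, max_pages=4):
        print(name, pages)
```

Sorting on the extracted number rather than on the raw filename keeps the order right even if some names have a different number of digits.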

Finally, after uploading the first document in the first collection, it was time to drive. I selected my pages, hit the “Recognize” button, and was taken to the interface. I added the “public model” that I had selected through testing, then took a deep breath, and hit “recognize.” The job runs in the background, and eventually the selected pages will have header colors that turn orange, to signal that the draft text is ready to review.

USING AI TEXT RECOGNITION TO CREATE A SEAT-OF-THE-PANTS EDITION

Here’s the part where things get wonderful. The AI model I chose is actually pretty decent with my text. As a new reader of Kurrentschrift, it took me a while to get the hang of it, but I used the process to teach myself the reading skills which will be necessary to me for this document and a couple of others upcoming. (I’m a 14th-15th century scholar; our handwriting is MUCH more legible, thank you very much!) For those who are in my boat, here are a few things I did that made learning to read the script go quickly.

First, I pull the transcribed text into a document file so that it’s on my local machine. (Remember, I’m that “rural internet” guru; failure to reach the world as a whole is as regular an experience as is going grocery shopping.) Alas, I haven’t been using the export function, though it’s there; instead, I cut-and-paste. It’s a rube’s approach, I know, but it’s fast, and it puts everything in a space I can edit with my own tools and habits. (I’m on LibreOffice these days; again, YMMV. But it’s free, and it doesn’t keep trying to put everything in OneDrive. Which is out in cyberspace. And often unavailable here at the farm. I’m glaring at you, Microsoft.)

To manage these texts, I insert headers for each individual page in all caps (to stand out from the transcribed text). For my purposes, the image number and the MS gathering and folio numbers suffice – along the lines of “PHOTO 1363 CHRONICLE GATH3 p. 34”. Also, like the AI transcription, I honor the line breaks of the original, so that toggling from transcription to image and back is easy. (Also, I insert my cut-and-paste as unformatted text; others might want the line numbers, but there were enough errors in line identification that I found it easier to do without.) This was a good cross-check to that randomizing ordering that Windows puts on file transfer; by checking each image against its image number and page or folio number, I was able to ensure that the order of my text was in fact the order of the chronicle (except that the chronicle gatherings are actually out of order, but that’s a fault in the manuscript, not the editor nor the technology!).
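That cross-check can also be done mechanically once the headers are in place. Here is a small Python sketch, assuming every page header follows exactly the “PHOTO 1363 CHRONICLE GATH3 p. 34” pattern with a plain page number; the function names are hypothetical, and headers that use folio references would need a looser pattern.

```python
import re

# Matches header lines such as "PHOTO 1363 CHRONICLE GATH3 p. 34".
HEADER = re.compile(r"^PHOTO (\d+) CHRONICLE GATH(\d+) p\. (\d+)$", re.MULTILINE)


def page_header(image, gathering, page):
    """Build the all-caps header line for one transcribed page."""
    return f"PHOTO {image} CHRONICLE GATH{gathering} p. {page}"


def images_out_of_order(transcript):
    """Return image numbers that do not come after the header before them."""
    numbers = [int(m.group(1)) for m in HEADER.finditer(transcript)]
    return [cur for prev, cur in zip(numbers, numbers[1:]) if cur <= prev]


if __name__ == "__main__":
    text = "\n".join([
        page_header(1363, 3, 34), "first page of text",
        page_header(1365, 3, 36), "third page of text",
        page_header(1364, 3, 35), "second page of text",
    ])
    print(images_out_of_order(text))
```

The check only looks at image numbers, so pages that are genuinely out of order in the manuscript itself will not be flagged as long as the photos were taken in sequence.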

Second, I got myself a couple of tables of cursive letterforms compared to Fraktur letterforms, so that the basic shapes were something I could puzzle through. I admit that my first-pass awareness level was so low that on the first four pages I read, the only word I could decode independently was “septuagesima.” However, once I learned that those really precise looking “n’s” were actually the letter “e,” I started to see the handwriting emerge from the page.

Third, it is my practice to work through systematically, allowing “bad readings” in order to get from zero to literate. I mangled my way through the first four pages, by which time the d’ as “der” and the dß as “das” were pretty clear. I go line by line, and I’ve learned to highlight the relevant information in different highlighter colors as I go. (For me, yellow is people, green is liturgy, blue is date or place, red is music, sweet music.) My goal is extraction, not perfection. It’s embarrassing to note that neither the AI nor I at first recognized the swoop at the end of words as an “-n.” Likewise, it took a while before I was confident enough to simply obliterate the AI’s suggestions for my own reading of a word. That said, it’s truly a case of learn-by-doing; as I hit page 20, I was starting to read each word instead of decoding it letter by letter.

Fourth, as a matter of process, I’m comfortable leaving in uncertainties. This work isn’t directly for publication, so if I wasn’t sure of a word, I would simply accept it or type in my best guess, then put in square brackets another possible reading, and frame things with question marks. For instance: “unser lieben Erbar [? frawen?] officii” – even as a newbie reader, the word “erbar” makes no sense here, but rather than worry about it at length, I put in my contextual reading and then moved on. I can search those up and revisit them after I’ve plowed through the first time.
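A plain “find” in the word processor will turn those brackets up, but for a longer transcript a short script can list them all at once. This Python sketch assumes every uncertain reading is wrapped exactly as in the example, with a question mark just inside each square bracket; the function name is hypothetical.

```python
import re

# Matches bracketed guesses such as "[? frawen?]" and captures the guess.
UNCERTAIN = re.compile(r"\[\?\s*([^\]]*?)\s*\?\]")


def find_uncertain(transcript):
    """Return (line number, alternative reading) pairs for later review."""
    hits = []
    for number, line in enumerate(transcript.splitlines(), start=1):
        for match in UNCERTAIN.finditer(line):
            hits.append((number, match.group(1)))
    return hits


if __name__ == "__main__":
    sample = "unser lieben Erbar [? frawen?] officii"
    for line_number, reading in find_uncertain(sample):
        print(f"line {line_number}: {reading}")
```

Because the transcript keeps the line breaks of the original, the reported line number points straight back to the matching line on the page image.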

Finally, as I indicated before, I reward myself with the “ping” of a data finding by using those highlighter buttons liberally. As I look back now over less than a month of intermittent work, I’ve got a long roster of people and events to code into my other note-taking systems. I haven’t harvested them yet, but they’ll be easy to identify as I finish up the process. Having those rewards in sight makes the days of “ugh, I can’t DO this” more bearable. And each time I return to the document, more and more of it looks like German instead of just “ink scrawls.”

MY TAKEAWAYS: THE MAGIC OF TECHNOLOGY

The reality is, the technology is remarkably impressive. Even without training it on my manuscript, it’s getting 75 to 80% of the text down properly. (It confuses Q for G, though: Quardian is not a word. Maybe next time I’ll try training the model.) That’s amazing!

It’s working from manuscript, and that’s an imperfect environment. Every so often, particularly when the scribe’s lines have a waver to them or when the page was curved in the photo, it mangles lines and mixes up word order – the manual corrective is absolutely necessary. 

The benefit is that as a scholar, I’m ten times more competent with the script now than I was at the end of the first week. Having learned to read a cursive 16th-century hand without AI assistance, I can testify to the massive jump-start that a plausible transcript provides, as long as I’m working systematically, letter by letter and word by word.

It’s just like practicing. If you work on the details and the techniques, there comes that moment where all of a sudden your perspective shifts from notes on the page to the sounds of the past. And that, my friends, is magical.

ACCESS
https://www.transkribus.org/
https://www.youtube.com/@transkribus
 
