Showing posts with label scholarly practices. Show all posts

Saturday, April 5, 2025

Managing Bibliography By Spreadsheet

In this post, I walk through how I structure and use a bibliography spreadsheet—from initial setup to prioritization, assessment strategies, and color-coded insights. Whether you’re in the thick of a new project or trying to wrangle a pile of PDFs into something coherent, this system can help you turn “reading” into actual, visible progress.

INTRO: WHY SPREADSHEETS

Smart people all over academe swear by their own choice of bibliographic tools. Zotero, Mendeley, EndNote: citation managers of all sorts are lovely. But that’s not how I work anymore. EndNote once ate my close-to-a-thousand item listing, and while I got most of it back due to my back-up diligence, it took me untold frustrating hours and permanently pissed me off. And Zotero doesn’t appreciate my multiple identities. Wait, which account am I logged in on? Let’s just say, I have developed serious trust issues on the software front.

Thus, I’m a spreadsheet fan. Give me that good, old-fashioned, sortable, controllable overview of what I’m doing, hosted on my home device, and I’m a happy camper. (By which I mean happy scholar; happy camper is NEXT weekend!)

It’s true, I have to remember which spreadsheet had what. And I’m at the edges of managing all the things for the next book, with nested sheets and a sense that I’d better plan a weekend retreat to review it all. But, article-wise, there’s nothing like a spreadsheet to give you a sense of your bibliography!

If you want to focus on how to shape the content of your bibliography, I got a start at describing my technique in my post on jump-starting a scholarly article. Here I focus on what to do once you have a sense of what you want to read. I borrowed some of my approach from Raul Pacheco-Vega, https://www.raulpacheco.org/, and if you like reading about approaches to scholarship you should absolutely subscribe to him. He’s brilliant at sharing how he works, and his approach is enough like mine that I browse there regularly to see if I want to include any recent posts in my course reading-lists.

SET UP THE SPREADSHEET (AND READING PLAN)

My approach to article bibliography starts with mechanics. I open a spreadsheet (for me, LibreOffice has been working, though I will often import to Microsoft Office for sorting purposes later in the process). I generally name the file with at least two elements: a one- or two-word title, plus the designation “_30day” to remind me that this is the go-fast assessment of the literature.

Sample Filenames

Why 30 days? Well, a 30-day reading challenge is bite-sized. We can typically pledge to find at least 20 minutes for 30 consecutive days, and if you take 20 minutes a day every day to say, “what is this thing and why is it important,” you can build up a picture of your area pretty quickly. So, I generally target having 30 items to start, and then add to it as I work through my reading. (To keep my momentum, I treat books as a series of chapters, and each chapter is an “item” on the spreadsheet. Your mileage may vary.)

Why a consistent naming element like _30day? Well, “Bibliography” tends to get overused elsewhere in my life – other people’s bibliography PDFs download that way; students send me bibliographies all the time for this project or that one. Nobody sends me documents labeled _30day – so searching on it in my hard drive makes it easy to find!
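For the script-minded, here is a minimal Python sketch of the naming habit and the hard-drive search. The function names, the sample titles, and the `.ods` extension (LibreOffice's spreadsheet format) are my own illustrative assumptions, not part of any tool.

```python
from pathlib import Path

SUFFIX = "_30day"  # the consistent naming element described above


def make_filename(short_title: str, extension: str = ".ods") -> str:
    """Build a name such as 'IconPrayers_30day.ods' from a one- or two-word title."""
    words = short_title.split()
    if not 1 <= len(words) <= 2:
        raise ValueError("use a one- or two-word title")
    return "".join(words) + SUFFIX + extension


def find_reading_lists(root: Path) -> list[Path]:
    """Return every file under root whose name contains the _30day marker."""
    return sorted(p for p in root.rglob("*") if p.is_file() and SUFFIX in p.name)
```

Calling `find_reading_lists(Path.home())` would walk the whole home folder, which may be slow; pointing it at a research folder is probably the kinder choice.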

So now you have a named file, and a pile of bibliography to enter. I plug the full citation into the spreadsheet, one per row, properly formatted to your preferred style. I’m a modified Chicago practitioner, myself; I keep the city for book citations, for instance, since some journals require it and others prohibit it – I clean that sort of thing up at the “final proofing” stage. And I’m forever forgetting WHERE the page references come for an article in a collection so I know I’ll have to fix that later too. The point is to get the details in so you’ve got them to hand.

And then, I add the “working columns” for the bibliography. Here’s my header list from a recent project, columns A through R:

    • Author, Year, Priority, Status, Citation
    • Author's Big Idea, Gap They Fill, Paper Sections
    • evidence 1, evidence 2, evidence 3, evidence 4
    • quote 1, quote 2, quote 3, quote 4, other
    • cited works, abstract from elsewhere

If you want, you can download a copy of the 30day spreadsheet template.

As you can see, some columns want to be wide, and others can be relatively narrow to start. I do a mix of word-wrapping and letting the text trail off into invisibility as I go; I toggle those features regularly as I’m working with my sheet. But setting things up gets me ready to make visible progress, and I’m a sucker for having my work show-as-I-go. Yay, dopamine.
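If you would rather build the blank sheet than download it, the header list above can be sketched as a CSV file that any spreadsheet program will open. This is a rough sketch under my own assumptions: the function name and the default of 30 empty rows (one per target item) are illustrative, and only the header names come from the list above.

```python
import csv
import io

# Header names copied from the list above, in the same order.
HEADERS = [
    "Author", "Year", "Priority", "Status", "Citation",
    "Author's Big Idea", "Gap They Fill", "Paper Sections",
    "evidence 1", "evidence 2", "evidence 3", "evidence 4",
    "quote 1", "quote 2", "quote 3", "quote 4", "other",
    "cited works", "abstract from elsewhere",
]


def blank_template(rows: int = 30) -> str:
    """Return CSV text with the header row plus `rows` empty rows."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(HEADERS)
    for _ in range(rows):
        writer.writerow([""] * len(HEADERS))
    return buffer.getvalue()
```

Writing the result to disk is one line, for example `Path("Kazoo_30day.csv").write_text(blank_template(), encoding="utf-8")` after `from pathlib import Path`, where the filename is again only an example.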

WORK THE LIST

The good news is that the first part of working the list is easy to do. This is a great end-of-day or start-of-day activity since it doesn’t require much brainpower. Take the first columns – Author, Year, Priority, Status, Citation – and plug them in. I find author and year sorting to be useful, though technically you could manage that with the Citation column. But separate columns let me do Author/Date placeholders as I write, and translate easily to full citation. It works for me.

Prioritizing is up to you. I tend to choose a number from 5 (read first!) to 1 (read when I get there). I might have 7 “read first” assignments, but that just means that it will get done in that first week’s pass.

Status, on the other hand, is a management tool. I’ve evolved my own shorthand if it’s helpful:  
  • HAVE means it’s a PDF in my file-folder or notes in a file somewhere.
  • DONE means I’ve actually assessed it.
  • READ ME means I’ve assessed it and want to look at it more closely.
  • NOT means I’ve read it and it isn’t right for the current project.
  • ILL-Date means I placed an order for it.
  • FETCH means I need to head to the library to grab that thing.
  • A blank in that space means I have some bibliographic hunting to do to get on top of it; that often happens with new citations at the bottom of the list. I save those for brain-dead time.
  • AGAIN means my brain wasn’t up to it this time, and I should come back to it later.
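The Priority and Status columns together answer the question "what do I read next?" Here is a small Python sketch of that sort. It assumes, as my own reading of the shorthand above, that HAVE and AGAIN are the statuses that mean "ready for an assessment pass"; the `Item` class and function name are hypothetical.

```python
from dataclasses import dataclass

# Assumption: items marked HAVE (in hand) or AGAIN (come back later)
# are the ones waiting for an assessment pass.
NEEDS_ASSESSMENT = {"HAVE", "AGAIN"}


@dataclass
class Item:
    author: str
    year: int
    priority: int  # 5 = read first, 1 = read when I get there
    status: str    # shorthand from the list above; "" for a blank cell


def reading_queue(items: list[Item]) -> list[Item]:
    """Items awaiting assessment, highest priority first, then by author and year."""
    ready = [item for item in items if item.status in NEEDS_ASSESSMENT]
    return sorted(ready, key=lambda item: (-item.priority, item.author, item.year))
```

In the spreadsheet itself, the same result comes from filtering on Status and then sorting on Priority, descending.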

ASSESS THE ITEM: AIC READING

AIC reading – Abstract, Introduction, and Conclusion – is the law of the land. Seriously, we diligent types were taught to read things in order. But, and here comes a HUGE caveat, reading in order is NOT the best way to get the info to stick. Instead, move around the article or book chapter with abandon.

Start by reading the abstract. What does it claim it’s going to do? Then, read the intro, or the first 3-5 paragraphs if the item lacks subheaders. What’s the context, what’s the author’s stated task, what’s the thesis? Then flip to the conclusion: what’s the big claim and why does it matter?

Generally, I recommend performing an AIC, and then going back and filling out the next couple of columns: Author's Big Idea, Gap They Fill, Paper Sections. (I’m serious about the paper sections – you don’t have to quote the author’s subtitles, but give the gist of how they divvy up their information. Your later search strategies will thank you.) Each of these is telling you where they are trying to fit into the scholarly conversation, and also what will be useful for you in situating your own work.

This is not a “read every word” process. Be one with that.

By now, you’ll have spent somewhere between ten and twenty minutes on your item. Urgent class prep pressing? Household tasks such as making and eating food on the necessary list? Then you can pause and put a pin in it. Change your HAVE to DONE or NOT or, very rarely, AGAIN – that last a designation to tell you to come back to it, perhaps with another cup of coffee in your system. This is progress, and progress is good.

READ FOR THE EVIDENCE

Once you have time, move on to a more detailed assessment. Even this may be a “skim” more than a “read deeply.” Remember my status column? Part of the overview of the literature is designed to tell me which of the items I want to prioritize for full reading. They get a “READ ME” status until I’ve got an afternoon slot or a morning coffee work-cycle to spend the hour or two to work through details. So your task at this stage is to figure out how the author is working and what they have to say, and with what tools.

For this stage, I tend to flop through once, with an eye for what each paragraph is doing. Is it documentary? Analytical? Critical-theory based? This information will presumably marry up with the article sections you’ve already reviewed, and now you’re learning more deeply what this particular contribution is offering. As I go, I make comments on the evidence used and on quotes I may want to integrate in my own writing later on.

This is a kind of quick-and-dirty notetaking, but I’m careful to put quotes around quoted material or key phrases, and to put my initials in front of material that I am saying in response to that bit of the reading. Here’s a typical entry:

Bijsterveld says: “inalienable objects”… “they symbolize or represent owners… their power and virtues.” (Bijsterveld 2007, p. 86). Of these, Bijsterveld lists arms, jewellery, crowns and regalia, relics, precious and holy books, objects connected with princely descent, costly textiles, precious materials, names, stories, sagas, etc: CJC SAYS: all get their power from associative knowledge – the relational ties of donor/recipient, and the meaning of the context, not the object itself.
(Bijsterveld, Arnoud-Jan A. Do ut des: Gift Giving, Memoria, and Conflict Management in the Medieval Low Countries, Middeleeuwse Studies en Bronnen CIV (Hilversum: Verloren, 2007), Ch 4)

Is it eloquent prose? No. Does it get the content across in ways I can mentally access it again? You bet. This is the quick-and-dirty approach. Writing will emerge from this, but this is only a single stone in the creek-crossing of knowledge-building. You’re going to need to heft several of these before the path forward emerges – so don’t get too wrapped up in getting it perfect; focus on getting it down.

How do I note-take at this point? This is the joy of the spreadsheet. I toggle back and forth from evidence to quote and back again. Occasionally, an item will be SO rich that it actually gets a second row in my spreadsheet, but mostly those go into the READ ME and get their own “document” with notes and responses. How do I decide? Mood of the moment, really. And, I always try to leave enough keyword info in the spreadsheet that a term search will pull up the right set of articles.

And then there are the last two columns. If you had to choose 3 items from this person’s bibliography, which were the most important to them? I give a shorthand citation, unless I decide that I really need to read it too, in which case it gets a shorthand reference in the right-hand column AND an entry in my own citation column. And, I cut-and-paste the abstract in if it’s convenient. About a quarter to a third of items wind up with abstracts cut in.

MANAGE INFORMATION: COLORS AND COLUMNS

But wait, there’s more. One of the habits I’ve developed that keeps me moving forward quickly is the use of color. As I take notes, I find that certain things leap out as important to my own argument. Those cells become green. There are others that need more thought, or might apply, or make me mad in one of those “that’s not it” sorts of ways that helps clarify what I am thinking. Those become yellow (or if urgent, orange), because I want to come back to them and sit with the idea some more. I’ve sometimes used urgent red to get my attention the next work cycle or to track through inter-library loan until that issue is resolved. But mostly, my spreadsheet is green, yellow/orange, or void.
 
Excerpt from a Shakespeare-related 30-day bibliography for a conference paper, showing the use of color
 


The orange cell here actually included 4 things: a quote, a cross reference to another scholar, my own thoughts, and the details of what I needed to track down for my argument.

Dr Johnson: "The meek sorrows and virtuous distress of Catherine have funished some scenes which may justly numbered among the greatest efforts of tragey. But the genius of Shakepseare comes in and goes out with Catherine. Every other part may be easily conceived, and easily written." Perhaps per Hannah Pritchard, recreated role annually at Drury lane 1752-1761. CJC: the long popularity of the play in performance speaking against its relative neglect by critics. (Rep of Sarah Siddons, helen Faucit, Ellen Terry, Ellen Tree, Charlotte Cushman. Bowers p. 29-30)  Similarly, the XXX thatcher and gooch suggest that the play is more amenable to the sounding and staged interpretations than the literary ones.

(Bowers, A. Robin. 1988.  "'The Merciful Construction of Good Women': Katherine of Aragon and Pity in Shakespeare's King Henry VIII." Christianity and Literature 37, no. 3 (1988): 29-51.)

You’ll notice the typos (ewwww, typos) and the shorthand; notes are not prose! Think of your notes as an action item, a step on the way toward actual argument. Don’t polish the stone – it just needs to get you across the stream.

A second add-on tool is the “additional column” trick. As I’m working with a given topic, it can be handy to group the bibliography in various ways. For the prayers-before-an-icon article that I’m currently researching, for instance, I added both a “category” column and a “statue” column to my spreadsheet so that I can group all the “gesture” bibliography, or take a look at everything that includes statues as a part of their evidence. These are functional project-based bibliographies, and infinitely adaptable to need.
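Both the extra-column trick and the keyword habit amount to simple filters once the sheet is exported. Here is a minimal Python sketch, assuming each row has been read into a dictionary keyed by column header (for instance with `csv.DictReader`); the function names are hypothetical.

```python
def keyword_search(rows: list[dict], term: str) -> list[dict]:
    """Rows in which any cell contains the term, ignoring case."""
    needle = term.lower()
    return [
        row for row in rows
        if any(needle in str(value).lower() for value in row.values())
    ]


def group_by(rows: list[dict], column: str) -> dict[str, list[dict]]:
    """Group rows by the value in one column, such as an added 'category' column."""
    groups: dict[str, list[dict]] = {}
    for row in rows:
        groups.setdefault(row.get(column, ""), []).append(row)
    return groups
```

So `group_by(rows, "category")["gesture"]` would pull together the gesture bibliography, and `keyword_search(rows, "statue")` would catch any note that mentions statues even where the dedicated column was left blank.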

WHY SPREADSHEETS

And that brings us to why I spreadsheet to begin with. I don’t always, to be honest. My kazoo project (ahem, yes, I have a kazoo project) is a long 20-plus page document file with notes intermingled with the citations. 

But when I want a quick overview, or when I’m trying to ready myself for writing, I’ll often back-migrate my information into spreadsheet format. The act of adding keywords or search terms serves as part of my self-discipline, a guarantee that I really actively review my information, and don’t just look at it. You know, the way one “looks” at things.

Passing your eyes over something is different than acting on it, and the spreadsheet, with its columns and colors, retyped sub-headers and great one-liners, ensures that my reading stays active. It is true that the cut-and-paste of bibliography across projects can get clunky, since some will have an extra two columns, and others five, and others none. Making sure the data align *is* a pain in the neck.

But that’s easy work, whereas accountability is hard work. Spreadsheets for me are a tool of accountability, both to the scope of what I want to have read, and to the speed with which I want to get this project done.

And if you find a tool that hits your dopamine receptors on a regular basis to encourage you to do more of it, in a scholarly-productive way? Keep using that tool, whatever it is! For me, it’s spreadsheets for the win!


RESOURCES

Cynthia Cyrus, "How To Jump-Start a Scholarly Article: The Plan," Silences And Sounds [Blog], March 22, 2025, https://silencesandsounds.blogspot.com/2025/03/how-to-jump-start-scholarly-article-plan.html

30-day reading list template:  https://docs.google.com/spreadsheets/d/1kx8olonvggdhQk2SK7NuP8_VzRViR39Fhzb-bqPsypI/copy

Raul Pacheco-Vega website: https://www.raulpacheco.org/blog/, and especially his resources page: https://www.raulpacheco.org/resources/

Wednesday, March 12, 2025

50 Questions: Adulting as Faculty

So many questions...

  1. Have you checked your spam folder?
  2. What about your institutional spam folder?
  3. What about the invisible-to-you institutional spam folder?
  4. Have you released the possibly important messages?
  5. What about the ones asking you to do things for the discipline?
  6. Or the ones from the listserv you subscribe to?
  7. Have you deleted the older messages?
  8. No, we can’t let you delete more than 50 messages at a time. Try again.
  9. Have you saved the messages you’ll need three months from now? What about the ones that are in the “maybe” category?
  10. Did you back up the email chain for your advisee?
  11. Have you reviewed their progress toward degree?
  12. Did you write the student?
  13. Did you copy the dean?
  14. Did you send them a reminder about their appointment? Don’t you want them to graduate?
  15. Have you backed up your computer recently?
  16. Have you enabled the institutional cloud backup?
  17. Why not?
  18. What do you mean it bricks your computer?
  19. Have you written to IT about that?
  20. Did you copy the proper Associate Dean?
  21. Did you give documentation?
  22. Have you checked your phone messages? Yes, your office has been assigned a phone number. No, we haven’t assigned you a telephone.
  23. Have you checked your computer file where we put such messages?
  24. How do you expect to recruit students if you don’t check messages on your nonexistent phone?
  25. Have you updated your C.V. recently?
  26. Did you remember to include the peer review stint last week?
  27. Did you give the journal ISSN?
  28. Has that populated through to your online CV?
  29. How are we supposed to establish a reputation if you don’t share your work? No, we can’t just let you submit a PDF. You need to retype that information in our form. No, we can’t accommodate umlauts. No, it’s not set up for italics. No, we don’t have a category for that. Put it under “Other.”
  30. Have you submitted your mid-semester grades?
  31. Have you met with any delinquent student?
  32. Why not?
  33. No, Zooming the sick ones into class isn’t allowable. Yes, it used to be required. No, we can’t explain our policy. Why do you ask?
  34. Have you uploaded your material to the LMS?
  35. Is any of that material copyrighted?
  36. Not by you, by real copyright holders?
  37. Did you pay the fees? Did you charge the students? Of course we understand that you thought that use was covered by the library agreements. Have you always been an optimist?
  38. Have you planned next year’s classes?
  39. Did you submit the times and rooms?
  40. Those rooms aren’t available, now where do you want it?
  41. Main campus isn’t available, now where do you want it?
  42. What do you mean the students won’t fit in the room? Have you considered a rotating attendance policy?
  43. Yes we know that it has passed curricular review. No, it’s not ready to show in the system. Don’t you think the staff have enough to do?
  44. Have you filled out your institutional satisfaction survey?
  45. What did you say?
  46. Privacy is so 1990s. Let us know how we can help you. We see that you haven’t yet responded.
  47. Have you published anything this year? Was it a book? Then why are you bothering us about it?
  48. Have you responded to the committee meeting-time poll yet? Which committee? What do you mean?
  49. Have you backed up your computer yet? This is your second warning. We can’t be accountable for any glitches in that process. It worked perfectly for another faculty member on another system altogether.
  50. Do you still feel valued?

Monday, March 3, 2025

My Late Lamented “Everything Notebook”

Blue Spring 2025 notebook, plus a page from a previous notebook

I’m a paper person. I think best while writing; I am an inveterate list-maker; I write up things as a bit of anticipatory joy; I take notes in pen on things to come back to for class. My life is wrapped up in bound notebooks, not just in the abundant books-for-life that I am actively reading at any given moment. (Fantasy! Gardening! Nuns! The Gaze! Space! Soundscapes! If it has words, I probably want to read it. But that’s not what this post is about.)

Because writing is so integral to life, I carry an Everything Notebook almost everywhere I go. It’s gone to meetings (so many meetings); it’s been there while I’ve read email (put a note on the list of future agenda topics); it’s been there when I needed to outline or brainstorm; it’s been there at those difficult draft stages when the ideas need to move hither and thither. It’s done poetry, and drafts of valentines notes, and organized the garden. It has made note of trail damage to report to the ranger; it has the outline of the backpacking trip I want to take. It has non-Amazon book buying websites, and great quotes for the next time I teach that writing class.

And it’s gone. Sometime last weekend, the winter Everything Notebook escaped for freedom. It’s not in the scout bag, it’s not at the bottom of the car, it’s not at Lost and Found, it’s nowhere to be found.

The good news is, there were only about five pages of future-book related notes, and those are mostly mentally recoverable. I had just submitted project 1; I had also submitted project 2; and project 3 is up in the cloud, with part 1 out for review and part 2 in a brand new group brainstorm. There’s never been a better time to lose the Everything Notebook, because nearly all of its big sections are in the “done and done” stage, checked off with gigantic check-marks cutting across the page.

I can reconstruct most of it; there’s probably about three hours of focused work that I need to do to feel fully “back in control.” My list of Amazon substitutes will be out there in social media; the to-do list for the garden I can reconstruct in the car as we drive up north at the end of the week, and I’m pretty sure I’ve got notes on nearly all that I’ve read saved on the computer somewhere. I’ll be missing a couple of great quotes (“here’s that place I wanted to share this thing I once read” doesn’t go over as well as the quote I’d actually copied out, alas), and my record of what we did over the long holidays will be more memory than archive. But it’s okay.

WHAT?
So now, here I am, starting a Spring-based Everything Notebook. I’ve put the blue tape on the front with its label. I’ve foliated the thing (that is, put numbers on each leaf, or folio, rather than on every page – so each opening has a number). I have left my space for a table of contents.

But I thought you, dear reader, might like to know about the concept of the Everything Notebook.

I know a lot of people invest heavily in theirs, with nice almost-like-cardstock pages. I get mine as bound books (wide ruled!) from the dollar store. It’s going on canoe trips in my backpack, so cheap cardboard covers and a capacity to take notes are my priority. But no spiral binding; spiral binding gets squashed and catches on things. A plain old bound composition book, one that doesn’t create a hurdle to writing in it. (I once carefully inscribed the title page of a notebook. That notebook was too nice for real use and it languished. Now, I just put a title and my contact information. My current notebook has a pre-torn spot on the cover and came pre-installed with a coffee stain when the cat bumped my arm one morning – it’s messy and disposable enough to USE, not to CHERISH.)

My brand new slightly soiled and torn Everything Notebook – plus a page out of one from last year

And treatment of the Everything Notebook differs from one person to the next. Some people are crafty and elegant. Their handwritten notebooks are works of art, with beautifully drawn flowcharts and multi-colored pen annotations that could be reproduced in their next article. Not me. My Everything Notebook is a squawky thing, with mind-maps with words sideways and angled to draw attention to this or that relationship, and lined-off lists of tasks accomplished, and giant caps to remind me to do “the thing” the next time I see that page. 

 A scholarship brainstorm with lines aslant

Some people use fountain pens and inscribe their notebooks as beautifully crafted legacies for their progeny who may someday consult these pages of wisdom. My handwriting ranges from the tidy to the out-of-control scrawl as the car bounces up and down on carpool days. Nor do I use the fountain pens or multicolored pen shades to carefully shape what a person notices. Instead, in mine, there’s a mix of all-too-bleedable felt-tip ink with good solid ballpoint and even pencil that smears because for me, different kinds of writing implements support different kinds of thinking. This is a work space, not, for me, a pretty one. I don’t put on my glasses when I jot a note in the middle of the night, I just try not to overwrite things already noted down – though it’s been known to happen.

Page labeled “Brahms” with some hotel scribbles, a note about Climbing safety, and prep notes for a training. Pretty? No. Functional? Yes.

SOURCE AND ORIGIN
I came to the Everything Notebook from two places. First, I used to take topical notes: one topic in one bound composition book. I’ve got three such notebooks from my early days as a monastic scholar, for instance – notes on readings, lists of convents, those kinds of things. And then I had another book on Beethoven, and one on Mozart, and piles of paper for my to-do list. Packing for the office was regularly a virtuosic act of “where did I put the things?” and not the calm collected departure that leads smoothly into the productive part of the day. Besides, I was forever leaving a notebook at the office, and needing it at home or vice versa, and writing on slips of paper to be added in later, and … well, it was a mess.

So by moving to a single bound composition notebook – large enough to fit a lot on a page, small enough to fit handily in even my smallest daypack – and ensuring that it goes everywhere with me, I’ve done away with a lot of the paper clutter. (Cue my family laughing heartily; but now my paper clutter is at the level of the article draft or PDF printout rather than at the level of little slips of paper. Trust me, it’s an improvement!) That “aha” moment was about a decade ago, after one last frustrating search for the list I’d made just the day before. I rage-wrote the list in the back of the nuns notebook. And then the lightbulb went off: what if I just put everything in one place? And the Everything Notebook was born.

The second inspiration was my 20-page to-do list. Okay, not everyone does that kind of self-organizing. But I found early on that each of my projects (and I have had a lot of projects) has about 10 things I’m trying to track. Schedule the next meeting, draft five bullet points, find the verb list and write up that Learning Outcome chart for the bureaucrats out there. Now, each project can be tracked at once in one place with my Everything Notebook in hand. My upcoming trip to China and Nepal is there alongside my class prep, which is adjacent to the bibliographic planning for the next article. And the list of seeds I need from the store will be there when I go to handle the recycling later today. I manage (mostly) to get it all done, because I can track it.

HOW?
To support my wild-and-crazy work/life balance, I’ve divided my book into sections. Sometimes I start at the back-end of a section and work toward the book’s front, and other topics work in the regular front to back. It seems chaotic, but it does help me navigate.

And my sections are:
  • fols. 2-3: Table of Contents (grows organically as I work)
  • SECTION 1: RESEARCH
    • fols. 4-5: Info about conference and book deadlines, high level overview of the season’s plan
    • fols. 5-20: Research on the book
    • fols. 20-40: Research on other projects, either grouped or interspersed, depending on mood
  • SECTION 2: CLASS PREP / TRAVEL
    • fols. 50 backwards to 40: class prep stuff
    • fols. 50 forward to 60: travel planning
  • SECTION 3: ADMINISTRIVIA
    • fols. 75 toward the front: meetings, so many meetings. And more meetings. And then some notes on meetings
  • SECTION 4: PERSONAL
    • fols. 75 toward the back: language learning. Right now, I’m getting ready for China. Chinese is haaaaaaard.
    • fols. 97 toward the front: to-do lists.
And, at random in the range of the 80s or so, things to do with life. Poetry. Bird lists. Recipes. Stuff.

WHY?
Why tell you all of this? On the one hand, it’s one quirky person’s way of managing All The Things. On the other hand, this is the kind of practice that can really make a difference in terms of personal productivity, because it puts “life” and “work” into the same physical space, and invites a contemplation of Brussels sprouts (with honey and sriracha) alongside contemplation of the intricacies of prayer transmission in the 16th century. Because both are important. And the Everything Notebook helps me keep track of it all.

Another advantage, which I didn’t think of when I started this practice a decade ago or so, is that I do actually remember my work chronologically. Oh, that was the project I was working on when we were doing improvements down at lakeside. Pull out the 2019 notebook, and there are the bibliographic notes from that work on this-or-that. It helps me remember more than if it were limited to the thing itself. It also helps me find things on my computer, since I can put boundaries on the date search.

And third, I really do believe all the scholarship that tells us we remember what we write by hand better than what we type. Type is fast; ideas flow through the fingers onto the page. But the dramatic sad face next to the bad archival news recorded on the sheet of paper is the thing my brain actually chooses to remember. I’m a geographical filer; that’s true in note-taking space as well as in my life. I know where to look, and that’s enough to help me track down the thing I’m looking for. (Where was that great mushroom soup recipe? Oh, yeah, that was the year we did the quick departmental retreat – it was at the back of that notebook. Yum.) So for me, this kind of organization works with the ways in which my brain chooses to connect things. My coffee stains and rain-ruffled pages are my version of Proust’s madeleines – the spark that brings to life the whole complexity of thinking indulged in by my previous self.

As long as I can keep track of where I’ve put my book. Sniff. I’ll miss that winter volume, but I still carry around an image of the coffee stain on page 16 with the notes from that inter-library loan book on scribes, and the carefully checked-off “tell my sister X, Y, and Z” list from the winter holidays at the back of the book. The writing imprinted not just the page, but also my memory.

And that process of writing information into memory is exactly what the Everything Notebook is for!

Sunday, February 16, 2025

Making magic with old texts – how one scholar uses Transkribus (2/16/25)

A snippet of the Thalbach Chronicle (Bregenz VLA Thalbach Hs 9) and the logo/motto from Transkribus: "Unlock the past with Transkribus"

I’m not a modern-day marvel; digital humanities *seems* cool, but it’s not my training and not my natural modality. I live on a farm, with all the attendant joys of rural internet. (Failed the ping test recently? Us too!) I have read a number of DH articles with interest, and adopted some of the intellectual practices that volume assessment allows for. But at last, the moment has come: it’s time to learn a new tool in order to make my regular work go faster.

There’s this monastic chronicle (Bregenz VLA Thalbach Hs 9), you see, and it’s unedited. That there is, in fact, data from within “my” convent is a wonderful thing, and I’m thrilled to have access. There are two big problems, however. First, it has not been digitized. And second, I don’t read (well, I didn’t read) 18th-century Kurrentschrift. So, here’s what I did.

STEP ONE: GET PHOTOS. With the permission of the archive, I was able to photograph the chronicle. My system relies on the step-basis auto-numbering of photos, and is woefully “brute force” for the more sophisticated user. It is also (I confess it now!) simply a set of photos on my cell phone. No fancy lighting, no high-tech imaging for the ages; these are functional photos for use as a musicologist, not reflecting the book-history elements (which may be interesting but are not my raison d’être). Before I start, I prepare a written description of the object, with special note of handwriting changes, format changes (from 18 to 24 lines, for example), and so on.

To organize my photos, I start by taking a picture of the description of the MS that I prepared in advance, and of its gathering structure that I prepared on-site. Then I take pictures of the outsides of the MS and of the visually interesting bits that caught my early attention. (These pictures are for slide decks for any talk I might give – they’re the visual capture of the “coolness” of the item.)

Then, I take a picture of the gathering description for the first gathering, prepared in pencil in the archive. For more complicated gatherings, that includes a gathering structure diagram, but it’s often just a few lines of text. That becomes my photographic “label” for the section. Then, I take sequential pictures of the pages in the gathering. Occasionally I repeat a page if I want to be sure that I captured some particular detail, but mostly I move from first page to last page in the gathering. When I’m done with the gathering, I take a picture of the wooden table as a marker. Why? Because that’s going to leap out at me when I’m looking at all these photos of handwritten pages on my phone!

Now, onto the next gathering, and the next, and the next after that. Each starts with a header photo; each is sequential; each ends with a picture of the wooden table. At the very end, I go back and take pictures of details – with a label card or paper-pad notation first telling what gathering and folio it’s from, and why I thought it important (“detail of the insertion in the right-hand margin showing a different hand”).

Lastly, I go home and back everything up into a file folder. At this point, the sophisticate would probably rename all the photos, but I let the assigned image number stand for the page. YMMV. [That’s the increasingly old-fashioned phrase “Your Mileage May Vary,” if we happen to live in different acronym worlds.]

ORGANIZING IMAGES ON MY HARD DRIVE

With a couple hundred images, managing the inventory can seem daunting, but I’ve developed some habits over the years. I’m a spreadsheet person; spreadsheets make my heart sing. I love me some useful spreadsheets. So, for each of the important manuscripts in my life, I have a translation table: Photo Image number, gathering, folio or page, content, commentary, and then columns of whatever I’m interested in (concordances or chapter number or dates or places or whatever – that’s for the assessment phase).
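For anyone who would rather not type a couple hundred image numbers by hand, here is a minimal sketch of how the skeleton of such a translation table could be generated. It is not how I built mine: the column names, the function names, and the assumption that photos carry camera-style names such as IMG_1421.jpg are all illustrative.

```python
import csv
import re
from pathlib import Path

# Hypothetical column names, modeled on the translation table described above.
COLUMNS = ["image_number", "gathering", "folio_or_page", "content", "commentary"]


def image_number(filename):
    """Return the first run of digits in a name like IMG_1421.jpg, or None."""
    match = re.search(r"(\d+)", Path(filename).stem)
    return int(match.group(1)) if match else None


def build_rows(filenames):
    """One empty row per numbered image, sorted by image number."""
    numbers = sorted(n for n in map(image_number, filenames) if n is not None)
    return [
        {"image_number": n, "gathering": "", "folio_or_page": "",
         "content": "", "commentary": ""}
        for n in numbers
    ]


def write_table(rows, out_path):
    """Write the rows to a CSV file that any spreadsheet program can open."""
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)
```

Calling build_rows on the names in the photo folder and passing the result to write_table gives a file with the image numbers filled in and everything else blank, ready for the gathering and folio information from the archive notes.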

So with the Chronicle, I now had a mass of mostly indecipherable eighteenth-century text entries carefully organized into a folder and managed via spreadsheet. Now it was time to start reading. Except, I don’t (yet) read Kurrentschrift, so I’ve got a whole mess of gobbledygook. Enter the wonderful world of technology! I know AI has its issues – not least of which is its ecological impact – but there are tasks for which it is exceptionally well-suited, and it turns out that teaching the novice how to read new scripts is, for me, one of those true talents.

TRANSKRIBUS

I’ve heard of Transkribus (at https://www.transkribus.org/) for years; the idea of an app that can decode various historical scripts is an attractive short-cut for handwriting styles I don’t know, particularly since my focus is the content more than its presentation. I’m an extractivist: I want to know what the chronicle actually says and add those data-points to the story I’m telling. Also, I’m not keen to prepare editions – a chronicle is a side-witness to the music for me, not a central focus of my work. Many of my decisions reflect that perspective. I didn’t seek out a colleague for collaboration, for one thing, nor go to a paleographic institute. Hooray for brute force, right?

I searched out the Transkribus website and read all the (very helpful) guides that were prepared. I even watched two of the introductory videos, though I had to go to town for them to download at playable speeds. And, they have a capacity to try a few sample pages lower down on the page (scroll down to “try it out”). I chose a representative image and uploaded it to see what it did. Magic! From the loops and lines of Kurrentschrift emerged words that were, for the most part, German dialect, and familiar in style and spellings from other texts from the area. Success! I admit that I scooped up the sample reading and dumped it into a document file; I wanted to be sure that whatever I had, I saved.

The next step of learning was a several-day project. One of the best things about the Transkribus tool is that it has a lot of models, each trained on a particular set of documents. These models are available to apply to your document(s), and some of them work better than others. I literally made a list of ALL of the models that covered German Kurrentschrift of the 18th century and tested them with two different pages from my chronicle. For each, I did A/B testing: was this model better than that one? I kept notes on which ones did well, and went back to a couple of the models three or four times until I settled in on the one that seemed the most accurate on a first pass. I know that I could train the model for MY project, but I wasn’t interested in that this first time through, in part because I was a complete script-reading newbie, and didn’t want to mis-train the AI.
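I did my A/B comparisons by eye, but a reader who already has a trusted reading of a line or two could put a number on the comparison. The sketch below is one common way to do that, not a Transkribus feature: it computes a character error rate (edit distance divided by the length of the reference text) and ranks models by it. The model names and sample strings are made up for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance: the fewest single-character edits turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete a character from a
                current[j - 1] + 1,      # insert a character into a
                previous[j - 1] + cost,  # substitute (or keep) a character
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference, hypothesis):
    """Share of the trusted reference text that the model output got wrong."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def rank_models(reference, outputs):
    """Return model names, best (lowest error rate) first."""
    return sorted(outputs, key=lambda name: character_error_rate(reference, outputs[name]))
```

For example, rank_models("Guardian", {"model_a": "Quardian", "model_b": "Guardian"}) puts model_b first, since model_a misreads one character in eight.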

Once I had a model in hand – and had taken careful notes on its model number and name for scholarly purposes – it was time to start the transcription project. So, I created a free account (which currently gives you 100 pages of transcription free per month), and priced out the subscription model I’d use once we’re in the new fiscal year at my University.

As I planned my project, I realized that organizing the materials is an important consideration. There are “collections” in Transkribus, and “documents” within the collections. As a reminder to the reader: I’m not aiming at edition prep; I’m working toward extracting my data. So I created a hodge-podge organization that made sense to me. Instead of a collection that was the entire chronicle – something that I believe would probably be best practice – I broke out my chronicle into its gatherings, so I can navigate to-and-fro easily.

And then, I uploaded subsections of the gatherings as documents, rather than the entire gathering at a go or (at the other end of the spectrum) the individual leaves of the chronicle. This is being created for my convenience, after all, and this first go-round I wasn’t certain how things worked. I have between 4 and 16 pages in each “document.” I did learn that the Windows folder bugaboo, randomized file order, occasionally impacted my uploads, which is one of the reasons that I kept my “documents” short. I also decided to retain document naming based on image number; for me, my spreadsheet is the controlling document. Renaming is both time-intensive and an area in which error can enter. As a result, my documents are named such compelling things as “IMG_1421-1427.” It works for me. (On the other hand, my naming for the gatherings is a bit more obvious to the outsider: “ChronikGath3” works here, and continuously typing in “ThalbachChronikGath3” just seemed like more work than needful since I’m not contemplating doing this with other chronicles, at least not in the next three years.)
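The batching itself is simple enough to script. Here is a small sketch, under the assumption that the photos carry names like IMG_1421.jpg: it sorts the files by image number (which guards against the shuffled order) and groups them into short “documents” named in the IMG_first-last pattern. The function names and the 16-page ceiling are my illustrative choices, and the actual upload still happens by hand.

```python
import re


def image_number(name):
    """Pull the number out of a name such as IMG_1421.jpg."""
    match = re.search(r"(\d+)", name)
    if match is None:
        raise ValueError(f"no image number in {name!r}")
    return int(match.group(1))


def batch_documents(filenames, max_pages=16):
    """Sort by image number, then split into named batches of at most max_pages."""
    ordered = sorted(filenames, key=image_number)
    batches = []
    for start in range(0, len(ordered), max_pages):
        chunk = ordered[start:start + max_pages]
        first, last = image_number(chunk[0]), image_number(chunk[-1])
        batches.append((f"IMG_{first}-{last}", chunk))
    return batches
```

Each entry pairs a document name with the files that belong in it, in page order, so the list can be checked against the spreadsheet before anything is uploaded.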

Finally, after uploading the first document in the first collection, it was time to drive. I selected my pages, hit the “Recognize” button, and was taken to the interface. I added the “public model” that I had selected through testing, then took a deep breath, and hit “recognize.” The job runs in the background, and eventually the selected pages will have header colors that turn orange, to signal that the draft text is ready to review.

USING AI TEXT RECOGNITION TO CREATE A SEAT-OF-THE-PANTS EDITION

Here’s the part where things get wonderful. The AI model I chose is actually pretty decent with my text. As a new reader of Kurrentschrift, it took me a while to get a hang of it, but I used the process to teach myself the reading skills which will be necessary to me for this document and a couple of others upcoming. (I’m a 14th-15th century scholar; our handwriting is MUCH more legible, thank you very much!) For those who are in my boat, here are a few things I did that made learning to read the script go quickly.

First, I pull the transcribed text into a document file so that it’s on my local machine. (Remember, I’m that “rural internet” guru; failure to reach the world as a whole is as regular an experience as is going grocery shopping.) Alas, I haven’t been using the export function, though it’s there; instead, I cut-and-paste. It’s a rube’s approach, I know, but it’s fast, and it puts everything in a space I can edit with my own tools and habits. (I’m on LibreOffice these days; again, YMMV. But it’s free, and it doesn’t keep trying to put everything in OneDrive. Which is out in cyberspace. And often unavailable here at the farm. I’m glaring at you, Microsoft.)

To manage these texts, I insert headers for each individual page in all caps (to stand out from the transcribed text). For my purposes, the image number and the MS gathering and folio numbers suffice – along the lines of “PHOTO 1363 CHRONICLE GATH3 p. 34”. Also, like the AI transcription, I honor the line breaks of the original, so that toggling from transcription to image and back is easy. (Also, I insert my cut-and-paste as unformatted text; others might want the line numbers, but there were enough errors in line identification that I found it easier to do without.) This was a good cross-check against the randomized ordering that Windows imposes on file transfer; by checking each image against its image number and page or folio number, I was able to ensure that the order of my text was in fact the order of the chronicle (except that the chronicle gatherings are actually out of order, but that’s a fault in the manuscript, not the editor nor the technology!).
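I type my headers and check the order by eye, but both steps can be sketched in a few lines. The code below assumes the header pattern quoted above and a transcription saved as plain text; the function names are hypothetical. It builds a header from the spreadsheet values and lists any spot where the photo numbers fail to increase.

```python
import re

# Matches the start of a page header such as "PHOTO 1363 CHRONICLE GATH3 p. 34".
HEADER = re.compile(r"^PHOTO (\d+) ")


def page_header(photo, gathering, page, work="CHRONICLE"):
    """Build an all-caps page header in the pattern used above."""
    return f"PHOTO {photo} {work} GATH{gathering} p. {page}"


def photo_numbers(text):
    """Collect the photo numbers from every header line, in document order."""
    numbers = []
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            numbers.append(int(match.group(1)))
    return numbers


def out_of_order(text):
    """Return pairs of neighboring photo numbers where the second does not increase."""
    numbers = photo_numbers(text)
    return [(a, b) for a, b in zip(numbers, numbers[1:]) if b <= a]
```

An empty list from out_of_order means the pasted pages follow the photo sequence; it says nothing about the manuscript's own disordered gatherings, which still need the spreadsheet.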

Second, I got myself a couple of tables of cursive letterforms compared to Fraktur letter forms, so that the basic shapes were something I could puzzle through. I admit that my first pass awareness-level was so low that on the first four pages I read, the only word I could decode independently was “septuagesima.” However, once I learned that those really precise looking “n’s” were actually the letter “e,” I started to see the handwriting emerge from the page.

Third, it is my practice to work through systematically, allowing “bad readings” in order to get from zero to literate. I mangled my way through the first four pages, by which time the d’ as “der” and the dß as “das” were pretty clear. I go line by line, and I’ve learned to highlight the relevant information in different highlighter colors as I go. (For me, yellow is people, green is liturgy, blue is date or place, red is music, sweet music.) My goal is extraction, not perfection. It’s embarrassing to note that neither the AI nor I at first recognized the swoop at the end of words as an “-n.” Likewise, it took a while before I was confident enough to simply obliterate the AI’s suggestions for my own reading of a word. That said, it’s truly a case of learn-by-doing; as I hit page 20, I was starting to read each word instead of decoding it letter by letter.

Fourth, as a matter of process, I’m comfortable leaving in uncertainties. This work isn’t directly for publication, so if I wasn’t sure of a word, I would simply accept it or type in my best guess, then put in square brackets another possible reading, and frame things with question marks. For instance: “unser lieben Erbar [? frawen?] officii” – even as a newbie reader, the word “erbar” makes no sense here, but rather than worry about it at length, I put in my contextual reading and then moved on. I can search those up and revisit them after I’ve plowed through the first time.
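Searching those up later can be done with the word processor's find function, or with a short script once the transcription is saved as plain text. This sketch assumes the bracket convention shown in the example above, “[? … ?]”, and the function name is my own invention.

```python
import re

# An uncertain reading: square brackets opening with "?" and closing with "?".
UNCERTAIN = re.compile(r"\[\?\s*([^\]]*?)\s*\?\]")


def uncertain_readings(text):
    """Return (line number, alternative reading) for every flagged word."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in UNCERTAIN.finditer(line):
            findings.append((line_no, match.group(1)))
    return findings
```

Run on the line “unser lieben Erbar [? frawen?] officii”, it reports line 1 with the alternative reading “frawen”, which makes a tidy revisit list for the second pass.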

Finally, as I indicated before, I reward myself with the “ping” of a data finding by using those highlighter buttons liberally. As I look back now over less than a month of intermittent work, I’ve got a long roster of people and events to code into my other note-taking systems. I haven’t harvested them yet, but they’ll be easy to identify as I finish up the process. Having those rewards in sight makes the days of “ugh, I can’t DO this” more bearable. And each time I return to the document, more and more of it looks like German instead of just “ink scrawls.”

MY TAKEAWAYS: THE MAGIC OF TECHNOLOGY

The reality is, the technology is remarkably impressive. Even without training it on my manuscript, it’s getting 75 to 80% of the text down properly. (It confuses Q for G, though: Quardian is not a word. Maybe next time I’ll try training the model.) That’s amazing!

It’s working from manuscript, and that’s an imperfect environment. Every so often, particularly when the scribe’s lines have a waver to them or when the page was curved in the photo, it mangles lines and mixes up word order – the manual corrective is absolutely necessary. 

The benefit is that as a scholar, I’m ten times more competent with the script now than I was at the end of the first week. Having learned to read a cursive 16th-century hand without AI assistance, I can testify to the massive jump-start that having a plausible transcript makes, as long as I’m working systematically, letter by letter and word by word.

It’s just like practicing. If you work on the details and the techniques, there comes that moment where all of a sudden your perspective shifts from notes on the page to the sounds of the past. And that, my friends, is magical.

ACCESS
https://www.transkribus.org/
https://www.youtube.com/@transkribus
 

A handful of sunrises

A sunrise begins in freshness, In hues that can’t be named, A wordless shout of wonders To call forth inner joy. The weigh...