Once upon a time, as recently as yesterday, I lived in a world in which Google Translate was an imperfect but useful tool. I could feed it my best guess at what someone had said to me, and it would spit out a decent explanation of what we were talking about. I could point my phone camera at a wall label in a museum, and out would come the information in a language I can actually read. All of this was incredibly useful, particularly on my trip to Asia last summer.
Another thing I happened to use Google Translate for was as a shortcut in my research. Now, I’ve been trained up with the best of them. I know that looking at the original language of, say, a medieval charter is the best and most accurate way to understand that document’s meaning. Nevertheless, when working at volume, it can be handy to skim, and while I can get in the groove with modern German, my reading of medieval Alemannic dialect is slower-paced. If I want a really fast assessment of something, there’s nothing like my native tongue, which is English, as you’ve probably guessed by now.
So, when looking over the roughly 200 charters relevant to the current chapter, I’ve been going through them quickly via Google Translate to see whether there’s utility in doing the close-up work of line-by-line, word-by-word reading. About one out of every five has a topic of particular interest. I can skim a 69-line, whole-side-of-a-cow-sized parchment charter in its janky English translation in about 10 minutes. I can read said document directly in something more like 45 minutes.
Let’s think about the math:
- To skim in English: 10 × 200 = 2,000 minutes, or roughly 30 hours of reading.
- To read the medieval Alemannic: 45 × 200 = 9,000 minutes, or roughly 150 hours of reading.
Okay, I’ll even be fair: add back another 20 hours for going through the targeted documents in detail, and I’m still looking at the difference between 50 hours of work and 150 hours of work.
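For the numerically inclined, here is the same back-of-the-envelope arithmetic as a tiny Python sketch. The counts and per-document times are just the rough figures from above (and the variable names are mine), nothing more precise than that:

```python
# Rough triage math from above: ~200 charters, ~10 min to skim a janky
# machine translation, ~45 min to read the Alemannic original directly.
charters = 200
skim_minutes = 10
read_minutes = 45

skim_everything_hours = charters * skim_minutes / 60   # 2,000 min, ~33 h
read_everything_hours = charters * read_minutes / 60   # 9,000 min, 150 h

# The allowance above for close-reading the roughly 1-in-5 charters that
# turn out to matter: call it another 20 hours on top of the skim.
close_reading_hours = 20
triage_total_hours = skim_everything_hours + close_reading_hours  # ~50 h

print(f"skim everything:         {skim_everything_hours:.0f} h")
print(f"read everything:         {read_everything_hours:.0f} h")
print(f"skim + targeted reading: {triage_total_hours:.0f} h")
```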
Why am I heated up about this topic? Well, they broke Google Translate last night.
Let me say that again, with all the feels:
THEY BROKE GOOGLE TRANSLATE LAST NIGHT.
I have receipts, of course. I’m going to share just one, because it’s been a long and stressful day this morning (bwahaha).
Here’s a clause out of one of my documents:
3. brieff Alsz dann der vorgemelt keb hailig Santgall unnser hußsatter Jarlichen ain Suma gebt Im den vigrechten der gestifften Jorlichen Jarzeten
Here’s its translation, as of yesterday:
Usable, right? Tells me the basics of what’s going on. Is it elegant? No. Is it fully accurate? Also no. It is, I think we’d all agree, a janky translation. (Oxford’s definition of janky: “of extremely poor or unreliable quality.”)
But here’s the thing: this janky translation is USABLE. It tells me whether or not this is a place I want to spend some of my precious minutes. I mean, I like down time just like everyone else; these translations are a shortcut!
But no, it wasn’t getting enough time-on-the-page, I guess, so Google “improved” (and I use that word with scare quotes for a reason, so be scared, be very, very scared) its translation tool. Let’s look at the result, shall we?
3. When the aforementioned [name omitted], the [name omitted], gives our [name omitted] an annual sum in accordance with the established annual [terms omitted].
This is predictive technology gone bad. The AI underpinning here is obvious. The “improved” tool is happy to predict anything that’s sort of standard in a regular document of this type. But all, all, ALL of the interesting details are now redacted. Because names, and places, and specific amounts of money are NOT predictable. So I guess we shouldn’t need to see them, eh? Because everything useful in life is predictable. (Mad, me mad? Whatever do you mean???)
And this, this is what they’re calling the “classic” version of the tool. Not that it bears any resemblance to what the tool was doing yesterday, of course. But it’s a handy marketing ploy for a company that clearly Does Not Give A Shit about the user experience. The advanced version? Well, it simply redacted lines 6 to 9 of my document altogether, since those are, like line 5, a list of payments to particular chaplains.
But MY study is looking (in part) at exactly that. I need to know how much more the parish priest gets than the altarist at the St Mang altar. It’s part of my evidence. And it changes over time. Which, oh, makes it unpredictable.
So when we premise translations on what words mean, we get one kind of information. Yesterday, I might argue with whether the “Mesner” was better translated as a “sacristan” or a “sexton.”
In the land of predictive AI, however, we premise translations on what other texts think might come next, and that means skipping the “minutiae.” The result? I can no longer tell from the translation that the Mesner, whatever his role might be, was even present in the document. A bad translation is something I can argue with; a predictive omission is something I can’t even see.
This is arguably great if you’re translating prose. It’s an absolute disaster if you’re looking at legal records and payments and guidelines for the foundations. Those kinds of documents are actually designed to deliver the very small, unpredictable details that AI wants to suppress. They are accounting devices, legal instruments, and memory machines. It’s like AI trying to tell you what flavor of ice cream is your favorite based on other people’s orders. It has absolutely, positively no idea what *you* might want, but that won’t stop it from trying, in that oh-so-confident voice.
Janky, bad translations, in other words, are part of my world of work. They have a use. They may be inelegant, but their very bumps and hiccups are pointers to the curious oddity. They keep the text visible as a text. As a user, I still see names, sums, offices, altars, weird textual repetitions – the very things that are likely innovations in this particular textual example. Predictive smoothing, by contrast, is a lie of fluency. It gives you the shape of a charter without its substance. To put it another way, jankiness is epistemologically honest. It doesn’t pretend to understand more than it does.
Cory Doctorow has brought us the concept of “enshittification,” the reality that a captured audience is merely monetary potential to the big firms that think they own our data. And yes, this update is truly, truly, truly the enshittified version of what a translator is supposed to do. In fact, from where I’m sitting, this is not even translation anymore. It is, instead, content abstraction masquerading as translation. A translator is accountable to the source text; a predictive model is accountable to statistical plausibility. Honestly, I have trouble communicating just how BAD it is at the job it was perfectly adequate at yesterday, but you get the general gist.
And the reality is that an enshittified product is pretty much what you’re stuck with from here on out, unless Google changes its mind and rolls back to yesterday’s model.
Happily for me, I can, in fact, read my texts. I have access to good dictionaries, and I do subscribe to DeepL for toggling back and forth with modern German. (DeepL struggles *hard* with Alemannic, but then, don’t we all?) And in a pinch, ChatGPT actually does a decent job with the odd sentence or two.
But the fact that yesterday was easy, and today my tool is broken? This is the way of this tech-heavy world of ours. Because yesterday’s Google Translate assumed that you were the expert deciding what mattered. Today’s assumes the model knows better. That’s not just frustrating; it’s a quiet and very, very creepy reordering of authority in knowledge production. Scholars of thin archives (like the ones I work on in Bregenz, Austria and in Bischofszell, Switzerland) are exactly the ones who lose when the world (or the tech-companies) decides that unpredictability is noise. Because the unpredictable is often where the truth lies.