Donna doesn’t speak Celtic! Philosophical issues about machine translation

By alasdairmaccaluim

If you are a Doctor Who fan, you may well remember the episode “The Fires of Pompeii”.

Alas, the Doctor and Donna went there in the year 79 CE just before the volcano erupted and not in 1971 CE to see the more important historic event of Pink Floyd recording their Live at Pompeii film!

David Gilmour playing live in Pompeii, 1971

Despite arriving at the wrong time to enjoy some of the best prog rock ever recorded, the Doctor and Donna arrive in a market place. Learning she is in an ancient Roman territory, Donna uses a few words of school Latin with a market stall holder. He replies something like “sorry love I don’t speak Celtic” and Donna doesn’t understand why he is speaking English. The Doctor explains that he is in fact speaking Latin but the TARDIS’s psychic translation circuit is allowing them to understand each other.

The recent growth in machine translation made me think of this. It made me imagine a world where everything in one language is immediately available in another and where it isn’t even clear which languages are being used in the first place.

This quite simply isn’t a good idea.

One of the things that troubles me about the ready availability of machine translation is that it enables anyone to instantly translate Gaelic content into English and immediately be in what would otherwise be all-Gaelic spaces.

This creates problems in two cases: firstly when the translation is correct, and secondly when the translation is wrong.

Let me explain.

Many years ago in the early days of the internet and before social media, Highland Council set up a Gaelic forum on their web pages. It was there to encourage people to discuss different issues in Gaelic.

It worked OK for a while, but it was mainly unmoderated and posting in English was also allowed. In the end, the posts in Gaelic were swamped by non-Gaelic speakers talking about ancestry and asking for translations for tattoos and suchlike, and Gaelic speakers stopped using it.

This isn’t a big or important example, but it shows the kind of effect that automatic machine translation can have. It leads to Gaelic being crowded out.

If you post in Gaelic on social media, you are doing so mainly because you want people to interact with your content in Gaelic. This doesn’t mean that you want it to be hidden, or for other people not to understand it, it just means you value the use of Gaelic. Instant Gaelic-English translation can take away a domain for the use of Gaelic to some extent by allowing unlimited access by non-Gaelic speakers.

If we can’t have a conversation within the Gaelic community in Gaelic about our community, what’s the point of even speaking or learning the language?

To be sustainable, languages have to have domains which belong only to that language, and machine translation (MT) is a real risk to online and written domains for Gaelic.

If people have to make some effort to get a machine translation of what you’ve written, that’s one thing, but if it is done very easily or even automatically, this is a different matter.

I’ve seen examples of websites where Gaelic contributions are automatically translated to English and where it isn’t immediately clear that they weren’t originally written in English.

This is very concerning. Firstly, in addition to in-group communication within the Gaelic community, using Gaelic online for many people is about raising the profile of the language. Secondly and more importantly, automatic MT of content is rarely as good as the original.

At best, you will have the situation where people are reading your content in worse English than you could have written yourself. Bear in mind too that most Gaelic speakers find writing in Gaelic more difficult than in English and are less confident in their Gaelic writing, meaning that they have to put in more effort to write in Gaelic. Again, if there is automatic or very easy MT of Gaelic content, this takes away the point of writing in Gaelic.

And if you write in Gaelic and non-Gaelic speakers are able to get an instant translation, or if the software translates it automatically, people will reply in English and before long it will be like the Highland Council Gaelic forum again. You might not be writing for English speakers, but with MT they become part of your audience whether you like it or not.

And that is all if the translation is correct! At worst, the translation will be wrong.

I am on Bluesky. With Bluesky there is a function to translate posts via Google Translate.

I normally post in Gaelic and as an experiment, I’ve been looking at what translations would be given for my Gaelic posts.

As an advocate of public transport and someone with critical views of AI, I’m personally very sceptical about self-driving cars. I posted something recently referring to autonomous vehicles as “sgleò-bhathair” – vapourware. Google Translate translated this as “a piece of crap!”

This is ruder than I would have been! Vapourware is, I suppose, a pejorative term, but it does have an important meaning: a product that is announced and promoted but never actually delivered.

And this was just a post by me – not important, not contentious. Imagine, however, if a parliamentarian or a public body related to transport used the word “vapourware” in a meeting and this was quoted in Gaelic in an official social media post by the organisation in question.

All it would take would be one journalist or social media user to click on the translation and you’d have a bunch of headlines and angry tweets saying things like:

“[Insert public body/public figure here] calls self-driving cars a piece of crap”

I checked quite a few of my posts and a not inconsiderable number of them came up with inaccurate translations.

I don’t tend to post about any contentious issues or get political on social media as I have a day job where I have to be politically neutral. But it got me wondering if I need to check all my messages in “translation” to check they don’t inadvertently say something dodgy when translated by MT. This isn’t something that should have to be a consideration.

This made me think of an incident in the Scotsman many years ago which Ronald Black, the editor of the paper’s Gaelic content, told me about.

If I remember the tale correctly, an article about Scottish Secretary Malcolm Rifkind started with the text: “Thug Malcolm Rifkind…”

Thug is the Gaelic for “gave” and sounds like the English word “hook”. The sentence merely says “Malcolm Rifkind gave”, but looking at it from the point of view of a non-Gaelic speaker, you might read it as the English word “thug”.

Evidently a complaint was received about how the article was impugning the character of Mr Rifkind!

This is the kind of thing that we may have to think about more and more.

Are we responsible for what people might read our Gaelic words as meaning through machine translation to English?

Of course not, morally speaking. However, in practice, if you did have a situation like the “piece of crap” scenario above, that would hardly matter. You would be likely to see non-Gaelic speakers in the media pontificating about the meaning of words in a language they have no knowledge of, and the voice of actual Gaelic speakers probably wouldn’t be heard in the matter. This wouldn’t end well.

There has been a lot of talk of “a right to be forgotten” online – that personal, outdated, or irrelevant data shouldn’t be kept or be available online when no compelling reason exists for it to be. I think we should have a “right not to be translated” as far as possible online and that minority language groups should advocate this.

MT is here to stay and people can always use it if they want, but automatic translation of posts should be something that the poster can control and turn off. There should also be more disclaimers about the inaccuracy of MT. And there should never ever be websites which automatically change all posts to English.

If people really want to translate what I – or anybody else – posts on social media in Gaelic or Irish or Welsh or whatever, they can copy and paste it into Google Translate, so they can still read it. But if translation is automatic or too easy, it is very damaging for minority languages.

And after all, wouldn’t it be a boring world if Doctor Who started saying “let’s go” as a catchphrase rather than “allons-y!”?

Alasdair


Visit Trèanaichean, tramaichean is tràilidhean