By Diana Singureanu
This article is written in response to WIRED's Pro Interpreters vs AI Challenge, referenced within the article.
Translators and interpreters have been conscious of the threat of AI for some time. Many view the question of whether machine interpretation will take our jobs as a when rather than an if. I must also declare I'm sceptical that machine interpretation can replace human interpreters.
As an interpreter, I understand the complexities of interpreting as a communicative act. I have recently completed a PhD and I found that an interpreter’s levels of emotional intelligence help solve challenges of a linguistic, paralinguistic, environmental, and interpersonal nature during the interpreted encounter. Furthermore, successful machine interpretation requires a complex mix of different new technologies.
Here is a quick outline of the process.
Speech-to-text: To convert the audio to text within the same language
Text-to-text translation: To convert from the source language to the target language
Text-to-speech: To express the translated text in a synthetic voice
Each stage risks losing something through interpretation. That could be meaning, nuance, intent, or other intangibles.
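For readers who think in code, the three stages can be pictured as a simple chain in which each stage sees only the output of the one before it. What follows is a minimal illustrative sketch, not a description of how Kudo or any real system is built. The class name `CascadeInterpreter`, the toy functions, and the tiny glossary are all hypothetical stand-ins invented for this example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CascadeInterpreter:
    """Chains three stages; each stage only receives the previous stage's output."""
    speech_to_text: Callable[[bytes], str]
    translate: Callable[[str], str]
    text_to_speech: Callable[[str], bytes]

    def interpret(self, audio: bytes) -> bytes:
        transcript = self.speech_to_text(audio)  # same-language transcript
        translated = self.translate(transcript)  # works from the words alone
        return self.text_to_speech(translated)   # synthetic voice output


# Hypothetical toy stages. Real systems would use trained models here.
def toy_speech_to_text(audio: bytes) -> str:
    # Pretend recogniser: the "audio" is just UTF-8 text, and lower-casing
    # stands in for the loss of emphasis and tone at this stage.
    return audio.decode("utf-8").lower()


TOY_GLOSSARY = {"hello": "bonjour", "everyone": "tout le monde"}


def toy_translate(text: str) -> str:
    # Word-by-word lookup; unknown words pass through unchanged.
    return " ".join(TOY_GLOSSARY.get(word, word) for word in text.split())


def toy_text_to_speech(text: str) -> bytes:
    # Pretend synthesiser: returns the text as bytes.
    return text.encode("utf-8")


pipeline = CascadeInterpreter(toy_speech_to_text, toy_translate, toy_text_to_speech)
result = pipeline.interpret(b"HELLO everyone")
print(result.decode("utf-8"))
```

In this toy run the speaker's emphasis on "HELLO" disappears at the first stage, and nothing downstream can recover it. That is the structural point: whatever one stage drops, the later stages never see.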
Considering the above, I watched WIRED’s Pro Interpreters vs AI Challenge with mixed emotions. I was fascinated but also a little anxious. Yet, most of all, I admired the two professional interpreters taking the challenge. The pair’s openness and curiosity stood out as they discussed the performance of Kudo, the AI speech translator.
So, let’s cut to the chase and answer the big question: who won?
Both interpreters (and I) were pretty surprised by how well Kudo did in conveying the complete message. In particular, it succeeded with some of the more technical terms. Jargon is challenging to get right during simultaneous interpretation, so this element of Kudo is impressive. For me, this was a reminder that technological advances are taking place faster than anticipated, particularly in the application of natural language processing (NLP) to machine translation and machine interpretation. As you might know, automatic speech recognition is already integrated into CAI (computer-assisted interpreting) tools for conference interpreters.¹ This technology supports interpreters with terminology, numbers, and proper nouns. The lack of errors in the initial stage (speech to text in the same language) surprised me. Live subtitles often fail, especially with fast speech.
There were also issues with the machine interpretation system. For example, it produced awkward combinations of words and, on at least one occasion reported by one of the interpreters taking part in the experiment, these errors came close to nonsense. Additionally, Kudo made some utterances appear as complete sentences when there were actually logical pauses in the source messages. Unsurprisingly, the tone of the synthetic voice was flat and lacking in emotion. This dryness was apparent in one of the speeches addressing the impact of COVID-19, where the interpretation needed to communicate empathy and understanding; Kudo could not achieve these aims.
That said, the level of completeness was particularly impressive, considering the fast speech. Yet I can't help wondering what the audience might have missed from the original address. The flat synthetic voice uttered words at a breakneck pace, with occasional misused words, faulty syntax, and misleading pauses. The intended humanity did not make it through the process.
The battle between human and AI interpreters is interesting from a technological standpoint. However, this experiment also serves as a demonstration of what it means to interpret spoken language. Interpreters do not translate words. We translate meaning. More importantly, we also read the room as we work, while doing justice to the intentions of the original speaker. Ultimately, the interpreters' verdict was that AI interpretation is not ready for independent use. Some of the areas where it could fail are high-stakes situations, such as asylum applications or medical and legal settings. More pointedly, it could pose risks in diplomacy, where nuance can affect millions of people.
The big question is where machine interpretation can be used safely and ethically. Ewandro Magalhaes, Kudo’s Chief Language Officer and a former chief interpreter at the UN, also urges caution regarding the use of machine interpretation (MI) in sensitive situations. Nevertheless, he stresses that machine interpretation will help us create a more inclusive world, which reminds me of another important field we need to concern ourselves with, namely AI ethics. Echoing the words of Shannon Vallor,² I feel that we must continue exploring ethical questions around the use of AI in interpreting. We must consider how we can best harness the technology’s power to expand our moral concern for others and not narrow it down by delegating tasks to AI.
AI-powered tools augmenting human interpretation seem to be the winning combo. They can help interpreters raise their game under challenging circumstances, such as fast speakers or addresses with highly technical language. Supporting interpreters in these situations will allow them to do what they do best: maintain the original speaker's emotion and intentionality, and even improve comprehensibility as a result of processing the information.
² Vallor, S. (2021). Twenty-first-century virtue: Living well with emerging technologies. In E. Ratti & T. A. Stapleford (Eds.), Science, technology, and virtues: Contemporary perspectives (pp. 77–96). Oxford University Press.
Diana Singureanu holds a PhD in Interpreting Studies from the University of Surrey, where she also collaborates as a researcher on various projects on remote interpreting. She also holds a Master's in Translation Studies, a second Master's in Conference Interpreting from London Metropolitan University and a DPSI option Law. She works part-time on the private market as a Conference Interpreter (Romanian A, English B and French C) and as a Legal Interpreter.
Diana is a Member of the Chartered Institute of Linguists and a Chartered Linguist.
You can find her full biography here: https://romanianconferenceinterpreter.com/
Views expressed on CIOL Voices are those of the writer and may not represent those of the wider membership or CIOL.
Prandi, B. (2023). Computer-assisted simultaneous interpreting: A cognitive-experimental study on terminology (Translation and Multilingual Natural Language Processing 22). Berlin: Language Science Press. ISBN: 978-3-96110-397-3
Rodríguez González E., Saeed A., Davitti E., Korybski T., Braun S. (2023). "Reimagining the remote simultaneous interpreting interface to improve support for interpreters". In Óscar Ferreiro-Vázquez, Ana Teresa Varajão Moutinho Pereira Correia and Sílvia Lima Gonçalves Araújo (Eds.), Technological innovation put to the service of language learning, translation and interpreting: Insights from academic and professional contexts. Peter Lang. ISBN 978-3-631-88913-8.
Defrancq, B., & Fantinuoli, C. (2020). Automatic speech recognition in the booth: Assessment of system performance, interpreters’ performances and interactions in the context of numbers. Target. https://benjamins.com/online/target/articles/target.19166.def
Fantinuoli, C. (2019). The technological turn in interpreting: the challenges that lie ahead. In Proceedings of the conference Übersetzen und Dolmetschen (Vol. 4, pp. 334-354).