Student Research Partnership in Spanish – Comparison of Translated Sources

6 October 2023
By Mitch Porter '25

My work with Silvia explored the use of artificial intelligence for linguistic translation. Machine-assisted translation has been in development for decades; Google Translate, the first such tool to reach a mass audience, debuted in 2006. Our work looked at how these tools can be applied in a practical sense, the areas in which they still have difficulty, and how they perform in various cross-linguistic contexts.

Our research project arose from a real-world translation need: the publication of an issue of the Spanish-language literary journal Constelaciones. Using a newer AI program, DeepL, we converted source texts in English and Portuguese, articles to be included in the final published edition, into Spanish without the need for a human translator. At this point, we are about halfway through the translation process for the journal. Our job, thus far, has been to verify the machine's output in a field that relies on precise language and accurate terminology.
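For readers curious what that workflow can look like in practice, here is a minimal sketch using DeepL's official Python client. The API key and file names are placeholders, and this is only an illustration of the general process, not a record of our exact setup.

```python
# Minimal sketch: producing a Spanish draft of an article with DeepL's Python client.
# The API key and file paths are placeholders; the client installs with `pip install deepl`.
import deepl

translator = deepl.Translator("YOUR_DEEPL_API_KEY")  # placeholder key

with open("article_en.txt", encoding="utf-8") as f:  # hypothetical source file
    source_text = f.read()

# Translate from English into Spanish; source_lang can be omitted and auto-detected.
result = translator.translate_text(source_text, source_lang="EN", target_lang="ES")

with open("article_es.txt", "w", encoding="utf-8") as f:
    f.write(result.text)  # the machine draft that a human editor then reviews
```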

DeepL, like Google Translate, is an artificial neural network, which means that, much like a network of human neurons, the program responds to a series of input stimuli in order to interpret them and produce an output. Both translators, similar to other AI tools such as ChatGPT, are predictive. They rely on a large corpus of existing translations to convert individual words and phrases from their input texts into the new output language. In essence, they are mimetic processes: they mimic the role of human translators, predicting how a human being with an understanding of both the input language and the output language might choose to translate a particular language unit.

The normal function of a neural network does not treat language as a cognitive process. It does not understand grammatical rules in the same way as a human speaker. A program like Google Translate “learns” the patterns surrounding each word in its input and uses them to mimic grammar. In a sentence like “I speak Spanish,” the program might choose to translate with “speak” instead of “speaks,” because it knows that this form of the word is most likely to follow “I.” DeepL is a different type of neural network, called a convolutional network, which examines each part of a sentence in individual segments. The grammar is controlled by other mechanisms included in the program and is not the primary function of the network itself. This allows the network to find translations that more accurately match the tone and fluency of the source material, producing a translation that, while totally artificial, is far more advanced than the product of something like Google Translate. In fact, everything in this post was translated by DeepL, having been originally composed in Spanish (any corrections by me are underlined).
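To make that idea of grammar-by-prediction concrete, here is a toy sketch of my own, not DeepL's or Google Translate's actual mechanism, showing how a purely statistical system can land on the right verb form without knowing any grammatical rule: it simply picks the form that most often follows a given pronoun in the examples it has seen.

```python
# Toy illustration (not the real mechanism of any commercial translator):
# a frequency table built from example sentences can pick the verb form
# that most often follows a given pronoun, mimicking agreement without
# encoding any grammatical rule.
from collections import Counter, defaultdict

corpus = [
    "I speak Spanish", "I speak Portuguese", "she speaks Spanish",
    "he speaks French", "I speak English", "she speaks Italian",
]

# Count which word follows each word in the "training" sentences.
follows = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for prev, nxt in zip(words, words[1:]):
        follows[prev][nxt] += 1

def most_likely_after(word):
    """Return the word most frequently seen after `word` in the corpus."""
    return follows[word].most_common(1)[0][0]

print(most_likely_after("I"))    # "speak"  -- chosen by frequency, not by rule
print(most_likely_after("she"))  # "speaks"
```

A full neural translator works with far richer context and learned representations, but the underlying move is the same: prediction from examples rather than rules.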

How do we understand the role of a human translator when tools like DeepL are available? For me, seeing the limitations of automated translation has helped put this work in a clearer light. Over the course of our work, I have been tracking and cataloging the types of errors DeepL makes, in the hope that this might help inform future use of machine translation in an academic context.
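As a rough illustration of what that cataloging can look like (the category names and logged entries below are simplified placeholders, not my actual data), a few lines of code are enough to keep a running tally of error types:

```python
# Rough sketch of an error tally for reviewed translations.
# The category labels and logged entries are illustrative placeholders.
from collections import Counter

# Each reviewed sentence receives zero or more error tags during proofreading.
error_log = [
    "determiner",         # wrong number or definiteness on a noun
    "dependent_clause",   # clause opened with the wrong connector
    "lexical_ambiguity",  # a polysemous word rendered with the wrong sense
    "determiner",
    "word_order",
]

tally = Counter(error_log)
for category, count in tally.most_common():
    print(f"{category}: {count}")
```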

DeepL has an impressive grasp of idiomatic language, able to capture more abstract phrases like “remain oblivious,” “shed light on,” and “turn one’s nose up.” DeepL works especially well in translating between Romance languages, like Spanish and Portuguese. When working across language families, the program still functions well but struggles in a few areas. Quantitatively speaking, when moving between English and Spanish, DeepL had the most difficulty with determiners (deciding number and definiteness for nouns), dependent clauses (especially deciding whether to begin a clause with a progressive verb or a conjunction like “that”), and words with multiple meanings in the source language (like horizonte, juego, or parlamento). However, the biggest takeaway has been that, by and large, the program works very well.

I found it striking that, across all translation types, one of the most common weaknesses in DeepL translations is one that also arises in human translations: the program is reluctant to reorder phrases in a sentence or break up clauses, even when doing so would better preserve the flow and tone of the original in the target language. Silvia and I often have to intervene when a translated sentence requires the translator to make a subjective choice, and the answer to how best to handle the problem is not always clear.

Understanding a text in translation is never the same as understanding a text in its original form. At what point must the text be changed outright to be understood in a new linguistic and cultural context? To what degree can AI be responsible for making these changes itself? Does the integrity of a translated piece require human supervision? What is most striking to me about questions like these is that they apply regardless of whether a neural network or a human translator is making decisions. Translation is always, in some form, a departure from the source text. The role of a translator is to navigate that departure. Sophisticated tools such as DeepL are just that: tools. They change the way we translate, but they do not change the decisions and questions inherent in communicating across languages.