But they are still astonishingly useful
In the past few months free online translators have suddenly got much better. This may come as a surprise to those who have tried to make use of them in the past. But in November Google unveiled a new version of Translate. The old version, called “phrase-based” machine translation, worked on hunks of a sentence separately, with an output that was usually choppy and often inaccurate.
The new system still makes mistakes, but these are now relatively rare, where once they were ubiquitous. It uses an artificial neural network, linking digital “neurons” in several layers, each one feeding its output to the next layer, in an approach that is loosely modelled on the human brain. Neural-translation systems, like the phrase-based systems before them, are first “trained” on huge volumes of text translated by humans. But the neural version takes each word and uses the surrounding context to turn it into a kind of abstract digital representation. It then tries to find the closest matching representation in the target language, based on what it has learned before. Neural translation handles long sentences much better than previous versions did.
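For readers who like to see the idea in miniature, the “closest matching representation” step can be sketched in a few lines of Python. This is a toy under loud assumptions, not Google's system: the three-number vectors are invented by hand rather than learned, the function names (`encode`, `closest_target`) are hypothetical, and a real network learns far richer representations across many layers. The sketch shows only how context can tip an ambiguous word such as “bank” towards one French rendering or another.

```python
import math

# Hand-made toy vectors, invented purely for illustration.
# A real system learns such representations from human-translated text.
SOURCE_VECTORS = {
    "bank": (0.5, 0.5, 0.0),   # ambiguous: finance or riverside
    "money": (1.0, 0.0, 0.0),
    "river": (0.0, 1.0, 0.0),
}
TARGET_VECTORS = {
    "banque": (1.0, 0.1, 0.0),  # French: financial bank
    "rive": (0.1, 1.0, 0.0),    # French: riverbank
}


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def encode(word, context):
    """Blend a word's vector with its neighbours' by simple averaging.

    Averaging is an assumed stand-in for what a neural network does
    with context; it is not how a production system works.
    """
    vectors = [SOURCE_VECTORS[word]] + [SOURCE_VECTORS[w] for w in context]
    return tuple(sum(dim) / len(vectors) for dim in zip(*vectors))


def closest_target(representation):
    """Return the target-language word whose vector is most similar."""
    return max(
        TARGET_VECTORS,
        key=lambda word: cosine(representation, TARGET_VECTORS[word]),
    )


print(closest_target(encode("bank", ["money"])))  # banque
print(closest_target(encode("bank", ["river"])))  # rive
```

The same source word lands on a different French word depending on its neighbour, which is the gist of why context-based representations cope better with ambiguity than translating hunks of a sentence in isolation.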
The new Google Translate began by translating eight languages to and from English, most of them European. It is much easier for machines (and humans) to translate between closely related languages. But Google has also extended its neural engine to Chinese (included in the first batch) and, more recently, to Arabic, Hebrew, Russian and Vietnamese, an exciting leap forward for languages that are both important and difficult. On April 25th Google extended neural translation to nine Indian languages. Microsoft also has a neural system for several hard languages.
Google Translate does still occasionally garble sentences. The introduction to a Haaretz story in Hebrew had text that Google translated as: “According to the results of the truth in the first round of the presidential elections, Macaron and Le Pen went to the second round on May 7. In third place are Francois Peyon of the Right and Jean-Luc of Lanschon on the far left.” If you don’t know what this is about, it is nigh on useless. But if you know that it is about the French election, you can see that the engine has badly translated “samples of the official results” as “results of the truth”. It has also given odd transliterations for (Emmanuel) Macron and (François) Fillon (P and F can be the same letter in Hebrew). And it has done something particularly funny with Jean-Luc Mélenchon’s surname. “Me-” can mean “of” in Hebrew. The system is “dumb”, having no way of knowing that Mr Mélenchon is a French politician. It has merely been trained on lots of text previously translated from Hebrew to English.
Such fairly predictable errors should gradually be winnowed out as the programmers improve the system. But some “mistakes” from neural-translation systems can seem mysterious. Users have found that typing in random characters in languages such as Thai results in Google producing oddly surreal “translations” like: “There are six sparks in the sky, each with six spheres. The sphere of the sphere is the sphere of the sphere.”
Although this might put a few postmodern poets out of work, neural-translation systems will not be ready to replace humans any time soon. Literature requires far too supple an understanding of the author’s intentions and culture for machines to do the job. And for critical work (technical, financial or legal, say) small mistakes, of which even the best systems still produce plenty, are unacceptable; a human will at the very least have to be at the wheel to vet and edit the output of automatic systems.
Online translating is of great benefit to the globally curious. Many people long to see what other cultures are reading and talking about, but have no time to learn the languages. Though still finding its feet, the new generation of translation software dangles the promise of being able to do just that.