Professor Krzysztof Jassem: As far as translation of general texts is concerned, humans will remain irreplaceable

What is machine translation? It is a process in which a text is translated by a computer without human intervention. This term is often used in contrast with the term computer-aided translation, a process in which a text is translated by a human supported by computer software.

Machine translation is faster and cheaper, although less precise. According to many users, the result of machine translation only conveys the general meaning of the processed text and cannot be considered a reliable source of knowledge. The most popular machine translation system currently available is Google Translate, which is provided free of charge.

Human vs. AI

In the case of computer-aided translation, the decision on the final version of a text is made by a human. The translator is supported by computer tools, e.g. electronic multilingual dictionaries, computer thesauri or glossaries of technical terms.

As far as technical texts go, one of the most helpful tools is the so-called translation memory, i.e. a set of sentences and phrases in the source language paired with their translations, created on the basis of already translated documents. The most popular paid CAT system is Trados by SDL, while OmegaT is the most sought-after free tool.
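To make the idea of a translation memory more concrete, below is a minimal sketch in Python of a fuzzy-match lookup: given a new source sentence, the most similar previously translated segment is retrieved and its translation is offered as a suggestion. The example segments, the similarity measure and the threshold are illustrative only and do not reflect how Trados or OmegaT work internally.

    from difflib import SequenceMatcher

    # A toy translation memory: source segments paired with their translations.
    translation_memory = {
        "The device must be switched off before cleaning.":
            "Urządzenie należy wyłączyć przed czyszczeniem.",
        "Press the power button to start the device.":
            "Naciśnij przycisk zasilania, aby uruchomić urządzenie.",
    }

    def suggest(source, threshold=0.75):
        """Return (matched segment, its translation, similarity) or None."""
        best = max(
            ((SequenceMatcher(None, source, seg).ratio(), seg, tgt)
             for seg, tgt in translation_memory.items()),
            default=None,
        )
        if best and best[0] >= threshold:
            score, seg, tgt = best
            return seg, tgt, score
        return None

    print(suggest("The device must be switched off before servicing."))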

Although machine translation systems are considered artificial intelligence, software supporting the translation process is not classified as such.

In the 1990s, the rule-based approach to research on machine translation shifted towards statistical, data-driven methods. The statistical approach requires large bilingual corpora, i.e. sets of texts paired with their translations.

Corpora are used to estimate the so-called translation model, which determines the probability that a sentence in one language is a translation of a given sentence in another. The machine translation system matches a given sentence in the source language with the sentence in the target language for which that probability is the highest.
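As a rough illustration of this idea, the sketch below scores a few candidate translations of a short Polish sentence against a toy word-translation table and picks the most probable one. The probabilities are invented; real statistical systems estimate them from millions of sentence pairs and combine them with a language model.

    import math

    # P(target word | source word), as if estimated from a bilingual corpus (toy values).
    t_prob = {
        ("kot", "cat"): 0.9, ("kot", "tomcat"): 0.1,
        ("śpi", "sleeps"): 0.8, ("śpi", "is sleeping"): 0.2,
    }

    def score(source_words, target_words):
        # Log-probability of a word-by-word translation (a drastic simplification).
        return sum(math.log(t_prob.get((s, t), 1e-6))
                   for s, t in zip(source_words, target_words))

    source = ["kot", "śpi"]
    candidates = [["cat", "sleeps"], ["tomcat", "sleeps"], ["cat", "is sleeping"]]
    best = max(candidates, key=lambda c: score(source, c))
    print(" ".join(best))  # the candidate with the highest model probability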

Neural translation

In 2014 two independent research groups – from Google and from the University of Montreal – came up with machine translation systems based on neural networks.

The translation process is carried out in two phases: encoding and decoding. In the encoding phase individual words of a source sentence (e.g. in Polish) are converted to their numerical representations. Then a neural network, trained (similarly to the statistical approach) on a bilingual corpus, builds two numerical representations of the whole source sentence: the first is constructed word by word from the beginning of the sentence, the second in the opposite direction. Both representations are then combined into a single one.
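The sketch below, written with the PyTorch library, illustrates the encoding phase under these assumptions: each word is mapped to a vector, a bidirectional recurrent network reads the sentence in both directions, and the forward and backward states are concatenated into a single representation. The vocabulary and layer sizes are made up for the example.

    import torch
    import torch.nn as nn

    vocab = {"kot": 0, "śpi": 1, "na": 2, "macie": 3}      # toy Polish vocabulary
    embed = nn.Embedding(num_embeddings=len(vocab), embedding_dim=16)
    encoder = nn.GRU(input_size=16, hidden_size=32,
                     bidirectional=True, batch_first=True)

    sentence = torch.tensor([[vocab["kot"], vocab["śpi"], vocab["na"], vocab["macie"]]])
    word_vectors = embed(sentence)       # (1, 4, 16): numerical representation of each word
    outputs, _ = encoder(word_vectors)   # (1, 4, 64): forward and backward states, concatenated
    print(outputs.shape)                 # the combined representation of the source sentence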

In the decoding phase a target sentence (e.g. in English) is generated. This is done word by word; at each step the neural network calculates the probability of every candidate for the next word and chooses the one with the highest probability.
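A simplified sketch of this greedy, word-by-word decoding is given below. The function next_word_probs stands in for the trained decoder network and here returns random probabilities, so the output is meaningless; the point is only the loop that repeatedly picks the most probable word until an end-of-sentence token appears.

    import torch

    target_vocab = ["<eos>", "the", "cat", "sleeps", "soundly"]

    def next_word_probs(prefix):
        # Placeholder for the decoder network: a distribution over the target vocabulary.
        return torch.softmax(torch.randn(len(target_vocab)), dim=0)

    def greedy_decode(max_len=10):
        sentence = []
        for _ in range(max_len):
            probs = next_word_probs(sentence)
            word = target_vocab[int(torch.argmax(probs))]  # choose the most probable word
            if word == "<eos>":
                break
            sentence.append(word)
        return " ".join(sentence)

    print(greedy_decode())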


Technological progress in machine translation is driven by the needs of global companies. Recent achievements in the field have been published by the research centers of leading global corporations: in 2017 researchers collaborating with Facebook proposed the use of so-called convolutional neural networks, while researchers from Google proposed the so-called attention-based model.

In April 2018 scientists from Microsoft announced that they had built a system capable of translating Chinese news into English with a quality surpassing that of human translators.

Poland: Poznań and Warsaw at the forefront

Currently there are two public research centers in Poland working on the development of machine translation tools. The first is Adam Mickiewicz University in Poznań, which has contributed substantially to the development of the field. In 2012 (still in the era when the statistical paradigm dominated) it invented the so-called compact phrase table method, whose implementation greatly improved the efficiency of translation algorithms.

In 2016 a group of researchers collaborating with the university published the amuNMT project, which implemented the decoding process for neural translation. The goal of the project was to increase the speed of translation systems based on the neural method.


Microsoft's system, which was implemented on the basis of the amuNMT project, competed in the 2nd Workshop on Neural Machine Translation and Generation and took first place in the computing performance category. Trained with sufficient bilingual data, amuNMT also took second place in the WMT 2016 competition in the "translation of press news from Russian to English" category.

In 2017 the same group published the "Marian" project, which includes both essential components of a neural translation system: encoding and decoding. Using the project for academic purposes, one can train a neural system capable of machine translation of any text in any pair of languages, provided that a sufficiently large bilingual corpus is available.

Researchers from the Polish-Japanese Academy of Information Technology in Warsaw focus on improving existing solutions so that they can be used to translate Polish speech. The solutions they proposed were presented at IWSLT (International Workshop on Spoken Language Translation) in 2013 and 2014. The academy also developed solutions which competed at WMT in 2016 and 2017.

The objective of the group's recent experiments is to improve the quality of "Marian" translations by using the group's own programs for so-called stemming and for segmenting words into smaller units (so-called sub-words).

Launch of Marian

In 2018 a group of researchers from Adam Mickiewicz University in Poznań, in collaboration with Poleng, a Poznań company that implements machine translation systems for commercial purposes, conducted two experiments to assess the quality of domain-specific texts translated to and from Polish. In both experiments translation systems were trained for specific enterprises interested in implementing solutions for their own needs.

In the first experiment the domain of translation was rather general and the volume of training texts provided by the enterprise was relatively small. Given the circumstances, it was difficult to gather and select a satisfactorily large number of texts on similar topics.

The system was trained with "Marian". The training set consisted of:

• 60,027 pairs of sentences provided by the enterprise;
• 7,198,092 pairs of sentences gathered by system engineers.

The system was trained to operate in the following language pairs: Polish-English and English-Polish.

Translated texts were assessed not only by a computer but also by a human: each of the 488 test translations was rated on a scale from 1 (lowest) to 5 (highest), with the arithmetic mean taken as the final result. The evaluation criteria were the fidelity and the fluency of the translated texts.

Surprisingly, the BLEU metric [Editor's note: an algorithm for evaluating the quality of translations, developed in 2002 at IBM] rated the English-to-Polish translations higher, whereas the human evaluator decided that the quality was better in the case of Polish-to-English translations. The reason might be that the evaluator was Polish and therefore more critical of translations generated in their native language.
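For readers who want to see how such scores are obtained in practice, below is a minimal sketch of computing BLEU with the sacrebleu library; the system outputs and reference translations are invented. BLEU measures n-gram overlap between the system output and a reference, which is one reason it can disagree with human judgements.

    import sacrebleu

    system_output = ["The cat sleeps on the mat.",
                     "He bought a new car yesterday."]
    references = [["The cat is sleeping on the mat.",
                   "Yesterday he bought a new car."]]

    bleu = sacrebleu.corpus_bleu(system_output, references)
    print(f"BLEU = {bleu.score:.1f}")   # higher means greater n-gram overlap with the reference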

Fluency first

The second experiment was done with a corpus of 1,200,000 pairs of sentences, consisting entirely of texts provided by the enterprise and very similar to the test set. In this experiment, researchers compared the operation of two English-to-Polish translators: a neural one and a statistical one (based on the code of the publicly available "Moses" system). Additionally, the corpus of test sentences was translated with the Google Translate system, which is intended for general texts (without specifying their domain).

The experiment was to answer two questions:

• Which of the translation paradigms yields better results for a relatively small base of training texts?
• Does the translation quality depend more on the size of the training corpus or on the degree of similarity between the training texts and the test texts?

Both systems trained on specialist corpora outclassed the system intended for general translations. Furthermore, the results obtained from the small specialist corpus proved significantly better than those of the first experiment, which had been conducted on a bigger corpus of texts from a more general domain.


Surprisingly, according to the BLEU metric the statistical system achieved better results than the neural one. In order to verify that result, a "blind" comparison experiment was conducted in which humans assessed the quality of the translations. Two independent proofreaders compared the translation results from both systems, unaware of which system had generated which translation. For each pair of translations the proofreader was to decide whether one of them was better or whether it was a draw. In the case of a win, the proofreader had to indicate the winning aspect: fidelity, fluency or other.

What were the results? In the opinion of humans the neural method outclassed the statistical method, with the decisive factor being the greater fluency of the generated translations. The experiment confirmed conclusions drawn from experiments done earlier for other language pairs: in the opinion of human evaluators, the neural method yields better results than the BLEU metric would suggest.

What does the future hold for translation?

The analysis of contemporary publications on machine translation and of the results of international competitions, as well as the conclusions drawn from the experiments conducted for the Polish language, allow us to put forward some hypotheses concerning the development of the field in the near future:

  1. Neural translation will remain the dominant technology in machine translation. Progress in this domain will be possible through the development of neural network architectures.
  2. Machine translation will be used mostly to translate technical texts. Over time, translators intended for general texts are likely to play a lesser role.
  3. Translation quality improves if the texts used to train the system are very similar, in terms of topics, to those for which it is used. For that reason, in the near future, enterprises and institutions will make their internal documents available to train systems for their own needs.
  4. People who translate texts will work more efficiently with machine translators. In the case of translation of texts concerning a narrow field of knowledge, the work of a human translator will focus on post-editing the texts generated by a computer. As far as translation of general texts is concerned, humans will remain irreplaceable.

It seems that the next step towards the automation of translation processes will be to combine automatic text subject recognition with translation. Let us imagine that for each pair of languages a number of translators for different domains and specializations are trained. In that scenario, a translation system will first recognize the subject of a text and then apply the relevant translation model, trained on texts regarding that subject.
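A minimal sketch of that scenario is given below, assuming a simple text classifier (here a TF-IDF naive Bayes model from scikit-learn) for subject recognition and placeholder functions standing in for per-domain translation systems; the training snippets and domain labels are invented.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    train_texts = ["the patient received a dose of the drug",
                   "the court dismissed the appeal",
                   "install the driver and restart the system"]
    train_labels = ["medical", "legal", "technical"]

    # Step 1: a classifier that recognizes the subject of a text.
    classifier = make_pipeline(TfidfVectorizer(), MultinomialNB())
    classifier.fit(train_texts, train_labels)

    # Step 2: hypothetical translation models, one per domain.
    domain_models = {
        "medical":   lambda text: f"[medical model] {text}",
        "legal":     lambda text: f"[legal model] {text}",
        "technical": lambda text: f"[technical model] {text}",
    }

    def translate(text):
        domain = classifier.predict([text])[0]   # recognize the subject of the text
        return domain_models[domain](text)       # apply the model trained for that domain

    print(translate("the judge adjourned the hearing"))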


The foregoing is an adaptation of part of the paper by Professor Krzysztof Jassem entitled "Development of artificial intelligence in the context of machine translation with special reference to institutions and persons dealing with the aforesaid subject in Poland" (Poznań 2018), prepared for the National Information Processing Institute.