
Translators have an interesting job: they constantly receive large volumes of material in different languages, and it often turns out that the translation of the latest 100-page manual was needed yesterday. If similar texts have already been translated (previous versions of the manual or other technical documentation), the task becomes a little easier, but copying and pasting the old translation while making sure every change is accounted for is a chore in its own right. A special class of programs, called CAT tools, exists to reuse existing translations and keep them consistent.
CAT stands for Computer-Aided (or Computer-Assisted) Translation, that is, "translation using a computer" or "automated translation". These technologies should not be confused with machine translation, where you enter text in one language, press a button and receive its translation. Automated translation is a broader concept, and CAT systems build on existing translations made by humans.
The other day, ABBYY Language Services began closed testing of SmartCAT, a proprietary platform for automating the translation process. In this post we will briefly describe what CAT systems can do.
First, CAT tools include various linguistic resources that help translators work with similar texts containing standard phrases and sentences: technical, legal and medical terms, product descriptions and much more. One of the most common resources is the Translation Memory database, which stores previously translated text segments (phrases and sentences). Such databases are created and extended from pairs of parallel texts. Another important resource is the glossary, which contains the terms and concepts adopted in a particular company (or approved for a specific group of projects). In addition, SmartCAT lets you work with machine translation. Translators abroad have been using this resource for a long time, because it speeds up translation and increases productivity. In Russia, not everyone understands what to expect from machine translation, but interest in the technology is growing: this year, participants in many industry conferences (for example, Loc Kit and Translation Forum Russia) discussed adopting and using machine translation more actively than at events in previous years.

All of the above linguistic resources simplify the work of the translator who uses the CAT tool. In the process of translating text, SmartCAT will offer translation options for individual segments, using substitutions from existing translation memory databases and connected glossaries with corporate terminology. The translator can:
- accept the suggested substitutions as they are
- edit the proposed translation options (for example, to change the grammatical form)
- translate the segment in their own way.
A modified version can also be added to the existing translation memory databases, so that the platform offers it next time. In addition, the machine translation of the selected segment is shown in a separate panel on the right side of the SmartCAT interface. In most cases it is much easier to edit such "raw" material than to translate from scratch. This is usually called post-editing: the translator or editor checks the machine-translated text against the original and brings it up to the required language standard or quality level. This approach does not work for literary works, creative texts (slogans, promotional materials and so on), personal correspondence and similar texts.
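The lookup-and-update cycle described above can be illustrated with a minimal sketch. It is not SmartCAT's implementation: the `TranslationMemory` class, the 0.75 similarity threshold and the use of Python's `difflib.SequenceMatcher` for fuzzy matching are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """Toy translation memory: maps source segments to human translations."""

    def __init__(self):
        self.entries = {}  # source segment -> confirmed target segment

    def add(self, source, target):
        # Store (or overwrite) the confirmed translation of a segment.
        self.entries[source] = target

    def lookup(self, segment, threshold=0.75):
        """Return (stored_source, translation, score) for the best match,
        or None if no stored segment reaches the threshold."""
        best = None
        for source, target in self.entries.items():
            score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
            if score >= threshold and (best is None or score > best[2]):
                best = (source, target, score)
        return best


tm = TranslationMemory()
tm.add("Press the power button.", "Appuyez sur le bouton d'alimentation.")

exact = tm.lookup("Press the power button.")          # identical segment
fuzzy = tm.lookup("Press the reset button.")          # similar segment, needs editing
miss = tm.lookup("Store the device in a dry place.")  # nothing similar stored

# After the translator edits the fuzzy suggestion, the new pair is stored
# so that it can be offered as an exact match next time.
tm.add("Press the reset button.", "Appuyez sur le bouton de réinitialisation.")
```

In this sketch a fuzzy match is only a starting point: the translator still has to correct the part of the suggestion that differs from the new source segment.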
CAT tools preserve document formatting. Suppose a translator is working on a document with a complex structure that contains multi-level lists, styles, links and other layout elements. SmartCAT stores information about the layout of the source text in special tags. If these tags are left in place during translation, the translated text will look the same as the original.
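As a rough illustration of the idea, the sketch below replaces inline markup with numbered placeholders before translation and restores it afterwards. The `{1}`, `{2}` placeholder syntax and the function names are hypothetical; the article does not describe the tag format SmartCAT actually uses.

```python
import re

# Matches simple opening and closing inline tags such as <b> or </b>.
TAG_PATTERN = re.compile(r"</?[a-zA-Z][^>]*>")


def extract_tags(segment):
    """Replace each inline markup tag with a numbered placeholder."""
    tags = []

    def replace(match):
        tags.append(match.group(0))
        return "{%d}" % len(tags)

    return TAG_PATTERN.sub(replace, segment), tags


def restore_tags(translated, tags):
    """Put the original markup back in place of the numbered placeholders."""
    def replace(match):
        return tags[int(match.group(1)) - 1]

    return re.sub(r"\{(\d+)\}", replace, translated)


source = "Click <b>Save</b> to continue."
protected, tags = extract_tags(source)

# The translator works on the protected text and keeps the placeholders.
translated = "Cliquez sur {1}Enregistrer{2} pour continuer."
result = restore_tags(translated, tags)
```

Because the translator only ever sees placeholders, the markup itself cannot be damaged by accident, and the placeholders can be reordered if the target language needs a different word order.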
Most CAT tools are desktop programs: they are installed on one computer and can be used only there. Moving to another computer requires a floating license or some other workaround. SmartCAT has a simple interface and a cloud architecture, which gives certain advantages:
- Several translators can work simultaneously on one project, even if they are located in different parts of the world;
- All necessary materials (translation memory bases, glossaries, etc.) are simultaneously available to all translators of a specific project.

Our platform has a special TranslationConnector module that lets you connect to external resources: content management and content creation systems, electronic document management systems and many others. Thanks to this, you can order a translation of, say, a website or an e-commerce portal with literally one click: the task is passed from the internal system to the responsible translator, who makes the necessary changes directly in that system and returns the finished text. SmartCAT users can therefore work with translations in the interfaces of the systems they already use, and companies can organize their translation processes in whatever way suits them best, building solutions for specific projects on top of the platform. Translation can be handled either by an internal team (for example, a translation department) or by an external one (a translation company).
Sometimes translators have to work with PDF documents and images, which is very inconvenient. The text in such files cannot be edited directly, so before translation it has to be recognized, that is, extracted as text data. Of course, you can always print the scans, hang them next to the monitor and retype their contents in a text editor, if you do not mind the time and effort. SmartCAT makes working with such formats much easier thanks to its integration with ABBYY OCR technology: you upload the document to the system, and it automatically extracts the text for translation. Translators do not even have to leave the program.
In addition, our CAT tool can measure the productivity of translators in specific projects. In March, our colleagues attended the TAUS conference on translation automation. According to most participants, projects that involve post-editing of machine translation need to track editing time and editing volume at the level of the individual segment. We decided that it makes sense to monitor the entire translation process, not just the work with machine translation, and added online project monitoring to SmartCAT. The platform analyzes various metrics and performance indicators in real time, which provides the information needed to optimize the work of translators, editors and proofreaders. Such data also helps to assess whether the cost of automation technologies is justified in a particular project.
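One way to picture segment-level tracking is the sketch below, which records the time spent on a segment and measures editing volume as the character-level Levenshtein distance between the machine output and the final text. The `SegmentRecord` structure and the choice of Levenshtein distance are assumptions for illustration; the article does not say which metrics SmartCAT computes.

```python
from dataclasses import dataclass


def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


@dataclass
class SegmentRecord:
    machine_output: str   # raw machine translation shown to the editor
    final_text: str       # text after post-editing
    seconds_spent: float  # time the editor spent on this segment

    @property
    def edit_ratio(self):
        """Share of the segment that was changed: 0.0 means untouched."""
        longest = max(len(self.machine_output), len(self.final_text), 1)
        return edit_distance(self.machine_output, self.final_text) / longest


untouched = SegmentRecord("Press the button.", "Press the button.", 2.5)
edited = SegmentRecord("kitten", "sitting", 14.0)
```

Aggregating `edit_ratio` and `seconds_spent` over all segments of a project would give a rough picture of how much post-editing effort the machine translation actually saved.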
Now a few words about what our developers did to bring SmartCAT to life. In particular, they wrote a small but powerful application server, about 1,200 lines of code, which is a .NET assembly loader running as a Windows service. It can safely shut down or restart if it suddenly runs into bugs in our code or in third-party components, or some other unpleasant surprise. When that happens, it carefully logs the crash so that it can get back into operation. Each plug-in assembly contains a Ninject module with a handler for the part of a business process that cannot be completed within a single web request. That part is represented as a task, which is placed in a queue. For fast and scalable work with job queues in MongoDB and SQL, we developed generic templates.
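The general pattern (handlers registered per task type, tasks pulled from a queue, failures logged without stopping the worker) can be sketched as follows. This is a simplified in-memory illustration in Python, not the .NET service described above: the `handler` decorator, the task dictionary layout and the `word_count` task type are all hypothetical, and a real deployment would keep the queue in MongoDB or SQL as the article says.

```python
import logging
from collections import deque

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("worker")

HANDLERS = {}  # task type -> handler function


def handler(task_type):
    """Register a function as the handler for one task type."""
    def register(func):
        HANDLERS[task_type] = func
        return func
    return register


@handler("word_count")
def count_words(payload):
    # Stand-in for work that is too long to finish inside a web request.
    return len(payload["text"].split())


def run_worker(tasks):
    """Process every queued task; a failing task is logged and set aside."""
    results, failures = [], []
    while tasks:
        task = tasks.popleft()
        try:
            results.append(HANDLERS[task["type"]](task["payload"]))
        except Exception:
            # Record the failure with its traceback and keep the worker alive.
            log.exception("task %r failed; worker keeps running", task["type"])
            failures.append(task)
    return results, failures


queue = deque([
    {"type": "word_count", "payload": {"text": "translation memory"}},
    {"type": "word_count", "payload": {}},  # malformed: no "text" key
    {"type": "word_count", "payload": {"text": "one two three"}},
])
results, failures = run_worker(queue)
```

Keeping failed tasks in a separate list makes it possible to retry or inspect them later, which is one simple way to let a worker recover from an unexpected error.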
In addition, our specialists implemented neat and convenient attribute routing in WebAPI 5.0. So that job handlers are not constrained by RAM or disk space, we added streaming of data from external file providers (for example, an OCR server) to the TranslationConnector and, in turn, streaming of the same data into MongoDB GridFS.
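The point of streaming is that a handler only ever holds one small chunk of a file in memory, however large the file is. A minimal sketch of that idea, using in-memory streams as stand-ins for the file provider and the storage (the 64 KB chunk size and the `stream_copy` name are assumptions):

```python
import io

CHUNK_SIZE = 64 * 1024  # bytes held in memory at any one time


def stream_copy(source, destination, chunk_size=CHUNK_SIZE):
    """Copy a binary stream chunk by chunk; return the number of bytes copied."""
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # an empty read signals the end of the stream
            break
        destination.write(chunk)
        total += len(chunk)
    return total


# Stand-ins: in practice the source could be a network response and the
# destination a file-like object backed by a storage system.
source = io.BytesIO(b"x" * 200_000)
destination = io.BytesIO()
copied = stream_copy(source, destination)
```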
We also came up with a way to organize configuration files that simplifies application setup during development, testing and operation. For example, the deployed files do not contain credentials for production services and databases: these are attached dynamically from a separate directory. There are also settings that depend on the specific role of a server and on its network connections. All this makes it possible to run many handlers on different servers.
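A layered setup of this kind might look like the sketch below: shared settings, then role-specific overrides, then credentials read from a directory that is not part of the deployment. The file names (`base.json`, `role.<name>.json`, `credentials.json`), the JSON format and the `ocr` role are assumptions for illustration, not the project's actual layout.

```python
import json
import tempfile
from pathlib import Path


def load_json(path):
    """Return the parsed file, or an empty dict if the file does not exist."""
    return json.loads(path.read_text()) if path.exists() else {}


def load_settings(app_dir, secrets_dir, role):
    """Later layers override earlier ones: base, then role, then credentials."""
    settings = {}
    settings.update(load_json(app_dir / "base.json"))
    settings.update(load_json(app_dir / f"role.{role}.json"))
    settings.update(load_json(secrets_dir / "credentials.json"))
    return settings


# Demonstration with temporary directories standing in for a real server.
with tempfile.TemporaryDirectory() as tmp:
    app_dir, secrets_dir = Path(tmp, "app"), Path(tmp, "secrets")
    app_dir.mkdir()
    secrets_dir.mkdir()
    (app_dir / "base.json").write_text(json.dumps({"queue": "jobs", "workers": 1}))
    (app_dir / "role.ocr.json").write_text(json.dumps({"workers": 4}))
    (secrets_dir / "credentials.json").write_text(
        json.dumps({"db_password": "example-only"}))
    settings = load_settings(app_dir, secrets_dir, role="ocr")
```

With this arrangement the deployed application directory can be copied between environments unchanged, since the only environment-specific secrets live outside it.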
In the near future, we will try to share more technical details from our developers and explain what advantages these technologies bring to SmartCAT users. The cloud platform itself is still in closed testing, but anyone interested can apply to take part on the official website.
Denis Frolov
ABBYY Language Services