A translator out of the machine, or how to teach an MFP to translate documents
Hi, %username%!
Recently we at ABBYY LS, together with Xerox, launched Xerox Easy Translator Service, a service that gives you a machine translation of a document: just scan it on a Xerox ConnectKey multifunction printer or photograph it with your phone camera. You can also order a professional translation through the same platform.
How does it work? Let's figure it out!
Machine translation
The user scans, photographs, or uploads a ready-made file via the web. The file is saved in the database and then parsed into segments: objects that store text fragments (as a rule, sentences) together with information about the layout of those fragments. Files in image formats are first recognized with ABBYY Recognition Server. Before sending a document for recognition, we ask the user which language it is written in. The image can be recognized without this, but specifying the source language makes recognition faster and more accurate.
While integrating with Recognition Server, we had to choose processing parameters for our document flow: the export format for the results, the right balance between recognition speed and quality, and the document assembly type.
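To make the idea of a segment more concrete, here is a minimal sketch in Python. It is not the service's actual code: the `Segment` fields, the `split_into_segments` function, and the naive "sentence ends with a period, exclamation mark or question mark" rule are all assumptions made for illustration, and real layout information is far richer than a paragraph index.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A text fragment (usually one sentence) plus hypothetical layout info."""
    text: str
    paragraph_index: int  # which paragraph the fragment came from
    order: int            # position of the fragment inside that paragraph


# Naive assumption: a sentence ends with ., ! or ? followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_into_segments(document_text: str) -> list[Segment]:
    """Split plain text into sentence-level segments, paragraph by paragraph."""
    segments = []
    paragraphs = [p for p in document_text.split("\n") if p.strip()]
    for p_index, paragraph in enumerate(paragraphs):
        sentences = [s for s in _SENTENCE_END.split(paragraph.strip()) if s]
        for order, sentence in enumerate(sentences):
            segments.append(Segment(sentence, p_index, order))
    return segments


for segment in split_into_segments("Hello world. How are you?\nSecond paragraph here."):
    print(segment)
```

Keeping the paragraph index and order next to the text is what lets the translated fragments be put back in the right place when the document is reassembled.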
As the export format we currently use good old .doc, since at the moment it describes rich text most fully and at the same time solves a number of problems with the layout of elements on the page during segmentation (hello, .docx!). However, a move to .docx is planned.
The balance between speed and quality caused the most controversy. On the one hand, recognition quality is the top priority for machine translation of a document, since the whole process is automated and there is no way to bring in layout specialists. On the other hand, the main advantage of MT (machine translation) is speed, especially when the user is standing at the MFP waiting for a printed translation, and speed has to be paid for with quality. Nevertheless, we chose quality.
The document assembly type determines which elements of the original document end up in the result file. You can limit the recognition result to plain text (this does not suit us, since non-text information that matters for understanding the context would be lost), or you can keep the formatting of that text (better, but the important non-text information is still missing). The Editable Copy type preserves formatted text and non-text content, but without binding them to pages. At first glance, losing the page layout is a drawback. But the length of words can change significantly in translation (for example, the German translation of the word “friendship” - “Freundschaftsbezeigungen”), so not being tied to the pages of the source file lets us avoid situations where text blocks run into other page elements or where the translated text does not fit into the dimensions of the source block. The last option, Exact Copy, also preserves formatted text and non-text content, and the output is a document as close as possible to the original in terms of pagination. This option looks more solid for formats that support paginated output (pdf, djvu), but the translation may spill outside its block. In the end, we chose Editable Copy.
Source text
Exact Copy Example
Editable Copy Example
Next, the recognized file goes through the segmentation mentioned above, and the language of the text is detected automatically for the segments that make up the first 1000 characters of the document. Although we have already asked the user to specify the document language when uploading the file, we still run auto-detection, because when working through the API, specifying the language is optional for image formats and is not required when uploading text documents either. Knowing the language, we can calculate document statistics: the number of pages, words, and characters. After that, the service sends blocks of document segments through the Machine Translation API (MT-API) to one of several MT engines. When the translation is finished, the document is reassembled and a notification is sent to the user.
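The steps in this paragraph can be sketched roughly as follows. This is an illustration under assumptions, not the service's implementation: the function names, the whitespace-based word count, and the block size are hypothetical, and the actual language detector and the MT-API call are left out entirely. Only the 1000-character threshold comes from the text above.

```python
from typing import Iterator

DETECTION_LIMIT = 1000  # the first 1000 characters are used for language detection


def detection_sample(segments: list[str], limit: int = DETECTION_LIMIT) -> list[str]:
    """Return the leading segments that together cover the first `limit` characters."""
    sample, covered = [], 0
    for text in segments:
        if covered >= limit:
            break
        sample.append(text)
        covered += len(text)
    return sample


def document_stats(segments: list[str]) -> dict[str, int]:
    """Count words and characters; splitting words on whitespace is a simplification."""
    return {
        "words": sum(len(text.split()) for text in segments),
        "characters": sum(len(text) for text in segments),
    }


def blocks(segments: list[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive blocks of at most `size` segments for one MT request each."""
    for start in range(0, len(segments), size):
        yield segments[start:start + size]


texts = ["Hello world.", "How are you?", "Fine, thanks.", "Bye."]
print(detection_sample(texts))
print(document_stats(texts))
print(list(blocks(texts, 3)))
```

Whole segments are taken for the sample rather than cutting the text at exactly 1000 characters, so the detector never sees a sentence chopped in the middle.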
Machine Translation Example:
Source image
Result
It is worth noting that, although machine translation is still far behind professional translation in quality, it copes well with quickly translating large volumes of information when all you need is to understand the general points of a document. Another frequent scenario is finding the relevant pieces of the source text, which can then be translated more carefully. Nevertheless, we are taking steps to improve the quality of machine translation by using translation memory databases, which we discuss below.
Professional Translation
If the user needs a better translation, they can order one from professional translators. In this case the file takes a slightly different path:
The text obtained from Recognition Server is sent, along with the original document, to SmartCAT, a platform for automating the translation process. The source file is needed so that the translator can see the non-text content, which may carry information that is important for preserving the context of the translation. But before the document reaches the translator, a manager checks whether it needs preliminary layout work and, if necessary, brings in layout specialists. Only then are the translators assigned. Right in the editor, the translator has access to both the machine translation engines and the Translation Memory databases, which reduces the time spent on the document. When the translation is finished, it is edited, proofread, and checked again by the manager. The translation is then complete, and the user receives an email notification and a file with a high-quality result.
Translation example:
Source image
Result
How does all this work inside? Good question!
The cake is not a lie!
Do you like layer cakes? We do, and the infrastructure of our application can be pictured as one:
Each slice consists of DLL assemblies that implement a specific feature, for example FileManagement (file management). The libraries are also divided into layers: Contracts, Web API, Data Storage, and Task Processing. This separation follows the CQRS principle (command-query responsibility segregation), according to which a method should be either a command that performs some action or a query that returns data, but not both at once. In other words, asking a question should not change the answer (wiki).
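The split between commands and queries can be illustrated with a small Python sketch. The real application is built from .NET assemblies, so everything here (the `RenameFile` command, the `GetFileName` query, and the `FileManagement` class with its `execute` and `ask` methods) is a hypothetical stand-in chosen only to show the rule.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RenameFile:
    """Command: asks for a state change and carries no result."""
    file_id: str
    new_name: str


@dataclass(frozen=True)
class GetFileName:
    """Query: asks for data and must not change anything."""
    file_id: str


class FileManagement:
    """Hypothetical feature that keeps file names in memory."""

    def __init__(self) -> None:
        self._names: dict[str, str] = {}

    def execute(self, command: RenameFile) -> None:
        # Commands mutate state and return nothing.
        self._names[command.file_id] = command.new_name

    def ask(self, query: GetFileName) -> Optional[str]:
        # Queries only read; calling this twice gives the same answer.
        return self._names.get(query.file_id)


files = FileManagement()
files.execute(RenameFile("doc-1", "contract.doc"))
print(files.ask(GetFileName("doc-1")))
print(files.ask(GetFileName("doc-1")))
```

Because `ask` never writes, it can be called any number of times without affecting what the next call returns, which is exactly the "asking a question should not change the answer" property.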
Contracts
The contract assembly holds the interfaces through which the application modules interact, as well as the commands and queries that the feature operates on. These assemblies are used by the other layers of the same feature (for file management, these are FileManagement.Api and FileManagement.Processing) and by other features (order management uses file management).
Web API
Everything is simple here: an API method called by the consumer initiates the execution of commands, queries, or a combination of them, and returns the result to the user.
Data storage
This assembly subscribes to commands and queries of certain types and modifies or reads the data accordingly. We use MongoDB for this, but since all work with data goes through commands and queries, the other (non-Data Storage) assemblies can only guess at the document-oriented nature of the database.
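A rough sketch of this subscription idea, assuming a simple in-process dispatcher: the `Bus` class, the `SaveFile` and `GetFileSize` messages, and the storage class are all hypothetical, and a plain dictionary stands in for MongoDB so that the example stays self-contained.

```python
from dataclasses import dataclass
from typing import Any, Callable


class Bus:
    """Routes each command or query to the handler subscribed to its type."""

    def __init__(self) -> None:
        self._handlers: dict[type, Callable[[Any], Any]] = {}

    def subscribe(self, message_type: type, handler: Callable[[Any], Any]) -> None:
        self._handlers[message_type] = handler

    def send(self, message: Any) -> Any:
        return self._handlers[type(message)](message)


@dataclass(frozen=True)
class SaveFile:       # command
    file_id: str
    content: bytes


@dataclass(frozen=True)
class GetFileSize:    # query
    file_id: str


class InMemoryFileStorage:
    """Stand-in for a data storage assembly; a dict replaces the real database."""

    def __init__(self, bus: Bus) -> None:
        self._files: dict[str, bytes] = {}
        bus.subscribe(SaveFile, self._save)
        bus.subscribe(GetFileSize, self._size)

    def _save(self, command: SaveFile) -> None:
        self._files[command.file_id] = command.content

    def _size(self, query: GetFileSize) -> int:
        return len(self._files[query.file_id])


bus = Bus()
storage = InMemoryFileStorage(bus)
bus.send(SaveFile("doc-1", b"scanned page"))
print(bus.send(GetFileSize("doc-1")))  # prints 12
```

The caller only knows about `SaveFile` and `GetFileSize`; swapping the dictionary for a real database would change the storage class and nothing else.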
Task processing
This layer performs long-running operations. Like the data storage assembly, it subscribes to certain commands, but the actual moment at which processing of such a command starts is controlled by the scheduler. Such commands are called tasks. Parsing a file into segments is one such task.
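The difference between an ordinary command and a task can be shown with a toy scheduler. This is only a sketch under assumptions: the `Scheduler` class, its `submit` and `run_pending` methods, and the `ParseFileIntoSegments` task are hypothetical names, and a real scheduler would also deal with timing, retries, and persistence.

```python
from collections import deque
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class ParseFileIntoSegments:
    """A task: a command whose start time is decided by the scheduler."""
    file_id: str


class Scheduler:
    """Queues tasks on submit and runs them only when run_pending() is called."""

    def __init__(self) -> None:
        self._handlers: dict[type, Callable[[Any], None]] = {}
        self._queue: deque = deque()

    def subscribe(self, task_type: type, handler: Callable[[Any], None]) -> None:
        self._handlers[task_type] = handler

    def submit(self, task: Any) -> None:
        self._queue.append(task)  # nothing is executed at this point

    def run_pending(self) -> int:
        """Run every queued task in submission order; return how many ran."""
        executed = 0
        while self._queue:
            task = self._queue.popleft()
            self._handlers[type(task)](task)
            executed += 1
        return executed


parsed: list[str] = []
scheduler = Scheduler()
scheduler.subscribe(ParseFileIntoSegments, lambda task: parsed.append(task.file_id))
scheduler.submit(ParseFileIntoSegments("doc-1"))
print(parsed)                   # still empty: the task is only queued
print(scheduler.run_pending())  # number of tasks executed
print(parsed)
```

Submitting returns immediately, so an API call that triggers a long operation does not have to wait for it to finish.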
The whole project tree looks like this:
This division into layers and features lets us extend the functionality of our cake application quite flexibly, adding more and more tasty new features.