20 projects, 20 languages, deadline yesterday

Imagine: you have 7 development teams with more than 100 people in total. They are building 13 applications at the same time, and the work is spread across 20 repositories.

All applications need to be translated. Some into 6 languages, some into 20. And some into 13, but that is a completely different set of languages that does not overlap with the previous 20.

Every team has its own stack and, as a result, its own string format: js, json, ts, yaml or yml. And some people still keep their texts in the database.

You work in Agile: daily delivery of value, two-week sprints. The DoR includes all required translations. And, of course, the translations were needed yesterday so that there is time left to test.

There is a department of technical writers. Who is a technical writer? This is a person who writes external documentation and sometimes internal documentation too. They write every kind of text that users or partners can see: interface texts, email texts, API responses, error messages. They accompany the development process so that they stay immersed in the technology and the business logic. And they ensure that translations are delivered to the application on time.

There is also a combined role of copywriter-translator and localization manager. This is a person who writes all the texts in English, monitors the consistency of translations, assigns translators, and solves all related problems.

Now the question: how many technical writers, copywriters and localization managers do you need so that development does not stall and the entire technical department does not suffer?

In our case, we manage with 4 technical writers and 1 copywriter-localization manager. Delivery of translations takes one working day on average and never exceeds three working days. I hope that got your attention.

How we got here

Six years ago we worked with Google Sheets and databases. That is, when strings for translation appeared during development, we copied them into a spreadsheet and then emailed it out for translation. When the translation was ready, it was uploaded to the database manually. The only advantage of this approach is that you do not have to redeploy the application to see the new strings. But if there is an error in a translation, there is no way to roll it back. There is no translation memory and there are no glossaries. Consistency of translations is maintained only by careful proofreading.

First try

The first attempt to automate this process looked like this: when a developer had new strings, he added them to a new branch in a dedicated translations repository. A pipeline then ran in that same branch and sent the whole diff of strings for translation via an API. Admittedly, the translations still had to go back into the database, and loading strings from the external service into the internal database via the API did not work out.

What did this integration give us? It removed the step where the technical writer had to collect everything into a single table, send it off manually, and then split the resulting translations by application and by language. Now the strings were sent for translation straight into a project named after the application they were intended for. As output, the technical writer received a set of archives, one for each application that had been worked on. This significantly reduced the share of manual labor. In addition, translation memory was implemented on the provider's side. But this solution kept a number of shortcomings: storing the strings in the database did not allow us to manage them properly on our side and still meant a large share of manual work.

Pain and continuous localization

The next integration brought the developers a lot of suffering. I suspect that those who lived through it still get an eye twitch at the word “localization”. This was our first integration with Serge and Smartcat.

At this point it is worth explaining what Serge and Smartcat are.

Serge is a utility that works with git. It can pull the required strings from a branch, send them for translation, and then return the translations for those strings to the same branch. You also need a plugin that calls the API of the CAT system in which the translation is done. The plugin receives new strings from Serge and returns the finished translations to Serge.

Smartcat is a CAT system with support for glossaries, translation memory, and placeholders. Smartcat also aggregates and simplifies payments to freelancers and supports connecting translation vendors.

At this stage, we moved the strings from the database into the project repositories. Now the strings had to be sent directly from the application repository and returned to the same place.

It was supposed to work like this: the developer knows which branch his feature branch was created from, and the diff in the resource files between these two branches is exactly what needs to be translated. When the developer has a set of strings for translation, he runs a job with the Serge config in his branch. Serge calculates the diff, extracts the new strings, calls the plugin, and sends the strings for translation. When the translations are ready, the developer runs the next job: it brings up the Serge instance created in the previous step, fetches the finished translations, and commits them to the original branch.
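The idea of treating the branch diff as the translation task can be sketched in a few lines of Python. This is only an illustration, not how Serge computes its diff: it assumes the resource files are nested JSON already loaded into dictionaries, and the function names `flatten` and `strings_to_translate` are hypothetical.

```python
def flatten(resource, prefix=""):
    """Turn a nested resource dict into a flat {dotted.key: text} dict."""
    flat = {}
    for key, value in resource.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def strings_to_translate(base, feature):
    """Return strings that are new or changed in the feature branch.

    `base` and `feature` are the parsed resource files from the parent
    branch and the feature branch (assumed format: nested JSON).
    """
    base_flat = flatten(base)
    feature_flat = flatten(feature)
    return {
        key: text
        for key, text in feature_flat.items()
        if base_flat.get(key) != text
    }


# Hypothetical example data
base = {"login": {"title": "Sign in", "button": "Continue"}}
feature = {
    "login": {
        "title": "Sign in",
        "button": "Log in",
        "hint": "Forgot password?",
    }
}
pending = strings_to_translate(base, feature)
```

Here `pending` contains only `login.button` (changed) and `login.hint` (new); the untouched `login.title` is not sent again. The weak point described in the text is visible even in this sketch: the result is only correct if the developer picks the right parent branch as `base`.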

The solution turned out to be unstable: Serge is not designed to be deployed from scratch on every pipeline run, the developers did not want to think about diffs between branches, and the Smartcat plugin was in urgent need of updating and refinement. Delivering new strings could take hours. And, alas, it did not always end in success.

In theory, every stage of the process was automated. In practice, maintenance, calculating the diff before launching the pipeline, and troubleshooting took more time than doing the same task entirely by hand.

Light at the end of the tunnel

By August 2018, we had launched the current version of the integration. We have a localization server. On that server there is a Serge instance for each repository. Serge scans all branches in the repository, sends new strings for translation, and commits finished translations to the original branches. In the current integration, everything is fast and stable. After a branch for translations is created, the strings appear in Smartcat within 5-6 minutes. Once the translations are confirmed, the commit arrives just as quickly, in the same 5-6 minutes. The delivery time of translations is limited only by human factors: the workload of translators, the difference in time zones, and so on.

In upcoming articles I will explain how to set up the Serge-Smartcat-GitLab integration from scratch and how we solved various non-standard tasks.

Source: https://habr.com/ru/post/445532/
