20 projects, 20 languages, deadline yesterday. Part 3

This is the final article on integrating Serge with Smartcat. In it I explain how we scaled Serge to the whole company, walk through 4 non-standard integrations and, as a bonus, describe 2 features that can make your life easier.

Previous articles:

20 projects, 20 languages, deadline yesterday
20 projects, 20 languages, deadline yesterday. Part 2

Scalability

In the previous article I explained how to configure Serge for a single repository. Our company has dozens of repositories that need translation, so we allocated a separate server for localization. The file structure and environment on it are identical to those described in the previous article. Each repository uses its own Serge instance. To avoid running commands manually, each instance has its own cron job that runs the Serge commands in sequence: fetching new strings from the repository, fetching new translations, parsing, sending new strings to Smartcat, and pushing new translations to Gitlab.
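The sequence described above could be wrapped in a small script that cron calls for each instance. This is only a sketch: it assumes the standard Serge subcommands (pull, pull-ts, localize, push-ts, push) map onto the five steps in that order, and the SERGE_BIN variable, the function name, and the config path are hypothetical.

```shell
#!/bin/bash
# Hypothetical cron wrapper: one full Serge cycle for one repository.
# SERGE_BIN and the config path are assumptions made for illustration.
SERGE_BIN="${SERGE_BIN:-serge}"

run_serge_cycle() {
  local config="$1"
  local step
  # pull: new strings from the repository, pull-ts: new translations,
  # localize: parsing, push-ts: strings to Smartcat, push: translations to Gitlab
  for step in pull pull-ts localize push-ts push; do
    "$SERGE_BIN" "$step" "$config" || {
      echo "serge $step failed for $config" >&2
      return 1
    }
  done
}
```

Stopping at the first failed step keeps a broken pull from pushing stale data further down the chain. A cron entry per repository would then call run_serge_cycle with that repository's config file.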

Integration Options

Two sets of languages in one repository

Let's start with the simplest case. Imagine that your repository contains several sets of resource files. For example, the strings for the client and for the application API are stored in the same repository, but in different directories. The client is translated into 20 languages, the API into 6.

Objective: set up independent delivery of translations for each of the directories.
Solution:

Configure 2 projects in Smartcat: one with 6 languages and one with 20.
Configure 2 projects on the localization server.
For the first project, add this line to the project1.cfg file: our $unmerged_branch_mask = '^(translateAPI-)'; # process unmerged branches matching this mask. Here “translateAPI-” is the branch name prefix. It tells Serge that this branch needs translations in the API directory.
In the project1.serge.tmpl file, set the source_dir parameter to the path to the resource files in the API directory.
Similarly, for the second project, add this line to the project2.cfg file: our $unmerged_branch_mask = '^(translateCLIENT-)'; # process unmerged branches matching this mask. Here “translateCLIENT-” is the branch prefix for this project. It tells Serge that this branch needs translations in the CLIENT directory.
In the project2.serge.tmpl file, set the source_dir parameter to the path to the resource files in the CLIENT directory.

Note that prefixes must be unique among all projects configured for a single repository.

In total, we have 2 projects in Smartcat and 2 corresponding projects on the localization server. Both projects look at the same Gitlab repository, but at different directories. Serge determines which strings to send for translation from the branch prefix. The same base-translate branch is used to calculate the diff.
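To make the routing concrete, here is a small illustration of how the two masks split branches between the projects. It only mimics the matching with grep -E and is not how Serge evaluates the mask internally; the function name and the sample branch names are made up.

```shell
#!/bin/bash
# Illustration only: which localization project picks up a given branch.
# The masks mirror $unmerged_branch_mask in project1.cfg and project2.cfg.
API_MASK='^(translateAPI-)'
CLIENT_MASK='^(translateCLIENT-)'

project_for_branch() {
  local branch="$1"
  if echo "$branch" | grep -Eq "$API_MASK"; then
    echo "project1 (API, 6 languages)"
  elif echo "$branch" | grep -Eq "$CLIENT_MASK"; then
    echo "project2 (client, 20 languages)"
  else
    echo "ignored"
  fi
}
```

For example, project_for_branch translateAPI-billing reports project1, while a branch without either prefix is ignored by both projects. This is also why the prefixes must not overlap.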

Localizing Swagger

In our company all products are localized, including documentation. We are now introducing auto-generation of documentation from Swagger, which means the Swagger files need to be localized too.
Task: localize Swagger with minimal effort.

Solution: in the myproject.tmpl.serge file, add a data object to the parser object and list in it the fields whose values should be extracted and sent for translation:

parser {
    plugin parse_json

    data {
        path_matches \/(summary|description)$
    }
}

A similar task: only some of the texts in a file need to be translated, namely the legal ones. The other texts are supplied by the marketing team. To avoid complicating the structure and creating a separate file for legal texts, the keys of all legal strings were given the legal prefix:

parser {
    plugin parse_json

    data {
        path_matches ^\/legal\..*
    }
}
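A quick way to see what each path_matches rule selects is to run the same regular expression over a list of candidate paths. The sketch below assumes that the parser joins nested keys into slash-separated paths, as the two expressions above suggest. The sample paths and the function name are invented, and the slash is left unescaped because grep -E does not need the backslash.

```shell
#!/bin/bash
# Illustration only: which JSON paths a path_matches rule would select.
# Assumes keys are joined with "/" into paths; the sample paths are made up.
matching_paths() {
  # $1 is the regular expression; candidate paths arrive on stdin.
  grep -E "$1"
}

swagger_paths='/info/title
/paths/users/get/summary
/paths/users/get/description
/paths/users/get/operationId'

echo "$swagger_paths" | matching_paths '/(summary|description)$'
```

Of the four sample paths, only the summary and description entries are printed. Running the same function with '^/legal\..*' would keep a top-level key such as /legal.terms and drop everything else.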

Subtleties of legal translations

Another interesting case. We have a legal document whose terms vary by country. Nevertheless, it is a single application, and the resource files live in the same directory.

Task: translate several documents within one project, where each document must be translated into one specific language.

What was done:

For each country, we created a directory containing the relevant source file in English.
The source_dir variable points to the shared directory that holds these resource files.
We enabled the search for resource files in all subdirectories: source_process_subdirs YES
We added a new plugin to the list of callback plugins that sends each resource file for translation into the required language only. The name of the directory containing the file serves as the key:

callback_plugins {
    :feature_branch {
        plugin feature_branch

        data {
            master_job job.base-translate
        }
    }

    :limit_languages {
        plugin limit_languages

        data {
            # all rules are processed top to bottom; each rule can add or remove languages
            # so the highest-priority rules are placed at the bottom
            if {
                # by default, don't localize
                file_matches .

                then {
                    exclude_all_languages YES
                }
            }

            if {
                file_matches de-au\/

                then {
                    include_languages de-AT
                }
            }

            if {
                file_matches li-LI\/

                then {
                    include_languages li
                }
            }

            if {
                file_matches pt\/

                then {
                    include_languages pt-BR
                }
            }

            if {
                file_matches zh-Hans\/

                then {
                    include_languages zh-Hans
                }
            }

            # and so on..
        }
    }
}
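The effect of the top-to-bottom processing can be modelled in a few lines of shell. This is only a mental model of the rules listed above, not the plugin's actual implementation; the function name and the sample file paths are hypothetical.

```shell
#!/bin/bash
# Illustration only: how top-to-bottom rule processing resolves languages.
# This models the rules from the config; it is not the plugin's real code.
languages_for_file() {
  local file="$1"
  # First rule: it matches every file and excludes all languages.
  local langs=""
  # Later rules add a language back for files in specific directories.
  case "$file" in *de-au/*)   langs="$langs de-AT" ;; esac
  case "$file" in *li-LI/*)   langs="$langs li" ;; esac
  case "$file" in *pt/*)      langs="$langs pt-BR" ;; esac
  case "$file" in *zh-Hans/*) langs="$langs zh-Hans" ;; esac
  # Strip the leading space left by the concatenation.
  echo "${langs# }"
}
```

A file such as de-au/terms.json resolves to de-AT only, while a file outside the listed directories ends up with no target languages and is not sent for translation at all.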

Localization when storing strings in the database

Part of our system stores translations in the database and, for several reasons, cannot be moved to resource files in the repository. However, we still need to deliver translations quickly and automatically.

Task: set up continuous localization for strings that are stored in the database rather than in the repository.

Solution:

Create a repository and collect all the strings from the database into it, grouped in whatever way is convenient (by set of target languages or by product).
Create a project in Smartcat.
Run the standard continuous localization cycle.
Merge the translation branches into the base-translate branch.
Use cron to check the hash of the last commit in base-translate. If the hash has changed, that is, new translations have been merged, parse the diff between the old and the current hash and send the new or modified strings to the database.
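The last step could look roughly like the sketch below. It remembers the last processed hash in a state file and hands the diff to an importer when the hash changes. REPO_DIR, STATE_FILE, and import_to_db are hypothetical names, and the importer is a placeholder for whatever parses the diff and writes to your database.

```shell
#!/bin/bash
# Hypothetical cron job: detect new translations merged into base-translate.
# REPO_DIR, STATE_FILE and import_to_db are assumptions made for illustration.
REPO_DIR="${REPO_DIR:-/srv/l10n/db-strings}"
STATE_FILE="${STATE_FILE:-/var/tmp/base-translate.last}"

# Placeholder: a real job would parse the diff and update the database.
import_to_db() { cat; }

check_base_translate() {
  local old new
  old="$(cat "$STATE_FILE" 2>/dev/null)"
  new="$(git -C "$REPO_DIR" rev-parse base-translate)" || return 1
  if [ "$old" = "$new" ]; then
    echo "no new translations"
    return 0
  fi
  if [ -n "$old" ]; then
    # Only the changes between the previously seen commit and the current one.
    git -C "$REPO_DIR" diff "$old" "$new" | import_to_db
  fi
  # On the very first run there is no old hash, so only the baseline is saved.
  echo "$new" > "$STATE_FILE"
}
```

Saving the hash only after the import step means a failed run is retried on the next cron tick instead of being silently skipped, provided the importer reports its failures.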

Bonus features

Alerts

Smartcat's built-in alerts did not suit us: each team wants to receive notifications only about its own branches, and only when the translations in all of the product's resource files are completely ready.

We decided to rely on the presence of all translations in the repository and, once they are complete, send a notification to the corporate messenger, which in our case is Google Chat.

Task: set up alerts for a repository that 8 teams can commit to, and duplicate all alerts to the technical documentation department's channel.

Solution:

Agree with each team that branch names must contain the team name. Keep using the translate prefix to mark branches that need translation.
Create a pipeline that runs only for branches with the translate- prefix.
In the pipeline, determine which team the branch belongs to, check for strings with an empty value and, if there are none, send a readiness notification to the appropriate channel. Since the code is fairly long, I moved it into a script.

CI

check-translations:
  stage: check-translations
  image: node:8.14.0
  tags:
    - devops
  script:
    - chmod +x ./notification.sh
    - ./notification.sh
  only:
    - base-translate
    - /^translate.*$/
  when: always

Alert Script

#!/bin/bash

# Post a "localization is ready" card to Google Chat.
# $1 = branch name, $2 = commit hash, $3 = URL the card is posted to
hangouts() {
  curl -X POST --max-time 180 \
    -H "Content-Type: application/json; charset=UTF-8" \
    --data "{ \"cards\": [{\"header\": {\"title\": \"LOCALIZATION IS READY\",\"subtitle\": \"REPOSITORY NAME\",\"imageUrl\": \"https://avatanplus.com/files/resources/mid/5775880ee27f8155a31b7a50.png\"},\"sections\": [{\"widgets\": [{\"keyValue\": {\"topLabel\": \"Translation is finished in the branch\",\"content\": \"$1\"}}]},{\"widgets\": [{\"buttons\": [{\"textButton\": {\"text\": \"SEE COMMIT\",\"onClick\": {\"openLink\": {\"url\": \"https://gitlab.loc/common/publisher-client/commit/$2\"}}}}]}]}]}]}" \
    "$3" || true
}

cd app/translations

if echo "$CI_COMMIT_REF_NAME" | grep "commandname1"; then
  # Collect the files that still contain keys with an empty value
  grep -rl '\:\s\"\"' *.json >> result.file
  if [ -s result.file ]; then
    echo "Translations are not ready"; cat result.file
  else
    hangouts "$CI_COMMIT_REF_NAME" "$CI_COMMIT_SHA" "$HANGOUTS_NOTIFICATIONS_COMMAND_NAME_1"
    hangouts "$CI_COMMIT_REF_NAME" "$CI_COMMIT_SHA" "$HANGOUTS_NOTIFICATIONS_DOC"
  fi
fi

if echo "$CI_COMMIT_REF_NAME" | grep "commandname2"; then
  grep -rl '\:\s\"\"' *.json >> result.file
  if [ -s result.file ]; then
    echo "Translations are not ready"; cat result.file
  else
    hangouts "$CI_COMMIT_REF_NAME" "$CI_COMMIT_SHA" "$HANGOUTS_NOTIFICATIONS_COMMAND_NAME_2"
    hangouts "$CI_COMMIT_REF_NAME" "$CI_COMMIT_SHA" "$HANGOUTS_NOTIFICATIONS_DOC"
  fi
fi

...

if echo "$CI_COMMIT_REF_NAME" | grep "commandname8"; then
  grep -rl '\:\s\"\"' *.json >> result.file
  if [ -s result.file ]; then
    echo "Translations are not ready"; cat result.file
  else
    hangouts "$CI_COMMIT_REF_NAME" "$CI_COMMIT_SHA" "$HANGOUTS_NOTIFICATIONS_COMMAND_NAME_8"
    hangouts "$CI_COMMIT_REF_NAME" "$CI_COMMIT_SHA" "$HANGOUTS_NOTIFICATIONS_DOC"
  fi
fi
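The readiness check at the heart of the script can be isolated into two small functions, which makes it easy to try on a local checkout before wiring it into the pipeline. This is a sketch, not the script we run: the function names are hypothetical, and the pattern is slightly looser than the original one because it allows any amount of whitespace after the colon.

```shell
#!/bin/bash
# Illustration only: the readiness check, reduced to reusable functions.
# The function names and the directory argument are assumptions.
untranslated_files() {
  # Lists the JSON files in directory $1 that still contain an empty value.
  grep -l ':[[:space:]]*""' "$1"/*.json 2>/dev/null
}

translations_ready() {
  # Succeeds only when no file in directory $1 has an empty value left.
  [ -z "$(untranslated_files "$1")" ]
}
```

With these in place, each per-team block reduces to a call such as translations_ready app/translations followed by the two notifications.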

Translator Assignments via Smartcat API

This is what our localization manager looks like when it is time to assign all branches for translation.

On average, we have more than 10 branches in progress every day. In Smartcat each language pair is a separate document, and translators must be assigned to each such document. Manually. Imagine: 40-60 assignments every day. To simplify this, we implemented assignment through the API and put it in the pipeline as well. The job is started with a button. A reasonable question: why not assign translators automatically when strings are sent for translation, calling the API method from the Smartcat plugin rather than from the pipeline?

There are several reasons for this decision:

Human factor. Although we build processes and try to follow them, unedited strings or strings without context regularly end up in Smartcat. Automatic assignment would then mean extra costs for us, since some strings would be sent for translation twice: before and after editing.
Separation of roles. A localization engineer or the project's technical writer sets up and manages projects at the localization server level. Assigning translators and communicating with them is the localization manager's job. Assignments therefore need to be manageable, transparent, and accessible through a GUI.

Solution: when the localization manager decides that the strings in a branch are ready for translation, she presses the button in Gitlab. The whole team of translators is assigned to the branch, and the task goes to the translator who responds first.

CI

assignee:
  stage: assignee
  image: node:8.14.0
  tags:
    - devops
  script:
    - chmod +x ./assignee.sh
    - ./assignee.sh
  only:
    - base-translate
    - /^translate.*$/
    - assignee
  when: manual

Assignment script

#!/bin/bash

if echo "$CI_COMMIT_REF_NAME" | grep "translate-"; then
  # Print the ID of every project document whose name contains the branch name.
  # node -p also prints the value of the expression itself ("undefined") as the last line.
  node -pe "JSON.parse(process.argv[1]).documents.forEach(function(elem){ if(elem.name.indexOf(\"$CI_COMMIT_REF_NAME\") !== -1) { console.log(elem.id) } });" \
    "$(curl -XGET -H "Authorization: Basic $SMARTCAT_API_KEY" -H "Content-type: application/json" "https://smartcat.ai/api/integration/v1/project/$SMARTCAT_PROJECT_ID")" >> documents
fi

# Drop the last line of the output (the printed "undefined")
sed '$d' documents > documents.list

# Request assignment from the team for stage 1 of each document
while read LINE; do
  curl -XPOST -H "Authorization: Basic $SMARTCAT_API_KEY" -H "Content-type: application/json" \
    -d "{\"documentIds\":[\"$LINE\"],\"stageNumber\": 1}" \
    "https://smartcat.ai/api/integration/v1/document/assignFromMyTeam"
done < documents.list

This concludes the series of articles on integrating and setting up continuous localization. I will be glad to answer any of your questions.

Source: https://habr.com/ru/post/452580/
