
How to localize an application into many languages without painful embarrassment

Localization and internationalization of applications have been discussed on Habr many times. We at ABBYY Language Services have worked in linguistic services and technologies for a long time and localize software on a regular basis. We have gained considerable experience and decided to share it, with an emphasis on organizing the process as a whole. Localizing an application is harder than is commonly believed, and there are several ways to approach it: you can write simple, clear source text from the start; you can invest in top-notch translators who can extract the meaning from any text; or you can prepare and translate the text "somehow" and then bring in a community or testers to check the final result. Just remember that the source text is verified in one language, while the result is verified in every target language, so the effort required is N times greater.

In essence, localization opens up another market, and management naturally expects additional profit when it decides to localize. At the same time, localization often receives only a small share of the total development budget (say, about 1-2%). That is, the expectation is that adding 1% to the budget will bring in +50% of revenue. How realistic are such expectations?

Goals and scope


The first thing to decide is why the localization is being done, into which languages, in what volume, and to what quality. Usually the goal is a business one: launching the product in new markets or expanding the audience. There are also cases where the product will actually be sold in the original language, but the law of the country requires a translated version (for example, of various user guides).

The localization process as a whole requires active interaction between the client and the translators. Therefore, if a translation into some language is done without real support in the local market (a local office, a user community) and none is expected in the near future, there is a chance that all the work will have to be redone.
Speaking about the translation of software, we can distinguish the following main types of content:


With each subsequent item, the amount of text to translate grows. The application itself may contain 10–100 thousand words, the various guides and training courses another 200–500 thousand, and the full product help up to 1–3 million words (everything, of course, depends on the project).

Even if there is enough money for the product itself, everything else may be left untranslated, although a large product without a translated user manual can, of course, be practically useless.

If everything is localized, it is important that both the user manual and the product help be based on the strings of the application itself. Vague terminology, unfortunate names for interface elements, and errors and typos in the application strings are therefore reproduced in the other materials, often many times over. In addition, the software and the rest of the materials can end up inconsistent with each other, with all the ensuing consequences for quality, timing, and cost.

Organization of the process


So, let's say the decision has been made: localization is needed.

You need to think right away about how everything will be implemented technically: will the strings be extracted from the product, will the translation be done directly in the source materials, and so on.

Possible options (not all, of course) for software:


Once the text resources have been separated out in one way or another, you need to decide how they will be stored: in a specialized database, as text files on a server, and so on.
The next question to ask yourself is: “How will updates be translated and new versions released?” All of these tasks can usually be automated, but it is better to think about this from the very beginning.
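
As a rough illustration of automating updates, the sketch below compares the string resources of two product versions and reports which keys need translator attention. It assumes the strings are stored as simple key-value pairs; the function name, the keys, and the sample strings are all hypothetical.

```python
def diff_resources(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Classify resource keys by what must happen to them in the new version."""
    added = sorted(k for k in new if k not in old)      # need a first translation
    removed = sorted(k for k in old if k not in new)    # translations can be retired
    changed = sorted(k for k in new if k in old and new[k] != old[k])  # need retranslation
    return {"added": added, "changed": changed, "removed": removed}


# Hypothetical resources of two consecutive versions.
old_strings = {
    "menu.open": "Open",
    "menu.save": "Save",
    "error.not_found": "Invoice not found.",
}
new_strings = {
    "menu.open": "Open",
    "menu.save": "Save as...",
    "menu.export": "Export",
}

report = diff_resources(old_strings, new_strings)
print(report)
```

Only the keys listed under "added" and "changed" would need to be sent to translators; unchanged strings keep their existing translations.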

Content


The main feature of this stage is that any mistake made while preparing the content is automatically multiplied by the number of target languages (N). Take an arbitrary poorly worded phrase.

In the best case, N translators will ask the translation project manager (TRM) to clarify the phrase. The TRM is not necessarily a product specialist: this is the person who organizes the process, and he will obviously need more time to understand the situation than the people who create the content directly.

In the worst case, N translators, not wanting to waste time, will each translate the phrase as they see fit. The translation will then have to be corrected in N languages, involving not only the TRM but also testers to confirm that the fix is correct. After that, N * X customers will spend time installing the updated program.
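
To make the multiplication concrete, here is a back-of-the-envelope sketch. The function and all figures in it (minutes per fix, minutes per retest, 20 languages) are hypothetical and serve only to show how the same defect scales with N.

```python
def rework_minutes(n_languages: int, fix_minutes: float, retest_minutes: float) -> float:
    """Total effort when one unclear string is fixed and retested once per language."""
    return n_languages * (fix_minutes + retest_minutes)


# Hypothetical figures: 10 minutes to fix, 5 minutes to retest.
at_source = rework_minutes(1, 10, 5)            # fixed once, before translation
after_translation = rework_minutes(20, 10, 5)   # fixed in each of 20 target languages
print(at_source, after_translation)
```

The sketch ignores the TRM's time and the customers installing the update, so the real gap is wider than this simple product suggests.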

Terminology

Content can be roughly divided into two components: terminology (the foundation) and the rest of the strings (the building itself). Terminology is where you need to put in maximum effort to get a quality translation. Create a list of the most frequently used terms, add the most complex concepts to it, and make sure the mapping is one-to-one: one concept, one term. Any ambiguity is a headache for translators.

Once the list of terms has been created in the source language, it must be translated into all target languages and verified by local specialists (this is a very important point). This step, including the verification, must never be skipped. While it is under way, you can work on the product strings themselves.
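
The "one concept, one term" rule can be checked mechanically before the glossary goes to translators. The sketch below is one possible way to do it, assuming the glossary is available as (concept, term) pairs; the concept identifiers and the function name are hypothetical.

```python
from collections import defaultdict


def find_ambiguities(entries):
    """Return concepts with several terms and terms shared by several concepts."""
    terms_by_concept = defaultdict(set)
    concepts_by_term = defaultdict(set)
    for concept, term in entries:
        key = term.strip().lower()  # compare terms case-insensitively
        terms_by_concept[concept].add(key)
        concepts_by_term[key].add(concept)
    synonyms = {c: sorted(t) for c, t in terms_by_concept.items() if len(t) > 1}
    homonyms = {t: sorted(c) for t, c in concepts_by_term.items() if len(c) > 1}
    return synonyms, homonyms


# Hypothetical glossary: concept identifier, term used in the interface.
glossary = [
    ("stock_on_hand", "On Hand Quantity"),
    ("stock_on_hand", "Qty on Hand"),
    ("period_start", "Start Date"),
    ("invoice_due", "Due Date"),
    ("payment_due", "Due Date"),
]

synonyms, homonyms = find_ambiguities(glossary)
print(synonyms)  # one concept named in several ways
print(homonyms)  # one term covering several concepts
```

Both dictionaries should be empty before the glossary is sent for translation.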

Product strings

There are many ways to create application content. The specific implementation depends on the development process, on how the strings are stored, and on other factors.

A common situation: the designer, in discussion with the client (or with product marketing), creates the initial product specification.

Next, the programmer writes the code, sometimes copying the names of the interface elements and sometimes introducing something of his own, including errors.

Then the tester checks how the program works, finds functional errors, and makes recommendations; as a result, interface elements and messages appear that were not in the original design specification. There is usually no time or resources to synchronize the original specification with the new version of the code.

The whole cycle can repeat several times. The problem is worse in large international companies, where the main language of the application and of development, which is also the source language for localization, is not native to one or more participants in the chain. Moreover, the participants may all speak different languages: the product is created in English, the designer is Russian, the programmer is Chinese, and the tester is Mexican.

Add two more problems:


The result of such a process can be a set of strings with incomprehensible terminology and a certain number of errors. Sometimes the company has a terminology specialist and a technical writing department that check the content being created. Such a check is best done at the specification stage, before the strings get into the code; otherwise programmers will have to redo their work. Sometimes technical writers only document the finished application, and the quality of the strings is left to the developers' conscience.

A well-developed style guide (example) plays a large role. And, by the way, having a technical writing department does not guarantee having a style guide.

Consider the case where the strings do undergo some kind of review.


Aspects of source content verification


There can be many errors. For example, within one functional area responsible for specific functionality (especially if it is specific to a country whose language is not the development language), terms may be used that mean something completely different in another module. Or vice versa: different entities in different functional parts of the application may have the same name. This confuses customers, developers, and the technical writers who will eventually describe the system.

Real examples of non-standardized content:
• Qty. On hand
• OnHand Qty
• On-hand Quantity
• On Hand Qty.
• Qty on Hand
• Quantity On Hand
• On Hand Qty
• On Hand Quantity
• Qty On Hand

• Start Date cannot be greater than the End Date.
• Due Date cannot be before Start Date.
• End Date Cannot Be Before The Start Date.
• Start date may not be greater than the End Date.
• From Date cannot be greater than To Date.
• The Star date cannot be greater than the End date.
• can not be greater than the end date.
• From Date cannont be greater than To Date.
• From Date cannot be later than To Date.
• Start date cannot be greater then End date.
• The Start date cannot be greater than the End date.
• Begin Date may not be greater than the End Date.

• Invoice not found.
• Invoice cannot be found.
• Unable to find invoice.
• Invoice was not found.

The examples in each group are identical in meaning, but they were created by different programmers in different departments. Instead of one string used everywhere, we have 10 variants, and you have to pay for the translation of each.
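
Variants like the "On Hand" labels above can be surfaced automatically before translation. The sketch below is a crude heuristic, not a description of any particular tool: it normalizes case, punctuation, and one abbreviation, ignores word order, and groups strings that end up with the same key. The abbreviation table and function names are assumptions made for the example.

```python
import re
from collections import defaultdict

# Hypothetical abbreviation table; a real project would take it from the glossary.
ABBREVIATIONS = {"qty": "quantity"}


def normalize(label: str) -> tuple[str, ...]:
    """Reduce a label to a key that ignores case, punctuation, and word order."""
    spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", label)  # "OnHand" -> "On Hand"
    words = re.findall(r"[a-z]+", spaced.lower())
    return tuple(sorted(ABBREVIATIONS.get(w, w) for w in words))


def group_variants(labels):
    """Group labels that share the same normalized key."""
    groups = defaultdict(list)
    for label in labels:
        groups[normalize(label)].append(label)
    return dict(groups)


labels = [
    "Qty. On hand", "OnHand Qty", "On-hand Quantity", "On Hand Qty.",
    "Qty on Hand", "Quantity On Hand", "On Hand Qty", "On Hand Quantity",
    "Qty On Hand",
]

groups = group_variants(labels)
for key, variants in groups.items():
    print(len(variants), "variants for", " ".join(key))
```

Because word order is ignored, this approach suits short labels only; full sentences such as the date messages would need a stricter comparison and a human decision on the canonical wording.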

Controlled Language System

One way to simplify content is to implement a Controlled Language (CL) system. Its main idea is to unify the language and terminology of the application, bring it in line with the company's style guide, and automate verification. It relies on a defined set of constraints: for example, a limited vocabulary, a limited set of grammatical structures, and restrictions on sentence length.
The system provides the following benefits:


Controlled Language can be deployed in-house or handed to contractors, for example simply as a translation from English into Simplified English.
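
A minimal sketch of what automated verification might look like is given below. It is not how any specific CL product works; the sentence-length limit, the table of discouraged wordings, and the function name are all assumptions chosen for illustration.

```python
import re

MAX_WORDS = 12  # hypothetical sentence-length limit from a style guide

# Hypothetical style-guide rules: discouraged wording -> preferred wording.
PREFERRED = {"qty": "quantity", "can not": "cannot", "may not": "cannot"}


def check_string(text: str) -> list[str]:
    """Return a list of style-guide violations found in one interface string."""
    issues = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        words = sentence.split()
        if len(words) > MAX_WORDS:
            issues.append(f"sentence too long ({len(words)} words, limit {MAX_WORDS})")
    lowered = text.lower()
    for variant, preferred in PREFERRED.items():
        if re.search(rf"\b{re.escape(variant)}\b", lowered):
            issues.append(f"use '{preferred}' instead of '{variant}'")
    return issues


print(check_string("Start date may not be greater than the End Date."))
print(check_string("Start Date cannot be later than End Date."))
```

A check like this could run when a string is committed, so that violations are caught before the text reaches translators.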

Summary




Thus, the content of the application (the software strings) must undergo thorough control and be verified against a specific set of rules, both those common to the industry and those particular to the company. Content should be understandable to as many people as possible, including those who do not have in-depth knowledge of the product.

Translation and quality control deserve a separate article, since they raise a whole range of questions: from whether to trust automatic checks or always involve local users, to choosing a translation management system and maintaining a pool of translators.

Posted by fridge.

From the life of blog editors
Just before publication, we received the following review from one of the authors:
The programmer said that there could have been a few more pictures, at least one of them a nude, but I suppose that is not an option :).
Overall, there are no major comments.

We cannot refuse a programmer, but nudity would be too much. So here you are:

Source: https://habr.com/ru/post/168653/

