
A key feature of online services is that users from almost all over the world who speak different languages have access to them. If you are developing such a service and you want people from different countries to use it, then you need to translate and adapt it, in other words, localize it.
The idea for this article came after the MoscowJS meetup, where I spoke about how the localization process works at Badoo. Here I would like to go into more detail: the specifics of localization using web applications as an example, the existing localization solutions, and why Badoo went its own way. If you are interested, read on.
Why and what for?
If you focus only on the Russian market, you will most likely never need localization, or at least not until your product becomes interesting to an international audience. At that point you will have to adapt your application to the new reality very quickly, and that is not easy. So the first question to answer right away is: do you really need localization? If the answer is yes, your service has to be prepared for the peculiarities of different languages. The techniques and tools that make this possible are collectively called internationalization: the process of building an application so that it can work in different languages and with different regional conventions without any additional changes.

The next question is: why is localization important? It matters first of all to users and customers, because each of them should feel comfortable using your application. For example, people generally prefer to make purchases in their native language even if they know English well, and most of them also prefer to contact support in their native language. Take Europe, with its roughly 50 countries: almost every one of them has its own conventions for displaying dates, numbers, and currencies. Expand the audience to the whole world and you reach countries such as China, Iran, Afghanistan, or Saudi Arabia, where text may be written from right to left or from top to bottom, and where numbers may be written with Arabic-Indic or Persian numerals.
Language features
Which language features deserve attention first if you decide to implement localization? Start with displaying dates and times in the format users are accustomed to. The following table shows how the date format depends on the country. As you can see, the format differs from country to country.
Format | Date example | Country |
---|---|---|
yyyy.MM.dd | 2016.09.22 | Hungary |
yyyy-MM-dd | 2016-09-22 | Poland, Sweden, Lithuania, Canada |
yyyy/MM/dd | 2016/09/22 | Iran, Japan |
dd.MM.yyyy | 22.09.2016 | Russia, Slovenia, Turkey, Ukraine |
M/d/yyyy | 9/22/2016 | USA |
The time format also depends on the country. For example, the USA, Canada, Australia, and New Zealand use the 12-hour format (the English system), while most of the rest of the world uses the 24-hour format (the French system).
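To see these differences in practice, here is a minimal sketch that formats one fixed moment for several locales with the built-in Intl.DateTimeFormat, which is discussed later in the article. The helper names formatDate and formatTime are made up for this illustration, and the exact output depends on the locale data shipped with the browser or Node.js runtime.

```javascript
// A fixed moment: 22 September 2016, 15:30 UTC.
// The time zone is pinned to UTC so the result does not depend on the machine.
const sampleDate = new Date(Date.UTC(2016, 8, 22, 15, 30));

// Hypothetical helper: the short numeric date for a locale.
function formatDate(locale) {
  return new Intl.DateTimeFormat(locale, { timeZone: 'UTC' }).format(sampleDate);
}

// Hypothetical helper: hours and minutes; the locale decides between
// the 12-hour and the 24-hour clock.
function formatTime(locale) {
  return new Intl.DateTimeFormat(locale, {
    timeZone: 'UTC',
    hour: 'numeric',
    minute: '2-digit',
  }).format(sampleDate);
}

['en-US', 'ru-RU', 'ja-JP', 'hu-HU'].forEach(function (locale) {
  console.log(locale, formatDate(locale), formatTime(locale));
});
// For en-US the date is printed as 9/22/2016.
```

The point of the sketch is that the application code never builds the date string by hand; it only names the locale.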
The next feature is the format of numbers and currencies. As the table below shows, the thousands separator and the decimal separator can be a comma, a period, or a space. The position of the currency sign can differ not only between languages but also between countries. For example, Germany and Austria speak the same language, yet they format money differently.
Example | Locale | Country |
---|---|---|
123 456,79 € | ru-RU | Russia |
€123,456.79 | en-US | USA |
123.456,79 € | de-DE | Germany |
€ 123.456,79 | de-AT | Austria |
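As a rough illustration, the sketch below prints how the current runtime formats the same euro amount for these four locales using Intl.NumberFormat. The helper name formatEuro and the sample amount are assumptions made for this example, and the exact separators and spacing come from the runtime's locale data, so they may vary slightly between browsers.

```javascript
// Sample amount, chosen for illustration only.
const amount = 123456.789;

// Hypothetical helper: format the amount as euros for a given locale.
function formatEuro(locale) {
  return new Intl.NumberFormat(locale, {
    style: 'currency',
    currency: 'EUR',
  }).format(amount);
}

['ru-RU', 'en-US', 'de-DE', 'de-AT'].forEach(function (locale) {
  console.log(locale, formatEuro(locale));
});
// For en-US this prints €123,456.79.
```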
Of particular interest is the traditional way of writing numbers in China. In Chinese, numbers are split into groups differently than, for example, in Russian. We are used to grouping large numbers by thousands, while the Chinese group them by tens of thousands. For example, the number 150,000,000 is written as 1亿5000万. In addition, many Chinese people take "numerical superstitions" and numerology seriously. For example, the number 4 sounds similar to the word for "death", so people try to avoid it. Many hotels have no rooms containing the number 4, and sometimes not even a 4th floor. The same applies to bank accounts, for example: many people dream of having an account number that contains an eight, a symbol of wealth and prosperity.
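The grouping by tens of thousands can be sketched in a few lines. The function name groupByMyriads is made up for this example, and the sketch is deliberately simplified: it assumes a non-negative integer below 10^12 and ignores the finer rules of written Chinese, such as inserting 零 for skipped positions.

```javascript
// Simplified sketch: split a non-negative integer into 亿 (10^8) and 万 (10^4) groups.
function groupByMyriads(value) {
  const yi = Math.floor(value / 1e8);          // hundreds of millions
  const wan = Math.floor((value % 1e8) / 1e4); // tens of thousands
  const rest = value % 1e4;                    // the remaining units

  let result = '';
  if (yi > 0) result += yi + '亿';
  if (wan > 0) result += wan + '万';
  // Print the remainder if it is non-zero, or if the whole number is zero.
  if (rest > 0 || result === '') result += rest;
  return result;
}

console.log(groupByMyriads(150000000)); // 1亿5000万
console.log(groupByMyriads(12345));     // 1万2345
```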
Another potential problem is the imperial system of measures used in the United States, Myanmar, and Liberia. To see why it matters, recall the Mars Climate Orbiter, which reached Mars and was lost there because the spacecraft's systems worked with force in newtons while the software on Earth used pound-force. The mismatch went unnoticed for the entire flight, and the mistake ended up costing 125 million dollars. So do not forget to display values in the units your users are accustomed to.
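One possible way to keep units in one place is to store values in a single system and convert only when displaying them. The sketch below assumes distances are stored in kilometers and that the region is the second part of a locale tag such as en-US; the function name, the region list, and the output format are all illustrative choices, not part of any library.

```javascript
// Regions assumed to prefer miles (the three countries named above).
const IMPERIAL_REGIONS = ['US', 'MM', 'LR'];
const KM_TO_MILES = 0.621371;

// Hypothetical helper: show a distance stored in kilometers
// in the unit that is usual for the locale's region.
function formatDistance(kilometers, locale) {
  const region = locale.split('-')[1]; // undefined for tags without a region
  if (IMPERIAL_REGIONS.indexOf(region) !== -1) {
    return (kilometers * KM_TO_MILES).toFixed(1) + ' mi';
  }
  return kilometers.toFixed(1) + ' km';
}

console.log(formatDistance(100, 'en-US')); // 62.1 mi
console.log(formatDistance(100, 'de-DE')); // 100.0 km
```

Keeping a single internal unit means a conversion can only go wrong at the display boundary, where it is easy to test.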
Once you have dealt with date and number formats, you can move on to the specifics of translation. The most obvious problem is the declension of nouns after numerals. As we know, Russian has three plural forms, while English has only two. There are also languages with as many as six such forms. A table of the forms for each language is available at this link.
Russian | Form | English |
---|---|---|
У вас 1 подарок | Singular | You have 1 gift |
У вас 5 подарков | Plural | You have 5 gifts |
У вас 2 подарка | Few | |
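The difference between the two languages can be sketched as a pair of small functions. The rule below follows the usual Russian convention for integers (numbers ending in 1 except 11, numbers ending in 2 to 4 except 12 to 14, and everything else); the function names and the category labels one, few, and many are choices made for this example, and the Russian forms of "gift" (подарок, подарка, подарков) are supplied for illustration.

```javascript
// Pick the Russian plural category for a non-negative integer.
function pluralCategoryRu(n) {
  const mod10 = n % 10;
  const mod100 = n % 100;
  if (mod10 === 1 && mod100 !== 11) return 'one';
  if (mod10 >= 2 && mod10 <= 4 && (mod100 < 12 || mod100 > 14)) return 'few';
  return 'many';
}

// Forms of the Russian word for "gift", one per category.
const giftFormsRu = { one: 'подарок', few: 'подарка', many: 'подарков' };

function giftsMessageRu(n) {
  return 'У вас ' + n + ' ' + giftFormsRu[pluralCategoryRu(n)];
}

// English needs only two forms.
function giftsMessageEn(n) {
  return 'You have ' + n + (n === 1 ? ' gift' : ' gifts');
}

[1, 2, 5, 11, 21].forEach(function (n) {
  console.log(giftsMessageRu(n), '|', giftsMessageEn(n));
});
```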
Next comes a whole group of issues: the specifics of translation itself. There are quite a few things to take into account here.
1. Translate entire phrases and sentences. They cannot be split into parts, because word order differs between languages.
For example, take the following phrase:
8,283 out of 15,311 people liked you!
In English, the template looks like this:
<b>{{num_voters_yes_maybe}}</b> out of <b>{{num_voters_total}}</b> {{people}} liked you!
In Japanese, the same sentence looks different:
<b>{{num_voters_total}}</b>{{people}}<b>中{{num_voters_yes_maybe}}</b>人があなたを気に入っています!
As this example shows, the word order in Japanese is reversed. That is why you cannot simply concatenate fragments, as many developers do:
' ' + pageNum + ' ' + total
2. Some translations depend on the person's gender.
As the example below shows, English can use one phrase for both men and women, while Slovak needs a different phrase for each gender.
English: You got an award on <span>{{award_date}}</span>
Slovak (male): Toto ocenenie si získal <span>{{award_date}}</span>
Slovak (female): Toto ocenenie si získala <span>{{award_date}}</span>
3. String translation should depend on context. The translator must know the meaning of the whole sentence, phrase, or paragraph; otherwise they may misunderstand it and translate it incorrectly. For example, a sentence like "You can save this {{item}}" can be translated in different ways, because "save" may mean either storing the item or saving money on it. Ideally, the translator should see not only the set of strings to translate but also an image of the area where each string appears.
4. Reusing translation resources can be unsafe. For example, "Save" (a file) and "Save" (settings) may be different words in some languages. Likewise, a word like "thread" may mean a thread of execution in one place and a sewing thread in another.
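The points above can be put together in a minimal sketch: whole phrases with named placeholders, gender variants stored per phrase, and separate keys for identical words used in different contexts. The resource layout, the key names, and the translate function are hypothetical and are not Badoo's actual format; only the template strings are taken from the examples above.

```javascript
// Hypothetical resource layout: locale -> key -> template or gender variants.
const resources = {
  en: {
    'votes.liked':
      '<b>{{num_voters_yes_maybe}}</b> out of <b>{{num_voters_total}}</b> {{people}} liked you!',
    'award.received': 'You got an award on <span>{{award_date}}</span>',
    // The same English word gets two keys, so each context can be
    // translated differently in other languages.
    'file.save': 'Save',
    'settings.save': 'Save',
  },
  sk: {
    'award.received': {
      male: 'Toto ocenenie si získal <span>{{award_date}}</span>',
      female: 'Toto ocenenie si získala <span>{{award_date}}</span>',
    },
  },
};

// Look up a whole phrase and substitute named placeholders.
// Unknown placeholders are left untouched so that mistakes stay visible.
function translate(locale, key, params, gender) {
  let entry = resources[locale][key];
  if (typeof entry === 'object') {
    entry = entry[gender]; // pick the gender-specific variant
  }
  return entry.replace(/\{\{(\w+)\}\}/g, function (match, name) {
    return Object.prototype.hasOwnProperty.call(params, name)
      ? String(params[name])
      : match;
  });
}

console.log(translate('en', 'votes.liked', {
  num_voters_yes_maybe: '8,283',
  num_voters_total: '15,311',
  people: 'people',
}));
console.log(translate('sk', 'award.received', { award_date: '22.09.2016' }, 'female'));
```

Because placeholders are named rather than positional, a translator is free to reorder them, which is exactly what the Japanese template requires.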
These are probably the most common issues that come up when localizing web applications. But they are not the only things that affect localization. It also includes design (for example, Japanese and Chinese require a larger font, and in some languages text can be twice as long as in English); the color palette (red and green mean opposite things in different cultures; for example, a red check mark in Japan means that you did something wrong); the images used (yes, in Asia it is better to show Asians, and in Europe, Europeans); and many other aspects specific to a particular country and culture. All of this is beyond the scope of this article, but it is worth remembering.
Now let's look around the Internet and see what tools exist for localization on the client.
Localization methods on the client
As interfaces grew and complex business logic moved to the client, developers had to solve many localization problems there. For a long time the internationalization capabilities of ECMAScript were rather poor, so libraries such as Closure, Globalize, YUI, and Moment.js appeared, along with in-house solutions from individual developers. All of them extended ECMAScript and filled the gaps in internationalization, but they had different programming interfaces and certain limitations, for example in string comparison. So, in December 2012, the ECMA-402 standard appeared, which was supposed to make life easier for front-end developers internationalizing their applications. But is that really the case? Let's see what the standard offers today.
ECMAScript Internationalization API
This standard describes an ECMAScript programming interface for adapting to the linguistic and cultural conventions of a language or country. Everything goes through the Intl object, which provides number formatting (Intl.NumberFormat), date formatting (Intl.DateTimeFormat), and string comparison (Intl.Collator). It is currently supported by all modern browsers. Safari was the last browser to add support; for outdated browsers you can use a polyfill.
A big advantage of this standard is that it was developed with the support of Google, Microsoft, Mozilla, and Amazon, and, as we are promised, it will keep evolving. Planned additions include string formatting that takes plural forms and gender into account, number parsing, and much more. It is a pity that all of this is happening rather slowly. For example, the standard itself was approved back in December 2012, but the most popular browsers only implemented support by 2016. For now, the functionality of the Intl object is rather limited and offers nothing for translations. So you have to use third-party solutions or a polyfill for a format that has not been approved yet.
Pros:
- native implementation in the browser;
- high performance;
- does not require additional resources;
- formatting strings with different locales without loading JavaScript resources;
- ongoing development (ECMAScript 2017 Internationalization API).
Cons:
- outdated browsers need to download a polyfill;
- system dependency: some locales may not be supported by the client;
- results may differ between browsers.
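Since both the API itself and individual locales may be missing on a given client, it can help to check before formatting. The sketch below uses the standard supportedLocalesOf method; the function name pickSupportedLocale and the fallback policy are assumptions made for this example.

```javascript
// Return the first requested locale the runtime can format numbers for,
// the fallback if none is supported, or null if Intl itself is missing
// (in which case a polyfill would have to be loaded first).
function pickSupportedLocale(requested, fallback) {
  if (typeof Intl !== 'object' || typeof Intl.NumberFormat !== 'function') {
    return null;
  }
  // supportedLocalesOf keeps only the tags this runtime has data for.
  // It expects well-formed language tags and throws on malformed ones.
  const supported = Intl.NumberFormat.supportedLocalesOf(requested);
  return supported.length > 0 ? supported[0] : fallback;
}

console.log(pickSupportedLocale(['ru-RU', 'en-US'], 'en'));
```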
ECMAScript Internationalization API examples
// Format 1234567.93 as an amount in pounds sterling using Russian conventions.
var mFormat = new Intl.NumberFormat("ru", { style: "currency", currency: "GBP" }).format(1234567.93);
console.log(mFormat);
A few more examples can be found here.
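Besides number formatting, the Intl object also covers string comparison. As a small sketch, the snippet below sorts the same three strings for two locales with Intl.Collator; the helper name sortFor is made up for this example, and the Swedish result assumes the runtime ships Swedish collation data.

```javascript
const words = ['z', 'ä', 'a'];

// Hypothetical helper: sort a copy of the list with the locale's collation rules.
function sortFor(locale) {
  return words.slice().sort(new Intl.Collator(locale).compare);
}

// German treats ä as a variant of a, so it sorts before z.
console.log(sortFor('de').join(' '));
// Swedish treats ä as a separate letter that comes after z.
console.log(sortFor('sv').join(' '));
```

A plain Array.prototype.sort without a comparator would order the strings by code units and place ä last for every language.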
As you can see, the standard leaves a lot unimplemented and does not cover everything web application developers run into. So, as I wrote above, many have to either look for ready-made solutions or develop their own. There are plenty of solutions at the moment, each with its own advantages and disadvantages. If you turn to Google, the first results will include i18next, FormatJS, Globalize, jQuery.i18n, and others. Some of these libraries offer their own approaches, while others try to follow the ECMA-402 standard. Let's take the two libraries that come first in the Google results and see what they can do.
i18next
According to its developer, this is a very popular internationalization library for both the client and the server (node.js). It has many plugins and utilities, and it integrates with various frameworks. It also offers an interface for translators where you can upload translation files, but unfortunately that interface is a paid service. A lot really is implemented in it, and the library keeps evolving, which is of course good news. But it does not follow the ECMA-402 specification and uses its own message format rather than ICU Message syntax. In addition, formatting numbers and dates requires loading moment.js or numeral.js. So you will have to add these libraries to the project, along with their locales for the languages you need.
Pros:
- support for many features of the language;
- the ability to download resources from the backend;
- additional plugins and various utilities;
- extensions for popular frameworks, template engines.
Cons:
- requires loading resources (i18next 35 KB + moment 20 KB + the necessary locales);
- does not follow the ECMA-402 standard;
- paid interface for translators.
More information on working with the library, along with more examples, can be found on the official website.
FormatJS
FormatJS is a modular collection of JavaScript libraries for internationalization. It is based on the ECMA-402, ICU, and CLDR standards and integrates with many frameworks and template engines, such as Dust, Ember, and Handlebars. Depending on the environment, the library either loads a polyfill for internationalization or uses the browser's built-in capabilities. It also works on both the client and the server.
Pros:
- modularity;
- uses the capabilities of ECMA-402 or polyfill;
- extensions for popular frameworks, template engines.
Cons:
- requires loading resources when necessary;
- does not cover every translation case.
For example, a text for translation in ICU format looks like this:
{ gender, select,
  female {{ count, plural, =0 { } one { # } few { # } other { # } }}
  other {{ count, plural, =0 { } one { # } few { # } other { # } }}
}
You can check how it works at this link: use the example above and enter the "ru" locale. At first glance the format is rather complicated, but it lets you take many language features into account. The only catch is that I have not yet come across a convenient system for translators that works with such a format.
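To make the nesting of such a message easier to follow, here is a simplified sketch of how it is resolved: first the select branch is chosen by gender, then the plural branch by count, and finally # is replaced with the number. This is not the FormatJS API and not a parser for ICU syntax; the object layout, the function names, and the English message texts are all hypothetical, and English categories are used only to keep the example short. A few branch would be resolved in the same way.

```javascript
// Hypothetical message: gender branches, each with plural branches.
const giftsMessage = {
  female: { '=0': 'She has no gifts', one: 'She has # gift', other: 'She has # gifts' },
  other: { '=0': 'They have no gifts', one: 'They have # gift', other: 'They have # gifts' },
};

// Resolve a nested select + plural message.
// categoryOf maps a number to a plural category name for the target language.
function formatMessage(message, gender, count, categoryOf) {
  // "select": use the branch for this gender, or fall back to "other".
  const branch = message[gender] || message.other;
  // "plural": an exact match such as "=0" wins over the category.
  let text = branch['=' + count];
  if (text === undefined) text = branch[categoryOf(count)];
  if (text === undefined) text = branch.other;
  return text.replace(/#/g, String(count));
}

// Plural categories for English integers.
function englishCategory(n) {
  return n === 1 ? 'one' : 'other';
}

console.log(formatMessage(giftsMessage, 'female', 0, englishCategory));
console.log(formatMessage(giftsMessage, 'female', 1, englishCategory));
console.log(formatMessage(giftsMessage, 'male', 5, englishCategory));
```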
As you can see, there are plenty of solutions; you just need to choose. But the localization process does not end with choosing a localization system and coping with the peculiarities of languages. Any translation system has to be tightly integrated into your development process, provide a unified infrastructure for both developers and translators, and answer several important questions, for example:
- What will the translation process look like?
- How will translation files get to the translators and back into the system?
- How can you tell the translator where a specific text is located?
Only when you have answers to these questions can you say that you have a well-integrated localization system that is convenient to work with.

We had to deal with all these questions and language features when Badoo entered the international market. Back in those distant days, even if similar localization systems existed, they did not meet all our requirements, so of course we had to develop our own localization system (we have already written about it on Habr and talked about the specifics of laying out multilingual applications). This system had to be well integrated into our overall process, be transparent, and not slow development down (for example, we release twice a day, and it is very important for us that new product ideas reach production quickly). In addition, it had to work not only with the web but with all our other platforms, such as iOS, Android, and Windows Phone, and also be usable for email campaigns.
With the arrival of a common communication format between our clients and the server (hereinafter, the protocol), or "apification" as we call it, many texts started coming from the server. This approach turned out to be convenient for us: there is no need to store large volumes of translations on the client, and we can, for example, run A/B tests of lexemes or create lexemes that depend on the user's actions. Each client can also keep the translations it needs locally. The decision on where to store translations, on the client or on the server, is made by the team responsible for developing the protocol. If some translations are updated, each client can request the new ones (translations are updated frequently, while new releases reach the app stores only at certain intervals). We call this mechanism Hot Lexem update.
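The article does not describe how Hot Lexem update is implemented, so the following is only a sketch of one way such a mechanism could work on the client: the client keeps a versioned cache of lexemes and merges in a newer set when the server sends one. The payload shape, the version field, the keys, and the function name are all hypothetical.

```javascript
// Merge a server update into the client's cached translations.
// Returns the same cache object if the update is not newer.
function applyLexemeUpdate(cache, update) {
  if (update.version <= cache.version) {
    return cache; // nothing new to apply
  }
  // Updated lexemes override cached ones; untouched lexemes are kept.
  const lexemes = Object.assign({}, cache.lexemes, update.lexemes);
  return { version: update.version, lexemes: lexemes };
}

// Hypothetical cache bundled with the app release.
const bundledCache = {
  version: 41,
  lexemes: { 'profile.title': 'Profile', 'chat.send': 'Send' },
};

// Hypothetical update received from the server between releases.
const serverUpdate = {
  version: 42,
  lexemes: { 'chat.send': 'Send message' },
};

const updatedCache = applyLexemeUpdate(bundledCache, serverUpdate);
console.log(updatedCache.version, updatedCache.lexemes['chat.send']);
```

Sending only the changed lexemes keeps the update small, which matters when translations change far more often than the application is released.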

As the figure above shows, the localization process at Badoo involves not only client developers and translators but many other teams as well. For example, the MAPI team, as I wrote above, designs the protocol and decides where translations will be stored. The BackOffice team provides a convenient interface for translators; the translators, of course, do the translating; and the SRV (server developers) or Frontend (client developers) teams generate and display the required translation. In addition, having built such a system, we were able to create on top of it a collaborative translation system (https://translate.badoo.com/) in which our users can take part. They help us a great deal in making translations that reflect the local specifics of each country.
Conclusion
It is clear that localizing any application is serious and painstaking work, because it involves many project teams, not just developers and translators. To close, I want to draw your attention once more to what I consider the main points in localizing applications:
- Localization is quite hard to bolt on afterwards. If you need it, build it into the project from the very beginning.
- Localization resources must be independent of the application.
- Localization is not limited to strings, and this must be taken into account at design time.
- Make your system convenient not only for developers but also for translators, and automate the translation process.
- If you are not sure about the quality of a translation, do not translate the text at all.
- Try to take the cultural specifics of each country and language into account.
- Designs, layouts, colors, and images should also be subject to localization.
That is probably all I have on this subject. I hope you have learned some new subtleties of the localization process. If you have interesting experience of your own, share it in the comments. Make web great again!
Vyacheslav Volkov, frontend developer, Badoo