Hi, Habr! Here is a translation of “Localization Technologies at Netflix”, an article by the Netflix team about their internal localization processes and the tools they built to support them.

The localization program at Netflix rests on three principles: impeccable linguistics, a harmonious team atmosphere, and advanced technology.
We are not afraid to experiment with new processes and tools or to challenge the accepted norms of localization, and that is how we have come this far. Working at Netflix means being a pioneer.
In this article, we cover two technologies that are helping us on the way to world domination... More under the cut.
Netflix Global String Repository
High-quality content alone did not make Netflix successful; how we present that content matters too. A large part of the success is an intuitive, easy-to-use, localized user interface (UI). Netflix is available on many platforms: the web, Apple iOS, Google Android, Sony PlayStation, Microsoft Xbox, Sony and Panasonic TVs, and so on. Each of these platforms has its own internationalization requirements, which is a serious challenge for our team.
Here are the cases in which UI localization is required:
- adding a new language
- adding new features
- changing existing text and data
Translating UI text is not an automated process. During translation, localization managers work with the development team to understand exactly what a particular string refers to, which languages it must be translated into, and when the localized files are due. Everything becomes much more complicated when several features are developed in parallel and maintained in different Git branches.
After the translation is completed, the application is built, tested, and published to the platform. Some platforms require approval from a third party (for example, from Apple). All of this causes unwanted delays, and emergency changes are especially painful.
But what if the localization process were open to all stakeholders, both the development team and the localizers? What if we no longer had to produce a new build every time the text is edited?
To solve these problems, we developed a global repository for UI strings called the Global String Repository. It stores localized strings, which applications pull in at runtime. We also integrated the Global String Repository with our localization process, so the two complement each other.
The Global String Repository isolates work through localization packages (bundles) and namespaces. A bundle stores string data for all languages. A namespace is a placeholder for the bundles a team is working on. During development, a default namespace is used. The workflow looks like this:
- A developer changes the English version of a string in a bundle within a namespace.
- The translation process starts automatically.
- Linguists complete the translation.
- The localized data becomes available in the bundle within the namespace.
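The four steps above can be sketched as a small in-memory model. This is only an illustration under stated assumptions: `StringRepository`, `Bundle`, and all method names are hypothetical and are not the real Global String Repository API; a "bundle" here stands for a localization package.

```python
from dataclasses import dataclass, field


@dataclass
class Bundle:
    """A localization package: string data for every language."""
    name: str
    strings: dict = field(default_factory=dict)  # locale -> {string key -> text}
    pending: set = field(default_factory=set)    # (locale, key) pairs awaiting translation


class StringRepository:
    """Hypothetical in-memory stand-in for the repository (not the real API)."""

    def __init__(self, target_locales):
        self.target_locales = list(target_locales)
        self.namespaces = {}  # namespace name -> {bundle name -> Bundle}

    def _bundle(self, namespace, bundle):
        bundles = self.namespaces.setdefault(namespace, {})
        return bundles.setdefault(bundle, Bundle(bundle))

    def update_source(self, namespace, bundle, key, english_text):
        """Step 1: a developer edits the English string.
        Step 2: a translation task is queued automatically for every target locale."""
        b = self._bundle(namespace, bundle)
        b.strings.setdefault("en", {})[key] = english_text
        for locale in self.target_locales:
            b.pending.add((locale, key))

    def submit_translation(self, namespace, bundle, locale, key, text):
        """Step 3: a linguist delivers a translation."""
        b = self._bundle(namespace, bundle)
        b.strings.setdefault(locale, {})[key] = text
        b.pending.discard((locale, key))

    def is_ready(self, namespace, bundle):
        """Step 4: the localized bundle is available once nothing is pending."""
        return not self._bundle(namespace, bundle).pending


repo = StringRepository(target_locales=["de", "fr"])
repo.update_source("feature-signup", "signup", "cta", "Start your free month")
repo.submit_translation("feature-signup", "signup", "de", "cta", "Gratismonat starten")
ready_after_one = repo.is_ready("feature-signup", "signup")  # French is still pending
repo.submit_translation("feature-signup", "signup", "fr", "cta", "Commencez votre mois gratuit")
ready_after_all = repo.is_ready("feature-signup", "signup")
```

Keeping each feature's bundles in its own namespace is what lets several features be translated in parallel without their string changes colliding.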
An application can integrate with the Global String Repository in two ways:
- At runtime: changes reach the UI quickly
- At build time: the Global String Repository is used only for localization, and the string data is packaged with the build
The Global String Repository supports build-time integration by providing access to the localized data through a REST API.
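For build-time integration, a build step might fetch the localized strings and write them into the platform's resource format. The sketch below assumes the strings have already been retrieved as a plain dictionary and writes them in the Android `strings.xml` layout; the function name `to_android_xml` is hypothetical, and Android-specific escaping (for example, of apostrophes) is deliberately left out.

```python
import xml.etree.ElementTree as ET


def to_android_xml(strings):
    """Serialize {key: text} into an Android-style <resources> document."""
    resources = ET.Element("resources")
    for key in sorted(strings):  # sorted keys keep the output stable between builds
        ET.SubElement(resources, "string", name=key).text = strings[key]
    return ET.tostring(resources, encoding="unicode")


# In a real build the dictionary would come from the REST API; here it is hard-coded.
xml_text = to_android_xml({"title": "Sign up", "cta": "Start now"})
```

With this approach the application ships with its strings and does not depend on the repository at runtime, at the cost of needing a new build for every text change.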
We expose the Global String Repository through the Netflix API, so it must meet the same scaling and other requirements as the other metadata APIs. This is critical for applications that integrate at runtime. With 60 million users running Netflix on a wide range of devices, the Global String Repository is a top priority.
Like the rest of Netflix, the Global String Repository has a microservice architecture. The service is a Java web application, backed by Apache Cassandra and Elasticsearch, that is hosted in three AWS regions. We collect statistics for every API request.
The Global String Repository UI is built with Node.js, Bootstrap, and Backbone and is also hosted in AWS.
On the client side, the Global String Repository exposes a REST API for retrieving data and also offers a Java client with built-in caching.
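The idea of a client with built-in caching can be illustrated with a minimal time-to-live cache. This is a sketch, not the actual Java client: `CachingStringClient`, its parameters, and the `fake_fetch` stand-in for the REST call are all assumptions made for the example.

```python
import time


class CachingStringClient:
    """Hypothetical client that caches each (bundle, locale) lookup for ttl_seconds."""

    def __init__(self, fetch, ttl_seconds=300.0, clock=time.monotonic):
        self._fetch = fetch   # callable(bundle, locale) -> dict of strings
        self._ttl = ttl_seconds
        self._clock = clock   # injectable so the expiry logic can be tested
        self._cache = {}      # (bundle, locale) -> (expires_at, strings)

    def get_strings(self, bundle, locale):
        key = (bundle, locale)
        now = self._clock()
        entry = self._cache.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]   # still fresh: no remote call
        strings = self._fetch(bundle, locale)
        self._cache[key] = (now + self._ttl, strings)
        return strings


calls = []


def fake_fetch(bundle, locale):
    """Stand-in for the REST request; a real client would call the service here."""
    calls.append((bundle, locale))
    return {"cta": "Jetzt starten"} if locale == "de" else {"cta": "Start now"}


fake_now = [0.0]
client = CachingStringClient(fake_fetch, ttl_seconds=60, clock=lambda: fake_now[0])

first = client.get_strings("signup", "de")   # goes to the service
second = client.get_strings("signup", "de")  # served from the cache
fake_now[0] = 61.0                           # the cached entry has now expired
third = client.get_strings("signup", "de")   # goes to the service again
```

A short time-to-live is a trade-off: text changes reach the UI within that window without a rebuild, while most lookups never leave the process.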
Although we have come a long way and are actively developing the Global String Repository, there is still room to improve. This is what we are working on now:
- Supporting strings with numeric variables and strings with gender identifiers
- Making our technical solutions more resilient to failures
- Improving scalability
- Supporting export to different formats (Android XML, Microsoft .resx, etc.)
The Global String Repository is not tied to the Netflix business domain, so we plan to release it as open source software.
Hydra
Netflix is a global service that supports many locales in countless combinations of devices and UIs, so manual testing does not scale. Previously, the localization team and the UI developers tested everything by hand on different devices, from game consoles to iOS and Android, checking every string against its context and the UI (for example, making sure that no text was truncated).
But striving for excellence is part of Netflix's philosophy, and it pushes us to rethink how we work. That is how Hydra was born.
Hydra's goal is to catalog every possible unique screen and to let anyone pull up exactly the screens they need by filtering, for example by device and locale. As a German localization specialist, for instance, you can set the filters to see the entire flow that unregistered users go through on the PS3, the website, and Android. These screens can be viewed at the pace at which a user would open them on a device.
Working with screens in Hydra
Hydra does not capture screens itself; it catalogs and displays them. To get screens into the Hydra catalog, we rely on our UI automation. Driven by Jenkins CI, data-driven tests run in parallel in all supported locales and take screenshots, which are published to Hydra with the corresponding metadata: page name, feature area, UI platform, and one critical piece of metadata, the unique screen definition.
The unique screen definition is needed to build a complete catalog of screens without false matches. It also makes long-term comparison possible, because each unique screen is compared only with itself. The definition differs from UI to UI; for the browser, it is the combination of page name, browser, resolution, locale, and development environment.
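One way to picture a unique screen definition for the browser is as a composite key built from page name, browser, resolution, locale, and development environment. The sketch below is an assumption about how such a key could be derived and used; `BrowserScreen`, `screen_id`, and `publish` are hypothetical names, and hashing the image bytes is a stand-in for whatever comparison Hydra actually performs.

```python
import hashlib
from dataclasses import astuple, dataclass


@dataclass(frozen=True)
class BrowserScreen:
    page_name: str
    browser: str
    resolution: str
    locale: str
    environment: str

    def screen_id(self):
        # Join the fields with a separator that cannot appear in normal values, then hash.
        raw = "\x1f".join(astuple(self))
        return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16]


catalog = {}  # screen_id -> list of image digests, oldest first


def publish(screen, image_bytes):
    """Store a screenshot under its unique screen and report whether the image changed."""
    history = catalog.setdefault(screen.screen_id(), [])
    digest = hashlib.sha256(image_bytes).hexdigest()
    changed = bool(history) and history[-1] != digest
    history.append(digest)
    return changed


danish = BrowserScreen("signup", "chrome", "1920x1080", "da-DK", "prod")
german = BrowserScreen("signup", "chrome", "1920x1080", "de-DE", "prod")

first_run = publish(danish, b"pixels-v1")   # nothing to compare against yet
second_run = publish(danish, b"pixels-v1")  # identical to the previous screenshot
third_run = publish(danish, b"pixels-v2")   # differs from the previous screenshot
other = publish(german, b"pixels-v2")       # a different unique screen, so no comparison
```

Because the key includes the locale, a German screenshot is never compared with a Danish one, which is what keeps false matches out of the catalog.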
Technology
Hydra is a full-stack web application hosted in AWS. The Java back end performs two basic functions: it processes incoming screenshots and serves data to the front end through a REST API.

When the UI automation sends a screen to Hydra, the image file itself is written to S3, which provides practically unlimited storage, and the much smaller metadata is written to an RDS database so that it can be queried later through the REST API. The REST endpoints map query string parameters to MySQL queries.
For example:
REST/v1/lists/distinctList?item=feature&selectors=uigroup,TVUI;area,signupwizard;locale,da-DK
The parameters of this request are translated into a database query along these lines:
select distinct feature where uigroup = 'TVUI' AND area = 'signupwizard' AND locale = 'da-DK'
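The mapping from the `item` and `selectors` parameters to a query could be sketched as follows. This is a minimal illustration, not Hydra's code: the table name `screens`, the column whitelist, and the function `build_distinct_query` are assumptions, and an in-memory SQLite database stands in for MySQL so that the example runs on its own.

```python
import sqlite3
from urllib.parse import unquote, urlsplit

ALLOWED_COLUMNS = {"feature", "uigroup", "area", "locale"}  # assumed column names
TABLE = "screens"                                           # hypothetical table name


def build_distinct_query(url):
    """Turn ?item=...&selectors=col,value;col,value into parameterized SQL."""
    # Split on '&' only: the selectors value itself uses ';' as a separator.
    query = urlsplit(url).query
    params = dict(part.split("=", 1) for part in query.split("&") if part)
    item = unquote(params["item"])
    selectors = unquote(params.get("selectors", ""))
    pairs = [s.split(",", 1) for s in selectors.split(";") if s]
    # Column names cannot be bound as parameters, so they are checked against a whitelist.
    for column in [item] + [p[0] for p in pairs]:
        if column not in ALLOWED_COLUMNS:
            raise ValueError(f"unknown column: {column}")
    sql = f"SELECT DISTINCT {item} FROM {TABLE}"
    if pairs:
        sql += " WHERE " + " AND ".join(f"{column} = ?" for column, _ in pairs)
    return sql, [value for _, value in pairs]


url = "REST/v1/lists/distinctList?item=feature&selectors=uigroup,TVUI;area,signupwizard;locale,da-DK"
sql, args = build_distinct_query(url)

# Made-up rows, only to show the query running end to end.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE screens (feature TEXT, uigroup TEXT, area TEXT, locale TEXT)")
conn.executemany("INSERT INTO screens VALUES (?, ?, ?, ?)", [
    ("planSelection", "TVUI", "signupwizard", "da-DK"),
    ("planSelection", "TVUI", "signupwizard", "da-DK"),
    ("payment", "TVUI", "signupwizard", "da-DK"),
    ("payment", "TVUI", "signupwizard", "de-DE"),
    ("search", "WEB", "browse", "da-DK"),
])
features = sorted(row[0] for row in conn.execute(sql, args))
```

Binding the values as parameters and whitelisting the column names keeps a generic endpoint like this from becoming an SQL injection vector.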
The JavaScript front end, built with Knockout.js, lets users select filters and view the screens that match them. Both the filter contents and the matching screens are retrieved by calling the REST endpoints mentioned above.
Application scaling
With Hydra in place and the automation running, adding a new locale is as easy as adding one line to an existing properties file that feeds the TestNG data provider. Screens for the new locale appear after the next Jenkins builds run.
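The effect of adding one line can be shown with a small Python stand-in for a data-driven test setup. It is only a sketch: the real automation uses a TestNG data provider in Java, and the file layout (one locale per line), the page names, and the function names here are assumptions.

```python
def load_locales(properties_text):
    """Read one locale per line, skipping blank lines and '#' comments."""
    locales = []
    for line in properties_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            locales.append(line)
    return locales


def data_provider(locales, pages):
    """Yield one (locale, page) test case per combination."""
    for locale in locales:
        for page in pages:
            yield locale, page


PAGES = ["signup", "planSelection"]  # hypothetical pages covered by the automation

before = "# supported locales\nen-US\nda-DK\n"
after = before + "ja-JP\n"           # the single added line

cases_before = list(data_provider(load_locales(before), PAGES))
cases_after = list(data_provider(load_locales(after), PAGES))
```

Every existing test picks up the new locale automatically, so the next run produces screenshots for it without any change to the test code.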
What's next?
We need a feature that alerts us when a screen has changed. At the moment, nothing notifies anyone automatically when a string changes. Hydra could turn into a work queue of sorts, where localization specialists log in and see only the specific set of screens that have changed.
Another planned feature is mapping individual string keys to the screens on which they are displayed. A translator could then change a string, search by its key, and see the screens affected by the change, previewing in advance how the string looks in context.
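A key-to-screen lookup of this kind is essentially an inverted index. The sketch below assumes that the automation could report which string keys each screen renders; that input, the screen identifiers, and the function name `build_key_index` are hypothetical, since the feature is described here only as a plan.

```python
from collections import defaultdict


def build_key_index(screens):
    """Map each string key to the sorted list of screen IDs that render it."""
    index = defaultdict(set)
    for screen_id, keys in screens.items():
        for key in keys:
            index[key].add(screen_id)
    return {key: sorted(ids) for key, ids in index.items()}


# Hypothetical metadata: screen ID -> string keys shown on that screen.
screens = {
    "tvui/signup/da-DK": ["cta.start", "title.signup"],
    "web/signup/da-DK": ["cta.start"],
    "web/browse/da-DK": ["title.browse"],
}
key_index = build_key_index(screens)
affected = key_index.get("cta.start", [])
```

Looking up a key then returns every screen a translator would need to review after editing that string.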
We are not afraid to tackle complex problems. Netflix will keep growing as a global service, and our localization team will grow with it. Challenges like these help us attract the most talented people, and we are building a team that can do what is considered impossible.