
Localization and globalization

In this article the author tries to outline the aspects of internationalization that are needed to understand how it works in .NET, some peculiarities of Chinese and other localizations, and a few amusing details.

Terminology


Here are some freely translated and edited definitions from MSDN:
Localization is the process of translating application resources into localized versions for each language and region supported by the application. Localization should begin only once the localizability requirement is met, i.e. the executable code is separated from all user interface elements.

Globalization is the process of designing and developing an application that supports a localized user interface and regional data for users from different cultures. Language- and region-specific information may include: the writing system, the calendars in use, conventions for formatting dates and times, numbers, monetary and physical quantities, sorting rules, and even the formats of addresses and telephone numbers and the default paper size.

So what teams do to bring an initially non-localizable product into the required shape goes by the grand name of globalization, not localization.


Language and regional standards


In .NET the main class that provides information about language and regional standards (in unmanaged code this is called a "locale") is System.Globalization.CultureInfo. Alongside it there are also Calendar, RegionInfo, NumberFormatInfo, DateTimeFormatInfo and others.

A culture has a name (in fact, a code), and it is convenient to talk in terms of these names. The name of the invariant culture is empty, so we will denote it as ivl.
Two thread cultures

Any thread, i.e. an instance of Thread, has two properties: public CultureInfo CurrentCulture {get; set;} and public CultureInfo CurrentUICulture {get; set;}
The first culture is used to format numbers, dates and other regional settings; the second is used by the algorithm that searches for suitable localized resources.

So why do we need two cultures? There is a reason: for a descendant of the Anglo-Saxons who was born and lives in India, the native language is English. That is the language in which he wants to see program interfaces on his laptop. However, when working in Excel he is likely to deal with rupees (the symbol रु in Hindi), and he also knows that the area of his native country is 32,87,590.01 km².
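A minimal sketch of how the two properties can be set independently for our Indian user. The class name TwoCulturesDemo and the helper FormatArea are made up for illustration, and the exact grouping printed for en-IN depends on the culture data of the runtime and operating system.

```csharp
using System;
using System.Globalization;
using System.Threading;

static class TwoCulturesDemo
{
    // Hypothetical helper: formats a number with an explicit culture
    // instead of relying on the thread's current culture.
    public static string FormatArea(double squareKm, CultureInfo culture)
    {
        return squareKm.ToString("N2", culture);
    }

    static void Main()
    {
        Thread thread = Thread.CurrentThread;

        // Regional settings: numbers, dates, currency.
        thread.CurrentCulture = CultureInfo.GetCultureInfo("en-IN");
        // Interface language: drives the search for localized resources.
        thread.CurrentUICulture = CultureInfo.GetCultureInfo("en");

        double area = 3287590.01;

        // Implicitly formatted with thread.CurrentCulture (en-IN).
        Console.WriteLine(area.ToString("N2"));
        // The same number under the invariant culture, for comparison.
        Console.WriteLine(FormatArea(area, CultureInfo.InvariantCulture));
        // The UI culture is unaffected by the formatting culture.
        Console.WriteLine(thread.CurrentUICulture.Name);
    }
}
```

Passing the culture explicitly, as FormatArea does, is handy wherever the output must not depend on the settings of the current thread.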



Culture tree structure

Cultures form a tree. That is, every culture has a parent.

At the root of the tree is the "no" culture, the invariant one. It contains no information about a region and represents a non-existent invariant language whose formatting rules are strangely similar to the American ones. The parent of the invariant culture is again the invariant culture, and so on until stack overflow.

At the opposite end are specific cultures. They contain information about the language and script, the region, and the formatting of numbers and dates. Examples: ru-RU, en-US, en-IN.

The parents of specific cultures are neutral cultures. Their purpose is to carry information about the language and the script. Before .NET 4.0 neutral cultures could not contain information about formatting and region; now this information is taken from the dominant specific culture. Examples: ru, en, mn-Cyrl (Mongolian, Cyrillic), mn-Mong (Old Mongolian script).
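The tree can be explored by following the Parent property until the invariant culture is reached. Below is a small sketch; the class CultureChainDemo and the method ParentChain are made up for illustration, and the chains printed depend on the culture data available on the machine.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

static class CultureChainDemo
{
    // Collects culture names from the given culture up to the invariant root.
    public static List<string> ParentChain(CultureInfo culture)
    {
        var names = new List<string>();
        CultureInfo current = culture;

        // The invariant culture is its own parent, so the loop has to stop
        // explicitly once it is reached.
        while (!current.Equals(CultureInfo.InvariantCulture))
        {
            names.Add(current.Name);
            current = current.Parent;
        }

        // The invariant culture has an empty name; we denote it as "ivl".
        names.Add("ivl");
        return names;
    }

    static void Main()
    {
        foreach (string name in new[] { "ru-RU", "en-IN", "uz-Latn-UZ" })
        {
            CultureInfo culture = CultureInfo.GetCultureInfo(name);
            Console.WriteLine(string.Join(" -> ", ParentChain(culture)));
        }
    }
}
```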

A trick question for the attentive reader: what can be the parent of a neutral culture?

Common Misconceptions

So, we can easily picture a twig of the culture tree using the example ivl <- ru <- ru-RU. But it is not true that the hierarchy always consists of three cultures. That is what, for example, the authors of the book C# 2005 Programming Language for Professionals assumed in an example in chapter 17, and at the time it was almost true.

But languages with several scripts break the stereotype.



Before .NET 4.0 things were thoroughly confusing: there were specific cultures whose parent was the invariant culture. See the tool in the links.

Chinese bush



Chinese is spoken by over 1.3 billion people; it is an official language in the People's Republic of China, the Republic of China (aka Taiwan) and Singapore. And let us not forget the special administrative regions, Hong Kong and Macau.

There are two Chinese scripts: simplified (since 1956) and traditional. Traditionally, the Chinese wrote from top to bottom, with the columns running from right to left. Quite recently, in 2004, vertical writing ceased to be used officially in Taiwan. Now the "European" way of writing is used: horizontally, from left to right.

Let's go back to .NET. The zh-CHS and zh-CHT cultures from .NET 2.0 have been declared obsolete and replaced by zh-Hans and zh-Hant. In the culture tree, zh-Hans is the parent of zh-CHS so that the fallback process works correctly. The obsolete cultures may disappear in the future with any patch.

Separately, I stress that both scripts are used on the territory of the People's Republic of China: traditional in Hong Kong and Macau, simplified in the rest of the (much larger) territory.

Fallback process

To search for suitable resources (text, coordinates and sizes of controls, icons, etc.), an instance of ResourceManager looks at Thread.CurrentThread.CurrentUICulture. The UI culture can be either specific or neutral. Thread.CurrentThread.CurrentCulture, by contrast, had to be a specific culture before .NET 4.0; since then a neutral culture is accepted as well.

First, the resource manager tries to find resources whose culture matches the UI culture. If none are found, it takes the parent culture and repeats the search. If we reach the invariant culture this way, the default (neutral) resources have to be used (they are often located in the main assembly, but not necessarily).

That said, the default resources can also be tagged with a culture. See MSDN for details.
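The search order can be imitated with a few lines of code. This is only a simulation of the idea, not the actual ResourceManager implementation: the class FallbackDemo, the method Lookup and the in-memory dictionary standing in for resource files are all made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

static class FallbackDemo
{
    // Hypothetical "resources": culture name -> greeting.
    // The empty key plays the role of the default (neutral) resources.
    static readonly Dictionary<string, string> Greetings =
        new Dictionary<string, string>
        {
            { "", "Hello" },
            { "ru", "Привет" },
        };

    // Walks from the UI culture up through its parents until a value is found.
    public static string Lookup(CultureInfo uiCulture)
    {
        CultureInfo current = uiCulture;
        while (true)
        {
            string value;
            if (Greetings.TryGetValue(current.Name, out value))
            {
                return value;
            }
            if (current.Equals(CultureInfo.InvariantCulture))
            {
                // Nothing found even in the default resources.
                throw new KeyNotFoundException("No resource for " + uiCulture.Name);
            }
            current = current.Parent;
        }
    }

    static void Main()
    {
        // ru-RU has no entry of its own, so the value comes from its parent ru.
        Console.WriteLine(Lookup(CultureInfo.GetCultureInfo("ru-RU")));
        // Neither en-US nor en has an entry, so the default resources are used.
        Console.WriteLine(Lookup(CultureInfo.GetCultureInfo("en-US")));
    }
}
```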

Microsoft blunders

№1

Here is one more bush of cultures, the Uzbek one:



It is clear what happened: after 1991, languages that had once been switched to Cyrillic began vigorously getting rid of it.

The CultureInfo class has a property string NativeName, i.e. the name of the culture in the language it describes. For the uz-Latn-UZ culture its value is U'zbek (U'zbekiston Respublikasi), although it should really be O'zbek (O'zbekiston Respublikasi).

The bug has survived many versions of .NET.

№2

Let's talk about the former Soviet republic of Moldavia, whose own name for itself is "Moldova". Moldovans speak Moldavian, although scholars argue that it is not an independent language but a dialect of Romanian.

In fact, there are three Romanian languages:

One would expect to see three specific Romanian cultures in .NET, or at least two, for political reasons (Transnistria). But no, there is no Moldavia in the Windows NLS API. There is only the ro-RO culture, Romanian (Romania). This is exactly the locale that Moldovan users use. And yet Microsoft is present in Moldova.

And of course, .NET allows you to create your own cultures.

Interestingly, ru-MO and ro-MO cultures were once spotted in the first versions of .NET and in old operating systems. Yes, the region code was MO, not MD as it is now. Did ISO change it?

Taboos for localizable applications


The list cannot be complete; these are examples from personal experience of hunting bugs in localized applications.

№1

Obviously, you should never hard-code the names of system folders. Although, it would seem, where could Program Files possibly go? By some quirk, in localized versions of Windows this folder was not renamed. But not in all localizations!

In the Spanish localization the folder is proudly called Archivos de programa. I recommend running it through Google Translate from Spanish to Russian.
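Instead of a hard-coded name, the path can be requested from the system. A minimal sketch: Environment.GetFolderPath and Environment.SpecialFolder.ProgramFiles are standard .NET, while the class ProgramFilesDemo, the method VendorFolder and the vendor name Contoso are made up for illustration.

```csharp
using System;
using System.IO;

static class ProgramFilesDemo
{
    // Builds a path under the real Program Files folder,
    // whatever it is called in the installed localization.
    public static string VendorFolder(string vendor)
    {
        string programFiles =
            Environment.GetFolderPath(Environment.SpecialFolder.ProgramFiles);
        return Path.Combine(programFiles, vendor);
    }

    static void Main()
    {
        // "Contoso" is a placeholder vendor name.
        Console.WriteLine(VendorFolder("Contoso"));
    }
}
```

On systems that have no such folder GetFolderPath returns an empty string, so the result should be checked before use.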

№2

The real scourge of a globalized and localized application is strings. Concatenated ones. But even when strings use placeholders, the meaning of the placeholders is not obvious to translators without comments: "{0}" "{1}".{2} {3}. And {2} here stands for the banal Environment.NewLine.
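A sketch of a friendlier alternative: one complete sentence with placeholders and a note for the translator. The class MessagesDemo, the message text and the parameter names are made up for illustration; in a real application the format string would live in a resource file, with the note stored in its comment field.

```csharp
using System;
using System.Globalization;

static class MessagesDemo
{
    // Translator note: {0} = file name, {1} = destination folder name.
    // The whole sentence is a single string, so the translator is free
    // to reorder the placeholders as the target language requires.
    const string CopiedFormat = "File \"{0}\" was copied to folder \"{1}\".";

    public static string Copied(string file, string folder)
    {
        return string.Format(CultureInfo.CurrentCulture, CopiedFormat, file, folder);
    }

    static void Main()
    {
        Console.WriteLine(Copied("report.docx", "Archive"));
    }
}
```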

Links


MSDN


Articles


Tools


Source: https://habr.com/ru/post/166053/

