This article tries to outline the aspects of internationalization that you need to understand in .NET, the peculiarities of Chinese (and other) localizations, and a few amusing moments.
Terminology
Here are some freely translated and edited definitions from MSDN:
Localization is the process of translating an application's resources into localized versions for each language and set of regional settings the application supports. Localization should begin only once the application is localizable, i.e. once the executable code is separated from all user interface elements.
Globalization is the process of designing and developing an application so that it supports a localized user interface and regional data for users from different cultures. Culture-specific information may include the writing system, the calendars in use, conventions for formatting dates and times, numbers, currency and physical quantities, sorting rules, and even the formats of addresses and telephone numbers and the default paper size.
So what teams do to bring an initially non-localizable product into the required shape goes by the grand name of globalization, not localization.

Language and regional standards
In .NET, the main class that provides information about language and regional standards (what unmanaged code calls a “locale”) is System.Globalization.CultureInfo. Alongside it there are also Calendar, RegionInfo, NumberFormatInfo, DateTimeFormatInfo and others.
A culture has a name (in fact, a code), and it is convenient to talk in these terms. The name of the invariant culture is empty, so we will denote it as ivl.
Two thread cultures
Every thread, i.e. every instance of Thread, has two properties:
public CultureInfo CurrentCulture {get; set;}
and
public CultureInfo CurrentUICulture {get; set;}
The first culture is used for formatting numbers, dates and other regional settings; the second is used by the algorithm that searches for suitable localized resources.
So why do we need two cultures? Here is the reason: for a descendant of Anglo-Saxons who was born and lives in India, the native language is English, and that is the language in which he wants to see program interfaces on his laptop. However, when working in Excel he will most likely deal in rupees (written रु in Hindi), and he also knows that the area of his home country is 32,87,590.01 km².
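The split between the two properties can be tried out with a small sketch. The class and method names here (TwoCulturesSketch, FormatArea) are made up for illustration; only CultureInfo, Thread and the standard "N2" format specifier come from .NET. With Indian regional data installed, the formatted number is expected to show the digit grouping from the example above, but the exact output depends on the culture data of the platform.

```csharp
using System;
using System.Globalization;
using System.Threading;

static class TwoCulturesSketch
{
    // Formats a number using the thread's CurrentCulture;
    // CurrentUICulture plays no part in number formatting.
    public static string FormatArea(double squareKm)
    {
        return squareKm.ToString("N2", CultureInfo.CurrentCulture);
    }

    static void Main()
    {
        // Regional settings: India. Interface language: English.
        Thread.CurrentThread.CurrentCulture = CultureInfo.GetCultureInfo("en-IN");
        Thread.CurrentThread.CurrentUICulture = CultureInfo.GetCultureInfo("en");

        Console.WriteLine(FormatArea(3287590.01));
        Console.WriteLine(Thread.CurrentThread.CurrentUICulture.Name);
    }
}
```

Changing only CurrentUICulture in this sketch leaves the formatted number untouched, which is the whole point of having two properties.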

Culture tree structure
Cultures form a tree, i.e. every culture has a parent.
At the root of the tree is the "no" culture: the invariant culture. It carries no information about a region and represents a non-existent invariant language whose formatting rules are strangely similar to the American ones. The parent of the invariant culture is the invariant culture itself, and so on until the stack overflows.
At the opposite end are specific cultures. They contain information about the language and script, the region, and the formatting of numbers and dates. Examples: ru-RU, en-US, en-IN.
The parents of specific cultures are neutral cultures. Their purpose is to carry information about the language and script. Before .NET 4.0, neutral cultures could not contain formatting or region information; now this information is taken from the dominant specific culture. Examples: ru, en, mn-Cyrl (Mongolian, Cyrillic), mn-Mong (Old Mongolian).
A tricky question for the attentive reader: who can be the parent of a neutral culture?
Common Misconceptions
So, a twig of the culture tree is easy to picture using the example ivl <- ru <- ru-RU. But it is not true that a hierarchy always consists of three cultures. That is what, for example, the authors of the book C# 2005 Programming Language for Professionals assumed in the example in chapter 17, and back then it was almost true.
Languages with several scripts, however, break the stereotype.
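The length of a branch is easy to check by walking the Parent property up to the invariant culture. Below is a minimal sketch; the CultureChain class and its Describe method are hypothetical names, and "ivl" is substituted for the empty invariant name as in the text. For ru-RU it yields the three-link twig shown above; for a culture whose language has several scripts (uz-Latn-UZ is used as an assumption here) the chain may well be longer, depending on the culture data of the platform.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

static class CultureChain
{
    // Builds "ivl <- ... <- cultureName" by following CultureInfo.Parent
    // until the invariant culture (whose Name is empty) is reached.
    public static string Describe(string cultureName)
    {
        var names = new List<string>();
        CultureInfo culture = CultureInfo.GetCultureInfo(cultureName);
        while (culture.Name.Length > 0)
        {
            names.Add(culture.Name);
            culture = culture.Parent;
        }
        names.Add("ivl");
        names.Reverse();
        return string.Join(" <- ", names);
    }

    static void Main()
    {
        foreach (string name in new[] { "ru-RU", "uz-Latn-UZ" })
        {
            try
            {
                Console.WriteLine(Describe(name));
            }
            catch (CultureNotFoundException)
            {
                // Not every platform ships every culture.
                Console.WriteLine(name + ": culture not available here");
            }
        }
    }
}
```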

Before .NET 4.0, things were thoroughly confusing: there were specific cultures whose parent was the invariant culture. See the tool.
Chinese bush
Chinese is spoken by over 1.3 billion people; it is an official language in the People’s Republic of China, the Republic of China (aka Taiwan) and Singapore. And let us not forget the special administrative regions, Hong Kong and Macau.
There are two kinds of Chinese characters: simplified (since 1956) and traditional. Traditionally, Chinese was written from top to bottom, with the columns running from right to left. Quite recently, in 2004, vertical writing ceased to be used officially in Taiwan. Now the “European” way of writing is used: horizontally, from left to right.
Let's get back to .NET. The zh-CHS and zh-CHT cultures from .NET 2.0 have been declared obsolete and replaced with zh-Hans and zh-Hant. In the culture tree, zh-Hans is the parent of zh-CHS so that the fallback process works correctly. In the future, obsolete cultures may disappear with any patch.
Separately, I should stress that both kinds of characters are used on the territory of the People's Republic of China: traditional in Hong Kong and Macau, simplified across the rest of its much larger territory.
Fallback process
To find suitable resources (text, coordinates and sizes of controls, icons, etc.), an instance of ResourceManager looks at Thread.CurrentThread.CurrentUICulture. The UI culture can be either specific or neutral. Thread.CurrentThread.CurrentCulture, in contrast, had to be a specific culture before .NET 4.0.
First, the resource manager tries to find resources whose culture matches the UI culture. If none are found, it takes the parent culture and repeats the search. If it reaches the invariant culture this way, the default (neutral) resources have to be used (they are often, but not necessarily, located in the main assembly).
Admittedly, the default resources can also be marked with a culture. See MSDN for details.
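The search order described above can be modelled without any resource files. The sketch below is not ResourceManager itself: FallbackSketch, its in-memory dictionaries and the sample strings are all hypothetical, and only the walk along CultureInfo.Parent mirrors the real mechanism. The entry with the empty name plays the role of the default resources.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

static class FallbackSketch
{
    // Hypothetical resource sets keyed by culture name; "" holds the default resources.
    static readonly Dictionary<string, Dictionary<string, string>> Resources =
        new Dictionary<string, Dictionary<string, string>>
        {
            { "", new Dictionary<string, string> { { "Greeting", "Hello" }, { "Farewell", "Goodbye" } } },
            { "ru", new Dictionary<string, string> { { "Greeting", "Привет" } } },
        };

    // Looks for the key in the UI culture, then in each parent, ending with the default set.
    public static string Lookup(string key, CultureInfo uiCulture)
    {
        CultureInfo culture = uiCulture;
        while (true)
        {
            Dictionary<string, string> set;
            string value;
            if (Resources.TryGetValue(culture.Name, out set) && set.TryGetValue(key, out value))
                return value;
            if (culture.Name.Length == 0)
                return null; // not found even in the default resources
            culture = culture.Parent;
        }
    }

    static void Main()
    {
        CultureInfo ui = CultureInfo.GetCultureInfo("ru-RU");
        Console.WriteLine(Lookup("Greeting", ui)); // found in "ru"
        Console.WriteLine(Lookup("Farewell", ui)); // falls through to the default set
    }
}
```

In this sketch there is no ru-RU set at all, so "Greeting" is taken from the neutral ru set, while "Farewell" is missing there too and comes from the default resources.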
Microsoft's blunders
№1
Here is one more bush of cultures for your attention, the Uzbek one:

It is clear what happened: after 1991, languages that had once been switched to Cyrillic began vigorously getting rid of the Cyrillic alphabet.
The CultureInfo class has a string NativeName property, i.e. the name of the culture in the language it describes. For the uz-Latn-UZ culture its value is U'zbek (U'zbekiston Respublikasi), although in reality it should be O'zbek (O'zbekiston Respublikasi).
The bug has been around for many versions of .NET.
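Whether a particular runtime still shows this value can be checked directly. The helper below is only a sketch: NativeNameCheck and Describe are illustrative names, and what gets printed depends entirely on the culture data shipped with the platform, so no expected output is given.

```csharp
using System;
using System.Globalization;

static class NativeNameCheck
{
    // Returns "name: native name / English name" for the given culture.
    public static string Describe(string cultureName)
    {
        CultureInfo culture = CultureInfo.GetCultureInfo(cultureName);
        return culture.Name + ": " + culture.NativeName + " / " + culture.EnglishName;
    }

    static void Main()
    {
        try
        {
            Console.WriteLine(Describe("uz-Latn-UZ"));
        }
        catch (CultureNotFoundException)
        {
            Console.WriteLine("uz-Latn-UZ is not available on this platform");
        }
    }
}
```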
№2
Let's talk about the former Soviet republic of Moldavia, self-designation "Moldova". Moldovans speak Moldovan, although scholars argue that it is not an independent language but a dialect of Romanian.
In fact, there are three Romanian languages:
- Romanian in Romania (Latin);
- Romanian in Transnistria (Cyrillic), which has stayed as it was when the Union collapsed;
- Romanian in Moldavia (Latin), with its variant of romanization, which does not coincide with the one adopted in Romania.
One would expect to see three specific Romanian cultures in .NET, or at least two, for political reasons (Transnistria). But no, there is no Moldavia in the Windows NLS API. There is only the ro-RO culture, Romanian (Romania), and that is exactly the locale Moldovan users use. Yet Microsoft is present in Moldova.
And of course, .NET allows you to create your own cultures.
Interestingly, once upon a time the ru-MO and ro-MO cultures were spotted in the first versions of .NET and in old operating systems. Yes, the region code was MO, not MD as it is now. Did ISO change it?
Taboo for localizable applications
The list cannot be complete; these are examples from personal experience of hunting bugs in localized applications.
№1
Obviously, you should never hard-code the names of system folders. Then again, it would seem, where could Program Files possibly go? By some quirk, this folder was not renamed in localized Windows. But not in all localizations! In the Spanish localization the folder is proudly called Archivos de programa. I recommend running that through Google Translate from Spanish to Russian.
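Instead of a hard-coded name, the folder can be requested from the system. A minimal sketch using Environment.GetFolderPath follows; the SystemFolders class and the "MyApp" subfolder are hypothetical. On platforms that have no such folder the call is expected to return an empty string, so the result should not be assumed to be a valid absolute path everywhere.

```csharp
using System;
using System.IO;

static class SystemFolders
{
    // Builds a path under the localized Program Files folder
    // without ever spelling out the folder's name.
    public static string InProgramFiles(string relativePath)
    {
        string programFiles = Environment.GetFolderPath(Environment.SpecialFolder.ProgramFiles);
        return Path.Combine(programFiles, relativePath);
    }

    static void Main()
    {
        // "MyApp" is a made-up application folder.
        Console.WriteLine(InProgramFiles("MyApp"));
    }
}
```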
№2
The real scourge of a globalized and localized application is strings. Concatenated ones. But even when strings use substitutions, substitutions without comments are not obvious to translators: "{0}" "{1}".{2} {3}. And {2} here stands for the banal Environment.NewLine.
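One way to make life easier for translators, sketched below under assumptions: each template is a whole sentence, every placeholder is explained in a comment next to the resource, and technical pieces such as Environment.NewLine are added in code rather than passed as a substitution. The MessageTemplates class, both templates and the sample arguments are invented for the illustration; in a real application the templates would live in resource files.

```csharp
using System;
using System.Globalization;

static class MessageTemplates
{
    // Translator comment: {0} = file name, {1} = destination folder name.
    public const string FileCopiedEn = "File \"{0}\" was copied to \"{1}\".";

    // A hypothetical translation that needs the arguments in a different order;
    // numbered placeholders allow this, concatenation does not.
    public const string FileCopiedAlt = "To \"{1}\": copied \"{0}\".";

    public static string FileCopied(string template, string file, string folder)
    {
        return string.Format(CultureInfo.CurrentCulture, template, file, folder);
    }

    static void Main()
    {
        // The line break is appended in code, not hidden in the template.
        Console.Write(FileCopied(FileCopiedEn, "a.txt", "Docs") + Environment.NewLine);
        Console.Write(FileCopied(FileCopiedAlt, "a.txt", "Docs") + Environment.NewLine);
    }
}
```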
Links
MSDN
Articles
Tools