This article was published on the W3C blog in 2007, but most software developers still seem to give little thought to the problem of text length. How many times have you seen a message that does not fit into its alert box?
Intended audience: web developers, web project managers, localization specialists, and anyone interested in how changes in text length during localization affect page design.
When text is translated from one language to another, the source and the translation are likely to differ in length. In several situations these differences occur systematically.
This article contains reference materials describing some of these differences. Other articles will discuss what implications this has for the development of web pages and suggest solutions.
In general, the more flexible your layout, the better. Allow text to reflow, and avoid small fixed-size boxes and tightly packed text wherever possible. Be especially careful when fitting text precisely into a design. Separate the content of the page from its presentation so that you can easily adapt font sizes, line spacing, and so on when translating. Keep this in mind, too, when you design databases and specify the length of text fields.
Most problems arise with English and Chinese
English and Chinese text is usually very compact, so translations from these languages tend to be wider. Sometimes much wider.
For example, the Flickr user interface was recently translated into several languages. One of the most common messages you see when browsing your photos is the number of views, for example "392 views". Let's compare the length of each translation of the word "views" with the original English.
Language | Translation | Ratio |
Korean | 조회 | 0.8 |
English | views | 1 |
Chinese | 次 检视 | 1.2 |
Portuguese | visualizações | 2.6 |
French | consultations | 2.6 |
German | -mal angesehen | 2.8 |
Italian | visualizzazioni | 3 |
Because their glyphs are so wide, each Chinese or Korean character is counted here as two English characters.
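The ratios in the table above can be approximated in code. The following is a minimal sketch, not the method used to build the table: it assumes that characters Unicode classifies as East Asian wide or fullwidth count as two columns and everything else as one, which mirrors the counting rule just described. The names `display_width` and `expansion_ratio` are illustrative only.

```python
import unicodedata


def display_width(text: str) -> int:
    """Approximate width in Latin-character columns.

    Assumption: East Asian wide ("W") and fullwidth ("F") characters
    take two columns, all other characters one. Combining marks are
    skipped because they do not advance the line.
    """
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


def expansion_ratio(source: str, translation: str) -> float:
    """Width of the translation relative to the width of the source."""
    return display_width(translation) / display_width(source)


if __name__ == "__main__":
    for label in ("조회", "-mal angesehen", "visualizzazioni"):
        print(label, round(expansion_ratio("views", label), 1))
```

Real rendered width depends on the font, so a count like this is only a rough early warning; measuring the rendered text in the target font is more reliable.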
Expansion of up to 300%, as in the Italian example, is common for short strings like this one. In 1994, IBM published the Localized Application Design Guide, which gives the following average expansion figures for translations from English into European languages (see Volume 1 of the Guide).
Number of characters in English text | Average expansion |
Up to 10 | 200-300% |
11-20 | 180-200% |
21-30 | 160-180% |
31-50 | 140-160% |
51-70 | 130-140% |
More than 70 | 150% |
As a rule, translated text takes up more space, and the shorter the original message, the more likely it is that the translation will be significantly longer.
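A table like this can be turned into a rough space budget for short interface strings. The sketch below is only an illustration under stated assumptions: it treats each percentage as the length of the translation relative to the English source, it covers only the rows with an explicit upper bound (up to 70 characters), and the names `expected_expansion` and `width_budget` are hypothetical.

```python
import math
from typing import Optional, Tuple

# (upper bound on English length, (low %, high %)), copied from the
# bounded rows of the table above. The open-ended last row is left out.
EXPANSION_ROWS = [
    (10, (200, 300)),
    (20, (180, 200)),
    (30, (160, 180)),
    (50, (140, 160)),
    (70, (130, 140)),
]


def expected_expansion(source_length: int) -> Optional[Tuple[int, int]]:
    """Return the (low, high) percentage range for a source length,
    or None if the length is beyond the bounded rows."""
    for upper, percent_range in EXPANSION_ROWS:
        if source_length <= upper:
            return percent_range
    return None


def width_budget(source_length: int) -> Optional[int]:
    """Characters to reserve, using the high end of the range.

    Assumption: the percentage is the translated length relative to
    the source length, so 300% of 5 characters is 15 characters.
    """
    percent_range = expected_expansion(source_length)
    if percent_range is None:
        return None
    return math.ceil(source_length * percent_range[1] / 100)


if __name__ == "__main__":
    print(width_budget(len("views")))  # budget for a 5-character label
```

Budgeting for the high end of the range is a deliberately cautious choice; averages like these say nothing about the worst case for a particular string.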
Of course, not every string or message grows in length, but you need a way to deal with the problem when it does occur. For example, Flickr keeps “FAQ” as “FAQ” in its German and French versions, but uses “Perguntas freqüentes” in Portuguese and “Preguntas frecuentes” in Spanish.
As a rule, the shorter the English word, the more likely it is that the translation will be squeezed into a tight space, for example next to a form field, inside a chart, or in a tab of limited width.
Keep in mind that text expansion is not only a problem for interfaces whose original text is in English or Chinese. If the original application is in Spanish, the term “Idioma de la interfaz” becomes shorter in English (“Interface language”) but significantly longer in Malay (“Bahasar pegantar untuk penelusuran”). Shorter translations cause problems too, because they leave excess white space on the page.
When whole paragraphs are translated, the relative expansion is likely to be smaller, but there may still be situations worth watching for. For example, can you still show everything you intended on the "first screen"? Will page elements still line up the way you want if the blocks grow in height at different rates?
Complicating factors
Besides the unpredictable number of characters in a translation, other factors complicate the handling of text in a layout.
Compound nouns
Some languages, such as Finnish, German, and Dutch, often form one long “word” where other languages use a sequence of several short words.
For example, the English phrase “Input processing features” becomes “Eingabeverarbeitungsfunktionen” in German. The English text wraps easily onto two lines where the width of the block is restricted, for example next to form input fields, in tabs or buttons, or in narrow columns. The German “super-word” does not wrap automatically and can cause significant layout problems.
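One way to catch this kind of string before it reaches the layout is to look at the longest run of text that cannot wrap. The sketch below is a simplification under a stated assumption: lines break only at whitespace (real line breaking also considers hyphens, soft hyphens, and script-specific rules). The function names and the 20-character limit in the usage example are hypothetical.

```python
def longest_unbreakable_run(text: str) -> int:
    """Length of the longest whitespace-delimited token in the text.

    Assumption: whitespace is the only place where a line may break.
    """
    return max((len(token) for token in text.split()), default=0)


def fits_when_wrapped(text: str, max_chars_per_line: int) -> bool:
    """True if every token fits on a line of the given width, so the
    text can wrap without overflowing horizontally."""
    return longest_unbreakable_run(text) <= max_chars_per_line


if __name__ == "__main__":
    limit = 20  # hypothetical width of a narrow label, in characters
    for label in ("Input processing features",
                  "Eingabeverarbeitungsfunktionen"):
        print(label, fits_when_wrapped(label, limit))
```

A check like this could run over translated string files to flag labels that need a wider container or a manually placed break point.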
Character width
Chinese, Japanese, and Korean, as well as some other languages, use more complex characters than Latin-based languages do. As a result, even if the number of characters in the translated string stays the same or drops slightly, the string can occupy much more space than the original.
For example, the English word "desktop" becomes "デスクトップ" in Japanese. The Japanese translation is one character shorter, but it usually takes up much more horizontal space.
Character height and line spacing
Non-Latin characters are often much taller than Latin characters. In addition, the characteristics of the script often call for greater line spacing.
For example, the figure below shows the same text in English and Thai. Note that there are only two lines in both cases, but the Thai version takes up much more space. This is partly due to the complexity of the characters (which leads to taller glyphs and therefore a greater line height), but Thai also typically uses more leading. Many scripts need considerably more line spacing than Latin: Arabic (especially in Nastaliq), Chinese, Devanagari (used for Hindi), Japanese, Korean, Tibetan, etc.
Think twice before using abbreviations
Before you use abbreviations to fit text into a confined space, consider carefully whether this is really a good idea. Other languages may have no equivalent abbreviation, and the translated text may be much longer.
In many languages, abbreviations are very rare. This may be a matter of the style of the language. In other cases it may be for practical reasons. For example, Arabic “words” are usually built from a compact root, with prefixes, suffixes, and small internal changes that refine the meaning. Shortening them without losing meaning is a serious problem.
In addition, you will have to give translators a list of the abbreviations and acronyms you use.