
Bidirectional rendering with diacritics support

Introduction


In this article I describe how we added correct rendering of diacritics to the bidirectional text in our own TextBox, using FriBidi and HarfBuzz. This is the second article on the topic; the first was "Adding support for bidirectional text to your own TextBox", in which I described how we added Arabic support to our text control using FriBidi.

Sample Arabic Text


What is the problem?


Diacritical marks (diacritics, in professional slang) are, in typography, elements of writing that modify the shape of a character and are usually typeset separately. In the original Russian wording of the previous sentence, the stress marks over two of the letters are themselves diacritics. For example, in Russian the two dots above "ё" and the breve above "й" can be considered diacritics. Adding these diacritics led to the creation of new letters, although the two dots of "ё" are often omitted.

In most languages, rendering diacritics causes no particular problems (unless, of course, you mark the stress on every letter), because a letter with a diacritic is either a separate letter of the alphabet or is present in font files as a separate character. In other words, the TextBox does not need to position diacritics above letters itself.

In Arabic (and, for example, in Hindi) things are not so simple. In Arabic, the vowel marks are diacritics. They can be used with almost every letter, and a single letter can carry several vowel marks.

Sample Arabic Text

The letters of the Arabic alphabet are shown in black, and the vowel marks (diacritics) in gray.

As you might guess, nobody enumerated every possible combination of letters and vowel marks and assigned a separate character to each one. So, to render Arabic text correctly, you have to draw the Arabic letter and then separately draw each diacritic above or below it.

FreeType, which we used, lets you get the image of a diacritic from the font file and even reports offsets for it. But these offsets are not enough on their own: you cannot work out from a single character how to place the diacritic. A case in point, shown below, is several diacritics above one letter. Correct positioning requires analyzing the entire text.

Arabic letter with diacritic

To calculate the positions of diacritics above the letters, we used the HarfBuzz library. It returns the glyph indices in the font and their offsets, which are then used for rendering.

How to use HarfBuzz


HarfBuzz takes a font and a string as input and returns the position of each glyph along with additional information (for example, the glyph index).

    hb_buffer_t *buf;       // HarfBuzz buffer; see hb_buffer_create / hb_buffer_destroy
    hb_font_t *hb_ft_font;  // HarfBuzz font; created with hb_font_create, released with hb_font_destroy
    hb_script_t script;     // script of the text; hb_unicode_script returns it for a character

    hb_direction_t dir = hb_script_get_horizontal_direction(script);
    hb_buffer_set_direction(buf, dir);   // set the text direction
    hb_buffer_set_script(buf, script);
    // add the text to the HarfBuzz buffer
    hb_buffer_add_utf32(buf, (const uint32_t*)text, length, 0, length);
    hb_shape(hb_ft_font, buf, NULL, 0);  // shape the text

    unsigned int glyph_count = 0;
    // information about each glyph
    hb_glyph_info_t *glyph_info = hb_buffer_get_glyph_infos(buf, &glyph_count);
    // position of each glyph
    hb_glyph_position_t *glyph_pos = hb_buffer_get_glyph_positions(buf, &glyph_count);
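The positions returned by the shaper still have to be turned into drawing coordinates. The following is only a sketch of that step, not the code from our TextBox. `GlyphPos` is a hypothetical stand-in with the same four fields as `hb_glyph_position_t`, `DrawPoint` and `layout_glyphs` are names invented for illustration, and the values are assumed to be in 26.6 fixed point (1/64 of a pixel), which depends on how the font scale was set up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for hb_glyph_position_t: same four fields,
 * assumed here to be in 26.6 fixed point (1/64 of a pixel). */
typedef struct {
    int32_t x_advance, y_advance;
    int32_t x_offset, y_offset;
} GlyphPos;

/* Where to draw one glyph, in whole pixels. */
typedef struct {
    int x, y;
} DrawPoint;

/* Walk the glyphs in the order the shaper returned them.
 * Offsets move only the current glyph (this is how a diacritic ends up
 * above or below its letter); advances move the pen for later glyphs. */
static void layout_glyphs(const GlyphPos *pos, size_t count,
                          int origin_x, int origin_y, DrawPoint *out)
{
    assert(pos != NULL && out != NULL);
    int32_t pen_x = 0, pen_y = 0; /* 26.6 fixed point */
    for (size_t i = 0; i < count; ++i) {
        out[i].x = origin_x + (pen_x + pos[i].x_offset) / 64;
        /* Assumption: screen y grows downward, font y grows upward. */
        out[i].y = origin_y - (pen_y + pos[i].y_offset) / 64;
        pen_x += pos[i].x_advance;
        pen_y += pos[i].y_advance;
    }
}
```

A diacritic typically has a zero advance and a non-zero offset, so it is drawn relative to the pen position left by its base letter and does not shift the glyphs that follow.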


Note that the code above should be applied only to a piece of text that uses a single font and a single script. To split the text into such pieces, you can use the hb_unicode_script function, which returns the script of a character.

Since our task was to support not just Arabic but bidirectional text (for example, Arabic and Latin may appear in the same line), we used FriBidi for correct ordering. This is described in more detail in the first article, "Adding support for bidirectional text to your own TextBox".
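Splitting text into same-script runs could look roughly like the sketch below. To keep it self-contained, `script_of` is a deliberately simplified, hypothetical classifier that stands in for `hb_unicode_script` and knows only basic Latin letters and the main Arabic block. `Script`, `Run` and `split_runs` are illustrative names, not part of HarfBuzz.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { SCRIPT_COMMON, SCRIPT_LATIN, SCRIPT_ARABIC } Script;

/* Simplified, hypothetical stand-in for hb_unicode_script. */
static Script script_of(uint32_t cp)
{
    if ((cp >= 'A' && cp <= 'Z') || (cp >= 'a' && cp <= 'z'))
        return SCRIPT_LATIN;
    if (cp >= 0x0600 && cp <= 0x06FF)
        return SCRIPT_ARABIC;
    return SCRIPT_COMMON; /* spaces, digits, punctuation, everything else */
}

typedef struct {
    size_t start, length; /* range of code points in the text */
    Script script;
} Run;

/* Split text into runs of one script. Neutral characters (SCRIPT_COMMON)
 * join the current run. Returns the number of runs written; text that
 * does not fit into max_runs is left out. */
static size_t split_runs(const uint32_t *text, size_t length,
                         Run *runs, size_t max_runs)
{
    assert(text != NULL && runs != NULL);
    size_t n = 0;
    for (size_t i = 0; i < length; ++i) {
        Script s = script_of(text[i]);
        if (n > 0 && (s == SCRIPT_COMMON || s == runs[n - 1].script)) {
            runs[n - 1].length++;       /* same script or neutral: extend */
        } else if (n > 0 && runs[n - 1].script == SCRIPT_COMMON) {
            runs[n - 1].script = s;     /* leading neutrals adopt the first real script */
            runs[n - 1].length++;
        } else {
            if (n == max_runs)
                break;
            runs[n].start = i;
            runs[n].length = 1;
            runs[n].script = s;
            n++;
        }
    }
    return n;
}
```

Each resulting run can then be shaped separately with its own script and direction. A real implementation would also have to start a new run wherever the font changes.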

TextBox changes


Our TextBox already supported bidirectional text. Characters are stored in memory in input order, and each of them has a corresponding position in rendering order.

Storing bidirectional text in the program
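The storage scheme described above can be sketched as follows. The names `Char` and `visual_order` are hypothetical and only illustrate the idea; in practice the visual position of each character would come from a bidi library such as FriBidi.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entered character: kept in input (logical) order, but it also
 * remembers the slot in which it is drawn (visual order). */
typedef struct {
    uint32_t codepoint;
    size_t visual_pos;
} Char;

/* Fill order[] so that order[v] is the logical index of the character
 * drawn in visual slot v. visual_pos is assumed to be a permutation
 * of 0..count-1. */
static void visual_order(const Char *chars, size_t count, size_t *order)
{
    assert(chars != NULL && order != NULL);
    for (size_t i = 0; i < count; ++i) {
        assert(chars[i].visual_pos < count);
        order[chars[i].visual_pos] = i;
    }
}
```

For the logical sequence "a", "b", ALEF, BEH in a left-to-right paragraph, the two Arabic letters swap places on screen, so the rendering loop visits logical indices 0, 1, 3, 2 while editing code keeps working with the input order.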

With the addition of diacritics, the situation became a bit more complicated, because several entered characters could now correspond to one letter. To keep the cursor positioning code independent of diacritics, the representation of a letter had to become slightly more complex: each letter now stores a list of the glyphs that belong to it.

Splitting bidirectional text with diacritics
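One possible way to build such per-letter glyph lists is to group glyphs by the `cluster` value that HarfBuzz reports for every glyph in `hb_glyph_info_t`. This is an assumption made for illustration, not necessarily how our TextBox does it. It relies on a base letter and its marks sharing one cluster value, which is my understanding of HarfBuzz's default cluster level. `GlyphInfo`, `Letter` and `group_letters` are hypothetical names, and `GlyphInfo` mirrors only two fields of the real structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant part of hb_glyph_info_t. */
typedef struct {
    uint32_t glyph_id; /* glyph index in the font */
    uint32_t cluster;  /* index of the source character the glyph belongs to */
} GlyphInfo;

/* A letter as the cursor sees it: a contiguous slice of the glyph array. */
typedef struct {
    size_t first_glyph;
    size_t glyph_count;
    uint32_t cluster;
} Letter;

/* Group consecutive glyphs with the same cluster value into letters.
 * letters[] must have room for `count` entries. Returns the letter count. */
static size_t group_letters(const GlyphInfo *glyphs, size_t count,
                            Letter *letters)
{
    assert(glyphs != NULL && letters != NULL);
    size_t n = 0;
    for (size_t i = 0; i < count; ++i) {
        if (n > 0 && letters[n - 1].cluster == glyphs[i].cluster) {
            letters[n - 1].glyph_count++; /* a mark joins its base letter */
        } else {
            letters[n].first_glyph = i;
            letters[n].glyph_count = 1;
            letters[n].cluster = glyphs[i].cluster;
            n++;
        }
    }
    return n;
}
```

With this layout the cursor code only ever steps from one `Letter` to the next, regardless of how many diacritics each one carries.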

This approach simplified the implementation of editing functions, including copy and paste. However, it does not allow deleting an individual diacritic, because the cursor can only be placed before or after a letter.
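The trade-off can be illustrated with a backspace operation that always removes a whole letter. This is a hedged sketch with hypothetical names (`is_mark`, `letter_start_before`, `backspace`). It assumes that marks follow their base letter in input order, and `is_mark` recognizes only the basic Arabic vowel marks U+064B to U+0652.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mark test: only the basic Arabic vowel marks
 * (fathatan .. sukun); a real implementation needs the full
 * Unicode data. */
static int is_mark(uint32_t cp)
{
    return cp >= 0x064B && cp <= 0x0652;
}

/* Index of the first code point of the letter ending just before cursor. */
static size_t letter_start_before(const uint32_t *text, size_t cursor)
{
    size_t i = cursor;
    while (i > 0 && is_mark(text[i - 1]))
        --i; /* skip the marks of the letter */
    if (i > 0)
        --i; /* step onto the base character */
    return i;
}

/* Delete the whole letter before the cursor: base character plus marks.
 * Returns the new cursor position and updates *length. */
static size_t backspace(uint32_t *text, size_t *length, size_t cursor)
{
    assert(text != NULL && length != NULL && cursor <= *length);
    size_t start = letter_start_before(text, cursor);
    memmove(text + start, text + cursor, (*length - cursor) * sizeof *text);
    *length -= cursor - start;
    return start;
}
```

Because the cursor never stops between a base character and its marks, there is no position from which a single diacritic could be removed; retyping the letter is the only way to change its marks in this model.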

Example


You can find an example of bidirectional text rendering here: GitHub / ex-sdl-freetype-harfbuzz-fribidi. The example uses SDL2 to create the window, FreeType to render glyphs, FriBidi for correct ordering of bidirectional text, and HarfBuzz to get glyphs and their positions.

Screenshot of the running example

Disclaimer


Yes, we are reinventing the wheel: we implement our own TextBox from scratch. We did not use Pango because we had a bad experience with it before. Perhaps it would have been easier with Pango.

Useful links


Source: https://habr.com/ru/post/277525/

