
About source code rasterization

I occasionally see bloggers convert code into pictures so that they don't have to fight the buggy engine of a particular platform. In most cases the authors simply take a screenshot, but I took a more direct route: I built the ability to “rasterize” code into my own editor. This post is about how I did it. It is also an illustration of what it describes, because the code here really is rasterized. All sources are here: http://butbucket.org/nesteruk/typografix .

Where to start?
Solutions for syntax highlighting already exist; one is used, for example, in HabraEditor. That solution is itself based on this project, which I took and slightly reworked so that languages such as Boo could be highlighted on Habr as well. But how do you rasterize the resulting HTML into a picture? Clearly you could use a real browser (IE, say), display the HTML in it, take a screenshot, crop it, and be done. But I did not like this approach for two reasons:

Let's first talk about the text rasterizer: what kind of beast it is and why it is needed.

See the headings in this article? They are graphics, jpg files. Naturally, I did not create them one by one in Photoshop and then insert them into the post. For this I use a rasterizer, that is, a component that takes some markup as input (à la HTML, but more interesting) and produces a graphic file as output.

Here is a small example: if I write Hello, World in the markup, I’ll get

Hello, World

Accordingly, the markup I use lets me rasterize both the usual bold and italic and OpenType features such as capitals, ligatures, and so on.

Ordinary, Bold and Italic

All this is done using tags such as [b], [i], [sw], etc. Each individual segment of text is described by the following structure:
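
A minimal sketch of what such a per-segment structure might look like, written in Python rather than the editor's own language. The class name, the field names, and the closing-tag form [/b] are assumptions made for illustration; they are not the actual types from the project.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TextSegment:
    """One run of text plus the formatting that applies to it (hypothetical)."""
    text: str
    bold: bool = False                   # set by [b]
    italic: bool = False                 # set by [i]
    swash: bool = False                  # set by [sw], an OpenType feature
    font_family: Optional[str] = None    # None means "not redefined here"
    font_size: Optional[float] = None
    color: Optional[str] = None


# "Ordinary, [b]Bold[/b] and [i]Italic[/i]" could be represented as:
segments = [
    TextSegment("Ordinary, "),
    TextSegment("Bold", bold=True),
    TextSegment(" and "),
    TextSegment("Italic", italic=True),
]
```

Keeping the optional fields as None makes it possible to tell "not set on this segment" apart from an explicit value.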

All text (or code) that is rasterized is thus just a sequence of such elements. A MarkupParser class parses the markup using ordinary string comparison and applies each tag to the corresponding element.

All this is used by a recursive parser, which separates the markup from the text.
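
The idea of a recursive parser driven by plain string comparison can be sketched roughly as follows. This is an illustration in Python, not the MarkupParser from the project: the function names, the closing-tag form [/b], and the representation of a run as a (text, active tags) pair are all assumptions.

```python
from typing import FrozenSet, List, Optional, Tuple

KNOWN_TAGS = ("b", "i", "sw")   # tags named in the post; closing form [/x] is assumed

Run = Tuple[str, FrozenSet[str]]


def parse_markup(source: str) -> List[Run]:
    """Split markup into (text, active_tags) runs."""
    runs: List[Run] = []
    _parse(source, 0, frozenset(), None, runs)
    return runs


def _parse(src: str, pos: int, active: FrozenSet[str],
           closing: Optional[str], runs: List[Run]) -> int:
    """Consume src from pos until the closing tag (or the end); return the new position."""
    buf: List[str] = []

    def flush() -> None:
        # Emit the text collected so far with the tags active at this depth.
        if buf:
            runs.append(("".join(buf), active))
            buf.clear()

    while pos < len(src):
        if src.startswith("\\[", pos):            # escaped bracket -> literal "["
            buf.append("[")
            pos += 2
            continue
        if closing is not None and src.startswith(closing, pos):
            flush()
            return pos + len(closing)
        for tag in KNOWN_TAGS:
            opener = "[" + tag + "]"
            if src.startswith(opener, pos):       # plain textual comparison
                flush()
                pos = _parse(src, pos + len(opener), active | {tag},
                             "[/" + tag + "]", runs)
                break
        else:
            buf.append(src[pos])                  # ordinary character
            pos += 1
    flush()
    return pos
```

For example, parse_markup("[b]a[i]b[/i][/b]") yields one run for "a" with only b active and one run for "b" with both b and i active, because each opening tag recurses with an enlarged set of active tags.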

The entry point is the following method, which is called by the editor (itself written in WPF).

As you can see, the method receives a “prototype” of the text formatting, which is applied to every element that does not redefine the typeface, font size, color, and so on.
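
The prototype mechanism amounts to filling in defaults. A small Python sketch of that idea, with a hypothetical Format type and resolve function (the names and the sample font are illustrative, not taken from the project):

```python
from dataclasses import dataclass, fields, replace
from typing import Optional


@dataclass(frozen=True)
class Format:
    """Formatting properties; None means "not redefined on this element"."""
    font_family: Optional[str] = None
    font_size: Optional[float] = None
    color: Optional[str] = None


def resolve(prototype: Format, own: Format) -> Format:
    """Return the prototype with every field that `own` sets overridden."""
    overrides = {
        f.name: getattr(own, f.name)
        for f in fields(own)
        if getattr(own, f.name) is not None
    }
    return replace(prototype, **overrides)


prototype = Format(font_family="Consolas", font_size=12.0, color="#000000")
keyword = Format(color="#0000ff")        # only the color is redefined
resolved = resolve(prototype, keyword)   # typeface and size come from the prototype
```

An element that redefines nothing simply resolves to the prototype itself.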

Now for how it is rasterized. There are several steps. First, since I use DirectWrite (a reminder: a DirectWrite wrapper for .NET already exists, but it does not work in a 64-bit environment), I have a lot of infrastructure, which also uses COM.

All this “good stuff” takes part, one way or another, in creating bitmaps or rendering text. The most interesting piece is IDWriteTypography: this is where the set of OpenType features is defined for a specific piece of text.

Next, the prototype and the markup are parsed:

After that, each element is visited and the graphic properties it needs are set. Here is a small example of how simple it is:

The content of the created render target is subsequently copied into the bytes of the bitmap (that is, a System.Drawing.Bitmap), which we “locked” before passing it through P/Invoke.
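
Copying pixels between two buffers usually has to account for row padding, since a locked bitmap's stride (bytes per row) can be larger than the visible row. The following Python sketch shows only that row-by-row copy on plain byte buffers. The function name and parameters are hypothetical, and the assumption that the two buffers have different strides is mine, not something the post states.

```python
def copy_rows(src: bytes, src_stride: int,
              dst: bytearray, dst_stride: int,
              width: int, height: int, bytes_per_pixel: int = 4) -> None:
    """Copy `height` rows of pixels between buffers with different strides."""
    row_bytes = width * bytes_per_pixel
    if row_bytes > src_stride or row_bytes > dst_stride:
        raise ValueError("stride is smaller than one row of pixels")
    for y in range(height):
        s = y * src_stride
        d = y * dst_stride
        # Copy only the visible pixels; padding bytes in dst stay untouched.
        dst[d:d + row_bytes] = src[s:s + row_bytes]


# 2x2 image, 4 bytes per pixel: source rows are tightly packed (stride 8),
# destination rows are padded to 12 bytes.
source = bytes(range(16))
target = bytearray(24)
copy_rows(source, 8, target, 12, width=2, height=2)
```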

Code rasterization
So, we have the HTML and the rasterizer; now we need to get the correct markup. This is done simply:
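
A rough Python sketch of such a conversion, using the standard html.parser module. It is an illustration under several assumptions: the class and function names are hypothetical, the closing-tag form [/b] is assumed, and only bold and italic tags are mapped. Colored spans from a real highlighter would need a color tag whose syntax the post does not show, so here their text is kept and the color is dropped. Literal square brackets are escaped, because brackets delimit tags in the markup.

```python
from html.parser import HTMLParser


def escape_markup(text: str) -> str:
    """Escape literal '[' so the markup parser does not read it as a tag."""
    return text.replace("[", "\\[")


class HtmlToMarkup(HTMLParser):
    # HTML tag -> markup tag (hypothetical mapping)
    TAGS = {"b": "b", "strong": "b", "i": "i", "em": "i"}

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)   # "&lt;" arrives as "<"
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.TAGS:
            self.parts.append("[" + self.TAGS[tag] + "]")

    def handle_endtag(self, tag):
        if tag in self.TAGS:
            self.parts.append("[/" + self.TAGS[tag] + "]")

    def handle_data(self, data):
        self.parts.append(escape_markup(data))


def html_to_markup(html: str) -> str:
    parser = HtmlToMarkup()
    parser.feed(html)
    parser.close()          # flush any text still buffered by the parser
    return "".join(parser.parts)
```

For instance, html_to_markup("<b>int</b> a[0];") produces the markup [b]int[/b] a\[0]; with the array bracket escaped.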

There is one small problem here: we have to change [ to \[, because square brackets are used for the markup. No big deal. Now the last step is to prepare an empty Bitmap and draw on it using our DirectWrite rasterizer:

It was that easy to adapt the existing infrastructure to the rasterizer. I hope you can already see the results of its work. Yes, of course, you cannot cut and paste such text, but apart from that everything looks very nice (IMHO), and most importantly, you can start using rasterization for any annotations right in the code. I think it's worth a try.

Source: https://habr.com/ru/post/102333/
