
Notepad++: Cyrillic characters mistakenly in the code, and a solution

Yesterday I spent almost two hours hunting for an error in seemingly correct code. The cause turned out to be trivial: the Cyrillic letter “е” had somehow ended up in the “text” key of an array. It looks identical to the Latin “e”, which made the problem very hard to spot. I am sure that most programmers, and anyone else who works with text, run into similar trouble from time to time. This is especially true of the Latin “c” and the Russian “с” (“es”), which sit on the same key in the Russian and English layouts. This was not the first time it happened to me, so I decided to look seriously for a solution. And a solution was found: not very elegant, but quite workable.
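To make the problem concrete, here is a minimal Python sketch (the article's own code was PHP; the dictionary and variable names here are purely illustrative) showing that the two letters are different characters even though they look the same, so a key typed with the Cyrillic letter never matches:

```python
import unicodedata

latin_e = "e"          # U+0065
cyrillic_e = "\u0435"  # U+0435, drawn like the Latin "e" in most fonts

# Print the code point and the official Unicode name of each letter.
for ch in (latin_e, cyrillic_e):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# A key typed with the Cyrillic letter is a different string.
data = {"text": "hello"}             # illustrative data, not from the article
bad_key = "t" + cyrillic_e + "xt"    # looks like "text" on screen
print(bad_key == "text")             # the comparison fails
print(bad_key in data)               # the lookup fails too
```

Printing the code points this way is also a quick way to confirm a suspicion once a strange key or variable has been narrowed down.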

For historical reasons I often use Notepad++ for work in general, and for writing PHP scripts in particular. In it, for example, the variable names $iicuxiphametod and $icucihirpathod (ignore the strange names, this is just an example) look exactly the same, although half of the characters in the name on the right are Cyrillic.

image

My first thought was to find every lowercase Cyrillic character that sits immediately to the right or left of a Latin character and fix it, either manually or with a regular-expression replace, using regular-expression search.

Example of a search (pattern (?<=[A-Za-z])[а-яі]|[а-яі](?=[A-Za-z]); the “і” in the character classes is the Ukrainian letter):

image

Search results:

image

For simplicity, I did not restrict the character classes to the Cyrillic letters that look like Latin ones; I included all of them (the Russian and Ukrainian alphabets, minus some Ukrainian letters). I only wanted to show the principle.
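The same search can be sketched outside the editor. The Python snippet below is an illustration, not part of the original workflow: it assumes the character classes are the lowercase range а-я plus the Ukrainian “і”, and the function name and sample string are made up for the example.

```python
import re

# A lowercase Cyrillic letter directly after or directly before a Latin letter.
# \u0430-\u044f is the range а-я, \u0456 is the Ukrainian "і" (assumed classes).
MIXED = re.compile(
    r"(?<=[A-Za-z])[\u0430-\u044f\u0456]|[\u0430-\u044f\u0456](?=[A-Za-z])"
)

def find_suspects(text):
    """Return (line_number, column, character) for every match, 1-based."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for m in MIXED.finditer(line):
            hits.append((line_no, m.start() + 1, m.group()))
    return hits

# The first line hides a Cyrillic "е" inside the key; the second line is a
# normal Russian string and should not be reported.
sample = "$arr['t\u0435xt'] = 1;\n$ok = 'привет';"
print(find_suspects(sample))
```

Like the editor search, this only reports Cyrillic letters that touch a Latin letter, so an identifier typed entirely in Cyrillic would pass unnoticed.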

This is a usable option, but every file would have to be checked each time the code misbehaves, and that is inconvenient.
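One way to reduce that manual work is to check a whole project in one pass. The sketch below is a swapped-in variation rather than the author's method: instead of the adjacency pattern it flags any word that mixes Latin and Cyrillic letters, using Unicode character names. The function names and the default "*.php" file mask are assumptions made for the example.

```python
import re
import unicodedata
from pathlib import Path

WORD = re.compile(r"\w+")  # in Python 3 this matches Unicode letters too

def scripts(word):
    """Return which of LATIN / CYRILLIC occur among the letters of word."""
    found = set()
    for ch in word:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            for script in ("LATIN", "CYRILLIC"):
                if name.startswith(script):
                    found.add(script)
    return found

def mixed_words(text):
    """Yield (line_number, word) for words mixing Latin and Cyrillic letters."""
    for line_no, line in enumerate(text.splitlines(), start=1):
        for m in WORD.finditer(line):
            if scripts(m.group()) == {"LATIN", "CYRILLIC"}:
                yield line_no, m.group()

def scan(root, pattern="*.php"):
    """Print every mixed-script word in files under root that match pattern."""
    for path in Path(root).rglob(pattern):
        text = path.read_text(encoding="utf-8", errors="replace")
        for line_no, word in mixed_words(text):
            print(f"{path}:{line_no}: {word!r}")
```

Calling scan("path/to/project") would list the suspicious words with file names and line numbers. It still has to be run by hand or from a hook, so it does not remove the inconvenience, only shortens it.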

My second thought was: “Is it possible to set a separate font, or a separate font size, for Cyrillic, so that Cyrillic and Latin look different as you type, and mistakenly entered characters stand out and can be corrected immediately rather than later?” I found no such option in Notepad++. You can set individual fonts, sizes, and colors for different programming languages and for different kinds of tokens (variables, strings, reserved words, and so on), but not for Cyrillic.

Then I thought there might be a plugin for this, but searching for one was not successful either.

And then I had a bright idea: find a font in which the Cyrillic letters differ from the Latin ones, and set it for reserved words, variables, and a few other problematic categories. Such fonts, albeit with exotic names, do exist, although I found only a few of them.

For example, this is how the names above look if you set the SimSun-ExtB font for variable names (Settings -> Style Configurator -> Font Style):

image

More examples:

MingLiU-ExtB font:

image

NSimSun font:

image

Going further, for string data you can choose a font in which Cyrillic characters differ from Latin ones, for example SimSun-ExtB, and for other categories where Cyrillic is normally not needed, such as variables, a font that has no Cyrillic glyphs at all, for example Miriam Fixed. In such fonts other characters are displayed in place of the Russian letters, and they immediately catch the eye.

image

Compare the same names in the Courier New font:

image

and in the Miriam Fixed font:

image

The fonts are very similar, but in the second case a mistakenly entered Cyrillic character is practically impossible to overlook.

This solution works in Notepad++, but I think the same can be done in some other editors and IDEs.

I hope this method helps someone save time and avoid these elementary but very unpleasant mistakes in the future.

Source: https://habr.com/ru/post/147374/

