
Encodings must die

YNDHPNBYKH YANANI NEDMN KHG MJKHLEPGEYIHU NPSFHI ANPEAH I PSMERNL.

"KOI8", thought Stirlitz.

According to Yandex, the most comprehensive dictionary of Korean hieroglyphs, compiled about a thousand years ago, recorded about 53 thousand characters. That must be hard on the Koreans. Russian has a different problem: only 33 letters, but the encodings... has anyone counted them? I haven't. Opera lists 4; Firefox offers a choice of 7.

I won't go into the history of the question or dig into which cataclysm produced which Russian encoding. I will state only the main conclusion I drew from this mess: national encodings are evil. They are a vestige that the Internet (which in this case deserves its capital letter) should discard as useless. One would like to blurt out “Long live the Great Rips of the Internet in UTF-8!” :-)
Where should we look for the root of the evil? Take a look at the HTTP protocol. What do we see? The headers, the GET request string, and the POST data are encoded in the “url-encoded” format, which in turn is built on US-ASCII characters.
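To make the “url-encoded” point concrete, here is a minimal sketch using Python's standard `urllib.parse` module. The sample word is my own illustration, not something from the protocol itself; the point is only that non-ASCII text has to be turned into bytes and then spelled out with ASCII `%XX` escapes.

```python
from urllib.parse import quote, unquote

# Hypothetical sample text, chosen only for illustration.
text = "Хабр"

# quote() encodes the text to bytes (UTF-8 by default) and writes
# every non-ASCII byte as an ASCII "%XX" escape.
encoded = quote(text)

print(encoded)                   # %D0%A5%D0%B0%D0%B1%D1%80
print(encoded.isascii())         # True: only US-ASCII goes on the wire
print(unquote(encoded) == text)  # True: decoding as UTF-8 restores the text
```

Four letters become eight escaped bytes, and nothing in the escaped string says which encoding produced them.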

It is easy to imagine how much nicer it would be to see page addresses like habrahabr.ru/blog/Habrablog encoded in UTF-8.
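Why a single agreed encoding matters for addresses can be sketched in a few lines of Python. The path segment and the choice of cp1251 and KOI8-R here are assumptions for illustration: a percent-encoded address carries no label for its encoding, so a receiver that guesses wrong gets garbage, while a UTF-8-everywhere convention removes the guess.

```python
from urllib.parse import quote, unquote

# Hypothetical Cyrillic path segment, used only for illustration.
path = "блог"

# A sender that still uses a national encoding (cp1251 here).
sent = quote(path, encoding="cp1251")

# The receiver has to guess how to decode the escaped bytes.
guessed_wrong = unquote(sent, encoding="koi8-r")
guessed_right = unquote(sent, encoding="cp1251")

# If both sides agree on UTF-8, there is nothing to guess.
utf8_round_trip = unquote(quote(path))

print(sent.isascii())           # True: the escaped form is plain ASCII
print(guessed_wrong == path)    # False: wrong guess, unreadable text
print(guessed_right == path)    # True
print(utf8_round_trip == path)  # True
```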

Source: https://habr.com/ru/post/7606/

