
How I added 6 characters to Unicode

Star characters (★) have long been part of Unicode, so they can be used in web pages, text, and email. But Unicode had no half-stars, so displaying one required special images or fonts. I recently co-wrote a proposal to add half-stars to Unicode, and it has just been accepted. Starting with the next Unicode release, a half-star can be used as easily as any other character. In this article I describe how I got half-stars and two other characters added to Unicode.


The four half-star variants, each shown in a 3.5-star rating

Unicode is the computing standard that defines the characters used by almost every computer. It lets different computers display text in almost any language, with almost any character you might need (before Unicode, working with text in languages other than English was terribly confusing). But Unicode does not include everything. Last June, a commenter on Hacker News complained that Unicode has no half-star symbol, which is needed for movie ratings and reviews.

I suggested that someone should write a proposal to add this symbol, but I quickly realized that I needed to be that someone. Since I had already successfully added two characters to Unicode, I was familiar with the process.

A few years ago, a detailed article described how two people got the power symbols added to Unicode. Adding a new character to Unicode is easier than you might think. You don't need to pay money, work for a large company, or join a committee. You just need to write a proposal explaining why the character should be included. If the Unicode committee agrees, they will approve the addition.

In 2015, I began programming a 1960s IBM 1401 mainframe at the Computer History Museum. But when I wrote about this system, I ran into a problem. The computer used a 6-bit character set (a predecessor of EBCDIC) with several strange characters. All of them were in Unicode except one: the group mark. I was shocked that Unicode, with its 128,172 characters, lacked the character I needed. After reading about the success of the group that added the power symbols, I decided it would be interesting to see if I could get the group mark added to Unicode. I wrote a proposal and sent it to the committee, and it was approved at the next meeting.


Description of the group mark from an IBM 705 manual

A few months later, I discovered that Unicode had no symbol for Bitcoin. This was unexpected, since the symbol is widely used. An earlier proposal for it had been rejected, so in October 2015 I wrote a more thorough one, with active support from /r/bitcoin and other groups. The Unicode committee accepted this proposal in November 2015.


So when I saw the comment about half-stars on Hacker News, I figured it would be fairly simple to get them into Unicode. After discussions on HN and on the Unicode mailing list, I wrote a proposal. The committee considered it in August 2016, but to my surprise they had received another, similar proposal and decided to wait for a single combined one. It turned out that Andrew West had also written a proposal for half-stars, and we had submitted ours independently. So we joined forces and wrote a combined proposal, which the committee accepted on September 30, 2016.

Why did we propose four different half-stars? We included both outlined and solid stars, because both styles are in common use (I was not sure the committee would consider them different enough to include both, but they did). In right-to-left languages such as Hebrew, star ratings are also written right to left (which surprised me), so we also included mirrored versions of the half-stars for those languages. Together, the four variants cover all the use cases.
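To make the four variants concrete, here is a minimal Python sketch of how a rating might be rendered with them. It assumes the half-stars were assigned the code points U+2BE8 through U+2BEB (those numbers are not given in this article), and the function name `render_rating` and its parameters are made up for illustration. Note that a right-to-left rating keeps the same logical order and only swaps in the mirrored half-star; the display order is left to the text renderer.

```python
# Assumed code points (not stated in the article):
FULL_STAR = "\u2605"   # BLACK STAR
EMPTY_STAR = "\u2606"  # WHITE STAR
HALF_STARS = {
    # (style, right_to_left) -> half-star character
    ("outlined", False): "\u2BEA",  # STAR WITH LEFT HALF BLACK
    ("outlined", True): "\u2BEB",   # STAR WITH RIGHT HALF BLACK
    ("solid", False): "\u2BE8",     # LEFT HALF BLACK STAR
    ("solid", True): "\u2BE9",      # RIGHT HALF BLACK STAR
}


def render_rating(rating, out_of=5, style="outlined", rtl=False):
    """Return a star string for a rating, rounded to the nearest half star."""
    halves = round(rating * 2)
    if not 0 <= halves <= 2 * out_of:
        raise ValueError("rating out of range")
    full, half = divmod(halves, 2)
    stars = FULL_STAR * full
    if half:
        stars += HALF_STARS[(style, rtl)]
    if style == "outlined":
        # The outlined style pads with empty stars; the solid style does not.
        stars += EMPTY_STAR * (out_of - full - half)
    return stars


if __name__ == "__main__":
    print(render_rating(3.5))                  # outlined, left-to-right
    print(render_rating(3.5, style="solid"))   # solid, left-to-right
    print(render_rating(3.5, rtl=True))        # outlined, for right-to-left text
```

Whether the half-stars actually display depends on the font in use, so expect placeholder boxes until fonts catch up.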


If there is a character you would like to see in Unicode and it meets the criteria, you should write a proposal, because the process is straightforward and fun. First make sure the character meets the criteria. In particular, you need to find plenty of examples of it being used in text. The Unicode committee will not add a character just because you think it is cool, so you will need evidence of actual use. The hardest part is creating a font to show the new character; I used FontForge. The power symbol team offers a lot of advice on making a successful proposal, and I am happy to give advice too.

Note that the process for emoji is very different, so don't argue that "there's an emoji for poop, so my symbol deserves to exist too" (that symbol was added for backward compatibility with Japanese mobile phones). For emoji, the expected popularity of the character is the main factor in approval. Unicode proper, by contrast, is concerned not with popularity but with use in text: the historical Tangut script will not have a millionth of the popularity of a new emoji. I got the feeling that many members of the Unicode committee would rather not deal with emoji at all.

Once a character is accepted, it still has a long way to go before it shows up in fonts and everyday use. A new version of Unicode is released every June, so the half-stars should appear in Unicode 11.0 in mid-2018. The Bitcoin community has had an especially long wait: the Bitcoin symbol narrowly missed the Unicode 9.0 release, so it should appear in Unicode 10.0 in mid-2017. So, if you are patient, you will eventually see the group mark, the Bitcoin symbol, and the half-stars on web pages alongside other characters.
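One rough way to see whether a given environment has caught up is to ask its Unicode database whether it knows a character's name. The sketch below uses Python's standard `unicodedata` module. The code points are my assumption about where the characters ended up (U+2BD2 for the group mark, U+20BF for the Bitcoin sign, U+2BE8 for one of the half-stars), and the helper name `character_support` is invented for this example.

```python
import unicodedata

# Assumed code points (not stated in the article).
CANDIDATES = {
    0x2BD2: "group mark",
    0x20BF: "Bitcoin sign",
    0x2BE8: "left half black star",
}


def character_support():
    """Map each code point to its official name, or None if this Python's
    Unicode database does not include the character yet."""
    return {cp: unicodedata.name(chr(cp), None) for cp in CANDIDATES}


if __name__ == "__main__":
    print("Unicode database version:", unicodedata.unidata_version)
    for cp, name in character_support().items():
        print(f"U+{cp:04X}: {name or 'not in this Unicode version'}")
```

A name in the database only means the character is defined; a font still has to provide a glyph before it displays properly.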

★★★★★

Source: https://habr.com/ru/post/398145/

