
Emoji?! No, never heard of them

Emoji have long been part of our lives. On social networks and in all kinds of instant messengers we use them without a second thought, expressing an emotion with a single symbol. But for a cross-platform application, sending and displaying emoji is not an easy task: emoji sent from mobile apps do not always display correctly on websites.

The latest versions of iOS and Android support more than 1,200 emoji characters, but desktop platforms cannot boast the same. At Badoo we do everything we can so that users can communicate comfortably on every platform, with no restrictions on what they write.
Below I will explain how we achieved 100% emoji support on the web.


This is how a Windows user would see a message in a browser without emoji:
image

The basic idea is to take each emoji character, determine its Unicode code point, and replace it with an HTML element that every browser displays correctly.

Theory


Consider 😀 (a smiling face). Its code point is U+1F600. Here is how to get that code in JavaScript:

'😀'.length // 2
'😀'.charCodeAt(0).toString(16) // 'd83d'
'😀'.charCodeAt(1).toString(16) // 'de00'

The result is a surrogate pair: U+D83D U+DE00.

UTF-16 encodes characters as a sequence of 16-bit words and can represent Unicode characters in the ranges U+0000 to U+D7FF and U+E000 to U+10FFFF (1,112,064 in total). A character with a code above U+FFFF is written as two words: the first half of the surrogate pair (in the range 0xD800 to 0xDBFF) and the second half (in the range 0xDC00 to 0xDFFF).

To get the code of an emoji above U+FFFF from its surrogate pair, we use the formula:

(0xD83D - 0xD800) * 0x400 + (0xDE00 - 0xDC00) + 0x10000 = 0x1F600

Now let's convert back:

 0xD83D = ((0x1F600 - 0x10000) >> 10) + 0xD800
 0xDE00 = ((0x1F600 - 0x10000) % 0x400) + 0xDC00
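The two formulas can be wrapped in a pair of small helpers. This is only a minimal sketch: the function names are hypothetical and do not come from the original code, and the arithmetic is exactly the one shown above.

```javascript
// Hypothetical helper names; the arithmetic is the formula from the text.

// Combine a high and a low surrogate into a single code point.
function surrogatePairToCodePoint(high, low) {
  return (high - 0xD800) * 0x400 + (low - 0xDC00) + 0x10000;
}

// Split a code point above U+FFFF into [high, low] surrogates.
function codePointToSurrogatePair(codePoint) {
  var offset = codePoint - 0x10000;
  return [(offset >> 10) + 0xD800, (offset % 0x400) + 0xDC00];
}

var codePoint = surrogatePairToCodePoint(0xD83D, 0xDE00);
console.log(codePoint.toString(16)); // 1f600

var pair = codePointToSurrogatePair(0x1F600);
console.log(pair[0].toString(16), pair[1].toString(16)); // d83d de00
```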


This is rather cumbersome, so let's see what ES2015 has to offer.

With the new JavaScript standard you can forget about surrogate pairs and make your life easier:

 String.prototype.codePointAt // returns the code point at a given position in a string
 String.fromCodePoint // builds a string from one or more code points

Both methods work correctly with surrogate pairs.
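A short illustration of both methods on the same smiling face. The variable names are chosen here purely for the example.

```javascript
// The emoji U+1F600 written as its surrogate pair.
var face = '\uD83D\uDE00';

// At index 0 codePointAt sees the whole pair and returns the full code point.
var full = face.codePointAt(0).toString(16); // '1f600'

// At index 1 it lands on the second half of the pair and returns only that.
var tail = face.codePointAt(1).toString(16); // 'de00'

// fromCodePoint goes the other way and produces the surrogate pair for us.
var rebuilt = String.fromCodePoint(0x1F600);

console.log(full, tail, rebuilt === face); // 1f600 de00 true
```

Note that codePointAt still takes an index in 16-bit units, which is why index 1 returns the second surrogate rather than the next character.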

Code point escapes in string literals let you write the full code directly:
\u{1F466} instead of \uD83D\uDC66

RegExp.prototype.unicode: with the u flag a regular expression works on whole code points rather than 16-bit units, and supports \u{...} escapes:

 /\u{1F466}/u 
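A few checks that show what the u flag changes in practice. This is a sketch for illustration; the variable names are not from the original article.

```javascript
// The boy emoji, U+1F466, written with a code point escape.
var boy = '\u{1F466}';

// Without the flag "." matches a single 16-bit unit, so one emoji is "two characters".
var withoutFlag = /^.$/.test(boy); // false

// With the flag "." matches a whole code point.
var withFlag = /^.$/u.test(boy); // true

// \u{...} in the pattern matches the surrogate-pair spelling of the same emoji.
var escapeMatch = /\u{1F466}/u.test('\uD83D\uDC66'); // true

// Character classes can use ranges above U+FFFF directly.
var rangeMatch = /[\u{1F600}-\u{1F64F}]/u.test('\u{1F600}'); // true

console.log(withoutFlag, withFlag, escapeMatch, rangeMatch); // false true true true
```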


At the moment, the Unicode 8.0 standard contains 1,281 emoji characters, not counting skin tone modifiers and groups (such as the family emoji). Well-known companies each provide their own implementations:
image


Emoji can be divided into several groups:



Solution:


  1. take the source text and use a regular expression to find every emoji sequence in it;
  2. determine each character's code with the codePointAt function;
  3. create an img element (it is important that this is an img tag) whose URL is built from that code;
  4. replace the character with the img in the source text.

    function emojiToHtml(str) {
        // Remove variation selector-16 (U+FE0F) before matching
        str = str.replace(/\uFE0F/g, '');
        return str.replace(emojiRegex, buildImgFromEmoji);
    }

    // $tpl is a template helper that is not shown in the article;
    // it is assumed to substitute the {name} placeholders.
    var tpl = '<img class="emoji emoji--{code} js-smile-insert" src="{src}" srcset="{src} 1x, {src_x2} 2x" unselectable="on">';
    var url = 'https://badoocdn.com/big/chat/emoji/{code}.png';
    var url2 = 'https://badoocdn.com/big/chat/emoji@x2/{code}.png';

    function buildImgFromEmoji(emoji) {
        var codePoint = extractEmojiToCodePoint(emoji);
        return $tpl(tpl, {
            code: codePoint,
            src: $tpl(url, { code: codePoint }),
            src_x2: $tpl(url2, { code: codePoint })
        });
    }

    // Turns an emoji sequence into its hex code points joined by "-"
    function extractEmojiToCodePoint(emoji) {
        return emoji
            .split('')
            .map(function (symbol, index) {
                return emoji.codePointAt(index).toString(16);
            })
            // Drop the lone second halves of surrogate pairs
            .filter(function (codePoint) {
                return !isSurrogatePair(codePoint);
            }, this)
            .join('-');
    }

    // True if the hex code is a surrogate (0xD800 to 0xDFFF), not a real character
    function isSurrogatePair(codePoint) {
        codePoint = parseInt(codePoint, 16);
        return codePoint >= 0xD800 && codePoint <= 0xDFFF;
    }
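To make the resulting image names concrete, here is a sketch of the code-extraction step on its own. It is an alternative formulation, not the code used above: the function name emojiToCodeString is hypothetical, and it relies on Array.from, which iterates a string by code points, so no surrogate filtering is needed.

```javascript
// Hypothetical alternative to extractEmojiToCodePoint, for illustration only.
function emojiToCodeString(emoji) {
  // Array.from splits the string into whole code points, not 16-bit units.
  return Array.from(emoji)
    .map(function (symbol) {
      return symbol.codePointAt(0).toString(16);
    })
    .join('-');
}

var face = emojiToCodeString('\uD83D\uDE00');               // '1f600'
var flag = emojiToCodeString('\uD83C\uDDEC\uD83C\uDDE7');   // '1f1ec-1f1e7'
var keycap = emojiToCodeString('#\u20E3');                  // '23-20e3'

console.log(face, flag, keycap);
```

A single emoji gives one code, while sequences such as flags or keycaps give several codes joined by a hyphen, and that string is what ends up in the image URL and the CSS class.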


The regular expression that finds emoji characters is built from a list of alternatives:

    var emojiRanges = [
        '(?:\uD83C[\uDDE6-\uDDFF]){2}', // flags (two regional indicators)
        '[\u0023-\u0039]\u20E3', // keycaps
        '(?:[\uD83D\uD83C\uD83E][\uDC00-\uDFFF]|[\u270A-\u270D\u261D\u26F9])\uD83C[\uDFFB-\uDFFF]', // emoji with a skin tone modifier
        '\uD83D[\uDC68\uDC69][\u200D\u200C].+?\uD83D[\uDC66-\uDC69](?![\u200D\u200C])', // families
        '[\uD83D\uD83C\uD83E][\uDC00-\uDFFF]', // any single surrogate pair
        '[\u3297\u3299\u303D\u2B50\u2B55\u2B1B\u27BF\u27A1\u24C2\u25B6\u25C0\u2600\u2705\u21AA\u21A9]', // other symbols
        '[\u203C\u2049\u2122\u2328\u2601\u260E\u261d\u2620\u2626\u262A\u2638\u2639\u263a\u267B\u267F\u2702\u2708]',
        '[\u2194-\u2199]',
        '[\u2B05-\u2B07]',
        '[\u2934-\u2935]',
        '[\u2795-\u2797]',
        '[\u2709-\u2764]',
        '[\u2622-\u2623]',
        '[\u262E-\u262F]',
        '[\u231A-\u231B]',
        '[\u23E9-\u23EF]',
        '[\u23F0-\u23F4]',
        '[\u23F8-\u23FA]',
        '[\u25AA-\u25AB]',
        '[\u25FB-\u25FE]',
        '[\u2602-\u2618]',
        '[\u2648-\u2653]',
        '[\u2660-\u2668]',
        '[\u26A0-\u26FA]',
        '[\u2692-\u269C]'
    ];

    var emojiRegex = new RegExp(emojiRanges.join('|'), 'g');
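The order of the alternatives matters, because the regular expression engine takes the first alternative that matches at a given position. The sketch below uses only three of the patterns from the list (the names sampleRanges and sampleRegex are made up for the example) to show that a flag is captured as one sequence rather than as two separate surrogate pairs.

```javascript
// A reduced copy of the list above: flags, keycaps, single surrogate pairs.
var sampleRanges = [
  '(?:\uD83C[\uDDE6-\uDDFF]){2}',
  '[\u0023-\u0039]\u20E3',
  '[\uD83D\uD83C\uD83E][\uDC00-\uDFFF]'
];
var sampleRegex = new RegExp(sampleRanges.join('|'), 'g');

// A smiling face, the flag made of regional indicators G and B, and the # keycap.
var text = 'Hi \uD83D\uDE00 from \uD83C\uDDEC\uD83C\uDDE7 #\u20E3';
var found = text.match(sampleRegex);

console.log(found.length); // 3
```

If the single-surrogate-pair pattern came first, the flag would be split into two matches and would be rendered as two separate images.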


Chat


Next, let's look at how to build a chat prototype with emoji support.

The input field is a contenteditable div:

 <div id="t" contenteditable="true" data-placeholder=" "></div> 


When the user types a message or pastes from the clipboard, we strip any HTML tags from the content, keeping only line breaks and our own emoji images:

    var tagRegex = /<[^>]+>/gim;
    var styleTagRegex = /<style\b[^>]*>([\s\S]*?)<\/style>/gim;
    var validTagsRegex = /<br[\s/]*>|<img\s+class="emoji\semoji[-\w\s]+"\s+((src|srcset|unselectable)="[^"]*"\s*)+>/i;

    function cleanUp(text) {
        return text
            .replace(styleTagRegex, '')
            .replace(tagRegex, function (tag) {
                return tag.match(validTagsRegex) ? tag : '';
            })
            .replace(/\n/g, '');
    }
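To show the whitelist in action, the sketch below repeats cleanUp with its three regular expressions so that it runs on its own, then feeds it a few sample strings. The sample inputs and the variable names are invented for the example.

```javascript
// Same expressions and function as above, repeated so the example is standalone.
var tagRegex = /<[^>]+>/gim;
var styleTagRegex = /<style\b[^>]*>([\s\S]*?)<\/style>/gim;
var validTagsRegex = /<br[\s/]*>|<img\s+class="emoji\semoji[-\w\s]+"\s+((src|srcset|unselectable)="[^"]*"\s*)+>/i;

function cleanUp(text) {
  return text
    .replace(styleTagRegex, '')
    .replace(tagRegex, function (tag) {
      return tag.match(validTagsRegex) ? tag : '';
    })
    .replace(/\n/g, '');
}

// Formatting tags and a foreign img are dropped, the line break is kept.
var mixed = cleanUp('<b>Hi</b><br><img src="x" onerror="alert(1)">');

// An img in exactly the emoji format passes through untouched.
var emojiImg = '<img class="emoji emoji--1f600 js-smile-insert" src="a.png" srcset="a.png 1x, b.png 2x" unselectable="on">';
var kept = cleanUp(emojiImg);

// Only the tags are removed; text between them stays as plain text.
var script = cleanUp('<script>alert(1)</script>ok');

console.log(mixed);  // Hi<br>
console.log(script); // alert(1)ok
```

Only the style element is removed together with its contents. For every other tag the markup is dropped and the inner text remains.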


To process text pasted from the clipboard, we handle the paste event:

    function onPaste(e) {
        e.preventDefault();
        var clp = e.clipboardData;
        if (clp !== undefined || window.clipboardData !== undefined) {
            var text;
            if (clp !== undefined) {
                text = clp.getData('text/html') || clp.getData('text/plain') || '';
            } else {
                // Fallback for Internet Explorer, which exposes the clipboard on window
                text = window.clipboardData.getData('text') || '';
            }
            if (text) {
                text = cleanUp(text);
                text = emojiToHtml(text);
                var el = document.createElement('span');
                el.innerHTML = text;
                el.innerHTML = el.innerHTML.replace(/\n/g, '');
                // t is the input div (id="t")
                t.appendChild(el);
                restore();
            }
        }
    }


Then we replace every emoji found with an img tag, as shown above. We use img specifically because contenteditable handles it best; other elements can cause bugs during editing.

After inserting the img into the input field, we need to restore the caret position so that the user can keep typing. For this we use the Selection and Range objects:

    function restore() {
        // Place the caret at the end of the input div
        var range = document.createRange();
        range.selectNodeContents(t);
        range.collapse(false);
        var sel = window.getSelection();
        sel.removeAllRanges();
        sel.addRange(range);
    }


Once the message has been typed, the reverse procedure is needed: each img is turned back into a character before sending to the server, using the fromCodePoint function:

    var htmlToEmojiRegex = /<img.*?class="emoji\semoji--(.+?)\sjs-smile-insert".*?>/gi;

    function htmlToEmoji(html) {
        return html.replace(htmlToEmojiRegex, function (imgTag, codesStr) {
            // "1f1ec-1f1e7" -> [0x1F1EC, 0x1F1E7]
            var codesInt = codesStr.split('-').map(function (codePoint) {
                return parseInt(codePoint, 16);
            });
            var emoji = String.fromCodePoint.apply(null, codesInt);
            // Keep only results that are real emoji
            return emoji.match(emojiRegex) ? emoji : '';
        });
    }
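The decoding step at the heart of htmlToEmoji can be tried in isolation. The sketch below pulls it out into a function with a hypothetical name, codesToEmoji, and leaves out the final emojiRegex check.

```javascript
// Hypothetical helper: the decoding step of htmlToEmoji on its own.
function codesToEmoji(codesStr) {
  var codePoints = codesStr.split('-').map(function (hex) {
    return parseInt(hex, 16);
  });
  // fromCodePoint accepts several code points and concatenates the results.
  return String.fromCodePoint.apply(null, codePoints);
}

var face = codesToEmoji('1f600');         // the smiling face
var flag = codesToEmoji('1f1ec-1f1e7');   // a flag from two regional indicators
var keycap = codesToEmoji('23-20e3');     // the # keycap

console.log(face === '\uD83D\uDE00');                 // true
console.log(flag === '\uD83C\uDDEC\uD83C\uDDE7');     // true
console.log(keycap === '#\u20E3');                    // true
```

Because the code string comes from the class attribute of user-editable markup, the original function also validates the decoded result against emojiRegex before accepting it.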


You can see an example of the chat here: https://jsfiddle.net/q9484hcc/

That is how we built emoji support so that our users can fully express their emotions and communicate with each other without restrictions. If you have ideas for improving or changing our approach, write in the comments; we will be happy to discuss them!

Useful links:
http://emojipedia.org/
http://getemoji.com/
Polyfill for String.fromCodePoint
Polyfill for String.prototype.codePointAt

Artem Kunets
Badoo frontend developer

Source: https://habr.com/ru/post/282113/

