Code that does not exist

Hello, Habr readers!

About a year ago, Habr was swept by a wave of posts on the theme "%string% in N lines of JavaScript". I do not even remember how it all ended, but it started with "Excel in 30 lines". Many other interesting variations on the theme appeared, even a game in zero lines of JS, but that is a completely different story...

No matter how hard I tried to come up with something more compact, nothing came of it. So I decided to look at the problem from a different angle. Around then a question flashed through my head: is it possible to "collapse" the code so that it does not exist at all? ~~And then David Blaine called me.~~

I tried to add some magic, and here is what came of it.

The "Dispeller" code

The task is to write code that... is not there. And yet it must be able to do something. Obviously, any such trick needs an accompanying function that can interpret it, so hiding the code entirely will, alas, not work. Shrinking that function to a couple of lines, however, is easy.

Many people know or have heard that typography includes non-printing characters, i.e. ones that are effectively invisible. This is not a bug or a quirk but completely normal behavior: they are meant to be invisible. Today one of the generally accepted, standardized text encodings is UTF-8, which is used on almost every modern website. Conveniently, the Unicode character set it encodes contains a whole bunch of invisible characters! One of them is Zero Width Space (U+200B). Here it is: "​". Do you see it? No? And yet it is there.
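To make the point concrete, here is a minimal sketch that builds a string containing a Zero Width Space and compares it with its visible twin. The variable names are illustrative only.

```javascript
// U+200B ZERO WIDTH SPACE, written as an escape so it stays visible in the source.
const zwsp = '\u200B';

const plain = 'ab';
const padded = 'a' + zwsp + 'b'; // renders like "ab" in most fonts

console.log(plain.length);   // 2
console.log(padded.length);  // 3
console.log(plain === padded); // false

// The hidden character can still be inspected by its code point.
console.log(padded.codePointAt(1).toString(16)); // "200b"
```

The two strings look the same on screen, but any comparison, length check, or code point lookup tells them apart.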

David Blaine Method

For those who want to try it hands-on, here is a link to an example from a year ago: a working demo. Several months after my note in Habr's sandbox, I happened upon a post that touched on this idea (method number three), but without the twist.

In my version, the encoding was done in the most primitive way. The downside: the file size grows a lot. The upside: only two characters are needed for encoding. It looked like this:

var code = '1101101111110111111111111111110101101101111101111';
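A two-character scheme of this kind could be sketched as follows. The choice of U+200B and U+200C, the fixed 16 bits per character, and the function names are all assumptions made for illustration. The post does not say which characters or bit layout the original demo used.

```javascript
// Assumed mapping: bit 0 -> ZERO WIDTH SPACE, bit 1 -> ZERO WIDTH NON-JOINER.
const ZERO = '\u200B';
const ONE = '\u200C';

// Encode every UTF-16 code unit as 16 invisible "bits".
function toInvisibleBits(text) {
  let out = '';
  for (let i = 0; i < text.length; i++) {
    const bits = text.charCodeAt(i).toString(2).padStart(16, '0');
    out += bits.replace(/0/g, ZERO).replace(/1/g, ONE);
  }
  return out;
}

// Decode groups of 16 invisible "bits" back into characters.
function fromInvisibleBits(hidden) {
  let out = '';
  for (let i = 0; i < hidden.length; i += 16) {
    const bits = hidden
      .slice(i, i + 16)
      .split(ZERO).join('0')
      .split(ONE).join('1');
    out += String.fromCharCode(parseInt(bits, 2));
  }
  return out;
}

const hiddenCode = toInvisibleBits('alert(1)');
console.log(hiddenCode.length); // 128: each of the 8 characters costs 16 invisible ones
```

With this layout the payload grows 16x, which illustrates the file-size downside mentioned above.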

After quite a long time, I returned to this topic as part of a project I was working on. I tried to go further, starting with the idea that each character is now encoded not with ones and zeros but with four characters, one per hexadecimal digit of its code:

 "f".charCodeAt(0).toString(16); // "66"
 // so "f" is 0x0066
 String.fromCharCode("0x0066"); // "f"

As a result, with a set of 16 characters you can cut down the bloat:

 var Symbols = ["","","","","","","","","","","","","","","",""]; //    "f": var bar = invisibleJS("f"); // bar = "";

In this example the size overhead drops to 4x (four characters to encode one). In theory, if you do not need Russian or other non-Latin characters, you can get down to 2x, because the codes of Latin characters fit into two hexadecimal digits.
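The 4x and 2x variants can be sketched with one pair of functions. This is an illustration, not the author's obfuscator, which the post does not show. The alphabet below and the names hideText and showText are assumptions.

```javascript
// Assumed 16-symbol alphabet: the array index is the hex digit value.
const ALPHABET = [
  '\u2061', '\u200C', '\u200D', '\u200E', '\u200F', '\u202A', '\u202B', '\u202C',
  '\u202D', '\u202E', '\u206A', '\u206B', '\u206C', '\u206D', '\u206E', '\u206F',
];

// Encode each UTF-16 code unit as `width` invisible hex digits (4 by default).
function hideText(text, width = 4) {
  let out = '';
  for (let i = 0; i < text.length; i++) {
    const hex = text.charCodeAt(i).toString(16).padStart(width, '0');
    if (hex.length > width) {
      throw new RangeError('character does not fit in ' + width + ' hex digits');
    }
    for (const digit of hex) out += ALPHABET[parseInt(digit, 16)];
  }
  return out;
}

// Decode groups of `width` invisible hex digits back into characters.
function showText(hidden, width = 4) {
  let out = '';
  for (let i = 0; i < hidden.length; i += width) {
    let code = 0;
    for (const ch of hidden.slice(i, i + width)) {
      code = code * 16 + ALPHABET.indexOf(ch);
    }
    out += String.fromCharCode(code);
  }
  return out;
}

console.log(hideText('f').length);    // 4 (the 4x variant)
console.log(hideText('f', 2).length); // 2 (the 2x variant, Latin-only text)
```

The two-digit variant throws on anything above U+00FF, which is the price of the smaller overhead.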

Examples in the studio

Take this code as an example:

 alert("Hello world!");

After feeding this code to the obfuscator (I will not show or analyze its code, there is nothing interesting in it), the output is something like:

 var helloworld = "⁡⁡‫‌⁡⁡‫⁬⁡⁡‫‪⁡⁡‬‍⁡⁡‬‏⁡⁡‍‭⁡⁡‍‍⁡⁡‏‭⁡⁡‫‪⁡⁡‫⁬⁡⁡‫⁬⁡⁡‫⁯⁡⁡‍⁡⁡⁡‬‬⁡⁡‫⁯⁡⁡‬‍⁡⁡‫⁬⁡⁡‫‏⁡⁡‍‌⁡⁡‍‍⁡⁡‍‮";

Note that the semicolon appears to be inside the quotation marks, although in reality it is not (you can check this in almost any text editor, for example Sublime). On the one hand, this adds +5 to obfuscation, misleads the reader, and threatens a mild brain-fuck. On the other hand, a "proper" editor will not honor the characters that affect text direction (left-to-right, right-to-left).

This is what the decoding function looks like:

 var revealJS = function(s){return s.match(/(.{4})/g).map(function(b){return b.split('').map(function(i){return Array.apply(null,{length:10}).map(Number.call,Number).concat('abcdef'.split(''))['⁡‌‍‎‏‪‫‬‭‮⁪⁫⁬⁭⁮⁯'.split('').indexOf(i)]})}).map(function(c){return String.fromCharCode(0+"x"+c.join(''))}).join('')}
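Two idioms in this one-liner are easy to miss, so here they are in isolation. This is a minimal sketch with illustrative variable names. It shows only how the hex digit table and the character code are built, not the full decoder.

```javascript
// Idiom 1: Array.apply(null, {length: 10}) creates ten undefined slots;
// map(Number.call, Number) then calls Number(index) for each slot.
const digits = Array.apply(null, { length: 10 }).map(Number.call, Number);
console.log(digits); // [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

// Appending the letters gives a 16-entry table indexed by hex digit value.
const hexTable = digits.concat('abcdef'.split(''));
console.log(hexTable.length); // 16

// Idiom 2: 0 + "x" + "0066" concatenates to the string "0x0066",
// which String.fromCharCode converts to the number 0x66.
const literal = 0 + 'x' + [0, 0, 6, 6].join('');
console.log(literal);                      // "0x0066"
console.log(String.fromCharCode(literal)); // "f"
```

In the decoder, the position of each invisible character in the alphabet string is used as an index into this table, and every group of four digits becomes one character.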

I am more than sure that the code could be better, smaller, and more elegant, but that is not the goal of this post. Note that at the end of the line the code again runs right to left. This quirk can be eliminated by choosing a few different invisible characters.

Now you can "reveal" the invisible code:

 var helloworld = "⁡⁡‫‌⁡⁡‫⁬⁡⁡‫‪⁡⁡‬‍⁡⁡‬‏⁡⁡‍‭⁡⁡‍‍⁡⁡‏‭⁡⁡‫‪⁡⁡‫⁬⁡⁡‫⁬⁡⁡‫⁯⁡⁡‍⁡⁡⁡‬‬⁡⁡‫⁯⁡⁡‬‍⁡⁡‫⁬⁡⁡‫‏⁡⁡‍‌⁡⁡‍‍⁡⁡‍‮"; revealJS(helloworld); // "alert("Hello world!")" eval(revealJS(helloworld)); // !

You can copy it from here into the console or view it here.

(The "revealer" code shown here is tailored to the specific characters used in the "hiding" function. By swapping these characters and/or using others, the number of possible "code tables" grows enormously.)

All this has one big drawback: you can peek at the code executed via eval(). Moreover, the console will even show the file and line from which that code runs.

This can be fixed. Because each character is encoded with all four hex digits (as described above), the invisible characters themselves, whose code points lie above U+00FF, can be encoded again, so obfuscation inside obfuscation becomes possible. Xzibit approves:

 //  ... alert("Hello world!"); // ... window["alert"]("Hello world!"); // ... window[revealJS("⁡⁡‫‌⁡⁡‫⁬⁡⁡‫‪⁡⁡‬‍⁡⁡‬‏")](revealJS("⁡⁡‏‭⁡⁡‫‪⁡⁡‫⁬⁡⁡‫⁬⁡⁡‫⁯⁡⁡‍⁡⁡⁡‬‬⁡⁡‫⁯⁡⁡‬‍⁡⁡‫⁬⁡⁡‫‏⁡⁡‍‌")) // ...   : "⁡⁡‬‬⁡⁡‫‮⁡⁡‫⁮⁡⁡‫‏⁡⁡‫⁯⁡⁡‬‬⁡⁡‪⁫⁡⁡‫‌⁡⁡‬‎⁡⁡‫⁯⁡⁡‍‭⁡⁡‍‍‍⁡‫‌‍⁡‫‌‍⁡‍⁫‍⁡⁡⁬‍⁡‫‌‍⁡‫‌‍⁡‍⁫‍⁡‫⁬‍⁡‫‌‍⁡‫‌‍⁡‍⁫‍⁡‍⁪‍⁡‫‌‍⁡‫‌‍⁡‍⁬‍⁡⁡⁭‍⁡‫‌‍⁡‫‌‍⁡‍⁬‍⁡⁡⁯⁡⁡‍‍⁡⁡‍‮⁡⁡‪⁭⁡⁡‍‭⁡⁡‫‌⁡⁡‬‎⁡⁡‫⁯⁡⁡‍‭⁡⁡‍‍‍⁡‫‌‍⁡‫‌‍⁡⁡⁯‍⁡‍⁭‍⁡‫‌‍⁡‫‌‍⁡‍⁫‍⁡‍⁪‍⁡‫‌‍⁡‫‌‍⁡‍⁫‍⁡‫⁬‍⁡‫‌‍⁡‫‌‍⁡‍⁫‍⁡‫⁬‍⁡‫‌‍⁡‫‌‍⁡‍⁫‍⁡‫⁯‍⁡‫‌‍⁡‫‌‍⁡⁡⁭‍⁡‫‌‍⁡‫‌‍⁡‫‌‍⁡‍⁬‍⁡‍⁬‍⁡‫‌‍⁡‫‌‍⁡‍⁫‍⁡‫⁯‍⁡‫‌‍⁡‫‌‍⁡‍⁬‍⁡⁡⁭‍⁡‫‌‍⁡‫‌‍⁡‍⁫‍⁡‫⁬‍⁡‫‌‍⁡‫‌‍⁡‍⁫‍⁡⁡⁯‍⁡‫‌‍⁡‫‌‍⁡⁡⁭‍⁡⁡⁬⁡⁡‍‍⁡⁡‍‮⁡⁡‍‮"

You can see it here. By the way, nothing stops you from obfuscating the code three or even four times over.

Now the resulting code looks like this:

Better already, but something more can be done. At the beginning of the article I set the task of hiding the code as if it were not there. As things stand, you only have to open the console to see that something is off. Fixing this turned out to be quite simple: instead of eval(), use:

 var script = document.createElement("script"); script.innerHTML = revealJS("⁡⁡‬‬⁡⁡‫‮⁡⁡‫⁮⁡⁡‫‏⁡⁡‫⁯⁡⁡‬‬⁡⁡‪⁫⁡⁡‬‍⁡⁡‫‪⁡⁡‬‫⁡⁡‫‪⁡⁡‫‌⁡⁡‫⁬⁡⁡‏⁪⁡⁡‪‎⁡⁡‍‭⁡⁡‍‍‍⁡‫‌‍⁡‫‌‍⁡‍⁫‍⁡⁡⁬‍⁡‫‌‍⁡‫‌‍⁡‍⁫‍⁡‫⁬‍⁡‫‌‍⁡‫‌‍⁡‍⁫‍⁡‍⁪‍⁡‫‌‍⁡‫‌‍⁡‍⁬‍⁡⁡⁭‍⁡‫‌‍⁡‫‌‍⁡‍⁬‍⁡⁡⁯⁡⁡‍‍⁡⁡‍‮⁡⁡‪⁭⁡⁡‍‭⁡⁡‬‍⁡⁡‫‪⁡⁡‬‫⁡⁡‫‪⁡⁡‫‌⁡⁡‫⁬⁡⁡‏⁪⁡⁡‪‎⁡⁡‍‭⁡⁡‍‍‍⁡‫‌‍⁡‫‌‍⁡⁡⁯‍⁡‍⁭‍⁡‫‌‍⁡‫‌‍⁡‍⁫‍⁡‍⁪‍⁡‫‌‍⁡‫‌‍⁡‍⁫‍⁡‫⁬‍⁡‫‌‍⁡‫‌‍⁡‍⁫‍⁡‫⁬‍⁡‫‌‍⁡‫‌‍⁡‍⁫‍⁡‫⁯‍⁡‫‌‍⁡‫‌‍⁡⁡⁭‍⁡‫‌‍⁡‫‌‍⁡‫‌‍⁡‍⁬‍⁡‍⁬‍⁡‫‌‍⁡‫‌‍⁡‍⁫‍⁡‫⁯‍⁡‫‌‍⁡‫‌‍⁡‍⁬‍⁡⁡⁭‍⁡‫‌‍⁡‫‌‍⁡‍⁫‍⁡‫⁬‍⁡‫‌‍⁡‫‌‍⁡‍⁫‍⁡⁡⁯‍⁡‫‌‍⁡‫‌‍⁡⁡⁭‍⁡⁡⁬⁡⁡‍‍⁡⁡‍‮⁡⁡‍‮"); document.getElementsByTagName('body')[0].appendChild(script); //      ,      document.getElementsByTagName('body')[0].removeChild(script);
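Written out step by step, the snippet above amounts to the helper below. This is a sketch under assumptions: the name runHidden is hypothetical, the document is passed in as a parameter so the function can be exercised outside a browser, and textContent is used where the original uses innerHTML.

```javascript
// Hypothetical helper: run `source` through a temporary <script> element.
// `doc` is expected to behave like the browser's `document`.
function runHidden(doc, source) {
  const script = doc.createElement('script');
  script.textContent = source;

  // Browsers run an inline script when it is inserted into the page.
  doc.body.appendChild(script);

  // Remove the element right away so it does not linger in the DOM.
  doc.body.removeChild(script);
}
```

In a browser it would be called as runHidden(document, revealJS(hiddenString)), where hiddenString stands for an encoded payload such as the one shown above.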

Unfortunately, when this code runs on jsFiddle, the hidden code still shows up. I suspect this is because the code in jsFiddle's JavaScript pane is itself passed through eval() or something similar. When tested in a local project with no such "miracles", everything works as it should and the executed function does not reveal itself.

Areas of use

1. For fun.
2. A means of code obfuscation.
3. Use together with other code minification and obfuscation tools, for example Google Closure Compiler or UglifyJS.
4. Covert communication in open channels.
5. "Sleeper" scripts and "implants" in articles, forum posts, bulletin boards, contextual ads, and in general anywhere users can write something that then ends up in other users' browsers.

The last two points are not so simple, but I could not leave them out. Let me expand on the idea a little. While experimenting, I found that, for example, Gmail and Yandex.Mail do not strip such characters. Some are converted into entities such as &zwj; and &lrm;, but the rest remain invisible. I think the situation in email clients is similar (I checked Thunderbird): nothing is visible. So an email can carry a hidden message that does not reveal itself in any way under "visual inspection". Moreover, even if the strange hidden characters are detected, only someone who has the decoding algorithm can decipher the message (and the algorithm, generally speaking, can be kept in your head and typed straight into the browser console):

 ,  ⁡⁡‍⁬⁡⁡‍⁡⁡‏‏‌⁡‏‎‪⁡‏‎‎⁡‏‎⁮⁡‏‎‏⁡‏‎⁭⁡‏‏⁯⁡⁡‍⁡⁡‏‎‍⁡⁡‍⁡⁡‏‎⁯⁡‏‏⁯⁡‏‏‍⁡‏‏⁬⁡⁡‍⁬⁡⁡‍⁡⁡‏‎⁯⁡‏‏⁡⁡‏‎‭⁡‏‏‪⁡‏‎⁮⁡‏‎‏⁡‏‎‭⁡⁡‍⁡⁡‏‎⁮⁡‏‎‏⁡‏‎‭⁡‏‎⁭?

But in fact it says the following (apply revealJS() to the part between the letter "a" and the question mark):

   /*,   ,  */?

Thus, covert communication in open channels is possible, and no one will suspect a thing (unless they search on purpose, but that is a different conversation). In general, one of the main advantages, and maybe the only one, is that the content passes "visual inspection". Even that is not entirely smooth: change the encoding and everything is exposed. It is like an invisible man versus a video surveillance system. Who knows, maybe the whole Internet has long been stuffed with such messages? :)
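Searching on purpose can be as simple as a regular expression over the suspicious code point ranges. A minimal sketch follows. The name findInvisible and the chosen ranges are assumptions, and the ranges are not an exhaustive list of invisible characters.

```javascript
// Assumed ranges: zero-width and directional marks (U+200B-U+200F),
// bidi embeddings and overrides (U+202A-U+202E), invisible operators and
// deprecated format characters (U+2060-U+206F), and U+FEFF.
const INVISIBLE = /[\u200B-\u200F\u202A-\u202E\u2060-\u206F\uFEFF]/g;

// Report the position and code point of every invisible character found.
function findInvisible(text) {
  const hits = [];
  for (const match of text.matchAll(INVISIBLE)) {
    const hex = match[0].charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
    hits.push({ index: match.index, codePoint: 'U+' + hex });
  }
  return hits;
}

console.log(findInvisible('plain text').length); // 0
console.log(findInvisible('a\u200Bb'));
// [ { index: 1, codePoint: 'U+200B' } ]
```

A message that passes visual inspection does not pass this kind of check, which is the limitation noted above.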

As for hidden scripts, everything is obvious: if malicious JS code containing a "revealer" and a "launcher" gets into your browser in any way, then the code hidden on the page will run with some degree of probability. There are plenty of ways in. For example, many sites load libraries from external sources (take jQuery, which was hacked not so long ago), and there are popular browser extensions such as AdBlock. So the scheme is this: many different "vectors", each aimed in its own direction, and one single "activator", which is quite tiny.

Thanks for your attention.

P.S. A small bonus: if you put the character U+202E (Right-To-Left Override) in quotes at the beginning of a line in the script, fun ensues. The code still works:

 "‮";var revealJS = function(s){return s.match(/(.{4})/g).map(function(b){return b.split('').map(function(i){return Array.apply(null,{length:10}).map(Number.call,Number).concat('abcdef'.split(''))['⁡‌‍‎‏‪‫‬‭‮⁪⁫⁬⁭⁮⁯'.split('').indexOf(i)]})}).map(function(c){return String.fromCharCode(0+"x"+c.join(''))}).join('')}

Source: https://habr.com/ru/post/243351/
