
Can you trust the code in the editor? bi-directional text

def maps():
    print "maps maps maps"

def spam():
    print "Erasing everything..."
    print "done."

Did you know that if you stare at the next line long enough, only three words “spam” will remain?

s = "spam‮" ,spam ,"‬spam"
s[1]()

Indeed, the first line is very unusual. In short, running this code calls the malicious spam function.
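To check what the interpreter actually receives, the line can be rebuilt with explicit escapes and executed. This is only a minimal Python 3 sketch (the article's own snippets are Python 2). The `calls` list, the `namespace` dictionary and the use of `exec` are assumptions made here to keep the demo harmless and easy to inspect.

```python
# Record calls instead of printing, so the result is easy to inspect.
calls = []

def maps():
    calls.append("maps")

def spam():
    calls.append("spam")  # stands in for the "malicious" function

# The same line, with RLO (U+202E) and PDF (U+202C) written as escapes.
# The escapes are resolved here, so the executed source holds the raw characters.
line = 's = "spam\u202e" ,spam ,"\u202cspam"'

namespace = {"maps": maps, "spam": spam}
exec(line, namespace)
s = namespace["s"]

s[1]()         # the middle element of the tuple is the real spam function
print(calls)   # ['spam']
print([ascii(item) for item in (s[0], s[2])])  # the formatting characters live inside the strings
```

However the line is drawn on screen, the tuple holds two strings with the `spam` function between them.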

See it on ideone. (For those who do not know it: the output of the executed program is shown at the bottom of the page.)

RLO


At the heart of the bidirectional-text problem is the idea that text in memory is always stored in the order in which a person writes it. This includes right-to-left writing, where the text is simply drawn in the opposite direction.
The drawing direction is determined automatically: either by the alphabet a character belongs to (Hebrew, for example) or, for punctuation marks and digits, by trickier rules that depend on the context.
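The per-character classification described above can be looked up with Python's standard `unicodedata` module. A small Python 3 sketch follows; the sample characters are chosen here purely for illustration.

```python
import unicodedata

# Latin letter, Hebrew alef, digit, comma, exclamation mark, space.
samples = ["a", "\u05d0", "1", ",", "!", " "]

# Bidirectional class of each character as recorded in the Unicode database.
bidi_classes = {ch: unicodedata.bidirectional(ch) for ch in samples}

for ch, cls in bidi_classes.items():
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {cls}")
```

Letters carry a strong direction (`L` or `R`), while digits, separators and other punctuation get classes such as `EN`, `CS` and `ON`, whose final direction is resolved from the surrounding text.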

RLO is a formatting character (U+202E); the name stands for right-to-left override. It forces right-to-left display for characters that are written left to right by default. (The standard says it can be used to write identifiers that consist of mixed Hebrew and English, in which, apparently, the English parts are naturally read from right to left.)

So, thanks to it, we get our little beauty:
  s = "spam <RLO>", spam, "<PDF> spam" 
  s = "spam", spam, "spam" 

PDF stands for pop directional formatting; it cancels the effect of the most recent RLO or one of its relatives.
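For reference, the code points involved can be listed with `unicodedata`. In this Python 3 sketch, the choice of U+202A through U+202E as the "relatives" of RLO is an assumption about what the article means.

```python
import unicodedata

# The explicit embedding/override characters and the PDF that closes them.
controls = ["\u202a", "\u202b", "\u202c", "\u202d", "\u202e"]

# (code point, bidirectional class, official name) for each character.
table = [
    (f"U+{ord(ch):04X}", unicodedata.bidirectional(ch), unicodedata.name(ch))
    for ch in controls
]

for row in table:
    print(*row)
```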

As you might guess, the interpreter does not care about strange characters inside string literals. But some editors, such as emacs*, Xcode and Kate, will reverse the text in between exactly as the browser does.

* In the case of emacs, the behavior may depend on the terminal. But vim and nano have no such problem in the same terminal: both simply show the code of the RLO character at the corresponding position.
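A similar "show the code instead of obeying it" view can be produced for any piece of text. This is a Python 3 sketch; `reveal` is a hypothetical helper name, and the `<U+XXXX>` marker format is an arbitrary choice, not what any particular editor prints.

```python
import unicodedata

def reveal(text):
    """Replace invisible format characters (Unicode category Cf) with <U+XXXX> markers."""
    return "".join(
        f"<U+{ord(ch):04X}>" if unicodedata.category(ch) == "Cf" else ch
        for ch in text
    )

# The suspicious line, with RLO and PDF written as escapes.
line = 's = "spam\u202e" ,spam ,"\u202cspam"'
print(reveal(line))  # s = "spam<U+202E>" ,spam ,"<U+202C>spam"
```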

About other uses in code


The RLO character is not whitespace, and Python also complains about it inside identifiers, which somewhat limits its applicability.

So it has to go inside string literals or comments. The effect of RLO ends at the end of each line of the file, because each line is treated as a paragraph.
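These restrictions can be checked directly. This is a Python 3 sketch (where the same restrictions hold); `compiles` is a hypothetical helper written for the demo.

```python
RLO = "\u202e"

def compiles(source):
    """Return True if Python accepts the source text, False on a SyntaxError."""
    try:
        compile(source, "<demo>", "exec")
        return True
    except SyntaxError:
        return False

is_space = RLO.isspace()                         # RLO is not whitespace
in_identifier = compiles(f"sp{RLO}am = 1")       # rejected inside an identifier
in_string = compiles(f's = "sp{RLO}am"')         # accepted inside a string literal
in_comment = compiles(f"x = 1  # sp{RLO}am")     # accepted inside a comment

print(is_space, in_identifier, in_string, in_comment)
```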

Pictures

[Screenshots: vim, sublime text, xcode, emacs]


... and once again, the link to ideone.

upd: there is also a variant that uses the Embedding and Mark characters.

Bonus

An "rm -rf echo" that actually only prints "rm -rf":
 echo -e '\xe2\x80\x8f\xe2\x80\xaaecho \xe2\x80\xac\xe2\x80\x8f\xe2\x80\xaarm -rf \xe2\x80\xac\xe2\x80\x8f' 
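To see what the escape sequences in that command stand for, the bytes can be decoded and the invisible characters named. This is a Python 3 sketch; the variable names are illustrative.

```python
import unicodedata

# The exact bytes passed to `echo -e` above.
raw = (b"\xe2\x80\x8f\xe2\x80\xaaecho \xe2\x80\xac"
       b"\xe2\x80\x8f\xe2\x80\xaarm -rf \xe2\x80\xac\xe2\x80\x8f")
text = raw.decode("utf-8")

# Invisible format characters (category Cf), in the order they appear.
marks = [unicodedata.name(ch) for ch in text if unicodedata.category(ch) == "Cf"]

# What is left once the format characters are dropped: the logical order.
visible = "".join(ch for ch in text if unicodedata.category(ch) != "Cf")

print(marks)
print(repr(visible))  # 'echo rm -rf '
```

In memory the text reads "echo rm -rf "; only the surrounding marks and embeddings change the order in which a bidi-aware terminal or browser draws the two words.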

bash for some reason ignores the formatting characters at the beginning of the command, which opens up a lot of room for evil.

Source: https://habr.com/ru/post/252813/

