




This article discusses the most interesting transformations that the chain of two
transpilers (the first translates the Python code into the code in the
new 11l programming language , and the second - the 11l code in C ++), and compares the performance with other acceleration tools / code execution in Python (PyPy, Cython, Nuitka).
Replacing "slices" \ slices on ranges \ ranges
Python | 11l |
s[-1] s[-2] s[:-1] s[1:] s[:1:] s[1:2] s[::2] s[3:10:2] s[3:10:]
| s.last s[(len)-2] s[0..<(len)-1] s[1..] s[0..<1] s[1..<2] s[(0..).step(2)] s[(3..<10).step(2)] s[3..<10] |
An explicit indication for indexing from the end of the array
s[(len)-2]
instead of just
s[-2]
needed to eliminate the following errors:
- When it is required, for example, to get the previous character by
s[i-1]
, but with i = 0 such / this record instead of an error will silently return the last character of the string [ and I encountered this error in practice - commit ] . - The expression
s[i:]
after i = s.find(":")
will work incorrectly when the character is not found in the string [ instead of the "part of the string starting from the first character :
and then" the last character of the string will be taken ] (and , to return -1
with the find()
function in Python, I also consider incorrectly [ should return null / None [ and if -1 is required, then it should be written explicitly: i = s.find(":") ?? -1
] ] ). - Writing
s[-n:]
to get the last n
characters of the string will work incorrectly with n = 0.
Chains of comparison operators
At first glance, itβs an outstanding feature of the Python language, but in practice it can be easily abandoned / dispensed with using the
in
operator and ranges:
a < b < c | b in a<..<c |
a <= b < c | b in a..<c |
a < b <= c | b in a<..c |
0 <= b <= 9 | b in 0..9 |
List inclusion (list comprehension)
Similarly, as it turned out, you can opt out of another interesting Python feature - list comprehensions.
While some
glorify list comprehension and even suggest discarding `filter ()` and `map ()` , I found that:
In all places where I met Python's list comprehension, you can easily get away with the functions `filter ()` and `map ()`. dirs[:] = [d for d in dirs if d[0] != '.' and d != exclude_dir] dirs[:] = filter(lambda d: d[0] != '.' and d != exclude_dir, dirs) '[' + ', '.join(python_types_to_11l[ty] for ty in self.type_args) + ']' '[' + ', '.join(map(lambda ty: python_types_to_11l[ty], self.type_args)) + ']'
`filter ()` and `map ()` in 11l look prettier than in Python dirs[:] = filter(lambda d: d[0] != '.' and d != exclude_dir, dirs) dirs = dirs.filter(d -> d[0] != '.' & d != @exclude_dir) '[' + ', '.join(map(lambda ty: python_types_to_11l[ty], self.type_args)) + ']' '['(.type_args.map(ty -> :python_types_to_11l[ty]).join(', '))']' outfile.write("\n".join(x[1] for x in fileslist if x[0])) outfile.write("\n".join(map(lambda x: x[1], filter(lambda x: x[0], fileslist)))) outfile.write(fileslist.filter(x -> x[0]).map(x -> x[1]).join("\n"))
and therefore the need for list comprehensions in 11l actually disappears [the replacement of list comprehension with filter()
and / or map()
is performed during the conversion of the Python code to 11l automatically ] .
Convert the if-elif-else chain to switch
While Python does not contain a switch statement, this is one of the most beautiful constructs in 11l, and so I decided to insert the switch automatically:
Python | 11l |
ch = instr[i] if ch == "[": nesting_level += 1 elif ch == "]": nesting_level -= 1 if nesting_level == 0: break elif ch == "'": ending_tags.append(''') # '' elif ch == "'": assert(ending_tags.pop() == ''') | switch instr[i] '[' nesting_level++ ']' if --nesting_level == 0 loop.break "'" ending_tags.append("'") // '' "'" assert(ending_tags.pop() == "'") |
For completeness, here is the generated C ++ code. switch (instr[i]) { case u'[': nesting_level++; break; case u']': if (--nesting_level == 0) goto break_; break; case u''': ending_tags.append(u"'"_S); break; // '' case u''': assert(ending_tags.pop() == u'''); break; }
Converting small dictionaries to native code
Consider this line of Python code:
tag = {'*':'b', '_':'u', '-':'s', '~':'i'}[prev_char()]
Most likely, this form of recording is not very effective
[ in terms of performance ] , but very convenient.
')
In 11l, the entry corresponding to this line
[ and obtained by the Python transpiler β 11l ] is not only convenient
[ however, not as elegant as in Python ] , but also fast:
var tag = switch prev_char() {'*' {'b'}; '_' {'u'}; '-' {'s'}; '~' {'i'}}
The line above is shipped to:
auto tag = [&](const auto &a){return a == u'*' ? u'b'_C : a == u'_' ? u'u'_C : a == u'-' ? u's'_C : a == u'~' ? u'i'_C : throw KeyError(a);}(prev_char());
[ Calling the lambda function, the C ++ compiler will embed \ inline during the optimization process and only the chain of ?/:
Operators will remain. ]In the case when assignment is made to a variable, the dictionary is left as is:
Python | rn = {'I': 1, 'V': 5, 'X': 10, 'L': 50, ...} |
11l | var rn = ['I' = 1, 'V' = 5, 'X' = 10, 'L' = 50, ...] |
C ++ | auto rn = create_dict(dict_of(u'I'_C, 1)(u'V'_C, 5)(u'X'_C, 10)(u'L'_C, 50)...); |
Capture \ Ca pture external variables
In Python, to indicate that a variable is not local, but must be taken outside
[ from the current function ] , the nonlocal keyword is used
[ otherwise, for example, found = True
will be interpreted as creating a new local variable found
, rather than assigning a value existing external variable ] .
In 11l, the prefix @ is used for this:
Python | 11l |
writepos = 0 def write_to_pos(pos, npos): nonlocal writepos outfile.write(...) writepos = npos | var writepos = 0 fn write_to_pos(pos, npos) @outfile.write(...) @writepos = npos |
C ++:
auto writepos = 0; auto write_to_pos = [..., &outfile, &writepos](const auto &pos, const auto &npos) { outfile.write(...); writepos = npos; };
Global variables
Similar to external variables, if you forget to declare a global variable in Python
[ via the global keyword ] , you get an inconspicuous bug:
break_label_index = -1 ... def parse(tokens, source_): global source, tokeni, token, scope source = source_ tokeni = -1 token = None break_label_index = -1 scope = Scope(None) ... | var break_label_index = -1 ... fn parse(tokens, source_) :source = source_ :tokeni = -1 :token = null break_label_index = -1 :scope = Scope(null) ... |
The code at 11l
[ on the right ], as opposed to Python
[ on the left ], will
break_label_index
error at the compilation stage with the 'undeclared variable
break_label_index
'.
Index / number of the current container item
I always forget the order of the variables that the Python function returns
enumerate
{first comes the value and then the index or vice versa}. Ruby's analog behavior -
each.with_index
- is much easier to remember: with index means that index comes after value, not before. But in 11l, the logic is even easier to memorize:
Python | 11l |
items = ['A', 'B', 'C'] for index, item in enumerate(items): print(str(index) + ' = ' + item) | var items = ['A', 'B', 'C'] loop(item) items print(loop.index' = 'item) |
Performance
As a test, the
program uses
the conversion of PC markup to HTML , and the source data is taken from the source code of the
article on PC markup [ since this article is currently the largest written on PC markup ] , and is repeated 10 times, i.e. It turns out from 48.8 kilobyte article a file of 488Kb
Here is a diagram showing how many times the corresponding method of executing Python code is faster than the original
[ CPython ] implementation:

And now let's add to the diagram the implementation generated by the Python transpiler β 11l β C ++:
Execution
time [ 488Kb file conversion time ] was 868 ms for CPython and 38 ms for generated C ++ code
[ this time includes full-featured [ i.e. not just working with data in RAM ] launching the program by the operating system and all the input / output [ reading the original file [ .pq ] and saving the new file [ .html ] to the disk ] ] .
I also wanted to try
Shed Skin , but it does not support local functions.
Numba also failed to use (gives the error 'Use of unknown opcode LOAD_BUILD_CLASS').
Here is the archive with the program used for performance comparison
[ for Windows ] (installed Python 3.6 or higher and the following Python packages are required: pywin32, cython).
Python source and output of Python β 11l and 11l β C ++ transpilers: