Brainfuck is a programming language created with one goal: to write an interpreter for it. They have been written so much that I will not even give links to them. This article explains the simple, but effective way to optimize it.
To optimize you need to follow 3 rules:
Reverse instructions ( +
and -
, >
and <
) are mutually destroyed when they go one after another. Code >++>><<-
turns into >+
. It is rather a fool proof than an optimization, since such code is meaningless.
Repetitive instructions are replaced with commands, the arguments of which say how many times a specific instruction is executed. For example, the code +++
is replaced with the ADD command with an argument of 3, and the code <<<
with MOVE: -3.
[+]
and [-]
zeroes the current cell, and the ADD command behind this code assigns its current argument to the current cell. Code --[-]+++
is replaced by the command ASSIGN: 3.Each command has its own argument (simply arg). The value of the argument is limited only by the ADD command, since the size of a single cell is 256.
Team name | Instructions | Description |
---|---|---|
ADD | + and - | Changes the value of the current cell to arg |
MOVE | > and < | Changes the index of the current cell to arg |
READ | , | Passes arg characters from the input stream and reads the character following them. |
. | Prints the character corresponding to the current cell value arg times | |
Goto | [ and ] | Goes to the execution of arg on the account of the command relative to the current |
ASSIGN | [+] and [-] | Assigns the current cell to arg. |
Interpretation is divided into 3 stages. Reading the source code, converting it into optimized commands and executing these commands. This all happens in the main and parse functions.
The first and last stage takes place immediately in the main function. Her code with comments:
int main(int argc, char* argv[]){ /* */ if(argc == 1){ fprintf(stderr, "%s: Wrong argument count\n", argv[0]); return 1; }; /* */ FILE* f = fopen(argv[1], "r"); if(f == NULL){ fprintf(stderr, "%s: Can't open %s\n", argv[0], argv[1]); return 2; }; /* */ int n = fread(str, sizeof(char), SIZE - 1, f); if(n == 0){ fprintf(stderr, "%s: Can't read data from %s\n", argv[0], argv[1]); return 3; }; str[n] = '\0'; /* */ fclose(f); if(brackets()){ fprintf(stderr, "%s: Wrong brackets sequence\n", argv[0]); return 4; }; /* */ parse(); /* */ ptr = mem; int (*exe[])(int) = {exe_a, exe_b, exe_c, exe_d, exe_e, exe_f}; struct code* p = src + 1; for(; p->cmd; ++p){ p += (*exe[p->cmd - 1])(p->arg); }; return 0; };
Optimization by converting bf instructions to interpreter commands occurs in the parse function. Her code with comments:
void parse(){ /* */ struct code *p = src; p->cmd = 0; p->arg = 0; /* */ char* ch = str; /* */ top = stk; /* 8 */ int (*prs[])(struct code*) = {prs_0, prs_1, prs_2, prs_3, prs_4, prs_5, prs_6, prs_7, nothing}; /* */ for(; *ch != '\0'; ++ch){ p += (*prs[find(*ch)])(p); }; /* - */ (p + 1)->cmd = 0; };
For comparison, I took an interpreter from the official ubuntu repository and several tests from this repository. The table shows the average test time in seconds.
Test name | Official | From the article |
---|---|---|
Long | 34 | 20 |
Mandelbrot | 21 | 26 |
EasyOpt | 24 | 27 |
Counter | 34 | 37 |
In all tests except the first, the official interpreter works about 12 percent faster than the interpreter from the article. In the first test, unlike the others, in most cycles the number of instructions >
does not match the number of instructions <
. This makes the cycles unbalanced and not amenable to optimization (another kind of optimization not described in the article). It seems that the official interpreter optimizes balanced cycles and gets a 12% increase in speed from this, while the interpreter from the article does not do this and wins only on the first test. Although this is a good result for such a simple optimization algorithm.
Sources on Github
Source: https://habr.com/ru/post/321630/
All Articles