📜 ⬆️ ⬇️

A bit about JIT compilation or writing an optimized Brainfuck interpreter

The essence of the language Brainfuck is that we always run through the cells of the tape, reducing or increasing the values ​​in them. In cycles, we can run from one end to the other, counting something, often using many nested loops. It is not difficult to guess that the interpretation of this language is relatively slow. Of course, on modern computers this is almost not noticeable, but ... I offer a small test: take the interpreter written by you, and run this code that is not tricky:

>+>+>+>+>++<[>[<+++>- >>>>> >+>+>+>+>++<[>[<+++>- >>>>> >+>+>+>+>++<[>[<+++>- >>>>> >+>+>+>+>++<[>[<+++>- >>>>> +++[->+++++<]>[-]< <<<<< ]<<]>[-] <<<<< ]<<]>[-] <<<<< ]<<]>[-] <<<<< ]<<]>. 


Waited for the end of execution? Agree that it was not as fast as it seemed at once. Well, let's see how to make an interpreter that will execute this code in no more than a few seconds.

At the core lies the idea of ​​JIT compilation. Wikipedia tells us that JIT compilation is a technology to increase the performance of software systems that use bytecode by compiling bytecode into native code while the program is running. A little thought, I decided that it would be much more expedient to immediately build the entire necessary array of machine code, and not replace it during operation. I think that the essence of this does not change much. But, let's order.
')
So, as everyone already knows, the syntax of the Brainfuck language consists of only 8 operators. There is a set of memory cells and a pointer to the current cell. In fact, as a result of the program execution, we need to convert the code on Brainfuck into assembly commands (their machine codes) of the x86 processor. For this, it is nice to have some idea of ​​the assembly language, as well as the rules for generating machine codes (opcodes). We will replace each command on Brainfuck with one or several x86 processor commands.

First of all, you need to create an array of memory cells and get a pointer to it. This can be done using the dynamic allocation function. So, in the Pascal programming language (and the further code of examples will be on it), this can be done as follows:
 Var Stack : Pointer; Begin GetMem(Stack,30000); //    brainfuck - 30000 End; 

Next, you need to think about the strategy for processing Brainfuck commands. I used the following:
We agree that the pointer to the array of cells will be stored in the ESI register.
 mov esi, Stack 


Then, the remaining teams will be:
BrainfuckAssemblerOpcode
+inc byte [esi]FE 06
-dec byte [esi]FE 0E
>inc esi46
<dec esi4E
[cmp byte [esi], 080 3E 00
je "] +1"0F 84 xx xx xx xx or 74 xx
]jmp "["E9 xx xx xx xx or EB xx


As for the commands '.' and ',', then the procedure for displaying the symbol and the water symbol function will be needed here. They can be implemented as follows:

 procedure WriteChar(S: Char); begin Write(S); end; function ReadChar: Char; var c : char; begin Read(c ); ReadChar := c; end; 

Then, it can be agreed that the EDX register, when the executable code starts, contains the address of the procedure for printing the character
 mov edx, offset WriteChar 

and EBX register is the address of the character input function:
 mov ebx, offset ReadChar 

Then, handle the '.' Can be as follows:
 mov al,[esi] 8A 06 ;  AL    cbw 66 98 ;    EAX push eax 50 ;     () call edx FF D2 ;  WriteChar 

And the command “,” like this:
 call ebx FF D3;  ReadChar mov [esi],al 88 06;       

For simplicity, we will create a static array of ExBuf, where we will put all our opcodes, and then transfer control to the beginning of the array. The variable ptr will be the pointer of the last opcode entered.
 Var ExBuf : Array [1..65535] of Byte; Ptr : Word; Tmp : LongInt; 

The first three commands in our code block are, as we agreed, setting the ESI register to the cell pointer, the EDX register to the WriteChar procedure pointer, and the EBX register to the ReadChar function pointer. You can do it like this:
  ptr := 1; ExBuf[ptr] := $BE; inc(ptr); // mov esi,Stack asm mov edx,Stack mov Tmp,edx end; Move(Tmp,ExBuf[ptr],4); inc(ptr,4); ExBuf[ptr] := $BA; inc(ptr); // mov edx,offset WriteChar asm mov edx, offset WriteChar mov Tmp, edx end; Move(Tmp,ExBuf[ptr],4); inc(ptr,4); ExBuf[ptr] := $BB; inc(ptr); // mov ebx,offset ReadChar asm mov edx, offset ReadChar mov Tmp, edx end; Move(Tmp,ExBuf[ptr],4); inc(ptr,4); 


Everything, primary initialization is carried out. Now you need to write a procedure for processing Brainfuck codes. For the first teams, it can be like this:
 Procedure JITCode(S: Char); Begin Case S of '+': Begin ExBuf[ptr] := $FE; //inc byte ptr [esi] ExBuf[ptr+1] := $06; Inc(ptr,2); End; '-': Begin ExBuf[ptr] := $FE; //dec byte ptr [esi] ExBuf[ptr+1] := $0E; Inc(ptr,2); End; '>': Begin ExBuf[ptr] := $46; //inc esi Inc(ptr); End; '<': Begin ExBuf[ptr] := $4E; //dec esi Inc(ptr); End; '.': Begin ExBuf[ptr] := $8A; //mov al,[esi] ExBuf[ptr+1] := $06; Inc(ptr,2); ExBuf[ptr] := $66; //cbw ExBuf[ptr+1] := $98; Inc(ptr,2); ExBuf[ptr] := $50; //push eax Inc(ptr); ExBuf[ptr] := $FF; //call edx ExBuf[ptr+1] := $D2; Inc(ptr,2); End; ',': Begin ExBuf[ptr] := $FF; //call ReadChar ExBuf[ptr+1] := $D3; Inc(ptr,2); ExBuf[ptr] := $88; //mov [esi],al ExBuf[ptr+1] := $06; Inc(ptr,2); End; End; 

At the input of the procedure, we alternately transfer Brainfuck codes, which are converted into machine codes and are gradually entered into the ExBuf array.
The situation with the '[' and ']' loop commands is somewhat more complicated. As we all remember, x86 processors have several addressing modes, and accordingly, several near commands or short (short) transitions. So, if the total length of opcodes between a transition command and a label does not exceed 127 bytes, you can use short (short) transitions. If it is more, then near commands will be needed. I implemented it as follows: when meeting the command '[', I generate the command je near xxx and memorize in the additional array the current position of the command buffer. After, when the ']' command is encountered in the code, I calculate the difference between the beginning of the cycle and the end (byte length), and if it is less than 127 bytes, I return to the memorized position, overwrite the old je near xxx with je short xxx and transfer the rest of the code four bytes ahead. With this in mind, it remains to supplement the above JITCode procedure:
  '[': Begin ExBuf[ptr] := $80; //cmp byte ptr [esi],0 ExBuf[ptr+1] := $3E; ExBuf[ptr+2] := $00; Inc(ptr,3); ExBuf[ptr] := $0F; //je near xxx ExBuf[ptr+1] := $84; Inc(ptr,2); bstack[bcnt] := ptr; inc(bcnt); Inc(ptr,4); End; ']': Begin Tmp := ptr - bstack[bcnt-1]; If Tmp > 122 then begin ExBuf[ptr] := $E9; //jmp Inc(ptr); Inc(Tmp); Move(Tmp,ExBuf[bstack[bcnt-1]],4); Tmp := -Tmp-9; Move(Tmp,ExBuf[ptr],4); Inc(ptr,4); Dec(bcnt); end else begin ExBuf[ptr] := $EB; //jmp short Inc(ptr); ExBuf[bstack[bcnt-1]-2] := $74; // je short xxx Dec(Tmp,2); Move(Tmp,ExBuf[bstack[bcnt-1]-1],1); Tmp := -Tmp-5; Move(Tmp,ExBuf[ptr],1); Inc(ptr); Move(ExBuf[bstack[bcnt-1]+4],ExBuf[bstack[bcnt-1]],ptr-bstack[bcnt-1]-4); Dec(ptr,4); Dec(bcnt); end; End; 

That's practically all, this is a ready-made working implementation. It remains not much at all - write the last command ret (return from the procedure). Its opcode is 0xC3:
  ExBuf[ptr] := $C3; //retn 

And transfer control to the generated machine instruction buffer:
  asm mov edx,offset ExBuf call edx end; 


But let's see what else can be done with the optimization of our program. An idea that literally lies on the surface: in the Brainfuck code, repetitive commands of the same type are often encountered, for example +++++++++++ or <<<<. What does it mean to increase the current cell by 11 or move the cell pointer 4 left? You can enter additional characters to indicate the following:

<<<<< = is replaced by p # (prev) right shift of the pointer to the left by # values
>>>>> = replaced by n # (next) right shift of pointer to # of values
+++++ = is replaced by i # (inc) increasing the current cell immediately to #
- = is replaced by d # (dec) decrementing the current cell immediately to #
[-], [+] = is replaced by z (zero) setting the cell value to 0

Such preprocessing can be carried out at the beginning of the interpreter. Sometimes, this gives a significant decrease in the source code on Brainfuck, and as a result, a decrease in the x86 code. After this processing, it remains to supplement our JITCode procedure:
  'i': Begin ExBuf[ptr] := $80; //add byte ptr [esi],xx ExBuf[ptr+1] := $06; ExBuf[ptr+2] := Ord(Buf[i+1]); Inc(i); Inc(ptr,3); End; 'd': Begin ExBuf[ptr] := $80; //sub byte ptr [esi],xx ExBuf[ptr+1] := $2E; ExBuf[ptr+2] := Ord(Buf[i+1]); Inc(i); Inc(ptr,3); End; 'n': Begin ExBuf[ptr] := $83; //add esi,xx ExBuf[ptr+1] := $C6; ExBuf[ptr+2] := Ord(Buf[i+1]); Inc(i); Inc(ptr,3); End; 'p': Begin ExBuf[ptr] := $83; //sub esi,xx ExBuf[ptr+1] := $EE; ExBuf[ptr+2] := Ord(Buf[i+1]); Inc(i); Inc(ptr,3); End; 'z': Begin ExBuf[ptr] := $C6; //mov byte ptr [esi],0 ExBuf[ptr+1] := $06; ExBuf[ptr+2] := $00; Inc(ptr,3); End; 


Download the source code and compiled Win32 version here: rghost.ru/4249797

The program has the keys:
-c - create an OUT.BC file on the disk containing an intermediate byte code after the preprocessor.
-b - create the BF.BIN file on the disk containing the generated array of machine codes. If you open it in HIEW, then you can see the usual assembly commands.

Good luck to all.

Source: https://habr.com/ru/post/113339/


All Articles