Assembler for brainfuck

One cold May day out of boredom I decided to start learning this amazing language - Brainfuck.
His interpreters have published on Habré many times already .
But I want to learn more deeply the language itself and the algorithms on it, and not to write another interpreter. Therefore, it was decided to make ~~an elephant~~ compiler of a high-level language into a brainfuck.
However, the real brainfuck began very quickly: the absence of the if statement, the absence of random access to the cells and a bunch of other problems immediately fell on me. It was necessary to wait a little with the high-level language and to make for the beginning an assembler into which the high-level language would be compiled.
On the implementation of the assembler under the cut.

Implementation of the main parts

For convenience, the tape was divided into several parts:

Registers
Stack
Temporary cells and flag cells
The remaining memory is used to store arrays and strings.

The translator assumes that program execution begins on a zero cell filled with zero tape. The tape must be looped.

Registers

This is the most important part of the assembler. It is with them most often have to interact. You can put numbers in registers (remember that numbers are limited to 255 in brainfuck) and symbols, add numbers and values of other registers to them, take numbers and values of other registers, multiply them into other registers, divide them into other registers and compare them with each other. .
The translator uses four registers: ax, bx, cx, dx - by analogy with the x86 assembler, each of which is represented by one memory cell.
Examples:

mov ax 5 //   ax 5 mov bx 6 //   bx 6 sub ax bx //   ax 255 mov cx 12 mov dx 2 mul cx dx //  cx  24 mov dx 10 div cx dx //   :  cx - 2,  dx - 4

Stack

On the left side of the zero cell, the stack begins. It is stored on the principle of C-lines and can be of any size, but you need to remember that the memory is looped and a very large number of items put on the stack can spoil your other data.
In order to put and remove values from the stack, there are push and pop commands, for example:

 push ax push 5 pop bx pop cx

However, storage in the form of C-rows has one drawback: you cannot put a zero on the stack, since zero serves as an indicator of the top of the stack. Of course, it was possible to organize a normal stack, but then its length would be limited to 256 elements.
')

Temporary cells and flag cells

Temporary cells are used for intermediate calculations in complex operations. Three flag cells are used in the comparison operation, more on this below.

Arrays

Arrays - ranges of cells that can be accessed by index:

 array test 10 //    test    set test 0 42 //       42 get test 0 ax //   ax

By the way, the index can be a register containing, for example, user input, but it is impossible to access elements with indexes greater than 255.

Strings

Strings are declared with the string keyword:

 string hello "Hello, World!" //   hello puts hello //

Strings are placed in the section of arrays and initialized when the program starts.
Strings can be treated as arrays:

 get hello 0 ax put ax //  'H'

Basic constructions

Input Output

There are three I / O operations:

 take ax //    ax    put ax //  ,    ax puts str //     str

By the way, with the puts command, you can also print regular arrays (up to the first zero in it).

Cycle organization

In principle, assembler jumps can also be organized, but this is somewhat hemorrhoidal, so the cycles are more similar to Baysik's ones:

 while ax //  while      // do smth. endwhile

This cycle will do something until the register ax is zero.

Comparisons

The comparison is made in assembly style. To compare something, you need to apply the cmp command to the registers:

 mov ax 2 mov bx 1 cmp ax bx

After this command, the values of the flag cells change. There are three of them: for equality, for less and for more. If the first register is equal to the second, then the equality flag is set to zero, if the first register is less, then the second flag is set to zero, and if more, the third one.
You can use the result of the comparison as follows:

 cmp ax bx //          ne //   //   ,     end nl //   // do end ng //   // do end eq //  // do end lt //  // do end gt //  // do end

Conclusion

The assembler is made with a simple syntax and is designed more for automatic generation, although you can write programs on it anyway. The translator is written in Ruby and is available here . There you can see examples of programs and write your own.

useful links

Algorithms on brainfuck - I used multiplication and division algorithms from there.

BFDev is a convenient brainfuck-IDE, all programs have been tested in it.

BrainFix is a high-level language that translates into brainfuck. From there, the algorithm for accessing the array element with an index unknown in the compile time was taken.

Source: https://habr.com/ru/post/182598/

All Articles