
Low level brainfuck

Part I
Part II
Part III

In this article we will write a Brainfuck interpreter in Turbo Assembler (TASM), but first we will write one in a high-level language, for example Pascal.

The data_arr array represents the data memory, and the string str_arr holds the commands.

We start with a program that prints the character whose ASCII code equals the number of + commands (so we only need the + and . commands).

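    var
      data_arr: array[1..10] of integer; // data memory
      str_arr: string;                   // commands
      i, j: integer;                     // indices into str_arr and data_arr
    begin
      j := 1;                            // start at the first data cell
      readln(str_arr);                   // read the command string
      for i := 1 to length(str_arr) do
      begin                              // walk through the commands
        if (str_arr[i] = '+') then data_arr[j] := data_arr[j] + 1;
        if (str_arr[i] = '.') then write(chr(data_arr[j]));
      end;
    end.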

The bf code +++++++++++++++++++++++++++++++++. prints ! (the ASCII code of the symbol ! is 33).
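For readers without a Pascal compiler at hand, the same two-command interpreter can be sketched in Python. This is only an illustrative model of the Pascal loop above; the function name run is an assumption made for this example and does not appear in the original listings.

```python
def run(code: str) -> str:
    """Interpret only '+' and '.', mirroring the Pascal loop above."""
    cell = 0                      # plays the role of data_arr[j] with j fixed at 1
    out = []
    for ch in code:               # same left-to-right scan as the for loop
        if ch == '+':
            cell += 1
        elif ch == '.':
            out.append(chr(cell))
    return ''.join(out)


print(run('+' * 33 + '.'))  # prints !
```

Building the program with '+' * 33 makes the count explicit, so it is easy to try other ASCII codes.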

The program can be checked in the online IDE ideone.com
Here you can debug bf code step by step.

Next, replace the for loop with goto statements and add the commands - , < and > .
At the end, we print the data_arr array.

    LABEL prev, next;
    var
      data_arr: array[1..10] of integer; // data memory
      str_arr: string;                   // commands
      i, j, k: integer;                  // indices into str_arr, data_arr and the output loop
    begin
      i := 1;
      j := 1;
      readln(str_arr);                   // read the command string
    prev:
      if i > length(str_arr) then goto next;
      if (str_arr[i] = '+') then data_arr[j] := data_arr[j] + 1;
      if (str_arr[i] = '-') then data_arr[j] := data_arr[j] - 1;
      if (str_arr[i] = '>') then j := j + 1;
      if (str_arr[i] = '<') then j := j - 1;
      if (str_arr[i] = '.') then write(chr(data_arr[j]));
      i := i + 1;
      goto prev;
    next:
      for k := 1 to 10 do
      begin
        write(data_arr[k]);
        write(' ');
      end;
    end.

The code +>++>+++ will return 1 2 3 0 0 0 0 0 0 0
The code +>++>+++- will return 1 2 2 0 0 0 0 0 0 0
ideone.com
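The goto version can be modeled in Python as well. Python has no goto, so this sketch uses a while loop in place of the prev/next labels; the name run and the returned tuple are assumptions made for illustration, and indices are 0-based rather than 1-based as in the Pascal program.

```python
def run(code: str, size: int = 10):
    """Model of the goto-based interpreter: supports + - > < and ."""
    data = [0] * size             # data_arr
    out = []                      # characters written by '.'
    i = 0                         # index into the command string (str_arr)
    j = 0                         # index into data
    while i < len(code):          # "if i > length(str_arr) then goto next"
        ch = code[i]
        if ch == '+':
            data[j] += 1
        elif ch == '-':
            data[j] -= 1
        elif ch == '>':
            j += 1
        elif ch == '<':
            j -= 1
        elif ch == '.':
            out.append(chr(data[j]))
        i += 1                    # "i := i + 1; goto prev"
    return data, ''.join(out)


print(run('+>++>+++')[0])   # prints [1, 2, 3, 0, 0, 0, 0, 0, 0, 0]
print(run('+>++>+++-')[0])  # prints [1, 2, 2, 0, 0, 0, 0, 0, 0, 0]
```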

Next, add [ and ]
Add another variable, i_stor .
When the current command is [ , we compare the current element of the data_arr array with zero and, if the element is greater than zero, store the value of the variable i in i_stor .

When processing the closing bracket ] , if the current element of data_arr is non-zero, we load the address of the opening bracket [ from the variable i_stor into the variable i .

Execution then continues with the statement i := i + 1;
If the value of i_stor was loaded into i before that (while checking ] ), then after the increment we end up just after [ (otherwise we end up just after ] ).
    LABEL prev, next;
    var
      data_arr: array[1..10] of integer; // data memory
      str_arr: string;                   // commands
      i, j, k: integer;                  // indices into str_arr, data_arr and the output loop
      i_stor: integer;                   // position of the last opening bracket
    begin
      j := 1;
      i := 1;
      readln(str_arr);                   // read the command string
    prev:
      if i > length(str_arr) then goto next;
      if (str_arr[i] = '+') then data_arr[j] := data_arr[j] + 1;
      if (str_arr[i] = '-') then data_arr[j] := data_arr[j] - 1;
      if (str_arr[i] = '>') then j := j + 1;
      if (str_arr[i] = '<') then j := j - 1;
      if (str_arr[i] = '.') then write(chr(data_arr[j]));
      if (str_arr[i] = '[') then
      begin
        if data_arr[j] > 0 then i_stor := i;
      end;
      if (str_arr[i] = ']') then
      begin
        if data_arr[j] > 0 then
        begin
          i := i_stor;
        end;
      end;
      i := i + 1;
      goto prev;
    next:
      for k := 1 to 10 do
      begin
        write(data_arr[k]);
        write(' ');
      end;
    end.

The code +++++[>+<-] transfers the number 5 to the next cell: 0 5 0 0 0 0 0 0 0 0
ideone.com
The Hello World code can be seen at ideone.com
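The bracket handling with a single i_stor variable can be sketched in Python as follows. It is a model of the Pascal program above, not a full Brainfuck implementation; the name run is an assumption made for this example.

```python
def run(code: str, size: int = 10):
    """Model of the interpreter with [ and ] handled through one i_stor variable."""
    data = [0] * size
    out = []
    i = 0                         # index into the command string
    j = 0                         # index into data
    i_stor = 0                    # position of the most recent '[' that was entered
    while i < len(code):
        ch = code[i]
        if ch == '+':
            data[j] += 1
        elif ch == '-':
            data[j] -= 1
        elif ch == '>':
            j += 1
        elif ch == '<':
            j -= 1
        elif ch == '.':
            out.append(chr(data[j]))
        elif ch == '[':
            if data[j] > 0:
                i_stor = i        # remember where the loop starts
        elif ch == ']':
            if data[j] > 0:
                i = i_stor        # go back to '['; the increment below steps past it
        i += 1
    return data, ''.join(out)


print(run('+++++[>+<-]')[0])  # prints [0, 5, 0, 0, 0, 0, 0, 0, 0, 0]
```

Because only the most recent [ is remembered, this scheme (in the sketch and in the Pascal program alike) covers non-nested loops. Nested brackets, or a loop that has to be skipped because its cell is already zero, would need a stack of positions or a scan for the matching bracket.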

Now let's move on to assembler


To organize a loop with the loop instruction, put the number of iterations into the CX register and place a label that loop will jump back to at the end of each iteration.

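        mov CX, 28h     ; number of loop iterations
    prev:               ; loop label
                        ; loop body
                        ; ...
                        ; ...
        loop prev       ; decrement CX and jump back to prev while CX is not zero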
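The behavior of the loop instruction can be modeled in a few lines of Python. This is only a sketch of the semantics (decrement CX, then jump back while CX is non-zero); the function name loop_iterations is made up for illustration.

```python
def loop_iterations(cx: int) -> int:
    """Count how many times the body between prev: and loop prev executes."""
    count = 0
    while True:
        count += 1                # the loop body runs once
        cx = (cx - 1) & 0xFFFF    # loop decrements CX with 16-bit wraparound
        if cx == 0:               # and falls through once CX reaches zero
            break
    return count


print(loop_iterations(0x28))  # prints 40
print(loop_iterations(0))     # prints 65536
```

The second call shows why CX should be loaded before the loop: with CX equal to zero the decrement wraps around and the body runs 65536 times.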

Create the command array str_arr and put +++ into it.
Create the data array data_arr and (for clarity) fill it with 1,1,1,1,1,1,1,1,1,1

In the loop, compare the current symbol with the + symbol and, if they are equal, increase the value in the current cell by 1.

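    text segment                    ; bf1.asm
    assume cs:text, ds:data, ss:stk
    begin:                          ; program entry point
        mov AX, data                ; set up the data segment
        mov DS, AX
        mov DL, str_arr             ; load the 1st command into DL
        mov CX, 0Ah                 ; 10 iterations
    prev:
        cmp DL, 2Bh                 ; is the command + ?
        jne next                    ; no, go to next
        mov BL, 00h                 ; load the data index into BL (BH is assumed to be 0)
        inc data_arr[BX]            ; yes, increase the cell by 1
    next:
        inc i                       ; advance to the next command
        mov BL, i
        mov DL, str_arr[BX]
        loop prev
        mov AX, 4c00h               ; exit to DOS
        int 21h
    text ends
    data segment
    str_arr  DB 2Bh,2Bh,2Bh,'$'             ; code +++
    data_arr DB 1,1,1,1,1,1,1,1,1,1,'$'     ; data
    i        DB 0                           ; index into the command array
    data ends
    stk segment stack
        db 100h dup (0)             ; 256-byte stack
    stk ends
    end begin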

Assembling is performed with the command tasm.exe bf1.asm
Linking is performed with the command tlink.exe bf1.obj

After running the program in Turbo Debugger, you can see that the commands +++ are located at address 0130.
Next comes the data array, in which the first element has changed, followed by the variable i , which equals 0Ah after the loop has finished.



Add the commands - , < , > and . .
To output a single character with function 02h of interrupt 21h, put the character code into the DL register before calling the interrupt.

        mov AH, 2                   ; DOS function 02h: write a character
        mov DL, <character code>
        int 21h

Here is the complete program

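    text segment                    ; bf2.asm
    assume cs:text, ds:data, ss:stk
    begin:                          ; program entry point
        mov AX, data                ; set up the data segment
        mov DS, AX
        mov DL, str_arr             ; load the 1st command into DL
        mov CX, 0Ah                 ; 10 iterations
    prev:
        cmp DL, 2Bh                 ; is the command + ?
        jne next                    ; no, go to next
        mov BL, j                   ; load the data index into BL
        inc data_arr[BX]            ; yes, increase the current cell by 1
    next:
        cmp DL, 2Dh                 ; is the command - ?
        jne next1                   ; no, go to next1
        mov BL, j
        dec data_arr[BX]
    next1:
        cmp DL, 3Eh                 ; is the command > ?
        jne next2                   ; no, go to next2
        inc j                       ; yes, move to the next element of data_arr
    next2:
        cmp DL, 3Ch                 ; is the command < ?
        jne next3                   ; no, go to next3
        dec j                       ; yes, move to the previous element of data_arr
    next3:
        cmp DL, 2Eh                 ; is the command . ?
        jne next4                   ; no, go to next4
        mov AH, 2                   ; yes, print the current cell
        mov BL, j
        mov DL, data_arr[BX]
        int 21h
    next4:
        inc i                       ; advance to the next command
        mov BL, i
        mov DL, str_arr[BX]
        loop prev
        mov AX, 4c00h               ; exit to DOS
        int 21h
    text ends
    data segment
    str_arr  DB 2Bh,3Eh,2Bh,2Bh,'$'         ; code +>++
    data_arr DB 0,0,0,0,0,0,0,0,0,0,'$'     ; data
    i        DB 0, '$'                      ; index into the command array
    j        DB 0, '$'                      ; index into the data array
    data ends
    stk segment stack
        db 100h dup (0)             ; 256-byte stack
    stk ends
    end begin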



The loop works like this:
if the current element of the string str_arr is not + , jump to the label next: (otherwise execute + )
if the current element of the string str_arr is not - , jump to the label next1:
if the current element of the string str_arr is not > , jump to the label next2:
if the current element of the string str_arr is not < , jump to the label next3:
if the current element of the string str_arr is not . , jump to the label next4:
After the label next4: , increment the index into str_arr and jump back to the beginning of the loop, the label prev:

Next, add [ and ]
Add a variable i_stor .

When the current command is [ , compare the current element of the data_arr array with zero. If the element is zero, jump ahead (to the next label); otherwise store the value of the variable i in i_stor .

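    next4:
        cmp DL, 5Bh                 ; is the command [ ?
        jne next5                   ; no, go to next5
        mov BL, j
        mov DL, data_arr[BX]
        cmp DL, 00                  ; yes, check whether the current data_arr element is zero
        jz next5                    ; if it is zero, skip ahead
        mov DL, i                   ; otherwise store
        mov i_stor, DL              ; the value of i in i_stor
    next5: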

When processing the closing bracket ] , if the current element of data_arr is not zero, load the address of the opening bracket [ from the variable i_stor into the variable i .

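    next5:
        cmp DL, 5Dh                 ; is the command ] ?
        jne next6                   ; no, go to next6
        mov BL, j
        mov DL, data_arr[BX]
        cmp DL, 00                  ; yes, check whether the current data_arr element is zero
        jz next6                    ; if it is zero, skip ahead
        mov DL, i_stor              ; otherwise load
        mov i, DL                   ; the value of i_stor into i
    next6: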

Check the code ++++[>+<-]

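    text segment                    ; bf4.asm
    assume cs:text, ds:data, ss:stk
    begin:                          ; program entry point
        mov AX, data                ; set up the data segment
        mov DS, AX
        mov DL, str_arr             ; load the 1st command into DL
        mov CX, 50h                 ; 80 iterations
    prev:
        cmp DL, 2Bh                 ; is the command + ?
        jne next                    ; no, go to next
        mov BL, j                   ; load the data index into BL
        inc data_arr[BX]            ; yes, increase the current cell by 1
    next:
        cmp DL, 2Dh                 ; is the command - ?
        jne next1                   ; no, go to next1
        mov BL, j
        dec data_arr[BX]            ; the index register must be BX, not BL
    next1:
        cmp DL, 3Eh                 ; is the command > ?
        jne next2                   ; no, go to next2
        inc j                       ; yes, move to the next element of data_arr
    next2:
        cmp DL, 3Ch                 ; is the command < ?
        jne next3                   ; no, go to next3
        dec j                       ; yes, move to the previous element of data_arr
    next3:
        cmp DL, 2Eh                 ; is the command . ?
        jne next4                   ; no, go to next4
        mov AH, 2                   ; yes, print the current cell
        mov BL, j
        mov DL, data_arr[BX]
        int 21h
    next4:
        cmp DL, 5Bh                 ; is the command [ ?
        jne next5                   ; no, go to next5
        mov BL, j
        mov DL, data_arr[BX]
        cmp DL, 00                  ; yes, check whether the current data_arr element is zero
        jz next5                    ; if it is zero, skip ahead
        mov DL, i                   ; otherwise store
        mov i_stor, DL              ; the value of i in i_stor
    next5:
        cmp DL, 5Dh                 ; is the command ] ?
        jne next6                   ; no, go to next6
        mov BL, j
        mov DL, data_arr[BX]
        cmp DL, 00                  ; yes, check whether the current data_arr element is zero
        jz next6                    ; if it is zero, skip ahead
        mov DL, i_stor              ; otherwise load
        mov i, DL                   ; the value of i_stor into i
    next6:
        inc i                       ; advance to the next command
        mov BL, i
        mov DL, str_arr[BX]
        loop prev                   ; jump to the label prev:
        mov AX, 4c00h               ; exit to DOS
        int 21h
    text ends
    data segment
    str_arr  DB 2Bh,2Bh,2Bh,2Bh,5Bh,3Eh,2Bh,3Ch,2Dh,5Dh,'$'  ; code ++++[>+<-]
    data_arr DB 0,0,0,0,0,0,0,0,0,0,'$'     ; data
    i        DB 0,'$'                       ; index into the command array
    j        DB 0,'$'                       ; index into the data array
    i_stor   DB 0,'$'
    data ends
    stk segment stack
        db 100h dup (0)             ; 256-byte stack
    stk ends
    end begin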



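Add line input using function 3Fh of interrupt 21h. This function reads from the handle in BX (0 is standard input); CX holds the maximum number of bytes to read and DS:DX points to the buffer.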
        mov ah, 3fh                 ; input function
        mov cx, 100h                ; 256 bytes
        mov dx, OFFSET str_arr
        int 21h


We will exit the loop on reaching the end-of-string marker '$' .
To do this, we compare the current character with the symbol '$' .
        cmp DL, 24h                 ; is it the '$' character?
        je exit_loop
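The idea of a buffer pre-filled with '$' and scanned up to the first '$' can be sketched in Python. The names read_line and scan, and the constant BUF_SIZE, are assumptions made for this illustration; the buffer size mirrors the 256h used for str_arr in the listings, and the 100h limit mirrors the value loaded into CX.

```python
BUF_SIZE = 0x256                  # str_arr DB 256h DUP('$')
MAX_READ = 0x100                  # CX passed to function 3Fh


def read_line(text: bytes) -> bytearray:
    """Model of the input step: copy at most MAX_READ bytes into a '$'-filled buffer."""
    buf = bytearray(b'$' * BUF_SIZE)
    n = min(len(text), MAX_READ)
    buf[:n] = text[:n]            # same-length slice assignment keeps the buffer size
    return buf


def scan(buf: bytearray) -> list:
    """Visit characters one by one until the '$' sentinel is reached."""
    visited = []
    i = 0
    while buf[i] != ord('$'):     # cmp DL, 24h / je exit_loop
        visited.append(chr(buf[i]))
        i += 1
    return visited


print(scan(read_line(b'+.\r\n')))  # prints ['+', '.', '\r', '\n']
```

The carriage return and line feed that end the typed line are visited too, but they match none of the commands and are simply skipped by the interpreter. A '$' typed by the user would end the scan early.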

Replace the loop instruction with the jmp instruction.
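    text segment
    assume cs:text, ds:data, ss:stk
    begin:                          ; program entry point
        mov AX, data                ; set up the data segment
        mov DS, AX
                                    ; line input
        mov ah, 3fh
        mov cx, 100h                ; 256 bytes
        mov dx, OFFSET str_arr
        int 21h
                                    ;
        mov DL, str_arr             ; load the 1st command into DL
        ;mov CX, 100h               ; 256 iterations
    prev:
        cmp DL, 24h                 ; is it the '$' character?
        je exit_loop
        cmp DL, 2Bh                 ; is the command + ?
        jne next                    ; no, go to next
        mov BL, j                   ; load the data index into BL
        inc data_arr[BX]            ; yes, increase the current cell by 1
    next:
        cmp DL, 2Dh                 ; is the command - ?
        jne next1                   ; no, go to next1
        mov BL, j
        dec data_arr[BX]            ; the index register must be BX, not BL
    next1:
        cmp DL, 3Eh                 ; is the command > ?
        jne next2                   ; no, go to next2
        inc j                       ; yes, move to the next element of data_arr
    next2:
        cmp DL, 3Ch                 ; is the command < ?
        jne next3                   ; no, go to next3
        dec j                       ; yes, move to the previous element of data_arr
    next3:
        cmp DL, 2Eh                 ; is the command . ?
        jne next4                   ; no, go to next4
        mov AH, 2                   ; yes, print the current cell
        mov BL, j
        mov DL, data_arr[BX]
        int 21h
    next4:
        cmp DL, 5Bh                 ; is the command [ ?
        jne next5                   ; no, go to next5
        mov BL, j
        mov DL, data_arr[BX]
        cmp DL, 00                  ; yes, check whether the current data_arr element is zero
        jz next5                    ; if it is zero, skip ahead
        mov DL, i                   ; otherwise store
        mov i_stor, DL              ; the value of i in i_stor
    next5:
        cmp DL, 5Dh                 ; is the command ] ?
        jne next6                   ; no, go to next6
        mov BL, j
        mov DL, data_arr[BX]
        cmp DL, 00                  ; yes, check whether the current data_arr element is zero
        jz next6                    ; if it is zero, skip ahead
        mov DL, i_stor              ; otherwise load
        mov i, DL                   ; the value of i_stor into i
    next6:
        inc i                       ; advance to the next command
        mov BL, i
        mov DL, str_arr[BX]
        ; loop prev                 ; jump to the label prev:
        jmp prev
    exit_loop:
        MOV AH, 2                   ; print a line feed
        MOV DL, 0Ah
        INT 21h
        mov AX, 4c00h               ; exit to DOS
        int 21h
    text ends
    data segment
    str_arr  DB 256h DUP('$')               ; input buffer (256h bytes) pre-filled with '$'
    data_arr DB 0,0,0,0,0,0,0,0,0,0,'$'     ; data
    i        DB 0,'$'                       ; index into the command array
    j        DB 0,'$'                       ; index into the data array
    i_stor   DB 0,'$'
    data ends
    stk segment para stack
        db 100h dup (0)             ; 256-byte stack
    stk ends
    end begin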

During assembly we get the error
Relative jump out of range by 0001h bytes

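The reason is that the je / jne instructions can jump over only a few lines of the program (each line takes from 1 to 5 bytes in memory): on the 8086 a conditional jump stores its target as a signed 8-bit displacement, so it reaches at most 127 bytes forward or 128 bytes back.

The je / jne instructions therefore cannot make a long jump to the end of the program.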
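The range check behind this error can be sketched in Python. The addresses below are hypothetical, chosen only so that the displacement overshoots by one byte as in the message above; the function names are made up for this example, and the exact way TASM counts the overshoot is an assumption.

```python
def short_jump_displacement(jump_addr: int, target_addr: int) -> int:
    """Displacement needed by a 2-byte conditional jump located at jump_addr."""
    return target_addr - (jump_addr + 2)   # counted from the end of the instruction


def fits_rel8(disp: int) -> bool:
    """True when the displacement fits into a signed 8-bit field."""
    return -128 <= disp <= 127


def out_of_range_by(disp: int) -> int:
    """How many bytes the displacement overshoots the rel8 range (0 if it fits)."""
    if disp > 127:
        return disp - 127
    if disp < -128:
        return -128 - disp
    return 0


disp = short_jump_displacement(0x100, 0x182)   # hypothetical addresses
print(disp, fits_rel8(disp), out_of_range_by(disp))  # prints 128 False 1
```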
Therefore, we replace the fragment

        cmp DL, 24h                 ; is it the '$' character?
        je exit_loop
        ...
    exit_loop:

with the fragment

        cmp DL, 24h                 ; is it the '$' character?
        jne exit_
        jmp exit_loop
    exit_:
        ...
    exit_loop:


So, if the current character is $ , we go to the label exit_loop: with the jmp instruction; otherwise we jump over the jmp instruction.
The jmp instruction can make an intrasegment relative short jump (less than 128 bytes, i.e. IP := IP + i8) or an intrasegment relative long (near) jump (less than 32768 bytes, i.e. IP := IP + i16).
By default, jmp makes the relative long jump, which is what we need (alternatively, you can simply add the jumps directive at the beginning of the program).
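    ;jumps
    text segment
    assume cs:text, ds:data, ss:stk
    begin:                          ; program entry point
        mov AX, data                ; set up the data segment
        mov DS, AX
        ;;;
        mov ah, 3fh                 ; input function
        mov cx, 100h                ; 256 bytes
        mov dx, OFFSET str_arr
        int 21h
        ;;;
        mov DL, str_arr             ; load the 1st command into DL
        ;mov CX, 100h               ; 256 iterations
    prev:
        cmp DL, 24h                 ; is it the '$' character?
        ;je exit_loop
        jne l1
        jmp exit_loop               ; long (near) jump; a SHORT jump would have the same range limit as je
    l1:
        cmp DL, 2Bh                 ; is the command + ?
        jne next                    ; no, go to next
        mov BL, j                   ; load the data index into BL
        inc data_arr[BX]            ; yes, increase the current cell by 1
    next:
        cmp DL, 2Dh                 ; is the command - ?
        jne next1                   ; no, go to next1
        mov BL, j
        dec data_arr[BX]            ; the index register must be BX, not BL
    next1:
        cmp DL, 3Eh                 ; is the command > ?
        jne next2                   ; no, go to next2
        inc j                       ; yes, move to the next element of data_arr
    next2:
        cmp DL, 3Ch                 ; is the command < ?
        jne next3                   ; no, go to next3
        dec j                       ; yes, move to the previous element of data_arr
    next3:
        cmp DL, 2Eh                 ; is the command . ?
        jne next4                   ; no, go to next4
        mov AH, 2                   ; yes, print the current cell
        mov BL, j
        mov DL, data_arr[BX]
        int 21h
    next4:
        cmp DL, 5Bh                 ; is the command [ ?
        jne next5                   ; no, go to next5
        mov BL, j
        mov DL, data_arr[BX]
        cmp DL, 00                  ; yes, check whether the current data_arr element is zero
        jz next5                    ; if it is zero, skip ahead
        mov DL, i                   ; otherwise store
        mov i_stor, DL              ; the value of i in i_stor
    next5:
        cmp DL, 5Dh                 ; is the command ] ?
        jne next6                   ; no, go to next6
        mov BL, j
        mov DL, data_arr[BX]
        cmp DL, 00                  ; yes, check whether the current data_arr element is zero
        jz next6                    ; if it is zero, skip ahead
        mov DL, i_stor              ; otherwise load
        mov i, DL                   ; the value of i_stor into i
    next6:
        inc i                       ; advance to the next command
        mov BL, i
        mov DL, str_arr[BX]
        ; loop prev                 ; jump to the label prev:
        jmp prev
    exit_loop:
        MOV AH, 2                   ; print a line feed
        MOV DL, 0Ah
        INT 21h
        mov AX, 4c00h               ; exit to DOS
        int 21h
    text ends
    data segment
    str_arr  DB 256h DUP('$')               ; input buffer (256h bytes) pre-filled with '$'
    data_arr DB 0,0,0,0,0,0,0,0,0,0,'$'     ; data
    i        DB 0,'$'                       ; index into the command array
    j        DB 0,'$'                       ; index into the data array
    i_stor   DB 0,'$'
    data ends
    stk segment para stack
        db 100h dup (0)             ; 256-byte stack
    stk ends
    end begin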
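The rewrite of je into jne plus jmp can be illustrated at the byte level with a small Python sketch. It uses the standard 8086 opcodes (74 for je rel8, 75 for jne rel8, E9 for jmp rel16); the function name assemble_je and the addresses are hypothetical, and this is only roughly what an assembler does when it widens an out-of-range conditional jump.

```python
def assemble_je(addr: int, target: int) -> bytes:
    """Encode `je target` placed at addr, widening it when rel8 cannot reach."""
    disp8 = target - (addr + 2)             # rel8 is counted from the end of the 2-byte je
    if -128 <= disp8 <= 127:
        return bytes([0x74, disp8 & 0xFF])  # 74 cb: je rel8
    disp16 = target - (addr + 5)            # jne (2 bytes) followed by near jmp (3 bytes)
    return (bytes([0x75, 0x03, 0xE9])       # 75 03: jne over the jmp; E9 cw: jmp rel16
            + (disp16 & 0xFFFF).to_bytes(2, "little"))


print(assemble_je(0x10, 0x20).hex())  # prints 740e
print(assemble_je(0x10, 0x92).hex())  # prints 7503e97d00
```

The widened form costs 5 bytes instead of 2, which is the price of reaching a label anywhere in the segment.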




Link to github with program listings.

Source: https://habr.com/ru/post/423121/

