As a basis, I took the Brainfuck language. It is so small that a few extensions are enough to turn it into an almost new and fairly functional programming language, without losing the zest of the original: my language will still torment the programmer's brain, just like its parent!
So, Brainfuck. In short, the idea is this: there are N registers (cells). The programmer has access to all of them, but every move between them is made explicitly, i.e. you cannot go from cell 2 straight to cell 7; you have to step through the cells in between one at a time.
The “keywords” of the language:
- > - go to the cell to the right.
- < - go to the cell to the left.
- + - increase the cell value by one.
- - - decrease the cell value by one.
- , - read a value into the cell from standard input.
- . - print the cell value to standard output.
- [ - start of a while loop: if the value of the current cell is not 0, enter the loop body; otherwise skip to the command after the matching ].
- ] - end of the while block. The loop continues if the value of the “conditional” cell is not 0 (the “conditional” cell is the cell on which the loop started).
Added “keywords”:
- $ - read a value into the cell as a number (, is redefined as reading an ASCII character).
- ! - print the cell value as a number.
- { - start of a function. The start is followed by the function name, which can be any sequence of letters between % characters: %<function name>%. Every function gets its own copy of the cells; the return value is written to the current register of the calling block.
- } - end of a function.
- ( - start of a comment.
- ) - end of a comment.
- @%<function name>% - function call.
- ^ - reset the cell to zero.
Since the entire set of keywords consists of ASCII characters, we have:
// original Brainfuck commands
const char bf_next = '>';
const char bf_prev = '<';
const char bf_incr = '+';
const char bf_decr = '-';
const char bf_prnt = '.';
const char bf_read = ',';
const char bf_wBeg = '[';
const char bf_wEnd = ']';
// added commands
const char bf_pNum = '!';
const char bf_rNum = '$';
const char bf_fBeg = '{';
const char bf_fEnd = '}';
const char bf_fNme = '%';
const char bf_comm = '(';
const char bf_call = '@';
const char bf_null = '^';
Without loss of generality, we take a limited number of cells, say 256. On an attempt to move to an invalid cell, we move to the very first cell (if the move is to the left) or to the very last cell (if the move is to the right).
Add:
const unsigned long regSize = 256; // number of cells
long reg[regSize]; // the cells (registers)
long stck[regSize];
void resetRegisters(); // set all cells to zero
void printRegistres(); // print the cell values
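The movement rule above can be sketched roughly as follows. This is only an illustration of my reading of that rule (an out-of-range move leaves you on the first or the last cell); moveLeft and moveRight are hypothetical helper names that do not appear in the interpreter itself.

```cpp
const unsigned long regSize = 256;  // same cell count as in the text
long reg[regSize];                  // the cells

// Set every cell to zero.
void resetRegisters() {
    for (unsigned long i = 0; i < regSize; ++i) {
        reg[i] = 0;
    }
}

// Hypothetical helper: move one cell to the right; if that would leave
// the array, end up on the last cell instead.
unsigned long moveRight(unsigned long currReg) {
    return (currReg + 1 < regSize) ? currReg + 1 : regSize - 1;
}

// Hypothetical helper: move one cell to the left; if that would leave
// the array, end up on the first cell instead.
unsigned long moveLeft(unsigned long currReg) {
    return (currReg > 0) ? currReg - 1 : 0;
}
```

With these helpers, moveLeft(0) stays on cell 0 and moveRight(regSize - 1) stays on the last cell, so the interpreter never indexes outside reg.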
Now, let's say we have test.bf as an input file containing code in my language or in original Brainfuck. The interpreter should provide “backward compatibility”.
Again, without loss of generality, we can store all the code in an array of limited size, i.e. the interpreter will work with files of limited size, let's say:
const unsigned long maxCodeSize = 1024000; // maximum code size, in characters
unsigned long realCodeSize; // realCodeSize < maxCodeSize
char code[maxCodeSize]; // the program text
The interpreter reads all the code at once into a single character array; for this we will use the readCode() function. After reading non-empty text, realCodeSize will contain the exact number of characters in the code, not counting comments: comments are discarded during reading.
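Reading with comment stripping could look roughly like the sketch below. It is not the attached implementation: readCodeFrom is a hypothetical helper introduced so the logic can be tried on an in-memory stream, the bool return value is my choice, and I assume that comments do not nest.

```cpp
#include <fstream>
#include <istream>
#include <sstream>  // only needed for the in-memory usage described below

const unsigned long maxCodeSize = 1024000;
unsigned long realCodeSize = 0;
char code[maxCodeSize];

// Hypothetical helper: copy characters from `in` into code[], skipping
// everything between '(' and ')'. Returns false if the program does not
// fit into code[] or a comment is left unclosed. Assumes no nested comments.
bool readCodeFrom(std::istream& in) {
    realCodeSize = 0;
    bool inComment = false;
    char ch;
    while (in.get(ch)) {
        if (inComment) {
            if (ch == ')') inComment = false;
            continue;
        }
        if (ch == '(') {
            inComment = true;
            continue;
        }
        // Keep realCodeSize strictly below maxCodeSize.
        if (realCodeSize + 1 >= maxCodeSize) return false;
        code[realCodeSize++] = ch;
    }
    return !inComment;
}

// Open the file and delegate to readCodeFrom.
bool readCode(const char* fileName) {
    std::ifstream file(fileName);
    if (!file) return false;
    return readCodeFrom(file);
}
```

For a quick check without a file, a std::istringstream holding "+(add one)+>(move)!" can be passed to readCodeFrom; only the four command characters end up in code.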
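int main(int argc, char** argv)
{
    welcome();
    resetRegisters();
    readCode("test.bf");
    // loop() is declared with five parameters. Passing regSize as condRegIndx
    // (no conditional cell at the top level) and 0 as currReg is an assumption.
    loop(0, realCodeSize - 1, regSize, 0, reg);
    return 0;
}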
Next, we declare the functions for the while loop, for executing a function, and for copying registers.
bool loop( unsigned long from,
           unsigned long to,
           unsigned long condRegIndx,
           unsigned long currReg,
           long* registers );
bool runFunction( unsigned long from,
                  unsigned int to,
                  unsigned int& retValue );
void copyRegistersTo( long* source, long* destination );
The first executes the loop and returns true if the loop completed without problems, i.e. without syntax errors.
The second actually executes the function; the return value is written to retValue, which in turn is assigned to the register on which the function was called. The return value is the first register of the function's stack after the function terminates.
By the way, about the while loop: in general, a loop can run indefinitely. To keep the interpreter from hanging, we introduce a constant that sets the maximum number of loop iterations.
const unsigned long maxLoopCycles = 999999;
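How such a cap might be applied is sketched below. runGuardedLoop is a hypothetical helper, not the loop() function declared earlier; the loop body is passed in as a callable only to keep the example short.

```cpp
const unsigned long maxLoopCycles = 999999;

// Hypothetical helper: repeat `body` while the conditional cell is non-zero,
// giving up once maxLoopCycles iterations have been executed.
// Returns true if the loop ended on its own, false if the cap was hit.
template <typename Body>
bool runGuardedLoop(long* registers, unsigned long condRegIndx, Body body) {
    unsigned long cycles = 0;
    while (registers[condRegIndx] != 0) {
        if (cycles++ >= maxLoopCycles) {
            return false;  // most likely an endless loop: stop instead of hanging
        }
        body();
    }
    return true;
}
```

A body that never changes the conditional cell is cut off after maxLoopCycles iterations, and the caller can report the problem instead of hanging.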
We first implement backward compatibility: for now, let our interpreter execute only Brainfuck code.
We will need two functions:
bool makeCommand( char command, long* registers, unsigned long currReg );
unsigned long findLoopEnd( const unsigned long from );
The second and third parameters of the first function are both required. The third tells the function which cell to work with; the second is needed because every function has its own registers, while the operations on them are the same.
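A minimal sketch of makeCommand for plain Brainfuck might look like this. It covers only the four commands that touch the cell value. Since currReg is passed by value in the signature above, I assume that > and < are handled by the caller; that split, and leaving the cell unchanged at end of input, are my assumptions and not necessarily how the attached interpreter does it.

```cpp
#include <iostream>

// Sketch: execute one cell-value command on registers[currReg].
// Returns false for any character it does not handle.
bool makeCommand(char command, long* registers, unsigned long currReg) {
    switch (command) {
        case '+':
            ++registers[currReg];
            return true;
        case '-':
            --registers[currReg];
            return true;
        case '.':
            // Print the cell value as a character.
            std::cout.put(static_cast<char>(registers[currReg]));
            return true;
        case ',': {
            // Read one character; at end of input the cell is left unchanged.
            char ch;
            if (std::cin.get(ch)) {
                registers[currReg] = static_cast<unsigned char>(ch);
            }
            return true;
        }
        default:
            return false;
    }
}
```

Because the register array is a parameter, the same function can later be reused for the copy of the cells that belongs to a function call.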
The second function, as its name suggests, finds the end of the loop, i.e. the ']' that matches the '[' at position from.
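One common way to find the matching bracket is to count nesting depth, as in the sketch below. Returning realCodeSize when there is no match is my own convention for the example; the attached interpreter may signal the error differently.

```cpp
const unsigned long maxCodeSize = 1024000;
unsigned long realCodeSize = 0;
char code[maxCodeSize];

// Sketch: `from` must be the index of a '['. Returns the index of the
// matching ']', or realCodeSize if `from` is not a '[' or no match exists.
unsigned long findLoopEnd(const unsigned long from) {
    if (from >= realCodeSize || code[from] != '[') {
        return realCodeSize;
    }
    unsigned long depth = 0;
    for (unsigned long i = from; i < realCodeSize; ++i) {
        if (code[i] == '[') {
            ++depth;           // entering a (possibly nested) loop
        } else if (code[i] == ']') {
            if (--depth == 0) {
                return i;      // this ']' closes the '[' at `from`
            }
        }
    }
    return realCodeSize;       // unbalanced brackets
}
```

For the program +[>[-]<-]. the outer loop opens at index 1 and closes at index 8, while the inner loop opens at index 3 and closes at index 5.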
Thus, we have an interpreter for the Brainfuck language.
Attached to this post is the source code of my interpreter, together with the test code:
$[+<->]<<$>!<>>++++[++++++++++<->]<+++.++++++++++++++++++<<<!>[<-<+>>]>>.<<<!
For the code above, my interpreter prints the sum of the two numbers entered, in the form a + b = c.
Happy ... programming!
P.S.
If you are interested, I will explain later how to implement the rest.