In the last article we looked at J1, a very simple Forth CPU. Now it is time to explain what kind of language Forth is and how to compile it for this processor.
Language grammar
Forth is an ideal language to parse. A program consists of words, and the words are separated by whitespace. Words are the analogue of functions, for example:
open read close
This means the sequential execution of three functions: open, read and close. Comments in Forth usually look like this:
\ a comment that runs to the end of the line
( a comment inside parentheses )
Everything is very simple. The only thing that may be off-putting is reverse Polish notation (RPN). For example, adding three numbers is written like this:
1 2 3 + + ( 2+3=5, then 1+5=6 )
1 2 + 3 + ( 1+2=3, then 3+3=6 )
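To make the evaluation order concrete, here is a minimal Python sketch of a stack-based evaluator. It is only an illustration, not part of the J1 toolchain: the function name eval_rpn is made up, and it understands nothing but integers and "+".

```python
def eval_rpn(source):
    """Evaluate a whitespace-separated RPN expression of integers and '+'."""
    stack = []
    for word in source.split():
        if word == "+":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)      # replace the two top numbers with their sum
        else:
            stack.append(int(word))  # a number is simply pushed
    return stack


print(eval_rpn("1 2 3 + +"))  # [6]
print(eval_rpn("1 2 + 3 +"))  # [6]
```

Both spellings leave the same single value on the stack; they differ only in how deep the stack gets along the way.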
A Forth program is nothing more than:
- defining your own words in terms of existing ones
- performing some action by calling these words
Standard words
Let's define a minimal set of words on which we can build everything else. For each word I will add a comment, as is customary in Forth, describing the state of the stack before and after the word is called.
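noop ( -- ): does nothing
+ ( a b -- a+b ): addition
xor ( a b -- a^b ): bitwise exclusive OR
and ( a b -- a&b ): bitwise AND
or ( a b -- a|b ): bitwise OR
invert ( a -- ~a ): bitwise NOT
= ( a b -- a==b?1:0 ): equality test
< ( a b -- a<b?1:0 ): "less than" test
swap ( a b -- b a ): exchanges the top two elements
dup ( a -- a a ): duplicates the top element
drop ( a b -- a ): discards the top element
over ( a b -- a b a ): copies the second element to the top
nip ( a b -- b ): discards the second element
>r ( a -- ): moves the top element to the return stack
r> ( -- a ): moves the top of the return stack back to the data stack
@ ( a -- [a] ): reads the memory cell at address a
! ( a b -- ): writes a to the memory cell at address b ([b]:=a)
1- ( a -- a-1 ): decrements the top element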
Each of these words can be implemented with a single J1 ALU instruction (except for "!", which needs a trick: it has to remove two elements from the stack, and J1 cannot do that in one instruction).
There are a few more words that can be implemented with single instructions, but let's not complicate things and move on to creating our own words.
Creating new words
To define a word, use the following syntax:
: my-word ( before -- after ) ... ;
Here the colon starts the definition of a new word, my-word is the name of the word, and the semicolon at the end is a return from the function (a word call is essentially a CALL instruction, so there has to be a matching RETURN).
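The CALL/RETURN idea can be sketched with a toy interpreter in Python. This is a model under simplifying assumptions, not the actual compiler: the function name run is made up, only dup and + are built in, stack comments are not supported, and a "dictionary" simply remembers where each word's body starts.

```python
def run(source):
    """Interpret ': name ... ;' definitions and plain words; return the data stack."""
    tokens = source.split()
    dictionary = {}            # word name -> index of the first token of its body
    stack, rstack = [], []     # data stack and return stack
    pc = 0
    while pc < len(tokens):
        word = tokens[pc]
        pc += 1
        if word == ":":
            dictionary[tokens[pc]] = pc + 1   # the body starts right after the name
            pc = tokens.index(";", pc) + 1    # skip the body while defining
        elif word == ";":
            pc = rstack.pop()                 # RETURN: continue after the call site
        elif word in dictionary:
            rstack.append(pc)                 # CALL: remember where to come back
            pc = dictionary[word]
        elif word == "dup":
            stack.append(stack[-1])
        elif word == "+":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        else:
            stack.append(int(word))
    return stack


print(run(": double dup + ; : quadruple double double ; 5 quadruple"))  # [20]
```

Nested calls work because every call pushes its return position onto the return stack and every semicolon pops one.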
For example, there is a word called rot ( a b c -- b c a ). It rotates the top three numbers on the stack (that is, it moves the third number from the top of the stack to the top). Since the standard words operate on only two numbers, we have to store the third one somewhere temporarily. For this we use the return stack r. For example:
: rot ( a b c -- b c a )
  >r ( a b )
  swap ( b a )
  r> ( b a c )
  swap ( b c a )
;
1 2 3 rot ( 2 3 1 )
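The same sequence can be traced in Python, with lists standing in for the two stacks. This is only a sketch for checking the stack comments; the names run and ROT are made up for the illustration.

```python
def run(words, stack):
    """Run a list of primitive words on a data stack and return the final stack."""
    rstack = []  # return stack, used here as temporary storage
    for w in words:
        if w == ">r":
            rstack.append(stack.pop())
        elif w == "r>":
            stack.append(rstack.pop())
        elif w == "swap":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        else:
            raise ValueError("unknown word: " + w)
    return stack


ROT = [">r", "swap", "r>", "swap"]   # the body of rot
print(run(ROT, [1, 2, 3]))  # [2, 3, 1]
```

Applying rot three times in a row brings the stack back to its original order, which is a quick sanity check for the definition.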
Here is another interesting word (it returns one of two numbers depending on the third):
: merge ( abm
This word already looks harder. But Forth pushes you to write short words that can easily be tested separately from the rest of the code, and then the code as a whole stays simple and clear. This sensible idea goes back to Charles Moore in the 1970s.
Control structures and other elements of the language
The language has variables, constants, loops and branches. Declarations of variables and constants look like this:
( constants: value constant name )
0 constant false
1 constant true
42 constant answer
( variables: variable name )
variable x
variable y
( usage, equivalent to x=2; y=x+1 )
2 x ! ( x = 2 )
x @ ( stackTop = x )
1 + ( stackTop = stackTop + 1 )
y ! ( y = stackTop )
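The key point is that a variable name pushes an address, and "!" and "@" then write and read the cell at that address. A small Python model, assuming a dictionary as memory and made-up sequential addresses (none of this reflects the real J1 memory layout):

```python
memory = {}      # address -> value
addresses = {}   # variable name -> address (addresses are invented for this sketch)


def variable(name):
    """Reserve one memory cell for a new variable."""
    addresses[name] = len(addresses)
    memory[addresses[name]] = 0


def run(source):
    stack = []
    for word in source.split():
        if word in addresses:
            stack.append(addresses[word])      # a variable pushes its address
        elif word == "!":
            addr = stack.pop()
            value = stack.pop()
            memory[addr] = value               # ( value addr -- )
        elif word == "@":
            stack.append(memory[stack.pop()])  # ( addr -- value )
        elif word == "+":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        else:
            stack.append(int(word))
    return stack


variable("x")
variable("y")
run("2 x !  x @ 1 + y !")
print(memory[addresses["x"]], memory[addresses["y"]])  # 2 3
```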
A do..while-style loop is written as begin ... condition until:
5 begin 1 - dup 0 = until
Before the loop we push the number 5 onto the stack. Inside the loop we decrement it by one, then compare the result with zero; if it is not zero, the loop repeats.
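The loop can be traced word by word in Python. This is a hand-written model of that one line, not generated code; the function name countdown and the pass counter are additions for the illustration.

```python
def countdown(start):
    """Trace 'start begin 1 - dup 0 = until'; return the final stack and pass count."""
    stack = [start]                           # the number pushed before the loop
    passes = 0
    while True:                               # begin
        passes += 1
        stack.append(1)                       # 1
        b = stack.pop()
        a = stack.pop()
        stack.append(a - b)                   # -
        stack.append(stack[-1])               # dup
        stack.append(0)                       # 0
        b = stack.pop()
        a = stack.pop()
        stack.append(1 if a == b else 0)      # =
        if stack.pop() != 0:                  # until: consumes the flag, exits when non-zero
            break
    return stack, passes


print(countdown(5))  # ([0], 5)
```

The dup matters: "=" and "until" each consume values, so without it the counter itself would be gone after the first comparison.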
The conditional statement executes one of its branches depending on the value at the top of the stack. That value is removed from the stack.
condition IF ... THEN
condition IF ... ELSE ... THEN
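One common way to compile these constructs is to turn them into jumps: IF becomes a jump-if-zero over the true branch, ELSE becomes an unconditional jump over the false branch. The Python sketch below assumes exactly that scheme; the op names jz, jmp, lit and word and the list encoding are invented for the illustration and are not J1 opcodes.

```python
def compile_forth(source):
    """Compile words into a flat list of [op, arg] pairs with resolved jump targets."""
    code = []
    fixups = []  # indices of jumps whose target is not known yet
    for word in source.split():
        w = word.lower()
        if w == "if":
            fixups.append(len(code))
            code.append(["jz", None])         # skip the true branch when the flag is 0
        elif w == "else":
            pending_if = fixups.pop()
            fixups.append(len(code))
            code.append(["jmp", None])        # the true branch jumps over the false branch
            code[pending_if][1] = len(code)   # flag 0 lands at the start of the false branch
        elif w == "then":
            code[fixups.pop()][1] = len(code)
        elif w.lstrip("-").isdigit():
            code.append(["lit", int(w)])
        else:
            code.append(["word", w])
    return code


def run(code, stack):
    pc = 0
    while pc < len(code):
        op, arg = code[pc]
        pc += 1
        if op == "lit":
            stack.append(arg)
        elif op == "jz":
            if stack.pop() == 0:              # IF consumes the flag
                pc = arg
        elif op == "jmp":
            pc = arg
        elif op == "word" and arg == "+":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        else:
            raise ValueError(f"unknown word: {arg}")
    return stack


program = compile_forth("if 10 else 20 then 1 +")
print(run(program, [1]))  # [11]
print(run(program, [0]))  # [21]
```

The fixups list is what lets the compiler work in a single pass: a forward jump is emitted with an empty target and patched once ELSE or THEN reveals where it should land.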
Those are the basic constructs of the language. There are also counted loops, loops with a precondition and many others, but they are beyond the scope of a single post.
Conclusion
The language is pretty simple and interesting. Once you get used to it, you can even read the code. A simple compiler for J1 is available here. So far it can compile only very little, but perhaps someone will find it interesting. Like the emulator, it is written in Go.
In real life, Forth is used mainly in embedded systems, because its bytecode takes up very little space (sometimes even less than compiled C). Of the major projects written in Forth, I can name OpenFirmware and the BIOS of the OLPC laptops (which is in fact also OpenFirmware). By the way, OLPC has a good tutorial on its site.