
Functional programming is back in fashion. Depending on whether you prefer classics or hardcore, suffer from imposed industrial standards, or you are just a hipster, your favorite preference may be Scala, Haskell, F # or even good old Lisp. Such combinations of words as a function of a higher order, the absence of side effects, and even monads, caress the ears of all “weak young minds,” whether they are from JetBrains or a student who first saw SICP.
But there is another programming, in the literal sense, even more functional, basically having more likely not a lambda calculus, but a composition of functions. And I want to tell a little about him.
What is concatenative programming?
Concatenation is a concatenation of several lines. A language in which the usual combination of two code fragments turns out to be their composition is called concatenative
[1] . Each of these pieces of code is a function that takes a stack as an argument and returns the stack as its result. Therefore, more often such languages ​​are called stack-based.
')
Stack languages ​​are alive and exist. Due to the speed and lightness of translators, they are used in the real world. JVM, the most common virtual machine in the world; CPython, another stack virtual machine [2], which is used in many high-load projects, such as Youtube and BitTorrent; in the PostScript language, used now in high-performance top-end printers, finally, the good old Forth, which is still found for programming microcontrollers and embedded systems. All of them are wonderful in their own way, but it is worth noting that although the general principles are similar, but by and large the discussion will deal with languages ​​of a higher level, such as Fätor, Joy or Cat.
Functions
In concatenative languages, everything is a function, but for comparison with other languages, let's look at an example of the squaring function with a definition. Unshakable imperative
From :
int square(int x) { return x * x; }
Functional
Scheme :
(define square (lambda (x) (* xx)))
Concatenative
Joy :
DEFINE square == dup * .
Concatenative
Factor :
: square ( x
Concatenative
Cat :
define square { dup * }
Here
dup creates a copy of the argument at the top of the stack, and puts this copy there too.
* - this is the usual multiplication function.
*:: (int) → (int) → (int)
which “expects” two arguments on the stack; in this case, the two most recent ones on the stack are a copy of the same number. If now we want to use this function, it will be enough for us to write
3 square , where 3 is also a function with a signature
3::() → (int)
i.e. returning herself. In fact, it is more correct to say that this is a type with row polymorphism and it returns the entire current stack. [5]
In concatenative languages, they are not used (more precisely, it is not recommended, but if you really want, you can) variables and lambdas, and this fact makes it rather concise. Instead of an applicative approach, here is a composition written in postfix (reverse Polish) notation, which corresponds to the sequence of data with which we work on the stack, which means that in some way it makes it easier to read the stack code. Let's look at two examples. The first is the usual Hello world, which looks somewhat unusual in this notation:
"Hello world" print
We put the string “Hello World” on the stack, and then use the print function to retrieve an element from the top of the stack and display it on the screen.
10 [ "Hello, " "Factor" append print ] times
It uses operations on quotations, which will be discussed a little later, but it’s pretty obvious that 10 and a quotation are put on the stack, and times the code from the quotation is executed a specified number of times. The code inside the quote just sequentially pushes two lines onto the stack, then connects them and displays them on the screen.
You can try a small concatenative language directly in your browser at
trybrief.com . As it is easy to notice, the interpreter is written in javascript and is quite simple, you can look at it at ./Engine.js. Cat, whose interpreter was originally implemented in C #, and now in so many other languages, also has an
online js interpreter .
Quotes and combinators.
The ability to take pieces of code in quotes ([]), makes it possible to write your own control structures. Quotes are functions that return the contents of a quoted expression. The citation function in this case will be of the following type.
quote :: (T) → (() → T).
For example, the entry [5] in square brackets will return not the number 5 of the integer type itself, but its quotation. It is clearer when it comes to expressions, for example [5 6 +]. Without quoting, the number 11 will be calculated and put on the stack, while quoting will put the expression 5 6 + on the stack, with which you can interact in many different ways.
Combinators are functions that expect one or more quotes from the stack and apply them in a special way to the rest of the stack.
Essentially, quotes are regular listings. They can be combined with other quotes, perform various manipulations on the stack, and then apply them using different combinators. For example, the combinator fold collapses a list into a value, an example that summarizes the elements of a list (Joy):
[2 5 3] 0 [+] fold
Quick sort with recursive binrec in Joy combinator:
DEFINE qsort == [small] [] [uncons [>] split] [[swap] dip cons concat] binrec .
Nothing prevents the implementation of monads
as in Factor and yes, Fator, for example, knows how to optimize tail recursion.
Control structures.
Thanks to citations and combinators, you can use current or create your own control structures. Example for if (Factor):
= [ 23 ] [ 17 ] if
To understand how this works, I recommend refreshing knowledge on the
definition of Boolean functions in lambda calculus. A slightly more complex example on Factor:
: sign-test ( n -- ) dup 0 < [ drop "" ] [ zero? [ "" ] [ "" ] if ] if print ;
The function determines the sign of the number. If operates on two quotes, highlighted by square brackets and placed on the stack higher than the expression dup 0 <. If the number at the top of the stack when calling sign-test (ie, the function argument) is less than zero, the first quote is used, which pushes “negative” onto the stack (throwing out the test number that was copied). In case the number is more, the second quotation is fulfilled, inside containing one more condition. After that, the stack contains a string denoting the sign of a number, this string is output by the print function.
Benefits:
Speed ​​and optimizability .
:A 1 2 + print :add2 2 + ; :B a add2 print
In this case, A = B due to the associativity of the composition. The program in concatenative language is divided into pieces that are assembled after compilation. In fact, this is a regular
map-reduce , fast and allows you to compile each piece of code in parallel.
Refactoring . If you see two identical parts of the code, put them into a separate function. Use combinators, quoting and own control structures. Examples of Cat language optimizations:
f map g map => [fg] map [f] map x [g] fold => x [[f' f] g] fold [f] filter x [g] fold => x [f [g] [pop] if] fold [f] map [g] filter x [h] fold => x [fg [h] [pop] if] fold
Difficulties
All the flaws of stack languages ​​obviously originate from their features. This is a composition instead of an application, this is pointless notation, this is the absence of variables and the difficulty of writing some expressions. Provide a simple expression in the concatenative language
'abc d' . If the function c takes two arguments, then in the applicative form the expression will be written as
d (c (b, a)) , if
b and
c take one argument at a time, then
d (c (b (a))) ; if
d is a function of two arguments,
b is one, and
c is a function that does not accept arguments, then the expression in the applicative form will be of the form
d (c, b (a)) . You need to very carefully write code in a concatenated language, because you have to understand everything in it. This is supported by strong typing in Cat, perceived support from the IDE, various methods and mechanisms that allow you to change your thinking. But such a problem can overtake you in your favorite Haskell with its pointless notation of function compositions (in which there are just dots). Writing in a postfix form without named variables for any complex algebraic expression is also not possible in the same way as we used to be from childhood. This is the main problem of these languages ​​- that, like classical functional ones, they require a slightly different thinking from you. In terms of production, and the story of Scala has not calmed down yet, the languages ​​familiar to everyone are required, because the whole thing rests on human resources.
Each function in the concatenative language operates on a stack, and although it may be by itself without side effects, there are no assignments in it, variables, but it changes the stack, i.e. global state, so the question of purity for many is not obvious. However, for those high-level languages ​​in question, this is not entirely fair, because it is rather a matter of implementation. And in order to be applicable in the real world, they also have variables for those rare occasions when they are needed (for example, SYMBOL: in Factor or function declarations to increase readability of the code), they can be quite pure in mathematical terms (like Joy
[ 4] ) and, like many conceptual differences between approaches, paradigms and platforms, in the end be just a matter of taste. Provided adequate selection of tools for the task. That is why Forth is used in embedded systems, its translator is very easy to implement and it will be very small.
I hope someone discovered something new. Personally, I was inspired by the text “Why concatenative programming is important”
[5] and now for several days I can not tear myself away from this topic, and even being in a sleepy state I can say the word “concatenative” many times.
1.
Wikipedia: Concatenative programming2.
Pypy interpreter3.
http://docs.factorcode.org/content/article-cookbook-philosophy.html4.
Functional Programming in Joy and K5.
Why Concatenative Programming Matters?6.
Objects and Aspects: Row Polymorphism