A radical approach to application development

From the translator: In 2007, in search of a web engine, I came across a very interesting and unusual lisp dialect. And after reading a few articles, I was fascinated by its principles. Since my main job is far from web programming, I do not use it professionally, but from time to time I return to it and gradually become “stormy”.

For all the time familiar with this language, he almost never flashes, and in Russian there is almost no information about him. Let's try to fill this gap. Despite the fact that the original article dates from the 2006th year, the topic is quite relevant.
Many thanks for the help in translating Nadezhda Zakharova and the wonderful site Notabenoid .

1. Introduction

I work as a consultant and developer of free software. For twenty years, I and my partners have been working on projects such as image processing, computer-aided design, modeling, and various financial and business applications.

For almost all of these projects, we used Lisp. My daily work is to listen to customer requests, analyze business processes and develop software according to their needs.
')
Usually, in business applications such as ERP or CRM, this is a process of constant change. At the beginning of a new project, neither the developer nor the customer knows exactly what is needed, or what the final product should look like.

This comes during an iterative process (some call it “extreme programming”). The client evaluates each new version, then strategies for further development of the program are discussed. Often, unforeseen requirements force rewriting large parts of the project. This does not mean that the project was poorly planned, because the process I describe is planning. In an ideal world, software development is just planning, the time spent directly writing code should tend to zero.

We need a programming language that allows us to express directly what we want the program to do. We believe that the code should be as simple as possible, so that any programmer at any time can understand what is happening in the program.

Over the years, the Pico Lisp system has evolved from a minimalist Lisp implementation to a specialized application server. Please note, we are not talking about the tool for rapid prototyping. At each stage of development, the result is a fully functional program, and not a prototype, which grows to the serial (possibly the latest) version. On the contrary, it can be called a powerful tool of a professional programmer who is used to keeping his development environment in order and wants to express the logic of his application and data structure in a concise presentation.

First we want to introduce you to Pico Lisp, explain why Pico at a low level is significantly different from other Lisp or development systems, and then show its advantages at higher levels.

2. Radical approach

Perhaps the community (Common-) Lisp will not be delighted with Pico Lisp, because it destroys some of the beliefs and dogmas that have become traditional. Some of them are just myths, but they can cause excessive complexity, slowness of Lisp. Practical experience with Pico Lisp proves that lightweight and fast Lisp is optimal for many types of efficient application development.

2.1. Myth 1: Lisp needs compiler

In fact, this is the most important myth. If you participated in Lisp group discussions, then you know that the compiler plays a major role. It may appear that this is almost a synonym for the execution environment. It is important for people what actions the compiler performs with the code and how effective it is. If it seems to you that a Lisp program is slowly running, then you decide that you need a better compiler.

The idea of an interpreted Lisp is regarded as an old misconception. Modern Lisp requires a compiler, and the interpreter is only a useful addition, and is needed for interactive debugging of the application. It is too slow and bloated to execute programs that are already at the stage of readiness for release.

We are sure that the opposite point of view is correct. On the one hand (and not only from a philosophical point of view), a compiled Lisp program is no longer Lisp at all. This violates the fundamental rule of "formal equivalence of code and data." The resulting code does not contain S-expressions and cannot be processed by Lisp. The source language (Lisp) is converted to another language (machine code) with inevitable incompatibilities on different machines.

In practice, the compiler complicates the system as a whole. Features such as multiple-linking strategies, typed variables, and macros have been developed to meet the needs of compilers. The system is bloated because it must also support the interpreter, and accordingly, two different architectures.

But is it worth the effort? Of course, the execution speed is higher, and compiler creation is interesting for learning purposes. But we argue that in everyday life a well-designed “interpreter” can often surpass the compiled system.

You understand that in reality we are not talking about “interpretation”. The Lisp system immediately converts the data passed to it into internal pointer structures called S-expressions. True “interpretation” works with one-dimensional character codes, and this significantly slows down the execution process. Lisp, on the other hand, “computes” S-expressions by quickly following these pointer structures. There are no searches, so nothing is really "interpreted." But we will stick to this familiar term.

A Lisp program as an S-expression forms a tree of executable nodes. The code of these nodes is usually written in an optimized C or assembler, so the task of the interpreter is to transfer control from one node to another. Because many of these built-in Lisp functions are very powerful and do a lot of computation, most of the execution time is on nodes. The tree itself functions as a kind of glue.

The Lisp compiler removes some of this glue and replaces some nodes with primitive or streaming functionality directly into machine code. But, since in any case most of the execution time comes from built-in functions, these improvements are not as dramatic as, for example, the Java bytecode compiler, for which each node (bytecode) has relatively primitive functionality.

Of course, compilation itself also takes quite a lot of time. The application server often executes Lisp source files on the fly in one run and immediately discards the code as soon as it runs. In such cases, either the initially slower compiler-based Lisp interpreter or the extra time spent by the compiler will significantly reduce overall performance.

Pico Lisp's internal structures were originally designed for ease of interpretation. Although they were completely written in C, and were not specifically optimized for speed of execution, there was never a problem of insufficient performance. The first commercial system written in Pico Lisp was a system for processing and retouching images, as well as for creating page layouts for printing. It was created in 1988 and was used on Mac II with a 12 MHz CPU and 8 MB RAM.

Of course, there was no Lisp compiler, there were only low-level pixel manipulations and bezier functions written in C. Even then, when working on a computer that was hundreds of times slower than modern ones, no one complained about performance.

For the sake of interest, I installed CLisp and compared it with Pico Lisp using simple tests as an example. Of course, this does not mean that the test results show the usefulness of a particular system as an application server, but they give a rough idea of the performance of these systems. At first, I tried to perform a simple recursive Fibonacci function.

(defun fibo (N) (if (< N 2) 1 (+ (fibo (- N 1)) (fibo (- N 2)) ) ) )

When I called this function with parameter 30 (fibo 30), I got the following results (testing was done on a Pentium-I 266 MHz laptop):

Pico (interpretation)	12 seconds
CLisp interpretation	37 seconds
CLisp compiled	7 seconds

The CLisp interpreter is almost three times slower, and the compiler is almost twice as fast as Pico Lisp.

However, the Fibonacci function is not a good example of a typical Lisp program. It consists only of a primitive flow and arithmetic functions, which is easily optimized by the compiler and can be written directly in C if this is critical in time (in this case, the execution would take only 0.2 s)

Therefore, I took another extreme case, with a function that performs extensive list processing:

 (defun tst () (mapcar (lambda (X) (cons (car X) (reverse (delete (car X) (cdr X))))) '((abcabc) (bcdbcd) (cdecde) (defdef)) ) )

Calling this function 1 million times, I received:

Pico (interpretation)	31 seconds
CLisp interpretation	196 seconds
CLisp compiled	80 seconds

Now the CLisp interpreter is more than 6 times slower, but to my surprise even the compiled code is 2.58 times slower than Pico Lisp.

Maybe CLisp has a slow compiler? And maybe the code can be sped up with some tricks. But these results still leave many doubts as to whether the overhead of compilation can be justified. To mess around with compiler optimization is the last thing I want to do when it comes to application logic, and when the user still doesn't notice the delays.

2.2. Myth 2: Lisp Requires Many Data Types

The Fibonacci function described in the example above can be accelerated by declaring the variable N as an integer. But then this example will show how strongly the requirements of compiler support affect Lisp. The compiler can produce more efficient code if the data types are hard coded. Common Lisp supports many different data types, including various integer types, fixed / floating point types, fractional numbers, characters, strings, structures, hash tables, and vector types in addition to lists.

On the other hand, Pico Lisp supports only three built-in data types — numbers, symbols, and lists — and it is quite fine for these types of data. A Lisp system runs faster with fewer data types, because fewer options need to be checked at runtime. Perhaps this will entail a less efficient use of memory, but a smaller number of types allows you to save space due to the fact that less bits are required for tags.

The main reason for using all three types of data is simplicity, and the advantage in simplicity outweighs the benefit of speed and memory compensation.

In fact, Pico Lisp at the lowest level uses only one type of data, cells that are used to form numbers, symbols, and lists. A small number or minimum character occupy only one memory cell, dynamically increasing if necessary. This memory model allows for efficient garbage collection and completely avoids fragmentation (as it would, for example, with vectors).

At the highest level, you can always emulate other data types using these three primitive data types. So, we emulate trees using lists, strings, classes, and objects using symbols. While there are no performance problems, why complicate things?

2.3. Myth 3: Dynamic binding is bad

Pico Lisp uses a simple implementation of dynamic surface binding. The contents of the cell storing the value of the character are saved when entering a lambda expression or binding environment, and then a new value is set. When returned, the original value is restored. As a result, the current value of a symbol is determined dynamically by history and execution state, and not by static checks of the lexical environment.

Perhaps for the interpreted system this is the simplest and fastest strategy. To look at the cell value, no searches are required (only access to the cell value is required) and all characters (local or global) are treated the same. On the other hand, the compiler can produce more efficient code for lexical binding, thus compiled Lisp code usually complicates everything because of the support of several types of binding strategies.

Dynamic binding is a very powerful mechanism. You can access the current value from any place, the variable itself and its value are always physically existing “real things”, and not what “seem” (as is the case with lexical binding, and to some extent using transit characters in Pico Lisp (see below)).

Unfortunately, big opportunities are impossible without big risks. The programmer must be familiar with the basic principles in order to use their advantages and avoid traps. However, as long as we adhere to the agreements recommended by Pico Lisp, the risks will be minimal.

There are two types of situations where the results of calculations using dynamic linking can get out of the programmer’s control:

the symbol is associated with itself, and we are trying to change the meaning of the symbol;
the problem of funarg (functional argument), when the value of a symbol is dynamically changed by a through code that is invisible in the environment of the current source code.

Such situations can be avoided by using transit characters.

Transit characters are characters in Pico Lisp that look like strings (and are often used as strings) and that are only temporarily interned for the duration of the execution of one file with the source code (or only parts of it). Thus, they have lexical capabilities comparable to static identifiers in C programs, only their behavior is completely dynamic, because they are normal characters in all other respects.

So the rules are simple: whenever a function has to change the value of a variable passed to it or calculate the result of a passed Lisp expression (directly or indirectly), the parameters of this function should be written using transit characters. Practical experience shows that such cases are rare in high-level software development processes and occur mainly in auxiliary libraries and system tools.

2.4. Myth 4: Property Lists Is Bad

Properties is an elegant, clear way to associate information with symbols in addition to the value / function cell. They are extremely flexible, since the number and type of data are not statically fixed.

It seems that many think that property lists are too outdated and primitive to use them in our time. Instead, more advanced data structures should be used. Although this is true in some cases, depending on the total number of properties in a symbol, the payback threshold may be higher than expected.

Previous versions of Pico Lisp have experimented with hash tables and self-balancing binary trees to store properties, but we have found that regular lists are more efficient. We must take into account the cumulative effect of the entire system, and the overhead of both supporting a large number of internal data structures (see above) and more complex search algorithms is often greater than using simple linear search. And when we also touch upon the issue of memory efficiency, the benefits of property lists definitely benefit.

Pico Lisp implements properties in the form of a list of key-value pairs. The only concession in favor of speed optimization is the “most recently used” scheme, which slightly accelerates repeated access, but we have no concrete signs that this was actually necessary.
Another argument against properties is their declared global visibility. This is as true as the fact that a global element in a C-structure or an instance variable in a Java object is global.

Of course, in a global symbol, a property is also global, but in typical application development, properties are stored in anonymous characters, objects, or database elements that are available only in a well-defined context. Therefore, the “color” property can be used in a certain sense in one context, and in a completely different sense in another context, without any mutual interference.

3. Application Server

Based on this simple Pico Lisp machine, we have developed a vertically structured application server. It unifies the database engine (based on the PicoLisp implementation of stored (persistent) objects as a first-class data type) and an abstract graphical interface (generating, for example, HTML or Java applets).

A key element in this unified system is a Lisp-based markup language that is used to implement individual application modules.

Whenever the application server requests a new view from the database, a document or report, or some other service, the Lisp source code file is loaded and executed on the fly. This is similar to a URL request and then sent the HTML file in a traditional web server.

However, Lisp expressions computed in such a scenario usually have the side effect of building and processing an interactive user interface.

These Lisp expressions describe the structure of GUI components, their behavior in response to user actions and their interaction with database objects. In short, they contain a complete description of the software module. To make this possible, we found that it is important to adhere strictly to the Local Principle, and use the “Prefix Classes” and “Communication Support Daemons” mechanisms (the latter two are described in another document ).

3.1. The principle of Locality

As we said, business application development is a process of constant change. The principle of Locality proved to be a great help in the development of such projects. This principle requires that all information relating to one module should be stored with this module in one place. This allows the programmer to focus only on one place where all this is stored.

Of course, all this seems quite obvious, but in contrast, software development methodologies prescribe to encapsulate behavior and data and hide it from the rest of the application. Usually, this leads to the fact that the application logic is written in one place (the source file), but the functions, classes and methods for implementing this logic are defined elsewhere. Of course, this is a good recommendation, but it brings many problems, manifested in the need to constantly move to different storage locations: modifications and context switching occur simultaneously in several places. If a function is outdated, some modules may also become outdated, but we forget to delete them.

Thus, we believe that it is optimal to create an abstract library of functions, classes and methods - as universal as possible, constant in time and various applications, and used to build a strict markup language, which has a high degree of expressiveness for creating applications.

This language must have a compact syntax and allow to describe all the static and dynamic aspects of the application. Locally, in one place. No need to define behavior in separate files.

3.2. Lisp

And this is the very main reason why we stated from the very beginning that Lisp is the only language that suits us.

Only Lisp allows you to process code and data in the same way, and this is the basis of the Pico Lisp application development model. It allows you to intensively use functional blocks and calculated expressions that are freely mixed with static data and which can be both transmitted somewhere and stored in internal data structures at run time.

As far as we know, in other languages this is impossible, at least with the same simplicity and elegance. To some extent, this can be done with the help of scripting languages, using interpreted lines of text, but this solution will be rather limited and awkward. And, as we described above, systems on a compiled Lisp can be too heavy and inflexible. In order for all these data structures and code fragments to work smoothly, the dynamic surface referencing strategy is a great advantage, since expressions can be calculated without the need for binding the environment settings.

Another reason is that Lisp allows you to directly manipulate complex data structures, such as characters and nested lists, without the need for explicitly declaring, allocating, initializing, or freeing these structures from memory. This contributes to the compactness and readability of the code and gives the programmer a powerful expression tool that allows you to do different things on the same line where other languages require writing a separate module.

In addition, since Pico Lisp does not make a formal distinction between database objects and internal characters, all of these advantages also apply to working with a database, leading to direct connection of operations with GUI and DB in one local context using identical code.

4. Conclusion

The Lisp community seems to suffer from the paranoia of "inefficient" Lisp. This is probably due to the fact that for decades they have been forced to defend their language from the claims that "Lisp is slow" and "Lisp is bloated."

This was partly true. But on today's hardware, execution speed does not matter for many practical applications. And in those cases where it has, coding several critical functions in C usually solves this problem.

Now let's focus on more practical aspects. Some may be surprised at how compact and fast a supposedly ancient Lisp system can be. Thus, we must be careful not to make Lisp really “bloated”, overloading the core of the language with more and more possibilities, but rather decide to use simple solutions that give complete flexibility to the programmer.

Pico Lisp can be seen as proof of the concept “Less can be more.”

Source: https://habr.com/ru/post/177791/

All Articles