The Evolution of an Extension Language: A History of Lua

Translator's note: The source material dates back to 2001, so some things may seem quaint. All references to "today", "at the moment", etc. refer to that period.
The narration is in the first person, on behalf of the authors, as in the original.
All links were added by me.

The account is organized chronologically. We begin with the experiments that formed the basis for the creation of Lua in 1993, and go through eight years of discussions, decisions, work, and fun.

Introduction


There is an old joke: “a camel is a horse designed by a committee”. Among programming language designers, this joke is almost as popular as the legend of languages designed by committees. The legend is supported by languages such as Algol 68, PL/I and Ada, which were designed by committees and did not live up to expectations.
However, apart from committees, there is an alternative explanation for the partial failure of these languages: they were all born big. Each of them followed a top-down design process, in which the language is fully specified before any programmer can try it out, and even before the first compiler appears.

On the other hand, many successful languages were in use before they were fully specified. They followed a bottom-up design process, starting out as small languages with modest goals. As people began to use the still immature language, new features were added to it (or, sometimes, removed), and ambiguities were clarified (or, on the contrary, made even more complicated). The path of a language's development is therefore an important topic in the study of programming languages. For example, SIGPLAN has already sponsored two conferences on the history of programming languages.
This document describes the history of the Lua programming language. Since it first appeared as an in-house language for two specific projects of our own, Lua has exceeded our most optimistic expectations. In our opinion, the main reasons for this success lie in our approach to designing the language: keep the language itself simple and compact, and keep its implementation simple, compact, fast, portable and free.

Lua was designed (or, more precisely, raised) by a committee; a small one, with only three members, but a committee nonetheless. In hindsight, we understand that the initial development of the language by a small committee had a positive effect on it. We added a new feature only by unanimous agreement; otherwise it was postponed. It is much easier to add features later than to remove them. This development process kept the language simple, and the simplicity of the language is our most valuable contribution. Lua's other key qualities (speed, compactness and portability) derive from its simplicity.

From its first versions, Lua had “real” users, not just us. They made important contributions to the language through discussions, complaints, user reports and questions. On the other hand, our small committee played an important role: its structure gave us enough inertia to listen to users without following every suggestion.

Start


Our first experience at TeCGraf (translator's note: the Computer Graphics Technology Group of the Catholic University of Rio de Janeiro, Brazil) with developing our own programming language arose in a data entry application. Engineers at PETROBRAS (a Brazilian oil company) had to prepare input data files for simulators several times a day. This was a boring and error-prone process, since the simulation programs required strictly formatted input files, usually consisting of bare columns of numbers with no indication of what each number meant. Of course, each number had a definite meaning, which engineers could grasp at a glance from the diagram of a specific simulation. PETROBRAS asked TeCGraf to create a graphical front end for this kind of input data. Numbers could be entered interactively, simply by clicking on the relevant part of the diagram, which was a much simpler and more visual task than editing columns of numbers. Moreover, it made it possible to add validation and to compute derived values from the input data, thereby reducing the amount of data required from the user and increasing the reliability of the whole process.
To simplify the development of this application at TeCGraf, we decided to program everything in a uniform way, designing a simple declarative language to describe every data entry task. Here is a typical section of a program in this language, which we called DEL (data entry language):
 :e gasket "gasket properties"
 mat s # material
 m f 0 # factor m
 y f 0 # settlement stress
 t i 1 # facing type
 :p
 gasket.m>30
 gasket.m<3000
 gasket.y>335.8
 gasket.y<2576.8

The :e statement declares an entity (named “gasket” in the example) that has several fields with default values. The :p statement imposes constraints on the values in gasket, thus implementing data validation. DEL also has statements for describing input and output data.
An entity in DEL is essentially a structure or record in traditional programming languages. The difference is that its name also appears in a graphical metafile containing the diagram through which engineers enter data, as described above.

This simple language proved successful both at TeCGraf, where it simplified development, and among users, who found it easy to adapt data entry applications. Soon users wanted more features from DEL, such as boolean expressions to control whether an entity was ready for data entry, and DEL grew heavier. When users started asking for conditional blocks and loops, it became clear that we needed a full-fledged programming language.

At about the same time, we started working on another project for PETROBRAS called PGM: a configurable report generator for lithologic profiles. As the name implies, the reports created by this program had to be highly customizable: the user could create and position tracks, and choose colors, fonts and texts; each track could have a grid with its own settings (logarithmic or linear, with vertical and horizontal cutoffs, etc.); each curve had its own automatic scale; and much more.

All these settings were specified by end users, usually geologists or engineers, and the program had to run on small machines, such as a PC with MS-DOS. We decided that the best way to configure the application was through a specialized description language, which we called Sol: an acronym for Simple Object Language that also means “sun” in Portuguese.

Since the report generator used many different objects, each with many attributes, we decided not to build these objects and attributes into the language. Instead, the language allowed types to be declared. The main task of the interpreter was to read a description, check that the objects and attributes in it were correct, and pass the information to the main program. To implement the interaction between the main program and the interpreter, the latter was implemented as a C library linked to the main program. The main program thus had full access to all configuration information through the library API. Moreover, the program could register a callback function for each declared type, which the interpreter called whenever an object of that type was created.

Here is a typical example of Sol code:
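 -- defines a type 'track' with numeric attributes 'x' and 'y',
 -- plus an untyped attribute 'z'. 'y' and 'z' have default values.
 type @track { x:number,y:number= 23, z=0}
 -- defines a type 'line' with attributes 't' (a track)
 -- and 'z' (a list of numbers).
 -- the default value of 't' is a track with x=8, y=23, and z=0.
 type @line { t:@track=@track{x=8},z:number*}
 -- creates an object 't1' of type 'track'
 t1 = @track { y = 9, x = 10, z="hi!"}
 -- creates an object 'l' (of type 'line') with t=@track{x=9, y=10}
 -- and z=[2,3,4] (a list)
 l = @line { t= @track{x=t1.y, y=t1.x}, z=[2,3,4] }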

Sol syntax was strongly influenced by BibTeX and by UIL (User Interface Language), a Motif user interface description language.
In March 1993, we completed the first implementation of the Sol language, but it was never delivered. In mid-1993, we realized that DEL and Sol could be combined into a single, more powerful language. The lithologic profile visualization program would soon need support for procedural programming to create more complex layers. On the other hand, the data entry programs also needed descriptive facilities for programming their user interfaces.

So we decided that we needed a complete programming language, with assignments, control structures, procedures, and all that. The language also needed data description facilities like those of Sol. Moreover, since many potential users of the language were not professional programmers, the language had to avoid complex syntax (and semantics). Finally, the implementation of the new language had to be highly portable.

The portability requirement turned out to be one of the language's main strengths: those two applications (translator's note: apparently the DEL and Sol programs are meant) had to be portable, and so did the language. PETROBRAS, as a state-owned company, could not choose specific hardware, since purchases were subject to very strict rules on spending public money. Because of this, PETROBRAS had a very diverse collection of computers, each of which had to run the software TeCGraf created for PETROBRAS: this included PC DOS, Windows (3.1 at the time), Macintosh and every flavor of Unix.

At this point, we could have adopted an existing language rather than create yet another one. The main candidates were Tcl and, far behind, Forth and Perl. Perl is not an extension language. In addition, in 1993 Tcl and Perl ran only on Unix platforms. All three languages also have complex syntax, and none of them had good support for data description. So we started work on a new language.

We soon realized that, for our purposes, the language did not need type declarations. Instead, we could use the language itself to write type-checking functions, relying on its basic reflective facilities (such as type information at run time; translator's note: this refers to RTTI). An assignment of the form
 t1 = @track {y = 9, x = 10, z="hi!"} 

which is valid in Sol, is also valid in the new language, but with a different meaning: it creates an object (in this case, an associative table) with the given fields, and then calls the track function to check the object (and, possibly, to set default values).
Since the language was a modified version of Sol ("sun"), a friend at TeCGraf suggested the name Lua ("moon" in Portuguese), and so the language Lua was born.

Lua inherited the syntax for record and list construction from Sol, but unified their implementation using associative tables: records use strings (the field names) as indices, while lists use integer indices. Apart from these data description facilities, Lua introduced no new concepts, since we needed a lightweight general-purpose language. So we started with a small set of control structures, with syntax borrowed from Modula (while, if, repeat until). From CLU we took multiple assignment and multiple return values from function calls (a much cleaner approach than in-out parameters or passing by reference). From C++ we took the idea of letting a local variable be declared at the point where it is needed.
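The unification of records and lists, together with the CLU-style multiple assignment and multiple results, can be illustrated with a short sketch. It is written in present-day Lua syntax rather than that of Lua 1.0, and the names point, primes, mixed and divmod are hypothetical, chosen only for this example.

```lua
-- A record: the field names (strings) are the table indices.
local point = { x = 10, y = 20 }

-- A list: integer indices 1, 2, 3 are assigned implicitly.
local primes = { 2, 3, 5 }

-- Both are the same kind of object, so one table can mix the two styles.
local mixed = { "first", "second", label = "demo" }

-- A function returning two results (divmod is a hypothetical helper).
local function divmod(a, b)
  local q = math.floor(a / b)
  return q, a - q * b
end

-- Multiple assignment receives both results at once.
local q, r = divmod(17, 5)

-- Multiple assignment also swaps two values without a temporary.
local a, b = 1, 2
a, b = b, a
```

Because point.x is just another way of writing point["x"], the same table machinery serves both the record and the list.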

One of the small (even minor) innovations was the syntax for string concatenation. Since the language allowed implicit coercion of strings to numbers, using the "+" operator for concatenation would have been ambiguous, so we added the ".." (two dots) operator for this purpose.
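A minimal sketch of why a separate operator removes the ambiguity, using present-day Lua (the variable names are illustrative only):

```lua
-- "+" always means arithmetic: a string that looks like a number is coerced.
local sum = "10" + 5

-- ".." always means concatenation: a number is coerced to a string.
local joined = "10" .. 5
```

With a single overloaded "+", the expression "10" + 5 could reasonably mean either 15 or "105"; with two operators the programmer states the intent explicitly.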

The use of ";" (the semicolon) caused disputes. We thought that requiring semicolons would confuse engineers who knew FORTRAN; on the other hand, not allowing them would confuse those who knew C or Pascal. In the end, we settled on optional semicolons (a typical committee decision).

Initially, the Lua language had seven data types: numbers (stored in floating-point format), strings, (associative) tables, nil (a data type with a single value, also called nil), userdata (a plain C pointer used to represent C data structures inside Lua), Lua functions and C functions. (After eight years of language evolution, the only change to this list has been the unification of Lua functions and C functions into a single type.) To keep the language compact, we did not include a boolean data type. As in Lisp, nil is treated as false, while all other values are treated as true. This is one of the few economies that we now sometimes regret.
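The truth rule can be checked with a small sketch. It runs on present-day Lua, which (unlike the version described here) also has a boolean type; the rule for non-boolean values is the same. The helper truthy and the opts table are hypothetical names used only for illustration.

```lua
-- truthy is a hypothetical helper reporting how a value behaves in a condition.
local function truthy(v)
  if v then return "true" else return "false" end
end

local results = {
  truthy(nil),  -- nil is treated as false
  truthy(0),    -- unlike C, the number 0 counts as true
  truthy(""),   -- the empty string counts as true as well
  truthy({}),   -- so does an empty table
}

-- A common idiom built on this rule: fall back to a default when a field is absent.
local opts = {}
local width = opts.width or 100
```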

Lua also borrowed from Sol the approach of being implemented as a library. The implementation followed the principle now promoted by Extreme Programming: “The simplest implementation that can work.” We used lex for the lexical scanner and yacc for the parser. The parser translated the program into bytecode, which was then executed by a simple stack-based interpreter. The language had a very small standard library, and it was very easy to add new functions in C.

Despite the simple implementation (or perhaps because of it), Lua exceeded our expectations. Both projects (PGM and ED) used Lua successfully (and PGM is still in use). Eventually, other projects at TeCGraf began to use Lua as well.

Early years (1994–1996)


New users bring new demands. Not surprisingly, one of the first was for better performance. Using Lua for data description posed a problem that is atypical for scripting languages.
Soon after we started using Lua, we noticed its potential as a language for graphical metafiles. Lua's data description facilities allowed it to be used as a graphics format. Compared with other programmable metafiles, Lua offered all the advantages of a full-fledged procedural language. The VRML format, for example, uses JavaScript to model procedural objects, which leads to non-uniform (and therefore less clear) code. With Lua, mixing procedural objects into a scene description is natural. Fragments of procedural code can be combined with declarative expressions to model complex objects while maintaining overall clarity.

The data entry program (ED) was the first to use Lua for its graphical metafiles. It was common to have diagrams with thousands of parts, described by thousands of Lua elements in a file of hundreds of kilobytes. This meant that, from a programming language point of view, Lua had to handle huge programs and expressions. And since Lua compiled such programs on the fly (like a just-in-time compiler), the Lua compiler itself had to be very fast. The first victim of the pursuit of performance was lex. Replacing the lex-generated scanner with hand-written code almost doubled the speed of the Lua compiler.
We also created new opcodes for constructors. The source code of a list constructor looked like this:
 @[30, 40, 50] 

which was compiled into bytecode like this:
 CREATETABLE
 PUSHNUMBER 1 # index
 PUSHNUMBER 30 # value
 SETTABLE
 PUSHNUMBER 2 # index
 PUSHNUMBER 40 # value
 SETTABLE
 PUSHNUMBER 3 # index
 PUSHNUMBER 50 # value
 SETTABLE

With the new scheme, the code began to look like this:
 CREATETABLE
 PUSHNUMBER 30 # value
 PUSHNUMBER 40 # value
 PUSHNUMBER 50 # value
 SETTABLE 1 3 # store the values at indices 1 to 3

For long constructors, it was not possible to push all the elements onto the stack before storing them. For this reason, the code generator emitted a SETTABLE opcode from time to time to flush the stack.
(Since then, we have always tried to improve compile time. Now Lua compiles a program with 30,000 assignments six times faster than Perl, and eight times faster than Python).
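The batching scheme for long list constructors can be sketched as a toy code generator. This is an illustration under stated assumptions, not the actual Lua 1.1 compiler: compile_list and FLUSH_LIMIT are hypothetical names, the batch size of 2 is chosen only to make the flush visible, and the two operands of SETTABLE are assumed to be the first and last index being stored.

```lua
-- FLUSH_LIMIT is an assumed batch size, not the value used by the real compiler.
local FLUSH_LIMIT = 2

-- Emit opcodes (as strings) for a list constructor such as @[30, 40, 50].
local function compile_list(values)
  local code = { "CREATETABLE" }
  local first = 1     -- list index of the oldest value still on the stack
  local pending = 0   -- number of values pushed but not yet stored
  for _, v in ipairs(values) do
    code[#code + 1] = "PUSHNUMBER " .. v
    pending = pending + 1
    if pending == FLUSH_LIMIT then
      -- The stack is "full": store the batch and start a new one.
      code[#code + 1] = "SETTABLE " .. first .. " " .. (first + pending - 1)
      first, pending = first + pending, 0
    end
  end
  if pending > 0 then
    -- Store whatever is left after the last element.
    code[#code + 1] = "SETTABLE " .. first .. " " .. (first + pending - 1)
  end
  return code
end

local code = compile_list({ 30, 40, 50 })
```

With a limit larger than the list length, the generator degenerates to the single trailing SETTABLE shown in the listing above.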

We released a new version of Lua with these optimizations in July 1994 as Lua 1.1. This version was available for download via ftp. The previous version, Lua 1.0, was never made publicly available. Some time later, we also published the first paper describing Lua.

Lua 1.1 had a limited user license. The language could be freely used for academic purposes, but not for commercial use. (Despite the license, the language itself has always been open source). But this license did not work. Most competitors, such as Perl and Tcl, were free. Moreover, restrictions on commercial use prevented even academic use, since some academic projects planned to enter the market later. So the release of the next version of the language, Lua 2.1, was free.

Lua version 2


Lua 2.1 (released in 1995) brought many important changes. One of them was not in the language itself but in the development process: we decided that we should always try to improve the language, even at the cost of minor backward incompatibilities.
In version 2.1, we introduced many incompatibilities with version 1.1 (but provided tools to help with code migration). We removed the @ syntax from table constructors and unified the use of curly braces for both records and lists. Dropping the @ was a trivial change, but it changed how the language felt, not just how it looked.

More importantly, we simplified the semantics of constructors. In Lua 1.1, the construct
 @track{x=1, y=10} 
had a special meaning. In Lua 2.1, the construct
 track{x=1, y=10} 
is syntactic sugar for
 track({x=1, y=10}) 
that is, it creates a new table and passes it as the single argument to the track function.
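Under these semantics, a type check like the one Sol performed becomes an ordinary function. The following is a minimal sketch, assuming a track function modeled on the Sol type shown earlier (numeric x and y, with defaults for y and z); the body of track is hypothetical and is not taken from the original programs.

```lua
-- track is a hypothetical checker modeled on the Sol type of the same name.
local function track(t)
  if type(t.x) ~= "number" then
    error("track: field 'x' must be a number")
  end
  if t.y == nil then t.y = 23 end   -- default value for y
  if type(t.y) ~= "number" then
    error("track: field 'y' must be a number")
  end
  if t.z == nil then t.z = 0 end    -- default value for the untyped field z
  return t
end

-- The call without parentheses is sugar for track({ ... }).
local t1 = track{ y = 9, x = 10, z = "hi!" }
local t2 = track{ x = 8 }

-- A table that fails the check raises an error, caught here with pcall.
local ok = pcall(track, { x = "oops" })
```

Because track is plain Lua code, an application can change what "being a track" means without any change to the language.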

From the very beginning, we developed Lua as an extension language, and therefore C programs could register their own functions, transparently called from Lua. With this approach, it was easy to extend Lua with object-oriented primitives, which allowed the end user to adapt the language for specific tasks.

In version 2.1, we introduced the concept of fallbacks: user-defined functions that Lua calls in situations where the outcome is otherwise undefined. (Translator's note: this approach is a kind of superset of operator overloading. The principle is similar, but it offers more possibilities.) Lua thus became a language that can be extended in two ways: by extending the set of “primitive” functions and by extending their semantics through fallbacks. That is why we now call Lua an extensible extension language.

We defined fallbacks for arithmetic, comparison operators, string concatenation, table access, and so on. Once set by the user, such a function is called whenever the operands of the operation do not have suitable types. For example, when adding two values, one of which is not a number, the fallback is called, and the result of that call is used as the result of the addition.
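The setfallback function described here does not exist in present-day Lua, where per-table metatables play a comparable role. As a hedged illustration of the same idea (a user function invoked when "+" meets operands that are not numbers), here is a sketch using the modern __add metamethod; Vec and vec are hypothetical names.

```lua
-- Vec is a hypothetical type used only to show an arithmetic handler.
local Vec = {}

local function vec(x, y)
  return setmetatable({ x = x, y = y }, Vec)
end

-- Lua calls this handler when "+" is applied to an operand that is not a number.
Vec.__add = function(a, b)
  return vec(a.x + b.x, a.y + b.y)
end

-- The result of the handler is used as the result of the addition.
local sum = vec(1, 2) + vec(10, 20)
```

The difference from the fallbacks of Lua 2.1 is scope: a fallback was global to the whole program, whereas a metatable applies only to the values it is attached to.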

Of particular interest (and the main reason fallbacks were introduced) is table access: if, when executing x = a[i], the value of a[i] is nil (translator's note: that is, table a has no field i), then the fallback function is called (if one is set), and its result is used as the value of a[i]. This simple new facility allowed programmers to implement a variety of table access semantics. In particular, one can implement a form of inheritance through delegation:
 function Index (a,i)
   if i == "parent" then  -- avoid an infinite loop
     return nil
   end
   local p = a.parent
   if type(p) == "table" then
     return p[i]          -- may trigger Index again
   else
     return nil
   end
 end
 setfallback("index", Index)

This code follows the chain of “parents” upward until it finds the required field or reaches the end of the chain. With this code, the following example prints “red” even though b has no color field.
 a = Window{x=100, y=200, color="red"}
 b = Window{x=300, y=400, parent=a}
 print(b.color)

There is nothing magic or hard-coded about delegation through the "parent" field. It is the developer's choice. They can use a different name for the field, or implement more complex multiple inheritance by allowing the “parent” field itself to be a table that is traversed sequentially, or something else entirely.
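For readers who want to run the delegation example today: setfallback is not available in present-day Lua, so the sketch below swaps in the modern __index metamethod to express the same lookup through "parent". The Window function here is a hypothetical stand-in that simply attaches the metatable; the original text does not show its definition.

```lua
-- Shared metatable: look up missing fields in the table stored under "parent".
local delegate = {
  __index = function(t, i)
    -- rawget bypasses __index, so reading "parent" cannot recurse forever.
    local p = rawget(t, "parent")
    if type(p) == "table" then
      return p[i]   -- p may in turn delegate to its own parent
    end
    return nil
  end
}

-- Hypothetical constructor: attaches the delegation behavior to a table.
local function Window(t)
  return setmetatable(t, delegate)
end

local a = Window{ x = 100, y = 200, color = "red" }
local b = Window{ x = 300, y = 400, parent = a }
local c = Window{ parent = b }
```

Reading b.color finds no such field in b and delegates to a; reading c.color walks two levels up the chain. Changing the string "parent" in one place is enough to use a different field name, which is the point the text makes about the mechanism not being hard-coded.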

Other fallbacks are called for the expression a[i] when a is not a table at all: the “gettable” fallback, triggered when reading a[i], as in x = a[i], and the “settable” fallback, triggered when writing a[i], as in a[i] = x.

These table fallbacks have many uses. Cross-language interoperability is a particularly powerful one: when a holds a userdata value (a pointer to something in the C code), the fallbacks provide transparent access to values inside the data structures of the host program.

Our decision not to hard-code such behaviors into the implementation of the language led to one of Lua's main concepts: meta-mechanisms. Instead of cluttering the language with many features, we provided ways to implement those features, for those who need them and only for them.

The fallback meta-mechanism allows Lua to support object-oriented programming, in the sense that several kinds of inheritance and operator overloading can be implemented with it. We even added some syntactic sugar for defining and using “methods”: functions can be declared in the form a:f(x, y, z), which adds a hidden parameter self, and the call a:f(10, 20, 30) is equivalent to a.f(a, 10, 20, 30).
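The equivalence between the colon form and the explicit form can be checked directly. In this sketch, account and deposit are hypothetical names used only to show the sugar.

```lua
local account = { balance = 0 }

-- Declaring with ":" adds a hidden first parameter named self.
function account:deposit(amount)
  self.balance = self.balance + amount
  return self.balance
end

-- Calling with ":" passes the object itself as that first argument...
local viaColon = account:deposit(10)

-- ...so it is equivalent to the explicit call with ".".
local viaDot = account.deposit(account, 20)
```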

In May 1996, we released Lua 2.4. The main innovation of this version was an external compiler called luac. This program compiled Lua code and saved the bytecode and string tables in a binary file. The file format was chosen for fast loading and portability across platforms. With luac, programs could avoid parsing and code generation at startup, which was expensive, especially for large static programs such as graphical metafiles.

Our first paper on Lua had already considered the possibility of an external compiler, but we only came to need one after Lua was widely adopted at TeCGraf and graphical editors began producing huge graphical metafiles containing Lua code.

In addition to speeding up loading, luac also makes it possible to check syntax at compile time and to protect source code from being modified by users. However, precompilation does not speed up execution, since Lua always compiles source code to bytecode before running it.
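The separation between compiling and running can be observed from inside present-day Lua, using the standard functions string.dump and load (loadstring in Lua 5.1). This is an in-process analogue of what luac does with files, offered as a sketch; it says nothing about the binary format used by Lua 2.4, and the variable names are illustrative.

```lua
-- Lua 5.1 names the string loader loadstring; later versions use load.
local loader = loadstring or load

-- Compile source text once; nothing is executed yet.
local chunk = assert(loader("return 6 * 7"))

-- Serialize the compiled function to a binary string, as luac would to a file.
local bytecode = string.dump(chunk)

-- Loading the binary form skips parsing and code generation entirely.
local from_binary = assert(loader(bytecode))

local result_source = chunk()
local result_binary = from_binary()
```

Both functions run the same bytecode, which is why precompiling changes load time but not execution speed.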

luac « », , Lua , , , . Lua . , (, ), 40% Lua 4.0, . Lua ( , Crazy Ivan, RoboCup 2000 2001 , «» Lua).

(1996–2000)


1996 Lua Software: Practice & Experience ( . . pdf , ). 1996 Dr. Dobb's Lua . , , Lua.

Dr. Dobb's Lua. :
From: Bret Mogilefsky <mogul@lucasarts.com>
To: "'lua@icad.puc-rio.br'" <lua@icad.puc-rio.br>
Subject: LUA rocks! Question, too.
Date: Thu, 9 Jan 1997 13:21:41 -0800

...

  Dr. Dobbs  Lua,     ,
      !
     .   
    .

 :      LucasArts Entertainment Co.
        SCUMM  Lua.

 [...]

, Bret Mogilefsky Grim Fandango , LucasArts 1997 . , « Lua» ( ). Lua . Lua rec.games.programmer comp.ai.games .
, , Lua . Lua ( LucasArts, BioWare, Slingshot Game Technology Loewen Entertainment) Lua . Lua , - . , Bret Mogilefsky Lua Grim Fandango, .

. , AI , . , , «», «», «», .. . , . .

, 2000, LucasArts , Lua, Escape from Monkey Island , Monkey Island. Lua SCUMM ( ) Lua.

( , Grim Fandango, Baldur's Gate, MDK2, Escape from Monkey Island) Lua .

Lua PUC-Rio ( . --, TeCGraf ) . . AXAF ( ) — NASA.

Performance Technologies Lua CPC4400 — ethernet . Lua CPC4400, ( , , RMON) Lua.

Tollgrade Communications used Lua in DigiTest, its new-generation product for testing telephone networks. Lua was used for the user interface, automated test scripts and the analysis of results.

Lua is also used at the InCor Heart Institute (Instituto do Coração, São Paulo) in Brazil, at CEPEL (the research center of the state electric power company ELETROBRAS), also in Brazil, at the Weierstrass Institute in Berlin, at the Berlin Technical University, and in many other places.

In 1998, Cameron Laird and Kathryn Soraiz in their column about scripting languages SunWorld , « - Lua ». « », .

Lua version 3


Lua 3.0 ( 1997) fallback' . Fallback' : , . Lua , , , . fallback , , - .
Lua 3.0 userdata. — , , fallback', . ( userdata) fallback .

Lua . , , . , : ( ) .

fallback', Lua , . , fallback' — . , fallback' ( ) , , .

, . . Lua 4.1 ( ).

Lua 3.0 also introduced support for conditional compilation, in a format similar to the C preprocessor. Like any language feature, it was very easy to add (although it complicated the lexer), and programmers soon began to use it (programmers will use any feature a language offers). Once a new feature appears, demand for extending it grows immediately. One of the most frequent requests was for macros, but a lengthy discussion never produced a clear proposal, either on the mailing list or among ourselves. Each proposal required huge changes to the lexer and parser without showing any clear benefit. So the preprocessor remained unchanged from Lua 3.0 through version 3.2 (for two years).

In the end, we decided that the preprocessor did more harm than good, making code cumbersome and drawing users into endless discussions, and we removed it in Lua 4.0. Without the preprocessor, Lua became cleaner. Over the years, we have been trying to make Lua simpler and to remove the dark corners of the language: things that we once considered features, but that were used by only a few programmers and later came to be regarded as outright mistakes.

Lua version 4


Up to version 3.2, only one Lua “state” could be active at a time. We had an API for switching states, but it was somewhat awkward to use. To keep the API simple, we had not included an explicit state parameter in its functions: there was a single global state. In hindsight, this was a mistake. By the time Lua 3.2 was released, it had become clear that many applications would be simpler if they could conveniently work with several Lua states. For example, we made a special version of Lua 3.2 for inclusion in CGILua, an extension of browsers for serving dynamic pages (translator's note: the server side is meant) and CGI programming in Lua. Earlier, LucasArts had done something similar for Lua 3.1.
Lua 3.3 API . , . , - API , Lua 4.0. API , . , API , API Lua 1.1. , . Lua API, , , .

Lua 4.0 2000. API, , .

, « » . , , . , , . , , , , . . , — .

1.1 for Lua. while , .

. , for, . Pascal ( Modula) , . , to . , for C Lua.

3.1 ( 1998) . ( , for Lua). Lua 3.1 :
 foreach(table, f) foreachi(table, f) 

foreach f -, . foreachi , ( ): . , .
, . . , , . , , . , for: .

for . , , while. , - , while.

Conclusion


Lua . 500 30 . (www.lua.org) 500 50 . - Ethernet .
ftp Lua, , DLL Windows, SIS EPOC, RPM Linux, RISC OS, .. , Lua CD ( Dr. Dobb's, Linux Magazine France C Magazine).

Lua - . Lua , , Lua , palm', , . Cameron Laird Kathryn Soraiz 1998 , « ( , , ) Lua». , .

Lua , .

. . , — ( ) .

, — , . : , , .

Thanks


Lua . TeCGraf - : , , TeCGraf. Marcelo Gattass, TeCGraf, . Lua TeCGraf, .
Lua . — . , , . , , , .

. , Lua, luajit.
, :)

Source: https://habr.com/ru/post/229269/
