Embedding the Lua interpreter in a microcontroller project (STM32)

In fairly large applications, a considerable part of the project is business logic. It is convenient to debug this part of the program on a computer and then build it into the microcontroller project, expecting that it will run exactly as intended without any further debugging (the ideal case).

Since most microcontroller programs are written in C/C++, abstract classes that provide interfaces to low-level entities are usually used for this purpose (if a project is written in C only, structures of function pointers are often used instead). This approach provides the required level of abstraction over the hardware. However, it requires constantly recompiling the project and then reprogramming the non-volatile memory of the microcontroller with a large binary firmware file.
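
As a rough illustration of this approach, the sketch below hides a UART behind an abstract class so that the same business logic can run against a host-side fake on a PC. All names in it (uart_if, fake_uart, report_level) are hypothetical and are not taken from the project.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical interface to a low-level entity (not the project's real class).
class uart_if {
public:
    virtual ~uart_if() = default;
    // Sends len bytes; returns true on success.
    virtual bool tx(const uint8_t *data, std::size_t len) = 0;
};

// Host-side fake: collects everything the business logic sends.
class fake_uart : public uart_if {
public:
    bool tx(const uint8_t *data, std::size_t len) override {
        sent.append(reinterpret_cast<const char *>(data), len);
        return true;
    }
    std::string sent;
};

// Business logic that depends only on the interface, so it can be debugged on a PC.
bool report_level(uart_if &port, int percent) {
    char msg[32];
    int n = std::snprintf(msg, sizeof(msg), "level=%d%%\n", percent);
    if (n < 0 || static_cast<std::size_t>(n) >= sizeof(msg)) {
        return false;  // formatting failed or the message did not fit
    }
    return port.tx(reinterpret_cast<const uint8_t *>(msg), static_cast<std::size_t>(n));
}
```

On the device, the same report_level function would receive an object that wraps the real UART driver instead of fake_uart.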
However, there is another way: using a scripting language, which lets you debug business logic in real time on the device itself or load working scripts directly from external memory, without including this code in the microcontroller firmware.

I chose Lua as the scripting language.

Why Lua?

There are several scripting languages that can be embedded in a microcontroller project: several simple BASIC-like languages, PyMite, Pawn and others. Each has its pros and cons, but discussing them is outside the scope of this article.

For a brief overview of what exactly Lua is good for, see the article “Lua in 60 minutes”. That article strongly inspired me, and to study the subject in more detail I read the official book by the author of the language, Roberto Ierusalimschy, "Programming in Lua" (available in an official Russian translation).

Separately, I want to mention the eLua project. In my case, I already have a ready-made low-level software layer for working with the microcontroller peripherals and with the other peripherals on the device board. Therefore, I did not consider this project (since it is intended to provide exactly those layers that connect the Lua core to the microcontroller peripherals).

About the project in which Lua will be embedded

As usual, my sandbox project will serve as the field for experiments (a link to a commit with Lua already integrated, with all the necessary modifications described below).

The project is based on the stm32f405rgt6 microcontroller with 1 MB of non-volatile memory and 192 KB of RAM (currently only the two upper blocks, 128 KB in total, are used).

The project runs the FreeRTOS real-time operating system, which supports the infrastructure for working with hardware peripherals. All memory for tasks, semaphores and other FreeRTOS objects is allocated statically at link time (it is placed in the .bss section). All FreeRTOS entities (semaphores, queues, task stacks and others) are private members of global class objects. However, the FreeRTOS heap is still allocated to support the malloc, free and calloc functions (required by functions such as printf), which are redefined to work with it. There is also a working API for MicroSD cards (FatFS), as well as a debug UART (115200, 8N1).

About the logic of using Lua in the project

To debug business logic, commands will be sent over the UART, assembled (in a separate object) into complete lines (terminated with '\n' plus a 0 terminator) and passed to the Lua machine. If execution fails, the error is printed with printf (since it was already used in the project). Once the logic is debugged, it will be possible to load the final business logic file from the microSD card (not covered in this article). Also, for debugging purposes, the Lua machine will run inside a separate FreeRTOS thread (in the future, each business logic script being debugged will get its own thread, in which it will run with its own environment).

Including the Lua submodule in the project

The official mirror of the project on GitHub will be used as the source of the Lua library (since my project is also hosted there; you can also take the sources directly from the official site). The project has an established build system in which each module is a submodule with its own CMakeLists, so I created a separate submodule that contains a fork of this repository and a CMakeLists, to keep the build style uniform.

CMakeLists builds the source code of the lua repository as a static library with the following compilation flags of the submodule (taken from the submodule configuration file in the main project):

    SET(C_COMPILER_FLAGS "-std=gnu99;-fshort-enums;-fno-exceptions;-Wno-type-limits;-ffunction-sections;-fdata-sections;")
    SET(MODULE_LUA_COMP_FLAGS "-O0;-g3;${C_COMPILER_FLAGS}")

And the processor-specific flags (set in the root CMakeLists):

 SET(HARDWARE_FLAGS -mthumb; -mcpu=cortex-m4; -mfloat-abi=hard; -mfpu=fpv4-sp-d16;)

It is important to note that the root CMakeLists must specify a definition that makes Lua avoid double values (the microcontroller has hardware support only for float, not for double):

 add_definitions(-DLUA_32BITS)

Finally, all that remains is to add this library to the build and link the result into the final project:

CMakeLists fragment for building the project with the lua library

    add_subdirectory(${CMAKE_SOURCE_DIR}/bsp/submodules/module_lua)
    ...
    target_link_libraries(${PROJECT_NAME}.elf PUBLIC
        "-Wl,--start-group"
        ..._...
        MODULE_LUA
        ..._...
        "-Wl,--end-group")

Definition of functions for working with memory

Lua does not manage memory itself; this responsibility is passed to the user. However, when the bundled lauxlib library and its luaL_newstate function are used, the l_alloc function is bound as the allocator. It is defined as follows:

    static void *l_alloc (void *ud, void *ptr, size_t osize, size_t nsize) {
      (void)ud; (void)osize;  /* not used */
      if (nsize == 0) {
        free(ptr);
        return NULL;
      }
      else
        return realloc(ptr, nsize);
    }
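
Because the allocator is just a function with this signature, it can also do its own bookkeeping. Below is a minimal sketch of an allocator that counts the bytes in use and refuses to grow past a limit. The names counting_alloc and alloc_stats and the limit logic are my assumptions for illustration; on the host the sketch falls back to the standard realloc and free.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical bookkeeping structure passed through the ud argument.
struct alloc_stats {
    std::size_t used;   // bytes currently allocated
    std::size_t limit;  // refuse to grow beyond this many bytes
};

// Same signature as l_alloc. Relies on the caller passing the real size of
// the old block in osize whenever ptr is not NULL.
void *counting_alloc(void *ud, void *ptr, std::size_t osize, std::size_t nsize) {
    alloc_stats *st = static_cast<alloc_stats *>(ud);
    if (ptr == nullptr) {
        osize = 0;  // for a new block osize does not describe a size
    }
    if (nsize == 0) {
        std::free(ptr);
        st->used -= osize;
        return nullptr;
    }
    if (nsize > osize && st->used + (nsize - osize) > st->limit) {
        return nullptr;  // over the limit: the old block stays valid
    }
    void *p = std::realloc(ptr, nsize);
    if (p != nullptr) {
        st->used = st->used - osize + nsize;
    }
    return p;
}
```

In Lua 5.3 and 5.4 such a function can be handed to lua_newstate together with a pointer to the statistics structure, instead of calling luaL_newstate.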

As mentioned at the beginning of the article, the project already overrides the malloc and free functions, but there is no realloc function. This needs to be fixed.

The standard FreeRTOS heap implementation used in the project, heap_4.c, has no function for resizing a previously allocated block of memory. Therefore realloc has to be implemented on top of malloc and free.

Since the memory allocation scheme may change in the future (by switching to another heap_x.c file), I decided not to use the internals of the current scheme (heap_4.c) but to build a higher-level wrapper, even though it is less efficient.

It is important to note that realloc not only frees the old block (if there was one) and creates a new one, but also moves the data from the old block to the new one. Moreover, if the old block held more data than the new one can hold, the new block is filled to its limit with the old data and the rest is discarded.
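
This copy rule can be checked on a PC before anything is flashed. The sketch below is a host-side model, not the firmware code: a size header placed in front of each block stands in for the heap's own bookkeeping, and all sim_* names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical header stored in front of every block.
union block_header {
    std::size_t size;
    std::max_align_t align;  // keeps the user data suitably aligned
};

void *sim_malloc(std::size_t size) {
    block_header *h = static_cast<block_header *>(std::malloc(sizeof(block_header) + size));
    if (h == nullptr) {
        return nullptr;
    }
    h->size = size;
    return h + 1;  // user data starts right after the header
}

void sim_free(void *ptr) {
    if (ptr != nullptr) {
        std::free(static_cast<block_header *>(ptr) - 1);
    }
}

std::size_t sim_block_size(void *ptr) {
    return (static_cast<block_header *>(ptr) - 1)->size;
}

// realloc contract: allocate, copy min(old, new) bytes, then free the old block.
void *sim_realloc(void *ptr, std::size_t new_size) {
    void *p = sim_malloc(new_size);
    if (p == nullptr) {
        return nullptr;  // the old block is left untouched
    }
    if (ptr != nullptr) {
        std::size_t old_size = sim_block_size(ptr);
        std::memcpy(p, ptr, old_size < new_size ? old_size : new_size);
        sim_free(ptr);
    }
    return p;
}
```

Growing a block must keep all of its old bytes, and shrinking it must keep the first new_size bytes; both cases are easy to assert against this model.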

If you do not take this into account, your machine will manage to execute a script consisting of the line "a = 3\n" three times, and then it will end up in a hard fault. The cause can be found by studying the register dump in the hard fault handler, which shows that the crash happened after an attempt to grow a table deep inside the interpreter code and its libraries. If you run a script like "print 'test'", the behavior will vary depending on how the firmware is built (in other words, the behavior is undefined).

To copy the data from the old block to the new one, we need to know the size of the old block. FreeRTOS heap_4.c (like the other files that implement the heap) does not provide an API for this, so you have to add your own. As a basis, I took the vPortFree function and cut it down to the following:

vPortGetSizeBlock function code

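    int vPortGetSizeBlock (void *pv) {
        uint8_t *puc = (uint8_t *)pv;
        BlockLink_t *pxLink;

        if (pv != NULL) {
            /* The block header is located right before the user data. */
            puc -= xHeapStructSize;
            pxLink = (BlockLink_t *)puc;

            configASSERT((pxLink->xBlockSize & xBlockAllocatedBit) != 0);
            configASSERT(pxLink->pxNextFreeBlock == NULL);

            /* xBlockSize also counts the header, so subtract it to get
               the number of bytes available to the user. */
            return (int)((pxLink->xBlockSize & ~xBlockAllocatedBit) - xHeapStructSize);
        }

        return 0;
    }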

Now only a small step remains: writing realloc on top of malloc, free and vPortGetSizeBlock:

realloc implementation based on malloc, free and vPortGetSizeBlock

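    void *realloc (void *ptr, size_t new_size) {
        /* With no old block, realloc behaves like malloc. */
        if (ptr == nullptr) {
            return malloc(new_size);
        }

        void* p = malloc(new_size);
        if (p == nullptr) {
            return p;  /* the old block stays valid */
        }

        /* Copy as much of the old data as fits into the new block. */
        size_t old_size = vPortGetSizeBlock(ptr);
        size_t cpy_len = (new_size < old_size) ? new_size : old_size;
        memcpy(p, ptr, cpy_len);

        free(ptr);
        return p;
    }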

Add support for working with stdout

As the official description states, the Lua interpreter itself cannot do input and output. One of the standard libraries is connected for this purpose. For output, it uses the stdout stream. The luaopen_io function from the standard library is responsible for connecting to the stream. To support stdout (as opposed to printf), you need to override the fwrite function. I redefined it on top of the functions described in the previous article.

fwrite function

    size_t fwrite(const void *buf, size_t size, size_t count, FILE *stream) {
        (void)stream;  /* every stream is sent to the debug output */

        size_t len = size * count;
        const char *s = reinterpret_cast<const char*>(buf);

        for (size_t i = 0; i < len; i++) {
            if (_write_char(s[i]) != 0) {
                /* fwrite reports the number of complete items written */
                return i / size;
            }
        }

        return count;
    }

Without this definition, the print function in Lua will run successfully, but there will be no output. At the same time, there will be no errors on the Lua machine stack (since formally the function completed successfully).

In addition to this function, we need fflush (for the interactive mode, which is discussed later). Since this function cannot be redefined, it has to be given a slightly different name. The function is a cut-down version of fwrite: it sends whatever is currently in the buffer and then clears the buffer (without adding a line break).

mc_fflush function

    int mc_fflush () {
        uint32_t len = buf_p;
        buf_p = 0;

        if (uart_1.tx(tx_buf, len, 100) != mc_interfaces::res::ok) {
            errno = EIO;
            return -1;
        }

        return 0;
    }
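
To show why an explicit flush matters for an interactive prompt, here is a host-side sketch of a line-buffered writer. It assumes that the character writer transmits only when a line ends or the buffer fills up; that buffering policy and every name in the sketch (g_wire, fake_uart_tx, sketch_write_char, sketch_flush) are my assumptions, not the project's code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Stand-in for the UART driver: records what would be transmitted.
std::string g_wire;

bool fake_uart_tx(const uint8_t *data, std::size_t len) {
    g_wire.append(reinterpret_cast<const char *>(data), len);
    return true;
}

uint8_t sk_buf[16];
std::size_t sk_len = 0;

// Sends whatever is buffered and empties the buffer; 0 on success, -1 on failure.
int sketch_flush() {
    std::size_t len = sk_len;
    sk_len = 0;
    if (len == 0) {
        return 0;  // nothing to send
    }
    return fake_uart_tx(sk_buf, len) ? 0 : -1;
}

// Buffers one character; transmits when a line ends or the buffer is full.
int sketch_write_char(char c) {
    sk_buf[sk_len++] = static_cast<uint8_t>(c);
    if (c == '\n' || sk_len == sizeof(sk_buf)) {
        return sketch_flush();
    }
    return 0;
}
```

With such a policy, a prompt like "> " has no line break at the end, so it reaches the terminal only after the flush function is called.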

Getting strings from a serial port

To get strings for the Lua machine, I decided to write a simple uart-terminal class that:

receives data from the serial port byte by byte (in an interrupt);
puts each received byte into a queue, from which a thread takes it;
in the thread, echoes the byte back unchanged if it is not a line terminator;
if a line terminator ('\r') arrived, sends the 2 bytes that move the terminal to a new line ("\n\r");
after sending the response, calls the byte handler (the object that assembles the string);
tracks presses of the delete key (so that service characters cannot be erased from the terminal window).
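
The echo rules from this list can be sketched as one pure function that is easy to test on a PC. The function name echo_for, the key codes 0x7F and '\b' for the delete key, and the "\b \b" erase sequence are my assumptions and may differ from the real uart_terminal class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Returns the bytes to echo for one received byte.
// line_len is the number of user characters currently on the line.
std::string echo_for(uint8_t byte, std::size_t &line_len) {
    if (byte == '\r') {
        line_len = 0;
        return "\n\r";  // move the terminal to the start of a new line
    }
    if (byte == 0x7F || byte == '\b') {
        if (line_len == 0) {
            return "";  // nothing of the user's to erase: protect service characters
        }
        --line_len;
        return "\b \b";  // step back, overwrite with a space, step back again
    }
    ++line_len;
    return std::string(1, static_cast<char>(byte));  // ordinary byte: echo as is
}
```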

Links to sources:

the UART class interface is here;
the UART base class is here and here;
the uart_terminal class is here and here;
the class object is created as part of the project here.

Additionally, note that for this object to work properly, the UART interrupt must be assigned a priority within the range from which FreeRTOS functions may be called in an interrupt. Otherwise, you can get interesting errors that are hard to debug. In the current example, the following interrupt parameters are set in the FreeRTOSConfig.h file.

Settings in FreeRTOSConfig.h

    #define configPRIO_BITS 4
    #define configKERNEL_INTERRUPT_PRIORITY 0xF0
    // FreeRTOS API functions may be called from interrupts
    // with priorities in the range 0x8 - 0xF.
    #define configMAX_SYSCALL_INTERRUPT_PRIORITY 0x80

In the project itself, an object of the nvic class sets the interrupt priority to 0x9, which is within the allowed range (the nvic class is described here and here).
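
The relation between the logical priority 0x9 and the raw values in FreeRTOSConfig.h can be checked at compile time. The sketch below assumes that the 4 implemented priority bits occupy the upper half of an 8-bit priority field, as the values 0xF0 and 0x80 suggest. The constant and function names are hypothetical and only mirror the config macros.

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned PRIO_BITS = 4;               // mirrors configPRIO_BITS
constexpr uint8_t KERNEL_PRIO_RAW = 0xF0;       // mirrors configKERNEL_INTERRUPT_PRIORITY
constexpr uint8_t MAX_SYSCALL_PRIO_RAW = 0x80;  // mirrors configMAX_SYSCALL_INTERRUPT_PRIORITY

// Converts a logical priority (0..15) to its raw 8-bit representation.
constexpr uint8_t to_raw(uint8_t prio) {
    return static_cast<uint8_t>(prio << (8 - PRIO_BITS));
}

// A numerically larger value means a less urgent interrupt, so the
// allowed interrupts lie between the syscall ceiling and the kernel priority.
constexpr bool may_call_freertos_api(uint8_t prio) {
    return to_raw(prio) >= MAX_SYSCALL_PRIO_RAW && to_raw(prio) <= KERNEL_PRIO_RAW;
}

static_assert(to_raw(0x9) == 0x90, "0x9 maps to raw 0x90");
static_assert(may_call_freertos_api(0x9), "UART priority is inside the allowed range");
static_assert(!may_call_freertos_api(0x7), "0x7 is too urgent to call the FreeRTOS API");
```

A check like this fails the build, rather than the device, if somebody later moves the UART interrupt out of the allowed range.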

Forming a string for a Lua machine

The bytes received from the uart_terminal object are passed to an instance of the simple serial_cli class, which provides a minimal interface for editing a string and for passing it directly to the thread in which the Lua machine runs (by calling a callback function). The callback is called when the '\r' character is received. It should copy the string and return control promptly, because reception of new bytes is blocked for the duration of the call. This is not a problem if the thread priorities are set correctly and the UART speed is low enough.

Links to sources:

the serial_cli class files are here and here;
the class object is created as part of the project here.

It is important to note that this class considers a string longer than 255 characters invalid and resets it. This is intentional, since the Lua interpreter lets you enter constructs line by line, waiting for the block to end.
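
The behavior described in this section can be approximated by the following simplified line builder. It is not the project's serial_cli: the names line_builder and kMaxLineLen are hypothetical, and discarding the rest of an oversized line until the next '\r' is my reading of "resets it".

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

constexpr std::size_t kMaxLineLen = 255;  // limit taken from the text

class line_builder {
public:
    using callback = std::function<void(const std::string &)>;

    explicit line_builder(callback on_line) : on_line_(std::move(on_line)) {}

    void push(char c) {
        if (c == '\r') {                    // end of line
            if (!invalid_) {
                on_line_(line_ + "\n");     // hand over a complete line
            }
            line_.clear();
            invalid_ = false;
            return;
        }
        if (invalid_) {
            return;                         // skip the rest of an oversized line
        }
        if (line_.size() >= kMaxLineLen) {  // too long: reset and mark invalid
            line_.clear();
            invalid_ = true;
            return;
        }
        line_.push_back(c);
    }

private:
    callback on_line_;
    std::string line_;
    bool invalid_ = false;
};
```

The callback receives a copy of the finished line, so it can return quickly, as required above.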

Passing a string to the Lua interpreter and its execution

By itself, the Lua interpreter cannot accept a block of code line by line and then execute the whole block. However, if you install Lua on a computer and run the interpreter in interactive mode, you can see that code is entered line by line and that the interpreter indicates when a block is not yet complete. Since the interactive mode is part of the standard package, we can look at its code. It is in the lua.c file. We are interested in the doREPL function and everything it uses. To avoid reinventing the wheel, I ported this code into a separate class that provides the interactive mode functions. I named it lua_repl after the original function. It uses printf to output information to the console and has a public method add_lua_string for adding a string obtained from the serial_cli class object described above.
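
In the stock lua.c, the decision that a block is not yet complete appears to rest on a simple check: the chunk fails to compile with a syntax error whose message ends with the "<eof>" marker. A minimal sketch of that check, without any dependency on the Lua headers, could look like this (the function name looks_incomplete and the sample messages are illustrative assumptions):

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// True when a syntax error message ends with the "<eof>" marker, which
// suggests that the input simply stopped in the middle of a block.
bool looks_incomplete(const std::string &syntax_error_msg) {
    static const char mark[] = "<eof>";
    const std::size_t mark_len = sizeof(mark) - 1;
    return syntax_error_msg.size() >= mark_len &&
           syntax_error_msg.compare(syntax_error_msg.size() - mark_len, mark_len, mark) == 0;
}
```

When the check returns true, the REPL keeps the text entered so far, asks for one more line and tries to compile the joined text again; any other error is reported to the user right away.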

References:

the description of the lua_repl class is here;
the code is here, here and here.

The class is implemented as a Meyers singleton, since there is no need for several interactive modes within one device. The lua_repl object receives data from the serial_cli object here.

Since the project already has a unified system for initializing and servicing global objects, a pointer to the lua_repl object is passed to the object of the global class player::base here. The start method of the player::base object (declared here; it is called straight from main) calls the init method of the lua_repl object with FreeRTOS task priority 3 (in the project, a task priority can be from 1 to 4, where 1 is the lowest and 4 is the highest). After successful initialization, the global class starts the FreeRTOS scheduler and the interactive mode begins to work.

Problems that appear when porting

Below is a list of problems that I encountered while porting the Lua machine.

2-3 single-line variable assignment scripts are executed, then everything ends in a hard fault

The problem was in the realloc function. It must not only reallocate the block but also copy the contents of the old one (as described above).

When trying to print a value, the interpreter ends in a hard fault

This problem was harder to track down, but in the end I found out that snprintf is used for printing. Since Lua stores values in double (or float in our case), printf (and its derivatives) with floating-point support is required (I wrote about the intricacies of printf settings here).
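
One way to catch this class of problem early is a small self-check that formats a floating-point value before the interpreter is started. The sketch below assumes that "%.7g" matches the number format Lua uses in float builds (an assumption based on luaconf.h). The function name float_printf_works is hypothetical.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Returns true if snprintf can format a floating-point value.
// A float argument is promoted to double when passed to snprintf.
bool float_printf_works() {
    char buf[16];
    int n = std::snprintf(buf, sizeof(buf), "%.7g", static_cast<double>(1.5f));
    return n == 3 && std::strcmp(buf, "1.5") == 0;
}
```

On a C library built without floating-point formatting, such a check may return false, or it may fail in the same way as the interpreter did, so it is best called at a point where a fault is easy to attribute.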

Requirements for non-volatile (flash) memory

Here are some measurements I made that let you judge how much non-volatile (flash) memory has to be set aside to integrate a Lua machine into a project. Compilation was done with gcc-arm-none-eabi-8-2018-q4-major. The Lua version is 5.4. In the measurements below, the “No lua” column means a build that excludes the interpreter itself, the methods that interact with it and its libraries, and the lua_repl class object. All low-level entities (including the overrides needed by printf and fwrite) remain in the project. The FreeRTOS heap size is 1024 * 25 bytes. The rest is occupied by the global entities of the project.

A summary table of the results looks like this (all sizes in bytes):

Build options	No lua	Core only	Lua with base library	Lua with libraries base, coroutine, table, string	luaL_openlibs
-O0 -g3	103028	220924	236124	262652	308372
-O1 -g3	74940	144732	156916	174452	213068
-Os -g0	71172	134228	145756	161428	198400

RAM requirements

Since RAM consumption depends entirely on the task, I will give a summary table of the memory consumed right after the machine starts with different sets of libraries (in bytes, obtained with the command print(collectgarbage("count") * 1024)).

Composition	RAM used
Lua with base library	4809
Lua with libraries base, coroutine, table, string	6407
luaL_openlibs	12769

When all libraries are used, the required amount of RAM grows significantly compared with the previous sets. However, most applications do not need all of them.

In addition, 4 KB is allocated for the stack of the task in which the Lua machine runs.

Further use

To make full use of the machine in a project, you will need to describe all the interfaces to the hardware or to the service objects of the project that the business logic code requires. However, this is a topic for a separate article.

Results

This article explained how to connect a Lua machine to a microcontroller project and how to launch a full-fledged interactive interpreter that lets you experiment with business logic directly from the terminal command line. In addition, the hardware requirements for the microcontroller were considered for different configurations of the Lua machine.

Source: https://habr.com/ru/post/459602/
