
Static allocation of FreeRTOS objects

By default, all objects in FreeRTOS are allocated dynamically: queues, semaphores, timers, tasks (threads), and mutexes. The programmer sees only a “heap”, an area where memory is handed out at the request of the program or the system, and what goes on inside it is unclear. How much is left? Unknown. Is more being used than necessary? Who knows? Personally, I prefer to settle memory layout questions while writing the firmware, rather than run into runtime errors when memory suddenly runs out.

This article is a logical continuation of yesterday's article on static allocation of objects in microcontroller memory, only now applied to FreeRTOS objects. Today we will learn how to place FreeRTOS objects statically, which gives a much clearer picture of what is happening in the microcontroller's RAM: exactly where our objects are located and how much space they occupy.

Simply placing FreeRTOS objects statically does not take much ingenuity: starting with version 9.0, FreeRTOS provides functions for creating statically allocated objects. These functions have the Static suffix in their names, and they come with excellent documentation and examples. We will go further and write convenient, tidy C++ wrappers over the FreeRTOS functions that not only place objects statically but also hide the internals and provide a more convenient interface.
The article is aimed at novice programmers who are nevertheless already familiar with the basics of FreeRTOS and the synchronization primitives of multithreaded programs. Let's go.

FreeRTOS is an operating system for microcontrollers. Well, OK, not a full OS, but a library that lets you run several tasks in parallel. FreeRTOS also lets tasks exchange messages through queues, use timers, and synchronize with each other using semaphores and mutexes.

In my opinion, any firmware that has to do two or more things at once becomes much simpler and more elegant with FreeRTOS. For example, reading data from slow sensors while keeping the display updated, without the display stuttering while the sensors are being read. In short, a must-have! I strongly recommend learning it.

As I said in the previous article, I don't really like creating objects dynamically when their number and size are known at compile time. If such objects are placed statically, we get a clearer picture of memory allocation in the microcontroller and therefore avoid surprises when memory suddenly runs out.

We will look at FreeRTOS memory organization using the BluePill board with the STM32F103C8T6 microcontroller as an example. To avoid fiddling with the compiler and the build system, we will work in the Arduino IDE with support for this board installed. There are several Arduino implementations for STM32; in principle, any of them will do. I installed stm32duino following the instructions in the project's Readme.md, and the bootloader as described in this article. FreeRTOS version 10.0 is installed through the Arduino IDE library manager. The compiler is gcc 8.2.

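Let's invent a small experimental task for ourselves. It may not make much practical sense, but it uses all the synchronization primitives FreeRTOS has. Something like this: a periodic timer gives a semaphore; the first task waits for that semaphore, generates a random number and puts it into a queue; the second task takes numbers from the queue and prints them; a mutex protects access to the serial port.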


The standard implementation (according to the documentation and tutorials) may look like this.

#include <STM32FreeRTOS.h>

TimerHandle_t xTimer;
xSemaphoreHandle xSemaphore;
xSemaphoreHandle xMutex;
xQueueHandle xQueue;

void vTimerCallback(TimerHandle_t pxTimer)
{
    xSemaphoreGive(xSemaphore);
}

void vTask1(void *)
{
    while(1)
    {
        xSemaphoreTake(xSemaphore, portMAX_DELAY);
        int value = random(1000);
        xQueueSend(xQueue, &value, portMAX_DELAY);

        xSemaphoreTake(xMutex, portMAX_DELAY);
        Serial.println("Test");
        xSemaphoreGive(xMutex);
    }
}

void vTask2(void *)
{
    while(1)
    {
        int value;
        xQueueReceive(xQueue, &value, portMAX_DELAY);

        xSemaphoreTake(xMutex, portMAX_DELAY);
        Serial.println(value);
        xSemaphoreGive(xMutex);
    }
}

void setup()
{
    Serial.begin(9600);

    vSemaphoreCreateBinary(xSemaphore);
    xQueue = xQueueCreate(1000, sizeof(int));
    xMutex = xSemaphoreCreateMutex();

    xTimer = xTimerCreate("Timer", 1000, pdTRUE, NULL, vTimerCallback);
    xTimerStart(xTimer, 0);

    xTaskCreate(vTask1, "Task 1", configMINIMAL_STACK_SIZE, NULL, tskIDLE_PRIORITY, NULL);
    xTaskCreate(vTask2, "Task 2", configMINIMAL_STACK_SIZE, NULL, tskIDLE_PRIORITY, NULL);

    vTaskStartScheduler();
}

void loop() {}

Let's see what happens in the microcontroller's memory when this code is compiled. By default, all FreeRTOS objects are placed in dynamic memory. FreeRTOS provides as many as 5 memory manager implementations, which differ in complexity but share the same job: carving memory into pieces for the needs of FreeRTOS and the user. The pieces are cut either from the common microcontroller heap (using malloc) or from FreeRTOS's own separate heap. Which heap is used does not matter here: either way, we cannot look inside it.

For example, for FreeRTOS's own heap it looks like this (output of the objdump utility):

 ... 200009dc l O .bss 00002000 ucHeap ... 

That is, we see one large block inside which all FreeRTOS objects are carved out: semaphores, mutexes, timers, queues, and even the tasks themselves. The last two are very important. Depending on the number of elements, a queue can be quite large, and tasks are guaranteed to take up a lot of space because of the stack, which is allocated along with the task.

Yes, this is a downside of multitasking: each task has its own stack. Moreover, the stack must be large enough to hold not only the calls and local variables of the task itself, but also the interrupt stack frames, if any. Since an interrupt can occur at any time, every task needs some stack reserve for it. On top of that, Cortex-M microcontrollers support nested interrupts, so the stack must be large enough to hold all of them if they happen to occur at the same time.

The size of a task's stack is set at creation time via a parameter of the xTaskCreate function. The stack size should not be less than configMINIMAL_STACK_SIZE (specified in the FreeRTOSConfig.h configuration file); this is the reserve for those interrupts. The heap size is set by the configTOTAL_HEAP_SIZE parameter and in this case equals 8 KB.

Now try to guess: do all our objects fit into an 8 KB heap? What about a couple more objects? And a few more tasks?
With certain FreeRTOS settings, the objects did not all fit into the heap. And here is what that looks like: the program simply does not work. That is, everything compiles and flashes, but then the microcontroller just hangs, and that's it. Good luck guessing that the problem is the heap size. I had to increase the heap to 12 KB.
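
A rough lower bound on the heap demand can be sketched with plain compile-time arithmetic. The stack depths below (128 words for configMINIMAL_STACK_SIZE, 256 words for the timer service task) are assumptions inferred from the symbol sizes shown later in this article, with 4-byte stack words; the k-prefixed names are illustrative only. Control blocks, the timer command queue and the allocator's own bookkeeping are deliberately left out, so the real demand is higher.

```cpp
#include <cstddef>

// Assumed configuration (inferred from the article, not read from FreeRTOSConfig.h).
constexpr std::size_t kStackWordBytes    = 4;         // size of one stack word on Cortex-M3
constexpr std::size_t kMinimalStackWords = 128;       // assumed configMINIMAL_STACK_SIZE
constexpr std::size_t kTimerStackWords   = 256;       // assumed configTIMER_TASK_STACK_DEPTH
constexpr std::size_t kHeapBytes         = 8 * 1024;  // configTOTAL_HEAP_SIZE from the text

// Two user tasks and the idle task use the minimal stack;
// the timer service task has its own depth.
constexpr std::size_t kStackBytes =
    (3 * kMinimalStackWords + kTimerStackWords) * kStackWordBytes;

// Queue storage: 1000 items of a 4-byte int.
constexpr std::size_t kQueueStorageBytes = 1000 * 4;

// Stacks plus queue storage, before any control block is counted.
constexpr std::size_t kLowerBoundBytes = kStackBytes + kQueueStorageBytes;

// What remains for every control block, the timer command queue
// and the allocator overhead.
constexpr std::size_t kLeftForEverythingElse = kHeapBytes - kLowerBoundBytes;

static_assert(kLowerBoundBytes < kHeapBytes,
              "stacks and queue storage alone must fit into the heap");
```

Under these assumptions the stacks and the queue storage already take 6,560 of the 8,192 bytes, which leaves little room for everything else and makes the hang described above unsurprising.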

Wait, but what about the xTimer, xQueue, xSemaphore, and xMutex variables? Don't they describe the objects we need? No, these are only handles: pointers to an (opaque) structure that describes the actual synchronization object.

200009cc g     O .bss  00000004 xTimer
200009d0 g     O .bss  00000004 xSemaphore
200009cc g     O .bss  00000004 xQueue
200009d4 g     O .bss  00000004 xMutex

As I already mentioned, I suggest sorting out this whole mess the same way as in the previous article: we will allocate all our objects statically at compile time. The static allocation functions become available when the configSUPPORT_STATIC_ALLOCATION parameter in the FreeRTOS configuration file is set to 1.
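
As a sketch, the relevant configuration lines might look like the following. Where exactly the configuration file lives depends on the FreeRTOS port in use. Turning off configSUPPORT_DYNAMIC_ALLOCATION is an optional extra step that this article does not take, so it is shown commented out.

```cpp
// FreeRTOSConfig.h fragment (sketch)

// Makes the ...Static() creation functions, such as xQueueCreateStatic(), available.
#define configSUPPORT_STATIC_ALLOCATION   1

// Optional assumption, not used in this article: setting this to 0 removes the
// heap-based creation functions, so accidental dynamic allocation fails to build.
// #define configSUPPORT_DYNAMIC_ALLOCATION  0
```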

Let's start with queues. Here is how the FreeRTOS documentation suggests creating a queue statically:

struct AMessage
{
    char ucMessageID;
    char ucData[ 20 ];
};

#define QUEUE_LENGTH 10
#define ITEM_SIZE sizeof( uint32_t )

// xQueueBuffer will hold the queue structure.
StaticQueue_t xQueueBuffer;

// ucQueueStorage will hold the items posted to the queue. Must be at least
// [(queue length) * (queue item size)] bytes long.
uint8_t ucQueueStorage[ QUEUE_LENGTH * ITEM_SIZE ];

void vATask( void *pvParameters )
{
    QueueHandle_t xQueue1;

    // Create a queue capable of containing 10 uint32_t values.
    xQueue1 = xQueueCreateStatic( QUEUE_LENGTH,             // The number of items the queue can hold.
                                  ITEM_SIZE,                // The size of each item in the queue.
                                  &( ucQueueStorage[ 0 ] ), // The buffer that will hold the items in the queue.
                                  &xQueueBuffer );          // The buffer that will hold the queue structure.

    // The queue is guaranteed to be created successfully as no dynamic memory
    // allocation is used. Therefore xQueue1 is now a handle to a valid queue.

    // ... Rest of task code.
}

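In this example, the queue is described by three variables: ucQueueStorage, the buffer that holds the queued items; xQueueBuffer, the structure that holds the queue's internal state; and xQueue1, the handle through which the queue is used.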


Doing it as in the example is, of course, no problem. But personally, I don't like having as many as 3 variables per entity. This is begging for a class that encapsulates them. There is just one problem: the size of each queue may differ. In one place you need a larger queue; in another, a couple of items is enough. Since we want to place the queue statically, we must somehow specify this size at compile time. A template does the job.

template<class T, size_t size>
class Queue
{
    QueueHandle_t xHandle;
    StaticQueue_t xQueueDefinition;
    T xStorage[size];

public:
    Queue()
    {
        xHandle = xQueueCreateStatic(size,
                                     sizeof(T),
                                     reinterpret_cast<uint8_t*>(xStorage),
                                     &xQueueDefinition);
    }

    bool receive(T * val, TickType_t xTicksToWait = portMAX_DELAY)
    {
        return xQueueReceive(xHandle, val, xTicksToWait);
    }

    bool send(const T & val, TickType_t xTicksToWait = portMAX_DELAY)
    {
        return xQueueSend(xHandle, &val, xTicksToWait);
    }
};

Along the way, the functions for sending and receiving messages also moved into this class, in a form that is more convenient for us.

The queue will be declared as a global variable, something like this

 Queue<int, 1000> xQueue; 

Sending a message

  xQueue.send(value); 

Receive a message

  int value;
  xQueue.receive(&value);
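
To see why making the length a template parameter matters, here is a host-side sketch with no FreeRTOS involved. FixedQueue is a hypothetical stand-in, not the FreeRTOS queue: it only illustrates that storage sized by a template argument becomes part of the object itself, so its footprint is known at compile time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-capacity FIFO; not thread-safe, illustration only.
template <class T, std::size_t Capacity>
class FixedQueue {
    T storage_[Capacity];      // lives inside the object, no heap involved
    std::size_t head_ = 0;     // index of the oldest element
    std::size_t count_ = 0;    // number of stored elements

public:
    bool send(const T& value) {
        if (count_ == Capacity) return false;           // full
        storage_[(head_ + count_) % Capacity] = value;
        ++count_;
        return true;
    }

    bool receive(T* out) {
        if (count_ == 0) return false;                  // empty
        *out = storage_[head_];
        head_ = (head_ + 1) % Capacity;
        --count_;
        return true;
    }
};

// The payload is part of the object: at least 1000 * 4 bytes.
static_assert(sizeof(FixedQueue<std::int32_t, 1000>) >= 1000 * sizeof(std::int32_t),
              "storage is embedded in the object");

// Values come out in the order they went in; a full queue rejects new items.
inline bool demoFifoOrder() {
    FixedQueue<std::int32_t, 2> q;
    std::int32_t a = 0, b = 0;
    const bool sent     = q.send(10) && q.send(20);
    const bool rejected = !q.send(30);                  // queue is full
    const bool received = q.receive(&a) && q.receive(&b);
    assert(sent && rejected && received);
    return a == 10 && b == 20;
}
```

Declared as a global, such an object lands in .bss with its whole buffer, which is the same effect the Queue wrapper above achieves for the real FreeRTOS queue.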

Now let's deal with semaphores. Although technically (inside FreeRTOS) semaphores and mutexes are implemented on top of queues, semantically these are 3 different primitives. Therefore, we will implement them as separate classes.

The semaphore class is fairly trivial: it simply stores a couple of variables and declares a couple of functions.

class Sema
{
    SemaphoreHandle_t xSema;
    StaticSemaphore_t xSemaControlBlock;

public:
    Sema()
    {
        xSema = xSemaphoreCreateBinaryStatic(&xSemaControlBlock);
    }

    BaseType_t give()
    {
        return xSemaphoreGive(xSema);
    }

    BaseType_t take(TickType_t xTicksToWait = portMAX_DELAY)
    {
        return xSemaphoreTake(xSema, xTicksToWait);
    }
};

Semaphore declaration

 Sema xSema; 

Taking the semaphore

  xSema.take(); 

Giving the semaphore

  xSema.give(); 

Now the mutex

class Mutex
{
    SemaphoreHandle_t xMutex;
    StaticSemaphore_t xMutexControlBlock;

public:
    Mutex()
    {
        xMutex = xSemaphoreCreateMutexStatic(&xMutexControlBlock);
    }

    BaseType_t lock(TickType_t xTicksToWait = portMAX_DELAY)
    {
        return xSemaphoreTake(xMutex, xTicksToWait);
    }

    BaseType_t unlock()
    {
        return xSemaphoreGive(xMutex);
    }
};

As you can see, the mutex class is almost identical to the semaphore class. But, as I said, semantically they are different entities. Moreover, the interfaces of these classes are not complete, and they will grow in completely different directions. The semaphore can gain giveFromISR() and takeFromISR() methods for use from interrupts, while the mutex gains at most a tryLock() method; semantically it has no other operations.

I hope you know the difference between a binary semaphore and a mutex.
I always ask this question at interviews and, unfortunately, 90% of candidates do not understand the difference. A semaphore can be taken and given from different threads. A typical use is the signal-wait pattern, where one thread sends a signal (calls give()) and another waits for it (calls take()).

A mutex, by contrast, can be released only by the same thread (task) that took it. I'm not sure FreeRTOS enforces this, but some operating systems (for example, Linux) monitor it quite strictly.
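
The difference can be modelled in a few lines of host-side C++. This is a toy, single-threaded model: ToySemaphore and ToyMutex are hypothetical names, task identities are plain integers, and nothing ever blocks. It says nothing about how FreeRTOS implements or enforces these rules; it only captures the semantics described above.

```cpp
#include <cassert>

// Binary semaphore: any task may give, any task may take.
class ToySemaphore {
    bool signaled_ = false;
public:
    void give(int /*taskId*/) { signaled_ = true; }
    bool take(int /*taskId*/) {
        if (!signaled_) return false;   // nothing to take yet
        signaled_ = false;
        return true;
    }
};

// Mutex: remembers its owner and lets only that owner unlock it.
class ToyMutex {
    static constexpr int kNoOwner = -1;
    int owner_ = kNoOwner;
public:
    bool lock(int taskId) {
        if (owner_ != kNoOwner) return false;  // already held
        owner_ = taskId;
        return true;
    }
    bool unlock(int taskId) {
        if (owner_ != taskId) return false;    // not the owner: refused
        owner_ = kNoOwner;
        return true;
    }
};

inline bool demoSemaphoreVersusMutex() {
    ToySemaphore sem;
    sem.give(1);                              // task 1 signals...
    const bool received = sem.take(2);        // ...task 2 receives the signal

    ToyMutex mtx;
    const bool locked   = mtx.lock(1);        // task 1 takes the mutex
    const bool stolen   = mtx.unlock(2);      // task 2 tries to release it: refused
    const bool released = mtx.unlock(1);      // the owner releases it

    assert(received && locked && !stolen && released);
    return received && locked && !stolen && released;
}
```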

A mutex can be used C-style, i.e. by calling lock()/unlock() directly. But since we are writing in C++, we can take advantage of RAII and write a more convenient wrapper that acquires and releases the mutex by itself.

class MutexLocker
{
    Mutex & mtx;

public:
    MutexLocker(Mutex & mutex)
        : mtx(mutex)
    {
        mtx.lock();
    }

    ~MutexLocker()
    {
        mtx.unlock();
    }
};

When the object goes out of scope, the mutex is released automatically.

This is especially useful when a function has several exit paths: you no longer have to remember to release the resource on each of them.

  {
      MutexLocker lock(xMutex);
      Serial.println(value);
  } // mutex will be unlocked here
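
The benefit with several exit paths can be checked on the host with a counting stand-in for the mutex. CountingMutex, ScopedLock and classify are hypothetical names used only for this sketch; the guard has the same shape as MutexLocker but is templated so that it does not depend on FreeRTOS.

```cpp
#include <cassert>

// Stand-in for a real mutex: it only counts calls.
struct CountingMutex {
    int held = 0;       // current lock depth
    int releases = 0;   // total number of unlock() calls
    void lock()   { ++held; }
    void unlock() { --held; ++releases; }
};

// Scope guard: locks in the constructor, unlocks in the destructor.
template <class M>
class ScopedLock {
    M& mtx_;
public:
    explicit ScopedLock(M& m) : mtx_(m) { mtx_.lock(); }
    ~ScopedLock() { mtx_.unlock(); }
    ScopedLock(const ScopedLock&) = delete;             // a guard must not be copied
    ScopedLock& operator=(const ScopedLock&) = delete;
};

// Three different return statements, no explicit unlock anywhere.
inline int classify(CountingMutex& m, int value) {
    ScopedLock<CountingMutex> guard(m);
    assert(m.held == 1);          // the lock is held for the whole body
    if (value < 0) return -1;
    if (value == 0) return 0;
    return 1;
}
```

Whichever return statement is taken, the destructor runs and the counter of held locks goes back to zero. Deleting the copy operations is a design choice of this sketch: a copied guard would unlock twice.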

Now it's the turn of the timers.

class Timer
{
    TimerHandle_t xTimer;
    StaticTimer_t xTimerControlBlock;

public:
    Timer(const char * const pcTimerName,
          const TickType_t xTimerPeriodInTicks,
          const UBaseType_t uxAutoReload,
          void * const pvTimerID,
          TimerCallbackFunction_t pxCallbackFunction)
    {
        xTimer = xTimerCreateStatic(pcTimerName,
                                    xTimerPeriodInTicks,
                                    uxAutoReload,
                                    pvTimerID,
                                    pxCallbackFunction,
                                    &xTimerControlBlock);
    }

    void start(TickType_t xTicksToWait = 0)
    {
        xTimerStart(xTimer, xTicksToWait);
    }
};

In general, everything is similar to the previous classes, so I will not go into detail. The API probably leaves much to be desired, or at the very least needs extending. But my goal is to show the principle, not to bring it to a production-ready state.

And finally, tasks. Each task has a stack, and it must be placed in memory in advance. Let's use the same technique as with queues and write a template class:

template<const uint32_t ulStackDepth>
class Task
{
protected:
    StaticTask_t xTaskControlBlock;
    StackType_t xStack[ ulStackDepth ];
    TaskHandle_t xTask;

public:
    Task(TaskFunction_t pxTaskCode,
         const char * const pcName,
         void * const pvParameters,
         UBaseType_t uxPriority)
    {
        xTask = xTaskCreateStatic(pxTaskCode,
                                  pcName,
                                  ulStackDepth,
                                  pvParameters,
                                  uxPriority,
                                  xStack,
                                  &xTaskControlBlock);
    }
};

Since task objects are now declared as global variables, they are initialized like any other global variable, before main() is called. This means that the parameters passed to the tasks must also be known at that point. Keep this nuance in mind if you pass something that has to be computed before the task is created (I just pass NULL). If this does not suit you, consider the option with local static variables from the previous article.
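
The local-static alternative can be sketched on the host as follows. Worker, computeParam and worker are hypothetical names; the point is only that a function-local static object is constructed the first time control passes through its declaration, so its constructor arguments can be computed at run time.

```cpp
#include <cassert>

// Stand-in for an object whose constructor needs a run-time value.
struct Worker {
    static int constructed;   // counts constructor calls
    int param;
    explicit Worker(int p) : param(p) { ++constructed; }
};
int Worker::constructed = 0;

// Pretend this value is only available after some start-up work.
inline int computeParam() { return 6 * 7; }

// The object is created on the first call, not before main().
inline Worker& worker() {
    static Worker instance(computeParam());
    return instance;
}

inline void demoLazyConstruction() {
    const int before = Worker::constructed;
    Worker& first = worker();
    Worker& second = worker();
    assert(&first == &second);                    // same object every time
    assert(Worker::constructed <= before + 1);    // constructed at most once here
}
```

The storage for the object is still reserved at link time, so it remains visible in the memory map; only the moment of construction moves.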

Compile and get the error:

tasks.c:(.text.vTaskStartScheduler+0x10): undefined reference to `vApplicationGetIdleTaskMemory'
timers.c:(.text.xTimerCreateTimerTask+0x1a): undefined reference to `vApplicationGetTimerTaskMemory'

Here is what is going on. Every OS has a special task, the idle task (the default task, the task that does nothing). The operating system runs it when no other task can run (for example, they are all sleeping or waiting for something). Otherwise it is a perfectly ordinary task, only with the lowest priority. But it is created inside the FreeRTOS kernel, and we cannot influence its creation. Since we have started placing tasks statically, we need to tell the OS somehow where to put this task's control block and stack. This is exactly why FreeRTOS asks us to define the special function vApplicationGetIdleTaskMemory().

The situation with the timer task is similar. Timers in FreeRTOS do not live on their own: the OS runs a special task that services them. This task also needs a control block and a stack, and the OS likewise asks us to say where they are located, via the vApplicationGetTimerTaskMemory() function.

The functions themselves are trivial and simply return the corresponding pointers to statically allocated objects.

extern "C" void vApplicationGetIdleTaskMemory(StaticTask_t **ppxIdleTaskTCBBuffer,
                                              StackType_t **ppxIdleTaskStackBuffer,
                                              uint32_t *pulIdleTaskStackSize)
{
    static StaticTask_t Idle_TCB;
    static StackType_t Idle_Stack[configMINIMAL_STACK_SIZE];

    *ppxIdleTaskTCBBuffer = &Idle_TCB;
    *ppxIdleTaskStackBuffer = Idle_Stack;
    *pulIdleTaskStackSize = configMINIMAL_STACK_SIZE;
}

extern "C" void vApplicationGetTimerTaskMemory(StaticTask_t **ppxTimerTaskTCBBuffer,
                                               StackType_t **ppxTimerTaskStackBuffer,
                                               uint32_t *pulTimerTaskStackSize)
{
    static StaticTask_t Timer_TCB;
    static StackType_t Timer_Stack[configTIMER_TASK_STACK_DEPTH];

    *ppxTimerTaskTCBBuffer = &Timer_TCB;
    *ppxTimerTaskStackBuffer = Timer_Stack;
    *pulTimerTaskStackSize = configTIMER_TASK_STACK_DEPTH;
}

Let's see what we did.

The helper code is hidden under a spoiler; it is exactly the code you have just seen.


The complete code of the main program:

// Forward declarations: the global objects below refer to these functions
void vTimerCallback(TimerHandle_t pxTimer);
void vTask1(void *);
void vTask2(void *);

Timer xTimer("Timer", 1000, pdTRUE, NULL, vTimerCallback);
Sema xSema;
Mutex xMutex;
Queue<int, 1000> xQueue;

Task<configMINIMAL_STACK_SIZE> task1(vTask1, "Task 1", NULL, tskIDLE_PRIORITY);
Task<configMINIMAL_STACK_SIZE> task2(vTask2, "Task 2", NULL, tskIDLE_PRIORITY);

void vTimerCallback(TimerHandle_t pxTimer)
{
    xSema.give();

    MutexLocker lock(xMutex);
    Serial.println("Test");
}

void vTask1(void *)
{
    while(1)
    {
        xSema.take();
        int value = random(1000);
        xQueue.send(value);
    }
}

void vTask2(void *)
{
    while(1)
    {
        int value;
        xQueue.receive(&value);

        MutexLocker lock(xMutex);
        Serial.println(value);
    }
}

void setup()
{
    Serial.begin(9600);

    xTimer.start();

    vTaskStartScheduler();
}

void loop() {}

You can inspect the resulting binary and see what is located where (the objdump output is slightly tidied up for readability):

0x200000b0 .bss   512   vApplicationGetIdleTaskMemory::Idle_Stack
0x200002b0 .bss    92   vApplicationGetIdleTaskMemory::Idle_TCB
0x2000030c .bss  1024   vApplicationGetTimerTaskMemory::Timer_Stack
0x2000070c .bss    92   vApplicationGetTimerTaskMemory::Timer_TCB
0x200009c8 .bss   608   task1
0x20000c28 .bss   608   task2
0x20000e88 .bss    84   xMutex
0x20000edc .bss  4084   xQueue
0x20001ed0 .bss    84   xSema
0x20001f24 .bss    48   xTimer

The goal is achieved: now everything is in plain view. Every object is visible, and its size is clear (except that composite objects such as Task count all their parts as a single block). The compiler statistics are now also extremely accurate and, this time, genuinely useful.
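
The sizes in the listing can be cross-checked by adding up the parts of each wrapper. The figures come from the listing itself under two assumptions: handles are 4-byte pointers, and the queue control block has the same size as the semaphore one (semaphores being built on queues). The k-prefixed names are illustrative only, and structure padding is ignored.

```cpp
#include <cstddef>

constexpr std::size_t kHandle     = 4;    // assumed pointer size on Cortex-M3
constexpr std::size_t kTaskTcb    = 92;   // Idle_TCB in the listing
constexpr std::size_t kMinStack   = 512;  // Idle_Stack in the listing
constexpr std::size_t kSemaObject = 84;   // xSema in the listing

// Task<configMINIMAL_STACK_SIZE>: control block + stack + handle.
constexpr std::size_t kTaskObject = kTaskTcb + kMinStack + kHandle;

// Sema: handle + control block, so the control block is what remains.
constexpr std::size_t kQueueControl = kSemaObject - kHandle;

// Queue<int, 1000>: handle + control block + 1000 four-byte items.
constexpr std::size_t kQueueObject = kHandle + kQueueControl + 1000 * 4;

static_assert(kTaskObject == 608,   "matches task1 and task2 in the listing");
static_assert(kQueueObject == 4084, "matches xQueue in the listing");
```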

Sketch uses 20,800 bytes (15%) of program storage space. Maximum is 131,072 bytes.
Global variables use 9,332 bytes (45%) of dynamic memory, leaving 11,148 bytes for local variables. Maximum is 20,480 bytes.

Conclusion


Although FreeRTOS lets you create and delete tasks, queues, semaphores and mutexes on the fly, in many cases this is unnecessary. As a rule, it is enough to create all the objects once at startup, and they will work until the next reboot. That is a good reason to allocate such objects statically at compile time. As a result, we get a clear understanding of how much memory our objects occupy, what is located where, and how much free memory remains.

Obviously, the proposed method is only suitable for placing objects whose lifetime is comparable to the lifetime of the entire application. Otherwise, it is worth using dynamic memory.

In addition to placing FreeRTOS objects statically, we also wrote convenient wrappers over the FreeRTOS primitives, which made it possible to simplify the client code somewhat and to encapsulate the implementation details.

If necessary, the interface can be simplified (for example, by not checking return codes or not using timeouts). It is also worth noting that the implementation is incomplete: I did not bother to implement all the possible ways of sending and receiving messages through a queue (for example, from an interrupt, or sending to the front or the back of the queue), working with synchronization primitives from interrupts, counting (non-binary) semaphores, and many other things.

I was too lazy to bring this code to a "take it and use it" state; I just wanted to show the idea. But for those who need a ready-made library, I recently came across the frt library. It does practically the same thing, only polished. Well, the interface is slightly different.

An example from the article is here.

Thanks to everyone who read this article to the end. I welcome constructive criticism. I would also be happy to discuss the nuances in the comments.

Source: https://habr.com/ru/post/459086/

