Overview of one Russian RTOS, part 4. Useful theory

Hello colleagues! The next publication from the informal “Book of Knowledge” of the MAKS RTOS is ready.

I ask regular readers to be lenient and not to downvote me for repeating a small part of the material from the previous article (about stack protection): it fits more logically here, and I have already removed it from there.

General content (published and unpublished articles):
Part 1. General information
Part 2. The MAKS RTOS kernel
Part 3. The structure of the simplest program
Part 4. Useful theory (this article)
Part 5. The first application
Part 6. Thread synchronization tools
Part 7. Means of data exchange between tasks
Part 8. Working with interrupts

Some non-obvious details about data

Some facts about the heap

For some reason, many programmers believe that the new and delete operations are fairly lightweight and simple, so their code is often full of dynamic memory allocations and releases. This is more or less acceptable on powerful systems (gigabytes of RAM and gigahertz clock frequencies), but with limited resources it can cause problems, especially for programs that run 24/7.

The most obvious problem is fragmentation of the address space. In the modern .NET environment, arrays are allocated on the heap and accessed by reference, so at any moment the system can pause the program and perform garbage collection. During collection the arrays may well be moved within memory, because they are accessed through references and the reference table is simply adjusted. C++, the language most often used for microcontrollers (as are plain C and assembler), works with arrays through pointers. Nobody knows how many pointers to a block exist in memory (a user program is free to copy them, pass them as arguments, even add to and subtract from them), so the data cannot be moved. Wherever an array or structure was allocated, there it stays until it is deleted. And then the classic case arises in which, after a series of allocations and releases, the heap takes, say, this form:

Fig. 1. Illustration of a classic case of problems arising from the fragmentation of the address space.

In this case the total amount of free memory is larger than the requested block, yet the allocation is impossible: there is no single fragment of the required size. And defragmentation is impossible;
However, there is a more serious problem. To allocate memory, the manager has to look through a fairly large number of table entries (we will call them tables, although the memory manager may of course be built on lists) to find a fragment of suitable length. This applies, of course, when the system has been running for a long time with frequent allocations and releases: the more fragments there are, the more entries must be examined. So memory allocation is cheap neither in time nor in resources, since the allocation table itself has to be stored somewhere;
Moreover, memory allocation cannot be counted as a real-time operation: its execution time is impossible to predict. It may complete quite quickly (if a suitable entry is found early) or quite slowly, whereas real-time systems imply guaranteed timing. Needless to say, this is not a problem of the operating system but of a poorly designed application whose author did not take this side effect into account;
Finally, thread switching is blocked for the duration of a memory allocation to keep the operation thread-safe.
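The fragmentation case described above can be reproduced with a toy model. The sketch below is only an illustration: the ToyHeap type, its cell-based layout and the first-fit search are assumptions made for the example and do not describe the memory manager of the MAKS RTOS or of any particular C++ runtime.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of a heap: each element is one fixed-size cell, true = occupied.
struct ToyHeap {
    std::vector<bool> used;

    explicit ToyHeap(std::size_t cells) : used(cells, false) {}

    // Total number of free cells, contiguous or not.
    std::size_t totalFree() const {
        std::size_t n = 0;
        for (bool u : used) {
            if (!u) ++n;
        }
        return n;
    }

    // Length of the longest run of adjacent free cells.
    std::size_t largestFreeRun() const {
        std::size_t best = 0, run = 0;
        for (bool u : used) {
            run = u ? 0 : run + 1;
            if (run > best) best = run;
        }
        return best;
    }

    // First-fit allocation: returns the index of the first cell, or -1.
    // The linear scan means the time depends on the current heap layout.
    long allocate(std::size_t cells) {
        assert(cells > 0);
        std::size_t run = 0;
        for (std::size_t i = 0; i < used.size(); ++i) {
            run = used[i] ? 0 : run + 1;
            if (run == cells) {
                const std::size_t first = i + 1 - cells;
                for (std::size_t j = first; j <= i; ++j) used[j] = true;
                return static_cast<long>(first);
            }
        }
        return -1;
    }

    // Marks the given cells as free again.
    void release(std::size_t first, std::size_t cells) {
        for (std::size_t j = first; j < first + cells; ++j) used[j] = false;
    }
};

// Fills a 16-cell heap with 2-cell blocks, then frees every other block.
ToyHeap makeFragmentedHeap() {
    ToyHeap heap(16);
    for (int i = 0; i < 8; ++i) heap.allocate(2);
    for (std::size_t first = 0; first < 16; first += 4) heap.release(first, 2);
    return heap;
}
```

After makeFragmentedHeap(), half of the heap is free (8 cells), but the longest free run is only 2 cells, so a request for 4 cells fails even though twice that amount is available in total.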

From all this it follows that memory should be allocated on the heap with extreme caution. Ideally, this is done during the initialization phase of the program. If memory has to be allocated while the program is running, it is best to do so as rarely as possible. Do not get carried away with constant allocation and release. Operations that allocate memory implicitly, inside themselves, also deserve suspicion. Code like the following makes me wince, especially considering that it runs on a system with 50 kilobytes of RAM for absolutely everything:

String output;
if (cnt > 0)
    output = ',';
output += "{\"type\":\"";
output += (entry.isDirectory()) ? "dir" : "file";
output += "\",\"name\":\"";
output += entry.name();
output += "\"";
output += "}";

From my point of view, it is much more reassuring when the code looks like this (it is a little less pretty, but it never touches the memory allocation and release functions):

char xml [768];
...
xml[0] = 0;
if (cnt > 0)
{
    strcat (xml, ",");
}
strcat (xml, "{\"type\":\"");
if (entry.isSubDir())
{
    strcat (xml, "dir");
} else
{
    strcat (xml, "file");
}
strcat (xml, "\",\"name\":\"");
entry.getName (xml + strlen(xml), 255);
strcat (xml, "\"}");

This code is deliberately written “head-on” to show clearly that the rework removes the danger of address space fragmentation, and to make it obvious what was replaced by what. But it is still far from perfect. To begin with, the principles of OOP are violated in it for the sake of clarity; on top of that, each call to strcat scans the destination string from the beginning, which hurts speed. In theory the destination string can also overflow (although in this particular example overflow protection is provided by the entry.getName function).

Here is a variant proposed by comdiv that is free of these shortcomings.

We declare a class for working with a static string that, among other things, keeps track of the current length, so the string does not have to be scanned from the beginning every time. For simplicity, only the "+=" operator is implemented in this class. This class carries the whole point of the example.

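class StaticString
{
protected:
    char* m_buf;   // caller-supplied buffer that holds the string
    int m_size;    // buffer size in bytes
    int m_len;     // current string length
public:
    StaticString (char* buf, int size)
    {
        _ASSERT(NULL != buf);
        _ASSERT(size > 0);
        m_buf = buf;      // remember the buffer
        buf[0] = '\0';    // start with an empty string
        m_size = size;    // remember the buffer size
        m_len = 0;        // the string is empty so far
    }
    StaticString& operator+=(const char *str)
    {
        int i = 0;
        // Copy until the source ends or the buffer is full
        while ((m_len < m_size - 1) && (str[i] != '\0'))
        {
            m_buf[m_len++] = str[i++];
        }
        // In any case, terminate the string
        m_buf[m_len] = '\0';
        return *this;
    }
};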

The main code then returns to its familiar form, differing only in the declaration of the output variable, which "wraps" the xml buffer:

char xml [768];
...
StaticString output (xml, sizeof(xml));
if (cnt > 0)
    output += ",";   // a string literal: only operator+=(const char*) is defined
output += "{\"type\":\"";
output += (entry.isDirectory()) ? "dir" : "file";
output += "\",\"name\":\"";

Because a different class is used, the danger of fatal address space fragmentation is gone. And unlike the “head-on” solution, the speed is better and the danger of a buffer overflow has been eliminated.
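To see the overflow protection at work, here is a minimal self-contained sketch. It repeats the idea of the StaticString class above with a few assumptions of mine: the standard assert() stands in for _ASSERT, and the single-character operator+=, the length() method and the buildEntry() helper are hypothetical additions for the demonstration, not part of the variant proposed by comdiv.

```cpp
#include <cassert>
#include <cstring>

// Same idea as the StaticString class above, repeated so that the sketch
// compiles on its own. Standard assert() replaces _ASSERT.
class StaticString {
    char* m_buf;   // caller-supplied buffer
    int   m_size;  // buffer capacity in bytes, terminator included
    int   m_len;   // current string length
public:
    StaticString(char* buf, int size) : m_buf(buf), m_size(size), m_len(0) {
        assert(buf != nullptr);
        assert(size > 0);
        m_buf[0] = '\0';
    }
    // Appends as much of str as fits; the result is always terminated.
    StaticString& operator+=(const char* str) {
        int i = 0;
        while ((m_len < m_size - 1) && (str[i] != '\0')) {
            m_buf[m_len++] = str[i++];
        }
        m_buf[m_len] = '\0';
        return *this;
    }
    // Assumed addition: appends one character if there is room for it.
    StaticString& operator+=(char c) {
        if (m_len < m_size - 1) {
            m_buf[m_len++] = c;
            m_buf[m_len] = '\0';
        }
        return *this;
    }
    // Assumed helper: current length without rescanning the buffer.
    int length() const { return m_len; }
};

// Hypothetical helper: builds one JSON-like entry in buf, returns its length.
int buildEntry(char* buf, int size, bool first, bool isDir, const char* name) {
    StaticString out(buf, size);
    if (!first) out += ',';
    out += "{\"type\":\"";
    out += isDir ? "dir" : "file";
    out += "\",\"name\":\"";
    out += name;
    out += "\"}";
    return out.length();
}
```

With a 64-byte buffer the entry is built in full. With an 8-byte buffer the text is simply cut off after 7 characters and stays terminated, so nothing is written past the end of the buffer.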

The class could be improved for a long time yet (it currently covers only one version of the "+=" operator), but that belongs to programming guides in general rather than to a manual on the MAKS RTOS. For now I will only note that whichever replacement is chosen (“quick but illustrative” or “correct but more complex”), both illustrate the same idea:

If you can avoid constantly calling new/delete, it is better to do so.
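When blocks of one size really do have to be obtained and returned at run time, one common way to follow this advice is a pool of fixed-size blocks reserved once, at initialization. The sketch below is my own illustration and not a MAKS RTOS facility: the FixedPool name and its interface are assumptions, and the class is not thread-safe (in a multitasking program, calls would have to be wrapped in whatever critical section the OS provides).

```cpp
#include <cassert>
#include <cstddef>

// Pool of N fixed-size blocks carved out of one array reserved up front.
// Free blocks form a singly linked list, so allocate() and release()
// take constant time and the pool cannot fragment.
template <std::size_t BlockSize, std::size_t N>
class FixedPool {
    union Block {
        Block* next;  // meaningful only while the block is free
        alignas(std::max_align_t) unsigned char data[BlockSize];
    };
    Block  m_blocks[N];
    Block* m_free;
public:
    // Links every block into the free list.
    FixedPool() : m_free(nullptr) {
        for (std::size_t i = 0; i < N; ++i) {
            m_blocks[i].next = m_free;
            m_free = &m_blocks[i];
        }
    }
    // Returns a block, or nullptr when the pool is exhausted.
    void* allocate() {
        if (m_free == nullptr) return nullptr;
        Block* b = m_free;
        m_free = b->next;
        return b->data;
    }
    // Returns a block previously obtained from allocate() of this pool.
    void release(void* p) {
        assert(p != nullptr);
        Block* b = static_cast<Block*>(p);
        b->next = m_free;
        m_free = b;
    }
};
```

A pool declared as a global or static object, for example FixedPool<32, 16>, takes its memory once and for all. Both operations are a couple of pointer assignments, so their execution time does not depend on how long the system has been running.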

Briefly about stack variables

I have often met programmers who do not know exactly how local variables are implemented in C and C++. At the same time, those programmers know perfectly well what the stack is and how the contents of registers (which would otherwise be clobbered) and subroutine return addresses are saved on it (although on the ARM architecture the return address goes into the LR register). Perhaps this is because all those programmers graduated from the same university (no secret here: I graduated from it myself, and 10 years ago I did not fully understand what a stack frame is either). Nevertheless, it is useful to describe briefly how these mysterious local variables are stored. At the end of the section, it will be revealed how all this relates to the MAKS RTOS.

So, it turns out that the stack is used not only to store return addresses (though not on ARM) and to save the contents of processor registers temporarily. The stack is also used to store local variables.

Let's see what a typical prologue looks like for a function that has so many local variables that they do not fit in registers.

;;;723 static void _CopyRect(int LayerIndex, int x0, int y0, int x1, int y1, int xSize, int ySize) {
000000 e92d4ff0 PUSH {r4-r11,lr}
000004 b087 SUB sp,sp,#0x1c

The first instruction, PUSH, is clear enough: it simply saves registers on the stack so that they can be restored before the function returns. But what is this subtraction of the constant 0x1C from SP? That is precisely the allocation of the stack frame. As is known from any computer science course, the stack is addressed not directly but relative to the pointer to its top. Let us look graphically at what these two lines do.

Fig. 2. Effect of the function prologue on the stack

What is this stack frame? It is simple. Its size is chosen so that all local variables of the function fit in it (except those the optimizer places in registers). The size of the stack frame is calculated by the compiler. Each variable gets its own offset relative to the beginning of the frame, and access to the variables looks something like this:

;;;728 BufferSize = _GetBufferSize(LayerIndex);
000016 4620 MOV r0,r4
000018 f7fffffe BL _GetBufferSize
00001c 9006 STR r0,[sp,#0x18]

Obviously, the BufferSize variable is offset from the beginning of the frame by 0x18 bytes.

;;;730 SrcAddr = Offset + (y0 * _xSize[LayerIndex] + x0) * _BytesPerPixel[LayerIndex];
000030 4816 LDR r0,|L8.140|
000032 f8500024 LDR r0,[r0,r4,LSL #2]
000036 fb076000 MLA r0,r7,r0,r6
00003a 4915 LDR r1,|L8.144|
00003c 5d09 LDRB r1,[r1,r4]
00003e fb005001 MLA r0,r0,r1,r5
000042 9005 STR r0,[sp,#0x14]

And the SrcAddr variable is at offset 0x14.

And so on. The LayerIndex variable is evidently placed not in the stack frame but in the R4 register.

Of course, at the end of the function the compiler-generated code restores everything (and also loads the saved LR value into PC, thereby jumping to the return address):

00007e b007 ADD sp,sp,#0x1c
000080 e8bd8ff0 POP {r4-r11,pc}

From all this, some things become clear:

It is clear why local variables are visible only inside the function. They are addressed relative to the SP register, and in called functions (as well as in the callers) SP has a different value.
It is clear why the debugger of the KEIL development environment does not display some variables: for some reason, its developers did not manage to show the contents of variables placed in registers.
It is clear why going beyond the bounds of an array that is a local variable can make the program completely inoperable: the return address of the function is stored on the same stack and may well be corrupted.
It is clear that recursive functions spend stack not only on return addresses but also on stack frames. The more local variables there are, the more stack each recursive call consumes. Recursion is best avoided when memory is limited.

The most important conclusion is that, after some practice, a programmer can start estimating how much stack a task will need (based on the deepest nesting of function calls and the set of their local variables). Windows by default allocates a megabyte of stack to each thread. On microcontrollers we are talking about kilobytes. Moreover, these kilobytes are allocated on the heap, so they reduce the memory left for dynamic allocation and global variables; knowing how things physically work is therefore not just useful but often vital.
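Such an estimate is plain arithmetic. The sketch below takes the numbers for _CopyRect from the listing above: PUSH {r4-r11,lr} stores 9 registers of 4 bytes each, and SUB sp,sp,#0x1c reserves 28 bytes for locals, 64 bytes in total. The helper functions, the other per-function figures and the reserve value are hypothetical; a real estimate also has to account for interrupts and for the context saved by the OS.

```cpp
#include <cassert>
#include <cstddef>

// Stack consumed by one call of _CopyRect, read off the listing above.
constexpr std::size_t kRegisterBytes  = 4;     // one 32-bit ARM register
constexpr std::size_t kSavedRegisters = 9;     // r4-r11 and lr
constexpr std::size_t kLocalsBytes    = 0x1c;  // SUB sp,sp,#0x1c
constexpr std::size_t kCopyRectBytes  =
    kSavedRegisters * kRegisterBytes + kLocalsBytes;
static_assert(kCopyRectBytes == 64,
              "36 bytes of saved registers plus 28 bytes of locals");

// Hypothetical helper: sums the per-call cost of every function on the
// deepest call chain and adds a reserve (for interrupts, OS context, etc.).
inline std::size_t worstCaseStack(const std::size_t* perCall,
                                  std::size_t count,
                                  std::size_t reserve) {
    assert(perCall != nullptr);
    std::size_t total = reserve;
    for (std::size_t i = 0; i < count; ++i) total += perCall[i];
    return total;
}

// Hypothetical helper: stack needed by a function that calls itself
// 'depth' times, each call costing 'perCall' bytes.
inline std::size_t recursionStack(std::size_t perCall, std::size_t depth) {
    return perCall * depth;
}
```

For example, a chain of three functions costing 64, 24 and 40 bytes with a 32-byte reserve needs 160 bytes, while ten nested calls of a function as heavy as _CopyRect would need 640 bytes on their own.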

Protection of task stack from overflow

When a task is created, its stack size is specified, and it cannot be changed dynamically afterwards. If the size was chosen poorly (the call nesting at run time turned out to be deep, or the number of local variables grew, which can easily happen during program maintenance), the data can spill beyond the allocated limits, damaging the stacks of other tasks, the heap or other data, and causing other unpredictable effects. It is desirable to detect such a situation and inform the developer that it needs to be fixed.

The ideal way to prevent such a situation would be a check at the compiler level, without involving the OS, but unfortunately such a mechanism creates, at the very least, a large overhead. The main job of a controller is not to check up on the programmer but to do the controlling. At a clock frequency of about one to two hundred megahertz (and sometimes only tens of megahertz), this method of checking is unacceptable.

The stack can also be monitored for overflow at the OS level. The MAKS RTOS uses the following protection methods:

Checking the current position of the stack pointer when switching tasks. This has almost no effect on performance but is not very reliable. First, by the time of the check the stack has already been corrupted; second, within one system tick the program could have not only entered the function that caused the overflow but also left it, so the pointer may have had time to return to the allowed range.
If the stack size is set larger than the minimum, one word is automatically added at its top and filled with a magic number: a 32-bit value of arbitrary form that is unlikely to occur while the program runs. If the stack overflows, this number is overwritten by application data, which almost certainly makes it possible to detect the overflow even after the pointer has returned to the working area.
If the processor contains an MPU (Memory Protection Unit), a memory region of minimal size with access protection is placed immediately beyond the stack boundary. This is the most advanced method of control, since any access to the protected region raises a hardware interrupt. It should be remembered, however, that in some cases the protected zone may not be touched at all, for example when the local variables that fall into this zone are reserved but never used. The protection exists for self-checking and must not get in the way of the main tasks.
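The first two methods can be sketched as follows. This is not the MAKS RTOS implementation: the TaskStack structure, the function names and the magic value are assumptions, and I assume a stack that grows towards lower addresses (as on ARM), with the guard word at the end the stack grows towards.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical descriptor of a task stack that grows downwards.
struct TaskStack {
    std::uint32_t* low;    // lowest address; the guard word lives here
    std::size_t    words;  // total size in 32-bit words, guard included
};

// Arbitrary value chosen for the example, not the one used by the OS.
constexpr std::uint32_t kStackMagic = 0xC0DEFACEu;

// Writes the magic number into the guard word when the task is created.
void initGuard(TaskStack& s) {
    assert(s.low != nullptr && s.words > 1);
    s.low[0] = kStackMagic;
}

// Method 1: is the saved stack pointer inside the usable range right now?
// Cheap, but it misses an overflow that has already "come back".
bool pointerInRange(const TaskStack& s, const std::uint32_t* sp) {
    return sp > s.low && sp <= s.low + s.words;
}

// Method 2: has anything ever been written over the guard word?
bool guardIntact(const TaskStack& s) {
    return s.low[0] == kStackMagic;
}
```

A scheduler would call both checks when switching tasks. The second one catches the case described above, where the function that overflowed the stack has already returned and the pointer looks perfectly valid again.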

Details of working with stack protection can be found among the constants defined in the Task class (in the MaksTask.h file). By studying the comments on these constants, you can find the specific values of parameters such as “minimum stack” and “protected area”. If desired, these parameters can be changed. Just remember that the size of the protected area must be a power of two.
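If you do change the size of the protected area, the power-of-two requirement is easy to enforce at compile time. The sketch below is generic C++; the constant name kGuardRegionBytes and its value are hypothetical and are not taken from MaksTask.h.

```cpp
#include <cassert>
#include <cstddef>

// True when n is a power of two (exactly one bit set).
constexpr bool isPowerOfTwo(std::size_t n) {
    return n != 0 && (n & (n - 1)) == 0;
}

// Rounds n up to the nearest power of two (returns 1 for n == 0).
// n must not exceed the largest power of two that fits in std::size_t.
inline std::size_t roundUpToPowerOfTwo(std::size_t n) {
    assert(n <= (static_cast<std::size_t>(-1) >> 1) + 1);
    std::size_t p = 1;
    while (p < n) p <<= 1;
    return p;
}

// Hypothetical constant standing in for the real protected-area size.
constexpr std::size_t kGuardRegionBytes = 32;
static_assert(isPowerOfTwo(kGuardRegionBytes),
              "the protected area size must be a power of two");
```

With such a static_assert next to the constant, an accidental value like 48 stops the build instead of silently producing a region the MPU cannot describe.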

That is it: the necessary minimum of theory, without which practical experiments cannot begin, is finally over.

In general, authors usually describe everything about the subject area first and give information about practical work only in the last chapter (or even in an appendix). Apparently they believe that readers will first memorize everything and only then start experimenting. It would be naive to assume that an average person will remember the whole mass of knowledge dumped on them. It is more convenient to try out knowledge gradually.

Therefore, before stuffing the reader with further theory, a little practice is in order. But first we need to look at how to get started with the MAKS RTOS. Let us assume that the reader already knows how to build and run a program on their microcontroller; otherwise the text would become heavily overloaded.

If this is not the case, I highly recommend the excellent guides from Keil on working with ST development boards (unfortunately they are in English, but much is clear from the figures alone):

http://www.keil.com/appnotes/files/apnt_253.pdf

http://www.keil.com/appnotes/files/apnt_261.pdf

In the next article we will start our first practical experiments.

Source: https://habr.com/ru/post/337476/
