
How to protect against stack overflow (on Cortex M)?

If you program on a "big" computer, you most likely never ask this question. There is plenty of stack, and you have to try hard to overflow it. In the worst case, you click OK in a window like this and go figure out what went wrong.

image

But if you program microcontrollers, the problem looks a little different. First of all, you have to notice that the stack has overflowed at all.

In this article I will talk about my own research on this topic. Since I program mainly for STM32 and for the Milandr 1986 series, I focused on them.

Introduction


Imagine the simplest case: plain single-threaded code with no operating system, so there is only one stack. If you, like me, program in Keil uVision, memory is laid out like this:


And if you, like me, consider dynamic memory on microcontrollers evil, then like this:



By the way
If you want to forbid use of the heap altogether, you can do this:
#pragma import(__use_no_heap_region) 

Details here

Okay, what's the problem? The problem is that Keil places the stack right after the static data area. On Cortex-M the stack grows toward lower addresses. When it overflows, it simply runs out of its allotted piece of memory and overwrites static or global variables.

It is especially fun when the stack overflows only on entry to an interrupt. Or, even better, in a nested interrupt! It quietly corrupts some variable that is used in a completely different part of the code, and the program fails on an assert. If you are lucky. A perfect heisenbug, the kind you can hunt for a whole week with a lantern.

I should say right away that using a heap does not make the problem go away: the heap simply gets corrupted instead of the global variables. Not much better.
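To make the failure mode concrete, here is a small host-side simulation, not target code. The 64-byte `ram` array, the 16-byte globals area, and the helper names are all made up for illustration; a real Cortex-M does the equivalent in hardware, with no check at all.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RAM_SIZE      64u  /* hypothetical tiny RAM */
#define GLOBALS_SIZE  16u  /* static data occupies ram[0..15] */

static uint8_t ram[RAM_SIZE];
static size_t  sp = RAM_SIZE;   /* the stack starts at the top of RAM */

/* Reset the simulated RAM: globals hold 0x11, the stack area is empty. */
void sim_reset(void)
{
    memset(ram, 0x00, sizeof ram);
    memset(ram, 0x11, GLOBALS_SIZE);
    sp = RAM_SIZE;
}

/* Push n bytes of 0xAA, moving sp toward lower addresses like a
   Cortex-M stack. Nothing stops sp from crossing into the globals. */
void push_bytes(size_t n)
{
    while (n-- > 0 && sp > 0) {
        ram[--sp] = 0xAA;
    }
}

/* Count the global bytes that no longer hold their original value. */
size_t corrupted_globals(void)
{
    size_t count = 0;
    for (size_t i = 0; i < GLOBALS_SIZE; ++i) {
        if (ram[i] != 0x11) {
            ++count;
        }
    }
    return count;
}
```

In this model, pushing 48 bytes fills the stack area exactly; the next 8 bytes land in ram[8..15], and nothing reports it.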

Okay, the problem is clear. What to do?

MPU


The simplest and most obvious option is to use the MPU (Memory Protection Unit). It lets you assign different attributes to different regions of memory; in particular, you can surround the stack with read-only regions and catch a MemFault (the MemManage fault) on any write there.

The STM32F407, for example, has an MPU. Unfortunately, many of the smaller STM32 parts do not. The Milandr 1986VE1 does not have one either.

In other words, the solution is good, but not always available.
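As a rough sketch of what such a guard region involves, the helpers below compute the two register values for one read-only, execute-never region using the ARMv7-M MPU field layout. The function names are hypothetical, and the guard's base address and size are assumptions the caller supplies (for example, 32 bytes directly below the stack).

```c
#include <assert.h>
#include <stdint.h>

/* Field positions of the ARMv7-M MPU registers RBAR and RASR. */
#define MPU_RBAR_VALID     (1u << 4)   /* take the region number from RBAR */
#define MPU_RASR_ENABLE    (1u << 0)
#define MPU_RASR_SIZE_POS  1u          /* region size = 2^(SIZE + 1) bytes */
#define MPU_RASR_AP_POS    24u
#define MPU_RASR_AP_RO     6u          /* read-only, privileged and unprivileged */
#define MPU_RASR_XN        (1u << 28)  /* no instruction fetches */

/* RBAR value selecting `region` at `base`.
   base must be aligned to the region size (2^size_log2 bytes). */
uint32_t guard_rbar(uint32_t base, uint32_t region, uint32_t size_log2)
{
    assert(size_log2 >= 5u && size_log2 <= 31u);   /* smallest region: 32 bytes */
    assert(region <= 15u);                         /* 4-bit field; many parts implement 8 */
    assert((base & ((1u << size_log2) - 1u)) == 0u);
    return base | MPU_RBAR_VALID | region;
}

/* RASR value for an enabled, read-only, execute-never region
   of 2^size_log2 bytes. */
uint32_t guard_rasr(uint32_t size_log2)
{
    assert(size_log2 >= 5u && size_log2 <= 31u);
    return MPU_RASR_XN
         | (MPU_RASR_AP_RO << MPU_RASR_AP_POS)
         | ((size_log2 - 1u) << MPU_RASR_SIZE_POS)
         | MPU_RASR_ENABLE;
}
```

On a target with CMSIS headers, the two values would be written to `MPU->RBAR` and `MPU->RASR`, after which the MPU is enabled through `MPU->CTRL` and the MemManage fault through `SCB->SHCSR`. That part depends on the chip and is not shown here.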

Manual control


When compiling, Keil can generate (and does so by default) an HTML report with a call graph (the linker option "--info=stack"). This report includes stack usage information. GCC can do this too (the -fstack-usage option). So you can glance at the report from time to time (or write a script that does it for you and call it before each build).

The very beginning of the report shows the call path that leads to the maximum stack usage:



The problem is that if your code calls functions through pointers or uses virtual methods (and mine does), this report can badly underestimate the maximum stack depth. And interrupts, of course, are not counted at all. Not a very reliable method.
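A script of the kind mentioned above could start from something like the sketch below. It assumes the tab-separated layout GCC writes to its .su files (location and function name, frame size in bytes, qualifier); the function names and the idea of a per-function limit are my own illustration. It looks only at individual frames, so it shares every limitation just described.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Parse one line of a GCC .su file, assumed to look like
   "file:line:col:function<TAB>bytes<TAB>qualifier".
   Returns the frame size in bytes, or -1 if the line does not match. */
long su_line_bytes(const char *line)
{
    const char *tab = strchr(line, '\t');
    if (tab == NULL) {
        return -1;
    }

    char *end = NULL;
    long bytes = strtol(tab + 1, &end, 10);
    if (end == tab + 1 || bytes < 0) {
        return -1;   /* no digits after the first tab */
    }
    return bytes;
}

/* Return the index of the first line whose frame is larger than
   limit_bytes, or -1 if every frame fits. */
int first_frame_over(const char *const *lines, int count, long limit_bytes)
{
    for (int i = 0; i < count; ++i) {
        if (su_line_bytes(lines[i]) > limit_bytes) {
            return i;
        }
    }
    return -1;
}
```

A build step could read each .su file line by line, pass the lines to `first_frame_over`, and fail the build when the result is not -1.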

Tricky stack placement


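I learned about this method from this article. The article is about Rust, but the basic idea is simple: put the stack at the very start of RAM, below the static data, so that an overflow runs off the start of RAM instead of into your variables.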



With GCC, this can be done using double linking.

In Keil, you can change the placement of regions with your own linker script (a scatter file, in Keil terms). To do this, open the project options and uncheck "Use memory layout from target dialog". The default file will then appear in the "Scatter file" field. It looks like this:

    ; *************************************************************
    ; *** Scatter-Loading Description File generated by uVision ***
    ; *************************************************************

    LR_IROM1 0x08000000 0x00020000  {    ; load region size_region
      ER_IROM1 0x08000000 0x00020000  {  ; load address = execution address
       *.o (RESET, +First)
       *(InRoot$$Sections)
       .ANY (+RO)
      }
      RW_IRAM1 0x20000000 0x00005000  {  ; RW data
       .ANY (+RW +ZI)
      }
    }

What next? There are several options. The official documentation suggests defining sections with the reserved names ARM_LIB_HEAP and ARM_LIB_STACK. But that has an unpleasant consequence, at least for me: the stack and heap sizes then have to be set in the scatter file.

In all the projects I work with, the stack and heap sizes are set in the assembler startup file (the one Keil generates when the project is created). I would rather not change that. I just want to add a new scatter file to the project and have everything work. So I went a slightly different way:

    #! armcc -E
    ; with that we can use C preprocessor

    #define RAM_BEGIN        0x20000000
    #define RAM_SIZE_BYTES   (4*1024)

    #define FLASH_BEGIN       0x8000000
    #define FLASH_SIZE_BYTES  (32*1024)

    ; This scatter file places stack before .bss region, so on stack overflow
    ; we get HardFault exception immediately

    LR_IROM1 FLASH_BEGIN FLASH_SIZE_BYTES  {    ; load region size_region
      ER_IROM1 FLASH_BEGIN FLASH_SIZE_BYTES  {  ; load address = execution address
       *.o (RESET, +First)
       *(InRoot$$Sections)
       .ANY (+RO)
      }

      ; Stack region growing down
      REGION_STACK RAM_BEGIN {
        *(STACK)
      }

      ; We have to define heap region, even if we don't actually use heap
      REGION_HEAP ImageLimit(REGION_STACK) {
        *(HEAP)
      }

      ; this will place .bss region above the stack and heap and allocate RAM that is left for it
      RW_IRAM1 ImageLimit(REGION_HEAP) (RAM_SIZE_BYTES - ImageLength(REGION_STACK) - ImageLength(REGION_HEAP))  {
        *(+RW +ZI)
      }
    }


Here I said that all objects named STACK go into the REGION_STACK region, all objects named HEAP go into the REGION_HEAP region, and everything else goes into RW_IRAM1. I arranged the regions in this order: start of RAM, stack, heap, everything else. This assumes that the assembler startup file defines the stack and the heap with code like this (i.e., as areas named STACK and HEAP):

    Stack_Size      EQU     0x00000400

                    AREA    STACK, NOINIT, READWRITE, ALIGN=3
    Stack_Mem       SPACE   Stack_Size
    __initial_sp

    Heap_Size       EQU     0x00000200

                    AREA    HEAP, NOINIT, READWRITE, ALIGN=3
    __heap_base
    Heap_Mem        SPACE   Heap_Size
    __heap_limit

                    PRESERVE8
                    THUMB


OK, you may ask, but what does this give us? Here is what. Now, when the stack runs past its limit, the processor tries to write (or read) memory that does not exist. On STM32 this raises an exception: HardFault.

This is not as convenient as a MemFault from the MPU, because a HardFault can have many different causes, but at least the error is loud rather than silent. That is, it occurs immediately, and not after some unknown period of time as before.

Best of all, we pay nothing for it: zero runtime overhead! Great. But there is one problem.

It does not work on Milandr.

Yes. On Milandr (I am mainly interested in the 1986VE1 and VE91) the memory map looks different. On STM32 there is nothing below the start of RAM, while on Milandr the external bus region sits right below RAM.

But even if you do not use the external bus, you will not get a HardFault. Or maybe you will. Or maybe you will, but not right away. I could not find any information on this (which is not surprising for Milandr), and my experiments gave no clear results. Sometimes a HardFault occurred if the stack size was a multiple of 256. Sometimes a HardFault occurred if the stack went very far into non-existent memory.
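Because a HardFault has many possible causes, the handler can at least make an educated guess. The sketch below is a heuristic of my own, not something from the referenced article. It assumes a Cortex-M3/M4-class core where the Configurable Fault Status Register (`SCB->CFSR`) exists, and the function name and parameters are hypothetical. A stacking error can also be caused by a corrupted stack pointer, so a positive result is only a hint.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits of the Configurable Fault Status Register on Cortex-M3/M4. */
#define CFSR_MSTKERR  (1u << 4)    /* MemManage fault while stacking on exception entry */
#define CFSR_STKERR   (1u << 12)   /* BusFault while stacking on exception entry */

/* Guess whether a fault was caused by stack overflow.
   cfsr:                value read from the fault status register
   sp_at_fault:         stack pointer value captured by the fault handler
   stack_lower_address: lowest valid stack address (start of RAM in the
                        layout described above) */
bool fault_looks_like_stack_overflow(uint32_t cfsr,
                                     uint32_t sp_at_fault,
                                     uint32_t stack_lower_address)
{
    if (sp_at_fault < stack_lower_address) {
        return true;   /* SP already points below the stack region */
    }
    return (cfsr & (CFSR_MSTKERR | CFSR_STKERR)) != 0u;
}
```

In a real handler the inputs would come from `SCB->CFSR` and from the MSP or PSP value saved on entry; how they are captured depends on the toolchain.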

But that does not even matter. If a HardFault does not occur every time, then simply moving the stack to the start of RAM does not save us. And to be completely fair, STM32 is not obliged to raise an exception either; as far as I can tell, the Cortex-M core specification says nothing definite about this.

So even on STM it’s rather a hack, just not very dirty.

So you need to look for some other way.

Write-access breakpoint


If we move the stack to the start of RAM, the stack limit is always the same: 0x20000000. So we can simply set a breakpoint on writes to that address. You can do this with a command, and even have it run automatically from a debugger .ini file:

    // breakpoint on stackoverflow
    BS Write 0x20000000, 1

But this is not a very reliable method. The breakpoint triggers every time the stack is initialized. It is easy to delete it by accident by clicking "Kill all breakpoints". And it protects you only while a debugger is attached. No good.

Dynamic overflow protection


A quick search led me to Keil's "--protect_stack" and "--protect_stack_all" options. They are useful, but unfortunately they protect not against overflow of the whole stack but against one function writing into another function's stack frame, for example when your code runs past the end of an array or mishandles a variable argument list. GCC, of course, can do this too (-fstack-protector).

The essence of the option is this: a "guard variable", that is, a guard number, is added to each stack frame. If the number has changed by the time the function returns, an error handler is called. Details here.

A useful thing, but not quite what I need. I need a much simpler check: on entry to each function, compare the value of the SP (Stack Pointer) register against a known minimum value. But surely not by writing that check by hand at the start of every function?
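The guard idea can be imitated by hand on a host machine. In the sketch below, `frame_t` merely stands in for a stack frame and `GUARD_VALUE` is an arbitrary number; the compiler's real implementation chooses the guard value and the frame layout itself, so treat this only as a model of the principle.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GUARD_VALUE 0xDEADBEEFu   /* hypothetical guard number */

/* Stand-in for a stack frame: a local buffer followed by the guard. */
typedef struct {
    char     buffer[8];
    uint32_t guard;
} frame_t;

/* Copy len bytes into the frame without checking them against the
   buffer size, then report whether the guard survived. */
bool copy_keeps_guard(const char *src, size_t len)
{
    frame_t frame;
    frame.guard = GUARD_VALUE;

    if (len > sizeof frame) {
        len = sizeof frame;       /* stay inside the simulated frame */
    }
    memcpy(&frame, src, len);     /* the "buggy" copy: ignores buffer size */

    return frame.guard == GUARD_VALUE;
}
```

A copy of 8 bytes leaves the guard intact; a copy of 12 bytes runs over it, which is the moment the compiler-generated check would call its error handler.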

SP Dynamic Control


Fortunately, GCC has a wonderful "-finstrument-functions" option that calls a user-defined function on entry to and on exit from every function. It is usually used for printing debug information, but that need not stop us.

Even more fortunately, Keil quite deliberately copies GCC functionality, and the same option is available there under the name "--gnu_instrument" (details).

After that, you just need to write this code:

    #include <stdint.h>

    // Symbol generated by the linker: its address is the base of the
    // REGION_STACK region defined in the scatter file
    extern unsigned int Image$$REGION_STACK$$RW$$Base;

    // Lowest valid stack address
    static const uint32_t stack_lower_address = (uint32_t) &( Image$$REGION_STACK$$RW$$Base );

    // Called on entry to every instrumented function
    extern "C" __attribute__((no_instrument_function))
    void __cyg_profile_func_enter( void * current_func, void * callsite )
    {
        (void)current_func;
        (void)callsite;

        ASSERT( __current_sp() >= stack_lower_address );
    }

    // Called on exit from every instrumented function; nothing to do here
    extern "C" __attribute__((no_instrument_function))
    void __cyg_profile_func_exit( void * current_func, void * callsite )
    {
        (void)current_func;
        (void)callsite;
    }


And voila! Now a stack overflow check runs on entry to every function (including interrupt handlers). If the stack has overflowed, the assert fires.

A few notes:
  • Yes, of course, the overflow check should include some margin; otherwise there is a risk of "jumping over" the stack limit between checks.
  • Image$$REGION_STACK$$RW$$Base is a bit of special magic for getting information about a memory region through constants generated by the linker. Details (although not very clear in places) here.


Is the solution ideal? Of course not.

First, this check is far from free: the code grows by about 10 percent because of it. The code will also run slower (although I did not measure by how much). Whether this is critical is up to you; in my opinion, it is a reasonable price for safety.

Second, it most likely will not work with precompiled libraries (but since I do not use them at all, I did not check).

On the other hand, this solution is potentially suitable for multi-threaded programs, since we do the checking ourselves. I have not thought this idea through yet, so I will hold off on it for now.

Let's sum up


I managed to find working solutions both for STM32 and for Milandr, although for the latter I had to pay with some overhead.

For me, the most important thing was a small paradigm shift in thinking. Prior to the aforementioned article, I did not think at all that you could somehow protect against a stack overflow. I did not perceive this as a problem that needs to be solved, but rather as a kind of spontaneous phenomenon - sometimes it rains, and sometimes the stack overflows, well, there's nothing you can do, you have to bite the bullet and suffer.

In general, I often notice in myself (and in other people) that instead of spending 5 minutes on Google and finding a trivial solution, we live with our problems for years.

That is all from me. I understand that I have not discovered anything fundamentally new, but I did not come across any ready-made articles on this topic (at least, Joseph Yiu himself does not suggest this in his article on the subject). I hope the comments will tell me whether I am right and what pitfalls this approach has.

Source: https://habr.com/ru/post/425071/