We had the opportunity to conduct a small but extremely instructive tactical exercise.
The other day, while porting FreeRTOS to a microcontroller with a Cortex-M1 core (which I have already written about), a small question came up that, quite unexpectedly, stubbornly resisted every attempt to answer it with the help of almighty Google. Moreover, the search showed that I am not the only one interested in it, so the question cannot be a consequence of the asker's innate (or acquired) stupidity, or, at worst, it shows that such stupidity is not all that rare. Slightly puzzled that the usual way of finding answers had failed, I decided to resort to a more exotic and somewhat forgotten method: to think and find the answer myself. Unfortunately, that did not work either, and neither did consulting other not-so-stupid people (if you don't praise yourself, you walk around all day feeling spat upon). Since Habr should have such people in abundance, let's try the extensive approach and involve more specialists. So instead of a victorious post I am writing a plaintive one: help, good people, whoever can. Now, to the heart of the problem.
When switching tasks, the task context has to be saved and later restored. This step is obviously hardware-dependent and deserves special attention when porting. Since I took the Cortex-M0 port as a basis, and the M1 architecture is a subset of it, everything worked without problems. Nevertheless, I decided to read the code of this part to gain some experience. And here a surprise awaited me: the code looked convoluted, because instead of the expected PUSH instructions I saw the following:
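    xPortPendSVHandler:
        ; save the context of the outgoing task
        mrs r0, psp             ; r0 = process stack pointer of the task
        ldr r3, =pxCurrentTCB
        ldr r2, [r3]            ; r2 = current TCB
        subs r0, r0, #32        ; make room for r4-r11 (8 words)
        str r0, [r2]            ; store the new top of stack in the TCB
        stmia r0!, {r4-r7}      ; store the low registers
        mov r4, r8              ; copy the high registers into low ones
        mov r5, r9
        mov r6, r10
        mov r7, r11
        stmia r0!, {r4-r7}      ; store the high registers
        ; context saved, select the next task
        push {r3, r14}
        cpsid i
        bl vTaskSwitchContext
        cpsie i
        pop {r2, r3}            ; r2 = &pxCurrentTCB, r3 = saved lr
        ; restore the context of the incoming task
        ldr r1, [r2]
        ldr r0, [r1]            ; r0 = saved top of stack of the new task
        adds r0, r0, #16        ; skip to the saved high registers
        ldmia r0!, {r4-r7}
        mov r8, r4
        mov r9, r5
        mov r10, r6
        mov r11, r7
        msr psp, r0             ; psp now points at the hardware-stacked frame
        subs r0, r0, #32        ; go back to the saved low registers
        ldmia r0!, {r4-r7}
        bx r3
    vPortSVCHandler;
        ...
    vPortStartFirstTask
        ...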
By the way, taking this opportunity, and before getting to the actual question, I would like to scold the authors of this code. Note that the three labels are written in three different formats: with a trailing colon, without one (which the language description allows), and without a colon but with a semicolon that opens an empty comment. Given that in the last case the label is also redefined by a preprocessor directive, I spent some time trying to understand why it was done this way. The answer, "just because", was found rather quickly and brought no joy. Further, the first and fourth instructions (mrs and subs) compute a value that the fifth (str) stores at the address computed by the second and third (the two ldr instructions). Why interleave the computation of the value with the computation of the address? On the one hand, it is comforting that disregard for style is international rather than our national peculiarity; on the other hand, that does not add optimism. One recalls the classic: "Never attribute to malice that which can be explained by ordinary stupidity." But this is just a lyrical digression on how bright the sun is and how green the grass. Back to the problem itself.
It is easy to see that part of the task context, namely registers r4-r11, is saved by the stmia, mov, stmia sequence, using the indexed store-multiple instruction (the rest of the context, registers r0-r3, r12, lr, pc and xPSR, was stacked by the hardware on exception entry). Why is store-multiple used here instead of PUSH, together with register-to-register moves (store-multiple cannot reach registers above r7)? Well, first of all, PUSH in this architecture is unfortunately just as limited, so the moves could not have been avoided; still, with PUSH it would be much easier to see what is going on. And this is where the dog is buried.
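To make the pointer arithmetic of the handler easier to follow, here is a minimal C model of the save and restore sequences. It is only a sketch: `task_model`, `save_context`, `restore_context` and the helper names are hypothetical, the stack is modelled as an array of 32-bit words, and stack positions are word indices rather than byte addresses (so `#32` becomes 8 words and `#16` becomes 4).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model; stack positions are word indices into `mem`. */
typedef struct {
    uint32_t r[12];        /* r[4]..r[11] are the registers saved by software */
    uint32_t psp;          /* process stack pointer (word index) */
    uint32_t top_of_stack; /* stands in for the first field of the TCB */
} task_model;

/* Models "stmia base!, {r4-r7}": store ascending, return the advanced base. */
static uint32_t stmia_r4_r7(const task_model *t, uint32_t *mem, uint32_t base)
{
    memcpy(&mem[base], &t->r[4], 4 * sizeof(uint32_t));
    return base + 4;
}

/* Models "ldmia base!, {r4-r7}": load ascending, return the advanced base. */
static uint32_t ldmia_r4_r7(task_model *t, const uint32_t *mem, uint32_t base)
{
    memcpy(&t->r[4], &mem[base], 4 * sizeof(uint32_t));
    return base + 4;
}

void save_context(task_model *t, uint32_t *mem)
{
    assert(t->psp >= 8);                      /* room for eight words */
    uint32_t r0 = t->psp - 8;                 /* subs r0, r0, #32 */
    t->top_of_stack = r0;                     /* str r0, [r2] */
    r0 = stmia_r4_r7(t, mem, r0);             /* stmia r0!, {r4-r7} */
    memcpy(&t->r[4], &t->r[8], 4 * sizeof(uint32_t)); /* mov r4..r7, r8..r11 */
    (void)stmia_r4_r7(t, mem, r0);            /* stmia r0!, {r4-r7} */
}

void restore_context(task_model *t, const uint32_t *mem)
{
    uint32_t r0 = t->top_of_stack + 4;        /* ldr r0, [r1]; adds r0, r0, #16 */
    r0 = ldmia_r4_r7(t, mem, r0);             /* ldmia r0!, {r4-r7} (high regs) */
    memcpy(&t->r[8], &t->r[4], 4 * sizeof(uint32_t)); /* mov r8..r11, r4..r7 */
    t->psp = r0;                              /* msr psp, r0 */
    r0 -= 8;                                  /* subs r0, r0, #32 */
    (void)ldmia_r4_r7(t, mem, r0);            /* ldmia r0!, {r4-r7} (low regs) */
}
```

In this model the saved block sits directly below the position PSP had on entry: r4-r7 first, then r8-r11, with the hardware-stacked frame right above it. Note that the save path never changes PSP itself; only the TCB field records the new top of stack.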
The thing is that the M architecture has two modes of operation: Thread (let's call it user) and Handler (let's call it system). These names fit the spirit well, since Handler mode is entered to handle interrupts, which is a system-level activity. There are also privileged and unprivileged levels, but on the M1 they do not exist anyway (they are indistinguishable). Further, the M architecture has two stack pointers, Main (let's call it system) and Process (let's call it user). This naming is also justified: after reset the Main pointer is used, and that is clearly the system level. Both pointers have unique names in the special-register space, MSP and PSP respectively, which the first instruction of the code makes use of. Besides those unique names, there is also (surprise) a plain stack pointer register, SP, which aliases one of the two under the control of a bit in a special register (SPSEL in CONTROL; see the ARM documentation for details). So far everything looks logical; let's look further.
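The aliasing of SP can be sketched as a tiny C model. This is an illustration under simplifying assumptions, not a description of the real hardware: `sp_model`, `active_sp`, `write_spsel` and `push_word` are hypothetical names, stack positions are word indices, and the model already bakes in the Handler-mode rule discussed next (Handler mode always uses MSP and ignores writes to the selection bit).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum { MODE_THREAD, MODE_HANDLER } cpu_mode;

/* Hypothetical model of the two banked stack pointers. */
typedef struct {
    uint32_t msp;    /* Main stack pointer (word index) */
    uint32_t psp;    /* Process stack pointer (word index) */
    bool     spsel;  /* models the selection bit: true selects PSP */
    cpu_mode mode;
} sp_model;

/* Returns the pointer that the plain SP register currently aliases. */
uint32_t *active_sp(sp_model *c)
{
    if (c->mode == MODE_THREAD && c->spsel) {
        return &c->psp;
    }
    return &c->msp;  /* Handler mode, or Thread mode with the bit clear */
}

/* Models a write to the selection bit; it has no effect in Handler mode. */
void write_spsel(sp_model *c, bool value)
{
    if (c->mode == MODE_THREAD) {
        c->spsel = value;
    }
}

/* Models PUSH of one word on a full-descending stack through SP. */
void push_word(sp_model *c, uint32_t *mem, uint32_t value)
{
    uint32_t *sp = active_sp(c);
    assert(*sp > 0);  /* no room left in the modelled stack otherwise */
    *sp -= 1;
    mem[*sp] = value;
}
```

With this model, a `push_word` issued in `MODE_HANDLER` always lands on the Main stack, whatever `spsel` holds, which is why the handler above has to reach the task stack through an ordinary register instead of PUSH.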
In user mode the MCU can switch this bit and thus gets access to both stack pointers. Personally, I would not grant this mode such a right, to stay out of trouble, but who am I to argue with ARM? In system mode, however, the MCU can reach ONLY the system stack pointer through SP and cannot change the value of this bit. It therefore cannot write to the user stack directly with stack instructions. Of course, the corresponding memory area can still be accessed through register-indexed addressing, which is exactly what the routine does, but my question is: why was it done this way? Why is user mode allowed to switch pointers and possibly shoot itself in the foot by wrecking the system stack, while system mode, which is supposedly written by more carefully trained people, is denied this ability? Had both modes been given this permission, there would be no question: the designers saw no need for protection, and that is their right. But for system mode the feature is deliberately blocked, which means there is a piece of hardware responsible for the ban. That hardware is certainly not complicated, and I can suggest a couple of simple implementations myself, but it could not have appeared by itself. So there must be reasons for it, but I do not understand them. I turned over options related to nested interrupts in my head and came up with nothing. Unfortunately, I did not find the answer on the ARM website either: they describe HOW this part of the MCU works, but not WHY (perhaps this is sacred knowledge, and whoever obtains it learns to create architectures no worse than ARM's). In the secret hope that this is exactly the case, I submit the question to the judgment of the Habr community and await your answers.