“Now I will show you a portrait ... Hmm ... I warn you that this is exactly a portrait ... In any case, please treat it as a portrait ...” In this post we will talk about developing and debugging programs for the CC1350 MCU in CCS, the development environment recommended by the manufacturer. We will touch on the advantages (there are some) and the disadvantages (how could there not be) of the above-mentioned products. The text will contain no screenshots designed to show (with a circle around it) the location of the compile icon in the IDE or the selection of a file in a directory. While acknowledging that articles in that style have a right to exist, I will try to focus on the conceptual points, in the hope that my reader will be able to work out the details.
The purpose of this opus, besides passing on the experience gained, is to try to provoke healthy envy among the domestic MCU manufacturers who are TI's direct competitors ("in the country where we thrive, you and I") - a frankly thankless task, but they say a drop of water wears away the stone.
I will stress right away that we will talk only about Windows (and, moreover, only version 7), although the TI website offers versions for Mac and for Linux. I have not tried them; I am quite prepared to believe that things are not so great there - but why think about the bad (or, on the contrary, everything is great there - but then why envy).
So, what does the TI website teach us? To get started with the evaluation modules, you must complete three steps:
- Buy the evaluation modules - done.
Marginal note (PNP): you will have to do this too, because I personally could not (unfortunately) find any hardware emulation option for debugging in the programming environment in question, at least not where I looked.
- Install the development environment - download it, run the installer, and everything just works. Connect the evaluation module to USB - the drivers come up by themselves, and again everything works - done. When we try to program the device, we get a message that a firmware update is needed; we agree, and once again everything works. In short, there would be nothing to write about, if only it were like this always and everywhere ...
- Go and study the TI SimpleLink Academy 3.10.01 course for the SimpleLink CC13x0 SDK 3.10 - a strange suggestion, rather like trying to teach me, which can only spoil me, but so be it. We open the corresponding link and are quietly stunned - how much has been done here.
Here we see training materials on working with SYS/BIOS hardware drivers, with the TI-RTOS operating system, on using the NDK network stack (including USB), on wireless protocols, and on many other aspects of working with representatives of the various MCU families the company produces. And all this wealth is accompanied by ready-to-use examples; if you also consider the availability of user manuals and module descriptions, there is perhaps nothing more to wish for. But there are also utilities that ease the work of preparing and configuring program code, flashing, and debugging in various ways, and this wealth is quite well documented too.
PNP: if anyone is inclined to consider this material an advertisement for the company, its products, and its programming system, they are most likely right - I really am very impressed by the amount of software on offer. I will talk about its quality further on and, I hope, suspicions of bias will be dispelled; I am by no means blinded by the feeling and continue to see the flaws of the object under description quite well, so this is not youthful infatuation but the serious feeling of an adult specialist. I am afraid to imagine the material costs required to create and maintain such a volume of software and its documentation, but this was clearly not done in a month, and the company surely knows what it is doing.
Well, let us postpone the study of the materials until later - we will work everything out as we go - and boldly open CCS. It implements the workspace concept inherited from its parent, Eclipse. Personally, the project concept is closer to me, but nothing prevents us from keeping exactly one project per workspace, so onward.
But then things get a little worse - we open the workspace (WS) for our debug board and see a great many projects (usually in two versions: for the RTOS and for "bare metal"). As I said earlier, that is not a crime, but the fact that many projects contain identical files with identical software modules is not cool at all. The code is duplicated over and over, and keeping changes in sync becomes a highly nontrivial task. Yes, with this solution it is much easier to move a project around by simply copying its directory, but for such things there is project export, and it is implemented quite well. Links to files in the project tree are supported adequately, so the decision to include the files themselves in the supplied examples cannot be considered satisfactory.
We continue our research - let us begin with a finished project, not the blinking of an LED (although there are two of them on the debug board) but work with the serial port: the ready-made example uartecho. We create a new WS, include the project we are interested in, and ... nothing builds; from the message it is clear that a related project must be included in the WS. It is not very clear why this has to be done, but it is not hard to satisfy the environment's demands, after which the project starts to build.
PNP: on my home machine I used the "Import Project" command and all the necessary inclusions happened by themselves. Where exactly the related projects are specified I do not know; let us leave the analysis of this aspect for the future.
We compile, flash, and start debugging. We discover an interesting phenomenon: single-stepping is not displayed quite adequately when we step through the serial-port library - the cost of optimization. We turn off optimization in the compiler settings (what settings there are in there - are there really people who know them all and, moreover, use them all?), rebuild the project - and nothing changes. It turns out that only the files included in the project tree, at least as links, get recompiled. We add links to the library source code, and after rebuilding everything is debugged correctly (provided the debug-information generation option is enabled).
PNP: I did, however, find the options that enable checking for MISRA-C compliance.
PNP: another way is to use the "Clean ..." command followed by a build; for some reason the "Build All" command does not affect the related project.
Next we find that not always and not everything debugs normally: sometimes we land in regions of machine code for which the compiler cannot find the source. Since the programming environment gives us all the files needed for the job - the preprocessor output, the assembler listing, and the linker map (you just need to remember to enable the corresponding options) - we turn to the latter. We find two regions of program code, one starting at 0x0000. and one starting at 0x1000. (32-bit architectures are fine in every way, but writing out addresses is not their strong point). We turn to the chip documentation and learn that inside there is a ROM region mapped at 0x1000., and the built-in part of the libraries lives in it. It is claimed that using subroutines from it improves performance and reduces consumption compared with the 0x000. address space. While we are still mastering the MCU we are not much interested in those parameters, but convenience of debugging is decisive. You can disable use of the ROM (and for our purposes you must) by setting the compiler option NO_ROM, which is what we do before rebuilding the project.
PNP: the jump to a subroutine in ROM looks quite funny - there is no long jump in the instruction set, so the jump is first made, with a return, to an intermediate point in the low address region (0x0000), and there sits a PC-load instruction whose parameters the disassembler does not recognize. Something makes me doubt that with such overhead you can win on speed, although for long subroutines - why not.
By the way, an interesting question: how is it guaranteed that the contents of the ROM match the source code the company has kindly provided? I can immediately propose a mechanism for embedding additional (debugging and service, of course) functions into the ROM that would be completely invisible to the user, the MCU programmer. And personally I have no doubt that the chip's developers know many other mechanisms implementing similar functionality - but let us end this attack of paranoia.
On the other hand, I can only welcome the appearance of such an analogue of the BIOS, because in the long run it makes the developers' dream of true code portability between different MCU families with the same core realistic. Note also the peculiarity of how interaction with the "embedded" software modules is implemented. In the earlier attempt to create such a mechanism, implemented in the TivaC models, there was a call supervisor addressed with a group number and an entry-point number, which caused significant overhead; here the bindings are resolved at the linker level, through doubled function names and straight long jumps inserted to the subroutines in ROM. This is much faster in execution but requires recompiling the project when the usage model changes.
Now that we are fully prepared for convenient debugging, we return to our project and begin calmly debugging the program with access to the module source code (well, so I thought ...), which will let us form an opinion about the quality of these texts. The project under study implements a mirror of the serial communication channel and is extremely convenient for learning purposes. Of course, we took the variant using the RTOS; I do not see the slightest reason not to use it in our configuration (plenty of RAM and program memory).
Note right away that the source code is in C. Often this is not very convenient - many language constructs look cumbersome compared with their C++ counterparts - but the creators cared more about code compatibility than about syntactic sugar. True, one could have made a C++ version of the libraries as well; conditional compilation has been known for a long time and is used everywhere - but that entails additional material costs. Surely the company's management knows what it is doing, and my remarks are a kind of armchair analysis, but it seems to me I am entitled to my opinion too.
I also know the opposite approach, where a library is designed using the latest C++ facilities, and the question of what to do for developers whose compilers do not meet the latest specifications gets the splendid answer: switch to new versions, or do without this library (in such cases I strongly recommend the second option). My personal opinion: if we really want our product to be used (and TI clearly wants this, and does not build its libraries on the "take it or leave it" principle), then its approach is absolutely correct.
The program source looks classical: initialization of the hardware and software environment, creation of the tasks, and launch of the scheduler in the main module, with the task body in a separate compilation unit. In this example there is exactly one task, mainThread - its purpose is not quite clear from the name, and it also confuses me somewhat that the name of the file containing the source does not match the name of the function (uartecho.c, though that name at least says something). Search in the IDE is implemented in the standard way (context menu or F3 on the entity name), so no problems there.
The process of setting the task parameters before launch is pretty much as expected:
- create a parameter structure (a local one, of course),
- assign it the default values,
- set the parameters that differ from the standard ones, and
- use the structure when creating the task.
Despite the seeming naturalness of these operations, it is not obvious to all library authors; I have seen implementations lacking, for example, step 2, which led to amusing (for an outside observer, not for the programmer) program behavior. In this case everything is fine; the only question that arises is why the default values are not constants - probably a legacy of the damned past.
PNP: the well-known FreeRTOS takes a slightly different approach, specifying the task parameters directly in the call to the task-creation API function. The pros and cons of these approaches are as follows:
- the structure approach: + lets you avoid explicitly specifying parameters that match the defaults; + does not require memorizing the parameter order; - more verbose; - higher memory cost; - you need to know the default values; - creates a named intermediate object.
- the direct-call approach: - requires specifying all parameters; - requires memorizing the parameter order; + more compact; + requires less memory; + needs no named intermediate objects.
There is a third method, promoted by the author of this post (in TURBO style), with its own set: + lets you avoid explicitly specifying parameters that match the defaults; + does not require memorizing the parameter order; - verbose; - higher memory cost; - you need to know the default values; + works in a "lambda" style; + makes typical mistakes hard to commit; - looks a little weird because of the many closing parentheses.
And there is a fourth option, free of any shortcomings but requiring C++14 or later - we can only lick our lips and walk on by.
We start a debug session, run the program, and open one of the two serial ports provided by the debug board in the terminal window supplied by the IDE. Which of the two ports to use (one is for debugging, the second presumably for the user; the numbers can be seen in the system) is hard to say in advance - sometimes the lower one, sometimes the higher - but at least it does not change when the board is reconnected, so you can write it on the board. Another inconvenience: open terminals are not saved with the project and are not restored when you open a debug session, although they do not close when you leave one. We check the program's operation and immediately discover another drawback - the terminal cannot be configured; for example, it fundamentally works in Unix style with a terminating \r. I am unaccustomed to such minimalism, though nobody prevents us from using an external terminal program.
PNP: let us note one more debugging feature - true, it holds for any development environment: when the scheduler switches tasks, we lose the trace focus; breakpoints help us solve this problem.
To begin, consider the process of creating a serial-port instance. Here everything seems standard: a structure whose fields are assigned the required object parameters. Note that in C++ we would have an opportunity, completely absent in C, to hide all this initialization very nicely "under the hood," but I have already given the possible arguments in favor of the second solution. There is an initialization function for the configuration structure, and that is good (paradoxical as it sounds, such a function does not seem obligatory to the authors of some libraries). At this point in the narrative the honeymoon ends and ordinary (married) life begins.
A careful examination of the source shows that not all is well. What is the problem? The initialization function copies into our control structure the default values from an object lying in the constant region, which is great, but for some reason:
- the object is global, although it is used only by the parameter-initialization function (a similar practice cost Toyota dearly in its time) - well, adding the static qualifier is easy;
- the controlling object is named; in C there is no beautiful solution to this problem - or rather, there is a solution with an anonymous instance, and I gave it in a long post, but the many closing parentheses do not let one call that option truly beautiful; there is also a solution of stunning beauty, but that dream is unrealizable;
- all the object's fields are clearly redundant in bit width; even two-valued fields (enumerations of two possible values) are stored in 32-bit words;
- the enumerated mode constants are defined as #defines, which rules out checking at compile time and forces it into runtime;
- the infinite-loop section is repeated at the various possible points of failure; it would be much more correct to have a single (in this case, empty) handler;
- and all the operations for setting up and starting a task could (and should) be hidden in a single function or even a macro.
But the initialization of the receive buffer is done well - previously reserved memory is used, no manipulations with the heap; the call chain is somewhat convoluted, but everything is quite readable.
PNP: in the debug window we have the call stack before our eyes; everything is done properly and soundly - much respect. The only somewhat surprising thing is that trying to hide this window ends the debug session.
Well, one more somewhat unexpected solution: setting the possible number of objects via an enumeration - for serial ports, and for the given debug board, equal to 1 - in the style of
typedef enum CC1310_LAUNCHXL_UARTName {
    CC1310_LAUNCHXL_UART0 = 0,
    CC1310_LAUNCHXL_UARTCOUNT
} CC1310_LAUNCHXL_UARTName;
Such solutions are standard for real enumerations, but for describing hardware objects - well, I did not know one could do that, yet it works. With the hardware initialization done, onward.
In the running task we observe the classic infinite loop, in which data is read from the serial port by the function
UART_read(uart, &input, 1);
and immediately sent back by the function
UART_write(uart, &input, 1);
. We step into the first one and see an attempt to read characters from the receive buffer:
return (handle->fxnTablePtr->readPollingFxn(handle, buffer, size))
(how I hate constructs like this, but in C there is simply no other way); we go deeper and find ourselves in UARTCC26XX_read, and from it we land in the ring-buffer implementation - the function
RingBuf_get(&object->ringBuffer, &readIn)
. Here ordinary life enters its acute phase.
To say that I did not like this particular module (the file ringbuf.c) is to say nothing; it is simply terribly written, and I personally would expel the authors of this part from such a respectable company in disgrace (they could still take me in their place, though I am afraid the salary level of our Indian colleagues would not suit me) - but then, perhaps I do not know something. Watch my hands:
1) the wrap-around of the read/write pointers is implemented via the remainder of a division
object->tail = (object->tail + 1) % object->length;
and there can be no compiler optimization of this operation, such as substituting a bitmask, since the buffer length is not a constant. Yes, this MCU has a hardware divide instruction and it is quite fast (I have written about it), but it will still never take 2 cycles, as in the correct implementation with an honest wrap-around (and I have written about that, too),
PNP: I recently saw a description of the new M7 architecture in which, in somebody's implementation (I do not remember whose), 32-by-32 division for some reason took 2-12 cycles instead of 2-7. Either a translation error, or ... I do not even know what to think.
2) moreover, this code fragment is repeated in more than one place - macros and inlines are for weaklings, Ctrl+C and Ctrl+V rule, and the DRY principle can take a hike,
3) a completely unnecessary counter of occupied places in the buffer is implemented, which leads to the following drawback,
4) critical sections in both reading and writing. Well, I can still believe that the authors of this module do not read my posts on Habr (although such behavior is unacceptable for firmware professionals), but they ought to be familiar with the Mustang Book, where this issue is examined in detail,
5) as the cherry on the cake, yet another indicator of the maximum buffer fill is introduced, with a rather vague name and a completely absent description (the latter applies to the entire module, in fact). I do not rule out that this parameter can be useful for debugging, but why drag it into release - what, do we have processor cycles and RAM to burn?
6) and at the same time there is no real handling of buffer overflow (beyond a -1 return signaling the situation) - even Arduino has it; let us leave aside the quality of that handling, but its absence is worse still. Or were the authors inspired by the well-known fact that any statement about the empty set is true, including the statement that it is not empty?
On the whole, my remarks fully match the first line of the demotivator on the subject of code review: "10 lines of code - 10 comments."
By the way, the penultimate of these shortcomings makes one think about more global things: how should a base class be implemented at all so that deep modification of it remains possible? Making all fields protected is a dubious idea (though perhaps the only correct one); inserting calls to friend functions into the descendants smells strongly of crutches. If in this particular case there is a simple answer to the question of adding a buffer-fullness indicator - a derived class with overridden write and read and an extra counter - then implementing a read without advancing the buffer (as here), or replacing the last character placed (I have seen such a ring-buffer implementation too), cannot be done without access to the parent class's internal data.
At the same time, there are no complaints about the implementation of the actual read from the serial interface: input is blocking; when there are not enough characters in the receive buffer, a semaphore is armed and control is handed to the scheduler - everything is implemented neatly and correctly. Personally, I am not thrilled by hardware control inside a general-purpose procedure, but it reduces the nesting of procedures and lowers the cyclomatic-complexity index, whatever that means.
We now turn our attention to the transmission of the received data into the serial channel - for when the object was created it was given only one ring buffer, the receive buffer. Indeed, the hardware's internal buffer is used for transmitting characters, and when it fills up, a wait for readiness is entered (at least in the blocking mode of operation). I cannot refrain from criticizing the style of the corresponding functions: 1) for some reason the object holds a generic pointer that is constantly being converted to a character pointer inside the function
*(unsigned char *)object->writeBuf);
2) the logic of the work is completely opaque and slightly confusing. But none of this is so important, because it stays hidden from the user and "does not affect the maximum speed."
In the course of our research we run into one more feature - in debug mode we do not see the source of some internal functions; this is due to the renaming done for the different compilation options (ROM/NO_ROM). I did not manage to substitute the required source file (C:\Jenkins\jobs\FWGroup-DriverLib\workspace\modules\output\cc13xx_cha_2_0_ext\driverlib\bin\ccs/./../../../driverlib/uart.c--), though I did not really try, but I did find the source (in the file uart.c, of course - thank you, Captain); fortunately the fragment is simple, and the assembler code can be matched unambiguously against the C source (especially if you know the peculiarities of the ITxxx instructions). How to solve this problem for libraries with complex functions I do not yet know; we will think about it when the need arises.
And finally a small remark: I am ready to believe that the hardware implementation of the serial channel in the CC13x0 MCU models coincides with that of the CC26x0, and that duplicating the contents of a file named UARTCC26XX.c would not be the right solution - but creating an intermediate definition file that includes the source file, redefines the functions, and carries an appropriate comment would be welcome, because it would make the program more understandable, and that should always be welcomed.
So, the test example works; we have learned a lot about the internal structure of the standard libraries and noted their strong and not-so-strong sides. In conclusion, let us try to answer a question that usually worries a programmer facing the OS-or-no-OS dilemma: the context-switch time. Two ways are possible here: 1) examining the source code - rather a theoretical path, requiring a level of immersion in the subject that I am not ready to demonstrate; and 2) a practical experiment. Of course, the second method, unlike the first, does not give absolutely accurate results, but "truth is always concrete," and the data obtained may well be considered adequate if the measurements are organized correctly.
To begin with, in order to estimate the switching time we need to learn how to measure the overall execution time of various program fragments. The architecture under consideration has a debug module, and the system cycle counter is part of it. Information about this module is quite accessible, but the devil, as always, hides in the details. First we try to set up the required mode by hand, directly through register access. We quickly find the CPU_DWT register block and in it both the counter itself, CYCCNT, and its control register CTRL with the CYCCNTENA bit. Naturally - or, as they say, of course - nothing happened, and on the ARM website there is the answer why: the debug module must first be enabled with the TRCENA bit in the DEMCR register. But with that last register things are not so simple: it is not in the DWT block, searching the other blocks is tedious - they are quite long - and I found no search-by-name in the register window (it would be nice to have one). We go to the memory window, enter the register's address, known to us from the datasheet (by the way, hexadecimal address format is for some reason not the default; the 0x prefix must be added by hand), and, suddenly, we see a named memory cell called CPU_CSC_DEMCR. It is funny, to say the least, why the company renamed the registers relative to the names proposed by the licensor of the architecture; probably it was necessary. And indeed, in the CPU_CSC register block we find our register, set the needed bit in it, go back to the counter, enable it - and everything works.
PNP: the search by name does exist after all; it is invoked (naturally) with the Ctrl-F combination - it is just that it appears only in the context menu, while in the regular menu it is grayed out. My apologies to the developers.
I note right away a shortcoming of the memory window: the printout of the contents is interrupted by the names of named cells, which makes the output ragged and not aligned to 16 (8, 32, 64 - substitute as needed) words. Moreover, the output format changes when the window is resized. Maybe all this can be configured to the user's taste, but, judging from my own experience (and what else is there to judge from), I declare that configuring the memory viewport display is not among the intuitively obvious solutions. I am entirely in favor of having such a convenient feature as the display of named memory areas in the viewport enabled by default - otherwise many users would never learn of it - but we must also take care of those who consciously want to turn it off.
By the way, I would not at all refuse the ability to create macros (or scripts) for working with the environment, because this register setup (to enable time measurement) had to be repeated after every MCU reset, and I consider patching the code with register manipulations for debugging purposes not quite proper. But, although I have found no macros, work with registers can be greatly simplified by the fact that individual (needed) registers can be added to the Expressions window, which speeds things up considerably.
So that nobody thinks the engineer's feelings toward this MCU family have cooled (even though I keep scolding various aspects of the development environment), I note that the counter works excellently - I could not detect any extra ticks in any of the debug modes, which has not always been the case; the LuminaryMicro MCU series, at least, had them.
So, we sketch the experiment plan for determining the context-switch time: we create a second task that increments an internal counter in an infinite loop, run the MCU for a certain time, and find the ratio between the system counter and the task counter. Then we run the MCU for a similar time (not necessarily exactly the same) and type 10 characters at a pace of about one per second. We may expect this to produce 10 switches to the echo task and 10 switches back to the counter task. Yes, these context switches are triggered not by the scheduler's timer but by an event, but this should not affect the total execution time of the code under study - so we start putting the plan into practice: create the counter task and run it.
We immediately discover one feature of this RTOS, at least in the standard configuration - it has no time-slicing: if a higher-priority task is always ready to run (and the counter task is exactly that) and never yields control to the scheduler (does not wait for signals, does not sleep, is not blocked on flags, and so on), then no task of lower priority will run at all, full stop. This is not Linux, where various methods guarantee everyone gets a quantum, "so that nobody leaves offended." Such behavior is quite expected - many light RTOSes behave this way - but the problem runs deeper: tasks of priority equal to the always-ready one do not receive control either. That is why in this example I gave the echo task, which blocks in a wait, a priority higher than the always-ready counter task; otherwise the latter would seize all the processor time.
We run the experiment. The first part (simply waiting out the run time) gave a counter ratio of 406181k / 58015k = 7 - quite as expected. The second part (with 10 characters over ~10 seconds) gives 351234k - 50167k * 7 = 63k / 20 = 3160 cycles; this last figure is the time attributable to the context-switch procedure, in MCU cycles. Personally, this value seems somewhat larger than expected; we continue the research - it seems there are still some actions spoiling the statistics.
PNP: a common experimenter's mistake is failing to estimate the expected results in advance and believing whatever garbage comes out (greetings to the developers of the 737).
Obviously ("well, it is quite obvious") the result contains, besides the actual context-switch time, the time needed to read the character from the buffer and output it to the serial port. Less obviously, it also contains the time needed to service the receive interrupt and place the character into the receive buffer. How do we separate the flies from the cutlets? We have a cunning trick for this: stop the program, type 10 characters, and resume it. One may expect (the sources would have to be checked) that the receive interrupt will then fire only once and all the characters will be moved from the receive buffer into the ring buffer in one go, so we should see less overhead. Determining the serial-output time is also easy: output only every second character and solve the resulting 2 linear equations in 2 unknowns. Or even simpler - output nothing at all, which is what I did.
And here are the results of these cunning manipulations: making the input a packet reduces the missing cycles to 2282; turning off the output drops the cost to 1222 cycles. Better, although I had been hoping for about 300 cycles.
But nothing of the kind will work for the read execution time - it scales together with the context-switch time we are after. The only thing I can propose is to stop the internal timer when input of a received character begins and restart it before the next one. Then the two counters would run synchronously (except during the switches) and the difference could easily be determined. But this approach requires deep intrusion into the system sources, and the interrupt-handling component would still remain. Therefore I propose we limit ourselves to the data already obtained, which lets us firmly assert that the task-switch time in the TI-RTOS under consideration does not exceed 1222 clock cycles, which at the given clock frequency is 30 usec.
PNP: that is still a lot - I had counted on about 100: 30 cycles to save the context, 40 to pick the ready task, and 30 to restore the context, yet we get an order of magnitude more. True, optimization was off until now; we turn on -O2 and look at the result: it has not changed much - 2894 instead of 3160.
There is another idea: if the OS supported time-slicing of equal-priority tasks, we could run two counter tasks, in some magical way obtain the number of switches over a period, and compute the loss relative to the system counter - but owing to the scheduler peculiarity I have already mentioned, this approach will not succeed. Another option is possible, though: ping-pong between two equal-priority (or even different-priority) tasks through a semaphore; counting the context switches there is easy. Worth a try - but that will be tomorrow.
The traditional poll at the end of the post will this time be devoted not to the level of the presentation (any unbiased reader can see it is beyond all praise and exceeds any expectations) but to the topic of the next post.