This article is a hypothesis: what is described here has not been implemented anywhere, although, in principle, nothing prevents knocking it together in Phantom.
This idea occurred to me a long time ago, and I even described it somewhere. The trigger for writing it up today was a discussion of Linux network drivers in the comments on the driver's Anatomy article.
I will formulate the problem described there as I understand it: a Linux network driver works in a separate thread that reads received packets from the device and processes them synchronously. A packet runs through routing and the firewall, and if it is not addressed to us, it is sent to the outgoing interface.
It is clear that some packets are serviced quickly, while others can take a long time. In such a situation, one would like a mechanism that dynamically spawns service threads as needed, yet is cheap enough when extra threads are not needed.
That is, I would like a function call that can, if necessary, be converted into a thread start, but that costs no more than an ordinary function call if the thread turns out not to be needed.
This idea came to me while I was considering completely fantastic models for Phantom, including an actor model that launched a thread for absolutely every function/method call. I dropped that model, but the idea of lazy threads has stayed with me and still seems interesting.
It goes like this.
Suppose we call the function void worker(packet), which should quietly accomplish something. We are not interested in its return code (or it is delivered to us asynchronously), and we would like to execute the function within our own thread if it is short, and within a separate thread if it is long.
The notion of "long" is left open here, but a simple estimate seems reasonable: if we fit within our own scheduling quantum, the function is short. If a preemption occurred during the function's lifetime and the processor was taken away from us, it was long.
To do this, we run it through a proxy, lazy_thread(worker, packet), which performs a very simple operation: just before calling the worker function it records a reference to the current stack in a special queue, lazy_threads_queue, and replaces the stack with a new one:
push( lazy_threads_queue, arch_get_stack_pointer() );
arch_set_stack_pointer( allocate_stack() );
If worker returns, we cancel this operation:
tmp = arch_get_stack_pointer();
arch_set_stack_pointer( pop( lazy_threads_queue ) );
deallocate_stack( tmp );
And we continue as if nothing had happened. The whole thing cost us a couple of lines of code.
If considerable time has passed and worker is still running, we perform a simple operation: at the point of the stack switch, we split the threads after the fact. We pretend that a full thread creation took place inside lazy_thread(): we copy the properties of the old thread to a new one; on the new stack (the one we allocated in lazy_thread) we patch the return address so that it points to the function thread_exit(void); and in the old thread we set the instruction pointer to the exit point of lazy_thread.
Now the old thread continues its work, and the new one will finish what was begun and be destroyed at the point where, in the original scenario, it would have returned from lazy_thread.
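Put together, the promotion step might look roughly like the following pseudocode. Every name in it (thread_clone_properties, patch_return_address, and so on) is an assumption of mine, since none of this exists:

```
/* Pseudocode: promote the oldest pending lazy call into a real thread. */
promote_lazy_frame():
    old_sp = pop( lazy_threads_queue )          /* caller's saved stack */
    new    = thread_clone_properties( cur_thread )

    /* worker keeps running in `new`; when it eventually returns up the
       (new) stack, the patched return address sends it into thread_exit() */
    patch_return_address( new_stack_of(cur_thread), thread_exit )

    /* the old thread resumes as if lazy_thread() had just returned */
    set_instruction_pointer( old_sp, lazy_thread_return_point )
    make_runnable( old_thread_at(old_sp) )
```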
That is, the actual decision to start a thread for a particular request is made after we have begun to serve it and have been able to assess its real cost. Additional constraints can be imposed on the decision point, for example: promote only if the 15-second load average is below 1.5 per processor. If it is higher, parallelization is unlikely to help, and we would spend more resources starting meaningless threads.
In the modern world, where four processors in a pocket machine and sixteen in a desktop are commonplace, mechanisms that help code adapt to the capacity of the hardware are clearly needed. Maybe this is one of them?