I continue my short series of articles on ways to organize and implement concurrent computing.
In the previous article we looked at the thread abstraction, which lets us pretend that the code of several functions runs simultaneously and without interruption.
In this one we will look at two more models: the first makes no such pretense, and the second views concurrent computation from a more abstract angle.
Unlike preemptive multitasking, which interrupts your code at an arbitrary time and in an arbitrary place, cooperative multitasking is the "manual" version: your code knows that it is not the only thing running, that other pending tasks exist, and it decides itself when to hand control over to them.
With cooperative multitasking it is important not to perform long-running operations, and if you must, to yield control periodically.
The ideal arrangement is one where your "cooperative part" does no blocking I/O or heavy computation at all, but uses a non-blocking asynchronous API instead, so the time-consuming work happens "outside", where real parallelism (or at least pseudo-parallelism) is available.
I said earlier that the operating system schedules threads, letting each execute for a certain slice of time. But how is that possible at all? There are two options:
The processor supports interrupting instruction execution after some interval and jumping to predefined code (a timer interrupt, or, where the hardware allows it, an interrupt after a given number of executed instructions).
The runtime instruments the code itself: on top of the context-switch overhead (saving all register values somewhere) this adds the overhead of modifying the code (which can be done ahead of time, AOT) plus counting instructions as they execute; everything becomes at most about twice as slow, and in most cases much less.
When for some reason we don't want to (or can't) use timer interrupts, and the second option is too costly, cooperative multitasking comes into play. We write functions in a style where we ourselves say when execution may be interrupted so other tasks can run. Something like this:
void some_name() {
    doSomeWork();
    yield();
    while (moreWorkRemains()) {
        doAnotherWork();
        yield();
    }
    doLastWork();
}
On each call to yield(), the system saves the function's entire context (the values of its variables and the place where yield() was called) and switches to another function of the same kind, restoring that function's context and resuming execution from the point where it last stopped.
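The mechanism above can be sketched in JavaScript, where generators give us exactly this "save the context at yield, resume later" behavior. This is a minimal round-robin scheduler; the names runAll, letters, and numbers are illustrative, not from any library:

```javascript
// A minimal cooperative round-robin scheduler built on generators.
// Each `yield` is the point where a task voluntarily gives up control.
function runAll(...taskFns) {
  const tasks = taskFns.map(fn => fn()); // start each generator
  while (tasks.length > 0) {
    const task = tasks.shift();          // take the next task in line
    const { done } = task.next();        // run it until its next yield
    if (!done) tasks.push(task);         // not finished: re-queue it
  }
}

function* letters() {
  for (const ch of ['a', 'b']) { console.log(ch); yield; }
}
function* numbers() {
  for (const n of [1, 2]) { console.log(n); yield; }
}

runAll(letters, numbers); // prints: a, 1, b, 2
```

Note that a task which never yields would monopolize the loop entirely, which is precisely the weakness of cooperative multitasking described above.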
This approach has both pros and cons.
Pros: context switches are cheap and happen only at points you choose yourself, so between two yield() calls you don't need locks to protect shared state, and the order of switching is deterministic and easy to reason about.
Cons: a single task that forgets to yield (or blocks on I/O) freezes everything else, and the burden of placing yield points correctly falls entirely on the programmer.
It is hard to make a blanket statement about speed. On the one hand, cooperative code can be faster if it switches contexts less often than a preemptive scheduler would, and slower if it yields too often. On the other hand, overly long gaps between yields can hurt UI responsiveness or I/O enough for the user to notice, and then they are unlikely to say things got faster.
But back to our coroutines. Coroutines have more than one entry point and more than one exit (unlike ordinary functions, i.e. subroutines): one starting point, optionally one final point, and an arbitrary number of exit/entry pairs.
First, consider the case with an infinite number of exits (an infinite list generator):
function* serial() {
    let i = 0;
    while (true) {
        yield i++;
    }
}
This is JavaScript. Calling the serial function returns an object with a next() method; successive calls to next() return objects of the form {value: Any, done: Boolean}, where done stays false until the generator runs to the end of the function body, and value holds the values we produced with yield.
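Stepping through such a generator by hand makes the {value, done} shape concrete (the definition is repeated here so the snippet is self-contained):

```javascript
// An infinite counter generator: done never becomes true,
// because control never reaches the end of the function body.
function* serial() {
  let i = 0;
  while (true) {
    yield i++;
  }
}

const gen = serial();
console.log(gen.next()); // { value: 0, done: false }
console.log(gen.next()); // { value: 1, done: false }
console.log(gen.next()); // { value: 2, done: false }
```

Each next() call resumes the generator's body exactly where the previous yield suspended it, with all local variables (here, i) intact.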
... but besides returning a value at each yield, a generator can also accept new data from outside. For example, let's build a simple accumulator:
function* sum() {
    let total = 0;
    while (true) {
        let n = yield total;
        total += n;
    }
}

let s = sum();
s.next();  // 0
s.next(3); // 3
s.next(5); // 8
s.next(7); // 15
s.next(0); // 15
The first call to next() returns the value passed to the first yield (any argument given to that first call is ignored); on every later call we can pass next() the value we want that yield to return inside the generator.
I think you can see how it works. If you still don't see what it's good for, wait for the next article, where I will talk about promises and async/await.
The actor model is a powerful and fairly simple model of parallel computation that lets you achieve both efficiency and convenience at a low price (more on that later). There are only two kinds of entities: actors (each with an address and state) and messages (arbitrary data). Upon receiving a message, an actor may:
send a finite number of messages to other actors whose addresses it knows;
create a finite number of new actors;
change its own state, i.e. choose the behavior it will apply to the next message.
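A minimal sketch of these two entities in JavaScript might look like this. The Actor class, its send method, and the counter example are all illustrative names invented for this sketch, not the API of any real actor library; real implementations would process mailboxes asynchronously and in parallel:

```javascript
// A toy actor: private state, a mailbox queue, and a handler that
// receives the current state plus one message and returns new state.
class Actor {
  constructor(handler, initialState) {
    this.state = initialState;
    this.handler = handler;
    this.mailbox = [];      // incoming messages accumulate here
    this.processing = false;
  }
  // The only way to interact with an actor: send it a message.
  send(message) {
    this.mailbox.push(message);
    if (!this.processing) this.processAll();
  }
  processAll() {
    this.processing = true;
    while (this.mailbox.length > 0) {
      const msg = this.mailbox.shift(); // one message at a time
      this.state = this.handler(this.state, msg);
    }
    this.processing = false;
  }
}

// A counter actor: its state is a number, changed only via messages.
const counter = new Actor((state, msg) => {
  if (msg.type === 'add') return state + msg.value;
  if (msg.type === 'print') { console.log(state); return state; }
  return state; // unknown messages are ignored
}, 0);

counter.send({ type: 'add', value: 5 });
counter.send({ type: 'add', value: 3 });
counter.send({ type: 'print' }); // prints 8
```

Because the actor's state is touched only inside its own handler, one message at a time, no locks are needed to keep it consistent.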
What is good about actors? If resources are divided between actors properly, you can get rid of locks entirely (although, if you think about it, the locks turn into waits for results; during such a wait an actor should keep processing other messages, not just sit idle).
In addition, your code will likely become much better organized and logically partitioned, because you are forced to design the actors' message API carefully. An actor is also much easier to reuse than a plain class: the only way to interact with it is to send it messages and receive messages back at the addresses it was given, so it has no hard dependencies or implicit connections, and any of its "outgoing calls" is easy to intercept and customize.
The price for this is the message queue and the overhead of working with it. Each actor has a queue where incoming messages accumulate; if it cannot keep up, the queue grows. In loaded systems you will have to solve this somehow, for example by inventing ways to process messages in parallel, with groups of actors handling the same kind of task. But here the queues also work in your favor, because it becomes very easy to monitor exactly where performance is lacking. Instead of a single metric like "I waited 50 ms for the result", you get, for each component of the system, a metric like "can process N requests per minute".
Actors can be implemented in many ways: you can give each one its own thread (but then you can't create very many of them), or you can create a handful of threads that actually run in parallel and run the message handlers inside them. Nothing changes (as long as no handler performs a very long operation that would block the rest), and you can create far more actors. If messages are serializable, nothing prevents you from distributing actors across different machines, which makes the system much easier to scale.
I won't give examples here; if you are interested, I recommend reading Learn You Some Erlang for Great Good!. Erlang is a programming language built entirely around the actor concept, and its supervisor system lets you build truly resilient applications. Not to mention OTP, which sets the right tone and makes writing a bad system rather difficult.
In the third part we turn to the most interesting topic: ways of organizing asynchronous computation, where we issue a request for some action and receive its result only at some indeterminate point in the future. Without spaghetti code, callback hell, or undefined states.
UPD: The third part .
Source: https://habr.com/ru/post/318786/