More on one innovation in the latest version of SObjectizer

When we started writing about our open-source actor framework for C++ on Habr, we promised to cover some details of SObjectizer's implementation. One of the new features in the recently released version 5.5.19 is a great fit for such a story. It is also interesting because it made us look at SObjectizer usage scenarios from a completely different angle. You could even say that it shattered one of our assumptions.

The feature is SObjectizer's ability to perform all of its work on a single worker thread. Starting with version 5.5.19, you can use the Actor and Publish/Subscribe models even in a single-threaded application. Of course, the actors then have to work in cooperative multitasking mode, but in some cases that is exactly what is required.

Where might you need actors in a single-threaded application?

As it turned out, there is a whole class of tasks that call for small, lightweight applications. Using actors in general, and SObjectizer in particular, is appropriate inside them, but creating several worker threads with the associated overhead is like shooting sparrows with a cannon.

Say we have a large application consisting of a main master process and child worker processes, of which there may be a hundred or even a thousand. The master process distributes work among the workers, collects their results, monitors their health and restarts them as needed. Child workers should, as a rule, be simple and lightweight processes, and ideally each of them would get by with just one worker thread. After all, a system of a thousand worker processes with one thread each is one thing; a thousand workers with four threads each is quite another.

Or another example: a small program that occasionally polls a couple of devices and sends the collected data to an MQTT broker. Work with each device can be implemented as an agent, but multithreading is hardly required. Moreover, all of this may run on a small single-board computer with limited resources, and even if that board runs an ordinary Linux distribution, there is no point in spending resources without good reason.

Which of our assumptions was broken?

Originally, SObjectizer was created as a tool to simplify the development of large and complex multi-threaded applications. SObjectizer's dispatchers, together with agents that interact only through asynchronous messages, make it possible to write applications with dozens or even hundreds of worker threads inside without the programmer ever touching a mutex or a condition_variable. Therefore, we did not even consider small single-threaded applications as a niche for SObjectizer. As it turned out, we were wrong: the Actor and Publish/Subscribe models feel quite at home in single-threaded applications.

How did we manage to make SObjectizer work in single-threaded mode?

First we need to explain why SObjectizer needed several worker threads at all. These threads are needed for:

Timer service. SObjectizer starts a separate timer thread that determines when delayed and periodic messages must be sent. With a dedicated timer thread, message processing by agents has almost no effect on timer accuracy.
Completing the deregistration of a cooperation. When a cooperation is removed from the SObjectizer Environment, all agents in it must be unbound from their dispatchers, and the dispatchers must release the resources allocated to those agents. For example, if an agent was bound to the active_obj dispatcher, the dispatcher must finish the agent's dedicated thread and call join() on it. Here it matters a great deal in which context join() is called: if a thread calls join() on itself, the result is a classic deadlock. Therefore SObjectizer uses a separate thread to which notifications are sent that all agents of a cooperation have completely finished their work and can be unbound from their dispatchers. All join() calls are then made safely in the context of this dedicated thread.
Serving agents bound to the default dispatcher. If the programmer does not explicitly bind an agent to a particular dispatcher, the agent is bound to the default dispatcher. The default dispatcher needs a worker thread of its own on which it runs the event handlers of the agents bound to it.

So when SObjectizer is launched in the usual way with so_5::launch, the current thread (the one on which so_5::launch was called) performs the initial actions and then blocks until the SObjectizer Environment finishes. Along the way, SObjectizer creates the three threads described above: for the timer, for the final deregistration of cooperations and for the default dispatcher. On top of that come as many threads as any additional dispatchers need.

We wanted SObjectizer to perform all the operations it needs in the context of just one thread: the one on which so_5::launch was called.

For this we had to introduce a new concept, the environment infrastructure, i.e. the infrastructure that serves the needs of SObjectizer itself. We defined a corresponding interface and reworked the internals of the SObjectizer Environment so that the methods of this interface are called in the right places. Then several implementations were made:

default_mt is the good old implementation that uses several additional worker threads. It is created and used by default if the programmer has not explicitly specified another type of environment infrastructure;
simple_mtsafe is a simple single-threaded implementation in which timers, deregistration of cooperations and the default dispatcher all use the thread on which so_5::launch was called. At the same time, the simple_mtsafe infrastructure keeps SObjectizer thread-safe. More on this below;
simple_not_mtsafe is another simple single-threaded implementation in which timers, deregistration of cooperations and the default dispatcher also use the thread on which so_5::launch was called. However, the thread safety of SObjectizer is not ensured.

How do single-threaded infrastructures work?

At the core of the simple single-threaded infrastructures is a single loop in which the SObjectizer Environment performs the following actions in sequence:

checks whether there are cooperations that are fully ready for deregistration. If there are, it performs the final deregistration of these cooperations and destroys the agents in them;
checks whether any timers have expired and, if so, sends the delayed or periodic messages whose time has come;
checks whether there are demands in the queue of the default dispatcher. If the queue is not empty, the first demand is taken and executed, after which the loop repeats.
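The three steps above can be modelled with a small standard-library-only sketch. This is a simplified illustration, not SObjectizer's code: toy_infrastructure, its fields and run_demo are hypothetical names, the clock is simulated, and the loop simply stops when the demand queue is empty instead of sleeping until the next timer.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy model of a single-threaded infrastructure.
// None of these names come from SObjectizer itself.
struct toy_infrastructure {
    using action = std::function<void()>;
    struct timer { long due; action fire; };

    std::deque<action> final_deregs;  // cooperations ready for final deregistration
    std::vector<timer> timers;        // pending delayed messages
    std::deque<action> demands;       // queue of the "default dispatcher"
    long now = 0;                     // simulated clock

    void run_main_loop() {
        for (;;) {
            // Step 1: finish deregistration of cooperations that are ready.
            while (!final_deregs.empty()) {
                action a = std::move(final_deregs.front());
                final_deregs.pop_front();
                a();
            }
            // Step 2: turn expired timers into ordinary demands.
            for (auto it = timers.begin(); it != timers.end();) {
                if (it->due <= now) {
                    demands.push_back(std::move(it->fire));
                    it = timers.erase(it);
                } else {
                    ++it;
                }
            }
            // Step 3: execute at most one demand, then repeat the loop.
            if (demands.empty())
                break;  // toy model only: a real loop would wait here
            action a = std::move(demands.front());
            demands.pop_front();
            a();
        }
    }
};

// Runs a tiny scenario and returns the order in which things happened.
std::vector<std::string> run_demo() {
    toy_infrastructure infra;
    std::vector<std::string> log;
    infra.demands.push_back([&] {
        log.push_back("demand 1");
        // The handler makes a cooperation ready for final deregistration.
        infra.final_deregs.push_back([&] { log.push_back("final dereg"); });
    });
    infra.demands.push_back([&] { log.push_back("demand 2"); });
    infra.timers.push_back({0, [&] { log.push_back("timer"); }});
    infra.run_main_loop();
    return log;
}
```

In this model the deregistration requested by the first handler is completed before the second demand runs, because the check for ready cooperations is the first step of every iteration.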

Obviously, timer accuracy then depends on which agents work on the default dispatcher: if they process their events quickly, the timer is more or less accurate. If processing can drag on for seconds or tens of seconds, timer accuracy is lost, and once a lengthy handler completes, a whole batch of timer events may be generated at once. But this is a natural price for not having a separate timer thread.
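The effect of a long handler on a periodic timer can be illustrated with a deterministic simulation. This is only a sketch under assumptions: ticks_per_iteration is a hypothetical helper, time is an abstract counter, and the timer is checked once per loop iteration, just before a handler runs.

```cpp
#include <cassert>
#include <vector>

// Simulates a periodic timer served by a single-threaded loop.
// handler_costs[i] is how long the handler of iteration i runs.
// Returns, for each iteration, how many timer ticks became due at once.
std::vector<int> ticks_per_iteration(long period,
                                     const std::vector<long>& handler_costs) {
    assert(period > 0);
    std::vector<int> fired_per_iteration;
    long now = 0;            // simulated clock
    long next_due = period;  // time of the next periodic tick
    for (long cost : handler_costs) {
        // Timers are checked only between handlers.
        int fired = 0;
        while (next_due <= now) {
            ++fired;
            next_due += period;
        }
        fired_per_iteration.push_back(fired);
        // The handler occupies the only thread; no timer can fire meanwhile.
        now += cost;
    }
    return fired_per_iteration;
}
```

With a period of 100 and handler costs of 10, 10, 1000 and 10, no ticks fire during the first three iterations, and the fourth iteration sees ten ticks at once: the batch of timer events that follows a lengthy handler.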

The word “simple” in the names simple_mtsafe and simple_not_mtsafe is there for a reason: the default dispatcher uses a simple FIFO event processing scheme that ignores agent priorities. If you need a single-threaded infrastructure with support for agent priorities, let us know and we will include it in our work plan.

What is the difference between simple_mtsafe and simple_not_mtsafe?

We need to explain why both simple_mtsafe and simple_not_mtsafe exist, and what thread safety means for SObjectizer in general.

Basically, there are two situations where we may need a single-threaded SObjectizer:

A single-threaded SObjectizer works strictly within a single-threaded application. That is, SObjectizer is launched on the main thread, and from then on all work is done only inside SObjectizer. There are no other worker threads, and SObjectizer cannot be accessed from outside the main application thread. The simple_not_mtsafe infrastructure is intended for this situation. Its implementation relies on the fact that SObjectizer is used from only one thread, so the internals of SObjectizer do not need to be protected against concurrent access.
A single-threaded SObjectizer works inside a multi-threaded application. For example, the GUI runs on the main thread of the application and SObjectizer runs on a separate thread. In this case SObjectizer can be accessed not only from the thread on which it is running but also from any other application thread. For example, the GUI thread can create new cooperations, destroy old ones and send messages to agents. The simple_mtsafe infrastructure is intended for this situation. It protects the internals of SObjectizer from concurrent access, which makes it possible to run SObjectizer on one thread and send messages to it from another.

We see the purpose of the simple_mtsafe infrastructure as minimizing SObjectizer's overhead while keeping SObjectizer usable in a multi-threaded application. With simple_mtsafe, SObjectizer uses only one worker thread instead of three or four, as is the case with default_mt. At the same time, users can create as many additional worker threads in their application as they need and still interact with SObjectizer from those threads.

We see the main use of simple_mtsafe in small GUI applications in which the developer wants to move part of the logic to an additional thread on which SObjectizer runs. The main thread of the application then remains available for GUI-related operations.

The simple_not_mtsafe infrastructure, in contrast, is needed only when the user wants a truly single-threaded application, with one worker thread on which all of the application's operations are performed.

Accordingly, we see the main use of simple_not_mtsafe in small utilities with more or less complex logic inside where saving resources is important, in lightweight worker processes, and in applications for very weak platforms.

Precisely because simple_not_mtsafe is intended exclusively for single-threaded applications, there is a fundamental difference between the two implementations: simple_mtsafe has to protect its internals with a mutex, while simple_not_mtsafe does not.

As a result, the main loops of the simple_mtsafe and simple_not_mtsafe infrastructures are very similar and differ only in that simple_mtsafe works with std::mutex. The code for simple_mtsafe:

    template< typename ACTIVITY_TRACKER >
    void
    env_infrastructure_t< ACTIVITY_TRACKER >::run_main_loop()
    {
        m_activity_tracker.wait_started();

        std::unique_lock< std::mutex > lock( m_sync_objects.m_lock );
        for(;;)
        {
            process_final_deregs_if_any( lock );

            perform_shutdown_related_actions_if_needed( lock );
            if( shutdown_status_t::completed == m_shutdown_status )
                break;

            handle_expired_timers_if_any( lock );

            try_handle_next_demand( lock );
        }
    }

And for simple_not_mtsafe:

    template< typename ACTIVITY_TRACKER >
    void
    env_infrastructure_t< ACTIVITY_TRACKER >::run_main_loop()
    {
        m_activity_tracker.wait_started();

        for(;;)
        {
            process_final_deregs_if_any();

            perform_shutdown_related_actions_if_needed();
            if( shutdown_status_t::completed == m_shutdown_status )
                break;

            handle_expired_timers_if_any();

            try_handle_next_demand();
        }
    }

Note: the methods of the simple_mtsafe infrastructure (such as process_final_deregs_if_any() and try_handle_next_demand()) receive a reference to std::unique_lock so that they can release the mutex for the duration of the corresponding operation and then acquire it again.
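The pattern of passing a std::unique_lock by reference can be sketched as follows. This is an illustration of the general technique, not the actual body of SObjectizer's methods: toy_queue and this version of try_handle_next_demand are hypothetical and use only the standard library.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical demand queue protected by a mutex.
struct toy_queue {
    std::mutex m_lock;
    std::deque<std::function<void()>> m_demands;
};

// Expects `lock` to be held on entry and leaves it held on normal return.
// Returns true if a demand was taken from the queue and executed.
bool try_handle_next_demand(toy_queue& q, std::unique_lock<std::mutex>& lock) {
    assert(lock.owns_lock());
    if (q.m_demands.empty())
        return false;

    // The queue is modified only while the mutex is held.
    std::function<void()> demand = std::move(q.m_demands.front());
    q.m_demands.pop_front();

    // Release the mutex while the handler runs, so that other threads
    // can push new demands in the meantime.
    lock.unlock();
    demand();
    // Acquire it again before returning to the main loop.
    lock.lock();
    return true;
}
```

In this sketch, if the handler throws, the lock is left released; a production version would need to decide how to restore the expected state in that case.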

Admittedly, working with std::mutex in simple_mtsafe is not free. On synthetic benchmarks such as ping-pong, the simple_mtsafe infrastructure is 25-30% slower than default_mt and simple_not_mtsafe, which is quite expected.

Current status and future directions of work

Version 5.5.19, which implements the default_mt, simple_mtsafe and simple_not_mtsafe infrastructures, is available for download from SourceForge. The corresponding documentation is available as well.

Currently, the simple_not_mtsafe infrastructure is free of mutexes, but only in its own main loop and the related data structures (for example, the lists of cooperations awaiting final deregistration and of timer requests are not protected by a mutex). Other parts of SObjectizer still contain various synchronization primitives, such as mutexes and spinlocks. For example, every agent contains a spinlock which, in principle, is not needed with simple_not_mtsafe but still takes up space inside the agent_t class.

This happened because, by preliminary estimates, trying to rid SObjectizer's internals of synchronization objects for the simple_not_mtsafe infrastructure could have delayed version 5.5.19 by at least a few more months, which we really did not want.

We also did not want to break compatibility between SObjectizer versions, which would have been inevitable had we switched to template magic for a more efficient implementation of simple_not_mtsafe. For example, one idea was to make the type of infrastructure a template parameter of the agent. Agent classes would then have to be declared something like this:

 template<typename ENV_INF> class my_agent : public so_5::agent_t<ENV_INF> { ... };

That would inevitably break compatibility and significantly complicate porting old code to new versions of SObjectizer.
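To make the trade-off more concrete, here is a minimal sketch of the general technique of selecting the lock type through a template parameter. It is an assumption-laden illustration, not SObjectizer's design: null_mutex, demand_queue and the two aliases are hypothetical names.

```cpp
#include <cassert>
#include <deque>
#include <mutex>
#include <type_traits>

// A no-op lock with the lock()/unlock() interface std::lock_guard requires.
struct null_mutex {
    void lock() noexcept {}
    void unlock() noexcept {}
};

// Hypothetical queue whose synchronization is chosen at compile time.
template <typename Lock>
class demand_queue {
public:
    void push(int demand) {
        std::lock_guard<Lock> guard(m_lock);
        m_items.push_back(demand);
    }

    // Returns false if the queue is empty, otherwise stores the first item in `out`.
    bool pop(int& out) {
        std::lock_guard<Lock> guard(m_lock);
        if (m_items.empty())
            return false;
        out = m_items.front();
        m_items.pop_front();
        return true;
    }

private:
    Lock m_lock;
    std::deque<int> m_items;
};

using mtsafe_queue = demand_queue<std::mutex>;       // pays for real locking
using not_mtsafe_queue = demand_queue<null_mutex>;   // locking compiles to nothing

// The two instantiations are unrelated types, so every piece of code that
// touches them must itself become a template or pick one of them.
static_assert(!std::is_same<mtsafe_queue, not_mtsafe_queue>::value,
              "different lock policies produce different types");
```

The static_assert shows the cost discussed above: once the lock policy becomes part of the type, the choice spreads to every class that uses it, which is exactly what would force existing agent code to change.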

Therefore, in version 5.5.19 we decided to leave the existing synchronization objects as they are and to consider removing them for simple_not_mtsafe while developing the next version. We are starting to think about it now. If this seems very important to you, let us know and we will start thinking harder ;)

To demonstrate where all this can ultimately lead, we tried to sketch an example of a primitive single-threaded HTTP server in which asynchronous request processing is delegated to SObjectizer. Both the HTTP server (built on the HTTP parser from NodeJS and on Asio) and SObjectizer work together on the single main thread of the application. It seems to work. However, the related projects, restinio (our asynchronous HTTP server) and so_5_extra (which lets SO-5 and Asio live together on the same thread), have not yet reached production quality. But we are working on it.

Instead of an afterword

Work on version 5.5.19 took much more time than we expected, although there were quite objective reasons for that. We hope to roll out the next version, 5.5.20, on which work has in fact already begun, much more quickly. Something like a wish list is being put together for the new version, so readers have an opportunity to influence SObjectizer's functionality. Write to us in the comments about what you would like to see in SObjectizer. Or, on the contrary, what you would not want to see. Or maybe something prevents you from using SObjectizer?

We listen very carefully to what we are told. For example, we got rid of the so_5::rt namespace and added features such as agent priorities, hierarchical state machines and mutable messages precisely because of discussions of SObjectizer on various specialized sites and elsewhere. So there is a very real chance to turn SObjectizer into the tool you need, and by someone else's hands at that :)

Source: https://habr.com/ru/post/328872/
