SObjectizer: what is it, what is it for and why does it look that way?

Developing multithreaded C ++ programs is not easy. Developing large multi-threaded programs in C ++ is not very easy. But, as is usually the case in C ++, life is greatly simplified if you manage to pick up or make a tool “sharpened” for a specific task. Fourteen years ago, there was nothing particularly to choose from, so we made such a “sharpened” instrument for ourselves and called it SObjectizer. The experience of everyday use of SObjectizer in commercial software engineering does not yet allow regretting what has been done. And if so, then why not try to talk about what it is, why it is and why we did it that way, and not otherwise ...

What is it?

SObjectizer is a small OpenSource tool, freely distributed under the 3-point BSD license. The basic idea behind SObjectizer is the construction of an application from small entity agents that interact with each other through message exchange. SObjectizer takes responsibility for:

message delivery to recipient agents within the same process;
management of working threads on which agents process messages addressed to them;
timers mechanism (in the form of deferred and periodic messages);
the possibility of setting the parameters of the above mechanisms.

It can be said that SObjectizer is one of the implementations of the Actor Model . However, the main difference between SObjectizer and other similar developments is the combination of elements from the Actor Model with elements from other models, in particular, Publish-Subscribe and CSP .

In the “classic” Actor Model, each actor is the immediate addressee in the send operation. Those. if we want to send a message to some actor, we must have a reference to the recipient actor or the id of this actor. The send operation simply adds the message to the recipient's message queue.

In SObjectizer, the send operation receives a link not to the actor, but to such a thing as the mbox (message box). Mbox can be considered as a kind of proxy that hides the implementation of the message delivery procedure to the recipients. There may be several such implementations, and they depend on the type of mbox. If this is a multi-producer / single-consumer mbox, then, as in the “classic” Actor Model, the message will be delivered to a single recipient, the owner of the mbox. But if this is a multi-producer / multi-consumer mbox, then the message will be delivered to all recipients who have subscribed to this mbox.
')
Those. the send operation in SObjectizer is more like a publish operation from the Publish-Subscribe model than a send from the Actor Model. The consequence of which is the availability of such practical opportunities in practice as broadcasting messages.

The message delivery mechanism in SObjectizer is similar to the Publish-Subscribe model also by the subscription procedure. If an agent wants to receive type A messages, then he should subscribe to type A messages from the corresponding mbox. If he wants to receive messages of type B, he must subscribe to messages of type B. And so on. At the same time, the message type plays the same role as the topic name in the Publish-Subscribe model. Well, as in the Publish-Subscribe model, where the recipient can subscribe to any number of topics, an agent in SObjectizer can subscribe to any number of message types from different mboxes:

class example : public so_5::agent_t { ... public : virtual void so_define_agent() override { //        //   mbox-. //        //  . so_subscribe(some_mbox) .event( &example::first_handler ) .event( &example::second_handler ) //        -. .event( []( const third_message & msg ){...} ); //         //   mbox-. so_subscribe(another_mbox).event( &example::first_handler ); so_subscribe(yet_another_mbox).event( &example::first_handler ); ... } ... private : void first_handler( const first_message & msg ) {...} void second_handler( const second_message & msg ) {...} };

The next important difference between SObjectizer and other implementations of the “classic” Actor Model is that the agent does not have its own message queue in SObjectizer. The message queue in SObjectizer belongs to the working context on which the agent is serviced. And the working context is determined by the dispatcher to which the agent is attached.

A dispatcher is one of the cornerstones in SObjectizer. Dispatchers determine where and when agents will process their messages.

The easiest dispatcher owns only one working thread. All agents attached to such a dispatcher work on this common thread. This thread owns one single message queue, and messages for all agents associated with the dispatcher are placed in this single queue. The working thread takes the message from the queue, calls the message handler from the corresponding receiving agent, and then proceeds to the next message, etc.

There are other types of dispatchers. For example, dispatchers with workflow pools, dispatchers with agent priority support and different policies for handling these priorities, etc. In all cases, the working context and message queue for the agent is assigned to the dispatcher to which the agent is associated.

The next distinguishing feature of SObjectizer is the existence of such a thing as " agent cooperation ". A co-operation is a group of agents that jointly performs an applied task. And these agents must begin and complete their work at the same time. Those. if any agent cannot start, then all other agents of cooperation will not start either. If some agent cannot continue working, then all other agents of cooperation cannot continue their work.

Cooperation appeared in SObjectizer because in a fair amount of cases, in order to perform more or less complex actions, it is necessary to create not one but several interconnected agents. If you create them one by one, you need to decide, for example:

who starts first and in what order he creates the rest;
how agents that have already started will determine that all other agents have already been created and can begin their work;
what to do if there is a problem when starting the next agent.

In the case of cooperations, everything is simpler: interconnected agents are created all at once, are included in cooperation, and cooperation is registered in the SObjectizer. And already SObjectizer itself ensures that the registration of cooperation is carried out transactionally (that is, that all agents of cooperation start, or that none start):

 //      3- : // -   TCP-  AMQP-   AMQP-; // -     . // -        // (      ); so_5::environment_t & env = ...; //   ,     . //       , ,    // ,     SObjectizer-. auto coop = env.create_coop( so_5::autoname ); //  ... coop.make_agent< amqp_client >(...); coop.make_agent< db_worker >(...); coop.make_agent< message_processor >(...); //  . //       SObjectizer. env.register_coop( std::move(coop) );

Partly cooperations solve the same problem as Erlang's supervisor system : agents entering into cooperation are, as it were, under the control of the all-for-one supervisor. Those. Failure of one of the agents leads to deregistration of all other agents of cooperation.

The next important feature of the SObjectizer is that the agents in the SObjectizer are state machines . An agent can have an arbitrary number of states, one of which at a particular time is the current state. The reaction of the agent to external influence depends on both the incoming message and the current state. The agent can process the same message differently in different states, for which he signs different message handlers in each of the states.

Agents in a SObjectizer can be quite complex finite automata: nesting of states, temporary constraints on the agent's stay in the state, states with deep and shallow-history, as well as state entry and exit handlers are supported.

 class device_handler : public so_5::agent_t { //  ,     . //     ... state_t st_idle{ this }, st_activated{ this }, // ...   ,    //  st_activated. st_cmd_sent{ initial_substate_of{ st_activated } }, st_cmd_accepted{ substate_of{ st_activated } }, st_failure{ substate_of{ st_activated } }; ... public : virtual void so_define_agent() override { //     st_idle. st_idle.activate(); //      st_idle   //       st_activated. st_idle.transfer_to_state< command >( st_activated ); //   st_activated     : //   turn_off    st_idle. //      turn_off  //   . st_activated .event( [this](const turn_off & msg) { turn_device_off(); st_idle.activate(); } ); //   st_cmd_sent ""    //   st_activated, .. st_cmd_sent   // . st_cmd_sent .event( [this](const command & msg) { send_command_to_device(msg); // ,      150ms. send_delated<check_status>(*this, 150ms); } ) .event( [this](const check_status &) { if(command_accepted()) st_cmd_accepted.activate(); else st_failure.activate(); } ); ... //   st_failure    50ms, //     st_idle. //        //  . st_failure .on_enter( [this]{ reset_device(); } ) .time_limit( 50ms, st_idle ); } ... };

From a CSP model, SObjectizer borrowed such a thing as channels, which are called message chains in SObjectizer. CSP channels were added to SObjectizer as a tool to solve one specific problem: interaction between agents is built through message exchange, so it is very easy to give some command to the agent or send some information to the agent from any part of the application — just send the message via send . However, how can agents act on the non-SObjectizer part of the application?

This problem is solved by message chains (mchains). The message chain may look just like the mbox: you need to send messages to the mchain using all the same send-a. But messages from the mchain functions of the receive and select functions are retrieved, for which you do not need to create SObjectizer agents.

Working with the message chain in SObjectizer is similar to working with channels in the Go language . Although there are serious differences:

SObjectizer mchains can simultaneously store messages of any type, while Go-shny channels are typed;
In the go-shnoy select construct, you can use both send and receive operations. Whereas in SObjectizer-ovsky select, only receive-operations are allowed (at least in versions up to 5.5.17 inclusive);
SObjectizer mchains may or may not have a message queue size limit. While in Go, channel size is always limited. For a limited-size mchain, SObjectizer forces you to choose the appropriate behavior to try to put a new message on a full mchain (for example, wait a while and throw away the oldest message from mchain or wait for nothing and immediately throw an exception).

Message chains are a relatively recent addition to SObjectizer. However, the thing turned out to be quite useful and rather unexpected consequence of its addition was the fact that it became possible to develop some multi-threaded applications on SObjectizer even without the use of agents:

 void parallel_sum_demo() { using namespace std; using namespace so_5; //  ,         . struct consumer_result { thread::id m_id; size_t m_values_received; uint64_t m_sum; }; wrapped_env_t sobj; //      . auto values_ch = create_mchain( sobj, //     ,   //   5       . chrono::minutes{5}, //   300   . 300u, //       . mchain_props::memory_usage_t::preallocated, //         5 , //     . mchain_props::overflow_reaction_t::abort_app ); //        . auto results_ch = create_mchain( sobj ); //  . vector< thread > workers; for( size_t i = 0; i != thread::hardware_concurrency(); ++i ) workers.emplace_back( thread{ [&values_ch, &results_ch] { //      // ,     //  . size_t received = 0u; uint64_t sum = 0u; receive( from( values_ch ), [&sum, &received]( unsigned int v ) { ++received; sum += v; } ); //    . send< consumer_result >( results_ch, this_thread::get_id(), received, sum ); } } ); //      .    //    ,     //   . for( unsigned int i = 0; i != 10000; ++i ) send< unsigned int >( values_ch, i ); //          //   . close_retain_content( values_ch ); //       . receive( //   ,     . from( results_ch ).handle_n( workers.size() ), []( const consumer_result & r ) { cout << "Thread: " << r.m_id << ", values: " << r.m_values_received << ", sum: " << r.m_sum << endl; } ); for_each( begin(workers), end(workers), []( thread & t ) { t.join(); } ); }

Why is this?

Certainly, readers who have never used the Actor Model and Publish-Subscribe, have a question: "And what, does all of the above really simplify the development of multi-threaded C ++ applications?"

Yes. Simplifies. Proven in humans. Repeatedly.

Clear business, simplifies not for all applications. Indeed, multithreading is a tool that is used in two very different directions. The first direction, called parallel computing , uses streams to load all the available computing resources and reduce the total computation time for computing tasks. For example, accelerating video transcoding by loading all the cores, with each core performing the same task, but on its own set of data. This is not the direction for which SObjectizer was created. Other tools are designed to simplify the solution of this class of problems: OpenMP , Intel Threading Building Blocks , HPX , etc.

The second direction, called concurrent computing , uses multithreading to provide parallel execution of many (almost) independent activities. For example, an email client in one stream can send outgoing mail, in the second - download incoming, in the third - edit the new letter, in the fourth - perform background spell checking in a new letter, in the fifth - perform a full-text search in the mail archive, etc.

SObjectizer was created just for the direction of concurrent computing, and the above features of SObjectizer reduce the headache of the developer.

First of all, by building the interaction between agents through asynchronous messaging.

The interaction of independent threads through message queues is much easier than through manual work with shared data protected by semaphores or mutexes. And the simpler it is, the more workflows in the application and the more often and diverse their interaction.

Getting lost in mutexes and conditional variables is easy even on a dozen workflows. And when the score goes to hundreds of workflows, then manual fussing with low-level synchronization primitives generally turns out to be beyond the capabilities of even experienced developers. Whereas a hundred threads interacting through message queues, as practice has shown, is absolutely not a problem.

So the main thing that gives the SObjectizer developer (like any other implementation of the Actor Model) is the ability to represent independent activities within the application as agents that communicate with the outside world only through messages.

The next key point is the binding of agents to the appropriate working contexts.

Common sense suggests (and practice confirms this) that it is not good to give out all agents on their own working thread. The application may need ten thousand independent agents. Or a hundred thousand. Or even a million. Obviously, the presence of such a number of working threads in the system will not lead to anything good. Even if the OS will be able to create them (depending on which OS and on which equipment), the overhead of ensuring their work will still be too large for the application built in this way to work with acceptable performance and responsiveness.

The opposite, when all agents are tied to a single common thread or to a single thread pool, is also not an ideal solution for all cases. For example, in an application there may be a dozen agents who have to work with a third-party synchronous API (make queries to the database, communicate with devices connected to a computer, perform heavy computational operations, etc.). Each such agent is able to slow down the work of all other agents that will be on the same working thread with it. Several such agents can easily slow down the entire application if it uses a single pool of workflows: just each agent will take one of the pool's threads ...

In order to solve these problems, SObjectizer has dispatchers and such an important operation as binding agents to the appropriate dispatchers. All together, this gives the developer the necessary freedom and flexibility, while relieving the developer from the worries of managing these threads.

A programmer can create as many dispatchers as he needs, and so distribute his agents among these dispatchers as he sees fit. For example, in the application may be:

one dispatcher of type one_thread, on which a certain agent runs an AMQP client;
one dispatcher of type thread_pool, on which the agents responsible for processing messages from AMQP-shny topics work;
one active_obj dispatcher to which agents are attached to interact with the DBMS;
another active_obj dispatcher on which agents will communicate that communicate with HSMs connected to the computer;
and another thread_pool dispatcher for agents that monitor and control the entire kitchen described above.

Another thing that the SObjectizer users themselves point out as one of the most important is the support for deferred and periodic messages. Those. work with timers .

It is often necessary to perform some action in N milliseconds. And then after M milliseconds, check for the result. And, if there is no result, wait K milliseconds and repeat all over again. Nothing complicated: there is send_delayed, which makes a message postponed for a specified time.

Often, agents work on a clock basis. For example, once a second, the agent wakes up, performs a pack of operations accumulated during the last second, and then falls asleep before the next tact. Again, nothing complicated: there is send_periodic, which repeats the delivery of the same message with a given tempo.

Why is SObjectizer exactly like that?

SObjectizer has never been an experimental project, it has always been used to simplify everyday work with C ++. Each new version of SObjectizer went straight to work, SObjectizer was constantly used in the development of commercial projects (in particular, in several business-critical projects of Intervale, but not only). It left its mark on its development.

The work on the last version of SObjectizer (we call it SObjectizer-5) began in 2010, when the C ++ 11 standard was not yet adopted, some C ++ 11 things were already supported in some places, and some had to wait more than five years.

In such conditions, not everything could be done conveniently and concisely from the first time. Mostly there was not enough experience using C ++ 11. Very often, we were limited by the capabilities of compilers that we had to deal with. We can say that the forward movement went by trial and error.

At the same time, we also needed to take care of compatibility: when SObjectizer is at the heart of business-critical applications, you cannot just throw away some piece of SObjectizer or change some of its API in some fundamental way. Therefore, even if time showed that somewhere we were mistaken and something could be made easier and more convenient, then there was no opportunity to “take and rewrite”. We were moving and moving in an evolutionary way, gradually adding new features, but not throwing out old pieces overnight. As a small illustration: there has not been any serious backward compatibility violation since the release of version 5.5.0 in the fall of 2014, although since then there have already been about 20 releases as part of the development of version 5.5.

SObjectizer acquired its unique features as a result of many years of use of SObjectizer in real projects. Unfortunately, this uniqueness “climbs sideways” when trying to tell the SObjectizer to the general public. Too SObjectizer is not like Erlang and other projects created in the image and likeness of Erlang (for example, C ++ Actor Framework or Akka ).

Here, for example, we have the ability to run several independent instances of SObjectizer in one application. The opportunity is very exotic. But it was added because in practice sometimes this is necessary. To support this feature, a concept like the SObjectizer Environment has appeared in SObjectizer. And this SObjectizer Environment needed to be “pulled” through a fair amount of the SObjectizer API, which could not but affect the brevity of the code.

But in the C ++ Actor Framework, this feature was not originally available. The actors API in CAF looked much simpler, and the code examples were shorter. Because of this, we often come across statements that CAF is perceived as simpler and clearer than SObjectizer.

, , , CAF- , SObjectizer Environment ( actor_system) . CAF . CAF-. , , CAF SObjectizer.

, SObjectizer, , : SObjectizer-5 . - « ? Erlang- , Akka — , CAF — , ?!!»

Not.For a very simple reason: there was such support in SObjectizer-4, but over time it turned out that there was no transport that would ideally fit different conditions. If the nodes of the distributed application chase each other large chunks of video files - this is one thing. If they exchange hundreds of thousands of small packages, that’s completely different. If a C ++ application needs to communicate with Java applications - this is the third. Etc.

SObjectizer-5 , , , . - AMQP, - MQTT, - REST. . , .

, , C++ - :

CAF , Erlang- C++, ;
, QP , ;
Actor Model Just::Thread Pro Actor Edition .

. , , SObjectizer , -, , , . , , , SObjectizer .

Afterword

, , SObjectizer- : « ?», « ?» « ?» SObjectizer-. , - . , SObjectizer-, , , ., ..

Source: https://habr.com/ru/post/304386/

All Articles

SObjectizer: what is it, what is it for and why does it look that way?

What is it?

Why is this?

Why is SObjectizer exactly like that?

Afterword

More articles: