Tender friendship of agents and exceptions in SObjectizer

Sooner or later something goes wrong in a program. A file fails to open, a worker thread cannot be created, memory cannot be allocated... And you have to live with it somehow. In a small single-threaded application it is quite simple: you can abort all the work and restart. This is one of the reasons for Erlang's well-deserved popularity, because the fail-fast ideology is one of the cornerstones of Erlang with its lightweight processes. If the application is large, complex, and multi-threaded, it is unreasonable to restart the whole application just because one of its threads has run into a problem. The situation is even more delicate in implementations of the Actor Model, where hundreds of thousands of actors can work on dozens of worker threads. A problem in one actor should hardly affect all the other actors.

In this article, we will explain how we approached error handling in our SObjectizer framework.

Exceptions - yes, return codes - no!

When SObjectizer-4 appeared in 2002, we made a big mistake: we preferred return codes to exceptions. All the subsequent development experience with SObjectizer-4 convinced us again and again of one simple truth: if an error can be ignored by the developer, it will be ignored. Therefore, when creating SObjectizer-5, we decided to use exceptions to report errors.

It was the right choice. Lances are still being broken in “exceptions versus return codes” disputes, but our experience shows that development only benefits when it is impossible to accidentally miss, for example, an error in subscribing an agent or in registering a cooperation.
So, SObjectizer-5 throws an exception if it cannot perform some operation. Most often, such operations are performed by agents that are already registered in SObjectizer. What should an agent do when it faces an exception?

A normal agent should not let exceptions out!

This is the main rule for agents with regard to exceptions. If an agent gets an exception while processing its event (it does not matter whether the exception was thrown by SObjectizer or by someone else), the agent should not let this exception out.
The explanation is simple. An agent in SObjectizer does not own its working context. Roughly speaking, the agent does not own the worker thread on which it runs. The working context is provided by the dispatcher to which the agent is bound, only for the duration of the next event, and may then be given to another agent. When an agent lets an exception out, the exception reaches the dispatcher that provided the working context. If the application does not want the dispatcher to decide whether to kill the application or let it continue, then the agents of this application should take care of exception handling themselves.

Ideally, this means that agent event handlers should be noexcept methods. But that is the ideal case. The noexcept mechanism in C++ is a good thing, but it only guarantees that an exception will not propagate out of a noexcept method. An exception can still be thrown inside it, and the compiler does not slap your hands when non-noexcept methods are called from noexcept ones. And if an exception does try to escape, the road leads straight to std::terminate(). That does not always suit us. So what should we do in the non-ideal world we live in?
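The behavior of noexcept described above can be illustrated with a small sketch in plain C++ (no SObjectizer involved). The names parse_positive, risky_handler and safe_handler, as well as the -1 error value, are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical helper that may throw; it is not declared noexcept.
int parse_positive(const std::string& text) {
    int value = std::stoi(text);  // throws std::invalid_argument on garbage
    if (value <= 0) throw std::out_of_range("value must be positive");
    return value;
}

// This compiles although it calls a function that may throw.
// If parse_positive() throws here, std::terminate() is called.
int risky_handler(const std::string& text) noexcept {
    return parse_positive(text);
}

// A handler that really keeps the promise: every exception is caught inside.
int safe_handler(const std::string& text) noexcept {
    try {
        return parse_positive(text);
    } catch (const std::exception&) {
        return -1;  // hypothetical "error" value chosen for this sketch
    }
}

// The noexcept operator looks only at declarations, not at function bodies.
static_assert(noexcept(risky_handler(std::declval<const std::string&>())),
              "risky_handler is declared noexcept");
static_assert(!noexcept(parse_positive(std::declval<const std::string&>())),
              "parse_positive may throw");
```

Calling risky_handler with valid input works as expected; calling it with text that cannot be parsed would end the program, which is exactly the situation the paragraph above warns about.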

You can tell SObjectizer how to react to an exception that has escaped from an agent.

Since shit does happen from time to time, even when we undertake to provide a no-exception guarantee for agents, we can make a mistake and an exception will still escape. It will be caught by the dispatcher, which will then decide what to do next.

To do this, the dispatcher calls the virtual method so_exception_reaction() of the problem agent. This method should return one of the following values:

so_5::abort_on_exception. This leads to a call to std::abort() and terminates the whole application;
so_5::shutdown_sobjectizer_on_exception. This value means that the agent provides the basic exception guarantee (i.e., no resource leaks and no corruption of anything), but there is no sense in continuing. Therefore, the agent is switched to a special state in which it cannot handle any events, and the SObjectizer Environment is shut down normally, without calling std::abort(). All registered cooperations are properly deregistered, which allows other agents to finish their work normally and clean up resources. Note that several SObjectizer Environments can work in one application simultaneously. With shutdown_sobjectizer_on_exception, only the SObjectizer Environment in which the exception was caught is shut down;
so_5::deregister_coop_on_exception. This value means that the agent provides the basic exception guarantee and the application can continue its work without this agent and its cooperation. Therefore, the agent is switched to a special state, and its cooperation is deregistered in the usual way (which allows the other agents of the cooperation to finish their work normally);
so_5::ignore_exception. This value means that the agent provides the strong exception guarantee (i.e., no resource leaks, no corruption of anything, and the agent remains in a correct state) and can continue its work. Therefore, the dispatcher simply ignores the exception, as if it had never happened.

The presence of a variant like ignore_exception may seem strange after the statement that normal agents should not let exceptions out. In practice, however, such a value is convenient for agents with very simple event handlers. For example, an agent receives a message of type M1 and converts it into a message of type M2. An exception may occur during the conversion, but it has little effect: the state of the agent is not broken, and although message M2 is lost, messages can be lost for one reason or another anyway. In such cases, it is easier to let exceptions fly out of simple agents and have the dispatcher ignore them than to put a try-catch block into every event handler.

Thus, the programmer can decide which option suits the agent best, override the so_exception_reaction() method and thereby tell SObjectizer what to do after catching an exception:

    using namespace so_5;

    // An agent that receives M1 messages from the src mbox and resends
    // them as M2 messages to the dest mbox. Exceptions that occur
    // along the way are ignored.
    class my_simple_message_translator final : public agent_t
    {
    public :
        my_simple_message_translator( context_t ctx, mbox_t src, mbox_t dest )
            : agent_t( ctx )
        {
            so_subscribe( src ).event( [dest]( const M1 & msg ) {
                send< M2 >( dest, ... );
            } );
        }

        // Tell SO-5 that exceptions from this agent can be ignored.
        virtual exception_reaction_t so_exception_reaction() const override
        {
            return ignore_exception;
        }
    };

Reaction to exceptions at the level of cooperation

The default implementation of agent_t::so_exception_reaction() calls the exception_reaction() method of the cooperation the agent belongs to. That is, by default an agent inherits the exception reaction from its cooperation. And this reaction can be set when the cooperation is registered.

For example:

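    // An exception escaping from any agent of this cooperation
    // leads to deregistration of this cooperation only.
    env.introduce_coop( []( coop_t & coop ) {
        coop.set_exception_reaction( deregister_coop_on_exception );
        coop.make_agent< some_agent >(...);
        ...
    } );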

Thus, in SObjectizer, the reaction to an exception can be set at the agent level, and if this has not been done, then the reaction to the exception specified for the agent's cooperation is used.

But what happens if the set_exception_reaction() method is not called when creating a cooperation (and in most cases it is not called)?

If the programmer did not explicitly call coop_t::set_exception_reaction(), then coop_t::exception_reaction() returns a special value: so_5::inherit_exception_reaction. This value indicates that the cooperation inherits the exception reaction from its parent cooperation. If the parent cooperation exists, SObjectizer calls exception_reaction() for it. If the parent cooperation also returns so_5::inherit_exception_reaction, SObjectizer calls exception_reaction() for the parent of the parent cooperation, and so on.

In the end, it may turn out that there is no further parent cooperation. In this case, SObjectizer calls exception_reaction() for the whole environment_t. And environment_t::exception_reaction() returns so_5::abort_on_exception by default, which brings down the whole application through a call to std::abort().

However, the programmer can specify an exception reaction for the whole SObjectizer Environment. This is done through the SObjectizer parameters at startup:

    so_5::launch(
        []( environment_t & env ) {...},
        []( environment_params_t & params ) {
            params.exception_reaction( shutdown_sobjectizer_on_exception );
            ...
        } );

A short interim summary

So, if an agent lets an exception out, SObjectizer intercepts it and asks the agent what to do with it by calling agent_t::so_exception_reaction(). If the programmer did not override so_exception_reaction(), the reaction to the exception is determined by the cooperation the agent belongs to.

Usually, the cooperation tells SObjectizer that it inherits the exception reaction from its parent. SObjectizer then asks the parent cooperation, then the parent of the parent cooperation, and so on. When the parents run out, SObjectizer asks the environment_t in which the problem agent works for the exception reaction. By default, environment_t says that the application must be aborted via a call to std::abort(). Thus, the programmer can influence the reaction to exceptions at different levels:

in the agent itself, by catching all exceptions inside the agent's event handlers or by overriding so_exception_reaction();
in the agent's cooperation or in a parent cooperation;
in the SObjectizer Environment within which the agents and cooperations work.
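The lookup order summarized above can be sketched as a toy model in plain C++. This is only an illustration under stated assumptions, not SObjectizer's actual implementation: the types reaction, toy_coop and toy_environment and the function resolve are hypothetical names that merely mirror the terms used in the article.

```cpp
#include <cassert>

// Toy model only: these names mirror the article but are NOT SObjectizer types.
enum class reaction {
    abort_on_exception,
    shutdown_on_exception,
    deregister_coop_on_exception,
    ignore_exception,
    inherit  // "ask the next level up"
};

struct toy_coop {
    reaction own = reaction::inherit;   // nothing set explicitly by default
    const toy_coop* parent = nullptr;   // nullptr means "no parent cooperation"
};

struct toy_environment {
    reaction own = reaction::abort_on_exception;  // default named in the article
};

// Resolve the effective reaction: agent first, then its cooperation,
// then the chain of parent cooperations, and finally the environment.
reaction resolve(reaction agent_reaction,
                 const toy_coop& coop,
                 const toy_environment& env) {
    if (agent_reaction != reaction::inherit) return agent_reaction;
    for (const toy_coop* c = &coop; c != nullptr; c = c->parent) {
        if (c->own != reaction::inherit) return c->own;
    }
    return env.own;
}
```

With nothing set at any level, resolve falls through to the environment and yields abort_on_exception; setting a value on the agent, on any cooperation in the chain, or on the environment changes the result for everything below that level.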

How to react to the deregistration of cooperation?

As shown above, SObjectizer can react to an exception that escaped from an agent in different ways. For example, it can deregister only the problem cooperation. But what is the point of such a reaction? After all, the cooperation was solving some applied task in the application; if it had nothing to solve, it would not exist. And now this cooperation suddenly disappears... How can one find out about it and how can one react?

SObjectizer allows you to receive a notification that a cooperation has been deregistered. In some ways, this mechanism resembles process monitoring in Erlang: for example, you can call erlang:monitor(process, Pid) and, if the process Pid terminates, a {'DOWN', ...} message arrives.

In SObjectizer, it is possible to “hang” a notifier on the deregistration event. A notifier is a functor that SObjectizer calls automatically when it completes the deregistration of a cooperation. SObjectizer passes both the name of the deregistered cooperation and the reason for its deregistration to this functor. The functor can do whatever the application needs. For example, it can send a message about the disappearance of the cooperation to an interested agent. Or it can simply register the cooperation again:

    // An example of automatic re-registration of a cooperation
    // that was deregistered because of an exception.
    #include <chrono>
    #include <iostream>
    #include <stdexcept>
    #include <string>
    #include <thread>

    #include <so_5/all.hpp>

    void start_coop( so_5::environment_t & env )
    {
        env.introduce_coop( [&]( so_5::coop_t & coop ) {
            struct raise_exception : public so_5::signal_t {};

            // The only agent of the cooperation.
            // One second after its start it throws an exception.
            auto agent = coop.define_agent();
            agent.on_start( [agent] {
                    so_5::send_delayed< raise_exception >( agent, std::chrono::seconds(1) );
                } )
                .event< raise_exception >( agent, [] {
                    throw std::runtime_error( "Just a test exception" );
                } );

            // Tell SObjectizer what to do with an escaped exception.
            coop.set_exception_reaction( so_5::deregister_coop_on_exception );

            // A notifier that starts the cooperation again
            // if it was deregistered because of an exception.
            coop.add_dereg_notificator(
                []( so_5::environment_t & env,
                    const std::string & coop_name,
                    const so_5::coop_dereg_reason_t & why )
                {
                    std::cout << "Deregistered: " << coop_name
                        << ", reason: " << why.reason() << std::endl;

                    if( so_5::dereg_reason::unhandled_exception == why.reason() )
                        start_coop( env );
                } );
        } );
    }

    int main()
    {
        so_5::launch( []( so_5::environment_t & env ) {
            // Start the cooperation for the first time.
            start_coop( env );
            // Let the example work for a while.
            std::this_thread::sleep_for( std::chrono::seconds( 5 ) );
            // Shut down.
            env.stop();
        } );
    }

Unlike Erlang, SObjectizer has no ready-made supervisor system. So far we have managed to do without it. But if an application needs one, something similar can be built on top of notifiers.
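As one possible building block for such a hand-made supervisor, here is a minimal sketch of a restart limiter in plain C++, loosely modeled on the restart intensity and period of Erlang supervisors. Everything in it is an assumption made for illustration: the class restart_limiter and its method allow are hypothetical names, not part of SObjectizer.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <deque>

// Hypothetical helper: permits at most max_restarts restarts within period.
class restart_limiter {
public:
    using clock = std::chrono::steady_clock;

    restart_limiter(std::size_t max_restarts, std::chrono::seconds period)
        : max_restarts_(max_restarts), period_(period) {}

    // Returns true and records the restart if one more restart is allowed
    // at the moment `now`; returns false if the limit has been reached.
    bool allow(clock::time_point now) {
        // Forget restarts that are older than the observation period.
        while (!history_.empty() && now - history_.front() > period_) {
            history_.pop_front();
        }
        if (history_.size() >= max_restarts_) return false;
        history_.push_back(now);
        return true;
    }

private:
    std::size_t max_restarts_;
    std::chrono::seconds period_;
    std::deque<clock::time_point> history_;
};
```

A deregistration notifier could consult such an object before registering the cooperation again and choose some other action, for example shutting the application down, when allow returns false. How the limiter is shared between notifier calls is left out of this sketch.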

A small philosophical remark in closing

C++ is an unsafe language. Writing code that provides at least the basic exception safety guarantee requires some effort from the developer. Therefore, when implementing actors in C++, one has to be wary of the fail-fast principle. It works well in Erlang: if a problem is discovered in a process, the process is simply killed, the Erlang VM cleans everything up after it, and the corresponding supervisor starts a new process instead of the failed one.

In C++, all agents live in the same process. Therefore, if one of the agents is not implemented well enough and leaks resources or corrupts something in the process memory, then deregistering it and creating a new agent in its place may turn out to be not a solution but an even bigger problem.

This is why SObjectizer, by default, aborts the whole application if an agent lets an exception out. If the programmer is not satisfied with this and is going to change the reaction to some other one (especially to ignore_exception), it is worth thinking twice and carefully checking the agent's code for exception safety.

Conclusion

With this article we probably close the story about the main distinguishing features of SObjectizer. We are going to publish further articles on SObjectizer when something new appears, or if we come across interesting questions that are difficult to answer exhaustively in the comments.

At the same time, taking this opportunity, we invite you to attend the Corehard C++ Autumn 2016 conference, which will be held on October 22 in Minsk. It will feature a talk on the Actor Model as applied to C++, including SObjectizer.

Source: https://habr.com/ru/post/312128/
