Actors for fun and profit

This article is an adapted version of the talk of the same name given at the C++ CoreHard Autumn 2017 conference. It completes the topic raised earlier in the publications “The Actor Model and C++: What, Why and How?” and “Bumps and Bruises from 15 Years of Using Actors in C++”, Part I and Part II. Today we will talk about how to tell whether the Actor Model can be used successfully in your project.

Strictly speaking, this is a “Captain Obvious” article: the things described in it are fairly obvious and dictated by common sense. Unfortunately, they rarely get the attention they deserve.

Lyrical digression: “The Actor Model and C++: myth or reality?”

The article discusses things that are inherent in the Actor Model itself, regardless of any specific programming language. But since the author is closely involved in C++ software development, there is some emphasis on the applicability of actors in C++.

There is a lot of information on the Internet about using the Erlang programming language or the Akka framework, but very little about how the Actor Model is used in C++. It may even seem that the Actor Model and C++ are incompatible in principle.

This is not true. The Actor Model can be used successfully in C++ and, tellingly, it actually is used. Here is a short list of application areas for which I know examples of using the Actor Model in C++:

industrial automation (process control systems);
telecom;
electronic and mobile commerce;
simulation modeling;
CAD;
game development;
middleware (DBMS, ...).

There are even several ready-made, live and actively developed frameworks for C++. Here are the best-known ones:

QP/C++ (dual license);
Just::Thread Pro: Actors Edition (commercial license);
C++ Actor Framework (BSD-3-Clause license);
SObjectizer (BSD-3-Clause license).

In addition:

OOSMOS (dual license; written in C, but can also be used from C++);
Asynchronous Agents Library (included in Visual Studio).

So there is plenty to choose from. You do not even have to reinvent the wheel, although we C++ programmers love doing exactly that. As a long-time developer of one of these frameworks, I can say two things:

firstly, the Actor Model can safely be used in C++. It brings positive results, and this has been confirmed in practice many times;
secondly, writing your own implementation of the Actor Model for C++ is a thankless task. It takes a lot of effort and time, and whether it will pay off in the medium or long term is unknown. Most likely it will not.

So if you want to use the Actor Model in C++, it makes sense to try one of the ready-made frameworks first. Only if nothing suits your task should you start thinking about building your own, or even about changing the programming language.

Is it worth talking about the Actor Model at all?

The main part of the story should start with the million-dollar question: “Do we really need the Actor Model?”

This question comes up especially often on specialized forums. There, anonymous experts who know everything and can do everything claim that actors are not needed. Perhaps such mega-gurus really need neither the Actor Model nor any other approach to concurrent programming. But I am interested in a sensible and balanced answer to this question.

In my opinion, the question “Do we need the Actor Model?” is very similar to the question “Do we need a dump truck on a Formula 1 track?”

The point is that both questions are meaningless without one important clarification, namely: “What for?”

If we ask “What would a dump truck be needed for on a Formula 1 track?”, we immediately get room for meaningful answers. For example: to repair the track. Is a dump truck needed for that? Probably yes. The question has gained meaning, and there is a sensible answer to it.

The same goes for the Actor Model. As soon as we ask “What is the Actor Model needed for under such and such conditions?”, we get the opportunity to find a meaningful answer.

And this answer will often be: to make your life easier!

But...

However, it is not that simple. Perhaps you have read the book “Surely You're Joking, Mr. Feynman!” It contains a story about how the young Richard Feynman tried to set up a toy experiment in a Princeton laboratory and blew up a large bottle of water. Photographs with the results of earlier important experiments were ruined. The head of the laboratory then told Feynman: “Beginners' experiments should be carried out in the beginners' laboratory!”

The same words can be addressed to everyone who wants to drag into a real project a technology that nobody on the team has worked with before.

And we do love doing that. There is even a term for it: HDD, Hype Driven Development. We learn about something new, get inspired, drag it into a production project and then struggle with the consequences long and hard.

So, to keep the Actor Model from causing unbearable pain in a large project, you need to practice first. On cats, for example :)

You need to practice on cats!

First try the Actor Model on some small toy tasks. Create actors, send messages between them. See what you liked and what you did not. Think about why you did not like it.

Very often, people who are just starting to work with actors overuse messages. They try to represent every entity in the program as an actor and every interaction between entities as an asynchronous message. But this does not always work well. You need to feel from your own experience where the line is at which the advantages of asynchronous messages start turning into disadvantages.

If you have not felt this line yet, then after dragging the Actor Model into a real large project you will most likely create superfluous actors that exchange unnecessary messages. That will give you headaches and cause problems in your code.

In general, everything is good in moderation, and it is better to find that measure on toy tasks rather than in a production project.

The Actor Model as a way of looking at the world

When we take up the Actor Model, we must understand that it is not only a way of organizing interaction between entities inside a program. It is also an approach to analyzing the problem and a design method.

Here it is appropriate to draw an analogy with the object-oriented approach. Some 25 or 30 years ago, the object-oriented approach with its three simple principles was a real breakthrough. It not only simplified writing the code of large software systems; most importantly, it became a tool that significantly simplified the analysis and design of those large systems.

The principles of the object-oriented approach made it possible to look at the subject area with different eyes. People learned to classify the entities of their subject area in a particular way, and this made it easier to implement them as objects inside the program.

The Actor Model gives people something similar. It is also based on three simple principles:

an actor is an entity with behavior;
actors respond to incoming messages;
having received a message, an actor can:
- send some number of messages to other actors;
- create some number of new actors;
- define a new behavior for itself for processing subsequent messages.
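
To make these principles a little more tangible, here is a minimal single-threaded sketch. It is not taken from any of the frameworks mentioned above: the names actor, send, become, step, lamp_on and lamp_off are all invented for illustration. It shows only two of the three reactions (sending messages and switching behavior); creating new actors is left out for brevity.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <string>
#include <utility>

// Hypothetical minimal actor: a mailbox plus a replaceable behavior.
class actor {
public:
    using behavior = std::function<void(actor&, const std::string&)>;

    explicit actor(behavior initial) : behavior_(std::move(initial)) {
        assert(behavior_ != nullptr);
    }

    // Asynchronous send: the message is only queued, nothing runs here.
    void send(std::string msg) { mailbox_.push(std::move(msg)); }

    // Request a new behavior; it takes effect for the next message.
    void become(behavior next) { next_ = std::move(next); }

    // Process one queued message; returns false if the mailbox is empty.
    bool step() {
        if (mailbox_.empty()) return false;
        std::string msg = std::move(mailbox_.front());
        mailbox_.pop();
        behavior_(*this, msg);
        if (next_) {  // apply a behavior change requested while handling msg
            behavior_ = std::move(next_);
            next_ = nullptr;
        }
        return true;
    }

private:
    behavior behavior_;
    behavior next_;
    std::queue<std::string> mailbox_;
};

// Example: a lamp that alternates between two behaviors and reports
// every change to another actor (the logger).
actor::behavior lamp_off(actor& logger);

actor::behavior lamp_on(actor& logger) {
    return [&logger](actor& self, const std::string& msg) {
        if (msg == "toggle") {
            logger.send("lamp is off");
            self.become(lamp_off(logger));
        }
    };
}

actor::behavior lamp_off(actor& logger) {
    return [&logger](actor& self, const std::string& msg) {
        if (msg == "toggle") {
            logger.send("lamp is on");
            self.become(lamp_on(logger));
        }
    };
}
```

A real framework would run step() on worker threads and manage actor lifetimes; here the caller drives the loop by hand, which is enough to see how behavior and messages fit together.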

But these principles give the developer, first of all, a different way of looking at the subject area. Working with the Actor Model, we begin to see in the subject area not just objects with properties. We begin to see objects with their own behavior, as well as the ways in which these objects communicate with other similar objects.

It turns out that we first learn to detect actors in the subject area itself, and then we can simply carry these actors over into objects in our code.

This is precisely what makes the Actor Model valuable: we can operate with the same concepts both in the subject area and in the software implementation.

But on the other hand

We can look at some subject area and see no actors there at all.

A typical example is computational mathematics. There is practically nothing there to represent as actors. Of course, you can try, but it will make no sense. For example, you can make a matrix actor and a vector actor, and have the vector send the matrix the message “multiply yourself by me”. But even that sounds pretty silly.

Something actor-like may appear when mathematical computations are parallelized. There are entities responsible for distributing the work and collecting the results, and these entities may resemble actors. But it is far from certain that implementing them as actors pays off. It may be easier to use map-reduce or task-based parallelism.

So there are subject areas in which using the Actor Model not only brings no benefit but can even complicate our lives. In such areas it makes no sense to use the Actor Model. If yours is such an area, then you simply do not need a dump truck, that's all.

Litmus test

Let's look at a few markers whose presence suggests that the Actor Model will take root in your task.

A caveat right away: these markers are necessary, but they do not guarantee anything. Still, the more of them you find in your subject area, the higher the likelihood that the Actor Model will simplify your work.

Fire-and-Forget Principle

The first marker is the ability to use the fire-and-forget principle.

What is this principle about?

First, it means that the progress of work in your task does not need to be tightly controlled. Everything gets done by itself when the time and resources for it are found. You simply do your part, hand the results of your work on, and are no longer interested in what happens to them.

Second, in most cases you do not need to know the result of an operation right here and now. If you need something, you send your request somewhere and continue your work without waiting for the request to be processed.

Third, if something does not get done at all, there is nothing terrible about it. We can either ignore the missing result or repeat the operation after some time.

In general, this is a simple and obvious principle that we regularly use in everyday life, including for complex and important tasks.

For example, you want to hold a conference for C++ developers, and you need to invite interesting speakers. You make a list of those you would like to see and send them letters asking whether they can take part. Once the letters are sent, you do not need to wait for immediate replies. While people are thinking, you can deal with other organizational matters. If someone does not answer at all, you can ask again, or simply assume that the person does not want to take part and stop counting on them.

So in real life we often apply the fire-and-forget principle. But from the same real life we know that it does not always work. The same thing happens when writing programs: in some places we can use fire-and-forget, and in others we cannot.

For example, we may have two worker threads. The first one waits on select or epoll. When data becomes available for reading, it reads the data, hands it to the second worker thread for parsing, and goes on to read data from another socket. For the first worker thread it does not matter when the second one does the parsing or whether the parsing succeeds.
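
The hand-off between the two worker threads could look roughly like the sketch below. It assumes a hypothetical mailbox class built on a mutex and a condition variable; the socket reading itself is omitted, and “parsing” is replaced by counting bytes, purely as a placeholder.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <optional>
#include <queue>
#include <string>
#include <utility>

// Hypothetical mailbox between the reading thread and the parsing thread.
class mailbox {
public:
    // Fire-and-forget: the sender queues the chunk and returns immediately.
    void send(std::string chunk) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(chunk));
        }
        ready_.notify_one();
    }

    // Tell the receiver that no more chunks will arrive.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

    // Blocks until a chunk is available; returns an empty optional once
    // the mailbox is closed and fully drained.
    std::optional<std::string> receive() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return closed_ || !queue_.empty(); });
        if (queue_.empty()) return std::nullopt;
        std::string chunk = std::move(queue_.front());
        queue_.pop();
        return chunk;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::string> queue_;
    bool closed_ = false;
};

// Body of the second worker thread. Real parsing is replaced by a byte
// count here, purely as a placeholder.
std::size_t parse_all(mailbox& inbox) {
    std::size_t total_bytes = 0;
    while (auto chunk = inbox.receive()) total_bytes += chunk->size();
    return total_bytes;
}
```

In a real program parse_all would run on its own std::thread, while the first thread keeps calling send() after every successful read and never looks back at what happened to the data.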

Another example: we commit a transaction to a database. Most likely we want to know the result immediately, that is, whether the commit succeeded or not. And we can hardly continue working until the result of the commit is known.

In general, if you see that the fire-and-forget principle is quite natural for your task, then you can try the Actor Model on it. But if you see that you almost always need to know the results of the operations you start immediately, then the Actor Model is unlikely to suit you.

State machines

The next important marker is the presence in your subject area of entities that can be represented as finite state machines.

If you look at the principles of the Actor Model, you can see that actors are, in essence, finite state machines, albeit simple ones:

an actor is an entity with behavior;
actors respond to incoming messages;
having received a message, an actor can:
- send some number of messages to other actors;
- create some number of new actors;
- define a new behavior for itself for processing subsequent messages.

Indeed, each actor has a behavior that determines how the actor will handle the next message. While processing a message, the actor can choose a new behavior for itself.

This is the same as a state machine: the machine has a current state, which determines how an input signal will be processed. The new state of the machine is determined by its current state and the type of the incoming signal.

It is therefore not surprising that tasks in which finite state machines are widely used fit the Actor Model well.

Not all state machines are equally useful

Suppose we have an entity that coordinates user login to a system, for example to an online cinema site. This entity receives a login request with a username and password and then calls the authentication subsystem. If authentication succeeds, it asks the billing subsystem for the user's balance. After that, it asks the notification subsystem for the list of notifications for the user. As a result, the start page for the user is built, showing the current balance and the list of notifications awaiting a reaction.

Everything seems simple and clear. But something is off.

What is off is that no finite state machine is needed here at all. In fact, we have a simple linear sequence of actions with synchronous calls to external subsystems.

To express such a sequence, it is much better to use ordinary operating system threads, so that each call to an external subsystem is a simple synchronous call.

    // Linear login flow: every call to an external subsystem is synchronous.
    auto process_login(const login_params & params) -> start_page_data
    {
        const auto auth_result = request_auth_service(params);
        if(auth_result.valid_user())
        {
            const auto balance = request_balance(auth_result.user_token());
            const auto pending_messages = request_pending_messages(auth_result.user_token());
            return make_start_page_data(auth_result, balance, pending_messages);
        }
        else
            return make_unknown_user_page_data();
    }

Of course, there are scalability problems, because creating a separate thread for each user is too expensive. But this problem is solved if we can use fibers or coroutines. Then all the actions are written as a linear coroutine with blocking calls inside, and we need no state machines.

Accordingly, if most of the activities in your program are linear sequences of synchronous operations, you hardly need the Actor Model. You should look instead in the direction of CSP or task-based parallelism.

Really useful finite state machines

So in which cases are finite state machines useful?

Several types of input signals in each state

First, when at each moment we are waiting not for one type of input signal but for several.

Imagine that we need to program an intercom panel. The panel is activated when a digit button is pressed for the first time. After that, we may be waiting for another digit button, for the “Call” button, for the “Reset” button (which clears the entered number but leaves the panel activated), or for a timer signal saying that it is time to deactivate the panel.

When we have to respond to several different types of signals in each state, a finite state machine can be simpler and easier to implement than any other approach.
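
A rough sketch of the intercom panel as a state machine might look as follows. The class and method names are invented for illustration. The text does not say what happens after “Call”, so the sketch assumes the panel remembers the number and deactivates; that is an assumption, not part of the original description.

```cpp
#include <cassert>
#include <string>

enum class panel_state { inactive, active };

// Hypothetical intercom panel: every state has to cope with several
// kinds of input signals (digit, call, reset, timeout).
class intercom_panel {
public:
    void on_digit(char digit) {
        state_ = panel_state::active;  // the first digit activates the panel
        number_ += digit;
    }

    void on_call() {
        if (state_ != panel_state::active) return;  // ignored when inactive
        last_called_ = number_;
        deactivate();  // assumption: the panel deactivates after a call
    }

    void on_reset() {
        // Clears the entered number but keeps the panel activated.
        if (state_ == panel_state::active) number_.clear();
    }

    void on_timeout() {
        if (state_ == panel_state::active) deactivate();
    }

    panel_state state() const { return state_; }
    const std::string& number() const { return number_; }
    const std::string& last_called() const { return last_called_; }

private:
    void deactivate() {
        number_.clear();
        state_ = panel_state::inactive;
    }

    panel_state state_ = panel_state::inactive;
    std::string number_;
    std::string last_called_;
};
```

With actors, each of these handlers would simply be a message handler, and the timeout would arrive as one more message.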

Nonlinear transitions between states

Second, finite state machines can be useful when the logic of behavior is nonlinear, that is, when we can go from state S1 to state S2, from there to S3, and from there, depending on the input signal, return to either S1 or S2 or go on to S4, from which we can return to S2.

With such cyclic transitions between states, a finite state machine may also be more convenient than trying to write linear code.
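
One common way to write down such transitions is a table keyed by the current state and the input signal. The sketch below encodes the S1..S4 transitions just described; the signal names (advance, restart, retry, escalate) are made up, since the text does not name the signals.

```cpp
#include <cassert>
#include <map>
#include <utility>

enum class fsm_state { S1, S2, S3, S4 };

// Hypothetical input signals; the text only says "depending on the input".
enum class fsm_input { advance, restart, retry, escalate };

// Returns the next state, or the current one if the signal is not
// expected in this state.
fsm_state next_state(fsm_state current, fsm_input input) {
    static const std::map<std::pair<fsm_state, fsm_input>, fsm_state> table = {
        {{fsm_state::S1, fsm_input::advance},  fsm_state::S2},
        {{fsm_state::S2, fsm_input::advance},  fsm_state::S3},
        {{fsm_state::S3, fsm_input::restart},  fsm_state::S1},
        {{fsm_state::S3, fsm_input::retry},    fsm_state::S2},
        {{fsm_state::S3, fsm_input::escalate}, fsm_state::S4},
        {{fsm_state::S4, fsm_input::retry},    fsm_state::S2},
    };
    const auto it = table.find({current, input});
    return it == table.end() ? current : it->second;
}
```

The whole transition graph is visible in one place, which is exactly what gets lost when the same logic is spread across nested conditions in linear code.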

Advanced state machines

Third, you may need advanced state machine capabilities:

reactions to entering and leaving a state;
state hierarchy (nested states, event inheritance);
state history;
limits on the time spent in a state.

Everything you wanted to know about state machines, but ...

There is a fundamental article on the formal notation for state diagrams by David Harel: Statecharts: A Visual Formalism For Complex Systems (1987).

It works through the various situations that can arise when using finite state machines, with the control of an ordinary electronic watch as the running example. If you have not read it, I highly recommend it. Essentially everything Harel described later made its way into the UML notation. But when you read the description of state diagrams in UML, it is not always clear what is used for what and when. In Harel's article, the presentation goes from simple situations to more complex ones, and you get a better sense of all the power hidden in finite state machines.

Obvious summary on state machines

If your subject area is literally teeming with finite state machines, then the Actor Model is the direct route for you.

Shared Nothing Architecture

The next marker is perhaps the most important one: how easy is it to follow the Shared Nothing principle in your subject area?

That is, can your entities live and work without any shared data at all?

Taken to the limit: can each of your entities be represented as an autonomous OS process that communicates with other such processes only through asynchronous messages?

Ideally, actors should have no shared data. Each actor is an autonomous, independent entity with its own state that nobody else can see. So the limiting view, in which each actor is an independent process with its own address space, is well justified.

It is, however, somewhat extreme, and we may move away from it, for example for reasons of efficiency. Nevertheless, if we can in principle imagine a solution in which each actor is a separate process, that is a good sign.

Two important points need to be emphasized here.

Shared Nothing is not always possible

First, it is obvious that we cannot always adhere to the Shared Nothing principle.

For example, we may keep a large graph of social connections in memory. To serve many requests to it efficiently, we may need to process these requests in multiple threads. The worker threads will be forced to share ownership of the graph and use some synchronization mechanisms so as not to violate its integrity.
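
For contrast with the actor approach, here is a minimal sketch of such a shared structure. It assumes a reader-writer lock (std::shared_mutex, C++17) as the synchronization mechanism; the text does not prescribe any particular one, and the social_graph class is invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <set>
#include <shared_mutex>

// Hypothetical in-memory social graph shared by several worker threads.
class social_graph {
public:
    // Writers take the lock exclusively.
    void add_friendship(int a, int b) {
        std::unique_lock<std::shared_mutex> lock(mutex_);
        edges_[a].insert(b);
        edges_[b].insert(a);
    }

    // Readers share the lock, so many queries can run in parallel.
    std::size_t friend_count(int user) const {
        std::shared_lock<std::shared_mutex> lock(mutex_);
        const auto it = edges_.find(user);
        return it == edges_.end() ? 0 : it->second.size();
    }

private:
    mutable std::shared_mutex mutex_;
    std::map<int, std::set<int>> edges_;
};
```

Nothing here resembles message passing: the threads touch the same data directly, and correctness rests entirely on the lock.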

Another example is computational problems. We may keep several large matrices in memory that take part in the calculations. To speed the calculation up, we can start several parallel threads, and these threads will work on the shared data together.

It all depends on the height from which we look down ...

Second, the situation may change fundamentally depending on the level of abstraction at which you consider your task.

Let's go back to the online cinema example. Seen from a bird's-eye view, it is very much a Shared Nothing architecture. The authentication subsystem works with its own data, billing works with its own data, and the notification subsystem works with its own data. They have nothing to share and communicate with each other only through asynchronous messages. In effect, they are all actors.

However, if we go down to the level of implementation of a specific component, there may be no autonomous actors there at all.

For example, the billing subsystem may contain a huge data structure in RAM and several worker threads that work with it in very clever ways (for example, using lock-free algorithms and persistent data structures).

That is, we may find that at the conceptual level we seem to have the Actor Model, while at the implementation level, in the code, we have the usual hardcore multi-threaded imperative mess.

And that's fine. Recall once more that the Actor Model is not just a set of techniques for writing code. It is also an approach to analyzing the subject area and designing a software system.

Therefore, we can use the Actor Model at the design level to identify the components that are conceptually actors, even if nothing of the Actor Model remains at the implementation level.

The reverse situation is also possible: your application may be a huge monolith that does share data between threads, yet in some part of it you can easily use the Actor Model, effectively isolating that part from the rest of the code.

Summary on Shared Nothing

If using a Shared Nothing architecture is difficult and/or leads to additional overhead, then you need not look in the direction of the Actor Model.

But in general, Shared Nothing is a great thing. It simplifies life a great deal, especially in multi-threaded programming, and the Actor Model makes a Shared Nothing architecture easier to implement. So if you are trying to build your application on a Shared Nothing architecture, actors can help you a lot.

Timers

It is worth dwelling separately on working with timers.

Timers cannot be called a special marker inherent in the Actor Model. But because of the fire-and-forget principle, working with timers becomes very important. It often happens that we start some operation and have to check its result after a while. In that case we cannot do without convenient timers.

With actors, timers are implemented as delayed messages. This is very convenient, because when the timer fires you simply receive an ordinary message.

Let's look at a simple example.

You receive a request from a user. But you do not want to process it immediately, because processing single requests is not cost-effective. You can wait a bit: a few more requests may arrive, and then you can process them all in one batch. Say it pays off for you to process requests in batches of 100. This, of course, worsens the latency of an individual request, but it improves the throughput of your service.

It turns out that you need to wait for one of two conditions to be met:

either you have accumulated 100 requests and process them all at once;
or you have waited, say, 250 milliseconds and process everything that has accumulated so far. If that is 99 requests, then 99. If it is one, then one.

This is implemented very simply:

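    class bunch_processor {
        ...
    public:
        void on_request(request & req) {
            requests_.push_back(std::move(req));
            if(1 == requests_.size())
                // First request of a new batch: schedule the delayed timeout message.
                timeout_timer_ = send_delayed<timeout>(this, 250ms);
            else if(100 == requests_.size()) {
                // The batch is full: cancel the pending timeout and process right away.
                timeout_timer_.reset();
                handle_collected_requests();
            }
        }
        // The delayed message arrives like any other message.
        void on_timer(timeout&) {
            handle_collected_requests();
        }
        ...
    };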
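
The snippet above relies on a framework-provided delayed message. To try the same two-condition logic without any framework, one could use a sketch like the following, in which the caller reports the current time explicitly. The batcher class and its on_tick method are invented for this illustration and stand in for the delayed message; they are not part of any real library.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using clock_type = std::chrono::steady_clock;

// Hypothetical batcher: flushes when the batch is full or when the
// deadline set by the first request of the batch has passed.
class batcher {
public:
    batcher(std::size_t max_batch, std::chrono::milliseconds max_wait)
        : max_batch_(max_batch), max_wait_(max_wait) {}

    // Returns the full batch once the size limit is reached,
    // otherwise an empty vector.
    std::vector<std::string> on_request(std::string request,
                                        clock_type::time_point now) {
        if (pending_.empty()) deadline_ = now + max_wait_;
        pending_.push_back(std::move(request));
        if (pending_.size() >= max_batch_) return flush();
        return {};
    }

    // Stand-in for the delayed timeout message: returns whatever has
    // accumulated if the deadline has passed, otherwise an empty vector.
    std::vector<std::string> on_tick(clock_type::time_point now) {
        if (!pending_.empty() && now >= deadline_) return flush();
        return {};
    }

private:
    std::vector<std::string> flush() {
        std::vector<std::string> batch;
        batch.swap(pending_);
        return batch;
    }

    std::size_t max_batch_;
    std::chrono::milliseconds max_wait_;
    clock_type::time_point deadline_{};
    std::vector<std::string> pending_;
};
```

Passing the time in from outside keeps the sketch deterministic and easy to test. With an actor framework, on_tick disappears, and the timeout simply arrives in the mailbox like any other message.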

Summing up the topic of timers: actors and timers get along very well. So if your task involves a lot of work with timers, the Actor Model can help you with that.

Where does all this apply?

To consolidate the material, let's briefly walk through the areas in which the Actor Model has proven itself. What follows is based on my own experience and on the experience of colleagues with whom I have had occasion to discuss the Actor Model.

Equipment control

The first area that comes to mind is controlling real equipment with a computer, for example in industrial automation tasks.

The operation of external devices is often described using state machines. It is therefore not surprising that state machines are also used to work with devices in the program itself.

Interaction between actors through asynchronous messages also turns out to be very similar to working with real equipment, since communication with external devices is often asynchronous as well. Say we write a command to some I/O port. Then we have to wait for a while and read the contents of some other I/O port to see whether our command has been executed. Convenient timers, by the way, help a lot in such cases.

Simulation

Another area is simulation modeling of various processes, for example in queuing systems, especially processes that involve many diverse entities (see, for example, Agent-Based Model).

Since an actor is an autonomous entity with its own behavior, actors are convenient for simulating real-world objects. You can create completely different types of actors, or actors of the same type that differ only in the values of some parameters. You can fill your model with as many as a million actors, each at least slightly different from the others. This lets you conduct complex simulation experiments.

Development of test environments

When developing components of large software systems, you often have to create a test environment that simulates the behavior of adjacent components. There can be various reasons for this:

the adjacent components themselves may not be available yet. They are being developed in parallel and may not be ready for joint integration testing, so you need some kind of simulator that can stand in for an adjacent component here and now;
you may need to simulate abnormal behavior of an adjacent component. For example, you need the adjacent component to heavily delay the response to every 5th request, not respond to every 10th request at all, and respond to every 20th request with some kind of garbage.
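
The decision logic of such a misbehaving stub can be sketched in a few lines. The names below are invented, and the sketch assumes a precedence the text leaves open: since every 20th request is also a 10th and a 5th, garbage wins over dropping, and dropping wins over delaying.

```cpp
#include <cassert>
#include <cstdint>

enum class reply_kind { normal, delayed, dropped, garbage };

// Hypothetical stub of an adjacent component that misbehaves on a
// fixed schedule.
class flaky_component_stub {
public:
    // Decides how the stub reacts to the next incoming request.
    reply_kind on_request() {
        ++counter_;
        if (counter_ % 20 == 0) return reply_kind::garbage;  // every 20th
        if (counter_ % 10 == 0) return reply_kind::dropped;  // every 10th
        if (counter_ % 5 == 0) return reply_kind::delayed;   // every 5th
        return reply_kind::normal;
    }

private:
    std::uint64_t counter_ = 0;
};
```

In an actor-based simulator, the delayed case would be a delayed message carrying the reply, and the dropped case would simply send nothing.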

Your own test environment may be needed even when developing small systems, for example when you work with external equipment that you do not have yet and therefore need some kind of external device simulator.

Experience shows that in such cases simulators based on the Actor Model are implemented easily and naturally. This is no accident, since the task has a lot in common with both equipment control and simulation modeling, which were discussed earlier.

Pipeline data / transaction processing

Pipeline processing of data streams or transaction flows is not exactly the domain of the Actor Model; it belongs to data flow programming. In practice, however, the Actor Model can easily become the foundation on which pipeline processing is built. The pipeline stages are easily implemented as actors, and information is passed from one stage to the next through asynchronous messages (in such tasks the fire-and-forget principle feels right at home).

A big plus of actors in such tasks is that actors have state, and this allows them to do interesting things, for example to accumulate single requests into batches for subsequent batch processing. We have already looked at such an example above: the actor receives the first request, arms the timer and waits either for a complete batch to accumulate or for the timer to fire.

Another good point is that actors can rearrange their connections dynamically. For example, there may be an actor that balances load across five subordinate worker actors. The balancer can track how long each worker takes to process the next batch, and if it finds that this time starts to grow, it can reduce the load on the problematic worker.
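
The bookkeeping inside such a balancer could be as simple as the sketch below. It assumes one particular policy that the text does not prescribe: an exponential moving average of processing time per worker, with the next batch going to the worker whose average is lowest. All names are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical state of a balancer actor.
class balancer {
public:
    explicit balancer(std::size_t workers) : avg_ms_(workers, 0.0) {
        assert(workers > 0);
    }

    // Called when a worker reports how long its last batch took.
    // The 0.5 smoothing factor is an arbitrary choice for the sketch.
    void report(std::size_t worker, double elapsed_ms) {
        avg_ms_[worker] = 0.5 * avg_ms_[worker] + 0.5 * elapsed_ms;
    }

    // Picks the worker with the lowest average; ties go to the lowest index.
    std::size_t pick() const {
        std::size_t best = 0;
        for (std::size_t i = 1; i < avg_ms_.size(); ++i)
            if (avg_ms_[i] < avg_ms_[best]) best = i;
        return best;
    }

private:
    std::vector<double> avg_ms_;
};
```

A worker whose times start to grow automatically receives fewer batches, because its average stops being the lowest. Since this state lives inside a single actor and is only touched while handling messages, no locks are needed.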

True, when actors are used in pipeline tasks, the problem of back pressure comes up. But that is another story, especially since it is quite solvable. In Akka, for example, there is Akka Streams, which is built on top of ordinary Akka actors.

A few platitudes at the end

I want to finish in the role of Captain Obvious, so here are a few platitudes:

Be guided by common sense when choosing an approach to solving your problem.
Common sense says that the Actor Model is not a silver bullet.
In some cases, the Actor Model really does make life easier.
But for it to make life easier, you need experience of working with the Actor Model.
This experience is best gained on toy prototypes.
To gain this experience, you can take any of the existing ready-made actor frameworks for C++.

Source: https://habr.com/ru/post/342316/
