This article opens a series of publications about the technologies that we use and study while developing the HostTracker website monitoring service. We hope our experience will be useful.
Message passing is a popular concept in parallel programming. It is often used to build complex distributed systems with a high degree of parallelism. Programming languages implement this concept as actors or agents.
Distributed agents of HostTracker. Quick check with http://updownchecker.com
Features of message passing:
• The solution consists of isolated components that run in parallel (on threads from the thread pool). Components interact by exchanging messages over a specific protocol. Networked components can use TCP, UDP, HTTP, and so on. Local components interact through a protocol defined by the language or library that implements them.
• A component defines the logic for processing incoming messages. Messages enter a queue and are taken from it one at a time for processing.
• A component may own resources and act as their provider for other components. A resource can be data in a specific format in RAM, a hardware or software resource, or a combination of both.
• A component has a state. The state can encapsulate a resource (see the previous item) or, as in a state machine, be expressed as a specific message-processing algorithm that moves the component into another state.
• Interface between components:
- postSync, postAsync - send a message to a component synchronously or asynchronously.
- receiveSync, receiveAsync - receive a message from a component synchronously or asynchronously. Asynchronous waiting means that the thread, as a resource, is returned to the system and can be used for other work. In this case, the system registers a callback function for a specific event.
- tryReceive - functions similar to those listed above, but with a timeout that limits how long to wait for data.
• Message passing is closely related to asynchronous execution.
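The properties listed above can be illustrated with a minimal sketch. It is written in Python with threads and queues purely for illustration; the `Counter` class, its message format, and the method names `post` and `try_receive` are assumptions made for this example and do not come from any particular library.

```python
import queue
import threading


class Counter:
    """A component that owns a counter and processes messages one at a time."""

    def __init__(self):
        self._mailbox = queue.Queue()   # input message queue
        self._replies = queue.Queue()   # output channel read by other components
        self._total = 0                 # private state, touched only by the worker
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def post(self, message):
        """Asynchronous send: enqueue the message and return immediately."""
        self._mailbox.put(message)

    def try_receive(self, timeout):
        """Wait up to `timeout` seconds for a reply; return None on timeout."""
        try:
            return self._replies.get(timeout=timeout)
        except queue.Empty:
            return None

    def _run(self):
        while True:
            kind, value = self._mailbox.get()   # messages are taken in arrival order
            if kind == "add":
                self._total += value
            elif kind == "get":
                self._replies.put(self._total)
            elif kind == "stop":
                break


counter = Counter()
counter.post(("add", 2))
counter.post(("add", 3))
counter.post(("get", None))
print(counter.try_receive(timeout=1.0))   # the total after both additions
counter.post(("stop", None))
```

Because only the worker thread reads or writes `_total`, the component needs no locks: the mailbox serializes all access to its state.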
Asynchronous and parallel execution
Asynchronous execution means that a thread does not sit waiting for the result of an operation while holding operating system resources. Instead, the thread returns to the pool and can be used for other work. A callback function that accepts the result of the asynchronous operation is registered with the OS. This function is invoked when the result arrives (on Windows through the I/O completion ports mechanism, on *nix systems through other kernel mechanisms).
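As a rough illustration of this idea, the sketch below uses Python's asyncio. The function `fake_io` is a hypothetical stand-in for a real I/O operation; `asyncio.sleep` plays the role of the wait. Two operations are in flight at the same time, yet both run on a single thread, because the thread is released at every `await`.

```python
import asyncio
import threading


async def fake_io(name, delay):
    """Stand-in for an I/O operation; the thread is released while waiting."""
    await asyncio.sleep(delay)
    return name, threading.get_ident()


async def main():
    # Both operations are pending at once, yet no extra thread is created.
    return await asyncio.gather(fake_io("a", 0.05), fake_io("b", 0.05))


results = asyncio.run(main())
print([name for name, _ in results])                   # names in submission order
print(len({thread_id for _, thread_id in results}))    # number of distinct threads used
```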
Implementing the concept of message passing
Implementation | Language | Interaction type | Special features | Developer |
Erlang processes | Erlang | local, network | Originally implemented for telecommunications equipment. Can be used over a network. Messages are sent with Pid ! Message and received with receive, which pattern-matches on messages. | Ericsson |
MailboxProcessor | F# | local | Uses asynchronous computations. Adds the notion of a reply channel, AsyncReplyChannel, which lets the caller receive a response synchronously or asynchronously. | Microsoft Research |
TPL Dataflow | .NET languages | local | A set of primitives (blocks) that implement data receivers and data sources. | Microsoft |
Scala actors | Scala (JVM) | local, network | The implementation resembles classic Erlang processes. | EPFL, Typesafe Inc. |
Node.js | JavaScript | network | Event-driven approach to application development: declarative definition of asynchronous operations and callbacks, and a queue from which event handlers are called. The external interface is defined by the application developer. | Joyent Inc. |
MPI: MPICH, HP-MPI, SGI-MPI, WMPI | C, C++, Java, Fortran | network | A specified interface for interaction between hosts. A single executable for all nodes, with the execution branch chosen by process identifier. | Microsoft, HP, SGI, Argonne National Laboratory, others |
Different programming languages support this concept differently. For example, JavaScript on the Node.js platform is event-driven: the platform has built-in support for asynchronous computing. A typical Node.js program declaratively registers callback functions that respond to I/O events. When the program starts, a queue of operations is created, and the callbacks are invoked from it. That is, the program itself is single-threaded but asynchronous. The Node.js runtime (based on Chrome's V8 JavaScript engine) is therefore an example of a message passing implementation. It satisfies all the conditions above if the program implements a specific network interface for interacting with other components of a distributed application (for example, a REST service).
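The event-driven model can be sketched as a toy loop. This is not how Node.js is implemented; the `EventLoop` class and the event names are hypothetical and exist only to show declarative callback registration and a single-threaded queue of pending calls.

```python
from collections import deque


class EventLoop:
    """A toy single-threaded loop: a queue of (callback, payload) pairs."""

    def __init__(self):
        self._queue = deque()
        self._handlers = {}

    def on(self, event, callback):
        """Declaratively register a callback for a named event."""
        self._handlers.setdefault(event, []).append(callback)

    def emit(self, event, payload):
        """Queue every handler of the event; nothing runs until the loop turns."""
        for callback in self._handlers.get(event, []):
            self._queue.append((callback, payload))

    def run(self):
        """Run queued callbacks one by one until the queue is empty."""
        while self._queue:
            callback, payload = self._queue.popleft()
            callback(payload)


log = []
loop = EventLoop()


def on_request(url):
    log.append("got " + url)
    loop.emit("done", url)   # a handler may queue further events


def on_done(url):
    log.append("served " + url)


loop.on("request", on_request)
loop.on("done", on_done)
loop.emit("request", "/status")
loop.run()
print(log)   # ['got /status', 'served /status']
```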
Some languages (for example, F# with its async monad, or C# with async and await) have the notion of an asynchronous computation: a piece of code, defined by the developer, that contains both synchronous and asynchronous operations. Such a computation can also be started on a parallel thread, cancelled, or run synchronously.
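The same operations exist in Python's asyncio, which is used here only as an analogy to the F# and C# facilities. In this sketch `compute` is a hypothetical computation. Note that `asyncio.create_task` schedules work on the same event loop rather than on another thread; `asyncio.to_thread` (Python 3.9+) is the closer counterpart of starting work on a parallel thread.

```python
import asyncio


async def compute(steps):
    """An asynchronous computation mixing synchronous work and asynchronous waits."""
    total = 0
    for i in range(steps):
        total += i                  # synchronous part
        await asyncio.sleep(0.01)   # asynchronous part
    return total


async def demo():
    finished = asyncio.create_task(compute(3))       # start without waiting
    doomed = asyncio.create_task(compute(1000))      # long-running, will be cancelled
    threaded = await asyncio.to_thread(sum, range(4))  # run on a worker thread
    value = await finished
    doomed.cancel()                                  # request cancellation
    try:
        await doomed
    except asyncio.CancelledError:
        pass
    return value, threaded, doomed.cancelled()


print(asyncio.run(demo()))   # asyncio.run is the synchronous start: (3, 6, True)
```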
Examples of the implementation of the concept of Message Passing
One of the best-known early implementations of message passing is the Erlang language (a functional language with dynamic typing), which was developed by Ericsson for specialized telecommunications equipment. Erlang processes are the components described earlier. Unlike OS processes, they are cheap to create, and exchanging data between them takes little time. Modern functional languages such as Scala (Java platform) and F# (.NET platform) borrowed their implementation of message passing from Erlang. Scala actors can live within a single OS process or in different processes on different computers. The principles of working with them are similar to those of Erlang processes. F# offers a slightly different approach, MailboxProcessor. The TPL Dataflow library, developed for .NET (it can be used from C#, VB, and F#), has its own take on message passing as well.
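To give a feel for the mailbox-with-reply-channel idea, here is a rough Python analogue built on asyncio. It is not the F# MailboxProcessor API: the `Mailbox` class, its methods, and the message names are hypothetical, and an `asyncio.Future` stands in for the reply channel.

```python
import asyncio


class Mailbox:
    """Hypothetical asyncio analogue of an agent with a reply channel."""

    def __init__(self):
        self._queue = asyncio.Queue()
        self._count = 0   # private state, changed only inside run()

    def post(self, message):
        """Fire-and-forget send."""
        self._queue.put_nowait((message, None))

    async def post_and_reply(self, message):
        """Send a message with a Future that acts as the reply channel."""
        reply = asyncio.get_running_loop().create_future()
        self._queue.put_nowait((message, reply))
        return await reply

    async def run(self):
        """Process messages sequentially until a stop message arrives."""
        while True:
            message, reply = await self._queue.get()
            if message == "increment":
                self._count += 1
            elif message == "get" and reply is not None:
                reply.set_result(self._count)
            elif message == "stop":
                return


async def main():
    agent = Mailbox()
    worker = asyncio.create_task(agent.run())
    agent.post("increment")
    agent.post("increment")
    count = await agent.post_and_reply("get")
    agent.post("stop")
    await worker
    return count


print(asyncio.run(main()))   # 2
```

The caller that awaits `post_and_reply` is suspended without blocking a thread, which mirrors receiving the response asynchronously.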
Network interaction is based on the principles of message passing. Here, a component is a program on a separate machine, and its resources are all the hardware components of that computer plus the individual functions delegated to this computing node. Data is stored in the application process. The interface for interacting with other components (nodes) is defined by a specific technology, usually built on Berkeley sockets. One example of such a technology is MPI (Message Passing Interface). Its distinctive feature is that every node runs the same code and selects its own branch with conditional statements on the process identifier. The process with identifier zero is singled out. Data is exchanged over the network through an interface similar to the one described above (the message passing concept), and process identifiers indicate the source and/or receiver of the data. Another example is REST and SOAP services, which let developers define the network interface themselves. Although at first glance an interface defined this way does not seem to fit the concept of message passing, it is equivalent to a set of post (send) and receive operations. In SOAP, a network function is called by sending an XML envelope over HTTP that contains information about the function. REST works similarly: HTTP requests are used with a data format defined by the developer.
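The "same code on every node, branch chosen by identifier" style can be mimicked without a real MPI installation. In the sketch below, threads stand in for hosts and in-memory queues stand in for the network; the functions `node` and `run_cluster` are hypothetical names, and none of this is the actual MPI API.

```python
import queue
import threading


def node(rank, size, inboxes, results):
    """The same code runs on every node; the branch is chosen by rank."""
    if rank == 0:
        # Rank zero is singled out: it collects one message from each other node.
        received = [inboxes[0].get(timeout=5) for _ in range(size - 1)]
        results.extend(sorted(received))
    else:
        # Every other node sends (source rank, payload) to rank zero.
        inboxes[0].put((rank, rank * rank))


def run_cluster(size):
    """Start `size` simulated nodes and return what rank zero collected."""
    inboxes = [queue.Queue() for _ in range(size)]
    results = []
    threads = [
        threading.Thread(target=node, args=(rank, size, inboxes, results))
        for rank in range(size)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


print(run_cluster(4))   # [(1, 1), (2, 4), (3, 9)]
```

Each message carries the rank of its sender, which corresponds to using the identifier to indicate the source of the data.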
Further publications will review the F# MailboxProcessor in detail (it is actively used in the HostTracker system), as well as the features of working with the TPL Dataflow library.