The purpose of this note is to introduce PHP developers to the interprocess communication capabilities of the language. It does not aim to cover each option in depth, to dig into implementation details, or to provide complete working implementations.
Since sooner or later every programmer runs into the task of parallelization, this note was conceived as a starting point from which you can begin your journey into the fascinating world of headaches that come with building such systems.
The topic of threading in PHP will also be touched on, and I do mean threads (Thread), because until recently PHP made it (conveniently) possible to implement any kind of parallelization only by means of fork (we will not touch on curl_multi_*, popen / fsockopen, other contortions, Apache MPM, etc.). The topic will be considered only in the context of IPC; finding the implementation details of a particular approach is left to the reader.
The discussion is limited to software running on a single computer and to IPC within that computer, because interprocess communication in distributed systems is a very, very extensive topic. Therefore message brokers, databases, Pub/Sub, and other “intercomputer” approaches will not be considered; besides, they are covered by other resources on the Web.
In view of all of the above, some preparation is required of the reader: terminology is explained only at key points, although the text is abundantly provided with links to the relevant articles in the PHP documentation, Wikipedia, and so on. The reader should also understand that many things are deliberately simplified because of the format of this material.
What does IPC look like?
So, what is interprocess communication?
Interprocess communication (IPC) is a set of ways to exchange data between multiple threads in one or more processes. The processes can run on one or more computers connected by a network. IPC methods are divided into message passing, synchronization, shared memory, and remote procedure call (RPC) methods. The choice of IPC method depends on the bandwidth and latency of the interaction between the threads and on the type of data transferred.
Describe what IPC looks like!
The outline of this note is as follows:
0. PCNTL
1. Sockets
2. Shared Memory
3. Semaphore, Shared Memory and IPC
4. pthreads
0. PCNTL
The extension implements the most basic functionality for working with processes, but we are interested in its support for UNIX signals, and more specifically in the pcntl_signal function, which lets you install a signal handler. This approach is the least functional of all those considered, because a signal carries no data. Using this extension you can, for example, start and stop workers, trigger the reading of tasks from a buffer (file, database, memory, etc.), or notify one part of the system about an event.
It is the easiest approach to implement, there are many examples and use cases, and it is often more than enough for tasks that are not very complex.
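As an illustration of the start / stop case, here is a minimal sketch, assuming a CLI build with the pcntl and posix extensions. The `$stop` flag and the sleep intervals are made up for the example; the point is only that the handler installed with pcntl_signal receives a notification and nothing else.

```php
<?php
// Requires the pcntl and posix extensions (CLI, Unix-like systems).

$stop = false; // hypothetical flag flipped by the signal handler

// Install a SIGTERM handler; a signal carries no payload, so we only set a flag.
pcntl_signal(SIGTERM, function (int $signo) use (&$stop): void {
    $stop = true;
});

$pid = pcntl_fork();
if ($pid === -1) {
    exit("fork failed\n");
}

if ($pid === 0) {
    // Child (the "worker"): loop until the parent asks it to stop.
    while (!$stop) {
        usleep(10000);           // pretend to do some work
        pcntl_signal_dispatch(); // run handlers for any pending signals
    }
    exit(0);
}

// Parent: give the worker a moment, then signal it and wait for it to exit.
usleep(100000);
posix_kill($pid, SIGTERM);
pcntl_waitpid($pid, $status);

$workerExitedCleanly = pcntl_wifexited($status) && pcntl_wexitstatus($status) === 0;
echo $workerExitedCleanly ? "worker stopped\n" : "worker failed\n";
```

The handler is installed before the fork so that the child inherits it; anything the worker needs beyond "stop now" has to travel through one of the mechanisms described in the following sections.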
1. Sockets
Sockets are a software interface for data exchange between processes. The processes taking part in the exchange can run on one computer or on different computers connected by a network. A socket is an abstract object representing the endpoint of a connection.
A distinction is made between client and server sockets. Client sockets can be roughly compared to the telephone sets of a telephone network, and server sockets to its switches. A client application (for example, a browser) uses only client sockets, while a server application (for example, the web server to which the browser sends requests) uses both client and server sockets.
This is perhaps the most obvious and best-known way to implement IPC, but also the most labor-intensive. The first option is to create a broker (a socket server) to which the client threads connect. Here the fascinating world of debugging non-blocking input/output awaits you (or did you want to write blocking code?), along with implementing many trivial things such as wrappers over the extension's functions. The second option is simpler and suits simpler implementations: socket_create_pair, which creates a pair of connected sockets; an example can be found in the function's documentation.
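The second option might look roughly like the sketch below, which assumes the sockets and pcntl extensions and uses the function under its actual name in the sockets extension, socket_create_pair. The "ping" message and the upper-casing reply are invented for the example.

```php
<?php
// Requires the sockets and pcntl extensions (CLI, Unix-like systems).

$pair = [];
if (!socket_create_pair(AF_UNIX, SOCK_STREAM, 0, $pair)) {
    exit('socket_create_pair failed: ' . socket_strerror(socket_last_error()) . "\n");
}
[$parentEnd, $childEnd] = $pair;

$pid = pcntl_fork();
if ($pid === -1) {
    exit("fork failed\n");
}

if ($pid === 0) {
    // Child: keep only its own end, read a request, send back a reply.
    socket_close($parentEnd);
    $request = socket_read($childEnd, 1024);
    socket_write($childEnd, strtoupper($request));
    socket_close($childEnd);
    exit(0);
}

// Parent: keep only its own end, send a request, read the reply.
socket_close($childEnd);
socket_write($parentEnd, 'ping');
$reply = socket_read($parentEnd, 1024);
socket_close($parentEnd);
pcntl_waitpid($pid, $status);

echo $reply, "\n";
```

A single short message like this normally arrives in one read; anything larger needs the buffering discussed in the rest of this section.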
Using sockets for IPC requires a fairly serious approach and a lot of manual reading, but the advantages include the ability to move parts of the system to different servers later without significant code changes. Another advantage of this technology is its universality: for example, writing a client in PHP that connects to a server written in C is not difficult.
The disadvantages must be noted too, namely the non-blocking I/O mentioned above. Since data arrives in chunks, you should think carefully about a mechanism for ensuring its integrity, buffering, and processing that does not cancel out the benefits of non-blocking input/output.
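One common way to deal with data arriving in chunks is to prefix every message with its length and to buffer incoming bytes until a whole message is available. The sketch below is only one possible scheme, not something the sockets extension provides: the FrameBuffer class, its method names, and the 4-byte big-endian length prefix are all assumptions made for the example.

```php
<?php
// Hypothetical helper: reassembles length-prefixed messages from arbitrary chunks.
final class FrameBuffer
{
    private string $buffer = '';

    /** Wrap a payload in a frame: 4-byte big-endian length, then the payload. */
    public static function encode(string $payload): string
    {
        return pack('N', strlen($payload)) . $payload;
    }

    /** Append a raw chunk and return every message that is now complete. */
    public function feed(string $chunk): array
    {
        $this->buffer .= $chunk;
        $messages = [];

        while (strlen($this->buffer) >= 4) {
            $length = unpack('N', $this->buffer)[1];
            if (strlen($this->buffer) < 4 + $length) {
                break; // the payload has not fully arrived yet
            }
            $messages[] = substr($this->buffer, 4, $length);
            $this->buffer = (string) substr($this->buffer, 4 + $length);
        }

        return $messages;
    }
}

// Simulate a stream that delivers two messages in 3-byte pieces.
$wire = FrameBuffer::encode('hello') . FrameBuffer::encode('world');
$frames = new FrameBuffer();
$received = [];
foreach (str_split($wire, 3) as $chunk) {
    foreach ($frames->feed($chunk) as $message) {
        $received[] = $message;
    }
}

print_r($received);
```

With a real non-blocking socket, each successful read would be passed to feed() in the same way, so the processing code only ever sees complete messages.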
2. Shared Memory
This extension (shmop) allows you to work fully with shared memory. The advantages of the approach are that it is the fastest (if speed is the top priority of the application) and the least resource-intensive. In addition, implementing it involves fewer pitfalls than the previous solution, and the technology itself is not hard to learn.
There are many ways to use it: a common space, or blocks allocated individually for each thread/process; data processing is also simplified because the block size is precisely defined. The disadvantages include some difficulty in wiring this interaction up conveniently: you have to pass the addresses (keys) of the blocks to child processes (as parameters when pcntl_fork is called, through marker files, etc.).
This approach is perhaps the most common and preferred, since it does not require much effort to implement and is more versatile.
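A minimal sketch of a fixed-size block shared between a parent and a forked child, assuming the shmop and pcntl extensions. The temporary marker file, the 64-byte block size, and the "result=42" payload are hypothetical choices for the example; the key is derived from the marker file with ftok.

```php
<?php
// Requires the shmop and pcntl extensions (CLI, Unix-like systems).

$marker = tempnam(sys_get_temp_dir(), 'shm'); // hypothetical marker file
$key = ftok($marker, 's');                    // System V IPC key derived from it
$size = 64;                                   // fixed block size in bytes

// "c" creates the segment (or opens an existing one); 0600 = owner read/write.
$shm = shmop_open($key, 'c', 0600, $size);
if ($shm === false) {
    exit("shmop_open failed\n");
}

$pid = pcntl_fork();
if ($pid === -1) {
    exit("fork failed\n");
}

if ($pid === 0) {
    // Child: the attached segment is inherited across fork; write a result into it.
    shmop_write($shm, str_pad('result=42', $size, "\0"), 0);
    exit(0);
}

// Parent: wait until the child has finished writing, then read the block.
pcntl_waitpid($pid, $status);
$data = rtrim(shmop_read($shm, 0, $size), "\0");

shmop_delete($shm); // mark the segment for removal
unlink($marker);

echo $data, "\n";
```

Here waiting for the child is the only synchronization; as soon as both sides touch the block concurrently, some form of locking is needed on top.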
3. Semaphore, Shared Memory and IPC
This extension includes the capabilities of the previous one, but adds basic resource synchronization primitives such as semaphores, as well as another way of interaction between threads, known as messaging.
Semaphores come in handy when threads have to work with some shared resource. Say you wrote a firewall which, on every request, looks into a file of Roskomnadzor's IP addresses and performs street magic on the incoming request. The file, of course, is updated by some other service thread, so it is unacceptable to read (or change) it while someone else is updating it. The theory of how semaphores work is simple and there are many examples of their implementation, so to those who have not worked with this type of lock I recommend getting acquainted with it; it will help you better understand how interaction between threads is built.
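The shared-file situation can be sketched as follows, assuming the sysvsem and pcntl extensions. Instead of a list of IP addresses, the hypothetical shared resource here is a temporary file holding a counter; four workers increment it fifty times each, and the semaphore keeps their read-modify-write steps from interleaving.

```php
<?php
// Requires the sysvsem and pcntl extensions (CLI, Unix-like systems).

$file = tempnam(sys_get_temp_dir(), 'cnt'); // hypothetical shared resource
file_put_contents($file, '0');
$key = ftok($file, 's');                    // System V IPC key for the lock

$children = [];
for ($i = 0; $i < 4; $i++) {
    $pid = pcntl_fork();
    if ($pid === -1) {
        exit("fork failed\n");
    }
    if ($pid === 0) {
        $sem = sem_get($key, 1);            // every worker opens the same semaphore
        for ($n = 0; $n < 50; $n++) {
            sem_acquire($sem);              // enter the critical section
            $value = (int) file_get_contents($file);
            file_put_contents($file, (string) ($value + 1));
            sem_release($sem);              // leave the critical section
        }
        exit(0);
    }
    $children[] = $pid;
}

foreach ($children as $child) {
    pcntl_waitpid($child, $status);
}

$total = (int) file_get_contents($file);

sem_remove(sem_get($key, 1)); // clean up the semaphore
unlink($file);

echo $total, "\n";
```

Without the sem_acquire / sem_release pair, two workers could read the same value and one of the increments would be lost.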
Messaging is a more “high-level” and convenient solution than shared memory, but the topic is poorly covered in the context of PHP. In addition, I know of cases where this technology behaved, let's say, oddly, so the results the code produces should be checked and rechecked carefully.
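For orientation, a minimal sketch of a message queue between a parent and a child, assuming the sysvmsg and pcntl extensions. The marker file, the message type 1, and the task array are invented for the example; msg_send serializes the value by default and msg_receive unserializes it.

```php
<?php
// Requires the sysvmsg and pcntl extensions (CLI, Unix-like systems).

$marker = tempnam(sys_get_temp_dir(), 'mq'); // hypothetical marker file
$queue = msg_get_queue(ftok($marker, 'q'), 0600);

$pid = pcntl_fork();
if ($pid === -1) {
    exit("fork failed\n");
}

if ($pid === 0) {
    // Child: send one message of type 1; the array is serialized automatically.
    msg_send($queue, 1, ['task' => 'resize', 'id' => 7]);
    exit(0);
}

// Parent: block until a message of type 1 arrives (up to 4096 bytes).
msg_receive($queue, 1, $receivedType, 4096, $message);
pcntl_waitpid($pid, $status);

msg_remove_queue($queue); // the queue outlives the processes unless removed
unlink($marker);

print_r($message);
```

Unlike a raw shared memory block, the queue delivers whole PHP values, which is what makes this approach feel more high-level.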
4. Pthreads
And now we have reached the segfault, that is, the pinnacle of the evolution of both IPC and multithreading in PHP.
A cool guy named Joe Watkins has written the pthreads extension, which adds support for real multithreading to PHP. Just the other day (09/09/2013) the first stable version (0.0.45) was released; however, in his post on Reddit the author covered the topic of beta versus stable releases in great detail, so do not dwell on this. I strongly recommend reading all his comments in that thread; they contain a lot of useful information about pthreads.
What are the advantages? Everything. pthreads provides an extremely simple and convenient API for implementing any of your multithreaded fantasies. Here you get synchronized as in Java, events, and IPC with objects passed between threads! Not everything is so smooth with them, though (see the examples on GitHub), and the author writes that this problem is not his to solve. Nevertheless, he managed to work a miracle with socket resources, and now a socket returned by socket_accept in the main thread can be handed to a child thread. Amazing! It is enough to go through the examples to see how simple and elegant everything is.
I will not describe all the features and advantages of this extension; everything is on the author's GitHub and in the documentation on php.net.
Apparently the author is working on the project quite intensively, so many more interesting features may appear in the future. Stay tuned.
To run the extension, you need to build PHP in thread-safe mode. Here is a small script that does everything for you:
# If necessary, modify the file
mkdir /opt/php-ts && \
cd /opt/php-ts && \
wget http://www.php.net/get/php-5.5.3.tar.bz2/from/ua1.php.net/mirror -O php-5.5.3.tar.bz2 && \
tar -xf php-5.5.3.tar.bz2 && \
cd php-5.5.3/ext && \
git clone https://github.com/krakjoe/pthreads.git && \
cd ../ && \
./buildconf --force && \
./configure --disable-all --enable-pthreads --enable-maintainer-zts && \
make && \
TEST_PHP_EXECUTABLE=sapi/cli/php sapi/cli/php run-tests.php ext/pthreads && \
alias php-ts="/opt/php-ts/php-5.5.3/sapi/cli/php"
Does he look like a pipe?
That is probably all. Although the language is limited in its IPC capabilities, it nevertheless lets you write effective applications using different approaches to interprocess communication. To those of you who are facing the task of implementing such interaction right now, I recommend carefully studying all the methods listed in this note, since they are not interchangeable but complement each other well.
P.S. This does not relate directly to the topic of the article, but it does apply to some of the points described here, namely blocking I/O and the imperfection of the event model: I recommend that you take a look at the Eio and Ev extensions (both by osmanov).