I recently tried pthreads and was pleasantly surprised: it is an extension that adds real multithreading to PHP. No emulation, no magic, no fakes; everything is real.
The task I have in mind is this: there is a pool of jobs that need to be completed as quickly as possible. PHP has other tools for solving this problem; they are not covered here, since this article is about pthreads.
It is worth noting that the author of the extension, Joe Watkins, warns in his articles that multithreading is never easy and that you have to be ready for that.
If that does not scare you, read on.
pthreads is an object-oriented API that provides a convenient way to organize multithreaded computing in PHP. The API includes all the tools needed to create multithreaded applications. PHP applications can create, read, write, execute, and synchronize threads using objects of the Thread, Worker, and Threaded classes.
The hierarchy of the main classes that we just mentioned is shown in the diagram.
Threaded is the foundation of pthreads. It makes parallel code execution possible and provides synchronization and other helper methods.
Thread. You create a thread by extending Thread and implementing the run() method. run() starts executing in a separate thread at the moment start() is called. A thread can be started only from the context that created it, and it can be joined only from that same context.
Worker. A thread with a persistent context, which in most cases is the better choice over a plain Thread. It stays available while the object is in scope or until shutdown() is called explicitly.
In addition to these classes there is also the Pool class. A Pool is a pool (container) of Workers that can be used to distribute Threaded objects across Workers. Pool is the easiest and most efficient way to organize multiple threads.
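The Thread lifecycle described above (extend Thread, implement run(), call start(), then join() from the creating context) can be sketched roughly as follows. This is only a minimal illustration: the HelloThread class and the runHelloThread() helper are hypothetical names that do not come from the article, and the class is declared only when Thread is available so that the file still loads on a PHP build without pthreads.

```php
<?php
// Declare the class only when the extension (or the polyfill) provides Thread.
// HelloThread is a hypothetical name used purely for illustration.
if (class_exists('Thread')) {
    class HelloThread extends Thread
    {
        /** @var string */
        private $label;

        public function __construct($label)
        {
            $this->label = $label;
        }

        // Executed in the new thread once start() has been called.
        public function run()
        {
            printf("Hello from %s" . PHP_EOL, $this->label);
        }
    }
}

// Hypothetical helper: returns true if a thread was actually started and joined.
function runHelloThread($label)
{
    if (!class_exists('HelloThread')) {
        return false;
    }

    $thread = new HelloThread($label);
    $thread->start(); // run() begins in a separate thread
    $thread->join();  // the creating context waits for the thread to finish

    return true;
}

$ran = runHelloThread('thread #1');
echo $ran ? 'Thread finished' : 'pthreads is not available', PHP_EOL;
```

Both start() and join() are called from the same context that created the object, which matches the restriction mentioned in the description of Thread.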
We will not dwell on the theory; let's try it all out on an example right away.
Many kinds of tasks can be solved with multiple threads. I was interested in one specific and, I think, very typical problem. To recap: there is a pool of tasks, and they must be completed as quickly as possible.
So let's get started. First we create a data provider, MyDataProvider (Threaded); there is a single instance of it, shared by all threads.
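    /**
     * Data provider shared by all threads.
     */
    class MyDataProvider extends Threaded
    {
        /**
         * @var int Total number of items to hand out
         */
        private $total = 2000000;

        /**
         * @var int Number of items handed out so far
         */
        private $processed = 0;

        /**
         * Returns the next item, or null when nothing is left.
         *
         * @return mixed
         */
        public function getNext()
        {
            if ($this->processed === $this->total) {
                return null;
            }

            $this->processed++;

            return $this->processed;
        }
    }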
For each thread we will have a MyWorker (Worker), which stores a reference to the provider.
    /**
     * MyWorker is used here to share the provider between MyWork instances.
     */
    class MyWorker extends Worker
    {
        /**
         * @var MyDataProvider
         */
        private $provider;

        /**
         * @param MyDataProvider $provider
         */
        public function __construct(MyDataProvider $provider)
        {
            $this->provider = $provider;
        }

        /**
         * Invoked when the Pool starts the worker.
         */
        public function run()
        {
            // Nothing to do here in this example
        }

        /**
         * Returns the shared provider.
         *
         * @return MyDataProvider
         */
        public function getProvider()
        {
            return $this->provider;
        }
    }
The processing of each task from the pool (assume it is some resource-intensive operation), which is our bottleneck and the reason we turned to multithreading in the first place, happens in MyWork (Threaded).
    /**
     * MyWork is a task that can be executed in parallel.
     */
    class MyWork extends Threaded
    {
        public function run()
        {
            do {
                $value = null;

                $provider = $this->worker->getProvider();

                // Synchronize access to the provider while fetching the next item
                $provider->synchronized(function ($provider) use (&$value) {
                    $value = $provider->getNext();
                }, $provider);

                if ($value === null) {
                    continue;
                }

                // Some resource-intensive operation
                $count = 100;
                for ($j = 1; $j <= $count; $j++) {
                    sqrt($j + $value) + sin($value / $j) + cos($value);
                }
            } while ($value !== null);
        }
    }
Notice that data is taken from the provider inside synchronized(). Otherwise some items would likely be processed more than once, or some would be skipped.
Now let's make it all work with Pool.
    require_once 'MyWorker.php';
    require_once 'MyWork.php';
    require_once 'MyDataProvider.php';

    $threads = 8;

    // Create the data provider shared by all threads
    $provider = new MyDataProvider();

    // Create a pool of workers
    $pool = new Pool($threads, 'MyWorker', [$provider]);

    $start = microtime(true);

    // Submit one task per worker. Each task keeps pulling items
    // from the provider until none are left.
    $workers = $threads;
    for ($i = 0; $i < $workers; $i++) {
        $pool->submit(new MyWork());
    }

    $pool->shutdown();

    printf("Done for %.2f seconds" . PHP_EOL, microtime(true) - $start);
It turns out pretty elegant, in my opinion. I put this example on GitHub.
That's all! Well, almost. There is one thing that may disappoint the inquisitive reader: none of this works on standard PHP compiled with default options. To enjoy multithreading, your PHP must be built with ZTS (Zend Thread Safety) enabled.
The documentation states that PHP must be compiled with the --enable-maintainer-zts option. I did not try compiling it myself; instead I found a ready-made package for Debian and installed it.
    sudo add-apt-repository ppa:ondrej/php-zts
    sudo apt update
    sudo apt-get install php7.0-zts php7.0-zts-dev
This way I still have the same PHP as before, run from the console in the usual way with the php command; the web server keeps using it as well. In addition there is now a second PHP, which can be run from the console as php7.0-zts.
After that you can install the pthreads extension.
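    git clone https://github.com/krakjoe/pthreads.git
    cd pthreads
    # Run the phpize that belongs to the ZTS build before ./configure
    # (assumption: the binary name depends on how the ZTS package was installed)
    ./configure
    make -j8
    sudo make install
    echo "extension=pthreads.so" > pthreads.ini
    sudo cp pthreads.ini /etc/php/7.0-zts/cli/conf.d/pthreads.ini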
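Once the extension is installed, it can help to confirm from PHP itself that the interpreter you are running is the thread-safe build and that pthreads is actually loaded. The sketch below is an illustration rather than part of the original example; threadingSupport() is a hypothetical helper name, while PHP_ZTS and extension_loaded() are standard PHP.

```php
<?php
// Hypothetical helper: reports whether this PHP build can run real threads.
function threadingSupport()
{
    return [
        // PHP_ZTS is 1 on a thread-safe build and 0 otherwise
        'zts'      => (bool) PHP_ZTS,
        // true only when the real extension (not a polyfill) is loaded
        'pthreads' => extension_loaded('pthreads'),
    ];
}

$support = threadingSupport();

printf("ZTS enabled: %s" . PHP_EOL, $support['zts'] ? 'yes' : 'no');
printf("pthreads loaded: %s" . PHP_EOL, $support['pthreads'] ? 'yes' : 'no');
```

Running the script with php and then with php7.0-zts should make the difference between the two installations visible.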
Now that's it. Well... almost. Imagine that you have written multithreaded code, but PHP on a colleague's machine is not set up for it. Awkward, isn't it? But there is a way out.
Here again, thanks go to Joe Watkins, this time for the pthreads-polyfill package. The idea is as follows: the package contains the same classes as the pthreads extension, which lets your code run even when the extension is not installed. The code will simply execute in a single thread.
To make it work, you just add the package through Composer and stop worrying. It checks whether the extension is installed. If it is, the polyfill's job ends there. Otherwise, stub classes are loaded so that the code works in at least one thread.
Let's now check whether processing really happens in several threads and estimate the gain from this approach.
I will change the value of $threads from the example above and see what happens.
Information about the processor on which the tests were run:
    $ lscpu
    CPU(s):              8
    Thread(s) per core:  2
    Core(s) per socket:  4
    Model name:          Intel(R) Core(TM) i7-4700HQ CPU @ 2.40GHz
Let's look at the processor core loading diagram. Everything is in line with expectations.
[Charts: CPU core load for $threads = 1, 2, 4, and 8]
And now the most important part, the reason for all of this: let's compare execution times.
$threads | Note | Run time, seconds |
---|---|---|
PHP without ZTS | | |
1 | without pthreads, without polyfill | 265.05 |
1 | polyfill | 298.26 |
PHP with ZTS | | |
1 | without pthreads, without polyfill | 37.65 |
1 | | 68.58 |
2 | | 26.18 |
3 | | 16.87 |
4 | | 12.96 |
5 | | 12.57 |
6 | | 12.07 |
7 | | 11.78 |
8 | | 11.62 |
The first two rows show that with the polyfill we lost about 13% of performance in this example, compared with linear code on plain PHP with “nothing extra”.
Next, PHP with ZTS. Ignore the large difference in run time compared with PHP without ZTS (37.65 vs. 265.05 seconds); I did not try to bring the PHP settings to a common denominator. The non-ZTS build has Xdebug enabled, for example.
As you can see, with 2 threads the program runs about 1.5 times faster than the linear code, and with 4 threads about 3 times faster.
Note that even though the processor exposes 8 logical CPUs, the run time barely changed once more than 4 threads were used. This is apparently because my processor has 4 physical cores. For clarity, I have plotted the table as a chart.
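To make the comparison concrete, the speedup relative to the linear run can be recomputed from the measured times. The snippet below is a small sketch that uses only the figures from the "PHP with ZTS" rows of the table; the speedup() helper and the variable names are introduced here for illustration.

```php
<?php
// Run times in seconds, copied from the "PHP with ZTS" rows of the table.
$baseline = 37.65; // linear code: without pthreads, without polyfill
$timings = [
    1 => 68.58,
    2 => 26.18,
    3 => 16.87,
    4 => 12.96,
    5 => 12.57,
    6 => 12.07,
    7 => 11.78,
    8 => 11.62,
];

// Hypothetical helper: how many times faster a run is than the baseline.
function speedup($baseline, $seconds)
{
    return $baseline / $seconds;
}

$speedups = [];
foreach ($timings as $threads => $seconds) {
    $speedups[$threads] = round(speedup($baseline, $seconds), 2);
    printf(
        "%d thread(s): %.2f s, speedup x%.2f" . PHP_EOL,
        $threads,
        $seconds,
        $speedups[$threads]
    );
}
```

With these numbers the speedup is roughly 1.44 for 2 threads, 2.91 for 4 threads, and 3.24 for 8 threads, which is consistent with the curve flattening out after 4 threads. A single pthreads thread comes out at about 0.55, that is, slower than the linear run.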
With the pthreads extension, PHP can work with multiple threads quite elegantly, and this gives a tangible performance gain.
Source: https://habr.com/ru/post/300952/