📜 ⬆️ ⬇️

Simple implementation of multithreading in PHP

Multithreading in PHP is missing out of the box, so a great many options have been invented for its implementation, including the pthreads , AzaThread (CThread) extensions, and even a few PHP programmers' own developments.

The main disadvantage for me was the too many “bells and whistles” of these solutions - it is not always necessary to exchange information between threads and the parent process or to save resources. It should always be possible to solve the problem quickly and with minimum cost.

I want to make a reservation in advance that there are no great secrets in this post - it’s more likely for newbies in the language, and I decided to publish it only because I had once faced the problem myself and did not find a ready-made solution to make this kind of multithreading emulation on my own.

So, the task is to process a large amount of data that came into our script. My task was to process the JSON array of textual information, which the script had to digest from at least a large commit for PostgreSQL.
')
First of all we collect data in the parent file:

index.php

// bigdata.json -    .      - ,     .. $big_json = file_get_contents('bigdata.json'); $items = json_decode($big_json, true); //   php    ,    , ,  unset($big_json); // ... 

The size of the array ranged around 400mb (later reduced to ~ 50mb), and all the information was textual. It is not difficult to estimate the speed with which all this was digested, and if you consider that the script was executed by cron every 15 minutes, and the computing power was so-so - the speed suffered greatly.

After receiving the data, you can estimate their volume and, if necessary, calculate the required number of threads per CPU core, or you can simply decide that there will be 4 threads and count the number of rows for each thread:

index.php

  // ... $threads = 4; $strs_per_thread = ceil(count($items) / $threads); //      -   echo "Items: ".count($items)."\n"; echo "Items per thread: ".$strs_per_thread."\n"; echo "Threads: ".$threads."\n"; // ... 

It is necessary to immediately make a reservation - such a calculation "in the forehead" will not give an accurate result on the number of elements for each stream. He needed more to simplify the calculations.

And now the essence - create tasks for each thread and run it. We will do this "in the forehead" - creating a task for the second file - thread.php. It will act as a “stream”, receiving a range of elements to process and starting independently of the main script:

index.php

  // ... for($i = 0; $i < $threads; $i++){ if($i == 0) { passthru("(php -f thread.php 0 ".$strs_per_thread." & ) >> /dev/null 2>&1"); } if($i == $threads-1) { passthru("(php -f thread.php ".($strs_per_thread * $i)." ".count($items)." & ) >> /dev/null 2>&1"); } if(($i !== 0)&&($i !== $threads-1)) { $start = $strs_per_thread * $i + 1; $end = $start -1 + $strs_per_thread; passthru("(php -f thread.php ".$start." ".$end." & ) >> /dev/null 2>&1"); } } // ... 

The passthru () function is used to run console commands, but the script will wait for the completion of each of them. To do this, we wrap the command to run into a set of statements that will start the process and immediately return anything by starting the process and the parent process will not pause waiting for the execution of each child:

 #  ,    ,   Linux- (php -f thread.php start stop & ) >> /dev/null 2>&1 

What exactly is happening here, unfortunately, I can not say for sure - my friend linuksoid suggested to me a set of parameters. If you can decipher this magic in the comments, I will be grateful and will supplement the post.

File thread.php:

 $start = $argv[1]; $stop = $argv[2]; for ($i = $start; $i <= $stop; $i++) { // -          } 

This is a fairly simple way to implement multithreading emulation in PHP.

If you cut the whole example to dry output, then I think it would sound like this: the parent stream starts the child processes via the command line, telling them what kind of information to process.

Speaking of "emulation" I mean that with this method of implementation there is no possibility for the exchange of information between threads or between parent and child threads. It is suitable if it is known in advance that such opportunities are not needed.

Source: https://habr.com/ru/post/448464/


All Articles