
Differences between asynchronous and multithreaded architectures, using Node.js and PHP as examples

Platforms built on an asynchronous architecture have been on the rise lately. The asynchronous model underlies nginx, one of the fastest web servers in the world, and server-side JavaScript in the form of Node.js is developing rapidly. What is good about this architecture, and how does it differ from the classic multithreaded one? A great many articles have been written on the topic, but far from all of them leave the reader with a complete picture. One often sees debates around Node.js vs PHP + Apache in which many participants do not understand why some things can be done in Node.js but not in PHP, or, conversely, why perfectly correct PHP-style code will badly slow down Node.js or even hang it. In this article I would like to lay out the difference between the two architectures in detail once more, taking a web server with PHP and Node.js as the two example systems.

Multithreaded model


This model is familiar to everyone. The application creates a number of threads (a pool), handing each of them a task and the data to process. The tasks run in parallel. If the threads share no data, there is no synchronization overhead, and the work goes quickly. When a task finishes, the thread is not destroyed: it stays in the pool waiting for the next task, which removes the cost of creating and destroying threads. This is how a web server with PHP works: each script runs in its own thread, one thread per request. With many threads, slow requests occupy their threads for a long time while fast requests are served almost instantly, freeing their threads for other work, so slow requests cannot take all the CPU time and starve the fast ones.

But this scheme has limits. Suppose a large number of slow requests arrive, say, requests that hit the database or the file system. They occupy all the threads, making it impossible to execute anything else: even a request that needs only 1 ms will not be served in time. We could raise the number of threads so the pool can absorb more slow requests, but threads are scheduled by the OS, which itself consumes CPU time. The more threads we create, the greater the scheduling overhead and the less processor time each thread receives. PHP aggravates the situation: its blocking operations for the database, the file system, and I/O also waste processor time while performing no useful work.

Let us look at blocking operations in more detail. Imagine several threads, each processing requests that take 1 ms to handle the request itself, 2 ms to query the database and receive the data, and 1 ms to render the result: 4 ms per request in total. When the thread sends its database query, it starts waiting for the response; until the data comes back it does no work at all. That is 2 ms of idling out of every 4 ms. Yes, we cannot render the page without the data from the database; we have to wait. But it still means 50% of the thread's time is idle, and on top of that come the OS costs of allocating processor time to every thread, costs that grow with the number of threads. The result is a great deal of downtime, directly proportional to the duration of the database and file system queries. The best solution, one that lets us load the processor completely with useful work, is to move to an architecture built on non-blocking operations.
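The arithmetic above can be checked with a tiny script (a toy model, nothing more; the numbers are the 1 + 2 + 1 ms breakdown from the text):

```javascript
// Toy model of one blocking request, using the numbers from the text:
// 1 ms handling + 2 ms waiting for the database + 1 ms rendering.
var handleMs = 1;
var dbWaitMs = 2;                               // the thread sleeps here
var renderMs = 1;

var totalMs = handleMs + dbWaitMs + renderMs;   // 4 ms per request
var idleShare = dbWaitMs / totalMs;             // fraction of wasted time

console.log("total: " + totalMs + " ms, idle: " + (idleShare * 100) + "%");
// total: 4 ms, idle: 50%
```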

Asynchronous model


A less common model than the multithreaded one, but with no fewer possibilities. The asynchronous model is built around an event queue and an event loop. When an event occurs (a request arrives, a file has been read, the database has answered), it is placed at the end of the queue. The thread that services the queue takes events from its head and executes the code associated with each one. As long as the queue is not empty, the processor is busy. This is how Node.js works: a single thread processes the event queue (with the cluster module there can be more than one). Almost all operations are non-blocking. Blocking variants exist as well, but their use is strongly discouraged; you will soon see why.

Take the same example with the 1 + 2 + 1 ms request. The event associated with the arrival of the request is taken from the queue; handling it costs 1 ms. Then an asynchronous, non-blocking database query is issued and control returns immediately. We can take the next event from the queue and execute it: say, accept another request, handle it, send off its database query, return control, and do the same a third time. Now the database's answer to the very first request arrives, and the event associated with it is placed in the queue. If the queue is empty, it executes immediately: the data is rendered and sent back to the client. If the queue holds something else, it must wait its turn behind those events. Usually the time to serve a single request is comparable to a multithreaded system with blocking operations; in the worst case, waiting behind other events makes the request somewhat slower. But while the blocking system simply sat out its 2 ms of waiting, the non-blocking one managed to advance two parts of two other requests!

Each individual request may end up slightly slower, but at any given moment we can process many more of them, so overall throughput is higher: the processor is always busy with useful work. At the same time, walking the queue and moving from event to event costs far less than switching between threads in a multithreaded system, which is why an asynchronous system with non-blocking operations should have no more threads than the machine has cores. Node.js initially worked only in single-threaded mode, and to use the whole processor you had to start several copies of the server by hand and distribute the load between them, for example with nginx. Since then the cluster module has appeared for working with several cores (still experimental at the time of writing).

Here the key difference between the two systems becomes clear. A multithreaded system with blocking operations has a lot of downtime: an excessive number of threads creates heavy overhead, while an insufficient number leads to slower work under a flood of slow requests. An asynchronous, non-blocking application uses processor time more efficiently, but is harder to design. This especially concerns memory leaks: a Node.js process can run for a very long time, and if the programmer does not take care to clean up after each request, the leak will gradually grow until the server has to be restarted. An asynchronous architecture with blocking operations is also possible, but it is far less profitable, as some of the examples below will show. Let us highlight the points to keep in mind when developing asynchronous applications and analyze some mistakes people make when wrestling with the peculiarities of this architecture.
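The overlap described above is easy to observe in Node.js itself (a minimal sketch, with setTimeout standing in for an asynchronous, non-blocking database call):

```javascript
// Three "requests", each needing a 50 ms "database query".
// Because the query is non-blocking, all three waits overlap:
// everything finishes in ~50 ms rather than ~150 ms.
var start = Date.now();
var finished = 0;

for (var i = 1; i <= 3; i++) {
	setTimeout(function(){               // the "database" answers
		finished++;
		if (finished === 3) {
			console.log("all done in ~" + (Date.now() - start) + " ms");
		}
	}, 50);
}
```

With blocking calls the three 50 ms waits would have to run one after another; here they all tick down at the same time while the event loop stays free.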

Do not use blocking operations. Never


Well, at least not until you fully understand the architecture of Node.js and can work with blocking operations deliberately and precisely.
When moving from PHP to Node.js, some people are tempted to write code in the same style as before. Indeed, if we need to read a file first and only then process it, why can't we write the following code:
var fs = require('fs');
// Blocking call: the whole process stops here until the file is read
var data = fs.readFileSync("img.png");
response.write(data);

At first glance there is nothing wrong with this code. But while readFileSync is reading the file, Node.js does nothing at all: the single thread is blocked, the event queue is not being processed, and every other request is forced to wait. In a multithreaded system only one thread would stand idle; here the whole server does. The correct approach is to use the asynchronous version:

var fs = require('fs');
// Non-blocking call: control returns immediately, and the callback
// runs from the event queue once the file has been read
fs.readFile("img.png", function(err, data){
	response.write(data);
});

Note the difference: readFile returns control immediately, and Node.js goes on processing other events. When the file has been read, an event is placed in the queue, and when its turn comes, the callback is executed with the data. Nothing is lost; the response is simply written a little later. The price is that the code after the readFile call cannot use the data: at that point it does not exist yet.
Some people try to "wait out" an asynchronous operation with a loop, not realizing that this blocks the event loop:

var fs = require('fs');
var dataModified = false;
var myData;

fs.readFile("file.txt", function(err, data){
	dataModified = true;
	myData = data+" last read "+new Date();
});

// This loop never ends: while it spins, the event loop cannot run,
// so the readFile callback is never called and dataModified stays false
while (true){
	if(dataModified)
		break;
}

response.write(myData);

response.write(myData);

This code hangs the server forever. The while loop never yields control to the event loop, so the readFile callback is never executed and dataModified never becomes true. If you want to react to the completion of an asynchronous operation, do not poll for it — use events!

var fs = require('fs');
var events = require('events');
var myData;
var eventEmitter = new events.EventEmitter();

fs.readFile("file.txt", function(err, data){
	myData = data+" last read "+new Date();
	// Announce that the data is ready
	eventEmitter.emit('dataModified', myData);
});

// Subscribe a handler; it runs whenever the event is emitted
eventEmitter.on('dataModified', function(data){
	response.write(data);
});

At first glance it looks as if we emit the event before subscribing to it. But remember: readFile is asynchronous, so its callback, and the emit inside it, will run only on a later turn of the event loop, after the current code, including the eventEmitter.on call, has finished. events.EventEmitter is the standard Node.js class for this pattern: on subscribes a handler to an event, and emit fires the event, invoking all handlers subscribed to it at that moment.
This event-driven style is the natural way to write Node.js code. It takes some getting used to, but it keeps the event loop free and the server responsive.

Break long loops into parts


But what if you genuinely need a long, or even infinite, loop? Run it directly, and you block the event loop just as surely as in the example above. The answer is to break the work into parts: do one piece, then schedule the next piece as a new event with process.nextTick, giving the queue a chance to be processed in between.

function incredibleGigantCycle(){
	cycleProcess();                            // one piece of the work
	process.nextTick(incredibleGigantCycle);   // schedule the next piece
}

Each piece of work is a separate event. Between the pieces, Node.js processes everything else that has accumulated in the queue, so the server stays responsive.
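As a concrete illustration, a heavy computation such as summing a large array can be cut into chunks the same way (a sketch; the chunk size of 10 000 and the helper name are arbitrary choices):

```javascript
// Sum a large array in chunks of 10 000 elements, returning control
// to the event loop between chunks so other queued events can run.
function sumInChunks(arr, done){
	var total = 0;
	var i = 0;
	var CHUNK = 10000;

	function step(){
		var end = Math.min(i + CHUNK, arr.length);
		for (; i < end; i++) total += arr[i];
		if (i < arr.length) {
			process.nextTick(step);   // schedule the next chunk as a new event
		} else {
			done(total);              // deliver the result asynchronously
		}
	}
	step();
}

var data = new Array(100000);
for (var j = 0; j < data.length; j++) data[j] = 1;
sumInChunks(data, function(total){
	console.log("sum = " + total); // sum = 100000
});
```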

Remember, timers are not precise


(This applies not only to Node.js but to any event-loop system, the browser included.) When you set a timer, say for 500 ms, you are not guaranteed that the callback will run exactly 500 ms later. When the timer fires, an event is merely placed in the queue; if the queue already contains other events, or a handler is busy at that moment, the callback will be executed later. A timer guarantees only that the callback will run no earlier than the given delay, nothing more.
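This is easy to demonstrate: even a timer with a delay of 0 ms has to wait for the current synchronous code to finish, because its callback merely joins the event queue (a small sketch):

```javascript
var order = [];

setTimeout(function(){ order.push('timer'); }, 0);  // queued, not run now

// Busy synchronous work: while it runs, the event loop cannot touch
// the queue, so even a 0 ms timer has to wait.
var start = Date.now();
while (Date.now() - start < 20) { /* spin for ~20 ms */ }

order.push('sync code done');
console.log(order);     // [ 'sync code done' ] -- the timer has not fired yet

setTimeout(function(){
	console.log(order); // [ 'sync code done', 'timer' ]
}, 10);
```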

Remember, the result exists only inside the callback


When you call an asynchronous function, its result is not available on the next line. In Node.js, the functions without the Sync suffix deliver their result through the callback, and only there. A typical mistake is to try to use the data right after the call:

var fs = require('fs');
fs.readFile("img.png", function(err, data){
	// the file contents are available only here
});
response.write(data); // error: "data" does not exist in this scope

This code will not work. At the moment response.write is called, the file has not been read yet, and data is not even visible in that scope: it exists only as the callback's parameter. All work with the result of an asynchronous operation must be done inside its callback.


Does this mean that a "one-liner" in PHP turns into a pyramid of callbacks in Node.js? Sometimes, yes. Consider an example.
In PHP:

 $user->getCountry()->getCurrency()->getCode()

In Node.js, if each getter is an asynchronous operation, the same chain becomes:

user.getCountry(function(country){
	country.getCurrency(function(currency){
		console.log(currency.getCode())
	})
})


Three levels of nesting for one short chain. But the comparison is not quite fair: in PHP these getters run synchronously, and if any of them actually queried the database, the thread would simply block and wait on every call. In Node.js each step is written asynchronously precisely because it may involve I/O, and while that I/O is in flight the process keeps serving other requests; the nesting is the price of never blocking. For getters that touch no I/O, nothing stops you from writing the same synchronous chain in Node.js as in PHP.
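The nesting itself is partly a matter of style: the same chain can be flattened by giving each callback a name (a sketch with hypothetical user, country, and currency objects; each getter delivers its result asynchronously through a callback, with process.nextTick standing in for a database query):

```javascript
// Hypothetical asynchronous getters, standing in for database queries.
var currency = { getCode: function(){ return "EUR"; } };
var country = {
	getCurrency: function(callback){
		process.nextTick(function(){ callback(currency); });
	}
};
var user = {
	getCountry: function(callback){
		process.nextTick(function(){ callback(country); });
	}
};

// Instead of nesting anonymous functions, name each step:
function onCountry(country){
	country.getCurrency(onCurrency);
}
function onCurrency(currency){
	console.log(currency.getCode()); // EUR
}

user.getCountry(onCountry);
```

The control flow is identical to the nested version; only the shape of the source code changes.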


Node.js is not a replacement for PHP, and PHP is not made obsolete by Node.js: they are built on different architectures, each with its own strengths. Node.js shines when there are many concurrent connections and a lot of slow I/O; the classic multithreaded model remains a perfectly good fit elsewhere. What matters is to understand the difference: writing blocking, PHP-style code in Node.js will cripple it, just as expecting PHP to behave like an event loop leads nowhere.

Source: https://habr.com/ru/post/150788/

