All you need to know about Node.js

Hi, Habr! I present to you the translation of the article "Everything you need to know about Node.js" by Jorge Ramón.

Nowadays, the Node.js platform is one of the most popular platforms for building efficient and scalable REST APIs. It is also suitable for building hybrid mobile apps, desktop programs, and even IoT applications.

I have been working with the Node.js platform for over 6 years and I really love it. This post aims to be a guide to how Node.js actually works.

Let's get started!

The world before Node.js

Multithreaded server

Web applications built on the client/server architecture work as follows: the client requests a resource from the server, and the server sends the resource in response. In this scheme, the server closes the connection once it has responded.

This model is efficient because every request to the server consumes resources (memory, CPU time, etc.). To process the next request from a client, the server must finish processing the previous one.

Does this mean that the server can only process one request at a time? Not really! When the server receives a new request, it creates a separate thread to process it.

A thread, in simple terms, is the time and resources that the CPU allocates to execute a small block of instructions. That said, the server can process several requests at the same time, but only one per thread. This is also called the thread-per-request model.

To process N requests, the server needs N threads. If the server receives N + 1 requests, then it must wait until one of the threads becomes available.

For example, a server with 4 threads can process up to 4 requests at a time; when it receives 3 more requests, they must wait until one of those 4 threads becomes available.
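The waiting described here can be sketched with a small simulation. This is only an illustration under simplifying assumptions: simulateThreadPerRequest is a hypothetical helper, all requests arrive at the same moment, and each one takes the same 10 time units.

```javascript
// Simulates a thread-per-request server with a fixed number of threads.
// `durations` holds the processing time of each request (arbitrary time units);
// all requests are assumed to arrive at time 0.
function simulateThreadPerRequest(threadCount, durations) {
  // freeAt[i] is the time at which thread i becomes available again.
  const freeAt = new Array(threadCount).fill(0);

  return durations.map((duration, requestId) => {
    // Pick the thread that becomes free first.
    const thread = freeAt.indexOf(Math.min(...freeAt));
    const start = freeAt[thread];
    freeAt[thread] = start + duration;
    return { requestId, thread, start, end: start + duration };
  });
}

// 4 threads, 7 requests that each take 10 time units.
const schedule = simulateThreadPerRequest(4, [10, 10, 10, 10, 10, 10, 10]);
for (const { requestId, thread, start, end } of schedule) {
  console.log(`request ${requestId}: thread ${thread}, starts at ${start}, ends at ${end}`);
}
```

The first 4 requests start at time 0, while the remaining 3 cannot start until time 10, when the first threads become free.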

One way to get around this limitation is to add more resources (memory, processor cores, etc.) to the server, but this is not the best solution.

And, of course, do not forget about technological limitations.

Blocking I/O

The limited number of threads on the server is not the only problem. Perhaps you have wondered why a single thread cannot process several requests at the same time. It is all because of blocking I/O operations.

Suppose you are developing an online store and you need a page where the user can view a list of all products.

The user visits http://yourstore.com/products and the server responds by rendering an HTML page with all the products from the database. Not difficult at all, is it?

But what happens behind the scenes?

1. When a user visits /products, a particular method or function must be executed to process the request. A small piece of code (yours or the framework's) parses the request URL and looks for the matching method or function. The thread is working.
2. Now the matching method or function is executed. As in the first step, the thread is working.
3. Since you are a good developer, you save all system logs to a file, and of course, to be sure that the router is calling the right method or function, you also log the line "Method X executing!!". But these are all blocking I/O operations. The thread is waiting.
4. All logs are saved and the next lines of the function are executed. The thread is working again.
5. Time to go to the database and fetch all the products. A simple query like SELECT * FROM products does the job, but guess what? Yes, it is a blocking I/O operation. The thread is waiting.
6. You have received an array or list of all the products, but to be sure, you log them as well. The thread is waiting.
7. Now you have all the products and it is time to render the template for the page, but before rendering it you need to read it from disk. The thread is waiting.
8. The rendering engine does its job and sends the response to the client. The thread is working again.
9. The thread is free, like a bird in the sky.
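The steps above can be sketched with the synchronous file APIs of Node.js. This is a minimal sketch, not the code of any real store: the temp directory, the template contents, and queryProductsSync (a stand-in for a blocking database driver) are all assumptions made for illustration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical file locations, created in a temp directory just for this sketch.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'blocking-io-'));
const logFile = path.join(dir, 'app.log');
const templateFile = path.join(dir, 'products.html');
fs.writeFileSync(templateFile, '<ul>{{items}}</ul>');

// Stand-in for a synchronous database driver; a real query would block on the network.
function queryProductsSync() {
  return ['Keyboard', 'Mouse'];
}

function handleProducts() {
  fs.appendFileSync(logFile, 'Method handleProducts executing\n'); // thread waits (disk)
  const products = queryProductsSync();                            // thread waits (database)
  fs.appendFileSync(logFile, `Got ${products.length} products\n`); // thread waits (disk)
  const template = fs.readFileSync(templateFile, 'utf8');          // thread waits (disk)
  const items = products.map((name) => `<li>${name}</li>`).join('');
  return template.replace('{{items}}', items);                     // thread works (CPU)
}

console.log(handleProducts()); // <ul><li>Keyboard</li><li>Mouse</li></ul>
```

Every call ending in Sync stops the thread until the operating system has finished the operation, which is exactly the waiting described in the list.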

How slow are I/O operations? Well, it depends on the specific operation. Let's look at the table:

Operation	Number of CPU cycles
CPU registers	3 cycles
L1 cache	8 cycles
L2 cache	12 cycles
RAM	150 cycles
Disk	30,000,000 cycles
Network	250,000,000 cycles

Network and disk operations are extremely slow by comparison. Imagine how many requests or calls to an external API your system could handle during this time.
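To get a feeling for these numbers, the cycle counts can be converted into time. The sketch below assumes a 3 GHz processor; the clock speed is an assumption chosen for illustration, not a figure from the table.

```javascript
// Assumed clock speed for the conversion: 3 GHz = 3e9 cycles per second.
const CLOCK_HZ = 3e9;

const costInCycles = {
  'CPU registers': 3,
  'L1 cache': 8,
  'L2 cache': 12,
  'RAM': 150,
  'Disk': 30000000,
  'Network': 250000000,
};

// Converts a number of CPU cycles into milliseconds at the given clock speed.
function cyclesToMs(cycles, clockHz = CLOCK_HZ) {
  return (cycles / clockHz) * 1000;
}

for (const [operation, cycles] of Object.entries(costInCycles)) {
  console.log(`${operation}: ${cyclesToMs(cycles).toFixed(6)} ms`);
}
```

Under this assumption a disk operation costs about 10 ms and a network operation about 83 ms, while a RAM access takes a tiny fraction of a microsecond.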

To summarize: I/O operations make the thread wait and waste resources.

C10K problem

Problem

C10k (10k connections) is the problem of handling 10 thousand concurrent connections.

In the early 2000s, server and client machines were slow. The problem was handling 10,000 concurrent client connections on a single machine.

But why couldn't the traditional thread-per-request model solve this problem? Well, let's do some math.

Native thread implementations allocate more than 1 MB of memory per thread, so 10 thousand threads require about 10 GB of RAM, and that is only for the thread stacks. And do not forget, we are in the early 2000s!
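The arithmetic can be checked in a few lines. The sketch assumes exactly 1 MB (1024 * 1024 bytes) of stack per thread, which is the lower bound mentioned in the text; stackMemoryGiB is a hypothetical helper name.

```javascript
// Assumption taken from the text: roughly 1 MB of stack per native thread.
const STACK_BYTES_PER_THREAD = 1024 * 1024;

// Returns the memory needed for thread stacks alone, in gigabytes (GiB).
function stackMemoryGiB(threadCount, bytesPerThread = STACK_BYTES_PER_THREAD) {
  return (threadCount * bytesPerThread) / 1024 ** 3;
}

console.log(stackMemoryGiB(10000).toFixed(2));    // "9.77"
console.log(stackMemoryGiB(10000000).toFixed(0)); // "9766"
```

So 10 thousand threads need roughly 10 GB for their stacks, and 10 million threads would need roughly 10 TB.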

Nowadays, server and client machines are faster and more efficient, and almost any programming language or framework can cope with this problem. But the problem has not really gone away: with 10 million client connections to a single machine, it comes back (now as the C10M problem).

JavaScript to the rescue?

Beware, spoilers!
Node.js actually solves the C10K problem... but how?!

Server-side JavaScript was nothing new in the early 2000s. At that time there were already implementations on top of the JVM (Java Virtual Machine), such as RingoJS and AppEngineJS, that used the thread-per-request model.

But if they could not solve the problem, then how could Node.js?! All because JavaScript is single-threaded.

Node.js and event loop

Node.js

Node.js is a server-side platform built on V8, the Google Chrome JavaScript engine, which compiles JavaScript code into native machine code.

Node.js uses an event-driven model and a non-blocking I/O architecture, which makes it lightweight and efficient. It is not a framework and not a library; it is a JavaScript runtime environment.

Let's write a small example:

    // Import the native http module
    const http = require('http');

    // Create a server instance that responds
    // to every request with 'Hello World'
    const server = http.createServer(function (request, response) {
      response.write('Hello World');
      response.end();
    });

    // Listen on port 8080
    server.listen(8080);

Non-blocking I/O

Node.js uses non-blocking I/O operations. This means that:

The main thread will not be blocked by I/O operations.
The server will continue to serve requests.
We have to work with asynchronous code.

Let's write an example in which the server responds to a request for /home with an HTML page, and to all other requests with 'Hello World'. To send the HTML page, you first need to read it from a file.

home.html

    <html>
      <body>
        <h1>This is home page</h1>
      </body>
    </html>

index.js

    const http = require('http');
    const fs = require('fs');

    const server = http.createServer(function (request, response) {
      if (request.url === '/home') {
        // Read the page asynchronously; the callback runs once the read finishes
        fs.readFile(`${__dirname}/home.html`, function (err, content) {
          if (!err) {
            response.setHeader('Content-Type', 'text/html');
            response.write(content);
          } else {
            response.statusCode = 500;
            response.write('An error has occurred');
          }

          response.end();
        });
      } else {
        response.write('Hello World');
        response.end();
      }
    });

    server.listen(8080);

If the requested URL is /home, the native fs module is used to read the home.html file.

The functions passed to http.createServer and fs.readFile as arguments are callbacks. These functions will be executed at some point in the future (the first as soon as the server receives a request, and the second when the file has been read from disk and placed in a buffer).

While the file is being read from disk, Node.js can process other requests and even read the file again, all in one thread... but how?!

Event loop

The event loop is the magic that happens inside Node.js. It is literally an infinite loop, and it runs on a single thread.

Libuv is a C library that implements this pattern and is part of the Node.js core. You can learn more about libuv in its documentation.

The event loop has 6 phases; each full pass through all 6 phases is called a tick.

timers: executes callbacks scheduled by setTimeout() and setInterval();
pending callbacks: executes I/O callbacks deferred to the next loop iteration;
idle, prepare: used only internally;
poll: retrieves new I/O events and executes I/O-related callbacks (almost all callbacks, except for close callbacks, timers, and setImmediate()); Node.js can block at this stage;
check: callbacks scheduled by setImmediate() are executed at this stage;
close callbacks: for example, socket.on('close', ...).

Well, there is only one thread, and this thread is the event loop, but then who performs all the I/O operations?

Note!
When the event loop needs to perform an I/O operation, it hands the work to the operating system or to an OS thread from libuv's thread pool, and when the task is completed, its callback is queued to be run by the event loop.

Isn't that cool?

The problem of CPU-intensive tasks

Node.js seems perfect! You can create whatever you want.

Let's write an API for computing prime numbers.

A prime number is a positive integer greater than 1 that is divisible only by 1 and by itself.

Given the number N, the API should calculate and return the first N prime numbers in the list (or array).

primes.js

    // Returns true if n is a prime number
    function isPrime(n) {
      for (let i = 2, s = Math.sqrt(n); i <= s; i++) {
        if (n % i === 0) return false;
      }
      return n > 1;
    }

    // Returns an array with the first n prime numbers
    function nthPrime(n) {
      let counter = n;
      let iterator = 2;
      let result = [];

      while (counter > 0) {
        isPrime(iterator) && result.push(iterator) && counter--;
        iterator++;
      }

      return result;
    }

    module.exports = { isPrime, nthPrime };

index.js

    const http = require('http');
    const url = require('url');
    const primes = require('./primes');

    const server = http.createServer(function (request, response) {
      const { pathname, query } = url.parse(request.url, true);

      if (pathname === '/primes') {
        const result = primes.nthPrime(query.n || 0);
        response.setHeader('Content-Type', 'application/json');
        response.write(JSON.stringify(result));
        response.end();
      } else {
        response.statusCode = 404;
        response.write('Not Found');
        response.end();
      }
    });

    server.listen(8080);

primes.js implements the necessary computations: the isPrime function checks whether a number is prime, and nthPrime returns the first N such numbers.

The index.js file is responsible for creating the server and uses the primes.js module to process each request for /primes. The number N is passed through the query string in the URL.

To get the first 20 primes we need to make a request to http://localhost:8080/primes?n=20.

Suppose 3 clients are trying to access our non-blocking I/O API:

The first asks for 5 prime numbers every second.
The second requests 1000 prime numbers every second.
The third is requesting 10,000,000,000 primes, but ...

When the third client sends its request, the main thread is blocked, and this is the main symptom of the CPU-intensive task problem. When the main thread is busy executing a “heavy” task, it becomes unavailable for other tasks.
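The blocking is easy to reproduce without a server. In this sketch, blockFor is a hypothetical helper that busy-waits to stand in for a heavy computation, and the 200 ms duration is arbitrary. A timer scheduled for 0 ms cannot fire until the main thread is free again.

```javascript
// Busy-waits for roughly `ms` milliseconds, standing in for a heavy computation.
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // Pure CPU work: the event loop cannot run anything else meanwhile.
  }
}

let timerFired = false;
const scheduledAt = Date.now();

// Scheduled to run "immediately", but it needs the main thread to be free.
setTimeout(function () {
  timerFired = true;
  console.log(`timer ran after ${Date.now() - scheduledAt} ms`);
}, 0);

blockFor(200);

// Still false: the timer callback could not run while the loop above was busy.
console.log(`timer fired during the computation: ${timerFired}`);
```

The timer callback only runs once blockFor has returned, so its reported delay is at least as long as the computation. Requests arriving at an HTTP server are delayed in the same way.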

But what about libuv? If you remember, this library helps Node.js perform I/O operations using OS threads, avoiding blocking the main thread. You are absolutely right, this is a solution to our problem, but for it to work our module would have to be written in C++ so that libuv can work with it.

Fortunately, starting with v10.5, Node.js includes the native Worker Threads module.

Worker threads

As the documentation tells us:

Workers are useful for performing CPU-intensive JavaScript operations; do not use them for I/O operations, since the mechanisms already built into Node.js handle such tasks more efficiently than worker threads.

Code fix

It's time to rewrite our code:

primes-workerthreads.js

    const { workerData, parentPort } = require('worker_threads');

    // Returns true if n is a prime number
    function isPrime(n) {
      for (let i = 2, s = Math.sqrt(n); i <= s; i++)
        if (n % i === 0) return false;
      return n > 1;
    }

    // Returns an array with the first n prime numbers
    function nthPrime(n) {
      let counter = n;
      let iterator = 2;
      let result = [];

      while (counter > 0) {
        isPrime(iterator) && result.push(iterator) && counter--;
        iterator++;
      }

      return result;
    }

    // Send the result back to the main thread
    parentPort.postMessage(nthPrime(workerData.n));

index-workerthreads.js

    const http = require('http');
    const url = require('url');
    const { Worker } = require('worker_threads');

    const server = http.createServer(function (request, response) {
      const { pathname, query } = url.parse(request.url, true);

      if (pathname === '/primes') {
        const worker = new Worker('./primes-workerthreads.js', { workerData: { n: query.n || 0 } });

        worker.on('error', function () {
          response.statusCode = 500;
          response.write('Oops there was an error...');
          response.end();
        });

        let result;
        worker.on('message', function (message) {
          result = message;
        });

        worker.on('exit', function () {
          // The exit event also fires after an error; do not respond twice
          if (response.headersSent) return;

          response.setHeader('Content-Type', 'application/json');
          response.write(JSON.stringify(result));
          response.end();
        });
      } else {
        response.statusCode = 404;
        response.write('Not Found');
        response.end();
      }
    });

    server.listen(8080);

In index-workerthreads.js, each request for /primes creates an instance of the Worker class (from the native worker_threads module) to load and execute the primes-workerthreads.js file in a worker thread. When the list of primes has been computed, the message event is triggered and the result is passed to the main thread. Since the worker has no work left, it also triggers the exit event, allowing the main thread to send the data to the client.

primes-workerthreads.js has changed a bit. It imports workerData (a copy of the parameters passed from the main thread) and parentPort, through which the result of the worker's work is passed back to the main thread.

Now let's try our example again and see what happens:

The main thread is no longer blocked!

Now everything works as it should, but it is still not good practice to spawn workers without good reason: creating threads is not cheap. Be sure to create a thread pool first.
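What such a pool might look like is sketched below. This is a minimal illustration under stated assumptions, not a production implementation: WorkerPool is a hypothetical class, the worker source is kept inline (via the eval option of Worker) so the example is self-contained, and the worker simply doubles each number instead of computing primes. A worker that fails is not replaced.

```javascript
const { Worker } = require('worker_threads');

// Worker source kept inline for the sketch; it doubles each number it receives.
const workerSource = `
  const { parentPort } = require('worker_threads');
  parentPort.on('message', (n) => parentPort.postMessage(n * 2));
`;

class WorkerPool {
  constructor(size) {
    this.idle = [];    // workers waiting for a task
    this.queue = [];   // tasks waiting for a worker
    this.workers = []; // every worker, so they can be stopped later
    for (let i = 0; i < size; i++) {
      const worker = new Worker(workerSource, { eval: true });
      this.workers.push(worker);
      this.idle.push(worker);
    }
  }

  // Returns a promise that resolves with the worker's reply for `data`.
  run(data) {
    return new Promise((resolve, reject) => {
      this.queue.push({ data, resolve, reject });
      this.dispatch();
    });
  }

  // Hands queued tasks to idle workers.
  dispatch() {
    while (this.idle.length > 0 && this.queue.length > 0) {
      const worker = this.idle.pop();
      const { data, resolve, reject } = this.queue.shift();
      const onMessage = (result) => {
        worker.off('error', onError);
        this.idle.push(worker); // the worker is reused for the next task
        resolve(result);
        this.dispatch();
      };
      const onError = (err) => {
        worker.off('message', onMessage);
        reject(err);
      };
      worker.once('message', onMessage);
      worker.once('error', onError);
      worker.postMessage(data);
    }
  }

  // Stops all workers so the process can exit.
  close() {
    return Promise.all(this.workers.map((worker) => worker.terminate()));
  }
}

const pool = new WorkerPool(2);
Promise.all([1, 2, 3, 4, 5].map((n) => pool.run(n)))
  .then((results) => console.log(results)) // [ 2, 4, 6, 8, 10 ]
  .finally(() => pool.close());
```

Five tasks are served by only two threads: the threads are created once and reused, and tasks that arrive while both are busy wait in the queue.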

Conclusion

Node.js is a powerful technology worth exploring when possible.
My personal recommendation - always be curious! If you know how something works from the inside, you can work with it more effectively.

This is all for today, guys. I hope this post was useful for you and you have learned something new about Node.js.

Thanks for reading and see you in the next posts.

Source: https://habr.com/ru/post/460661/
