Bad press about Node.js often concerns performance problems. This does not mean that Node.js has more problems than other technologies; users simply need to keep in mind certain aspects of how it works. Although the technology has a gentle learning curve, the machinery that keeps it running is quite complex. You need to understand it to avoid performance pitfalls, and if something does go wrong, you need to know how to fix it quickly. In this article, Daniel Hahn explains how Node.js manages memory and how to track down memory-related problems.

Unlike platforms such as PHP, Node.js applications are long-running processes. This has a number of advantages, for example the ability to connect to the database once and reuse that connection for all requests. But it can also cause problems. First, let's take a look at the basics of Node.js.
[Figure: A real Austrian garbage collector]
Node.js is a C++ program controlled by the V8 JavaScript engine. Google V8 is an engine that was originally written for Google Chrome but can also be used standalone. This makes it ideal for Node.js, and it is in fact the only part of the platform that "understands" JavaScript. V8 compiles JavaScript into machine code and executes it. During execution, the engine allocates and frees memory as needed. This means that when we talk about memory management in Node.js, we are really talking about V8.
Here you can see a simple example of how to use V8 from C++.
V8 memory scheme
A running program is always represented by some amount of space allocated in memory. This space is called the Resident Set. V8 uses a scheme similar to that of the Java Virtual Machine and divides memory into segments:
- Code: the code currently being executed;
- Stack: contains all value types (primitives such as integers or Booleans), along with pointers that reference objects on the heap and pointers that define the control flow of the program;
- Heap: a memory segment for storing reference types such as objects, strings, and closures.
[Figure: V8 memory scheme]
In Node.js, current memory usage can be obtained by calling process.memoryUsage(). The function returns an object containing:
- Resident Set size;
- total heap size;
- the amount of space used in the heap.
This function can be used to record memory usage over time and plot a graph showing how V8 manages memory.
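Recording memory usage over time can be sketched roughly as follows. The helper names sampleMemory and toCsvRow, the CSV output, and the one-second interval are assumptions made for illustration, not part of the article; only process.memoryUsage() comes from the text.

```javascript
// Sketch: sample process.memoryUsage() at a fixed interval so the readings
// can be plotted later. sampleMemory and toCsvRow are hypothetical helpers.
function sampleMemory(intervalMs, onSample) {
  const timer = setInterval(() => {
    const { rss, heapTotal, heapUsed } = process.memoryUsage();
    onSample({ time: Date.now(), rss, heapTotal, heapUsed });
  }, intervalMs);
  timer.unref(); // do not keep the process alive just for sampling
  return () => clearInterval(timer); // call the returned function to stop
}

// Convert one reading to a CSV row, with sizes in megabytes.
function toCsvRow(sample) {
  const mb = (bytes) => (bytes / 1048576).toFixed(2);
  return [sample.time, mb(sample.rss), mb(sample.heapTotal), mb(sample.heapUsed)].join(',');
}

// Usage: print one CSV row per second.
// const stop = sampleMemory(1000, (s) => console.log(toCsvRow(s)));
```

The rows can be fed into any plotting tool to obtain a graph of the Resident Set size, total heap size, and used heap space.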
[Figure: Node.js memory usage over time]
We see that the used heap space fluctuates heavily but always stays within certain bounds, which keeps average consumption constant. The process that allocates and frees memory is called garbage collection.
Introduction to Garbage Collection
Every program that consumes memory needs a mechanism for reserving and freeing it. In C and C++, this is done with malloc() and free(), as shown in the example below:
char *buffer;
buffer = (char *) malloc(42);
/* Do something with buffer */
free(buffer);
We see that the programmer is responsible for freeing unused memory. If a program only allocates memory and never frees it, the heap grows until the available memory is exhausted, which crashes the program. We call this a memory leak.
As we already know, in Node.js JavaScript is compiled to native code by V8. The resulting native data structures have little to do with their original representation and are managed solely by V8. This means that we cannot actively allocate or free memory in JavaScript. To solve this problem, V8 uses a well-known mechanism: garbage collection.
The principle of garbage collection is quite simple: if nothing references a memory segment, we can assume that it is unused and free it. However, obtaining and maintaining this information is rather complicated, since the code may contain chained references and indirections that form a complex graph structure.
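The idea of "nothing references it, so it can be freed" can be illustrated with a toy model. This is only a sketch of reachability on a hand-written graph, not V8's actual algorithm; the heap representation and the function name findGarbage are invented for the example.

```javascript
// Toy model of reachability, not V8's actual algorithm.
// The heap is a plain object mapping an object id to the ids it references.
function findGarbage(heap, roots) {
  const reachable = new Set();
  const pending = [...roots];
  while (pending.length > 0) {
    const id = pending.pop();
    if (reachable.has(id)) continue; // already visited; this also handles cycles
    reachable.add(id);
    for (const target of heap[id] || []) pending.push(target);
  }
  // Everything that was never reached from a root can be freed.
  return Object.keys(heap).filter((id) => !reachable.has(id));
}

const heap = {
  a: ['b'],
  b: ['c'],
  c: [],
  d: ['e'], // d and e reference each other, but nothing reachable points to them
  e: ['d'],
};
console.log(findGarbage(heap, ['a'])); // [ 'd', 'e' ]
```

Note that d and e still reference each other; what makes them garbage is that no chain of references leads to them from a root.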
[Figure: Heap graph. The red object can be freed only when it is no longer referenced.]
Garbage collection is a rather expensive process because it interrupts the execution of the application, which naturally affects performance. To mitigate this, V8 uses two types of garbage collection:
- Scavenge: fast, but incomplete;
- Mark-Sweep: relatively slow, but frees all unreferenced memory.
An excellent post with very detailed information about garbage collection can be found at this link.
Now, looking at the graph obtained with process.memoryUsage(), you can easily tell the two types of garbage collection apart: the sawtooth pattern is produced by Scavenge runs, while the sharp drops indicate Mark-Sweep runs.
Using the native node-gc-profiler module, you can get even more information about the work of the garbage collector. The module subscribes to the garbage collection events fired by V8 and exposes them to JavaScript.
The returned object indicates the type of garbage collection and the duration. Again, this data can be easily displayed graphically to make it clearer how things work.
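The same kind of data (collection type and duration) can also be gathered without a native module. The sketch below swaps in Node's built-in perf_hooks module instead of node-gc-profiler, so it is not the approach used in the article; it assumes a Node.js version that reports 'gc' performance entries, and the helper names gcKindName, describeGc, and watchGc are hypothetical.

```javascript
// Sketch using Node's built-in perf_hooks module instead of node-gc-profiler.
// Assumes a Node.js version that reports 'gc' performance entries.
const { PerformanceObserver, constants } = require('perf_hooks');

// Map the numeric GC kind to a readable name.
function gcKindName(kind) {
  switch (kind) {
    case constants.NODE_PERFORMANCE_GC_MINOR: return 'Scavenge';
    case constants.NODE_PERFORMANCE_GC_MAJOR: return 'Mark-Sweep';
    case constants.NODE_PERFORMANCE_GC_INCREMENTAL: return 'Incremental marking';
    case constants.NODE_PERFORMANCE_GC_WEAKCB: return 'Weak callbacks';
    default: return 'Unknown';
  }
}

// Reduce a performance entry to the two values discussed in the text.
function describeGc(entry) {
  // Newer Node.js versions put the kind under entry.detail, older ones on the entry itself.
  const kind = entry.detail ? entry.detail.kind : entry.kind;
  return { type: gcKindName(kind), durationMs: entry.duration };
}

// Hypothetical helper: calls onGc for every garbage collection that is reported.
function watchGc(onGc) {
  const observer = new PerformanceObserver((list) => {
    list.getEntries().forEach((entry) => onGc(describeGc(entry)));
  });
  observer.observe({ entryTypes: ['gc'] });
  return observer; // call observer.disconnect() to stop watching
}

// Usage:
// watchGc((info) => console.log(info.type, info.durationMs.toFixed(2) + ' ms'));
```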
[Figure: Duration and frequency of garbage collector runs]
You can clearly see that Scavenge runs much more often than Mark-Sweep. The duration varies with the complexity of the application. Interestingly, the graph also shows frequent, very short Mark-Sweep runs whose purpose is not yet clear to me.
When something goes wrong
If the garbage collector cleans up memory, why should we worry? In fact, memory leaks can still easily show up in your logs.
[Figure: Memory leak exception]
Using the graph we created earlier, we can watch memory fill up steadily!
[Figure: Memory leak in progress]
The garbage collector does everything it can to free memory. But after each run we see that memory consumption keeps increasing, which is a clear sign of a memory leak. Now that we know how to reliably detect a memory leak, let's see what it takes to create one.
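The sign described above, consumption that is higher after every garbage collector run, can be turned into a rough automated check. The following is only a heuristic sketch under simple assumptions: the lowest heapUsed reading in each window of samples is taken as the level left after collection, and the function name and window size are hypothetical.

```javascript
// Heuristic sketch: suspect a leak when the lowest heapUsed value of each
// window of samples (roughly the level left after garbage collection) keeps rising.
function looksLikeLeak(heapUsedSamples, windowSize) {
  const floors = [];
  for (let i = 0; i + windowSize <= heapUsedSamples.length; i += windowSize) {
    floors.push(Math.min(...heapUsedSamples.slice(i, i + windowSize)));
  }
  if (floors.length < 3) return false; // not enough data to judge
  return floors.every((floor, i) => i === 0 || floor > floors[i - 1]);
}

// Made-up readings in MB: a stable sawtooth and one whose floor keeps climbing.
const healthy = [10, 14, 18, 11, 15, 19, 10, 13, 17];
const leaking = [10, 14, 18, 13, 17, 21, 16, 20, 24];
console.log(looksLikeLeak(healthy, 3)); // false
console.log(looksLikeLeak(leaking, 3)); // true
```

A real application would need longer observation periods and some tolerance for noise before raising an alarm.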
Creating a memory leak
Some leaks are obvious, like storing data in global variables (for example, appending the IP addresses of all logged-in users to an array). Others are more subtle, such as the well-known Walmart memory leak, caused by a tiny missing statement in the Node.js core code, which took weeks to track down.
I am not going to cover errors in the core code here. Instead, let's look at a hard-to-find leak in the code from the Meteor blog, one that you could easily introduce into your own code.
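The listing itself is not reproduced in this text. The following is a reconstruction based on the widely circulated example from the Meteor blog, using the identifiers that the discussion refers to (theThing, replaceThing, originalThing, unused, someMethod, longStr); details may differ from the original listing, and the logged strings are placeholders.

```javascript
var theThing = null;

var replaceThing = function () {
  var originalThing = theThing;
  // Never called, but it references originalThing and therefore shares
  // a closure context with someMethod below.
  var unused = function () {
    if (originalThing) {
      console.log('hi');
    }
  };
  theThing = {
    longStr: new Array(1000000).join('*'),
    someMethod: function () {
      console.log('someMessage');
    }
  };
};

// Calling replaceThing repeatedly makes the heap grow, for example:
// setInterval(replaceThing, 1000);
```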
[Figure: Introducing a leak into your JavaScript code]
At first glance it looks fine. One would think that theThing is overwritten on every call to replaceThing(). The problem is that someMethod has its enclosing scope as context. This means that unused() is known inside someMethod(), and even though unused() is never called, it prevents the garbage collector from freeing originalThing. There are simply too many indirections to follow. This is not a bug in your code, but it causes a memory leak that is difficult to track down.
Wouldn't it be great if you could look into the heap and see what is in there right now? Fortunately, you can! V8 can dump the current heap, and v8-profiler exposes this functionality to JavaScript.
var fs = require('fs');
var profiler = require('v8-profiler');

var _datadir = null;
var nextMBThreshold = 0;

// Start checking memory usage every 500 ms; dumps are written to datadir.
module.exports.init = function (datadir) {
  _datadir = datadir;
  setInterval(tickHeapDump, 500);
};

// Defer the check so that it runs outside the timer callback.
function tickHeapDump() {
  setImmediate(function () {
    heapDump();
  });
}

// Take a snapshot when the resident set size exceeds the current threshold,
// then raise the threshold by 50 MB.
function heapDump() {
  var memMB = process.memoryUsage().rss / 1048576;
  console.log(memMB + '>' + nextMBThreshold);
  if (memMB > nextMBThreshold) {
    console.log('Current memory usage: %j', process.memoryUsage());
    nextMBThreshold += 50;
    var snap = profiler.takeSnapshot('profile');
    saveHeapSnapshot(snap, _datadir);
  }
}

// Serialize the snapshot chunk by chunk and write it to a timestamped file.
function saveHeapSnapshot(snapshot, datadir) {
  var buffer = '';
  var stamp = Date.now();
  snapshot.serialize(
    function iterator(data, length) {
      buffer += data;
    },
    function complete() {
      var name = stamp + '.heapsnapshot';
      fs.writeFile(datadir + '/' + name, buffer, function (err) {
        if (err) {
          console.error('Could not write heap snapshot ' + name + ': ' + err.message);
          return;
        }
        console.log('Heap snapshot written to ' + name);
      });
    }
  );
}
This simple module writes a heap dump file whenever memory usage keeps growing. There are certainly more sophisticated ways to detect anomalies, but this is enough for our purposes. If there is a memory leak, you may end up with a lot of these files, so you should monitor this closely and add alerting to the module. Chrome offers the same heap dump functionality, and you can use Chrome Developer Tools to analyze the dumps written by v8-profiler.
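On newer Node.js versions, a similar threshold-based dump can be sketched without v8-profiler, using the built-in v8 module. This is an alternative to the module above rather than the article's own code: it assumes a Node.js version that provides v8.writeHeapSnapshot(), and createHeapDumper and its parameters are hypothetical names.

```javascript
// Sketch of the same threshold logic on top of Node's built-in v8 module.
// Assumes a Node.js version that provides v8.writeHeapSnapshot().
const path = require('path');
const v8 = require('v8');

// createHeapDumper is a hypothetical helper. writeSnapshot can be replaced,
// for example to add alerting or to test the logic without writing files.
function createHeapDumper(dataDir, stepMB, writeSnapshot = v8.writeHeapSnapshot) {
  let nextThresholdMB = 0;
  // Call check() periodically; it returns the file name when a dump was written.
  return function check(rssBytes = process.memoryUsage().rss) {
    const rssMB = rssBytes / 1048576;
    if (rssMB <= nextThresholdMB) return null;
    nextThresholdMB += stepMB;
    const file = path.join(dataDir, Date.now() + '.heapsnapshot');
    return writeSnapshot(file); // synchronous: blocks the event loop while writing
  };
}

// Usage: check every 500 ms and dump in 50 MB steps.
// const check = createHeapDumper('/tmp', 50);
// setInterval(() => check(), 500).unref();
```

The resulting .heapsnapshot files can be loaded into Chrome Developer Tools in the same way.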
[Figure: Chrome Developer Tools]
A single heap dump may not help, because it does not show how the heap changes over time. That is why Chrome Developer Tools lets you compare different dump files. Comparing two dumps yields delta values that show which structures grew between them:
[Figure: A comparison of the dumps shows our leak.]
Here we see our problem. The variable longStr, which contains a string of asterisks, is referenced by originalThing, which is referenced by someMethod, which is referenced by ... well, you get the idea. It is a long chain of nested references and closure contexts that prevents longStr from being freed. Although this example produces obvious results, the process is always the same:
- Create several heap dumps, some time apart and at different amounts of allocated memory.
- Compare several dumps to see which values grow.
Conclusion
As you can see, garbage collection is a fairly complex process, and even valid code can cause memory leaks. By using the functionality built into V8 together with Chrome Developer Tools, you can find out what causes a leak, and if you build this functionality into your application, you have everything you need to fix such a problem when it occurs.
One question remains: how do we fix this leak? The answer is simple: just add originalThing = null; at the end of replaceThing(), and you are saved.
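Applied to the Meteor example, the fix could look like the sketch below. It assumes that the variable to clear is the local originalThing, since that is the reference the shared closure context keeps alive; the rest of the listing is a reconstruction of the example and may differ in detail from the original.

```javascript
var theThing = null;

var replaceThing = function () {
  var originalThing = theThing;
  var unused = function () {
    if (originalThing) {
      console.log('hi');
    }
  };
  theThing = {
    longStr: new Array(1000000).join('*'),
    someMethod: function () {
      console.log('someMessage');
    }
  };
  // Break the chain: the shared closure context now holds null instead of
  // the previous theThing, so the old object becomes unreachable.
  originalThing = null;
};
```

theThing itself stays usable after each call; only the link back to the previous object is cut.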