Recently, Reside Real Estate ran into a problem: at the most crucial moments, its Node.js server began to crash. Suspicion fell on memory. The company's engineers applied temporary measures that shielded users from the failures and began searching for the source of the problem. In the end, they found and fixed it.
In this article, they explain how to find and fix errors related to memory usage: memory leaks, and situations where a program uses far more memory than it actually needs. This story should help anyone facing something similar to quickly understand the reason for a server's strange behavior and get it back into service.
Types of memory problems
▍ Memory leak
In computer science, a memory leak is a kind of uncontrolled use of resources that occurs when a program improperly controls memory allocation, with the result that memory that is no longer needed is not released.
In low-level languages like C, memory leaks often occur when memory is allocated, like this:
buffer = malloc(num_items*sizeof(double));
but is not freed once it is no longer needed:
free(buffer);
In languages with automatic memory management, leaks occur when objects that are no longer needed can still be reached from the running program or from some root object. In JavaScript, any object that is reachable from the program is not destroyed by the garbage collector, so the space it occupies on the heap is not freed. If the heap grows too large, the process runs out of memory.
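The reachability rule can be illustrated with a minimal sketch. The names `requestLog`, `handleLeaky`, and `handleClean` are hypothetical and do not come from the application described in this article.

```javascript
// Hypothetical illustration: a module-level array stays reachable for the
// lifetime of the process, so everything pushed into it is never collected.
const requestLog = [];

function handleLeaky(payload) {
  // The payload is only needed during this call, but the push keeps it
  // reachable from module scope after the function returns.
  requestLog.push(payload);
  return payload.items.length;
}

function handleClean(payload) {
  // No reference escapes the function, so the payload becomes garbage
  // as soon as the caller drops it.
  return payload.items.length;
}

for (let i = 0; i < 1000; i++) {
  handleLeaky({ items: new Array(100).fill(i) });
  handleClean({ items: new Array(100).fill(i) });
}

console.log('payloads still reachable:', requestLog.length);
```

Both handlers return the same value; the only difference is that the first one leaves a reference behind, which is what turns ordinary allocations into a leak.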
▍ Excessive memory usage
With excessive memory usage, a program takes up much more memory than it needs for the task at hand. For example, this can happen when references to large objects are held longer than the program needs them, which prevents the garbage collector from destroying these objects. It also happens when large objects that the program does not need at all are kept in memory (this caused one of our two main problems, discussed below).
Identifying memory problems
Our memory problems manifested themselves in quite an obvious way, mostly in the form of this grim log message:
FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
Signs of a memory leak also include performance that degrades over time. If the server repeatedly performs the same operation, which is fast at first and gradually becomes slower until the process fails, this very likely indicates a memory leak.
Excessive memory usage usually shows up as poor performance as well. However, without a leak, performance does not keep degrading over time.
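One way to watch for the "grows over time" symptom is to sample `process.memoryUsage().heapUsed` periodically. The sketch below is a crude heuristic, not a diagnosis: `isSteadilyGrowing`, `startHeapSampler`, the interval, and the 10 MB threshold are all illustrative assumptions, and because garbage collection makes `heapUsed` fluctuate, the check is deliberately strict and can miss slow leaks.

```javascript
// Sketch: record heapUsed at intervals and report whether it keeps rising.
// The helper names and the default threshold are illustrative assumptions.
function isSteadilyGrowing(samples, minGrowthBytes = 10 * 1024 * 1024) {
  if (samples.length < 3) return false;
  // Every sample must be at least as large as the previous one...
  for (let i = 1; i < samples.length; i++) {
    if (samples[i] < samples[i - 1]) return false;
  }
  // ...and the total growth must reach the threshold.
  return samples[samples.length - 1] - samples[0] >= minGrowthBytes;
}

function startHeapSampler(intervalMs = 60000, maxSamples = 10) {
  const samples = [];
  const timer = setInterval(() => {
    samples.push(process.memoryUsage().heapUsed);
    if (samples.length > maxSamples) samples.shift(); // keep a sliding window
    if (isSteadilyGrowing(samples)) {
      console.warn('heapUsed has grown steadily over the last samples:', samples);
    }
  }, intervalMs);
  timer.unref(); // do not keep the process alive just for sampling
  return { samples, stop: () => clearInterval(timer) };
}
```

A warning from such a sampler is only a hint that it is time to take heap snapshots; the snapshots remain the reliable way to confirm a leak.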
Temporary solution to the problem
Often, when something breaks, there is no time to get to the bottom of the problem and fix it properly. We certainly did not have any. Fortunately, there are ways to increase the amount of memory available to the Node process. The V8 engine has a default memory limit of about 1.5 GB on 64-bit machines (the exact default depends on the Node version). Even if you run the Node process on a machine with much more RAM, it makes no difference unless you raise this limit. To raise it, you can pass the --max_old_space_size flag to the Node process. It looks like this:
node --max_old_space_size=$SIZE server.js
The $SIZE parameter is specified in megabytes and, in theory, can be any number that makes sense on a particular machine. In our case, we used 8000, which, given the specifics of our server, bought us enough time to investigate. We also increased the dyno memory. We use Heroku, where this is easy to do.
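To confirm that the flag took effect, the limit can be read back from inside the process with the built-in `v8` module. This is a small sketch; the file name `check-limit.js` used in the note below is hypothetical.

```javascript
const v8 = require('v8');

// heap_size_limit is reported in bytes; convert it to megabytes for readability.
const limitMb = Math.round(v8.getHeapStatistics().heap_size_limit / (1024 * 1024));
console.log(`V8 heap size limit: ${limitMb} MB`);
```

Running it as `node --max_old_space_size=8000 check-limit.js` should print a value that moves roughly in line with the flag. The reported limit covers the whole heap, so it will not match the flag value exactly.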
We also used Twilio, set up so that we were notified every time the server received a request that requires a lot of memory. This let us watch the request and restart the server once it completed. Such a solution is far from ideal, but to keep our users from seeing failures we were ready for anything, even being on call around the clock with no days off.
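The article does not show how the notifications were wired up. As a rough sketch of the idea, a handler known to be memory intensive can be wrapped so that a notifier runs when it starts and when it finishes. `withMemoryAlert` and the `notify` callback are hypothetical; connecting `notify` to an SMS or chat service is left out.

```javascript
// Sketch: wrap a memory-intensive handler so that `notify` is called before
// it starts and after it finishes (even if it throws). `notify` is a
// hypothetical callback supplied by the caller.
function withMemoryAlert(name, handler, notify) {
  return async function alertedHandler(...args) {
    notify(`${name} started, heapUsed=${process.memoryUsage().heapUsed}`);
    try {
      return await handler(...args);
    } finally {
      notify(`${name} finished, heapUsed=${process.memoryUsage().heapUsed}`);
    }
  };
}
```

The "finished" message is the signal that it is safe to restart the server, which is the manual step described above.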
Debugging
So, thanks to the Node settings and the server monitoring, we bought time that we could spend getting to the root cause of the problem. At first glance, a “server memory problem” may seem like something terrifying that takes fantastic tools and skills to get rid of. In fact, it is not that scary. There are accessible tools for examining applications, and plenty of material with useful tips. We will use the Chrome developer tools to examine the memory of the Node server.
▍ Heap snapshots
A “memory leak” is a problem that shows up as a constantly growing heap. Eventually the heap becomes too large for the server to keep operating normally. So, at the very beginning of the investigation, take several heap snapshots at intervals and examine them with the Chrome developer tools to understand why the heap is so large and why it grows. Note that you should take several snapshots spaced out over time, so that you can study the objects that carry over from one snapshot to the next. These objects are quite possibly the ones responsible for the leak. There are many ways to create heap snapshots.
▍ Use heapdump to create heap snapshots
We used heapdump to create heap snapshots. This npm package proved very useful. You can import it into your code and call it wherever you need to take a snapshot. For example, we took a snapshot every time the server received a request that could trigger a memory-intensive process, and generated a file name containing the current time. This way, we could reproduce the problem by sending more and more requests to the server. Here is how it looks in code:
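import heapdump from 'heapdump';

export const handleUserRequest = (req) => {
  // Write a snapshot named after the current time before the heavy work starts.
  heapdump.writeSnapshot(
    `1.User_Request_Received-${Date.now()}.heapsnapshot`,
    (err, filename) => {
      if (err) console.error('heap dump failed', err);
      else console.log('dump written to', filename);
    });
  return startMemoryIntensiveProcess(req);
};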
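On newer Node versions (11.13 and later), a similar snapshot can be written without an extra package by using the built-in `v8.writeHeapSnapshot`. This is an alternative to what we used, not what we used. The helper names `snapshotName` and `takeSnapshot` below are hypothetical.

```javascript
const v8 = require('v8');

// Build a file name containing the current time, as in the heapdump example.
function snapshotName(label, now = Date.now()) {
  return `${label}-${now}.heapsnapshot`;
}

// `write` defaults to v8.writeHeapSnapshot, which blocks the event loop while
// it serializes the heap and returns the name of the file it wrote.
function takeSnapshot(label, write = v8.writeHeapSnapshot) {
  const filename = write(snapshotName(label));
  console.log('dump written to', filename);
  return filename;
}
```

Calling `takeSnapshot('1.User_Request_Received')` at the start of a heavy request mirrors the heapdump call. Making the writer a parameter keeps the helper easy to exercise without producing a multi-megabyte file.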
▍ Use Chrome remote debugger to create heap snapshots
If you work with Node 6.3 or later, you can use the Chrome remote debugger to create heap snapshots. To do this, first start Node with a command like this:
node --inspect server.js
Then go to chrome://inspect. Now you can debug Node processes remotely. To save time, you can install a Chrome plugin that automatically opens the debugger tab when you start Node with the --inspect flag. After that, just take snapshots whenever you find it necessary.
Chrome Remote Debugging and Heap Snapshots
Loading snapshots and determining the type of memory problem
The next step is to load the snapshots in the Memory tab of the Chrome developer tools. If you used the Chrome remote debugger to create the snapshots, they will already be loaded. If you used heapdump, you will need to load them yourself. Be sure to load them in the correct order, namely the order in which they were taken.
The most important thing at this stage is to understand what exactly you are facing: a leak or excessive memory usage. If you have a memory leak, you probably already have enough data to start exploring the heap for the source of the problem. If you are dealing with excessive memory usage, however, you will need to try some other analysis methods to get meaningful data.
Our first memory problem looked like the figure below in the Memory tab of the Chrome developer tools. It is easy to see that the heap keeps growing. This indicates a memory leak.
Heap increases over time - obvious memory leak
Our second memory problem, which arose a couple of months after the leak was fixed, ended up looking like the figure below in the same tests.
The heap doesn’t grow over time - it’s not a memory leak
The heap size does not change over time. The point is that with excessive memory usage, the heap exceeds its expected size only while certain operations are running, not all the time. Here the snapshots were taken at moments unrelated to those operations. If the badly written, resource-intensive function was not running when a snapshot was taken, the heap will contain no useful information about the memory that function uses.
To identify such problems, we recommend two methods that helped us find the culprits, a function and a variable: recording a memory allocation profile, and taking snapshots on a server that is under heavy load.
If you are using Node 6.3 or later, you can record a memory allocation profile through the Chrome remote debugger by starting Node with the --inspect flag mentioned above. This shows how individual functions use memory over time.
Record memory allocation profile
Another option is to send multiple simultaneous requests to your server and take several snapshots while these requests are being processed (this assumes the server is asynchronous; some snapshots may then turn out much larger than others, which points to a problem). We bombarded the server with requests and took snapshots. Some of them were very large. Examining these snapshots can reveal the source of the problem.
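The "bombard and snapshot" approach can be sketched as a small helper that starts many tasks at once and calls a callback at a fixed interval while they are in flight. Everything here is an assumption for illustration: `loadWithSnapshots` is a hypothetical name, `task` stands for whatever sends one request to the server under test, and `onTick` stands for whatever triggers a heap snapshot in the server process.

```javascript
// Sketch: run `count` copies of an async task concurrently and call `onTick`
// every `tickMs` milliseconds until all of them have settled.
async function loadWithSnapshots(task, count, onTick, tickMs = 1000) {
  const timer = setInterval(onTick, tickMs);
  try {
    // Start every task before awaiting any of them, so they overlap.
    return await Promise.all(Array.from({ length: count }, (_, i) => task(i)));
  } finally {
    clearInterval(timer); // stop ticking whether the tasks succeeded or not
  }
}
```

Snapshots taken this way land at arbitrary points during the heavy work, which is why some of them come out much larger than others.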
Snapshot analysis
Now we have data that may well help find the culprits behind the memory problems. First, consider the case in which successive snapshots grow in size. Here is one of the snapshots loaded in the Memory tab of the Chrome developer tools.
Examining memory leaks - everything points to our email service
The Retained Size is the amount of memory that would be freed if an object were deleted along with the dependent objects that would then become unreachable from the root object.
You can start the analysis by sorting the list in descending order by the Retained Size parameter, and then proceed to the study of large objects. In our case, the function names pointed to the part of the code that caused the problem.
Since we were confident that we had a memory leak, we knew the investigation should start with a search for variables with the wrong scope. We opened the mail service's index.js file and immediately found a module-level variable at the top of the file:
const timers = {};
We dealt with all of this, made the necessary changes, tested the project several times and finally fixed the memory leak.
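The article does not show how `timers` was used or what the fix looked like. The following is a hypothetical reconstruction of how a module-level object like this can leak, with one possible way to stop it. The functions `scheduleLeaky` and `scheduleFixed` are invented for illustration.

```javascript
// Module-level object: it lives as long as the process does.
const timers = {};

// Hypothetical leaky usage: an entry is added per id and never removed,
// so the object (and each timer handle it holds) grows with every call.
function scheduleLeaky(id, send, delayMs) {
  timers[id] = setTimeout(send, delayMs);
}

// One possible fix: release the entry once the timer has fired.
function scheduleFixed(id, send, delayMs) {
  timers[id] = setTimeout(() => {
    delete timers[id];
    send();
  }, delayMs);
}
```

In a heap snapshot, the leaky variant shows up as an ever-growing object retained from module scope, which matches the pattern described above.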
The second problem was harder to debug, but the same approach worked. Below is the memory allocation profile that we recorded using the Chrome developer tools and the Node --inspect flag.
Search for perpetrators of excessive memory use
Just as when analyzing data during the memory leak search, many of the function and object names cannot be recognized at a glance, since they sit at a lower level than the code you write for Node.js. When you come across an unfamiliar name in a situation like this, write it down.
The memory allocation profile led us to one function, recordFromSnapshot, which became a good starting point. Our study of the heap snapshot, which was much like the investigation performed when searching for the memory leak, revealed a very large target object. It was a variable declared inside the recordFromSnapshot function. This variable was left over from an old version of the application and was no longer needed. By getting rid of it, we fixed the excessive memory usage and sped up a process that previously took 40 seconds to about 10 seconds. The process no longer required any additional memory.
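The body of recordFromSnapshot is not shown in this article, so the following is only a generic sketch of the same kind of problem: a large local variable that is built but never read. The functions `summarizeWasteful` and `summarizeLean` and the shape of the data are hypothetical; only the variable name `target` is borrowed from the text.

```javascript
// Hypothetical "before": builds a full copy of the input that nothing reads.
function summarizeWasteful(rows) {
  const target = rows.map((row) => ({ ...row })); // large and unused
  return rows.reduce((sum, row) => sum + row.value, 0);
}

// "After": the same result without the extra allocation.
function summarizeLean(rows) {
  return rows.reduce((sum, row) => sum + row.value, 0);
}
```

Both functions return the same value, so removing the unused variable changes nothing for callers. It only removes the allocation and copying work, which is where both the memory and the time savings would come from in a case like this.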
Results
The two memory problems described above forced us to slow down the development of our project, which had been moving very fast, and analyze the server's performance. Now we understand how the server performs at a much deeper level than before, and we know how long particular functions normally take and how much memory they use. We have a much better understanding of the resources we need to scale the project further. And, most importantly, we are no longer afraid of memory problems and no longer dread their appearance in the future.
Dear readers! Have you encountered memory problems in Node.js? If yes, please tell us how you solved them.