Monitoring file changes in Node.js

This article, a translation of a post by Dave Johnson, covers how to monitor file changes in Node.js. The author needed a file monitoring system while building an IoT project for feeding aquarium fish. When a family member feeds the fish, they press one of three buttons: a button on an expansion board connected to a Raspberry Pi, an Amazon Dash button, or a button in the web interface. Any of these actions writes a line to a log file with the date, time, and type of event. Looking at the contents of this file, you can tell whether it is time to feed the fish. Here is a fragment of the file:

 2018-5-21 19:06:48|circuit board
 2018-5-21 10:11:22|dash button
 2018-5-20 11:46:54|web
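
To make the format concrete, here is a minimal sketch of how such lines could be parsed. The names parseLogLine and latestEntry are made up for illustration and are not part of the original project. The sketch assumes one entry per line in the form "date time|source" and treats the time as local time.

```javascript
// Hypothetical helpers, not from the original project.
// Assumes each entry looks like "2018-5-21 19:06:48|circuit board".
const LINE_PATTERN = /^(\d{4})-(\d{1,2})-(\d{1,2}) (\d{1,2}):(\d{2}):(\d{2})\|(.+)$/;

// Turn one log line into { time, source }, or null if it does not match.
function parseLogLine(line) {
  const match = LINE_PATTERN.exec(line.trim());
  if (!match) return null;
  const [, year, month, day, hour, minute, second, source] = match;
  // Months are zero-based in the Date constructor; the time is treated as local.
  const time = new Date(+year, +month - 1, +day, +hour, +minute, +second);
  return { time, source };
}

// Return the most recent valid entry in the log text, or null if there is none.
function latestEntry(logText) {
  return logText
    .split('\n')
    .map((line) => parseLogLine(line))
    .filter(Boolean)
    .reduce((latest, entry) => (!latest || entry.time > latest.time ? entry : latest), null);
}
```

With the most recent entry in hand, deciding whether it is time to feed the fish comes down to comparing its time with the current time.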

Once the log file is being written, the Node.js-based system needs to respond to changes in this file and take the appropriate action. Below we review several approaches to solving this problem.

Packages for organizing file monitoring and Node.js built-in features

This article focuses on the file-monitoring capabilities built into Node.js. Such tasks can be solved with Node alone, without third-party packages. However, if you do not mind external dependencies or just want a working solution as quickly as possible, you can use an existing package such as chokidar or node-watch. These are excellent libraries built on top of Node's own file system monitoring capabilities. They are easy to use and do their job well, so if you need file monitoring without digging into how Node implements it, these packages will serve you. If, besides a practical result, you also want to understand how the corresponding Node subsystems work, let's examine them together.

The first steps

To explore the Node tools for file monitoring, we first create and configure a new project. This article is written with novice Node developers in mind, so we describe each step in some detail.

To create the project, make a new folder and change into it in the terminal. Then run the following command:

 $ npm init -y

In response, the system will create a package.json file for the Node.js project.
Now install the log-timestamp package from npm and save it in package.json as a dependency:

 $ npm install --save log-timestamp

The log-timestamp package prepends a timestamp to messages printed with console.log. This lets you see exactly when file-monitoring events occur. The package is needed only for learning purposes; if you build something similar for production, you will not need log-timestamp.

Using fs.watchFile

The built-in Node.js fs.watchFile method may seem like a logical choice for monitoring our log file. The callback passed to this method is called whenever the file changes. Let's try fs.watchFile:

 const fs = require('fs');
 require('log-timestamp');

 const buttonPressesLogFile = './button-presses.log';

 console.log(`Watching for file changes on ${buttonPressesLogFile}`);

 fs.watchFile(buttonPressesLogFile, (curr, prev) => {
   console.log(`${buttonPressesLogFile} file Changed`);
 });

Here we start monitoring changes to button-presses.log. The callback is called after the file changes.

The callback receives two arguments of type fs.Stats: an object describing the current state of the file (curr) and an object describing its previous state (prev). This lets you, for example, find the time of the previous modification through prev.mtime.

If you run this code and then open button-presses.log and change it, the program responds and a corresponding entry appears in the console.

 $ node file-watcher.js
 [2018-05-21T00:54:55.885Z] Watching for file changes on ./button-presses.log
 [2018-05-21T00:55:04.731Z] ./button-presses.log file Changed

While experimenting, you may notice a delay between changing the file and the message appearing in the console. Why? By default, fs.watchFile polls the file for changes every 5.007 seconds. You can change this by passing fs.watchFile an options object with an interval property:

 fs.watchFile(buttonPressesLogFile, { interval: 1000 }, (curr, prev) => {
   console.log(`${buttonPressesLogFile} file Changed`);
 });

Here we set the polling interval to 1000 milliseconds, thereby indicating that we want the system to poll our log file every second.

Note that the fs.watchFile documentation says the callback will be called whenever the file is accessed. I prepared this material with Node v9.8.0, and in my case the system behaved differently: the callback was called only when the watched file was modified.
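
The curr and prev arguments make it possible to tell a real modification from a plain read, and fs.unwatchFile stops the polling when it is no longer needed. Below is a minimal sketch of both ideas. The helper names summarizeChange and pollFile are assumptions made for this example, not part of Node or of the original project.

```javascript
const fs = require('fs');

// Hypothetical helper: describe what changed between two fs.Stats snapshots.
function summarizeChange(curr, prev) {
  return {
    // true only if the modification time moved, which filters out plain reads
    modified: curr.mtimeMs !== prev.mtimeMs,
    // positive when data was added, negative when the file shrank
    sizeDelta: curr.size - prev.size,
  };
}

// Hypothetical helper: poll `file` every `intervalMs` milliseconds and call
// `onModified` only for real modifications. Returns a function that stops polling.
function pollFile(file, intervalMs, onModified) {
  const listener = (curr, prev) => {
    const change = summarizeChange(curr, prev);
    if (change.modified) onModified(change);
  };
  fs.watchFile(file, { interval: intervalMs }, listener);
  return () => fs.unwatchFile(file, listener);
}

// Usage (not executed here):
// const stop = pollFile('./button-presses.log', 1000, (change) => {
//   console.log(`file modified, size changed by ${change.sizeDelta} bytes`);
// });
// stop(); // call when monitoring is no longer needed
```

Returning a stop function keeps the listener reference private, so only this particular watcher is removed when it is called.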

Using fs.watch

The fs.watch method is a much better way to monitor files. While fs.watchFile spends system resources on polling, fs.watch relies on the operating system's own notifications of file system changes. According to the documentation, Node uses inotify on Linux, FSEvents on macOS, and ReadDirectoryChangesW on Windows to receive asynchronous notifications when files change (compare this with synchronous polling). The performance gain from using fs.watch instead of fs.watchFile is even greater when you need to track all files in a directory, since the first argument to fs.watch can be the path to either a specific file or a folder. Let's try fs.watch:

 const fs = require('fs');
 require('log-timestamp');

 const buttonPressesLogFile = './button-presses.log';

 console.log(`Watching for file changes on ${buttonPressesLogFile}`);

 fs.watch(buttonPressesLogFile, (event, filename) => {
   if (filename) {
     console.log(`${filename} file Changed`);
   }
 });

Here we observe what is happening with the log file, and, after detecting changes, we output a corresponding message to the console.
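
Because fs.watch also accepts a directory path, the same callback can serve a whole folder of logs. The sketch below filters events by file extension. The helper names createExtensionFilter and watchLogFiles are invented for this illustration, and the '.log' extension is an assumption based on the file name used in this article.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: build an fs.watch listener that reports only
// files with the given extension.
function createExtensionFilter(extension, onMatch) {
  return (eventType, filename) => {
    // filename is not guaranteed on every platform, so guard against a missing value
    if (filename && path.extname(filename) === extension) {
      onMatch(eventType, filename);
    }
  };
}

// Hypothetical helper: watch a whole directory for events on .log files.
// Returns the fs.FSWatcher so the caller can stop with watcher.close().
function watchLogFiles(directory, onMatch) {
  return fs.watch(directory, createExtensionFilter('.log', onMatch));
}

// Usage (not executed here):
// const watcher = watchLogFiles('.', (eventType, filename) => {
//   console.log(`${eventType}: ${filename}`);
// });
// watcher.close();
```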

Change the log file and see what happens. The output below comes from running the example on a Raspberry Pi (Raspbian), so what you see on your system may differ. This is what was displayed after the file was changed:

 $ node file-watcher.js
 [2018-05-21T00:55:52.588Z] Watching for file changes on ./button-presses.log
 [2018-05-21T00:56:00.773Z] button-presses.log file Changed
 [2018-05-21T00:56:00.793Z] button-presses.log file Changed
 [2018-05-21T00:56:00.802Z] button-presses.log file Changed
 [2018-05-21T00:56:00.813Z] button-presses.log file Changed

Interestingly, a single change to the file caused the handler to be called four times. The number of events depends on the platform. One change may produce several events because writing a file to disk takes some time, and the system detects several changes to the file during that interval. To get rid of such “false positives”, we need to make our solution less sensitive.

One technical detail about fs.watch: it reports two kinds of events, rename events when a file is renamed and change events when its contents change. If we want to be precise and observe only changes to the file's contents, the code should look like this:

 const fs = require('fs');
 require('log-timestamp');

 const buttonPressesLogFile = './button-presses.log';

 console.log(`Watching for file changes on ${buttonPressesLogFile}`);

 fs.watch(buttonPressesLogFile, (event, filename) => {
   if (filename && event === 'change') {
     console.log(`${filename} file Changed`);
   }
 });

In our case this modification changes nothing fundamental, but the technique may be useful if you build your own file monitoring system. Note also that, when experimenting with this code, the rename event could be detected when running Node under Windows, but not under Raspbian.
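
If rename events matter for your use case, one possible way to interpret them is to check whether the file still exists when the event arrives. As far as I understand the Node documentation, on most platforms rename is emitted when a file name appears or disappears in the directory, so treat the sketch below as an illustration under that assumption. The helper name classifyEvent is hypothetical.

```javascript
const fs = require('fs');

// Hypothetical helper: turn an fs.watch event into a more specific label.
// `filePath` is the full path of the watched file.
function classifyEvent(eventType, filePath) {
  if (eventType === 'change') return 'modified';
  if (eventType === 'rename') {
    // a name appeared or disappeared, so check which of the two happened
    return fs.existsSync(filePath) ? 'appeared' : 'disappeared';
  }
  return 'unknown';
}

// Usage (not executed here):
// fs.watch('./button-presses.log', (event) => {
//   console.log(classifyEvent(event, './button-presses.log'));
// });
```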

Improvement attempt #1: comparing file modification times

We need the handler to be called only when real changes are made to the log file. So we will try to improve the fs.watch code by tracking the file's modification time, which should let us identify real changes and avoid false positives.

 const fs = require('fs');
 require('log-timestamp');

 const buttonPressesLogFile = './button-presses.log';

 console.log(`Watching for file changes on ${buttonPressesLogFile}`);

 let previousMTime = new Date(0);
 fs.watch(buttonPressesLogFile, (event, filename) => {
   if (filename) {
     // Stat the watched path itself; filename is only the base name
     const stats = fs.statSync(buttonPressesLogFile);
     if (stats.mtime.valueOf() === previousMTime.valueOf()) {
       return;
     }
     previousMTime = stats.mtime;
     console.log(`${filename} file Changed`);
   }
 });

Here we store the previous modification time in the previousMTime variable and call console.log only when the file's modification time has changed. The idea seems sound, and everything should now work as we need. Let's check.

 $ node file-watcher.js
 [2018-05-21T00:56:50.167Z] Watching for file changes on ./button-presses.log
 [2018-05-21T00:56:55.611Z] button-presses.log file Changed
 [2018-05-21T00:56:55.629Z] button-presses.log file Changed
 [2018-05-21T00:56:55.645Z] button-presses.log file Changed

Unfortunately, the result does not look much better than last time. Evidently the system (Raspbian in this case) generates many events while the file is being saved, so to avoid unnecessary messages we have to find another way to improve the code.

Improvement attempt #2: comparing MD5 checksums

We store an MD5 hash (checksum) of the file contents and recalculate it on every file change event that fs.watch reports. By taking the file's contents into account, we may be able to get rid of the unnecessary change messages.

To do this, we first need to install the md5 package.

 $ npm install --save md5

Now we will use this package and write the code to identify the real changes in the file using the checksum.

 const fs = require('fs');
 const md5 = require('md5');
 require('log-timestamp');

 const buttonPressesLogFile = './button-presses.log';

 console.log(`Watching for file changes on ${buttonPressesLogFile}`);

 let md5Previous = null;
 fs.watch(buttonPressesLogFile, (event, filename) => {
   if (filename) {
     const md5Current = md5(fs.readFileSync(buttonPressesLogFile));
     if (md5Current === md5Previous) {
       return;
     }
     md5Previous = md5Current;
     console.log(`${filename} file Changed`);
   }
 });

This code resembles the modification-time approach, but here we detect changes in the file's contents using its checksum. Let's see how it behaves in practice.

 $ node file-watcher.js
 [2018-05-21T00:56:50.167Z] Watching for file changes on ./button-presses.log
 [2018-05-21T00:59:00.924Z] button-presses.log file Changed
 [2018-05-21T00:59:00.936Z] button-presses.log file Changed

Unfortunately, this is again not what we need. The system probably generates file change events while the file is still being saved.
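
As a side note, the checksum itself does not require an external package. Node's built-in crypto module can compute an MD5 digest, as in the sketch below. The helper names md5Hex and fileChecksum are made up for this example. I would expect the hex digest to match what the md5 package returns for the same bytes, but that has not been verified here.

```javascript
const crypto = require('crypto');
const fs = require('fs');

// Compute the MD5 checksum of a string or Buffer as a hex string,
// using only Node's built-in crypto module.
function md5Hex(data) {
  return crypto.createHash('md5').update(data).digest('hex');
}

// Hypothetical helper: checksum of a file's current contents.
function fileChecksum(filePath) {
  return md5Hex(fs.readFileSync(filePath));
}

// Usage (not executed here):
// const md5Current = fileChecksum('./button-presses.log');
```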

The recommended way to use fs.watch

We looked at several ways of using fs.watch but never got what we wanted. Still, things are not so bad, because we learned a lot along the way. Let's make another attempt. This time we debounce the events: we add a small delay to the code so that we do not react to further file change events within a given time window.

 const fs = require('fs');
 require('log-timestamp');

 const buttonPressesLogFile = './button-presses.log';

 console.log(`Watching for file changes on ${buttonPressesLogFile}`);

 let fsWait = false;
 fs.watch(buttonPressesLogFile, (event, filename) => {
   if (filename) {
     if (fsWait) return;
     fsWait = setTimeout(() => {
       fsWait = false;
     }, 100);
     console.log(`${filename} file Changed`);
   }
 });

The debounce logic was written with some help from StackOverflow users. As it turned out, a delay of 100 milliseconds is enough to issue only one message per file change. At the same time, the solution still works when the file changes quite frequently. This is what the program's output looks like now:

 $ node file-watcher.js
 [2018-05-21T00:56:50.167Z] Watching for file changes on ./button-presses.log
 [2018-05-21T01:00:22.904Z] button-presses.log file Changed

As you can see, this works fine. We have found the magic formula for building a file monitoring system. If you look at the npm packages for monitoring file changes, you will find that many of them implement debouncing. We used a similar approach built on standard Node mechanisms, which let us not only solve the problem but also learn something new.
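
The fs.watch example in this section reports the first event of a burst and ignores the rest for 100 milliseconds. A common alternative, sketched below, waits until events stop arriving and then reports once, so the handler runs only after the write has settled. The helper name debounceTrailing is an assumption of this sketch and does not come from the original article or from Node itself.

```javascript
// Hypothetical helper: call `fn` once, `waitMs` after the last call in a burst.
function debounceTrailing(fn, waitMs) {
  let timer = null;
  return (...args) => {
    // Every new event cancels the pending call and restarts the countdown.
    clearTimeout(timer);
    timer = setTimeout(() => {
      timer = null;
      fn(...args);
    }, waitMs);
  };
}

// Usage (not executed here):
// const fs = require('fs');
// fs.watch('./button-presses.log', debounceTrailing((event, filename) => {
//   console.log(`${filename} file Changed`);
// }, 100));
```

The trade-off is latency: the message appears at least waitMs after the last event, whereas the approach used in this article reacts immediately to the first one.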

Finally, note that debouncing can be combined with the MD5 checksum check, so that a message is issued only if the file has really changed and nothing is displayed when no real changes were made.

 const fs = require('fs');
 const md5 = require('md5');
 require('log-timestamp');

 const buttonPressesLogFile = './button-presses.log';

 console.log(`Watching for file changes on ${buttonPressesLogFile}`);

 let md5Previous = null;
 let fsWait = false;
 fs.watch(buttonPressesLogFile, (event, filename) => {
   if (filename) {
     if (fsWait) return;
     fsWait = setTimeout(() => {
       fsWait = false;
     }, 100);
     const md5Current = md5(fs.readFileSync(buttonPressesLogFile));
     if (md5Current === md5Previous) {
       return;
     }
     md5Previous = md5Current;
     console.log(`${filename} file Changed`);
   }
 });

Perhaps all this looks a bit complicated, and in 99% of cases there is no need for it, but I suppose it gives some food for thought.

Results

In Node.js, you can monitor file changes and run code in response. In the aquarium IoT project, this makes it possible to monitor the log file that contains the feeding records.

There are many situations in which file monitoring is helpful. Note that using fs.watchFile for this is not recommended, since it detects changes by regularly polling the system. Instead, consider fs.watch combined with debouncing of events.

Dear readers! Do you use mechanisms for monitoring file changes in your Node.js projects?

Source: https://habr.com/ru/post/360495/
