Once again, someone is wrong on the Internet: yesterday's Node Weekly linked to a post in which the author tries to measure and compare the performance of the Stream API in Node.js with its alternatives. It is sad to see how the author works with streams and what conclusions he tries to draw from that work:
... this worked out pretty well. Although it has been streaming its memory
Let's try to figure out what is wrong with the author's conclusions and code.
From my point of view, the problem is that the author of the article does not know how to use streams, and this is something one runs into quite often. In my opinion, this phenomenon has three causes:
Together, this means that developers often neither know how nor want to use the Stream API.
What is wrong with the author's code?
To begin with, let's restate the task (the original is in English, and a link to the file can be found in the post):
There is a 2.5 GB file with lines of the form:
C00084871|N|M3|P|201703099050762757|15|IND|COLLINS, DARREN ROBERT|SOUTHLAKE|TX|760928782|CELANESE|VPCHOP&TECH|02282017|153||PR2552193345215|1151824||P/R DEDUCTION ($76.92 BI-WEEKLY)|4030920171380058715
You need to parse it and find out the following information:
What is the problem? The author frankly admits that he loads the entire file into memory, that this makes Node "hang", and then offers us an interesting fact:
Fun fact: Node.js can only hold up to 1.67GB in memory at any one time
From this fact the author draws the strange conclusion that it is the Stream itself that loads the entire file into memory, rather than that his own code is at fault.
Let's refute the thesis "Although Node.js was streaming the whole file" by writing a small program that counts the number of lines in a file of any size:
```javascript
const { Writable } = require('stream')
const fs = require('fs')
const split = require('split')

let counter = 0

const linecounter = new Writable({
  write(chunk, encoding, callback) {
    counter = counter + 1
    callback()
  },
  writev(chunks, callback) {
    counter = counter + chunks.length
    callback()
  }
})

fs.createReadStream('itcont.txt')
  .pipe(split())
  .pipe(linecounter)

linecounter.on('finish', function() {
  console.log(counter)
})
```
NB: the code is intentionally kept as simple as possible. Global variables are bad!
What you should pay attention to:
Well, let's test our creation on a large file:
```
> node linecounter.js
13903993
```
As we can see, everything works. From this we can conclude that the Stream API copes perfectly well with files of any size, and the post author's claim is, to put it mildly, untrue. In the same way we can compute any other value required by the task.
Source: https://habr.com/ru/post/427901/