
Release of Node.js version 0.6.3, and two bugs found when working with child-process encodings in the Windows XP console

Yesterday (November 25) the official Node.js blog announced the release of a new version of Node, 0.6.3. The changes are not very significant: about a dozen bugs and shortcomings were fixed.



I took it as a rather interesting birthday present (by a pleasant coincidence, I turned 33 on November 25). On the same day, however, while pondering the question "How can I capture the output of a Windows command called from node.js?", I started a series of experiments in Windows XP, which ended with the discovery of two Node bugs at once, both related to encodings under Windows.



First, when a command is called with the require('child_process').exec(...) method, Node expects its output to be in UTF-8, while on a Russian-localized Windows system, commands (for example, dir) output text in the CP866 encoding.

Second, if the child console process changes the console code page, this also affects the console code page of the parent Node process (in particular, the output of console.log). Apparently the two share the same console, or something along those lines.



And now some details.



The first of the two bugs is easy to run into: just run the simplest possible script that calls a child process and writes its output verbatim to a file:

    var fs = require('fs'); // file system

    // Run "dir" and write whatever Node hands back straight to a file.
    require('child_process').exec('dir', function (err, outstr) {
      fs.createWriteStream('testfile.txt', {
        flags: 'w',
        encoding: 'binary'
      }).write(outstr);
    });


Instead of Russian letters, the file will contain garbage.
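
The garbage is what appears when CP866 bytes are decoded as if they were UTF-8. Here is a minimal sketch of that mismatch, assuming a newer Node.js than 0.6.3 (Buffer.from does not exist there). The variable names are mine, and the byte values are the CP866 codes of the Russian word "Тест":

```javascript
// CP866 bytes for the Russian word "Тест" (Т=0x92, е=0xA5, с=0xE1, т=0xE2).
var cp866Bytes = Buffer.from([0x92, 0xA5, 0xE1, 0xE2]);

// Decode them as UTF-8, the encoding Node expects from the child process.
// These bytes are not valid UTF-8, so replacement characters come out
// instead of the original letters.
var misdecoded = cp866Bytes.toString('utf8');

// JSON.stringify makes the otherwise invisible damage easier to see.
console.log(JSON.stringify(misdecoded));
```

Once the text has been decoded this way the original bytes are lost, which is why writing the string to a file in any encoding cannot bring the letters back.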



As sdevalex rightly suggested, there is a workaround for this problem: it is enough to use the Windows "chcp" command to change the code page of the process being called. A script written with this in mind writes the command output to the text file in the desired form:

    var forker = require('child_process');
    var fs = require('fs'); // file system

    // Switch the code page to UTF-8 (65001) before running "dir".
    forker.exec('chcp 65001 | dir', function (err, outstr) {
      fs.createWriteStream('testfile.txt', {
        flags: 'w',
        encoding: 'binary'
      }).write(outstr);
    });


With this workaround, however, you may stumble on the second bug if you want to send the results not only to the file but also to the console, for which a script like this is enough:

    var clog = console.log;
    clog('\nRunning under Node.js version ' + process.versions.node +
         ' on ' + process.arch + '-type processor, ' +
         process.platform + ' platform.');

    var forker = require('child_process');
    var fs = require('fs'); // file system

    forker.exec('chcp 65001 | dir', function (err, outstr) {
      fs.createWriteStream('testfile.txt', {
        flags: 'w',
        encoding: 'binary'
      }).write(outstr);
      // Also print the same text to the console.
      clog('\n' + outstr);
    });


You may stumble. Or, oddly enough, you may not. It depends on whether your console uses raster fonts or vector fonts (in Windows XP the vector font is Lucida Console), that is, on the setting in the middle of the second tab of the usual console properties dialog:



[console properties window]



As far as I remember, the Windows XP console uses raster fonts by default (correct me if I'm wrong). So if you have not changed this setting, the script above will write the desired text to the file ("testfile.txt") but will print something unsightly to the console:



[console screenshot]



This happens because in a raster console the "chcp" command changes only the encoding of the text output by commands; raster fonts cannot adapt to it, so even the output of the "chcp" command itself looks unsightly in the console:



[chcp screenshot]



If your console is configured to display text with a vector font (Lucida Console), you will not notice this problem, because the script output will look correct in whatever code page you set in the console with the "chcp" command:



[vector console screenshot]



At this point your hair should be standing on end. It is clear that we are facing an unusually insidious problem: a developer who uses a vector console can write a script of a dozen lines that works perfectly for him and works dreadfully for many other users (those with a raster console).



But what is this problem?



Perhaps Node cannot cope with output to the Windows console because JavaScript strings are Unicode while the Windows console uses CP866? No, that is not the case, which is easy to prove with a simple test of console output:



[screenshot of test]



Perhaps Node starts producing garbage when the characters being printed fall outside CP866? No again, and printing the string "\u2248\u0422\u0435\u0441\u0442" is enough to verify this. The \u2248 character itself is replaced with a question mark, but the rest of the string is unaffected.



It turns out that another guess is correct: the garbage in the raster console has the same nature as the garbage that the "chcp 65001" command prints instead of its message about the code page change. Moreover, that command is what caused it. We issued it in a child process, intending to change the encoding of the text output by the dir command, but the chcp command affected the parent console as well!



A simple test is enough to demonstrate this fully:



[screenshot of the next test]



It is easy to see that "chcp 65001" issued from the child process affects the console window of the parent process (the effect lasts until "chcp 866" is issued and the default code page, CP866, is back in use).
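
The exact script from the screenshot has not survived in the text, so the following is only a guess at what such a test could look like. The helper name codePageExperiment and its parameters are hypothetical. It prints a Russian sample line, switches the code page from a child process, prints the line again, then restores CP866 and prints it a third time:

```javascript
var childProcess = require('child_process');

// Hypothetical helper, not the author's original test.
// exec: a function with the shape of child_process.exec(command, callback)
// log:  a function that prints one line, e.g. console.log
// done: optional callback invoked after the last line is printed
function codePageExperiment(exec, log, done) {
  var sample = '\u0422\u0435\u0441\u0442'; // the word "Тест"
  log('before chcp 65001: ' + sample);
  exec('chcp 65001', function () {
    // If the child changed the parent's console, this line looks different.
    log('after chcp 65001: ' + sample);
    exec('chcp 866', function () {
      log('after chcp 866: ' + sample);
      if (done) done();
    });
  });
}

// chcp exists only on Windows, so the experiment is skipped elsewhere.
if (process.platform === 'win32') {
  codePageExperiment(childProcess.exec, console.log);
}
```

In a raster-font console, the middle line would be expected to come out garbled while the first and last stay readable, if the behaviour described above holds.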



Understanding this new bug lets us find a better way to work around the first one. Having called "chcp 65001" before the "dir" command, we must then call "chcp 866" to return the console to its original state before printing the text produced by "dir":



    var clog = console.log;
    clog('\nRunning under Node.js version ' + process.versions.node +
         ' on ' + process.arch + '-type processor, ' +
         process.platform + ' platform.');

    var forker = require('child_process');
    var fs = require('fs'); // file system

    forker.exec('chcp 65001 | dir', function (err, outstr) {
      fs.createWriteStream('testfile.txt', {
        flags: 'w',
        encoding: 'binary'
      }).write(outstr);
      // Restore the default code page before printing to the console.
      forker.exec('chcp 866', function () {
        clog('\n' + outstr);
      });
    });


This script can display flawless text not only in the test file but also in the console:



[screenshot of the next test]



This workaround, for all its merits, has an unavoidable architectural flaw: the two .exec() calls are, of course, asynchronous, so the console stays in an abnormal mode for some time between them.
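
One possible way to avoid touching the console code page at all is to take the child's raw bytes and decode CP866 by hand. The sketch below is not something Node 0.6.3 can run: it assumes a later Node.js whose exec() accepts the encoding: 'buffer' option, and it assumes the command really does emit CP866. The decodeCp866 function is my own illustration and maps only ASCII and the Russian letters:

```javascript
var childProcess = require('child_process');

// Decode a Buffer of CP866 bytes into a JavaScript string.
// Only ASCII and the Russian letters are mapped; every other byte
// (box-drawing characters and so on) becomes '?' in this sketch.
function decodeCp866(buf) {
  var out = '';
  for (var i = 0; i < buf.length; i++) {
    var b = buf[i];
    if (b < 0x80) {
      out += String.fromCharCode(b);                 // plain ASCII
    } else if (b <= 0xAF) {
      out += String.fromCharCode(b - 0x80 + 0x0410); // А..Я and а..п
    } else if (b >= 0xE0 && b <= 0xEF) {
      out += String.fromCharCode(b - 0xE0 + 0x0440); // р..я
    } else if (b === 0xF0) {
      out += '\u0401';                               // Ё
    } else if (b === 0xF1) {
      out += '\u0451';                               // ё
    } else {
      out += '?';                                    // not handled here
    }
  }
  return out;
}

// "dir" is a Windows shell command, so the call is skipped elsewhere.
if (process.platform === 'win32') {
  childProcess.exec('dir', { encoding: 'buffer' }, function (err, stdout) {
    if (err) throw err;
    console.log(decodeCp866(stdout));
  });
}
```

Since no chcp call is involved, the parent console is never left in a foreign code page and the gap between two asynchronous calls disappears. The price is that the script has to know in advance which code page the child uses.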



Both bugs have been reported to the Node developers via GitHub: donnerjack13589 created issue 2190 yesterday, and I created issue 2196 today.

Source: https://habr.com/ru/post/133426/


