Today I have two reasons to sit down at the keyboard.
First, having translated the jParser documentation last week (after reviewing RReverser's example of using jParser to analyze BMP files), it seems appropriate to take the natural next step: to develop the theme and share with readers my own example of using jParser to analyze a slightly more complex data structure. (In part, this will also answer the question asked by alekciy, who took an interest in further examples of the practical use of jParser.)

Second, about half a year ago (November 26, 2011), ertaquo asked why I wanted to use Node.js in Fidonet. At the time I answered that I simply liked the name (I remember the days when the term "node", used without clarification, by default meant a Fidonet node in the Russian computer world), but I could not give any good example of working code. Now I can.
So this example does double duty. I bring to your attention the parsing of the headers of Fidonet echomail messages stored in the JAM format. This format has been popular in Fidonet since time immemorial (Wikipedia says that JAM dates back to 1993). I'll say right away that I have long preferred JAM to another popular format (Squish), because the latter stores in a message header the identifiers of no more than nine replies to it, whereas JAM uses a more flexible data structure (a linked list) instead of a fixed-length array, which makes it possible to build a complete reply tree even in the liveliest and most extensive discussions.
JAM documentation can easily be found on various Fidonet BBSes, but BBSes tend to close or change addresses over time, so for reliability I refer to my own message from five years ago, in which I quoted this documentation verbatim and in full. (The Czech BBS that served as my source back then has since closed. Everything is ghostly in this raging world.)
As you can see there, the headers of Fidonet echomail messages are stored in the JHR file. This file consists of a fixed-length header (FixedHeaderInfoStruct) followed by the message headers proper (MessageHeader), each of which in turn consists of a fixed-size structure (MessageFixedHeader) and a variable-length tail of several fields (SubFieldXX) whose total length is given in the SubfieldLen field of the MessageFixedHeader structure. Each SubFieldXX field again consists of a fixed-size header followed by a string of bytes whose length is given by the preceding number, datlen. (This resembles the implementation of strings in the Pascal dialects common in those same nineties, Turbo Pascal and UCSD Pascal; in Pascal, however, the length was stored in a single byte, whereas in JAM the number datlen is of type ulong, that is, thirty-two bits wide. This is prudent.)
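To make the length-prefixed layout concrete, here is a minimal sketch in plain Node.js (without jParser) that walks the subfields of one header. It assumes that the fixed-size subfield header is two ushort identifiers (called LoID and HiID here) followed by the ulong datlen, all little-endian. The function readSubFields and the helper makeSubField are hypothetical names made up for this illustration, not part of the module.

```javascript
// Assumed SubField layout: ushort LoID, ushort HiID, ulong datlen, then datlen bytes.
var SUBFIELD_HEADER_SIZE = 8;

// Walk the variable tail of one message header.
// buf: Buffer with the data; start: offset of the first subfield;
// subfieldLen: total length of all subfields (the SubfieldLen value).
function readSubFields(buf, start, subfieldLen) {
  var fields = [];
  var pos = start;
  var end = start + subfieldLen;
  while (pos + SUBFIELD_HEADER_SIZE <= end) {
    var loID = buf.readUInt16LE(pos);
    var hiID = buf.readUInt16LE(pos + 2);
    var datlen = buf.readUInt32LE(pos + 4);
    var dataStart = pos + SUBFIELD_HEADER_SIZE;
    if (dataStart + datlen > end) break; // truncated or corrupt tail: stop here
    fields.push({
      LoID: loID,
      HiID: hiID,
      data: buf.slice(dataStart, dataStart + datlen)
    });
    pos = dataStart + datlen;
  }
  return fields;
}

// Hypothetical helper that builds one subfield for the demonstration.
function makeSubField(loID, text) {
  var data = Buffer.from(text, 'latin1');
  var head = Buffer.alloc(SUBFIELD_HEADER_SIZE);
  head.writeUInt16LE(loID, 0);
  head.writeUInt16LE(0, 2);
  head.writeUInt32LE(data.length, 4);
  return Buffer.concat([head, data]);
}

// Two subfields with arbitrary identifiers, laid out back to back.
var tail = Buffer.concat([makeSubField(2, 'Alice'), makeSubField(6, 'Hello')]);
var fields = readSubFields(tail, 0, tail.length);
```

The bounds check before each push matters in practice: a datlen read from damaged data could otherwise point far beyond the end of the header.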
Much less obvious is another important fact: inside the JHR file, the MessageHeader headers do not necessarily follow one another back to back. The subsection "Updating message headers" says that if a message's header grows after the message is edited or processed, the header is written at the end of the file and the old header is marked as deleted. The documentation says nothing about messages whose header shrinks rather than grows; in practice, however, many Fidonet programs write the new header over the old one, changing the value of SubfieldLen accordingly (and, where necessary, individual datlen values). Between that MessageHeader and the next one there remains garbage made up of the contents of what used to be the last SubFieldXX fields. That is why, after reading one MessageHeader, there is no more reasonable way to reach the next one than to search for the three ASCII characters "JAM" followed by a null byte: this is the Signature sequence with which every MessageFixedHeader must begin.
The code of a Node.js module that reads echomail headers from a JHR file into memory can therefore be sketched as follows:
var fs = require('fs');
var jParser = require('jParser');
var ulong = 'uint32';
var ushort = 'uint16';

var JAM = function(echotag){
   if (!(this instanceof JAM)) return new JAM(echotag);
   this.echotag = echotag;
This sketch caches the raw data from the JHR file inside the exported JAM object (in its JHR field). That is not economical for the module as currently designed, but it will pay off if, alongside the ReadHeaders method, you need a simpler method that reads, say, only the FixedHeaderInfoStruct header. There are also fields for the other three JAM files (JDT, JDX and JLR), but they are commented out. (Ideally the cache should also be kept up to date, by calling stat() rather than watchFile(), but clearly the code will do without that in a first draft of the module.)
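The stat()-based freshness check mentioned in parentheses could look roughly like this. It is only a sketch, not code from the module: the function readCached and the cache object are hypothetical, and it uses synchronous calls for brevity where a real module would more likely use the asynchronous ones.

```javascript
var fs = require('fs');

// Hypothetical cache: file path -> { mtimeMs, size, data }.
var cache = {};

// Return the file contents, re-reading only when stat() shows
// a different modification time or size than the cached copy.
function readCached(path) {
  var st = fs.statSync(path);
  var entry = cache[path];
  if (entry && entry.mtimeMs === st.mtimeMs && entry.size === st.size) {
    return entry.data; // cache is still fresh
  }
  var data = fs.readFileSync(path);
  cache[path] = { mtimeMs: st.mtimeMs, size: st.size, data: data };
  return data;
}
```

One stat() call per access is cheap compared with re-reading the file, and unlike watchFile() it leaves no watcher running in the background.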
The data types from the JAM documentation (for example, ulong) are not defined by means of jParser (for example, "'ulong': 'uint32'") but are declared as JavaScript variables (for example, "var ulong = 'uint32'") whose values are used in the descriptions of the data structures. This is for speed: it is clear that the V8 JavaScript engine substitutes a variable's value much faster than the jParser module's code could resolve a named type.
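The difference between the two ways of naming a type can be seen without running jParser at all. In the sketch below the structure name SubFieldHeader and the field names LoID and HiID are assumptions made for illustration; only ulong, ushort and datlen come from the text above.

```javascript
// Aliases resolved by JavaScript itself, once, when the description is built.
var ulong = 'uint32';
var ushort = 'uint16';

// Variant used in the module: the description already contains built-in type names.
var withVariables = {
  SubFieldHeader: { LoID: ushort, HiID: ushort, datlen: ulong }
};

// Alternative variant: aliases declared as named types inside the description,
// so the parser has to look up 'ulong' and 'ushort' every time it meets them.
var withTypedefs = {
  ulong: 'uint32',
  ushort: 'uint16',
  SubFieldHeader: { LoID: 'ushort', HiID: 'ushort', datlen: 'ulong' }
};
```

In the first variant the parser only ever sees 'uint32' and 'uint16'; in the second it sees the alias strings and must resolve them itself.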
In the description of the SubField structure you will find a commented-out type field: it is filled in by a JavaScript function containing the mnemonic field names borrowed from the JAM documentation. It can be used for debugging.
The Subfields field of the MessageHeader structure is defined in two ways. The first (fast) reads this field as a string of bytes of size SubfieldLen. The second (commented out) parses the field completely, splitting out the subfields with jParser: if the application using the module needs the metadata from the variable part of the Fidonet message header anyway, there is no point in putting off its parsing.
The AfterSubfields field contains a simple search for the three ASCII characters "JAM" followed by a null byte; the reason is explained in one of the preceding paragraphs. The commented-out console.log() call is there for debugging only. (The name of the internal variable moveSIG is an allusion to the meme "All your base are belong to us".)
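Outside jParser, the same signature search can be sketched with Buffer.indexOf from Node.js. This is an illustration under assumptions rather than the module's code: the function name findNextHeader is hypothetical, and the sample buffer is made up.

```javascript
// The Signature sequence: ASCII "JAM" followed by a null byte.
var SIGNATURE = Buffer.from([0x4A, 0x41, 0x4D, 0x00]);

// Return the offset of the next signature at or after `from`,
// or -1 when no further signature exists in the buffer.
function findNextHeader(buf, from) {
  return buf.indexOf(SIGNATURE, from);
}

// Made-up data: one header, leftover garbage, then the next header.
var sample = Buffer.concat([
  Buffer.from('JAM\0headerAAA', 'latin1'),
  Buffer.from('garbage', 'latin1'),
  Buffer.from('JAM\0next', 'latin1')
]);

var nextOffset = findNextHeader(sample, 4); // skip the signature just read
```

A scan like this can in principle be fooled if the four bytes happen to occur inside the garbage or inside subfield data, so a stricter reader might also sanity-check the fields that follow the signature.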
The number 69 in the description of the MessageHeaders field of the JHR structure is "magic": its purpose is to keep the parser from getting too close to the end of the file, where garbage data can also be expected.
I measured the parsing speed with this test script:
var JAM = require('../');
var util = require('util');

console.log( new Date().toLocaleString() );

var blog = JAM('blog-MtW');
blog.ReadHeaders(function(err,data){
   if (err) throw err;
The script lives in the test subdirectory, so its first line requires the parent directory, where the main module's source is in the index.js file; since Node.js looks for that name by default, naming the parent directory is enough.
The test data in the blog-MtW.jhr file contains the headers of the entries of my Fidonet blog (Ru.Blog.Mithgol), accumulated since March 2007.
A test run on a single-core Pentium IV (2.2 GHz) shows that the headers are processed in three to four seconds. If the simple reading of the Subfields array is replaced by its full parsing (currently commented out), that time doubles again.
That is a lot for a single echo conference: a Fidonet node can easily carry more than a hundred of them, so at three to four seconds each the total time for parsing echomail headers would run to five minutes or more.

Fidonet users, however, hardly need reminding that the popular Fidonet mail editor GoldED (GoldED+, GoldED-NSF) scans echo conferences at startup much faster: their names flash across the status line of its splash screen so quickly that it is plain to see that each one takes a fraction of a second, no more. One has to draw an unpleasant conclusion: JavaScript parsing of binary data, even on the fast V8 engine, is slower by an order of magnitude, and perhaps by more than one.
It only remains to suspect, cynically, that at startup GoldED reads not the whole file but, for speed, only the FixedHeaderInfoStruct header (its data would suffice to display the number of messages in each echo conference, and GoldED does nothing more than that at startup). I can neither confirm nor refute this suspicion, because I have not had time to dig through the GoldED+ CVS sources.
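Whatever GoldED actually does, reading only the fixed header is cheap to try in Node.js. The sketch below assumes that FixedHeaderInfoStruct begins with the four-byte signature followed by five little-endian ulong fields (datecreated, modcounter, activemsgs, passwordcrc, basemsgnum); check that against the JAM documentation before relying on it. The function name readFixedHeader is hypothetical.

```javascript
var fs = require('fs');

// Number of bytes this sketch needs from the start of the file:
// 4 (signature) + 5 * 4 (ulong fields). The layout is an assumption.
var NEEDED = 24;

// Read only the beginning of a JHR file and decode the counters in it.
function readFixedHeader(path) {
  var fd = fs.openSync(path, 'r');
  try {
    var buf = Buffer.alloc(NEEDED);
    var bytesRead = fs.readSync(fd, buf, 0, NEEDED, 0);
    if (bytesRead < NEEDED) {
      throw new Error('file too short for a JAM fixed header');
    }
    if (buf.toString('latin1', 0, 4) !== 'JAM\0') {
      throw new Error('missing JAM signature');
    }
    return {
      datecreated: buf.readUInt32LE(4),
      modcounter: buf.readUInt32LE(8),
      activemsgs: buf.readUInt32LE(12),
      passwordcrc: buf.readUInt32LE(16),
      basemsgnum: buf.readUInt32LE(20)
    };
  } finally {
    fs.closeSync(fd); // always release the descriptor
  }
}
```

Because only a couple of dozen bytes are read per file, the cost no longer depends on how many headers an echo conference has accumulated, which would be consistent with a scan that takes a fraction of a second.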
I have published the code of my module (a JAM header reader) on GitHub under the free MIT license.