Node.js Script for Backing Up Data: The End

I continue describing how I built a script for backing up data. In the previous article I wrote (and then rewrote) the function that determines the list of changed files. In this one I describe how the file of changed data is created.

Creating a file archive


The changed data here means changed files and directories (including ones that were added or deleted), so the file of changed data is essentially an archive. When creating the archive there are three options:
  1. Pack each file separately
  2. Pack all files together
  3. Mixed strategy: we can pack together all files with one extension

Option 2 is a poor fit: if the archive has to hold both a .gif and a .txt file, packing the .gif can even increase its size, because the data in it is already compressed. Leaving everything unpacked is bad too, because text files compress well and we would give up the space we could have saved.
That leaves options 1 and 3. I chose option 1 because it is simpler to implement.
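The reasoning above is easy to check with the standard zlib module. The following is a minimal sketch, not part of the original script: gzipRatio is a hypothetical helper, and random bytes stand in for a file whose contents are already compressed.

```javascript
const zlib = require("zlib");
const crypto = require("crypto");

// Fraction of the original size that remains after gzip (smaller is better).
function gzipRatio(buf) {
  return zlib.gzipSync(buf).length / buf.length;
}

// Highly repetitive text, similar in spirit to logs or source files.
const text = Buffer.from("The quick brown fox jumps over the lazy dog.\n".repeat(1000));
// Random bytes stand in for already compressed data (.gif, .zip, ...).
const random = crypto.randomBytes(text.length);

console.log("text   :", gzipRatio(text).toFixed(3));
console.log("random :", gzipRatio(random).toFixed(3));
```

The text shrinks to a small fraction of its size, while the random buffer stays at roughly its original size, which is why it pays to decide about packing per file.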
I will also encrypt the files in the archive, so that backups can be kept on external services such as Yandex.Disk without worrying that the data will be stolen. :)

Archive layout




I will not describe the archive format in detail; in my opinion the picture makes everything clear enough.

Encryption and packing with stream.Transform


Now for the method of encryption and packing. For this we use the stream.Transform class. It is a Duplex stream, i.e. it can be both written to and read from, and by overriding its _transform function we can transform the data as needed (encrypt/decrypt, pack/unpack).
To keep the example simple, I will write an encryption class based on the XOR operation.
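    // Load the modules we need
    var stream = require("stream");
    var util = require("util");

    // Constructor of the XOR stream
    function XOR(options) {
        // Allow calling without "new"
        if (!(this instanceof XOR)) {
            return new XOR(options);
        }
        // The byte that every input byte is XORed with
        this._byteXOR = 0xAB;
        // Initialize the stream.Transform base class
        stream.Transform.call(this, options);
    }
    util.inherits(XOR, stream.Transform);

    // Transform one chunk of data
    XOR.prototype._transform = function (chunk, enc, cb) {
        // Encrypt/decrypt every byte in place
        for(var i=0;i<chunk.length;i++) {
            chunk[i] ^= this._byteXOR;
        }
        // Pass the processed chunk downstream
        this.push(chunk, enc);
        // Signal that this chunk is done
        cb();
    };

    // Exported interface
    module.exports = {
        // Create an encrypting stream
        createCipher : function(options) {
            return new XOR(options);
        },
        // Create a decrypting stream
        createDecipher : function(options) {
            return new XOR(options);
        }
    };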

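Everything is quite simple: we create a class derived from stream.Transform and define the _transform method, which takes the following parameters:
  1. chunk: the piece of data to process
  2. enc: the encoding of the chunk (relevant only when the chunk is a string)
  3. cb: the callback to invoke once the chunk has been processed
The encryption module must export two functions:
  1. createCipher: creates an encrypting stream
  2. createDecipher: creates a decrypting stream
Since the XOR operation is symmetric (to decrypt, it is enough to apply the same operation as for encryption), in my case both functions return the same class.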
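The symmetry can be seen by chaining two such streams. This is a minimal sketch using the class syntax of current Node.js versions rather than util.inherits; XorStream and xorBuffer are hypothetical names, and unlike the listing above the sketch copies each chunk instead of modifying it in place.

```javascript
const { Transform } = require("stream");

// XOR every byte of buf with a single-byte key and return a new buffer.
function xorBuffer(buf, key) {
  const out = Buffer.alloc(buf.length);
  for (let i = 0; i < buf.length; i++) {
    out[i] = buf[i] ^ key;
  }
  return out;
}

// Transform stream that applies xorBuffer to every chunk passing through.
class XorStream extends Transform {
  constructor(key, options) {
    super(options);
    this.key = key;
  }
  _transform(chunk, encoding, callback) {
    // Passing the result to the callback is equivalent to push() + callback().
    callback(null, xorBuffer(chunk, this.key));
  }
}

// Encrypt and immediately decrypt with the same key.
const encrypt = new XorStream(0xAB);
const decrypt = new XorStream(0xAB);
const chunks = [];
encrypt
  .pipe(decrypt)
  .on("data", function (chunk) { chunks.push(chunk); })
  .on("end", function () {
    console.log(Buffer.concat(chunks).toString()); // prints "backup"
  });
encrypt.end(Buffer.from("backup"));
```

A single-byte XOR offers no real protection, of course; it only illustrates the stream mechanics.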
File encryption example
We use pipe(), the function that redirects one stream into another.
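    // Load the modules we need
    var fs = require("fs");
    var path = require("path");
    // Load the encryption module
    var xor = require('./crypto/xor');

    // File names
    var inFile = __filename;
    var cipherFile = __filename+".xor";
    var decipherFile = __filename+".in";

    // Stream that reads the source file
    var rs = fs.createReadStream(inFile);
    // Encrypting stream
    var ts = xor.createCipher();
    // Stream that writes the encrypted file
    var ws = fs.createWriteStream(cipherFile);
    // Encrypt
    rs
        .pipe(ts)
        .pipe(ws)
        .on('finish', function () {
            // Stream that reads the encrypted file
            var rs = fs.createReadStream(cipherFile);
            // Decrypting stream
            var ts = xor.createDecipher();
            // Stream that writes the decrypted file
            var ws = fs.createWriteStream(decipherFile);
            // Decrypt
            rs.pipe(ts).pipe(ws);
        });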
That was an example of a home-made encryption class. In the program itself I will use the standard classes for encryption (crypto streams) and packing (zlib streams).
The packing functions live in the packer folder and the encryption functions in the crypto folder.
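For illustration, here is a rough sketch of what such a pair of modules could look like when built on the standard streams. It is not the author's code: the choice of gzip and AES-256-CTR, the roundTrip helper, and the salt and iv fields inside params are all assumptions; only the function names createPacker, createUnpacker, createCipher and createDecipher and the password/params options come from the listings in this article.

```javascript
const crypto = require("crypto");
const zlib = require("zlib");
const { pipeline } = require("stream");

// Hypothetical packer module: gzip via the standard zlib streams.
const packer = {
  createPacker: function () { return zlib.createGzip(); },
  createUnpacker: function () { return zlib.createGunzip(); }
};

// Hypothetical crypto module: AES-256-CTR with a key derived from the password.
// Assumption: salt and iv arrive in options.params; a real module would have
// to generate them randomly and store them next to the data.
function keyFrom(options) {
  return crypto.scryptSync(options.password, options.params.salt, 32);
}
const cipher = {
  createCipher: function (options) {
    return crypto.createCipheriv("aes-256-ctr", keyFrom(options), options.params.iv);
  },
  createDecipher: function (options) {
    return crypto.createDecipheriv("aes-256-ctr", keyFrom(options), options.params.iv);
  }
};

// Pack -> encrypt -> decrypt -> unpack, then hand the result to the callback.
function roundTrip(input, options, callback) {
  const head = packer.createPacker();
  const chunks = [];
  let done = false;
  function finish(err, result) {
    if (!done) { done = true; callback(err, result); }
  }
  const tail = pipeline(
    head,
    cipher.createCipher(options),
    cipher.createDecipher(options),
    packer.createUnpacker(),
    function (err) { if (err) finish(err); }
  );
  tail.on("data", function (chunk) { chunks.push(chunk); });
  tail.on("end", function () { finish(null, Buffer.concat(chunks)); });
  head.end(input);
}

// Fixed salt and all-zero IV: acceptable for a demo only, never for real backups.
const demoOptions = { password: "secret", params: { salt: "demo-salt", iv: Buffer.alloc(16) } };
roundTrip(Buffer.from("backup data"), demoOptions, function (err, output) {
  if (err) throw err;
  console.log(output.toString()); // prints "backup data"
});
```

Packing comes before encryption on the way in because encrypted data looks random and no longer compresses.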

The finished module for creating an archive
    //-- Load the modules we need
    var stream = require("stream");
    var util = require("util");
    var path = require("path");
    var async = require("async");
    var fs = require("fs");

    // Name under which the header is packed
    var FILENAME_HEADER = "header";

    // Load every module found in a folder
    function requires(nameLib) {
        var res = {};
        var fullpath = path.join(__dirname,nameLib);
        if( fs.existsSync(fullpath) ) {
            var files = fs.readdirSync(fullpath);
            for(var i in files) {
                var name = path.basename(files[i],".js");
                res[name] = require("./"+nameLib+"/"+name);
            }
        }
        return res;
    }

    // Extensions of files that are already compressed and are stored as is
    var _noPackExt=[
        'png','jpeg','jpg','gif','ico',
        'docx','xlsx',
        'mp4','avi',
        'mp3','ogg',
        'zip','rar','gz','7z','arj'
    ];
    var noPackExt={};
    for(var i in _noPackExt) {
        noPackExt["."+_noPackExt[i]] = 1;
    }

    // Create a directory together with its missing parents
    function mkDir(pathDir) {
        var pathPrev = pathDir;
        var _dirs=[];
        while(true) {
            var bname = path.basename(pathPrev);
            if(bname.length==0) break;
            _dirs.push(bname);
            pathPrev = path.dirname(pathPrev);
        }
        _dirs = _dirs.reverse();
        for(var i in _dirs) {
            pathPrev = path.join(pathPrev,_dirs[i]);
            if( !fs.existsSync(pathPrev) ) {
                try {
                    fs.mkdirSync(pathPrev);
                } catch(ex) {
                    // a parallel call may have created the directory already
                }
            }
        }
    }

    // Encryption modules
    var factoryCrypto = requires("crypto");
    // Packing modules
    var factoryPacker = requires("packer");

    //-- Pass-through stream that counts the bytes flowing through it
    function streamBytesCount(options) {
        // Allow calling without "new"
        if (!(this instanceof streamBytesCount)) {
            return new streamBytesCount(options);
        }
        // Byte counter
        this._bytesCount = 0;
        // Initialize the stream.Transform base class
        stream.Transform.call(this, options);
    }
    util.inherits(streamBytesCount, stream.Transform);

    // Count the chunk and pass it on unchanged
    streamBytesCount.prototype._transform = function (chunk, enc, cb) {
        // Update the counter
        this._bytesCount += chunk.length;
        // Pass the data downstream
        this.push(chunk, enc);
        // Signal that this chunk is done
        cb();
    };

    // Number of bytes seen so far
    streamBytesCount.prototype.bytesCount = function () {
        return this._bytesCount;
    };

    // Exported interface
    module.exports = {
        //-- Create an archive
        // arhPath - path of the archive
        // items - list of items to archive
        // basePath - directory the item names are relative to
        pack : function(arhPath,basePath,items,opts,callback) {
            // Encryption method
            var cryptoID = opts.cryptoID ? opts.cryptoID : "";
            // Stream that writes the archive data
            var wsData = fs.createWriteStream(arhPath);
            // Temporary file that collects the header
            var filepathHeader = arhPath+".dat";
            var wsHeader = fs.createWriteStream(filepathHeader);

            // Serialize a number into a little-endian buffer of Nbytes bytes
            function Number2Buffer(value,Nbytes) {
                var buf = Buffer.alloc(Nbytes);
                switch(Nbytes) {
                    case 1: buf.writeUInt8(value,0); break;
                    case 2: buf.writeUInt16LE(value,0); break;
                    case 4: buf.writeUInt32LE(value,0); break;
                    case 8: {
                        // Math.floor and >>> keep both halves unsigned;
                        // Math.round and & break for values of 2 GB and more
                        var hi = Math.floor(value/0x100000000);
                        var lo = value>>>0;
                        buf.writeUInt32LE(lo,0);
                        buf.writeUInt32LE(hi,4);
                    } break;
                }
                return buf;
            }

            // Serialize a string: 2-byte length followed by the UTF-8 bytes
            function String2Buffer(str) {
                var buf = Buffer.from(str,'utf8');
                return Buffer.concat([Number2Buffer(buf.length,2),buf]);
            }

            // Append one file to the archive
            function _pushFile(filePath,item,cb,fdebug) {
                // Choose the packing method
                var packerID = '';
                if( typeof(item.packerID)!='undefined' )
                    packerID = item.packerID;
                else {
                    packerID = "node";
                    // Is the file already compressed?
                    var ext = path.extname(item.name).toLowerCase();
                    if( typeof(noPackExt[ext])!='undefined' ) {
                        packerID = '';
                    }
                }
                // Stream that reads the file
                var stream = fs.createReadStream(filePath);
                // Pack
                if( typeof(factoryPacker[packerID])!='undefined' ) {
                    var sp = factoryPacker[packerID].createPacker({
                        params : opts.params ? opts.params : {}
                    });
                    stream = stream.pipe(sp);
                }
                // Encrypt
                if( typeof(factoryCrypto[cryptoID])!='undefined' ) {
                    var sc = factoryCrypto[cryptoID].createCipher({
                        password : opts.password ? opts.password : item.name,
                        params : opts.params ? opts.params : {}
                    });
                    stream = stream.pipe(sc);
                }
                // Byte counter
                var oBytesCount = new streamBytesCount;
                // When all data has passed through, report the packed size
                oBytesCount.on('end',function(err){
                    cb(err,oBytesCount.bytesCount(),packerID);
                });
                // Write into the archive without closing it
                stream = stream.pipe(oBytesCount).pipe(wsData,{ end: false });
            }

            // Process the items one at a time
            async.eachSeries(items, function(item, cb) {
                // Item type: 0 - file / 1 - directory
                var bufHeader = Number2Buffer(item.size<0 ? 1 : 0 ,1);
                // Item name
                bufHeader = Buffer.concat( [bufHeader,String2Buffer(item.name) ] );
                // A file has a size >= 0, a directory a negative size
                if(item.size>=0) {
                    // Full path of the file
                    var filePath = path.join(basePath,item.name);
                    // Add the file to the archive
                    _pushFile(filePath,item,function(err,sizePack,packerID){
                        // Append the file description to the header record
                        bufHeader = Buffer.concat( [
                            bufHeader,
                            Number2Buffer(item.size,8), // original size
                            Number2Buffer(sizePack,8),  // packed size
                            String2Buffer(packerID)     // packing method
                        ] );
                        // Write the record to the header file
                        wsHeader.write(bufHeader,cb);
                    });
                }
                else {
                    // Write the directory record to the header file
                    wsHeader.write(bufHeader,cb);
                }
            },
            function(err) {
                // On error, stop
                if(err) return callback(err);
                // Close the header file
                wsHeader.end();
                // Packing method for the header
                var packerID = "node";
                // Fixed part of the header
                var bufHeaderH = Buffer.concat([
                    String2Buffer(cryptoID),      // encryption method
                    String2Buffer(packerID),      // packing method of the header
                    Number2Buffer(items.length,8) // number of items
                ]);
                // Write the fixed part into the archive
                wsData.write(bufHeaderH,function(err) {
                    // On error, stop
                    if(err) return callback(err);
                    // Append the packed header file to the archive
                    _pushFile(filepathHeader,{
                        name : FILENAME_HEADER,
                        packerID : packerID
                    },function(err,sizePack,packerID){
                        // Write the total header size as the last 8 bytes
                        wsData.write(Number2Buffer(sizePack+bufHeaderH.length,8),function(err) {
                            // Close the archive
                            wsData.close();
                            // Once it is really closed
                            wsData.on('close',function(){
                                // remove the temporary header file
                                fs.unlink(filepathHeader, callback);
                            });
                        });
                    });
                });
            });
        },
        //-- Unpack an archive
        unpack : function(arhPath,fnIterator,opts,callback) {
            // Get the size of the archive
            fs.stat(arhPath, function(err,stat) {
                // On error, stop
                if(err) return callback(err);
                // Open the archive for reading
                var fdr = fs.openSync(arhPath, 'rs');
                // Position of the header size (the last 8 bytes)
                var position = stat.size - 8;

                // Read an Nbytes-byte number at the current position
                function _readNumber(Nbytes) {
                    var res=0;
                    var buf = Buffer.alloc(Nbytes);
                    position += fs.readSync(fdr, buf, 0, buf.length, position);
                    switch(Nbytes) {
                        case 1: res = buf.readUInt8(0); break;
                        case 2: res = buf.readUInt16LE(0); break;
                        case 4: res = buf.readUInt32LE(0); break;
                        case 8: {
                            var lo = buf.readUInt32LE(0);
                            var hi = buf.readUInt32LE(4);
                            res = hi*0x100000000+lo;
                        } break;
                    }
                    return res;
                }

                // Read a length-prefixed string at the current position
                function _readString() {
                    var res = "";
                    // Length of the string in bytes
                    var len = _readNumber(2);
                    // The string itself
                    if(len) {
                        var buf = Buffer.alloc(len);
                        position += fs.readSync(fdr, buf, 0, buf.length, position);
                        res = buf.toString('utf8');
                    }
                    return res;
                }

                // Total size of the header
                var sizeHeader = _readNumber(8);
                // Jump to the start of the header
                position = stat.size - 8 - sizeHeader;
                // Encryption method
                var cryptoID = _readString();
                // Packing method of the header
                var packerID = _readString();
                // Number of items
                var cnt = _readNumber(8);
                // Size of the packed item list that follows the fixed part
                sizeHeader -= (2+Buffer.byteLength(cryptoID,'utf8')+2+Buffer.byteLength(packerID,'utf8')+8);

                // Extract one file
                function _popFile(item,cb) {
                    // Create the target directory
                    mkDir( path.dirname(item.filepath) );
                    // Stream that writes the extracted file
                    var file = fs.createWriteStream(item.filepath);
                    // An empty file has nothing to read
                    if(item.sizePack==0) {
                        file.close();
                    }
                    else {
                        // Stream that reads the file's part of the archive
                        var stream = fs.createReadStream(arhPath,{
                            start : item.offset,
                            end : (item.offset+item.sizePack-1)
                        });
                        // Decrypt
                        if( typeof(factoryCrypto[cryptoID])!='undefined' ) {
                            var sc = factoryCrypto[cryptoID].createDecipher({
                                password : opts.password ? opts.password : item.name,
                                params : opts.params ? opts.params : {}
                            });
                            stream = stream.pipe(sc);
                        }
                        // Unpack
                        if( typeof(factoryPacker[item.packerID])!='undefined' ) {
                            var sp = factoryPacker[item.packerID].createUnpacker({
                                params : opts.params ? opts.params : {}
                            });
                            stream = stream.pipe(sp);
                        }
                        stream.pipe(file);
                    }
                    // Continue only after the file is really closed
                    file.on('close', cb);
                }

                // Temporary file the header is extracted to
                var filepathHeader = path.join(__dirname,"header.sb.extract.dat");
                // Extract the header
                _popFile({
                    filepath : filepathHeader,
                    name : FILENAME_HEADER,
                    offset : (stat.size - 8 - sizeHeader),
                    sizePack : sizeHeader,
                    packerID : packerID
                },function(err){
                    // On error, stop
                    if(err) return callback(err);
                    // Switch reading from the archive to the extracted header
                    fs.closeSync(fdr);
                    fdr = fs.openSync(filepathHeader, 'rs');
                    position = 0;
                    // Read the items and compute their offsets in the archive
                    var items=[];
                    var offset = 0;
                    while(cnt>0) {
                        var item = { offset : offset };
                        // Type
                        item.typ = _readNumber(1);
                        // Name
                        item.name = _readString();
                        // Only files carry sizes and a packing method
                        if(item.typ==0) {
                            item.size = _readNumber(8);
                            item.sizePack = _readNumber(8);
                            item.packerID = _readString();
                            // Offset of the next file
                            offset += item.sizePack;
                        }
                        // Ask the caller where the item should be extracted
                        item.filepath = fnIterator(item);
                        // Only items that received a path are extracted
                        if( typeof(item.filepath)=='string' ) {
                            items.push(item);
                        }
                        cnt--;
                    }
                    fs.closeSync(fdr);
                    // Remove the temporary header file
                    fs.unlink(filepathHeader, function(err){
                        // Extract the items (at most 4 at a time)
                        async.eachLimit(items, 4, function(item, cb) {
                            switch(item.typ) {
                                case 0: // file
                                {
                                    _popFile(item,cb);
                                } break;
                                case 1: // directory
                                {
                                    mkDir( item.filepath );
                                    cb();
                                } break;
                            }
                        },callback);
                    });
                });
            });
        }
    };
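The 8-byte sizes in the header deserve a closer look, because JavaScript bitwise operators work on signed 32-bit integers: an expression such as value & 0xFFFFFFFF turns negative once bit 31 is set, and rounding the upper half instead of flooring it is off by one for the same values. Below is a minimal sketch of the split into two unsigned 32-bit halves using plain arithmetic; writeUInt64LE and readUInt64LE are hypothetical helper names, not part of the module above.

```javascript
// Write a non-negative safe integer as 8 little-endian bytes.
function writeUInt64LE(value) {
  if (!Number.isSafeInteger(value) || value < 0) {
    throw new RangeError("expected a non-negative safe integer, got " + value);
  }
  const buf = Buffer.alloc(8);
  buf.writeUInt32LE(value % 0x100000000, 0);             // low 32 bits
  buf.writeUInt32LE(Math.floor(value / 0x100000000), 4); // high 32 bits
  return buf;
}

// Read the number back from 8 little-endian bytes.
function readUInt64LE(buf, offset) {
  offset = offset || 0;
  const lo = buf.readUInt32LE(offset);
  const hi = buf.readUInt32LE(offset + 4);
  return hi * 0x100000000 + lo;
}

const size = 5000000000; // a file a bit larger than 4 GB
console.log(readUInt64LE(writeUInt64LE(size)) === size); // prints true
```

Node.js 12 and later also offer buf.writeBigUInt64LE and buf.readBigUInt64LE, which do the same with BigInt values.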

Working with storage


A link to a storage looks like this: [storage type]://[storage address]
A module for working with a storage must provide two methods:
  1. put(url, filename, rs, callback): saves the data from the read stream rs into the storage
  2. get(url, filename, ws, callback): reads the data from the storage into the write stream ws
Only the storage address is passed to these methods (the storage type is not).
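Splitting such a link into its two parts could be sketched as follows. The function names parseStorageUrl and resolveStorage, the registry object and the sample path are assumptions made for illustration; the article does not show how the final program does this.

```javascript
// Split "<type>://<address>" into its two parts.
function parseStorageUrl(url) {
  const match = /^([a-z][a-z0-9+.-]*):\/\/(.*)$/i.exec(url);
  if (!match) {
    throw new Error("Expected <type>://<address>, got: " + url);
  }
  return { type: match[1].toLowerCase(), address: match[2] };
}

// Pick the storage module by type; only the address is handed on to it.
function resolveStorage(url, registry) {
  const parsed = parseStorageUrl(url);
  const storage = registry[parsed.type];
  if (!storage) {
    throw new Error("Unknown storage type: " + parsed.type);
  }
  return { storage: storage, address: parsed.address };
}

// A stub standing in for a real storage module with put and get methods.
const registry = { fs: { put: function () {}, get: function () {} } };
console.log(resolveStorage("fs://D:/backup", registry).address); // prints "D:/backup"
```

With a registry like this, adding a new storage type means adding one more module with the same two methods.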
The fs module, which saves to the file system
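    // Storage module "fs" (file system)
    var fs = require('fs');
    var path = require('path');

    // Exported interface
    module.exports = {
        // Save a stream into the storage
        put : function (url,filename,rs,callback) {
            // Full path of the file
            var filepath = path.join(url,filename);
            // Stream that writes the file
            var ws = fs.createWriteStream(filepath);
            ws.on('error',callback);
            // Copy the data
            rs.pipe(ws);
            // Report completion once the file is closed
            ws.on('close',callback);
        },
        // Read a file from the storage into a stream
        get : function (url,filename,ws,callback) {
            // Full path of the file
            var filepath = path.join(url,filename);
            // Stream that reads the file
            var rs = fs.createReadStream(filepath);
            rs.on('error',callback);
            // Copy the data
            rs.pipe(ws);
            // Report completion once the file is closed
            rs.on('close',callback);
        }
    };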

All that remains is to assemble the final program from the modules already written.

The most common mistake


I would also like to write separately about the most common mistake (in my experience) when writing programs for NodeJS: we must not forget that functions in NodeJS are ASYNCHRONOUS. In particular, I forgot that the stream method stream.close() is asynchronous. I called close and immediately started working with the file. Sometimes it worked (the file had time to close) and sometimes it did not, and then I got a zero-length file and an error when working with it. The right way is to catch the close event and continue the work there. Do not repeat my mistakes!
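The safe pattern can be sketched like this: the file is read only inside the handler of the close event. The writeThenRead helper and the temporary file name are assumptions made for this example.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Write data through a stream and read the file back only after 'close'.
function writeThenRead(filePath, data, callback) {
  let done = false;
  function finish(err, text) {
    // 'close' also fires after 'error', so report only once.
    if (!done) { done = true; callback(err, text); }
  }
  const ws = fs.createWriteStream(filePath);
  ws.on("error", finish);
  ws.on("close", function () {
    // Only here is the file guaranteed to be flushed and closed.
    fs.readFile(filePath, "utf8", finish);
  });
  // end() writes the last chunk and then closes the file asynchronously.
  ws.end(data);
}

const demoPath = path.join(os.tmpdir(), "close-demo-" + process.pid + ".txt");
writeThenRead(demoPath, "backup data", function (err, text) {
  if (err) throw err;
  console.log(text); // prints "backup data"
  fs.unlinkSync(demoPath);
});
```

Reading the file right after ws.end(data), outside the handler, would reproduce the intermittent empty-file problem described above.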

Link to all the source code: all the sources mentioned (and not mentioned) in the articles "Node.js Script for Backing Up Data: The Beginning" and "Node.js Script for Backing Up Data: The End".
Documentation on the vendor's website
Link for downloading the program: the finished program. Unlike the source code, the contents of this archive will change. In particular, support for FTP and Yandex.Disk is planned. For now only the file system is supported as storage, i.e. backups are saved to the specified folder.

Source: https://habr.com/ru/post/244435/

