(Attention-grabbing picture: here for the hype, not to start a flame war.)
This is my attempt to ease the pain of backups at least a little: what requirements I set and how I implemented them.
I will say right away: this article is not about version control systems. Who among you puts CAD drawings into Git? Or hundred-megabyte graphics files? Daily? Nor is it about cloud storage. It is about meeting a specific need that acquaintances and colleagues have voiced more than once: people who use ordinary MS Windows and create things at home or in the office.
One day you realize that a week ago you should not have deleted that page from the drawing or merged those layers in the sketch. Suppose backups were being made, even incremental ones. At that moment, the last thing I want is to figure out how to extract files from a container, google command-line switches, and so on. I want the simplest thing: to open any file manager and find the latest version of the file, or the one from a week ago, preferably by name and not by content hash. Wrong file? Maybe you need the one from two weeks ago?
That is how my own project came about. I called it BURO. The name is not an acronym.
The rdiff-backup utility was the inspiration, but it had a "fatal flaw". It is written in Python and stores data in compressed text files. Waiting 20 minutes for it to finish at the end of the working day, or while getting ready for bed, was not acceptable to me. It cannot handle very long names. And it is scary to even think about how I would "unwind" the file I need from the backup as it was n revisions or days ago.
As an example, here is how to back up a folder:
buro --backup --srcdir=" " --dstdir=" "
In the folder specified as dstdir, a buro.db file and a mirror folder will be created; the mirror folder is a complete copy of the srcdir folder. The next time the same command is run, a subfolder named after the timestamp of that backup run is created in dstdir, and the files that were modified or deleted since the previous run are moved into it. It looks like this:
<2017-06-14T00-01-34.771>
    < >
        1.doc
<2017-06-15T00-57-35.858>
    .xls
<mirror>
    < >
        1.doc
        2.doc
    < >
        Vivaldi.mp3
    .xls
buro.db
As you can see, any file manager can be used to get at any file and its previous versions. Names and paths are preserved. Hooray!
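The mirror-plus-increments layout described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions, not Buro's actual code: the function names are hypothetical, change detection here is a naive size-and-mtime comparison (Buro keeps its own state in buro.db), and the increment folder is created only when something actually changed.

```python
import shutil
from datetime import datetime
from pathlib import Path


def files_under(root):
    """Return the set of file paths under root, relative to root."""
    return {p.relative_to(root) for p in root.rglob("*") if p.is_file()}


def changed(src_file, mirror_file):
    """Assumed change test: different size or different mtime (whole seconds)."""
    a, b = src_file.stat(), mirror_file.stat()
    return a.st_size != b.st_size or int(a.st_mtime) != int(b.st_mtime)


def backup(srcdir, dstdir, now=None):
    """Update dstdir/mirror from srcdir; park replaced files in a timestamp folder."""
    src, dst = Path(srcdir), Path(dstdir)
    mirror = dst / "mirror"
    mirror.mkdir(parents=True, exist_ok=True)

    # Folder name in the style shown above, e.g. 2017-06-15T00-57-35.858
    stamp = (now or datetime.now()).strftime("%Y-%m-%dT%H-%M-%S.%f")[:-3]
    increment = dst / stamp

    src_files, mirror_files = files_under(src), files_under(mirror)

    # 1. Move modified or deleted files out of the mirror into the increment.
    for rel in sorted(mirror_files):
        if rel not in src_files or changed(src / rel, mirror / rel):
            target = increment / rel
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(mirror / rel), str(target))

    # 2. Copy new and modified files into the mirror (copy2 keeps the mtime).
    for rel in sorted(src_files):
        if not (mirror / rel).exists():
            (mirror / rel).parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src / rel, mirror / rel)

    return increment
```

The point of the design is that every stored file keeps its original name and relative path, so no special tool is needed to browse old versions.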
Files from the backup are restored with the command:
buro --restore --srcdir=" " --dstdir=" "
It's simple. If you need the state of a folder as of a specific date, add the --timestamp argument and specify the date in the format "YYYY-MM-DD HH:mm:SS.SSS", for example:
buro --restore --srcdir="z:\backup" --dstdir="d:\documents" --timestamp="2017-06-15 00:00:00.000"
You do not have to specify the exact time of a backup run; any moment in between runs will do.
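To show why any intermediate moment works with this layout, here is a minimal Python sketch that finds where the version of one file as of a given moment would live. It is an assumption-laden illustration, not how Buro itself restores: the function names are hypothetical, it relies only on the folder naming shown earlier, and it ignores buro.db, so it cannot tell that a file sitting in the mirror did not yet exist at the requested moment.

```python
from datetime import datetime
from pathlib import Path

FOLDER_FORMAT = "%Y-%m-%dT%H-%M-%S.%f"    # e.g. 2017-06-14T00-01-34.771
REQUEST_FORMAT = "%Y-%m-%d %H:%M:%S.%f"   # e.g. 2017-06-15 00:00:00.000


def parse_stamp(name):
    """Return the datetime encoded in an increment folder name, or None."""
    try:
        return datetime.strptime(name, FOLDER_FORMAT)
    except ValueError:
        return None


def locate_version(dstdir, relpath, when):
    """Find the stored copy of relpath that was current at the moment `when`.

    An increment folder stamped T holds the versions that were replaced
    at run T, so they were current for any moment before T. The earliest
    increment later than `when` that contains the file therefore wins;
    if there is none, the mirror copy is the right one.
    """
    dstdir = Path(dstdir)
    increments = []
    for entry in dstdir.iterdir():
        stamp = parse_stamp(entry.name)
        if entry.is_dir() and stamp is not None:
            increments.append((stamp, entry))
    increments.sort()

    for stamp, folder in increments:
        if stamp > when and (folder / relpath).is_file():
            return folder / relpath

    candidate = dstdir / "mirror" / relpath
    return candidate if candidate.is_file() else None
```

A call such as locate_version(r"z:\backup", "docs/1.doc", datetime.strptime("2017-06-15 00:00:00.000", REQUEST_FORMAT)) would return the path to copy from, where "docs/1.doc" is a made-up example path.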
Have the increments started to take up a lot of space? You do not need to delete them by hand; there is a purge command. Example:
buro --purge --srcdir=" " --older="P30D"
The --older argument takes either a specific date or a time period in ISO 8601 duration format (for example P1Y2M3DT4H5M6S; in the example above the period is 30 days).
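For readers unfamiliar with ISO 8601 durations, here is a small Python sketch that turns such a string into a cutoff date. It is only an illustration: the function names are hypothetical, and treating a year as 365 days and a month as 30 days is my simplifying assumption, not a statement about how Buro counts them.

```python
import re
from datetime import datetime, timedelta

_DURATION = re.compile(
    r"^P(?:(?P<years>\d+)Y)?(?:(?P<months>\d+)M)?(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_duration(text):
    """Convert an ISO 8601 duration such as P30D into a timedelta."""
    match = _DURATION.match(text)
    # Reject strings with no components at all ("P") or a dangling "T".
    if match is None or text == "P" or text.endswith("T"):
        raise ValueError(f"not an ISO 8601 duration: {text!r}")
    parts = {name: int(value) for name, value in match.groupdict(default="0").items()}
    # Assumption: calendar units are approximated by fixed day counts.
    days = parts["years"] * 365 + parts["months"] * 30 + parts["days"]
    return timedelta(
        days=days,
        hours=parts["hours"],
        minutes=parts["minutes"],
        seconds=parts["seconds"],
    )


def purge_cutoff(now, older):
    """Increments stamped before the returned moment would be purged."""
    return now - parse_duration(older)
```

With this sketch, purge_cutoff(datetime(2017, 7, 15), "P30D") gives 15 June 2017, so both increment folders from the listing above would still be kept on that day's purge only if their stamps were later than that.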
But even Visual Studio generates many megabytes of service files every time. Disk space is not free. Configure file masks and folders to exclude? Pfff... There is an option to compress files! Add the --compress argument to the --backup command and each file will be converted to a bzip2 archive, with the extension ".bz2" appended to the file name. The compression level is a number from 1 to 9 (the default is 5). Example:
buro --backup --srcdir="d:\myprojects" --dstdir="g:\backup" --compress=9
With my file manager, at least, it is still possible to pull out any file. I doubt that NTFS compression would compress better.
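If your file manager cannot open ".bz2" files, any bzip2 tool should do. Below is a minimal Python sketch of per-file compression and extraction with the standard bz2 module. It assumes the stored files are plain bzip2 streams, which is how I read the description above; the function names are hypothetical.

```python
import bz2
import shutil
from pathlib import Path


def compress_file(src, level=5):
    """Write src as '<name>.bz2' next to it; level is 1 (fast) to 9 (small)."""
    src = Path(src)
    dst = src.with_name(src.name + ".bz2")
    with open(src, "rb") as fin, bz2.open(dst, "wb", compresslevel=level) as fout:
        shutil.copyfileobj(fin, fout)
    return dst


def extract_file(archive, target):
    """Unpack one '.bz2' file to target; corrupted data raises an error."""
    with bz2.open(archive, "rb") as fin, open(target, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return Path(target)
```

Since the bzip2 format carries its own checksums, extraction also acts as a basic integrity check.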
My typical backup scenario, the one Buro was developed for, is saving to removable media or a network drive. The data ends up in a poorly controlled environment, so some secrecy would not hurt. Encrypting removable media, and especially network drives, is always a slightly sad topic, particularly if you plan to keep working with the files elsewhere. But there is a solution here too: built-in encryption. Add the --password option to the --backup command. The extension ".encrypted" is appended to the file name. Compression and encryption can be used at the same time; the extension will still be ".encrypted".
Suppose the company has an overzealous security department that might ask questions about the origin of a file named "Salary statement of management.xls.encrypted". For this case there is the --encodenames option, which encrypts the names as well.
Example. This directory of files:
<.github>
<doc>
<fmt>
<support>
<test>
.gitignore
.travis.yml
Android.mk
ChangeLog.rst
CMakeLists.txt
CONTRIBUTING.rst
LICENSE.rst
README.rst
Turns into:
<3kOFykh>
<aqFkiIL7WrQu>
<fInOJigKGsSu>
<QCvcAkh>
<WmT4cT0A>
Au!U33x!41SS(8Ir
BuN0kVhe85aPkQh2$
fS1twqYhBCWCagGr
IKNW$LFo3$x9Mgb!rQd
nzSFA6G3RGfKkFA~XaXDZ
pWWMxno894zDuu0L!s3WY
rHtwmdKLrYxkWN~)NtKCuxH4Q
UiJ~)YM0zu01O8g52x~iAPIk
For anyone interested in the technical details of the encryption:
When the password protection option is used, the "buro.db" file is encrypted as well. The database engine is Berkeley DB; it encrypts data with AES in CBC mode with a 128-bit key, and the key is derived from SHA1(password + salt). My utility passes the user's password through the Argon2 function, takes 256 bits of output, and converts them into a text string that serves as the database password. Files and names are encrypted with the ChaCha20 algorithm. Keys are generated randomly and stored in the database. I was too lazy to derive a separate key for each file, so I use a single one. To keep different versions of the same file from being overlaid on each other (that is, encrypted with an identical keystream), each file gets its own nonce, written as the first 8 bytes of the encrypted file. Before encryption, file names are compressed with a simple LZ-like algorithm to make it harder to guess the length of the original name. Correct me if something is wrong.
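To make the "one key, a fresh nonce per file, nonce stored in the first 8 bytes" idea concrete, here is a pure-Python sketch of ChaCha20 in its original form with a 64-bit nonce and a 64-bit block counter. Everything beyond the 8-byte nonce prefix is an assumption (the counter starting at zero, the absence of any other header), so this will not necessarily read real Buro files, and pure Python is far too slow for real backups. Key derivation with Argon2 is not shown. The function names are hypothetical.

```python
import os
import struct

MASK = 0xFFFFFFFF


def _rotl(value, count):
    """Rotate a 32-bit word left by count bits."""
    return ((value << count) & MASK) | (value >> (32 - count))


def _quarter_round(s, a, b, c, d):
    s[a] = (s[a] + s[b]) & MASK; s[d] = _rotl(s[d] ^ s[a], 16)
    s[c] = (s[c] + s[d]) & MASK; s[b] = _rotl(s[b] ^ s[c], 12)
    s[a] = (s[a] + s[b]) & MASK; s[d] = _rotl(s[d] ^ s[a], 8)
    s[c] = (s[c] + s[d]) & MASK; s[b] = _rotl(s[b] ^ s[c], 7)


def chacha20_block(key, nonce, counter):
    """Produce one 64-byte keystream block (32-byte key, 8-byte nonce)."""
    init = list(struct.unpack("<4I", b"expand 32-byte k"))
    init += struct.unpack("<8I", key)
    init += struct.unpack("<2I", struct.pack("<Q", counter))
    init += struct.unpack("<2I", nonce)
    s = list(init)
    for _ in range(10):  # 10 double rounds = 20 rounds
        _quarter_round(s, 0, 4, 8, 12)
        _quarter_round(s, 1, 5, 9, 13)
        _quarter_round(s, 2, 6, 10, 14)
        _quarter_round(s, 3, 7, 11, 15)
        _quarter_round(s, 0, 5, 10, 15)
        _quarter_round(s, 1, 6, 11, 12)
        _quarter_round(s, 2, 7, 8, 13)
        _quarter_round(s, 3, 4, 9, 14)
    return struct.pack("<16I", *((x + y) & MASK for x, y in zip(s, init)))


def chacha20_xor(key, nonce, data):
    """XOR data with the keystream; the same call encrypts and decrypts."""
    out = bytearray()
    for offset in range(0, len(data), 64):
        keystream = chacha20_block(key, nonce, offset // 64)
        chunk = data[offset:offset + 64]
        out += bytes(a ^ b for a, b in zip(chunk, keystream))
    return bytes(out)


def encrypt_blob(key, plaintext):
    """Fresh random nonce per call, stored as the first 8 bytes of the result."""
    nonce = os.urandom(8)
    return nonce + chacha20_xor(key, nonce, plaintext)


def decrypt_blob(key, blob):
    """Read the nonce back from the first 8 bytes and undo the XOR."""
    return chacha20_xor(key, blob[:8], blob[8:])
```

Because a stream cipher only XORs data with a keystream, reusing both key and nonce for two versions of a file would let an attacker XOR the two ciphertexts and cancel the keystream out; the per-file nonce is what prevents that when a single key is shared.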
Boring details:
- Informational messages are written to stdout, error messages to stderr;
- To see detailed information about what is copied or moved where, use the --verbose option;
- To add a timestamp to messages, use the --printtimestamp option. To display the logging level, use --printloglevel.
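Since informational and error messages go to separate streams, a scheduled job can keep them in separate log files. The Python sketch below is a hypothetical wrapper: the helper names and log file handling are mine, only flags mentioned in this article are used, and nothing is assumed about Buro's exit codes.

```python
import subprocess


def build_backup_command(srcdir, dstdir, verbose=False, timestamps=False, exe="buro"):
    """Assemble the argument list for a backup run."""
    cmd = [exe, "--backup", f"--srcdir={srcdir}", f"--dstdir={dstdir}"]
    if verbose:
        cmd.append("--verbose")
    if timestamps:
        cmd.append("--printtimestamp")
    return cmd


def run_and_log(cmd, log_path, err_path):
    """Run cmd, append stdout and stderr to separate files.

    Returns True if anything was written to stderr.
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(result.stdout)
    with open(err_path, "a", encoding="utf-8") as err:
        err.write(result.stderr)
    return bool(result.stderr.strip())
```

For example, run_and_log(build_backup_command(r"d:\myprojects", r"g:\backup", verbose=True, timestamps=True), "buro.log", "buro.err") could be called from a nightly task.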
I did not add checksum-based data integrity control; IMHO data transfer is already checked at many levels as it is. If you really want it, use the compression option: there is an integrity check built in there, and I do perform it.
For those interested, here is a link to the source code and binaries.
P.S. During the development of Buro, my SSD died. What irony.