
Another archive storage format: dar

Introduction





There is a well-known saying that system administrators come in three types: those who do not make backups, those who already make backups, and those who make backups and check that they actually work.



However, that is no longer enough. For users of a backup system, speed now matters too: not only the speed of the backup itself (that is, of archiving files), but also the speed of recovery.



You will agree that it is foolish to read an entire 50, 100, or 1000 GB archive just to extract one file.

And if your archives are incremental, then to restore one file as of a given date you have to read all the archives sequentially, in order. Everything gets much worse if the archive files are located on a remote server.



And this is exactly what you will be doing if you use the TAR archive format. After all, it is the industry standard for archives and is used in many backup utilities.



The reason for this behavior is very simple: TAR has no index that would let you pull a single file out of the archive.



TAR has a lot of flaws, many of them fatal. Here is a short list of the main shortcomings I came across during my research:







And that is just what I remembered off the top of my head.



I did a fairly extensive survey of archivers (zip, rar, 7zip) and even of heavyweight backup systems, both open source (or nominally open source) such as bacula, and proprietary.



In the end I found an archive format that suited me and my company reasonably well in every respect and fit my task.



I would like to draw your attention to the dar archiver. I will briefly describe its advantages and disadvantages (they exist, but they are few and you can live with them) and then move on to practical examples.



Advantages









These are only its main advantages. In general, dar is very rich in features, and a quote from the man page says it best: “...



The project is under active development and is well supported by its developer. I received answers to my questions within a couple of days, and the answers were always very informative. I know of only one other project with the same level of support, libguestfs, which I have already written about.



Disadvantages









This is not really a disadvantage, but dar is very verbose. Where tar writes a single line after an operation, dar writes a lot, in great detail. Of course, you can silence it (nobody has escaped >/dev/null 2>&1 yet).



Practical work





I bet part of the audience has already run off to install dar on their favorite distribution and read the man page on their own. For those who stayed, and for the enthusiasts once they return, I will show how to use this wonderful utility and explain some basic concepts that you will encounter in man dar.



Archiving





The first example is the simplest:



dar -R $HOME -c /mnt/backup/archive





This archives your home directory ($HOME).



Let's exclude a couple of directories (~/movies and ~/downloads):



dar -R $HOME -c /mnt/backup/archive -P movies -P downloads





You have probably noticed that the archive name given on the command line does not include the .dar extension, and that the number 1 has appeared in the resulting file name. This is because dar was originally designed for backups to removable media (CD, DVD, or tape, for example), so it writes the archive in slices, and the 1 means that this slice is the first. Since we did not pass a slice size (for example, -s 100M), it is also the only one. Dar also has options for running scripts when certain operations complete (tar has similar options). For example, once a slice has been written you can run a script to change the media, then do the same for the next slice, and so on.
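As a rough illustration of the slice naming described above, here is a small sketch. The helper names slice_name and sliced_backup_cmd are made up for this example, and the paths are placeholders; -s is dar's slice-size option. The dar command is only printed, not executed.

```shell
#!/bin/sh
# Hypothetical helper: name of slice N for an archive basename,
# following the <basename>.<N>.dar pattern described above.
slice_name() {
    printf '%s.%s.dar\n' "$1" "$2"
}

# Hypothetical helper: print (do not run) a dar command that cuts the
# archive into slices of the given size via -s.
sliced_backup_cmd() {
    printf 'dar -R %s -c %s -s %s\n' "$1" "$2" "$3"
}

slice_name /mnt/backup/archive 1
sliced_backup_cmd /home/user /mnt/backup/archive 100M
```

With a slice size set, a large archive would end up as archive.1.dar, archive.2.dar and so on, all sharing the same basename.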



In general, breaking the archive into several parts will not surprise anyone.



By default, dar archives without compression; to enable it, pass the -z algo:level option. It supports gzip, bzip2, and lzo. The output is still the same .N.dar file, with no .gz or similar suffix appended: the archiver itself knows what is inside.



Let's move on to the next nicety: compression exceptions during archiving:



dar -R $HOME -c /mnt/backup/archive -z -Y "*.txt" -Y "*.fb2" -Z "*.mp4"





The -Y option specifies the files for which compression should be enabled, and -Z the files for which it is not needed. By default, the exclusion has the higher priority (but this behavior can be changed if necessary).
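To make the default priority concrete, here is a small sketch that mimics the decision in plain shell. It is an illustration only, not how dar itself is implemented; the function name would_compress is made up, and the patterns are the ones from the example above.

```shell
#!/bin/sh
# Illustration only: mimics the default priority described above,
# where an exclusion (-Z) wins over an inclusion (-Y).
would_compress() {
    case "$1" in
        # Exclusion mask: checked first, so it would win if a name
        # matched both kinds of mask.
        *.mp4) echo no; return ;;
    esac
    case "$1" in
        *.txt|*.fb2) echo yes ;;   # matches an inclusion mask
        *) echo no ;;              # not selected by any inclusion mask
    esac
}

for f in notes.txt book.fb2 film.mp4 photo.jpg; do
    printf '%s: %s\n' "$f" "$(would_compress "$f")"
done
```

Note that once an inclusion mask is given, a file that matches no inclusion mask (photo.jpg here) is not compressed either.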



Now let's get down to differential, incremental and, tastiest of all, decremental backups.



If you don't know what these mean, no problem, I'll explain:







Nothing stops you from using incremental and decremental backups at the same time. Over two weeks, a backup schedule might look like this (days of the week on top, backup type below: d- is decremental, f is full, +i is incremental):



M   T   W   T   F   S   S   M   T   W   T   F   S
d-  d-  d-  d-  d-  d-  f   +i  +i  +i  +i  +i  +i





This lets you get by with a single full copy, saving a significant amount of space.



You should also know that the only thing needed to create an incremental archive is the index. In dar terminology the index is called a catalogue, and saving the index to a separate file is called catalogue isolation. From here on I will use only the terms incremental and decremental, since a differential archive is a special case of an incremental one.



So let's create an incremental archive:



dar -R $HOME -c /mnt/backup/archive_monday -A /mnt/backup/archive





And now let's do one more:



dar -R $HOME -c /mnt/backup/archive_tuesday -A /mnt/backup/archive_monday





Got the idea? Great, let's move on. Now let's save the index separately (note that we do not cut it out of the archive but simply copy it, as with an MBR backup; you do back up your bootloader, don't you?), so that you don't need to keep a multi-gigabyte backup at hand just to create an incremental archive. Here we do the isolation on the fly, but the catalogue can be saved at any convenient time by extracting it from an existing archive.



dar -R $HOME -c /mnt/backup/archive_wednesday -A /mnt/backup/archive_tuesday -@ /mnt/backup/CAT_archive_wednesday





Now let's make one more backup, using only the CAT_archive_wednesday catalogue as the reference:



dar -R $HOME -c /mnt/backup/archive_thursday -A /mnt/backup/CAT_archive_wednesday -@ /mnt/backup/CAT_archive_thursday
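The pattern above repeats every day: the previous day's isolated catalogue is the reference, and a new catalogue is isolated for the next run. A minimal sketch of that chain, assuming the naming scheme used in the examples; the function name chain_cmds and the BACKUP_DIR variable are made up, and the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch: print the chain of incremental backups for a list of days,
# each one using the previous day's isolated catalogue as reference.
# BACKUP_DIR and the CAT_ prefix follow the examples above.
BACKUP_DIR=/mnt/backup

chain_cmds() {
    prev=$1; shift            # first argument: day that already has a catalogue
    for day in "$@"; do
        printf 'dar -R "$HOME" -c %s/archive_%s -A %s/CAT_archive_%s -@ %s/CAT_archive_%s\n' \
            "$BACKUP_DIR" "$day" "$BACKUP_DIR" "$prev" "$BACKUP_DIR" "$day"
        prev=$day             # today's catalogue becomes tomorrow's reference
    done
}

chain_cmds wednesday thursday friday
```

Because each run needs only the small catalogue file, the full archives themselves can be moved off the machine as soon as they are written.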





So, we have dealt with incremental backups, which are familiar to many. But what kind of beast is a decremental backup?



To start, we need yesterday's full archive, which we will turn into a decremental one, and today's full archive.



dar -R $HOME -c /mnt/backup/archive_sunday

dar -R $HOME -+ /mnt/backup/archive_saturday_decremental -A /mnt/backup/archive_saturday -@ /mnt/backup/archive_sunday -ad





Admittedly, things get a little confusing here (get used to it): according to the documentation, -+ was created to merge two archives, and -@, as already mentioned, isolates the catalogue on the fly, while -ad changes the behavior of these options to implement decremental backups. In a sense, this is logical. Probably.



Now we come to the moment of truth: data recovery. After all, everyone understands that a backup that cannot be restored is equivalent to no backup at all.



Before restoring, it is a good idea to test the archive:



dar -t /mnt/backup/archive_sunday



If dar did not return an error code (all the exit codes dar can return are listed at the end of the man page), you can restore:



mkdir sunday

dar -x /mnt/backup/archive_sunday -R sunday
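The two steps can be tied together so that a restore is attempted only when the test passes. A minimal sketch, assuming nothing beyond the -t, -x and -R options shown above; the function name verify_then_restore and the DAR variable are made up for this example.

```shell
#!/bin/sh
# Sketch: restore only if the archive test succeeds.
# DAR is an assumed variable that lets you point at another binary;
# it defaults to plain "dar".
DAR=${DAR:-dar}

verify_then_restore() {
    archive=$1
    target=$2
    # The test is judged by its exit code; its verbose output is discarded.
    if "$DAR" -t "$archive" >/dev/null 2>&1; then
        mkdir -p "$target" && "$DAR" -x "$archive" -R "$target"
    else
        status=$?
        echo "archive test failed with exit code $status" >&2
        return "$status"
    fi
}

# Usage: verify_then_restore /mnt/backup/archive_sunday sunday
```

The exit code of the failed test is passed through unchanged, so a calling script can look it up in the list at the end of the man page.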



Remote Machine Operations





File recovery





I have already mentioned in passing that restoring files from remote machines through a pipe (for example, over ssh) is a non-trivial task.



I will try to explain in detail how it works.



All the difficulties stem from the fact that dar needs to read the index in order to restore a single file. If dar is used the way tar is, in sequential reading mode (the --sequential-read option), no such problems arise.



To solve the problem of reading the index, two versions of dar were created:







Therefore, the scheme of work (for recovery) is as follows:



(2) --> dar --> (1) --> dar_slave archive --> (2)





  1. dar tells dar_slave through the pipe: “I want to restore file A”.
  2. dar_slave reads the archive's index, finds the offset at which the file is stored, and writes the data to stdout; dar reads it and writes the resulting file to disk.




The difficulty lies in getting the data from dar_slave back to dar. For this kind of “ring” data transfer we have to build a small workaround using mkfifo:



mkfifo /tmp/fifo

dar -x - -i /tmp/fifo -R sunday | ssh user@host dar_slave sunday > /tmp/fifo

rm /tmp/fifo
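A fixed path such as /tmp/fifo can collide with another run, and the pipe is left behind if the restore is interrupted. A minimal sketch of a more defensive setup, using only standard tools (mktemp, mkfifo, trap); the function names make_fifo and cleanup_fifo are made up for this example.

```shell
#!/bin/sh
# Sketch: create the named pipe in a private temporary directory and
# make sure it is removed even if the restore is interrupted.
make_fifo() {
    dir=$(mktemp -d) || return 1
    mkfifo "$dir/fifo" || return 1
    printf '%s\n' "$dir/fifo"
}

cleanup_fifo() {
    rm -f "$1"
    rmdir "$(dirname "$1")"
}

fifo=$(make_fifo) || exit 1
trap 'cleanup_fifo "$fifo"' EXIT

# The remote restore from the text would then use "$fifo" in place of
# /tmp/fifo.
```

The EXIT trap runs when the script ends, whether the restore succeeded or not, so no stale pipe is left in the temporary directory.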



, , , NFS.







To create an archive on a remote machine, write it to stdout (-c -) and pipe it over ssh:



dar -R $HOME -c - -A /mnt/backup/CAT_archive_wednesday -@ /mnt/backup/CAT_archive_thursday | ssh user@host 'cat > archive_thursday'









dar dar_manager, dar. , , , , (, , , ).






, , , : production-, , , , , .



dar dar_static: , .







, ( ), dar. Ubuntu 12.04 , dar 2.4.2, / . dar 2.4.12 .



, , 2.4, dar 2.3 .

Source: https://habr.com/ru/post/215449/


