The story of how ISPsystem developed a backup solution, told by head of development Alexander Bryukhanov.
All users are divided into three groups:
those who don't make backups,
those who already make them,
and those who verify the backups they make.
Some will simply be amused by my story, and some will recognize themselves in it. This is a story about how things were at ISPsystem 15 years ago, how and why they changed, and where we ended up. In this first part I will tell how we started developing a solution for backing up virtual servers.
So, it is the early 2000s. OpenVZ and KVM do not yet exist. FreeBSD Jail has just come out, and on its basis we are the first to develop a solution for providing virtual server services.
If you have data, you have a problem: how do you avoid losing it?
At first we simply archived the virtual server's files, which UnionFS made possible.
True, there is a subtle point: when you delete a file that came from the template, a so-called WHITEOUT entry is created, which tar does not see. Consequently, when you restore from such a backup, deleted files rise from the ashes of the template, unless new files were created in their place.

The service, as they say, took off, and we switched to incremental archives. On FreeBSD, tar could do that out of the box.
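The incremental setup we used might have looked roughly like this. A minimal sketch, assuming GNU tar's `--listed-incremental` snapshot mechanism (the exact FreeBSD-era invocation differed; all paths here are invented for illustration):

```shell
# Clean slate for the demo; /tmp/vps_demo plays the role of a virtual server's files.
rm -rf /tmp/vps_demo /tmp/vps.snar /tmp/vps-full.tgz /tmp/vps-incr.tgz
mkdir -p /tmp/vps_demo && echo one > /tmp/vps_demo/a.txt

# Level 0 (full) backup: archives everything and records state in the snapshot file.
tar --listed-incremental=/tmp/vps.snar -czf /tmp/vps-full.tgz -C /tmp vps_demo

# Change something, then take an incremental backup against the same snapshot:
# only files changed since the previous run are read and stored.
echo two > /tmp/vps_demo/b.txt
tar --listed-incremental=/tmp/vps.snar -czf /tmp/vps-incr.tgz -C /tmp vps_demo
```

The incremental archive contains only the new `b.txt` (plus directory metadata), which is exactly what keeps daily runs cheap.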
Meanwhile, the service kept booming. We counted our clients' servers not in single machines but in racks (for guys who started out with 56K Internet and a 20-square-meter room, that was very good). And over time, problems began to appear.
Problem one: CPU
Somewhere around this point we began to look at ready-made solutions. Back then, apart from Bacula, a very young product at the time, I did not find anything suitable. We tried to deploy it in one of the data centers, but it did not meet our expectations: it turned out to be quite difficult to configure, retrieving files from it was not as convenient as from a plain .tgz archive, and the performance was not impressive.
Lowering the priority of the backup process didn't lead to anything good either: backups either did not finish within a day or stopped being created at all.
The solution lay on the surface: move the backup work to a separate machine! Fortunately, that can be knocked together with a shell script. We did, and instead of an ordinary file server we got a full-fledged backup server. The CPU problem was solved! But then another one appeared.
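That "knocked together with a shell script" offload could be sketched like this: the backup server pulls the files and does the CPU-heavy compression itself. Hostnames and paths are invented; `RSH` is left empty so the sketch runs on a single machine, but in production it would be something like `ssh root@prod-node`:

```shell
# Run on the backup server. With RSH set to an ssh command, tar reads the
# files on the production node while gzip burns CPU here, not there.
RSH=""   # e.g. RSH="ssh root@prod-node" (hypothetical hostname)
rm -rf /tmp/vps101 /tmp/vps101.tgz
mkdir -p /tmp/vps101 && echo hello > /tmp/vps101/index.html

# Stream an uncompressed tar over the pipe, compress on the receiving side.
$RSH tar -cf - -C /tmp vps101 | gzip > /tmp/vps101.tgz
```

The design point is that the production node only streams raw file data; all archiving cost lands on the dedicated backup machine.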
Problem two: Disk
Now the backups hammered the disks, especially the weekly full backup. The answer came quickly: we also had the previous copy, which already contained most of the files! So the next step was to take file contents for a backup not from the server, but from the previous copy. That is how the first ispbackup implementation appeared. And it sped things up severalfold!
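The "read only what changed" idea can be roughly approximated in shell with `find -newer` against a timestamp of the previous run. This is an illustrative stand-in with invented paths, not ispbackup's actual mechanism (which additionally pulled unchanged content out of the previous archive instead of rereading it):

```shell
rm -rf /tmp/vps7 /tmp/last-backup.stamp /tmp/changed.lst /tmp/vps7-delta.tgz
mkdir -p /tmp/vps7 && echo old > /tmp/vps7/old.txt

# Pretend the previous backup ran now; the stamp records that moment.
touch /tmp/last-backup.stamp
sleep 1
echo new > /tmp/vps7/new.txt     # the only file changed since the stamp

# Collect only files modified after the previous run and archive just those.
find /tmp/vps7 -type f -newer /tmp/last-backup.stamp -print > /tmp/changed.lst
tar -czf /tmp/vps7-delta.tgz -T /tmp/changed.lst
```

Only `new.txt` is read from disk and stored; `old.txt` would be satisfied from the previous copy.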
Along the way, this allowed us to solve the WHITEOUT problem:
readdir() "does not see" deleted files, but
fts_read() sees them!
A gzip stream generally does not allow reading from the middle, and repacking the data is quite a resource-intensive exercise.
To deal with this, backup copies were cut into separate parts (a part contained a certain set of files in their entirety and had a limit on how far a file's start could be offset from the beginning of the archive). To avoid repacking files when they were reused, several parts of the previous copy could be carried over wholesale into the new archive. Reused parts could contain stale data; to get rid of it, a backup "compression" function was implemented.
We also got an amusing, unexpected bonus: "hot" files gradually gathered in some parts and "cold" files in others, which somewhat optimized the process. It's nice when something good happens by itself :)
Problem three: What if something goes wrong?
If at some point something went wrong, a broken archive could be created and go unnoticed for months. In fact, right up to the moment you actually need it... Advice based on our own bitter experience: if you care about your data, check your backups.
Epilogue
All in all, the tool acquired its crutches. But it worked!!! For several years we lived happily, and then hard drives became dramatically cheaper, and in a short time the size of virtual machines grew severalfold (if not by orders of magnitude).
For a while we resisted. We introduced a per-virtual-server setting: back it up daily, weekly, or not at all. But that was already agony. Backups were replaced by reliable RAID or network storage with careful monitoring on top. KVM and OpenVZ appeared.
Instead of backing up all the files, we began writing a user-data backup for ISPmanager, but that is another story.
For those interested, the source code of ispbackup is published on GitHub.