
What is a “backup”?


Very often I hear phrases like “why do I need a backup, I have RAID!” or “I make backups to the second HDD in the server!”, or something similar. And very often, a few months later, I hear the question “how can I recover my dead data?”. And that is sad.

In this article I want to reflect a little on what a “backup” is and what kind of backup scheme will actually protect you from losing your data. And, along the way, try to debunk a few myths and bad habits.

Most readers, I think, will find nothing new here, but if you belong to those who do not make backups, or who make backups that are not really backups, welcome!


Requirements?


Let's define the terminology. What is a backup?
It is logical to assume that it is a copy of data saved in advance so that the data can be recovered if the original is destroyed.
This implies the first requirement: isolation. It makes no sense to make a copy of the documents for your apartment and keep it in the same place as the original. Likewise, it makes no sense to make a copy of your data and store it on the same disk or in the same server as the original. Logical? Quite.

Let's go further. If we make a copy of our data, it is because we are afraid of losing it. Right? Right. That means all of the backed-up data is valuable to us. Right? Right again. Hence the second requirement: integrity. There is no point in copying without checking integrity, because we may well end up with broken data or irretrievably lose part of it.
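
One way to check integrity after copying is to compare checksums of the original files and their copies. The sketch below is only an illustration of that idea, not a complete tool: the function names `sha256_of` and `verify_copy` are made up for this example, and it assumes the backup is a plain directory tree that mirrors the source.

```python
import hashlib
from pathlib import Path
from typing import List


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(source: Path, backup: Path) -> List[str]:
    """List source files that are missing from the backup or differ in content."""
    problems = []
    for src_file in sorted(p for p in source.rglob("*") if p.is_file()):
        rel = src_file.relative_to(source)
        dst_file = backup / rel
        if not dst_file.is_file():
            problems.append(f"missing: {rel.as_posix()}")
        elif sha256_of(src_file) != sha256_of(dst_file):
            problems.append(f"corrupt: {rel.as_posix()}")
    return problems
```

An empty list means every source file has an identical copy. Anything else is a reason to distrust the backup before you actually need it.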

Another point. Imagine that you deleted a file. Or not one file, but many. For example, you accidentally ran "rm -rf ./ test" and went to sleep with a clear conscience. And at midnight ... the backup ran. But, bad luck, it was set up to create a complete copy of the data with no regard for versions or changes. That is, it removed the files you had deleted from the backup media as well, doing the exact opposite of what it is for. Can you picture it? Hence the third requirement: versioning. You must be able to return to a previous state of your data, not just have two identical copies.
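
The simplest form of versioning is to never overwrite the previous copy: every run writes into a new, separately named directory. Here is a minimal sketch under that assumption. The names `take_snapshot` and `list_snapshots` and the timestamp naming scheme are invented for illustration, and full copies like this waste space that real tools save with hard links or deduplication.

```python
import shutil
from datetime import datetime
from pathlib import Path
from typing import List, Optional


def take_snapshot(source: Path, backup_root: Path, label: Optional[str] = None) -> Path:
    """Copy `source` into a new directory under `backup_root` and return its path."""
    name = label or datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
    target = backup_root / name
    # copytree refuses to overwrite: an existing snapshot raises FileExistsError.
    shutil.copytree(source, target)
    return target


def list_snapshots(backup_root: Path) -> List[str]:
    """Return the names of all snapshots, sorted alphabetically."""
    return sorted(p.name for p in backup_root.iterdir() if p.is_dir())
```

With this layout, a file deleted on Tuesday is absent from Tuesday's snapshot but still present in Monday's, which is exactly what the midnight backup in the story above failed to provide.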

Well, that is enough, I guess. This article is aimed at SOHO users, not the enterprise, so we will not consider requirements for security, recovery speed, bounded redundancy and other such things.

And what is the result?


As a result, we have three requirements that a backup system must meet to deserve that proud name and store your data securely. Isolation protects against equipment failure and external factors (fire, flood, etc.), as well as against malicious deletion of data (it will not let an attacker or a virus infect or delete the backup as well). Integrity checking ensures that all of your data has really been backed up, so that you do not learn about problems too late, after the main copy is already lost. Versioning will not let the backup system take the bullet a user has fired into his own knee and move it to his head.

Closer to practice.


No problem.

When analyzing an existing backup system or inventing a new one for yourself, ask: does it meet the criteria outlined above?
Do the main copy and the backup ever sit in the same place? Can files in the main and the backup storage be modified at the same time? Is there a significant likelihood (higher than that of an atomic explosion) that both media will be destroyed or lost simultaneously? If the answer to any of these questions is yes, the backup is not isolated and there is a flaw in the system. For example, if you backed up files from a laptop to a USB flash drive and put it in a safe, well done. If you made that backup and put the flash drive in the laptop bag, you did not make a backup.
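
Part of the isolation question can be checked mechanically. The sketch below compares the device identifiers that the operating system reports for two paths. The helper name `on_same_device` is hypothetical, and the check is only a necessary condition: two different devices may still be partitions of one disk or two disks in the same case.

```python
import os
from pathlib import Path


def on_same_device(original: Path, backup: Path) -> bool:
    """True if both paths live on the same filesystem (equal st_dev values)."""
    return os.stat(original).st_dev == os.stat(backup).st_dev
```

If this returns True for your data and its backup, the copy is certainly not isolated. If it returns False, the remaining questions about the safe and the laptop bag are still yours to answer.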

Does your scheme ensure data integrity? For example, if the backup media runs out of space and the copy cannot be saved correctly, will you find out?
Does it ensure completeness? If it is an application, are its settings saved? If it is a database, is its schema saved? And so on.
Is it possible to get a working original from the existing copy? Or is something missing?
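
The "out of space" case can be caught before the copy even starts. This is a minimal sketch assuming a full, uncompressed copy of the source tree. The names `tree_size` and `has_room` and the `reserve_bytes` parameter are illustrative only.

```python
import shutil
from pathlib import Path


def tree_size(root: Path) -> int:
    """Total size in bytes of all regular files under `root`."""
    return sum(p.stat().st_size for p in root.rglob("*") if p.is_file())


def has_room(source: Path, backup_dir: Path, reserve_bytes: int = 0) -> bool:
    """True if `backup_dir` has space for a full copy of `source` plus a reserve."""
    free = shutil.disk_usage(backup_dir).free
    return free >= tree_size(source) + reserve_bytes
```

A backup script that calls `has_room` first and reports loudly when it returns False will at least let you find out about the problem on the day it appears, not on the day you need the data.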

Do you know what you will do if you lose the primary data? Is there a recovery procedure, even the simplest one? Are all of its steps feasible and sufficient to get the data back? There are real-life cases where the backup was made to an encrypted HDD, and the long, secure encryption key was kept not in the owner's head, nor even on a yellow sticky note, but ... yes, on the very laptop the backup was made from. As you can imagine, when the laptop was stolen, the data was lost forever.

Hold a drill: imagine that the primary media is lost and try to recover. I am sure that the first time you will fail, or it will turn out that many things are not quite the way you imagined.
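
Such a drill can be partly automated: restore into an empty scratch directory, never over the original, and check that everything you cannot live without is really there. The sketch below assumes the backup is a plain directory and uses a copy as a stand-in for the real restore step. The function name `restore_drill` and the list of required paths are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Iterable, List


def restore_drill(backup: Path, required: Iterable[str]) -> List[str]:
    """Restore `backup` into a scratch directory and return the required paths that are absent."""
    with tempfile.TemporaryDirectory() as scratch:
        restored = Path(scratch) / "restored"
        # Stand-in for the real restore procedure (unpacking, decrypting, etc.).
        shutil.copytree(backup, restored)
        return [rel for rel in required if not (restored / rel).exists()]
```

An empty result only shows that the files came back. Whether the application actually starts with them is something the drill still has to prove by hand.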

Answered the questions? Held the drill? Everything is fine? No, not quite. Do not forget about your backup system. Keep it up to date. Started using new software? Add its directories to the backup and think about how you would restore it. Monitor the condition of the backup media (whether it is a single disk, a flash drive or a NAS, it will not last forever). Think about your data; nobody else will.

Myths and examples of bad decisions


For some reason, people love to fool themselves. For example, many believe that RAID replaces a backup and guarantees data integrity. Especially if it is not a simple RAID 1 but something fancier, RAID 5 for example.
But RAID is not a backup. Of the criteria defined above, none of the three is met: mirrored disks are not isolated, not checked for integrity, and not versioned. A file system crash, an accidental “rm -rf /” or a mistake while working with partitions will destroy the data on both disks, and RAID will not help save it. Moreover, while a damaged file system on a single disk can usually be restored at least partially, with a collapsed array that is almost never possible.

The common “separate HDD for backups” scheme is not viable either. First, the backup data is accessible and vulnerable to an attacker, a virus, or an ordinary mistake like the aforementioned “rm -rf /”. Second, there are many quite probable situations that will destroy both disks at once. For example, the stormy and beautiful (with special effects) death of the power supply. Or a bucket of water knocked over onto the computer by the cleaner. Or ... there are plenty of them.

Utilities like Dropbox are not very suitable for backup either, unless, of course, versioning is provided. If you accidentally damage the data in the main copy, you lose the backup too as soon as the changes are synchronized. The data cannot be brought back.

Instead of a conclusion


Take care of your data: spending 15 minutes "before" can save you 15 hours "after". And do not forget the old joke about the two kinds of people: those who do not make backups yet and those who already do.

Source: https://habr.com/ru/post/162839/

