
Backup rule "3-2-1". Part 1

The "3-2-1" backup rule is believed to have been first described by Peter Krogh in his book "The DAM Book: Digital Asset Management for Photographers". This is hardly surprising: for a professional photographer, losing a personal archive is a catastrophe, so a backup strategy that reliably protects against data loss is a necessity.



So, the "3-2-1" rule states that to store data reliably, you must have at least:
  1. THREE copies of the data,
  2. stored in TWO different physical storage formats,
  3. with ONE of the copies kept in off-site storage.

All three parts of the rule rest on the same principle: fault tolerance through redundant data storage.

"Three different copies" means "three copies stored in three physically different places" (two folders on the same physical disk count as one place). I will not go deep into the mathematics, but assuming the storage devices have identical physical characteristics and the threats to them are statistically independent, adding copies has two effects: the expected number of device failures grows linearly with the number of copies, while the probability of losing every copy at once falls as a power of that number. That is, with three copies instead of one, a device failure somewhere in the set becomes roughly three times as likely, but the probability of losing all the data becomes the cube of the single-copy probability. In real life, this makes data stored in three copies virtually "indestructible", although you may have to replace failed disks somewhat more often, simply because there are more of them.
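
The arithmetic behind this claim can be sketched in a few lines of Python. This is only an illustration under the independence assumption stated above; the function names and the 5% failure probability are made up for the example and do not come from any measurement.

```python
def expected_failed_devices(p: float, copies: int) -> float:
    """Expected number of failed devices in the set: grows linearly with copies."""
    return copies * p


def prob_all_copies_lost(p: float, copies: int) -> float:
    """Probability that every copy is lost, assuming independent failures."""
    return p ** copies


# Assumed, purely illustrative probability that one device fails in a given period.
p = 0.05

for n in (1, 2, 3):
    print(
        f"copies={n}  "
        f"expected failures={expected_failed_devices(p, n):.3f}  "
        f"P(all lost)={prob_all_copies_lost(p, n):.6f}"
    )
```

The first column grows in proportion to the number of copies, while the second shrinks as a power of it, which is exactly the trade-off described above.
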
Alas, in real life threats are often statistically dependent. For example, an electromagnetic pulse in the office power supply circuit affects all disks at the same time. If one drive fails, the other two will most likely fail as well, because the pulse acts in the same way on typical commercially available drives, which place identical requirements on power quality.

Why three copies and not two? Because in real life the threats to two copies of the data very often turn out to be statistically dependent because of how the backup procedure is organized logically. Consider RAID 1, a disk array with mirroring (for more, see the post "Technology of disk snapshots"). If a virus infects a file on one disk of the array, the copy on the mirror disk is infected immediately. Likewise, if replication is configured, the replica is corrupted by the virus almost at once. Even a daily full backup will contain the infected data if the administrator does not notice the infection of the original within a day. In general, two copies are not enough to restore information whenever the time it takes the administrator to detect and respond to damage in the original data exceeds the interval between successive copy, replication, or mirroring jobs.
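
The relationship between detection time and copy interval can be made concrete with a small worst-case estimate. This is a simplified sketch, not a description of any particular backup product: it assumes backups run at a fixed interval and that a rotation scheme keeps only the most recent restore points. The function and parameter names are hypothetical.

```python
import math


def corrupted_restore_points(detection_delay_h: float, backup_interval_h: float) -> int:
    """Worst-case number of backup runs that capture already-damaged data.

    Every run that fires between the moment of damage and the moment the
    administrator reacts copies the damaged state.
    """
    return math.ceil(detection_delay_h / backup_interval_h)


def clean_copy_survives(
    detection_delay_h: float, backup_interval_h: float, retained_points: int
) -> bool:
    """True if at least one retained restore point predates the damage (worst case)."""
    return retained_points > corrupted_restore_points(detection_delay_h, backup_interval_h)


# Daily backups, damage noticed after 36 hours:
print(clean_copy_survives(36, 24, retained_points=2))
print(clean_copy_survives(36, 24, retained_points=3))
```

With a mirror or an instant replica the interval is effectively zero, so under this model no realistic detection time leaves a clean copy, which is why mirroring alone does not count as a backup.
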

To make threats even more statistically independent, it is recommended to store data in at least two different physical formats. For example, data saved to a DVD (optical recording) will not suffer from the electromagnetic pulse described earlier. Even if the DVD drive fails, the optical medium itself will preserve your data. Other examples of statistically dependent threats are a prolonged critical rise in temperature caused by a failed air conditioner in the server room, or a fire in the office; either affects all copies stored inside the office in the same way.

Thus, storing copies in different physical formats reduces the likelihood of losing all copies simultaneously to a single, uniform impact.

The third point, storing one copy outside the office, solves the same problem (reducing the statistical dependence of threats to different copies), but through geographical distribution of the storage sites. Theft or fire in the office can destroy all copies stored there, but a fire or theft in one office does not cause a fire or theft in another, geographically separate office, which makes these threats statistically independent across offices.
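
The three conditions of the rule are simple enough to check mechanically against an inventory of copies. The sketch below is one possible way to do it; the `DataCopy` record, its fields, and the example inventory are all hypothetical. Copies on the same physical device are counted once, in line with the remark above about two folders on one disk.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class DataCopy:
    device: str  # physical device or service that holds the copy
    medium: str  # physical storage format, e.g. "hdd", "dvd", "tape"
    site: str    # physical location of the device


def check_321(copies: List[DataCopy], primary_site: str) -> Dict[str, bool]:
    """Check an inventory of copies against each part of the 3-2-1 rule."""
    devices = {c.device for c in copies}  # same device counts as one place
    media = {c.medium for c in copies}
    return {
        "three_copies": len(devices) >= 3,
        "two_media": len(media) >= 2,
        "one_offsite": any(c.site != primary_site for c in copies),
    }


inventory = [
    DataCopy(device="file-server", medium="hdd", site="office"),
    DataCopy(device="backup-nas", medium="hdd", site="office"),
    DataCopy(device="tape-vault", medium="tape", site="second-office"),
]
print(check_321(inventory, primary_site="office"))
```
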

And what about storing data in the cloud? Can it be considered a replacement for backup? Obviously not. It is simply an alternative place to store data or backups of it and, incidentally, a good candidate for off-site backup storage. However, always remember that data can be lost in the cloud just as it can anywhere else.

The undoubted advantage of cloud providers is that they greatly simplify the backup process. The administrator does not need to buy and configure complex storage systems or fiddle with changing tapes. Cloud storage often expands transparently at the client's request, so it has no physical size limit from the client's point of view (the limit is financial rather than physical). This is an advantage over an on-premises storage system, which can suddenly run out of space.

In effect, cloud storage of backups is an alternative to tape, since data is retrieved from it with some delay compared to local disk storage. The delay depends on the channel bandwidth and the provider's pricing plan: the cheaper the plan, the slower the retrieval.

Is it always necessary to follow the "3-2-1" rule strictly? No. It depends on the value of your data, on the one hand, and on the criticality (the cost of potential damage) and probability of the threats to that data, on the other. The cost of protection should not exceed the value of what is being protected. Therefore, if your data is not very valuable, or the threats are minor or unlikely, you can implement the "3-2-1" rule only partially. The main thing is to build a threat matrix for the data (that is, list all possible threats and assess their likelihood and criticality) and then work through it, marking each threat in the table either as "mitigated by such-and-such technical measures" or as "recognized as irrelevant given the nature of the company's business". Once the threat matrix has been worked through, it will be clear to what extent the "3-2-1" rule should be implemented and what budget that will require.
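
A threat matrix of this kind can be kept as a simple table in code. The sketch below is only one way to structure it: the `Threat` fields, the example threats, and all numbers are invented for illustration, and treating expected loss as probability times damage is a common simplification assumed here rather than something the rule itself prescribes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Threat:
    name: str
    annual_probability: float          # assessed likelihood per year, 0..1
    damage: float                      # cost of the damage if the threat occurs
    mitigation: Optional[str] = None   # technical measure that addresses it, if any
    irrelevant: bool = False           # judged irrelevant to the business


def open_threats(matrix: List[Threat]) -> List[Threat]:
    """Threats that are neither mitigated nor declared irrelevant."""
    return [t for t in matrix if t.mitigation is None and not t.irrelevant]


def expected_annual_loss(threats: List[Threat]) -> float:
    """Sum of probability times damage over the given threats."""
    return sum(t.annual_probability * t.damage for t in threats)


matrix = [
    Threat("disk failure", 0.25, 40_000, mitigation="second copy on another device"),
    Threat("office fire", 0.5, 2_000),
    Threat("flood", 0.25, 40_000, irrelevant=True),
]

remaining = open_threats(matrix)
print([t.name for t in remaining], expected_annual_loss(remaining))
```

Comparing the expected loss of the remaining threats with the price of the next protective measure gives a rough basis for deciding how much of the rule is worth implementing.
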

In the second part of this article, you will learn how to implement the "3-2-1" backup rule with Veeam Backup & Replication v7.

References:
1. Peter Krogh. "The DAM Book: Digital Asset Management for Photographers"

Source: https://habr.com/ru/post/188544/

