
A continuation of the "experienced trifles" series. Previous parts can be read here.
Today's installment is a promised one: as promised, I will show you an interesting trick you can do with DFS. It will not, of course, give you full fault tolerance for file data, but it is at least something like an online backup.
To begin, let me repeat my empirical conviction that you should not build a file cluster out of DFS; that is not what DFS was created for. And to dot the i's, here are my arguments:
- The DFS mechanism has no way of determining which replica of a file is the correct one.
- If there are several replicas within one site, DFS decides whether to send a user's request to replica A or replica B based, by all appearances, on the load of the storage servers. (There are some settings that affect replica selection, but in practice they change little: if there are several replicas within a site, the choice of a particular one can be unpredictable.)
- These nuances make it possible for user A to land on replica A and work with the data there, while user B lands on replica B and works with the data there. As a result, TWO branches of modified data are formed, and DFS has no way of knowing which data is correct; it simply picks whichever was changed last. Can you imagine what this does to a file store, or worse, to databases?
- It is also worth noting that replication of open files may be delayed indefinitely. The simplest example: users who do not close their office documents when they go home.
All of the above suggests that DFS is best suited for distributing data to branch offices, synchronizing rarely changed data (orders, directives, archives), and similar tasks. However, you can be a little more clever and use DFS in a perhaps not entirely standard, but nevertheless useful, way.
You can build a kind of online replica on top of DFS that is inactive most of the time (which means most of the data-synchronization problems will not manifest themselves) and that can be enabled if the primary replica fails.
It may look like this:
Here (using the Department folder as an example), two replicas of the same folder are created, and a replication group and replication schedule are configured (the setup wizard handles this and will not cause you any problems). The twist of the idea is that one of the links to the storage servers is disabled: the replica exists, and replication between the servers proceeds as configured, but users who access this folder through DFS are redirected exclusively to the first, active server.
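Disabling a folder target can be done in the DFS Management console, or from the command line with `dfsutil`. A minimal sketch, assuming a hypothetical namespace `\\contoso.local\dfs\Department` with targets on servers `FS1` and `FS2` (your paths will differ):

```shell
REM Hypothetical names: \\contoso.local\dfs\Department is the DFS folder,
REM FS1 hosts the active replica, FS2 the standby one.

REM Take the second target offline so that client referrals
REM point only at FS1, while DFSR keeps replicating to FS2:
dfsutil property state offline \\contoso.local\dfs\Department \\FS2\Department

REM Verify which targets are online/offline for the folder:
dfsutil property state \\contoso.local\dfs\Department
```

Taking the target offline affects only referrals handed out to clients; the replication group configured by the wizard continues to copy data to the standby server.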
The second server will replicate the data as best it can and will stay "on standby." If something abnormal happens, you can perform a castling move: enable the link to the second server and disable the link to the first, and users will again reach their data, which will be as current as DFS replication managed to make it (in practice, anywhere from fully current, i.e., 0.5-2 seconds old, to 2-3 days behind in the case of open files, which are not replicated until they are closed, i.e., unlocked by the application).
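The castling move itself is two commands, mirroring the setup above. Again a sketch with the same hypothetical names (`\\contoso.local\dfs\Department`, failed server `FS1`, standby `FS2`):

```shell
REM Bring the standby target online so new referrals point at it:
dfsutil property state online \\contoso.local\dfs\Department \\FS2\Department

REM Take the failed target offline so clients stop being referred to it:
dfsutil property state offline \\contoso.local\dfs\Department \\FS1\Department
```

Switching back after the first server is repaired is the same pair of commands with the server names swapped (after letting replication catch up).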
It would seem great! Run off and build this super-system right now! But alongside all the good points, there are some not-so-good ones:
- You will need at least a double reserve of free space on each volume for the hidden DfsrPrivate folder (the service folder used for staging replicated data). Combined with the doubled storage cost (the same data is stored on both servers while only one is in use at a time), this no longer looks so attractive: such fault tolerance requires at least four times as much space as the data itself.
- Users may sometimes experience slowdowns when working with DFS. I never managed to pin down the exact cause, but it was always the result of having several replicas plus a non-trivial load on the network. As soon as only one replica was left, the slowdowns became vanishingly small. It was definitely not related to ongoing replication; it looked a lot like some problem with DFS referral resolution.
- For users to see the new replica you switched them to at zero hour, they will most likely have to reboot their computers; otherwise they will keep trying to follow the old path.
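The stale path comes from the DFS referral cache on the client, so it may be possible to avoid a full reboot by flushing that cache. A sketch (run in an elevated prompt on the client; `dfsutil` ships with the DFS administration tools):

```shell
REM Flush the client's DFS referral cache so the next access
REM requests a fresh referral from the namespace server:
dfsutil cache referral flush

REM On older systems, the equivalent command is:
dfsutil /pktflush
```

Open file handles to the old server still have to be closed before the client will actually move to the new target.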
- I did not implement automatic failover to the working replica: there are no standard means for it, and writing a miracle script on top of a technology with this many drawbacks seemed reckless to me.
As you can see, the described scheme has, in addition to fairly weighty advantages, some rather big disadvantages. So set your priorities, weigh the pros and cons, and decide for yourself how to act in your particular situation.
By the way, according to those in the know, DFS (and especially its replication service) was radically improved in Windows Server 2008 (R2), and perhaps some of these problems have been solved. Try it: maybe the proposed scheme will work much better there.
To be continued.