
A few simple steps to help avoid problems with creating and restoring backups



It is rightly said that system administrators fall into two groups: those who do not make backups yet and those who already do. Of course, this is a joke, and a very old one at that, but as they say, there is some truth in every joke. Many companies do have problems with backups, and these problems can cost hundreds of thousands or even millions of dollars when a large company with an established client base is involved.

To avoid problems with backups, you can follow a few simple rules. There is nothing difficult about them. We try to follow them ourselves, and they work. To be clear from the start, this is not about technological breakthroughs but about routine details that many people forget. We established our process for creating backups and recovering data from them long ago, and in our work we rely on a few small practices that may be useful to you as well.

Trifle 1. Monitoring


Monitoring, or the lack of it, cannot by itself cause a backup or a restore to fail. But these processes need to be monitored so that a failure is noticed and can be fixed quickly. The difficulty is that in some cases data is backed up on several servers at once, and most monitoring systems, apart from the newest ones, are simply not designed to watch a large number of servers at the same time.

The solution may be an automated monitoring system that collects data and visualizes it for the user. Such a system should show all of the company's servers, including individual and client servers. Ideally, it should support backups made with services and equipment from different vendors and conforming to different standards.
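
As a rough illustration of the "one view over all servers" idea, here is a minimal sketch in Python. It assumes each server reports the time of its last successful backup; the `BackupStatus` record, the `find_stale_backups` function, and the 24-hour threshold are all hypothetical names and values chosen for the example, not part of any particular monitoring product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List, Optional


@dataclass
class BackupStatus:
    """Hypothetical record collected from one server."""
    server: str
    last_success: Optional[datetime]  # None if no backup has ever succeeded


def find_stale_backups(
    statuses: Iterable[BackupStatus],
    now: datetime,
    max_age: timedelta = timedelta(hours=24),  # assumed threshold
) -> List[str]:
    """Return the names of servers whose last good backup is missing or too old."""
    stale = []
    for status in statuses:
        if status.last_success is None or now - status.last_success > max_age:
            stale.append(status.server)
    return sorted(stale)
```

Feeding the function the statuses of every server, including client ones, gives a single list of machines that need attention, which is the kind of summary a dashboard would visualize.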

It is also worth setting up automatic verification that a backup matches the original. Comparing checksums is enough for this, since it is not always possible to compare a full copy with the original.
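
A checksum comparison of this kind might look like the sketch below. It assumes the backup is a plain file-level copy, so that the original and the copy should hash to the same value; `sha256_of` and `backup_matches` are illustrative names, and SHA-256 is just one reasonable choice of hash.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so that large backups do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_matches(original, backup):
    """True if the two files have identical SHA-256 checksums."""
    return sha256_of(original) == sha256_of(backup)
```

When the original is no longer available at verification time, the same helper can be used to record the checksum at backup time and compare the stored value later.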

Trifle 2. Missed notifications


Normally, notifications are sent to the administrator by email. This is a relatively reliable channel, although it sometimes has problems. The trouble is that everything around it changes quickly: server configurations, applications, backup devices, and staff are all updated, removed, and replaced. As a result, notifications stop reaching the right people, and that is when the problems begin.

The best solution here is real-time notification. Notifications should be delivered through several channels, and high-priority messages in particular should not go to just one person; in some cases it is better to send them to several people.

Many communication channels are available today, including email, SNMP, and SMS. All of them let you receive messages and respond quickly. It is best to set up a system in your company that sends notifications in real time, so that you see everything immediately rather than minutes, hours, or days after an incident or failure.
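
The "several channels, several people" rule can be sketched as follows. This is only an outline under stated assumptions: each channel is modeled as a plain function taking a recipient and a message, and the names `notify_all` and `unreached` are hypothetical. Real email, SNMP, or SMS senders would be plugged in as those functions.

```python
from typing import Callable, Dict, List

# A channel is any callable that delivers a message to a recipient
# and raises an exception if delivery fails (assumed convention).
Channel = Callable[[str, str], None]


def notify_all(
    message: str,
    recipients: List[str],
    channels: Dict[str, Channel],
) -> Dict[str, List[str]]:
    """Send the message to every recipient over every channel.

    Returns, for each recipient, the names of the channels that succeeded.
    """
    delivered: Dict[str, List[str]] = {recipient: [] for recipient in recipients}
    for recipient in recipients:
        for name, send in channels.items():
            try:
                send(recipient, message)
            except Exception:
                # One broken channel must not block the remaining ones.
                continue
            delivered[recipient].append(name)
    return delivered


def unreached(delivered: Dict[str, List[str]]) -> List[str]:
    """Recipients that no channel managed to reach; these need escalation."""
    return [recipient for recipient, names in delivered.items() if not names]
```

Tracking who was not reached at all is the important part: it turns a silently lost notification into something the system can escalate.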

Trifle 3. The command line does not always help


Yes, many administrators prefer the console to everything else. But on the command line a single typing mistake can cause a serious problem for the whole data center.

What can be done? The best option is to use a GUI for this work as well. Even a brilliant administrator can make a small mistake that leads to huge problems. If a system with a graphical interface is used, the human error factor is eliminated in most cases.



Trifle 4. Reporting and planning


Few people like reporting, and even fewer are drawn to planning. System administrators are no exception, and work efficiency suffers as a result. That is only part of the problem. Another part is the analysis of reports received from other departments: too often, administrators do not pay enough attention to them.

As a solution, we can recommend distributing backup data across several infrastructure objects at once. Ideally, this should not hurt the quality of the data center's work, and it helps collect data from a variety of sources and combine it as needed. There should always be redundancy. Backups are more reliable when several servers take part in the process rather than one. It is also advisable to keep copies of the data on different servers, so that if one server is physically threatened, the data can be recovered from another location.
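
Keeping copies in several places can be sketched like this. The example assumes each destination is a directory reachable from the machine running the script, for instance a mounted share of another server; `replicate` and `sha256_file` are hypothetical helper names. Each copy is verified by checksum before it is counted as good.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_file(path, chunk_size=1024 * 1024):
    """Checksum a file in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replicate(archive, destinations):
    """Copy one backup archive into several directories.

    Returns the paths of the copies whose checksum matches the source.
    """
    archive = Path(archive)
    expected = sha256_file(archive)
    verified = []
    for directory in destinations:
        target_dir = Path(directory)
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / archive.name
        shutil.copy2(archive, target)  # copies contents and file metadata
        if sha256_file(target) == expected:
            verified.append(target)
    return verified
```

If the returned list is shorter than the list of destinations, at least one copy is missing or corrupt and the redundancy the text asks for is not actually in place.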

Incorrect configuration


Although the IT department is usually staffed by professionals, mistakes happen there quite often, and for a variety of reasons.





Regardless of the type of error, we recommend deploying automated monitoring systems that let you watch every corner of the data center at once, including backups. It is worth remembering that backups are not valuable in themselves; what is valuable is the information restored from them. Backing up data needs to be simple and straightforward, and so does recovering it. Another important point is process automation.

System administrators need to understand that failures will happen in any data center. You need to do everything possible to be confident that the backup IT infrastructure will work during a crash. As mentioned above, automated solutions that follow a clear algorithm can help here. Every member of the team must also know what to do and when. Something bad will inevitably happen, and the task of the professionals is to eliminate its consequences as quickly as possible.

If everything is done correctly, backups will be created automatically without problems, and restoring data will be easy too. The main thing is a streamlined process for backing up data and restoring it from saved backups.

Source: https://habr.com/ru/post/314338/

