Testing the disaster recovery plan

We all hope that nothing like this will ever happen.
Most enterprises have (or at least should have) a disaster recovery plan, and a data center operator should have one too. Any such facility is exposed to external factors, so the possibility of an accident can never be ruled out completely. Even facilities that seem to be the best protected can end up in a very unpleasant situation, as we have already written about.
Accordingly, a DRP (disaster recovery plan) should help the company return quickly to its pre-accident level of operation. Such a plan usually describes what employees must do in the event of an accident. The usual goal in drawing it up is to minimize the consequences of the accident by making sure that control of critical tasks can be regained using resources set aside in advance. But a plan is only a plan: will it actually work? To find out, it is worth holding a "training alarm".

Data centers contain a lot of equipment that is sensitive to external factors and that, in turn, handles huge amounts of potentially very valuable data. A recent example of what even a small accident can lead to is the cancellation of most Delta Airlines flights.
Such a huge company most likely had its own DRP. Perhaps the plan left some points unaccounted for, and the company and its customers suffered as a result. Having a plan and being able to carry it out quickly are two different things.
Any company, and an IT company all the more so, must take infrastructure, people and processes into account when drawing up its disaster recovery plan, whether the disaster is an earthquake, a fire or human error.
How often should “training alarms” be held?
It is hard to give a single answer: every company's situation is unique, so neither the DRP nor its execution can be standardized. At any moment, however, the head of the company must be confident that the plan matches the current situation and can be carried out. The DRP should be revised after every major change in infrastructure. An “alarm” can be held once a month or once a year, depending on how often the company changes.
Experts recommend testing at least once a year.
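The scheduling rule described above can be written down as a small check. This is only a sketch: the function name, its parameters and the one-year threshold are assumptions made for illustration, and a real company would substitute its own interval.

```python
from datetime import date, timedelta

# Assumption: one year is the longest acceptable gap between drills,
# following the "at least once a year" recommendation.
MAX_DRILL_GAP = timedelta(days=365)


def drill_is_due(last_drill, last_major_change, today):
    """Return True if a DRP drill should be scheduled.

    A drill is considered due when the infrastructure has changed
    significantly since the last drill, or when more than
    MAX_DRILL_GAP has passed since that drill.
    """
    if last_major_change is not None and last_major_change > last_drill:
        return True
    return today - last_drill > MAX_DRILL_GAP


if __name__ == "__main__":
    # Hypothetical dates: a drill in January, a major change in March.
    print(drill_is_due(date(2024, 1, 10), date(2024, 3, 1), date(2024, 6, 1)))
```

A company that changes often would simply shorten the interval, for example to one month.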
Getting ready

Before testing how realistic and effective the plan is, make sure the results of the test can be trusted. Check that responsibilities are distributed among employees sensibly and correctly. It should not happen that some employees have no duties at all while others have so many that they become, in effect, irreplaceable.
A disaster is a disaster precisely because any employee may be unreachable, and if that employee is a key person, the whole plan can be derailed. All instructions and rules must be clear and easy to understand. During the test, monitor the progress of the DRP closely.
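The check on how responsibilities are distributed can be partly automated. The sketch below assumes a hypothetical data layout (a mapping from each recovery task to the people able to perform it) and an arbitrary limit of three tasks per person. It flags tasks that depend on a single person, people with too many tasks, and people with none.

```python
from collections import defaultdict


def review_assignments(assignments, staff, max_tasks=3):
    """Flag weak spots in how recovery duties are distributed.

    assignments: dict mapping a task name to the list of people who can do it.
    staff: list of everyone expected to take part in the plan.
    max_tasks: assumed limit on tasks per person (illustrative only).
    """
    load = defaultdict(int)
    for people in assignments.values():
        for person in people:
            load[person] += 1

    return {
        # Tasks with fewer than two people: nobody can step in as a backup.
        "no_backup": sorted(t for t, p in assignments.items() if len(p) < 2),
        # People assigned to more tasks than the assumed limit.
        "overloaded": sorted(p for p, n in load.items() if n > max_tasks),
        # Staff members with no duties at all.
        "idle": sorted(p for p in staff if load.get(p, 0) == 0),
    }


if __name__ == "__main__":
    # Hypothetical tasks and names, for illustration only.
    plan = {
        "restore backups": ["anna"],
        "switch DNS": ["anna", "boris"],
        "notify customers": ["anna", "boris"],
    }
    print(review_assignments(plan, ["anna", "boris", "clara"], max_tasks=2))
```

A task listed under "no_backup" is exactly the situation in which one unreachable person derails the plan.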
Every detail of the test should be recorded, including all problems and difficulties encountered. The test should be timed, tracking how long it takes to solve each problem and to complete each stage. The company's management and individual employees should know what happens if the IT equipment and services stay down for a given period. How will this affect operations, customers and revenue?
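Timing the stages is easy to put in a table and compare with targets. The following is a minimal sketch under stated assumptions: the stage names, the target times and the revenue figure are invented, and the cost estimate assumes losses grow linearly with downtime, which is a simplification.

```python
def compare_with_targets(measured_minutes, target_minutes):
    """Return the stages that missed their target.

    The result maps each such stage to its overrun in minutes, or to
    None if the stage was never completed or never recorded.
    """
    overruns = {}
    for stage, target in target_minutes.items():
        actual = measured_minutes.get(stage)
        if actual is None:
            overruns[stage] = None
        elif actual > target:
            overruns[stage] = actual - target
    return overruns


def downtime_cost(downtime_minutes, revenue_per_hour):
    """Rough revenue loss, assuming it is proportional to downtime."""
    return downtime_minutes / 60 * revenue_per_hour


if __name__ == "__main__":
    # Hypothetical measurements from one drill, in minutes.
    measured = {"detect": 5, "failover": 50, "restore data": 120}
    targets = {"detect": 10, "failover": 30, "restore data": 120, "verify": 20}
    print(compare_with_targets(measured, targets))
    print(downtime_cost(sum(measured.values()), revenue_per_hour=2000.0))
```

Keeping these numbers from drill to drill shows whether the plan is getting faster or slower as the company changes.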
How to test
1. Plan verification. This is a purely theoretical stage that almost never includes a full-fledged "exercise". Several times a year, the plan should be reviewed to confirm that it still matches the current situation in the company and its environment.
Incidentally, a DRP should have a steering committee. It usually consists of competent employees, often top managers. Experts should also be brought in, since they can be of great help in planning the recovery from a disaster.
2. Test without an alarm. At this stage, check the knowledge of every employee who, according to the plan, takes part in dealing with the consequences of a disaster. Each of them should be asked about their duties and how they would perform them in an unforeseen situation.
If this is not done, the staff will not take the plan very seriously. Someone is bound to forget something, misunderstand it or even decide not to take part. To keep the “human factor” from significantly worsening the consequences of a disaster, the plan must be checked in this way. All difficulties, employee misunderstandings and unclear coordination of actions need to be recorded and corrected.
3. Full-scale test. This is a real field exercise, and it should be as close as possible to how events would unfold in an actual disaster. The result should be tangible. The data center operator must take into account how badly significant downtime can affect the company's work.
Some companies prefer not to tell ordinary employees that the “exercise” is not a real incident. This yields reaction and action times that are as close to reality as possible.
At this stage, the company's resources will have to be used, including time, equipment and facilities. The result should be the return of the "damaged" equipment to service within clearly defined time limits, with the company's employees adapting quickly to the situation.
What if something goes wrong?
Something most likely will, to one degree or another. The main thing to remember is that a test at this level cannot go 100% smoothly. Employee mistakes and unexpected factors will inevitably affect how the plan is carried out.
Once testing is complete, the findings should be shared with the company's employees, although some things should be reported only to those directly concerned. Ideally, the DRP should be tested whenever something changes dramatically in the company.
After the test, all the results should be put to use for the company's benefit. In general, keeping employees and the whole company prepared for an emergency is critical. The plan needs to be worked on (revised and updated) every few months. Experts recommend doing so once or twice a quarter. But, of course, it all depends on the company and its employees.

The plan needs to be tested against different scenarios and situations. Only if employees are ready for a disaster will the company be able to restore operations quickly after a failure. Otherwise, its business may suffer badly.
By the way, is your company prepared for such problems? If so, how do you test your plan, and what features does it have?