
Disaster Recovery Planning. Part Three - Final

Matching the needs of the business to what it can afford.




The previous articles (1, 2) on disaster recovery planning described procedures for collecting and processing information about an organization's IT infrastructure, which yield accurate information on:


All of this would be straightforward if not for the organization's limited budget, which does not allow it to acquire every reserve needed for rapid recovery. For this reason, the final task of disaster recovery planning is to find a balance between the needs and the financial capabilities of the business and to record it as a Service Level Agreement (SLA) for resolving incidents.

This stage consists entirely of agreeing the following aspects of interaction with the company's management:

1. Support hours provided by the internal IT service




The readiness of technicians to begin disaster recovery immediately after learning of a failure is the main factor that determines support hours. An eight-hour working day, vacations, illness, and time off naturally limit this readiness. If you lack specialists with the competencies needed for recovery work, or your engineers do not cover for one another sufficiently, both across the working hours and when one of them is absent, the business should not count on 24/7 support. If the current coverage does not guarantee a prompt response even on a 9x5 schedule, the following options are possible:


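The coverage condition from this section can be checked mechanically. The sketch below is illustrative only: the engineer names, skills, and shift hours are hypothetical, and it assumes whole-hour shifts within a single day. For a given skill it lists the support hours during which fewer than two engineers with that skill are on shift, that is, hours with no cover if one of them is absent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Engineer:
    name: str
    skills: frozenset
    start_hour: int  # first hour on shift (0-23)
    end_hour: int    # hour the shift ends (exclusive)


def uncovered_hours(engineers, skill, support_hours, min_people=2):
    """Return the support hours during which fewer than `min_people`
    engineers with `skill` are on shift."""
    gaps = []
    for hour in support_hours:
        on_shift = [
            e for e in engineers
            if skill in e.skills and e.start_hour <= hour < e.end_hour
        ]
        if len(on_shift) < min_people:
            gaps.append(hour)
    return gaps


# Hypothetical team, for illustration only.
team = [
    Engineer("anna", frozenset({"network", "backup"}), 9, 18),
    Engineer("boris", frozenset({"backup"}), 9, 18),
    Engineer("vera", frozenset({"network"}), 13, 22),
]

business_day = range(9, 18)  # a 9x5 schedule, hours 9:00 to 18:00
print(uncovered_hours(team, "backup", business_day))   # []
print(uncovered_hours(team, "network", business_day))  # [9, 10, 11, 12]
```

An empty list means the skill is covered by at least two people for the whole support window; any listed hour is a candidate for one of the options discussed in this section.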
However, even with external contractors it’s not so simple:

2. SLA with external contractors


Outwardly smooth cooperation with an external contractor may conceal its inability to resolve incidents within the time frame the business requires. Convenience and efficiency can turn into a headache at the very first problem if there is no shared understanding of the level of service you require from the supplier.

If the existing service level agreement with the external supplier is unsatisfactory for your business (or simply does not exist), the following options are possible:


Once you have decided on the people and/or companies that will carry out the recovery work, you can set the support hours for user services and include them in the service level agreement between the IT department and the business. It only remains to agree on recovery deadlines, and for that you need to discuss:

3. Getting the reserves needed for disaster recovery




The availability of the necessary equipment reserves directly affects the ability to restore a service quickly. If your company has a single physical server and it fails, you have no chance of restoring operations (for more on determining the necessary reserves, see the previous article). If your company does not currently have all the equipment reserves needed for recovery work, the following options are possible:


In principle, at this stage you can already state the time within which particular user services can be restored after a given failure. If these times do not suit management even with all the necessary reserves in place, that is a reason to discuss:

4. Advance preparation to speed up disaster recovery


This can be an additional monitoring system, a backup, or an additional server or piece of network equipment that is already configured and running as a hot spare. You may need them to localize a failure and restore a user service a little faster.

Once management has approved all the necessary investments in people, service contracts, equipment, and software, you can agree on recovery deadlines for user services in addition to the support hours. But to ensure these deadlines are met, one more small touch is needed:

5. Scope of scheduled maintenance tasks




To guarantee recovery after a failure, you must be sure that in an emergency all the resources needed for recovery will be at hand. To that end, you must constantly verify that they are present and in working order. Knowing the agreed reserves and resources, you can draw up an accurate list of the required routine maintenance tasks, whose regular execution may require additional technical specialists. This is the necessary price of reliability, but unfortunately sometimes even it does not help:

6. Situations beyond the scope of the SLA.


There are situations that fall outside the scope of planning and for which recovery time is hard to predict. These include not only force majeure but also the simultaneous failure of two or more elements of the same type, an event that probability theory allows for.
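A rough calculation shows why such events cannot be ruled out. The sketch below assumes that failures are independent and that every element has the same probability of being out of service at any given moment; both assumptions are simplifications, and the numbers are made up for illustration.

```python
from math import comb


def prob_at_least_k_down(n, p, k=2):
    """Probability that at least k of n independent elements are down at once,
    when each one is down with probability p (binomial model)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical example: 10 identical disks, each unavailable 1% of the time.
print(f"{prob_at_least_k_down(10, 0.01):.2%}")  # 0.43%
```

The result is small but not zero, and it grows quickly with the number of elements, which is why such cases are better handled as exceptions than promised in the SLA.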

It often makes no economic sense to prepare the IT infrastructure and IT staff for the prompt resolution of every possible accident. In some cases it is much cheaper and more effective to prepare the business itself to act when one occurs. For example, prepare blank invoices for registering goods by hand in case of a complete failure of computer systems, or keep strict records of primary documentation so that business operations carried out since the last database backup can be restored without difficulty after a force majeure event. Possible technical measures for reducing the impact of such situations on the business were described earlier.

At this point the coordination can be considered complete; only minor formalities remain:

Recording the agreed parameters and acting on them




The results of your negotiations with management should be recorded in writing, covering:


Set out in a document, the agreement lets you move from a situation where "the IT infrastructure pretends to work, and the business pretends to invest in it" to one where the business understands what level of service it can expect for a given level of IT investment.
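One way to keep such an agreement checkable is to store its parameters as data. The sketch below is an illustration built on assumptions, not a template from this article: the field names, the service name, and the numbers are hypothetical, and it records only two of the parameters discussed above, support hours and the recovery deadline.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ServiceSLA:
    service: str
    support_start_hour: int       # support window start, local time
    support_end_hour: int         # support window end (exclusive)
    support_weekdays: frozenset   # 0 = Monday ... 6 = Sunday
    recovery_deadline: timedelta  # agreed maximum time to restore

    def is_supported_at(self, moment: datetime) -> bool:
        """True if `moment` falls inside the agreed support window."""
        return (
            moment.weekday() in self.support_weekdays
            and self.support_start_hour <= moment.hour < self.support_end_hour
        )

    def met_deadline(self, reported: datetime, restored: datetime) -> bool:
        """Simplification: counts wall-clock time from report to restoration."""
        return restored - reported <= self.recovery_deadline


# Hypothetical agreement: mail is supported 9x5 and restored within 4 hours.
mail = ServiceSLA("mail", 9, 18, frozenset(range(5)), timedelta(hours=4))
reported = datetime(2024, 3, 4, 10, 0)  # a Monday

print(mail.is_supported_at(reported))                             # True
print(mail.met_deadline(reported, datetime(2024, 3, 4, 13, 30)))  # True
print(mail.is_supported_at(datetime(2024, 3, 9, 10, 0)))          # False
```

A real agreement would also need rules for incidents reported outside the support window; this sketch leaves them out.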

At this point, disaster recovery planning can be considered successfully completed. However, sometimes, after evaluating all the necessary changes and their cost, it becomes clear that it is cheaper to fundamentally change the existing IT infrastructure. But that's another story.

Good luck!

Ivan Kormachev
Company "IT Department"
www.depit.ru

Source: https://habr.com/ru/post/228115/

