
This post covers the main steps to ensure business continuity, the ones that give a basic result. These actions will help you avoid a disaster, carry out disaster recovery, and get out of the situation with minimal losses.
A reminder: managing the implementation of business continuity is a task that senior management values highly and that, almost always in Russia, leads to career growth.
The previous post covered the theory of business continuity for your own company. The topic is a long one: from the first ideas to running exercises on the disaster recovery plans, it can take many months. But there are things you can start with to get an intermediate result. So, the actions:
1. Form an emergency committee
This is a group of managers from different departments who are the first to act when an emergency arises - at any time of the day or night. These should be people who have been with the company for a long time and who understand well what employees need to perform their daily tasks.
Keep in mind that some of the accidents that can stop your business may be very large in scale. Consequently, the closer the emergency committee members live to the workplace, the better. You also need several means of communication: a mobile number alone is a serious single point of failure. Print at least the key contacts of all emergency committee members on a business-card-sized card so that everyone always has it with them.
If your company is large enough, then in addition to the emergency committee you need to create disaster recovery teams for each line of activity.
2. Define the procedure for invoking the disaster recovery plan
Not every accident requires this. The key question is whether the accident makes it impossible to restore work by regular means within a predetermined time. That recovery period can range from several hours to several days, depending on the industry, company size and other factors. Set it in advance, because afterwards it will be too late.
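The invocation criterion above comes down to one comparison, which is worth writing down so nobody has to argue about it during an accident. A minimal sketch in Python; the 8-hour threshold and the name `should_invoke_drp` are assumptions for illustration, not values from any standard:

```python
from datetime import timedelta

# Hypothetical threshold: the longest outage the business tolerates before
# the disaster recovery plan is invoked. Set the real value for your company.
MAX_TOLERABLE_OUTAGE = timedelta(hours=8)


def should_invoke_drp(estimated_repair_time: timedelta,
                      max_outage: timedelta = MAX_TOLERABLE_OUTAGE) -> bool:
    """Return True when regular means cannot restore work within the agreed time."""
    return estimated_repair_time > max_outage


print(should_invoke_drp(timedelta(hours=2)))  # minor incident: False
print(should_invoke_drp(timedelta(days=2)))   # prolonged outage: True
```

The point is not the code itself but that the threshold is a named, agreed value rather than something decided in the heat of the moment.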
3. Define procedures for emergency communications
One of the simplest and most effective organizational methods is called a call tree. Each potential member of the disaster recovery team has a laminated card with the contacts of several colleagues whom he is obliged to notify when he learns about the accident. One person calls three, each of them calls three more, and as a result everyone concerned learns what happened very quickly. This is much more effective than having one person sit on the phone and ring round dozens of people. The theory of distributed networks and the way revolutionary cells are organized will help you here.
On the technical side, provide alternative communication channels - a Gmail mailbox, 2-3 phone numbers, etc.
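To see why the call tree works so quickly, it helps to build one and count the rounds of calls. Below is a rough sketch, assuming a fan-out of three as in the text; the function names `build_call_tree` and `rounds_needed` are made up for this example:

```python
from collections import deque


def build_call_tree(people, fan_out=3):
    """Assign each person the colleagues they must call (breadth-first)."""
    if not people:
        return {}
    tree = {person: [] for person in people}
    queue = deque([people[0]])  # the first person learns of the accident
    next_index = 1
    while queue and next_index < len(people):
        caller = queue.popleft()
        callees = people[next_index:next_index + fan_out]
        tree[caller] = callees
        queue.extend(callees)
        next_index += len(callees)
    return tree


def rounds_needed(n_people, fan_out=3):
    """Number of calling rounds until everyone is informed."""
    informed, frontier, rounds = 1, 1, 0
    while informed < n_people:
        frontier *= fan_out   # only the newly informed people make calls
        informed += frontier
        rounds += 1
    return rounds


print(build_call_tree(["A", "B", "C", "D", "E"]))
print(rounds_needed(40))  # 3 rounds: 1 + 3 + 9 + 27 = 40 people
```

The lists in the resulting dictionary are exactly what goes on each person's laminated card.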
4. Prepare basic disaster recovery plans
Remember that while you are working on an interim solution, you must ruthlessly cut unnecessary detail so as not to be buried under it. It is enough to have thought through the high-level steps in advance, so that you do not have to invent them in a stressful situation.
- Identify the key business processes (functions) of the organization. If you do not have a formal list, this task can seem like a dead end. Do not despair, start with high-level processes. The company's organizational structure will help you here (that, at least, certainly exists). Pick out logistics, marketing, sales, production, service, and so on. Gather knowledgeable colleagues and write the list together as best you can.
- When the list of business processes is ready, rank the processes by their importance for the survival of the organization. They cannot all be equally important: in an emergency, with resources short, you will have to prioritize. Better to think about it beforehand.
- Identify the resources needed to restore each business process on the list. If you are working on the continuity of the company as a whole, you will need to take a lot into account - key specialists, buildings, equipment, paper documents and so on. If IT serves as the "trial balloon", the scope will be centralized IT systems, user workstations and several infrastructure services (LAN and WAN, telephony, Active Directory, etc.).
- You can lock your organization's key experts in a meeting room for a couple of days together with someone who has experience of similar projects and knows continuity management methods. An external consultant can be very useful here. When describing systems and preparing plans, one thing always leads to another, and interrupting the train of thought is very harmful. What can be done in two days of continuous work will take a few months if you meet twice a week for two hours.
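The first three items of this list (processes, priorities, resources) fit into a very small data model, and even a spreadsheet-level version of it is already a basic plan. Here is an illustrative sketch; the process names, priorities and resources are invented for the example and should be replaced with your own:

```python
from dataclasses import dataclass, field


@dataclass
class BusinessProcess:
    name: str
    priority: int  # 1 = most important for the survival of the organization
    resources: list = field(default_factory=list)


def recovery_order(processes):
    """Processes sorted from most to least important."""
    return sorted(processes, key=lambda p: p.priority)


def resources_to_restore(processes):
    """Unique resources in the order in which they are first needed."""
    seen, ordered = set(), []
    for process in recovery_order(processes):
        for resource in process.resources:
            if resource not in seen:
                seen.add(resource)
                ordered.append(resource)
    return ordered


# Hypothetical example data, not taken from a real company.
processes = [
    BusinessProcess("production", 2, ["ERP", "LAN and WAN"]),
    BusinessProcess("marketing", 3, ["e-mail"]),
    BusinessProcess("sales", 1, ["CRM", "telephony", "LAN and WAN"]),
]

print([p.name for p in recovery_order(processes)])
print(resources_to_restore(processes))
```

The second list is, in effect, the order in which the IT recovery team should bring things back.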
5. Choose alternative sites for employees
Your main building may be damaged or inaccessible. Here are some recipes that will help restore the key functions of the company.
- Remote work. If the data center has survived, many employees may be able to work with the applications they need from home using terminal access or virtual workstations (VDI). Of course, for this they must have instructions and a workplace prepared in advance. Duplicating the basic infrastructure "in the cloud" can help a lot.
- Relocation of employees to the surviving offices, if there are several, or to the office of a friendly company. The hosts, too, should be prepared to receive your colleagues.
- Transition to alternative ways of doing business - Excel spreadsheets (abandoned after the ERP rollout) instead of the ERP system, e-mail instead of the document workflow system. If, for example, the head office of a manufacturing company is unavailable, production can work autonomously for several days, provided the necessary procedures are in place.
6. Consider what can be done to prevent the possible consequences of an accident
- Verify that backup copies of everything vital for company operations are being made, including important data located on users' workstations. Whether such data should be stored there at all is a separate question. If your user profiles are stored centrally on the storage system - super!
- Organize backup storage outside the office. It's great if it's a backup in the "cloud", but regularly moving tapes from the library to another office or a special storage facility is better than nothing.
Once I saw a server room overheat so badly that all the walls were wet with condensation (and the air there is usually very dry). Dozens of servers failed. Imagine what that temperature did to the data on the tapes in the library standing in the same room.
- Digitize paper documents (or better, abandon them as much as possible). Keep paper copies in a dedicated part of the office, preferably in fireproof cabinets.
- Check the fire alarm and fire extinguishing systems, including the one in the server room.
- Test the UPS, the air conditioners, and the diesel generator, if you have one.
The overwhelming majority of the long data center outages I have seen or heard about began with maintenance work on one of the power lines feeding the data center. Then something happens on the second line, the UPS kicks in, but the diesel generator does not start, or the load transfer system does not work.
- Conduct an evacuation and first aid exercise. The most important asset of a business is its people, and their safety should be paramount in the business continuity management system.
By the way, this is why many companies do not hold mass management conferences and other team-building events in other cities and countries. Having all of top management on the same plane is bad risk management.
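Going back to the first item in this list: checking that backups are actually being made is easy to partly automate. Below is a minimal sketch that flags systems whose latest backup is too old or missing. The one-day limit and the system names are assumptions, and in practice the timestamps would come from your backup software's reports:

```python
from datetime import datetime, timedelta


def stale_backups(last_backup_times, now, max_age=timedelta(days=1)):
    """Return names of systems whose latest backup is missing or older than max_age."""
    stale = []
    for system, last_backup in last_backup_times.items():
        if last_backup is None or now - last_backup > max_age:
            stale.append(system)
    return sorted(stale)


# Hypothetical report: system name -> time of the last successful backup.
now = datetime(2024, 1, 10, 9, 0)
report = {
    "ERP": datetime(2024, 1, 10, 2, 0),           # backed up last night
    "file server": datetime(2024, 1, 7, 2, 0),    # three days old
    "accounting workstation": None,               # never backed up
}

print(stale_backups(report, now))
```

A check like this only confirms that a backup ran; whether the data can really be restored is something the exercises have to prove.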
7. Write up the disaster recovery plans as well-structured documents
Consultants can help with this too - they have written such documents before and know how everything should look. Distribute paper copies to all disaster recovery team members. One copy must be kept at work, the second - at home.
8. Run exercises on the plans
You can start with "tabletop" exercises, where the emergency committee members and the disaster recovery teams read through the plans by role: who calls whom and who does what. The steps are discussed, clarified, and the results documented. Imagine how much arguing there will be among all these bosses and company experts.
Then you should run a simulation in which everything is done for real - systems are restored from backups, people move to another office and try to work from there, and so on. From experience, it is best to set aside weekends for the exercises (compensating people with time off afterwards, of course). It usually takes 2-3 iterations - we test, adjust the plan, and test again.
Yes, it is difficult and sometimes scary, but results will be achieved only by those sticklers who have enough courage and organizational skill to conduct such a large-scale simulation.
9. Update
Set the date when the plans will be updated and the exercises repeated.
10. Improvement
Using the results of this project, make the case to senior management for a more serious investment of time and money in continuity, and start a full-fledged project.
All the steps described can be completed in a month, given the time and the will. But do not expect the result to be a panacea. As I said earlier, business continuity is an ongoing topic that needs constant attention.
In my next post, I will focus on solutions for protecting centralized IT systems that have proven themselves in our projects on ensuring continuity and building backup data centers.