
Before the Thunder Strikes, or Continuity and GOST R 53647.4-2011 / ISO/PAS 22399:2007

A few years ago, a gas fire suppression system depressurized in the Moscow office of a company. The threat to people's life and health was more than real: one person died, 13 were hospitalized with poisoning of varying severity, and 60 were evacuated. Such threats remain highly relevant, since dozens of administrative and office buildings in Moscow, St. Petersburg and other cities are equipped with exactly the same fire suppression systems.

Another possible situation: a bank suspends work because of a bomb threat. The alarm may come from an attacker or a prankster, or from a bank employee who finds a suspicious object or package on the premises. Whether it is really dangerous will be established later, but following the instructions in such a situation is strictly mandatory.

Another extreme scenario: a smoke bomb is thrown into the organization's office. There may be no casualties, but panic is guaranteed. Smoke seeping into the corridor and noise in the next room are unlikely to leave anyone indifferent, and they certainly do not contribute to a working atmosphere. Someone may fall ill simply from the stress, and nobody knows where to find medicine.

To avoid casualties and other serious consequences, certain procedures must be followed: for example, notify everyone on the premises, evacuate employees and visitors, inform emergency services and security agencies (and in some cases the media), contact employees' relatives, and report the incident to management. All employees, not just managers or specially designated people, should know how to act in a crisis. The business continuity regulations currently in force in Russia ought to contain this kind of information.

The procedures employees follow when an incident occurs are part of a much wider area: ensuring business continuity. Below we analyze the ISO/PAS 22399:2007 standard in force in our country (Guideline for incident preparedness and operational continuity management): can its guidelines really help companies prepare for possible emergencies and improve their response processes?

To our chagrin, the standard says little about incident preparedness; it deals mostly with business continuity. Despite the ambitious title, many questions remain unanswered. We will try to answer them ourselves, drawing on our experience.

How to determine the scale of the incident?


Here a pre-compiled list of questions is useful:


Clearly, once disaster has struck and nothing is certain, no one will sit down and answer these questions. It is therefore worth drawing up, in advance, a table of damage by type and loss range (see Table 1).

Tab. 1. Example of damage table

| Loss range | Financial losses | Control loss | Damage due to violation of laws / regulations | Reputation damage | Personnel losses |
|---|---|---|---|---|---|
| Catastrophic losses | over ... | Disruption of production processes, product recall, letters with explanations, etc. | Unscheduled inspections by supervisory and / or inspection bodies, revocation of a license, violation of legal requirements, etc. | Negative comments, reviews, articles, customer churn, increase in the number of complaints, partners' doubts, etc. | Victims of the incident, people otherwise affected by it, overtime, dismissals due to the incident, etc. |
| Big losses | from … to … | | | | |
| Sensitive loss | from … to … | | | | |
| Low loss | from … to … | | | | |


A table with ranges of measurable parameters will help you make an informed decision about the scale of the event.
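
As a rough illustration, a damage table like this can be turned into a simple lookup. The sketch below is only an assumption about how a company might do it: the thresholds are hypothetical placeholders (the table above deliberately leaves the ranges open), and the rule "the overall scale is the worst range across all damage types" is our simplification, not something the standard prescribes.

```python
# Hypothetical lower bounds in arbitrary currency units. The article leaves
# the real ranges ("over ...", "from ... to ...") for each company to define.
LOSS_RANGES = [
    ("Catastrophic losses", 10_000_000),
    ("Big losses", 1_000_000),
    ("Sensitive loss", 100_000),
    ("Low loss", 0),
]

# Ranges ordered from least to most severe, used to compare damage types.
SEVERITY_ORDER = ["Low loss", "Sensitive loss", "Big losses", "Catastrophic losses"]


def loss_range(financial_loss: float) -> str:
    """Return the first (most severe) range whose lower bound the loss reaches."""
    if financial_loss < 0:
        raise ValueError("loss must be non-negative")
    for name, lower_bound in LOSS_RANGES:
        if financial_loss >= lower_bound:
            return name
    raise AssertionError("unreachable: the last lower bound is 0")


def incident_scale(ranges_by_damage_type: dict) -> str:
    """Assumed rule: the incident scale is the worst range over all damage types."""
    return max(ranges_by_damage_type.values(), key=SEVERITY_ORDER.index)


if __name__ == "__main__":
    assessment = {
        "Financial losses": loss_range(250_000),
        "Reputation damage": "Big losses",  # judged against the table by a person
    }
    print(incident_scale(assessment))
```

Non-financial columns cannot be computed from a number, so in this sketch they are entered by whoever assesses the incident against the table.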

Who initiates the action?


An emergency can happen anywhere, so any employee may be the one to raise the alarm. Managing incidents requires two directions of information flow: bottom-up, an escalation tree from the initiator to the decision maker, and top-down, a notification tree that informs employees of the decision made by management. There are several types of escalation:


How do incident boundaries change over time?


The faster an incident is detected and contained, the fewer people and processes it affects. Over time, the boundaries of an incident expand. For example, if recovery is fast, a server crash may not even be noticed. But lengthy downtime can disrupt internal processes (for example, a report or payment order will not be prepared). In some cases it can affect the company as a whole (failure to submit a report to the regulator or to pay for goods / services may cause significant financial or reputational damage).

Possible boundaries must be clearly defined in advance. During the incident itself only the scale is determined, i.e. you choose the boundary option that most accurately describes what happened. As already mentioned, the damage table can make this choice easier.

How to limit the level of escalation (so that the general director is not called every time IT has a failure)?

If the company has formalized instructions that describe the escalation procedure, follow them closely. You can argue about the logic of what is written when things are calm, not when reaction speed is critical.

If there are no such instructions, but there is a support service or a security service, inform them of what happened. They know their areas of responsibility and the sequence of actions required in them.

Finally, if none of this exists and you need advice, report the incident to your immediate supervisor or their deputy. If they cannot be reached, go higher up the hierarchy.
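
The order of fallbacks described here can be sketched as a small decision function. This is a minimal illustration under our own assumptions: the function name, its parameters and the returned labels are hypothetical, and a real escalation procedure would live in the company's own instructions.

```python
from typing import Optional, Sequence, Set


def whom_to_notify(
    has_formal_instructions: bool,
    has_support_or_security: bool,
    management_chain: Sequence[str],
    reachable: Set[str],
) -> Optional[str]:
    """Pick the first escalation step, in the order given in the text.

    management_chain is ordered from the immediate supervisor upward
    (supervisor, deputy, then higher levels of the hierarchy).
    """
    if has_formal_instructions:
        # Formal instructions take priority and are followed as written.
        return "follow the formal escalation instructions"
    if has_support_or_security:
        return "support service / security service"
    # No instructions and no dedicated service: walk up the hierarchy
    # until someone can actually be reached.
    for person in management_chain:
        if person in reachable:
            return person
    return None  # nobody reachable; the caller has to handle this case


if __name__ == "__main__":
    chain = ["supervisor", "deputy", "department head", "director"]
    print(whom_to_notify(False, False, chain, {"department head", "director"}))
```

Because the chain is walked from the bottom, senior management is contacted only when everyone below them is unreachable, which is the point of limiting escalation.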

Who participates in the crisis committee?


The crisis committee must have the authority and competence to make decisions on incidents promptly. It must include representatives from all areas of the company:


Who coordinates all actions in the event of an incident?


The person in charge during an incident must have the authority to make decisions that are binding on all other employees of the company. This need not be the person who manages in normal operation: managing in a crisis requires stress resistance and the ability to make decisions quickly.

It is important to develop in advance standard schemes for how employees interact in various incidents, along with descriptions of authority and the chain of command.

What are the incident assessment options (incident rating scale)?

You can use several scales to evaluate the incident - qualitative and quantitative.

Tab. 2. Quantitative assessment: the frequency and scale of the impact of incidents.

 

| Loss range | Almost never | Seldom | Often | Regularly |
|---|---|---|---|---|
| Catastrophic losses | High risk | Critical risk level | Unacceptable level of risk | Unacceptable level of risk |
| Big losses | Low risk | High risk | Critical risk level | Unacceptable level of risk |
| Sensitive loss | Negligible risk | Low risk | High risk | Critical risk level |
| Low loss | Negligible risk | Negligible risk | Low risk | High risk |
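
For illustration only, the frequency and loss matrix of Tab. 2 can be encoded as a lookup table. The sketch below copies the cell values from the table; the function and variable names are our own and are not part of the standard.

```python
# Column order of Tab. 2: how often incidents of this kind occur.
FREQUENCIES = ["Almost never", "Seldom", "Often", "Regularly"]

# Rows of Tab. 2: one risk level per frequency column, in the order above.
RISK_MATRIX = {
    "Catastrophic losses": [
        "High risk", "Critical risk level",
        "Unacceptable level of risk", "Unacceptable level of risk",
    ],
    "Big losses": [
        "Low risk", "High risk",
        "Critical risk level", "Unacceptable level of risk",
    ],
    "Sensitive loss": [
        "Negligible risk", "Low risk", "High risk", "Critical risk level",
    ],
    "Low loss": [
        "Negligible risk", "Negligible risk", "Low risk", "High risk",
    ],
}


def risk_level(loss_range: str, frequency: str) -> str:
    """Look up the risk level for a loss range and an incident frequency."""
    if loss_range not in RISK_MATRIX:
        raise ValueError(f"unknown loss range: {loss_range!r}")
    if frequency not in FREQUENCIES:
        raise ValueError(f"unknown frequency: {frequency!r}")
    return RISK_MATRIX[loss_range][FREQUENCIES.index(frequency)]


if __name__ == "__main__":
    print(risk_level("Big losses", "Often"))
```

Keeping the matrix as data rather than as nested conditions makes it easy to review against the table and to adjust when the company revises its risk appetite.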


Tab. 3. Qualitative assessment of the incident (an extensive description of these terms is given in the annex to the article)

| Term | Description |
|---|---|
| Failure | A situation in which resources, such as IT infrastructure, do not work as expected. The impact of such a situation is considered minimal. |
| Critical situation (serious incident) | Occurs when, as part of incident management, a serious first-priority incident cannot be resolved in the allotted time. |
| Accident | A destructive event in which the company's processes do not run as expected. The availability of these processes and related equipment cannot be restored within the given period of time. |
| Crisis | A situation that differs from the normal state. Despite the preventive measures taken, such a condition can occur at any time and cannot be overcome by ordinary procedural or organizational measures. |
| Catastrophe | An event that a company cannot limit in time and space and that has a large-scale impact on people, wealth and the environment. The very existence of the company and the life and health of employees are at risk. |


What technical tools support incident management?


As part of incident management, there are several separate tasks:

There are products on the IT market that solve most of these tasks.

How to develop the necessary response measures?


It is impossible to foresee every incident, but measures can be worked out for the main areas and then combined and adapted to a specific situation. The company's main areas of activity are:


How to stay up to date during normal operation?


Nothing better than regular drills and testing has been invented yet.

How to make changes? How often? What details deserve attention, and what should the plan take into account?


Making changes requires a dedicated, formalized change management process in the company. Possible kinds of change: changes in organizational structure, new positions, changes in technical solutions, changes in risks, new products / services.

How to conduct testing?


Several arguments can help persuade the company's top management to take part in testing personally.


Taking part in a “tabletop” test is sometimes enough to convince top management that their own company is not ready to respond to an incident correctly.

Here are several ways to increase the involvement of ordinary employees in the testing process:


What information should the incident report contain?


The incident report should include the following information:


Annex:

A failure is a situation in which resources, such as IT infrastructure, do not work as expected. The impact of such a situation is considered minimal. That is, the amount of damage does not prevent the company from carrying out its tasks (or the damage is negligible compared to its annual turnover). However, if the failure is not corrected in time, it can grow into an accident. Note that failures are handled by incident management (dispatch service, 2nd and 3rd support lines), not by the IT continuity process.

A critical situation (serious incident) occurs when, as part of incident management, it is not possible to resolve a serious first-priority incident in the allotted time.

An accident is a destructive event in which the company's processes do not run as expected and their availability cannot be restored within the allotted period of time. Business operations are seriously affected, and meeting the SLA becomes impossible. The damage ranges from large to very large, i.e. the accident has an unacceptably large negative impact on the company's annual revenue.

Accidents cannot be handled like critical situations, i.e. within the standard incident management procedures. Eliminating them requires a special response within the business continuity management process.

A crisis is a situation different from the normal state. Despite the preventive measures taken, such a condition can occur at any time and cannot be overcome by ordinary procedural or organizational measures. There is a need for crisis management. There are no clear, formalized procedures for managing in crisis conditions, only general recommendations. A typical feature of the crisis is its uniqueness.

Accidents affecting the course of business processes can grow to the extent of a crisis. That is, a crisis is an expanded accident that threatens the existence of a company or the life and health of employees. The crisis affects the company, but does not have a large impact on the environment or public safety. The crisis can largely be resolved by the company itself.

There are a number of crises that do not have a direct impact on business processes. These include economic crises, liquidity crises, management crises, fraud cases, large product recalls, kidnappings, or terrorist threats. Such crises, as a rule, cannot be eliminated by the company itself; they require the involvement of external organizations (internal affairs bodies, regulators, financial institutions) and can be considered examples of catastrophes.

A catastrophe is an event that a company cannot limit in time and space and which has a large-scale impact on people, wealth and the environment. The very existence of the company, the life and health of employees are at risk. The consequences of an event of this magnitude cannot be eliminated by the efforts of the organization itself; this requires the participation of emergency services.

The article was prepared by Konstantin Musatov, a business continuity consultant at Jet Infosystems. We welcome your constructive comments.

Source: https://habr.com/ru/post/309900/

