
Functional safety, Part 2 of 7. IEC 61508: Who to be, Sherlock Holmes or Data Tutashkhia?



Habr has a whole hub dedicated to security, and probably nobody stops to think about what the word covers, because it seems obvious: information security. However, there is another side to the concept: safety, which concerns risks to human life and health and to the environment. Since information technology does not pose a danger by itself, people usually speak of the functional component, that is, safety that depends on the correct functioning of a computer system. While information security became critical with the advent of the Internet, functional safety was a concern even before digital control appeared, because accidents have always happened.

This article continues the series of publications on functional safety.
To take one more step, we need to continue looking at the standard IEC 61508, "Functional safety of electrical/electronic/programmable electronic safety-related systems". The point is that functional safety is a rather formalized property, since safety-critical systems are subject to state licensing in all countries.

Studying standards and the terminology embedded in them is not the most fun activity in the world, but with a pragmatic approach it can raise a specialist's technical level. Interpreting terminology is a kind of “technical jurisprudence”, and any detective novelist could envy the intricate plot lines found in statements of requirements.
I will not claim that studying standards will immediately turn everyone into a technical Sherlock Holmes, although knowledge of the basics of standardization (that is, of technical legislation) is the foundation of any technical expert-detective's work. Studying standards is rather the path of Data Tutashkhia from the now-forgotten novel by Chabua Amirejibi (there is also a film adaptation, “Shores”): a path that leads not so much forward as inward, not so much into action as into understanding.

General information about IEC 61508


The standard uses the term electrical/electronic/programmable electronic (E/E/PE) system.

A distinctive feature of the standard is its risk-based approach. Depending on the risk that an industrial facility poses to the environment and to human life and health, limits are set on the failure risk of its control systems.

Take, for example, the protection systems of nuclear reactors. In continuous mode of operation, they should fail no more often than once per 1000 years of operation (roughly 10 million hours between failures). Such targets are set not for a single facility but for a “fleet”, i.e. for a set of similar facilities. Failures seem to be quite rare events, because no nuclear power plant will operate for a thousand years. However, if we take into account that more than 400 nuclear reactors are in operation worldwide, then for the “fleet” we get one failure every 2.5 years, which sounds much more depressing. During the Chernobyl and Fukushima nuclear disasters, the emergency protection systems did not work as the designers expected. This is another argument for taking functional safety seriously.
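The fleet arithmetic above can be reproduced in a few lines. This is only a sketch: it assumes that all units fail independently at the same constant rate, so that the fleet's failure rate is simply the sum of the unit rates. The function name and the 8760-hour year are illustrative choices, not something taken from the standard.

```python
HOURS_PER_YEAR = 8760  # 365 days; leap years are ignored in this sketch


def fleet_years_between_failures(unit_years_between_failures, fleet_size):
    """Expected years between failures anywhere in a fleet of identical units.

    Assumes independent units with the same constant failure rate,
    so the fleet rate is fleet_size times the unit rate.
    """
    return unit_years_between_failures / fleet_size


unit_years = 1000  # target from the text: one failure per 1000 years
print(unit_years * HOURS_PER_YEAR)                    # hours between failures for one unit
print(fleet_years_between_failures(unit_years, 400))  # years between failures for 400 units
```

With these inputs a single unit gets 8.76 million hours between failures, which the text rounds to 10 million, and the fleet of 400 gets 2.5 years.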

To bring the risk below the specified targets, a set of organizational and technical measures is implemented. These measures are also regulated by IEC 61508, depending on the permissible risk of failure.

In addition, IEC 61508 is the top level of a whole family of industry standards that detail functional safety requirements for the control systems of medical equipment, road and rail transport, industrial process control systems, etc.

The first edition of IEC 61508 was developed from 1998 to 2000. In the Russian Federation, the first edition was adopted as the state standard GOST R IEC 61508 in 2007. The second edition of IEC 61508, released in 2010, is now current worldwide. In the Russian Federation, the second edition has been in force since 2012 (GOST R IEC 61508-2012).

The IEC 61508 series of standards consists of 7 parts, which together contain about 600 pages of text. The purpose of each part is easy to guess from its title.



Of course, the parts of IEC 61508 are connected in fairly complex ways, as shown in the standard's own figure: the overall framework of the IEC 61508 series (IEC 61508, Figure 1).



IEC 61508 Terminology: Basic Safety Terms


To understand the concept of functional safety, it is probably best to start neither from the beginning nor from the end, but from the middle, namely from the fourth part (IEC 61508-4), which presents the basic terminology. The terms are divided into 8 groups (3.1-3.8), and these groups are logically related. In my opinion, the most important groups are 3.1 (Safety terms) and 3.5 (Safety functions and safety integrity).



Below, the terms are quoted selectively from the Russian-language text of IEC 61508-4, followed by the author's interpretation.

So, the first terminology section is 3.1, "Safety terms".

Safety terms:
3.1.1 harm: Physical injury or damage to human health, property or the environment.
3.1.2 hazard: Potential source of harm.
3.1.6 risk: Combination of the probability of occurrence of harm, P(t), and the severity of that harm, C.
3.1.7 tolerable risk: Risk that is accepted in a given context based on the current values of society.
3.1.11 safety: Freedom from unacceptable risk.
3.1.12 functional safety: Part of the overall safety relating to the equipment under control (EUC) and the EUC control system that depends on the correct functioning of the safety-related systems and other risk reduction measures.
3.1.13 safe state: State of the EUC in which safety is achieved.

I tried to tie all these entities together, and the result was the following diagram (perhaps it can be called an ontology). It is important to note that a computer control system (CCS) is only one of many risk reduction measures. There are also many so-called passive protection measures, for example a seat belt in a car or an airplane.



Risk is a safety indicator, and the concept of risk will need further attention when we look at those indicators. For now, here is a simple example. For some reason, a man is walking along the edge of a roof. Other things being equal, the probability of falling does not depend on the height of the building, but the severity of the harm does. So the risk of falling from the roof of a 10-story building is higher than the risk of falling from the roof of a one-story building. But the risk of falling from the roof of a 10-story building is almost equal to the risk of falling from the roof of a 100-story building: the probability of falling is the same, and the consequences (harm) are, unfortunately, also the same.
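The roof example can be put into numbers. The sketch below makes two assumptions that are not in the standard: the "combination" of probability and severity is taken to be a simple product, and severity is modelled on a hypothetical 0-to-1 scale that saturates once the fall is certainly fatal. All values are made up for illustration.

```python
def risk(probability, severity):
    """Risk as probability times severity (a common simplification)."""
    return probability * severity


def fall_severity(stories, fatal_from=10):
    """Hypothetical severity on a 0..1 scale.

    Grows with height and saturates at 1.0 once the fall is
    assumed to be certainly fatal (fatal_from stories or more).
    """
    return min(stories, fatal_from) / fatal_from


p_fall = 0.01  # made-up probability of falling, the same for every building

for stories in (1, 10, 100):
    print(stories, risk(p_fall, fall_severity(stories)))
```

In this toy model the risk grows from 1 to 10 stories and then stops growing, because the severity cannot get any worse. The model makes the 10-story and 100-story risks exactly equal, where the text more cautiously says "almost".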

The concept of tolerable (acceptable) risk is, in my opinion, an interesting one. It depends on the historical and humanitarian context. Is it true that the highest values of modern society are human life and care for the environment that creates and sustains that life? The actual condition of industrial facilities shows how the state and society put their declared values into practice.

Another important concept is the "safe state". For example, one of the most important safety systems, the emergency shutdown (ESD) system, must stop the operation of the controlled facility. How does this happen? As a rule, by breaking electrical circuits (the details depend on the process control algorithms for the equipment), which is done by driving the discrete output signals to the “logical 0” state (the so-called de-energize-to-trip principle, which means the system also trips on an emergency loss of power). If necessary, "logical 0" can be inverted into "logical 1" through interposing relays.
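The de-energize-to-trip principle can be illustrated with a toy model of one discrete output. This is a sketch, not a real ESD implementation: the function names and the convention "1 = run permitted, 0 = trip" are assumptions made for the example.

```python
def esd_output(logic_powered, trip_demanded):
    """Discrete output of a de-energize-to-trip channel.

    1 = run permitted (circuit energized), 0 = trip (safe state).
    Without power the output cannot be held at 1, so a power loss
    produces the same result as a trip demand.
    """
    if not logic_powered:
        return 0
    return 0 if trip_demanded else 1


def inverted_by_relay(output):
    """Interposing relay that turns "logical 0" into "logical 1"."""
    return 1 - output


print(esd_output(True, False))   # normal operation
print(esd_output(True, True))    # trip on demand
print(esd_output(False, False))  # power lost, no demand: still trips
```

The point of the design is visible in the last call: the safe state is the state the output falls into by itself when nothing is driving it.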

IEC 61508 terminology: terms related to safety functions and safety integrity


In the first part of this series, I briefly mentioned that the concept of functional safety covers both the implemented safety functions and the integrity of those functions.

The relevant terms are given in IEC 61508-4, Section 3.5, "Safety functions and safety integrity".

Terms related to safety functions and safety integrity:
3.5.1 safety function: Function implemented by an E/E/PE safety-related system or other risk reduction measures that is intended to achieve or maintain a safe state of the EUC with respect to a specific hazardous event.
3.5.4 safety integrity: Probability that a safety-related system satisfactorily performs the required safety functions under all stated conditions within a stated period of time.
3.5.5 software safety integrity: Part of the safety integrity of a safety-related system relating to systematic failures in a dangerous mode of failure that are attributable to software.
3.5.6 systematic safety integrity: Part of the safety integrity of a safety-related system relating to systematic failures in a dangerous mode of failure.
3.5.7 hardware safety integrity: Part of the safety integrity of a safety-related system relating to random hardware failures in a dangerous mode of failure.
3.5.8 safety integrity level (SIL): Discrete level (one of four possible) corresponding to a range of safety integrity values, where safety integrity level 4 is the highest and safety integrity level 1 is the lowest.
3.5.9 systematic capability: Measure of confidence (expressed on a scale of SC 1 to SC 4) that the systematic safety integrity of an element meets the requirements of the specified SIL for the specified safety function of that element, when the element is applied in accordance with the instructions in its safety manual.
3.5.16 mode of operation: Way in which a safety function operates, which may be either:
- low demand mode, in which the safety function is performed only on demand, in order to transfer the EUC into a specified safe state, and the frequency of demands is no greater than one per year; or
- high demand mode, in which the safety function is performed only on demand, in order to transfer the EUC into a specified safe state, and the frequency of demands is greater than one per year; or
- continuous mode, in which the safety function retains the EUC in a safe state as part of normal operation.
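The three modes of operation in 3.5.16 can be written down as a small classifier. This is a sketch under one assumption of my own: a function that is not demand-driven at all is marked by passing `None` instead of a demand rate. The names `Mode` and `mode_of_operation` are hypothetical.

```python
from enum import Enum


class Mode(Enum):
    LOW_DEMAND = "low demand"
    HIGH_DEMAND = "high demand"
    CONTINUOUS = "continuous"


def mode_of_operation(demands_per_year=None):
    """Classify a safety function by how it is exercised.

    demands_per_year=None means the function is not demand-driven:
    it keeps the EUC in a safe state as part of normal operation.
    """
    if demands_per_year is None:
        return Mode.CONTINUOUS
    if demands_per_year <= 1:
        return Mode.LOW_DEMAND
    return Mode.HIGH_DEMAND


print(mode_of_operation(0.1))   # e.g. one demand in ten years
print(mode_of_operation(12))    # e.g. one demand per month
print(mode_of_operation())      # not demand-driven
```

Note that the boundary case of exactly one demand per year falls into low demand mode, following the wording "no greater than one per year".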

Judging by the definition of safety integrity, this property essentially boils down to failure-free performance of safety functions, i.e. it is treated as a part of reliability, which in turn is a part of classical dependability. In fact, other provisions of IEC 61508 show that safety integrity is a more complex property, associated with attributes such as maintainability, availability, durability and information security. The terminology and taxonomy of dependability and safety attributes form an adjacent field of knowledge, and it makes sense to look at them in more detail in subsequent publications.

Another central concept in IEC 61508 is the safety integrity level (SIL). The SIL value is set depending on how much risk the controlled equipment creates for people and the environment.

On this basis, a failure risk target is set for the computer control system itself. For example, at the beginning of the article I said that for the protection system of a nuclear reactor the mean time between failures must be no less than about 10 million hours. This corresponds to SIL3. In general, it is considered that only the simplest devices can meet SIL4. For programmable logic controllers (PLCs) used in process control systems, SIL3 is achievable.
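The link between a failure rate and a SIL can be sketched as a table lookup. The bands below are the target ranges for high demand and continuous mode as they are commonly quoted from IEC 61508-1 (average frequency of dangerous failure per hour); treat them as an assumption to be checked against the standard itself. Converting a mean time between failures into a rate as 1/MTBF additionally assumes a constant failure rate, and the function names are hypothetical.

```python
# (SIL, lower bound inclusive, upper bound exclusive), dangerous failures per hour.
# Bands as commonly quoted for high demand / continuous mode; verify against the standard.
PFH_BANDS = [
    (4, 1e-9, 1e-8),
    (3, 1e-8, 1e-7),
    (2, 1e-7, 1e-6),
    (1, 1e-6, 1e-5),
]


def sil_from_failure_rate(rate_per_hour):
    """Return the SIL whose band contains the rate, or None if it is outside all bands."""
    for sil, low, high in PFH_BANDS:
        if low <= rate_per_hour < high:
            return sil
    return None


def rate_from_mtbf(mtbf_hours):
    """Failure rate per hour, assuming a constant failure rate."""
    return 1.0 / mtbf_hours


print(sil_from_failure_rate(rate_from_mtbf(2e7)))  # 20 million hours between failures
print(sil_from_failure_rate(5e-7))                 # a rate inside the next band down
```

A mean time of exactly 10 million hours sits on the boundary between the SIL2 and SIL3 bands, which is why the requirement is phrased as "no less than": anything longer falls inside the SIL3 band.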

The structure of the definitions also shows that safety integrity is divided into two components: systematic safety integrity (which also includes software safety integrity) and hardware safety integrity.

The first component requires measures to protect against systematic failures caused by design errors. This means improving the processes of design and development, testing, configuration management, project management, etc. It resembles the levels of Capability Maturity Model Integration (CMMI), but does not map directly onto them. For each SIL value, a set of methods for protection against systematic failures is defined, and their number and “rigor” increase as the SIL increases.

Hardware safety integrity concerns protection against random failures and is achieved by using components with a high level of reliability and self-diagnostics and, of course, redundancy.

There is an interesting trick that many PLC developers use. SIL2 can be achieved with a single-channel PLC configuration, and a redundant configuration then gives SIL3. For this to work, the development processes (systematic capability) must comply with SIL3.
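The reasoning behind this trick can be sketched as "the weakest component governs". The function below is a deliberate simplification and not the standard's procedure: it assumes that the SIL a configuration can claim is the lower of what the hardware architecture supports and what the development process (systematic capability) supports. The function name and the input values are hypothetical.

```python
def claimable_sil(hardware_sil, systematic_capability):
    """SIL that can be claimed when the lower of the two components governs.

    hardware_sil: level supported by the hardware architecture (1..4)
    systematic_capability: SC level of the development process (1..4)
    """
    return min(hardware_sil, systematic_capability)


SC = 3  # development processes built to the SIL3 level

print(claimable_sil(hardware_sil=2, systematic_capability=SC))  # single channel
print(claimable_sil(hardware_sil=3, systematic_capability=SC))  # redundant channels
```

This is why the processes have to meet SIL3 from the start: with a systematic capability of 2, adding a second channel would not raise the claim above SIL2.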

Now, by analogy with the previous section, let us try to apply the same structure (hazard, harm, risk, countermeasures, equipment under control) to the computer control system (CCS) itself. Here we are talking about hazards to the performance of the CCS safety functions, since failure to perform them creates a risk. To reduce this risk, various measures are taken to ensure safety integrity. As we already know, these measures are aimed at protecting against random and systematic failures.



Let us now try to combine the resulting diagram with the diagram from the previous section. The result is a two-level structure that shows the terminological landscape of functional safety in simple terms.



IEC 61508 terminology: some more useful terms


So, six of the eight terminology sections of IEC 61508-4 still remain.

The next terminology section is 3.2, "Equipment and devices". It contains fairly trivial definitions of the types of software and hardware used in safety-critical systems. I will give only the definition of the equipment under control mentioned above.

Terms related to equipment and devices:
3.2.1 equipment under control (EUC): Equipment, machinery, apparatus or plant used for manufacturing, process, transportation, medical or other activities.

Section 3.3, "Systems: general aspects", also contains definitions that technical specialists will find self-explanatory.

Section 3.4, "Systems: safety-related aspects", contains an important definition that answers the question: “what exactly is meant by a safety-related system?”

Terms related to safety-related systems:
3.4.1 safety-related system: A system that both:
- implements the required safety functions necessary to achieve or maintain a safe state of the EUC; and
- is intended to achieve, on its own or together with other E/E/PE safety-related systems and other risk reduction measures, the necessary safety integrity for the required safety functions.

Section 3.6, “Fault, failure and error”, defines these unfortunate incidents. In addition, this section defines the random and systematic failures we already know, as well as dangerous and safe failures. These are followed by definitions of safety indicators, which deserve a separate publication.

Terms related to faults, failures and errors:
3.6.1 fault: Abnormal condition that may cause a reduction in, or loss of, the capability of a functional unit to perform a required function.
3.6.4 failure: Termination of the ability of a functional unit to provide a required function, or operation of the unit in any way other than as required.
3.6.5 random hardware failure: Failure, occurring at a random time, which results from one or more of the possible degradation mechanisms in the hardware.
3.6.6 systematic failure: Failure related in a deterministic way to a certain cause, which can only be eliminated by a modification of the design or of the manufacturing process, operational procedures, documentation or other relevant factors.
3.6.7 dangerous failure: Failure of an element and/or subsystem and/or system that plays a part in implementing the safety function that:
a) prevents the safety function from operating when required (low or high demand modes) or causes it to fail (continuous mode), so that the EUC is put into a hazardous or potentially hazardous state; or
b) decreases the probability that the safety function operates correctly when required.
3.6.8 safe failure: Failure of an element and/or subsystem and/or system that plays a part in implementing the safety function that:
a) results in the operation of the safety function to put the EUC (or part of it) into a safe state or to maintain a safe state; or
b) increases the probability of the operation of the safety function to put the EUC (or part of it) into a safe state or to maintain a safe state.
3.6.10 common cause failure: Failure that is the result of one or more events causing concurrent failures of two or more separate channels in a multiple-channel system, leading to system failure.
3.6.11 error: Discrepancy between a computed, observed or measured value or condition and the true, specified or theoretically correct value or condition.
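The difference between a dangerous and a safe failure is easiest to see on the de-energize-to-trip output discussed earlier. The sketch below classifies a stuck output under the assumed convention that 0 means "trip" (the safe state) and 1 means "run permitted". The function name is hypothetical, and real failure analysis covers far more failure modes than these two.

```python
def classify_stuck_output(stuck_value):
    """Classify a stuck discrete output of a de-energize-to-trip channel.

    Assumed convention: 0 = trip (safe state), 1 = run permitted.
    Stuck at 0 forces a (spurious) trip, so the EUC ends up in the safe state.
    Stuck at 1 means the channel can no longer trip when required.
    """
    if stuck_value == 0:
        return "safe failure"
    if stuck_value == 1:
        return "dangerous failure"
    raise ValueError("a discrete output can only be stuck at 0 or 1")


print(classify_stuck_output(0))
print(classify_stuck_output(1))
```

A safe failure is still a failure: a spurious trip stops the plant and costs money, but it does not leave the EUC unprotected.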

The definitions of Section 3.7, “Life cycle processes”, like the life cycle itself, are a subject for a separate publication.

Terms related to life cycle processes:
3.7.1 safety lifecycle: Necessary activities involved in the implementation of safety-related systems, occurring during a period of time that starts at the concept phase of a project and finishes when all of the E/E/PE safety-related systems and other risk reduction measures are no longer in use.

In Section 3.8, “Confirmation of safety measures”, the definitions of verification and validation are of interest. I will say right away that in the life cycle, verification and validation (V&V) is usually treated as a single process. In practice, validation usually means testing the fully integrated system with physical simulation of input and output signals, while verification covers all other reviews, analyses and tests. But this does not follow at all from the IEC 61508 definitions.

Terms related to confirmation of safety measures:
3.8.1 verification: Confirmation, by examination and provision of objective evidence, that the requirements have been fulfilled.
3.8.2 validation: Confirmation, by examination and provision of objective evidence, that the particular requirements for a specific intended use are fulfilled.

Conclusions


IEC 61508 is a fairly long, complex, sometimes confusing and contradictory standard consisting of 7 parts. The only way to "unravel" it is to apply it in practice.

IEC 61508 is built on a risk-based approach. Risk levels for computer control systems are assigned depending on the impact of the controlled industrial facility on the environment and on the health and lives of people.

To do this, IEC 61508 introduces the concept of the safety integrity level (SIL), with levels numbered in increasing order from 1 to 4. To comply with a given SIL, it is necessary to implement measures to protect against random hardware failures and against systematic failures caused by design errors. Therefore, for each SIL, the requirements for the product are specified as values of safety indicators and as requirements for the "rigor" with which life cycle processes are implemented.

The terminology for functional safety is set out in the fourth part of IEC 61508 (IEC 61508-4).

Having dealt with the terminology, in the next part of the series we will be able to look at the structure and interconnections of the remaining parts of IEC 61508.

Laws and standards can be interpreted in different ways, but in any case the interpreter is obliged to grasp their content in order to tell good from evil in time.

P.S. To explain the main aspects of functional safety, the following series of articles has been prepared:
- Introduction to the subject of functional safety;
- The IEC 61508 standard: terminology;
- The IEC 61508 standard: structure of requirements;
- The relationship between information security and functional safety of process control systems;
- Functional safety management and assessment processes;
- The information security and functional safety life cycle;
- Reliability theory and functional safety: basic terms and indicators;
- Methods for ensuring functional safety.

Here you can watch video lectures on the topic of publication.

Source: https://habr.com/ru/post/309636/

