Software started killing people, with the help of machines, back in 1985.
A typical single therapeutic dose of radiation is up to 200 rad; 1,000 rad is a lethal dose. The rebellious machine hit defenseless earthlings with 20,000 rad.
Let us look at a case where incremental but inconsistent software changes led to a system failure: the worst software bug ever.
In the Therac-25, the hardware protection was removed and the safety functions were handed over to software.
How was the investigation carried out? What should designers of IT systems, programmers, and testers do to avoid such things?
Killer
The Therac-25 is a radiation therapy machine, a medical accelerator built by the Canadian state-owned organization Atomic Energy of Canada Limited.
An advertisement for the machine, aimed at housewives.
Murder
From June 1985 to January 1987, this machine caused six radiation overdoses. Some patients received doses of tens of thousands of rad, and at least two died directly from the overdose.
The operator recalled that on that day she had changed “x” to “e”. It turned out that if this edit was made quickly enough, an overdose happened with almost 100 percent probability.
Investigation
While litigating against AECL, the Texas State Attorney’s Office turned to Nancy Leveson (a computer science professor at the University of California, Irvine) as an expert to investigate. She has made a significant contribution to software safety. Nancy Leveson and Clark Turner spent three years collecting materials and reconstructing the Therac-25 events. This result is important because in most safety incidents the available information is incomplete, contradictory, or incorrect.
The Canadian state organization Atomic Energy of Canada Limited (hereinafter AECL) produced three models: Therac-6, Therac-20 and Therac-25. The Therac-6 and Therac-20 were built together with the French company CGR. The partnership ended before the Therac-25 was designed, but both companies still had access to the designs and source code of the earlier models.
The program code in the Therac-20 was based on the Therac-6 code. A PDP-11 computer was installed on all three machines. The earlier models did not depend on it, since they were designed as stand-alone devices. The radiation therapy technician set the various parameters manually, including the position of the turntable that selects the operating mode of the machine.
In electron mode, the deflection magnets spread the beam so that the electrons cover a large area. In X-ray mode, a target was placed in the path of the beam, and the electrons struck it to produce X-ray photons aimed at the patient. Finally, a reflector could be placed in the beam path, which the technician used to aim precisely at the area to be treated. While the reflector was in the path, the electron beam would not start.
On the Therac-6 and Therac-20, hardware interlocks did not allow the operator to do anything dangerous, such as selecting a high-power electron beam without the X-ray target in place.
An attempt to activate the accelerator in the wrong mode blew the fuses and stopped the operation. The PDP-11 and its related equipment were added for convenience. A technician could enter a prescription at the VT-100 terminal, and the computer, using servos, automatically positioned the turntable and the other devices.
The hospital staff liked that the computer set things up faster than people could. The less time setup took, the more patients could be treated in a day.
When the time came to build the Therac-25, AECL decided to keep only computer control. They abandoned the manual controls and the hardware interlocks. The computer had to monitor the machine settings and, in the event of a problem, turn off power to the entire machine.
Oh well.
At least four errors were found in Therac-25 software that could lead to overexposure.
The same variable was used both to process the entered values and to determine the position of the turntable. As a result, when data was entered quickly through the terminal, the Therac-25 could end up working with the wrong turntable position (a race condition).
Positioning the deflection magnets takes about 8 seconds. If the radiation type and power were changed during this time and the cursor was returned to its final position, the system did not detect the changes.
A division by the radiation amount in some cases caused a divide-by-zero error, which in turn raised the exposure to the maximum possible value.
A (single-byte) boolean variable was set to “true” with the operation “x = x + 1”. As a result, with a probability of 1/256, pressing the “Set” button could make the program skip the information about an incorrect turntable position.
Potential errors were also identified: there was no synchronization in the multitasking operating system.
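The one-byte flag error from the list above is easy to reproduce. Below is a minimal Python sketch, not the original PDP-11 code: the names FlagByIncrement, set, is_set and count_missed_checks are hypothetical, and the 8-bit register is simulated with a modulo operation. It shows that a flag "set" by incrementing wraps back to zero on every 256th call, so a check guarded by it would be skipped.

```python
class FlagByIncrement:
    """Models a one-byte flag that is 'set to true' with x = x + 1.

    Hypothetical illustration; not taken from the Therac-25 source.
    """

    def __init__(self):
        self.value = 0  # 0 reads as "false", any other value as "true"

    def set(self):
        # An 8-bit variable wraps around: 255 + 1 becomes 0.
        self.value = (self.value + 1) % 256

    def is_set(self):
        return self.value != 0


def count_missed_checks(calls):
    """Count how often the flag reads 'false' right after being set."""
    flag = FlagByIncrement()
    missed = 0
    for _ in range(calls):
        flag.set()
        if not flag.is_set():
            missed += 1  # a safety check guarded by this flag would be skipped
    return missed


if __name__ == "__main__":
    for calls in (255, 256, 1024):
        print(calls, "calls ->", count_missed_checks(calls), "missed")
```

Assigning a constant (for example, value = 1) instead of incrementing would avoid the wraparound, because the result no longer depends on how many times the flag has been set before.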
Fixes
All interruptions related to the dosimetry system now stop the procedure instead of pausing it. The operator is required to re-enter all parameters.
Added one-click software shutdown.
Added one-click independent hardware shutdown.
The coded error messages were replaced by meaningful ones, and the current exposure level is now displayed on the screen.
Added a potentiometer that senses the position of the turntable.
Changing the position of the turntable and other parts of the machine is now possible only while the operator holds down a special pedal (a dead man's switch).
In X-ray therapy mode, the deflection magnets used for electron therapy are set to a configuration in which the electron beam is deflected by 270°.
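Several of these fixes share one idea: the beam should depend on what an independent sensor reports, not on a single software variable. The Python sketch below illustrates that cross-check under simplifying assumptions. The names Mode, TurntablePosition, REQUIRED_POSITION and beam_permitted are hypothetical, the device is reduced to two treatment modes, and this is not AECL's actual logic.

```python
from enum import Enum


class Mode(Enum):
    ELECTRON = "e"
    XRAY = "x"


class TurntablePosition(Enum):
    UNKNOWN = 0        # sensor reading missing or between positions
    SCAN_MAGNETS = 1   # electron mode: magnets spread the beam
    XRAY_TARGET = 2    # X-ray mode: target is in the beam path
    REFLECTOR = 3      # aiming aid in the path, beam must stay off


# Which sensed position each requested mode requires (assumed mapping).
REQUIRED_POSITION = {
    Mode.ELECTRON: TurntablePosition.SCAN_MAGNETS,
    Mode.XRAY: TurntablePosition.XRAY_TARGET,
}


def beam_permitted(requested_mode, sensed_position):
    """Allow the beam only if the independently sensed position matches the mode."""
    return REQUIRED_POSITION[requested_mode] is sensed_position


if __name__ == "__main__":
    print(beam_permitted(Mode.XRAY, TurntablePosition.XRAY_TARGET))   # allowed
    print(beam_permitted(Mode.XRAY, TurntablePosition.SCAN_MAGNETS))  # refused
```

The point of the design is that sensed_position comes from hardware such as the potentiometer, so a stale or corrupted software variable cannot make the check pass on its own.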
The manufacturer reported that the software and hardware had been tested for many years. However, during the trial it turned out that the software had undergone only minimal testing on a simulator, and that most of the time the system had been tested as a whole. In other words, unit testing was neglected and only integration testing was performed.
It was naive to assume that reusing code or an off-the-shelf product would make the software safer simply because it had been used successfully for a long time. Code reuse does not guarantee that a module is safe in a new system, since the new design has its own characteristics. Rewriting from scratch can give a simpler and more transparent system and, as a result, a safer one.
This case involved code reused from the Therac-6 and Therac-20. The Therac-6 had no electron therapy mode at all (it produced X-rays only), and the Therac-20 relied on mechanical interlocks.
After the Therac-25 accidents, the FDA changed its attitude toward many of the problems of safety-related systems, and toward software in particular. As a result, the FDA began improving its procedures, guidelines, and reporting systems, and made software part of them. This lesson was important not only for the FDA but for all industries that build safety-critical systems.
The Software Engineering Institute cites an average of 1 bug per 100 lines of code, and says that 98% of device malfunctions caused by software bugs could easily have been avoided with proper testing. Knowing this, I want to join the " give the code a look " movement. Measures do seem to have been taken after the high-profile cases, but I still would not want to run into a drill in which someone “got it wrong” in the variable responsible for angular velocity. Dear testers (programmers, developers), do your job well.
UPD
The University of California, Berkeley: Computer Science 61A - Lecture 35: Therac-25