This is a reference material about the heisenbag. We talk about how they look and what they have to do with mainframes - the ancestors of the cloud.
/ photo Lars Zimmermann CC BY
Heisenbug (Heisenbag or Heisenbag) is a term that describes errors that change properties when debugging code. That is, they disappear during testing and debugging, but they appear in production.
The name “Heisenbag” refers to the
Heisenberg uncertainty principle from quantum mechanics. In general terms, it can be described as an unexpected change in the properties of the observed object as a result of the fact of observation.
Story
The author of the term "heisenbag" is considered to be an employee of the IBM research center Bruce Lindsay. He contributed to the development of relational databases and was engaged in the development of corporate
IBM System RMS .
')
In 1985, while studying at Berkeley University, Bruce and
Jim Gray (James Nicholas Gray), an American scientist in the field of the theory of computer systems,
worked on CAL-TSS. It was written specifically for the
Control Data 6400 dual-processor mainframe [
PDF , p.3], where the military processed large amounts of data.
Of course, there were bugs in the development process. But several of them were special - as soon as the engineers tried to fix them, they disappeared. At that time, Lindsay just studied physics and the Heisenberg principle in particular. Suddenly, Lindsay dawned - he and Gray witnessed a similar phenomenon: errors disappeared, because observation affected the properties of the object. Hence the name "heisenbag".
Lindsay told this story
in an interview with representatives of
the Computer Engineering Association (ACM) in 2003.
Examples of Heisenbags
Users on the network and on thematic platforms like Stack Overflow shared several examples of heisenbags that they met in their projects. One of the residents of SO tried to calculate the area of ​​the figure between the two curves with an accuracy of three decimal places. To debug the algorithm in C ++, he added a line:
cout << current << endl;
But as soon as he commented on it, the code stopped working and got stuck. The program
looked like this :
#include <iostream> #include <cmath> using namespace std; double up = 19.0 + (61.0/125.0); double down = -32.0 - (2.0/3.0); double rectangle = (up - down) * 8.0; double f(double x) { return (pow(x, 4.0)/500.0) - (pow(x, 2.0)/200.0) - 0.012; } double g(double x) { return -(pow(x, 3.0)/30.0) + (x/20.0) + (1.0/6.0); } double area_upper(double x, double step) { return (((up - f(x)) + (up - f(x + step))) * step) / 2.0; } double area_lower(double x, double step) { return (((g(x) - down) + (g(x + step) - down)) * step) / 2.0; } double area(double x, double step) { return area_upper(x, step) + area_lower(x, step); } int main() { double current = 0, last = 0, step = 1.0; do { last = current; step /= 10.0; current = 0; for(double x = 2.0; x < 10.0; x += step) current += area(x, step); current = rectangle - current; current = round(current * 1000.0) / 1000.0; //cout << current << endl; //<-- COMMENT BACK IN TO "FIX" BUG } while(current != last); cout << current << endl; return 0; }
The essence of heisenbag : when there is no printout, the program performs a comparison with high accuracy in the registers of the processor. The accuracy of the result exceeds the capabilities of double. To output a value, the compiler returns the result of the calculations to main memory — in this case, the fractional part is discarded. And the subsequent comparison in while leads to the correct result. When the line is commented out, implicit truncation of the fractional part does not occur. For this reason, the two values ​​in while always turn out to be unequal to each other. As a solution to the problem, one of the participants in the discussion suggested using an approximate comparison of floating-point numbers.
Another story about heisenbag
shared by engineers who worked with the environment of the language
Smalltalk-80 on Unix. They noticed that the system was hanging if you left it for some time without work. But after moving the mouse cursor, everything worked as usual again.
The problem was with the Unix scheduler, which reduced the priority of tasks that are idle. At some point, the priority was so low that the processes in Smalltalk did not have time to complete. The stack of tasks grew and hung up the program. When the user moved the cursor, the OS restored the priority and everything returned to normal.
Other * bugs
There are a number of terms that describe various kinds of errors: Borbag, Mandelbag, Shroedinbag.
Borbag - the opposite of Heisenbag - a common mistake that is easy to find and fix. Named in honor of Niels Bohr, who in 1913 proposed a simple and understandable model of the structure of the atom. According to this model, the electrons of an atom move along certain orbits, which means that their momentum and radius of motion can be predicted. Similarly, the appearance of borbagov can be predicted, if you create the necessary conditions for them.
/ photo OLCF at ORNL CC BY
Shredinbag is a mistake that exists and does not exist at the same time until the developer looks at it. The name of the error was in honor of the famous
mental experiment .
As for
Mandelbag , this is a mistake, because of which the system behaves erratically and unpredictably. The phenomenon is named after the physicist, mathematician and creator of fractal geometry,
Benoit Mandelbrot .
What is the result
There are
many examples of heisenbags (and other * bugs). They are very difficult to search for, but the causes are usually trivial: an uninitialized variable, synchronization errors in a multi-threaded environment, or problems with the algorithms for
removing dead code . It turns out that to combat such errors, they need to be cut off even at the design stage of the application.
From the corporate IaaS blog: