
What is 'stable'?

We had a rather heated discussion at work about whether Python 2.7 should be considered stable. I will leave aside the outcome of the dispute, and the question itself; what I want to do here is state and systematize some thoughts about real-world programs, thoughts that strongly contradict the tidy world of von Neumann and Turing.

The world in which programmers work is the world of correct code. Of course, it contains countless errors, but those errors can be fixed. If they are errors in someone else's code that will not be fixed, they must be documented and worked around in your own code. Either way, a bug is always a reason to find and eliminate it.

The world of system administration is different. Here the code simply is what it is. It is impossible to so much as glance through the source code of all the packages, even for the smallest and most modest installation: 300+ MB of Linux, the sources of the main libraries and programs... It is, in principle, beyond comprehension. You can know specific programs, specific places in those programs, but you cannot know the entire runtime, the whole software environment the OS consists of.
And it is full of bugs. One can argue about the charm of mathematical proofs of code correctness, but they do not help at all when the failure turns out to be in a proprietary video card driver (what, you might ask, does that have to do with a server?) or in a network card with TCP offload.

There is a completely different approach to the software problem, a purely practical one: we have a priori buggy code that sometimes works the way we expect it to.

And it is around this “sometimes” that the whole concept of “stability” and production-readiness is built.

We recognize that a program is full of errors, that errors can be triggered by errors in other programs, or by some combination of errors and their absence. Sometimes we even know that two errors in different programs cancel each other out.

We accept it “as is” and strive for a situation in which, with high probability, no errors show up. Where are errors least likely? Right: where things have been tested the most. If it is known that foobar 1 2 gives the correct result, then most likely there are no errors there. Which, by the way, does not at all mean that foobar 2 1 will work correctly. Not to mention foobar -1 -2, foobar 1.2 2, \windows\path\foobar 1 2, foobar “01” “02”, and so on.

In other words, we start talking in terms of “software testing”. Obviously, not every code path with every possible input can be tested, but the more thoroughly particular branches of the code are tested, the higher the probability that few errors remain in them.
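To make this concrete, here is a toy sketch in Python (the function and its behavior are invented for illustration; this is not any real foobar):

```python
def foobar(a, b):
    """Toy example: floor-divide a by b."""
    return a // b

# The tested case passes, so we conclude "no errors":
assert foobar(1, 2) == 0
# This one happens to work too, but nothing guaranteed it:
assert foobar(2, 1) == 2
# Untested paths are where the surprises live:
# foobar(1, 0)       -> ZeroDivisionError
# foobar("01", "02") -> TypeError: strings do not support //
```

The passing asserts say nothing at all about the commented-out calls; that asymmetry is the whole point.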

So: the more testing, the better.

Programs get the most testing in their default settings and in typical, best-practice configurations. If there is no reason to do something of your own, do the typical thing.

The same applies to the choice of software. If there is no fundamental difference, use what the distribution offers by default. Not because it is better, but simply because it is used more.

So as not to be unsubstantiated, let me tell a small instructive story from life at Selectel. For a long time we used Python 2.4 (the default in CentOS 5.5). The programmers whined endlessly about wanting Python 2.6, then moved on to dirty hacks (using libraries that only work with 2.6+), and all but issued an ultimatum. After initial testing, it was decided to bring 2.6 into the production environment.

Most of our services are built on the fail-fast principle: as soon as the program encounters something it does not like, it immediately (and in a controlled manner) terminates and is restarted by an external supervisor service. This removes the need for very extensive logic for handling transient errors (database unavailability, lack of disk space, etc.) and management changes (a master change in a cluster or a pool, an address change in a configuration file, etc.). One of the programs depended on the pool master (of which there is exactly one per pool) and on the other servers in the pool (the slaves); finding no master, it would quietly terminate, get restarted (after a delay), and terminate again. If the master changed, the program on the old master terminated and went into standby, while on the new master it started working after a small delay.
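A minimal sketch of that fail-fast pattern, with hypothetical names (find_master, serve) standing in for our actual code:

```python
import sys

def find_master():
    """Ask the pool for its current master; return None if there is none."""
    ...  # query the pool state here

def serve(master):
    """Do useful work; raise (and thereby exit) on anything unexpected."""
    ...

def main():
    master = find_master()
    if master is None:
        # No master: terminate in a controlled way. An external supervisor
        # (a respawning init entry, runit, daemontools, ...) restarts us
        # after a delay, so all retry logic lives outside the program.
        sys.exit(1)
    serve(master)

if __name__ == "__main__":
    main()
```

The design choice is that the program itself stays trivially simple; the supervisor, not the application, owns the restart policy.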

On Python 2.4 this worked great. Then Python 2.6 arrived... and we noticed significant CPU consumption on an otherwise idle system. The culprit was precisely python2.6: it took considerably longer to start (together with the new libraries) than python 2.4 did.
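The effect is easy to check with a rough measurement like the sketch below (the interpreter paths and iteration count are illustrative, and the script assumes both interpreters are actually installed):

```python
import subprocess
import time

def startup_time(interpreter, runs=20):
    """Average wall-clock seconds to start the interpreter and exit."""
    start = time.time()
    for _ in range(runs):
        subprocess.call([interpreter, "-c", "pass"])
    return (time.time() - start) / runs

for python in ("/usr/bin/python2.4", "/usr/bin/python2.6"):
    print(python, startup_time(python))
```

Multiply the per-start cost by a restart every few seconds across many such services, and the “idle” CPU consumption we saw follows directly.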

From the programmer's point of view: what the hell is going on here?
From the system administrator's point of view: we updated the software to a new version and got a pile of unexpected problems.

... By the way, the problem was solved by a special case in the program: when the master is absent, it does not exit, but sleeps and tries again. That is, we actually had to change the application's architecture, partially abandoning the fail-fast principle.
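Sketched with the same hypothetical names as before, the workaround looks something like this:

```python
import time

def find_master():
    """Same hypothetical helper as in the sketch above."""
    ...

def serve(master):
    ...

def main():
    while True:
        master = find_master()
        if master is None:
            # Special case: no master means sleep and retry in-process,
            # instead of exiting and paying interpreter startup again.
            time.sleep(30)
            continue
        serve(master)  # everything else still fails fast
```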

A trifle, it would seem: there are no factual errors in the code, yet the consequences are there, and they are not pleasant.

Thus arises a natural need for conservatism (as it was, so let it remain), which is where the “stale” distributions with their ancient software and ten-year support terms for that software come from. The main reason is fear of changes in the production environment.

Most Linux distributions can be classified by their degree of commitment to stability. The key criterion is how software is updated: whether only security fixes are shipped, or whether updating software to new versions without changing the version of the distribution itself is also allowed.

This is the watershed between Linux distributions. For example, Ubuntu allows itself to change software versions; Debian does not. Thus Debian is more conservative, and for that reason better suited for server use. The most conservative is Red Hat, which still supports RHEL 4.2. I do not remember exactly, but it seems they have not yet ended support for Linux 2.4 based systems, or will end it soon.

What is more, I would venture that in the foreseeable future RH, or someone else on the enterprise-oriented market, will launch a product with an even longer support cycle (15-20 years). What for? So that you can “set it up once” and never think about it again.

By the way, Microsoft, by including new browser versions in critical updates of its server OS, commits an obvious absurdity, breaking the very ideology of stability.

Do I need security updates?


You know, I have seen the interesting position that in some configurations they are not. The reasoning goes: if it works, you must not touch it anymore. What is more, there are several cases from practice where security updates broke software functionality.

General practice, however, says that bug fixes should be installed. The key difference between bugfix updates and new software versions is the absence of new functionality: only bugs are fixed, while functionality is kept (or at least meant to be kept) at the same level.

Bleeding edge


The opposite of stability is the bleeding edge concept: the very latest version, sometimes even from nightly builds (that is, not even a release, but an intermediate state of the software during development). Bleeding edge does actually have its advantages.

First, you are using an atypical version; that is, with high probability, its bugs are known to no one and there are no exploits for them. Second, bugfixes reach you first. Third, new functionality is available hot off the press.

But the price is high: nobody has tested this software (you march in the front ranks of those fallen warriors out of whom the wall called “stable” is built), and all of its bugs, incompatibilities, stupidities and typos are yours.

The choice is yours. At home, for example, I use a mix of Debian sid and experimental (that is, very nearly bleeding edge); a “stable” (relatively) Ubuntu LTS on my work laptop; and stable (no quotes) Debian/CentOS on servers. Moreover, between the release of a new distribution version and the moment I update to it, I prefer to wait several months (sometimes until the old one's support ends): during that time a pile of errors in the new release get fixed, potential compatibility problems get solved, and so on. Ideally all of this should have happened before the release was announced, but it is better to lag behind than to spend a sleepless night fixing problems.

Source: https://habr.com/ru/post/114338/

