
Discussion about static code analysis

Answer

There were comments on one of my articles that collected so many misconceptions about static code analysis that I decided to answer them in a separate publication. I hope the answer will let the author of the comments, and all doubting programmers, take a fresh look at static analysis tools in general, and PVS-Studio in particular.

Comments


The comments were written in response to the article "I send my greetings to the developers of Yandex". Here is the first one:
All these analyzers, besides their benefits, add pain in the form of build slowdowns, false positives, information noise, and a false sense of security. For myself, I decided that the fewer of these tools the better: the pain easily outweighs the benefits. And they do not help newbies much either - they would do better to learn to write unit tests instead of wasting time plugging warnings. Perhaps Yandex thinks the same way.

Although if the tests are run before the release, the pain will be less, and it will be possible to filter out a reasonable time/error ratio, so who knows.
Disclaimer: I have not tried PVS-Studio.
And there is an addition:
Do not forget about your real competitors (not de jure, but de facto): dynamic analyzers (ASAN, MSAN, TSAN, UBSAN). Of course, they have a problem: they are not as fast and require more resources. But they also have an advantage: if a *SAN finds something during fuzzing, there is no point arguing - it is a real bug that needs to be fixed. With a static analyzer... anything can happen.
So you need to squeeze into the niche between the analyzer built into clang/gcc/msvc (free and always at hand) and the *SAN tools (difficult and expensive, but with almost zero false positives, so any finding can go straight into the bug tracker). Is this niche big?

In my opinion, these are very good comments that can serve as a collective portrait of the misconceptions about static code analysis. Let's carefully consider each critical point.

I will speak from the position of an evangelist of the PVS-Studio static analyzer; however, everything I say applies to static analysis tools in general.

Analyzers mainly generate distracting noise


All these analyzers, besides their benefits, add pain in the form of build slowdowns, false positives, information noise, and a false sense of security. For myself, I decided that the fewer of these tools the better: the pain easily outweighs the benefits.

A static analyzer is a higher-level, more intelligent version of the warning mechanism implemented in compilers. In fact, some diagnostics implemented in static analyzers gradually migrate into compilers, much as pieces of the boost library make their way into the std library. Meanwhile, boost continues to live and be useful. The analogy may not be perfect, but I think the idea is clear.

So, analyzers examine code more thoroughly and deeply than compilers do. Moreover, the false positive rate of good analyzers is much lower than that of compilers. In addition, static analysis tools provide many extra features that simplify their use. That is what analyzer developers get paid for.

I foresee an objection of the following kind: "You are embellishing the false positive rate. My compiler hardly issues any warnings, but when I run a static analyzer, I see thousands of them."

The explanation is simple: the analyzer has not been configured, so the comparison is unfair. Try compiling your code with a different compiler you have not used before, and you will get the same thousands, or even tens of thousands, of warnings. And that is assuming the code can even be compiled without reworking it first.

As I said, good static code analyzers produce few false positives and make it easy to suppress the ones that remain. You just need to spend some time on configuration. These thoughts are described in more detail in my article "Characteristics of the PVS-Studio analyzer on the example of EFL Core Libraries, 10-15% of false positives".

I have strayed a little from the main point, but I had to explain that a static analyzer is a more powerful tool than the warning mechanism in a compiler.

So, abandoning static analysis is just as foolish as turning off all compiler warnings. Look, let's rewrite the original comment a little:

All these warnings issued by compilers, besides their benefits, add pain in the form of build slowdowns, false positives, information noise, and a false sense of security. For myself, I decided that the fewer of these tools the better: the pain easily outweighs the benefits.

Any reasonable programmer will say that this is nonsense. Yes, compiler warnings are sometimes noisy. But that does not negate their usefulness.

If someone says that compiler warnings give them a false sense of security, that is the speaker's problem, not the compiler's.

One last thing: the claim that the harm from warnings outweighs the benefit indicates not professionalism but, on the contrary, the incompetence of such a programmer. They simply do not know how to use compiler warnings.

As you understand, the same applies to static code analysis. We have examined the first misconception in detail; let's proceed to the next.

Static analysis is of no use to newbies


And they do not help newbies much either - they would do better to learn to write unit tests instead of wasting time plugging warnings.

I agree that, first of all, beginners should learn a programming language, learn to debug code, learn to write unit tests, and so on. Without fundamental knowledge, no supporting tool will help.

Yes, static analysis is probably not the technology that needs attention first. However, static analysis is always a friend, regardless of the developer's skill level. It may well be useful to students, hinting that something is wrong with the code and giving food for further study of the language. By the way, we have an interesting article about checking student code: "About evil, accidentally called upon by the students of magicians".

If we are talking not about complete beginners but, say, about new employees, then static analysis will let the team leader quickly see what kind of programmer the new hire is and what they need to learn.

Note. Unit tests, by the way, are not a panacea either; a static analyzer complements this methodology rather than competing with it. See the article "How static analysis complements TDD".

It is enough to run a static analyzer right before the release


Although if the tests are run before the release, the pain will be less, and it will be possible to filter out a reasonable time/error ratio, so who knows.

This is a completely wrong way to use an analyzer! The only worse option is not using static analysis at all, or running it once every five years.

With this approach, all errors are hunted down slowly and painfully with the help of unit tests, debugging, testers, and so on. At the pre-release verification stage, the analyzer will mostly find minor errors that went undetected earlier precisely because of their insignificance.

The point of static analysis is to detect errors as early as possible - that is, at the stage of writing code. This is how a static analyzer saves the most time, nerves, and money.
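In practice, "as early as possible" usually means wiring the analyzer into the regular build or CI pipeline rather than performing a pre-release ritual. As a sketch, on Linux the PVS-Studio command-line tools can be chained roughly like this (file names and the `-a` severity filter here are illustrative choices, not the only possible setup):

```shell
# Sketch of running PVS-Studio on every build (illustrative paths/flags).

# 1. Wrap the normal build so compilation commands are captured.
pvs-studio-analyzer trace -- make -j8

# 2. Run the analysis itself, producing a raw log.
pvs-studio-analyzer analyze -o pvs.log

# 3. Convert the raw log into a readable report
#    (General Analysis diagnostics, severity levels 1 and 2).
plog-converter -a GA:1,2 -t errorfile -o report.txt pvs.log
```

With a step like this in CI, new warnings surface on the same day the code is written, when the author still remembers exactly what the code was supposed to do.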

Here the analogy with compiler warnings is again appropriate. Suppose warnings are completely disabled, and a project is developed for three months. Then, the day before the release, all the compiler warnings are switched on. You must agree, that would be absurd.

It would also be appropriate here to suggest my colleague's article on the futility of one-off analyzer runs: "The philosophy of static code analysis: we have 100 programmers, the analyzer found few errors, is it useless?".

Dynamic analysis competes with static analysis


Do not forget about your real competitors (not de jure, but de facto): dynamic analyzers (ASAN, MSAN, TSAN, UBSAN). Of course, they have a problem: they are not as fast and require more resources. But they also have an advantage: if a *SAN finds something during fuzzing, there is no point arguing - it is a real bug that needs to be fixed. With a static analyzer... anything can happen.

Yes, dynamic analyzers have advantages, but they also have drawbacks. Dynamic analyzers can find errors that a static analyzer does not notice - and vice versa! Dynamic analyzers have almost no false positives, but some parts of a program are extremely hard for them to reach, or testing them takes too much time. A static analyzer, by contrast, checks the entire source code very quickly.

It is important not to frame these methodologies as "static versus dynamic analysis". The technologies do not compete; they complement each other. By using static and dynamic analysis together, you can identify a huge number of diverse errors.
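A small sketch of why the two approaches complement each other (the function and values are invented): the off-by-one below is something a static analyzer can flag from the suspicious `<=` loop bound without running the program at all, while a dynamic tool such as AddressSanitizer (compiling with `-fsanitize=address`) reports the out-of-bounds read only when a test actually drives execution through this path with a boundary value:

```cpp
#include <cstddef>
#include <vector>

// Invented example of an off-by-one bug: the loop should use `i < n`.
// With `i <= n` it sums one element too many; when n == v.size(),
// that extra read is out of bounds, which ASan catches at run time -
// but only if such a call actually happens during testing.
int sum_first_n(const std::vector<int>& v, std::size_t n) {
    int s = 0;
    for (std::size_t i = 0; i <= n; ++i)  // bug: should be i < n
        s += v[i];
    return s;
}
```

Static analysis sees the pattern on every path, executed or not; dynamic analysis proves the failure is real on the paths it reaches. Neither subsumes the other.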

We do not consider dynamic analyzers competitors, because professional programmers do not ask "Which one should I choose?". They use both technologies, because each helps answer the question "What else can I do to improve the quality of my code?".

Note

By the way, for reasons unknown to me, some programmers believe that a dynamic analyzer can do everything a static analyzer can, and is therefore better because it gives fewer false positives. This is not so: there is a great deal a dynamic analyzer cannot do. On this subject, I have two good publications:


It is difficult to get into the niche between compilers and dynamic analyzers


So you need to squeeze into the niche between the analyzer built into clang/gcc/msvc (free and always at hand) and the *SAN tools (difficult and expensive, but with almost zero false positives, so any finding can go straight into the bug tracker). Is this niche big?

The niche is wide: not only does the PVS-Studio analyzer fit comfortably in it, but so do the tools of many other companies, such as SonarSource, Synopsys (formerly Coverity), Gimpel Software, Rogue Wave Software, and so on.

Why is the niche wide? The answer is simple: the boundaries named in the comment do not confine static code analyzers as tightly as suggested. Dynamic analyzers stand at one boundary, but, as we established above, there is no competition there - only a friendly symbiosis.

At the other boundary are compilers. Yes, compilers are getting smarter. However, static analysis tools do not stand still either and are developing rapidly.

For skeptics, I have a number of publications in which I demonstrate that PVS-Studio easily finds errors in these very compilers:


I have not tried PVS-Studio


Well, you should - give it a try :). Many people like it, and they become our customers.

Moreover, trying PVS-Studio is very easy. Go to the product page and download the demo version.

If you have any questions, feel free to write to our support team. We will help you analyze a large project, sort out the warnings, and advise you on licensing options.

Thank you all for your attention, and I wish you bug-free code.

Source: https://habr.com/ru/post/343940/
