Ahead of the release of Intel's static analyzer, Advisor, which will ship with Intel Parallel Studio 2011, it is worth talking about static code analysis technology and its use in general. In our experience, static analysis is not used often in Russia, apparently because we do not have that many complex software projects. So a short text on what it is and who can benefit from it will, I hope, be useful. And who should write such a text, if not the authors of the PVS-Studio analyzer? :-)
Static code analysis technologies are used in companies with mature software development processes. However, the degree to which code analysis tools are used and built into the development process varies: from running the analyzer manually “from time to time” or when hunting for subtle errors, to running it automatically every day or whenever new source code is added to the version control system.
The article describes the different levels of using static code analysis in team development and shows how to move the process from one level to another. As an example, the article uses the PVS-Studio code analyzer developed by the authors.
A static code analyzer is a tool that finds software errors by examining source code. Using such a tool helps to identify software errors at the development stage rather than at the testing or usage stages.
However, companies do not always manage to benefit from such tools. The reasons vary widely. For some projects introducing a code analyzer makes no economic sense, and some projects are not large enough for the effect to be noticeable. Therefore, before introducing static code analysis into the development process, it is necessary to understand when it can be useful and when it cannot.
Based on the experience of the authors (who develop, promote and sell their own static code analyzer), the article formulates the main considerations that should guide the introduction of such tools into the development process.
What is static code analysis
Static code analysis is a technology for finding errors in programs by parsing the source code and searching it for patterns (templates) of known errors. This technology is implemented by special tools called static code analyzers.
The word "static" means that the code is parsed without running the program. Tools that analyze a program while it is running are called dynamic code analyzers.
The best-known static analyzers are produced by Coverity, Klocwork, and Gimpel Software. Popular dynamic analyzers are made by Intel (Intel Parallel Inspector) and Micro Focus (DevPartner Bounds Checker). The specialized static code analyzer PVS-Studio, which the authors of this article develop and promote, should also be mentioned.
The output of a static analyzer is a list of potential problems found in the code, each with a file name and a line number. In other words, it is a list of errors very similar to the one a compiler produces. The term “potential problems” is used deliberately. Unfortunately, a static analyzer cannot tell for certain whether a potential error in the code is a real problem. Only the programmer can know that. Therefore, alas (and this is inevitable), code analyzers give false positives.
Tools for static code analysis are classified by the programming languages they support (Java, C#, C, C++) and by the problems they diagnose (general-purpose analyzers or specialized ones, for example, for developing 64-bit or parallel programs).
For which projects static code analysis is relevant
Static code analysis is worth using not in every project but only in medium and large ones. What counts as a small, medium or large project is clearly beyond the scope of this article, but from our own experience we recommend considering static analysis for projects larger than 30 person-months. If a project is smaller than that, then instead of static analysis it is enough to have several qualified developers on it. A team of two to four qualified people can carry such a project on its own and produce good-quality code. But if more people work on the project, or it lasts longer than six months, then hoping that “you just have to write without errors” is rather naive.
Variants (scenarios) of using static code analyzers
Consider the situations in which a development team may arrive at the need for static code analysis. We deliberately consider the case where static analysis is only just entering the development process: if it has already been implemented and used for a long time, there is no point in discussing implementation issues.
So, suppose a team of five is porting the code of a software project to 64-bit computers. Suppose also that the project is written in C/C++. We admit up front that these assumptions are chosen so that our own code analyzer, PVS-Studio, can serve as the example. The developers fixed the main compilation errors and built the application's distribution package. They started testing and discovered extremely mysterious errors that show up only in the 64-bit version of the program. The developers go to Google, type in “64-bit platform C++ issues”, and among the 8.5 million results find, on the first page, a link to our article “20 issues of porting C++ code on the 64-bit platform” (in the Russian version, “20 traps porting C++ code to a 64-bit platform”). From it they learn that when 64-bit versions of C/C++ applications are developed, various previously unnoticed problems surface. There they also learn that a tool called PVS-Studio exists that helps to find and fix these problems. Next, the developers download the tool and try the trial version; if it suits them, they buy a license, use the tool to find a number of errors in their code, fix them, and the program ends up without errors. After that, the developers consider the task of creating the 64-bit version finished and stop using the analyzer, because they believe they no longer need it.
Another scenario is close to this one. While developing a Java application, a team of five developers ran into an error in one of the third-party modules. Unable to find the error by eye, the developers downloaded a trial version of some Java code analyzer, found the error in the third-party module and fixed it, but did not buy a license for the tool because of project budget constraints. The error is fixed, the application is released, and the tool's license has not been violated. Everything seems fine, yet this way of using a static analyzer cannot be called correct.
The third scenario. The developers switched to Visual Studio Team Foundation Server, which can run code analysis on files being added to the version control system. A few weeks later, the developers turned the check off, because adding new code had turned into a game of “convince the analyzer to let the file in”.
None of these three scenarios is a successful use of static analysis, even though in the first two the analyzer helped to find real errors in the code, and in the third the programmers' code was apparently plain bad. What are the reasons for these failures?
What prevents full use of a static code analyzer
Let us explain why the three scenarios above are not successful uses of static analysis.
If a team uses a specialized code analyzer (as in the case described, to search for 64-bit code problems), there is a great temptation to abandon the tool once the problems seem to have been found and fixed. Indeed, once the 64-bit version of a software product has been released, it may seem pointless to keep using a special tool. However, this is not so. If you stop using such an analyzer, then over time (after several months) the new code will contain the very errors that the analyzer could have detected. That is, although the 64-bit version of the application exists and was once debugged, new code may contain errors typical of 64-bit applications. The conclusion for the first scenario: abandoning a specialized code analyzer once the main work with it is done leads to new software errors of the same type.
In the second scenario, the team decided to use a specialized tool only when it became apparent that the project contained hard-to-find errors. After correcting these errors, the team abandoned the tool. The problem with this approach is that hard-to-find errors will sooner or later appear in the project again. But this time, perhaps, users rather than developers or testers will be the first to see them. The conclusion for the second scenario matches the first: abandoning the tool will inevitably lead to hard-to-find errors appearing again.
In the third scenario, where static analysis at check-in was dropped because adding new code to the version control system had become too difficult, the problem lies not in the static analyzer at all but in the insufficient skill level of the team. First, the team failed to configure the tool so that its messages were useful. Second, the code apparently really was not very good, since the analyzer produced so many diagnostic messages.
So, let us formulate the main problems that prevent static code analysis tools from being used on a permanent basis:
- The high price of code analysis tools rules them out for small (above all, small-budget) projects. It simply has to be understood that for some projects static analysis is unsuitable for economic rather than technological reasons.
- The code analysis tool gives a lot of false positives. Alas, every code analyzer gives false positives, and often quite a lot of them. The reason lies in the philosophy of such tools: it is better to issue ten or even a hundred false messages than to miss one real error. It is not worth hoping that some analyzer will produce few false positives. It is better to choose a tool that provides some way of working with them. For example, our PVS-Studio analyzer has a “Mark as False Alarm” function. With it, you can mark the analyzer's false alarms right in the code, that is, indicate that the analyzer should not issue a particular type of message for a particular line.
- Poor integration into the development environment. If the code analysis tool does not integrate seamlessly into the development environment, it is unlikely to be used regularly.
- Lack of automated launch from the command line. Without it, the code of the whole project cannot be analyzed regularly, for example during daily builds.
- Lack of integration with the version control system. Although in the example above checking new code at check-in was the very reason the tool was abandoned, the possibility of such integration is still useful.
- Analyzer settings that are too complicated or, conversely, too simple.
The way around these problems is cooperation between the company that wants to use static code analysis and the company that supplies the technology. That is, the relationship moves from “buy a tool and use it” to “buy a solution, implement it, and only then use it”. Like it or not, in most cases you cannot simply buy an “analyzer program” and profit from it. You have to “pull up” the company's development process and, together with the supplier of the static analysis solution, build the tool into the team's regular development process.
This is how static analysis market leaders such as Coverity and Klocwork operate. Incidentally, this has an outward sign that may seem puzzling: it is not easy to get even an evaluation version from these companies' websites, and it is impossible to get an answer to the question "how much does it cost?" until the sales managers have learned as much as they can about the customer.
If your company plans to use static code analysis, then the following should be considered:
- The introduction of static code analysis has an impact on the entire development process.
- A static analyzer is not a small utility and not just another copy of Windows that you can buy and use without any interaction with the supplier. Always expect to communicate closely with the developers of the analyzer, and expect the rollout of the tool to take time and effort.
- A static analyzer raises the overall culture of software development in a team, but only if the team itself is ready for this improvement. That is, the process is mutual.
- Improving the development culture through static code analyzers is an expensive process. You must be prepared for this and understand that it will require significant investment.