
A new mechanism for suppressing unnecessary analyzer messages

At the moment, the PVS-Studio analyzer already has a mechanism for suppressing false positives. This mechanism fully satisfies us from a functional point of view, i.e. we have no complaints about the reliability of its operation. However, some of our users and clients would like to work only with analyzer messages for “new”, i.e. newly written, code. This desire is quite understandable, given that on a large project the analyzer can generate thousands or even tens of thousands of messages for the existing code, which, of course, nobody is going to fix.



The possibility of marking messages as “false” in a sense overlaps with the desire to work only with “new” messages, since, in theory, nothing prevents marking all the currently found messages as “false” and in the future working only with messages for newly written code.
However, the existing mechanism for marking “false” messages has a fundamental usability problem (which we will discuss later) that can become an obstacle when using it on real projects for this purpose. In particular, the existing mechanism was not designed for “mass” markup, which is inevitable when processing thousands of analyzer messages.

Since the problem described above is fundamental to the existing method, it cannot be eliminated while keeping the method itself. Therefore, it makes sense to consider implementing an alternative way of solving this task.

The task of suppressing false positives and the task of suppressing the results of previous runs


The mechanism for matching the source code with analyzer diagnostics implies the ability to associate a source code line with a specific analyzer diagnostic. It is important to preserve this association over a long period of time, during which both the user's code and the analyzer's diagnostic output may change.

The source code/diagnostics matching mechanism can be used to solve two tasks:
  1. The task of suppressing false positives. The user is given the opportunity to mark, in a special way, the analyzer messages he considers false, which later allows filtering such messages (for example, hiding them). This markup should be retained on subsequent runs of the analyzer, including after the source code is changed. This functionality has long been available in PVS-Studio.
  2. The task of suppressing the results of previous runs is to allow the user to see only the “fresh” results of an analyzer run (that is, results that had already been found on previous runs should not be displayed). This task had not previously been implemented in PVS-Studio, and it is what the next section of the article is about.

PVS-Studio's existing mechanism for matching source code with diagnostics is based on markers (special comments) in the source code. The mechanism is implemented at the level of the analyzer kernel (PVS-Studio.exe) and of the IDE plug-ins. The IDE plug-in performs the initial placement of markers in the code and also allows filtering the analysis results by this marker. The analyzer kernel can “pick up” markers already present in the code and mark its output accordingly, thereby preserving the code markup from previous runs.
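To illustrate the idea, here is a minimal sketch (in Python, for brevity) of how a marker-based filter could work. The `is_suppressed` helper and the exact marker format are assumptions made for illustration, not PVS-Studio's actual implementation:

```python
# Minimal sketch of marker-based suppression. The "//-V" prefix mirrors
# PVS-Studio-style suppression comments; the helper itself is a simplified
# illustration, not the analyzer's real implementation.

def is_suppressed(source_lines, line_number, marker="//-V"):
    """Return True if the 1-based source line carries a suppression marker."""
    return marker in source_lines[line_number - 1]

code = [
    "if (a == a) return;   //-V501  marked as a false positive earlier",
    "if (b == c) return;",
]

print(is_suppressed(code, 1))  # True:  the message on this line is hidden
print(is_suppressed(code, 2))  # False: the message is shown to the user
```

Because the marker lives in the source line itself, it travels with the code through refactorings and version control, which is exactly the property discussed below.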

Consider the advantages and disadvantages of the existing mechanism.

Advantages: since the marker is stored directly in the source code, it is preserved no matter how the surrounding code changes, and it travels with the code through the version control system.

Disadvantages: the approach requires modifying the user's source files.

These problems make it impossible, from a practical point of view, to use the existing matching mechanism for the task of suppressing the results of previous runs, i.e. for “mass” markup of messages on an existing code base.

In other words, no one wants to blindly add 20,000 message-suppressing comments to the code and then commit all these changes to the version control system.

A new mechanism for matching diagnostics with the source code, based on message base files


As shown earlier, the main problem of the existing mechanism is that it modifies the user's source code. Both the undoubted advantages of this approach and its disadvantages follow from this fact. It becomes obvious that to implement an alternative approach it is necessary to abandon modification of the user's code and to store the link between the analyzer's diagnostics and the user's code in some external storage rather than in the source files themselves.

Long-term storage of such a markup mapping poses the fundamental task of accounting for changes, over a large time interval, both in the analyzer's own diagnostics and in the user's source code. The disappearance of a diagnostic from the analyzer's output is not a fundamental problem, since a message with such a diagnostic has already been marked as false/unnecessary. But changes in the user's code can lead to a “second coming” of messages that had previously been marked.

This problem does not affect the code markup mechanism. No matter how much a code section is changed, the marker remains in it until the user himself (deliberately or unknowingly) deletes it, which seems unlikely. Moreover, an experienced user can add such a marker to a new (or changed) section of code in advance, if he knows that the analyzer will complain there.

What exactly is required to identify an analyzer diagnostic message? The message itself contains the file name, the project, the line number in the file, and the checksums of the previous, current and next lines of code on which the diagnostic was triggered. To match diagnostics when the source code changes, we will definitely have to ignore the line number, since it changes unpredictably at the slightest modification of the file.
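The identification scheme can be sketched as follows. The choice of MD5 over whitespace-trimmed lines is an assumption of this illustration, not necessarily the analyzer's exact algorithm; the point is that the line number is absent from the identity:

```python
import hashlib

def line_hash(line):
    # Leading/trailing whitespace is ignored; MD5 is an assumption of this
    # sketch, not necessarily the hash the analyzer actually uses.
    return hashlib.md5(line.strip().encode("utf-8")).hexdigest()

def message_identity(file_name, diagnostic, prev_line, cur_line, next_line):
    # The line NUMBER is deliberately excluded: only the file, the diagnostic
    # code and the hashes of the three neighboring lines identify a message.
    return (file_name, diagnostic,
            line_hash(prev_line), line_hash(cur_line), line_hash(next_line))

# The same fragment, before and after unrelated edits shifted it down the
# file, yields the same identity, so the suppressed message stays suppressed.
before = message_identity("util.cpp", "V501", "int r;", "r = f(x, x);", "return r;")
after  = message_identity("util.cpp", "V501", "int r;", "r = f(x, x);", "return r;")
print(before == after)  # True
```

This also makes the limitation described later obvious: two textually identical fragments in one file produce the same identity.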

To implement the above-described storage of the “links” between diagnostics and user code, we chose to create local “database files”. Such files (with the suppress extension) are created next to the project files (vcproj/vcxproj) and contain lists of diagnostics marked as “unnecessary”. Diagnostics are stored without regard to line numbers, and the paths to the files in which these diagnostics were found are stored in a relative format, relative to the project files. This allows transferring such files between developers' machines even if their projects are deployed in different places (in terms of the file system). These files can be put under version control, because in most cases project files themselves store the paths to the source files in the same relative format. The exceptions here are generated project files, as in the case of CMake, for example, where the source tree can be located independently of the project tree.
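The relative-path storage can be sketched like this. The JSON layout, field names and paths here are purely illustrative assumptions; the article does not describe the actual suppress file format:

```python
import json
import os

def make_suppress_records(project_file, messages):
    # Paths are stored relative to the project file, so the suppress file can
    # be committed to version control and shared between checkouts deployed
    # at different file-system locations.
    project_dir = os.path.dirname(os.path.abspath(project_file))
    return [
        {"file": os.path.relpath(os.path.abspath(m["file"]), project_dir),
         "diag": m["diag"],
         "hashes": m["hashes"]}   # hashes of prev/current/next lines
        for m in messages
    ]

records = make_suppress_records(
    "/work/app/app.vcxproj",
    [{"file": "/work/app/src/util.cpp", "diag": "V501",
      "hashes": ["h1", "h2", "h3"]}],
)
print(json.dumps(records, indent=2))
```

Note that the stored path contains no absolute prefix, which is what makes the file portable between machines.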

To identify a message in the suppress file, we used the following fields: the file name, the project name, the analyzer's diagnostic (its code and message text), and the hash sums of the previous, current and next lines of code. As you can see, it is precisely the stored hash sums of source code lines that allow us to correlate an analyzer message with the user's code. If the user's code “shifts”, the analyzer message moves with it, because the “context” of the message (i.e. the code surrounding it) remains unchanged. If the user edits his code at the place where the message was generated, it is quite logical to consider such code “new” and to show the analyzer message for it again. If the user has actually “fixed” the error the analyzer pointed at, the message simply disappears; otherwise, if the suspicious place has not been corrected, the user will see the analyzer message again.

It is clear that by relying on hashes of code lines in the user's files we run into a number of restrictions. For example, if a user has several identical lines of code in a file, we will consider all messages on such lines suppressed, even if only one of them was marked. The problems and limitations we encountered when using the described methodology are discussed in more detail in the next section.

The PVS-Studio IDE plug-ins automatically create suppress files during the initial marking of messages and then compare all newly generated diagnostics with those contained in the suppress bases. If, after re-checking, a newly generated message is identified in the base, it is not shown to the user.
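The plug-in's filtering step amounts to a set lookup. A minimal sketch, where the identity tuples are placeholders standing in for the (file, diagnostic, line-hash) identities described above:

```python
def filter_new(messages, suppressed_identities):
    # A message is "new" if its identity tuple (file, diagnostic, hashes of
    # the surrounding lines) is absent from the suppress base; line numbers
    # are never compared.
    return [m for m in messages if m["identity"] not in suppressed_identities]

# Suppress base built during the initial mass markup (placeholder hashes).
base = {("util.cpp", "V501", "h1", "h2", "h3")}

# Results of a later re-check of the project.
run = [
    {"identity": ("util.cpp", "V501", "h1", "h2", "h3"), "text": "old warning"},
    {"identity": ("util.cpp", "V112", "h4", "h5", "h6"), "text": "new warning"},
]

print([m["text"] for m in filter_new(run, base)])  # ['new warning']
```

Only the second message survives the filter, which is exactly the behavior the user sees in the IDE.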

Statistics on the use of the new suppression mechanism


After implementing the first working prototype of the new mechanism, we naturally wanted to see how it would perform on real projects. We did not wait several months or years for such projects to accumulate a sufficient amount of changes; instead, we simply took a few past revisions of several large open-source projects.

What did we want to see? We took a fairly old revision of a project (depending on the developers' activity, it could be a week or a whole year old), checked it with our analyzer, and put all the resulting messages into the suppress base. Then the project was updated to its latest head revision and checked with the analyzer again. Ideally, we should see only messages for the “new” code, i.e. the code written within the time frame under consideration.

While checking the first project, we ran into a number of problems and limitations of our methodology. Let us consider them in more detail.

First of all, as was in principle expected, messages “reappeared” when the code was modified on the line of the message itself or on the previous/next line. While a modification of the message's own line quite expectedly leads to the “resurrection” of the message, a modification of the surrounding lines, it might seem, should not. This is one of the main limitations of the method we chose: we are tied to the text of the source file through these 3 lines. Binding to only one line seems inexpedient: too many messages could potentially be “confused” with each other. In the project statistics given below, we will designate such messages as “paired”, i.e. messages that already exist in the suppress bases but have resurfaced.

Secondly, another feature (or rather, another limitation) of our new mechanism turned up: the “resurrection” of messages in h (header) files when these files are included into source files belonging to other projects. This limitation stems from the fact that the bases are generated per project, at the IDE level. A similar situation arises when new projects appear in the solution and reuse existing header/source files.

Next, it turned out to be a bad idea to rely on the text of the analyzer message to identify the message in the bases. The message text may contain line numbers from the source code (which change when the code shifts) and the names of variables from the user's code. We solved the first problem by not storing the full analyzer message in the base: all numeric characters are cut out of it. But we decided that the “resurrection” of a message when a variable name changes is correct behavior: after all, not only the name but also the definition could have changed, and we consider such code “new”.
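Cutting numbers out of the message text before comparison can be done with a one-line normalization; a sketch (the sample message texts are invented for illustration):

```python
import re

def normalize_message(text):
    # Remove all digits so line numbers embedded in the message text do not
    # break matching when the code shifts down or up the file.
    return re.sub(r"\d", "", text)

a = normalize_message("The 'v' variable was assigned at line 42")
b = normalize_message("The 'v' variable was assigned at line 57")
print(a == b)  # True: the shifted message still matches the stored one
```

A renamed variable, by contrast, changes the normalized text and lets the message “resurrect”, which is the behavior deemed correct above.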

Finally, some messages “migrated”: either the code containing them was copied to other files, or the files were included into other projects, which largely overlaps with the first problem of our methodology described above.

Here are the statistics for several projects on which we tested the new system. The large number of diagnostic messages is explained by the fact that all messages were taken into account, including the 64-bit diagnostics, which unfortunately generate many false positives in general, and nothing can be done about it.
  1. LLVM is a large project: a universal system for analyzing, transforming and optimizing programs. The project has been actively developed for several years, so it was enough to take the dynamics of changes over just 1.5 months to get a large number of code modifications. The well-known Clang compiler is part of this project. For the project's 1,600–1,800 files, 52,000 messages were marked as unnecessary. After 1.5 months, 18,000 new analyzer messages were detected, of which 500 were paired and 500 had migrated to other files;
  2. Miranda is a well-known messaging program for Windows. Miranda has had 11 versions since its first release; we took the latest one, Miranda IM. Unfortunately, due to conflicts in the Miranda development team, this version did not change often: we had to take changes at an interval of as much as 2 years. For 700 files, 51,000 messages were marked as unnecessary. Two years later, only 41 new messages appeared;
  3. ffdShow is a media decoder commonly used for fast and highly accurate decoding of video streams. ffdShow is a largely finished project; at the time of this writing, its last release was in April 2013. We took the dynamics of changes over 1 year. Out of 570 files, 21,000 messages were marked as unnecessary. A year later, 120 new messages appeared;
  4. Torque3D is a game engine. The project is now practically dormant, but at first things were different. Its latest release, at the time of this writing, was May 1, 2007. For the period of active development, we took the dynamics of changes at a one-week interval: 43,259 messages were marked as unnecessary, and during that period 222 new ones appeared;
  5. OpenCV is a library of computer vision, image processing and general-purpose numerical algorithms. A fairly dynamic project. We took the dynamics of changes over 2 months and over a year. 50,948 messages were marked as unnecessary; 1,174 new messages appeared after 2 months, and 19,471 after a year.
What conclusions can we draw from our results?

Quite expectedly, on projects that are not actively developed we did not see a large number of “new” messages, even over a period as long as a whole year. Note that for such projects we did not count the numbers of “paired” and migrated messages.

But of greatest interest to us, of course, are the “live” projects. In particular, in the case of LLVM we see that the number of “new” messages amounted to 34% of those marked on a version only 1.5 months older! However, of these 18,000 new messages, only 1,000 (500 migrated + 500 paired) are related to the limitations of our methodology, i.e. only about 5% of the total number of new messages.

In our opinion, these figures demonstrate the viability of the new mechanism quite well. Of course, it is worth remembering that the new suppression mechanism is by no means a “panacea”, but nothing prevents using the numerous previously existing suppression/filtering methods alongside it. For example, if a message in an h file keeps “popping up” too often, there is nothing wrong with suppressing it forever by adding a comment like //-Vxxx to the line.

Although the new mechanism is already quite well debugged and we are ready to show it to our users in the next release, we decided to continue testing it by organizing a regular (nightly) check of the LLVM/Clang project. The new mechanism will allow us to look only at messages for the “fresh” code of the project; in theory, we will be able to find errors even before the developers do. This will demonstrate very well the real benefit of regular use of static analysis, and it would be impossible without our new suppression system, because it is unrealistic to look through 50,000 messages every day. Watch for reports on fresh bugs found in Clang on our Twitter.

Source: https://habr.com/ru/post/243107/

