How do you introduce static analysis into a project with more than 10 megabytes of source code?

So, you are a developer on a project with a lot of source code (perhaps a very large amount). Say, more than 10 megabytes. You have read articles about checking open-source projects and decided to run a code analyzer on your own project. You ran the check and got over a thousand messages from the analyzer. And a thousand is the optimistic case; it may well be more than ten thousand. But you are not a lazy developer, are you? You started going through them. And, horror of horrors, the fifth message turned out to be a real bug! So did the seventh, ninth, twelfth and fifteenth. You write down a dozen more real errors the analyzer pointed out and go to your boss with these words:

“Boss, look. I downloaded a cool analyzer. It has already found ten real bugs in just half an hour. In total it produced a thousand (two, three, four thousand) messages. Let's buy this analyzer, sit down with the guys and fix all the messages in two or three weeks. And then, when everything is fixed, it will report 0 messages. That will mean we are cool programmers who write high-quality code!”

And although you are already mentally clearing a spot on your chest for a new medal (after all, you are fighting for the quality of the project!), your boss will very likely answer something like this:

“Don't you have anything to do? You want to pull the whole team away for three weeks to fix bugs when we have a release in a month! So what if the bugs are real? We have lived with them somehow and nothing happened. Yes, I understand it would be great not to make mistakes in new code. Well then, don't make them! And I won't let anyone touch the old code! It has already been tested and customers have paid for it. And who is going to pay for three weeks of the team's work? So we are not buying the analyzer and we are not touching the old code. And since your current tasks are apparently finished, we'll give you some new ones right now. And have them done by tomorrow!”

A realistic scenario? Yes. Worst of all, the good idea of introducing static analysis stalls because of the need to work through every message the analyzer produces. After all, if each analyzer run produces a thousand messages, it becomes impossible to tell the new messages from the old ones. Or is there a way to solve this problem?

How is this problem solved in PVS-Studio?

A couple of releases ago, PVS-Studio gained a mechanism for suppressing old or "uninteresting" analyzer messages. By version 5.22 we had finally debugged and polished it, and it turned out to be so convenient that we now recommend it to anyone who is thinking about introducing static analysis into their project. Here is how to use it.

So, you have in front of you a project with 5 million lines of code (just as an example), which is 150 megabytes of source code. You checked it with the analyzer and got several thousand general-purpose messages (General Analysis). You would be happy to fix them, but the project manager will not allocate the time.

OK, no problem. After the analysis of the entire solution has finished, go to the PVS-Studio menu -> Suppress Messages... The dialog is simple: click "Suppress Current Messages", then Close. As a result, 0 messages remain in the PVS-Studio window. Even if you rerun the analysis of all the code, there will still be 0 messages when it finishes. But now, if you start writing new code or modifying old code, the analyzer will complain about that code. Unless, of course, you write it perfectly the first time.

How does it work?

The mechanism is quite simple. Or rather, it is simple now, but it took a lot of experiments to arrive at this design... Well, okay. What happens, from the user's point of view, when the “Suppress Current Messages” button is pressed?

A database made up of .suppress files is created in the project folders. For each project (.vcxproj file), a .suppress file is created next to it. It stores information about all the messages the analyzer issued when checking that project. The results of every subsequent analysis are compared against these messages.

Messages are, of course, not compared naively. We take into account the message code, the text of the line of code the message points to, and the text of the previous and next lines. The line number, by contrast, is not taken into account: if a message was issued near the end of a file (and stored in the database that way), inserting lines at the beginning of the file changes the line on which the message is reported. We track this and do not issue the "uninteresting" message again. But if one of the lines (previous, current or next) has changed, then we do complain, because the context of the message has been touched, and the message can be considered as issued for new (modified) code.
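The comparison rule can be illustrated with a small Python sketch. This is not PVS-Studio's actual code or its .suppress format; the names MessageKey, make_key and new_messages are hypothetical, and stripping whitespace from the ends of lines is an assumption made for the example. It only shows the idea: a message is identified by its code and three lines of context, never by its line number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MessageKey:
    """Context-based identity of a message; the line number is deliberately absent."""
    code: str       # diagnostic code, e.g. "V501"
    prev_line: str
    cur_line: str
    next_line: str


def make_key(code, lines, line_no):
    """Build a key for a message reported at 1-based line_no of a file."""
    def text(i):
        # Assumption: lines outside the file count as empty, edge whitespace is ignored.
        return lines[i].strip() if 0 <= i < len(lines) else ""
    idx = line_no - 1
    return MessageKey(code, text(idx - 1), text(idx), text(idx + 1))


def new_messages(reported, suppressed, lines):
    """Keep only (code, line_no) pairs whose key is not in the suppressed set."""
    return [(c, n) for c, n in reported if make_key(c, lines, n) not in suppressed]


old = ["int a = 1;", "if (a == a) {", "  run();", "}"]
base = {make_key("V501", old, 2)}  # the state saved when messages were suppressed

# Inserting a line at the top moves the message from line 2 to line 3.
shifted = ["#include <x.h>"] + old
print(new_messages([("V501", 3)], base, shifted))  # []

# Editing the neighbouring line changes the context, so the message counts as new.
edited = ["int a = 2;", "if (a == a) {", "  run();", "}"]
print(new_messages([("V501", 2)], base, edited))   # [('V501', 2)]
```

Keeping the line number out of the key is what makes the suppression survive unrelated edits elsewhere in the file.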

What should you do with these .suppress files? If PVS-Studio always runs on the same machine, say, a build server, you can keep these files right in the project folders. If that is inconvenient (for example, a clean build is done every time), then before launching the analyzer you can copy the files into the project folders from another location using robocopy, which preserves the folder structure.
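If robocopy is not at hand, the same copy step can be sketched in Python. The function name copy_suppress_files and both directory arguments are hypothetical; the only thing taken from the text above is that the files have the .suppress extension and must land in the same relative folders.

```python
import shutil
from pathlib import Path


def copy_suppress_files(store: Path, checkout: Path) -> list:
    """Copy every *.suppress file from store into checkout, keeping relative paths."""
    copied = []
    for src in sorted(store.rglob("*.suppress")):
        dst = checkout / src.relative_to(store)
        # Create the project folder if the clean checkout does not have it yet.
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        copied.append(dst)
    return copied
```

Running it before each analyzer launch on a clean checkout puts every suppression file back next to its project.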

If several developers work with the analyzer, the .suppress files can be put into the repository. The question then arises: how do you synchronize these .suppress files between developers? On the one hand, the answer is simple: they are XML, so merging is not a problem. On the other hand, it turns out that synchronizing these files (or modifying them in any way) is not really necessary. It is enough to create them once when the analyzer is introduced, and after that there is no need to modify them. Just try to keep the project at 0 analyzer messages from then on.

Note. But how do you keep the project at 0 analyzer messages if every now and then the analyzer gives a false warning? For that case there are several ways to suppress individual warnings. They are discussed below.

So what is this mechanism for?

In case anyone has not got it yet, here is a very concrete example. You add all messages to this database (a check will then produce 0 messages). Then the analyzer is launched on the build server every night (see how to configure the launch of the analyzer on the build server), and it produces only new messages, that is, messages for the new code the team wrote during the day. These messages are saved not only in a .plog file (the analyzer report in XML format) but also in a text file next to it. This text file can then be mailed to the project participants using any suitable program, for example, SendEmail.

In the morning, people see the analyzer messages in their mail and can fix the errors without even installing PVS-Studio on their machines. And if you do want to open the report (.plog), it is available on the build server. This trick can save a good deal of money on PVS-Studio licenses.
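Any mail tool will do for the nightly mailing. As one possible alternative to SendEmail, here is a minimal sketch that wraps the text report in an e-mail with the Python standard library. The function name build_report_mail, the addresses, and the assumption that the text report holds one message per non-empty line are all illustrative and not taken from PVS-Studio.

```python
from email.message import EmailMessage


def build_report_mail(report_text: str, sender: str, recipients: list) -> EmailMessage:
    """Wrap the nightly text report in a plain-text e-mail."""
    # Assumption: one analyzer message per non-empty line of the text report.
    count = sum(1 for line in report_text.splitlines() if line.strip())
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = f"Nightly static analysis: {count} new message(s)"
    msg.set_content(report_text)
    return msg
```

The resulting object can be handed to smtplib, for instance with smtplib.SMTP(host).send_message(msg), where host is your own mail server.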

By the way, you can configure which messages (more precisely, which levels) are included in the text report. This is done with the OutputLogFilter option on the Specific Analyzer Settings tab of the PVS-Studio settings. We recommend including General Analysis Level 1 and Level 2 messages in the text file.

A small note about incremental analysis

We still recommend that developers first of all use PVS-Studio on their local machines as well, in incremental analysis mode. This is the “Analysis after Build (Modified files only)” checkbox in the PVS-Studio menu. When it is enabled, the analyzer follows your work and automatically runs on the files that have just been compiled (it tracks modification of .obj files). If the analyzer finds nothing, you will not even notice that it ran. If it does find something, a notification with the messages pops up.
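The idea of picking files by the modification of their .obj files can be illustrated with a short sketch. It does not describe how PVS-Studio is implemented internally; the function changed_objects and the notion of a stored timestamp of the previous run are assumptions made for the example.

```python
from pathlib import Path


def changed_objects(obj_dir: Path, last_run: float) -> list:
    """Return .obj files modified after the timestamp of the previous analysis."""
    return sorted(
        p for p in obj_dir.rglob("*.obj")
        if p.stat().st_mtime > last_run  # rebuilt since the last run
    )
```

Only the sources behind the returned object files would then need to be analyzed again, which is why the check stays fast enough to run after every build.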

Incremental analysis supports the database of "uninteresting" messages. If you have such a database, incremental analysis reports messages only for new code. If you do not, it reports all messages for the file being analyzed.

The incremental analysis mode lets you fix errors as soon as they appear, before they even reach the version control system. And as we know from McConnell, fixing an error at this stage is the cheapest. So, where possible, we recommend using PVS-Studio both in daily checks on the server and in incremental analysis mode on the programmers' machines.

How do you fix the messages that are in the database?

So, you have introduced static analysis on the project, and the analyzer gives you no more than a few messages a day, which you and the team fix right away. Fine. But then a free week comes along that can be spent fixing old errors. How do you get to them? There are two options.

You can use the “Suppress Messages...” command to bring up the dialog, which contains a “Display Suppressed Messages” checkbox. Turn it on and the suppressed messages will appear.
Or, if you have a daily launch configured on the build server, simply go to the folder with the results of the latest analysis, where you will find the following files:
1. SolutionName.plog - a log with only the new messages;
2. SolutionName.plog.txt - a text log with the same contents as SolutionName.plog;
3. SolutionName_WithSuppressedMessages.plog - all messages, including the “uninteresting” ones.

SolutionName_WithSuppressedMessages.plog is the file to work with. Open it and you get all the messages. At first this file will be large, because there are many messages. But if you fix them at least from time to time, it will eventually become small, and perhaps you will be able to do without it (and, accordingly, without the .suppress files) altogether.

It is important to understand this: if you have the time and the opportunity, you can always go back to the old "uninteresting" messages and fix them. We recommend doing so, because errors are errors and they should be fixed.

Doesn't this conflict with the Mark As False Alarm feature?

PVS-Studio has a Mark As False Alarm command. It adds a comment of the form //-V501 to the line the message points to. When the analyzer encounters such a comment, it does not issue a V501 message on that line.
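To make the effect of such a comment concrete, here is a small sketch of how a tool could honor it. It assumes the marker is written as //-V501 (two slashes, a hyphen, then the diagnostic code). The regular expression and the function names suppressed_codes and should_report are hypothetical and are not part of PVS-Studio.

```python
import re

# Matches a false-alarm marker such as //-V501 anywhere on a source line.
MARKER = re.compile(r"//-V(\d+)")


def suppressed_codes(line: str) -> set:
    """Return the diagnostic codes marked as false alarms on this source line."""
    return {"V" + num for num in MARKER.findall(line)}


def should_report(code: str, line: str) -> bool:
    """A message is reported unless its own code is marked on the line."""
    return code not in suppressed_codes(line)
```

Note that the marker silences only the named diagnostic on that one line; any other diagnostic on the same line is still reported.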

The mechanism described in this article and Mark As False Alarm do not conflict with each other, but they serve somewhat different purposes. Mass suppression of uninteresting messages is meant for marking messages in bulk when the analyzer is introduced into a project. Mark As False Alarm is meant to stop the analyzer from complaining about individual fragments.

In principle, it was possible to mark all the code as False Alarm before as well, but people are usually afraid to make that many edits to the code. Besides, it is not clear how to work with the old errors afterwards: remove everything that is marked as False Alarm? But what if some of those marks really are false alarms?

But using the mass suppression mechanism to suppress individual messages is also wrong.

In short, these are two mechanisms that solve different problems. Do not confuse them.

Conclusion

So, the approach to introducing static analysis into a live project looks like this:

1. Mark all messages as “uninteresting” using the “Suppress Messages...” command.
2. From the next run on, the analyzer shows messages only for new code.
3. When needed, you can always go back and fix the errors that were hidden during the introduction.

This lets you start benefiting from static analysis right away.


Source: https://habr.com/ru/post/252493/
