📜 ⬆️ ⬇️

Overview of the static analyzer CppCat

Most recently, a new static C ++ code analysis tool CppCat was presented at Habré. Its authors have long and in detail told about their previous project (PVS-Studio) in Habré. I had a twofold attitude towards him - on the one hand, of course, static analysis is needed. With him is better than without him. On the other hand, PVS-Studio was frightened by its scale, such “enterprise”, and also by the price. I well imagined how a person in a 50-person project team can buy it, but what a single-person or a 5-person team can do in a single developer — I did not understand. I remember that I once suggested that the authors deploy “PVS as a service” in the cloud and sell access to it in time. But they went their own way and made a cut-down version for relatively little money (such a budget is quite realistic to “break through” or even buy just for yourself).

Let's see if the game is worth the candle.


The first thing that I don’t like at once during installation is the lack of integration with Visual Studio 2005 \ 2008. No, I understand that the old versions of the studio had a completely different API for extensions and maybe additional efforts are needed to support it, but the C ++ language is not at all young, many of the projects I meet still live on VS2008 and nobody does not plan to migrate them, if necessary, about a dozen lines of edits per quarter. So, to work with legacy, a full-fledged PVS-Studio is still needed. Well, OK.
')
The minimalism in integration with Visual Studio pleases: a menu for a couple of points, one toolbar - nothing superfluous. The “Enable Analysis on Build” checkbox is turned on by default. What for? It seems to me more logical not to slow down the speed of each build, but to make an analysis of the entire project, for example, before committing to the repository. You can make yourself a reminder on the pre-commit svn \ git clients hook that it would be nice to check the new code with a static analyzer.

For the objectivity of the tests, I chose three projects:


Notepad ++ and ZeroMQ were chosen because I had little experience in developing both - literally a couple of patches here and there, but at least I don’t have to figure out how to compile and test (I knew exactly about the possibility of building under VS2010).

Notepad ++


86 files in the project, 2 minutes for a full scan. 48 reports of suspicious code from CppCat (an average of 0.55 messages per file)

A common mistake - in an attempt to transfer bytes, the logical 'and' and bitwise 'and' are confused.
ToAscii(wParam,(lParam >> 16) && 0xff,keys,&dwReturnedValue,0); // V560. A part of conditional expression is always true/false. 


Check of unsigned type for negativity
 WPARAM wParam ... if(wParam<0) // V547. Expression is always true/false. 


Potential addressing of an array cell with index '-1'
 j=lstrlen(BGHS[SelfIndex].editstring); BGHS[SelfIndex].editstring[j-1]=0x00; // V557. Array overrun is possible. 

It’s not at all a fact that this is a mistake - it’s possible to check the string for emptiness before that, but it seems to me better not to hope for a chance.


Double assignment One thing is clearly superfluous.
 lpcs = &cs; lpcs = (LPCREATESTRUCT)lParam; // V519. The 'x' variable is assigned values twice successively. Perhaps this is a mistake. 


Nonsense in building conditions.
 if(MATCH) { return returnvalue+MAX_GRIDS; } if(!MATCH)) // V560. A part of conditional expression is always true/false. { return -1; } 


Incorrect check for correctly allocated memory
 char *source = new char[docLength]; if (source == NULL) // V668. There is no sense in testing the pointer against null, as the memory was allocated using the 'new' operator. The exception will be generated in the case of memory allocation error. return; 

According to my personal statistics - the most common error in any code in C ++ (this project occurs 24 times). Roots to C memory allocation functions that returned NULL on error. But when using the new operator in C ++, in order to check “did the available memory run out?”, You need to catch the exception std :: bad_alloc, and not check the result for NULL.


Extra paranoia
 TCHAR intStr[5]; ... if (!intStr) // V600. Consider inspecting the condition. The 'Foo' pointer is always not equal to NULL. 

If in the program it comes to the fact that the pointer to a locally declared array of five characters becomes NULL - something went so bad that it was better to just fall.


Double check of the same condition
 do { ... } while(openFound.success && (styleAt == SCE_H_DOUBLESTRING || styleAt == SCE_H_DOUBLESTRING) && searchStartPoint > 0); 

They tried to find quotes (double and single), but in fact double check doubles. Kopipasta with a plan to replace the second check on SCE_H_SINGLESTRING, which was forgotten in the process. One of the most useful bugs found is really a bug in the XML parser, and not just an “improvement tip”.


False (in my opinion) operation

Two consecutive if blocks with the same condition
 //Setup GUI if (!_beforeSpecialView.isPostIt) // V581. The conditional expressions of the 'if' operators situated alongside each other are identical. { ... } //Set old style if not fullscreen if (!_beforeSpecialView.isPostIt) { ... } 


Here I disagree with CppCat. Who knows why the programmer decided to write like this? The code is valid, it works well. It can not be a mistake "forgot to remove!", Because otherwise there would have been "else." Just a design code.

A few errors of "Priority of the '&&' operation is higher than that of the '|| operation
 printPage = (!(_pdlg.Flags & PD_PAGENUMS) || (pageNum >= _pdlg.nFromPage) && (pageNum <= _pdlg.nToPage)); 


CppCat stubbornly believes that programmers do not know the priorities of operations. In this case, both by meaning and by hyphenation, it is clear that the programmer knew what he was doing. Of course, “explicit is better than implicit,” but there are no mistakes here.

The analyzer does not assume that meaningless code can make sense with other compilation keys.
 if (unicodeSupported) ::DispatchMessageW(&msg); else // V523. The 'then' statement is equivalent to the 'else' statement ::DispatchMessage(&msg); 


This is because "#define DispatchMessage DispatchMessageW" is previously written. But after all, what's the matter - this replacement is enabled by conditional compilation macros. In this case, the code really does not make sense, but if the project is compiled with other keys, then DispatchMessage will point to DispatchMessageA, which means the code will make sense.
I, of course, find fault with it: it’s unfair to require the static analyzer to guess “what else compilation options could there be”. But from the programmer to think about it does not hurt.

Output by Notepad ++
Of the 48 error messages in my opinion, about 38 really deserve attention and some changes in the code. Of these, 30 places are obvious bugs, and 8 are useful, but optional stylistic edits. I regard 10 messages from CppCat as false in the context of code logic. Overall, not bad.

ZeroMq


72 files, 1 minute per analysis.

Exactly 1 found suspicious place. And in fact - false.

 rc = pipe_->write (&probe_msg_); // zmq_assert (rc) is not applicable here, since it is not a bug. pipe_->flush (); rc = probe_msg_.close (); // V519. The 'x' variable is assigned values twice successively. Perhaps this is a mistake. 


The analyzer is upset by the fact that the poor rc value is not needed by anyone between its first and second assignment. Yes, not necessary. But:
  1. It is convenient to set breakpoint when debugging and see what it is equal to.
  2. There is a comment indicating what was here before (or could be) the rc check. He also says that this check is definitely not needed.


The analyzer is not obliged, of course, to read the comments, and therefore does not know why everything is ok here.

Conclusions on ZeroMq
Not a single useful message about a suspicious place in the code. Well, I said from the very beginning that I had a very high opinion about the code of this library.

My project


420 files (in 8 subprojects), 9 minutes for analysis, 99 warnings. On average, it is 0.23 warnings per file.

Stupid mistake in using type std :: list
 void CHttpDownloaderBase::GetResponseHeader(const std::string& strHeaderName, std::list<std::string>& listValues) const { listValues.empty(); // V530. The return value of function 'Foo' is required to be utilized. ... 

The confusion is due to the fact that in some frameworks the empty method of the container type actually clears the container. But for std :: list, this is just a void check. Need to replace with clear ()


Check pointer after the first use
 const char *xml_attr(xml_t* xml, const char *attr) { ... const char *name = xml->name; if (! xml || ! xml->attr) // V595. The pointer was utilized before it was verified against nullptr. return NULL; 


Undefined behavior
 while (*(n = ++s + strspn(s, XML_WS))) // V567. Undefined behavior. The variable is modified while being used twice between sequence points. {...} 

In fact, on the C ++ compiler from VS2010 this code works as intended. But formally, yes, the standard says "indefinite behavior." Better fix.


Notorious operations priorities
 if (!text[++r] == '\\') // V562. It's odd to compare a bool type value with a value of N. { break; } if (!text[++r] == 'u') // V562. It's odd to compare a bool type value with a value of N. { break; } 

It's simple, the operator "!" has higher priority than "=="


Pointer arithmetic
 TCHAR *ch = path + lstrlen(path) - 1; // V532. Consider inspecting the statement of '*pointer++' pattern. Probably meant: '(*pointer)++'. while (*ch && *ch != '\\') *ch--; 

The assumption of CppCat that I might have wanted to change the value on the pointer — wrong, I wanted to change the pointer itself. But there is no less benefit from this message - it indicates an extra operation of taking a value by the pointer - "*". It is not used here, which means you can just write "ch--"


Extra condition
 while (*p1 != 0 && *p1 == _T(' ')) // V590. Consider inspecting this expression. The expression is excessive or contains a misprint. p1++; 

It's simple - if you compare p1 with a space, then the first check is not needed.


Excess operation
  m_dwInPartPos = 0; m_pcOutData = NULL; ... m_dwInPartPos = 0; // V519. The 'x' variable is assigned values twice successively. 


Not that enum
 if (type == eRT_unk) // V556. The values of different enum types are compared. return false; 

One of the biggest afflictions of C ++ (well, at least before the enum classes appears in the new standard) is that enum is actually not a separate entity, but a set of numbers. Having eRt_unk in one enum, and in the second - eResourceUnknown, I was just lucky that they both were equal to 0. The error was in the code for years, although everything was pure luck and it worked.


So where do without the classics of the genre: = instead of == in comparison
 if (scheme == ZLIB_COMPRESSION) out.push(boost::iostreams::zlib_compressor()); else if (scheme = GZIP_COMPRESSION) out.push(boost::iostreams::gzip_compressor()); // V559. Suspicious assignment inside the condition expression of 'if/while/for' operator. 

Well, what can I say - a stupid mistake, no excuses.


False alarms


Array bounds

 int len = lstrlen(szLogDir); TCHAR ch = szLogDir[len-1]; // V557. Array overrun is possible. 

Indeed, the variable len is not checked for "> 0", but just above the code, the szLogDir string itself is checked for non-emptiness. The second check will not add reliability.

Suspicious failures

I found such a piece in my code:
 if (m_packets[i] != NULL) delete m_packets[i]; 


CppCat did not say anything about it, although, in my opinion, PVS-Studio in such cases said that deleting NULL is safe.

Conclusion on my project
Of the 99 warnings, about 65 were in the case, the distribution of bugs and simply "improving" code is somewhere around 50/50.

findings


pros


Minuses



It could be a minus, but no!

Lack of support for 64-bit checks.

I have never understood why the developers of PVS-Studio have always focused on them so much. Under Windows x64, 32-bit program assemblies work, and accordingly, the issue of creating a 64-bit version becomes more common either when you need to write a driver, or when a program needs more than 3 GB of RAM. In my statistics it is about 10-15% of projects.

Lack of integration with MsBuild, etc.

It seems logical for me to check the code manually, before committing to the repository. Not with every build (slow), not on the build server (not interactive), but like this. Time is running out a lot, before colleagues do not disgrace - it is convenient.

Could be a plus, but no!

Austerity in the settings

CppCat turned out very severe. Not enough flexibility: you can exclude from the scan files, folders, a specific line of the file. But how to exclude part of the file? How to disable checking for a specific error or class of errors? And if not also globally, but for part of the files? This is either impossible, or I have read little documentation.

Wishlist


Hotel 1

It would be cool to be able to store information about warnings excluded from checking not in the comments to the code, but somewhere in an external file that could be excluded from the commit in the repository. Not all colleagues can use the same static analysis tool, and reading comments like "// - V: is_test_ok & =: 501" in the code and can be annoyed by their purpose.

Hotel 2

It would be nice to add the item “Copy” to the context menu of the warning, well, hot key Ctrl + V. It would be very convenient to copy-paste these warnings into the text of the comment to the commit to the repository. Now you can, of course, open the documentation and copy the title from there - but there the text is generalized, and the test results indicate specific lines in the code, the names of the variables are convenient and understandable.

Final conclusions


What is good about CppCat is the simplicity of calculating whether a purchase tool is worth it. Let's take the average salary of the average C ++ programmer in a vacuum from a review recently run on Habré: 80,000 rubles ($ 2,370). So 1 hour of his work costs about $ 14. I think you will agree that finding and correcting a bug similar to the one described above (and even if you do not know exactly what you are looking for) is at least an hour of work. The cost of CppCat is $ 250. It's like 17 hours of work. If in your project you plan to spend on bugfixing for more than 17 hours (and what, are there projects where you plan less?), Then the purchase is justified.

In general, thanks to the authors of CppCat for taking care of small projects and individual developers. I hope for the implementation of "hotelok" and integration with VS2008.

Source: https://habr.com/ru/post/209984/


All Articles