⬆️ ⬇️

Leo Tolstoy and static code analysis

PVS-Studio vs Apache HTTP Server

This time using PVS-Studio we checked the Apache HTTP Server. As expected, they found errors in it. There are few mistakes. This is also expected.



Other developers face the same situation, testing PVS-Studio on their projects. Unfortunately, the first conclusion that one would like to make, after seeing only a few mistakes, that such a tool is of little use to him. Now I have come up with a good analogy that explains why this is not so.



War and peace

Imagine the following situation. We have a work of Leo Tolstoy, for example, " War and Peace ". We want to try out the spelling engine built into Microsoft Word on this book. We run the analysis, we detect only a few real errors and a large number of false positives. In addition, Leo Tolstoy loved writing long sentences. Microsoft Word, failing to understand the text, will emphasize a lot of green, suggesting the presence of punctuation errors. We look at it all and turn up our nose. We do not like that we almost did not find any errors. We do not like that many names are considered to be incorrect. We do not like the punctuation analysis. We conclude that the spell checker in Word is absolutely useless in the work thing.



Let us examine the incorrectness of such a study tool.

')

We forget that people have already worked on the mistakes in the book. And they spent a lot of time and energy on it. Leo Tolstoy rewrote his drafts and exterminated mistakes. The editor worked, turning the text into a book. Something was fixed in subsequent editions.



Of course, the program will not find all the errors in the book. People can do it better. But the program does it much faster, tremendously reducing the waste of human power. The true benefits of such a tool in its regular use. I wrote a paragraph, noticed an underlined error and instantly corrected it. And now, rereading the text, a person can correct not banal typos, but focus on the content and subtle mistakes.



Remarks about false positives are also not convincing. It is enough to add an unknown word to the program once in the dictionary, and it will no longer disturb the author of the text. Well, for the other obviously false warnings, you can always click the right mouse button and specify “ignore”.



I think it's silly to argue about the benefits of text-checking mechanisms built into such programs as Word, Thunderbird, TortoiseSVN. We use them and do not experience that they will find few errors in the book “War and Peace”.



Identical situation with a static analysis of the source code of programs. It makes no sense to check a project like Apache HTTP Server and try to evaluate the possible benefits of analysis. Apache HTTP Server code is old by programmer standards. As a project, the Apache HTTP Server has been around for over 15 years. The quality of the code is confirmed by the huge number of projects built on its basis. Naturally, a huge number of errors were corrected due to the wide use of this code by many programmers and a long period of development.



It is logical to assume that many of the errors found earlier could be found much easier if static analysis were used. The analogy with Word is complete. It is more useful to constantly look for mistakes than to do it one-time.



As in the Word editor, PVS-Studio false errors are easy to eliminate. Simply click on the false message and hide it (a special comment will automatically be added to the code).



Let's make the last comparison on the speed of finding errors. Word Editor can underline errors immediately. PVS-Studio analyzer does this after compiling files. Previously, the syntax is too complicated to parse the unfinished code. However, this is enough. You can consider PVS-Studio as a means of increasing the warnings issued by the compiler. By the way, now PVS-Studio can do this in different versions of Visual Studio: 2005, 2008, 2010. So I invite you to try incremental analysis .



In general, the analogy with Word, in my opinion, is complete. If someone does not agree, then he is ready to continue the discussion in the comments or in the mail (karpov [@] viva64.com).



I finished the main idea that I wanted to share uncontrollably. However, I’ll disappoint readers if I don’t write anything about errors found in the Apache HTTP Server. So now a little about the errors found.



Here is an example of excess sizeof ().

PSECURITY_ATTRIBUTES GetNullACL(void) { PSECURITY_ATTRIBUTES sa; sa = (PSECURITY_ATTRIBUTES) LocalAlloc(LPTR, sizeof(SECURITY_ATTRIBUTES)); sa->nLength = sizeof(sizeof(SECURITY_ATTRIBUTES)); ... } 


PVS-Studio diagnostic message: V568 This is the case of the sizeof () operator is the sizeof (SECURITY_ATTRIBUTES) 'expression. libhttpd util_win32.c 115



The size of the SECURITY_ATTRIBUTES structure is set incorrectly. As a result, correct work with such a structure is impossible.



However, I suspect that most of the examples that I will give are in code that rarely gets control. Therefore, do not worry much. This is the following example. The error is in the function associated with error handling.

 apr_size_t ap_regerror(int errcode, const ap_regex_t *preg, char *errbuf, apr_size_t errbuf_size) { ... apr_snprintf(errbuf, sizeof errbuf, "%s%s%-6d", message, addmessage, (int)preg->re_erroffset); ... } 


PVS-Studio diagnostic message: V579 It is possibly a mistake. Inspect the second argument. libhttpd util_pcre.c 85



The meaning is as follows. Instead of the buffer size, the pointer size is passed to the function. As a result, an error message cannot be generated correctly. In practice, the message will be 3 or 7 characters long.



There are also classic copy-paste errors.

 static int magic_rsl_to_request(request_rec *r) { ... if (state == rsl_subtype || state == rsl_encoding || state == rsl_encoding) { ... } 


PVS-Studio diagnostic message: V501 There are identical sub-expressions 'state == rsl_encoding' to the left and to the right of the '||' operator. mod_mime_magic mod_mime_magic.c 787



In fact, the comparison should be:

 if (state == rsl_subtype || state == rsl_separator || state == rsl_encoding) { 


There are errors that will manifest themselves only in the Windows operating system.

 typedef UINT_PTR SOCKET; static unsigned int __stdcall win9x_accept(void * dummy) { SOCKET csd; ... do { clen = sizeof(sa_client); csd = accept(nsd, (struct sockaddr *) &sa_client, &clen); } while (csd < 0 && APR_STATUS_IS_EINTR(apr_get_netos_error())); ... } 


PVS-Studio diagnostic message: V547 Expression 'csd <0' is always false. Unsigned type value is never <0. libhttpd child.c 404



The accept function in Visual Studio header files returns a value of unsigned type SOCKET. Therefore, the 'csd <0' check is invalid, because its result is always false (false). The returned values ​​must be explicitly compared with various constants, for example, with SOCKET_ERROR.



There are a number of errors that can be attributed to potential vulnerabilities. I do not know, maybe at least theoretically use them to attack. Nevertheless, I will give a couple of similar examples.

 typedef size_t apr_size_t; APU_DECLARE(apr_status_t) apr_memcache_getp(...) { ... apr_size_t len = 0; ... len = atoi(length); ... if (len < 0) { ... } else { ... } } 


PVS-Studio diagnostic message: V547 Expression 'len <0' is always false. Unsigned type value is never <0. aprutil apr_memcache.c 814



The condition “len <0” in this code is always false, since the variable 'len' is of unsigned type. Accordingly, if you input a string with a negative number, it will probably fail.



Still in Apache there is a lot of code where various buffers with data are reset. Most likely, this is due to security. However, zeroing itself often does not occur due to an error of the following form:

 #define MEMSET_BZERO(p,l) memset((p), 0, (l)) void apr__SHA256_Final(sha2_byte digest[], SHA256_CTX* context) { ... MEMSET_BZERO(context, sizeof(context)); ... } 


PVS-Studio diagnostic message: V512 A call of the 'memset' function will. apr sha2.c 560



Only the first bytes in the buffer are reset, because the sizeof () operator returns the size of the pointer, not the buffer with data. I counted such errors at least 6 pieces.



The remaining mistakes are more boring, so I will not write about them. Thanks for attention. Come try PVS-Studio. And we still provide keys for the developers of free open-source projects and those who really want to try the analyzer in full force. Write to us.

Source: https://habr.com/ru/post/124611/



All Articles