⬆️ ⬇️

Checking the Samba project using PVS-Studio under Linux

Checking Samba with PVS-Studio under Linux If you followed the news on the latest developments in the field of analysis tools for C / C ++ code, then you must have heard about the PVS-Studio tool. I learned about it through articles that developers publish on their website and in which they talk about audits of open source projects. To date, an impressive number of projects have been tested, including the Linux kernel, Qt, Unreal, etc., and each time they manage to find interesting bugs that live in the code for a long time without being detected by anyone. Typos, inaccurate copying, undefined behavior, meaningless code, syntax errors that are miraculously missed by the compiler ...



As John Carmack said, " Everything that is valid from a syntax point of view and skipped by the compiler will eventually end up in your code base . "



Unfortunately, only Windows is declared as a supported operating system. The tool is available as a plug-in for Visual Studio or as a separate application if you do not have Visual Studio installed. My first experience with the analyzer came in 2014, when I checked with its help the relatively large base of C ++ code that was used for internal needs by the computer graphics laboratory at my university in Lyon ( LIRIS ). We then conducted the development in Visual Studio (I usually do not use it), and I thought, why not try the analyzer in action. I was pleased with the results and continued to follow the appearance of new articles on the PVS-Studio website.



Two years later (during this time several articles of PVS-Studio were published) I started working on the Samba project. Its total size is about 2 million lines of C code, and I thought it would be interesting to test it with PVS-Studio. In the static analyzer code, in principle, there should not be many platform-specific sections, so I began to look for a way to implement such a test. The analyzer works with preprocessed code, so it needs to run the source code through the preprocessor, and for this it needs to know about all preprocessor flags, macros, and paths of the included files. Automatic collection of such data can be time consuming. To solve it, I wrote a small strace-based script to track the compiler calls — in this case, you can check regardless of the build tool used. The latest version of my solution can be found on github .

')

I sent this script to the developers of PVS-Studio, and after a small correspondence they sent me an experimental build of PVS-Studio under Linux (for which I thank them again!). Now the script carries out all stages of the analysis: from collecting information on compilation keys to directly analyzing, displaying and filtering the results.



And now let's figure out how to work with this script.



To avoid having to specify the location of the license file and the executable file each time you start, you can set up environment variables.

$ export PVS_LICENSE=~/prog/pvs/PVS-Studio.lic $ export PVS_BIN=~/prog/pvs/PVS-Studio 


Go to the project directory and generate a configuration file for your C ++ 11 project.

 $ pvs-tool genconf -l C++11 pvs.cfg 


If necessary, set the build parameters before compiling, then enable tracking of the current build (the build command should be specified after the - characters).

 $ pvs-tool trace -- make -j8 


After this, the file “strace_out” will be generated, which contains all the necessary information. At the analysis stage, all the compilation units and the preprocessor flags will be extracted from this file, after which they will be checked in PVS-Studio.

 $ pvs-tool analyze pvs.cfg pvs-tool: deleting existing log pvs.log... 001/061 [ 0%] analyzing /hom../rtags/src/ClangIndexer.cpp... 002/061 [ 1%] analyzing /hom../rtags/src/CompilerManager.cpp... 003/061 [ 3%] analyzing /hom../rtags/src/CompletionThread.cpp... 004/061 [ 4%] analyzing /hom../rtags/src/DependenciesJob.cpp... <...> 061/061 [98%] analyzing /hom../rtags/src/rp.cpp... pvs-tool: analysis finished pvs-tool: cleaning output... pvs-tool: done (2M -> 0M) 


At the stage of cleaning, the script will remove duplicate lines, which will significantly reduce the file size with the analysis results.



Now you can view the results grouped by file, to which they belong:

 $ pvs-tool view pvs.log 


The format of the output file is similar to the format of gcc / make, that is, you can work with it "as is". For example, you can open it in the Emacs editor and apply standard built-in error transition functions (goto-error) to it. You can also turn off individual diagnostics, for example:

 $ pvs-tool view -d V2006,V2008 pvs.log 


By default, warnings of only 1 level are displayed, but this parameter can be changed using the -l command .



Detailed help is invoked with the -h command.



PVS-Studio found many problem areas in Samba. Most of them turned out to be false positives, but this is unavoidable when checking the volume code base, no matter what analyzer you use. It is important that these errors were found. Below I give the most interesting of them, as well as ways to fix them in the format of diff-patches.

 - if (memcmp(u0, _u0, sizeof(u0) != 0)) { + if (memcmp(u0, _u0, sizeof(*u0)) != 0) { printf("USER_MODALS_INFO_0 struct has changed!!!!\n"); return -1; } 


In this example, the closing bracket is set incorrectly. The result of comparing sizeof with zero was used as the memory size for the memcmp function (always 1 byte). We are also interested in the size of the type pointed to by the pointer u0 , and not the size of the pointer itself.

  handle_main_input(regedit, key); update_panels(); doupdate(); - } while (key != 'q' || key == 'Q'); + } while (key != 'q' && key != 'Q'); 


In this code, it is necessary to exit the loop if the letter 'q' occurs in any case.

  uid = request->data.auth.uid; - if (uid < 0) { + if (uid == (uid_t)-1) { DEBUG(1,("invalid uid: '%u'\n", (unsigned int)uid)); return -1; } 


Here, the uid_t variable is checked for a negative value.



The sign of a variable of type uid_t is not defined in POSIX. It is defined as an unsigned 32-bit type on Linux, therefore, <0 is always false.



In the case of an unsigned version of the uid_t type, comparing versus uid == -1 will always implicitly cast -1 to an unsigned type, resulting in a comparison with both the signed and unsigned type uid_t will always give the correct result. I made the transformation explicit: in our case, the less magic, the better.

  DEBUG(4,("smb_pam_auth: PAM: Authenticate User: %s\n", user)); - pam_error = pam_authenticate(pamh, PAM_SILENT | - allow_null_passwords ? 0 : PAM_DISALLOW_NULL_AUTHTOK); + pam_error = pam_authenticate(pamh, PAM_SILENT | + (allow_null_passwords ? 0 : PAM_DISALLOW_NULL_AUTHTOK)); switch( pam_error ){ case PAM_AUTH_ERR: DEBUG(2, ("smb_pam_auth: PAM: ....", user)); 


The usual confusion with the priority of operators.

  gensec_init(); dump_args(); - if (check_arg_numeric("ibs") == 0 || - check_arg_numeric("ibs") == 0) { + if (check_arg_numeric("ibs") == 0 || + check_arg_numeric("obs") == 0) { fprintf(stderr, "%s: block sizes must be greater that zero\n", PROGNAME); exit(SYNTAX_EXIT_CODE); 


In this example, the same thing was checked twice.

  if (!gss_oid_equal(&name1->gn_type, &name2->gn_type)) { *name_equal = 0; } else if (name1->gn_value.length != name2->gn_value.length || - memcmp(name1->gn_value.value, name1->gn_value.value, + memcmp(name1->gn_value.value, name2->gn_value.value, name1->gn_value.length)) { *name_equal = 0; } 


In this code, the memcmp function is called with the same pointer as its parameters, with the result that the same memory block is compared to itself.

  ioctl_arg.fd = src_fd; ioctl_arg.transid = 0; ioctl_arg.flags = (rw == false) ? BTRFS_SUBVOL_RDONLY : 0; - memset(ioctl_arg.unused, 0, ARRAY_SIZE(ioctl_arg.unused)); + memset(ioctl_arg.unused, 0, sizeof(ioctl_arg.unused)); len = strlcpy(ioctl_arg.name, dest_subvolume, ARRAY_SIZE(ioctl_arg.name)); if (len >= ARRAY_SIZE(ioctl_arg.name)) { 


Here, the number of array elements was passed as the memset parameter, not the size in bytes.

  if (n + IDR_BITS < 31 && - ((id & ~(~0 << MAX_ID_SHIFT)) >> (n + IDR_BITS))) { + ((id & ~(~0U << MAX_ID_SHIFT)) >> (n + IDR_BITS))) { return NULL; } 


Using negative values ​​as the left operand of the left shift operation leads to undefined behavior according to the C standard.

  if (cli_api(cli, param, sizeof(param), 1024, /* Param, length, maxlen */ - data, soffset, sizeof(data), /* data, length, maxlen */ + data, soffset, data_size, /* data, length, maxlen */ &rparam, &rprcnt, /* return params, length */ &rdata, &rdrcnt)) /* return data, length */ { 


In this example, we deal with data that was once an array allocated on the stack, but then turned into a buffer allocated on the heap, while the sizeof operation was forgotten to correct.

  goto query; } - if ((p->auth.auth_type != DCERPC_AUTH_TYPE_NTLMSSP) || - (p->auth.auth_type != DCERPC_AUTH_TYPE_KRB5) || - (p->auth.auth_type != DCERPC_AUTH_TYPE_SPNEGO)) { + if (!((p->auth.auth_type == DCERPC_AUTH_TYPE_NTLMSSP) || + (p->auth.auth_type == DCERPC_AUTH_TYPE_KRB5) || + (p->auth.auth_type == DCERPC_AUTH_TYPE_SPNEGO))) { return NT_STATUS_ACCESS_DENIED; } 


Before correction, the condition was always true and the function always returned the status “access denied”.

 - Py_RETURN_NONE; talloc_free(frame); + Py_RETURN_NONE; } 


Py_RETURN_NONE is a macro that hides the return statement . In this snippet, where a binding to Python code is used, many functions returned a result before the dynamically allocated memory was released. This problem was encountered in dozens of functions.

  int i; - for (i=0;ARRAY_SIZE(results);i++) { + for (i=0;i<ARRAY_SIZE(results);i++) { if (results[i].res == res) return results[i].name; } return "*"; 


In this code, the condition of the for operator always turned out to be true.

  int create_unlink_tmp(const char *dir) { + if (!dir) { + dir = tmpdir(); + } + size_t len = strlen(dir); char fname[len+25]; int fd; mode_t mask; - if (!dir) { - dir = tmpdir(); - } - 


And here the dir pointer was used before checking for zero.



On the whole, I was satisfied with the PVS-Studio analyzer and willingly recommend it for use. Unfortunately, it is officially unavailable under Linux, but, as I understand it, it is enough just to write to the developers, and they will help you with customization for this operating system :)

Source: https://habr.com/ru/post/280856/



All Articles