📜 ⬆️ ⬇️

Trust mathematicians but verify

PVS-Studio.  You shall not pass!
I am sometimes puzzled, considering the mistakes in the next software project. Many of these errors live in projects for years. You look at a hundred bloopers in the code and wonder how the program works at all. And it somehow works. She even enjoyed. Moreover, I'm not talking about the code that draws a pokemon in the game. And, for example, about math libraries. Yes, you guessed it. This article is about checking the code of the Scilab math library.

Scilab


Today, we will look at suspicious code fragments in the Scilab math package. The analysis was performed using the PVS-Studio tool.

Scilab is a package of applied mathematical programs that provides a powerful open environment for engineering (technical) and scientific calculations [ Wikipedia ].

Official website: http://www.scilab.org/
')
Many tools are available in the system:Get ready. The article will be long. It's not my fault that there are so many different dirty tricks. And I want to show different classes of errors.

Of course, the deficiencies found are not related to mathematics. Perhaps all the algorithms in the library are correct and efficient. However, since C ++ language was chosen for writing, we must not forget that in addition to errors in the algorithms, there are also typos, dereferencing null pointers and other errors. Actually, it does not matter to the user whether he will encounter a logical error in the numerical algorithm or he will become a victim of an uninitialized variable.

Yes, static analysis finds only some kinds of errors. But since they are easy to detect, why deny yourself such a pleasure. It is better to correct another 10% of errors than not to fix anything at all.

So, let's see what the PVS-Studio analyzer told me about the Scilab project.

Buffer that does not exist


Buffer that does not exist
int sci_champ_G(....) { .... char * strf = NULL ; .... if ( isDefStrf( strf ) ) { char strfl[4]; strcpy(strfl,DEFSTRFN); strf = strfl; if ( !isDefRect( rect ) ) { strf[1]='5'; } } (*func)(stk(l1), stk(l2), stk(l3), stk(l4), &m3, &n3, strf, rect, arfact, 4L); .... } 

PVS-Studio message: V507 Pointer to local array 'strfl' is stored outside the scope of this array. Such a pointer will become invalid. sci_champ.c 103

The reference to the temporary array 'strfl' is stored in the variable 'strf'. When exiting the “if () {...}” block, this array ceases to exist. However, the program works with the 'strf' pointer.

The behavior of such a program is unpredictable. You cannot work with an array that is no longer there. Of course, the program can work quite correctly. However, this is luck. The memory where the array was located may at any time be occupied by storing other arrays or variables.

Similar problems:

Something is not considered


 int C2F(pmatj) (char *fname, int *lw, int *j, unsigned long fname_len) { .... ix1 = il2 + 4; m2 = Max(m, 1); ix1 = il + 9 + m * n; .... } 

PVS-Studio warning: V519 The 'ix1' variable is assigned values ​​twice successively. Perhaps this is a mistake. Check lines: 2387, 2389. stack1.c 2389

There is something wrong with the variable 'ix1'. I think there is some typo here.

At the beginning of the test, then initialization


At the beginning of the test, then initialization

An interesting piece of code. You need to get some values ​​and then check them. But it turned out the opposite.
 int sci_Playsound (char *fname,unsigned long fname_len) { .... int m1 = 0, n1 = 0; .... if ( (m1 != n1) && (n1 != 1) ) { Scierror(999,_("%s: Wrong size for input argument #%d: ") _("A string expected.\n"),fname,1); return 0; } sciErr = getMatrixOfWideString(pvApiCtx, piAddressVarOne, &m1,&n1,&lenStVarOne, NULL); .... } 

PVS-Studio warnings: V560 A part of the conditional expression is always false: (m1! = N1). sci_playsound.c 66; V560 A part of the conditional expression is always true: (n1! = 1). sci_playsound.c 66

The variables m1 and n1 should get the values ​​when calling the getMatrixOfWideString () function. Then these variables should be checked. It just turned out that the check is carried out before the getMatrixOfWideString () function call.

At the time of testing, the variables m1 and n1 are equal to 0. The condition “if ((m1! = N1) && (n1! = 1))” is not met. As a result, verification does not affect the operation of the program.

Total The variables m1 and n1 are not checked.

Magic numbers


Imaginary friend
 void CreCommon(f,var) FILE *f; VARPTR var; { .... if ( strncmp(var->fexternal, "cintf", 4)==0 ) .... } 

PVS-Studio warning: V666 Consider the third argument of the function 'strncmp'. This is not the case. crerhs.c 119

The magic number 4 is used. And this number is incorrect. The string "cintf" has five characters, not four. Do not use such magic numbers.

I would make a special macro to calculate the length of string literals and use it like this:
 if ( strncmp(var->fexternal, "cintf", litlen("cintf"))==0 ) 

How to make a macro 'litlen', we will not discuss. There are lots of ways for every taste. The main thing is to get rid of the number.

Other wrong row sizes:

1, 2, 3, 4, 4, 6


Array, hole.
 int C2F(run)(void) { .... static int *Lpt = C2F(iop).lpt - 1; .... Lpt[1] = Lin[1 + k]; Lpt[2] = Lin[2 + k]; Lpt[3] = Lin[3 + k]; Lpt[4] = Lin[4 + k]; Lct[4] = Lin[6 + k ]; Lpt[6] = k; .... } 

PVS-Studio warning: V525 The code containing the collection of similar blocks. Check items '1', '2', '3', '4', '4' in lines 1005, 1006, 1007, 1008, 1009. run.c 1005

A typo in the sequence of numbers. As a result, one element of the array will remain uninitialized. You can get a lot of interesting mathematical results.

Code evolution


Evolution
 int write_xml_states( int nvar, const char * xmlfile, char **ids, double *x) { .... FILE *fd = NULL; .... wcfopen(fd, (char*)xmlfile, "wb"); if (fd < 0) { sciprint(_("Error: cannot write to '%s' \n"), xmlfile); .... } 

PVS-Studio warning: V503 This is a nonsensical comparison: pointer <0. scicos.c 5826

I'm pretty sure that once in this code, the open function was used to open a file. Then, the code was rewritten and began to use the function on _wfopen . Her call is hidden in the 'wcfopen' macro.

But check that the file was successfully opened, forgot to correct. The open () function returns -1 in case of an error. Checking that the pointer is less than zero has no practical meaning.

Another place where history is traced.
 void taucs_ccs_genmmd(taucs_ccs_matrix* m, int** perm, int** invperm) { int n, maxint, delta, nofsub; .... maxint = 32000; assert(sizeof(int) == 4); maxint = 2147483647; /* 2**31-1, for 32-bit only! */ .... } 

PVS-Studio warning: V519 The 'maxint' variable is assigned values ​​twice successively. Perhaps this is a mistake. Check lines: 154, 157. taucs_scilab.c 157

There is no error, but the code is funny.

It is seen that a long time ago, it was written "maxint = 32000;". Then below appeared:
 assert(sizeof(int) == 4); maxint = 2147483647; /* 2**31-1, for 32-bit only! */ 

Sort one item


Sort one item
 char *getCommonPart(char **dictionary, int sizeDictionary) { .... char *currentstr = dictionary[0]; qsort(dictionary, sizeof dictionary / sizeof dictionary[0], sizeof dictionary[0], cmp); .... } 

PVS-Studio warning: V514 Dividing sizeof a pointer 'sizeof dictionary' by another value. There is a possibility of logical error presence. getcommonpart.c 76

The second argument to the qsort () function is the number of elements in the array. Due to an error, the number of elements is always one.

Consider the expression "sizeof dictionary / sizeof dictionary [0]". Here the size of the pointer is divided by the size of the pointer. The result is one.

Probably the correct code should have been:
 qsort(dictionary, sizeDictionary, sizeof dictionary[0], cmp); 

Similar error here: getfilesdictionary.c 105

Stubborn strings


 void GetenvB(char *name, char *env, int len) { int ierr = 0, one = 1; C2F(getenvc)(&ierr,name,env,&len,&one); if (ierr == 0) { char *last = &env[len-1]; while ( *last == ' ' ) { last = '\0' ; } last--; } .... } 

V527 It is odd that the '\ 0' value is assigned to the 'char' type pointer. Probably meant: * last = '\ 0'. getenvb.c 24

This line is terrible. Or beautiful, if we are talking about interesting mistakes.
 while ( *last == ' ' ) { last = '\0' ; } 

If the first character in the string is blank, then the pointer will become zero. Then there will be a call on the null pointer.

It seems to me that this code should have replaced all spaces with a '\ 0'. Then the code should be like this:
 while ( *last == ' ' ) { *last++ = '\0' ; } 

It's funny that there is another place in the code where people also want to change spaces to zeros. And it also failed to write correctly.
 static int msg_101(int *n, int *ierr) { .... for (i=0;i<(int)strlen(line);i++) { if (line[i]==' ') line[i]='\0'; break; } .... } 

PVS-Studio warning: V612 An unconditional 'break' within a loop. msgs.c 1293

Everything would be fine if it were not for the 'break' operator. Only one space will be replaced. However, if you remove the 'break' it does not help. The strlen () function will return zero, and the loop will still stop.

Similar "one-time" cycles:

Null pointer dereferencing


Null pointer dereferencing
 char **splitLineCSV(....) { .... if (retstr[curr_str] == NULL) { *toks = 0; FREE(substitutedstring); substitutedstring = NULL; freeArrayOfString(retstr, strlen(substitutedstring)); return NULL; } .... } 

PVS-Studio warning: V575 The null pointer is passed into the 'strlen' function. Inspect the first argument. splitline.c 107

Strange code. At the beginning, the pointer was explicitly zeroed. Then, gave it to the mercy of the function strlen ().

Most likely, the call to the freeArrayOfString () function should be located higher than the FREE () call.

It was a warm-up. Now consider the more difficult case.
 inline static void create(void * pvApiCtx, const int position, const int rows, const int cols, long long * ptr) { int * dataPtr = 0; alloc(pvApiCtx, position, rows, cols, dataPtr); for (int i = 0; i < rows * cols; i++) { dataPtr[i] = static_cast<int>(ptr[i]); } } 

V522 Dereferencing of the null pointer 'dataPtr' might take place. scilababstractmemoryallocator.hxx 222

In this function, they want to allocate memory using the alloc () function. It may seem that the function returns a value by reference. The final argument is the 'dataPtr' pointer. It seems that the pointer to the allocated memory buffer will be written to it.

But it is not. The pointer will remain zero. Let's see how the alloc () function is declared:
 inline static int *alloc( void * pvApiCtx, const int position, const int rows, const int cols, int * ptr) 

See, the last argument is not a link. By the way, it's not at all clear why he is needed. Let's look inside the alloc () function:
 inline static int *alloc( void * pvApiCtx, const int position, const int rows, const int cols, int * ptr) { int * _ptr = 0; SciErr err = allocMatrixOfInteger32( pvApiCtx, position, rows, cols, &_ptr); checkError(err); return _ptr; } 

The last argument 'ptr' is not used at all.

In any case, the allocation code is incorrect. The code should be:
 inline static void create(void * pvApiCtx, const int position, const int rows, const int cols, long long * ptr) { int *dataPtr = alloc(pvApiCtx, position, rows, cols, 0); for (int i = 0; i < rows * cols; i++) { dataPtr[i] = static_cast<int>(ptr[i]); } } 

Similar situations:

Invalid error messages


The PVS-Studio analyzer often finds typos in error handlers. This code is rarely executed, and errors for a long time go unnoticed. I think because of such errors we often cannot understand what is wrong with the program. The diagnostic message issued by the program does not correspond to reality.

Example of incorrect error message generation:
 static SciErr fillCommonSparseMatrixInList(....) { .... addErrorMessage(&sciErr, API_ERROR_FILL_SPARSE_IN_LIST, _("%s: Unable to create list item #%d in Scilab memory"), _iComplex ? "createComplexSparseMatrixInList" : "createComplexSparseMatrixInList", _iItemPos + 1); .... } 

PVS-Studio message: V583 The '?:' Operator, regardless of its conditional expression, always returns one and the same value: "createComplexSparseMatrixInList". api_list.cpp 2398

Regardless of the value of the '_iComplex' variable, “createComplexSparseMatrixInList” will always be printed.

Similarly:Now consider an error handler that never gets control:
 #define __GO_FIGURE__ 9 #define __GO_UIMENU__ 21 int sci_uimenu(char *fname, unsigned long fname_len) { .... if (iParentType == __GO_FIGURE__ && iParentType == __GO_UIMENU__) { Scierror(999, _("%s: Wrong type for input argument #%d: ") _("A '%s' or '%s' handle expected.\n"), fname, 1, "Figure", "Uimenu"); return FALSE; } .... } 

PVS-Studio warning: V547 Expression 'iParentType == 9 && iParentType == 21' is always false. Probably the '||' operator should be used here. sci_uimenu.c 99

The condition (iParentType == __GO_FIGURE__ && iParentType == __GO_UIMENU__) is never satisfied. The variable cannot be at the same time the number 9 and the number 21. I think they wanted to write here:
 if (iParentType != __GO_FIGURE__ && iParentType != __GO_UIMENU__) 

Another, especially sweet example.
 int set_view_property(....) { BOOL status = FALSE; .... status = setGraphicObjectProperty( pobjUID, __GO_VIEW__, &viewType, jni_int, 1); if (status = TRUE) { return SET_PROPERTY_SUCCEED; } else { Scierror(999, _("'%s' property does not exist ") _("for this handle.\n"), "view"); return SET_PROPERTY_ERROR ; } .... } 

PVS-Studio warning: V559 Suspicious assignment of the operator: status: 1. set_view_property.c 61

Error here: "if (status = TRUE)". Instead of comparing, assignment occurs.

Lack of choice


Lack of choice

A function that can be clearly reduced. Apparently, it was written using Copy-Paste, and they forgot to correct something in the copied code.
 static int uf_union (int* uf, int s, int t) { if (uf_find(uf,s) < uf_find(uf,t)) { uf[uf_find(uf,s)] = uf_find(uf,t); return (uf_find(uf,t)); } else { uf[uf_find(uf,s)] = uf_find(uf,t); return (uf_find(uf,t)); } } 

PVS-Studio warning: V523 The 'then' statement is equivalent to the 'else' statement. taucs_scilab.c 700

Regardless of the condition, identical actions are performed.

Now the situation is different. Here are the conditions:
 int sci_xset( char *fname, unsigned long fname_len ) { .... else if ( strcmp(cstk(l1), "mark size") == 0) .... else if ( strcmp(cstk(l1), "mark") == 0) .... else if ( strcmp(cstk(l1), "mark") == 0) .... else if ( strcmp(cstk(l1), "colormap") == 0) .... } 

PVS-Studio warning: V517 The use of if (A) {...} else if (A) {...} 'pattern was detected. There is a possibility of logical error presence. Check lines: 175, 398. sci_xset.c 175

There are some more incorrect conditions:

Classic


Perhaps, I singled out the most frequent bloop in C / C ++ programs. At the beginning, the pointer is dereferenced, and only then it is checked for equality to zero. This does not always lead to an error. But ugliness, it is ugliness.
 static void appendData(....) { .... sco_data *sco = (sco_data *) * (block->work); int maxNumberOfPoints = sco->internal.maxNumberOfPoints; int numberOfPoints = sco->internal.numberOfPoints; if (sco != NULL && numberOfPoints >= maxNumberOfPoints) .... } 

PVS-Studio warning: V595 The 'sco' pointer was used before it was verified against nullptr. Check lines: 305, 311. canimxy3d.c 305

At the beginning, accessed members using the 'sco' pointer:
 int maxNumberOfPoints = sco->internal.maxNumberOfPoints; int numberOfPoints = sco->internal.numberOfPoints; 

And then suddenly they remembered that this pointer should be checked:
 if (sco != NULL ..... 

The analyzer issued 61 more V595 warnings. I see no point in listing them in the article. I give them a separate list: scilab-v595.txt .

Another common situation is the use of incorrect format specifiers when working with the sprintf () function and similar ones. Almost everything that was found is not interesting. We print unsigned values ​​as signed. Therefore, I cite all these warnings as a list: scilab-v576.txt .

Of the interesting things, only this can be noted:
 #define FORMAT_SESSION "%s%s%s" char *getCommentDateSession(BOOL longFormat) { .... sprintf(line, FORMAT_SESSION, SESSION_PRAGMA_BEGIN, STRING_BEGIN_SESSION, time_str, SESSION_PRAGMA_END); .... } 

PVS-Studio warning: V576 Incorrect format. A different number of actual arguments is expected while calling the 'sprintf' function. Expected: 5. Present: 6. getcommentdatesession.c 68

The string SESSION_PRAGMA_END will not be printed.

Caution undefined behavior


Caution, indefinite behavior.  This is C ++!
 short ezxml_internal_dtd(ezxml_root_t root, char *s, size_t len) { .... while (*(n = ++s + strspn(s, EZXML_WS)) && *n != '>') { .... } 

PVS-Studio warning: V567 Undefined behavior. The variable is modified while it is being used. ezxml.c 385

It is not known whether the expression “++ s' or the expression“ strspn (s, EZXML_WS) ”will be evaluated at the beginning. Accordingly, the result may differ on different compilers, platforms, and so on.

Another more interesting case. Here, undefined behavior arises from a typo.
 static char **replaceStrings(....) { .... int i = 0; .... for (i = 0; i < nr; i = i++) .... } 

V567 Undefined behavior. I'm variable is variable csvread.c 620

The trouble here is: i = i ++.

Apparently, I wanted to write this:
 for (i = 0; i < nr; i++) 

More about strings


 char *PLD_strtok(....) { .... if ((st->start)&&(st->start != '\0')) .... } 

PVS-Studio warning: V528 It is odd that the pointer to the 'char' type is compared with the '\ 0' value. Probably meant: * st-> start! = '\ 0'. pldstr.c 303

We wanted to check that the string is not empty. But in fact, the pointer is compared twice with NULL. The correct code is:
 if ((st->start)&&(st->start[0] != '\0')) 

Similar blooper:

V528 It is odd that the pointer to the 'char' type is compared with the '\ 0' value. Probably meant: ** category == '\ 0'. sci_xcospalload.cpp 57

The following code does not appear to be complete:
 int sci_displaytree(char *fname, unsigned long fname_len) { .... string szCurLevel = ""; .... //Add node level if (szCurLevel != "") { szCurLevel + "."; } .... } 

PVS-Studio warning: V655 The strings was concatenated but are not used. Consider inspecting the 'szCurLevel + "."' Expression. sci_displaytree.cpp 80

Code that works because of luck


Luck
 static int sci_toprint_two_rhs(void* _pvCtx, const char *fname) { .... sprintf(lines, "%s%s\n", lines, pStVarOne[i]); .... } 

PVS-Studio Warning: V541 It is dangerous to print the string 'lines' into itself. sci_toprint.cpp 314

The sprintf () function saves the result of its work to the 'lines' buffer. In this case, the same buffer is one of the input lines. So doing is not good. The code may well work. But it is dangerous. When changing the compiler, you can get an unexpected and unpleasant result.

Similar situation: sci_coserror.c 94

Sample code that works, although not true:
 typedef struct JavaVMOption { char *optionString; void *extraInfo; } JavaVMOption; JavaVMOption *options; BOOL startJVM(char *SCI_PATH) { .... fprintf(stderr, "%d: %s\n", j, vm_args.options[j]); .... } 

PVS-Studio warning: V510 The 'fprintf' function is not expected. jvm.c 247

Here they wanted to print the string referenced by the 'optionString' pointer. The correct code should have been:
 fprintf(stderr, "%d: %s\n", j, vm_args.options[j].optionString); 

But in fact, the fprintf () function will take an object of type JavaVMOption as an argument. The code works thanks to a wonderful coincidence of circumstances.

First, the 'optionString' member is located at the beginning of the structure. Therefore, it is this function that fprintf () will take and process it as a pointer to a string.

Secondly, after this function does not print anything. Therefore, garbage will not be printed (the contents of the 'extraInfo' variable, which will also go on the stack).

Hallelujah!

Idle cycle


 static void reinitdoit(double *told) { int keve = 0, kiwa = 0; .... kiwa = 0; .... for (i = 0; i < kiwa; i++) .... } 

V621 Consider inspecting the 'for' operator. It is possible that you will not be executed at all. scicos.c 4432

There is something wrong here. The variable 'kiwa' is always zero. The cycle is not executed. Perhaps the code is unfinished.

What is not included in the article


To be honest, I'm already tired of looking through the report and writing this article. So I decided to stop. Perhaps, it was possible to mention about a couple of suspicious places. But I considered them irrelevant, and laziness triumphed. Plus, I probably missed something, because I am not familiar with the project. Therefore, I recommend the authors to independently verify the project using the PVS-Studio analyzer.

Note. To those who decided to check the project once and not buy an analyzer, I want to remind you that this is a completely meaningless action. The whole essence of static analysis in regular checks, and not in one-time runs. You make a typo and the analyzer immediately detects it. The time for testing, debugging and working with errors appearing in the bug tracker is reduced. See also the article: " Leo Tolstoy and Static Code Analysis ".

Note


Be sure someone asks which version of Scilab was checked. Unfortunately, not the freshest. About a month and a half, I checked this project, wrote out suspicious code fragments. And ... And I forgot about this file. There was a lot of work involved in comparing analyzers . Now I came across this file, and for a long time I remembered what it was. I check so many projects that everything in my head is already messed up, and I don’t even remember whether I was watching a project or not.

However, nothing terrible. Now I will write this article. Scilab authors will see it and check the project themselves. The purpose of my articles is to show the possibilities of the static analysis methodology, and not to find errors in the most recent version of the project.

Conclusion


Use static analysis on a regular basis. You will reduce the time to eliminate stupid mistakes and be able to spend more time on something useful.

For medium and large projects where night checks are required, analyzer refinement, integration with MSBuild, support of Visual Studio 2005/2008, and so on, we offer the PVS-Studio tool.

Additional links


  1. Terminology. Static code analysis .
  2. Andrey Karpov. Alternative to PVS-Studio for $ 250 .
  3. Andrey Karpov, Evgeny Ryzhkov, Pavel Eremeev, Svyatoslav Razmyslov. Comparison of code analyzers: CppCat, Cppcheck, PVS-Studio, Visual Studio . ( comparison methodology ).


This article is in English.


If you want to share this article with an English-speaking audience, then please use the link to the translation: Andrey Karpov. Mathematicians: Trust, but Verify .

Read the article and have a question?
Often our articles are asked the same questions. We collected the answers to them here: Answers to questions from readers of articles about PVS-Studio and CppCat, version 2014 . Please review the list.

Source: https://habr.com/ru/post/217557/


All Articles