📜 ⬆️ ⬇️

We check the code of the dynamic analyzer Valgrind using a static analyzer

Friendship static and dynamic analysis I’ll say right away that the article is not written at all to show that static analysis works better than dynamic one. Such a statement would be wrong, as well as the reverse. Static and dynamic analysis tools complement each other, and do not compete with each other. Those and those have strengths and weaknesses. Some errors cannot detect dynamic analyzers, and some cannot find static ones. Therefore, you should treat this note simply as a regular demonstration of the capabilities of PVS-Studio, and not as a comparison of the two methodologies.

Methodologies for dynamic and static code analysis


The source code of the program contains hints that help identify errors. Consider a simple example:

char *str = foo(); if (str == '\0') 

It is strange to compare a pointer not with nullptr , NULL or at least 0 , but with the character literal '\ 0' . Based on this oddity, the static code analyzer can assume that they really wanted to check not that the pointer is 0, but that the string is empty. Those. We wanted to check that the terminal zero is located at the beginning of the line, but accidentally forgot to dereference the pointer. Most likely it turns out that this is really a mistake, and the correct code should be like this:

 char *str = foo(); if (*str == '\0') 

When compiling this information is lost, and the dynamic analyzer is not able to identify this error. From the point of view of the dynamic analyzer, the pointer is checked for NULL equality - there is nothing to worry about.
')
Another weakness of dynamic analyzers is the need to execute code that contains an error. And for many parts of the code, this can be very difficult. Let me explain with the example of code taken from a real application:

 ADOConnection* piTmpConnection = NULL; hr = CoCreateInstance( CLSID_DataLinks, NULL, CLSCTX_INPROC_SERVER, IID_IDataSourceLocator, (void**)&dlPrompt ); if( FAILED( hr ) ) { piTmpConnection->Release(); dlPrompt->Release( ); return connstr; } 

If the CoCreateInstance function fails, then the null pointer piTmpConnection will be dereferenced. In fact, here is the line piTmpConnection-> Release (); just superfluous, since no connection has yet been created.

To identify such a situation using a dynamic analyzer is problematic, since it is necessary to emulate a situation where the CoCreateInstance function returns an error status. Make it easy.

Purely theoretically, a static analyzer has information about the code, and therefore is able to find more errors than a dynamic analyzer. In practice, the capabilities of static analyzers are limited by the amount of available memory and acceptable runtime. In other words, a static analyzer can consider how the code will work for all possible input data. But he will do it for 150 years on a cluster, where, in addition, an incredible amount of memory is installed.

As a result, in practice, static analyzers cannot identify many types of errors. For example, they do not notice leaks if the pointer is passed between many functions. In turn, dynamic analyzers do an excellent job with such tasks, regardless of the complexity of the code.

Test results


We regularly check various projects to popularize static code analysis methodology, and I could not get past such a project like Valgrind. Finding mistakes in it is a kind of challenge. This is a high-quality, tested project, which in addition is checked by the Coverity analyzer. And in general, I am sure that this code was tested by enthusiasts with various tools. So even finding a few mistakes is a big success.

Let's see what PVS-Studio analyzer found in the code of the Valgrind project, which is interesting.

 static void lk_fini(Int exitcode) { .... VG_(umsg)(" taken: %'llu (%.0f%%)\n", taken_Jccs, taken_Jccs * 100.0 / total_Jccs ?: 1); .... } 

PVS-Studio warning: V502 Perhaps the '?:' Operator works in a different way. The '?:' Operator has a lower limit than the '/' operator. lk_main.c 1014

Operator?: Extremely tricky, and it should be used very carefully. I talked more about this topic in Chapter 4 of my small book , where I recommend looking. Consider what this code is suspicious.

It seems to me that the programmer wanted to protect himself from division by zero. Therefore, if the variable total_Jccs is 0 , then the division should be carried out by 1 . It was planned that the code would work like this:

 taken_Jccs * 100.0 / (total_Jccs ?: 1) 

However, the priority of the operator?: Is lower than that of the multiplication and division operators. Therefore, the expression is calculated as:

 (taken_Jccs * 100.0 / total_Jccs) ?: 1 

However, perhaps the code works exactly as intended. If this is so, anyway, it is better to add brackets so that other programmers won’t guess later on whether there is an error or not.

 static Bool doHelperCall (....) { .... UInt nVECRETs = 0; .... vassert(nVECRETs == (retTy == Ity_V128 || retTy == Ity_V256) ? 1 : 0); .... } 

PVS-Studio warning: V502 Perhaps the '?:' Operator works in a different way. The '?:' Operator has a lower limit than the '==' operator. host_arm_isel.c 795

Interesting case. The?: Operator is used incorrectly, but the code is correct.

It was thought that the test should work like this:

 nVECRETs == ((retTy == Ity_V128 || retTy == Ity_V256) ? 1 : 0) 

But it works like this:

 (nVECRETs == (retTy == Ity_V128 || retTy == Ity_V256)) ? 1 : 0 

The funny thing is that if you look closely, it is clear that these checks are equivalent. The result will be the same.

Similar checks are found here:


 typedef ULong DiOffT; typedef struct { Bool fromC; DiOffT off; SizeT size; SizeT used; UChar data[]; } CEnt; static Bool is_sane_CEnt (....) { .... CEnt* ce = img->ces[i]; .... if (!(ce->size == CACHE_ENTRY_SIZE)) goto fail; if (!(ce->off >= 0)) goto fail; // <= if (!(ce->off + ce->used <= img->real_size)) goto fail; .... } 

PVS-Studio warning: V547 Expression 'ce-> off> = 0' is always true. Unsigned type value is always> = 0. image.c 147

The off member is an unsigned type variable, which means it is always greater than or equal to zero. Thus, the condition (! (Ce-> off> = 0)) is always false.

 static void sdel_Counts ( Counts* cts ) { memset(cts, 0, sizeof(Counts)); free(cts); } 

PVS-Studio warning: V597 The compiler could delete the memset function call, which is used to flush the 'cts' object. The memset_s () function should be used to erase the private data. cg_merge.c 324

Apparently, to simplify the search for errors in Valgrind itself, the memory is filled with zeros before being freed. However, in the release version, the compiler will most likely remove the memset function call, since the buffer is no longer used before the free function call.

Similar places where memory can not be reset:


 static Bool dis_AdvSIMD_scalar_shift_by_imm(DisResult* dres, UInt insn) { .... ULong nmask = (ULong)(((Long)0x8000000000000000ULL) >> (sh-1)); .... } 

PVS-Studio warning: V610 Unspecified behavior. Check the shift operator '>>'. The left operand '((Long) 0x8000000000000000ULL)' is negative. guest_arm64_toIR.c 9428

If the right-shift operand is negative, the resulting value depends on the implementation (implementation-defined). So we are dealing with dangerous code.

Now consider the situation when pointer dereferencing is before it is checked for NULL equality:

 PRE(xsm_op) { struct vki_xen_flask_op *op = (struct vki_xen_flask_op *)ARG1; PRINT("__HYPERVISOR_xsm_op ( %u )", op->cmd); // <= PRE_MEM_READ("__HYPERVISOR_xsm_op", ARG1, sizeof(vki_uint32_t) + sizeof(vki_uint32_t)); if (!op) // <= return; .... } 

PVS-Studio warning: V595 The 'op' pointer was used before it was verified against nullptr. Check lines: 350, 360. syswrap-xen.c 350

Similar cases:


 Bool ML_(read_elf_debug_info) ( struct _DebugInfo* di ) { .... if (inrw && sdynbss_present) { vg_assert(di->sbss_present); sdynbss_present = False; vg_assert(di->sbss_svma + di->sbss_size == svma); di->sbss_size += size; .... } else // <= if (inrw && !di->sbss_present) { di->sbss_present = True; di->sbss_svma = svma; di->sbss_avma = svma + inrw->bias; .... } 

PVS-Studio warning: V705 It is possible that 'else' block was forgotten or commented out, thus the program's operation logics. readelf.c 2231

The else keyword looks extremely suspicious in this code. The code is not aligned in accordance with the logic of its work. In addition, else is followed by an empty string. This suggests that we are confronted with the consequences of unsuccessful code refactoring and given else is superfluous.

 static Bool doHelperCallWithArgsOnStack (....) { .... if (guard) { if (guard->tag == Iex_Const && guard->Iex.Const.con->tag == Ico_U1 && guard->Iex.Const.con->Ico.U1 == True) { /* unconditional -- do nothing */ } else { goto no_match; //ATC cc = iselCondCode( env, guard ); } } .... } 

PVS-Studio warning: V779 Unreachable code detected. It is possible that an error is present. host_arm_isel.c 461

Line of code:

 cc = iselCondCode( env, guard ); 

never done:

 void reset_valgrind_sink(const char *info) { if (VG_(log_output_sink).fd != initial_valgrind_sink.fd && initial_valgrind_sink_saved) { VG_(log_output_sink).fd = initial_valgrind_sink.fd; VG_(umsg) ("Reset valgrind output to log (%s)\n", (info = NULL ? "" : info)); } } 

PVS-Studio warning: V547 Expression '((void *) 0)' is always false. server.c 110

The analyzer warning looks strange and needs clarification.

We are interested in the following expression:

 (info = NULL ? "" : info)) 

The macro NULL takes place in (((void *) 0) and it turns out:

 (info = ((void *) 0) ? "" : info)) 

The priority of the operator ?: Is higher than that of the operator = , so the calculations are as follows:

 (info = (((void *) 0) ? "" : info))) 

Agree that the condition ((void *) 0) for the operator ?: Looks strange, which is what the PVS-Studio analyzer warns about. Apparently, we are dealing with a typo, and the code should have been like this:

 (info == NULL ? "" : info)) 

And the latest code snippet for today:

 void genReload_TILEGX ( /*OUT*/ HInstr ** i1, /*OUT*/ HInstr ** i2, HReg rreg, Int offsetB ) { TILEGXAMode *am; vassert(!hregIsVirtual(rreg)); am = TILEGXAMode_IR(offsetB, TILEGXGuestStatePointer()); switch (hregClass(rreg)) { case HRcInt64: *i1 = TILEGXInstr_Load(8, rreg, am); break; case HRcInt32: *i1 = TILEGXInstr_Load(4, rreg, am); break; default: ppHRegClass(hregClass(rreg)); vpanic("genReload_TILEGX: unimplemented regclass"); break; } } 

PVS-Studio warning: V751 Parameter 'i2' is not used inside function body. host_tilegx_defs.c 1223

I think here we forgot to write NULL to i2 , as it is done in other similar functions:

 *i1 = *i2 = NULL; 

A similar error is found here:

V751 Parameter 'i2' is not used inside function body. host_mips_defs.c 2000

Conclusion


Thank you all for your attention. Try our static code analyzer PVS-Studio for Linux.


Windows developers, I invite here: PVS-Studio for Windows . For them, everything is a little easier. They can simply install a plugin for Visual Studio and test their C, C ++ and C # projects with the help of the demo version.


If you want to share this article with an English-speaking audience, then please use the link to the translation: Andrey Karpov. Checking the code of Valgrind dynamic analyzer by a static analyzer

Read the article and have a question?
Often our articles are asked the same questions. We collected answers to them here: Answers to questions from readers of articles about PVS-Studio, version 2015 . Please review the list.

Source: https://habr.com/ru/post/327994/


All Articles