📜 ⬆️ ⬇️

Re-checking TortoiseSVN with the help of the PVS-Studio code analyzer

TortoiseSVN and PVS-Studio
For some time we sent a free key for the PVS-Studio analyzer to the TortoiseSVN developers. Before they had time to use it, I decided to quickly download the source code for TortoiseSVN and perform the analysis myself. The goal is clear. Another small article for PVS-Studio advertising.

We have already tested the project TortoiseSVN. It was a long time ago. The verification of the project coincided with the release of PVS-Studio 4.00, where general diagnostic rules first appeared.

From time to time we repeat the checks to demonstrate the benefits of regular use of the tool. It makes no sense to check the project once or twice. In the code being changed, new errors constantly appear. And then slowly and sadly corrected. Accordingly, the maximum benefit will be achieved with daily use of PVS-Studio. Better yet, using incremental analysis .

So, let's see what we managed to find interesting in the project with the help of PVS-Studio version 5.05. The source codes for TortoiseSVN were taken on 06/19/2013 from http://tortoisesvn.googlecode.com/svn/trunk . By the way, the TortoiseSVN project is very high quality and has a huge user base of programmers. So if you find at least something, this is a great achievement.
')

Strange conditions


static void ColouriseA68kDoc (....) { if (((sc.state == SCE_A68K_NUMBER_DEC) && isdigit(sc.ch)) .... || ((sc.state == SCE_A68K_MACRO_ARG) && isdigit(sc.ch)) || ((sc.state == SCE_A68K_MACRO_ARG) && isdigit(sc.ch)) .... } 

Diagnostic message: V501 There are identical sub-expressions '((sc.state == 11) && isdigit (sc.ch))' to the left | operator. lexa68k.cxx 160

There are two identical comparisons. There may be a typo.

A typo is probably present in the following code. The value of the 'rv' variable is checked twice.
 struct hentry * AffixMgr::compound_check( .... if (rv && forceucase && (rv) && ....) .... } 

Diagnostic message: V501 operator & r; & force; & (rv):

The following code snippet:
 bool IsAllASCII7 (const CString& s) { for (int i = 0, count = s.GetLength(); i < count; ++i) if (s[i] >= 0x80) return false; return true; } 

Diagnostic message: V547 Expression 's [i]> = 0x80' is always false. The value range of char type: [-128, 127]. logdlgfilter.cpp 34

The IsAllASCII7 () function always returns 'true'. The condition 's [i]> = 0x80' is always false. The value of a variable of type 'char' cannot be greater than or equal to 0x80.

The following code snippet with the wrong comparison:
 int main(int argc, char **argv) { .... DWORD ticks; .... if (run_timers(now, &next)) { ticks = next - GETTICKCOUNT(); if (ticks < 0) ticks = 0; } else { ticks = INFINITE; } .... } 

Diagnostic message: V547 Expression 'ticks <0' is always false. Unsigned type value is never <0. winplink.c 635

The variable 'ticks' is unsigned. This means that the “if (ticks <0)” check does not make sense. An overflow situation will not be processed.

Consider an error due to which the function 'strncmp' does not fully compare strings.
 int AffixMgr::parse_convtable(...., const char * keyword) { char * piece; .... if (strncmp(piece, keyword, sizeof(keyword)) != 0) { .... } 

Diagnostic message: V579 It is possibly a mistake. Inspect the third argument. affixmgr.cxx 3654

The 'sizeof' operator calculates the size of the pointer. This value is not related to the length of the string.

Suspicious string formation


Functions with a variable number of arguments are always everywhere, and as always dangerous.
 class CTSVNPath { .... private: mutable CString m_sBackslashPath; mutable CString m_sLongBackslashPath; mutable CString m_sFwdslashPath; .... }; const FileStatusCacheEntry * SVNFolderStatus::BuildCache( const CTSVNPath& filepath, ....) { .... CTraceToOutputDebugString::Instance() (_T(__FUNCTION__) _T(": building cache for %s\n"), filepath); .... } 

Diagnostic message: V510 The 'operator ()' function is not expected to receive class-type variable as second actual argument:

The qualifier "% s" indicates that the function expects a string as the actual argument. However, the variable 'filepath' is not a string at all, but a complex object consisting of multiple lines. It’s hard to say what will be printed and whether this code will fall at all.

It is dangerous to use functions such as 'printf ()' as follows: "printf (myStr);". If control qualifiers are present inside 'myStr', the program can print out what it is not supposed to or terminate abnormally.

Consider the code from TortoiseSVN:
 BOOL CPOFile::ParseFile(....) { .... printf(File.getloc().name().c_str()); .... } 

Diagnostic message: V618 It’s dangerous to call it. The example of the safe code: printf ("% s", str); pofile.cpp 158

If the file name is "myfile% s% i% s.txt", then the result will be pitiable.

Note. We have an interesting note on the dangers of using the printf () function .

Wrong reset of arrays


I do not know how dangerous it is for TortoiseSVN to leave the contents of the buffers without resetting them. Perhaps generally safe. However, there is code to reset the buffers. And since it does not work, it is worth mentioning about it. Errors look like this:
 static void sha_mpint(SHA_State * s, Bignum b) { unsigned char lenbuf[4]; .... memset(lenbuf, 0, sizeof(lenbuf)); } 

Diagnostic message: V597 The compiler could delete the memset function call, which is used to flush the lenbuf buffer. The RtlSecureZeroMemory () function should be used to erase the private data. sshdss.c 23

Before exiting the function, the 'lenbuf' array should be cleared. Since the array is then no longer used, the optimizer will remove the call to the 'memset' function. To prevent this from happening, you need to use special functions.

Other places where the compiler will remove the call 'memset ()':

Strange


 BOOL InitInstance(HINSTANCE hResource, int nCmdShow) { .... app.hwndTT; // handle to the ToolTip control .... } 

Diagnostic message: V607 Ownerless expression 'app.hwndTT'. tortoiseblame.cpp 1782

Most likely, in the 'InitInstance ()' function, the 'hwndTT' member should be initialized with something. However, due to a typo, the code was unfinished.

64-bit errors


I make the search for mistakes very superficially. I am attentive, just enough to give me enough examples to write an article. No, I'm not a byaka. Simply, the authors of the project will still perform the analysis more qualitatively than I can do.

I look at 64-bit errors even more superficially. It is very difficult to judge, not knowing the structure of the project, it is possible the occurrence of a particular error or not.

I will give only a couple of dangerous places:
 void LoginDialog::CreateModule(void) { .... DialogBoxParam(g_hmodThisDll, MAKEINTRESOURCE(IDD_LOGIN), g_hwndMain, (DLGPROC)(LoginDialogProc), (long)this); .... } 

Diagnostic message: V220 Suspicious sequence of types of castings: memsize -> 32-bit integer -> memsize. The value being casted: 'this'. logindialog.cpp 105

The 'this' pointer is explicitly cast to the 'long' type. It then implicitly expands to type LPARAM (LONG_PTR). The important thing is that the pointer turns into a 'long' for a while. This is bad if the program is 64-bit. The pointer occupies 64-bit. The 'long' type in Win64 is still a 32-bit type. As a result, the high bits of the 64-bit variable will be lost.

If the object is created outside the lower 4 GB of RAM, the program will become unpredictable. The probability of such an event is certainly not great, but such an error is extremely difficult to reproduce.

The correct code is: DialogBoxParam (...., (LPARAM) this);

Consider another dangerous type cast:
 static int cmpforsearch(void *av, void *bv) { Actual_Socket b = (Actual_Socket) bv; unsigned long as = (unsigned long) av, bs = (unsigned long) b->s; if (as < bs) return -1; if (as > bs) return +1; return 0; } 

Diagnostic message: V205 Explicit conversion of pointer type to 32-bit integer type: (unsigned long) av:

Pointers are explicitly cast to the type 'unsigned long' and placed in the variables 'as' and 'bs'. Since the upper bits of the address may be lost, the comparison may not work correctly. It’s not at all clear why here pointers are cast to integer types. You can simply compare pointers.

Outdated null pointer checks


The 'new' operator, when it cannot allocate memory, has not returned NULL long ago. It throws an exception std :: bad_alloc. Of course, you can have the 'new' operator return 0, but this is irrelevant now.

However, the following code continues to live in programs:
 int _tmain(....) { .... pBuf = new char[maxlength]; if (pBuf == NULL) { _tprintf(_T("Could not allocate enough memory!\n")); delete [] wc; delete [] dst; delete [] src; return ERR_ALLOC; } .... } 

Diagnostic Message: V680 pbuf The exception will be generated in the case of memory allocation error.

And so it will come down


About the many errors that I encounter when studying the code, I do not write in articles. The fact is that they do not interfere with the work of the program. This time I decided to write about a couple of such cases. It's just very funny to watch situations when the program works not because it is written correctly, but because of luck.
 void CBaseView::OnContextMenu(CPoint point, DiffStates state) { .... popup.AppendMenu(MF_STRING | oWhites.HasTrailWhiteChars ? MF_ENABLED : (MF_DISABLED|MF_GRAYED), POPUPCOMMAND_REMOVETRAILWHITES, temp); .... } 

Diagnostic message: V502 Perhaps the '?:' Operator works in a different way than it was expected. The '?:' Operator has a lower than the '|' operator. baseview.cpp 2246

Depending on the value of the variable 'oWhites.HasTrailWhiteChars', you need to get one of the combinations of constants:

But the code does not work at all. Priority of operation '|' higher than the priority of the '?:' operation. Place brackets for clarity:

(MF_STRING | oWhites.HasTrailWhiteChars)? MF_ENABLED: MF_DISABLED | MF_GRAYED

The code works correctly only because the constant 'MF_STRING' is equal to 0. It has no effect on the result. As a result, the wrong expression works correctly.

Consider another example of luck. In the TortoiseSVN program, the HWND type is often used as the 'unsigned' type. To do this, explicit type conversions must be performed. For example, as shown in the following functions:
 HWND m_hWnd; UINT_PTR uId; INT_PTR CBaseView::OnToolHitTest(....) const { .... pTI->uId = (UINT)m_hWnd; .... } UINT_PTR idFrom; HWND m_hWnd; BOOL CBaseView::OnToolTipNotify( UINT, NMHDR *pNMHDR, LRESULT *pResult) { if (pNMHDR->idFrom != (UINT)m_hWnd) return FALSE; .... } 

Or, for example, the value of a variable of type HWND is printed as if it were a type of 'long'.
 bool CCommonAppUtils::RunTortoiseProc(....) { .... CString sCmdLine; sCmdLine.Format(L"%s /hwnd:%ld", (LPCTSTR)sCommandLine, AfxGetMainWnd()->GetSafeHwnd()); .... } 

Formally, this code is incorrect. The fact is that the type 'HWND' is a pointer. So, it cannot be turned into 32-bit integer types. And the PVS-Studio analyzer is worried about this, issuing warnings.

But the interesting thing is that this code will work perfectly!

The HWND type is used to store descriptors that are used in Windows to work with various system objects. The same types are HANDLE, HMENU, HPALETTE, HBITMAP and so on.

Although descriptors are 64-bit pointers, for greater compatibility (for example, to allow interaction between 32-bit and 64-bit processes) they use only the lower 32-bits. See " Microsoft Interface Definition Language (MIDL): 64-Bit Porting Guide " (USER and GDI handles are sign extended 32b values) for details.

By placing the HWND type in 32-bit types, the developers were hardly based precisely on these assumptions. Most likely, this is not a very neat code that works correctly thanks to the luck and efforts of the developers of the Windows API.

Conclusion


Use static analysis regularly during development, and you will find a lot of errors at the earliest stages. I naturally recommend first of all to get acquainted with the PVS-Studio code analyzer. However, there are many other good code analyzers: static code analysis tools .

Links


Additional links that can explain some of the subtle points described in the article.
  1. Knowledge base. Overwrite memory - why?
  2. Documentation. V668. It was not allowed to use the operator.
  3. Knowledge base. How to correctly cast a pointer to an int in a 64-bit program?
  4. Karpov Andrey, Evgeny Ryzhkov. Lessons learned from developing 64-bit C / C ++ applications .

Source: https://habr.com/ru/post/184542/


All Articles