
For some time we sent a free key for the PVS-Studio analyzer to the TortoiseSVN developers. Before they had time to use it, I decided to quickly download the source code for TortoiseSVN and perform the analysis myself. The goal is clear. Another small article for PVS-Studio advertising.
We have already
tested the project TortoiseSVN. It was a long time ago. The verification of the project coincided with the release of PVS-Studio 4.00, where general diagnostic rules first appeared.
From time to time we repeat the checks to demonstrate the benefits of regular use of the tool. It makes no sense to check the project once or twice. In the code being changed, new errors constantly appear. And then slowly and sadly corrected. Accordingly, the maximum benefit will be achieved with daily use of PVS-Studio. Better yet, using
incremental analysis .
So, let's see what we managed to find interesting in the project with the help of
PVS-Studio version 5.05. The source codes for TortoiseSVN were taken on 06/19/2013 from
http://tortoisesvn.googlecode.com/svn/trunk . By the way, the TortoiseSVN project is very high quality and has a huge user base of programmers. So if you find at least something, this is a great achievement.
')
Strange conditions
static void ColouriseA68kDoc (....) { if (((sc.state == SCE_A68K_NUMBER_DEC) && isdigit(sc.ch)) .... || ((sc.state == SCE_A68K_MACRO_ARG) && isdigit(sc.ch)) || ((sc.state == SCE_A68K_MACRO_ARG) && isdigit(sc.ch)) .... }
Diagnostic message: V501 There are identical sub-expressions '((sc.state == 11) && isdigit (sc.ch))' to the left | operator. lexa68k.cxx 160
There are two identical comparisons. There may be a typo.
A typo is probably present in the following code. The value of the 'rv' variable is checked twice.
struct hentry * AffixMgr::compound_check( .... if (rv && forceucase && (rv) && ....) .... }
Diagnostic message: V501 operator & r; & force; & (rv):
- affixmgr.cxx 1784
- affixmgr.cxx 1879
The following code snippet:
bool IsAllASCII7 (const CString& s) { for (int i = 0, count = s.GetLength(); i < count; ++i) if (s[i] >= 0x80) return false; return true; }
Diagnostic message: V547 Expression 's [i]> = 0x80' is always false. The value range of char type: [-128, 127]. logdlgfilter.cpp 34
The IsAllASCII7 () function always returns 'true'. The condition 's [i]> = 0x80' is always false. The value of a variable of type 'char' cannot be greater than or equal to 0x80.
The following code snippet with the wrong comparison:
int main(int argc, char **argv) { .... DWORD ticks; .... if (run_timers(now, &next)) { ticks = next - GETTICKCOUNT(); if (ticks < 0) ticks = 0; } else { ticks = INFINITE; } .... }
Diagnostic message: V547 Expression 'ticks <0' is always false. Unsigned type value is never <0. winplink.c 635
The variable 'ticks' is unsigned. This means that the “if (ticks <0)” check does not make sense. An overflow situation will not be processed.
Consider an error due to which the function 'strncmp' does not fully compare strings.
int AffixMgr::parse_convtable(...., const char * keyword) { char * piece; .... if (strncmp(piece, keyword, sizeof(keyword)) != 0) { .... }
Diagnostic message: V579 It is possibly a mistake. Inspect the third argument. affixmgr.cxx 3654
The 'sizeof' operator calculates the size of the pointer. This value is not related to the length of the string.
Suspicious string formation
Functions with a variable number of arguments are always everywhere, and as always dangerous.
class CTSVNPath { .... private: mutable CString m_sBackslashPath; mutable CString m_sLongBackslashPath; mutable CString m_sFwdslashPath; .... }; const FileStatusCacheEntry * SVNFolderStatus::BuildCache( const CTSVNPath& filepath, ....) { .... CTraceToOutputDebugString::Instance() (_T(__FUNCTION__) _T(": building cache for %s\n"), filepath); .... }
Diagnostic message: V510 The 'operator ()' function is not expected to receive class-type variable as second actual argument:
- svnfolderstatus.cpp 150
- svnfolderstatus.cpp 355
- svnfolderstatus.cpp 360
The qualifier "% s" indicates that the function expects a string as the actual argument. However, the variable 'filepath' is not a string at all, but a complex object consisting of multiple lines. It’s hard to say what will be printed and whether this code will fall at all.
It is dangerous to use functions such as 'printf ()' as follows: "printf (myStr);". If control qualifiers are present inside 'myStr', the program can print out what it is not supposed to or terminate abnormally.
Consider the code from TortoiseSVN:
BOOL CPOFile::ParseFile(....) { .... printf(File.getloc().name().c_str()); .... }
Diagnostic message: V618 It’s dangerous to call it. The example of the safe code: printf ("% s", str); pofile.cpp 158
If the file name is "myfile% s% i% s.txt", then the result will be pitiable.
Note. We have an interesting note on the dangers of using the printf () function .Wrong reset of arrays
I do not know how dangerous it is for TortoiseSVN to leave the contents of the buffers without resetting them. Perhaps generally safe. However, there is code to reset the buffers. And since it does not work, it is worth mentioning about it. Errors look like this:
static void sha_mpint(SHA_State * s, Bignum b) { unsigned char lenbuf[4]; .... memset(lenbuf, 0, sizeof(lenbuf)); }
Diagnostic message: V597 The compiler could delete the memset function call, which is used to flush the lenbuf buffer. The RtlSecureZeroMemory () function should be used to erase the private data. sshdss.c 23
Before exiting the function, the 'lenbuf' array should be cleared. Since the array is then no longer used, the optimizer will remove the call to the 'memset' function. To prevent this from happening, you need to use special functions.
Other places where the compiler will remove the call 'memset ()':
- sshdss.c 37
- sshdss.c 587
- sshdes.c 861
- sshdes.c 874
- sshdes.c 890
- sshdes.c 906
- sshmd5.c 252
- sshrsa.c 113
- sshpubk.c 153
- sshpubk.c 361
- sshpubk.c 1121
- sshsha.c 256
Strange
BOOL InitInstance(HINSTANCE hResource, int nCmdShow) { .... app.hwndTT;
Diagnostic message: V607 Ownerless expression 'app.hwndTT'. tortoiseblame.cpp 1782
Most likely, in the 'InitInstance ()' function, the 'hwndTT' member should be initialized with something. However, due to a typo, the code was unfinished.
64-bit errors
I make the search for mistakes very superficially. I am attentive, just enough to give me enough examples to write an article. No, I'm not a byaka. Simply, the authors of the project will still perform the analysis more qualitatively than I can do.
I look at
64-bit errors even more superficially. It is very difficult to judge, not knowing the structure of the project, it is possible the occurrence of a particular error or not.
I will give only a couple of dangerous places:
void LoginDialog::CreateModule(void) { .... DialogBoxParam(g_hmodThisDll, MAKEINTRESOURCE(IDD_LOGIN), g_hwndMain, (DLGPROC)(LoginDialogProc), (long)this); .... }
Diagnostic message: V220 Suspicious sequence of types of castings: memsize -> 32-bit integer -> memsize. The value being casted: 'this'. logindialog.cpp 105
The 'this' pointer is
explicitly cast to the 'long' type. It then
implicitly expands to type LPARAM (LONG_PTR). The important thing is that the pointer turns into a 'long' for a while. This is bad if the program is 64-bit. The pointer occupies 64-bit. The 'long' type in
Win64 is still a 32-bit type. As a result, the high bits of the 64-bit variable will be lost.
If the object is created outside the lower 4 GB of RAM, the program will become unpredictable. The probability of such an event is certainly not great, but such an error is extremely difficult to reproduce.
The correct code is: DialogBoxParam (...., (LPARAM) this);
Consider another dangerous type cast:
static int cmpforsearch(void *av, void *bv) { Actual_Socket b = (Actual_Socket) bv; unsigned long as = (unsigned long) av, bs = (unsigned long) b->s; if (as < bs) return -1; if (as > bs) return +1; return 0; }
Diagnostic message: V205 Explicit conversion of pointer type to 32-bit integer type: (unsigned long) av:
- winnet.c 139
- winhandl.c 359
- winhandl.c 348
Pointers are explicitly cast to the type 'unsigned long' and placed in the variables 'as' and 'bs'. Since the upper bits of the address may be lost, the comparison may not work correctly. It’s not at all clear why here pointers are cast to integer types. You can simply compare pointers.
Outdated null pointer checks
The 'new' operator, when it cannot allocate memory, has not returned NULL long ago. It throws an exception std :: bad_alloc. Of course, you can have the 'new' operator return 0, but this is irrelevant now.
However, the following code continues to live in programs:
int _tmain(....) { .... pBuf = new char[maxlength]; if (pBuf == NULL) { _tprintf(_T("Could not allocate enough memory!\n")); delete [] wc; delete [] dst; delete [] src; return ERR_ALLOC; } .... }
Diagnostic Message: V680 pbuf The exception will be generated in the case of memory allocation error.
- subwcrev.cpp 912
- repositorybrowser.cpp 2565
- repositorybrowser.cpp 4225
- svnstatuslistctrl.cpp 5254
- svnprogressdlg.cpp 2357
- bugtraqassociations.cpp 116
- xmessagebox.cpp 792
- xmessagebox.cpp 797
- hyperlink_base.cpp 166
- affixmgr.cxx 272
- hashmgr.cxx 363
- hashmgr.cxx 611
And so it will come down
About the many errors that I encounter when studying the code, I do not write in articles. The fact is that they do not interfere with the work of the program. This time I decided to write about a couple of such cases. It's just very funny to watch situations when the program works not because it is written correctly, but because of luck.
void CBaseView::OnContextMenu(CPoint point, DiffStates state) { .... popup.AppendMenu(MF_STRING | oWhites.HasTrailWhiteChars ? MF_ENABLED : (MF_DISABLED|MF_GRAYED), POPUPCOMMAND_REMOVETRAILWHITES, temp); .... }
Diagnostic message: V502 Perhaps the '?:' Operator works in a different way than it was expected. The '?:' Operator has a lower than the '|' operator. baseview.cpp 2246
Depending on the value of the variable 'oWhites.HasTrailWhiteChars', you need to get one of the combinations of constants:
- MF_STRING | MF_ENABLED
- MF_STRING | MF_DISABLED | MF_GRAYED
But the code does not work at all. Priority of operation '|' higher than the priority of the '?:' operation. Place brackets for clarity:
(MF_STRING | oWhites.HasTrailWhiteChars)? MF_ENABLED: MF_DISABLED | MF_GRAYED
The code works correctly only because the constant 'MF_STRING' is equal to 0. It has no effect on the result. As a result, the wrong expression works correctly.
Consider another example of luck. In the TortoiseSVN program, the HWND type is often used as the 'unsigned' type. To do this, explicit type conversions must be performed. For example, as shown in the following functions:
HWND m_hWnd; UINT_PTR uId; INT_PTR CBaseView::OnToolHitTest(....) const { .... pTI->uId = (UINT)m_hWnd; .... } UINT_PTR idFrom; HWND m_hWnd; BOOL CBaseView::OnToolTipNotify( UINT, NMHDR *pNMHDR, LRESULT *pResult) { if (pNMHDR->idFrom != (UINT)m_hWnd) return FALSE; .... }
Or, for example, the value of a variable of type HWND is printed as if it were a type of 'long'.
bool CCommonAppUtils::RunTortoiseProc(....) { .... CString sCmdLine; sCmdLine.Format(L"%s /hwnd:%ld", (LPCTSTR)sCommandLine, AfxGetMainWnd()->GetSafeHwnd()); .... }
Formally, this code is incorrect. The fact is that the type 'HWND' is a pointer. So, it cannot be turned into 32-bit integer types. And the PVS-Studio analyzer is worried about this, issuing warnings.
But the interesting thing is that this code will work perfectly!
The HWND type is used to store descriptors that are used in Windows to work with various system objects. The same types are HANDLE, HMENU, HPALETTE, HBITMAP and so on.
Although descriptors are 64-bit pointers, for greater compatibility (for example, to allow interaction between 32-bit and 64-bit processes) they use only the lower 32-bits. See "
Microsoft Interface Definition Language (MIDL): 64-Bit Porting Guide " (USER and GDI handles are sign extended 32b values) for details.
By placing the HWND type in 32-bit types, the developers were hardly based precisely on these assumptions. Most likely, this is not a very neat code that works correctly thanks to the luck and efforts of the developers of the Windows API.
Conclusion
Use static analysis regularly during development, and you will find a lot of errors at the earliest stages. I naturally recommend first of all to get acquainted with the
PVS-Studio code analyzer. However, there are many other good code analyzers:
static code analysis tools .
Links
Additional links that can explain some of the subtle points described in the article.
- Knowledge base. Overwrite memory - why?
- Documentation. V668. It was not allowed to use the operator.
- Knowledge base. How to correctly cast a pointer to an int in a 64-bit program?
- Karpov Andrey, Evgeny Ryzhkov. Lessons learned from developing 64-bit C / C ++ applications .