One of the programs that allows you to solve the problem of data compression is the popular file archiver 7-Zip, and I myself often use it. Readers have long turned to us with a request to check the code of this application. Well, it's time to look at its sources and see what PVS-Studio can find interesting.

')
Introduction
A few words about the project. 7-Zip is a free file archiver with a high degree of data compression, written in C and C ++. It has a small size of 235 thousand lines of code. It supports multiple compression algorithms and a variety of data formats, including the proprietary 7z format with the highly efficient LZMA compression algorithm. The program has been developed since 1999, it is free and has open source code. 7-Zip is the winner of the 2007 SourceForge.net Community Choice Awards in the "Best Project" and "Best Technical Design" categories. Version 16.00 was chosen for verification, the source code of which was downloaded from
http://www.7-zip.org/download.htmlTest results
To check the 7-Zip code, the
PVS-Studio v6.04 static code analyzer was used. For the article, the most interesting analyzer messages were selected and analyzed. Let's take a look at them.
Typos in conditional statements
Typos in conditional statements are often found in programs. In the case of a large number of checks, their detection can cause a lot of trouble. In such cases, a static code analyzer comes to the rescue.
I will give a few examples of this error.
V501 There are identical sub-expressions "Id == k_PPC" operator. 7zupdate.cpp 41
void SetDelta() { if (Id == k_IA64) Delta = 16; else if (Id == k_ARM || Id == k_PPC || Id == k_PPC)
The analyzer detected the same conditional expression. At best, one of the conditions
Id == k_PPC is redundant and does not affect the logic of the program. To correct a typo, you just need to remove this condition, then the correct expression will look like this:
if (Id == k_IA64) Delta = 16; else if (Id == k_ARM || Id == k_PPC) Delta = 4;
But even more serious consequences of such a typo are possible, if instead of the
k_PPC constant, in one of the recurring conditions, there must be another constant. In this case, the logic of the program may be violated.
Here is another example of a typo in a conditional statement:
V501 There are identical to the left and the right of the | || operator: offs> = nodeSize || offs> = nodeSize hfshandler.cpp 915
HRESULT CDatabase::LoadCatalog(....) { .... UInt32 nodeSize = (1 << hr.NodeSizeLog); UInt32 offs = Get16(p + nodeOffset + nodeSize - (i + 1) * 2); UInt32 offsNext = Get16(p + nodeOffset + nodeSize - (i + 2) * 2); UInt32 recSize = offsNext - offs; if (offs >= nodeSize || offs >= nodeSize
Here the problem is in the repeated condition
offs> = nodeSize .
Most likely, these typos were obtained when using Copy-Paste to duplicate the code. It makes no sense to urge to abandon copying code sections This is too convenient and useful to deprive yourself of such functionality in the editor. You just need to more carefully check the result.
Identical comparisons
The analyzer detected a potential error in a construct consisting of conditional operators. Here is her example:
V517 The use of if (A) {...} else if (A) {...} 'pattern was detected. There is a possibility of logical error presence. Check lines: 388, 390. archivecommandline.cpp 388
static void AddRenamePair(...., NRecursedType::EEnum type, ....) { .... if (type == NRecursedType::kRecursed) val.AddAscii("-r"); else if (type == NRecursedType::kRecursed)
In the code
NRecursedType is defined as follows:
namespace NRecursedType { enum EEnum { kRecursed, kWildcardOnlyRecursed, kNonRecursed }; }
It turns out that the second condition will never be fulfilled. Let's try to understand this problem in more detail. Based on the description of the command line parameters, the
-r parameter indicates the use of recursion for subdirectories. In the case of the
-r0 parameter
, recursion is used only for wildcard names. Comparing this with the definition of
NRecursedType, we can conclude that in the second case the type
NRecursedType :: kWildcardOnlyRecursed should be used
. Then the correct code will look like this:
static void AddRenamePair(...., NRecursedType::EEnum type, ....) { .... if (type == NRecursedType::kRecursed) val.AddAscii("-r"); else if (type == NRecursedType::kWildcardOnlyRecursed)
Conditions that are always true or false
It is necessary to closely monitor whether you work with a signed or unsigned type. Ignoring these features can lead to unpleasant consequences.
V547 Expression 'newSize <0' is always false. Unsigned type value is never <0. update.cpp 254
This is a sample code from a program in which this language feature was ignored:
STDMETHODIMP COutMultiVolStream::SetSize(UInt64 newSize) { if (newSize < 0)
The problem is that
newSize is unsigned and the condition will never be met. If a negative value is passed to the
SetSize function, this error will be ignored and the function will start using the incorrect size. In 7-Zip, there are 2 more conditions that, due to the confusion with signed / unsigned, are always true or always false.
- V547 Expression 'rec. SiAttr.SecurityId> = 0' is always true. Unsigned type value is always> = 0. ntfshandler.cpp 2142
- V547 Expression 's .Len ()> = 0' is always true. Unsigned type value is always> = 0. xarhandler.cpp 258
The same condition is checked twice.
The analyzer has detected a potential error related to the fact that the same condition is checked twice.
V571 Recurring check. The 'if (Result! = ((HRESULT) 0L))' condition was already verified in line 56. extractengine.cpp 58
This is how the program code fragment looks
void Process2() { .... if (Result != S_OK) { if (Result != S_OK)
Most likely, in this situation, the second check is simply redundant, but a situation is also possible in which the programmer did not change the second condition after copying, and it turned out to be erroneous.
Similar places in 7-Zip code:
- V571 Recurring check. The '! QuoteMode' condition was already verified in line 18. stringutils.cpp 20
- V571 Recurring check. The 'IsVarStr (params [1], 22)' condition was already verified in line 3377. nsisin.cpp 3381
Suspicious pointer handling
It occurs in the code 7-Zip and an error, when the pointer is dereferenced at the beginning, and only then is checked for equality to zero.
V595 The 'outStreamSpec' pointer was used before it was verified against nullptr. Check lines: 753, 755. lzmaalone.cpp 753.
This is a very
common mistake in all programs. It usually occurs because of inattention in the process of refactoring code. A null pointer reference will result in undefined program behavior. Consider an application code fragment containing an error of this type:
static int main2(int numArgs, const char *args[]) { .... if (!stdOutMode) Print_Size("Output size: ", outStreamSpec->ProcessedSize);
The
outStreamSpec pointer
is dereferenceed in the
outStreamSpec-> ProcessedSize expression. Then it is checked for equality to zero. One must either check the pointer even higher or the check that occurs below is meaningless. Here is a list of similar potential problem areas in the program code:
- V595 The '_file' pointer was used before it was verified against nullptr. Check lines: 2099, 2112. bench.cpp 2099
- V595 'a a pointer pointer pointer Check lines: 204, 214. updatepair.cpp 204
- V595 'options ’pointer Check lines: 631, 636. zipupdate.cpp 631
- V595 The 'volStreamSpec' pointer was used before it was verified against nullptr. Check lines: 856, 863. update.cpp 856
Exception inside destructor
If an exception occurs in the program, the stack collapses, during which objects are destroyed by calling destructors. If the destructor of the object being destroyed when the stack collapses throws another exception and this exception leaves the destructor, the C ++ library immediately terminates the program by calling the
terminate () function. It follows from this that destructors should never propagate exceptions. The exception thrown inside the destructor must be processed inside the same destructor.
The analyzer issued the following message:
V509 The 'throw' operator inside the destructor should be placed within the try..catch block. Raising exception inside the destructor is illegal. consoleclose.cpp 62
And here's what the destructor for generating an exception looks like:
CCtrlHandlerSetter::~CCtrlHandlerSetter() { #if !defined(UNDER_CE) && defined(_WIN32) if (!SetConsoleCtrlHandler(HandlerRoutine, FALSE)) throw "SetConsoleCtrlHandler fails";
The V509 message warns that if the CCtrlHandlerSetter object is destroyed during exception handling, then a new exception will immediately crash the program. This code should be rewritten in such a way as to report an error that occurred in the destructor, without using the exception mechanism. If the error is not critical, then it can be ignored.
CCtrlHandlerSetter::~CCtrlHandlerSetter() { #if !defined(UNDER_CE) && defined(_WIN32) try { if (!SetConsoleCtrlHandler(HandlerRoutine, FALSE)) throw "SetConsoleCtrlHandler fails";
Bool variable increment
Historically, for variables of type
bool an increment operation is allowed, it sets the variable to
true . This feature is related to the fact that earlier integer values ​​were used to represent Boolean variables. Subsequently, this feature remained to support backward compatibility programs. Starting with C ++ 98 standard, it is marked as
deprecated and is not recommended for use. In the upcoming standard C ++ 17, the ability to use an increment for a boolean variable is marked for deletion.
In the 7-Zip code, a couple of places were found where this outdated feature is used.
- V552 A bool type variable is being incremented: numMethods ++. Perhaps another variable should be incremented instead. wimhandler.cpp 308
- V552 A bool type variable is being incremented: numMethods ++. Perhaps another variable should be incremented instead. wimhandler.cpp 318
STDMETHODIMP CHandler::GetArchiveProperty(....) { .... bool numMethods = 0; for (unsigned i = 0; i < ARRAY_SIZE(k_Methods); i++) { if (methodMask & ((UInt32)1 << i)) { res.Add_Space_if_NotEmpty(); res += k_Methods[i]; numMethods++;
In this situation, there are two options. Or the variable
numMethods is a flag and in this case it is better to use initialization with the Boolean value
numMethods = true . Or, judging by the name of the variable, it is a counter that must be an integer.
Check for failed memory allocation
The analyzer detected a situation when the value of the pointer returned by the
new operator is compared with zero. As a rule, this means that the program, if it is impossible to allocate memory, will behave differently than the programmer expects.
This is a
plug- in. The exception will be generated in the case of memory allocation error. far.cpp 399
Here is how it looks in the program code:
static HANDLE MyOpenFilePluginW(const wchar_t *name) { .... CPlugin *plugin = new CPlugin( fullName,
If the
new operator could not allocate memory, then, according to the C ++ standard of the language, an exception is generated
std :: bad_alloc () . Thus, it does not make sense to check for equality to zero. The plugin pointer will never be zero. The function will never return the constant value
INVALID_HANDLE_VALUE . If it is impossible to allocate memory, an exception occurs that is better handled at a higher level, and the test for equality to zero can simply be removed. Well, or if exceptions in the application are undesirable, then you can use the
new operator that does not generate exceptions; in this case, you can check the return value to zero. In the application code, there are 3 more such checks:
- V668 using using _ _ allocated allocated allocated allocated allocated allocated allocated allocated allocated allocated allocated _ _ The exception will be generated in the case of memory allocation error. enumformatetc.cpp 46
- V668 has been defined as the “m_States” The exception will be generated in the case of memory allocation error. bzip2decoder.cpp 445
- V668 has been set using the 'new' operator. The exception will be generated in the case of memory allocation error. bzip2encoder.cpp 170
Constructions requiring optimization
Now a little about the places that can potentially be optimized. An object is passed to the function. This object is passed by value, but it is not modified, since there is a keyword
const . Perhaps it would be rational to pass it using a constant reference in C ++ or using a pointer in C.
Here is an example for a vector:
V801 Decreased performance. It is better to redefine it. Consider replacing 'const ... pathParts' with 'const ... & pathParts'. wildcard.cpp 487
static unsigned GetNumPrefixParts(const UStringVector pathParts) { .... }
Calling this function will call the copy constructor for the
UStringVector class. If such copying of objects occurs frequently, then this can significantly reduce application performance. This code can be easily optimized by adding a link:
static unsigned GetNumPrefixParts(const UStringVector& pathParts) { .... }
Here are some more similar places:
- V801 Decreased performance. It is better to redefine it. Consider replacing 'const ... props' with 'const ... & props'. benchmarkdialog.cpp 766
- V801 Instantiate CRecordVector <CAttribIconPair>: Decreased performance. It is better to redefine it. Consider replacing 'const ... item' with 'const ... & item'. yvector.h 199
Conclusion
7-Zip is a small project that has been developing for quite some time and, of course, failed to find a large number of serious errors. But there are still places in the code that need attention and the PVS-Studio static code analyzer will greatly facilitate this work. If you are developing a project in C, C ++ or C #, I suggest you to download PVS-Studio without delay and check your project:
http://www.viva64.com/ru/pvs-studio-download/If you want to share this article with an English-speaking audience, then please use the link to the translation: Kirill Yudintsev.
Checking 7-Zip with PVS-Studio analyzer .