📜 ⬆️ ⬇️

Top 10 bugs in C ++ projects for 2018

For three months now, 2018 is over. For many, it flew almost imperceptibly, but for us, the developers of PVS-Studio, it turned out to be very rich. We worked hard, fought fearlessly to promote static analysis to the masses and looked for new errors in open source projects written in C, C ++, C # and Java. We collected ten of the most interesting of them for you in this article!



We searched for interesting places with the help of the PVS-Studio static code analyzer. It can detect errors and potential vulnerabilities in code written in the languages ​​mentioned above.

If you are interested in finding errors yourself, you can always download and try our analyzer. We provide a free version of the analyzer for students and enthusiast programmers, a free license for developers of open-source projects, as well as an evaluation version for all-all-all. Who knows, maybe next year you can make your top 10? :)
')
Note: I suggest the reader to check himself and, before looking at the analyzer warning, try to identify the anomalies himself. How many mistakes can you find?

Tenth place


Source: And back into space: how the unicorn Stellarium visited

This error was detected while checking the virtual planetarium Stellarium.

The given code fragment, although small, is fraught with a rather cunning error:

Plane::Plane(Vec3f &v1, Vec3f &v2, Vec3f &v3) : distance(0.0f), sDistance(0.0f) { Plane(v1, v2, v3, SPolygon::CCW); } 

Found?

PVS-Studio warning: V603 The object was not used. If you wish to call the constructor, 'this-> Plane :: Plane (....)' should be used. Plane.cpp 29

The author of the code wanted to initialize part of the object's fields using another constructor nested in the main one. True, instead, he only managed to create a temporary object that would be destroyed when leaving his area of ​​visibility. Thus, several fields of the object will remain uninitialized.

Instead of a nested constructor call, you should use the delegating constructor introduced in C ++ 11. For example, you could do this:

 Plane::Plane(Vec3f& v1, Vec3f& v2, Vec3f& v3) : Plane(v1, v2, v3, SPolygon::CCW) { distance = 0.0f; sDistance = 0.0f; } 

Then all the required fields would be correctly initialized. Isn't it great?

Ninth place


Source: Perl 5: as in the macros errors were hidden

In ninth place, there is a remarkable macro from the Perl 5 source code.

Collecting errors for writing an article, my colleague Svyatoslav came across a warning issued by the analyzer to use a macro. Here it is:

 PP(pp_match) { .... MgBYTEPOS_set(mg, TARG, truebase, RXp_OFFS(prog)[0].end); .... } 

To find out what was wrong, Svyatoslav dug deeper. He opened the definition of the macro, and saw that it contains several more nested macros, some of which also had nested macros. To understand this was so difficult that I had to use a preprocessed file. But, alas, it did not help. In place of the previous line of code, Svyatoslav discovered this:

 (((targ)->sv_flags & 0x00000400) && (!((targ)->sv_flags & 0x00200000) || S_sv_only_taint_gmagic(targ)) ? (mg)->mg_len = ((prog->offs)[0].end), (mg)->mg_flags |= 0x40 : ((mg)->mg_len = (((targ)->sv_flags & 0x20000000) && !__builtin_expect(((((PL_curcop)->cop_hints + 0) & 0x00000008) ? (_Bool)1 :(_Bool)0),(0))) ? (ssize_t)Perl_utf8_length( (U8 *)(truebase), (U8 *)(truebase)+((prog->offs)[0].end)) : (ssize_t)((prog->offs)[0].end), (mg)->mg_flags &= ~0x40)); 

PVS-Studio warning : V502 Perhaps the '?:' Operator works in a different way. The '?:' Operator has a operator. pp_hot.c 3036

I think it will be difficult to find such a mistake with my eyes. To be honest, we meditated on this code for a long time and came to the conclusion that in fact there is no mistake here. But in any case, this is a rather entertaining example of unreadable code.

They say that macros are evil. Of course, there are moments when they are indispensable, but if you can replace a macro with a function, you should definitely do it.

Particularly fraught with nested macros. Not only because they are difficult to understand, but also because they can give unpredictable results. If the author of the macro accidentally makes a mistake in such a macro, it will be much more difficult to find it than in the function.

Eighth place


Source: Chromium: other errors

The following example was taken from a series of articles on the analysis of the Chromium project. It was covered in the library WebRTC.

 std::vector<SdpVideoFormat> StereoDecoderFactory::GetSupportedFormats() const { std::vector<SdpVideoFormat> formats = ....; for (const auto& format : formats) { if (cricket::CodecNamesEq(....)) { .... formats.push_back(stereo_format); } } return formats; } 

PVS-Studio warning : V789 CWE-672 iterators stereocodecfactory.cc 89

The error is that the size of the vector formats changes within the range-based-for loop. Range-based loops are based on iterators, so resizing a container within such loops can invalidate these iterators.

This error will persist if you rewrite the loop with explicit use of iterators. Therefore, for clarity, you can bring this code:

 for (auto format = begin(formats), __end = end(formats); format != __end; ++format) { if (cricket::CodecNamesEq(....)) { .... formats.push_back(stereo_format); } } 

For example, when using the push_back method, vector reallocation can occur - and then iterators will point to an invalid memory area.

To avoid such errors, you should follow the rule: never change the size of the container inside the loop, the conditions of which are tied to this container. This applies to range-based-cycles and cycles using iterators. You can read about which operations may result in iterator invalidation in the discussion at StackOverflow.

Seventh place


Source: Godot: Regular use of static code analyzers

The first example from the video game industry will be a snippet of code found by us in the Godot game engine. You may have to sweat to detect a mistake with your eyes, but I am sure that our sophisticated readers will cope with it:

 void AnimationNodeBlendSpace1D::add_blend_point( const Ref<AnimationRootNode> &p_node, float p_position, int p_at_index) { ERR_FAIL_COND(blend_points_used >= MAX_BLEND_POINTS); ERR_FAIL_COND(p_node.is_null()); ERR_FAIL_COND(p_at_index < -1 || p_at_index > blend_points_used); if (p_at_index == -1 || p_at_index == blend_points_used) { p_at_index = blend_points_used; } else { for (int i = blend_points_used - 1; i > p_at_index; i++) { blend_points[i] = blend_points[i - 1]; } } .... } 

PVS-Studio warning : V621 CWE-835 Consider inspecting the 'for' operator. It is possible that you will not be executed at all. animation_blend_space_1d.cpp 113

Let us consider the condition of the cycle in more detail. The counter variable is initialized with the value blend_points_used - 1 . At the same time, based on the two previous checks (in ERR_FAIL_COND and in if ), it becomes clear that at the time of the execution of the blend_points_used cycle there will always be more than p_at_index . Thus, either the cycle condition will always be true, or the cycle will not be executed at all.

If blend_points_used - 1 == p_at_index , then the loop is not executed.

In all other cases, the check i> p_at_index will always be true, since the counter i is incremented at each iteration of the loop.

It may seem that the loop will run forever, but it is not.

First, an integer overflow occurs for the variable i , which is an undefined behavior. Therefore, relying on it is not worth it.

If i were of the unsigned int type, then after the counter reaches the maximum possible value, the i ++ operator would turn it into 0 . This behavior is defined by the standard and is called "Unsigned wrapping". However, be aware that using such a mechanism is also not a good idea .

This was the first, but there is still the second! The fact is that the integer overflow will not even reach. Where before going out of the array. This means that an attempt will be made to access the memory outside the block allocated for the array. And this is also an indefinite behavior. Classic example :)

To make it easier to avoid such errors, I can only give a couple of recommendations:

  1. Write more simple and clear code
  2. Perform a more thorough Code Review and write more tests for freshly written code.
  3. Use static analyzers;)


Sixth place


Source: Amazon Lumberyard: A Cry from the Heart

Another example from the gamedev industry, namely from the Amazon Lumberyard AAA engine source code.

 void TranslateVariableNameByOperandType(....) { // Igor: yet another Qualcomm's special case // GLSL compiler thinks that -2147483648 is // an integer overflow which is not if (*((int*)(&psOperand->afImmediates[0])) == 2147483648) { bformata(glsl, "-2147483647-1"); } else { // Igor: this is expected to fix // paranoid compiler checks such as Qualcomm's if (*((unsigned int*)(&psOperand->afImmediates[0])) >= 2147483648) { bformata(glsl, "%d", *((int*)(&psOperand->afImmediates[0]))); } else { bformata(glsl, "%d", *((int*)(&psOperand->afImmediates[0]))); } } bcatcstr(glsl, ")"); .... } 

PVS-Studio warning : V523 The 'then' statement is equivalent to the 'else' statement. toglsloperand.c 700

Amazon Lumberyard is being developed as a cross-platform engine. Therefore, developers are trying to support as many compilers as possible. Programmer Igor ran into the Qualcomm compiler, as we are told comments.

It is not known whether Igor was able to complete his task and cope with the "paranoid" checks of the compiler, but he left behind a very strange code. It is strange that both then - and else- branches of the if operator contain absolutely identical code. Most likely, such an error was made as a result of inaccurate Copy-Paste.

I do not even know what can I advise. Therefore, I simply wish the developers of Amazon Lumberyard success in correcting errors, and Igor, the programmer, good luck!

Fifth place


Source: Once again, the PVS-Studio analyzer turned out to be more attentive than a person

An interesting story happened to the following example. My colleague, Andrei Karpov, was preparing an article about another testing of the Qt framework. During the writing of notable errors, he encountered a warning from the analyzer, which he considered false. Here is the relevant code snippet and warning:

 QWindowsCursor::CursorState QWindowsCursor::cursorState() { enum { cursorShowing = 0x1, cursorSuppressed = 0x2 }; CURSORINFO cursorInfo; cursorInfo.cbSize = sizeof(CURSORINFO); if (GetCursorInfo(&cursorInfo)) { if (cursorInfo.flags & CursorShowing) // <= V616 .... } 

PVS-Studio Warning: V616 CWE-480 The "CursorShowing" is used in the bitwise operation. qwindowscursor.cpp 669

That is, PVS-Studio swore at the place in which, obviously, there is no error! It cannot be that the constant CursorShowing is 0 , because literally a couple of lines above it is initialized to 1 .

Since an unstable version of the analyzer was used for testing, Andrei doubted the correctness of the warning. He carefully looked at this piece of code several times, and still did not find an error. As a result, he wrote out this false triggering in the bugtracker so that other colleagues could correct the situation.

And only after a detailed analysis, it turned out that PVS-Studio was again more attentive than a person. The value 0x1 is assigned to the named cursorShowing constant, and the named constant CursorShowing participates in the bitwise operation “and”. These are completely different constants, because the first begins with a lowercase letter, and the second with a capital letter.

The code compiles successfully, because the QWindowsCursor class really contains a constant with that name. Here is its definition:

 class QWindowsCursor : public QPlatformCursor { public: enum CursorState { CursorShowing, CursorHidden, CursorSuppressed }; .... } 

If the enum constant is not explicitly assigned, it will be initialized by default. Since the CursorShowing is the first element of the enumeration, it will be assigned the value 0 .

To prevent such errors, the entities should not be given overly similar names. One should especially carefully follow this rule if these entities are of the same type or can be implicitly given to each other. Indeed, in such cases it will be almost impossible to detect an error in the eye, and the incorrect code will successfully compile and live happily ever after inside your project.

Fourth place


Source: Shoot in the leg, processing input

We are approaching the top three finalists, and the next step is an error from the FreeSWITCH project.

 static const char *basic_gets(int *cnt) { .... int c = getchar(); if (c < 0) { if (fgets(command_buf, sizeof(command_buf) - 1, stdin) != command_buf) { break; } command_buf[strlen(command_buf)-1] = '\0'; /* remove endline */ break; } .... } 

PVS-Studio warning: V1010 CWE-20 Unchecked tainted data is used in index: 'strlen (command_buf)'.

The analyzer warns that unchecked data is used in the strlen expression (command_buf) - 1 . And really: if command_buf is empty from the point of view of the C language string (containing a single character - '\ 0'), then strlen (command_buf) will return 0 . In this case, command_buf [-1] will be accessed, which represents an undefined behavior. Trouble

The sap of this error is not even why it happens, but how it happens. This error is one of those nice examples that you can “touch” yourself, reproduce. You can run FreeSwitch, perform some actions that will lead to the execution of the above code, and pass an empty line to the program.

As a result, with a slight movement of the hand, the working program turns (yes no, not into elegant shorts ) into a non-working one! Details on how to reproduce this error can be found in the source article at the link above, but for now I will give a visual result:



Remember that the input data can be anything, and you should always check them. Then the analyzer will not swear, and the program will be more reliable.

Now it's time to deal with our winners: we move to the final!



Third place


Source: NCBI Genome Workbench: Scientific Research Under Threat

The top three winners are opened by a section of code from the NCBI Genome Workbench project - a set of tools for studying and analyzing genetic data. Although it is not necessary to be a genetically modified superman to detect a mistake here, quite a few know about such a possibility of making a mistake.

 /** * Crypt a given password using schema required for NTLMv1 authentication * @param passwd clear text domain password * @param challenge challenge data given by server * @param flags NTLM flags from server side * @param answer buffer where to store crypted password */ void tds_answer_challenge(....) { .... if (ntlm_v == 1) { .... /* with security is best be pedantic */ memset(hash, 0, sizeof(hash)); memset(passwd_buf, 0, sizeof(passwd_buf)); ... } else { .... } } 

PVS-Studio warnings:


Could you find a mistake? If so, then you - well done! .. well, or still a genetically modified superman.

The fact is that modern optimizing compilers can do a lot to make the compiled program run faster. In particular, compilers are able to track that the buffer passed to memset is not used anywhere else.

In this case, they can remove the "unnecessary" call memset , and have the full right to it. Then the buffer that stores the important data may remain in the memory to the delight of attackers.

Against this background, the literary comment “with safety is good to be pedantic” looks even funnier. Judging by the very small number of warnings issued for this project, the developers tried very hard to be careful and write safe code. However, as we can see, it is very easy to miss this security defect. According to the Common Weakness Enumeration, the defect is classified as CWE-14 : Compiler Removal.

To clear memory safely, use the memset_s () function. Not only is it more secure than memset () , but it also cannot be “ignored” by the compiler.

Second place


Source: How PVS-Studio turned out to be more attentive than three and a half programmers

The silver medalist of this top was sent to us by one of our clients. He was sure that the analyzer gave false warnings.

The letter was received by Eugene, skimmed it, and sent it to Svyatoslav. Svetoslav thoughtfully looked at the part of the code sent by the client, and thought “how can the analyzer be so grossly wrong?”. Therefore, he went for advice to Andrew. He also checked the plot and decided: indeed, the analyzer gives false positives.

What can you do, it is necessary to correct. And only when Svetoslav began to make synthetic examples in order to formalize the task as a bugtracker, did he understand what was the matter.

Errors were actually present in the code, but none of the programmers could detect them. Honestly, the author of this article didn’t do it either.

And this is despite the fact that the analyzer clearly issued warnings for erroneous places!

Can you find such a nosy mistake? Check yourself for vigilance and attentiveness.


PVS-Studio warning:
If you succeed - my respect you do not hold!

The error lies in the fact that the logical negation operator (!) Does not apply to the whole condition, but only to its first subexpression:

 !((ch >= 0x0FF10) && (ch <= 0x0FF19)) 

If this condition is met, then the value of the variable ch lies in the interval [0x0FF10 ... 0x0FF19]. Thus, the four further comparisons no longer make sense: they will always be either true or false.

To avoid such errors, it is necessary to adhere to several rules. Firstly, it is very convenient and visual to align the code with a table. Secondly, you should not overload the expression with brackets. For example, this code can be rewritten as:

 const bool isLetterOrDigit = (ch >= 0x0FF10 && ch <= 0x0FF19) // 0..9 || (ch >= 0x0FF21 && ch <= 0x0FF3A) // A..Z || (ch >= 0x0FF41 && ch <= 0x0FF5A); // a..z if (!isLetterOrDigit) 

Then, firstly, the number of brackets becomes much smaller, and secondly, the probability of “catching” an accidental mistake by the eyes increases.

And now - cherry: we move to the first place!

First place


Source: System in shock: interesting errors in the source codes of the legendary System Shock

So, the finalist of our today's top becomes a mistake from the legendary System Shock! This game, released in 1994, became the progenitor and inspirer of such iconic games as Dead Space, BioShock and Deus Ex.

But first, I have to confess something. What I will show you now does not contain any error. By and large, it is not even a code snippet, but I just could not resist sharing it with you!

The fact is that in the process of analyzing the source code of the game, my colleague Victoria found many interesting comments. Here and there, suddenly there were humorous and ironic remarks, and even poems:

 // I'll give you fish, I'll give you candy, // I'll give you, everything I have in my hand // that kid from the wrong side came over my house again, // decapitated all my dolls // and if you bore me, you lose your soul to me // - "Gepetto", Belly, _Star_ // And here, ladies and gentlemen, // is a celebration of C and C++ and their untamed passion... // ================== TerrainData terrain_info; // Now the actual stuff... // ======================= // this is all outrageously horrible, as we dont know what // we really need to deal with here // And if you thought the hack for papers was bad, // wait until you see the one for datas... - X // Returns whether or not in the humble opinion of the // sound system, the sample should be politely obliterated // out of existence // it's a wonderful world, with a lot of strange men // who are standing around, and they all wearing towels 

For our Russian-speaking readers, I made an approximate free translation:
 //    ,    , //    ,       //      //      //     //     ,      // - "Gepetto", Belly, _Star_ //  ,   , //    C  C++     // ================== TerrainData terrain_info; //    ... // ======================= //    ,     , //         //    ,          //      ,      ... - X //       , //         //   ,     //   ,     

These are the comments left by the game developers at the beginning of the distant nineties ... By the way, Dag Church, the chief designer of System Shock, was also writing code. Who knows, maybe some of these comments are written by him personally? I hope about the men in towels - this is not his handiwork :)

Conclusion


In conclusion, I want to thank my colleagues for looking for new errors and writing articles about them. Thanks guys! Without you, this article would not have turned out so interesting.

And I also want to tell you a little about our achievements, because for a whole year we have been dealing not only with finding errors. We also developed and improved the analyzer, as a result of which it has undergone significant changes.

For example, we added support for several new compilers and expanded the list of diagnostic rules. We also implemented initial support for the MISRA C and MISRA C ++ standards . The most important and time-consuming innovation was the support of a new language. Yes, now we can analyze Java code ! And we have updated icon:)

I also want to thank our readers. Thank you for reading our articles and writing to us! Your feedback is very pleasant and important to us.

This is where our top 10 C ++ errors for the 2018th year have come to an end. Which places did you like the most and why? Did you happen to encounter interesting examples in 2018? Tell us about it in the comments!

Until next time!



If you want to share this article with an English-speaking audience, then please use the link to the translation: George Gribkov. Top 10 bugs of C ++ projects found in 2018

Source: https://habr.com/ru/post/444570/


All Articles