📜 ⬆️ ⬇️

Chromium: the sixth project check and 250 bugs

This text begins a series of articles devoted to the regular testing of the Chromium project with the help of the PVS-Studio static code analyzer. The articles will look at various error patterns and offer recommendations to reduce the likelihood of their occurrence in the code. However, before you begin, you need to write a kind of introduction that will answer a number of questions in advance, as well as provide Chromium developers with all the errors they have noticed that they can begin to correct without waiting for the end of the cycle of articles.

Andrey Karpov, PVS-Studio

Prehistory


My name is Andrey Karpov, and I am an evangelist of static analysis in general and the static analysis tool PVS-Studio in particular. However, the term “technical evangelist” is already becoming obsolete, and it is being replaced by “developer advocate”.

I spend a lot of time writing material on improving the quality of the code and increasing the reliability of programs. Now I have a new reason to write a few articles on this topic - checking the open Chromium project with the help of the PVS-Studio analyzer. This is a big project, and in any big project live bugs of various types. This diversity allows you to consider several interesting topics related to the causes of these bugs and ways to prevent them.

It should be noted that this is not the first article dedicated to the Chromium project. My previous publications:
')

As you can see, I had bad titles with interesting titles, and my strength was over. Therefore, further the baton was picked up by my colleagues:


By the way, while I was studying the fresh report, I could not resist and wrote a short note about one very pleasant mistake. Since the article has already been published, I will provide here a link to it:


Every time we checked a project, there were a large number of errors in it. New check is no exception. Moreover, since the PVS-Studio analyzer detects errors more and more, at first I simply did not know what to do with all of them. Skimming through the report, I wrote out about 250 errors and thought. Describe all 250 errors in one article? It will be some kind of horror: long, boring and uninteresting. Break the description is not a few articles? It will not be better either; instead of one boring article, there will be several.

Then I decided to break the errors found by type and consider them separately. Moreover, I decided not only to describe the errors, but to try to suggest some methods of dealing with them besides static code analysis. After all, it is much better not to make a mistake at all than to find it later using static / dynamic code analysis / or something else. Even worse, if the user finds errors. Therefore, if it is possible to improve the coding style so that the appearance of an error becomes less likely, then it is worthwhile to speculate on this topic. This is what I will do in a series of articles.

Before I start looking at error patterns, I need an introduction that you are reading. For example, I need to explain why I did not find the strength to carefully study the report, why I cannot say about the percentage of false positives, and where I can read all the errors I have noticed at once.

Project Verification


At the very end of 2017, my colleague Svyatoslav Razmyslov downloaded the source code of the Chromium project, somehow conjured up with them and gave me a generated project for Visual Studio and a PVS-Studio report. Unfortunately, it was impossible to work with the solution in the Visual Studio environment. Wednesday did not survive the decision, which includes 5021 projects.

5021 projects in solution

Everything is incredibly slow, and in general the environment drops down after a while. Therefore, I studied the report using PVS-Studio Standalone. This, of course, is not as convenient as using Visual Studio, which I am used to, but is perfectly acceptable.

It is necessary to understand that the Chromium project is a big project. Not even said so. This is a BIG PROJECT.

The Chromium project and the libraries used in it consist of 114,201 files in C and C ++. The number of lines of code is 30,263,757. Of these, comments make up 16%.

Just that PVS-Studio can check a project of this size is already an achievement :).

What i found


While the New Year holidays were going, I looked through the report for three evenings and wrote out about 250 code fragments, which, in my opinion, require attention and editing. I confess that I did not find the time and energy to review the report carefully. I looked through many warnings very quickly, and some of them ignored them altogether when I was bored with some sort of error. I will tell you more about this in the next chapter.

It is important that I found a lot of mistakes, which is enough to write several articles. By the time I finish publishing the last line, information about errors found in the project may be a little out of date. But it does not matter. My goal is to demonstrate the capabilities of the static code analysis methodology and share some recommendations on coding style with readers.

So that the developers of Chromium and the libraries could correct the errors I noticed, without waiting for the end of the cycle of articles, I wrote them out in a separate file. This should also be done for the reason that perhaps not all error messages will be included in the articles.

Link to the file with the description of the observed defects: chromium.txt .

Why I did not master to view the report carefully?


I did not set up analyzer to reduce the number of false positives. Therefore, false warnings prevented me from looking at the report, and often I missed messages of the same type without peering at them.

Moreover, I missed code snippets, where it is not immediately clear whether there is an error or not. There are many messages, and I am alone. If I had begun to look closely at the code, I would have written articles only in a few months.

Let me show you with examples why some warnings are difficult to understand, especially if this is unfamiliar code. And in Chromium I do not know the entire code.

So, the PVS-Studio analyzer issued a warning to one of the V8 project files:

V547 CWE-570 Expression 'truncated' is always false. objects.cc 2867

Is a bug found, or is this a false positive? Try to understand yourself what is the matter. I added the comment "// <=" to where the analyzer points.

void String::StringShortPrint(StringStream* accumulator, bool show_details) { int len = length(); if (len > kMaxShortPrintLength) { accumulator->Add("<Very long string[%u]>", len); return; } if (!LooksValid()) { accumulator->Add("<Invalid String>"); return; } StringCharacterStream stream(this); bool truncated = false; if (len > kMaxShortPrintLength) { len = kMaxShortPrintLength; truncated = true; } bool one_byte = true; for (int i = 0; i < len; i++) { uint16_t c = stream.GetNext(); if (c < 32 || c >= 127) { one_byte = false; } } stream.Reset(this); if (one_byte) { if (show_details) accumulator->Add("<String[%u]: ", length()); for (int i = 0; i < len; i++) { accumulator->Put(static_cast<char>(stream.GetNext())); } if (show_details) accumulator->Put('>'); } else { // Backslash indicates that the string contains control // characters and that backslashes are therefore escaped. if (show_details) accumulator->Add("<String[%u]\\: ", length()); for (int i = 0; i < len; i++) { uint16_t c = stream.GetNext(); if (c == '\n') { accumulator->Add("\\n"); } else if (c == '\r') { accumulator->Add("\\r"); } else if (c == '\\') { accumulator->Add("\\\\"); } else if (c < 32 || c > 126) { accumulator->Add("\\x%02x", c); } else { accumulator->Put(static_cast<char>(c)); } } if (truncated) { // <= accumulator->Put('.'); accumulator->Put('.'); accumulator->Put('.'); } if (show_details) accumulator->Put('>'); } return; } 

Understood? Complicated?

Complicated! And that is why I alone cannot study all the warnings of the analyzer.

For those who are too lazy to understand, I will explain the essence.

So, the analyzer claims that the condition if (truncated) is always false. Let's reduce the function, leaving the essence:

 void F() { int len = length(); if (len > kMaxShortPrintLength) return; bool truncated = false; if (len > kMaxShortPrintLength) truncated = true; if (truncated) { // <= accumulator->Put('.'); accumulator->Put('.'); accumulator->Put('.'); } } 

The truncated flag must be true if the text is too long, that is, if the condition is satisfied (len> kMaxShortPrintLength) . However, if the text is too long, the function exits above.

That is why truncated is always false and three dots at the end will not be added.

And even now, after finding out the reason why the analyzer issues a warning, I do not know how the code should be written. Or, really, you need to immediately exit the function, then the code that adds points is just superfluous. Or points are needed, and the first check should be removed, which terminates the function ahead of time. It is very, very difficult to study errors in third-party code.

The PVS-Studio analyzer issued a lot of warnings V547. I looked only about their 10th part. Therefore, if you take a close look, much more errors will be found than I wrote out.

Here is another example of why I was bored with working with these warnings.

 void ResourcePrefetcher::OnReadCompleted(net::URLRequest* request, int bytes_read) { DCHECK_NE(net::ERR_IO_PENDING, bytes_read); if (bytes_read <= 0) { FinishRequest(request); return; } if (bytes_read > 0) ReadFullResponse(request); } 

PVS-Studio warning: V547 CWE-571 Expression 'bytes_read> 0' is always true. resource_prefetcher.cc 308

Unlike the previous case, everything is simple. The analyzer is exactly right in stating that the second condition is always true.

However, this is not an error, but a redundant code. Should I edit such a code? Complex issue. This is, by the way, the reason why it is much better to write the code immediately under the supervision of the analyzer, and not to wade heroically through the warnings of one-time launches.

If the analyzer were used regularly, then such redundant code would most likely not even get into the version control system. The programmer would see the warning and write more gracefully. For example:

 void ResourcePrefetcher::OnReadCompleted(net::URLRequest* request, int bytes_read) { DCHECK_NE(net::ERR_IO_PENDING, bytes_read); if (bytes_read <= 0) FinishRequest(request); else ReadFullResponse(request); } 

The analyzer is silent here. At the same time, the code has become shorter, simpler and clearer.

In addition to the V547 , the analyzer issued a whole mountain of V560 messages. This warning states that not all condition is always true or false, but some part of it.

These messages were also boring for me to study. This does not mean that the V560 warnings are bad. But real serious mistakes are rare. Mostly these warnings indicate poor quality redundant code.

An example of a boring redundant check:

 template <typename ConditionT, typename ActionT> std::unique_ptr<DeclarativeRule<ConditionT, ActionT>> DeclarativeRule<ConditionT, ActionT>::Create(....) { .... bool bad_message = false; // <= std::unique_ptr<ActionSet> actions = ActionSet::Create( browser_context, extension, rule->actions, error, &bad_message); // <= if (bad_message) { // <= *error = "An action of a rule set had an invalid " "structure that should have been caught " "by the JSON validator."; return std::move(error_result); } if (!error->empty() || bad_message) // <= return std::move(error_result); .... } 

PVS-Studio warning: V560 CWE-570 A part of conditional expression is always false: bad_message. declarative_rule.h 472

Condition:

 if (!error->empty() || bad_message) 

can be simplified to:

 if (!error->empty()) 

There is also an option to rewrite the code like this:

  if (bad_message) { *error = "An action of a rule set had an invalid " "structure that should have been caught " "by the JSON validator."; } if (!error->empty() || bad_message) return std::move(error_result); 

I hope I could explain why I did not study the report carefully. This is a lot of time consuming work.

The percentage of false positives


I can not say what the percentage of false positives. First of all, I couldn’t even see the log to the end and don’t know the exact number of errors detected by PVS-Studio. Secondly, it makes no sense to talk about the percentage of false positives without first setting the analyzer.

If you configure the PVS-Studio analyzer, you can expect 10-15% of false positives. An example of such a setting is described in the article " Characteristics of the PVS-Studio analyzer on the example of EFL Core Libraries, 10-15% of false positives ."

Of course, it is possible to perform a similar setup for Chromium, but it is irrational to do this only in order to demonstrate some numbers in the article. This is a big job that we are ready to do, but for a fee. Google may well attract our team to configure the analyzer and, at the same time, to edit all the errors found. Yes, consider this a hint .

Customization, clearly, is possible and will give a good result. For example, about HALF of all false positives is associated with the use of the DCHECK macro in the code.

This is what this macro looks like:

 #define LAZY_STREAM(stream, condition) \ !(condition) ? (void) 0 : ::logging::LogMessageVoidify() & (stream) #define DCHECK(condition) \ LAZY_STREAM(LOG_STREAM(DCHECK), !ANALYZER_ASSUME_TRUE(condition))\ << "Check failed: " #condition ". " 

From the point of view of the PVS-Studio analyzer, this is just some kind of condition check and a set of actions, after which the execution of the rest of the code continues.

As a result, the analyzer generates false warnings, for example, for such a code:

 bool Value::Equals(const Value* other) const { DCHECK(other); return *this == *other; } 

PVS-Studio reports: V1004 CWE-476 The 'other' pointer was used unsafely after it was verified against nullptr. Check lines: 621, 622. values.cc 622

From the point of view of the analyzer, the other pointer is checked for nullptr equality. But regardless of whether the other is a null pointer or not, its dereference will occur next. The analyzer considers similar actions suspicious.

The DCHECK macro is a type of assert macro. But if the analyzer knows about assert , then he does not understand what DCHECK is. To better explain what is happening, I will write pseudocode:

 bool Equals(T* ptr) const { if (!ptr) LogMessage(); return *this == *ptr; } 

This is how the PVS-Studio analyzer sees the code. First, the pointer is checked for equality nullptr . If the pointer is null, then the LogMessage function is called . In this case, the function is not marked as not returning control. So, regardless of whether the ptr is a null pointer or not, the function will continue to be executed.

Further the pointer is dereferenced. But there was a test, where he was checked for equality to zero! Therefore, the pointer may be null and the analyzer signals a problem in the code. This is how the analyzer generates a lot of absolutely correct, but useless messages.

By the way, such a macro implementation is confusing not only PVS-Studio. Therefore, for the analyzer built into Visual Studio, a special “backup” was made:

 #if defined(_PREFAST_) && defined(OS_WIN) // See comments on the previous use of __analysis_assume. #define DCHECK(condition) \ __analysis_assume(!!(condition)), \ LAZY_STREAM(LOG_STREAM(DCHECK), false) \ << "Check failed: " #condition ". " #define DPCHECK(condition) \ __analysis_assume(!!(condition)), \ LAZY_STREAM(PLOG_STREAM(DCHECK), false) \ << "Check failed: " #condition ". " #else // !(defined(_PREFAST_) && defined(OS_WIN)) 

If you make a similar backup for the PVS-Studio analyzer, the picture with false positives will change dramatically. In my opinion, half of the false positives immediately disappear. Yes, exactly half. The thing is that the DCHECK macro is used a huge number of times.

Other publications


This introductory article ends, and here I will gradually post links to other articles. Thanks for attention.

  1. Beautiful Chromium and gnarled memset .
  2. Break and fallthrough operator .
  3. Chromium: memory leaks .
  4. Chromium: typos .
  5. Chromium: using invalid data .
  6. Why it is important to check that the malloc function returned .
  7. Chromium: other errors .



If you want to share this article with an English-speaking audience, then please use the link to the translation: Andrey Karpov. Chromium: About the Sixth Check of the Project .

Source: https://habr.com/ru/post/347536/


All Articles