📜 ⬆️ ⬇️

PHP 7 Checker


Re-checking projects is often quite interesting. It allows you to find out which new errors were made during the development of the application, and which errors have already been fixed. Previously, my colleague already wrote about checking PHP. With the release of the new version (PHP7), I decided to check the interpreter source code again and found something interesting.


Checked project


PHP is a general purpose scripting language intensively used for developing web applications. The language and its interpreter are developed in an open source project.


')
On December 3, 2015, PHP version 7.0.0 was announced. The new version is based on the experimental PHP branch, which was originally called phpng (Next Generation PHP), and was developed with a focus on increasing performance and reducing memory consumption.

The object of verification was the PHP interpreter, the source code of which is available in the repository on GitHub . The tested branch is master.

The static code analyzer PVS-Studio was used as the analysis tool. For verification, a compilation monitoring system was used, allowing the project to be analyzed regardless of which system is used to build this project. A trial version of the analyzer is available for download at the link .

You can get acquainted with the previous verification of the project in the article by Svetoslav Razmyslov " Note about PHP verification ".

Bugs found


It is worth noting that many of the errors found by the analyzer are in libraries used in PHP. If all of them are considered in this article, its volume would have grown considerably. But on the other hand, the mistakes made in the libraries used in the project will also manifest themselves when using the project. Therefore, some of these errors will still be written out in this article.

Separately, I would like to note that during the analysis, the impression was that the code was written entirely in macros. They are just everywhere. This greatly complicates the task of analysis, I generally keep quiet about debugging such a code. Their ubiquitous use, by the way, turned out to be sideways - errors from macros were taken away along the code. But more on that below.
static void spl_fixedarray_object_write_dimension(zval *object, zval *offset, zval *value) { .... if (intern->fptr_offset_set) { zval tmp; if (!offset) { ZVAL_NULL(&tmp); offset = &tmp; } else { SEPARATE_ARG_IF_REF(offset); } .... spl_fixedarray_object_write_dimension_helper(intern, offset, value) } 

PVS-Studio warning : V506 Pointer to local variable 'tmp' is stored outside. Such a pointer will become invalid. spl_fixedarray.c 420

If the conditional statement for the if statement given above is true, the offset pointer can be assigned the address of the variable tmp . At the same time, the lifetime of the variable tmp is limited to its scope, i.e. body of the if statement . But below the code, a function is called that takes the offset pointer as one of the parameters, which refers to the variable already destroyed, which can lead to an error when working with this pointer.

Another weird code:
 #define MIN(a, b) (((a)<(b))?(a):(b)) #define MAX(a, b) (((a)>(b))?(a):(b)) SPL_METHOD(SplFileObject, fwrite) { .... size_t str_len; zend_long length = 0; .... str_len = MAX(0, MIN((size_t)length, str_len)); .... } 

PVS-Studio warning : V547 Expression is always false. Unsigned type value is never <0. spl_directory.c 2886

The code logic is simple - first compare 2 values ​​and take the smaller one, after which the result obtained is compared with zero and the largest of these values ​​is written to the str_len variable. The problem lies in the fact that size_t is an unsigned type, therefore, its value is always non-negative. As a result, the use of the second macro MAX simply does not make sense. What is it - just an extra operation, or some more serious mistake - to judge the developer who wrote the code.

This is not the only strange comparison, there are others.
 static size_t sapi_cli_ub_write(const char *str, size_t str_length) { .... size_t ub_wrote; ub_wrote = cli_shell_callbacks.cli_shell_ub_write(str, str_length); if (ub_wrote > -1) { return ub_wrote; } } 

PVS-Studio warning : V605 Consider verifying the expression: ub_wrote> - 1. An unsigned value is compared to the number -1. php_cli.c 307

The variable ub_wrote is of type unsigned size_t . However, the following check is performed like ub_wrote> -1. At first glance it may seem that this expression will always be true, since ub_wrote can store only non-negative values ​​in itself. In fact, everything is more interesting.

The literal type -1 ( int ) will be converted to the variable type ub_wrote ( size_t ), that is, the converted value will participate in the comparison of ub_wrote with the variable. In a 32-bit program, this will be an unsigned value 0xFFFFFFFF , and in a 64-bit program it will be 0xFFFFFFFFFFFFFFFF . Thus, the ub_wrote variable will be compared with the maximum value of the unsigned long type. As a result, the result of this comparison will always be false , and the return statement will never be executed.

Similar code met again. Corresponding message: V605 Consider verifying the expression: shell_wrote> - 1. An unsigned value is compared to the number -1. php_cli.c 272

The following code, for which the analyzer issued a warning, is also associated with a macro.
 PHPAPI void php_print_info(int flag) { .... if (!sapi_module.phpinfo_as_text) { php_info_print("<h1>Configuration</h1>\n"); } else { SECTION("Configuration"); } .... } 

PVS-Studio warning : V571 Recurring check. The 'if (! Sapi_module.phpinfo_as_text)' condition is already verified in line 975. info.c 978

At first glance it may seem that everything is normal, and there is no error. But let's take a look at what the SECTION macro is .
 #define SECTION(name) if (!sapi_module.phpinfo_as_text) { \ php_info_print("<h2>" name "</h2>\n"); \ } else { \ php_info_print_table_start(); \ php_info_print_table_header(1, name); \ php_info_print_table_end(); \ } \ 

As a result, after preprocessing, the following code will be contained in the * .i-file:
 PHPAPI void php_print_info(int flag) { .... if (!sapi_module.phpinfo_as_text) { php_info_print("<h1>Configuration</h1>\n"); } else { if (!sapi_module.phpinfo_as_text) { php_info_print("<h2>Configuration</h2>\n"); } else { php_info_print_table_start(); php_info_print_table_header(1, "Configuration"); php_info_print_table_end(); } } .... } 

Now the problem has become much easier to notice. Some condition is checked (! Sapi_module.phpinfo_as_text ) and, if it is not met, this condition is checked again (which, of course, will never be fulfilled). Agree, it looks at least strange.

A similar situation associated with the use of this macro, met again in the same function:
 PHPAPI void php_print_info(int flag) { .... if (!sapi_module.phpinfo_as_text) { SECTION("PHP License"); .... } .... } 

PVS-Studio warning : V571 Recurring check. The 'if (! Sapi_module.phpinfo_as_text)' condition was already verified in line 1058. info.c 1059

Similar situation. Same condition, same macro. We open, we receive a code of the following form:
 PHPAPI void php_print_info(int flag) { .... if (!sapi_module.phpinfo_as_text) { if (!sapi_module.phpinfo_as_text) { php_info_print("<h2>PHP License</h2>\n"); } else { php_info_print_table_start(); php_info_print_table_header(1, "PHP License"); php_info_print_table_end(); } .... } .... } 

Again the same condition is checked twice. The second will be checked only if the first is true. Then, if the first condition is true ( ! Sapi_module.phpinfo_as_text ), then the second condition will always be true. In this case, the code located in the else branch of the second if statement will never be executed.

Go ahead.
 static int preg_get_backref(char **str, int *backref) { .... register char *walk = *str; .... if (*walk == 0 || *walk != '}') .... } 

PVS-Studio warning : V590 Consider inspecting the '* walk == 0 || * walk! = '}' 'expression. The expression is misprint. php_pcre.c 1033

In this code, the pointer is dereferenced, and its value is compared with some literals. This code is redundant. For clarity, we rewrite, simplifying, the expression:
 if (a == 0 || a != 125) 

As you can see, the condition can be simplified to a! = 125 .

This may indicate both the redundancy of the code, and the fact that the programmer made a more serious error.



The source of some problem areas was the Zend framework:
 static zend_mm_heap *zend_mm_init(void) { .... heap->limit = (Z_L(-1) >> Z_L(1)); .... } 

PVS-Studio warning : V610 Unspecified behavior. Check the shift operator '>>'. The left operand '(- 1)' is negative. zend_alloc.c 1865

This code uses the operation of the right-sided bit shift of a negative value. This is a case of unspecified behavior. Although from the point of view of language, such a case is not wrong, in contrast to the indefinite behavior, it is better to avoid such cases, since the behavior of such code may differ depending on the platform and the compiler.

Another interesting bug is contained in the PCRE library:
 const pcre_uint32 PRIV(ucp_gbtable[]) = { .... (1<<ucp_gbExtend)|(1<<ucp_gbSpacingMark)|(1<<ucp_gbL)| /* 6 L */ (1<<ucp_gbL)|(1<<ucp_gbV)|(1<<ucp_gbLV)|(1<<ucp_gbLVT), .... }; 

PVS-Studio Warning: V501 There are identical sub-expressions '(1 << ucp_gbL)' to the left | operator. pcre_tables.c 161

Errors of this kind can be called classic. They were and are in C ++ projects, projects written in C # are not spared from them , and, probably, in other languages ​​too. The programmer made a typo and duplicated a subexpression in the expression (1 << ucp_gbL) . Most likely (judging by the rest of the source code), the subexpression was implied (1 << ucp_gbT) . Such errors are not evident even in a separately written code fragment, and even against the background of the rest - they become generally extremely difficult to detect.

By the way, my colleague wrote about this error in a previous article, but it's still there.

Another place from the same library:
 .... firstchar = mcbuffer[0] | req_caseopt; firstchar = mcbuffer[0]; firstcharflags = req_caseopt; .... 

PVS-Studio warning : V519 The 'firstchar' variable is assigned values ​​twice successively. Perhaps this is a mistake. Check lines: 8163, 8164. pcre_compile.c 8164

Agree, the code looks strange. Write the result of the operation '|' in the variable firstchar , and immediately overwrite its value, ignoring the result of the previous operation. Perhaps, in the second case, instead of firstchar , another variable was implied, but it is certainly difficult to say.

Excess conditions have also been met. For example:
 PHPAPI php_stream *_php_stream_fopen_with_path(.... const char *path, ....) { .... if (!path || (path && !*path)) { .... } 

PVS-Studio warning : V728 Alert check can be simplified. The '||' operator path by and express expressions '! path' and 'path'. plain_wrapper.c 1487

This expression is redundant: in the second subexpression you can remove the check of the path pointer for the inequality nullptr. Then the simplified expression takes the following form:
 if (!path || !*path)) { 

Do not underestimate these mistakes. Instead of the path variable, something else could have been completely implied, then the expression would not be redundant, but erroneous. By the way, this is not the only place. We met a few more:

About third-party libraries


I wrote about this above, but I want to repeat it again. PHP uses some third-party libraries that, alas, are not perfect and contain bugs. Many warnings were issued just on the code of these libraries. It would be possible to analyze them all, but then, believe me, the article would have been very large.

Determining whether the error is in the PHP interpreter code or a third-party library is quite simple - at the beginning of all the source files there is a comment describing the license, project, authors, etc. From these comments it is easy to determine which project file hid the error.

On the other hand, some interesting places were worth considering. In any case, if you use any third-party libraries, the responsibility to users for errors made in these libraries also falls on your shoulders, since the error may manifest itself in the use of your product. Therefore, it is worth more attentively to what dependencies you are drawing in your project.

Conclusion




The test results came out interesting. In fact, there were a lot of errors; the article considered only a small part of the general warnings of high and medium confidence levels. Many of these errors are in libraries that are used in PHP, which means, implicitly, wander into its code. The PHP code itself also showed interesting errors, some of which were written above.

Summing up, I would like to say that it is necessary to use various means to improve the productivity of work and the quality of the code. You should not be limited to tests and code review. A static analyzer is one of those tools that can help a programmer to write better code, allowing more useful time to be spent on finding errors made due to a typo. But do not forget that the static analyzer is a tool for regular use. And if you are not using it yet - I recommend downloading PVS-Studio, check your project and see what interesting things you can find.



If you want to share this article with an English-speaking audience, then please use the link to the translation: Sergey Vasiliev. Analysis of PHP7 .

Read the article and have a question?
Often our articles are asked the same questions. We collected answers to them here: Answers to questions from readers of articles about PVS-Studio, version 2015 . Please review the list.

Source: https://habr.com/ru/post/282684/


All Articles