📜 ⬆️ ⬇️

Search and fix bugs in PHP source


Honestly I warn you: take this text with a certain degree of skepticism. I just recently started to get familiar with PHP internals, but I would like to tell you about what’s going on behind the scenes of bug # 75237 .


Bug detection


It all started with the search for lines of code to cover the Chicago PHP UG mitap, which was part of PHP TestFest 2017 .


I found an uncovered string using gcov for ext/json/json_encoder.c . I tried to write PHP code that would process this line .



I tried this way and that, but nothing worked, and suddenly, through this code, I inadvertently initiated segfault :


 <?php class Foo implements JsonSerializable { public function jsonSerialize() { return new self; } } # Segfault! var_dump(json_encode(new Foo)); 

But in PHP should never be segfault. The core PHP code must throw an exception and provide a beautiful error message for user space before a segfault occurs.


To find out which PHP versions are affected by this bug, I created a 3v4l snippet and saw that all actively supported versions generate segfault.


This happens only when jsonSerialize() returns a new instance of itself. If I return the same instance using $this , the code works as it should.


 <?php class Foo implements JsonSerializable { public function jsonSerialize() { return $this; } } # Works fine var_dump(json_encode(new Foo)); 

Send bug report


All bugs in PHP must be reported via bugs.php.net . Although I tried to patch the bug myself, I still had to create a report that can be referenced in the patch.


So I filled out the form - and there was a bug # 75237 .


Patch writing


Now the most difficult. How to start patching this error?


Update php-src and recompile


First, I got all the previous changes from the main Git repository, to be sure that I am working with the most recent version of php-src. I remotely called the upstream command pointing to the main php-src repository on GitHub .


 $ git fetch upstream $ git checkout master $ git rebase upstream/master master 

Then he created a new master branch for the patch and named it after the bug ID.


 $ git checkout -b bug75237 

Now you need to recompile from the source code. I will not go into details, I have another article, which describes in detail how to compile PHP from source code .


When I recompile PHP, I run a bash script called config , which I store outside the main php-src folder. It deletes all existing compiled binaries and other files that are not under version control, collects the configuration script and runs the configuration with the necessary flags enabled by default.


Here is my script if you want to do something similar for yourself:


 #!/bin/sh make distclean ./vcsclean ./buildconf ./configure --enable-maintainer-zts \ --enable-debug \ --enable-cli \ --with-zlib \ --enable-mbstring \ --with-openssl=/usr/local/opt/openssl \ --with-libxml-dir=/usr/local/opt/libxml2 \ --with-sodium \ "$@" 

Why are there such strange ways? I work on a Mac with some dependencies installed with Homebrew , so in some places, for some extensions, the paths lead to /usr/local/opt .


So I just ran my configuration script and then compiled everything with make .


 $ ../config && make -j9 

A binary was created at sapi/cli/php , after which I checked the PHP version and ran the segfault-leading code to make sure that the php-src still still contains a bug.


 $ sapi/cli/php --version $ sapi/cli/php json_encode.php 

Yes! Again segfault. Now we will use the debugger to go through the C-code, which is executed line by line by our PHP script.


Running GDB


To debug the code I used GDB . It is designed for compiled languages ​​like C and does not work with regular PHP files. But PHP is written in C, so using gdb --args you can force GDB to execute a compiled binary with a PHP script.


 $ gdb --args sapi/cli/php json_encode.php 

By running GDB, I set a checkpoint for the pause in the execution of the program, and then went through each line of the C-code. I talked about the details in the above video, but briefly: it turned out that the whole thing is in the php_json_encode_serializable_object() function, so I set a control point here and started the program using run .


 > break php_json_encode_serializable_object > run 

When GDB comes to a checkpoint and pauses the execution of a code, you can write:



After entering the commands, I realized that the code goes into infinite recursion between php_json_encode_zval() and php_json_encode_serializable_object() . This leads to stack overflow and segfault, because the program runs out of memory.


I was interested in an if inside php_json_encode_serializable_object() .


 if ((Z_TYPE(retval) == IS_OBJECT) && (Z_OBJ(retval) == Z_OBJ_P(val))) { // } else { // } 

It processed a case where the return jsonSerialize() is the same instance of it, but no one caught the situation when a new instance was returned.


Once again I will mention that the details are described in the video, and here I will give the final part of the code that I added to throw an exception when jsonSerialize() returns a new instance of itself.


 if ((Z_TYPE(retval) == IS_OBJECT) && // Make sure the objects are not the same instance (Z_OBJ(retval) != Z_OBJ_P(val)) && // Check that the class names of the objects are the same zend_string_equals(ce->name, Z_OBJCE(retval)->name)) { // Throw an exception zend_throw_exception_ex(NULL, 0, "%s::jsonSerialize() cannot return a new instance of itself", ZSTR_VAL(ce->name)); // Garbage collection zval_ptr_dtor(&retval); zval_ptr_dtor(&fname); PHP_JSON_HASH_APPLY_PROTECTION_DEC(myht); // Return fail return FAILURE; } else if ((Z_TYPE(retval) == IS_OBJECT) && (Z_OBJ(retval) == Z_OBJ_P(val))) { // } else { // } 

I made a lot of trial and error to come to this option, and phpinternalsbook.com helped me a lot , especially when working with zval .


Update: Sarah Golmon , who has been a key PHP developer for a long time and one of the PHP 7.2 release managers, gave a wonderful comment to my patch : “If you just try to compare a class, you can only compare ce. So it will be more correct than to compare the names, and (nominally) more effectively. "


Thanks to Sarah’s suggestion, we can change the string with zend_string_equals() in an if expression:


 if ((Z_TYPE(retval) == IS_OBJECT) && (Z_OBJ(retval) != Z_OBJ_P(val)) && // This line changes ce == Z_OBJCE(retval)) { // Same as above } else if ((Z_TYPE(retval) == IS_OBJECT) && (Z_OBJ(retval) == Z_OBJ_P(val))) { // } else { // } 

Testing


Before you apply the patch, you need to write a test confirming that the error is indeed fixed.


Again without details, but if you need more information, then I created a six-part guide on writing tests for PHP source code .


 $ vi ext/json/tests/bug75237.phpt --TEST-- Bug #75237 (jsonSerialize() - Returning new instance of self causes segfault) --SKIPIF-- <?php if (!extension_loaded("json")) die("skip ext/json required"); ?> --FILE-- <?php class Foo implements JsonSerializable { public function jsonSerialize() { return new self; } } try { var_dump(json_encode(new Foo)); } catch (Exception $e) { echo $e->getMessage(); } ?> --EXPECT-- Foo::jsonSerialize() cannot return a new instance of itself 

I ran the test and checked it out, and then I drove all the tests into ext/json to make sure I didn't break anything.


 $ make test TESTS=ext/json/tests/bug75237.phpt $ make test TESTS=ext/json/tests/ 

All tests were passed, so I was ready to apply the patch.


Send PR and update bug


Then I uploaded the patch to my fork on github ...


 $ git add ext/json/json_encoder.c ext/json/tests/bug75237.phpt $ git commit -m "Fix bug 75237 - (jsonSerialize() - Returning new instance of self causes segfault)" $ git push origin 

... and created a PR in the main php-src repository.


Finally, I went back to my bug report and updated it with a link to the newly created PR.


Waiting for response and making changes


After sending PR, a little time passed, and I received a response from several employees working on PHP. One of them helped to understand that I incorrectly compare the class names (I fixed it in the video and in the code snippets in the article, but left the original PR unchanged, so you can analyze the first option).


In the end, Niki P shut down PR , because we identified a deeper, more general problem related to PHP. This language does not have general protection against stack overflow, which means segfault can be created in many other contexts .


Here is how Sara spoke of this:



Good luck!


Although my patch was not merged with the code base, I still consider it successful, because in the process I not only learned a lot, but also attracted the attention of developers to the main problem with stack overflow, which could contribute to its solution.


I hope you liked the story about my adventure with a bug. Good luck finding and fixing your own bugs in the PHP source code!


')

Source: https://habr.com/ru/post/340242/


All Articles