
PHP under a C debugger: digging inside the Zend Engine

I once ran into the following problem: a web crawler written in PHP runs fine for a while, and then suddenly (after 3-6 hours of operation) it stops doing any useful work and starts eating 100% of the CPU. How do you track down a problem like this? How do you find out where it is looping? What if you attach a debugger to PHP and get everything you need from there? Details below.

What can you do in this situation?


There are not many options. You can scatter log statements throughout the script and see which one was the last to be written before the hang. From that you can more or less guess where and how it hangs. This takes a very long time: add logging, wait for the hang, look at the log, find there is not enough information, add more log statements, and so on. So I left this option for later, in case nothing else worked.
xdebug will not work for this: as far as I understand, it cannot attach to an already running PHP script. And if you start the script under xdebug, you still cannot click “run” and then, when it hangs, click “pause”; in xdebug you can only move between breakpoints (correct me if I'm wrong here).

The idea is to try using GDB!


My main job is PHP, but I often have to write C++ under GCC (which, I must say, I really like). I have experience debugging C++ programs directly on the server with gdb. It is actually not very difficult: gdb is quite convenient for a console program. So why not try debugging our PHP script with it? Along the way, we can also dig a little into the internals of a live PHP process.

What we need


You need ssh access to the server. Root is not needed: everything can be done locally. So:
GDB

You can ask the admin to install it, or compile it and install it yourself locally. I asked the admin.

PHP compiled with debug information

In fact, a full debug build is not needed. All that is needed is for PHP to be compiled with the "-g" flag. For some reason, PHP 5.2.17 was compiled with this flag for me even in a non-debug build, which made things much easier: I could use the same extensions as the regular version. As I understand it, if I had built PHP in debug mode, I could not have used the same extensions and would have had to use ones built together with PHP.
Looking ahead: I needed libxml2 to build PHP. It also turned out that the problem was in libcurl, so I additionally built libcurl in debug mode to be able to look inside it.
So, let's build (I am writing from memory, so there may be inaccuracies):
$ wget <libxml2 download url>
$ tar -xzf libxml2-2.7.8.tar.gz
$ cd libxml2-2.7.8
$ ./configure --prefix=$HOME/libs
$ make && make install

$ wget <libcurl download url>
$ tar -xzf curl-7.18.2.tar.gz
$ cd curl-7.18.2
$ ./configure --prefix=$HOME/libs --enable-debug
$ make && make install

Building PHP is a little more involved: you also need to specify the paths to the php.ini files used on Debian, the path to the compiled libxml2, and the path to the compiled libcurl:
$ wget <php-5.2.17 download url>
$ tar -xzf php-5.2.17.tar.gz
$ cd php-5.2.17
$ ./configure --disable-debug --with-config-file-path=/etc/php5/cli --with-config-file-scan-dir=/etc/php5/cli/conf.d --with-libxml-dir=$HOME/libs --disable-pdo --with-curl=$HOME/libs
$ make

To repeat: I configured PHP with --disable-debug (the compiler was passed -g anyway), because using the ready-made system modules seemed easier than installing PHP with all its modules locally. That is why I did not run make install. It might be better to configure with --prefix=$HOME/libs and run make install, but what I did above was enough for my purposes.
Everything is compiled, so let's run PHP. This was not entirely smooth either: I did not immediately find a configuration option telling it where the extensions are, so I passed the directory on every PHP start:
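One way to confirm that the freshly built binary really carries debug information, before waiting hours for a hang, is to look for a .debug_info section in it. This is only a minimal sketch, assuming readelf from binutils is installed; the helper name has_debug_info is made up for illustration.

```shell
# has_debug_info FILE
# Succeeds (exit status 0) if FILE is an ELF binary containing a
# .debug_info section, i.e. it was compiled with -g and not stripped.
# The function name is hypothetical; readelf comes from binutils.
has_debug_info() {
    readelf -S "$1" 2>/dev/null | grep -q '\.debug_info'
}

# Example (path taken from the build above):
#   has_debug_info php/php-5.2.17/sapi/cli/php && echo "debug info present"
```

The check says nothing about debug information shipped in separate files, so a negative result is a hint rather than proof.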
$ php/php-5.2.17/sapi/cli/php -d extension_dir=/usr/lib/php5/20060613
PHP Warning: Module 'curl' already loaded in Unknown on line 0

The curl warning is understandable: PHP was compiled with the curl module built in, so the attempt to load the external curl.so produces this message. It is harmless.
With the build done, we can run the script and catch the bug.

Running and debugging


To avoid overloading the reader with unnecessary information, I wrote a small PHP script that demonstrates what debugging through gdb can do:
<?php

class A {
    protected $_a = NULL;
    public function __construct($a) {
        $this->_a = $a;
    }
    public function run() {
        while (true) {
            sleep(1);
        }
    }
}

class B {
    protected $_a = NULL;
    protected $_b = NULL;
    public function __construct() {
        $this->_b = rand(1000, 9999);
        $this->_a = new A(rand(1000, 9999));
    }
    public function run() {
        $this->_a->run();
    }
}

$b = new B;
$b->run();

So, run the script:
 $ php/php-5.2.17/sapi/cli/php -d extension_dir=/usr/lib/php5/20060613 test/test.php 

Look up the PID of our process and start GDB in another terminal:
$ ps auwx | grep test.php
$ gdb
GNU gdb 6.8-debian
Copyright (C) 2008 Free Software Foundation, Inc.
......
This GDB was configured as "x86_64-linux-gnu".
(gdb)

Attach to our process:
(gdb) attach 7455
Attaching to process 7455
Reading symbols from /<homedir>/php/php-5.2.17/sapi/cli/php...done.
.....
Reading symbols from /<homedir>/libs/lib/libcurl.so.4...done.
Loaded symbols for /<homedir>/libs/lib/libcurl.so.4
.....
0x00007fd9e6c22040 in nanosleep () from /lib/libc.so.6
(gdb)

If we see the line "Reading symbols from [lib] ..... done", everything went well and we can debug this binary.
Look at the backtrace:
(gdb) bt
#0 0x00007fd9e6c22040 in nanosleep () from /lib/libc.so.6
#1 0x00007fd9e6c21e97 in sleep () from /lib/libc.so.6
#2 0x0000000000587277 in zif_sleep (ht=1, return_value=0x278c010, return_value_ptr=0x0, this_ptr=0x0,
return_value_used=0) at /[homedir]/php/php-5.2.17/ext/standard/basic_functions.c:4794
#3 0x000000000068a733 in zend_do_fcall_common_helper_SPEC (execute_data=0x7fff0b7d6310)
at /[homedir]/php/php-5.2.17/Zend/zend_vm_execute.h:200
#4 0x0000000000690204 in ZEND_DO_FCALL_SPEC_CONST_HANDLER (execute_data=0x7fff0b7d6310)
at /[homedir]/php/php-5.2.17/Zend/zend_vm_execute.h:1740
#5 0x000000000068a221 in execute (op_array=0x278ad38)
at /[homedir]/php/php-5.2.17/Zend/zend_vm_execute.h:92
#6 0x00007fd9e655b90f in zend_oe () from /usr/lib/php5/20060613/ZendOptimizer.so
#7 0x000000000068a886 in zend_do_fcall_common_helper_SPEC (execute_data=0x7fff0b7d6570)
at /[homedir]/php/php-5.2.17/Zend/zend_vm_execute.h:234
#8 0x000000000068b3af in ZEND_DO_FCALL_BY_NAME_SPEC_HANDLER (execute_data=0x7fff0b7d6570)
at /[homedir]/php/php-5.2.17/Zend/zend_vm_execute.h:322
#9 0x000000000068a221 in execute (op_array=0x278b8c0)
at /[homedir]/php/php-5.2.17/Zend/zend_vm_execute.h:92
#10 0x00007fd9e655b90f in zend_oe () from /usr/lib/php5/20060613/ZendOptimizer.so
#11 0x000000000068a886 in zend_do_fcall_common_helper_SPEC (execute_data=0x7fff0b7d68a0)
at /[homedir]/php/php-5.2.17/Zend/zend_vm_execute.h:234
#12 0x000000000068b3af in ZEND_DO_FCALL_BY_NAME_SPEC_HANDLER (execute_data=0x7fff0b7d68a0)
at /[homedir]/php/php-5.2.17/Zend/zend_vm_execute.h:322
#13 0x000000000068a221 in execute (op_array=0x2787b88)
at /[homedir]/php/php-5.2.17/Zend/zend_vm_execute.h:92
#14 0x00007fd9e655b90f in zend_oe () from /usr/lib/php5/20060613/ZendOptimizer.so
#15 0x0000000000665598 in zend_execute_scripts (type=8, retval=0x0, file_count=3)
at /[homedir]/php/php-5.2.17/Zend/zend.c:1134
#16 0x0000000000615608 in php_execute_script (primary_file=0x7fff0b7d8ee0)
at /[homedir]/php/php-5.2.17/main/main.c:2036
#17 0x00000000006dfa82 in main (argc=4, argv=0x7fff0b7d90f8)
at /[homedir]/php/php-5.2.17/sapi/cli/php_cli.c:1165
(gdb)
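
The same backtrace can also be captured without an interactive session, which is handy when the hang only shows up after hours: take a few snapshots some seconds apart and compare them. This is a minimal sketch, assuming a gdb that supports the -batch, -p and -ex command-line options; the function name snapshot_bt is hypothetical.

```shell
# snapshot_bt PID
# Attach gdb to PID non-interactively, print the C-level backtrace, detach.
# Returns 2 without calling gdb if PID is not a plain number.
snapshot_bt() {
    case "$1" in
        ''|*[!0-9]*)
            echo "usage: snapshot_bt PID" >&2
            return 2
            ;;
    esac
    gdb -batch -p "$1" -ex bt 2>&1
}

# Example: three snapshots, five seconds apart (7455 is the PID used above).
#   for i in 1 2 3; do snapshot_bt 7455; sleep 5; done
```

If the top frames are the same in every snapshot, the process is most likely stuck or looping at that spot.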

First of all, we are interested in the frames inside execute() [Zend/zend_vm_execute.h:92]. These frames are the calls to PHP functions. Here is how to find out where we currently are in the PHP script:
(gdb) f 13
#13 0x000000000068a221 in execute (op_array=0x2d3fb88)
at /[homedir]/php/php-5.2.17/Zend/zend_vm_execute.h:92
92 if (EX(opline)->handler(&execute_data TSRMLS_CC) > 0) {
(gdb) print execute_data.function_state.function->common.scope->name
$20 = 0x2d423a0 "B"
(gdb) print execute_data.function_state.function->common.function_name
$21 = 0x2d43790 "run"
(gdb) print execute_data.opline->lineno
$22 = 28
(gdb) f 9
#9 0x000000000068a221 in execute (op_array=0x2d438c0)
at /[homedir]/php/php-5.2.17/Zend/zend_vm_execute.h:92
92 if (EX(opline)->handler(&execute_data TSRMLS_CC) > 0) {
(gdb) print execute_data.function_state.function->common.scope->name
$23 = 0x2d42380 "A"
(gdb) print execute_data.function_state.function->common.function_name
$24 = 0x2d44c48 "run"
(gdb) print execute_data.opline->lineno
$25 = 23
(gdb) f 5
#5 0x000000000068a221 in execute (op_array=0x2d42d38)
at /[homedir]/php/php-5.2.17/Zend/zend_vm_execute.h:92
92 if (EX(opline)->handler(&execute_data TSRMLS_CC) > 0) {
(gdb) print execute_data.function_state.function->common.function_name
$26 = 0x770781 "sleep"
(gdb) print execute_data.opline->lineno
$27 = 10
(gdb)

A few explanations: f [number] switches to the given frame, and print [expression] prints a symbol visible in the scope of that frame.
In the example above, we got the class name, the method or function name, and the number of the line where it is called (in frame 5 the class name is not defined because sleep() is a built-in function). In effect, we got a backtrace of the PHP script. This information is already enough to work out where the elusive bug described at the beginning of the article comes from.
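
The manual steps above (pick the execute() frames, then print the function name and line number in each) can be scripted. The sketch below only generates gdb commands from a saved backtrace. The function names execute_frames and php_frame_cmds and the file names in the usage comment are made up for this example, and the sed pattern assumes backtrace lines shaped like the ones shown above.

```shell
# execute_frames
# Read `bt` output on stdin and print the numbers of the frames that are
# inside execute(), one per line.
execute_frames() {
    sed -n 's/^#\([0-9][0-9]*\) .* in execute (.*/\1/p'
}

# php_frame_cmds FRAME...
# Print the gdb commands used in the article for each given frame:
# select the frame, then show the called function and the calling line.
php_frame_cmds() {
    for n in "$@"; do
        printf 'frame %s\n' "$n"
        printf 'print execute_data.function_state.function->common.function_name\n'
        printf 'print execute_data.opline->lineno\n'
    done
}

# Example, with the backtrace saved in bt.txt (hypothetical file names):
#   php_frame_cmds $(execute_frames < bt.txt) > php-bt.gdb
#   gdb -batch -p 7455 -x php-bt.gdb
```

The scope->name expression is left out on purpose: for built-in functions such as sleep() there is no class, so that print would presumably fail and could stop the command file early.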

That's all for today. If there is interest in the topic, next time I will show how to view the contents of variables and how arrays are organized in PHP. Thanks for your attention. I hope someone found the material interesting.

Source: https://habr.com/ru/post/129982/

