Following up on the bug recently found by tvv.
When executing the following code in PHP versions 5.3.0-5.3.2, the result exceeded all expectations.
<?php
f(0, $$var);
$x = 1;
$y = 2;
echo $x;
function f($a, $b) {};
The script printed '2'. I managed to track the bug down and fix it:
#52001. In short: the pointer to the special placeholder variable used for uninitialized variables was being overwritten, and every CV (compiled variable) in PHP is created through that pointer.
Seeing the PHP source for the first time, I started by sanity-checking the PHP scanner and parser. Compilation turned out to be correct; to confirm that, I had to enable the parser's debug mode. This also helped me learn the variable names and work out which structure was which, in particular what the compiler's various zend_do_* functions are responsible for.
Then it became clear that there are two different ways of calling a function: by name and by address. The first is used when the function is not yet known to the compiler at compile time. In that mode the arguments are passed slightly differently, because the compiler does not know the prototype.
By sprinkling printouts of variable addresses around more or less at random, I discovered that the two variables ($x and $y) really did have the same address inside PHP, which was clearly a bug. At first I suspected that variables were being looked up incorrectly in the symbol table. I ruled that out by adding debug output that printed the entire symbol table hash on every variable lookup.
Here is what turned out to be happening: a call by name causes the variables being passed to be specially marked, since they may turn out to be references (the prototype is unknown, after all).
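To make the "prototype is unknown" point concrete, here is a small sketch in user-level PHP. The function name bump() and the variable $counter are made up for illustration. The call appears in the file before the declaration, so, following the description above, the compiler cannot yet tell whether the argument is taken by value or by reference and has to prepare it for either case.

```php
<?php
// At this point in the file the compiler has not yet seen bump()'s
// declaration, so it cannot know that the parameter is a reference.
bump($counter);        // $counter does not exist before this call

echo $counter, "\n";   // prints 1: the callee created and modified it

// Declared after the call; the parameter is taken by reference.
function bump(&$n)
{
    $n = (int) $n + 1;
}
```

If bump() took its parameter by value, the same call would leave $counter undefined, which is why the argument has to be fetched in a way that works for both outcomes.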
The $$var variable, like every variable that is only read, starts out as the special uninitialized variable. The handler that fetches a variable for a function call made sure the passed value could be used as a reference, which required copying the variable. In this case the copy is made through a pointer to a pointer, and the pointer that gets rewritten is the one to the special uninitialized variable itself. It now points to the freshly allocated memory, which has a reference count of 1.
From then on, every newly initialized variable gets this same memory. Because of the "wrong" reference count, writes to these variables do not trigger a copy (copy-on-write), and the same memory is shared by all new variables. As a result they all hold the same data, much like the old Fortran bug that made it possible to assign 1 = 2.
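The mechanism described above can be imitated with a toy model. This is only a sketch and not Zend Engine code: the classes Zval and ToyEngine, all method names, and the way reference counts are tracked are invented for illustration. In particular, the model assumes the healthy placeholder simply counts as shared (reference count above 1), and a PHP reference stands in for the C pointer to a pointer.

```php
<?php
// Toy model only: every name here is hypothetical, not a Zend Engine API.

final class Zval
{
    public $value = null;
    public $refcount;

    public function __construct($refcount)
    {
        $this->refcount = $refcount;
    }
}

final class ToyEngine
{
    /** Placeholder zval that every new variable initially points to. */
    public $uninitialized;
    /** Symbol table: variable name => Zval handle. */
    public $symbols = array();

    public function __construct()
    {
        // Assumption of this model: the placeholder counts as shared,
        // so any write to it must go through copy-on-write.
        $this->uninitialized = new Zval(2);
    }

    /** Write to a variable, creating it from the placeholder if needed. */
    public function assign($name, $value)
    {
        if (!isset($this->symbols[$name])) {
            $this->symbols[$name] = $this->uninitialized;
        }
        $zval = $this->symbols[$name];
        if ($zval->refcount > 1) {
            // Copy-on-write: the zval is shared, so take a private copy.
            $zval = new Zval(1);
            $this->symbols[$name] = $zval;
        }
        $zval->value = $value;
    }

    public function read($name)
    {
        return $this->symbols[$name]->value;
    }

    /** Imitates the faulty "make it usable as a reference" step. */
    public function buggyPrepareReferenceArgument()
    {
        // The reference plays the role of the pointer to a pointer.
        $slot = &$this->uninitialized;
        // Separation writes through it and replaces the placeholder
        // itself with a fresh zval whose reference count is 1.
        $slot = new Zval(1);
    }
}

/** The body of the original script: $x = 1; $y = 2; echo $x; */
function runScript(ToyEngine $engine)
{
    $engine->assign('x', 1);
    $engine->assign('y', 2);
    return $engine->read('x');
}

$healthy = new ToyEngine();
echo runScript($healthy), "\n";   // prints 1

$broken = new ToyEngine();
$broken->buggyPrepareReferenceArgument();
echo runScript($broken), "\n";    // prints 2
```

In the broken engine both 'x' and 'y' end up holding the very same Zval object, and since its reference count is 1, each assignment writes in place instead of separating.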
That's all. It was an interesting experience; I really like bugs like this.
The technical part is described in more detail in this comment.