
Weak references in PHP 7

A weak reference is a mechanism that lets you refer to an object without preventing the garbage collector from destroying it, in programming languages with automatic garbage collection.

To a first approximation, a weak reference says: we do need this object, but if it has to go, it can be removed and we will cope.

Out of the box, PHP has no concept of weak references, but that is fairly easy to fix. The discussion below focuses on PHP 7.

Note that the solution below works only for objects, which is due to the way the garbage collector works in PHP.

Recommended for preliminary reading



Object handlers


In PHP, the functions that handle various events (operations) on an object are kept in a table of handlers, the zend_object_handlers structure.

Internal classes can define and override object handlers, changing the behavior of typical operations such as cloning, comparison, property access, retrieving debug information, and so on. The full list of operations for which a handler can be set is given by the zend_object_handlers structure mentioned above.

User classes that do not inherit from internal classes get the default handlers stored in the variable std_object_handlers.
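
The effect of differing handler tables can be observed from userland. The sketch below is only an illustration: it assumes that the built-in Generator class leaves its clone handler unset (so cloning it throws an Error in PHP 7), while a plain user class falls back to the default clone handler. The names Plain and numbers() are made up for the example.

```php
<?php

class Plain
{
    public $value = 1;
}

function numbers()
{
    yield 1;
}

// A user class uses the default clone handler, so cloning just works.
$copy = clone new Plain();

// Generator is an internal class without a clone handler.
$generatorCloneable = true;
try {
    $gen = numbers();
    clone $gen;
} catch (Error $e) {
    $generatorCloneable = false;
    echo get_class($e), ': ', $e->getMessage(), PHP_EOL;
}

var_dump($copy instanceof Plain, $generatorCloneable);
```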

Garbage collection


Garbage collection in PHP is covered in some detail elsewhere, including the official documentation, and is not the subject of this article.

In short, a value is collected as soon as its reference counter drops to 0. Normally, the object's destructor is called during collection and then the memory occupied by the object is freed. Internally, the dtor_obj and free_obj handlers are responsible for these two steps, respectively.

Internally, the standard destructor is implemented by the zend_objects_destroy_object function, whose job is to call the __destruct() method, if there is one.

If the script is terminated by an uncaught exception, the object's destructor is not called. (The exit construct and its alias die behave differently: according to the PHP manual, destructors still run when a script is stopped with exit.) For example:
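
As a minimal sketch of the reference-counting rule, the snippet below records when the destructor runs relative to dropping each reference. The Tracked class and its static log are hypothetical helpers introduced only for this illustration.

```php
<?php

class Tracked
{
    /** @var string[] order in which events happened */
    public static $log = [];

    private $name;

    public function __construct($name)
    {
        $this->name = $name;
    }

    public function __destruct()
    {
        self::$log[] = "destroyed {$this->name}";
    }
}

$a = new Tracked('a');
$b = $a;                 // second reference to the same object

$a = null;               // one reference is still alive, nothing happens
Tracked::$log[] = 'first reference dropped';

$b = null;               // counter reaches 0: the destructor runs right here
Tracked::$log[] = 'second reference dropped';

print_r(Tracked::$log);
```

The destructor entry lands between the two markers, which is the behavior a weak reference must not disturb: holding one should not keep the counter above 0.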

    <?php

    class Test
    {
        public function __destruct()
        {
            echo 'Goodbye Cruel World!', PHP_EOL;
        }
    }

    $test = new Test();

    throw new Exception('Test exception');

which prints:

    $ php test.php

    Fatal error: Uncaught Exception: Test exception in /home/vagrant/Development/php-weak/test.php on line 13

    Exception: Test exception in /home/vagrant/Development/php-weak/test.php on line 13

    Call Stack:
        0.0011     365968   1. {main}() /home/vagrant/Development/php-weak/test.php:0

Object storage


To better understand the possible pitfalls in implementing weak references, consider where and how PHP stores all objects. Fortunately, it is very simple: they live in EG(objects_store).object_buckets, an array whose key is Z_OBJ_HANDLE(zval), the integer index of the object, referred to below as the object handle. At any given moment the handle is unique to one object, but once an object is removed from EG(objects_store).object_buckets, its handle can be assigned to another object, for example:

    <?php

    $obj1 = new stdClass();
    $obj2 = new stdClass();

    debug_zval_dump($obj1);
    debug_zval_dump($obj2);

    $obj2 = null; // the object is removed from EG(objects_store).object_buckets, freeing its handle

    $obj2 = new SplObjectStorage();

    debug_zval_dump($obj2);

which outputs:

    object(stdClass)#1 (0) refcount(2){
    }
    object(stdClass)#2 (0) refcount(2){
    }
    object(SplObjectStorage)#2 (1) refcount(2){
      ["storage":"SplObjectStorage":private]=>
      array(0) refcount(1){
      }
    }

The value after the hash symbol (#) is the object handle. As you can see, handle 2 was reused.
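
Handle reuse is exactly what makes a naive "table keyed by handle" unsafe. The sketch below assumes PHP 7.2 or later (for spl_object_id(), which returns the handle) and assumes the engine hands a freed handle to the next object created, as in the dump above. The $labels registry is a hypothetical example.

```php
<?php

// Hypothetical registry that describes objects by their handle.
$labels = [];

$first = new stdClass();
$firstId = spl_object_id($first);
$labels[$firstId] = 'first object';

// The last reference goes away: the object is destroyed and its handle freed.
$first = null;

// A completely unrelated object is created next.
$second = new ArrayObject();
$secondId = spl_object_id($second);

// If the handle was reused, the stale entry now seems to describe $second.
$staleLabel = isset($labels[$secondId]) ? $labels[$secondId] : null;

var_dump($firstId === $secondId, $staleLabel);
```

Without a destruction notification there is no moment at which the stale entry could be removed, which is the gap weak references are meant to close.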

Implementation of weak links


Given the above, an obvious way to implement weak references is to wrap the destructor of the object we refer to and add our own logic after the original destructor has run. The pinnacle of mastery would of course be to implement weak references in the interpreter itself, but that is another story.

The pointer to an object's handler table is declared with the const specifier, and the C90 and C99 standards say that modifying such a value through a cast to a non-const type results in undefined behavior.


This is done for a reason. Changing a single handler in the table changes it for all objects of the given class and, moreover, for subclasses that do not override it. For an object of a user class (one that does not descend from an internal class), it changes it for all objects of all user classes.

The best solution is to replace the handler-table pointer of a single object: copy the original table and replace the one handler we need, dtor_obj. In abbreviated form it looks like this:

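    php_weak_referent_t *referent = (php_weak_referent_t *) ecalloc(1, sizeof(php_weak_referent_t));

    /* remember the object's original handler table */
    referent->original_handlers = Z_OBJ_P(referent_zv)->handlers;

    /* make a private copy of the table and override only the destructor handler */
    memcpy(&referent->custom_handlers, referent->original_handlers, sizeof(zend_object_handlers));
    referent->custom_handlers.dtor_obj = php_weak_referent_object_dtor_obj;

    /* point this single object at its private copy */
    Z_OBJ_P(referent_zv)->handlers = &referent->custom_handlers;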
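
The same idea can be modelled in plain PHP. This is only a toy analogy, not the extension's code: HandlerTable and FakeObject are hypothetical classes standing in for zend_object_handlers and zend_object, and the "handler" is an ordinary closure.

```php
<?php

final class HandlerTable
{
    /** @var callable */
    public $dtorObj;
}

final class FakeObject
{
    /** @var HandlerTable */
    public $handlers;
}

$log = [];

// One table shared by every object of the "class".
$shared = new HandlerTable();
$shared->dtorObj = function () use (&$log) {
    $log[] = 'original dtor';
};

$a = new FakeObject();
$a->handlers = $shared;
$b = new FakeObject();
$b->handlers = $shared;

// Per-object override: copy the table, wrap one handler, repoint only $a.
$original = $a->handlers;
$custom = clone $original;
$custom->dtorObj = function () use (&$log, $original) {
    ($original->dtorObj)();
    $log[] = 'notify weak references';
};
$a->handlers = $custom;

($a->handlers->dtorObj)(); // wrapped destructor
($b->handlers->dtorObj)(); // untouched shared destructor

print_r($log);
```

Because only $a is repointed, $b and every other object sharing the table keep the original behavior.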

Obstacles


Testing revealed a rather unpleasant fact: changing the handler-table pointer affects the result of PHP's spl_object_hash() function, which uses the pointer to the handler table to generate the last 16 characters of the id (the first 16 are a hash of the object handle). As a result, the id returned for an object differs before and after a weak reference to it is created:

    <?php

    $obj = new stdClass();

    var_dump(spl_object_hash($obj)); // 0000000046d2e51a000000003e3e4a43

    $ref = new Weak\Reference($obj);

    var_dump(spl_object_hash($obj)); // 0000000046d2e51a00007fc62b682d1b

On the bright side, once the hash has changed it will not change again (at least not because of this extension).

But let us go further. Discussions with the PHP developers produced an idea: do not use the handler-table pointer when generating the hash at all, since the first part alone already satisfies the uniqueness requirement for an object hash throughout the object's lifetime. The second part now simply uses an arbitrary value that previously served as a mask:

    <?php

    class Test {}
    class X {}

    $t = new Test();
    $x = new X();
    $s = new SplObjectStorage();

    var_dump(spl_object_hash($t)); // 00000000054acbeb0000000050eaeb6f
    var_dump(spl_object_hash($x)); // 00000000054acbe80000000050eaeb6f
    var_dump(spl_object_hash($s)); // 00000000054acbe90000000050eaeb6f

Much better! This improvement will most likely be available in PHP versions after 7.0.2.

But that will only come after PHP 7.0.2, and we need to work universally and reliably on earlier PHP 7 versions as well.

Not very pretty, but a perfectly workable approach is to replace the spl_object_hash() function with our own implementation:

    <?php

    // Meta code: $EG stands for the engine's executor globals, it is not a real PHP variable.
    $spl_hash_function = $EG->function_table['spl_object_hash'];

    $custom_hash_function = function (object $obj) use ($spl_hash_function) {
        $hash = null;

        if (weak\refcounted($obj)) {
            $referent = execute_referent_object_metadata($obj);

            // temporarily restore the original handler table so the hash stays the same
            $obj->handlers = $referent->original_handlers;
            $hash = $spl_hash_function($obj);
            $obj->handlers = $referent->custom_handlers;
        }

        if (null === $hash) {
            $hash = $spl_hash_function($obj);
        }

        return $hash;
    };

    $EG->function_table['spl_object_hash'] = $custom_hash_function;
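
The swap-and-restore trick can be tried out on a toy model. Everything here is hypothetical: FakeObject stands in for an engine object, the handler table is represented by its name, and naive_hash() merely imitates a hash whose second half depends on the handler table. It is a sketch of the technique, not of spl_object_hash() itself.

```php
<?php

final class FakeObject
{
    public $handle;
    public $handlers = 'std_object_handlers';
    public $originalHandlers = null; // set once the object gets a weak reference

    public function __construct($handle)
    {
        $this->handle = $handle;
    }
}

// Imitates the old hash: the second half is derived from the handler table.
function naive_hash(FakeObject $obj)
{
    return sprintf('%016x%016x', $obj->handle, crc32($obj->handlers));
}

// Wrapper: hash against the original table, then put the custom table back.
function stable_hash(FakeObject $obj)
{
    if ($obj->originalHandlers === null) {
        return naive_hash($obj);
    }

    $custom = $obj->handlers;
    $obj->handlers = $obj->originalHandlers;
    try {
        return naive_hash($obj);
    } finally {
        $obj->handlers = $custom;
    }
}

$obj = new FakeObject(1);
$before = naive_hash($obj);

// Simulate creating a weak reference: the object gets a private handler table.
$obj->originalHandlers = $obj->handlers;
$obj->handlers = 'custom_handlers';

$naiveAfter = naive_hash($obj);
$stableAfter = stable_hash($obj);

var_dump($before === $naiveAfter, $before === $stableAfter);
```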

The author of this substitution technique is Etienne Kneuss, who also wrote the original implementation of weak references for PHP 5, the php-weakref extension. Unfortunately, I did not wait just a couple of hours more, by which time he had also added PHP 7 support. Still, two working extensions are better than none, and at the moment our functionality and concepts differ considerably.

Notification mechanism for object destruction


When creating a weak reference, a mechanism for being notified that the referenced object has been destroyed is quite important. Conceptually, the problem can be solved by polling the weak reference; by storing the weak reference into an array, as is done in Java's WeakReference implementation; or, most appetizing of all, by calling a user function, as is done in Python's weakref.ref.

To do this, after the original object's destructor has been called, for each weak reference we either call the user-defined function supplied when the weak reference was created, or append the weak reference object to an array if one was supplied. Note that if an exception is thrown in the destructor or in one of the notifier functions, no further notifier functions are called. Put simply, notification through user functions can be pictured as extending the object's base class and overriding the destructor so that, after the parent destructor is called, the user functions are called one after another, from the most recent to the very first.

As a result, our destructor wrapper can be expressed in meta code:

    run_original_dtor_obj($object);

    foreach ($weak_references as $weak_ref_object_handle => $weak_reference) {
        if (is_array($weak_reference->notifier)) {
            $weak_reference->notifier[] = $weak_reference;
        } elseif (is_callable($weak_reference->notifier) && $no_exception_thrown) {
            ($weak_reference->notifier)($weak_reference);
        }

        unset($weak_references[$weak_ref_object_handle]);
    }

The weak reference class itself looks like this:
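
The "subclass with an overridden destructor" picture can be written out in userland. This is a simplified analogy under stated assumptions: Connection and NotifyingConnection are hypothetical classes, and the real extension wraps the dtor_obj handler rather than relying on inheritance.

```php
<?php

class Connection
{
    /** @var string[] order of events, for demonstration only */
    public static $log = [];

    public function __destruct()
    {
        self::$log[] = 'original destructor';
    }
}

// Stand-in for the wrapped destructor: parent first, then the notifiers.
class NotifyingConnection extends Connection
{
    /** @var callable[] */
    private $notifiers = [];

    public function addNotifier(callable $notifier)
    {
        $this->notifiers[] = $notifier;
    }

    public function __destruct()
    {
        parent::__destruct();

        // The most recently registered notifier is called first.
        foreach (array_reverse($this->notifiers) as $notifier) {
            $notifier();
        }
    }
}

$conn = new NotifyingConnection();
$conn->addNotifier(function () {
    Connection::$log[] = 'first notifier';
});
$conn->addNotifier(function () {
    Connection::$log[] = 'second notifier';
});

$conn = null; // last reference dropped: destructor, then notifiers

print_r(Connection::$log);
```

An exception thrown by the parent destructor or by any notifier would stop the loop, which matches the rule that no further notifier functions are called in that case.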

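    namespace Weak;

    class Reference
    {
        /**
         * @param object                  $referent Referent object
         * @param callable | array | null $notify   Optional notifier signalled when the referent object is destroyed
         *
         * If the notifier is a callable, it is called with the current Weak\Reference object as its argument.
         * The referent object is already destroyed at that point. The callable is not called if the
         * referent's destructor or a previously called notifier throws an exception.
         *
         * If the notifier is an array, the current Weak\Reference object is appended to it, regardless of
         * whether the referent's destructor or a previously called notifier throws an exception.
         */
        public function __construct(object $referent, $notify = null)
        {
        }

        /**
         * Get the referent object. If it has already been destroyed, returns null.
         *
         * @return object | null
         */
        public function get()
        {
        }

        /**
         * Whether the referent object still exists.
         *
         * @return bool
         */
        public function valid() : bool
        {
        }

        /**
         * Get or replace the notifier.
         *
         * @param callable | array | null $notify New notifier. If given, it replaces the current one.
         *
         * @return callable | array | null The current notifier, or the previous one when it was replaced
         */
        public function notifier($notify = null)
        {
        }
    }

To give more information about values and objects, the extension also provides a number of functions that complement the weak reference class and spare you from building workarounds in userland.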

List of extension functions

    namespace Weak;

    /**
     * Check whether the value is reference counted.
     *
     * @param mixed $value
     *
     * @return bool
     */
    function refcounted(mixed $value) : bool {}

    /**
     * Get the reference count of a value. For values that are not reference counted, returns 0.
     *
     * Note that weak\refcount(new stdClass()) returns 0.
     *
     * @param mixed $value
     *
     * @return int
     */
    function refcount(mixed $value) : int {}

    /**
     * Check whether the object has weak references to it.
     *
     * @param object $value
     *
     * @return bool
     */
    function weakrefcounted(object $value) : bool {}

    /**
     * Get the number of weak references to the object. If there are none, returns 0.
     *
     * @param object $value
     *
     * @return int
     */
    function weakrefcount(object $value) : int {}

    /**
     * Get the weak references to the object.
     *
     * @param object $value
     *
     * @return array
     */
    function weakrefs(object $value) : array {}

    /**
     * Get the object handle.
     *
     * @param object $value
     *
     * @return int
     */
    function object_handle(object $value) : int {}

These functions are quite specialized and should be used only when you understand exactly what you are doing and why.

Practical use


The extension providing the functionality described above grew out of work on another extension: an integration of the V8 JavaScript engine into PHP (not v8js; simpler and more powerful, though at the moment its source code is not publicly available).

While writing an abstraction layer for translating PHP objects into JS objects, it became necessary to match PHP objects to their JS representations unambiguously and, most importantly, to clean up the correspondence table. The typical problem is that, in different places, the same PHP object is returned to JS when a JS representation for it already exists. To keep the representations identical inside JS, a correspondence table has to be stored.

Initially we kept a table mapping the PHP object id to the JS representation itself, but as work progressed we ended up with a surplus of orphaned JS representations: the corresponding PHP objects had been removed by the garbage collector because nothing referred to them anymore. The notification mechanism implemented together with weak references solves this problem completely.
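
A correspondence table of this kind can be sketched with the polling approach. Note the swap: this example does not use the Weak\Reference extension and its notifiers, but the built-in WeakReference class, so it assumes PHP 7.4 or later. ViewTable and the "js-view-N" strings are hypothetical stand-ins for the real JS representations.

```php
<?php

final class ViewTable
{
    /** @var array[] entries keyed by object handle: ['ref' => WeakReference, 'view' => string] */
    private $entries = [];
    private $nextView = 0;

    public function viewFor(object $phpObject): string
    {
        $id = spl_object_id($phpObject);

        // Compare identities too: a freed handle may belong to a new object.
        if (isset($this->entries[$id]) && $this->entries[$id]['ref']->get() === $phpObject) {
            return $this->entries[$id]['view'];
        }

        $view = 'js-view-' . ++$this->nextView;
        $this->entries[$id] = ['ref' => WeakReference::create($phpObject), 'view' => $view];

        return $view;
    }

    /** Drop entries whose PHP object is gone; returns how many were removed. */
    public function purge(): int
    {
        $before = count($this->entries);
        $this->entries = array_filter($this->entries, function (array $entry) {
            return $entry['ref']->get() !== null;
        });

        return $before - count($this->entries);
    }

    public function size(): int
    {
        return count($this->entries);
    }
}

$table = new ViewTable();
$a = new stdClass();
$b = new stdClass();

$viewA = $table->viewFor($a);
$sameView = $table->viewFor($a); // same object, same representation
$table->viewFor($b);

$b = null;                       // the table does not keep $b alive
$removed = $table->purge();

var_dump($viewA === $sameView, $removed, $table->size());
```

With notifiers, as described above, the purge step would not be needed: each entry could be removed at the moment its object is destroyed.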

For people not suffering from the symptoms described above and leading a healthy nocturnal lifestyle, this extension may be useful for the following:

I invite you to share possible uses for weak references in PHP in the comments.

For our own needs, we built an implementation of the WeakMap data structure on top of SplObjectStorage: Weak\SplObjectStorage.

The extension is available at github.com/pinepain/php-weak and is distributed under the MIT license. It requires PHP 7.

Source: https://habr.com/ru/post/275151/
