A weak reference is a mechanism that lets you refer to an object without preventing the garbage collector from destroying it, in programming languages with automatic garbage collection.
To a first approximation, a weak reference says that we do need the object, but if it has to be deleted, that is fine too and we will cope.
PHP does not provide weak references out of the box, but this is fairly easy to fix. The following discussion focuses on PHP 7.
Note that the solution below works only for objects, which is due to the way the garbage collector works in PHP.
Recommended preliminary reading
Object Handlers
In PHP, the functions that handle various events (operations) on an object are collected in a table of handlers described by the zend_object_handlers structure.
Internal classes can define and redefine object handlers, changing the behavior of typical operations such as cloning, comparing, accessing properties, or retrieving debug information. The complete list of operations for which a handler can be specified is given by the zend_object_handlers structure mentioned above.
User classes that do not inherit from internal classes get the default handlers stored in the std_object_handlers variable.
Garbage collection
Garbage collection in PHP is covered in some detail elsewhere, including the official documentation, and is not the subject of this article.
In short, a value is collected as soon as its reference counter reaches 0. Normally, the object's destructor is called during collection and then the memory occupied by the object is freed. Internally, the dtor_obj and free_obj handlers, respectively, are responsible for this.
Internally, the standard destructor is implemented by the zend_objects_destroy_object function, whose essence is to call the __destruct() method, if there is one.
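The reference-counting rule can be observed from userland. The following is a minimal sketch; the Tracked class and its static log are hypothetical and exist only for this illustration.

```php
<?php
// Hypothetical class used only for this illustration.
class Tracked
{
    public static $log = [];
    private $name;

    public function __construct($name)
    {
        $this->name = $name;
    }

    public function __destruct()
    {
        self::$log[] = "destroyed {$this->name}";
    }
}

$a = new Tracked('a');
$b = $a;                         // a second reference to the same object

$a = null;                       // one reference is still left, nothing happens
$logAfterFirst = Tracked::$log;

$b = null;                       // the last reference is gone, __destruct() runs right away
$logAfterSecond = Tracked::$log;

var_dump($logAfterFirst, $logAfterSecond);
```

The destructor fires at the moment the last reference disappears, not at some later collection point, which is what makes wrapping it a convenient hook.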
If an exception is thrown or the exit language construct (or its synonym die) is called, the object's destructor is not called, for example:
<?php

class Test {
    public function __destruct() {
        echo 'Goodbye Cruel World!', PHP_EOL;
    }
}

$test = new Test();

throw new Exception('Test exception');
which prints:
$ php test.php

Fatal error: Uncaught Exception: Test exception in /home/vagrant/Development/php-weak/test.php on line 13

Exception: Test exception in /home/vagrant/Development/php-weak/test.php on line 13

Call Stack:
    0.0011     365968   1. {main}() /home/vagrant/Development/php-weak/test.php:0
Object storage
To better understand the possible pitfalls in implementing weak references, consider where and how objects are stored in PHP. Fortunately, it is very simple: in EG(objects_store).object_buckets, an array whose key is Z_OBJ_HANDLE(zval), the integer index of the object, hereinafter referred to as the object handle. At any given moment it is unique for each object, but once an object is removed from EG(objects_store).object_buckets, its handle value can be assigned to another object, for example:
<?php

$obj1 = new stdClass();
$obj2 = new stdClass();

debug_zval_dump($obj1);
debug_zval_dump($obj2);

$obj2 = null;
which outputs:
object(stdClass)
The value after the hash symbol (#) is that very object handle. As you can see, handle 2 was reused.
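Handle reuse can also be checked without reading dumps. This is a minimal sketch, assuming PHP 7.2 or later, where spl_object_id() returns the object handle; the variable names are illustrative.

```php
<?php
$first  = new stdClass();
$second = new stdClass();

$secondHandle = spl_object_id($second);

$second = null;             // the only reference is dropped, the slot becomes free

$third = new stdClass();    // a new object is placed into a free slot
$thirdHandle = spl_object_id($third);

// On a stock PHP build the freed handle is expected to be handed out again.
var_dump($secondHandle === $thirdHandle);
```

This is why a handle alone cannot identify an object over time: it only identifies whichever object currently occupies the slot.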
Implementation of weak references
Based on the above, a fairly obvious way to implement weak references is to wrap the destructor of the object we refer to and add our own logic after the original destructor has run. The pinnacle of mastery would of course be to implement weak references at the level of the interpreter itself, but that is another story.
The pointer to an object's handler table has the const specifier, and the C90 and C99 standards say that changing such a value by casting it to a non-const type results in undefined behavior.
This is done for a reason. By changing a single handler in the table, we change it for all objects of the given class and, moreover, for its subclasses if they do not redefine their handlers; and for an object of a user class (one that does not descend from an internal class), we change it for all objects created from user classes.
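The best solution is to replace the pointer to the handler table of a single object: copy the original table and replace the one handler we need, dtor_obj. In abbreviated form it looks like this:
php_weak_referent_t *referent = (php_weak_referent_t *) ecalloc(1, sizeof(php_weak_referent_t));

/* Remember the original table; this step is not shown in the abbreviated listing
   and is assumed here so that memcpy() has a valid source. */
referent->original_handlers = Z_OBJ_P(referent_zv)->handlers;

/* Copy the original table and override only the destructor handler. */
memcpy(&referent->custom_handlers, referent->original_handlers, sizeof(zend_object_handlers));
referent->custom_handlers.dtor_obj = php_weak_referent_object_dtor_obj;

/* Point this single object at its private copy of the table. */
Z_OBJ_P(referent_zv)->handlers = &referent->custom_handlers;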
Obstacles
Testing revealed a rather unpleasant fact: changing the pointer to the handler table affects the result of PHP's spl_object_hash() function, which uses the value of that pointer to generate the last 16 characters of the id (the first 16 are a hash of the object handle). As a result, the object id returned before creating a weak reference differs from the one returned after:
<?php

$obj = new stdClass();

var_dump(spl_object_hash($obj));
On the positive side, once the hash has changed, it will not change again (at least not because of this extension).
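The two halves of the hash can be inspected directly. This is a minimal sketch with illustrative variable names; the meaning of each half is taken from the description above and depends on the PHP version, so the code relies only on properties that hold either way.

```php
<?php
$a = new stdClass();
$b = new stdClass();

$hashA = spl_object_hash($a);
$hashB = spl_object_hash($b);

// Split the 32-character hash into the two halves discussed above.
$handlePartA = substr($hashA, 0, 16);
$secondPartA = substr($hashA, 16);
$handlePartB = substr($hashB, 0, 16);
$secondPartB = substr($hashB, 16);

var_dump(strlen($hashA));                    // total length of the hash
var_dump($handlePartA !== $handlePartB);     // two live objects never share a handle
var_dump($secondPartA === $secondPartB);     // same class, so nothing differs in the second half
var_dump($hashA === spl_object_hash($a));    // stable as long as nothing swaps the handlers
```

With the handler-table swap in place on an affected PHP version, the last check is the one that would be expected to fail across the creation of a weak reference.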
But let us go further. Discussions with PHP developers led to an idea: not to use the pointer to the handler table when generating the hash at all, since the first part alone is enough to keep an object's hash unique throughout its life. The second part now simply uses an arbitrary value that previously served as a mask:
<?php

class Test {}
class X {}

$t = new Test();
$x = new X();
$s = new SplObjectStorage();

var_dump(spl_object_hash($t));
Much better! This improvement will most likely be available in PHP versions after 7.0.2.
But that is only after PHP 7.0.2, and we need to work universally and reliably on earlier versions of PHP 7 as well.
A not very pretty but perfectly workable way is to replace the spl_object_hash() function with our own implementation:
<?php
// Pseudocode: the actual replacement is done inside the extension, not in userland.
$spl_hash_function = $EG->function_table['spl_object_hash'];

$custom_hash_function = function (object $obj) use ($spl_hash_function) {
    $hash = null;

    if (weak\refcounted($obj)) {
        $referent = execute_referent_object_metadata($obj);

        // Temporarily restore the original handlers so the hash is computed from them.
        $obj->handlers = $referent->original_handlers;
        $hash = $spl_hash_function($obj);
        $obj->handlers = $referent->custom_handlers;
    }

    if (null == $hash) {
        $hash = $spl_hash_function($obj);
    }

    return $hash;
};

$EG->function_table['spl_object_hash'] = $custom_hash_function;
The author of this substitution method is Etienne Kneuss, who is also the author of the original implementation of weak references for PHP 5, the php-weakref extension. Unfortunately, I did not wait the couple of hours it took him to add PHP 7 support as well. Still, two working extensions are better than none, and at the moment the two differ considerably in functionality and concept.
Notification mechanism for the destruction of the object
When creating a weak reference, a mechanism that notifies us that the referent has been destroyed is rather important. Conceptually, this can be solved in three ways: by polling the weak reference; by storing the weak reference into an array, as is done in the implementation of WeakReference in Java; and, most interestingly, by calling a user function, as is done in the implementation of weakref.ref in Python.
To do this, after calling the original object's destructor, we go through each weak reference and either call its user-defined function, if one was specified when the weak reference was created, or append the weak reference object to an array, if an array was specified. Note that if an exception is thrown in the destructor or in one of the notifier functions, no further notifier functions are called. Put simply, notification through user-defined functions can be thought of as extending the object's class with an overridden destructor that first calls the parent destructor and then calls the user functions one after another, from the most recently registered to the very first.
As a result, our destructor wrapper can be written in pseudocode as:
run_original_dtor_obj($object);

foreach ($weak_references as $weak_ref_object_handle => $weak_reference) {
    if (is_array($weak_reference->notifier)) {
        $weak_reference->notifier[] = $weak_reference;
    } elseif (is_callable($weak_reference->notifier) && $no_exception_thrown) {
        $weak_reference->notifier($weak_reference);
    }

    unset($weak_references[$weak_ref_object_handle]);
}
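The subclass analogy described earlier can be tried out in plain PHP. This is only an illustrative sketch of the ordering, not how the extension works internally; the Connection and NotifyingConnection classes and the onDestroy() method are hypothetical names.

```php
<?php
// Hypothetical referent class with an ordinary destructor.
class Connection
{
    public static $log = [];

    public function __destruct()
    {
        self::$log[] = 'original destructor';
    }
}

// Hypothetical subclass that emulates the wrapped destructor.
class NotifyingConnection extends Connection
{
    private $notifiers = [];

    public function onDestroy(callable $notifier)
    {
        $this->notifiers[] = $notifier;
    }

    public function __destruct()
    {
        parent::__destruct();    // the original destructor runs first

        // Then the notifiers, from the most recently registered to the first.
        foreach (array_reverse($this->notifiers) as $notifier) {
            $notifier();
        }
    }
}

$conn = new NotifyingConnection();
// Static closures do not capture the object, so they do not keep it alive.
$conn->onDestroy(static function () { Connection::$log[] = 'first notifier'; });
$conn->onDestroy(static function () { Connection::$log[] = 'second notifier'; });

$conn = null;    // last reference gone: destructor, then notifiers

print_r(Connection::$log);
```

As in the description, an exception thrown by the destructor or by one notifier would stop the loop, so the remaining notifiers would never run.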
The weak reference class itself looks like this:
namespace Weak;

class Reference
{
    public function __construct(object $referent, $notify = null) { }

    public function get() { }

    public function valid() : bool { }

    public function notifier($notify = null) { }
}
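A possible way to use this class is sketched below. The method names come from the listing above, but the behavior noted in the comments (get() returning null and valid() returning false once the referent is gone, and the callback receiving the reference) is an assumption inferred from the surrounding description. The code runs its body only when the class is actually available.

```php
<?php
$status = 'skipped';    // stays 'skipped' when the extension is not loaded

if (class_exists('Weak\\Reference')) {
    $destroyed = [];
    $object = new stdClass();

    // Assumption: the callable notifier receives the weak reference itself.
    $reference = new Weak\Reference($object, function ($ref) use (&$destroyed) {
        $destroyed[] = $ref;
    });

    var_dump($reference->valid());             // assumed: true while the referent is alive
    var_dump($reference->get() === $object);   // assumed: get() returns the referent

    $object = null;                            // drop the only strong reference

    var_dump($reference->valid());             // assumed: false after destruction
    var_dump($reference->get());               // assumed: null after destruction
    var_dump(count($destroyed));               // assumed: the notifier was called once

    $status = 'ran';
}

echo "Weak\\Reference demo: {$status}", PHP_EOL;
```

Passing an array instead of a callable as the second argument would, per the description above, collect destroyed references for later polling.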
To obtain more information about values and objects, the extension provides a number of functions that complement the weak reference class and give you full freedom to build various solutions at the user level.
List of the extension's functions:
namespace Weak;

function refcounted(mixed $value) : bool {}
function refcount(mixed $value) : int {}
function weakrefcounted(object $value) : bool {}
function weakrefcount(object $value) : int {}
function weakrefs(object $value) : array {}
function object_handle(object $value) : int {}
These functions are quite specific and should be used only when the user understands what they are doing and why.
Practical use
This extension, which provides the functionality described above, grew out of work on another extension: an integration of the V8 JavaScript engine into PHP (not v8js; simpler and more powerful; at the moment the source code is not publicly available).
When writing the abstraction layer that translates PHP objects into JS objects, it became necessary to match PHP objects unambiguously to their JS representations and, most importantly, to clean up the correspondence table. A typical problem is that the same PHP object gets returned to JS from different places when a JS representation for it already exists, and to get identical representations inside JS you need to keep a correspondence table.
Initially, the table mapped the PHP object's id to the JS representation itself, but as work progressed, orphaned JS representations began to pile up: the corresponding PHP objects had been deleted by the garbage collector because nothing referred to them any more. The notification mechanism implemented along with weak references solves this problem completely.
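The orphaning problem is easy to reproduce with a plain array keyed by object id. This is a minimal sketch, assuming PHP 7.2 or later for spl_object_id(); the $views table and its string values are hypothetical stand-ins for the JS representations.

```php
<?php
$views = [];    // hypothetical table: object id => representation

$user = new stdClass();
$views[spl_object_id($user)] = 'representation of $user';

$user = null;                // the object is destroyed, but the table is not told
$orphans = count($views);    // the stale entry is still there

// A new object is likely to receive the freed id and hit the stale entry.
$order = new stdClass();
$wrongHit = isset($views[spl_object_id($order)]);

var_dump($orphans, $wrongHit);
```

Without a destruction notification the table both leaks entries and risks returning the representation of a dead object for a new one, which is the gap the notifier closes.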
For people not suffering from the symptoms described above and leading a healthy nocturnal lifestyle, this extension may be useful for the following:
- Caching objects at runtime.
- Binding data to an object, for example, adding properties and methods where inheritance and/or composition are not a good fit (hello, Java).
- Removing the corresponding event listeners after an object has been deleted.
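To see why binding data to an object calls for weak references, consider what happens with the standard SplObjectStorage, which holds its keys strongly. This is a minimal sketch; the Widget class and its counter are hypothetical.

```php
<?php
// Hypothetical class that counts how many instances were destroyed.
class Widget
{
    public static $destroyed = 0;

    public function __destruct()
    {
        self::$destroyed++;
    }
}

$metadata = new SplObjectStorage();

$widget = new Widget();
$metadata[$widget] = ['note' => 'extra data bound to the object'];

$widget = null;                                // we no longer need the object...
$stillStored = count($metadata);               // ...but the storage keeps it alive
$destroyedWhileStored = Widget::$destroyed;

$metadata = null;                              // only dropping the storage releases it
$destroyedAfterCleanup = Widget::$destroyed;

var_dump($stillStored, $destroyedWhileStored, $destroyedAfterCleanup);
```

A storage that holds its keys weakly would let the object be destroyed as soon as the last outside reference disappears, dropping the bound data along with it.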
I invite you to share other possible uses of weak references in PHP in the comments.
For our own needs, we created an implementation of the WeakMap data structure based on SplObjectStorage: Weak\SplObjectStorage.
The extension is available at github.com/pinepain/php-weak and is distributed under the MIT license. It requires PHP 7 to work.