In PHP 7, a zval is no longer allocated separately and does not store a refcount itself. Simple values, such as integers or floating-point numbers, are stored directly in the zval, while complex values are represented by a pointer to a separate structure. Each of these structures begins with a common header, zend_refcounted:

    struct _zend_refcounted {
        uint32_t refcount;
        union {
            struct {
                ZEND_ENDIAN_LOHI_3(
                    zend_uchar type,
                    zend_uchar flags,
                    uint16_t   gc_info)
            } v;
            uint32_t type_info;
        } u;
    };

The header contains the refcount, the data type, information for the garbage collector (gc_info), and a slot for type-specific flags.
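As a rough illustration of how a shared header can carry the reference count for any complex value, here is a minimal sketch in C. The names demo_refcounted, demo_box, demo_box_new, demo_box_addref and demo_box_release are hypothetical and do not come from the PHP source. The real header also packs the type, flags and gc_info into type_info, which this sketch leaves unused.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified counterpart of zend_refcounted. */
typedef struct {
    uint32_t refcount;
    uint32_t type_info;   /* type, flags and gc_info would be packed here */
} demo_refcounted;

/* Any "complex" value starts with the shared header. */
typedef struct {
    demo_refcounted gc;
    long payload;
} demo_box;

/* Allocate a value that starts with one owner. */
demo_box *demo_box_new(long payload) {
    demo_box *box = malloc(sizeof *box);
    if (box == NULL) {
        return NULL;
    }
    box->gc.refcount = 1;
    box->gc.type_info = 0;
    box->payload = payload;
    return box;
}

/* Sharing the value only increments the counter; nothing is copied. */
demo_box *demo_box_addref(demo_box *box) {
    box->gc.refcount++;
    return box;
}

/* Drop one reference; returns 1 if the value was freed, 0 otherwise. */
int demo_box_release(demo_box *box) {
    if (--box->gc.refcount == 0) {
        free(box);
        return 1;
    }
    return 0;
}
```

A caller would create a value with demo_box_new(), hand out additional owners with demo_box_addref(), and have each owner call demo_box_release() when it is done.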
Next, we look at the individual complex types and compare them with their implementation in PHP 5. References are one of the complex types, but they were already discussed in the first part of the article. Resources are not covered either, since I do not find them interesting enough to discuss here.

Strings in PHP 7 are represented by zend_string:

    struct _zend_string {
        zend_refcounted gc;
        zend_ulong h; /* hash value */
        size_t len;
        char val[1];
    };
Besides the refcounted header, a string contains the hash cache h, the length len and the value val. The hash cache avoids recomputing the hash of the string every time it is used as a key in a HashTable lookup. On first use it is initialized to the (non-zero) hash.

The definition of val may seem strange: it is declared as an array of characters with a single element, but we certainly want to store strings longer than one character. The technique used here is called the “struct hack”: although the array is declared with one element, when a zend_string is created enough memory is allocated to hold a longer string. The longer string is still accessed through val.
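The layout described above can be sketched in a few lines of C. This is an illustration under stated assumptions, not PHP's implementation: demo_string, demo_string_init and demo_string_hash are hypothetical names, the refcounted header is omitted, and the hash function is a simple DJB-style hash chosen only for illustration. The sketch uses a C99 flexible array member (char val[]), the standard-conforming form of the struct hack, instead of the val[1] declaration shown above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified string: header and characters share one block. */
typedef struct {
    uint64_t h;     /* cached hash, 0 means "not computed yet" */
    size_t len;
    char val[];     /* flexible array member holding len + 1 bytes */
} demo_string;

/* Copy a C string into a freshly allocated demo_string. */
demo_string *demo_string_init(const char *src, size_t len) {
    demo_string *s = malloc(offsetof(demo_string, val) + len + 1);
    if (s == NULL) {
        return NULL;
    }
    s->h = 0;
    s->len = len;
    memcpy(s->val, src, len);
    s->val[len] = '\0';
    return s;
}

/* Compute the hash on first use and reuse the cached value afterwards. */
uint64_t demo_string_hash(demo_string *s) {
    if (s->h == 0) {
        uint64_t hash = 5381;
        for (size_t i = 0; i < s->len; i++) {
            hash = hash * 33 + (unsigned char)s->val[i];
        }
        /* Force the top bit so that a computed hash is never 0. */
        s->h = hash | UINT64_C(0x8000000000000000);
    }
    return s->h;
}
```

Because the characters follow the header in the same allocation, a single free() releases the whole string, and s->val can be passed to any function that expects a NUL-terminated C string.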
The string's own refcounted header also makes it possible to share a string in several places without going through a zval. This is especially important for sharing hash table keys.

The new type has one drawback: it is easy to get a C string from a zend_string (using str->val), but you cannot directly get a zend_string from a C string. To do this, you have to copy the value of the string into a newly created zend_string. This is especially annoying when working with string literals, that is, constant strings that occur in the C source code.

A string can carry the following flags:

    #define IS_STR_PERSISTENT (1<<0) /* allocated using malloc */
    #define IS_STR_INTERNED   (1<<1) /* interned string */
    #define IS_STR_PERMANENT  (1<<2) /* interned string surviving request boundary */
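Persistent strings are allocated with malloc instead of the Zend memory manager (ZMM). Because the allocator is recorded as a flag, such strings can be used transparently in a zval. In PHP 5, this required a prior copy into ZMM.

Consider the following script:

    for ($i = 0; $i < 1000000; ++$i) {
        $array[] = ['foo'];
    }
    var_dump(memory_get_usage());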
Without OPcache, each element of $array receives a new copy of ['foo']. Why make a copy instead of incrementing the reference count? Literal VM operands do not use reference counting, so as not to corrupt shared memory (SHM). I hope that in the future this catastrophic case will be fixed so that it also works well without OPcache.

In PHP 5, the zval stored a zend_object_value, defined as follows:

    typedef struct _zend_object_value {
        zend_object_handle handle;
        const zend_object_handlers *handlers;
    } zend_object_value;
handle is a unique object ID used to look up the object's data. handlers is a table of function pointers (a vtable) that implement the object's behavior. For "normal" objects this handler table is the same. Objects created by PHP extensions, however, can use custom handler sets that change the behavior of objects (for example, by overloading operators).

The handle is an index into the object store, an array of buckets defined as follows:

    typedef struct _zend_object_store_bucket {
        zend_bool destructor_called;
        zend_bool valid;
        zend_uchar apply_count;
        union _store_bucket {
            struct _store_object {
                void *object;
                zend_objects_store_dtor_t dtor;
                zend_objects_free_object_storage_t free_storage;
                zend_objects_store_clone_t clone;
                const zend_object_handlers *handlers;
                zend_uint refcount;
                gc_root_buffer *buffered;
            } obj;
            struct {
                int next;
            } free_list;
        } bucket;
    } zend_object_store_bucket;
Which member of the union is used depends on whether the bucket is currently in use or is on the free list. The case that matters to us is struct _store_object.

object is a pointer to the actual object. It is not embedded in the object store, since objects do not have a fixed size. The pointer is followed by three handlers responsible for destruction, freeing and cloning. Note that in PHP, destroying and freeing an object are distinct steps, and the first may be skipped in some cases (unclean shutdown). The clone handler is virtually never used. Since these store handlers are not part of the regular object handlers, they are duplicated for each object instead of being shared.

They are followed by a pointer to the regular handlers. It is kept in case the object is destroyed without a zval being known (the zval is where the handlers are usually stored).

The bucket also contains a refcount, which looks odd given that in PHP 5 the reference counter is already stored in the zval. Why do we need two counters? Usually a zval is “copied” by simply incrementing its counter. But sometimes a full copy is made, that is, a completely new zval is created for the same zend_object_value. As a result, two different zvals use the same object store bucket, which therefore needs its own reference counting. This “double counting” is a characteristic feature of the zval implementation in PHP 5. For the same reason, the buffered pointer into the GC root buffer is duplicated.

Now consider the object that the object store points to. Ordinary userland objects are defined as follows:

    typedef struct _zend_object {
        zend_class_entry *ce;
        HashTable *properties;
        zval **properties_table;
        HashTable *guards;
    } zend_object;
The zend_class_entry is a pointer to the class of which the object is an instance. The next two members provide two different ways of storing object properties. Dynamic properties (those added at run time and not declared in the class) are kept in the properties hash table, which maps property names to their values.

Declared properties are optimized: each one is assigned an index, and its value is stored at that index in properties_table. The mapping between names and indexes is stored in a hash table in the class entry. This spares individual objects the memory overhead of a hash table. Moreover, the property index is cached polymorphically at run time.

The guards hash table is used to implement the recursion behavior of “magic” methods like __get, but I will not cover it here.

This representation involves a lot of indirection. To fetch a property of an object zval, you first have to fetch the object store bucket, then the zend object, then the property table, and finally the zval it points to. That is at least four levels of indirection, and in real projects there will be no fewer than seven.

PHP 7 uses a new zend_object structure:

    struct _zend_object {
        zend_refcounted gc;
        uint32_t handle;
        zend_class_entry *ce;
        const zend_object_handlers *handlers;
        HashTable *properties;
        zval properties_table[1];
    };
The zend_object_value has been replaced by a direct pointer to the object, and the object store, although not gone completely, plays a much smaller role.

Besides the usual zend_refcounted header, the handle and handlers have moved into the zend_object. properties_table now also uses the struct hack, so the zend_object and the property table are allocated in one block. And of course, the property table now embeds the zvals directly rather than holding pointers to them.

The guards table has been removed from the object structure and is stored in the first properties_table slot if the object uses __get and similar methods. If these “magic” methods are not used, the guards table is omitted.

The dtor, free_storage and clone handlers, previously stored in the object store, have moved to the handlers table:

    struct _zend_object_handlers {
        /* offset of real object header (usually zero) */
        int offset;
        /* general object functions */
        zend_object_free_obj_t free_obj;
        zend_object_dtor_obj_t dtor_obj;
        zend_object_clone_obj_t clone_obj;
        /* individual object functions */
        // ... rest is about the same in PHP 5
    };
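The handler table begins with offset, which is not a handler. It relates to how internal objects are represented: an internal object always embeds the standard zend_object, but usually also adds a number of members of its own. In PHP 5, they were added after the standard object:

    struct custom_object {
        zend_object std;
        uint32_t something;
        // ...
    };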
This means that a zend_object* can simply be cast to your custom struct custom_object*. This is the standard way of implementing structure inheritance in C. In PHP 7, however, this approach has a problem: since zend_object uses the struct hack to store the property table, PHP stores properties past the end of the zend_object, which would overwrite the additional internal members. Therefore, in PHP 7 the additional members are stored before the standard object:

    struct custom_object {
        uint32_t something;
        // ...
        zend_object std;
    };
As a result, it is no longer possible to convert between a zend_object* and a struct custom_object* with a simple cast, because the two are separated by an offset. This offset is stored in the first member of the object handler table. At compile time, it can be determined with the offsetof() macro.
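The conversion can be sketched as follows. This is a self-contained illustration, not extension code: demo_object stands in for zend_object, and demo_custom_object and demo_custom_from_obj are hypothetical names introduced here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for zend_object; the real structure has more members. */
typedef struct {
    uint32_t handle;
} demo_object;

/* Custom members come first, the standard object goes last. */
typedef struct {
    uint32_t something;
    demo_object std;
} demo_custom_object;

/* Recover the enclosing custom object from a pointer to its embedded
 * standard object by stepping back over the preceding members. */
demo_custom_object *demo_custom_from_obj(demo_object *obj) {
    return (demo_custom_object *)((char *)obj - offsetof(demo_custom_object, std));
}
```

Going the other way needs no arithmetic: the address of the std member is the pointer to the standard object.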
Why do PHP 7 objects still contain a handle? A zend_object is now referenced by a direct pointer, so a handle is no longer needed to look up the object in the store. The handle is still needed, however, because the object store still exists, albeit in a substantially reduced form. It is now a simple array of pointers to objects. When an object is created, a pointer to it is placed in the store at the index handle, and it is removed when the object is freed.

Compared with PHP 5, there is now only one reference counter (the zval itself no longer has one). Memory consumption has decreased significantly: 40 bytes are enough for the base object, plus 16 bytes for each declared property, including its zval. There is also much less indirection, since many intermediate structures have been dropped or merged into other structures. Reading a property now takes one level of indirection instead of four.
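To make the single level of indirection concrete, here is a minimal sketch of an object whose property slots live in the same allocation as the object header. All names (demo_zval, demo_class, demo_object, demo_object_new, demo_prop_slot, demo_read_prop) are hypothetical. The value type only holds an integer, and a linear scan stands in for the name-to-index hash table that the class entry would use.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, heavily simplified value: only integers are supported. */
typedef struct {
    long lval;
} demo_zval;

/* Stand-in for the class entry: it owns the name -> slot mapping. */
typedef struct {
    const char **prop_names;
    uint32_t prop_count;
} demo_class;

/* Header and property slots are allocated as one block. */
typedef struct {
    uint32_t refcount;
    const demo_class *ce;
    demo_zval properties_table[];
} demo_object;

/* Allocate an object with one zeroed slot per declared property. */
demo_object *demo_object_new(const demo_class *ce) {
    size_t size = offsetof(demo_object, properties_table)
                + (size_t)ce->prop_count * sizeof(demo_zval);
    demo_object *obj = calloc(1, size);
    if (obj == NULL) {
        return NULL;
    }
    obj->refcount = 1;
    obj->ce = ce;
    return obj;
}

/* Resolve a declared property name to its slot index, or -1 if unknown. */
int demo_prop_slot(const demo_class *ce, const char *name) {
    for (uint32_t i = 0; i < ce->prop_count; i++) {
        if (strcmp(ce->prop_names[i], name) == 0) {
            return (int)i;
        }
    }
    return -1;
}

/* With a known slot, reading a property is one offset from the object. */
demo_zval *demo_read_prop(demo_object *obj, int slot) {
    return &obj->properties_table[slot];
}
```

Once the slot index is known (and it can be cached, as described earlier for the property index), the property is reached from the object pointer without visiting any intermediate structure.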
Besides the ordinary types, there are a few special zval types that are used only in special cases. One of them is IS_INDIRECT. The value of an indirect zval is stored elsewhere. Note that this type differs from IS_REFERENCE in that it points directly to another zval, rather than to a zend_reference structure in which a zval is embedded.

When can this zval type be useful? Let's first look at how variables are implemented in PHP. All variables known at compile time are assigned an index, and their values are stored in the compiled variables (CV) table at that index. But PHP also lets you reference variables dynamically, using variable variables or, in the global scope, $GLOBALS. For this kind of access, PHP creates a symbol table for the function or script, which maps variable names to their values.

To support both kinds of access, the CV table in PHP 5 used doubly indirect zval** pointers. Normally these pointers lead to a second table of zval* pointers, which in turn refer to the zvals:

    +------ CV_ptr_ptr[0]
    | +---- CV_ptr_ptr[1]
    | | +-- CV_ptr_ptr[2]
    | | |
    | | +-> CV_ptr[0] --> some zval
    | +---> CV_ptr[1] --> some zval
    +-----> CV_ptr[2] --> some zval
Once a symbol table is in use, the second table of zval* pointers is no longer used, and the zval** pointers refer to the hash table buckets instead. A small illustration with three variables $a, $b and $c:

    CV_ptr_ptr[0] --> SymbolTable["a"].pDataPtr --> some zval
    CV_ptr_ptr[1] --> SymbolTable["b"].pDataPtr --> some zval
    CV_ptr_ptr[2] --> SymbolTable["c"].pDataPtr --> some zval
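In PHP 7, the approach is reversed: for variables stored in the CV table, the symbol table contains an INDIRECT entry that points to the CV slot. A variable without a CV slot, such as $d here, is stored in the symbol table directly:

    SymbolTable["a"].value = INDIRECT --> CV[0] = LONG 42
    SymbolTable["b"].value = INDIRECT --> CV[1] = DOUBLE 42.0
    SymbolTable["c"].value = INDIRECT --> CV[2] = STRING --> zend_string("42")
    SymbolTable["d"].value = ARRAY --> zend_array([4, 2])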
An indirect zval can also point to an IS_UNDEF zval. In that case it is treated as if the hash table did not contain the associated key. So if unset($a) writes the UNDEF type to CV[0], this is treated as if the symbol table no longer had the key “a”.
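The lookup rule described above can be sketched as follows. This is only an illustration: demo_zval, demo_symbol and demo_symtable_find are hypothetical names, the value type supports only integers, and a linear scan stands in for the real hash table.

```c
#include <stddef.h>
#include <string.h>

typedef enum { DEMO_UNDEF, DEMO_LONG, DEMO_INDIRECT } demo_type;

/* Hypothetical, simplified value with an INDIRECT variant. */
typedef struct demo_zval {
    demo_type type;
    union {
        long lval;
        struct demo_zval *zv;   /* target of an INDIRECT entry */
    } value;
} demo_zval;

/* One symbol table entry: a variable name and its value. */
typedef struct {
    const char *name;
    demo_zval val;
} demo_symbol;

/* Find a variable by name. INDIRECT entries are followed to the CV slot,
 * and a slot holding UNDEF is reported as if the key did not exist. */
demo_zval *demo_symtable_find(demo_symbol *table, size_t count, const char *name) {
    for (size_t i = 0; i < count; i++) {
        if (strcmp(table[i].name, name) == 0) {
            demo_zval *zv = &table[i].val;
            if (zv->type == DEMO_INDIRECT) {
                zv = zv->value.zv;
            }
            return zv->type == DEMO_UNDEF ? NULL : zv;
        }
    }
    return NULL;
}
```

Since the entry only points at the CV slot, writing UNDEF into the slot is enough to make the variable disappear from name-based lookups; the symbol table itself does not have to be modified.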
There are two more special zval types, available in both PHP 5 and PHP 7: IS_CONSTANT and IS_CONSTANT_AST. To understand their purpose, consider an example:

    function test($a = ANSWER, $b = ANSWER * ANSWER) {
        return $a + $b;
    }

    define('ANSWER', 42);
    var_dump(test()); // int(42 + 42 * 42)
The default values of the test() function's parameters use the constant ANSWER, but it is not yet defined when the function is declared. The value of the constant becomes known only after define() is called. For this reason, default values of parameters and properties, as well as constants and everything else that accepts a “static expression”, can defer evaluation of the expression until first use.

If the value is a constant, a zval of type IS_CONSTANT holding the name of the constant is used. If the value is an expression, a zval of type IS_CONSTANT_AST is used, which refers to an abstract syntax tree (AST).

Source: https://habr.com/ru/post/261131/