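This first part of the article describes how the implementation of zval (Zend value) changed between PHP 5 and PHP 7, including the implementation of references. The second part will discuss the implementation of individual data types, such as strings and objects, in detail.

In PHP 5 the zval structure is defined as follows:

    typedef struct _zval_struct {
        zvalue_value value;
        zend_uint refcount__gc;
        zend_uchar type;
        zend_uchar is_ref__gc;
    } zval;

A zval consists of a value, a type and some additional __gc information, which I will discuss below. The value is a union of the various possible values that a zval can store:

    typedef union _zvalue_value {
        long lval;                 // booleans, integers and resources
        double dval;               // floating-point numbers
        struct {                   // strings
            char *val;
            int len;
        } str;
        HashTable *ht;             // arrays
        zend_object_value obj;     // objects
        zend_ast *ast;             // constant expressions
    } zvalue_value;

In a C union only one member is active at a time. All members occupy the same memory, which is interpreted differently depending on the member you access. If you read lval, the value is interpreted as a signed integer. If you read dval, it is interpreted as a double-precision floating-point number. And so on.

To find out which member of the union is currently in use, the type member of the zval stores a type tag, which is simply an integer:

    #define IS_NULL     0      /* does not use value */
    #define IS_LONG     1      /* uses lval */
    #define IS_DOUBLE   2      /* uses dval */
    #define IS_BOOL     3      /* uses lval with values 0 and 1 */
    #define IS_ARRAY    4      /* uses ht */
    #define IS_OBJECT   5      /* uses obj */
    #define IS_STRING   6      /* uses str */
    #define IS_RESOURCE 7      /* uses lval, which is the resource ID */

    /* special types used for late binding of constants */
    #define IS_CONSTANT     8
    #define IS_CONSTANT_AST 9

In PHP 5, zvals are (with few exceptions) allocated on the heap, and PHP needs a way to track which zvals are in use and which ones can be freed. For this, reference counting is used. The refcount__gc member stores how many times the zval is currently referenced. For example, in $a = $b = 42 the value 42 is referenced by two variables, so its refcount is 2. When the refcount drops to zero, the value is no longer used and can be freed.

Closely related to reference counting is copy-on-write: several users can share the same zval only until it changes.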
To modify a shared zval, it must first be duplicated (separated), and all changes are then made to the copy.

Here is an example that shows both copy-on-write and the destruction of a zval:

    $a = 42;   // $a         -> zval_1(type=IS_LONG, value=42, refcount=1)
    $b = $a;   // $a, $b     -> zval_1(type=IS_LONG, value=42, refcount=2)
    $c = $b;   // $a, $b, $c -> zval_1(type=IS_LONG, value=42, refcount=3)

    // The following line separates the zval
    $a += 1;   // $b, $c -> zval_1(type=IS_LONG, value=42, refcount=2)
               // $a     -> zval_2(type=IS_LONG, value=43, refcount=1)

    unset($b); // $c -> zval_1(type=IS_LONG, value=42, refcount=1)
               // $a -> zval_2(type=IS_LONG, value=43, refcount=1)

    unset($c); // zval_1 is destroyed, because refcount=0
               // $a -> zval_2(type=IS_LONG, value=43, refcount=1)

Reference counting has one serious flaw: it cannot detect and release cyclic references. PHP therefore uses an additional cycle collector. Whenever the refcount of a zval is decremented and the zval might be part of a cycle, it is written to the root buffer. When this buffer is full, potential cycles are marked and cleaned up by the garbage collector.

To support the cycle collector, the structure that is actually used looks like this:

    typedef struct _zval_gc_info {
        zval z;
        union {
            gc_root_buffer *buffered;
            struct _zval_gc_info *next;
        } u;
    } zval_gc_info;

The zval_gc_info structure embeds a regular zval and one additional pointer. Since u is a union, this is a single pointer that can have one of two types. The buffered pointer records where in the root buffer the zval is referenced, so that the entry can be removed if the zval is destroyed before the cycle collector runs (which is very likely). next is used when the collector destroys values.

Now a few words about sizes (all figures are for 64-bit systems). The zvalue_value union is 16 bytes, since its str and obj members both have that size. The whole zval structure is 24 bytes (because of padding), and zval_gc_info is 32 bytes. On top of that, allocating the zval on the heap costs another 16 bytes of overhead. In total each zval takes 48 bytes, no matter how many places use it.

This zval implementation is inefficient in many ways. Judge for yourself: suppose a zval stores a simple integer, which by itself takes 8 bytes. A type tag must also be stored in any case. It occupies a single byte, but because of padding it effectively costs another 8. To the resulting 16 bytes you add another 16 for reference counting and the cycle collector, and another 16 for the heap allocation.
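The size arithmetic above can be checked with a small C sketch. The structs mirror the PHP 5 definitions quoted earlier, but this is only an illustration: `HashTable`, `zend_ast` and `gc_root_buffer` are left opaque, `zend_object_value` is assumed to be a 32-bit handle plus a handlers pointer, and all `php5_*` names are made up for this example. The sizes quoted in the text should come out on a typical 64-bit platform.

```c
#include <assert.h>
#include <stddef.h>

/* Opaque stand-ins: only pointers to these types are stored. */
typedef struct HashTable_stub HashTable;
typedef struct zend_ast_stub zend_ast;
typedef struct gc_root_buffer_stub gc_root_buffer;

typedef unsigned int zend_uint;
typedef unsigned char zend_uchar;

/* Assumption: an object value is a 32-bit handle plus a handlers pointer. */
typedef struct {
    zend_uint handle;
    const void *handlers;
} zend_object_value;

typedef union {
    long lval;
    double dval;
    struct { char *val; int len; } str;
    HashTable *ht;
    zend_object_value obj;
    zend_ast *ast;
} php5_zvalue_value;

typedef struct {
    php5_zvalue_value value;
    zend_uint refcount__gc;
    zend_uchar type;
    zend_uchar is_ref__gc;
} php5_zval;

typedef struct php5_zval_gc_info_s {
    php5_zval z;
    union {
        gc_root_buffer *buffered;
        struct php5_zval_gc_info_s *next;
    } u;
} php5_zval_gc_info;

size_t php5_value_size(void)   { return sizeof(php5_zvalue_value); }
size_t php5_zval_size(void)    { return sizeof(php5_zval); }
size_t php5_gc_info_size(void) { return sizeof(php5_zval_gc_info); }

/* Memory used per heap-allocated zval, given the allocator's per-block overhead. */
size_t php5_heap_footprint(size_t alloc_overhead) {
    /* The GC wrapper adds one pointer on top of the plain zval. */
    assert(sizeof(php5_zval_gc_info) > sizeof(php5_zval));
    return sizeof(php5_zval_gc_info) + alloc_overhead;
}
```

With the 16 bytes of allocation overhead mentioned in the text, `php5_heap_footprint(16)` corresponds to the 48 bytes per zval.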
In addition, the allocation itself and the subsequent free are both expensive operations.

The main problems with the zval implementation in PHP 5:

- A zval (almost) always has to be allocated on the heap.
- A zval always uses reference counting and always carries cycle collection information, even when sharing the value is not worth the cost (an integer) or cycles cannot occur at all.
- Counting references directly on zvals means that values can only be shared between zvals. For example, a string cannot be shared between a zval and a hash table key (unless the key is also stored as a zval).

This brings us to the new zval implementation in PHP 7. One of the major changes is that a zval no longer needs to be allocated separately on the heap. Also, the refcount is now stored not in the zval itself but in the complex value it points to: a string, an array or an object. This gives the following benefits:

- Simple values need neither a separate allocation nor reference counting.
- Because the refcount is stored in the value itself, the value can be shared independently of the zval. The same string can be used in a zval and be a key in a hash table.

The new zval is defined as follows:

    struct _zval_struct {
        zend_value value;
        union {
            struct {
                ZEND_ENDIAN_LOHI_4(
                    zend_uchar type,
                    zend_uchar type_flags,
                    zend_uchar const_flags,
                    zend_uchar reserved)
            } v;
            uint32_t type_info;
        } u1;
        union {
            uint32_t var_flags;
            uint32_t next;         // hash collision chain
            uint32_t cache_slot;   // literal cache slot
            uint32_t lineno;       // line number (for ast nodes)
            uint32_t num_args;     // arguments number for EX(This)
            uint32_t fe_pos;       // foreach position
            uint32_t fe_iter_idx;  // foreach iterator index
        } u2;
    };

The first member is still a value union. The second member is an integer that stores type information and is split into individual bytes using a union (you can ignore the ZEND_ENDIAN_LOHI_4 macro, it only ensures a consistent layout across platforms with different byte order). The important parts of this nested structure are type and type_flags, which I will discuss below.

There is a small problem here: value takes 8 bytes, and because of struct padding, adding even a single byte to it grows the zval to 16 bytes. But we clearly do not need 8 bytes to store the type.
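The padding effect described above can be seen in a simplified C sketch. `value_sketch`, `value_plus_tag` and `value_plus_two_slots` are hypothetical types made up for this illustration, not engine definitions, and the 16-byte result for the tagged struct assumes a platform where the union is 8-byte aligned (typical for 64-bit systems).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* An 8-byte value slot, loosely shaped like the PHP 7 value union. */
typedef union {
    int64_t lval;
    double dval;
    void *ptr;
} value_sketch;

/* Value plus a one-byte type tag: the bytes after the tag are padding. */
typedef struct {
    value_sketch value;
    unsigned char type;
} value_plus_tag;

/* Value plus two 32-bit slots, shaped like u1 and u2: the padding is put to use. */
typedef struct {
    value_sketch value;
    uint32_t type_info;
    uint32_t extra;
} value_plus_two_slots;

size_t tag_only_size(void)  { return sizeof(value_plus_tag); }
size_t two_slots_size(void) { return sizeof(value_plus_two_slots); }

/* Bytes of value_plus_tag that hold neither the value nor the tag. */
size_t wasted_bytes(void) {
    size_t used = offsetof(value_plus_tag, type) + 1;
    assert(sizeof(value_plus_tag) >= used);
    return sizeof(value_plus_tag) - used;
}
```

Both structs end up the same size, so a second 32-bit slot costs nothing compared to a bare type byte.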
That is why the zval has an additional u2 union. It is unused by default, but the surrounding code can use it to store 4 bytes of data. The different members of the union correspond to different uses of this extra slot.

The value union in PHP 7 differs slightly from the PHP 5 version:

    typedef union _zend_value {
        zend_long lval;
        double dval;
        zend_refcounted *counted;
        zend_string *str;
        zend_array *arr;
        zend_object *obj;
        zend_resource *res;
        zend_reference *ref;
        zend_ast_ref *ast;

        // special members, ignore them for now
        zval *zv;
        void *ptr;
        zend_class_entry *ce;
        zend_function *func;
        struct {
            ZEND_ENDIAN_LOHI(
                uint32_t w1,
                uint32_t w2)
        } ww;
    } zend_value;

First of all, value now occupies 8 bytes instead of 16. Only integers (lval) and floating-point numbers (dval) are stored directly. Everything else is a pointer. All pointer types (except the special ones marked above) use reference counting and share a common header defined by zend_refcounted:

    struct _zend_refcounted {
        uint32_t refcount;
        union {
            struct {
                ZEND_ENDIAN_LOHI_3(
                    zend_uchar type,
                    zend_uchar flags,
                    uint16_t gc_info)
            } v;
            uint32_t type_info;
        } u;
    };

Besides the refcount, this structure contains type, flags and gc_info. type simply duplicates the type of the zval and lets the GC distinguish between different refcounted structures without storing a zval. flags are used for different purposes by different data types. I will describe them in more detail in the second part.

gc_info is the counterpart of buffered in the old zval. But instead of a pointer into the root buffer it now stores an index. Since the root buffer has a fixed capacity (10,000 elements), a 16-bit number is enough instead of a 64-bit pointer. gc_info also encodes the "color" of the node, which is used to mark nodes during collection.

I mentioned that a zval no longer needs to be allocated separately on the heap. But zvals still have to be stored somewhere. They are still mostly part of heap-allocated structures, but are now embedded directly into them. For example, a hash table bucket contains the zval itself instead of a pointer to a separate zval.
The compiled variables table of a function and the property table of an object are arrays of zvals. As a result, zvals are now usually stored with one level of indirection less: what used to be a zval * is now a zval.

Previously, using a zval in a new place meant copying the zval * and incrementing its refcount. Now you copy the contents of the zval (ignoring u2) and, possibly, increment the refcount of the value it points to, if that value uses reference counting.

Whether a value is reference counted cannot be determined from the type alone, because some types, such as strings and arrays, are not always refcounted. Instead, one bit of the type_info member says whether the zval is refcounted. Several other bits encode further properties of the type:

    #define IS_TYPE_CONSTANT      (1<<0)   /* special */
    #define IS_TYPE_IMMUTABLE     (1<<1)   /* special */
    #define IS_TYPE_REFCOUNTED    (1<<2)
    #define IS_TYPE_COLLECTABLE   (1<<3)
    #define IS_TYPE_COPYABLE      (1<<4)
    #define IS_TYPE_SYMBOLTABLE   (1<<5)   /* special */

The three main properties a type can have are refcounted, collectable and copyable. Refcounted was covered above. Collectable means that the zval can be part of a cycle. For example, strings are (often) refcounted, but it is impossible to create a cycle out of strings.

Copyable determines whether a value must be copied when a duplication is performed. A duplication is a hard copy: if you duplicate a zval that points to an array, the refcount of the array is not simply incremented. Instead, a new, independent copy of the array is created. For some types, such as objects and resources, duplication only increments the refcount. Such types are called non-copyable. This matches the passing semantics of objects and resources (which, for the record, are not passed by reference).

The following table shows which flags each type uses. "Simple types" are types such as integers and booleans that do not use a pointer to a separate structure. The immutable column marks immutable arrays.

                     | refcounted | collectable | copyable | immutable
    -----------------+------------+-------------+----------+----------
    simple types     |            |             |          |
    string           |     x      |             |    x     |
    interned string  |            |             |          |
    array            |     x      |      x      |    x     |
    immutable array  |            |             |          |     x
    object           |     x      |      x      |          |
    resource         |     x      |             |          |
    reference        |     x      |             |          |

Let's look at two examples of how zval management works in practice.
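Before the PHP-level examples, the copy rule described above (copy the zval contents, ignore u2, and increment the refcount only when the value is refcounted) can be sketched in C. This is a simplified model, not engine code: `sketch_zval`, `sketch_refcounted`, `SKETCH_REFCOUNTED` and `sketch_copy` are hypothetical names, and the flags are kept in a plain byte rather than packed into type_info.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_REFCOUNTED (1u << 2)   /* same bit position as IS_TYPE_REFCOUNTED */

/* Common header of every refcounted structure (reduced to the counter). */
typedef struct {
    uint32_t refcount;
} sketch_refcounted;

typedef struct {
    union {
        int64_t lval;                 /* simple value stored in place */
        sketch_refcounted *counted;   /* pointer to a refcounted structure */
    } value;
    uint8_t type;
    uint8_t type_flags;
    uint32_t u2;                      /* extra slot owned by the surrounding structure */
} sketch_zval;

/* Use the value of src in a new place dst. u2 is deliberately left untouched. */
void sketch_copy(sketch_zval *dst, const sketch_zval *src) {
    dst->value = src->value;
    dst->type = src->type;
    dst->type_flags = src->type_flags;
    if (src->type_flags & SKETCH_REFCOUNTED) {
        assert(src->value.counted != NULL);
        src->value.counted->refcount++;   /* one more user of the shared structure */
    }
}
```

Copying an integer touches no counter at all, while copying an array zval only bumps the counter of the shared array structure.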
First, an example with integer values, based on the PHP 5 example above:

    $a = 42;   // $a = zval_1(type=IS_LONG, value=42)

    $b = $a;   // $a = zval_1(type=IS_LONG, value=42)
               // $b = zval_2(type=IS_LONG, value=42)

    $a += 1;   // $a = zval_1(type=IS_LONG, value=43)
               // $b = zval_2(type=IS_LONG, value=42)

    unset($a); // $a = zval_1(type=IS_UNDEF)
               // $b = zval_2(type=IS_LONG, value=42)

Here each variable has its own zval. Remember that zvals are now embedded rather than allocated separately, which is emphasized by writing = instead of ->. When a variable is unset, the type of the corresponding zval changes to IS_UNDEF.

Now a more interesting case that involves a complex value:

    $a = [];   // $a = zval_1(type=IS_ARRAY) -> zend_array_1(refcount=1, value=[])

    $b = $a;   // $a = zval_1(type=IS_ARRAY) -> zend_array_1(refcount=2, value=[])
               // $b = zval_2(type=IS_ARRAY) ---^

    // The separation happens here
    $a[] = 1;  // $a = zval_1(type=IS_ARRAY) -> zend_array_2(refcount=1, value=[1])
               // $b = zval_2(type=IS_ARRAY) -> zend_array_1(refcount=1, value=[])

    unset($a); // $a = zval_1(type=IS_UNDEF), zend_array_2 is destroyed
               // $b = zval_2(type=IS_ARRAY) -> zend_array_1(refcount=1, value=[])

Each variable still has its own zval, but both zvals point to the same (refcounted) zend_array structure. As soon as a modification is made, the array has to be duplicated. PHP 5 behaves similarly in this situation.

Let's take a closer look at the types supported in PHP 7:

    // regular data types
    #define IS_UNDEF        0
    #define IS_NULL         1
    #define IS_FALSE        2
    #define IS_TRUE         3
    #define IS_LONG         4
    #define IS_DOUBLE       5
    #define IS_STRING       6
    #define IS_ARRAY        7
    #define IS_OBJECT       8
    #define IS_RESOURCE     9
    #define IS_REFERENCE    10

    // constant expressions
    #define IS_CONSTANT     11
    #define IS_CONSTANT_AST 12

    // internal types
    #define IS_INDIRECT     15
    #define IS_PTR          17

The list is similar to the one in PHP 5, with a few changes:

- IS_UNDEF is used where a NULL zval pointer was used before (not to be confused with an IS_NULL zval). For example, in the examples above, unset variables get the type IS_UNDEF.
- IS_BOOL has been split into IS_FALSE and IS_TRUE. Since the boolean value is now encoded in the type, a number of type-based checks can be optimized.
  This change is invisible to users, who still work with a single "boolean" type.
- PHP references no longer use the is_ref flag in the zval. A new type, IS_REFERENCE, was introduced instead. I will explain how it works below.
- IS_INDIRECT and IS_PTR are special internal types.

IS_LONG now uses a zend_long value instead of the usual C long. The reason is that on 64-bit Windows a long is only 32 bits wide, so PHP 5 always used 32-bit numbers on Windows. PHP 7 lets you use 64-bit values whenever the system is 64-bit, including Windows.

The individual zend_refcounted types will be covered in the next part. Here we limit ourselves to the implementation of PHP references.

First, how references worked in PHP 5. Normally, the copy-on-write principle means that a zval has to be duplicated before it is changed. This avoids accidentally changing the value for every place that shares the zval, and matches by-value passing semantics.

This does not apply to PHP references: if a value is a PHP reference, a change should be visible to every user of the value. The is_ref flag in PHP 5 zvals tells whether a value is a PHP reference, and therefore whether separation is required before a modification:

    $a = [];  // $a     -> zval_1(type=IS_ARRAY, refcount=1, is_ref=0) -> HashTable_1(value=[])
    $b =& $a; // $a, $b -> zval_1(type=IS_ARRAY, refcount=2, is_ref=1) -> HashTable_1(value=[])

    $b[] = 1; // $a = $b = zval_1(type=IS_ARRAY, refcount=2, is_ref=1) -> HashTable_1(value=[1])
              // Because is_ref=1, PHP does not separate the zval

A significant problem with this design is that a value cannot be shared between a variable that is a PHP reference and one that is not. Consider the following situation:

    $a = [];  // $a         -> zval_1(type=IS_ARRAY, refcount=1, is_ref=0) -> HashTable_1(value=[])
    $b = $a;  // $a, $b     -> zval_1(type=IS_ARRAY, refcount=2, is_ref=0) -> HashTable_1(value=[])
    $c = $b;  // $a, $b, $c -> zval_1(type=IS_ARRAY, refcount=3, is_ref=0) -> HashTable_1(value=[])

    $d =& $c; // $a, $b -> zval_1(type=IS_ARRAY, refcount=2, is_ref=0) -> HashTable_1(value=[])
              // $c, $d -> zval_2(type=IS_ARRAY, refcount=2, is_ref=1) -> HashTable_2(value=[])
              // $d is a reference to $c, but not to $a and $b, so the zval has to be duplicated.
              // We now have the same zval once with is_ref=0 and once with is_ref=1.

    $d[] = 1; // $a, $b -> zval_1(type=IS_ARRAY, refcount=2, is_ref=0) -> HashTable_1(value=[])
              // $c, $d -> zval_2(type=IS_ARRAY, refcount=2, is_ref=1) -> HashTable_2(value=[1])
              // Because there are two separate zvals, $d[] = 1 does not change $a and $b.
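The PHP 5 rules illustrated above can be condensed into a few predicates. The following C sketch is a simplified model under stated assumptions, not the engine's actual code: `php5_zval_header` and the three function names are hypothetical, and only the refcount and the is_ref flag are modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical reduced view of a PHP 5 zval: only the fields the rules need. */
typedef struct {
    unsigned refcount;
    bool is_ref;
} php5_zval_header;

/* Writing through a variable: a zval shared by value must be separated first,
   a reference zval is modified in place. */
bool must_separate_before_write(const php5_zval_header *z) {
    assert(z->refcount > 0);
    return z->refcount > 1 && !z->is_ref;
}

/* Binding by reference ($d =& $c): a zval shared by value must be duplicated
   first, so that the other by-value users keep their own zval. */
bool must_duplicate_to_make_reference(const php5_zval_header *z) {
    assert(z->refcount > 0);
    return z->refcount > 1 && !z->is_ref;
}

/* Using the zval by value in a new place (assignment or a by-value argument):
   a reference zval cannot be shared with a non-reference user, so it is copied. */
bool must_copy_for_by_value_use(const php5_zval_header *z) {
    assert(z->refcount > 0);
    return z->is_ref;
}
```

In this model the zval shared by $a, $b and $c (refcount=3, is_ref=0) has to be duplicated before $d =& $c, and a reference zval (refcount=2, is_ref=1) has to be copied whenever it is used by value.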
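A less contrived example of the same problem:

    $array = range(0, 1000000);
    $ref =& $array;
    var_dump(count($array)); // <-- separation happens here

count() takes its argument by value, but $array is a PHP reference, so a complete copy of the array is made before it is passed to count(). If $array were not a reference, the value would be shared.

Now to the PHP 7 implementation of references. Since zvals are no longer allocated separately, the PHP 5 approach cannot be used. Instead there is a new type, IS_REFERENCE, which uses the zend_reference structure as its value:

    struct _zend_reference {
        zend_refcounted gc;
        zval val;
    };

In essence, zend_reference is a refcounted zval. All variables in a reference set store a zval of type IS_REFERENCE that points to the same zend_reference instance. val behaves like any other zval. In particular, the complex value it points to can be shared.

Let's go through the examples above again, this time with PHP 7 semantics. For brevity I omit the zvals of the individual variables and only show the structure they point to:

    $a = [];  // $a     -> zend_array_1(refcount=1, value=[])
    $b =& $a; // $a, $b -> zend_reference_1(refcount=2) -> zend_array_1(refcount=1, value=[])

    $b[] = 1; // $a, $b -> zend_reference_1(refcount=2) -> zend_array_1(refcount=1, value=[1])

The by-reference assignment created a new zend_reference. Note that the reference has a refcount of 2 (because two variables belong to the PHP reference set), but the value itself has a refcount of 1, because it is referenced by a single zend_reference structure. Now consider a mix of references and non-references:

    $a = [];  // $a         -> zend_array_1(refcount=1, value=[])
    $b = $a;  // $a, $b     -> zend_array_1(refcount=2, value=[])
    $c = $b;  // $a, $b, $c -> zend_array_1(refcount=3, value=[])

    $d =& $c; // $a, $b                                 -> zend_array_1(refcount=3, value=[])
              // $c, $d -> zend_reference_1(refcount=2) ---^
              // All variables share the same zend_array, even though some are
              // PHP references and some are not.

    $d[] = 1; // $a, $b                                 -> zend_array_1(refcount=2, value=[])
              // $c, $d -> zend_reference_1(refcount=2) -> zend_array_2(refcount=1, value=[1])
              // Only now, when a modification happens, is the zend_array duplicated.

The important difference from PHP 5 is that all variables can share the same array, whether or not they are PHP references. The array is separated only when a modification is made. This means that in PHP 7 a large referenced array can be passed to count() without being duplicated. References are still slower than plain values, because they require allocating a zend_reference structure and going through it.

To summarize: the main change in PHP 7 is that a zval is no longer allocated separately on the heap and no longer stores a refcount itself. Instead, the complex values it points to (strings, arrays, objects) store the refcount. This usually means fewer allocations, less indirection and lower memory use.

Source: https://habr.com/ru/post/257999/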