
Error exception will be thrown. For example:

    1 |> 3; // [1, 2, 3]
    2.5 |> 5; // [2.5, 3.5, 4.5]
    $a = $b = 1;
    $a |> $b; // [1]
    2 |> 1; // Error exception
    1 |> '1'; // Error exception
    new StdClass |> 1; // Error exception

First, the lexer has to return a T_RANGE token when it encounters |>. To do this, update the file Zend/zend_language_scanner.l. Add the following code to it (in the section where all tokens are declared, around line 1200):

    <ST_IN_SCRIPTING>"|>" {
        RETURN_TOKEN(T_RANGE);
    }

The rule is active only in ST_IN_SCRIPTING mode, so the |> character sequence is matched only while the scanner is in that mode. The braces contain C code that is executed when |> is detected in the source code. In this example, it simply returns the T_RANGE token.

Note. If we modify the lexer, we need re2c to regenerate it. For normal PHP builds, this dependency is not needed.
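To make the scanner rule more concrete, here is a minimal hand-written sketch in C of the same idea: the two-character sequence |> is recognized as one token before the single characters are considered. This is not the code re2c generates; the token enum and the next_token function are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token ids for illustration; PHP's real values are generated. */
enum token { TOK_EOF, TOK_PIPE, TOK_GREATER, TOK_RANGE, TOK_OTHER };

/*
 * Scans one token starting at src[*pos] and advances *pos past it.
 * The sequence "|>" is tried before the single characters '|' and '>',
 * so "1|>2" yields a single range token instead of two separate ones.
 */
enum token next_token(const char *src, size_t *pos)
{
    assert(src != NULL && pos != NULL);

    char c = src[*pos];

    if (c == '\0') {
        return TOK_EOF;
    }
    if (c == '|' && src[*pos + 1] == '>') {
        *pos += 2;
        return TOK_RANGE;
    }
    *pos += 1;
    if (c == '|') {
        return TOK_PIPE;
    }
    if (c == '>') {
        return TOK_GREATER;
    }
    return TOK_OTHER;
}
```

Calling next_token repeatedly on the string "1|>2" produces TOK_OTHER, TOK_RANGE, TOK_OTHER and then TOK_EOF.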
Next, the T_RANGE identifier must be declared in the Zend/zend_language_parser.y file. To do this, add the following to the end of the section where the other token identifiers are declared (around line 220):

    %token T_RANGE "|> (T_RANGE)"

PHP now recognizes the new token, but since no grammar rule uses it yet, any use of the operator produces a parse error:

    1 |> 2; // Parse error: syntax error, unexpected '|>' (T_RANGE) in...

The tokenizer extension, which provides the token_get_all and token_name functions, also has to learn about the new token. At the moment it is blissfully unaware of T_RANGE:

    echo token_name(token_get_all('<?php 1|>2;')[2][0]); // UNKNOWN

Once the tokenizer data has been regenerated, the same call reports the new token:

    echo token_name(token_get_all('<?php 1|>2;')[2][0]); // T_RANGE

The parser now has to be updated so that it can validate where the T_RANGE token is used in PHP scripts. The parser is also responsible for the precedence and associativity of the new operator.

Note. Precedence sets the rules for grouping expressions. For example, in the expression 3 + 4 * 2, the * operator has a higher precedence than +, so the expression is grouped as 3 + (4 * 2).
Associativity describes how an operator behaves when it is chained: whether it can be chained at all and, if so, how the chain is grouped within an expression. Suppose the ternary operator is left-associative; then it is grouped and executed from left to right. That is, the expression

    1 ? 0 : 1 ? 0 : 1; // 1

is executed as

    (1 ? 0 : 1) ? 0 : 1; // 1

If we corrected this and gave the operator right associativity, the expression would be executed as follows:

    $a = 1 ? 0 : (1 ? 0 : 1); // 0

There are also non-associative operators, which cannot be chained at all. The < operator is one of them, so this expression is an error:

    1 < $a < 2;
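The two groupings of the ternary expression can be checked by writing the parentheses explicitly. The following C sketch does that; the function names are hypothetical. Keep in mind that C's own conditional operator is right-associative, so the unparenthesized form in C behaves like the second grouping, not like the left-associative case described above.

```c
#include <assert.h>

/* Left-to-right grouping: (1 ? 0 : 1) ? 0 : 1 */
int ternary_grouped_left(void)
{
    return (1 ? 0 : 1) ? 0 : 1;
}

/* Right-to-left grouping: 1 ? 0 : (1 ? 0 : 1) */
int ternary_grouped_right(void)
{
    return 1 ? 0 : (1 ? 0 : 1);
}

/* No parentheses: C groups the conditional operator from right to left. */
int ternary_ungrouped_in_c(void)
{
    return 1 ? 0 : 1 ? 0 : 1;
}

/* The same expression gives different results depending on the grouping. */
void ternary_grouping_examples(void)
{
    assert(ternary_grouped_left() == 1);
    assert(ternary_grouped_right() == 0);
    assert(ternary_ungrouped_in_c() == ternary_grouped_right());
}
```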
We make the new operator non-associative and give it the same precedence as the spaceship operator (T_SPACESHIP). This is done by adding the T_RANGE token to the end of the following line (around line 70):

    %nonassoc T_IS_EQUAL T_IS_NOT_EQUAL T_IS_IDENTICAL T_IS_NOT_IDENTICAL T_SPACESHIP T_RANGE

Next, we need to update the expr_without_variable rule. Add the following code to it (for example, right before the T_SPACESHIP rule, around line 930):

    | expr T_RANGE expr
        { $$ = zend_ast_create(ZEND_AST_RANGE, $1, $3); }

The zend_ast_create function creates the AST node for the new operator. The node is named ZEND_AST_RANGE and holds two values: $1 refers to the left operand (the first expr in expr T_RANGE expr) and $3 refers to the right operand (the second expr).

Now we need to define the ZEND_AST_RANGE constant. To do this, update the Zend/zend_ast.h file by adding the constant to the list of nodes with two children (for example, under ZEND_AST_COALESCE):

    ZEND_AST_RANGE,

The parser now accepts the operator, but the compiler does not yet know what to do with the new node:

    1 |> 2;

To fix this, add a case for the new AST node (ZEND_AST_RANGE) to the large switch statement in the zend_compile_expr function (for example, immediately after ZEND_AST_COALESCE):

    case ZEND_AST_RANGE:
        zend_compile_range(result, ast);
        return;

Now we define zend_compile_range:

    void zend_compile_range(znode *result, zend_ast *ast) /* {{{ */
    {
        zend_ast *left_ast = ast->child[0];
        zend_ast *right_ast = ast->child[1];
        znode left_node, right_node;

        zend_compile_expr(&left_node, left_ast);
        zend_compile_expr(&right_node, right_ast);

        zend_emit_op_tmp(result, ZEND_RANGE, &left_node, &right_node);
    }
    /* }}} */

First, we read the two children of the ZEND_AST_RANGE node into the left_ast and right_ast pointers. Next, we declare two znode variables that will hold the result of compiling the AST node of each operand. This is the recursive part of processing the tree and compiling its nodes into opcodes.

Finally, the zend_emit_op_tmp function generates the ZEND_RANGE opcode with its two operands.

To see why zend_emit_op_tmp is the right function here, it helps to look briefly at opcodes and their operand types.

Note. The opcodes of a PHP script can be inspected using:
- PHPDBG:
    sapi/phpdbg/phpdbg -np* program.php
- Opcache
- The Vulcan Logic Disassembler (VLD) extension:
    sapi/cli/php -dvld.active=1 program.php
- If the script is short and simple, 3v4l
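Returning to zend_compile_range: the pattern it follows (compile both children recursively, then emit one instruction whose result is a fresh temporary) can be sketched outside the engine. The miniature AST, operand and program types below are hypothetical and far simpler than the real zend_ast, znode and zend_op structures; they only illustrate the order of the steps.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature AST: a node is an integer literal or a range. */
enum node_kind { NODE_INT, NODE_RANGE };

struct node {
    enum node_kind kind;
    long value;                   /* used by NODE_INT */
    const struct node *child[2];  /* used by NODE_RANGE */
};

/* Hypothetical operand: a constant or a numbered temporary. */
enum operand_kind { OPERAND_CONST, OPERAND_TMP };

struct operand {
    enum operand_kind kind;
    long value;  /* literal value, or the number of the temporary */
};

struct op {
    struct operand op1, op2, result;
};

#define MAX_OPS 16

struct program {
    struct op ops[MAX_OPS];
    size_t count;
    long next_tmp;
};

/*
 * Compiles a node and returns the operand that holds its value.
 * Children are compiled first (the recursive step); only then is the
 * range instruction emitted, with a fresh temporary as its result.
 */
struct operand compile_expr(struct program *prog, const struct node *ast)
{
    struct operand result;

    if (ast->kind == NODE_INT) {
        result.kind = OPERAND_CONST;
        result.value = ast->value;
        return result;
    }

    struct operand left = compile_expr(prog, ast->child[0]);
    struct operand right = compile_expr(prog, ast->child[1]);

    result.kind = OPERAND_TMP;
    result.value = prog->next_tmp++;

    assert(prog->count < MAX_OPS);
    struct op *op = &prog->ops[prog->count++];
    op->op1 = left;
    op->op2 = right;
    op->result = result;
    return result;
}
```

Compiling a range node whose children are the literals 1 and 3 emits exactly one instruction with two constant operands, and the returned operand is temporary number 0, which a later instruction could take as its own operand.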
Op nodes (znode_op structures) can be of different types:

- IS_CV (Compiled Variables). These are simple variables (like $a), cached in a simple array to bypass hash table lookups. They appeared in PHP 5.1 as the compiled variables optimization. In VLD they are denoted by !n (where n is an integer).
- IS_VAR. For all complex expressions that act as variables (like $a->b). May contain an IS_REFERENCE zval. In VLD they are denoted by $n (where n is an integer).
- IS_CONST. For literal values (for example, hard-coded strings).
- IS_TMP_VAR. Temporary variables hold the intermediate result of an expression (and are therefore short-lived). They can take part in reference counting (as of PHP 7), but cannot contain an IS_REFERENCE zval, because temporary variables cannot be used as references. In VLD they are denoted by ~n (where n is an integer).
- IS_UNUSED. Usually marks an op node as unused, but sometimes znode_op.num stores data for use by the virtual machine.

This brings us back to zend_emit_op_tmp, which generates a zend_op whose result is of type IS_TMP_VAR. We need this because our operator is an expression, and the value (an array) it produces is a temporary variable that can be used as an operand of another opcode (for example, ASSIGN in $var = 1 |> 3;).
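When reading VLD output it helps to keep the prefixes for the operand types in mind. The following C sketch formats an operand number with the prefix given in the list above. The enum is a hypothetical stand-in for the engine's constants, and the placeholder strings for constants and unused operands are an assumption made for this example, not VLD's actual output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the engine's operand type constants. */
enum operand_type { OP_CONST, OP_TMP_VAR, OP_VAR, OP_UNUSED, OP_CV };

/*
 * Formats operand number n with the VLD prefix for its type:
 * !n for compiled variables, $n for VAR and ~n for temporaries.
 * The strings for CONST and UNUSED are placeholders for this sketch.
 * Returns the value returned by snprintf.
 */
int format_operand(char *buf, size_t size, enum operand_type type, unsigned n)
{
    assert(buf != NULL && size > 0);

    switch (type) {
    case OP_CV:
        return snprintf(buf, size, "!%u", n);
    case OP_VAR:
        return snprintf(buf, size, "$%u", n);
    case OP_TMP_VAR:
        return snprintf(buf, size, "~%u", n);
    case OP_CONST:
        return snprintf(buf, size, "<const>");
    default:
        return snprintf(buf, size, "<unused>");
    }
}
```

With this helper, the temporary holding the result of 1 |> 3 would be shown as ~0 if it is the first temporary in the script, and the variable it is assigned to as !0 if it is the first compiled variable.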
The handler for the new opcode is defined as follows:

    ZEND_VM_HANDLER(182, ZEND_RANGE, CONST|TMP|VAR|CV, CONST|TMP|VAR|CV)
    {
        USE_OPLINE
        zend_free_op free_op1, free_op2;
        zval *op1, *op2, *result, tmp;

        SAVE_OPLINE();
        op1 = GET_OP1_ZVAL_PTR_DEREF(BP_VAR_R);
        op2 = GET_OP2_ZVAL_PTR_DEREF(BP_VAR_R);
        result = EX_VAR(opline->result.var);

        // if both operands are integers
        if (Z_TYPE_P(op1) == IS_LONG && Z_TYPE_P(op2) == IS_LONG) {
            // for when min and max are integers
        } else if ( // if both operands are either integers or doubles
            (Z_TYPE_P(op1) == IS_LONG || Z_TYPE_P(op1) == IS_DOUBLE)
            && (Z_TYPE_P(op2) == IS_LONG || Z_TYPE_P(op2) == IS_DOUBLE)
        ) {
            // for when min and max are either integers or floats
        } else {
            // for when min and max are neither integers nor floats
        }

        FREE_OP1();
        FREE_OP2();
        ZEND_VM_NEXT_OPCODE_CHECK_EXCEPTION();
    }

The opcode number (182 in this example) has to be a number that is not yet taken; the current highest number is given by ZEND_VM_LAST_OPCODE.

Note. The above code contains several pseudo-macros (such as USE_OPLINE and GET_OP1_ZVAL_PTR_DEREF). These are not real C macros: they are replaced by the Zend/zend_vm_gen.php script when the virtual machine is generated, rather than by the preprocessor when the source code is compiled. So if you want to see their definitions, look in the Zend/zend_vm_gen.php file.
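The three-way branching in the handler skeleton can be isolated into a small standalone C sketch. The type tags and the pick_branch function below are hypothetical and stand in for the Z_TYPE_P checks; they are not part of the engine.

```c
#include <assert.h>

/* Hypothetical stand-ins for the engine's zval type tags. */
enum value_type { VAL_LONG, VAL_DOUBLE, VAL_STRING, VAL_OBJECT };

/* The three branches of the handler skeleton. */
enum range_branch { BRANCH_INTS, BRANCH_FLOATS, BRANCH_UNSUPPORTED };

static int is_number(enum value_type t)
{
    return t == VAL_LONG || t == VAL_DOUBLE;
}

/*
 * Picks the branch in the same order as the handler: two integers first,
 * then any mix of integers and floats, then everything else.
 */
enum range_branch pick_branch(enum value_type op1, enum value_type op2)
{
    if (op1 == VAL_LONG && op2 == VAL_LONG) {
        return BRANCH_INTS;
    }
    if (is_number(op1) && is_number(op2)) {
        return BRANCH_FLOATS;
    }
    return BRANCH_UNSUPPORTED;
}

/* Mirrors the examples from the operator's semantics. */
void pick_branch_examples(void)
{
    assert(pick_branch(VAL_LONG, VAL_LONG) == BRANCH_INTS);          /* 1 |> 3 */
    assert(pick_branch(VAL_DOUBLE, VAL_LONG) == BRANCH_FLOATS);      /* 2.5 |> 5 */
    assert(pick_branch(VAL_LONG, VAL_STRING) == BRANCH_UNSUPPORTED); /* 1 |> '1' */
    assert(pick_branch(VAL_OBJECT, VAL_LONG) == BRANCH_UNSUPPORTED); /* new StdClass |> 1 */
}
```

The order of the checks matters: the all-integer case has to come first, because two integers would also satisfy the more general numeric check.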
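ZEND_VM_HANDLER contains the definition of each opcode. It can have five parameters:

- The opcode number (182).
- The opcode name (ZEND_RANGE).
- The valid types of the left operand (see $vm_op_decode in Zend/zend_vm_gen.php).
- The valid types of the right operand (see $vm_op_decode in Zend/zend_vm_gen.php).
- An optional flag for the extended value (see $vm_ext_decode in Zend/zend_vm_gen.php).

The operand types in our definition allow the following:

    // CONST enables for
    1 |> 5.0;
    // TMP enables for
    (2**2) |> (1 + 3);
    // VAR enables for
    $cmplx->var |> $var[1];
    // CV enables for
    $a |> $b;

Note. If one or both operands are not used, they are marked with ANY.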
Note. TMPVAR appeared in ZE 3. It handles the same opcode node types as TMP|VAR, but generates different code. TMPVAR generates a single handler that processes both TMP and VAR, which reduces the size of the virtual machine but requires more conditional logic. TMP|VAR, by contrast, generates separate handlers for TMP and VAR, which increases the size of the virtual machine but requires fewer conditionals.
Moving on to the body of the handler: we start with the USE_OPLINE pseudo-macro, which declares the opline variable (a zend_op structure). It is used to read the operands (with pseudo-macros like GET_OP1_ZVAL_PTR_DEREF) and to set the return value of the opcode.

Next come two zend_free_op variables. These are simple zval pointers declared for each operand we use. They are needed when checking whether an operand has to be freed. Then we declare four zval variables. op1 and op2 are pointers to zvals and hold the operand values. The result variable stores the result of the opcode. Finally, tmp holds the intermediate value of each iteration of the range loop; this value is copied into the hash table on every iteration.

The op1 and op2 variables are initialized with the GET_OP1_ZVAL_PTR_DEREF and GET_OP2_ZVAL_PTR_DEREF pseudo-macros. These macros also initialize the free_op1 and free_op2 variables. The BP_VAR_R constant passed to them is a type flag. Its name stands for BackPatching Variable Read, and it is used when reading compiled variables.
Finally, we dereference opline and assign its result variable to result for later use.

Now let's fill in the body of the first if, for the case where min and max are both integers:

    zend_long min = Z_LVAL_P(op1), max = Z_LVAL_P(op2);
    zend_ulong size, i;

    if (min > max) {
        zend_throw_error(NULL, "Min should be less than (or equal to) max");
        HANDLE_EXCEPTION();
    }

    // calculate size (one less than the total size for an inclusive range)
    size = max - min;

    // the size cannot be greater than or equal to HT_MAX_SIZE
    // HT_MAX_SIZE - 1 takes into account the inclusive range size
    if (size >= HT_MAX_SIZE - 1) {
        zend_throw_error(NULL, "Range size is too large");
        HANDLE_EXCEPTION();
    }

    // increment the size to take into account the inclusive range
    ++size;

    // set the zval type to be a long
    Z_TYPE_INFO(tmp) = IS_LONG;

    // initialise the array to a given size
    array_init_size(result, size);
    zend_hash_real_init(Z_ARRVAL_P(result), 1);
    ZEND_HASH_FILL_PACKED(Z_ARRVAL_P(result)) {
        for (i = 0; i < size; ++i) {
            Z_LVAL(tmp) = min + i;
            ZEND_HASH_FILL_ADD(&tmp);
        }
    } ZEND_HASH_FILL_END();
    ZEND_VM_NEXT_OPCODE_CHECK_EXCEPTION();

We start by defining min and max. They are declared as zend_long, which should be used when declaring long integers (just as zend_ulong is used for unsigned long integers). The size is then declared as zend_ulong; it holds the size of the array to be generated.

If min > max, an Error exception is thrown. If NULL is passed as the first argument of zend_throw_error, the exception class defaults to Error. This exception could be specialized through inheritance by creating a new class entry in Zend/zend_exceptions.c, but we will talk about that another time. If the exception is thrown, we call the HANDLE_EXCEPTION pseudo-macro, which moves on to the next opcode to be executed.

Next we calculate the size by subtracting min from max. It is unsigned so that it can cover the extreme case min = ZEND_LONG_MIN (PHP_INT_MIN) and max = ZEND_LONG_MAX (PHP_INT_MAX).

The calculated size is then checked against HT_MAX_SIZE to make sure that the array fits into the hash table. The total size of the array must be less than HT_MAX_SIZE; otherwise we again throw an Error exception and exit the virtual machine.

HT_MAX_SIZE equals INT_MAX + 1. If size is smaller than that, we can increment it without fear of overflow. That is the next step, so that size matches the actual size of the range.

We then set the type of the tmp zval to IS_LONG. Next, the array_init_size macro initializes result: it sets the type of result to IS_ARRAY_EX, allocates memory for the zend_array structure (the hash table), and sets up the hash table. The zend_hash_real_init function then allocates memory for the Bucket structures that hold each element of the array. The second argument, 1, indicates that we want a packed hash table.

Note. A packed hash table is essentially a true array, that is, an array accessed with integer keys (as opposed to the typical associative arrays in PHP). This optimization was introduced in PHP 7, because many arrays in PHP are indexed with integers (keys in ascending order). Packed hash tables allow the buckets of the hash table to be accessed directly. If you are interested in the details of the new hash table implementation, refer to Nikita's article.
Note. The _zend_array structure has two aliases: zend_array and HashTable.
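The size calculation and bounds check described above can be tried out in isolation. The following standalone C sketch builds an inclusive integer range into a plain heap array. RANGE_MAX_SIZE is a hypothetical small cap standing in for HT_MAX_SIZE, int_range is a hypothetical name, and the subtraction is done on unsigned values, which is a choice made for this sketch so that the extreme case cannot overflow a signed type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical cap standing in for HT_MAX_SIZE. */
#define RANGE_MAX_SIZE 1024u

/*
 * Builds the inclusive range [min, max] into a freshly allocated array.
 * Returns NULL when min > max, when the range is too large, or when
 * allocation fails; otherwise stores the element count in *count.
 * The caller frees the returned array.
 */
int64_t *int_range(int64_t min, int64_t max, size_t *count)
{
    assert(count != NULL);

    if (min > max) {
        return NULL;
    }

    /* Unsigned subtraction: INT64_MIN..INT64_MAX cannot overflow here.
       The result is one less than the element count. */
    uint64_t size = (uint64_t)max - (uint64_t)min;

    /* Reject before incrementing, so ++size below can never wrap. */
    if (size >= RANGE_MAX_SIZE - 1) {
        return NULL;
    }
    ++size;

    int64_t *out = malloc((size_t)size * sizeof *out);
    if (out == NULL) {
        return NULL;
    }
    for (uint64_t i = 0; i < size; ++i) {
        /* min + i never exceeds max, so this addition cannot overflow. */
        out[i] = min + (int64_t)i;
    }
    *count = (size_t)size;
    return out;
}
```

Calling int_range(1, 3, &n) yields the three elements 1, 2, 3, while int_range(2, 1, &n) and int_range(INT64_MIN, INT64_MAX, &n) both return NULL, matching the two error cases of the handler.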
To fill the array we use the ZEND_HASH_FILL_PACKED macro, which essentially keeps track of the current bucket for the next insertion. During array generation, the intermediate result (the array element) is stored in the tmp zval. The ZEND_HASH_FILL_ADD macro makes a copy of tmp, inserts it into the current bucket of the hash table, and moves on to the next bucket for the next iteration.

Finally, the ZEND_VM_NEXT_OPCODE_CHECK_EXCEPTION macro (introduced in ZE 3 as a replacement for the separate CHECK_EXCEPTION() and ZEND_VM_NEXT_OPCODE() calls used in ZE 2) checks whether an exception has occurred. None has, so the virtual machine moves on to the next opcode.

Now let's look at the else if block:

    long double min, max, size, i;

    if (Z_TYPE_P(op1) == IS_LONG) {
        min = (long double) Z_LVAL_P(op1);
        max = (long double) Z_DVAL_P(op2);
    } else if (Z_TYPE_P(op2) == IS_LONG) {
        min = (long double) Z_DVAL_P(op1);
        max = (long double) Z_LVAL_P(op2);
    } else {
        min = (long double) Z_DVAL_P(op1);
        max = (long double) Z_DVAL_P(op2);
    }

    if (min > max) {
        zend_throw_error(NULL, "Min should be less than (or equal to) max");
        HANDLE_EXCEPTION();
    }

    size = max - min;

    if (size >= HT_MAX_SIZE - 1) {
        zend_throw_error(NULL, "Range size is too large");
        HANDLE_EXCEPTION();
    }

    // we cast the size to an integer to get rid of the decimal places,
    // since we only care about whole number sizes
    size = (int) size + 1;

    Z_TYPE_INFO(tmp) = IS_DOUBLE;

    array_init_size(result, size);
    zend_hash_real_init(Z_ARRVAL_P(result), 1);
    ZEND_HASH_FILL_PACKED(Z_ARRVAL_P(result)) {
        for (i = 0; i < size; ++i) {
            Z_DVAL(tmp) = min + i;
            ZEND_HASH_FILL_ADD(&tmp);
        }
    } ZEND_HASH_FILL_END();
    ZEND_VM_NEXT_OPCODE_CHECK_EXCEPTION();

Note. We use long double to handle cases where integer and floating-point operands are mixed. The precision of double is only 53 bits, so integers greater than 2^53 may not be represented exactly. On platforms where long double has at least 64 bits of precision (such as x86 extended precision), it can represent 64-bit integers exactly.
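Two details of the floating-point branch are easy to check in a standalone C sketch: how truncating the difference gives the element count, and why a plain double cannot hold every 64-bit integer. The function names are hypothetical, the sketch uses double rather than long double for the range itself, and it assumes an IEEE 754 double with a 53-bit significand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Number of elements in the inclusive range that starts at min and steps
 * by 1.0 while staying <= max. The cast drops the fractional part, as in
 * the handler. Assumes min <= max and a small difference.
 */
size_t float_range_count(double min, double max)
{
    assert(min <= max);
    return (size_t)(max - min) + 1;
}

/* Writes count elements min, min + 1, min + 2, ... into out. */
void float_range_fill(double min, size_t count, double *out)
{
    for (size_t i = 0; i < count; ++i) {
        out[i] = min + (double)i;
    }
}

/*
 * Returns 1 if converting v to double and back preserves its value.
 * With a 53-bit significand this fails for some integers above 2^53.
 */
int survives_double(int64_t v)
{
    volatile double d = (double)v; /* volatile forces a real double store */
    return (int64_t)d == v;
}
```

For 2.5 |> 5 the difference is 2.5, which truncates to 2, so the range has 3 elements: 2.5, 3.5 and 4.5. The integer 2^53 survives a round trip through double, but 2^53 + 1 does not.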
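This code is very similar to the integer case. The differences are that the operand values are fetched with the Z_DVAL_P macro, the type of tmp is set to IS_DOUBLE, and the value is written with the Z_DVAL macro.

Finally, we handle the case where min, max, or both are neither integers nor floats. As stated in the second point of the semantics of our range operator, only integers and floats are supported as operands. In all other cases an Error exception must be thrown. Insert the following code in the else block:

    zend_throw_error(NULL, "Unsupported operand types - only ints and floats are supported");
    HANDLE_EXCEPTION();

The operator now works:

    var_dump(1 |> 1.5);
    var_dump(PHP_INT_MIN |> PHP_INT_MIN + 1);

Output:

    array(1) {
      [0]=>
      float(1)
    }
    array(2) {
      [0]=>
      int(-9223372036854775808)
      [1]=>
      int(-9223372036854775807)
    }

We are not quite done yet. The AST pretty printer does not support the new operator, which shows up when the operator is used inside assert():

    assert(1 |> 2); // segfaults

In PHP 7, assert() uses the pretty printer to turn the asserted expression back into code for its failure message. To fix the crash, the pretty printer has to learn about the ZEND_AST_RANGE node. First, update the priority table comment (around line 520), giving the new operator a priority of 170 (matching zend_language_parser.y):

    * 170 non-associative == != === !== |>

Then add a case for ZEND_AST_RANGE to the zend_ast_export_ex function (for example, right above case ZEND_AST_GREATER):

    case ZEND_AST_RANGE: BINARY_OP(" |> ", 170, 171, 171);
    case ZEND_AST_GREATER: BINARY_OP(" > ", 180, 181, 181);
    case ZEND_AST_GREATER_EQUAL: BINARY_OP(" >= ", 180, 181, 181);

Now assert() works again:

    assert(false && 1 |> 2); // Warning: assert(): assert(false && 1 |> 2) failed...

Source: https://habr.com/ru/post/276331/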