
Incrementing in PHP



Take a variable and increase it by 1. Sounds easy, right? From the point of view of a PHP developer, probably yes. But is it really? There are several ways to increment a value, and although they may look equivalent, PHP handles them differently under the hood, which can lead to, shall we say, interesting results.

Consider three examples of adding one to a variable:

    <?php

    $a = 1;
    $a++;
    var_dump($a);

    $b = 1;
    $b += 1;
    var_dump($b);

    $c = 1;
    $c = $c + 1;
    var_dump($c);

The code is different, but in each case the value of the variable increases. And what will be the result?
    int(2)
    int(2)
    int(2)

Intuitively, all three methods look the same. That is, for incrementing, you can use both $a++ and $a += 1 . But let's look at another example:

    $a = "foo";
    $a++;
    var_dump($a);

    $a = "foo";
    $a += 1;
    var_dump($a);

    $a = "foo";
    $a = $a + 1;
    var_dump($a);

Result:

    string(3) "fop"
    int(1)
    int(1)

Surely many of you did not expect such a result! Maybe some of you already knew that incrementing a string variable changes its characters, but two int(1)? Where did they come from? From the point of view of a PHP developer this looks very inconsistent, and it turns out that our three ways of incrementing are not equivalent. Let's see what happens deep inside PHP when the code is executed.

Bytecode


When a PHP script starts, your code is first compiled into an intermediate format: bytecode. This fact refutes the view that PHP is a truly interpreted language; it interprets bytecode, not PHP source code.

The code above is compiled into this bytecode:

    compiled vars:  !0 = $a, !1 = $b, !2 = $c
    line     #* E I O op              fetch  ext  return  operands
    ---------------------------------------------------------------------------
       3     0  E >   ASSIGN                              !0, 1
       4     1        POST_INC                    ~1      !0
             2        FREE                                ~1
       5     3        SEND_VAR                            !0
             4        DO_FCALL                 1          'var_dump'
       7     5        ASSIGN                              !1, 1
       8     6        ASSIGN_ADD               0          !1, 1
       9     7        SEND_VAR                            !1
             8        DO_FCALL                 1          'var_dump'
      11     9        ASSIGN                              !2, 1
      12    10        ADD                         ~7      !2, 1
            11        ASSIGN                              !2, ~7
      13    12        SEND_VAR                            !2
            13        DO_FCALL                 1          'var_dump'
            14      > RETURN                              1

You can easily generate such opcodes yourself using the VLD debugger or the online service 3v4l.org. Don't worry about what it all means. If we strip out the uninteresting parts, only these lines remain:

    compiled vars:  !0 = $a, !1 = $b, !2 = $c
    line     #* E I O op              fetch  ext  return  operands
    ---------------------------------------------------------------------------
       4     1        POST_INC                    ~1      !0
             2        FREE                                ~1
       8     6        ASSIGN_ADD               0          !1, 1
      12    10        ADD                         ~7      !2, 1
            11        ASSIGN                              !2, ~7

Thus, $a++ turns into two opcodes (POST_INC and FREE), $a += 1 into one (ASSIGN_ADD), and $a = $a + 1 into two (ADD and ASSIGN). Note that all three cases produce different opcodes, which already implies that PHP executes them differently.

Unary increment operator


Consider the first method of incrementing: the unary operator ($a++). This PHP code is compiled into the POST_INC opcode. By the way, ++$a produces PRE_INC instead, and you need to know the difference between the two. The second opcode, FREE, discards the result of POST_INC, because we do not use its return value: POST_INC changes the operand in place. In this case, you can ignore this opcode.
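The difference between the two forms is easiest to see when the value of the expression is actually used. A minimal sketch (the variable names are chosen only for illustration):

```php
<?php
// Post-increment: the expression yields the OLD value, then the variable changes.
$a = 5;
$old = $a++;
var_dump($old); // int(5)
var_dump($a);   // int(6)

// Pre-increment: the variable changes first, the expression yields the NEW value.
$b = 5;
$new = ++$b;
var_dump($new); // int(6)
var_dump($b);   // int(6)
```

When the result is thrown away, as in a bare $a++; statement, both forms leave the variable in the same state.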

The reason these opcodes execute differently lies in the file zend_vm_def.h, which you can find in the C source code of PHP. This is a large header file filled with macros, so it is not easy to read even if you know C. When the POST_INC opcode is executed, the code at line 971 runs.

In short, this is what happens:


As you can see, how the increment is performed depends on the type of the variable. If it is a number, everything will most likely come down to a call to fast_increment_function(); if it is a magic property, increment_function() is called instead. Both functions are described below.

fast_increment_function()


The fast_increment_function() function lives in zend_operators, and its task is to increment a variable as quickly as possible.

If the variable is of type long, a very fast piece of assembly code is used to increment it. If the value has reached the maximum integer value (LONG_MAX), the variable is automatically converted to double. This is the fastest way to increment a number, since this part of the code is written in assembler; the assumption is that the compiler cannot optimize the C code better than hand-written assembly. But the method only works if the variable is of type long. Otherwise, the call is redirected to increment_function(). Since incrementing (and decrementing) is most often done in very tight inner loops (for example, in a for loop), it has to be as fast as possible to keep PHP performance high.
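The conversion to double at the integer limit can be observed from userland. A small sketch, using the PHP_INT_MAX constant as the userland counterpart of the maximum long value:

```php
<?php
$i = PHP_INT_MAX;
var_dump(is_int($i));   // bool(true)

// One more increment no longer fits into an integer,
// so the variable silently becomes a float (double).
$i++;
var_dump(is_float($i)); // bool(true)
```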

increment_function()


If fast_increment_function() is the fast way to increment a number, then increment_function() is the slow way. What it does also depends on the type of the variable.


So, the function checks for different types. Note that there is no check for, say, a boolean value, which means such a type cannot be incremented. $a = false; $a++ not only does nothing, it does not even raise an error. The variable simply does not change and remains false.

String increment


And now the fun part. Working with strings is always full of nuances, and here is what happens in this case.

First, PHP checks whether the string contains a number. For example, the string "123" contains the number 123. Such a "numeric string" is converted to a normal number of type long (int(123)). Several tricks are used during the conversion:


If the result is a long or double , the number simply increases. For example, if we take the string 123 and increment, we get an int(124) . Note that the variable type changes from string to integer!
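The type change is easy to verify. A short sketch with one integer-like and one float-like numeric string:

```php
<?php
$s = "123";
$s++;
var_dump($s); // int(124)

$f = "1.5";
$f++;
var_dump($f); // float(2.5)
```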

If the string cannot be converted to long or double, then the function increment_string() is called.

increment_string()


PHP uses a Perl-like increment system. If the string is empty, the result is simply string("1"). Otherwise, a carry system is used to increment the string.

We start at the end of the string. If the character is between a and z, it is incremented (a becomes b, and so on). If the character is z, it wraps around to a and a carry is passed to the position before it.

That is: a becomes b, ab becomes ac (no carry needed), az becomes ba (z becomes a, and a becomes b because of the carry).

The same applies to uppercase characters from A to Z, as well as to digits from 0 to 9. When incremented, 9 turns into 0 and a carry is passed to the previous position.

If we have reached the beginning of the string and still need to carry, one more character is added IN FRONT of the entire string. Its type is the same as that of the character that produced the carry:

    "z"  => "aa"
    "9"  => "10"
    "Zz" => "AAa"
    "9z" => "10a"

So incrementing a string never changes the type of a character. If it was lowercase, it stays lowercase.
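The carry rules described above can be sketched in userland PHP. This is only an illustration of the algorithm as the article describes it, not PHP's actual C implementation: the function name sketch_increment_string() is hypothetical, and the handling of characters outside a-z, A-Z and 0-9 (they simply stop the carry) is an assumption.

```php
<?php
/**
 * Hypothetical userland sketch of the Perl-like string increment.
 * Assumption: a non-alphanumeric character stops the carry.
 *
 * @param string $s
 * @return string
 */
function sketch_increment_string($s)
{
    if ($s === '') {
        return '1'; // an empty string becomes "1"
    }

    $carry = false;
    $prefix = ''; // character to prepend if the carry runs off the front

    // Walk from the last character towards the first.
    for ($i = strlen($s) - 1; $i >= 0; $i--) {
        $o = ord($s[$i]);

        if ($o >= ord('a') && $o <= ord('z')) {
            $carry = ($s[$i] === 'z');
            $s[$i] = $carry ? 'a' : chr($o + 1);
            $prefix = 'a';
        } elseif ($o >= ord('A') && $o <= ord('Z')) {
            $carry = ($s[$i] === 'Z');
            $s[$i] = $carry ? 'A' : chr($o + 1);
            $prefix = 'A';
        } elseif ($o >= ord('0') && $o <= ord('9')) {
            $carry = ($s[$i] === '9');
            $s[$i] = $carry ? '0' : chr($o + 1);
            $prefix = '1';
        } else {
            $carry = false; // assumption: other characters end the carry
        }

        if (!$carry) {
            break; // nothing left to carry, we are done
        }
    }

    // The carry ran past the first character: grow the string by one.
    return $carry ? $prefix . $s : $s;
}

var_dump(sketch_increment_string('az')); // string(2) "ba"
var_dump(sketch_increment_string('Zz')); // string(3) "AAa"
var_dump(sketch_increment_string('9z')); // string(3) "10a"
```

Note that the sketch always treats its input as a plain string; the real ++ operator first checks whether the string is numeric and only falls back to this kind of logic when it is not.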

But be careful if you increment the “number in the string” several times.

Incrementing string("2D9") gives string("2E0") (string("2D9") is not a number, so the usual string increment is performed). But when string("2E0") is incremented, you get double(3), because 2E0 is the scientific notation for 2, so it is converted to double, which is then incremented to 3. So be careful with increment loops!
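The trap can be reproduced in a few lines. A minimal sketch, using is_numeric() to show why the two strings take different paths (the variable name $afterFirst is only for illustration):

```php
<?php
var_dump(is_numeric("2D9")); // bool(false): plain string increment applies
var_dump(is_numeric("2E0")); // bool(true): treated as the number 2.0

$s = "2D9";
$s++;
$afterFirst = $s;
var_dump($afterFirst); // string(3) "2E0"

$s++;
var_dump($s); // float(3)
```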

This string increment system also explains why we can increment "Z" to "AA", but cannot decrement "AA" back to "Z". Only the last "A" would be decremented, but what should happen to the first one? Should it also be decremented to "Z" by means of a (negative) carry? What about "0A"? Should it become "Z"? And if so, the next increment would give "AA". In other words, we cannot simply remove characters when decrementing the way we add them when incrementing.

Addition assignment operator


We now turn to the second example from the beginning of the article: the addition assignment operator ($a += 1). It looks similar to the unary increment operator, but behaves differently in terms of both the generated opcodes and the actual execution. The expression is processed entirely by zend_binary_assign_op_helper, which, after a series of checks, calls add_function with two operands: $a and our int(1) value.

add_function()


The add_function function works differently depending on the types of its operands. For the most part, it consists of checking those types:


If the operands are of some other types (for example, string + long), both of them are converted to numbers using zendi_convert_scalar_to_number. After the conversion, add_function is applied again, and this time a match will most likely be found for one of the described pairs.
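The effect of this conversion step is visible from userland. A short sketch with operands that are not a plain long or double pair; it deliberately sticks to booleans, null and numeric strings, because newer PHP versions handle non-numeric strings in arithmetic differently (the variable names are only for illustration):

```php
<?php
$fromBool   = false + 1; // false is converted to int(0) first
$fromNull   = null + 1;  // null is converted to int(0) first
$fromIntStr = "5" + 1;   // numeric string becomes int(5)
$fromFltStr = "1.5" + 1; // numeric string becomes float(1.5)

var_dump($fromBool);   // int(1)
var_dump($fromNull);   // int(1)
var_dump($fromIntStr); // int(6)
var_dump($fromFltStr); // float(2.5)
```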

zendi_convert_scalar_to_number()


The conversion of a scalar to a number depends on the type of scalar. It usually comes down to one of the following algorithms:


Addition operator


This is the simplest of the three options. When it is executed, the fast_add_function() function is called. Like fast_increment_function(), it uses assembly code directly to add the numbers if both operands are long or double. If this is not the case, the call is redirected to add_function(), the same function used by the assignment expression.

Since both the addition operator and the addition assignment operator rely on the same underlying functionality, $a = $a + 1 and $a += 1 work the same way. The only difference is that the addition operator CAN execute quickly if both operands are long or double. So if you want to micro-optimize, $a = $a + 1 can benefit from fast_add_function(), although, as the opcodes above show, it is $a += 1 that avoids the additional opcode for storing the result back into $a.

Conclusion


Incrementing a value is different from simple addition: add_function converts types to compatible pairs, and increment_function does not. Now we can explain the results obtained:

    $a = false;
    $a++;
    var_dump($a);   // bool(false)

    $a = false;
    $a += 1;
    var_dump($a);   // int(1)

    $a = false;
    $a = $a + 1;
    var_dump($a);   // int(1)

Since increment_function does not convert a boolean value (it is neither a number nor a string that can be converted to a number), it fails silently and the value is not incremented. Therefore, it remains bool(false). In the case of add_function, an attempt is made to find a matching pair for boolean and long, which does not exist. As a result, both values are converted to long: bool(false) becomes int(0), and int(1) remains int(1). Now we have a long & long pair, so add_function simply sums them and the result is int(1). (Question: what will boolean true + int(1) turn into?)

We can also explain another oddity:

    $a = "foo";
    $a++;
    var_dump($a);   // string("fop")

    $a = "foo";
    $a += 1;
    var_dump($a);   // int(1)

    $a = "foo";
    $a = $a + 1;
    var_dump($a);   // int(1)

In the increment case, the string cannot be converted to a number, so the usual string increment is performed. The add expression, on the other hand, converts the string to long after checking it for numbers. Since there are none, the string is converted to int(0) and int(1) is added to it.

Source: https://habr.com/ru/post/305906/

