
PHP strings

Recently, discussions of PHP on Habr have focused more on designing complex systems, which is good news. However, after reviewing a dozen of the best-known web frameworks (Zend Framework, Adept, CakePHP, CodeIgniter, LIMB, Symfony, MZZ, and others), I was surprised to find significant flaws in elementary optimization.

One weak point is string handling (the concatenation problem has already been discussed by Habr programmers), so I decided to recall my inquisitive youth and run a couple of tests on strings, which I would like to share.


To keep this topic technically oriented, the results are presented in a fairly rigorous form, which may make them somewhat harder to read.

So, let's go. The task is extremely simple: measure how fast strings are built from substrings in single and double quotes. This question will likely stay relevant for a long time because of the way PHP processes strings.

There are many articles on basic script optimization, in Russian and in other languages. They say little about strings, but they do note (as does the official documentation) that strings in double quotes are “parsed” for variables and escape sequences. It is therefore logical to assume that working with double-quoted strings will be somewhat slower than the same operations on single-quoted substrings.

Besides substituting variables into strings and concatenating variables with substrings, PHP offers at least one more way to build strings: the sprintf function. It is logical to assume that this method will be significantly slower than the "standard" ones because of the extra function call and the parsing of the format string.

One more remark before I present the test script: there are two ways of writing variables in double-quoted strings to consider, the simple and the “advanced” (curly-brace) syntax. The fact that the variables sit at the very beginning of the strings should not matter much; these are only examples:
  $string = "$_SERVER[HTTP_HOST] - not the administration of the Ulyanovsk region. We love the Russian language and do not love those who ..."

and
  $string = "{$_SERVER['HTTP_HOST']} - not the administration of the Ulyanovsk region. We love the Russian language and do not love those who ..."
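For reference, the four ways of building a string that the tests compare can be put side by side. This is only a minimal sketch: `$host` is a hypothetical stand-in for `$_SERVER['HTTP_HOST']` (which is not set under the command line), the text is shortened, and the variable names are not taken from the tester script.

```php
<?php
// Hypothetical stand-in for $_SERVER['HTTP_HOST'].
$host = 'example.com';

// 1. Double quotes, simple syntax: the variable is looked up inside the literal.
$simple = "$host - not the administration";

// 2. Double quotes, "advanced" (curly-brace) syntax.
$braced = "{$host} - not the administration";

// 3. Concatenation with a single-quoted substring: the literal is not parsed for variables.
$concat = $host . ' - not the administration';

// 4. sprintf: an extra function call plus parsing of the format string.
$formatted = sprintf('%s - not the administration', $host);

// All four produce the same string.
var_dump($simple === $braced && $braced === $concat && $concat === $formatted); // bool(true)
```

Since the results are identical, the only thing left to compare is the cost of producing them.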


Test number one.
Now that all the caveats are out of the way, it is time to show the results. The source code of the tester can be found here.

Profiler screenshots (copies) are located here, here and here.

The screenshots show that my hypothesis was not confirmed. Only the assumption about sprintf proved true. The fastest functions turned out to be the ones using double quotes.

After a little thought, the explanation suggested itself: the reference string into which the substitutions were made is too short, so parsing it costs next to nothing. Even so, the simple substitution of a variable into a string is visibly faster than the “advanced” style.
The weakness of the concatenation approach has the same root: the inserted data is larger than the substrings around it. Where the overhead comes from is explained in the Habr topic already mentioned.

However, even these ideas needed confirmation. That called for a second test that changes the suspected causes of this behavior, which I had not predicted. Apparently a lot was tweaked in the fifth version (I confess that I had run only one test on PHP 5 before: traversing array elements).

Test number two.
The second hypothesis: lengthening the reference string will increase the share of run time taken by the tester functions that build double-quoted strings, relative to the results of test number one. In theory, the same should hold for the sprintf function. The main reason is that the strings have to be parsed, and parsing takes longer as they grow. For concatenation of single-quoted substrings, I expect roughly the same absolute result as in the first test, so the share of total script time taken by the quotes_3() function should fall slightly (without any actual gain in performance).
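The shares of run time discussed here came from the Zend Studio profiler. As a rough substitute, they can be approximated with plain `microtime()` timing. The sketch below is not the original tester: it assumes PHP 5.3 or later (closures), and `$host`, `$variants` and `time_shares()` are hypothetical names.

```php
<?php
// Sketch only: microtime() timing instead of the profiler used in the article.
$host = 'example.com'; // hypothetical stand-in for $_SERVER['HTTP_HOST']

$variants = array(
    'double quotes, simple' => function () use ($host) {
        return "$host - not the administration of the Ulyanovsk region.";
    },
    'double quotes, braces' => function () use ($host) {
        return "{$host} - not the administration of the Ulyanovsk region.";
    },
    'concatenation' => function () use ($host) {
        return $host . ' - not the administration of the Ulyanovsk region.';
    },
    'sprintf' => function () use ($host) {
        return sprintf('%s - not the administration of the Ulyanovsk region.', $host);
    },
);

// Runs every variant $iterations times and returns each variant's
// share of the total measured time, in percent.
function time_shares(array $variants, $iterations)
{
    $seconds = array();
    foreach ($variants as $name => $fn) {
        $start = microtime(true);
        for ($i = 0; $i < $iterations; $i++) {
            $fn();
        }
        $seconds[$name] = microtime(true) - $start;
    }

    $total = array_sum($seconds);
    $shares = array();
    foreach ($seconds as $name => $elapsed) {
        $shares[$name] = 100 * $elapsed / $total;
    }
    return $shares;
}

foreach (time_shares($variants, 200000) as $name => $percent) {
    printf("%-24s %5.2f%%\n", $name, $percent);
}
```

To mimic the second test, lengthen the literals in all four closures and compare the printed shares with the first run. The closure call adds the same overhead to every variant, so the shares are only indicative.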

The script sources are here.
Copies of the screenshots can be found here, here and here.

The conclusions are entirely positive and confirm the hypothesis. Even a slight lengthening of the reference string adds noticeable load, which slows down the functions that work with double quotes and sprintf.

The assumption about strings in single quotes also turned out to be true: the quotes_3() function took 33.76% of the script execution time in the second test, instead of 36.75% in the first.

Practical value.
In simple terms, setting the data aside, we can conclude: the longer the string into which a substitution is made, the more likely it is that concatenation will be faster than looking for a variable in double quotes. Volunteers can try to find insertion parameters (the number of variables, the length of the reference string, the length of the strings in the variables) at which the execution times are equal.

That's all. It only remains to add that there are no trifles in programming (a remark for those who like to say “saving on matches” (c) Adelf). Knowing such subtleties and taking them into account, you can write code that is optimized at every level ;)
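One possible way to look for that break-even point is to sweep the length of the literal and time both approaches at each step. The sketch below makes several assumptions: PHP 5.3 or later, `eval()` to generate a real literal of the requested length, a single inserted variable, and hypothetical names throughout (`build_variants()`, `seconds_for()`). It does not reproduce the article's profiler setup, and results on other PHP versions may differ from the 5.2.5 figures.

```php
<?php
// Sketch only: sweep the literal length and time interpolation against concatenation.

// Builds both variants around a literal of $length filler characters.
// eval() is used so that the filler is a real literal in the compiled code.
function build_variants($length)
{
    $filler = str_repeat('x', $length); // plain letters: nothing needs escaping
    return array(
        'interpolation' => eval('return function ($v) { return "$v ' . $filler . '"; };'),
        'concatenation' => eval('return function ($v) { return $v . \' ' . $filler . '\'; };'),
    );
}

// Seconds spent calling $fn($value) $iterations times.
function seconds_for($fn, $value, $iterations)
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn($value);
    }
    return microtime(true) - $start;
}

foreach (array(10, 100, 1000, 10000) as $length) {
    $line = sprintf('literal length %5d:', $length);
    foreach (build_variants($length) as $name => $fn) {
        $line .= sprintf('  %s %.4fs', $name, seconds_for($fn, 'example.com', 100000));
    }
    echo $line, "\n";
}
```

The number of variables and the length of the inserted value can be varied in the same way by extending `build_variants()`.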

PS:
The tests were run in Zend Studio for Eclipse 6.0.0 (Debugger + Profiler included).
PHP Version 5.2.5
Debian Linux OS

PPS:
I would be glad if someone posted their own results for these tests. I think that would allow a more objective assessment of which method of substitution into strings to use. I would also appreciate healthy criticism of the style of presentation and the layout.

Thank you all for your attention :)

Source: https://habr.com/ru/post/40072/

