
PHP GR8: Will JIT improve PHP 8 performance?



PHP is one of the main development languages at Badoo. In our data centers, thousands of processor cores are busy executing millions of lines of PHP code. We follow new developments closely and actively look for ways to improve performance, because at our scale even a small optimization saves significant resources. One of the biggest pieces of PHP performance news is the arrival of a JIT in the eighth version of the language. Naturally, this caught our attention, so we translated an article about what a JIT is, how it will be implemented in PHP, why the decision was made, and what to expect from it.

Unless you have just crawled out of a cave or arrived from the past (in which case, welcome), you already know that PHP 8 will have a JIT: a few days ago the vote passed quietly and peacefully, with an overwhelming majority of participants in favor of the implementation, so the matter is settled.
In a fit of joy, you can even bust out a few crazy moves like the ones in the photo (this, by the way, is called the “Detroit JIT”):



Now sit down and read this myth-busting article. I want to clear up the confusion about what a JIT is and what it is good for, and explain how it works (without too much detail, so you don't get bored).

Since I don't know who will be reading this, I will move from simple questions to complex ones. If you already know the answer to the question in a section's heading, you can safely skip that section.

What is JIT?


PHP is implemented on top of a virtual machine (we call it the Zend VM). PHP compiles source code into instructions that the virtual machine understands (this is the compilation stage). The instructions produced at the compilation stage are called opcodes. At run time, the Zend VM executes the opcodes, thereby doing the required work.
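To make the two stages concrete, here is a toy sketch in Python, not PHP internals. The opcode names (`PUSH`, `ADD`, `MUL`), the tuple layout, and the postfix input format are all invented for illustration; real Zend VM opcodes look different.

```python
# Toy model of "compile to opcodes, then execute them on a VM".
# Opcode names and layout are hypothetical, not Zend VM opcodes.

def compile_expr(tokens):
    """Compilation stage: turn postfix tokens into a list of opcodes."""
    ops = []
    for tok in tokens:
        if tok == "+":
            ops.append(("ADD",))
        elif tok == "*":
            ops.append(("MUL",))
        else:
            ops.append(("PUSH", int(tok)))
    return ops


def run(ops):
    """Execution stage: a tiny stack machine that interprets the opcodes."""
    stack = []
    for op in ops:
        name = op[0]
        if name == "PUSH":
            stack.append(op[1])
        elif name == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif name == "MUL":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {name}")
    return stack.pop()


opcodes = compile_expr("2 3 + 4 *".split())  # postfix form of (2 + 3) * 4
result = run(opcodes)
print(opcodes)
print(result)
```

The point of the split is that `compile_expr` runs once per piece of source, while `run` is what happens on every execution.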

This scheme works very well. In addition, tools such as APC (in the past) and OPcache (today) cache the results of the compilation stage, so that stage runs only when necessary.
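The caching idea can be sketched as a small memoizing wrapper around the compile step. This is a toy model in Python under stated assumptions: the function names, the `(path, mtime)` validation, and the stand-in "opcodes" are hypothetical, and the real OPcache (shared memory, its own invalidation rules) is considerably more involved.

```python
# Toy opcode cache: compile a file once, reuse the result until it changes.
# All names here are hypothetical; this is not how OPcache is implemented.

compile_count = 0
_cache = {}  # path -> (mtime, opcodes)


def compile_source(source):
    """Stand-in for the real compilation stage; counts how often it runs."""
    global compile_count
    compile_count += 1
    return tuple(source.split())


def cached_compile(path, mtime, source):
    """Return cached opcodes unless the file's modification time changed."""
    entry = _cache.get(path)
    if entry is not None and entry[0] == mtime:
        return entry[1]
    ops = compile_source(source)
    _cache[path] = (mtime, ops)
    return ops


first = cached_compile("index.php", 100, "echo 1 + 2")
second = cached_compile("index.php", 100, "echo 1 + 2")  # cache hit
third = cached_compile("index.php", 200, "echo 1 + 3")   # file changed
print(compile_count)
```

Three requests lead to only two compilations, because the second request reuses the cached result.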

In short, JIT stands for just-in-time compilation: a strategy in which code is first translated into an intermediate representation, which is then turned into architecture-dependent machine code at run time.

For PHP, this means that the JIT will treat the virtual machine instructions produced at the compilation stage as the intermediate representation and emit machine code that is executed directly by the processor rather than by the Zend VM.
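As a rough analogy only, the sketch below treats a list of toy opcodes as the intermediate representation and translates it once into host code. Python source stands in for machine code here, and the opcode names (`ARG`, `PUSH`, `ADD`, `MUL`) and function names are invented for illustration. A real JIT emits native instructions for the CPU and is far more involved.

```python
# Toy contrast between interpreting opcodes and translating them once.
# Python source plays the role of "machine code"; all names are hypothetical.

def run_vm(ops, x):
    """VM path: dispatch on every opcode each time the code runs."""
    stack = []
    for op in ops:
        name = op[0]
        if name == "ARG":
            stack.append(x)
        elif name == "PUSH":
            stack.append(op[1])
        elif name == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif name == "MUL":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {name}")
    return stack.pop()


def jit_compile(ops):
    """JIT path: walk the opcodes once and emit one host-level function."""
    exprs = []
    for op in ops:
        name = op[0]
        if name == "ARG":
            exprs.append("x")
        elif name == "PUSH":
            exprs.append(repr(op[1]))
        elif name == "ADD":
            right, left = exprs.pop(), exprs.pop()
            exprs.append(f"({left} + {right})")
        elif name == "MUL":
            right, left = exprs.pop(), exprs.pop()
            exprs.append(f"({left} * {right})")
        else:
            raise ValueError(f"unknown opcode: {name}")
    return eval(f"lambda x: {exprs.pop()}")


ops = [("ARG",), ("PUSH", 3), ("ADD",), ("PUSH", 4), ("MUL",)]  # (x + 3) * 4
compiled = jit_compile(ops)
print(run_vm(ops, 2), compiled(2))
```

Both paths compute the same value; the difference is that the translated function no longer pays the per-opcode dispatch cost on each call.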

Why does PHP need a JIT?


Shortly before PHP 7.0 appeared, the PHP team's main focus was the performance of the language. Most of the major changes in PHP 7.0 came from the PHPNG patch, which greatly improved how PHP uses memory and the processor. Since then, every one of us has had to keep an eye on performance.

After the release of PHP 7.0, performance work continued: the hash table (the main data structure in PHP) was optimized, specialization of certain opcodes in the Zend VM and of certain sequences in the compiler was introduced, the Optimizer (an OPcache component) was steadily improved, and many other changes were made.

The harsh truth is that, as a result of all these optimizations, we are rapidly approaching the limit of opportunities for improvement.

Note that by “the limit of opportunities for improvement” I mean that the trade-offs required for further gains no longer look attractive. Performance optimization is always about trade-offs, and often we have to sacrifice simplicity for speed. Everyone would like to think that the simplest code is also the fastest, but in the modern world of C programming that is not the case. The fastest code is usually the code written to take advantage of the internals of the architecture or of constructs built into the platform or compiler. Simplicity alone does not guarantee the best performance.

Therefore, at this stage, the best way to squeeze even more performance out of PHP is to implement a JIT.

Will the JIT speed up my site?


In all likelihood, not by much.

Perhaps this is not the answer you expected. The fact is that PHP applications are, in general, I/O bound, while a JIT works best on code that is CPU bound.

What do "I/O bound" and "CPU bound" mean?


To describe the general performance characteristics of a piece of code or an application, we use the terms "I/O bound" and "CPU bound".

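The simplest definitions are:

Code is I/O bound if it would run faster when the I/O it performs became more efficient.
Code is CPU bound if it would run faster when the processor executed its instructions more efficiently.

A piece of code, or a whole application, can be I/O bound, CPU bound, or both.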

In general, PHP applications tend to be I/O bound: their main bottleneck is usually I/O, such as connecting to, reading from and writing to databases, caches, files, sockets and so on.
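A back-of-the-envelope calculation shows why this matters for the JIT. The sketch below applies Amdahl's law in Python; the fractions (10% or 90% of request time spent on the CPU) and the 2x CPU speedup are made-up numbers chosen only for illustration, not measurements.

```python
# Amdahl's law: overall speedup when only the CPU-bound part gets faster.
# The input numbers below are hypothetical, not benchmark results.

def request_speedup(cpu_fraction, cpu_speedup):
    """Return the whole-request speedup for a given CPU share and CPU gain."""
    new_time = (1.0 - cpu_fraction) + cpu_fraction / cpu_speedup
    return 1.0 / new_time


# I/O-bound request: 10% CPU, 90% waiting on the database, cache, etc.
io_bound = request_speedup(cpu_fraction=0.10, cpu_speedup=2.0)

# CPU-bound script: 90% of the time is spent computing.
cpu_bound = request_speedup(cpu_fraction=0.90, cpu_speedup=2.0)

print(f"I/O-bound request: {io_bound:.2f}x")   # about 1.05x
print(f"CPU-bound script:  {cpu_bound:.2f}x")  # about 1.82x
```

Under these assumptions, doubling the speed of the CPU work makes the I/O-bound request only about 5% faster, while the CPU-bound script becomes almost twice as fast.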

What does CPU-bound PHP code look like?


Some PHP programmers may be unfamiliar with CPU-bound code because of the very nature of most PHP applications: they usually act as a front end to a database or cache, fetching data and emitting small HTML / JSON / XML responses.

You can look at your codebase and find plenty of code that has nothing to do with I/O and calls functions unrelated to I/O. And it may puzzle you that this does not make your application CPU bound, even though it contains more lines that do no I/O than lines that do.

The point is that PHP is one of the fastest interpreted languages. There is no noticeable difference between calling a function that does no I/O in the Zend VM and calling it in native code. There is some difference, of course, but both machine code and the Zend VM use a calling convention, so it does not matter whether you call a function from opcodes or from machine code: it will have no noticeable effect on the performance of the application making the call.

Note: put simply, a calling convention is the sequence of instructions executed before entering another function. In both cases, the calling convention passes arguments on the stack.

You may ask: “What about loops, tail calls and the like?” PHP is smart enough: with the Optimizer component of OPcache enabled, your code is magically transformed into a more efficient version of what you wrote.

Note that the JIT will not change the Zend VM calling convention. This is because PHP must be able to switch between JIT and VM modes at any time, so we decided to keep the current conventions. As a result, the function calls that are everywhere in your code will not get much faster with the JIT.

If you want to see what CPU-bound PHP code looks like, look here: https://github.com/php/php-src/blob/master/Zend/bench.php . It is an extreme example, but it shows that the JIT truly shines in math.
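For a sense of the kind of code meant here, below is a small sketch of a CPU-bound workload. It is written in Python rather than PHP and is not taken from bench.php; the function name is made up. What matters is the shape: tight loops over numbers in memory, with no database, file or network access anywhere.

```python
# A typical CPU-bound workload: nothing but arithmetic and array access.
# Illustrative only; not copied from bench.php.

def count_primes(limit):
    """Count primes up to and including limit with a sieve of Eratosthenes."""
    if limit < 2:
        return 0
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    n = 2
    while n * n <= limit:
        if is_prime[n]:
            # Cross out every multiple of n, starting at n squared.
            for multiple in range(n * n, limit + 1, n):
                is_prime[multiple] = False
        n += 1
    return sum(is_prime)


print(count_primes(100))  # 25
```

Every second spent in a function like this is spent executing instructions, which is exactly where faster generated code pays off.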

Did you have to make such an extreme compromise in order to speed up mathematical calculations in PHP?


No. We did it to widen the range of things the language can be used for, and to widen it significantly.

We don't want to brag, but PHP dominates the web. If you do web development and are not considering PHP for your next project, you are doing something wrong (in the opinion of a very biased PHP developer).

At first glance, faster math in PHP may seem to have very narrow uses. However, it opens the way to, for example, machine learning, 3D rendering, 2D rendering (GUI) and data analysis.

Why can't it be implemented in PHP 7.4?


Above, I called the JIT an extreme compromise, and I really mean it: it is one of the most complex compilation strategies in existence, if not the most complex. Implementing a JIT means a significant increase in complexity.

If you ask Dmitry, the author of the JIT, whether he has made PHP complex, he will answer: “No, I hate complexity” (this is a quote).

In essence, “complex” means “something we do not understand.” And today, few of the language's developers really understand the current JIT implementation.

Work on PHP 7.4 is progressing rapidly, and introducing the JIT in this version would mean that only a handful of people could debug, fix and improve the language. That was unacceptable to those who voted against a JIT in PHP 7.4.

By the time PHP 8 is released, many of us will understand the JIT implementation. There are features we want to implement and tools we want to rewrite for the eighth version, so we need to understand the JIT first. We need this time, and we are very grateful that the majority voted to give it to us.

Complex is not a synonym for terrible. Complexity can be as beautiful as a nebula, and that is true of the JIT. In other words, even when 20 people on our team understand the JIT as well as Dmitry does, that will not change the inherent complexity of a JIT.

Will PHP development slow down?


There is no reason to think so. We have enough time, so it is fair to say that by the time PHP 8 is ready, enough of us will have mastered the JIT to fix bugs and develop PHP no less efficiently than we do today.

When you try to reconcile this with the inherent complexity of the JIT, remember that most of the time we spend on a new feature goes into discussing it. Usually, when working on features and bug fixes, writing the code takes minutes or hours, while the discussion takes weeks or months. In rare cases the code takes hours or days to write, but even then the discussion always lasts longer.

That's all I wanted to say.

And since we are talking about performance, I invite you to the talk by my colleague Pavel Murzakov on May 17 at the PHP Russia conference. Pasha knows how to squeeze every last CPU second out of PHP code!

Source: https://habr.com/ru/post/448622/

