
Recently there has been a lively discussion in the developer community about PHP and its future. Happily, most of these conversations are positive. Discussions of PHP 6 and what it might look like are popular, and people ask a lot of questions about HHVM and its role in the future of the language and the community. So let me share some of my thoughts on this.
Backward compatibility
I believe that each subsequent release is obliged to maintain backward compatibility with the previous one, whether it is called 6, 7, 99 or “elephant enthusiast”. Let me qualify that with "mostly", since some incompatibilities will still occur. But these incompatibilities must be justified and controlled, and they should be limited to revising the behavior of edge cases and the like. This does not rule out a serious internal reorganization or striving for purity and simplicity. It means that incompatibilities should not put obstacles in the way of developers.
This approach is very easy to verify:
The code you write should run without problems on both PHP 5.x and PHP 6.x (and any two consecutive major releases).
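As a rough illustration of code that runs unchanged across versions, the sketch below uses feature detection rather than assuming a particular release. The helper name `column_of` is hypothetical; `array_column` is a real built-in that only exists from PHP 5.5 onward, so the fallback loop covers older versions. The fallback is deliberately simplified and handles only plain arrays.

```php
<?php
// Hypothetical helper (not part of PHP): extracts one column from a list of rows.
// It uses the native function when the running PHP provides it and a manual
// loop otherwise, so the same file works on older and newer versions.
function column_of(array $rows, $key)
{
    if (function_exists('array_column')) { // built in since PHP 5.5
        return array_column($rows, $key);
    }
    $out = array();
    foreach ($rows as $row) {
        // Rows that lack the key are skipped, matching the native behavior.
        if (is_array($row) && array_key_exists($key, $row)) {
            $out[] = $row[$key];
        }
    }
    return $out;
}

$rows = array(array('id' => 1), array('id' => 2));
print_r(column_of($rows, 'id'));
```

The calling code never needs to know which branch ran, which is the property that makes a gradual migration possible.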
Why is this important? Take a look at the transition from PHP 4 to PHP 5. It was fairly easy for programmers to write code that worked on both versions, and even so the final transition to PHP 5 took about 10 years. Now imagine how it would have gone if that had been difficult.
Actually, there is no need to imagine: this is exactly what happened with Python. The first release of Python 3 came out about 5 years ago, and today, in 2014, it is still not fully adopted. Not because it is bad, but because it is very difficult to maintain a single codebase that works without problems on both versions. You either use Python 2, or you restrict yourself to the subset that also works in Python 3 (losing the advantages of both). And if the libraries or platforms you need have no Python 3 version, you either port them yourself or, well, you are out of luck. In fact, that is exactly what happens.
I am not saying that this approach is wrong: the language gains a great many advantages from such changes. But it seems to me that for the community and the average user such a transition is unnecessarily drastic.
About rewriting the engine
Many people say that the PHP engine needs to be rewritten. Although I definitely see the appeal (yes, the engine is very intricate), I have to ask: is it really necessary? What is the fundamental problem? Undoubtedly, the PHP engine has architectural flaws, but by and large it works well.
So I would prefer to see the engine move to a component-based design, divided into subsystems. This has already been partially done, but I would like to see changes that make the engine truly modular. Why is this important? Because with this approach, individual improvements can make a significant contribution to the development of the engine.
For example, at the moment the most tangled part of PHP is the parser and compiler. They are so tightly coupled that it causes a lot of problems in development. If they were separate components of the engine, both the parser and the compiler would be much easier to replace. The interface between them could be an Abstract Syntax Tree. Why an AST? Because it is a common representation that both components could use. Yes, it would take a great deal of work, but the advantages would not be long in coming: from a consistent and more predictable syntax to the ability to define your own syntax using PHP itself (imagine being able to define DSLs in PHP that are actually part of the language).
So there is no need to rewrite. Refactor and clean up.
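To make the idea of an AST as the shared interface more concrete, here is a toy sketch for a tiny arithmetic language. It has nothing to do with the real PHP engine; every function name is made up for illustration. The parser only produces a tree of nested arrays, and two independent back ends consume that tree, so either side could be replaced without touching the other.

```php
<?php
// Parser: turns source text into an AST and knows nothing about execution.
function parse_expression($source)
{
    preg_match_all('/\d+|[+*]/', $source, $matches);
    $tokens = $matches[0];
    $pos = 0;
    return parse_sum($tokens, $pos);
}

function parse_sum(array $tokens, &$pos)
{
    $node = parse_product($tokens, $pos);
    while ($pos < count($tokens) && $tokens[$pos] === '+') {
        $pos++;
        $node = array('type' => 'add', 'left' => $node, 'right' => parse_product($tokens, $pos));
    }
    return $node;
}

function parse_product(array $tokens, &$pos)
{
    $node = parse_number($tokens, $pos);
    while ($pos < count($tokens) && $tokens[$pos] === '*') {
        $pos++;
        $node = array('type' => 'mul', 'left' => $node, 'right' => parse_number($tokens, $pos));
    }
    return $node;
}

function parse_number(array $tokens, &$pos)
{
    if (!isset($tokens[$pos]) || !preg_match('/^\d+\z/', $tokens[$pos])) {
        throw new RuntimeException('Expected a number at token ' . $pos);
    }
    return array('type' => 'num', 'value' => (int) $tokens[$pos++]);
}

// Back end 1: walks the AST and computes the result directly.
function evaluate(array $node)
{
    switch ($node['type']) {
        case 'num': return $node['value'];
        case 'add': return evaluate($node['left']) + evaluate($node['right']);
        case 'mul': return evaluate($node['left']) * evaluate($node['right']);
    }
    throw new RuntimeException('Unknown node type: ' . $node['type']);
}

// Back end 2: emits instructions for an imaginary stack machine.
function compile_to_stack_code(array $node)
{
    if ($node['type'] === 'num') {
        return array('PUSH ' . $node['value']);
    }
    return array_merge(
        compile_to_stack_code($node['left']),
        compile_to_stack_code($node['right']),
        array(strtoupper($node['type']))
    );
}

$ast = parse_expression('1 + 2 * 3');
echo evaluate($ast), "\n";                             // 7
echo implode('; ', compile_to_stack_code($ast)), "\n"; // PUSH 1; PUSH 2; PUSH 3; MUL; ADD
```

The two back ends never see tokens or source text, only the tree. That separation is the property argued for here; a real engine would of course need a far richer node set and error handling.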
On the transition of the standard library to the object-oriented approach
Some people suggest moving the standard PHP library to an object-oriented approach: even scalar types would have object behavior. So you could write something like this:
$string = "Foo"; var_dump($string->length);
I do not think that this needs to happen, although, I confess, it sounds cool.
The reason is simple: scalars are not objects. More importantly, they do not really have a fixed type at all. PHP relies on a type system that will happily treat a string as an integer. The flexibility of the system lies in the fact that any scalar type can easily be converted to another scalar type. Of course, this is not always good, because it causes a very large number of errors.
However, such situations could be resolved by stricter behavior. For example, a warning or an exception could be raised on "dirty" type conversions, so if someone tried to convert "123abc" to an integer, they would get a message about partial data loss.
More importantly, with such a type system you cannot know with 100% certainty what type a variable has at a given time. You can make assumptions, but what is really there is not known. The situation would not change much even after a type cast or if the language supported scalar type hints, since the types can still be changed later.
All this means that with the object-oriented approach, all scalar operations would have to be attached to all scalar types. That would lead to an object model in which every scalar would have not only mathematical methods but also string methods. What nonsense ...
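A userland approximation of that stricter behavior might look like the sketch below. The function `strict_int` is a hypothetical helper, not something PHP provides, and it only covers the string-to-integer case. It does not check for overflow of very long digit strings.

```php
<?php
// Hypothetical helper (not part of PHP): converts a value to an integer but
// refuses any conversion that would silently drop part of the input.
function strict_int($value)
{
    if (is_int($value)) {
        return $value;
    }
    // Accept only an optional sign followed by digits, nothing else.
    if (is_string($value) && preg_match('/^[+-]?\d+\z/', $value) === 1) {
        return (int) $value;
    }
    throw new InvalidArgumentException(
        'Lossy conversion to integer: ' . var_export($value, true)
    );
}

var_dump((int) "123abc");    // the built-in cast silently yields int(123)
var_dump(strict_int("123")); // int(123)

try {
    strict_int("123abc");
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), "\n"; // Lossy conversion to integer: '123abc'
}
```

Whether a failure like this should be a warning or an exception is a design choice; the exception is used here only because it is easier to demonstrate.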
About HHVM
Today, at the time of this writing, I do not recommend using HHVM in production. There are several reasons for this. They are all well known and none is fundamental. Time will tell whether they can be solved, and I really hope so.
- HHVM is controlled by one company. Do not get me wrong: the problem is not that Facebook spends a lot of money on development. The problem is that the project is controlled by a company whose business does not depend on whether you use HHVM or not. It would be one thing if they provided paid support and made HHVM a complete product. As it stands, it is neither a true open source project nor a commercial one, but something in between. I would be very nervous about moving production to HHVM in such a situation.
- HHVM does not have a public specification. In general, you program for it the same way as for the Zend engine. However, this is trial and error: everything is fine as long as you do not try to support multiple implementations. As a library developer, I have already learned this the hard way. If HHVM and PHP shared a common specification, many things would be much simpler ...
- HHVM is run as a closed project, although it accepts code from third-party developers (which is already good). However, a flow of pull requests and patches does not make an open source project. Where is the transparency of the process? Where is the clarity of vision? Where is the openness to participation? Where is the leadership?
I know that I am not alone in this view. HHVM will be a strong contender in the future, but I believe that until the issues above are resolved, the time for HHVM in commercial production has not come.
Can PHP and HHVM coexist?
Naturally. Although some benchmarks look convincing, JIT compilers are not magic. They make trade-offs in the real world, and many benchmarks reveal this. In fact, if you look closely at the vast majority of benchmarks, you will notice that they do not execute “real” code. Wait, you are still comparing the performance of Hello World or a Fibonacci generator? Well, good luck to you, but please calm down and throw out all those useless results.
Let me repeat: benchmarks that do not use real systems are useless. Worse than useless, they are simply dangerous.
In practice, there are tasks that HHVM does much faster than PHP, and there are tasks where PHP is the faster one. The only way to find out is to benchmark your own application.
But HHVM executes my code as native! How can PHP be faster?
Remember I said that JIT is not magic? This is where it shows. You cannot simply compile PHP ahead of time, because it is a dynamic, interpreted language: you cannot know exactly what code needs to be compiled until you actually execute it. So that is what the JIT does. It analyzes the code as it runs and, once it has gathered enough information, generates native code. This process is not free, which is why HHVM is slow for command-line scripts.
More importantly, the JIT does not generate generic code. It generates code for the conditions that held when the code was created. So, if your function adds two integers, it could be compiled down to a simple add instruction. However, the compiler will also add guard instructions that check that the parameters are integers. If you later pass something other than an integer to the function (which is perfectly normal in PHP), one of those checks will fail.
When a check fails, something like a “bailout” occurs. Simply put, the engine discards everything that was compiled for this method and switches to interpreter mode. Such an operation is much more expensive than simply staying in interpreter mode all along.
And this is just one reason why JIT compilers are not magic.
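The guard-and-fall-back behavior described above can be imitated in plain PHP. The sketch below is a toy model under loud assumptions: the class `SpecializingAdder` is invented for illustration, a closure stands in for generated machine code, and nothing here reflects how HHVM is actually implemented. It only shows the shape of the mechanism: specialize for the types seen so far, guard every call, and throw the specialized code away when the guard fails.

```php
<?php
// Toy model only: a real JIT emits machine code; here a closure stands in for it.
class SpecializingAdder
{
    public $bailouts = 0;      // how many times the specialized code was discarded
    private $fastPath = null;  // the "compiled" integer-only version, if any

    // Slow path: what an interpreter would do, works for any PHP value.
    private function interpret($a, $b)
    {
        return $a + $b;
    }

    public function add($a, $b)
    {
        if ($this->fastPath === null && is_int($a) && is_int($b)) {
            // "Compile" a version specialized for the types observed so far.
            $this->fastPath = function ($x, $y) {
                return $x + $y;
            };
        }
        if ($this->fastPath !== null) {
            // Type guard: the specialized code is valid only for two integers.
            if (is_int($a) && is_int($b)) {
                $compiled = $this->fastPath;
                return $compiled($a, $b);
            }
            // Guard failed: discard the compiled code and fall back.
            $this->fastPath = null;
            $this->bailouts++;
        }
        return $this->interpret($a, $b);
    }
}

$adder = new SpecializingAdder();
var_dump($adder->add(1, 2));   // int(3), served by the specialized path
var_dump($adder->add(1, "2")); // int(3), but only after a bailout
var_dump($adder->bailouts);    // int(1)
```

In this model the bailout costs almost nothing, which is exactly where it differs from a real engine: there, the discarded work includes profiling and code generation, and that is what makes a failed guard expensive.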
Do not think that I am against JIT compilers. On the contrary, for most tasks they deliver a significant performance gain. But they are not perfect.
Look at other communities and you will see interpreter-based virtual machines coexisting with JIT-compiling implementations. CPython and PyPy are a good example of this. It is also worth noting that Python has a language specification, so you can switch from one implementation to another fairly easily.
But Hack is cool!
Hack is a new programming language developed by Facebook and included in HHVM. Roughly speaking, it is a statically typed version of PHP with some additional features ...
And Hack is awesome! I really hope the HHVM problems I described get resolved, so that I could contribute!
After all, this is an interesting idea. There are now several languages built on top of PHP. The leaders are Hack and Zephir. But there is a problem: each is designed for a specific runtime. Hack runs on HHVM, and Zephir runs on PHP. How do we solve this?
Honestly, I would just drop Zephir and build a compiler from Hack to PECL. Since Hack is a statically typed language, it should be possible to cross-compile Hack into a PECL extension. And given that Hack already supports C++ bindings (for connecting system libraries), the compiler should in theory be able to handle those as well. In that case there would be no point in writing a PECL extension by hand. You would write your extension in Hack (which has static analyzers and debuggers) and generate a 100% compatible PECL extension. This is, of course, far from trivial to implement, but it would be great to try! It is, by the way, another argument in favor of a language specification.
About language specification
You have probably noticed that I have mentioned the need for a language specification several times ...
My point is that this is the single most important thing that would help improve the future of PHP as a language, platform, ecosystem and community.
Summing up
PHP is entering a very interesting phase of its development. People are building cool things and pushing progress forward. So if we want PHP to keep growing, I think we should be very aware of what we are doing when we make one choice or another.