Here is a translation of Johannes Schmitt's article "Automated Code Reviews for PHP". Personally, it helped me take a slightly different look at how I develop and test my applications, and the author's original approach to testing deserves attention at the very least.
If you're interested too, read on.
Since Travis appeared, you can introduce continuous integration into all your PHP projects in the blink of an eye. This not only helps improve code quality, it also greatly simplifies maintaining libraries by providing build information directly in the pull request, which shortens the feedback loop. Travis is very good, but like other testing tools it suffers from a hereditary disease: to do anything, it needs tests. I am willing to bet that you have no project that is honestly covered by tests at 100%, or even close to it. And that is assuming you write tests at all.
As you may know, I maintain a significant number of Symfony2 bundles and stand-alone PHP libraries. Thanks to the community (thanks guys, keep it up) I constantly receive pull requests for my repositories. Some of them are completely useless, some deserve attention, and some can be merged into the main branch. But no matter how carefully a pull request is reviewed, from time to time something gets merged that does not work, or that works only some of the time.
A couple of months ago I tried to change this situation. The idea was quite simple: create a system that checks the code of a pull request and gives feedback. I quickly built a prototype and added a couple of simple checks to it. Then I wanted to add more complicated ones, for example, whether a method can be called. To see the benefit of such a check, look at the following example:
<?php

class UserProvider
{
    public function loadUser($username)
    {
    }

    public function refreshUser(User $user)
    {
        if (null === $user = $this->loadUser($user->getUsername())) {
            throw new RuntimeException(
                sprintf('User "%s" was not found.', $user->getUsername())
            );
        }

        return $user;
    }
}
So, the refreshUser method fetches a User object using the loadUser method and returns it. If the object is not found, it throws an exception. It seems simple, but is it really? Since I am asking, apparently not, and many of you have already spotted the mistake. Inside the if block, $user is null, so we cannot call its getUsername method. To find this kind of error I tried some simple solutions, but it quickly became obvious that they only work in very specific cases. I needed something better.
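For reference, one possible way to avoid the mistake is to keep the lookup result in a separate variable, so that the original $user is still available when the exception message is built. This is only a sketch, not the author's fix: the User class, the addUser method and the in-memory $users array are hypothetical and exist only to make the example runnable.

```php
<?php

// Hypothetical minimal User class, only here to make the sketch runnable.
class User
{
    private $username;

    public function __construct($username)
    {
        $this->username = $username;
    }

    public function getUsername()
    {
        return $this->username;
    }
}

class UserProvider
{
    // Hypothetical in-memory storage; the loadUser() body in the article is empty.
    private $users = array();

    public function addUser(User $user)
    {
        $this->users[$user->getUsername()] = $user;
    }

    public function loadUser($username)
    {
        return isset($this->users[$username]) ? $this->users[$username] : null;
    }

    public function refreshUser(User $user)
    {
        // Store the lookup result separately instead of overwriting $user.
        $refreshedUser = $this->loadUser($user->getUsername());

        if (null === $refreshedUser) {
            // $user is still a User object here, so getUsername() is safe.
            throw new RuntimeException(
                sprintf('User "%s" was not found.', $user->getUsername())
            );
        }

        return $refreshedUser;
    }
}
```

With this version, refreshing an unknown user raises the intended RuntimeException instead of failing on a method call on null.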
Type Inference of PHP Code
I spent quite a lot of time delving into the concepts of data flow, control flow, and abstract interpretation. These are fairly complex topics in their own right and are beyond the scope of this article, but let me give a few examples to convey the general idea behind them.
Control Flow Analysis
Control flow analysis determines the order in which the various blocks of your code will be executed.
<?php

function fooBar($i)
{
    if ($i > 0) {
        echo 'foo';
    } else {
        echo 'bar';
    }
}
For this code, the control flow will look like this:

We start at the if, then move to either "foo" or "bar", and finally exit. On its own this is unlikely to help us, but it serves as the basis for the next step.
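The control flow described here can be sketched as a small graph in which every block points to the blocks that may run after it. The node names and the enumeratePaths helper are hypothetical and chosen only for illustration; the helper assumes a graph without loops.

```php
<?php

// Each node maps to the list of nodes that can follow it (hypothetical names).
$controlFlowGraph = array(
    'if'   => array('foo', 'bar'),
    'foo'  => array('exit'),
    'bar'  => array('exit'),
    'exit' => array(),
);

/**
 * Returns every path from $node to $target.
 * Assumes the graph is acyclic, which holds for fooBar().
 */
function enumeratePaths(array $graph, $node, $target, array $prefix = array())
{
    $prefix[] = $node;

    if ($node === $target) {
        return array($prefix);
    }

    $paths = array();
    foreach ($graph[$node] as $successor) {
        $paths = array_merge(
            $paths,
            enumeratePaths($graph, $successor, $target, $prefix)
        );
    }

    return $paths;
}

foreach (enumeratePaths($controlFlowGraph, 'if', 'exit') as $path) {
    echo implode(' -> ', $path), PHP_EOL;
}
// prints:
// if -> foo -> exit
// if -> bar -> exit
```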
Data Flow Analysis
Data flow analysis determines how the execution context changes as we move along the paths identified by control flow analysis.
<?php $x = null;
Without knowing the order in which the code is executed, we can only conclude that $x can be null, a number, or a DateTime. That alone does not tell us whether the format method can be called.
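To make the point concrete, here is a hypothetical function (not the author's original listing) in which $x holds null, a DateTime or a number depending on the path taken. The comments note what a data flow analysis could conclude about $x at each point, assuming it follows the control flow.

```php
<?php

// Hypothetical example; the function name and values are illustrative only.
function describe($useDate)
{
    $x = null;                            // here $x is null

    if ($useDate) {
        $x = new DateTime('2012-11-01');  // on this branch $x is a DateTime
        return $x->format('Y-m-d');       // safe: only a DateTime reaches this line
    }

    $x = 42;                              // on this path $x is an integer
    return (string) $x;                   // $x->format() would fail here
}

echo describe(true), PHP_EOL;   // prints 2012-11-01
echo describe(false), PHP_EOL;  // prints 42
```

Looking only at the set of assigned values, the call to format looks risky. Following the paths in order shows that it is reached only when $x is a DateTime.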
Abstract Interpretation
For our purposes, this concept boils down to the question: "What assumptions can we make if we know the result of a conditional expression?" Let's take a look at another example:
<?php

class Foo
{
    private $logger;

    public function __construct(Logger $logger = null)
    {
        $this->logger = $logger;
    }

    public function doSth()
    {
        if (null !== $this->logger) {
            $this->logger->log('doing sth');
        }
    }
}
In this case, the "conditional expression" is null !== $this->logger. Our question can then be rephrased like this: "If the expression null !== $this->logger is true, what can we assume about $this->logger?" As we found out, $this->logger may be null or a Logger. But thanks to abstract interpretation, we can be sure that inside the if block $this->logger will always be an instance of Logger, and therefore the method can be called.
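The narrowing step can be sketched roughly as follows. This is a deliberately simplified illustration that represents types as plain strings; the narrowByNotNullCheck function is hypothetical and is not taken from the author's tool.

```php
<?php

/**
 * Hypothetical helper: given the possible types of a variable and the outcome
 * of the check `null !== $variable`, returns the types that remain possible.
 */
function narrowByNotNullCheck(array $possibleTypes, $conditionIsTrue)
{
    $narrowed = array();

    foreach ($possibleTypes as $type) {
        $isNull = ('null' === $type);

        // True branch: null is ruled out. False branch: only null remains.
        if ($conditionIsTrue ? !$isNull : $isNull) {
            $narrowed[] = $type;
        }
    }

    return $narrowed;
}

$loggerTypes = array('null', 'Logger');

// Inside the if block only Logger remains, so calling log() is safe.
echo implode('|', narrowByNotNullCheck($loggerTypes, true)), PHP_EOL;  // prints Logger

// On the implicit else path only null remains.
echo implode('|', narrowByNotNullCheck($loggerTypes, false)), PHP_EOL; // prints null
```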
Automated Code Review System
What is the use of all this, you ask? At the beginning of the article I said that my goal was to create an automated code review system. I think it is now ready for wider use and discussion. I have tested it against leading PHP libraries such as Zend Framework 2, Symfony2, Doctrine, Propel and many others. It contains over 100 checks that you can enable and configure. If you have a PHP project on GitHub, you can easily try it: just log in at http://jmsyst.com/automated-code-reviews and select the desired repository. And if you don't like it, you can turn it off at any time.
Now, if someone says that PHP programmers are not too serious about the quality of the code, send them to me.