
New PHP, Part 2: Scalar types


In our previous article, we talked about the benefits of the PHP 7 type system, and in particular about the new support for typed return values. That alone is a great help in maintaining code and a big step forward for PHP.

So far we have talked about types only with respect to classes and interfaces. For years, those (and arrays) were all we could use. PHP 7, however, adds the ability to use scalar types too, such as int, string and float.

But wait. In PHP, most primitives are interchangeable. We can pass "123" to a function that wants an int and trust PHP to do the “right” thing. So why do we need scalar types?
Like return types, scalar types make code clearer and make it possible to catch more errors at an early stage. That, in turn, makes the code more reliable.

PHP 7 adds four new types that can be specified for parameters or return values: int, float, string and bool. They join the already existing array, callable, classes, and interfaces. Let's extend our previous example with the new feature:

    interface AddressInterface
    {
        public function getStreet() : string;
        public function getCity() : string;
        public function getState() : string;
        public function getZip() : string;
    }

    class EmptyAddress implements AddressInterface
    {
        public function getStreet() : string { return ''; }
        public function getCity() : string { return ''; }
        public function getState() : string { return ''; }
        public function getZip() : string { return ''; }
    }

    class Address implements AddressInterface
    {
        protected $street;
        protected $city;
        protected $state;
        protected $zip;

        public function __construct(string $street, string $city, string $state, string $zip)
        {
            $this->street = $street;
            $this->city = $city;
            $this->state = $state;
            $this->zip = $zip;
        }

        public function getStreet() : string { return $this->street; }
        public function getCity() : string { return $this->city; }
        public function getState() : string { return $this->state; }
        public function getZip() : string { return $this->zip; }
    }

    class Employee
    {
        protected $id;
        protected $address;

        public function __construct(int $id, AddressInterface $address)
        {
            $this->id = $id;
            $this->address = $address;
        }

        public function getId() : int
        {
            return $this->id;
        }

        public function getAddress() : AddressInterface
        {
            return $this->address;
        }
    }

    class EmployeeRepository
    {
        private $data = [];

        public function __construct()
        {
            $this->data[123] = new Employee(123, new Address('123 Main St.', 'Chicago', 'IL', '60614'));
            $this->data[456] = new Employee(456, new Address('45 Hull St', 'Boston', 'MA', '02113'));
        }

        public function findById(int $id) : Employee
        {
            if (!isset($this->data[$id])) {
                throw new InvalidArgumentException('No such Employee: ' . $id);
            }
            return $this->data[$id];
        }
    }

    $r = new EmployeeRepository();

    try {
        print $r->findById(123)->getAddress()->getStreet() . PHP_EOL;
        print $r->findById(789)->getAddress()->getStreet() . PHP_EOL;
    } catch (InvalidArgumentException $e) {
        print $e->getMessage() . PHP_EOL;
    }

The changes affect only the methods that accept or return a string or a number. Even this simple step already brings some advantages.

  1. It is now known that the various fields of the Address class are just strings. Previously, one could only assume that they were strings and not, say, a Street object (consisting of the street number, street name and apartment number) or the ID of a city in the database. Both of those are perfectly reasonable in certain circumstances, but they are not considered in this article.
  2. Employee IDs are known to be integers. In many companies, an employee ID is an alphanumeric string, or a number with leading zeros. Previously there was no way to know; now other interpretations are excluded.
  3. Security is also a plus. We are guaranteed that inside findById() $id is a value of type int. Even if it originally came from user input, it will have been converted to an integer by the time the method body runs. This means it cannot carry, for example, an SQL injection. Relying on type checks for user input is not the only or even the best defense against attack, but it is another layer of protection.

It may seem that the first two benefits are redundant given documentation. If you have good doc blocks in the code, you already know that Address consists of strings and the employee ID is an integer, right? True; however, not everyone is fanatical about documenting their code, and people forget to update the documentation. With "active" information from the language itself, the declaration and the behavior cannot drift apart, because PHP will throw a TypeError if they do.

Explicit type declarations also open the door to more powerful tools. Programs, whether PHP itself or third-party analysis tools, can study the source code and find possible errors or optimization opportunities based on that information.

For example, we can tell that the following code is incorrect and will always fail, without even running it:

    function loadUser(int $id) : User
    {
        return new User($id);
    }

    function findPostsForUser(int $uid) : array
    {
        // Obviously something more robust.
        return [new Post(), new Post()];
    }

    $u = loadUser(123);
    $posts = findPostsForUser($u);

loadUser() always returns an object of type User, and findPostsForUser() always expects an integer, so there is no way to make this code correct. We can tell that just by looking at the functions and how they are used. That, in turn, means an IDE can also know it in advance and warn us about the error before we run anything. And since an IDE can keep track of far more moving parts than we can, it can warn about more errors than we would spot ourselves... without executing the code!
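The same mismatch can be confirmed at run time. The following is a minimal sketch, assuming stub User and Post classes (they are hypothetical; the article never defines them):

```php
<?php
// Hypothetical stubs: the article does not define User or Post.
class User
{
    public $id;

    public function __construct(int $id)
    {
        $this->id = $id;
    }
}

class Post
{
}

function loadUser(int $id) : User
{
    return new User($id);
}

function findPostsForUser(int $uid) : array
{
    return [new Post(), new Post()];
}

$u = loadUser(123);

try {
    // A User object can never be converted to an int, so this call
    // fails in weak mode as well as in strict mode.
    $posts = findPostsForUser($u);
    $failed = false;
} catch (TypeError $e) {
    $failed = true;
    print get_class($e) . PHP_EOL; // TypeError
}
```

A static analyzer reaches the same conclusion from the two signatures alone, without running anything.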

This process is called “static analysis,” and it is an incredibly powerful way to evaluate programs in order to find and fix errors. It is somewhat hampered by PHP's standard weak typing. Pass an integer to a function that expects a string, or a string to a function that expects an integer, and everything keeps working: PHP silently converts the primitives to each other, as it always has. That makes static analysis, whether done by us or by tools, less useful.

The introduction of strict typing is the cornerstone of the new PHP 7 type system.

By default, when working with scalar types (parameters or return values), PHP will do its best to convert the value to the expected type. That is, passing an int to a function that expects a string works fine, and passing a bool where an int is expected gives you the integer 0 or 1, because that is the natural behavior one expects from the language. For an object passed to a function that expects a string, __toString() is called. The same applies to return values.
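These default conversions are easy to observe. A small sketch, in which describeInt(), describeString() and the Name class are hypothetical helpers made up for illustration:

```php
<?php
// No declare(strict_types=1) here, so this file runs in the default weak mode.

function describeInt(int $value) : string
{
    return gettype($value) . '(' . $value . ')';
}

function describeString(string $value) : string
{
    return gettype($value) . '(' . $value . ')';
}

// Hypothetical class used only to show the __toString() conversion.
class Name
{
    public function __toString()
    {
        return 'Ada';
    }
}

print describeInt('123') . PHP_EOL;         // integer(123)
print describeInt(true) . PHP_EOL;          // integer(1)
print describeString(42) . PHP_EOL;         // string(42)
print describeString(new Name()) . PHP_EOL; // string(Ada)
```

In every call the function body sees a value of the declared type, whatever the caller passed.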

The one exception is passing a string where an int or float is expected. Traditionally, when a function expects an int or float and is given a string, PHP simply truncates the string at the first non-numeric character, possibly losing data. With scalar type declarations, the parameter works fine if the string really is numeric, but if the value would be truncated, PHP raises an E_NOTICE. Everything still works, but for now the situation is treated as a minor error.

Automatic conversion makes sense where almost all input arrives as strings (from a database or HTTP requests), but it also limits the usefulness of type checking. For exactly this reason, PHP 7 offers the strict_types mode. Its use is somewhat subtle and not obvious, but once understood it gives the developer an incredibly powerful tool.

To enable strict typing, add this declaration to the beginning of the file:

    declare(strict_types=1);

This declaration must be the first statement in the file, before any other code. It affects only the code in that file, and only the function calls and return statements written in that file. To understand how strict_types works, let's split our code into several separate files and modify it a bit:
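Before moving to the multi-file version, the effect of the declaration can be seen in a single file. This is a sketch only: the standalone findById() below is a simplified stand-in that returns a string, not the repository method from the article.

```php
<?php
declare(strict_types=1); // must be the first statement in the file

// Simplified, hypothetical stand-in for EmployeeRepository::findById().
function findById(int $id) : string
{
    return 'employee #' . $id;
}

print findById(123) . PHP_EOL; // an int is accepted as before

try {
    // The call is written in a strict file, so the string is not converted.
    print findById('123') . PHP_EOL;
    $rejected = false;
} catch (TypeError $e) {
    $rejected = true;
    print 'TypeError: string passed where int is required' . PHP_EOL;
}
```

Removing the declare() line makes the second call succeed, because the string '123' is then converted to the integer 123.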

    // EmployeeRepository.php
    class EmployeeRepository
    {
        private $data = [];

        public function __construct()
        {
            $this->data[123] = new Employee(123, new Address('123 Main St.', 'Chicago', 'IL', '60614'));
            $this->data[456] = new Employee(456, new Address('45 Hull St', 'Boston', 'MA', 02113));
        }

        public function findById(int $id) : Employee
        {
            if (!isset($this->data[$id])) {
                throw new InvalidArgumentException('No such Employee: ' . $id);
            }
            return $this->data[$id];
        }
    }

    // index.php
    $r = new EmployeeRepository();

    try {
        $employee_id = get_from_request_query('employee_id');
        print $r->findById($employee_id)->getAddress()->getStreet() . PHP_EOL;
    } catch (InvalidArgumentException $e) {
        print $e->getMessage() . PHP_EOL;
    }

What can be guaranteed? Most importantly, we know for certain that $id inside findById() is an int, not a string. Whatever $employee_id is, $id will always be an int, even if an E_NOTICE is raised along the way. If we add the strict_types declaration to EmployeeRepository.php, nothing changes: we still have an int inside findById().

However, if we declare strict typing in index.php and then call findById() there, a $employee_id that is a string, a float, or anything other than an int results in a TypeError.

In other words, methods and functions are affected by strict typing only when they are called from a file that declares it. Code in any other part of the project is unaffected. A library author's decision to use strict typing will not affect your code, only theirs.

So what is so good about it? An astute reader may have noticed that I slipped a very subtle bug into the last code example. Look at the EmployeeRepository constructor, where we create our fake data. The second entry passes the ZIP code as an int, not a string. Most of the time that would not matter. However, US postal codes in the Northeast begin with a leading zero, and if an int literal starts with a leading zero, PHP interprets it as an octal number, that is, a number in base 8.

With weak typing, this means that the Boston address gets not the ZIP code 02113 but the integer 02113, which in base 10 is 1099, and PHP happily converts that to the postal code "1099". Believe me, the people of Boston hate that. Such an error may eventually be caught somewhere in the database or during validation, when code insists on a five-digit number, but at that point you will have no idea where 1099 came from. Maybe later, after eight rounds of debugging, it will become clear.
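The octal arithmetic is easy to check for yourself. A short sketch using only built-in functions:

```php
<?php
// A leading zero makes an integer literal octal.
var_dump(02113);          // int(1099)
var_dump(02113 === 1099); // bool(true)

// 2*8^3 + 1*8^2 + 1*8^1 + 3*8^0 = 1024 + 64 + 8 + 3 = 1099
var_dump(octdec('2113')); // int(1099)

// Converted back to a string, the leading zero and the original digits are gone.
var_dump((string) 02113); // string(4) "1099"

// Kept as a string from the start, the ZIP code survives intact.
var_dump('02113');        // string(5) "02113"
```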

If we instead switch EmployeeRepository.php to strict mode, the type mismatch is caught immediately. Running the code produces very specific errors that point to the exact lines at fault. And good tools (or an IDE) can catch them even before the code is run!

The only automatic conversion that strict mode allows is from int to float. This is safe (except for extremely large or small values, where overflow can have consequences) and logical, since an int is by definition also a valid floating-point value.
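A minimal sketch of that one permitted conversion, using a hypothetical half() function:

```php
<?php
declare(strict_types=1);

// Hypothetical function used only to demonstrate int-to-float widening.
function half(float $value) : float
{
    return $value / 2;
}

var_dump(half(3)); // float(1.5): the int 3 is widened to 3.0

try {
    half('3');     // a string is still rejected in strict mode
    $rejected = false;
} catch (TypeError $e) {
    $rejected = true;
}

var_dump($rejected); // bool(true)
```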

When should we use strict typing? My answer is simple: as often as possible. Scalar types, return types, and strict mode offer tremendous advantages for debugging and maintaining code. All of them should be used as much as possible, and the result will be more reliable, more maintainable code with fewer bugs.

An exception to this rule may be code that interacts directly with an input channel, such as a database or incoming HTTP requests. In these cases the input is always going to be strings, because working on the web is just a more elaborate way of concatenating strings. When working with HTTP or a database by hand, you can convert the data from strings to the required types yourself, but that usually amounts to a lot of unnecessary work: if the SQL column is of type int, then you know (even if your IDE does not) that it will always return a numeric string, and therefore you can be sure no data is lost when passing it to a function that expects an int.

This means that in our example, EmployeeRepository, Employee, Address, AddressInterface and EmptyAddress should have strict mode enabled. index.php, on the other hand, interacts with incoming requests (via the call to get_from_request_query()), so it will probably be easier to let PHP deal with the types there rather than converting them manually. Once the untyped values from the request have been passed into a typed function, strictly typed code can take over.
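For comparison, this is roughly what the manual route looks like if a boundary file does opt into strict mode. It is a sketch only: the array-based get_from_request_query() stub and the toEmployeeId() helper are assumptions made for illustration, not code from the article.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the article's get_from_request_query():
// request values always arrive as strings (or are missing).
function get_from_request_query(array $query, string $key) : string
{
    return isset($query[$key]) ? (string) $query[$key] : '';
}

// Hypothetical helper: turn a raw request string into an int or fail loudly.
function toEmployeeId(string $raw) : int
{
    $id = filter_var($raw, FILTER_VALIDATE_INT);
    if ($id === false) {
        throw new InvalidArgumentException('Not a valid employee ID: ' . $raw);
    }
    return $id;
}

$query = ['employee_id' => '123']; // simulated query string
$id = toEmployeeId(get_from_request_query($query, 'employee_id'));
var_dump($id); // int(123)
```

Every value that crosses the boundary needs a conversion like this, which is the extra work the weak-mode approach avoids.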

Let's hope that the transition to PHP 7 will be much faster than it was with PHP 5. It really is worth it. One of the main reasons is the expanded type system, which lets us make code more self-documenting and easier to understand, both for each other and for our tools. The result is far fewer “hmmm, I don’t even know what to do with this” moments than ever before.

Source: https://habr.com/ru/post/267799/

