
Static code analysis in PHP: regular expressions

Continuing the topic of static analysis, which in general looks for defects in program source code, let's turn to the validation of regular expressions.

Regular expressions are a touchy subject in PHP (about as touchy as array manipulation), so I'll briefly recap what we are dealing with.

Regular expressions are used to analyze and process text, for example to check that an email address or phone number is well formed. For all their flexibility, they are hard for a person to read: it is easy to make a mistake in the pattern or in the choice of modifiers. They are also often used where simple string functions would do.

So, using ZF2 and Symfony2, which many readers know, let's see which of these problems static analysis can find.

Tools


The tools used were PhpStorm and the static code analyzer Php Inspections (EA Extended), which is installed as a plugin.

Php Inspections (EA Extended) contains a large set of rules, divided into groups. If you have not used this plugin yet, the first thing to do is to analyze the entire project. The second is to configure the Code Style inspections (Settings -> Editor -> Inspections -> PHP -> Php Inspections (EA Extended)) so that you can focus on the other messages.

And, of course, do not trust analyzers blindly: project semantics are not trivial. Make sure the code is covered by tests before making changes; working with static analyzers calls for some safety precautions.

Defect examples


Mostly minor defects were found, and only one real bug. This is not surprising, since both frameworks are well optimized and tested: it is very difficult to find anything serious.

ZF2


Warning: 'str_replace("-", ...)' can be used instead

    return preg_replace('#-#', $this->separator, $value);
    /*
     * A regular expression is not needed here,
     * since this is a plain substring replacement.
     *
     * return str_replace('-', $this->separator, $value);
     */
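
One way to check that this replacement is safe is to run both variants on the same input. The sketch below is only an illustration: the helper names, the separator and the sample strings are hypothetical and do not come from ZF2.

```php
<?php
// Hypothetical helpers mirroring the two variants of the snippet above.
function replaceViaRegex(string $value, string $separator): string
{
    return preg_replace('#-#', $separator, $value);
}

function replaceViaString(string $value, string $separator): string
{
    return str_replace('-', $separator, $value);
}

foreach (['foo-bar-baz', 'no dashes here', '--', ''] as $value) {
    // Prints bool(true) each time: both variants return the same string.
    var_dump(replaceViaRegex($value, '_') === replaceViaString($value, '_'));
}

// The variants differ only when the separator looks like a backreference:
// preg_replace expands "$0" to the matched text, str_replace keeps it literal.
var_dump(replaceViaRegex('a-b', '$0'));  // string(3) "a-b"
var_dump(replaceViaString('a-b', '$0')); // string(5) "a$0b"
```

The last two calls show the one behavioral difference worth keeping in mind: if the separator can contain sequences such as `$0`, the string function is the one that treats it literally.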

Warning: '0 === strpos("...", "get")' can be used instead

    if (preg_match('/^get/', $method)) { ... }
    /*
     * Here, as in the previous case,
     * a plain string function is enough.
     *
     * if (0 === strpos($method, 'get')) {
     *     ...
     * }
     */
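
The strict comparison in the suggested replacement matters, because strpos() returns false when the substring is absent, and false loosely equals 0. A minimal sketch of the equivalence, with hypothetical helper names and made-up method names:

```php
<?php
// Hypothetical helpers; $method stands for any method name.
function isGetterViaRegex(string $method): bool
{
    return preg_match('/^get/', $method) === 1;
}

function isGetterViaStrpos(string $method): bool
{
    // Must be ===: strpos() returns false when 'get' is absent,
    // and a loose comparison would treat false as position 0.
    return 0 === strpos($method, 'get');
}

foreach (['getName', 'get', 'target', 'Getter', ''] as $method) {
    // Prints bool(true) each time: both checks agree.
    var_dump(isGetterViaRegex($method) === isGetterViaStrpos($method));
}
```

On PHP 8.0 and later, str_starts_with($method, 'get') expresses the same check directly.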

Warning: '[0-9]' can be replaced with '\d' (safe in non-unicode mode)

    if (!preg_match('/^([0-9]{1,3}\.){3}[0-9]{1,3}$/', $host)) { ... }
    /*
     * The replacement is safe as long as the pattern does not use /u:
     * with /u, \d also matches digits outside ASCII.
     *
     * if (!preg_match('/^(\d{1,3}\.){3}\d{1,3}$/', $host)) {
     *     ...
     * }
     */
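
The "non-unicode mode" caveat can be seen with a single non-ASCII digit. The sketch below assumes PHP 7 or later (for the "\u{...}" escape) and a PCRE build with Unicode support, which is the usual case; the sample character is chosen only for illustration.

```php
<?php
$arabicThree = "\u{0663}"; // ARABIC-INDIC DIGIT THREE, a non-ASCII decimal digit

// Without /u the subject is handled as bytes, and \d matches ASCII digits only.
var_dump(preg_match('/^\d$/', '3'));           // int(1)
var_dump(preg_match('/^\d+$/', $arabicThree)); // int(0)

// With /u, PHP also enables Unicode character properties, so \d widens,
// while the explicit range [0-9] stays ASCII-only.
var_dump(preg_match('/^\d$/u', $arabicThree));    // int(1)
var_dump(preg_match('/^[0-9]$/u', $arabicThree)); // int(0)
```

So in a pattern with /u, such as an input validator, swapping [0-9] for \d changes what is accepted.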

Warning: '[^\s]' can be replaced with '\S' (as options: \D, \W)

    if ($property->isPublic() && preg_match_all('/@var\s+([^\s]+)/m', $property->getDocComment(), $matches)) { ... }
    /*
     * Using \S makes the pattern shorter and, arguably, easier to read.
     */

Warning: 'i' modifier is ambiguous here (no a-z in given pattern)

    if (preg_match('/([^.]{2,10})$/iu', end($domainParts), $matches) || (array_key_exists(end($domainParts), $this->validIdns))) { ... }
    /*
     * /i is redundant, since the pattern contains no letters.
     */
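
Because the pattern contains no letters, case-insensitive matching has nothing to act on. A small sketch that compares the two variants; the helper name and the sample inputs are hypothetical:

```php
<?php
// Hypothetical helper: runs the pattern with and without /i on one input
// and reports whether the return value and the captures are identical.
function sameWithAndWithoutI(string $tld): bool
{
    $resultWithI    = preg_match('/([^.]{2,10})$/iu', $tld, $matchesWithI);
    $resultWithoutI = preg_match('/([^.]{2,10})$/u', $tld, $matchesWithoutI);

    return $resultWithI === $resultWithoutI && $matchesWithI === $matchesWithoutI;
}

// Made-up inputs: upper and lower case, a Cyrillic label, a too-short
// label, and a value that still contains a dot.
foreach (['COM', 'com', "\u{0440}\u{0444}", 'x', 'co.uk'] as $tld) {
    var_dump(sameWithAndWithoutI($tld)); // bool(true) for every sample
}
```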


Symfony2


Warning: 'false !== strpos("...", "file")' can be used instead

    return preg_match('/file/', $this->getFilename());
    /*
     * A regular expression is not needed here,
     * since this is a plain substring search.
     *
     * return false !== strpos($this->getFilename(), 'file');
     */

Warning: '[^A-Za-z0-9_]' can be replaced with '\W' (safe in non-unicode mode)

    $gotoname = 'not_'.preg_replace('/[^A-Za-z0-9_]/', '', $name);
    /*
     * Using \W makes the pattern shorter and, arguably, easier to read.
     */

Warning: '[a-zA-Z0-9_]' can be replaced with '\w' (safe in non-unicode mode)

    return '' === $name || null === $name || preg_match('/^[a-zA-Z0-9_][a-zA-Z0-9_\-:]*$/D', $name);
    /*
     * Using \w makes the pattern shorter and, arguably, easier to read.
     */

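Warning: Probably /s modifier is missing (nested tags are not recognized)

    $content = preg_replace('#<esi\:remove>.*?</esi\:remove>#', '', $content);
    /*
     * This looks like a real bug: the dot does not match line breaks,
     * so multi-line content is not matched. The /s modifier is needed.
     */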
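
The effect of the missing modifier is easy to reproduce. The sketch below wraps the same pattern in a hypothetical helper that takes the modifiers as a parameter; the sample markup is made up.

```php
<?php
// Hypothetical helper: the same pattern, with the modifiers passed in.
function stripEsiRemove(string $content, string $modifiers = ''): string
{
    return preg_replace('#<esi\:remove>.*?</esi\:remove>#' . $modifiers, '', $content);
}

$singleLine = 'a<esi:remove>x</esi:remove>b';
$multiLine  = "a<esi:remove>\nx\n</esi:remove>b";

// On one line the pattern works as intended.
var_dump(stripEsiRemove($singleLine)); // string(2) "ab"

// Across lines "." stops at "\n", so nothing is removed.
var_dump(stripEsiRemove($multiLine) === $multiLine); // bool(true)

// With /s the dot also matches line breaks and the block is removed.
var_dump(stripEsiRemove($multiLine, 's')); // string(2) "ab"
```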


Instead of a conclusion


First, I want to emphasize that static analyzers are a tool, just like, say, Xdebug. With Php Inspections (EA Extended), for example, possible errors are analyzed while you write the code. The cost of such "adoption" is zero, and peer review and the training of colleagues become noticeably easier.

Second, I want to thank Denis Ryabov, who proposed the idea and worked out the specification for regular expression analysis in Php Inspections (EA Extended). Many thanks to him for that.

Source: https://habr.com/ru/post/260185/

