Writing system utilities in PHP CLI

Most specialists do not regard PHP as a serious choice for writing console utilities, and there are many reasons for that. PHP was originally designed as a language for building websites, but official support for CLI mode appeared in PHP 4.3, back in 2002, so it has long been more than a web-only language. For several years now, Badoo developers have been successfully using many interactive PHP CLI utilities.

In this article we would like to share our experience with the PHP CLI mode and give a few recommendations to those who are going to write PHP scripts intended to run on *nix systems (although almost everything also holds for Windows).

Recommendations

Work speed

PHP is widely believed to be a slow language, and it really is. For PHP CLI scripts we recommend avoiding heavy frameworks, and even just large PHP libraries, for two reasons:

In CLI mode, the time spent in include / require always includes both parsing and execution, because bytecode is not cached in this mode (at least by default). Initialization will therefore take a long time even if the same code runs quite fast under a web server.
Website users are used to waiting some time for a page to load (a delay of about 1 second, sometimes a bit more, is perceived as normal), but the same cannot be said of the CLI: even a 100 ms delay is noticeable, and a delay of 1 second or more can be annoying.

Screen output

Screen output differs significantly between CLI and web mode. In web mode, output is usually buffered, the user cannot be asked anything while the script is running, and there is no notion of an error stream at all. In CLI mode, HTML output is of course unacceptable, and long lines are highly undesirable. In the CLI, every echo is flushed immediately by default (the implicit_flush setting is enabled in this mode), which is convenient: you do not have to call flush() manually when, for example, the output is redirected to a file.
It also makes sense for CLI scripts to write errors to STDERR rather than to STDOUT (as echo does): even if the program's output is redirected elsewhere (for example, to /dev/null or into grep), the user will not miss an error message when one occurs. This is the standard behavior of most native *nix console utilities, and STDERR exists precisely for this purpose. In PHP you can write to STDERR with, for example, fwrite(STDERR, $message) or error_log($message).
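As a minimal sketch of the STDOUT / STDERR split described above: the helper name printError() and its optional $stream parameter are assumptions made for illustration (the parameter exists only so the helper can be pointed at another stream), not something taken from the article.

```php
<?php
// Hypothetical helper: write a diagnostic line to an error stream.
// Defaults to STDERR; the $stream parameter is only there to ease testing.
function printError(string $message, $stream = null): void
{
    if ($stream === null) {
        // The STDERR constant is predefined by the CLI SAPI; open it explicitly otherwise.
        $stream = defined('STDERR') ? STDERR : fopen('php://stderr', 'w');
    }
    fwrite($stream, "error: " . $message . "\n");
}

echo "regular result\n";                 // goes to STDOUT, can be piped or redirected
printError('could not open input file'); // goes to STDERR, not affected by "> /dev/null"
```

If this were saved as a script and run with its STDOUT redirected to /dev/null, only the error line would remain visible in the terminal.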

Use return codes

The return code is a number that is 0 if the command is successful and is not 0 otherwise. A return code of 1 is often used in the case of non-critical errors (for example, if the command line arguments are incorrect), and 2 in case of critical system errors (for example, when a network or disk error occurs). Values like 127 or 255 are usually used for any special cases that are reflected separately in the documentation.

By default, when a PHP script simply runs to completion, it is assumed that everything succeeded, and the script returns 0. To exit with a specific return code, call exit(NUM) explicitly, where NUM is the return code (remember that it is 0 on success and non-zero on error).

To find out that an external command run with exec() or system() has failed, pass a variable as the $return_var parameter of the corresponding function and check whether its value is zero.
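Checking the return code of an external command could look roughly like this. The wrapper name runOrFail() is hypothetical, and throwing an exception on failure is just one possible policy; the sketch assumes a *nix shell.

```php
<?php
// Hypothetical helper: run a command, return its STDOUT lines,
// and fail loudly if the command's return code is not zero.
function runOrFail(string $cmd): array
{
    $output = [];
    $code = 0;
    exec($cmd, $output, $code); // the third argument receives the return code
    if ($code !== 0) {
        throw new RuntimeException("Command failed with code $code: $cmd");
    }
    return $output;
}

$lines = runOrFail('echo hello');
echo $lines[0], "\n";
```

Note that the command's STDERR is deliberately left alone here, so its error messages still reach the user's terminal.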

Attention! If you are about to write exec('some_cmd ... 2>&1', $output) so that errors also end up in $output, we recommend reading up on why STDOUT and STDERR are kept separate and removing the explicit redirection of the error stream to STDOUT (2>&1). Such redirection is needed far less often than it may seem. The only case where it is at least somewhat justified in a PHP script is when you need to print the result of a command, including any errors, on a web page (not in the CLI!); otherwise the errors would end up in the web server's log or in /dev/null.

"Masking" under the built-in system commands

A good console utility should behave in the standard way, and its users need not even know that it is written in PHP. For this, *nix systems provide a mechanism that many people know from running Perl / Python / Ruby scripts, and that is equally applicable to PHP.

If you put, for example, the line #!/usr/bin/env php at the very beginning of a PHP file, followed by a line break, make the file executable (chmod 755 myscript.php) and remove the .php extension (the latter is optional), the file can be run like any other executable (./myscript). You can add the script's directory to PATH, or move the script to one of the standard PATH directories such as /usr/local/bin, and then it can be invoked simply by typing myscript, like any other system utility.

Processing command line arguments

There is a convention for the format of command line arguments that most built-in system utilities follow, and we recommend that your scripts follow it too.

Print a brief usage message if your script receives an incorrect number of arguments.

To find out the name under which the script was called, use $argv[0]:

    if ($argc != 2) {
        // do not forget the trailing \n
        echo "Usage: " . $argv[0] . " <filename>\n";
        // a non-zero return code signals an error
        exit(1);
    }

For easier flag handling you can use getopt(), one of the built-in functions for processing command line arguments. On the other hand, processing some of the arguments by hand is not difficult in PHP. This may be necessary if you need to handle arguments in the style of ssh or sudo (sudo -u nobody echo Hello world runs echo Hello world as the user nobody, who is specified after the -u flag and before the command).
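Manual processing in the sudo style might be sketched as follows. The function name parseSudoStyle(), the single supported -u option and the shape of the returned array are all assumptions chosen for illustration.

```php
<?php
// Hypothetical sudo-like parser: options are read only up to the first
// non-option word; everything after that belongs to the command to run.
function parseSudoStyle(array $argv): array
{
    $user = null;
    $i = 1;                  // $argv[0] is the script name
    $n = count($argv);
    while ($i < $n && $argv[$i] !== '' && $argv[$i][0] === '-') {
        $arg = $argv[$i];
        if ($arg === '--') { // conventional end-of-options marker
            $i++;
            break;
        }
        if ($arg === '-u') {
            if ($i + 1 >= $n) {
                throw new InvalidArgumentException('-u requires a user name');
            }
            $user = $argv[$i + 1];
            $i += 2;
            continue;
        }
        throw new InvalidArgumentException("Unknown option: $arg");
    }
    return ['user' => $user, 'command' => array_slice($argv, $i)];
}

$parsed = parseSudoStyle(['mysudo', '-u', 'nobody', 'echo', 'Hello', 'world']);
// $parsed['user'] is 'nobody'; $parsed['command'] is ['echo', 'Hello', 'world']
```

Because parsing stops at the first word that does not start with a dash, flags that belong to the wrapped command (for example, ls -l) are passed through untouched.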

More advanced recommendations

Calling the "correct" system () for CLI

The implementation of system() has already been discussed elsewhere. The point is that the standard system() in PHP is not a call to the C system() function but a wrapper over popen(), so it “spoils” the STDIN and STDOUT of the called program. To prevent this, use the following function:

    // Equivalent of system() in C: the child shares our STDIN, STDOUT and STDERR
    function cSystem($cmd)
    {
        $pp = proc_open($cmd, array(STDIN, STDOUT, STDERR), $pipes);
        if (!$pp) {
            return 127;
        }
        return proc_close($pp);
    }

Working with the file system

It may come as a surprise, but we recommend not writing your own implementation of recursive deletion (copying, moving) of files, and using the system commands mv, rm and cp instead (or their counterparts under Windows). This is not portable between Windows and *nix, but it avoids some of the problems described below.

Let's look at a simple example of recursive directory deletion implemented in PHP:

    // Do not use this! Use rm -r instead
    function recursiveDelete($path)
    {
        if (is_file($path)) {
            return unlink($path);
        }
        $dh = opendir($path);
        while (false !== ($file = readdir($dh))) {
            if ($file != '.' && $file != '..') {
                recursiveDelete($path . '/' . $file);
            }
        }
        closedir($dh);
        return rmdir($path);
    }

At first glance everything looks right, doesn't it? Moreover, deleting a folder is implemented this way even in well-known PHP file managers (eXtplorer, for example) and in the comments to the documentation. Now create a symbolic link to a non-existent file (ln -s some_test other_test) and try to delete it. Or create a symbolic link inside the folder that points to the folder itself, or to the root of the file system (we recommend not testing that last option)... For recursiveDelete() specifically the fix is, of course, trivial, but it is clear that it is better not to reinvent the wheel and to use the system commands, even at some cost in performance.
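Delegating the deletion to the system rm command could look like the sketch below (*nix only). The helper name removeTree() and the temporary directory used for the demonstration are assumptions for illustration.

```php
<?php
// Hypothetical helper: delegate recursive deletion to the system rm (*nix only).
function removeTree(string $path): bool
{
    // escapeshellarg() keeps spaces and quotes in $path away from the shell;
    // "--" stops rm from treating a path that starts with "-" as an option.
    exec('rm -rf -- ' . escapeshellarg($path), $output, $code);
    return $code === 0;
}

// Demonstration: a directory containing a symbolic link to a non-existent file.
$dir = sys_get_temp_dir() . '/' . uniqid('rmtest_', true);
mkdir($dir);
symlink($dir . '/missing', $dir . '/dangling');
$removed = removeTree($dir);
```

rm removes the symbolic link itself rather than following it, which is exactly the case the hand-written version gets wrong.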

Cleaning in case of errors

If your script works with files (databases, sockets, and so on), it often needs to shut down cleanly when an unexpected error occurs: write to a log, remove temporary files, release file locks, etc.

In PHP web mode this is done with register_shutdown_function(), which runs even when the script terminates with a fatal error (incidentally, this method is suitable for catching almost any error, including out-of-memory errors). In CLI mode things are a bit more complicated, since the user can, for example, press Ctrl+C, and then the function registered with register_shutdown_function() will not be called.

The explanation is simple: by default PHP does not handle UNIX signals at all, so receiving such a signal terminates the script immediately. You can fix this by adding declare(ticks = 1); at the beginning of the file, right after <?php, and registering your own handlers for the signals you are interested in:

    pcntl_signal(SIGINT, function () { exit(1); });  // Ctrl+C
    pcntl_signal(SIGTERM, function () { exit(1); }); // killall myscript / kill <PID>
    pcntl_signal(SIGHUP, function () { exit(1); });  // terminal hangup

The signal handlers do not have to be the same for every signal. You may also choose not to call exit() inside a handler; script execution then continues after the signal has been handled.
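A handler that does not call exit() can simply record the signal and let the main loop finish its cleanup. The sketch below assumes the pcntl and posix extensions are available; instead of ticks it polls for pending signals with pcntl_signal_dispatch(), which is a swapped-in alternative rather than the approach described above. The variable names and the temporary file standing in for "a resource that needs cleanup" are hypothetical.

```php
<?php
// Sketch: handlers only record the signal; the main loop decides when to stop.
$stopSignal = null;

$remember = function (int $signo) use (&$stopSignal): void {
    $stopSignal = $signo;
};
pcntl_signal(SIGINT, $remember);
pcntl_signal(SIGTERM, $remember);

// Stand-in for any resource that has to be cleaned up.
$tmpFile = tempnam(sys_get_temp_dir(), 'job');

// For the demonstration only: simulate "kill <PID>" aimed at ourselves.
posix_kill(posix_getpid(), SIGTERM);

for ($step = 0; $step < 5 && $stopSignal === null; $step++) {
    // ... one unit of work would go here ...
    pcntl_signal_dispatch(); // run the handlers for any pending signals
}

unlink($tmpFile); // cleanup runs whether the loop finished or was interrupted
```

Polling at a known point in the loop means the cleanup code never runs in the middle of a unit of work, at the price of reacting to the signal only between steps.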

Working with a database in several processes (after fork())

The recommendation is very simple: close all database connections before you call fork() (ideally there should not even be any files opened with fopen()), because forking with open resources can lead to very strange consequences; for a database connection it will simply result in the connection being closed as soon as any of the forked processes finishes. The SQLite documentation, for example, explicitly states that a resource opened before fork() must not be used in the forked processes, because SQLite does not support concurrent access of this kind. In any case, pcntl_fork() in PHP simply calls fork() and reports errors, so it must be handled as carefully as in C.
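One way to follow this rule is to open every resource only inside the child, after the fork. The sketch below assumes the pcntl extension; the helper name runInChild() is hypothetical, and the database connection is only hinted at in a comment.

```php
<?php
// Sketch: fork first, open resources afterwards, separately in each process.
function runInChild(callable $work): int
{
    $pid = pcntl_fork();
    if ($pid === -1) {
        throw new RuntimeException('fork failed');
    }
    if ($pid === 0) {
        // Child: connections are opened here, inside $work, never before the fork.
        exit((int) $work());
    }
    // Parent: wait for the child and report its return code (-1 if it was killed).
    pcntl_waitpid($pid, $status);
    return pcntl_wifexited($status) ? pcntl_wexitstatus($status) : -1;
}

$code = runInChild(function (): int {
    // Hypothetical: a database connection would be created here, after the fork.
    return 0;
});
```

The parent never holds a connection of its own while children are running, so nothing is shared across the fork.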

Using ncurses for complex rendering to the screen

The ncurses library was created precisely so that you do not have to deal with escape sequences to control the cursor position in the terminal, and so that a program which uses, say, color remains portable across systems and terminals. On the other hand, even for something as simple as colored output you have to keep in mind that STDOUT does not always support colors. We know of one primitive, though unreliable, way to find out without ncurses whether color is supported: check whether STDOUT is a terminal (posix_isatty(1)).
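The terminal check can be combined with a small helper that adds ANSI color codes only when they are wanted. The helper name colorize() is hypothetical, and, as noted above, the check itself is only a heuristic.

```php
<?php
// Hypothetical helper: wrap text in an ANSI color sequence only when enabled.
function colorize(string $text, int $ansiCode, bool $enabled): string
{
    return $enabled ? "\033[{$ansiCode}m{$text}\033[0m" : $text;
}

// posix_isatty() needs the posix extension; without it, assume no color support.
$useColor = function_exists('posix_isatty') && posix_isatty(STDOUT);
echo colorize('done', 32, $useColor), "\n"; // 32 is the ANSI code for green
```

When the output is piped into another program or redirected to a file, the check fails and the text is printed without escape sequences.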

Amount of output

Most standard programs print almost nothing unless explicitly asked to with the -v (verbose) option. Indeed, you should not clutter the screen without a reason. Finding the balance is not easy, but there are a few simple recommendations:

If the operation does not take long (less than 10 seconds), do not print anything at all;
If you are doing something non-trivial (for example, mounting temporary devices using sudo), do tell the user about it, so that they know what to do in case of an error;
If the operation is long and its progress can be shown, it is better to show it (the cSystem function mentioned above can be useful here);
If the program can work as a filter (like cat, grep, gzip...), make sure that only data goes to STDOUT, while errors, input prompts and so on go to STDERR, so that the next programs in the pipeline do not receive any garbage.

To show progress, you can do what git does: assume that every terminal is at least 80 characters wide and print a string of fixed width. Given that the carriage return character (\r) moves the cursor to the beginning of the line (and subsequent output overwrites what was on the line before), it is very easy to write code that displays, for example, the completion percentage from 0 to 100 while occupying only one line on the user's screen:

    for ($i = 0; $i <= 100; $i++) {
        printf("\r%3d%%", $i);
        sleep(1);
    }
    echo "\n";

Determining the name of the user who called the script

The username is available in the USER environment variable ($_ENV['USER']), but there is a catch: environment variables can contain incorrect data (the user can run the script as, say, USER=root myscript, and the script will believe that the username is root).

Therefore, you should use the posix functions:

    // posix_getuid() returns the real user ID of the process; it does not depend on environment variables
    $info = posix_getpwuid(posix_getuid());
    $login = $info['name'];

Conclusion

In this article we tried to give recommendations that are not entirely obvious and that are addressed specifically to PHP developers rather than to all programmers who write console utilities. Still, much of the above applies to other programming languages as well, so some of the points may be useful even to those who are not going to write in PHP.

Yuriy youROCK, developer at Badoo

Source: https://habr.com/ru/post/136846/
