PHP output buffer

In this article I want to explain how the output buffering layer is implemented in PHP, how it works, and how to interact with it from PHP code. The layer itself is not complicated, but many developers either do not know how to handle it at all or have only a partial picture. Everything below applies to PHP 5.4 and later. That release changed a great deal about the output buffer: the functionality was essentially rewritten from scratch, so compatibility with PHP 5.3 was only partially preserved.

What is an output buffer?

The output stream in PHP consists of bytes, usually text, that the developer wants to display. Most often this is done with the echo construct or the printf() function. First, any function that outputs something from PHP userland goes through the output buffer. PHP extensions written in C, by contrast, can call functions that write to the SAPI directly, bypassing any output buffer above it. The C API is documented in lxr.php.net/xref/PHP_5_5/main/php_output.h, which is a good source of details such as the default buffer size.

The second important point: the output buffer layer is not the only layer in which output data is buffered.
And third: depending on the SAPI you are using (web or CLI), the output buffer layer may behave differently.

Below is a diagram that will help you understand all of the above:

Here we see that three logical buffering layers control output in PHP. Two of them belong to the “output buffer” proper, and the third belongs to the SAPI. Once the output leaves PHP for the lower levels of the architecture, further buffers can appear along the way: the terminal buffer, the FastCGI buffer, the web server buffer, the operating system buffer, and the TCP/IP stack buffers. Keep this in mind. Although this article covers only PHP, a lot of software sits between PHP and the user on the data path.

An important note about the CLI SAPI: it disables PHP's default output buffer by setting the output_buffering ini parameter to 0. So unless you call one of the ob_*() functions yourself, all output in the CLI goes straight to the SAPI layer. In addition, the CLI hard-codes implicit_flush to 1. Developers often misunderstand this parameter, although the code is unambiguous: when implicit_flush is 1, the SAPI layer buffer is flushed on every write. In other words, each time you write output under the CLI SAPI, the data is immediately passed down to the lower level, written to stdout, and flushed.

Standard PHP output buffering layer

If you use a SAPI other than the CLI, for example PHP-FPM, you can experiment with three buffer-related ini parameters:

output_buffering
implicit_flush
output_handler

Note that calling ini_set() on them has no effect, because their values are read when PHP starts, before any script runs. By the time a script could call ini_set(), it is too late: the output buffer layer is already up and running. You can change these settings by editing php.ini or by passing the -d option to the PHP binary.

In the php.ini shipped with the PHP distribution, output_buffering is set to “4096” (bytes). If you run without a php.ini (or start PHP with the -n option), the default is “0”, meaning disabled. If the value is “On”, the default output buffer size (16 KB) is used.
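To check which values a given SAPI actually started with, the settings can be read back at runtime. The helper below is a minimal sketch written for PHP 7 or later; describe_output_buffering() is a hypothetical name, not a PHP function, and it relies only on ini_get(), ob_get_level() and the PHP_SAPI constant.

```php
<?php
// Hypothetical helper (not part of PHP): collects the buffering-related
// settings the current SAPI started with.
function describe_output_buffering(): array
{
    return [
        'sapi'             => PHP_SAPI,                    // e.g. "cli" or "fpm-fcgi"
        'output_buffering' => ini_get('output_buffering'), // raw ini value as a string
        'implicit_flush'   => ini_get('implicit_flush'),
        'output_handler'   => ini_get('output_handler'),
        'ob_level'         => ob_get_level(),              // number of active buffers
    ];
}

var_dump(describe_output_buffering());
```

Running the same script under the CLI and under a web SAPI is a quick way to compare the defaults described here.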

As you have probably guessed, buffering output in a web environment is good for performance. The default 4 KB is a reasonable setting: you can write up to 4096 ASCII characters before PHP has to talk to the underlying SAPI layer. In a web context, sending data byte by byte hurts performance; it is much better for the server to send the content all at once or in large chunks. The less often the layers exchange data, the better the performance. So do use an output buffer. PHP sends its contents at the end of the request, and you do not have to do anything for that to happen.

In the previous section I mentioned implicit_flush in the context of the CLI. For every other SAPI, implicit_flush is off by default. That is a good thing, since you rarely want the SAPI to flush right after every write. For the FastCGI protocol, a flush amounts to finishing and sending a packet after each write, whereas it is better to fill the FastCGI buffer completely and only then send packets. If you need to flush the SAPI buffer manually, call the PHP function flush(). To flush after every write, as mentioned above, set implicit_flush in php.ini or, alternatively, call the PHP function ob_implicit_flush() once.
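The two layers are flushed by different functions: ob_flush() moves the topmost PHP buffer down one level, while flush() asks the SAPI to send what it holds. The following is a minimal sketch of pushing a chunk out on purpose; send_now() is a hypothetical helper name, not part of PHP.

```php
<?php
// Hypothetical helper: writes a chunk and pushes it towards the client.
function send_now(string $chunk): void
{
    echo $chunk;

    // ob_flush() only works when a buffer is active, and it only moves the
    // topmost buffer's contents one level down.
    if (ob_get_level() > 0) {
        ob_flush();
    }

    // Ask the SAPI layer to send whatever it has buffered.
    flush();
}
```

If several buffers are stacked, a single ob_flush() moves the data only into the next buffer, so the chunk reaches the SAPI only when the topmost buffer is the last one. Buffers further down the path (web server, FastCGI, network) may still hold the data back.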

An output_handler callback is applied to the contents of the buffer before it is flushed. PHP extensions provide quite a few ready-made callbacks (users can also write their own, as described in a later section):

ob_gzhandler: output compression with ext/zlib
mb_output_handler: character encoding conversion with ext/mbstring
ob_iconv_handler: character encoding conversion with ext/iconv
ob_tidyhandler: HTML output cleanup with ext/tidy
ob_[inflate/deflate]_handler: output compression with ext/http
ob_etaghandler: automatic generation of ETag headers with ext/http

Only one callback can be set through output_handler. It receives the contents of the buffer and can transform them before they are output. This is good news: output buffer callbacks are a convenient way to inspect or modify the data that PHP hands to the web server, which in turn sends it to the user. By the way, by “output” I mean both headers and body: HTTP headers are also part of the output buffering layer.
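A user-written callback has the same shape as the built-in ones: it receives the buffer contents plus a bitmask describing the phase, and returns the string to pass on. The sketch below registers one with ob_start() rather than through the output_handler setting; wrap_handler() is a made-up name, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical handler: marks the first and the last chunk it sees.
function wrap_handler(string $buffer, int $phase): string
{
    $out = '';
    if ($phase & PHP_OUTPUT_HANDLER_START) {
        $out .= '[start]';
    }
    $out .= $buffer;
    if ($phase & PHP_OUTPUT_HANDLER_FINAL) {
        $out .= '[end]';
    }
    return $out;
}

ob_start();                 // outer buffer, used only to capture the result
ob_start('wrap_handler');   // inner buffer with the callback attached
echo 'hello';
ob_end_flush();             // runs wrap_handler, result lands in the outer buffer
$result = ob_get_clean();   // contents of the outer buffer
```

Because the buffer is closed after a single write, the handler runs once with both the start and the final flag set.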

Body and headers

When you use an output buffer (whether a user buffer or one of the default ones), you can send HTTP headers and content in whatever order you like. HTTP requires the headers to be sent first and the body after them, but PHP takes care of that for you as long as the output buffer layer is in use. Every PHP function that deals with headers (header(), setcookie(), session_start()) actually calls the internal function sapi_header_op(), which simply fills the header buffer. If you then write output, for example with printf(), it goes into one of the output buffers. When the buffer is finally sent, PHP sends the headers first and then the body. If you do not want PHP to do this for you, you will have to disable the output buffer layer altogether.
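The effect is easy to observe from a script: as long as the body sits in a buffer and nothing has reached the SAPI, header() is still accepted after echo. A minimal sketch, in which the header name X-Example is purely illustrative:

```php
<?php
ob_start();                        // hold the body back in a user buffer
echo '<p>Body written first</p>';  // goes into the buffer, not to the SAPI

// Headers are still accepted while nothing has reached the SAPI layer.
if (!headers_sent()) {
    header('X-Example: set-after-echo'); // hypothetical header
}

$body = ob_get_contents();         // copy of what is waiting in the buffer
ob_end_flush();                    // now the body is passed down the stack
```

Without the ob_start() call, and with the default buffer disabled, the echo would reach the SAPI first and the later header() call would trigger the familiar "headers already sent" warning under a web SAPI.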

Custom output buffers

Let's look at some examples of how this works and what you can do with it. Keep in mind that if you want to use the default PHP buffering layer, you cannot use the CLI, because the layer is disabled there.

Below is an example that exercises the default PHP layer using the built-in web server SAPI:

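<?php
/* Launched with: php -doutput_buffering=32 -dimplicit_flush=1 -S127.0.0.1:8080 -t/var/www */

echo str_repeat('a', 31); // 31 bytes: one short of filling the 32-byte buffer
sleep(3);
echo 'b';                 // 32nd byte: the buffer is full and gets flushed
sleep(3);
echo 'c';                 // stays buffered until the script ends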

We started PHP with a default output buffer of 32 bytes and immediately wrote 31 bytes into it before the first sleep() kicked in. The screen stays blank, because nothing has been sent yet. When sleep() returns, we write one more byte, which fills the buffer completely. The buffer immediately flushes itself into the SAPI layer buffer, which in turn flushes itself to the output because implicit_flush is set to 1. The string aaaaaaaaaa{31 times}b appears on the screen, and the second sleep() begins. When it finishes, a single byte is written into the now empty 32-byte buffer, after which PHP terminates and flushes the buffer. The letter c appears on the screen.

This is what the default PHP buffer looks like without calling any ob functions. Remember that it is the default buffer, so it is already in place (just not under the CLI).

Now, with ob_start() you can start user buffers, as many as you need until memory runs out. Each buffer is stacked on top of the previous one, and whenever a buffer fills up it flushes into the next one in the stack, so output cascades from buffer to buffer.

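<?php
// Outer buffer: 10-byte chunks, prefixes each flushed chunk with a counter.
ob_start(function ($ctc) {
    static $a = 0;
    return $a++ . '- ' . $ctc . "\n";
}, 10);

// Inner buffer: 3-byte chunks, capitalizes the first letter of each chunk.
ob_start(function ($ctc) {
    return ucfirst($ctc);
}, 3);

echo "fo";
sleep(2);
echo 'o';
sleep(2);
echo "barbazz";
sleep(2);
echo "hello";

/* Output:
0- FooBarbazz\n
1- Hello\n
*/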
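To see exactly when a chunked buffer calls its callback, the same idea can be reduced to a variant without sleep() that records every invocation. This is only a sketch (PHP 7 or later); the variable names $calls and $result are illustrative.

```php
<?php
$calls = [];                       // what the inner callback received, in order

ob_start();                        // outer buffer: collects the final result
ob_start(function (string $chunk) use (&$calls): string {
    $calls[] = $chunk;             // remember the chunk handed to the callback
    return strtoupper($chunk);     // and pass an upper-cased copy down
}, 4);                             // inner buffer flushes once it holds 4 bytes

echo 'ab';                         // 2 bytes: stays in the inner buffer
echo 'cd';                         // 4 bytes reached: callback runs with "abcd"
echo 'e';                          // 1 byte: stays until the buffer is closed
ob_end_flush();                    // final call of the callback with "e"

$result = ob_get_clean();          // what reached the outer buffer
```

The callback is not called once per echo but once per flush: first when the 4-byte threshold is reached, then when the buffer is closed.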

Output buffering internals

As I said, the output buffering mechanism was completely rewritten in version 5.4. Before that, the code was messy, many things were hard to do, and bugs were frequent. More information about this can be found at the link. The new code base is much cleaner and better organized, and it brought new features. Compatibility with version 5.3, however, is only partial.

Perhaps one of the most welcome additions is that extensions can now declare their own output buffer callbacks as conflicting with callbacks from other extensions. Previously there was no way to fully handle situations where other extensions might also declare callbacks.

Here is a quick example of how to register a callback that converts data to upper case:

#ifdef HAVE_CONFIG_H
#include "config.h"
#endif

#include "php.h"
#include "php_ini.h"
#include "main/php_output.h"
#include "php_myext.h"

/* Output handler: hands back an upper-cased copy of the buffered data. */
static int myext_output_handler(void **nothing, php_output_context *output_context)
{
    char *dup = NULL;

    dup = estrndup(output_context->in.data, output_context->in.used);
    php_strtoupper(dup, output_context->in.used);

    output_context->out.data = dup;
    output_context->out.used = output_context->in.used;
    output_context->out.free = 1; /* the output layer frees the copy */

    return SUCCESS;
}

/* Start the handler at the beginning of every request. */
PHP_RINIT_FUNCTION(myext)
{
    php_output_handler *handler;

    handler = php_output_handler_create_internal("myext handler", sizeof("myext handler") - 1, myext_output_handler, /* PHP_OUTPUT_HANDLER_DEFAULT_SIZE */ 128, PHP_OUTPUT_HANDLER_STDFLAGS);
    php_output_handler_start(handler);

    return SUCCESS;
}

zend_module_entry myext_module_entry = {
    STANDARD_MODULE_HEADER,
    "myext",
    NULL,             /* Function entries */
    NULL,             /* Module init */
    NULL,             /* Module shutdown */
    PHP_RINIT(myext), /* Request init */
    NULL,             /* Request shutdown */
    NULL,             /* Module information */
    "0.1",            /* Replace with version number for your extension */
    STANDARD_MODULE_PROPERTIES
};

#ifdef COMPILE_DL_MYEXT
ZEND_GET_MODULE(myext)
#endif

Pitfalls

Most of them are documented; some are fairly obvious and some are not. Among the obvious ones: you should not call any buffer functions from inside an output buffer callback, nor write output from there.

Among the less obvious ones: some PHP functions use an internal output buffer of their own, filling it and then discarding or returning its contents. Doing so pushes another buffer onto the stack. Such functions include print_r(), highlight_file() and SoapServer::handle(). You should not call them from inside callbacks, as this can lead to unpredictable behavior.

Conclusion

The output layer can be compared to a net that catches any possible “leak” of output from PHP and stores it in a buffer of a given size. When the buffer fills up, it is flushed (written) to the next lower level, if there is one, and ultimately at least to the lowest one available, the SAPI buffer. Users can control the number of buffers, their size, and the operations allowed on each buffer layer (clean, flush or remove). This is a very flexible tool that lets, for example, library and framework authors take full control of the output stream by routing it into a global buffer and processing it there. Meanwhile PHP itself takes care of the order in which headers and the output stream are sent.
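The per-buffer operations are selected with the third argument of ob_start(). A minimal sketch, assuming PHP 5.4 or later where the PHP_OUTPUT_HANDLER_* flag constants exist: the buffer below may be flushed and removed, but not cleaned.

```php
<?php
ob_start();   // outer buffer, only there to capture what gets through

// Allow flushing and removal, but not cleaning.
ob_start(null, 0, PHP_OUTPUT_HANDLER_STDFLAGS & ~PHP_OUTPUT_HANDLER_CLEANABLE);
echo 'kept';
$cleaned = @ob_clean();   // refused: returns false (the notice is suppressed)
ob_end_flush();           // allowed: the contents move to the outer buffer

$result = ob_get_clean(); // what survived in the outer buffer
```

A library can use this to protect its own buffer from being wiped by code it calls.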

By default there is one output buffer, controlled by three settings in the ini file. Its purpose is to make writes less frequent and to avoid hitting the SAPI layer, and therefore the network, too often, which improves overall performance. PHP extensions can also declare callbacks to run on each buffer, for example to compress data, replace strings, manage HTTP headers, and much more.

Source: https://habr.com/ru/post/248573/
