
Many developers know the situation: a page could be cached for, say, 5-10 minutes, except that one small block on it has to stay fresh, if not in real time then with no more than 5-10 seconds of staleness. Meanwhile site traffic keeps growing, page generation time grows with it, and something has to be done about it ...
- Solution 1: Tune something you have not gotten around to in the last six months. Everyone will be impressed and move you on to other tasks. You get to play "Superman", single-handedly saving the site from exorbitant load and solving the problem "for free" (without spending more on hardware). You may find the article “Tuning nginx” useful.
- Solution 2: Upgrade the hardware (add RAM to the server, improve the disk subsystem, move the database to a separate server). This does not really solve the problem, it only postpones it. You gain time to "dig in" and prepare for the next wave of load, which will last longer and hit harder.
- Solution 3: Your own option, which I will probably learn about from the comments.
Let me offer a proven and relatively simple solution based on one of the oldest technologies in web development.
How it should work
The site can always be divided into a number of independent blocks, which can be generated (if necessary) by different servers.
A “collector” assembles the blocks into a single page. If some block was not generated within the time allotted to it, that is no reason to return a “Gateway Timeout” or “Internal Server Error” to the client: the collector can assemble the blocks that were generated successfully and show outdated content from the cache in place of the "bad" one.
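The idea can be illustrated with a minimal Python sketch. It is not how nginx is implemented; it only mimics the behaviour described above. The names `assemble`, `render_block` and `stale_cache` are hypothetical, and the include syntax is the one nginx SSI uses (`<!--# include virtual="..." -->`).

```python
import re

# Matches nginx-style SSI include directives.
INCLUDE_RE = re.compile(r'<!--#\s*include\s+virtual="([^"]+)"\s*-->')

def assemble(frame, render_block, stale_cache):
    """Expand every include in `frame`; on failure use the outdated copy."""
    def expand(match):
        uri = match.group(1)
        try:
            html = render_block(uri)   # ask the backend for a fresh block
            stale_cache[uri] = html    # remember it for future failures
            return html
        except Exception:
            # Backend failed or timed out: show outdated content if we have any.
            return stale_cache.get(uri, "")
    return INCLUDE_RE.sub(expand, frame)

# Hypothetical backend in which one block always times out.
def backend(uri):
    if uri == "/news.php":
        raise TimeoutError("backend too slow")
    return "<div>fresh " + uri + "</div>"

cache = {"/news.php": "<div>old news</div>"}
frame = '<!--# include virtual="/menu.php" --><!--# include virtual="/news.php" -->'
page = assemble(frame, backend, cache)
```

Here the page is still assembled even though one block failed: the menu is fresh and the news block comes from the outdated copy.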
To implement such a model, we need a veteran of web development: SSI. The “collector”, as the title of the article suggests, is nginx. The “miracles” are made possible by nginx's fastcgi_cache support.
So, let's go:
Eliminate the extra link
We will not need Apache, whose presence is usually justified by the use of RewriteRules.
nginx has its own counterpart to mod_rewrite, as well as a combination of location / alias with regular expressions, which lets you write an equivalent of any Apache RewriteRule. Besides, in modern frameworks the engine can parse the incoming URL itself (for example, Zend_Controller_Router_Rewrite in the Zend Framework).
Any platform can be used as the fastcgi backend. The examples are in PHP, but similar code can be written just as well in Python or Perl.
Run php in fastcgi mode:
# /bin/su -m www_user -c "PHP_FCGI_CHILDREN=8 /usr/bin/php-cgi -q -b 127.0.0.1:7777 &"
You can also set the path to the log file in php.ini (error_log = /var/log/fastcgi/fastcgi.log), but then you will have to restart php-cgi.
To do so, stop it:
# killall php-cgi
and start it again with the command above.
A more advanced way to run fastcgi is to install php-fpm.
Install nginx
You can install the stock package from the repository or ports. But if you want to be able to "purge" an arbitrary file from the cache, you will have to compile nginx yourself.
We need a module: ngx_cache_purge.
I will describe in detail how this can be done on a RedHat-like system; you can build it for your own system by analogy.
# cd ~/rpmbuild/SRPMS
# yumdownloader --source nginx
# rpm -ivh nginx-0.7.65-1.fc12.src.rpm
Edit the nginx.spec file: somewhere in the list of ./configure options, insert the line
--add-module=/root/rpmbuild/BUILD/ngx_cache_purge-1.0 \
You can also delete the lines for modules you do not need (for example --with-ipv6 \, --with-http_dav_module \, --with-mail \, --with-mail_ssl_module \ ...).
Now unpack the contents of http://labs.frickle.com/files/ngx_cache_purge-1.0.tar.gz into the /root/rpmbuild/BUILD/ngx_cache_purge-1.0 folder.
Now everything can be compiled (rpm -ivh places the spec file in the SPECS folder):
# cd ~/rpmbuild/SPECS
# rpmbuild -ba nginx.spec
This is not an elegant way, because the resulting .src.rpm will not contain the ngx_cache_purge module sources. If that matters to you, you can download the “correct” nginx .src.rpm for the 8.xx branch. Note that I commented out some modules I did not need in it.
Install the rebuilt nginx on our server:
# rpm -ivh nginx-0.7.65-1.fc12.x86_64.rpm
Setting up nginx for a project in php
In the /etc/hosts file, add:
# Virtual hosts
127.0.0.1 myproject
In the main config, /etc/nginx/nginx.conf, add to the http section:
fastcgi_cache_path /var/spool/nginx/cache levels=1:2 keys_zone=mycache:64m;
include /etc/nginx/conf.d/*.conf;
(Do not forget to create the /var/spool/nginx/cache folder and make it owned by the user nginx runs as.)
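To make the levels=1:2 parameter less abstract, here is a small Python sketch of how a cached response ends up on disk. It relies on the documented nginx behaviour that the file name is the MD5 of the cache key and that the directory levels are taken from the end of that hash. The function name `cache_file_path` is hypothetical, and the key is assumed to be the plain string produced by a `$uri$is_args$args` cache key.

```python
import hashlib

def cache_file_path(cache_root, key, levels=(1, 2)):
    """Return the on-disk path for a cache key under nginx-style levels."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    parts = []
    end = len(digest)
    for width in levels:
        # Each level is cut from the tail of the hash, right to left.
        parts.append(digest[end - width:end])
        end -= width
    return "/".join([cache_root.rstrip("/")] + parts + [digest])

# Assumed key for the page /mypage.php?lang=en
path = cache_file_path("/var/spool/nginx/cache", "/mypage.php?lang=en")
```

Knowing the path is handy for checking by hand whether a page actually landed in the cache.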
In the /etc/nginx/conf.d/ folder we create the configs for the virtual hosts.
Example config (/etc/nginx/conf.d/myproject.conf):
server {
    listen 80;
    server_name myproject;
    root /var/www/myproject/public;
    ssi on;

    # Turn on cache if necessary
    fastcgi_cache mycache;
    fastcgi_cache_min_uses 1;
    # Cache time is zero: the cache is enabled, but nothing is cached by default.
    # The cache time for specific pages is set in the "Cache-Control" header.
    fastcgi_cache_valid 200 0m;
    fastcgi_cache_valid 404 1m;
    fastcgi_cache_valid 500 0m;
    # Serve the cached copy (even if it is outdated) in case of an error
    fastcgi_cache_use_stale error timeout invalid_header http_500;
    fastcgi_cache_key $uri$is_args$args;

    # Uncomment this section if nginx is built with the ngx_cache_purge module
    # location ~ ^/purge(/.*) {
    #     fastcgi_cache_purge mycache $1$is_args$args;
    # }

    location ~ /(img|css|js|assets) {
        # access_log off;
        access_log /var/log/nginx/myproject_img_access.log main;
        expires 1h;
    }

    location / {
        access_log /var/log/nginx/myproject_main_access.log main;
        error_log /var/log/nginx/myproject_error.log;
        fastcgi_pass 127.0.0.1:7777;
        fastcgi_index index.php;
        include fastcgi.conf;
    }
}
Install the PHP test project in /var/www/myproject.
The source code of the sample can be viewed and downloaded here.
Run nginx. For RedHat-like systems, it looks like this:
# service nginx start
That's it, the system is ready to go! Try opening http://myproject/
Teach the backend to manage caching time
The point is that in nginx the cache time is set by the fastcgi_cache_valid 200 0m; directive and applies to all pages whose response headers do not override it.
In the “default” config I set the caching time to 0, i.e. caching is disabled. But if the backend generates a header like this:
Cache-Control: public, max-age=20
or
Expires: Thu, 18 Mar 2010 20:57:07 GMT
then nginx will cache the page for 20 seconds (or, with Expires, until the given time). In PHP, the header can be set with the header() function.
(In nginx, "X-Accel-Expires" has the highest priority, then "Cache-Control", then "Expires".)
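The priority rule can be written down as a short Python sketch. This is a simplified model of the order given above, not nginx's actual code: the function name `cache_lifetime` is hypothetical, the highest-priority header is assumed to be the one nginx documents as X-Accel-Expires, and its "@timestamp" form is deliberately not handled.

```python
import re
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def cache_lifetime(headers, default=0, now=None):
    """Return how many seconds a response may be cached (simplified model)."""
    now = now or datetime.now(timezone.utc)
    h = {name.lower(): value for name, value in headers.items()}
    # 1. X-Accel-Expires: plain number of seconds (the "@time" form is ignored here).
    if "x-accel-expires" in h:
        return max(int(h["x-accel-expires"]), 0)
    # 2. Cache-Control: directives that forbid caching, or max-age.
    if "cache-control" in h:
        cc = h["cache-control"].lower()
        if any(word in cc for word in ("no-cache", "no-store", "private")):
            return 0
        match = re.search(r"max-age\s*=\s*(\d+)", cc)
        if match:
            return int(match.group(1))
    # 3. Expires: absolute time, converted to seconds from now.
    if "expires" in h:
        expires = parsedate_to_datetime(h["expires"])
        return max(int((expires - now).total_seconds()), 0)
    # Nothing in the headers: fall back to the configured default.
    return default

fixed_now = datetime(2010, 3, 18, 20, 56, 47, tzinfo=timezone.utc)
from_max_age = cache_lifetime({"Cache-Control": "public, max-age=20"}, now=fixed_now)
from_expires = cache_lifetime({"Expires": "Thu, 18 Mar 2010 20:57:07 GMT"}, now=fixed_now)
```

With the fixed clock above, both example headers from the text resolve to the same 20-second lifetime.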
Let's write a small function that will manage the caching time:
function cacheHeaders($lifetime = 0) {
    # Alternative: send an absolute expiry time instead of max-age
    # $date = gmdate('D, d M Y H:i:s', time() + $lifetime);
    # header('Expires: ' . $date . ' GMT');
    header('Cache-Control: public, max-age=' . $lifetime);
}
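For a backend that is not written in PHP, the same helper might look like the Python sketch below. The function name `cache_headers` and the idea of returning a list of header pairs (as a WSGI application would pass to start_response) are assumptions for illustration; unlike the PHP version, it emits both headers.

```python
import time
from email.utils import formatdate

def cache_headers(lifetime=0, now=None):
    """Build response headers that ask nginx to cache for `lifetime` seconds."""
    now = time.time() if now is None else now
    return [
        ("Cache-Control", "public, max-age=%d" % lifetime),
        # Same moment expressed as an absolute HTTP date in GMT.
        ("Expires", formatdate(now + lifetime, usegmt=True)),
    ]

# A block that may be up to 20 seconds stale.
headers = cache_headers(20)
```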
Master blocks
Any logical piece of HTML code without the standard HTML page headers will be called a block, for example:
<div>
This is a simple block.
</div>
To visually monitor the freshness of each block, our test blocks display the time at which they were generated:
<?php echo date('G:i:s') ?>
Take a look at the working example that uses SSI blocks.
Remove pages from the cache
Unfortunately, nginx has no built-in way to delete pages from the cache, which is sometimes inconvenient.
If you added the ngx_cache_purge module at compile time, add roughly the following section to the config (/etc/nginx/conf.d/myproject.conf), before the “location / {...” section:
location ~ ^/purge(/.*) {
    #allow 127.0.0.1;
    #allow 10.1.1.0/24;
    #deny all;
    fastcgi_cache_purge mycache $1$is_args$args;
}
To delete the cached page http://myproject/mypage.php?lang=en, I just need to request http://myproject/purge/mypage.php?lang=en
In PHP, this can be done with file_get_contents("http://myproject/purge/mypage.php?lang=en");
With the allow and deny directives, you can restrict the hosts that are allowed to “purge” the cache.
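The same purge request can be issued from Python. The sketch below assumes the /purge location from the config above; the function names `purge_url` and `purge` are hypothetical, and the status codes the module returns are simply passed through rather than interpreted.

```python
from urllib.error import HTTPError
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen

def purge_url(page_url, prefix="/purge"):
    """Turn a page URL into the matching purge URL on the same host."""
    parts = urlsplit(page_url)
    path = prefix + (parts.path or "/")
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))

def purge(page_url, timeout=5):
    """Request the purge URL and return the HTTP status code."""
    try:
        with urlopen(purge_url(page_url), timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        # Non-2xx answers (e.g. access denied) arrive as exceptions.
        return err.code

target = purge_url("http://myproject/mypage.php?lang=en")
```

Only `purge_url` is exercised here; calling `purge` needs a running nginx with the module compiled in.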
Testing
As a reminder, the link for tests: http://linux.ria.ua/SsiBlocks/src/bin/index.php.
Note that the page “frame” is refreshed every 10 seconds, while the other blocks are refreshed according to the notes under each block's creation time.
The most interesting one, in my opinion, is the “Zboy block”. If you switch it into failure simulation mode, you will still see the cached, outdated version of this block until you clear the cache.
Also, remember that you are not the only one experimenting with this page right now; if you want to experiment freely, set up a local copy of the sample yourself.
Draw conclusions
Even if this approach seems primitive to you and its functionality very limited, note that it works not just fast, but very fast!
The only likely bottleneck is the disk subsystem, if the cache "swells" so much that it no longer fits into the operating system's disk cache.
PS: If this article proves interesting to readers, I plan to write a second part about applying this approach to block caching in the Zend Framework.