This is my first article, so please don't judge it too harshly... Greetings, dear Habr developers!
Many of you know about nginx, the wonderful lightweight web server.
Some of you also know that it can work with memcached.
But only a few know what SSI has to do with any of this, and how it can be used together with nginx and memcached.
As you know, everything new is something old that has been thoroughly forgotten. You probably know each of the tools and technologies named above. I want to talk about how and why to put them all together :)
nginx
In the classic setup, nginx is used as a load balancer or reverse proxy in front of one or more backend servers such as Apache.
nginx serves static resources (images, CSS and JS files) itself. It passes requests for dynamic pages to Apache / PHP, which processes them and returns the result.
Searching Habr for articles about nginx, I didn't find a detailed description of how this "classic" setup works, but I don't want to stray from the topic. If you want to go deeper, I advise you to search the web: there is plenty of information and tutorials out there.
memcached
nginx can work with memcached using the ngx_http_memcached_module module.
When nginx receives a request for a URL, it first checks whether memcached already holds a response for that URL. If it does, nginx returns the cached page as is; if not, it passes the request to Apache / PHP. PHP prepares the response, saves it to memcached, and returns it to the user. nginx itself only reads from memcached, so storing the page is the application's job. On the next request for the same URL, nginx finds the result in the cache and returns it directly, bypassing the resource-intensive trip to Apache / PHP.
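The lookup order described above can be sketched in a few lines of Python. This is only a toy model of the flow, not nginx or PHP code: the dictionary `cache` stands in for memcached, and `render_page`, `backend` and `frontend` are hypothetical names introduced purely for illustration.

```python
# A toy model of the "cache first, backend second" flow.
# `cache` stands in for memcached; all function names are hypothetical.

cache = {}          # URL -> cached HTML (the role memcached plays)
backend_calls = []  # records every request that reached the backend


def render_page(url):
    """Pretend to be the expensive Apache / PHP page generation."""
    return "<html>content of %s</html>" % url


def backend(url):
    """Backend role: build the response, store it in the cache, return it."""
    backend_calls.append(url)
    html = render_page(url)
    cache[url] = html  # the backend, not the front server, writes the cache
    return html


def frontend(url):
    """Front server role: answer from the cache, fall back to the backend."""
    if url in cache:
        return cache[url]
    return backend(url)


first = frontend("/articles")   # cache miss: goes to the backend
second = frontend("/articles")  # cache hit: the backend is not touched
```

After both calls, `backend_calls` contains a single entry, which is the whole point: only the first request pays for page generation.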
In theory this is all simple and convenient, but in practice whole-page caching of this kind is of very limited use.
Let's say we want to cache the Habr home page :)
On it we see the authorization block in the upper right corner (login / register, or general information about the current user), the "bookmarks" (all, collective, personal...), the feed of articles promoted to the home page, the tag cloud, and so on.
All blocks on the page can be divided, loosely, into static and dynamic. Loosely, because in fact all blocks are dynamic, but some can be cached while others cannot. Think for yourself: what happens if we cache the home page for an unauthenticated user, and then a logged-in user requests the same URL? That's right: in the upper right corner they will see an invitation to log in or register and will be surprised, to say the least.
What to do? One option is to cache not the whole page, but everything except the dynamic blocks, which we can exclude from caching. And what will help us with this is...
SSI, or Server Side Includes
If anyone still remembers, there used to be such a technology, long before JSP, ASP, PHP and the various Djangos and RoRs :)
The idea is that an ordinary HTML page is processed in a certain way on the server before it is sent to the client. For example, a typical include directive looks like this:
<!--#include file="header.html"-->
This is exactly the directive we are interested in, except that we need include virtual rather than include file. The difference is that include file is replaced by the contents of a file, while include virtual is replaced by the result of a virtual request (a subrequest) to the specified URL. In nginx, the ngx_http_ssi_module module will help us carry out this plan.
Returning to the example of the Habr home page: we put the page into the cache as is, except that the authorization block is replaced with the following directive:
<!--# include virtual="/auth" -->
Then, when nginx takes this page from the cache, it has to process all SSI directives before sending it to the client. So nginx performs a virtual request for the URL "/auth" and, since the application never stores a result for this URL in the cache, passes the request to Apache / PHP. PHP's task is now to check whether the current user is logged in and to return the HTML of the authorization block accordingly.
After receiving the result from Apache / PHP, nginx inserts it in place of the include directive and returns the page to the client. Thus logged-in and anonymous clients see different pages, while most of the content still comes from the cache. The poor Habr server no longer has to fetch the article feed, generate the tag cloud and so on for every request: it only has to render one simple authorization block ;)
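The assembly step described above can be modelled roughly as follows. This is a hedged Python sketch, not how nginx is implemented: `cache` stands in for memcached, and `backend`, `fetch`, `serve` and the user name are hypothetical. It also ignores nested includes, headers and cookies, and handles only the include virtual directive.

```python
import re

# Toy model: a cached page with an SSI placeholder is assembled per request.
# All names (`cache`, `backend`, `fetch`, `serve`) are hypothetical.

INCLUDE_RE = re.compile(r'<!--#\s*include\s+virtual="([^"]+)"\s*-->')

backend_calls = []  # URLs that actually reached the backend

# The shared page is cached once; the auth block is never put in the cache.
cache = {
    "/": '<header><!--# include virtual="/auth" --></header>'
         '<main>feed, tag cloud</main>',
}


def backend(url, user):
    """Backend role: only the cheap, user-specific fragment is built here."""
    backend_calls.append(url)
    if url == "/auth":
        return "Hello, %s" % user if user else "Log in / Register"
    raise KeyError(url)


def fetch(url, user):
    """Sub-request: try the cache first, then fall back to the backend."""
    if url in cache:
        return cache[url]
    return backend(url, user)


def serve(url, user=None):
    """Take the page from the cache and expand every include directive."""
    page = fetch(url, user)
    return INCLUDE_RE.sub(lambda m: fetch(m.group(1), user), page)


anonymous = serve("/")
logged_in = serve("/", user="alice")
```

Both calls reuse the same cached page body; the backend is asked only for "/auth", once per request, so `anonymous` and `logged_in` differ only in the header block.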
Practice
I wanted to describe practical examples of this idea right away, but the article has already turned out rather long, so I would rather write the practical part separately, with a description of the server configuration and examples in PHP & Zend Framework.
And the most impatient Habr readers already have all the links they need to dig deeper into this topic :)
PS: To be honest, I have unfortunately not used this approach to caching in production. I only have a few sketches of code, which at first glance seem to work well. If the collective mind points out bottlenecks or potential problems, I will be very grateful. Thanks for your attention!