It all started when I became a system administrator at a provincial Internet service provider. In addition to administering various other resources, I looked after one young but rapidly growing one. It was a classic LAMP project: a site where the content was generated by regular users. By the way, at that time I knew nothing about *nix systems, although every server I inherited ran one; I picked it all up quickly enough.
As usually happens with resources that are gaining popularity, the hardware everything runs on stops coping. The resource lived on an old dual-processor server that also ran almost all the other user-facing services. At that time, management did not see the resource as a worthwhile investment, so, to my regret (and, as it later turned out, to my good fortune), I was given no money for new hardware.
nginx + php-fpm
Google came to the rescue. As it turned out, high-load projects run PHP in FastCGI mode. There are several ways to implement such a setup, but in the end I chose Russian developments: the combination of nginx (Igor Sysoev) and php-fpm (Andrey Nigmatulin). The performance gain over Apache with mod_php comes from the fact that php-fpm starts n PHP processes, which then stay resident in the system and process the scripts handed over by the web server. This scheme saves the time and system resources otherwise spent on starting the PHP interpreter. Don't ask why I didn't choose spawn-fcgi with lighttpd; I don't want to start a holy war on this topic, and it doesn't really matter. System performance increased, the load subsided, and I breathed a sigh of relief for a while.
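The resident-worker idea can be shown with a small Python sketch. It is only an analogy under stated assumptions: threads stand in for php-fpm's processes, and the `Worker` class, `serve_with_pool` function, and script names are hypothetical, not part of php-fpm. The point is that the costly start-up runs once per worker rather than once per request.

```python
import queue
import threading


class Worker:
    """Hypothetical stand-in for one resident PHP process."""

    startups = 0                 # how many times the costly start-up ran
    _lock = threading.Lock()

    def __init__(self):
        # Imagine interpreter start-up and extension loading happening here.
        with Worker._lock:
            Worker.startups += 1

    def handle(self, script):
        return f"output of {script}"


def serve_with_pool(scripts, n_workers=4):
    """Process all scripts with n_workers long-lived workers."""
    tasks = queue.Queue()
    results = {}
    results_lock = threading.Lock()

    def loop():
        worker = Worker()        # started once, then reused for many requests
        while True:
            script = tasks.get()
            if script is None:   # sentinel: no more work for this worker
                break
            output = worker.handle(script)
            with results_lock:
                results[script] = output

    threads = [threading.Thread(target=loop) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for script in scripts:
        tasks.put(script)
    for _ in threads:
        tasks.put(None)
    for t in threads:
        t.join()
    return results
```

Serving 100 requests through `serve_with_pool` with four workers triggers the start-up four times, whereas a start-per-request model would trigger it 100 times.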
The next step toward high performance was installing a PHP accelerator. Its job is to cache the compiled bytecode of a script. Indeed, why waste precious processor time translating a script into bytecode on every call? The same script could be called up to 100 times per second, so eAccelerator came in handy. After it was installed, system performance increased again, the load dropped noticeably, page generation time fell sharply, and users were once again happy with a faster resource.
I should say that before joining the company I had dabbled in PHP, so I knew the resource's code well and understood what it needed. But I also understood that, like any code, it had bottlenecks, and I decided to look for them. Once I added timers to the project that measured PHP execution time and SQL query execution time, the bottleneck was found immediately. The resource was based on an open-source CMS, and judging by the code, some of its developers had no idea about indexes in MySQL. So began the long process of identifying and reworking problem queries and adding indexes where they were really needed. EXPLAIN became my companion for the next few days. As a result, the execution time of SQL queries was reduced by a factor of 10–20 in places, and happy days began again for our dear users.
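The effect of a missing index is easy to reproduce. The sketch below uses SQLite from the Python standard library so that it runs anywhere; it is not MySQL, SQLite's `EXPLAIN QUERY PLAN` output differs from MySQL's `EXPLAIN`, and the `posts` table, its columns, and the index name are invented for illustration.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's plan description for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the human-readable detail.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT)"
)
conn.executemany(
    "INSERT INTO posts (author_id, title) VALUES (?, ?)",
    [(i % 100, f"post {i}") for i in range(1000)],
)

sql = "SELECT title FROM posts WHERE author_id = ?"

# Without an index the engine has to scan the whole table.
plan_before = query_plan(conn, sql, (7,))

# With an index on the filtered column it can search the index instead.
conn.execute("CREATE INDEX idx_posts_author ON posts (author_id)")
plan_after = query_plan(conn, sql, (7,))

print("before:", plan_before)
print("after: ", plan_after)
```

The first plan reports a full scan of `posts`; the second reports a search using `idx_posts_author`. The same before-and-after comparison is what EXPLAIN makes possible in MySQL.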
By the time the load had settled back to normal, management had already allocated a full-fledged server for the resource. But by then I had a sporting interest. When page generation time drops from 2 seconds to 0.5 seconds, it is very, very inspiring, but I wanted more. I wanted to get rid of heavy SQL queries altogether, leaving only a critical minimum. Browsing the web, I came across a talk by Andrei Smirnov, “Web, caching and memcached” (presented at HighLoad++ 2008). This was exactly what I needed! Memcached, a simple yet high-performance caching server originally developed for livejournal.com, fit into my scheme at just the right time. Unfortunately, I could not use memcached to its full potential, because my work was not limited to this one web resource, but a lot still got done. I used memcached to cache the results of SQL queries and to store ready-made, rendered blocks. On many pages of the site, generation time dropped to astonishingly small numbers: 0.009 s! This was my biggest discovery and achievement of the whole effort.
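The pattern described here, checking the cache first and running the query only on a miss, can be sketched as follows. To stay self-contained the example uses a tiny in-process `TTLCache` class as a stand-in for a memcached client; that class, the `cached` helper, and the key name are hypothetical, and a real deployment would talk to a memcached server instead.

```python
import time


class TTLCache:
    """In-process stand-in for a memcached client: get/set with expiry."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if time.monotonic() >= expires_at:
            del self._data[key]      # expired entries behave like misses
            return None
        return value

    def set(self, key, value, ttl):
        self._data[key] = (value, time.monotonic() + ttl)


def cached(cache, key, ttl, compute):
    """Return the cached value for key, computing and storing it on a miss.

    Note: a computed value of None is never cached in this simple version.
    """
    value = cache.get(key)
    if value is None:
        value = compute()            # e.g. a heavy SQL query or block render
        cache.set(key, value, ttl)
    return value
```

A page handler would then call something like `cached(cache, "sidebar:top_posts", 60, render_top_posts)`, where `render_top_posts` is whatever function runs the query and renders the block; within the 60-second lifetime the heavy work runs only once.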
Sysctl and unix-sockets
Another important step in the struggle for the quality of the resource was sysctl tuning. Under high load, the default configuration leaves much to be desired. I spent a couple of weeks finding the optimal parameters for the network subsystem. In addition, wherever possible, php-fpm, memcached, and MySQL were switched to unix sockets. As a result, the server serves content at peak load as quickly as it does with no load.
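For services on the same host, a unix socket is addressed by a filesystem path rather than an IP address and port, so the traffic does not pass through the TCP/IP stack. The Python sketch below shows a single request and reply over such a socket; the `demo.sock` file name and the `roundtrip` and `echo_once` helpers are invented for illustration and have nothing to do with the actual php-fpm, memcached, or MySQL configuration. It assumes a Unix-like system.

```python
import os
import socket
import tempfile
import threading


def echo_once(server_sock):
    """Accept one connection and reply with the upper-cased request."""
    conn, _ = server_sock.accept()
    with conn:
        conn.sendall(conn.recv(1024).upper())


def roundtrip(message: bytes) -> bytes:
    """Send message to a local server over a unix socket, return the reply."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sock")   # a path, not host:port
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
            server.bind(path)
            server.listen(1)
            t = threading.Thread(target=echo_once, args=(server,))
            t.start()
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
                client.connect(path)
                client.sendall(message)
                reply = client.recv(1024)
            t.join()
    return reply
```

The client code is the same as for TCP apart from the address family and the address itself, which is why switching a local service to a unix socket is usually only a configuration change.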
By the time I had mastered caching, I wondered whether anything else could be sped up. Naturally! Search was the weakest point of the site. Guess why? Right! It used a terrible evil: LIKE in an SQL query. This is where the Sphinx search engine came to the rescue! Unfortunately, in my case Sphinx ended up being used only for suggestions (hints shown while typing a search query), because the main SQL table was updated very frequently and the freshness of the data was very important, so for the main search it had to be abandoned. However, if I had had more time for a detailed analysis of this point, perhaps I would have solved the problem, and Sphinx would have become another key point in the development of the service.
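Why a substring LIKE is so costly, and what a search engine does differently, can be shown with a toy example. The sketch below is not Sphinx and not how Sphinx is implemented; it only contrasts a scan that examines every row with a lookup in a prebuilt inverted index. The function names and sample titles are made up.

```python
from collections import defaultdict


def like_search(titles, word):
    """Rough equivalent of LIKE '%word%': every row is examined."""
    word = word.lower()
    return [i for i, title in enumerate(titles) if word in title.lower()]


def build_index(titles):
    """Build an inverted index: token -> set of row numbers containing it."""
    index = defaultdict(set)
    for i, title in enumerate(titles):
        for token in title.lower().split():
            index[token].add(i)
    return index


def index_search(index, word):
    """Look the word up directly instead of scanning all rows."""
    return sorted(index.get(word.lower(), ()))


titles = ["Fast nginx setup", "Tuning MySQL indexes", "nginx and php-fpm"]
index = build_index(titles)
```

Here `like_search(titles, "nginx")` and `index_search(index, "nginx")` return the same rows, but the first touches every title on every query, while the second costs one dictionary lookup. The price is that the index matches whole tokens only and has to be updated whenever the data changes, which is the freshness problem mentioned above.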
MySQL in tmpfs
A lot of time has passed since all that tuning and optimization. The resource became significant for the company, money was allocated for its support, and it was now served by two servers. The first acted as the web server and processed PHP; the second was dedicated to MySQL. But development did not stand still, and once the resource reached 600 hits per second, both parts stopped coping. There were bottlenecks both at the PHP level and at the MySQL level. After we received a third server, the question of scaling arose, but I could not think of an ideal option. Then, on the pages of Habr, I saw a post about MySQL in tmpfs. And I thought, why not? I did some preparatory work on the database. I shrank it as much as possible by removing unimportant but "gluttonous" features, along with some of the logging done in the database. In the end, the size of the database was reduced from 11 GB to 2.5 GB. So it was decided to use two servers for PHP and to run MySQL on the third, “doped” with tmpfs. The standard mysqlhotcopy utility handles backups very well (1.5 seconds and you're done!). And so we did. The load on MySQL dropped fourfold, and performance increased as well.
Why did I decide to tell all this? Perhaps this article will interest people who are facing similar problems. If I had found something like it back then, it would have helped me a lot. The resource is about to undergo major changes. Everything will be rewritten, and nothing will remain of the old version except the content and the users. To me this looks like the sunset of all my achievements, but thanks to them I gained great experience, and experience is invaluable. And this article will remain for me a memory of what I once did.
In the course of this work I was helped a lot by my colleague (CentAlt), who strongly supported me in all my endeavors, for which I am very grateful. By the way, some of you may know him. He maintains a very useful repository for CentOS, where you can find the latest versions of nginx, php-fpm, unbound, clamav, postfix, dovecot, etc.
The article does not claim to be the first of its kind or to be popular. It is not a set of instructions on what to do or what not to do. It is the story of a single high-load project that I happened to develop and administer.
Thank you for reading to the end!