
Optimizing the Nginx + Apache + PHP + MySQL stack

One day I was handed the task of finding out why a certain site was not running as fast as it should. It is built on CakePHP and runs on Apache with MySQL. This article describes the process of hunting down the bottlenecks and fixing as much as possible.

I won't name the site; I suspect its programmers will recognize it anyway. Let me just say it is a social-network application with a load of 70-150 thousand visitors in normal times. To complicate matters, advertising campaigns run periodically and bring in another 200-300 thousand visitors within a couple of hours.

So, below is the story of the entire four-day struggle.

This article is aimed at two groups of people. The first, of course, is us admins, who have to clean up this kind of mess, sometimes dragging a dying server back to life almost by the ears. The second part of the audience, which I hope will also read the whole article, is programmers. Friends, looking ahead, I'll say it straight: there would be no such mess if you designed your projects properly; many incidents could be avoided, especially if you keep in mind the kind of audience you are writing for.
At my disposal was an EX10 server hosted at Hetzner. For those who don't know, that is 64 GB of RAM and a 6-core processor. Call it the core of the system. There are two more servers: one serving static files, the other hosting a backend database. The application is small, with a 500 MB InnoDB database.

Under constant load, which as it later turned out meant 70-100 concurrent sessions, the picture was as follows: 100% CPU load on every core. At the top of the process list, naturally, MySQL and Apache were fighting for system resources.

Nginx


The first attempt was to reduce the load on Apache by caching its output with nginx, taking static files off its hands.

I set nginx up in a very simple configuration: it had to try to serve all files matching a mask from a specific folder; if a file is not there, fetch it from the proxied server, save it into that folder, and hand it to the client.

 http {
     proxy_cache_path /var/tmp/nginx_cache/ levels=1:2 keys_zone=ok:100m inactive=1d max_size=1024m;
     server {
         location ~* \.(js|jpg|png|jpeg|gif|zip|tgz|gz|rar|doc|xls|exe|pdf|ppt|txt|wav|bmp|rtf)$ {
                 expires 1y;
                 open_file_cache_errors off;
                 error_page 404 = @fetch;
                 root /var/tmp/_fetch_ok;
                 }

         location @fetch {
                 proxy_store_access user:rw group:rw all:r;
                 proxy_store on;
                 proxy_pass http://127.0.0.1:80;
                 proxy_temp_path /var/tmp/_fetch_ok_temp;
                 root /var/tmp/_fetch_ok;
                 }

         location / {
                 proxy_cache ok;
                 proxy_pass http://127.0.0.1;
                 proxy_cache_valid any 10m;
                 proxy_buffer_size 8k;
                }
 }
 }

The config is not given in full, only the blocks needed for this example.

Unfortunately, this did not lead to any tangible results; it brought only three minor upsides.


MySQL


This was the hardest part, since at the time I did not have much experience with SQL queries and their optimization.
I started by reading the MySQL documentation.

Since we have InnoDB on our hands, I took the my-innodb-heavy-4G.cnf configuration file from the standard distribution as a starting point.

Below I describe the configuration parameters that deserve attention in high-load projects.

back_log = 5000
max_connections = 1600
The first parameter is the number of connections that can wait in the queue before the server stops responding to new requests. The second is how many connections the server will accept at all.
My values are quite large, since there are on average up to 1,300 concurrent sessions. Don't set them higher than you need: each connection can require a certain amount of RAM. More on that later.

max_connect_errors = 50
It's simple: the number of errors a client may make before being disconnected. I had to increase it, since the project is still under development and malformed requests are frequent.

table_cache = 2048
Opening a table takes some resources, so this parameter controls how many open tables are kept around for some time after last use, waiting for the next connection.
Whether it needs changing can be judged from the status variable
SHOW GLOBAL STATUS LIKE 'Opened_tables';
which should stay as small as possible.
It is well described here: http://www.mysql.ru/docs/man/Table_cache.html

max_allowed_packet = 16M
The maximum packet size. If you do not pass around large BLOBs, there is no point in changing it.

binlog_cache_size = 1M
The size of the binary log cache for a transaction. The official documentation recommends increasing it if you have large transactions.
dev.mysql.com/doc/refman/5.5/en/replication-options-binary-log.html#sysvar_binlog_cache_size

max_heap_table_size = 64M
tmp_table_size = 64M
As I understand it, the smaller of the two is what counts. The parameter sets the maximum size of an in-memory temporary table; when a table outgrows it, the table is written to disk. So you want as few on-disk temporary tables as possible. You can check the current ratio of in-memory to on-disk temporary tables with
show status like '%tmp%tables';
www.mysqlperformanceblog.com/2007/01/19/tmp_table_size-and-max_heap_table_size
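As a rough illustration of that check, here is how one might turn the two counters into an on-disk percentage. The counter values below are invented for the example; pull real ones with SHOW GLOBAL STATUS.

```shell
# What fraction of temporary tables ended up on disk. The two counters
# come from `SHOW GLOBAL STATUS LIKE 'Created_tmp%tables';` -- the
# values below are made-up sample numbers for illustration only.
created_tmp_tables=120000
created_tmp_disk_tables=9000

# Percent of temporary tables that spilled to disk; a common rule of
# thumb is to start worrying when this climbs well above ~25%.
pct=$(( created_tmp_disk_tables * 100 / created_tmp_tables ))
echo "on-disk tmp tables: ${pct}%"
```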

sort_buffer_size = 8M
So as not to mislead anyone, I won't attempt to translate this one. I will only note that the documentation advises touching this parameter only if
show status like 'Sort_merge_passes'; is above zero.
dev.mysql.com/doc/refman/5.5/en/server-system-variables.html#sysvar_sort_buffer_size

join_buffer_size = 2M
As far as I understand, the maximum buffer size allocated for join operations that cannot use an index. I have not touched it yet.
dev.mysql.com/doc/refman/5.5/en/server-system-variables.html#sysvar_join_buffer_size

thread_cache_size = 4096
The maximum number of threads kept for reuse after a request completes. It pays to keep it large enough that MySQL creates as few new threads as possible and reuses old ones. You can judge the effectiveness of this parameter by the ratio Threads_created / Connections.
dev.mysql.com/doc/refman/5.5/en/server-system-variables.html#sysvar_thread_cache_size
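That ratio can be computed the same way; again, the counter values are invented for the example.

```shell
# Thread cache effectiveness: how many connections forced the creation
# of a brand-new OS thread. Counters come from
# `SHOW GLOBAL STATUS LIKE 'Threads_created';` and `... LIKE 'Connections';`
# -- sample numbers, for illustration only.
threads_created=9000
connections=300000

miss_pct=$(( threads_created * 100 / connections ))
# The closer to 0%, the better thread_cache_size is doing its job.
echo "thread cache miss rate: ${miss_pct}%"
```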

query_cache_size = 256M
query_cache_limit = 8M
I don't think anyone will explain this better than my colleague, the author of this translation: habrahabr.ru/post/41166.
This is probably the most important parameter, so that post is worth re-reading.

thread_stack = 192K
I won't attempt to describe the purpose of this parameter; I'll only note that it too affects RAM consumption, since the stack is allocated for every connection. So, again, multiply it by max_connections.

long_query_time = 2
log_long_format
log-queries-not-using-indexes
The MySQL server has a very handy tool for evaluating database performance: the slow query log. In my experience, it mostly catches inefficient queries and queries that do not use an index.
I advise you to take this log file to your programmers.

key_buffer_size = 1G
The parameter controls the in-memory cache of (MyISAM) indexes. To tune this value, look at Key_read_requests and Key_reads: the second counter is the number of index reads that went to disk rather than the buffer.
mysqltips.blogspot.com/2007/03/key-buffer.html
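The same kind of sanity check works here; sample counters below are invented for the example.

```shell
# MyISAM key buffer miss ratio: Key_reads (index blocks read from disk)
# over Key_read_requests (index blocks requested). Sample counters for
# illustration; pull real ones with `SHOW GLOBAL STATUS LIKE 'Key_read%';`
key_read_requests=5000000
key_reads=4000

# Misses per 10,000 requests; the classic advice is to keep this well
# under ~100, i.e. under 1%.
misses_per_10k=$(( key_reads * 10000 / key_read_requests ))
echo "key buffer misses per 10k requests: $misses_per_10k"
```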

read_buffer_size = 1M
boombick.org/blog/posts/3 - after reading this text, I would not risk adding anything, since I would not be sure I was right.

read_rnd_buffer_size = 24M
The parameter affects the speed of sorting operations. Unfortunately, I did not find a way to measure its effectiveness.
dev.mysql.com/doc/refman/5.5/en/server-system-variables.html#sysvar_read_rnd_buffer_size
www.mysqlperformanceblog.com/2007/07/24/what-exactly-is-read_rnd_buffer_size

myisam_sort_buffer_size = 128M
myisam_max_sort_file_size = 10G
myisam_max_extra_sort_file_size = 10G
These parameters affect sorting; I did not dare change them, except for increasing the first one on the assumption that it would speed up complex queries.

sync_binlog = 0
In our case this means: do not sync the binary log to disk through system calls. If the parameter is greater than zero, the server syncs the log every N transactions.
dev.mysql.com/doc/refman/5.5/en/replication-options-binary-log.html#sysvar_sync_binlog

innodb_buffer_pool_size = 4G
Increasing this parameter reduces the number of disk operations. Unfortunately, here too I did not find a good way to measure the effect. Since the database is small, I decided not to raise it much. I've seen advice that for a large database this parameter should be raised to about 70% of RAM.
dev.mysql.com/doc/refman/5.5/en/innodb-parameters.html#sysvar_innodb_buffer_pool_size

innodb_log_buffer_size = 32M
If the description is to be believed, it reduces disk I/O for large transactions.
dev.mysql.com/doc/refman/5.5/en/innodb-parameters.html#sysvar_innodb_log_buffer_size

innodb_log_file_size = 1024M
According to the documentation, a larger log file reduces the disk I/O load, but increases recovery time after a crash.
dev.mysql.com/doc/refman/5.5/en/innodb-parameters.html#sysvar_innodb_log_file_size

innodb_flush_log_at_trx_commit = 0
With a value of 0, the log buffer is flushed to disk roughly once a second rather than on every commit.
dev.mysql.com/doc/refman/5.1/en/innodb-parameters.html#sysvar_innodb_flush_log_at_trx_commit

innodb_thread_concurrency = 14
It is recommended to set this slightly higher than the number of cores.

innodb_sync_spin_loops = 10
As I understand it, this affects the number of attempts to access locked data before the thread is suspended. By increasing this value we can waste processor time; by reducing it, we can hurt write performance.
www.mysqlperformanceblog.com/2006/09/07/internals-of-innodb-mutexes
dev.mysql.com/doc/refman/5.0/en/innodb-parameters.html#sysvar_innodb_sync_spin_loops

DB and RAM

There is an excellent Perl script that gives a basic understanding of what needs to be changed in the database: mysqltuner.pl/mysqltuner.pl
Very often this script complains about the maximum possible memory consumption of the MySQL server.
Looking at the source, this is how the program estimates memory usage:

per_thread_buffers = read_buffer_size + read_rnd_buffer_size + sort_buffer_size + thread_stack + join_buffer_size;
total_per_thread_buffers = per_thread_buffers * max_connections;
server_buffers = key_buffer_size + max_tmp_table_size;
server_buffers += innodb_buffer_pool_size;
server_buffers += innodb_additional_mem_pool_size;
server_buffers += innodb_log_buffer_size;
server_buffers += query_cache_size;
total_possible_used_memory = server_buffers + total_per_thread_buffers;

This helped me see where I had set values incorrectly.
By the way, don't rush to lower your settings just because the script complains about high potential RAM consumption: many people point out that this is only a theoretical ceiling, and the database may never actually try to claim that much memory.
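For what it's worth, that arithmetic can be reproduced with this article's own values (sizes in MB; thread_stack rounded up to 1 MB, and innodb_additional_mem_pool_size omitted since it is not set in this config):

```shell
# mysqltuner's worst-case memory estimate, with this article's settings.
# Per-connection buffers (MB):
read_buffer_size=1
read_rnd_buffer_size=24
sort_buffer_size=8
thread_stack=1          # 192K, rounded up to 1 MB for simplicity
join_buffer_size=2
max_connections=1600

# Global buffers (MB):
key_buffer_size=1024
tmp_table_size=64
innodb_buffer_pool_size=4096
innodb_log_buffer_size=32
query_cache_size=256

per_thread=$(( read_buffer_size + read_rnd_buffer_size + sort_buffer_size + thread_stack + join_buffer_size ))
total_per_thread=$(( per_thread * max_connections ))
server_buffers=$(( key_buffer_size + tmp_table_size + innodb_buffer_pool_size + innodb_log_buffer_size + query_cache_size ))

# ~63 GB of a 64 GB box -- exactly why the script complains.
echo "theoretical max: $(( server_buffers + total_per_thread )) MB"
```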

Optimizing the database structure

When all possible settings have been made and you still don't get a good result, it's time to turn to the slow query log.
The standard MySQL distribution includes the mysqldumpslow utility.
Running
mysqldumpslow -s c <path to the slow log file> prints, sorted by occurrence count, the list of queries that ran too long or did not use indexes. Usually adding the right indexes fixes both problems.
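To make the idea concrete, here is a toy sketch of the normalize-and-count step mysqldumpslow performs (it abstracts numbers to N and strings to 'S'). The log excerpt is invented; real slow logs also carry timing headers that mysqldumpslow handles for you.

```shell
# A made-up "slow log" with three queries, two of the same shape.
cat > /tmp/slow_sample.log <<'EOF'
SELECT * FROM users WHERE id = 17;
SELECT * FROM users WHERE id = 42;
SELECT name FROM groups WHERE title = 'admins';
EOF

# Replace numeric and string literals with placeholders, then count
# each query shape -- the two identical shapes collapse into one line
# with count 2, just like mysqldumpslow -s c output.
sed -e "s/[0-9][0-9]*/N/g" -e "s/'[^']*'/'S'/g" /tmp/slow_sample.log \
  | sort | uniq -c | sort -rn
```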

In most cases, when you see a query with a large count in this program's output, copy a fragment of it and find a concrete instance of that query in the log file itself.
Then open a database client and run that query prefixed with the word EXPLAIN.
About this you can also read more here:
habrahabr.ru/post/31072

That shows you whether the query uses an index or not.
If a table lacks indexes, feel free to add them, though don't overdo it either. Indexes are needed on the columns used in WHERE and ORDER BY clauses, and the more unique the values in a column, the more useful the index; on a low-selectivity column the index may not be used at all.
Here you can learn in more detail how these indexes work:
dev.mysql.com/doc/refman/5.0/en/mysql-indexes.html

Frankly, after optimizing roughly five key queries, the server went from handling 50 connections to 500-700, and PHP page generation time dropped from 8 s to 1 s. Under maximum load, page time was 5 s instead of 50 s. (These are measurements with Apache Benchmark at roughly 1000 concurrent threads.)

A little more about nginx


After the optimization, I noticed that under heavy load, requests beyond a certain number were being dropped by nginx rather than Apache, while memory and CPU stayed idle.
I started digging and saw in the logs that nginx was trying to open more files than it was allowed to.
In SUSE, which I had to deal with here, this limit lives in the file
/etc/security/limits.conf
I added these lines there:
 nginx soft nofile 300000
 nginx hard nofile 300000

A server reboot is not needed.
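A quick, Linux-specific way to confirm what limit a process actually received (a hypothetical check, not from the original setup):

```shell
# Print the open-file limit the kernel granted to a process. Here we
# inspect the current shell ($$); substitute the nginx master PID to
# check the running server after editing limits.conf.
grep 'Max open files' /proc/$$/limits
```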

Apache2


I haven't changed the Apache configuration much yet. The only thing I did was turn off keep-alive, so that Apache can calmly finish a response and pick up the next request while nginx is still feeding the page to a client over a slow link.

eAccelerator

Don't forget that this optimizer also has a number of tunable parameters.

Here is what I changed:
eaccelerator.shm_size = "2096"
The amount of shared memory, in megabytes, that eAccelerator may use.

eaccelerator.shm_only = "1"
Use only RAM and never the disk; fighting for I/O on a software RAID of two SATA disks, I decided it was the way to go.

Here is some useful further reading:


habrahabr.ru/post/41166
habrahabr.ru/post/108418
dev.mysql.com/doc/refman/5.5/en/server-system-variables.html

Special thanks to my friend who helped proofread the article and correct a great many grammatical and stylistic mistakes.

Instead of an epilogue


Friends, I understand that this article contains a whole heap of mistakes and questionable conclusions.
A big request to the gurus: help make the article better with your comments.

UPD. A summary of the comments


Thank you for so much interest; I did not expect it.

A good link, recommended reading: www.percona.com/files/presentations/percona-live/dc-2012/PLDC2012-optimizing-mysql-configuration.pdf
Thanks to albertum.

Regarding PHP caching systems:
Yes indeed, eAccelerator is not the only option.
There is also APC, and yes, it is indeed going to be built into PHP.
I won't judge which is better, since I have not run enough tests.

There are also plenty of MySQL alternatives. The key ones, of course:
MariaDB
Percona Server

I personally chose Percona and am satisfied with it so far.

As for the point that this is not a final, polished result and one has to move on: those who read carefully will see that this was only patching up holes.

Source: https://habr.com/ru/post/146179/

