I have ended up working on a project that handles a fairly heavy load: as already mentioned, 36 million requests per hour. Over the last month of setting up the server I have read and tried a great deal, and here I just want to lay out, concisely and compactly, the settings that work well in this configuration.
The first thing I noticed was the sheer number of guides on tuning for heavy load. Read them carefully and you will usually find that the "high load" in question is 15-20 thousand clients per day. We have about a million clients, active, daily.
We have no budget and do everything at our own expense, so we economize. The result: all one million clients are served by a single server, this one:
a Hetzner EX-60.
At one point we accidentally arranged a DDoS on ourselves through our own clients, and with the settings at the time, when 4000 PHP processes were running, the OS load average also climbed to around 4000. That gave me the chance to try many configurations and find the ones that actually work. Once the bug in the software was fixed, those 10-12 thousand requests per second are now handled at load average: 3.92, 3.22, 2.85. Not 1.0, of course, but for a single server I consider that a good result.
The operating system is CentOS 7.1, 64-bit: a minimal installation plus iptables, nginx, php-fpm and MySQL. The kernel is a 4.x version from the kernel-ml package.
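kernel-ml is the mainline kernel packaged by the ELRepo project; assuming the elrepo-release repository package is already installed, pulling it in looks roughly like this (check the current package names for your setup):
yum --enablerepo=elrepo-kernel install kernel-ml
# then point grub at the new kernel, e.g. grub2-set-default 0, and reboot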
Kernel tuning for a large number of TCP connections:
/etc/sysctl.conf
fs.file-max = 1000000
net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.all.secure_redirects = 0
net.ipv4.conf.all.send_redirects = 0
net.ipv4.tcp_max_orphans = 65536
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_keepalive_time = 1800
net.ipv4.tcp_keepalive_intvl = 15
net.ipv4.tcp_keepalive_probes = 5
net.ipv4.tcp_max_syn_backlog = 65536
net.ipv4.tcp_synack_retries = 1
net.ipv4.tcp_mem = 50576 64768 98152
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_orphan_retries = 0
net.ipv4.tcp_syncookies = 0
net.ipv4.netfilter.ip_conntrack_max = 1048576
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_sack = 1
net.ipv4.tcp_congestion_control = htcp
net.ipv4.tcp_no_metrics_save = 1
net.ipv4.route.flush = 1
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.lo.rp_filter = 1
net.ipv4.conf.eth0.rp_filter = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.lo.accept_source_route = 0
net.ipv4.conf.eth0.accept_source_route = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_rfc1337 = 1
net.ipv4.ip_forward = 0
net.ipv4.icmp_echo_ignore_broadcasts = 1
net.ipv4.icmp_ignore_bogus_error_responses = 1
net.core.somaxconn = 262144
net.core.netdev_max_backlog = 1000
net.core.rmem_default = 65536
net.core.wmem_default = 65536
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
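These values sit in /etc/sysctl.conf, so they survive a reboot. To apply them on a running system and spot-check a couple of keys, something like this is enough (the conntrack key only exists once the conntrack module is loaded, for example by stateful iptables rules):
sysctl -p /etc/sysctl.conf
sysctl net.core.somaxconn net.ipv4.tcp_congestion_control
cat /proc/sys/fs/file-max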
Tuning the file limits; since there are no regular users on the server, we don't fuss over fine-grained values:
/etc/security/limits.conf
* soft nproc 65535
* hard nproc 65535
* soft nofile 100000
* hard nofile 100000
root soft nofile unlimited
root hard nofile unlimited
Old hands know this already, but I had been away from administration for a while and was not aware that * does not apply to root; root has to be configured separately.
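A quick way to confirm the limits actually took hold is to compare root with an ordinary account (the nginx user here is just an example):
ulimit -n                               # in a root shell
su -s /bin/bash -c 'ulimit -n' nginx    # picks up the wildcard entries
Keep in mind that limits.conf applies to PAM sessions; the nginx and MySQL configs below also raise their own file limits (worker_rlimit_nofile, open-files-limit), which covers those daemons as well.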
This is the foundation for everything else.
MySQL settings. We installed Percona Server 5.6.
The choice finally settled on InnoDB. We tried TokuDB, but with a constant heavy stream of inserts, and inserts make up 95% of our 36 million requests per hour, InnoDB behaves better; Percona's own benchmarks say the same.
The MySQL configuration:
/etc/my.cnf
[mysql]
port = 3306
socket = /var/lib/mysql/mysql.sock
[mysqld]
user = mysql
default-storage-engine = InnoDB
socket = /var/lib/mysql/mysql.sock
pid-file = /var/lib/mysql/mysql.pid
key-buffer-size = 32M
myisam-recover = FORCE,BACKUP
max-allowed-packet = 16M
max-connect-errors = 1000000
skip-name-resolve
datadir = /var/lib/mysql/
tmp-table-size = 32M
max-heap-table-size = 32M
query-cache-type = 0
query-cache-size = 0
max-connections = 15000
thread-cache-size = 5000
open-files-limit = 150000
table-definition-cache = 1024
table-open-cache = 50000
innodb-flush-method = O_DIRECT
innodb-log-files-in-group = 2
innodb-log-file-size = 2G
innodb-file-per-table = 1
innodb-buffer-pool-size = 10G
innodb_flush_log_at_trx_commit = 0
log-error = /var/log/mysql/mysql-error.log
log-queries-not-using-indexes = 0
slow-query-log = 1
slow-query-log-file = /var/log/mysql/mysql-slow.log
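Since the slow query log is enabled, the mysqldumpslow utility that ships with the server is handy for summarizing it; for example, the ten statements with the largest total time:
mysqldumpslow -s t -t 10 /var/log/mysql/mysql-slow.log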
With this kind of load, be sure to disable the query cache: it genuinely slows the whole system down. Still, experiment; maybe it is different in your case, but I have run into this point in many tests and articles and checked it myself, and with the query cache disabled everything runs faster.
skip-name-resolve also gives a good boost.
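Both of these are easy to confirm on the running server; assuming the mysql client can connect locally (add -u/-p options as needed), query_cache_type should report OFF and skip_name_resolve should report ON:
mysql -e "SHOW VARIABLES LIKE 'query_cache%';"
mysql -e "SHOW VARIABLES LIKE 'skip_name_resolve';"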
Additions to the standard nginx fastcgi_params:
fastcgi_params
fastcgi_param REDIRECT_STATUS 200;
fastcgi_buffer_size 4K;
fastcgi_buffers 64 4k;
nginx, tuned for our needs:
nginx.conf
user nginx;
worker_processes 8;
error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;
worker_rlimit_nofile 150000;
events {
worker_connections 8000;
multi_accept on;
use epoll;
}
http {
include /etc/nginx/mime.types;
default_type application/octet-stream;
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for"';
access_log /var/log/nginx/access.log main;
gzip off;
sendfile on;
tcp_nopush on;
tcp_nodelay on;
reset_timedout_connection on;
server_tokens off;
client_body_buffer_size 128k;
include /etc/nginx/conf.d/*.conf;
}
There are 8 cores, hence 8 worker processes, with 8000 connections each; more than 64k (8 × 8000) simultaneous connections cannot be served at once anyway. If there are more, a small queue builds up.
The site talks to php-fpm over a unix socket:
/etc/nginx/conf.d/site.conf
fastcgi_pass unix:/var/run/php-fpm/php-fpm.sock;
fastcgi_send_timeout 180s;
fastcgi_read_timeout 180s;
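For context, these fastcgi_pass and timeout lines live inside the location block that hands PHP requests to the pool. A minimal skeleton of such a server block might look like the following; the server_name, root and location regex are purely illustrative, not our real config:
server {
    listen 80;
    server_name example.com;
    root /var/www/site;
    location ~ \.php$ {
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_pass unix:/var/run/php-fpm/php-fpm.sock;
        fastcgi_send_timeout 180s;
        fastcgi_read_timeout 180s;
    }
}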
The main php-fpm configuration:
/etc/php-fpm.d/www.conf
listen = /var/run/php-fpm/php-fpm.sock
pm = ondemand
pm.max_children = 4000
pm.start_servers = 5
pm.min_spare_servers = 5
pm.max_requests = 0
ondemand is described in few places, but under heavy load it works better than dynamic. And static, of course, is simply a killer for the server; I did not like it at all.
ondemand starts from 5 processes and grows as needed, but unlike dynamic it does not kill processes when the load drops only to spawn them again later; it simply holds on to the pool it reached at the peak and keeps the idle workers on standby. If the load suddenly grows again, the processes are already there and nothing has to start from scratch.
pm.max_requests controls how many requests a worker serves before it is respawned, which is the knob for fighting memory leaks in third-party software; we keep it at 0, i.e. workers are never recycled.
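One more detail when nginx talks to the pool over a unix socket: if nginx and php-fpm run as different users, the socket permissions have to allow it. These are standard pool directives; the owner/group values below are assumptions, adjust them to the user nginx actually runs as:
/etc/php-fpm.d/www.conf
; adjust to the user/group nginx runs as
listen.owner = nginx
listen.group = nginx
listen.mode = 0660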
So that is how we serve 36 million requests per hour, 95 percent of which are clients sending us data that we write to the database. Per 2.8 billion queries we currently see 10 to 16 slow queries, each under 10 seconds, and all of them are SELECTs over many fields and tables. The remaining queries complete practically instantly.
At one point I built and ran HHVM instead of php-fpm. It really is great and much faster than php-fpm, but there is a problem: it crashes every 30-40 minutes, and hard.
I reported it to the developers on GitHub; so far they have not been able to help and do not know the cause. So we stay on php-fpm, PHP version 5.6.
All software is installed through yum; no builds from source with mega-tuning are used.
I hope having all these settings collected in one place will be useful to someone.