global
    log 127.0.0.1 local0
    log 127.0.0.1 local1 notice
    maxconn 4096
    chroot /usr/share/haproxy
    daemon

defaults
    log global
    mode http
    option tcplog
    option dontlognull
    retries 3
    option redispatch
    maxconn 2000
    contimeout 5000
    clitimeout 50000
    srvtimeout 50000

frontend pxc-front
    bind 10.0.0.70:3306
    mode tcp
    default_backend pxc-back

frontend stats-front
    bind *:81
    mode http
    default_backend stats-back

frontend pxc-onenode-front
    bind 10.0.0.70:3307
    mode tcp
    default_backend pxc-onenode-back

backend pxc-back
    mode tcp
    balance leastconn
    option httpchk
    server c1 10.0.0.106:33061 check port 9200 inter 12000 rise 3 fall 3
    server c2 10.0.0.107:33061 check port 9200 inter 12000 rise 3 fall 3

backend stats-back
    mode http
    balance roundrobin
    stats uri /haproxy/stats
    stats auth haproxy:password

backend pxc-onenode-back
    mode tcp
    balance leastconn
    option httpchk
    server c1 10.0.0.106:33061 check port 9200 inter 12000 rise 3 fall 3
    server c2 10.0.0.107:33061 check port 9200 inter 12000 rise 3 fall 3 backup

backend pxc-referencenode-back
    mode tcp
    balance leastconn
    option httpchk
    server c0 10.0.0.105:33061 check port 9200 inter 12000 rise 3 fall 3
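Not from the original article: before (re)starting HAProxy, the config can be checked for syntax errors with the -c flag, assuming it is saved as /etc/haproxy/haproxy.cfg:

$ haproxy -c -f /etc/haproxy/haproxy.cfg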
service mysqlchk
{
    disable         = no
    flags           = REUSE
    socket_type     = stream
    port            = 9200
    wait            = no
    user            = nobody
    server          = /usr/bin/clustercheck
    log_on_failure  += USERID
    only_from       = 0.0.0.0/0
    per_source      = UNLIMITED
}
...
# Local services
mysqlchk        9200/tcp        # mysqlchk
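After restarting xinetd, it is worth checking that the health check actually answers on port 9200; a quick test (my addition, using one of the node IPs from this setup):

$ service xinetd restart
$ curl -i http://10.0.0.106:9200/

A synced node should answer with HTTP 200, while a node that is down or desynced answers with HTTP 503, which is exactly what HAProxy keys on.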
The statistics page is available at http://VIP:81/haproxy/stats. The port, as well as the login and password for Basic authorization, are specified in the config.
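The same statistics are also exposed in machine-readable CSV form; a small check using the VIP and the credentials from the stats-back section above (my addition):

$ curl -u haproxy:password "http://10.0.0.70:81/haproxy/stats;csv"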
$ echo "net.ipv4.ip_nonlocal_bind=1" >> /etc/sysctl.conf && sysctl -p
vrrp_script chk_haproxy {    # Requires keepalived-1.1.13
    script "killall -0 haproxy"    # cheaper than pidof
    interval 2                     # check every 2 seconds
    weight 2                       # add 2 points of prio if OK
}

vrrp_instance VI_1 {
    interface eth0
    state MASTER          # SLAVE on backup
    virtual_router_id 51
    priority 101          # 101 on master, 100 and 99 on backup
    virtual_ipaddress {
        10.0.0.70
    }
    track_script {
        chk_haproxy
    }
}
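To see which machine currently holds the virtual IP (my addition; assumes eth0 as in the config above):

$ ip addr show dev eth0 | grep 10.0.0.70

Stopping haproxy on the master should make chk_haproxy fail, after which the VIP moves to a backup within a few seconds.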
[mysqld_safe]
wsrep_urls=gcomm://10.0.0.106:4567,gcomm://10.0.0.107:4567
# wsrep_urls=gcomm://10.0.0.106:4567,gcomm://10.0.0.107:4567,gcomm://
# the commented-out variant with a trailing empty gcomm:// is used only for
# the initial bootstrap, since an empty gcomm:// initializes a new cluster
# (see the discussion of wsrep_urls below)

[mysqld]
port=33061
bind-address=10.0.0.105
datadir=/var/lib/mysql
skip-name-resolve
log_error=/var/log/mysql/error.log
binlog_format=ROW
wsrep_provider=/usr/lib/libgalera_smm.so
wsrep_slave_threads=16
wsrep_cluster_name=cluster0
wsrep_node_name=node105
wsrep_sst_method=xtrabackup
wsrep_sst_auth=backup:password
innodb_locks_unsafe_for_binlog=1
innodb_autoinc_lock_mode=2
innodb_buffer_pool_size=8G
innodb_log_file_size=128M
innodb_log_buffer_size=4M
innodb-file-per-table
[mysqld_safe]
wsrep_urls=gcomm://10.0.0.105:4567

[mysqld]
bind-address=10.0.0.106
wsrep_node_name=node106
wsrep_sst_donor=node105
[mysqld_safe]
wsrep_urls=gcomm://10.0.0.105:4567

[mysqld]
bind-address=10.0.0.107
wsrep_node_name=node107
wsrep_sst_donor=node105
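Once all three nodes are up, the cluster state can be verified through Galera's wsrep status variables (my addition; the root credentials are an assumption, the port comes from the config above):

$ mysql -h 10.0.0.105 -P 33061 -u root -p \
    -e "SHOW STATUS LIKE 'wsrep_cluster_size'; SHOW STATUS LIKE 'wsrep_local_state_comment'"

On a healthy cluster, wsrep_cluster_size should be 3 and wsrep_local_state_comment should be Synced on every node.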
When I first saw this recommendation, I was very upset. I had imagined multi-master as something where you can write to any node without worrying about anything, and the changes are guaranteed to be applied synchronously on all nodes. But the harsh reality is that with this approach cluster-wide deadlocks are possible. The probability is especially high when the same data is changed in parallel in long transactions. Since I am not yet an expert in this matter, I cannot explain the process in simple terms, but there is a good article where this problem is covered in detail: Percona XtraDB Cluster: Multi-node writing and Unexpected deadlocks.
My own tests showed that under aggressive writing to all nodes, they failed one after another, leaving only the Reference Node running; in effect, the cluster stopped working. This is certainly a drawback of this configuration, since the third node could have taken over the load in that case, but we can be sure the data is safe and, in the worst case, we can manually start it in single-server mode.
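For reference, "single-server mode" here means starting the surviving node with the Galera provider disabled so it runs as a plain MySQL server; a hedged sketch (my addition, not a command from the original article):

# disables replication entirely; the node serves local data only
$ mysqld_safe --wsrep-provider=none &

The same effect can be achieved by setting wsrep_provider=none in my.cnf before starting the node.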
There are two directives for this:
[mysqld_safe]
wsrep_urls

[mysqld]
wsrep_cluster_address
The first, if I understood correctly, was added to Galera relatively recently to make it possible to specify several node addresses at once. Beyond that, there are no fundamental differences.
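For comparison (my addition): later Galera versions also accept a comma-separated list inside a single gcomm:// URL in wsrep_cluster_address, so the same node list could be written as:

[mysqld]
wsrep_cluster_address=gcomm://10.0.0.106:4567,10.0.0.107:4567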
At first, the values of these directives confused me quite a bit. The thing is, many manuals advised leaving an empty gcomm:// value in wsrep_urls on the first node of the cluster. It turned out that this is wrong: an empty gcomm:// means initializing a new cluster. Therefore, immediately after starting the first node, you need to delete this value from its config. Otherwise, after this node restarts, you will end up with two different clusters, one of which consists only of the first node.
For myself, I settled on the following order for starting and restarting the cluster (already described above in more detail):
1. Node A: start with gcomm://B,gcomm://C,gcomm://
2. Node A: delete gcomm:// from the end of the line
3. Nodes B, C: start with gcomm://A
NB: you must specify the port number for Group Communication requests; the default is 4567. That is, the correct entry is gcomm://A:4567. A sketch of how wsrep_urls changes at each step is shown below.
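To make the steps concrete, here is how wsrep_urls would evolve on each node, using the IPs from this article (my sketch):

# step 1: node A (10.0.0.105) bootstraps; note the trailing empty gcomm://
wsrep_urls=gcomm://10.0.0.106:4567,gcomm://10.0.0.107:4567,gcomm://

# step 2: once node A is up, drop the trailing gcomm:// so that a restart
# re-joins the existing cluster instead of initializing a new one
wsrep_urls=gcomm://10.0.0.106:4567,gcomm://10.0.0.107:4567

# step 3: nodes B and C simply join through node A
wsrep_urls=gcomm://10.0.0.105:4567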
During SST, clustercheck on the donor returns HTTP 503, so HAProxy (or any other LB that uses this utility to determine status) will consider the donor node unavailable, as well as the node receiving the transfer. This behavior can be changed by editing clustercheck, which is essentially a regular bash script.
This is done by the following edit:
/usr/bin/clustercheck
#AVAILABLE_WHEN_DONOR=0
AVAILABLE_WHEN_DONOR=1
NB: you should only do this if you are sure that xtrabackup is used for SST and not some other method. In our case, since we use a donor that receives no load, this edit makes no sense at all.
Source: https://habr.com/ru/post/158377/