
Build your own failover cloud based on OpenNebula with Ceph, MariaDB Galera Cluster and OpenvSwitch



This time I would like to show you how to set all of this up, component by component, so that in the end you get your own expandable, fault-tolerant cloud based on OpenNebula. In this article I will cover the following topics: a Ceph storage cluster with an SSD cache pool, a MariaDB Galera Cluster, OpenvSwitch, and OpenNebula itself in an HA configuration.


Each of these topics is interesting in its own right, so even if you do not care about the final goal and only want to set up one particular component, read on.


Small introduction


So, what do we get in the end?


After reading this article, you will be able to deploy your own flexible, expandable and fault-tolerant cloud based on OpenNebula: every service is either replicated or clustered, you can add nodes and disks as you grow, and the failure of a single node does not bring the cloud down.




What do we need for this? Three nodes running CentOS 7, each with one network interface, three HDDs and two SSDs; the exact layout is described below.




Cluster layout


To understand what is happening, here is an approximate diagram of our future cluster:
[cluster layout diagram]

And a table with the characteristics of each node:
Hostname            kvm1             kvm2             kvm3
Network interface   enp1             enp1             enp1
IP address          192.168.100.201  192.168.100.202  192.168.100.203
HDD                 sdb              sdb              sdb
HDD                 sdc              sdc              sdc
HDD                 sdd              sdd              sdd
SSD                 sde              sde              sde
SSD                 sdf              sdf              sdf


That's it, now we can start the setup! And we will begin, perhaps, with building the storage.



Ceph


Ceph has already been covered on Habr. For example, teraflops described its architecture and basic concepts in some detail in his article; I recommend reading it.

Here I will describe how to set up Ceph to store RBD (RADOS Block Device) volumes for our virtual machines, as well as how to configure a cache pool to speed up I/O operations.

So we have three nodes: kvm1, kvm2 and kvm3. Each of them has 2 SSDs and 3 HDDs. On these drives we will create two pools: the main one on the HDDs and a caching one on the SSDs. In the end we should get something like this:
[pool layout diagram]

Preparation


Installation will be done using ceph-deploy, which assumes that everything is installed from a so-called admin server.

Any machine with ceph-deploy and an ssh client installed can serve as the admin server; in our case, the kvm1 node will play this role.

We need a ceph user on each node, able to ssh between the nodes without a password and to run any command via sudo without a password.

On each node we perform:

sudo useradd -d /home/ceph -m ceph
sudo passwd ceph
sudo echo "ceph ALL = (root) NOPASSWD:ALL" > /etc/sudoers.d/ceph
sudo chmod 0440 /etc/sudoers.d/ceph


Go to kvm1.

Now we will generate the key and copy it to the other nodes.
sudo ssh-keygen -f /home/ceph/.ssh/id_rsa
sudo cat /home/ceph/.ssh/id_rsa.pub >> /home/ceph/.ssh/authorized_keys
sudo chown -R ceph:users /home/ceph/.ssh
for i in 2 3; do
  scp /home/ceph/.ssh/* ceph@kvm$i:/home/ceph/.ssh/
done
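
You can quickly verify that passwordless access works; run this on kvm1 as the ceph user (it should print root for each node without asking for a password):

for i in 2 3; do
  ssh ceph@kvm$i sudo whoami
done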


Installation


Add the key, install the Ceph repository, and install ceph-deploy from it:

sudo rpm --import 'https://download.ceph.com/keys/release.asc'
sudo yum -y localinstall http://download.ceph.com/rpm/el7/noarch/ceph-release-1-1.el7.noarch.rpm
sudo yum install -y ceph-deploy


OK, now switch to the ceph user and create a folder where we will keep the configs and keys for Ceph.
sudo su - ceph
mkdir ceph-admin
cd ceph-admin


Now install ceph on all our nodes:
 ceph-deploy install kvm{1,2,3} 


Now create a cluster
 ceph-deploy new kvm{1,2,3} 


Create monitors and get the keys:
ceph-deploy mon create kvm{1,2,3}
ceph-deploy gatherkeys kvm{1,2,3}


Now, following our original scheme, let's prepare the disks and launch the OSD daemons:
# Wipe the disks
ceph-deploy disk zap kvm{1,2,3}:sd{b,c,d,e,f}
# SSD disks
ceph-deploy osd create kvm{1,2,3}:sd{e,f}
# HDD disks
ceph-deploy osd create kvm{1,2,3}:sd{b,c,d}


Let's see what we got:
 ceph osd tree 
Output:
ID WEIGHT  TYPE NAME       UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 3.00000 root default
-2 1.00000     host kvm1
 0 1.00000         osd.0        up  1.00000          1.00000
 1 1.00000         osd.1        up  1.00000          1.00000
 6 1.00000         osd.6        up  1.00000          1.00000
 7 1.00000         osd.7        up  1.00000          1.00000
 8 1.00000         osd.8        up  1.00000          1.00000
-3 1.00000     host kvm2
 2 1.00000         osd.2        up  1.00000          1.00000
 3 1.00000         osd.3        up  1.00000          1.00000
 9 1.00000         osd.9        up  1.00000          1.00000
10 1.00000         osd.10       up  1.00000          1.00000
11 1.00000         osd.11       up  1.00000          1.00000
-4 1.00000     host kvm3
 4 1.00000         osd.4        up  1.00000          1.00000
 5 1.00000         osd.5        up  1.00000          1.00000
12 1.00000         osd.12       up  1.00000          1.00000
13 1.00000         osd.13       up  1.00000          1.00000
14 1.00000         osd.14       up  1.00000          1.00000


Check the status of the cluster:
 ceph -s 


Cache pool setup


So, we now have a fully working Ceph cluster.
Let's set up a caching pool for it. First we need to edit the CRUSH map to define the rules by which data will be distributed, so that our cache pool lives only on the SSDs and the main pool only on the HDDs.

First we need to stop Ceph from updating the map automatically; add this to ceph.conf:
 osd_crush_update_on_start = false 


And push the config to our nodes:
 ceph-deploy admin kvm{1,2,3} 


Let's dump the current map and decompile it into text form:
ceph osd getcrushmap -o map.running
crushtool -d map.running -o map.decompile


Let's bring it to the following form:

map.decompile
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable straw_calc_version 1

# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 osd.4
device 5 osd.5
device 6 osd.6
device 7 osd.7
device 8 osd.8
device 9 osd.9
device 10 osd.10
device 11 osd.11
device 12 osd.12
device 13 osd.13
device 14 osd.14

# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root

# buckets
host kvm1-ssd-cache {
    id -2       # do not change unnecessarily
    # weight 0.000
    alg straw
    hash 0      # rjenkins1
    item osd.0 weight 1.000
    item osd.1 weight 1.000
}
host kvm2-ssd-cache {
    id -3       # do not change unnecessarily
    # weight 0.000
    alg straw
    hash 0      # rjenkins1
    item osd.2 weight 1.000
    item osd.3 weight 1.000
}
host kvm3-ssd-cache {
    id -4       # do not change unnecessarily
    # weight 0.000
    alg straw
    hash 0      # rjenkins1
    item osd.4 weight 1.000
    item osd.5 weight 1.000
}
host kvm1-hdd {
    id -102     # do not change unnecessarily
    # weight 0.000
    alg straw
    hash 0      # rjenkins1
    item osd.6 weight 1.000
    item osd.7 weight 1.000
    item osd.8 weight 1.000
}
host kvm2-hdd {
    id -103     # do not change unnecessarily
    # weight 0.000
    alg straw
    hash 0      # rjenkins1
    item osd.9 weight 1.000
    item osd.10 weight 1.000
    item osd.11 weight 1.000
}
host kvm3-hdd {
    id -104     # do not change unnecessarily
    # weight 0.000
    alg straw
    hash 0      # rjenkins1
    item osd.12 weight 1.000
    item osd.13 weight 1.000
    item osd.14 weight 1.000
}
root ssd-cache {
    id -1       # do not change unnecessarily
    # weight 0.000
    alg straw
    hash 0      # rjenkins1
    item kvm1-ssd-cache weight 1.000
    item kvm2-ssd-cache weight 1.000
    item kvm3-ssd-cache weight 1.000
}
root hdd {
    id -100     # do not change unnecessarily
    # weight 0.000
    alg straw
    hash 0      # rjenkins1
    item kvm1-hdd weight 1.000
    item kvm2-hdd weight 1.000
    item kvm3-hdd weight 1.000
}

# rules
rule ssd-cache {
    ruleset 0
    type replicated
    min_size 1
    max_size 10
    step take ssd-cache
    step chooseleaf firstn 0 type host
    step emit
}
rule hdd {
    ruleset 1
    type replicated
    min_size 1
    max_size 10
    step take hdd
    step chooseleaf firstn 0 type host
    step emit
}
# end crush map


You can see that instead of one root I now have two, one for hdd and one for ssd-cache; the same split was done for the rules and for each host.
When editing the map by hand, be extremely careful not to mix up the IDs!

Now compile and assign it:
crushtool -c map.decompile -o map.new
ceph osd setcrushmap -i map.new

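If you want, you can also sanity-check the compiled map with crushtool's test mode before (or after) applying it; it simulates object placement through a given rule. The rule numbers here match the rules defined in the map above:

# simulate placement with 2 replicas through each rule
crushtool -i map.new --test --rule 0 --num-rep 2 --show-utilization
crushtool -i map.new --test --rule 1 --num-rep 2 --show-utilization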

Let's see what we got:
 ceph osd tree 
Output:
  ID WEIGHT  TYPE NAME                UP/DOWN REWEIGHT PRIMARY-AFFINITY
-100 3.00000 root hdd
-102 1.00000     host kvm1-hdd
   6 1.00000         osd.6                 up  1.00000          1.00000
   7 1.00000         osd.7                 up  1.00000          1.00000
   8 1.00000         osd.8                 up  1.00000          1.00000
-103 1.00000     host kvm2-hdd
   9 1.00000         osd.9                 up  1.00000          1.00000
  10 1.00000         osd.10                up  1.00000          1.00000
  11 1.00000         osd.11                up  1.00000          1.00000
-104 1.00000     host kvm3-hdd
  12 1.00000         osd.12                up  1.00000          1.00000
  13 1.00000         osd.13                up  1.00000          1.00000
  14 1.00000         osd.14                up  1.00000          1.00000
  -1 3.00000 root ssd-cache
  -2 1.00000     host kvm1-ssd-cache
   0 1.00000         osd.0                 up  1.00000          1.00000
   1 1.00000         osd.1                 up  1.00000          1.00000
  -3 1.00000     host kvm2-ssd-cache
   2 1.00000         osd.2                 up  1.00000          1.00000
   3 1.00000         osd.3                 up  1.00000          1.00000
  -4 1.00000     host kvm3-ssd-cache
   4 1.00000         osd.4                 up  1.00000          1.00000
   5 1.00000         osd.5                 up  1.00000          1.00000


Now let's describe our configuration in ceph.conf; in particular, we will add the monitor and OSD entries.

I got this config:

ceph.conf
[global]
fsid = 586df1be-40c5-4389-99ab-342bd78566c3
mon_initial_members = kvm1, kvm2, kvm3
mon_host = 192.168.100.201,192.168.100.202,192.168.100.203
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
filestore_xattr_use_omap = true
osd_crush_update_on_start = false

[mon.kvm1]
host = kvm1
mon_addr = 192.168.100.201:6789
mon-clock-drift-allowed = 0.5

[mon.kvm2]
host = kvm2
mon_addr = 192.168.100.202:6789
mon-clock-drift-allowed = 0.5

[mon.kvm3]
host = kvm3
mon_addr = 192.168.100.203:6789
mon-clock-drift-allowed = 0.5

[client.admin]
keyring = /etc/ceph/ceph.client.admin.keyring

[osd.0]
host = kvm1
[osd.1]
host = kvm1
[osd.2]
host = kvm2
[osd.3]
host = kvm2
[osd.4]
host = kvm3
[osd.5]
host = kvm3
[osd.6]
host = kvm1
[osd.7]
host = kvm1
[osd.8]
host = kvm1
[osd.9]
host = kvm2
[osd.10]
host = kvm2
[osd.11]
host = kvm2
[osd.12]
host = kvm3
[osd.13]
host = kvm3
[osd.14]
host = kvm3


And distribute it to our hosts:
 ceph-deploy admin kvm{1,2,3} 


Check the status of the cluster:
 ceph -s 


Creating the pools


To create the pools, we need to calculate the correct number of PGs (Placement Groups); they are needed by the CRUSH algorithm. The formula is as follows:
Total PGs = (OSDs * 100) / Replicas
and round up to the nearest power of 2

That is, in our case, with one pool on the SSDs and one pool on the HDDs, each with a replica count of 2, the calculation looks like this:
HDD pool pg = 9*100/2 = 450, rounded up to 512
SSD pool pg = 6*100/2 = 300, rounded up to 512

If there are several pools under one root, the resulting value should be divided by the number of pools.
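
If you prefer not to round by hand, here is a tiny shell helper (purely illustrative, not part of Ceph) that does the same calculation:

# round a PG count up to the nearest power of two
next_pow2() { local n=$1 p=1; while [ "$p" -lt "$n" ]; do p=$((p * 2)); done; echo "$p"; }
echo "HDD pool: $(next_pow2 $((9 * 100 / 2)))"   # prints 512
echo "SSD pool: $(next_pow2 $((6 * 100 / 2)))"   # prints 512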

Create the pools and set size 2 (the replica count: data written to the pool will be duplicated on different disks) and min_size 1 (the minimum number of replicas that must be written before the write operation is acknowledged):

ceph osd pool create ssd-cache 512
ceph osd pool set ssd-cache min_size 1
ceph osd pool set ssd-cache size 2
ceph osd pool create one 512
ceph osd pool set one min_size 1
ceph osd pool set one size 2
The one pool, as the name suggests, will be used to store OpenNebula images.

Assign rules to our pools:
ceph osd pool set ssd-cache crush_ruleset 0
ceph osd pool set one crush_ruleset 1

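You can confirm that the pools picked up the intended rules and replica counts with ceph osd pool get, one key per call (on this generation of Ceph the key is crush_ruleset; newer releases call it crush_rule):

ceph osd pool get ssd-cache crush_ruleset
ceph osd pool get one crush_ruleset
ceph osd pool get one size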

Configure writes to the one pool to go through our cache pool:
ceph osd tier add one ssd-cache
ceph osd tier cache-mode ssd-cache writeback
ceph osd tier set-overlay one ssd-cache

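To check that the tiering took effect, ceph osd dump prints the pool definitions together with their cache mode and tier bindings (exact output format varies by version):

ceph osd dump | grep -E "pool .*'(one|ssd-cache)'"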

Ceph uses two basic operations to clean the cache: flushing (writing dirty objects down to the backing pool) and eviction (dropping objects from the cache).

To determine which objects are “hot”, a so-called Bloom filter is used.

Configure our cache settings:
# use a bloom filter to track hits
ceph osd pool set ssd-cache hit_set_type bloom
# how many of the most recent hit sets to keep
ceph osd pool set ssd-cache hit_set_count 4
# how long (in seconds) each hit set covers
ceph osd pool set ssd-cache hit_set_period 1200


Let's also configure the following:
# maximum amount of data in the cache pool, in bytes (200 GB here)
ceph osd pool set ssd-cache target_max_bytes 200000000000
# share of dirty objects at which flushing to the backing pool starts
ceph osd pool set ssd-cache cache_target_dirty_ratio 0.4
# share of cache capacity at which eviction starts
ceph osd pool set ssd-cache cache_target_full_ratio 0.8
# minimum age (seconds) of a dirty object before it can be flushed
ceph osd pool set ssd-cache cache_min_flush_age 300
# minimum age (seconds) of an object before it can be evicted
ceph osd pool set ssd-cache cache_min_evict_age 300


Keys


Create the oneadmin user and generate a key for it:
 ceph auth get-or-create client.oneadmin mon 'allow r' osd 'allow rw pool=ssd-cache' -o /etc/ceph/ceph.client.oneadmin.keyring 

Since it will never write directly to the main pool, we grant it rights only to the ssd-cache pool.
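
To double-check the user and its capabilities, you can ask the cluster for the key you just created:

ceph auth get client.oneadmin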

At this point the Ceph setup can be considered complete.



MariaDB Galera Cluster



Now we will configure a fault-tolerant MySQL database on our nodes, which will store the configuration of our datacenter.
MariaDB Galera Cluster is a MariaDB cluster with multi-master replication that uses the galera library for synchronization.
On top of that, it is pretty simple to set up:

Installation


On all nodes
Install the repository:
cat << EOT > /etc/yum.repos.d/mariadb.repo
[mariadb]
name = MariaDB
baseurl = http://yum.mariadb.org/10.0/centos7-amd64
gpgkey=https://yum.mariadb.org/RPM-GPG-KEY-MariaDB
gpgcheck=1
EOT


And the server itself:
 yum install MariaDB-Galera-server MariaDB-client rsync galera 


Run the daemon and do the initial setup:
service mysql start
chkconfig mysql on
mysql_secure_installation


Configure the cluster:


On each node, create a user for replication:
mysql -p
GRANT USAGE ON *.* to sst_user@'%' IDENTIFIED BY 'PASS';
GRANT ALL PRIVILEGES on *.* to sst_user@'%';
FLUSH PRIVILEGES;
exit
service mysql stop


Bring the /etc/my.cnf configuration to the following form:
For kvm1:
cat << EOT > /etc/my.cnf
collation-server = utf8_general_ci
init-connect = 'SET NAMES utf8'
character-set-server = utf8
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
innodb_locks_unsafe_for_binlog=1
query_cache_size=0
query_cache_type=0
bind-address=0.0.0.0
datadir=/var/lib/mysql
innodb_log_file_size=100M
innodb_file_per_table
innodb_flush_log_at_trx_commit=2
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_address="gcomm://192.168.100.202,192.168.100.203"
wsrep_cluster_name='galera_cluster'
wsrep_node_address='192.168.100.201'  # setup real node ip
wsrep_node_name='kvm1'                # setup real node name
wsrep_sst_method=rsync
wsrep_sst_auth=sst_user:PASS
EOT


By analogy with kvm1, we write the configs for the remaining nodes:
For kvm2
cat << EOT > /etc/my.cnf
collation-server = utf8_general_ci
init-connect = 'SET NAMES utf8'
character-set-server = utf8
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
innodb_locks_unsafe_for_binlog=1
query_cache_size=0
query_cache_type=0
bind-address=0.0.0.0
datadir=/var/lib/mysql
innodb_log_file_size=100M
innodb_file_per_table
innodb_flush_log_at_trx_commit=2
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_address="gcomm://192.168.100.201,192.168.100.203"
wsrep_cluster_name='galera_cluster'
wsrep_node_address='192.168.100.202'  # setup real node ip
wsrep_node_name='kvm2'                # setup real node name
wsrep_sst_method=rsync
wsrep_sst_auth=sst_user:PASS
EOT
For kvm3
cat << EOT > /etc/my.cnf
collation-server = utf8_general_ci
init-connect = 'SET NAMES utf8'
character-set-server = utf8
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
innodb_locks_unsafe_for_binlog=1
query_cache_size=0
query_cache_type=0
bind-address=0.0.0.0
datadir=/var/lib/mysql
innodb_log_file_size=100M
innodb_file_per_table
innodb_flush_log_at_trx_commit=2
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_address="gcomm://192.168.100.201,192.168.100.202"
wsrep_cluster_name='galera_cluster'
wsrep_node_address='192.168.100.203'  # setup real node ip
wsrep_node_name='kvm3'                # setup real node name
wsrep_sst_method=rsync
wsrep_sst_auth=sst_user:PASS
EOT


Done, it's time to start our cluster; on the first node run:
 /etc/init.d/mysql start --wsrep-new-cluster 


On the remaining nodes:
 /etc/init.d/mysql start 


Let's check our cluster; on each node run:
mysql -p
SHOW STATUS LIKE 'wsrep%';

Sample output
 +------------------------------+----------------------------------------------------------------+ | Variable_name | Value | +------------------------------+----------------------------------------------------------------+ | wsrep_local_state_uuid | 5b32cb2c-39df-11e5-b26b-6e85dd52910e | | wsrep_protocol_version | 7 | | wsrep_last_committed | 4200745 | | wsrep_replicated | 978815 | | wsrep_replicated_bytes | 4842987031 | | wsrep_repl_keys | 3294690 | | wsrep_repl_keys_bytes | 48870270 | | wsrep_repl_data_bytes | 4717590703 | | wsrep_repl_other_bytes | 0 | | wsrep_received | 7785 | | wsrep_received_bytes | 62814 | | wsrep_local_commits | 978814 | | wsrep_local_cert_failures | 0 | | wsrep_local_replays | 0 | | wsrep_local_send_queue | 0 | | wsrep_local_send_queue_max | 2 | | wsrep_local_send_queue_min | 0 | | wsrep_local_send_queue_avg | 0.002781 | | wsrep_local_recv_queue | 0 | | wsrep_local_recv_queue_max | 2 | | wsrep_local_recv_queue_min | 0 | | wsrep_local_recv_queue_avg | 0.002954 | | wsrep_local_cached_downto | 4174040 | | wsrep_flow_control_paused_ns | 0 | | wsrep_flow_control_paused | 0.000000 | | wsrep_flow_control_sent | 0 | | wsrep_flow_control_recv | 0 | | wsrep_cert_deps_distance | 40.254320 | | wsrep_apply_oooe | 0.004932 | | wsrep_apply_oool | 0.000000 | | wsrep_apply_window | 1.004932 | | wsrep_commit_oooe | 0.000000 | | wsrep_commit_oool | 0.000000 | | wsrep_commit_window | 1.000000 | | wsrep_local_state | 4 | | wsrep_local_state_comment | Synced | | wsrep_cert_index_size | 43 | | wsrep_causal_reads | 0 | | wsrep_cert_interval | 0.023937 | | wsrep_incoming_addresses | 192.168.100.202:3306,192.168.100.201:3306,192.168.100.203:3306 | | wsrep_evs_delayed | | | wsrep_evs_evict_list | | | wsrep_evs_repl_latency | 0/0/0/0/0 | | wsrep_evs_state | OPERATIONAL | | wsrep_gcomm_uuid | 91e4b4f9-62cc-11e5-9422-2b8fd270e336 | | wsrep_cluster_conf_id | 0 | | wsrep_cluster_size | 3 | | wsrep_cluster_state_uuid | 5b32cb2c-39df-11e5-b26b-6e85dd52910e | | wsrep_cluster_status | Primary | | wsrep_connected | ON | | wsrep_local_bf_aborts | 0 | | wsrep_local_index | 1 | | wsrep_provider_name | Galera | | wsrep_provider_vendor | Codership Oy <info@codership.com> | | wsrep_provider_version | 25.3.9(r3387) | | wsrep_ready | ON | | wsrep_thread_count | 2 | +------------------------------+----------------------------------------------------------------+ 

That's all. Simple, isn't it?

Note: if all your nodes are powered off at the same time, MySQL will not come back up on its own; you will have to pick the most up-to-date node and start the daemon on it with the --wsrep-new-cluster option so that the other nodes can replicate the data from it.
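
To find the most up-to-date node after a full shutdown, compare the Galera state file on each node: the one with the highest seqno is the one to bootstrap from (newer Galera versions also record a safe_to_bootstrap flag there). A minimal check, assuming the default datadir:

# run on every node and compare the seqno values
cat /var/lib/mysql/grastate.dat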



OpenvSwitch



ls1 wrote a nice article about OpenvSwitch; I recommend reading it.

Installation



Since OpenvSwitch is not available in the standard CentOS packages, we will compile and install it ourselves.
Manual build instructions
First, install all necessary dependencies:
 yum -y install wget openssl-devel gcc make python-devel openssl-devel kernel-devel graphviz kernel-debug-devel autoconf automake rpm-build redhat-rpm-config libtool 


To compile OpenvSwitch, create an ovs user and log in under it; we will perform further actions on its behalf.
adduser ovs
su - ovs


Download the sources and, following n40lab's recommendation, disable openvswitch-kmod and build the package:
mkdir -p ~/rpmbuild/SOURCES
wget http://openvswitch.org/releases/openvswitch-2.3.2.tar.gz
cp openvswitch-2.3.2.tar.gz ~/rpmbuild/SOURCES/
tar xfz openvswitch-2.3.2.tar.gz
sed 's/openvswitch-kmod, //g' openvswitch-2.3.2/rhel/openvswitch.spec > openvswitch-2.3.2/rhel/openvswitch_no_kmod.spec
rpmbuild -bb --nocheck ~/openvswitch-2.3.2/rhel/openvswitch_no_kmod.spec
exit


Create a folder for configs
 mkdir /etc/openvswitch 


Install the received RPM package
 yum localinstall /home/ovs/rpmbuild/RPMS/x86_64/openvswitch-2.3.2-1.x86_64.rpm 

In the comments, Dimonyga pointed out that OpenvSwitch is available in the RDO repository, so there is no need to compile it yourself.

Let's install it from there:
yum install https://rdoproject.org/repos/rdo-release.rpm
yum install openvswitch


Run the daemon:
systemctl start openvswitch.service
chkconfig openvswitch on


Bridge creation



Now we will configure the network bridge to which ports will be added.

ovs-vsctl add-br ovs-br0
ovs-vsctl add-port ovs-br0 enp1

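To make sure the bridge and the port were actually created, dump the OVS configuration:

ovs-vsctl show                # bridges with their ports
ovs-vsctl list-ports ovs-br0  # ports attached to our bridge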

Now let's fix the interface configs so they come up on boot:

/etc/sysconfig/network-scripts/ifcfg-enp1
DEVICE="enp1"
NM_CONTROLLED="no"
ONBOOT="yes"
IPV6INIT=no
TYPE="OVSPort"
DEVICETYPE="OVSIntPort"
OVS_BRIDGE=ovs-br0


/etc/sysconfig/network-scripts/ifcfg-ovs-br0

For kvm1:
DEVICE="ovs-br0"
NM_CONTROLLED="no"
ONBOOT="yes"
TYPE="OVSBridge"
BOOTPROTO="static"
IPADDR="192.168.100.201"
NETMASK="255.255.255.0"
GATEWAY="192.168.100.1"
DNS1="192.168.100.1"
HOTPLUG="no"

For kvm2
DEVICE="ovs-br0"
NM_CONTROLLED="no"
ONBOOT="yes"
TYPE="OVSBridge"
BOOTPROTO="static"
IPADDR="192.168.100.202"
NETMASK="255.255.255.0"
GATEWAY="192.168.100.1"
DNS1="192.168.100.1"
HOTPLUG="no"

For kvm3
DEVICE="ovs-br0"
NM_CONTROLLED="no"
ONBOOT="yes"
TYPE="OVSBridge"
BOOTPROTO="static"
IPADDR="192.168.100.203"
NETMASK="255.255.255.0"
GATEWAY="192.168.100.1"
DNS1="192.168.100.1"
HOTPLUG="no"

Restart the network, everything should start:
 systemctl restart network 




OpenNebula



Installation


So, it's time to install OpenNebula.

On all nodes:

Install the OpenNebula repository:
cat << EOT > /etc/yum.repos.d/opennebula.repo
[opennebula]
name=opennebula
baseurl=http://downloads.opennebula.org/repo/4.14/CentOS/7/x86_64/
enabled=1
gpgcheck=0
EOT


Install the OpenNebula server, the Sunstone web interface and the node packages:
 yum install -y opennebula-server opennebula-sunstone opennebula-node-kvm 


Run an interactive script that installs the necessary gems into our system:
  /usr/share/one/install_gems 


Node configuration


Each node now has the oneadmin user; we need to allow it to ssh between the nodes without a password and to run any command via sudo without a password, just like we did for the ceph user.

On each node we perform:

sudo passwd oneadmin
sudo echo "%oneadmin ALL = (root) NOPASSWD:ALL" > /etc/sudoers.d/oneadmin
sudo chmod 0440 /etc/sudoers.d/oneadmin


Let's start the Libvirt and MessageBus services:
systemctl start messagebus.service libvirtd.service
systemctl enable messagebus.service libvirtd.service


Go to kvm1

Now we will generate the key and copy it to the other nodes:
sudo ssh-keygen -f /var/lib/one/.ssh/id_rsa
sudo cat /var/lib/one/.ssh/id_rsa.pub >> /var/lib/one/.ssh/authorized_keys
sudo chown -R oneadmin: /var/lib/one/.ssh
for i in 2 3; do
  scp /var/lib/one/.ssh/* oneadmin@kvm$i:/var/lib/one/.ssh/
done


On each node we perform:

Make Sunstone listen on all IPs, not just localhost:
 sed -i 's/host:\ 127\.0\.0\.1/host:\ 0\.0\.0\.0/g' /etc/one/sunstone-server.conf 
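
A quick way to confirm the change took effect:

grep -n 'host:' /etc/one/sunstone-server.conf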


Database setup



Go to kvm1.

Create a database for OpenNebula:
mysql -p
create database opennebula;
GRANT USAGE ON opennebula.* to oneadmin@'%' IDENTIFIED BY 'PASS';
GRANT ALL PRIVILEGES on opennebula.* to oneadmin@'%';
FLUSH PRIVILEGES;


Now migrate the database from SQLite to MySQL:

Download the script sqlite3-to-mysql.py:
curl -O http://www.redmine.org/attachments/download/6239/sqlite3-to-mysql.py
chmod +x sqlite3-to-mysql.py


Convert and import our database:
sqlite3 /var/lib/one/one.db .dump | ./sqlite3-to-mysql.py > mysql.sql
mysql -u oneadmin -pPASS < mysql.sql


Now let's tell OpenNebula to connect to our database by editing /etc/one/oned.conf:

Replace
 DB = [ backend = "sqlite" ] 

with:
DB = [ backend = "mysql",
       server  = "localhost",
       port    = 0,
       user    = "oneadmin",
       passwd  = "PASS",
       db_name = "opennebula" ]


Copy it to other nodes:
for i in 2 3; do
  scp /etc/one/oned.conf oneadmin@kvm$i:/etc/one/oned.conf
done


We also need to copy the oneadmin authorization key to the other nodes, since the whole OpenNebula cluster is managed under this user.
for i in 2 3; do
  scp /var/lib/one/.one/one_auth oneadmin@kvm$i:/var/lib/one/.one/one_auth
done


Check


Now, on each node, let's try to start the OpenNebula services and check that they work:

Run
 systemctl start opennebula opennebula-sunstone 


If all is well, stop them again:
 systemctl stop opennebula opennebula-sunstone 




Configuring Failover Cluster



It's time to set up the OpenNebula HA cluster.
For some reason pcs conflicts with OpenNebula, so we will use pacemaker, corosync and crmsh instead.

On all nodes:

Disable autostart of the OpenNebula daemons:
 systemctl disable opennebula opennebula-sunstone opennebula-novnc 


Add a repository:
cat << EOT > /etc/yum.repos.d/network\:ha-clustering\:Stable.repo
[network_ha-clustering_Stable]
name=Stable High Availability/Clustering packages (CentOS_CentOS-7)
type=rpm-md
baseurl=http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-7/
gpgcheck=1
gpgkey=http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-7/repodata/repomd.xml.key
enabled=1
EOT


Install the necessary packages:
 yum install corosync pacemaker crmsh resource-agents -y 


On kvm1:

Let's edit /etc/corosync/corosync.conf and bring it to the following form:
corosync.conf
totem {
    version: 2
    crypto_cipher: none
    crypto_hash: none
    interface {
        ringnumber: 0
        bindnetaddr: 192.168.100.0
        mcastaddr: 226.94.1.1
        mcastport: 4000
        ttl: 1
    }
}
logging {
    fileline: off
    to_stderr: no
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    to_syslog: yes
    debug: off
    timestamp: on
    logger_subsys {
        subsys: QUORUM
        debug: off
    }
}
quorum {
    provider: corosync_votequorum
}
service {
    name: pacemaker
    ver: 1
}
nodelist {
    node {
        ring0_addr: kvm1
        nodeid: 1
    }
    node {
        ring0_addr: kvm2
        nodeid: 2
    }
    node {
        ring0_addr: kvm3
        nodeid: 3
    }
}


Generate keys:
cd /etc/corosync
corosync-keygen


Copy the config and keys to other nodes:
for i in 2 3; do
  scp /etc/corosync/{corosync.conf,authkey} oneadmin@kvm$i:/etc/corosync
done


And run the HA services:
systemctl start pacemaker corosync
systemctl enable pacemaker corosync


Check:
 crm status 

Output:
Last updated: Mon Nov 16 15:02:03 2015
Last change: Fri Sep 25 16:36:31 2015
Stack: corosync
Current DC: kvm1 (1) - partition with quorum
Version: 1.1.12-a14efad
3 Nodes configured
0 Resources configured

Online: [ kvm1 kvm2 kvm3 ]

Disable STONITH (the mechanism for fencing a failed node):
 crm configure property stonith-enabled=false 

If you have only two nodes, disable quorum to avoid a split-brain situation:
 crm configure property no-quorum-policy=stop 


Now create the resources:
crm configure
primitive ClusterIP ocf:heartbeat:IPaddr2 params ip="192.168.100.200" cidr_netmask="24" op monitor interval="30s"
primitive opennebula_p systemd:opennebula \
  op monitor interval=60s timeout=20s \
  op start interval="0" timeout="120s" \
  op stop interval="0" timeout="120s"
primitive opennebula-sunstone_p systemd:opennebula-sunstone \
  op monitor interval=60s timeout=20s \
  op start interval="0" timeout="120s" \
  op stop interval="0" timeout="120s"
primitive opennebula-novnc_p systemd:opennebula-novnc \
  op monitor interval=60s timeout=20s \
  op start interval="0" timeout="120s" \
  op stop interval="0" timeout="120s"
group Opennebula_HA ClusterIP opennebula_p opennebula-sunstone_p opennebula-novnc_p
exit


With these actions we created a virtual IP (192.168.100.200), added our three services to the HA cluster and combined them into the Opennebula_HA group.
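
To see which node currently holds the virtual IP and the services, you can ask pacemaker for a one-shot status, or simply look for the address on the bridge interface:

crm_mon -1                                    # one-shot cluster status
ip addr show ovs-br0 | grep 192.168.100.200   # the node holding the VIP shows the address here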

Check:
 crm status 

Output:
Last updated: Mon Nov 16 15:02:03 2015
Last change: Fri Sep 25 16:36:31 2015
Stack: corosync
Current DC: kvm1 (1) - partition with quorum
Version: 1.1.12-a14efad
3 Nodes configured
4 Resources configured

Online: [ kvm1 kvm2 kvm3 ]

Resource Group: Opennebula_HA
     ClusterIP             (ocf::heartbeat:IPaddr2):       Started kvm1
     opennebula_p          (systemd:opennebula):           Started kvm1
     opennebula-sunstone_p (systemd:opennebula-sunstone):  Started kvm1
     opennebula-novnc_p    (systemd:opennebula-novnc):     Started kvm1




OpenNebula setup


The installation is complete; it only remains to add our nodes, storage and virtual networks to the cluster.

The web interface will always be available at http://192.168.100.200:9869
Login: oneadmin
The password is in /var/lib/one/.one/one_auth
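
A simple way to check that Sunstone answers on the virtual IP, and to look up the generated credentials (the one_auth file is in user:password form):

curl -sI http://192.168.100.200:9869 | head -n 1   # should return an HTTP status line
cat /var/lib/one/.one/one_auth                     # oneadmin password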



HA VM


Now, if you want to configure High Availability for your virtual machines, then, following the official documentation, just add this to /etc/one/oned.conf:
HOST_HOOK = [
    name      = "error",
    on        = "ERROR",
    command   = "ft/host_error.rb",
    arguments = "$ID -m -p 5",
    remote    = "no" ]

And copy it to other nodes:
for i in 2 3; do
  scp /etc/one/oned.conf oneadmin@kvm$i:/etc/one/oned.conf
done
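
Since opennebula is now managed by pacemaker, it is better to restart it through the cluster rather than via systemd so that the config change is picked up; with crmsh that looks like this (resource name as defined earlier):

crm resource restart opennebula_p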




Sources




PS: Please, if you notice any shortcomings or errors, write to me in private messages

Source: https://habr.com/ru/post/270187/

