
Proxmox cluster storage. Part two. Launch

Hello!

This is the second part of an article about working with cluster storage in Proxmox. Today we will talk about connecting the storage to the cluster.

To begin with, I want to quote an excerpt from the previous article so that no one forgets why we need all this fuss with a cluster:

In our case, the problem of organizing shared storage boils down to two aspects:
  1. We have a block device exported over the network that several hosts will access simultaneously. To keep these hosts from fighting over space on the device, we need CLVM - Clustered Logical Volume Manager. It is the same as LVM, only Clustered. Thanks to CLVM, every host has up-to-date information about the state of the LVM volumes on the shared storage (and can change it safely, without compromising integrity). Logical volumes in CLVM behave just like in ordinary LVM. The logical volumes hold either KVM images or a cluster FS.
  2. In the case of OpenVZ, we have a logical volume with a file system on it. Several machines working simultaneously with a non-cluster file system inevitably break everything - everyone pulls in a different direction, like the swan, the crayfish and the pike from the fable, only worse. The file system must be aware that it lives on a shared resource and must be able to work in that mode.

As the cluster file system, we use Global File System 2.

GFS2 in Proxmox is functional, but the Proxmox developers do not officially support it. However, the Proxmox kernel is based on the RedHat kernel of the RHEL6.x branch, so GFS2 support in the kernel is quite mature. The userspace tooling also behaves quite stably, apart from a few nuances that I will discuss later. The gfs2-utils package itself is the stable RedHat release gfs2-utils-3.1.3, practically unchanged (there are only patches to the init scripts to adapt them to Debian specifics).

The gfs2-utils package appeared in Proxmox in February 2012. Debian's native gfs2-tools package conflicts badly (which is not surprising) with the entire RedHat cluster stack in Proxmox, so before Proxmox 2.0 GFS2 was essentially unusable out of the box.

So, a huge plus is that the groundwork for getting GFS2 running in Proxmox is already in place.

As iSCSI storage we use an HP MSA 2012i. This machine is a fault-tolerant solution built around an array of hard drives connected to two independent RAID controllers. Each RAID controller has two interfaces for data transfer; this is interesting for our article because the controller cannot bond these interfaces itself. We will use multipath to load both interfaces of the controller. I will not describe the creation of volumes. The volumes are created without any authorization (I will cover the specifics of an authorized iSCSI connection from Proxmox in the next article).

Procedure


The following actions are performed on each node of the cluster.

It is advisable to configure jumbo frames.
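
As a rough sketch (the interface name eth1 and the addresses are my assumptions, and jumbo frames must also be enabled on the switch ports and on the MSA), on Debian the MTU can be set right in /etc/network/interfaces:

 # /etc/network/interfaces (fragment); eth1 as the dedicated storage interface is an assumption
 auto eth1
 iface eth1 inet static
     address 192.168.100.3
     netmask 255.255.255.0
     mtu 9000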

To work with both network interfaces of the storage, we set up multipath. Create a file /etc/multipath.conf with the following content:

 blacklist {
         devnode "cciss"
 }
 defaults {
         user_friendly_names yes
 }

The blacklist contains block devices that must be excluded from multipath processing (local disks). In our case these are the cciss devices, i.e. the HP Smart Array volumes of the controller served by the cciss kernel module.

The parameter " user_friendly_names " allows you to create user-friendly devices in / dev / mapper of the form " mpath0-part1 ".

Install the missing packages:

 root@pve03:~# apt-get install multipath-tools gfs2-utils open-iscsi parted 

The installed multipath daemon starts up right away and happily picks up the config.

Prepare the open-iscsi daemon. We need available targets to be connected automatically at system startup. Edit the /etc/iscsi/iscsid.conf file and change the line:

 node.startup = manual 

to:

 node.startup = automatic 
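
If you want to check connectivity from the shell before going through the Proxmox GUI, the targets can also be discovered and logged into manually with iscsiadm (the portal address 192.168.100.10 below is just an example; substitute the IP of your MSA controller):

 # Discover the targets published by the controller
 root@pve03:~# iscsiadm -m discovery -t sendtargets -p 192.168.100.10
 # Log in to all discovered targets
 root@pve03:~# iscsiadm -m node --login
 # List the active sessions
 root@pve03:~# iscsiadm -m session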

Configure LVM. Switch the locking type from file-based to clustered:

 root@pve03:~# lvmconf --enable-cluster 

Allow CLVM to start. In the file /etc/default/clvm:

 START_CLVM=yes 

Start CLVM. If fenced is not configured yet (see the previous article), we get an error:

 root@pve03:~# service clvm start
 Starting Cluster LVM Daemon: clvm
 clvmd could not connect to cluster manager
 Consult syslog for more information
  failed!

CLVM does not work if our node does not belong to the fence domain.
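
Whether the node has actually joined the fence domain is easy to check from the shell. This is just a quick sanity check of ours, not part of the original procedure:

 # Should list the fence domain and its members, including this node
 root@pve03:~# fence_tool ls
 # General quorum and membership state of the cluster
 root@pve03:~# cman_tool status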

Now we connect the storage to the cluster.

In the admin panel we add an "iSCSI target". After that, all nodes of the cluster should see several (in our case, two) block devices, and multipath should assemble a single device out of them and place it in the /dev/mapper directory.



Make sure that the multipath device /dev/mapper/mpath0 is indeed the iSCSI volume we need.
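
A simple way to check (a sketch of ours, not taken from the article): the assembled device should show two active paths, one per controller interface, and a size matching the volume created on the MSA:

 # Shows the assembled multipath devices, their sizes and the sd* paths behind them
 root@pve03:~# multipath -ll
 # The same size should be reported through the device mapper
 root@pve03:~# blockdev --getsize64 /dev/mapper/mpath0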

On one of the machines, partition the storage:

 root@pve03:~# parted /dev/mapper/mpath0 mklabel gpt
 root@pve03:~# parted /dev/mapper/mpath0 mkpart cluster01 0% 512G
 root@pve03:~# parted /dev/mapper/mpath0 mkpart kvm01 512G 100%

In the above example, the volume is divided into two partitions: one of 512G, and a second occupying the remaining space on the volume.

We will need the kvm01 volume later, when we set up storage for KVM.

Restart the multipath daemon:

 root@pve03:~# service multipath-tools restart 
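
After the restart, both partitions should appear in /dev/mapper next to the parent device. If they do not, the partition mappings can be created by hand with kpartx, which ships with multipath-tools (a hedged aside of ours, not from the original article):

 # The parent device and its two partitions should be listed
 root@pve03:~# ls -1 /dev/mapper/ | grep mpath0
 # If mpath0-part1/mpath0-part2 are missing, re-read the partition table:
 root@pve03:~# kpartx -a /dev/mapper/mpath0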

On the same machine, we create two clustered volume groups:

 root@pve03:~# vgcreate -cy CLUSTER01 /dev/mapper/mpath0-part1
 root@pve03:~# vgcreate -cy KVM01 /dev/mapper/mpath0-part2

The "-c" parameter indicates that the volume group is clustered.

In principle, we could have created just one volume group and kept both the KVM machine volumes and the GFS2 partition in it. That is a matter of taste.
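
To make sure the groups were really created as clustered, look at the VG attributes; the sixth character of the attribute string is "c" for a clustered group:

 # Expect something like "wz--nc" in the Attr column for both groups
 root@pve03:~# vgs -o vg_name,vg_attr CLUSTER01 KVM01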

In the CLUSTER01 group, create a logical volume:

 root@pve03:~# lvcreate -n STORAGE -l 100%FREE CLUSTER01

On all nodes of the cluster, this Logical Volume should be visible:

 root@srv-01:~# lvscan
   ACTIVE            '/dev/CLUSTER01/STORAGE' [976.56 GiB] inherit
   ACTIVE            '/dev/pve/swap' [4.00 GiB] inherit
   ACTIVE            '/dev/pve/root' [16.75 GiB] inherit
   ACTIVE            '/dev/pve/data' [38.21 GiB] inherit

We tell CLVM which volume groups to activate/deactivate on start/stop.

In the file /etc/default/clvm:

 LVM_VGS="CLUSTER01 KVM01" 

Everything is ready for creating the cluster file system. Let's see what our cluster is called:

 root@srv-01:~# pvecm status | grep "Cluster Name"
 Cluster Name: alapve
 root@srv-01:~#

The cluster name must be specified when creating the FS.

On one of the cluster nodes, format the FS:

 root@pve03:~# mkfs.gfs2 -t alapve:storage01 -j 3 /dev/mapper/CLUSTER01-STORAGE 

Here:
  1. "-t alapve:storage01" is the lock table name in the form clustername:fsname. The cluster name must match the one reported by pvecm; the FS name after the colon is arbitrary.
  2. "-j 3" is the number of journals. Every node that mounts the FS needs its own journal, so there should be at least as many journals as cluster nodes.
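
A hedged aside: if another node is added to the cluster later, the FS does not need to be recreated; once it is mounted (at /mnt/cluster/storage01, as below), extra journals can be added with gfs2_jadd:

 # Add one more journal to the mounted GFS2 file system
 root@srv-01:~# gfs2_jadd -j 1 /mnt/cluster/storage01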

Look up the UUID of our FS:

 root@srv-01:~# blkid /dev/CLUSTER01/STORAGE
 /dev/CLUSTER01/STORAGE: LABEL="alapve:storage01" UUID="8b3f1110-8a30-3f2d-6486-a3728baae57d" TYPE="gfs2"

On each node we create an entry in fstab to mount the FS:

 root@srv-01:~# echo "UUID=8b3f1110-8a30-3f2d-6486-a3728baae57d /mnt/cluster/storage01 gfs2 noatime,_netdev 0 0" >> /etc/fstab 

Create the directory /mnt/cluster/storage01 and mount the FS into it:

 root@srv-01:~# mkdir -p /mnt/cluster/storage01
 root@srv-01:~# mount /mnt/cluster/storage01
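
A quick sanity check of our own (not from the article): the FS should be mounted as gfs2 on every node, and a file created on one node should immediately be visible on the others:

 # Verify the mount type and options
 root@srv-01:~# mount | grep gfs2
 # Write from one node, read from another
 root@srv-01:~# touch /mnt/cluster/storage01/hello-from-$(hostname)
 root@pve03:~# ls /mnt/cluster/storage01/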

There is one catch. When the system is shut down, Proxmox calls the script /etc/init.d/umountiscsi.sh while stopping the open-iscsi daemon. Its job is to unmount file systems that live on iSCSI. To find such file systems it uses rather convoluted logic that sometimes fails, so it either tries to unmount more than it should or, on the contrary, does not unmount what it should. For example, we ran into attempts to unmount the root file system. It naturally failed to do so, after which the OS went into a state of permanent waiting: without stopping the iSCSI targets the system could not reboot, and umountiscsi could not unmount all iSCSI file systems because it had counted the root FS among them.

We did not dig deep into the logic of umountiscsi.sh. We decided not to rely on it: we will manage the file systems mounted on iSCSI volumes ourselves, and the role of umountiscsi.sh will be reduced to a cheerful report that "All systems are unmounted, my general!".

So, in /etc/init.d/umountiscsi.sh we change the "stop" section.
It was:

  stop|"") do_stop ;; 

It became:

  stop|"") #do_stop exit 0 ;; 

Now the system will shut down correctly. True, on one condition: at the moment of shutdown the system must not have any file systems mounted from iSCSI. If you do not want to unmount the FS manually, you can, for example, unmount it in /etc/init.d/clvm before the "stop" action; at that point all virtual machines are (must be) already shut down. We do not rely on that, and before a reboot we unmount the FS manually.
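
Our pre-reboot routine therefore looks roughly like this (a sketch; the container ID 101 is just an example, and migrating the containers away instead of stopping them works just as well):

 # Stop (or migrate away) every container that lives on the GFS2 storage
 root@pve03:~# vzctl stop 101
 # Unmount the cluster FS by hand, then reboot
 root@pve03:~# umount /mnt/cluster/storage01
 root@pve03:~# reboot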

All that remains is to create shared storage of type "Directory" in the Proxmox admin panel, point it at the path to the directory with the mounted FS, and tick the "Shared" checkbox. All OpenVZ containers created on this storage will be able to migrate safely between nodes.
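
The same storage can presumably also be added from the command line with pvesm; treat this as a sketch and check the option names against "pvesm help" on your Proxmox version (the storage ID "storage01" is our own choice):

 # Directory storage on top of the mounted GFS2, shared across the cluster
 root@pve03:~# pvesm add dir storage01 --path /mnt/cluster/storage01 --content images,rootdir --shared 1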

About problems


After several months of testing we caught a kernel panic in the gfs2 module a couple of times. Fencing works flawlessly, so at first we did not even understand what was going on - the nodes simply got rebooted a few times. After moving to a new kernel version (2.6.32-17-pve) the hangs have not recurred. The new kernel is based on the 2.6.32-279.14.1.el6 kernel from RHEL6, which includes some gfs2-related fixes.

Moving on to KVM


Everything is much simpler here. We have already created the volume group; it remains to point Proxmox at it. In the admin panel we create storage of type "LVM group": in the "Base storage" field we choose "Existing volume groups", in the "Volume group" field we select KVM01, and we tick the "Shared" checkbox. For KVM machines on this storage, the system will create logical volumes automatically.
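
And the possible CLI counterpart for the LVM storage, with the same caveat that the exact pvesm options should be checked on your version:

 # Expose the clustered volume group KVM01 to Proxmox as shared LVM storage
 root@pve03:~# pvesm add lvm KVM01 --vgname KVM01 --shared 1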



Perhaps it is time to wrap up. In the next part I will talk about how you can try to live with OpenVZ on network storage without a cluster FS, about some nuances of working with network storage, plus a few solutions for automating and streamlining life with OpenVZ.

Thanks for your attention!

Source: https://habr.com/ru/post/166919/

