Hello!

This is the second part of an article about working with cluster storage in Proxmox. Today we will talk about connecting the storage to the cluster.
To begin with, here is an excerpt from the previous article, as a reminder of why we are building all of this in the first place.
In our case, the problem of organizing shared storage boils down to two aspects:
- We have a block device exported over the network, to which several hosts will have access simultaneously. So that these hosts do not fight over space on the device, we need CLVM, the Clustered Logical Volume Manager. This is the same as LVM, only clustered. Thanks to CLVM, each host has up-to-date information about the state of the LVM volumes on the shared storage (and can change it safely, without compromising integrity). Logical volumes in CLVM behave just like in normal LVM. The logical volumes hold either KVM images or a cluster FS.
- In the case of OpenVZ, we have a logical volume with a file system on it. Having several machines work simultaneously with a non-cluster file system inevitably breaks everything: it is the proverbial swan, crayfish and pike pulling in different directions, only worse. The file system must be aware that it lives on a shared resource and be able to work in that mode.
As the cluster file system we use Global File System 2.
GFS2 is functional in Proxmox, but the Proxmox developers do not officially support it. However, the Proxmox kernel is based on the RedHat kernel of the RHEL 6.x branch, so GFS2 support in the kernel is quite mature. The userland tooling also behaves quite stably, apart from a few nuances which I will discuss later. The gfs2-utils package itself is the stable RedHat version gfs2-utils-3.1.3, practically unchanged (there are only patches to the init scripts to adapt them to Debian specifics). The gfs2-utils package appeared in Proxmox in February 2012. The native Debian gfs2-tools package conflicts wildly (which is not surprising) with the entire RedHat cluster stack in Proxmox, so before Proxmox 2.0 GFS2 was essentially unusable out of the box.
So, a huge plus is that the groundwork for running GFS2 in Proxmox is already in place.
As the iSCSI storage we use an HP MSA 2012i. This machine is a fault-tolerant solution built around an array of hard drives connected to two independent RAID controllers. Each RAID controller has two data interfaces; what makes it interesting for this article is that the controller cannot bond these interfaces itself. We will use multipath to load both interfaces of the controller. I will not describe the creation of volumes. The volumes are created without any authorization (I will cover the specifics of an authorized iSCSI connection from Proxmox in the next article).
Procedure
The following actions are performed on each node of the cluster.
It is advisable to configure jumbo frames.
To work with multiple network storage interfaces, we set up multipath. Create the file /etc/multipath.conf with the following content:
blacklist {
    devnode "cciss"
}
defaults {
    user_friendly_names yes
}
The blacklist lists block devices that must be excluded from processing (local disks). In our case, these are the cciss devices: the volumes of the HP Smart Array controller served by the cciss kernel module.
The parameter "
user_friendly_names " allows you to create
user-friendly devices in
/ dev / mapper of the form "
mpath0-part1 ".
Install the missing packages:
root@pve03:~
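On a Debian-based Proxmox 2.x node this is presumably something along the lines of:
root@pve03:~# apt-get install multipath-tools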
The installed multipath starts up immediately and happily picks up the config.
Prepare the open-iscsi daemon. We need the available targets to be connected automatically at system startup.
Edit the /etc/iscsi/iscsid.conf file and change the line:
node.startup = manual
to
node.startup = automatic
Configure LVM. Switch the locking type from file-based to clustered:
root@pve03:~
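One way to do this (a sketch, not necessarily the exact command used here) is to enable clustered locking in /etc/lvm/lvm.conf, either by hand or with the lvmconf helper:
root@pve03:~# lvmconf --enable-cluster
Either way, the result should be locking_type = 3 in /etc/lvm/lvm.conf, i.e. locking through clvmd.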
Allow CLVM to start. File /etc/default/clvm:
START_CLVM=yes
We start CLVM. If fenced is not configured yet (see the previous article), we get an error:
root@pve03:~# service clvm start
Starting Cluster LVM Daemon: clvm
clvmd could not connect to cluster manager
Consult syslog for more information
failed!
CLVM does not work if our node does not belong to the fence domain.
We connect the storage to the cluster.
In the admin panel we say "Add iSCSI target". After that, all nodes of the cluster should see several (in our case, two) block devices, and multipath should assemble them into a single device and place it in the /dev/mapper directory.

Make sure that the multipath device /dev/mapper/mpath0 is the iSCSI volume we need.
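A quick way to check is to look at the multipath topology: the device should show the expected size and two active paths, one per controller interface:
root@pve03:~# multipath -ll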
On one of the machines, partition the storage:
root@pve03:~
In the above example, the volume is divided into two partitions: one partition of 512G, and a second occupying the remaining space on the volume.
We will need the kvm01 volume later, when we set up storage for KVM.
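With parted, such a layout could be created roughly like this (the device name and the exact boundaries are assumptions, not the literal commands used here):
root@pve03:~# parted /dev/mapper/mpath0 mklabel gpt
root@pve03:~# parted /dev/mapper/mpath0 mkpart primary 0% 512GB
root@pve03:~# parted /dev/mapper/mpath0 mkpart primary 512GB 100%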
Restart the multipath daemon:
root@pve03:~
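On Debian the multipath init script is called multipath-tools, so this is presumably something like:
root@pve03:~# /etc/init.d/multipath-tools restart
After that the partitions should show up as /dev/mapper/mpath0-part1 and /dev/mapper/mpath0-part2.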
On the same machine, we create two cluster volume groups:
root@pve03:~
The "-c" parameter indicates that the volume group is clustered.
In principle, we could have created just one volume group and kept both the volumes for the KVM machines and the GFS2 volume in it. This is a matter of taste.
In the CLUSTER01 group, create a logical volume:
root@pve03:~
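For example, giving the whole group to a single volume (the volume name and size here are assumptions):
root@pve03:~# lvcreate -n storage01 -l 100%FREE CLUSTER01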
On all nodes of the cluster, this logical volume should be visible:
root@srv-01:~
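For example, lvscan should list it on every node:
root@srv-01:~# lvscan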
We tell CLVM which volume groups to activate/deactivate on start/stop. File /etc/default/clvm:
LVM_VGS="CLUSTER01 KVM01"
Everything is ready for creating the cluster file system. Let's see what our cluster is called:
root@srv-01:~
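The name can be taken, for example, from the cman status (it is the same name that is recorded in /etc/pve/cluster.conf):
root@srv-01:~# cman_tool status | grep "Cluster Name"
Cluster Name: alapve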
The cluster name must be specified when creating the FS.
On one of the nodes of the cluster we format the FS:
root@pve03:~
Here:
- "-t alapve: storage01" is the name of the lock table.
- alapve is the name of the cluster
- storage01 is a unique file system name.
- "-j 3" is the number of logs that must be created when creating an FS . Usually equal to the number of nodes in the cluster. For each host that mounts FS , a different log is required.
Look at the UUID of our FS:
root@srv-01:~
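For example, with blkid (the device path is assumed as above):
root@srv-01:~# blkid /dev/CLUSTER01/storage01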
On each node we create an entry in fstab to mount the FS:
root@srv-01:~
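A sketch of such an entry (the UUID is a placeholder, the mount options are an assumption):
UUID=<UUID of the GFS2 volume> /mnt/cluster/storage01 gfs2 defaults,noatime 0 0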
Create the directory /mnt/cluster/storage01 and mount the FS into it:
root@srv-01:~
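For example:
root@srv-01:~# mkdir -p /mnt/cluster/storage01
root@srv-01:~# mount /mnt/cluster/storage01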
There is one catch. When the system shuts down, Proxmox calls the script /etc/init.d/umountiscsi.sh while stopping the open-iscsi daemon. Its job is to unmount file systems located on iSCSI. To find such file systems it uses rather complex logic, which sometimes fails, so it either tries to unmount more than necessary or, on the contrary, misses something that should be unmounted. For example, we ran into attempts to unmount the root file system. It did not succeed, of course, after which the OS entered a state of permanent waiting: without stopping the iSCSI targets the system could not reboot, and umountiscsi could not unmount all the iSCSI file systems because it had counted the root among them.
We did not dig deep into the logic of umountiscsi.sh. We decided not to rely on it: we will manage the file systems mounted on iSCSI volumes ourselves, and the role of umountiscsi.sh will be reduced to a cheerful report that "All systems are unmounted, my general!".
So, in /etc/init.d/umountiscsi.sh we change the "stop" section.
It was:
stop|"") do_stop ;;
It became:
stop|"")
Now the system will shut down correctly. True, on one condition: at shutdown time the system must not have any file systems mounted over iSCSI. If you do not want to unmount the FS manually, you can, for example, unmount it in /etc/init.d/clvm before "stop" is called. At that point, all virtual machines have (must have) been shut down. We prefer not to count on that, and unmount the FS manually before a reboot.
All that remains is to create a shared storage of type "Directory" in the Proxmox admin panel, point it at the directory with the mounted FS, and mark it as shared. All OpenVZ containers created on this storage will be able to migrate safely between nodes.
About problems
After several months of testing, we caught a kernel panic in the gfs2 module a couple of times.
Fencing works fine, so at first we did not even understand what was happening: the nodes were simply rebooted a few times. After moving to a new version of the kernel (2.6.32-17-pve) there were no more hangs. The new kernel is based on the 2.6.32-279.14.1.el6 kernel from RHEL6 and contains some fixes related to gfs2.
Moving on to KVM
Everything is much simpler here. We have already created a volume group; all that remains is to point Proxmox at it. In the admin panel we create a storage of type "LVM Group", in the "base storage" field we choose "existing volume groups", in the "volume group" field we select KVM01, and mark it as shared. For the KVM machines on this storage, the system will automatically create logical volumes.

Perhaps it is worth wrapping up here. In the next part I will talk about how one can try to live with OpenVZ on network storage without a cluster FS, about some nuances of working with network storage, plus some solutions for automating and optimizing life with OpenVZ.
Thanks for your attention!