In this article, you will learn how to set up a failover cluster to run the NetUP UTM5 billing system, how to set up encryption, and how to back up "valuable" data.
Why a cluster?
I hope I am not revealing a big secret by saying that a billing system should be as fault-tolerant as possible. And as you know, the main way to achieve fault tolerance is redundancy. We will achieve it with Ganeti.
Ganeti is a cluster management system built on top of the Xen or KVM virtualization platforms. It uses DRBD to organize failover clusters.
To implement this, we need two servers with Debian Lenny preinstalled. In practice, the billing system will run on one of them, while the second will be in hot standby.
Let one of the servers be node1 and the other node2.
All of the following must be done on each of the nodes.
Let's start with disk partitioning. It is assumed that each server has two identical hard drives. During installation, I created a 2 GB software RAID 1 "mirror" and installed the base system on it.
It looks like this.
node1:~# fdisk -l /dev/hda
Device Boot Start End Blocks Id System
/dev/hda1 1 243 1951866 fd Linux raid autodetect
node1:~# fdisk -l /dev/hdb
/dev/hdb1 1 243 1951866 fd Linux raid autodetect
node1:~# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 hda1[0] hdb1[1]
1951744 blocks [2/2] [UU]
Now let's create another mirrored RAID array, say 5 GB in size. The size of the array is up to you, but it must be the same on both nodes.
node1:~# fdisk -l /dev/hda
/dev/hda1 1 243 1951866 fd Linux raid autodetect
/dev/hda2 244 865 4996215 83 Linux
node1:~# fdisk -l /dev/hdb
/dev/hdb1 1 243 1951866 fd Linux raid autodetect
/dev/hdb2 244 865 4996215 83 Linux
Create a raid.
node1:~# mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/hda2 /dev/hdb2
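While the new array is synchronizing, you can monitor its progress (an optional check):
node1:~# cat /proc/mdstat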
Register it so it is assembled automatically at boot.
node1:~# mdadm --examine --scan | grep -v /dev/md0 >> /etc/mdadm/mdadm.conf
Add entries describing the hosts to /etc/hosts; this is needed so the nodes address each other correctly.
node1:~# mcedit /etc/hosts
192.168.0.111 node1.name.org node1
192.168.0.112 node2.name.org node2
192.168.0.113 cluster1.name.org cluster1
192.168.0.114 inst1.name.org inst1
I will omit the description of the network interface settings; I trust you can handle that yourself.
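Still, as a minimal sketch, /etc/network/interfaces on node1 might look roughly like this (the netmask and gateway here are assumptions for the 192.168.0.0/24 network used in the examples; with Xen's network-bridge script a plain static configuration of eth0 is enough, the bridge is set up automatically):
node1:~# mcedit /etc/network/interfaces
auto lo
iface lo inet loopback
auto eth0
iface eth0 inet static
address 192.168.0.111
netmask 255.255.255.0
gateway 192.168.0.1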
Install the LVM2 package
node1:~# apt-get install lvm2
node1:~# pvcreate /dev/md1
Physical volume "/dev/md1" successfully created
node1:~# vgcreate xenvg /dev/md1
Volume group "xenvg" successfully created
Install Ganeti
node1:~# apt-get install ganeti
Configuring Xen
node1:~# mcedit /etc/xen/xend-config.sxp
(xend-relocation-server yes)
(xend-relocation-port 8002)
(xend-relocation-address '')
(network-script network-bridge)
# (network-script network-dummy)
(vif-script vif-bridge)
(dom0-min-mem 0)
Configure GRUB
node1:~# mcedit /boot/grub/menu.lst
## Xen hypervisor options to use with the default Xen boot option
# xenhopt=dom0_mem=256M
Update the bootloader
node1:~# /sbin/update-grub
Reboot, and if everything works, proceed to creating the cluster.
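A quick sanity check that the node actually booted under the Xen hypervisor (not strictly required): run xm list and make sure Domain-0 is present.
node1:~# xm list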
Install DRBD
node1:~# apt-get install drbd8-modules-2.6.26-2-xen-686 drbd8-utils
Add to startup
node1:~# echo drbd minor_count=64 >> /etc/modules
Load the module (only needed, of course, if you have not rebooted the node)
node1:~# modprobe drbd minor_count=64
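To verify that the module is loaded, you can look at /proc/drbd (an optional check):
node1:~# cat /proc/drbd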
Configure LVM
node1:~# mcedit /etc/lvm/lvm.conf
filter = ["r | / dev / cdrom |", "r | / dev / drbd [0-9] + |"]
Now everything is ready to initialize the cluster. Let node1 be the cluster master, cluster1.
On the first node (node1), execute the command
node1:~# gnt-cluster init -b eth0 -g xenvg --master-netdev eth0 cluster1.name.org
Add node2 to the cluster.
node1:~# gnt-node add node2.name.org
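Before creating the instance, it does not hurt to make sure the cluster sees both nodes and is healthy (an optional check):
node1:~# gnt-node list
node1:~# gnt-cluster verify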
If everything went well, create an instance (the virtual machine in which the billing will work)
node1:~# gnt-instance add -t drbd -n node2.name.org:node1.name.org -o debootstrap -s 2g --swap-size 128 -m 256 --kernel /boot/vmlinuz-2.6.26-2-xen-686 --ip 192.168.0.114 inst1.name.org
So, what does this command do?
1. We create a new virtual machine in the cluster called inst1.name.org.
2. We allocate 2 GB of disk, 128 MB of swap and 256 MB of RAM to it.
3. The distribution for the virtual machine is Debian (debootstrap), with a Xen kernel.
4. Pay special attention to the -n option. It determines which node will be the primary and which will be the standby.
So the entry node2.name.org:node1.name.org indicates that node2 is primary and node1 is secondary (in hot spare).
If the cluster master is the first node (node1), it is preferable to run the instance on the second node (node2). If the first node (node1) fails, billing keeps working, and as soon as the first node (node1) returns to service, the network RAID is resynchronized and normal cluster operation is restored. If the second node (node2) fails, we retain cluster management and can move the instance to the first node (node1) with minimal downtime, then calmly bring the second node back into service without fear of a "split-brain".
To gain access to the instance's console, we perform a few manipulations.
node1:~# gnt-instance shutdown inst1.name.org
node1:~# gnt-instance startup --extra "xencons=tty1 console=tty1" inst1.name.org
Only now can we get full access.
node1:~# gnt-instance console inst1.name.org
A list of manipulations "inside" the instance (by default, root has an empty password):
Configure the network (again, I count on your intellect, or your persistence).
Configure apt (a sample sources.list is sketched after this list).
Install openssh-server and udev
Add a line to /etc/fstab
none /dev/pts devpts gid=5,mode=620 0 0
echo inst1 > /etc/hostname (configure the hostname)
apt-get install locales (set locale)
dpkg-reconfigure locales (Configuring Locales)
tasksel install standard (We are reinstalling packages included in the “standard build”)
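For the apt step above, a minimal /etc/apt/sources.list for Lenny might look roughly like this (the mirror choice is an assumption, pick one close to you):
mcedit /etc/apt/sources.list
deb http://ftp.debian.org/debian lenny main
deb http://security.debian.org/ lenny/updates main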
Initial cluster setup is complete.
I strongly recommend studying the Ganeti documentation. You should clearly understand how to move an instance from node to node in normal operation, how to fail it over after an accident, and so on. And be sure to write yourself a reminder on how to act in an emergency, laminate it, and hang it where you can see it, because accidents happen rarely and the brain tends to forget.
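As a starting point for such a reminder, the basic operations look roughly like this (a sketch only; the exact syntax and flags depend on your Ganeti version, check them against the documentation).
Planned move of the instance to its secondary node (the instance is restarted there):
node1:~# gnt-instance failover inst1.name.org
After a failed node has been repaired and brought back, rebuild its half of the DRBD mirror (see the documentation for the exact replace-disks flags):
node1:~# gnt-instance replace-disks -s inst1.name.org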
What should you think about first after starting the cluster? Personally, it seems to me that you need to be fully prepared for an unexpected visit from harsh men in masks, to whom it is very hard to refuse anything. Being prepared means encrypting all valuable data and backing it up in a timely manner.
All further settings relate directly to the virtual machine (instance).
Let's start with encryption. We will do it as follows: when the instance starts, the base system boots, then the system waits for a passphrase to mount the file containing the encrypted partition, and only after that do we start the database, billing, and the other "necessary services" (a sketch of how this can be tied into the startup sequence is given after the mount example below).
Thanks to this, everything is stored inside the instance, but for a migration we have to shut the machine down (unmount the encrypted partition), otherwise the data on the encrypted partition will not be fully synchronized.
Install the necessary packages.
inst1:~# apt-get install loop-aes-modules-2.6.26-2-xen
inst1:~# apt-get install loop-aes-utils
Unload the "old" module (or reboot the instance, just to be sure)
inst1:~# modprobe -r loop
Load the updated module
inst1:~# insmod /lib/modules/2.6.26-2-xen/updates/loop.ko
Create file for encrypted partition
inst1:~# dd if=/dev/zero of=/var/encrypt_file bs=4k count=10000
Set up the created file as an encrypted loop device. I will not give recommendations on the passphrase; that is a matter of taste.
inst1:~# losetup -e AES256 /dev/loop1 /var/encrypt_file
Create a file system; with our level of redundancy, ext2 will be enough.
inst1:~# mkfs -t ext2 /dev/loop1
Detach the loop device
inst1:~# losetup -d /dev/loop1
An example of mounting an encrypted partition
inst1:~# mount /var/encrypt_file -o loop=/dev/loop1,encryption=AES256 /var/secure/ -t ext2
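A minimal sketch of how the boot sequence described earlier could be wired up, for example with a small script called from the startup scripts (the script path and the service names here are illustrative, not from the original setup; UTM5 is normally started by its own init script):
inst1:~# mcedit /usr/local/bin/start_billing
#!/bin/sh
# ask for the passphrase and mount the encrypted volume
mount /var/encrypt_file -o loop=/dev/loop1,encryption=AES256 /var/secure/ -t ext2
# start the database and billing only once the encrypted data is available
/etc/init.d/mysql start
/etc/init.d/utm5_core start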
Encrypting the swap file is at your discretion, but note that paranoia is a contagious thing.
How should you organize data backup? You can run wonderful self-written backup scripts from cron. Or you can take a responsible approach and set up a centralized backup system, for example Bacula; recommendations and configuration examples can be found here:
Setting up and understanding Bacula. Backups should also be stored on an encrypted partition, and the backup server should be kept somewhere far away, in a place known only to a narrow circle of people; for reliability you can cover it generously with pigeon droppings so that nobody even thinks of touching it.
To back up the database, I advise adding these lines to the job description (Job):
ClientRunBeforeJob = "/usr/local/bin/create_mysql_dump"
ClientRunAfterJob = "/bin/rm -f /var/lib/bacula/mysql.dump"
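The dump script itself can be a minimal sketch along these lines (the database name, user and password are placeholders, substitute your own):
inst1:~# mcedit /usr/local/bin/create_mysql_dump
#!/bin/sh
# dump the billing database to the file that Bacula picks up and removes after the job
mysqldump --user=utm5 --password=secret UTM5 > /var/lib/bacula/mysql.dump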
Keep in mind that the database has a habit of growing in size, which increases the time needed for the dump. To avoid this problem, you can use the table archiving function, implemented starting from build 006. The logic is simple: archive tables are stored in a separate database, and views of the archive tables can be created so the billing core can still access them. The backup then runs significantly faster, since only the structure of the views is saved, without the data itself. The archive tables themselves do not need to be backed up regularly; it is enough to keep 2 copies in case of fire.
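As an illustration of the idea (the database and table names here are invented; the real archive table names depend on your UTM5 version), a view in the main billing database can point at a table living in the archive database, so the core keeps working while the dump stays small:
inst1:~# mysql -u root -p -e "CREATE VIEW UTM5.payments_archive AS SELECT * FROM UTM5_archive.payments_archive"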
The result of all of the above is a fault-tolerant cluster with increased redundancy of the disk subsystem, the possibility of geographically separating the nodes, and encryption and backup of valuable data.
P.S. Why did the choice fall on NetUP UTM5?
When the company bought the license, I first thought it was because of the detailed documentation, which at the time of purchase (late 2006) ran to as many as 260 pages, although it was frankly crude. Then I thought it was because of the competent tech support. I will not argue, the level of support was high, but it was exceptionally "unfriendly", which, coupled with the raw documentation, overshadowed the joy of purchasing the "product" and stretched the transition out over more than a year. In the end, it all turned out to be simple: our director went to visit her friends, and, as you know, a monkey will devour red berries with frenzy only if it sees other monkeys doing the same. This is how we became the proud owners of this "product".
To be completely honest, the "product" proved to be extremely capricious and not well documented, but, as it turned out, with the right settings it was very stable: the configured system worked for over a year without any special intervention, and the only maintenance needed was cleaning up the database.