
Live migration in OpenStack

Live migration is the transfer of a running instance from one compute node to another. This feature is very popular among cloud administrators: it is mainly used to achieve zero downtime during cloud maintenance, and it also helps keep the cloud running smoothly, since it lets you move running instances from a heavily loaded compute node to a less loaded one.

Planning for live migration should start early, during the planning and design of an OpenStack deployment. Consider the following:
• Not all hypervisors currently support live migration in OpenStack, so check the HypervisorSupportMatrix to see whether yours does. Hypervisors that currently support this feature include KVM, QEMU, XenServer/XCP, and Hyper-V.

• In a standard OpenStack deployment, each compute node manages its instances locally in a dedicated directory (for example, /var/lib/nova/instances/), but for live migration this directory must be stored centrally and shared by all compute nodes. A shared file system or shared block storage is therefore an important requirement for live migration. For such storage, you must correctly configure and start a distributed file system such as GlusterFS or NFS before performing the migration. SAN storage protocols such as Fibre Channel (FC) and iSCSI can also be used for shared storage.

• To grant file access rights when using centralized shared storage, make sure that the UID and GID of the Compute user (nova) are the same on the controller node and on all compute nodes (assuming that the shared storage is located on the controller node). In addition, the UID and GID of libvirt-qemu must be the same on all compute nodes.

• It is important to set vncserver_listen = 0.0.0.0 so that the VNC server accepts connections from all compute nodes regardless of where the instances are running. If this value is not set, VNC access to migrated instances may fail, because the IP address of the destination compute node will not match the IP address of the source compute node.
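
As a quick sanity check for the last point, the sketch below reads a setting from an INI-style configuration file, so that you can confirm the value on each compute node. The helper name conf_value is hypothetical, and /etc/nova/nova.conf is an assumed location for the Compute configuration; adjust both to your deployment.

```shell
# conf_value FILE KEY
# Print the value of the first uncommented "KEY = value" line in FILE.
# Hypothetical helper for illustration, not part of nova.
conf_value() {
    awk -v key="$2" '
        /^[ \t]*#/ { next }                  # skip commented-out lines
        index($0, "=") > 0 {
            k = substr($0, 1, index($0, "=") - 1)
            v = substr($0, index($0, "=") + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)   # trim whitespace around the key
            gsub(/^[ \t]+|[ \t]+$/, "", v)   # trim whitespace around the value
            if (k == key) { print v; exit }
        }
    ' "$1"
}

# Example (assumed path):
#   conf_value /etc/nova/nova.conf vncserver_listen
```

If the command prints nothing, the option is either missing or commented out in that file.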

The instructions below enable live migration on a multi-node OpenStack deployment that uses the KVM hypervisor on Ubuntu 12.04 LTS and the NFS file system. This guide assumes that a multi-node deployment is already up and running, configured with an automated installer such as Mirantis Fuel. The guide was tested with the following components: a cloud controller node, a network node running the Neutron networking service, and two compute nodes.

Note that this guide does not cover the security aspects of live migration. Study this important topic yourself, and do not treat these instructions as ready to use from a security point of view.
The guide has two parts: the first covers the NFS deployment procedure; the second demonstrates live migration.

Part 1: Deploying NFS File System


The cloud controller node is the NFS server. The goal is to share /var/lib/nova/instances among all the compute nodes of your OpenStack cluster. This directory contains the libvirt KVM disk image files for the instances hosted on a compute node. If you do not run the cloud with shared storage, each compute node has its own separate copy of this directory. Note that if instances are already running in your cloud before you configure live migration, you need to take precautions so that these instances are not blocked.

Do the following on the NFS server / controller node:
1. Install the NFS server.
root@vmcon-mn:~# apt-get install nfs-kernel-server

2. IDMAPD supports the NFSv4 kernel client and server by translating user and group IDs to names and vice versa. Edit /etc/default/nfs-kernel-server and set the option shown below to yes. This file must be the same on both the client and the NFS server.
NEED_IDMAPD=yes # only needed for Ubuntu 11.10 and earlier

3. Make sure that the file /etc/idmapd.conf contains the following:
[Mapping]

Nobody-User = nobody
Nobody-Group = nogroup

4. To share /var/lib/nova/instances, add the following to /etc/exports:
/var/lib/nova/instances 192.168.122.0/24(rw,fsid=0,insecure,no_subtree_check,async,no_root_squash)

Where 192.168.122.0/24 is the network address of the compute nodes (usually the data network) for your OpenStack cluster.

5. Set the 'execute' bit on the shared directory so that qemu can use the images inside it when the directory is exported to the compute nodes.
root@vmcom1-mn:~# chmod o+x /var/lib/nova/instances

6. Restart the services.
root@vmcon-mn:~# service nfs-kernel-server restart
root@vmcon-mn:~# /etc/init.d/idmapd restart

Perform the following steps on each compute node:
1. Install the NFS client services.
root@vmcom1-mn:~# apt-get install nfs-common

2. Edit /etc/default/nfs-common and set the option shown below to yes.
NEED_IDMAPD=yes # only needed for Ubuntu 11.10 or earlier

3. Mount the shared file system from the NFS server.
mount NFS-SERVER:/var/lib/nova/instances /var/lib/nova/instances
Where NFS-SERVER is the host name or IP address of the NFS server.

4. To avoid typing this command again after each reboot, add the following line to /etc/fstab:
nfs-server:/ /var/lib/nova/instances nfs auto 0 0

5. On every compute node, check that the permissions are set as shown below. This indicates that the correct permissions were set on the controller node with the chmod o+x command above.
root@vmcom1-mn:~# ls -ld /var/lib/nova/instances/
drwxr-xr-x 8 nova nova 4096 Oct 3 12:41 /var/lib/nova/instances/

6. Ensure that the exported directory can be mounted and that it is mounted.
root@vmcom1-mn:~# mount -a -v
root@vmcom1-mn:~# df -k
Filesystem                       1K-blocks    Used Available Use% Mounted on
/dev/vda1                          6192704 1732332   4145800  30% /
udev                               1991628       4   1991624   1% /dev
tmpfs                               800176     284    799892   1% /run
none                                  5120       0      5120   0% /run/lock
none                               2000432       0   2000432   0% /run/shm
cgroup                             2000432       0   2000432   0% /sys/fs/cgroup
vmcon-mn:/var/lib/nova/instances   6192896 2773760   3104512  48% /var/lib/nova/instances

Check that the last line of your output matches the last line shown above. It means that /var/lib/nova/instances was exported correctly from the NFS server. If this line is missing, your NFS setup may not be working correctly, and you need to fix it before continuing.

7. Update the libvirt settings. Modify /etc/libvirt/libvirtd.conf. For the full list of options, see the libvirtd configuration documentation.
before: #listen_tls = 0
after: listen_tls = 0

before: #listen_tcp = 1
after: listen_tcp = 1

add: auth_tcp = "none"

8. Modify /etc/init/libvirt-bin.conf.
before: exec /usr/sbin/libvirtd -d
after: exec /usr/sbin/libvirtd -d -l

-l is short for --listen

9. Modify /etc/default/libvirt-bin.
before: libvirtd_opts="-d"
after: libvirtd_opts="-d -l"

10. Restart libvirt. After executing this command, make sure that the restart of libvirt was successful.
$ stop libvirt-bin && start libvirt-bin
$ ps -ef | grep libvirt
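
Before moving on, it can help to confirm the compute-node setup from steps 6 to 9. The following is a minimal sketch with two hypothetical helpers: is_nfs_mount looks in a mount table (by default /proc/mounts) for an NFS mount at a given path, and libvirt_tcp_enabled checks that libvirtd.conf contains the three uncommented settings from step 7. The function names and default arguments are assumptions made for illustration.

```shell
# is_nfs_mount MOUNTPOINT [MOUNT_TABLE]
# Succeed if MOUNTPOINT is listed with an nfs or nfs4 file system type.
is_nfs_mount() {
    awk -v mp="$1" '
        $2 == mp && $3 ~ /^nfs/ { found = 1 }
        END { exit !found }
    ' "${2:-/proc/mounts}"
}

# libvirt_tcp_enabled [CONF]
# Succeed if the three uncommented settings from step 7 are present.
libvirt_tcp_enabled() {
    conf=${1:-/etc/libvirt/libvirtd.conf}
    grep -Eq '^[[:space:]]*listen_tls[[:space:]]*=[[:space:]]*0' "$conf" &&
    grep -Eq '^[[:space:]]*listen_tcp[[:space:]]*=[[:space:]]*1' "$conf" &&
    grep -Eq '^[[:space:]]*auth_tcp[[:space:]]*=[[:space:]]*"none"' "$conf"
}

# Example:
#   is_nfs_mount /var/lib/nova/instances && libvirt_tcp_enabled && echo "node looks ready"
```

Both helpers only read files, so they are safe to run repeatedly on each compute node.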

Other settings

You can skip the steps below if live migration was planned from the very beginning and the basic requirements listed in the introduction are met. These steps ensure that the nova UID and GID are the same on the controller node and on all compute nodes, and that the libvirt-qemu UID and GID are the same on all compute nodes. To achieve this, you must manually change the UID and GID so that they match across the compute nodes and the controller.
Do the following:
1. Check the nova ID value on the controller node, then do the same on all compute nodes:
[root@vmcon-mn ~]# id nova
uid=110(nova) gid=117(nova) groups=117(nova),113(libvirtd)

2. Now that we know the nova UID and GID values, we can change them on all compute nodes, as shown below:
[root@vmcom1-mn ~]# usermod -u 110 nova
[root@vmcom1-mn ~]# groupmod -g 117 nova

Repeat these steps for all compute nodes.

3. Do the same for libvirt-qemu, but remember that the controller node does not have this user, because the controller does not start the hypervisor. Ensure that all compute nodes have the same UID and GID for user libvirt-qemu.

4. Since we changed the UID and GID of the nova and libvirt-qemu users, we need to make sure the change is reflected in all the files owned by these users. To do this, stop the nova-api and libvirt-bin services on the compute node, then reassign every file that still carries the old UID or GID to the nova and libvirt-qemu users and groups. For example:
[root@vmcom1-mn ~]# service nova-api stop
[root@vmcom1-mn ~]# service libvirt-bin stop
[root@vmcom1-mn ~]# find / -uid 106 -exec chown nova {} \; # 106 is the old nova UID before the change
[root@vmcom1-mn ~]# find / -uid 104 -exec chown libvirt-qemu {} \; # 104 is the old libvirt-qemu UID before the change
[root@vmcom1-mn ~]# find / -gid 107 -exec chgrp nova {} \; # 107 is the old nova GID before the change
[root@vmcom1-mn ~]# find / -gid 104 -exec chgrp libvirt-qemu {} \; # 104 is the old libvirt-qemu GID before the change
[root@vmcom1-mn ~]# service nova-api restart
[root@vmcom1-mn ~]# service libvirt-bin restart
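
To see quickly whether a node needs the changes above, you can compare the output of id across nodes. The sketch below uses two hypothetical helpers: parse_ids reduces a line of id output to UID:GID, and ids_match succeeds only when all the pairs passed to it are identical. Collecting the output over ssh, as in the commented example, is an assumption about your environment.

```shell
# parse_ids ID_OUTPUT
# Turn "uid=110(nova) gid=117(nova) groups=..." into "110:117".
parse_ids() {
    printf '%s\n' "$1" |
        sed -n 's/^uid=\([0-9][0-9]*\)[^ ]* gid=\([0-9][0-9]*\).*/\1:\2/p'
}

# ids_match PAIR...
# Succeed only if every "uid:gid" argument equals the first one.
ids_match() {
    first=$1
    for pair in "$@"; do
        [ "$pair" = "$first" ] || return 1
    done
    return 0
}

# Example, assuming root ssh access to the nodes:
#   ids_match "$(parse_ids "$(ssh vmcon-mn id nova)")" \
#             "$(parse_ids "$(ssh vmcom1-mn id nova)")" || echo "nova IDs differ"
```

The same comparison works for libvirt-qemu; in that case pass only the compute nodes, since the controller does not have this user.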

Part 2: Live migration of an OpenStack virtual machine


After the OpenStack cluster and the NFS file system are configured properly, you can proceed with the live migration. Perform the following steps on the controller node:
1. Check running instance IDs.
root@vmcon-mn:~# nova list
+--------------------------------------+------+--------+------------------------+
| ID                                   | Name | Status | Networks               |
+--------------------------------------+------+--------+------------------------+
| 0bb04bc1-5535-49e2-8769-53fa42e184c8 | vm1  | ACTIVE | net_proj_one=10.10.1.4 |
| d93572ec-4796-4795-ade8-cfeb2a770cf2 | vm2  | ACTIVE | net_proj_one=10.10.1.5 |
+--------------------------------------+------+--------+------------------------+

2. Check which compute nodes the instances run on.
root@vmcon-mn:~# nova-manage vm list

instance node type state launched image kernel ramdisk project user zone index
vm1 vmcom2-mn m1.tiny active 2013-10-03 13:33:52 b353319f-efef-4f1a-a20c-23949c82abd8 419303e31d40475a9c5b7d554b28a22f cd516c290d4e437d8605b411af4108fe4f4a4f cd516c290d4e437d8605b411af4108a4f cf5c7904a4f4a4f cd516c290d4e437d8605b411a4f4a4f4f4a4f cf5c7b4a4f4a4f cd516c290d4e47a4f
vm2 vmcom1-mn m1.tiny active 2013-10-03 13:34:33 b353319f-efef-4f1a-a20c-23949c82abd8 419303e31d40475a9c5b7d554b28a22f cd516c290d4e437d8605b411af4108a4f4f4f4f4a4f cd516c290d4e437d8605b411af4108af

We see that the vm1 virtual machine is running on compute node 2 (vmcom2-mn), and vm2 is running on node 1 (vmcom1-mn).

3. Perform a live migration. We will move vm1, with ID 0bb04bc1-5535-49e2-8769-53fa42e184c8 (obtained from the nova list output above), from compute node 2 to compute node 1, vmcom1-mn (see the nova-manage vm list output above). Note that this is an administrative operation, so you usually need to export the environment variables, or source a file containing the admin credentials, first.
root@vmcon-mn:~# export OS_TENANT_NAME=admin
root@vmcon-mn:~# export OS_USERNAME=admin
root@vmcon-mn:~# export OS_PASSWORD=admin
root@vmcon-mn:~# export OS_AUTH_URL="http://10.0.0.51:5000/v2.0/"
root@vmcon-mn:~# nova live-migration 0bb04bc1-5535-49e2-8769-53fa42e184c8 vmcom1-mn

If successful, the nova live-migration command returns nothing.

4. Verify that the migration is complete by running:
root@vmcon-mn:~# nova-manage vm list

instance node type state launched image kernel ramdisk project user zone index
vm1 vmcom1-mn m1.tiny active 2013-10-03 13:33:52 b353319f-efef-4f1a-a20c-23949c82abd8 419303e31d40475a9c5b7d554b28a22f cd516c290d4e437d8605b411af4108a4f4f4f4a4f cd516c290d4e437d8605b411af4108a4f cd5c290d4e437d4604b4d44a4f4f4a4f cd516c290d4e437d4604b4d44a4f4f4a4f cd5ccd904f4a4f
vm2 vmcom1-mn m1.tiny active 2013-10-03 13:34:33 b353319f-efef-4f1a-a20c-23949c82abd8 419303e31d40475a9c5b7d554b28a22f cd516c290d4e437d8605b411af4108a4f4f4f4f4a4f cd516c290d4e437d8605b411af4108af

We see that both instances now run on the same node.
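
The check in step 4 can also be scripted. As a minimal sketch, the hypothetical helper below reads nova-manage vm list output on standard input and prints the node column for a given instance name, assuming the column layout shown above (instance name first, node second).

```shell
# instance_node NAME
# Read "nova-manage vm list" output on stdin and print the node hosting NAME.
instance_node() {
    awk -v name="$1" '$1 == name { print $2; exit }'
}

# Example:
#   [ "$(nova-manage vm list | instance_node vm1)" = "vmcom1-mn" ] && echo "vm1 migrated"
```

The helper prints nothing when the instance name is not found, so an empty result should be treated as a failed check.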

Conclusion


Live migration is important for achieving zero downtime during OpenStack cloud maintenance, when some of the compute nodes need to be shut down. The steps above for sharing data and migrating a running instance demonstrate live migration on an OpenStack Grizzly cloud running on Ubuntu 12.04 with the NFS file system.

Original article in English

Source: https://habr.com/ru/post/206224/

