
Proxmox 4. Day Two. Thin-LVM

Good afternoon, friends. A lot of water has flowed under the bridge since the previous article: several servers have been brought up, some of them already on the new version 5. There were clusters and Ceph, and even two-node replication (a feature that appeared in version 5). I decided for myself (as advised in the comments last time) that it is simpler and more convenient to install Debian, partition the disks properly, and bring up Proxmox on top of a working software RAID.

Partitioning, LVM, and thin volumes are what this post is about.

On one server I ran into a very big and unpleasant pitfall. The server is standalone, on Debian 8. When the partitioning sets aside one large separate area as a thin-LVM pool for storing virtual machine disks, there is a subtlety that I had not taken into account before.

A real configuration example: a software RAID 10 was built from four 3 TB disks.
Of the total 5.7 TB, a separate 5.37 TB volume of type LVM-Thin is allocated for virtual machine disks. The virtual machines on it have 4.03 TB of disk space allocated in total. The machines ran quietly and gradually filled their disks. Over six months each virtual machine filled up by 20-30% on average.

One fine day (a Monday, of course, which also happened to be the first day of a long-awaited vacation), our Zabbix server began frantically sending Telegram notifications about the virtual machines. First about failures of individual services such as http or ssh, and then about complete loss of ping. I go to connect via ssh to the mail virtual machine; it is sluggish, and nothing is clear in the first few seconds. Then about a dozen more Zabbix messages arrive about problems with other virtual machines. Out of the corner of my eye I understand that every virtual machine is in trouble, and only the hypervisor itself is fine. I log in to it and open the console of the first problem machine.

And I see:

device-mapper: message ioctl on failed: Operation not supported

The first thing I thought was that the software RAID had fallen apart. But, first, why was there no notification about that from the hypervisor itself, and second, why does the hypervisor look like it is working correctly?

I look at the LVM volumes (lvs -a) and see the summary for pve/data:

Data% - 23.51%
Meta% - 99.95%

Checkmate.
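For reference, here is a minimal sketch of how those two percentages could be watched from a script instead of being discovered after the fact. The function name `check_meta` and the threshold of 80 are my own assumptions, not something from the original setup. It only parses text in the shape that `lvs --noheadings -o lv_name,data_percent,metadata_percent` prints, so it can be fed from cron or a Zabbix user parameter.

```shell
# Reads rows of "lv_name data% meta%" on stdin and warns about every
# thin pool whose metadata usage is at or above the given threshold.
# Rows with fewer than three fields (plain LVs, thin volumes, hidden
# sub-LVs) have no Meta% value and are skipped.
check_meta() {
    awk -v limit="$1" '
        BEGIN { bad = 0 }
        NF >= 3 && ($3 + 0) >= (limit + 0) {
            printf "WARNING: %s metadata at %s%% (data at %s%%)\n", $1, $3, $2
            bad = 1
        }
        END { exit bad }
    '
}

# Hypothetical usage on the hypervisor (assumes the volume group is "pve"):
#   LC_ALL=C lvs --noheadings -o lv_name,data_percent,metadata_percent pve | check_meta 80
```

The function exits with a non-zero status when at least one pool is over the threshold, which makes it easy to hook into whatever alerting is already in place.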

I check the rest of the virtual machines: they all have the same write errors, and the services are twitching in convulsions. The users are in hysterics.

All the sane articles Google finds on this topic say the same thing: expand the space by adding another physical disk.

Getting into our local Fort Knox, where this server lives, is difficult, so we lose a lot of time; we send a courier with an 8 GB USB flash drive. An hour and a half later he is on site and plugs in the flash drive. I add it to the LVM volume group and extend the metadata volume by another 3 GB with the command:

lvresize --poolmetadatasize +3G pve/data

And as a result I get Meta% - 1.58%
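The whole emergency sequence, spelled out, looks roughly like the sketch below. It is an illustration under assumptions: `/dev/sdX` stands in for whatever name the flash drive actually received, and the helper names `run` and `extend_pool_meta` plus the `APPLY` switch are hypothetical. By default it only prints the commands, so nothing is touched until `APPLY=1` is set.

```shell
# Print the command by default; execute it only when APPLY=1 is set.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# extend_pool_meta DEVICE VG POOL SIZE
# Adds a new physical volume to the volume group, then grows the
# thin pool's metadata volume by SIZE.
extend_pool_meta() {
    dev=$1; vg=$2; pool=$3; grow=$4
    run pvcreate "$dev" &&
    run vgextend "$vg" "$dev" &&
    run lvresize --poolmetadatasize "+$grow" "$vg/$pool"
}

# Dry run for the case described here (device name is a placeholder):
#   extend_pool_meta /dev/sdX pve data 3G
```

If the new device is the only physical volume in the group with free extents, the added metadata space ends up on it, which is exactly why the flash drive becomes a point of failure afterwards.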

I restart the machines one by one, checking their disks and fixing problems by hand, because some of them (the mail server, for example) refused to start cleanly without an fsck pass and manual fixes. And finally the problem is solved.

What was that, Carl? I ask myself.

When I created the Thin-LVM pool and added it to Proxmox, it never occurred to me that the metadata capacity had to be accounted for by hand, worked out on a calculator, and set manually when creating the pool. Why are such important, critical indicators not monitored, for example, in the Proxmox GUI itself?
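As for the calculator part, the Linux kernel's thin-provisioning documentation suggests a rule of thumb: metadata bytes ≈ 48 × data size / chunk size, rounded up to 2 MiB if the result is smaller, and more if many snapshots are in use. A small sketch of that arithmetic follows; the function name `thin_meta_bytes` is my own, and the chunk size of this particular pool is not given in the post, so the numbers in the example are illustrative only.

```shell
# thin_meta_bytes DATA_BYTES CHUNK_BYTES
# Kernel guideline for thin-pool metadata size:
#   48 * data_size / chunk_size, but not less than 2 MiB.
thin_meta_bytes() {
    meta=$(( 48 * $1 / $2 ))
    min=$(( 2 * 1024 * 1024 ))
    if [ "$meta" -lt "$min" ]; then
        meta=$min
    fi
    echo "$meta"
}

# Illustrative call: 1 TiB of data with 64 KiB chunks.
#   thin_meta_bytes 1099511627776 65536
#
# The real chunk size and current metadata size of a pool can be read with:
#   lvs -o lv_name,chunk_size,lv_metadata_size pve/data
```

For 1 TiB of data at 64 KiB chunks the guideline gives 805306368 bytes (768 MiB). The resulting figure is what would be passed to `--poolmetadatasize` when the pool is created, with some headroom on top.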

Guys, if it is not too much trouble, please comment on this: what was done wrong, and why is so little written about creating thin pools and about metadata in particular? And what solutions are there besides mine? There will not always be an authorized person with a flash drive who is allowed into the DC and given access to the rack, and I, while on vacation 1,000 km away, managed to solve the problem at 2:00.

PS: Of course, the result does not satisfy me. The flash drive is still sticking out of the server. It is part of the LVM volume group and can die at any moment (taking the metadata with it, which is worse than the system simply being unable to write it). When I get back I will think about how to get rid of the flash drive by other means (replacing and/or adding a disk in the server is no longer possible). On this point, comrades, I would also like to hear constructive comments.

Source: https://habr.com/ru/post/339676/

