
ZFS on Linux is not that simple

After reading the article "ZFS on Linux - easy and simple," I decided to share my modest experience of using this file system on a pair of Linux servers.

To begin with, a lyrical digression. ZFS is awesome. It is so good that it outweighs all the drawbacks of a file system ported from an ideologically different platform. The Solaris kernel works with primitives different from Linux's, so to make porting ZFS with the Solaris code possible, the developers created a compatibility layer, SPL - the Solaris Porting Layer. This layer seems to work fine, but it is extra code running in kernel mode, which may well become a source of failures.

ZFS is not fully compatible with the Linux VFS. For example, you cannot manage access control lists through the POSIX API and the getfacl / setfacl commands, which Samba, storing NTFS permissions in ACLs, does not like at all. As a result, it is impossible to set up proper permissions on Samba shares and files. Samba theoretically supports ZFS ACLs, but that module for Linux still has to be built ... Extended file system attributes, on the other hand, are present in ZFS on Linux and work fine.
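For the curious, this is roughly how the failure looks from the shell (a minimal sketch; /tank/share is just an illustrative path and the exact error text may vary):

# attempt to add a POSIX ACL entry on a file stored on a ZFS dataset
setfacl -m u:nobody:r /tank/share/file.txt
# typically fails with: setfacl: /tank/share/file.txt: Operation not supported

# getfacl falls back to showing only the plain owner/group/other mode bits
getfacl /tank/share/file.txt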
In addition, 32-bit Solaris uses a different memory allocation mechanism than Linux. Therefore, if you decide to try ZFS on Linux on the x86 architecture rather than x86_64, get ready for glitches: 100% CPU load on elementary operations and kilometers of errors in dmesg await you. As the ZFS on Linux developers put it: "You are strongly encouraged to use a 64-bit kernel. At the moment zfs will build in a 32-bit environment but will not run stably."

ZFS is a kind of "thing in itself", and it stores parameters in its metadata that are not typical for Linux. For example, the mount point is kept as a property of the file system itself, and the file system is mounted by the zfs mount command, which automatically makes it incompatible with /etc/fstab and the other usual ways of mounting file systems in Linux. You can, of course, set mountpoint=legacy and keep using mount, but that, you must agree, is inelegant. In Ubuntu the problem is solved by the mountall package, which contains ZFS-specific mount scripts and a patched mount command.
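For illustration, here are both ways of handling the mount point, assuming a hypothetical pool tank with a dataset tank/data (the names and paths are placeholders):

# the native way: the mount point lives in ZFS metadata
zfs set mountpoint=/srv/data tank/data
zfs mount tank/data

# the legacy way: ZFS steps aside and /etc/fstab takes over
zfs set mountpoint=legacy tank/data
echo 'tank/data  /srv/data  zfs  defaults  0  0' >> /etc/fstab
mount /srv/data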

The next problem is point-in-time copies of the system, the so-called snapshots. ZFS contains a very efficient snapshot implementation that lets you build a "time machine" - a set of images covering, say, a month, with a resolution of one image per 15 minutes. Ubuntu maintainers, of course, packaged this feature as zfs-auto-snapshot, which creates such a set of snapshots, albeit more spread out in time. The problem is that each snapshot of a volume shows up in the /dev directory as a block device. The rotation schedule is such that within a month we get 4 + 24 + 4 + 31 + 1 = 64 block devices for each volume in the pool. That is, if we have, say, 20 volumes (a perfectly normal number if the server is used for virtualization), we end up with 64 * 20 = 1280 devices per month. When we want to reboot, a big surprise awaits us: the boot is delayed very badly. The reason is that at boot the blkid utility runs, scanning all block devices for file systems. Either its FS detection logic is implemented crookedly, or the block devices are slow to open, but one way or another the blkid process gets killed by the kernel after a 120-second timeout. Need I say that blkid, and all the scripts built on top of it, do not work even after boot completes?
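A rough way to gauge the scale of the problem on a running system (the pool name tank is a placeholder, and the exact device naming can differ between ZFS on Linux versions):

# snapshots known to ZFS itself
zfs list -H -t snapshot -o name | wc -l

# block devices udev has created for zvols and their snapshots
ls /dev/zvol/tank/ 2>/dev/null | wc -l
ls /dev/zd* 2>/dev/null | wc -l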

Hot news
I have just tried to install the zfs-auto-snapshot package and test it more thoroughly. The result: rotation does not work, old snapshots are not deleted (error 134). So over a month we get 4 * 24 + 24 * 31 + 4 + 31 + 1 = 876 snapshots for one volume, or 17 520 for 20 volumes. The script responsible for the snapshots can probably be patched somehow ...
Package version: 1.0.8-0ubuntu1~oneiric1, OS: Debian Sid x86_64
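Until the rotation script is fixed, old automatic snapshots can be pruned by hand; a sketch, assuming the default zfs-auto-snap naming and keeping roughly the newest hundred (both assumptions are adjustable):

# list auto-snapshots oldest first and destroy all but the newest ~100
zfs list -H -t snapshot -o name -s creation | grep zfs-auto-snap | head -n -100 | xargs -r -n 1 zfs destroy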

Suppose we have defeated all these problems and want to export a newly created volume to other machines via iSCSI, FC, or some other transport through the LIO-Target subsystem built into the kernel. No such luck! When loaded, the zfs module uses major number 230 to create block devices in the /dev directory, and LIO-Target (more precisely, the targetcli utility) without the latest patches does not consider a device with this major number ready for export. The solution is either to fix one line in the /usr/lib/python2.7/dist-packages/rtslib/utils.py file, or to add a boot parameter for the zfs module to the /etc/modprobe.d/zfs.conf file:

options zfs zvol_major=240 
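After reloading the module you can check which major number the zvol block devices actually received (the /dev/zd0 node only exists if at least one volume has been created):

# registered block device majors; look for the zvol entry
cat /proc/devices

# the major:minor pair of a concrete zvol node
ls -l /dev/zd0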

And finally: as you know, the incompatibility of the CDDL, under which ZFS is released, with the GPL v2 of the kernel prevents including the zfs module in the kernel itself. Therefore, every time the kernel is updated, the module is rebuilt via DKMS. Sometimes the rebuild succeeds, sometimes (when the kernel is too new) it does not. Consequently, you will get the latest features (and the KVM and LIO-Target bug fixes) from new kernels with some delay.
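If an update leaves you without the module, the DKMS state can be inspected and the rebuild retried by hand; the version 0.6.1 below is only a placeholder for whatever dkms status reports on your machine:

# show for which kernels the spl/zfs modules are currently built
dkms status

# try to rebuild them for the running kernel
dkms install -m spl -v 0.6.1 -k $(uname -r)
dkms install -m zfs -v 0.6.1 -k $(uname -r)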

What is the conclusion? Use ZFS in production with caution. Configurations that worked without problems on other file systems may not work, and commands that you safely executed on LVM may cause deadlocks.

On the other hand, all the features of ZFS pool version 28 are now available to you in Linux production: deduplication, online compression, fault tolerance, a flexible volume manager (which, by the way, can be used on its own), and so on. In general, good luck and success to you!

Source: https://habr.com/ru/post/153461/

