
Performance of mdadm raid 5, 6, 10 and ZFS zraid, zraid2, and ZFS striped mirror

Testing ZFS and mdadm + ext4 performance on Sandisk CloudSpeed SSDs
to choose a technology for building a local disk array.

The purpose of this test is to find out what real speed virtual machines can expect when their raw file images are placed on four production SSDs. Testing is done in 32 threads to roughly reproduce the working conditions of a real hypervisor.





Measurements will be made using the fio tool.
For mdadm + ext4, the options --buffered=0 --direct=1 were used. ZFS does not support these options, so the ZFS results are expected to be somewhat higher. For comparison, I will also disable these options for the mdadm setup in one of the tests.
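
A quick way to see whether a given file system accepts direct I/O at all is to try a small write with O_DIRECT, for example with dd (the /test mount point and file name here are just an illustration; on ZFS 0.7.x such a write is expected to be rejected with "Invalid argument", while on ext4 it goes through):

# probe O_DIRECT support on the mounted file system
dd if=/dev/zero of=/test/direct_probe bs=4k count=1 oflag=direct
rm -f /test/direct_probe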

We will run the tests with a 10 GB file. Presumably this size is sufficient to assess file system performance on routine operations. Of course, if we increased the amount of test data, the overall numbers for all tests would be significantly lower, since we would negate all the additional caching and read-ahead mechanisms of the file systems. But that is not the goal: we do not need dry synthetic numbers, but something closer to real life.

The test bench has the following configuration:


Manufacturer:
Supermicro X9DRT-HF+

Processors:
2x Intel® Xeon® CPU E5-2690 0 @ 2.90GHz, stepping C2
Process technology: 32 nm
Number of cores: 8
Number of threads: 16
CPU base frequency: 2.90 GHz
Maximum turbo frequency: 3.80 GHz
Cache: 20 MB SmartCache
Bus speed: 8 GT/s QPI
TDP: 135 W

RAM:
16x 16384 MB
Type: DDR3 Registered (Buffered)
Frequency: 1333 MHz
Manufacturer: Micron

Disk controller:
LSI SAS 2008 RAID IT mode

Solid State Drives:
4x 1.92 TB SSD Sandisk CloudSpeed ECO Gen. II
SSD, 2.5", 1920 GB, SATA-III, read: 530 MB/s, write: 460 MB/s, MLC
Declared random read/write: 76,000 / 14,000 IOPS
Time between failures: 2,000,000 h

Kernel:
Linux 4.13.4-1-pve # 1 SMP PVE 4.13.4-26 (Mon, 6 Nov 2017 11:23:55 +0100) x86_64

ZFS version:
v0.7.3-1

IO Scheduler:

cat /sys/block/sdb/queue/scheduler
[noop] deadline cfq
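
The noop scheduler stays selected for the SSDs here; if a different scheduler is active, it can be switched at runtime like this (a non-persistent example for a single disk):

echo noop > /sys/block/sdb/queue/scheduler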

Test tool:
fio-2.16

Array Assembly Parameters


# Parameters for creating a ZFS array on one disk

zpool create -f -o ashift=12 test /dev/sdb

# Parameters for creating zraid (raid5 equivalent on ZFS)

 zpool create -f -o ashift=12 test raidz /dev/sdb /dev/sdc /dev/sdd /dev/sde 

# Creation parameters of zraid2 (raid6 equivalent on ZFS)

 zpool create -f -o ashift=12 test raidz2 /dev/sdb /dev/sdc /dev/sdd /dev/sde 

# Options for creating a striped mirror (raid10 analog on ZFS)

 zpool create -f -o ashift=12 test mirror sdb sdc mirror sdd sde 
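
After creation, the resulting vdev layout can be verified, for example:

zpool status test
zpool list -v test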

# General parameters for ZFS arrays

zfs set atime=off test
zfs set compression=off test
zfs set dedup=off test
zfs set primarycache=all test
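
Whether the properties were applied can be checked with zfs get (an optional verification step):

zfs get atime,compression,dedup,primarycache test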

A quarter of all memory, or 52 GB, was allocated to the ARC:

cat /etc/modprobe.d/zfs.conf
options zfs zfs_arc_max=55834574848
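
The limit takes effect after the zfs module is reloaded or the host is rebooted; the active value can be confirmed via sysfs, for example:

cat /sys/module/zfs/parameters/zfs_arc_max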

#Mdadm raid5 array creation parameters

mdadm --zero-superblock /dev/sd[bcde]
mdadm --create --verbose --force --assume-clean --bitmap=internal --bitmap-chunk=131072 /dev/md0 --level=5 --raid-devices=4 /dev/sd[bcde]

#Mdadm raid6 array creation parameters

mdadm --zero-superblock /dev/sd[bcde]
mdadm --create --verbose --force --assume-clean --bitmap=internal --bitmap-chunk=131072 /dev/md0 --level=6 --raid-devices=4 /dev/sd[bcde]

# General parameters for mdadm 5/6 arrays

echo 32768 > /sys/block/md0/md/stripe_cache_size
blockdev --setra 65536 /dev/md0
echo 600000 > /proc/sys/dev/raid/speed_limit_max
echo 600000 > /proc/sys/dev/raid/speed_limit_min
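
For reference, the stripe cache is not free in terms of RAM, and these echo settings do not survive a reboot, so they would need to be reapplied (for example from rc.local). A rough estimate of the memory cost:

# stripe cache memory ≈ PAGE_SIZE * stripe_cache_size * nr_disks
# with the values above: 4 KiB * 32768 * 4 = 512 MiB per md device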

#Mdadm raid10 array creation parameters

mdadm --zero-superblock /dev/sd[bcde]
mdadm --create --verbose --force --assume-clean --bitmap=internal --bitmap-chunk=131072 /dev/md0 --level=10 --raid-devices=4 /dev/sd[bcde]
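
Before partitioning, the assembled array can be inspected (an optional check):

cat /proc/mdstat
mdadm --detail /dev/md0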

# Parameters for creating a GPT markup table

parted -a optimal /dev/md0
mktable gpt
mkpart primary 0% 100%
q
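
The partition alignment can then be verified with parted as well, for example:

parted /dev/md0 align-check optimal 1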

# Ext4 file system creation options

mkfs.ext4 -m 0 -b 4096 -E stride=128,stripe-width=256 /dev/md0p1 (/dev/sdb)
# stripe-width=256 for raid6 and raid10, stripe-width=384 for raid5
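
These values follow from the mdadm chunk size (512 KiB by default, which is assumed here; the actual value can be seen in the mdadm --detail output):

# stride = chunk size / block size = 512 KiB / 4 KiB = 128
# stripe-width = stride * number of data disks:
#   raid5, 4 disks (3 data):  128 * 3 = 384
#   raid6, 4 disks (2 data):  128 * 2 = 256
#   raid10, 4 disks (2 data): 128 * 2 = 256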

# Ext4 file system mount options in fstab

 UUID="xxxxx" /test ext4 defaults,noatime,lazytime 1 2 


Results




fio --directory=/test/ --name=read --rw=read --bs=4k --size=200G --numjobs=1 --time_based --runtime=60 --group_reporting --ioengine libaio --iodepth=32
# for mdadm + ext4, --buffered=0 --direct=1 were added



The read test clearly shows the effect of the ARC cache on ZFS. ZFS demonstrates smooth and high speed in all tests. But if --buffered=0 --direct=1 are turned off for mdadm raid10 + ext4, ZFS ends up 3 times slower in throughput and 10 times worse in latency and IOPS.

Additional disks in zraid do not give ZFS a significant speed increase. The ZFS striped mirror (0+1) is just as slow as zraid.

fio --directory=/ --name=test --rw=randread --bs=4k --size=10G --numjobs=1 --time_based --runtime=60 --group_reporting --ioengine libaio --iodepth=32 --buffered=0 --direct=1
# --buffered=0 --direct=1 were used only for mdadm + ext4



This is where the ARC does not save ZFS. The numbers speak for themselves.

fio --directory=/ --name=test --rw=write --bs=4k --size=10G --numjobs=1 --group_reporting --ioengine libaio --iodepth=32 --buffered=0 --direct=1
# --buffered=0 --direct=1 were used only for mdadm + ext4



Again, buffers help ZFS produce even results across all arrays. mdadm raid6 clearly lags behind raid5 and raid10. Buffered and cached mdadm raid10 gives a result twice as high as the best of the ZFS options.

fio --directory=/ --name=test --rw=randwrite --bs=4k --size=10G --numjobs=1 --group_reporting --ioengine libaio --iodepth=32 --buffered=0 --direct=1
# --buffered=0 --direct=1 were used only for mdadm + ext4



The picture for random writes is similar to random reads. ZFS is not helped by its buffers and caches; it falls far behind. The result of a single ZFS disk is particularly alarming, and the overall ZFS numbers are dismal.

With mdadm raid5/6 everything is as expected: raid5 is slow, raid6 is even slower, and raid10 is about 25-30% faster than a single disk. With buffering enabled, raid10 is off the charts.

Findings


As everyone knows, ZFS is not fast.

It has dozens of other important features and advantages, but that does not negate the fact that it is significantly slower than mdadm + ext4, even with its caches, buffers, prefetching, and so on. No surprises here.

ZFS v0.7.x has not become significantly faster.

Perhaps it is faster than v0.6.x, but it is still far from mdadm + ext4.

You can find claims that zraid/2 is an improved version of raid5/6, but not in terms of performance.

Using zraid/2 or a striped mirror (0+1) does not make the array faster than a single ZFS drive.

At best, the speed will be no lower, or only slightly higher. At worst, the additional disks will slow the array down. RAID in ZFS is a means of improving reliability, not performance.

A large ARC will not compensate for ZFS's performance lag behind ext4.

As you can see, even a 50 GB cache cannot significantly help ZFS keep up with its little brother ext4, especially on random read and write operations.

Should ZFS be used for virtualization?

Everyone will answer that for themselves. I personally gave up ZFS in favor of mdadm raid10.

Thank you very much for your attention.

Source: https://habr.com/ru/post/344204/

