Linux: Speeding up software RAID-6 in a home server

What is there to do at 0 hours 0 minutes Moscow time? Sit at the holiday table and celebrate? Not a chance. At this festive moment I want to share my latest findings on tuning software RAID performance in a home server. You can skip the theory and jump straight to the last section, where the real meat is.

Why RAID-6?

As you know, RAID-5 survives the death of one disk. After that, until the array has finished rebuilding onto a new hard drive, your data is at risk: on large arrays the rebuild used to take up to 70 hours, and another disk can easily die in that time.
RAID-6 survives the death of any 2 disks. The downside is the common belief that it is slow, especially on writes, even compared to RAID-5. Well, let's check.

Why software RAID?

Hardware RAID is needed in only one case: when the controller has a battery and an onboard cache. Then the controller tells the OS right away that the write has completed at the physical level, and all ACID databases run very fast and safely.
In all other cases it has no advantages over software RAID, only drawbacks:
1) Hardware burned out? New server? Kindly buy the same controller, or pray for compatibility. A software RAID assembles from the same disks anywhere.
2) Price :-) In fact, that is why I have never held a proper battery-backed RAID controller in my hands :-)

As for the "RAID controllers" built into ordinary motherboards, you should never use them at all. All they do is let the OS boot from the array with the help of the onboard BIOS (which runs on the CPU, since they have no processor of their own). That is where the benefits end and only the drawbacks remain.

A few software RAID myths

1) It eats a lot of precious CPU
A quick look at the RAID driver source in the Linux kernel shows that it was optimized for SSE2 long ago. With SSE2 a modern processor can XOR 16 bytes per clock per core, so everything comes down to memory bandwidth. You can work out what percentage of one core a 1 Gb/s stream would load :-) And there are plenty of cores :-) In practice, on my Opteron 165 (1.8 GHz, 2 cores), the speed has never been limited by the CPU.
2) It falls apart, and then good luck putting it back together
If something drops out, the hardware is to blame (ordinary desktop drives, for example, sometimes like to go off and run background tasks of their own). Re-adding a dropped disk is a simple operation that can even be automated. On average, though, you have to do it about once a year.
mdadm /dev/md0 -a /dev/sde1
3) Software RAID monitoring is poor
Monitoring is fine and configurable. For example, just put your email address in the mdadm config and it will mail you if anything happens to the array. Very convenient :)

For example, this is what arrives when one disk drops out:

This is an automatically generated mail message from mdadm running on XXXXX

A DegradedArray event had been detected on md device /dev/md0.

Faithfully yours, etc.

P.S. The /proc/mdstat file currently contains the following:

Personalities : [raid6] [raid5] [raid4]
md0 : active raid6 sda1[1] sdc1[4] sdd1[3] sde1[2]
      2929683456 blocks super 1.2 level 6, 1024k chunk, algorithm 2 [5/4] [_UUUU]

unused devices: <none>

I recommend testing it before relying on it:
mdadm --monitor -1 -m myname@myisp.com /dev/md0 -t
4) Software RAID rebuilds very slowly
In the default configuration, yes. But if you read the article to the end, you will learn how to make the rebuild run at the speed of the slowest disk :-)
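
Coming back to myth 3 for a moment: besides the mail alerts, the status string in /proc/mdstat (such as [_UUUU] in the message above) can be checked by hand or from cron. The following is only a minimal sketch, not part of mdadm; the function name degraded_arrays is made up for illustration, and it assumes the usual mdstat layout where the status string is the last field of the line following the array name.

```shell
#!/bin/sh
# Print the name of every md array whose status string (e.g. [_UUUU])
# contains "_", which marks a missing member disk.
# Reads /proc/mdstat-formatted text from standard input.
degraded_arrays() {
    awk '
        /^md[0-9]+ *:/ {                 # line that starts an array entry
            name = $1
            sub(/:$/, "", name)          # tolerate "md0:" as well as "md0 :"
        }
        name != "" && /\[[U_]+\]/ {      # the line carrying [UUUU]-style status
            if ($NF ~ /_/) print name
            name = ""
        }
    '
}

# On a live system:
#   degraded_arrays < /proc/mdstat
```

An empty result means every array has all of its members; any name printed deserves a closer look with mdadm --detail.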

About the role of the bitmap

Linux software RAID supports a great feature: the bitmap. It marks the blocks on disk that have been modified, so if a disk drops out of the array for some reason and you then add it back, a full rebuild of the array is not needed. Damn useful. You can keep it on the array itself (internal) or in a separate file, though the latter has limitations (on the file system type, for example). I went with an internal bitmap. Big mistake. An internal bitmap slows things down mercilessly: the disk heads keep twitching during writes.

Let's look at the speed:

Speed can be tested like this:

time sh -c "dd if=/dev/zero of=ddfile bs=1M count=5000"
time sh -c "dd if=ddfile of=/dev/null bs=1M count=5000"

The results for my RAID-6 of 5 x WD 1 TB were: read 268 MB/s, write 37 MB/s. Everyone shrugs and says: well, what did you expect? RAID-6 is slow on writes, because it has to read what was written before in order to recalculate the checksums across all disks. And then there is this bitmap...
The array rebuild speed is about 25 MB/s, so a full rebuild takes up to 15 hours. There it is, your nightmare.

The fix is simple:

The Linux RAID driver has a very useful parameter: stripe_cache_size.
Its default value is 256. A value that is too low drastically reduces write speed (as it turned out). For many setups the optimal value is 8192. This is the number of memory blocks per disk. One block is usually 4 KB (depending on the platform), so for a 5-disk array the cache takes up 8192 * 4 KB * 5 = 160 MB.
echo 8192 > /sys/block/md0/md/stripe_cache_size
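
To see what a given value costs in RAM before setting it, the arithmetic above can be wrapped in a small helper. This is just an illustrative sketch: the function name stripe_cache_mib is made up, and the 4096-byte default is an assumption that matches the "usually 4 KB" block size mentioned above (check yours with getconf PAGESIZE).

```shell
#!/bin/sh
# Estimate stripe cache memory in MiB:
#   entries * block size in bytes * number of member disks
# Arguments: entries, disks, block size in bytes (default 4096, assumed).
stripe_cache_mib() {
    scm_entries=$1
    scm_disks=$2
    scm_block=${3:-4096}
    echo $(( scm_entries * scm_block * scm_disks / 1024 / 1024 ))
}

stripe_cache_mib 256 5     # the default setting on a 5-disk array
stripe_cache_mib 8192 5    # the value used in this article
```

For the 5-disk array here the two calls print 5 and 160, so the jump from the default costs about 155 MB of memory.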

The new value takes effect instantly. Now, in most cases, the driver no longer has to read from disk before writing (especially for sequential writes), and performance rises dramatically. The setting is lost on reboot; to make it stick, add the command to /etc/rc.local, for example.
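
One possible shape for the boot-time part is sketched below. The helper set_stripe_cache is hypothetical, as is its third argument, which only exists so the function can be tried against a scratch directory instead of the real /sys. It does nothing more than the echo shown earlier, plus a check that the file is actually there and writable.

```shell
#!/bin/sh
# Write a stripe_cache_size value for one md array.
# Arguments: array name (e.g. md0), cache size, sysfs root (default /sys).
set_stripe_cache() {
    ssc_knob="${3:-/sys}/block/$1/md/stripe_cache_size"
    if [ -w "$ssc_knob" ]; then
        echo "$2" > "$ssc_knob"
    else
        # Array not assembled, wrong RAID level, or not running as root.
        echo "skipping $1: $ssc_knob is not writable" >&2
        return 1
    fi
}

# In /etc/rc.local, which runs as root at boot on systems that use it:
#   set_stripe_cache md0 8192
```

The check matters mostly at boot: if the array has not been assembled yet, the plain echo would fail with a less obvious error.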

The array rebuild speed is now 66 MB/s (across all disks at once, about 5 hours for the whole array), the read speed is unchanged, and the write speed has risen to 130 MB/s (from 37).
Next, move the bitmap to a separate disk (the system disk, in my case). If the system disk dies, no problem: the array will recover without the bitmap.
The heads no longer twitch needlessly during writes, and the write speed climbs to 165 MB/s.
mdadm -G /dev/md0 -b /var/md0_intent
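
One caveat that is not from my notes above but from how mdadm generally behaves: if the array still carries an internal bitmap, mdadm will normally refuse to add a second one, so the internal bitmap has to be removed first. The sketch below only prints the two commands for review and executes nothing; the function name bitmap_move_commands is made up, and the device and file path are simply the ones used in this article.

```shell
#!/bin/sh
# Print the two mdadm invocations that swap an internal bitmap for a
# file-backed one. Nothing is executed here; review the output, then
# run the lines by hand as root.
# Arguments: md device, bitmap file path.
bitmap_move_commands() {
    echo "mdadm --grow $1 --bitmap=none"   # drop the existing internal bitmap
    echo "mdadm --grow $1 --bitmap=$2"     # add the bitmap stored in a file
}

bitmap_move_commands /dev/md0 /var/md0_intent
```

Between the two steps the array has no bitmap at all, so this is best done while the array is healthy. The bitmap file must also live on a file system that is not on the array itself.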

So, in 10 seconds we have raised the write speed from a depressing 37 MB/s to a quite decent 165 MB/s (more than 4 times!!). Over the network, Samba now reads and writes at 95-100 MB/s, and the server upgrade I had planned because of the slow array is postponed indefinitely: the ancient Opteron 165 now has more than enough performance for every task :-)
Happy New Year :-)

P.S. Warning! Only go in as root while sober!
P.S. After a hard fight, this did become the first post on Habr in 2011.

Ps. infi

Source: https://habr.com/ru/post/111036/
