
Remember the translation of the article "When Solid State Drives are not that solid"? In it, Algolia employees blamed Samsung for data corruption on RAID0 configurations.
The problem was resolved after a long investigation, during which Algolia employees even had to write software that emulated their type of load on the RAID so that Samsung's engineers could reproduce the problem on their own hardware. The fix was made in the Linux kernel, specifically in the bio.c file, which is responsible for basic block I/O operations.
The problem was as follows: the kernel's I/O subsystem can split a block I/O (BIO) operation into several operations when appropriate. The bio_split() function does the splitting. When a request is split, a new BIO object is created, and the old one is adjusted to account for the fact that some of the addresses covered by the I/O have “moved” to the new object. To save memory, the new object is created by copying values from the old one, so the pointers in the new and old objects refer to the same memory area. For read/write operations this works fine, because the data reachable through those pointers does not change while the operation is carried out. This is not the case for the DISCARD operation, however: the bio_vec field of the bio structure contains a pointer to the service data needed to execute the command (the starting address and size of the area to be erased).
The raid0 and raid10 kernel modules use the bio_split() function and send the split requests to the SCSI/SATA driver. The driver, however, does not expect different requests to share the same memory area, and it overwrites the contents at the address specified in bio_vec. As a result, the next request arrives with a pointer to incorrect data, and DISCARD is issued for the wrong addresses.
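The aliasing described above can be reproduced with a toy model. This is only an illustrative sketch in Python, not kernel code: the `Bio` class, `split_shallow`, `prepare_discard` and `simulate` are hypothetical names, the two-element list stands in for the service data reachable through bio_vec, and the order in which requests are prepared and sent is an assumption made for the demonstration.

```python
from dataclasses import dataclass


@dataclass
class Bio:
    """Toy stand-in for a block I/O request (not the kernel's struct bio)."""
    sector: int      # first sector covered by this request
    nr_sectors: int  # number of sectors covered
    payload: list    # scratch area, playing the role of the data behind bio_vec


def split_shallow(bio, first_len):
    """Split off the first `first_len` sectors; both halves share one payload."""
    head = Bio(bio.sector, first_len, bio.payload)  # same list object, no copy
    bio.sector += first_len
    bio.nr_sectors -= first_len
    return head, bio


def prepare_discard(bio):
    """Hypothetical driver step: write the range to erase into the payload."""
    bio.payload[0] = bio.sector
    bio.payload[1] = bio.nr_sectors


def simulate():
    original = Bio(sector=0, nr_sectors=16, payload=[None, None])
    head, tail = split_shallow(original, first_len=8)
    prepare_discard(head)  # payload now describes sectors 0..7
    prepare_discard(tail)  # overwrites the same payload with sectors 8..15
    # Assumption: both requests are consumed only after both were prepared.
    return tuple(head.payload), tuple(tail.payload)
```

In this model both halves end up describing the range that starts at sector 8, so the first half of the area is never erased and the second half is targeted twice. With real striping across several drives, a wrong range of this kind lands on data that was never meant to be discarded.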
The first version of the patch, proposed by Samsung engineers, modified the source code of the raid0 driver, but a more general version was merged into the kernel: in the case of DISCARD, it makes a complete copy of the bio structure together with the memory pages it uses.
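The idea behind the merged fix can be sketched as a split routine that copies the payload for DISCARD and keeps sharing it for everything else. Again, this is a hedged Python illustration rather than the actual kernel change: the `Bio` class, its `op` field and the `split` function are hypothetical, and `copy.deepcopy` merely plays the role of the full clone.

```python
import copy
from dataclasses import dataclass


@dataclass
class Bio:
    """Toy stand-in for a block I/O request (not the kernel's struct bio)."""
    op: str          # "read", "write" or "discard"
    sector: int      # first sector covered by this request
    nr_sectors: int  # number of sectors covered
    payload: list    # data the request points to


def split(bio, first_len):
    """Split off the first `first_len` sectors of `bio`."""
    if bio.op == "discard":
        # The payload is rewritten during processing, so give the new
        # half its own private copy instead of a shared reference.
        payload = copy.deepcopy(bio.payload)
    else:
        # Read/write payloads are not modified, so sharing stays safe
        # and avoids the extra memory.
        payload = bio.payload
    head = Bio(bio.op, bio.sector, first_len, payload)
    bio.sector += first_len
    bio.nr_sectors -= first_len
    return head, bio
```

The trade-off matches the one described in the text: the memory saving from sharing is kept for reads and writes, and the extra copy is paid only on the DISCARD path, where sharing is unsafe.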
This problem affects all drives that support TRIM, regardless of model, in a RAID0 or RAID10 configuration.
It remains unclear why the problem did not appear on Intel drives. Perhaps the difference lies in timing.