
I once wrote that LSI, in addition to SAS controllers, produces microchips for modern hard drives and SSDs, and I also wrote that flash memory requires a more complex approach to I/O operations. What is this complexity, and what problem do our chips solve? That is what I want to tell you about in this article. It is about write amplification. Of course, professionals are already familiar with WA, so the article is intended mostly for beginners.
Modern flash memory chips are designed so that, to achieve good performance, data must be read, written, and erased in large blocks. Moreover, the write unit is no smaller than the read unit, and the erase unit is always larger than the write unit. This forces the memory cells to be organized into a hierarchical structure, usually: blocks - sectors - pages.
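To make this asymmetry concrete, here is a tiny sketch with illustrative sizes (the numbers are ballpark figures of my choosing, not from any particular chip; real geometries vary):

```python
# Illustrative NAND geometry (ballpark figures; real chips vary).
PAGE_SIZE = 4 * 1024          # smallest unit that can be read or programmed
PAGES_PER_BLOCK = 256         # pages grouped into one erase block
BLOCK_SIZE = PAGE_SIZE * PAGES_PER_BLOCK  # smallest unit that can be erased

print(f"write unit: {PAGE_SIZE // 1024} KB, "
      f"erase unit: {BLOCK_SIZE // 1024} KB")   # 4 KB vs 1024 KB
```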
When we have a pristine new SSD in our hands, there are no problems. All its cells are erased and ready to receive our data. Therefore, the first write to the SSD poses no problems at all. Typically, data is written in blocks of 4-8 KB, and this happens really fast.
And now let's say we need to change a couple of bytes (for drama, we can even assume a single bit, although this rarely happens in real life) in an already recorded file. Unlike traditional hard drives, flash memory does not allow you to simply overwrite a data block; it must be erased first. And since the erase block is much larger than the write block, this operation is very inefficient. Thank you, Wikipedia, for a clear illustration:

To change those bytes, we have to read the contents of the entire erase block somewhere, modify the data in it, then erase the block and write the data back. This gives rise to the negative phenomenon known as write amplification: in effect, far more data is written to the disk than the computer "really wanted" to write.
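Here is what that naive read-modify-erase-rewrite cycle might look like in Python (the flash methods are hypothetical primitives I made up for illustration, not a real API):

```python
def rewrite_bytes(flash, block_no, offset, new_bytes):
    """Naively change a few bytes inside an already-programmed erase block."""
    block = bytearray(flash.read_block(block_no))      # 1. read the whole erase block
    block[offset:offset + len(new_bytes)] = new_bytes  # 2. modify a few bytes in RAM
    flash.erase_block(block_no)                        # 3. erase (the expensive step)
    flash.program_block(block_no, bytes(block))        # 4. write everything back
    # The host changed len(new_bytes) bytes, but the whole block was rewritten.
```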
A very simple formula lets you calculate write amplification:
Write amplification factor = (data written to flash) / (data sent by the host to be written)
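A worked example under the naive scheme above, with illustrative numbers: changing 4 KB inside a 1 MB erase block forces the whole megabyte to be reprogrammed:

```python
host_write = 4 * 1024        # what the computer "really wanted" to write
flash_write = 1024 * 1024    # what actually hit the flash (whole erase block)

write_amplification = flash_write / host_write
print(write_amplification)   # 256.0 -- each host byte cost 256 flash bytes
```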
This phenomenon has been known since the advent of the first NAND memory chips, but the term Write Amplification was introduced in 2008 by Intel and SiliconSystems.
Write amplification has two distinct drawbacks. The first is the drop in speed, since the read-erase-write cycle on large blocks is clearly not as efficient as writing directly; this drawback, however, is fairly easy to work around.
The flash memory controller takes over the job of translating the logical addresses the computer uses into the physical addresses of the data on the storage medium (in our case an SSD); this is called Logical Block Addressing (the LBA abbreviation familiar to many).
Modern controllers optimize the rewrite operation as follows: if a block of a file needs to be changed, then instead of a full read-erase-write cycle, the SSD controller saves the modified block to another, previously erased location, marks the old block as "free but not erased", and updates the metadata it uses for LBA translation.
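As a minimal sketch of this remapping idea (a toy structure of my own, not any vendor's actual translation layer), a dictionary can stand in for the controller's logical-to-physical map:

```python
class TinyFTL:
    """Toy logical-to-physical map: updates go to a fresh page, old pages go stale."""
    def __init__(self):
        self.l2p = {}       # logical block address -> physical page number
        self.stale = set()  # pages marked "free but not erased"
        self.next_free = 0  # next pre-erased page (kept trivially simple here)

    def write(self, lba, data):
        old = self.l2p.get(lba)
        if old is not None:
            self.stale.add(old)         # retire the old copy; no erase needed yet
        self.l2p[lba] = self.next_free  # the data itself would be programmed here
        self.next_free += 1

ftl = TinyFTL()
ftl.write(42, b"old contents")
ftl.write(42, b"new contents")          # rewrite: no erase, just a fresh page
print(ftl.l2p, ftl.stale)               # {42: 1} {0}
```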

At first glance this seems like a reasonable idea, but as the disk fills up, more and more blocks end up in the "wrong" state, that is, marked as free but not yet erased. Therefore, modern SSDs use various garbage collection algorithms that erase such blocks and return them to the list of "available and ready". Naturally, the less free space remains on the disk, the more intensively these algorithms have to run, which is what makes SSD performance depend on how full the drive is.
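A toy model shows why the fill level matters: the still-valid pages of a victim block must be copied out before the erase, and those copies are flash writes the host never asked for:

```python
def collect_block(valid_flags):
    """Reclaim one erase block (toy model, not real firmware).

    valid_flags: one bool per page, True if the page still holds live data.
    Valid pages must be copied to a fresh block before erasing; those copies
    are exactly how garbage collection adds to write amplification.
    """
    copied = sum(valid_flags)   # pages relocated before the erase
    erased = len(valid_flags)   # the whole block becomes free again
    return copied, erased

# The fuller the drive, the more valid pages each victim block holds:
print(collect_block([True] * 50 + [False] * 206))   # mostly empty: (50, 256)
print(collect_block([True] * 230 + [False] * 26))   # mostly full:  (230, 256)
```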
The second drawback of write amplification is accelerated SSD wear. We all know that memory cells have a limit on the number of program/erase cycles; it is higher for SLC drives and lower for MLC, but it is always there. To combat this, data redundancy and verification with various checksums are used (I already wrote about LDPC in the previous article), but the spare area of the memory chips is a limited resource, and write amplification exhausts it faster.
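A back-of-the-envelope example (all numbers invented for illustration): the total volume of host writes a drive can absorb is roughly its capacity times the P/E cycle limit, divided by the write amplification factor:

```python
capacity_gb = 256   # illustrative drive size
pe_cycles = 3000    # illustrative MLC program/erase limit
wa = 3.0            # assumed write amplification factor

host_writes_gb = capacity_gb * pe_cycles / wa
print(f"~{host_writes_gb / 1024:.0f} TB of host writes before wear-out")  # ~250 TB
# With WA = 1.0 the same cells would absorb three times as much host data.
```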
What factors affect Write Amplification, and how?
Garbage collection
That the wear of SSD cells depends on how well the GC algorithms perform is undeniable. Garbage collection on SSDs can run either in the background or in the foreground.
With explicit (foreground) garbage collection, freed blocks are cleaned and optimized at the moment new data is written. With background collection, the controller uses idle periods to optimize free space and clean up blocks.
Background cleaning speeds things up because blocks are already freed by the time they are needed, but it also means the controller often ends up relocating data that is not really needed and will probably be deleted soon anyway.
Since both methods have advantages and disadvantages, developers try to combine them in pursuit of better performance. For example, in OCZ SSDs the background garbage collector cleans a small number of blocks and then stops, which minimizes the number of unnecessary operations while still giving the drive quick access to free blocks.
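A sketch of such a "clean a little and stop" policy (the watermark and batch values are invented for illustration, not OCZ's actual parameters):

```python
GC_LOW_WATERMARK = 8   # start background GC when free blocks drop below this
GC_BATCH = 2           # clean only a couple of blocks per idle period, then stop

def background_gc_step(free_blocks, stale_blocks):
    """One idle-time GC pass in the spirit of the hybrid approach above."""
    cleaned = 0
    while (len(free_blocks) < GC_LOW_WATERMARK
           and stale_blocks and cleaned < GC_BATCH):
        free_blocks.append(stale_blocks.pop())  # erase one stale block
        cleaned += 1                            # ...but never more than GC_BATCH
    return cleaned

free, stale = ["b1"], ["b7", "b8", "b9"]
print(background_gc_step(free, stale), free)    # 2 ['b1', 'b9', 'b8']
```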

However, the most promising direction today is garbage collection performed simultaneously with host-initiated write operations. This achieves high performance in environments with heavy write loads, where the SSD is never idle. Among the controllers with this capability are the SandForce controllers from LSI.

Among the more interesting exotica, it is worth noting the attempts of some manufacturers, primarily Samsung, to develop a garbage collection system that would use information from the file system stored on the disk. This would let the controller make efficient use of information about recently deleted files and unallocated space. According to the developers, this approach worked well on systems that do not support the TRIM command. Controllers with this feature required the disk to be formatted with NTFS and to contain an MBR. The technology proved very unreliable and often led to data loss, especially when other file systems were used.

Another direction SSD developers are pursuing is Application Hinting. It is easiest to explain with the example of databases. A DBMS also arranges its data in pages for fast retrieval, and so a DBMS likewise has a notion of garbage, or dirty pages. If the DBMS could send lists of dirty pages to the solid-state drive in advance, it would greatly help the garbage collection process. There are examples of such optimization for other applications as well.
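Purely as a hypothetical illustration of the hinting idea (no real API is implied), the application could pass its list of obsolete pages down to the drive, much as TRIM passes deleted LBA ranges; the map and stale set here mirror the toy FTL sketch above:

```python
def hint_obsolete_pages(l2p, stale, obsolete_lbas):
    """Hypothetical TRIM-like hint: the application (e.g. a DBMS with its
    dirty-page list) tells the drive that these logical pages now hold garbage.

    The controller can mark them stale at once, so garbage collection never
    wastes flash writes copying their contents to a new block.
    """
    for lba in obsolete_lbas:
        page = l2p.pop(lba, None)
        if page is not None:
            stale.add(page)   # now reclaimable without any copying

l2p, stale = {10: 0, 11: 1, 12: 2}, set()
hint_obsolete_pages(l2p, stale, [10, 12])
print(l2p, stale)             # {11: 1} {0, 2}
```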
# end of the first part #