With this post, I would like to open a small series of articles on Intel Optane products based on 3D XPoint technology. A quick look through Russian-language sources showed me that there is no good material on the subject. In addition, the comments on our announcements convinced me that there is a deep misunderstanding of why all this is needed at all and why it is implemented the way it is.
3D XPoint technology
Let's start with some brief information about the 3D XPoint technology itself (pronounced "three-dee cross-point"). I apologize in advance: we do not currently disclose detailed information about the technology. In addition, the focus of these reviews will be on the final products rather than on the technology itself.
First, although the technology is a joint development of Intel and Micron, each vendor is separately responsible for turning it into products. Thus, everything I say about products based on 3D XPoint applies only to Intel products.
Secondly, 3D XPoint is not NAND, it is not NOR, it is not DRAM, but a completely different beast. Without revealing the details of the physical implementation of memory, I will describe the key characteristics, as well as the differences between 3D XPoint and NAND and DRAM.
- Unlike NAND, write operations are not tied to pages and erase operations are not tied to blocks. With 3D XPoint, we can access data physically at the level of a single cell. In addition, we do not need to erase data before writing: we can simply overwrite it, which gets rid of read-modify-write operations and greatly simplifies garbage collection. This reduces access latency and increases the number of I/O operations per second (IOPS); in addition, writes are performed almost as fast as reads. Finally, the endurance of 3D XPoint memory is much higher than that of NAND (effects such as electron leakage from cells do not exist here). To summarize, 3D XPoint is faster and more durable than NAND. However, it would be unfair not to mention its drawback: the cost of production, which is currently significantly higher than that of 3D NAND.
- Unlike DRAM, 3D XPoint makes it possible to build devices with higher storage density, is non-volatile and, at the same time, is cheaper. The downside in this comparison is that 3D XPoint as a memory technology is somewhat slower than DRAM (note that we are comparing technologies, not products based on them).
All of the above concerns 3D XPoint as such. For users, however, this matters less than the characteristics of specific devices based on 3D XPoint, so our conversation now turns to Intel Optane products built on this technology. Let's start with what "Intel Optane" is. In short, it is the brand name for all Intel products based on 3D XPoint technology. In more detail: Intel takes the 3D XPoint wafers, performs its own testing and selection of memory chips, independently designs the end device (the SSD controller, PCB layout and firmware), tests and validates the end device, and brings it to market. All of this is what stands behind the words "Intel Optane".
Intel Optane
At the moment, two fundamentally different products have been officially announced and launched: Intel Optane Memory, for client usage models, and Intel® Optane SSD DC P4800X, for server use. In this article we will take a closer look at the client product; the server product will be the subject of the next review.
So, Intel Optane Memory. The first thing to understand about this product is that, despite the name, it is not DRAM but an NVMe SSD in the M.2 2280-S3-BM form factor.
Top view: under the label is a single 3D XPoint chip (this is the 16GB version; the 32GB version has two 3D XPoint chips, and the pads for the second chip are visible):
The module is single-sided, so the back is empty:
The device complies with the NVM Express 1.1 specification. At the moment, two capacities are on the market: 16GB (one 16GB 3D XPoint memory chip) and 32GB (two 16GB 3D XPoint memory chips). Some interesting design details:
- the controller is Intel's own in-house design
- no DRAM is used in the design
- only 2 PCIe gen3 lanes are used, not 4 as many might expect
- rated endurance: 100GB of data written every day for 5 years
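To put the endurance figure in perspective, here is a small back-of-the-envelope calculation. It uses only the numbers from the list above (100GB per day, 5 years, 16GB and 32GB capacities); the 365-day year and decimal units are my assumptions, and the function names are purely illustrative.

```python
# Rated endurance from the list above: 100 GB written per day for 5 years.
GB_PER_DAY = 100
YEARS = 5
DAYS_PER_YEAR = 365  # assumption: leap days are ignored


def total_writes_tb(gb_per_day=GB_PER_DAY, years=YEARS):
    """Total data written over the rated period, in decimal terabytes."""
    return gb_per_day * DAYS_PER_YEAR * years / 1000


def drive_writes_per_day(capacity_gb, gb_per_day=GB_PER_DAY):
    """How many times the whole module is overwritten each day."""
    return gb_per_day / capacity_gb


print(total_writes_tb())          # 182.5
print(drive_writes_per_day(16))   # 6.25
print(drive_writes_per_day(32))   # 3.125
```

In other words, the rating corresponds to roughly 182.5 TB written in total, or overwriting the 16GB module a little more than six times per day.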
Performance test
Now about performance
(The 32GB version is faster because it uses two 3D XPoint memory chips instead of the single chip in the 16GB version.)
At first glance, the bandwidth and IOPS figures are not impressive, but that is not the whole story. These figures were measured at a queue depth of 4, whereas other SSDs are usually specified at a queue depth of 32 or more. It is at shallow queue depths that Optane's advantage is most noticeable. For clarity, here is a graph of the performance of different types of devices at different queue depths*:
At the same time, as our internal tests show, the vast majority of tasks an ordinary user runs at home or in the office have a queue depth of 1 to 4 (more on this below), while SSD specifications are written using loads with a queue depth of 32 (for SATA) or more (for NVMe). The difference is very clear.
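Why latency matters so much at shallow queues can be illustrated with Little's law: the number of requests in flight equals throughput multiplied by the time per request, so IOPS is roughly queue depth divided by latency, up to whatever ceiling the device can sustain. The sketch below is only an illustration: both "devices", their latencies and their IOPS ceilings are hypothetical numbers I made up, not measurements of any real product.

```python
def iops_estimate(queue_depth, latency_us, max_iops=None):
    """Little's law: requests in flight = throughput * time per request."""
    iops = queue_depth * 1_000_000 / latency_us
    if max_iops is not None:
        iops = min(iops, max_iops)  # the device cannot exceed its ceiling
    return iops


# Hypothetical devices, NOT measured values:
# name -> (per-request latency in microseconds, IOPS ceiling)
DEVICES = {
    "low-latency device": (10, 300_000),
    "higher-latency device": (100, 250_000),
}

for qd in (1, 4, 32):
    for name, (latency_us, ceiling) in DEVICES.items():
        estimate = iops_estimate(qd, latency_us, ceiling)
        print(f"QD{qd:>2} {name}: {estimate:,.0f} IOPS")
```

With these made-up numbers the two devices differ tenfold at a queue depth of 1, yet both sit at their ceilings at a queue depth of 32, which is why a specification written at deep queues can hide the difference a user would actually feel.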
However, Intel does not position Optane Memory as a regular SSD, for an obvious reason: the capacity is not enough for typical user tasks (with the exception of some interesting options, such as a small but fast and reliable boot drive for Linux, a scratch disk for Adobe Photoshop, a small but fast cache with Intel Cache Acceleration Software, or the interesting solution described here). The full force of Intel's marketing machine is aimed at promoting a new technology for accelerating (roughly speaking, caching, although that is not an exact definition) a slow SATA drive, be it a hard disk, a solid state drive or even some hybrid models, with the fast Optane Memory module.
This usage model places restrictions on supported hardware and operating systems:
- 7th generation or newer Intel Core processor
- Intel 200 Series chipset or newer (full list here)
- A BIOS with the RST UEFI driver version 15.5 or later integrated (15.7 for X299 series chipsets). Yes, legacy BIOS mode is not supported: Optane Memory requires booting in UEFI mode
- Windows 10 64-bit
- Intel Rapid Storage Technology Driver 15.5 or later
- SATA boot drive (this is what Optane Memory will accelerate). Only GPT partitioning is supported.
- 5MB of free space at the end of the SATA drive - this is needed for RST metadata
Setup goes like this:
- Make sure that the motherboard's BIOS supports Optane (see above; all "Optane Memory Ready" motherboards on 200 series chipsets now ship with a BIOS that supports Optane Memory, but you can still find boards from earlier batches on the market, and those will need a BIOS update).
And yes, Intel did a great deal of work with motherboard manufacturers: all motherboards that support Optane Memory carry this badge on the box:
- Take a system with a SATA drive on which Windows 10 64-bit is installed (the drive must be connected to a SATA port routed from the Intel AHCI controller in the chipset, otherwise RST will not see it); the partition scheme must be GPT.
- Connect the Optane Memory module (it must be inserted into an M.2 slot whose PCIe lanes are routed from the chipset and which supports "remapping" of PCIe lanes to the Intel AHCI controller integrated into the chipset).
- Download the utility from here (you can choose either the standard RST utility, which manages both Optane Memory configurations and conventional RST arrays, or a simplified version that only lets you turn Optane Memory acceleration on and off and view statistics).
- Install the utility. It automatically switches the SATA mode in the BIOS to RST / Optane mode (this requires one reboot) and then enables acceleration with Optane Memory (this requires a second reboot). As a result, instead of two disk devices, the system will see only one: the so-called Optane Volume.
- PROFIT! Namely:
- Faster loading of the operating system;
- Acceleration of the majority of I/O operations (essentially caching, but with rather smart algorithms).
Principle of operation
Let's also talk a little about how it all works.
First, when Optane Memory is activated, the RST driver moves the files needed to boot the OS, as well as the file table, to the fast Optane Memory drive. The key word here is "move", not "copy". The RST driver works in such a way that data held in the cache on the fast device is not necessarily also written to the slow device. This increases overall system performance and, in addition, solves the problem of data synchronization. However, as you can guess, a physical failure of Optane Memory is likely to result in loss of access to the data on the SATA disk. Because the data is moved immediately when Optane Memory is activated, the very first boot after activation will already be faster than before (this is especially noticeable when a hard disk is being accelerated rather than a SATA SSD, although in the latter case you can also expect higher storage performance).
Secondly, the RST driver performs caching continuously while the system is running. Here there is one important difference between Optane Memory modules of different capacities: the 16GB device supports only block-level caching, while the 32GB device supports both block-level and file-level caching (both work simultaneously). With block caching, the decision to cache a block is made instantly, at the time of the I/O request. With file caching, the driver monitors how often files are accessed and records this in a special table, which it then uses (when the system is idle or on a user-defined schedule) to determine which files stay in the cache, which are removed and which are added.
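To make the file-level idea more concrete, here is a toy sketch of an access-frequency table that is re-evaluated at idle time. It is emphatically not the RST driver's actual algorithm, which I cannot describe; the class name, the greedy "most frequently accessed files that fit" rule and all parameters are my own assumptions for illustration.

```python
from collections import Counter


class FileAccessTable:
    """Toy model: count file accesses, then pick cache contents when idle."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.hits = Counter()   # path -> number of accesses seen so far
        self.sizes = {}         # path -> last known size in bytes
        self.cached = set()     # paths currently held in the cache

    def record_access(self, path, size_bytes):
        # Called on every file access; no caching decision is made here.
        self.hits[path] += 1
        self.sizes[path] = size_bytes

    def rebalance(self):
        """Run at idle time: keep the most frequently used files that fit.

        Returns (added, evicted) as two sets of paths.
        """
        chosen, used = set(), 0
        for path, _count in self.hits.most_common():
            size = self.sizes[path]
            if used + size <= self.capacity_bytes:
                chosen.add(path)
                used += size
        added = chosen - self.cached
        evicted = self.cached - chosen
        self.cached = chosen
        return added, evicted
```

The point of the sketch is the split between cheap bookkeeping on every access and a deferred decision at idle time, in contrast to block caching, where the decision is made at the moment of the request.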
Both types of caching rely on decision-making algorithms that are, in my opinion, smart enough. I cannot describe them in depth, but for a general understanding I will note that, for example, video files are not cached (yes, the driver looks at the file extension), file size is taken into account, and the type of load is detected: random access is preferred over sequential access for caching, which makes sense given how extremely slow hard disks are at random access. On the Internet I have come across negative comments along the lines of "the cache fills up with data immediately", "16GB is not enough for anything" and the like. As a rule, these come from people who have never tested Optane Memory. I have not yet heard negative feedback on the performance of this solution from any of the partners I work with.
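The kinds of signals mentioned above (file extension, file size, random versus sequential access) can be combined into a simple priority function. This is only a sketch of the general idea: the extension list, the size threshold and the scoring are hypothetical values of mine, not what the RST driver really uses.

```python
import os

# Hypothetical values for illustration only.
VIDEO_EXTENSIONS = {".mp4", ".mkv", ".avi", ".mov"}
MAX_FILE_BYTES = 512 * 1024 * 1024


def cache_priority(path, size_bytes, is_random_access):
    """Return 0 (do not cache), 1 (low priority) or 2 (high priority)."""
    extension = os.path.splitext(path)[1].lower()
    if extension in VIDEO_EXTENSIONS:
        return 0  # video files are skipped based on their extension
    if size_bytes > MAX_FILE_BYTES:
        return 0  # very large files would crowd out everything else
    # Random access is where hard disks are slowest, so it is preferred.
    return 2 if is_random_access else 1


print(cache_priority("movie.MKV", 700_000_000, False))  # 0
print(cache_priority("app.dll", 2_000_000, True))       # 2
print(cache_priority("log.txt", 2_000_000, False))      # 1
```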
A few very important points.
- If acceleration with the RST driver and Optane Memory is enabled and you need to connect the SATA drive to another system, you must either move the entire configuration (the SATA device plus Optane Memory, making sure that the new system supports Optane Memory) or turn acceleration off first. Turning it off takes a single button press in the utility: the cached data is moved back to the SATA device, the RST metadata is deleted and the Optane Memory device is cleared.
- Disk cloning does not work while acceleration with Optane Memory is enabled, since no cloning utility can handle RST metadata. Directly cloning the partition with the metadata is not enough, because the metadata is tied to the serial numbers of the Optane Memory and SATA devices. File-system-level backups work without any difficulty.
Why do you need it
Now it's time to talk in more detail about why all this is needed at all. Let's start with a closer look at the loads that ordinary PC users put on their systems. Even before development of Optane Memory was finished, my colleagues ran a study, as part of the Intel Product Improvement Program, of what ordinary users do with their computers at home and at work. The results, as the number of actions of each type performed by users (averaged over one day of PC use):
All of these events are closely tied to the performance of the system disk and, as a rule, require random access to data, which hard disks handle extremely poorly. Thus, Optane Memory can significantly speed up each of the actions above.
However, you may ask: why should I buy Optane Memory to speed up a hard drive if, for the same money, I can buy a 128GB SATA SSD, put the OS and key applications on it, and use the hard drive for everything else? On the one hand, it is a matter of convenience. If you have at least the basic skills needed to choose where the OS and applications are installed (I suspect all GT readers fall into this category, but I can assure you that my parents, for example, like most PC users, do not), and if you are not too lazy to do this for every application (games are especially problematic: with today's disk space requirements, 128GB will be filled by the OS and 1-2 games), then the hybrid SSD + HDD configuration may well be convenient for you.
Keep in mind, though, that Optane Memory requires no manual moving of data: as soon as you stop using one application and start actively using another, the necessary data is quickly added to the cache. On the other hand, recall the graph I showed above, performance as a function of queue depth. At shallow queues, the latency of access to data on Optane Memory is much lower than on a SATA SSD. Inside Intel, we measured the queue depth used by various applications. Here are the results:
Queue depth when using applications:
Queue depth when launching applications:
The distribution of the queue depth during a typical corporate user working day (measured by Intel employees in various positions in the company):
Thus, the distribution of the queue depth of different user loads:
And we have already seen how much better Optane Memory performs at shallow queue depths.
Comparing system performance with HDD versus the same system with HDD + Optane Memory:
Another interesting comparison is the same test, but with the system without Optane Memory given twice as much RAM:
And this is, in fact, a very valid comparison. Although some workloads do require a large amount of RAM, the lion's share of them do not. Thus, for many users it may make sense to install 4 GB of memory instead of 8 GB and invest the money saved in speeding up the storage system.
Conclusion
Summing up, let me remind you that Optane Memory can be used as a standalone SSD, but this is not its main usage model. All the magic happens when it is used as an accelerator for a slow hard disk (or even a SATA SSD): a relatively small investment can make the system several times faster on most user workloads. This is achieved through both hardware (Optane Memory has noticeably lower access latency than other SSDs on the market and performs much better at shallow queue depths than alternative solutions) and software: the RST driver uses fairly advanced logic to make caching decisions (and this is what distinguishes it from the previous technology, Intel Smart Response Technology). This sets the current implementation apart from all the hard drive caching / acceleration solutions released to the market earlier, including our own.
I would be very interested to hear your opinions on the product and the solution as a whole in the comments. However, I would like to avoid negative opinions that stem from not understanding how the solution works or from a lack of experience with it. If in doubt, ask before you criticize.
P.S. In the next article we will look at the server product based on 3D XPoint technology, the Intel Optane SSD DC P4800X Series, together with the Intel Memory Drive Technology software solution.
* All tests listed in this article were conducted inside Intel. All tests with Optane Memory were carried out on 7th generation Intel Core processors; the queue depth tests were carried out on a 6th generation Intel Core processor. Configuration of the system used for the tests:
