Databases, content delivery networks, big data, artificial intelligence, machine learning: all of these data-driven scenarios demand high performance from the entire IT infrastructure. For the storage subsystem the fix is simple: install high-speed NVMe SSDs instead of SAS and SATA drives. The compute side is harder, because central processors cannot keep up with many time-sensitive operations. To eliminate this bottleneck, ScaleFlux has developed a new type of drive. Inside it, FPGA components work side by side with 3D NAND memory and take over many typical data operations. In this post we describe the ScaleFlux solution in detail.

Principle of operation
ScaleFlux CSS stands for Computational Storage System. The device usually comes as a PCIe expansion card or a U.2 drive. Inside it are fast flash memory (1.6 TB, 3.2 TB, or 6.4 TB) and a semiconductor component with the complicated name “field-programmable gate array”, better known as an FPGA.

In a conventional SSD infrastructure, the central processor handles all computational operations, including those most closely tied to the data. Compression is one example: applications that work with large amounts of information compress it (with GZIP, for instance) to save disk space.
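To make the host-side cost concrete, here is a minimal Python sketch of GZIP compression done on the CPU, with the processor time it consumes. The sample payload and the `compress_on_host` helper are illustrative assumptions, not part of any ScaleFlux software, and the timing will vary by machine.

```python
import gzip
import time
from typing import Tuple


def compress_on_host(payload: bytes) -> Tuple[bytes, float]:
    """Compress payload with GZIP on the CPU; return (compressed, cpu_seconds)."""
    start = time.process_time()          # CPU time used by this process
    compressed = gzip.compress(payload)  # the work a conventional setup leaves to the CPU
    cpu_seconds = time.process_time() - start
    return compressed, cpu_seconds


# Synthetic, highly repetitive sample data (an assumption, not a real workload).
sample = b"card_id=1234;amount=42.00;status=OK\n" * 200_000

compressed, cpu_seconds = compress_on_host(sample)
print(f"raw: {len(sample)} bytes, compressed: {len(compressed)} bytes")
print(f"CPU time spent compressing on the host: {cpu_seconds:.3f} s")
```

Every second reported here is processor time that is unavailable to the application itself. This is the kind of work that a computational drive is meant to take off the CPU.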
In an infrastructure with ScaleFlux CSS, compression is performed right in the drive, as are other frequent operations, for example:
- Erasure coding
- Search in key-value stores
- AES-128/256 encryption
- SHA-3 hashing
This frees up CPU resources, which can then be used to accelerate applications. With the principle of operation clear, let us look at how it works in real conditions.
ScaleFlux in popular applications
Our main goal is for ScaleFlux CSS to work without elaborate workarounds. Along with the device we supply a software package for Linux (kernel version 2.6 is required). With this package, the FPGA, the computational part of the CSS that systems access through a compatible API, can be configured within a few minutes. So far we have released software for nine popular data-driven systems: MySQL, PostgreSQL, Hadoop, Aerospike, HBase, Hortonworks, RocksDB, Spark, Vitesse Data.
To decide whether support for a particular system is worth developing, we run benchmarks that compare the performance of similar configurations with NVMe cards and with ScaleFlux CSS. Here are the results:
More detailed results for each scenario, with graphs and test configurations, are available on our site. The list of officially supported platforms still lacks some fairly well-known ones: MongoDB, Cassandra, Vertica, etc. We are working on compatibility with these systems and will add them once we have ironed out the rough edges. If you use the CSS with an application that has no official support, you get a standard NVMe block storage device. Later, if necessary, you can easily move to a supported system and start using the computational part.
Data Protection and Common Questions
ScaleFlux CSS can use several technologies to protect information: flash RAID, redundant writes, scanning and error correction. Checkpoints are created continuously for critical information such as address tables.
Additional capacitors are installed in the CSS to protect against power loss. If external power fails, they hold enough charge to write the necessary information without loss. Thermal throttling is provided for operation at elevated temperatures.
In price, ScaleFlux CSS drives are comparable to conventional NVMe cards: the difference usually does not exceed 9%. In practice, this difference is often offset by the space saved through “delegated” compression. The ScaleFlux CSS warranty is three years at 5 full drive writes per day.
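The warranty and price figures above can be turned into rough numbers. The sketch below is a back-of-the-envelope calculation, not an official ScaleFlux formula: it assumes a 365-day year, takes the 9% premium as the upper bound quoted above, and uses a 2:1 compression ratio purely as an illustrative assumption. The function names are hypothetical.

```python
DAYS_PER_YEAR = 365  # assumption: leap days ignored


def warranted_writes_tb(capacity_tb: float, dwpd: float = 5, years: float = 3) -> float:
    """Total data (TB) that may be written under the warranty terms."""
    return capacity_tb * dwpd * DAYS_PER_YEAR * years


def relative_cost_per_stored_tb(price_premium: float = 0.09,
                                compression_ratio: float = 2.0) -> float:
    """Cost per TB of user data relative to an uncompressed NVMe card (1.0 = equal).

    compression_ratio is an illustrative assumption; real ratios depend on the data.
    """
    return (1.0 + price_premium) / compression_ratio


for capacity_tb in (1.6, 3.2, 6.4):
    total = warranted_writes_tb(capacity_tb)
    print(f"{capacity_tb} TB drive: about {total:,.0f} TB written over the warranty period")

print(f"Relative cost per stored TB: {relative_cost_per_stored_tb():.3f}")
```

Under these assumptions, a value below 1.0 from `relative_cost_per_stored_tb` means the compression savings outweigh the price premium. With data that barely compresses, the ratio approaches 1.0 and the premium is not recovered this way.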
We can share some implementation experience. One of our financial customers processes 4 billion card transactions per year, stores all the data in HBase and analyzes it to create new offers. After the introduction of ScaleFlux, the volume occupied by its analytics data was halved, as was the database query time. Another customer, a developer of digital security tools, uses a different database, Aerospike. They replaced six SATA SSDs with one ScaleFlux system and, as a result, doubled transaction speed.
If you want to see and test ScaleFlux CSS, you can contact us via the form, in the comments to the post, by email at ru@globaldots.com or by phone at +7-495-762-45-85.