
Back to basics or a step forward? HPE StoreVirtual VSA

Following current trends in the data storage and processing industry, I arrived at SDS (software-defined storage), and specifically at an HPE product called HPE StoreVirtual VSA, which I will tell you about in this article.



Software-defined storage (hereinafter SDS) is considered the next milestone in the development of data storage technologies. The technology is new to the market, so the term has no precise, settled definition yet. A similar situation existed at the dawn of the "clouds", when that term was applied to everything without much thought about its meaning. By analogy with those "clouds", the main idea of SDS is to abstract away from the hardware and move to a business-oriented model for building storage and processing systems.

Today, in most cases, the term refers to software virtualization of storage systems. This is not the well-known practice of substituting one storage system for another through built-in functions or hardware gateways, but virtualization of the local disks of your compute nodes.
This approach leaves me a little bewildered, because for many years all the top-tier manufacturers in the IT industry persuaded us that the future belongs to storage systems built on the idea of separating data storage from data processing, which increases availability and security.

My IT career began just as virtualization was emerging, so I saw first-hand how reluctantly and painfully many companies moved from the "orthodox" scheme of "one application - one server" to virtualized clusters with external storage systems. And now, when the use of storage systems has been justified many times over and is universally accepted in "combat" infrastructures, the same manufacturers tell us that the new milestone is a return to the local disks of our servers. That is why in this article I want not only to express my opinion, but also to hear your assessment of both this particular product and the idea of SDS in general.

Architecture


HPE VSA is a software product that is installed on the nodes of a virtualization cluster. Currently, VMware vSphere, Microsoft Hyper-V, and KVM are supported. For vSphere and Hyper-V, there are plugins that integrate with their management consoles, allowing you to manage the entire infrastructure from a single point.


A typical HPE StoreVirtual VSA deployment requires 3 nodes: two nodes for fault tolerance, and a third to host the Quorum Witness, which is responsible for keeping data consistent across all nodes if the connection between them fails. A scenario with only 2 nodes is also possible, but in that case an independent NFSv3 file share is required to host the Quorum Witness.
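To illustrate why a third vote is needed, here is a minimal sketch of generic majority voting. It is not HPE's implementation: the function name `has_quorum` and the "one vote per participant" model are assumptions made purely for illustration.

```python
def has_quorum(reachable_votes: int, total_votes: int) -> bool:
    """Return True if this side of a network split holds a strict majority.

    Hypothetical model: every storage node and the witness carry one vote.
    """
    if not 0 <= reachable_votes <= total_votes:
        raise ValueError("reachable_votes must be between 0 and total_votes")
    return 2 * reachable_votes > total_votes


# Two nodes, no witness: after a split each side sees 1 of 2 votes,
# so neither side can safely keep serving writes.
two_nodes_split = has_quorum(1, 2)

# Two nodes plus a witness: the side that still reaches the witness
# sees 2 of 3 votes and keeps working; the isolated node sees 1 of 3.
side_with_witness = has_quorum(2, 3)
isolated_node = has_quorum(1, 3)
```

With only two voters, a split leaves both halves without a majority, which is why the 2-node scenario still needs an external witness.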

From the point of view of the virtual cluster administrator, StoreVirtual VSA is a virtual machine running Enterprise Linux, one for each node in the cluster. Installation can be performed either through a special wizard or by deploying an OVF image. In both cases, you will need to configure the disk of the virtual machine being created. The size of the virtual disk will depend on the block size specified when formatting the datastore.

Like any software, StoreVirtual VSA consumes CPU and RAM. To help size a new solution (or check compatibility with an existing cluster), HPE has published the following recommendations:
StoreVirtual VSA capacity (total of all storage devices) | Memory required (GB) without Adaptive Optimization or Space Reclamation | Memory required (GB) with Adaptive Optimization and/or Space Reclamation
<= 1 TB | 4 | 4
> 1 TB to <= 4 TB | 5 | 5
> 4 TB to <= 10 TB | 7 | 8
> 10 TB to <= 20 TB | 9 | 12
> 20 TB to <= 30 TB | 12 | 17
> 30 TB to <= 40 TB | 15 | 21
> 40 TB to <= 50 TB | 18 | 26
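The recommendations above can be turned into a small lookup helper for sizing checks. This is only a sketch: the names `MEMORY_TABLE` and `required_memory_gb` are hypothetical, and the figures are copied from the table in this article, not queried from any HPE tool.

```python
# (upper capacity bound in TB, GB of RAM without Adaptive Optimization /
# Space Reclamation, GB of RAM with them), as listed in the article's table.
MEMORY_TABLE = [
    (1, 4, 4),
    (4, 5, 5),
    (10, 7, 8),
    (20, 9, 12),
    (30, 12, 17),
    (40, 15, 21),
    (50, 18, 26),
]


def required_memory_gb(capacity_tb: float, advanced_features: bool = False) -> int:
    """Look up the recommended VSA memory for a given total capacity.

    advanced_features=True means Adaptive Optimization and/or
    Space Reclamation are in use.
    """
    if capacity_tb <= 0:
        raise ValueError("capacity must be positive")
    for upper_bound_tb, basic_gb, advanced_gb in MEMORY_TABLE:
        if capacity_tb <= upper_bound_tb:
            return advanced_gb if advanced_features else basic_gb
    raise ValueError("capacities above 50 TB are not covered by the table")


# A 10 TB VSA with Adaptive Optimization enabled falls into the
# "> 4 TB to <= 10 TB" row.
memory_for_10tb = required_memory_gb(10, advanced_features=True)
```

Capacities that fall exactly on a boundary belong to the lower row, matching the "<=" bounds in the table.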

This product is licensed by the capacity that we present to the hypervisor as a datastore. It is important to understand that, from the end user's point of view, this is "raw" capacity. First, the local drives of your servers are combined into RAID groups at the level of the host's RAID controller. The resulting usable capacity is what gets licensed. Once every VSA is running, they are joined into a cluster and their disk space is combined into one common pool, which is then made available to users (servers) over a block protocol (iSCSI).

Features


Nowadays it is not enough just to provide some kind of data storage capacity. The storage system must have the ability to protect data from failures, optimize storage, management and processing, and StoreVirtual VSA can offer us all the basic functions that are present in hardware storage systems:

• Thin provisioning - thin volumes that allow the system to allocate space as it is actually filled with data;
• Peer Motion - volume migration without interrupting access to them;
• Multi-site SAN - a single volume distributed across two or three sites;
• Synchronous and asynchronous replication;
• Creating consistent snapshots at the application level;
• Network RAID - building fault-tolerant schemes at the cluster node level;
• Split Site - creation of geographically separated clusters;
• Adaptive Optimization - two-level automatic tiering with a granularity of 256 KB.

All functions can be managed both through the CMC (Centralized Management Console) and through the plugins for the VMware / Hyper-V management consoles.

In my opinion, the most interesting option will be Network RAID, which protects data from loss in case of failure of the entire cluster node. In essence, this is synchronous replication between nodes, and the RAID level regulates the number of copies of data blocks stored in the cluster.



For example, with Network RAID level 10, which the manufacturer recommends as the optimal choice, 2 copies of each data block are always stored in the cluster. This scheme guarantees protection against data loss if one cluster node fails (and, with luck, up to half of the nodes), but only 1/2 of the capacity remains usable. This means that a 2 TB StoreVirtual VSA license gives us virtual storage with 1 TB of usable capacity.

The next level of protection is Network RAID level 10+1. In this case, 3 copies of each data block are stored in the cluster, so the cluster can lose up to 2 nodes, and usable capacity drops to 1/3 of the licensed volume. Network RAID 10+1 is the redundancy algorithm that underlies SplitSite technology, which allows you to build a cluster across 3 geographically dispersed sites. The most remarkable thing about this technology is that, unlike classic replication between hardware storage systems, there is no notion of a primary and a secondary site. It does not matter to the application at which site the block it is working with currently resides: if a node fails and the duplicate block is read from another site, the application does not notice the difference, so there is no downtime. On the other hand, this imposes strict requirements on the link between the sites: the round-trip latency must not exceed 5 ms.

Network RAID level 10+2 creates copies of blocks on all nodes of the cluster, which allows us to lose n-1 nodes, but the usable share of capacity drops to 1/n (where n is the number of nodes in the cluster). The minimum supported configuration here starts at 3 nodes, which in my opinion is excessive and not applicable in real architectures.
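The capacity arithmetic for the mirrored levels can be sketched as follows. The function names are hypothetical, and the copy counts are taken from the description in this article (2 copies for level 10, 3 for 10+1, a copy on every node for 10+2), not from an HPE API.

```python
def copies_per_block(level: str, nodes: int) -> int:
    """Number of copies of each block, using the figures quoted in the article."""
    if level == "10":
        copies = 2
    elif level == "10+1":
        copies = 3
    elif level == "10+2":
        copies = nodes  # per the article: a copy on every node of the cluster
    else:
        raise ValueError(f"unsupported Network RAID level: {level}")
    if nodes < copies:
        raise ValueError(f"level {level} needs at least {copies} nodes")
    return copies


def usable_capacity_tb(licensed_tb: float, level: str, nodes: int) -> float:
    """Usable capacity left after every block is stored `copies` times."""
    return licensed_tb / copies_per_block(level, nodes)


def guaranteed_node_failures(level: str, nodes: int) -> int:
    """Node failures survivable in the worst case: one copy must remain."""
    return copies_per_block(level, nodes) - 1


# The example from the text: a 2 TB license under Network RAID 10.
usable_raid10 = usable_capacity_tb(2.0, "10", nodes=2)  # 1.0 TB
```

Dividing by the number of copies makes the trade-off explicit: each additional tolerated node failure costs one more full copy of the data.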

By analogy with hardware RAID, there are also redundancy levels based on checksums: Network RAID levels 5 and 6. In these schemes, checksum blocks are distributed evenly across all nodes of the cluster. They are the most economical in terms of overhead (usable capacity equal to that of n-1 nodes for level 5 and n-2 nodes for level 6), while their fault tolerance matches that of level 10 and level 10+1, respectively. However, the need to compute checksums noticeably reduces the performance of the entire cluster, so these levels are not recommended for heavily loaded applications (the most obvious example is databases).

Positioning


StoreVirtual VSA is part of the hyperconverged systems solution family. Like any other software component of a hyperconverged solution, it is a budget equivalent of its hardware original, a dedicated storage system. As I see it, the main task of this product is to create fault-tolerant, secure storage for your virtualized cluster without purchasing additional equipment. At the same time, you should not underestimate Network RAID, which makes it possible to implement synchronous replication between cluster nodes, including geographically dispersed ones.

As you can see from the functionality I have described, this solution is self-sufficient and gives its users the capabilities of an entry-level hardware storage system for considerably less money.

In conclusion, it is worth noting that there is currently a promotional program: when you purchase any HPE ProLiant Gen9 server, you receive a 1 TB StoreVirtual VSA license for free. This is an excellent reason not only to download trial licenses for a detailed study of the product, but also to start using it in production for new projects without increasing their budget.

Source: https://habr.com/ru/post/308392/

