
HPC data storage solutions must provide data protection, information availability, scalability and guaranteed high system performance.
RAIDIX storage software, in conjunction with Intel Enterprise Edition for Lustre*, provides the necessary functionality and lets you build an efficient storage cluster on standard hardware.
This article gives technical descriptions of the RAIDIX and Intel Enterprise Edition for Lustre* solutions, the recommended hardware architecture, and a storage deployment scheme for high-performance computing.
High Performance Computing (HPC)
Today, High Performance Computing (HPC) is no longer just an IT tool for researchers. More and more companies are discovering the competitive advantages of HPC in a variety of business models. Enterprises generate large amounts of data and use high-performance applications to analyze and process it.
In the corporate sector, data availability and access performance are becoming as critical as business continuity.
For example, a recent report at the CNEWS forum noted that high-performance computing and big data give a large retailer an advantage in the competition for customers. Processing huge volumes of information lets large-scale, impersonal retail operate at the level of individual, personalized service.
As a result, businesses need a dedicated storage infrastructure that offers flexible horizontal scaling, high throughput and fault tolerance without data loss.
RAIDIX / Intel Collaborative Solution
The commercial product Intel Enterprise Edition for Lustre* includes Lustre software functionality optimized for reliable storage and maximum throughput in an HPC environment. Its main advantages are high performance, flexibly scalable capacity, its own management components and 24/7 support.
To address the challenges of the HPC industry, RAIDIX created a complete joint solution based on RAIDIX HPC cluster-in-a-box technology and Intel Lustre software. The solution consists of RAIDIX storage management software running on standard server hardware together with a Lustre OSS (object storage server) / OST (object storage target) or MDS (metadata server) / MDT (metadata target), which serves as a building block for a Lustre HPC storage infrastructure.
Such building blocks can contain from 8 to 128 disks in a high-density chassis and deliver throughput of up to 12 GB/s. Individual storage nodes are combined into a horizontally scalable system using Intel Enterprise Edition for Lustre*.
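As a rough sizing aid, the sketch below estimates how many building blocks a target aggregate throughput would need. It assumes ideal linear scaling across blocks, which the article does not claim, and treats the 12 GB/s figure as a per-block upper bound; the function name is hypothetical.

```python
import math

# Upper bound per building block, taken from the figure quoted in the text.
BLOCK_MAX_GBPS = 12.0


def blocks_needed(target_gbps: float, per_block_gbps: float = BLOCK_MAX_GBPS) -> int:
    """Smallest number of blocks whose combined peak throughput meets the target.

    Assumes ideal linear scaling, so real deployments may need more blocks.
    """
    if target_gbps <= 0 or per_block_gbps <= 0:
        raise ValueError("throughput values must be positive")
    return math.ceil(target_gbps / per_block_gbps)


if __name__ == "__main__":
    print(blocks_needed(100))  # blocks for a 100 GB/s target under ideal scaling
```

Because the per-block figure is a maximum, the result is best read as a lower bound on the number of blocks.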
RAIDIX storage meets demanding requirements for performance, fault tolerance and workflow integrity. It provides high throughput, low latency and reliable storage through parallel computation and proprietary algorithms for RAID 6 and RAID 7.3. These algorithms reach calculation speeds of 37 GB/s (RAID 6) and 25 GB/s (RAID 7.3) per processor core.
Traditional Lustre OSS and MDS configurations rely on additional equipment and individual configuration of each server. RAIDIX instead lets you build an HPC storage infrastructure from integrated units and reduce cost of ownership, thanks to broad compatibility with standard hardware and with SAN and NAS protocols.
Dual-controller configuration
To ensure full resiliency, RAIDIX solutions can operate in dual-controller cluster mode (Active-Active). Storage Bridge Bay (SBB) compatible platforms are the best fit for dual-controller configurations, because they already contain the components needed for high-availability storage.
General requirements for dual-controller RAIDIX platform:
CPU | Intel Xeon E5-2637 v4 / E5-2667 v4 processors |
Motherboard | Must be compatible with the processor model and support PCI Express 3.0 x8 / x16 |
Internal cache | Must be compatible with the motherboard, at least 64 GB per node |
Chassis | Dual power supply and dual motherboard recommended |
SAS controller (additional ports can be used to connect external JBODs) | Broadcom 93xx recommended |
HBA (cache synchronization controller) | Mellanox ConnectX-3 VPI and above recommended |
HBA (controller for connecting to Lustre over the network) | Mellanox ConnectX-3 VPI and above recommended |
HDD | Dual-controller architecture requires SAS disks. |
Layer 2 Cache Devices | HGST SSD SS200 |
Lustre network | InfiniBand* QDR / FDR / EDR, Ethernet 10GbE / 40GbE / 100GbE |
Management network | Ethernet 1GbE |
Solution characteristics
RAIDIX lets you build storage with fast and reliable failover, high-performance data processing, and broad functionality for data integrity and system monitoring.
RAIDIX software integrated with Intel Enterprise Edition for Lustre* includes an installation package for systems based on Intel Xeon processors. RAIDIX error-correcting coding algorithms, tuned for Intel processors, deliver high-speed operation.
A horizontally scalable cluster based on Intel Lustre offers a number of advantages:
- high manageability with Intel Manager for Lustre;
- high I/O performance for enterprise applications such as MapReduce;
- Intel Xeon Phi client support;
- a Hadoop connector that lets you use a Lustre cluster for Hadoop applications;
- full hierarchical storage management;
- a special patch that improves the processing of single-threaded requests.
Storage Management
RAIDIX-based storage has a convenient web-based interface that allows you to configure storage volumes and monitor system performance.
Lustre Cluster Management
The Lustre cluster is managed through Intel Manager for Lustre, a web application built on a REST API and accompanied by a full CLI. The application provides:
- creation and monitoring of Lustre file systems;
- server and volume configuration;
- tools for monitoring performance and resource utilization.
Data Volume Protection
RAIDIX software uses error-correcting coding based on proprietary algorithms optimized for high-performance tasks. RAIDIX supports several RAID levels (RAID 0, RAID 5, RAID 6, RAID 7.3, RAID N+M and RAID 10), so system administrators can choose the level of data protection they need.
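To make the trade-off between these levels concrete, the sketch below computes how many disks' worth of capacity remain usable in a group. It assumes the conventional redundancy overheads: one parity disk for RAID 5, two for RAID 6, mirroring for RAID 10. Treating RAID 7.3 as triple parity and RAID N+M as M parity disks is an assumption based on how these levels are usually described, not something this article states; the function name is hypothetical.

```python
# Parity disks per group for the fixed-parity levels (assumed overheads).
PARITY_DISKS = {"0": 0, "5": 1, "6": 2, "7.3": 3}


def usable_disks(level: str, total_disks: int, m: int = 0) -> int:
    """Return how many disks' worth of capacity hold user data."""
    if level in PARITY_DISKS:
        usable = total_disks - PARITY_DISKS[level]
    elif level == "10":
        if total_disks % 2:
            raise ValueError("RAID 10 needs an even number of disks")
        usable = total_disks // 2  # every block is mirrored once
    elif level == "N+M":
        usable = total_disks - m  # m parity disks chosen by the administrator
    else:
        raise ValueError(f"unknown RAID level: {level}")
    if usable <= 0:
        raise ValueError("not enough disks for this level")
    return usable


if __name__ == "__main__":
    for level in ("0", "5", "6", "7.3", "10"):
        print(level, usable_disks(level, 24))
```

With `level="N+M"`, the `m` argument trades capacity for the number of simultaneous disk failures the group can survive.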
Guaranteed high performance
All RAID calculations run on standard Intel Xeon processors, which offer high performance and a high degree of parallelism. RAIDIX software includes a proactive reconstruction mechanism that improves read speed by excluding from the read any disks that respond more slowly than the others.
Proactive reconstruction recovers the data through RAID calculations, at around 25 GB/s, faster than it could be read physically from the slow disk. This keeps system performance high even in degraded mode, when several disks have failed.
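The idea behind proactive reconstruction can be illustrated with a deliberately simplified model. The sketch below uses single XOR parity, as in RAID 5, rather than the proprietary RAID 6 / RAID 7.3 algorithms, and always skips the single slowest data disk; the function names and the latency list are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def read_stripe(disks, parity, latencies_ms):
    """Read one stripe without touching the slowest data disk.

    disks[i] is the block stored on data disk i, and latencies_ms[i] is
    that disk's observed read latency. The skipped block is rebuilt from
    the remaining data blocks and the parity block.
    """
    slow = max(range(len(disks)), key=latencies_ms.__getitem__)
    # Fetch every data block except the one on the slow disk.
    fetched = {i: disks[i] for i in range(len(disks)) if i != slow}
    # XOR of the surviving blocks and parity yields the missing block.
    fetched[slow] = xor_blocks(list(fetched.values()) + [parity])
    return slow, [fetched[i] for i in range(len(disks))]


if __name__ == "__main__":
    data = [b"abcd", b"efgh", b"ijkl"]
    parity = xor_blocks(data)
    skipped, stripe = read_stripe(data, parity, [5, 40, 6])
    print(skipped, stripe == data)
```

A real implementation would presumably apply a latency threshold instead of always skipping one disk, and with double or triple parity it could tolerate more than one slow or failed disk per stripe.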
High data availability
In dual-controller mode, RAIDIX forms a fault-tolerant, high-performance cluster and places the RAID arrays asymmetrically on the nodes. Each RAID array can be accessed through a different node. At the same time, the parallel Lustre file system lets a client read from and write to multiple OST volumes simultaneously, increasing overall performance.
RAIDIX automatic and manual failover functions increase system resiliency. In addition, RAIDIX delivers balanced high performance because RAID arrays can be migrated from any node in the cluster.
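Parallel access to several OSTs works because Lustre stripes a file across them. The sketch below models plain round-robin striping with a stripe size and stripe count and shows how a byte range is spread over the OSTs in a file's layout. It is an illustration under that assumption, the function names are hypothetical, and the returned indices are positions within the file's layout, not cluster-wide OST numbers.

```python
MIB = 1024 * 1024


def ost_for_offset(offset: int, stripe_size: int, stripe_count: int) -> int:
    """Index (within the file's layout) of the OST holding this byte."""
    return (offset // stripe_size) % stripe_count


def split_io(offset: int, length: int, stripe_size: int, stripe_count: int) -> dict:
    """Return {ost_index: bytes} for the byte range [offset, offset + length)."""
    per_ost = {}
    end = offset + length
    while offset < end:
        # End of the stripe unit that contains the current offset.
        unit_end = (offset // stripe_size + 1) * stripe_size
        chunk = min(end, unit_end) - offset
        idx = ost_for_offset(offset, stripe_size, stripe_count)
        per_ost[idx] = per_ost.get(idx, 0) + chunk
        offset += chunk
    return per_ost


if __name__ == "__main__":
    # An 8 MiB read over 4 OSTs with 1 MiB stripes touches every OST equally.
    print(split_io(0, 8 * MIB, MIB, 4))
```

In this model a large sequential request is served by all OSTs in the layout at once, which is where the aggregate bandwidth gain comes from.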
Integrating Lustre into dual-controller RAIDIX allows the user to:
- place multiple Lustre OSTs asymmetrically on each node of a RAIDIX cluster and balance the load across the nodes;
- ensure high availability of the data stored on OSTs and MDTs: if a node fails, the data remains available through the other node;
- integrate the Lustre OST and MDT failover mechanism into the failover process for the entire node. In this case there is no need for additional services such as Corosync and Pacemaker, because the RAIDIX cluster takes over Lustre failover completely.
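The node-level failover described in this list can be pictured with a toy model. The sketch below is not RAIDIX code: the class, node names and target names are hypothetical, and it only shows the bookkeeping in which every target served by a failed node moves to the surviving node.

```python
from dataclasses import dataclass, field


@dataclass
class DualControllerCluster:
    """Toy model of a two-node cluster serving Lustre targets."""

    placement: dict  # target name -> node currently serving it
    alive: set = field(default_factory=lambda: {"node-a", "node-b"})

    def fail(self, node: str) -> None:
        """Mark a node as failed and move its targets to the survivor."""
        self.alive.discard(node)
        if not self.alive:
            raise RuntimeError("both nodes are down, targets are unavailable")
        survivor = next(iter(self.alive))
        for target, owner in list(self.placement.items()):
            if owner == node:
                self.placement[target] = survivor

    def targets_on(self, node: str) -> list:
        """Sorted list of targets currently served by the given node."""
        return sorted(t for t, n in self.placement.items() if n == node)


if __name__ == "__main__":
    cluster = DualControllerCluster(
        {"OST0": "node-a", "OST1": "node-b", "OST2": "node-a", "OST3": "node-b"}
    )
    cluster.fail("node-a")
    print(cluster.targets_on("node-b"))
```

Before the failure the targets are split between the two nodes, which models the load balancing from the first bullet; afterwards all of them are served by the survivor.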
Solution Deployment Scheme
Figure 1. RAIDIX deployment diagram in conjunction with Intel Enterprise Edition for Lustre*
The following system deployment is recommended for a typical HPC application:
- For higher availability, each OST uses a dual-controller (DC) RAIDIX architecture.
- On each controller of a RAIDIX DC used for OSTs, a Lustre OSS is installed in an Active-Active configuration.
- Each OST in a RAIDIX cluster is registered on both OSS servers installed on the cluster nodes. Native RAIDIX failover is configured: if one OSS fails, the RAIDIX fault-tolerance mechanism transfers control of the OST to the second, functioning OSS.
- The Lustre MGS (management server) and MDS (metadata server) must also be configured in fault-tolerant mode within a RAIDIX DC to achieve higher availability of the MGT and MDT targets.
- Intel Manager for Lustre is installed to provide enhanced management and monitoring of the system.
- 1GbE Ethernet is used for the management network.
- 56 Gb InfiniBand is used for the Lustre network.
- Lustre is installed on each client machine.
Following these guidelines lets you create a highly available HPC storage infrastructure.
Proposed architecture
As a hardware platform, RAIDIX recommends cluster nodes housed in the same chassis and identical SBB devices. The platform should scale with additional JBOD disk shelves to increase capacity and performance.
The AIC HA201-TP is a 2U cluster-in-a-box high-availability solution (a "ready-made cluster") built from widely available components. The dual-controller configuration consists of two Intel servers (S26xxTP). Each node supports two Intel Xeon E5-2600 v4 series processors.
Figure 2. AIC HA201-TP SBB module, front and rear panels.
The HA201-TP solution provides high data availability in Active-Active mode and includes fault-tolerant, hot-swappable compute nodes, 24 hard drive bays and 5 PCIe Gen3 slots per node.
Platform | AIC HA201-TP SBB |
CPU | Dual Intel Xeon E5-26xx v4 processors per motherboard |
Motherboard | Intel Server Board S2600TP |
Internal cache | 64 GB per node |
Chassis | AIC HA201-TP, dual motherboard, dual power supply, 24 hot-swappable HDD bays |
SAS controller (connection via internal backplane) | Broadcom 9300 8-i |
HBA (for cache synchronization) | Dual-port Mellanox ConnectX-3 adapter or newer |
HBA (Lustre network connection) | Mellanox ConnectX-3 or newer |
HDD | 24x NL-SAS 7.2K |
RAIDIX software | v. 4.5 |
Intel Enterprise Edition for Lustre* | v. 2.x / 3.x |
Business results
An integrated solution based on RAIDIX HPC and Intel Enterprise Edition for Lustre* is a reliable building block for HPC infrastructure. The solution meets requirements for high performance, fault tolerance and data integrity, and provides high bandwidth, low latency and high reliability. Benefits of RAIDIX and Lustre include:
- reduced equipment costs;
- reduced interconnect costs;
- flexible configuration and ease of implementation and maintenance;
- fast failover and high data availability.