
HP BURA (HP Backup, Recovery and Archiving) - HP's offering for building data backup and archiving systems



This review describes Hewlett-Packard's approach to building a data management and data protection system.

Backup and Restore


In many organizations, growing data volumes and stricter availability requirements for business information systems raise the bar for the systems that ensure data integrity, data protection and fast recovery after a failure. Because information systems are rarely of equal importance to the business, organizations usually classify the systems to be protected and define requirements for each class according to its business criticality.

The main metrics of data backup systems include:
- RPO (Recovery Point Objective) - the "recovery point": the point in time to which a system's data must be restorable, in other words the maximum acceptable data loss measured in time.
- RTO (Recovery Time Objective) - the time within which the system must be fully restored.
- Backup window - the time period within which the backup of a system must be completed.
- Retention policy - the rules and retention periods for backup copies (daily, weekly, monthly, annual).
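
As a rough illustration of how these metrics can be turned into a per-class check, here is a minimal sketch. The class name `ProtectionClass`, the helper `violations` and all numbers are hypothetical and are not part of any HP product; the RPO check assumes the worst case, where a failure happens just before the next backup starts.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class ProtectionClass:
    """Requirements for one criticality class (all values are illustrative)."""
    name: str
    rpo: timedelta            # maximum tolerated data loss, measured in time
    rto: timedelta            # maximum tolerated time to restore the system
    backup_window: timedelta  # time slot in which a backup must finish


def violations(cls, backup_interval, backup_duration, restore_duration):
    """Return the names of the requirements that a proposed schedule breaks."""
    problems = []
    # Worst case: everything written since the previous backup is lost.
    if backup_interval > cls.rpo:
        problems.append("RPO")
    if restore_duration > cls.rto:
        problems.append("RTO")
    if backup_duration > cls.backup_window:
        problems.append("backup window")
    return problems


critical = ProtectionClass(
    name="business-critical",
    rpo=timedelta(hours=1),
    rto=timedelta(hours=4),
    backup_window=timedelta(hours=2),
)

# A nightly backup is too infrequent for a one-hour RPO.
print(violations(critical,
                 backup_interval=timedelta(hours=24),
                 backup_duration=timedelta(hours=1),
                 restore_duration=timedelta(hours=3)))
```

A schedule that returns an empty list satisfies all three requirements of the class; the retention policy is left out of the sketch because it constrains storage rather than the schedule.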

In organizations with geographically distributed IT infrastructures, these requirements are supplemented by considerations specific to protecting remote offices and branches: local backup retention policy, link bandwidth, latency and so on.

Depending on these requirements and on the amount of protected data, different technologies can be applied. Classic enterprise-level backup equipment, such as disk arrays and tape libraries, already includes technologies that partially solve the problem: hardware snapshots and data clones, multi-stream backup, multiplexing and LAN-free backup.

Often, however, this is not enough: at current data growth rates, protecting data both effectively and economically is not a trivial task. In recent years many backup vendors have begun to use deduplication, a set of techniques that minimize the storage of duplicate information. Duplicate data is a particularly acute problem for backup because organizational regulations very often require keeping anywhere from a few to several dozen copies of the same data.

Deduplication


Deduplication is a technology that allows you to solve several problems related to backup and recovery. It allows you to:
- significantly (up to tens of times) reduce the time to create a full backup of data;
- reduce recovery time from backup;
- significantly reduce the cost of storing backup copies by storing only unique data blocks.

Hewlett-Packard positions its approach to deduplication as unique: HP can combine deduplication capabilities in different parts of the IT infrastructure within one integrated solution and can use different stores for deduplicated blocks (both hardware and software).

How deduplication works


The deduplication process in Hewlett-Packard solutions can be broken down into a series of sequential actions.



Before deduplication, the HP Data Protector Media Agent software component transfers a stream of backup data to a special buffer. The deduplication engine fetches data from this buffer and performs the following actions:
- breaks data into blocks of variable length (the average block length is 4 KB);
- calculates the hash sums of the blocks;
- determines duplicate blocks by comparing their hash sums;
- compresses unique blocks to save storage space;
- sorts the blocks so that they can be written to the block store more efficiently.
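
The steps above can be sketched in a few lines of Python. This is a toy model, not HP's implementation: the source does not describe the chunking algorithm, so a Gear-style rolling hash is swapped in as one common way to get variable-length blocks, and the constants (`MASK`, `MIN_CHUNK`, `MAX_CHUNK`), the in-memory `store` dictionary and the function names are all assumptions. The sorting step is omitted.

```python
import hashlib
import random
import zlib

# Assumed parameters: a boundary is declared when the low 12 bits of the
# rolling hash are zero, which gives blocks on the order of 4 KB.
_rng = random.Random(0)
GEAR = [_rng.getrandbits(32) for _ in range(256)]
MASK = (1 << 12) - 1
MIN_CHUNK, MAX_CHUNK = 1024, 16384


def split_chunks(data: bytes):
    """Yield variable-length blocks whose boundaries depend on content."""
    start, h = 0, 0
    for i, byte in enumerate(data):
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        length = i - start + 1
        if (length >= MIN_CHUNK and (h & MASK) == 0) or length >= MAX_CHUNK:
            yield data[start:i + 1]
            start, h = i + 1, 0
    if start < len(data):
        yield data[start:]


def backup(data: bytes, store: dict) -> list:
    """Store only unseen blocks; return the list of hashes needed to restore."""
    recipe = []
    for chunk in split_chunks(data):
        digest = hashlib.sha256(chunk).hexdigest()   # hash sum of the block
        if digest not in store:                      # duplicate detection
            store[digest] = zlib.compress(chunk)     # compress unique blocks
        recipe.append(digest)
    return recipe


def restore(recipe: list, store: dict) -> bytes:
    """Rebuild the original stream from the stored blocks."""
    return b"".join(zlib.decompress(store[d]) for d in recipe)


store = {}
rng = random.Random(1)
first = bytes(rng.getrandbits(8) for _ in range(200_000))
# The second "backup" is the same data with a few bytes inserted in the middle.
second = first[:100_000] + b"a few inserted bytes" + first[100_000:]

recipe1 = backup(first, store)
blocks_after_first = len(store)
recipe2 = backup(second, store)

assert restore(recipe2, store) == second
print(f"first backup: {len(recipe1)} blocks")
print(f"second backup: {len(recipe2)} blocks, "
      f"{len(store) - blocks_after_first} of them new")
```

Because block boundaries follow the content rather than fixed offsets, an insertion in the middle of the stream normally changes only the blocks around the insertion point, which is why variable-length blocks suit backup streams better than fixed-size ones.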

After that, only unique data is written to the HP StoreOnce block store.

The effectiveness of deduplication is expressed as a ratio: the volume of data submitted for backup (before deduplication) divided by the volume of deduplicated data actually stored. The key factors affecting this ratio are:
- Backup policy. The more full backups and the fewer incremental ones, the higher the deduplication ratio.
- Retention period. The longer backups are kept, the more likely it is that previously copied blocks are already in the store, and the higher the deduplication ratio.
- The percentage of data that changes between backup sessions. The more changes there are, the less likely it is that matching blocks are already in the store, and the lower the deduplication ratio.
- File size. Incremental copies of files whose size is comparable to the deduplication block size (~4 KB) reduce the deduplication ratio for those files.
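
To see how retention and change rate pull the ratio in opposite directions, consider a deliberately simplified model. It is an assumption made for illustration, not an HP formula: every session is a full backup, the first one is stored in full, each later session adds only the changed fraction of the data, and compression is ignored.

```python
def dedup_ratio(full_size_gb: float, sessions: int, change_rate: float) -> float:
    """Data submitted for backup divided by data actually stored.

    Assumed model: `sessions` full backups of `full_size_gb` each, with a
    fraction `change_rate` of the data changing between consecutive sessions.
    """
    submitted = full_size_gb * sessions
    stored = full_size_gb * (1 + (sessions - 1) * change_rate)
    return submitted / stored


# 30 retained full backups, 2% change between sessions
print(round(dedup_ratio(1000, 30, 0.02), 2))   # 18.99
# same retention, 10% change between sessions
print(round(dedup_ratio(1000, 30, 0.10), 2))   # 7.69
# shorter retention (7 backups), 2% change
print(round(dedup_ratio(1000, 7, 0.02), 2))    # 6.25
```

In this model the ratio grows with the number of retained full backups and falls as the change rate rises, which matches the first three factors in the list above.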

HP StoreOnce Federated Deduplication


Data deduplication in Hewlett-Packard solutions can be implemented in the following areas of IT infrastructure:
- on the servers of protected applications. In this case, data leaves the protected servers already deduplicated. This option is ideal for protecting small remote branches with low-bandwidth links. Keep in mind, however, that deduplication is resource-intensive and can significantly increase the load on application servers.



- on the backup server. In this case, clients send data to a dedicated server “as is” (without deduplication); the data is deduplicated on that server and can then be transferred to the central HP StoreOnce storage. This option suits large branches (where there is no need to save client traffic and dedicating a server to deduplication is justified), cases where additional load on the protected server is undesirable, and cases where the deduplication engine does not support the application server's OS version or architecture (32-bit or 64-bit).



- on specialized disk libraries (backup storage devices). In this case, all deduplication components are built into the HP StoreOnce devices, and deduplication uses the HP StoreOnce hardware resources. The advantages of this option are its relative simplicity, fast deployment and minimal changes to the existing IT infrastructure.

Unique blocks can be stored not only on HP StoreOnce hardware disk libraries, but also on HP StoreOnce Virtual Storage Appliance (VSA) virtual devices and in HP Data Protector Software Stores (software-based storage).

To protect distributed infrastructures effectively, Hewlett-Packard solutions allow these types of deduplication to be combined according to the requirements of a particular environment. In addition, for disaster recovery you can set up data replication between multiple HP StoreOnce stores located at different sites, so that in an emergency data can be quickly restored at a backup site. Importantly, only changed data blocks are sent between sites. In all of these scenarios, backup and recovery are managed and monitored from a single HP Data Protector interface.
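
The idea that only changed blocks travel between sites can be illustrated with two block stores keyed by hash. This is a conceptual sketch only: the actual HP StoreOnce replication protocol is not described in this article, and the `put` and `replicate` helpers and the in-memory dictionaries are hypothetical.

```python
import hashlib


def put(store: dict, block: bytes) -> None:
    """Add a block to a store, keyed by its hash."""
    store[hashlib.sha256(block).hexdigest()] = block


def replicate(source: dict, target: dict) -> int:
    """Copy blocks that the target lacks; return the number of bytes sent."""
    missing = source.keys() - target.keys()
    sent = 0
    for digest in missing:
        target[digest] = source[digest]
        sent += len(source[digest])
    return sent


primary, dr_site = {}, {}
for block in (b"alpha", b"beta", b"gamma"):
    put(primary, block)

print(replicate(primary, dr_site))  # 14: the first run sends all three blocks

put(primary, b"delta")              # one new block appears after the next backup
print(replicate(primary, dr_site))  # 5: only the new block crosses the link
```

Comparing hashes first and sending only the missing blocks is what keeps inter-site traffic proportional to the amount of changed data rather than to the size of the backup.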



Archiving


According to a 2013 study by the Enterprise Strategy Group (ESG), one of the most significant trends affecting the architecture and infrastructure of modern IT is the exponential growth in the amount of stored data. Its impact extends to both the storage subsystem and application servers. At the same time, stored data differs in business value: according to statistics, about 70-80% of the volume is outdated, rarely accessed or duplicate information (for example, e-mail messages sent or received months or years ago, old database records, numerous copies of files). SLA requirements for performance, backup frequency, recovery time and so on should therefore differ for different kinds of information.



Nevertheless, all of this information is needed from time to time, for example to build analytical reports or trends, or during audits and investigations of information security incidents.

One of the most effective ways to optimize information storage in medium and large organizations is archiving. Unlike backup, archiving as a rule does not create a copy of production data; it moves the data.

Archived objects are moved to inexpensive storage, where they are indexed so that they can be quickly found and restored when needed. In addition, special “stubs” (links to archived objects) can be left on production servers (such as mail or file servers), so that end users can open the corresponding object from their familiar interface.
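
The move-and-stub mechanism can be modelled with two dictionaries. This is a toy sketch under stated assumptions: the `Stub` class, the `archive_old_items` and `open_item` helpers, the mailbox contents and the 180-day threshold are all invented for illustration and do not reflect the API of any HP archiving product; indexing is left out.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Stub:
    """Placeholder left on the production server in place of an archived item."""
    archive_id: str


def archive_old_items(mailbox, archive, last_access, today, max_age_days):
    """Move items not accessed for max_age_days to the archive, leaving stubs."""
    cutoff = today - timedelta(days=max_age_days)
    moved = []
    for item_id, content in list(mailbox.items()):
        if isinstance(content, Stub):
            continue                          # already archived
        if last_access[item_id] < cutoff:
            archive[item_id] = content        # the item is moved, not copied
            mailbox[item_id] = Stub(item_id)
            moved.append(item_id)
    return moved


def open_item(mailbox, archive, item_id):
    """Return the content, following the stub transparently if there is one."""
    content = mailbox[item_id]
    return archive[content.archive_id] if isinstance(content, Stub) else content


mailbox = {"m1": "Q1 report", "m2": "lunch today?"}
last_access = {"m1": date(2014, 1, 10), "m2": date(2015, 4, 1)}
archive = {}

print(archive_old_items(mailbox, archive, last_access,
                        today=date(2015, 4, 15), max_age_days=180))  # ['m1']
print(open_item(mailbox, archive, "m1"))                             # Q1 report
```

After the run the production mailbox holds only a small stub for the old message, while the user still reaches the full content through the same lookup.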

Archiving also helps optimize backup. Archiving to several repositories at once yields a fault-tolerant archive. Relieving production servers of most of their data improves application performance and significantly shortens backup and recovery times for those servers. This in turn makes flexible backup policies possible for different classes of data: frequently used, critical data is backed up more often using hardware snapshots on disk arrays, while outdated data is backed up less often and in the standard way.



Hewlett-Packard's portfolio includes a wide range of tools for archiving both structured data (databases, structured files) and unstructured data (mail objects, files, MS SharePoint objects, instant messages, etc.). For fast search of archived objects, the HP Intelligent Data Operating Layer (IDOL) analytical engine is used; its scalable architecture supports indexing and analytical search over a virtually unlimited volume of data.

The author of the material is Maxim Lugansky




Source: https://habr.com/ru/post/255843/

