
Backup & Recovery: in-line and smart deduplication, snapshots and secondary storage

Experts believe that any modern backup and recovery solution should offer flexible data protection: coverage for physical and virtual servers, deduplication, and recovery at both the file level and the system-image level.

Backups should also be written to a remote site, and application privilege management and on-demand disaster recovery should be available. Today we look at how modern solutions meet these requirements, which companies and startups operate in this market, and which technologies they use.

Photo: Flickr / Rob Brewer / CC-BY

Rubrik


Rubrik is a converged data management system for hybrid clouds. The platform handles automatic backups, instant recovery, replication to remote sites and data archiving, and it scales out horizontally. Its data management capabilities cover the data life cycle: preparing data for archiving, storing multiple versions, verifying integrity, and performing global deduplication and compression.

But the most important advantage of Rubrik is that it replaces many of the separate components normally needed for backup and recovery (backup server, backup proxy server, replication and deduplication tools, accelerator, disk storage and external storage). For a small environment, one appliance with four nodes is enough, and even as the number of units grows, they can be managed as one system. The startup's founders previously worked at the venture fund Lightspeed, the Silicon Valley advertising firm Rocket Fuel and even Google.

Pure Storage and Cohesity


Another data backup and storage solution comes from a partnership between two companies. Pure Storage, founded in 2009, makes all-flash storage, and its founders are former employees of Veritas Software and Yahoo. Cohesity, a startup founded in 2013, offers a hyperconverged secondary storage platform. Its founder is Mohit Aron, a former Google employee and co-founder of Nutanix, another popular startup in the hyperconverged infrastructure market.

The joint solution lets users keep older file system snapshots in Cohesity's secondary storage, which integrates with AWS, Microsoft Azure and Google Cloud, and keep the most recent snapshots on Pure Storage, which provides more reliable data protection. The system also uses the snapshot API to provide efficient tiered storage across Pure Storage and Cohesity: common policies automate the creation and retention of snapshots on both platforms, and the snapshots are application-consistent, which reduces recovery time.
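The split between recent and older snapshots can be illustrated with a minimal sketch. It does not use the Pure Storage or Cohesity APIs; the function name `place_snapshots` and the 24-hour window are assumptions made purely for illustration.

```python
from datetime import datetime, timedelta


def place_snapshots(snapshot_times, now, primary_window):
    """Split snapshots into (primary, secondary) lists by age.

    Snapshots no older than primary_window stay on primary storage;
    anything older goes to secondary storage. Both lists are newest-first.
    Hypothetical policy, not a vendor API.
    """
    primary, secondary = [], []
    for taken_at in sorted(snapshot_times, reverse=True):
        if now - taken_at <= primary_window:
            primary.append(taken_at)
        else:
            secondary.append(taken_at)
    return primary, secondary


# Four snapshots taken 1, 12, 30 and 72 hours ago (made-up sample data).
now = datetime(2017, 3, 1, 12, 0)
snapshots = [now - timedelta(hours=h) for h in (1, 12, 30, 72)]

# Assumed policy: keep the last 24 hours on the primary array.
primary, secondary = place_snapshots(snapshots, now, timedelta(hours=24))
print(len(primary), "on primary;", len(secondary), "on secondary")
```

A real policy would presumably also cover retention periods and snapshot frequency; the sketch shows only the placement decision.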

ClearSky Data


In addition to deduplication, backup and disaster recovery, the startup ClearSky Data offers tiered storage. The idea is that a cache of the most important (hot) data is kept on the customer's premises. Less important, so-called warm data is kept in a local cloud deployed within 200 kilometers of the customer's premises. Cold data, which is rarely accessed, is stored in the external Amazon S3 cloud.

The startup uses dedicated software and an algorithm that manages data and automatically moves it between storage locations based on usage patterns, policies and customer requirements. For now the company operates in only a few US cities (Boston, Philadelphia and Las Vegas), but it plans to expand.
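As a rough illustration of placement based on usage patterns, here is a minimal sketch that picks a tier from the time since an object was last accessed. ClearSky's actual algorithm is not described in the article; the thresholds (7 and 90 days), the tier names and the sample objects are all assumptions.

```python
def choose_tier(days_since_last_access, hot_days=7, warm_days=90):
    """Pick a storage tier from how recently the data was accessed.

    The thresholds are hypothetical defaults, not ClearSky's real policy.
    """
    if days_since_last_access <= hot_days:
        return "on-premises cache"   # hot data, kept at the customer site
    if days_since_last_access <= warm_days:
        return "metro cloud"         # warm data, kept in the nearby local cloud
    return "Amazon S3"               # cold data, rarely accessed


# Days since last access for a few made-up objects.
objects = {"orders.db": 0, "q4-report.xlsx": 30, "2014-logs.tar": 400}
placement = {name: choose_tier(age) for name, age in objects.items()}

for name, tier in placement.items():
    print(f"{name}: {tier}")
```

In practice the decision would also take policies and customer requirements into account, as the article notes; last-access age is only the simplest usable signal.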

Datrium


The startup Datrium was founded by former employees of EMC, VMware and Data Domain. It offers a converged infrastructure product, DVX, which combines hardware and software to support cloud storage in data centers. DVX uses server flash for persistent storage, concentrating data storage functions on the server itself. Datrium supports both enterprise-class server SSDs and consumer-class flash drives. The DVX flash management software combines RAID-based data protection with data compression on the ESXi host side.

DVX communicates over 10 Gigabit Ethernet (10 GbE) with a NetShelf appliance that serves as secondary storage. Some of the data is cached on the SSD, and some is sent to NetShelf to keep shared storage highly available. Inline deduplication and compression are performed on local flash before data is sent to NetShelf, which performs global deduplication. The product is meant to replace older mid-range storage arrays, which have almost disappeared from the market.
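The two-stage scheme (local deduplication and compression on the host, then global deduplication on the shared appliance) can be sketched as follows. This is an assumption-laden toy model, not Datrium's implementation: the classes `Host` and `SharedStore`, the 4 KiB chunk size and the use of SHA-256 and zlib are all chosen for illustration.

```python
import hashlib
import zlib


class SharedStore:
    """Stand-in for the shared appliance: deduplicates across all hosts."""

    def __init__(self):
        self.chunks = {}  # digest -> compressed chunk

    def put(self, digest, compressed):
        if digest not in self.chunks:  # global deduplication
            self.chunks[digest] = compressed


class Host:
    """Stand-in for a server that deduplicates and compresses locally."""

    def __init__(self, store, chunk_size=4096):
        self.store = store
        self.chunk_size = chunk_size
        self.seen = set()  # digests this host has already sent

    def write(self, data):
        """Send only chunks this host has not seen; return how many were sent."""
        sent = 0
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest in self.seen:  # local (inline) deduplication
                continue
            self.seen.add(digest)
            self.store.put(digest, zlib.compress(chunk))
            sent += 1
        return sent


store = SharedStore()
host_a, host_b = Host(store), Host(store)
data = b"A" * 4096 + b"B" * 4096 + b"A" * 4096  # third chunk repeats the first

sent_a = host_a.write(data)
sent_b = host_b.write(data)
print(sent_a, sent_b, len(store.chunks))
```

Each host skips the chunk it has already seen, but the two hosts know nothing about each other, so only the shared store can remove the duplicates between them.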

Symantec NetBackup


Traditional approaches to deduplication are quite resource-intensive because they have no clear picture of the data stream: to identify duplicates, they look for file boundaries by scanning the entire stream byte by byte. In other words, such methods try to guess the optimal deduplication algorithm and compute the size of a moving target block, at a heavy cost in resources. As a result, conventional deduplication is no longer sufficient.

Symantec NetBackup offers V-Ray technology and smart deduplication, which precisely identify data formats and file boundaries, so the optimal deduplication algorithm is chosen automatically. Redundant data can be removed closer to the source, which maximizes the benefits of deduplication by reducing CPU and memory usage compared to traditional backups, or closer to the target, so that no separate server or storage space has to be allocated.
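Why knowing file boundaries helps can be shown with a small sketch. It is not the V-Ray implementation: it simply compares blind fixed-size chunking of a backup stream with chunking along known file boundaries, using made-up files, a hypothetical helper `shared_chunks` and an assumed 512-byte block size.

```python
import hashlib


def fixed_chunks(stream, size):
    """Cut a byte stream into fixed-size blocks, ignoring file boundaries."""
    return [stream[i:i + size] for i in range(0, len(stream), size)]


def digests(chunks):
    """Return the set of SHA-256 digests for a list of chunks."""
    return {hashlib.sha256(chunk).hexdigest() for chunk in chunks}


def shared_chunks(files_old, files_new, block_size=512):
    """Count chunks common to two backups under each chunking strategy."""
    stream_old = b"".join(files_old)
    stream_new = b"".join(files_new)
    fixed = digests(fixed_chunks(stream_old, block_size)) & digests(
        fixed_chunks(stream_new, block_size)
    )
    aware = digests(files_old) & digests(files_new)  # one chunk per file
    return len(fixed), len(aware)


# Second backup: the same three files plus one new 1-byte file in front.
files_v1 = [b"alpha" * 100, b"bravo" * 100, b"charlie" * 100]
files_v2 = [b"X"] + files_v1

fixed_shared, aware_shared = shared_chunks(files_v1, files_v2)
print("fixed-size blocks shared:", fixed_shared)
print("file-aligned chunks shared:", aware_shared)
```

The single extra byte shifts every fixed-size block, so block hashes stop matching even though the three files are unchanged, while chunks aligned to file boundaries still deduplicate.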

Datto and Open Mesh


Datto has been on the market for 10 years. The company develops its own software and hardware, which take snapshots of a client's entire IT system every five minutes. Its instant virtualization technology provides disaster recovery in a few seconds. At the end of January this year, the company acquired Open Mesh, a provider of Wi-Fi networks managed through the CloudTrax cloud platform.

This acquisition enabled Datto to launch the Datto Networking service, which offers full network deployment in minutes, network continuity, scalability and redundancy, Wi-Fi connectivity and 4G LTE failover. Intelligent access points form a self-organizing, self-healing, encrypted mesh network, and companies can use such a network to connect their systems to the Datto Networking Appliance. The company's founder is confident that Datto will be able to compete with the largest vendors in this area: Symantec, HP and EMC.

Primary Data


The Israeli-American company Primary Data is notable primarily because Steve Wozniak is on its team. Its DataSphere platform virtualizes data through a global dataspace, so management spans the local storage layer as well as public and private clouds. Customers can create a comprehensive policy that governs archiving and disaster recovery and determines where data is placed: locally or in cloud storage.

DataSphere's smart data management capabilities provide snapshot-based data protection, including cloud archiving of snapshots and stale data. The platform reduces total cost of ownership because customers do not need to buy separate solutions for archiving, data migration and disaster recovery. Primary Data can also integrate EMC storage systems into a single environment, scaling EMC Data Lake solutions across different EMC storage platforms.


Source: https://habr.com/ru/post/322764/

