
How to achieve replication with zero RPO over long distances

What is SLD and why is it needed?


One of the most important tasks of an enterprise IT department is protecting data from external threats such as fire, earthquake, flood, and other disasters. Data replication technologies are the traditional answer. However, replication usually synchronizes a data set (with one RPO value or another) between only two data centers. For many customers this is quite enough, but not for all. A customer who requires zero RPO must use synchronous replication, and synchronous replication limits the distance between data centers to about 100 km. If the two data centers are that close to each other, a serious disaster can hit both at the same time, and the data will be lost.


So, if an enterprise needs to provide an extremely high level of data protection, namely:


then for such demanding customers we can offer a special solution: HPE 3PAR Synchronous Long Distance (SLD).
SLD is long-distance replication without data loss. I will explain how it works below.

What types of replication are supported


First, a reminder of which replication modes and topologies the HPE 3PAR StoreServ family of arrays supports.

3PAR StoreServ arrays support 3 modes of replication (Remote Copy):


Synchronous mode, I hope, needs no explanation; for the asynchronous modes, I will briefly describe how they work:


I will add that data consistency is, naturally, maintained during replication in all 3 modes.

As a transport layer for replication, you can use the following 3 options:


And finally, supported replication topologies / configurations:



Fig.1. Many-to-many replication configuration. Each array replicates data to 4 other arrays. All replication directions may be bidirectional. Here we are talking, of course, about replicating different data sets (volumes) on different arrays.


How SLD works


So, SLD is:

  1. Simultaneous replication of a volume group from one array (A) to 2 other arrays (B and C). Replication to one array (B) runs in synchronous mode, and replication to the other array (C) runs in asynchronous periodic mode (see Fig. 2 below). Arrays A and B can therefore be located relatively close to each other: the maximum distance is determined by the maximum round-trip time (RTT) allowed for synchronous replication between two arrays, 10 ms. Array C, by contrast, can be placed at a considerable distance from arrays A and B: the maximum distance is determined by the maximum RTT allowed for asynchronous periodic replication between two arrays, 120 ms.

  2. Providing RPO = 0 on remote array C. Because array C is located too far away for synchronous replication, SLD technology is the only way to switch to remote array C without data loss (whether after a failure of the main array A or during a scheduled switchover).
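To get a feel for what the two RTT limits mean in kilometers, here is a rough back-of-the-envelope sketch. It assumes a signal speed in optical fiber of about 200,000 km/s (an assumption of this example, not a figure from HPE) and counts propagation delay only. Real links add switching and protocol latency and rarely follow a straight line, so practical distances are noticeably shorter than these upper bounds. The function names are hypothetical.

```python
# Assumption: light travels through optical fiber at roughly 200,000 km/s
# (about two thirds of its speed in vacuum). Equipment latency is ignored.
FIBER_SPEED_KM_PER_S = 200_000.0


def max_fiber_distance_km(rtt_ms: float) -> float:
    """Upper bound on one-way fiber length for a given round-trip time."""
    one_way_s = (rtt_ms / 1000.0) / 2.0
    return one_way_s * FIBER_SPEED_KM_PER_S


def propagation_rtt_ms(distance_km: float) -> float:
    """Round-trip propagation delay over a fiber of the given length."""
    return 2.0 * distance_km / FIBER_SPEED_KM_PER_S * 1000.0


if __name__ == "__main__":
    limits = (
        ("synchronous, A to B", 10.0),
        ("asynchronous periodic, A to C", 120.0),
    )
    for label, rtt_ms in limits:
        bound = max_fiber_distance_km(rtt_ms)
        print(f"{label}: RTT {rtt_ms:.0f} ms, at most about {bound:.0f} km of fiber")
```

The same arithmetic shows why synchronous replication is distance-sensitive: every acknowledged write has to wait for at least one round trip to the second array, so each extra 100 km of fiber adds about 1 ms of propagation delay to every write.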


Fig.2. SLD scheme.

SLD works as follows: in normal mode, data is replicated from array A to arrays B and C. Asynchronous periodic replication is also configured between arrays B and C, but it normally stays in the passive state (shown in Fig. 2). If the main array A fails, replication from array B to array C is activated automatically, and the data that was written to array B but not yet to array C is copied to array C. Thus, after the failure of array A, arrays B and C are automatically synchronized up to the last block that was written to array A before it failed.

After arrays B and C are synchronized, data processing can continue, and either array C or array B can be selected as the main array. No data that was written to array A is lost (RPO = 0), and replication runs between arrays B and C, so the data remains protected after the failure of one of the three arrays.
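The failover logic described above can be illustrated with a small toy model. This is only a sketch of the idea, not how 3PAR Remote Copy is implemented: the `Array` and `SldGroup` classes and all of their methods are hypothetical names invented for this example, and each array is reduced to an ordered list of written blocks.

```python
class Array:
    """A toy storage array: just an ordered log of written blocks."""

    def __init__(self, name):
        self.name = name
        self.blocks = []


class SldGroup:
    """Toy model of SLD: A -> B synchronous, A -> C periodic, B -> C standby."""

    def __init__(self):
        self.a = Array("A")
        self.b = Array("B")
        self.c = Array("C")
        self.a_alive = True

    def write(self, block):
        """Host write: acknowledged only once both A and B hold the block."""
        if not self.a_alive:
            raise RuntimeError("primary array A is down")
        self.a.blocks.append(block)
        self.b.blocks.append(block)  # synchronous leg A -> B

    def periodic_sync(self):
        """Asynchronous periodic leg A -> C: ship whatever C is missing."""
        if self.a_alive:
            self.c.blocks.extend(self.a.blocks[len(self.c.blocks):])

    def fail_primary(self):
        """A fails: the standby B -> C link activates and ships the delta."""
        self.a_alive = False
        delta = self.b.blocks[len(self.c.blocks):]
        self.c.blocks.extend(delta)
        return delta


if __name__ == "__main__":
    group = SldGroup()
    for block in ("w1", "w2", "w3"):
        group.write(block)
    group.periodic_sync()        # C now holds w1..w3
    group.write("w4")
    group.write("w5")            # on A and B, not yet on C
    shipped = group.fail_primary()
    print("shipped from B to C:", shipped)
    print("B and C identical:", group.b.blocks == group.c.blocks)
```

The point of the model is that array B always holds every acknowledged write, so after array A is lost, array C only needs the short tail of writes that arrived since the last periodic cycle.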

After array A is restored, the new data that was written to arrays B and C is copied to array A, after which normal operation can resume with array A as the main array.

In conclusion, I want to note two more important points:


Vladimir Korobeynikov, @Vladkor

Source: https://habr.com/ru/post/320366/

