📜 ⬆️ ⬇️

Algorithm of data distribution in a server cluster in dCache

In continuation of the article about dCache I will tell about some details of internal implementation.

One of the important tasks of distributed systems is how to distribute the load among the available nodes. For distributed storage, this task is especially important, since the decision taken at the recording stage affects how the data will be read.


')
Usually, the data is written together, together and will be read. Whether photos from the last vacation or the latest results of a scientific experiment. This fact forces us to scatter incoming data across as many nodes as possible, in order to avoid crowding clients on one of the servers. Easily solvable problem, if all nodes have the same size and the same free disk space. But it is a rarity in real conditions. New and, with large volume, servers are connected when the free space is already on the verge.

In dCache, this is solved with the help of two mechanisms: weighted random distribution taking into account free space (weighted random distribution) during recording and redistribution of data (rebalancing) when adding new nodes.

And the way a weighted arbitrary distribution works, taking into account the free space:

In code, it looks like this:

public Pool selectWritePool(Pool[] pools) { double[] weights = new double[pools.length]; long totalFree = 0; for (Pool pool:pools) { totalFree += pool.getFree(); } int i = 0; for (Pool pool:pools) { weights[i] = (double) pool.getFree() / totalFree; i++; } return pools[i]; } private final Random rand = new Random(); public static int weightetRandom(double[]weights, Random r) { double selection = r.nextDouble(); double total = 0; int i = 0; for (i = 0; (i < weights.length) && (total <= selection); i++) { total += weights[i]; } return i - 1; } 


This mechanism allows the use of all nodes, but those with more free space - more often. Probabilistic sampling ensures that the decision to write to any one specific server will not be accepted for different clients at the same time.

As mentioned above, the internal re-balance command uses the same algorithm to even out the server load. Download is calculated by the ratio of free space to total:
load = FreeN / TotalN

This algorithm has proven itself in combat conditions.

Source: https://habr.com/ru/post/197540/


All Articles