📜 ⬆️ ⬇️

Getting to know the cloud: how dynamic traffic distribution works

In one of our past materials, we talked about static load balancing methods in the cloud of an IaaS provider. Today, dynamic methods are next in line: “bee” and “ant” algorithms, as well as the Biased Random Sampling approach.


/ Flickr / Quinn Dombrowski / CC BY-SA

Dynamic methods of balancing, in contrast to static ones, in their work take into account the current state of the entire system and react to changes in it. Often, information on the workload of nodes is stored in a state table, from which load distribution systems derive information.
')

Dynamic load sharing - monitoring in progress

Biased random sampling


To implement this approach, the network is represented as a virtual directed graph, in which all servers are vertices. When a request is received to perform any task, the load balancer assigns it to that node (node) that has a half degree of approach equal to at least one.

When a node receives a task, the processor begins its execution and in parallel signals a decrease in the number of available computing resources, decrementing the semi-degree of approach. When the problem is "solved", the node increments this indicator, reporting the release of resources.

The choice of the initial vertex for the task is done randomly (hence the random sampling); Subsequent tasks are assigned to neighboring nodes, which are also randomly selected. This method of load balancing is completely decentralized and therefore suitable for use in cloud networks. Including geographically distributed.

Scientists from Liverpool found that Biased Random Sampling, additionally taking into account the geographical distribution of nodes in the network, reduces latency in data transfer by 22%. In the test with 512 nodes located within a thousand km radius, the average delay was approximately 70 ms (in the experiment with a network that does not take into account the geographical distribution of nodes, this figure was 92.5 ms).

Ant algorithm optimization (Ant colony optimization)


This concept was first introduced in the early nineties. It draws inspiration from the behavior of ants. An ant is always able to find a way from a food source to an anthill, even if the usual path was “blocked”. For this, these insects mark the route with special pheromones. It is believed that the stronger this "smell", the closer is the source of food.

In the context of load balancing in telecommunications networks, this is as follows. The network is represented as a graph, and from all possible nodes, the main one is selected, which has the greatest number of neighbors. Switching stations are mapped to nodes in the graph, and communication lines between them to edges.

Each node contains a “pheromone table” in which data is collected on the resources used and the available capacities: the amount of memory, the number of processors, etc. Periodically, ants are run from each node, which are sent to random receiver nodes. "Insects" are moved between the nodes, guided by the pheromone table. This table is updated every time an ant calls to it.

Ants have an age equal to the length of the path. Ants also linger in nodes overflowing with calls. Those insects that choose a shorter and less busy path, affect the probability of choosing this route by subsequent ants to a greater extent than ants, choosing the worst in length and loading path.

This is due to the fact that the first ants earlier fall into the receiving node and have a smaller age. New requests are sent on the shortest unloaded routes. This allows you to balance resources by offloading nodes in "dense" directions in the network.

One of the variations of this algorithm is used by the Anthill framework for P2P applications. This system is a self-organizing network of connected "anthill chambers" - nodes capable of performing calculations and processing data.


The structure of the network-anthill

When a node receives a request from an application, it launches the autonomous agent, the ant, which must complete the task. It moves across the network from node to node until it completes the request. When traveling, the ants "carry" with them information about the request, the result and other metadata.

Ants do not communicate directly with each other; instead, they leave the necessary information for the task solution to resource managers located in the nodes they visit. For example, an ant that implements a search service might leave routing information in this way that would help other ants pave the way for nodes containing the data they need. This form of indirect communication is also used by real ants - this is called stigmergy .

"Bee" optimization algorithm (honeybee foraging)


The first articles describing the bee swarm method were published in 2005. It is based on modeling the behavior of bees when searching for nectar. In the hives there are so-called scout bees who are looking for meadows with flowers. When they find a food source, they return to the hive to tell others about it with the help of a special "waggle dance."


/ Flickr / Irregular Blog - Daedalus / CC BY-SA

With this dance, the bee tells how good nectar is in the meadow and how far it is located. After this, forager bees respond to her call, who fly after a scout, collect "harvest" and return to the hive with "prey". Then the forager bee can take one of the following actions: become an unoccupied forager, leaving the current source of nectar, continue to collect it on its own or call a few other unoccupied bees to help.

This model is used for load balancing as follows. First of all , the current load on virtual machines is calculated and their state is determined (balanced, unloaded, overloaded) depending on the threshold values ​​they set. If the load on the VM is small, then all new tasks will be directed to it.

If there are overloaded virtual machines in the system, each of them can assume the role of a scout or a forager. Processing the request, the virtual server calculates a certain value, which is analogous to the "quality" of the field with flowers, which demonstrates a bee dance. One option for estimating this value may be the time it takes the CPU to execute it. Then the server “advertises” this task to the whole system, as if by posting it on a bulletin board - in this way the server assumes the role of a scout. Other servers can view this bulletin board and process requests made there, becoming foragers.

Other materials from corporate blog 1cloud:

Source: https://habr.com/ru/post/353528/


All Articles