
Different approaches to balancing traffic

MPLS technologies have become the de facto standard for building carrier networks. Some participants in the early development claim the work was aimed at producing a protocol with a fixed-length header to simplify forwarding decisions; no revolution happened in that respect, however, and once hardware switching chips appeared, the performance problem faded into the background. But as the standard matured, it became clear that using several labels, later called the stack, lets you look at MPLS as a technology with standardized methods of provisioning and delivering services. Labels in the stack are conventionally divided into service and transport labels. For large networks even this turned out to be insufficient, and a transport hierarchy of its own soon appeared. In general, transit routers are not obliged to understand what service they are carrying; their task, in the most general sense, is limited to working with the top label of their level of the hierarchy, and whatever lies deeper in the stack is not their concern. This approach allows transit routers to forward traffic for hundreds of thousands or even millions of different services.

It would seem, what more could you want: the "divide and conquer" rule works flawlessly. The catch is that dividing effectively is not so easy: you want to balance traffic as evenly as possible across a limited number of links, using hardware solutions that change slowly. In this article you will find some aspects of the different methods for solving this problem.

LDP, at first glance, together with the hop-by-hop destination-based routing paradigm, inherits the ECMP capabilities of the IGP, but it also inherits some of the balancing peculiarities of the IP world. IP routers expect to find inside the packet something they are familiar with: network- and transport-layer headers with potentially high entropy. These are easy to parse, and with a large number of flows the distribution looks fairly uniform. The difficulty with applying this rule in the MPLS world is that a transit LSR, looking into the stack, does not always find an IP packet there; under the transport label there is often a service such as VPNv4, Labeled Unicast, VPWS, VPLS or EVPN, frequently with lower entropy. That is why such a strange-sounding idea as having transit routers inspect the entire label stack, and even the packet or frame encapsulated beneath it, became a necessity. If the switching chip can parse the stack to the required depth, fine; if not, no balancing will happen on it, and either the stack (the network design) or the switching chip has to change, since one does not match the capabilities of the other.
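To make the hashing problem concrete, here is a minimal sketch in Python, with made-up label and address values, of how an ECMP decision depends on which fields the transit chip can actually parse: hashing only the top transport label pins every flow of a service to one member link, while hashing the inner IP/TCP fields spreads the flows.

```python
import hashlib

def ecmp_index(fields, n_links):
    # Hash whatever header fields the switching chip managed to parse.
    key = "|".join(str(f) for f in fields).encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_links

# A chip that only sees the top transport label (hypothetical value 24001)
# maps every flow of that LSP to the same member link:
print(ecmp_index([24001], 4))
print(ecmp_index([24001], 4))  # identical result, no spreading

# A chip that digs down to the inner IP/TCP headers spreads the flows:
for flow in [("10.0.0.1", "10.0.1.7", 51034, 443),
             ("10.0.0.2", "10.0.1.9", 40211, 443)]:
    print(ecmp_index([24001, *flow], 4))
```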

The difficulties associated with inspecting the entire stack can be solved in another way: entropy-enhancing labels are added to the stack at the point where the service is created. The Flow Aware Transport and Entropy Label standards are exactly about that, but, as if by a law of conservation of energy, by relaxing the requirements on transit routers these standards tighten the requirements on edge routers. Any switching chip can perform a push operation only a finite number of times per pass. Besides the transport and service labels, one or two entropy labels have to be pushed onto the stack in the same pass, and not every chip can do that in every case. The effectiveness of MPLS traffic balancing in an LDP domain therefore depends on the hardware implementation of the switching chips, on the length of the stack and on its entropy.
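A minimal sketch of the Entropy Label idea from RFC 6790, assuming hypothetical label values: the ingress router derives a 20-bit entropy label from the flow headers it can still see and pushes the ELI/EL pair together with the transport label, so transit LSRs can hash on labels alone.

```python
import hashlib

ELI = 7  # Entropy Label Indicator, a reserved label value (RFC 6790)

def entropy_label(src, dst, proto, sport, dport):
    # Derive a 20-bit label value from the flow keys visible at the ingress.
    key = f"{src}|{dst}|{proto}|{sport}|{dport}".encode()
    return int.from_bytes(hashlib.sha256(key).digest()[:4], "big") % (1 << 20)

def build_stack(transport_label, service_label, flow):
    # Top of stack first: transport label, then ELI/EL, then the service label.
    # The EL carries the entropy, so transit LSRs never need to look deeper.
    el = entropy_label(*flow)
    return [transport_label, ELI, el, service_label]

print(build_stack(24001, 16, ("10.0.0.1", "10.0.1.7", 6, 51034, 443)))
```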
Distributed RSVP-TE solves the balancing problem in a completely different way. Uniform distribution here depends not so much on the capabilities of transit routers as on how well the edge routers solve an optimization problem: each of them computes paths to its neighbors taking into account the current level of link utilization and the characteristics of the traffic. If the computation succeeds, the router informs the routers along the path so that they reserve part of the bandwidth of the corresponding links, and it begins to use the new path. The word "if" is not there by accident: the solution depends heavily on the input data and on the dynamic state of the system. Since each router lays its paths with only its own wishes in mind, global optimization is hard to achieve. A router can influence already established paths only through setup and hold priorities; assigning priorities is not a trivial task, and there are several approaches to it (priority by bandwidth, priority by traffic type, and so on). Since each of these approaches pursues narrow goals and there are no tools to automate the process, classic RSVP-TE is hard to call a simple, universally applicable technology. If there are not many nodes in the network and the bandwidth of some paths is close to the link capacity, it is easy to end up with fragmented utilization. In practice, therefore, paths are usually kept from growing too fat by computing several parallel paths, or the Service Transport Planes approach is used, where several paths between routers carry different types and instances of services. This fills the links more smoothly and increases the probability of successfully setting up a path, but requires a significant amount of manual work. Without in any way belittling the capabilities of RSVP-TE, I still think this approach lacks simplicity.
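For illustration, here is a minimal CSPF sketch under simplified assumptions (a hand-built topology dictionary, only an unreserved-bandwidth constraint, no setup/hold priorities): links that cannot fit the requested bandwidth are pruned, and an ordinary shortest-path run is performed on what remains. This is roughly the local computation each edge router performs.

```python
import heapq

def cspf(graph, src, dst, demand_bw):
    # graph: {node: [(neighbor, te_metric, unreserved_bw), ...]} (toy format)
    dist, prev, pq = {src: 0}, {}, [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue
        for v, metric, bw in graph.get(u, []):
            if bw < demand_bw:          # the constraint: prune congested links
                continue
            nd = d + metric
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(pq, (nd, v))
    if dst not in dist:
        return None                     # no path can carry the reservation
    path, node = [dst], dst
    while node != src:
        node = prev[node]
        path.append(node)
    return list(reversed(path))

topology = {
    "A": [("B", 10, 400), ("C", 10, 900)],
    "B": [("D", 10, 400)],
    "C": [("D", 20, 900)],
}
print(cspf(topology, "A", "D", 500))   # forced onto A-C-D, the wider path
```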

Centralized RSVP-TE is a step into a bright future along the SDN road: instead of local path computation with limited visibility, a smart controller appears that sees the entire MPLS domain at once. The controller can compute optimal paths even if this requires rerouting existing paths, for example regrouping them to free up part of the bandwidth. After the computation, the controller informs the routers of the result over PCEP, and they seamlessly move the network to the new state. The problem of inter-area paths is solved here once and for all: the controller can receive TE information from all network segments via BGP Link-State NLRI. It is pleasant to watch the technology evolve; at the moment there are several implementations of a PCE controller, but choosing router models that support PCE is harder, as vendors are in no hurry to add mass support for this method.
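As a toy illustration of why full-domain visibility helps, the following sketch (hypothetical link names and capacities, a greedy heuristic rather than a real optimizer) shows a controller choosing, for each LSP, the candidate path that keeps the worst link utilization lowest; a real PCE works with far richer constraints and can also reroute existing paths.

```python
def place_lsps(lsps, link_capacity):
    # lsps: [(name, demand, [candidate_path, ...])], paths as lists of links.
    load = {link: 0.0 for link in link_capacity}

    def worst_util(path, demand):
        # Maximum utilization across the domain if this path gets the demand.
        return max(
            (load[l] + (demand if l in path else 0)) / link_capacity[l]
            for l in link_capacity
        )

    placement = {}
    for name, demand, candidates in sorted(lsps, key=lambda x: -x[1]):
        best = min(candidates, key=lambda p: worst_util(p, demand))
        for l in best:
            load[l] += demand
        placement[name] = best
    return placement, load

capacity = {"A-B": 10, "B-D": 10, "A-C": 10, "C-D": 10}
lsps = [("lsp1", 6, [["A-B", "B-D"], ["A-C", "C-D"]]),
        ("lsp2", 6, [["A-B", "B-D"], ["A-C", "C-D"]])]
print(place_lsps(lsps, capacity))   # the two LSPs land on different planes
```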

Centralized SPRING-TE is potentially the most functional, flexible and scalable of the approaches listed above; this solution relies on the controller even more heavily, and in fact all of the thinking happens there. The label stack in this case serves as a list of instructions for transit routers. Having obtained a path or several paths in some way, the ingress device splits traffic flows across different stack contents, while transit routers perform only simple swap / pop operations. Given that filling the stack with a long list of instructions may raise questions for hardware switching chips, the idea of centralized SPRING-TE looks particularly harmonious in a data-center environment with MPLS-capable servers, where the depth of the label stack that can be pushed is not so rigidly limited. At the moment no consensus on bandwidth reservation mechanisms has been reached; whether one will appear in the near future is not yet clear, and work in this direction is only just under way.
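A minimal sketch of the "stack as instructions" idea, with an assumed SRGB base and made-up node indices: the ingress translates a controller-computed node path into a label stack, and splitting flows across paths is simply a matter of pushing different stacks.

```python
# Hypothetical SRGB base and node indices; real networks learn these via IGP/BGP-LS.
SRGB_BASE = 16000
NODE_INDEX = {"P1": 1, "P2": 2, "P3": 3, "PE2": 4}

def segment_list(explicit_path):
    # Translate a node path into the label stack the ingress pushes;
    # transit routers only pop/swap the top label as they go.
    return [SRGB_BASE + NODE_INDEX[node] for node in explicit_path]

# Two flows split over two paths by the ingress (a sketch, not a full SR-TE policy):
print(segment_list(["P1", "P3", "PE2"]))   # [16001, 16003, 16004]
print(segment_list(["P2", "PE2"]))         # [16002, 16004]
```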

Multipath LSP. Some time ago the well-known Kireeti Kompella tried to combine RSVP-TE with ECMP balancing. The result was a rather interesting draft, Multi-path Label Switched Paths Signaled Using RSVP-TE. The essence of the method is neatly captured by Eeyore the donkey's observation about how easily a deflated balloon fits into an empty pot. Instead of signaling one fat path, the ingress router signals several less bandwidth-hungry paths (sub-LSPs). The current link utilization and the topology determine the number of sub-LSPs and the bandwidth reserved for them. Depending on the sub-LSP type, traffic can be balanced evenly across all paths or in proportion to the reserved bandwidth. In the second case there are no serious requirements for the switching chips of transit nodes, since the path choice lies entirely with the ingress router. This approach seems very useful to me, since it can be implemented on a relatively modest transit network without special manual work by engineers or a large controller, so the details of the draft and its implementation will be described in the next article.
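A toy sketch of the sub-LSP idea under assumptions of my own (made-up path names and headroom values): the total demand is split into sub-LSP reservations that fit the available headroom, and the ingress then maps each flow onto a sub-LSP in proportion to the reserved bandwidth, so transit nodes need no special support.

```python
def split_demand(total_bw, path_headroom):
    # Split one "fat" reservation into sub-LSPs, widest candidate path first.
    remaining, sub_lsps = total_bw, {}
    for path, headroom in sorted(path_headroom.items(), key=lambda x: -x[1]):
        if remaining <= 0:
            break
        bw = min(headroom, remaining)
        sub_lsps[path] = bw
        remaining -= bw
    return sub_lsps, remaining

def pick_sub_lsp(flow_hash, sub_lsps):
    # Ingress-only decision: the flow hash falls into a sub-LSP with a
    # probability proportional to that sub-LSP's reserved bandwidth.
    point = flow_hash % sum(sub_lsps.values())
    for path, bw in sub_lsps.items():
        if point < bw:
            return path
        point -= bw

subs, unplaced = split_demand(10, {"via-P1": 6, "via-P2": 3, "via-P3": 4})
print(subs, "unplaced:", unplaced)   # {'via-P1': 6, 'via-P3': 4} unplaced: 0
print(pick_sub_lsp(137, subs))       # 137 % 10 = 7 -> falls into 'via-P3'
```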

Source: https://habr.com/ru/post/325488/

