
For a long time, every large network-related project, whether a web project or the data center of a large enterprise, had one and the same structure. It was a characteristic tree architecture; only the size of the tree and the density of its "branches" varied, depending on the reliability and performance requirements. But the digital world does not stand still. It is growing rapidly, not only in volumes and speeds but also in how it is structured. Big Data, clouds and distributed computing mean that the network now has to move huge amounts of data between a large number of end nodes, preferably with minimal latency.
As a result, the traditional tree-like architecture, made up of access, aggregation and core levels, has started to slip and openly falter. It needs to be replaced. But with what?
First, let's try to characterize the network structure in the so-called "traditional" Enterprise-projects:
- from hundreds to several thousand nodes;
- static routing;
- VLAN structure without server virtualization;
- vertically oriented (north-south) architecture;
- 1G-interconnects with 10G uplinks.
And here are the same characteristics for modern data center networks serving Web 2.0-style projects such as clouds, Big Data, distributed computing and similar large modern projects:
- from thousands to millions of nodes;
- dynamic routing;
- cloud structure with virtual servers;
- mostly horizontal (east-west) architecture;
- fast (hours, not weeks) deployment of networks and addition of racks;
- mostly 10G connections with 40G uplinks.
The difference is significant, and it calls for a change in how the network is organized.
Summarized, the requirements for a modern network infrastructure are as follows:
- good performance scalability;
- resilience to failures at all levels;
- high interchangeability to reduce costs;
- predictable latency;
- high availability of equipment;
- ease of maintenance.
What exactly is unsatisfactory about the traditional scheme?
- a sharp drop in performance when something fails at the aggregation level;
- insufficient scalability, limited by the aggregation level:
  - MAC/ARP tables;
  - VLANs;
- congestion of the interconnect points by horizontal traffic;
- a sharp increase in structural complexity as reliability is increased;
- a multitude of protocols in use, many of them proprietary (MLAG, vPC, STP, UDLD, Bridge Assurance, LACP, FHRP, VRRP, HSRP, GLBP, VTP, MVRP ...).
The solution? As soon as we reach the scale at which the cost of operation begins to exceed the cost of the equipment (yes, the CAPEX and OPEX so beloved by accountants and marketers and so disliked by everyone else), a long-known solution enters the scene: Clos networks, also known as the Leaf-Spine architecture.
An important note for the inattentive: the Spine level is by no means the same thing as the aggregation level. There are no horizontal links between Spine switches, and none are supposed to exist. Still less is it assumed that all traffic is collected at this level and sent on toward the core or, say, the Internet.
The architecture itself has been known for half a century and has been used successfully in telephone networks, but only now are all the prerequisites in place for its active adoption in data center networks. On the one hand, equipment has become both powerful and inexpensive while offering extremely low latency (hundreds of nanoseconds is no longer a fantasy but reality). On the other hand, the workloads themselves have changed to the point where a centralized architecture is no longer optimal.
What does Leaf-Spine provide as applied to the structures we are considering?
- the ability to rely on ECMP (_which has been standardized since March 2014 as IEEE 802.1Qbp_) in a uniform IP fabric;
- easier recovery from equipment failures, thanks to the homogeneity of the equipment;
- predictable latency;
- good scalability;
- ease of automated management;
- smaller loss of network bandwidth when equipment fails;
- TOR (Top of Rack) instead of EOR (End of Row). The specifics of TOR and EOR are covered in a rather old but still relevant article.
Want some bonuses? Here you go:
- the design is protected against loops by default and does not need STP for that;
- if a port stops responding, the routing protocol considers the link down and excludes it from route selection, unlike STP.
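The ECMP and failure-handling points above can be illustrated with a minimal Python sketch. It is not how any particular switch chip or routing daemon works: the `Flow` tuple, the SHA-256 hash, the spine names and the addresses are all assumptions made up for this illustration. It shows a leaf spreading flows across equal-cost uplinks by hashing the flow identity, dropping a dead uplink from the candidate set, and how little bandwidth is lost when one of several equal uplinks fails.

```python
import hashlib
from typing import NamedTuple


class Flow(NamedTuple):
    """Hypothetical 5-tuple that identifies a flow (illustration only)."""
    src_ip: str
    dst_ip: str
    proto: int
    src_port: int
    dst_port: int


def pick_uplink(flow: Flow, live_uplinks: list[str]) -> str:
    """Choose one of the equal-cost uplinks by hashing the flow's 5-tuple."""
    if not live_uplinks:
        raise ValueError("no live uplinks left")
    key = "|".join(map(str, flow)).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(live_uplinks)
    return live_uplinks[index]


def remaining_capacity(total_uplinks: int, failed: int) -> float:
    """Fraction of uplink bandwidth left after `failed` equal uplinks go down."""
    return (total_uplinks - failed) / total_uplinks


# Six spines, one uplink from the leaf to each (assumed names).
uplinks = [f"spine{i}" for i in range(1, 7)]
flow = Flow("10.0.1.15", "10.0.7.42", 6, 49152, 443)

# The same flow always hashes to the same uplink, so packets stay in order.
first = pick_uplink(flow, uplinks)
assert first == pick_uplink(flow, uplinks)

# When that uplink goes down it is simply removed from the candidate set,
# and the flow is hashed onto one of the survivors.
survivors = [u for u in uplinks if u != first]
rerouted = pick_uplink(flow, survivors)
print(f"{first} failed, flow moved to {rerouted}")

# One of six equal uplinks lost versus one of two aggregation switches lost.
print(f"leaf-spine: {remaining_capacity(6, 1):.0%} of uplink bandwidth left")
print(f"two-switch aggregation: {remaining_capacity(2, 1):.0%} left")
```

Plain modulo hashing, as used here, can move other flows to different uplinks when the set of live links changes. Real equipment may use more careful schemes, so treat this only as a picture of the idea.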
How far can such networks be expanded? A two-tier network built on common, low-cost switches with forty-eight 10G ports and six 40G uplinks (an oversubscription ratio of about 1.67 with forty servers per rack, that is, 400G of server traffic against 240G of uplinks) lets you connect up to 1920 servers. Adding a third tier raises this figure to 180 thousand. If that is still not enough, more tiers can be added.
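The arithmetic behind these figures can be checked with a short sketch. The function names are hypothetical, and the only inputs are the numbers quoted above. The article does not spell out how the 1920-server total is derived, so the sketch only computes what follows directly from the stated values.

```python
def oversubscription(servers: int, server_gbps: int,
                     uplinks: int, uplink_gbps: int) -> float:
    """Ratio of server-facing bandwidth to uplink bandwidth on one leaf."""
    return (servers * server_gbps) / (uplinks * uplink_gbps)


def racks_needed(total_servers: int, servers_per_rack: int) -> int:
    """Number of leaf switches (one per rack) for a given server count."""
    return -(-total_servers // servers_per_rack)  # ceiling division


# Forty 10G servers per rack, six 40G uplinks per leaf.
ratio = oversubscription(servers=40, server_gbps=10, uplinks=6, uplink_gbps=40)
print(f"oversubscription: {ratio:.2f}:1")

# 1920 servers at forty per rack corresponds to this many leaf switches.
leaves = racks_needed(1920, 40)
print(f"leaf switches for 1920 servers: {leaves}")
```

With forty servers per rack, eight of the forty-eight 10G ports on each leaf stay free. Filling them would raise the oversubscription ratio to 2:1.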
Can and should this architecture be used in much smaller networks? Why not, provided, of course, that your project has no specific L2 requirements. Work out the cost of the classic solution and of Leaf-Spine on bare-metal (BMS) switches. And if the latter turns out to be clearly cheaper for you, that is a good reason to think it over, right? :)
Of course, one more condition has to be met, the one that was fundamental when we talked about the need to change the network concept: traffic should be mostly horizontal, the nodes should be roughly equal in the traffic they consume, and there should be no single dominant direction in which the overwhelming share of the data flows. This does not mean that such a network should have no external connections, but traffic toward them should be comparable to the flows between nodes rather than being the main component.
We, in turn, are ready to offer you everything you need for this.
For example, cost-effective switches without a pre-installed OS (Bare Metal Switch) built on Trident II switching chips, with a price per 10G port of less than $100: the ETegro Eos 420 (48 10G + 6 40G) for the Leaf level and the Eos 520 (32 40G) for the Spine level.
And, if necessary, we can supply them with the Cumulus Linux network OS, whose capabilities we wrote about a little earlier.
Why do we advocate the BMS version of network equipment? Simply because, in our opinion, it alone offers both the flexibility modern projects need, by letting you choose an OS with the required feature set, and a low cost of ownership, by sparing you from paying for vendor-specific features that are sometimes extremely expensive yet entirely unnecessary. Hardly anyone would dispute how convenient it is to buy a server from one manufacturer, install an OS from another and add third-party software on top. In our opinion, it is time to bring this open-systems approach to the world of network equipment as well.
If you would like to "touch" such switches and see what open network operating systems are capable of, write to us: we can arrange testing.