Cisco switch stacking. Part 1

In this article (it will consist of two parts) I would like to briefly go through the main stacking technologies of Cisco switches. We will try to understand the overall packet forwarding architecture within each type of stack, the response to failures, and the throughput figures. In the first part, we look at the StackWise and StackWise Plus technologies. In the second part: StackWise-160, StackWise-480, FlexStack and FlexStack Plus.

Nowadays stacking functionality will not surprise anyone. It is available in many switch models from different manufacturers, including Cisco. But it was not always so. At the dawn of my career in networking (somewhere in the mid-2000s), Cisco had only one switch with support for a full-fledged stack: the Cisco 3750. Pseudo-stacks based on the 2950 and 3550 had practically died out by then. As a young specialist, I was very surprised that Cisco paid so little attention to switch stacking. Meanwhile 3Com (later acquired by HP), whose switches were quite popular at the time, supported stacking on a fairly long list of models. Allied Telesis was also doing well in this respect. I even remember devotees of Cisco products explaining to me that stacking is bad and that this technology should not be used in production. Unfortunately, I do not remember the exact wording, but it was something about operational stability. It is worth noting that at that time the main argument in favor of stacking was simpler management (at least, that is how it seemed to me then). That is, instead of configuring two or more devices separately, the stack gives us one large switch.

Time went by. Many realized the advantages of stacking, and now most Cisco switches support this technology. Today, when talking about stacking, it is worth distinguishing between a stack at the access layer (where regular users are connected) and a stack in all other cases.
In the first case, the main reason for combining switches into a stack is to simplify administration. At some point it even began to seem to me that this was no longer relevant and was more of a marketing point. But not so long ago, talking to a customer with a large fleet of network devices, I found out that this was exactly their main reason for combining switches into a stack at the access layer.

In all other cases, in my opinion, the main argument in favor of the stack is the ability to build a relatively inexpensive fault-tolerant network design (both at the network core and when connecting server hardware). The stack allows us to aggregate physical links terminated on different switches into one logical link. This gives us not only greater bandwidth (several links are used simultaneously) and fault tolerance (the failure of one stack member does not bring the network down), but in some cases also makes it possible to get rid of loops entirely, and therefore of the STP family of protocols. That is, it simplifies life by making the network topology quite simple.

Cisco equipment uses several stacking technologies, depending on the platform. A small note: we will consider only classical stacking schemes. VSS technology is out of scope.

Technology	Platform	Max switches in stack	Total stack bus bandwidth	Stack kit required
StackWise	3750, 3750G	9	32 Gbps	No
StackWise Plus	3750-E, 3750-X	9	64 Gbps	No
StackWise-160	3650	9	160 Gbps	Yes
StackWise-480	3850	9	480 Gbps	No
FlexStack	2960-S, 2960-SF	4	40 Gbps	Yes
FlexStack Plus	2960-X, 2960-XR	8	80 Gbps	Yes

I suggest we look more closely at the total stack bus bandwidth figures, as well as the general architecture of packet forwarding within each type of stack. To clarify: by stack bus we mean the internal interfaces and ports that provide stacking. Its performance is the total effective throughput of all stack ports. Why am I not talking about the overall performance of the stack? Because in most technologies, when a packet is switched between ports of the same switch, only internal logic (switch fabric, ASICs, etc.) is used, and the packet does not reach the stack bus. The stack bus is used only when a packet enters through a port of one switch and leaves through a port of another switch in the stack.

StackWise

Consider the StackWise technology. It is the oldest of them all. To connect switches into a stack with StackWise, a special stack cable is used. There is no separate stack module: the stack ports are built into the switch (two ports per switch).

The bandwidth of a stack cable is 16 Gbps (in each direction). Since each switch has two stack ports, the throughput of the stack bus should be:

16 Gbps * 2 (directions) * 2 (number of ports) = 64 Gbps

We look in the specification, and it says 32 Gbps. Where did half of the bandwidth go?

The 3750 (3750v2) and 3750G switches have no dedicated internal switch fabric as such (the old shared-ring switch fabric architecture is used). The stack ports are connected directly to the switch's internal bus, becoming its continuation. Thus, the switches of one stack share one big bus in the form of a ring. At the logical level, this bus consists of two paths, each forming a ring.

The capacity of each path is 16 Gbps. The paths run in opposite directions: packets are transmitted along them counter to each other. Since there is one common bus for the whole stack, a packet arriving at a port of any stack member passes not only through all the internal ASICs but also around the whole stack ring, even if the outgoing port is on the same switch as the incoming one. The packet is removed from the bus only when it has completed the full circle and returned. This lets the ASIC that "captured" one of the paths learn that the packet has been delivered and the path can be freed. This algorithm can be called "removal by the sender" (in Cisco terms, Source stripped). The path used to send a packet is chosen based on availability (a token mechanism is used: the ASIC that holds the token transmits data).

Let's look at an example (Fig. 2). A packet arriving at a switch port (1) reaches the ASIC, which selects the blue path (2) (say, it was free at that moment). The packet then travels along the blue path through all the switches (3), eventually reaching the switch where the destination port is located (4). That switch sends a copy of the packet (5) out of its local port. But the packet itself continues its journey around the stack ring (6) until it reaches the ASIC that originally sent it (7). Only there is it removed from the stack bus.
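The walk-through above can be traced with a small Python sketch. It is only a toy model under stated assumptions, not Cisco's implementation: switches are numbered 0..N-1 (a hypothetical numbering), the chosen path runs toward increasing numbers, and the function name `trace_source_stripped` is made up for illustration.

```python
from typing import List, Tuple


def trace_source_stripped(stack_size: int, src: int, dst: int) -> List[Tuple[int, str]]:
    """Trace one packet on a single ring path with "removal by the sender".

    Returns (switch, action) pairs in visiting order. Actions:
      "pass"  - the switch only forwards the packet along the ring
      "copy"  - the destination switch copies the packet to its local port
      "strip" - the packet is back at the sender and leaves the ring
    If dst == src, the local delivery happens on the source switch itself
    and is not shown as a separate event (assumption of this toy model).
    """
    if stack_size < 1:
        raise ValueError("stack_size must be at least 1")
    if not (0 <= src < stack_size and 0 <= dst < stack_size):
        raise ValueError("src and dst must be valid stack members")

    events = []
    for step in range(1, stack_size + 1):
        switch = (src + step) % stack_size
        if switch == src:
            action = "strip"   # full circle completed
        elif switch == dst:
            action = "copy"    # delivered, but the packet keeps travelling
        else:
            action = "pass"
        events.append((switch, action))
    return events


if __name__ == "__main__":
    for switch, action in trace_source_stripped(stack_size=4, src=0, dst=2):
        print(switch, action)
```

In this model the packet always occupies the path for as many hops as there are switches in the stack, regardless of where the destination port is.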

Thus, the same packet passes through the stack ports of a switch twice (first through one port (3), then through the other (6)). So the total usable bandwidth of the stack bus is 32 Gbps (exactly half the physical bandwidth).
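The arithmetic behind the two figures can be written out as a short Python sketch. The function and parameter names are assumptions chosen for illustration; the numbers come from the text above.

```python
def raw_stack_bandwidth_gbps(cable_rate_gbps: float = 16,
                             directions: int = 2,
                             stack_ports: int = 2) -> float:
    """Physical bandwidth: cable rate * directions * number of stack ports."""
    return cable_rate_gbps * directions * stack_ports


def usable_bandwidth_gbps(raw_gbps: float, port_crossings_per_packet: int) -> float:
    """Usable bandwidth when every packet crosses the stack ports of a
    switch `port_crossings_per_packet` times."""
    if port_crossings_per_packet < 1:
        raise ValueError("a packet must cross a stack port at least once")
    return raw_gbps / port_crossings_per_packet


if __name__ == "__main__":
    raw = raw_stack_bandwidth_gbps()              # 16 * 2 * 2
    print(raw)                                    # 64
    print(usable_bandwidth_gbps(raw, 2))          # 32.0
```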

And what happens if one of the stack members fails? In this case, the two paths are looped back onto each other, forming one large ring (Fig. 3). The switches behave in exactly the same way if one of the stack cables is disconnected.
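One way to picture the "one large ring" is the following Python sketch. It rests on my own assumption about the topology: the surviving switches form a chain numbered 0..N-1, the first path carries traffic from 0 up to N-1, and the second path carries it back. The function name is hypothetical.

```python
from typing import List


def wrapped_ring_order(chain_size: int) -> List[int]:
    """Visiting order on the single ring formed after a break.

    The outbound leg uses the first path (0 -> N-1). At the far end the
    path is looped back, and the return leg uses the second path
    (N-2 -> 1) before the ring closes again at switch 0.
    """
    if chain_size < 1:
        raise ValueError("chain_size must be at least 1")
    outbound = list(range(chain_size))             # 0, 1, ..., N-1
    inbound = list(range(chain_size - 2, 0, -1))   # N-2, ..., 1
    return outbound + inbound


if __name__ == "__main__":
    print(wrapped_ring_order(4))   # [0, 1, 2, 3, 2, 1]
```

In this model every switch in the middle of the chain appears on the ring twice, once for each path.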

Two more points are worth noting. First, the two paths "spin" in opposite directions. I assume this is done to even out the packet transfer delay inside the stack. Second, for StackWise the throughput of the stack bus equals the overall performance of the stack, because all switches in the stack use one common bus.

StackWise Plus

Let's turn to StackWise Plus. The 3750-E and 3750-X switches add a dedicated switch fabric. This allows packets to be switched locally without entering the stack ring. The stack ports are connected directly to the switch fabric, which is now responsible for the logic of working with the stack bus. With StackWise, by contrast, each ASIC worked with the stack bus on its own.

StackWise Plus uses a new algorithm for handling packets in the stack: "removal by the recipient" (in Cisco terms, Destination stripped; another name is spatial reuse). With this algorithm, the packet is removed from the stack bus as soon as it reaches the switch that holds the outgoing port (Fig. 4). A small Ack packet (8 bits) is now used to signal that the path can be freed.
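The difference between the two removal strategies can be sketched in Python by counting how many ring hops a known unicast packet occupies. This is a simplified model with assumed names: switches are numbered 0..N-1, the path runs toward increasing numbers, and the Ack packet is ignored.

```python
def source_stripped_hops(stack_size: int) -> int:
    """Removal by the sender: the packet always travels the full ring."""
    if stack_size < 1:
        raise ValueError("stack_size must be at least 1")
    return stack_size


def destination_stripped_hops(stack_size: int, src: int, dst: int) -> int:
    """Removal by the recipient: the packet leaves the ring at the
    switch that holds the outgoing port."""
    if not (0 <= src < stack_size and 0 <= dst < stack_size):
        raise ValueError("src and dst must be valid stack members")
    if src == dst:
        return 0  # switched locally by the switch fabric, ring not used
    return (dst - src) % stack_size


if __name__ == "__main__":
    print(source_stripped_hops(4))                # 4
    print(destination_stripped_hops(4, 0, 2))     # 2
```

The hops that the packet no longer occupies are free for other traffic, which is where the name spatial reuse comes from.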

As in StackWise, logically we still have two paths. But since the switch fabric is now responsible for working with the stack ring, the mechanism for using these paths has become more complex. As before, access to a path is granted by the token mechanism. After receiving the token, the switch fabric can transmit packets along the stack ring. And since the packets come from the individual ASICs, a credit mechanism governs how each ASIC is served. The credits are distributed by the switch fabric.

These innovations made it possible to raise the capacity of the stack bus to the advertised 64 Gbps by making the usable capacity equal to the physical one: a packet now passes through a stack port of a switch only once. Note that both technologies (StackWise and StackWise Plus) use the same type of stack cable.

It is worth emphasizing that the stack bus bandwidth did not actually become 64 Gbps; it only approaches this figure. Why? Because all broadcast, multicast and unknown unicast traffic is still processed using the Source stripped algorithm. That is, these types of traffic travel around the entire ring before being removed from the stack bus, so they consume twice the bandwidth.
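To get a feel for how far the real figure can fall below 64 Gbps, here is a rough Python estimate. The model is my own simplification, not a Cisco formula: known unicast costs one unit of bus bandwidth per bit, while broadcast, multicast and unknown unicast cost two units.

```python
def effective_bandwidth_gbps(flooded_fraction: float,
                             physical_gbps: float = 64.0) -> float:
    """Estimate usable stack bus bandwidth for a given traffic mix.

    flooded_fraction is the share (0.0..1.0) of broadcast, multicast and
    unknown unicast traffic, which is assumed to cost double bandwidth.
    """
    if not 0.0 <= flooded_fraction <= 1.0:
        raise ValueError("flooded_fraction must be between 0 and 1")
    cost_per_bit = (1.0 - flooded_fraction) * 1 + flooded_fraction * 2
    return physical_gbps / cost_per_bit


if __name__ == "__main__":
    for share in (0.0, 0.25, 1.0):
        print(share, effective_bandwidth_gbps(share))
```

With no flooded traffic the estimate equals the physical 64 Gbps, and with only flooded traffic it drops to the 32 Gbps of the original StackWise.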

Any 3750 series switches can be combined in one stack. If you add, for example, 3750v2 switches (StackWise) and 3750-X switches (StackWise Plus) to a single stack, the stack will operate in StackWise mode (Source stripped algorithm). In this case, on the 3750-X, packets between local ports are still switched inside the switch without appearing on the stack bus. On the 3750v2, packets between local ports pass through the stack bus as before.
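The mixed-stack rule can be summarized in a few lines of Python. The model names follow the platform table earlier in the article; the function names and the way models are written as strings are assumptions made for this sketch.

```python
from typing import Iterable

# Platforms per stacking technology, as listed in the article.
STACKWISE_MODELS = {"3750", "3750v2", "3750G"}
STACKWISE_PLUS_MODELS = {"3750-E", "3750-X"}


def stack_mode(members: Iterable[str]) -> str:
    """The stack runs StackWise Plus only if every member supports it."""
    members = list(members)
    unknown = [m for m in members
               if m not in STACKWISE_MODELS | STACKWISE_PLUS_MODELS]
    if not members or unknown:
        raise ValueError(f"unsupported or empty member list: {unknown}")
    if all(m in STACKWISE_PLUS_MODELS for m in members):
        return "StackWise Plus"
    return "StackWise"


def local_traffic_uses_stack_bus(model: str) -> bool:
    """Only switches without a dedicated switch fabric send locally
    switched packets around the stack ring."""
    return model in STACKWISE_MODELS


if __name__ == "__main__":
    print(stack_mode(["3750-X", "3750v2"]))        # StackWise
    print(local_traffic_uses_stack_bus("3750-X"))  # False
```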

Let's briefly touch on how the stack operates at the software level. In a StackWise or StackWise Plus stack, one of the switches is elected master. It performs the control-plane functions for the entire stack. If it fails, unicast traffic continues to be forwarded. This is achieved by synchronizing the hardware tables: the MAC table and the Cisco Express Forwarding (CEF) tables, namely the FIB and the adjacency table, are synchronized between the stack members. The remaining tables, including the routing table and the multicast forwarding tables, are rebuilt on the new master. That is, the control plane on the new master starts from scratch. Here the NSF (Nonstop Forwarding) feature can be used.

I suggest we pause here. The continuation will appear in the coming days.

Source: https://habr.com/ru/post/269529/
