
Why doesn’t a standard vSwitch need a Spanning Tree protocol?

Today I would like to digress a bit from the vSphere 5 fever and revisit the basics of the standard vSwitch, in particular how it manages without the Spanning Tree Protocol.



I assume that you already have basic knowledge of switching and know what a VLAN is, what a switching loop and the Spanning Tree Protocol are, and are familiar with some link aggregation protocols. I will try to briefly go over the main features of the standard vSwitch, focusing on facts that seemed interesting to me or that were not very obvious, at least to me, from the official documentation. This also means that what follows is somewhat unstructured.



The main purpose of a standard vSwitch (vNetwork Standard Switch, aka vSS) is to provide communication between virtual machines and the physical network infrastructure. In addition, it provides logical separation of virtual machines using Port Groups, offers various load balancing algorithms in case you have more than one uplink on an ESXi host, supports shaping of outgoing traffic from virtual machines to the vSS, and, finally, allows you to detect an uplink failure and automatically switch traffic to the remaining uplinks.
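To make this a bit more concrete, here is a minimal pyVmomi sketch of creating such a vSwitch with two uplinks. This is just an illustration under my own assumptions: the host name, credentials and the vmnic0/vmnic1 NIC names are placeholders, and pyVmomi is simply the API binding I chose for the examples in this post.

    # Minimal sketch: connect to an ESXi host and create a standard vSwitch
    # with two physical uplinks. Names and credentials are placeholders.
    import ssl
    from pyVim.connect import SmartConnect, Disconnect
    from pyVmomi import vim

    ctx = ssl._create_unverified_context()          # lab only: skip certificate checks
    si = SmartConnect(host='esxi01.example.com',
                      user='root', pwd='secret', sslContext=ctx)
    content = si.RetrieveContent()
    host = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem], True).view[0]
    netsys = host.configManager.networkSystem       # HostNetworkSystem of this host

    vss_spec = vim.host.VirtualSwitch.Specification(
        numPorts=128,
        bridge=vim.host.VirtualSwitch.BondBridge(nicDevice=['vmnic0', 'vmnic1']))
    netsys.AddVirtualSwitch(vswitchName='vSwitch1', spec=vss_spec)

    Disconnect(si)

The netsys handle obtained here is reused in the later snippets.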






So what are the main differences from a physical switch?



Unlike physical switches, a vSS does not need to learn the MAC addresses of all devices in the same broadcast domain. Since the MAC addresses of virtual machines are assigned by ESXi, the vSS already knows them by default. Another distinctive feature is that a vSS clearly divides its ports into two types, internal ports and uplinks, and applies different switching rules to each.



Port Group and VLAN



A Port Group defines a configuration template (for example, VLAN number, shaping, traffic balancing) for internal ports. When you connect a virtual machine to a vSS, you simply specify which Port Group to use for it and thereby apply the pre-configured parameters. For example, you can specify that virtual machines in a certain Port Group should use only a specific interface as their active uplink, with the remaining interfaces kept as standby. Another good example is when you want to configure an MS Network Load Balancing cluster on virtual machines in unicast mode. In this case you will need to create a separate Port Group for these virtual machines and disable the Notify Switches option in it.
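As a rough sketch, this is how such a Port Group could be created with pyVmomi, combining both examples above: an explicit active/standby uplink order plus Notify Switches disabled for the unicast NLB case. The Port Group name, VLAN ID and NIC names are made up, and netsys is the handle from the first snippet.

    # Sketch: Port Group with an explicit active/standby uplink order and
    # Notify Switches disabled (for MS NLB in unicast mode).
    # Names, VLAN ID and NICs are hypothetical; 'netsys' comes from the first snippet.
    nlb_policy = vim.host.NetworkPolicy(
        nicTeaming=vim.host.NetworkPolicy.NicTeamingPolicy(
            policy='failover_explicit',               # use the explicit NIC order below
            notifySwitches=False,                     # do not notify physical switches on failover
            nicOrder=vim.host.NetworkPolicy.NicOrderPolicy(
                activeNic=['vmnic0'], standbyNic=['vmnic1'])))

    netsys.AddPortGroup(portgrp=vim.host.PortGroup.Specification(
        name='NLB-Unicast', vlanId=10,
        vswitchName='vSwitch1', policy=nlb_policy))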



Very often a Port Group is compared to a VLAN on physical switches, although in fact this comparison is simply wrong. I think it happens because in most cases vSphere admins use Port Groups to place their virtual machines into different VLANs. However, there is not always a direct correspondence between them. The previous MS Network Load Balancing example is an excellent proof of this: in our vSphere there are two Port Groups, one for the MS Network Load Balancing servers and a second for the other servers, yet the virtual machines of both Port Groups belong to the same VLAN.

Another difference between a Port Group and a VLAN follows from this. Computers in different VLANs cannot communicate directly: they need either an L3 device to route the traffic or bridging enabled between the VLANs. Between Port Groups in the same VLAN, however, unicast traffic flows freely.



The big surprise for me was discovering VLAN 4095 just yesterday, or rather the ability to listen to absolutely all traffic on a vSS with it. In the official VMware documentation VLAN 4095 is usually described as VGT, Virtual Guest Tagging: with it we can pass VLAN tags through to the virtual machine itself and let it decide what to do with that traffic. In practice this is not used very often; what is usually used is VST (Virtual Switch Tagging), where the vSS strips the VLAN tags and delivers the traffic to the appropriate Port Group. We do not use VGT at all, so I had only skimmed over it. It turns out that if you create a Port Group, assign it VLAN 4095, enable Promiscuous mode in this Port Group and then place a virtual machine in it, you can fire up Wireshark and calmly collect and analyze traffic from all the machines connected to this vSS. Another useful practical application is to place a virtual IDS in such a Port Group for traffic inspection.
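A sketch of such a "sniffer" Port Group in pyVmomi; again, the names are invented and netsys comes from the first snippet.

    # Sketch: "sniffer" Port Group, VLAN 4095 (all VLANs, VGT) plus Promiscuous mode,
    # so a VM placed here sees traffic from every machine on this vSS.
    # Names are hypothetical; 'netsys' comes from the first snippet.
    sniff_policy = vim.host.NetworkPolicy(
        security=vim.host.NetworkPolicy.SecurityPolicy(allowPromiscuous=True))

    netsys.AddPortGroup(portgrp=vim.host.PortGroup.Specification(
        name='Sniffer', vlanId=4095,                 # 4095 = pass all VLAN tags through
        vswitchName='vSwitch1', policy=sniff_policy))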



Uplinks and traffic balancing



In vSphere 4.1 you can have a maximum of 32 uplinks per virtual switch. An uplink can be used on only one vSS, that is, you cannot use the same uplink on another vSS. This also means that traffic is never transmitted directly from one vSS to another vSS: traffic between them must either go out an uplink to an L3 device, or through the internal virtual ports to a virtual machine that plays the role of an L3 device.



In 99% of situations your ESXi host will have more than one uplink, and you will have to decide how to balance traffic across two or more uplinks. The balancing policy is set on the vSS by default, but you can optionally override it on individual Port Groups.



There are three basic traffic balancing policies in a vSS: route based on the originating virtual port ID (the default), route based on source MAC hash, and route based on IP hash, the last of which requires static link aggregation on the physical switch.



One of the most common traffic distribution scenarios is very similar to the one used on physical switches with the Spanning Tree Protocol, where different VLANs go over different uplinks until one of the uplinks drops, at which point all the traffic of those VLANs fails over to the remaining uplinks. Similarly, in ESXi you create two Port Groups on a single vSS. For the first Port Group the first uplink is set as active and the second as standby; the second Port Group is configured exactly the other way around. This way you achieve more efficient use of the bandwidth to the physical switch and at the same time do not lose redundancy.
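A sketch of that mirrored active/standby setup in pyVmomi; the Port Group names and VLAN IDs are invented, and netsys is the handle from the first snippet.

    # Sketch: two Port Groups with mirrored active/standby uplinks, so each uplink
    # carries one Port Group's traffic in normal operation and backs up the other.
    # Names and VLAN IDs are hypothetical; 'netsys' comes from the first snippet.
    def teaming(active, standby):
        return vim.host.NetworkPolicy(
            nicTeaming=vim.host.NetworkPolicy.NicTeamingPolicy(
                policy='failover_explicit',
                nicOrder=vim.host.NetworkPolicy.NicOrderPolicy(
                    activeNic=[active], standbyNic=[standby])))

    netsys.AddPortGroup(portgrp=vim.host.PortGroup.Specification(
        name='PG-A', vlanId=10, vswitchName='vSwitch1',
        policy=teaming('vmnic0', 'vmnic1')))
    netsys.AddPortGroup(portgrp=vim.host.PortGroup.Specification(
        name='PG-B', vlanId=20, vswitchName='vSwitch1',
        policy=teaming('vmnic1', 'vmnic0')))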



Sometimes vSphere admins allocate a separate pair of uplinks only for vMotion or FT traffic, which seems to me a waste of interfaces, because vMotion or FT traffic is balanced across more than one interface only in very rare cases. Most of the time one of the interfaces will simply sit idle.



Shaping outgoing traffic

With a vSS we can shape only outgoing traffic, that is, traffic going from virtual machines into the vSS. In principle this is enough, for example, to shape vMotion traffic. But, alas, we cannot shape incoming traffic arriving from physical machines.
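A sketch of enabling such outbound shaping on a Port Group via pyVmomi; the bandwidth numbers, names and VLAN ID are arbitrary, and netsys is the handle from the first snippet.

    # Sketch: shape outgoing traffic (VM to vSS) on a Port Group, e.g. for vMotion.
    # averageBandwidth and peakBandwidth are in bits per second, burstSize in bytes.
    # Names and numbers are hypothetical; 'netsys' comes from the first snippet.
    shaping = vim.host.NetworkPolicy(
        shapingPolicy=vim.host.NetworkPolicy.TrafficShapingPolicy(
            enabled=True,
            averageBandwidth=500 * 1000 * 1000,     # 500 Mbit/s sustained
            peakBandwidth=1000 * 1000 * 1000,       # 1 Gbit/s bursts
            burstSize=100 * 1024 * 1024))           # 100 MB burst

    netsys.AddPortGroup(portgrp=vim.host.PortGroup.Specification(
        name='vMotion-PG', vlanId=30,
        vswitchName='vSwitch1', policy=shaping))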



So, why doesn't a vSS need the Spanning Tree Protocol?





The answer is in the forwarding rules of the vSS. A frame that arrives from an uplink is never forwarded back out of that or any other uplink; it can only be delivered to the internal ports of virtual machines. The vSS does not learn MAC addresses from incoming traffic, and a frame received from an uplink whose destination MAC does not belong to one of the local virtual machines is simply dropped rather than flooded. Broadcasts sent by a virtual machine are delivered to the local internal ports and sent out through only one uplink, chosen by the teaming policy. All of this, in principle, proves that you can have a lot of uplinks between the vSS and the physical switch, yet no frame can ever loop through the vSS back into the physical network, so you absolutely do not need the Spanning Tree Protocol, which is difficult to troubleshoot anyway.



I compiled most of the information from Ivan Pepelnjak's great blog and the official VMware documentation.

Source: https://habr.com/ru/post/124317/


