Testing and monitoring MSTP in a heterogeneous network

Introduction

As any network grows, its administrator sooner or later runs into three problems, among others: the risk of outages caused by broken links, loops appearing in the switch topology, and insufficient capacity on individual links.

To fight these evils, humanity, as is well known (in particular from several articles on Habr, Wikipedia and plenty of other places), invented and uses various versions of the Spanning Tree protocol. The general idea is that switches in a network with more or less arbitrary connectivity collectively decide, according to a set of rules, which links between them to use for which packets.

Pro and Contra

It is worth noting that from time to time people question whether it is needed at all (here, for example). As far as I know, the various opinions on this boil down to three ideas:

Let's put duplicate links and all sorts of LAG / LACP aggregated pairs of wires everywhere
Forget layer 2, let's route everything at layer 3
Let's live on real or virtual switch stacks

It is clear that every specific network has its own "design considerations" and that it never hurts to think twice, but the first two approaches have definite drawbacks. The first one drives up the infrastructure cost in real conditions when fault tolerance has to be maintained. Example: there are ten switches in a potential "ring", and the fiber is already laid. Cheap, 4 cores. If you want to bring two independent links to each switch and then combine them into an aggregated interface, you will have to start a lot of new construction work or put several wavelengths into one fiber, which is not cheap either. And if "everything is routed", the switches have to be replaced with routers (I exaggerate, but the point stands) and you get extra latency out of nowhere. Alas.

Distributed Juniper switch stacks (up to 80 km apart, it seems) do work and are reliable. But they cost as much as an airplane. Enough to drive you to despair.

Experience, the son of hard mistakes

After reading up on all of this and looking it over, we decided to give it a try. Judging by the first impression from various manuals, everything looked rosy: configure it properly and it will more or less sort itself out and fly.
In our arsenal we had Cisco 2960s and Dlinks of various kinds. We wanted happiness in the form of MSTP for a couple of VLANs. There was no test bench, so we tried to assemble everything on the live network (at night, under minimal load, etc.). Why MSTP? Because it is a standard of sorts. There is a chance to run a system built from different vendors' equipment without great losses. And, again, to make partial use of links that would otherwise be blocked.

It did not work on the first attempt: Cisco takes quite a long time to rebuild MSTP, and the slightest error in the mapping of VLANs to instances means that the system does not take off and, with some probability, becomes unmanageable.

We realized that Spanning Tree is worthless without monitoring and a quick way to see what state it is in right now, just like a RAID, for example.

Test bench

We rolled back the configuration and put a Cisco 3750 stack into the core, which removed the performance issues of the pair of 2960s and pushed capacity problems well into the future, leaving only the question of link redundancy.

We assembled a test bench from three 2960s and two Dlinks, isolated from the main network, and started to play.

Tools

First, on one of the Ciscos, we allocated ports for managing all the other switches and placed the management interfaces on them in a separate VLAN that we did not plan to put under MSTP, so as not to lose connectivity with the equipment during the experiments.

It turned out that some Dlink models support a very limited number of MST instances, which makes life harder but not impossible. The ones we have can handle 7, alas.

Next, we wrote scripts in perl + Net::Telnet that can do two key things:

Automatically and uniformly configure switches of different models
Collect enough information to display the state of the tree

In case it comes in handy for someone, here are the minimal commands for Dlink as an example:

 config stp version mstp
 config stp mst_config_id name %cfname revision_level %revision
 create stp instance_id 1
 config stp instance_id 1 add_vlan %inst1vlans
 create stp instance_id 2
 config stp instance_id 2 add_vlan %inst2vlans
 create stp instance_id 3
 config stp instance_id 3 add_vlan %inst3vlans
 create stp instance_id 4
 config stp instance_id 4 add_vlan %inst4vlans
 create stp instance_id 5
 config stp instance_id 5 add_vlan %inst5vlans
 create stp instance_id 6
 config stp instance_id 6 add_vlan %inst6vlans
 config stp ports 1:1-1:26 state enable
 enable stp
 config stp trap new_root enable
 config stp trap topo_change enable

(This particular example is for nominally stackable Dlinks. On non-stackable models, port numbers are written without the ":" prefix.)

and for Cisco:

 conf t
 no spanning-tree mst configuration
 spanning-tree mode mst
 spanning-tree mst configuration
 name %cfname
 revision %revision
 instance 1 vlan %inst1vlans
 instance 2 vlan %inst2vlans
 instance 3 vlan %inst3vlans
 instance 4 vlan %inst4vlans
 instance 5 vlan %inst5vlans
 instance 6 vlan %inst6vlans
 exit
 exit

Replace %cfname with the name of the configuration and %revision with the revision number (a natural number, starting from one).
%inst1vlans is the comma-separated list of VLAN tags for the first instance; the other %instNvlans placeholders work the same way.
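
The placeholder substitution described above can be sketched in a few lines. This is not the original perl script, only a minimal Python illustration; the `render` helper, the shortened two-instance template and the sample values ("lab", the VLAN numbers) are assumptions made up for the example.

```python
import re

# Shortened Cisco template from the article, one command per line.
# Every %name token is a placeholder to be filled in.
CISCO_TEMPLATE = """\
conf t
no spanning-tree mst configuration
spanning-tree mode mst
spanning-tree mst configuration
name %cfname
revision %revision
instance 1 vlan %inst1vlans
instance 2 vlan %inst2vlans
exit
exit"""


def render(template, values):
    """Replace every %placeholder; a missing value raises KeyError."""
    def substitute(match):
        return str(values[match.group(1)])
    return re.sub(r"%(\w+)", substitute, template)


# Hypothetical values, for illustration only.
VALUES = {
    "cfname": "lab",
    "revision": 1,
    "inst1vlans": "10,20",
    "inst2vlans": "30,40",
}

commands = render(CISCO_TEMPLATE, VALUES).splitlines()
for command in commands:
    print(command)
```

Failing loudly on a missing value is deliberate: a half-filled template sent to a switch is exactly the kind of VLAN-to-instance mismatch that broke the first attempt.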

For fine tuning, so that load balancing actually kicks in and traffic is spread across the links, you need to go through the ports and set priorities. This is better done by hand.

How to look at it?

Ideally, you could poll different switches over SNMP and see more or less identical data in more or less identical tables. But despite the standard protocols (both SNMP and MSTP appear to be standardized), every vendor has its own idea of beauty and there is no such free lunch. Or at least we did not find one. For some reason, Cisco exposes data for the CIST but not for the other MST instances. Why is completely unclear...

I had to pick up a file and reinvent the wheel: namely, a program that logs in to the switches over the same telnet, collects data from them, parses it and displays it. For displaying this kind of data, graphviz is ideal (to my subjective taste, of course): you feed it a text file in a simple format, and it lays out the graph for you, draws the arrows, and even inserts hyperlinks if you ask nicely.

The result looked something like this:

The commands for collecting the information are, for Dlink:

 show stp ports %port

and for Cisco:

 show spanning-tree interface Gi %device/%port detail
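
Once port states have been collected with commands like these, producing graphviz input is the easy part. The sketch below is an assumption about how that step could look, not the author's program: the `to_dot` function, the tuple layout, the switch names and the "forwarding"/"blocking" labels are all hypothetical, and it presumes the list of links between switches is already known.

```python
def to_dot(links, instance):
    """Build graphviz DOT text for one MST instance.

    links: iterable of (switch_a, port_a, switch_b, port_b, state),
    where state is "forwarding" or "blocking" (assumed labels).
    """
    lines = [f'graph "mst{instance}" {{']
    for sw_a, port_a, sw_b, port_b, state in links:
        # Forwarding links are drawn solid, everything else dashed.
        style = "solid" if state == "forwarding" else "dashed"
        lines.append(
            f'  "{sw_a}" -- "{sw_b}" '
            f'[taillabel="{port_a}", headlabel="{port_b}", style={style}];'
        )
    lines.append("}")
    return "\n".join(lines)


# Hypothetical sample data: three switches wired in a ring.
LINKS = [
    ("c2960-1", "Gi0/1", "c2960-2", "Gi0/1", "forwarding"),
    ("c2960-1", "Gi0/2", "dlink-1", "1:25", "forwarding"),
    ("c2960-2", "Gi0/2", "dlink-1", "1:26", "blocking"),
]

dot_text = to_dot(LINKS, 1)
print(dot_text)
```

Saved to a file such as tree.dot, the output can be rendered with `dot -Tsvg tree.dot -o tree.svg`. Generating one graph per MST instance makes it easy to see which links each instance blocks.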

Pitfalls

Spanning Tree is entitled to take up to a minute to converge, and this is normal.

The basic configuration (configuration name, revision and the mapping of VLANs to instances) must be identical on all switches.

For all switches to have all instances, they must also have all the VLANs. (In the picture above, however, there are switches that do not have all the VLANs!)

Using Spanning Tree alone (without service instances created purely for investigation purposes), it is impossible to determine which port of switch A switch B is connected to and which port switch C is connected to, if A is closer to the root than B and C in every MST instance.
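
The requirement that the basic configuration be identical everywhere is easy to check mechanically before anything is pushed to the switches. The following is a minimal sketch under stated assumptions: the `find_mismatches` helper, the switch names and the data layout are hypothetical, and VLAN lists are assumed to be plain comma-separated numbers (no ranges), as in the templates above.

```python
def normalize(vlan_map):
    """Turn {instance: "10,20"} into {instance: frozenset({10, 20})}."""
    return {
        int(inst): frozenset(int(v) for v in vlans.split(",") if v.strip())
        for inst, vlans in vlan_map.items()
    }


def find_mismatches(configs):
    """Return the switches whose MST settings differ from the first one.

    configs: {switch: (cfname, revision, {instance: "vlan,vlan"})}
    """
    items = list(configs.items())
    if not items:
        return []
    _, (ref_name, ref_rev, ref_map) = items[0]
    reference = (ref_name, int(ref_rev), normalize(ref_map))
    return [
        switch
        for switch, (name, rev, vlan_map) in items[1:]
        if (name, int(rev), normalize(vlan_map)) != reference
    ]


# Hypothetical planned configurations for three switches.
CONFIGS = {
    "c2960-1": ("lab", 1, {1: "10,20", 2: "30"}),
    "dlink-1": ("lab", 1, {1: "20,10", 2: "30"}),   # same VLANs, other order
    "dlink-2": ("lab", 1, {1: "10", 2: "20,30"}),   # VLAN 20 in wrong instance
}

mismatched = find_mismatches(CONFIGS)
print(mismatched)
```

VLAN lists are compared as sets, so a different order on one switch is not reported, while a VLAN assigned to the wrong instance is.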

Summary

You can make standardized protocols work if you act carefully, take your time, and keep an eye on both your hands and your equipment.

Source: https://habr.com/ru/post/195596/
