
In this short note I want to touch on some interesting aspects of traffic control (or attempts at traffic control) with BGP.
The article will not answer the question of how to achieve happiness on the network! The information is educational in nature and should make for light reading for telecommunications specialists. It is presented in a fairly free form, without overloading the reader with specifics. Let's try to answer the question: "why is there no traffic where it should be, and why is there traffic where it should not be?"
We will not discuss the purpose of BGP and everything that follows from it, and will take the bull by the horns right away.
Initial data
To begin with, we have our own autonomous system, one or more delegated address blocks, and one provider. In this case our AS is connected to the Internet through a single channel to the provider. Traffic to our network (to our prefixes) arrives over this logical channel; there are no two ways about it. Likewise, all outgoing traffic leaves through the only existing channel.
Everything works well, but sooner or later it becomes necessary to connect to an additional provider. There are many reasons for this, and I would like to dwell on them briefly. By connecting a second (third ...) provider, the client is trying to provide channel redundancy, improve connectivity, optimize Internet costs (provider A offers a cheap "global" channel, provider B can provide fast and cheap access to local traffic exchange points) and <add your own option>.
Channel redundancy can follow two main scenarios:
1) there is a main channel with a capacity of 1 Gbit/s. The reserve (used only as a reserve) is purchased with a much lower bandwidth, for example 100 Mbit/s. In this case you should be aware of the consequences of a main channel failure: the end of the world will not come, but customers will feel the change;
2) the backup channel is bought with the same (or nearly the same) bandwidth. Such a channel is not cheap, and nobody wants it to sit idle. This is where the administrator starts performing shamanic rituals with various kinds of load balancing.
Naturally, the network administrator wants to understand exactly how traffic enters, leaves and passes through his network, or rather his autonomous system. You can be 100% sure of this only if you use a single provider. With several providers, understanding of the traffic in the network turns into assumptions about the traffic in the network. Here is why.
The administrator has several mechanisms up his sleeve for influencing traffic flows (local preference, weight, MED, AS-PATH and so on), but how effective are they? I would say they are quite effective (who would doubt it), but not completely. Below are a few interesting examples.
1. Outgoing traffic
Suppose we receive a Full View from two providers. The first provider is our main one, the second is the backup. We define the policy for processing announcements from the providers: for example, we assign a higher local preference to the prefixes received from the first provider than to those received from the second.
Fig. 1
As a result, all outgoing traffic should leave through the main channel. We visualize the logical channels (for example, with Cacti Weathermap) and observe a strange picture: traffic leaves not only through the main channel but also through the backup one. How come?
The thing is that one Full View differs from another. Let's look at what we receive from the providers, in particular at the number of prefixes received, PfxRcd (the example is taken from a real router):
#sh ip bgp summary
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd
XXXX 4 AAAAA 3582179 106997 96854566 0 0 4w6d 392986
YYYY 4 BBBBB 772880 508161 96854556 0 0 6d02h 400394
The session with router XXXX (AS-AAAAA) is the main one.
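The size of the gap between the two feeds can be read straight from the summary output. Below is a minimal Python sketch that does this; the helper name `prefixes_received` and the assumption that every neighbor line has exactly ten whitespace-separated fields ending in a prefix count are mine, not part of any router tooling.

```python
# Neighbor lines copied from the summary output above (header omitted).
summary = """\
XXXX 4 AAAAA 3582179 106997 96854566 0 0 4w6d 392986
YYYY 4 BBBBB 772880 508161 96854556 0 0 6d02h 400394
"""

def prefixes_received(text):
    """Map neighbor -> PfxRcd for sessions that are established.

    Assumes ten whitespace-separated fields per neighbor line; a session
    that is not established shows a state name instead of a number in the
    last column and is skipped.
    """
    counts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 10 and fields[-1].isdigit():
            counts[fields[0]] = int(fields[-1])
    return counts

counts = prefixes_received(summary)
delta = counts["YYYY"] - counts["XXXX"]
print(delta)  # 7408
```

The count alone does not say which prefixes differ; for that, the two tables have to be compared prefix by prefix.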
We see a difference of about 7,400 prefixes (400394 versus 392986). This means that when we request resources located in those networks, the backup channel is used. Why? Looking closely at these prefixes (the delta), I noticed that I do receive them from the main provider, but in aggregated form: instead of four /24s we get one /22. Since the router forwards along the most specific matching prefix, and local preference only ranks routes to the same prefix, the /24s learned from the backup provider win. Who performs this summarization? Hard to say; perhaps one of the upstreams.
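The effect of aggregation can be reproduced with a few lines of Python. This is only a sketch of the principle (longest prefix first, local preference only among routes to the same prefix), not a model of a real router; the addresses, the local preference values 200 and 100, and the function name `select` are made up for illustration.

```python
import ipaddress

def route(prefix, provider, local_pref):
    return {"prefix": ipaddress.ip_network(prefix),
            "provider": provider,
            "local_pref": local_pref}

# Hypothetical tables: the main provider sends an aggregate,
# the backup provider sends the four more specific /24s.
routes = [
    route("10.10.0.0/22", "main", 200),
    route("10.10.0.0/24", "backup", 100),
    route("10.10.1.0/24", "backup", 100),
    route("10.10.2.0/24", "backup", 100),
    route("10.10.3.0/24", "backup", 100),
    # A prefix that both providers announce identically.
    route("10.20.0.0/16", "main", 200),
    route("10.20.0.0/16", "backup", 100),
]

def select(routes, destination):
    """Pick the route used to forward a packet to `destination`."""
    dst = ipaddress.ip_address(destination)
    matching = [r for r in routes if dst in r["prefix"]]
    # The longest matching prefix wins; local preference is compared
    # only between routes that have the same prefix length here.
    return max(matching, key=lambda r: (r["prefix"].prefixlen, r["local_pref"]))

print(select(routes, "10.10.2.17")["provider"])  # backup
print(select(routes, "10.20.5.1")["provider"])   # main
```

For the identically announced /16 the higher local preference does its job, while for the aggregated block the backup provider's /24s are chosen no matter what local preference the /22 carries.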
An interim conclusion: even outgoing traffic can flow where we do not expect it to.
2. Inbound traffic
In this case, everything is simpler on the one hand and more complicated on the other.
How can we influence the behavior of incoming traffic? The classic tools are artificially lengthening AS-PATH (prepend), sending announcements to the provider with certain communities in order to lower the provider's local preference, and other, significantly less effective methods. (Not all providers offer communities for this, and some of them, far from small ones, do not even have a looking glass. In that case you call your colleagues at the provider, and the administrator on duty tells you over the phone which prefixes he "sees" and with which attributes.)
But no matter how hard we try, the outcome depends largely on the providers' policies. And while balancing outgoing traffic works more or less well, incoming traffic will arrive over both channels, and in a rather unpredictable ratio.
For example: our AS-A is connected to two providers, AS-B (main) and AS-C (backup). We announce our networks to both, but toward the backup provider we deliberately lengthen AS-PATH (we want to receive traffic over this channel only when the main one fails).
Fig. 2
The backup provider receives announcements of our networks from two sources: directly from the client (from us) and from its peering partners (dotted line). In many cases the provider treats the path that connects it directly to the client as the preferred path to the client's network. To achieve this, the backup provider assigns a higher local preference to announcements received directly from the client (200 in this case) than to those received from a peer (100 in this case). It then announces the prepended path (the one received from the client) to all its neighbors, since a BGP router announces only its best route.
This means that traffic passing through the AS-B provider's network reaches us over the main channel, and traffic passing through the AS-C provider's network reaches us over the backup one. In the end, whether we like it or not, incoming traffic will "come in" over both channels. On top of that we get asymmetry: we try in every possible way to send traffic through the main channel, yet we receive it over both the main and the backup.
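Why prepending does not help here can be shown with a small Python sketch of the backup provider's choice. It models only the first two steps of BGP best-path selection (local preference, then AS-PATH length) and ignores all later tie-breakers; the `Route` class, the function `best_route` and the three prepends are illustrative assumptions, while the values 200 and 100 come from the example above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Route:
    learned_from: str
    local_pref: int
    as_path: tuple

def best_route(routes):
    # Higher local preference wins outright; AS-PATH length is consulted
    # only when local preference is equal.
    return max(routes, key=lambda r: (r.local_pref, -len(r.as_path)))

# AS-C's two routes to our prefix: the prepended one straight from us,
# and the short one learned from its peer AS-B.
from_client = Route("client AS-A", 200, ("AS-A", "AS-A", "AS-A", "AS-A"))
from_peer = Route("peer AS-B", 100, ("AS-B", "AS-A"))

chosen = best_route([from_client, from_peer])
print(chosen.learned_from)  # client AS-A
```

The prepended path is twice as long, but AS-PATH length is never compared because local preference has already decided the outcome.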
An interim conclusion: with two or more providers, traffic will "flow in" from all sides.
3. Sometimes even the order of session establishment matters
1. Consider an example.
Fig. 3
Our network (AS-A) is connected to a provider (AS-B). AS-C and AS-D are other providers; AS-E is a client just like us. Green arrows show the propagation of routing information, blue arrows show incoming traffic.
2. Now we decide to establish a connection with AS-E (a partner, not a provider). The point of this link is not an additional channel but mutual backup: insurance for each other. By default the link should carry no traffic. If one AS has a failure, the other covers for it.
To do this, we set a policy for outgoing announcements toward the partner, namely AS-PATH prepending. For the partner's network this announcement is not the best one, so AS-E does not propagate it further.
Fig. 4
3. But then our session with the main provider goes down (or we are testing it). Now the red arrows show the propagation of the prepended route, and the blue arrows show the path of traffic to our network.
Fig. 5
So far, everything is still in order.
The session with the main provider comes back up. Note that AS-D is the kind of provider we discussed earlier (it sets a higher local preference for routes from clients); the other providers do not do this, so they choose the path by AS-PATH length.
4. AS-B accepts the announcement from AS-A. The announcement carries no prepend, so this path is now the best one, and AS-B announces it further.
Fig. 6
We see how the announcement propagates further and how the sources of traffic change:
Fig. 7
5. Finally, the announcement of AS-A's prefixes via the main provider reaches AS-D. AS-D considers this path less preferable, because it assigns a higher local preference to updates received from its client (AS-E). The result of these processes is the following steady state:
Fig. 8
Compare Fig. 4 and Fig. 8. As you can see, the pattern of incoming traffic differs significantly. Our backup channel has become, if not the main one, then certainly far from a reserve. How can we fix this? You can reset (tear down and bring back up) the session with the partner (AS-E) to return to the normal state, but this method is far from scientific.
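The dependence on event order can be reproduced with a toy path-vector simulation in Python. This is a sketch under loud assumptions: the topology is reduced to A (us), B (main provider), D (the partner's provider) and E (the partner), with AS-C left out; B and D are assumed to be directly connected; only local preference and AS-PATH length are compared; and the number of prepends (3) is arbitrary. None of these names or values come from a real configuration.

```python
ASES = ["A", "B", "D", "E"]
ORIGIN = "A"      # our AS, which originates the prefix
PREPENDS = 3      # extra copies of our ASN sent toward partner E (assumed)

def link(x, y):
    return frozenset((x, y))

def local_pref(asn, neighbor):
    # Assumed policy: only D raises the preference of routes from its client E.
    return 200 if (asn, neighbor) == ("D", "E") else 100

def best(asn, rib):
    """Return (neighbor, path) of the preferred route, or None if there is none."""
    if asn == ORIGIN:
        return (None, ())
    if not rib[asn]:
        return None
    return max(rib[asn].items(),
               key=lambda r: (local_pref(asn, r[0]), -len(r[1])))

def converge(links, rib):
    """Exchange announcements over live sessions until nothing changes."""
    for asn in ASES:                      # forget routes from dead sessions
        for nbr in list(rib[asn]):
            if link(asn, nbr) not in links:
                del rib[asn][nbr]
    changed = True
    while changed:
        changed = False
        for asn in ASES:
            chosen = best(asn, rib)
            for nbr in ASES:
                if nbr == ORIGIN or link(asn, nbr) not in links:
                    continue
                adv = None
                if chosen is not None:
                    adv = (asn,) + chosen[1]
                    if (asn, nbr) == (ORIGIN, "E"):
                        adv = (ORIGIN,) * PREPENDS + adv
                    if nbr in adv:        # receiver drops paths with its own ASN
                        adv = None
                if rib[nbr].get(asn) != adv:
                    changed = True
                    if adv is None:
                        del rib[nbr][asn]
                    else:
                        rib[nbr][asn] = adv

rib = {asn: {} for asn in ASES}
links = {link("A", "B"), link("B", "D"), link("D", "E")}
converge(links, rib)                 # only the main provider is connected
links.add(link("A", "E"))
converge(links, rib)                 # backup session with partner E comes up
before = best("D", rib)[0]
links.discard(link("A", "B"))
converge(links, rib)                 # session with main provider B fails
links.add(link("A", "B"))
converge(links, rib)                 # ... and is restored
after = best("D", rib)[0]
print(before, after)  # B E
```

The set of sessions is identical at the two measurement points, yet D reaches us through B before the failure and through E after it: once D has learned the client route with local preference 200, the unprepended path from B can no longer displace it, and E never receives a better alternative because D's own best path already contains E.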
An interim conclusion: I wanted to demonstrate that sometimes even the order in which sessions are established matters and affects the traffic pattern. One could say the case is somewhat contrived, but it is taken from real life and does occur.
Summary
Traffic management with BGP and routing between autonomous systems is a complex and interesting process. There are even more factors affecting traffic flows than we might think.
Happy routing!