
We want to share a translation of a series of notes from
the Russ White
blog , where he raises the question of using AS-PATH Prepend when connecting to Internet service providers via BGP.
When an organization is connected to two or more BGP providers, several tasks may arise. For example, avoid asymmetric routing, i.e., ensure that response packets are returned via the same provider through which the session is initiated. Or, on the contrary, to achieve a uniform distribution of traffic across the channels of providers, not taking into account asymmetric routing. In both cases, you can influence the routing of Internet providers towards the corporate network using the BGP attribute AS-PATH. But, unfortunately, this approach does not always help to achieve the desired result. The problems of using the AS-PATH Prepend will be discussed.
Why doesn't AS-Path Prepend work?')
So you want to improve the load distribution on incoming channels. Searching on the Internet, you will rather quickly find a website that explains how to configure AS-Path Prepend for this task. During the following maintenance work, you will do all this and ... you will find that some amount of traffic is being redirected, but not as much as we would like. You wait for the next scheduled work on the channel and set up a few more prepend. Again you start everything and ... see that almost nothing has changed. Why is that? There are several reasons why prepend is not always effective. But this is mainly due to the way the Internet itself is organized. As an example of a network, take the image below.

You are in AS65000 and are trying to balance traffic between channels 65001-> 65000 and 65004-> 65000. Suppose you set up prepend for AS65001, since this provider sends you more traffic. Suppose that AS65003 accepts routes from AS65001 and AS65004 on equal terms. By setting up the prepend, you made the route to your networks in AS65000, from the point of view of AS65003, look longer through AS65001. Here the first prepend works.
But what if we add another extra prepend? Will it somehow affect the traffic flow? AS65003 has only two possible paths to destinations: one through AS65001, the second through AS65004. He can choose only one of these two routes. If one prepend worked here, the second will not do anything. This brings us to the first problem with the use of prepend: it is effective only when specified within realistic AS-Path parameters. Additional 256 prepend-ov in this network will not give greater effect than the one that was from the first of them.
If the efficiency of using prepend is related to the total length of the route over the network, then the first question to ask is: What is the average length of the route on the Internet? It turns out that there are organizations that conduct such measurements regularly and for quite some time (by the standards of the Internet). For example, CAIDA, RIPE, and Potaroo analysts have large-scale measurement data from the Internet Default-Free Zone (DFZ). Here is a
plot of the average AS Path length in DFZ since 1998 :
As you can see, the average length of the AS Path has changed little over the past 8 years - even though the number of routes and connected autonomous systems has dramatically increased over the same time. This allows you to understand that in most cases the first AS-path prepend will have the greatest effect, the second will affect less, and all the rest will be just for beauty.
There are two reasons why prepend may not work at all.
First, take the connection between AS65001 and AS65004. We know that there is some kind of peer-to-peer relationship between them: perhaps with payment, perhaps without — this is unknown to us. But we know for sure that AS65001 will always prefer your route received from you to the same route received from AS65004. And this preference is configured for AS65001 through LOCAL_PREF, which priority will always be higher than your AS-Path Prepend. Conclusion? No matter how hard you try, you will not be able to send traffic along the path 65004-> 65001 using prepend.
Second: notice the AS65002 in the corner. AS65001 will always give preference to the route to the client received from the client itself. So, in addition to the above, you also can’t get traffic from AS65002 to go through AS65004 instead of AS65001.
Now, knowing why prepend does not always work, let's think about what we can do.
What to do?When we have determined that AS-Path prepend will not help us in any way, what are our next steps?
Routing is based on the selection of the longest prefix that matches the destination address. This is also true when comparing BGP routes, whatever AS-PATH. So, one of the options here is to divide your address space into longer announced prefixes and announce a part to each of your upstream providers (
to ensure fault tolerance, you can configure the prefix announcement through another provider based on the unavailability condition of the first one - BGP . ). In fig. 3 shows that AS65000 divides its / 44 IPv6 prefix into two and announces them AS65001 and AS65004, respectively, due to which half of the AS65000 subnet traffic comes from one particular provider. This technique can be used with AS-Path prepend to more efficiently distribute the load.
Using longer prefixes to direct traffic over a preferred inbound channel is already a good step towards shaping the desired pattern of incoming traffic. Some scenarios require more granular control.
But what if we don’t have the ability to create longer prefixes in your ASN (for example, in / 24 for IPv4 and / 48 for IPv6)? In such cases, the BGP community attribute can help us. Using the community allows you to influence how the ASN is advertised within the network of our upstream provider and its peers. An example can be seen in Fig. 4. AS65002 - client AS65001, according to the routing policy inside 65001, prefixes received from clients take precedence over those received from BGP peers. This means that despite the AS prepend announced for AS65001 from our channel, AS65002 will still use the 1-gigabit channel. If AS65002 sends us a lot of traffic, we may encounter overloading the incoming connection through AS65001. With the help of the BGP community, you can make sure that AS65001, despite its own policy, directs traffic to AS65004.

Most providers have a document that indicates which community it accepts and uses for such tasks - almost none of the providers use the community indicated in RFC1998. For example, here is an old
L3 document , and here is a
list of the BGP community policies that the JT provider is currently using. The best way to find out such things is to ask the provider directly.
AS prepend, prefix separation and the BGP community can be used simultaneously, but it is very important to think over the BGP policy well.
Stability, which will give a more complex policy, may affect the stability of the system in case of failure or cause excessive complexity in setting up, the logic of which will be difficult to understand in two nights. The main question that an engineer should ask when creating a policy is - what will happen if something falls off? In fig. 5 policies with fig. 3 and fig. 4 applied simultaneously. What should immediately think:
- Do we need to announce an aggregated route along with our long prefixes? Hint: the answer to this question is almost always affirmative.
- Does AS65004 have a conflicting internal routing policy?
- What happens if the channel fails?
- Should responses to outgoing packets be returned on the same channels (stateful firewalling or NAT)?
What should you think about before you solve a problem?This small cycle began with a simple problem - what to do if we have not balanced incoming traffic on two channels with Internet access. In other words, how to implement load balancing in BGP. The first part dealt with problems with AS-Path prepend, and the second - de-aggregating and using the communitu attribute to modify the local preference in your provider's network.
In the last part, we will again deal with deaggregation. Announcing longer prefixes is routing heavy artillery. Be careful when announcing more prefixes. Default Free Zone is a public property. The routing table on the Internet does not belong to anyone, but everyone uses it. Deaggregation doesn't cost you anything, but everyone else has to pay. Adding another route to the routing table is easy, but remember that the longer prefix you add will be visible to the entire Internet. For solving your problem, you pay with a small amount of memory from each router in the world connected to the DFZ. If everyone starts to deaggregate thoughtlessly - everyone will have to buy more powerful routers and additional memory. Including you.
There is a fine line between the use of a public resource and its abuse. If everyone abuses the public resource, because it costs them nothing, the result will be disastrous for this resource. And after destroying public property, it will be very difficult to restore the initial trusting relationships, thanks to which it appeared. So, before setting up the deaggregation, consider whether you really can not do without it.
Is it necessary? Is it really important to balance these two incoming channels? There may be financial reasons for this, for example, the cost of using two channels or the price of excess traffic on one of them. Of course, these are only assumptions, but, perhaps, it would be more reasonable to change the width of the available channels than to introduce a technical solution that needs to be maintained and maintained.
Remember that everything you set up will fail once. And failures often result in calls at night. Consider the different options before you customize the optimization.
How can I reduce damage to public property?
Returning to the network from our first example, is it possible to implement de-aggregation so that traffic can be transferred from AS65001 to AS65004 without affecting the size of the table for everyone to whom these providers are connected? Actually, most providers not only accept the community you send to set up local preference in their AS, but also allow you to block the announcement of any specified route to your peers. You may need to play a little with these communities to understand how they affect the flow of incoming traffic. For example, how will blocking the announcement of longer prefixes to transit peers of a higher provider, in comparison with blocking the announcement of the same prefixes for a certain group of provider’s clients, will have an effect? Since you cannot directly determine how and where this provider is connected, you can contact him and find out what and where should be announced to minimize the global impact, or try various combinations and determine the best option in the process.
Do I have the correct peering? Another option is to pay attention, through which providers you connect to the global network. Suppose that this is one regional and one global provider. In this case, which one sends you more traffic will depend heavily on your customer base.
For example, if you are a local company such as a regional bank or a medical institution, most of your customers will be connected to a regional provider (and not tier-1), and most likely you will receive most of the traffic from its channel. If your business is not so tied to a geographical location, then not so much will come from a regional traffic provider - mostly from people who enter your network while in your area. In such cases, the uneven load on these two channels is quite expected.
If the situation is such, it would probably be more correct to establish interaction with two providers so that it brings you closer to your customers. If your clients are located all over the world, most likely, it is better for you to be connected to two providers at the national or global level than to one regional and one global. Whenever possible, it is better to balance incoming traffic, starting from who your customers are and how best to reach them, instead of going to different engineering tricks, trying to get the networks of two completely different providers to send you the same amount of traffic.
The conclusion here is the following: the engineering solution should be the last solution when nothing else helps. Of course, we are all engineers here, and for us nothing justifies the money spent on the purchase of hardware, as an elegant and voluminous set of commands and configurations of these hardware, with which we solved the problem of high channel utilization. But the real engineering art is to find the source, delving into the very essence of the problem, and not to deal with secondary consequences.