BGP route leaks

Colleagues, attention! Following our initiative to add a mechanism for automatic protection against route leaks to the BGP protocol, an adoption call has been announced.

This means that starting from May 21, 2017, for two weeks, the pros and cons of adopting the authors' proposals as a working group document will be discussed on the IETF mailing list (you can subscribe to it here ). Depending on the outcome, work on this document will either continue until it receives the status of a standard (RFC) or be frozen.

We ask everyone who cares about the state of BGP to share their arguments, in English, in the mailing list thread titled “draft-ymbk-idr-bgp-open-policy-03”. Remember that when you comment, you speak as an engineer and do not represent the opinion of your employer. It is highly desirable that your opinion be well reasoned, so we recommend rereading our proposals first (refer to the drafts: one , two ).
We remind you that anyone can express an opinion on the IETF mailing list; no qualification is required.

We are grateful in advance to every technical specialist, system administrator, developer, and simply interested person who is ready to support our initiative to modernize one of the key protocols that keep modern networks running efficiently.

Thank you.

Hello! My name is Alexander Azimov, and I represent Qrator Labs. Today I will give you an update on route leaks. The topic is hardly new; it has been raised here a couple of times already, including by me. However, if there are newcomers in the room, and I hope there are, I will begin by defining what a route leak is and what its consequences can be.

Let me state up front that we are talking only about transit route leaks. A route leak is a situation where a prefix received from one upstream provider or peer is announced to another upstream provider or peer. A fair question: why should you care? Why should it matter to you that some autonomous system took your prefix from one provider and announced it to another?
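The definition above can be written down as a tiny check. This is a minimal sketch, not part of any BGP implementation; the `Relation` enum and the `is_transit_leak` function are hypothetical names introduced only for illustration.

```python
from enum import Enum


class Relation(Enum):
    """How a neighbor relates to our autonomous system (illustrative names)."""
    PROVIDER = "provider"
    PEER = "peer"
    CUSTOMER = "customer"


def is_transit_leak(learned_from: Relation, announced_to: Relation) -> bool:
    """Return True when a route learned from a provider or peer
    is announced to another provider or peer."""
    non_customers = {Relation.PROVIDER, Relation.PEER}
    return learned_from in non_customers and announced_to in non_customers
```

Under this sketch, a route learned from a provider and passed on to a customer is fine, while the same route passed on to a peer or to a second provider counts as a leak.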

Unfortunately, it does concern you. The extra hops will in most cases increase network latency, and they can also be used for MitM attacks. And you hardly want your network's global availability and connectivity to depend on someone else's network that has already shown itself to be poorly managed.

Let's see how such incidents affect networks, and which networks they affect.

On the slide you see the statistics we collected in April 2017: thousands of routes leak every day. The set of networks involved in these anomalies changes from day to day, so over April we saw more than 40,000 distinct leaked prefixes. In reality the number of affected networks is much higher, because a route leak is a double-edged sword. It affects not only the leaked prefixes but also the networks and autonomous systems that accepted them. So what are the effects on those who accept them? You will be surprised: they are exactly the same. If you accept leaked prefixes, you redirect your traffic and your customers' traffic through the same poorly configured networks, with exactly the same result. So who is the victim?

It turns out that almost every telecom operator is affected. Every day, almost every network accepts at least one leaked route.

So the problem is global and affects everyone. And the traditional question arises: who is to blame?

Let us set aside deliberate, malicious leaks. They exist, but most route leaks result from a poor understanding of how BGP works and from configuration mistakes. The number of autonomous systems in which we see misconfiguration to one degree or another is large: in April there were more than 1,000 such networks.

As you can see from the trend, if we widen the observation window, the number of autonomous systems creating such anomalies will be even higher.

So what can we do? We can contact these operators and try to explain where their mistake is. If they are willing, maybe they will fix it, but there are no guarantees. Therefore, it is better to focus on the technical side of the route leak problem.

What do we have?

Of course, we have BGP communities. If you configure communities correctly, and then configure the filters correctly, your network will never be the source of such anomalies. There are two problems here: “if” and “correctly”. There is no verification and no built-in support in the protocol. This is typical BGP: excessive flexibility at the expense of control. The result of this flexibility is thousands of route leaks.

Of course, there is another way: set up filters on incoming BGP announcements to detect leaks among them and stop their further propagation. To this day, the best way to do that is to use AS-SETs, but here we run into another problem. Not all Internet Routing Registries support them and, moreover, not all AS-SETs are accurate. Some of them are simply wrong and, please note, you need no authorization to add, for example, DTAG's customer list to your own AS-SET. But let's say we live in an ideal world where all AS-SETs are correct and up to date.

This still does not solve the general problem of route leaks, because if such an anomaly happens inside the customer cone of your autonomous system, its origin will be legitimate and you will have to accept these announcements. In that case, obviously, all higher-level telecom operators will also accept and use this incorrect route.
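To make the limitation concrete, here is a rough sketch of an origin-based inbound filter. It assumes the filter only checks that the origin AS (the last AS in the path) belongs to the customer's AS-SET. The function name, the AS-SET contents, and the AS numbers (taken from the documentation range) are all hypothetical.

```python
from typing import Sequence, Set


def accept_from_customer(as_path: Sequence[int], customer_as_set: Set[int]) -> bool:
    """Accept a route when its origin AS (the last AS in the path)
    is a member of the customer's AS-SET."""
    return len(as_path) > 0 and as_path[-1] in customer_as_set


# Hypothetical customer cone: our customer 64500 and its customer 64501.
CUSTOMER_CONE = {64500, 64501}

# 64501 announces its prefix to 64500, which passes it to us: legitimate.
direct_path = [64500, 64501]

# 64500 learned the same prefix via an unrelated provider 64510 and
# leaked it to us. The origin is still 64501, so the filter accepts it.
leaked_path = [64500, 64510, 64501]

# A prefix originated outside the cone is the only case the filter stops.
outsider_path = [64500, 64511]
```

Both `direct_path` and `leaked_path` pass this filter, which is exactly the case described above: the origin is correct, so even an accurate AS-SET cannot tell the two apart.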

What is the solution? Monitoring. In fact, it is the only real way to detect route leaks outside the boundaries of your network. Let's draw a preliminary conclusion.

Today, if you set up the right BGP communities and the right filters, you eliminate the possibility of leaks from your own network. If you make the filters on incoming prefixes as strict as possible, you can filter out some route leaks. Monitoring remains the only way to detect route leaks outside your own network. Monitoring can also be used to detect that you have accepted BGP routes leaked by someone else.

However, none of these options lets you automatically fix route leaks that occur beyond the limits of your own network. You cannot fix third-party leaks yourself. From this point of view, the problem looks very complicated and confusing.

At the same time, classifying peering relationships is not a problem. There are only four types. A fifth is possible, which is some complex combination of the four basic ones. From my point of view, the route leak problem stems from the fact that these relationships are not expressed natively in BGP, at the protocol level. Therefore, to solve the problem, we suggested adding “roles” to BGP.

We propose adding a new configuration parameter, the BGP role, which reflects the peering relationship. When the BGP session starts, the speakers exchange their roles using BGP capabilities and check that the received role is consistent with the local BGP settings. What if there is a conflict? It means that either you or your neighbor is trying to configure the session incorrectly. There is no alternative: such a BGP session must be terminated.
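The role check at session start might look roughly like this. It is a simplified sketch with only three illustrative roles; the role names, the compatibility table, and both function names are assumptions for illustration and do not reproduce the draft's exact role list or capability encoding.

```python
# Pairs of (our configured role, role announced by the neighbor) that agree.
# Each side states what it is in this session, so the roles must be
# complementary: a provider talks to a customer, a peer talks to a peer.
COMPATIBLE_ROLES = {
    ("provider", "customer"),
    ("customer", "provider"),
    ("peer", "peer"),
}


def open_roles_match(local_role: str, remote_role: str) -> bool:
    """Check the role received in OPEN against the local configuration."""
    return (local_role, remote_role) in COMPATIBLE_ROLES


def handle_open(local_role: str, remote_role: str) -> str:
    """Return the session outcome: a role conflict terminates the session."""
    if open_roles_match(local_role, remote_role):
        return "established"
    return "terminated"
```

For instance, if both sides believe they are the provider, `handle_open("provider", "provider")` yields `"terminated"`, which surfaces the misconfiguration before any routes are exchanged.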

I believe that roles are a natural mechanism. Roles reveal nothing to third parties, so there is nothing to worry about. Roles also have many uses. They can automate mechanisms that previously had to be configured manually.

First of all, that means the same route leaks. With roles established and verified, we suggest adding another attribute, called “internal only-to-customer” (iOTC), which has zero length (it is just a flag). It is set on all routes received from peers and providers, and a set of automatic filters then blocks announcements of those routes to other providers and peers if the attribute is set. That is, apart from configuring roles, no additional settings are needed to prevent route leaks.
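The iOTC logic described above can be sketched in a few lines. This is an illustration only: the `Route` class, the `on_import` and `may_export` functions, and the role strings are hypothetical, and `neighbor_role` here means what the neighbor is relative to us.

```python
from dataclasses import dataclass


@dataclass
class Route:
    prefix: str
    iotc: bool = False  # zero-length flag: the route may go only to customers


def on_import(route: Route, neighbor_role: str) -> Route:
    """Set the iOTC flag on every route received from a peer or provider."""
    if neighbor_role in ("peer", "provider"):
        route.iotc = True
    return route


def may_export(route: Route, neighbor_role: str) -> bool:
    """Automatic export filter: flagged routes never go to peers or providers."""
    if route.iotc and neighbor_role in ("peer", "provider"):
        return False
    return True
```

A route imported from a provider gets the flag and can afterwards be exported only to customers, while a route imported from a customer stays unflagged and can be exported to anyone.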

Next, meet the “external only-to-customer” (eOTC) attribute, which has a length of 4 octets and holds the number of the autonomous system that set it. When a route is advertised to a peer or customer, the autonomous system should set the attribute to its own AS number. Once set, the value of this attribute should not be changed. In the scenario on the slide, autonomous system 3 detects a route leak made by autonomous system 2. It is very simple and, most importantly, it works.
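Here is one way the eOTC scenario could play out in code. The stamping rule follows the description above. The detection rule is an assumption inferred from it: a route that already carries eOTC should not arrive from a customer, nor from a peer with a value other than that peer's own AS number. Function names are hypothetical, and AS numbers 64501, 64502, and 64503 stand in for autonomous systems 1, 2, and 3.

```python
from typing import Optional


def stamp_on_export(eotc: Optional[int], local_asn: int, neighbor_role: str) -> Optional[int]:
    """Set eOTC to our AS number when advertising to a peer or customer.
    An existing value is never changed."""
    if eotc is None and neighbor_role in ("peer", "customer"):
        return local_asn
    return eotc


def looks_leaked(eotc: Optional[int], neighbor_asn: int, neighbor_role: str) -> bool:
    """Assumed detection rule, applied to a route received from a neighbor."""
    if eotc is None:
        return False
    if neighbor_role == "customer":
        # A stamped route was already sent "down" or "sideways" once,
        # so it should never come back up from a customer.
        return True
    if neighbor_role == "peer":
        # A peer may only send routes it stamped itself.
        return eotc != neighbor_asn
    return False  # routes from providers are expected to carry eOTC


# AS 64501 advertises a route to its customer AS 64502 and stamps it.
eotc_at_as2 = stamp_on_export(None, 64501, "customer")
# AS 64502 leaks the route to its other provider AS 64503; the stamp is kept.
eotc_at_as3 = stamp_on_export(eotc_at_as2, 64502, "provider")
# AS 64503 receives a stamped route from its customer AS 64502.
leak_detected = looks_leaked(eotc_at_as3, 64502, "customer")
```

In this walk-through the third network sees a stamp set by the first network on a route arriving from its customer, which is how the leak made by the second network becomes visible.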

So, what should we do when we find a route leak? It might seem that you should filter the prefix, or perhaps reset the session, but in fact you should think three times and curb your enthusiasm.

Route leak detection is based on the transitive eOTC attribute. Like any other transitive attribute, it may be modified incorrectly along the way. So, instead of filtering routes or tearing down the session, you should ONLY deprioritize the route, that is, lower its local preference. That's all. In most cases this is enough to protect your autonomous system from using and propagating the leaked route.
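The difference between deprioritizing and filtering can be shown with a toy best-path choice. This sketch assumes that deprioritization means lowering local preference, and it compares only local preference and AS path length. The `Candidate` type, the `LEAK_LOCAL_PREF` value, and the `best_route` function are hypothetical.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    next_hop: str
    local_pref: int
    as_path_len: int
    suspected_leak: bool


LEAK_LOCAL_PREF = 0  # hypothetical value assigned to suspected leaks


def best_route(candidates: List[Candidate]) -> Candidate:
    """Prefer higher local preference, then shorter AS path.
    Suspected leaks stay in the table but get the lowest preference."""
    def sort_key(c: Candidate):
        pref = LEAK_LOCAL_PREF if c.suspected_leak else c.local_pref
        return (-pref, c.as_path_len)
    return min(candidates, key=sort_key)


clean = Candidate("192.0.2.1", local_pref=100, as_path_len=4, suspected_leak=False)
leaked = Candidate("192.0.2.2", local_pref=200, as_path_len=2, suspected_leak=True)
```

With both candidates present, `best_route([leaked, clean])` picks the clean route even though the leaked one looks more attractive. If the leaked route is the only one, it is still used, so a wrongly modified attribute cannot cut off connectivity.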

We have already implemented a proof of concept based on the BIRD routing daemon; it can be found on GitHub . As you can see, not many configuration lines are needed for automatic protection against route leaks from inside your AS and, most importantly, for detection of leaked routes that originated outside your network.

In general, it seemed to me a cool idea. We have a general solution to the problem. It is implemented in code. Everything is automated, leaving no room for clumsy hands to interfere with how these mechanisms work. Moreover, verification by your neighbor in the OPEN message ensures that the basic role configuration is correct. This is why we decided to take our ideas to the IETF. The road turned out to be interesting, but difficult and long.

www.ietf.org/id/draft-ymbk-idr-bgp-open-policy-03
tools.ietf.org/html/draft-ymbk-idr-bgp-eotr-policy-00

I want to give special thanks to Randy Bush, because without his help I would have given up. Today we have two drafts. The first describes roles and iOTC. The second describes eOTC. I hope both of them will eventually be adopted and we will see support for roles in your routers' software in the near future.

tools.ietf.org/html/rfc7908
tools.ietf.org/html/draft-ietf-idr-route-leak-detection-mitigation-06
tools.ietf.org/html/draft-ietf-grow-bgp-reject-07

There are also several other initiatives related to this topic. One of them, BGP-reject by Job Snijders, changes the default behavior of a BGP router. The idea is that if you have no import or export policy configured, no announcements are exchanged.

There is also a competitor to eOTC, developed by colleagues from NIST. I would also like to note separately that the only document on this issue that has RFC status is purely informational: it describes what a route leak is and classifies leak types.

The question is: should we blame the IETF for moving so slowly? You know, I see probably a hundred people here who seem interested in BGP routing issues, and I hope, though I am not sure, that you are all subscribed to the IETF mailing list. So instead of accusing the IETF of sluggishness, work with it. This is basic.
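The default behavior proposed by BGP-reject can be pictured as follows. This is a minimal sketch of the idea only; the `Policy` type and the `may_exchange` function are hypothetical, and a policy is modeled simply as a function from a prefix to a yes/no decision.

```python
from typing import Callable, Optional

# A policy decides, per prefix, whether a route may be imported or exported.
Policy = Callable[[str], bool]


def may_exchange(prefix: str, policy: Optional[Policy]) -> bool:
    """With no policy configured, nothing is accepted or advertised."""
    if policy is None:
        return False
    return policy(prefix)
```

Under this sketch, `may_exchange("192.0.2.0/24", None)` is `False`: forgetting to configure a policy leads to no routes at all instead of a full table being passed along.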

initiatives.qrator.net/details/route-leak-mitigation

What do we have as a result?

So far, there is no proper way to minimize the damage that route leaks can cause your networks other than keeping your communities correctly configured, your filters working efficiently, and your prefixes monitored.

There is some chance that changes will be made to the protocol itself that eliminate the leak problem in a different way, with just basic configuration.

Whether this chance materializes depends directly on you, on your work and collaboration within the IETF. Thank you.

Source: https://habr.com/ru/post/329308/
