When it comes to roaming, the word usually hides two different processes. In cellular networks, which appeared earlier, roaming means the ability to work in a “foreign” network, not seamless migration between base stations (handover). Moving unnoticed between the base stations (BS) of a cellular network is so natural that hardly anyone thinks about it.
In the WiFi world things are different: roaming usually means movement between access points of the same network that the user does not notice (BSS transition). That said, the widespread introduction of SMS authentication should soon push operators to adopt standards for roaming between foreign WiFi networks, modeled on the cellular infrastructure and based on its subscriber identification.
Below is a description of the existing roaming technologies and of ways to identify them on unfamiliar equipment. The reader is assumed to be familiar with the basic principles of WiFi.
If you judge roaming (that is, handover) in a WiFi network by the standards of cellular networks, the most accurate description is this: there is NONE. It is not provided by the standard, and the situation has not changed for many years. In cellular networks the switch of a subscriber to another BS is initiated by the network controller, based on the client's reports about the signal it sees from neighboring base stations. In WiFi the client always makes the switching decision itself; the access point can only suggest how to do it faster. Still, WiFi has plenty of standardized crutches that quite successfully squeeze the change of access point into 50 ms and keep a voice-over-IP call alive, as well as non-standard developments from each vendor that can either help or make an already sad process worse (Ubiquiti Zero Handoff is an example of a crutch that made things worse than before).
Here one could easily throw a stone at the author: what about 802.11r/v? But they are not mandatory in WiFi, are not supported by all devices, and do not provide anything like a forced handover with bandwidth reservation. The choice of where and when to switch still remains with the client. Moreover, enabling 802.11r makes it impossible for old clients to connect to the network, since it is a mandatory option in the 802.11 frame! In some cases that is not just undesirable but outright harmful (old drivers, scanners, printers and the like).
Theory
After the general introduction, it is worth briefly describing what in WiFi can be useful for roaming (we will keep calling it that), and why.
802.11i
The 2004 amendment, incorporated into the standard in 2007, deals with security and describes authentication and encryption (WPA2). It interests us because the key exchange procedure and the interaction with external resources (RADIUS) together greatly slow down client switching between APs. It also describes the first mechanism for fast reconnection, caching of the PMK, though only for points where the client has already gone through the full procedure once, in other words a quick return to the network.
OKC (Opportunistic Key Caching)
The first of the well-known crutches. During 802.1X authentication the access point keeps the pairwise master key (PMK) for each client; the idea is that this key is passed to neighboring points through the controller, which removes another trip to RADIUS, simplifies the exchange and significantly reduces the time needed to switch to a new point. It is not part of the standard, with all that this implies, but it is supported by all serious WiFi hardware vendors and by some clients. Without client support the function is useless, and it is equally useless for WPA2-PSK. Some vendors try to apply the method anyway when they see a stored key, even if the client did not ask for it in its request; sometimes this works.
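Key caching shows up in a dump as a PMKID that the client places in its (Re)Association Request. As a minimal sketch: 802.11i defines the PMKID as the first 128 bits of HMAC-SHA1 over the string "PMK Name", the AP MAC address and the client MAC address, keyed with the PMK. Because the AP address is part of the input, one cached PMK yields a different PMKID for every access point, and that is the value an OKC client computes for a point it has never visited. The PMK and the MAC addresses below are made up for illustration.

```python
import hashlib
import hmac


def pmkid(pmk: bytes, ap_mac: bytes, client_mac: bytes) -> bytes:
    """First 16 bytes of HMAC-SHA1(PMK, "PMK Name" || AP MAC || client MAC)."""
    digest = hmac.new(pmk, b"PMK Name" + ap_mac + client_mac, hashlib.sha1).digest()
    return digest[:16]


def mac(text: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes.fromhex(text.replace(":", ""))


# Illustrative values only, not taken from the dumps in this article.
PMK = bytes(range(32))
CLIENT = mac("02:00:00:00:00:01")
AP_KITCHEN = mac("02:00:00:00:00:aa")
AP_ROOM = mac("02:00:00:00:00:bb")

id_kitchen = pmkid(PMK, AP_KITCHEN, CLIENT)
id_room = pmkid(PMK, AP_ROOM, CLIENT)
print(id_kitchen.hex())
print(id_room.hex())
```

If the new point (or its controller) holds the same PMK, it can compute the same value and skip the full 802.1X exchange; if the values differ, the client falls back to full authentication.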
802.11k
Radio Resource Management: an amendment of 2008, part of the standard since 2012, optional. The access point signals support with a flag in the Beacon and, when the client asks, sends it a list of neighboring points. The client then does not waste time scanning every available channel; it goes straight to the right one and picks a new point. This saves battery and also improves the state of the air under high load. Together with 802.11v it can make the client's life comfortable enough that no other technology is needed (the client chooses the candidate itself anyway), unless of course you need VoIP and the magic 50 ms on WPA2-Enterprise. Without client support it is useless.
802.11v
Wireless Network Management (WNM): the amendment was published in 2011 and entered the standard in 2012; it contains a large number of options. Its main purpose is efficient management of the wireless environment: exchanging data about the environment between stations, saving client power, and improving roaming and balancing. The client is sent messages listing suitable APs, which addresses overloaded points (load balancing) and “stuck” clients with a weak signal, along with some other features. Assisted Power Saving sets a maximum timeout for the client without requiring frequent keep-alive messages, and the Directed Multicast Service lets the client receive multicast frames at its own connection rate rather than at the cell's rate, which frees the air and saves battery (these functions are not related to roaming). BSS Transition, however, is very relevant. It defines 3 types of messages: a request from the client for a list of suitable points, and two messages from the point: a Load Balancing Request, sent when the point is overloaded and asks the client to move to another one, and an Optimized Roaming Request, sent when the RSSI and data rate do not meet the minimum requirements of the AP. It is important to note that these messages are advisory, and what to do is left to the client's discretion. Forced disconnection is possible only through proprietary Band / Load Steering / Balancing technologies (it is done with Disassociation frames) and may be handled incorrectly by the client or ignored altogether.
Using 802.11k/v together gives a good result, and for most home and small office networks it is sufficient, without creating problems for the variety of client devices. Next comes the heavy artillery, which radically solves the main problem but can cause side effects: 802.11r.
802.11r / FT
Fast Roaming / Fast BSS Transition. When 802.11r is enabled on an access point it is mandatory for the client, i.e. those who do not support it cannot connect. It is a flag in management frames plus a modified key exchange mechanism, so if the client is old and does not know it exists, it has a problem (newer devices sometimes at least understand the flag even without supporting the feature, although according to the standard the protocol must be implemented in full). It can also crash buggy drivers of old client adapters. The cause is the 4-Way Handshake used to distribute the shared key during the initial FT connection; this is what the standard says: " Protocol.
Fast BSS Transition works with RSNA networks (Robust Security Network Association, i.e. WPA2) and with fully open networks. For WPA2-PSK fast roaming loses its point: the client and the point exchange 4 packets either way, so there is nothing to speed up. These calculations do not include the time spent searching for a suitable point, and in the 5 GHz band it can be considerable: 16 channels have to be scanned to find a suitable AP. The general strategy is therefore to use the k/v and r protocols together.
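To see why the search time matters, here is a rough estimate. It assumes a passive scan in which the client waits one default beacon interval (100 TU, i.e. 102.4 ms) on every channel. Real clients use their own dwell times and often probe actively, so treat the result as an order of magnitude only.

```python
BEACON_INTERVAL_MS = 102.4  # default beacon interval: 100 TU of 1024 microseconds
CHANNELS_5GHZ = 16          # channel count quoted in the text
SWITCH_BUDGET_MS = 50       # switching time that keeps a voice call alive


def passive_scan_ms(channels: int, dwell_ms: float = BEACON_INTERVAL_MS) -> float:
    """Time to listen on each channel long enough to hear one beacon (assumption)."""
    return channels * dwell_ms


full_scan = passive_scan_ms(CHANNELS_5GHZ)
one_channel = passive_scan_ms(1)  # the client already knows the target channel
print(f"full scan: {full_scan:.0f} ms, single channel: {one_channel:.0f} ms")
```

Under this assumption a full scan costs more than a second and a half, far more than the key exchange itself, which is why a neighbor list from 802.11k pays off even when 802.11r is not used.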
If you use RADIUS for authorization and want very fast roaming, you have no choice, only 802.11r!
Besides roaming itself, 802.11r offers the possibility of asking a point whether the resources the client needs (QoS) are available and of reserving them. Accordingly, there are two variants of the protocol: FT Protocol and FT Resource Request Protocol. The client can talk to the target point either directly over the air (Over-the-Air) or through its current point and the controllers (Over-the-DS); the second method is slightly slower. In practice, resource (QoS) requests to the point are almost never implemented on clients or used at all.
The most important element of the frame is the MDE (Mobility Domain Element). It is required for successful roaming, which is possible only within a single mobility domain.
Client switching time depending on the standard (“Performance Study of Fast BSS Transition using IEEE 802.11r” by Sangeetha Bangolae, Carol Bell and Emily Qi):
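The MDE is easy to pick out of a dump by hand. A minimal sketch of a parser, based on the element layout in the standard: element ID 54, length 3, a two-byte mobility domain identifier (MDID) and one byte of flags in which bit 0 means FT over the DS and bit 1 means the Resource Request Protocol is supported. The sample elements are made up.

```python
MDE_ELEMENT_ID = 54


def parse_mde(element: bytes) -> dict:
    """Parse a raw Mobility Domain element: ID, length, MDID (2 bytes), flags (1 byte)."""
    if len(element) != 5 or element[0] != MDE_ELEMENT_ID or element[1] != 3:
        raise ValueError("not a Mobility Domain element")
    flags = element[4]
    return {
        "mdid": element[2:4].hex(),
        "ft_over_ds": bool(flags & 0x01),
        "resource_request": bool(flags & 0x02),
    }


def same_mobility_domain(a: bytes, b: bytes) -> bool:
    """Fast transition is only possible between points advertising the same MDID."""
    return parse_mde(a)["mdid"] == parse_mde(b)["mdid"]


# Made-up elements for illustration.
current_ap = bytes([54, 3, 0x12, 0x34, 0x01])
target_ap = bytes([54, 3, 0x12, 0x34, 0x00])
foreign_ap = bytes([54, 3, 0xAB, 0xCD, 0x00])

print(parse_mde(current_ap))
print(same_mobility_domain(current_ap, target_ap))
print(same_mobility_domain(current_ap, foreign_ap))
```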

Keep in mind that this is the “pure” switching time, measured when the client has already decided that the connection is deteriorating and has already found a new point!
The practice of 802.11r roaming is described very well in the article by antonvn, so I see no reason to repeat it.
But the work of the other amendments can be examined by example. Adding a line to a datasheet is not difficult; making that line work is harder. I have a couple of Adtran Bluesocket points (BSAP 1925) at my disposal. This is the lower middle range: it falls well short of the market leaders in functionality, but offers good options for integration into a carrier network along with good stability and performance. If your company has only 2-3 points, they make little sense for you (except when rented with a cloud controller), but for distributed or large networks (10-20+ points) they become interesting.
Cambium sits in the same class. I do not have their points at hand for tests now, but colleagues praise them. Judging by the description, Cambium has slightly more functionality than Bluesocket (802.11r, more tunnel types for user traffic, the ability to run up to 24 points without an external controller, and other small things), while Bluesocket has only 802.11k/v/OKC, with full 802.11r roaming promised in the next software release. Aruba, Cisco and Ruckus predictably can do everything available on the market; the question is whether you will really use it.
Testing cheap equipment is often a thankless task. Edimax was brought to us about a year ago; the stability of its portal manager raised big questions back then, and testing ended there without going deep into the functions. I doubt that in this price category they managed to implement full air monitoring and notification of clients about neighbors; I would be curious if someone could check. Ubiquiti does not support roaming yet, and neither does Mikrotik (which is a pity!).
It should also be noted that a neighbor notification function makes little sense if the point does not know about its neighbors, i.e. you need a background scanning mode that searches for neighbors. In normal mode the points work only on their own channel and simply cannot know about their neighbors! The solution of putting all points on one channel was tried by Ubiquiti, proving in practice what nobody doubted: it is a bad idea, and capacity drops dramatically.
Equipment used
Two Bluesocket BSAP 1925 access points were used. Traffic was captured by two laptops: one ran AirMagnet WiFi Analyzer PRO paired with an AirMagnet PCI Express Card 3x3; the second, catching traffic on the other channel, was a 2016 MacBook with an 802.11ac adapter running Airtool version 1.6. Judging by the dump, it coped with the task.
Why not capture from one laptop? We do have 3 Proxim Orinoco a/b/g/n USB adapters precisely for simultaneous capture on 3 channels, but as it turned out they miss most of the traffic in modern networks: as soon as any recent client or point appears on the air, the analyzer stops seeing most of the traffic. We tried to find out why and eventually gave up before getting to the bottom of it; something changes in the frame, probably because of 802.11ac. The vendor reports that this is a physical limitation of the adapters and that there will be no fix, so keep it in mind! Airmagnet has recently released a software update and new USB adapters for it, but we do not have them yet.
You probably do not have them either, nor the Airmagnet analyzer, but do you really need it? Everything described below can be seen with any device that can switch to monitor mode and parse all traffic in the 5 GHz band. To understand the real transition time you need to capture 2 channels on one computer with two independent adapters: with 2 different machines it is extremely hard to synchronize the clocks accurately (I am not sure of the exact figure, but we are talking about milliseconds), and a single adapter hopping between two channels loses half of the traffic.
Testing was done at friends' home: one point stood in the kitchen, the other in a room; they were separated by a load-bearing wall and saw each other with a minimal signal. The environment and the presence of neighboring networks resemble a small office. The phone switched to the point in the room almost immediately after entering it. To make the task harder, the transition was made quickly, and the signal dropped sharply right after leaving the room around the corner. This scheme tested mostly the client's behavior; the points did not have time to engage the balancing system.
Work with dumps
The dumps were examined in Wireshark, which is free and available to everyone.
802.11k
The point announces its ability to send a list of neighbors in Beacon frames:
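If you only have the raw bytes of a Beacon, the same flag can be checked by hand. A minimal sketch, assuming the usual layout: 802.11k support is announced by the Radio Measurement bit (bit 12) of the 16-bit Capability Information field, and the RM Enabled Capabilities element (ID 70) details the features, with bit 1 of its first byte standing for Neighbor Report. The sample values are made up.

```python
RADIO_MEASUREMENT_BIT = 12        # bit in the Capability Information field
RM_ENABLED_CAPABILITIES_ID = 70   # element describing individual 802.11k features
NEIGHBOR_REPORT_MASK = 0x02       # bit 1 of the first capabilities byte


def announces_radio_measurement(capability_info: bytes) -> bool:
    """capability_info: the 2-byte little-endian field from a Beacon."""
    value = int.from_bytes(capability_info[:2], "little")
    return bool(value & (1 << RADIO_MEASUREMENT_BIT))


def supports_neighbor_report(element: bytes) -> bool:
    """element: a raw RM Enabled Capabilities element including ID and length."""
    if len(element) < 3 or element[0] != RM_ENABLED_CAPABILITIES_ID:
        return False
    return bool(element[2] & NEIGHBOR_REPORT_MASK)


# Made-up sample values for illustration.
with_11k = (0x1411).to_bytes(2, "little")
without_11k = (0x0411).to_bytes(2, "little")
rm_element = bytes([70, 5, 0x72, 0x00, 0x00, 0x00, 0x00])

print(announces_radio_measurement(with_11k))
print(announces_radio_measurement(without_11k))
print(supports_neighbor_report(rm_element))
```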

If the client wants a list of points for its SSID, it sends an Action frame. In my case the client requests the neighbor list after connecting to the SSID (Wireshark filter by frame type: wlan.fc.type_subtype eq 13):
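The number 13 in the filter is not arbitrary. As a small sketch of where it comes from: the first byte of the 802.11 Frame Control field carries the frame type in bits 2-3 and the subtype in bits 4-7, and Wireshark combines them into a single value as type * 16 + subtype. The function name is my own; the sample bytes are the usual first bytes of the named frames.

```python
def type_subtype(first_fc_byte: int) -> int:
    """Combine 802.11 frame type and subtype the way wlan.fc.type_subtype does."""
    frame_type = (first_fc_byte >> 2) & 0x3   # 0 = management, 1 = control, 2 = data
    subtype = (first_fc_byte >> 4) & 0xF
    return (frame_type << 4) | subtype


# First Frame Control byte of some frames seen in this article.
SAMPLES = {
    "Probe Request": 0x40,
    "Beacon": 0x80,
    "Disassociation": 0xA0,
    "Action": 0xD0,
    "QoS Data": 0x88,
}

for name, first_byte in SAMPLES.items():
    print(f"{name}: wlan.fc.type_subtype eq {type_subtype(first_byte)}")
```

The same values are handy for the later dumps, for example when filtering Probe Requests or Disassociation frames.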

The response from the current AP, a Neighbor List Report, tells the client which points to look for and on which channels:
802.11v
I did not manage to find traces of 802.11v at work: to trigger balancing, the point has to be loaded heavily and then you wait to see what happens on the air, and this time that was not possible. Bluesocket describes its balancing system as adaptive: it always lets the client connect to the radio it wants and then moves it if necessary. It makes no sense to push everyone to 5 GHz by default when 2.4 GHz is empty; besides, the client does not always have enough signal to use 5 GHz while being kept from connecting to 2.4 GHz. From experience, balancing works, but I have not yet caught it in a full dump. This time only Disassociation messages were caught, sent after the signal dropped and the client stopped responding, but by then the client had already reconnected on its own. Recent devices, like my Xperia Z5, connect to 5 GHz right away, and all new Apple devices behave the same way. I limited myself to checking the delivery of the neighbor list and dumping the roaming on two channels simultaneously. While analyzing the switch I noticed an interesting thing: the device delayed the transmission of certain packets when the channel was already established and working, so application traffic was absent for a long time. So when testing a specific application for real, you must take into account the peculiarities of its operation and of your device's network stack; it is quite possible that your WiFi is not to blame for the delay!
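What Wireshark shows in this report can also be decoded by hand. A minimal sketch, assuming the fixed part of the Neighbor Report element as laid out in the standard: element ID 52, then BSSID (6 bytes), BSSID Information (4 bytes), operating class, channel number and PHY type (1 byte each), optionally followed by subelements that this sketch ignores. The sample element is made up.

```python
import struct

NEIGHBOR_REPORT_ID = 52


def parse_neighbor_report(element: bytes) -> dict:
    """Decode the fixed 13-byte body of a raw Neighbor Report element."""
    if len(element) < 15 or element[0] != NEIGHBOR_REPORT_ID or element[1] < 13:
        raise ValueError("not a Neighbor Report element")
    bssid = element[2:8]
    bssid_info, operating_class, channel, phy_type = struct.unpack_from("<IBBB", element, 8)
    return {
        "bssid": ":".join(f"{octet:02x}" for octet in bssid),
        "bssid_info": bssid_info,
        "operating_class": operating_class,
        "channel": channel,
        "phy_type": phy_type,
    }


# Made-up element: a neighbor on channel 44.
sample = (
    bytes([52, 13])
    + bytes.fromhex("0200000000bb")
    + (0).to_bytes(4, "little")
    + bytes([115, 44, 9])
)

neighbor = parse_neighbor_report(sample)
print(neighbor["bssid"], "on channel", neighbor["channel"])
```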
Features of the client stack
Now for the most interesting part. This is the dump from channel 44, to which the client switched. It shows that 46 milliseconds pass from the first request to the successful key exchange, so with WPA2 and a pre-shared key 802.11r is simply not needed. Everything depends on how quickly the client realizes that it needs to switch and finds the right point. But that is not the most interesting thing: the test application's traffic was absent for another 3 seconds! For clarity, ping was run at an interval of 15 ms; the interval was not always kept because of the nature of WiFi and the lack of traffic prioritization (Best Effort). Ideally, of course, something more sensible should be tested, but the program for running ping was already on the device, so we made do with it.
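The two numbers above are simple differences between packet timestamps in the dump. A minimal sketch of that arithmetic follows; the event list is illustrative and only reproduces the 46 ms and roughly 3 s reported here, it is not an export of the real capture, and the event names are my own labels.

```python
# (timestamp in ms from the first authentication frame, event label)
# Illustrative timeline, not real capture data.
events = [
    (0, "auth_request"),
    (46, "key_exchange_done"),
    (60, "other_traffic"),
    (3046, "icmp_request"),
]


def first_time(timeline, label):
    """Timestamp of the first event with the given label."""
    return next(t for t, name in timeline if name == label)


handover_ms = first_time(events, "key_exchange_done") - first_time(events, "auth_request")
app_gap_ms = first_time(events, "icmp_request") - first_time(events, "key_exchange_done")

print(f"WiFi handover: {handover_ms} ms")
print(f"application silence after handover: {app_gap_ms} ms")
```

Separating the two intervals is the point of the exercise: the first belongs to WiFi, the second to the client's network stack and the application.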
Authentication and successful connection:

After connecting, network traffic appears, but it is not ICMP: these are some other packets! Only 3 seconds later do the ICMP requests appear:

This is what happens on the original access point at the same time. It is hard to say whether the client started connecting to the new point before fully disconnecting from the original one, as the dump suggests, because the timestamps may not be accurate:

After the access point receives the last packets from the client at a signal level of -80 dBm and the client then fails to acknowledge several packets, the point sends it Disassociation messages. The client is probably already transmitting successfully on the new channel by then: nothing prevents it from switching to that channel to scan for available points without disconnecting from the current one, and in this case the scan does not take long.
Visually some switching delay is present and the pings hang, but as the dump showed, this is not a WiFi problem. The device never fully disconnected from the network; the signal drops at the transition between the rooms and then quickly returns to high values.
If the BSS Transition functionality is supported, its presence is visible in the dump as the corresponding flag in the client's Probe Request frame:
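Without Wireshark the same flag can be read from the raw bytes. A minimal sketch, assuming the standard layout: the flag lives in the Extended Capabilities element (ID 127), where BSS Transition is bit 19, i.e. bit 3 of the third capability byte. The sample elements are made up.

```python
EXTENDED_CAPABILITIES_ID = 127
BSS_TRANSITION_BIT = 19


def supports_bss_transition(element: bytes) -> bool:
    """element: a raw Extended Capabilities element including ID and length."""
    if len(element) < 2 or element[0] != EXTENDED_CAPABILITIES_ID:
        return False
    body = element[2:2 + element[1]]
    byte_index, bit_index = divmod(BSS_TRANSITION_BIT, 8)
    if len(body) <= byte_index:
        return False  # element too short to carry the flag at all
    return bool(body[byte_index] & (1 << bit_index))


# Made-up elements for illustration.
client_with_11v = bytes([127, 3, 0x00, 0x00, 0x08])
client_without_11v = bytes([127, 3, 0x00, 0x00, 0x00])
short_element = bytes([127, 1, 0x04])

print(supports_bss_transition(client_with_11v))
print(supports_bss_transition(client_without_11v))
print(supports_bss_transition(short_element))
```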

Findings?
Do not chase technology for its own sake; it does not always play the decisive role. Even with the most fashionable WiFi points, the client has the last word. Using the information above, you can check for yourself whether your points meet your needs and deliver the functionality declared in their description, and choose the technologies you actually need.
Proper placement of the points and careful network planning give good results even with low-cost equipment, while a thoughtless installation can easily ruin a project built on top-end hardware.