
Wi-Fi seamless roaming

The boss calls you into a meeting and tells you to bring your laptop. It seems like nothing special: the same office wireless network covers both your desk and the meeting room. But on arrival you find that the large file you were downloading has been cut off, your SSH sessions have closed, and the web form you so carefully filled in was lost on submit. Sound familiar?
Today we will talk about seamless roaming of devices in Wi-Fi wireless networks.


Roaming is the process of a device reconnecting to a wireless network as it moves through space. The received power of a radio signal falls with distance from the transmitter; as a result the effective data rate drops and channel errors grow, up to the point where the wireless connection breaks. If the radio network has more than one access point with the same name (SSID), then when a mobile client moves from the zone reliably covered by the first access point into the zone where the second access point's signal is better (higher power, higher signal-to-noise ratio), such a reconnection takes place.

The decision to reconnect is always made by the client device (the Wi-Fi adapter driver). The access point can only “hint” to the device that roaming is desirable. Some drivers expose a roaming “aggressiveness” parameter in their settings. At initial connection, however, a centrally controlled system can “force” the client onto the preferred (in terms of load) access point and onto the desired channel or band.
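
The client-side decision can be sketched roughly as follows. This is not the algorithm of any real driver: the function name `pick_bss`, the -70 dBm scan threshold, and the hysteresis value standing in for the “aggressiveness” setting are all assumptions made for illustration.

```python
def pick_bss(current: str, rssi_dbm: dict[str, float],
             scan_threshold_dbm: float = -70.0,
             hysteresis_db: float = 10.0) -> str:
    """Return the access point the client should be associated with.

    Hypothetical logic: stay put while the current signal is good enough;
    otherwise roam only if the best candidate beats the current access
    point by at least `hysteresis_db`. A smaller hysteresis models a more
    "aggressive" client.
    """
    if not rssi_dbm:
        return current  # nothing heard on the air, nothing to decide
    current_rssi = rssi_dbm.get(current, float("-inf"))
    if current_rssi >= scan_threshold_dbm:
        return current  # signal is fine, do not look for candidates
    best = max(rssi_dbm, key=rssi_dbm.get)
    if best != current and rssi_dbm[best] - current_rssi >= hysteresis_db:
        return best
    return current


# The client is on ap2; its signal degrades while ap1 stays strong.
print(pick_bss("ap2", {"ap1": -55.0, "ap2": -60.0}))  # stays on ap2
print(pick_bss("ap2", {"ap1": -55.0, "ap2": -80.0}))  # roams to ap1
```

The hysteresis is what keeps a client from flapping between two access points of similar strength, at the cost of holding on to a weak one for longer.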

Seamless roaming is a roaming mechanism in which the data lost at the moment of switching from one access point to another is minimal or zero, and the TCP/IP stack of the client operating system does not even notice the switch. Such a mechanism matters for applications that are sensitive to delay and loss, such as voice over the wireless network (Voice over Wireless), streaming video, and large data transfers, and in general for every case where TCP cannot “digest” a temporary loss of the data channel.
We will run experiments and look at seamless roaming as implemented by a centrally controlled wireless network built on Juniper Wireless equipment, which was discussed in the introductory article. This is an enterprise-class system designed specifically for seamless roaming. Then we will “break” the seamlessness and show what that leads to, and what behavior we can expect from “home-class” devices.

Our network consists of an ordinary Windows 7 laptop with an integrated Intel WifiLink 5100abg card, two Juniper MX-8 wireless controllers in one LAN segment, and two WLA532 access points, each attached to its own controller. Traffic to the laptop is generated from a Linux server with ping -f -s 1000 or ping -s 100 -i 0.05. The same laptop performs spectrum analysis (Wi-Spy DBx / Chanalyzer Pro) and captures 802.11 frames (OmniWiFi / OmniPeek).
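
With a fixed ping interval, the length of a roaming outage can be estimated from the gaps in the sequence numbers of the replies. The sketch below assumes the usual Linux ping reply lines containing `icmp_seq=N` (so it applies to the -i 0.05 variant, not to flood mode, which prints dots); the function name `find_outages` and the sample output are made up for illustration. The resolution of the estimate is one interval, 50 ms here.

```python
import re

SEQ_RE = re.compile(r"icmp_seq=(\d+)")


def find_outages(ping_output: str, interval_s: float = 0.05):
    """Return (first_missing_seq, missing_count, approx_outage_s) tuples.

    A gap of N missing replies is taken to mean an outage of about
    N * interval_s seconds. Out-of-order or duplicate replies are ignored.
    """
    seqs = [int(m.group(1)) for m in SEQ_RE.finditer(ping_output)]
    outages = []
    for prev, cur in zip(seqs, seqs[1:]):
        missing = cur - prev - 1
        if missing > 0:
            outages.append((prev + 1, missing, missing * interval_s))
    return outages


# Invented sample: replies 4 and 5 were lost during a roam.
sample = """\
108 bytes from 172.16.130.128: icmp_seq=1 ttl=128 time=2.1 ms
108 bytes from 172.16.130.128: icmp_seq=2 ttl=128 time=1.9 ms
108 bytes from 172.16.130.128: icmp_seq=3 ttl=128 time=2.4 ms
108 bytes from 172.16.130.128: icmp_seq=6 ttl=128 time=3.0 ms
108 bytes from 172.16.130.128: icmp_seq=7 ttl=128 time=2.2 ms
"""
for first, count, seconds in find_outages(sample):
    print(f"lost {count} replies starting at seq {first}, ~{seconds * 1000:.0f} ms")
```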



To authenticate clients properly (that is, with WPA2 Enterprise) using 802.1X, we bring up a FreeRADIUS server and configure it for PEAP/MSCHAPv2. Running the server as freeradius -X also lets us watch the traffic between the wireless controller and the RADIUS server and track full authentication events and accounting messages. The local user database is a text file with passwords.
The controller settings for authentication in our “DOT1X” network are identical and simple:

 set service-profile Secure-DOT1X ssid-name DOT1X
 set service-profile Secure-DOT1X 11n short-guard-interval disable
 set service-profile Secure-DOT1X rsn-ie cipher-ccmp enable
 set service-profile Secure-DOT1X rsn-ie enable
 set service-profile Secure-DOT1X attr vlan-name default
 set radius server debian64 address 172.16.130.13 timeout 5 retransmit 3 deadtime 5 encrypted-key 0832494d1b1c11
 set radius server debian64 mac-addr-format colons
 set server group debian64-group members debian64
 set accounting dot1x ssid DOT1X ** start-stop debian64-group
 set authentication dot1x ssid DOT1X ** pass-through debian64-group
 set radio-profile default service-profile Secure-DOT1X
Full config of the first controller
# Configuration nvgen'd at 2013-6-28 22:18:27
# Image 8.0.2.2.0
# Model MX-8
# Last change occurred at 2013-6-28 19:56:52
set ip route default 172.16.130.1 1
set ip dns enable
set ip dns server 8.8.8.8 PRIMARY
set log server 172.16.130.100 severity error
set system name WLC-1
set system ip-address 172.16.130.30
set system countrycode RU
set timezone MSK 4 0
set service-profile Secure-DOT1X ssid-name DOT1X
set service-profile Secure-DOT1X 11n short-guard-interval disable
set service-profile Secure-DOT1X rsn-ie cipher-ccmp enable
set service-profile Secure-DOT1X rsn-ie enable
set service-profile Secure-DOT1X attr vlan-name default
set radius server debian64 address 172.16.130.13 timeout 5 retransmit 3 deadtime 5 encrypted-key 0832494d1b1c11
set radius server debian64 mac-addr-format colons
set server group debian64-group members debian64
set enablepass password ...
set accounting dot1x ssid DOT1X ** start-stop debian64-group
set authentication dot1x ssid DOT1X ** pass-through debian64-group
set user anton password encrypted ...
set radio-profile default 11n channel-width-na 20MHz
set radio-profile default service-profile Secure-DOT1X
set ap auto mode enable
set ap 2 serial-id mg0211508096 model WLA532-WW
set ap 2 name WLA-2
set ap 2 blink enable
set ap 2 fingerprint 1a:fb:2e:d2:ab:e0:59:87:a7:3c:2a:20:ec:2a:9b:cc
set ap 2 time-out 900
set ap 2 remote-ap wan-outage mode enable
set ap 2 remote-ap wan-outage extended-timeout 10h
set ap 2 radio 1 tx-power 5 mode enable
set ap 2 radio 2 mode disable
set ap 2 local-switching mode enable vlan-profile default
set ap 5 port 5 model WLA532-WW
set ap 5 radio 1 tx-power 5 mode enable
set ap 5 radio 2 mode disable
set ip snmp server enable
set port poe 5 enable
set snmp protocol v1 disable
set snmp protocol v2c enable
set vlan 1 port 1
set vlan 1 port 2
set vlan 1 port 3
set vlan 1 port 4
set vlan 1 port 6
set vlan 1 port 7
set vlan 1 port 8
set interface 1 ip 172.16.130.30 255.255.255.0
set snmp community name CommunityRO access read-only
set mobility-domain mode seed domain-name LocalMobilityDomain
set mobility-domain member 172.16.130.31
Full config of the second controller
# Configuration nvgen'd at 2013-6-28 18:05:38
# Image 8.0.2.2.0
# Model MX-8
# Last change occurred at 2013-6-28 17:56:28
set ip route default 172.16.130.1 1
set system name WLC-2
set system ip-address 172.16.130.31
set system idle-timeout 0
set system countrycode RU
set service-profile Secure-DOT1X ssid-name DOT1X
set service-profile Secure-DOT1X 11n short-guard-interval disable
set service-profile Secure-DOT1X rsn-ie cipher-ccmp enable
set service-profile Secure-DOT1X rsn-ie enable
set service-profile Secure-DOT1X attr vlan-name default
set radius server debian64 address 172.16.130.13 timeout 5 retransmit 3 deadtime 5 encrypted-key 0832494d1b1c11
set radius server debian64 mac-addr-format colons
set server group debian64-group members debian64
set enablepass password ...
set accounting dot1x ssid DOT1X ** start-stop debian64-group
set authentication dot1x ssid DOT1X ** pass-through debian64-group
set user anton password encrypted ...
set radio-profile default wmm-powersave enable
set radio-profile default 11n channel-width-na 20MHz
set radio-profile default service-profile Secure-DOT1X
set ap auto mode disable
set ap auto blink enable
set ap 2 serial-id mg0211508096 model WLA532-WW
set ap 2 name WLA-2
set ap 2 fingerprint 1a:fb:2e:d2:ab:e0:59:87:a7:3c:2a:20:ec:2a:9b:cc
set ap 2 time-out 900
set ap 2 remote-ap wan-outage mode enable
set ap 2 remote-ap wan-outage extended-timeout 10h
set ap 2 radio 1 channel 11 tx-power 5 mode enable
set ap 2 radio 2 mode disable
set ap 2 local-switching mode enable vlan-profile default
set load-balancing mode disable
set port poe 5 enable
set port poe 6 enable
set vlan 1 port 1
set vlan 1 port 2
set vlan 1 port 3
set vlan 1 port 4
set vlan 1 port 5
set vlan 1 port 7
set vlan 1 port 8
set interface 1 ip 172.16.130.31 255.255.255.0
set mobility-domain mode member seed-ip 172.16.130.30
set security acl name portalacl permit udp 0.0.0.0 255.255.255.255 eq 68 0.0.0.0 255.255.255.255 eq 67
set security acl name portalac 0.0.0.0 capture 255.255.255.255 capture
commit security acl portalacl

We position the laptop so that the signals it receives from the two access points (on channels 6 and 11 of the 2.4 GHz b/g band) are roughly equal, connect to the network, start ping, and look at the distribution of energy on the air:



To test roaming, the laptop must start receiving a signal from its current access point that is significantly worse than the signal from another access point with the same SSID and encryption settings. I could have carried the laptop around, but that was inconvenient, so instead I carried one of the access points (on a long Ethernet cable) closer to the laptop or around a concrete corner, and attenuated the other one's signal by covering it with three steel pots nested like a matryoshka doll. Each pot reduced the signal by 3-4 dB. As a result, at some point the client “jumped” to the other access point:
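
To get a feel for what three pots do, the decibel figures can be converted to a power ratio. The short calculation below assumes that the attenuation of the nested pots simply adds up in dB, which is an idealization; the function name is made up for illustration.

```python
def db_to_power_ratio(attenuation_db: float) -> float:
    """Fraction of the original power left after `attenuation_db` of loss."""
    return 10 ** (-attenuation_db / 10)


for per_pot_db in (3, 4):
    total_db = 3 * per_pot_db  # three nested pots, losses assumed additive
    remaining = db_to_power_ratio(total_db)
    print(f"{per_pot_db} dB per pot -> {total_db} dB total, "
          f"about 1/{1 / remaining:.0f} of the original power")
```

Under that assumption the pots cut the received power by a factor of roughly 8 to 16.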



Packet analysis on the air shows the following picture. (For OmniPeek to see all of the traffic, I had to repeat every experiment with both access points fixed on channel 11; otherwise, while scanning channels {6, 11}, I would have missed the most interesting part.)
The laptop looks for a better access point (probe request, frame 79086) and receives responses from both, with signal levels of 23% (current) and 63% (candidate).
The last useful frame, 79103, goes to the server via ap2, after which frames 79122-79136 carry a quick switch to ap1, including authentication, reassociation, and the EAPOL exchange.



The reassociation request in frame 79126 contains a PMKID (Pairwise Master Key Identifier), which, roughly speaking, identifies this wireless session. If the access points work together (under a single controller, or under controllers that communicate with each other), the “new” access point looks up the received identifier in its tables and, if it is found, skips the full 802.1X authentication and allows data exchange immediately.
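
For WPA2, the PMKID is defined as the first 128 bits of HMAC-SHA1 over the string "PMK Name", the access point MAC address, and the client MAC address, keyed with the PMK. The sketch below shows one plausible way a check against a shared key cache could work; it is not claimed to be how the Juniper controllers implement it. The cache layout, the function names, and the key value are assumptions, while the MAC addresses are taken from the RADIUS logs in this article.

```python
import hashlib
import hmac


def mac_bytes(mac: str) -> bytes:
    """Convert 'AA-BB-CC-DD-EE-FF' or 'aa:bb:cc:dd:ee:ff' to 6 raw bytes."""
    return bytes.fromhex(mac.replace("-", "").replace(":", ""))


def compute_pmkid(pmk: bytes, ap_mac: str, client_mac: str) -> bytes:
    """PMKID = first 128 bits of HMAC-SHA1(PMK, "PMK Name" | AP MAC | client MAC)."""
    data = b"PMK Name" + mac_bytes(ap_mac) + mac_bytes(client_mac)
    return hmac.new(pmk, data, hashlib.sha1).digest()[:16]


def can_skip_eap(pmk_cache: dict, ap_mac: str, client_mac: str,
                 offered_pmkid: bytes) -> bool:
    """Hypothetical check: is a PMK cached for this client, and does it match?"""
    pmk = pmk_cache.get(mac_bytes(client_mac))
    if pmk is None:
        return False  # unknown client: full 802.1X authentication is needed
    expected = compute_pmkid(pmk, ap_mac, client_mac)
    return hmac.compare_digest(expected, offered_pmkid)


CLIENT = "00-21-5D-C8-06-8A"  # laptop MAC from the logs
NEW_AP = "78-19-F7-7C-6A-40"  # one of the access point MACs from the logs
pmk = bytes(range(32))        # made-up 256-bit key, for illustration only

shared_cache = {mac_bytes(CLIENT): pmk}  # controllers share client context
offered = compute_pmkid(pmk, NEW_AP, CLIENT)
print(can_skip_eap(shared_cache, NEW_AP, CLIENT, offered))  # True
print(can_skip_eap({}, NEW_AP, CLIENT, offered))            # False
```

The second call models an access point that knows nothing about the session: the identifier offered by the client is perfectly valid, but there is nothing to check it against.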



In our case the first useful frame through the new access point ap1, frame 79138, went out 90 milliseconds after the last one through the old access point. The RADIUS server received only an accounting update message (Interim-Update), sent by controller 172.16.130.30:
 rad_recv: Accounting-Request packet from host 172.16.130.30 port 20000, id=143, length=264
Acct-Status-Type = Interim-Update
Acct-Multi-Session-Id = "SESS-30-d44e4b-437681-7f4d10"
Acct-Session-Id = "SESS-30-d44e4b-437681-7f4d10"
User-Name = "test"
Event-Timestamp = "Jun 28 2013 20:56:42 MSK"
Trapeze-VLAN-Name = "default"
Calling-Station-Id = "00-21-5D-C8-06-8A"
NAS-Port-Id = "AP5/1"
Called-Station-Id = "78-19-F7-7C-6A-40:DOT1X"
Trapeze-Attr-19 = 0x77696e646f777337
Trapeze-Attr-21 = 0x77696e646f7773
NAS-Port = 33
Framed-IP-Address = 172.16.130.128
Acct-Session-Time = 921
Acct-Output-Octets = 246429867
Acct-Input-Octets = 238261281
Acct-Output-Packets = 232900
Acct-Input-Packets = 247341
NAS-Port-Type = Wireless-802.11
NAS-IP-Address = 172.16.130.30
NAS-Identifier = "Trapeze"
Acct-Delay-Time = 0
# Executing section preacct from file /etc/freeradius/sites-enabled/default
+- entering group preacct {...}
++ [preprocess] returns ok
[acct_unique] Hashing 'NAS-Port = 33, Client-IP-Address = 172.16.130.30, NAS-IP-Address = 172.16.130.30, Acct-Session-Id = "SESS-30-d44e4b-437681-7f4d10", User-Name = "test"'
[acct_unique] Acct-Unique-Session-ID = "c1a67d40e7b54bea".
++ [acct_unique] returns ok
[suffix] No '@' in User-Name = "test", looking up realm NULL
[suffix] No such realm "NULL"
++ [suffix] returns noop
++ [files] returns noop
# Executing section accounting from file /etc/freeradius/sites-enabled/default
+- entering group accounting {...}
[detail] expand: /var/log/freeradius/radacct/%{Client-IP-Address}/detail-%Y%m%d -> /var/log/freeradius/radacct/172.16.130.30/detail-20130628
[detail] /var/log/freeradius/radacct/%{Client-IP-Address}/detail-%Y%m%d expands to /var/log/freeradius/radacct/172.16.130.30/detail-20130628
[detail] expand: %t -> Fri Jun 28 20:56:16 2013
++ [detail] returns ok
++ [unix] returns noop
[radutmp] expand: /var/log/freeradius/radutmp -> /var/log/freeradius/radutmp
[radutmp] expand: %{User-Name} -> test
++ [radutmp] returns ok
++ [exec] returns noop
[attr_filter.accounting_response] expand: %{User-Name} -> test
attr_filter: Matched entry DEFAULT at line 12
++ [attr_filter.accounting_response] returns updated
Sending Accounting-Response ID 143 to 172.16.130.30 port 20000
Finished request 269.
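
The accounting attributes in such a message are easy to pull out of the debug log. The sketch below parses `Name = value` lines and computes the average downstream rate over the session from Acct-Output-Octets (octets sent toward the client) and Acct-Session-Time; the helper names are made up, and the sample is a subset of the attributes shown above.

```python
def parse_attributes(log_text: str) -> dict[str, str]:
    """Collect 'Name = value' attribute lines; skip every other log line."""
    attrs = {}
    for line in log_text.splitlines():
        name, sep, value = line.strip().partition(" = ")
        if sep and " " not in name:  # attribute names contain no spaces
            attrs[name] = value.strip('"')
    return attrs


def average_rate_mbit(attrs: dict[str, str],
                      octets_key: str = "Acct-Output-Octets") -> float:
    """Average rate over the whole session, in Mbit/s."""
    octets = int(attrs[octets_key])
    seconds = int(attrs["Acct-Session-Time"])
    return octets * 8 / seconds / 1e6


sample = """\
rad_recv: Accounting-Request packet from host 172.16.130.30 port 20000, id=143, length=264
Acct-Status-Type = Interim-Update
User-Name = "test"
Acct-Session-Time = 921
Acct-Output-Octets = 246429867
Acct-Input-Octets = 238261281
"""
attrs = parse_attributes(sample)
print(attrs["Acct-Status-Type"], attrs["User-Name"])
print(f"{average_rate_mbit(attrs):.2f} Mbit/s toward the client on average")
```

For the session above this gives an average of about 2.14 Mbit/s over 921 seconds, which is the test ping traffic.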

All of this worked so quickly only because both access points (or rather, the controllers serving them) share a common database of connected, active client devices. To achieve this, the controllers are combined into a mobility domain: a name is chosen and the initial candidate for the leading role (the seed) is designated:

WLC-1:
 set mobility-domain mode seed domain-name LocalMobilityDomain
 set mobility-domain member 172.16.130.31

WLC-2:
 set mobility-domain mode member seed-ip 172.16.130.30

An example of a functioning mobility domain:
 WLC-1# show mobility-domain
 Mobility Domain name: LocalMobilityDomain
 Flags: u = up [2], d/e = down/config error [0], c = cluster enabled [0],
        p = primary seed, s = secondary seed (S = cluster preempt mode enabled),
        a = mobility domain active seed, A = cluster active seed (if different),
        m = member, y = syncing, w = waiting to sync, n = sync completed,
        f = sync failed
 Member: * = switch behind NAT
 Member           Flags Model    Version    NoAPs APCap
 ---------------- ----- -------- ---------- ----- -----
 172.16.130.30    upa-- MX-8     8.0.2.2    1     12
 172.16.130.31    um--- MX-8     8.0.2.2    1     12


When clients roam between access points, the controllers exchange each client's context, including the history of its movements:

 Roaming history:
   Switch          AP/Radio    Association time   Duration
   --------------- ----------- ------------------ ------------
 * 172.16.130.30   5/1         06/28/13 22:12:22  00:06:34
   172.16.130.31   2/1         06/28/13 21:57:28  00:14:54
   172.16.130.30   5/1         06/28/13 22:08:56
Details in full
WLC-1# sh sessions network verbose

1 sessions total

Name: test
Session ID: 42
Global ID: SESS-41-d44e4b-442936-c9f222
Login type: dot1x
SSID: DOT1X
IP: 172.16.130.128
MAC: 00:21:5d:c8:06:8a
AP/Radio: 5/1 (Port 5)
State: ACTIVE
Session tag: 1
Host name: Cartman
Vlan name: default (AAA)
Device type: windows7 (AAA)
Device group: windows (AAA)
Up time: 00:09:59

Roaming history:
  Switch          AP/Radio    Association time   Duration
  --------------- ----------- ------------------ ------------
* 172.16.130.30   5/1         06/28/13 22:12:22  00:06:34
  172.16.130.31   2/1         06/28/13 21:57:28  00:14:54
  172.16.130.30   5/1         06/28/13 22:08:56

Session Start: Fri Jun 28 22:08:57 2013 MSK
Last Auth Time: Fri Jun 28 22:08:57 2013 MSK
Last Activity: Fri Jun 28 22:18:56 2013 MSK (<15s ago)
Session Timeout: 0
Idle Time-To-Live: 179
EAP Method: PASSTHRU, using server 172.16.130.13
Protocol: 802.11 WMM
Session CAC: disabled
Stats age: 0 seconds
Radio type: 802.11g
Last packet rate: 54.0 Mb/s
Last packet RSSI: -52 dBm
Last packet SNR: 43
Power Save: disabled
Voice Queue: IDLE

                   Packets       Bytes
                   -------       -----
Rx Unicast           44824    21023372
Rx Multicast           249       31159
Rx Encrypt Err           0           0
Tx Unicast           22615     7289459
Rx peak A-MSDU           0           0
Rx peak A-MPDU           0           0
Tx peak A-MSDU           0           0
Tx peak A-MPDU           0           0

Queue        Tx Packets  Tx Dropped  Re-Transmit  Rx Dropped
-----        ----------  ----------  -----------  ----------
Background        17015           0         1641           0
BestEffort         5745           0          495           0
Video                 0           0            0           0
Voice                11           0            1           0

11n Capabilities:
Max Rx A-MSDU size: 0K
Max Rx A-MPDU size: 0K
Max Channel Width: 20MHz
SM power save: none
TxBeamformer: All
TxBeamformee: NonComp-Unknown, Comp-Unknown



Now it is time to model a typical office case, where access points serve the same network but know nothing about each other. Breaking is easier than building: we remove the mobility domain membership on one of the controllers (clear mobility-domain mode member). A roaming event triggered in the same way as before now has more serious consequences:



Although the client sends the correct PMKID in the reassociation request (27818), the new access point confirms the association (27820) and immediately demands a full 802.1X re-authentication (EAP, 27823). This sets off a long chain of events, including a disassociation message to the “old” access point (27887) and a full authentication cycle on the RADIUS server:
rad_recv: Access-Request packet from host 172.16.130.31 port 20000, id=132, length=139
NAS-Port-Id = "AP2/1"
Calling-Station-Id = "00-21-5D-C8-06-8A"
Called-Station-Id = "78-19-F7-75-5F-80:DOT1X"
Service-Type = Framed-User
EAP-Message = 0x020100090174657374
User-Name = "test"
NAS-Port = 19
NAS-Port-Type = Wireless-802.11
NAS-IP-Address = 172.16.130.31
NAS-Identifier = "Trapeze"
Message-Authenticator = 0xaf02c98034a727ae8cc5063bcda80c39
# Executing section authorize from file /etc/freeradius/sites-enabled/default
+- entering group authorize {...}
++ [preprocess] returns ok
++ [chap] returns noop
++ [mschap] returns noop
++ [digest] returns noop
[suffix] No '@' in User-Name = "test", looking up realm NULL
[suffix] No such realm "NULL"
++ [suffix] returns noop
[eap] EAP packet type response id 1 length 9
[eap] No EAP Start, assuming it's an on-going EAP conversation
++ [eap] returns updated
[files] users: Matched entry test at line 206
++ [files] returns ok
++ [expiration] returns noop
++ [logintime] returns noop
[pap] WARNING: Auth-Type already set. Not setting to PAP
++ [pap] returns noop
Found Auth-Type = EAP
# Executing group from file /etc/freeradius/sites-enabled/default
+- entering group authenticate {...}
[eap] EAP Identity
[eap] processing type md5
rlm_eap_md5: Issuing Challenge
++ [eap] returns handled
Sending Access-Challenge of id 132 to 172.16.130.31 port 20000
... and so on for three more screens of log output.
As a result, the effective interruption in data transmission was 332 milliseconds, compared with 90 milliseconds in the seamless case. And in our experiment the RADIUS server used a local database: it did not query a slow SQL server, did not ask Active Directory for permission, and did not transfer, verify, or compare X.509 certificates.

In this article we did not cover the various standard or vendor-specific mechanisms that “assist” client roaming, such as 802.11r Fast BSS Transition (FT), Neighbor Reports, and so on. Those interested can read this series of articles.

Summarize:

Source: https://habr.com/ru/post/185138/

