
A resilient channel built on a cellular modem cluster (SD-WAN): solving route selection problems


Field Tests

The business task is this: sites need to be connected quickly to a normal WAN, but in places where only cellular coverage is available and there is no way to run a cable or set up a radio relay link to fiber or copper.

The solution is modem clusters. The obvious problem is that each modem is a separate physical channel. With a chisel and some strong language, they have to be combined into one encapsulated device that simply provides a channel. In addition, when a cable eventually becomes available, there should be no need to replace the box or reconfigure anything.

Who to choose from


The technology is called SD-WAN. The leaders are three American startups: Versa Networks, Viptela, and VeloCloud. The classic network equipment manufacturers are trying to catch up. In particular, Cisco says it has two SD-WAN solutions, iWAN and Meraki, yet a couple of months ago it announced the acquisition of Viptela. Riverbed, for its part, bought Ocedo about a year and a half ago in order to launch SD-WAN solutions.
Overall, we evaluated:


What we got


We settled on the Versa solution, first of all because of the price advantage. Ordinary SDN solutions are used for slightly different tasks, in particular for uniting company branches into one logical network that all terminals and servers see as a single physical address space. This is somewhat similar to Cisco's DMVPN, but with its own extras in the form of ZTP, channel bonding, and SLA. The chosen solution turned out to be a little more specialized: because the box itself does not carry the full protocol stack of a classic router and is built from standard components, the base cost is lower. Most vendors offer hardware and software either for outright purchase or by subscription, but Versa does not make hardware, so the software comes by subscription and the hardware comes from partners who build reference x86 boxes. For customers, the cost model becomes more convenient every year. For example, the smallest Versa box (pictured above) costs no more than a Cisco 800 Series router, yet it can push 500 Mbps. And that is on IMIX traffic (90% TCP, 10% UDP) with all of the following enabled: IPv4 routing/forwarding, IPsec encryption, Layer 7 application-based traffic steering, CGNAT, next-generation firewall (NGFW), QoS (classification and marking), SLA monitoring, internal service chaining, and URL filtering.


SD-WAN concept

The principle of separating the control plane from the data plane, as well as the overlay, is taken from SDN. The control plane consists of the Director (management) and the controller (BGP route reflector and IPsec). Analytics is an optional component, but it adds transparency about the services used on the WAN.


Comparison of those two boxes

How each branch box works internally.


Logical box architecture

How to install


The comparison is this: at the end of the year, the Cisco 19xx series goes EOS and will have to be replaced. That means physically visiting every single site. Versa lets you ship a box to the site with any delivery service. On site, someone only needs to plug a modem or an Internet cable into the box. As soon as the box has a connection, it builds a VXLAN tunnel to the controller, establishes an IKE session with it, and then builds a service IPsec tunnel through which it pulls its settings and receives the routes of the entire network from the controllers (recall that the controller is a BGP route reflector). All of this happens fully automatically.

That is, even an accountant can do it on site: plug in the cable, a little magic, and it works.


Initialization of the box on the new site

Here is how the procedure looks in the English-language documentation:

  1. Controller’s IP address is the IP address in the IPSec config.
  2. Establishes IKE session with controller over VXLAN tunnel.
  3. Controller assigns an IP address to Versa Director.
  4. VD IP address is notified to branch.
  5. Branch installs reverse route to VD.
  6. VD pushes the session and reboots the branch device.

The settings themselves are templates with variables: they can include QoS, shaping, SLA, LAN settings, rules for balancing the external channels, and the bonding of those channels into one pipe.
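To illustrate the idea of templates with variables, here is a minimal Python sketch. It does not use Versa's template syntax, which is not shown in this article; the key names, site names, and values are all hypothetical and serve only to show one shared template being rendered per site.

```python
from string import Template

# Hypothetical branch template. The keys are illustrative only,
# not the vendor's actual configuration syntax.
BRANCH_TEMPLATE = Template(
    "site $site_name\n"
    "lan $lan_subnet\n"
    "wan-links $wan_links\n"
    "balancing $balancing\n"
    "voice-max-loss-percent $voice_max_loss\n"
)


def render_site(variables: dict) -> str:
    """Fill the shared template with the variables of one site.

    Template.substitute raises KeyError if a variable is missing,
    so an incomplete site definition fails before anything is deployed.
    """
    return BRANCH_TEMPLATE.substitute(variables)


# Hypothetical per-site variables: one site on two LTE modems, one on cable.
sites = [
    {"site_name": "shop-001", "lan_subnet": "10.1.1.0/24",
     "wan_links": "lte1,lte2", "balancing": "per-flow", "voice_max_loss": 1},
    {"site_name": "shop-002", "lan_subnet": "10.1.2.0/24",
     "wan_links": "cable1", "balancing": "none", "voice_max_loss": 1},
]

for site in sites:
    print(render_site(site))
```

The point of the design is that a policy change is made once in the template, while each site contributes only its own variables.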

If you really want to, you can get onto the box by hand; the vendor has not closed this off yet, and there, surprise, is a Juniper-like console.

Tests



Prototypes with two modems

To emulate a real network, where some boxes are on a cable and others on LTE clusters, we plugged 3G/LTE dongles into one box (MTS and MegaFon at the time of the photo) and an Internet cable from Garza into the second. The modems are combined into one pipe for data transmission. The box inspects the traffic, recognizes it by application, and applies policies and prioritization criteria.

After a little tuning, the boxes began to automatically recognize the locked MegaFon and MTS modems (as in the photo). Synthetic iperf traffic in 5 sessions was balanced almost 50/50.


Monitoring functions built into the Director


Charts of loading two LTE-modems in the Director interface

The total bandwidth was about 50 Mbit/s, or 25 Mbit/s for each modem separately. The built-in analytics provided real-time load statistics out of the box.

Telephony worked out very well empirically: for example, if there are 15 simultaneous phone calls, we send the first through the eighth over the first channel and the rest over the second channel with high priority (the second channel also carries other office services such as mail).
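The call split described above can be sketched as a simple threshold rule. This is only an illustration of the arithmetic, not the box's actual policy engine; the function name, the link labels, and the capacity parameter are assumptions.

```python
def assign_calls(call_ids, first_link_capacity=8):
    """Send the first N calls to link1 and the rest to link2.

    On link2 the calls would be marked high priority, since that
    link also carries other office traffic such as mail.
    """
    plan = {"link1": [], "link2": []}
    for position, call_id in enumerate(call_ids, start=1):
        target = "link1" if position <= first_link_capacity else "link2"
        plan[target].append(call_id)
    return plan


plan = assign_calls(range(1, 16))  # 15 simultaneous calls
print(len(plan["link1"]), len(plan["link2"]))  # 8 7
```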

The second feature: in places with many drops on the last mile because of network load or coverage, we managed to get a more or less stable connection.
We also tested per-packet balancing, where one TCP session is spread over two LTE interfaces. We ran iperf with a single TCP session, and the dashboard shows it split across the two channels. So it works on synthetic traffic; beyond that, it depends on how each particular application behaves under such balancing. For example, from our own experience we can confirm that video streaming over RTSP via VLC works fine. This policy can be applied separately for each service: services that cope well with per-packet balancing are balanced per packet, the rest per flow. The policies themselves are rolled out to the boxes at the push of a button. Because of this, best practice is to set up several groups of sites: a test group close to you (preferably in the office), a second group for early deployment of templates and policies, and then the main production group.
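The difference between the two balancing modes can be shown in a few lines. The following is a rough sketch under stated assumptions, not how the box implements it: the link names and service labels are hypothetical, per-packet mode is modeled as plain round-robin, and per-flow mode as a stable hash of the flow 5-tuple.

```python
import hashlib
from itertools import count

LINKS = ["lte1", "lte2"]
# Hypothetical set of services known to tolerate per-packet balancing.
PER_PACKET_SERVICES = {"rtsp-video"}
_round_robin = count()


def pick_link(service, flow):
    """Choose the outgoing link for one packet.

    flow is a 5-tuple: (src_ip, src_port, dst_ip, dst_port, protocol).
    """
    if service in PER_PACKET_SERVICES:
        # Per-packet: consecutive packets alternate between links,
        # so even a single session is spread over both modems.
        return LINKS[next(_round_robin) % len(LINKS)]
    # Per-flow: a stable hash pins every packet of a flow to one link,
    # which avoids reordering inside the flow.
    digest = hashlib.sha256(repr(flow).encode()).digest()
    return LINKS[digest[0] % len(LINKS)]


video = ("10.1.1.5", 40000, "10.2.2.9", 554, "tcp")
mail = ("10.1.1.7", 40001, "10.2.2.3", 25, "tcp")
print([pick_link("rtsp-video", video) for _ in range(4)])  # alternates links
print({pick_link("smtp", mail) for _ in range(4)})         # a single link
```

Per-flow balancing only evens out when there are many flows, which is consistent with the near 50/50 split seen on several iperf sessions; a single session needs per-packet mode to use both modems.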

Switching traffic from one LTE modem to the other works. We tested it with a single iperf session and the same VLC video streaming. When you pull out the modem carrying the session, bandwidth dips briefly (iperf shows a drop of 40–60%), the video stutters for a couple of seconds, and then everything recovers.

Versa Features


  1. There is built-in analytics. No external management systems are needed, and there is no need to monitor the channels separately with PRTG and the like; everything is there out of the box.
  2. One controller can serve any number of devices; in effect it is web-scale. In particular, a telecom operator or cloud provider does not need to deploy a separate SD-WAN solution in the telco cloud for each customer.
  3. For important traffic, packets can optionally be duplicated across different channels to ensure delivery.
  4. The boxes also use TPM chips, special modules for storing encryption keys on the device. When the TPM is initialized, a key pair is created: private and public. The private key cannot be read out: there is no method to access it, only an API for invoking encryption and decryption.
  5. Dynamic tunnels. With the controllers acting as BGP route reflectors, they distribute route information to the end device at each site, and a spoke-to-spoke tunnel is built only when traffic appears between the sites. This also lets the solution scale to thousands of sites.
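Packet duplication from item 3 can be illustrated with a small simulation. The article does not describe the vendor's actual mechanism, so this sketch assumes the simplest possible scheme: the sender puts a copy of each numbered packet on every link, and the receiver keeps the first copy of each sequence number. All names here are hypothetical.

```python
def duplicate(seq, payload, links):
    """Sender side: emit one copy of the packet per link."""
    return [(link, seq, payload) for link in links]


class Deduplicator:
    """Receiver side: pass the first copy of each sequence number, drop repeats.

    A real implementation would bound memory with a sliding window;
    an unbounded set keeps the sketch short.
    """

    def __init__(self):
        self.seen = set()

    def accept(self, seq, payload):
        if seq in self.seen:
            return None
        self.seen.add(seq)
        return payload


def deliver(packets, links, lost):
    """Simulate delivery; lost is a set of (link, seq) copies that never arrive."""
    receiver = Deduplicator()
    delivered = []
    for seq, payload in packets:
        for link, copy_seq, copy_payload in duplicate(seq, payload, links):
            if (link, copy_seq) in lost:
                continue
            result = receiver.accept(copy_seq, copy_payload)
            if result is not None:
                delivered.append(result)
    return delivered


packets = [(1, "a"), (2, "b"), (3, "c")]
# Each link loses a different packet, yet nothing is lost end to end.
print(deliver(packets, ["lte1", "lte2"], lost={("lte1", 2), ("lte2", 3)}))
# ['a', 'b', 'c']
```

The price of this scheme is bandwidth: every duplicated packet is sent once per link, which is why it is reserved for important traffic.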

Differences between classic site-to-site IPsec and Versa SD-WAN IPsec:


Versa VPN Cloud

Another important point: boxes for cellular operators are many times cheaper, because they can be purchased almost directly from large factories. The vendor authorizes the factory to ship devices directly to a large customer and provides the factory with the base firmware. The software is flashed onto the box, and the box arrives at the telecom operator or cloud provider. The provider enters two parameters: the controller's IP address and the Internet access configuration (for example, a static IP). Then the box can be sent to the customer even by Russian Post. The box figures out the rest on its own. Compatibility is broad: IP telephony, video conferencing, and similar services all work right away.

There is also standard packet steering (SLA) and bonding, which is simply combining a group of external channels into one pipe and defining the switching logic for each service. Moreover, a Versa or Riverbed solution automatically recognizes which service the passing traffic belongs to, so rules can be written not at the level of "port and session type" but at the level of "give priority to corporate Skype for Business, but not to ordinary Skype video calls". Bonding also helps when last-mile provisioning takes a long time: insert three LTE modems from different operators, and the question of site availability at 99.999% is resolved.
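The availability figure can be checked with back-of-the-envelope arithmetic. The sketch below assumes that the three LTE links fail independently, which is optimistic in practice (shared power, a shared tower, or a shared site outage would break the assumption), and the per-link values are illustrative rather than measured.

```python
def combined_availability(link_availabilities):
    """Availability of a bundle that is up while at least one link is up.

    Assumes independent link failures: the bundle is down only when
    every link is down at the same time.
    """
    all_down = 1.0
    for availability in link_availabilities:
        all_down *= 1.0 - availability
    return 1.0 - all_down


def required_per_link(target, n_links):
    """Per-link availability needed to reach the target with n equal links."""
    return 1.0 - (1.0 - target) ** (1.0 / n_links)


# Three links at 98% each already exceed "five nines" in this model.
print(f"{combined_availability([0.98, 0.98, 0.98]):.6f}")  # 0.999992
# Each of three links needs roughly 97.85% to reach 99.999% combined.
print(f"{required_per_link(0.99999, 3):.4f}")  # 0.9785
```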


Dynamic tunnels between boxes at different sites

Summary


  1. We have now launched a service for configuring these boxes in our own country rather than in a US cloud (as with a number of vendors), to make life easier for Russian companies.
  2. This solution suits cellular operators and cloud providers very well. For the former, it is an opportunity to sell retail customers (for example, a chain of laundries, grocery stores, or car service centers) not only a channel for sending, say, reports and mail, but also an additional service for managing application traffic across a bundle of channels, plus varied monitoring of application performance over the existing communication channels (a so-called managed service). Incidentally, the global trend shows that the largest SD-WAN projects (by number of connected sites) are currently in the financial sector and large retail.
  3. Repair is very simple. If something breaks in a grocery store on the other side of the planet, the sales clerk moves the cable from one box to another, and the old box's config is automatically loaded onto the new one in 10-15 minutes. If desired, even an accountant can handle it.
  4. Working with SD-WAN does not require an IT specialist at the branch in any of these cases.

That's all. I will answer questions in the comments, or you can write to me by email: MKazakov@croc.ru

Source: https://habr.com/ru/post/336210/

