
Balancing traffic in the operator’s IP networks

A word of warning up front: if you want to read about the modern architecture of such solutions, it is better to start from the end of the article.
If you are interested in the difficulties encountered while designing part of a carrier network, read on.

The article describes how to organize balancing of traffic at the network boundary under the following conditions:


The connected user network segment is illustrated using IEEE 802.11 wireless networks [5] with controllers.
Solved problems:



The distribution level is a boundary component of the network that performs the following main functions:


Wireless Access Controllers (UKBD) - a group of controllers that perform the following main functions:


The level of radio coverage - access points located on sites.

Service provision center (CPU) - connects the controllers to the data transmission network and provides management and control of the services provided to users, Internet connectivity, and IP address translation.

In terms of routing, the IP network is divided into several routing segments: a user segment, an access point segment, and a control segment. This article covers only the user routing segment.

The proposed solution uses the OSPFv2 dynamic routing protocol [1] with the Multi-Instance extension [2]. The main OSPF configuration parameters are shown in Figures 1-5.

Using VRF at the distribution level


Using multiple VRFs makes it possible to assign different primary / backup BNG combinations to user traffic.
For this purpose, on each of the two L3 switches at the distribution level, the user interfaces are placed in different virtual routing tables (VRF Lite [9]).
One OSPF process is created per VRF.

Balancing user traffic at a network connection point


Balancing is done by distributing user devices across virtual networks (VLANs). For this purpose, access points are divided into groups on the wireless access controllers (up to 10-15 access points per group). Each group is assigned a VLAN ID and a user IP subnet with a capacity of at least 2-4 class C networks (up to 25 active connections per access point, plus additional capacity to account for inactive user connections and the DHCP lease time [6]).
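The sizing rule above can be checked with a few lines of arithmetic. This is only a sketch: it takes the upper bounds quoted in the text (15 access points per group, 25 active connections per access point) and treats "2-4 class C networks" as a /23 or a /22; the 10.0.0.0 base address is an arbitrary placeholder.

```python
import ipaddress

# Upper bounds quoted in the text above.
MAX_APS_PER_GROUP = 15    # "up to 10-15 access points per group"
MAX_ACTIVE_PER_AP = 25    # "up to 25 active connections per access point"


def usable_hosts(prefix_len: int) -> int:
    """Usable IPv4 host addresses in a subnet (network and broadcast excluded)."""
    # 10.0.0.0 is an arbitrary placeholder; only the prefix length matters here.
    return ipaddress.ip_network(f"10.0.0.0/{prefix_len}").num_addresses - 2


peak_active = MAX_APS_PER_GROUP * MAX_ACTIVE_PER_AP

# Two class C networks correspond to a /23, four to a /22.
for class_c_count, prefix_len in ((2, 23), (4, 22)):
    hosts = usable_hosts(prefix_len)
    headroom = hosts - peak_active
    print(f"{class_c_count} class C networks (/{prefix_len}): "
          f"{hosts} usable addresses, {headroom} left for inactive leases")
```

With these numbers the peak of 375 active users fits into a /23, and the remaining addresses are the reserve for leases that are still held by users who have already left.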
On the distribution level L3 switches to which the controllers are connected, the IP networks in use are divided into two large groups. This is necessary for subsequent summarization of the routing information and for balancing traffic between the BNGs at layer 3 of the OSI model.
Each group is defined in one of the VRFs of the distribution level switch.
At L3, redundancy is provided by the OSPF protocol, as shown in Figure 1.
Figure 1
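Why the user networks are split into two contiguous groups becomes clearer with a small summarization example. The addressing plan below is hypothetical (the article does not give the real one); the VRF names are the ones used for Figure 2.

```python
import ipaddress

# Hypothetical addressing plan: eight /22 user subnets, one per AP group.
user_subnets = list(ipaddress.ip_network("10.10.0.0/19").subnets(new_prefix=22))

# Split the plan into two contiguous halves, one per VRF.
half = len(user_subnets) // 2
vrf_groups = {
    "WUsers1": user_subnets[:half],
    "WUsers2": user_subnets[half:],
}

# Because each half is contiguous and aligned, it collapses into one summary route.
summaries = {
    vrf: list(ipaddress.collapse_addresses(nets))
    for vrf, nets in vrf_groups.items()
}

for vrf, routes in summaries.items():
    print(vrf, "advertises", ", ".join(str(r) for r in routes))
```

If the subnets of the two groups were interleaved instead, each VRF would have to advertise every /22 separately, which is exactly what the grouping is meant to avoid.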



The choice of the NSSA area type is determined by the following factors:
- It reduces the number of routes in the NSSA by summarizing the routes to wireless user networks on the ASBR.
- It makes it possible to set the Administrative Distance (AD) for "external" OSPF routes on the ABR.
- It makes it simple to select and summarize the redistributed routes on the ABR.
- It makes it possible to set the ABR as the source of routing information when sending LSAs to area 0 (suppress-fa [14]). This avoids sending information about the IP addressing structure and the sources of external routes from the NSSA to area 0.
- It makes it possible to advertise only two default routes inside the NSSA (no-summary [14]). Traffic is balanced between the ABRs by setting the costs of the channels between the ASBR and the ABRs inside the NSSA.
- It makes it possible to use 2 types of external routes to filter and control routing of the user routing segment on the ABR.

This article does not cover connecting user routing segments to the BNG over an MPLS network, but some of the design choices are driven by the requirement to work in that mode (sham-link backdoor routing [15]).

Figure 2 shows examples of using VRFs at the distribution level:
- WUsers1: for users whose primary services gateway is SG-01 of the CPU and whose backup services gateway is SG-02 of the CPU;
- WUsers2: for users whose primary services gateway is SG-02 of the CPU and whose backup services gateway is SG-01 of the CPU.

Figure 2


The primary / backup service gateway in VRF WUsers1 and WUsers2 is selected through dynamic routing, by assigning different costs to the virtual communication channels.
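The cost-based selection can be modeled in a few lines. This is a simplified sketch, not OSPF itself: each VRF simply prefers the cheapest channel that is still up. The VRF and gateway names come from the text; the cost values 10 and 100 are made up for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Channel:
    gateway: str      # service gateway reached over this virtual channel
    cost: int         # OSPF-style cost (hypothetical values); lower wins
    up: bool = True


def best_gateway(channels):
    """Return the gateway behind the cheapest channel that is up, or None."""
    alive = [c for c in channels if c.up]
    return min(alive, key=lambda c: c.cost).gateway if alive else None


def with_gateway_down(channels, gateway):
    """Copy of the channel list with every channel to `gateway` marked down."""
    return [replace(c, up=False) if c.gateway == gateway else c for c in channels]


# Mirrored costs give each VRF a different primary and the same pair of gateways.
vrfs = {
    "WUsers1": [Channel("SG-01", 10), Channel("SG-02", 100)],
    "WUsers2": [Channel("SG-01", 100), Channel("SG-02", 10)],
}

for name, channels in vrfs.items():
    print(name,
          "| normal:", best_gateway(channels),
          "| SG-01 down:", best_gateway(with_gateway_down(channels, "SG-01")))
```

In normal operation the two VRFs use different gateways, so the load is shared; when one gateway disappears, both VRFs converge on the remaining one.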

Load balancing at the distribution level


The IP networks assigned to the users' virtual channels (VLANs) are split between two VRFs on each of the distribution level L3 switches. Thus wireless users, depending on which AP group the wireless access controllers place them in, land in different VRFs and use different pairs of primary / backup service gateways, which distributes the load between the service gateways.

If one of the distribution level switches fails, all users are switched to the remaining switch: they reconnect to the wireless network and obtain an IP address from a new IP network. The IP networks of the switched wireless users are also distributed across two VRFs. Thus the load distribution between the BNGs is preserved even when only one distribution level L3 switch is operating.

Redundancy of the service gateway connection can be organized using duplicate virtual communication channels located in different IP subnets and terminated on different physical ports of the service gateway.
The physical scheme and network topology, as well as the corresponding organization of logical communication channels, are not presented in this article. The solutions used at these levels also provide redundant physical and logical communication channels.

The routing scheme at the distribution level is shown in Figures 1 and 2.

Traffic routing between distribution level and CPU


The distribution level (UR) switches and the CPU service gateways can be connected in one of two ways:
- at layer 2 of the OSI model, without an intermediate "L3-hop";
- using an intermediate "L3-hop".

The first option requires more resources (VLAN IDs, STP).
With the second option, the stack of switches on which the 2 VRFs are created can serve as the intermediate router.

This option significantly reduces the number of virtual channels (VLANs) needed for communication between the CPU service gateways and the UR.

The connection between the UR routers and the CPU service gateways is shown in Figure 3.
Figure 3

Assigning equal OSPF metrics to parallel virtual communication channels distributes wireless user traffic across those channels and, as a result, balances traffic between the physical communication lines.
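How traffic ends up spread over equal-cost channels can be illustrated with per-flow hashing. This is only a sketch of the general idea: real routers use their own, vendor-specific hash functions and inputs, and the channel names and addresses below are hypothetical.

```python
import hashlib


def pick_path(flow, paths):
    """Map a flow (src IP, dst IP, protocol, src port, dst port) to one path.

    All packets of a flow hash to the same path, so packet order is preserved.
    """
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    return paths[int.from_bytes(digest[:4], "big") % len(paths)]


# Two parallel virtual channels with equal OSPF cost (hypothetical names).
paths = ["vlan-a", "vlan-b"]

# 200 hypothetical user flows towards one destination.
flows = [(f"10.10.0.{i}", "198.51.100.7", 6, 40000 + i, 443) for i in range(1, 201)]

usage = {p: 0 for p in paths}
for flow in flows:
    usage[pick_path(flow, paths)] += 1

print(usage)
```

The split is statistical rather than exact: with many flows the channels carry roughly equal shares, while a handful of heavy flows can still load one channel more than the other.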

NAT in the CPU


NAT routers translate private IP addresses to public IP addresses (Network Address Translation, NAT). To implement address translation, a range (pool) of unique public IP addresses must be allocated. For a pair of routers, NAT groups are formed, and in each group one router is selected as the primary (active) and the other as the backup. If the primary router fails, the backup becomes active and continues to serve user sessions.

Routing between service gateways and NAT routers


When using NAT routers, the following restrictions are taken into account:


The inside VLAN is used to communicate with the CPU service gateways. The outside VLAN is used to communicate with the edge BGP routers.

To increase fault tolerance, two physical interfaces are used to connect each BNG. Because of various equipment features, and because the pool of external IP addresses must be tightly bound to a specific BNG, we suggest the following restrictions:
- do not use Etherchannel technology; organize load balancing and redundancy using L3 routing instead;
- for each NAT router, use one physical channel to communicate with the BNG.

Thus, it becomes necessary to organize an intermediate “L3-node” (hereinafter referred to as the ASBR CPU) between the BNG and the NAT routers. The intermediate node will perform the following functions:
- OSPF ASBR for area 0.
- Distribution of default routes for area 0.
- Routing packets coming from NAT routers to OSPF ABR.
- Static routing of packets coming from OSPF area 0 to NAT routers (default gateways).

The role of the intermediate router can be performed by the L3 switch stack that connects the BNGs and the NAT routers. For this purpose, 2 VRFs (VRF Lite [9]) are created on it: Users1_out and Users2_out.

It is important to use a stack of L3 switches specifically, since this makes it possible to:
- use both physical connections of the BNG to organize virtual communication channels with each of the NAT routers;
- balance the load between the physical interfaces of the BNG;
- keep the BNG connected to the L3 switch stack if one of the L3 switches in the stack fails or one of the physical interfaces of the BNG has problems.

Another feature of the solution is the use of two VRFs on the stack of L3 switches.
This is necessary in order to tightly "tie" each BNG to a specific ASBR (see Figure 4) and, accordingly, to link the pool of external IP addresses to a specific BNG.
For each of these VRFs (Users1_out and Users2_out), an independent OSPF process is started on the L3 switch stack. The virtual channels between the BNGs and VRFs Users1_out and Users2_out of the switch stack belong to OSPF area 0 (the backbone).

For routing between ASBR and NAT routers, static routing is used:


To advertise the default route, default-information originate is enabled in the OSPF processes of ASBR VRF Users1_out and ASBR VRF Users2_out.

The scheme using the intermediate “L3-node” is shown in Figure 4.
Figure 4


Assigning equal OSPF metrics to parallel virtual communication channels distributes wireless user traffic across those channels and, as a result, balances traffic between the physical communication lines that connect the CPU service gateways to the switch stack.

The ASBR CPU is the border router for the OSPF protocol and is used to redistribute routes from other routing segments, NAT IP address pools, and the Internet.

Routing and balancing traffic between the ASBR CPU and NAT routers


Virtual communication links are created between the ASBR CPU and the NAT routers, as shown in Figure 5. Default gateway resiliency on the NAT routers can be implemented using the Hot Standby Router Protocol (HSRP) [11].

Two HSRP groups are used on the interfaces of the NAT routers. The first HSRP group provides the default gateway for NAT-group1, and the second HSRP group provides the default gateway for NAT-group2, as shown in Figure 5.
Figure 5
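The behavior of the two HSRP groups can be sketched as a simple election: in each group, the router with the highest priority that is still up owns the virtual gateway address. The sketch assumes preemption is enabled; the router names and priority values are hypothetical, and only the group names come from the text.

```python
from dataclasses import dataclass


@dataclass
class NatRouter:
    name: str
    priority: dict    # HSRP group name -> configured priority (hypothetical values)
    up: bool = True


def active_router(group, routers):
    """Name of the router that is active for `group`, or None if all are down."""
    candidates = [r for r in routers if r.up]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.priority[group]).name


GROUPS = ("NAT-group1", "NAT-group2")

# Mirrored priorities: each router is active for one group and standby for the other.
routers = [
    NatRouter("nat-1", {"NAT-group1": 110, "NAT-group2": 90}),
    NatRouter("nat-2", {"NAT-group1": 90, "NAT-group2": 110}),
]

normal = {g: active_router(g, routers) for g in GROUPS}

routers[0].up = False   # simulate a failure of nat-1
degraded = {g: active_router(g, routers) for g in GROUPS}

print("normal:  ", normal)
print("degraded:", degraded)
```

With both routers up, each one serves the gateway of its own NAT group, so both carry traffic; after a failure the surviving router takes over both virtual gateways.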



Routing between NAT routers and network border routers


In the proposed solution, routing is performed using static routes and HSRP on the network border routers (outside-router, see Figure 6). This part of the solution is not covered in detail in this article.
Figure 6



Virtual links are created between NAT routers and border routers. Fault tolerance of default gateways on border routers can be implemented using an HSRP or similar mechanism, depending on the capabilities of the equipment used. Two HSRP groups are used for this purpose.
The routing scheme is shown in Figure 6.

Schemes and drawings
Figure 1. Wireless user VRFs at the distribution level, summarizing routes to user IP subnets.


Figure 2. Distribution level routing.


Figure 3. Routing between distribution level and CPU.


Figure 4. Routing between the BNG and the ASBR CPU.


Figure 5. Routing between the ASBR CPU and NAT routers.


Figure 6. Routing at the network edge.


Sources
[1] J. Moy (Ascend Communications), Request for Comments 2328, OSPF Version 2, April 1998
[2] A. Lindem (Ericsson), A. Roy, S. Mirtorabi (Cisco Systems), Request for Comments 6549, OSPFv2 Multi-Instance Extensions, March 2012
[3] S. Wadhwa (Alcatel-Lucent), J. Moisand (Juniper Networks), T. Haag (Deutsche Telekom), N. Voigt (Nokia Siemens Networks), T. Taylor, Ed. (Huawei Technologies), Request for Comments 6320, Protocol for Access Node Control Mechanism in Broadband Networks, October 2011
[4] P. Srisuresh (Jasmine Networks), K. Egevang (Intel Corporation), Request for Comments 3022, Traditional IP Network Address Translator (Traditional NAT), January 2001
[5] IEEE 802.11, Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications, 1997
[6] R. Droms (Bucknell University), Request for Comments 2131, Dynamic Host Configuration Protocol, March 1997
[7] AP Group VLANs with Wireless LAN Controllers Configuration Example, www.cisco.com, 2008
[8] L. Andersson, T. Madsen (Acreo AB), Request for Comments 4026, Provider Provisioned Virtual Private Network (VPN) Terminology
[9] Configuring VRF-lite, Cisco web site [Online]. Available: www.cisco.com
[10] Y. Rekhter (T.J. Watson Research Center, IBM Corp.), T. Li (Cisco Systems), Editors, Request for Comments 1518, An Architecture for IP Address Allocation with CIDR, September 1993
[11] T. Li (Juniper Networks), B. Cole (Juniper Networks), P. Morton (Cisco Systems), D. Li (Cisco Systems), Request for Comments 2281, Cisco Hot Standby Router Protocol (HSRP), March 1998
[12] NAT Examples and Reference, Cisco web site [Online]. Available: www.cisco.com
[13] "Creating wireless public networks", 2008-2010, step.ru/projects/industrys/telecom
[14] Cisco IOS IP Routing: OSPF Command
[15] Sham-link backdoor routing




One of the difficulties was designing a solution that involved a significant number of nodes, services and related systems that had to be integrated, as well as the contractors responsible for designing the various systems and services.

A few conclusions based on the experience:
- perform end-to-end design of services, including traffic routing;
- split functional components into separate IP nodes (BNG, NAT routers, BGP border routers);
- stackable routers greatly simplify the designed solution;
- when using virtual p2p channels, do not forget to configure OSPF properly on the router interfaces ;)
UPDATE: Added a very detailed description of the solution. I hope it is clearer now.
Pictures corrected.

Based on materials from 2008.
You can learn about the use of modern BNGs in telecom operator networks on the Learning Club website or from the information resources of telecom equipment manufacturers.

Source: https://habr.com/ru/post/427433/

