
Tuning the Linux network stack for the lazy

The Linux network stack works fine by default on desktops. On servers with even a moderately above-average load, you already have to figure out how to configure everything properly. At my current job this has to be done almost on an industrial scale, so without automation there is no time to explain to every colleague what was set up long ago, or to force people to read roughly 300 pages of English text interspersed with C code. You can and should read all of it, but the results won't come in an hour or a day. So I put together a set of utilities for tuning the network stack, along with a guide to using them, which doesn't dive into the specifics of particular setups and stays compact enough to be read in under an hour and still be of at least some use.


What needs to be achieved?


The main task in tuning the network stack (no matter what role the server performs - a router, a traffic analyzer, a web server handling large volumes of traffic) is to distribute the packet-processing load evenly across the CPU cores. Preferably while making sure the CPU and the network card belong to the same NUMA node, and without bouncing packets between cores unnecessarily.
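A quick way to check that binding by hand is via the standard sysfs attribute (a sketch; eth0 is just a placeholder for your interface, and the output shown is illustrative):

 # cat /sys/class/net/eth0/device/numa_node
 0
 # lscpu | grep NUMA
 NUMA node(s):        2
 NUMA node0 CPU(s):   0-7,16-23
 NUMA node1 CPU(s):   8-15,24-31

A value of -1 in numa_node means the platform reports no NUMA locality for that device.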


Before the main task comes a preliminary one - selecting the hardware, naturally taking into account what tasks the server performs, where the traffic comes from, how much of it there is, and so on.


Recommendations for choosing hardware



So, if there are two or more traffic sources delivering more than 2 Gbit/s each, it is worth considering a server whose number of processors and NUMA nodes, as well as the number of network cards (not ports), equals the number of those sources.


"Lord, I do not want to understand this!"


And you don't have to. I already figured it out, and in order not to waste time explaining it to colleagues over and over, I wrote a set of utilities - netutils-linux . It is written in Python and tested on versions 2.6, 2.7, 3.4 and 3.6.


network-top


This utility is meant for evaluating the applied settings. It displays how evenly the load (interrupts, softirqs, packets per second per CPU core) is spread across server resources, along with all kinds of packet processing errors. Values exceeding thresholds are highlighted.
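The counters it shows come from the usual kernel sources, so you can spot-check them by hand; a rough manual equivalent (interface name is a placeholder):

 # egrep 'CPU|eth1' /proc/interrupts       # per-queue interrupt counters per core
 # watch -n 1 cat /proc/net/softnet_stat   # per-core softirq packet counters (hex)

network-top essentially aggregates such counters and shows the per-second deltas, which is much easier to read.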


rss-ladder


 # rss-ladder eth1 0
 - distributing interrupts of eth1 (-TxRx) on socket 0:
   - eth1: irq 67 eth1-TxRx-0 -> 0
   - eth1: irq 68 eth1-TxRx-1 -> 1
   - eth1: irq 69 eth1-TxRx-2 -> 2
   - eth1: irq 70 eth1-TxRx-3 -> 3
   - eth1: irq 71 eth1-TxRx-4 -> 8
   - eth1: irq 72 eth1-TxRx-5 -> 9
   - eth1: irq 73 eth1-TxRx-6 -> 10
   - eth1: irq 74 eth1-TxRx-7 -> 11

This utility distributes the network card's interrupts across the cores of the selected physical processor (socket 0 by default).
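Under the hood this is the standard IRQ affinity mechanism; the same assignment could be made manually (a sketch, using irq 67 and core 0 from the example output above):

 # echo 0 > /proc/irq/67/smp_affinity_list   # pin eth1-TxRx-0 to core 0
 # cat /proc/irq/67/smp_affinity_list        # verify
 0

Doing this for every queue of every card quickly gets tedious, which is exactly why rss-ladder exists.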


autorps


 # autorps eth0
 Using mask 'fc0' for eth0-rx-0.

This utility configures the distribution of packet processing across the cores of the selected physical processor (socket 0 by default). If you use RSS, you will most likely not need it. The typical use case is a multi-core processor paired with a network card that has a single queue.
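The mask it reports is an ordinary RPS CPU mask written to sysfs; the manual equivalent of the example above would look roughly like this (the mask fc0 selects cores 6-11):

 # echo fc0 > /sys/class/net/eth0/queues/rx-0/rps_cpus
 # cat /sys/class/net/eth0/queues/rx-0/rps_cpus
 fc0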


server-info


 # server-info rate
 cpu:
   BogoMIPS: 7
   CPU MHz: 7
   CPU(s): 1
   Core(s) per socket: 1
   L3 cache: 1
   Socket(s): 10
   Thread(s) per core: 10
   Vendor ID: 10
 disk:
   vda:
     size: 1
     type: 1
 memory:
   MemTotal: 1
   SwapTotal: 10
 net:
   eth1:
     buffers:
       cur: 5
       max: 10
     driver: 1
     queues: 1
 system:
   Hypervisor vendor: 1
   Virtualization type: 1

This utility allows you to do two things:


  1. server-info show : see what hardware is installed in the server. On the whole it reinvents lshw , but with an emphasis on the parameters we care about.
  2. server-info rate : find bottlenecks in the server hardware. It is similar in spirit to the Windows Experience Index, again with an emphasis on the parameters we care about. Hardware is rated on a scale from 1 to 10.

Other utilities



Lord, I want to understand this!


Read articles about:



These articles inspired me to write these tools.


Also, a good article was published on the Odnoklassniki blog two years ago.


Ordinary cases


But the usage manual by itself says little about how exactly the utilities should be applied in a given situation. Here are a few examples.


Example 1. As simple as possible.


Given:



Solution:



Example 2. A little harder.


Given:



Solution:


1. Move one of the 10 Gbit/s network cards to another PCI slot attached to NUMA node1.


2. Reduce the number of combined queues on the 10 Gbit/s ports to the number of cores of one physical processor:


 for dev in eth0 eth1 eth2 eth3; do
   ethtool -L $dev combined 8
 done
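To check how many queues a card currently exposes and what the hardware maximum is, the matching read-only command is ethtool -l (lowercase); illustrative, abbreviated output:

 # ethtool -l eth0
 Channel parameters for eth0:
 Pre-set maximums:
 Combined:       16
 Current hardware settings:
 Combined:       8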

3. Distribute the interrupts of ports eth0 and eth1 across the processor cores belonging to NUMA node0, and those of ports eth2 and eth3 across the processor cores belonging to NUMA node1:


 rss-ladder eth0 0
 rss-ladder eth1 0
 rss-ladder eth2 1
 rss-ladder eth3 1
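To verify the result, watch the per-core counters in /proc/interrupts grow on the cores the queues were pinned to, for example:

 # egrep 'CPU|eth0' /proc/interrupts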

4. Increase the RX buffers of eth0, eth1, eth2 and eth3:


 for dev in eth0 eth1 eth2 eth3; do
   rx-buffers-increase $dev
 done
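rx-buffers-increase wraps the standard ethtool ring-buffer interface; by hand this would look roughly like the following (4096 is just an example value, bounded by the maximum that ethtool -g reports):

 # ethtool -g eth0         # show current and maximum ring sizes
 # ethtool -G eth0 rx 4096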

Unusual cases


Things do not always go perfectly:





Update: after publication the author realized that people use not only RHEL-based distributions for network tasks, and that tests run on Debian against data sets collected on RHEL-based systems miss a lot of bugs. Many thanks to everyone who reported that something wasn't working on Ubuntu / Debian / ALT Linux! All those bugs are fixed in release 2.0.10


Update 2: in the comments people mentioned that RPS is still often useful and that I underestimate it. That is fair, so a significantly improved version of the autorps utility appeared in release 2.2.0 .


Update 3: release 2.5.0



Source: https://habr.com/ru/post/331720/

