
A note on how to debug locks in the Linux kernel

Hello! This note describes one practical technique that I used to pinpoint the place in the Linux kernel where a certain spinlock was being held. The problem came up quite spontaneously and, given the complexity of the kernel, could have taken a long time to solve. If you are interested, read on...


So, here is the real situation that arose in the Tempesta FW project, where I took part as a developer of an in-kernel web accelerator/firewall (yes, implemented inside the Linux kernel).


While the module was running, it was noticed that under load, debug messages would sometimes appear in the kernel log, indicating that a block of code was being executed which assumed that the object processed in it (a socket) was not held by anyone else.


Thanks to the high coding standards in the project, namely the generous use of BUG_ON/WARN_ON and other assertions of that kind, situations where something went against the logic intended by the developer were easily caught. So it was this time: knowing the internal logic of the kernel, the developer found it necessary to protect himself and others, and just in case added a check of the form:


    if (unlikely(spin_is_locked(&sk->sk_lock.slock)))
            SS_WARN("Socket is used by two cpus, action=%d"
                    " state=%d sk_cpu=%d curr_cpu=%d\n",
                    sw.action, sk->sk_state, sk->sk_incoming_cpu,
                    smp_processor_id());

And everything was fine until messages "Socket is used by two cpus, ..." actually started showing up in the kernel log. That is, the if (unlikely(spin_is_locked(&sk->sk_lock.slock))) { ... } condition fired at this point, which was evidence that someone else was holding the lock. Below, I will describe how I managed to find out, quite easily, who was holding this spinlock.


So, we have the following. There is a socket descriptor structure, struct sock, which contains a socket lock sk_lock of type socket_lock_t, which in turn includes a spinlock slock (see the definitions in include/net/sock.h). The spinlock itself is based on the raw_spinlock_t structure, which includes a dep_map member of type struct lockdep_map, containing, among other things, information about the lock holders. The latter becomes available only when the kernel is built with the CONFIG_DEBUG_LOCK_ALLOC option. But we are hackers :)
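
To make that chain concrete, here is an abridged sketch of the relevant definitions, roughly as they appear in 4.1-era kernel headers (fields not relevant to this story are omitted; note that the cpu/ip pair in lockdep_map additionally requires CONFIG_LOCK_STAT):

    /* include/net/sock.h (abridged) */
    typedef struct {
            spinlock_t              slock;  /* the spinlock we are interested in */
            int                     owned;
            wait_queue_head_t       wq;
            /* ... */
    } socket_lock_t;

    /* include/linux/spinlock_types.h (abridged) */
    typedef struct raw_spinlock {
            arch_spinlock_t         raw_lock;
    #ifdef CONFIG_DEBUG_LOCK_ALLOC
            struct lockdep_map      dep_map;  /* holder information lives here */
    #endif
    } raw_spinlock_t;

    /* include/linux/lockdep.h (abridged) */
    struct lockdep_map {
            struct lock_class_key   *key;
            struct lock_class       *class_cache[NR_LOCKDEP_CACHING_CLASSES];
            const char              *name;    /* e.g. "slock-AF_INET" */
    #ifdef CONFIG_LOCK_STAT
            int                     cpu;      /* CPU that last acquired the lock */
            unsigned long           ip;       /* instruction address of the acquisition */
    #endif
    };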


Thus, for debugging, I needed to build the kernel with the CONFIG_DEBUG_LOCK_ALLOC option and make the following change to the code:


    --- a/tempesta_fw/sock.c
    +++ b/tempesta_fw/sock.c
    @@ -1101,11 +1101,14 @@ ss_tx_action(void)
     	while (!tfw_wq_pop(this_cpu_ptr(&si_wq), &sw)) {
     		struct sock *sk = sw.sk;
     
    -		if (unlikely(spin_is_locked(&sk->sk_lock.slock)))
    +		if (unlikely(spin_is_locked(&sk->sk_lock.slock))) {
     			SS_WARN("Socket is used by two cpus, action=%d"
     				" state=%d sk_cpu=%d curr_cpu=%d\n",
     				sw.action, sk->sk_state, sk->sk_incoming_cpu,
     				smp_processor_id());
    +			printk("raw_spinlock = %p\n", &sk->sk_lock.slock.rlock);
    +			BUG();
    +		}
     
     		bh_lock_sock(sk);
     		switch (sw.action) {

The point of the change is to dump the address of the contended spinlock to the log when the situation of interest arises, and then crash the kernel with BUG(). This gives an excellent opportunity to take a dump of the kernel image as of the moment of the error and, knowing the object's address, examine its contents later.


A combination of QEMU and kdump was used to capture the dump. The latter is a kernel mechanism that allows you to stop the current kernel and, via kexec, boot a backup (simpler) one, which will then save the image of the previous kernel as a file.
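
For reference, a minimal sketch of such a setup (all paths and sizes here are illustrative): the main kernel reserves memory for the capture kernel via crashkernel=, the capture kernel is loaded with kexec -p, and once BUG() fires and the capture kernel boots, the previous kernel's memory shows up there as /proc/vmcore:

    # the main kernel's command line must reserve memory for the
    # capture kernel, e.g.: crashkernel=256M

    # load the capture (crash) kernel; paths are illustrative
    kexec -p /boot/vmlinuz-capture \
          --initrd=/boot/initrd-capture.img \
          --append="root=/dev/sda1 irqpoll maxcpus=1 reset_devices"

    # after BUG() triggers and the capture kernel boots,
    # save the previous kernel's image
    cp /proc/vmcore /var/crash/vmcore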


The crash utility was used to analyze the resulting dump. It is better to download and build it by hand, because I ran into problems with the stock versions shipped in distributions.
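
crash is pointed at the vmlinux with debug symbols and at the saved dump, roughly like this (paths illustrative):

    $ crash vmlinux /var/crash/vmcore

With the dump loaded, we have the following: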


    crash> log
    [   56.934038] [sync_sockets] Warning: Socket is used by two cpus, action=0 state=1 sk_cpu=2 curr_cpu=2
    [   56.934066] raw_spinlock = ffff880001f28b70

Well, the address of the object is known (ffff880001f28b70). Let's see what is there:


    crash> raw_spinlock_t ffff880001f28b70
    struct raw_spinlock_t {
      raw_lock = {
        {
          head_tail = 0x3434,
          tickets = {
            head = 0x34,
            tail = 0x34
          }
        }
      },
      magic = 0xdead4ead,
      owner_cpu = 0xffffffff,
      owner = 0xffffffffffffffff,
      dep_map = {
        key = 0xffffffff82cd95f0 <af_family_slock_keys+16>,
        class_cache = {0xffffffff8261be30 <lock_classes+319920>, 0xffffffff826283d0 <lock_classes+370512>},
        name = 0xffffffff815b719d "slock-AF_INET",
        cpu = 0x1,
        ip = 0xffffffff813c667f
      }
    }

From the address ip = 0xffffffff813c667f we can find the function in which the lock was acquired:


    $ addr2line -f -e vmlinux 0xffffffff813c667f
    tcp_tasklet_func
    /home/milabs/linux-stable-4.1/net/ipv4/tcp_output.c:773

Here is the answer: the socket lock was acquired in tcp_tasklet_func().
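
As an aside, if the vmlinux was built with debug info, gdb can resolve the same address and additionally print the surrounding source lines:

    $ gdb -q vmlinux
    (gdb) list *0xffffffff813c667f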



Source: https://habr.com/ru/post/321220/

