
In one of our previous publications we already touched on tracing and profiling the Linux kernel. Today we would like to return to this topic and talk in detail about one interesting tool: the LTTng kernel tracer, developed by the Canadian programmer and researcher Mathieu Desnoyers. With it you can collect information about events both in kernel space and in user space.
LTTng can be used, for example, to solve the following tasks (the list is far from complete):
- analysis of interprocess communication in the system;
- analysis of the interaction of applications in user space with the kernel;
- measuring the time spent by the kernel on servicing application requests;
- analysis of system behavior under high load.
LTTng has long been included in the official repositories of modern Linux distributions. It is used in practice by companies such as Google, Siemens, Boeing and others. There are practically no publications about it in Russian.
In this article we will look at how LTTng works and show how kernel tracing is done with it.
A bit of history
In the last article, we described the tracepoints mechanism in detail and even gave code examples. What we did not mention is that tracepoints were created in the course of work on LTTng.
Back in 1999, IBM employee Karim Yaghmour began work on the LTT (Linux Trace Toolkit) project. LTT was based on a simple idea: statically instrument the most important fragments of kernel code and thereby obtain information about the system's operation. A few years later this idea was picked up and developed by Mathieu Desnoyers as part of the LTTng (Linux Trace Toolkit next generation) project. The first LTTng release took place in 2005.
The words next generation appear in the tool's name for a reason: Desnoyers made a great contribution to the development of Linux tracing and profiling mechanisms. He added static instrumentation for the most important kernel functions: this is how the kernel markers mechanism came about, whose refinement resulted in the well-known tracepoints. Tracepoints are actively used in LTTng, and it is largely thanks to this mechanism that tracing can be performed with minimal additional load on the system.
Based on this work, Desnoyers prepared a dissertation, which he defended in 2009.
Mathieu Desnoyers continues to improve LTTng. The latest stable version, 2.7, was released in October 2015. Version 2.8 is about to be released; it is currently a release candidate and is available for download here.
Installation
LTTng is included in the repositories of most modern Linux distributions, and you can install it in a standard way. In the new versions of popular distributions (for example, in the recently released Ubuntu 16.04), the latest version, 2.7, is available for installation by default:
$ sudo apt-get install lttng-tools lttng-modules-dkms
If you are using an older version of Linux, to install LTTng 2.7 you will need to add a PPA repository:
$ sudo apt-add-repository ppa:lttng/ppa
$ sudo apt-get update
$ sudo apt-get install lttng-tools lttng-modules-dkms
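After installation, you can check which version actually ended up on the system:
$ lttng --version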
The lttng-tools package contains the following utilities:
- babeltrace - a utility for viewing trace output in CTF (Common Trace Format);
- lttng-sessiond - daemon to control tracing;
- lttng-consumerd - a daemon that collects data and writes it to the ring buffer;
- lttng-relayd - a daemon that transmits data over the network.
The lttng-modules-dkms package includes numerous kernel modules for interacting with the kernel's built-in tracing and profiling mechanisms. You can view the complete list with the command:
$ lsmod | grep lttng
All these modules can be divided into three groups:
- modules for working with trace points (tracepoints);
- modules for working with the ring buffer;
- probe modules for dynamic kernel tracing.
LTTng-UST, which is used to trace events in user space, is also available from the official repositories. We will not cover it in this article; interested readers can refer to this article, which serves as a quite good introduction to the topic.
All LTTng commands should be executed either as the root user or as a user belonging to the special tracing group (it is created automatically during installation).
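If your user is not yet in this group, you can add it in the usual way (assuming the group is indeed called tracing on your system); the change takes effect after logging in again:
$ sudo usermod -aG tracing $USER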
Basic Concepts: Sessions, Events, and Channels
The process of interaction of all LTTng components can be represented as the following diagram:

To better understand how it all works, let us take a short theoretical detour and go over the basic concepts.
As mentioned above, tracing in LTTng is based on both static and dynamic code instrumentation.
During execution of the instrumented code, a special probe function (the probe can also be thought of as a sensor) is called, which reads the session state and records event information into channels.
Let us clarify what "session" and "channel" mean. A session is a tracing procedure with user-defined parameters. The figure above shows only one session, but LTTng allows several sessions to run at the same time. You can stop a session at any time, change its settings, and then start it again.
Each session uses channels to transfer trace data. A channel is a set of events with specific characteristics and additional contextual information. A channel's characteristics (we will discuss them in more detail below) include buffer size, trace mode, and buffer flush period.
Why do we need channels? First of all, to maintain a shared ring buffer into which events are written during tracing and from which they are then collected by the consumer daemon. The ring buffer, in turn, is divided into multiple sections (sub-buffers) of equal size. Event data is written to a section until it fills up; after that, recording continues in another section. Filled sections are picked up by the consumer daemon and written to disk (or transmitted over the network).
Ideally, a situation where every section is full and there is nowhere left to write should never arise. In practice, however, it sometimes does. The official documentation explicitly states that in LTTng performance is paramount: it is better to lose some events than to slow down the system. When creating a channel, you can choose one of two modes of behavior on buffer overflow:
- discard mode - new events are not recorded until one of the sections is freed;
- overwrite mode - the oldest events are deleted and new ones are written in their place.
More information about setting up channels can be found here.
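As an illustration, here is a rough sketch of how a channel with custom buffer parameters could be created and an event attached to it (the session and channel names are arbitrary, and the option values are just examples):
$ lttng create demo_session
$ lttng enable-channel --kernel --subbuf-size=2M --num-subbuf=8 --overwrite big_channel
$ lttng enable-event --kernel sched_switch --channel big_channel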
Event tracing
Let's start studying LTTng in practice and launch the first trace session.
You can view the list of events available for tracing as follows (here is a small fragment, in fact this list is much longer):
$ lttng list --kernel
Kernel events:
-------------
writeback_nothread (loglevel: TRACE_EMERG (0)) (type: tracepoint)
writeback_queue (loglevel: TRACE_EMERG (0)) (type: tracepoint)
writeback_exec (loglevel: TRACE_EMERG (0)) (type: tracepoint)
writeback_start (loglevel: TRACE_EMERG (0)) (type: tracepoint)
writeback_written (loglevel: TRACE_EMERG (0)) (type: tracepoint)
writeback_wait (loglevel: TRACE_EMERG (0)) (type: tracepoint)
writeback_pages_written (loglevel: TRACE_EMERG (0)) (type: tracepoint)
…….
snd_soc_jack_irq (loglevel: TRACE_EMERG (0)) (type: tracepoint)
snd_soc_jack_report (loglevel: TRACE_EMERG (0)) (type: tracepoint)
snd_soc_jack_notify (loglevel: TRACE_EMERG (0)) (type: tracepoint)
snd_soc_cache_sync (loglevel: TRACE_EMERG (0)) (type: tracepoint)
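If you are interested specifically in system calls rather than tracepoints, the list command can also show them (as far as we know, this option is available in recent LTTng versions):
$ lttng list --kernel --syscall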
Let's try to track some event - for example, sched_switch. First, create a session:
$ lttng create test_session
Session test_session created.
Traces will be written in /user/lttng-traces/test_session-20151119-134747
So, the session is created. All data collected during this session will be saved under /user/lttng-traces/test_session-20151119-134747. Next we activate the event that interests us:
$ lttng enable-event -k sched_switch
Kernel event sched_switch created in channel channel0
Next, run:
$ lttng start
Tracing started for session test_session
$ sleep 1
$ lttng stop
Information about all sched_switch events will be saved to a separate file. You can view its contents as follows:
$ lttng view
The list of events will be quite long and include far more information than we need. Let's narrow the query and get information only about events that occurred during execution of the sleep command:
$ babeltrace /user/lttng-traces/test_session-20151119-134747 | grep sleep
[13:53:23.278672041] (+0.001249216) luna sched_switch: { cpu_id = 0 }, { prev_comm = "sleep", prev_tid = 10935, prev_prio = 20, prev_state = 1, next_comm = "swapper/0", next_tid = 0, next_prio = 20 }
[13:53:24.278777924] (+0.005448925) luna sched_switch: { cpu_id = 0 }, { prev_comm = "swapper/0", prev_tid = 0, prev_prio = 20, prev_state = 0, next_comm = "sleep", next_tid = 10935, next_prio = 20 }
[13:53:24.278912164] (+0.000134240) luna sched_switch: { cpu_id = 0 }, { prev_comm = "sleep", prev_tid = 10935, prev_prio = 20, prev_state = 0, next_comm = "rcuos/0", next_tid = 8, next_prio = 20 }
The output contains information about all sched_switch events registered in the kernel during the session. It consists of several fields. The first field is the timestamp, the second is the so-called delta (the time elapsed between the previous event and the current one). The cpu_id field indicates the number of the CPU on which the event was registered. This is followed by additional contextual information.
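By the way, instead of filtering an already recorded trace with grep, you can attach a filter expression to the event when enabling it; the field names are the ones shown in the output above (a sketch, assuming your LTTng version supports filter expressions for kernel events):
$ lttng enable-event -k sched_switch --filter 'next_comm == "sleep"'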
Upon completion of the trace, the session needs to be deleted:
$ lttng destroy
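Note also that the set of fields recorded with each event can be extended when setting up a session, before tracing is started. For example, to add the PID and name of the current process to every event (a sketch using the lttng add-context command; pid and procname are among the supported context types):
$ lttng add-context --kernel --type pid --type procname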
System call tracing
To trace system calls, the lttng enable-event command has a special option: --syscall. In the article on ftrace we walked through an example of tracing the chroot system call; let's do the same with LTTng:
$ lttng create
$ lttng enable-event -k --syscall chroot
$ lttng start
Now create an isolated environment using the chroot command.
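For example, something like this (a minimal sketch: an empty directory is enough to trigger the system call; the attempt to start a shell inside it will fail, but chroot(2) itself is still invoked and shows up in the trace):
$ mkdir test
$ sudo chroot test
Then stop the trace and view the result: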
$ lttng stop
$ lttng view
[12:05:51.993970181] (+?.?????????) andrei syscall_entry_chroot: { cpu_id = 0 }, { filename = "test" }
[12:05:51.993974601] (+0.000004420) andrei syscall_exit_chroot: { cpu_id = 0 }, { ret = 0 }
[12:06:53.373062654] (+61.379088053) andrei syscall_entry_chroot: { cpu_id = 0 }, { filename = "test" }
[12:06:53.373066648] (+0.000003994) andrei syscall_exit_chroot: { cpu_id = 0 }, { ret = 0 }
[12:07:36.701313906] (+43.328247258) andrei syscall_entry_chroot: { cpu_id = 1 }, { filename = "test" }
[12:07:36.701325202] (+0.000011296) andrei syscall_exit_chroot: { cpu_id = 1 }, { ret = 0 }
As you can see, the output contains information on every entry into the chroot system call and every exit from it. Compared to ftrace, it looks somewhat more structured and readable.
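You do not have to limit yourself to a single system call: the --all option of lttng enable-event allows tracing all system calls at once (be prepared for a large volume of output):
$ lttng enable-event --kernel --syscall --all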
Dynamic tracing
Above, we looked at examples of static tracing using the tracepoints mechanism.
To trace events in Linux, you can also use the kprobes dynamic tracing mechanism (short for kernel probes, as is not hard to guess). It allows you to add new trace points (probes) to the kernel on the fly. The well-known SystemTap tool is based on kprobes. It works like this: to add a probe to the kernel, you write a script in a special C-like language; this script is then translated into C code, which is compiled into a separate kernel module. Using such a tool in practice is quite involved.
Starting with kernel version 3.10, kprobes support has been added to ftrace. Thanks to this, dynamic tracing became possible without writing scripts or adding new modules.
LTTng also supports kprobes. Two types of probes are available: kprobes proper ("basic" probes that can be inserted almost anywhere in the kernel) and kretprobes (placed at function exit and giving access to its return value). Let's look at some practical examples.
In LTTng, "basic" probes are installed with the --probe option:
$ lttng create
$ lttng enable-event --kernel sys_open --probe sys_open+0x0
$ lttng enable-event --kernel sys_close --probe sys_close+0x0
$ lttng start
$ sleep 1
$ lttng stop
The output obtained as a result of the trace will look like this (here’s a small fragment):
…..
[12:45:26.842026294] (+0.000028311) andrei sys_close: { cpu_id = 1 }, { ip = 0xFFFFFFFF81209830 }
[12:45:26.842036177] (+0.000009883) andrei sys_open: { cpu_id = 1 }, { ip = 0xFFFFFFFF8120B940 }
[12:45:26.842064681] (+0.000028504) andrei sys_close: { cpu_id = 1 }, { ip = 0xFFFFFFFF81209830 }
[12:45:26.842097484] (+0.000032803) andrei sys_open: { cpu_id = 1 }, { ip = 0xFFFFFFFF8120B940 }
[12:45:26.842126464] (+0.000028980) andrei sys_close: { cpu_id = 1 }, { ip = 0xFFFFFFFF81209830 }
[12:45:26.842141670] (+0.000015206) andrei sys_open: { cpu_id = 1 }, { ip = 0xFFFFFFFF8120B940 }
[12:45:26.842179482] (+0.000037812) andrei sys_close: { cpu_id = 1 }, { ip = 0xFFFFFFFF81209830 }
The above output contains the ip field: the address of the probed function in the kernel.
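If you want to know which kernel function hides behind such an address, you can look it up in /proc/kallsyms. The address below is taken from the output above; on your machine it will be different (not least because of KASLR), so substitute your own value:
$ sudo grep -i ffffffff8120b940 /proc/kallsyms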
Using the --function option, you can set dynamic probes on function entry and exit, for example:
$ lttng enable-event call_rcu_sched -k --function call_rcu_sched
.....
[15:31:39.092335027] (+0.000000742) cs31401 call_rcu_sched_return: { cpu_id = 0 }, { }, { ip = 0xFFFFFFFF810E7B10, parent_ip = 0xFFFFFFFF810A206D }
[15:31:39.092398089] (+0.000063062) cs31401 call_rcu_sched_entry: { cpu_id = 0 }, { }, { ip = 0xFFFFFFFF810E7B10, parent_ip = 0xFFFFFFFF810A206D }
[15:31:39.092398883] (+0.000000794) cs31401 call_rcu_sched_return: { cpu_id = 0 }, { }, { ip = 0xFFFFFFFF810E7B10, parent_ip = 0xFFFFFFFF810A206D }
In the above output there is one more field, parent_ip: the address of the function that called the probed function.
In this section we have shown only the simplest examples. The possibilities of LTTng are much broader: by combining different instrumentation methods, it can be used to obtain information about the system's operation that is difficult or impossible to get with other tools. You can read more about this and find interesting examples, for instance, in this article on LWN.net.
Conclusion
LTTng is an interesting and promising tool; one article is clearly not enough to describe all of its capabilities. We would therefore be grateful for any comments and additions.
And if you use LTTng in practice, we invite you to share your experience in the comments.
For anyone who wants to get to know LTTng better, here is a selection of links to interesting and useful materials:
If for one reason or another you cannot leave comments here, welcome to our corporate blog.