Frame from the film "The Matrix: Revolution"
In this article, we take a detailed look at one interesting finding: two frequently used system calls (gettimeofday and clock_gettime) are very slow on AWS EC2.
Linux has a mechanism for speeding up these two system calls: their code runs in user space, which avoids a switch into the kernel. This is done through a virtual shared library provided by the kernel and mapped into the address space of every running program.
On AWS EC2, these two system calls cannot use the virtual dynamic shared object (vDSO), because the virtualized clock source in Xen (and in some KVM configurations) does not support retrieving time information through the vDSO.
There is no safe workaround for this problem. You can switch the clock source to tsc, but this is not safe. Below, we look at the issue in more detail and run a comparative microbenchmark.
To quickly check your system for this problem, compile the program below and run it under strace:
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

int main(int argc, char *argv[])
{
    struct timeval tv;
    int i = 0;

    for (; i < 100; i++) {
        gettimeofday(&tv, NULL);
    }

    return 0;
}
gcc -o test test.c
strace -ce gettimeofday ./test
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
  0.00    0.000000           0       100           gettimeofday
------ ----------- ----------- --------- --------- ----------------
100.00    0.000000                   100           total
strace counted 100 gettimeofday calls. This means that the vDSO was not used: real system calls were made instead, each of which switched into the kernel. The vDSO mechanism in Linux was designed with gettimeofday in mind (this call is even mentioned in the vDSO man page). Any call served by the vDSO executes entirely in user space and avoids the context switch. As a result, a call that is successfully handled by the vDSO does not appear in the strace output.
Next, we will examine in detail why and how this happens, and also look at the very interesting results of profiling.
There are several important aspects that you should be familiar with in order to better understand the problem description and the corresponding code snippets.
Before reading further, we strongly recommend our previous publication, which describes in detail how Linux system calls work: The Definitive Guide to Linux System Calls.
A vDSO is essentially a shared library provided by the kernel and mapped into the address space of every process. When a program calls gettimeofday, clock_gettime, getcpu, or time, glibc attempts to execute the code provided by the vDSO. This code obtains the necessary data without switching into the kernel and thus avoids the extra cost of a real system call.
Since calls served by the vDSO do not switch into the kernel, strace is not notified of them. As a result, the strace output will not mention, for example, gettimeofday if the program successfully made this call through the vDSO. To see such calls, use ltrace instead of strace. More details on how strace works can be found in our publication “How does strace work”.
On AWS EC2, gettimeofday does appear in the strace output. This is because in some situations the vDSO code falls back to an ordinary system call.
On x86 Linux systems, several different mechanisms are used to obtain time information.
Each of these mechanisms has its pros and cons. Detailed information can be found in the kernel source tree, in Documentation/virtual/kvm/timekeeping.txt.
It is important to understand that virtualization creates additional difficulties when working with time information.
VMware has published a very interesting article that describes these and other timekeeping issues. The article presents the information as specific to VMware, but most of it applies to any virtualization system.
To solve these and other problems, KVM and Xen have their own time management systems: KVM PVclock and Xen time. In the Linux kernel, they are called clocksource (the source of time information).
The current system clocksource can be found in the file /sys/devices/system/clocksource/clocksource0/current_clocksource.
This is the source the system consults when gettimeofday or clock_gettime is called.
Let's see how the gettimeofday call is implemented in the vDSO code. Recall that this code ships with the kernel but is executed in user space.
If we carefully examine the code in arch/x86/vdso/vclock_gettime.c and compare the vDSO implementations of gettimeofday (__vdso_gettimeofday) and clock_gettime (__vdso_clock_gettime), we find that both contain a similar if statement near the end of the function:
if (ret == VCLOCK_NONE)
        return vdso_fallback_gtod(tv, tz);
The __vdso_clock_gettime code contains the same check, but calls a different function: vdso_fallback_gettime.
If ret is VCLOCK_NONE, the current system clock source does not support the vDSO. In this case, vdso_fallback_gtod simply performs an ordinary system call (switching into the kernel, with all the associated overhead).
But when does ret get the value VCLOCK_NONE?
If we trace upward from this conditional, we find that ret gets the value of the vclock_mode field of the current clocksource. For clock sources that support the vDSO (tsc, used later in this article, is one of them), vclock_mode is not equal to VCLOCK_NONE.
On the other hand, for sources such as Xen time, or on KVM systems where the CONFIG_PARAVIRT_CLOCK parameter is not enabled in the kernel configuration or the processor does not provide the paravirtualized clock feature, vclock_mode is equal to VCLOCK_NONE (0).
AWS EC2 uses Xen. In the default Xen clocksource (xen), the vclock_mode field is set to VCLOCK_NONE, so EC2 instances will always use slow system calls; the vDSO mechanism will not be used.
But how will this affect performance?
In this experiment, we use a microbenchmark to measure how much faster gettimeofday is through the vDSO than through an ordinary system call.
To do this, we run a test program with three different loop counts on an EC2 instance. We test first with the clocksource set to xen, and then with tsc.
Setting the clocksource to tsc on EC2 is not safe. It is unlikely, but still possible, that this could lead to unexpected backwards clock drift. Do not do this on production systems.
AWS instance parameters:
We will measure the execution time using the time program. You may be wondering: “How can you use the time program if changing the clocksource may destabilize timekeeping?”
Fortunately, kernel developer Ingo Molnar has written a program to detect time warps: time-warp-test.c. Note that the program needs slight modification to work on 64-bit x86 systems.
During our experiment, the time-warp-test utility did not detect any time warps.
To obtain a more reasonable result, you can do the following:
For the purposes of our experiment, it was enough to run tests for time distortion.
The results show that ordinary system calls on EC2 are about 77% slower than vDSO calls:
5 million gettimeofday calls:
50 million gettimeofday calls:
500 million gettimeofday calls:
To fix this problem, vDSO support needs to be added to Xen. Fortunately, several corresponding patches are already in the works.
Until this (or a similar) change lands in the kernel, and then in EC2, the gettimeofday and clock_gettime system calls will run about 77% slower than on similar systems with vDSO support.
As expected, vDSO calls are significantly faster than ordinary system calls, because the vDSO does not switch into the kernel. Remember that successfully executed vDSO calls do not appear in the strace output. If the vDSO cannot be used, an ordinary system call is made instead, and it does appear in the strace output.
Several patches designed to add vDSO support to Xen are in the works, but it is not known when these changes will reach AWS EC2.
Until this happens, gettimeofday and clock_gettime will run about 77% slower than they should.
Using strace slows down the application, but it provides invaluable information about exactly what the application does. All programmers should run their applications under strace and analyze its output.
If you liked this article, I recommend reading our other publications, which also contain a lot of low-level technical information:
How does strace work?
How does ltrace work?
Source: https://habr.com/ru/post/326298/