A fix was recently released that eliminates a complete freeze of the 32-bit Linux kernel when booting on Intel processors. Here is a short story about where the bug came from and what investigation it took to find its cause.
I'll start with a short excursion into the boot process. Most of you already know that booting an OS goes through many phases, two of which are the boot loader and the loading of the OS kernel, in our case Linux. Let's look more closely at what happens right after the boot loader transfers control to the Linux kernel.
Conventionally, the Linux kernel can be divided into two parts according to how it executes: the boot part and the main (running) part. After gaining control, the kernel executes its boot part, which is responsible for decompressing the kernel and placing it in the system's physical memory. Then comes minimal setup of the memory manager, detection of the processor type and its flags, and so on. After these steps, control is transferred to the architecture-independent part of the kernel (strictly speaking, this is not quite accurate, but here we emphasize the transition from assembly code to C code). The process is described in more detail in [1].
Now let's also recall that modern processors use so-called microcode, which configures the execution of some processor instructions. It also makes it possible to work around some hardware defects without re-spinning the silicon.
Naturally, OS kernel developers want to be able to apply such fixes as early as possible in the boot process. Previously, Linux handled this with special user-space daemons, which loaded the microcode at a rather late stage.
A few years ago, Fenghua Yu suggested (see [2]) putting the microcode file in the initial RAM disk image (initrd) and applying it in the early stages of boot. The change greatly improved the situation, but flaws remained, in particular the need for an initial RAM disk image and the inability to keep microcode for different processor versions, since the file name is fixed.
Recently, Borislav Petkov decided to fix the first of these flaws and posted the change [3].
This is where the fun begins. The call to the load_ucode_bsp() function is made from different points of the boot process in 64-bit and 32-bit kernels. In the 64-bit case, the call is made from C code, where the MMU and the kernel's memory manager are already initialized, but in the 32-bit case it happens much earlier.
The effect of this difference was as follows. Consider the function in question, load_builtin_intel_microcode(), which is executed at an early stage.
    static bool __init load_builtin_intel_microcode(struct cpio_data *cp)
    {
        unsigned int eax = 0x00000001, ebx, ecx = 0, edx;
        unsigned int family, model, stepping;
        char name[30];

        native_cpuid(&eax, &ebx, &ecx, &edx);

        family   = __x86_family(eax);
        model    = x86_model(eax);
        stepping = eax & 0xf;

        sprintf(name, "intel-ucode/%02x-%02x-%02x", family, model, stepping);

        return get_builtin_firmware(cp, name);
    }
Note the call to the in-kernel library function sprintf(). It is this call, regardless of its parameters (assuming they are correct), that brings the system down.
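For reference, the firmware name that this function builds can be reproduced in user space. The following is only a sketch: the helper names cpu_family(), cpu_model(), cpu_stepping() and ucode_name() are hypothetical and are not the kernel's own, and the decoding follows the usual layout of EAX returned by CPUID leaf 1 (base and extended family and model fields), which is assumed here rather than taken from the patch.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: family = base family (bits 11:8),
 * plus extended family (bits 27:20) when the base family is 0xf. */
static unsigned int cpu_family(uint32_t eax)
{
	unsigned int family = (eax >> 8) & 0xf;

	if (family == 0xf)
		family += (eax >> 20) & 0xff;
	return family;
}

/* Hypothetical helper: model = base model (bits 7:4),
 * extended model (bits 19:16) supplies the high nibble for family >= 6. */
static unsigned int cpu_model(uint32_t eax)
{
	unsigned int model = (eax >> 4) & 0xf;

	if (cpu_family(eax) >= 0x6)
		model += ((eax >> 16) & 0xf) << 4;
	return model;
}

/* Hypothetical helper: stepping is the low four bits. */
static unsigned int cpu_stepping(uint32_t eax)
{
	return eax & 0xf;
}

/* Builds the same "intel-ucode/ff-mm-ss" string as the kernel function,
 * but with a bounded snprintf() instead of sprintf(). */
static int ucode_name(char *buf, size_t len, uint32_t eax)
{
	return snprintf(buf, len, "intel-ucode/%02x-%02x-%02x",
			cpu_family(eax), cpu_model(eax), cpu_stepping(eax));
}
```

For example, calling ucode_name() with the signature value 0x000306C3 fills the buffer with intel-ucode/06-3c-03. In user space this formatting is harmless; the problem described here is not the string itself but where and when the kernel tries to build it.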
What is happening there? My colleague, Mika Westerberg, suggested that the reason lies in calling code this early, when functions are in fact reached by their physical addresses rather than virtual ones. Until the MMU is configured and the memory manager is initialized, virtual addresses do not work, so execution requires a one-to-one correspondence between virtual and physical addresses, which does not hold for part of the kernel code. (By the way, if you try to call strcpy(), the result will be the same.)
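The gap between the two kinds of addresses can be illustrated with a little arithmetic. This is a sketch under stated assumptions, not kernel code: it assumes the common 3G/1G split of a 32-bit x86 kernel, where the kernel is linked at virtual addresses starting from 0xC0000000, and early_phys_addr() is a hypothetical helper named here only for illustration.

```c
#include <stdint.h>

/* Assumption: default 3G/1G split on 32-bit x86; other configurations differ. */
#define PAGE_OFFSET 0xC0000000u

/* Hypothetical helper: where a symbol linked at vaddr actually resides
 * in physical memory before paging is enabled. */
static uint32_t early_phys_addr(uint32_t vaddr)
{
	return vaddr - PAGE_OFFSET;
}
```

Under these assumptions, a function linked at 0xC1000000 physically resides at 0x01000000. Code that runs before paging is set up and uses the linked (virtual) address of a function or of its data without such an adjustment ends up touching the wrong memory.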
The merge window is just around the corner (I wrote about it a little earlier in [4]), so for now Borislav decided to disable his change for 32-bit kernels and sent an update [5].
The moral of the story is that booting an OS is a very delicate process, and understanding what goes on there requires a fairly deep knowledge of the architecture.
[1] www.ibm.com/developerworks/library/l-linuxboot
[2] lwn.net/Articles/530346
[3] www.spinics.net/lists/linux-tip-commits/msg28000.html
[4] habrahabr.ru/post/253421
[5] permalink.gmane.org/gmane.linux.kernel/1969480

UPDATE. I completely forgot to add one important note. Many developers test their code not on real machines but in virtual ones, using QEMU for example, and there everything works fine.
In the comments, jcmvbkbc shared his analysis of what is happening.