Building a Linux kernel module without the exact kernel headers

Imagine that you have a Linux kernel image for an Android phone, but neither the corresponding source code nor the kernel headers. Imagine that the kernel supports loadable modules (fortunately), and that you want to build a module for it. There are several good reasons why you can't just build a new kernel from source and be done with it (for example, the kernel you build lacks support for some important hardware, like the LCD or touchscreen). With the ever-changing Linux kernel ABI and no source or headers, you might think you are at a dead end.

As a matter of fact, if you build a kernel module against headers other than the ones used to build the kernel image you have, the module will fail to load, with errors that depend on how much the headers differ from the required ones. The kernel may complain about bad signatures, bad versions and other things.

But more on that later.

Kernel configuration

The first step is to find kernel sources as close as possible to the kernel image. Getting the configuration right is probably the hardest part of the whole process. Start with the kernel version number, which can be read from /proc/version. If, like me, you are building a module for an Android device, try the Android kernels from Code Aurora, Cyanogen or Android, whichever is closest to your device. In my case, it was the msm-3.0 kernel. Note that you do not need sources with exactly the same version as your kernel image. Small version differences are unlikely to get in the way: I used the 3.0.21 kernel source, while the existing kernel image was version 3.0.8. Do not, however, try to use 3.1 kernel sources with a 3.0.x kernel image.
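The version rule above can be checked mechanically. The sketch below is only an illustration: parse_release and same_series are hypothetical helper names, the sample /proc/version text in the usage note is made up, and the code assumes the text begins with "Linux version X.Y.Z".

```python
import re


def parse_release(proc_version: str) -> tuple:
    """Extract (major, minor, patch) from the text of /proc/version."""
    m = re.match(r"Linux version (\d+)\.(\d+)(?:\.(\d+))?", proc_version)
    if m is None:
        raise ValueError("unrecognised /proc/version text")
    # A missing patch level is treated as 0.
    return tuple(int(g) if g is not None else 0 for g in m.groups())


def same_series(image_version: tuple, source_version: tuple) -> bool:
    """True when both versions share major.minor (e.g. 3.0.8 and 3.0.21)."""
    return image_version[:2] == source_version[:2]
```

For instance, same_series(parse_release("Linux version 3.0.8-perf (builder@host)"), (3, 0, 21)) accepts the 3.0.21 sources, while passing (3, 1, 0) rejects a 3.1 tree.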
If the kernel image you have is kind enough to provide /proc/config.gz, you can start from that. Otherwise, you can try starting from the default configuration, but then you need to be extremely careful (I won't go into the details of using the default configuration, since I was lucky enough not to need it, but some details on why the correct configuration matters so much follow below).

Assuming that arm-eabi-gcc is available in one of the directories listed in the PATH environment variable, and that the terminal is open in the kernel source directory, you can configure the kernel and install the headers and scripts:

    $ mkdir build
    $ gunzip -c config.gz > build/.config   # or prepare a .config some other way
    $ make silentoldconfig prepare headers_install scripts ARCH=arm CROSS_COMPILE=arm-eabi- O=build KERNELRELEASE=`adb shell uname -r`

The silentoldconfig target is likely to ask whether you want to enable certain options. You can accept the defaults, but that may well not work.

You can pass something else in KERNELRELEASE, but it must exactly match the version of the kernel into which you plan to load the module.

Writing a simple module

To create a minimal module, you need two files: a source file and a Makefile. Place the following code in a file named hello.c, in a separate directory:

    #include <linux/module.h> /* Needed by all modules */
    #include <linux/kernel.h> /* Needed for KERN_INFO */
    #include <linux/init.h>   /* Needed for the macros */

    static int __init hello_start(void)
    {
        printk(KERN_INFO "Hello world\n");
        return 0;
    }

    static void __exit hello_end(void)
    {
        printk(KERN_INFO "Goodbye world\n");
    }

    module_init(hello_start);
    module_exit(hello_end);

Place the following text in the Makefile in the same directory:

 obj-m = hello.o

Building the module is simple enough, but at this stage the module will not be able to load.

How a module is built

When you build the module above in the usual way, the kernel build system creates a file called hello.mod.c, whose contents can cause several kinds of problems:

 MODULE_INFO(vermagic, VERMAGIC_STRING);

The value of VERMAGIC_STRING is derived from the UTS_RELEASE macro, which is defined in include/generated/utsrelease.h, a file generated by the kernel build system. By default, this value is derived from the kernel version and the state of the git repository. This is what setting KERNELRELEASE changed when we configured the kernel above. If VERMAGIC_STRING does not match the kernel version, loading the module produces a message like this in dmesg:

 hello: version magic '3.0.21-perf-ge728813-00399-gd5fa0c9' should be '3.0.8-perf'
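Before copying a module to the device, it can be handy to look at the version magic it actually carries. The sketch below is a rough illustration, not a full ELF parser: it relies on module information being stored as NUL-terminated "tag=value" strings and simply searches the raw bytes for "vermagic=". The function names are hypothetical, and release_of assumes the release is the first space-separated word of the string.

```python
def read_vermagic(ko_bytes: bytes):
    """Return the vermagic string embedded in a module image, or None."""
    tag = b"vermagic="
    start = ko_bytes.find(tag)
    if start == -1:
        return None
    start += len(tag)
    # The value runs up to the next NUL byte (or the end of the data).
    end = ko_bytes.find(b"\x00", start)
    if end == -1:
        end = len(ko_bytes)
    return ko_bytes[start:end].decode("ascii", errors="replace")


def release_of(vermagic: str) -> str:
    """Assumption: the kernel release is the first word of vermagic."""
    return vermagic.split(" ", 1)[0]
```

Reading hello.ko in binary mode and comparing release_of(read_vermagic(data)) with the output of uname -r on the device gives an early warning of the mismatch shown above.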

The same file also contains the definition of the module structure:

    struct module __this_module
    __attribute__((section(".gnu.linkonce.this_module"))) = {
        .name = KBUILD_MODNAME,
        .init = init_module,
    #ifdef CONFIG_MODULE_UNLOAD
        .exit = cleanup_module,
    #endif
        .arch = MODULE_ARCH_INIT,
    };

By itself, this definition looks harmless, but the struct module structure defined in include/linux/module.h carries an unpleasant surprise:

    struct module
    {
        (...)
    #ifdef CONFIG_UNUSED_SYMBOLS
        (...)
    #endif
        (...)
        /* Startup function. */
        int (*init)(void);
        (...)
    #ifdef CONFIG_GENERIC_BUG
        (...)
    #endif
    #ifdef CONFIG_KALLSYMS
        (...)
    #endif
        (...)
        (... plenty more ifdefs ...)
    #ifdef CONFIG_MODULE_UNLOAD
        (...)
        /* Destruction function. */
        void (*exit)(void);
        (...)
    #endif
        (...)
    }

This means that for the init pointer to end up in the right place, CONFIG_UNUSED_SYMBOLS must be set the same way as in our kernel image. For the exit pointer, the same goes for CONFIG_GENERIC_BUG, CONFIG_KALLSYMS, CONFIG_SMP, CONFIG_TRACEPOINTS, CONFIG_JUMP_LABEL, CONFIG_TRACING, CONFIG_EVENT_TRACING, CONFIG_FTRACE_MCOUNT_RECORD and CONFIG_MODULE_UNLOAD.

Do you start to see why you are normally expected to use exactly the same headers the kernel was built with?

Next come the symbol versions:
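The effect of a conditional field on the position of init and exit can be reproduced with a toy structure. The sketch below is not the real struct module: the field names other than init and exit are invented for illustration, and ctypes reports offsets for the host ABI rather than for ARM. It only shows that enabling or disabling an option shifts everything declared after it.

```python
import ctypes


def module_layout(enabled: set):
    """Build a toy stand-in for struct module for a given set of options."""
    fields = [("state", ctypes.c_int)]
    if "CONFIG_UNUSED_SYMBOLS" in enabled:
        # Hypothetical members guarded by the option.
        fields.append(("unused_syms", ctypes.c_void_p))
        fields.append(("num_unused_syms", ctypes.c_uint))
    fields.append(("init", ctypes.c_void_p))
    if "CONFIG_KALLSYMS" in enabled:
        fields.append(("symtab", ctypes.c_void_p))
    if "CONFIG_MODULE_UNLOAD" in enabled:
        fields.append(("exit", ctypes.c_void_p))
    return type("ToyModule", (ctypes.Structure,), {"_fields_": fields})


with_unused = module_layout({"CONFIG_UNUSED_SYMBOLS", "CONFIG_MODULE_UNLOAD"})
without_unused = module_layout({"CONFIG_MODULE_UNLOAD"})

# The same member lands at different offsets in the two layouts.
print("init:", with_unused.init.offset, "vs", without_unused.init.offset)
print("exit:", with_unused.exit.offset, "vs", without_unused.exit.offset)
```

A module compiled with one layout and loaded by a kernel using the other would have the kernel read the init pointer from the wrong place.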

    static const struct modversion_info ____versions[]
    __used
    __attribute__((section("__versions"))) = {
        { 0xsomehex, "module_layout" },
        { 0xsomehex, "__aeabi_unwind_cpp_pr0" },
        { 0xsomehex, "printk" },
    };

These definitions are taken from the Module.symvers file, which is generated according to the headers.

Each such entry names a symbol the module requires and the signature that symbol is expected to have. The first symbol, module_layout, depends on what struct module looks like, that is, on which of the configuration options mentioned earlier are enabled. The second, __aeabi_unwind_cpp_pr0, is a function specific to the ARM ABI, and the last one is for our calls to printk.

The signature of each symbol may differ depending on the kernel code for that function and the compiler used to build the kernel. This means that if you build a kernel from source along with modules for it, and then rebuild the kernel after modifying, say, the printk function, even in a compatible way, the modules built initially will fail to load with the new kernel.

So, if we build a kernel from sources and a configuration that are merely close to those the existing kernel image was built with, chances are we will not get the same signatures as in our kernel image, and the kernel will complain when loading the module:

 hello: disagrees about version of symbol symbol_name

Which means that we need a proper Module.symvers file matching the kernel image, and we don't have one.

Inspecting the kernel

Because the kernel performs these checks when loading modules, it also contains a list of the symbols it exports and their signatures. When the kernel loads a module, it goes through all the symbols the module requires, looks them up in its own symbol table (or in the symbol tables of other modules that the module uses) and checks the corresponding signatures.

The kernel uses the following function to search its symbol table (in kernel/module.c):
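The check described above can be modelled in a few lines. This is a simplified sketch, not the kernel's code: check_versions is a hypothetical name, the CRC values in the usage note are invented, and the wording of the "unknown symbol" message is my own. Only the "disagrees about version" text mirrors the message quoted earlier.

```python
def check_versions(module_name, required, exported):
    """Return the complaints a loader modelled on the text would emit.

    required: list of (crc, symbol) pairs the module was built against.
    exported: dict mapping symbol name to the CRC known to the kernel.
    """
    problems = []
    for crc, symbol in required:
        if symbol not in exported:
            # Illustrative wording, not the kernel's exact message.
            problems.append(f"{module_name}: unknown symbol {symbol}")
        elif exported[symbol] != crc:
            problems.append(
                f"{module_name}: disagrees about version of symbol {symbol}")
    return problems
```

Calling check_versions("hello", [(0x1111, "printk")], {"printk": 0x2222}) yields a single complaint about printk, while matching CRCs yield an empty list.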

    bool each_symbol_section(bool (*fn)(const struct symsearch *arr,
                                        struct module *owner,
                                        void *data),
                             void *data)
    {
        struct module *mod;
        static const struct symsearch arr[] = {
            { __start___ksymtab, __stop___ksymtab, __start___kcrctab,
              NOT_GPL_ONLY, false },
            { __start___ksymtab_gpl, __stop___ksymtab_gpl,
              __start___kcrctab_gpl,
              GPL_ONLY, false },
            { __start___ksymtab_gpl_future, __stop___ksymtab_gpl_future,
              __start___kcrctab_gpl_future,
              WILL_BE_GPL_ONLY, false },
    #ifdef CONFIG_UNUSED_SYMBOLS
            { __start___ksymtab_unused, __stop___ksymtab_unused,
              __start___kcrctab_unused,
              NOT_GPL_ONLY, true },
            { __start___ksymtab_unused_gpl, __stop___ksymtab_unused_gpl,
              __start___kcrctab_unused_gpl,
              GPL_ONLY, true },
    #endif
        };

        if (each_symbol_in_section(arr, ARRAY_SIZE(arr), NULL, fn, data))
            return true;
        (...)

The structure used in this function is defined in include/linux/module.h:

    struct symsearch {
        const struct kernel_symbol *start, *stop;
        const unsigned long *crcs;
        enum {
            NOT_GPL_ONLY,
            GPL_ONLY,
            WILL_BE_GPL_ONLY,
        } licence;
        bool unused;
    };

Note: this kernel code has not changed significantly over the past four years (presumably counting from the release of the 3.0 kernel discussed here; translator's note).

What we have above in the each_symbol_section function is three entries (or five, when CONFIG_UNUSED_SYMBOLS is enabled), each of which contains the start of a symbol table, its end, the start of the corresponding signature table, and two flags.

This data is static and constant, which means that it appears in the kernel binary as is. By scanning the kernel for three consecutive groups of three pointers into the kernel address space, each followed by integers with the values from the definitions in each_symbol_section, we can locate the symbol and signature tables and recreate the Module.symvers file from the kernel binary.
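The scan can be sketched as follows. This is not the author's script, only a minimal illustration under stated assumptions: a 32-bit little-endian kernel, a decompressed image loaded at a known base address, and each entry occupying five 4-byte words (three pointers, the licence value, and the bool padded with zero bytes to a full word). The name find_symsearch is hypothetical.

```python
import struct

WORDS_PER_ENTRY = 5  # start, stop, crcs, licence, unused (padded to a word)


def find_symsearch(image: bytes, base: int):
    """Return the file offset of the first plausible symsearch array, or None."""
    limit = base + len(image)
    entry_size = 4 * WORDS_PER_ENTRY
    for off in range(0, len(image) - 3 * entry_size + 1, 4):
        words = struct.unpack_from("<15I", image, off)
        for i in range(3):
            start, stop, crcs, licence, unused = words[5 * i:5 * i + 5]
            # Pointers must fall inside the loaded image, in order.
            if not (base <= start <= stop <= limit and base <= crcs < limit):
                break
            # Licence values are 0, 1, 2 in order; unused is false for all three.
            if licence != i or unused != 0:
                break
        else:
            return off
    return None
```

Once the array is found, the start, stop and crcs pointers of each entry (minus the base address) give the file offsets of the tables to walk. Decoding the tables themselves is left out of this sketch.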

Unfortunately, most kernels today are compressed (zImage), so a simple search over the compressed image will not work. A compressed kernel is actually a small binary followed by a compressed stream. You can scan the zImage file to find the compressed stream and unpack the image from it.

I wrote a script that automatically decompresses the kernel and extracts the symbol information. It should work with any recent kernel version, provided that the kernel is not relocatable and that you know the base address at which it is loaded in memory. The script takes options for the word size and byte order (endianness) of the architecture, and defaults to values suitable for ARM. The base address, however, must be supplied. On ARM kernels, it can be found in dmesg:
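Locating the compressed stream can be sketched like this. The example assumes the payload is gzip-compressed, which is only one of the formats a kernel may use, and the function name is hypothetical. It looks for the gzip magic bytes and tries to inflate from each candidate position with Python's zlib module, which tolerates trailing data after the stream.

```python
import zlib

GZIP_MAGIC = b"\x1f\x8b\x08"  # gzip header with the deflate method


def extract_gzip_payload(zimage: bytes):
    """Return the first gzip stream that decompresses cleanly, or None."""
    pos = zimage.find(GZIP_MAGIC)
    while pos != -1:
        try:
            # wbits = 16 + MAX_WBITS tells zlib to expect a gzip wrapper.
            inflater = zlib.decompressobj(16 + zlib.MAX_WBITS)
            return inflater.decompress(zimage[pos:])
        except zlib.error:
            # False positive: the magic bytes occurred by chance.
            pos = zimage.find(GZIP_MAGIC, pos + 1)
    return None
```

The returned bytes are the unpacked kernel image, which can then be searched for the symbol table descriptors.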

    $ adb shell dmesg | grep "\.init"
    <5>[01-01 00:00:00.000] [0: swapper] .init : 0xc0008000 - 0xc0037000 ( 188 kB)

(Translator's note: not every kernel prints this to the log. I ran into one such rare case where, apparently because of a trimmed-down configuration, the information was missing. In that case you can look at the PAGE_OFFSET setting in arch/arm/Kconfig and hope that the vendor used one of the defaults.)

The base address in the example above is 0xc0008000 .

If, like me, you want to load the module on an Android device, then the kernel binary you have is a complete boot image. A boot image contains other things besides the kernel, so you cannot feed it directly to the script above. The one exception is when the kernel in the boot image is compressed, in which case the part of the script that expects a compressed image as input will still find the kernel.

If the kernel is not compressed, you can use the unbootimg program, as outlined in this post, to extract the kernel image from your boot image. Once you have a kernel image, the script can be run as follows:

 $ python extract-symvers.py -B 0xc0008000 kernel-filename > Module.symvers
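For orientation, the output is a plain text file. As far as I understand the format used by kernels of this generation, each line holds four tab-separated fields: the CRC, the symbol name, the providing module (vmlinux for the kernel itself) and the export type. Newer kernels may add further fields. The sketch below renders such lines. The function name is hypothetical and sorting by name is my own choice, not a requirement.

```python
def format_symvers(symbols):
    """Render (crc, name, export_type) triples as Module.symvers lines."""
    lines = []
    for crc, name, export_type in sorted(symbols, key=lambda s: s[1]):
        # Assumed field order: CRC, symbol, module, export type.
        lines.append(f"0x{crc:08x}\t{name}\tvmlinux\t{export_type}")
    return "\n".join(lines) + "\n"
```

Looking up module_layout and printk in the generated file is a quick sanity check before building.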

Building our module

Now that we have a correct Module.symvers file for the kernel into which we want to load the module, we can finally build the module (again, assuming that arm-eabi-gcc is available in the PATH and that the terminal is open in the kernel source directory):

    $ cp /path/to/Module.symvers build/
    $ make M=/path/to/module/source ARCH=arm CROSS_COMPILE=arm-eabi- O=build modules

That's all. You can copy the file hello.ko to the device and load the module:

    $ adb shell
    # insmod hello.ko
    # dmesg | grep insmod
    <6>[mm-dd hh:mm:ss.xxx] [id: insmod]Hello world
    # lsmod
    hello 586 0 - Live 0xbf008000 (P)
    # rmmod hello
    # dmesg | grep rmmod
    <6>[mm-dd hh:mm:ss.xxx] [id: rmmod]Goodbye world

This article is a translation of a post on Mike Hommey's blog.

Source: https://habr.com/ru/post/331202/
